Efficient Sync Solutions: Cluster-to-Cluster Sync and Live Migration to Atlas
The challenges raised in modern business contexts are increasingly complex. They range from minimizing downtime during migrations to adopting efficient tools for transitioning from relational to non-relational databases, and from implementing resilient architectures that ensure high availability to scaling horizontally so that large amounts of data can be managed and queried efficiently.
Two of the main challenges, which will be covered in this article, are:
- The need to create resilient IT infrastructures that can ensure business continuity or minimal downtime even in critical situations, such as the loss of a data center.
- Conducting migrations from one infrastructure to another without compromising operations.
It is in this context that MongoDB stands out by offering innovative solutions such as MongoSync and live migrate.
MongoDB Atlas, with its capabilities and remarkable flexibility, offers two distinct approaches to implementing business continuity strategies. These two strategies are:
- Creating a cluster with a geographic distribution of nodes.
- The implementation of two clusters in different regions synchronized via MongoSync.
In this section, we will explore the second point (i.e., the implementation of two clusters in different regions synchronized via MongoSync) in more detail.
"The `mongosync` binary is the primary process used in Cluster-to-Cluster Sync. `mongosync` migrates data from one cluster to another and can keep the clusters in continuous sync."

This tool performs the following operations:
- It migrates data from one cluster to another.
- It keeps the clusters in continuous sync.
Let's make this more concrete with an example. Initially, the production cluster contains data while the disaster recovery cluster is still empty.

The current state of the main cluster looks like this:
```
Atlas atlas-qsd40w-shard-0 [primary] test> show dbs
admin               140.00 KiB
config              276.00 KiB
local               524.00 KiB
sample_airbnb        52.09 MiB
sample_analytics      9.44 MiB
sample_geospatial     1.02 MiB
sample_guides        40.00 KiB
sample_mflix        109.01 MiB
sample_restaurants    5.73 MiB
sample_supplies     976.00 KiB
sample_training      41.20 MiB
sample_weatherdata    2.39 MiB
```
The cluster used for disaster recovery is still empty:
```
Atlas atlas-lcu71y-shard-0 [primary] test> show dbs
admin    172.00 KiB
config   212.00 KiB
local    584.00 KiB
```
Before proceeding, it is essential to install the `mongosync` binary. If you have not already done so, you can download it from the downloads page. The commands described below have been tested on CentOS 7.

Let's proceed with the configuration of `mongosync` by defining a configuration file and a service:

```shell
vi /etc/mongosync.conf
```
You can copy and paste the current configuration into this file using the appropriate connection strings. You can also test with two Atlas clusters, which must be M10 level or higher. For more details on how to get the connection strings from your Atlas cluster, you can consult the documentation.
```yaml
cluster0: "mongodb+srv://test_u:test_p@cluster0.*****.mongodb.net/?retryWrites=true&w=majority"
cluster1: "mongodb+srv://test_u:test_p@cluster1.*****.mongodb.net/?retryWrites=true&w=majority"
logPath: "/data/log/mongosync"
verbosity: "INFO"
```
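Before starting the service, it can be worth sanity-checking that the file contains every key `mongosync` expects. The following is an illustrative pre-flight check, not part of the official tooling; it writes a sample config with placeholder credentials to a temporary path purely for demonstration:

```shell
# Write a sample mongosync config (placeholder credentials) to a temp file.
CONF=$(mktemp)
cat > "$CONF" <<'EOF'
cluster0: "mongodb+srv://test_u:test_p@cluster0.example.mongodb.net/?retryWrites=true&w=majority"
cluster1: "mongodb+srv://test_u:test_p@cluster1.example.mongodb.net/?retryWrites=true&w=majority"
logPath: "/data/log/mongosync"
verbosity: "INFO"
EOF

check_conf() {
  # Succeeds only if every required key is present in the given file.
  for key in cluster0 cluster1 logPath; do
    grep -q "^${key}:" "$1" || return 1
  done
}

check_conf "$CONF" && echo "config looks complete"
```

In practice you would point `check_conf` at `/etc/mongosync.conf` before starting the service for the first time.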
Creating a service for `mongosync`, which we do next, is generally performed by system administrators on a Linux machine. Although this step is optional, it is recommended in a production environment.
Next, create a service named `mongosync.service`:

```shell
vi /usr/lib/systemd/system/mongosync.service
```
This is what your service file should look like:

```
[Unit]
Description=Cluster-to-Cluster Sync
Documentation=https://mongodb.prakticum-team.ru/docs/cluster-to-cluster-sync/

[Service]
User=root
Group=root
ExecStart=/usr/local/bin/mongosync --config /etc/mongosync.conf

[Install]
WantedBy=multi-user.target
```
Reload all unit files:
```shell
systemctl daemon-reload
```
Now, we can start the service:
```shell
systemctl start mongosync
```
We can also check whether the service has been started correctly:
```shell
systemctl status mongosync
```
Output:
```
mongosync.service - Cluster-to-Cluster Sync
   Loaded: loaded (/usr/lib/systemd/system/mongosync.service; disabled; vendor preset: disabled)
   Active: active (running) since dom 2024-04-14 21:45:45 CEST; 4s ago
     Docs: https://mongodb.prakticum-team.ru/docs/cluster-to-cluster-sync/
 Main PID: 1573 (mongosync)
   CGroup: /system.slice/mongosync.service
           └─1573 /usr/local/bin/mongosync --config /etc/mongosync.conf

apr 14 21:45:45 mongosync.mongodb.int systemd[1]: Started Cluster-to-Cluster Sync.
```
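If the unit is not `active (running)`, the systemd journal for the unit and the directory configured via `logPath` in `/etc/mongosync.conf` are the first places to look. The helper below is purely illustrative (the exact file layout under `logPath` may vary by `mongosync` version):

```shell
# Directory taken from the logPath setting in /etc/mongosync.conf.
MONGOSYNC_LOG_DIR="/data/log/mongosync"

show_mongosync_logs() {
  journalctl -u mongosync --no-pager -n 20   # recent messages for the unit
  ls -l "$MONGOSYNC_LOG_DIR"                 # mongosync's own log files
}
```

Running `show_mongosync_logs` after a failed `systemctl start mongosync` usually reveals whether the problem is the config file, the connection strings, or filesystem permissions on the log directory.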
If you prefer not to create a service, you can, more generally, start the process directly:

```shell
mongosync --config mongosync.conf
```
After starting the service, verify that it is in the idle state:
```shell
curl localhost:27182/api/v1/progress -XGET | jq
```
Output:
```
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   191  100   191    0     0  14384      0 --:--:-- --:--:-- --:--:-- 14692
{
  "progress": {
    "state": "IDLE",
    "canCommit": false,
    "canWrite": false,
    "info": null,
    "lagTimeSeconds": null,
    "collectionCopy": null,
    "directionMapping": null,
    "mongosyncID": "coordinator",
    "coordinatorID": ""
  }
}
```
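Since the API returns JSON, its responses are easy to script against with `jq`. As an illustration, the snippet below parses a local copy of the IDLE response shown above rather than hitting a live endpoint; against a running `mongosync` you would pipe `curl -s localhost:27182/api/v1/progress` into the same `jq` filters:

```shell
# A saved copy of the progress response, used here in place of a live call.
PROGRESS=$(cat <<'EOF'
{
  "progress": {
    "state": "IDLE",
    "canCommit": false,
    "canWrite": false,
    "lagTimeSeconds": null,
    "mongosyncID": "coordinator"
  }
}
EOF
)

# Extract the fields most useful for scripting: state and canCommit.
state=$(printf '%s' "$PROGRESS" | jq -r '.progress.state')
echo "current state: $state"   # prints: current state: IDLE
```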
We can run the synchronization:
```shell
curl localhost:27182/api/v1/start -XPOST \
  --data '
  {
    "source": "cluster0",
    "destination": "cluster1",
    "reversible": true,
    "enableUserWriteBlocking": true
  } '
```
Output:
```
{"success":true}
```
We can also keep track of the synchronization status:
```shell
curl localhost:27182/api/v1/progress -XGET | jq
```
Output:
```
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   502  100   502    0     0  36001      0 --:--:-- --:--:-- --:--:-- 38615
{
  "progress": {
    "state": "RUNNING",
    "canCommit": false,
    "canWrite": false,
    "info": "collection copy",
    "lagTimeSeconds": 54,
    "collectionCopy": {
      "estimatedTotalBytes": 390696597,
      "estimatedCopiedBytes": 390696597
    },
    "directionMapping": {
      "Source": "cluster0: cluster0.*****.mongodb.net",
      "Destination": "cluster1: cluster1.*****.mongodb.net"
    },
    "mongosyncID": "coordinator",
    "coordinatorID": "coordinator"
  }
}

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   510  100   510    0     0  44270      0 --:--:-- --:--:-- --:--:-- 46363
{
  "progress": {
    "state": "RUNNING",
    "canCommit": true,
    "canWrite": false,
    "info": "change event application",
    "lagTimeSeconds": 64,
    "collectionCopy": {
      "estimatedTotalBytes": 390696597,
      "estimatedCopiedBytes": 390696597
    },
    "directionMapping": {
      "Source": "cluster0: cluster0.*****.mongodb.net",
      "Destination": "cluster1: cluster1.*****.mongodb.net"
    },
    "mongosyncID": "coordinator",
    "coordinatorID": "coordinator"
  }
}
```
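The `canCommit` flag in this output is what tells you when a migration can be finalized through `mongosync`'s commit endpoint. A small predicate such as the one below can be used to script the cutover; the endpoint paths in the comments are from the Cluster-to-Cluster Sync API, while the helper itself is an illustrative sketch:

```shell
# can_commit reads a progress document on stdin and succeeds when the
# coordinator reports that the sync can be finalized.
can_commit() {
  jq -e '.progress.canCommit == true' > /dev/null
}

# Illustrative usage against a live mongosync (default API port 27182):
#   until curl -s localhost:27182/api/v1/progress | can_commit; do sleep 5; done
#   curl localhost:27182/api/v1/commit -XPOST --data '{ }'
```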
At this time, the DR environment is aligned with the production environment and will also maintain synchronization for the next operations:
```
Atlas atlas-qsd40w-shard-0 [primary] test> show dbs
admin               140.00 KiB
config              276.00 KiB
local               524.00 KiB
sample_airbnb        52.09 MiB
sample_analytics      9.44 MiB
sample_geospatial     1.02 MiB
sample_guides        40.00 KiB
sample_mflix        109.01 MiB
sample_restaurants    5.73 MiB
sample_supplies     976.00 KiB
sample_training      41.20 MiB
sample_weatherdata    2.39 MiB
```
And our second cluster is now in sync, containing the following data:

```
Atlas atlas-lcu71y-shard-0 [primary] test> show dbs
admin                                 172.00 KiB
config                                380.00 KiB
local                                 427.22 MiB
mongosync_reserved_for_internal_use   420.00 KiB
sample_airbnb                          53.06 MiB
sample_analytics                        9.55 MiB
sample_geospatial                       1.40 MiB
sample_guides                          40.00 KiB
sample_mflix                          128.38 MiB
sample_restaurants                      6.47 MiB
sample_supplies                         1.03 MiB
sample_training                        47.21 MiB
sample_weatherdata                      2.61 MiB
```
Armed with what we've discussed so far, we could ask one last question:
Is it possible to take advantage of the disaster recovery environment in some way, or should we just let it synchronize?
By making the appropriate `mongosync` configurations (for example, by setting the "buildIndexes" option to false and omitting the "enableUserWriteBlocking" parameter, which is set to false by default), we can take advantage of the fact that users and roles are not synchronized: read-only users can be created on the destination cluster, so that no writes can be performed there. This ensures consistency between the origin and destination clusters, while allowing us to use the disaster recovery environment to build the indexes needed to optimize slow queries identified in the production environment.

Live migrate is a tool that allows users to perform migrations to MongoDB Atlas. More specifically, as mentioned in the official documentation, it is a process that uses `mongosync` as the underlying data migration tool, enabling faster live migrations with less downtime when both the source and destination clusters are running MongoDB 6.0.8 or later.

So, what is the added value of this tool compared to `mongosync`? Among its advantages:

- You can avoid the need to provision and configure a server to host `mongosync`.
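Going back to the read-only users mentioned above: since `mongosync` does not synchronize users and roles, and Atlas database users cannot be created with a plain `createUser` command on the cluster, such a user would be created through the Atlas UI or the Atlas Administration API. The sketch below uses the classic v1.0 `databaseUsers` endpoint; the API keys, project ID, cluster name, and credentials are all placeholders to adapt to your own project:

```shell
# Hypothetical helper: creates a read-only database user in the Atlas
# project that owns the DR cluster. PUBLIC_KEY/PRIVATE_KEY, <project-id>,
# "Cluster1", and the user/password values are placeholders only.
GROUP_ID="<project-id>"

create_readonly_user() {
  curl -s --user "PUBLIC_KEY:PRIVATE_KEY" --digest \
    -H "Content-Type: application/json" \
    -X POST "https://cloud.mongodb.com/api/atlas/v1.0/groups/${GROUP_ID}/databaseUsers" \
    -d '{
      "databaseName": "admin",
      "username": "dr_reader",
      "password": "change_me",
      "roles":  [ { "roleName": "readAnyDatabase", "databaseName": "admin" } ],
      "scopes": [ { "name": "Cluster1", "type": "CLUSTER" } ]
    }'
}
```

The "scopes" field restricts the user to the DR cluster; without it, an Atlas database user applies to every cluster in the project, including production.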
As we have seen, these tools, used with proper care, allow us to achieve our goals while also providing a certain flexibility.

Regardless of the solution used for migration and/or synchronization, you can contact MongoDB support, who will help you identify the best strategy to complete the task successfully.