Efficient Sync Solutions: Cluster-to-Cluster Sync and Live Migration to Atlas

Fabio Ramohitaj • 5 min read • Published May 10, 2024 • Updated May 10, 2024
Modern business contexts raise increasingly complex challenges. They range from minimizing downtime during migrations and adopting efficient tools for moving from relational to non-relational databases, to implementing resilient, highly available architectures and scaling horizontally so that large amounts of data can be managed and queried efficiently.
Two of the main challenges, which will be covered in this article, are:
  • The need to create resilient IT infrastructures that can ensure business continuity or minimal downtime even in critical situations, such as the loss of a data center.
  • Conducting migrations from one infrastructure to another without compromising operations.
It is in this context that MongoDB stands out by offering innovative solutions such as MongoSync and live migrate.

Ensuring business continuity with MongoSync: an approach to disaster recovery

MongoDB Atlas, with its capabilities and remarkable flexibility, offers two distinct approaches to implementing business continuity strategies. These two strategies are:
  • Creating a cluster with a geographic distribution of nodes.
  • The implementation of two clusters in different regions synchronized via MongoSync.
In this section, we will explore the second approach in more detail: two clusters in different regions synchronized via MongoSync.
What exactly is MongoSync? For a correct definition, we can refer to the official documentation:
"The mongosync binary is the primary process used in Cluster-to-Cluster Sync. mongosync migrates data from one cluster to another and can keep the clusters in continuous sync."
This tool performs the following operations:
  • It migrates data from one cluster to another.
  • It keeps the clusters in continuous sync.
Let's make this more concrete with an example:
  • Initially, the situation looks like this for the production cluster and the disaster recovery cluster:
Image of two clusters: the first already populated with data, while the second is empty and waiting to be populated. The two clusters are located in different datacenters
The current state of the main cluster would look like this:
Atlas atlas-qsd40w-shard-0 [primary] test> show dbs
admin               140.00 KiB
config              276.00 KiB
local               524.00 KiB
sample_airbnb        52.09 MiB
sample_analytics      9.44 MiB
sample_geospatial     1.02 MiB
sample_guides        40.00 KiB
sample_mflix        109.01 MiB
sample_restaurants    5.73 MiB
sample_supplies     976.00 KiB
sample_training      41.20 MiB
sample_weatherdata    2.39 MiB
The backup cluster used for disaster recovery is still empty:
Atlas atlas-lcu71y-shard-0 [primary] test> show dbs
admin   172.00 KiB
config  212.00 KiB
local   584.00 KiB
Before proceeding, you need to install the mongosync binary. If you have not already done so, you can download it from the downloads page. The commands described below were tested on CentOS 7.
Let's proceed with the configuration of mongosync by defining a configuration file and a service:
vi /etc/mongosync.conf
Copy the following configuration into this file, replacing the placeholders with your own connection strings. You can also test with two Atlas clusters, which must be M10 or higher. For details on how to get the connection strings from your Atlas cluster, consult the documentation.
cluster0: "mongodb+srv://test_u:test_p@cluster0.*****.mongodb.net/?retryWrites=true&w=majority"
cluster1: "mongodb+srv://test_u:test_p@cluster1.*****.mongodb.net/?retryWrites=true&w=majority"
logPath: "/data/log/mongosync"
verbosity: "INFO"
Generally, this step is performed on a Linux machine by system administrators. Although the step is optional, it is recommended to implement it in a production environment.
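One practical detail about the connection strings in the configuration file: if a user name or password contains characters such as @, : or /, they must be percent-encoded before being placed in the URI. The following is a minimal sketch in Python, not part of the original setup. The helper name build_srv_uri and the example host are hypothetical, and the query options simply mirror the ones used in the configuration file.

```python
from urllib.parse import quote_plus


def build_srv_uri(user: str, password: str, host: str) -> str:
    """Return a mongodb+srv URI with percent-encoded credentials.

    build_srv_uri is a hypothetical helper for illustration; the query
    options mirror the ones used in the configuration file above.
    """
    return (
        f"mongodb+srv://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}/?retryWrites=true&w=majority"
    )


if __name__ == "__main__":
    # Placeholder values only; the host is an example, not a real cluster.
    print(build_srv_uri("test_u", "p@ss:word/1", "cluster0.example.mongodb.net"))
```

The resulting string can be pasted as the value of cluster0 or cluster1 in /etc/mongosync.conf.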
Next, you will be able to create a service named mongosync.service.
vi /usr/lib/systemd/system/mongosync.service
This is what your service file should look like.
[Unit]
Description=Cluster-to-Cluster Sync
Documentation=https://mongodb.prakticum-team.ru/docs/cluster-to-cluster-sync/
[Service]
User=root
Group=root
ExecStart=/usr/local/bin/mongosync --config /etc/mongosync.conf
[Install]
WantedBy=multi-user.target
Reload all unit files:
systemctl daemon-reload
Now, we can start the service:
systemctl start mongosync
We can also check whether the service has been started correctly:
systemctl status mongosync
Output:
mongosync.service - Cluster-to-Cluster Sync
   Loaded: loaded (/usr/lib/systemd/system/mongosync.service; disabled; vendor preset: disabled)
   Active: active (running) since dom 2024-04-14 21:45:45 CEST; 4s ago
     Docs: https://mongodb.prakticum-team.ru/docs/cluster-to-cluster-sync/
 Main PID: 1573 (mongosync)
   CGroup: /system.slice/mongosync.service
           └─1573 /usr/local/bin/mongosync --config /etc/mongosync.conf

apr 14 21:45:45 mongosync.mongodb.int systemd[1]: Started Cluster-to-Cluster Sync.
If you prefer not to create a service, you can start the process directly: mongosync --config mongosync.conf
After starting the service, verify that it is in the idle state:
curl localhost:27182/api/v1/progress -XGET | jq
Output:
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   191  100   191    0     0  14384      0 --:--:-- --:--:-- --:--:-- 14692
{
  "progress": {
    "state": "IDLE",
    "canCommit": false,
    "canWrite": false,
    "info": null,
    "lagTimeSeconds": null,
    "collectionCopy": null,
    "directionMapping": null,
    "mongosyncID": "coordinator",
    "coordinatorID": ""
  }
}
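The same idle check can be scripted instead of read by eye. The sketch below is one possible way to do it in Python with only the standard library. The address is the one used in the curl command, while the function names fetch_progress and is_idle are hypothetical helpers introduced for illustration.

```python
import json
from urllib.request import urlopen

# Address taken from the curl command above.
MONGOSYNC_URL = "http://localhost:27182"


def fetch_progress(base_url: str = MONGOSYNC_URL) -> dict:
    """GET /api/v1/progress and return the inner "progress" document."""
    with urlopen(f"{base_url}/api/v1/progress", timeout=10) as resp:
        return json.load(resp)["progress"]


def is_idle(progress: dict) -> bool:
    """True when mongosync reports the IDLE state shown in the output above."""
    return progress.get("state") == "IDLE"
```

With the service running, is_idle(fetch_progress()) is expected to be true before any synchronization has been started.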
We can run the synchronization:
curl localhost:27182/api/v1/start -XPOST \
--data '
    {
        "source": "cluster0",
        "destination": "cluster1",
        "reversible": true,
        "enableUserWriteBlocking": true
    } '
Output:
{"success":true}
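For automation, the start request can also be sent from a script. The following is a minimal sketch, assuming the same endpoint and request body as the curl command. The helper names are hypothetical, and the guard that rejects a reversible sync without write blocking reflects my reading of the mongosync documentation for reversing a sync, so treat it as an assumption to verify for your version.

```python
import json
from urllib.request import Request, urlopen


def build_start_payload(source: str = "cluster0",
                        destination: str = "cluster1",
                        reversible: bool = True,
                        enable_user_write_blocking: bool = True) -> dict:
    """Build the JSON body for POST /api/v1/start (same fields as the curl call)."""
    # Assumption: a sync that should be reversible later must also be
    # started with user write blocking enabled.
    if reversible and not enable_user_write_blocking:
        raise ValueError("reversible sync requires enableUserWriteBlocking")
    return {
        "source": source,
        "destination": destination,
        "reversible": reversible,
        "enableUserWriteBlocking": enable_user_write_blocking,
    }


def start_sync(payload: dict, base_url: str = "http://localhost:27182") -> dict:
    """Send the start request and return the parsed JSON response."""
    req = Request(
        f"{base_url}/api/v1/start",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urlopen(req, timeout=30) as resp:
        return json.load(resp)
```

Calling start_sync(build_start_payload()) sends the same body as the curl example.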
We can also keep track of the synchronization status:
curl localhost:27182/api/v1/progress -XGET | jq
Output:
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   502  100   502    0     0  36001      0 --:--:-- --:--:-- --:--:-- 38615
{
  "progress": {
    "state": "RUNNING",
    "canCommit": false,
    "canWrite": false,
    "info": "collection copy",
    "lagTimeSeconds": 54,
    "collectionCopy": {
      "estimatedTotalBytes": 390696597,
      "estimatedCopiedBytes": 390696597
    },
    "directionMapping": {
      "Source": "cluster0: cluster0.*****.mongodb.net",
      "Destination": "cluster1: cluster1.*****.mongodb.net"
    },
    "mongosyncID": "coordinator",
    "coordinatorID": "coordinator"
  }
}

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   510  100   510    0     0  44270      0 --:--:-- --:--:-- --:--:-- 46363
{
  "progress": {
    "state": "RUNNING",
    "canCommit": true,
    "canWrite": false,
    "info": "change event application",
    "lagTimeSeconds": 64,
    "collectionCopy": {
      "estimatedTotalBytes": 390696597,
      "estimatedCopiedBytes": 390696597
    },
    "directionMapping": {
      "Source": "cluster0: cluster0.*****.mongodb.net",
      "Destination": "cluster1: cluster1.*****.mongodb.net"
    },
    "mongosyncID": "coordinator",
    "coordinatorID": "coordinator"
  }
}
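The two outputs above show the sync moving from the "collection copy" phase to "change event application", with canCommit switching to true. The sketch below shows one way to read these fields programmatically. It works on the inner "progress" document, the function names copy_percent and caught_up are hypothetical, and the lag threshold is an arbitrary assumption rather than a documented value.

```python
from typing import Optional


def copy_percent(progress: dict) -> Optional[float]:
    """Percentage of the initial collection copy that has completed, if known."""
    copy = progress.get("collectionCopy")
    if not copy or not copy.get("estimatedTotalBytes"):
        return None
    return 100.0 * copy["estimatedCopiedBytes"] / copy["estimatedTotalBytes"]


def caught_up(progress: dict, max_lag_seconds: int = 5) -> bool:
    """True when the sync is running, committable, and within the chosen lag.

    max_lag_seconds is an illustrative threshold, not a mongosync setting.
    """
    lag = progress.get("lagTimeSeconds")
    return (
        progress.get("state") == "RUNNING"
        and progress.get("canCommit") is True
        and lag is not None
        and lag <= max_lag_seconds
    )
```

Applied to the second output, copy_percent reports a completed copy, while caught_up with the default threshold is still false because the reported lag is 64 seconds.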
At this point, the DR environment is aligned with the production environment, and mongosync will keep it synchronized as new operations arrive:
Image of two clusters located in different datacenters, aligned and kept synchronized via mongosync. Mongosync runs on an on-premises server.
Atlas atlas-qsd40w-shard-0 [primary] test> show dbs
admin               140.00 KiB
config              276.00 KiB
local               524.00 KiB
sample_airbnb        52.09 MiB
sample_analytics      9.44 MiB
sample_geospatial     1.02 MiB
sample_guides        40.00 KiB
sample_mflix        109.01 MiB
sample_restaurants    5.73 MiB
sample_supplies     976.00 KiB
sample_training      41.20 MiB
sample_weatherdata    2.39 MiB
And our second cluster is now in sync with the following data.
Atlas atlas-lcu71y-shard-0 [primary] test> show dbs
admin                                172.00 KiB
config                               380.00 KiB
local                                427.22 MiB
mongosync_reserved_for_internal_use  420.00 KiB
sample_airbnb                         53.06 MiB
sample_analytics                       9.55 MiB
sample_geospatial                      1.40 MiB
sample_guides                         40.00 KiB
sample_mflix                         128.38 MiB
sample_restaurants                     6.47 MiB
sample_supplies                        1.03 MiB
sample_training                       47.21 MiB
sample_weatherdata                     2.61 MiB
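A quick way to confirm that every user database has arrived is to compare the two show dbs listings. The sketch below does this in Python on plain lists of database names. The function name missing_on_destination is hypothetical, and the check compares names only, not document counts or contents.

```python
# Databases present on every cluster, plus the one mongosync creates for itself
# (visible in the destination listing above).
IGNORED = {"admin", "config", "local", "mongosync_reserved_for_internal_use"}


def missing_on_destination(source_dbs, destination_dbs):
    """Return user databases found on the source but not on the destination."""
    return sorted((set(source_dbs) - IGNORED) - set(destination_dbs))


if __name__ == "__main__":
    source = ["admin", "config", "local", "sample_airbnb", "sample_mflix"]
    destination = ["admin", "config", "local",
                   "mongosync_reserved_for_internal_use", "sample_airbnb"]
    print(missing_on_destination(source, destination))
```

An empty result means the destination has at least the same set of user databases as the source.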
With this in place, one last question remains:
Is it possible to take advantage of the disaster recovery environment in some way, or should we just let it synchronize?
With the appropriate mongosync configuration, it is. For example, we can set the "buildIndexes" option so that indexes are not built on the destination (the value "never" in current mongosync versions) and omit the "enableUserWriteBlocking" parameter, which defaults to false. Because mongosync does not synchronize users and roles, we can then create read-only users on the disaster recovery cluster. Since these users cannot write, the origin and destination clusters stay consistent, and the disaster recovery environment can be used to create the indexes that will optimize slow queries identified in production.

Live migrate to Atlas: minimizing downtime

Live migrate is a tool for performing migrations to MongoDB Atlas. As the official documentation notes, it uses mongosync as the underlying data migration tool, which enables faster live migrations with less downtime when both the source and destination clusters are running MongoDB 6.0.8 or later.
So, what is the added value of this tool compared to mongosync?
It brings two advantages:
  • You can avoid the need to provision and configure a server to host mongosync.
  • You have the ability to migrate from previous versions, as indicated in the migration path.
Image representing the migration of an on-premises production cluster to MongoDB Atlas.
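The version condition mentioned in this section (both clusters on MongoDB 6.0.8 or later for the mongosync-based path) is easy to check before planning a migration. The following is a minimal sketch, with hypothetical function names, that handles plain dotted versions only. Older versions can still be migrated through the migration path noted above, so a false result does not mean that live migrate is unavailable.

```python
# Minimum version quoted from the documentation excerpt in this section.
MIN_VERSION = (6, 0, 8)


def parse_version(version: str) -> tuple:
    """Turn "6.0.8" into (6, 0, 8); suffixes such as "-rc1" are not handled."""
    return tuple(int(part) for part in version.split("."))


def uses_mongosync_path(source_version: str, destination_version: str) -> bool:
    """True when both clusters meet the 6.0.8 minimum."""
    return (
        parse_version(source_version) >= MIN_VERSION
        and parse_version(destination_version) >= MIN_VERSION
    )
```

For example, a 6.0.8 source with a 7.0.2 destination qualifies, while a 5.0.20 source does not.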

Conclusion

As we have seen, these tools, used with proper care, let us achieve our goals while retaining a degree of flexibility.
Whichever solution you choose for migration and/or synchronization, you can contact MongoDB support, who will help you identify the best strategy for the task.
If you have questions or comments, you can find us in the MongoDB Developer Community!