Efficient Sync Solutions: Cluster-to-Cluster Sync and Live Migration to Atlas
The challenges raised in modern business contexts are increasingly complex. They range from minimizing downtime during migrations to adopting efficient tools for transitioning from relational to non-relational databases, and from implementing resilient architectures that ensure high availability to scaling horizontally so that large amounts of data can be managed and queried efficiently.
Two of the main challenges, which will be covered in this article, are:
- The need to create resilient IT infrastructures that can ensure business continuity or minimal downtime even in critical situations, such as the loss of a data center.
- Conducting migrations from one infrastructure to another without compromising operations.
It is in this context that MongoDB stands out by offering innovative solutions such as MongoSync and live migrate.
MongoDB Atlas, with its capabilities and remarkable flexibility, offers two distinct approaches to implementing business continuity strategies. These two strategies are:
- Creating a cluster with a geographic distribution of nodes.
- The implementation of two clusters in different regions synchronized via MongoSync.
In this section, we will explore the second point (i.e., the implementation of two clusters in different regions synchronized via MongoSync) in more detail.
"The `mongosync` binary is the primary process used in Cluster-to-Cluster Sync. `mongosync` migrates data from one cluster to another and can keep the clusters in continuous sync."

This tool performs the following operations:
- It migrates data from one cluster to another.
- It keeps the clusters in continuous sync.
Let's make this more concrete with an example. Initially, the production cluster contains data while the disaster recovery cluster is still empty.

The current state of the main cluster looks like this:
```
Atlas atlas-qsd40w-shard-0 [primary] test> show dbs
admin               140.00 KiB
config              276.00 KiB
local               524.00 KiB
sample_airbnb        52.09 MiB
sample_analytics      9.44 MiB
sample_geospatial     1.02 MiB
sample_guides        40.00 KiB
sample_mflix        109.01 MiB
sample_restaurants    5.73 MiB
sample_supplies     976.00 KiB
sample_training      41.20 MiB
sample_weatherdata    2.39 MiB
```
The cluster used for disaster recovery is still empty:
```
Atlas atlas-lcu71y-shard-0 [primary] test> show dbs
admin    172.00 KiB
config   212.00 KiB
local    584.00 KiB
```
Before proceeding, it is essential to install the `mongosync` binary. If you have not already done so, you can download it from the downloads page. The commands described below have been tested on CentOS 7.

Let's proceed with the configuration of `mongosync` by defining a configuration file and a service:

```shell
vi /etc/mongosync.conf
```
You can copy and paste the current configuration into this file using the appropriate connection strings. You can also test with two Atlas clusters, which must be M10 level or higher. For more details on how to get the connection strings from your Atlas cluster, you can consult the documentation.
```yaml
cluster0: "mongodb+srv://test_u:test_p@cluster0.*****.mongodb.net/?retryWrites=true&w=majority"
cluster1: "mongodb+srv://test_u:test_p@cluster1.*****.mongodb.net/?retryWrites=true&w=majority"
logPath: "/data/log/mongosync"
verbosity: "INFO"
```
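Before starting the service, it can be worth sanity-checking that the file contains every key `mongosync` expects. The following is an illustrative pre-flight check, not part of the official tooling; it writes a sample config with placeholder credentials to a temporary path purely for demonstration:

```shell
# Write a sample mongosync config (placeholder credentials) to a temp file.
CONF=$(mktemp)
cat > "$CONF" <<'EOF'
cluster0: "mongodb+srv://test_u:test_p@cluster0.example.mongodb.net/?retryWrites=true&w=majority"
cluster1: "mongodb+srv://test_u:test_p@cluster1.example.mongodb.net/?retryWrites=true&w=majority"
logPath: "/data/log/mongosync"
verbosity: "INFO"
EOF

check_conf() {
  # Succeeds only if every required key is present in the given file.
  for key in cluster0 cluster1 logPath; do
    grep -q "^${key}:" "$1" || return 1
  done
}

check_conf "$CONF" && echo "config looks complete"
```

In practice you would point `check_conf` at `/etc/mongosync.conf` before starting the service for the first time.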
Creating a service for `mongosync`, which we do next, is generally performed by system administrators on a Linux machine. Although this step is optional, it is recommended in a production environment.
Next, create a service named `mongosync.service`:

```shell
vi /usr/lib/systemd/system/mongosync.service
```
This is what your service file should look like:

```
[Unit]
Description=Cluster-to-Cluster Sync
Documentation=https://mongodb.prakticum-team.ru/docs/cluster-to-cluster-sync/

[Service]
User=root
Group=root
ExecStart=/usr/local/bin/mongosync --config /etc/mongosync.conf

[Install]
WantedBy=multi-user.target
```
Reload all unit files:
```shell
systemctl daemon-reload
```
Now, we can start the service:
```shell
systemctl start mongosync
```
We can also check whether the service has been started correctly:
```shell
systemctl status mongosync
```
Output:
```
mongosync.service - Cluster-to-Cluster Sync
   Loaded: loaded (/usr/lib/systemd/system/mongosync.service; disabled; vendor preset: disabled)
   Active: active (running) since dom 2024-04-14 21:45:45 CEST; 4s ago
     Docs: https://mongodb.prakticum-team.ru/docs/cluster-to-cluster-sync/
 Main PID: 1573 (mongosync)
   CGroup: /system.slice/mongosync.service
           └─1573 /usr/local/bin/mongosync --config /etc/mongosync.conf

apr 14 21:45:45 mongosync.mongodb.int systemd[1]: Started Cluster-to-Cluster Sync.
```
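If the unit is not `active (running)`, the systemd journal for the unit and the directory configured via `logPath` in `/etc/mongosync.conf` are the first places to look. The helper below is purely illustrative (the exact file layout under `logPath` may vary by `mongosync` version):

```shell
# Directory taken from the logPath setting in /etc/mongosync.conf.
MONGOSYNC_LOG_DIR="/data/log/mongosync"

show_mongosync_logs() {
  journalctl -u mongosync --no-pager -n 20   # recent messages for the unit
  ls -l "$MONGOSYNC_LOG_DIR"                 # mongosync's own log files
}
```

Running `show_mongosync_logs` after a failed `systemctl start mongosync` usually reveals whether the problem is the config file, the connection strings, or filesystem permissions on the log directory.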
If you prefer not to create a service, you can, more generally, start the process directly:

```shell
mongosync --config mongosync.conf
```
After starting the service, verify that it is in the idle state:
```shell
curl localhost:27182/api/v1/progress -XGET | jq
```
Output:
```
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   191  100   191    0     0  14384      0 --:--:-- --:--:-- --:--:-- 14692
{
  "progress": {
    "state": "IDLE",
    "canCommit": false,
    "canWrite": false,
    "info": null,
    "lagTimeSeconds": null,
    "collectionCopy": null,
    "directionMapping": null,
    "mongosyncID": "coordinator",
    "coordinatorID": ""
  }
}
```
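Since the API returns JSON, its responses are easy to script against with `jq`. As an illustration, the snippet below parses a local copy of the IDLE response shown above rather than hitting a live endpoint; against a running `mongosync` you would pipe `curl -s localhost:27182/api/v1/progress` into the same `jq` filters:

```shell
# A saved copy of the progress response, used here in place of a live call.
PROGRESS=$(cat <<'EOF'
{
  "progress": {
    "state": "IDLE",
    "canCommit": false,
    "canWrite": false,
    "lagTimeSeconds": null,
    "mongosyncID": "coordinator"
  }
}
EOF
)

# Extract the fields most useful for scripting: state and canCommit.
state=$(printf '%s' "$PROGRESS" | jq -r '.progress.state')
echo "current state: $state"   # prints: current state: IDLE
```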
We can run the synchronization:
```shell
curl localhost:27182/api/v1/start -XPOST \
  --data '
  {
    "source": "cluster0",
    "destination": "cluster1",
    "reversible": true,
    "enableUserWriteBlocking": true
  } '
```
Output:
```
{"success":true}
```
We can also keep track of the synchronization status:
```shell
curl localhost:27182/api/v1/progress -XGET | jq
```
Output:
```
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   502  100   502    0     0  36001      0 --:--:-- --:--:-- --:--:-- 38615
{
  "progress": {
    "state": "RUNNING",
    "canCommit": false,
    "canWrite": false,
    "info": "collection copy",
    "lagTimeSeconds": 54,
    "collectionCopy": {
      "estimatedTotalBytes": 390696597,
      "estimatedCopiedBytes": 390696597
    },
    "directionMapping": {
      "Source": "cluster0: cluster0.*****.mongodb.net",
      "Destination": "cluster1: cluster1.*****.mongodb.net"
    },
    "mongosyncID": "coordinator",
    "coordinatorID": "coordinator"
  }
}

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   510  100   510    0     0  44270      0 --:--:-- --:--:-- --:--:-- 46363
{
  "progress": {
    "state": "RUNNING",
    "canCommit": true,
    "canWrite": false,
    "info": "change event application",
    "lagTimeSeconds": 64,
    "collectionCopy": {
      "estimatedTotalBytes": 390696597,
      "estimatedCopiedBytes": 390696597
    },
    "directionMapping": {
      "Source": "cluster0: cluster0.*****.mongodb.net",
      "Destination": "cluster1: cluster1.*****.mongodb.net"
    },
    "mongosyncID": "coordinator",
    "coordinatorID": "coordinator"
  }
}
```
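The `canCommit` flag in this output is what tells you when a migration can be finalized through `mongosync`'s commit endpoint. A small predicate such as the one below can be used to script the cutover; the endpoint paths in the comments are from the Cluster-to-Cluster Sync API, while the helper itself is an illustrative sketch:

```shell
# can_commit reads a progress document on stdin and succeeds when the
# coordinator reports that the sync can be finalized.
can_commit() {
  jq -e '.progress.canCommit == true' > /dev/null
}

# Illustrative usage against a live mongosync (default API port 27182):
#   until curl -s localhost:27182/api/v1/progress | can_commit; do sleep 5; done
#   curl localhost:27182/api/v1/commit -XPOST --data '{ }'
```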
At this time, the DR environment is aligned with the production environment and will also maintain synchronization for the next operations:
```
Atlas atlas-qsd40w-shard-0 [primary] test> show dbs
admin               140.00 KiB
config              276.00 KiB
local               524.00 KiB
sample_airbnb        52.09 MiB
sample_analytics      9.44 MiB
sample_geospatial     1.02 MiB
sample_guides        40.00 KiB
sample_mflix        109.01 MiB
sample_restaurants    5.73 MiB
sample_supplies     976.00 KiB
sample_training      41.20 MiB
sample_weatherdata    2.39 MiB
```
And our second cluster is now in sync, containing the following data:

```
Atlas atlas-lcu71y-shard-0 [primary] test> show dbs
admin                                 172.00 KiB
config                                380.00 KiB
local                                 427.22 MiB
mongosync_reserved_for_internal_use   420.00 KiB
sample_airbnb                          53.06 MiB
sample_analytics                        9.55 MiB
sample_geospatial                       1.40 MiB
sample_guides                          40.00 KiB
sample_mflix                          128.38 MiB
sample_restaurants                      6.47 MiB
sample_supplies                         1.03 MiB
sample_training                        47.21 MiB
sample_weatherdata                      2.61 MiB
```
Armed with what we've discussed so far, we could ask one last question:
Is it possible to take advantage of the disaster recovery environment in some way, or should we just let it synchronize?
By making the appropriate `mongosync` configurations (for example, by setting the "buildIndexes" option to false and omitting the "enableUserWriteBlocking" parameter, which is set to false by default), we can take advantage of the fact that users and roles are not synchronized: read-only users can be created on the destination cluster, so that no writes can be performed there. This ensures consistency between the origin and destination clusters, while allowing us to use the disaster recovery environment to build the indexes needed to optimize slow queries identified in the production environment.

Live migrate is a tool that allows users to perform migrations to MongoDB Atlas. More specifically, as mentioned in the official documentation, it is a process that uses `mongosync` as the underlying data migration tool, enabling faster live migrations with less downtime when both the source and destination clusters are running MongoDB 6.0.8 or later.

So, what is the added value of this tool compared to `mongosync`? Among its advantages:

- You can avoid the need to provision and configure a server to host `mongosync`.
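Going back to the read-only users mentioned above: since `mongosync` does not synchronize users and roles, and Atlas database users cannot be created with a plain `createUser` command on the cluster, such a user would be created through the Atlas UI or the Atlas Administration API. The sketch below uses the classic v1.0 `databaseUsers` endpoint; the API keys, project ID, cluster name, and credentials are all placeholders to adapt to your own project:

```shell
# Hypothetical helper: creates a read-only database user in the Atlas
# project that owns the DR cluster. PUBLIC_KEY/PRIVATE_KEY, <project-id>,
# "Cluster1", and the user/password values are placeholders only.
GROUP_ID="<project-id>"

create_readonly_user() {
  curl -s --user "PUBLIC_KEY:PRIVATE_KEY" --digest \
    -H "Content-Type: application/json" \
    -X POST "https://cloud.mongodb.com/api/atlas/v1.0/groups/${GROUP_ID}/databaseUsers" \
    -d '{
      "databaseName": "admin",
      "username": "dr_reader",
      "password": "change_me",
      "roles":  [ { "roleName": "readAnyDatabase", "databaseName": "admin" } ],
      "scopes": [ { "name": "Cluster1", "type": "CLUSTER" } ]
    }'
}
```

The "scopes" field restricts the user to the DR cluster; without it, an Atlas database user applies to every cluster in the project, including production.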
As we have seen, these tools, used with proper care, allow us to achieve our goals while also providing a certain flexibility.

Regardless of the solution used for migration and/or synchronization, you can contact MongoDB support, who will help you identify the best strategy to complete the task successfully.