oplog Sizing
The mongosync program uses change streams to synchronize data between source and destination clusters. mongosync does not access the oplog directly, but when a change stream returns events from the past, the events must be within the oplog time range.
mongosync applies operations in the oplog on the source cluster to the data on the destination cluster after the collection copy phase. When operations that mongosync has not applied roll off the oplog on the source cluster, the sync fails and mongosync exits.
Note
mongosync does not replicate applyOps operations made on the source cluster during sync to the destination cluster.
If you anticipate syncing a large data set, or if you plan to pause synchronization for an extended period of time, you might exceed the oplog window. Use the oplogSizeMB setting to increase the size of the oplog on the source cluster.
Considerations
The destination cluster must have enough disk storage to accommodate the logical data size being migrated and the destination oplog entries from the initial sync. For example, to migrate 10 GB of data, the destination cluster must have at least 10 GB available for the data and another 10 GB for the insert oplog entries from the initial sync.
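The arithmetic in the example above can be sketched as a small helper. This is only an illustration of the rule of thumb stated here (data size plus an equal allowance for the insert oplog entries); requiredDestinationDiskGB is a hypothetical name, not part of mongosync.

```javascript
// Rough destination disk estimate based on the rule of thumb above.
// requiredDestinationDiskGB is a hypothetical helper, not a mongosync API.
function requiredDestinationDiskGB(logicalDataGB) {
  if (!Number.isFinite(logicalDataGB) || logicalDataGB < 0) {
    throw new RangeError("logicalDataGB must be a non-negative number");
  }
  // Assumption taken from the example above: the insert oplog entries
  // written during the initial sync need about as much space as the data.
  const oplogEntriesGB = logicalDataGB;
  return logicalDataGB + oplogEntriesGB;
}

console.log(requiredDestinationDiskGB(10)); // 20
```

Treat the result as a lower bound; indexes and ongoing writes are not included in this sketch.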
Important
To use embedded verification, you must have a larger oplog on the destination. If you enable the embedded verifier and reduce the size of the destination oplog, the embedded verifier might not be able to keep up, causing mongosync to error.
If you need to reduce the overhead of the destination oplog entries and the embedded verifier is disabled, you can:
Use the oplogSizeMB setting to lower the destination cluster's oplog size.
Use the oplogMinRetentionHours setting to lower or remove the destination cluster's minimum oplog retention period.
Monitor oplog Size Needed for Initial Sync
Determine oplog Window
To get the difference in seconds between the first and last entry in the oplog, run db.getReplicationInfo() and read its timeDiff field. If you are replicating a sharded cluster, run the command on each shard.
db.getReplicationInfo().timeDiff
The value returned is the oplog window, in seconds. If there are multiple shards, the smallest value across all shards is the minimum oplog window of the cluster.
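For a sharded cluster, picking the minimum window from the per-shard results might look like the sketch below. The shard names and timeDiff values are made-up sample data, and minimumOplogWindowSeconds is a hypothetical helper; in practice each value would come from running db.getReplicationInfo().timeDiff on that shard.

```javascript
// Sample timeDiff values in seconds, one per shard (illustrative data only).
const shardWindowsSeconds = { shard0: 86400, shard1: 43200, shard2: 172800 };

// Hypothetical helper: returns the smallest oplog window across all shards.
function minimumOplogWindowSeconds(windowsByShard) {
  const values = Object.values(windowsByShard);
  if (values.length === 0) {
    throw new Error("no shard windows supplied");
  }
  return Math.min(...values);
}

console.log(minimumOplogWindowSeconds(shardWindowsSeconds)); // 43200
```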
Determine mongosync Replication Lag
To get the lagTimeSeconds value, run the /progress command. The lag time is the time in seconds between the last event applied by mongosync and the time of the current latest event on the source cluster. It is a measure of how far mongosync is behind the source cluster.
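As a rough sketch, the lag value could be read from a /progress response as shown below. The code assumes the response body nests lagTimeSeconds under a top-level progress object, that the endpoint path is /api/v1/progress, and that the base URL is supplied by the caller; check these against your mongosync version. Both function names are hypothetical.

```javascript
// Hypothetical helper: pulls lagTimeSeconds out of a parsed /progress body.
// Assumes the shape { progress: { lagTimeSeconds: <number>, ... } }.
function lagTimeSecondsFrom(body) {
  const parsed = typeof body === "string" ? JSON.parse(body) : body;
  const lag = parsed && parsed.progress ? parsed.progress.lagTimeSeconds : undefined;
  if (typeof lag !== "number") {
    throw new Error("lagTimeSeconds not found in progress response");
  }
  return lag;
}

// Hypothetical wrapper using the global fetch available in Node.js 18+.
// baseUrl is whatever host and port your mongosync instance listens on.
async function fetchLagTimeSeconds(baseUrl) {
  const response = await fetch(`${baseUrl}/api/v1/progress`);
  return lagTimeSecondsFrom(await response.json());
}

// Example with a hand-written sample body (not real mongosync output):
console.log(lagTimeSecondsFrom('{"progress":{"state":"RUNNING","lagTimeSeconds":12}}')); // 12
```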
Validate oplog Size
If the lag time approaches the minimum oplog window, make one of the following changes:
Increase the oplog window. Use replSetResizeOplog to set minRetentionHours greater than the current oplog window.
Note
replSetResizeOplog is unsupported in Atlas. To resize the oplog in Atlas, see Set Minimum Oplog Window.
Scale up the mongosync instance. Add CPU or memory to scale up the mongosync node so that it has a higher copy rate.
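The check and the first remedy above can be sketched as follows. The 0.5 warning ratio and the extra-hours margin are arbitrary illustrative choices, and both helper names are hypothetical; only the replSetResizeOplog command and its minRetentionHours field come from MongoDB itself.

```javascript
// Hypothetical helper: reports how much of the oplog window the lag consumes.
// warnRatio is an arbitrary threshold chosen for this sketch.
function oplogHeadroom(lagTimeSeconds, minWindowSeconds, warnRatio = 0.5) {
  const ratio = lagTimeSeconds / minWindowSeconds;
  return { ratio, atRisk: ratio >= warnRatio };
}

// Hypothetical helper: builds a replSetResizeOplog command document whose
// minRetentionHours exceeds the current window by extraHours.
function buildResizeCommand(minWindowSeconds, extraHours) {
  const currentHours = Math.ceil(minWindowSeconds / 3600);
  return { replSetResizeOplog: 1, minRetentionHours: currentHours + extraHours };
}

console.log(oplogHeadroom(30000, 43200).atRisk); // true
console.log(buildResizeCommand(43200, 12)); // { replSetResizeOplog: 1, minRetentionHours: 24 }
```

On a self-managed source cluster, the resulting document could be passed to db.adminCommand() in mongosh on each replica set member whose oplog you want to resize.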
Note
The oplog window and rate of change for replication lag may vary during synchronization. Repeat these steps during a migration to monitor the progress.