Hi,
I am quite new to MongoDB.
I wish to move 10 million documents from one collection to another within a database. The source documents are nested and need to be flattened and transformed before loading into the target collection.
What is the best way to do this, considering I need it done at a rapid pace with monitoring, logging, error handling, and recovery features?
Should I use an ETL tool like AWS Glue, which is serverless and supports Spark, or is some built-in MongoDB Atlas functionality (the aggregation pipeline) better suited here?
Thanks,
Pankaj
We do this quite a lot using the aggregation framework. If you can express the transform as an aggregation, you can make the last stage a $merge or $out to write the results to another collection on the same server.
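As a minimal sketch of that approach (the field names `customer`, `customerName`, `city` and the target collection `flat_orders` are hypothetical, and the stages are written as PyMongo-style dicts rather than mongosh syntax), a flatten-and-merge pipeline might look like:

```python
# Sketch of a flatten/transform pipeline ending in $merge.
# Field and collection names here are made-up examples.

def build_flatten_pipeline(target: str) -> list:
    """Build an aggregation pipeline that flattens nested fields
    and merges the result into the `target` collection."""
    return [
        # Promote nested fields to the top level.
        {"$set": {
            "customerName": "$customer.name",
            "city": "$customer.address.city",
        }},
        # Drop the original nested subdocument.
        {"$unset": ["customer"]},
        # Write server side into the target collection; matching on _id
        # with whenMatched="replace" makes the job safe to re-run (recovery).
        {"$merge": {
            "into": target,
            "on": "_id",
            "whenMatched": "replace",
            "whenNotMatched": "insert",
        }},
    ]

pipeline = build_flatten_pipeline("flat_orders")
# With a live connection this runs entirely on the server, e.g. with PyMongo:
#   client.mydb.source.aggregate(pipeline)
```

Because $merge upserts on the `on` field, re-running the same pipeline after a partial failure simply overwrites what was already copied rather than duplicating it.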
If you need to run it on a schedule then you'd obviously need some method to do that; you could also split the data up and process it in batches from a shell script if you wanted.
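One way to split the copy into batches (a sketch, assuming the `_id` values sort monotonically; the boundary values themselves would come from a quick query against the source) is to prepend a `$match` on an `_id` range to the same transform pipeline, so each batch can be run, logged, and retried independently:

```python
# Sketch: generate per-batch $match stages from sorted _id boundary
# values, so each slice of the copy can be run and retried on its own.

def batch_filters(boundaries: list) -> list:
    """Given sorted boundary ids [b0, ..., bn], return n+1 $match stages
    covering (-inf, b0], (b0, b1], ..., (bn, +inf)."""
    filters = []
    prev = None
    for b in boundaries:
        if prev is None:
            cond = {"_id": {"$lte": b}}
        else:
            cond = {"_id": {"$gt": prev, "$lte": b}}
        filters.append({"$match": cond})
        prev = b
    # Final open-ended batch above the last boundary.
    filters.append({"$match": {"_id": {"$gt": prev}}})
    return filters

batches = batch_filters([1000, 2000, 3000])
# Each batch pipeline would be [match_stage] + transform_stages + [merge_stage],
# executed one at a time so a failed batch can be re-run without the rest.
```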
Post copy, you could run another aggregation to check that the source and target document counts match.
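The simplest version of that check is comparing `count_documents({})` on both collections. A small helper like the following (a sketch; the logger name is an assumption) gives you a hook for the logging and error-handling requirements, by signalling a mismatch so the job can be marked failed and retried:

```python
import logging

log = logging.getLogger("copy_check")

def verify_counts(source_count: int, target_count: int) -> bool:
    """Compare source/target document counts after the copy; log the
    result and return False on a mismatch so callers can fail the job."""
    if source_count != target_count:
        log.error("count mismatch: source=%d target=%d",
                  source_count, target_count)
        return False
    log.info("counts match: %d documents", source_count)
    return True

# With a live connection the counts would come from the server, e.g.:
#   verify_counts(db.source.count_documents({}),
#                 db.flat_orders.count_documents({}))
ok = verify_counts(10_000_000, 10_000_000)
```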
A benefit of using an aggregation with $out/$merge is that it runs server side, so there is no need to pull the data off the server and push it back.
If you need more complicated monitoring then it may be best to look at an ETL toolkit. How complicated is your data, and have you tested how long an aggregation takes to transform your 10M documents?