Data Migration within Database

Hi,
I am quite new to MongoDB.
I wish to move 10 million documents from one collection to another within a database. The source documents are nested and need to be flattened and transformed before loading into the target collection.
What is the best way to do this, considering I need it done at a rapid pace with monitoring, logging, error handling, and recovery features?
Should I use an ETL tool like AWS Glue, which is serverless and supports Spark, or is built-in MongoDB Atlas functionality (the aggregation pipeline) better suited here?
Thanks,
Pankaj

We do this quite a lot using the aggregation framework. If you can express the transform as an aggregation, you can make the last stage a $merge or $out to write the results to another collection on the same server.
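As a sketch of what that could look like, assuming a hypothetical `source` collection with a nested `details.address` subdocument (all collection and field names here are made up, not from the original post):

```python
# Sketch of an aggregation pipeline that flattens a nested subdocument and
# writes the result server-side into a target collection with $merge.
# Collection and field names (source, target, details.*) are hypothetical.
pipeline = [
    # Promote nested fields to the top level.
    {"$project": {
        "_id": 1,
        "name": 1,
        "city": "$details.address.city",
        "zip": "$details.address.zip",
    }},
    # Write into the target collection; upsert keyed on _id.
    {"$merge": {
        "into": "target",
        "on": "_id",
        "whenMatched": "replace",
        "whenNotMatched": "insert",
    }},
]
# With pymongo this would be run as:
#   db.source.aggregate(pipeline)
```

Using `$merge` rather than `$out` lets you upsert into an existing collection instead of replacing it, which also makes re-runs after a failure safer.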
If you need to run it on a schedule then you'd obviously need some method to do that; you could also split the data up and process it in batches from a shell script.
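One way to batch, sketched here with a hypothetical helper and illustrative `_id` boundaries (not from the original post), is to prepend a `$match` range stage to the same transform pipeline for each batch:

```python
# Sketch: split the copy into batches on an indexed key (here _id) by
# prepending a $match range stage to the transform pipeline.
# Boundary values and collection names are illustrative.
def batched_pipeline(transform_stages, lower, upper):
    """Return the transform pipeline restricted to lower <= _id < upper."""
    match = {"$match": {"_id": {"$gte": lower, "$lt": upper}}}
    return [match] + transform_stages

transform = [
    {"$project": {"flat": "$nested.value"}},
    {"$merge": {"into": "target"}},
]
batch = batched_pipeline(transform, 0, 100_000)
# Each batch would then be run as db.source.aggregate(batch), e.g. from a
# driver script that loops over the boundaries and logs failures per batch.
```

Batching this way gives you natural checkpoints for logging and recovery: a failed batch can be retried on its own without re-running the whole 10M-document copy.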
After the copy, you could run another aggregation to compare the source and target document counts.
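That check can be as small as a `$count` stage run against both collections; a minimal sketch, with the actual server calls hedged in comments:

```python
# Sketch of a post-copy sanity check: count documents on both sides
# and compare. The $count stage returns one document, e.g. {"total": N}.
count_pipeline = [{"$count": "total"}]

def counts_match(source_total, target_total):
    """Return True when source and target hold the same number of docs."""
    return source_total == target_total

# In a real run, with pymongo:
#   src = next(db.source.aggregate(count_pipeline))["total"]
#   tgt = next(db.target.aggregate(count_pipeline))["total"]
#   if not counts_match(src, tgt): log and alert
```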
A benefit of using an aggregation with $out/$merge is that it runs server-side, so there is no need to pull the data off the server and push it back.
If you need more complicated monitoring then it may be best to look at an ETL toolkit. How complicated is your data, and have you tested how long a transform aggregation takes on your 10M documents?
