We have a large collection in MongoDB that grows every day. It has reached 200 million documents. To reduce cost, we would like to use the Online Archive option to archive data from previous years to AWS S3. A few questions related to this option:
- The collection has an _id (ObjectId) field but no date field. The ObjectId contains a date part. Can this be used for the archiving rule, and if so, how?
- Documents in this collection are always deleted and recreated, never modified. Can data in the archive be deleted by _id if a user opens an old record and regenerates the data?
- How long does it take to load a single document from the archive by _id: seconds or milliseconds?
- Are there any management changes required to use Online Archive, apart from Atlas Data Federation costs?
To clarify a little further on point #1:
You cannot use _id as the date field in the Date criteria. Technically, you can use the workaround that @Hartek_Sabharwal mentioned above: a $expr in the custom criteria of Online Archive.
However, the custom query will likely not use an index, so the archival process itself will likely be slow. As our documentation notes:
For custom criteria that use an expression, Atlas might first convert a value before it evaluates it against the query.
The recommendation is that you create a new indexed date field and use the Date criteria to archive. This is the right approach and will improve archiving speed.
We wouldn’t recommend custom criteria that use $expr on _id, because of the slowness mentioned above.
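To backfill a new date field, the creation time can be read from each existing _id, since the first 4 bytes of an ObjectId are a big-endian Unix timestamp in seconds. Below is a minimal sketch in plain Python, assuming the _id is available as its 24-character hex string. The function name `objectid_created_at` is hypothetical and not part of any MongoDB driver.

```python
from datetime import datetime, timezone


def objectid_created_at(oid_hex: str) -> datetime:
    """Return the UTC creation time encoded in a 24-character ObjectId hex string.

    Hypothetical helper for illustration; not a MongoDB driver API.
    """
    if len(oid_hex) != 24:
        raise ValueError("an ObjectId is 12 bytes, i.e. 24 hex characters")
    # The first 4 bytes (8 hex characters) hold seconds since the Unix epoch.
    seconds = int(oid_hex[:8], 16)
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


# Example with a sample ObjectId value.
print(objectid_created_at("507f1f77bcf86cd799439011").isoformat())
# → 2012-10-17T21:13:27+00:00
```

A backfill job could write this value into the new date field for each document and then index that field, so that the Date criteria can use it. The resolution is one second, and the value reflects when the ObjectId was generated, which is normally the insert time.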