Hello,
What is correct strategy of indexing and querying strategy when using time series collection?
I have example collection created using following command:
db.createCollection("monitoring", {expireAfterSeconds : 604800, timeseries : {timeField : "timestamp", metaField : "metadata", granularity : "minutes"}})
This collection would store following data:
db.monitoring.insert({ timestamp: ISODate('2024-05-21T10:24:37.809Z'), metadata: { monitoringId: '123', deviceId : "dev" }, value: 1000} )
I want to query this collection using metadata field and timestamp field, however all my options have drawbacks:
- Query using on entire metadata field may produce inconsistent results because time series collection may change field order inside of metadata field, so following query may not produce any results depending on internal storage:
db.monitoring.find({metadata : {monitoringId : "123", deviceId : "dev"}})
``
Such query is also discouraged by Best Practices for Time Series Collections article in documentation
2. Query using scalar sub-fields will always produce correct result no matter what field order is, however it does not use automatically created index on metadata field, which will result in slow response time. In this case, hinting index does not work properly - index is used, but with incorrect bounds and query is still slow.
3. I could create following index and use scalar sub-fields query:
db.monitoring.createIndex({“metadata.monitoringId” : 1, “metadata.deviceId” : 1, timestamp : 1})
However, this is counter-intuitive because one would assume that automatically created metadata index would be useful for nested metadata fields.
Is there a way to use automatically created index in such situation?