2 / 2
May 2024

I have a time series collection that contains nearly one billion data.

I simply running:

db.cllection.find({'metaData.eventId': "FB-2348", 'metaData.marketName': "Match Odds", 'metaData.selectionId': 0})

The query cost more than 2mins to finish

"executionStats":{ "executionSuccess":true, "nReturned":10, "executionTimeMillis":207968, "totalKeysExamined":0, "totalDocsExamined":5032348, "executionStages":{ "stage":"COLLSCAN", "filter":{ "$and":[ { "meta.eventId":{ "$eq":"FB-2348" } }, { "meta.marketName":{ "$eq":"Match Odds" } }, { "meta.selectionId":{ "$eq":0 } } ] }, "nReturned":10, "executionTimeMillisEstimate":200392, "works":5032349, "advanced":10, "needTime":5032338, "needYield":0, "saveState":13268, "restoreState":13268, "isEOF":1, "direction":"forward", "docsExamined":5032348 },

I have billion data looks like this:

{ ts: 2024-04-05T07:50:53.111+00:00 metaData: { betType: "for,ahover,18", bookie: "pin88", eventId:"2024-04-07,1514,1154", inRunning:false, sport:"FB" }, min:null, price:5.03, max:118.6643 }

Is there anything wrong with my query? Or is there anything I can optimize in my collection/schema design/data configuration? Thank you.

16 days later

Hi @WONG_TUNG_TUNG

You are querying a timeseries collection without a time criterion, this is one of the main problems, this is related to the way mongodb stores data from this type of collection.
In general, any query on a timeseries dataset will always be based on a time criterion combined with others.
If your queries are not directly related to a time period, you should not use timeseries collections or timeseries database.

Best regards,