@steevej - Thanks for linking the ESR rule. I'm surprised I didn't come across that strategy earlier when I was researching the right approach.
The ESR approach will help out in a big way. After adding some indexes focusing on the "E" in ESR, the 500k data set is back to snappy query times, and the test queries I'm working with are using the indexes I created.
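For anyone following along, here's a minimal sketch of the kind of index I mean. The collection and field names (`items`, `status`, `createdAt`, `score`) are placeholders, not my actual schema; the point is just the ESR ordering of the keys:

```js
// ESR-ordered compound index: Equality (status) -> Sort (createdAt) -> Range (score)
db.items.createIndex({ status: 1, createdAt: -1, score: 1 });

// A query shape this index serves well: equality match first,
// the sort satisfied by the index order, then the range filter.
db.items.find({ status: "active", score: { $gt: 10 } }).sort({ createdAt: -1 });
```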
However, as a follow-up, I do worry that for all queries to stay efficient as the data set grows and more fields become searchable, I'd have to create 50 indexes, potentially a lot more. Are there best-practice strategies for this problem? Perhaps splitting the document up into separate collections? For my use case, writes are done via async processing jobs rather than user interaction, so keeping indexes updated is not a concern when it comes to write speed.
I don't think the attribute pattern is applicable to my use case. Referring to the fields in my original post, they cannot be combined in the way the attribute pattern describes; they are distinct fields that are unrelated to each other.

I can, however, turn field1 and field2 into equality fields by introducing boolean flags that indicate whether each field is present, e.g. hasField1 and hasField2. field3 is an array, and $in can be used as an equality operator on it. These will narrow results down enough that I can get by with a couple dozen indexes whose prefixes cover all the variations of field1 + field2 + field3, and whose sort portion is either _id alone or one of the sort fields followed by _id (see the sketch below). But as the data set grows, I worry these indexes won't perform as well unless I add more fields, potentially ones that could serve as the Range portion of the ESR rule. That would mean even more indexes, though, and that seems untenable.
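To make that concrete, here's a sketch of one of the index variations I'm describing. The hasField1/hasField2/field3 names are from my earlier post; the collection name `items` and the sort field `sortFieldA` are placeholders for illustration:

```js
// One index variation: equality prefix on the presence flags and the
// array field, then the sort portion (a sort field followed by _id).
db.items.createIndex({ hasField1: 1, hasField2: 1, field3: 1, sortFieldA: 1, _id: 1 });

// Matching query shape: equality on the boolean flags, $in on the array
// field (treated as equality under ESR), then the sort on sortFieldA + _id.
db.items
  .find({ hasField1: true, hasField2: false, field3: { $in: ["a", "b"] } })
  .sort({ sortFieldA: 1, _id: 1 });
```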
Thank you again for your timely feedback; it was super helpful. Let me know what you think about my concerns above.