The best index for this specific query is the compound index {filter:1,filter:2}, both filter1 and filter2 would have to be matched on, and in that order for the index to be used.
In the sample here there are 3 indexes plus the _id index to begin with. The index I mention above is hidden:
db.foo.getIndexes()
[
{ v: 2, key: { _id: 1 }, name: '_id_' },
{ v: 2, key: { f1: 1 }, name: 'f1_1' },
{ v: 2, key: { f2: 1 }, name: 'f2_1' },
{ v: 2, key: { f1: 1, f2: 1 }, name: 'f1_1_f2_1', hidden: true }
]
Here are the query times:
test> s=performance.now(); print(db.foo.countDocuments({ f1: 2, f2: 1 })); print(`Duration: ${performance.now() - s}`)
2406166
Duration: 21231.54756000638
test> s=performance.now(); print(db.foo.countDocuments({ f1: 2, f2: 1 })); print(`Duration: ${performance.now() - s}`)
2406166
Duration: 14208.398685008287
test> s=performance.now(); print(db.foo.countDocuments({ f1: 2, f2: 1 })); print(`Duration: ${performance.now() - s}`)
2406166
Duration: 13774.462996006012
Pretty slow. Now using the compound index:
test> db.foo.unhideIndex({f1:1,f2:1})
{ hidden_old: true, hidden_new: false, ok: 1 }
test> s=performance.now(); print(db.foo.countDocuments({ f1: 2, f2: 1 })); print(`Duration: ${performance.now() - s}`)
2406166
Duration: 1593.1604879945517
test> s=performance.now(); print(db.foo.countDocuments({ f1: 2, f2: 1 })); print(`Duration: ${performance.now() - s}`)
2406166
Duration: 1393.1082039922476
test> s=performance.now(); print(db.foo.countDocuments({ f1: 2, f2: 1 })); print(`Duration: ${performance.now() - s}`)
2406166
Duration: 1400.4836210012436
This is a big improvement. This is limited in that the index will support queries on f1 alone and f1,f2 combined, but not queries on f2 alone.
It is possible the the Attribute Pattern may suit this use case, updating the schema to support this could be beneficial. Have a read and see if it fits.