Hello.
Mongodb 4.4.12 Community.
I try to understand very strange performance characteristics of the query like below. There is a unique index on the myID
field of the mycoll
collection having quite a large number of documents (XX millions).
db.mycoll.aggregate([
{ $match: {
$and: [
{ 'myId': { $in: [~ 1000 constants] } }
]
}
}
])
I may see very different performance values for queries with the same queryHash
and planCacheKey
pair and slightly different number of ids in $in
(but still ~1000). For example:
{
attr: {
keysExamined: 8 000,
docsExamined: 4 000,
storage: {
data: {
bytesRead: 500 000 000
}
},
durationMillis: 1 000
}
}
{
attr: {
keysExamined: 350 000,
docsExamined: 180 000,
storage: {
data: {
bytesRead: 13 000 000 000
}
},
durationMillis: 50 000
}
}
It doesn’t depend on if a query was replanned or not.
Query plans look the same in both cases (w/ and w/o $and
in $match
) like below.
...
"optimizedPipeline" : true,
"winningPlan" : {
"stage" : "FETCH",
"inputStage" : {
"stage" : "IXSCAN",
"keyPattern" : {
"myId" : 1
},
"indexName" : "myId",
"isMultiKey" : false,
"multiKeyPaths" : {
"myId" : [ ]
},
"isUnique" : true,
"isSparse" : false,
"isPartial" : false,
"indexVersion" : 2,
"direction" : "forward",
"indexBounds" : {
"myId" : [
"[\"id1\", \"id1\"]",
...,
"[\"idX\", \"idX\"]"
]
}
...
Questions:
-
What would be a reason for such a huge value in
keysExamined
(even in the better case) when not more than 1000 ids only passed to the query to check against a unique index? -
Is there any chance, that the reason is - unnecessary
$and
inside$match
, and we must run the following query instead to get better performance?
db.mycoll.aggregate([
{ $match:
{ 'myId': { $in: [~ 1000 constants] } }
}
])
- Could be the reason of such a huge
keysExamined
value some parallel write activity into the collection?
Regards,
Mark