Apr 21

Hello there, I am working with pymongo on a project that inserts time series data into a collection. Each document’s _id is a string I generate from the date, rather than the default ObjectId.

However, I have come across the fact that multiple documents with the same _id can be added to the collection. How is this possible? Could it be a bug or is it something on my end?

20 days later

I’m going to assume you mean “inserts time series data into a Time Series collection” since the behaviour you describe is not possible in regular collections. Edit: If that did happen for regular collections, it would be a serious bug.

It’s not a bug, and it’s not an issue on your end. By default, Time Series collections don’t create a unique index on the _id field, so nothing stops two documents from sharing the same _id.
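If you want to check this from pymongo, here is a minimal sketch. It assumes a Time Series collection named "stocks" with timeField "date" and metaField "ticker"; the function names, the sample _id string and the "MDB" ticker are made up for illustration. The functions take a pymongo Database or Collection object, so nothing runs until you pass one in.

```python
from datetime import datetime, timezone


def find_duplicate_ids(collection):
    """Return the _id values that occur more than once in `collection`."""
    pipeline = [
        # Count how many documents share each _id value.
        {"$group": {"_id": "$_id", "count": {"$sum": 1}}},
        # Keep only the _id values that were seen more than once.
        {"$match": {"count": {"$gt": 1}}},
    ]
    return [doc["_id"] for doc in collection.aggregate(pipeline)]


def demo(db):
    """Insert two documents with the same _id, then report duplicates.

    `db` is assumed to be a pymongo Database on a server that supports
    Time Series collections. This drops and recreates "stocks".
    """
    db.drop_collection("stocks")
    db.create_collection(
        "stocks",
        timeseries={"timeField": "date", "metaField": "ticker"},
    )
    now = datetime.now(timezone.utc)
    for _ in range(2):
        # Same hand-made _id both times; a fresh dict on each iteration.
        db["stocks"].insert_one({"_id": "2023-04-21", "date": now, "ticker": "MDB"})
    return find_duplicate_ids(db["stocks"])
```

With a local server you could call it as demo(MongoClient()["test"]), where "test" is just a placeholder database name. If duplicates matter for your data, the check has to happen in your application (for example, look the _id up before inserting), since the collection won’t enforce it.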

When you create a Time Series collection, a compound index will be created on the metaField and timeField. Note that even this is not a unique index.

Using the example from the docs, doing:

db.createCollection("stocks", {
  timeseries: {
    timeField: "date",
    metaField: "ticker",
    granularity: "seconds"
  }
})

will create a time series collection with a compound index on ticker and date, which are the metaField and timeField above. Running db.getCollection("stocks").getIndexes() gives:

[ { v: 2, key: { ticker: 1, date: 1 }, name: 'ticker_1_date_1' } ]

Again, note that it’s not a unique index and no index on _id is listed above.
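Since the question was about pymongo, here is a rough equivalent of the shell commands above. It is only a sketch: the helper names are mine, and it assumes `db` is a pymongo Database on a server that supports Time Series collections. `create_collection` and `index_information` are regular pymongo methods; the exact index listing you get back may vary by server version.

```python
def unique_indexes(index_info):
    """Return the names of indexes flagged unique.

    `index_info` is a dict shaped like the result of pymongo's
    Collection.index_information(): {index_name: {"key": [...], ...}}.
    Non-unique indexes simply have no "unique" entry.
    """
    return [name for name, spec in index_info.items() if spec.get("unique", False)]


def describe_stocks_indexes(db):
    """Create the docs' example collection if needed and inspect its indexes."""
    if "stocks" not in db.list_collection_names():
        db.create_collection(
            "stocks",
            timeseries={
                "timeField": "date",
                "metaField": "ticker",
                "granularity": "seconds",
            },
        )
    info = db["stocks"].index_information()
    # Return the full listing plus the names of any unique indexes.
    return info, unique_indexes(info)
```

If the second value comes back as an empty list, no index on the collection enforces uniqueness, which matches the getIndexes() output shown above.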

However, I would agree that the documentation about this behaviour in Time Series collections could be improved and made clearer.