Reduce the Number of Collections

On this page

About this Task

Example
Learn More

Creating too many collections can decrease performance. With every collection, MongoDB creates a default _id index, which uses additional storage. If you create excessive collections, those collections and indexes can strain replica set resources and decrease performance.

About this Task

Consider the following schema that stores daily temperature readings that are taken every hour. The temperature database stores each day's readings in separate collections.

// Temperatures for May 10, 2024
{
   _id: 1,
   timestamp: "2024-05-10T010:00:00Z",
   temperature: 60
},
{
   _id: 2
   timestamp: "2024-05-10T011:00:00Z",
   temperature: 61
},
{
   _id: 3
   timestamp: "2024-05-10T012:00:00Z",
   temperature: 64
}
...

// Temperatures for May 11, 2024
{
   _id: 1,
   timestamp: "2024-05-11T010:00:00Z",
   temperature: 68
},
{
   _id: 2
   timestamp: "2024-05-11T011:00:00Z",
   temperature: 72
},
{
   _id: 3
   timestamp: "2024-05-11T012:00:00Z",
   temperature: 72
}
...

With an unbound number of collections, the number of default _id indexes can grow to degrade performance.

Additionally, this approach requires a $lookup operation to query across multiple collections. $lookup operations add query complexity and can strain resources.

To reduce the number of collections, drop or archive unused collections, or remodel your data schema by consolidating related collections, denormalizing data, or leveraging embedded documents where appropriate.

Example

You can modify the schema to store each day's temperature readings in a single collection. For example:

db.dailyTemperatures.insertMany( [
   {
      _id: ISODate("2024-05-10T00:00:00Z"),
      readings: [
         {
            timestamp: "2024-05-10T10:00:00Z",
            temperature: 60
         },
         {
            timestamp: "2024-05-10T11:00:00Z",
            temperature: 61
         },
         {
            timestamp: "2024-05-10T12:00:00Z",
            temperature: 64
         }
      ]
   },
   {
      _id: ISODate("2024-05-11T00:00:00Z"),
      readings: [
         {
            timestamp: "2024-05-11T10:00:00Z",
            temperature: 68
         },
         {
            timestamp: "2024-05-11T11:00:00Z",
            temperature: 72
         },
         {
            timestamp: "2024-05-11T12:00:00Z",
            temperature: 72
         }
      ]
   }
] )

The updated schema requires fewer resources than the original. Instead of needing a separate index for each day, the default _id index now facilitates queries by date.

Learn More

Back

Avoid Unbounded Arrays

Remove Unnecessary Indexes