Hi Babak… Great questions… You’re absolutely right: the size of an index differs depending on when it’s created, or more accurately, on how much data is in the collection when it’s created. Let’s dive in…

Wildcard Index Size and Rebuilding:

When you insert documents while an index (like a wildcard index) already exists, MongoDB has to update that index incrementally for each document. Keys arrive in whatever order the documents do, so index pages tend to split as they fill and can end up only partly full, which means more fragmentation and less efficient disk use. When you create the index after inserting the data, MongoDB can build it in one pass: it scans the whole collection, sorts the keys, and writes the index in bulk. The index is typically tighter and more compact as a result.
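If you want to see the difference for yourself, here is a minimal sketch for comparing index sizes before and after a rebuild. It assumes `coll` is a pymongo `Collection`; the helper name `index_sizes_mb` is just for illustration. It relies on the `collStats` command, which reports per-index sizes in bytes under `indexSizes` (newer server versions prefer the `$collStats` aggregation stage, so treat this as one way to do it).

```python
def index_sizes_mb(coll):
    """Return {index_name: size_in_MiB} for a pymongo Collection (assumed)."""
    # collStats reports the on-disk size of each index, in bytes,
    # under the "indexSizes" key of its result document.
    stats = coll.database.command("collStats", coll.name)
    return {name: size / (1024 * 1024) for name, size in stats["indexSizes"].items()}


# Hypothetical usage (connection string, database and collection names are placeholders):
#   from pymongo import MongoClient
#   coll = MongoClient("mongodb://localhost:27017")["mydb"]["mycoll"]
#   print(index_sizes_mb(coll))
```

Running it once with the index built during inserts and once after a drop and recreate should show how much the two builds differ on your data.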

Should You Rebuild Indexes?

I don’t recommend rebuilding indexes on a regular schedule… MongoDB handles a lot of optimizations under the hood, but it doesn’t automatically compact or rebuild indexes. If you measure index bloat and find it is actually affecting performance or disk usage, manually rebuilding the index can help. In that case, plan a maintenance window where you drop and recreate the index, and repeat it only if space and performance become issues again over time.
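A manual rebuild is just a drop followed by a create. Here is a rough sketch, again assuming `coll` is a pymongo `Collection`; `rebuild_index` is a made-up helper name, and the sketch only restores the keys and the name, so any extra options (unique, partial filters, wildcardProjection, and so on) would have to be passed again yourself.

```python
def rebuild_index(coll, keys, name):
    """Drop the index called `name` (if present) and build it again.

    `coll` is assumed to be a pymongo Collection. Queries that rely on the
    index will not have it available between the drop and the end of the build.
    """
    # index_information() returns a dict keyed by index name.
    if name in coll.index_information():
        coll.drop_index(name)
    # create_index returns the name of the index it created.
    return coll.create_index(keys, name=name)


# Hypothetical usage for a wildcard index on all fields:
#   rebuild_index(coll, [("$**", 1)], "$**_1")
```

Since the index is missing while it rebuilds, this is best done during a quiet period.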

Bulk Inserts and Indexing:

You’re right to assume that inserting data with indexes in place will take longer, since MongoDB updates the index for each document inserted… If you’re inserting millions of records, it can be more efficient to drop the index, insert the data, and then recreate the index afterward. However, dropping an index also temporarily affects the reads and queries that depend on it, so it’s a trade-off… and it depends on how active the app is. In most cases:

  • If you’re doing a one-time or infrequent bulk insert (like your million-record scenario), dropping indexes first and recreating them after the insert is a good strategy.
  • For frequent bulk inserts, you might use the ordered: false option on the bulk write operation. It can speed up inserts because MongoDB carries on with the remaining documents even if some fail, and it doesn’t have to apply them strictly one after another.
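The first option (drop, load, recreate) could look something like the sketch below. It assumes `coll` is a pymongo `Collection`, that the index named `name` already exists, and that `docs` is a non-empty list; `bulk_load` is a hypothetical helper, not something the driver provides.

```python
def bulk_load(coll, docs, keys, name):
    """Drop an index, insert `docs`, then recreate the index.

    Assumes `coll` is a pymongo Collection, the index `name` exists,
    and `docs` is a non-empty list of documents.
    """
    coll.drop_index(name)
    try:
        result = coll.insert_many(docs)
        inserted = len(result.inserted_ids)
    finally:
        # Recreate the index even if the insert raised, so queries
        # are not left without it.
        coll.create_index(keys, name=name)
    return inserted


# Hypothetical usage:
#   bulk_load(coll, documents, [("$**", 1)], "$**_1")
```

The `finally` block is the main design choice here: whatever happens during the load, the index comes back.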

See the docs for insert…

`ordered` (boolean): Optional. A boolean specifying whether the mongod instance should perform an ordered or unordered insert. Defaults to true.
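In pymongo the same setting is the `ordered` keyword on `insert_many`. Here is a small sketch of an unordered insert done in batches, with the indexes left in place. It assumes `coll` is a pymongo `Collection`; the helper name and the batch size of 1000 are arbitrary choices for illustration.

```python
from itertools import islice


def insert_in_batches(coll, docs, batch_size=1000):
    """Insert `docs` in unordered batches; return how many were inserted.

    `coll` is assumed to be a pymongo Collection. With ordered=False the
    server attempts every document in a batch even if some of them fail;
    pymongo then raises BulkWriteError for that batch, which this sketch
    does not catch.
    """
    it = iter(docs)
    total = 0
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            break
        result = coll.insert_many(batch, ordered=False)
        total += len(result.inserted_ids)
    return total
```

If you expect some documents to fail (duplicate keys, for example), you would want to catch the error per batch and inspect its details rather than let it stop the whole load.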

### So in short...

* **Yes**, rebuilding indexes occasionally can help reduce size and improve performance, but it's manual.
* **For large bulk inserts**, dropping indexes beforehand and recreating them after the insert is a solid approach. MongoDB won’t handle that automatically.

Let us know if you want more details on the index-rebuild process or how to optimize bulk inserts... Happy to dive (even) deeper!