Alan Zhan Blog

Live for nothing, or die for something

This is a rare case where I learned by working backward from a problem — I’d better document it well. Recently, our production environment hit a MongoDB sort operation limit with this specific error:

MongoDB.Driver.MongoCommandException: Command find failed: Encountered non-retryable error during query :: caused by :: Executor error during find command :: caused by :: Sort operation used more than the maximum 33554432 bytes of RAM.

Well, the problem has occurred. Let’s solve it first — so we added an index as an immediate fix. But what exactly is a MongoDB index?

What is a MongoDB Index?

If you were a database designer, how would you find data and return it to your client?

Isn’t that simple? Just filter through data one record at a time, sort everything at once, and return what’s needed. With small datasets, this feels perfectly fine — data comes back quickly. But once the data volume grows, isn’t this approach incredibly inefficient?

Under normal circumstances, the larger your dataset, the worse your query performance scales proportionally. This is where indexes come in to solve the query performance problem. So what is an index?

An index solves the problem of poor query efficiency. At the same time, it’s a double-edged sword — adding indexes introduces additional overhead, so we should add them judiciously:

  • Pros
    1. Speeds up query efficiency.
    2. Can use index uniqueness to control data duplication.
    3. Indexes can speed up sorting time and reduce memory consumption during sort operations.
  • Cons
    1. Indexes slow down data write operations.
    2. Indexes require additional storage space.

After understanding indexes, let’s see what the official documentation says: MongoDB Manual Index.

mongodb index btree

So it uses a B-tree design. Simply put, B-tree time complexity varies depending on where the key is in the tree, but the best case is O(1). What types of indexes does MongoDB offer?

MongoDB Index Types

  • Unique Index: Ensures uniqueness of a specific field. By default, MongoDB creates a unique index on the _id field when creating a collection.
  • Single Field Index: An index on a single field.
  • Compound Index: An index on multiple fields. In practice, this is the most commonly used type since queries rarely filter on just one condition.
  • Multikey Index: Similar to a single field index, but the difference is it creates an index on array fields or fields within objects inside arrays.
  • Text Index: Similar to Elasticsearch’s search functionality, though the tokenization isn’t as sophisticated as Elasticsearch’s.
  • etc.

These are the most commonly used ones. MongoDB also offers convenience indexes like geo indexes. For details, see the official documentation: MongoDB Manual Index.

OOM Sort Operation

Now that we have a basic understanding of indexes, why does the OOM Sort error occur?

Think about it — if you read all the data from disk into memory for sorting, what would happen? So MongoDB’s decision to impose this limit makes perfect sense.

But we still need to solve the problem. How can we fix it?

Increase the Available Memory for Sorting

We can increase the available memory space. Here’s an example setting it to 100 MB:

db.adminCommand({ setParameter: 1, internalQueryExecMaxBlockingSortBytes: 104857600 })

Use Disk as Memory

If you’re using the aggregation pipeline, you can use allowDiskUse: true to bypass the memory limit and let disk help you!

Add an Index

The two methods above can quickly solve the problem, but the recommended approach is to use an index. Remember to set background to true — otherwise, building an index in production with a large dataset could cause write lock contention.

db.Foo.createIndex({ "Bar": 1 }, { background: true })

Conclusion

As for how to design and plan indexes — I’ll keep you in suspense. I’ll explain in detail in upcoming posts.

Feel free to leave a comment on my blog. Your feedback motivates me to keep writing. Thank you for reading, and let’s grow together to become better versions of ourselves.

References