
MongoDB preload documents into RAM for better performance


Your observed performance problem on an initial query is likely one of the following issues (in rough order of likelihood):

1) Your application / web service has some overhead to initialize on first request (e.g. allocating memory, setting up connection pools, resolving DNS, ...).

2) Indexes or data you have requested are not yet in memory, so need to be loaded.

3) The Query Optimizer may take a bit longer to run on the first request, because it is comparing candidate plans for your query pattern.

It would be very helpful to test the query via the mongo shell, and isolate whether the overhead is related to MongoDB or your web service (rather than timing both, as you have done).
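One way to separate the two is to time the query by itself in the mongo shell, several times in a row. The helper below is a minimal sketch: timeMs and timeRuns are names made up for this example, and the collection and filter in the usage comment are placeholders for your own.

```javascript
// Time a zero-argument function and return the elapsed wall-clock milliseconds.
// Plain JavaScript, so it runs in the mongo shell as well as other runtimes.
function timeMs(fn) {
  var start = new Date().getTime();
  fn();
  return new Date().getTime() - start;
}

// Run the same operation several times and collect the timings.
// The first entry is the "cold" run; later entries show the warmed-up cost.
function timeRuns(fn, runs) {
  var timings = [];
  for (var i = 0; i < runs; i++) {
    timings.push(timeMs(fn));
  }
  return timings;
}

// Usage in the mongo shell (collection name and filter are placeholders):
//   timeRuns(function () { db.mycoll.find({ status: "A" }).itcount(); }, 3);
```

If the first timing is much larger than the rest, the delay is on the MongoDB side (data or indexes not yet in memory, or plan selection). If all timings are similar and small, look at the web service instead.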

Following are some notes related to MongoDB.

Caching

MongoDB doesn't have a "caching" time for documents in memory. It uses memory-mapped files for disk I/O and the documents in memory are based on your active queries (documents/indexes you've recently loaded) as well as the available memory. The operating system's virtual memory manager is in charge of caching, and typically will follow a Least-Recently Used (LRU) algorithm to decide which pages to swap out of memory.

Memory Usage

The expected behaviour is that over time MongoDB will grow to use all free memory to store your active working data set.

Looking at your provided db.stats() numbers (and assuming that is your only database), it looks like your database size is currently about 1 GB, so you should be able to keep everything within your 10 GB total RAM unless:

  • there are other processes competing for memory
  • you have restarted your mongod server and those documents/indexes haven't been requested yet
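The size comparison above can be done directly against the db.stats() output. This is a rough sketch only: workingSetEstimate is a hypothetical helper, it assumes dataSize and indexSize are reported in bytes (the default scale), and it treats data plus indexes as an upper bound on what needs to fit.

```javascript
// Compare data + index size (bytes, as reported by db.stats()) with total RAM.
// This ignores other databases and other processes competing for memory.
function workingSetEstimate(stats, ramBytes) {
  var MB = 1024 * 1024;
  var needed = stats.dataSize + stats.indexSize;
  return {
    neededMB: Math.round(needed / MB),
    ramMB: Math.round(ramBytes / MB),
    fitsInRam: needed < ramBytes
  };
}

// Usage in the mongo shell, for a machine with 10 GB of RAM:
//   printjson(workingSetEstimate(db.stats(), 10 * 1024 * 1024 * 1024));
```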

In MongoDB 2.2, there is a new touch command you can use to load indexes or documents into memory after a server restart. This should only be used on initial startup to "warm up" the server, as otherwise you could be unhelpfully forcing actual "active" data out of memory.
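As a sketch of what that looks like from the shell: touch is run as a database command with data and index flags. The wrapper functions here are hypothetical names for illustration, and "records" in the usage comment stands in for your collection.

```javascript
// Build the touch command document for one collection.
// data: true loads the documents, index: true loads the indexes.
function buildTouchCommand(collectionName) {
  return { touch: collectionName, data: true, index: true };
}

// Send the command through a database handle (the shell's `db` object).
function warmCollection(db, collectionName) {
  return db.runCommand(buildTouchCommand(collectionName));
}

// Usage in the mongo shell, once, right after a restart:
//   printjson(warmCollection(db, "records"));
```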

On a Linux system, for example, you can use the top command and should see that:

  • virtual bytes/VSIZE will tend to be the size of the entire database
  • if the server doesn't have other processes running, resident bytes/RSIZE will be the total memory of the machine (this includes file system cache contents)
  • mongod should not use swap (since the files are memory-mapped)
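The same numbers are also visible from inside the shell via db.serverStatus(). A small sketch, assuming the mem section reports resident, virtual and mapped sizes in megabytes (as it does for the memory-mapped storage described here); memorySummary is a made-up helper name.

```javascript
// Pull the memory figures out of a db.serverStatus() result.
// Values in the `mem` section are reported in megabytes.
function memorySummary(serverStatus) {
  var mem = serverStatus.mem;
  return {
    residentMB: mem.resident, // memory currently held in RAM
    virtualMB: mem.virtual,   // total virtual address space in use
    mappedMB: mem.mapped      // size of the memory-mapped data files
  };
}

// Usage in the mongo shell:
//   printjson(memorySummary(db.serverStatus()));
```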

You can use the mongostat tool to get a quick view of your mongod activity or, more usefully, use a service like MMS to monitor metrics over time.

Query Optimizer

The MongoDB Query Optimizer compares plan execution for a query pattern every ~1,000 write operations, and then caches the "winning" query plan until the next time the optimizer runs, or until you explicitly call explain() on that query.

This should be a straightforward one to test: run your query in the mongo shell with .explain() and look at the timing in milliseconds, and also the number of index entries and documents scanned. The timing for an explain() isn't the actual time the query will take to run, as it includes the cost of comparing the plans. The typical execution will be much faster, and you can look for slow queries in your mongod log.
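To make the explain() output easier to read, you could reduce it to the few fields that matter here. This sketch assumes the explain() format of the 2.x series (cursor, millis, nscanned, nscannedObjects, n); newer server versions report these under different names. summarizeExplain is a hypothetical helper.

```javascript
// Reduce a 2.x-style explain() document to the fields worth checking first.
function summarizeExplain(e) {
  return {
    // "BtreeCursor <index>" means an index was used; "BasicCursor" is a full scan.
    usesIndex: String(e.cursor).indexOf("BtreeCursor") === 0,
    millis: e.millis,                   // includes the cost of comparing plans
    nscanned: e.nscanned,               // index entries (or documents) examined
    nscannedObjects: e.nscannedObjects, // documents examined
    returned: e.n,                      // documents matching the query
    // Entries scanned per document returned; close to 1 is ideal.
    scannedPerReturned: e.n > 0 ? e.nscanned / e.n : null
  };
}

// Usage in the mongo shell (collection and filter are placeholders):
//   printjson(summarizeExplain(db.mycoll.find({ status: "A" }).explain()));
```

A high scannedPerReturned value, or usesIndex being false, points at a missing or poorly matched index rather than a cold cache.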

By default MongoDB will log all queries slower than 100ms, so this provides a good starting point to look for queries to optimize. You can adjust the slowms threshold with the --slowms option when starting mongod, or with the Database Profiler commands.
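For example, the threshold can be changed at runtime from the shell with db.setProfilingLevel(), whose second argument is the slowms value. Passing level 0 leaves the profiler off and only changes what gets logged as slow. The wrapper name and the 50 ms figure below are illustrative choices, not recommendations.

```javascript
// Change the slow-query threshold without enabling the profiler.
// Level 0 = profiler off; the second argument is the slowms value in milliseconds.
function setSlowQueryThreshold(db, thresholdMs) {
  return db.setProfilingLevel(0, thresholdMs);
}

// Usage in the mongo shell, logging anything slower than 50 ms:
//   printjson(setSlowQueryThreshold(db, 50));
```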

Further reading in the MongoDB documentation: