Possible to retrieve multiple random, non-sequential documents from MongoDB? Possible to retrieve multiple random, non-sequential documents from MongoDB? mongoose mongoose

Possible to retrieve multiple random, non-sequential documents from MongoDB?


You may do two requests, but in an efficient way :

  1. Your first request just gets the list of all "_id" of document of your collections. Be sure to use a mongo projection db.products.find({}, { '_id' : 1 }).
  2. You have a list of "_id", just pick N randomly from the list.
  3. Do a second query using the $in operator.

What is especially important is that your first query is fully supported by an index (because it's "_id"). This index is likely fully in memory (else you'd probably have performance problems). So, only the index is read while running the first query, and it's incredibly fast.

Although the second query means reading actual documents, the index will help a lot.

If you can do things this way, you should try.


I don't think MySQL ORDER BY rand() is particularly efficient - as I understand it, it essentially assigns a random number to each row, then sorts the table on this random number column and returns the top N results.

If you're willing to accept some overhead on your inserts to the collection, you can reduce the problem to generating N random integers in a range. Add a counter field to each document: each document will be assigned a unique positive integer, sequentially. It doesn't matter what document gets what number, as long as the assignment is unique and the numbers are sequential, and you either don't delete documents or you complicate the counter document scheme to handle holes. You can do this by making your inserts two-step. In a separate counter collection, keep a document with the first number that hasn't been used for the counter. When an insert occurs, first findAndModify the counter document to retrieve the next counter value and increment the counter value atomically. Then insert the new document with the counter value. To find N random values, find the max counter value, then generate N distinct random numbers in the range defined by the max counter, then use $in to retrieve the documents. Most languages should have random libraries that will handle generating the N random integers in a range.