Nodejs & Mongo pagination random order


I think the only way that you will be able to guarantee that users see unique users every time is to store the list of users that have already been seen. Even in the RAND example that you linked to, there is a possibility of intersection with a previous user list because RAND won't necessarily exclude previously returned users.


Random Sampling

If you do want to go with random sampling, consider Random record from MongoDB, which suggests using an aggregation with the $sample operator. The implementation would look something like this:

const { MongoClient } = require("mongodb");
const
    DB_NAME = "weather",
    COLLECTION_NAME = "readings",
    MONGO_DOMAIN = "localhost",
    MONGO_PORT = "32768",
    MONGO_URL = `mongodb://${MONGO_DOMAIN}:${MONGO_PORT}`;

(async function () {
    const client = await MongoClient.connect(MONGO_URL),
        db = await client.db(DB_NAME),
        collection = await db.collection(COLLECTION_NAME);

    // $sample picks 5 random documents from the collection
    const randomDocs = await collection
        .aggregate([{
            $sample: {
                size: 5
            }
        }])
        .map(doc => {
            return {
                id: doc._id,
                temperature: doc.main.temp
            };
        })
        .toArray();

    randomDocs.forEach(doc => console.log(`ID: ${doc.id} | Temperature: ${doc.temperature}`));

    client.close();
}());

Cache of Previous Users

If you go with maintaining a list of previously viewed users, you could write an implementation using the $nin filter and store the _id of previously viewed users.

Here is an example using a weather database that I have returning entries 5 at a time until all have been printed:

const { MongoClient } = require("mongodb");
const
    DB_NAME = "weather",
    COLLECTION_NAME = "readings",
    MONGO_DOMAIN = "localhost",
    MONGO_PORT = "32768",
    MONGO_URL = `mongodb://${MONGO_DOMAIN}:${MONGO_PORT}`;

(async function () {
    const client = await MongoClient.connect(MONGO_URL),
        db = await client.db(DB_NAME),
        collection = await db.collection(COLLECTION_NAME);

    let previousEntries = [], // Track ids of things we have seen
        empty = false;

    while (!empty) {
        const findFilter = {};

        if (previousEntries.length) {
            findFilter._id = {
                $nin: previousEntries
            };
        }

        // Get items 5 at a time
        const docs = await collection
            .find(findFilter, {
                limit: 5,
                projection: {
                    main: 1
                }
            })
            .map(doc => {
                return {
                    id: doc._id,
                    temperature: doc.main.temp
                };
            })
            .toArray();

        // Keep track of already seen items
        previousEntries = previousEntries.concat(docs.map(doc => doc.id));

        // Are we still getting items?
        console.log(docs.length);
        empty = !docs.length;

        // Print out the docs
        docs.forEach(doc => console.log(`ID: ${doc.id} | Temperature: ${doc.temperature}`));
    }

    client.close();
}());


I have encountered the same issue and can suggest an alternate solution.

TL;DR: Grab all the Object IDs in the collection on first landing, randomize them using NodeJS, and use them later on.

  • Disadvantage: slow first landing if you have millions of records
  • Advantage: subsequent executions are probably quicker than the other solution

Let's get into the detailed explanation :)

For a clearer explanation, I will make the following assumptions

Assumptions:

  1. Assume the programming language used is NodeJS
    • The solution works for other programming languages as well
  2. Assume you have 4 total objects in your collection
  3. Assume the pagination limit is 2

Steps:

On first execution:

  1. Grab all Object IDs

Note: I have considered performance; this execution takes a split second for a collection of 10,000 documents. If you are solving a million-record issue then maybe use some form of partition logic first / use the other solution listed

db.getCollection('my_collection').find({}, {_id:1}).map(function(item){ return item._id; });

OR

db.getCollection('my_collection').find({}, {_id:1}).map(function(item){ return item._id.valueOf(); });

Result:

ObjectId("FirstObjectID"),
ObjectId("SecondObjectID"),
ObjectId("ThirdObjectID"),
ObjectId("ForthObjectID")
  2. Randomize the retrieved array using NodeJS

Result:

ObjectId("ThirdObjectID"),
ObjectId("SecondObjectID"),
ObjectId("ForthObjectID"),
ObjectId("FirstObjectID")
  3. Store this randomized array:
  • If this is a server-side script that randomizes pagination for each user, consider storing it in a Cookie / Session
    • I suggest a Cookie (with an expiry linked to browser close) for scaling purposes
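The NodeJS randomization step above can be sketched with an in-place Fisher-Yates shuffle; plain strings stand in for the real ObjectId values here:

```javascript
// In-place Fisher-Yates shuffle -- an unbiased way to randomize the id array.
function shuffle(ids) {
  for (let i = ids.length - 1; i > 0; i--) {
    const j = Math.floor(Math.random() * (i + 1)); // random index in [0, i]
    [ids[i], ids[j]] = [ids[j], ids[i]];           // swap elements i and j
  }
  return ids;
}

// Placeholder ids standing in for ObjectId("FirstObjectID"), etc.
const ids = ["FirstObjectID", "SecondObjectID", "ThirdObjectID", "ForthObjectID"];
console.log(shuffle(ids));
```

Each permutation is equally likely, which is what makes the pagination order appear random per user.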

On each retrieval:

  1. Retrieve the stored array

  2. Grab the pagination items (e.g. the first 2 items)

  3. Find the objects for those items using find with $in


db.getCollection('my_collection')
    .find({"_id" : {"$in" : [ObjectId("ThirdObjectID"), ObjectId("SecondObjectID")]}});
  4. Using NodeJS, sort the retrieved objects to match the order of the retrieved pagination items
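The per-request steps above can be sketched without a database; the final re-sort matters because $in does not preserve the order of the ids you pass in. The names `pageIds` and `sortByIds` are illustrative, and plain strings stand in for ObjectId values:

```javascript
const PAGE_SIZE = 2; // pagination limit from the assumptions above

// Step 2: grab the ids for the requested page from the stored random order.
function pageIds(shuffledIds, page) {
  return shuffledIds.slice(page * PAGE_SIZE, (page + 1) * PAGE_SIZE);
}

// Step 3 would be: collection.find({ _id: { $in: ids } }) -- MongoDB may
// return those docs in any order, hence step 4 below.

// Step 4: re-sort the fetched docs to match the stored random order.
function sortByIds(docs, ids) {
  const rank = new Map(ids.map((id, i) => [String(id), i]));
  return [...docs].sort((a, b) => rank.get(String(a._id)) - rank.get(String(b._id)));
}

const shuffled = ["Third", "Second", "Forth", "First"]; // stored random order
const ids = pageIds(shuffled, 0);                       // ids for page 0
const docs = [{ _id: "Second" }, { _id: "Third" }];     // order MongoDB might return
console.log(sortByIds(docs, ids));                      // back in stored order
```

Stringifying ids in the rank map keeps the lookup working whether you stored raw ObjectIds or their hex strings.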

There you go! A randomized MongoDB query for pagination :)