Mongodb: Select the top N rows from each group [duplicate]
You need to first sort the documents in the collection by the blog_id
and timestamp
fields, then do an initial group which creates an array of the original documents in descending order. After that you can slice the array with the documents to return the first 3 elements.
The intuition can be followed in this example:
db.entries.aggregate([ { '$sort': { 'blog_id': 1, 'timestamp': -1 } }, { '$group': { '_id': '$blog_id', 'docs': { '$push': '$$ROOT' }, } }, { '$project': { 'top_three': { '$slice': ['$docs', 3] } } }])
The only way to do this in basic mongo if you can live with two things :
- An additional field in your entry document, let's call it "age"
- A new blog entry taking an additional update
If so, here's how you do it :
Upon creating a new intro do your normal insert and then execute this update to increase the age of all posts (including the one you just inserted for this blog) :
db.entries.update({blog_id: BLOG_ID}, {age:{$inc:1}}, false, true)
When querying, use the following query which will return the most recent 3 entries for each blog :
db.entries.find({age:{$lte:3}, timestamp:{$gte:STARTOFMONTH, $lt:ENDOFMONTH}}).sort({blog_id:1, age:1})
Note that this solution is actually concurrency safe (no entries with duplicate ages).
This answer using map reduce by drcosta from another question did the trick
In mongo, how do I use map reduce to get a group by ordered by most recent
mapper = function () { emit(this.category, {top:[this.score]});}reducer = function (key, values) { var scores = []; values.forEach( function (obj) { obj.top.forEach( function (score) { scores[scores.length] = score; }); }); scores.sort(); scores.reverse(); return {top:scores.slice(0, 3)};}function find_top_scores(categories) { var query = []; db.top_foos.find({_id:{$in:categories}}).forEach( function (topscores) { query[query.length] = { category:topscores._id, score:{$in:topscores.value.top} }; }); return db.foo.find({$or:query});