
Referencing the whole document in MongoDB Aggregation Pipeline


Use the $$ROOT variable:

References the root document, i.e. the top-level document, currently being processed in the aggregation pipeline stage.
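As a minimal sketch of how that could look for the grouping discussed in this thread: the collection name `tweets` and the field `clusters.clusterID` come from the other answers here, while the output field name `members` and the variable name `pipeline` are only illustrative. `$$ROOT` requires MongoDB 2.6 or later.

```javascript
// Group tweets by cluster ID and collect each whole document via $$ROOT.
// "members" is an illustrative field name, not something MongoDB requires.
const pipeline = [
  {
    $group: {
      _id: "$clusters.clusterID",
      members: { $addToSet: "$$ROOT" } // $$ROOT = the full input document
    }
  }
];

// In the mongo shell, `db` is predefined; the guard lets the snippet
// also load in plain JavaScript, where `db` does not exist.
if (typeof db !== "undefined") {
  db.tweets.aggregate(pipeline);
}
```

If duplicates within a group should be kept, `$push` can be used in place of `$addToSet`.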


Before $$ROOT was added in MongoDB 2.6, the aggregation framework had no mechanism to access the full document. If you only need a subset of fields, you can do:

db.tweets.aggregate([
  { $group: {
      _id: '$clusters.clusterID',
      members: { $addToSet: {
        user: "$user",
        text: "$text"
        // etc. for the subset of fields you want
      } }
  } }
])

Keep in mind that with a few hundred thousand tweets, aggregating full documents will run into the 16 MB size limit on an aggregation result document.
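One way to keep each group small, offered only as a sketch and not as something the answer above prescribes, is to collect just the `_id` of every tweet per cluster and fetch the full documents separately when needed. The names `memberIds`, `count`, and `idsOnlyPipeline` are illustrative.

```javascript
// Collect only tweet _ids per cluster so each result document stays small.
// "memberIds" and "count" are illustrative field names.
const idsOnlyPipeline = [
  {
    $group: {
      _id: "$clusters.clusterID",
      memberIds: { $addToSet: "$_id" }, // references instead of full documents
      count: { $sum: 1 }                // number of tweets seen in the cluster
    }
  }
];

// Runs only where the mongo shell's `db` object is available.
if (typeof db !== "undefined") {
  db.tweets.aggregate(idsOnlyPipeline);
}
```

The full tweets for one cluster can then be loaded with an ordinary `find` on `_id` using `$in`.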

You can do this via MapReduce like this:

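// Map: emit each tweet under its cluster ID, wrapped in a members array.
var m = function () {
  emit(this.clusters.clusterID, { members: [this] });
};

// Reduce: merge the members arrays of all values emitted for one key.
var r = function (k, v) {
  var res = { members: [] };
  v.forEach(function (val) {
    res.members = val.members.concat(res.members);
  });
  return res;
};

db.tweets.mapReduce(m, r, { out: "output" });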


I think MapReduce is more useful for this task.

As Asya Kamsky pointed out in the comments, my example is incorrect for MongoDB; please refer to the official MongoDB docs.