How to query all subdocuments


Here is how you do this using the aggregation framework (you need the just-released MongoDB 2.2).

db.stories.aggregate([
    {
        "$unwind" : "$tags"
    },
    {
        "$group" : {
            "_id" : "$tags.tagname",
            "total" : {
                "$sum" : 1
            }
        }
    },
    {
        "$sort" : {
            "total" : -1
        }
    }
])

Your result will look like this:

{
    "result" : [
        {
            "_id" : "fairytale",
            "total" : 3
        },
        {
            "_id" : "funny",
            "total" : 2
        },
        {
            "_id" : "silly",
            "total" : 1
        },
        {
            "_id" : "fox",
            "total" : 1
        }
    ],
    "ok" : 1
}
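To make it clearer what each stage does, here is a rough plain-JavaScript equivalent of the pipeline, run over an in-memory array instead of a collection. The sample stories and the countTags helper are made up for illustration; only the tags and tagname field names come from the query above.

```javascript
// Hypothetical sample data shaped like the documents the pipeline expects.
var stories = [
    { title: "A", tags: [{ tagname: "fairytale" }, { tagname: "funny" }] },
    { title: "B", tags: [{ tagname: "fairytale" }, { tagname: "fox" }] },
    { title: "C", tags: [{ tagname: "fairytale" }, { tagname: "funny" }, { tagname: "silly" }] }
];

// Illustrative helper (not a MongoDB API): mimics $unwind + $group + $sort.
function countTags(docs) {
    var totals = Object.create(null);
    docs.forEach(function (doc) {
        // $unwind: visit every element of the tags array separately.
        (doc.tags || []).forEach(function (tag) {
            // $group with $sum: 1 keeps one counter per distinct tagname.
            totals[tag.tagname] = (totals[tag.tagname] || 0) + 1;
        });
    });
    // $sort on total descending puts the most used tag first.
    return Object.keys(totals)
        .map(function (name) { return { _id: name, total: totals[name] }; })
        .sort(function (a, b) { return b.total - a.total; });
}

var tagTotals = countTags(stories);
console.log(tagTotals);
```

This is only meant to show the logic; on a real collection the pipeline runs inside the server, so the documents never have to be loaded into your application.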


Welcome to Mongo

The best "Schema" for your data will be something like this.

You create a collection called stories, and each story will be a document in this collection. You can then easily query your data with something like:

db.stories.find({ "tags.tagname": "fairytale"}); // will find all documents that have fairytale as a tagname.

UPDATE

db.stories.find({ "tags.tagname": { $exists : true }}); // will find all documents that have a tagname.

Notice the dot notation in the find query; that's how you reach into arrays and objects in MongoDB.
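As a rough illustration of what the dot notation matches, here is the same check written against plain JavaScript objects. The document shape is an assumption inferred from the queries (a tags array of objects with a tagname field); the titles and the hasTagname helper are hypothetical.

```javascript
// Assumed shape of documents in the stories collection.
var sampleStories = [
    { title: "Story one", tags: [{ tagname: "fairytale" }, { tagname: "funny" }] },
    { title: "Story two", tags: [{ tagname: "fox" }] },
    { title: "Story three", tags: [] }
];

// Hypothetical helper: true when any element of doc.tags has the given tagname,
// which is roughly what { "tags.tagname": name } matches for an array of subdocuments.
function hasTagname(doc, name) {
    return Array.isArray(doc.tags) && doc.tags.some(function (tag) {
        return tag != null && tag.tagname === name;
    });
}

var fairytales = sampleStories.filter(function (doc) {
    return hasTagname(doc, "fairytale");
});
console.log(fairytales.map(function (doc) { return doc.title; }));
```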


You can use map-reduce (MR) to accomplish this. In the map step you would simply pick out the tags and emit them:

var map = function(){
    for(var i=0;i<this.tags.length;i++){
        emit(this.tags[i].tagname, {count: 1});
    }
}

And then your reduce would run through the emitted documents, basically summing up the number of times that tag was seen.
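A minimal sketch of such a reduce function follows. It assumes the map function emits values of the form {count: 1}, as above; the variable names and the sample call are illustrative.

```javascript
// Sums the counts emitted for one tag. It returns the same {count: n} shape
// it receives, because MongoDB may call reduce again on its own output.
var reduce = function (key, values) {
    var total = 0;
    for (var i = 0; i < values.length; i++) {
        total += values[i].count;
    }
    return { count: total };
};

// Plain JavaScript check, outside MongoDB: two single emits plus one
// partial result that has already been reduced to a count of 3.
var reduced = reduce("fairytale", [{ count: 1 }, { count: 1 }, { count: 3 }]);
console.log(reduced);
```

In the shell, the two functions would be passed together, for example db.stories.mapReduce(map, reduce, {out: "tag_totals"}), where the output collection name tag_totals is just a placeholder.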

If you upgrade to the latest unstable 2.2 you can also use the aggregation framework. You would use the $project, $unwind and $group stages of the aggregation pipeline, with the $sum accumulator, to pull the tags out of each post and sum them up. That gives you a score-based tag cloud, letting you size the text of each tag according to its total.

If yes, is it a good practice? Or does it break the nosql paradigm?

This is a pretty standard problem in MongoDB and one you won't get away from. With the reusable structure comes the inevitable need to do some complex querying over it. Fortunately, in 2.2 there is the aggregation framework to help.

As to whether this is a good or bad approach: it is a pretty standard one, so it is neither good nor bad.

As to making the structure better, you could pre-aggregate unique tags with their counts into a separate collection. This would make it easier to build your tag cloud in real time.

Pre-aggregation is a way of building the collection you would normally get from an MR, without needing to run MRs or the aggregation framework. It is normally event-driven within your app: when a user creates a post or retags a post, it triggers a pre-aggregation event that updates a "tag_count" collection whose documents look like:

{
    _id: {},
    tagname: "",
    count: 1
}

When the event is triggered your app will loop through the tags on the post basically doing $inc upserts like so:

db.tag_count.update({tagname: 'whoop'}, {$inc: {count: 1}}, true);

And so you will now have a collection of tags with their count throughout your blog. From there you go the same route as the MR did and just query this collection getting out your data. You would of course need to handle deletion and update events but you get the general idea.
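One way to handle an update or deletion event is to compare the old and new tag lists and apply only the differences. The sketch below is plain JavaScript; tagDeltas is a hypothetical helper, and it assumes tags are passed as arrays of tagname strings with no duplicates within a post.

```javascript
// Hypothetical helper: returns { tagname: +1 | -1 } for tags added or removed.
function tagDeltas(oldTags, newTags) {
    var deltas = Object.create(null);
    newTags.forEach(function (name) {
        if (oldTags.indexOf(name) === -1) { deltas[name] = 1; }   // tag added
    });
    oldTags.forEach(function (name) {
        if (newTags.indexOf(name) === -1) { deltas[name] = -1; }  // tag removed
    });
    return deltas;
}

// A post retagged from [funny, fox] to [funny, silly].
var deltas = tagDeltas(["funny", "fox"], ["funny", "silly"]);
console.log(deltas);

// Each entry would then become one update against tag_count, e.g.
// db.tag_count.update({tagname: name}, {$inc: {count: deltas[name]}}, true);
```

Deleting a post is the same call with an empty new list, so every one of its tags is decremented once. Tags whose count drops to zero would still need to be cleaned up or filtered out when you render the cloud.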