MongoDB embedded vs. reference from performance perspective


1. Paging is possible with the $slice operator:

db.blogs.find({}, {posts:{$slice: [10, 10]}}) // skip 10, limit 10

2. Filtering is also possible:

db.blogs.find({"posts.title":"Mongodb!"}, {posts:{$slice: 1}}) //take one post

3, 4. Generally, I guess you are talking about a small performance difference. It's not rocket science; it's just a blog with at most 1000 posts.

You said:

Is this the correct conclusion?

No, not if you care about performance (in general, if the system will be small, you can go with a separate document).

I've done a small performance test regarding 3 and 4; here are the results:

| Count | Inserting posts | Adding to nested collection |
|-------|-----------------|-----------------------------|
| 1     | 1 ms            | 28 ms                       |
| 1000  | 81 ms           | 590 ms                      |
| 10000 | 759 ms          | 2723 ms                     |
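
For reference, here is a minimal sketch of how such a measurement can be reproduced in the mongo shell. The collection names, counts, and field names are assumptions, not the original test code:

// Hypothetical benchmark: N separate inserts vs. N $push updates into one embedded array.
var N = 1000;

var t0 = new Date();
for (var i = 0; i < N; i++) {
    db.posts.insert({blogId: "1234ABCD", title: "post " + i});
}
print("inserting posts: " + (new Date() - t0) + " ms");

db.blogs.insert({_id: "1234ABCD", posts: []});
var t1 = new Date();
for (var j = 0; j < N; j++) {
    db.blogs.update({_id: "1234ABCD"}, {$push: {posts: {title: "post " + j}}});
}
print("adding to nested collection: " + (new Date() - t1) + " ms");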


As for 3 & 4, if you are inserting into a nested document, it is basically an update.

This can be terribly bad for performance, because inserts are generally appended to the end of the data file, which is fast. Updates, on the other hand, can be much trickier.
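
Concretely, appending a post to an embedded array is not an insert at all but an update with $push (collection and field names are assumed here):

db.posts.insert({blogId: "1234ABCD", title: "Mongodb!"})                   // plain insert, appended to the data file
db.blogs.update({_id: "1234ABCD"}, {$push: {posts: {title: "Mongodb!"}}})  // update, may force the document to grow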

If your update does not change the size of a document (meaning that you had a key/value pair and simply changed the value to a new value that takes up the same amount of space), then you will be OK. But when you start modifying documents and adding new data, a problem arises.

The problem is that while MongoDB allots more space than it needs for each document, it may not be enough. If you insert a document that is 1 KB in size, MongoDB may allot 1.5 KB for it to ensure that minor changes have enough room to grow. If you use more than the allocated space, MongoDB has to fetch the entire document and re-write it at the tail end of the data file.
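
A rough way to inspect this from the shell of that era: Object.bsonsize is a shell helper, and paddingFactor is a field the old MMAPv1 storage engine exposed in collection stats (it no longer exists in newer versions):

Object.bsonsize(db.blogs.findOne({_id: "1234ABCD"}))  // actual BSON size of one blog document, in bytes
db.blogs.stats().paddingFactor                        // extra space MongoDB allots per document (MMAPv1 only)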

There is obviously a performance implication in fetching and re-writing the data which will be amplified by the frequency of such an operation. To make matters worse, when this happens you end up leaving holes or pockets of unused space in your data files.

This ultimately gets copied into memory, which means that you may end up using 2 GB of RAM to store your data set while the data itself only takes up 1.5 GB, because there are 0.5 GB worth of pockets. This fragmentation can be avoided by doing inserts as opposed to updates. It can also be fixed by doing a database repair.

In the next version of MongoDB there will be an online compaction function.
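
For reference, the repair mentioned above is available from the shell, and per-collection compaction later shipped as the compact command. Both block other operations, so treat this as a sketch rather than a recommendation:

db.repairDatabase()                // rewrites the data files, reclaiming the holes left by moved documents
db.runCommand({compact: "blogs"})  // per-collection compaction, available in later MongoDB versions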


  1. You can do paging with $slice on an embedded element.
  2. You can search with "field1.field2": /aRegex/, where aRegex is the word you are searching for, but take care of performance (see the example after this list).
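
For example, assuming the blog/posts schema used above (an unanchored, case-insensitive regex cannot use an index efficiently, hence the performance warning):

db.blogs.find({"posts.title": /Mongodb/i})                         // regex match on an embedded field
db.blogs.find({"posts.title": /Mongodb/i}, {posts: {$slice: 5}})   // same search, projecting only the first 5 posts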

About 3 and 4, I have no data to prove anything.

BTW, two collections can be easier to code, use, and manage. You can simply store a blogId in each post document and add "blogId": "1234ABCD" to all your queries, as sketched below.
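
A minimal sketch of that two-collection layout (the collection and field names are just an assumption):

db.posts.insert({blogId: "1234ABCD", title: "Mongodb!", body: "..."})  // each post carries a reference to its blog
db.posts.find({blogId: "1234ABCD"}).skip(10).limit(10)                 // every query just adds the blogId filter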