MongoDB and Mongoose: Nested Array of Document Reference IDs


Regarding your first question:

You specifically ask for a better way to work with child IDs that are stored in the parent. I'm pretty sure there is no better way to deal with this if it has to be this pattern.

But this problem also exists in relational databases. If you want to save your post in a relational database (using that pattern), you also have to create the comment first, get its ID, and then update the post. Granted, you can send all these tasks in a single request, which is probably more efficient than using Mongoose, but the type of work that needs to be done is the same.
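
As a minimal sketch of that two-step write, assuming `Post` and `Comment` are Mongoose models and that the post schema has a `comments` array of ObjectId references (the function name and field names are made up for illustration):

```javascript
// Post and Comment are assumed to be Mongoose models passed in by the caller.
// The field name `comments` on the post is a hypothetical example.
async function addComment(Post, Comment, postId, data) {
  // Step 1: create the child first so that it gets an _id.
  const comment = await Comment.create(data);

  // Step 2: append that _id to the parent's array of references.
  await Post.updateOne(
    { _id: postId },
    { $push: { comments: comment._id } }
  );

  return comment;
}
```

Note that these are two separate writes. If the second one fails, the comment exists but the post does not point to it, which is part of why this structure takes extra care to maintain.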

Regarding your second question:

The benefit over variant A is that you can, for example, get the post and instantly know how many comments it has, without asking MongoDB to go through possibly hundreds of documents.

The benefit over variant B is that you can store more references to comments in a single document (a single post) than whole comments, because of MongoDB's 16 MB document size limit.
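
To make the comparison concrete, here is a rough sketch of the three document shapes as plain JavaScript objects. I am assuming that variant A means the comment stores the post's ID and variant B means whole comments are embedded in the post. All field names are invented, and strings stand in for ObjectIds:

```javascript
// Parent stores child IDs (the pattern you asked about).
const postWithRefs = {
  _id: 'p1',
  title: 'Hello',
  comments: ['c1', 'c2', 'c3'], // only IDs, so each entry stays small
};

// Child stores the parent ID (assumed to be variant A).
const commentsWithParent = [
  { _id: 'c1', post: 'p1', text: 'First comment' },
  { _id: 'c2', post: 'p1', text: 'Second comment' },
  { _id: 'c3', post: 'p1', text: 'Third comment' },
];

// Whole comments embedded in the post (assumed to be variant B).
const postWithEmbedded = {
  _id: 'p1',
  title: 'Hello',
  comments: commentsWithParent.map(({ _id, text }) => ({ _id, text })),
};

// With the ID array, the count comes from the post document alone.
const countFromRefs = postWithRefs.comments.length;

// With child-to-parent references, the comments themselves must be counted.
const countFromChildren = commentsWithParent
  .filter((c) => c.post === 'p1')
  .length;

// The post holding only IDs is smaller than the one holding whole comments.
const refsSize = JSON.stringify(postWithRefs).length;
const embeddedSize = JSON.stringify(postWithEmbedded).length;
```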


The downside, however, is the one you mentioned: it's inefficient to maintain that structure. I take it that this is only an example to showcase the scenario, so here is what I would do: I would decide on a case-by-case basis what to use.

  • If the document will be read a lot, and not written to much, AND it is unlikely to grow larger than 16 MB: embed the sub-document. This way you can get all the data in a single query.

  • If you need to reference the document from multiple other documents AND your data really must be consistent, then you have no choice but to reference it.

  • If you need to reference the document from multiple other documents BUT data consistency is not that important AND the restrictions from the first bullet point apply, then embed the sub-documents, and write code to keep your data consistent.

  • If you need to reference the document from multiple other documents, and they are written to a lot, but not read that often, you're probably better off referencing them, as this is easier to code, because you don't need to write code to sync duplicate data.
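
The rules of thumb above can be condensed into a small helper. This is only a sketch of my reading of them: the function and option names are invented, and the fallback for cases the list does not cover (embed whenever the data has a single owner and stays below the size limit) is an assumption.

```javascript
// Hypothetical helper; all option names are made up for this sketch.
function chooseModel({ readHeavy, mayExceed16MB, sharedByManyDocs, strictConsistency }) {
  // Embedding is ruled out if the parent could outgrow the 16 MB limit.
  if (mayExceed16MB) return 'reference';

  if (sharedByManyDocs) {
    // Strict consistency leaves no choice but to reference.
    if (strictConsistency) return 'reference';
    // Duplicated copies pay off only when reads dominate;
    // otherwise referencing avoids writing sync code.
    return readHeavy ? 'embed-and-sync' : 'reference';
  }

  // Single owner and below the size limit: embed (assumed fallback).
  return 'embed';
}
```

For example, `chooseModel({ readHeavy: true, mayExceed16MB: false, sharedByManyDocs: true, strictConsistency: false })` yields `'embed-and-sync'`, which corresponds to the third bullet point.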

In this specific case (post/comment), referencing the parent from the child (letting the child know the parent's _id) is probably a good idea, because it's easier to maintain than the other way around, and the document might grow larger than 16 MB if the comments were embedded directly. If I knew for sure that the document would NOT grow larger than 16 MB, embedding them would be better, because it's faster to query the data that way.
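
A minimal sketch of that child-to-parent variant, assuming `Comment` is a Mongoose model whose schema has a `post` field holding the parent's _id (the field and function names are hypothetical):

```javascript
// Comment is assumed to be a Mongoose model passed in by the caller.
// The field name `post` is a hypothetical example.

// Adding a comment is a single write; the post document is not touched.
async function addCommentToPost(Comment, postId, text) {
  return Comment.create({ post: postId, text });
}

// Loading the comments for a post is a query on the child collection.
async function loadComments(Comment, postId) {
  return Comment.find({ post: postId });
}

// The comment count is a separate query instead of an array length.
async function countComments(Comment, postId) {
  return Comment.countDocuments({ post: postId });
}
```

The price is the one described for variant A above: the count and the list each need a query against the comments collection, so an index on `post` would likely be worth adding.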