Editing subdocuments N-N relationship in mongodb


Based on the information that you provided, I would recommend two possible approaches, starting from the same foundation:

Use two collections (articles and platforms) and store only a reference to platform documents in an array defined on article documents

I would recommend this approach if:

  • You have a high cardinality of both article and platform documents
  • You want to be able to manage both entities independently, while also syncing references between them

    // articles collection schema
    {
      "_id": ...,
      "title": "I am an article",
      ...
      "platforms": [ "platform_1", "platform_2", "platform_3" ],
      ...
    }

    // platforms collection schema
    {
      "_id": "platform_1",
      "name": "Platform 1",
      "url": "http://right/here",
      ...
    },
    {
      "_id": "platform_2",
      "name": "Platform 2",
      "url": "http://right/here",
      ...
    },
    {
      "_id": "platform_3",
      "name": "Platform 3",
      "url": "http://right/here",
      ...
    }

Although this approach is quite flexible, it comes at a cost: if you require both article and platform data, you will have to issue more queries to your MongoDB instance, because the data is split across two collections.

For example, when loading an article page that also displays a list of platforms, you would first query the articles collection and then query the platforms collection, using the members of the platforms array on the article document, to retrieve all the platform entities on which that article is published.
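The two round trips can be sketched roughly as follows. This is only a minimal sketch under stated assumptions: the helper names `platformsFilterFor` and `attachPlatforms` are hypothetical, the sample values are illustrative, and the `_id` values are assumed to be plain strings as in the schema above. The second query uses the `$in` operator so that all referenced platforms are fetched in one call.

```javascript
// Hypothetical helper: builds the filter for the second query, which
// fetches every platform referenced by an already loaded article.
function platformsFilterFor(article) {
  return { _id: { $in: article.platforms || [] } };
}

// Hypothetical helper: replaces the array of ids with the matching
// platform documents, preserving the order stored on the article.
// Ids without a matching platform document are dropped.
// Assumes string ids; ObjectId values would need a string key instead.
function attachPlatforms(article, platformDocs) {
  const byId = new Map(platformDocs.map((p) => [p._id, p]));
  const platforms = (article.platforms || [])
    .filter((id) => byId.has(id))
    .map((id) => byId.get(id));
  return { ...article, platforms };
}

// Illustrative article document mirroring the schema above.
const article = {
  _id: "article_1",
  title: "I am an article",
  platforms: ["platform_1", "platform_3"],
};
const filter = platformsFilterFor(article);

// In the mongo shell, the two queries would look roughly like:
//   var loaded = db.articles.findOne({ _id: "article_1" });
//   var docs = db.platforms.find(platformsFilterFor(loaded)).toArray();
//   var page = attachPlatforms(loaded, docs);
```

Merging on the application side keeps the platforms collection as the single source of truth, at the price of the extra query.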

However, if you only have a small subset of frequently accessed platform attributes that you need to have available when loading an article document, you might enhance the platforms array on the articles collection to store those attributes in addition to the _id reference to the platform documents:

    // enhanced articles collection schema
    {
      "_id": ...,
      "title": "I am an article",
      ...
      "platforms": [
        {platform_id: "platform_1", name: "Platform 1"},
        {platform_id: "platform_2", name: "Platform 2"},
        {platform_id: "platform_3", name: "Platform 3"}
      ],
      ...
    }

This hybrid approach would be suitable if the platform data attributes that you frequently retrieve to display together with article specific data are not changing that often.
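One way to populate such an entry is to copy only the chosen attributes when a platform is attached to an article. The sketch below assumes that `name` is the only denormalized attribute; the helper `toEmbeddedPlatform` is a hypothetical name, not part of any library. Note that `$addToSet` compares embedded documents as a whole, so it only prevents duplicates while the copied attributes are identical.

```javascript
// Hypothetical helper: picks only the frequently read attributes of a
// platform document for embedding in an article's platforms array.
function toEmbeddedPlatform(platform) {
  return { platform_id: platform._id, name: platform.name };
}

// Illustrative platform document mirroring the platforms schema.
const platform = {
  _id: "platform_1",
  name: "Platform 1",
  url: "http://right/here",
};

// Update document that attaches the platform subset to an article.
const addPlatformUpdate = {
  $addToSet: { platforms: toEmbeddedPlatform(platform) },
};

// mongo shell (articleId is a placeholder for the target article's _id):
//   db.articles.updateOne({ _id: articleId }, addPlatformUpdate);
```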

Otherwise, you will have to synchronize all the updates that are made to the platform document attributes in the platforms collection with the subset of attributes that you track as part of the platforms array for article documents.
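A possible way to do that synchronization is a single `updateMany` on the articles collection that uses the filtered positional operator `$[<identifier>]` together with `arrayFilters`. This is a sketch rather than a definitive implementation: it assumes MongoDB 3.6 or later, it assumes `name` is the denormalized attribute, and `buildPlatformRenameSync` is a hypothetical helper name.

```javascript
// Hypothetical helper: builds the arguments for an updateMany call that
// propagates a platform rename to every article embedding that platform.
function buildPlatformRenameSync(platformId, newName) {
  return {
    // Match only articles that embed the platform at all.
    filter: { "platforms.platform_id": platformId },
    // Update the name on the array elements selected by the "p" filter.
    update: { $set: { "platforms.$[p].name": newName } },
    // Restrict "p" to the elements that reference this platform.
    options: { arrayFilters: [{ "p.platform_id": platformId }] },
  };
}

const sync = buildPlatformRenameSync("platform_1", "Platform One");

// mongo shell: update the source document first, then the embedded copies.
//   db.platforms.updateOne({ _id: "platform_1" }, { $set: { name: "Platform One" } });
//   db.articles.updateMany(sync.filter, sync.update, sync.options);
```

The two writes are separate operations, so readers may briefly see the old name on articles unless both are wrapped in a transaction.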

Regarding the management of article lists for individual platforms, I wouldn't recommend storing N-to-N references in both collections, as the aforementioned mechanism already allows you to extract article lists by querying the articles collection using a find query with the _id value of the platform document:

    // Approach #1
    db.articles.find({"platforms": "platform_1"});

    // Approach #2
    db.articles.find({"platforms.platform_id": "platform_1"});

Having presented two different approaches, what I would recommend now is for you to analyze the query patterns and performance thresholds of your application and make a calculated decision based on the scenarios that you encounter.