MongoDB beginner - to normalize or not to normalize?

mongodb non-relational-database database nosql

Try this approach:

Work out which entity (or entities) are the hero(s)

With 'hero', I mean the entity(s) that the database is centered around. Let's take your example. The hero of the real-estate example is the house*.

Work out the ownerships

Go through the other entities, such as the owner, agency, images and reviews and ask yourself whether it makes sense to place their information together with the house. Would you have a cascading delete on any of the foreign keys in your relational database? If so, then that implies ownership.

Work out whether it actually matters that data is de-normalised

You will have agency (and probably owner) details spread across multiple houses. Does that matter?

Your house collection will probably look like this:

house: {owner,agency,images[], // recommend references to GridFS herereviews[] // you probably won't get too many of these for a single house}

*Actually, it's probably the ad of the house (since houses are typically advertised on a real-estate website and that's probably what you're really interested in) so just consider that

mongodb non-relational-database database nosql

Sarah Mei wrote an informative article about the kinds of issues that can arise with data integrity in nosql dbs. The choice between duplicate data or using id's, code based joins and the challenges with keeping data integrity. Her take is that any nosql db with code based joins will lose data integrity at some point. Imho the articles comments are as valuable as the article itself in understanding these issues and possible resolutions.

Link: http://www.sarahmei.com/blog/2013/11/11/why-you-should-never-use-mongodb/comment-page-1/

mongodb non-relational-database database nosql

I would just like to give a normalization refresher from the MongoDB's perspective -

What are the goals of normalization?

Frees the database from modification anomalies - For MongoDB, it looks like embedding data would mostly cause this. And in fact, we should try to avoid embedding data in documents in MongoDB which possibly create these anomalies. Occasionally, we might need to duplicate data in the documents for performance reasons. However that's not the default approach. The default is to avoid it.
Should minimize re-design when extending - MongoDB is flexible enough because it allows addition of keys without re-designing all the documents
Avoid bias toward any particular access pattern - this is something, we're not going to worry about when describing schema in MongoDB. And one of the ideas behind the MongoDB is to tune up your database to the applications that we're trying to write and the problem we're trying to solve.

CodeHunter

MongoDB beginner - to normalize or not to normalize?

What are the goals of normalization?

Recent Posts

How can I color dots in a xy scatterplot according to column value?

How to update a claim in ASP.NET Identity?

What does {0} mean when initializing an object?

Accessing members of items in a JSONArray with Java

How to log SQL statements in Spring Boot?

Powershell Get-WebSite name parameter is ignored

How to detect scroll to bottom of html element

Java synchronized method

How to test controllers with CodeIgniter?

Detect Visual Composer

Matplotlib: Specify format of floats for tick labels

Rails join a list of strings with commas and "and" before the last