Relational to NoSQL Database

php mongodb database-design relational-database nosql

First, NoSQL is not one size fits all. In SQL, almost every 1:N and M:N relation is modeled in the same way. The NoSQL philosophy is that the way you model the data depends on the data and its use patterns.

Second, I agree with Mark Baker: Scaling is hard, and it's achieved by loosening constraints; it's not a matter of technology. I love working with MongoDB, but for other reasons (no need to write ugly SQL, no need for a complicated, bloated ORM, etc.).

Now let's review your options:

Option 1 copies more data than needed. You will often have to denormalize some data, but never all of it. If you do need all of it, it's cheaper to fetch the referenced object.

Options 2 and 3 are very similar. The key question is: who's writing? You don't want a lot of clients having write access to the same document, because that will force you to use a locking mechanism and/or restrict yourself to modifier operations only. Therefore, option 2 is probably better than 3. However, if A attacks B, A would also trigger a write to user B, so you have to make sure your writes are safe.
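To illustrate the "modifier operations only" route, here is a minimal sketch. It assumes a user document that keeps a battles array and a driver that exposes update_one with MongoDB's $push operator (as pymongo does). The helper name record_battle and the FakeCollection stand-in are hypothetical; the stand-in only records calls so the sketch runs without a server.

```python
class FakeCollection:
    """Stand-in for a driver collection; it only records update_one calls."""

    def __init__(self):
        self.calls = []

    def update_one(self, filter, update):
        self.calls.append((filter, update))


def record_battle(users, attacker_id, defender_id, battle_id):
    """Append the battle id to both participants using $push updates.

    Hypothetical helper: each call modifies one document in place, so there
    is no read-modify-write cycle that a concurrent writer could clobber.
    """
    for user_id in (attacker_id, defender_id):
        users.update_one({"_id": user_id}, {"$push": {"battles": battle_id}})


users = FakeCollection()
record_battle(users, attacker_id=3443, defender_id=3444, battle_id=4354)
for call in users.calls:
    print(call)
```

Note that the two updates are still independent writes. Each one is safe against concurrent writers on its own document, but the sketch does not make the pair atomic.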

Option 4: Partial denormalization. Your user object seems to be the most important, so how about this:

    user {
      battles : [ {"Name" : "The battle of foo", "Id" : 4354 }, ... ]
      ...
    }

This will make it easier to show e.g. a user dashboard, because you don't need to know all the details in the dashboard. Note: the data structure is then coupled to details of the presentation.
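A rough sketch of how this plays out, using plain Python dicts as stand-ins for the two collections. The log field and the function names are made up for illustration. The dashboard is built from the user document alone, and the full battle is fetched by id only when it is actually opened.

```python
# In-memory stand-ins for the battles collection and one user document
# (hypothetical data; the "log" field is invented for this sketch).
battles = {
    4354: {"_id": 4354, "Name": "The battle of foo", "log": ["round 1", "round 2"]},
}
user = {
    "_id": 3443,
    "battles": [{"Name": "The battle of foo", "Id": 4354}],
}


def dashboard_lines(user_doc):
    """Build the dashboard from the user document alone (no extra lookups)."""
    return [f'{b["Name"]} (id {b["Id"]})' for b in user_doc["battles"]]


def battle_details(battle_id):
    """Fetch the full battle only when the user opens it."""
    return battles[battle_id]


print(dashboard_lines(user))
print(battle_details(4354)["log"])
```

The price of the copied name is that renaming a battle means updating every user document that embeds it.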

Option 5: Data on edges. Often, the relation needs to hold data as well:

    user {
      battles : [ {"Name" : "The battle of foo", "unitsLost" : 54, "Id" : 34354 }, ... ]
    }

Here, unitsLost is specific to the user and the battle, hence the data sits on the edge of the graph. Unlike the battle's name, this data is not denormalized.

Option 6: Linker collections. Of course, such 'edge data' can grow huge and might even call for a separate collection (a linker collection). This fully eliminates the problem of access locks:

    user {
      "_id" : 3443
    }

    userBattles {
      userId : 3443,
      battleId : 4354,
      unitsLost : 43,
      itemsWon : [ <some list > ],
      // much more data
    }
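As a sketch of how a linker collection might be used, the following uses plain Python lists as stand-ins for the three collections. The function names, the second user id, and the itemsWon contents are hypothetical. Recording a battle only inserts a new edge document, and reading joins edges to battle names in the application.

```python
# Hypothetical in-memory stand-ins for three collections.
users = [{"_id": 3443}]
battles = [{"_id": 4354, "Name": "The battle of foo"}]
user_battles = [
    {"userId": 3443, "battleId": 4354, "unitsLost": 43, "itemsWon": ["sword"]},
]


def battles_for_user(user_id):
    """Join edge documents with battle names at the application level."""
    names = {b["_id"]: b["Name"] for b in battles}
    return [
        {"Name": names[edge["battleId"]], "unitsLost": edge["unitsLost"]}
        for edge in user_battles
        if edge["userId"] == user_id
    ]


def add_participation(user_id, battle_id, units_lost):
    """Insert a new edge; no existing user or battle document is modified."""
    user_battles.append(
        {"userId": user_id, "battleId": battle_id, "unitsLost": units_lost, "itemsWon": []}
    )


add_participation(3444, 4354, 12)
print(battles_for_user(3443))
```

Because every participation is its own document, two players finishing the same battle never write to the same document.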

Which of these is best depends on a lot of details of your application. If users make a lot of clicks (i.e. you have a fine-grained interface), it makes sense to split up objects like in option 4 or 6. If you really need all data in one batch, partial denormalization doesn't help, so option 2 would be preferable. Keep in mind the multiple writer problem.


Option 2 is the way to go.

Even if you did this in a relational database, at some point (when you have to start scaling horizontally) you would also need to start removing SQL joins and joining data at the application level.

Even 10gen recommends using "manual" reference ids: http://www.mongodb.org/display/DOCS/Database+References
