
Scaling MongoDB on EC2 or should I just switch to DynamoDB?


You could host Mongo on a single EC2 server that every box in the web farm connects to. You can then easily spin up another web instance that uses the same DB box.

We currently run three Mongo servers as a replica set. When we reach the point where we need to scale Mongo horizontally, we'll spin up some new instances and shard the larger collections.
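To make the "every web box points at the same replica set" idea concrete, here is a minimal sketch that builds a standard `mongodb://` connection string listing all three members. The host names, the set name `rs0` and the database name `site` are made up for illustration; substitute your own.

```python
def replica_set_uri(hosts, replica_set, database=""):
    """Build a standard MongoDB connection string for a replica set."""
    if not hosts:
        raise ValueError("at least one seed host is required")
    seeds = ",".join(hosts)
    return f"mongodb://{seeds}/{database}?replicaSet={replica_set}"


# Hypothetical host names, set name and database, for illustration only.
uri = replica_set_uri(
    [
        "db1.example.internal:27017",
        "db2.example.internal:27017",
        "db3.example.internal:27017",
    ],
    replica_set="rs0",
    database="site",
)
print(uri)
```

Every web instance can be given the same string, so adding or removing a web box does not touch the database tier.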


I currently run my website on a single server with MongoDB.

First off, this is a big red flag. In production, it is always recommended to run a replica set with at least three full nodes.

Replication provides automatic redundancy and fail-over.
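A quick bit of arithmetic shows why three nodes is the usual minimum. A replica set can only elect a primary while a strict majority of its voting members is reachable, so the sketch below just counts how many members can be lost before that majority is gone. The function names are my own, not part of any MongoDB API.

```python
def majority(voting_members):
    """Smallest number of members that forms a strict majority."""
    return voting_members // 2 + 1


def tolerated_failures(voting_members):
    """How many members can be down while a majority is still reachable."""
    return voting_members - majority(voting_members)


for n in (1, 2, 3, 5):
    print(f"{n} members: majority={majority(n)}, can lose {tolerated_failures(n)}")
```

Two members tolerate no failures at all, which is why a second node alone does not give you automatic fail-over; three members tolerate one.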

Ability to seamlessly add/remove web servers without worrying about losing data in the DB

MongoDB supports a concept called sharding. Sharding provides a way to scale horizontally by automatically partitioning data. The partitioning is done via a shard key.

If you plan to use sharding, please read the sharding documentation very carefully and recognize the limitations. With MongoDB sharding, you have to select a shard key that allows queries to be evenly distributed across the shards.
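To illustrate why the choice of key matters, here is a toy model that hashes a key value to one of four shards and counts how many documents land on each. This is not how MongoDB actually assigns chunks; the shard count, the field names `url_id` and `status`, and the documents themselves are all assumptions for the sake of the example.

```python
import hashlib
from collections import Counter

NUM_SHARDS = 4  # assumed cluster size


def shard_for(key_value, num_shards=NUM_SHARDS):
    """Map a shard-key value to a shard number using a stable hash (toy model)."""
    digest = hashlib.md5(str(key_value).encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


def distribution(docs, key):
    """Count how many documents each shard would hold for the given key."""
    counts = Counter(shard_for(doc[key]) for doc in docs)
    return [counts.get(shard, 0) for shard in range(NUM_SHARDS)]


# Hypothetical crawl-index documents: a unique id and a two-valued status.
docs = [{"url_id": i, "status": "ok" if i % 10 else "error"} for i in range(10_000)]

by_url_id = distribution(docs, "url_id")  # many distinct values
by_status = distribution(docs, "status")  # only two distinct values
print("url_id:", by_url_id)
print("status:", by_status)
```

A key with many distinct values spreads the documents roughly evenly, while a key with only two values can never use more than two shards, no matter how many you add.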

The current MongoDB crawl index has about 100k entries that are keyed on ~15 different columns.

This is going to be a problem with sharding. Sharding can only scale queries that use the shard key. A query on the shard key can be routed directly to a single machine. A query on a secondary index goes to all machines.

You have 15 different indexes, so basically all of these queries will go to all shards. That will not "auto-scale" very well at all.
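The routing difference can be sketched in a few lines. This is a toy router, not MongoDB's `mongos`; the shard count and the shard key name `url_id` are assumptions, and the modulo mapping stands in for whatever the real cluster uses to locate a key.

```python
NUM_SHARDS = 4        # assumed cluster size
SHARD_KEY = "url_id"  # hypothetical shard key


def shards_contacted(query):
    """Return the shard numbers a toy router would send this query to."""
    if SHARD_KEY in query:
        # Targeted: the key value identifies the one shard that owns it.
        return [query[SHARD_KEY] % NUM_SHARDS]
    # Scatter-gather: without the shard key, any shard might hold matches.
    return list(range(NUM_SHARDS))


targeted = shards_contacted({"url_id": 42})
broadcast = shards_contacted({"status": "error"})
print(len(targeted), "shard for the shard-key query")
print(len(broadcast), "shards for the secondary-index query")
```

In this model every shard still has to answer every query that lacks the shard key, so adding shards does not reduce the per-shard query load for those 15 secondary indexes.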


Beware that, at the moment, EC2 does not offer 64-bit small instances, which makes replication potentially expensive. Because MongoDB memory-maps its data files, a 32-bit OS is not advised: the limited address space caps a 32-bit build at roughly 2 GB of data.