Server architecture for a scalable web application

node.js mongodb amazon-web-services cloud elasticsearch

if our architecture might have any design flaws.

Well, keep in mind that we can't tell much from a generic diagram. But here are some notes:

1) MongoDB isn't as easy to scale as other databases such as DynamoDB, Riak or Cassandra. For example, if you ever exceed the capacity of a single master (no matter how many slaves you have, all writes go to the single master), you'll have to shard. But switching to sharding is very disruptive and very tedious to set up.

If you don't expect to exceed the write capacity of one node, then you'll be fine on MongoDB.

2) What will you do for async tasks such as sending emails, creating long reports, etc?

It's possible to do these things in the request loop, and that's probably a fine way to get started. But as you have more boxes, the chances of failure go up. When a box dies, all the async tasks go away and nobody will know what they were. You also can have problems where one box gets heavily loaded with async tasks (using too much CPU or memory), and the problem will get worse and worse as it gets more tasks and completes them more slowly.

Also, a front-end like ELB will have a 60-second limit, which can cause problems if some of your requests could take longer. (Spin them off into async jobs with polling or something.)

3) ELB doesn't support web sockets. Consider that if you think you might want websockets down the road.

node.js mongodb amazon-web-services cloud elasticsearch

There's no such thing as a master in elastic search. You have master copies of shards and replicas of shards but they are basically moved around through your cluster by elastic search. Nodes might be master for one shard and a replica for another. So, you could simply put a load balancer in front of it.

However, you can specialize nodes to be data nodes or routing nodes as explained here: http://www.elasticsearch.org/guide/reference/modules/node/

The routing nodes effectively become load balancers. You could have a few of those (redundancy) and distribute load between those. Alternatively, you could run a dedicated router node on each web server. Basically routing nodes are pretty light and you save a bit of bandwidth/latency since your web server talks to localhost and from there it is all elastic search internal cluster traffic.

node.js mongodb amazon-web-services cloud elasticsearch

I'd recommend to replace MongoDB with Amazon Dynamo DB (it has node.js SDK).

CodeHunter

Server architecture for a scalable web application

Recent Posts

How can I color dots in a xy scatterplot according to column value?

How to update a claim in ASP.NET Identity?

What does {0} mean when initializing an object?

Accessing members of items in a JSONArray with Java

How to log SQL statements in Spring Boot?

Powershell Get-WebSite name parameter is ignored

How to detect scroll to bottom of html element

Java synchronized method

How to test controllers with CodeIgniter?

Detect Visual Composer

Matplotlib: Specify format of floats for tick labels

Rails join a list of strings with commas and "and" before the last