How to avoid concurrency issues when scaling writes horizontally?



Any solution that attempts to divide the load across different items in the same collection (like orders) is doomed to fail. The reason is that with a high rate of transactions flowing, you'll have to start doing one of the following:

  1. Let nodes talk to each other ("hey guys, is anyone working on this?")
  2. Divide the ID generation into segments (node A creates IDs 1-1000, node B 1001-2000, etc.) and let each node deal only with its own segment
  3. Dynamically divide the collection into segments and let each node handle one segment
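Option 2 can be sketched in a few lines. This is a minimal illustration (in Python for brevity; the `SegmentAllocator` name and the exact ranges are hypothetical, not from any library):

```python
# Minimal sketch of option 2: each node is handed a disjoint ID segment
# and allocates only from its own range, so nodes never collide.

class SegmentAllocator:
    def __init__(self, start, end):
        self.next_id = start   # first unused ID in this node's segment
        self.end = end         # exclusive upper bound of the segment

    def allocate(self):
        if self.next_id >= self.end:
            # this is exactly the weak spot: someone must hand out a new range
            raise RuntimeError("segment exhausted; a new range must be assigned")
        current = self.next_id
        self.next_id += 1
        return current

node_a = SegmentAllocator(1, 1001)     # node A owns IDs 1-1000
node_b = SegmentAllocator(1001, 2001)  # node B owns IDs 1001-2000

print(node_a.allocate())  # 1
print(node_b.allocate())  # 1001
```

Note that the `RuntimeError` branch is where the run-time re-coordination problem described below creeps in.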

So what's wrong with those approaches?

The first approach is simply re-implementing the coordination that database transactions already provide. Unless you can spend a large amount of time optimizing that strategy, it's better to rely on transactions.

The other two options will decrease performance, as you have to dynamically route messages based on IDs and also change the strategy at run time to account for newly inserted items. They will eventually fail.

Solutions

Here are two solutions that you can also combine.

Retry automatically

Say you have an entry point somewhere that reads from the message queue.

In it you have something like this:

    while (true)
    {
        var message = queue.Read();
        Process(message);
    }

What you could do instead to get very simple fault tolerance is to retry upon failure:

    while (true)
    {
        var message = queue.Read();
        for (var i = 0; i < 3; i++)
        {
            try
            {
                Process(message);
                break; // success: exit the retry loop
            }
            catch (Exception ex)
            {
                // log the failure
                // not rethrowing lets the for loop run the next attempt
            }
        }
    }

You could of course catch only database exceptions (or rather transaction failures) and replay just those messages.

Micro services

I know, "microservice" is a buzzword. But in this case it's a great solution. Instead of having a monolithic core that processes all messages, divide the application into smaller parts. Or, in your case, just deactivate the processing of certain message types on certain nodes.

If you have five nodes running your application, you can make sure that node A receives messages related to orders, node B receives messages related to shipping, and so on.

By doing so you can still scale your application horizontally, you get no conflicts, and it requires little effort (a few more message queues and reconfiguring each node).
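The routing idea above can be sketched as follows (in Python for brevity; in-memory dicts stand in for real message queues, and all names are illustrative):

```python
# Sketch of routing by message type: each node subscribes only to the
# queue(s) for the message types it owns, so two nodes never process
# the same kind of message concurrently.

from collections import defaultdict, deque

queues = defaultdict(deque)  # one queue per message type

def publish(message):
    # route on the message's type instead of sharding by ID
    queues[message["type"]].append(message)

publish({"type": "order", "id": 1})
publish({"type": "shipping", "id": 2})
publish({"type": "order", "id": 3})

# Node A drains only the "order" queue; node B only "shipping".
node_a_work = list(queues["order"])
node_b_work = list(queues["shipping"])
```

With a real broker this is just a queue (or routing key) per message type, configured per node.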


For this kind of thing I use blob leases. Basically, I create a blob named after the ID of the entity in some known storage account. When worker 1 picks up the entity, it tries to acquire a lease on the blob (and create the blob itself, if it doesn't exist). If it is successful at both, I allow the processing of the message to occur, and always release the lease afterwards. If I am not successful, I dump the message back onto the queue.
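The lease pattern can be simulated without Azure at all. The sketch below (Python, purely illustrative) replaces blob storage with an in-memory table; in production you would acquire a real lease on a blob via the storage SDK instead:

```python
# Simulation of the blob-lease pattern: one lease per entity ID; a worker
# may process a message only while it holds that entity's lease.
# A dict stands in for the blob container here.

import threading

_leases = {}              # entity_id -> owning worker (the "lease" state)
_lock = threading.Lock()  # protects the lease table itself

def try_acquire_lease(entity_id, worker):
    with _lock:
        if entity_id in _leases:
            return False          # lease held elsewhere: requeue the message
        _leases[entity_id] = worker
        return True

def release_lease(entity_id, worker):
    with _lock:
        if _leases.get(entity_id) == worker:
            del _leases[entity_id]

acquired_1 = try_acquire_lease("order-42", "worker-1")  # worker 1 may process
acquired_2 = try_acquire_lease("order-42", "worker-2")  # worker 2 must requeue
release_lease("order-42", "worker-1")
acquired_3 = try_acquire_lease("order-42", "worker-2")  # succeeds after release
```

A real blob lease also expires on its own, which is what saves you when a worker crashes while holding it.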

I follow the approach originally described by Steve Marx here: http://blog.smarx.com/posts/managing-concurrency-in-windows-azure-with-leases although tweaked to use the newer storage libraries.

Edit after comments: If you have a potentially high rate of messages all touching the same entity (as your comment implies), I would redesign your approach somewhere: either the entity structure or the messaging structure.

For example: consider the CQRS design pattern and store the changes from processing each message independently. The product entity then becomes an aggregate of all changes made to it by the various workers, sequentially re-applied and rehydrated into a single object.
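The rehydration step can be shown in a few lines. This is a deliberately minimal sketch (Python; real event-sourced aggregates would use typed events and versioning):

```python
# Sketch of the CQRS/event-sourcing idea: each worker appends its change
# as an independent event; the product entity is rehydrated by folding
# the events back together in order.

def rehydrate(events):
    entity = {}
    for event in events:       # sequentially re-apply every recorded change
        entity.update(event)
    return entity

events = [
    {"name": "widget", "price": 10},   # written by worker A
    {"price": 12},                     # written by worker B
    {"stock": 5},                      # written by worker C
]

product = rehydrate(events)
# product == {"name": "widget", "price": 12, "stock": 5}
```

The point is that workers never mutate the shared entity directly, so they never conflict; ordering only matters when the events are folded.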


If you want the database to always be up to date and consistent with the already-processed units, then you have several updates hitting the same mutable entity.

To comply with this you need to serialize the updates for the same entity. Either you partition your data at the producers, or you accumulate the events for an entity on the same queue, or you lock the entity in the worker using a distributed lock, or you lock at the database level.
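The first of those options, partitioning at the producer, usually comes down to a stable hash of the entity ID. A minimal sketch (Python; the partition count and names are illustrative):

```python
# Sketch of serializing updates by partitioning at the producer: every
# event for a given entity hashes to the same partition, so a single
# worker sees all of that entity's updates, in order.

import hashlib

NUM_PARTITIONS = 4

def partition_for(entity_id):
    # stable hash; Python's built-in hash() is salted per process,
    # so it must not be used for cross-node routing
    digest = hashlib.sha256(entity_id.encode()).digest()
    return digest[0] % NUM_PARTITIONS

p1 = partition_for("order-42")
p2 = partition_for("order-42")
# p1 == p2: both updates to order-42 land on the same partition,
# hence on the same worker, hence they are applied serially
```

This is the same idea partitioned brokers (e.g. Kafka-style partition keys) implement for you.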

You could use an actor model (in the Java/Scala world, Akka) that creates a message queue for each entity, or group of entities, and processes them serially.
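The core of that idea, one mailbox per entity drained by a single consumer, can be sketched without any actor framework (Python; a real Akka actor would add supervision and restart-on-failure):

```python
# Sketch of the per-entity actor idea: one mailbox (queue) per entity,
# drained by a single thread, so updates to the same entity are applied
# serially even though different entities can progress in parallel.

import queue
import threading

class EntityActor:
    def __init__(self):
        self.mailbox = queue.Queue()
        self.state = 0
        self.thread = threading.Thread(target=self._run, daemon=True)
        self.thread.start()

    def _run(self):
        while True:
            message = self.mailbox.get()
            if message is None:
                break              # poison pill: stop the actor
            self.state += message  # updates applied one at a time, in order

    def tell(self, message):
        self.mailbox.put(message)  # producers never touch state directly

actor = EntityActor()
for delta in [1, 2, 3]:
    actor.tell(delta)
actor.tell(None)       # shut the actor down
actor.thread.join()
# actor.state == 6, with no locking around state itself
```

No lock guards `state` because exactly one thread ever touches it; the mailbox is the serialization point.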

UPDATED: You can try an Akka port to .NET (Akka.NET). There is also a nice tutorial with samples about using Akka in Scala. For the general principles you should search for more on the actor model. It has drawbacks nevertheless.

In the end it comes down to partitioning your data and being able to create a unique, specialized worker (one that can be reused and/or restarted in case of failure) for a specific entity.