Async spawning of processes: design question - Celery or Twisted


On my system, RabbitMQ running with pretty reasonable defaults is using about 2MB of RAM. Celeryd uses a bit more, but not an excessive amount.

In my opinion, the overhead of RabbitMQ and celery are pretty much negligible compared to the rest of the stack. If you're processing jobs that are going to take several minutes to complete, those jobs are what will overwhelm your 512MB server as soon as your traffic increases, not RabbitMQ. Starting off with RabbitMQ and Celery will at least set you up nicely to scale those jobs out horizontally though, so you're definitely on the right track there.

Sure, you could write your own job control in Twisted, but I don't see it gaining you much. Twisted has pretty good performance, but I wouldn't expect it to outperform RabbitMQ by enough to justify the time and potential for introducing bugs and architectural limitations. Mostly, it just seems like the wrong spot to worry about optimizing. Take the time that you would've spent re-writing RabbitMQ and work on reducing those three minute jobs by 20% or something. Or just spend an extra $20/month and double your capacity.


I'll answer this question as though I were the one doing the project; hopefully that gives you some insight.

I'm working on a project that will require the use of a queue, a web server for the public facing web application and several job clients.

The idea is to have the web server continuously running (no need for a very powerful machine here). The work, however, is handled by the job clients, which are more powerful machines that can be started and stopped at will. The job queue resides on the same machine as the web application. When a job gets inserted into the queue, a process that manages the job clients kicks in and spins up the first client. With a load balancer that can start new servers as the load increases, I don't have to worry about managing the number of servers processing jobs in the queue. If there are no jobs in the queue after a while, all job clients can be terminated.

I will suggest using a setup similar to this. You don't want job execution to affect the performance of your web application.
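The architecture above can be sketched on a single machine with the standard library; here, worker processes stand in for the job-client machines, draining a shared queue and shutting themselves down when it stays empty (all names and the `job * 2` "work" are illustrative):

```python
import multiprocessing as mp
import queue as queue_mod

def job_client(work, results):
    # Stand-in for a job-client machine: process jobs until the queue
    # stays empty for a moment, then shut down.
    while True:
        try:
            job = work.get(timeout=0.5)
        except queue_mod.Empty:
            break
        results.put(job * 2)  # placeholder for a real several-minute job

def dispatch(jobs, n_clients=2):
    """Enqueue jobs, spin up clients, collect one result per job."""
    work, results = mp.Queue(), mp.Queue()
    for job in jobs:
        work.put(job)
    clients = [mp.Process(target=job_client, args=(work, results))
               for _ in range(n_clients)]
    for c in clients:
        c.start()
    done = sorted(results.get() for _ in jobs)  # drain before joining
    for c in clients:
        c.join()
    return done

if __name__ == "__main__":
    print(dispatch([1, 2, 3]))
```

In the real setup, `dispatch` would be replaced by the queue on the web machine plus the load balancer starting actual servers, but the lifecycle (start on demand, exit when idle) is the same.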


I'll add, quite late, another possibility: using Redis. I currently use Redis with Twisted: I distribute work to workers, which perform the work and return results asynchronously.

The "List" type is very useful : http://www.redis.io/commands/rpoplpush

So you can use the reliable queue pattern to send work, with a worker process that blocks until it has a new job to do (a new message arriving in the queue).
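Here is a minimal in-memory sketch of that reliable queue pattern, with plain Python lists standing in for Redis lists (class and method names are my own): RPOPLPUSH atomically moves a job from the pending list to a processing list, so a job is only deleted once the worker acknowledges it, and jobs left behind by a crashed worker can be re-queued.

```python
class ReliableQueue:
    """In-memory sketch of the Redis RPOPLPUSH reliable queue pattern."""

    def __init__(self):
        self.pending = []     # Redis: the main queue
        self.processing = []  # Redis: the worker's processing list

    def push(self, job):
        self.pending.insert(0, job)          # LPUSH onto pending

    def reserve(self):
        """RPOPLPUSH: move the oldest job onto the processing list."""
        if not self.pending:
            return None
        job = self.pending.pop()             # RPOP from pending
        self.processing.insert(0, job)       # LPUSH onto processing
        return job

    def ack(self, job):
        """LREM: remove the job once the worker has finished it."""
        self.processing.remove(job)

    def recover(self):
        """Re-queue jobs a crashed worker left in its processing list."""
        while self.processing:
            self.pending.append(self.processing.pop(0))
```

With real Redis, `reserve` corresponds to redis-py's `r.brpoplpush("pending", "processing", timeout=0)`, which blocks until a job arrives instead of returning `None`.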

You can run several workers on the same queue.

Redis has a low memory footprint, but be careful about the number of pending messages, as they increase the memory Redis uses.