Asynchronous processing or message queues in PHP (CakePHP) [closed]

Tags: multithreading


I've had excellent results with beanstalkd and a back-end worker written in PHP that retrieves jobs and acts on them. I wrapped the actual job-running in a bash script that restarts the worker whenever it exits, unless the worker ends with 'exit(UNIQNUM);', in which case the script sees that exit code and stops as well. That way, each restart of the PHP script releases any memory that had built up, and the worker can start afresh after every 25/50/100 jobs it runs.
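To make the restart idea concrete, here is a minimal sketch of such a wrapper. It is written in Python rather than bash, and the sentinel value 97, the `supervise` name, and the `php worker.php` command in the comment are placeholders for illustration, not the original script.

```python
import subprocess
import time

STOP_CODE = 97  # hypothetical stand-in for UNIQNUM; any otherwise unused exit code works


def supervise(command, stop_code=STOP_CODE, pause_seconds=1.0):
    """Run `command` again and again until it exits with `stop_code`.

    Every other exit code (including a crash) leads to a restart, so each
    new worker process starts with fresh memory. Returns how many times the
    command was run.
    """
    runs = 0
    while True:
        runs += 1
        result = subprocess.run(command)
        if result.returncode == stop_code:
            return runs
        # Short pause so a worker that dies immediately does not spin the CPU.
        time.sleep(pause_seconds)


# Example call (assumes a worker script that exits after N jobs):
# supervise(["php", "worker.php"])
```

The worker itself decides when to stop for good by exiting with the sentinel code; any other exit is treated as "restart me".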

A couple of the advantages are that you can set a priority and a delay on a beanstalkd job: "run this at a lower priority, but don't start for 10 seconds". I've also queued several copies of a job at the same time (run this now, in 5 seconds, and again after 30 seconds).
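For reference, priority and delay are plain fields of the beanstalkd `put` command (`put <pri> <delay> <ttr> <bytes>`), where a smaller priority number means more urgent. The sketch below only builds the command bytes; the helper names, the default priority of 1024, and the 60-second TTR are assumptions for illustration, and a client library would normally handle this for you.

```python
DEFAULT_PRIORITY = 1024  # assumption: a middle-of-the-road priority (0 is most urgent)
DEFAULT_TTR = 60         # assumption: seconds a worker may hold a job before it is released


def build_put(body, priority=DEFAULT_PRIORITY, delay=0, ttr=DEFAULT_TTR):
    """Return the raw bytes of a beanstalkd `put` command for `body` (bytes)."""
    return b"put %d %d %d %d\r\n%s\r\n" % (priority, delay, ttr, len(body), body)


def schedule_repeats(body, delays=(0, 5, 30), priority=DEFAULT_PRIORITY):
    """Build one `put` per delay: run now, in 5 seconds, and again after 30."""
    return [build_put(body, priority=priority, delay=d) for d in delays]


# "Lower priority, but don't start for 10 seconds":
low_priority_job = build_put(b"resize 42", priority=2048, delay=10)
```

Each of these byte strings would be written to a socket connected to beanstalkd (port 11300 by default), which answers with `INSERTED <id>` on success.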

With the appropriate network configuration (running it on an IP address the rest of your network can reach), you can also run a beanstalkd daemon on one server and have it polled from a number of other machines, so if a large number of tasks are being generated, the work can be split between servers. If a particular set of tasks needs to run on a particular machine, I create a 'tube' named after that machine's hostname, which should be unique within our cluster, if not globally (useful for file uploads). It worked perfectly for image resizing, often writing the finished smaller images to the file system before the web page that referred to them had requested their URLs.
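The per-machine tube can be sketched with the protocol's `use`, `watch` and `ignore` commands. This is only an illustration of the idea: the function names are made up, and it assumes the hostname is plain ASCII and a valid tube name.

```python
import socket


def host_tube(hostname=None):
    """Tube name for a machine: its hostname (defaults to the local one)."""
    return hostname or socket.gethostname()


def producer_command(hostname=None):
    """Command a producer sends so its following `put`s land in that machine's tube."""
    return b"use %s\r\n" % host_tube(hostname).encode("ascii")


def worker_commands(hostname=None):
    """Commands a worker sends so it only reserves jobs meant for its own machine.

    `watch` comes first because beanstalkd refuses to ignore the last
    tube a connection is watching.
    """
    tube = host_tube(hostname).encode("ascii")
    return [b"watch %s\r\n" % tube, b"ignore default\r\n"]
```

A web server that has just received an upload would send `producer_command()` with its own hostname, so the resize job is picked up by a worker on the machine that actually holds the file.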

I'm actually about to start writing a series of articles on this very subject for my blog (including some techniques for code that I've already pushed several million live requests through). My URL is linked from my user profile here on Stack Overflow.

(I've written a series of articles on the subject of Beanstalkd and queuing of jobs)


If you use a message queue like beanstalkd, you can start as many processes as you'd like (even on the same server). Each worker process will take one job from the queue and process it. You can add more workers and more servers if you need more capacity.

The nice thing about using a single-threaded worker is that you don't have to deal with synchronization inside a process. The job queue will make sure no job is handled twice.
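As a rough sketch of what one such worker does per job, the function below speaks the beanstalkd protocol directly: `reserve` a job, run a handler, then `delete` it. The function name and the `handler` callback are illustrative, and error handling (timeouts, `bury`, `release`) is left out.

```python
import socket


def reserve_and_process(stream, handler):
    """Reserve one job, pass its body to `handler`, then delete the job.

    `stream` is a binary read/write file object for a beanstalkd connection,
    e.g. socket.create_connection(("127.0.0.1", 11300)).makefile("rwb").
    Returns the id of the job that was processed.
    """
    stream.write(b"reserve\r\n")
    stream.flush()
    header = stream.readline().split()  # expected: RESERVED <id> <bytes>
    if len(header) != 3 or header[0] != b"RESERVED":
        raise RuntimeError("unexpected reply to reserve: %r" % (header,))
    job_id, size = int(header[1]), int(header[2])
    body = stream.read(size + 2)[:-2]  # the body is followed by \r\n

    handler(body)  # plain single-threaded work, no locking needed

    stream.write(b"delete %d\r\n" % job_id)
    stream.flush()
    reply = stream.readline()
    if reply != b"DELETED\r\n":
        raise RuntimeError("unexpected reply to delete: %r" % (reply,))
    return job_id
```

While a job is reserved, beanstalkd does not hand it to any other worker. If the worker dies or exceeds the job's time-to-run before deleting it, the job goes back to the ready queue, so handlers are best written to tolerate a retry.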


It might also be worth checking out Amazon SQS, used in conjunction with EC2.