Parallel processing in PHP - How do you do it?


I use exec(). It's easy and clean. You basically need to build a thread manager and the thread scripts that do the actual work (each "thread" here is really a separate PHP process).

I don't like fsockopen() because it opens a server connection for every task; those connections build up and may hit Apache's connection limit.

I don't like the curl functions for the same reason.

I don't like pcntl because it needs the pcntl extension to be available, and you have to keep track of parent/child relationships.

I've never played with Gearman...
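As a rough illustration of the exec() approach, here is a minimal sketch. It assumes a Unix-like shell and the PHP CLI; the helper names launchBackground() and waitForResults() are made up for this example, and the "thread script" is inlined with `php -r` only to keep the sketch self-contained (in practice it would be a separate script file).

```php
<?php
// Start a command in the background and return its PID.
// Assumes a Unix-like shell: output goes to a file and '&' detaches the process,
// so exec() returns immediately instead of waiting for the command to finish.
function launchBackground(string $command, string $outputFile): int
{
    $line = sprintf('%s > %s 2>&1 & echo $!', $command, escapeshellarg($outputFile));
    exec($line, $out);
    return (int) $out[0];
}

// Very simple "manager": poll the output files until every worker has written
// something or the timeout expires. Treating any non-empty file as "done" is a
// simplification that only suits workers printing one short result at the end.
function waitForResults(array $files, float $timeoutSeconds = 10.0): array
{
    $deadline = microtime(true) + $timeoutSeconds;
    $results = [];
    while (count($results) < count($files) && microtime(true) < $deadline) {
        foreach ($files as $key => $file) {
            clearstatcache(true, $file);
            if (!isset($results[$key]) && filesize($file) > 0) {
                $results[$key] = trim(file_get_contents($file));
            }
        }
        usleep(50000);
    }
    return $results;
}

// Launch three workers at once; each sleeps briefly and prints the square of its input.
$files = [];
foreach ([2, 3, 4] as $n) {
    $files[$n] = tempnam(sys_get_temp_dir(), 'job');
    $code = sprintf('usleep(200000); echo %d * %d;', $n, $n);
    launchBackground(escapeshellarg(PHP_BINARY) . ' -r ' . escapeshellarg($code), $files[$n]);
}

$results = waitForResults($files);
array_map('unlink', $files);
ksort($results);
print_r($results);
```

The three workers sleep at the same time, so the whole run takes roughly as long as one worker rather than three.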


Well, I guess we have three options here:

A. Multi-Thread:

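PHP does not support multithreading natively. But there is one (experimental) PHP extension called pthreads (https://github.com/krakjoe/pthreads) that allows you to do just that. Note that the pthreads repository has since been archived; the same author's parallel extension is its suggested replacement.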

B. Multi-Process:

This can be done in 3 ways:

  • Forking
  • Executing Commands
  • Piping
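Of the three, piping is probably the least familiar, so here is a minimal sketch using proc_open(). It assumes a Unix-like shell and the PHP CLI; the child code is inlined with `php -r` purely to keep the example self-contained, and the input strings are arbitrary.

```php
<?php
// Child process: read everything from stdin and print it upper-cased.
$childCode = 'echo strtoupper(file_get_contents("php://stdin"));';
$command = escapeshellarg(PHP_BINARY) . ' -r ' . escapeshellarg($childCode);

// Descriptor spec from the child's point of view:
// 0 => stdin (child reads from it), 1 => stdout (child writes to it).
$spec = [0 => ['pipe', 'r'], 1 => ['pipe', 'w']];

// Start all children first so they run concurrently.
$procs = [];
foreach (['alpha', 'beta', 'gamma'] as $input) {
    $proc = proc_open($command, $spec, $pipes);
    if (!is_resource($proc)) {
        throw new RuntimeException('Could not start child process');
    }
    fwrite($pipes[0], $input);
    fclose($pipes[0]); // EOF tells the child its input is complete
    $procs[] = ['proc' => $proc, 'stdout' => $pipes[1]];
}

// Then collect each child's output through its stdout pipe.
$outputs = [];
foreach ($procs as $p) {
    $outputs[] = stream_get_contents($p['stdout']);
    fclose($p['stdout']);
    proc_close($p['proc']);
}

print_r($outputs);
```

The parent passes data to each child over a pipe and reads the result back the same way, so no temporary files are needed.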

C. Distributed Parallel Processing:

How it works:

  1. The Client App sends data (a.k.a. a message), which can be JSON formatted, to the MQ Engine, which can be local or an external web service.
  2. The MQ Engine stores the data, mostly in memory and optionally in a database, inside queues (you can define the queue name).
  3. The Client App asks the MQ Engine for data (a message) to process, in order (FIFO or based on priority); you can also request data from a specific queue.
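To make the flow above concrete, here is a toy in-memory stand-in for an MQ engine. The class InMemoryQueueEngine and the queue names are hypothetical and exist only for this sketch; a real engine runs as a separate service and is shared by many clients.

```php
<?php
// Toy stand-in for an MQ engine: named FIFO queues holding JSON messages.
// Hypothetical class for illustration only, not part of any real library.
final class InMemoryQueueEngine
{
    /** @var array<string, SplQueue> */
    private array $queues = [];

    // Steps 1 and 2: the client sends a message and the engine stores it in a named queue.
    public function push(string $queue, array $payload): void
    {
        $this->queues[$queue] ??= new SplQueue();
        $this->queues[$queue]->enqueue(json_encode($payload, JSON_THROW_ON_ERROR));
    }

    // Step 3: the client asks for the next message from a specific queue (FIFO).
    public function pop(string $queue): ?array
    {
        if (!isset($this->queues[$queue]) || $this->queues[$queue]->isEmpty()) {
            return null;
        }
        return json_decode($this->queues[$queue]->dequeue(), true, 512, JSON_THROW_ON_ERROR);
    }
}

$engine = new InMemoryQueueEngine();
$engine->push('emails', ['to' => 'a@example.com']);
$engine->push('emails', ['to' => 'b@example.com']);
$engine->push('thumbnails', ['file' => 'cat.png']);

// A worker drains one queue in the order the messages arrived.
$processed = [];
while (($message = $engine->pop('emails')) !== null) {
    $processed[] = $message['to'];
}

print_r($processed);
```

With a real engine, the push and pop sides would live in different processes or on different machines, which is where the parallelism comes from.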


Some MQ Engines:

  • ZeroMQ (good option, hard to use): a message-oriented IPC library rather than a queue server. It is a socket library that acts as a concurrency framework, and it is promoted as faster than TCP for clustered products and supercomputing.
  • RabbitMQ (good option, easy to use): self-hosted enterprise message queue server written in Erlang. Not really a work queue, but rather a message queue that can be used as a work queue with additional semantics.
  • Beanstalkd (best option, easy to use): a work queue with built-in Laravel support, originally built for a Facebook application. It has a "Beanstalkd console" tool which is very nice.
  • Gearman (problem: centralized broker system for distributed processing)
  • Apache ActiveMQ: the most popular open source message broker in Java (problem: a lot of bugs and problems)
  • Amazon SQS (Laravel built-in support; hosted, so no administration is required. Not really a work queue, so it will require extra work to handle semantics such as burying a job)
  • IronMQ (Laravel built-in support; written in Go; available both as a cloud version and on-premise)
  • Redis (Laravel built-in support; not that fast, as it is not designed for that)
  • Sparrow (written in Ruby, based on memcache)
  • Starling (written in Ruby, based on memcache, built at Twitter)
  • Kestrel (just another queue manager)
  • Kafka (written at LinkedIn in Scala)
  • EagleMQ: open source, high-performance and lightweight queue manager (written in C)

More of them can be found here: http://queues.io


If your application is going to run under a Unix/Linux environment, I would suggest you go with the forking option. It's basically child's play to get it working. I have used it for a cron manager, with code that falls back to a Windows-friendly code path if forking is not an option.

The option of running the entire script several times does, as you state, add quite a bit of overhead. If your script is small, it might not be a problem. But you will probably get used to doing parallel processing in PHP whichever way you choose, and the next time, when you have a job that uses 200 MB of data, it might very well be a problem. So you'd be better off learning a way that you can stick with.
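A minimal sketch of the forking option with a fallback, assuming the PHP CLI. The function name runJobs() and the example jobs are made up for illustration; each job returns a small integer that is used as the child's exit code.

```php
<?php
// Run each job in its own forked child when pcntl is available;
// otherwise fall back to running the jobs one after another.
function runJobs(array $jobs): array
{
    if (!function_exists('pcntl_fork')) {
        // Fallback path (e.g. Windows): same results, no parallelism.
        return array_map(fn (callable $job): int => $job(), $jobs);
    }

    $pids = [];
    foreach ($jobs as $key => $job) {
        $pid = pcntl_fork();
        if ($pid === -1) {
            throw new RuntimeException('fork failed');
        }
        if ($pid === 0) {
            // Child: do the work and report the result via the exit code (0-255).
            exit($job());
        }
        $pids[$key] = $pid; // Parent: remember which child runs which job.
    }

    // Parent: wait for every child and collect its exit code.
    $codes = [];
    foreach ($pids as $key => $pid) {
        pcntl_waitpid($pid, $status);
        $codes[$key] = pcntl_wexitstatus($status);
    }
    return $codes;
}

$codes = runJobs([
    'ok'   => function (): int { usleep(200000); return 0; },
    'warn' => function (): int { usleep(200000); return 3; },
]);

print_r($codes);
```

An exit code only carries a small number back to the parent; anything larger would need a file, a pipe, or shared storage, which this sketch leaves out.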

I have also tested Gearman and I like it a lot. There are a few things to think about, but as a whole it offers a very good way to distribute work to different servers running different applications written in different languages. Apart from setting it up, actually using it from within PHP, or any other language for that matter, is... once again... child's play.

It could very well be overkill for what you need to do. But it will open your eyes to new possibilities when it comes to handling data and jobs, so I would recommend trying Gearman for that reason alone.
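For a feel of what using Gearman from PHP looks like, here is a minimal sketch. It assumes the PECL gearman extension is installed and a gearmand server is listening on 127.0.0.1:4730; the job name 'reverse', the function names, and the file name gearman_demo.php are all made up for this example.

```php
<?php
// Worker side: register a function under a job name and process jobs forever.
function runWorker(): void
{
    $worker = new GearmanWorker();
    $worker->addServer('127.0.0.1', 4730);
    $worker->addFunction('reverse', function (GearmanJob $job): string {
        return strrev($job->workload());
    });
    while ($worker->work()) {
        // Each call to work() waits for one job and runs the registered callback.
    }
}

// Client side: submit a job by name and wait for the result.
function submitJob(string $text): string
{
    $client = new GearmanClient();
    $client->addServer('127.0.0.1', 4730);
    return (string) $client->doNormal('reverse', $text);
}

// Only do something when called explicitly from the command line:
//   php gearman_demo.php worker      (start a worker)
//   php gearman_demo.php "hello"     (submit a job)
if (PHP_SAPI === 'cli' && isset($argv[1])) {
    if (!extension_loaded('gearman')) {
        fwrite(STDERR, "The gearman extension is not loaded.\n");
        exit(1);
    }
    if ($argv[1] === 'worker') {
        runWorker();
    } else {
        echo submitJob($argv[1]), PHP_EOL;
    }
}
```

Workers can run on other machines and be written in other languages, as long as they register the same job name with the same gearmand server.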