
asyncio and coroutines vs task queues


Not a proper answer, but a list of hints that could not fit into a comment:

  • You mention the multiprocessing module (and let's consider threading too). Suppose you have to handle hundreds of sockets: can you spawn hundreds of processes or threads? (See the asyncio sketch after this list for how an event loop avoids that.)

  • Again, with threads and processes: how do you handle concurrent access to shared resources? What is the overhead of mechanisms like locking?

  • Frameworks like Celery also add significant overhead. Can you use it, for example, to handle every single request on a high-traffic web server? By the way, in that scenario, who is responsible for handling sockets and connections (Celery, by its nature, can't do that for you)?

  • Be sure to read the rationale behind asyncio. That rationale (among other things) mentions a system call: writev() -- isn't that much more efficient than multiple write()s?
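
To illustrate the first hint: a single-threaded asyncio event loop can multiplex hundreds of connections without spawning a process or thread per socket. A minimal sketch (the host, port and echo behaviour are placeholders, not anything from the question itself):

    import asyncio

    async def handle_client(reader, writer):
        # Each connection is a coroutine, not a thread or a process,
        # so hundreds of them can be multiplexed on one event loop.
        data = await reader.readline()
        writer.write(data)            # echo the line back
        await writer.drain()
        writer.close()
        await writer.wait_closed()

    async def main():
        # Hypothetical host/port; adjust to your environment.
        server = await asyncio.start_server(handle_client, "127.0.0.1", 8888)
        async with server:
            await server.serve_forever()

    if __name__ == "__main__":
        asyncio.run(main())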


Adding to the above answer:

If the task at hand is I/O bound and operates on shared data, coroutines and asyncio are probably the way to go.
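
For instance (a minimal, standard-library-only sketch; asyncio.sleep stands in for real network or disk I/O), several coroutines can update a shared dict without any locks, because they all run in one thread and only switch at await points:

    import asyncio

    results = {}   # shared data: safe to mutate, everything runs in one thread

    async def fetch(key, delay):
        # asyncio.sleep stands in for real I/O (an HTTP request, a socket read, ...)
        await asyncio.sleep(delay)
        results[key] = f"done after {delay}s"

    async def main():
        # The coroutines interleave on the event loop; control only
        # switches at the explicit await points.
        await asyncio.gather(fetch("a", 0.2), fetch("b", 0.1), fetch("c", 0.3))
        print(results)

    asyncio.run(main())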

If, on the other hand, you have CPU-bound tasks where data is not shared, a multiprocessing system like Celery is the better fit.
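
A minimal Celery sketch of that case, assuming a Redis broker and result backend on localhost (the URLs and the task body are placeholders):

    # tasks.py -- start workers with:  celery -A tasks worker
    from celery import Celery

    # Assumed broker/backend; any broker Celery supports (Redis, RabbitMQ, ...) works.
    app = Celery("tasks",
                 broker="redis://localhost:6379/0",
                 backend="redis://localhost:6379/1")

    @app.task
    def crunch(n):
        # Placeholder CPU-bound work: no shared state, each worker
        # process gets its own copy of the arguments.
        return sum(i * i for i in range(n))

Each crunch.delay(n) call is serialized and executed in a separate worker process, so the GIL is not a bottleneck; calling .get() on the returned AsyncResult fetches the value from the result backend.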

If the task at hand is both CPU and I/O bound and sharing of data is not required, I would still use Celery. You can use async I/O from within Celery!
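
One way to do that (a sketch reusing the hypothetical app object from the previous snippet; asyncio.sleep again stands in for real async I/O such as an aiohttp request) is to run an event loop inside the task:

    import asyncio
    from tasks import app   # the Celery app defined above (assumption)

    async def download(url):
        # Stand-in for real async I/O.
        await asyncio.sleep(0.1)
        return f"fetched {url}"

    @app.task
    def fetch_all(urls):
        # Celery distributes the CPU-bound calls across worker processes,
        # and each worker runs its own event loop for the I/O-bound part.
        async def gather_all():
            return await asyncio.gather(*(download(u) for u in urls))
        return asyncio.run(gather_all())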

If you have a CPU-bound task that also needs to share data, the only viable option I see right now is to save the shared data in a database. There have been recent attempts like pyparallel, but they are still a work in progress.
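
A rough sketch of that approach, using multiprocessing for the CPU-bound part and SQLite as the (assumed) shared store; any database the worker processes can reach would do, and a client/server database handles concurrent writers more gracefully:

    import multiprocessing
    import sqlite3

    DB = "shared.db"   # hypothetical path; swap in whatever store you use

    def worker(n):
        # CPU-bound work...
        value = sum(i * i for i in range(n))
        # ...with the result shared through the database rather than memory.
        # SQLite serializes writers with a file lock; the timeout makes a
        # worker wait instead of failing if another process holds the lock.
        with sqlite3.connect(DB, timeout=30) as conn:
            conn.execute("INSERT INTO results (n, value) VALUES (?, ?)", (n, value))

    if __name__ == "__main__":
        with sqlite3.connect(DB) as conn:
            conn.execute("CREATE TABLE IF NOT EXISTS results (n INTEGER, value INTEGER)")
        with multiprocessing.Pool() as pool:
            pool.map(worker, [10**6, 2 * 10**6, 3 * 10**6])
        with sqlite3.connect(DB) as conn:
            print(conn.execute("SELECT n, value FROM results").fetchall())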