
parallel processing of DAG


Have one master thread push items onto a queue once they are ready to be processed. Then have a pool of worker threads listen on the queue for tasks to work on. (Python provides a synchronized queue in the Queue module, renamed to lower-case queue in Python 3.)

The master first creates a dictionary that maps each dependency to the tasks that depend on it. Every task that doesn't have any dependencies can go into the queue right away. Every time a task completes, the master uses the dictionary to look up its dependent tasks, and puts each one into the queue if all of its dependencies are now met.
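The scheme above could be sketched roughly as follows. This is only a minimal illustration, not a definitive implementation: the function name run_dag, its parameters, the second "done" queue the workers use to report back to the master, and the None shutdown sentinel are all assumptions made for this example. It uses the Python 3 module name (queue).

```python
import queue
import threading
from collections import defaultdict


def run_dag(tasks, deps, num_workers=4):
    """Hypothetical helper: run `tasks` (name -> callable) while respecting
    `deps` (name -> iterable of prerequisite names). Returns the names in
    the order the master saw them complete."""
    dependents = defaultdict(list)  # dependency -> tasks waiting on it
    remaining = {}                  # task -> number of unmet dependencies
    for name in tasks:
        prereqs = set(deps.get(name, ()))
        remaining[name] = len(prereqs)
        for p in prereqs:
            dependents[p].append(name)

    ready = queue.Queue()  # master -> workers: tasks that may run now
    done = queue.Queue()   # workers -> master: (name, exception or None)

    def worker():
        while True:
            name = ready.get()
            if name is None:  # sentinel (an assumption): shut down
                return
            try:
                tasks[name]()
                done.put((name, None))
            except Exception as exc:
                done.put((name, exc))

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()

    order, in_flight, error = [], 0, None
    # Seed the queue with every task that has no dependencies.
    for name, count in remaining.items():
        if count == 0:
            ready.put(name)
            in_flight += 1

    # Master loop: release dependents as their dependencies complete.
    while in_flight:
        name, exc = done.get()
        in_flight -= 1
        if exc is not None:
            if error is None:
                error = exc
            continue  # dependents of a failed task are never released
        order.append(name)
        for child in dependents[name]:
            remaining[child] -= 1
            if remaining[child] == 0:
                ready.put(child)
                in_flight += 1

    for _ in threads:
        ready.put(None)
    for t in threads:
        t.join()

    if error is not None:
        raise error
    if len(order) < len(tasks):
        raise ValueError("cycle or unknown dependency in task graph")
    return order


if __name__ == "__main__":
    # Diamond-shaped graph: a -> (b, c) -> d
    tasks = {n: (lambda n=n: print("running", n)) for n in "abcd"}
    deps = {"b": ["a"], "c": ["a"], "d": ["b", "c"]}
    print(run_dag(tasks, deps))
```

In the diamond example, a always completes first and d last; the relative order of b and c depends on thread scheduling. Note that with CPython threads, CPU-bound tasks will not actually run in parallel because of the GIL, so this layout mostly pays off for I/O-bound work unless the workers are swapped for processes.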


Celery (http://www.celeryproject.org/) is a widely used distributed task queue for Python. It should be able to help you with this.