
Python: Multicore processing?


Yes, it's possible to do this summation over several processes, very much like doing it with multiple threads:

from multiprocessing import Process, Queue

def do_sum(q, l):
    q.put(sum(l))

def main():
    my_list = range(1000000)

    q = Queue()

    p1 = Process(target=do_sum, args=(q, my_list[:500000]))
    p2 = Process(target=do_sum, args=(q, my_list[500000:]))
    p1.start()
    p2.start()
    r1 = q.get()
    r2 = q.get()
    print r1 + r2

if __name__ == '__main__':
    main()

However, doing this with multiple processes is likely to be slower than doing it in a single process, because copying the data back and forth costs more than summing it right away.
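
If you want to see that for yourself, a rough timing sketch along these lines (the exact numbers will vary by machine) compares the two approaches side by side:

import time
from multiprocessing import Process, Queue

def do_sum(q, l):
    q.put(sum(l))

def main():
    my_list = range(1000000)

    # Single process: just sum the list directly.
    t0 = time.time()
    single = sum(my_list)
    print 'one process:   %d in %.3f s' % (single, time.time() - t0)

    # Two processes: each child receives a copy of half the list.
    t0 = time.time()
    q = Queue()
    p1 = Process(target=do_sum, args=(q, my_list[:500000]))
    p2 = Process(target=do_sum, args=(q, my_list[500000:]))
    p1.start()
    p2.start()
    both = q.get() + q.get()
    p1.join()
    p2.join()
    print 'two processes: %d in %.3f s' % (both, time.time() - t0)

if __name__ == '__main__':
    main()

On a typical machine the two-process version loses here, because pickling half a million numbers into each child process costs more than the summation itself.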


Welcome to the world of concurrent programming.

What Python can (and can't) do depends on two things.

  1. What the OS can (and can't) do. Most OSes allocate processes to cores. To use 4 cores, you need to break your problem into four processes (see the sketch after this list). This is easier than it sounds. Sometimes.

  2. What the underlying C libraries can (and can't) do. If the C libraries expose features of the OS AND the OS exposes features of the hardware, you're solid.
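
As a small illustration of point 1, a sketch like the following (the partial_sum helper and the chunking scheme are just one possible arrangement) asks the OS how many cores it exposes via multiprocessing.cpu_count() and splits the summation into that many worker processes:

from multiprocessing import Process, Queue, cpu_count

def partial_sum(q, start, end):
    # Each worker sums its own index range; no list data is copied to it.
    q.put(sum(xrange(start, end)))

def main():
    n = cpu_count()                 # e.g. 4 on a quad-core box
    total_items = 1000 * 1000
    chunk = total_items // n
    q = Queue()
    workers = []
    for i in range(n):
        start = i * chunk
        end = total_items if i == n - 1 else start + chunk
        p = Process(target=partial_sum, args=(q, start, end))
        p.start()
        workers.append(p)
    print 'total is', sum(q.get() for _ in workers)
    for p in workers:
        p.join()

if __name__ == '__main__':
    main()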

Breaking a problem into multiple processes -- especially on GNU/Linux -- is easy: break it into a multi-step pipeline.

In the case of summing a million numbers, think of the following shell script, assuming some hypothetical sum.py program that sums either a range of numbers given as arguments or a list of numbers read from stdin.

( sum.py 0 500000 & sum.py 500000 1000000 ) | sum.py

This would run 3 concurrent processes: two summing large ranges of numbers, and a third summing the two partial results.
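
sum.py is hypothetical, but a minimal sketch of it, assuming two arguments mean "sum the integers in this half-open range" and no arguments mean "sum the numbers read from stdin, one per line", could look like this:

#!/usr/bin/env python
# sum.py (hypothetical): with two arguments, sum the integers in [lo, hi);
# with no arguments, sum the numbers arriving on stdin.
import sys

def main():
    if len(sys.argv) == 3:
        lo, hi = int(sys.argv[1]), int(sys.argv[2])
        print sum(xrange(lo, hi))
    else:
        print sum(int(line) for line in sys.stdin)

if __name__ == '__main__':
    main()

With something like that executable on your PATH, the pipeline above prints the grand total.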

Since the GNU/Linux shells and the OS already handle some parts of the concurrency for you, you can design simple (very, very simple) programs that read from stdin, write to stdout, and each do a small part of a large job.

You can try to reduce the overhead by using subprocess to build the pipeline instead of handing the job to the shell. You may find, however, that the shell builds pipelines very, very quickly. (It is written directly in C and makes direct OS API calls for you.)
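
A rough sketch of building that same pipeline with subprocess, still assuming the hypothetical sum.py above, might look like this:

import subprocess

def main():
    # Final stage: sums whatever numbers arrive on its stdin.
    tail = subprocess.Popen(['python', 'sum.py'],
                            stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    # Two partial summers, each writing its one-line result into the tail's stdin.
    heads = [subprocess.Popen(['python', 'sum.py', str(lo), str(hi)],
                              stdout=tail.stdin)
             for lo, hi in ((0, 500000), (500000, 1000000))]
    for h in heads:
        h.wait()
    tail.stdin.close()            # let the final stage see end-of-input
    print tail.stdout.read(),     # the grand total
    tail.wait()

if __name__ == '__main__':
    main()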


Sure, for example:

from multiprocessing import Process, Queue

thelist = range(1000*1000)

def f(q, sublist):
    q.put(sum(sublist))

def main():
    start = 0
    chunk = 500*1000
    queue = Queue()
    NP = 0
    subprocesses = []
    while start < len(thelist):
        p = Process(target=f, args=(queue, thelist[start:start+chunk]))
        NP += 1
        print 'delegated %s:%s to subprocess %s' % (start, start+chunk, NP)
        p.start()
        start += chunk
        subprocesses.append(p)
    total = 0
    for i in range(NP):
        total += queue.get()
    print "total is", total, '=', sum(thelist)
    while subprocesses:
        subprocesses.pop().join()

if __name__ == '__main__':
    main()

results in:

$ python2.6 mup.py
delegated 0:500000 to subprocess 1
delegated 500000:1000000 to subprocess 2
total is 499999500000 = 499999500000

Note that this granularity is too fine to be worth spawning processes for -- the overall summing task is small (which is why I can recompute the sum in main as a check ;-) and too much data is being moved back and forth (in fact the subprocesses wouldn't need to get copies of the sublists they work on -- indices would suffice). So it's a "toy example" where multiprocessing isn't really warranted. With a different architecture (e.g. a pool of subprocesses that receive multiple tasks to perform from a queue, minimizing data movement back and forth) and on less granular tasks, however, you could actually get performance benefits.
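
For instance, a minimal sketch of that kind of architecture (the sum_range helper and chunk size are arbitrary choices) could hand a multiprocessing.Pool nothing but (start, end) index pairs, so essentially no data gets pickled between processes:

from multiprocessing import Pool

def sum_range(bounds):
    # Workers receive only a (start, end) pair, not a copy of the data.
    start, end = bounds
    return sum(xrange(start, end))

def main():
    chunk = 250 * 1000
    tasks = [(start, start + chunk) for start in xrange(0, 1000 * 1000, chunk)]
    pool = Pool()                  # defaults to one worker process per core
    total = sum(pool.map(sum_range, tasks))
    pool.close()
    pool.join()
    print 'total is', total

if __name__ == '__main__':
    main()

Each task here is just two small integers, and the same few worker processes are reused for all the tasks, which is exactly the kind of setup where the per-process overhead can start to pay off.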