
Writing a parallel loop


Continuing on your request for working multiprocessing code, I suggest that you use Pool.map (if the delayed functionality is not important). I'll give you an example below. If you're working with Python 3, it's worth mentioning that you can use starmap when your function takes several arguments. Also worth mentioning: map_async/starmap_async return immediately instead of blocking, and imap_unordered is an option if the order of the returned results does not have to correspond to the order of inputs.

    import multiprocessing as mp

    def processInput(i):
        return i * i

    if __name__ == '__main__':
        # what are your inputs, and what operation do you want to
        # perform on each input. For example...
        inputs = range(1000000)
        # removing processes argument makes the code run on all available cores
        pool = mp.Pool(processes=4)
        results = pool.map(processInput, inputs)
        print(results)
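For comparison, here is a rough sketch of two of the alternatives to Pool.map: starmap for functions that take several arguments, and imap_unordered for when result order does not matter. The names power, square, run_starmap and run_unordered are made up for illustration, and the pool size of 2 is an arbitrary choice.

```python
import multiprocessing as mp


def power(base, exponent):
    # Hypothetical worker that takes two arguments.
    return base ** exponent


def square(x):
    # Hypothetical single-argument worker.
    return x * x


def run_starmap(pairs, processes=2):
    # starmap unpacks each tuple into positional arguments;
    # results come back in the same order as the inputs.
    with mp.Pool(processes=processes) as pool:
        return pool.starmap(power, pairs)


def run_unordered(values, processes=2):
    # imap_unordered yields results as workers finish them,
    # so the order of the returned list is not guaranteed.
    with mp.Pool(processes=processes) as pool:
        return list(pool.imap_unordered(square, values))


if __name__ == "__main__":
    print(run_starmap([(2, 3), (3, 2), (10, 2)]))
    print(sorted(run_unordered(range(10))))
```

On Windows the `if __name__ == "__main__":` guard is required, because each spawned worker re-imports the main module.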


On Windows, the multiprocessing module uses the 'spawn' method to start up multiple Python interpreter processes. This is relatively slow. joblib's Parallel tries to be smart about running the code. In particular, it tries to adjust batch sizes so that a batch takes about half a second to execute. (See the batch_size argument at https://pythonhosted.org/joblib/parallel.html)

Your processInput() function runs so fast that Parallel determines that it is faster to run the jobs serially on one processor than to spin up multiple Python interpreters and run the code in parallel.

If you want to force your example to run on multiple cores, try setting batch_size to 1000 or making processInput() more complicated so it takes longer to execute.
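A minimal sketch of the batch_size suggestion, assuming joblib is installed. The helper run_batched and the choice of n_jobs=2 are illustrative only; process_input stands in for the processInput() function from the question.

```python
from joblib import Parallel, delayed


def process_input(i):
    # Stand-in for the very cheap function from the question.
    return i * i


def run_batched(n_items, batch_size=1000, n_jobs=2):
    # A fixed batch_size replaces the default automatic batching,
    # so tasks are dispatched to the workers in chunks of that size.
    return Parallel(n_jobs=n_jobs, batch_size=batch_size)(
        delayed(process_input)(i) for i in range(n_items)
    )


if __name__ == "__main__":
    results = run_batched(100000)
    print(results[:5])
```

Whether this is actually faster than running serially depends on how expensive each call is; for a function this cheap, the process start-up cost may still dominate.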

Edit: Working example on Windows that shows multiple processes in use (I'm using Windows 7):

    from joblib import Parallel, delayed
    from os import getpid

    def modfib(n):
        # print the process id to see that multiple processes are used, and
        # re-used during the job.
        if n % 400 == 0:
            print(getpid(), n)

        # fibonacci sequence mod 1000000
        a, b = 0, 1
        for i in range(n):
            a, b = b, (a + b) % 1000000
        return b

    if __name__ == "__main__":
        Parallel(n_jobs=-1, verbose=5)(delayed(modfib)(j) for j in range(1000, 4000))