
How to use multiprocessing pool.map with multiple arguments?


Is there a variant of pool.map which supports multiple arguments?

Python 3.3 includes the pool.starmap() method:

#!/usr/bin/env python3
from functools import partial
from itertools import repeat
from multiprocessing import Pool, freeze_support

def func(a, b):
    return a + b

def main():
    a_args = [1, 2, 3]
    second_arg = 1
    with Pool() as pool:
        L = pool.starmap(func, [(1, 1), (2, 1), (3, 1)])
        M = pool.starmap(func, zip(a_args, repeat(second_arg)))
        N = pool.map(partial(func, b=second_arg), a_args)
        assert L == M == N

if __name__ == "__main__":
    freeze_support()
    main()
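If you need a non-blocking variant, Python 3.3 also added pool.starmap_async(), which returns an AsyncResult whose .get() yields the same list. A minimal sketch reusing the same func:

#!/usr/bin/env python3
from multiprocessing import Pool, freeze_support

def func(a, b):
    return a + b

def main():
    with Pool() as pool:
        # starmap_async returns immediately; .get() blocks until all results are ready
        result = pool.starmap_async(func, [(1, 1), (2, 1), (3, 1)])
        print(result.get())  # [2, 3, 4]

if __name__ == "__main__":
    freeze_support()
    main()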

For older versions:

#!/usr/bin/env python2
import itertools
from multiprocessing import Pool, freeze_support

def func(a, b):
    print a, b

def func_star(a_b):
    """Convert `f([1,2])` to `f(1,2)` call."""
    return func(*a_b)

def main():
    pool = Pool()
    a_args = [1, 2, 3]
    second_arg = 1
    pool.map(func_star, itertools.izip(a_args, itertools.repeat(second_arg)))

if __name__ == "__main__":
    freeze_support()
    main()

Output

1 1
2 1
3 1

Notice how itertools.izip() and itertools.repeat() are used here.
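To make the pairing concrete, here's a small illustration of the argument tuples that get fed to func_star (using Python 3's built-in zip, which behaves like izip except that it is shown here materialized into a list):

from itertools import repeat

a_args = [1, 2, 3]
second_arg = 1
# Each worker receives one of these tuples and unpacks it via func(*a_b)
print(list(zip(a_args, repeat(second_arg))))  # [(1, 1), (2, 1), (3, 1)]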

Due to the bug mentioned by @unutbu, you can't use functools.partial() or similar capabilities on Python 2.6, so the simple wrapper function func_star() has to be defined explicitly. See also the workaround suggested by uptimebox.


The answer to this is version- and situation-dependent. The most general answer for recent versions of Python (since 3.3) was first described above by J.F. Sebastian [1]. It uses the Pool.starmap method, which accepts a sequence of argument tuples. It then automatically unpacks the arguments from each tuple and passes them to the given function:

import multiprocessing
from itertools import product

def merge_names(a, b):
    return '{} & {}'.format(a, b)

if __name__ == '__main__':
    names = ['Brown', 'Wilson', 'Bartlett', 'Rivera', 'Molloy', 'Opie']
    with multiprocessing.Pool(processes=3) as pool:
        results = pool.starmap(merge_names, product(names, repeat=2))
    print(results)

# Output: ['Brown & Brown', 'Brown & Wilson', 'Brown & Bartlett', ...

For earlier versions of Python, you'll need to write a helper function to unpack the arguments explicitly. If you want to use a with statement, you'll also need to write a wrapper to turn Pool into a context manager. (Thanks to muon for pointing this out.)

import multiprocessing
from itertools import product
from contextlib import contextmanager

def merge_names(a, b):
    return '{} & {}'.format(a, b)

def merge_names_unpack(args):
    return merge_names(*args)

@contextmanager
def poolcontext(*args, **kwargs):
    pool = multiprocessing.Pool(*args, **kwargs)
    yield pool
    pool.terminate()

if __name__ == '__main__':
    names = ['Brown', 'Wilson', 'Bartlett', 'Rivera', 'Molloy', 'Opie']
    with poolcontext(processes=3) as pool:
        results = pool.map(merge_names_unpack, product(names, repeat=2))
    print(results)

# Output: ['Brown & Brown', 'Brown & Wilson', 'Brown & Bartlett', ...

In simpler cases, with a fixed second argument, you can also use partial, but only in Python 2.7+.

import multiprocessing
from functools import partial
from contextlib import contextmanager

@contextmanager
def poolcontext(*args, **kwargs):
    pool = multiprocessing.Pool(*args, **kwargs)
    yield pool
    pool.terminate()

def merge_names(a, b):
    return '{} & {}'.format(a, b)

if __name__ == '__main__':
    names = ['Brown', 'Wilson', 'Bartlett', 'Rivera', 'Molloy', 'Opie']
    with poolcontext(processes=3) as pool:
        results = pool.map(partial(merge_names, b='Sons'), names)
    print(results)

# Output: ['Brown & Sons', 'Wilson & Sons', 'Bartlett & Sons', ...

[1] Much of this was inspired by his answer, which should probably have been accepted instead. But since this one is stuck at the top, it seemed best to improve it for future readers.


I think the following will be better:

from multiprocessing import Pool

def multi_run_wrapper(args):
    return add(*args)

def add(x, y):
    return x + y

if __name__ == "__main__":
    pool = Pool(4)
    results = pool.map(multi_run_wrapper, [(1, 2), (2, 3), (3, 4)])
    print(results)

Output

[3, 5, 7]
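On Python 3.3 and later you can skip the wrapper entirely and use Pool.starmap, as shown in the answers above; a minimal sketch of the same example:

from multiprocessing import Pool

def add(x, y):
    return x + y

if __name__ == "__main__":
    with Pool(4) as pool:
        # starmap unpacks each tuple into the arguments of add
        print(pool.starmap(add, [(1, 2), (2, 3), (3, 4)]))  # [3, 5, 7]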