Python multiprocessing PicklingError: Can't pickle <type 'function'>
Here is a list of what can be pickled. In particular, functions are only picklable if they are defined at the top-level of a module.
This piece of code:
    import multiprocessing as mp

    class Foo():
        def work(self):
            pass

    if __name__ == '__main__':
        pool = mp.Pool()
        foo = Foo()
        pool.apply_async(foo.work)
        pool.close()
        pool.join()
yields an error almost identical to the one you posted:
    Exception in thread Thread-2:
    Traceback (most recent call last):
      File "/usr/lib/python2.7/threading.py", line 552, in __bootstrap_inner
        self.run()
      File "/usr/lib/python2.7/threading.py", line 505, in run
        self.__target(*self.__args, **self.__kwargs)
      File "/usr/lib/python2.7/multiprocessing/pool.py", line 315, in _handle_tasks
        put(task)
    PicklingError: Can't pickle <type 'function'>: attribute lookup __builtin__.function failed
The problem is that the pool methods all use an mp.SimpleQueue to pass tasks to the worker processes. Everything that goes through the mp.SimpleQueue must be picklable, and foo.work is not picklable since it is not defined at the top level of the module.
It can be fixed by defining a function at the top level, which calls foo.work():

    def work(foo):
        foo.work()

    pool.apply_async(work, args=(foo,))
Notice that foo is picklable, since Foo is defined at the top level and foo.__dict__ is picklable.
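A quick way to see which objects would survive the trip through the queue is to try pickling them directly. The following is a minimal sketch, not part of the original answer; the helper name is_picklable is hypothetical, and the example is written for Python 3, whereas the traceback above comes from Python 2.7.

```python
import pickle


def is_picklable(obj):
    """Return True if pickle.dumps(obj) succeeds, False otherwise."""
    try:
        pickle.dumps(obj)
    except Exception:
        # Depending on the Python version and the object, the failure may be
        # a pickle.PicklingError, an AttributeError, or a TypeError.
        return False
    return True


# A builtin function is pickled by reference to its importable name.
print(is_picklable(len))               # True
# A lambda has no importable name, so pickle cannot look it up.
print(is_picklable(lambda x: x + 1))   # False
```

Note that on Python 3 bound methods of picklable instances can generally be pickled, so the original foo.work example may behave differently there than on Python 2.7.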
An alternative is to use pathos.multiprocessing instead of multiprocessing. pathos.multiprocessing is a fork of multiprocessing that uses dill. dill can serialize almost anything in Python, so you are able to send a lot more around in parallel. The pathos fork also has the ability to work directly with multiple-argument functions, as you need for class methods.
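    >>> from pathos.multiprocessing import ProcessingPool as Pool
    >>> p = Pool(4)
    >>> class Test(object):
    ...     def plus(self, x, y):
    ...         return x+y
    ...
    >>> t = Test()
    >>> # x and y are not defined in this excerpt; these values are an
    >>> # assumption, chosen to be consistent with the output shown.
    >>> x = [0, 1, 2, 3]
    >>> y = [4, 5, 6, 7]
    >>> p.map(t.plus, x, y)
    [4, 6, 8, 10]
    >>> class Foo(object):
    ...     @staticmethod
    ...     def work(self, x):
    ...         return x+1
    ...
    >>> f = Foo()
    >>> p.apipe(f.work, f, 100)
    <processing.pool.ApplyResult object at 0x10504f8d0>
    >>> res = _
    >>> res.get()
    101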
Get pathos (and if you like, dill) here: https://github.com/uqfoundation
As others have said, multiprocessing can only transfer Python objects to worker processes if they can be pickled. If you cannot reorganize your code as described by unutbu, you can use dill's extended pickling/unpickling capabilities for transferring data (especially code data), as I show below.
This solution requires only the installation of dill and no other libraries such as pathos:
    import os
    from multiprocessing import Pool

    import dill


    def run_dill_encoded(payload):
        # Runs in the worker: unpack the function and its arguments, then call it.
        fun, args = dill.loads(payload)
        return fun(*args)


    def apply_async(pool, fun, args):
        # Serialize with dill so that only plain bytes go through the pool's queue.
        payload = dill.dumps((fun, args))
        return pool.apply_async(run_dill_encoded, (payload,))


    if __name__ == "__main__":
        pool = Pool(processes=5)

        # async execution of lambda
        jobs = []
        for i in range(10):
            job = apply_async(pool, lambda a, b: (a, b, a * b), (i, i + 1))
            jobs.append(job)

        for job in jobs:
            print(job.get())
        print("")

        # async execution of static method
        class O(object):
            @staticmethod
            def calc():
                return os.getpid()

        jobs = []
        for i in range(10):
            job = apply_async(pool, O.calc, ())
            jobs.append(job)

        for job in jobs:
            print(job.get())
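The wrapper works because the worker only ever receives a top-level function plus a bytes payload, and both are picklable. To illustrate that idea without installing anything, here is a rough standard-library sketch that ships a simple function by value using marshal. This is not how dill is implemented, and the names encode_call and run_encoded are hypothetical. It assumes Python 3 and only handles plain functions without closures or default arguments.

```python
import marshal
import pickle
import types


def encode_call(fun, args):
    """Pack a simple function (by value) and its arguments into bytes."""
    # marshal can serialize the code object itself, so no importable
    # name is needed; this is why a lambda can be sent this way.
    code_bytes = marshal.dumps(fun.__code__)
    return pickle.dumps((code_bytes, args))


def run_encoded(payload):
    """Rebuild the function from its code object and call it."""
    code_bytes, args = pickle.loads(payload)
    # Closures and default arguments are not restored in this sketch.
    fun = types.FunctionType(marshal.loads(code_bytes), globals())
    return fun(*args)


payload = encode_call(lambda a, b: (a, b, a * b), (2, 3))
print(run_encoded(payload))  # (2, 3, 6)
```

In a real pool, run_encoded would take the place of run_dill_encoded as the function handed to pool.apply_async. dill handles far more cases (closures, classes, instances), so it remains the practical choice.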