
Multiprocessing IOError: bad message length


You're creating a pool and sending all the images to func() at once. If you can get away with working on a single image at a time, try something like this, which runs to completion with N=10000 in 35 s with Python 2.7.10 for me:

import numpy as np
import multiprocessing

def func(args):
    i = args[0]
    img = args[1]
    print "{}: {} {}".format(i, img.shape, img.sum())
    return 0

N = 10000
images = ((i, np.random.random_integers(1, 100, size=(500, 500))) for i in xrange(N))
pool = multiprocessing.Pool(4)
pool.imap(func, images)
pool.close()
pool.join()

The key here is to use iterators so you don't have to hold all the data in memory at once. For instance, I converted images from an array holding all the data into a generator expression that creates each image only when it's needed; you could modify this to load your images from disk instead. I also used pool.imap instead of pool.map, since imap doesn't consume the whole iterable up front.

If you can, try to load the image data inside the worker function. Right now you have to serialize all the data and ship it across to another process; if your images are large, that can become a bottleneck.
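As a rough sketch, assuming the images are stored on disk as .npy files (the filenames here are hypothetical), loading inside the worker means only a short path string gets pickled per task:

import numpy as np
import multiprocessing

def func(path):
    # Load the image inside the worker, so only the short path string
    # (not the 500x500 array) is pickled and sent between processes.
    img = np.load(path)
    return img.sum()

if __name__ == '__main__':
    paths = ["image_{}.npy".format(i) for i in xrange(10000)]  # hypothetical filenames
    pool = multiprocessing.Pool(4)
    for result in pool.imap(func, paths):
        pass  # consume results as they arrive
    pool.close()
    pool.join()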

[update now that we know func has to handle all images at once]

You could compute an iterative mean over your images. Here's a solution without multiprocessing; to use multiprocessing, you could divide your images into chunks and farm those chunks out to the pool, as in the sketch after the code below.

import numpy as np

N = 10000
shape = (500, 500)

def func(images):
    average = np.full(shape, 0.0)       # float accumulator
    for i, img in images:
        average += img / float(N)       # true division, avoids integer truncation
    return average

images = ((i, np.full(shape, i)) for i in range(N))
print func(images)
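If you do want to parallelize this, one possibility (only a sketch; the chunk size and the np.full stand-in for your real image source are placeholders) is to have each worker compute a partial sum over its own chunk of indices and combine the partial results in the parent:

import numpy as np
import multiprocessing

N = 10000
shape = (500, 500)
CHUNK = 1000  # tunable chunk size

def chunk_sum(indices):
    # Each worker builds (or loads) only the images in its chunk and
    # returns a single partial sum, so little data crosses processes.
    partial = np.zeros(shape)
    for i in indices:
        partial += np.full(shape, i)  # stand-in for loading image i
    return partial

if __name__ == '__main__':
    chunks = [range(start, min(start + CHUNK, N)) for start in range(0, N, CHUNK)]
    pool = multiprocessing.Pool(4)
    partials = pool.map(chunk_sum, chunks)
    pool.close()
    pool.join()
    print sum(partials) / N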


Python is likely to load your data into RAM, and that memory needs to be available. Have you checked your computer's memory usage?
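One quick way to check from Python itself (this assumes the third-party psutil package is installed; the standard library doesn't expose this directly):

import psutil

mem = psutil.virtual_memory()
print "total: {:.1f} GB, available: {:.1f} GB".format(mem.total / 1e9, mem.available / 1e9)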

Also, as Patrick mentioned, you're loading 3 GB of data, so make sure you use the 64-bit version of Python, since you are hitting the 32-bit memory constraint. This could cause your process to crash: 32 vs 64 bits Python
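You can check which build you're running from the interpreter itself:

import struct
import sys

# Pointer size in bits: 32 on a 32-bit build, 64 on a 64-bit build.
print "{}-bit Python".format(struct.calcsize("P") * 8)
# Equivalent check: sys.maxsize only exceeds 2**32 on 64-bit builds.
print sys.maxsize > 2 ** 32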

Another improvement would be to use Python 3.4 instead of 2.7. The Python 3 implementation seems to be optimized for very large ranges; see Python3 vs Python2 list/generator range performance


When running your program, it actually gives me a clear error:

OSError: [Errno 12] Cannot allocate memory

As mentioned by other users, the solution to your problem is simple: add memory (a lot of it), or change the way your program handles the images.

The reason it's using so much memory is that you allocate the memory for your images at module level. When multiprocessing forks your process, it also copies all the images (which isn't free, as noted in Shared-memory objects in python multiprocessing). This is unnecessary, because you are also passing the images as an argument to the function, which the multiprocessing module copies again using IPC and pickle; even then you would likely still run out of memory. Try one of the solutions proposed by the other users.
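As a sketch of what that restructuring could look like (the worker below just regenerates random data in place of your real image source), keep the heavy allocation out of module level and pass the workers only small arguments:

import numpy as np
import multiprocessing

# Avoid a module-level list like
#   images = [np.random.random_integers(1, 100, (500, 500)) for _ in xrange(10000)]
# because every forked worker inherits a copy of it.

def img_sum(i):
    # Create (or load) image i inside the worker, so neither the fork nor
    # the pickle-based argument passing has to duplicate the image data.
    img = np.random.random_integers(1, 100, (500, 500))
    return img.sum()

if __name__ == '__main__':
    pool = multiprocessing.Pool(4)
    totals = pool.map(img_sum, xrange(10000))  # only small ints are sent to workers
    pool.close()
    pool.join()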