multiprocessing.Pool hangs if child causes a segmentation fault
As described in the comments, this just works in Python 3 if you use concurrent.futures.ProcessPoolExecutor instead of multiprocessing.Pool: when a worker dies abruptly, the executor raises BrokenProcessPool instead of hanging forever.
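A minimal sketch of that behavior (the `crash` function here is illustrative, using SIGSEGV to stand in for a real native-code crash):

```python
import os
import signal
from concurrent.futures import ProcessPoolExecutor
from concurrent.futures.process import BrokenProcessPool

def crash(x):
    # Simulate a segfaulting C extension by delivering SIGSEGV to the worker
    os.kill(os.getpid(), signal.SIGSEGV)

def main():
    with ProcessPoolExecutor(max_workers=2) as pool:
        try:
            list(pool.map(crash, range(4)))
        except BrokenProcessPool:
            # The executor notices the dead worker instead of hanging
            print("pool raised BrokenProcessPool")

if __name__ == "__main__":
    main()
```

The equivalent multiprocessing.Pool call would block forever waiting on results the dead child will never produce.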
If you're stuck on Python 2, the best option I've found is to use the timeout argument on the result objects returned by Pool.apply_async and Pool.map_async. For example:

pool = Pool(2)
out = pool.map_async(fit_one, range(10))
for o in out.get(timeout=1000):  # allow 1000 seconds max
    print o

(Note that map_async returns a single AsyncResult, so the timeout goes on its get() call rather than on each item.)
This works as long as you have an upper bound for how long a child process should take to complete a task.
This is a known bug, issue #22393, in Python. There is no meaningful workaround as long as you're using multiprocessing.Pool until it's fixed. A patch is available at that link, but it has not yet been merged into the main branch, so no stable release of Python fixes the problem.
Instead of using Pool().imap(), you might prefer to create child processes yourself with Process(). Each Process object exposes is_alive() and exitcode, so you can tell whether a child is still running and whether it died from a signal, rather than hanging on it.
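A minimal sketch of that approach, assuming the hypothetical `worker` below stands in for your task (task 2 deliberately segfaults to simulate a crash):

```python
import os
import signal
from multiprocessing import Process, Queue
from queue import Empty

def worker(task, out):
    # Hypothetical task: task 2 raises SIGSEGV to simulate a native crash
    if task == 2:
        os.kill(os.getpid(), signal.SIGSEGV)
    out.put((task, task * task))

def main():
    out = Queue()
    procs = [Process(target=worker, args=(t, out)) for t in range(4)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
        # A negative exitcode means the child was killed by that signal number
        if p.exitcode != 0:
            print("child %s died with exitcode %s" % (p.pid, p.exitcode))
    # Drain whatever results the surviving children produced
    try:
        while True:
            print(out.get(timeout=1))
    except Empty:
        pass

if __name__ == "__main__":
    main()
```

Unlike Pool, join() returns here even for the crashed child, and its exitcode (e.g. -11 for SIGSEGV on Linux) tells you what happened.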