Why does multiprocessing use only a single core after I import numpy?
After some more googling I found the answer here.
It turns out that certain Python modules (numpy, scipy, tables, pandas, skimage...) mess with core affinity on import. As far as I can tell, this problem seems to be specifically caused by them linking against multithreaded OpenBLAS libraries.
A workaround is to reset the task affinity with taskset (the mask 0xff allows the process to run on CPUs 0 to 7):
os.system("taskset -p 0xff %d" % os.getpid())
With this line pasted in after the module imports, my example now runs on all cores.
My experience so far has been that this doesn't seem to have any negative effect on numpy's performance, although this is probably machine- and task-specific.
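The hard-coded 0xff only covers eight cores. As a rough sketch (not part of the original workaround), the mask could be derived from the core count instead. The helper names all_cores_mask and reset_affinity_with_taskset are made up for illustration, and the code assumes a Linux machine with taskset from util-linux on the PATH:

```python
import os
import shutil
import subprocess


def all_cores_mask(n_cpus):
    """Return a hex bitmask with the lowest n_cpus bits set, e.g. 8 -> '0xff'."""
    return hex((1 << n_cpus) - 1)


def reset_affinity_with_taskset():
    """Run `taskset -p <mask> <pid>` on the current process.

    Returns True if taskset exited successfully, False otherwise
    (including the case where taskset is not installed).
    """
    if shutil.which("taskset") is None:
        return False
    mask = all_cores_mask(os.cpu_count() or 1)
    result = subprocess.run(
        ["taskset", "-p", mask, str(os.getpid())],
        capture_output=True,
        text=True,
    )
    return result.returncode == 0
```

As with the one-liner, this would be called after the numpy/scipy imports and before any worker processes are started.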
Update:
There are also two ways to disable the CPU affinity-resetting behaviour of OpenBLAS itself. At run-time you can use the environment variable OPENBLAS_MAIN_FREE (or GOTOBLAS_MAIN_FREE), for example:
OPENBLAS_MAIN_FREE=1 python myscript.py
Or alternatively, if you're compiling OpenBLAS from source, you can permanently disable it at build-time by editing Makefile.rule to contain the line
NO_AFFINITY=1
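The run-time variable can probably also be set from inside the script rather than on the command line. The sketch below assumes that OpenBLAS reads the variable when the library is first loaded, so it has to run before numpy (or anything else linked against OpenBLAS) is imported. The function name disable_openblas_affinity is invented for this example:

```python
import os


def disable_openblas_affinity():
    """Set the variables named above so OpenBLAS leaves CPU affinity alone.

    Assumption: the variables are read when the OpenBLAS library is loaded,
    so this must be called before the first `import numpy`.
    """
    for name in ("OPENBLAS_MAIN_FREE", "GOTOBLAS_MAIN_FREE"):
        os.environ[name] = "1"


disable_openblas_affinity()
# import numpy  # only import after the variables have been set
```

Child processes inherit the parent's environment, so workers started afterwards should see the same setting.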
Python 3 (3.3 and later, on platforms that support it, such as Linux) now exposes functions to get and set the affinity directly:
>>> import os
>>> os.sched_getaffinity(0)
{0, 1, 2, 3}
>>> os.sched_setaffinity(0, {1, 3})
>>> os.sched_getaffinity(0)
{1, 3}
>>> x = {i for i in range(10)}
>>> x
{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
>>> os.sched_setaffinity(0, x)
>>> os.sched_getaffinity(0)
{0, 1, 2, 3}
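Applied to the original problem, the same two calls could be wrapped in a small helper that undoes any pinning left behind by an import. This is only a sketch: restore_full_affinity is a made-up name, and it assumes CPU indices run from 0 to os.cpu_count() - 1, which holds on typical machines but is not guaranteed.

```python
import os


def restore_full_affinity():
    """Let the current process run on every CPU the OS will allow.

    Returns (before, after) affinity sets, or None where the
    sched_* functions are unavailable (e.g. macOS, Windows).
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    before = os.sched_getaffinity(0)
    # Request every CPU index; as in the session above, only the CPUs
    # that actually exist and are permitted end up in the affinity set.
    os.sched_setaffinity(0, range(os.cpu_count() or 1))
    after = os.sched_getaffinity(0)
    return before, after
```

Calling it after the imports and before creating a multiprocessing.Pool matters, because worker processes inherit the affinity of the parent at the time they are started. Comparing the two returned sets also shows whether an import had narrowed the affinity in the first place.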
This appears to be a common problem with Python on Ubuntu, and is not specific to joblib:
- Both multiprocessing.map and joblib use only 1 cpu after upgrade from Ubuntu 10.10 to 12.04
- Python multiprocessing utilizes only one core
- multiprocessing.Pool processes locked to a single core
I would suggest experimenting with CPU affinity (taskset).