concurrent.futures.ThreadPoolExecutor.map is slower than a for loop

I've not yet tried futures, but I believe it's thread-based, so this probably applies: http://www.youtube.com/watch?v=ph374fJqFPE

In short, I/O bound workloads thread well in CPython, but CPU-bound workloads do not. And if you mix I/O bound and CPU-bound threads in the same process, that doesn't thread well either.
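To see the effect described above, here is a minimal sketch (the function name and work sizes are illustrative, not from the question) that times the same CPU-bound task serially and through a thread pool; with CPython's GIL, the threaded version is typically no faster:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def cpu_bound(n):
    # Pure-Python bytecode loop: the GIL serializes this across threads.
    total = 0
    for i in range(n):
        total += i * i
    return total

def timed(fn):
    start = time.perf_counter()
    fn()
    return time.perf_counter() - start

work = [200_000] * 8

serial = timed(lambda: [cpu_bound(n) for n in work])
with ThreadPoolExecutor(max_workers=4) as pool:
    threaded = timed(lambda: list(pool.map(cpu_bound, work)))

print(f"serial:   {serial:.3f}s")
print(f"threaded: {threaded:.3f}s")
```

If you replace `cpu_bound` with something that sleeps or waits on a socket (I/O bound), the threaded version does win, because the GIL is released while waiting.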

If that's the problem, I'd suggest increasing the size of your work chunks (squaring a single number is a tiny task) and using multiprocessing. Multiprocessing is thread-like in its API, but it uses multiple processes with separate memory spaces, and tends to give looser coupling between program components than threading anyway.

That, or switch to Jython or IronPython; these reputedly thread well.


You're using a thread pool to try to make CPU-bound work concurrent? I wouldn't recommend it. Use processes instead; otherwise the GIL will slow things down more and more as the size of your thread pool increases.

[Edit 1]

Similar question with references to the GIL explanation from David Beazley:

Python code performance decreases with threading


Python has the global interpreter lock (GIL), which prevents Python bytecode from the same process from executing in different threads simultaneously. To achieve true parallel execution you have to use multiple processes (an easy switch to ProcessPoolExecutor) or native (non-Python, e.g. C) code.