parallel dask for loop slower than regular loop?


Two issues:

  1. Dask introduces about a millisecond of overhead per task, so each task should run for significantly longer than that. If the loop body is fast, batch many iterations into a single task.
  2. With the multiprocessing scheduler, data is serialized to move between processes, which can be quite expensive. See http://dask.pydata.org/en/latest/setup.html
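
A minimal sketch of how both points might be addressed, assuming the loop body is a cheap per-item function: group many items into each task so the per-task overhead is amortized, and use the threaded scheduler so data stays in one process and is not serialized. The names `square`, `process_batch`, `make_batches`, `run_batched`, and the batch size are hypothetical and not taken from the question.

```python
def square(x):
    """Stand-in for the cheap per-item work in the loop body (hypothetical)."""
    return x * x


def process_batch(batch):
    """Run the loop body over a whole batch so that one task does a lot of work."""
    return [square(x) for x in batch]


def make_batches(items, batch_size):
    """Split a list into consecutive chunks of at most batch_size items."""
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


def run_batched(items, batch_size=10_000):
    """Build one dask task per batch and compute them on the threaded scheduler."""
    # Imported lazily so the helpers above can be used without dask installed.
    import dask

    tasks = [dask.delayed(process_batch)(b) for b in make_batches(items, batch_size)]
    # scheduler="threads" keeps all data in one process, so nothing is pickled.
    results = dask.compute(*tasks, scheduler="threads")
    # Flatten the per-batch results back into one list, preserving order.
    return [y for batch in results for y in batch]
```

Calling `run_batched(list(range(1_000_000)))` would create 100 tasks instead of a million. Threads tend to help mainly when the work releases the GIL, as most NumPy operations do; for pure-Python loop bodies like the stand-in above, a plain loop may still be faster.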