Divide an uneven number between Threads Divide an uneven number between Threads arrays arrays

Divide an uneven number between Threads


Let me just take your example as it will be easy to explain. 22 elements amongst 4 threads.

22 % 4 = 2. This gives you the number of threads that will get one element more than the remaining threads.

22 / 4 = 5. This gives you the minimum number of elements per thread.

Now start dividing your array into 5 elements and assign them to a thread each till you are left with (22%4) 2 threads. Assign them the remaining (5+1=6) elements each.


It doesn't need to be as evenly as possible. If one thread has 6, this will determine the length of time it takes in which case it doesn't matter how many are up to 6.

You can do

int chunkSize = (tasks + threads - 1) / threads; // divide by threads rounded up.for (int t = 0; t < threads; t++) {    int start = t * chunksSize;    int end = Math.min(start + chunkSize, tasks);    executor.submit(() -> {         // inside the thread         for (int i = start; i < end; i++) {             process(i);    });}

Note: if you use Stream.of(array).parallel() it actually create two tasks per thread. This mitigates that some batches might take longer even though they have the same number of elements.


In order to make sure that the threads have a "similar" workload, it is important to find an even distribution. This is particularly important when the number of threads is "high" compared to the number of elements. For this case, one should make sure that the numbers of elements that the threads are responsible for differs by at most 1.

To achieve this, you can compute the remainder of dividing the number of elements (the array length, in your case) by the number of threads, and distribute this remainder, one by one, among the tasks.

I had the same problem a while ago. In fact, I tried to solve it in a slightly more general form, for some ParallelRangeExecutor class, which required the computation of the start- and end indices of the intervals of an arbitrary range (which does not need to start with index 0). The following is "extracted" from this class:

import java.util.Arrays;public class EvenTaskDistribution{    public static void main(String[] args)    {        test( 22, 4);        test( 21, 4);        test(100, 3);        test(  3, 4);    }    private static void test(int numElements, int parallelism)    {        int taskSizes[] = computeTaskSizes(parallelism, 0, numElements);        System.out.printf("Distributing %4d elements among %4d threads: %s\n",            numElements, parallelism, Arrays.toString(taskSizes));    }    public static int[] computeTaskSizes(        int parallelism, int globalMin, int globalMax)    {        if (parallelism <= 0)        {            throw new IllegalArgumentException(                "Parallelism must be positive, but is " + parallelism);        }        if (globalMin > globalMax)        {            throw new IllegalArgumentException(                "The global minimum may not be larger than the global " +                 "maximum. Global minimum is "+globalMin+", " +                 "global maximum is "+globalMax);        }        int range = globalMax - globalMin;        if (range == 0)        {            return new int[0];        }        int numTasks = Math.min(range, parallelism);        int localRange = (range - 1) / numTasks + 1;        int spare = localRange * numTasks - range;        int currentIndex = globalMin;        int taskSizes[] = new int[numTasks];        for (int i = 0; i < numTasks; i++)        {            final int min = currentIndex;            final int max = min + localRange - (i < spare ? 1 : 0);            taskSizes[i] = max - min;             currentIndex = max;        }        return taskSizes;    }}

The output is

Distributing   22 elements among    4 threads: [5, 5, 6, 6]Distributing   21 elements among    4 threads: [5, 5, 5, 6]Distributing  100 elements among    3 threads: [33, 33, 34]Distributing    3 elements among    4 threads: [1, 1, 1]

(The last one shows one of the corner cases that one might have to take into account. E.g. one could expect [1,1,1,0] here. But this can easily be adjusted depending on the application case).