What determines the number of threads a Java ForkJoinPool creates?

java parallel-processing threadpool fork-join

There is a related question on Stack Overflow:

ForkJoinPool stalls during invokeAll/join

I made a runnable, stripped-down version of what is happening (JVM arguments I used: -Xms256m -Xmx1024m -Xss8m):

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;
import java.util.concurrent.RecursiveTask;
import java.util.concurrent.TimeUnit;

public class Test1 {

    private static ForkJoinPool pool = new ForkJoinPool(2);

    private static class SomeAction extends RecursiveAction {

        private int counter;            // recursive counter
        private int childrenCount = 80; // amount of children to spawn
        private int idx;                // just for displaying

        private SomeAction(int counter, int idx) {
            this.counter = counter;
            this.idx = idx;
        }

        @Override
        protected void compute() {
            System.out.println(
                "counter=" + counter + "." + idx +
                " activeThreads=" + pool.getActiveThreadCount() +
                " runningThreads=" + pool.getRunningThreadCount() +
                " poolSize=" + pool.getPoolSize() +
                " queuedTasks=" + pool.getQueuedTaskCount() +
                " queuedSubmissions=" + pool.getQueuedSubmissionCount() +
                " parallelism=" + pool.getParallelism() +
                " stealCount=" + pool.getStealCount());

            if (counter <= 0) return;

            List<SomeAction> list = new ArrayList<>(childrenCount);
            for (int i = 0; i < childrenCount; i++) {
                SomeAction next = new SomeAction(counter - 1, i);
                list.add(next);
                next.fork();
            }

            for (SomeAction action : list) {
                action.join();
            }
        }
    }

    public static void main(String[] args) throws Exception {
        pool.invoke(new SomeAction(2, 0));
    }
}

Apparently, when you perform a join, the current thread sees that the required task is not yet completed and takes another task to run in the meantime.

It happens in java.util.concurrent.ForkJoinWorkerThread#joinTask.

However, this new task spawns more of the same tasks, and those cannot find threads in the pool because the existing threads are blocked in join. Since the pool has no way of knowing how long it will take for them to be released (a thread could be in an infinite loop or deadlocked forever), it spawns new threads (compensating for joined threads, as Louis Wasserman mentioned): java.util.concurrent.ForkJoinPool#signalWork

So, to prevent this scenario, you need to avoid recursive spawning of tasks.

For example, if you set the initial counter to 1 in the code above (new SomeAction(1,0)), the number of active threads will be 2, even if you increase childrenCount tenfold.

Also note that, while the number of active threads increases, the number of running threads stays less than or equal to parallelism.
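To make the "avoid recursive spawning" advice concrete, here is a minimal sketch of a single-level fan-out: only the root forks and joins, and the leaves never spawn anything. The names FlatFanOut, Leaf, Root and run are hypothetical and not from the original post, and the pool size it prints may differ between JDK versions, so treat the number as an observation rather than a guarantee.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; all names here are illustrative assumptions.
class FlatFanOut {

    // Counts how many leaves actually ran.
    static final AtomicInteger leavesRun = new AtomicInteger();

    // A leaf does a small piece of work and never forks or joins.
    static class Leaf extends RecursiveAction {
        @Override
        protected void compute() {
            leavesRun.incrementAndGet();
        }
    }

    // The root is the only task that forks and joins: one level of spawning.
    static class Root extends RecursiveAction {
        private final int children;

        Root(int children) {
            this.children = children;
        }

        @Override
        protected void compute() {
            List<Leaf> leaves = new ArrayList<>(children);
            for (int i = 0; i < children; i++) {
                Leaf leaf = new Leaf();
                leaves.add(leaf);
                leaf.fork();
            }
            for (Leaf leaf : leaves) {
                leaf.join();
            }
        }
    }

    // Runs the fan-out on a fresh pool and returns the number of leaves executed.
    static int run(int parallelism, int children) {
        leavesRun.set(0);
        ForkJoinPool pool = new ForkJoinPool(parallelism);
        try {
            pool.invoke(new Root(children)); // returns once the root has joined every leaf
            System.out.println("poolSize=" + pool.getPoolSize()
                    + " parallelism=" + pool.getParallelism());
            return leavesRun.get();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("leaves run: " + run(2, 800));
    }
}
```

Comparing the printed poolSize with the output of the recursive version above is one way to see the difference on your own JDK.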


From the source comments:

Compensating: Unless there are already enough live threads, method tryPreBlock() may create or re-activate a spare thread to compensate for blocked joiners until they unblock.

I think what's happening is that none of your tasks finishes quickly, and since there are no available worker threads when you submit a new task, a new thread gets created.
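The compensation described in that source comment is internal to join, but ForkJoinPool also exposes a related, documented hook: ForkJoinPool.managedBlock with a ForkJoinPool.ManagedBlocker, which lets a task announce that it is about to block so the pool may activate a spare thread. This API is not mentioned in the original answer. The sketch below is only an illustration under that assumption, and CompensationSketch, LatchBlocker and run are hypothetical names. It uses a pool with parallelism 1, where one task waits on a latch that only a second task can release; the wait carries a timeout so the demo cannot hang if no spare thread is started.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ForkJoinTask;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; names are illustrative assumptions.
class CompensationSketch {

    // Waits on a latch, but tells the pool about the wait through ManagedBlocker.
    static class LatchBlocker implements ForkJoinPool.ManagedBlocker {
        private final CountDownLatch latch;
        private final long timeoutMillis;

        LatchBlocker(CountDownLatch latch, long timeoutMillis) {
            this.latch = latch;
            this.timeoutMillis = timeoutMillis;
        }

        @Override
        public boolean block() throws InterruptedException {
            latch.await(timeoutMillis, TimeUnit.MILLISECONDS);
            return true; // no further blocking is needed after this call
        }

        @Override
        public boolean isReleasable() {
            return latch.getCount() == 0;
        }
    }

    // Returns true if the waiting task saw the latch released before its timeout.
    static boolean run() {
        ForkJoinPool pool = new ForkJoinPool(1);
        CountDownLatch latch = new CountDownLatch(1);
        try {
            // First task: blocks until the latch is released (or the timeout expires).
            ForkJoinTask<Boolean> waiter = pool.submit(() -> {
                ForkJoinPool.managedBlock(new LatchBlocker(latch, 5_000));
                return latch.getCount() == 0;
            });
            // Second task: releases the latch; it needs some thread to run on.
            Runnable release = latch::countDown;
            pool.submit(release);

            boolean released = waiter.get(10, TimeUnit.SECONDS);
            System.out.println("released=" + released
                    + " poolSize=" + pool.getPoolSize()
                    + " parallelism=" + pool.getParallelism());
            return released;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        run();
    }
}
```

If the pool size printed here is larger than the parallelism, that is the spare thread the source comment talks about.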


The terms strict, fully-strict, and terminally-strict have to do with processing a directed acyclic graph (DAG). You can search for those terms to get a full understanding of them. That is the type of processing the framework was designed for. Look at the code in the API for Recursive...: the framework relies on your compute() code to call other compute() links and then do a join(). Each Task does a single join(), just like processing a DAG.
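A minimal sketch of that shape, assuming a simple array sum as the workload: each task splits its range in two, forks one half, computes the other half directly, and then does exactly one join(). RangeSum, THRESHOLD and sum are hypothetical names chosen for this illustration; this is not code from the framework or from the original answer.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Hypothetical divide-and-conquer task: one fork and one join per task.
class RangeSum extends RecursiveTask<Long> {

    private static final int THRESHOLD = 1_000; // illustrative cutoff for sequential work

    private final long[] data;
    private final int from; // inclusive
    private final int to;   // exclusive

    RangeSum(long[] data, int from, int to) {
        this.data = data;
        this.from = from;
        this.to = to;
    }

    @Override
    protected Long compute() {
        if (to - from <= THRESHOLD) {
            // Small enough: sum sequentially without spawning anything.
            long sum = 0;
            for (int i = from; i < to; i++) {
                sum += data[i];
            }
            return sum;
        }
        int mid = (from + to) >>> 1;
        RangeSum left = new RangeSum(data, from, mid);
        RangeSum right = new RangeSum(data, mid, to);
        left.fork();                      // hand one half to the pool
        long rightSum = right.compute();  // do the other half in this thread
        return left.join() + rightSum;    // the single join of this task
    }

    // Convenience wrapper: sums the array on a fresh pool with the given parallelism.
    static long sum(long[] data, int parallelism) {
        ForkJoinPool pool = new ForkJoinPool(parallelism);
        try {
            return pool.invoke(new RangeSum(data, 0, data.length));
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        long[] data = new long[100_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i + 1;
        }
        System.out.println("sum=" + sum(data, 2));
    }
}
```

Computing one half in the current thread keeps the joining thread busy with useful work, so it spends less time waiting on the forked half.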

You are not doing DAG processing. You are forking many new Tasks and waiting (join()) on each. Have a read of the source code. It's horrendously complex, but you may be able to figure it out. The framework does not do proper Task Management. Where is it going to put the waiting Task when it does a join()? There is no suspended queue; that would require a monitor thread to constantly look at the queue to see what has finished. This is why the framework uses "continuation threads". When one Task does join(), the framework assumes it is waiting for a single lower Task to finish. When many join() calls are present, the thread cannot continue, so a helper or continuation thread needs to exist.

As noted above, you need a scatter-gather type fork-join process. There you can fork as many Tasks
