How to determine optimal number of threads for high latency network requests? How to determine optimal number of threads for high latency network requests? multithreading multithreading

How to determine optimal number of threads for high latency network requests?


It will shock you, but you do not need any threads for I/O (quantitatively, this means 0 threads). It is good that you have studied that multithreading does not multiply your network bandwidth. Now, it is time to know that threads do computation. They are not doing the (high-latency) communication. The communication is performed by a network adapter, which is another process, running really in parallel with with CPU. It is stupid to allocate a thread (see which resources allocated are listed by this gentlemen who claims that you need 1 thread) just to sleep until network adapter finishes its job. You need no threads for I/O = you need 0 threads.

It makes sense to allocate the threads for computation to make in parallel with I/O request(s). The amount of threads will depend on the computation-to-communication ratio and limited by the number of cores in your CPU.

Sorry, I had to say that despite you have certainly implied the commitment to blocking I/O, so many people do not understand this basic thing. Take the advise, use asynchronous I/O and you'll see that the issue does not exist.


As mentioned in one of the linked answers you refer to, Brian Goetz has covered this well in his article.

He seems to imply that in your situation you would be advised to gather metrics before committing to a thread count.

Tuning the pool size

Tuning the size of a thread pool is largely a matter of avoiding two mistakes: having too few threads or too many threads. ...

The optimum size of a thread pool depends on the number of processors available and the nature of the tasks on the work queue. ...

For tasks that may wait for I/O to complete -- for example, a task that reads an HTTP request from a socket -- you will want to increase the pool size beyond the number of available processors, because not all threads will be working at all times. Using profiling, you can estimate the ratio of waiting time (WT) to service time (ST) for a typical request. If we call this ratio WT/ST, for an N-processor system, you'll want to have approximately N*(1+WT/ST) threads to keep the processors fully utilized.

My emphasis.


Have you considered using Actors?

Best practises.

  • Actors should be like nice co-workers: do their job efficiently without bothering everyone else needlessly and avoid hogging resources. Translated to programming this means to process events and generate responses (or more requests) in an event-driven manner. Actors should not block (i.e. passively wait while occupying a Thread) on some external entity—which might be a lock, a network socket, etc.—unless it is unavoidable; in the latter case see below.

Sorry, I can't elaborate, because haven't much used this.

UPDATE

Answer in Good use case for Akka might be helpful.
Scala: Why are Actors lightweight?