What is the optimal number of threads for performing IO operations in Java?

Tags: multithreading


In practice, I/O-bound applications can still benefit substantially from multithreading because it can be much faster to read or write a few files in parallel than sequentially. This is particularly the case where overall throughput is compromised by network latency. But it's also the case that one thread can be processing the last thing that it read while another thread is busy reading, allowing higher CPU utilization.
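To make the "one thread processes while another reads" point concrete, here is a minimal sketch, not a tuned implementation. The class name `OverlapPipeline` and the newline-counting "work" are placeholders I made up for illustration; the only idea taken from the answer is that the read of the next file is started before the current one is processed.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical example class: prefetches file i+1 while file i is being processed.
final class OverlapPipeline {

    // Counts '\n' bytes across all files; the counting stands in for real CPU work.
    static long countNewlines(List<Path> files) {
        ExecutorService reader = Executors.newSingleThreadExecutor();
        try {
            long total = 0;
            Future<byte[]> pending =
                    files.isEmpty() ? null : reader.submit(readTask(files.get(0)));
            for (int i = 0; i < files.size(); i++) {
                byte[] current = pending.get();      // wait for the read started earlier
                if (i + 1 < files.size()) {
                    // Start the next read now so it runs during the loop below.
                    pending = reader.submit(readTask(files.get(i + 1)));
                }
                for (byte b : current) {             // CPU work on the calling thread
                    if (b == '\n') {
                        total++;
                    }
                }
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for a read", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("read failed", e.getCause());
        } finally {
            reader.shutdownNow();
        }
    }

    private static Callable<byte[]> readTask(Path file) {
        return () -> Files.readAllBytes(file);
    }
}
```

Whether this actually helps depends on how expensive the processing step is compared with the read; if processing is trivial, the second thread mostly sits idle.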

We can talk theory all day, but the right answer is to make the number of threads configurable. I think you'll find that increasing it past 1 will boost your speed, but there will also come a point of diminishing returns.
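A minimal sketch of what "configurable" could look like, assuming the thread count comes from a system property. The property name `io.threads`, the class name `ConfigurableIoPool`, and the choice of "read every file and sum the sizes" as the workload are all made up for the example.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical example class: the pool size is a runtime setting, not a constant.
final class ConfigurableIoPool {

    // Reads the assumed property -Dio.threads=N; falls back to 1 if unset or below 1.
    static int configuredThreads() {
        return Math.max(1, Integer.getInteger("io.threads", 1));
    }

    // Reads every file on a pool of `threads` workers and returns the total byte count.
    static long totalBytes(List<Path> files, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> sizes = new ArrayList<>();
            for (Path file : files) {
                Callable<Long> read = () -> (long) Files.readAllBytes(file).length;
                sizes.add(pool.submit(read));
            }
            long total = 0;
            for (Future<Long> size : sizes) {
                total += size.get(); // blocks until that file has been read
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while reading", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("read failed", e.getCause());
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        List<Path> files = new ArrayList<>();
        for (String arg : args) {
            files.add(Paths.get(arg));
        }
        int threads = configuredThreads();
        long start = System.nanoTime();
        long bytes = totalBytes(files, threads);
        long millis = (System.nanoTime() - start) / 1_000_000;
        System.out.printf("%d threads: %d bytes in %d ms%n", threads, bytes, millis);
    }
}
```

You would then run it a few times with different values, e.g. `java -Dio.threads=8 ConfigurableIoPool file1 file2 ...`, and look for the point where the time stops improving. Keep in mind that the OS file cache can make repeat runs over the same files look much faster than the first one.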


Yes, 20 threads can definitely write to disk faster than 4 threads on a 4 CPU machine. Many real programs are more I/O bound than CPU bound. However, it depends heavily on your disks and on how much CPU work your other threads do before they, too, end up waiting on those disks.

If all of your threads are solely writing to disk and doing nothing else, then it may well be that 1 thread on a 4 CPU machine is actually the fastest way to write to disk. It depends entirely on how many disks you have, how much data you're writing, and how good your OS is at I/O scheduling. Your specific question suggests you want 4 threads all writing to the same file. That doesn't make much sense, and in any practical scenario I can't think how that'd be faster. (You'd have to allocate the file ahead of time, then each thread would seek() to a different position, and you'd end up just thrashing the write head as each thread tried to write some blocks.)
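For reference, this is roughly what the "each thread writes its own region of one file" scheme would look like in Java. It is only a sketch of the mechanics, not a recommendation, and the class and method names (`RegionWriter`, `writeInRegions`) are invented for the example. It uses `FileChannel`'s positional `write(ByteBuffer, long)`, which takes an explicit offset instead of a shared `seek()`; writing past the current end grows the file, so the sketch does not preallocate it separately.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical example class: several threads write disjoint regions of one file.
final class RegionWriter {

    // Splits `data` into `threads` contiguous regions and writes each at its own offset.
    static void writeInRegions(Path target, byte[] data, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try (FileChannel channel = FileChannel.open(target,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            int regionSize = (data.length + threads - 1) / threads; // ceiling division
            List<Future<?>> pending = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                int start = Math.min(data.length, t * regionSize);
                int end = Math.min(data.length, start + regionSize);
                pending.add(pool.submit(() -> writeRegion(channel, data, start, end)));
            }
            for (Future<?> f : pending) {
                f.get(); // wait for every region before the channel is closed
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while writing", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("write failed", e.getCause());
        } finally {
            pool.shutdownNow();
        }
    }

    private static void writeRegion(FileChannel channel, byte[] data, int start, int end) {
        ByteBuffer buffer = ByteBuffer.wrap(data, start, end - start);
        long position = start;
        try {
            while (buffer.hasRemaining()) {
                // Positional write: does not touch the channel's shared position.
                position += channel.write(buffer, position);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Even though this works, all of those writes still go to the same device, so on a single spinning disk I would not expect it to beat one thread writing sequentially. Measure before believing either way.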

The advantage of multithreading is much clearer when you're network bound, for example when waiting on a database server, a web browser, or the like. There you're waiting on multiple external resources, so the waits can overlap.
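A small sketch of the network-bound case, with the remote side faked by `Thread.sleep` so it runs anywhere. `ParallelWaits`, `fakeRemoteCall` and the `":ok"` reply are placeholders, not a real client API. The point is only that N blocking calls on N threads wait at the same time: with 8 calls of about 100 ms each, 8 threads should finish in roughly one latency rather than eight.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical example class: overlaps the waiting time of several blocking calls.
final class ParallelWaits {

    // Stand-in for a database or HTTP call: sleeps to imitate latency, then replies.
    static String fakeRemoteCall(String name, long latencyMillis) throws InterruptedException {
        Thread.sleep(latencyMillis);
        return name + ":ok";
    }

    // Issues one fake call per name on a pool of `threads` workers; results keep input order.
    static List<String> callAll(List<String> names, long latencyMillis, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<String>> calls = new ArrayList<>();
            for (String name : names) {
                calls.add(() -> fakeRemoteCall(name, latencyMillis));
            }
            List<String> results = new ArrayList<>();
            // invokeAll blocks until every call has finished and returns futures in task order.
            for (Future<String> reply : pool.invokeAll(calls)) {
                results.add(reply.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for replies", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("call failed", e.getCause());
        } finally {
            pool.shutdownNow();
        }
    }
}
```

With `threads` set to 1 the same code degrades to sequential waiting, which makes it easy to compare the two.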


If you are using synchronous I/O, then you should have one thread for every simultaneous I/O request your machine can handle. In the case of a single-spindle hard disk, that's 1 (you can either read or write, but not both simultaneously). For a disk that can handle many I/O requests simultaneously, that would be however many requests it can handle at once.

In other words, this is not bounded by the CPU count, as I/O does not really hit the CPU beyond submitting requests and waiting. See here for a better explanation.
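To illustrate the "one thread per simultaneous request" rule: with blocking I/O, the size of the thread pool is the cap on how many requests are outstanding at once, regardless of how many CPUs there are. The sketch below fakes the blocking request with a short sleep and records the peak number in flight. `BoundedBlockingIo` and `maxConcurrent` are names I picked; the right value for a real device is something you would have to find out for your hardware.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical example class: pool size == maximum number of blocking requests in flight.
final class BoundedBlockingIo {

    // Runs `requests` fake blocking requests and returns the highest number seen in flight.
    static int peakInFlight(int requests, int maxConcurrent) {
        ExecutorService pool = Executors.newFixedThreadPool(maxConcurrent);
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (int i = 0; i < requests; i++) {
                pending.add(pool.submit(() -> {
                    int now = inFlight.incrementAndGet();
                    peak.accumulateAndGet(now, Math::max); // remember the high-water mark
                    blockingRequest();
                    inFlight.decrementAndGet();
                }));
            }
            for (Future<?> f : pending) {
                f.get();
            }
            return peak.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("request failed", e.getCause());
        } finally {
            pool.shutdownNow();
        }
    }

    // Stand-in for a synchronous read or write: the thread just waits.
    private static void blockingRequest() {
        try {
            Thread.sleep(5);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        int cpus = Runtime.getRuntime().availableProcessors();
        // The limit passed here is independent of the CPU count printed next to it.
        System.out.println("CPUs: " + cpus + ", peak in flight: " + peakInFlight(50, 16));
    }
}
```

The reported peak never exceeds `maxConcurrent`, and nothing in the code refers to the CPU count except the print statement.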

There's a whole other can of worms with how many I/O requests you should have in flight at any given time.