Java multithreaded file downloading performance
To answer my own questions:
- The increased CPU usage was due to a while() {} loop that was waiting for the threads to finish. As it turns out, awaitTermination is a much better alternative to wait for an Executor to finish :)
- (And 3 and 4) This seems to be the nature of the beast; in the end I achieved what I wanted to do by using careful synchronization of the different threads that each download a chunk of data (well, in particular the writes of these chunks back to disk).
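For anyone landing here later, this is a rough sketch of the two points above, not the exact code I ended up with. It assumes a fixed chunk size, and fetchChunk is a hypothetical stand-in for the real HTTP range request. Instead of spinning in an empty loop, the main thread calls shutdown() and then blocks in awaitTermination. For the disk writes it uses positional writes on a shared FileChannel, which is one way to keep the chunks from stepping on each other; explicit locking around a RandomAccessFile would be another.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

class ChunkedWriteDemo {

    // Hypothetical stand-in for downloading chunk `index`;
    // a real version would issue an HTTP Range request here.
    static byte[] fetchChunk(int index, int size) {
        byte[] data = new byte[size];
        Arrays.fill(data, (byte) ('A' + index));
        return data;
    }

    // Writes `chunks` chunks of `chunkSize` bytes to `target` from a thread pool
    // and returns the resulting file contents.
    static byte[] downloadInChunks(Path target, int chunks, int chunkSize)
            throws IOException, InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        List<Future<Void>> results = new ArrayList<>();
        try (FileChannel out = FileChannel.open(target,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            for (int i = 0; i < chunks; i++) {
                final int index = i;
                Callable<Void> task = () -> {
                    ByteBuffer buf = ByteBuffer.wrap(fetchChunk(index, chunkSize));
                    long pos = (long) index * chunkSize;
                    // Positional write: does not touch the channel's shared position,
                    // so each thread only writes to its own region of the file.
                    while (buf.hasRemaining()) {
                        pos += out.write(buf, pos);
                    }
                    return null;
                };
                results.add(pool.submit(task));
            }
            // No busy loop: stop accepting work, then block until the tasks are done.
            pool.shutdown();
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                pool.shutdownNow();
                throw new IOException("download tasks did not finish in time");
            }
            // Surface any exception thrown inside a task.
            for (Future<Void> f : results) {
                f.get();
            }
        }
        return Files.readAllBytes(target);
    }

    public static void main(String[] args) throws Exception {
        Path tmp = Files.createTempFile("chunks", ".bin");
        byte[] content = downloadInChunks(tmp, 4, 8);
        System.out.println("wrote " + content.length + " bytes");
        Files.delete(tmp);
    }
}
```

The one-minute timeout is arbitrary; pick whatever makes sense for the size of the download.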
Presumably the Apache HTTP client does some buffering of its own, with a smaller buffer. It needs a buffer to read the HTTP headers sensibly, and probably to handle chunked encoding as well.
My immediate thought for best performance on Windows would be to use I/O completion ports. What I don't know is (a) whether there are similar concepts in other OSes, and (b) whether there are any suitable Java wrappers. If portability isn't important to you, though, it may be possible to roll your own wrapper with JNI.
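On (b), one candidate worth checking before reaching for JNI is the asynchronous channel API in java.nio.channels (Java 7 and later). As far as I know, the JDK's Windows implementation of these channels is built on I/O completion ports, but treat that as an assumption to verify rather than a guarantee. A minimal sketch, with AsyncWriteDemo and its methods being names made up for illustration, that starts two positional writes without blocking and then waits for both:

```java
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousFileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.Future;

class AsyncWriteDemo {

    // Waits for a pending write and, if it was partial, writes the rest of the buffer.
    static void finish(AsynchronousFileChannel ch, ByteBuffer buf, long start,
                       Future<Integer> pending) throws Exception {
        long pos = start + pending.get();
        while (buf.hasRemaining()) {
            pos += ch.write(buf, pos).get();
        }
    }

    // Writes `first` at offset 0 and `second` right after it, issuing the
    // second write first to show that the order of submission does not matter.
    static String writeAndReadBack(Path target, String first, String second)
            throws Exception {
        ByteBuffer a = ByteBuffer.wrap(first.getBytes(StandardCharsets.US_ASCII));
        ByteBuffer b = ByteBuffer.wrap(second.getBytes(StandardCharsets.US_ASCII));
        long offsetB = a.remaining();
        try (AsynchronousFileChannel ch = AsynchronousFileChannel.open(target,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            // Both calls return immediately; the writes proceed in the background.
            Future<Integer> pendingB = ch.write(b, offsetB);
            Future<Integer> pendingA = ch.write(a, 0);
            finish(ch, a, 0, pendingA);
            finish(ch, b, offsetB, pendingB);
        }
        return new String(Files.readAllBytes(target), StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) throws Exception {
        Path tmp = Files.createTempFile("async", ".txt");
        System.out.println(writeAndReadBack(tmp, "hello ", "world"));
        Files.delete(tmp);
    }
}
```

The same package has AsynchronousSocketChannel for the network side, which also accepts a CompletionHandler callback instead of returning a Future.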