non-blocking IO vs async IO and implementation in Java non-blocking IO vs async IO and implementation in Java java java

non-blocking IO vs async IO and implementation in Java


I see this is an old question, but I think something was missed here, that @nickdu attempted to point out but wasn't quite clear.

There are four types of IO pertinent to this discussion:

Blocking IO

Non-Blocking IO

Asynchronous IO

Asynchronous Non-Blocking IO

The confusion arises I think because of ambiguous definitions. So let me attempt to clarify that.

First Let's talk about IO. When we have slow IO this is most apparent, but IO operations can either be blocking or non-blocking. This has nothing to do with threads, it has to do with the interface to the operating system. When I ask the OS for an IO operation I have the choice of waiting for all the data to be ready (blocking), or getting what is available right now and moving on (non-blocking). The default is blocking IO. It is much easier to write code using blocking IO as the path is much clearer. However, your code has to stop and wait for IO to complete. Non-Blocking IO requires interfacing with the IO libraries at a lower level, using select and read/write instead of the higher level libraries that provide convenient operations. Non-Blocking IO also implies that you have something you need to work on while the OS works on doing the IO. This might be multiple IO operations or computation on the IO that has completed.

Blocking IO - The application waits for the OS to gather all the bytes to complete the operation or reach the end before continuing. This is default. To be more clear for the very technical, the system call that initiates the IO will install a signal handler waiting for a processor interrupt that will occur when the IO operation makes progress. Then the system call will begin a sleep which suspends operation of the current process for a period of time, or until the process interrupt occurs.

Non-Blocking IO - The application tells the OS it only wants what bytes are available right now, and moves on while the OS concurrently gathers more bytes. The code uses select to determine what IO operations have bytes available. In this case the system call will again install a signal handler, but rather than sleep, it will associate the signal handler with the file handle, and immediately return. The process will become responsible for periodically checking the file handle for the interrupt flag having been set. This is usually done with a select call.

Now Asynchronous is where the confusion begins. The general concept of asynchronous only implies that the process continues while the background operation is performed, the mechanism by which this occurs is not specific. The term is ambiguous as both non-blocking IO and threaded blocking IO can be considered to be asynchronous. Both allow concurrent operations, however the resource requirements are different, and the code is substantially different. Because you have asked a question "What is Non-Blocking Asynchronous IO", I am going to use a stricter definition for asynchronous, a threaded system performing IO which may or may not be non-blocking.

The general definition

Asynchronous IO - Programmatic IO which allows multiple concurrent IO operations to occur. IO operations are happening simultaneously, so that code is not waiting for data that is not ready.

The stricter definition

Asynchronous IO - Programmatic IO which uses threading or multiprocessing to allow concurrent IO operations to occur.

Now with those clearer definitions we have the following four types of IO paradigms.

Blocking IO - Standard single threaded IO in which the application waits for all IO operations to complete before moving on. Easy to code, no concurrency and so slow for applications that require multiple IO operations. The process or thread will sleep while waiting for the IO interrupt to occur.

Asynchronous IO - Threaded IO in which the application uses threads of execution to perform Blocking IO operations concurrently. Requires thread safe code, but is generally easier to read and write than the alternative. Gains the overhead of multiple threads, but has clear execution paths. May require the use of synchronized methods and containers.

Non-Blocking IO - Single threaded IO in which the application uses select to determine which IO operations are ready to advance, allowing the execution of other code or other IO operations while the OS processes concurrent IO. The process does not sleep while waiting for the IO interrupt, but takes on the responsibility to check for the IO flag on the filehandle. Much more complicated code due to the need to check the IO flag with select, though does not require thread-safe code or synchronized methods and containers. Low execution over-head at the expense of code complexity. Execution paths are convoluted.

Asynchronous Non-Blocking IO - A hybrid approach to IO aimed at reducing complexity by using threads, while maintaining scalability by using non-blocking IO operations where possible. This would be the most complex type of IO requiring synchronized methods and containers, as well as convoluted execution paths. This is not the type of IO that one should consider coding lightly, and is most often only used when using a library that will mask the complexity, something like Futures and Promises.


So what is actually "non-blocking async IO"?

To answer that, you must first understand that there's no such thing as blocking async I/O. The very concept of asynchronism dictates that there's no waiting, no blocking, no delay. When you see non-blocking asynchronous I/O, the non-blocking bit only serves to further qualify the async adjective in that term. So effectively, non-blocking async I/O might be a bit of a redundancy.

There are mainly two kinds of I/O. Synchronous and Asynchronous. Synchronous blocks the current thread of execution until processing is complete, while Asynchronous doesn't block the current thread of execution, rather passing control to the OS Kernel for further processing. The kernel then advises the async thread when the submitted task is complete


Asynchronous Channel Groups

The concept of Async Channels in java is backed by Asynchronous Channel Groups. An async channel group basically pools a number of channels for reuse. Consumers of the async api retrieve a channel from the group (the JVM creates one by default) and the channel automatically puts itself back into the group after it's completed its read/write operation. Ultimately, Async Channel Groups are backed by surprise, threadpools. Also, Asynchronous channels are threadsafe.

The size of the threadpool that backs an async channel group is configured by the following JVM property

java.nio.channels.DefaultThreadPool.initialSize

which, given an integer value will setup a threadpool of that size, to back the channel group. The channel group is created and maintained transparently to the developer otherwise.


And how all them can be implemented in Java

Well, I'm glad you asked. Here's an example of an AsynchronousSocketChannel (used to open a non-blocking client Socket to a listening server.) This sample is an excerpt from Apress Pro Java NIO.2, commented by me:

//Create an Asynchronous channel. No connection has actually been established yetAsynchronousSocketChannel asynchronousSocketChannel = AsynchronousSocketChannel.open(); /**Connect to an actual server on the given port and address.    The operation returns a type of Future, the basis of the all    asynchronous operations in java. In this case, a Void is    returned because nothing is returned after a successful socket connection  */Void connect = asynchronousSocketChannel.connect(new InetSocketAddress("127.0.0.1", 5000)).get();//Allocate data structures to use to communicate over the wireByteBuffer helloBuffer = ByteBuffer.wrap("Hello !".getBytes()); //Send the messageFuture<Integer> successfullyWritten=  asynchronousSocketChannel.write(helloBuffer);//Do some stuff here. The point here is that asynchronousSocketChannel.write() //returns almost immediately, not waiting to actually finish writing //the hello to the channel before returning control to the currently executing threaddoSomethingElse();//now you can come back and check if it was all written (or not)System.out.println("Bytes written "+successfullyWritten.get());

EDIT: I should mention that support for Async NIO came in JDK 1.7


I would say there are three types of io:

synchronous blocking
synchronous non-blocking
asynchronous

Both synchronous non-blocking and asynchronous would be considered non-blocking as the calling thread is not waiting on the IO to complete. So while non-blocking asynchronous io might be redundant, they are not one in the same. When I open a file I can open it in non-blocking mode. What does this mean? It means when I issue a read() it won't block. It will either return me the bytes that are available or indicate that there are no bytes available. If I didn't enable non-blocking io the read() would block until data was available. I might want to enable non-blocking io if I want a thread to handle multiple io requests. For instance, I could use select() to find out what file descriptors, or maybe sockets, have data available to read. I then do synchronous reads on those file descriptors. None of those reads should block because I already know data is available, plus I have opened the file descriptors in non-blocking mode.

Asynchronous io is where you issue an io request. That request is queued, and thus doesn't block the issuing thread. You are notified when either the request failed or has completed successfully.