istream and ostream with shared streambuf mutually thread-safe for duplex I/O?

c++ multithreading sockets stream iostream

std::streambuf (or std::basic_streambuf<...>) gets no special thread-safety guarantee beyond what the standard library gives in general. That is, multiple threads may read an object's state at the same time, but while one thread modifies the object's state, no other thread may access the object. Both reading and writing characters modify the stream buffer's state, so from a formal point of view you can't use them concurrently without external synchronization.
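A minimal sketch of what such external synchronization could look like, assuming a std::stringbuf as a stand-in for a socket-backed stream buffer. GuardedBuf, locked_write(), locked_read() and duplex_demo() are names made up for this illustration; the only point is that every access, reading or writing, goes through the same mutex.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <mutex>
#include <ostream>
#include <sstream>
#include <string>
#include <thread>

// Hypothetical helper: one mutex guards every access to the shared buffer.
struct GuardedBuf {
    std::stringbuf buf;  // stand-in for a socket-backed stream buffer
    std::mutex     mtx;
};

// Writing side: append one complete line while holding the lock.
void locked_write(GuardedBuf& g, const std::string& line) {
    std::lock_guard<std::mutex> lock(g.mtx);
    std::ostream out(&g.buf);
    out << line << '\n';
}

// Reading side: fetch one line while holding the lock.
// Returns false when nothing is available yet.
bool locked_read(GuardedBuf& g, std::string& line) {
    std::lock_guard<std::mutex> lock(g.mtx);
    std::istream in(&g.buf);  // fresh stream, so no sticky eof state
    return static_cast<bool>(std::getline(in, line));
}

// One writer thread and one reader thread sharing the same buffer.
std::size_t duplex_demo(int lines) {
    GuardedBuf g;
    std::size_t received = 0;
    std::thread writer([&] {
        for (int i = 0; i < lines; ++i)
            locked_write(g, "msg " + std::to_string(i));
    });
    std::thread reader([&] {
        std::string line;
        while (received < static_cast<std::size_t>(lines)) {
            if (locked_read(g, line)) ++received;
            else std::this_thread::yield();
        }
    });
    writer.join();
    reader.join();
    return received;
}
```

This serializes reads against writes completely, which is the formally safe option but gives up the concurrency that duplex I/O is after.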

Internally the two buffers are entirely separate and have nothing to do with each other. The operations on stream buffers modify them in a rather structured way and I can't imagine that any implementation would have an explicit interaction between the two sets of pointers. That is, in practical terms I don't think there is any synchronization necessary between reading and writing. However, I hadn't realized before that the two sets of buffer pointers may actually share the same cache lines which may at least cause performance problems. I don't think this should cause any correctness problems.

The only resource possibly shared between the two buffers is the std::locale object, which is meant to be stateless anyway. Also, std::streambuf doesn't use this object itself: it is your stream buffer which may use some of its facets (e.g. the std::codecvt<...> facet). As the locale is changed via pubimbue(), which calls the virtual function imbue(), you can intercept this change and do whatever synchronization is needed if your stream buffer uses the locale.
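As a rough sketch of intercepting the locale change: the class below overrides imbue() and keeps its own mutex-guarded copy of the locale. LocaleGuardedBuf, current_locale() and imbue_calls() are hypothetical names for this example, and a real stream buffer would fetch its codecvt facet from the guarded copy.

```cpp
#include <cassert>
#include <locale>
#include <mutex>
#include <streambuf>

// Hypothetical stream buffer that keeps a guarded copy of its locale.
class LocaleGuardedBuf : public std::streambuf {
public:
    // Snapshot for the reading or writing code to take before using a facet.
    std::locale current_locale() const {
        std::lock_guard<std::mutex> lock(locale_mtx_);
        return cached_;
    }

    // How often the locale has been changed (for illustration only).
    int imbue_calls() const {
        std::lock_guard<std::mutex> lock(locale_mtx_);
        return imbue_calls_;
    }

protected:
    // pubimbue() calls this hook with the new locale.
    void imbue(const std::locale& loc) override {
        std::lock_guard<std::mutex> lock(locale_mtx_);
        cached_ = loc;
        ++imbue_calls_;
    }

private:
    mutable std::mutex locale_mtx_;
    std::locale        cached_;  // starts out as the global locale
    int                imbue_calls_ = 0;
};
```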

In summary, the standard doesn't guarantee that concurrent threads can read and write using the same stream buffer. In practice, the DS9k (the hypothetical DeathStation 9000) is probably the only system where it would fail, although the two threads may end up effectively synchronized because the buffer pointers sit in shared cache lines.


The input and output sequences are essentially independent. There's a nice diagram at cppreference.com:

[Image: diagram of streambuf members]

The only thing shared between the input and output sequences is the locale object which contains the codecvt facet used to perform text encoding translation.

In theory, changing the text encoding midstream would be thread-unsafe, but in practice the libraries don't support that operation at all anyway!

You should be good to go.
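A small single-threaded sketch of that independence, assuming a std::stringbuf as the shared buffer (shared_buffer_demo() is just a name for this example). The write in the middle does not disturb the read position:

```cpp
#include <cassert>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>

// One stream buffer, two streams: the get and put positions move separately.
std::pair<std::string, std::string> shared_buffer_demo() {
    std::stringbuf buf;
    std::istream in(&buf);
    std::ostream out(&buf);

    out << "hello world";  // advances only the put position
    std::string first;
    in >> first;           // advances only the get position

    out << '!';            // appended after "world"
    std::string second;
    in >> second;          // continues where the first read stopped

    return {first, second};
}
```

This says nothing about thread safety on its own; it only shows that the two sequences keep separate positions.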


For full-duplex you need two buffers. If you use streambuf interfaces for both, so that you can hook them up to the usual ostream and istream interface, then the complete picture looks something like this:

The two buffers are obviously completely independent and symmetrical, so we can ignore one side and just concentrate on a single buffer.

Moreover, it is safe to assume that there are only two threads: the reading thread and the writing thread. If more threads were involved, two threads would be reading at the same time or writing at the same time, which would lead to undesirable race conditions and therefore makes no sense. We can assume that the user has some mechanism in place that ensures only one thread at a time writes to a streambuf, and likewise only one thread at a time reads from it.

In the most general case the actual buffer consists of multiple contiguous memory blocks. Each put and get area lies entirely inside one such block. As long as they are in different memory blocks, they are, again, unrelated.

Each get/put area consists of three pointers: one that points to the start of the area (eback/pbase), one that points one byte past the end of the area (egptr/epptr), and one that points to the current position in the area (gptr/pptr). Each of those pointers can be accessed directly by a class derived from std::streambuf through protected accessors of the same name (eback(), pbase(), egptr(), epptr(), gptr() and pptr()). Note that here we mean the eback(), egptr() and gptr() of one streambuf and the pbase(), epptr() and pptr() of the other streambuf (see the image above).
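To make the six pointers concrete, here is a small sketch of a derived class that sets up one get area and one put area over two fixed arrays and reports distances between the pointers. TwoAreaBuf and its member functions are hypothetical names, and for brevity both areas live in a single object here.

```cpp
#include <cassert>
#include <cstddef>
#include <streambuf>

// Hypothetical buffer with one fixed get block and one fixed put block.
class TwoAreaBuf : public std::streambuf {
public:
    TwoAreaBuf() {
        setg(in_, in_, in_ + sizeof in_);  // eback, gptr, egptr
        setp(out_, out_ + sizeof out_);    // pbase (= pptr), epptr
    }

    std::ptrdiff_t readable() const { return egptr() - gptr(); }
    std::ptrdiff_t consumed() const { return gptr() - eback(); }
    std::ptrdiff_t writable() const { return epptr() - pptr(); }
    std::ptrdiff_t written()  const { return pptr() - pbase(); }

private:
    char in_[8]  = {'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h'};
    char out_[8] = {};
};
```

Reading one character with sbumpc() moves only gptr, and writing one with sputc() moves only pptr.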

std::streambuf has public functions that access or change these six pointers. They are:


<table style="width:100%"><caption>Public member functions of <code>std::streambuf</code></caption>
<tr><th>Method</th><th>Changes and/or accesses</th></tr>
<tr><td><code>pubsetbuf()</code></td><td>Calls <code>setbuf()</code> of the most derived class</td></tr>
<tr><td><code>pubseekoff()</code></td><td>Calls <code>seekoff()</code> of the most derived class</td></tr>
<tr><td><code>pubseekpos()</code></td><td>Calls <code>seekpos()</code> of the most derived class</td></tr>
<tr><td><code>pubsync()</code></td><td>Calls <code>sync()</code> of the most derived class</td></tr>
<tr><td><code>in_avail()</code></td><td>Get area</td></tr>
<tr><td><code>snextc()</code></td><td>Calls <code>sbumpc()</code>, <code>uflow()</code> and/or <code>sgetc()</code></td></tr>
<tr><td><code>sbumpc()</code></td><td><code>gptr</code>, possibly calls <code>uflow()</code></td></tr>
<tr><td><code>sgetc()</code></td><td><code>gptr</code>, possibly calls <code>underflow()</code></td></tr>
<tr><td><code>sgetn()</code></td><td>Calls <code>xsgetn()</code> of the most derived class</td></tr>
<tr><td><code>sputc()</code></td><td><code>pptr</code>, possibly calls <code>overflow()</code></td></tr>
<tr><td><code>sputn()</code></td><td>Calls <code>xsputn()</code> of the most derived class</td></tr>
<tr><td><code>sputbackc()</code></td><td><code>gptr</code>, possibly calls <code>pbackfail()</code></td></tr>
<tr><td><code>sungetc()</code></td><td><code>gptr</code>, possibly calls <code>pbackfail()</code></td></tr>
</table>

The protected member functions are


<table style="width:100%"><caption>Protected member functions of <code>std::streambuf</code></caption>
<tr><th>Method</th><th>Changes and/or accesses</th></tr>
<tr><td><code>setbuf()</code></td><td>User defined (could be used for single array buffers)</td></tr>
<tr><td><code>seekoff()</code></td><td>User defined (repositions get area)</td></tr>
<tr><td><code>seekpos()</code></td><td>User defined (repositions get area)</td></tr>
<tr><td><code>sync()</code></td><td>User defined (could do anything, depending on which buffer this is; could change either get area or put area)</td></tr>
<tr><td><code>showmanyc()</code></td><td>User defined (get area; if put area uses the same allocated memory block, can also access <code>pptr</code>)</td></tr>
<tr><td><code>underflow()</code></td><td>User defined (get area; but also strongly coupled to put area)</td></tr>
<tr><td><code>uflow()</code></td><td>Calls <code>underflow()</code> and advances <code>gptr</code></td></tr>
<tr><td><code>xsgetn()</code></td><td>Get area (as if calling <code>sbumpc()</code> repeatedly), might call <code>uflow()</code></td></tr>
<tr><td><code>gbump()</code></td><td><code>gptr</code></td></tr>
<tr><td><code>setg()</code></td><td>Get area</td></tr>
<tr><td><code>xsputn()</code></td><td>Put area (as if calling <code>sputc()</code> repeatedly), might call <code>overflow()</code> or do something similar</td></tr>
<tr><td><code>overflow()</code></td><td>Put area</td></tr>
<tr><td><code>pbump()</code></td><td><code>pptr</code></td></tr>
<tr><td><code>setp()</code></td><td>Put area</td></tr>
<tr><td><code>pbackfail()</code></td><td>User defined (might be pure horror; aka, get and put area)</td></tr>
</table>
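For illustration, here is a sketch of a derived class that overrides the three hooks that matter most for duplex use: overflow() and sync() on the put side, underflow() on the get side. DuplexBuf is a made-up name, and two std::string members stand in for the data a real implementation would send to and receive from a socket.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <istream>
#include <ostream>
#include <streambuf>
#include <string>
#include <utility>

// Hypothetical duplex buffer: strings stand in for the socket in each direction.
class DuplexBuf : public std::streambuf {
public:
    explicit DuplexBuf(std::string incoming) : incoming_(std::move(incoming)) {
        setp(out_, out_ + sizeof out_);  // get area stays empty until underflow()
    }
    const std::string& sent() const { return sent_; }

protected:
    // Put side: called when the put area is full.
    int_type overflow(int_type ch) override {
        flush_put_area();
        if (traits_type::eq_int_type(ch, traits_type::eof()))
            return traits_type::not_eof(ch);
        *pptr() = traits_type::to_char_type(ch);
        pbump(1);
        return ch;
    }

    // Put side: called by pubsync(), e.g. through std::flush.
    int sync() override {
        flush_put_area();
        return 0;
    }

    // Get side: called when the get area is exhausted.
    int_type underflow() override {
        if (gptr() < egptr()) return traits_type::to_int_type(*gptr());
        std::size_t n = std::min(sizeof in_, incoming_.size() - pos_);
        if (n == 0) return traits_type::eof();
        std::memcpy(in_, incoming_.data() + pos_, n);
        pos_ += n;
        setg(in_, in_, in_ + n);
        return traits_type::to_int_type(*gptr());
    }

private:
    // "Send" the pending output and reset the put area.
    void flush_put_area() {
        sent_.append(pbase(), static_cast<std::size_t>(pptr() - pbase()));
        setp(out_, out_ + sizeof out_);
    }

    std::string incoming_;  // data "arriving" from the peer
    std::size_t pos_ = 0;   // how much of incoming_ has been consumed
    std::string sent_;      // data "sent" to the peer
    char in_[4];            // get block, deliberately tiny
    char out_[4];           // put block, deliberately tiny
};
```

Note that the put-side overrides touch only pbase/pptr/epptr and the get-side override touches only eback/gptr/egptr, which is the separation the tables describe.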

We should separate reading and writing actions into actions per (contiguous) memory block. Of course it is possible that a single call to, say, sputn() writes to multiple blocks, but we can lock and unlock per block action.

There are several significant states of a buffer, depicted in the picture below. Green arrows represent transitions between states done by the thread(s) that read data from the get area, while blue arrows represent transitions between states done by thread(s) that write data to the put area. In other words, two green actions cannot occur at the same time, nor can two blue actions. But a green and a blue action might happen at the same time.

I still have to write an implementation for this, but my approach will be to use a single mutex per buffer and only lock it at the beginning of every action in order to get the necessary information to perform a read and/or write action. Then at the end of that action, lock the mutex again to see if something was changed by the other thread and/or to finish the read/write with an administrative action.

Every time the write thread bumps pptr, egptr is updated atomically, unless eback != pbase at the beginning of the write action, in which case egptr doesn't need updating. This requires locking a mutex before the bump and unlocking it only after egptr has also been updated. The same mutex is therefore locked when moving the get or put areas. We might skip locking the mutex when bumping gptr itself: if we bump it, there was data in the buffer at the beginning of the corresponding read action, and a concurrent write action wouldn't change that, so there is no danger that the write thread(s) would try to move the get area at the same time.
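Until then, here is a much simplified sketch of the locking pattern, not the implementation described above. It assumes a single fixed memory block and, instead of sharing pptr and egptr directly, keeps a separate published_ index that the writer updates under the mutex and the reader picks up in underflow(). PublishingBuf and all of its members are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <mutex>
#include <streambuf>

// Simplified single-block variant: the writer publishes, the reader picks up.
class PublishingBuf : public std::streambuf {
public:
    PublishingBuf() { setg(block_, block_, block_); }  // nothing readable yet

protected:
    // Writer thread only.
    std::streamsize xsputn(const char* s, std::streamsize n) override {
        if (n <= 0) return 0;
        std::size_t start;
        {   // Short lock at the start of the action: find where to write.
            std::lock_guard<std::mutex> lock(mtx_);
            start = published_;
        }
        std::size_t room  = sizeof block_ - start;
        std::size_t count = std::min(static_cast<std::size_t>(n), room);
        std::memcpy(block_ + start, s, count);  // bulk copy without the lock
        {   // Short lock at the end: make the new data visible to the reader.
            std::lock_guard<std::mutex> lock(mtx_);
            published_ = start + count;
        }
        return static_cast<std::streamsize>(count);
    }

    // Writer thread only: the put area is left empty, so sputc() lands here.
    int_type overflow(int_type ch) override {
        if (traits_type::eq_int_type(ch, traits_type::eof()))
            return traits_type::not_eof(ch);
        char c = traits_type::to_char_type(ch);
        return xsputn(&c, 1) == 1 ? ch : traits_type::eof();
    }

    // Reader thread only: extend the get area up to the published end.
    int_type underflow() override {
        std::size_t end;
        {
            std::lock_guard<std::mutex> lock(mtx_);
            end = published_;
        }
        setg(block_, gptr(), block_ + end);
        return gptr() < egptr() ? traits_type::to_int_type(*gptr())
                                : traits_type::eof();
    }

private:
    char        block_[64];      // the single memory block
    std::mutex  mtx_;            // the one mutex for this buffer
    std::size_t published_ = 0;  // end of the data visible to the reader
};
```

The mutex is held only for the bookkeeping at the beginning and end of each action, never during the copy. Wrapping around, multiple blocks and moving the areas are left out entirely.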

I'll edit this answer when I figure out more details.
