How can memory_order_relaxed work for incrementing atomic reference counts in smart pointers?


The Boost.Atomic library, which emulates std::atomic, provides a similar reference-counting example along with an explanation that may help your understanding:

Increasing the reference counter can always be done with memory_order_relaxed: New references to an object can only be formed from an existing reference, and passing an existing reference from one thread to another must already provide any required synchronization.

It is important to enforce any possible access to the object in one thread (through an existing reference) to happen before deleting the object in a different thread. This is achieved by a "release" operation after dropping a reference (any access to the object through this reference must obviously have happened before), and an "acquire" operation before deleting the object.

It would be possible to use memory_order_acq_rel for the fetch_sub operation, but this results in unneeded "acquire" operations when the reference counter does not yet reach zero and may impose a performance penalty.
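To make the scheme described above concrete, here is a minimal sketch written against std::atomic. It is not Boost's code: `Counted`, `add_ref`, `release_ref`, `demo_refcount` and the `destroyed` hook are hypothetical names introduced only for illustration. The increment is relaxed, the decrement is a release, and the acquire fence runs only in the thread that sees the count reach zero.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical reference-counted object; all names here are illustrative.
struct Counted {
    std::atomic<int> refs{1};      // the creator holds the first reference
    std::atomic<int>* destroyed;   // hypothetical hook so the demo can observe deletion
    explicit Counted(std::atomic<int>* d) : destroyed(d) {}
    ~Counted() { destroyed->fetch_add(1, std::memory_order_relaxed); }
};

// Taking a new reference: the caller already holds one, so relaxed is enough.
inline void add_ref(Counted* p) {
    p->refs.fetch_add(1, std::memory_order_relaxed);
}

// Dropping a reference: the release publishes this thread's earlier accesses
// to the object; the acquire fence is paid only by the thread that deletes.
inline void release_ref(Counted* p) {
    if (p->refs.fetch_sub(1, std::memory_order_release) == 1) {
        std::atomic_thread_fence(std::memory_order_acquire);
        delete p;
    }
}

// Hands one reference to each of n threads, then drops the creator's own.
// Returns how many times the destructor ran.
inline int demo_refcount(int n) {
    std::atomic<int> destroyed{0};
    Counted* obj = new Counted(&destroyed);
    std::vector<std::thread> workers;
    for (int i = 0; i < n; ++i) {
        add_ref(obj);  // formed from a reference we still hold
        workers.emplace_back([obj] { release_ref(obj); });
    }
    assert(destroyed.load() == 0);  // still alive: the creator's reference is held
    release_ref(obj);               // drop the creator's reference
    for (auto& t : workers) t.join();
    return destroyed.load();
}
```

Calling `demo_refcount(4)` should report exactly one destruction, whichever thread happens to drop the last reference. Note that the new reference is handed to each worker through the std::thread constructor, which already provides the synchronization that the relaxed increment relies on.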


From the C++ reference on std::memory_order:

memory_order_relaxed: Relaxed operation: there are no synchronization or ordering constraints imposed on other reads or writes, only this operation's atomicity is guaranteed

There is also an example below on that page.

So basically, std::atomic::fetch_add() is still atomic even with std::memory_order_relaxed, so two concurrent calls to refs.fetch_add(1, std::memory_order_relaxed) from two different threads will always increase refs by 2 in total. The memory order does not affect atomicity; it controls how other operations (non-atomic accesses and relaxed atomic operations) may be reordered around the atomic operation it is attached to.
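As a small illustration of that point, the following sketch has two threads increment a shared counter with relaxed ordering only. The function name `count_with_relaxed` and its parameter are assumptions made up for this example.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Two threads each perform `per_thread` relaxed increments on one counter.
// Returns the final value of the counter.
inline int count_with_relaxed(int per_thread) {
    assert(per_thread >= 0);
    std::atomic<int> refs{0};
    auto work = [&refs, per_thread] {
        for (int i = 0; i < per_thread; ++i)
            refs.fetch_add(1, std::memory_order_relaxed);  // atomic, but unordered
    };
    std::thread a(work), b(work);
    a.join();
    b.join();
    // join() synchronizes with the end of each thread, so a relaxed load
    // is sufficient to read the final value here.
    return refs.load(std::memory_order_relaxed);
}
```

Because every increment is an atomic read-modify-write, no update can be lost, so `count_with_relaxed(100000)` should return 200000 on every run.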


As this is rather confusing (at least to me) I'm going to partially address one point:

(...) then it may happen that both threads see the value of refs to be N and both write N+1 back to it (...)

According to @AnthonyWilliams in this answer, the sentence above seems to be wrong, because:

The only way to guarantee you have the "latest" value is to use a read-modify-write operation such as exchange(), compare_exchange_strong() or fetch_add(). Read-modify-write operations have an additional constraint that they always operate on the "latest" value, so a sequence of ai.fetch_add(1) operations by a series of threads will return a sequence of values with no duplicates or gaps. In the absence of additional constraints, there's still no guarantee which threads will see which values though.

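So, given that argument from authority, I'd say it is impossible for both threads to read N and both write N+1: fetch_add() is a read-modify-write operation, so one thread takes the counter from N to N+1 and the other takes it from N+1 to N+2.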
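The "no duplicates or gaps" property can also be checked empirically. The sketch below records every value returned by a relaxed fetch_add across several threads and verifies that the collected values are exactly 0, 1, 2, and so on. The helper name `tickets_are_unique` and its parameters are hypothetical, and a passing run is only consistent with the guarantee; it does not prove it.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Each thread calls fetch_add `per_thread` times and records what it got back.
// Returns true if the returned values are exactly 0 .. threads*per_thread - 1.
inline bool tickets_are_unique(int threads, int per_thread) {
    assert(threads > 0 && per_thread >= 0);
    std::atomic<int> refs{0};
    std::vector<std::vector<int>> seen(threads);  // one private log per thread
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&refs, &seen, t, per_thread] {
            seen[t].reserve(per_thread);
            for (int i = 0; i < per_thread; ++i)
                seen[t].push_back(refs.fetch_add(1, std::memory_order_relaxed));
        });
    }
    for (auto& th : pool) th.join();

    // Merge the per-thread logs and look for duplicates or gaps.
    std::vector<int> all;
    for (const auto& v : seen) all.insert(all.end(), v.begin(), v.end());
    std::sort(all.begin(), all.end());
    for (int i = 0; i < static_cast<int>(all.size()); ++i)
        if (all[i] != i) return false;
    return static_cast<int>(all.size()) == threads * per_thread;
}
```

Which thread receives which value still varies from run to run, as the quoted answer notes; only the overall set of returned values is fixed.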