Which is more efficient, basic mutex lock or atomic integer?


Atomic operations rely on processor support (for example, compare-and-swap instructions) and don't use locks at all, whereas locks depend more on the OS and perform differently on, for example, Windows and Linux.

Locks actually suspend thread execution, which frees up CPU resources for other tasks but incurs context-switching overhead when the thread is stopped and restarted. By contrast, threads attempting atomic operations don't wait: they keep trying until they succeed (so-called busy-waiting), so they incur no context-switching overhead, but they don't free up CPU resources either.
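To make the "keep trying until success" behaviour concrete, here is a minimal sketch of a compare-and-swap retry loop. The helper name `increment_with_cas` is made up for illustration; for a plain counter, `fetch_add` would normally do the same job in a single call.

```cpp
#include <atomic>

// Hypothetical helper: increments `counter` with an explicit CAS retry loop.
// Returns the value the counter held just before our increment.
int increment_with_cas(std::atomic<int>& counter)
{
    int expected = counter.load(std::memory_order_relaxed);
    // compare_exchange_weak writes expected + 1 only if the counter still
    // equals `expected`. On failure it reloads `expected` with the current
    // value, so the loop retries without the thread ever being suspended.
    while (!counter.compare_exchange_weak(expected, expected + 1,
                                          std::memory_order_relaxed)) {
        // Another thread got there first (or the CAS failed spuriously).
    }
    return expected;
}
```

The thread never sleeps here: under heavy contention it just burns CPU cycles retrying, which is the trade-off described above.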

To sum up, atomic operations are generally faster if contention between threads is sufficiently low. You should definitely benchmark, as there is no other reliable way to know which overhead is lower in your case: context switching or busy-waiting.
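A rough sketch of such a benchmark might look like the following. The names `BenchResult`, `run_threads`, `bench_mutex` and `bench_atomic` are hypothetical, and the timing includes thread start-up, so treat the numbers as a coarse comparison only.

```cpp
#include <atomic>
#include <chrono>
#include <mutex>
#include <thread>
#include <vector>

struct BenchResult {
    long long final_count;             // counter value after all threads finish
    std::chrono::nanoseconds elapsed;  // wall-clock time, including thread start-up
};

// Runs a copy of `work` on each of `num_threads` threads and times the whole run.
template <typename Work>
std::chrono::nanoseconds run_threads(int num_threads, Work work)
{
    auto start = std::chrono::steady_clock::now();
    std::vector<std::thread> threads;
    for (int t = 0; t < num_threads; ++t) threads.emplace_back(work);
    for (auto& th : threads) th.join();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(
        std::chrono::steady_clock::now() - start);
}

// Plain counter protected by a std::mutex.
BenchResult bench_mutex(int num_threads, int iters)
{
    long long counter = 0;
    std::mutex m;
    auto elapsed = run_threads(num_threads, [&] {
        for (int i = 0; i < iters; ++i) {
            std::lock_guard<std::mutex> guard(m);
            ++counter;
        }
    });
    return {counter, elapsed};
}

// Atomic counter incremented with fetch_add.
BenchResult bench_atomic(int num_threads, int iters)
{
    std::atomic<long long> counter{0};
    auto elapsed = run_threads(num_threads, [&] {
        for (int i = 0; i < iters; ++i)
            counter.fetch_add(1, std::memory_order_relaxed);
    });
    return {counter.load(), elapsed};
}
```

Calling both functions with the same thread count and iteration count, then comparing `elapsed`, gives a first impression. Vary the number of threads to see how contention shifts the result on your hardware and OS.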


If you have a counter for which atomic operations are supported, it will be more efficient than a mutex.

Technically, the atomic will lock the memory bus on most platforms (on modern CPUs this is usually limited to locking the affected cache line). However, there are two mitigating details:

  • It is impossible to suspend a thread during the memory bus lock, but it is possible to suspend a thread during a mutex lock. This is what lets you get a lock-free guarantee (which doesn't say anything about not locking - it just guarantees that at least one thread makes progress).
  • Mutexes eventually end up being implemented with atomics. Since you need at least one atomic operation to lock a mutex and one atomic operation to unlock it, a mutex lock takes at least twice as long, even in the best of cases.
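Whether atomic operations are natively supported for a given type can be checked at run time with `is_lock_free()`. The wrapper name `atomic_is_lock_free` below is a made-up convenience; the result depends on the platform and compiler.

```cpp
#include <atomic>

// Hypothetical helper: reports whether std::atomic<T> is implemented without
// an internal lock on the platform this code is compiled for.
template <typename T>
bool atomic_is_lock_free()
{
    std::atomic<T> probe{};
    return probe.is_lock_free();
}
```

If this returns false for your counter type, the library falls back to an internal lock, and the performance argument for the atomic no longer holds.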


A minimal (standards compliant) mutex implementation requires 2 basic ingredients:

  • A way to atomically convey a state change between threads (the 'locked' state)
  • Memory barriers that keep the memory operations protected by the mutex inside the protected region.

There is no way you can make it any simpler than this because of the 'synchronizes-with' relationship the C++ standard requires.

A minimal (correct) implementation might look like this:

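    #include <atomic>

    class mutex {
        std::atomic<bool> flag{false};

    public:
        void lock()
        {
            // Spin until this thread is the one that flips the flag from false to true.
            while (flag.exchange(true, std::memory_order_relaxed));
            // Keep the protected operations from being reordered before the acquisition.
            std::atomic_thread_fence(std::memory_order_acquire);
        }

        void unlock()
        {
            // Keep the protected operations from being reordered after the release.
            std::atomic_thread_fence(std::memory_order_release);
            flag.store(false, std::memory_order_relaxed);
        }
    };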

Due to its simplicity (it cannot suspend the thread of execution), this implementation is likely to outperform a std::mutex under low contention. But even then, it is easy to see that each integer increment protected by this mutex requires the following operations:

  • an atomic read-modify-write (the exchange) to acquire the mutex (possibly multiple times)
  • an integer increment
  • an atomic store to release the mutex

If you compare that with a standalone std::atomic<int> that is incremented with a single (unconditional) read-modify-write (e.g. fetch_add), it is reasonable to expect that the atomic operation (using the same ordering model) will outperform the mutex-protected increment.