
How Compare and Swap works


The "general pseudo code" is not the actual code of a CAS (compare-and-swap) implementation. CAS is carried out by special hardware instructions that the CPU executes atomically. For example, on x86 the LOCK CMPXCHG instruction can be used (http://en.wikipedia.org/wiki/Compare-and-swap).

In gcc, for example, there is the __sync_val_compare_and_swap() builtin, which implements a hardware-specific atomic CAS. The operation is described in Paul E. McKenney's excellent recent book (Is Parallel Programming Hard, And, If So, What Can You Do About It?, 2014), section 4.3 "Atomic operations", pages 31-32.
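As a rough illustration of how the builtin is typically used, here is a minimal sketch of a CAS retry loop. The helper name cas_increment is made up for this example; only __sync_val_compare_and_swap() itself comes from gcc.

```c
#include <stdio.h>

/* Hypothetical helper (not part of gcc): atomically increments *counter
   with a CAS retry loop and returns the new value. */
static int cas_increment(int *counter)
{
    int old, seen;
    do {
        old = *counter;   /* snapshot the current value */
        /* Stores old + 1 only if *counter still equals old. Either way,
           returns the value that was in *counter before the call. */
        seen = __sync_val_compare_and_swap(counter, old, old + 1);
    } while (seen != old);   /* somebody else changed it first: retry */
    return old + 1;
}
```

A CAS whose expected value does not match leaves memory untouched and simply reports what it found, which is what lets the loop above detect interference and try again.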

If you want to know more about building higher-level synchronization on top of atomic operations, and about saving your system from spinlocks that burn CPU cycles on active spinning, read about the futex mechanism in Linux. The first paper to read on futexes is Futexes Are Tricky by Ulrich Drepper (2011); another is the LWN article http://lwn.net/Articles/360699/ (and the historic one is Fuss, Futexes and Furwocks: Fast Userland Locking in Linux, 2002).

The mutex locks described by Drepper use only atomic operations on the "fast path" (when the mutex is not locked and our thread is the only one that wants to lock it). If the mutex is already locked, the thread goes to sleep using futex(FUTEX_WAIT, ...). Before sleeping, it marks the mutex variable with an atomic operation to tell the unlocking thread that somebody is sleeping on this mutex, so the unlocker knows it must wake them using futex(FUTEX_WAKE, ...).
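The fast path / slow path split can be sketched roughly as below. This is a Linux-only sketch in the spirit of the three-state mutex from Drepper's paper, not a copy of it: the names sketch_mutex, futex_call, cmpxchg, sketch_lock and sketch_unlock are invented here, and the code assumes atomic_int has the same size and layout as the 32-bit word the futex syscall expects.

```c
#define _GNU_SOURCE
#include <linux/futex.h>
#include <stdatomic.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* state: 0 = unlocked, 1 = locked with no waiters, 2 = locked, maybe waiters */
typedef struct { atomic_int state; } sketch_mutex;

/* Thin wrapper: glibc has no futex() function, so use the raw syscall. */
static long futex_call(atomic_int *addr, int op, int val)
{
    return syscall(SYS_futex, addr, op, val, NULL, NULL, 0);
}

/* CAS that returns the value found in *p before the operation. */
static int cmpxchg(atomic_int *p, int expected, int desired)
{
    atomic_compare_exchange_strong(p, &expected, desired);
    return expected;
}

static void sketch_lock(sketch_mutex *m)
{
    int c = cmpxchg(&m->state, 0, 1);   /* fast path: one atomic operation */
    if (c == 0)
        return;
    if (c != 2)                          /* mark "somebody is waiting" */
        c = atomic_exchange(&m->state, 2);
    while (c != 0) {
        /* Sleeps only if state is still 2 when the kernel checks it. */
        futex_call(&m->state, FUTEX_WAIT, 2);
        c = atomic_exchange(&m->state, 2);
    }
}

static void sketch_unlock(sketch_mutex *m)
{
    /* Old value 1 means nobody waited: no syscall is needed. */
    if (atomic_fetch_sub(&m->state, 1) != 1) {
        atomic_store(&m->state, 0);
        futex_call(&m->state, FUTEX_WAKE, 1);   /* wake one sleeper */
    }
}
```

In the uncontended case neither function enters the kernel, which is the whole point of the design: the syscall cost is paid only when a thread actually has to sleep or be woken.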


How does it prevent two threads from acquiring the lock? Well, once any one thread succeeds, *mutex will be 1, so any other thread's CAS will fail (because it's called with expected value 0). The lock is released by storing 0 in *mutex.
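To make that concrete, here is a minimal sketch of such a CAS-based lock using the gcc builtins. The function names are made up for illustration, and the lock word is assumed to be a plain int that starts at 0.

```c
/* Returns 1 if this call took the lock, 0 if it was already held. */
static int spin_try_lock(int *mutex)
{
    /* Succeeds only if *mutex is 0, and sets it to 1 in the same atomic step. */
    return __sync_bool_compare_and_swap(mutex, 0, 1);
}

static void spin_lock(int *mutex)
{
    while (!spin_try_lock(mutex)) {
        /* spin: another thread holds the lock, so our CAS keeps failing */
    }
}

static void spin_unlock(int *mutex)
{
    __sync_lock_release(mutex);   /* stores 0 with release semantics */
}
```

Because the comparison and the store happen as one atomic step, two threads can never both observe 0 and both write 1.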

Note that this is an odd use of CAS, since it's essentially requiring an ABA-violation. Typically you'd just use a plain atomic exchange:

    while (exchange(mutex, 1) == 1) { /* spin */ }
    // critical section
    *mutex = 0;   // atomically

Or if you want to be slightly more sophisticated and store information about which thread has the lock, you can do tricks with atomic-fetch-and-add (see for example the Linux kernel spinlock code).
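One well-known fetch-and-add design is the ticket lock, which older Linux kernels used for their spinlocks. The sketch below is only an illustration of that idea in C11, not the kernel's code; the type and function names are invented for this example.

```c
#include <stdatomic.h>

typedef struct {
    atomic_uint next_ticket;   /* next ticket number to hand out */
    atomic_uint now_serving;   /* ticket that currently owns the lock */
} ticket_lock;

/* Takes a ticket and waits for its turn. Returns the ticket number,
   which identifies the current holder. */
static unsigned ticket_acquire(ticket_lock *l)
{
    unsigned my = atomic_fetch_add(&l->next_ticket, 1);   /* atomic "take a number" */
    while (atomic_load(&l->now_serving) != my) {
        /* spin until our number comes up */
    }
    return my;
}

static void ticket_release(ticket_lock *l)
{
    atomic_fetch_add(&l->now_serving, 1);   /* admit the next ticket holder */
}
```

Unlike the plain exchange loop, waiters are served in arrival order, and the lock state records whose turn it is.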


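You cannot implement CAS in plain C using only ordinary loads and stores. The atomicity comes from a hardware instruction, so C code has to reach it through inline assembly, a compiler builtin, or the C11 <stdatomic.h> functions.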
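That said, since C11 the language does expose the hardware CAS through <stdatomic.h>, and the compiler emits the atomic instruction for you. A minimal sketch, where the helper name atomic_store_max is hypothetical:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical helper: raises *target to value if value is larger.
   Returns true if this call performed the update. */
static bool atomic_store_max(atomic_int *target, int value)
{
    int current = atomic_load(target);
    while (current < value) {
        /* On failure (including a spurious one), current is refreshed
           with the value actually found, and the loop re-checks it. */
        if (atomic_compare_exchange_weak(target, &current, value))
            return true;
    }
    return false;   /* *target was already >= value */
}
```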