How is conditional_wait() implemented at the kernel and hardware/assembly level?

multithreading


At a high level (and since you are asking this question, a high level is what you need) it is not that complicated. First, you need to know the layers of responsibility. There are basically three layers:

  • Hardware level - usually something that can be done in a single ASM instruction
  • Kernel level - something the OS kernel does
  • Application level - something the application does

Generally, those responsibilities do not overlap - the kernel cannot do what only the hardware can do, and the hardware cannot do what only the kernel can do. With this in mind, it is useful to remember that when it comes to locking, there is very little the hardware knows about it. It pretty much boils down to:

  • Atomic arithmetic - the hardware can lock a particular memory location (make sure no other thread or core accesses it), perform an arithmetic operation on it, and unlock the location again. This only works for the operations natively supported by the chip (no square roots!) and for the operand sizes natively supported by the hardware
  • Memory barriers or fences - that is, a barrier introduced into the flow of instructions, so that when the CPU reorders instructions or uses its memory caches, reads and writes will not cross the fence and the cached values stay fresh
  • Conditional setting (compare-and-set) - set a memory location to value A if it currently holds B, and report whether the store happened or not

That's pretty much all the CPU can do. As you see, there are no futexes, mutexes or condition variables here. That stuff is built by the kernel, which has the CPU-supported operations at its disposal.
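
To make those primitives concrete, here is a minimal sketch of my own (not taken from any particular implementation) of what they look like through C11's <stdatomic.h>; the compiler lowers these calls to native instructions such as LOCK XADD, MFENCE and LOCK CMPXCHG on x86:

    #include <stdatomic.h>
    #include <stdbool.h>

    atomic_int counter = 0;
    atomic_int flag    = 0;   /* 0 = free, 1 = taken */

    void demo(void)
    {
        /* atomic arithmetic: add 1 to counter; no other thread can interleave */
        atomic_fetch_add(&counter, 1);

        /* memory barrier: reads/writes before this point may not be
         * reordered past it, and vice versa */
        atomic_thread_fence(memory_order_seq_cst);

        /* compare-and-set: store 1 into flag only if it currently holds 0,
         * and learn whether we were the thread that succeeded */
        int expected = 0;
        bool won = atomic_compare_exchange_strong(&flag, &expected, 1);
        (void)won;
    }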

Let's look, at a very high level, at how the kernel might implement a futex call. Actually, a futex is slightly complicated, because it is a mixture of user-level and kernel-level calls, used as needed. Let's look instead at a 'pure' mutex, implemented solely in kernel space. At a high level it will be demonstrational enough.

When the mutex is initially created, the kernel associates a memory region with it. This region holds the value of the mutex being locked or unlocked. Later, when the kernel is asked to lock the mutex, it first instructs the CPU to issue a memory barrier. A mutex must serve as a barrier, so that everything read or written after the mutex is acquired (or released) is visible to the rest of the CPUs. Then it uses the CPU-supported compare-and-set instruction to set the memory region's value to 1 if it was 0. (There are more complicated reentrant mutexes, but let's not complicate the picture with them.) The CPU guarantees that even if more than one thread attempts to do this at the same time, only one will succeed. If the operation succeeds, we now 'hold the mutex'. Once the kernel is asked to release the mutex, the memory region is set to 0 (there is no need to do this conditionally, since we know we hold the mutex!) and another memory barrier is issued. The kernel also updates the mutex status in its tables - see below.
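
As a toy sketch (deliberately simplified, not how any real kernel is actually written), the lock/unlock logic above maps onto compare-and-set plus acquire/release barriers like this:

    #include <stdatomic.h>
    #include <stdbool.h>

    typedef struct { atomic_int state; } toy_mutex;   /* 0 = unlocked, 1 = locked */

    bool toy_try_lock(toy_mutex *m)
    {
        int expected = 0;
        /* compare-and-set 0 -> 1; acquire ordering acts as the barrier, so
         * nothing we read or write after a successful lock can be moved
         * before it */
        return atomic_compare_exchange_strong_explicit(
            &m->state, &expected, 1,
            memory_order_acquire, memory_order_relaxed);
    }

    void toy_unlock(toy_mutex *m)
    {
        /* we know we hold the mutex, so an unconditional store of 0 is
         * enough; release ordering makes our writes visible before the 0 */
        atomic_store_explicit(&m->state, 0, memory_order_release);
    }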

If locking the mutex fails, the kernel adds the thread to its tables, which list the threads waiting for a particular mutex to be released. When the mutex is released, the kernel checks which thread(s) are waiting on this mutex and 'schedules' (i.e. prepares for execution) one of them (if there is more than one, which one will be scheduled or awakened depends on a multitude of factors; in the simplest case it is simply random). The scheduled thread begins executing, tries to lock the mutex again (at this point it can fail again!), and the cycle of life continues.
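
On Linux this 'wait in the kernel / wake a waiter' step is exposed to user space through the futex(2) syscall: FUTEX_WAIT puts the caller to sleep on an address (but only if the word there still holds the expected value, which avoids lost wake-ups), and FUTEX_WAKE wakes up to N sleepers. Here is a deliberately naive sketch of a futex-backed lock (no error handling, and real implementations also track whether anyone is waiting so the uncontended path never enters the kernel):

    #define _GNU_SOURCE
    #include <stdatomic.h>
    #include <linux/futex.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    static atomic_int lock_word = 0;   /* 0 = unlocked, 1 = locked */

    static void lock(void)
    {
        int expected = 0;
        /* fast path: try the compare-and-set in user space first */
        while (!atomic_compare_exchange_strong(&lock_word, &expected, 1)) {
            /* slow path: ask the kernel to put us on the wait list for this
             * address, but only if the word still equals 1 */
            syscall(SYS_futex, &lock_word, FUTEX_WAIT, 1, NULL, NULL, 0);
            expected = 0;   /* retry the compare-and-set after being woken */
        }
    }

    static void unlock(void)
    {
        atomic_store(&lock_word, 0);
        /* wake one thread sleeping on this address, if any */
        syscall(SYS_futex, &lock_word, FUTEX_WAKE, 1, NULL, NULL, 0);
    }

glibc's pthread mutexes and condition variables are built on exactly this mechanism, with an extra state meaning 'locked, with waiters' so that the wake syscall is only made when somebody is actually sleeping.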

Hope it makes at least half-sense :)