Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

futex: update documentation for ordering guarantees

Commits 11d4616bd07f ("futex: revert back to the explicit waiter
counting code") and 69cd9eba3886 ("futex: avoid race between requeue and
wake") changed some of the finer details of how we think about futexes.
One was a late fix and the other a consequence of overlooking the whole
requeuing logic.

The first change caused our documentation to be incorrect, and the
second made us aware that we need to explicitly add more details to it.

Signed-off-by: Davidlohr Bueso <davidlohr@hp.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Authored by Davidlohr Bueso and committed by Linus Torvalds
d7e8af1a 454fd351

+23 -9
kernel/futex.c
···
 #include "locking/rtmutex_common.h"

 /*
- * Basic futex operation and ordering guarantees:
+ * READ this before attempting to hack on futexes!
+ *
+ * Basic futex operation and ordering guarantees
+ * =============================================
  *
  * The waiter reads the futex value in user space and calls
  * futex_wait(). This function computes the hash bucket and acquires
···
  *   sys_futex(WAIT, futex, val);
  *   futex_wait(futex, val);
  *
- *   waiters++;
+ *   waiters++; (a)
  *   mb(); (A) <-- paired with -.
  *                              |
  *   lock(hash_bucket(futex));  |
···
  *   unlock(hash_bucket(futex));
  *   schedule();                         if (waiters)
  *                                         lock(hash_bucket(futex));
- *                                         wake_waiters(futex);
- *                                         unlock(hash_bucket(futex));
+ *   else                                  wake_waiters(futex);
+ *     waiters--; (b)                      unlock(hash_bucket(futex));
  *
- * Where (A) orders the waiters increment and the futex value read -- this
- * is guaranteed by the head counter in the hb spinlock; and where (B)
- * orders the write to futex and the waiters read -- this is done by the
- * barriers in get_futex_key_refs(), through either ihold or atomic_inc,
- * depending on the futex type.
+ * Where (A) orders the waiters increment and the futex value read through
+ * atomic operations (see hb_waiters_inc) and where (B) orders the write
+ * to futex and the waiters read -- this is done by the barriers in
+ * get_futex_key_refs(), through either ihold or atomic_inc, depending on the
+ * futex type.
  *
  * This yields the following case (where X:=waiters, Y:=futex):
  *
···
  * Which guarantees that x==0 && y==0 is impossible; which translates back into
  * the guarantee that we cannot both miss the futex variable change and the
  * enqueue.
+ *
+ * Note that a new waiter is accounted for in (a) even when it is possible that
+ * the wait call can return error, in which case we backtrack from it in (b).
+ * Refer to the comment in queue_lock().
+ *
+ * Similarly, in order to account for waiters being requeued on another
+ * address we always increment the waiters for the destination bucket before
+ * acquiring the lock. It then decrements them again after releasing it -
+ * the code that actually moves the futex(es) between hash buckets (requeue_futex)
+ * will do the additional required waiter count housekeeping. This is done for
+ * double_lock_hb() and double_unlock_hb(), respectively.
  */

 #ifndef CONFIG_HAVE_FUTEX_CMPXCHG
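The accounting described in the updated comment can be illustrated with a small userspace sketch. This is not the kernel code: struct sketch_bucket, sketch_futex_wait() and sketch_futex_wake() are hypothetical names invented here, the bucket lock is reduced to a counter, and sequentially consistent C11 atomics stand in for the barriers marked (A) and (B). The wake-side decrement is also a simplification made for this sketch. It only shows the shape of the logic: the waiter is counted at (a) before the bucket lock is taken, the count is dropped again at (b) when the wait fails, and the waker skips the lock when it sees no waiters.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a futex hash bucket (not the kernel's struct). */
struct sketch_bucket {
    atomic_int waiters;     /* bumped before the bucket lock is taken */
    int queued;             /* tasks actually sitting on the wait queue */
    int lock_acquisitions;  /* counts how often the "lock" was taken */
};

/* Waiter side. Returns true if enqueued, false if *futex no longer equals val. */
static bool sketch_futex_wait(struct sketch_bucket *hb, atomic_int *futex, int val)
{
    /* (a): seq_cst increment, standing in for "waiters++; mb(); (A)". */
    atomic_fetch_add(&hb->waiters, 1);

    hb->lock_acquisitions++;              /* lock(hash_bucket(futex)) */
    if (atomic_load(futex) != val) {
        /* The wait fails, so backtrack the accounting: (b). */
        atomic_fetch_sub(&hb->waiters, 1);
        return false;                     /* unlock and return an error */
    }
    hb->queued++;                         /* queue(); unlock; schedule() */
    return true;
}

/* Waker side. Returns the number of waiters woken. */
static int sketch_futex_wake(struct sketch_bucket *hb, atomic_int *futex, int newval)
{
    atomic_store(futex, newval);          /* *futex = newval */

    /* seq_cst store followed by seq_cst load, standing in for "mb(); (B)". */
    if (atomic_load(&hb->waiters) == 0)
        return 0;                         /* nobody waiting: skip the lock */

    hb->lock_acquisitions++;              /* lock(hash_bucket(futex)) */
    int woken = hb->queued;
    hb->queued = 0;
    /* Sketch assumption: woken waiters are uncounted as they leave the queue. */
    atomic_fetch_sub(&hb->waiters, woken);
    return woken;
}
```

In this model a wake on an empty bucket never touches the lock, and a wait that fails its value check leaves the waiter count exactly where it started, which is the backtracking the new comment text refers to.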