Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git


rqspinlock: Protect pending bit owners from stalls

The pending bit is used to avoid queueing in case the lock is
uncontended, and has demonstrated benefits for the two-contender
scenario, especially on x86. Once the pending bit is acquired and we
wait for the locked bit to clear, we may get stuck if the lock owner is
not making progress. Hence, this waiting loop must be protected with a
timeout check.
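
As an illustration of a timeout-protected wait, here is a minimal userspace sketch using C11 atomics and the POSIX monotonic clock. It is not the kernel code: `now_ns()` and `wait_unlocked_or_timeout()` are hypothetical names, and reading the clock on every iteration is a simplification that a real implementation would likely amortize.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: monotonic clock reading in nanoseconds. */
static uint64_t now_ns(void)
{
	struct timespec t;
	int rc = clock_gettime(CLOCK_MONOTONIC, &t);

	assert(rc == 0);
	(void)rc;
	return (uint64_t)t.tv_sec * 1000000000ull + (uint64_t)t.tv_nsec;
}

/*
 * Spin until *locked reads 0 or timeout_ns has elapsed.
 * Returns 0 if the locked byte cleared, -ETIMEDOUT otherwise.
 */
static int wait_unlocked_or_timeout(atomic_uchar *locked, uint64_t timeout_ns)
{
	uint64_t deadline = now_ns() + timeout_ns;

	while (atomic_load_explicit(locked, memory_order_acquire)) {
		if (now_ns() >= deadline)
			return -ETIMEDOUT;
	}
	return 0;
}
```

A caller that gets -ETIMEDOUT back has not acquired the lock and must not enter the critical section.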

To perform a graceful recovery once we decide to abort our lock
acquisition attempt in this case, we must unset the pending bit since we
own it. All waiters undoing their changes and exiting gracefully allows
the lock word to be restored to the unlocked state once all participants
(owner, waiters) have been recovered, and the lock remains usable.
Hence, set the pending bit back to zero before returning to the caller.
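
The recovery step can be sketched in userspace as follows. This is an assumption-laden model, not the kernel implementation: the bit layout (`SK_LOCKED`, `SK_PENDING`), the function names, and the spin-count budget standing in for a time-based timeout are all made up for illustration, and the MCS queue is left out entirely.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>

/* Illustrative bit layout, assumed for this sketch only. */
#define SK_LOCKED	0x001u	/* lock is held */
#define SK_PENDING	0x100u	/* one waiter spins on the locked bit */

/*
 * Pending-path acquire with a bounded wait. max_spins is a hypothetical
 * stand-in for a time-based timeout.
 */
static int sk_pending_acquire(atomic_uint *lock, unsigned long max_spins)
{
	unsigned int old;

	assert(lock);
	old = atomic_fetch_or_explicit(lock, SK_PENDING, memory_order_acquire);
	if (old & SK_PENDING)
		return -EBUSY;	/* not our bit; real code would queue instead */

	/* We own the pending bit: wait for the owner to drop the locked bit. */
	while (atomic_load_explicit(lock, memory_order_acquire) & SK_LOCKED) {
		if (max_spins-- == 0) {
			/* *,1,* -> *,0,* : undo our change before bailing out. */
			atomic_fetch_and_explicit(lock, ~SK_PENDING,
						  memory_order_relaxed);
			return -ETIMEDOUT;
		}
	}

	/* *,1,0 -> *,0,1 : flip both bits to take ownership. */
	atomic_fetch_xor_explicit(lock, SK_LOCKED | SK_PENDING,
				  memory_order_relaxed);
	return 0;
}

/* Release the lock by clearing the locked bit. */
static void sk_unlock(atomic_uint *lock)
{
	atomic_fetch_and_explicit(lock, ~SK_LOCKED, memory_order_release);
}
```

With a stalled owner, the waiter times out and leaves only the locked bit set, so the word returns to 0 as soon as the owner eventually unlocks.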

Introduce a lockevent (rqspinlock_lock_timeout) to capture timeout
event statistics.
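
For readers unfamiliar with list-driven event counters, the following userspace sketch shows the general X-macro pattern of declaring an event once and deriving an enum, a name table and a counter slot from it. All `SK_`/`sk_` names are hypothetical, and plain global counters are used here as a simplification; this is not the kernel's lockevent infrastructure.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical event list; each entry is declared exactly once. */
#define SK_EVENT_LIST(X)		\
	X(lock_pending)			\
	X(rqspinlock_lock_timeout)

/* Derive one enum constant per event, plus a trailing count. */
#define SK_ENUM(name) SK_EV_##name,
enum sk_event { SK_EVENT_LIST(SK_ENUM) SK_EV_COUNT };

/* Derive a matching table of printable names. */
#define SK_NAME(name) #name,
static const char *const sk_event_names[] = { SK_EVENT_LIST(SK_NAME) };

/* One counter per event (global here, as a simplification). */
static unsigned long sk_event_counts[SK_EV_COUNT];

#define sk_event_inc(name) (sk_event_counts[SK_EV_##name]++)

/* Print every counter as name=value, one per line. */
static void sk_event_dump(FILE *out)
{
	assert(out);
	for (int i = 0; i < SK_EV_COUNT; i++)
		fprintf(out, "%s=%lu\n", sk_event_names[i], sk_event_counts[i]);
}
```

Adding a new event then only requires one new line in the list, which mirrors how this patch adds a single LOCK_EVENT() entry.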

Reviewed-by: Barret Rhoden <brho@google.com>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20250316040541.108729-10-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Authored by Kumar Kartikeya Dwivedi and committed by Alexei Starovoitov
337ffea5 ebababcd

+33 -6
+1 -1
include/asm-generic/rqspinlock.h
···
 struct qspinlock;
 typedef struct qspinlock rqspinlock_t;

-extern void resilient_queued_spin_lock_slowpath(rqspinlock_t *lock, u32 val);
+extern int resilient_queued_spin_lock_slowpath(rqspinlock_t *lock, u32 val);

 /*
  * Default timeout for waiting loops is 0.25 seconds
+27 -5
kernel/bpf/rqspinlock.c
···
  * @lock: Pointer to queued spinlock structure
  * @val: Current value of the queued spinlock 32-bit word
  *
+ * Return:
+ * * 0          - Lock was acquired successfully.
+ * * -ETIMEDOUT - Lock acquisition failed because of timeout.
+ *
  * (queue tail, pending bit, lock value)
  *
  *              fast     :    slow                                  :    unlock
···
  * contended             :    (*,x,y) +--> (*,0,0) ---> (*,0,1) -'  :
  *   queue               :         ^--'                             :
  */
-void __lockfunc resilient_queued_spin_lock_slowpath(rqspinlock_t *lock, u32 val)
+int __lockfunc resilient_queued_spin_lock_slowpath(rqspinlock_t *lock, u32 val)
 {
 	struct mcs_spinlock *prev, *next, *node;
 	struct rqspinlock_timeout ts;
+	int idx, ret = 0;
 	u32 old, tail;
-	int idx;

 	BUILD_BUG_ON(CONFIG_NR_CPUS >= (1U << _Q_TAIL_CPU_BITS));

···
 	 * clear_pending_set_locked() implementations imply full
 	 * barriers.
 	 */
-	if (val & _Q_LOCKED_MASK)
-		smp_cond_load_acquire(&lock->locked, !VAL);
+	if (val & _Q_LOCKED_MASK) {
+		RES_RESET_TIMEOUT(ts, RES_DEF_TIMEOUT);
+		res_smp_cond_load_acquire(&lock->locked, !VAL || RES_CHECK_TIMEOUT(ts, ret));
+	}
+
+	if (ret) {
+		/*
+		 * We waited for the locked bit to go back to 0, as the pending
+		 * waiter, but timed out. We need to clear the pending bit since
+		 * we own it. Once a stuck owner has been recovered, the lock
+		 * must be restored to a valid state, hence removing the pending
+		 * bit is necessary.
+		 *
+		 * *,1,* -> *,0,*
+		 */
+		clear_pending(lock);
+		lockevent_inc(rqspinlock_lock_timeout);
+		return ret;
+	}

 	/*
 	 * take ownership and clear the pending bit.
···
 	 */
 	clear_pending_set_locked(lock);
 	lockevent_inc(lock_pending);
-	return;
+	return 0;

 	/*
 	 * End of pending bit optimistic spinning and beginning of MCS
···
 	 * release the node
 	 */
 	__this_cpu_dec(rqnodes[0].mcs.count);
+	return 0;
 }
 EXPORT_SYMBOL_GPL(resilient_queued_spin_lock_slowpath);
+5
kernel/locking/lock_events_list.h
···
 #endif /* CONFIG_QUEUED_SPINLOCKS */

 /*
+ * Locking events for Resilient Queued Spin Lock
+ */
+LOCK_EVENT(rqspinlock_lock_timeout)	/* # of locking ops that timeout */
+
+/*
  * Locking events for rwsem
  */
 LOCK_EVENT(rwsem_sleep_reader)	/* # of reader sleeps */