Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

locking/ww_mutex: Adjust to lockdep nest_lock requirements

When using mutex_acquire_nest() with a nest_lock, lockdep refcounts the
number of acquired lockdep_maps of mutexes of the same class, and also
keeps a pointer to the first acquired lockdep_map of a class. That pointer
is then used for various comparison, printing, and checking purposes,
but there is no mechanism to actively ensure that lockdep_map stays in
memory. Instead, a warning is printed if the lockdep_map is freed and
there are still held locks of the same lock class, even if the lockdep_map
itself has been released.
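
The bookkeeping described above can be pictured with a small userspace
model. This is only a sketch under stated assumptions: the toy_* names are
hypothetical, this is not lockdep's real data structure or API, and
"would warn" stands in for the warning lockdep prints when a tracked
lockdep_map is freed.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct lockdep_map; only its address matters. */
struct toy_map {
    int id;
};

/* Hypothetical per-class tracking entry: one pointer plus a refcount. */
struct toy_held {
    const struct toy_map *first; /* first map acquired in the class */
    unsigned int refs;           /* number of maps of the class still held */
};

/* Acquire a map: only the very first acquisition records a pointer. */
static void toy_acquire(struct toy_held *h, const struct toy_map *m)
{
    if (h->refs++ == 0)
        h->first = m;
}

/* Release any map of the class: the refcount drops, the pointer stays. */
static void toy_release(struct toy_held *h)
{
    if (h->refs > 0)
        h->refs--;
    if (h->refs == 0)
        h->first = NULL;
}

/* True if freeing m would leave the entry pointing at freed memory. */
static bool toy_free_would_warn(const struct toy_held *h,
                                const struct toy_map *m)
{
    return h->refs > 0 && h->first == m;
}
```

In this model, acquiring maps A and B, then releasing A, leaves the entry
still pointing at A while B is held, so freeing A at that point is the
problematic case.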

In the context of WW/WD transactions that means that if a user unlocks
and frees a ww_mutex from within an ongoing ww transaction, and that
mutex happens to be the first ww_mutex grabbed in the transaction,
such a warning is printed and there might be a risk of a use-after-free (UAF).

Note that this is only a problem when lockdep is enabled and affects only
dereferences of struct lockdep_map.

Adjust to this by adding a fake lockdep_map to the acquired context and
make sure it is the first acquired lockdep map of the associated
ww_mutex class. Then hold it for the duration of the WW/WD transaction.
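
A minimal userspace sketch of that idea follows. It assumes a simplified
tracking entry with one pointer and a refcount; the sk_* names are
hypothetical and only mirror the roles of ww_acquire_init(),
ww_acquire_fini() and first_lock_dep_map, they are not the kernel
implementation.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct lockdep_map. */
struct sk_map {
    int id;
};

/* Hypothetical acquire context that owns the fake first map. */
struct sk_ctx {
    const struct sk_map *first; /* first map acquired in the mutex class */
    unsigned int refs;          /* maps of the mutex class still held */
    struct sk_map first_map;    /* plays the role of first_lock_dep_map */
};

static void sk_acquire(struct sk_ctx *c, const struct sk_map *m)
{
    if (c->refs++ == 0)
        c->first = m;
}

static void sk_release(struct sk_ctx *c)
{
    if (c->refs > 0)
        c->refs--;
    if (c->refs == 0)
        c->first = NULL;
}

/* Like ww_acquire_init(): take the fake map before any real mutex. */
static void sk_ctx_init(struct sk_ctx *c)
{
    c->first = NULL;
    c->refs = 0;
    c->first_map.id = 0;
    sk_acquire(c, &c->first_map);
}

/* Like ww_acquire_fini(): drop the fake map at the end of the transaction. */
static void sk_ctx_fini(struct sk_ctx *c)
{
    sk_release(c);
}

/* True if freeing m would leave the tracked pointer dangling. */
static bool sk_free_would_warn(const struct sk_ctx *c, const struct sk_map *m)
{
    return c->refs > 0 && c->first == m;
}
```

Because the tracked pointer always refers to memory inside the context,
unlocking and freeing any real mutex in the middle of the transaction no
longer touches the map that the tracking entry points at.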

This has the side effect that trying to lock a ww mutex *without* a
ww_acquire_context while such a context has been acquired produces
a lockdep splat. The test-ww_mutex.c selftest attempts to do that, so
modify that particular test to not acquire a ww_acquire_context if it
is not going to be used.
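
The side effect can be illustrated with a toy model. It rests on an
assumption not spelled out in this message: that lockdep reports a
same-class acquisition as a problem unless it is made under the nest_lock
that is already held. The nl_* names and the init_ctx_anyway parameter are
hypothetical and exist only for this sketch.

```c
#include <stdbool.h>

/* Hypothetical lock state for one task in the model. */
struct nl_state {
    unsigned int class_refs; /* held maps of the ww_mutex class */
    bool ctx_held;           /* is an acquire context (nest_lock) held? */
};

/* Model of context init with the fake first map already taken. */
static void nl_ctx_init(struct nl_state *s)
{
    s->ctx_held = true;
    s->class_refs++; /* the fake map counts as a held lock of the class */
}

/*
 * Model of locking a ww mutex. Returns false where the model expects a
 * lockdep splat: a lock of the class is already held and the new
 * acquisition is not nested under the held context.
 */
static bool nl_acquire(struct nl_state *s, bool with_ctx)
{
    if (s->class_refs > 0 && !(with_ctx && s->ctx_held))
        return false;
    s->class_refs++;
    return true;
}

/*
 * Model of the selftest: use_ctx mirrors the TEST_MTX_CTX flag, and
 * init_ctx_anyway mirrors the old behaviour of always initialising the
 * context even when it is not used for locking.
 */
static bool nl_test_mutex(bool use_ctx, bool init_ctx_anyway)
{
    struct nl_state s = { 0, false };

    if (use_ctx || init_ctx_anyway)
        nl_ctx_init(&s);
    return nl_acquire(&s, use_ctx);
}
```

In this model, nl_test_mutex(false, true) corresponds to the old test flow
and is rejected, while nl_test_mutex(false, false) and
nl_test_mutex(true, false) correspond to the modified test and are accepted.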

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20241009092031.6356-1-thomas.hellstrom@linux.intel.com

Authored by Thomas Hellström and committed by Peter Zijlstra
823a5662 afc256e1

+19 -3
+14
include/linux/ww_mutex.h
···
 #endif
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
         struct lockdep_map dep_map;
+        /**
+         * @first_lock_dep_map: fake lockdep_map for first locked ww_mutex.
+         *
+         * lockdep requires the lockdep_map for the first locked ww_mutex
+         * in a ww transaction to remain in memory until all ww_mutexes of
+         * the transaction have been unlocked. Ensure this by keeping a
+         * fake locked ww_mutex lockdep map between ww_acquire_init() and
+         * ww_acquire_fini().
+         */
+        struct lockdep_map first_lock_dep_map;
 #endif
 #ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
         unsigned int deadlock_inject_interval;
···
         debug_check_no_locks_freed((void *)ctx, sizeof(*ctx));
         lockdep_init_map(&ctx->dep_map, ww_class->acquire_name,
                          &ww_class->acquire_key, 0);
+        lockdep_init_map(&ctx->first_lock_dep_map, ww_class->mutex_name,
+                         &ww_class->mutex_key, 0);
         mutex_acquire(&ctx->dep_map, 0, 0, _RET_IP_);
+        mutex_acquire_nest(&ctx->first_lock_dep_map, 0, 0, &ctx->dep_map, _RET_IP_);
 #endif
 #ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
         ctx->deadlock_inject_interval = 1;
···
 static inline void ww_acquire_fini(struct ww_acquire_ctx *ctx)
 {
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
+        mutex_release(&ctx->first_lock_dep_map, _THIS_IP_);
         mutex_release(&ctx->dep_map, _THIS_IP_);
 #endif
 #ifdef DEBUG_WW_MUTEXES
+5 -3
kernel/locking/test-ww_mutex.c
···
         int ret;

         ww_mutex_init(&mtx.mutex, &ww_class);
-        ww_acquire_init(&ctx, &ww_class);
+        if (flags & TEST_MTX_CTX)
+                ww_acquire_init(&ctx, &ww_class);

         INIT_WORK_ONSTACK(&mtx.work, test_mutex_work);
         init_completion(&mtx.ready);
···
                 ret = wait_for_completion_timeout(&mtx.done, TIMEOUT);
         }
         ww_mutex_unlock(&mtx.mutex);
-        ww_acquire_fini(&ctx);
+        if (flags & TEST_MTX_CTX)
+                ww_acquire_fini(&ctx);

         if (ret) {
                 pr_err("%s(flags=%x): mutual exclusion failure\n",
···
         if (ret)
                 return ret;

-        ret = stress(2047, hweight32(STRESS_ALL)*ncpus, STRESS_ALL);
+        ret = stress(2046, hweight32(STRESS_ALL)*ncpus, STRESS_ALL);
         if (ret)
                 return ret;