Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
kernel os linux
1
fork

Configure Feed

Select the types of activity you want to include in your feed.

mm/slab: add rcu_barrier() to kvfree_rcu_barrier_on_cache()

After we submit the rcu_free sheaves to call_rcu() we need to make sure
the rcu callbacks complete. kvfree_rcu_barrier() does that via
flush_all_rcu_sheaves() but kvfree_rcu_barrier_on_cache() doesn't. Fix
that.

This currently causes no issues because the caches with sheaves we have
are never destroyed. The problem flagged by kernel test robot was
reported for a patch that enables sheaves for (almost) all caches, and
occurred only with CONFIG_KASAN. Harry Yoo found the root cause [1]:

It turns out the object freed by sheaf_flush_unused() was in KASAN
percpu quarantine list (confirmed by dumping the list) by the time
__kmem_cache_shutdown() returns an error.

Quarantined objects are supposed to be flushed by kasan_cache_shutdown(),
but things go wrong if the rcu callback (rcu_free_sheaf_nobarn()) is
processed after kasan_cache_shutdown() finishes.

That's why rcu_barrier() in __kmem_cache_shutdown() didn't help,
because it's called after kasan_cache_shutdown().

Calling rcu_barrier() in kvfree_rcu_barrier_on_cache() guarantees
that it'll be added to the quarantine list before kasan_cache_shutdown()
is called. So it's a valid fix!

[1] https://lore.kernel.org/all/aWd6f3jERlrB5yeF@hyeyoo/

Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202601121442.c530bed3-lkp@intel.com
Fixes: 0f35040de593 ("mm/slab: introduce kvfree_rcu_barrier_on_cache() for cache destruction")
Cc: stable@vger.kernel.org
Reviewed-by: Harry Yoo <harry.yoo@oracle.com>
Tested-by: Harry Yoo <harry.yoo@oracle.com>
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>

+4 -1
+4 -1
mm/slab_common.c
··· 2133 2133 */ 2134 2134 void kvfree_rcu_barrier_on_cache(struct kmem_cache *s) 2135 2135 { 2136 - if (s->cpu_sheaves) 2136 + if (s->cpu_sheaves) { 2137 2137 flush_rcu_sheaves_on_cache(s); 2138 + rcu_barrier(); 2139 + } 2140 + 2138 2141 /* 2139 2142 * TODO: Introduce a version of __kvfree_rcu_barrier() that works 2140 2143 * on a specific slab cache.