Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

memcg: fix data-race KCSAN bug in rstats

A data-race issue in memcg rstat occurs when two distinct code paths
access the same 4-byte region concurrently. KCSAN detection triggers the
following BUG as a result.

BUG: KCSAN: data-race in __count_memcg_events / mem_cgroup_css_rstat_flush

write to 0xffffe8ffff98e300 of 4 bytes by task 5274 on cpu 17:
mem_cgroup_css_rstat_flush (mm/memcontrol.c:5850)
cgroup_rstat_flush_locked (kernel/cgroup/rstat.c:243 (discriminator 7))
cgroup_rstat_flush (./include/linux/spinlock.h:401 kernel/cgroup/rstat.c:278)
mem_cgroup_flush_stats.part.0 (mm/memcontrol.c:767)
memory_numa_stat_show (mm/memcontrol.c:6911)
<snip>

read to 0xffffe8ffff98e300 of 4 bytes by task 410848 on cpu 27:
__count_memcg_events (mm/memcontrol.c:725 mm/memcontrol.c:962)
count_memcg_event_mm.part.0 (./include/linux/memcontrol.h:1097 ./include/linux/memcontrol.h:1120)
handle_mm_fault (mm/memory.c:5483 mm/memory.c:5622)
<snip>

value changed: 0x00000029 -> 0x00000000

The race occurs because two code paths access the same "stats_updates"
location. Although "stats_updates" is a per-CPU variable, it is remotely
accessed by another CPU at
cgroup_rstat_flush_locked()->mem_cgroup_css_rstat_flush(), leading to the
data race mentioned.

Considering that memcg_rstat_updated() is in the hot code path, adding a
lock to protect it may not be desirable, especially since this variable
pertains solely to statistics.

Therefore, annotating accesses to stats_updates with READ/WRITE_ONCE() can
prevent KCSAN splats and potential partial reads/writes.
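
As a rough illustration of the annotated pattern, here is a minimal userspace sketch in C. It is not the kernel code: the READ_ONCE()/WRITE_ONCE() macros are simplified volatile-cast stand-ins (scalar types only, relying on the GCC/Clang __typeof__ extension), and BATCH, struct pcpu_counter, flushed_total, counter_updated() and counter_flush() are hypothetical names chosen for this example.

```c
#include <stdlib.h>

/* Simplified stand-ins for the kernel macros (assumption: scalar types only). */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

#define BATCH 64 /* hypothetical threshold, standing in for MEMCG_CHARGE_BATCH */

struct pcpu_counter {
	unsigned int stats_updates; /* pending updates owned by one CPU */
};

/* Hypothetical stand-in for the shared atomic64 counter. */
static unsigned long flushed_total;

/* Hot path: same shape as the patched loop body in memcg_rstat_updated(). */
static void counter_updated(struct pcpu_counter *c, int val)
{
	unsigned int stats_updates;

	if (!val)
		return;

	/* One marked load, one marked store; then work on the local copy only. */
	stats_updates = READ_ONCE(c->stats_updates) + abs(val);
	WRITE_ONCE(c->stats_updates, stats_updates);
	if (stats_updates < BATCH)
		return;

	flushed_total += stats_updates;
	WRITE_ONCE(c->stats_updates, 0);
}

/* Remote flusher: resets another CPU's counter with a marked store. */
static void counter_flush(struct pcpu_counter *c)
{
	WRITE_ONCE(c->stats_updates, 0);
}
```

The marked accesses keep the compiler from tearing or refetching the 4-byte value, but the read-modify-write is still not atomic: an update that races with a remote reset can be lost, which is tolerable here only because the value is a statistics hint.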

Link: https://lkml.kernel.org/r/20240424125940.2410718-1-leitao@debian.org
Fixes: 9cee7e8ef3e3 ("mm: memcg: optimize parent iteration in memcg_rstat_updated()")
Signed-off-by: Breno Leitao <leitao@debian.org>
Suggested-by: Shakeel Butt <shakeel.butt@linux.dev>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
Reviewed-by: Yosry Ahmed <yosryahmed@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Muchun Song <songmuchun@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Authored by Breno Leitao and committed by Andrew Morton
78ec6f9d 093137ea

+7 -5
mm/memcontrol.c
···
 {
 	struct memcg_vmstats_percpu *statc;
 	int cpu = smp_processor_id();
+	unsigned int stats_updates;
 
 	if (!val)
 		return;
···
 	cgroup_rstat_updated(memcg->css.cgroup, cpu);
 	statc = this_cpu_ptr(memcg->vmstats_percpu);
 	for (; statc; statc = statc->parent) {
-		statc->stats_updates += abs(val);
-		if (statc->stats_updates < MEMCG_CHARGE_BATCH)
+		stats_updates = READ_ONCE(statc->stats_updates) + abs(val);
+		WRITE_ONCE(statc->stats_updates, stats_updates);
+		if (stats_updates < MEMCG_CHARGE_BATCH)
 			continue;
 
 		/*
···
 		 * redundant. Avoid the overhead of the atomic update.
 		 */
 		if (!memcg_vmstats_needs_flush(statc->vmstats))
-			atomic64_add(statc->stats_updates,
+			atomic64_add(stats_updates,
 				     &statc->vmstats->stats_updates);
-		statc->stats_updates = 0;
+		WRITE_ONCE(statc->stats_updates, 0);
 	}
 }
 
···
 			}
 		}
 	}
-	statc->stats_updates = 0;
+	WRITE_ONCE(statc->stats_updates, 0);
 	/* We are in a per-cpu loop here, only do the atomic write once */
 	if (atomic64_read(&memcg->vmstats->stats_updates))
 		atomic64_set(&memcg->vmstats->stats_updates, 0);