Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
sched_ext: Read scx_root under scx_cgroup_ops_rwsem in cgroup setters

scx_group_set_{weight,idle,bandwidth}() cache scx_root before acquiring
scx_cgroup_ops_rwsem, so the pointer can be stale by the time the op runs.
If the loaded scheduler is disabled and freed (via RCU work) and another is
enabled between the naked load and the rwsem acquire, the reader sees
scx_cgroup_enabled=true (the new scheduler's) but dereferences the freed
one: a use-after-free in SCX_HAS_OP(sch, ...) / SCX_CALL_OP(sch, ...).

scx_cgroup_enabled is toggled only under the scx_cgroup_ops_rwsem write lock
(scx_cgroup_{init,exit}), so reading scx_root inside the rwsem read section
keeps @sch consistent with the observed scx_cgroup_enabled snapshot.

Fixes: a5bd6ba30b33 ("sched_ext: Use cgroup_lock/unlock() to synchronize against cgroup operations")
Cc: stable@vger.kernel.org # v6.18+
Reported-by: Chris Mason <clm@meta.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Andrea Righi <arighi@nvidia.com>

+6 -3
kernel/sched/ext.c
@@ -4343,9 +4343,10 @@
 
 void scx_group_set_weight(struct task_group *tg, unsigned long weight)
 {
-	struct scx_sched *sch = scx_root;
+	struct scx_sched *sch;
 
 	percpu_down_read(&scx_cgroup_ops_rwsem);
+	sch = scx_root;
 
 	if (scx_cgroup_enabled && SCX_HAS_OP(sch, cgroup_set_weight) &&
 	    tg->scx.weight != weight)
@@ -4359,9 +4360,10 @@
 
 void scx_group_set_idle(struct task_group *tg, bool idle)
 {
-	struct scx_sched *sch = scx_root;
+	struct scx_sched *sch;
 
 	percpu_down_read(&scx_cgroup_ops_rwsem);
+	sch = scx_root;
 
 	if (scx_cgroup_enabled && SCX_HAS_OP(sch, cgroup_set_idle))
 		SCX_CALL_OP(sch, cgroup_set_idle, NULL, tg_cgrp(tg), idle);
@@ -4376,9 +4378,10 @@
 void scx_group_set_bandwidth(struct task_group *tg,
 			     u64 period_us, u64 quota_us, u64 burst_us)
 {
-	struct scx_sched *sch = scx_root;
+	struct scx_sched *sch;
 
 	percpu_down_read(&scx_cgroup_ops_rwsem);
+	sch = scx_root;
 
 	if (scx_cgroup_enabled && SCX_HAS_OP(sch, cgroup_set_bandwidth) &&
 	    (tg->scx.bw_period_us != period_us ||