Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

sched_ext: Fix incorrect sched_class settings for per-cpu migration tasks

When a BPF scheduler is loaded, the tasks on the scx_tasks list are
traversed and __setscheduler_class() is invoked to pick each task's new
sched_class. However, this also incorrectly sets the sched_class of the
per-cpu migration tasks to rt_sched_class, and it remains rt_sched_class
even after the scheduler is unloaded.

The log for this issue is as follows:

./scx_rustland --stats 1
[ 199.245639][ T630] sched_ext: "rustland" does not implement cgroup cpu.weight
[ 199.269213][ T630] sched_ext: BPF scheduler "rustland" enabled
04:25:09 [INFO] RustLand scheduler attached

bpftrace -e 'iter:task /strcontains(ctx->task->comm, "migration")/
{ printf("%s:%d->%pS\n", ctx->task->comm, ctx->task->pid, ctx->task->sched_class); }'
Attaching 1 probe...
migration/0:24->rt_sched_class+0x0/0xe0
migration/1:27->rt_sched_class+0x0/0xe0
migration/2:33->rt_sched_class+0x0/0xe0
migration/3:39->rt_sched_class+0x0/0xe0
migration/4:45->rt_sched_class+0x0/0xe0
migration/5:52->rt_sched_class+0x0/0xe0
migration/6:58->rt_sched_class+0x0/0xe0
migration/7:64->rt_sched_class+0x0/0xe0

sched_ext: BPF scheduler "rustland" disabled (unregistered from user space)
EXIT: unregistered from user space
04:25:21 [INFO] Unregister RustLand scheduler

bpftrace -e 'iter:task /strcontains(ctx->task->comm, "migration")/
{ printf("%s:%d->%pS\n", ctx->task->comm, ctx->task->pid, ctx->task->sched_class); }'
Attaching 1 probe...
migration/0:24->rt_sched_class+0x0/0xe0
migration/1:27->rt_sched_class+0x0/0xe0
migration/2:33->rt_sched_class+0x0/0xe0
migration/3:39->rt_sched_class+0x0/0xe0
migration/4:45->rt_sched_class+0x0/0xe0
migration/5:52->rt_sched_class+0x0/0xe0
migration/6:58->rt_sched_class+0x0/0xe0
migration/7:64->rt_sched_class+0x0/0xe0

This commit therefore introduces scx_setscheduler_class(), which checks
for stop_sched_class before falling back to __setscheduler_class(), and
uses it in place of the direct __setscheduler_class() calls.

Fixes: f0e1a0643a59 ("sched_ext: Implement BPF extensible scheduler class")
Cc: stable@vger.kernel.org # v6.12+
Signed-off-by: Zqiang <qiang.zhang@linux.dev>
Reviewed-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>

Authored by Zqiang and committed by Tejun Heo
1dd6c84f 06a7415c

+10 -4
kernel/sched/ext.c
···
 	return rhashtable_lookup(&sch->dsq_hash, &dsq_id, dsq_hash_params);
 }
 
+static const struct sched_class *scx_setscheduler_class(struct task_struct *p)
+{
+	if (p->sched_class == &stop_sched_class)
+		return &stop_sched_class;
+
+	return __setscheduler_class(p->policy, p->prio);
+}
+
 /*
  * scx_kf_mask enforcement. Some kfuncs can only be called from specific SCX
  * ops. When invoking SCX ops, SCX_CALL_OP[_RET]() should be used to indicate
···
 	while ((p = scx_task_iter_next_locked(&sti))) {
 		unsigned int queue_flags = DEQUEUE_SAVE | DEQUEUE_MOVE | DEQUEUE_NOCLOCK;
 		const struct sched_class *old_class = p->sched_class;
-		const struct sched_class *new_class =
-			__setscheduler_class(p->policy, p->prio);
+		const struct sched_class *new_class = scx_setscheduler_class(p);
 
 		update_rq_clock(task_rq(p));
 
···
 	while ((p = scx_task_iter_next_locked(&sti))) {
 		unsigned int queue_flags = DEQUEUE_SAVE | DEQUEUE_MOVE;
 		const struct sched_class *old_class = p->sched_class;
-		const struct sched_class *new_class =
-			__setscheduler_class(p->policy, p->prio);
+		const struct sched_class *new_class = scx_setscheduler_class(p);
 
 		if (scx_get_task_state(p) != SCX_TASK_READY)
 			continue;