Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

sched: Fix incorrect schedstats for rt and dl thread

For RT and DL threads, only 'set_next_task_(rt/dl)' calls
'update_stats_wait_end_(rt/dl)' to update schedstats information.
However, during migration, 'update_stats_wait_start_(rt/dl)' is
called twice with no 'update_stats_wait_end_(rt/dl)' in between, which
makes the values of wait_max and wait_sum incorrect.
Example output:
$ cat /proc/6046/task/6046/sched | grep wait
wait_start : 0.000000
wait_max : 496717.080029
wait_sum : 7921540.776553

A complete schedstats update flow for a migration should be:
__update_stats_wait_start() [enter queue A, stage 1] ->
__update_stats_wait_end() [leave queue A, stage 2] ->
__update_stats_wait_start() [enter queue B, stage 3] ->
__update_stats_wait_end() [start running on queue B, stage 4]

Stage 1: prev_wait_start is 0, and in the end, wait_start records the
time of entering the queue.
Stage 2: task_on_rq_migrating(p) is true, and wait_start is updated to
the waiting time on queue A.
Stage 3: prev_wait_start is the waiting time on queue A, wait_start is
the time of entering queue B, and wait_start is expected to be greater
than prev_wait_start. Under this condition, wait_start is updated to
(the moment of entering queue B) - (the waiting time on queue A).
Stage 4: the final wait time = (time when starting to run on queue B)
- (time of entering queue B) + (waiting time on queue A) = waiting
time on queue B + waiting time on queue A.

The current problem is that stage 2 does not call __update_stats_wait_end
to update wait_start, so the final computed wait time becomes (waiting
time on queue B) + (the moment of entering queue A), which leads to
incorrect wait_max and wait_sum.

Call 'update_stats_wait_end_(rt/dl)' from 'update_stats_dequeue_(rt/dl)'
so that the schedstats information is updated when a task that is not
currently running is dequeued.

Signed-off-by: Dengjun Su <dengjun.su@mediatek.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20260204115959.3183567-1-dengjun.su@mediatek.com

Authored by Dengjun Su and committed by Peter Zijlstra
c0e1832b d3d663fa

+10 -1
+4
kernel/sched/deadline.c
@@ -2142,9 +2142,13 @@
 			int flags)
 {
 	struct task_struct *p = dl_task_of(dl_se);
+	struct rq *rq = rq_of_dl_rq(dl_rq);
 
 	if (!schedstat_enabled())
 		return;
+
+	if (p != rq->curr)
+		update_stats_wait_end_dl(dl_rq, dl_se);
 
 	if ((flags & DEQUEUE_SLEEP)) {
 		unsigned int state;
+6 -1
kernel/sched/rt.c
@@ -1302,12 +1302,17 @@
 			int flags)
 {
 	struct task_struct *p = NULL;
+	struct rq *rq = rq_of_rt_rq(rt_rq);
 
 	if (!schedstat_enabled())
 		return;
 
-	if (rt_entity_is_task(rt_se))
+	if (rt_entity_is_task(rt_se)) {
 		p = rt_task_of(rt_se);
+
+		if (p != rq->curr)
+			update_stats_wait_end_rt(rt_rq, rt_se);
+	}
 
 	if ((flags & DEQUEUE_SLEEP) && p) {
 		unsigned int state;