Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

sched/fair: Filter false overloaded_group case for EAS

With EAS, a group should be marked overloaded only if at least one CPU in
the group is overutilized. However, a CPU can be fully utilized by its
tasks simply because its compute capacity is clamped. In that case the
CPU is not overutilized, so the group should not be marked overloaded
either.

Because group_overloaded has a higher priority than group_misfit_task,
such a group can be selected as the busiest group instead of a group with
a misfit task. This prevents load_balance from selecting the CPU with the
misfit task and pulling that task onto a fitting CPU.

Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Pierre Gondois <pierre.gondois@arm.com>
Link: https://patch.msgid.link/20260206095454.1520619-1-vincent.guittot@linaro.org
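
The effect described in the message can be illustrated with a small standalone model. This is only a sketch under stated assumptions: every `toy_*` name is hypothetical, the struct is a reduced stand-in for `struct sg_lb_stats`, the enum keeps only three of the kernel's group types, and the overload test keeps only the utilization-versus-capacity comparison. `eas_enabled` stands in for `sched_energy_enabled()`.

```c
#include <stdbool.h>

/* Simplified subset of the kernel's group types, lowest priority first. */
enum toy_group_type {
	TOY_GROUP_HAS_SPARE = 0,
	TOY_GROUP_MISFIT_TASK,
	TOY_GROUP_OVERLOADED,
};

/* Hypothetical, reduced stand-in for struct sg_lb_stats. */
struct toy_sg_stats {
	unsigned long group_capacity;
	unsigned long group_util;
	unsigned long group_misfit_task_load;
	unsigned int sum_nr_running;
	unsigned int group_weight;
	unsigned int group_overutilized;
};

static bool toy_group_is_overloaded(bool eas_enabled,
				    unsigned int imbalance_pct,
				    const struct toy_sg_stats *sgs)
{
	/* Filter modeled on the patch: with EAS, require an overutilized CPU. */
	if (eas_enabled && !sgs->group_overutilized)
		return false;

	/* No more tasks than CPUs: the group cannot be overloaded. */
	if (sgs->sum_nr_running <= sgs->group_weight)
		return false;

	/* Utilization exceeds capacity once the imbalance margin is applied. */
	return sgs->group_capacity * 100 < sgs->group_util * imbalance_pct;
}

static enum toy_group_type toy_classify(bool eas_enabled,
					unsigned int imbalance_pct,
					const struct toy_sg_stats *sgs)
{
	/* Overloaded outranks misfit, as in the commit message. */
	if (toy_group_is_overloaded(eas_enabled, imbalance_pct, sgs))
		return TOY_GROUP_OVERLOADED;
	if (sgs->group_misfit_task_load)
		return TOY_GROUP_MISFIT_TASK;
	return TOY_GROUP_HAS_SPARE;
}
```

In this model, a group with 3 running tasks on 2 CPUs, capacity 400, utilization 400 and no overutilized CPU classifies as overloaded when `eas_enabled` is false, and so outranks a misfit group. With `eas_enabled` true it no longer does, so the misfit group can be picked as busiest.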

Authored by Vincent Guittot and committed by Peter Zijlstra
d3d663fa 92647580

+13 -5
kernel/sched/fair.c
···
 	unsigned int group_asym_packing;	/* Tasks should be moved to preferred CPU */
 	unsigned int group_smt_balance;		/* Task on busy SMT be moved */
 	unsigned long group_misfit_task_load;	/* A CPU has a task too big for its capacity */
+	unsigned int group_overutilized;	/* At least one CPU is overutilized in the group */
 #ifdef CONFIG_NUMA_BALANCING
 	unsigned int nr_numa_running;
 	unsigned int nr_preferred_running;
···
 static inline bool
 group_is_overloaded(unsigned int imbalance_pct, struct sg_lb_stats *sgs)
 {
+	/*
+	 * With EAS and uclamp, 1 CPU in the group must be overutilized to
+	 * consider the group overloaded.
+	 */
+	if (sched_energy_enabled() && !sgs->group_overutilized)
+		return false;
+
 	if (sgs->sum_nr_running <= sgs->group_weight)
 		return false;

···
  * @group: sched_group whose statistics are to be updated.
  * @sgs: variable to hold the statistics for this group.
  * @sg_overloaded: sched_group is overloaded
- * @sg_overutilized: sched_group is overutilized
  */
 static inline void update_sg_lb_stats(struct lb_env *env,
 				      struct sd_lb_stats *sds,
 				      struct sched_group *group,
 				      struct sg_lb_stats *sgs,
-				      bool *sg_overloaded,
-				      bool *sg_overutilized)
+				      bool *sg_overloaded)
 {
 	int i, nr_running, local_group, sd_flags = env->sd->flags;
 	bool balancing_at_rd = !env->sd->parent;
···
 		sgs->sum_nr_running += nr_running;

 		if (cpu_overutilized(i))
-			*sg_overutilized = 1;
+			sgs->group_overutilized = 1;

 		/*
 		 * No need to call idle_cpu() if nr_running is not 0
···
 				update_group_capacity(env->sd, env->dst_cpu);
 		}

-		update_sg_lb_stats(env, sds, sg, sgs, &sg_overloaded, &sg_overutilized);
+		update_sg_lb_stats(env, sds, sg, sgs, &sg_overloaded);

 		if (!local_group && update_sd_pick_busiest(env, sds, sg, sgs)) {
 			sds->busiest = sg;
 			sds->busiest_stat = *sgs;
 		}
+
+		sg_overutilized |= sgs->group_overutilized;

 		/* Now, start updating sd_lb_stats */
 		sds->total_load += sgs->group_load;
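
The change in data flow can also be sketched outside the kernel: the per-group pass records overutilization in the group's own stats, and the caller ORs it into a domain-wide flag. This is a minimal model, not the kernel code; the `toy_*` names, the flat array of per-CPU flags, and the fixed `cpus_per_group` layout are all assumptions made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the new sg_lb_stats field. */
struct toy_group_stats {
	unsigned int group_overutilized;
};

/* Per-group pass: store the flag in the stats, not in an out-parameter. */
static void toy_update_group(const bool *cpu_overutilized, size_t nr_cpus,
			     struct toy_group_stats *sgs)
{
	sgs->group_overutilized = 0;
	for (size_t i = 0; i < nr_cpus; i++) {
		if (cpu_overutilized[i])
			sgs->group_overutilized = 1;
	}
}

/* Domain-level pass: accumulate each group's flag, as the caller now does. */
static bool toy_domain_overutilized(const bool *cpu_overutilized,
				    size_t nr_groups, size_t cpus_per_group)
{
	bool sg_overutilized = false;

	for (size_t g = 0; g < nr_groups; g++) {
		struct toy_group_stats sgs;

		toy_update_group(cpu_overutilized + g * cpus_per_group,
				 cpus_per_group, &sgs);
		sg_overutilized |= sgs.group_overutilized;
	}
	return sg_overutilized;
}
```

Keeping the flag per group is what lets `group_is_overloaded()` consult it for a single group, while the domain-wide value is still available to the caller.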