Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
sched_ext: Unify regular and core-sched pick task paths

Because the BPF scheduler's dispatch path is invoked from balance(),
sched_ext needs to invoke balance_one() on all sibling rq's before picking
the next task for core-sched.
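
The ordering constraint above can be sketched with a toy model. This is only an illustration, not kernel code: toy_rq, toy_balance_one(), toy_balance_core() and TOY_NR_SIBLINGS are hypothetical names, and the "dispatch" step is reduced to moving one pending task into a local queue.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_NR_SIBLINGS 2	/* hypothetical: two SMT threads per core */

/* Hypothetical stand-in for a per-CPU runqueue. */
struct toy_rq {
	int nr_local;	/* tasks already sitting in the local queue */
	int nr_pending;	/* tasks the scheduler could still dispatch */
	bool balanced;	/* set once the toy balance step has run */
};

/* Toy counterpart of balance_one(): dispatch one task if the local queue is empty. */
static void toy_balance_one(struct toy_rq *rq)
{
	if (rq->nr_local == 0 && rq->nr_pending > 0) {
		rq->nr_pending--;
		rq->nr_local++;
	}
	rq->balanced = true;
}

/* Balance the picking CPU first, then every sibling, before any pick happens. */
static void toy_balance_core(struct toy_rq *core, size_t this_cpu)
{
	toy_balance_one(&core[this_cpu]);
	for (size_t i = 0; i < TOY_NR_SIBLINGS; i++) {
		if (i != this_cpu)
			toy_balance_one(&core[i]);
	}
}

/* A pick only inspects the local queue, so it relies on the balance above. */
static bool toy_has_task(const struct toy_rq *rq)
{
	return rq->balanced && rq->nr_local > 0;
}
```

In this model a sibling that was never balanced reports no task even though one is pending, which is the situation the commit message describes for core-sched.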

Before the recent pick_next_task() updates, sched_ext couldn't share pick
task between regular and core-sched paths because pick_next_task() depended
on put_prev_task() being called on the current task. Tasks currently running
on sibling rq's can't be put when one rq is trying to pick the next task, so
pick_task_scx() had to have a separate mechanism to pick between a sibling
rq's current task and the first task in its local DSQ.

However, with the preceding updates, pick_next_task_scx() no longer depends
on the current task being put and can compare the current task and the next
in line statelessly, so the pick task logic can be shared between the
regular and core-sched paths.
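
A minimal sketch of what "statelessly" could mean here, assuming the balance step records a keep decision (modelled on SCX_RQ_BAL_KEEP) and the pick step only reads it. All toy_* names and TOY_SLICE_DFL are hypothetical, and the real pick_task_scx() handles more cases than this.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_SLICE_DFL 20	/* hypothetical default time slice */

struct toy_task {
	int id;
	unsigned int slice;	/* remaining time slice */
	bool queued;		/* still runnable */
};

struct toy_rq {
	struct toy_task *curr;		/* task currently on the CPU */
	struct toy_task *first_local;	/* head of the local queue, may be NULL */
	bool bal_keep;			/* toy version of SCX_RQ_BAL_KEEP */
};

/* Balance step: decide whether the current task should keep running. */
static void toy_balance(struct toy_rq *rq)
{
	struct toy_task *prev = rq->curr;

	rq->bal_keep = prev && prev->queued && prev->slice > 0;
}

/*
 * Pick step: no put_prev call is required beforehand. The choice between
 * the current task and the head of the local queue depends only on state
 * that balance already recorded, so the same function can serve this CPU
 * and its siblings.
 */
static struct toy_task *toy_pick_task(struct toy_rq *rq)
{
	struct toy_task *p = rq->bal_keep ? rq->curr : rq->first_local;

	if (p && p->slice == 0)
		p->slice = TOY_SLICE_DFL;	/* refill an exhausted slice */
	return p;
}
```

Calling toy_pick_task() twice in a row yields the same task, because it neither dequeues anything nor switches the current task.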

Unify regular and core-sched pick task paths:

- There's no reason to distinguish local and sibling picks anymore. @local
is removed from balance_one().

- pick_next_task_scx() is turned into pick_task_scx() by dropping the
put_prev_set_next_task() call.

- The old pick_task_scx() is dropped.
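
The second bullet can be illustrated as follows. This sketch assumes that a scheduling class which provides only a pick_task callback leaves the actual task switch to generic code; the toy_* types and functions are hypothetical and far simpler than the kernel's.

```c
#include <stddef.h>

struct toy_task {
	int id;
	int on_cpu;	/* 1 while the task is the current one */
};

struct toy_rq {
	struct toy_task *curr;		/* task currently running */
	struct toy_task *next_in_line;	/* queued candidate, may be NULL */
};

/* Toy scheduling class exposing only the side-effect-free pick. */
struct toy_sched_class {
	struct toy_task *(*pick_task)(struct toy_rq *rq);
};

/* Pure selection: returns a candidate and does not switch tasks. */
static struct toy_task *toy_pick_task(struct toy_rq *rq)
{
	return rq->next_in_line ? rq->next_in_line : rq->curr;
}

/* Generic switch step, kept outside the class callback. */
static void toy_put_prev_set_next(struct toy_rq *rq, struct toy_task *prev,
				  struct toy_task *next)
{
	if (prev == next)
		return;
	if (prev)
		prev->on_cpu = 0;
	next->on_cpu = 1;
	rq->curr = next;
	if (rq->next_in_line == next)
		rq->next_in_line = NULL;
}

/* Generic caller: pick through the class, then perform the switch itself. */
static struct toy_task *toy_pick_next_task(const struct toy_sched_class *class,
					   struct toy_rq *rq)
{
	struct toy_task *prev = rq->curr;
	struct toy_task *p = class->pick_task(rq);

	if (p)
		toy_put_prev_set_next(rq, prev, p);
	return p;
}

static const struct toy_sched_class toy_class = {
	.pick_task = toy_pick_task,
};
```

Because the callback has no side effects, a caller may query it for a candidate (as core-sched does for siblings) without committing to a switch.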

Signed-off-by: Tejun Heo <tj@kernel.org>

+11 -67
kernel/sched/ext.c
···
 	dspc->cursor = 0;
 }
 
-static int balance_one(struct rq *rq, struct task_struct *prev, bool local)
+static int balance_one(struct rq *rq, struct task_struct *prev)
 {
 	struct scx_dsp_ctx *dspc = this_cpu_ptr(scx_dsp_ctx);
 	bool prev_on_scx = prev->sched_class == &ext_sched_class;
···
 		/*
 		 * If @prev is runnable & has slice left, it has priority and
 		 * fetching more just increases latency for the fetched tasks.
-		 * Tell pick_next_task_scx() to keep running @prev. If the BPF
+		 * Tell pick_task_scx() to keep running @prev. If the BPF
 		 * scheduler wants to handle this explicitly, it should
 		 * implement ->cpu_release().
 		 *
 		 * See scx_ops_disable_workfn() for the explanation on the
 		 * bypassing test.
-		 *
-		 * When balancing a remote CPU for core-sched, there won't be a
-		 * following put_prev_task_scx() call and we don't own
-		 * %SCX_RQ_BAL_KEEP. Instead, pick_task_scx() will test the same
-		 * conditions later and pick @rq->curr accordingly.
 		 */
 		if ((prev->scx.flags & SCX_TASK_QUEUED) &&
 		    prev->scx.slice && !scx_ops_bypassing()) {
-			if (local)
-				rq->scx.flags |= SCX_RQ_BAL_KEEP;
+			rq->scx.flags |= SCX_RQ_BAL_KEEP;
 			goto has_tasks;
 		}
 	}
···
 	 */
 	if ((prev->scx.flags & SCX_TASK_QUEUED) &&
 	    (!static_branch_unlikely(&scx_ops_enq_last) || scx_ops_bypassing())) {
-		if (local)
-			rq->scx.flags |= SCX_RQ_BAL_KEEP;
+		rq->scx.flags |= SCX_RQ_BAL_KEEP;
 		goto has_tasks;
 	}
 	rq->scx.flags &= ~SCX_RQ_IN_BALANCE;
···
 
 	rq_unpin_lock(rq, rf);
 
-	ret = balance_one(rq, prev, true);
+	ret = balance_one(rq, prev);
 
 #ifdef CONFIG_SCHED_SMT
 	/*
 	 * When core-sched is enabled, this ops.balance() call will be followed
-	 * by put_prev_scx() and pick_task_scx() on this CPU and pick_task_scx()
-	 * on the SMT siblings. Balance the siblings too.
+	 * by pick_task_scx() on this CPU and the SMT siblings. Balance the
+	 * siblings too.
 	 */
 	if (sched_core_enabled(rq)) {
 		const struct cpumask *smt_mask = cpu_smt_mask(cpu_of(rq));
···
 
 			WARN_ON_ONCE(__rq_lockp(rq) != __rq_lockp(srq));
 			update_rq_clock(srq);
-			balance_one(srq, sprev, false);
+			balance_one(srq, sprev);
 		}
 	}
 #endif
···
 					struct task_struct, scx.dsq_list.node);
 }
 
-static struct task_struct *pick_next_task_scx(struct rq *rq,
-					      struct task_struct *prev)
+static struct task_struct *pick_task_scx(struct rq *rq)
 {
+	struct task_struct *prev = rq->curr;
 	struct task_struct *p;
 
 	/*
···
 			p->scx.slice = SCX_SLICE_DFL;
 		}
 	}
-
-	put_prev_set_next_task(rq, prev, p);
 
 	return p;
 }
···
 					   (struct task_struct *)b);
 	else
 		return time_after64(a->scx.core_sched_at, b->scx.core_sched_at);
-}
-
-/**
- * pick_task_scx - Pick a candidate task for core-sched
- * @rq: rq to pick the candidate task from
- *
- * Core-sched calls this function on each SMT sibling to determine the next
- * tasks to run on the SMT siblings. balance_one() has been called on all
- * siblings and put_prev_task_scx() has been called only for the current CPU.
- *
- * As put_prev_task_scx() hasn't been called on remote CPUs, we can't just look
- * at the first task in the local dsq. @rq->curr has to be considered explicitly
- * to mimic %SCX_RQ_BAL_KEEP.
- */
-static struct task_struct *pick_task_scx(struct rq *rq)
-{
-	struct task_struct *curr = rq->curr;
-	struct task_struct *first = first_local_task(rq);
-
-	if (curr->scx.flags & SCX_TASK_QUEUED) {
-		/* is curr the only runnable task? */
-		if (!first)
-			return curr;
-
-		/*
-		 * Does curr trump first? We can always go by core_sched_at for
-		 * this comparison as it represents global FIFO ordering when
-		 * the default core-sched ordering is used and local-DSQ FIFO
-		 * ordering otherwise.
-		 *
-		 * We can have a task with an earlier timestamp on the DSQ. For
-		 * example, when a current task is preempted by a sibling
-		 * picking a different cookie, the task would be requeued at the
-		 * head of the local DSQ with an earlier timestamp than the
-		 * core-sched picked next task. Besides, the BPF scheduler may
-		 * dispatch any tasks to the local DSQ anytime.
-		 */
-		if (curr->scx.slice && time_before64(curr->scx.core_sched_at,
-						     first->scx.core_sched_at))
-			return curr;
-	}
-
-	return first;	/* this may be %NULL */
 }
 #endif	/* CONFIG_SCHED_CORE */
 
···
 	.wakeup_preempt		= wakeup_preempt_scx,
 
 	.balance		= balance_scx,
-	.pick_next_task		= pick_next_task_scx,
+	.pick_task		= pick_task_scx,
 
 	.put_prev_task		= put_prev_task_scx,
 	.set_next_task		= set_next_task_scx,
···
 
 	.rq_online		= rq_online_scx,
 	.rq_offline		= rq_offline_scx,
-#endif
-
-#ifdef CONFIG_SCHED_CORE
-	.pick_task		= pick_task_scx,
 #endif
 
 	.task_tick		= task_tick_scx,