Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

sched: Make class_schedulers avoid pushing current, and get rid of proxy_tag_curr()

With proxy-execution, the scheduler selects the donor, but for
blocked donors, we end up running the lock owner.

This caused some complexity, because the class schedulers make
sure to remove the task they pick from their pushable task
lists, which prevents the donor from being migrated, but there
wasn't then anything to prevent rq->curr from being migrated
if rq->curr != rq->donor.
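
The gap described above can be modeled with a small userspace sketch. This is not kernel code: `toy_task`, `toy_rq`, `toy_pick()` and `toy_curr_can_be_pushed()` are hypothetical names, and the `pushable` flag stands in for membership on a class scheduler's pushable list.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a task; "pushable" models list membership. */
struct toy_task {
	bool pushable;
};

/* Hypothetical stand-in for a runqueue tracking donor and curr. */
struct toy_rq {
	struct toy_task *donor;	/* task the scheduler selected */
	struct toy_task *curr;	/* task that actually runs */
};

/*
 * Model of a pick: only the selected donor is taken off the pushable
 * list. If the donor is blocked, its lock owner runs instead, and
 * nothing here touches the owner's pushable state.
 */
static void toy_pick(struct toy_rq *rq, struct toy_task *donor,
		     struct toy_task *owner)
{
	donor->pushable = false;
	rq->donor = donor;
	rq->curr = owner ? owner : donor;
}

/* True when the running task could still be chosen for migration. */
static bool toy_curr_can_be_pushed(const struct toy_rq *rq)
{
	return rq->curr->pushable;
}
```

In this model, picking an unblocked donor leaves nothing running that is pushable, while picking a blocked donor with a pushable owner leaves `rq->curr` exposed to migration.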

This was sort of hacked around by calling proxy_tag_curr() on
the rq->curr task if we were running something other than the
donor. proxy_tag_curr() did a dequeue/enqueue pair on the
rq->curr task, allowing the class schedulers to remove it from
their pushable list.

The dequeue/enqueue pair was wasteful, and additionally K Prateek
highlighted that we didn't properly undo things when we stopped
proxying, leaving the lock owner off the pushable list.

After some alternative approaches were considered, Peter
suggested having the RT/DL classes simply avoid migrating a
task when task_on_cpu() is true for it.

So rework pick_next_pushable_dl_task() and the rt
pick_next_pushable_task() functions so that they skip over the
first pushable task if it is on_cpu.
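
The skip-if-running selection can be sketched in plain userspace C. This is a simplified model, not the kernel implementation: `toy_task`, `toy_rq` and `toy_pick_next_pushable()` are hypothetical names, a singly linked list stands in for the rbtree (DL) and plist (RT) structures, and an `on_cpu` field stands in for `task_on_cpu()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct task_struct; not the kernel definition. */
struct toy_task {
	int cpu;		/* CPU whose runqueue holds the task */
	bool on_cpu;		/* true while the task is executing */
	struct toy_task *next;	/* next entry on the pushable list */
};

/* Toy stand-in for struct rq with a priority-ordered pushable list. */
struct toy_rq {
	int cpu;
	struct toy_task *pushable;	/* highest priority first */
};

/*
 * Return the first pushable task that is not currently executing,
 * or NULL if the list is empty or every candidate is on a CPU.
 */
static struct toy_task *toy_pick_next_pushable(struct toy_rq *rq)
{
	struct toy_task *i, *p = NULL;

	for (i = rq->pushable; i; i = i->next) {
		/* A running lock owner may be listed under proxy-exec. */
		if (!i->on_cpu) {
			p = i;
			break;
		}
	}

	if (!p)
		return NULL;

	/* Loosely mirrors the sanity check in the real helpers. */
	assert(p->cpu == rq->cpu);
	return p;
}
```

Because the walk starts at the head of a priority-ordered list, the result is still the highest-priority candidate that is safe to migrate, and callers already have to handle a NULL return.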

Then just drop all of the proxy_tag_curr() logic.

Fixes: be39617e38e0 ("sched: Fix proxy/current (push,pull)ability")
Closes: https://lore.kernel.org/lkml/e735cae0-2cc9-4bae-b761-fcb082ed3e94@amd.com/
Reported-by: K Prateek Nayak <kprateek.nayak@amd.com>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: John Stultz <jstultz@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20260324191337.1841376-2-jstultz@google.com

Authored by John Stultz and committed by Peter Zijlstra
e0ca8991 9853914c

+28 -29
-24
kernel/sched/core.c
···
 }
 #endif /* SCHED_PROXY_EXEC */
 
-static inline void proxy_tag_curr(struct rq *rq, struct task_struct *owner)
-{
-	if (!sched_proxy_exec())
-		return;
-	/*
-	 * pick_next_task() calls set_next_task() on the chosen task
-	 * at some point, which ensures it is not push/pullable.
-	 * However, the chosen/donor task *and* the mutex owner form an
-	 * atomic pair wrt push/pull.
-	 *
-	 * Make sure owner we run is not pushable. Unfortunately we can
-	 * only deal with that by means of a dequeue/enqueue cycle. :-/
-	 */
-	dequeue_task(rq, owner, DEQUEUE_NOCLOCK | DEQUEUE_SAVE);
-	enqueue_task(rq, owner, ENQUEUE_NOCLOCK | ENQUEUE_RESTORE);
-}
-
 /*
  * __schedule() is the main scheduler function.
  *
···
 		 */
 		RCU_INIT_POINTER(rq->curr, next);
 
-		if (!task_current_donor(rq, next))
-			proxy_tag_curr(rq, next);
-
 		/*
 		 * The membarrier system call requires each architecture
 		 * to have a full memory barrier after updating
···
 		/* Also unlocks the rq: */
 		rq = context_switch(rq, prev, next, &rf);
 	} else {
-		/* In case next was already curr but just got blocked_donor */
-		if (!task_current_donor(rq, next))
-			proxy_tag_curr(rq, next);
-
 		rq_unpin_lock(rq, &rf);
 		__balance_callbacks(rq, NULL);
 		raw_spin_rq_unlock_irq(rq);
+16 -2
kernel/sched/deadline.c
···
 
 static struct task_struct *pick_next_pushable_dl_task(struct rq *rq)
 {
-	struct task_struct *p;
+	struct task_struct *i, *p = NULL;
+	struct rb_node *next_node;
 
 	if (!has_pushable_dl_tasks(rq))
 		return NULL;
 
-	p = __node_2_pdl(rb_first_cached(&rq->dl.pushable_dl_tasks_root));
+	next_node = rb_first_cached(&rq->dl.pushable_dl_tasks_root);
+	while (next_node) {
+		i = __node_2_pdl(next_node);
+		/* make sure task isn't on_cpu (possible with proxy-exec) */
+		if (!task_on_cpu(rq, i)) {
+			p = i;
+			break;
+		}
+
+		next_node = rb_next(next_node);
+	}
+
+	if (!p)
+		return NULL;
 
 	WARN_ON_ONCE(rq->cpu != task_cpu(p));
 	WARN_ON_ONCE(task_current(rq, p));
+12 -3
kernel/sched/rt.c
···
 
 static struct task_struct *pick_next_pushable_task(struct rq *rq)
 {
-	struct task_struct *p;
+	struct plist_head *head = &rq->rt.pushable_tasks;
+	struct task_struct *i, *p = NULL;
 
 	if (!has_pushable_tasks(rq))
 		return NULL;
 
-	p = plist_first_entry(&rq->rt.pushable_tasks,
-			      struct task_struct, pushable_tasks);
+	plist_for_each_entry(i, head, pushable_tasks) {
+		/* make sure task isn't on_cpu (possible with proxy-exec) */
+		if (!task_on_cpu(rq, i)) {
+			p = i;
+			break;
+		}
+	}
+
+	if (!p)
+		return NULL;
 
 	BUG_ON(rq->cpu != task_cpu(p));
 	BUG_ON(task_current(rq, p));