Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
sched_ext: Fix bypass depth leak on scx_enable() failure

scx_enable() calls scx_bypass(true) to initialize in bypass mode and then
scx_bypass(false) on success to exit. If scx_enable() fails during task
initialization - e.g. scx_cgroup_init() or scx_init_task() returns an error -
it jumps to err_disable while bypass is still active. scx_disable_workfn()
then calls scx_bypass(true/false) for its own bypass, leaving the bypass depth
at 1 instead of 0. This causes the system to remain permanently in bypass mode
after a failed scx_enable().

Failures after task initialization is complete - e.g. scx_tryset_enable_state()
at the end - already call scx_bypass(false) before reaching the error path and
are not affected. This only affects a subset of failure modes.

Fix it by tracking whether scx_enable() called scx_bypass(true) in a bool and
having scx_disable_workfn() call an extra scx_bypass(false) to clear it. This
is a temporary measure as the bypass depth will be moved into the sched
instance, which will make this tracking unnecessary.

Fixes: 8c2090c504e9 ("sched_ext: Initialize in bypass mode")
Cc: stable@vger.kernel.org # v6.12+
Reported-by: Chris Mason <clm@meta.com>
Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>
Link: https://lore.kernel.org/stable/286e6f7787a81239e1ce2989b52391ce%40kernel.org
Signed-off-by: Tejun Heo <tj@kernel.org>

Tejun Heo 9f769637 12b5cd99

+14
kernel/sched/ext.c
···
 static bool scx_switching_all;
 DEFINE_STATIC_KEY_FALSE(__scx_switched_all);

+/*
+ * Tracks whether scx_enable() called scx_bypass(true). Used to balance bypass
+ * depth on enable failure. Will be removed when bypass depth is moved into the
+ * sched instance.
+ */
+static bool scx_bypassed_for_enable;
+
 static atomic_long_t scx_nr_rejected = ATOMIC_LONG_INIT(0);
 static atomic_long_t scx_hotplug_seq = ATOMIC_LONG_INIT(0);
···
 	scx_dsp_max_batch = 0;
 	free_kick_syncs();

+	if (scx_bypassed_for_enable) {
+		scx_bypassed_for_enable = false;
+		scx_bypass(false);
+	}
+
 	mutex_unlock(&scx_enable_mutex);

 	WARN_ON_ONCE(scx_set_enable_state(SCX_DISABLED) != SCX_DISABLING);
···
 	 * Init in bypass mode to guarantee forward progress.
 	 */
 	scx_bypass(true);
+	scx_bypassed_for_enable = true;

 	for (i = SCX_OPI_NORMAL_BEGIN; i < SCX_OPI_NORMAL_END; i++)
 		if (((void (**)(void))ops)[i])
···
 	scx_task_iter_stop(&sti);
 	percpu_up_write(&scx_fork_rwsem);

+	scx_bypassed_for_enable = false;
 	scx_bypass(false);

 	if (!scx_tryset_enable_state(SCX_ENABLED, SCX_ENABLING)) {