.. _sched-ext:

==========================
Extensible Scheduler Class
==========================

sched_ext is a scheduler class whose behavior can be defined by a set of BPF
programs - the BPF scheduler.

* sched_ext exports a full scheduling interface so that any scheduling
  algorithm can be implemented on top.

* The BPF scheduler can group CPUs however it sees fit and schedule them
  together, as tasks aren't tied to specific CPUs at the time of wakeup.

* The BPF scheduler can be turned on and off dynamically anytime.

* The system integrity is maintained no matter what the BPF scheduler does.
  The default scheduling behavior is restored anytime an error is detected,
  a runnable task stalls, or on invoking the SysRq key sequence
  `SysRq-S`.

* When the BPF scheduler triggers an error, debug information is dumped to
  aid debugging. The debug dump is passed to and printed out by the
  scheduler binary. The debug dump can also be accessed through the
  `sched_ext_dump` tracepoint. The SysRq key sequence `SysRq-D`
  triggers a debug dump. This doesn't terminate the BPF scheduler and can
  only be read through the tracepoint.

Switching to and from sched_ext
===============================

``CONFIG_SCHED_CLASS_EXT`` is the config option to enable sched_ext and
``tools/sched_ext`` contains the example schedulers. The following config
options should be enabled to use sched_ext:

.. code-block:: none

   CONFIG_BPF=y
   CONFIG_SCHED_CLASS_EXT=y
   CONFIG_BPF_SYSCALL=y
   CONFIG_BPF_JIT=y
   CONFIG_DEBUG_INFO_BTF=y
   CONFIG_BPF_JIT_ALWAYS_ON=y
   CONFIG_BPF_JIT_DEFAULT_ON=y

sched_ext is used only when the BPF scheduler is loaded and running.

If a task explicitly sets its scheduling policy to ``SCHED_EXT``, it will be
treated as ``SCHED_NORMAL`` and scheduled by the fair-class scheduler until the
BPF scheduler is loaded.


When the BPF scheduler is loaded and ``SCX_OPS_SWITCH_PARTIAL`` is not set
in ``ops->flags``, all ``SCHED_NORMAL``, ``SCHED_BATCH``, ``SCHED_IDLE``, and
``SCHED_EXT`` tasks are scheduled by sched_ext.

However, when the BPF scheduler is loaded and ``SCX_OPS_SWITCH_PARTIAL`` is
set in ``ops->flags``, only tasks with the ``SCHED_EXT`` policy are scheduled
by sched_ext, while tasks with ``SCHED_NORMAL``, ``SCHED_BATCH`` and
``SCHED_IDLE`` policies are scheduled by the fair-class scheduler which has
higher sched_class precedence than ``SCHED_EXT``.

Terminating the sched_ext scheduler program, triggering `SysRq-S`, or
detection of any internal error including stalled runnable tasks aborts the
BPF scheduler and reverts all tasks back to the fair-class scheduler.

.. code-block:: none

   # make -j16 -C tools/sched_ext
   # tools/sched_ext/build/bin/scx_simple
   local=0 global=3
   local=5 global=24
   local=9 global=44
   local=13 global=56
   local=17 global=72
   ^CEXIT: BPF scheduler unregistered

The current status of the BPF scheduler can be determined as follows:

.. code-block:: none

   # cat /sys/kernel/sched_ext/state
   enabled
   # cat /sys/kernel/sched_ext/root/ops
   simple

You can check if any BPF scheduler has ever been loaded since boot by examining
this monotonically incrementing counter (a value of zero indicates that no BPF
scheduler has been loaded):

.. code-block:: none

   # cat /sys/kernel/sched_ext/enable_seq
   1

Each running scheduler also exposes a per-scheduler ``events`` file under
``/sys/kernel/sched_ext/<scheduler-name>/events`` that tracks diagnostic
counters. Each counter occupies one ``name value`` line:

.. code-block:: none

   # cat /sys/kernel/sched_ext/simple/events
   SCX_EV_SELECT_CPU_FALLBACK 0
   SCX_EV_DISPATCH_LOCAL_DSQ_OFFLINE 0
   SCX_EV_DISPATCH_KEEP_LAST 123
   SCX_EV_ENQ_SKIP_EXITING 0
   SCX_EV_ENQ_SKIP_MIGRATION_DISABLED 0
   SCX_EV_REENQ_IMMED 0
   SCX_EV_REENQ_LOCAL_REPEAT 0
   SCX_EV_REFILL_SLICE_DFL 456789
   SCX_EV_BYPASS_DURATION 0
   SCX_EV_BYPASS_DISPATCH 0
   SCX_EV_BYPASS_ACTIVATE 0
   SCX_EV_INSERT_NOT_OWNED 0
   SCX_EV_SUB_BYPASS_DISPATCH 0

The counters are described in ``kernel/sched/ext_internal.h``; briefly:

* ``SCX_EV_SELECT_CPU_FALLBACK``: ops.select_cpu() returned a CPU unusable by
  the task and the core scheduler silently picked a fallback CPU.
* ``SCX_EV_DISPATCH_LOCAL_DSQ_OFFLINE``: a local-DSQ dispatch was redirected
  to the global DSQ because the target CPU went offline.
* ``SCX_EV_DISPATCH_KEEP_LAST``: a task continued running because no other
  task was available (only when ``SCX_OPS_ENQ_LAST`` is not set).
* ``SCX_EV_ENQ_SKIP_EXITING``: an exiting task was dispatched to the local DSQ
  directly, bypassing ops.enqueue() (only when ``SCX_OPS_ENQ_EXITING`` is not
  set).
* ``SCX_EV_ENQ_SKIP_MIGRATION_DISABLED``: a migration-disabled task was
  dispatched to its local DSQ directly (only when
  ``SCX_OPS_ENQ_MIGRATION_DISABLED`` is not set).
* ``SCX_EV_REENQ_IMMED``: a task dispatched with ``SCX_ENQ_IMMED`` was
  re-enqueued because the target CPU was not available for immediate
  execution.
* ``SCX_EV_REENQ_LOCAL_REPEAT``: a reenqueue of the local DSQ triggered
  another reenqueue; recurring counts indicate incorrect ``SCX_ENQ_REENQ``
  handling in the BPF scheduler.
* ``SCX_EV_REFILL_SLICE_DFL``: a task's time slice was refilled with the
  default value (``SCX_SLICE_DFL``).
* ``SCX_EV_BYPASS_DURATION``: total nanoseconds spent in bypass mode.
* ``SCX_EV_BYPASS_DISPATCH``: number of tasks dispatched while in bypass mode.
* ``SCX_EV_BYPASS_ACTIVATE``: number of times bypass mode was activated.
* ``SCX_EV_INSERT_NOT_OWNED``: attempted to insert a task not owned by this
  scheduler into a DSQ; such attempts are silently ignored.
* ``SCX_EV_SUB_BYPASS_DISPATCH``: tasks dispatched from sub-scheduler bypass
  DSQs (only relevant with ``CONFIG_EXT_SUB_SCHED``).

``tools/sched_ext/scx_show_state.py`` is a drgn script which shows more
detailed information:

.. code-block:: none

   # tools/sched_ext/scx_show_state.py
   ops           : simple
   enabled       : 1
   switching_all : 1
   switched_all  : 1
   enable_state  : enabled (2)
   bypass_depth  : 0
   nr_rejected   : 0
   enable_seq    : 1

Whether a given task is on sched_ext can be determined as follows:

.. code-block:: none

   # grep ext /proc/self/sched
   ext.enabled : 1

The Basics
==========

Userspace can implement an arbitrary BPF scheduler by loading a set of BPF
programs that implement ``struct sched_ext_ops``. The only mandatory field
is ``ops.name`` which must be a valid BPF object name. All operations are
optional. The following is a modified excerpt from
``tools/sched_ext/scx_simple.bpf.c``, showing a minimal global FIFO scheduler.

.. code-block:: c

   /*
    * Decide which CPU a task should be migrated to before being
    * enqueued (either at wakeup, fork time, or exec time). If an
    * idle core is found by the default ops.select_cpu() implementation,
    * then insert the task directly into SCX_DSQ_LOCAL and skip the
    * ops.enqueue() callback.
    *
    * Note that this implementation has exactly the same behavior as the
    * default ops.select_cpu implementation. The behavior of the scheduler
    * would be exactly the same if the implementation just didn't define
    * the simple_select_cpu() struct_ops prog.
    */
   s32 BPF_STRUCT_OPS(simple_select_cpu, struct task_struct *p,
                      s32 prev_cpu, u64 wake_flags)
   {
           s32 cpu;
           /* Need to initialize or the BPF verifier will reject the program */
           bool direct = false;

           cpu = scx_bpf_select_cpu_dfl(p, prev_cpu, wake_flags, &direct);

           if (direct)
                   scx_bpf_dsq_insert(p, SCX_DSQ_LOCAL, SCX_SLICE_DFL, 0);

           return cpu;
   }

   /*
    * Do a direct insertion of a task to the global DSQ. This ops.enqueue()
    * callback will only be invoked if we failed to find a core to insert
    * into in ops.select_cpu() above.
    *
    * Note that this implementation has exactly the same behavior as the
    * default ops.enqueue implementation, which just dispatches the task
    * to SCX_DSQ_GLOBAL. The behavior of the scheduler would be exactly
    * the same if the implementation just didn't define the simple_enqueue
    * struct_ops prog.
    */
   void BPF_STRUCT_OPS(simple_enqueue, struct task_struct *p, u64 enq_flags)
   {
           scx_bpf_dsq_insert(p, SCX_DSQ_GLOBAL, SCX_SLICE_DFL, enq_flags);
   }

   s32 BPF_STRUCT_OPS_SLEEPABLE(simple_init)
   {
           /*
            * By default, all SCHED_EXT, SCHED_OTHER, SCHED_IDLE, and
            * SCHED_BATCH tasks should use sched_ext.
            */
           return 0;
   }

   void BPF_STRUCT_OPS(simple_exit, struct scx_exit_info *ei)
   {
           exit_type = ei->type;
   }

   SEC(".struct_ops")
   struct sched_ext_ops simple_ops = {
           .select_cpu     = (void *)simple_select_cpu,
           .enqueue        = (void *)simple_enqueue,
           .init           = (void *)simple_init,
           .exit           = (void *)simple_exit,
           .name           = "simple",
   };

Dispatch Queues
---------------

To match the impedance between the scheduler core and the BPF scheduler,
sched_ext uses DSQs (dispatch queues) which can operate as both a FIFO and a
priority queue. By default, there is one global FIFO (``SCX_DSQ_GLOBAL``),
and one local DSQ per CPU (``SCX_DSQ_LOCAL``).
The BPF scheduler can manage
an arbitrary number of DSQs using ``scx_bpf_create_dsq()`` and
``scx_bpf_destroy_dsq()``.

A CPU always executes a task from its local DSQ. A task is "inserted" into a
DSQ. A task in a non-local DSQ is "move"d into the target CPU's local DSQ.

When a CPU is looking for the next task to run, if the local DSQ is not
empty, the first task is picked. Otherwise, the CPU tries to move a task
from the global DSQ. If that doesn't yield a runnable task either,
``ops.dispatch()`` is invoked.

Scheduling Cycle
----------------

The following briefly shows how a waking task is scheduled and executed.

1. When a task is waking up, ``ops.select_cpu()`` is the first operation
   invoked. This serves two purposes. First, it provides a CPU selection
   optimization hint. Second, it wakes up the selected CPU if it is idle.

   The CPU selected by ``ops.select_cpu()`` is an optimization hint and not
   binding. The actual decision is made at the last step of scheduling.
   However, there is a small performance gain if the CPU
   ``ops.select_cpu()`` returns matches the CPU the task eventually runs on.

   A side-effect of selecting a CPU is waking it up from idle. While a BPF
   scheduler can wake up any CPU using the ``scx_bpf_kick_cpu()`` helper,
   using ``ops.select_cpu()`` judiciously can be simpler and more efficient.

   Note that the scheduler core will ignore an invalid CPU selection, for
   example, if it's outside the allowed cpumask of the task.

   A task can be immediately inserted into a DSQ from ``ops.select_cpu()``
   by calling ``scx_bpf_dsq_insert()`` or ``scx_bpf_dsq_insert_vtime()``.

   If the task is inserted into ``SCX_DSQ_LOCAL`` from
   ``ops.select_cpu()``, it will be added to the local DSQ of whichever CPU
   is returned from ``ops.select_cpu()``. Additionally, inserting directly
   from ``ops.select_cpu()`` will cause the ``ops.enqueue()`` callback to
   be skipped.


   Any other attempt to store a task in BPF-internal data structures from
   ``ops.select_cpu()`` does not prevent ``ops.enqueue()`` from being
   invoked. This is discouraged, as it can introduce racy behavior or
   inconsistent state.

2. Once the target CPU is selected, ``ops.enqueue()`` is invoked (unless the
   task was inserted directly from ``ops.select_cpu()``). ``ops.enqueue()``
   can make one of the following decisions:

   * Immediately insert the task into either the global or a local DSQ by
     calling ``scx_bpf_dsq_insert()`` with one of the following options:
     ``SCX_DSQ_GLOBAL``, ``SCX_DSQ_LOCAL``, or ``SCX_DSQ_LOCAL_ON | cpu``.

   * Immediately insert the task into a custom DSQ by calling
     ``scx_bpf_dsq_insert()`` with a DSQ ID which is smaller than 2^63.

   * Queue the task on the BPF side.

   **Task State Tracking and ops.dequeue() Semantics**

   A task is in the "BPF scheduler's custody" when the BPF scheduler is
   responsible for managing its lifecycle. A task enters custody when it is
   dispatched to a user DSQ or stored in the BPF scheduler's internal data
   structures. Custody is entered only from ``ops.enqueue()`` for those
   operations. The only exception is dispatching to a user DSQ from
   ``ops.select_cpu()``: although the task is not yet technically in BPF
   scheduler custody at that point, the dispatch has the same semantic
   effect as dispatching from ``ops.enqueue()`` for custody-related
   purposes.

   Once ``ops.enqueue()`` is called, the task may or may not enter custody
   depending on what the scheduler does:

   * **Directly dispatched to terminal DSQs** (``SCX_DSQ_LOCAL``,
     ``SCX_DSQ_LOCAL_ON | cpu``, or ``SCX_DSQ_GLOBAL``): the BPF scheduler
     is done with the task - it either goes straight to a CPU's local run
     queue or to the global DSQ as a fallback. The task never enters (or
     exits) BPF custody, and ``ops.dequeue()`` will not be called.


   * **Dispatch to user-created DSQs** (custom DSQs): the task enters the
     BPF scheduler's custody. When the task later leaves BPF custody
     (dispatched to a terminal DSQ, picked by core-sched, or dequeued for
     sleep/property changes), ``ops.dequeue()`` will be called exactly
     once.

   * **Stored in BPF data structures** (e.g., internal BPF queues): the
     task is in BPF custody. ``ops.dequeue()`` will be called when it
     leaves (e.g., when ``ops.dispatch()`` moves it to a terminal DSQ, or
     on property change / sleep).

   When a task leaves BPF scheduler custody, ``ops.dequeue()`` is invoked.
   The dequeue can happen for different reasons, distinguished by flags:

   1. **Regular dispatch**: when a task in BPF custody is dispatched to a
      terminal DSQ from ``ops.dispatch()`` (leaving BPF custody for
      execution), ``ops.dequeue()`` is triggered without any special flags.

   2. **Core scheduling pick**: when ``CONFIG_SCHED_CORE`` is enabled and
      core scheduling picks a task for execution while it's still in BPF
      custody, ``ops.dequeue()`` is called with the
      ``SCX_DEQ_CORE_SCHED_EXEC`` flag.

   3. **Scheduling property change**: when a task property changes (via
      operations like ``sched_setaffinity()``, ``sched_setscheduler()``,
      priority changes, CPU migrations, etc.) while the task is still in
      BPF custody, ``ops.dequeue()`` is called with the
      ``SCX_DEQ_SCHED_CHANGE`` flag set in ``deq_flags``.

   **Important**: Once a task has left BPF custody (e.g., after being
   dispatched to a terminal DSQ), property changes will not trigger
   ``ops.dequeue()``, since the task is no longer managed by the BPF
   scheduler.

3. When a CPU is ready to schedule, it first looks at its local DSQ. If
   empty, it then looks at the global DSQ. If there still isn't a task to
   run, ``ops.dispatch()`` is invoked which can use the following two
   functions to populate the local DSQ.


   * ``scx_bpf_dsq_insert()`` inserts a task to a DSQ. Any target DSQ can be
     used - ``SCX_DSQ_LOCAL``, ``SCX_DSQ_LOCAL_ON | cpu``,
     ``SCX_DSQ_GLOBAL`` or a custom DSQ. While ``scx_bpf_dsq_insert()``
     currently can't be called with BPF locks held, this is being worked on
     and will be supported. ``scx_bpf_dsq_insert()`` schedules insertions
     rather than performing them immediately. There can be up to
     ``ops.dispatch_max_batch`` pending tasks.

   * ``scx_bpf_dsq_move_to_local()`` moves a task from the specified
     non-local DSQ to the dispatching DSQ. This function cannot be called
     with any BPF locks held. ``scx_bpf_dsq_move_to_local()`` flushes the
     pending insertions before trying to move from the specified DSQ.

4. After ``ops.dispatch()`` returns, if there are tasks in the local DSQ,
   the CPU runs the first one. If empty, the following steps are taken:

   * Try to move from the global DSQ. If successful, run the task.

   * If ``ops.dispatch()`` has dispatched any tasks, retry #3.

   * If the previous task is an SCX task and still runnable, keep executing
     it (see ``SCX_OPS_ENQ_LAST``).

   * Go idle.

Note that the BPF scheduler can always choose to dispatch tasks immediately
in ``ops.enqueue()`` as illustrated in the above simple example. If only the
built-in DSQs are used, there is no need to implement ``ops.dispatch()`` as
a task is never queued on the BPF scheduler and both the local and global
DSQs are executed automatically.

``scx_bpf_dsq_insert()`` inserts the task on the FIFO of the target DSQ. Use
``scx_bpf_dsq_insert_vtime()`` for the priority queue. Internal DSQs such as
``SCX_DSQ_LOCAL`` and ``SCX_DSQ_GLOBAL`` do not support priority-queue
dispatching, and must be dispatched to with ``scx_bpf_dsq_insert()``. See
the function documentation and usage in ``tools/sched_ext/scx_simple.bpf.c``
for more information.


Task Lifecycle
--------------

The following pseudo-code presents a rough overview of the entire lifecycle
of a task managed by a sched_ext scheduler:

.. code-block:: c

   ops.init_task();             /* A new task is created */
   ops.enable();                /* Enable BPF scheduling for the task */

   while (task in SCHED_EXT) {
           if (task can migrate)
                   ops.select_cpu();    /* Called on wakeup (optimization) */

           ops.runnable();              /* Task becomes ready to run */

           while (task_is_runnable(task)) {
                   if (task is not in a DSQ || task->scx.slice == 0) {
                           ops.enqueue();       /* Task can be added to a DSQ */

                           /* Task property change (i.e., affinity, nice, etc.)? */
                           if (sched_change(task)) {
                                   ops.dequeue();   /* Exiting BPF scheduler custody */
                                   ops.quiescent();

                                   /* Property change callback, e.g. ops.set_weight() */

                                   ops.runnable();
                                   continue;
                           }

                           /* Any usable CPU becomes available */

                           ops.dispatch();      /* Task is moved to a local DSQ */
                           ops.dequeue();       /* Exiting BPF scheduler custody */
                   }

                   ops.running();       /* Task starts running on its assigned CPU */

                   while (task_is_runnable(task) && task->scx.slice > 0) {
                           ops.tick();          /* Called every 1/HZ seconds */

                           if (task->scx.slice == 0)
                                   ops.dispatch();  /* task->scx.slice can be refilled */
                   }

                   ops.stopping();      /* Task stops running (time slice expires or wait) */
           }

           ops.quiescent();     /* Task releases its assigned CPU (wait) */
   }

   ops.disable();               /* Disable BPF scheduling for the task */
   ops.exit_task();             /* Task is destroyed */

Note that the above pseudo-code does not cover all possible state transitions
and edge cases, to name a few examples:

* ``ops.dispatch()`` may fail to move the task to a local DSQ due to a racing
  property change on that task, in which case ``ops.dispatch()`` will be
  retried.


* The task may be direct-dispatched to a local DSQ from ``ops.enqueue()``,
  in which case ``ops.dispatch()`` and ``ops.dequeue()`` are skipped and we
  go straight to ``ops.running()``.

* Property changes may occur at virtually any point during the task's
  lifecycle, not just when the task is queued and waiting to be dispatched.
  For example, changing a property of a running task will lead to the
  callback sequence ``ops.stopping()`` -> ``ops.quiescent()`` ->
  (property change callback) -> ``ops.runnable()`` -> ``ops.running()``.

* A sched_ext task can be preempted by a task from a higher-priority
  scheduling class, in which case it will exit the tick-dispatch loop even
  though it is runnable and has a non-zero slice.

See the "Scheduling Cycle" section for a more detailed description of how
a freshly woken up task gets on a CPU.

Where to Look
=============

* ``include/linux/sched/ext.h`` defines the core data structures, ops table
  and constants.

* ``kernel/sched/ext.c`` contains sched_ext core implementation and helpers.
  The functions prefixed with ``scx_bpf_`` can be called from the BPF
  scheduler.

* ``kernel/sched/ext_idle.c`` contains the built-in idle CPU selection
  policy.

* ``tools/sched_ext/`` hosts example BPF scheduler implementations.

  * ``scx_simple[.bpf].c``: Minimal global FIFO scheduler example using a
    custom DSQ.

  * ``scx_qmap[.bpf].c``: A multi-level FIFO scheduler supporting five
    levels of priority implemented with ``BPF_MAP_TYPE_QUEUE``.

  * ``scx_central[.bpf].c``: A central FIFO scheduler where all scheduling
    decisions are made on one CPU, demonstrating ``LOCAL_ON`` dispatching,
    tickless operation, and kthread preemption.

  * ``scx_cpu0[.bpf].c``: A scheduler that queues all tasks to a shared DSQ
    and only dispatches them on CPU0 in FIFO order. Useful for testing
    bypass behavior.


  * ``scx_flatcg[.bpf].c``: A flattened cgroup hierarchy scheduler
    implementing hierarchical weight-based cgroup CPU control by
    compounding each cgroup's share at every level into a single flat
    scheduling layer.

  * ``scx_pair[.bpf].c``: A core-scheduling example that always makes
    sibling CPU pairs execute tasks from the same CPU cgroup.

  * ``scx_sdt[.bpf].c``: A variation of ``scx_simple`` demonstrating BPF
    arena memory management for per-task data.

  * ``scx_userland[.bpf].c``: A minimal scheduler demonstrating user space
    scheduling. Tasks with CPU affinity are direct-dispatched in FIFO
    order; all others are scheduled in user space by a simple vruntime
    scheduler.

Module Parameters
=================

sched_ext exposes two module parameters under the ``sched_ext.`` prefix that
control bypass-mode behaviour. These knobs are primarily for debugging; there
is usually no reason to change them during normal operation. They can be read
and written at runtime (mode 0600) via
``/sys/module/sched_ext/parameters/``.

``sched_ext.slice_bypass_us`` (default: 5000 µs)
    The time slice assigned to all tasks when the scheduler is in bypass
    mode, i.e. during BPF scheduler load, unload, and error recovery. Valid
    range is 100 µs to 100 ms.

``sched_ext.bypass_lb_intv_us`` (default: 500000 µs)
    The interval at which the bypass-mode load balancer redistributes tasks
    across CPUs. Set to 0 to disable load balancing during bypass mode.
    Valid range is 0 to 10 s.

ABI Instability
===============

The APIs provided by sched_ext to BPF scheduler programs have no stability
guarantees. This includes the ops table callbacks and constants defined in
``include/linux/sched/ext.h``, as well as the ``scx_bpf_`` kfuncs defined in
``kernel/sched/ext.c`` and ``kernel/sched/ext_idle.c``.


While we will attempt to provide a relatively stable API surface when
possible, these interfaces are subject to change without warning between
kernel versions.