Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge tag 'sched-core-2022-03-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull scheduler updates from Ingo Molnar:

- Cleanups for SCHED_DEADLINE

- Tracing updates/fixes

- CPU Accounting fixes

- First wave of changes to optimize the overhead of the scheduler
build, from the fast-headers tree - including placeholder *_api.h
headers for later header split-ups.

- Preempt-dynamic using static_branch() for ARM64

- Isolation housekeeping mask rework; preparatory for further changes

- NUMA-balancing: deal with CPU-less nodes

- NUMA-balancing: tune systems that have multiple LLC cache domains per
node (eg. AMD)

- Updates to RSEQ UAPI in preparation for glibc usage

- Lots of RSEQ/selftests, for same

- Add Suren as PSI co-maintainer

* tag 'sched-core-2022-03-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (81 commits)
sched/headers: ARM needs asm/paravirt_api_clock.h too
sched/numa: Fix boot crash on arm64 systems
headers/prep: Fix header to build standalone: <linux/psi.h>
sched/headers: Only include <linux/entry-common.h> when CONFIG_GENERIC_ENTRY=y
cgroup: Fix suspicious rcu_dereference_check() usage warning
sched/preempt: Tell about PREEMPT_DYNAMIC on kernel headers
sched/topology: Remove redundant variable and fix incorrect type in build_sched_domains
sched/deadline,rt: Remove unused parameter from pick_next_[rt|dl]_entity()
sched/deadline,rt: Remove unused functions for !CONFIG_SMP
sched/deadline: Use __node_2_[pdl|dle]() and rb_first_cached() consistently
sched/deadline: Merge dl_task_can_attach() and dl_cpu_busy()
sched/deadline: Move bandwidth mgmt and reclaim functions into sched class source file
sched/deadline: Remove unused def_dl_bandwidth
sched/tracing: Report TASK_RTLOCK_WAIT tasks as TASK_UNINTERRUPTIBLE
sched/tracing: Don't re-read p->state when emitting sched_switch event
sched/rt: Plug rt_mutex_setprio() vs push_rt_task() race
sched/cpuacct: Remove redundant RCU read lock
sched/cpuacct: Optimize away RCU read lock
sched/cpuacct: Fix charge percpu cpuusage
sched/headers: Reorganize, clean up and optimize kernel/sched/sched.h dependencies
...

+2350 -1312
+1 -45
Documentation/admin-guide/sysctl/kernel.rst
···
 The unmapping of pages and trapping faults incur additional overhead that
 ideally is offset by improved memory locality but there is no universal
 guarantee. If the target workload is already bound to NUMA nodes then this
-feature should be disabled. Otherwise, if the system overhead from the
-feature is too high then the rate the kernel samples for NUMA hinting
-faults may be controlled by the `numa_balancing_scan_period_min_ms,
-numa_balancing_scan_delay_ms, numa_balancing_scan_period_max_ms,
-numa_balancing_scan_size_mb`_, and numa_balancing_settle_count sysctls.
-
-
-numa_balancing_scan_period_min_ms, numa_balancing_scan_delay_ms, numa_balancing_scan_period_max_ms, numa_balancing_scan_size_mb
-===============================================================================================================================
-
-
-Automatic NUMA balancing scans tasks address space and unmaps pages to
-detect if pages are properly placed or if the data should be migrated to a
-memory node local to where the task is running. Every "scan delay" the task
-scans the next "scan size" number of pages in its address space. When the
-end of the address space is reached the scanner restarts from the beginning.
-
-In combination, the "scan delay" and "scan size" determine the scan rate.
-When "scan delay" decreases, the scan rate increases. The scan delay and
-hence the scan rate of every task is adaptive and depends on historical
-behaviour. If pages are properly placed then the scan delay increases,
-otherwise the scan delay decreases. The "scan size" is not adaptive but
-the higher the "scan size", the higher the scan rate.
-
-Higher scan rates incur higher system overhead as page faults must be
-trapped and potentially data must be migrated. However, the higher the scan
-rate, the more quickly a tasks memory is migrated to a local node if the
-workload pattern changes and minimises performance impact due to remote
-memory accesses. These sysctls control the thresholds for scan delays and
-the number of pages scanned.
-
-``numa_balancing_scan_period_min_ms`` is the minimum time in milliseconds to
-scan a tasks virtual memory. It effectively controls the maximum scanning
-rate for each task.
-
-``numa_balancing_scan_delay_ms`` is the starting "scan delay" used for a task
-when it initially forks.
-
-``numa_balancing_scan_period_max_ms`` is the maximum time in milliseconds to
-scan a tasks virtual memory. It effectively controls the minimum scanning
-rate for each task.
-
-``numa_balancing_scan_size_mb`` is how many megabytes worth of pages are
-scanned for a given scan.
-
+feature should be disabled.
 
 oops_all_cpu_backtrace
 ======================
+1
Documentation/scheduler/index.rst
··· 18 18 sched-nice-design 19 19 sched-rt-group 20 20 sched-stats 21 + sched-debug 21 22 22 23 text_files 23 24
+54
Documentation/scheduler/sched-debug.rst
···
+=================
+Scheduler debugfs
+=================
+
+Booting a kernel with CONFIG_SCHED_DEBUG=y will give access to
+scheduler specific debug files under /sys/kernel/debug/sched. Some of
+those files are described below.
+
+numa_balancing
+==============
+
+`numa_balancing` directory is used to hold files to control NUMA
+balancing feature. If the system overhead from the feature is too
+high then the rate the kernel samples for NUMA hinting faults may be
+controlled by the `scan_period_min_ms, scan_delay_ms,
+scan_period_max_ms, scan_size_mb` files.
+
+
+scan_period_min_ms, scan_delay_ms, scan_period_max_ms, scan_size_mb
+-------------------------------------------------------------------
+
+Automatic NUMA balancing scans tasks address space and unmaps pages to
+detect if pages are properly placed or if the data should be migrated to a
+memory node local to where the task is running. Every "scan delay" the task
+scans the next "scan size" number of pages in its address space. When the
+end of the address space is reached the scanner restarts from the beginning.
+
+In combination, the "scan delay" and "scan size" determine the scan rate.
+When "scan delay" decreases, the scan rate increases. The scan delay and
+hence the scan rate of every task is adaptive and depends on historical
+behaviour. If pages are properly placed then the scan delay increases,
+otherwise the scan delay decreases. The "scan size" is not adaptive but
+the higher the "scan size", the higher the scan rate.
+
+Higher scan rates incur higher system overhead as page faults must be
+trapped and potentially data must be migrated. However, the higher the scan
+rate, the more quickly a tasks memory is migrated to a local node if the
+workload pattern changes and minimises performance impact due to remote
+memory accesses. These files control the thresholds for scan delays and
+the number of pages scanned.
+
+``scan_period_min_ms`` is the minimum time in milliseconds to scan a
+tasks virtual memory. It effectively controls the maximum scanning
+rate for each task.
+
+``scan_delay_ms`` is the starting "scan delay" used for a task when it
+initially forks.
+
+``scan_period_max_ms`` is the maximum time in milliseconds to scan a
+tasks virtual memory. It effectively controls the minimum scanning
+rate for each task.
+
+``scan_size_mb`` is how many megabytes worth of pages are scanned for
+a given scan.
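The relationship between "scan delay" and "scan size" described in sched-debug.rst can be made concrete with a little arithmetic. The following is a userspace sketch only, not kernel code: the helper name is hypothetical, and it models the simplified rule from the text (one "scan size" chunk per "scan delay"), whereas the kernel adapts the effective delay per task.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Illustrative model: if a task scans scan_size_mb megabytes once every
 * scan_delay_ms milliseconds, this is its scan rate in MB per second
 * (rounded down). Hypothetical helper, not a kernel function.
 */
static uint64_t model_scan_rate_mb_per_sec(uint64_t scan_size_mb,
					   uint64_t scan_delay_ms)
{
	assert(scan_delay_ms > 0);	/* a zero delay has no defined rate */
	return scan_size_mb * 1000 / scan_delay_ms;
}
```

As an example with made-up inputs, a 256 MB scan size gives 256 MB/s at a 1000 ms delay and about 4 MB/s at a 60000 ms delay, which is why the minimum period bounds the maximum scan rate and the maximum period bounds the minimum rate.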
+1
MAINTAINERS
··· 15566 15566 15567 15567 PRESSURE STALL INFORMATION (PSI) 15568 15568 M: Johannes Weiner <hannes@cmpxchg.org> 15569 + M: Suren Baghdasaryan <surenb@google.com> 15569 15570 S: Maintained 15570 15571 F: include/linux/psi* 15571 15572 F: kernel/sched/psi.c
+33 -4
arch/Kconfig
···
 
 config HAVE_PREEMPT_DYNAMIC
 	bool
+
+config HAVE_PREEMPT_DYNAMIC_CALL
+	bool
 	depends on HAVE_STATIC_CALL
-	depends on GENERIC_ENTRY
+	select HAVE_PREEMPT_DYNAMIC
 	help
-	  Select this if the architecture support boot time preempt setting
-	  on top of static calls. It is strongly advised to support inline
-	  static call to avoid any overhead.
+	  An architecture should select this if it can handle the preemption
+	  model being selected at boot time using static calls.
+
+	  Where an architecture selects HAVE_STATIC_CALL_INLINE, any call to a
+	  preemption function will be patched directly.
+
+	  Where an architecture does not select HAVE_STATIC_CALL_INLINE, any
+	  call to a preemption function will go through a trampoline, and the
+	  trampoline will be patched.
+
+	  It is strongly advised to support inline static call to avoid any
+	  overhead.
+
+config HAVE_PREEMPT_DYNAMIC_KEY
+	bool
+	depends on HAVE_ARCH_JUMP_LABEL && CC_HAS_ASM_GOTO
+	select HAVE_PREEMPT_DYNAMIC
+	help
+	  An architecture should select this if it can handle the preemption
+	  model being selected at boot time using static keys.
+
+	  Each preemption function will be given an early return based on a
+	  static key. This should have slightly lower overhead than non-inline
+	  static calls, as this effectively inlines each trampoline into the
+	  start of its callee. This may avoid redundant work, and may
+	  integrate better with CFI schemes.
+
+	  This will have greater overhead than using inline static calls as
+	  the call to the preemption function cannot be entirely elided.
 
 config ARCH_WANT_LD_ORPHAN_WARN
 	bool
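The difference between the two Kconfig options above (a patched call target versus an early return guarded by a key) can be sketched in plain userspace C. This is only an analogy under stated assumptions: a function pointer stands in for a static call and a `bool` stands in for a static key, whereas the kernel patches instructions in place. All names below are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

static int resched_calls;	/* counts how often the "real" function ran */

/* Stand-in for a real preemption function. */
static int model_real_cond_resched(void)
{
	resched_calls++;
	return 1;
}

/* Stand-in for the "disabled" target of a static call. */
static int model_noop_cond_resched(void)
{
	return 0;
}

/* Static-call flavour: every caller goes through one retargetable call. */
static int (*cond_resched_target)(void) = model_real_cond_resched;

static int call_style_cond_resched(void)
{
	assert(cond_resched_target != 0);
	return cond_resched_target();
}

/* Static-key flavour: the callee is always entered but may return early. */
static bool cond_resched_key = true;

static int key_style_cond_resched(void)
{
	if (!cond_resched_key)
		return 0;	/* early return: the call itself is not elided */
	return model_real_cond_resched();
}

/* Model of choosing the preemption behaviour once, at boot. */
static void model_set_dynamic(bool enabled)
{
	cond_resched_target = enabled ? model_real_cond_resched
				      : model_noop_cond_resched;
	cond_resched_key = enabled;
}
```

In this model the key-style function is still called when disabled, which mirrors the help text's point that the call "cannot be entirely elided", while an inline static call could remove it altogether.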
+1
arch/arm/include/asm/paravirt_api_clock.h
··· 1 + #include <asm/paravirt.h>
+1
arch/arm64/Kconfig
··· 194 194 select HAVE_PERF_EVENTS 195 195 select HAVE_PERF_REGS 196 196 select HAVE_PERF_USER_STACK_DUMP 197 + select HAVE_PREEMPT_DYNAMIC_KEY 197 198 select HAVE_REGS_AND_STACK_ACCESS_API 198 199 select HAVE_POSIX_CPU_TIMERS_TASK_WORK 199 200 select HAVE_FUNCTION_ARG_ACCESS_API
+1
arch/arm64/include/asm/paravirt_api_clock.h
··· 1 + #include <asm/paravirt.h>
+17 -2
arch/arm64/include/asm/preempt.h
···
 #ifndef __ASM_PREEMPT_H
 #define __ASM_PREEMPT_H
 
+#include <linux/jump_label.h>
 #include <linux/thread_info.h>
 
 #define PREEMPT_NEED_RESCHED BIT(32)
···
 }
 
 #ifdef CONFIG_PREEMPTION
+
 void preempt_schedule(void);
-#define __preempt_schedule() preempt_schedule()
 void preempt_schedule_notrace(void);
-#define __preempt_schedule_notrace() preempt_schedule_notrace()
+
+#ifdef CONFIG_PREEMPT_DYNAMIC
+
+DECLARE_STATIC_KEY_TRUE(sk_dynamic_irqentry_exit_cond_resched);
+void dynamic_preempt_schedule(void);
+#define __preempt_schedule() dynamic_preempt_schedule()
+void dynamic_preempt_schedule_notrace(void);
+#define __preempt_schedule_notrace() dynamic_preempt_schedule_notrace()
+
+#else /* CONFIG_PREEMPT_DYNAMIC */
+
+#define __preempt_schedule() preempt_schedule()
+#define __preempt_schedule_notrace() preempt_schedule_notrace()
+
+#endif /* CONFIG_PREEMPT_DYNAMIC */
 #endif /* CONFIG_PREEMPTION */
 
 #endif /* __ASM_PREEMPT_H */
+19 -9
arch/arm64/kernel/entry-common.c
···
 	lockdep_hardirqs_on(CALLER_ADDR0);
 }
 
+#ifdef CONFIG_PREEMPT_DYNAMIC
+DEFINE_STATIC_KEY_TRUE(sk_dynamic_irqentry_exit_cond_resched);
+#define need_irq_preemption() \
+	(static_branch_unlikely(&sk_dynamic_irqentry_exit_cond_resched))
+#else
+#define need_irq_preemption() (IS_ENABLED(CONFIG_PREEMPTION))
+#endif
+
 static void __sched arm64_preempt_schedule_irq(void)
 {
-	lockdep_assert_irqs_disabled();
+	if (!need_irq_preemption())
+		return;
+
+	/*
+	 * Note: thread_info::preempt_count includes both thread_info::count
+	 * and thread_info::need_resched, and is not equivalent to
+	 * preempt_count().
+	 */
+	if (READ_ONCE(current_thread_info()->preempt_count) != 0)
+		return;
 
 	/*
 	 * DAIF.DA are cleared at the start of IRQ/FIQ handling, and when GIC
···
 	do_interrupt_handler(regs, handler);
 	irq_exit_rcu();
 
-	/*
-	 * Note: thread_info::preempt_count includes both thread_info::count
-	 * and thread_info::need_resched, and is not equivalent to
-	 * preempt_count().
-	 */
-	if (IS_ENABLED(CONFIG_PREEMPTION) &&
-	    READ_ONCE(current_thread_info()->preempt_count) == 0)
-		arm64_preempt_schedule_irq();
+	arm64_preempt_schedule_irq();
 
 	exit_to_kernel_mode(regs);
 }
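The comment in this hunk says the 64-bit `thread_info::preempt_count` combines a count and a need_resched field, so a single comparison against zero answers "may we preempt now?". A minimal userspace model of that idea follows. It assumes, as on arm64, that the count occupies the low 32 bits and that the flag at bit 32 (the position of `PREEMPT_NEED_RESCHED` shown above) is inverted, meaning a set bit stands for "no reschedule needed". The helper names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: bit 32 set means "no reschedule needed" (inverted flag). */
#define MODEL_NO_RESCHED_BIT	(UINT64_C(1) << 32)

/* Low 32 bits only: what a preempt_count()-style accessor would report. */
static uint32_t model_count(uint64_t word)
{
	return (uint32_t)word;
}

/* Build the combined word from a nesting count and a pending-resched flag. */
static uint64_t model_make(uint32_t count, bool resched_pending)
{
	uint64_t word = (uint64_t)count |
			(resched_pending ? 0 : MODEL_NO_RESCHED_BIT);

	assert(model_count(word) == count);	/* flag sits above the count */
	return word;
}

/* Preempt only when the count is zero AND a reschedule is pending. */
static bool model_should_preempt(uint64_t word)
{
	return word == 0;
}
```

Under these assumptions the whole word is zero only when both conditions hold, which is why the hunk tests the full 64-bit field rather than the 32-bit count alone.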
+1 -1
arch/x86/Kconfig
··· 248 248 select HAVE_STACK_VALIDATION if X86_64 249 249 select HAVE_STATIC_CALL 250 250 select HAVE_STATIC_CALL_INLINE if HAVE_STACK_VALIDATION 251 - select HAVE_PREEMPT_DYNAMIC 251 + select HAVE_PREEMPT_DYNAMIC_CALL 252 252 select HAVE_RSEQ 253 253 select HAVE_SYSCALL_TRACEPOINTS 254 254 select HAVE_UNSTABLE_SCHED_CLOCK
+1
arch/x86/include/asm/paravirt_api_clock.h
··· 1 + #include <asm/paravirt.h>
+6 -4
arch/x86/include/asm/preempt.h
··· 108 108 extern asmlinkage void preempt_schedule(void); 109 109 extern asmlinkage void preempt_schedule_thunk(void); 110 110 111 - #define __preempt_schedule_func preempt_schedule_thunk 111 + #define preempt_schedule_dynamic_enabled preempt_schedule_thunk 112 + #define preempt_schedule_dynamic_disabled NULL 112 113 113 114 extern asmlinkage void preempt_schedule_notrace(void); 114 115 extern asmlinkage void preempt_schedule_notrace_thunk(void); 115 116 116 - #define __preempt_schedule_notrace_func preempt_schedule_notrace_thunk 117 + #define preempt_schedule_notrace_dynamic_enabled preempt_schedule_notrace_thunk 118 + #define preempt_schedule_notrace_dynamic_disabled NULL 117 119 118 120 #ifdef CONFIG_PREEMPT_DYNAMIC 119 121 120 - DECLARE_STATIC_CALL(preempt_schedule, __preempt_schedule_func); 122 + DECLARE_STATIC_CALL(preempt_schedule, preempt_schedule_dynamic_enabled); 121 123 122 124 #define __preempt_schedule() \ 123 125 do { \ ··· 127 125 asm volatile ("call " STATIC_CALL_TRAMP_STR(preempt_schedule) : ASM_CALL_CONSTRAINT); \ 128 126 } while (0) 129 127 130 - DECLARE_STATIC_CALL(preempt_schedule_notrace, __preempt_schedule_notrace_func); 128 + DECLARE_STATIC_CALL(preempt_schedule_notrace, preempt_schedule_notrace_dynamic_enabled); 131 129 132 130 #define __preempt_schedule_notrace() \ 133 131 do { \
+3 -3
arch/x86/kernel/cpu/aperfmperf.c
··· 91 91 if (!boot_cpu_has(X86_FEATURE_APERFMPERF)) 92 92 return 0; 93 93 94 - if (!housekeeping_cpu(cpu, HK_FLAG_MISC)) 94 + if (!housekeeping_cpu(cpu, HK_TYPE_MISC)) 95 95 return 0; 96 96 97 97 if (rcu_is_idle_cpu(cpu)) ··· 114 114 return; 115 115 116 116 for_each_online_cpu(cpu) { 117 - if (!housekeeping_cpu(cpu, HK_FLAG_MISC)) 117 + if (!housekeeping_cpu(cpu, HK_TYPE_MISC)) 118 118 continue; 119 119 if (rcu_is_idle_cpu(cpu)) 120 120 continue; /* Idle CPUs are completely uninteresting. */ ··· 136 136 if (!boot_cpu_has(X86_FEATURE_APERFMPERF)) 137 137 return 0; 138 138 139 - if (!housekeeping_cpu(cpu, HK_FLAG_MISC)) 139 + if (!housekeeping_cpu(cpu, HK_TYPE_MISC)) 140 140 return 0; 141 141 142 142 if (aperfmperf_snapshot_cpu(cpu, ktime_get(), true))
+1 -1
arch/x86/kvm/x86.c
··· 8853 8853 } 8854 8854 8855 8855 if (pi_inject_timer == -1) 8856 - pi_inject_timer = housekeeping_enabled(HK_FLAG_TIMER); 8856 + pi_inject_timer = housekeeping_enabled(HK_TYPE_TIMER); 8857 8857 #ifdef CONFIG_X86_64 8858 8858 pvclock_gtod_register_notifier(&pvclock_gtod_notifier); 8859 8859
+1 -1
drivers/base/cpu.c
··· 275 275 return -ENOMEM; 276 276 277 277 cpumask_andnot(isolated, cpu_possible_mask, 278 - housekeeping_cpumask(HK_FLAG_DOMAIN)); 278 + housekeeping_cpumask(HK_TYPE_DOMAIN)); 279 279 len = sysfs_emit(buf, "%*pbl\n", cpumask_pr_args(isolated)); 280 280 281 281 free_cpumask_var(isolated);
+16 -5
drivers/pci/pci-driver.c
···
 			  const struct pci_device_id *id)
 {
 	int error, node, cpu;
-	int hk_flags = HK_FLAG_DOMAIN | HK_FLAG_WQ;
 	struct drv_dev_and_id ddi = { drv, dev, id };
 
 	/*
···
 	 * device is probed from work_on_cpu() of the Physical device.
 	 */
 	if (node < 0 || node >= MAX_NUMNODES || !node_online(node) ||
-	    pci_physfn_is_probed(dev))
+	    pci_physfn_is_probed(dev)) {
 		cpu = nr_cpu_ids;
-	else
+	} else {
+		cpumask_var_t wq_domain_mask;
+
+		if (!zalloc_cpumask_var(&wq_domain_mask, GFP_KERNEL)) {
+			error = -ENOMEM;
+			goto out;
+		}
+		cpumask_and(wq_domain_mask,
+			    housekeeping_cpumask(HK_TYPE_WQ),
+			    housekeeping_cpumask(HK_TYPE_DOMAIN));
+
 		cpu = cpumask_any_and(cpumask_of_node(node),
-				      housekeeping_cpumask(hk_flags));
+				      wq_domain_mask);
+		free_cpumask_var(wq_domain_mask);
+	}
 
 	if (cpu < nr_cpu_ids)
 		error = work_on_cpu(cpu, local_pci_probe, &ddi);
 	else
 		error = local_pci_probe(&ddi);
-
+out:
 	dev->is_probed = 0;
 	cpu_hotplug_enable();
 	return error;
+1 -4
include/linux/cgroup.h
···
 extern spinlock_t css_set_lock;
 #define task_css_set_check(task, __c) \
 	rcu_dereference_check((task)->cgroups, \
+		rcu_read_lock_sched_held() || \
 		lockdep_is_held(&cgroup_mutex) || \
 		lockdep_is_held(&css_set_lock) || \
 		((task)->flags & PF_EXITING) || (__c))
···
 
 	cpuacct_charge(task, delta_exec);
 
-	rcu_read_lock();
 	cgrp = task_dfl_cgroup(task);
 	if (cgroup_parent(cgrp))
 		__cgroup_account_cputime(cgrp, delta_exec);
-	rcu_read_unlock();
 }
 
 static inline void cgroup_account_cputime_field(struct task_struct *task,
···
 
 	cpuacct_account_field(task, index, delta_exec);
 
-	rcu_read_lock();
 	cgrp = task_dfl_cgroup(task);
 	if (cgroup_parent(cgrp))
 		__cgroup_account_cputime_field(cgrp, index, delta_exec);
-	rcu_read_unlock();
 }
 
 #else /* CONFIG_CGROUPS */
+1
include/linux/cgroup_api.h
··· 1 + #include <linux/cgroup.h>
+1
include/linux/cpumask_api.h
··· 1 + #include <linux/cpumask.h>
+13 -2
include/linux/entry-common.h
···
  *
  * Conditional reschedule with additional sanity checks.
  */
-void irqentry_exit_cond_resched(void);
+void raw_irqentry_exit_cond_resched(void);
 #ifdef CONFIG_PREEMPT_DYNAMIC
-DECLARE_STATIC_CALL(irqentry_exit_cond_resched, irqentry_exit_cond_resched);
+#if defined(CONFIG_HAVE_PREEMPT_DYNAMIC_CALL)
+#define irqentry_exit_cond_resched_dynamic_enabled raw_irqentry_exit_cond_resched
+#define irqentry_exit_cond_resched_dynamic_disabled NULL
+DECLARE_STATIC_CALL(irqentry_exit_cond_resched, raw_irqentry_exit_cond_resched);
+#define irqentry_exit_cond_resched() static_call(irqentry_exit_cond_resched)()
+#elif defined(CONFIG_HAVE_PREEMPT_DYNAMIC_KEY)
+DECLARE_STATIC_KEY_TRUE(sk_dynamic_irqentry_exit_cond_resched);
+void dynamic_irqentry_exit_cond_resched(void);
+#define irqentry_exit_cond_resched() dynamic_irqentry_exit_cond_resched()
 #endif
+#else /* CONFIG_PREEMPT_DYNAMIC */
+#define irqentry_exit_cond_resched() raw_irqentry_exit_cond_resched()
+#endif /* CONFIG_PREEMPT_DYNAMIC */
 
 /**
  * irqentry_exit - Handle return from exception that used irqentry_enter()
+1
include/linux/fs_api.h
··· 1 + #include <linux/fs.h>
+1
include/linux/gfp_api.h
··· 1 + #include <linux/gfp.h>
+1
include/linux/hashtable_api.h
··· 1 + #include <linux/hashtable.h>
+1
include/linux/hrtimer_api.h
··· 1 + #include <linux/hrtimer.h>
+6 -1
include/linux/kernel.h
···
 extern int __cond_resched(void);
 # define might_resched() __cond_resched()
 
-#elif defined(CONFIG_PREEMPT_DYNAMIC)
+#elif defined(CONFIG_PREEMPT_DYNAMIC) && defined(CONFIG_HAVE_PREEMPT_DYNAMIC_CALL)
 
 extern int __cond_resched(void);
 
···
 {
 	static_call_mod(might_resched)();
 }
+
+#elif defined(CONFIG_PREEMPT_DYNAMIC) && defined(CONFIG_HAVE_PREEMPT_DYNAMIC_KEY)
+
+extern int dynamic_might_resched(void);
+# define might_resched() dynamic_might_resched()
 
 #else
 
+1
include/linux/kobject_api.h
··· 1 + #include <linux/kobject.h>
+1
include/linux/kref_api.h
··· 1 + #include <linux/kref.h>
+1
include/linux/ktime_api.h
··· 1 + #include <linux/ktime.h>
+1
include/linux/llist_api.h
··· 1 + #include <linux/llist.h>
+1
include/linux/lockdep_api.h
··· 1 + #include <linux/lockdep.h>
+1
include/linux/mm_api.h
··· 1 + #include <linux/mm.h>
+1
include/linux/mutex_api.h
··· 1 + #include <linux/mutex.h>
+1
include/linux/perf_event_api.h
··· 1 + #include <linux/perf_event.h>
+1
include/linux/pgtable_api.h
··· 1 + #include <linux/pgtable.h>
+1
include/linux/psi.h
··· 6 6 #include <linux/psi_types.h> 7 7 #include <linux/sched.h> 8 8 #include <linux/poll.h> 9 + #include <linux/cgroup-defs.h> 9 10 10 11 struct seq_file; 11 12 struct css_set;
+3
include/linux/psi_types.h
··· 141 141 * events to one per window 142 142 */ 143 143 u64 last_event_time; 144 + 145 + /* Deferred event(s) from previous ratelimit window */ 146 + bool pending_event; 144 147 }; 145 148 146 149 struct psi_group {
+1
include/linux/ptrace_api.h
··· 1 + #include <linux/ptrace.h>
+1
include/linux/rcuwait_api.h
··· 1 + #include <linux/rcuwait.h>
+1
include/linux/refcount_api.h
··· 1 + #include <linux/refcount.h>
+25 -4
include/linux/sched.h
···
 #define TASK_REPORT_IDLE (TASK_REPORT + 1)
 #define TASK_REPORT_MAX (TASK_REPORT_IDLE << 1)
 
-static inline unsigned int task_state_index(struct task_struct *tsk)
+static inline unsigned int __task_state_index(unsigned int tsk_state,
+					      unsigned int tsk_exit_state)
 {
-	unsigned int tsk_state = READ_ONCE(tsk->__state);
-	unsigned int state = (tsk_state | tsk->exit_state) & TASK_REPORT;
+	unsigned int state = (tsk_state | tsk_exit_state) & TASK_REPORT;
 
 	BUILD_BUG_ON_NOT_POWER_OF_2(TASK_REPORT_MAX);
 
 	if (tsk_state == TASK_IDLE)
 		state = TASK_REPORT_IDLE;
 
+	/*
+	 * We're lying here, but rather than expose a completely new task state
+	 * to userspace, we can make this appear as if the task has gone through
+	 * a regular rt_mutex_lock() call.
+	 */
+	if (tsk_state == TASK_RTLOCK_WAIT)
+		state = TASK_UNINTERRUPTIBLE;
+
 	return fls(state);
+}
+
+static inline unsigned int task_state_index(struct task_struct *tsk)
+{
+	return __task_state_index(READ_ONCE(tsk->__state), tsk->exit_state);
 }
 
 static inline char task_index_to_char(unsigned int state)
···
 #if !defined(CONFIG_PREEMPTION) || defined(CONFIG_PREEMPT_DYNAMIC)
 extern int __cond_resched(void);
 
-#ifdef CONFIG_PREEMPT_DYNAMIC
+#if defined(CONFIG_PREEMPT_DYNAMIC) && defined(CONFIG_HAVE_PREEMPT_DYNAMIC_CALL)
 
 DECLARE_STATIC_CALL(cond_resched, __cond_resched);
 
 static __always_inline int _cond_resched(void)
 {
 	return static_call_mod(cond_resched)();
+}
+
+#elif defined(CONFIG_PREEMPT_DYNAMIC) && defined(CONFIG_HAVE_PREEMPT_DYNAMIC_KEY)
+extern int dynamic_cond_resched(void);
+
+static __always_inline int _cond_resched(void)
+{
+	return dynamic_cond_resched();
 }
 
 #else
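The new `__task_state_index()` maps a raw state word to a small report index via `fls()`, with special cases for idle and RT-lock waits. The userspace sketch below models that mapping so the effect of the `TASK_RTLOCK_WAIT` special case can be checked by hand. The `MODEL_` constants are assumptions copied from memory of this era's `include/linux/sched.h`, the state-letter string is likewise an assumption, and `model_fls()` is a hypothetical stand-in for the kernel's `fls()`.

```c
#include <assert.h>

/* Assumed values; verify against the sched.h of the kernel in question. */
#define MODEL_TASK_RUNNING		0x0000u
#define MODEL_TASK_INTERRUPTIBLE	0x0001u
#define MODEL_TASK_UNINTERRUPTIBLE	0x0002u
#define MODEL_TASK_NOLOAD		0x0400u
#define MODEL_TASK_RTLOCK_WAIT		0x1000u
#define MODEL_TASK_IDLE	(MODEL_TASK_UNINTERRUPTIBLE | MODEL_TASK_NOLOAD)
#define MODEL_TASK_REPORT		0x007fu
#define MODEL_TASK_REPORT_IDLE		(MODEL_TASK_REPORT + 1)

/* 1-based position of the highest set bit; 0 when no bit is set. */
static unsigned int model_fls(unsigned int x)
{
	unsigned int n = 0;

	while (x) {
		n++;
		x >>= 1;
	}
	return n;
}

static unsigned int model_task_state_index(unsigned int tsk_state,
					   unsigned int tsk_exit_state)
{
	unsigned int state = (tsk_state | tsk_exit_state) & MODEL_TASK_REPORT;

	if (tsk_state == MODEL_TASK_IDLE)
		state = MODEL_TASK_REPORT_IDLE;

	/* Report an RT-lock wait as if it were an uninterruptible sleep. */
	if (tsk_state == MODEL_TASK_RTLOCK_WAIT)
		state = MODEL_TASK_UNINTERRUPTIBLE;

	return model_fls(state);
}

/* Assumed letter table, indexed by the value returned above. */
static char model_index_to_char(unsigned int idx)
{
	static const char state_char[] = "RSDTtXZPI";

	assert(idx < sizeof(state_char) - 1);
	return state_char[idx];
}
```

Without the special case, `MODEL_TASK_RTLOCK_WAIT` would be masked to zero and show up as running; with it, the task is reported with the same index as an uninterruptible sleeper.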
+1
include/linux/sched/affinity.h
··· 1 + #include <linux/sched.h>
+1
include/linux/sched/cond_resched.h
··· 1 + #include <linux/sched.h>
+2
include/linux/sched/deadline.h
··· 6 6 * NORMAL/BATCH tasks. 7 7 */ 8 8 9 + #include <linux/sched.h> 10 + 9 11 #define MAX_DL_PRIO 0 10 12 11 13 static inline int dl_prio(int prio)
+22 -21
include/linux/sched/isolation.h
···
 #include <linux/init.h>
 #include <linux/tick.h>
 
-enum hk_flags {
-	HK_FLAG_TIMER = 1,
-	HK_FLAG_RCU = (1 << 1),
-	HK_FLAG_MISC = (1 << 2),
-	HK_FLAG_SCHED = (1 << 3),
-	HK_FLAG_TICK = (1 << 4),
-	HK_FLAG_DOMAIN = (1 << 5),
-	HK_FLAG_WQ = (1 << 6),
-	HK_FLAG_MANAGED_IRQ = (1 << 7),
-	HK_FLAG_KTHREAD = (1 << 8),
+enum hk_type {
+	HK_TYPE_TIMER,
+	HK_TYPE_RCU,
+	HK_TYPE_MISC,
+	HK_TYPE_SCHED,
+	HK_TYPE_TICK,
+	HK_TYPE_DOMAIN,
+	HK_TYPE_WQ,
+	HK_TYPE_MANAGED_IRQ,
+	HK_TYPE_KTHREAD,
+	HK_TYPE_MAX
 };
 
 #ifdef CONFIG_CPU_ISOLATION
 DECLARE_STATIC_KEY_FALSE(housekeeping_overridden);
-extern int housekeeping_any_cpu(enum hk_flags flags);
-extern const struct cpumask *housekeeping_cpumask(enum hk_flags flags);
-extern bool housekeeping_enabled(enum hk_flags flags);
-extern void housekeeping_affine(struct task_struct *t, enum hk_flags flags);
-extern bool housekeeping_test_cpu(int cpu, enum hk_flags flags);
+extern int housekeeping_any_cpu(enum hk_type type);
+extern const struct cpumask *housekeeping_cpumask(enum hk_type type);
+extern bool housekeeping_enabled(enum hk_type type);
+extern void housekeeping_affine(struct task_struct *t, enum hk_type type);
+extern bool housekeeping_test_cpu(int cpu, enum hk_type type);
 extern void __init housekeeping_init(void);
 
 #else
 
-static inline int housekeeping_any_cpu(enum hk_flags flags)
+static inline int housekeeping_any_cpu(enum hk_type type)
 {
 	return smp_processor_id();
 }
 
-static inline const struct cpumask *housekeeping_cpumask(enum hk_flags flags)
+static inline const struct cpumask *housekeeping_cpumask(enum hk_type type)
 {
 	return cpu_possible_mask;
 }
 
-static inline bool housekeeping_enabled(enum hk_flags flags)
+static inline bool housekeeping_enabled(enum hk_type type)
 {
 	return false;
 }
 
 static inline void housekeeping_affine(struct task_struct *t,
-				       enum hk_flags flags) { }
+				       enum hk_type type) { }
 static inline void housekeeping_init(void) { }
 #endif /* CONFIG_CPU_ISOLATION */
 
-static inline bool housekeeping_cpu(int cpu, enum hk_flags flags)
+static inline bool housekeeping_cpu(int cpu, enum hk_type type)
 {
 #ifdef CONFIG_CPU_ISOLATION
 	if (static_branch_unlikely(&housekeeping_overridden))
-		return housekeeping_test_cpu(cpu, flags);
+		return housekeeping_test_cpu(cpu, type);
 #endif
 	return true;
 }
+1
include/linux/sched/posix-timers.h
··· 1 + #include <linux/posix-timers.h>
+1
include/linux/sched/rseq_api.h
··· 1 + #include <linux/rseq.h>
-4
include/linux/sched/sysctl.h
··· 45 45 extern unsigned int sysctl_sched_cfs_bandwidth_slice; 46 46 #endif 47 47 48 - #ifdef CONFIG_SCHED_AUTOGROUP 49 - extern unsigned int sysctl_sched_autogroup_enabled; 50 - #endif 51 - 52 48 extern int sysctl_sched_rr_timeslice; 53 49 extern int sched_rr_timeslice; 54 50
+1
include/linux/sched/task_flags.h
··· 1 + #include <linux/sched.h>
+1
include/linux/sched/thread_info_api.h
··· 1 + #include <linux/thread_info.h>
+1
include/linux/sched/topology.h
··· 93 93 unsigned int busy_factor; /* less balancing by factor if busy */ 94 94 unsigned int imbalance_pct; /* No balance until over watermark */ 95 95 unsigned int cache_nice_tries; /* Leave cache hot tasks for # tries */ 96 + unsigned int imb_numa_nr; /* Nr running tasks that allows a NUMA imbalance */ 96 97 97 98 int nohz_idle; /* NOHZ IDLE status */ 98 99 int flags; /* See SD_* */
+2
include/linux/sched_clock.h
··· 5 5 #ifndef LINUX_SCHED_CLOCK 6 6 #define LINUX_SCHED_CLOCK 7 7 8 + #include <linux/types.h> 9 + 8 10 #ifdef CONFIG_GENERIC_SCHED_CLOCK 9 11 /** 10 12 * struct clock_read_data - data required to read from sched_clock()
+1
include/linux/seqlock_api.h
··· 1 + #include <linux/seqlock.h>
+1
include/linux/softirq.h
··· 1 + #include <linux/interrupt.h>
+1
include/linux/spinlock_api.h
··· 1 + #include <linux/spinlock.h>
+1
include/linux/swait_api.h
··· 1 + #include <linux/swait.h>
+1
include/linux/syscalls_api.h
··· 1 + #include <linux/syscalls.h>
+1
include/linux/u64_stats_sync_api.h
··· 1 + #include <linux/u64_stats_sync.h>
+1
include/linux/wait_api.h
··· 1 + #include <linux/wait.h>
+1
include/linux/workqueue_api.h
··· 1 + #include <linux/workqueue.h>
+7 -4
include/trace/events/sched.h
···
 	TP_ARGS(p));
 
 #ifdef CREATE_TRACE_POINTS
-static inline long __trace_sched_switch_state(bool preempt, struct task_struct *p)
+static inline long __trace_sched_switch_state(bool preempt,
+					      unsigned int prev_state,
+					      struct task_struct *p)
 {
 	unsigned int state;
 
···
 	 * it for left shift operation to get the correct task->state
 	 * mapping.
 	 */
-	state = task_state_index(p);
+	state = __task_state_index(prev_state, p->exit_state);
 
 	return state ? (1 << (state - 1)) : state;
 }
···
 TRACE_EVENT(sched_switch,
 
 	TP_PROTO(bool preempt,
+		 unsigned int prev_state,
 		 struct task_struct *prev,
 		 struct task_struct *next),
 
-	TP_ARGS(preempt, prev, next),
+	TP_ARGS(preempt, prev_state, prev, next),
 
 	TP_STRUCT__entry(
 		__array( char, prev_comm, TASK_COMM_LEN )
···
 		memcpy(__entry->next_comm, next->comm, TASK_COMM_LEN);
 		__entry->prev_pid = prev->pid;
 		__entry->prev_prio = prev->prio;
-		__entry->prev_state = __trace_sched_switch_state(preempt, prev);
+		__entry->prev_state = __trace_sched_switch_state(preempt, prev_state, prev);
 		memcpy(__entry->prev_comm, prev->comm, TASK_COMM_LEN);
 		__entry->next_pid = next->pid;
 		__entry->next_prio = next->prio;
+4 -16
include/uapi/linux/rseq.h
···
 	 * Read and set by the kernel. Set by user-space with single-copy
 	 * atomicity semantics. This field should only be updated by the
 	 * thread which registered this data structure. Aligned on 64-bit.
+	 *
+	 * 32-bit architectures should update the low order bits of the
+	 * rseq_cs field, leaving the high order bits initialized to 0.
 	 */
-	union {
-		__u64 ptr64;
-#ifdef __LP64__
-		__u64 ptr;
-#else
-		struct {
-#if (defined(__BYTE_ORDER) && (__BYTE_ORDER == __BIG_ENDIAN)) || defined(__BIG_ENDIAN)
-			__u32 padding; /* Initialized to zero. */
-			__u32 ptr32;
-#else /* LITTLE */
-			__u32 ptr32;
-			__u32 padding; /* Initialized to zero. */
-#endif /* ENDIAN */
-		} ptr;
-#endif
-	} rseq_cs;
+	__u64 rseq_cs;
 
 	/*
 	 * Restartable sequences flags field.
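With `rseq_cs` reduced to a plain `__u64`, user space stores a pointer into it by value instead of going through endian-specific union members. The sketch below shows one portable way to do that conversion. `struct model_rseq` and both helpers are hypothetical stand-ins, and the plain assignment ignores the single-copy atomicity requirement mentioned in the comment, which real code would have to honour.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in holding only the field discussed here. */
struct model_rseq {
	uint64_t rseq_cs;
};

/*
 * Store a critical-section descriptor address. Going through uintptr_t
 * zero-extends on 32-bit targets, so the high-order bits stay 0 there.
 */
static void model_set_rseq_cs(struct model_rseq *r, const void *cs)
{
	assert(r != 0);
	r->rseq_cs = (uint64_t)(uintptr_t)cs;
}

/* Recover the pointer from the 64-bit field. */
static const void *model_get_rseq_cs(const struct model_rseq *r)
{
	return (const void *)(uintptr_t)r->rseq_cs;
}
```

The same two casts work unchanged on 32-bit and 64-bit builds and on either byte order, which is the simplification the removed union could not offer.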
+2 -1
init/Makefile
··· 31 31 cmd_compile.h = \ 32 32 $(CONFIG_SHELL) $(srctree)/scripts/mkcompile_h $@ \ 33 33 "$(UTS_MACHINE)" "$(CONFIG_SMP)" "$(CONFIG_PREEMPT_BUILD)" \ 34 - "$(CONFIG_PREEMPT_RT)" "$(CONFIG_CC_VERSION_TEXT)" "$(LD)" 34 + "$(CONFIG_PREEMPT_DYNAMIC)" "$(CONFIG_PREEMPT_RT)" \ 35 + "$(CONFIG_CC_VERSION_TEXT)" "$(LD)" 35 36 36 37 include/generated/compile.h: FORCE 37 38 $(call cmd,compile.h)
+2 -1
kernel/Kconfig.preempt
··· 96 96 config PREEMPT_DYNAMIC 97 97 bool "Preemption behaviour defined on boot" 98 98 depends on HAVE_PREEMPT_DYNAMIC && !PREEMPT_RT 99 + select JUMP_LABEL if HAVE_PREEMPT_DYNAMIC_KEY 99 100 select PREEMPT_BUILD 100 - default y 101 + default y if HAVE_PREEMPT_DYNAMIC_CALL 101 102 help 102 103 This option allows to define the preemption model on the kernel 103 104 command line parameter and thus override the default preemption
+3 -3
kernel/cgroup/cpuset.c
···
 			update_domain_attr_tree(dattr, &top_cpuset);
 		}
 		cpumask_and(doms[0], top_cpuset.effective_cpus,
-			    housekeeping_cpumask(HK_FLAG_DOMAIN));
+			    housekeeping_cpumask(HK_TYPE_DOMAIN));

 		goto done;
 	}
···
 		if (!cpumask_empty(cp->cpus_allowed) &&
 		    !(is_sched_load_balance(cp) &&
 		      cpumask_intersects(cp->cpus_allowed,
-					 housekeeping_cpumask(HK_FLAG_DOMAIN))))
+					 housekeeping_cpumask(HK_TYPE_DOMAIN))))
 			continue;

 		if (root_load_balance &&
···

 			if (apn == b->pn) {
 				cpumask_or(dp, dp, b->effective_cpus);
-				cpumask_and(dp, dp, housekeeping_cpumask(HK_FLAG_DOMAIN));
+				cpumask_and(dp, dp, housekeeping_cpumask(HK_TYPE_DOMAIN));
 				if (dattr)
 					update_domain_attr_tree(dattr + nslot, b);

+2 -2
kernel/cpu.c
···
 	cpu_maps_update_begin();
 	if (primary == -1) {
 		primary = cpumask_first(cpu_online_mask);
-		if (!housekeeping_cpu(primary, HK_FLAG_TIMER))
-			primary = housekeeping_any_cpu(HK_FLAG_TIMER);
+		if (!housekeeping_cpu(primary, HK_TYPE_TIMER))
+			primary = housekeeping_any_cpu(HK_TYPE_TIMER);
 	} else {
 		if (!cpu_online(primary))
 			primary = cpumask_first(cpu_online_mask);
+15 -8
kernel/entry/common.c
···
 #include <linux/context_tracking.h>
 #include <linux/entry-common.h>
 #include <linux/highmem.h>
+#include <linux/jump_label.h>
 #include <linux/livepatch.h>
 #include <linux/audit.h>
 #include <linux/tick.h>
···
 	return ret;
 }

-void irqentry_exit_cond_resched(void)
+void raw_irqentry_exit_cond_resched(void)
 {
 	if (!preempt_count()) {
 		/* Sanity check RCU and thread stack */
···
 	}
 }
 #ifdef CONFIG_PREEMPT_DYNAMIC
-DEFINE_STATIC_CALL(irqentry_exit_cond_resched, irqentry_exit_cond_resched);
+#if defined(CONFIG_HAVE_PREEMPT_DYNAMIC_CALL)
+DEFINE_STATIC_CALL(irqentry_exit_cond_resched, raw_irqentry_exit_cond_resched);
+#elif defined(CONFIG_HAVE_PREEMPT_DYNAMIC_KEY)
+DEFINE_STATIC_KEY_TRUE(sk_dynamic_irqentry_exit_cond_resched);
+void dynamic_irqentry_exit_cond_resched(void)
+{
+	if (!static_key_unlikely(&sk_dynamic_irqentry_exit_cond_resched))
+		return;
+	raw_irqentry_exit_cond_resched();
+}
+#endif
 #endif

 noinstr void irqentry_exit(struct pt_regs *regs, irqentry_state_t state)
···
 	}

 	instrumentation_begin();
-	if (IS_ENABLED(CONFIG_PREEMPTION)) {
-#ifdef CONFIG_PREEMPT_DYNAMIC
-		static_call(irqentry_exit_cond_resched)();
-#else
+	if (IS_ENABLED(CONFIG_PREEMPTION))
 		irqentry_exit_cond_resched();
-#endif
-	}
+
 	/* Covers both tracing and lockdep */
 	trace_hardirqs_on();
 	instrumentation_end();
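The hunks above offer two ways to switch `irqentry_exit_cond_resched()` on and off at run time: retargeting the call (`CONFIG_HAVE_PREEMPT_DYNAMIC_CALL`) or guarding a fixed call with a key (`CONFIG_HAVE_PREEMPT_DYNAMIC_KEY`). The sketch below is a plain user-space analogy of the two shapes, using an ordinary function pointer and a boolean. It is not how the kernel implements static calls or static keys (those patch code rather than read a variable), and every name in it is made up for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

static int resched_count;

/* Stand-in for the real work (raw_irqentry_exit_cond_resched() in the hunk). */
static void raw_cond_resched(void)
{
	resched_count++;
}

/* Call-style: the target itself is swapped; NULL means "do nothing". */
static void (*cond_resched_target)(void) = raw_cond_resched;

static void call_style_cond_resched(void)
{
	if (cond_resched_target)
		cond_resched_target();
}

/* Key-style: the target is fixed and a flag guards the call. */
static bool cond_resched_enabled = true;

static void key_style_cond_resched(void)
{
	if (!cond_resched_enabled)
		return;
	raw_cond_resched();
}
```

In both variants the caller's code does not change when the behaviour is switched; only the target or the flag does.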
+2 -2
kernel/irq/cpuhotplug.c
···
 {
 	const struct cpumask *hk_mask;

-	if (!housekeeping_enabled(HK_FLAG_MANAGED_IRQ))
+	if (!housekeeping_enabled(HK_TYPE_MANAGED_IRQ))
 		return false;

-	hk_mask = housekeeping_cpumask(HK_FLAG_MANAGED_IRQ);
+	hk_mask = housekeeping_cpumask(HK_TYPE_MANAGED_IRQ);
 	if (cpumask_subset(irq_data_get_effective_affinity_mask(data), hk_mask))
 		return false;

+2 -2
kernel/irq/manage.c
···
 	 * online.
 	 */
 	if (irqd_affinity_is_managed(data) &&
-	    housekeeping_enabled(HK_FLAG_MANAGED_IRQ)) {
+	    housekeeping_enabled(HK_TYPE_MANAGED_IRQ)) {
 		const struct cpumask *hk_mask, *prog_mask;

 		static DEFINE_RAW_SPINLOCK(tmp_mask_lock);
 		static struct cpumask tmp_mask;

-		hk_mask = housekeeping_cpumask(HK_FLAG_MANAGED_IRQ);
+		hk_mask = housekeeping_cpumask(HK_TYPE_MANAGED_IRQ);

 		raw_spin_lock(&tmp_mask_lock);
 		cpumask_and(&tmp_mask, mask, hk_mask);
+2 -2
kernel/kthread.c
···
 	 * back to default in case they have been changed.
 	 */
 	sched_setscheduler_nocheck(current, SCHED_NORMAL, &param);
-	set_cpus_allowed_ptr(current, housekeeping_cpumask(HK_FLAG_KTHREAD));
+	set_cpus_allowed_ptr(current, housekeeping_cpumask(HK_TYPE_KTHREAD));

 	/* OK, tell user we're spawned, wait for stop or wakeup */
 	__set_current_state(TASK_UNINTERRUPTIBLE);
···
 	/* Setup a clean context for our children to inherit. */
 	set_task_comm(tsk, "kthreadd");
 	ignore_signals(tsk);
-	set_cpus_allowed_ptr(tsk, housekeeping_cpumask(HK_FLAG_KTHREAD));
+	set_cpus_allowed_ptr(tsk, housekeeping_cpumask(HK_TYPE_KTHREAD));
 	set_mems_allowed(node_states[N_MEMORY]);

 	current->flags |= PF_NOFREEZE;
+1 -1
kernel/rcu/tasks.h
···
 	struct rcu_tasks *rtp = arg;

 	/* Run on housekeeping CPUs by default. Sysadm can move if desired. */
-	housekeeping_affine(current, HK_FLAG_RCU);
+	housekeeping_affine(current, HK_TYPE_RCU);
 	WRITE_ONCE(rtp->kthread_ptr, current); // Let GPs start!

 	/*
+3 -3
kernel/rcu/tree_plugin.h
···
 		if ((mask & leaf_node_cpu_bit(rnp, cpu)) &&
 		    cpu != outgoingcpu)
 			cpumask_set_cpu(cpu, cm);
-	cpumask_and(cm, cm, housekeeping_cpumask(HK_FLAG_RCU));
+	cpumask_and(cm, cm, housekeeping_cpumask(HK_TYPE_RCU));
 	if (cpumask_empty(cm))
-		cpumask_copy(cm, housekeeping_cpumask(HK_FLAG_RCU));
+		cpumask_copy(cm, housekeeping_cpumask(HK_TYPE_RCU));
 	set_cpus_allowed_ptr(t, cm);
 	mutex_unlock(&rnp->boost_kthread_mutex);
 	free_cpumask_var(cm);
···
 {
 	if (!tick_nohz_full_enabled())
 		return;
-	housekeeping_affine(current, HK_FLAG_RCU);
+	housekeeping_affine(current, HK_TYPE_RCU);
 }

 /* Record the current task on dyntick-idle entry. */
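The first hunk above restricts a candidate CPU mask to the housekeeping CPUs and, if nothing is left, falls back to the whole housekeeping set. A minimal sketch of that pattern follows, assuming a plain 64-bit integer as a stand-in for `struct cpumask`; the function name `restrict_to_housekeeping` is hypothetical.

```c
#include <stdint.h>

/*
 * Hypothetical 64-CPU bitmask stand-in for struct cpumask: restrict
 * `wanted` to the housekeeping CPUs and fall back to the whole
 * housekeeping set when the intersection is empty.
 */
static uint64_t restrict_to_housekeeping(uint64_t wanted, uint64_t housekeeping)
{
	uint64_t allowed = wanted & housekeeping;

	return allowed ? allowed : housekeeping;
}
```

The fallback ensures the resulting affinity is never empty as long as at least one housekeeping CPU exists.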
+4 -4
kernel/rseq.c
···
 	int ret;

 #ifdef CONFIG_64BIT
-	if (get_user(ptr, &t->rseq->rseq_cs.ptr64))
+	if (get_user(ptr, &t->rseq->rseq_cs))
 		return -EFAULT;
 #else
-	if (copy_from_user(&ptr, &t->rseq->rseq_cs.ptr64, sizeof(ptr)))
+	if (copy_from_user(&ptr, &t->rseq->rseq_cs, sizeof(ptr)))
 		return -EFAULT;
 #endif
 	if (!ptr) {
···
 	 * Set rseq_cs to NULL.
 	 */
 #ifdef CONFIG_64BIT
-	return put_user(0UL, &t->rseq->rseq_cs.ptr64);
+	return put_user(0UL, &t->rseq->rseq_cs);
 #else
-	if (clear_user(&t->rseq->rseq_cs.ptr64, sizeof(t->rseq->rseq_cs.ptr64)))
+	if (clear_user(&t->rseq->rseq_cs, sizeof(t->rseq->rseq_cs)))
 		return -EFAULT;
 	return 0;
 #endif
+10 -18
kernel/sched/Makefile
···
 # SPDX-License-Identifier: GPL-2.0
-ifdef CONFIG_FUNCTION_TRACER
-CFLAGS_REMOVE_clock.o = $(CC_FLAGS_FTRACE)
-endif

 # The compilers are complaining about unused variables inside an if(0) scope
 # block. This is daft, shut them up.
···
 CFLAGS_core.o := $(PROFILING) -fno-omit-frame-pointer
 endif

-obj-y += core.o loadavg.o clock.o cputime.o
-obj-y += idle.o fair.o rt.o deadline.o
-obj-y += wait.o wait_bit.o swait.o completion.o
-
-obj-$(CONFIG_SMP) += cpupri.o cpudeadline.o topology.o stop_task.o pelt.o
-obj-$(CONFIG_SCHED_AUTOGROUP) += autogroup.o
-obj-$(CONFIG_SCHEDSTATS) += stats.o
-obj-$(CONFIG_SCHED_DEBUG) += debug.o
-obj-$(CONFIG_CGROUP_CPUACCT) += cpuacct.o
-obj-$(CONFIG_CPU_FREQ) += cpufreq.o
-obj-$(CONFIG_CPU_FREQ_GOV_SCHEDUTIL) += cpufreq_schedutil.o
-obj-$(CONFIG_MEMBARRIER) += membarrier.o
-obj-$(CONFIG_CPU_ISOLATION) += isolation.o
-obj-$(CONFIG_PSI) += psi.o
-obj-$(CONFIG_SCHED_CORE) += core_sched.o
+#
+# Build efficiency:
+#
+# These compilation units have roughly the same size and complexity - so their
+# build parallelizes well and finishes roughly at once:
+#
+obj-y += core.o
+obj-y += fair.o
+obj-y += build_policy.o
+obj-y += build_utility.o
+24 -2
kernel/sched/autogroup.c
···
 // SPDX-License-Identifier: GPL-2.0
+
 /*
  * Auto-group scheduling implementation:
  */
-#include <linux/nospec.h>
-#include "sched.h"

 unsigned int __read_mostly sysctl_sched_autogroup_enabled = 1;
 static struct autogroup autogroup_default;
 static atomic_t autogroup_seq_nr;
+
+#ifdef CONFIG_SYSCTL
+static struct ctl_table sched_autogroup_sysctls[] = {
+	{
+		.procname = "sched_autogroup_enabled",
+		.data = &sysctl_sched_autogroup_enabled,
+		.maxlen = sizeof(unsigned int),
+		.mode = 0644,
+		.proc_handler = proc_dointvec_minmax,
+		.extra1 = SYSCTL_ZERO,
+		.extra2 = SYSCTL_ONE,
+	},
+	{}
+};
+
+static void __init sched_autogroup_sysctl_init(void)
+{
+	register_sysctl_init("kernel", sched_autogroup_sysctls);
+}
+#else
+#define sched_autogroup_sysctl_init() do { } while (0)
+#endif

 void __init autogroup_init(struct task_struct *init_task)
 {
···
 static int __init setup_autogroup(char *str)
 {
 	sysctl_sched_autogroup_enabled = 0;
+	sched_autogroup_sysctl_init();

 	return 1;
 }
+6
kernel/sched/autogroup.h
···
 /* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _KERNEL_SCHED_AUTOGROUP_H
+#define _KERNEL_SCHED_AUTOGROUP_H
+
 #ifdef CONFIG_SCHED_AUTOGROUP

 struct autogroup {
···
 static inline struct task_group *
 autogroup_task_group(struct task_struct *p, struct task_group *tg)
 {
+	extern unsigned int sysctl_sched_autogroup_enabled;
 	int enabled = READ_ONCE(sysctl_sched_autogroup_enabled);

 	if (enabled && task_wants_autogroup(p, tg))
···
 }

 #endif /* CONFIG_SCHED_AUTOGROUP */
+
+#endif /* _KERNEL_SCHED_AUTOGROUP_H */
+52
kernel/sched/build_policy.c
···
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * These are the scheduling policy related scheduler files, built
+ * in a single compilation unit for build efficiency reasons.
+ *
+ * ( Incidentally, the size of the compilation unit is roughly
+ *   comparable to core.c and fair.c, the other two big
+ *   compilation units. This helps balance build time, while
+ *   coalescing source files to amortize header inclusion
+ *   cost. )
+ *
+ * core.c and fair.c are built separately.
+ */
+
+/* Headers: */
+#include <linux/sched/clock.h>
+#include <linux/sched/cputime.h>
+#include <linux/sched/posix-timers.h>
+#include <linux/sched/rt.h>
+
+#include <linux/cpuidle.h>
+#include <linux/jiffies.h>
+#include <linux/livepatch.h>
+#include <linux/psi.h>
+#include <linux/seqlock_api.h>
+#include <linux/slab.h>
+#include <linux/suspend.h>
+#include <linux/tsacct_kern.h>
+#include <linux/vtime.h>
+
+#include <uapi/linux/sched/types.h>
+
+#include "sched.h"
+
+#include "autogroup.h"
+#include "stats.h"
+#include "pelt.h"
+
+/* Source code modules: */
+
+#include "idle.c"
+
+#include "rt.c"
+
+#ifdef CONFIG_SMP
+# include "cpudeadline.c"
+# include "pelt.c"
+#endif
+
+#include "cputime.c"
+#include "deadline.c"
+
+109
kernel/sched/build_utility.c
···
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * These are various utility functions of the scheduler,
+ * built in a single compilation unit for build efficiency reasons.
+ *
+ * ( Incidentally, the size of the compilation unit is roughly
+ *   comparable to core.c, fair.c, smp.c and policy.c, the other
+ *   big compilation units. This helps balance build time, while
+ *   coalescing source files to amortize header inclusion
+ *   cost. )
+ */
+#include <linux/sched/clock.h>
+#include <linux/sched/cputime.h>
+#include <linux/sched/debug.h>
+#include <linux/sched/isolation.h>
+#include <linux/sched/loadavg.h>
+#include <linux/sched/mm.h>
+#include <linux/sched/rseq_api.h>
+#include <linux/sched/task_stack.h>
+
+#include <linux/cpufreq.h>
+#include <linux/cpumask_api.h>
+#include <linux/cpuset.h>
+#include <linux/ctype.h>
+#include <linux/debugfs.h>
+#include <linux/energy_model.h>
+#include <linux/hashtable_api.h>
+#include <linux/irq.h>
+#include <linux/kobject_api.h>
+#include <linux/membarrier.h>
+#include <linux/mempolicy.h>
+#include <linux/nmi.h>
+#include <linux/nospec.h>
+#include <linux/proc_fs.h>
+#include <linux/psi.h>
+#include <linux/psi.h>
+#include <linux/ptrace_api.h>
+#include <linux/sched_clock.h>
+#include <linux/security.h>
+#include <linux/spinlock_api.h>
+#include <linux/swait_api.h>
+#include <linux/timex.h>
+#include <linux/utsname.h>
+#include <linux/wait_api.h>
+#include <linux/workqueue_api.h>
+
+#include <uapi/linux/prctl.h>
+#include <uapi/linux/sched/types.h>
+
+#include <asm/switch_to.h>
+
+#include "sched.h"
+#include "sched-pelt.h"
+#include "stats.h"
+#include "autogroup.h"
+
+#include "clock.c"
+
+#ifdef CONFIG_CGROUP_CPUACCT
+# include "cpuacct.c"
+#endif
+
+#ifdef CONFIG_CPU_FREQ
+# include "cpufreq.c"
+#endif
+
+#ifdef CONFIG_CPU_FREQ_GOV_SCHEDUTIL
+# include "cpufreq_schedutil.c"
+#endif
+
+#ifdef CONFIG_SCHED_DEBUG
+# include "debug.c"
+#endif
+
+#ifdef CONFIG_SCHEDSTATS
+# include "stats.c"
+#endif
+
+#include "loadavg.c"
+#include "completion.c"
+#include "swait.c"
+#include "wait_bit.c"
+#include "wait.c"
+
+#ifdef CONFIG_SMP
+# include "cpupri.c"
+# include "stop_task.c"
+# include "topology.c"
+#endif
+
+#ifdef CONFIG_SCHED_CORE
+# include "core_sched.c"
+#endif
+
+#ifdef CONFIG_PSI
+# include "psi.c"
+#endif
+
+#ifdef CONFIG_MEMBARRIER
+# include "membarrier.c"
+#endif
+
+#ifdef CONFIG_CPU_ISOLATION
+# include "isolation.c"
+#endif
+
+#ifdef CONFIG_SCHED_AUTOGROUP
+# include "autogroup.c"
+#endif
+21 -23
kernel/sched/clock.c
···
  * that is otherwise invisible (TSC gets stopped).
  *
  */
-#include "sched.h"
-#include <linux/sched_clock.h>

 /*
  * Scheduler clock - returns current time in nanosec units.
  * This is default implementation.
  * Architectures and sub-architectures can override this.
  */
-unsigned long long __weak sched_clock(void)
+notrace unsigned long long __weak sched_clock(void)
 {
 	return (unsigned long long)(jiffies - INITIAL_JIFFIES)
 					* (NSEC_PER_SEC / HZ);
···

 static DEFINE_PER_CPU_SHARED_ALIGNED(struct sched_clock_data, sched_clock_data);

-static inline struct sched_clock_data *this_scd(void)
+notrace static inline struct sched_clock_data *this_scd(void)
 {
 	return this_cpu_ptr(&sched_clock_data);
 }

-static inline struct sched_clock_data *cpu_sdc(int cpu)
+notrace static inline struct sched_clock_data *cpu_sdc(int cpu)
 {
 	return &per_cpu(sched_clock_data, cpu);
 }

-int sched_clock_stable(void)
+notrace int sched_clock_stable(void)
 {
 	return static_branch_likely(&__sched_clock_stable);
 }

-static void __scd_stamp(struct sched_clock_data *scd)
+notrace static void __scd_stamp(struct sched_clock_data *scd)
 {
 	scd->tick_gtod = ktime_get_ns();
 	scd->tick_raw = sched_clock();
 }

-static void __set_sched_clock_stable(void)
+notrace static void __set_sched_clock_stable(void)
 {
 	struct sched_clock_data *scd;

···
  * The only way to fully avoid random clock jumps is to boot with:
  * "tsc=unstable".
  */
-static void __sched_clock_work(struct work_struct *work)
+notrace static void __sched_clock_work(struct work_struct *work)
 {
 	struct sched_clock_data *scd;
 	int cpu;
···

 static DECLARE_WORK(sched_clock_work, __sched_clock_work);

-static void __clear_sched_clock_stable(void)
+notrace static void __clear_sched_clock_stable(void)
 {
 	if (!sched_clock_stable())
 		return;
···
 	schedule_work(&sched_clock_work);
 }

-void clear_sched_clock_stable(void)
+notrace void clear_sched_clock_stable(void)
 {
 	__sched_clock_stable_early = 0;

···
 	__clear_sched_clock_stable();
 }

-static void __sched_clock_gtod_offset(void)
+notrace static void __sched_clock_gtod_offset(void)
 {
 	struct sched_clock_data *scd = this_scd();

···
  * min, max except they take wrapping into account
  */

-static inline u64 wrap_min(u64 x, u64 y)
+notrace static inline u64 wrap_min(u64 x, u64 y)
 {
 	return (s64)(x - y) < 0 ? x : y;
 }

-static inline u64 wrap_max(u64 x, u64 y)
+notrace static inline u64 wrap_max(u64 x, u64 y)
 {
 	return (s64)(x - y) > 0 ? x : y;
 }
···
  *  - filter out backward motion
  *  - use the GTOD tick value to create a window to filter crazy TSC values
  */
-static u64 sched_clock_local(struct sched_clock_data *scd)
+notrace static u64 sched_clock_local(struct sched_clock_data *scd)
 {
 	u64 now, clock, old_clock, min_clock, max_clock, gtod;
 	s64 delta;
···
 	return clock;
 }

-static u64 sched_clock_remote(struct sched_clock_data *scd)
+notrace static u64 sched_clock_remote(struct sched_clock_data *scd)
 {
 	struct sched_clock_data *my_scd = this_scd();
 	u64 this_clock, remote_clock;
···
  *
  * See cpu_clock().
  */
-u64 sched_clock_cpu(int cpu)
+notrace u64 sched_clock_cpu(int cpu)
 {
 	struct sched_clock_data *scd;
 	u64 clock;
···
 }
 EXPORT_SYMBOL_GPL(sched_clock_cpu);

-void sched_clock_tick(void)
+notrace void sched_clock_tick(void)
 {
 	struct sched_clock_data *scd;

···
 	sched_clock_local(scd);
 }

-void sched_clock_tick_stable(void)
+notrace void sched_clock_tick_stable(void)
 {
 	if (!sched_clock_stable())
 		return;
···
 /*
  * We are going deep-idle (irqs are disabled):
  */
-void sched_clock_idle_sleep_event(void)
+notrace void sched_clock_idle_sleep_event(void)
 {
 	sched_clock_cpu(smp_processor_id());
 }
···
 /*
  * We just idled; resync with ktime.
  */
-void sched_clock_idle_wakeup_event(void)
+notrace void sched_clock_idle_wakeup_event(void)
 {
 	unsigned long flags;

···
 	local_irq_enable();
 }

-u64 sched_clock_cpu(int cpu)
+notrace u64 sched_clock_cpu(int cpu)
 {
 	if (!static_branch_likely(&sched_clock_running))
 		return 0;
···
  * On bare metal this function should return the same as local_clock.
  * Architectures and sub-architectures can override this.
  */
-u64 __weak running_clock(void)
+notrace u64 __weak running_clock(void)
 {
 	return local_clock();
 }
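The `wrap_min()` and `wrap_max()` helpers in the clock.c hunks compare two 64-bit timestamps by the sign of their difference, so the ordering still holds after the counter wraps around. The sketch below restates the same expressions with `<stdint.h>` types so they can be tried in user space. It assumes the usual two's-complement conversion from `uint64_t` to `int64_t`, which mainstream compilers provide.

```c
#include <stdint.h>

/* Same expression as wrap_min() in the hunk, with stdint types. */
static uint64_t wrap_min(uint64_t x, uint64_t y)
{
	return (int64_t)(x - y) < 0 ? x : y;
}

/* Same expression as wrap_max() in the hunk, with stdint types. */
static uint64_t wrap_max(uint64_t x, uint64_t y)
{
	return (int64_t)(x - y) > 0 ? x : y;
}
```

For example, with `x = UINT64_MAX - 1` (just before the wrap) and `y = 3` (just after it), `wrap_max(x, y)` returns 3 because 3 is the later timestamp, whereas a plain `>` comparison would pick `x`.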
+1 -1
kernel/sched/completion.c
···
 // SPDX-License-Identifier: GPL-2.0
+
 /*
  * Generic wait-for-completion handler;
  *
···
  * typically be used for exclusion which gives rise to priority inversion.
  * Waiting for completion is a typically sync point, but not an exclusion point.
  */
-#include "sched.h"

 /**
  * complete: - signals a single thread waiting on this completion
+303 -165
kernel/sched/core.c
··· 6 6 * 7 7 * Copyright (C) 1991-2002 Linus Torvalds 8 8 */ 9 - #define CREATE_TRACE_POINTS 10 - #include <trace/events/sched.h> 11 - #undef CREATE_TRACE_POINTS 9 + #include <linux/highmem.h> 10 + #include <linux/hrtimer_api.h> 11 + #include <linux/ktime_api.h> 12 + #include <linux/sched/signal.h> 13 + #include <linux/syscalls_api.h> 14 + #include <linux/debug_locks.h> 15 + #include <linux/prefetch.h> 16 + #include <linux/capability.h> 17 + #include <linux/pgtable_api.h> 18 + #include <linux/wait_bit.h> 19 + #include <linux/jiffies.h> 20 + #include <linux/spinlock_api.h> 21 + #include <linux/cpumask_api.h> 22 + #include <linux/lockdep_api.h> 23 + #include <linux/hardirq.h> 24 + #include <linux/softirq.h> 25 + #include <linux/refcount_api.h> 26 + #include <linux/topology.h> 27 + #include <linux/sched/clock.h> 28 + #include <linux/sched/cond_resched.h> 29 + #include <linux/sched/debug.h> 30 + #include <linux/sched/isolation.h> 31 + #include <linux/sched/loadavg.h> 32 + #include <linux/sched/mm.h> 33 + #include <linux/sched/nohz.h> 34 + #include <linux/sched/rseq_api.h> 35 + #include <linux/sched/rt.h> 12 36 13 - #include "sched.h" 14 - 15 - #include <linux/nospec.h> 16 37 #include <linux/blkdev.h> 38 + #include <linux/context_tracking.h> 39 + #include <linux/cpuset.h> 40 + #include <linux/delayacct.h> 41 + #include <linux/init_task.h> 42 + #include <linux/interrupt.h> 43 + #include <linux/ioprio.h> 44 + #include <linux/kallsyms.h> 17 45 #include <linux/kcov.h> 46 + #include <linux/kprobes.h> 47 + #include <linux/llist_api.h> 48 + #include <linux/mmu_context.h> 49 + #include <linux/mmzone.h> 50 + #include <linux/mutex_api.h> 51 + #include <linux/nmi.h> 52 + #include <linux/nospec.h> 53 + #include <linux/perf_event_api.h> 54 + #include <linux/profile.h> 55 + #include <linux/psi.h> 56 + #include <linux/rcuwait_api.h> 57 + #include <linux/sched/wake_q.h> 18 58 #include <linux/scs.h> 59 + #include <linux/slab.h> 60 + #include <linux/syscalls.h> 61 + #include 
<linux/vtime.h> 62 + #include <linux/wait_api.h> 63 + #include <linux/workqueue_api.h> 64 + 65 + #ifdef CONFIG_PREEMPT_DYNAMIC 66 + # ifdef CONFIG_GENERIC_ENTRY 67 + # include <linux/entry-common.h> 68 + # endif 69 + #endif 70 + 71 + #include <uapi/linux/sched/types.h> 19 72 20 73 #include <asm/switch_to.h> 21 74 #include <asm/tlb.h> 22 75 76 + #define CREATE_TRACE_POINTS 77 + #include <linux/sched/rseq_api.h> 78 + #include <trace/events/sched.h> 79 + #undef CREATE_TRACE_POINTS 80 + 81 + #include "sched.h" 82 + #include "stats.h" 83 + #include "autogroup.h" 84 + 85 + #include "autogroup.h" 86 + #include "pelt.h" 87 + #include "smp.h" 88 + #include "stats.h" 89 + 23 90 #include "../workqueue_internal.h" 24 91 #include "../../fs/io-wq.h" 25 92 #include "../smpboot.h" 26 - 27 - #include "pelt.h" 28 - #include "smp.h" 29 93 30 94 /* 31 95 * Export tracepoints that act as a bare tracehook (ie: have no trace event ··· 100 36 EXPORT_TRACEPOINT_SYMBOL_GPL(pelt_dl_tp); 101 37 EXPORT_TRACEPOINT_SYMBOL_GPL(pelt_irq_tp); 102 38 EXPORT_TRACEPOINT_SYMBOL_GPL(pelt_se_tp); 39 + EXPORT_TRACEPOINT_SYMBOL_GPL(pelt_thermal_tp); 103 40 EXPORT_TRACEPOINT_SYMBOL_GPL(sched_cpu_capacity_tp); 104 41 EXPORT_TRACEPOINT_SYMBOL_GPL(sched_overutilized_tp); 105 42 EXPORT_TRACEPOINT_SYMBOL_GPL(sched_util_est_cfs_tp); ··· 1089 1024 struct sched_domain *sd; 1090 1025 const struct cpumask *hk_mask; 1091 1026 1092 - if (housekeeping_cpu(cpu, HK_FLAG_TIMER)) { 1027 + if (housekeeping_cpu(cpu, HK_TYPE_TIMER)) { 1093 1028 if (!idle_cpu(cpu)) 1094 1029 return cpu; 1095 1030 default_cpu = cpu; 1096 1031 } 1097 1032 1098 - hk_mask = housekeeping_cpumask(HK_FLAG_TIMER); 1033 + hk_mask = housekeeping_cpumask(HK_TYPE_TIMER); 1099 1034 1100 1035 rcu_read_lock(); 1101 1036 for_each_domain(cpu, sd) { ··· 1111 1046 } 1112 1047 1113 1048 if (default_cpu == -1) 1114 - default_cpu = housekeeping_any_cpu(HK_FLAG_TIMER); 1049 + default_cpu = housekeeping_any_cpu(HK_TYPE_TIMER); 1115 1050 cpu = default_cpu; 1116 1051 
unlock: 1117 1052 rcu_read_unlock(); ··· 4899 4834 { 4900 4835 struct rq *rq = this_rq(); 4901 4836 struct mm_struct *mm = rq->prev_mm; 4902 - long prev_state; 4837 + unsigned int prev_state; 4903 4838 4904 4839 /* 4905 4840 * The previous task will have left us with a preempt_count of 2 ··· 5444 5379 int os; 5445 5380 struct tick_work *twork; 5446 5381 5447 - if (housekeeping_cpu(cpu, HK_FLAG_TICK)) 5382 + if (housekeeping_cpu(cpu, HK_TYPE_TICK)) 5448 5383 return; 5449 5384 5450 5385 WARN_ON_ONCE(!tick_work_cpu); ··· 5465 5400 struct tick_work *twork; 5466 5401 int os; 5467 5402 5468 - if (housekeeping_cpu(cpu, HK_FLAG_TICK)) 5403 + if (housekeeping_cpu(cpu, HK_TYPE_TICK)) 5469 5404 return; 5470 5405 5471 5406 WARN_ON_ONCE(!tick_work_cpu); ··· 6363 6298 migrate_disable_switch(rq, prev); 6364 6299 psi_sched_switch(prev, next, !task_on_rq_queued(prev)); 6365 6300 6366 - trace_sched_switch(sched_mode & SM_MASK_PREEMPT, prev, next); 6301 + trace_sched_switch(sched_mode & SM_MASK_PREEMPT, prev_state, prev, next); 6367 6302 6368 6303 /* Also unlocks the rq: */ 6369 6304 rq = context_switch(rq, prev, next, &rf); ··· 6555 6490 */ 6556 6491 if (likely(!preemptible())) 6557 6492 return; 6558 - 6559 6493 preempt_schedule_common(); 6560 6494 } 6561 6495 NOKPROBE_SYMBOL(preempt_schedule); 6562 6496 EXPORT_SYMBOL(preempt_schedule); 6563 6497 6564 6498 #ifdef CONFIG_PREEMPT_DYNAMIC 6565 - DEFINE_STATIC_CALL(preempt_schedule, __preempt_schedule_func); 6566 - EXPORT_STATIC_CALL_TRAMP(preempt_schedule); 6499 + #if defined(CONFIG_HAVE_PREEMPT_DYNAMIC_CALL) 6500 + #ifndef preempt_schedule_dynamic_enabled 6501 + #define preempt_schedule_dynamic_enabled preempt_schedule 6502 + #define preempt_schedule_dynamic_disabled NULL 6567 6503 #endif 6568 - 6504 + DEFINE_STATIC_CALL(preempt_schedule, preempt_schedule_dynamic_enabled); 6505 + EXPORT_STATIC_CALL_TRAMP(preempt_schedule); 6506 + #elif defined(CONFIG_HAVE_PREEMPT_DYNAMIC_KEY) 6507 + static 
DEFINE_STATIC_KEY_TRUE(sk_dynamic_preempt_schedule); 6508 + void __sched notrace dynamic_preempt_schedule(void) 6509 + { 6510 + if (!static_branch_unlikely(&sk_dynamic_preempt_schedule)) 6511 + return; 6512 + preempt_schedule(); 6513 + } 6514 + NOKPROBE_SYMBOL(dynamic_preempt_schedule); 6515 + EXPORT_SYMBOL(dynamic_preempt_schedule); 6516 + #endif 6517 + #endif 6569 6518 6570 6519 /** 6571 6520 * preempt_schedule_notrace - preempt_schedule called by tracing ··· 6634 6555 EXPORT_SYMBOL_GPL(preempt_schedule_notrace); 6635 6556 6636 6557 #ifdef CONFIG_PREEMPT_DYNAMIC 6637 - DEFINE_STATIC_CALL(preempt_schedule_notrace, __preempt_schedule_notrace_func); 6558 + #if defined(CONFIG_HAVE_PREEMPT_DYNAMIC_CALL) 6559 + #ifndef preempt_schedule_notrace_dynamic_enabled 6560 + #define preempt_schedule_notrace_dynamic_enabled preempt_schedule_notrace 6561 + #define preempt_schedule_notrace_dynamic_disabled NULL 6562 + #endif 6563 + DEFINE_STATIC_CALL(preempt_schedule_notrace, preempt_schedule_notrace_dynamic_enabled); 6638 6564 EXPORT_STATIC_CALL_TRAMP(preempt_schedule_notrace); 6565 + #elif defined(CONFIG_HAVE_PREEMPT_DYNAMIC_KEY) 6566 + static DEFINE_STATIC_KEY_TRUE(sk_dynamic_preempt_schedule_notrace); 6567 + void __sched notrace dynamic_preempt_schedule_notrace(void) 6568 + { 6569 + if (!static_branch_unlikely(&sk_dynamic_preempt_schedule_notrace)) 6570 + return; 6571 + preempt_schedule_notrace(); 6572 + } 6573 + NOKPROBE_SYMBOL(dynamic_preempt_schedule_notrace); 6574 + EXPORT_SYMBOL(dynamic_preempt_schedule_notrace); 6575 + #endif 6639 6576 #endif 6640 6577 6641 6578 #endif /* CONFIG_PREEMPTION */ 6642 - 6643 - #ifdef CONFIG_PREEMPT_DYNAMIC 6644 - 6645 - #include <linux/entry-common.h> 6646 - 6647 - /* 6648 - * SC:cond_resched 6649 - * SC:might_resched 6650 - * SC:preempt_schedule 6651 - * SC:preempt_schedule_notrace 6652 - * SC:irqentry_exit_cond_resched 6653 - * 6654 - * 6655 - * NONE: 6656 - * cond_resched <- __cond_resched 6657 - * might_resched <- RET0 6658 - * 
preempt_schedule <- NOP 6659 - * preempt_schedule_notrace <- NOP 6660 - * irqentry_exit_cond_resched <- NOP 6661 - * 6662 - * VOLUNTARY: 6663 - * cond_resched <- __cond_resched 6664 - * might_resched <- __cond_resched 6665 - * preempt_schedule <- NOP 6666 - * preempt_schedule_notrace <- NOP 6667 - * irqentry_exit_cond_resched <- NOP 6668 - * 6669 - * FULL: 6670 - * cond_resched <- RET0 6671 - * might_resched <- RET0 6672 - * preempt_schedule <- preempt_schedule 6673 - * preempt_schedule_notrace <- preempt_schedule_notrace 6674 - * irqentry_exit_cond_resched <- irqentry_exit_cond_resched 6675 - */ 6676 - 6677 - enum { 6678 - preempt_dynamic_undefined = -1, 6679 - preempt_dynamic_none, 6680 - preempt_dynamic_voluntary, 6681 - preempt_dynamic_full, 6682 - }; 6683 - 6684 - int preempt_dynamic_mode = preempt_dynamic_undefined; 6685 - 6686 - int sched_dynamic_mode(const char *str) 6687 - { 6688 - if (!strcmp(str, "none")) 6689 - return preempt_dynamic_none; 6690 - 6691 - if (!strcmp(str, "voluntary")) 6692 - return preempt_dynamic_voluntary; 6693 - 6694 - if (!strcmp(str, "full")) 6695 - return preempt_dynamic_full; 6696 - 6697 - return -EINVAL; 6698 - } 6699 - 6700 - void sched_dynamic_update(int mode) 6701 - { 6702 - /* 6703 - * Avoid {NONE,VOLUNTARY} -> FULL transitions from ever ending up in 6704 - * the ZERO state, which is invalid. 
-	 */
-	static_call_update(cond_resched, __cond_resched);
-	static_call_update(might_resched, __cond_resched);
-	static_call_update(preempt_schedule, __preempt_schedule_func);
-	static_call_update(preempt_schedule_notrace, __preempt_schedule_notrace_func);
-	static_call_update(irqentry_exit_cond_resched, irqentry_exit_cond_resched);
-
-	switch (mode) {
-	case preempt_dynamic_none:
-		static_call_update(cond_resched, __cond_resched);
-		static_call_update(might_resched, (void *)&__static_call_return0);
-		static_call_update(preempt_schedule, NULL);
-		static_call_update(preempt_schedule_notrace, NULL);
-		static_call_update(irqentry_exit_cond_resched, NULL);
-		pr_info("Dynamic Preempt: none\n");
-		break;
-
-	case preempt_dynamic_voluntary:
-		static_call_update(cond_resched, __cond_resched);
-		static_call_update(might_resched, __cond_resched);
-		static_call_update(preempt_schedule, NULL);
-		static_call_update(preempt_schedule_notrace, NULL);
-		static_call_update(irqentry_exit_cond_resched, NULL);
-		pr_info("Dynamic Preempt: voluntary\n");
-		break;
-
-	case preempt_dynamic_full:
-		static_call_update(cond_resched, (void *)&__static_call_return0);
-		static_call_update(might_resched, (void *)&__static_call_return0);
-		static_call_update(preempt_schedule, __preempt_schedule_func);
-		static_call_update(preempt_schedule_notrace, __preempt_schedule_notrace_func);
-		static_call_update(irqentry_exit_cond_resched, irqentry_exit_cond_resched);
-		pr_info("Dynamic Preempt: full\n");
-		break;
-	}
-
-	preempt_dynamic_mode = mode;
-}
-
-static int __init setup_preempt_mode(char *str)
-{
-	int mode = sched_dynamic_mode(str);
-	if (mode < 0) {
-		pr_warn("Dynamic Preempt: unsupported mode: %s\n", str);
-		return 0;
-	}
-
-	sched_dynamic_update(mode);
-	return 1;
-}
-__setup("preempt=", setup_preempt_mode);
-
-static void __init preempt_dynamic_init(void)
-{
-	if (preempt_dynamic_mode == preempt_dynamic_undefined) {
-		if (IS_ENABLED(CONFIG_PREEMPT_NONE)) {
-			sched_dynamic_update(preempt_dynamic_none);
-		} else if (IS_ENABLED(CONFIG_PREEMPT_VOLUNTARY)) {
-			sched_dynamic_update(preempt_dynamic_voluntary);
-		} else {
-			/* Default static call setting, nothing to do */
-			WARN_ON_ONCE(!IS_ENABLED(CONFIG_PREEMPT));
-			preempt_dynamic_mode = preempt_dynamic_full;
-			pr_info("Dynamic Preempt: full\n");
-		}
-	}
-}
-
-#else /* !CONFIG_PREEMPT_DYNAMIC */
-
-static inline void preempt_dynamic_init(void) { }
-
-#endif /* #ifdef CONFIG_PREEMPT_DYNAMIC */

 /*
  * This is the entry point to schedule() from kernel preemption
··· 8161 8202
 #endif

 #ifdef CONFIG_PREEMPT_DYNAMIC
+#if defined(CONFIG_HAVE_PREEMPT_DYNAMIC_CALL)
+#define cond_resched_dynamic_enabled __cond_resched
+#define cond_resched_dynamic_disabled ((void *)&__static_call_return0)
 DEFINE_STATIC_CALL_RET0(cond_resched, __cond_resched);
 EXPORT_STATIC_CALL_TRAMP(cond_resched);

+#define might_resched_dynamic_enabled __cond_resched
+#define might_resched_dynamic_disabled ((void *)&__static_call_return0)
 DEFINE_STATIC_CALL_RET0(might_resched, __cond_resched);
 EXPORT_STATIC_CALL_TRAMP(might_resched);
+#elif defined(CONFIG_HAVE_PREEMPT_DYNAMIC_KEY)
+static DEFINE_STATIC_KEY_FALSE(sk_dynamic_cond_resched);
+int __sched dynamic_cond_resched(void)
+{
+	if (!static_branch_unlikely(&sk_dynamic_cond_resched))
+		return 0;
+	return __cond_resched();
+}
+EXPORT_SYMBOL(dynamic_cond_resched);
+
+static DEFINE_STATIC_KEY_FALSE(sk_dynamic_might_resched);
+int __sched dynamic_might_resched(void)
+{
+	if (!static_branch_unlikely(&sk_dynamic_might_resched))
+		return 0;
+	return __cond_resched();
+}
+EXPORT_SYMBOL(dynamic_might_resched);
+#endif
 #endif

 /*
··· 8253 8270
 	return ret;
 }
 EXPORT_SYMBOL(__cond_resched_rwlock_write);
+
+#ifdef CONFIG_PREEMPT_DYNAMIC
+
+#ifdef CONFIG_GENERIC_ENTRY
+#include <linux/entry-common.h>
+#endif
+
+/*
+ * SC:cond_resched
+ * SC:might_resched
+ * SC:preempt_schedule
+ * SC:preempt_schedule_notrace
+ * SC:irqentry_exit_cond_resched
+ *
+ *
+ * NONE:
+ *   cond_resched <- __cond_resched
+ *   might_resched <- RET0
+ *   preempt_schedule <- NOP
+ *   preempt_schedule_notrace <- NOP
+ *   irqentry_exit_cond_resched <- NOP
+ *
+ * VOLUNTARY:
+ *   cond_resched <- __cond_resched
+ *   might_resched <- __cond_resched
+ *   preempt_schedule <- NOP
+ *   preempt_schedule_notrace <- NOP
+ *   irqentry_exit_cond_resched <- NOP
+ *
+ * FULL:
+ *   cond_resched <- RET0
+ *   might_resched <- RET0
+ *   preempt_schedule <- preempt_schedule
+ *   preempt_schedule_notrace <- preempt_schedule_notrace
+ *   irqentry_exit_cond_resched <- irqentry_exit_cond_resched
+ */
+
+enum {
+	preempt_dynamic_undefined = -1,
+	preempt_dynamic_none,
+	preempt_dynamic_voluntary,
+	preempt_dynamic_full,
+};
+
+int preempt_dynamic_mode = preempt_dynamic_undefined;
+
+int sched_dynamic_mode(const char *str)
+{
+	if (!strcmp(str, "none"))
+		return preempt_dynamic_none;
+
+	if (!strcmp(str, "voluntary"))
+		return preempt_dynamic_voluntary;
+
+	if (!strcmp(str, "full"))
+		return preempt_dynamic_full;
+
+	return -EINVAL;
+}
+
+#if defined(CONFIG_HAVE_PREEMPT_DYNAMIC_CALL)
+#define preempt_dynamic_enable(f) static_call_update(f, f##_dynamic_enabled)
+#define preempt_dynamic_disable(f) static_call_update(f, f##_dynamic_disabled)
+#elif defined(CONFIG_HAVE_PREEMPT_DYNAMIC_KEY)
+#define preempt_dynamic_enable(f) static_key_enable(&sk_dynamic_##f.key)
+#define preempt_dynamic_disable(f) static_key_disable(&sk_dynamic_##f.key)
+#else
+#error "Unsupported PREEMPT_DYNAMIC mechanism"
+#endif
+
+void sched_dynamic_update(int mode)
+{
+	/*
+	 * Avoid {NONE,VOLUNTARY} -> FULL transitions from ever ending up in
+	 * the ZERO state, which is invalid.
+	 */
+	preempt_dynamic_enable(cond_resched);
+	preempt_dynamic_enable(might_resched);
+	preempt_dynamic_enable(preempt_schedule);
+	preempt_dynamic_enable(preempt_schedule_notrace);
+	preempt_dynamic_enable(irqentry_exit_cond_resched);
+
+	switch (mode) {
+	case preempt_dynamic_none:
+		preempt_dynamic_enable(cond_resched);
+		preempt_dynamic_disable(might_resched);
+		preempt_dynamic_disable(preempt_schedule);
+		preempt_dynamic_disable(preempt_schedule_notrace);
+		preempt_dynamic_disable(irqentry_exit_cond_resched);
+		pr_info("Dynamic Preempt: none\n");
+		break;
+
+	case preempt_dynamic_voluntary:
+		preempt_dynamic_enable(cond_resched);
+		preempt_dynamic_enable(might_resched);
+		preempt_dynamic_disable(preempt_schedule);
+		preempt_dynamic_disable(preempt_schedule_notrace);
+		preempt_dynamic_disable(irqentry_exit_cond_resched);
+		pr_info("Dynamic Preempt: voluntary\n");
+		break;
+
+	case preempt_dynamic_full:
+		preempt_dynamic_disable(cond_resched);
+		preempt_dynamic_disable(might_resched);
+		preempt_dynamic_enable(preempt_schedule);
+		preempt_dynamic_enable(preempt_schedule_notrace);
+		preempt_dynamic_enable(irqentry_exit_cond_resched);
+		pr_info("Dynamic Preempt: full\n");
+		break;
+	}
+
+	preempt_dynamic_mode = mode;
+}
+
+static int __init setup_preempt_mode(char *str)
+{
+	int mode = sched_dynamic_mode(str);
+	if (mode < 0) {
+		pr_warn("Dynamic Preempt: unsupported mode: %s\n", str);
+		return 0;
+	}
+
+	sched_dynamic_update(mode);
+	return 1;
+}
+__setup("preempt=", setup_preempt_mode);
+
+static void __init preempt_dynamic_init(void)
+{
+	if (preempt_dynamic_mode == preempt_dynamic_undefined) {
+		if (IS_ENABLED(CONFIG_PREEMPT_NONE)) {
+			sched_dynamic_update(preempt_dynamic_none);
+		} else if (IS_ENABLED(CONFIG_PREEMPT_VOLUNTARY)) {
+			sched_dynamic_update(preempt_dynamic_voluntary);
+		} else {
+			/* Default static call setting, nothing to do */
+			WARN_ON_ONCE(!IS_ENABLED(CONFIG_PREEMPT));
+			preempt_dynamic_mode = preempt_dynamic_full;
+			pr_info("Dynamic Preempt: full\n");
+		}
+	}
+}
+
+#else /* !CONFIG_PREEMPT_DYNAMIC */
+
+static inline void preempt_dynamic_init(void) { }
+
+#endif /* #ifdef CONFIG_PREEMPT_DYNAMIC */

 /**
  * yield - yield the current processor to other threads.
··· 8837 8706
 {
 	int ret = 1;

-	if (!cpumask_weight(cur))
+	if (cpumask_empty(cur))
 		return ret;

 	ret = dl_cpuset_cpumask_can_shrink(cur, trial);
··· 8865 8734
 	}

 	if (dl_task(p) && !cpumask_intersects(task_rq(p)->rd->span,
-					      cs_cpus_allowed))
-		ret = dl_task_can_attach(p, cs_cpus_allowed);
+					      cs_cpus_allowed)) {
+		int cpu = cpumask_any_and(cpu_active_mask, cs_cpus_allowed);
+
+		ret = dl_cpu_busy(cpu, p);
+	}

 out:
 	return ret;
··· 9153 9019
 static int cpuset_cpu_inactive(unsigned int cpu)
 {
 	if (!cpuhp_tasks_frozen) {
-		if (dl_cpu_busy(cpu))
-			return -EBUSY;
+		int ret = dl_cpu_busy(cpu, NULL);
+
+		if (ret)
+			return ret;
 		cpuset_update_active_cpus();
 	} else {
 		num_cpus_frozen++;
··· 9186 9050
 	set_cpu_active(cpu, true);

 	if (sched_smp_initialized) {
+		sched_update_numa(cpu, true);
 		sched_domains_numa_masks_set(cpu);
 		cpuset_cpu_active();
 	}
··· 9265 9128
 	if (!sched_smp_initialized)
 		return 0;

+	sched_update_numa(cpu, false);
 	ret = cpuset_cpu_inactive(cpu);
 	if (ret) {
 		balance_push_set(cpu, false);
 		set_cpu_active(cpu, true);
+		sched_update_numa(cpu, true);
 		return ret;
 	}
 	sched_domains_numa_masks_clear(cpu);
··· 9373 9234

 void __init sched_init_smp(void)
 {
-	sched_init_numa();
+	sched_init_numa(NUMA_NO_NODE);

 	/*
 	 * There's no userspace yet to cause hotplug operations; hence all the
··· 9385 9246
 	mutex_unlock(&sched_domains_mutex);

 	/* Move init over to a non-isolated CPU */
-	if (set_cpus_allowed_ptr(current, housekeeping_cpumask(HK_FLAG_DOMAIN)) < 0)
+	if (set_cpus_allowed_ptr(current, housekeeping_cpumask(HK_TYPE_DOMAIN)) < 0)
 		BUG();
 	current->flags &= ~PF_NO_SETAFFINITY;
 	sched_init_granularity();
··· 9485 9346
 #endif /* CONFIG_CPUMASK_OFFSTACK */

 	init_rt_bandwidth(&def_rt_bandwidth, global_rt_period(), global_rt_runtime());
-	init_dl_bandwidth(&def_dl_bandwidth, global_rt_period(), global_rt_runtime());

 #ifdef CONFIG_SMP
 	init_defrootdomain();
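The mode table in the comment block of kernel/sched/core.c above (which hooks are live under NONE, VOLUNTARY and FULL) can be modelled outside the kernel. What follows is a minimal user-space sketch under stated assumptions, not kernel code: parse_preempt_mode() mirrors the string matching in sched_dynamic_mode(), while enum hook, hook_enabled() and the MODE_*/HOOK_* names are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names; they mirror the anonymous enum in the diff. */
enum preempt_mode {
	MODE_UNDEFINED = -1,
	MODE_NONE,
	MODE_VOLUNTARY,
	MODE_FULL,
};

/* One entry per "SC:" line of the comment table (hypothetical enum). */
enum hook {
	HOOK_COND_RESCHED,
	HOOK_MIGHT_RESCHED,
	HOOK_PREEMPT_SCHEDULE,
	HOOK_PREEMPT_SCHEDULE_NOTRACE,
	HOOK_IRQENTRY_EXIT_COND_RESCHED,
};

/* Map a preempt= boot parameter value to a mode, or -EINVAL if unknown. */
int parse_preempt_mode(const char *str)
{
	assert(str != NULL);

	if (!strcmp(str, "none"))
		return MODE_NONE;
	if (!strcmp(str, "voluntary"))
		return MODE_VOLUNTARY;
	if (!strcmp(str, "full"))
		return MODE_FULL;
	return -EINVAL;
}

/* Return 1 if the hook does real work in this mode, 0 if it is RET0/NOP. */
int hook_enabled(enum preempt_mode mode, enum hook h)
{
	switch (mode) {
	case MODE_NONE:
		return h == HOOK_COND_RESCHED;
	case MODE_VOLUNTARY:
		return h == HOOK_COND_RESCHED || h == HOOK_MIGHT_RESCHED;
	case MODE_FULL:
		return h == HOOK_PREEMPT_SCHEDULE ||
		       h == HOOK_PREEMPT_SCHEDULE_NOTRACE ||
		       h == HOOK_IRQENTRY_EXIT_COND_RESCHED;
	default:
		return 0;
	}
}
```

Calling hook_enabled(parse_preempt_mode("voluntary"), HOOK_MIGHT_RESCHED) answers the same question the table does for one cell. The real code flips static calls or static keys instead of returning a flag.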
-3
kernel/sched/core_sched.c
··· 1 1
 // SPDX-License-Identifier: GPL-2.0-only

-#include <linux/prctl.h>
-#include "sched.h"
-
 /*
  * A simple wrapper around refcount. An allocated sched_core_cookie's
  * address is used to compute the cookie of the task.
+4 -8
kernel/sched/cpuacct.c
··· 1 1
 // SPDX-License-Identifier: GPL-2.0
+
 /*
  * CPU accounting code for task groups.
  *
  * Based on the work by Paul Menage (menage@google.com) and Balbir Singh
  * (balbir@in.ibm.com).
  */
-#include <asm/irq_regs.h>
-#include "sched.h"

 /* Time spent by the tasks of the CPU accounting group executing in ... */
 enum cpuacct_stat_index {
··· 333 334
  */
 void cpuacct_charge(struct task_struct *tsk, u64 cputime)
 {
+	unsigned int cpu = task_cpu(tsk);
 	struct cpuacct *ca;

-	rcu_read_lock();
+	lockdep_assert_rq_held(cpu_rq(cpu));

 	for (ca = task_ca(tsk); ca; ca = parent_ca(ca))
-		__this_cpu_add(*ca->cpuusage, cputime);
-
-	rcu_read_unlock();
+		*per_cpu_ptr(ca->cpuusage, cpu) += cputime;
 }

 /*
··· 351 353
 {
 	struct cpuacct *ca;

-	rcu_read_lock();
 	for (ca = task_ca(tsk); ca != &root_cpuacct; ca = parent_ca(ca))
 		__this_cpu_add(ca->cpustat->cpustat[index], val);
-	rcu_read_unlock();
 }

 struct cgroup_subsys cpuacct_cgrp_subsys = {
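The cpuacct_charge() hunk above charges the slot of the task's own CPU (task_cpu(tsk)) in every ancestor group, rather than the slot of whichever CPU happens to run the call. A minimal user-space sketch of that walk follows. It is an illustration only: struct cpuacct_model, NR_CPUS_MODEL and charge_model() are hypothetical stand-ins, and a plain array replaces the kernel's per-CPU storage and locking.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_CPUS_MODEL 4 /* hypothetical fixed CPU count for the sketch */

/* Hypothetical stand-in for struct cpuacct: a parent link plus per-CPU usage. */
struct cpuacct_model {
	struct cpuacct_model *parent;
	uint64_t cpuusage[NR_CPUS_MODEL];
};

/* Add cputime to the given CPU's counter in the group and all its ancestors. */
void charge_model(struct cpuacct_model *ca, unsigned int cpu, uint64_t cputime)
{
	assert(cpu < NR_CPUS_MODEL);

	for (; ca != NULL; ca = ca->parent)
		ca->cpuusage[cpu] += cputime;
}
```

Passing the CPU explicitly is the point of the sketch: the counter that gets updated depends on the task being charged, not on the caller's location.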
+1 -2
kernel/sched/cpudeadline.c
··· 1 1
 // SPDX-License-Identifier: GPL-2.0-only
 /*
- * kernel/sched/cpudl.c
+ * kernel/sched/cpudeadline.c
  *
  * Global CPU deadline management
  *
  * Author: Juri Lelli <j.lelli@sssup.it>
  */
-#include "sched.h"

 static inline int parent(int i)
 {
-3
kernel/sched/cpufreq.c
··· 5 5
  * Copyright (C) 2016, Intel Corporation
  * Author: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
  */
-#include <linux/cpufreq.h>
-
-#include "sched.h"

 DEFINE_PER_CPU(struct update_util_data __rcu *, cpufreq_update_util_data);

+9 -9
kernel/sched/cpufreq_schedutil.c
··· 6 6
  * Author: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
  */

-#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
-
-#include "sched.h"
-
-#include <linux/sched/cpufreq.h>
-#include <trace/events/power.h>
-
 #define IOWAIT_BOOST_MIN (SCHED_CAPACITY_SCALE / 8)

 struct sugov_tunables {
··· 282 289
 	 * into the same scale so we can compare.
 	 */
 	boost = (sg_cpu->iowait_boost * sg_cpu->max) >> SCHED_CAPACITY_SHIFT;
+	boost = uclamp_rq_util_with(cpu_rq(sg_cpu->cpu), boost, NULL);
 	if (sg_cpu->util < boost)
 		sg_cpu->util = boost;
 }
··· 342 348
 	/*
 	 * Do not reduce the frequency if the CPU has not been idle
 	 * recently, as the reduction is likely to be premature then.
+	 *
+	 * Except when the rq is capped by uclamp_max.
 	 */
-	if (sugov_cpu_is_busy(sg_cpu) && next_f < sg_policy->next_freq) {
+	if (!uclamp_rq_is_capped(cpu_rq(sg_cpu->cpu)) &&
+	    sugov_cpu_is_busy(sg_cpu) && next_f < sg_policy->next_freq) {
 		next_f = sg_policy->next_freq;

 		/* Restore cached freq as next_freq has changed */
··· 392 395
 	/*
 	 * Do not reduce the target performance level if the CPU has not been
 	 * idle recently, as the reduction is likely to be premature then.
+	 *
+	 * Except when the rq is capped by uclamp_max.
 	 */
-	if (sugov_cpu_is_busy(sg_cpu) && sg_cpu->util < prev_util)
+	if (!uclamp_rq_is_capped(cpu_rq(sg_cpu->cpu)) &&
+	    sugov_cpu_is_busy(sg_cpu) && sg_cpu->util < prev_util)
 		sg_cpu->util = prev_util;

 	cpufreq_driver_adjust_perf(sg_cpu->cpu, map_util_perf(sg_cpu->bw_dl),
-1
kernel/sched/cpupri.c
··· 22 22
  * worst case complexity of O(min(101, nr_domcpus)), though the scenario that
  * yields the worst case search is fairly contrived.
  */
-#include "sched.h"

 /*
  * p->rt_priority p->prio newpri cpupri
-1
kernel/sched/cputime.c
··· 2 2
 /*
  * Simple CPU accounting cgroup controller
  */
-#include "sched.h"

 #ifdef CONFIG_IRQ_TIME_ACCOUNTING

+78 -77
kernel/sched/deadline.c
··· 15 15
  * Michael Trimarchi <michael@amarulasolutions.com>,
  * Fabio Checconi <fchecconi@gmail.com>
  */
-#include "sched.h"
-#include "pelt.h"
-
-struct dl_bandwidth def_dl_bandwidth;

 static inline struct task_struct *dl_task_of(struct sched_dl_entity *dl_se)
 {
··· 126 130
 	rd->visit_gen = gen;
 	return false;
 }
+
+static inline
+void __dl_update(struct dl_bw *dl_b, s64 bw)
+{
+	struct root_domain *rd = container_of(dl_b, struct root_domain, dl_bw);
+	int i;
+
+	RCU_LOCKDEP_WARN(!rcu_read_lock_sched_held(),
+			 "sched RCU must be held");
+	for_each_cpu_and(i, rd->span, cpu_active_mask) {
+		struct rq *rq = cpu_rq(i);
+
+		rq->dl.extra_bw += bw;
+	}
+}
 #else
 static inline struct dl_bw *dl_bw_of(int i)
 {
··· 161 150
 {
 	return false;
 }
+
+static inline
+void __dl_update(struct dl_bw *dl_b, s64 bw)
+{
+	struct dl_rq *dl = container_of(dl_b, struct dl_rq, dl_bw);
+
+	dl->extra_bw += bw;
+}
 #endif
+
+static inline
+void __dl_sub(struct dl_bw *dl_b, u64 tsk_bw, int cpus)
+{
+	dl_b->total_bw -= tsk_bw;
+	__dl_update(dl_b, (s32)tsk_bw / cpus);
+}
+
+static inline
+void __dl_add(struct dl_bw *dl_b, u64 tsk_bw, int cpus)
+{
+	dl_b->total_bw += tsk_bw;
+	__dl_update(dl_b, -((s32)tsk_bw / cpus));
+}
+
+static inline bool
+__dl_overflow(struct dl_bw *dl_b, unsigned long cap, u64 old_bw, u64 new_bw)
+{
+	return dl_b->bw != -1 &&
+	       cap_scale(dl_b->bw, cap) < dl_b->total_bw - old_bw + new_bw;
+}

 static inline
 void __add_running_bw(u64 dl_bw, struct dl_rq *dl_rq)
··· 448 408
 {
 	struct sched_dl_entity *dl_se = &p->dl;

-	return dl_rq->root.rb_leftmost == &dl_se->rb_node;
+	return rb_first_cached(&dl_rq->root) == &dl_se->rb_node;
 }

 static void init_dl_rq_bw_ratio(struct dl_rq *dl_rq);
··· 463 423
 void init_dl_bw(struct dl_bw *dl_b)
 {
 	raw_spin_lock_init(&dl_b->lock);
-	raw_spin_lock(&def_dl_bandwidth.dl_runtime_lock);
 	if (global_rt_runtime() == RUNTIME_INF)
 		dl_b->bw = -1;
 	else
 		dl_b->bw = to_ratio(global_rt_period(), global_rt_runtime());
-	raw_spin_unlock(&def_dl_bandwidth.dl_runtime_lock);
 	dl_b->total_bw = 0;
 }

··· 718 680

 static inline
 void dec_dl_migration(struct sched_dl_entity *dl_se, struct dl_rq *dl_rq)
-{
-}
-
-static inline bool need_pull_dl_task(struct rq *rq, struct task_struct *prev)
-{
-	return false;
-}
-
-static inline void pull_dl_task(struct rq *rq)
 {
 }

··· 1422 1393
 	timer->function = inactive_task_timer;
 }

+#define __node_2_dle(node) \
+	rb_entry((node), struct sched_dl_entity, rb_node)
+
 #ifdef CONFIG_SMP

 static void inc_dl_deadline(struct dl_rq *dl_rq, u64 deadline)
··· 1454 1422
 		cpudl_clear(&rq->rd->cpudl, rq->cpu);
 		cpupri_set(&rq->rd->cpupri, rq->cpu, rq->rt.highest_prio.curr);
 	} else {
-		struct rb_node *leftmost = dl_rq->root.rb_leftmost;
-		struct sched_dl_entity *entry;
+		struct rb_node *leftmost = rb_first_cached(&dl_rq->root);
+		struct sched_dl_entity *entry = __node_2_dle(leftmost);

-		entry = rb_entry(leftmost, struct sched_dl_entity, rb_node);
 		dl_rq->earliest_dl.curr = entry->deadline;
 		cpudl_set(&rq->rd->cpudl, rq->cpu, entry->deadline);
 	}
··· 1496 1465
 	dec_dl_deadline(dl_rq, dl_se->deadline);
 	dec_dl_migration(dl_se, dl_rq);
 }
-
-#define __node_2_dle(node) \
-	rb_entry((node), struct sched_dl_entity, rb_node)

 static inline bool __dl_less(struct rb_node *a, const struct rb_node *b)
 {
··· 1959 1931
 	deadline_queue_push_tasks(rq);
 }

-static struct sched_dl_entity *pick_next_dl_entity(struct rq *rq,
-						   struct dl_rq *dl_rq)
+static struct sched_dl_entity *pick_next_dl_entity(struct dl_rq *dl_rq)
 {
 	struct rb_node *left = rb_first_cached(&dl_rq->root);

 	if (!left)
 		return NULL;

-	return rb_entry(left, struct sched_dl_entity, rb_node);
+	return __node_2_dle(left);
 }

 static struct task_struct *pick_task_dl(struct rq *rq)
··· 1978 1951
 	if (!sched_dl_runnable(rq))
 		return NULL;

-	dl_se = pick_next_dl_entity(rq, dl_rq);
+	dl_se = pick_next_dl_entity(dl_rq);
 	BUG_ON(!dl_se);
 	p = dl_task_of(dl_se);

··· 2061 2034
  */
 static struct task_struct *pick_earliest_pushable_dl_task(struct rq *rq, int cpu)
 {
-	struct rb_node *next_node = rq->dl.pushable_dl_tasks_root.rb_leftmost;
 	struct task_struct *p = NULL;
+	struct rb_node *next_node;

 	if (!has_pushable_dl_tasks(rq))
 		return NULL;

+	next_node = rb_first_cached(&rq->dl.pushable_dl_tasks_root);
+
 next_node:
 	if (next_node) {
-		p = rb_entry(next_node, struct task_struct, pushable_dl_tasks);
+		p = __node_2_pdl(next_node);

 		if (pick_dl_task(rq, p, cpu))
 			return p;
··· 2237 2208
 	if (!has_pushable_dl_tasks(rq))
 		return NULL;

-	p = rb_entry(rq->dl.pushable_dl_tasks_root.rb_leftmost,
-		     struct task_struct, pushable_dl_tasks);
+	p = __node_2_pdl(rb_first_cached(&rq->dl.pushable_dl_tasks_root));

 	BUG_ON(rq->cpu != task_cpu(p));
 	BUG_ON(task_current(rq, p));
··· 2268 2240
 		return 0;

 retry:
-	if (is_migration_disabled(next_task))
-		return 0;
-
-	if (WARN_ON(next_task == rq->curr))
-		return 0;
-
 	/*
 	 * If next_task preempts rq->curr, and rq->curr
 	 * can move away, it makes sense to just reschedule
··· 2279 2257
 		resched_curr(rq);
 		return 0;
 	}
+
+	if (is_migration_disabled(next_task))
+		return 0;
+
+	if (WARN_ON(next_task == rq->curr))
+		return 0;

 	/* We might release rq lock */
 	get_task_struct(next_task);
··· 2759 2731
 	int cpu;
 	unsigned long flags;

-	def_dl_bandwidth.dl_period = global_rt_period();
-	def_dl_bandwidth.dl_runtime = global_rt_runtime();
-
 	if (global_rt_runtime() != RUNTIME_INF)
 		new_bw = to_ratio(global_rt_period(), global_rt_runtime());

··· 2980 2955
 }

 #ifdef CONFIG_SMP
-int dl_task_can_attach(struct task_struct *p, const struct cpumask *cs_cpus_allowed)
-{
-	unsigned long flags, cap;
-	unsigned int dest_cpu;
-	struct dl_bw *dl_b;
-	bool overflow;
-	int ret;
-
-	dest_cpu = cpumask_any_and(cpu_active_mask, cs_cpus_allowed);
-
-	rcu_read_lock_sched();
-	dl_b = dl_bw_of(dest_cpu);
-	raw_spin_lock_irqsave(&dl_b->lock, flags);
-	cap = dl_bw_capacity(dest_cpu);
-	overflow = __dl_overflow(dl_b, cap, 0, p->dl.dl_bw);
-	if (overflow) {
-		ret = -EBUSY;
-	} else {
-		/*
-		 * We reserve space for this task in the destination
-		 * root_domain, as we can't fail after this point.
-		 * We will free resources in the source root_domain
-		 * later on (see set_cpus_allowed_dl()).
-		 */
-		int cpus = dl_bw_cpus(dest_cpu);
-
-		__dl_add(dl_b, p->dl.dl_bw, cpus);
-		ret = 0;
-	}
-	raw_spin_unlock_irqrestore(&dl_b->lock, flags);
-	rcu_read_unlock_sched();
-
-	return ret;
-}
-
 int dl_cpuset_cpumask_can_shrink(const struct cpumask *cur,
 				 const struct cpumask *trial)
 {
··· 3001 3011
 	return ret;
 }

-bool dl_cpu_busy(unsigned int cpu)
+int dl_cpu_busy(int cpu, struct task_struct *p)
 {
 	unsigned long flags, cap;
 	struct dl_bw *dl_b;
··· 3011 3021
 	dl_b = dl_bw_of(cpu);
 	raw_spin_lock_irqsave(&dl_b->lock, flags);
 	cap = dl_bw_capacity(cpu);
-	overflow = __dl_overflow(dl_b, cap, 0, 0);
+	overflow = __dl_overflow(dl_b, cap, 0, p ? p->dl.dl_bw : 0);
+
+	if (!overflow && p) {
+		/*
+		 * We reserve space for this task in the destination
+		 * root_domain, as we can't fail after this point.
+		 * We will free resources in the source root_domain
+		 * later on (see set_cpus_allowed_dl()).
+		 */
+		__dl_add(dl_b, p->dl.dl_bw, dl_bw_cpus(cpu));
+	}
+
 	raw_spin_unlock_irqrestore(&dl_b->lock, flags);
 	rcu_read_unlock_sched();

-	return overflow;
+	return overflow ? -EBUSY : 0;
 }
 #endif

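The merged dl_cpu_busy() above performs one admission check and, when a task is supplied, reserves its bandwidth in the same critical section. The sketch below models only that arithmetic in user space. It makes several assumptions: struct dl_bw_model and the *_model functions are hypothetical names, CAPACITY_SHIFT is assumed to equal the kernel's capacity shift of 10, locking and RCU are left out, and the per-runqueue extra_bw update done by __dl_add() is omitted.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: matches the kernel's fixed-point capacity shift (1024 per full CPU). */
#define CAPACITY_SHIFT 10

/* Hypothetical stand-in for struct dl_bw; bw == UINT64_MAX plays the role of -1. */
struct dl_bw_model {
	uint64_t bw;       /* allowed bandwidth per unit of capacity */
	uint64_t total_bw; /* bandwidth already reserved */
};

/* Scale a bandwidth value by the summed CPU capacity of the root domain. */
static uint64_t cap_scale_model(uint64_t v, uint64_t cap)
{
	return (v * cap) >> CAPACITY_SHIFT;
}

/* True if replacing old_bw with new_bw would exceed the scaled limit. */
static bool dl_overflow_model(const struct dl_bw_model *b, uint64_t cap,
			      uint64_t old_bw, uint64_t new_bw)
{
	return b->bw != UINT64_MAX &&
	       cap_scale_model(b->bw, cap) < b->total_bw - old_bw + new_bw;
}

/*
 * Check admission; if task_bw is non-NULL and it fits, reserve it.
 * Returns 0 on success and -EBUSY on overflow, like the merged function.
 */
int dl_cpu_busy_model(struct dl_bw_model *b, uint64_t cap, const uint64_t *task_bw)
{
	bool overflow;

	assert(b != NULL);

	overflow = dl_overflow_model(b, cap, 0, task_bw ? *task_bw : 0);
	if (!overflow && task_bw)
		b->total_bw += *task_bw;

	return overflow ? -EBUSY : 0;
}
```

Passing NULL for task_bw corresponds to the hotplug caller (dl_cpu_busy(cpu, NULL)), which only asks whether the current reservations still fit. Passing a bandwidth corresponds to the cpuset attach path, which must also reserve.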
-11
kernel/sched/debug.c
··· 6 6
  *
  * Copyright(C) 2007, Red Hat, Inc., Ingo Molnar
  */
-#include "sched.h"

 /*
  * This allows printing both to /proc/sched_debug and
··· 930 931
 static void sched_show_numa(struct task_struct *p, struct seq_file *m)
 {
 #ifdef CONFIG_NUMA_BALANCING
-	struct mempolicy *pol;
-
 	if (p->mm)
 		P(mm->numa_scan_seq);
-
-	task_lock(p);
-	pol = p->mempolicy;
-	if (pol && !(pol->flags & MPOL_F_MORON))
-		pol = NULL;
-	mpol_get(pol);
-	task_unlock(p);

 	P(numa_pages_migrated);
 	P(numa_preferred_nid);
··· 939 949
 	SEQ_printf(m, "current_node=%d, numa_group_id=%d\n",
 			task_node(p), task_numa_group_id(p));
 	show_numa_stats(p, m);
-	mpol_put(pol);
 #endif
 }

+83 -30
kernel/sched/fair.c
··· 20 20
  * Adaptive scheduling granularity, math enhancements by Peter Zijlstra
  * Copyright (C) 2007 Red Hat, Inc., Peter Zijlstra
  */
+#include <linux/energy_model.h>
+#include <linux/mmap_lock.h>
+#include <linux/hugetlb_inline.h>
+#include <linux/jiffies.h>
+#include <linux/mm_api.h>
+#include <linux/highmem.h>
+#include <linux/spinlock_api.h>
+#include <linux/cpumask_api.h>
+#include <linux/lockdep_api.h>
+#include <linux/softirq.h>
+#include <linux/refcount_api.h>
+#include <linux/topology.h>
+#include <linux/sched/clock.h>
+#include <linux/sched/cond_resched.h>
+#include <linux/sched/cputime.h>
+#include <linux/sched/isolation.h>
+
+#include <linux/cpuidle.h>
+#include <linux/interrupt.h>
+#include <linux/mempolicy.h>
+#include <linux/mutex_api.h>
+#include <linux/profile.h>
+#include <linux/psi.h>
+#include <linux/ratelimit.h>
+
+#include <asm/switch_to.h>
+
+#include <linux/sched/cond_resched.h>
+
 #include "sched.h"
+#include "stats.h"
+#include "autogroup.h"

 /*
  * Targeted preemption latency for CPU-bound tasks:
··· 1290 1259

 /* Handle placement on systems where not all nodes are directly connected. */
 static unsigned long score_nearby_nodes(struct task_struct *p, int nid,
-					int maxdist, bool task)
+					int lim_dist, bool task)
 {
 	unsigned long score = 0;
-	int node;
+	int node, max_dist;

 	/*
 	 * All nodes are directly connected, and the same distance
··· 1302 1271
 	if (sched_numa_topology_type == NUMA_DIRECT)
 		return 0;

+	/* sched_max_numa_distance may be changed in parallel. */
+	max_dist = READ_ONCE(sched_max_numa_distance);
 	/*
 	 * This code is called for each node, introducing N^2 complexity,
 	 * which should be ok given the number of nodes rarely exceeds 8.
··· 1316 1283
 		 * The furthest away nodes in the system are not interesting
 		 * for placement; nid was already counted.
 		 */
-		if (dist == sched_max_numa_distance || node == nid)
+		if (dist >= max_dist || node == nid)
 			continue;

 		/*
··· 1326 1293
 		 * "hoplimit", only nodes closer by than "hoplimit" are part
 		 * of each group. Skip other nodes.
 		 */
-		if (sched_numa_topology_type == NUMA_BACKPLANE &&
-					dist >= maxdist)
+		if (sched_numa_topology_type == NUMA_BACKPLANE && dist >= lim_dist)
 			continue;

 		/* Add up the faults from nearby nodes. */
··· 1344 1312
 		 * This seems to result in good task placement.
 		 */
 		if (sched_numa_topology_type == NUMA_GLUELESS_MESH) {
-			faults *= (sched_max_numa_distance - dist);
-			faults /= (sched_max_numa_distance - LOCAL_DISTANCE);
+			faults *= (max_dist - dist);
+			faults /= (max_dist - LOCAL_DISTANCE);
 		}

 		score += faults;
··· 1521 1489

 	int src_cpu, src_nid;
 	int dst_cpu, dst_nid;
+	int imb_numa_nr;

 	struct numa_stats src_stats, dst_stats;

··· 1536 1503
 static unsigned long cpu_load(struct rq *rq);
 static unsigned long cpu_runnable(struct rq *rq);
 static inline long adjust_numa_imbalance(int imbalance,
-					int dst_running, int dst_weight);
+					int dst_running, int imb_numa_nr);

 static inline enum
 numa_type numa_classify(unsigned int imbalance_pct,
··· 1917 1884
 		dst_running = env->dst_stats.nr_running + 1;
 		imbalance = max(0, dst_running - src_running);
 		imbalance = adjust_numa_imbalance(imbalance, dst_running,
-							env->dst_stats.weight);
+						  env->imb_numa_nr);

 		/* Use idle CPU if there is no imbalance */
 		if (!imbalance) {
··· 1982 1949
 	 */
 	rcu_read_lock();
 	sd = rcu_dereference(per_cpu(sd_numa, env.src_cpu));
-	if (sd)
+	if (sd) {
 		env.imbalance_pct = 100 + (sd->imbalance_pct - 100) / 2;
+		env.imb_numa_nr = sd->imb_numa_nr;
+	}
 	rcu_read_unlock();

 	/*
··· 2020 1985
 	 */
 	ng = deref_curr_numa_group(p);
 	if (env.best_cpu == -1 || (ng && ng->active_nodes > 1)) {
-		for_each_online_node(nid) {
+		for_each_node_state(nid, N_CPU) {
 			if (nid == env.src_nid || nid == p->numa_preferred_nid)
 				continue;

··· 2118 2083
 	unsigned long faults, max_faults = 0;
 	int nid, active_nodes = 0;

-	for_each_online_node(nid) {
+	for_each_node_state(nid, N_CPU) {
 		faults = group_faults_cpu(numa_group, nid);
 		if (faults > max_faults)
 			max_faults = faults;
 	}

-	for_each_online_node(nid) {
+	for_each_node_state(nid, N_CPU) {
 		faults = group_faults_cpu(numa_group, nid);
 		if (faults * ACTIVE_NODE_FRACTION > max_faults)
 			active_nodes++;
··· 2278 2243

 	dist = sched_max_numa_distance;

-	for_each_online_node(node) {
+	for_each_node_state(node, N_CPU) {
 		score = group_weight(p, node, dist);
 		if (score > max_score) {
 			max_score = score;
··· 2297 2262
 	 * inside the highest scoring group of nodes. The nodemask tricks
 	 * keep the complexity of the search down.
 	 */
-	nodes = node_online_map;
+	nodes = node_states[N_CPU];
 	for (dist = sched_max_numa_distance; dist > LOCAL_DISTANCE; dist--) {
 		unsigned long max_faults = 0;
 		nodemask_t max_group = NODE_MASK_NONE;
··· 2434 2399
 			max_faults = group_faults;
 			max_nid = nid;
 		}
+	}
+
+	/* Cannot migrate task to CPU-less node */
+	if (max_nid != NUMA_NO_NODE && !node_state(max_nid, N_CPU)) {
+		int near_nid = max_nid;
+		int distance, near_distance = INT_MAX;
+
+		for_each_node_state(nid, N_CPU) {
+			distance = node_distance(max_nid, nid);
+			if (distance < near_distance) {
+				near_nid = nid;
+				near_distance = distance;
+			}
+		}
+		max_nid = near_nid;
 	}

 	if (ng) {
··· 2875 2825
 	/* Protect against double add, see task_tick_numa and task_numa_work */
 	p->numa_work.next = &p->numa_work;
 	p->numa_faults = NULL;
+	p->numa_pages_migrated = 0;
+	p->total_numa_faults = 0;
 	RCU_INIT_POINTER(p->numa_group, NULL);
 	p->last_task_numa_placement = 0;
 	p->last_sum_exec_runtime = 0;
··· 9092 9040
  * This is an approximation as the number of running tasks may not be
  * related to the number of busy CPUs due to sched_setaffinity.
  */
-static inline bool allow_numa_imbalance(int dst_running, int dst_weight)
+static inline bool allow_numa_imbalance(int running, int imb_numa_nr)
 {
-	return (dst_running < (dst_weight >> 2));
+	return running <= imb_numa_nr;
 }

 /*
··· 9228 9176
 				return idlest;
 #endif
 			/*
-			 * Otherwise, keep the task on this node to stay close
-			 * its wakeup source and improve locality. If there is
-			 * a real need of migration, periodic load balance will
-			 * take care of it.
+			 * Otherwise, keep the task close to the wakeup source
+			 * and improve locality if the number of running tasks
+			 * would remain below threshold where an imbalance is
+			 * allowed. If there is a real need of migration,
+			 * periodic load balance will take care of it.
 			 */
-			if (allow_numa_imbalance(local_sgs.sum_nr_running, sd->span_weight))
+			if (allow_numa_imbalance(local_sgs.sum_nr_running + 1, sd->imb_numa_nr))
 				return NULL;
 		}

··· 9326 9273
 #define NUMA_IMBALANCE_MIN 2

 static inline long adjust_numa_imbalance(int imbalance,
-				int dst_running, int dst_weight)
+				int dst_running, int imb_numa_nr)
 {
-	if (!allow_numa_imbalance(dst_running, dst_weight))
+	if (!allow_numa_imbalance(dst_running, imb_numa_nr))
 		return imbalance;

 	/*
··· 9440 9387
 		/* Consider allowing a small imbalance between NUMA groups */
 		if (env->sd->flags & SD_NUMA) {
 			env->imbalance = adjust_numa_imbalance(env->imbalance,
-				busiest->sum_nr_running, busiest->group_weight);
+				local->sum_nr_running + 1, env->sd->imb_numa_nr);
 		}

 		return;
··· 10404 10351
  * - When one of the busy CPUs notice that there may be an idle rebalancing
  *   needed, they will kick the idle load balancer, which then does idle
  *   load balancing for all the idle CPUs.
- * - HK_FLAG_MISC CPUs are used for this task, because HK_FLAG_SCHED not set
+ * - HK_TYPE_MISC CPUs are used for this task, because HK_TYPE_SCHED not set
  *   anywhere yet.
  */

··· 10413 10360
 	int ilb;
 	const struct cpumask *hk_mask;

-	hk_mask = housekeeping_cpumask(HK_FLAG_MISC);
+	hk_mask = housekeeping_cpumask(HK_TYPE_MISC);

 	for_each_cpu_and(ilb, nohz.idle_cpus_mask, hk_mask) {

··· 10429 10376

 /*
  * Kick a CPU to do the nohz balancing, if it is time for it. We pick any
- * idle CPU in the HK_FLAG_MISC housekeeping set (if there is one).
+ * idle CPU in the HK_TYPE_MISC housekeeping set (if there is one).
  */
 static void kick_ilb(unsigned int flags)
 {
··· 10642 10589
 		return;

 	/* Spare idle load balancing on CPUs that don't want to be disturbed: */
-	if (!housekeeping_cpu(cpu, HK_FLAG_SCHED))
+	if (!housekeeping_cpu(cpu, HK_TYPE_SCHED))
 		return;

 	/*
··· 10858 10805
 	 * This CPU doesn't want to be disturbed by scheduler
 	 * housekeeping
 	 */
-	if (!housekeeping_cpu(this_cpu, HK_FLAG_SCHED))
+	if (!housekeeping_cpu(this_cpu, HK_TYPE_SCHED))
 		return;

 	/* Will wake up very soon. No time for doing anything else*/
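The "Cannot migrate task to CPU-less node" hunk in kernel/sched/fair.c above replaces a preferred node that has no CPUs with the nearest node that does. A minimal user-space sketch of that fallback follows, under stated assumptions: nearest_cpu_node(), NO_NODE, the has_cpu array and the row-major dist matrix are hypothetical stand-ins for node_state(nid, N_CPU) and node_distance(), and nodes are assumed to be numbered 0..nr_nodes-1.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define NO_NODE (-1) /* hypothetical stand-in for NUMA_NO_NODE */

/*
 * If max_nid has CPUs (or is NO_NODE), return it unchanged. Otherwise return
 * the CPU-bearing node with the smallest distance from max_nid.
 * dist is an nr_nodes x nr_nodes matrix stored row by row.
 */
int nearest_cpu_node(int max_nid, int nr_nodes, const int *has_cpu, const int *dist)
{
	int near_nid = max_nid;
	int near_distance = INT_MAX;
	int nid;

	assert(has_cpu != NULL && dist != NULL);

	if (max_nid == NO_NODE || has_cpu[max_nid])
		return max_nid;

	for (nid = 0; nid < nr_nodes; nid++) {
		int distance;

		if (!has_cpu[nid])
			continue; /* only nodes with CPUs are candidates */

		distance = dist[max_nid * nr_nodes + nid];
		if (distance < near_distance) {
			near_nid = nid;
			near_distance = distance;
		}
	}

	return near_nid;
}
```

As in the hunk, the strict less-than comparison means the lowest-numbered node wins a tie, and max_nid is returned unchanged if no node has CPUs at all.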
-3
kernel/sched/idle.c
···
  * (NOTE: these are not related to SCHED_IDLE batch scheduled
  *        tasks which are handled in sched/fair.c )
  */
-#include "sched.h"
-
-#include <trace/events/power.h>
 
 /* Linker adds these: start and end of __cpuidle functions */
 extern char __cpuidle_text_start[], __cpuidle_text_end[];
+103 -60
kernel/sched/isolation.c
···
  * Copyright (C) 2017-2018 SUSE, Frederic Weisbecker
  *
  */
-#include "sched.h"
+
+enum hk_flags {
+	HK_FLAG_TIMER		= BIT(HK_TYPE_TIMER),
+	HK_FLAG_RCU		= BIT(HK_TYPE_RCU),
+	HK_FLAG_MISC		= BIT(HK_TYPE_MISC),
+	HK_FLAG_SCHED		= BIT(HK_TYPE_SCHED),
+	HK_FLAG_TICK		= BIT(HK_TYPE_TICK),
+	HK_FLAG_DOMAIN		= BIT(HK_TYPE_DOMAIN),
+	HK_FLAG_WQ		= BIT(HK_TYPE_WQ),
+	HK_FLAG_MANAGED_IRQ	= BIT(HK_TYPE_MANAGED_IRQ),
+	HK_FLAG_KTHREAD		= BIT(HK_TYPE_KTHREAD),
+};
 
 DEFINE_STATIC_KEY_FALSE(housekeeping_overridden);
 EXPORT_SYMBOL_GPL(housekeeping_overridden);
-static cpumask_var_t housekeeping_mask;
-static unsigned int housekeeping_flags;
 
-bool housekeeping_enabled(enum hk_flags flags)
+struct housekeeping {
+	cpumask_var_t cpumasks[HK_TYPE_MAX];
+	unsigned long flags;
+};
+
+static struct housekeeping housekeeping;
+
+bool housekeeping_enabled(enum hk_type type)
 {
-	return !!(housekeeping_flags & flags);
+	return !!(housekeeping.flags & BIT(type));
 }
 EXPORT_SYMBOL_GPL(housekeeping_enabled);
 
-int housekeeping_any_cpu(enum hk_flags flags)
+int housekeeping_any_cpu(enum hk_type type)
 {
 	int cpu;
 
 	if (static_branch_unlikely(&housekeeping_overridden)) {
-		if (housekeeping_flags & flags) {
-			cpu = sched_numa_find_closest(housekeeping_mask, smp_processor_id());
+		if (housekeeping.flags & BIT(type)) {
+			cpu = sched_numa_find_closest(housekeeping.cpumasks[type], smp_processor_id());
 			if (cpu < nr_cpu_ids)
 				return cpu;
 
-			return cpumask_any_and(housekeeping_mask, cpu_online_mask);
+			return cpumask_any_and(housekeeping.cpumasks[type], cpu_online_mask);
 		}
 	}
 	return smp_processor_id();
 }
 EXPORT_SYMBOL_GPL(housekeeping_any_cpu);
 
-const struct cpumask *housekeeping_cpumask(enum hk_flags flags)
+const struct cpumask *housekeeping_cpumask(enum hk_type type)
 {
 	if (static_branch_unlikely(&housekeeping_overridden))
-		if (housekeeping_flags & flags)
-			return housekeeping_mask;
+		if (housekeeping.flags & BIT(type))
+			return housekeeping.cpumasks[type];
 	return cpu_possible_mask;
 }
 EXPORT_SYMBOL_GPL(housekeeping_cpumask);
 
-void housekeeping_affine(struct task_struct *t, enum hk_flags flags)
+void housekeeping_affine(struct task_struct *t, enum hk_type type)
 {
 	if (static_branch_unlikely(&housekeeping_overridden))
-		if (housekeeping_flags & flags)
-			set_cpus_allowed_ptr(t, housekeeping_mask);
+		if (housekeeping.flags & BIT(type))
+			set_cpus_allowed_ptr(t, housekeeping.cpumasks[type]);
 }
 EXPORT_SYMBOL_GPL(housekeeping_affine);
 
-bool housekeeping_test_cpu(int cpu, enum hk_flags flags)
+bool housekeeping_test_cpu(int cpu, enum hk_type type)
 {
 	if (static_branch_unlikely(&housekeeping_overridden))
-		if (housekeeping_flags & flags)
-			return cpumask_test_cpu(cpu, housekeeping_mask);
+		if (housekeeping.flags & BIT(type))
+			return cpumask_test_cpu(cpu, housekeeping.cpumasks[type]);
 	return true;
 }
 EXPORT_SYMBOL_GPL(housekeeping_test_cpu);
 
 void __init housekeeping_init(void)
 {
-	if (!housekeeping_flags)
+	enum hk_type type;
+
+	if (!housekeeping.flags)
 		return;
 
 	static_branch_enable(&housekeeping_overridden);
 
-	if (housekeeping_flags & HK_FLAG_TICK)
+	if (housekeeping.flags & HK_FLAG_TICK)
 		sched_tick_offload_init();
 
-	/* We need at least one CPU to handle housekeeping work */
-	WARN_ON_ONCE(cpumask_empty(housekeeping_mask));
+	for_each_set_bit(type, &housekeeping.flags, HK_TYPE_MAX) {
+		/* We need at least one CPU to handle housekeeping work */
+		WARN_ON_ONCE(cpumask_empty(housekeeping.cpumasks[type]));
+	}
 }
 
-static int __init housekeeping_setup(char *str, enum hk_flags flags)
+static void __init housekeeping_setup_type(enum hk_type type,
+					   cpumask_var_t housekeeping_staging)
 {
-	cpumask_var_t non_housekeeping_mask;
-	cpumask_var_t tmp;
+
+	alloc_bootmem_cpumask_var(&housekeeping.cpumasks[type]);
+	cpumask_copy(housekeeping.cpumasks[type],
+		     housekeeping_staging);
+}
+
+static int __init housekeeping_setup(char *str, unsigned long flags)
+{
+	cpumask_var_t non_housekeeping_mask, housekeeping_staging;
+	int err = 0;
+
+	if ((flags & HK_FLAG_TICK) && !(housekeeping.flags & HK_FLAG_TICK)) {
+		if (!IS_ENABLED(CONFIG_NO_HZ_FULL)) {
+			pr_warn("Housekeeping: nohz unsupported."
+				" Build with CONFIG_NO_HZ_FULL\n");
+			return 0;
+		}
+	}
 
 	alloc_bootmem_cpumask_var(&non_housekeeping_mask);
 	if (cpulist_parse(str, non_housekeeping_mask) < 0) {
 		pr_warn("Housekeeping: nohz_full= or isolcpus= incorrect CPU range\n");
-		free_bootmem_cpumask_var(non_housekeeping_mask);
-		return 0;
+		goto free_non_housekeeping_mask;
 	}
 
-	alloc_bootmem_cpumask_var(&tmp);
-	if (!housekeeping_flags) {
-		alloc_bootmem_cpumask_var(&housekeeping_mask);
-		cpumask_andnot(housekeeping_mask,
-			       cpu_possible_mask, non_housekeeping_mask);
+	alloc_bootmem_cpumask_var(&housekeeping_staging);
+	cpumask_andnot(housekeeping_staging,
+		       cpu_possible_mask, non_housekeeping_mask);
 
-		cpumask_andnot(tmp, cpu_present_mask, non_housekeeping_mask);
-		if (cpumask_empty(tmp)) {
+	if (!cpumask_intersects(cpu_present_mask, housekeeping_staging)) {
+		__cpumask_set_cpu(smp_processor_id(), housekeeping_staging);
+		__cpumask_clear_cpu(smp_processor_id(), non_housekeeping_mask);
+		if (!housekeeping.flags) {
 			pr_warn("Housekeeping: must include one present CPU, "
 				"using boot CPU:%d\n", smp_processor_id());
-			__cpumask_set_cpu(smp_processor_id(), housekeeping_mask);
-			__cpumask_clear_cpu(smp_processor_id(), non_housekeeping_mask);
 		}
+	}
+
+	if (!housekeeping.flags) {
+		/* First setup call ("nohz_full=" or "isolcpus=") */
+		enum hk_type type;
+
+		for_each_set_bit(type, &flags, HK_TYPE_MAX)
+			housekeeping_setup_type(type, housekeeping_staging);
 	} else {
-		cpumask_andnot(tmp, cpu_present_mask, non_housekeeping_mask);
-		if (cpumask_empty(tmp))
-			__cpumask_clear_cpu(smp_processor_id(), non_housekeeping_mask);
-		cpumask_andnot(tmp, cpu_possible_mask, non_housekeeping_mask);
-		if (!cpumask_equal(tmp, housekeeping_mask)) {
-			pr_warn("Housekeeping: nohz_full= must match isolcpus=\n");
-			free_bootmem_cpumask_var(tmp);
-			free_bootmem_cpumask_var(non_housekeeping_mask);
-			return 0;
-		}
-	}
-	free_bootmem_cpumask_var(tmp);
+		/* Second setup call ("nohz_full=" after "isolcpus=" or the reverse) */
+		enum hk_type type;
+		unsigned long iter_flags = flags & housekeeping.flags;
 
-	if ((flags & HK_FLAG_TICK) && !(housekeeping_flags & HK_FLAG_TICK)) {
-		if (IS_ENABLED(CONFIG_NO_HZ_FULL)) {
-			tick_nohz_full_setup(non_housekeeping_mask);
-		} else {
-			pr_warn("Housekeeping: nohz unsupported."
-				" Build with CONFIG_NO_HZ_FULL\n");
-			free_bootmem_cpumask_var(non_housekeeping_mask);
-			return 0;
+		for_each_set_bit(type, &iter_flags, HK_TYPE_MAX) {
+			if (!cpumask_equal(housekeeping_staging,
+					   housekeeping.cpumasks[type])) {
+				pr_warn("Housekeeping: nohz_full= must match isolcpus=\n");
+				goto free_housekeeping_staging;
+			}
 		}
+
+		iter_flags = flags & ~housekeeping.flags;
+
+		for_each_set_bit(type, &iter_flags, HK_TYPE_MAX)
+			housekeeping_setup_type(type, housekeeping_staging);
 	}
 
-	housekeeping_flags |= flags;
+	if ((flags & HK_FLAG_TICK) && !(housekeeping.flags & HK_FLAG_TICK))
+		tick_nohz_full_setup(non_housekeeping_mask);
 
+	housekeeping.flags |= flags;
+	err = 1;
+
+free_housekeeping_staging:
+	free_bootmem_cpumask_var(housekeeping_staging);
+free_non_housekeeping_mask:
 	free_bootmem_cpumask_var(non_housekeeping_mask);
 
-	return 1;
+	return err;
 }
 
 static int __init housekeeping_nohz_full_setup(char *str)
 {
-	unsigned int flags;
+	unsigned long flags;
 
 	flags = HK_FLAG_TICK | HK_FLAG_WQ | HK_FLAG_TIMER | HK_FLAG_RCU |
 		HK_FLAG_MISC | HK_FLAG_KTHREAD;
···
 
 static int __init housekeeping_isolcpus_setup(char *str)
 {
-	unsigned int flags = 0;
+	unsigned long flags = 0;
 	bool illegal = false;
 	char *par;
 	int len;
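Note on the rework in this file: the single `housekeeping_mask` plus a bitmask of flags becomes one cpumask per housekeeping type, with `flags` recording which slots are populated. The following user-space sketch shows that lookup shape in isolation. Everything prefixed `toy_` is hypothetical, a 64-bit word stands in for `cpumask_var_t`, the enum lists only a subset of the types named in the diff (in an illustrative order), and the `housekeeping_overridden` static branch is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BIT(nr) (1UL << (nr))

/* Subset of the type indices named in the diff; the ordering is illustrative. */
enum hk_type {
	HK_TYPE_TIMER,
	HK_TYPE_RCU,
	HK_TYPE_MISC,
	HK_TYPE_SCHED,
	HK_TYPE_TICK,
	HK_TYPE_DOMAIN,
	HK_TYPE_MAX
};

/* Simplified stand-in for cpumask_var_t: one bit per CPU, at most 64 CPUs. */
typedef uint64_t toy_cpumask_t;

struct toy_housekeeping {
	toy_cpumask_t cpumasks[HK_TYPE_MAX];
	unsigned long flags;	/* bit N set => cpumasks[N] is populated */
};

/* Populate the mask for one type and mark it as configured. */
static void toy_hk_set(struct toy_housekeeping *hk, enum hk_type type,
		       toy_cpumask_t mask)
{
	assert(type < HK_TYPE_MAX);
	hk->cpumasks[type] = mask;
	hk->flags |= BIT(type);
}

/* Same shape as housekeeping_test_cpu(): unconfigured types accept every CPU. */
static bool toy_hk_test_cpu(const struct toy_housekeeping *hk, int cpu,
			    enum hk_type type)
{
	assert(cpu >= 0 && cpu < 64);
	if (hk->flags & BIT(type))
		return (hk->cpumasks[type] >> cpu) & 1;
	return true;
}
```

With only `HK_TYPE_TICK` configured to CPUs 0-1, CPU 2 would be rejected for `HK_TYPE_TICK` yet still accepted for `HK_TYPE_DOMAIN`, which a single shared mask could not express.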
-1
kernel/sched/loadavg.c
···
  * figure. Its a silly number but people think its important. We go through
  * great pains to make it work on big machines and tickless kernels.
  */
-#include "sched.h"
 
 /*
  * Global load-average calculations
-1
kernel/sched/membarrier.c
···
  *
  * membarrier system call
  */
-#include "sched.h"
 
 /*
  * For documentation purposes, here are some membarrier ordering
-4
kernel/sched/pelt.c
···
  * Author: Vincent Guittot <vincent.guittot@linaro.org>
  */
 
-#include <linux/sched.h>
-#include "sched.h"
-#include "pelt.h"
-
 /*
  * Approximate:
  * val * y^n, where y^32 ~= 0.5 (~1 scheduling period)
+28 -29
kernel/sched/psi.c
···
  * sampling of the aggregate task states would be.
  */
 
-#include "../workqueue_internal.h"
-#include <linux/sched/loadavg.h>
-#include <linux/seq_file.h>
-#include <linux/proc_fs.h>
-#include <linux/seqlock.h>
-#include <linux/uaccess.h>
-#include <linux/cgroup.h>
-#include <linux/module.h>
-#include <linux/sched.h>
-#include <linux/ctype.h>
-#include <linux/file.h>
-#include <linux/poll.h>
-#include <linux/psi.h>
-#include "sched.h"
-
 static int psi_bug __read_mostly;
 
 DEFINE_STATIC_KEY_FALSE(psi_disabled);
···
 static u64 update_triggers(struct psi_group *group, u64 now)
 {
 	struct psi_trigger *t;
-	bool new_stall = false;
+	bool update_total = false;
 	u64 *total = group->total[PSI_POLL];
 
 	/*
···
 	 */
 	list_for_each_entry(t, &group->triggers, node) {
 		u64 growth;
+		bool new_stall;
 
-		/* Check for stall activity */
-		if (group->polling_total[t->state] == total[t->state])
+		new_stall = group->polling_total[t->state] != total[t->state];
+
+		/* Check for stall activity or a previous threshold breach */
+		if (!new_stall && !t->pending_event)
 			continue;
-
 		/*
-		 * Multiple triggers might be looking at the same state,
-		 * remember to update group->polling_total[] once we've
-		 * been through all of them. Also remember to extend the
-		 * polling time if we see new stall activity.
+		 * Check for new stall activity, as well as deferred
+		 * events that occurred in the last window after the
+		 * trigger had already fired (we want to ratelimit
+		 * events without dropping any).
 		 */
-		new_stall = true;
+		if (new_stall) {
+			/*
+			 * Multiple triggers might be looking at the same state,
+			 * remember to update group->polling_total[] once we've
+			 * been through all of them. Also remember to extend the
+			 * polling time if we see new stall activity.
+			 */
+			update_total = true;
 
-		/* Calculate growth since last update */
-		growth = window_update(&t->win, now, total[t->state]);
-		if (growth < t->threshold)
-			continue;
+			/* Calculate growth since last update */
+			growth = window_update(&t->win, now, total[t->state]);
+			if (growth < t->threshold)
+				continue;
 
+			t->pending_event = true;
+		}
 		/* Limit event signaling to once per window */
 		if (now < t->last_event_time + t->win.size)
 			continue;
···
 		if (cmpxchg(&t->event, 0, 1) == 0)
 			wake_up_interruptible(&t->event_wait);
 		t->last_event_time = now;
+		/* Reset threshold breach flag once event got generated */
+		t->pending_event = false;
 	}
 
-	if (new_stall)
+	if (update_total)
 		memcpy(group->polling_total, total,
 		       sizeof(group->polling_total));
 
···
 	t->event = 0;
 	t->last_event_time = 0;
 	init_waitqueue_head(&t->event_wait);
+	t->pending_event = false;
 
 	mutex_lock(&group->trigger_lock);
 
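Note on the `pending_event` change in `update_triggers()`: a threshold breach that lands inside the rate-limit window is now remembered and delivered on a later pass instead of being dropped. The sketch below reproduces that control flow for a single trigger in user space. The `toy_` names and the reduced struct are hypothetical, and the `growth` argument stands in for whatever `window_update()` would return, so this is only a model of the branch structure shown in the hunk.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, reduced trigger state; struct psi_trigger has more fields. */
struct toy_trigger {
	uint64_t threshold;		/* stall growth that counts as a breach */
	uint64_t win_size;		/* rate-limit window length */
	uint64_t last_event_time;	/* time of the last signalled event */
	bool pending_event;		/* breach seen, event not yet signalled */
};

/*
 * One polling pass for one trigger. @growth is only consulted when
 * @new_stall is true. Returns true when an event would be signalled.
 */
static bool toy_trigger_poll(struct toy_trigger *t, uint64_t now,
			     bool new_stall, uint64_t growth)
{
	/* Nothing new and nothing deferred: skip this trigger. */
	if (!new_stall && !t->pending_event)
		return false;

	if (new_stall) {
		if (growth < t->threshold)
			return false;
		t->pending_event = true;
	}

	/* Limit event signalling to once per window. */
	if (now < t->last_event_time + t->win_size)
		return false;

	t->last_event_time = now;
	t->pending_event = false;	/* reset once the event is generated */
	return true;
}
```

In this model, a breach at time 150 that follows an event at time 100 (window 100) is held back, then delivered by the pass at time 200 even though no new stall is reported on that pass.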
+24 -27
kernel/sched/rt.c
···
  * Real-Time Scheduling Class (mapped to the SCHED_FIFO and SCHED_RR
  * policies)
  */
-#include "sched.h"
-
-#include "pelt.h"
 
 int sched_rr_timeslice = RR_TIMESLICE;
 int sysctl_sched_rr_timeslice = (MSEC_PER_SEC / HZ) * RR_TIMESLICE;
···
 
 #ifdef CONFIG_SMP
 
-static void pull_rt_task(struct rq *this_rq);
-
 static inline bool need_pull_rt_task(struct rq *rq, struct task_struct *prev)
 {
 	/* Try to pull RT tasks here if we lower this rq's prio */
···
 
 static inline
 void dec_rt_migration(struct sched_rt_entity *rt_se, struct rt_rq *rt_rq)
-{
-}
-
-static inline bool need_pull_rt_task(struct rq *rq, struct task_struct *prev)
-{
-	return false;
-}
-
-static inline void pull_rt_task(struct rq *this_rq)
 {
 }
 
···
 	rt_queue_push_tasks(rq);
 }
 
-static struct sched_rt_entity *pick_next_rt_entity(struct rq *rq,
-						   struct rt_rq *rt_rq)
+static struct sched_rt_entity *pick_next_rt_entity(struct rt_rq *rt_rq)
 {
 	struct rt_prio_array *array = &rt_rq->active;
 	struct sched_rt_entity *next = NULL;
···
 	struct rt_rq *rt_rq = &rq->rt;
 
 	do {
-		rt_se = pick_next_rt_entity(rq, rt_rq);
+		rt_se = pick_next_rt_entity(rt_rq);
 		BUG_ON(!rt_se);
 		rt_rq = group_rt_rq(rt_se);
 	} while (rt_rq);
···
 		return 0;
 
 retry:
+	/*
+	 * It's possible that the next_task slipped in of
+	 * higher priority than current. If that's the case
+	 * just reschedule current.
+	 */
+	if (unlikely(next_task->prio < rq->curr->prio)) {
+		resched_curr(rq);
+		return 0;
+	}
+
 	if (is_migration_disabled(next_task)) {
 		struct task_struct *push_task = NULL;
 		int cpu;
 
 		if (!pull || rq->push_busy)
+			return 0;
+
+		/*
+		 * Invoking find_lowest_rq() on anything but an RT task doesn't
+		 * make sense. Per the above priority check, curr has to
+		 * be of higher priority than next_task, so no need to
+		 * reschedule when bailing out.
+		 *
+		 * Note that the stoppers are masqueraded as SCHED_FIFO
+		 * (cf. sched_set_stop_task()), so we can't rely on rt_task().
+		 */
+		if (rq->curr->sched_class != &rt_sched_class)
 			return 0;
 
 		cpu = find_lowest_rq(rq->curr);
···
 
 	if (WARN_ON(next_task == rq->curr))
 		return 0;
-
-	/*
-	 * It's possible that the next_task slipped in of
-	 * higher priority than current. If that's the case
-	 * just reschedule current.
-	 */
-	if (unlikely(next_task->prio < rq->curr->prio)) {
-		resched_curr(rq);
-		return 0;
-	}
 
 	/* We might release rq lock */
 	get_task_struct(next_task);
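Note on the reordering in `push_rt_task()`: the priority comparison now runs first, so the migration-disabled branch can rely on `curr` outranking `next_task`, and it additionally bails out when `curr` is not in the RT scheduling class. The sketch below flattens those early checks into a pure decision function. All `toy_` names are hypothetical, the inputs are plain booleans rather than kernel structures, and the later steps of the real function (such as the result of `find_lowest_rq()`) are not modelled.

```c
#include <stdbool.h>

enum toy_push_action {
	TOY_RESCHED_CURR,	/* next_task outranks curr: just reschedule */
	TOY_BAIL_OUT,		/* nothing to do on this pass */
	TOY_PUSH_CURR,		/* try to move curr away instead */
	TOY_PUSH_NEXT,		/* normal path: try to push next_task */
};

/* Hypothetical flattened inputs; a lower prio value means higher priority. */
struct toy_push_state {
	int next_prio;
	int curr_prio;
	bool next_migration_disabled;
	bool pull;
	bool push_busy;
	bool curr_is_rt_class;
};

/* Early checks in the order the patched retry: label applies them. */
static enum toy_push_action toy_push_decide(const struct toy_push_state *s)
{
	if (s->next_prio < s->curr_prio)
		return TOY_RESCHED_CURR;

	if (s->next_migration_disabled) {
		if (!s->pull || s->push_busy)
			return TOY_BAIL_OUT;
		/* Looking for a lower-priority rq only makes sense for RT curr. */
		if (!s->curr_is_rt_class)
			return TOY_BAIL_OUT;
		return TOY_PUSH_CURR;
	}

	return TOY_PUSH_NEXT;
}
```

Because the priority check comes first, the bail-out paths in the migration-disabled branch never need to reschedule: by then `curr` is known to have equal or higher priority.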
+165 -189
kernel/sched/sched.h
···
 /*
  * Scheduler internal types and methods:
  */
-#include <linux/sched.h>
+#ifndef _KERNEL_SCHED_SCHED_H
+#define _KERNEL_SCHED_SCHED_H
 
+#include <linux/sched/affinity.h>
 #include <linux/sched/autogroup.h>
-#include <linux/sched/clock.h>
-#include <linux/sched/coredump.h>
 #include <linux/sched/cpufreq.h>
-#include <linux/sched/cputime.h>
 #include <linux/sched/deadline.h>
-#include <linux/sched/debug.h>
-#include <linux/sched/hotplug.h>
-#include <linux/sched/idle.h>
-#include <linux/sched/init.h>
-#include <linux/sched/isolation.h>
-#include <linux/sched/jobctl.h>
+#include <linux/sched.h>
 #include <linux/sched/loadavg.h>
 #include <linux/sched/mm.h>
-#include <linux/sched/nohz.h>
-#include <linux/sched/numa_balancing.h>
-#include <linux/sched/prio.h>
-#include <linux/sched/rt.h>
+#include <linux/sched/rseq_api.h>
 #include <linux/sched/signal.h>
 #include <linux/sched/smt.h>
 #include <linux/sched/stat.h>
 #include <linux/sched/sysctl.h>
+#include <linux/sched/task_flags.h>
 #include <linux/sched/task.h>
-#include <linux/sched/task_stack.h>
 #include <linux/sched/topology.h>
-#include <linux/sched/user.h>
-#include <linux/sched/wake_q.h>
-#include <linux/sched/xacct.h>
 
-#include <uapi/linux/sched/types.h>
-
-#include <linux/binfmts.h>
-#include <linux/bitops.h>
-#include <linux/compat.h>
-#include <linux/context_tracking.h>
+#include <linux/atomic.h>
+#include <linux/bitmap.h>
+#include <linux/bug.h>
+#include <linux/capability.h>
+#include <linux/cgroup_api.h>
+#include <linux/cgroup.h>
 #include <linux/cpufreq.h>
-#include <linux/cpuidle.h>
-#include <linux/cpuset.h>
+#include <linux/cpumask_api.h>
 #include <linux/ctype.h>
-#include <linux/debugfs.h>
-#include <linux/delayacct.h>
-#include <linux/energy_model.h>
-#include <linux/init_task.h>
-#include <linux/kprobes.h>
+#include <linux/file.h>
+#include <linux/fs_api.h>
+#include <linux/hrtimer_api.h>
+#include <linux/interrupt.h>
+#include <linux/irq_work.h>
+#include <linux/jiffies.h>
+#include <linux/kref_api.h>
 #include <linux/kthread.h>
-#include <linux/membarrier.h>
-#include <linux/migrate.h>
-#include <linux/mmu_context.h>
-#include <linux/nmi.h>
+#include <linux/ktime_api.h>
+#include <linux/lockdep_api.h>
+#include <linux/lockdep.h>
+#include <linux/minmax.h>
+#include <linux/mm.h>
+#include <linux/module.h>
+#include <linux/mutex_api.h>
+#include <linux/plist.h>
+#include <linux/poll.h>
 #include <linux/proc_fs.h>
-#include <linux/prefetch.h>
 #include <linux/profile.h>
 #include <linux/psi.h>
-#include <linux/ratelimit.h>
-#include <linux/rcupdate_wait.h>
-#include <linux/security.h>
+#include <linux/rcupdate.h>
+#include <linux/seq_file.h>
+#include <linux/seqlock.h>
+#include <linux/softirq.h>
+#include <linux/spinlock_api.h>
+#include <linux/static_key.h>
 #include <linux/stop_machine.h>
-#include <linux/suspend.h>
-#include <linux/swait.h>
+#include <linux/syscalls_api.h>
 #include <linux/syscalls.h>
-#include <linux/task_work.h>
-#include <linux/tsacct_kern.h>
+#include <linux/tick.h>
+#include <linux/topology.h>
+#include <linux/types.h>
+#include <linux/u64_stats_sync_api.h>
+#include <linux/uaccess.h>
+#include <linux/wait_api.h>
+#include <linux/wait_bit.h>
+#include <linux/workqueue_api.h>
 
-#include <asm/tlb.h>
+#include <trace/events/power.h>
+#include <trace/events/sched.h>
+
+#include "../workqueue_internal.h"
+
+#ifdef CONFIG_CGROUP_SCHED
+#include <linux/cgroup.h>
+#include <linux/psi.h>
+#endif
+
+#ifdef CONFIG_SCHED_DEBUG
+# include <linux/static_key.h>
+#endif
 
 #ifdef CONFIG_PARAVIRT
 # include <asm/paravirt.h>
+# include <asm/paravirt_api_clock.h>
 #endif
 
 #include "cpupri.h"
 #include "cpudeadline.h"
 
-#include <trace/events/sched.h>
-
 #ifdef CONFIG_SCHED_DEBUG
-# define SCHED_WARN_ON(x)	WARN_ONCE(x, #x)
+# define SCHED_WARN_ON(x)      WARN_ONCE(x, #x)
 #else
-# define SCHED_WARN_ON(x)	({ (void)(x), 0; })
+# define SCHED_WARN_ON(x)      ({ (void)(x), 0; })
 #endif
 
 struct rq;
···
 	u64			total_bw;
 };
 
-static inline void __dl_update(struct dl_bw *dl_b, s64 bw);
-
-static inline
-void __dl_sub(struct dl_bw *dl_b, u64 tsk_bw, int cpus)
-{
-	dl_b->total_bw -= tsk_bw;
-	__dl_update(dl_b, (s32)tsk_bw / cpus);
-}
-
-static inline
-void __dl_add(struct dl_bw *dl_b, u64 tsk_bw, int cpus)
-{
-	dl_b->total_bw += tsk_bw;
-	__dl_update(dl_b, -((s32)tsk_bw / cpus));
-}
-
-static inline bool __dl_overflow(struct dl_bw *dl_b, unsigned long cap,
-				 u64 old_bw, u64 new_bw)
-{
-	return dl_b->bw != -1 &&
-	       cap_scale(dl_b->bw, cap) < dl_b->total_bw - old_bw + new_bw;
-}
-
 /*
  * Verify the fitness of task @p to run on @cpu taking into account the
  * CPU original capacity and the runtime/deadline ratio of the task.
···
 extern void __getparam_dl(struct task_struct *p, struct sched_attr *attr);
 extern bool __checkparam_dl(const struct sched_attr *attr);
 extern bool dl_param_changed(struct task_struct *p, const struct sched_attr *attr);
-extern int  dl_task_can_attach(struct task_struct *p, const struct cpumask *cs_cpus_allowed);
 extern int  dl_cpuset_cpumask_can_shrink(const struct cpumask *cur, const struct cpumask *trial);
-extern bool dl_cpu_busy(unsigned int cpu);
+extern int  dl_cpu_busy(int cpu, struct task_struct *p);
 
 #ifdef CONFIG_CGROUP_SCHED
-
-#include <linux/cgroup.h>
-#include <linux/psi.h>
 
 struct cfs_rq;
 struct rt_rq;
···
 extern enum numa_topology_type sched_numa_topology_type;
 extern int sched_max_numa_distance;
 extern bool find_numa_distance(int distance);
-extern void sched_init_numa(void);
+extern void sched_init_numa(int offline_node);
+extern void sched_update_numa(int cpu, bool online);
 extern void sched_domains_numa_masks_set(unsigned int cpu);
 extern void sched_domains_numa_masks_clear(unsigned int cpu);
 extern int sched_numa_find_closest(const struct cpumask *cpus, int cpu);
 #else
-static inline void sched_init_numa(void) { }
+static inline void sched_init_numa(int offline_node) { }
+static inline void sched_update_numa(int cpu, bool online) { }
 static inline void sched_domains_numa_masks_set(unsigned int cpu) { }
 static inline void sched_domains_numa_masks_clear(unsigned int cpu) { }
 static inline int sched_numa_find_closest(const struct cpumask *cpus, int cpu)
···
 #endif
 
 #include "stats.h"
-#include "autogroup.h"
 
 #if defined(CONFIG_SCHED_CORE) && defined(CONFIG_SCHEDSTATS)
 
···
  * Tunables that become constants when CONFIG_SCHED_DEBUG is off:
  */
 #ifdef CONFIG_SCHED_DEBUG
-# include <linux/static_key.h>
 # define const_debug __read_mostly
 #else
 # define const_debug const
···
 extern struct rt_bandwidth def_rt_bandwidth;
 extern void init_rt_bandwidth(struct rt_bandwidth *rt_b, u64 period, u64 runtime);
 
-extern struct dl_bandwidth def_dl_bandwidth;
 extern void init_dl_bandwidth(struct dl_bandwidth *dl_b, u64 period, u64 runtime);
 extern void init_dl_task_timer(struct sched_dl_entity *dl_se);
 extern void init_dl_inactive_task_timer(struct sched_dl_entity *dl_se);
···
 static inline void nohz_run_idle_balance(int cpu) { }
 #endif
 
-#ifdef CONFIG_SMP
-static inline
-void __dl_update(struct dl_bw *dl_b, s64 bw)
-{
-	struct root_domain *rd = container_of(dl_b, struct root_domain, dl_bw);
-	int i;
-
-	RCU_LOCKDEP_WARN(!rcu_read_lock_sched_held(),
-			 "sched RCU must be held");
-	for_each_cpu_and(i, rd->span, cpu_active_mask) {
-		struct rq *rq = cpu_rq(i);
-
-		rq->dl.extra_bw += bw;
-	}
-}
-#else
-static inline
-void __dl_update(struct dl_bw *dl_b, s64 bw)
-{
-	struct dl_rq *dl = container_of(dl_b, struct dl_rq, dl_bw);
-
-	dl->extra_bw += bw;
-}
-#endif
-
-
 #ifdef CONFIG_IRQ_TIME_ACCOUNTING
 struct irqtime {
 	u64			total;
···
 #else
 static inline void cpufreq_update_util(struct rq *rq, unsigned int flags) {}
 #endif /* CONFIG_CPU_FREQ */
-
-#ifdef CONFIG_UCLAMP_TASK
-unsigned long uclamp_eff_value(struct task_struct *p, enum uclamp_id clamp_id);
-
-/**
- * uclamp_rq_util_with - clamp @util with @rq and @p effective uclamp values.
- * @rq:		The rq to clamp against. Must not be NULL.
- * @util:	The util value to clamp.
- * @p:		The task to clamp against. Can be NULL if you want to clamp
- *		against @rq only.
- *
- * Clamps the passed @util to the max(@rq, @p) effective uclamp values.
- *
- * If sched_uclamp_used static key is disabled, then just return the util
- * without any clamping since uclamp aggregation at the rq level in the fast
- * path is disabled, rendering this operation a NOP.
- *
- * Use uclamp_eff_value() if you don't care about uclamp values at rq level. It
- * will return the correct effective uclamp value of the task even if the
- * static key is disabled.
- */
-static __always_inline
-unsigned long uclamp_rq_util_with(struct rq *rq, unsigned long util,
-				  struct task_struct *p)
-{
-	unsigned long min_util = 0;
-	unsigned long max_util = 0;
-
-	if (!static_branch_likely(&sched_uclamp_used))
-		return util;
-
-	if (p) {
-		min_util = uclamp_eff_value(p, UCLAMP_MIN);
-		max_util = uclamp_eff_value(p, UCLAMP_MAX);
-
-		/*
-		 * Ignore last runnable task's max clamp, as this task will
-		 * reset it. Similarly, no need to read the rq's min clamp.
-		 */
-		if (rq->uclamp_flags & UCLAMP_FLAG_IDLE)
-			goto out;
-	}
-
-	min_util = max_t(unsigned long, min_util, READ_ONCE(rq->uclamp[UCLAMP_MIN].value));
-	max_util = max_t(unsigned long, max_util, READ_ONCE(rq->uclamp[UCLAMP_MAX].value));
-out:
-	/*
-	 * Since CPU's {min,max}_util clamps are MAX aggregated considering
-	 * RUNNABLE tasks with _different_ clamps, we can end up with an
-	 * inversion. Fix it now when the clamps are applied.
-	 */
-	if (unlikely(min_util >= max_util))
-		return min_util;
-
-	return clamp(util, min_util, max_util);
-}
-
-/*
- * When uclamp is compiled in, the aggregation at rq level is 'turned off'
- * by default in the fast path and only gets turned on once userspace performs
- * an operation that requires it.
- *
- * Returns true if userspace opted-in to use uclamp and aggregation at rq level
- * hence is active.
- */
-static inline bool uclamp_is_used(void)
-{
-	return static_branch_likely(&sched_uclamp_used);
-}
-#else /* CONFIG_UCLAMP_TASK */
-static inline
-unsigned long uclamp_rq_util_with(struct rq *rq, unsigned long util,
-				  struct task_struct *p)
-{
-	return util;
-}
-
-static inline bool uclamp_is_used(void)
-{
-	return false;
-}
-#endif /* CONFIG_UCLAMP_TASK */
 
 #ifdef arch_scale_freq_capacity
 # ifndef arch_scale_freq_invariant
···
 	return READ_ONCE(rq->avg_rt.util_avg);
 }
 #endif
+
+#ifdef CONFIG_UCLAMP_TASK
+unsigned long uclamp_eff_value(struct task_struct *p, enum uclamp_id clamp_id);
+
+/**
+ * uclamp_rq_util_with - clamp @util with @rq and @p effective uclamp values.
+ * @rq:		The rq to clamp against. Must not be NULL.
+ * @util:	The util value to clamp.
+ * @p:		The task to clamp against. Can be NULL if you want to clamp
+ *		against @rq only.
+ *
+ * Clamps the passed @util to the max(@rq, @p) effective uclamp values.
+ *
+ * If sched_uclamp_used static key is disabled, then just return the util
+ * without any clamping since uclamp aggregation at the rq level in the fast
+ * path is disabled, rendering this operation a NOP.
+ *
+ * Use uclamp_eff_value() if you don't care about uclamp values at rq level. It
+ * will return the correct effective uclamp value of the task even if the
+ * static key is disabled.
+ */
+static __always_inline
+unsigned long uclamp_rq_util_with(struct rq *rq, unsigned long util,
+				  struct task_struct *p)
+{
+	unsigned long min_util = 0;
+	unsigned long max_util = 0;
+
+	if (!static_branch_likely(&sched_uclamp_used))
+		return util;
+
+	if (p) {
+		min_util = uclamp_eff_value(p, UCLAMP_MIN);
+		max_util = uclamp_eff_value(p, UCLAMP_MAX);
+
+		/*
+		 * Ignore last runnable task's max clamp, as this task will
+		 * reset it. Similarly, no need to read the rq's min clamp.
+		 */
+		if (rq->uclamp_flags & UCLAMP_FLAG_IDLE)
+			goto out;
+	}
+
+	min_util = max_t(unsigned long, min_util, READ_ONCE(rq->uclamp[UCLAMP_MIN].value));
+	max_util = max_t(unsigned long, max_util, READ_ONCE(rq->uclamp[UCLAMP_MAX].value));
+out:
+	/*
+	 * Since CPU's {min,max}_util clamps are MAX aggregated considering
+	 * RUNNABLE tasks with _different_ clamps, we can end up with an
+	 * inversion. Fix it now when the clamps are applied.
+	 */
+	if (unlikely(min_util >= max_util))
+		return min_util;
+
+	return clamp(util, min_util, max_util);
+}
+
+/* Is the rq being capped/throttled by uclamp_max? */
+static inline bool uclamp_rq_is_capped(struct rq *rq)
+{
+	unsigned long rq_util;
+	unsigned long max_util;
+
+	if (!static_branch_likely(&sched_uclamp_used))
+		return false;
+
+	rq_util = cpu_util_cfs(cpu_of(rq)) + cpu_util_rt(rq);
+	max_util = READ_ONCE(rq->uclamp[UCLAMP_MAX].value);
+
+	return max_util != SCHED_CAPACITY_SCALE && rq_util >= max_util;
+}
+
+/*
+ * When uclamp is compiled in, the aggregation at rq level is 'turned off'
+ * by default in the fast path and only gets turned on once userspace performs
+ * an operation that requires it.
+ *
+ * Returns true if userspace opted-in to use uclamp and aggregation at rq level
+ * hence is active.
+ */
+static inline bool uclamp_is_used(void)
+{
+	return static_branch_likely(&sched_uclamp_used);
+}
+#else /* CONFIG_UCLAMP_TASK */
+static inline
+unsigned long uclamp_rq_util_with(struct rq *rq, unsigned long util,
+				  struct task_struct *p)
+{
+	return util;
+}
+
+static inline bool uclamp_rq_is_capped(struct rq *rq) { return false; }
+
+static inline bool uclamp_is_used(void)
+{
+	return false;
+}
+#endif /* CONFIG_UCLAMP_TASK */
 
 #ifdef CONFIG_HAVE_SCHED_AVG_IRQ
 static inline unsigned long cpu_util_irq(struct rq *rq)
···
 extern void sched_dynamic_update(int mode);
 #endif
 
+#endif /* _KERNEL_SCHED_SCHED_H */
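Note on the uclamp helpers moved and added in this header: `uclamp_rq_util_with()` clamps a utilization value to a min/max pair and lets the minimum win when the pair is inverted, while the new `uclamp_rq_is_capped()` reports whether the runqueue's utilization has reached a non-default maximum clamp. The user-space sketch below isolates just that arithmetic. The `toy_` functions are hypothetical, utilization values are passed in directly rather than read from a runqueue, the static-key fast path is omitted, and `SCHED_CAPACITY_SCALE` is assumed to be 1024 because its definition is not part of this diff.

```c
#include <stdbool.h>

/* Assumed value: the definition is not shown in this diff. */
#define SCHED_CAPACITY_SCALE 1024UL

/* Clamp @util to [min_util, max_util]; on inversion the minimum wins. */
static unsigned long toy_uclamp_apply(unsigned long util,
				      unsigned long min_util,
				      unsigned long max_util)
{
	if (min_util >= max_util)
		return min_util;
	if (util < min_util)
		return min_util;
	if (util > max_util)
		return max_util;
	return util;
}

/* Same return expression as uclamp_rq_is_capped(), with inputs passed in. */
static bool toy_rq_is_capped(unsigned long rq_util, unsigned long rq_uclamp_max)
{
	return rq_uclamp_max != SCHED_CAPACITY_SCALE && rq_util >= rq_uclamp_max;
}
```

The first condition in `toy_rq_is_capped()` means a runqueue whose maximum clamp sits at full scale is never reported as capped, however busy it is.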
-1
kernel/sched/stats.c
··· 2 2 /* 3 3 * /proc/schedstat implementation 4 4 */ 5 - #include "sched.h" 6 5 7 6 void __update_stats_wait_start(struct rq *rq, struct task_struct *p, 8 7 struct sched_statistics *stats)
+4
kernel/sched/stats.h
··· 1 1 /* SPDX-License-Identifier: GPL-2.0 */ 2 + #ifndef _KERNEL_STATS_H 3 + #define _KERNEL_STATS_H 2 4 3 5 #ifdef CONFIG_SCHEDSTATS 4 6 ··· 300 298 # define sched_info_dequeue(rq, t) do { } while (0) 301 299 # define sched_info_switch(rq, t, next) do { } while (0) 302 300 #endif /* CONFIG_SCHED_INFO */ 301 + 302 + #endif /* _KERNEL_STATS_H */
-1
kernel/sched/stop_task.c
··· 7 7 * 8 8 * See kernel/stop_machine.c 9 9 */ 10 - #include "sched.h" 11 10 12 11 #ifdef CONFIG_SMP 13 12 static int
-1
kernel/sched/swait.c
··· 2 2 /* 3 3 * <linux/swait.h> (simple wait queues ) implementation: 4 4 */ 5 - #include "sched.h" 6 5 7 6 void __init_swait_queue_head(struct swait_queue_head *q, const char *name, 8 7 struct lock_class_key *key)
+180 -94
kernel/sched/topology.c
··· 2 2 /* 3 3 * Scheduler topology setup/handling methods 4 4 */ 5 - #include "sched.h" 6 5 7 6 DEFINE_MUTEX(sched_domains_mutex); 8 7 ··· 73 74 break; 74 75 } 75 76 76 - if (!cpumask_weight(sched_group_span(group))) { 77 + if (cpumask_empty(sched_group_span(group))) { 77 78 printk(KERN_CONT "\n"); 78 79 printk(KERN_ERR "ERROR: empty group\n"); 79 80 break; ··· 1365 1366 list_for_each_entry(entry, &asym_cap_list, link) 1366 1367 cpumask_clear(cpu_capacity_span(entry)); 1367 1368 1368 - for_each_cpu_and(cpu, cpu_possible_mask, housekeeping_cpumask(HK_FLAG_DOMAIN)) 1369 + for_each_cpu_and(cpu, cpu_possible_mask, housekeeping_cpumask(HK_TYPE_DOMAIN)) 1369 1370 asym_cpu_capacity_update_data(cpu); 1370 1371 1371 1372 list_for_each_entry_safe(entry, next, &asym_cap_list, link) { ··· 1491 1492 int sched_max_numa_distance; 1492 1493 static int *sched_domains_numa_distance; 1493 1494 static struct cpumask ***sched_domains_numa_masks; 1494 - 1495 - static unsigned long __read_mostly *sched_numa_onlined_nodes; 1496 1495 #endif 1497 1496 1498 1497 /* ··· 1648 1651 1649 1652 static struct sched_domain_topology_level *sched_domain_topology = 1650 1653 default_topology; 1654 + static struct sched_domain_topology_level *sched_domain_topology_saved; 1651 1655 1652 1656 #define for_each_sd_topology(tl) \ 1653 1657 for (tl = sched_domain_topology; tl->mask; tl++) ··· 1659 1661 return; 1660 1662 1661 1663 sched_domain_topology = tl; 1664 + sched_domain_topology_saved = NULL; 1662 1665 } 1663 1666 1664 1667 #ifdef CONFIG_NUMA ··· 1683 1684 1684 1685 for (i = 0; i < nr_node_ids; i++) { 1685 1686 printk(KERN_WARNING " "); 1686 - for (j = 0; j < nr_node_ids; j++) 1687 - printk(KERN_CONT "%02d ", node_distance(i,j)); 1687 + for (j = 0; j < nr_node_ids; j++) { 1688 + if (!node_state(i, N_CPU) || !node_state(j, N_CPU)) 1689 + printk(KERN_CONT "(%02d) ", node_distance(i,j)); 1690 + else 1691 + printk(KERN_CONT " %02d ", node_distance(i,j)); 1692 + } 1688 1693 printk(KERN_CONT "\n"); 1689 
1694 } 1690 1695 printk(KERN_WARNING "\n"); ··· 1696 1693 1697 1694 bool find_numa_distance(int distance) 1698 1695 { 1699 - int i; 1696 + bool found = false; 1697 + int i, *distances; 1700 1698 1701 1699 if (distance == node_distance(0, 0)) 1702 1700 return true; 1703 1701 1702 + rcu_read_lock(); 1703 + distances = rcu_dereference(sched_domains_numa_distance); 1704 + if (!distances) 1705 + goto unlock; 1704 1706 for (i = 0; i < sched_domains_numa_levels; i++) { 1705 - if (sched_domains_numa_distance[i] == distance) 1706 - return true; 1707 + if (distances[i] == distance) { 1708 + found = true; 1709 + break; 1710 + } 1707 1711 } 1712 + unlock: 1713 + rcu_read_unlock(); 1708 1714 1709 - return false; 1715 + return found; 1710 1716 } 1717 + 1718 + #define for_each_cpu_node_but(n, nbut) \ 1719 + for_each_node_state(n, N_CPU) \ 1720 + if (n == nbut) \ 1721 + continue; \ 1722 + else 1711 1723 1712 1724 /* 1713 1725 * A system can have three types of NUMA topology: ··· 1743 1725 * there is an intermediary node C, which is < N hops away from both 1744 1726 * nodes A and B, the system is a glueless mesh. 1745 1727 */ 1746 - static void init_numa_topology_type(void) 1728 + static void init_numa_topology_type(int offline_node) 1747 1729 { 1748 1730 int a, b, c, n; 1749 1731 ··· 1754 1736 return; 1755 1737 } 1756 1738 1757 - for_each_online_node(a) { 1758 - for_each_online_node(b) { 1739 + for_each_cpu_node_but(a, offline_node) { 1740 + for_each_cpu_node_but(b, offline_node) { 1759 1741 /* Find two nodes furthest removed from each other. */ 1760 1742 if (node_distance(a, b) < n) 1761 1743 continue; 1762 1744 1763 1745 /* Is there an intermediary node between a and b? 
*/ 1764 - for_each_online_node(c) { 1746 + for_each_cpu_node_but(c, offline_node) { 1765 1747 if (node_distance(a, c) < n && 1766 1748 node_distance(b, c) < n) { 1767 1749 sched_numa_topology_type = ··· 1774 1756 return; 1775 1757 } 1776 1758 } 1759 + 1760 + pr_err("Failed to find a NUMA topology type, defaulting to DIRECT\n"); 1761 + sched_numa_topology_type = NUMA_DIRECT; 1777 1762 } 1778 1763 1779 1764 1780 1765 #define NR_DISTANCE_VALUES (1 << DISTANCE_BITS) 1781 1766 1782 - void sched_init_numa(void) 1767 + void sched_init_numa(int offline_node) 1783 1768 { 1784 1769 struct sched_domain_topology_level *tl; 1785 1770 unsigned long *distance_map; 1786 1771 int nr_levels = 0; 1787 1772 int i, j; 1773 + int *distances; 1774 + struct cpumask ***masks; 1788 1775 1789 1776 /* 1790 1777 * O(nr_nodes^2) deduplicating selection sort -- in order to find the ··· 1800 1777 return; 1801 1778 1802 1779 bitmap_zero(distance_map, NR_DISTANCE_VALUES); 1803 - for (i = 0; i < nr_node_ids; i++) { 1804 - for (j = 0; j < nr_node_ids; j++) { 1780 + for_each_cpu_node_but(i, offline_node) { 1781 + for_each_cpu_node_but(j, offline_node) { 1805 1782 int distance = node_distance(i, j); 1806 1783 1807 1784 if (distance < LOCAL_DISTANCE || distance >= NR_DISTANCE_VALUES) { 1808 1785 sched_numa_warn("Invalid distance value range"); 1786 + bitmap_free(distance_map); 1809 1787 return; 1810 1788 } 1811 1789 ··· 1819 1795 */ 1820 1796 nr_levels = bitmap_weight(distance_map, NR_DISTANCE_VALUES); 1821 1797 1822 - sched_domains_numa_distance = kcalloc(nr_levels, sizeof(int), GFP_KERNEL); 1823 - if (!sched_domains_numa_distance) { 1798 + distances = kcalloc(nr_levels, sizeof(int), GFP_KERNEL); 1799 + if (!distances) { 1824 1800 bitmap_free(distance_map); 1825 1801 return; 1826 1802 } 1827 1803 1828 1804 for (i = 0, j = 0; i < nr_levels; i++, j++) { 1829 1805 j = find_next_bit(distance_map, NR_DISTANCE_VALUES, j); 1830 - sched_domains_numa_distance[i] = j; 1806 + distances[i] = j; 1831 1807 } 1808 + 
rcu_assign_pointer(sched_domains_numa_distance, distances); 1832 1809 1833 1810 bitmap_free(distance_map); 1834 1811 ··· 1851 1826 */ 1852 1827 sched_domains_numa_levels = 0; 1853 1828 1854 - sched_domains_numa_masks = kzalloc(sizeof(void *) * nr_levels, GFP_KERNEL); 1855 - if (!sched_domains_numa_masks) 1829 + masks = kzalloc(sizeof(void *) * nr_levels, GFP_KERNEL); 1830 + if (!masks) 1856 1831 return; 1857 1832 1858 1833 /* ··· 1860 1835 * CPUs of nodes that are that many hops away from us. 1861 1836 */ 1862 1837 for (i = 0; i < nr_levels; i++) { 1863 - sched_domains_numa_masks[i] = 1864 - kzalloc(nr_node_ids * sizeof(void *), GFP_KERNEL); 1865 - if (!sched_domains_numa_masks[i]) 1838 + masks[i] = kzalloc(nr_node_ids * sizeof(void *), GFP_KERNEL); 1839 + if (!masks[i]) 1866 1840 return; 1867 1841 1868 - for (j = 0; j < nr_node_ids; j++) { 1842 + for_each_cpu_node_but(j, offline_node) { 1869 1843 struct cpumask *mask = kzalloc(cpumask_size(), GFP_KERNEL); 1870 1844 int k; 1871 1845 1872 1846 if (!mask) 1873 1847 return; 1874 1848 1875 - sched_domains_numa_masks[i][j] = mask; 1849 + masks[i][j] = mask; 1876 1850 1877 - for_each_node(k) { 1878 - /* 1879 - * Distance information can be unreliable for 1880 - * offline nodes, defer building the node 1881 - * masks to its bringup. 1882 - * This relies on all unique distance values 1883 - * still being visible at init time. 
1884 - */ 1885 - if (!node_online(j)) 1886 - continue; 1887 - 1851 + for_each_cpu_node_but(k, offline_node) { 1888 1852 if (sched_debug() && (node_distance(j, k) != node_distance(k, j))) 1889 1853 sched_numa_warn("Node-distance not symmetric"); 1890 1854 ··· 1884 1870 } 1885 1871 } 1886 1872 } 1873 + rcu_assign_pointer(sched_domains_numa_masks, masks); 1887 1874 1888 1875 /* Compute default topology size */ 1889 1876 for (i = 0; sched_domain_topology[i].mask; i++); ··· 1922 1907 }; 1923 1908 } 1924 1909 1910 + sched_domain_topology_saved = sched_domain_topology; 1925 1911 sched_domain_topology = tl; 1926 1912 1927 1913 sched_domains_numa_levels = nr_levels; 1928 - sched_max_numa_distance = sched_domains_numa_distance[nr_levels - 1]; 1914 + WRITE_ONCE(sched_max_numa_distance, sched_domains_numa_distance[nr_levels - 1]); 1929 1915 1930 - init_numa_topology_type(); 1931 - 1932 - sched_numa_onlined_nodes = bitmap_alloc(nr_node_ids, GFP_KERNEL); 1933 - if (!sched_numa_onlined_nodes) 1934 - return; 1935 - 1936 - bitmap_zero(sched_numa_onlined_nodes, nr_node_ids); 1937 - for_each_online_node(i) 1938 - bitmap_set(sched_numa_onlined_nodes, i, 1); 1916 + init_numa_topology_type(offline_node); 1939 1917 } 1940 1918 1941 - static void __sched_domains_numa_masks_set(unsigned int node) 1942 - { 1943 - int i, j; 1944 1919 1920 + static void sched_reset_numa(void) 1921 + { 1922 + int nr_levels, *distances; 1923 + struct cpumask ***masks; 1924 + 1925 + nr_levels = sched_domains_numa_levels; 1926 + sched_domains_numa_levels = 0; 1927 + sched_max_numa_distance = 0; 1928 + sched_numa_topology_type = NUMA_DIRECT; 1929 + distances = sched_domains_numa_distance; 1930 + rcu_assign_pointer(sched_domains_numa_distance, NULL); 1931 + masks = sched_domains_numa_masks; 1932 + rcu_assign_pointer(sched_domains_numa_masks, NULL); 1933 + if (distances || masks) { 1934 + int i, j; 1935 + 1936 + synchronize_rcu(); 1937 + kfree(distances); 1938 + for (i = 0; i < nr_levels && masks; i++) { 1939 + if 
(!masks[i]) 1940 + continue; 1941 + for_each_node(j) 1942 + kfree(masks[i][j]); 1943 + kfree(masks[i]); 1944 + } 1945 + kfree(masks); 1946 + } 1947 + if (sched_domain_topology_saved) { 1948 + kfree(sched_domain_topology); 1949 + sched_domain_topology = sched_domain_topology_saved; 1950 + sched_domain_topology_saved = NULL; 1951 + } 1952 + } 1953 + 1954 + /* 1955 + * Call with hotplug lock held 1956 + */ 1957 + void sched_update_numa(int cpu, bool online) 1958 + { 1959 + int node; 1960 + 1961 + node = cpu_to_node(cpu); 1945 1962 /* 1946 - * NUMA masks are not built for offline nodes in sched_init_numa(). 1947 - * Thus, when a CPU of a never-onlined-before node gets plugged in, 1948 - * adding that new CPU to the right NUMA masks is not sufficient: the 1949 - * masks of that CPU's node must also be updated. 1963 + * Scheduler NUMA topology is updated when the first CPU of a 1964 + * node is onlined or the last CPU of a node is offlined. 1950 1965 */ 1951 - if (test_bit(node, sched_numa_onlined_nodes)) 1966 + if (cpumask_weight(cpumask_of_node(node)) != 1) 1952 1967 return; 1953 1968 1954 - bitmap_set(sched_numa_onlined_nodes, node, 1); 1955 - 1956 - for (i = 0; i < sched_domains_numa_levels; i++) { 1957 - for (j = 0; j < nr_node_ids; j++) { 1958 - if (!node_online(j) || node == j) 1959 - continue; 1960 - 1961 - if (node_distance(j, node) > sched_domains_numa_distance[i]) 1962 - continue; 1963 - 1964 - /* Add remote nodes in our masks */ 1965 - cpumask_or(sched_domains_numa_masks[i][node], 1966 - sched_domains_numa_masks[i][node], 1967 - sched_domains_numa_masks[0][j]); 1968 - } 1969 - } 1970 - 1971 - /* 1972 - * A new node has been brought up, potentially changing the topology 1973 - * classification. 1974 - * 1975 - * Note that this is racy vs any use of sched_numa_topology_type :/ 1976 - */ 1977 - init_numa_topology_type(); 1969 + sched_reset_numa(); 1970 + sched_init_numa(online ? 
NUMA_NO_NODE : node); 1978 1971 } 1979 1972 1980 1973 void sched_domains_numa_masks_set(unsigned int cpu) ··· 1990 1967 int node = cpu_to_node(cpu); 1991 1968 int i, j; 1992 1969 1993 - __sched_domains_numa_masks_set(node); 1994 - 1995 1970 for (i = 0; i < sched_domains_numa_levels; i++) { 1996 1971 for (j = 0; j < nr_node_ids; j++) { 1997 - if (!node_online(j)) 1972 + if (!node_state(j, N_CPU)) 1998 1973 continue; 1999 1974 2000 1975 /* Set ourselves in the remote node's masks */ ··· 2007 1986 int i, j; 2008 1987 2009 1988 for (i = 0; i < sched_domains_numa_levels; i++) { 2010 - for (j = 0; j < nr_node_ids; j++) 2011 - cpumask_clear_cpu(cpu, sched_domains_numa_masks[i][j]); 1989 + for (j = 0; j < nr_node_ids; j++) { 1990 + if (sched_domains_numa_masks[i][j]) 1991 + cpumask_clear_cpu(cpu, sched_domains_numa_masks[i][j]); 1992 + } 2012 1993 } 2013 1994 } 2014 1995 ··· 2024 2001 */ 2025 2002 int sched_numa_find_closest(const struct cpumask *cpus, int cpu) 2026 2003 { 2027 - int i, j = cpu_to_node(cpu); 2004 + int i, j = cpu_to_node(cpu), found = nr_cpu_ids; 2005 + struct cpumask ***masks; 2028 2006 2007 + rcu_read_lock(); 2008 + masks = rcu_dereference(sched_domains_numa_masks); 2009 + if (!masks) 2010 + goto unlock; 2029 2011 for (i = 0; i < sched_domains_numa_levels; i++) { 2030 - cpu = cpumask_any_and(cpus, sched_domains_numa_masks[i][j]); 2031 - if (cpu < nr_cpu_ids) 2032 - return cpu; 2012 + if (!masks[i][j]) 2013 + break; 2014 + cpu = cpumask_any_and(cpus, masks[i][j]); 2015 + if (cpu < nr_cpu_ids) { 2016 + found = cpu; 2017 + break; 2018 + } 2033 2019 } 2034 - return nr_cpu_ids; 2020 + unlock: 2021 + rcu_read_unlock(); 2022 + 2023 + return found; 2035 2024 } 2036 2025 2037 2026 #endif /* CONFIG_NUMA */ ··· 2277 2242 } 2278 2243 } 2279 2244 2245 + /* 2246 + * Calculate an allowed NUMA imbalance such that LLCs do not get 2247 + * imbalanced. 
2248 + */ 2249 + for_each_cpu(i, cpu_map) { 2250 + unsigned int imb = 0; 2251 + unsigned int imb_span = 1; 2252 + 2253 + for (sd = *per_cpu_ptr(d.sd, i); sd; sd = sd->parent) { 2254 + struct sched_domain *child = sd->child; 2255 + 2256 + if (!(sd->flags & SD_SHARE_PKG_RESOURCES) && child && 2257 + (child->flags & SD_SHARE_PKG_RESOURCES)) { 2258 + struct sched_domain __rcu *top_p; 2259 + unsigned int nr_llcs; 2260 + 2261 + /* 2262 + * For a single LLC per node, allow an 2263 + * imbalance up to 25% of the node. This is an 2264 + * arbitrary cutoff based on SMT-2 to balance 2265 + * between memory bandwidth and avoiding 2266 + * premature sharing of HT resources and SMT-4 2267 + * or SMT-8 *may* benefit from a different 2268 + * cutoff. 2269 + * 2270 + * For multiple LLCs, allow an imbalance 2271 + * until multiple tasks would share an LLC 2272 + * on one node while LLCs on another node 2273 + * remain idle. 2274 + */ 2275 + nr_llcs = sd->span_weight / child->span_weight; 2276 + if (nr_llcs == 1) 2277 + imb = sd->span_weight >> 2; 2278 + else 2279 + imb = nr_llcs; 2280 + sd->imb_numa_nr = imb; 2281 + 2282 + /* Set span based on the first NUMA domain. */ 2283 + top_p = sd->parent; 2284 + while (top_p && !(top_p->flags & SD_NUMA)) { 2285 + top_p = top_p->parent; 2286 + } 2287 + imb_span = top_p ? 
top_p->span_weight : sd->span_weight; 2288 + } else { 2289 + int factor = max(1U, (sd->span_weight / imb_span)); 2290 + 2291 + sd->imb_numa_nr = imb * factor; 2292 + } 2293 + } 2294 + } 2295 + 2280 2296 /* Calculate CPU capacity for physical packages and nodes */ 2281 2297 for (i = nr_cpumask_bits-1; i >= 0; i--) { 2282 2298 if (!cpumask_test_cpu(i, cpu_map)) ··· 2437 2351 doms_cur = alloc_sched_domains(ndoms_cur); 2438 2352 if (!doms_cur) 2439 2353 doms_cur = &fallback_doms; 2440 - cpumask_and(doms_cur[0], cpu_map, housekeeping_cpumask(HK_FLAG_DOMAIN)); 2354 + cpumask_and(doms_cur[0], cpu_map, housekeeping_cpumask(HK_TYPE_DOMAIN)); 2441 2355 err = build_sched_domains(doms_cur[0], NULL); 2442 2356 2443 2357 return err; ··· 2526 2440 if (doms_new) { 2527 2441 n = 1; 2528 2442 cpumask_and(doms_new[0], cpu_active_mask, 2529 - housekeeping_cpumask(HK_FLAG_DOMAIN)); 2443 + housekeeping_cpumask(HK_TYPE_DOMAIN)); 2530 2444 } 2531 2445 } else { 2532 2446 n = ndoms_new; ··· 2561 2475 n = 0; 2562 2476 doms_new = &fallback_doms; 2563 2477 cpumask_and(doms_new[0], cpu_active_mask, 2564 - housekeeping_cpumask(HK_FLAG_DOMAIN)); 2478 + housekeeping_cpumask(HK_TYPE_DOMAIN)); 2565 2479 } 2566 2480 2567 2481 /* Build new domains: */
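The imb_numa_nr calculation added to build_sched_domains() above can be followed on a small model. This is a sketch under stated assumptions: struct dom and compute_imb_numa_nr() are hypothetical, the domain hierarchy of one CPU is flattened into an array (index 0 is the lowest level, each next entry is the parent), and two booleans stand in for SD_SHARE_PKG_RESOURCES and SD_NUMA.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of one level of a CPU's domain hierarchy. */
struct dom {
    unsigned int span_weight;  /* number of CPUs spanned by this level */
    bool shares_llc;           /* models SD_SHARE_PKG_RESOURCES */
    bool numa;                 /* models SD_NUMA */
    unsigned int imb_numa_nr;  /* result: allowed NUMA imbalance */
};

/* doms[0] is the lowest level; doms[i + 1] is the parent of doms[i]. */
static void compute_imb_numa_nr(struct dom *doms, size_t n)
{
    unsigned int imb = 0;
    unsigned int imb_span = 1;
    size_t i;

    for (i = 0; i < n; i++) {
        struct dom *sd = &doms[i];
        struct dom *child = i ? &doms[i - 1] : NULL;

        if (!sd->shares_llc && child && child->shares_llc) {
            /* First level above the LLC: count LLCs below it. */
            unsigned int nr_llcs = sd->span_weight / child->span_weight;
            size_t p;

            /* One LLC: 25% of the span. Several LLCs: one task per LLC. */
            imb = (nr_llcs == 1) ? sd->span_weight >> 2 : nr_llcs;
            sd->imb_numa_nr = imb;

            /* Base the scaling span on the first NUMA ancestor, if any. */
            for (p = i + 1; p < n && !doms[p].numa; p++)
                ;
            imb_span = (p < n) ? doms[p].span_weight : sd->span_weight;
        } else {
            /* Other levels scale the base value by their relative span. */
            unsigned int factor = sd->span_weight / imb_span;

            if (factor < 1)
                factor = 1;
            sd->imb_numa_nr = imb * factor;
        }
    }
}
```

For an assumed hierarchy of 2 (SMT), 16 (LLC), 64 (node) and 128 (NUMA) CPUs, the node level sees four LLCs and gets an allowance of 4, and the first NUMA level keeps that value because its span equals the scaling span.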
-1
kernel/sched/wait.c
··· 4 4 * 5 5 * (C) 2004 Nadia Yvette Chambers, Oracle 6 6 */ 7 - #include "sched.h" 8 7 9 8 void __init_waitqueue_head(struct wait_queue_head *wq_head, const char *name, struct lock_class_key *key) 10 9 {
+1 -1
kernel/sched/wait_bit.c
··· 1 1 // SPDX-License-Identifier: GPL-2.0-only 2 + 2 3 /* 3 4 * The implementation of the wait_bit*() and related waiting APIs: 4 5 */ 5 - #include "sched.h" 6 6 7 7 #define WAIT_TABLE_BITS 8 8 8 #define WAIT_TABLE_SIZE (1 << WAIT_TABLE_BITS)
-11
kernel/sysctl.c
··· 1757 1757 .proc_handler = sysctl_sched_uclamp_handler, 1758 1758 }, 1759 1759 #endif 1760 - #ifdef CONFIG_SCHED_AUTOGROUP 1761 - { 1762 - .procname = "sched_autogroup_enabled", 1763 - .data = &sysctl_sched_autogroup_enabled, 1764 - .maxlen = sizeof(unsigned int), 1765 - .mode = 0644, 1766 - .proc_handler = proc_dointvec_minmax, 1767 - .extra1 = SYSCTL_ZERO, 1768 - .extra2 = SYSCTL_ONE, 1769 - }, 1770 - #endif 1771 1760 #ifdef CONFIG_CFS_BANDWIDTH 1772 1761 { 1773 1762 .procname = "sched_cfs_bandwidth_slice_us",
+3 -1
kernel/trace/fgraph.c
··· 415 415 416 416 static void 417 417 ftrace_graph_probe_sched_switch(void *ignore, bool preempt, 418 - struct task_struct *prev, struct task_struct *next) 418 + unsigned int prev_state, 419 + struct task_struct *prev, 420 + struct task_struct *next) 419 421 { 420 422 unsigned long long timestamp; 421 423 int index;
+3 -1
kernel/trace/ftrace.c
··· 7346 7346 7347 7347 static void 7348 7348 ftrace_filter_pid_sched_switch_probe(void *data, bool preempt, 7349 - struct task_struct *prev, struct task_struct *next) 7349 + unsigned int prev_state, 7350 + struct task_struct *prev, 7351 + struct task_struct *next) 7350 7352 { 7351 7353 struct trace_array *tr = data; 7352 7354 struct trace_pid_list *pid_list;
+6 -2
kernel/trace/trace_events.c
··· 765 765 766 766 static void 767 767 event_filter_pid_sched_switch_probe_pre(void *data, bool preempt, 768 - struct task_struct *prev, struct task_struct *next) 768 + unsigned int prev_state, 769 + struct task_struct *prev, 770 + struct task_struct *next) 769 771 { 770 772 struct trace_array *tr = data; 771 773 struct trace_pid_list *no_pid_list; ··· 791 789 792 790 static void 793 791 event_filter_pid_sched_switch_probe_post(void *data, bool preempt, 794 - struct task_struct *prev, struct task_struct *next) 792 + unsigned int prev_state, 793 + struct task_struct *prev, 794 + struct task_struct *next) 795 795 { 796 796 struct trace_array *tr = data; 797 797 struct trace_pid_list *no_pid_list;
+3 -1
kernel/trace/trace_osnoise.c
··· 1167 1167 * used to record the beginning and to report the end of a thread noise window. 1168 1168 */ 1169 1169 static void 1170 - trace_sched_switch_callback(void *data, bool preempt, struct task_struct *p, 1170 + trace_sched_switch_callback(void *data, bool preempt, 1171 + unsigned int prev_state, 1172 + struct task_struct *p, 1171 1173 struct task_struct *n) 1172 1174 { 1173 1175 struct osnoise_variables *osn_var = this_cpu_osn_var();
+1
kernel/trace/trace_sched_switch.c
··· 22 22 23 23 static void 24 24 probe_sched_switch(void *ignore, bool preempt, 25 + unsigned int prev_state, 25 26 struct task_struct *prev, struct task_struct *next) 26 27 { 27 28 int flags;
+1
kernel/trace/trace_sched_wakeup.c
··· 426 426 427 427 static void notrace 428 428 probe_wakeup_sched_switch(void *ignore, bool preempt, 429 + unsigned int prev_state, 429 430 struct task_struct *prev, struct task_struct *next) 430 431 { 431 432 struct trace_array_cpu *data;
+1 -1
kernel/watchdog.c
··· 848 848 pr_info("Disabling watchdog on nohz_full cores by default\n"); 849 849 850 850 cpumask_copy(&watchdog_cpumask, 851 - housekeeping_cpumask(HK_FLAG_TIMER)); 851 + housekeeping_cpumask(HK_TYPE_TIMER)); 852 852 853 853 if (!watchdog_nmi_probe()) 854 854 nmi_watchdog_available = true;
+2 -2
kernel/workqueue.c
··· 6006 6006 void __init workqueue_init_early(void) 6007 6007 { 6008 6008 int std_nice[NR_STD_WORKER_POOLS] = { 0, HIGHPRI_NICE_LEVEL }; 6009 - int hk_flags = HK_FLAG_DOMAIN | HK_FLAG_WQ; 6010 6009 int i, cpu; 6011 6010 6012 6011 BUILD_BUG_ON(__alignof__(struct pool_workqueue) < __alignof__(long long)); 6013 6012 6014 6013 BUG_ON(!alloc_cpumask_var(&wq_unbound_cpumask, GFP_KERNEL)); 6015 - cpumask_copy(wq_unbound_cpumask, housekeeping_cpumask(hk_flags)); 6014 + cpumask_copy(wq_unbound_cpumask, housekeeping_cpumask(HK_TYPE_WQ)); 6015 + cpumask_and(wq_unbound_cpumask, wq_unbound_cpumask, housekeeping_cpumask(HK_TYPE_DOMAIN)); 6016 6016 6017 6017 pwq_cache = KMEM_CACHE(pool_workqueue, SLAB_PANIC); 6018 6018
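The workqueue hunk above replaces one lookup with OR-ed HK_FLAG_* bits by two per-type lookups that are intersected. A rough user-space model of that intersection, assuming a 64-bit integer in place of a cpumask and a hypothetical per-type table (the enum here is an illustrative subset, not the kernel's definition):

```c
#include <stdint.h>

/* Illustrative subset of housekeeping types; not the kernel's enum. */
enum hk_type { HK_TYPE_DOMAIN, HK_TYPE_WQ, HK_TYPE_MAX };

/* One bit per CPU; a stand-in for struct cpumask. */
typedef uint64_t cpumask64;

/*
 * CPUs that are housekeeping for both purposes: the intersection of the
 * two per-type masks, as the cpumask_copy() + cpumask_and() pair does.
 */
static cpumask64 hk_both(const cpumask64 masks[HK_TYPE_MAX],
                         enum hk_type a, enum hk_type b)
{
    return masks[a] & masks[b];
}
```

If CPUs 0-3 are housekeeping for domains and CPUs 2-5 for workqueues, only CPUs 2 and 3 end up in the unbound workqueue mask.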
+3 -3
net/core/net-sysfs.c
··· 823 823 { 824 824 struct rps_map *old_map, *map; 825 825 cpumask_var_t mask; 826 - int err, cpu, i, hk_flags; 826 + int err, cpu, i; 827 827 static DEFINE_MUTEX(rps_map_mutex); 828 828 829 829 if (!capable(CAP_NET_ADMIN)) ··· 839 839 } 840 840 841 841 if (!cpumask_empty(mask)) { 842 - hk_flags = HK_FLAG_DOMAIN | HK_FLAG_WQ; 843 - cpumask_and(mask, mask, housekeeping_cpumask(hk_flags)); 842 + cpumask_and(mask, mask, housekeeping_cpumask(HK_TYPE_DOMAIN)); 843 + cpumask_and(mask, mask, housekeeping_cpumask(HK_TYPE_WQ)); 844 844 if (cpumask_empty(mask)) { 845 845 free_cpumask_var(mask); 846 846 return -EINVAL;
+12 -5
scripts/mkcompile_h
··· 5 5 ARCH=$2 6 6 SMP=$3 7 7 PREEMPT=$4 8 - PREEMPT_RT=$5 9 - CC_VERSION="$6" 10 - LD=$7 8 + PREEMPT_DYNAMIC=$5 9 + PREEMPT_RT=$6 10 + CC_VERSION="$7" 11 + LD=$8 11 12 12 13 # Do not expand names 13 14 set -f ··· 42 41 UTS_VERSION="#$VERSION" 43 42 CONFIG_FLAGS="" 44 43 if [ -n "$SMP" ] ; then CONFIG_FLAGS="SMP"; fi 45 - if [ -n "$PREEMPT" ] ; then CONFIG_FLAGS="$CONFIG_FLAGS PREEMPT"; fi 46 - if [ -n "$PREEMPT_RT" ] ; then CONFIG_FLAGS="$CONFIG_FLAGS PREEMPT_RT"; fi 44 + 45 + if [ -n "$PREEMPT_RT" ] ; then 46 + CONFIG_FLAGS="$CONFIG_FLAGS PREEMPT_RT" 47 + elif [ -n "$PREEMPT_DYNAMIC" ] ; then 48 + CONFIG_FLAGS="$CONFIG_FLAGS PREEMPT_DYNAMIC" 49 + elif [ -n "$PREEMPT" ] ; then 50 + CONFIG_FLAGS="$CONFIG_FLAGS PREEMPT" 51 + fi 47 52 48 53 # Truncate to maximum length 49 54 UTS_LEN=64
+1 -1
tools/testing/selftests/rseq/Makefile
··· 6 6 7 7 CFLAGS += -O2 -Wall -g -I./ -I../../../../usr/include/ -L$(OUTPUT) -Wl,-rpath=./ \ 8 8 $(CLANG_FLAGS) 9 - LDLIBS += -lpthread 9 + LDLIBS += -lpthread -ldl 10 10 11 11 # Own dependencies because we only want to build against 1st prerequisite, but 12 12 # still track changes to header files and depend on shared object.
+1 -1
tools/testing/selftests/rseq/basic_percpu_ops_test.c
··· 167 167 for (;;) { 168 168 struct percpu_list_node *head; 169 169 intptr_t *targetptr, expectnot, *load; 170 - off_t offset; 170 + long offset; 171 171 int ret, cpu; 172 172 173 173 cpu = rseq_cpu_start();
+30
tools/testing/selftests/rseq/compiler.h
··· 1 + /* SPDX-License-Identifier: LGPL-2.1-only OR MIT */ 2 + /* 3 + * rseq/compiler.h 4 + * 5 + * Work-around asm goto compiler bugs. 6 + * 7 + * (C) Copyright 2021 - Mathieu Desnoyers <mathieu.desnoyers@efficios.com> 8 + */ 9 + 10 + #ifndef RSEQ_COMPILER_H 11 + #define RSEQ_COMPILER_H 12 + 13 + /* 14 + * gcc prior to 4.8.2 miscompiles asm goto. 15 + * https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58670 16 + * 17 + * gcc prior to 8.1.0 miscompiles asm goto at O1. 18 + * https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103908 19 + * 20 + * clang prior to version 13.0.1 miscompiles asm goto at O2. 21 + * https://github.com/llvm/llvm-project/issues/52735 22 + * 23 + * Work around these issues by adding a volatile inline asm with 24 + * memory clobber in the fallthrough after the asm goto and at each 25 + * label target. Emit this for all compilers in case other similar 26 + * issues are found in the future. 27 + */ 28 + #define rseq_after_asm_goto() asm volatile ("" : : : "memory") 29 + 30 + #endif /* RSEQ_COMPILER_H_ */
+3 -5
tools/testing/selftests/rseq/param_test.c
··· 161 161 " cbnz " INJECT_ASM_REG ", 222b\n" \ 162 162 "333:\n" 163 163 164 - #elif __PPC__ 164 + #elif defined(__PPC__) 165 165 166 166 #define RSEQ_INJECT_INPUT \ 167 167 , [loop_cnt_1]"m"(loop_cnt[1]) \ ··· 368 368 abort(); 369 369 reps = thread_data->reps; 370 370 for (i = 0; i < reps; i++) { 371 - int cpu = rseq_cpu_start(); 372 - 373 - cpu = rseq_this_cpu_lock(&data->lock); 371 + int cpu = rseq_this_cpu_lock(&data->lock); 374 372 data->c[cpu].count++; 375 373 rseq_percpu_unlock(&data->lock, cpu); 376 374 #ifndef BENCHMARK ··· 549 551 for (;;) { 550 552 struct percpu_list_node *head; 551 553 intptr_t *targetptr, expectnot, *load; 552 - off_t offset; 554 + long offset; 553 555 int ret; 554 556 555 557 cpu = rseq_cpu_start();
+151
tools/testing/selftests/rseq/rseq-abi.h
··· 1 + /* SPDX-License-Identifier: GPL-2.0+ WITH Linux-syscall-note */ 2 + #ifndef _RSEQ_ABI_H 3 + #define _RSEQ_ABI_H 4 + 5 + /* 6 + * rseq-abi.h 7 + * 8 + * Restartable sequences system call API 9 + * 10 + * Copyright (c) 2015-2022 Mathieu Desnoyers <mathieu.desnoyers@efficios.com> 11 + */ 12 + 13 + #include <linux/types.h> 14 + #include <asm/byteorder.h> 15 + 16 + enum rseq_abi_cpu_id_state { 17 + RSEQ_ABI_CPU_ID_UNINITIALIZED = -1, 18 + RSEQ_ABI_CPU_ID_REGISTRATION_FAILED = -2, 19 + }; 20 + 21 + enum rseq_abi_flags { 22 + RSEQ_ABI_FLAG_UNREGISTER = (1 << 0), 23 + }; 24 + 25 + enum rseq_abi_cs_flags_bit { 26 + RSEQ_ABI_CS_FLAG_NO_RESTART_ON_PREEMPT_BIT = 0, 27 + RSEQ_ABI_CS_FLAG_NO_RESTART_ON_SIGNAL_BIT = 1, 28 + RSEQ_ABI_CS_FLAG_NO_RESTART_ON_MIGRATE_BIT = 2, 29 + }; 30 + 31 + enum rseq_abi_cs_flags { 32 + RSEQ_ABI_CS_FLAG_NO_RESTART_ON_PREEMPT = 33 + (1U << RSEQ_ABI_CS_FLAG_NO_RESTART_ON_PREEMPT_BIT), 34 + RSEQ_ABI_CS_FLAG_NO_RESTART_ON_SIGNAL = 35 + (1U << RSEQ_ABI_CS_FLAG_NO_RESTART_ON_SIGNAL_BIT), 36 + RSEQ_ABI_CS_FLAG_NO_RESTART_ON_MIGRATE = 37 + (1U << RSEQ_ABI_CS_FLAG_NO_RESTART_ON_MIGRATE_BIT), 38 + }; 39 + 40 + /* 41 + * struct rseq_abi_cs is aligned on 4 * 8 bytes to ensure it is always 42 + * contained within a single cache-line. It is usually declared as 43 + * link-time constant data. 44 + */ 45 + struct rseq_abi_cs { 46 + /* Version of this structure. */ 47 + __u32 version; 48 + /* enum rseq_abi_cs_flags */ 49 + __u32 flags; 50 + __u64 start_ip; 51 + /* Offset from start_ip. */ 52 + __u64 post_commit_offset; 53 + __u64 abort_ip; 54 + } __attribute__((aligned(4 * sizeof(__u64)))); 55 + 56 + /* 57 + * struct rseq_abi is aligned on 4 * 8 bytes to ensure it is always 58 + * contained within a single cache-line. 59 + * 60 + * A single struct rseq_abi per thread is allowed. 61 + */ 62 + struct rseq_abi { 63 + /* 64 + * Restartable sequences cpu_id_start field. Updated by the 65 + * kernel. Read by user-space with single-copy atomicity 66 + * semantics. 
This field should only be read by the thread which 67 + * registered this data structure. Aligned on 32-bit. Always 68 + * contains a value in the range of possible CPUs, although the 69 + * value may not be the actual current CPU (e.g. if rseq is not 70 + * initialized). This CPU number value should always be compared 71 + * against the value of the cpu_id field before performing a rseq 72 + * commit or returning a value read from a data structure indexed 73 + * using the cpu_id_start value. 74 + */ 75 + __u32 cpu_id_start; 76 + /* 77 + * Restartable sequences cpu_id field. Updated by the kernel. 78 + * Read by user-space with single-copy atomicity semantics. This 79 + * field should only be read by the thread which registered this 80 + * data structure. Aligned on 32-bit. Values 81 + * RSEQ_CPU_ID_UNINITIALIZED and RSEQ_CPU_ID_REGISTRATION_FAILED 82 + * have a special semantic: the former means "rseq uninitialized", 83 + * and latter means "rseq initialization failed". This value is 84 + * meant to be read within rseq critical sections and compared 85 + * with the cpu_id_start value previously read, before performing 86 + * the commit instruction, or read and compared with the 87 + * cpu_id_start value before returning a value loaded from a data 88 + * structure indexed using the cpu_id_start value. 89 + */ 90 + __u32 cpu_id; 91 + /* 92 + * Restartable sequences rseq_cs field. 93 + * 94 + * Contains NULL when no critical section is active for the current 95 + * thread, or holds a pointer to the currently active struct rseq_cs. 96 + * 97 + * Updated by user-space, which sets the address of the currently 98 + * active rseq_cs at the beginning of assembly instruction sequence 99 + * block, and set to NULL by the kernel when it restarts an assembly 100 + * instruction sequence block, as well as when the kernel detects that 101 + * it is preempting or delivering a signal outside of the range 102 + * targeted by the rseq_cs. 
Also needs to be set to NULL by user-space 103 + * before reclaiming memory that contains the targeted struct rseq_cs. 104 + * 105 + * Read and set by the kernel. Set by user-space with single-copy 106 + * atomicity semantics. This field should only be updated by the 107 + * thread which registered this data structure. Aligned on 64-bit. 108 + */ 109 + union { 110 + __u64 ptr64; 111 + 112 + /* 113 + * The "arch" field provides architecture accessor for 114 + * the ptr field based on architecture pointer size and 115 + * endianness. 116 + */ 117 + struct { 118 + #ifdef __LP64__ 119 + __u64 ptr; 120 + #elif defined(__BYTE_ORDER) ? (__BYTE_ORDER == __BIG_ENDIAN) : defined(__BIG_ENDIAN) 121 + __u32 padding; /* Initialized to zero. */ 122 + __u32 ptr; 123 + #else 124 + __u32 ptr; 125 + __u32 padding; /* Initialized to zero. */ 126 + #endif 127 + } arch; 128 + } rseq_cs; 129 + 130 + /* 131 + * Restartable sequences flags field. 132 + * 133 + * This field should only be updated by the thread which 134 + * registered this data structure. Read by the kernel. 135 + * Mainly used for single-stepping through rseq critical sections 136 + * with debuggers. 137 + * 138 + * - RSEQ_ABI_CS_FLAG_NO_RESTART_ON_PREEMPT 139 + * Inhibit instruction sequence block restart on preemption 140 + * for this thread. 141 + * - RSEQ_ABI_CS_FLAG_NO_RESTART_ON_SIGNAL 142 + * Inhibit instruction sequence block restart on signal 143 + * delivery for this thread. 144 + * - RSEQ_ABI_CS_FLAG_NO_RESTART_ON_MIGRATE 145 + * Inhibit instruction sequence block restart on migration for 146 + * this thread. 147 + */ 148 + __u32 flags; 149 + } __attribute__((aligned(4 * sizeof(__u64)))); 150 + 151 + #endif /* _RSEQ_ABI_H */
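The layout rules described in the rseq-abi.h comments above (the 64-bit rseq_cs slot with an architecture accessor, and the 4 * 8 byte alignment) can be checked with a small stand-alone model. This is a sketch, not the UAPI header: the type and function names are hypothetical, stdint types replace the __u32/__u64 types, and the byte-order test uses the compiler-predefined __BYTE_ORDER__ macro, which assumes GCC or Clang.

```c
#include <stdint.h>

/* Hypothetical mirror of the rseq_cs union from the hunk above. */
union rseq_cs_field {
    uint64_t ptr64;
    struct {
#if defined(__LP64__)
        uint64_t ptr;
#elif defined(__BYTE_ORDER__) && (__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__)
        uint32_t padding;  /* must be zero */
        uint32_t ptr;
#else
        uint32_t ptr;
        uint32_t padding;  /* must be zero */
#endif
    } arch;
};

/* Hypothetical mirror of struct rseq_abi: 20 bytes of fields, 32-byte aligned. */
struct rseq_abi_sketch {
    uint32_t cpu_id_start;
    uint32_t cpu_id;
    union rseq_cs_field rseq_cs;
    uint32_t flags;
} __attribute__((aligned(4 * sizeof(uint64_t))));

/*
 * Store a critical-section descriptor address through the arch accessor.
 * Clearing ptr64 first keeps the padding zero on 32-bit targets, so a
 * 64-bit reader of the slot sees the same value as the 32-bit writer.
 */
static void set_rseq_cs(struct rseq_abi_sketch *abi, uintptr_t cs)
{
    abi->rseq_cs.ptr64 = 0;
    abi->rseq_cs.arch.ptr = cs;
}
```

The padding placement is what lets a reader load ptr64 regardless of the writer's pointer size and endianness, as long as the padding stays zero.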
+56 -54
tools/testing/selftests/rseq/rseq-arm.h
··· 147 147 teardown \ 148 148 "b %l[" __rseq_str(cmpfail_label) "]\n\t" 149 149 150 - #define rseq_workaround_gcc_asm_size_guess() __asm__ __volatile__("") 151 - 152 150 static inline __attribute__((always_inline)) 153 151 int rseq_cmpeqv_storev(intptr_t *v, intptr_t expect, intptr_t newv, int cpu) 154 152 { 155 153 RSEQ_INJECT_C(9) 156 154 157 - rseq_workaround_gcc_asm_size_guess(); 158 155 __asm__ __volatile__ goto ( 159 156 RSEQ_ASM_DEFINE_TABLE(9, 1f, 2f, 4f) /* start, commit, abort */ 160 157 RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[cmpfail]) ··· 182 185 "5:\n\t" 183 186 : /* gcc asm goto does not allow outputs */ 184 187 : [cpu_id] "r" (cpu), 185 - [current_cpu_id] "m" (__rseq_abi.cpu_id), 186 - [rseq_cs] "m" (__rseq_abi.rseq_cs), 188 + [current_cpu_id] "m" (rseq_get_abi()->cpu_id), 189 + [rseq_cs] "m" (rseq_get_abi()->rseq_cs.arch.ptr), 187 190 [v] "m" (*v), 188 191 [expect] "r" (expect), 189 192 [newv] "r" (newv) ··· 195 198 , error1, error2 196 199 #endif 197 200 ); 198 - rseq_workaround_gcc_asm_size_guess(); 201 + rseq_after_asm_goto(); 199 202 return 0; 200 203 abort: 201 - rseq_workaround_gcc_asm_size_guess(); 204 + rseq_after_asm_goto(); 202 205 RSEQ_INJECT_FAILED 203 206 return -1; 204 207 cmpfail: 205 - rseq_workaround_gcc_asm_size_guess(); 208 + rseq_after_asm_goto(); 206 209 return 1; 207 210 #ifdef RSEQ_COMPARE_TWICE 208 211 error1: 212 + rseq_after_asm_goto(); 209 213 rseq_bug("cpu_id comparison failed"); 210 214 error2: 215 + rseq_after_asm_goto(); 211 216 rseq_bug("expected value comparison failed"); 212 217 #endif 213 218 } 214 219 215 220 static inline __attribute__((always_inline)) 216 221 int rseq_cmpnev_storeoffp_load(intptr_t *v, intptr_t expectnot, 217 - off_t voffp, intptr_t *load, int cpu) 222 + long voffp, intptr_t *load, int cpu) 218 223 { 219 224 RSEQ_INJECT_C(9) 220 225 221 - rseq_workaround_gcc_asm_size_guess(); 222 226 __asm__ __volatile__ goto ( 223 227 RSEQ_ASM_DEFINE_TABLE(9, 1f, 2f, 4f) /* start, commit, abort */ 224 228 
RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[cmpfail]) ··· 253 255 "5:\n\t" 254 256 : /* gcc asm goto does not allow outputs */ 255 257 : [cpu_id] "r" (cpu), 256 - [current_cpu_id] "m" (__rseq_abi.cpu_id), 257 - [rseq_cs] "m" (__rseq_abi.rseq_cs), 258 + [current_cpu_id] "m" (rseq_get_abi()->cpu_id), 259 + [rseq_cs] "m" (rseq_get_abi()->rseq_cs.arch.ptr), 258 260 /* final store input */ 259 261 [v] "m" (*v), 260 262 [expectnot] "r" (expectnot), ··· 268 270 , error1, error2 269 271 #endif 270 272 ); 271 - rseq_workaround_gcc_asm_size_guess(); 273 + rseq_after_asm_goto(); 272 274 return 0; 273 275 abort: 274 - rseq_workaround_gcc_asm_size_guess(); 276 + rseq_after_asm_goto(); 275 277 RSEQ_INJECT_FAILED 276 278 return -1; 277 279 cmpfail: 278 - rseq_workaround_gcc_asm_size_guess(); 280 + rseq_after_asm_goto(); 279 281 return 1; 280 282 #ifdef RSEQ_COMPARE_TWICE 281 283 error1: 284 + rseq_after_asm_goto(); 282 285 rseq_bug("cpu_id comparison failed"); 283 286 error2: 287 + rseq_after_asm_goto(); 284 288 rseq_bug("expected value comparison failed"); 285 289 #endif 286 290 } ··· 292 292 { 293 293 RSEQ_INJECT_C(9) 294 294 295 - rseq_workaround_gcc_asm_size_guess(); 296 295 __asm__ __volatile__ goto ( 297 296 RSEQ_ASM_DEFINE_TABLE(9, 1f, 2f, 4f) /* start, commit, abort */ 298 297 #ifdef RSEQ_COMPARE_TWICE ··· 315 316 "5:\n\t" 316 317 : /* gcc asm goto does not allow outputs */ 317 318 : [cpu_id] "r" (cpu), 318 - [current_cpu_id] "m" (__rseq_abi.cpu_id), 319 - [rseq_cs] "m" (__rseq_abi.rseq_cs), 319 + [current_cpu_id] "m" (rseq_get_abi()->cpu_id), 320 + [rseq_cs] "m" (rseq_get_abi()->rseq_cs.arch.ptr), 320 321 [v] "m" (*v), 321 322 [count] "Ir" (count) 322 323 RSEQ_INJECT_INPUT ··· 327 328 , error1 328 329 #endif 329 330 ); 330 - rseq_workaround_gcc_asm_size_guess(); 331 + rseq_after_asm_goto(); 331 332 return 0; 332 333 abort: 333 - rseq_workaround_gcc_asm_size_guess(); 334 + rseq_after_asm_goto(); 334 335 RSEQ_INJECT_FAILED 335 336 return -1; 336 337 #ifdef RSEQ_COMPARE_TWICE 337 338 
error1: 339 + rseq_after_asm_goto(); 338 340 rseq_bug("cpu_id comparison failed"); 339 341 #endif 340 342 } ··· 347 347 { 348 348 RSEQ_INJECT_C(9) 349 349 350 - rseq_workaround_gcc_asm_size_guess(); 351 350 __asm__ __volatile__ goto ( 352 351 RSEQ_ASM_DEFINE_TABLE(9, 1f, 2f, 4f) /* start, commit, abort */ 353 352 RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[cmpfail]) ··· 380 381 "5:\n\t" 381 382 : /* gcc asm goto does not allow outputs */ 382 383 : [cpu_id] "r" (cpu), 383 - [current_cpu_id] "m" (__rseq_abi.cpu_id), 384 - [rseq_cs] "m" (__rseq_abi.rseq_cs), 384 + [current_cpu_id] "m" (rseq_get_abi()->cpu_id), 385 + [rseq_cs] "m" (rseq_get_abi()->rseq_cs.arch.ptr), 385 386 /* try store input */ 386 387 [v2] "m" (*v2), 387 388 [newv2] "r" (newv2), ··· 397 398 , error1, error2 398 399 #endif 399 400 ); 400 - rseq_workaround_gcc_asm_size_guess(); 401 + rseq_after_asm_goto(); 401 402 return 0; 402 403 abort: 403 - rseq_workaround_gcc_asm_size_guess(); 404 + rseq_after_asm_goto(); 404 405 RSEQ_INJECT_FAILED 405 406 return -1; 406 407 cmpfail: 407 - rseq_workaround_gcc_asm_size_guess(); 408 + rseq_after_asm_goto(); 408 409 return 1; 409 410 #ifdef RSEQ_COMPARE_TWICE 410 411 error1: 412 + rseq_after_asm_goto(); 411 413 rseq_bug("cpu_id comparison failed"); 412 414 error2: 415 + rseq_after_asm_goto(); 413 416 rseq_bug("expected value comparison failed"); 414 417 #endif 415 418 } ··· 423 422 { 424 423 RSEQ_INJECT_C(9) 425 424 426 - rseq_workaround_gcc_asm_size_guess(); 427 425 __asm__ __volatile__ goto ( 428 426 RSEQ_ASM_DEFINE_TABLE(9, 1f, 2f, 4f) /* start, commit, abort */ 429 427 RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[cmpfail]) ··· 457 457 "5:\n\t" 458 458 : /* gcc asm goto does not allow outputs */ 459 459 : [cpu_id] "r" (cpu), 460 - [current_cpu_id] "m" (__rseq_abi.cpu_id), 461 - [rseq_cs] "m" (__rseq_abi.rseq_cs), 460 + [current_cpu_id] "m" (rseq_get_abi()->cpu_id), 461 + [rseq_cs] "m" (rseq_get_abi()->rseq_cs.arch.ptr), 462 462 /* try store input */ 463 463 [v2] "m" (*v2), 464 464 
[newv2] "r" (newv2), ··· 474 474 , error1, error2 475 475 #endif 476 476 ); 477 - rseq_workaround_gcc_asm_size_guess(); 477 + rseq_after_asm_goto(); 478 478 return 0; 479 479 abort: 480 - rseq_workaround_gcc_asm_size_guess(); 480 + rseq_after_asm_goto(); 481 481 RSEQ_INJECT_FAILED 482 482 return -1; 483 483 cmpfail: 484 - rseq_workaround_gcc_asm_size_guess(); 484 + rseq_after_asm_goto(); 485 485 return 1; 486 486 #ifdef RSEQ_COMPARE_TWICE 487 487 error1: 488 + rseq_after_asm_goto(); 488 489 rseq_bug("cpu_id comparison failed"); 489 490 error2: 491 + rseq_after_asm_goto(); 490 492 rseq_bug("expected value comparison failed"); 491 493 #endif 492 494 } ··· 500 498 { 501 499 RSEQ_INJECT_C(9) 502 500 503 - rseq_workaround_gcc_asm_size_guess(); 504 501 __asm__ __volatile__ goto ( 505 502 RSEQ_ASM_DEFINE_TABLE(9, 1f, 2f, 4f) /* start, commit, abort */ 506 503 RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[cmpfail]) ··· 538 537 "5:\n\t" 539 538 : /* gcc asm goto does not allow outputs */ 540 539 : [cpu_id] "r" (cpu), 541 - [current_cpu_id] "m" (__rseq_abi.cpu_id), 542 - [rseq_cs] "m" (__rseq_abi.rseq_cs), 540 + [current_cpu_id] "m" (rseq_get_abi()->cpu_id), 541 + [rseq_cs] "m" (rseq_get_abi()->rseq_cs.arch.ptr), 543 542 /* cmp2 input */ 544 543 [v2] "m" (*v2), 545 544 [expect2] "r" (expect2), ··· 555 554 , error1, error2, error3 556 555 #endif 557 556 ); 558 - rseq_workaround_gcc_asm_size_guess(); 557 + rseq_after_asm_goto(); 559 558 return 0; 560 559 abort: 561 - rseq_workaround_gcc_asm_size_guess(); 560 + rseq_after_asm_goto(); 562 561 RSEQ_INJECT_FAILED 563 562 return -1; 564 563 cmpfail: 565 - rseq_workaround_gcc_asm_size_guess(); 564 + rseq_after_asm_goto(); 566 565 return 1; 567 566 #ifdef RSEQ_COMPARE_TWICE 568 567 error1: 568 + rseq_after_asm_goto(); 569 569 rseq_bug("cpu_id comparison failed"); 570 570 error2: 571 + rseq_after_asm_goto(); 571 572 rseq_bug("1st expected value comparison failed"); 572 573 error3: 574 + rseq_after_asm_goto(); 573 575 rseq_bug("2nd expected value 
comparison failed"); 574 576 #endif 575 577 } ··· 586 582 587 583 RSEQ_INJECT_C(9) 588 584 589 - rseq_workaround_gcc_asm_size_guess(); 590 585 __asm__ __volatile__ goto ( 591 586 RSEQ_ASM_DEFINE_TABLE(9, 1f, 2f, 4f) /* start, commit, abort */ 592 587 RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[cmpfail]) ··· 660 657 "8:\n\t" 661 658 : /* gcc asm goto does not allow outputs */ 662 659 : [cpu_id] "r" (cpu), 663 - [current_cpu_id] "m" (__rseq_abi.cpu_id), 664 - [rseq_cs] "m" (__rseq_abi.rseq_cs), 660 + [current_cpu_id] "m" (rseq_get_abi()->cpu_id), 661 + [rseq_cs] "m" (rseq_get_abi()->rseq_cs.arch.ptr), 665 662 /* final store input */ 666 663 [v] "m" (*v), 667 664 [expect] "r" (expect), ··· 681 678 , error1, error2 682 679 #endif 683 680 ); 684 - rseq_workaround_gcc_asm_size_guess(); 681 + rseq_after_asm_goto(); 685 682 return 0; 686 683 abort: 687 - rseq_workaround_gcc_asm_size_guess(); 684 + rseq_after_asm_goto(); 688 685 RSEQ_INJECT_FAILED 689 686 return -1; 690 687 cmpfail: 691 - rseq_workaround_gcc_asm_size_guess(); 688 + rseq_after_asm_goto(); 692 689 return 1; 693 690 #ifdef RSEQ_COMPARE_TWICE 694 691 error1: 695 - rseq_workaround_gcc_asm_size_guess(); 692 + rseq_after_asm_goto(); 696 693 rseq_bug("cpu_id comparison failed"); 697 694 error2: 698 - rseq_workaround_gcc_asm_size_guess(); 695 + rseq_after_asm_goto(); 699 696 rseq_bug("expected value comparison failed"); 700 697 #endif 701 698 } ··· 709 706 710 707 RSEQ_INJECT_C(9) 711 708 712 - rseq_workaround_gcc_asm_size_guess(); 713 709 __asm__ __volatile__ goto ( 714 710 RSEQ_ASM_DEFINE_TABLE(9, 1f, 2f, 4f) /* start, commit, abort */ 715 711 RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[cmpfail]) ··· 784 782 "8:\n\t" 785 783 : /* gcc asm goto does not allow outputs */ 786 784 : [cpu_id] "r" (cpu), 787 - [current_cpu_id] "m" (__rseq_abi.cpu_id), 788 - [rseq_cs] "m" (__rseq_abi.rseq_cs), 785 + [current_cpu_id] "m" (rseq_get_abi()->cpu_id), 786 + [rseq_cs] "m" (rseq_get_abi()->rseq_cs.arch.ptr), 789 787 /* final store input */ 790 788 
[v] "m" (*v), 791 789 [expect] "r" (expect), ··· 805 803 , error1, error2 806 804 #endif 807 805 ); 808 - rseq_workaround_gcc_asm_size_guess(); 806 + rseq_after_asm_goto(); 809 807 return 0; 810 808 abort: 811 - rseq_workaround_gcc_asm_size_guess(); 809 + rseq_after_asm_goto(); 812 810 RSEQ_INJECT_FAILED 813 811 return -1; 814 812 cmpfail: 815 - rseq_workaround_gcc_asm_size_guess(); 813 + rseq_after_asm_goto(); 816 814 return 1; 817 815 #ifdef RSEQ_COMPARE_TWICE 818 816 error1: 819 - rseq_workaround_gcc_asm_size_guess(); 817 + rseq_after_asm_goto(); 820 818 rseq_bug("cpu_id comparison failed"); 821 819 error2: 822 - rseq_workaround_gcc_asm_size_guess(); 820 + rseq_after_asm_goto(); 823 821 rseq_bug("expected value comparison failed"); 824 822 #endif 825 823 }
+56 -23
tools/testing/selftests/rseq/rseq-arm64.h
··· 230 230 RSEQ_ASM_DEFINE_ABORT(4, abort) 231 231 : /* gcc asm goto does not allow outputs */ 232 232 : [cpu_id] "r" (cpu), 233 - [current_cpu_id] "Qo" (__rseq_abi.cpu_id), 234 - [rseq_cs] "m" (__rseq_abi.rseq_cs), 233 + [current_cpu_id] "Qo" (rseq_get_abi()->cpu_id), 234 + [rseq_cs] "m" (rseq_get_abi()->rseq_cs.arch.ptr), 235 235 [v] "Qo" (*v), 236 236 [expect] "r" (expect), 237 237 [newv] "r" (newv) ··· 242 242 , error1, error2 243 243 #endif 244 244 ); 245 - 245 + rseq_after_asm_goto(); 246 246 return 0; 247 247 abort: 248 + rseq_after_asm_goto(); 248 249 RSEQ_INJECT_FAILED 249 250 return -1; 250 251 cmpfail: 252 + rseq_after_asm_goto(); 251 253 return 1; 252 254 #ifdef RSEQ_COMPARE_TWICE 253 255 error1: 256 + rseq_after_asm_goto(); 254 257 rseq_bug("cpu_id comparison failed"); 255 258 error2: 259 + rseq_after_asm_goto(); 256 260 rseq_bug("expected value comparison failed"); 257 261 #endif 258 262 } 259 263 260 264 static inline __attribute__((always_inline)) 261 265 int rseq_cmpnev_storeoffp_load(intptr_t *v, intptr_t expectnot, 262 - off_t voffp, intptr_t *load, int cpu) 266 + long voffp, intptr_t *load, int cpu) 263 267 { 264 268 RSEQ_INJECT_C(9) 265 269 ··· 291 287 RSEQ_ASM_DEFINE_ABORT(4, abort) 292 288 : /* gcc asm goto does not allow outputs */ 293 289 : [cpu_id] "r" (cpu), 294 - [current_cpu_id] "Qo" (__rseq_abi.cpu_id), 295 - [rseq_cs] "m" (__rseq_abi.rseq_cs), 290 + [current_cpu_id] "Qo" (rseq_get_abi()->cpu_id), 291 + [rseq_cs] "m" (rseq_get_abi()->rseq_cs.arch.ptr), 296 292 [v] "Qo" (*v), 297 293 [expectnot] "r" (expectnot), 298 294 [load] "Qo" (*load), ··· 304 300 , error1, error2 305 301 #endif 306 302 ); 303 + rseq_after_asm_goto(); 307 304 return 0; 308 305 abort: 306 + rseq_after_asm_goto(); 309 307 RSEQ_INJECT_FAILED 310 308 return -1; 311 309 cmpfail: 310 + rseq_after_asm_goto(); 312 311 return 1; 313 312 #ifdef RSEQ_COMPARE_TWICE 314 313 error1: 314 + rseq_after_asm_goto(); 315 315 rseq_bug("cpu_id comparison failed"); 316 316 error2: 317 + 
rseq_after_asm_goto(); 317 318 rseq_bug("expected value comparison failed"); 318 319 #endif 319 320 } ··· 346 337 RSEQ_ASM_DEFINE_ABORT(4, abort) 347 338 : /* gcc asm goto does not allow outputs */ 348 339 : [cpu_id] "r" (cpu), 349 - [current_cpu_id] "Qo" (__rseq_abi.cpu_id), 350 - [rseq_cs] "m" (__rseq_abi.rseq_cs), 340 + [current_cpu_id] "Qo" (rseq_get_abi()->cpu_id), 341 + [rseq_cs] "m" (rseq_get_abi()->rseq_cs.arch.ptr), 351 342 [v] "Qo" (*v), 352 343 [count] "r" (count) 353 344 RSEQ_INJECT_INPUT ··· 357 348 , error1 358 349 #endif 359 350 ); 351 + rseq_after_asm_goto(); 360 352 return 0; 361 353 abort: 354 + rseq_after_asm_goto(); 362 355 RSEQ_INJECT_FAILED 363 356 return -1; 364 357 #ifdef RSEQ_COMPARE_TWICE 365 358 error1: 359 + rseq_after_asm_goto(); 366 360 rseq_bug("cpu_id comparison failed"); 367 361 #endif 368 362 } ··· 400 388 RSEQ_ASM_DEFINE_ABORT(4, abort) 401 389 : /* gcc asm goto does not allow outputs */ 402 390 : [cpu_id] "r" (cpu), 403 - [current_cpu_id] "Qo" (__rseq_abi.cpu_id), 404 - [rseq_cs] "m" (__rseq_abi.rseq_cs), 391 + [current_cpu_id] "Qo" (rseq_get_abi()->cpu_id), 392 + [rseq_cs] "m" (rseq_get_abi()->rseq_cs.arch.ptr), 405 393 [expect] "r" (expect), 406 394 [v] "Qo" (*v), 407 395 [newv] "r" (newv), ··· 414 402 , error1, error2 415 403 #endif 416 404 ); 417 - 405 + rseq_after_asm_goto(); 418 406 return 0; 419 407 abort: 408 + rseq_after_asm_goto(); 420 409 RSEQ_INJECT_FAILED 421 410 return -1; 422 411 cmpfail: 412 + rseq_after_asm_goto(); 423 413 return 1; 424 414 #ifdef RSEQ_COMPARE_TWICE 425 415 error1: 416 + rseq_after_asm_goto(); 426 417 rseq_bug("cpu_id comparison failed"); 427 418 error2: 419 + rseq_after_asm_goto(); 428 420 rseq_bug("expected value comparison failed"); 429 421 #endif 430 422 } ··· 463 447 RSEQ_ASM_DEFINE_ABORT(4, abort) 464 448 : /* gcc asm goto does not allow outputs */ 465 449 : [cpu_id] "r" (cpu), 466 - [current_cpu_id] "Qo" (__rseq_abi.cpu_id), 467 - [rseq_cs] "m" (__rseq_abi.rseq_cs), 450 + [current_cpu_id] 
"Qo" (rseq_get_abi()->cpu_id), 451 + [rseq_cs] "m" (rseq_get_abi()->rseq_cs.arch.ptr), 468 452 [expect] "r" (expect), 469 453 [v] "Qo" (*v), 470 454 [newv] "r" (newv), ··· 477 461 , error1, error2 478 462 #endif 479 463 ); 480 - 464 + rseq_after_asm_goto(); 481 465 return 0; 482 466 abort: 467 + rseq_after_asm_goto(); 483 468 RSEQ_INJECT_FAILED 484 469 return -1; 485 470 cmpfail: 471 + rseq_after_asm_goto(); 486 472 return 1; 487 473 #ifdef RSEQ_COMPARE_TWICE 488 474 error1: 475 + rseq_after_asm_goto(); 489 476 rseq_bug("cpu_id comparison failed"); 490 477 error2: 478 + rseq_after_asm_goto(); 491 479 rseq_bug("expected value comparison failed"); 492 480 #endif 493 481 } ··· 528 508 RSEQ_ASM_DEFINE_ABORT(4, abort) 529 509 : /* gcc asm goto does not allow outputs */ 530 510 : [cpu_id] "r" (cpu), 531 - [current_cpu_id] "Qo" (__rseq_abi.cpu_id), 532 - [rseq_cs] "m" (__rseq_abi.rseq_cs), 511 + [current_cpu_id] "Qo" (rseq_get_abi()->cpu_id), 512 + [rseq_cs] "m" (rseq_get_abi()->rseq_cs.arch.ptr), 533 513 [v] "Qo" (*v), 534 514 [expect] "r" (expect), 535 515 [v2] "Qo" (*v2), ··· 542 522 , error1, error2, error3 543 523 #endif 544 524 ); 545 - 525 + rseq_after_asm_goto(); 546 526 return 0; 547 527 abort: 528 + rseq_after_asm_goto(); 548 529 RSEQ_INJECT_FAILED 549 530 return -1; 550 531 cmpfail: 532 + rseq_after_asm_goto(); 551 533 return 1; 552 534 #ifdef RSEQ_COMPARE_TWICE 553 535 error1: 536 + rseq_after_asm_goto(); 554 537 rseq_bug("cpu_id comparison failed"); 555 538 error2: 539 + rseq_after_asm_goto(); 556 540 rseq_bug("expected value comparison failed"); 557 541 error3: 542 + rseq_after_asm_goto(); 558 543 rseq_bug("2nd expected value comparison failed"); 559 544 #endif 560 545 } ··· 594 569 RSEQ_ASM_DEFINE_ABORT(4, abort) 595 570 : /* gcc asm goto does not allow outputs */ 596 571 : [cpu_id] "r" (cpu), 597 - [current_cpu_id] "Qo" (__rseq_abi.cpu_id), 598 - [rseq_cs] "m" (__rseq_abi.rseq_cs), 572 + [current_cpu_id] "Qo" (rseq_get_abi()->cpu_id), 573 + [rseq_cs] "m" 
(rseq_get_abi()->rseq_cs.arch.ptr), 599 574 [expect] "r" (expect), 600 575 [v] "Qo" (*v), 601 576 [newv] "r" (newv), ··· 609 584 , error1, error2 610 585 #endif 611 586 ); 612 - 587 + rseq_after_asm_goto(); 613 588 return 0; 614 589 abort: 590 + rseq_after_asm_goto(); 615 591 RSEQ_INJECT_FAILED 616 592 return -1; 617 593 cmpfail: 594 + rseq_after_asm_goto(); 618 595 return 1; 619 596 #ifdef RSEQ_COMPARE_TWICE 620 597 error1: 598 + rseq_after_asm_goto(); 621 599 rseq_bug("cpu_id comparison failed"); 622 600 error2: 601 + rseq_after_asm_goto(); 623 602 rseq_bug("expected value comparison failed"); 624 603 #endif 625 604 } ··· 658 629 RSEQ_ASM_DEFINE_ABORT(4, abort) 659 630 : /* gcc asm goto does not allow outputs */ 660 631 : [cpu_id] "r" (cpu), 661 - [current_cpu_id] "Qo" (__rseq_abi.cpu_id), 662 - [rseq_cs] "m" (__rseq_abi.rseq_cs), 632 + [current_cpu_id] "Qo" (rseq_get_abi()->cpu_id), 633 + [rseq_cs] "m" (rseq_get_abi()->rseq_cs.arch.ptr), 663 634 [expect] "r" (expect), 664 635 [v] "Qo" (*v), 665 636 [newv] "r" (newv), ··· 673 644 , error1, error2 674 645 #endif 675 646 ); 676 - 647 + rseq_after_asm_goto(); 677 648 return 0; 678 649 abort: 650 + rseq_after_asm_goto(); 679 651 RSEQ_INJECT_FAILED 680 652 return -1; 681 653 cmpfail: 654 + rseq_after_asm_goto(); 682 655 return 1; 683 656 #ifdef RSEQ_COMPARE_TWICE 684 657 error1: 658 + rseq_after_asm_goto(); 685 659 rseq_bug("cpu_id comparison failed"); 686 660 error2: 661 + rseq_after_asm_goto(); 687 662 rseq_bug("expected value comparison failed"); 688 663 #endif 689 664 }
+25
tools/testing/selftests/rseq/rseq-generic-thread-pointer.h
+/* SPDX-License-Identifier: LGPL-2.1-only OR MIT */
+/*
+ * rseq-generic-thread-pointer.h
+ *
+ * (C) Copyright 2021 - Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
+ */
+
+#ifndef _RSEQ_GENERIC_THREAD_POINTER
+#define _RSEQ_GENERIC_THREAD_POINTER
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/* Use gcc builtin thread pointer. */
+static inline void *rseq_thread_pointer(void)
+{
+	return __builtin_thread_pointer();
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif
+17 -54
tools/testing/selftests/rseq/rseq-mips.h
··· 154 154 teardown \ 155 155 "b %l[" __rseq_str(cmpfail_label) "]\n\t" 156 156 157 - #define rseq_workaround_gcc_asm_size_guess() __asm__ __volatile__("") 158 - 159 157 static inline __attribute__((always_inline)) 160 158 int rseq_cmpeqv_storev(intptr_t *v, intptr_t expect, intptr_t newv, int cpu) 161 159 { 162 160 RSEQ_INJECT_C(9) 163 161 164 - rseq_workaround_gcc_asm_size_guess(); 165 162 __asm__ __volatile__ goto ( 166 163 RSEQ_ASM_DEFINE_TABLE(9, 1f, 2f, 4f) /* start, commit, abort */ 167 164 RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[cmpfail]) ··· 187 190 "5:\n\t" 188 191 : /* gcc asm goto does not allow outputs */ 189 192 : [cpu_id] "r" (cpu), 190 - [current_cpu_id] "m" (__rseq_abi.cpu_id), 191 - [rseq_cs] "m" (__rseq_abi.rseq_cs), 193 + [current_cpu_id] "m" (rseq_get_abi()->cpu_id), 194 + [rseq_cs] "m" (rseq_get_abi()->rseq_cs.arch.ptr), 192 195 [v] "m" (*v), 193 196 [expect] "r" (expect), 194 197 [newv] "r" (newv) ··· 200 203 , error1, error2 201 204 #endif 202 205 ); 203 - rseq_workaround_gcc_asm_size_guess(); 204 206 return 0; 205 207 abort: 206 - rseq_workaround_gcc_asm_size_guess(); 207 208 RSEQ_INJECT_FAILED 208 209 return -1; 209 210 cmpfail: 210 - rseq_workaround_gcc_asm_size_guess(); 211 211 return 1; 212 212 #ifdef RSEQ_COMPARE_TWICE 213 213 error1: ··· 216 222 217 223 static inline __attribute__((always_inline)) 218 224 int rseq_cmpnev_storeoffp_load(intptr_t *v, intptr_t expectnot, 219 - off_t voffp, intptr_t *load, int cpu) 225 + long voffp, intptr_t *load, int cpu) 220 226 { 221 227 RSEQ_INJECT_C(9) 222 228 223 - rseq_workaround_gcc_asm_size_guess(); 224 229 __asm__ __volatile__ goto ( 225 230 RSEQ_ASM_DEFINE_TABLE(9, 1f, 2f, 4f) /* start, commit, abort */ 226 231 RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[cmpfail]) ··· 251 258 "5:\n\t" 252 259 : /* gcc asm goto does not allow outputs */ 253 260 : [cpu_id] "r" (cpu), 254 - [current_cpu_id] "m" (__rseq_abi.cpu_id), 255 - [rseq_cs] "m" (__rseq_abi.rseq_cs), 261 + [current_cpu_id] "m" (rseq_get_abi()->cpu_id), 
262 + [rseq_cs] "m" (rseq_get_abi()->rseq_cs.arch.ptr), 256 263 /* final store input */ 257 264 [v] "m" (*v), 258 265 [expectnot] "r" (expectnot), ··· 266 273 , error1, error2 267 274 #endif 268 275 ); 269 - rseq_workaround_gcc_asm_size_guess(); 270 276 return 0; 271 277 abort: 272 - rseq_workaround_gcc_asm_size_guess(); 273 278 RSEQ_INJECT_FAILED 274 279 return -1; 275 280 cmpfail: 276 - rseq_workaround_gcc_asm_size_guess(); 277 281 return 1; 278 282 #ifdef RSEQ_COMPARE_TWICE 279 283 error1: ··· 285 295 { 286 296 RSEQ_INJECT_C(9) 287 297 288 - rseq_workaround_gcc_asm_size_guess(); 289 298 __asm__ __volatile__ goto ( 290 299 RSEQ_ASM_DEFINE_TABLE(9, 1f, 2f, 4f) /* start, commit, abort */ 291 300 #ifdef RSEQ_COMPARE_TWICE ··· 308 319 "5:\n\t" 309 320 : /* gcc asm goto does not allow outputs */ 310 321 : [cpu_id] "r" (cpu), 311 - [current_cpu_id] "m" (__rseq_abi.cpu_id), 312 - [rseq_cs] "m" (__rseq_abi.rseq_cs), 322 + [current_cpu_id] "m" (rseq_get_abi()->cpu_id), 323 + [rseq_cs] "m" (rseq_get_abi()->rseq_cs.arch.ptr), 313 324 [v] "m" (*v), 314 325 [count] "Ir" (count) 315 326 RSEQ_INJECT_INPUT ··· 320 331 , error1 321 332 #endif 322 333 ); 323 - rseq_workaround_gcc_asm_size_guess(); 324 334 return 0; 325 335 abort: 326 - rseq_workaround_gcc_asm_size_guess(); 327 336 RSEQ_INJECT_FAILED 328 337 return -1; 329 338 #ifdef RSEQ_COMPARE_TWICE ··· 337 350 { 338 351 RSEQ_INJECT_C(9) 339 352 340 - rseq_workaround_gcc_asm_size_guess(); 341 353 __asm__ __volatile__ goto ( 342 354 RSEQ_ASM_DEFINE_TABLE(9, 1f, 2f, 4f) /* start, commit, abort */ 343 355 RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[cmpfail]) ··· 368 382 "5:\n\t" 369 383 : /* gcc asm goto does not allow outputs */ 370 384 : [cpu_id] "r" (cpu), 371 - [current_cpu_id] "m" (__rseq_abi.cpu_id), 372 - [rseq_cs] "m" (__rseq_abi.rseq_cs), 385 + [current_cpu_id] "m" (rseq_get_abi()->cpu_id), 386 + [rseq_cs] "m" (rseq_get_abi()->rseq_cs.arch.ptr), 373 387 /* try store input */ 374 388 [v2] "m" (*v2), 375 389 [newv2] "r" (newv2), ··· 
385 399 , error1, error2 386 400 #endif 387 401 ); 388 - rseq_workaround_gcc_asm_size_guess(); 389 402 return 0; 390 403 abort: 391 - rseq_workaround_gcc_asm_size_guess(); 392 404 RSEQ_INJECT_FAILED 393 405 return -1; 394 406 cmpfail: 395 - rseq_workaround_gcc_asm_size_guess(); 396 407 return 1; 397 408 #ifdef RSEQ_COMPARE_TWICE 398 409 error1: ··· 406 423 { 407 424 RSEQ_INJECT_C(9) 408 425 409 - rseq_workaround_gcc_asm_size_guess(); 410 426 __asm__ __volatile__ goto ( 411 427 RSEQ_ASM_DEFINE_TABLE(9, 1f, 2f, 4f) /* start, commit, abort */ 412 428 RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[cmpfail]) ··· 438 456 "5:\n\t" 439 457 : /* gcc asm goto does not allow outputs */ 440 458 : [cpu_id] "r" (cpu), 441 - [current_cpu_id] "m" (__rseq_abi.cpu_id), 442 - [rseq_cs] "m" (__rseq_abi.rseq_cs), 459 + [current_cpu_id] "m" (rseq_get_abi()->cpu_id), 460 + [rseq_cs] "m" (rseq_get_abi()->rseq_cs.arch.ptr), 443 461 /* try store input */ 444 462 [v2] "m" (*v2), 445 463 [newv2] "r" (newv2), ··· 455 473 , error1, error2 456 474 #endif 457 475 ); 458 - rseq_workaround_gcc_asm_size_guess(); 459 476 return 0; 460 477 abort: 461 - rseq_workaround_gcc_asm_size_guess(); 462 478 RSEQ_INJECT_FAILED 463 479 return -1; 464 480 cmpfail: 465 - rseq_workaround_gcc_asm_size_guess(); 466 481 return 1; 467 482 #ifdef RSEQ_COMPARE_TWICE 468 483 error1: ··· 476 497 { 477 498 RSEQ_INJECT_C(9) 478 499 479 - rseq_workaround_gcc_asm_size_guess(); 480 500 __asm__ __volatile__ goto ( 481 501 RSEQ_ASM_DEFINE_TABLE(9, 1f, 2f, 4f) /* start, commit, abort */ 482 502 RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[cmpfail]) ··· 510 532 "5:\n\t" 511 533 : /* gcc asm goto does not allow outputs */ 512 534 : [cpu_id] "r" (cpu), 513 - [current_cpu_id] "m" (__rseq_abi.cpu_id), 514 - [rseq_cs] "m" (__rseq_abi.rseq_cs), 535 + [current_cpu_id] "m" (rseq_get_abi()->cpu_id), 536 + [rseq_cs] "m" (rseq_get_abi()->rseq_cs.arch.ptr), 515 537 /* cmp2 input */ 516 538 [v2] "m" (*v2), 517 539 [expect2] "r" (expect2), ··· 527 549 , error1, error2, 
error3 528 550 #endif 529 551 ); 530 - rseq_workaround_gcc_asm_size_guess(); 531 552 return 0; 532 553 abort: 533 - rseq_workaround_gcc_asm_size_guess(); 534 554 RSEQ_INJECT_FAILED 535 555 return -1; 536 556 cmpfail: 537 - rseq_workaround_gcc_asm_size_guess(); 538 557 return 1; 539 558 #ifdef RSEQ_COMPARE_TWICE 540 559 error1: ··· 552 577 553 578 RSEQ_INJECT_C(9) 554 579 555 - rseq_workaround_gcc_asm_size_guess(); 556 580 __asm__ __volatile__ goto ( 557 581 RSEQ_ASM_DEFINE_TABLE(9, 1f, 2f, 4f) /* start, commit, abort */ 558 582 RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[cmpfail]) ··· 623 649 "8:\n\t" 624 650 : /* gcc asm goto does not allow outputs */ 625 651 : [cpu_id] "r" (cpu), 626 - [current_cpu_id] "m" (__rseq_abi.cpu_id), 627 - [rseq_cs] "m" (__rseq_abi.rseq_cs), 652 + [current_cpu_id] "m" (rseq_get_abi()->cpu_id), 653 + [rseq_cs] "m" (rseq_get_abi()->rseq_cs.arch.ptr), 628 654 /* final store input */ 629 655 [v] "m" (*v), 630 656 [expect] "r" (expect), ··· 644 670 , error1, error2 645 671 #endif 646 672 ); 647 - rseq_workaround_gcc_asm_size_guess(); 648 673 return 0; 649 674 abort: 650 - rseq_workaround_gcc_asm_size_guess(); 651 675 RSEQ_INJECT_FAILED 652 676 return -1; 653 677 cmpfail: 654 - rseq_workaround_gcc_asm_size_guess(); 655 678 return 1; 656 679 #ifdef RSEQ_COMPARE_TWICE 657 680 error1: 658 - rseq_workaround_gcc_asm_size_guess(); 659 681 rseq_bug("cpu_id comparison failed"); 660 682 error2: 661 - rseq_workaround_gcc_asm_size_guess(); 662 683 rseq_bug("expected value comparison failed"); 663 684 #endif 664 685 } ··· 667 698 668 699 RSEQ_INJECT_C(9) 669 700 670 - rseq_workaround_gcc_asm_size_guess(); 671 701 __asm__ __volatile__ goto ( 672 702 RSEQ_ASM_DEFINE_TABLE(9, 1f, 2f, 4f) /* start, commit, abort */ 673 703 RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[cmpfail]) ··· 739 771 "8:\n\t" 740 772 : /* gcc asm goto does not allow outputs */ 741 773 : [cpu_id] "r" (cpu), 742 - [current_cpu_id] "m" (__rseq_abi.cpu_id), 743 - [rseq_cs] "m" (__rseq_abi.rseq_cs), 774 + 
[current_cpu_id] "m" (rseq_get_abi()->cpu_id), 775 + [rseq_cs] "m" (rseq_get_abi()->rseq_cs.arch.ptr), 744 776 /* final store input */ 745 777 [v] "m" (*v), 746 778 [expect] "r" (expect), ··· 760 792 , error1, error2 761 793 #endif 762 794 ); 763 - rseq_workaround_gcc_asm_size_guess(); 764 795 return 0; 765 796 abort: 766 - rseq_workaround_gcc_asm_size_guess(); 767 797 RSEQ_INJECT_FAILED 768 798 return -1; 769 799 cmpfail: 770 - rseq_workaround_gcc_asm_size_guess(); 771 800 return 1; 772 801 #ifdef RSEQ_COMPARE_TWICE 773 802 error1: 774 - rseq_workaround_gcc_asm_size_guess(); 775 803 rseq_bug("cpu_id comparison failed"); 776 804 error2: 777 - rseq_workaround_gcc_asm_size_guess(); 778 805 rseq_bug("expected value comparison failed"); 779 806 #endif 780 807 }
+30
tools/testing/selftests/rseq/rseq-ppc-thread-pointer.h
+/* SPDX-License-Identifier: LGPL-2.1-only OR MIT */
+/*
+ * rseq-ppc-thread-pointer.h
+ *
+ * (C) Copyright 2021 - Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
+ */
+
+#ifndef _RSEQ_PPC_THREAD_POINTER
+#define _RSEQ_PPC_THREAD_POINTER
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+static inline void *rseq_thread_pointer(void)
+{
+#ifdef __powerpc64__
+	register void *__result asm ("r13");
+#else
+	register void *__result asm ("r2");
+#endif
+	asm ("" : "=r" (__result));
+	return __result;
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif
+84 -44
tools/testing/selftests/rseq/rseq-ppc.h
··· 47 47 48 48 #ifdef __PPC64__ 49 49 50 - #define STORE_WORD "std " 51 - #define LOAD_WORD "ld " 52 - #define LOADX_WORD "ldx " 53 - #define CMP_WORD "cmpd " 50 + #define RSEQ_STORE_LONG(arg) "std%U[" __rseq_str(arg) "]%X[" __rseq_str(arg) "] " /* To memory ("m" constraint) */ 51 + #define RSEQ_STORE_INT(arg) "stw%U[" __rseq_str(arg) "]%X[" __rseq_str(arg) "] " /* To memory ("m" constraint) */ 52 + #define RSEQ_LOAD_LONG(arg) "ld%U[" __rseq_str(arg) "]%X[" __rseq_str(arg) "] " /* From memory ("m" constraint) */ 53 + #define RSEQ_LOAD_INT(arg) "lwz%U[" __rseq_str(arg) "]%X[" __rseq_str(arg) "] " /* From memory ("m" constraint) */ 54 + #define RSEQ_LOADX_LONG "ldx " /* From base register ("b" constraint) */ 55 + #define RSEQ_CMP_LONG "cmpd " 56 + #define RSEQ_CMP_LONG_INT "cmpdi " 54 57 55 58 #define __RSEQ_ASM_DEFINE_TABLE(label, version, flags, \ 56 59 start_ip, post_commit_offset, abort_ip) \ ··· 92 89 93 90 #else /* #ifdef __PPC64__ */ 94 91 95 - #define STORE_WORD "stw " 96 - #define LOAD_WORD "lwz " 97 - #define LOADX_WORD "lwzx " 98 - #define CMP_WORD "cmpw " 92 + #define RSEQ_STORE_LONG(arg) "stw%U[" __rseq_str(arg) "]%X[" __rseq_str(arg) "] " /* To memory ("m" constraint) */ 93 + #define RSEQ_STORE_INT(arg) RSEQ_STORE_LONG(arg) /* To memory ("m" constraint) */ 94 + #define RSEQ_LOAD_LONG(arg) "lwz%U[" __rseq_str(arg) "]%X[" __rseq_str(arg) "] " /* From memory ("m" constraint) */ 95 + #define RSEQ_LOAD_INT(arg) RSEQ_LOAD_LONG(arg) /* From memory ("m" constraint) */ 96 + #define RSEQ_LOADX_LONG "lwzx " /* From base register ("b" constraint) */ 97 + #define RSEQ_CMP_LONG "cmpw " 98 + #define RSEQ_CMP_LONG_INT "cmpwi " 99 99 100 100 #define __RSEQ_ASM_DEFINE_TABLE(label, version, flags, \ 101 101 start_ip, post_commit_offset, abort_ip) \ ··· 131 125 RSEQ_INJECT_ASM(1) \ 132 126 "lis %%r17, (" __rseq_str(cs_label) ")@ha\n\t" \ 133 127 "addi %%r17, %%r17, (" __rseq_str(cs_label) ")@l\n\t" \ 134 - "stw %%r17, %[" __rseq_str(rseq_cs) "]\n\t" \ 128 + 
RSEQ_STORE_INT(rseq_cs) "%%r17, %[" __rseq_str(rseq_cs) "]\n\t" \ 135 129 __rseq_str(label) ":\n\t" 136 130 137 131 #endif /* #ifdef __PPC64__ */ ··· 142 136 143 137 #define RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, label) \ 144 138 RSEQ_INJECT_ASM(2) \ 145 - "lwz %%r17, %[" __rseq_str(current_cpu_id) "]\n\t" \ 139 + RSEQ_LOAD_INT(current_cpu_id) "%%r17, %[" __rseq_str(current_cpu_id) "]\n\t" \ 146 140 "cmpw cr7, %[" __rseq_str(cpu_id) "], %%r17\n\t" \ 147 141 "bne- cr7, " __rseq_str(label) "\n\t" 148 142 ··· 159 153 * RSEQ_ASM_OP_* (else): doesn't have hard-code registers(unless cr7) 160 154 */ 161 155 #define RSEQ_ASM_OP_CMPEQ(var, expect, label) \ 162 - LOAD_WORD "%%r17, %[" __rseq_str(var) "]\n\t" \ 163 - CMP_WORD "cr7, %%r17, %[" __rseq_str(expect) "]\n\t" \ 156 + RSEQ_LOAD_LONG(var) "%%r17, %[" __rseq_str(var) "]\n\t" \ 157 + RSEQ_CMP_LONG "cr7, %%r17, %[" __rseq_str(expect) "]\n\t" \ 164 158 "bne- cr7, " __rseq_str(label) "\n\t" 165 159 166 160 #define RSEQ_ASM_OP_CMPNE(var, expectnot, label) \ 167 - LOAD_WORD "%%r17, %[" __rseq_str(var) "]\n\t" \ 168 - CMP_WORD "cr7, %%r17, %[" __rseq_str(expectnot) "]\n\t" \ 161 + RSEQ_LOAD_LONG(var) "%%r17, %[" __rseq_str(var) "]\n\t" \ 162 + RSEQ_CMP_LONG "cr7, %%r17, %[" __rseq_str(expectnot) "]\n\t" \ 169 163 "beq- cr7, " __rseq_str(label) "\n\t" 170 164 171 165 #define RSEQ_ASM_OP_STORE(value, var) \ 172 - STORE_WORD "%[" __rseq_str(value) "], %[" __rseq_str(var) "]\n\t" 166 + RSEQ_STORE_LONG(var) "%[" __rseq_str(value) "], %[" __rseq_str(var) "]\n\t" 173 167 174 168 /* Load @var to r17 */ 175 169 #define RSEQ_ASM_OP_R_LOAD(var) \ 176 - LOAD_WORD "%%r17, %[" __rseq_str(var) "]\n\t" 170 + RSEQ_LOAD_LONG(var) "%%r17, %[" __rseq_str(var) "]\n\t" 177 171 178 172 /* Store r17 to @var */ 179 173 #define RSEQ_ASM_OP_R_STORE(var) \ 180 - STORE_WORD "%%r17, %[" __rseq_str(var) "]\n\t" 174 + RSEQ_STORE_LONG(var) "%%r17, %[" __rseq_str(var) "]\n\t" 181 175 182 176 /* Add @count to r17 */ 183 177 #define 
RSEQ_ASM_OP_R_ADD(count) \ ··· 185 179 186 180 /* Load (r17 + voffp) to r17 */ 187 181 #define RSEQ_ASM_OP_R_LOADX(voffp) \ 188 - LOADX_WORD "%%r17, %[" __rseq_str(voffp) "], %%r17\n\t" 182 + RSEQ_LOADX_LONG "%%r17, %[" __rseq_str(voffp) "], %%r17\n\t" 189 183 190 184 /* TODO: implement a faster memcpy. */ 191 185 #define RSEQ_ASM_OP_R_MEMCPY() \ 192 - "cmpdi %%r19, 0\n\t" \ 186 + RSEQ_CMP_LONG_INT "%%r19, 0\n\t" \ 193 187 "beq 333f\n\t" \ 194 188 "addi %%r20, %%r20, -1\n\t" \ 195 189 "addi %%r21, %%r21, -1\n\t" \ ··· 197 191 "lbzu %%r18, 1(%%r20)\n\t" \ 198 192 "stbu %%r18, 1(%%r21)\n\t" \ 199 193 "addi %%r19, %%r19, -1\n\t" \ 200 - "cmpdi %%r19, 0\n\t" \ 194 + RSEQ_CMP_LONG_INT "%%r19, 0\n\t" \ 201 195 "bne 222b\n\t" \ 202 196 "333:\n\t" \ 203 197 204 198 #define RSEQ_ASM_OP_R_FINAL_STORE(var, post_commit_label) \ 205 - STORE_WORD "%%r17, %[" __rseq_str(var) "]\n\t" \ 199 + RSEQ_STORE_LONG(var) "%%r17, %[" __rseq_str(var) "]\n\t" \ 206 200 __rseq_str(post_commit_label) ":\n\t" 207 201 208 202 #define RSEQ_ASM_OP_FINAL_STORE(value, var, post_commit_label) \ 209 - STORE_WORD "%[" __rseq_str(value) "], %[" __rseq_str(var) "]\n\t" \ 203 + RSEQ_STORE_LONG(var) "%[" __rseq_str(value) "], %[" __rseq_str(var) "]\n\t" \ 210 204 __rseq_str(post_commit_label) ":\n\t" 211 205 212 206 static inline __attribute__((always_inline)) ··· 241 235 RSEQ_ASM_DEFINE_ABORT(4, abort) 242 236 : /* gcc asm goto does not allow outputs */ 243 237 : [cpu_id] "r" (cpu), 244 - [current_cpu_id] "m" (__rseq_abi.cpu_id), 245 - [rseq_cs] "m" (__rseq_abi.rseq_cs), 238 + [current_cpu_id] "m" (rseq_get_abi()->cpu_id), 239 + [rseq_cs] "m" (rseq_get_abi()->rseq_cs.arch.ptr), 246 240 [v] "m" (*v), 247 241 [expect] "r" (expect), 248 242 [newv] "r" (newv) ··· 254 248 , error1, error2 255 249 #endif 256 250 ); 251 + rseq_after_asm_goto(); 257 252 return 0; 258 253 abort: 254 + rseq_after_asm_goto(); 259 255 RSEQ_INJECT_FAILED 260 256 return -1; 261 257 cmpfail: 258 + rseq_after_asm_goto(); 262 259 return 1; 
263 260 #ifdef RSEQ_COMPARE_TWICE 264 261 error1: 262 + rseq_after_asm_goto(); 265 263 rseq_bug("cpu_id comparison failed"); 266 264 error2: 265 + rseq_after_asm_goto(); 267 266 rseq_bug("expected value comparison failed"); 268 267 #endif 269 268 } 270 269 271 270 static inline __attribute__((always_inline)) 272 271 int rseq_cmpnev_storeoffp_load(intptr_t *v, intptr_t expectnot, 273 - off_t voffp, intptr_t *load, int cpu) 272 + long voffp, intptr_t *load, int cpu) 274 273 { 275 274 RSEQ_INJECT_C(9) 276 275 ··· 312 301 RSEQ_ASM_DEFINE_ABORT(4, abort) 313 302 : /* gcc asm goto does not allow outputs */ 314 303 : [cpu_id] "r" (cpu), 315 - [current_cpu_id] "m" (__rseq_abi.cpu_id), 316 - [rseq_cs] "m" (__rseq_abi.rseq_cs), 304 + [current_cpu_id] "m" (rseq_get_abi()->cpu_id), 305 + [rseq_cs] "m" (rseq_get_abi()->rseq_cs.arch.ptr), 317 306 /* final store input */ 318 307 [v] "m" (*v), 319 308 [expectnot] "r" (expectnot), ··· 327 316 , error1, error2 328 317 #endif 329 318 ); 319 + rseq_after_asm_goto(); 330 320 return 0; 331 321 abort: 322 + rseq_after_asm_goto(); 332 323 RSEQ_INJECT_FAILED 333 324 return -1; 334 325 cmpfail: 326 + rseq_after_asm_goto(); 335 327 return 1; 336 328 #ifdef RSEQ_COMPARE_TWICE 337 329 error1: 330 + rseq_after_asm_goto(); 338 331 rseq_bug("cpu_id comparison failed"); 339 332 error2: 333 + rseq_after_asm_goto(); 340 334 rseq_bug("expected value comparison failed"); 341 335 #endif 342 336 } ··· 375 359 RSEQ_ASM_DEFINE_ABORT(4, abort) 376 360 : /* gcc asm goto does not allow outputs */ 377 361 : [cpu_id] "r" (cpu), 378 - [current_cpu_id] "m" (__rseq_abi.cpu_id), 379 - [rseq_cs] "m" (__rseq_abi.rseq_cs), 362 + [current_cpu_id] "m" (rseq_get_abi()->cpu_id), 363 + [rseq_cs] "m" (rseq_get_abi()->rseq_cs.arch.ptr), 380 364 /* final store input */ 381 365 [v] "m" (*v), 382 366 [count] "r" (count) ··· 388 372 , error1 389 373 #endif 390 374 ); 375 + rseq_after_asm_goto(); 391 376 return 0; 392 377 abort: 378 + rseq_after_asm_goto(); 393 379 
RSEQ_INJECT_FAILED 394 380 return -1; 395 381 #ifdef RSEQ_COMPARE_TWICE 396 382 error1: 383 + rseq_after_asm_goto(); 397 384 rseq_bug("cpu_id comparison failed"); 398 385 #endif 399 386 } ··· 438 419 RSEQ_ASM_DEFINE_ABORT(4, abort) 439 420 : /* gcc asm goto does not allow outputs */ 440 421 : [cpu_id] "r" (cpu), 441 - [current_cpu_id] "m" (__rseq_abi.cpu_id), 442 - [rseq_cs] "m" (__rseq_abi.rseq_cs), 422 + [current_cpu_id] "m" (rseq_get_abi()->cpu_id), 423 + [rseq_cs] "m" (rseq_get_abi()->rseq_cs.arch.ptr), 443 424 /* try store input */ 444 425 [v2] "m" (*v2), 445 426 [newv2] "r" (newv2), ··· 455 436 , error1, error2 456 437 #endif 457 438 ); 439 + rseq_after_asm_goto(); 458 440 return 0; 459 441 abort: 442 + rseq_after_asm_goto(); 460 443 RSEQ_INJECT_FAILED 461 444 return -1; 462 445 cmpfail: 446 + rseq_after_asm_goto(); 463 447 return 1; 464 448 #ifdef RSEQ_COMPARE_TWICE 465 449 error1: 450 + rseq_after_asm_goto(); 466 451 rseq_bug("cpu_id comparison failed"); 467 452 error2: 453 + rseq_after_asm_goto(); 468 454 rseq_bug("expected value comparison failed"); 469 455 #endif 470 456 } ··· 513 489 RSEQ_ASM_DEFINE_ABORT(4, abort) 514 490 : /* gcc asm goto does not allow outputs */ 515 491 : [cpu_id] "r" (cpu), 516 - [current_cpu_id] "m" (__rseq_abi.cpu_id), 517 - [rseq_cs] "m" (__rseq_abi.rseq_cs), 492 + [current_cpu_id] "m" (rseq_get_abi()->cpu_id), 493 + [rseq_cs] "m" (rseq_get_abi()->rseq_cs.arch.ptr), 518 494 /* try store input */ 519 495 [v2] "m" (*v2), 520 496 [newv2] "r" (newv2), ··· 530 506 , error1, error2 531 507 #endif 532 508 ); 509 + rseq_after_asm_goto(); 533 510 return 0; 534 511 abort: 512 + rseq_after_asm_goto(); 535 513 RSEQ_INJECT_FAILED 536 514 return -1; 537 515 cmpfail: 516 + rseq_after_asm_goto(); 538 517 return 1; 539 518 #ifdef RSEQ_COMPARE_TWICE 540 519 error1: 520 + rseq_after_asm_goto(); 541 521 rseq_bug("cpu_id comparison failed"); 542 522 error2: 523 + rseq_after_asm_goto(); 543 524 rseq_bug("expected value comparison failed"); 544 525 
#endif 545 526 } ··· 589 560 RSEQ_ASM_DEFINE_ABORT(4, abort) 590 561 : /* gcc asm goto does not allow outputs */ 591 562 : [cpu_id] "r" (cpu), 592 - [current_cpu_id] "m" (__rseq_abi.cpu_id), 593 - [rseq_cs] "m" (__rseq_abi.rseq_cs), 563 + [current_cpu_id] "m" (rseq_get_abi()->cpu_id), 564 + [rseq_cs] "m" (rseq_get_abi()->rseq_cs.arch.ptr), 594 565 /* cmp2 input */ 595 566 [v2] "m" (*v2), 596 567 [expect2] "r" (expect2), ··· 606 577 , error1, error2, error3 607 578 #endif 608 579 ); 580 + rseq_after_asm_goto(); 609 581 return 0; 610 582 abort: 583 + rseq_after_asm_goto(); 611 584 RSEQ_INJECT_FAILED 612 585 return -1; 613 586 cmpfail: 587 + rseq_after_asm_goto(); 614 588 return 1; 615 589 #ifdef RSEQ_COMPARE_TWICE 616 590 error1: 591 + rseq_after_asm_goto(); 617 592 rseq_bug("cpu_id comparison failed"); 618 593 error2: 594 + rseq_after_asm_goto(); 619 595 rseq_bug("1st expected value comparison failed"); 620 596 error3: 597 + rseq_after_asm_goto(); 621 598 rseq_bug("2nd expected value comparison failed"); 622 599 #endif 623 600 } ··· 670 635 RSEQ_ASM_DEFINE_ABORT(4, abort) 671 636 : /* gcc asm goto does not allow outputs */ 672 637 : [cpu_id] "r" (cpu), 673 - [current_cpu_id] "m" (__rseq_abi.cpu_id), 674 - [rseq_cs] "m" (__rseq_abi.rseq_cs), 638 + [current_cpu_id] "m" (rseq_get_abi()->cpu_id), 639 + [rseq_cs] "m" (rseq_get_abi()->rseq_cs.arch.ptr), 675 640 /* final store input */ 676 641 [v] "m" (*v), 677 642 [expect] "r" (expect), ··· 688 653 , error1, error2 689 654 #endif 690 655 ); 656 + rseq_after_asm_goto(); 691 657 return 0; 692 658 abort: 659 + rseq_after_asm_goto(); 693 660 RSEQ_INJECT_FAILED 694 661 return -1; 695 662 cmpfail: 663 + rseq_after_asm_goto(); 696 664 return 1; 697 665 #ifdef RSEQ_COMPARE_TWICE 698 666 error1: 667 + rseq_after_asm_goto(); 699 668 rseq_bug("cpu_id comparison failed"); 700 669 error2: 670 + rseq_after_asm_goto(); 701 671 rseq_bug("expected value comparison failed"); 702 672 #endif 703 673 } ··· 751 711 RSEQ_ASM_DEFINE_ABORT(4, 
abort) 752 712 : /* gcc asm goto does not allow outputs */ 753 713 : [cpu_id] "r" (cpu), 754 - [current_cpu_id] "m" (__rseq_abi.cpu_id), 755 - [rseq_cs] "m" (__rseq_abi.rseq_cs), 714 + [current_cpu_id] "m" (rseq_get_abi()->cpu_id), 715 + [rseq_cs] "m" (rseq_get_abi()->rseq_cs.arch.ptr), 756 716 /* final store input */ 757 717 [v] "m" (*v), 758 718 [expect] "r" (expect), ··· 769 729 , error1, error2 770 730 #endif 771 731 ); 732 + rseq_after_asm_goto(); 772 733 return 0; 773 734 abort: 735 + rseq_after_asm_goto(); 774 736 RSEQ_INJECT_FAILED 775 737 return -1; 776 738 cmpfail: 739 + rseq_after_asm_goto(); 777 740 return 1; 778 741 #ifdef RSEQ_COMPARE_TWICE 779 742 error1: 743 + rseq_after_asm_goto(); 780 744 rseq_bug("cpu_id comparison failed"); 781 745 error2: 746 + rseq_after_asm_goto(); 782 747 rseq_bug("expected value comparison failed"); 783 748 #endif 784 749 } 785 - 786 - #undef STORE_WORD 787 - #undef LOAD_WORD 788 - #undef LOADX_WORD 789 - #undef CMP_WORD 790 750 791 751 #endif /* !RSEQ_SKIP_FASTPATH */
+42 -13
tools/testing/selftests/rseq/rseq-s390.h
@@ -165,8 +165,8 @@
 		RSEQ_ASM_DEFINE_ABORT(4, "", abort)
 		: /* gcc asm goto does not allow outputs */
 		: [cpu_id] "r" (cpu),
-		  [current_cpu_id] "m" (__rseq_abi.cpu_id),
-		  [rseq_cs] "m" (__rseq_abi.rseq_cs),
+		  [current_cpu_id] "m" (rseq_get_abi()->cpu_id),
+		  [rseq_cs] "m" (rseq_get_abi()->rseq_cs.arch.ptr),
 		  [v] "m" (*v),
 		  [expect] "r" (expect),
 		  [newv] "r" (newv)
@@ -178,16 +178,21 @@
 		  , error1, error2
 #endif
 	);
+	rseq_after_asm_goto();
 	return 0;
 abort:
+	rseq_after_asm_goto();
 	RSEQ_INJECT_FAILED
 	return -1;
 cmpfail:
+	rseq_after_asm_goto();
 	return 1;
 #ifdef RSEQ_COMPARE_TWICE
 error1:
+	rseq_after_asm_goto();
 	rseq_bug("cpu_id comparison failed");
 error2:
+	rseq_after_asm_goto();
 	rseq_bug("expected value comparison failed");
 #endif
 }
@@ -198,7 +203,7 @@
  */
 static inline __attribute__((always_inline))
 int rseq_cmpnev_storeoffp_load(intptr_t *v, intptr_t expectnot,
-			       off_t voffp, intptr_t *load, int cpu)
+			       long voffp, intptr_t *load, int cpu)
 {
 	RSEQ_INJECT_C(9)
 
@@ -233,8 +238,8 @@
 		RSEQ_ASM_DEFINE_ABORT(4, "", abort)
 		: /* gcc asm goto does not allow outputs */
 		: [cpu_id] "r" (cpu),
-		  [current_cpu_id] "m" (__rseq_abi.cpu_id),
-		  [rseq_cs] "m" (__rseq_abi.rseq_cs),
+		  [current_cpu_id] "m" (rseq_get_abi()->cpu_id),
+		  [rseq_cs] "m" (rseq_get_abi()->rseq_cs.arch.ptr),
 		  /* final store input */
 		  [v] "m" (*v),
 		  [expectnot] "r" (expectnot),
@@ -248,16 +253,21 @@
 		  , error1, error2
 #endif
 	);
+	rseq_after_asm_goto();
 	return 0;
 abort:
+	rseq_after_asm_goto();
 	RSEQ_INJECT_FAILED
 	return -1;
 cmpfail:
+	rseq_after_asm_goto();
 	return 1;
 #ifdef RSEQ_COMPARE_TWICE
 error1:
+	rseq_after_asm_goto();
 	rseq_bug("cpu_id comparison failed");
 error2:
+	rseq_after_asm_goto();
 	rseq_bug("expected value comparison failed");
 #endif
 }
@@ -288,8 +298,8 @@
 		RSEQ_ASM_DEFINE_ABORT(4, "", abort)
 		: /* gcc asm goto does not allow outputs */
 		: [cpu_id] "r" (cpu),
-		  [current_cpu_id] "m" (__rseq_abi.cpu_id),
-		  [rseq_cs] "m" (__rseq_abi.rseq_cs),
+		  [current_cpu_id] "m" (rseq_get_abi()->cpu_id),
+		  [rseq_cs] "m" (rseq_get_abi()->rseq_cs.arch.ptr),
 		  /* final store input */
 		  [v] "m" (*v),
 		  [count] "r" (count)
@@ -301,12 +311,15 @@
 		  , error1
 #endif
 	);
+	rseq_after_asm_goto();
 	return 0;
 abort:
+	rseq_after_asm_goto();
 	RSEQ_INJECT_FAILED
 	return -1;
 #ifdef RSEQ_COMPARE_TWICE
 error1:
+	rseq_after_asm_goto();
 	rseq_bug("cpu_id comparison failed");
 #endif
 }
@@ -347,8 +360,8 @@
 		RSEQ_ASM_DEFINE_ABORT(4, "", abort)
 		: /* gcc asm goto does not allow outputs */
 		: [cpu_id] "r" (cpu),
-		  [current_cpu_id] "m" (__rseq_abi.cpu_id),
-		  [rseq_cs] "m" (__rseq_abi.rseq_cs),
+		  [current_cpu_id] "m" (rseq_get_abi()->cpu_id),
+		  [rseq_cs] "m" (rseq_get_abi()->rseq_cs.arch.ptr),
 		  /* try store input */
 		  [v2] "m" (*v2),
 		  [newv2] "r" (newv2),
@@ -364,16 +377,21 @@
 		  , error1, error2
 #endif
 	);
+	rseq_after_asm_goto();
 	return 0;
 abort:
+	rseq_after_asm_goto();
 	RSEQ_INJECT_FAILED
 	return -1;
 cmpfail:
+	rseq_after_asm_goto();
 	return 1;
 #ifdef RSEQ_COMPARE_TWICE
 error1:
+	rseq_after_asm_goto();
 	rseq_bug("cpu_id comparison failed");
 error2:
+	rseq_after_asm_goto();
 	rseq_bug("expected value comparison failed");
 #endif
 }
@@ -426,8 +444,8 @@
 		RSEQ_ASM_DEFINE_ABORT(4, "", abort)
 		: /* gcc asm goto does not allow outputs */
 		: [cpu_id] "r" (cpu),
-		  [current_cpu_id] "m" (__rseq_abi.cpu_id),
-		  [rseq_cs] "m" (__rseq_abi.rseq_cs),
+		  [current_cpu_id] "m" (rseq_get_abi()->cpu_id),
+		  [rseq_cs] "m" (rseq_get_abi()->rseq_cs.arch.ptr),
 		  /* cmp2 input */
 		  [v2] "m" (*v2),
 		  [expect2] "r" (expect2),
@@ -443,18 +461,24 @@
 		  , error1, error2, error3
 #endif
 	);
+	rseq_after_asm_goto();
 	return 0;
 abort:
+	rseq_after_asm_goto();
 	RSEQ_INJECT_FAILED
 	return -1;
 cmpfail:
+	rseq_after_asm_goto();
 	return 1;
 #ifdef RSEQ_COMPARE_TWICE
 error1:
+	rseq_after_asm_goto();
 	rseq_bug("cpu_id comparison failed");
 error2:
+	rseq_after_asm_goto();
 	rseq_bug("1st expected value comparison failed");
 error3:
+	rseq_after_asm_goto();
 	rseq_bug("2nd expected value comparison failed");
 #endif
 }
@@ -534,8 +558,8 @@
 #endif
 		: /* gcc asm goto does not allow outputs */
 		: [cpu_id] "r" (cpu),
-		  [current_cpu_id] "m" (__rseq_abi.cpu_id),
-		  [rseq_cs] "m" (__rseq_abi.rseq_cs),
+		  [current_cpu_id] "m" (rseq_get_abi()->cpu_id),
+		  [rseq_cs] "m" (rseq_get_abi()->rseq_cs.arch.ptr),
 		  /* final store input */
 		  [v] "m" (*v),
 		  [expect] "r" (expect),
@@ -555,16 +579,21 @@
 		  , error1, error2
 #endif
 	);
+	rseq_after_asm_goto();
 	return 0;
 abort:
+	rseq_after_asm_goto();
 	RSEQ_INJECT_FAILED
 	return -1;
 cmpfail:
+	rseq_after_asm_goto();
 	return 1;
 #ifdef RSEQ_COMPARE_TWICE
 error1:
+	rseq_after_asm_goto();
 	rseq_bug("cpu_id comparison failed");
 error2:
+	rseq_after_asm_goto();
 	rseq_bug("expected value comparison failed");
 #endif
 }
+1 -1
tools/testing/selftests/rseq/rseq-skip.h
@@ -13,7 +13,7 @@
 
 static inline __attribute__((always_inline))
 int rseq_cmpnev_storeoffp_load(intptr_t *v, intptr_t expectnot,
-			       off_t voffp, intptr_t *load, int cpu)
+			       long voffp, intptr_t *load, int cpu)
 {
 	return -1;
 }
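The hunk above changes the type of the `voffp` byte offset from `off_t` to `long`. A minimal sketch of why a register-sized offset is convenient for pointer arithmetic follows. It assumes a Linux ABI where `long` is as wide as a pointer (checked at compile time), and `demo_ptr_at_offset` and `demo_check_offsets` are hypothetical names used only for illustration; they are not part of the selftests.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Assumption for this sketch: long has the same width as a pointer, so an
 * offset of type long fits in one general-purpose register. off_t carries
 * no such guarantee (it may be 64-bit on a 32-bit build).
 */
_Static_assert(sizeof(long) == sizeof(void *),
	       "sketch assumes long is pointer-sized");

/* Hypothetical helper: apply a signed byte offset to a base pointer. */
static inline intptr_t *demo_ptr_at_offset(intptr_t *base, long off)
{
	/* Unsigned arithmetic wraps, so negative offsets work as expected. */
	return (intptr_t *)((uintptr_t)base + (uintptr_t)off);
}

/* Hypothetical self-check exercising a positive and a negative offset. */
static inline void demo_check_offsets(void)
{
	intptr_t slots[4] = { 10, 20, 30, 40 };

	assert(*demo_ptr_at_offset(slots, 2 * (long)sizeof(intptr_t)) == 30);
	assert(demo_ptr_at_offset(&slots[3], -(long)sizeof(intptr_t)) == &slots[2]);
}
```

Calling `demo_check_offsets()` from a test program exercises both directions; the static assertion stops the build on an ABI where the assumption does not hold.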
+19
tools/testing/selftests/rseq/rseq-thread-pointer.h
@@ -0,0 +1,19 @@
+/* SPDX-License-Identifier: LGPL-2.1-only OR MIT */
+/*
+ * rseq-thread-pointer.h
+ *
+ * (C) Copyright 2021 - Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
+ */
+
+#ifndef _RSEQ_THREAD_POINTER
+#define _RSEQ_THREAD_POINTER
+
+#if defined(__x86_64__) || defined(__i386__)
+#include "rseq-x86-thread-pointer.h"
+#elif defined(__PPC__)
+#include "rseq-ppc-thread-pointer.h"
+#else
+#include "rseq-generic-thread-pointer.h"
+#endif
+
+#endif
+40
tools/testing/selftests/rseq/rseq-x86-thread-pointer.h
@@ -0,0 +1,40 @@
+/* SPDX-License-Identifier: LGPL-2.1-only OR MIT */
+/*
+ * rseq-x86-thread-pointer.h
+ *
+ * (C) Copyright 2021 - Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
+ */
+
+#ifndef _RSEQ_X86_THREAD_POINTER
+#define _RSEQ_X86_THREAD_POINTER
+
+#include <features.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#if __GNUC_PREREQ (11, 1)
+static inline void *rseq_thread_pointer(void)
+{
+	return __builtin_thread_pointer();
+}
+#else
+static inline void *rseq_thread_pointer(void)
+{
+	void *__result;
+
+# ifdef __x86_64__
+	__asm__ ("mov %%fs:0, %0" : "=r" (__result));
+# else
+	__asm__ ("mov %%gs:0, %0" : "=r" (__result));
+# endif
+	return __result;
+}
+#endif /* !GCC 11 */
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif
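The header above returns the thread pointer, and the x86 hunks in this merge address the rseq area as a segment register plus an offset. A rough sketch of that "thread pointer + offset" idea in plain C is shown below. It is an illustration under assumptions, not the selftests' code: `demo_tls_slot` is a hypothetical thread-local variable standing in for the per-thread rseq area, the `demo_*` helpers are invented names, and the x86 branches assume a Linux TLS layout in which the word at `%fs:0` (or `%gs:0` on 32-bit) holds the thread pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-thread rseq area. */
static __thread int demo_tls_slot;

/* Read the thread pointer; mirrors the fallback path of the header above. */
static inline void *demo_thread_pointer(void)
{
#if defined(__x86_64__)
	void *tp;

	__asm__ ("mov %%fs:0, %0" : "=r" (tp));
	return tp;
#elif defined(__i386__)
	void *tp;

	__asm__ ("mov %%gs:0, %0" : "=r" (tp));
	return tp;
#else
	/* Assumes the compiler provides this builtin for the target. */
	return __builtin_thread_pointer();
#endif
}

/* Offset of the thread-local slot relative to the thread pointer. */
static inline ptrdiff_t demo_slot_offset(void)
{
	return (ptrdiff_t)((uintptr_t)&demo_tls_slot -
			   (uintptr_t)demo_thread_pointer());
}

/* Recover the slot's address from the thread pointer and an offset. */
static inline int *demo_slot_via_tp(ptrdiff_t off)
{
	return (int *)((uintptr_t)demo_thread_pointer() + (uintptr_t)off);
}
```

Within one thread, `demo_slot_via_tp(demo_slot_offset())` yields the same address as `&demo_tls_slot`. Because the offset is the same for every thread while the thread pointer differs, a single offset value can be shared, which is what makes segment-relative addressing attractive in the assembly.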
+136 -64
tools/testing/selftests/rseq/rseq-x86.h
··· 28 28 29 29 #ifdef __x86_64__ 30 30 31 + #define RSEQ_ASM_TP_SEGMENT %%fs 32 + 31 33 #define rseq_smp_mb() \ 32 34 __asm__ __volatile__ ("lock; addl $0,-128(%%rsp)" ::: "memory", "cc") 33 35 #define rseq_smp_rmb() rseq_barrier() ··· 125 123 RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error2]) 126 124 #endif 127 125 /* Start rseq by storing table entry pointer into rseq_cs. */ 128 - RSEQ_ASM_STORE_RSEQ_CS(1, 3b, RSEQ_CS_OFFSET(%[rseq_abi])) 129 - RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_CPU_ID_OFFSET(%[rseq_abi]), 4f) 126 + RSEQ_ASM_STORE_RSEQ_CS(1, 3b, RSEQ_ASM_TP_SEGMENT:RSEQ_CS_OFFSET(%[rseq_offset])) 127 + RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_CPU_ID_OFFSET(%[rseq_offset]), 4f) 130 128 RSEQ_INJECT_ASM(3) 131 129 "cmpq %[v], %[expect]\n\t" 132 130 "jnz %l[cmpfail]\n\t" 133 131 RSEQ_INJECT_ASM(4) 134 132 #ifdef RSEQ_COMPARE_TWICE 135 - RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_CPU_ID_OFFSET(%[rseq_abi]), %l[error1]) 133 + RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_CPU_ID_OFFSET(%[rseq_offset]), %l[error1]) 136 134 "cmpq %[v], %[expect]\n\t" 137 135 "jnz %l[error2]\n\t" 138 136 #endif ··· 143 141 RSEQ_ASM_DEFINE_ABORT(4, "", abort) 144 142 : /* gcc asm goto does not allow outputs */ 145 143 : [cpu_id] "r" (cpu), 146 - [rseq_abi] "r" (&__rseq_abi), 144 + [rseq_offset] "r" (rseq_offset), 147 145 [v] "m" (*v), 148 146 [expect] "r" (expect), 149 147 [newv] "r" (newv) ··· 154 152 , error1, error2 155 153 #endif 156 154 ); 155 + rseq_after_asm_goto(); 157 156 return 0; 158 157 abort: 158 + rseq_after_asm_goto(); 159 159 RSEQ_INJECT_FAILED 160 160 return -1; 161 161 cmpfail: 162 + rseq_after_asm_goto(); 162 163 return 1; 163 164 #ifdef RSEQ_COMPARE_TWICE 164 165 error1: 166 + rseq_after_asm_goto(); 165 167 rseq_bug("cpu_id comparison failed"); 166 168 error2: 169 + rseq_after_asm_goto(); 167 170 rseq_bug("expected value comparison failed"); 168 171 #endif 169 172 } ··· 179 172 */ 180 173 static inline __attribute__((always_inline)) 181 174 int 
rseq_cmpnev_storeoffp_load(intptr_t *v, intptr_t expectnot, 182 - off_t voffp, intptr_t *load, int cpu) 175 + long voffp, intptr_t *load, int cpu) 183 176 { 184 177 RSEQ_INJECT_C(9) 185 178 ··· 191 184 RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error2]) 192 185 #endif 193 186 /* Start rseq by storing table entry pointer into rseq_cs. */ 194 - RSEQ_ASM_STORE_RSEQ_CS(1, 3b, RSEQ_CS_OFFSET(%[rseq_abi])) 195 - RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_CPU_ID_OFFSET(%[rseq_abi]), 4f) 187 + RSEQ_ASM_STORE_RSEQ_CS(1, 3b, RSEQ_ASM_TP_SEGMENT:RSEQ_CS_OFFSET(%[rseq_offset])) 188 + RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_CPU_ID_OFFSET(%[rseq_offset]), 4f) 196 189 RSEQ_INJECT_ASM(3) 197 190 "movq %[v], %%rbx\n\t" 198 191 "cmpq %%rbx, %[expectnot]\n\t" 199 192 "je %l[cmpfail]\n\t" 200 193 RSEQ_INJECT_ASM(4) 201 194 #ifdef RSEQ_COMPARE_TWICE 202 - RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_CPU_ID_OFFSET(%[rseq_abi]), %l[error1]) 195 + RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_CPU_ID_OFFSET(%[rseq_offset]), %l[error1]) 203 196 "movq %[v], %%rbx\n\t" 204 197 "cmpq %%rbx, %[expectnot]\n\t" 205 198 "je %l[error2]\n\t" ··· 214 207 RSEQ_ASM_DEFINE_ABORT(4, "", abort) 215 208 : /* gcc asm goto does not allow outputs */ 216 209 : [cpu_id] "r" (cpu), 217 - [rseq_abi] "r" (&__rseq_abi), 210 + [rseq_offset] "r" (rseq_offset), 218 211 /* final store input */ 219 212 [v] "m" (*v), 220 213 [expectnot] "r" (expectnot), ··· 227 220 , error1, error2 228 221 #endif 229 222 ); 223 + rseq_after_asm_goto(); 230 224 return 0; 231 225 abort: 226 + rseq_after_asm_goto(); 232 227 RSEQ_INJECT_FAILED 233 228 return -1; 234 229 cmpfail: 230 + rseq_after_asm_goto(); 235 231 return 1; 236 232 #ifdef RSEQ_COMPARE_TWICE 237 233 error1: 234 + rseq_after_asm_goto(); 238 235 rseq_bug("cpu_id comparison failed"); 239 236 error2: 237 + rseq_after_asm_goto(); 240 238 rseq_bug("expected value comparison failed"); 241 239 #endif 242 240 } ··· 257 245 RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error1]) 258 246 #endif 259 247 /* 
Start rseq by storing table entry pointer into rseq_cs. */ 260 - RSEQ_ASM_STORE_RSEQ_CS(1, 3b, RSEQ_CS_OFFSET(%[rseq_abi])) 261 - RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_CPU_ID_OFFSET(%[rseq_abi]), 4f) 248 + RSEQ_ASM_STORE_RSEQ_CS(1, 3b, RSEQ_ASM_TP_SEGMENT:RSEQ_CS_OFFSET(%[rseq_offset])) 249 + RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_CPU_ID_OFFSET(%[rseq_offset]), 4f) 262 250 RSEQ_INJECT_ASM(3) 263 251 #ifdef RSEQ_COMPARE_TWICE 264 - RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_CPU_ID_OFFSET(%[rseq_abi]), %l[error1]) 252 + RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_CPU_ID_OFFSET(%[rseq_offset]), %l[error1]) 265 253 #endif 266 254 /* final store */ 267 255 "addq %[count], %[v]\n\t" ··· 270 258 RSEQ_ASM_DEFINE_ABORT(4, "", abort) 271 259 : /* gcc asm goto does not allow outputs */ 272 260 : [cpu_id] "r" (cpu), 273 - [rseq_abi] "r" (&__rseq_abi), 261 + [rseq_offset] "r" (rseq_offset), 274 262 /* final store input */ 275 263 [v] "m" (*v), 276 264 [count] "er" (count) ··· 281 269 , error1 282 270 #endif 283 271 ); 272 + rseq_after_asm_goto(); 284 273 return 0; 285 274 abort: 275 + rseq_after_asm_goto(); 286 276 RSEQ_INJECT_FAILED 287 277 return -1; 288 278 #ifdef RSEQ_COMPARE_TWICE 289 279 error1: 280 + rseq_after_asm_goto(); 290 281 rseq_bug("cpu_id comparison failed"); 291 282 #endif 292 283 } ··· 301 286 * *pval += inc; 302 287 */ 303 288 static inline __attribute__((always_inline)) 304 - int rseq_offset_deref_addv(intptr_t *ptr, off_t off, intptr_t inc, int cpu) 289 + int rseq_offset_deref_addv(intptr_t *ptr, long off, intptr_t inc, int cpu) 305 290 { 306 291 RSEQ_INJECT_C(9) 307 292 ··· 311 296 RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error1]) 312 297 #endif 313 298 /* Start rseq by storing table entry pointer into rseq_cs. 
*/ 314 - RSEQ_ASM_STORE_RSEQ_CS(1, 3b, RSEQ_CS_OFFSET(%[rseq_abi])) 315 - RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_CPU_ID_OFFSET(%[rseq_abi]), 4f) 299 + RSEQ_ASM_STORE_RSEQ_CS(1, 3b, RSEQ_ASM_TP_SEGMENT:RSEQ_CS_OFFSET(%[rseq_offset])) 300 + RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_CPU_ID_OFFSET(%[rseq_offset]), 4f) 316 301 RSEQ_INJECT_ASM(3) 317 302 #ifdef RSEQ_COMPARE_TWICE 318 - RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_CPU_ID_OFFSET(%[rseq_abi]), %l[error1]) 303 + RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_CPU_ID_OFFSET(%[rseq_offset]), %l[error1]) 319 304 #endif 320 305 /* get p+v */ 321 306 "movq %[ptr], %%rbx\n\t" ··· 329 314 RSEQ_ASM_DEFINE_ABORT(4, "", abort) 330 315 : /* gcc asm goto does not allow outputs */ 331 316 : [cpu_id] "r" (cpu), 332 - [rseq_abi] "r" (&__rseq_abi), 317 + [rseq_offset] "r" (rseq_offset), 333 318 /* final store input */ 334 319 [ptr] "m" (*ptr), 335 320 [off] "er" (off), ··· 366 351 RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error2]) 367 352 #endif 368 353 /* Start rseq by storing table entry pointer into rseq_cs. 
*/ 369 - RSEQ_ASM_STORE_RSEQ_CS(1, 3b, RSEQ_CS_OFFSET(%[rseq_abi])) 370 - RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_CPU_ID_OFFSET(%[rseq_abi]), 4f) 354 + RSEQ_ASM_STORE_RSEQ_CS(1, 3b, RSEQ_ASM_TP_SEGMENT:RSEQ_CS_OFFSET(%[rseq_offset])) 355 + RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_CPU_ID_OFFSET(%[rseq_offset]), 4f) 371 356 RSEQ_INJECT_ASM(3) 372 357 "cmpq %[v], %[expect]\n\t" 373 358 "jnz %l[cmpfail]\n\t" 374 359 RSEQ_INJECT_ASM(4) 375 360 #ifdef RSEQ_COMPARE_TWICE 376 - RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_CPU_ID_OFFSET(%[rseq_abi]), %l[error1]) 361 + RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_CPU_ID_OFFSET(%[rseq_offset]), %l[error1]) 377 362 "cmpq %[v], %[expect]\n\t" 378 363 "jnz %l[error2]\n\t" 379 364 #endif ··· 387 372 RSEQ_ASM_DEFINE_ABORT(4, "", abort) 388 373 : /* gcc asm goto does not allow outputs */ 389 374 : [cpu_id] "r" (cpu), 390 - [rseq_abi] "r" (&__rseq_abi), 375 + [rseq_offset] "r" (rseq_offset), 391 376 /* try store input */ 392 377 [v2] "m" (*v2), 393 378 [newv2] "r" (newv2), ··· 402 387 , error1, error2 403 388 #endif 404 389 ); 390 + rseq_after_asm_goto(); 405 391 return 0; 406 392 abort: 393 + rseq_after_asm_goto(); 407 394 RSEQ_INJECT_FAILED 408 395 return -1; 409 396 cmpfail: 397 + rseq_after_asm_goto(); 410 398 return 1; 411 399 #ifdef RSEQ_COMPARE_TWICE 412 400 error1: 401 + rseq_after_asm_goto(); 413 402 rseq_bug("cpu_id comparison failed"); 414 403 error2: 404 + rseq_after_asm_goto(); 415 405 rseq_bug("expected value comparison failed"); 416 406 #endif 417 407 } ··· 446 426 RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error3]) 447 427 #endif 448 428 /* Start rseq by storing table entry pointer into rseq_cs. 
*/ 449 - RSEQ_ASM_STORE_RSEQ_CS(1, 3b, RSEQ_CS_OFFSET(%[rseq_abi])) 450 - RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_CPU_ID_OFFSET(%[rseq_abi]), 4f) 429 + RSEQ_ASM_STORE_RSEQ_CS(1, 3b, RSEQ_ASM_TP_SEGMENT:RSEQ_CS_OFFSET(%[rseq_offset])) 430 + RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_CPU_ID_OFFSET(%[rseq_offset]), 4f) 451 431 RSEQ_INJECT_ASM(3) 452 432 "cmpq %[v], %[expect]\n\t" 453 433 "jnz %l[cmpfail]\n\t" ··· 456 436 "jnz %l[cmpfail]\n\t" 457 437 RSEQ_INJECT_ASM(5) 458 438 #ifdef RSEQ_COMPARE_TWICE 459 - RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_CPU_ID_OFFSET(%[rseq_abi]), %l[error1]) 439 + RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_CPU_ID_OFFSET(%[rseq_offset]), %l[error1]) 460 440 "cmpq %[v], %[expect]\n\t" 461 441 "jnz %l[error2]\n\t" 462 442 "cmpq %[v2], %[expect2]\n\t" ··· 469 449 RSEQ_ASM_DEFINE_ABORT(4, "", abort) 470 450 : /* gcc asm goto does not allow outputs */ 471 451 : [cpu_id] "r" (cpu), 472 - [rseq_abi] "r" (&__rseq_abi), 452 + [rseq_offset] "r" (rseq_offset), 473 453 /* cmp2 input */ 474 454 [v2] "m" (*v2), 475 455 [expect2] "r" (expect2), ··· 484 464 , error1, error2, error3 485 465 #endif 486 466 ); 467 + rseq_after_asm_goto(); 487 468 return 0; 488 469 abort: 470 + rseq_after_asm_goto(); 489 471 RSEQ_INJECT_FAILED 490 472 return -1; 491 473 cmpfail: 474 + rseq_after_asm_goto(); 492 475 return 1; 493 476 #ifdef RSEQ_COMPARE_TWICE 494 477 error1: 478 + rseq_after_asm_goto(); 495 479 rseq_bug("cpu_id comparison failed"); 496 480 error2: 481 + rseq_after_asm_goto(); 497 482 rseq_bug("1st expected value comparison failed"); 498 483 error3: 484 + rseq_after_asm_goto(); 499 485 rseq_bug("2nd expected value comparison failed"); 500 486 #endif 501 487 } ··· 526 500 "movq %[dst], %[rseq_scratch1]\n\t" 527 501 "movq %[len], %[rseq_scratch2]\n\t" 528 502 /* Start rseq by storing table entry pointer into rseq_cs. 
*/ 529 - RSEQ_ASM_STORE_RSEQ_CS(1, 3b, RSEQ_CS_OFFSET(%[rseq_abi])) 530 - RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_CPU_ID_OFFSET(%[rseq_abi]), 4f) 503 + RSEQ_ASM_STORE_RSEQ_CS(1, 3b, RSEQ_ASM_TP_SEGMENT:RSEQ_CS_OFFSET(%[rseq_offset])) 504 + RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_CPU_ID_OFFSET(%[rseq_offset]), 4f) 531 505 RSEQ_INJECT_ASM(3) 532 506 "cmpq %[v], %[expect]\n\t" 533 507 "jnz 5f\n\t" 534 508 RSEQ_INJECT_ASM(4) 535 509 #ifdef RSEQ_COMPARE_TWICE 536 - RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_CPU_ID_OFFSET(%[rseq_abi]), 6f) 510 + RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_CPU_ID_OFFSET(%[rseq_offset]), 6f) 537 511 "cmpq %[v], %[expect]\n\t" 538 512 "jnz 7f\n\t" 539 513 #endif ··· 581 555 #endif 582 556 : /* gcc asm goto does not allow outputs */ 583 557 : [cpu_id] "r" (cpu), 584 - [rseq_abi] "r" (&__rseq_abi), 558 + [rseq_offset] "r" (rseq_offset), 585 559 /* final store input */ 586 560 [v] "m" (*v), 587 561 [expect] "r" (expect), ··· 600 574 , error1, error2 601 575 #endif 602 576 ); 577 + rseq_after_asm_goto(); 603 578 return 0; 604 579 abort: 580 + rseq_after_asm_goto(); 605 581 RSEQ_INJECT_FAILED 606 582 return -1; 607 583 cmpfail: 584 + rseq_after_asm_goto(); 608 585 return 1; 609 586 #ifdef RSEQ_COMPARE_TWICE 610 587 error1: 588 + rseq_after_asm_goto(); 611 589 rseq_bug("cpu_id comparison failed"); 612 590 error2: 591 + rseq_after_asm_goto(); 613 592 rseq_bug("expected value comparison failed"); 614 593 #endif 615 594 } ··· 631 600 632 601 #endif /* !RSEQ_SKIP_FASTPATH */ 633 602 634 - #elif __i386__ 603 + #elif defined(__i386__) 604 + 605 + #define RSEQ_ASM_TP_SEGMENT %%gs 635 606 636 607 #define rseq_smp_mb() \ 637 608 __asm__ __volatile__ ("lock; addl $0,-128(%%esp)" ::: "memory", "cc") ··· 734 701 RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error2]) 735 702 #endif 736 703 /* Start rseq by storing table entry pointer into rseq_cs. 
*/ 737 - RSEQ_ASM_STORE_RSEQ_CS(1, 3b, RSEQ_CS_OFFSET(%[rseq_abi])) 738 - RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_CPU_ID_OFFSET(%[rseq_abi]), 4f) 704 + RSEQ_ASM_STORE_RSEQ_CS(1, 3b, RSEQ_ASM_TP_SEGMENT:RSEQ_CS_OFFSET(%[rseq_offset])) 705 + RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_CPU_ID_OFFSET(%[rseq_offset]), 4f) 739 706 RSEQ_INJECT_ASM(3) 740 707 "cmpl %[v], %[expect]\n\t" 741 708 "jnz %l[cmpfail]\n\t" 742 709 RSEQ_INJECT_ASM(4) 743 710 #ifdef RSEQ_COMPARE_TWICE 744 - RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_CPU_ID_OFFSET(%[rseq_abi]), %l[error1]) 711 + RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_CPU_ID_OFFSET(%[rseq_offset]), %l[error1]) 745 712 "cmpl %[v], %[expect]\n\t" 746 713 "jnz %l[error2]\n\t" 747 714 #endif ··· 752 719 RSEQ_ASM_DEFINE_ABORT(4, "", abort) 753 720 : /* gcc asm goto does not allow outputs */ 754 721 : [cpu_id] "r" (cpu), 755 - [rseq_abi] "r" (&__rseq_abi), 722 + [rseq_offset] "r" (rseq_offset), 756 723 [v] "m" (*v), 757 724 [expect] "r" (expect), 758 725 [newv] "r" (newv) ··· 763 730 , error1, error2 764 731 #endif 765 732 ); 733 + rseq_after_asm_goto(); 766 734 return 0; 767 735 abort: 736 + rseq_after_asm_goto(); 768 737 RSEQ_INJECT_FAILED 769 738 return -1; 770 739 cmpfail: 740 + rseq_after_asm_goto(); 771 741 return 1; 772 742 #ifdef RSEQ_COMPARE_TWICE 773 743 error1: 744 + rseq_after_asm_goto(); 774 745 rseq_bug("cpu_id comparison failed"); 775 746 error2: 747 + rseq_after_asm_goto(); 776 748 rseq_bug("expected value comparison failed"); 777 749 #endif 778 750 } ··· 788 750 */ 789 751 static inline __attribute__((always_inline)) 790 752 int rseq_cmpnev_storeoffp_load(intptr_t *v, intptr_t expectnot, 791 - off_t voffp, intptr_t *load, int cpu) 753 + long voffp, intptr_t *load, int cpu) 792 754 { 793 755 RSEQ_INJECT_C(9) 794 756 ··· 800 762 RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error2]) 801 763 #endif 802 764 /* Start rseq by storing table entry pointer into rseq_cs. 
*/ 803 - RSEQ_ASM_STORE_RSEQ_CS(1, 3b, RSEQ_CS_OFFSET(%[rseq_abi])) 804 - RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_CPU_ID_OFFSET(%[rseq_abi]), 4f) 765 + RSEQ_ASM_STORE_RSEQ_CS(1, 3b, RSEQ_ASM_TP_SEGMENT:RSEQ_CS_OFFSET(%[rseq_offset])) 766 + RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_CPU_ID_OFFSET(%[rseq_offset]), 4f) 805 767 RSEQ_INJECT_ASM(3) 806 768 "movl %[v], %%ebx\n\t" 807 769 "cmpl %%ebx, %[expectnot]\n\t" 808 770 "je %l[cmpfail]\n\t" 809 771 RSEQ_INJECT_ASM(4) 810 772 #ifdef RSEQ_COMPARE_TWICE 811 - RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_CPU_ID_OFFSET(%[rseq_abi]), %l[error1]) 773 + RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_CPU_ID_OFFSET(%[rseq_offset]), %l[error1]) 812 774 "movl %[v], %%ebx\n\t" 813 775 "cmpl %%ebx, %[expectnot]\n\t" 814 776 "je %l[error2]\n\t" ··· 823 785 RSEQ_ASM_DEFINE_ABORT(4, "", abort) 824 786 : /* gcc asm goto does not allow outputs */ 825 787 : [cpu_id] "r" (cpu), 826 - [rseq_abi] "r" (&__rseq_abi), 788 + [rseq_offset] "r" (rseq_offset), 827 789 /* final store input */ 828 790 [v] "m" (*v), 829 791 [expectnot] "r" (expectnot), ··· 836 798 , error1, error2 837 799 #endif 838 800 ); 801 + rseq_after_asm_goto(); 839 802 return 0; 840 803 abort: 804 + rseq_after_asm_goto(); 841 805 RSEQ_INJECT_FAILED 842 806 return -1; 843 807 cmpfail: 808 + rseq_after_asm_goto(); 844 809 return 1; 845 810 #ifdef RSEQ_COMPARE_TWICE 846 811 error1: 812 + rseq_after_asm_goto(); 847 813 rseq_bug("cpu_id comparison failed"); 848 814 error2: 815 + rseq_after_asm_goto(); 849 816 rseq_bug("expected value comparison failed"); 850 817 #endif 851 818 } ··· 866 823 RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error1]) 867 824 #endif 868 825 /* Start rseq by storing table entry pointer into rseq_cs. 
*/
-		RSEQ_ASM_STORE_RSEQ_CS(1, 3b, RSEQ_CS_OFFSET(%[rseq_abi]))
-		RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_CPU_ID_OFFSET(%[rseq_abi]), 4f)
+		RSEQ_ASM_STORE_RSEQ_CS(1, 3b, RSEQ_ASM_TP_SEGMENT:RSEQ_CS_OFFSET(%[rseq_offset]))
+		RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_CPU_ID_OFFSET(%[rseq_offset]), 4f)
 		RSEQ_INJECT_ASM(3)
 #ifdef RSEQ_COMPARE_TWICE
-		RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_CPU_ID_OFFSET(%[rseq_abi]), %l[error1])
+		RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_CPU_ID_OFFSET(%[rseq_offset]), %l[error1])
 #endif
 		/* final store */
 		"addl %[count], %[v]\n\t"
···
 		RSEQ_ASM_DEFINE_ABORT(4, "", abort)
 		: /* gcc asm goto does not allow outputs */
 		: [cpu_id] "r" (cpu),
-		  [rseq_abi] "r" (&__rseq_abi),
+		  [rseq_offset] "r" (rseq_offset),
 		  /* final store input */
 		  [v] "m" (*v),
 		  [count] "ir" (count)
···
 		  , error1
 #endif
 	);
+	rseq_after_asm_goto();
 	return 0;
 abort:
+	rseq_after_asm_goto();
 	RSEQ_INJECT_FAILED
 	return -1;
 #ifdef RSEQ_COMPARE_TWICE
 error1:
+	rseq_after_asm_goto();
 	rseq_bug("cpu_id comparison failed");
 #endif
 }
···
 		RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error2])
 #endif
 		/* Start rseq by storing table entry pointer into rseq_cs. */
-		RSEQ_ASM_STORE_RSEQ_CS(1, 3b, RSEQ_CS_OFFSET(%[rseq_abi]))
-		RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_CPU_ID_OFFSET(%[rseq_abi]), 4f)
+		RSEQ_ASM_STORE_RSEQ_CS(1, 3b, RSEQ_ASM_TP_SEGMENT:RSEQ_CS_OFFSET(%[rseq_offset]))
+		RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_CPU_ID_OFFSET(%[rseq_offset]), 4f)
 		RSEQ_INJECT_ASM(3)
 		"cmpl %[v], %[expect]\n\t"
 		"jnz %l[cmpfail]\n\t"
 		RSEQ_INJECT_ASM(4)
 #ifdef RSEQ_COMPARE_TWICE
-		RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_CPU_ID_OFFSET(%[rseq_abi]), %l[error1])
+		RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_CPU_ID_OFFSET(%[rseq_offset]), %l[error1])
 		"cmpl %[v], %[expect]\n\t"
 		"jnz %l[error2]\n\t"
 #endif
···
 		RSEQ_ASM_DEFINE_ABORT(4, "", abort)
 		: /* gcc asm goto does not allow outputs */
 		: [cpu_id] "r" (cpu),
-		  [rseq_abi] "r" (&__rseq_abi),
+		  [rseq_offset] "r" (rseq_offset),
 		  /* try store input */
 		  [v2] "m" (*v2),
 		  [newv2] "m" (newv2),
···
 		  , error1, error2
 #endif
 	);
+	rseq_after_asm_goto();
 	return 0;
 abort:
+	rseq_after_asm_goto();
 	RSEQ_INJECT_FAILED
 	return -1;
 cmpfail:
+	rseq_after_asm_goto();
 	return 1;
 #ifdef RSEQ_COMPARE_TWICE
 error1:
+	rseq_after_asm_goto();
 	rseq_bug("cpu_id comparison failed");
 error2:
+	rseq_after_asm_goto();
 	rseq_bug("expected value comparison failed");
 #endif
 }
···
 		RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error2])
 #endif
 		/* Start rseq by storing table entry pointer into rseq_cs. */
-		RSEQ_ASM_STORE_RSEQ_CS(1, 3b, RSEQ_CS_OFFSET(%[rseq_abi]))
-		RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_CPU_ID_OFFSET(%[rseq_abi]), 4f)
+		RSEQ_ASM_STORE_RSEQ_CS(1, 3b, RSEQ_ASM_TP_SEGMENT:RSEQ_CS_OFFSET(%[rseq_offset]))
+		RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_CPU_ID_OFFSET(%[rseq_offset]), 4f)
 		RSEQ_INJECT_ASM(3)
 		"movl %[expect], %%eax\n\t"
 		"cmpl %[v], %%eax\n\t"
 		"jnz %l[cmpfail]\n\t"
 		RSEQ_INJECT_ASM(4)
 #ifdef RSEQ_COMPARE_TWICE
-		RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_CPU_ID_OFFSET(%[rseq_abi]), %l[error1])
+		RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_CPU_ID_OFFSET(%[rseq_offset]), %l[error1])
 		"movl %[expect], %%eax\n\t"
 		"cmpl %[v], %%eax\n\t"
 		"jnz %l[error2]\n\t"
···
 		RSEQ_ASM_DEFINE_ABORT(4, "", abort)
 		: /* gcc asm goto does not allow outputs */
 		: [cpu_id] "r" (cpu),
-		  [rseq_abi] "r" (&__rseq_abi),
+		  [rseq_offset] "r" (rseq_offset),
 		  /* try store input */
 		  [v2] "m" (*v2),
 		  [newv2] "r" (newv2),
···
 		  , error1, error2
 #endif
 	);
+	rseq_after_asm_goto();
 	return 0;
 abort:
+	rseq_after_asm_goto();
 	RSEQ_INJECT_FAILED
 	return -1;
 cmpfail:
+	rseq_after_asm_goto();
 	return 1;
 #ifdef RSEQ_COMPARE_TWICE
 error1:
+	rseq_after_asm_goto();
 	rseq_bug("cpu_id comparison failed");
 error2:
+	rseq_after_asm_goto();
 	rseq_bug("expected value comparison failed");
 #endif

···
 		RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error3])
 #endif
 		/* Start rseq by storing table entry pointer into rseq_cs. */
-		RSEQ_ASM_STORE_RSEQ_CS(1, 3b, RSEQ_CS_OFFSET(%[rseq_abi]))
-		RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_CPU_ID_OFFSET(%[rseq_abi]), 4f)
+		RSEQ_ASM_STORE_RSEQ_CS(1, 3b, RSEQ_ASM_TP_SEGMENT:RSEQ_CS_OFFSET(%[rseq_offset]))
+		RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_CPU_ID_OFFSET(%[rseq_offset]), 4f)
 		RSEQ_INJECT_ASM(3)
 		"cmpl %[v], %[expect]\n\t"
 		"jnz %l[cmpfail]\n\t"
···
 		"jnz %l[cmpfail]\n\t"
 		RSEQ_INJECT_ASM(5)
 #ifdef RSEQ_COMPARE_TWICE
-		RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_CPU_ID_OFFSET(%[rseq_abi]), %l[error1])
+		RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_CPU_ID_OFFSET(%[rseq_offset]), %l[error1])
 		"cmpl %[v], %[expect]\n\t"
 		"jnz %l[error2]\n\t"
 		"cmpl %[expect2], %[v2]\n\t"
···
 		RSEQ_ASM_DEFINE_ABORT(4, "", abort)
 		: /* gcc asm goto does not allow outputs */
 		: [cpu_id] "r" (cpu),
-		  [rseq_abi] "r" (&__rseq_abi),
+		  [rseq_offset] "r" (rseq_offset),
 		  /* cmp2 input */
 		  [v2] "m" (*v2),
 		  [expect2] "r" (expect2),
···
 		  , error1, error2, error3
 #endif
 	);
+	rseq_after_asm_goto();
 	return 0;
 abort:
+	rseq_after_asm_goto();
 	RSEQ_INJECT_FAILED
 	return -1;
 cmpfail:
+	rseq_after_asm_goto();
 	return 1;
 #ifdef RSEQ_COMPARE_TWICE
 error1:
+	rseq_after_asm_goto();
 	rseq_bug("cpu_id comparison failed");
 error2:
+	rseq_after_asm_goto();
 	rseq_bug("1st expected value comparison failed");
 error3:
+	rseq_after_asm_goto();
 	rseq_bug("2nd expected value comparison failed");
 #endif
 }
···
 		"movl %[dst], %[rseq_scratch1]\n\t"
 		"movl %[len], %[rseq_scratch2]\n\t"
 		/* Start rseq by storing table entry pointer into rseq_cs. */
-		RSEQ_ASM_STORE_RSEQ_CS(1, 3b, RSEQ_CS_OFFSET(%[rseq_abi]))
-		RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_CPU_ID_OFFSET(%[rseq_abi]), 4f)
+		RSEQ_ASM_STORE_RSEQ_CS(1, 3b, RSEQ_ASM_TP_SEGMENT:RSEQ_CS_OFFSET(%[rseq_offset]))
+		RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_CPU_ID_OFFSET(%[rseq_offset]), 4f)
 		RSEQ_INJECT_ASM(3)
 		"movl %[expect], %%eax\n\t"
 		"cmpl %%eax, %[v]\n\t"
 		"jnz 5f\n\t"
 		RSEQ_INJECT_ASM(4)
 #ifdef RSEQ_COMPARE_TWICE
-		RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_CPU_ID_OFFSET(%[rseq_abi]), 6f)
+		RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_CPU_ID_OFFSET(%[rseq_offset]), 6f)
 		"movl %[expect], %%eax\n\t"
 		"cmpl %%eax, %[v]\n\t"
 		"jnz 7f\n\t"
···
 #endif
 		: /* gcc asm goto does not allow outputs */
 		: [cpu_id] "r" (cpu),
-		  [rseq_abi] "r" (&__rseq_abi),
+		  [rseq_offset] "r" (rseq_offset),
 		  /* final store input */
 		  [v] "m" (*v),
 		  [expect] "m" (expect),
···
 		  , error1, error2
 #endif
 	);
+	rseq_after_asm_goto();
 	return 0;
 abort:
+	rseq_after_asm_goto();
 	RSEQ_INJECT_FAILED
 	return -1;
 cmpfail:
+	rseq_after_asm_goto();
 	return 1;
 #ifdef RSEQ_COMPARE_TWICE
 error1:
+	rseq_after_asm_goto();
 	rseq_bug("cpu_id comparison failed");
 error2:
+	rseq_after_asm_goto();
 	rseq_bug("expected value comparison failed");
 #endif
 }
···
 		"movl %[dst], %[rseq_scratch1]\n\t"
 		"movl %[len], %[rseq_scratch2]\n\t"
 		/* Start rseq by storing table entry pointer into rseq_cs. */
-		RSEQ_ASM_STORE_RSEQ_CS(1, 3b, RSEQ_CS_OFFSET(%[rseq_abi]))
-		RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_CPU_ID_OFFSET(%[rseq_abi]), 4f)
+		RSEQ_ASM_STORE_RSEQ_CS(1, 3b, RSEQ_ASM_TP_SEGMENT:RSEQ_CS_OFFSET(%[rseq_offset]))
+		RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_CPU_ID_OFFSET(%[rseq_offset]), 4f)
 		RSEQ_INJECT_ASM(3)
 		"movl %[expect], %%eax\n\t"
 		"cmpl %%eax, %[v]\n\t"
 		"jnz 5f\n\t"
 		RSEQ_INJECT_ASM(4)
 #ifdef RSEQ_COMPARE_TWICE
-		RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_CPU_ID_OFFSET(%[rseq_abi]), 6f)
+		RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_CPU_ID_OFFSET(%[rseq_offset]), 6f)
 		"movl %[expect], %%eax\n\t"
 		"cmpl %%eax, %[v]\n\t"
 		"jnz 7f\n\t"
···
 #endif
 		: /* gcc asm goto does not allow outputs */
 		: [cpu_id] "r" (cpu),
-		  [rseq_abi] "r" (&__rseq_abi),
+		  [rseq_offset] "r" (rseq_offset),
 		  /* final store input */
 		  [v] "m" (*v),
 		  [expect] "m" (expect),
···
 		  , error1, error2
 #endif
 	);
+	rseq_after_asm_goto();
 	return 0;
 abort:
+	rseq_after_asm_goto();
 	RSEQ_INJECT_FAILED
 	return -1;
 cmpfail:
+	rseq_after_asm_goto();
 	return 1;
 #ifdef RSEQ_COMPARE_TWICE
 error1:
+	rseq_after_asm_goto();
 	rseq_bug("cpu_id comparison failed");
 error2:
+	rseq_after_asm_goto();
 	rseq_bug("expected value comparison failed");
 #endif
 }
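The hunks above replace a base-register operand (`&__rseq_abi`) with an offset that the assembly applies relative to the thread-pointer segment. The following is only a rough sketch of that addressing style, not the selftest code: it assumes Linux with GCC or Clang, and it assumes `RSEQ_ASM_TP_SEGMENT` stands for the TLS segment register (`%fs` on x86-64, `%gs` on 32-bit x86). The names `demo_cpu_id`, `demo_tp`, `demo_cpu_id_offset`, `demo_set_cpu_id` and `demo_load_tp_relative` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-thread field standing in for a member of the rseq area. */
static __thread int32_t demo_cpu_id
	__attribute__((tls_model("initial-exec"))) = 7;

/* Read the thread pointer; on x86 Linux it is stored at offset 0 of the TLS segment. */
uintptr_t demo_tp(void)
{
#if defined(__x86_64__)
	uintptr_t tp;
	__asm__ ("movq %%fs:0, %0" : "=r" (tp));
	return tp;
#elif defined(__i386__)
	uintptr_t tp;
	__asm__ ("movl %%gs:0, %0" : "=r" (tp));
	return tp;
#else
	return (uintptr_t) __builtin_thread_pointer();
#endif
}

/* Offset of the demo field from the thread pointer (may be negative). */
ptrdiff_t demo_cpu_id_offset(void)
{
	return (ptrdiff_t) ((uintptr_t) &demo_cpu_id - demo_tp());
}

void demo_set_cpu_id(int32_t value)
{
	demo_cpu_id = value;
}

/* Load the field through a segment-relative address, given only the offset. */
int32_t demo_load_tp_relative(ptrdiff_t offset)
{
#if defined(__x86_64__)
	int32_t value;
	__asm__ __volatile__ ("movl %%fs:(%[off]), %[val]"
			      : [val] "=r" (value)
			      : [off] "r" (offset)
			      : "memory");
	return value;
#elif defined(__i386__)
	int32_t value;
	__asm__ __volatile__ ("movl %%gs:(%[off]), %[val]"
			      : [val] "=r" (value)
			      : [off] "r" (offset)
			      : "memory");
	return value;
#else
	/* Portable fallback: plain pointer arithmetic on the thread pointer. */
	return *(volatile int32_t *) (demo_tp() + (uintptr_t) offset);
#endif
}
```

Because the offset is the same for every thread when the variable uses the initial-exec TLS model, a single global offset is enough for all threads, which appears to be what makes the `rseq_offset` operand workable in the diff.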
+82 -88
tools/testing/selftests/rseq/rseq.c
···
 #include <assert.h>
 #include <signal.h>
 #include <limits.h>
+#include <dlfcn.h>
+#include <stddef.h>

 #include "../kselftest.h"
 #include "rseq.h"

-__thread volatile struct rseq __rseq_abi = {
-	.cpu_id = RSEQ_CPU_ID_UNINITIALIZED,
-};
+static const ptrdiff_t *libc_rseq_offset_p;
+static const unsigned int *libc_rseq_size_p;
+static const unsigned int *libc_rseq_flags_p;

-/*
- * Shared with other libraries. This library may take rseq ownership if it is
- * still 0 when executing the library constructor. Set to 1 by library
- * constructor when handling rseq. Set to 0 in destructor if handling rseq.
- */
-int __rseq_handled;
+/* Offset from the thread pointer to the rseq area. */
+ptrdiff_t rseq_offset;

-/* Whether this library have ownership of rseq registration. */
+/* Size of the registered rseq area. 0 if the registration was
+   unsuccessful. */
+unsigned int rseq_size = -1U;
+
+/* Flags used during rseq registration. */
+unsigned int rseq_flags;
+
 static int rseq_ownership;

-static __thread volatile uint32_t __rseq_refcount;
+static
+__thread struct rseq_abi __rseq_abi __attribute__((tls_model("initial-exec"))) = {
+	.cpu_id = RSEQ_ABI_CPU_ID_UNINITIALIZED,
+};

-static void signal_off_save(sigset_t *oldset)
-{
-	sigset_t set;
-	int ret;
-
-	sigfillset(&set);
-	ret = pthread_sigmask(SIG_BLOCK, &set, oldset);
-	if (ret)
-		abort();
-}
-
-static void signal_restore(sigset_t oldset)
-{
-	int ret;
-
-	ret = pthread_sigmask(SIG_SETMASK, &oldset, NULL);
-	if (ret)
-		abort();
-}
-
-static int sys_rseq(volatile struct rseq *rseq_abi, uint32_t rseq_len,
+static int sys_rseq(struct rseq_abi *rseq_abi, uint32_t rseq_len,
 		    int flags, uint32_t sig)
 {
 	return syscall(__NR_rseq, rseq_abi, rseq_len, flags, sig);
 }

+int rseq_available(void)
+{
+	int rc;
+
+	rc = sys_rseq(NULL, 0, 0, 0);
+	if (rc != -1)
+		abort();
+	switch (errno) {
+	case ENOSYS:
+		return 0;
+	case EINVAL:
+		return 1;
+	default:
+		abort();
+	}
+}
+
 int rseq_register_current_thread(void)
 {
-	int rc, ret = 0;
-	sigset_t oldset;
+	int rc;

-	if (!rseq_ownership)
+	if (!rseq_ownership) {
+		/* Treat libc's ownership as a successful registration. */
 		return 0;
-	signal_off_save(&oldset);
-	if (__rseq_refcount == UINT_MAX) {
-		ret = -1;
-		goto end;
 	}
-	if (__rseq_refcount++)
-		goto end;
-	rc = sys_rseq(&__rseq_abi, sizeof(struct rseq), 0, RSEQ_SIG);
-	if (!rc) {
-		assert(rseq_current_cpu_raw() >= 0);
-		goto end;
-	}
-	if (errno != EBUSY)
-		__rseq_abi.cpu_id = RSEQ_CPU_ID_REGISTRATION_FAILED;
-	ret = -1;
-	__rseq_refcount--;
-end:
-	signal_restore(oldset);
-	return ret;
+	rc = sys_rseq(&__rseq_abi, sizeof(struct rseq_abi), 0, RSEQ_SIG);
+	if (rc)
+		return -1;
+	assert(rseq_current_cpu_raw() >= 0);
+	return 0;
 }

 int rseq_unregister_current_thread(void)
 {
-	int rc, ret = 0;
-	sigset_t oldset;
+	int rc;

-	if (!rseq_ownership)
+	if (!rseq_ownership) {
+		/* Treat libc's ownership as a successful unregistration. */
 		return 0;
-	signal_off_save(&oldset);
-	if (!__rseq_refcount) {
-		ret = -1;
-		goto end;
 	}
-	if (--__rseq_refcount)
-		goto end;
-	rc = sys_rseq(&__rseq_abi, sizeof(struct rseq),
-		      RSEQ_FLAG_UNREGISTER, RSEQ_SIG);
-	if (!rc)
-		goto end;
-	__rseq_refcount = 1;
-	ret = -1;
-end:
-	signal_restore(oldset);
-	return ret;
+	rc = sys_rseq(&__rseq_abi, sizeof(struct rseq_abi), RSEQ_ABI_FLAG_UNREGISTER, RSEQ_SIG);
+	if (rc)
+		return -1;
+	return 0;
+}
+
+static __attribute__((constructor))
+void rseq_init(void)
+{
+	libc_rseq_offset_p = dlsym(RTLD_NEXT, "__rseq_offset");
+	libc_rseq_size_p = dlsym(RTLD_NEXT, "__rseq_size");
+	libc_rseq_flags_p = dlsym(RTLD_NEXT, "__rseq_flags");
+	if (libc_rseq_size_p && libc_rseq_offset_p && libc_rseq_flags_p) {
+		/* rseq registration owned by glibc */
+		rseq_offset = *libc_rseq_offset_p;
+		rseq_size = *libc_rseq_size_p;
+		rseq_flags = *libc_rseq_flags_p;
+		return;
+	}
+	if (!rseq_available())
+		return;
+	rseq_ownership = 1;
+	rseq_offset = (void *)&__rseq_abi - rseq_thread_pointer();
+	rseq_size = sizeof(struct rseq_abi);
+	rseq_flags = 0;
+}
+
+static __attribute__((destructor))
+void rseq_exit(void)
+{
+	if (!rseq_ownership)
+		return;
+	rseq_offset = 0;
+	rseq_size = -1U;
+	rseq_ownership = 0;
 }

 int32_t rseq_fallback_current_cpu(void)
···
 		abort();
 	}
 	return cpu;
-}
-
-void __attribute__((constructor)) rseq_init(void)
-{
-	/* Check whether rseq is handled by another library. */
-	if (__rseq_handled)
-		return;
-	__rseq_handled = 1;
-	rseq_ownership = 1;
-}
-
-void __attribute__((destructor)) rseq_fini(void)
-{
-	if (!rseq_ownership)
-		return;
-	__rseq_handled = 0;
-	rseq_ownership = 0;
 }
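The new constructor in this diff decides who owns rseq registration by first looking for the `__rseq_offset`, `__rseq_size` and `__rseq_flags` symbols in libc, then falling back to probing the syscall. A minimal standalone sketch of those two probes is given here, assuming Linux with glibc. `probe_rseq_syscall`, `probe_libc_rseq` and `struct libc_rseq_info` are hypothetical names, and unlike `rseq_available()` in the diff the sketch returns -1 on an unexpected result instead of calling `abort()`.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <dlfcn.h>
#include <errno.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical result type for the libc symbol lookup. */
struct libc_rseq_info {
	int owned_by_libc;	/* 1 if all three libc symbols were found */
	ptrdiff_t offset;	/* copy of __rseq_offset, or 0 */
	unsigned int size;	/* copy of __rseq_size, or 0 */
	unsigned int flags;	/* copy of __rseq_flags, or 0 */
};

/*
 * Probe kernel support with deliberately invalid arguments:
 * ENOSYS means no rseq syscall, EINVAL means the syscall exists.
 * Returns 1 (available), 0 (not available) or -1 (unexpected result).
 */
int probe_rseq_syscall(void)
{
#ifdef __NR_rseq
	long rc = syscall(__NR_rseq, NULL, 0, 0, 0);

	if (rc != -1)
		return -1;
	switch (errno) {
	case ENOSYS:
		return 0;
	case EINVAL:
		return 1;
	default:
		return -1;
	}
#else
	return 0;	/* headers too old to know the syscall number */
#endif
}

/* Look up the rseq symbols that a libc owning the registration would export. */
struct libc_rseq_info probe_libc_rseq(void)
{
	struct libc_rseq_info info = { 0, 0, 0, 0 };
	const ptrdiff_t *offset_p = dlsym(RTLD_NEXT, "__rseq_offset");
	const unsigned int *size_p = dlsym(RTLD_NEXT, "__rseq_size");
	const unsigned int *flags_p = dlsym(RTLD_NEXT, "__rseq_flags");

	if (offset_p && size_p && flags_p) {
		info.owned_by_libc = 1;
		info.offset = *offset_p;
		info.size = *size_p;
		info.flags = *flags_p;
	}
	return info;
}
```

A caller would only register its own rseq area when `probe_libc_rseq().owned_by_libc` is 0 and `probe_rseq_syscall()` returns 1. On glibc versions that still ship `dlsym` in a separate libdl, linking needs `-ldl`.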
+20 -10
tools/testing/selftests/rseq/rseq.h
···
 #include <errno.h>
 #include <stdio.h>
 #include <stdlib.h>
-#include <linux/rseq.h>
+#include <stddef.h>
+#include "rseq-abi.h"
+#include "compiler.h"

 /*
  * Empty code injection macros, override when testing.
···
 #define RSEQ_INJECT_FAILED
 #endif

-extern __thread volatile struct rseq __rseq_abi;
-extern int __rseq_handled;
+#include "rseq-thread-pointer.h"
+
+/* Offset from the thread pointer to the rseq area. */
+extern ptrdiff_t rseq_offset;
+/* Size of the registered rseq area. 0 if the registration was
+   unsuccessful. */
+extern unsigned int rseq_size;
+/* Flags used during rseq registration. */
+extern unsigned int rseq_flags;
+
+static inline struct rseq_abi *rseq_get_abi(void)
+{
+	return (struct rseq_abi *) ((uintptr_t) rseq_thread_pointer() + rseq_offset);
+}

 #define rseq_likely(x)		__builtin_expect(!!(x), 1)
 #define rseq_unlikely(x)	__builtin_expect(!!(x), 0)
···
  */
 static inline int32_t rseq_current_cpu_raw(void)
 {
-	return RSEQ_ACCESS_ONCE(__rseq_abi.cpu_id);
+	return RSEQ_ACCESS_ONCE(rseq_get_abi()->cpu_id);
 }

 /*
···
  */
 static inline uint32_t rseq_cpu_start(void)
 {
-	return RSEQ_ACCESS_ONCE(__rseq_abi.cpu_id_start);
+	return RSEQ_ACCESS_ONCE(rseq_get_abi()->cpu_id_start);
 }

 static inline uint32_t rseq_current_cpu(void)
···

 static inline void rseq_clear_rseq_cs(void)
 {
-#ifdef __LP64__
-	__rseq_abi.rseq_cs.ptr = 0;
-#else
-	__rseq_abi.rseq_cs.ptr.ptr32 = 0;
-#endif
+	RSEQ_WRITE_ONCE(rseq_get_abi()->rseq_cs.arch.ptr, 0);
 }

 /*
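`rseq_get_abi()` in this header finds the per-thread rseq area by adding a process-wide offset to the current thread pointer. A small sketch of that round trip is shown here, assuming Linux with GCC or Clang. `struct demo_area`, `demo_thread_pointer`, `demo_area_offset`, `demo_get_area` and `demo_area_direct` are hypothetical names, and `demo_thread_pointer` is a stand-in for `rseq_thread_pointer()`, whose real definition is not part of this diff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct rseq_abi. */
struct demo_area {
	int32_t cpu_id;
};

/* Initial-exec TLS keeps the variable at a fixed offset from the thread pointer. */
static __thread struct demo_area demo_area
	__attribute__((tls_model("initial-exec"))) = { .cpu_id = -1 };

/* Stand-in for rseq_thread_pointer(): returns the current thread pointer. */
void *demo_thread_pointer(void)
{
#if defined(__x86_64__)
	void *tp;
	__asm__ ("movq %%fs:0, %0" : "=r" (tp));
	return tp;
#elif defined(__i386__)
	void *tp;
	__asm__ ("movl %%gs:0, %0" : "=r" (tp));
	return tp;
#else
	return __builtin_thread_pointer();
#endif
}

/* Computed once, as the constructor in rseq.c does for rseq_offset. */
ptrdiff_t demo_area_offset(void)
{
	return (ptrdiff_t) ((uintptr_t) &demo_area -
			    (uintptr_t) demo_thread_pointer());
}

/* Same shape as rseq_get_abi(): thread pointer plus offset. */
struct demo_area *demo_get_area(ptrdiff_t offset)
{
	return (struct demo_area *) ((uintptr_t) demo_thread_pointer() + offset);
}

/* Direct TLS access, used only to check the arithmetic above. */
struct demo_area *demo_area_direct(void)
{
	return &demo_area;
}
```

The point of the indirection is that the same accessor works whether the area is a TLS variable in the test library or an area that glibc registered and advertised through `__rseq_offset`.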