Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge tag 'rcu.release.v6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/rcu/linux

Pull RCU updates from Neeraj Upadhyay:
"Context tracking:
- rename context tracking state related symbols and remove references
to "dynticks" in various context tracking state variables and
related helpers
- force context_tracking_enabled_this_cpu() to be inlined to avoid
leaving a noinstr section

CSD lock:
- enhance CSD-lock diagnostic reports
- add an API to provide an indication of ongoing CSD-lock stall

nocb:
- update and simplify RCU nocb code to handle (de-)offloading of
callbacks only for offline CPUs
- fix RT throttling hrtimer being armed from offline CPU

rcutorture:
- remove redundant rcu_torture_ops get_gp_completed fields
- add SRCU ->same_gp_state and ->get_comp_state functions
- add generic test for NUM_ACTIVE_*RCU_POLL* for testing RCU and SRCU
polled grace periods
- add CFcommon.arch for arch-specific Kconfig options
- print number of update types in rcu_torture_write_types()
- add rcutree.nohz_full_patience_delay testing to the TREE07 scenario
- add a stall_cpu_repeat module parameter to test repeated CPU stalls
- add argument to limit number of CPUs a guest OS can use in
torture.sh

rcustall:
- abbreviate RCU CPU stall warnings during CSD-lock stalls
- allow dump_cpu_task() to be called without disabling preemption
- defer printing stall-warning backtrace when holding rcu_node lock

srcu:
- make SRCU gp seq wrap-around faster
- add KCSAN checks for concurrent updates to ->srcu_n_exp_nodelay and
->reschedule_count which are used in heuristics governing
auto-expediting of normal SRCU grace periods and
grace-period-state-machine delays
- mark idle SRCU-barrier callbacks to help identify stuck
SRCU-barrier callbacks

rcu tasks:
- remove RCU Tasks Rude asynchronous APIs as they are no longer used
- stop testing RCU Tasks Rude asynchronous APIs
- fix access to non-existent percpu regions
- check processor-ID assumptions during chosen CPU calculation for
callback enqueuing
- update description of rtp->tasks_gp_seq grace-period sequence
number
- add rcu_barrier_cb_is_done() to identify whether a given
rcu_barrier callback is stuck
- mark idle Tasks-RCU-barrier callbacks
- add *torture_stats_print() functions to print detailed diagnostics
for Tasks-RCU variants
- capture start time of rcu_barrier_tasks*() operation to help
distinguish a hung barrier operation from a long series of barrier
operations

refscale:
- add a TINY scenario to support tests of Tiny RCU and Tiny
SRCU
- optimize process_durations() operation

rcuscale:
- dump stacks of stalled rcu_scale_writer() instances and
grace-period statistics when rcu_scale_writer() stalls
- mark idle RCU-barrier callbacks to identify stuck RCU-barrier
callbacks
- print detailed grace-period and barrier diagnostics on
rcu_scale_writer() hangs for Tasks-RCU variants
- warn if async module parameter is specified for RCU implementations
that do not have async primitives such as RCU Tasks Rude
- make all writer tasks report upon hang
- tolerate repeated GFP_KERNEL failure in rcu_scale_writer()
- use special allocator for rcu_scale_writer()
- NULL out top-level pointers to heap memory to avoid double-free
bugs on modprobe failures
- maintain per-task instead of per-CPU callbacks count to avoid any
issues with migration of either tasks or callbacks
- constify struct ref_scale_ops

Fixes:
- use system_unbound_wq for kfree_rcu work to avoid disturbing
isolated CPUs

Misc:
- warn on unexpected rcu_state.srs_done_tail state
- better define "atomic" for list_replace_rcu() and
hlist_replace_rcu() routines
- annotate struct kvfree_rcu_bulk_data with __counted_by()"

* tag 'rcu.release.v6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/rcu/linux: (90 commits)
rcu: Defer printing stall-warning backtrace when holding rcu_node lock
rcu/nocb: Remove superfluous memory barrier after bypass enqueue
rcu/nocb: Conditionally wake up rcuo if not already waiting on GP
rcu/nocb: Fix RT throttling hrtimer armed from offline CPU
rcu/nocb: Simplify (de-)offloading state machine
context_tracking: Tag context_tracking_enabled_this_cpu() __always_inline
context_tracking, rcu: Rename rcu_dyntick trace event into rcu_watching
rcu: Update stray documentation references to rcu_dynticks_eqs_{enter, exit}()
rcu: Rename rcu_momentary_dyntick_idle() into rcu_momentary_eqs()
rcu: Rename rcu_implicit_dynticks_qs() into rcu_watching_snap_recheck()
rcu: Rename dyntick_save_progress_counter() into rcu_watching_snap_save()
rcu: Rename struct rcu_data .exp_dynticks_snap into .exp_watching_snap
rcu: Rename struct rcu_data .dynticks_snap into .watching_snap
rcu: Rename rcu_dynticks_zero_in_eqs() into rcu_watching_zero_in_eqs()
rcu: Rename rcu_dynticks_in_eqs_since() into rcu_watching_snap_stopped_since()
rcu: Rename rcu_dynticks_in_eqs() into rcu_watching_snap_in_eqs()
rcu: Rename rcu_dynticks_eqs_online() into rcu_watching_online()
context_tracking, rcu: Rename rcu_dynticks_curr_cpu_in_eqs() into rcu_is_watching_curr_cpu()
context_tracking, rcu: Rename rcu_dynticks_task*() into rcu_task*()
refscale: Constify struct ref_scale_ops
...

+1097 -795
+14 -14
Documentation/RCU/Design/Data-Structures/Data-Structures.rst
··· 921 921 922 922 :: 923 923 924 - 1 int dynticks_snap; 924 + 1 int watching_snap; 925 925 2 unsigned long dynticks_fqs; 926 926 927 - The ``->dynticks_snap`` field is used to take a snapshot of the 927 + The ``->watching_snap`` field is used to take a snapshot of the 928 928 corresponding CPU's dyntick-idle state when forcing quiescent states, 929 929 and is therefore accessed from other CPUs. Finally, the 930 930 ``->dynticks_fqs`` field is used to count the number of times this CPU ··· 935 935 936 936 :: 937 937 938 - 1 long dynticks_nesting; 939 - 2 long dynticks_nmi_nesting; 938 + 1 long nesting; 939 + 2 long nmi_nesting; 940 940 3 atomic_t dynticks; 941 941 4 bool rcu_need_heavy_qs; 942 942 5 bool rcu_urgent_qs; ··· 945 945 state for the corresponding CPU. The fields may be accessed only from 946 946 the corresponding CPU (and from tracing) unless otherwise stated. 947 947 948 - The ``->dynticks_nesting`` field counts the nesting depth of process 948 + The ``->nesting`` field counts the nesting depth of process 949 949 execution, so that in normal circumstances this counter has value zero 950 950 or one. NMIs, irqs, and tracers are counted by the 951 - ``->dynticks_nmi_nesting`` field. Because NMIs cannot be masked, changes 951 + ``->nmi_nesting`` field. Because NMIs cannot be masked, changes 952 952 to this variable have to be undertaken carefully using an algorithm 953 953 provided by Andy Lutomirski. The initial transition from idle adds one, 954 954 and nested transitions add two, so that a nesting level of five is 955 - represented by a ``->dynticks_nmi_nesting`` value of nine. This counter 955 + represented by a ``->nmi_nesting`` value of nine. This counter 956 956 can therefore be thought of as counting the number of reasons why this 957 957 CPU cannot be permitted to enter dyntick-idle mode, aside from 958 958 process-level transitions. 
··· 960 960 However, it turns out that when running in non-idle kernel context, the 961 961 Linux kernel is fully capable of entering interrupt handlers that never 962 962 exit and perhaps also vice versa. Therefore, whenever the 963 - ``->dynticks_nesting`` field is incremented up from zero, the 964 - ``->dynticks_nmi_nesting`` field is set to a large positive number, and 965 - whenever the ``->dynticks_nesting`` field is decremented down to zero, 966 - the ``->dynticks_nmi_nesting`` field is set to zero. Assuming that 963 + ``->nesting`` field is incremented up from zero, the 964 + ``->nmi_nesting`` field is set to a large positive number, and 965 + whenever the ``->nesting`` field is decremented down to zero, 966 + the ``->nmi_nesting`` field is set to zero. Assuming that 967 967 the number of misnested interrupts is not sufficient to overflow the 968 - counter, this approach corrects the ``->dynticks_nmi_nesting`` field 968 + counter, this approach corrects the ``->nmi_nesting`` field 969 969 every time the corresponding CPU enters the idle loop from process 970 970 context. 971 971 ··· 992 992 +-----------------------------------------------------------------------+ 993 993 | **Quick Quiz**: | 994 994 +-----------------------------------------------------------------------+ 995 - | Why not simply combine the ``->dynticks_nesting`` and | 996 - | ``->dynticks_nmi_nesting`` counters into a single counter that just | 995 + | Why not simply combine the ``->nesting`` and | 996 + | ``->nmi_nesting`` counters into a single counter that just | 997 997 | counts the number of reasons that the corresponding CPU is non-idle? | 998 998 +-----------------------------------------------------------------------+ 999 999 | **Answer**: |
+4 -4
Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst
··· 147 147 idle sojourn. 148 148 This case is handled by calls to the strongly ordered 149 149 ``atomic_add_return()`` read-modify-write atomic operation that 150 - is invoked within ``rcu_dynticks_eqs_enter()`` at idle-entry 151 - time and within ``rcu_dynticks_eqs_exit()`` at idle-exit time. 152 - The grace-period kthread invokes first ``ct_dynticks_cpu_acquire()`` 153 - (preceded by a full memory barrier) and ``rcu_dynticks_in_eqs_since()`` 150 + is invoked within ``ct_kernel_exit_state()`` at idle-entry 151 + time and within ``ct_kernel_enter_state()`` at idle-exit time. 152 + The grace-period kthread invokes first ``ct_rcu_watching_cpu_acquire()`` 153 + (preceded by a full memory barrier) and ``rcu_watching_snap_stopped_since()`` 154 154 (both of which rely on acquire semantics) to detect idle CPUs. 155 155 156 156 +-----------------------------------------------------------------------+
+4 -4
Documentation/RCU/Design/Memory-Ordering/TreeRCU-dyntick.svg
··· 528 528 font-style="normal" 529 529 y="-8652.5312" 530 530 x="2466.7822" 531 - xml:space="preserve">dyntick_save_progress_counter()</text> 531 + xml:space="preserve">rcu_watching_snap_save()</text> 532 532 <text 533 533 style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier" 534 534 id="text202-7-2-7-2-0" ··· 537 537 font-style="normal" 538 538 y="-8368.1475" 539 539 x="2463.3262" 540 - xml:space="preserve">rcu_implicit_dynticks_qs()</text> 540 + xml:space="preserve">rcu_watching_snap_recheck()</text> 541 541 </g> 542 542 <g 543 543 id="g4504" ··· 607 607 font-weight="bold" 608 608 font-size="192" 609 609 id="text202-7-5-3-27-6" 610 - style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">rcu_dynticks_eqs_enter()</text> 610 + style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">ct_kernel_exit_state()</text> 611 611 <text 612 612 xml:space="preserve" 613 613 x="3745.7725" ··· 638 638 font-weight="bold" 639 639 font-size="192" 640 640 id="text202-7-5-3-27-6-1" 641 - style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">rcu_dynticks_eqs_exit()</text> 641 + style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">ct_kernel_enter_state()</text> 642 642 <text 643 643 xml:space="preserve" 644 644 x="3745.7725"
+4 -4
Documentation/RCU/Design/Memory-Ordering/TreeRCU-gp-fqs.svg
··· 844 844 font-style="normal" 845 845 y="1547.8876" 846 846 x="4417.6396" 847 - xml:space="preserve">dyntick_save_progress_counter()</text> 847 + xml:space="preserve">rcu_watching_snap_save()</text> 848 848 <g 849 849 style="fill:none;stroke-width:0.025in" 850 850 transform="translate(6501.9719,-10685.904)" ··· 899 899 font-style="normal" 900 900 y="1858.8729" 901 901 x="4414.1836" 902 - xml:space="preserve">rcu_implicit_dynticks_qs()</text> 902 + xml:space="preserve">rcu_watching_snap_recheck()</text> 903 903 <text 904 904 xml:space="preserve" 905 905 x="14659.87" ··· 977 977 font-weight="bold" 978 978 font-size="192" 979 979 id="text202-7-5-3-27-6" 980 - style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">rcu_dynticks_eqs_enter()</text> 980 + style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">ct_kernel_exit_state()</text> 981 981 <text 982 982 xml:space="preserve" 983 983 x="3745.7725" ··· 1008 1008 font-weight="bold" 1009 1009 font-size="192" 1010 1010 id="text202-7-5-3-27-6-1" 1011 - style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">rcu_dynticks_eqs_exit()</text> 1011 + style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">ct_kernel_enter_state()</text> 1012 1012 <text 1013 1013 xml:space="preserve" 1014 1014 x="3745.7725"
+4 -4
Documentation/RCU/Design/Memory-Ordering/TreeRCU-gp.svg
··· 2974 2974 font-style="normal" 2975 2975 y="38114.047" 2976 2976 x="-334.33856" 2977 - xml:space="preserve">dyntick_save_progress_counter()</text> 2977 + xml:space="preserve">rcu_watching_snap_save()</text> 2978 2978 <g 2979 2979 style="fill:none;stroke-width:0.025in" 2980 2980 transform="translate(1749.9916,25880.249)" ··· 3029 3029 font-style="normal" 3030 3030 y="38425.035" 3031 3031 x="-337.79462" 3032 - xml:space="preserve">rcu_implicit_dynticks_qs()</text> 3032 + xml:space="preserve">rcu_watching_snap_recheck()</text> 3033 3033 <text 3034 3034 xml:space="preserve" 3035 3035 x="9907.8887" ··· 3107 3107 font-weight="bold" 3108 3108 font-size="192" 3109 3109 id="text202-7-5-3-27-6" 3110 - style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">rcu_dynticks_eqs_enter()</text> 3110 + style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">ct_kernel_exit_state()</text> 3111 3111 <text 3112 3112 xml:space="preserve" 3113 3113 x="3745.7725" ··· 3138 3138 font-weight="bold" 3139 3139 font-size="192" 3140 3140 id="text202-7-5-3-27-6-1" 3141 - style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">rcu_dynticks_eqs_exit()</text> 3141 + style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier">ct_kernel_enter_state()</text> 3142 3142 <text 3143 3143 xml:space="preserve" 3144 3144 x="3745.7725"
+2 -2
Documentation/RCU/Design/Memory-Ordering/TreeRCU-hotplug.svg
··· 516 516 font-style="normal" 517 517 y="-8652.5312" 518 518 x="2466.7822" 519 - xml:space="preserve">dyntick_save_progress_counter()</text> 519 + xml:space="preserve">rcu_watching_snap_save()</text> 520 520 <text 521 521 style="font-size:192px;font-style:normal;font-weight:bold;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier" 522 522 id="text202-7-2-7-2-0" ··· 525 525 font-style="normal" 526 526 y="-8368.1475" 527 527 x="2463.3262" 528 - xml:space="preserve">rcu_implicit_dynticks_qs()</text> 528 + xml:space="preserve">rcu_watching_snap_recheck()</text> 529 529 <text 530 530 sodipodi:linespacing="125%" 531 531 style="font-size:192px;font-style:normal;font-weight:bold;line-height:125%;text-anchor:start;fill:#000000;stroke-width:0.025in;font-family:Courier"
+1 -2
Documentation/RCU/Design/Requirements/Requirements.rst
··· 2649 2649 be removed from the kernel. 2650 2650 2651 2651 The tasks-rude-RCU API is also reader-marking-free and thus quite compact, 2652 - consisting of call_rcu_tasks_rude(), synchronize_rcu_tasks_rude(), 2653 - and rcu_barrier_tasks_rude(). 2652 + consisting solely of synchronize_rcu_tasks_rude(). 2654 2653 2655 2654 Tasks Trace RCU 2656 2655 ~~~~~~~~~~~~~~~
+28 -33
Documentation/RCU/checklist.rst
··· 194 194 when publicizing a pointer to a structure that can 195 195 be traversed by an RCU read-side critical section. 196 196 197 - 5. If any of call_rcu(), call_srcu(), call_rcu_tasks(), 198 - call_rcu_tasks_rude(), or call_rcu_tasks_trace() is used, 199 - the callback function may be invoked from softirq context, 200 - and in any case with bottom halves disabled. In particular, 201 - this callback function cannot block. If you need the callback 202 - to block, run that code in a workqueue handler scheduled from 203 - the callback. The queue_rcu_work() function does this for you 204 - in the case of call_rcu(). 197 + 5. If any of call_rcu(), call_srcu(), call_rcu_tasks(), or 198 + call_rcu_tasks_trace() is used, the callback function may be 199 + invoked from softirq context, and in any case with bottom halves 200 + disabled. In particular, this callback function cannot block. 201 + If you need the callback to block, run that code in a workqueue 202 + handler scheduled from the callback. The queue_rcu_work() 203 + function does this for you in the case of call_rcu(). 205 204 206 205 6. Since synchronize_rcu() can block, it cannot be called 207 206 from any sort of irq context. The same rule applies ··· 253 254 corresponding readers must use rcu_read_lock_trace() 254 255 and rcu_read_unlock_trace(). 255 256 256 - c. If an updater uses call_rcu_tasks_rude() or 257 - synchronize_rcu_tasks_rude(), then the corresponding 258 - readers must use anything that disables preemption, 259 - for example, preempt_disable() and preempt_enable(). 257 + c. If an updater uses synchronize_rcu_tasks_rude(), 258 + then the corresponding readers must use anything that 259 + disables preemption, for example, preempt_disable() 260 + and preempt_enable(). 260 261 261 262 Mixing things up will result in confusion and broken kernels, and 262 263 has even resulted in an exploitable security issue. Therefore, ··· 325 326 d. 
Periodically invoke rcu_barrier(), permitting a limited 326 327 number of updates per grace period. 327 328 328 - The same cautions apply to call_srcu(), call_rcu_tasks(), 329 - call_rcu_tasks_rude(), and call_rcu_tasks_trace(). This is 330 - why there is an srcu_barrier(), rcu_barrier_tasks(), 331 - rcu_barrier_tasks_rude(), and rcu_barrier_tasks_rude(), 332 - respectively. 329 + The same cautions apply to call_srcu(), call_rcu_tasks(), and 330 + call_rcu_tasks_trace(). This is why there is an srcu_barrier(), 331 + rcu_barrier_tasks(), and rcu_barrier_tasks_trace(), respectively. 333 332 334 333 Note that although these primitives do take action to avoid 335 334 memory exhaustion when any given CPU has too many callbacks, ··· 380 383 must use whatever locking or other synchronization is required 381 384 to safely access and/or modify that data structure. 382 385 383 - Do not assume that RCU callbacks will be executed on 384 - the same CPU that executed the corresponding call_rcu(), 385 - call_srcu(), call_rcu_tasks(), call_rcu_tasks_rude(), or 386 - call_rcu_tasks_trace(). For example, if a given CPU goes offline 387 - while having an RCU callback pending, then that RCU callback 388 - will execute on some surviving CPU. (If this was not the case, 389 - a self-spawning RCU callback would prevent the victim CPU from 390 - ever going offline.) Furthermore, CPUs designated by rcu_nocbs= 391 - might well *always* have their RCU callbacks executed on some 392 - other CPUs, in fact, for some real-time workloads, this is the 393 - whole point of using the rcu_nocbs= kernel boot parameter. 386 + Do not assume that RCU callbacks will be executed on the same 387 + CPU that executed the corresponding call_rcu(), call_srcu(), 388 + call_rcu_tasks(), or call_rcu_tasks_trace(). For example, if 389 + a given CPU goes offline while having an RCU callback pending, 390 + then that RCU callback will execute on some surviving CPU. 
391 + (If this was not the case, a self-spawning RCU callback would 392 + prevent the victim CPU from ever going offline.) Furthermore, 393 + CPUs designated by rcu_nocbs= might well *always* have their 394 + RCU callbacks executed on some other CPUs, in fact, for some 395 + real-time workloads, this is the whole point of using the 396 + rcu_nocbs= kernel boot parameter. 394 397 395 398 In addition, do not assume that callbacks queued in a given order 396 399 will be invoked in that order, even if they all are queued on the ··· 504 507 These debugging aids can help you find problems that are 505 508 otherwise extremely difficult to spot. 506 509 507 - 17. If you pass a callback function defined within a module to one of 508 - call_rcu(), call_srcu(), call_rcu_tasks(), call_rcu_tasks_rude(), 509 - or call_rcu_tasks_trace(), then it is necessary to wait for all 510 + 17. If you pass a callback function defined within a module 511 + to one of call_rcu(), call_srcu(), call_rcu_tasks(), or 512 + call_rcu_tasks_trace(), then it is necessary to wait for all 510 513 pending callbacks to be invoked before unloading that module. 511 514 Note that it is absolutely *not* sufficient to wait for a grace 512 515 period! For example, synchronize_rcu() implementation is *not* ··· 519 522 - call_rcu() -> rcu_barrier() 520 523 - call_srcu() -> srcu_barrier() 521 524 - call_rcu_tasks() -> rcu_barrier_tasks() 522 - - call_rcu_tasks_rude() -> rcu_barrier_tasks_rude() 523 525 - call_rcu_tasks_trace() -> rcu_barrier_tasks_trace() 524 526 525 527 However, these barrier functions are absolutely *not* guaranteed ··· 535 539 - Either synchronize_srcu() or synchronize_srcu_expedited(), 536 540 together with and srcu_barrier() 537 541 - synchronize_rcu_tasks() and rcu_barrier_tasks() 538 - - synchronize_tasks_rude() and rcu_barrier_tasks_rude() 539 542 - synchronize_tasks_trace() and rcu_barrier_tasks_trace() 540 543 541 544 If necessary, you can use something like workqueues to execute
+1 -1
Documentation/RCU/whatisRCU.rst
··· 1103 1103 1104 1104 Critical sections Grace period Barrier 1105 1105 1106 - N/A call_rcu_tasks_rude rcu_barrier_tasks_rude 1106 + N/A N/A 1107 1107 synchronize_rcu_tasks_rude 1108 1108 1109 1109
+11 -9
Documentation/admin-guide/kernel-parameters.txt
··· 4969 4969 Set maximum number of finished RCU callbacks to 4970 4970 process in one batch. 4971 4971 4972 + rcutree.csd_lock_suppress_rcu_stall= [KNL] 4973 + Do only a one-line RCU CPU stall warning when 4974 + there is an ongoing too-long CSD-lock wait. 4975 + 4972 4976 rcutree.do_rcu_barrier= [KNL] 4973 4977 Request a call to rcu_barrier(). This is 4974 4978 throttled so that userspace tests can safely ··· 5420 5416 Time to wait (s) after boot before inducing stall. 5421 5417 5422 5418 rcutorture.stall_cpu_irqsoff= [KNL] 5423 - Disable interrupts while stalling if set. 5419 + Disable interrupts while stalling if set, but only 5420 + on the first stall in the set. 5421 + 5422 + rcutorture.stall_cpu_repeat= [KNL] 5423 + Number of times to repeat the stall sequence, 5424 + so that rcutorture.stall_cpu_repeat=3 will result 5425 + in four stall sequences. 5424 5426 5425 5427 rcutorture.stall_gp_kthread= [KNL] 5426 5428 Duration (s) of forced sleep within RCU ··· 5613 5603 A negative value will take the default. A value 5614 5604 of zero will disable batching. Batching is 5615 5605 always disabled for synchronize_rcu_tasks(). 5616 - 5617 - rcupdate.rcu_tasks_rude_lazy_ms= [KNL] 5618 - Set timeout in milliseconds RCU Tasks 5619 - Rude asynchronous callback batching for 5620 - call_rcu_tasks_rude(). A negative value 5621 - will take the default. A value of zero will 5622 - disable batching. Batching is always disabled 5623 - for synchronize_rcu_tasks_rude(). 5624 5606 5625 5607 rcupdate.rcu_tasks_trace_lazy_ms= [KNL] 5626 5608 Set timeout in milliseconds RCU Tasks
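For reference, the parameters documented in the hunk above are supplied on the kernel command line or as module parameters; the values below are examples only:

```
# Kernel command line (illustrative values):
rcutree.csd_lock_suppress_rcu_stall=1 rcutorture.stall_cpu_repeat=3

# Or when loading the rcutorture module directly; per the text above,
# stall_cpu_repeat=3 yields four stall sequences in total:
modprobe rcutorture stall_cpu=10 stall_cpu_repeat=3
```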
+1 -1
arch/Kconfig
··· 862 862 Architecture neither relies on exception_enter()/exception_exit() 863 863 nor on schedule_user(). Also preempt_schedule_notrace() and 864 864 preempt_schedule_irq() can't be called in a preemptible section 865 - while context tracking is CONTEXT_USER. This feature reflects a sane 865 + while context tracking is CT_STATE_USER. This feature reflects a sane 866 866 entry implementation where the following requirements are met on 867 867 critical entry code, ie: before user_exit() or after user_enter(): 868 868
+1 -1
arch/arm64/kernel/entry-common.c
··· 103 103 static __always_inline void __enter_from_user_mode(void) 104 104 { 105 105 lockdep_hardirqs_off(CALLER_ADDR0); 106 - CT_WARN_ON(ct_state() != CONTEXT_USER); 106 + CT_WARN_ON(ct_state() != CT_STATE_USER); 107 107 user_exit_irqoff(); 108 108 trace_hardirqs_off_finish(); 109 109 mte_disable_tco_entry(current);
+3 -3
arch/powerpc/include/asm/interrupt.h
··· 177 177 178 178 if (user_mode(regs)) { 179 179 kuap_lock(); 180 - CT_WARN_ON(ct_state() != CONTEXT_USER); 180 + CT_WARN_ON(ct_state() != CT_STATE_USER); 181 181 user_exit_irqoff(); 182 182 183 183 account_cpu_user_entry(); ··· 189 189 * so avoid recursion. 190 190 */ 191 191 if (TRAP(regs) != INTERRUPT_PROGRAM) 192 - CT_WARN_ON(ct_state() != CONTEXT_KERNEL && 193 - ct_state() != CONTEXT_IDLE); 192 + CT_WARN_ON(ct_state() != CT_STATE_KERNEL && 193 + ct_state() != CT_STATE_IDLE); 194 194 INT_SOFT_MASK_BUG_ON(regs, is_implicit_soft_masked(regs)); 195 195 INT_SOFT_MASK_BUG_ON(regs, arch_irq_disabled_regs(regs) && 196 196 search_kernel_restart_table(regs->nip));
+3 -3
arch/powerpc/kernel/interrupt.c
··· 266 266 unsigned long ret = 0; 267 267 bool is_not_scv = !IS_ENABLED(CONFIG_PPC_BOOK3S_64) || !scv; 268 268 269 - CT_WARN_ON(ct_state() == CONTEXT_USER); 269 + CT_WARN_ON(ct_state() == CT_STATE_USER); 270 270 271 271 kuap_assert_locked(); 272 272 ··· 344 344 345 345 BUG_ON(regs_is_unrecoverable(regs)); 346 346 BUG_ON(arch_irq_disabled_regs(regs)); 347 - CT_WARN_ON(ct_state() == CONTEXT_USER); 347 + CT_WARN_ON(ct_state() == CT_STATE_USER); 348 348 349 349 /* 350 350 * We don't need to restore AMR on the way back to userspace for KUAP. ··· 386 386 if (!IS_ENABLED(CONFIG_PPC_BOOK3E_64) && 387 387 TRAP(regs) != INTERRUPT_PROGRAM && 388 388 TRAP(regs) != INTERRUPT_PERFMON) 389 - CT_WARN_ON(ct_state() == CONTEXT_USER); 389 + CT_WARN_ON(ct_state() == CT_STATE_USER); 390 390 391 391 kuap = kuap_get_and_assert_locked(); 392 392
+1 -1
arch/powerpc/kernel/syscall.c
··· 27 27 28 28 trace_hardirqs_off(); /* finish reconciling */ 29 29 30 - CT_WARN_ON(ct_state() == CONTEXT_KERNEL); 30 + CT_WARN_ON(ct_state() == CT_STATE_KERNEL); 31 31 user_exit_irqoff(); 32 32 33 33 BUG_ON(regs_is_unrecoverable(regs));
+1 -1
arch/x86/entry/common.c
··· 150 150 #endif 151 151 152 152 /* 153 - * Invoke a 32-bit syscall. Called with IRQs on in CONTEXT_KERNEL. 153 + * Invoke a 32-bit syscall. Called with IRQs on in CT_STATE_KERNEL. 154 154 */ 155 155 static __always_inline void do_syscall_32_irqs_on(struct pt_regs *regs, int nr) 156 156 {
+18 -14
include/linux/context_tracking.h
··· 26 26 static inline void user_enter(void) 27 27 { 28 28 if (context_tracking_enabled()) 29 - ct_user_enter(CONTEXT_USER); 29 + ct_user_enter(CT_STATE_USER); 30 30 31 31 } 32 32 static inline void user_exit(void) 33 33 { 34 34 if (context_tracking_enabled()) 35 - ct_user_exit(CONTEXT_USER); 35 + ct_user_exit(CT_STATE_USER); 36 36 } 37 37 38 38 /* Called with interrupts disabled. */ 39 39 static __always_inline void user_enter_irqoff(void) 40 40 { 41 41 if (context_tracking_enabled()) 42 - __ct_user_enter(CONTEXT_USER); 42 + __ct_user_enter(CT_STATE_USER); 43 43 44 44 } 45 45 static __always_inline void user_exit_irqoff(void) 46 46 { 47 47 if (context_tracking_enabled()) 48 - __ct_user_exit(CONTEXT_USER); 48 + __ct_user_exit(CT_STATE_USER); 49 49 } 50 50 51 51 static inline enum ctx_state exception_enter(void) ··· 57 57 return 0; 58 58 59 59 prev_ctx = __ct_state(); 60 - if (prev_ctx != CONTEXT_KERNEL) 60 + if (prev_ctx != CT_STATE_KERNEL) 61 61 ct_user_exit(prev_ctx); 62 62 63 63 return prev_ctx; ··· 67 67 { 68 68 if (!IS_ENABLED(CONFIG_HAVE_CONTEXT_TRACKING_USER_OFFSTACK) && 69 69 context_tracking_enabled()) { 70 - if (prev_ctx != CONTEXT_KERNEL) 70 + if (prev_ctx != CT_STATE_KERNEL) 71 71 ct_user_enter(prev_ctx); 72 72 } 73 73 } ··· 75 75 static __always_inline bool context_tracking_guest_enter(void) 76 76 { 77 77 if (context_tracking_enabled()) 78 - __ct_user_enter(CONTEXT_GUEST); 78 + __ct_user_enter(CT_STATE_GUEST); 79 79 80 80 return context_tracking_enabled_this_cpu(); 81 81 } ··· 83 83 static __always_inline bool context_tracking_guest_exit(void) 84 84 { 85 85 if (context_tracking_enabled()) 86 - __ct_user_exit(CONTEXT_GUEST); 86 + __ct_user_exit(CT_STATE_GUEST); 87 87 88 88 return context_tracking_enabled_this_cpu(); 89 89 } ··· 115 115 extern void ct_idle_exit(void); 116 116 117 117 /* 118 - * Is the current CPU in an extended quiescent state? 118 + * Is RCU watching the current CPU (IOW, it is not in an extended quiescent state)? 
119 + * 120 + * Note that this returns the actual boolean data (watching / not watching), 121 + * whereas ct_rcu_watching() returns the RCU_WATCHING subvariable of 122 + * context_tracking.state. 119 123 * 120 124 * No ordering, as we are sampling CPU-local information. 121 125 */ 122 - static __always_inline bool rcu_dynticks_curr_cpu_in_eqs(void) 126 + static __always_inline bool rcu_is_watching_curr_cpu(void) 123 127 { 124 - return !(raw_atomic_read(this_cpu_ptr(&context_tracking.state)) & RCU_DYNTICKS_IDX); 128 + return raw_atomic_read(this_cpu_ptr(&context_tracking.state)) & CT_RCU_WATCHING; 125 129 } 126 130 127 131 /* ··· 146 142 * lots of the actual reporting also relies on RCU. 147 143 */ 148 144 preempt_disable_notrace(); 149 - if (rcu_dynticks_curr_cpu_in_eqs()) { 145 + if (!rcu_is_watching_curr_cpu()) { 150 146 ret = true; 151 - ct_state_inc(RCU_DYNTICKS_IDX); 147 + ct_state_inc(CT_RCU_WATCHING); 152 148 } 153 149 154 150 return ret; ··· 157 153 static __always_inline void warn_rcu_exit(bool rcu) 158 154 { 159 155 if (rcu) 160 - ct_state_inc(RCU_DYNTICKS_IDX); 156 + ct_state_inc(CT_RCU_WATCHING); 161 157 preempt_enable_notrace(); 162 158 } 163 159
+38 -38
include/linux/context_tracking_state.h
···
 #include <linux/context_tracking_irq.h>
 
 /* Offset to allow distinguishing irq vs. task-based idle entry/exit. */
-#define DYNTICK_IRQ_NONIDLE	((LONG_MAX / 2) + 1)
+#define CT_NESTING_IRQ_NONIDLE	((LONG_MAX / 2) + 1)
 
 enum ctx_state {
-	CONTEXT_DISABLED	= -1,	/* returned by ct_state() if unknown */
-	CONTEXT_KERNEL		= 0,
-	CONTEXT_IDLE		= 1,
-	CONTEXT_USER		= 2,
-	CONTEXT_GUEST		= 3,
-	CONTEXT_MAX		= 4,
+	CT_STATE_DISABLED	= -1,	/* returned by ct_state() if unknown */
+	CT_STATE_KERNEL		= 0,
+	CT_STATE_IDLE		= 1,
+	CT_STATE_USER		= 2,
+	CT_STATE_GUEST		= 3,
+	CT_STATE_MAX		= 4,
 };
 
-/* Even value for idle, else odd. */
-#define RCU_DYNTICKS_IDX	CONTEXT_MAX
+/* Odd value for watching, else even. */
+#define CT_RCU_WATCHING		CT_STATE_MAX
 
-#define CT_STATE_MASK		(CONTEXT_MAX - 1)
-#define CT_DYNTICKS_MASK	(~CT_STATE_MASK)
+#define CT_STATE_MASK		(CT_STATE_MAX - 1)
+#define CT_RCU_WATCHING_MASK	(~CT_STATE_MASK)
 
 struct context_tracking {
 #ifdef CONFIG_CONTEXT_TRACKING_USER
···
 	atomic_t state;
 #endif
 #ifdef CONFIG_CONTEXT_TRACKING_IDLE
-	long dynticks_nesting;		/* Track process nesting level. */
-	long dynticks_nmi_nesting;	/* Track irq/NMI nesting level. */
+	long nesting;			/* Track process nesting level. */
+	long nmi_nesting;		/* Track irq/NMI nesting level. */
 #endif
 };
 
···
 #endif
 
 #ifdef CONFIG_CONTEXT_TRACKING_IDLE
-static __always_inline int ct_dynticks(void)
+static __always_inline int ct_rcu_watching(void)
 {
-	return atomic_read(this_cpu_ptr(&context_tracking.state)) & CT_DYNTICKS_MASK;
+	return atomic_read(this_cpu_ptr(&context_tracking.state)) & CT_RCU_WATCHING_MASK;
 }
 
-static __always_inline int ct_dynticks_cpu(int cpu)
+static __always_inline int ct_rcu_watching_cpu(int cpu)
 {
 	struct context_tracking *ct = per_cpu_ptr(&context_tracking, cpu);
 
-	return atomic_read(&ct->state) & CT_DYNTICKS_MASK;
+	return atomic_read(&ct->state) & CT_RCU_WATCHING_MASK;
 }
 
-static __always_inline int ct_dynticks_cpu_acquire(int cpu)
+static __always_inline int ct_rcu_watching_cpu_acquire(int cpu)
 {
 	struct context_tracking *ct = per_cpu_ptr(&context_tracking, cpu);
 
-	return atomic_read_acquire(&ct->state) & CT_DYNTICKS_MASK;
+	return atomic_read_acquire(&ct->state) & CT_RCU_WATCHING_MASK;
 }
 
-static __always_inline long ct_dynticks_nesting(void)
+static __always_inline long ct_nesting(void)
 {
-	return __this_cpu_read(context_tracking.dynticks_nesting);
+	return __this_cpu_read(context_tracking.nesting);
 }
 
-static __always_inline long ct_dynticks_nesting_cpu(int cpu)
-{
-	struct context_tracking *ct = per_cpu_ptr(&context_tracking, cpu);
-
-	return ct->dynticks_nesting;
-}
-
-static __always_inline long ct_dynticks_nmi_nesting(void)
-{
-	return __this_cpu_read(context_tracking.dynticks_nmi_nesting);
-}
-
-static __always_inline long ct_dynticks_nmi_nesting_cpu(int cpu)
+static __always_inline long ct_nesting_cpu(int cpu)
 {
 	struct context_tracking *ct = per_cpu_ptr(&context_tracking, cpu);
 
-	return ct->dynticks_nmi_nesting;
+	return ct->nesting;
+}
+
+static __always_inline long ct_nmi_nesting(void)
+{
+	return __this_cpu_read(context_tracking.nmi_nesting);
+}
+
+static __always_inline long ct_nmi_nesting_cpu(int cpu)
+{
+	struct context_tracking *ct = per_cpu_ptr(&context_tracking, cpu);
+
+	return ct->nmi_nesting;
 }
 #endif /* #ifdef CONFIG_CONTEXT_TRACKING_IDLE */
 
···
 	return context_tracking_enabled() && per_cpu(context_tracking.active, cpu);
 }
 
-static inline bool context_tracking_enabled_this_cpu(void)
+static __always_inline bool context_tracking_enabled_this_cpu(void)
 {
 	return context_tracking_enabled() && __this_cpu_read(context_tracking.active);
 }
···
  *
  * Returns the current cpu's context tracking state if context tracking
  * is enabled. If context tracking is disabled, returns
- * CONTEXT_DISABLED. This should be used primarily for debugging.
+ * CT_STATE_DISABLED. This should be used primarily for debugging.
  */
 static __always_inline int ct_state(void)
 {
 	int ret;
 
 	if (!context_tracking_enabled())
-		return CONTEXT_DISABLED;
+		return CT_STATE_DISABLED;
 
 	preempt_disable();
 	ret = __ct_state();
+1 -1
include/linux/entry-common.h
···
 	arch_enter_from_user_mode(regs);
 	lockdep_hardirqs_off(CALLER_ADDR0);
 
-	CT_WARN_ON(__ct_state() != CONTEXT_USER);
+	CT_WARN_ON(__ct_state() != CT_STATE_USER);
 	user_exit_irqoff();
 
 	instrumentation_begin();
+1 -5
include/linux/rcu_segcblist.h
···
  * ----------------------------------------------------------------------------
  */
 #define SEGCBLIST_ENABLED	BIT(0)
-#define SEGCBLIST_RCU_CORE	BIT(1)
-#define SEGCBLIST_LOCKING	BIT(2)
-#define SEGCBLIST_KTHREAD_CB	BIT(3)
-#define SEGCBLIST_KTHREAD_GP	BIT(4)
-#define SEGCBLIST_OFFLOADED	BIT(5)
+#define SEGCBLIST_OFFLOADED	BIT(1)
 
 struct rcu_segcblist {
 	struct rcu_head *head;
+7 -2
include/linux/rculist.h
···
  * @old : the element to be replaced
  * @new : the new element to insert
  *
- * The @old entry will be replaced with the @new entry atomically.
+ * The @old entry will be replaced with the @new entry atomically from
+ * the perspective of concurrent readers. It is the caller's responsibility
+ * to synchronize with concurrent updaters, if any.
+ *
  * Note: @old should not be empty.
  */
 static inline void list_replace_rcu(struct list_head *old,
···
  * @old : the element to be replaced
  * @new : the new element to insert
  *
- * The @old entry will be replaced with the @new entry atomically.
+ * The @old entry will be replaced with the @new entry atomically from
+ * the perspective of concurrent readers. It is the caller's responsibility
+ * to synchronize with concurrent updaters, if any.
  */
 static inline void hlist_replace_rcu(struct hlist_node *old,
 				     struct hlist_node *new)
+13 -2
include/linux/rcupdate.h
···
 #define ULONG_CMP_GE(a, b)	(ULONG_MAX / 2 >= (a) - (b))
 #define ULONG_CMP_LT(a, b)	(ULONG_MAX / 2 < (a) - (b))
 
+#define RCU_SEQ_CTR_SHIFT	2
+#define RCU_SEQ_STATE_MASK	((1 << RCU_SEQ_CTR_SHIFT) - 1)
+
 /* Exported common interfaces */
 void call_rcu(struct rcu_head *head, rcu_callback_t func);
 void rcu_barrier_tasks(void);
-void rcu_barrier_tasks_rude(void);
 void synchronize_rcu(void);
 
 struct rcu_gp_oldstate;
···
 int rcu_nocb_cpu_offload(int cpu);
 int rcu_nocb_cpu_deoffload(int cpu);
 void rcu_nocb_flush_deferred_wakeup(void);
+
+#define RCU_NOCB_LOCKDEP_WARN(c, s)	RCU_LOCKDEP_WARN(c, s)
+
 #else /* #ifdef CONFIG_RCU_NOCB_CPU */
+
 static inline void rcu_init_nohz(void) { }
 static inline int rcu_nocb_cpu_offload(int cpu) { return -EINVAL; }
 static inline int rcu_nocb_cpu_deoffload(int cpu) { return 0; }
 static inline void rcu_nocb_flush_deferred_wakeup(void) { }
+
+#define RCU_NOCB_LOCKDEP_WARN(c, s)
+
 #endif /* #else #ifdef CONFIG_RCU_NOCB_CPU */
 
 /*
···
 } while (0)
 void call_rcu_tasks(struct rcu_head *head, rcu_callback_t func);
 void synchronize_rcu_tasks(void);
+void rcu_tasks_torture_stats_print(char *tt, char *tf);
 # else
 # define rcu_tasks_classic_qs(t, preempt) do { } while (0)
 # define call_rcu_tasks call_rcu
···
 		rcu_tasks_trace_qs_blkd(t);				\
 	}								\
 } while (0)
+void rcu_tasks_trace_torture_stats_print(char *tt, char *tf);
 # else
 # define rcu_tasks_trace_qs(t) do { } while (0)
 # endif
···
 } while (0)
 
 # ifdef CONFIG_TASKS_RUDE_RCU
-void call_rcu_tasks_rude(struct rcu_head *head, rcu_callback_t func);
 void synchronize_rcu_tasks_rude(void);
+void rcu_tasks_rude_torture_stats_print(char *tt, char *tf);
 # endif
 
 #define rcu_note_voluntary_context_switch(t) rcu_tasks_qs(t, false)
+1 -1
include/linux/rcutiny.h
···
 static inline void rcu_end_inkernel_boot(void) { }
 static inline bool rcu_inkernel_boot_has_ended(void) { return true; }
 static inline bool rcu_is_watching(void) { return true; }
-static inline void rcu_momentary_dyntick_idle(void) { }
+static inline void rcu_momentary_eqs(void) { }
 static inline void kfree_rcu_scheduler_running(void) { }
 static inline bool rcu_gp_might_be_stalled(void) { return false; }
 
+1 -1
include/linux/rcutree.h
···
 void kvfree_call_rcu(struct rcu_head *head, void *ptr);
 
 void rcu_barrier(void);
-void rcu_momentary_dyntick_idle(void);
+void rcu_momentary_eqs(void);
 void kfree_rcu_scheduler_running(void);
 bool rcu_gp_might_be_stalled(void);
 
+6
include/linux/smp.h
···
 int smpcfd_dead_cpu(unsigned int cpu);
 int smpcfd_dying_cpu(unsigned int cpu);
 
+#ifdef CONFIG_CSD_LOCK_WAIT_DEBUG
+bool csd_lock_is_stuck(void);
+#else
+static inline bool csd_lock_is_stuck(void) { return false; }
+#endif
+
 #endif /* __LINUX_SMP_H */
+14 -1
include/linux/srcutree.h
···
 #define SRCU_STATE_SCAN1	1
 #define SRCU_STATE_SCAN2	2
 
+/*
+ * Values for initializing gp sequence fields. Higher values allow wrap arounds to
+ * occur earlier.
+ * The second value with state is useful in the case of static initialization of
+ * srcu_usage where srcu_gp_seq_needed is expected to have some state value in its
+ * lower bits (or else it will appear to be already initialized within
+ * the call check_init_srcu_struct()).
+ */
+#define SRCU_GP_SEQ_INITIAL_VAL ((0UL - 100UL) << RCU_SEQ_CTR_SHIFT)
+#define SRCU_GP_SEQ_INITIAL_VAL_WITH_STATE (SRCU_GP_SEQ_INITIAL_VAL - 1)
+
 #define __SRCU_USAGE_INIT(name)						\
 {									\
 	.lock = __SPIN_LOCK_UNLOCKED(name.lock),			\
-	.srcu_gp_seq_needed = -1UL,					\
+	.srcu_gp_seq = SRCU_GP_SEQ_INITIAL_VAL,				\
+	.srcu_gp_seq_needed = SRCU_GP_SEQ_INITIAL_VAL_WITH_STATE,	\
+	.srcu_gp_seq_needed_exp = SRCU_GP_SEQ_INITIAL_VAL,		\
 	.work = __DELAYED_WORK_INITIALIZER(name.work, NULL, 0),		\
 }
 
+10 -10
include/trace/events/rcu.h
···
 /*
  * Tracepoint for dyntick-idle entry/exit events. These take 2 strings
  * as argument:
- * polarity: "Start", "End", "StillNonIdle" for entering, exiting or still not
- * being in dyntick-idle mode.
+ * polarity: "Start", "End", "StillWatching" for entering, exiting or still not
+ * being in EQS mode.
  * context: "USER" or "IDLE" or "IRQ".
- * NMIs nested in IRQs are inferred with dynticks_nesting > 1 in IRQ context.
+ * NMIs nested in IRQs are inferred with nesting > 1 in IRQ context.
  *
  * These events also take a pair of numbers, which indicate the nesting
  * depth before and after the event of interest, and a third number that is
- * the ->dynticks counter. Note that task-related and interrupt-related
+ * the RCU_WATCHING counter. Note that task-related and interrupt-related
  * events use two separate counters, and that the "++=" and "--=" events
  * for irq/NMI will change the counter by two, otherwise by one.
  */
-TRACE_EVENT_RCU(rcu_dyntick,
+TRACE_EVENT_RCU(rcu_watching,
 
-	TP_PROTO(const char *polarity, long oldnesting, long newnesting, int dynticks),
+	TP_PROTO(const char *polarity, long oldnesting, long newnesting, int counter),
 
-	TP_ARGS(polarity, oldnesting, newnesting, dynticks),
+	TP_ARGS(polarity, oldnesting, newnesting, counter),
 
 	TP_STRUCT__entry(
 		__field(const char *, polarity)
 		__field(long, oldnesting)
 		__field(long, newnesting)
-		__field(int, dynticks)
+		__field(int, counter)
 	),
 
 	TP_fast_assign(
 		__entry->polarity = polarity;
 		__entry->oldnesting = oldnesting;
 		__entry->newnesting = newnesting;
-		__entry->dynticks = dynticks;
+		__entry->counter = counter;
 	),
 
 	TP_printk("%s %lx %lx %#3x", __entry->polarity,
 		  __entry->oldnesting, __entry->newnesting,
-		  __entry->dynticks & 0xfff)
+		  __entry->counter & 0xfff)
 );
 
 /*
+70 -70
kernel/context_tracking.c
··· 28 28 29 29 DEFINE_PER_CPU(struct context_tracking, context_tracking) = { 30 30 #ifdef CONFIG_CONTEXT_TRACKING_IDLE 31 - .dynticks_nesting = 1, 32 - .dynticks_nmi_nesting = DYNTICK_IRQ_NONIDLE, 31 + .nesting = 1, 32 + .nmi_nesting = CT_NESTING_IRQ_NONIDLE, 33 33 #endif 34 - .state = ATOMIC_INIT(RCU_DYNTICKS_IDX), 34 + .state = ATOMIC_INIT(CT_RCU_WATCHING), 35 35 }; 36 36 EXPORT_SYMBOL_GPL(context_tracking); 37 37 38 38 #ifdef CONFIG_CONTEXT_TRACKING_IDLE 39 39 #define TPS(x) tracepoint_string(x) 40 40 41 - /* Record the current task on dyntick-idle entry. */ 42 - static __always_inline void rcu_dynticks_task_enter(void) 41 + /* Record the current task on exiting RCU-tasks (dyntick-idle entry). */ 42 + static __always_inline void rcu_task_exit(void) 43 43 { 44 44 #if defined(CONFIG_TASKS_RCU) && defined(CONFIG_NO_HZ_FULL) 45 45 WRITE_ONCE(current->rcu_tasks_idle_cpu, smp_processor_id()); 46 46 #endif /* #if defined(CONFIG_TASKS_RCU) && defined(CONFIG_NO_HZ_FULL) */ 47 47 } 48 48 49 - /* Record no current task on dyntick-idle exit. */ 50 - static __always_inline void rcu_dynticks_task_exit(void) 49 + /* Record no current task on entering RCU-tasks (dyntick-idle exit). */ 50 + static __always_inline void rcu_task_enter(void) 51 51 { 52 52 #if defined(CONFIG_TASKS_RCU) && defined(CONFIG_NO_HZ_FULL) 53 53 WRITE_ONCE(current->rcu_tasks_idle_cpu, -1); 54 54 #endif /* #if defined(CONFIG_TASKS_RCU) && defined(CONFIG_NO_HZ_FULL) */ 55 55 } 56 56 57 - /* Turn on heavyweight RCU tasks trace readers on idle/user entry. */ 58 - static __always_inline void rcu_dynticks_task_trace_enter(void) 57 + /* Turn on heavyweight RCU tasks trace readers on kernel exit. */ 58 + static __always_inline void rcu_task_trace_heavyweight_enter(void) 59 59 { 60 60 #ifdef CONFIG_TASKS_TRACE_RCU 61 61 if (IS_ENABLED(CONFIG_TASKS_TRACE_RCU_READ_MB)) ··· 63 63 #endif /* #ifdef CONFIG_TASKS_TRACE_RCU */ 64 64 } 65 65 66 - /* Turn off heavyweight RCU tasks trace readers on idle/user exit. 
*/ 67 - static __always_inline void rcu_dynticks_task_trace_exit(void) 66 + /* Turn off heavyweight RCU tasks trace readers on kernel entry. */ 67 + static __always_inline void rcu_task_trace_heavyweight_exit(void) 68 68 { 69 69 #ifdef CONFIG_TASKS_TRACE_RCU 70 70 if (IS_ENABLED(CONFIG_TASKS_TRACE_RCU_READ_MB)) ··· 87 87 * critical sections, and we also must force ordering with the 88 88 * next idle sojourn. 89 89 */ 90 - rcu_dynticks_task_trace_enter(); // Before ->dynticks update! 90 + rcu_task_trace_heavyweight_enter(); // Before CT state update! 91 91 seq = ct_state_inc(offset); 92 92 // RCU is no longer watching. Better be in extended quiescent state! 93 - WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && (seq & RCU_DYNTICKS_IDX)); 93 + WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && (seq & CT_RCU_WATCHING)); 94 94 } 95 95 96 96 /* ··· 109 109 */ 110 110 seq = ct_state_inc(offset); 111 111 // RCU is now watching. Better not be in an extended quiescent state! 112 - rcu_dynticks_task_trace_exit(); // After ->dynticks update! 113 - WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !(seq & RCU_DYNTICKS_IDX)); 112 + rcu_task_trace_heavyweight_exit(); // After CT state update! 113 + WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !(seq & CT_RCU_WATCHING)); 114 114 } 115 115 116 116 /* 117 117 * Enter an RCU extended quiescent state, which can be either the 118 118 * idle loop or adaptive-tickless usermode execution. 119 119 * 120 - * We crowbar the ->dynticks_nmi_nesting field to zero to allow for 120 + * We crowbar the ->nmi_nesting field to zero to allow for 121 121 * the possibility of usermode upcalls having messed up our count 122 122 * of interrupt nesting level during the prior busy period. 
123 123 */ ··· 125 125 { 126 126 struct context_tracking *ct = this_cpu_ptr(&context_tracking); 127 127 128 - WARN_ON_ONCE(ct_dynticks_nmi_nesting() != DYNTICK_IRQ_NONIDLE); 129 - WRITE_ONCE(ct->dynticks_nmi_nesting, 0); 128 + WARN_ON_ONCE(ct_nmi_nesting() != CT_NESTING_IRQ_NONIDLE); 129 + WRITE_ONCE(ct->nmi_nesting, 0); 130 130 WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && 131 - ct_dynticks_nesting() == 0); 132 - if (ct_dynticks_nesting() != 1) { 131 + ct_nesting() == 0); 132 + if (ct_nesting() != 1) { 133 133 // RCU will still be watching, so just do accounting and leave. 134 - ct->dynticks_nesting--; 134 + ct->nesting--; 135 135 return; 136 136 } 137 137 138 138 instrumentation_begin(); 139 139 lockdep_assert_irqs_disabled(); 140 - trace_rcu_dyntick(TPS("Start"), ct_dynticks_nesting(), 0, ct_dynticks()); 140 + trace_rcu_watching(TPS("End"), ct_nesting(), 0, ct_rcu_watching()); 141 141 WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !user && !is_idle_task(current)); 142 142 rcu_preempt_deferred_qs(current); 143 143 ··· 145 145 instrument_atomic_write(&ct->state, sizeof(ct->state)); 146 146 147 147 instrumentation_end(); 148 - WRITE_ONCE(ct->dynticks_nesting, 0); /* Avoid irq-access tearing. */ 148 + WRITE_ONCE(ct->nesting, 0); /* Avoid irq-access tearing. */ 149 149 // RCU is watching here ... 150 150 ct_kernel_exit_state(offset); 151 151 // ... but is no longer watching here. 152 - rcu_dynticks_task_enter(); 152 + rcu_task_exit(); 153 153 } 154 154 155 155 /* 156 156 * Exit an RCU extended quiescent state, which can be either the 157 157 * idle loop or adaptive-tickless usermode execution. 158 158 * 159 - * We crowbar the ->dynticks_nmi_nesting field to DYNTICK_IRQ_NONIDLE to 159 + * We crowbar the ->nmi_nesting field to CT_NESTING_IRQ_NONIDLE to 160 160 * allow for the possibility of usermode upcalls messing up our count of 161 161 * interrupt nesting level during the busy period that is just now starting. 
162 162 */ ··· 166 166 long oldval; 167 167 168 168 WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !raw_irqs_disabled()); 169 - oldval = ct_dynticks_nesting(); 169 + oldval = ct_nesting(); 170 170 WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && oldval < 0); 171 171 if (oldval) { 172 172 // RCU was already watching, so just do accounting and leave. 173 - ct->dynticks_nesting++; 173 + ct->nesting++; 174 174 return; 175 175 } 176 - rcu_dynticks_task_exit(); 176 + rcu_task_enter(); 177 177 // RCU is not watching here ... 178 178 ct_kernel_enter_state(offset); 179 179 // ... but is watching here. ··· 182 182 // instrumentation for the noinstr ct_kernel_enter_state() 183 183 instrument_atomic_write(&ct->state, sizeof(ct->state)); 184 184 185 - trace_rcu_dyntick(TPS("End"), ct_dynticks_nesting(), 1, ct_dynticks()); 185 + trace_rcu_watching(TPS("Start"), ct_nesting(), 1, ct_rcu_watching()); 186 186 WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !user && !is_idle_task(current)); 187 - WRITE_ONCE(ct->dynticks_nesting, 1); 188 - WARN_ON_ONCE(ct_dynticks_nmi_nesting()); 189 - WRITE_ONCE(ct->dynticks_nmi_nesting, DYNTICK_IRQ_NONIDLE); 187 + WRITE_ONCE(ct->nesting, 1); 188 + WARN_ON_ONCE(ct_nmi_nesting()); 189 + WRITE_ONCE(ct->nmi_nesting, CT_NESTING_IRQ_NONIDLE); 190 190 instrumentation_end(); 191 191 } 192 192 ··· 194 194 * ct_nmi_exit - inform RCU of exit from NMI context 195 195 * 196 196 * If we are returning from the outermost NMI handler that interrupted an 197 - * RCU-idle period, update ct->state and ct->dynticks_nmi_nesting 197 + * RCU-idle period, update ct->state and ct->nmi_nesting 198 198 * to let the RCU grace-period handling know that the CPU is back to 199 199 * being RCU-idle. 200 200 * ··· 207 207 208 208 instrumentation_begin(); 209 209 /* 210 - * Check for ->dynticks_nmi_nesting underflow and bad ->dynticks. 210 + * Check for ->nmi_nesting underflow and bad CT state. 
211 211 * (We are exiting an NMI handler, so RCU better be paying attention 212 212 * to us!) 213 213 */ 214 - WARN_ON_ONCE(ct_dynticks_nmi_nesting() <= 0); 215 - WARN_ON_ONCE(rcu_dynticks_curr_cpu_in_eqs()); 214 + WARN_ON_ONCE(ct_nmi_nesting() <= 0); 215 + WARN_ON_ONCE(!rcu_is_watching_curr_cpu()); 216 216 217 217 /* 218 218 * If the nesting level is not 1, the CPU wasn't RCU-idle, so 219 219 * leave it in non-RCU-idle state. 220 220 */ 221 - if (ct_dynticks_nmi_nesting() != 1) { 222 - trace_rcu_dyntick(TPS("--="), ct_dynticks_nmi_nesting(), ct_dynticks_nmi_nesting() - 2, 223 - ct_dynticks()); 224 - WRITE_ONCE(ct->dynticks_nmi_nesting, /* No store tearing. */ 225 - ct_dynticks_nmi_nesting() - 2); 221 + if (ct_nmi_nesting() != 1) { 222 + trace_rcu_watching(TPS("--="), ct_nmi_nesting(), ct_nmi_nesting() - 2, 223 + ct_rcu_watching()); 224 + WRITE_ONCE(ct->nmi_nesting, /* No store tearing. */ 225 + ct_nmi_nesting() - 2); 226 226 instrumentation_end(); 227 227 return; 228 228 } 229 229 230 230 /* This NMI interrupted an RCU-idle CPU, restore RCU-idleness. */ 231 - trace_rcu_dyntick(TPS("Startirq"), ct_dynticks_nmi_nesting(), 0, ct_dynticks()); 232 - WRITE_ONCE(ct->dynticks_nmi_nesting, 0); /* Avoid store tearing. */ 231 + trace_rcu_watching(TPS("Endirq"), ct_nmi_nesting(), 0, ct_rcu_watching()); 232 + WRITE_ONCE(ct->nmi_nesting, 0); /* Avoid store tearing. */ 233 233 234 234 // instrumentation for the noinstr ct_kernel_exit_state() 235 235 instrument_atomic_write(&ct->state, sizeof(ct->state)); 236 236 instrumentation_end(); 237 237 238 238 // RCU is watching here ... 239 - ct_kernel_exit_state(RCU_DYNTICKS_IDX); 239 + ct_kernel_exit_state(CT_RCU_WATCHING); 240 240 // ... but is no longer watching here. 
241 241 242 242 if (!in_nmi()) 243 - rcu_dynticks_task_enter(); 243 + rcu_task_exit(); 244 244 } 245 245 246 246 /** 247 247 * ct_nmi_enter - inform RCU of entry to NMI context 248 248 * 249 249 * If the CPU was idle from RCU's viewpoint, update ct->state and 250 - * ct->dynticks_nmi_nesting to let the RCU grace-period handling know 250 + * ct->nmi_nesting to let the RCU grace-period handling know 251 251 * that the CPU is active. This implementation permits nested NMIs, as 252 252 * long as the nesting level does not overflow an int. (You will probably 253 253 * run out of stack space first.) ··· 261 261 struct context_tracking *ct = this_cpu_ptr(&context_tracking); 262 262 263 263 /* Complain about underflow. */ 264 - WARN_ON_ONCE(ct_dynticks_nmi_nesting() < 0); 264 + WARN_ON_ONCE(ct_nmi_nesting() < 0); 265 265 266 266 /* 267 - * If idle from RCU viewpoint, atomically increment ->dynticks 268 - * to mark non-idle and increment ->dynticks_nmi_nesting by one. 269 - * Otherwise, increment ->dynticks_nmi_nesting by two. This means 270 - * if ->dynticks_nmi_nesting is equal to one, we are guaranteed 267 + * If idle from RCU viewpoint, atomically increment CT state 268 + * to mark non-idle and increment ->nmi_nesting by one. 269 + * Otherwise, increment ->nmi_nesting by two. This means 270 + * if ->nmi_nesting is equal to one, we are guaranteed 271 271 * to be in the outermost NMI handler that interrupted an RCU-idle 272 272 * period (observation due to Andy Lutomirski). 273 273 */ 274 - if (rcu_dynticks_curr_cpu_in_eqs()) { 274 + if (!rcu_is_watching_curr_cpu()) { 275 275 276 276 if (!in_nmi()) 277 - rcu_dynticks_task_exit(); 277 + rcu_task_enter(); 278 278 279 279 // RCU is not watching here ... 280 - ct_kernel_enter_state(RCU_DYNTICKS_IDX); 280 + ct_kernel_enter_state(CT_RCU_WATCHING); 281 281 // ... but is watching here. 
282 282 283 283 instrumentation_begin(); 284 - // instrumentation for the noinstr rcu_dynticks_curr_cpu_in_eqs() 284 + // instrumentation for the noinstr rcu_is_watching_curr_cpu() 285 285 instrument_atomic_read(&ct->state, sizeof(ct->state)); 286 286 // instrumentation for the noinstr ct_kernel_enter_state() 287 287 instrument_atomic_write(&ct->state, sizeof(ct->state)); ··· 294 294 instrumentation_begin(); 295 295 } 296 296 297 - trace_rcu_dyntick(incby == 1 ? TPS("Endirq") : TPS("++="), 298 - ct_dynticks_nmi_nesting(), 299 - ct_dynticks_nmi_nesting() + incby, ct_dynticks()); 297 + trace_rcu_watching(incby == 1 ? TPS("Startirq") : TPS("++="), 298 + ct_nmi_nesting(), 299 + ct_nmi_nesting() + incby, ct_rcu_watching()); 300 300 instrumentation_end(); 301 - WRITE_ONCE(ct->dynticks_nmi_nesting, /* Prevent store tearing. */ 302 - ct_dynticks_nmi_nesting() + incby); 301 + WRITE_ONCE(ct->nmi_nesting, /* Prevent store tearing. */ 302 + ct_nmi_nesting() + incby); 303 303 barrier(); 304 304 } 305 305 ··· 317 317 void noinstr ct_idle_enter(void) 318 318 { 319 319 WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !raw_irqs_disabled()); 320 - ct_kernel_exit(false, RCU_DYNTICKS_IDX + CONTEXT_IDLE); 320 + ct_kernel_exit(false, CT_RCU_WATCHING + CT_STATE_IDLE); 321 321 } 322 322 EXPORT_SYMBOL_GPL(ct_idle_enter); 323 323 ··· 335 335 unsigned long flags; 336 336 337 337 raw_local_irq_save(flags); 338 - ct_kernel_enter(false, RCU_DYNTICKS_IDX - CONTEXT_IDLE); 338 + ct_kernel_enter(false, CT_RCU_WATCHING - CT_STATE_IDLE); 339 339 raw_local_irq_restore(flags); 340 340 } 341 341 EXPORT_SYMBOL_GPL(ct_idle_exit); ··· 485 485 * user_exit() or ct_irq_enter(). Let's remove RCU's dependency 486 486 * on the tick. 
487 487 */ 488 - if (state == CONTEXT_USER) { 488 + if (state == CT_STATE_USER) { 489 489 instrumentation_begin(); 490 490 trace_user_enter(0); 491 491 vtime_user_enter(current); ··· 504 504 * CPU doesn't need to maintain the tick for RCU maintenance purposes 505 505 * when the CPU runs in userspace. 506 506 */ 507 - ct_kernel_exit(true, RCU_DYNTICKS_IDX + state); 507 + ct_kernel_exit(true, CT_RCU_WATCHING + state); 508 508 509 509 /* 510 510 * Special case if we only track user <-> kernel transitions for tickless ··· 534 534 /* 535 535 * Tracking for vtime and RCU EQS. Make sure we don't race 536 536 * with NMIs. OTOH we don't care about ordering here since 537 - * RCU only requires RCU_DYNTICKS_IDX increments to be fully 537 + * RCU only requires CT_RCU_WATCHING increments to be fully 538 538 * ordered. 539 539 */ 540 540 raw_atomic_add(state, &ct->state); ··· 620 620 * Exit RCU idle mode while entering the kernel because it can 621 621 * run a RCU read side critical section anytime. 622 622 */ 623 - ct_kernel_enter(true, RCU_DYNTICKS_IDX - state); 624 - if (state == CONTEXT_USER) { 623 + ct_kernel_enter(true, CT_RCU_WATCHING - state); 624 + if (state == CT_STATE_USER) { 625 625 instrumentation_begin(); 626 626 vtime_user_exit(current); 627 627 trace_user_exit(0); ··· 634 634 * In this we case we don't care about any concurrency/ordering. 635 635 */ 636 636 if (!IS_ENABLED(CONFIG_CONTEXT_TRACKING_IDLE)) 637 - raw_atomic_set(&ct->state, CONTEXT_KERNEL); 637 + raw_atomic_set(&ct->state, CT_STATE_KERNEL); 638 638 639 639 } else { 640 640 if (!IS_ENABLED(CONFIG_CONTEXT_TRACKING_IDLE)) { 641 641 /* Tracking for vtime only, no concurrent RCU EQS accounting */ 642 - raw_atomic_set(&ct->state, CONTEXT_KERNEL); 642 + raw_atomic_set(&ct->state, CT_STATE_KERNEL); 643 643 } else { 644 644 /* 645 645 * Tracking for vtime and RCU EQS. Make sure we don't race 646 646 * with NMIs. 
OTOH we don't care about ordering here since 647 - * RCU only requires RCU_DYNTICKS_IDX increments to be fully 647 + * RCU only requires CT_RCU_WATCHING increments to be fully 648 648 * ordered. 649 649 */ 650 650 raw_atomic_sub(state, &ct->state);
+1 -1
kernel/entry/common.c
···
 	unsigned long work = READ_ONCE(current_thread_info()->syscall_work);
 	unsigned long nr = syscall_get_nr(current, regs);
 
-	CT_WARN_ON(ct_state() != CONTEXT_KERNEL);
+	CT_WARN_ON(ct_state() != CT_STATE_KERNEL);
 
 	if (IS_ENABLED(CONFIG_PROVE_LOCKING)) {
 		if (WARN(irqs_disabled(), "syscall %lu left IRQs disabled", nr))
+7 -5
kernel/rcu/rcu.h
···
  * grace-period sequence number.
  */
 
-#define RCU_SEQ_CTR_SHIFT	2
-#define RCU_SEQ_STATE_MASK	((1 << RCU_SEQ_CTR_SHIFT) - 1)
-
 /* Low-order bit definition for polled grace-period APIs. */
 #define RCU_GET_STATE_COMPLETED	0x1
 
···
 {
 	if (unlikely(!rhp->func))
 		kmem_dump_obj(rhp);
+}
+
+static inline bool rcu_barrier_cb_is_done(struct rcu_head *rhp)
+{
+	return rhp->next == rhp;
 }
 
 extern int rcu_cpu_stall_suppress_at_boot;
···
 #endif
 
 #ifdef CONFIG_TINY_RCU
-static inline bool rcu_dynticks_zero_in_eqs(int cpu, int *vp) { return false; }
+static inline bool rcu_watching_zero_in_eqs(int cpu, int *vp) { return false; }
 static inline unsigned long rcu_get_gp_seq(void) { return 0; }
 static inline unsigned long rcu_exp_batches_completed(void) { return 0; }
 static inline unsigned long
···
 static inline void rcu_gp_slow_register(atomic_t *rgssp) { }
 static inline void rcu_gp_slow_unregister(atomic_t *rgssp) { }
 #else /* #ifdef CONFIG_TINY_RCU */
-bool rcu_dynticks_zero_in_eqs(int cpu, int *vp);
+bool rcu_watching_zero_in_eqs(int cpu, int *vp);
 unsigned long rcu_get_gp_seq(void);
 unsigned long rcu_exp_batches_completed(void);
 unsigned long srcu_batches_completed(struct srcu_struct *sp);
-11
kernel/rcu/rcu_segcblist.c
···
 }
 
 /*
- * Mark the specified rcu_segcblist structure as offloaded (or not)
- */
-void rcu_segcblist_offload(struct rcu_segcblist *rsclp, bool offload)
-{
-	if (offload)
-		rcu_segcblist_set_flags(rsclp, SEGCBLIST_LOCKING | SEGCBLIST_OFFLOADED);
-	else
-		rcu_segcblist_clear_flags(rsclp, SEGCBLIST_OFFLOADED);
-}
-
-/*
  * Does the specified rcu_segcblist structure contain callbacks that
  * are ready to be invoked?
  */
+1 -10
kernel/rcu/rcu_segcblist.h
···
 static inline bool rcu_segcblist_is_offloaded(struct rcu_segcblist *rsclp)
 {
 	if (IS_ENABLED(CONFIG_RCU_NOCB_CPU) &&
-	    rcu_segcblist_test_flags(rsclp, SEGCBLIST_LOCKING))
-		return true;
-
-	return false;
-}
-
-static inline bool rcu_segcblist_completely_offloaded(struct rcu_segcblist *rsclp)
-{
-	if (IS_ENABLED(CONFIG_RCU_NOCB_CPU) &&
-	    !rcu_segcblist_test_flags(rsclp, SEGCBLIST_RCU_CORE))
+	    rcu_segcblist_test_flags(rsclp, SEGCBLIST_OFFLOADED))
 		return true;
 
 	return false;
+187 -27
kernel/rcu/rcuscale.c
···
 #include <linux/torture.h>
 #include <linux/vmalloc.h>
 #include <linux/rcupdate_trace.h>
+#include <linux/sched/debug.h>
 
 #include "rcu.h"
 
···
 module_param(scale_type, charp, 0444);
 MODULE_PARM_DESC(scale_type, "Type of RCU to scalability-test (rcu, srcu, ...)");
 
+// Structure definitions for custom fixed-per-task allocator.
+struct writer_mblock {
+	struct rcu_head wmb_rh;
+	struct llist_node wmb_node;
+	struct writer_freelist *wmb_wfl;
+};
+
+struct writer_freelist {
+	struct llist_head ws_lhg;
+	atomic_t ws_inflight;
+	struct llist_head ____cacheline_internodealigned_in_smp ws_lhp;
+	struct writer_mblock *ws_mblocks;
+};
+
 static int nrealreaders;
 static int nrealwriters;
 static struct task_struct **writer_tasks;
···
 static struct task_struct *shutdown_task;
 
 static u64 **writer_durations;
+static bool *writer_done;
+static struct writer_freelist *writer_freelists;
 static int *writer_n_durations;
 static atomic_t n_rcu_scale_reader_started;
 static atomic_t n_rcu_scale_writer_started;
···
 static u64 t_rcu_scale_writer_finished;
 static unsigned long b_rcu_gp_test_started;
 static unsigned long b_rcu_gp_test_finished;
-static DEFINE_PER_CPU(atomic_t, n_async_inflight);
 
 #define MAX_MEAS 10000
 #define MIN_MEAS 100
···
 	void (*sync)(void);
 	void (*exp_sync)(void);
 	struct task_struct *(*rso_gp_kthread)(void);
+	void (*stats)(void);
 	const char *name;
 };
 
···
 	synchronize_srcu(srcu_ctlp);
 }
 
+static void srcu_scale_stats(void)
+{
+	srcu_torture_stats_print(srcu_ctlp, scale_type, SCALE_FLAG);
+}
+
 static void srcu_scale_synchronize_expedited(void)
 {
 	synchronize_srcu_expedited(srcu_ctlp);
···
 	.gp_barrier = srcu_rcu_barrier,
 	.sync = srcu_scale_synchronize,
 	.exp_sync = srcu_scale_synchronize_expedited,
+	.stats = srcu_scale_stats,
 	.name = "srcu"
 };
 
···
 	.gp_barrier = srcu_rcu_barrier,
 	.sync = srcu_scale_synchronize,
 	.exp_sync = srcu_scale_synchronize_expedited,
+	.stats = srcu_scale_stats,
 	.name = "srcud"
 };
 
···
 {
 }
 
+static void rcu_tasks_scale_stats(void)
+{
+	rcu_tasks_torture_stats_print(scale_type, SCALE_FLAG);
+}
+
 static struct rcu_scale_ops tasks_ops = {
 	.ptype = RCU_TASKS_FLAVOR,
 	.init = rcu_sync_scale_init,
···
 	.sync = synchronize_rcu_tasks,
 	.exp_sync = synchronize_rcu_tasks,
 	.rso_gp_kthread = get_rcu_tasks_gp_kthread,
+	.stats = IS_ENABLED(CONFIG_TINY_RCU) ? NULL : rcu_tasks_scale_stats,
 	.name = "tasks"
 };
 
···
 {
 }
 
+static void rcu_tasks_rude_scale_stats(void)
+{
+	rcu_tasks_rude_torture_stats_print(scale_type, SCALE_FLAG);
+}
+
 static struct rcu_scale_ops tasks_rude_ops = {
 	.ptype = RCU_TASKS_RUDE_FLAVOR,
 	.init = rcu_sync_scale_init,
···
 	.readunlock = tasks_rude_scale_read_unlock,
 	.get_gp_seq = rcu_no_completed,
 	.gp_diff = rcu_seq_diff,
-	.async = call_rcu_tasks_rude,
-	.gp_barrier = rcu_barrier_tasks_rude,
 	.sync = synchronize_rcu_tasks_rude,
 	.exp_sync = synchronize_rcu_tasks_rude,
 	.rso_gp_kthread = get_rcu_tasks_rude_gp_kthread,
+	.stats = IS_ENABLED(CONFIG_TINY_RCU) ? NULL : rcu_tasks_rude_scale_stats,
 	.name = "tasks-rude"
 };
 
···
 	rcu_read_unlock_trace();
 }
 
+static void rcu_tasks_trace_scale_stats(void)
+{
+	rcu_tasks_trace_torture_stats_print(scale_type, SCALE_FLAG);
+}
+
 static struct rcu_scale_ops tasks_tracing_ops = {
 	.ptype = RCU_TASKS_FLAVOR,
 	.init = rcu_sync_scale_init,
···
 	.sync = synchronize_rcu_tasks_trace,
 	.exp_sync = synchronize_rcu_tasks_trace,
 	.rso_gp_kthread = get_rcu_tasks_trace_gp_kthread,
+	.stats = IS_ENABLED(CONFIG_TINY_RCU) ? NULL : rcu_tasks_trace_scale_stats,
 	.name = "tasks-tracing"
 };
 
···
 }
 
 /*
+ * Allocate a writer_mblock structure for the specified rcu_scale_writer
+ * task.
+ */
+static struct writer_mblock *rcu_scale_alloc(long me)
+{
+	struct llist_node *llnp;
+	struct writer_freelist *wflp;
+	struct writer_mblock *wmbp;
+
+	if (WARN_ON_ONCE(!writer_freelists))
+		return NULL;
+	wflp = &writer_freelists[me];
+	if (llist_empty(&wflp->ws_lhp)) {
+		// ->ws_lhp is private to its rcu_scale_writer task.
+		wmbp = container_of(llist_del_all(&wflp->ws_lhg), struct writer_mblock, wmb_node);
+		wflp->ws_lhp.first = &wmbp->wmb_node;
+	}
+	llnp = llist_del_first(&wflp->ws_lhp);
+	if (!llnp)
+		return NULL;
+	return container_of(llnp, struct writer_mblock, wmb_node);
+}
+
+/*
+ * Free a writer_mblock structure to its rcu_scale_writer task.
+ */
+static void rcu_scale_free(struct writer_mblock *wmbp)
+{
+	struct writer_freelist *wflp;
+
+	if (!wmbp)
+		return;
+	wflp = wmbp->wmb_wfl;
+	llist_add(&wmbp->wmb_node, &wflp->ws_lhg);
+}
+
+/*
  * Callback function for asynchronous grace periods from rcu_scale_writer().
  */
 static void rcu_scale_async_cb(struct rcu_head *rhp)
 {
-	atomic_dec(this_cpu_ptr(&n_async_inflight));
-	kfree(rhp);
+	struct writer_mblock *wmbp = container_of(rhp, struct writer_mblock, wmb_rh);
+	struct writer_freelist *wflp = wmbp->wmb_wfl;
+
+	atomic_dec(&wflp->ws_inflight);
+	rcu_scale_free(wmbp);
 }
 
 /*
···
 	int i_max;
 	unsigned long jdone;
 	long me = (long)arg;
-	struct rcu_head *rhp = NULL;
+	bool selfreport = false;
 	bool started = false, done = false, alldone = false;
 	u64 t;
 	DEFINE_TORTURE_RANDOM(tr);
 	u64 *wdp;
 	u64 *wdpp = writer_durations[me];
+	struct writer_freelist *wflp = &writer_freelists[me];
+	struct writer_mblock *wmbp = NULL;
 
 	VERBOSE_SCALEOUT_STRING("rcu_scale_writer task started");
 	WARN_ON(!wdpp);
···
 
 	jdone = jiffies + minruntime * HZ;
 	do {
+		bool gp_succeeded = false;
+
 		if (writer_holdoff)
 			udelay(writer_holdoff);
 		if (writer_holdoff_jiffies)
 			schedule_timeout_idle(torture_random(&tr) % writer_holdoff_jiffies + 1);
 		wdp = &wdpp[i];
 		*wdp = ktime_get_mono_fast_ns();
-		if (gp_async) {
-retry:
-			if (!rhp)
-				rhp = kmalloc(sizeof(*rhp), GFP_KERNEL);
-			if (rhp && atomic_read(this_cpu_ptr(&n_async_inflight)) < gp_async_max) {
-				atomic_inc(this_cpu_ptr(&n_async_inflight));
-				cur_ops->async(rhp, rcu_scale_async_cb);
-				rhp = NULL;
+		if (gp_async && !WARN_ON_ONCE(!cur_ops->async)) {
+			if (!wmbp)
+				wmbp = rcu_scale_alloc(me);
+			if (wmbp && atomic_read(&wflp->ws_inflight) < gp_async_max) {
+				atomic_inc(&wflp->ws_inflight);
+				cur_ops->async(&wmbp->wmb_rh, rcu_scale_async_cb);
+				wmbp = NULL;
+				gp_succeeded = true;
 			} else if (!kthread_should_stop()) {
 				cur_ops->gp_barrier();
-				goto retry;
 			} else {
-				kfree(rhp); /* Because we are stopping. */
+				rcu_scale_free(wmbp); /* Because we are stopping. */
+				wmbp = NULL;
 			}
 		} else if (gp_exp) {
 			cur_ops->exp_sync();
+			gp_succeeded = true;
 		} else {
 			cur_ops->sync();
+			gp_succeeded = true;
 		}
 		t = ktime_get_mono_fast_ns();
 		*wdp = t - *wdp;
···
 			started = true;
 		if (!done && i >= MIN_MEAS && time_after(jiffies, jdone)) {
 			done = true;
+			WRITE_ONCE(writer_done[me], true);
 			sched_set_normal(current, 0);
 			pr_alert("%s%s rcu_scale_writer %ld has %d measurements\n",
 				 scale_type, SCALE_FLAG, me, MIN_MEAS);
···
 		if (done && !alldone &&
 		    atomic_read(&n_rcu_scale_writer_finished) >= nrealwriters)
 			alldone = true;
-		if (started && !alldone && i < MAX_MEAS - 1)
+		if (done && !alldone && time_after(jiffies, jdone + HZ * 60)) {
+			static atomic_t dumped;
+			int i;
+
+			if (!atomic_xchg(&dumped, 1)) {
+				for (i = 0; i < nrealwriters; i++) {
+					if (writer_done[i])
+						continue;
+					pr_info("%s: Task %ld flags writer %d:\n", __func__, me, i);
+					sched_show_task(writer_tasks[i]);
+				}
+				if (cur_ops->stats)
+					cur_ops->stats();
+			}
+		}
+		if (!selfreport && time_after(jiffies, jdone + HZ * (70 + me))) {
+			pr_info("%s: Writer %ld self-report: started %d done %d/%d->%d i %d jdone %lu.\n",
+				__func__, me, started, done, writer_done[me], atomic_read(&n_rcu_scale_writer_finished), i, jiffies - jdone);
+			selfreport = true;
+		}
+		if (gp_succeeded && started && !alldone && i < MAX_MEAS - 1)
 			i++;
 		rcu_scale_wait_shutdown();
 	} while (!torture_must_stop());
-	if (gp_async) {
+	if (gp_async && cur_ops->async) {
+		rcu_scale_free(wmbp);
 		cur_ops->gp_barrier();
 	}
 	writer_n_durations[me] = i_max + 1;
···
 	torture_stop_kthread(kfree_scale_thread,
 			     kfree_reader_tasks[i]);
kfree(kfree_reader_tasks); 716 + kfree_reader_tasks = NULL; 824 717 } 825 718 826 719 torture_cleanup_end(); ··· 990 881 torture_stop_kthread(rcu_scale_reader, 991 882 reader_tasks[i]); 992 883 kfree(reader_tasks); 884 + reader_tasks = NULL; 993 885 } 994 886 995 887 if (writer_tasks) { ··· 1029 919 schedule_timeout_uninterruptible(1); 1030 920 } 1031 921 kfree(writer_durations[i]); 922 + if (writer_freelists) { 923 + int ctr = 0; 924 + struct llist_node *llnp; 925 + struct writer_freelist *wflp = &writer_freelists[i]; 926 + 927 + if (wflp->ws_mblocks) { 928 + llist_for_each(llnp, wflp->ws_lhg.first) 929 + ctr++; 930 + llist_for_each(llnp, wflp->ws_lhp.first) 931 + ctr++; 932 + WARN_ONCE(ctr != gp_async_max, 933 + "%s: ctr = %d gp_async_max = %d\n", 934 + __func__, ctr, gp_async_max); 935 + kfree(wflp->ws_mblocks); 936 + } 937 + } 1032 938 } 1033 939 kfree(writer_tasks); 940 + writer_tasks = NULL; 1034 941 kfree(writer_durations); 942 + writer_durations = NULL; 1035 943 kfree(writer_n_durations); 944 + writer_n_durations = NULL; 945 + kfree(writer_done); 946 + writer_done = NULL; 947 + kfree(writer_freelists); 948 + writer_freelists = NULL; 1036 949 } 1037 950 1038 951 /* Do torture-type-specific cleanup operations. 
*/ ··· 1082 949 static int __init 1083 950 rcu_scale_init(void) 1084 951 { 1085 - long i; 1086 952 int firsterr = 0; 953 + long i; 954 + long j; 1087 955 static struct rcu_scale_ops *scale_ops[] = { 1088 956 &rcu_ops, &srcu_ops, &srcud_ops, TASKS_OPS TASKS_RUDE_OPS TASKS_TRACING_OPS 1089 957 }; ··· 1151 1017 } 1152 1018 while (atomic_read(&n_rcu_scale_reader_started) < nrealreaders) 1153 1019 schedule_timeout_uninterruptible(1); 1154 - writer_tasks = kcalloc(nrealwriters, sizeof(reader_tasks[0]), 1155 - GFP_KERNEL); 1156 - writer_durations = kcalloc(nrealwriters, sizeof(*writer_durations), 1157 - GFP_KERNEL); 1158 - writer_n_durations = 1159 - kcalloc(nrealwriters, sizeof(*writer_n_durations), 1160 - GFP_KERNEL); 1161 - if (!writer_tasks || !writer_durations || !writer_n_durations) { 1020 + writer_tasks = kcalloc(nrealwriters, sizeof(writer_tasks[0]), GFP_KERNEL); 1021 + writer_durations = kcalloc(nrealwriters, sizeof(*writer_durations), GFP_KERNEL); 1022 + writer_n_durations = kcalloc(nrealwriters, sizeof(*writer_n_durations), GFP_KERNEL); 1023 + writer_done = kcalloc(nrealwriters, sizeof(writer_done[0]), GFP_KERNEL); 1024 + if (gp_async) { 1025 + if (gp_async_max <= 0) { 1026 + pr_warn("%s: gp_async_max = %d must be greater than zero.\n", 1027 + __func__, gp_async_max); 1028 + WARN_ON_ONCE(IS_BUILTIN(CONFIG_RCU_TORTURE_TEST)); 1029 + firsterr = -EINVAL; 1030 + goto unwind; 1031 + } 1032 + writer_freelists = kcalloc(nrealwriters, sizeof(writer_freelists[0]), GFP_KERNEL); 1033 + } 1034 + if (!writer_tasks || !writer_durations || !writer_n_durations || !writer_done || 1035 + (gp_async && !writer_freelists)) { 1162 1036 SCALEOUT_ERRSTRING("out of memory"); 1163 1037 firsterr = -ENOMEM; 1164 1038 goto unwind; ··· 1178 1036 if (!writer_durations[i]) { 1179 1037 firsterr = -ENOMEM; 1180 1038 goto unwind; 1039 + } 1040 + if (writer_freelists) { 1041 + struct writer_freelist *wflp = &writer_freelists[i]; 1042 + 1043 + init_llist_head(&wflp->ws_lhg); 1044 + 
init_llist_head(&wflp->ws_lhp); 1045 + wflp->ws_mblocks = kcalloc(gp_async_max, sizeof(wflp->ws_mblocks[0]), 1046 + GFP_KERNEL); 1047 + if (!wflp->ws_mblocks) { 1048 + firsterr = -ENOMEM; 1049 + goto unwind; 1050 + } 1051 + for (j = 0; j < gp_async_max; j++) { 1052 + struct writer_mblock *wmbp = &wflp->ws_mblocks[j]; 1053 + 1054 + wmbp->wmb_wfl = wflp; 1055 + llist_add(&wmbp->wmb_node, &wflp->ws_lhp); 1056 + } 1181 1057 } 1182 1058 firsterr = torture_create_kthread(rcu_scale_writer, (void *)i, 1183 1059 writer_tasks[i]);
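The hunks above replace per-writer `kmalloc()`/`kfree()` of `rcu_head` structures with a two-list freelist: the callback (which may run on any CPU) pushes blocks onto a shared lock-free list (`ws_lhg`, the kernel's `llist`), while the owning writer task pops from a private list (`ws_lhp`), bulk-refilling it only when empty. Below is a userspace sketch of that pattern, assuming a single consumer per freelist as in the patch; `struct mblock`, `fl_free()`, and `fl_alloc()` are illustrative stand-ins for the kernel's `llist` API, not the actual implementation.

```c
// Userspace sketch of the rcuscale two-list freelist: a C11 atomic stack
// stands in for the kernel's llist.  fl_free() may be called from any
// thread; fl_alloc() must only be called by the single owning thread
// (which is what makes the exchange-based bulk refill safe).
#include <stdatomic.h>
#include <stddef.h>

struct mblock {
	struct mblock *next;
};

struct freelist {
	_Atomic(struct mblock *) shared;	/* ws_lhg analog: freed from anywhere */
	struct mblock *private_list;		/* ws_lhp analog: owner-only */
};

/* Free side: lock-free push onto the shared list (llist_add analog). */
static void fl_free(struct freelist *fl, struct mblock *m)
{
	struct mblock *old = atomic_load(&fl->shared);

	do {
		m->next = old;
	} while (!atomic_compare_exchange_weak(&fl->shared, &old, m));
}

/* Alloc side: pop from the private list, refilling it with one atomic
 * exchange (llist_del_all analog) when it runs dry. */
static struct mblock *fl_alloc(struct freelist *fl)
{
	struct mblock *m;

	if (!fl->private_list)
		fl->private_list = atomic_exchange(&fl->shared, NULL);
	m = fl->private_list;
	if (m)
		fl->private_list = m->next;
	return m;
}
```

The single-consumer restriction is what keeps this simple: only the owner ever pops, so the classic ABA hazard of lock-free stack pops never arises.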
kernel/rcu/rcutorture.c  +79 -42
···
 torture_param(bool, stall_no_softlockup, false, "Avoid softlockup warning during cpu stall.");
 torture_param(int, stall_cpu_irqsoff, 0, "Disable interrupts while stalling.");
 torture_param(int, stall_cpu_block, 0, "Sleep while stalling.");
+torture_param(int, stall_cpu_repeat, 0, "Number of additional stalls after the first one.");
 torture_param(int, stall_gp_kthread, 0, "Grace-period kthread stall duration (s).");
 torture_param(int, stat_interval, 60, "Number of seconds between stats printk()s");
 torture_param(int, stutter, 5, "Number of seconds to run/halt test");
···
 	bool (*same_gp_state_full)(struct rcu_gp_oldstate *rgosp1, struct rcu_gp_oldstate *rgosp2);
 	unsigned long (*get_gp_state)(void);
 	void (*get_gp_state_full)(struct rcu_gp_oldstate *rgosp);
-	unsigned long (*get_gp_completed)(void);
-	void (*get_gp_completed_full)(struct rcu_gp_oldstate *rgosp);
 	unsigned long (*start_gp_poll)(void);
 	void (*start_gp_poll_full)(struct rcu_gp_oldstate *rgosp);
 	bool (*poll_gp_state)(unsigned long oldstate);
···
 	bool (*poll_need_2gp)(bool poll, bool poll_full);
 	void (*cond_sync)(unsigned long oldstate);
 	void (*cond_sync_full)(struct rcu_gp_oldstate *rgosp);
+	int poll_active;
+	int poll_active_full;
 	call_rcu_func_t call;
 	void (*cb_barrier)(void);
 	void (*fqs)(void);
···
 	.get_comp_state_full	= get_completed_synchronize_rcu_full,
 	.get_gp_state		= get_state_synchronize_rcu,
 	.get_gp_state_full	= get_state_synchronize_rcu_full,
-	.get_gp_completed	= get_completed_synchronize_rcu,
-	.get_gp_completed_full	= get_completed_synchronize_rcu_full,
 	.start_gp_poll		= start_poll_synchronize_rcu,
 	.start_gp_poll_full	= start_poll_synchronize_rcu_full,
 	.poll_gp_state		= poll_state_synchronize_rcu,
···
 	.poll_need_2gp		= rcu_poll_need_2gp,
 	.cond_sync		= cond_synchronize_rcu,
 	.cond_sync_full		= cond_synchronize_rcu_full,
+	.poll_active		= NUM_ACTIVE_RCU_POLL_OLDSTATE,
+	.poll_active_full	= NUM_ACTIVE_RCU_POLL_FULL_OLDSTATE,
 	.get_gp_state_exp	= get_state_synchronize_rcu,
 	.start_gp_poll_exp	= start_poll_synchronize_rcu_expedited,
 	.start_gp_poll_exp_full	= start_poll_synchronize_rcu_expedited_full,
···
 	.deferred_free	= srcu_torture_deferred_free,
 	.sync		= srcu_torture_synchronize,
 	.exp_sync	= srcu_torture_synchronize_expedited,
+	.same_gp_state	= same_state_synchronize_srcu,
+	.get_comp_state	= get_completed_synchronize_srcu,
 	.get_gp_state	= srcu_torture_get_gp_state,
 	.start_gp_poll	= srcu_torture_start_gp_poll,
 	.poll_gp_state	= srcu_torture_poll_gp_state,
+	.poll_active	= NUM_ACTIVE_SRCU_POLL_OLDSTATE,
 	.call		= srcu_torture_call,
 	.cb_barrier	= srcu_torture_barrier,
 	.stats		= srcu_torture_stats,
···
 	.deferred_free	= srcu_torture_deferred_free,
 	.sync		= srcu_torture_synchronize,
 	.exp_sync	= srcu_torture_synchronize_expedited,
+	.same_gp_state	= same_state_synchronize_srcu,
+	.get_comp_state	= get_completed_synchronize_srcu,
 	.get_gp_state	= srcu_torture_get_gp_state,
 	.start_gp_poll	= srcu_torture_start_gp_poll,
 	.poll_gp_state	= srcu_torture_poll_gp_state,
+	.poll_active	= NUM_ACTIVE_SRCU_POLL_OLDSTATE,
 	.call		= srcu_torture_call,
 	.cb_barrier	= srcu_torture_barrier,
 	.stats		= srcu_torture_stats,
···
 * Definitions for rude RCU-tasks torture testing.
 */
 
-static void rcu_tasks_rude_torture_deferred_free(struct rcu_torture *p)
-{
-	call_rcu_tasks_rude(&p->rtort_rcu, rcu_torture_cb);
-}
-
 static struct rcu_torture_ops tasks_rude_ops = {
 	.ttype		= RCU_TASKS_RUDE_FLAVOR,
 	.init		= rcu_sync_torture_init,
···
 	.read_delay	= rcu_read_delay,  /* just reuse rcu's version. */
 	.readunlock	= rcu_torture_read_unlock_trivial,
 	.get_gp_seq	= rcu_no_completed,
-	.deferred_free	= rcu_tasks_rude_torture_deferred_free,
 	.sync		= synchronize_rcu_tasks_rude,
 	.exp_sync	= synchronize_rcu_tasks_rude,
-	.call		= call_rcu_tasks_rude,
-	.cb_barrier	= rcu_barrier_tasks_rude,
 	.gp_kthread_dbg	= show_rcu_tasks_rude_gp_kthread,
 	.get_gp_data	= rcu_tasks_rude_get_gp_data,
 	.cbflood_max	= 50000,
···
 	} else if (gp_sync && !cur_ops->sync) {
 		pr_alert("%s: gp_sync without primitives.\n", __func__);
 	}
+	pr_alert("%s: Testing %d update types.\n", __func__, nsynctypes);
 }
 
 /*
···
 	int i;
 	int idx;
 	int oldnice = task_nice(current);
-	struct rcu_gp_oldstate rgo[NUM_ACTIVE_RCU_POLL_FULL_OLDSTATE];
+	struct rcu_gp_oldstate *rgo = NULL;
+	int rgo_size = 0;
 	struct rcu_torture *rp;
 	struct rcu_torture *old_rp;
 	static DEFINE_TORTURE_RANDOM(rand);
 	unsigned long stallsdone = jiffies;
 	bool stutter_waited;
-	unsigned long ulo[NUM_ACTIVE_RCU_POLL_OLDSTATE];
+	unsigned long *ulo = NULL;
+	int ulo_size = 0;
 
 	// If a new stall test is added, this must be adjusted.
 	if (stall_cpu_holdoff + stall_gp_kthread + stall_cpu)
-		stallsdone += (stall_cpu_holdoff + stall_gp_kthread + stall_cpu + 60) * HZ;
+		stallsdone += (stall_cpu_holdoff + stall_gp_kthread + stall_cpu + 60) *
+			      HZ * (stall_cpu_repeat + 1);
 	VERBOSE_TOROUT_STRING("rcu_torture_writer task started");
 	if (!can_expedite)
 		pr_alert("%s" TORTURE_FLAG
···
 		rcu_torture_writer_state = RTWS_STOPPING;
 		torture_kthread_stopping("rcu_torture_writer");
 		return 0;
 	}
+	if (cur_ops->poll_active > 0) {
+		ulo = kzalloc(cur_ops->poll_active * sizeof(ulo[0]), GFP_KERNEL);
+		if (!WARN_ON(!ulo))
+			ulo_size = cur_ops->poll_active;
+	}
+	if (cur_ops->poll_active_full > 0) {
+		rgo = kzalloc(cur_ops->poll_active_full * sizeof(rgo[0]), GFP_KERNEL);
+		if (!WARN_ON(!rgo))
+			rgo_size = cur_ops->poll_active_full;
+	}
 
 	do {
···
 				 rcu_torture_writer_state_getname(),
 				 rcu_torture_writer_state,
 				 cookie, cur_ops->get_gp_state());
-			if (cur_ops->get_gp_completed) {
-				cookie = cur_ops->get_gp_completed();
+			if (cur_ops->get_comp_state) {
+				cookie = cur_ops->get_comp_state();
 				WARN_ON_ONCE(!cur_ops->poll_gp_state(cookie));
 			}
 			cur_ops->readunlock(idx);
···
 				 rcu_torture_writer_state_getname(),
 				 rcu_torture_writer_state,
 				 cpumask_pr_args(cpu_online_mask));
-			if (cur_ops->get_gp_completed_full) {
-				cur_ops->get_gp_completed_full(&cookie_full);
+			if (cur_ops->get_comp_state_full) {
+				cur_ops->get_comp_state_full(&cookie_full);
 				WARN_ON_ONCE(!cur_ops->poll_gp_state_full(&cookie_full));
 			}
 			cur_ops->readunlock(idx);
···
 			break;
 		case RTWS_POLL_GET:
 			rcu_torture_writer_state = RTWS_POLL_GET;
-			for (i = 0; i < ARRAY_SIZE(ulo); i++)
+			for (i = 0; i < ulo_size; i++)
 				ulo[i] = cur_ops->get_comp_state();
 			gp_snap = cur_ops->start_gp_poll();
 			rcu_torture_writer_state = RTWS_POLL_WAIT;
 			while (!cur_ops->poll_gp_state(gp_snap)) {
 				gp_snap1 = cur_ops->get_gp_state();
-				for (i = 0; i < ARRAY_SIZE(ulo); i++)
+				for (i = 0; i < ulo_size; i++)
 					if (cur_ops->poll_gp_state(ulo[i]) ||
 					    cur_ops->same_gp_state(ulo[i], gp_snap1)) {
 						ulo[i] = gp_snap1;
 						break;
 					}
-				WARN_ON_ONCE(i >= ARRAY_SIZE(ulo));
+				WARN_ON_ONCE(ulo_size > 0 && i >= ulo_size);
 				torture_hrtimeout_jiffies(torture_random(&rand) % 16,
 							  &rand);
 			}
···
 			break;
 		case RTWS_POLL_GET_FULL:
 			rcu_torture_writer_state = RTWS_POLL_GET_FULL;
-			for (i = 0; i < ARRAY_SIZE(rgo); i++)
+			for (i = 0; i < rgo_size; i++)
 				cur_ops->get_comp_state_full(&rgo[i]);
 			cur_ops->start_gp_poll_full(&gp_snap_full);
 			rcu_torture_writer_state = RTWS_POLL_WAIT_FULL;
 			while (!cur_ops->poll_gp_state_full(&gp_snap_full)) {
 				cur_ops->get_gp_state_full(&gp_snap1_full);
-				for (i = 0; i < ARRAY_SIZE(rgo); i++)
+				for (i = 0; i < rgo_size; i++)
 					if (cur_ops->poll_gp_state_full(&rgo[i]) ||
 					    cur_ops->same_gp_state_full(&rgo[i],
 									&gp_snap1_full)) {
 						rgo[i] = gp_snap1_full;
 						break;
 					}
-				WARN_ON_ONCE(i >= ARRAY_SIZE(rgo));
+				WARN_ON_ONCE(rgo_size > 0 && i >= rgo_size);
 				torture_hrtimeout_jiffies(torture_random(&rand) % 16,
 							  &rand);
 			}
···
 		pr_alert("%s" TORTURE_FLAG
 			 " Dynamic grace-period expediting was disabled.\n",
 			 torture_type);
+	kfree(ulo);
+	kfree(rgo);
 	rcu_torture_writer_state = RTWS_STOPPING;
 	torture_kthread_stopping("rcu_torture_writer");
 	return 0;
···
 		 "test_boost=%d/%d test_boost_interval=%d "
 		 "test_boost_duration=%d shutdown_secs=%d "
 		 "stall_cpu=%d stall_cpu_holdoff=%d stall_cpu_irqsoff=%d "
-		 "stall_cpu_block=%d "
+		 "stall_cpu_block=%d stall_cpu_repeat=%d "
 		 "n_barrier_cbs=%d "
 		 "onoff_interval=%d onoff_holdoff=%d "
 		 "read_exit_delay=%d read_exit_burst=%d "
···
 		 test_boost, cur_ops->can_boost,
 		 test_boost_interval, test_boost_duration, shutdown_secs,
 		 stall_cpu, stall_cpu_holdoff, stall_cpu_irqsoff,
-		 stall_cpu_block,
+		 stall_cpu_block, stall_cpu_repeat,
 		 n_barrier_cbs,
 		 onoff_interval, onoff_holdoff,
 		 read_exit_delay, read_exit_burst,
···
 * induces a CPU stall for the time specified by stall_cpu.  If a new
 * stall test is added, stallsdone in rcu_torture_writer() must be adjusted.
 */
-static int rcu_torture_stall(void *args)
+static void rcu_torture_stall_one(int rep, int irqsoff)
 {
 	int idx;
-	int ret;
 	unsigned long stop_at;
 
-	VERBOSE_TOROUT_STRING("rcu_torture_stall task started");
-	if (rcu_cpu_stall_notifiers) {
-		ret = rcu_stall_chain_notifier_register(&rcu_torture_stall_block);
-		if (ret)
-			pr_info("%s: rcu_stall_chain_notifier_register() returned %d, %sexpected.\n",
-				__func__, ret, !IS_ENABLED(CONFIG_RCU_STALL_COMMON) ? "un" : "");
-	}
 	if (stall_cpu_holdoff > 0) {
 		VERBOSE_TOROUT_STRING("rcu_torture_stall begin holdoff");
 		schedule_timeout_interruptible(stall_cpu_holdoff * HZ);
···
 		stop_at = ktime_get_seconds() + stall_cpu;
 		/* RCU CPU stall is expected behavior in following code. */
 		idx = cur_ops->readlock();
-		if (stall_cpu_irqsoff)
+		if (irqsoff)
 			local_irq_disable();
 		else if (!stall_cpu_block)
 			preempt_disable();
-		pr_alert("%s start on CPU %d.\n",
-			 __func__, raw_smp_processor_id());
+		pr_alert("%s start stall episode %d on CPU %d.\n",
+			 __func__, rep + 1, raw_smp_processor_id());
 		while (ULONG_CMP_LT((unsigned long)ktime_get_seconds(), stop_at) &&
 		       !kthread_should_stop())
 			if (stall_cpu_block) {
···
 			} else if (stall_no_softlockup) {
 				touch_softlockup_watchdog();
 			}
-		if (stall_cpu_irqsoff)
+		if (irqsoff)
 			local_irq_enable();
 		else if (!stall_cpu_block)
 			preempt_enable();
 		cur_ops->readunlock(idx);
+	}
+}
+
+/*
+ * CPU-stall kthread.  Invokes rcu_torture_stall_one() once, and then as many
+ * additional times as specified by the stall_cpu_repeat module parameter.
+ * Note that stall_cpu_irqsoff is ignored on the second and subsequent
+ * stall.
+ */
+static int rcu_torture_stall(void *args)
+{
+	int i;
+	int repeat = stall_cpu_repeat;
+	int ret;
+
+	VERBOSE_TOROUT_STRING("rcu_torture_stall task started");
+	if (repeat < 0) {
+		repeat = 0;
+		WARN_ON_ONCE(IS_BUILTIN(CONFIG_RCU_TORTURE_TEST));
+	}
+	if (rcu_cpu_stall_notifiers) {
+		ret = rcu_stall_chain_notifier_register(&rcu_torture_stall_block);
+		if (ret)
+			pr_info("%s: rcu_stall_chain_notifier_register() returned %d, %sexpected.\n",
+				__func__, ret, !IS_ENABLED(CONFIG_RCU_STALL_COMMON) ? "un" : "");
+	}
+	for (i = 0; i <= repeat; i++) {
+		if (kthread_should_stop())
+			break;
+		rcu_torture_stall_one(i, i == 0 ? stall_cpu_irqsoff : 0);
 	}
 	pr_alert("%s end.\n", __func__);
 	if (rcu_cpu_stall_notifiers && !ret) {
···
 		rcu_torture_fwd_prog_cond_resched(freed);
 		if (tick_nohz_full_enabled()) {
 			local_irq_save(flags);
-			rcu_momentary_dyntick_idle();
+			rcu_momentary_eqs();
 			local_irq_restore(flags);
 		}
 	}
···
 		rcu_torture_fwd_prog_cond_resched(n_launders + n_max_cbs);
 		if (tick_nohz_full_enabled()) {
 			local_irq_save(flags);
-			rcu_momentary_dyntick_idle();
+			rcu_momentary_eqs();
 			local_irq_restore(flags);
 		}
 	}
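The `ulo`/`rgo` hunks above swap fixed `ARRAY_SIZE()` stack arrays for heap arrays sized from the new per-flavor `poll_active` fields, with the size variable left at zero on allocation failure so the polling loops and overflow WARNs degrade to no-ops. A userspace sketch of that failure-tolerant sizing idiom follows; `struct poll_state` and the helper names are illustrative, not the rcutorture code itself.

```c
// Sketch of the "size stays 0 unless allocation succeeds" idiom from the
// rcutorture poll_active change: every consumer loops "for (i = 0; i < size)"
// and so silently skips the array when it could not be allocated, instead
// of dereferencing a NULL pointer or a wrong-sized static array.
#include <stdlib.h>

struct poll_state {
	unsigned long *ulo;	/* cookie array, NULL if unavailable */
	int ulo_size;		/* 0 unless ulo is valid */
};

static void poll_state_init(struct poll_state *ps, int poll_active)
{
	ps->ulo = NULL;
	ps->ulo_size = 0;
	if (poll_active > 0) {
		ps->ulo = calloc(poll_active, sizeof(ps->ulo[0]));
		if (ps->ulo)	/* on failure, size stays 0 and loops skip */
			ps->ulo_size = poll_active;
	}
}

static void poll_state_cleanup(struct poll_state *ps)
{
	free(ps->ulo);
	ps->ulo = NULL;
	ps->ulo_size = 0;
}
```

This is why the patched WARNs read `WARN_ON_ONCE(ulo_size > 0 && i >= ulo_size)`: a zero size is a legitimate degraded mode, not an overflow.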
kernel/rcu/refscale.c  +35 -32
···
 #include <linux/rcupdate_trace.h>
 #include <linux/reboot.h>
 #include <linux/sched.h>
+#include <linux/seq_buf.h>
 #include <linux/spinlock.h>
 #include <linux/smp.h>
 #include <linux/stat.h>
···
 	const char *name;
 };
 
-static struct ref_scale_ops *cur_ops;
+static const struct ref_scale_ops *cur_ops;
 
 static void un_delay(const int udl, const int ndl)
 {
···
 	return true;
 }
 
-static struct ref_scale_ops rcu_ops = {
+static const struct ref_scale_ops rcu_ops = {
 	.init		= rcu_sync_scale_init,
 	.readsection	= ref_rcu_read_section,
 	.delaysection	= ref_rcu_delay_section,
···
 	}
 }
 
-static struct ref_scale_ops srcu_ops = {
+static const struct ref_scale_ops srcu_ops = {
 	.init		= rcu_sync_scale_init,
 	.readsection	= srcu_ref_scale_read_section,
 	.delaysection	= srcu_ref_scale_delay_section,
···
 	un_delay(udl, ndl);
 }
 
-static struct ref_scale_ops rcu_tasks_ops = {
+static const struct ref_scale_ops rcu_tasks_ops = {
 	.init		= rcu_sync_scale_init,
 	.readsection	= rcu_tasks_ref_scale_read_section,
 	.delaysection	= rcu_tasks_ref_scale_delay_section,
···
 	}
 }
 
-static struct ref_scale_ops rcu_trace_ops = {
+static const struct ref_scale_ops rcu_trace_ops = {
 	.init		= rcu_sync_scale_init,
 	.readsection	= rcu_trace_ref_scale_read_section,
 	.delaysection	= rcu_trace_ref_scale_delay_section,
···
 	}
 }
 
-static struct ref_scale_ops refcnt_ops = {
+static const struct ref_scale_ops refcnt_ops = {
 	.init		= rcu_sync_scale_init,
 	.readsection	= ref_refcnt_section,
 	.delaysection	= ref_refcnt_delay_section,
···
 	}
 }
 
-static struct ref_scale_ops rwlock_ops = {
+static const struct ref_scale_ops rwlock_ops = {
 	.init		= ref_rwlock_init,
 	.readsection	= ref_rwlock_section,
 	.delaysection	= ref_rwlock_delay_section,
···
 	}
 }
 
-static struct ref_scale_ops rwsem_ops = {
+static const struct ref_scale_ops rwsem_ops = {
 	.init		= ref_rwsem_init,
 	.readsection	= ref_rwsem_section,
 	.delaysection	= ref_rwsem_delay_section,
···
 	preempt_enable();
 }
 
-static struct ref_scale_ops lock_ops = {
+static const struct ref_scale_ops lock_ops = {
 	.readsection	= ref_lock_section,
 	.delaysection	= ref_lock_delay_section,
 	.name		= "lock"
···
 	preempt_enable();
 }
 
-static struct ref_scale_ops lock_irq_ops = {
+static const struct ref_scale_ops lock_irq_ops = {
 	.readsection	= ref_lock_irq_section,
 	.delaysection	= ref_lock_irq_delay_section,
 	.name		= "lock-irq"
···
 	preempt_enable();
 }
 
-static struct ref_scale_ops acqrel_ops = {
+static const struct ref_scale_ops acqrel_ops = {
 	.readsection	= ref_acqrel_section,
 	.delaysection	= ref_acqrel_delay_section,
 	.name		= "acqrel"
···
 	stopopts = x;
 }
 
-static struct ref_scale_ops clock_ops = {
+static const struct ref_scale_ops clock_ops = {
 	.readsection	= ref_clock_section,
 	.delaysection	= ref_clock_delay_section,
 	.name		= "clock"
···
 	stopopts = x;
 }
 
-static struct ref_scale_ops jiffies_ops = {
+static const struct ref_scale_ops jiffies_ops = {
 	.readsection	= ref_jiffies_section,
 	.delaysection	= ref_jiffies_delay_section,
 	.name		= "jiffies"
···
 	preempt_enable();
 }
 
-static struct ref_scale_ops typesafe_ref_ops;
-static struct ref_scale_ops typesafe_lock_ops;
-static struct ref_scale_ops typesafe_seqlock_ops;
+static const struct ref_scale_ops typesafe_ref_ops;
+static const struct ref_scale_ops typesafe_lock_ops;
+static const struct ref_scale_ops typesafe_seqlock_ops;
 
 // Initialize for a typesafe test.
 static bool typesafe_init(void)
···
 }
 
 // The typesafe_init() function distinguishes these structures by address.
-static struct ref_scale_ops typesafe_ref_ops = {
+static const struct ref_scale_ops typesafe_ref_ops = {
 	.init		= typesafe_init,
 	.cleanup	= typesafe_cleanup,
 	.readsection	= typesafe_read_section,
···
 	.name		= "typesafe_ref"
 };
 
-static struct ref_scale_ops typesafe_lock_ops = {
+static const struct ref_scale_ops typesafe_lock_ops = {
 	.init		= typesafe_init,
 	.cleanup	= typesafe_cleanup,
 	.readsection	= typesafe_read_section,
···
 	.name		= "typesafe_lock"
 };
 
-static struct ref_scale_ops typesafe_seqlock_ops = {
+static const struct ref_scale_ops typesafe_seqlock_ops = {
 	.init		= typesafe_init,
 	.cleanup	= typesafe_cleanup,
 	.readsection	= typesafe_read_section,
···
 {
 	int i;
 	struct reader_task *rt;
-	char buf1[64];
+	struct seq_buf s;
 	char *buf;
 	u64 sum = 0;
 
 	buf = kmalloc(800 + 64, GFP_KERNEL);
 	if (!buf)
 		return 0;
-	buf[0] = 0;
-	sprintf(buf, "Experiment #%d (Format: <THREAD-NUM>:<Total loop time in ns>)",
-		exp_idx);
+	seq_buf_init(&s, buf, 800 + 64);
+
+	seq_buf_printf(&s, "Experiment #%d (Format: <THREAD-NUM>:<Total loop time in ns>)",
+		       exp_idx);
 
 	for (i = 0; i < n && !torture_must_stop(); i++) {
 		rt = &(reader_tasks[i]);
-		sprintf(buf1, "%d: %llu\t", i, rt->last_duration_ns);
 
 		if (i % 5 == 0)
-			strcat(buf, "\n");
-		if (strlen(buf) >= 800) {
-			pr_alert("%s", buf);
-			buf[0] = 0;
+			seq_buf_putc(&s, '\n');
+
+		if (seq_buf_used(&s) >= 800) {
+			pr_alert("%s", seq_buf_str(&s));
+			seq_buf_clear(&s);
 		}
-		strcat(buf, buf1);
+
+		seq_buf_printf(&s, "%d: %llu\t", i, rt->last_duration_ns);
 
 		sum += rt->last_duration_ns;
 	}
-	pr_alert("%s\n", buf);
+	pr_alert("%s\n", seq_buf_str(&s));
 
 	kfree(buf);
 	return sum;
···
 }
 
 static void
-ref_scale_print_module_parms(struct ref_scale_ops *cur_ops, const char *tag)
+ref_scale_print_module_parms(const struct ref_scale_ops *cur_ops, const char *tag)
 {
 	pr_alert("%s" SCALE_FLAG
 		 "--- %s: verbose=%d verbose_batched=%d shutdown=%d holdoff=%d lookup_instances=%ld loops=%ld nreaders=%d nruns=%d readdelay=%d\n", scale_type, tag,
···
 {
 	long i;
 	int firsterr = 0;
-	static struct ref_scale_ops *scale_ops[] = {
+	static const struct ref_scale_ops *scale_ops[] = {
 		&rcu_ops, &srcu_ops, RCU_TRACE_OPS RCU_TASKS_OPS &refcnt_ops, &rwlock_ops,
 		&rwsem_ops, &lock_ops, &lock_irq_ops, &acqrel_ops, &clock_ops, &jiffies_ops,
 		&typesafe_ref_ops, &typesafe_lock_ops, &typesafe_seqlock_ops,
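The `ref_scale_print()` rewrite above replaces `sprintf()`+`strcat()` string assembly (which repeatedly re-scans the buffer via `strlen()` and has no bounds checking) with the kernel's `seq_buf` API, which tracks the used length explicitly. Below is a userspace analog of that bookkeeping, assuming `struct sbuf` and its helpers as illustrative stand-ins for `seq_buf_init()`/`seq_buf_printf()`/`seq_buf_used()`:

```c
// Userspace analog of the seq_buf conversion: track the used length and
// append with vsnprintf(), so appends are O(length written) and can never
// overrun the buffer, unlike the old sprintf()+strcat() code.
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

struct sbuf {
	char *buf;
	size_t size;
	size_t used;
};

static void sbuf_init(struct sbuf *s, char *buf, size_t size)
{
	s->buf = buf;
	s->size = size;
	s->used = 0;
	buf[0] = '\0';
}

/* Append formatted text, clamping at the buffer end (seq_buf_printf analog). */
static void sbuf_printf(struct sbuf *s, const char *fmt, ...)
{
	va_list ap;
	int n;

	if (s->used >= s->size)
		return;
	va_start(ap, fmt);
	n = vsnprintf(s->buf + s->used, s->size - s->used, fmt, ap);
	va_end(ap);
	if (n > 0)
		s->used += ((size_t)n < s->size - s->used) ?
			   (size_t)n : s->size - s->used - 1;
}
```

The explicit `used` counter is also what lets the patched code test `seq_buf_used(&s) >= 800` cheaply where it previously paid for a full `strlen()` walk on every loop iteration.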
kernel/rcu/srcutree.c  +8 -3
···
 	sdp->srcu_cblist_invoking = false;
 	sdp->srcu_gp_seq_needed = ssp->srcu_sup->srcu_gp_seq;
 	sdp->srcu_gp_seq_needed_exp = ssp->srcu_sup->srcu_gp_seq;
+	sdp->srcu_barrier_head.next = &sdp->srcu_barrier_head;
 	sdp->mynode = NULL;
 	sdp->cpu = cpu;
 	INIT_WORK(&sdp->work, srcu_invoke_callbacks);
···
 	mutex_init(&ssp->srcu_sup->srcu_cb_mutex);
 	mutex_init(&ssp->srcu_sup->srcu_gp_mutex);
 	ssp->srcu_idx = 0;
-	ssp->srcu_sup->srcu_gp_seq = 0;
+	ssp->srcu_sup->srcu_gp_seq = SRCU_GP_SEQ_INITIAL_VAL;
 	ssp->srcu_sup->srcu_barrier_seq = 0;
 	mutex_init(&ssp->srcu_sup->srcu_barrier_mutex);
 	atomic_set(&ssp->srcu_sup->srcu_barrier_cpu_cnt, 0);
···
 	if (!ssp->sda)
 		goto err_free_sup;
 	init_srcu_struct_data(ssp);
-	ssp->srcu_sup->srcu_gp_seq_needed_exp = 0;
+	ssp->srcu_sup->srcu_gp_seq_needed_exp = SRCU_GP_SEQ_INITIAL_VAL;
 	ssp->srcu_sup->srcu_last_gp_end = ktime_get_mono_fast_ns();
 	if (READ_ONCE(ssp->srcu_sup->srcu_size_state) == SRCU_SIZE_SMALL && SRCU_SIZING_IS_INIT()) {
 		if (!init_srcu_struct_nodes(ssp, GFP_ATOMIC))
···
 		WRITE_ONCE(ssp->srcu_sup->srcu_size_state, SRCU_SIZE_BIG);
 	}
 	ssp->srcu_sup->srcu_ssp = ssp;
-	smp_store_release(&ssp->srcu_sup->srcu_gp_seq_needed, 0); /* Init done. */
+	smp_store_release(&ssp->srcu_sup->srcu_gp_seq_needed,
+			  SRCU_GP_SEQ_INITIAL_VAL); /* Init done. */
 	return 0;
 
 err_free_sda:
···
 	if (time_after(j, gpstart))
 		jbase += j - gpstart;
 	if (!jbase) {
+		ASSERT_EXCLUSIVE_WRITER(sup->srcu_n_exp_nodelay);
 		WRITE_ONCE(sup->srcu_n_exp_nodelay, READ_ONCE(sup->srcu_n_exp_nodelay) + 1);
 		if (READ_ONCE(sup->srcu_n_exp_nodelay) > srcu_max_nodelay_phase)
 			jbase = 1;
···
 	struct srcu_data *sdp;
 	struct srcu_struct *ssp;
 
+	rhp->next = rhp; // Mark the callback as having been invoked.
 	sdp = container_of(rhp, struct srcu_data, srcu_barrier_head);
 	ssp = sdp->ssp;
 	if (atomic_dec_and_test(&ssp->srcu_sup->srcu_barrier_cpu_cnt))
···
 	} else {
 		j = jiffies;
 		if (READ_ONCE(sup->reschedule_jiffies) == j) {
+			ASSERT_EXCLUSIVE_WRITER(sup->reschedule_count);
 			WRITE_ONCE(sup->reschedule_count, READ_ONCE(sup->reschedule_count) + 1);
 			if (READ_ONCE(sup->reschedule_count) > srcu_max_nodelay)
 				curdelay = 1;
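The two `srcu_barrier_head` hunks above mark an idle barrier callback by pointing its `->next` field at itself, both at init time and when the callback is invoked, so debug code chasing a stuck `srcu_barrier()` can tell "idle or already invoked" from "still queued" with a single pointer compare. A userspace sketch of that self-pointer sentinel follows; `struct cb_head` and the helpers are illustrative names, not the kernel's:

```c
// Sketch of the self-pointer-sentinel idiom: a list head whose ->next
// points at itself is by construction on no list, because enqueuing
// always overwrites ->next with some other value (another node or NULL).
#include <stddef.h>

struct cb_head {
	struct cb_head *next;
};

static void cb_mark_idle(struct cb_head *head)
{
	head->next = head;	/* self-pointer: on no callback list */
}

static int cb_is_idle(const struct cb_head *head)
{
	return head->next == head;
}

static void cb_enqueue(struct cb_head *head, struct cb_head **list)
{
	head->next = *list;	/* any non-self value means "queued" */
	*list = head;
}
```

The trick costs no extra storage: it reuses a field that is meaningless while the structure is off-list.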
kernel/rcu/tasks.h  +143 -71
···
  * @rtp_blkd_tasks: List of tasks blocked as readers.
  * @rtp_exit_list: List of tasks in the latter portion of do_exit().
  * @cpu: CPU number corresponding to this entry.
+ * @index: Index of this CPU in rtpcp_array of the rcu_tasks structure.
  * @rtpp: Pointer to the rcu_tasks structure.
  */
 struct rcu_tasks_percpu {
···
 	struct list_head rtp_blkd_tasks;
 	struct list_head rtp_exit_list;
 	int cpu;
+	int index;
 	struct rcu_tasks *rtpp;
 };

···
  * @init_fract: Initial backoff sleep interval.
  * @gp_jiffies: Time of last @gp_state transition.
  * @gp_start: Most recent grace-period start in jiffies.
- * @tasks_gp_seq: Number of grace periods completed since boot.
+ * @tasks_gp_seq: Number of grace periods completed since boot in upper bits.
  * @n_ipis: Number of IPIs sent to encourage grace periods to end.
  * @n_ipis_fails: Number of IPI-send failures.
  * @kthread_ptr: This flavor's grace-period/callback-invocation kthread.
···
  * @call_func: This flavor's call_rcu()-equivalent function.
  * @wait_state: Task state for synchronous grace-period waits (default TASK_UNINTERRUPTIBLE).
  * @rtpcpu: This flavor's rcu_tasks_percpu structure.
+ * @rtpcp_array: Array of pointers to rcu_tasks_percpu structure of CPUs in cpu_possible_mask.
  * @percpu_enqueue_shift: Shift down CPU ID this much when enqueuing callbacks.
  * @percpu_enqueue_lim: Number of per-CPU callback queues in use for enqueuing.
  * @percpu_dequeue_lim: Number of per-CPU callback queues in use for dequeuing.
···
  * @barrier_q_count: Number of queues being waited on.
  * @barrier_q_completion: Barrier wait/wakeup mechanism.
  * @barrier_q_seq: Sequence number for barrier operations.
+ * @barrier_q_start: Most recent barrier start in jiffies.
  * @name: This flavor's textual name.
  * @kname: This flavor's kthread name.
  */
···
 	call_rcu_func_t call_func;
 	unsigned int wait_state;
 	struct rcu_tasks_percpu __percpu *rtpcpu;
+	struct rcu_tasks_percpu **rtpcp_array;
 	int percpu_enqueue_shift;
 	int percpu_enqueue_lim;
 	int percpu_dequeue_lim;
···
 	atomic_t barrier_q_count;
 	struct completion barrier_q_completion;
 	unsigned long barrier_q_seq;
+	unsigned long barrier_q_start;
 	char *name;
 	char *kname;
 };
···
 static int rcu_task_lazy_lim __read_mostly = 32;
 module_param(rcu_task_lazy_lim, int, 0444);

+static int rcu_task_cpu_ids;
+
 /* RCU tasks grace-period state for debugging. */
 #define RTGS_INIT		0
 #define RTGS_WAIT_WAIT_CBS	1
···
 	int cpu;
 	int lim;
 	int shift;
+	int maxcpu;
+	int index = 0;

 	if (rcu_task_enqueue_lim < 0) {
 		rcu_task_enqueue_lim = 1;
···
 	}
 	lim = rcu_task_enqueue_lim;

-	if (lim > nr_cpu_ids)
-		lim = nr_cpu_ids;
-	shift = ilog2(nr_cpu_ids / lim);
-	if (((nr_cpu_ids - 1) >> shift) >= lim)
-		shift++;
-	WRITE_ONCE(rtp->percpu_enqueue_shift, shift);
-	WRITE_ONCE(rtp->percpu_dequeue_lim, lim);
-	smp_store_release(&rtp->percpu_enqueue_lim, lim);
+	rtp->rtpcp_array = kcalloc(num_possible_cpus(), sizeof(struct rcu_tasks_percpu *), GFP_KERNEL);
+	BUG_ON(!rtp->rtpcp_array);
+
 	for_each_possible_cpu(cpu) {
 		struct rcu_tasks_percpu *rtpcp = per_cpu_ptr(rtp->rtpcpu, cpu);

···
 		INIT_WORK(&rtpcp->rtp_work, rcu_tasks_invoke_cbs_wq);
 		rtpcp->cpu = cpu;
 		rtpcp->rtpp = rtp;
+		rtpcp->index = index;
+		rtp->rtpcp_array[index] = rtpcp;
+		index++;
 		if (!rtpcp->rtp_blkd_tasks.next)
 			INIT_LIST_HEAD(&rtpcp->rtp_blkd_tasks);
 		if (!rtpcp->rtp_exit_list.next)
 			INIT_LIST_HEAD(&rtpcp->rtp_exit_list);
+		rtpcp->barrier_q_head.next = &rtpcp->barrier_q_head;
+		maxcpu = cpu;
 	}

-	pr_info("%s: Setting shift to %d and lim to %d rcu_task_cb_adjust=%d.\n", rtp->name,
-		data_race(rtp->percpu_enqueue_shift), data_race(rtp->percpu_enqueue_lim), rcu_task_cb_adjust);
+	rcu_task_cpu_ids = maxcpu + 1;
+	if (lim > rcu_task_cpu_ids)
+		lim = rcu_task_cpu_ids;
+	shift = ilog2(rcu_task_cpu_ids / lim);
+	if (((rcu_task_cpu_ids - 1) >> shift) >= lim)
+		shift++;
+	WRITE_ONCE(rtp->percpu_enqueue_shift, shift);
+	WRITE_ONCE(rtp->percpu_dequeue_lim, lim);
+	smp_store_release(&rtp->percpu_enqueue_lim, lim);
+
+	pr_info("%s: Setting shift to %d and lim to %d rcu_task_cb_adjust=%d rcu_task_cpu_ids=%d.\n",
+		rtp->name, data_race(rtp->percpu_enqueue_shift), data_race(rtp->percpu_enqueue_lim),
+		rcu_task_cb_adjust, rcu_task_cpu_ids);
 }

 // Compute wakeup time for lazy callback timer.
···
 	rcu_read_lock();
 	ideal_cpu = smp_processor_id() >> READ_ONCE(rtp->percpu_enqueue_shift);
 	chosen_cpu = cpumask_next(ideal_cpu - 1, cpu_possible_mask);
+	WARN_ON_ONCE(chosen_cpu >= rcu_task_cpu_ids);
 	rtpcp = per_cpu_ptr(rtp->rtpcpu, chosen_cpu);
 	if (!raw_spin_trylock_rcu_node(rtpcp)) { // irqs already disabled.
 		raw_spin_lock_rcu_node(rtpcp); // irqs already disabled.
···
 		rtpcp->rtp_n_lock_retries = 0;
 	}
 	if (rcu_task_cb_adjust && ++rtpcp->rtp_n_lock_retries > rcu_task_contend_lim &&
-	    READ_ONCE(rtp->percpu_enqueue_lim) != nr_cpu_ids)
+	    READ_ONCE(rtp->percpu_enqueue_lim) != rcu_task_cpu_ids)
 		needadjust = true;  // Defer adjustment to avoid deadlock.
 	}
 	// Queuing callbacks before initialization not yet supported.
···
 	raw_spin_unlock_irqrestore_rcu_node(rtpcp, flags);
 	if (unlikely(needadjust)) {
 		raw_spin_lock_irqsave(&rtp->cbs_gbl_lock, flags);
-		if (rtp->percpu_enqueue_lim != nr_cpu_ids) {
+		if (rtp->percpu_enqueue_lim != rcu_task_cpu_ids) {
 			WRITE_ONCE(rtp->percpu_enqueue_shift, 0);
-			WRITE_ONCE(rtp->percpu_dequeue_lim, nr_cpu_ids);
-			smp_store_release(&rtp->percpu_enqueue_lim, nr_cpu_ids);
+			WRITE_ONCE(rtp->percpu_dequeue_lim, rcu_task_cpu_ids);
+			smp_store_release(&rtp->percpu_enqueue_lim, rcu_task_cpu_ids);
 			pr_info("Switching %s to per-CPU callback queuing.\n", rtp->name);
 		}
 		raw_spin_unlock_irqrestore(&rtp->cbs_gbl_lock, flags);
···
 	struct rcu_tasks *rtp;
 	struct rcu_tasks_percpu *rtpcp;

+	rhp->next = rhp; // Mark the callback as having been invoked.
 	rtpcp = container_of(rhp, struct rcu_tasks_percpu, barrier_q_head);
 	rtp = rtpcp->rtpp;
 	if (atomic_dec_and_test(&rtp->barrier_q_count))
···

 // Wait for all in-flight callbacks for the specified RCU Tasks flavor.
 // Operates in a manner similar to rcu_barrier().
-static void rcu_barrier_tasks_generic(struct rcu_tasks *rtp)
+static void __maybe_unused rcu_barrier_tasks_generic(struct rcu_tasks *rtp)
 {
 	int cpu;
 	unsigned long flags;
···
 		mutex_unlock(&rtp->barrier_q_mutex);
 		return;
 	}
+	rtp->barrier_q_start = jiffies;
 	rcu_seq_start(&rtp->barrier_q_seq);
 	init_completion(&rtp->barrier_q_completion);
 	atomic_set(&rtp->barrier_q_count, 2);
···

 	dequeue_limit = smp_load_acquire(&rtp->percpu_dequeue_lim);
 	for (cpu = 0; cpu < dequeue_limit; cpu++) {
+		if (!cpu_possible(cpu))
+			continue;
 		struct rcu_tasks_percpu *rtpcp = per_cpu_ptr(rtp->rtpcpu, cpu);

 		/* Advance and accelerate any new callbacks. */
···
 	if (rcu_task_cb_adjust && ncbs <= rcu_task_collapse_lim) {
 		raw_spin_lock_irqsave(&rtp->cbs_gbl_lock, flags);
 		if (rtp->percpu_enqueue_lim > 1) {
-			WRITE_ONCE(rtp->percpu_enqueue_shift, order_base_2(nr_cpu_ids));
+			WRITE_ONCE(rtp->percpu_enqueue_shift, order_base_2(rcu_task_cpu_ids));
 			smp_store_release(&rtp->percpu_enqueue_lim, 1);
 			rtp->percpu_dequeue_gpseq = get_state_synchronize_rcu();
 			gpdone = false;
···
 		pr_info("Completing switch %s to CPU-0 callback queuing.\n", rtp->name);
 	}
 	if (rtp->percpu_dequeue_lim == 1) {
-		for (cpu = rtp->percpu_dequeue_lim; cpu < nr_cpu_ids; cpu++) {
+		for (cpu = rtp->percpu_dequeue_lim; cpu < rcu_task_cpu_ids; cpu++) {
+			if (!cpu_possible(cpu))
+				continue;
 			struct rcu_tasks_percpu *rtpcp = per_cpu_ptr(rtp->rtpcpu, cpu);

 			WARN_ON_ONCE(rcu_segcblist_n_cbs(&rtpcp->cblist));
···
 // Advance callbacks and invoke any that are ready.
 static void rcu_tasks_invoke_cbs(struct rcu_tasks *rtp, struct rcu_tasks_percpu *rtpcp)
 {
-	int cpu;
-	int cpunext;
 	int cpuwq;
 	unsigned long flags;
 	int len;
+	int index;
 	struct rcu_head *rhp;
 	struct rcu_cblist rcl = RCU_CBLIST_INITIALIZER(rcl);
 	struct rcu_tasks_percpu *rtpcp_next;

-	cpu = rtpcp->cpu;
-	cpunext = cpu * 2 + 1;
-	if (cpunext < smp_load_acquire(&rtp->percpu_dequeue_lim)) {
-		rtpcp_next = per_cpu_ptr(rtp->rtpcpu, cpunext);
-		cpuwq = rcu_cpu_beenfullyonline(cpunext) ? cpunext : WORK_CPU_UNBOUND;
-		queue_work_on(cpuwq, system_wq, &rtpcp_next->rtp_work);
-		cpunext++;
-		if (cpunext < smp_load_acquire(&rtp->percpu_dequeue_lim)) {
-			rtpcp_next = per_cpu_ptr(rtp->rtpcpu, cpunext);
-			cpuwq = rcu_cpu_beenfullyonline(cpunext) ? cpunext : WORK_CPU_UNBOUND;
+	index = rtpcp->index * 2 + 1;
+	if (index < num_possible_cpus()) {
+		rtpcp_next = rtp->rtpcp_array[index];
+		if (rtpcp_next->cpu < smp_load_acquire(&rtp->percpu_dequeue_lim)) {
+			cpuwq = rcu_cpu_beenfullyonline(rtpcp_next->cpu) ? rtpcp_next->cpu : WORK_CPU_UNBOUND;
 			queue_work_on(cpuwq, system_wq, &rtpcp_next->rtp_work);
+			index++;
+			if (index < num_possible_cpus()) {
+				rtpcp_next = rtp->rtpcp_array[index];
+				if (rtpcp_next->cpu < smp_load_acquire(&rtp->percpu_dequeue_lim)) {
+					cpuwq = rcu_cpu_beenfullyonline(rtpcp_next->cpu) ? rtpcp_next->cpu : WORK_CPU_UNBOUND;
+					queue_work_on(cpuwq, system_wq, &rtpcp_next->rtp_work);
+				}
+			}
 		}
 	}

-	if (rcu_segcblist_empty(&rtpcp->cblist) || !cpu_possible(cpu))
+	if (rcu_segcblist_empty(&rtpcp->cblist))
 		return;
 	raw_spin_lock_irqsave_rcu_node(rtpcp, flags);
 	rcu_segcblist_advance(&rtpcp->cblist, rcu_seq_current(&rtp->tasks_gp_seq));
···
 #endif /* #ifdef CONFIG_TASKS_TRACE_RCU */
 }

-#endif /* #ifndef CONFIG_TINY_RCU */

-#ifndef CONFIG_TINY_RCU
 /* Dump out rcutorture-relevant state common to all RCU-tasks flavors. */
 static void show_rcu_tasks_generic_gp_kthread(struct rcu_tasks *rtp, char *s)
 {
···
 		rtp->lazy_jiffies,
 		s);
 }
+
+/* Dump out more rcutorture-relevant state common to all RCU-tasks flavors. */
+static void rcu_tasks_torture_stats_print_generic(struct rcu_tasks *rtp, char *tt,
+						  char *tf, char *tst)
+{
+	cpumask_var_t cm;
+	int cpu;
+	bool gotcb = false;
+	unsigned long j = jiffies;
+
+	pr_alert("%s%s Tasks%s RCU g%ld gp_start %lu gp_jiffies %lu gp_state %d (%s).\n",
+		 tt, tf, tst, data_race(rtp->tasks_gp_seq),
+		 j - data_race(rtp->gp_start), j - data_race(rtp->gp_jiffies),
+		 data_race(rtp->gp_state), tasks_gp_state_getname(rtp));
+	pr_alert("\tEnqueue shift %d limit %d Dequeue limit %d gpseq %lu.\n",
+		 data_race(rtp->percpu_enqueue_shift),
+		 data_race(rtp->percpu_enqueue_lim),
+		 data_race(rtp->percpu_dequeue_lim),
+		 data_race(rtp->percpu_dequeue_gpseq));
+	(void)zalloc_cpumask_var(&cm, GFP_KERNEL);
+	pr_alert("\tCallback counts:");
+	for_each_possible_cpu(cpu) {
+		long n;
+		struct rcu_tasks_percpu *rtpcp = per_cpu_ptr(rtp->rtpcpu, cpu);
+
+		if (cpumask_available(cm) && !rcu_barrier_cb_is_done(&rtpcp->barrier_q_head))
+			cpumask_set_cpu(cpu, cm);
+		n = rcu_segcblist_n_cbs(&rtpcp->cblist);
+		if (!n)
+			continue;
+		pr_cont(" %d:%ld", cpu, n);
+		gotcb = true;
+	}
+	if (gotcb)
+		pr_cont(".\n");
+	else
+		pr_cont(" (none).\n");
+	pr_alert("\tBarrier seq %lu start %lu count %d holdout CPUs ",
+		 data_race(rtp->barrier_q_seq), j - data_race(rtp->barrier_q_start),
+		 atomic_read(&rtp->barrier_q_count));
+	if (cpumask_available(cm) && !cpumask_empty(cm))
+		pr_cont(" %*pbl.\n", cpumask_pr_args(cm));
+	else
+		pr_cont("(none).\n");
+	free_cpumask_var(cm);
+}
+
 #endif // #ifndef CONFIG_TINY_RCU

 static void exit_tasks_rcu_finish_trace(struct task_struct *t);
···
 	show_rcu_tasks_generic_gp_kthread(&rcu_tasks, "");
 }
 EXPORT_SYMBOL_GPL(show_rcu_tasks_classic_gp_kthread);
+
+void rcu_tasks_torture_stats_print(char *tt, char *tf)
+{
+	rcu_tasks_torture_stats_print_generic(&rcu_tasks, tt, tf, "");
+}
+EXPORT_SYMBOL_GPL(rcu_tasks_torture_stats_print);
 #endif // !defined(CONFIG_TINY_RCU)

 struct task_struct *get_rcu_tasks_gp_kthread(void)
···

 ////////////////////////////////////////////////////////////////////////
 //
-// "Rude" variant of Tasks RCU, inspired by Steve Rostedt's trick of
-// passing an empty function to schedule_on_each_cpu().  This approach
-// provides an asynchronous call_rcu_tasks_rude() API and batching of
-// concurrent calls to the synchronous synchronize_rcu_tasks_rude() API.
-// This invokes schedule_on_each_cpu() in order to send IPIs far and wide
-// and induces otherwise unnecessary context switches on all online CPUs,
-// whether idle or not.
+// "Rude" variant of Tasks RCU, inspired by Steve Rostedt's
+// trick of passing an empty function to schedule_on_each_cpu().
+// This approach provides batching of concurrent calls to the synchronous
+// synchronize_rcu_tasks_rude() API.  This invokes schedule_on_each_cpu()
+// in order to send IPIs far and wide and induces otherwise unnecessary
+// context switches on all online CPUs, whether idle or not.
 //
 // Callback handling is provided by the rcu_tasks_kthread() function.
 //
···
 	schedule_on_each_cpu(rcu_tasks_be_rude);
 }

-void call_rcu_tasks_rude(struct rcu_head *rhp, rcu_callback_t func);
+static void call_rcu_tasks_rude(struct rcu_head *rhp, rcu_callback_t func);
 DEFINE_RCU_TASKS(rcu_tasks_rude, rcu_tasks_rude_wait_gp, call_rcu_tasks_rude,
 		 "RCU Tasks Rude");

-/**
+/*
  * call_rcu_tasks_rude() - Queue a callback rude task-based grace period
  * @rhp: structure to be used for queueing the RCU updates.
  * @func: actual callback function to be invoked after the grace period
···
  *
  * See the description of call_rcu() for more detailed information on
  * memory ordering guarantees.
+ *
+ * This is no longer exported, and is instead reserved for use by
+ * synchronize_rcu_tasks_rude().
  */
-void call_rcu_tasks_rude(struct rcu_head *rhp, rcu_callback_t func)
+static void call_rcu_tasks_rude(struct rcu_head *rhp, rcu_callback_t func)
 {
 	call_rcu_tasks_generic(rhp, func, &rcu_tasks_rude);
 }
-EXPORT_SYMBOL_GPL(call_rcu_tasks_rude);

 /**
  * synchronize_rcu_tasks_rude - wait for a rude rcu-tasks grace period
···
 }
 EXPORT_SYMBOL_GPL(synchronize_rcu_tasks_rude);

-/**
- * rcu_barrier_tasks_rude - Wait for in-flight call_rcu_tasks_rude() callbacks.
- *
- * Although the current implementation is guaranteed to wait, it is not
- * obligated to, for example, if there are no pending callbacks.
- */
-void rcu_barrier_tasks_rude(void)
-{
-	rcu_barrier_tasks_generic(&rcu_tasks_rude);
-}
-EXPORT_SYMBOL_GPL(rcu_barrier_tasks_rude);
-
-int rcu_tasks_rude_lazy_ms = -1;
-module_param(rcu_tasks_rude_lazy_ms, int, 0444);
-
 static int __init rcu_spawn_tasks_rude_kthread(void)
 {
 	rcu_tasks_rude.gp_sleep = HZ / 10;
-	if (rcu_tasks_rude_lazy_ms >= 0)
-		rcu_tasks_rude.lazy_jiffies = msecs_to_jiffies(rcu_tasks_rude_lazy_ms);
 	rcu_spawn_tasks_kthread_generic(&rcu_tasks_rude);
 	return 0;
 }
···
 	show_rcu_tasks_generic_gp_kthread(&rcu_tasks_rude, "");
 }
 EXPORT_SYMBOL_GPL(show_rcu_tasks_rude_gp_kthread);
+
+void rcu_tasks_rude_torture_stats_print(char *tt, char *tf)
+{
+	rcu_tasks_torture_stats_print_generic(&rcu_tasks_rude, tt, tf, "");
+}
+EXPORT_SYMBOL_GPL(rcu_tasks_rude_torture_stats_print);
 #endif // !defined(CONFIG_TINY_RCU)

 struct task_struct *get_rcu_tasks_rude_gp_kthread(void)
···
 	// However, we cannot safely change its state.
 	n_heavy_reader_attempts++;
 	// Check for "running" idle tasks on offline CPUs.
-	if (!rcu_dynticks_zero_in_eqs(cpu, &t->trc_reader_nesting))
+	if (!rcu_watching_zero_in_eqs(cpu, &t->trc_reader_nesting))
 		return -EINVAL;  // No quiescent state, do it the hard way.
 	n_heavy_reader_updates++;
 	nesting = 0;
···
 	show_rcu_tasks_generic_gp_kthread(&rcu_tasks_trace, buf);
 }
 EXPORT_SYMBOL_GPL(show_rcu_tasks_trace_gp_kthread);
+
+void rcu_tasks_trace_torture_stats_print(char *tt, char *tf)
+{
+	rcu_tasks_torture_stats_print_generic(&rcu_tasks_trace, tt, tf, "");
+}
+EXPORT_SYMBOL_GPL(rcu_tasks_trace_torture_stats_print);
 #endif // !defined(CONFIG_TINY_RCU)

 struct task_struct *get_rcu_tasks_trace_gp_kthread(void)
···
 		.notrun = IS_ENABLED(CONFIG_TASKS_RCU),
 	},
 	{
-		.name = "call_rcu_tasks_rude()",
-		/* If not defined, the test is skipped. */
-		.notrun = IS_ENABLED(CONFIG_TASKS_RUDE_RCU),
-	},
-	{
 		.name = "call_rcu_tasks_trace()",
 		/* If not defined, the test is skipped. */
 		.notrun = IS_ENABLED(CONFIG_TASKS_TRACE_RCU)
 	}
 };

+#if defined(CONFIG_TASKS_RCU) || defined(CONFIG_TASKS_TRACE_RCU)
 static void test_rcu_tasks_callback(struct rcu_head *rhp)
 {
 	struct rcu_tasks_test_desc *rttd =
···

 	rttd->notrun = false;
 }
+#endif // #if defined(CONFIG_TASKS_RCU) || defined(CONFIG_TASKS_TRACE_RCU)

 static void rcu_tasks_initiate_self_tests(void)
 {
···

 #ifdef CONFIG_TASKS_RUDE_RCU
 	pr_info("Running RCU Tasks Rude wait API self tests\n");
-	tests[1].runstart = jiffies;
 	synchronize_rcu_tasks_rude();
-	call_rcu_tasks_rude(&tests[1].rh, test_rcu_tasks_callback);
 #endif

 #ifdef CONFIG_TASKS_TRACE_RCU
 	pr_info("Running RCU Tasks Trace wait API self tests\n");
-	tests[2].runstart = jiffies;
+	tests[1].runstart = jiffies;
 	synchronize_rcu_tasks_trace();
-	call_rcu_tasks_trace(&tests[2].rh, test_rcu_tasks_callback);
+	call_rcu_tasks_trace(&tests[1].rh, test_rcu_tasks_callback);
 #endif
 }
+79 -95
kernel/rcu/tree.c
···

 static DEFINE_PER_CPU_SHARED_ALIGNED(struct rcu_data, rcu_data) = {
 	.gpwrap = true,
-#ifdef CONFIG_RCU_NOCB_CPU
-	.cblist.flags = SEGCBLIST_RCU_CORE,
-#endif
 };
 static struct rcu_state rcu_state = {
 	.level = { &rcu_state.node[0] },
···
 	.srs_cleanup_work = __WORK_INITIALIZER(rcu_state.srs_cleanup_work,
 		rcu_sr_normal_gp_cleanup_work),
 	.srs_cleanups_pending = ATOMIC_INIT(0),
+#ifdef CONFIG_RCU_NOCB_CPU
+	.nocb_mutex = __MUTEX_INITIALIZER(rcu_state.nocb_mutex),
+#endif
 };

 /* Dump rcu_node combining tree at boot to verify correct setup. */
···
 }

 /*
- * Reset the current CPU's ->dynticks counter to indicate that the
+ * Reset the current CPU's RCU_WATCHING counter to indicate that the
  * newly onlined CPU is no longer in an extended quiescent state.
  * This will either leave the counter unchanged, or increment it
  * to the next non-quiescent value.
  *
  * The non-atomic test/increment sequence works because the upper bits
- * of the ->dynticks counter are manipulated only by the corresponding CPU,
+ * of the ->state variable are manipulated only by the corresponding CPU,
  * or when the corresponding CPU is offline.
  */
-static void rcu_dynticks_eqs_online(void)
+static void rcu_watching_online(void)
 {
-	if (ct_dynticks() & RCU_DYNTICKS_IDX)
+	if (ct_rcu_watching() & CT_RCU_WATCHING)
 		return;
-	ct_state_inc(RCU_DYNTICKS_IDX);
+	ct_state_inc(CT_RCU_WATCHING);
 }

 /*
- * Return true if the snapshot returned from rcu_dynticks_snap()
+ * Return true if the snapshot returned from ct_rcu_watching()
  * indicates that RCU is in an extended quiescent state.
  */
-static bool rcu_dynticks_in_eqs(int snap)
+static bool rcu_watching_snap_in_eqs(int snap)
 {
-	return !(snap & RCU_DYNTICKS_IDX);
+	return !(snap & CT_RCU_WATCHING);
 }

-/*
- * Return true if the CPU corresponding to the specified rcu_data
- * structure has spent some time in an extended quiescent state since
- * rcu_dynticks_snap() returned the specified snapshot.
+/**
+ * rcu_watching_snap_stopped_since() - Has RCU stopped watching a given CPU
+ * since the specified @snap?
+ *
+ * @rdp: The rcu_data corresponding to the CPU for which to check EQS.
+ * @snap: rcu_watching snapshot taken when the CPU wasn't in an EQS.
+ *
+ * Returns true if the CPU corresponding to @rdp has spent some time in an
+ * extended quiescent state since @snap. Note that this doesn't check if it
+ * /still/ is in an EQS, just that it went through one since @snap.
+ *
+ * This is meant to be used in a loop waiting for a CPU to go through an EQS.
  */
-static bool rcu_dynticks_in_eqs_since(struct rcu_data *rdp, int snap)
+static bool rcu_watching_snap_stopped_since(struct rcu_data *rdp, int snap)
 {
 	/*
 	 * The first failing snapshot is already ordered against the accesses
···
 	 * performed by the remote CPU prior to entering idle and therefore can
 	 * rely solely on acquire semantics.
 	 */
-	return snap != ct_dynticks_cpu_acquire(rdp->cpu);
+	if (WARN_ON_ONCE(rcu_watching_snap_in_eqs(snap)))
+		return true;
+
+	return snap != ct_rcu_watching_cpu_acquire(rdp->cpu);
 }

 /*
  * Return true if the referenced integer is zero while the specified
  * CPU remains within a single extended quiescent state.
  */
-bool rcu_dynticks_zero_in_eqs(int cpu, int *vp)
+bool rcu_watching_zero_in_eqs(int cpu, int *vp)
 {
 	int snap;

 	// If not quiescent, force back to earlier extended quiescent state.
-	snap = ct_dynticks_cpu(cpu) & ~RCU_DYNTICKS_IDX;
-	smp_rmb(); // Order ->dynticks and *vp reads.
+	snap = ct_rcu_watching_cpu(cpu) & ~CT_RCU_WATCHING;
+	smp_rmb(); // Order CT state and *vp reads.
 	if (READ_ONCE(*vp))
 		return false;  // Non-zero, so report failure;
-	smp_rmb(); // Order *vp read and ->dynticks re-read.
+	smp_rmb(); // Order *vp read and CT state re-read.

 	// If still in the same extended quiescent state, we are good!
-	return snap == ct_dynticks_cpu(cpu);
+	return snap == ct_rcu_watching_cpu(cpu);
 }

 /*
···
  *
  * The caller must have disabled interrupts and must not be idle.
  */
-notrace void rcu_momentary_dyntick_idle(void)
+notrace void rcu_momentary_eqs(void)
 {
 	int seq;

 	raw_cpu_write(rcu_data.rcu_need_heavy_qs, false);
-	seq = ct_state_inc(2 * RCU_DYNTICKS_IDX);
+	seq = ct_state_inc(2 * CT_RCU_WATCHING);
 	/* It is illegal to call this from idle state. */
-	WARN_ON_ONCE(!(seq & RCU_DYNTICKS_IDX));
+	WARN_ON_ONCE(!(seq & CT_RCU_WATCHING));
 	rcu_preempt_deferred_qs(current);
 }
-EXPORT_SYMBOL_GPL(rcu_momentary_dyntick_idle);
+EXPORT_SYMBOL_GPL(rcu_momentary_eqs);

 /**
  * rcu_is_cpu_rrupt_from_idle - see if 'interrupted' from idle
···
 	lockdep_assert_irqs_disabled();

 	/* Check for counter underflows */
-	RCU_LOCKDEP_WARN(ct_dynticks_nesting() < 0,
-			 "RCU dynticks_nesting counter underflow!");
-	RCU_LOCKDEP_WARN(ct_dynticks_nmi_nesting() <= 0,
-			 "RCU dynticks_nmi_nesting counter underflow/zero!");
+	RCU_LOCKDEP_WARN(ct_nesting() < 0,
+			 "RCU nesting counter underflow!");
+	RCU_LOCKDEP_WARN(ct_nmi_nesting() <= 0,
+			 "RCU nmi_nesting counter underflow/zero!");

 	/* Are we at first interrupt nesting level? */
-	nesting = ct_dynticks_nmi_nesting();
+	nesting = ct_nmi_nesting();
 	if (nesting > 1)
 		return false;

···
 	WARN_ON_ONCE(!nesting && !is_idle_task(current));

 	/* Does CPU appear to be idle from an RCU standpoint? */
-	return ct_dynticks_nesting() == 0;
+	return ct_nesting() == 0;
 }

 #define DEFAULT_RCU_BLIMIT (IS_ENABLED(CONFIG_RCU_STRICT_GRACE_PERIOD) ? 1000 : 10)
···
 {
 	lockdep_assert_irqs_disabled();

-	RCU_LOCKDEP_WARN(ct_dynticks_nesting() <= 0,
-			 "RCU dynticks_nesting counter underflow/zero!");
-	RCU_LOCKDEP_WARN(ct_dynticks_nmi_nesting() !=
-			 DYNTICK_IRQ_NONIDLE,
-			 "Bad RCU dynticks_nmi_nesting counter\n");
-	RCU_LOCKDEP_WARN(rcu_dynticks_curr_cpu_in_eqs(),
+	RCU_LOCKDEP_WARN(ct_nesting() <= 0,
+			 "RCU nesting counter underflow/zero!");
+	RCU_LOCKDEP_WARN(ct_nmi_nesting() !=
+			 CT_NESTING_IRQ_NONIDLE,
+			 "Bad RCU nmi_nesting counter\n");
+	RCU_LOCKDEP_WARN(!rcu_is_watching_curr_cpu(),
 			 "RCU in extended quiescent state!");
 }
 #endif /* #ifdef CONFIG_PROVE_RCU */
···
 	if (in_nmi())
 		return;

-	RCU_LOCKDEP_WARN(rcu_dynticks_curr_cpu_in_eqs(),
+	RCU_LOCKDEP_WARN(!rcu_is_watching_curr_cpu(),
 			 "Illegal rcu_irq_enter_check_tick() from extended quiescent state");

 	if (!tick_nohz_full_cpu(rdp->cpu) ||
···
 	bool ret;

 	preempt_disable_notrace();
-	ret = !rcu_dynticks_curr_cpu_in_eqs();
+	ret = rcu_is_watching_curr_cpu();
 	preempt_enable_notrace();
 	return ret;
 }
···
 }

 /*
- * Snapshot the specified CPU's dynticks counter so that we can later
+ * Snapshot the specified CPU's RCU_WATCHING counter so that we can later
  * credit them with an implicit quiescent state.  Return 1 if this CPU
  * is in dynticks idle mode, which is an extended quiescent state.
  */
-static int dyntick_save_progress_counter(struct rcu_data *rdp)
+static int rcu_watching_snap_save(struct rcu_data *rdp)
 {
 	/*
 	 * Full ordering between remote CPU's post idle accesses and updater's
···
 	 * Ordering between remote CPU's pre idle accesses and post grace period
 	 * updater's accesses is enforced by the below acquire semantic.
 	 */
-	rdp->dynticks_snap = ct_dynticks_cpu_acquire(rdp->cpu);
-	if (rcu_dynticks_in_eqs(rdp->dynticks_snap)) {
+	rdp->watching_snap = ct_rcu_watching_cpu_acquire(rdp->cpu);
+	if (rcu_watching_snap_in_eqs(rdp->watching_snap)) {
 		trace_rcu_fqs(rcu_state.name, rdp->gp_seq, rdp->cpu, TPS("dti"));
 		rcu_gpnum_ovf(rdp->mynode, rdp);
 		return 1;
···
 /*
  * Returns positive if the specified CPU has passed through a quiescent state
  * by virtue of being in or having passed through an dynticks idle state since
- * the last call to dyntick_save_progress_counter() for this same CPU, or by
+ * the last call to rcu_watching_snap_save() for this same CPU, or by
  * virtue of having been offline.
  *
  * Returns negative if the specified CPU needs a force resched.
  *
  * Returns zero otherwise.
  */
-static int rcu_implicit_dynticks_qs(struct rcu_data *rdp)
+static int rcu_watching_snap_recheck(struct rcu_data *rdp)
 {
 	unsigned long jtsq;
 	int ret = 0;
···
 	 * read-side critical section that started before the beginning
 	 * of the current RCU grace period.
 	 */
-	if (rcu_dynticks_in_eqs_since(rdp, rdp->dynticks_snap)) {
+	if (rcu_watching_snap_stopped_since(rdp, rdp->watching_snap)) {
 		trace_rcu_fqs(rcu_state.name, rdp->gp_seq, rdp->cpu, TPS("dti"));
 		rcu_gpnum_ovf(rnp, rdp);
 		return 1;
···
 	 * the done tail list manipulations are protected here.
 	 */
 	done = smp_load_acquire(&rcu_state.srs_done_tail);
-	if (!done)
+	if (WARN_ON_ONCE(!done))
 		return;

 	WARN_ON_ONCE(!rcu_sr_is_wait_head(done));
···

 	if (first_time) {
 		/* Collect dyntick-idle snapshots. */
-		force_qs_rnp(dyntick_save_progress_counter);
+		force_qs_rnp(rcu_watching_snap_save);
 	} else {
 		/* Handle dyntick-idle and offline CPUs. */
-		force_qs_rnp(rcu_implicit_dynticks_qs);
+		force_qs_rnp(rcu_watching_snap_recheck);
 	}
 	/* Clear flag to prevent immediate re-entry. */
 	if (READ_ONCE(rcu_state.gp_flags) & RCU_GP_FLAG_FQS) {
···
 {
 	unsigned long flags;
 	unsigned long mask;
-	bool needacc = false;
 	struct rcu_node *rnp;

 	WARN_ON_ONCE(rdp->cpu != smp_processor_id());
···
 		 * to return true.  So complain, but don't awaken.
 		 */
 		WARN_ON_ONCE(rcu_accelerate_cbs(rnp, rdp));
-	} else if (!rcu_segcblist_completely_offloaded(&rdp->cblist)) {
-		/*
-		 * ...but NOCB kthreads may miss or delay callbacks acceleration
-		 * if in the middle of a (de-)offloading process.
-		 */
-		needacc = true;
 	}

 	rcu_disable_urgency_upon_qs(rdp);
 	rcu_report_qs_rnp(mask, rnp, rnp->gp_seq, flags);
 	/* ^^^ Released rnp->lock */
-
-		if (needacc) {
-			rcu_nocb_lock_irqsave(rdp, flags);
-			rcu_accelerate_cbs_unlocked(rnp, rdp);
-			rcu_nocb_unlock_irqrestore(rdp, flags);
-		}
 	}
 }
···
 	unsigned long flags;
 	struct rcu_data *rdp = raw_cpu_ptr(&rcu_data);
 	struct rcu_node *rnp = rdp->mynode;
-	/*
-	 * On RT rcu_core() can be preempted when IRQs aren't disabled.
-	 * Therefore this function can race with concurrent NOCB (de-)offloading
-	 * on this CPU and the below condition must be considered volatile.
-	 * However if we race with:
-	 *
-	 * _ Offloading:   In the worst case we accelerate or process callbacks
-	 *                 concurrently with NOCB kthreads. We are guaranteed to
-	 *                 call rcu_nocb_lock() if that happens.
-	 *
-	 * _ Deoffloading: In the worst case we miss callbacks acceleration or
-	 *                 processing. This is fine because the early stage
-	 *                 of deoffloading invokes rcu_core() after setting
-	 *                 SEGCBLIST_RCU_CORE. So we guarantee that we'll process
-	 *                 what could have been dismissed without the need to wait
-	 *                 for the next rcu_pending() check in the next jiffy.
-	 */
-	const bool do_batch = !rcu_segcblist_completely_offloaded(&rdp->cblist);

 	if (cpu_is_offline(smp_processor_id()))
 		return;
···

 	/* No grace period and unregistered callbacks? */
 	if (!rcu_gp_in_progress() &&
-	    rcu_segcblist_is_enabled(&rdp->cblist) && do_batch) {
-		rcu_nocb_lock_irqsave(rdp, flags);
+	    rcu_segcblist_is_enabled(&rdp->cblist) && !rcu_rdp_is_offloaded(rdp)) {
+		local_irq_save(flags);
 		if (!rcu_segcblist_restempty(&rdp->cblist, RCU_NEXT_READY_TAIL))
 			rcu_accelerate_cbs_unlocked(rnp, rdp);
-		rcu_nocb_unlock_irqrestore(rdp, flags);
+		local_irq_restore(flags);
 	}

 	rcu_check_gp_start_stall(rnp, rdp, rcu_jiffies_till_stall_check());

 	/* If there are callbacks ready, invoke them. */
-	if (do_batch && rcu_segcblist_ready_cbs(&rdp->cblist) &&
+	if (!rcu_rdp_is_offloaded(rdp) && rcu_segcblist_ready_cbs(&rdp->cblist) &&
 	    likely(READ_ONCE(rcu_scheduler_fully_active))) {
 		rcu_do_batch(rdp);
 		/* Re-invoke RCU core processing if there are callbacks remaining. */
···
 	struct list_head list;
 	struct rcu_gp_oldstate gp_snap;
 	unsigned long nr_records;
-	void *records[];
+	void *records[] __counted_by(nr_records);
 };

 /*
···
 	if (delayed_work_pending(&krcp->monitor_work)) {
 		delay_left = krcp->monitor_work.timer.expires - jiffies;
 		if (delay < delay_left)
-			mod_delayed_work(system_wq, &krcp->monitor_work, delay);
+			mod_delayed_work(system_unbound_wq, &krcp->monitor_work, delay);
 		return;
 	}
-	queue_delayed_work(system_wq, &krcp->monitor_work, delay);
+	queue_delayed_work(system_unbound_wq, &krcp->monitor_work, delay);
 }

 static void
···
 			// be that the work is in the pending state when
 			// channels have been detached following by each
 			// other.
-			queue_rcu_work(system_wq, &krwp->rcu_work);
+			queue_rcu_work(system_unbound_wq, &krwp->rcu_work);
 		}
 	}

···
 	if (rcu_scheduler_active == RCU_SCHEDULER_RUNNING &&
 	    !atomic_xchg(&krcp->work_in_progress, 1)) {
 		if (atomic_read(&krcp->backoff_page_cache_fill)) {
-			queue_delayed_work(system_wq,
+			queue_delayed_work(system_unbound_wq,
 					   &krcp->page_cache_work,
 					   msecs_to_jiffies(rcu_delay_page_cache_fill_msec));
 		} else {
···
 	}

 	// Finally insert and update the GP for this page.
-	bnode->records[bnode->nr_records++] = ptr;
+	bnode->nr_records++;
+	bnode->records[bnode->nr_records - 1] = ptr;
 	get_state_synchronize_rcu_full(&bnode->gp_snap);
 	atomic_inc(&(*krcp)->bulk_count[idx]);
···
 {
 	unsigned long __maybe_unused s = rcu_state.barrier_sequence;

+	rhp->next = rhp; // Mark the callback as having been invoked.
 	if (atomic_dec_and_test(&rcu_state.barrier_cpu_count)) {
 		rcu_barrier_trace(TPS("LastCB"), -1, s);
 		complete(&rcu_state.barrier_completion);
···
 	/* Set up local state, ensuring consistent view of global state. */
 	rdp->grpmask = leaf_node_cpu_bit(rdp->mynode, cpu);
 	INIT_WORK(&rdp->strict_work, strict_work_handler);
-	WARN_ON_ONCE(ct->dynticks_nesting != 1);
-	WARN_ON_ONCE(rcu_dynticks_in_eqs(ct_dynticks_cpu(cpu)));
+	WARN_ON_ONCE(ct->nesting != 1);
+	WARN_ON_ONCE(rcu_watching_snap_in_eqs(ct_rcu_watching_cpu(cpu)));
 	rdp->barrier_seq_snap = rcu_state.barrier_sequence;
 	rdp->rcu_ofl_gp_seq = rcu_state.gp_seq;
 	rdp->rcu_ofl_gp_state = RCU_GP_CLEANED;
···
 	rdp->qlen_last_fqs_check = 0;
 	rdp->n_force_qs_snap = READ_ONCE(rcu_state.n_force_qs);
 	rdp->blimit = blimit;
-	ct->dynticks_nesting = 1;	/* CPU not up, no tearing. */
+	ct->nesting = 1;	/* CPU not up, no tearing. */
 	raw_spin_unlock_rcu_node(rnp);		/* irqs remain disabled. */

 	/*
···
 	rnp = rdp->mynode;
 	mask = rdp->grpmask;
 	arch_spin_lock(&rcu_state.ofl_lock);
-	rcu_dynticks_eqs_online();
+	rcu_watching_online();
 	raw_spin_lock(&rcu_state.barrier_lock);
 	raw_spin_lock_rcu_node(rnp);
 	WRITE_ONCE(rnp->qsmaskinitnext, rnp->qsmaskinitnext | mask);
···
 		while (i > rnp->grphi)
 			rnp++;
 		per_cpu_ptr(&rcu_data, i)->mynode = rnp;
+		per_cpu_ptr(&rcu_data, i)->barrier_head.next =
+			&per_cpu_ptr(&rcu_data, i)->barrier_head;
 		rcu_boot_init_percpu_data(i);
 	}
 }
+7 -3
kernel/rcu/tree.h
··· 206 206 long blimit; /* Upper limit on a processed batch */ 207 207 208 208 /* 3) dynticks interface. */ 209 - int dynticks_snap; /* Per-GP tracking for dynticks. */ 209 + int watching_snap; /* Per-GP tracking for dynticks. */ 210 210 bool rcu_need_heavy_qs; /* GP old, so heavy quiescent state! */ 211 211 bool rcu_urgent_qs; /* GP old need light quiescent state. */ 212 212 bool rcu_forced_tick; /* Forced tick to provide QS. */ ··· 215 215 /* 4) rcu_barrier(), OOM callbacks, and expediting. */ 216 216 unsigned long barrier_seq_snap; /* Snap of rcu_state.barrier_sequence. */ 217 217 struct rcu_head barrier_head; 218 - int exp_dynticks_snap; /* Double-check need for IPI. */ 218 + int exp_watching_snap; /* Double-check need for IPI. */ 219 219 220 220 /* 5) Callback offloading. */ 221 221 #ifdef CONFIG_RCU_NOCB_CPU ··· 411 411 arch_spinlock_t ofl_lock ____cacheline_internodealigned_in_smp; 412 412 /* Synchronize offline with */ 413 413 /* GP pre-initialization. */ 414 - int nocb_is_setup; /* nocb is setup from boot */ 415 414 416 415 /* synchronize_rcu() part. */ 417 416 struct llist_head srs_next; /* request a GP users. */ ··· 419 420 struct sr_wait_node srs_wait_nodes[SR_NORMAL_GP_WAIT_HEAD_MAX]; 420 421 struct work_struct srs_cleanup_work; 421 422 atomic_t srs_cleanups_pending; /* srs inflight worker cleanups. */ 423 + 424 + #ifdef CONFIG_RCU_NOCB_CPU 425 + struct mutex nocb_mutex; /* Guards (de-)offloading */ 426 + int nocb_is_setup; /* nocb is setup from boot */ 427 + #endif 422 428 }; 423 429 424 430 /* Values for rcu_state structure's gp_flags field. */
+66 -55
kernel/rcu/tree_exp.h
··· 377 377 * post grace period updater's accesses is enforced by the 378 378 * below acquire semantic. 379 379 */ 380 - snap = ct_dynticks_cpu_acquire(cpu); 381 - if (rcu_dynticks_in_eqs(snap)) 380 + snap = ct_rcu_watching_cpu_acquire(cpu); 381 + if (rcu_watching_snap_in_eqs(snap)) 382 382 mask_ofl_test |= mask; 383 383 else 384 - rdp->exp_dynticks_snap = snap; 384 + rdp->exp_watching_snap = snap; 385 385 } 386 386 } 387 387 mask_ofl_ipi = rnp->expmask & ~mask_ofl_test; ··· 401 401 unsigned long mask = rdp->grpmask; 402 402 403 403 retry_ipi: 404 - if (rcu_dynticks_in_eqs_since(rdp, rdp->exp_dynticks_snap)) { 404 + if (rcu_watching_snap_stopped_since(rdp, rdp->exp_watching_snap)) { 405 405 mask_ofl_test |= mask; 406 406 continue; 407 407 } ··· 544 544 } 545 545 546 546 /* 547 + * Print out an expedited RCU CPU stall warning message. 548 + */ 549 + static void synchronize_rcu_expedited_stall(unsigned long jiffies_start, unsigned long j) 550 + { 551 + int cpu; 552 + unsigned long mask; 553 + int ndetected; 554 + struct rcu_node *rnp; 555 + struct rcu_node *rnp_root = rcu_get_root(); 556 + 557 + if (READ_ONCE(csd_lock_suppress_rcu_stall) && csd_lock_is_stuck()) { 558 + pr_err("INFO: %s detected expedited stalls, but suppressed full report due to a stuck CSD-lock.\n", rcu_state.name); 559 + return; 560 + } 561 + pr_err("INFO: %s detected expedited stalls on CPUs/tasks: {", rcu_state.name); 562 + ndetected = 0; 563 + rcu_for_each_leaf_node(rnp) { 564 + ndetected += rcu_print_task_exp_stall(rnp); 565 + for_each_leaf_node_possible_cpu(rnp, cpu) { 566 + struct rcu_data *rdp; 567 + 568 + mask = leaf_node_cpu_bit(rnp, cpu); 569 + if (!(READ_ONCE(rnp->expmask) & mask)) 570 + continue; 571 + ndetected++; 572 + rdp = per_cpu_ptr(&rcu_data, cpu); 573 + pr_cont(" %d-%c%c%c%c", cpu, 574 + "O."[!!cpu_online(cpu)], 575 + "o."[!!(rdp->grpmask & rnp->expmaskinit)], 576 + "N."[!!(rdp->grpmask & rnp->expmaskinitnext)], 577 + "D."[!!data_race(rdp->cpu_no_qs.b.exp)]); 578 + } 579 + } 580 
+ pr_cont(" } %lu jiffies s: %lu root: %#lx/%c\n", 581 + j - jiffies_start, rcu_state.expedited_sequence, data_race(rnp_root->expmask), 582 + ".T"[!!data_race(rnp_root->exp_tasks)]); 583 + if (ndetected) { 584 + pr_err("blocking rcu_node structures (internal RCU debug):"); 585 + rcu_for_each_node_breadth_first(rnp) { 586 + if (rnp == rnp_root) 587 + continue; /* printed unconditionally */ 588 + if (sync_rcu_exp_done_unlocked(rnp)) 589 + continue; 590 + pr_cont(" l=%u:%d-%d:%#lx/%c", 591 + rnp->level, rnp->grplo, rnp->grphi, data_race(rnp->expmask), 592 + ".T"[!!data_race(rnp->exp_tasks)]); 593 + } 594 + pr_cont("\n"); 595 + } 596 + rcu_for_each_leaf_node(rnp) { 597 + for_each_leaf_node_possible_cpu(rnp, cpu) { 598 + mask = leaf_node_cpu_bit(rnp, cpu); 599 + if (!(READ_ONCE(rnp->expmask) & mask)) 600 + continue; 601 + dump_cpu_task(cpu); 602 + } 603 + rcu_exp_print_detail_task_stall_rnp(rnp); 604 + } 605 + } 606 + 607 + /* 547 608 * Wait for the expedited grace period to elapse, issuing any needed 548 609 * RCU CPU stall warnings along the way. 
549 610 */ ··· 615 554 unsigned long jiffies_stall; 616 555 unsigned long jiffies_start; 617 556 unsigned long mask; 618 - int ndetected; 619 557 struct rcu_data *rdp; 620 558 struct rcu_node *rnp; 621 - struct rcu_node *rnp_root = rcu_get_root(); 622 559 unsigned long flags; 623 560 624 561 trace_rcu_exp_grace_period(rcu_state.name, rcu_exp_gp_seq_endval(), TPS("startwait")); ··· 656 597 j = jiffies; 657 598 rcu_stall_notifier_call_chain(RCU_STALL_NOTIFY_EXP, (void *)(j - jiffies_start)); 658 599 trace_rcu_stall_warning(rcu_state.name, TPS("ExpeditedStall")); 659 - pr_err("INFO: %s detected expedited stalls on CPUs/tasks: {", 660 - rcu_state.name); 661 - ndetected = 0; 662 - rcu_for_each_leaf_node(rnp) { 663 - ndetected += rcu_print_task_exp_stall(rnp); 664 - for_each_leaf_node_possible_cpu(rnp, cpu) { 665 - struct rcu_data *rdp; 666 - 667 - mask = leaf_node_cpu_bit(rnp, cpu); 668 - if (!(READ_ONCE(rnp->expmask) & mask)) 669 - continue; 670 - ndetected++; 671 - rdp = per_cpu_ptr(&rcu_data, cpu); 672 - pr_cont(" %d-%c%c%c%c", cpu, 673 - "O."[!!cpu_online(cpu)], 674 - "o."[!!(rdp->grpmask & rnp->expmaskinit)], 675 - "N."[!!(rdp->grpmask & rnp->expmaskinitnext)], 676 - "D."[!!data_race(rdp->cpu_no_qs.b.exp)]); 677 - } 678 - } 679 - pr_cont(" } %lu jiffies s: %lu root: %#lx/%c\n", 680 - j - jiffies_start, rcu_state.expedited_sequence, 681 - data_race(rnp_root->expmask), 682 - ".T"[!!data_race(rnp_root->exp_tasks)]); 683 - if (ndetected) { 684 - pr_err("blocking rcu_node structures (internal RCU debug):"); 685 - rcu_for_each_node_breadth_first(rnp) { 686 - if (rnp == rnp_root) 687 - continue; /* printed unconditionally */ 688 - if (sync_rcu_exp_done_unlocked(rnp)) 689 - continue; 690 - pr_cont(" l=%u:%d-%d:%#lx/%c", 691 - rnp->level, rnp->grplo, rnp->grphi, 692 - data_race(rnp->expmask), 693 - ".T"[!!data_race(rnp->exp_tasks)]); 694 - } 695 - pr_cont("\n"); 696 - } 697 - rcu_for_each_leaf_node(rnp) { 698 - for_each_leaf_node_possible_cpu(rnp, cpu) { 699 - mask = 
leaf_node_cpu_bit(rnp, cpu); 700 - if (!(READ_ONCE(rnp->expmask) & mask)) 701 - continue; 702 - preempt_disable(); // For smp_processor_id() in dump_cpu_task(). 703 - dump_cpu_task(cpu); 704 - preempt_enable(); 705 - } 706 - rcu_exp_print_detail_task_stall_rnp(rnp); 707 - } 600 + synchronize_rcu_expedited_stall(jiffies_start, j); 708 601 jiffies_stall = 3 * rcu_exp_jiffies_till_stall_check() + 3; 709 602 710 603 nbcon_cpu_emergency_exit();
+108 -173
kernel/rcu/tree_nocb.h
··· 16 16 #ifdef CONFIG_RCU_NOCB_CPU 17 17 static cpumask_var_t rcu_nocb_mask; /* CPUs to have callbacks offloaded. */ 18 18 static bool __read_mostly rcu_nocb_poll; /* Offload kthread are to poll. */ 19 - static inline int rcu_lockdep_is_held_nocb(struct rcu_data *rdp) 20 - { 21 - return lockdep_is_held(&rdp->nocb_lock); 22 - } 23 19 24 20 static inline bool rcu_current_is_nocb_kthread(struct rcu_data *rdp) 25 21 { ··· 216 220 raw_spin_unlock_irqrestore(&rdp_gp->nocb_gp_lock, flags); 217 221 if (needwake) { 218 222 trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("DoWake")); 219 - wake_up_process(rdp_gp->nocb_gp_kthread); 223 + swake_up_one_online(&rdp_gp->nocb_gp_wq); 220 224 } 221 225 222 226 return needwake; ··· 409 413 return false; 410 414 } 411 415 412 - // In the process of (de-)offloading: no bypassing, but 413 - // locking. 414 - if (!rcu_segcblist_completely_offloaded(&rdp->cblist)) { 415 - rcu_nocb_lock(rdp); 416 - *was_alldone = !rcu_segcblist_pend_cbs(&rdp->cblist); 417 - return false; /* Not offloaded, no bypassing. */ 418 - } 419 - 420 416 // Don't use ->nocb_bypass during early boot. 421 417 if (rcu_scheduler_active != RCU_SCHEDULER_RUNNING) { 422 418 rcu_nocb_lock(rdp); ··· 493 505 trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("FirstBQ")); 494 506 } 495 507 rcu_nocb_bypass_unlock(rdp); 496 - smp_mb(); /* Order enqueue before wake. */ 508 + 497 509 // A wake up of the grace period kthread or timer adjustment 498 510 // needs to be done only if: 499 511 // 1. 
Bypass list was fully empty before (this is the first ··· 604 616 } 605 617 } 606 618 607 - static int nocb_gp_toggle_rdp(struct rcu_data *rdp) 619 + static void nocb_gp_toggle_rdp(struct rcu_data *rdp_gp, struct rcu_data *rdp) 608 620 { 609 621 struct rcu_segcblist *cblist = &rdp->cblist; 610 622 unsigned long flags; 611 - int ret; 612 623 613 - rcu_nocb_lock_irqsave(rdp, flags); 614 - if (rcu_segcblist_test_flags(cblist, SEGCBLIST_OFFLOADED) && 615 - !rcu_segcblist_test_flags(cblist, SEGCBLIST_KTHREAD_GP)) { 624 + /* 625 + * Locking orders future de-offloaded callbacks enqueue against previous 626 + * handling of this rdp. Ie: Make sure rcuog is done with this rdp before 627 + * deoffloaded callbacks can be enqueued. 628 + */ 629 + raw_spin_lock_irqsave(&rdp->nocb_lock, flags); 630 + if (!rcu_segcblist_test_flags(cblist, SEGCBLIST_OFFLOADED)) { 616 631 /* 617 632 * Offloading. Set our flag and notify the offload worker. 618 633 * We will handle this rdp until it ever gets de-offloaded. 619 634 */ 620 - rcu_segcblist_set_flags(cblist, SEGCBLIST_KTHREAD_GP); 621 - ret = 1; 622 - } else if (!rcu_segcblist_test_flags(cblist, SEGCBLIST_OFFLOADED) && 623 - rcu_segcblist_test_flags(cblist, SEGCBLIST_KTHREAD_GP)) { 635 + list_add_tail(&rdp->nocb_entry_rdp, &rdp_gp->nocb_head_rdp); 636 + rcu_segcblist_set_flags(cblist, SEGCBLIST_OFFLOADED); 637 + } else { 624 638 /* 625 639 * De-offloading. Clear our flag and notify the de-offload worker. 626 640 * We will ignore this rdp until it ever gets re-offloaded. 
627 641 */ 628 - rcu_segcblist_clear_flags(cblist, SEGCBLIST_KTHREAD_GP); 629 - ret = 0; 630 - } else { 631 - WARN_ON_ONCE(1); 632 - ret = -1; 642 + list_del(&rdp->nocb_entry_rdp); 643 + rcu_segcblist_clear_flags(cblist, SEGCBLIST_OFFLOADED); 633 644 } 634 - 635 - rcu_nocb_unlock_irqrestore(rdp, flags); 636 - 637 - return ret; 645 + raw_spin_unlock_irqrestore(&rdp->nocb_lock, flags); 638 646 } 639 647 640 648 static void nocb_gp_sleep(struct rcu_data *my_rdp, int cpu) ··· 837 853 } 838 854 839 855 if (rdp_toggling) { 840 - int ret; 841 - 842 - ret = nocb_gp_toggle_rdp(rdp_toggling); 843 - if (ret == 1) 844 - list_add_tail(&rdp_toggling->nocb_entry_rdp, &my_rdp->nocb_head_rdp); 845 - else if (ret == 0) 846 - list_del(&rdp_toggling->nocb_entry_rdp); 847 - 856 + nocb_gp_toggle_rdp(my_rdp, rdp_toggling); 848 857 swake_up_one(&rdp_toggling->nocb_state_wq); 849 858 } 850 859 ··· 894 917 WARN_ON_ONCE(!rcu_rdp_is_offloaded(rdp)); 895 918 896 919 local_irq_save(flags); 897 - rcu_momentary_dyntick_idle(); 920 + rcu_momentary_eqs(); 898 921 local_irq_restore(flags); 899 922 /* 900 923 * Disable BH to provide the expected environment. 
Also, when ··· 1007 1030 } 1008 1031 EXPORT_SYMBOL_GPL(rcu_nocb_flush_deferred_wakeup); 1009 1032 1010 - static int rdp_offload_toggle(struct rcu_data *rdp, 1011 - bool offload, unsigned long flags) 1012 - __releases(rdp->nocb_lock) 1033 + static int rcu_nocb_queue_toggle_rdp(struct rcu_data *rdp) 1013 1034 { 1014 - struct rcu_segcblist *cblist = &rdp->cblist; 1015 1035 struct rcu_data *rdp_gp = rdp->nocb_gp_rdp; 1016 1036 bool wake_gp = false; 1017 - 1018 - rcu_segcblist_offload(cblist, offload); 1019 - rcu_nocb_unlock_irqrestore(rdp, flags); 1037 + unsigned long flags; 1020 1038 1021 1039 raw_spin_lock_irqsave(&rdp_gp->nocb_gp_lock, flags); 1022 1040 // Queue this rdp for add/del to/from the list to iterate on rcuog ··· 1025 1053 return wake_gp; 1026 1054 } 1027 1055 1028 - static long rcu_nocb_rdp_deoffload(void *arg) 1056 + static bool rcu_nocb_rdp_deoffload_wait_cond(struct rcu_data *rdp) 1029 1057 { 1030 - struct rcu_data *rdp = arg; 1031 - struct rcu_segcblist *cblist = &rdp->cblist; 1058 + unsigned long flags; 1059 + bool ret; 1060 + 1061 + /* 1062 + * Locking makes sure rcuog is done handling this rdp before deoffloaded 1063 + * enqueue can happen. Also it keeps the SEGCBLIST_OFFLOADED flag stable 1064 + * while the ->nocb_lock is held. 1065 + */ 1066 + raw_spin_lock_irqsave(&rdp->nocb_lock, flags); 1067 + ret = !rcu_segcblist_test_flags(&rdp->cblist, SEGCBLIST_OFFLOADED); 1068 + raw_spin_unlock_irqrestore(&rdp->nocb_lock, flags); 1069 + 1070 + return ret; 1071 + } 1072 + 1073 + static int rcu_nocb_rdp_deoffload(struct rcu_data *rdp) 1074 + { 1032 1075 unsigned long flags; 1033 1076 int wake_gp; 1034 1077 struct rcu_data *rdp_gp = rdp->nocb_gp_rdp; 1035 1078 1036 - /* 1037 - * rcu_nocb_rdp_deoffload() may be called directly if 1038 - * rcuog/o[p] spawn failed, because at this time the rdp->cpu 1039 - * is not online yet. 
1040 - */ 1041 - WARN_ON_ONCE((rdp->cpu != raw_smp_processor_id()) && cpu_online(rdp->cpu)); 1079 + /* CPU must be offline, unless it's early boot */ 1080 + WARN_ON_ONCE(cpu_online(rdp->cpu) && rdp->cpu != raw_smp_processor_id()); 1042 1081 1043 1082 pr_info("De-offloading %d\n", rdp->cpu); 1044 1083 1084 + /* Flush all callbacks from segcblist and bypass */ 1085 + rcu_barrier(); 1086 + 1087 + /* 1088 + * Make sure the rcuoc kthread isn't in the middle of a nocb locked 1089 + * sequence while offloading is deactivated, along with nocb locking. 1090 + */ 1091 + if (rdp->nocb_cb_kthread) 1092 + kthread_park(rdp->nocb_cb_kthread); 1093 + 1045 1094 rcu_nocb_lock_irqsave(rdp, flags); 1046 - /* 1047 - * Flush once and for all now. This suffices because we are 1048 - * running on the target CPU holding ->nocb_lock (thus having 1049 - * interrupts disabled), and because rdp_offload_toggle() 1050 - * invokes rcu_segcblist_offload(), which clears SEGCBLIST_OFFLOADED. 1051 - * Thus future calls to rcu_segcblist_completely_offloaded() will 1052 - * return false, which means that future calls to rcu_nocb_try_bypass() 1053 - * will refuse to put anything into the bypass. 1054 - */ 1055 - WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, jiffies, false)); 1056 - /* 1057 - * Start with invoking rcu_core() early. This way if the current thread 1058 - * happens to preempt an ongoing call to rcu_core() in the middle, 1059 - * leaving some work dismissed because rcu_core() still thinks the rdp is 1060 - * completely offloaded, we are guaranteed a nearby future instance of 1061 - * rcu_core() to catch up. 
1062 - */ 1063 - rcu_segcblist_set_flags(cblist, SEGCBLIST_RCU_CORE); 1064 - invoke_rcu_core(); 1065 - wake_gp = rdp_offload_toggle(rdp, false, flags); 1095 + WARN_ON_ONCE(rcu_cblist_n_cbs(&rdp->nocb_bypass)); 1096 + WARN_ON_ONCE(rcu_segcblist_n_cbs(&rdp->cblist)); 1097 + rcu_nocb_unlock_irqrestore(rdp, flags); 1098 + 1099 + wake_gp = rcu_nocb_queue_toggle_rdp(rdp); 1066 1100 1067 1101 mutex_lock(&rdp_gp->nocb_gp_kthread_mutex); 1102 + 1068 1103 if (rdp_gp->nocb_gp_kthread) { 1069 1104 if (wake_gp) 1070 1105 wake_up_process(rdp_gp->nocb_gp_kthread); 1071 1106 1072 1107 swait_event_exclusive(rdp->nocb_state_wq, 1073 - !rcu_segcblist_test_flags(cblist, 1074 - SEGCBLIST_KTHREAD_GP)); 1075 - if (rdp->nocb_cb_kthread) 1076 - kthread_park(rdp->nocb_cb_kthread); 1108 + rcu_nocb_rdp_deoffload_wait_cond(rdp)); 1077 1109 } else { 1078 1110 /* 1079 1111 * No kthread to clear the flags for us or remove the rdp from the nocb list 1080 1112 * to iterate. Do it here instead. Locking doesn't look stricly necessary 1081 1113 * but we stick to paranoia in this rare path. 1082 1114 */ 1083 - rcu_nocb_lock_irqsave(rdp, flags); 1084 - rcu_segcblist_clear_flags(&rdp->cblist, SEGCBLIST_KTHREAD_GP); 1085 - rcu_nocb_unlock_irqrestore(rdp, flags); 1115 + raw_spin_lock_irqsave(&rdp->nocb_lock, flags); 1116 + rcu_segcblist_clear_flags(&rdp->cblist, SEGCBLIST_OFFLOADED); 1117 + raw_spin_unlock_irqrestore(&rdp->nocb_lock, flags); 1086 1118 1087 1119 list_del(&rdp->nocb_entry_rdp); 1088 1120 } 1121 + 1089 1122 mutex_unlock(&rdp_gp->nocb_gp_kthread_mutex); 1090 - 1091 - /* 1092 - * Lock one last time to acquire latest callback updates from kthreads 1093 - * so we can later handle callbacks locally without locking. 1094 - */ 1095 - rcu_nocb_lock_irqsave(rdp, flags); 1096 - /* 1097 - * Theoretically we could clear SEGCBLIST_LOCKING after the nocb 1098 - * lock is released but how about being paranoid for once? 
1099 - */ 1100 - rcu_segcblist_clear_flags(cblist, SEGCBLIST_LOCKING); 1101 - /* 1102 - * Without SEGCBLIST_LOCKING, we can't use 1103 - * rcu_nocb_unlock_irqrestore() anymore. 1104 - */ 1105 - raw_spin_unlock_irqrestore(&rdp->nocb_lock, flags); 1106 - 1107 - /* Sanity check */ 1108 - WARN_ON_ONCE(rcu_cblist_n_cbs(&rdp->nocb_bypass)); 1109 - 1110 1123 1111 1124 return 0; 1112 1125 } ··· 1102 1145 int ret = 0; 1103 1146 1104 1147 cpus_read_lock(); 1105 - mutex_lock(&rcu_state.barrier_mutex); 1148 + mutex_lock(&rcu_state.nocb_mutex); 1106 1149 if (rcu_rdp_is_offloaded(rdp)) { 1107 - if (cpu_online(cpu)) { 1108 - ret = work_on_cpu(cpu, rcu_nocb_rdp_deoffload, rdp); 1150 + if (!cpu_online(cpu)) { 1151 + ret = rcu_nocb_rdp_deoffload(rdp); 1109 1152 if (!ret) 1110 1153 cpumask_clear_cpu(cpu, rcu_nocb_mask); 1111 1154 } else { 1112 - pr_info("NOCB: Cannot CB-deoffload offline CPU %d\n", rdp->cpu); 1155 + pr_info("NOCB: Cannot CB-deoffload online CPU %d\n", rdp->cpu); 1113 1156 ret = -EINVAL; 1114 1157 } 1115 1158 } 1116 - mutex_unlock(&rcu_state.barrier_mutex); 1159 + mutex_unlock(&rcu_state.nocb_mutex); 1117 1160 cpus_read_unlock(); 1118 1161 1119 1162 return ret; 1120 1163 } 1121 1164 EXPORT_SYMBOL_GPL(rcu_nocb_cpu_deoffload); 1122 1165 1123 - static long rcu_nocb_rdp_offload(void *arg) 1166 + static bool rcu_nocb_rdp_offload_wait_cond(struct rcu_data *rdp) 1124 1167 { 1125 - struct rcu_data *rdp = arg; 1126 - struct rcu_segcblist *cblist = &rdp->cblist; 1127 1168 unsigned long flags; 1169 + bool ret; 1170 + 1171 + raw_spin_lock_irqsave(&rdp->nocb_lock, flags); 1172 + ret = rcu_segcblist_test_flags(&rdp->cblist, SEGCBLIST_OFFLOADED); 1173 + raw_spin_unlock_irqrestore(&rdp->nocb_lock, flags); 1174 + 1175 + return ret; 1176 + } 1177 + 1178 + static int rcu_nocb_rdp_offload(struct rcu_data *rdp) 1179 + { 1128 1180 int wake_gp; 1129 1181 struct rcu_data *rdp_gp = rdp->nocb_gp_rdp; 1130 1182 1131 - WARN_ON_ONCE(rdp->cpu != raw_smp_processor_id()); 1183 + 
WARN_ON_ONCE(cpu_online(rdp->cpu)); 1132 1184 /* 1133 1185 * For now we only support re-offload, ie: the rdp must have been 1134 1186 * offloaded on boot first. ··· 1150 1184 1151 1185 pr_info("Offloading %d\n", rdp->cpu); 1152 1186 1153 - /* 1154 - * Can't use rcu_nocb_lock_irqsave() before SEGCBLIST_LOCKING 1155 - * is set. 1156 - */ 1157 - raw_spin_lock_irqsave(&rdp->nocb_lock, flags); 1187 + WARN_ON_ONCE(rcu_cblist_n_cbs(&rdp->nocb_bypass)); 1188 + WARN_ON_ONCE(rcu_segcblist_n_cbs(&rdp->cblist)); 1158 1189 1159 - /* 1160 - * We didn't take the nocb lock while working on the 1161 - * rdp->cblist with SEGCBLIST_LOCKING cleared (pure softirq/rcuc mode). 1162 - * Every modifications that have been done previously on 1163 - * rdp->cblist must be visible remotely by the nocb kthreads 1164 - * upon wake up after reading the cblist flags. 1165 - * 1166 - * The layout against nocb_lock enforces that ordering: 1167 - * 1168 - * __rcu_nocb_rdp_offload() nocb_cb_wait()/nocb_gp_wait() 1169 - * ------------------------- ---------------------------- 1170 - * WRITE callbacks rcu_nocb_lock() 1171 - * rcu_nocb_lock() READ flags 1172 - * WRITE flags READ callbacks 1173 - * rcu_nocb_unlock() rcu_nocb_unlock() 1174 - */ 1175 - wake_gp = rdp_offload_toggle(rdp, true, flags); 1190 + wake_gp = rcu_nocb_queue_toggle_rdp(rdp); 1176 1191 if (wake_gp) 1177 1192 wake_up_process(rdp_gp->nocb_gp_kthread); 1178 1193 1179 - kthread_unpark(rdp->nocb_cb_kthread); 1180 - 1181 1194 swait_event_exclusive(rdp->nocb_state_wq, 1182 - rcu_segcblist_test_flags(cblist, SEGCBLIST_KTHREAD_GP)); 1195 + rcu_nocb_rdp_offload_wait_cond(rdp)); 1183 1196 1184 - /* 1185 - * All kthreads are ready to work, we can finally relieve rcu_core() and 1186 - * enable nocb bypass. 
1187 - */ 1188 - rcu_nocb_lock_irqsave(rdp, flags); 1189 - rcu_segcblist_clear_flags(cblist, SEGCBLIST_RCU_CORE); 1190 - rcu_nocb_unlock_irqrestore(rdp, flags); 1197 + kthread_unpark(rdp->nocb_cb_kthread); 1191 1198 1192 1199 return 0; 1193 1200 } ··· 1171 1232 int ret = 0; 1172 1233 1173 1234 cpus_read_lock(); 1174 - mutex_lock(&rcu_state.barrier_mutex); 1235 + mutex_lock(&rcu_state.nocb_mutex); 1175 1236 if (!rcu_rdp_is_offloaded(rdp)) { 1176 - if (cpu_online(cpu)) { 1177 - ret = work_on_cpu(cpu, rcu_nocb_rdp_offload, rdp); 1237 + if (!cpu_online(cpu)) { 1238 + ret = rcu_nocb_rdp_offload(rdp); 1178 1239 if (!ret) 1179 1240 cpumask_set_cpu(cpu, rcu_nocb_mask); 1180 1241 } else { 1181 - pr_info("NOCB: Cannot CB-offload offline CPU %d\n", rdp->cpu); 1242 + pr_info("NOCB: Cannot CB-offload online CPU %d\n", rdp->cpu); 1182 1243 ret = -EINVAL; 1183 1244 } 1184 1245 } 1185 - mutex_unlock(&rcu_state.barrier_mutex); 1246 + mutex_unlock(&rcu_state.nocb_mutex); 1186 1247 cpus_read_unlock(); 1187 1248 1188 1249 return ret; ··· 1200 1261 return 0; 1201 1262 1202 1263 /* Protect rcu_nocb_mask against concurrent (de-)offloading. */ 1203 - if (!mutex_trylock(&rcu_state.barrier_mutex)) 1264 + if (!mutex_trylock(&rcu_state.nocb_mutex)) 1204 1265 return 0; 1205 1266 1206 1267 /* Snapshot count of all CPUs */ ··· 1210 1271 count += READ_ONCE(rdp->lazy_len); 1211 1272 } 1212 1273 1213 - mutex_unlock(&rcu_state.barrier_mutex); 1274 + mutex_unlock(&rcu_state.nocb_mutex); 1214 1275 1215 1276 return count ? count : SHRINK_EMPTY; 1216 1277 } ··· 1228 1289 * Protect against concurrent (de-)offloading. Otherwise nocb locking 1229 1290 * may be ignored or imbalanced. 
1230 1291 */ 1231 - if (!mutex_trylock(&rcu_state.barrier_mutex)) { 1292 + if (!mutex_trylock(&rcu_state.nocb_mutex)) { 1232 1293 /* 1233 - * But really don't insist if barrier_mutex is contended since we 1294 + * But really don't insist if nocb_mutex is contended since we 1234 1295 * can't guarantee that it will never engage in a dependency 1235 1296 * chain involving memory allocation. The lock is seldom contended 1236 1297 * anyway. ··· 1269 1330 break; 1270 1331 } 1271 1332 1272 - mutex_unlock(&rcu_state.barrier_mutex); 1333 + mutex_unlock(&rcu_state.nocb_mutex); 1273 1334 1274 1335 return count ? count : SHRINK_STOP; 1275 1336 } ··· 1335 1396 rdp = per_cpu_ptr(&rcu_data, cpu); 1336 1397 if (rcu_segcblist_empty(&rdp->cblist)) 1337 1398 rcu_segcblist_init(&rdp->cblist); 1338 - rcu_segcblist_offload(&rdp->cblist, true); 1339 - rcu_segcblist_set_flags(&rdp->cblist, SEGCBLIST_KTHREAD_GP); 1340 - rcu_segcblist_clear_flags(&rdp->cblist, SEGCBLIST_RCU_CORE); 1399 + rcu_segcblist_set_flags(&rdp->cblist, SEGCBLIST_OFFLOADED); 1341 1400 } 1342 1401 rcu_organize_nocb_kthreads(); 1343 1402 } ··· 1383 1446 "rcuog/%d", rdp_gp->cpu); 1384 1447 if (WARN_ONCE(IS_ERR(t), "%s: Could not start rcuo GP kthread, OOM is now expected behavior\n", __func__)) { 1385 1448 mutex_unlock(&rdp_gp->nocb_gp_kthread_mutex); 1386 - goto end; 1449 + goto err; 1387 1450 } 1388 1451 WRITE_ONCE(rdp_gp->nocb_gp_kthread, t); 1389 1452 if (kthread_prio) ··· 1395 1458 t = kthread_create(rcu_nocb_cb_kthread, rdp, 1396 1459 "rcuo%c/%d", rcu_state.abbr, cpu); 1397 1460 if (WARN_ONCE(IS_ERR(t), "%s: Could not start rcuo CB kthread, OOM is now expected behavior\n", __func__)) 1398 - goto end; 1461 + goto err; 1399 1462 1400 1463 if (rcu_rdp_is_offloaded(rdp)) 1401 1464 wake_up_process(t); ··· 1408 1471 WRITE_ONCE(rdp->nocb_cb_kthread, t); 1409 1472 WRITE_ONCE(rdp->nocb_gp_kthread, rdp_gp->nocb_gp_kthread); 1410 1473 return; 1411 - end: 1412 - mutex_lock(&rcu_state.barrier_mutex); 1474 + 1475 + err: 1476 + /* 
1477 + * No need to protect against concurrent rcu_barrier() 1478 + * because the number of callbacks should be 0 for a non-boot CPU, 1479 + * therefore rcu_barrier() shouldn't even try to grab the nocb_lock. 1480 + * But hold nocb_mutex to avoid nocb_lock imbalance from shrinker. 1481 + */ 1482 + WARN_ON_ONCE(system_state > SYSTEM_BOOTING && rcu_segcblist_n_cbs(&rdp->cblist)); 1483 + mutex_lock(&rcu_state.nocb_mutex); 1413 1484 if (rcu_rdp_is_offloaded(rdp)) { 1414 1485 rcu_nocb_rdp_deoffload(rdp); 1415 1486 cpumask_clear_cpu(cpu, rcu_nocb_mask); 1416 1487 } 1417 - mutex_unlock(&rcu_state.barrier_mutex); 1488 + mutex_unlock(&rcu_state.nocb_mutex); 1418 1489 } 1419 1490 1420 1491 /* How many CB CPU IDs per GP kthread? Default of -1 for sqrt(nr_cpu_ids). */ ··· 1597 1652 } 1598 1653 1599 1654 #else /* #ifdef CONFIG_RCU_NOCB_CPU */ 1600 - 1601 - static inline int rcu_lockdep_is_held_nocb(struct rcu_data *rdp) 1602 - { 1603 - return 0; 1604 - } 1605 - 1606 - static inline bool rcu_current_is_nocb_kthread(struct rcu_data *rdp) 1607 - { 1608 - return false; 1609 - } 1610 1655 1611 1656 /* No ->nocb_lock to acquire. */ 1612 1657 static void rcu_nocb_lock(struct rcu_data *rdp)
+6 -5
kernel/rcu/tree_plugin.h
··· 24 24 * timers have their own means of synchronization against the 25 25 * offloaded state updaters. 26 26 */ 27 - RCU_LOCKDEP_WARN( 27 + RCU_NOCB_LOCKDEP_WARN( 28 28 !(lockdep_is_held(&rcu_state.barrier_mutex) || 29 29 (IS_ENABLED(CONFIG_HOTPLUG_CPU) && lockdep_is_cpus_held()) || 30 - rcu_lockdep_is_held_nocb(rdp) || 30 + lockdep_is_held(&rdp->nocb_lock) || 31 + lockdep_is_held(&rcu_state.nocb_mutex) || 31 32 (!(IS_ENABLED(CONFIG_PREEMPT_COUNT) && preemptible()) && 32 33 rdp == this_cpu_ptr(&rcu_data)) || 33 34 rcu_current_is_nocb_kthread(rdp)), ··· 870 869 871 870 /* 872 871 * Register an urgently needed quiescent state. If there is an 873 - * emergency, invoke rcu_momentary_dyntick_idle() to do a heavy-weight 872 + * emergency, invoke rcu_momentary_eqs() to do a heavy-weight 874 873 * dyntick-idle quiescent state visible to other CPUs, which will in 875 874 * some cases serve for expedited as well as normal grace periods. 876 875 * Either way, register a lightweight quiescent state. ··· 890 889 this_cpu_write(rcu_data.rcu_urgent_qs, false); 891 890 if (unlikely(raw_cpu_read(rcu_data.rcu_need_heavy_qs))) { 892 891 local_irq_save(flags); 893 - rcu_momentary_dyntick_idle(); 892 + rcu_momentary_eqs(); 894 893 local_irq_restore(flags); 895 894 } 896 895 rcu_qs(); ··· 910 909 goto out; 911 910 this_cpu_write(rcu_data.rcu_urgent_qs, false); 912 911 if (unlikely(raw_cpu_read(rcu_data.rcu_need_heavy_qs))) 913 - rcu_momentary_dyntick_idle(); 912 + rcu_momentary_eqs(); 914 913 out: 915 914 rcu_tasks_qs(current, preempt); 916 915 trace_rcu_utilization(TPS("End context switch"));
+12 -4
kernel/rcu/tree_stall.h
··· 10 10 #include <linux/console.h> 11 11 #include <linux/kvm_para.h> 12 12 #include <linux/rcu_notifier.h> 13 + #include <linux/smp.h> 13 14 14 15 ////////////////////////////////////////////////////////////////////////////// 15 16 // ··· 372 371 struct rcu_node *rnp; 373 372 374 373 rcu_for_each_leaf_node(rnp) { 374 + printk_deferred_enter(); 375 375 raw_spin_lock_irqsave_rcu_node(rnp, flags); 376 376 for_each_leaf_node_possible_cpu(rnp, cpu) 377 377 if (rnp->qsmask & leaf_node_cpu_bit(rnp, cpu)) { ··· 382 380 dump_cpu_task(cpu); 383 381 } 384 382 raw_spin_unlock_irqrestore_rcu_node(rnp, flags); 383 + printk_deferred_exit(); 385 384 } 386 385 } 387 386 ··· 505 502 } 506 503 delta = rcu_seq_ctr(rdp->mynode->gp_seq - rdp->rcu_iw_gp_seq); 507 504 falsepositive = rcu_is_gp_kthread_starving(NULL) && 508 - rcu_dynticks_in_eqs(ct_dynticks_cpu(cpu)); 505 + rcu_watching_snap_in_eqs(ct_rcu_watching_cpu(cpu)); 509 506 rcuc_starved = rcu_is_rcuc_kthread_starving(rdp, &j); 510 507 if (rcuc_starved) 511 508 // Print signed value, as negative values indicate a probable bug. ··· 519 516 rdp->rcu_iw_pending ? (int)min(delta, 9UL) + '0' : 520 517 "!."[!delta], 521 518 ticks_value, ticks_title, 522 - ct_dynticks_cpu(cpu) & 0xffff, 523 - ct_dynticks_nesting_cpu(cpu), ct_dynticks_nmi_nesting_cpu(cpu), 519 + ct_rcu_watching_cpu(cpu) & 0xffff, 520 + ct_nesting_cpu(cpu), ct_nmi_nesting_cpu(cpu), 524 521 rdp->softirq_snap, kstat_softirqs_cpu(RCU_SOFTIRQ, cpu), 525 522 data_race(rcu_state.n_force_qs) - rcu_state.n_force_qs_gpstart, 526 523 rcuc_starved ? 
buf : "", ··· 731 728 set_preempt_need_resched(); 732 729 } 733 730 731 + static bool csd_lock_suppress_rcu_stall; 732 + module_param(csd_lock_suppress_rcu_stall, bool, 0644); 733 + 734 734 static void check_cpu_stall(struct rcu_data *rdp) 735 735 { 736 736 bool self_detected; ··· 806 800 return; 807 801 808 802 rcu_stall_notifier_call_chain(RCU_STALL_NOTIFY_NORM, (void *)j - gps); 809 - if (self_detected) { 803 + if (READ_ONCE(csd_lock_suppress_rcu_stall) && csd_lock_is_stuck()) { 804 + pr_err("INFO: %s detected stall, but suppressed full report due to a stuck CSD-lock.\n", rcu_state.name); 805 + } else if (self_detected) { 810 806 /* We haven't checked in, so go dump stack. */ 811 807 print_cpu_stall(gps); 812 808 } else {
+3 -3
kernel/sched/core.c
···
5762 5762 		preempt_count_set(PREEMPT_DISABLED);
5763 5763 	}
5764 5764 	rcu_sleep_check();
5765 -      	SCHED_WARN_ON(ct_state() == CONTEXT_USER);
5765 +      	SCHED_WARN_ON(ct_state() == CT_STATE_USER);
5766 5766
5767 5767 	profile_hit(SCHED_PROFILING, __builtin_return_address(0));
5768 5768
···
6658 6658 	 * we find a better solution.
6659 6659 	 *
6660 6660 	 * NB: There are buggy callers of this function.  Ideally we
6661 -      	 * should warn if prev_state != CONTEXT_USER, but that will trigger
6661 +      	 * should warn if prev_state != CT_STATE_USER, but that will trigger
6662 6662 	 * too frequently to make sense yet.
6663 6663 	 */
6664 6664 	enum ctx_state prev_state = exception_enter();
···
9752 9752
9753 9753 void dump_cpu_task(int cpu)
9754 9754 {
9755 -      	if (cpu == smp_processor_id() && in_hardirq()) {
9755 +      	if (in_hardirq() && cpu == smp_processor_id()) {
9756 9756 		struct pt_regs *regs;
9757 9757
9758 9758 		regs = get_irq_regs();
+33 -5
kernel/smp.c
···
 		return -1;
 	}
 
+static atomic_t n_csd_lock_stuck;
+
+/**
+ * csd_lock_is_stuck - Has a CSD-lock acquisition been stuck too long?
+ *
+ * Returns @true if a CSD-lock acquisition is stuck and has been stuck
+ * long enough for a "non-responsive CSD lock" message to be printed.
+ */
+bool csd_lock_is_stuck(void)
+{
+	return !!atomic_read(&n_csd_lock_stuck);
+}
+
 /*
  * Complain if too much time spent waiting.  Note that only
  * the CSD_TYPE_SYNC/ASYNC types provide the destination CPU,
  * so waiting on other types gets much less information.
  */
-static bool csd_lock_wait_toolong(call_single_data_t *csd, u64 ts0, u64 *ts1, int *bug_id)
+static bool csd_lock_wait_toolong(call_single_data_t *csd, u64 ts0, u64 *ts1, int *bug_id, unsigned long *nmessages)
 {
 	int cpu = -1;
 	int cpux;
···
 		cpu = csd_lock_wait_getcpu(csd);
 		pr_alert("csd: CSD lock (#%d) got unstuck on CPU#%02d, CPU#%02d released the lock.\n",
 			 *bug_id, raw_smp_processor_id(), cpu);
+		atomic_dec(&n_csd_lock_stuck);
 		return true;
 	}
 
 	ts2 = sched_clock();
 	/* How long since we last checked for a stuck CSD lock.*/
 	ts_delta = ts2 - *ts1;
-	if (likely(ts_delta <= csd_lock_timeout_ns || csd_lock_timeout_ns == 0))
+	if (likely(ts_delta <= csd_lock_timeout_ns * (*nmessages + 1) *
+		       (!*nmessages ? 1 : (ilog2(num_online_cpus()) / 2 + 1)) ||
+		   csd_lock_timeout_ns == 0))
 		return false;
+
+	if (ts0 > ts2) {
+		/* Our own sched_clock went backward; don't blame another CPU. */
+		ts_delta = ts0 - ts2;
+		pr_alert("sched_clock on CPU %d went backward by %llu ns\n", raw_smp_processor_id(), ts_delta);
+		*ts1 = ts2;
+		return false;
+	}
 
 	firsttime = !*bug_id;
 	if (firsttime)
···
 	cpu_cur_csd = smp_load_acquire(&per_cpu(cur_csd, cpux)); /* Before func and info. */
 	/* How long since this CSD lock was stuck. */
 	ts_delta = ts2 - ts0;
-	pr_alert("csd: %s non-responsive CSD lock (#%d) on CPU#%d, waiting %llu ns for CPU#%02d %pS(%ps).\n",
-		 firsttime ? "Detected" : "Continued", *bug_id, raw_smp_processor_id(), ts_delta,
+	pr_alert("csd: %s non-responsive CSD lock (#%d) on CPU#%d, waiting %lld ns for CPU#%02d %pS(%ps).\n",
+		 firsttime ? "Detected" : "Continued", *bug_id, raw_smp_processor_id(), (s64)ts_delta,
 		 cpu, csd->func, csd->info);
+	(*nmessages)++;
+	if (firsttime)
+		atomic_inc(&n_csd_lock_stuck);
 	/*
 	 * If the CSD lock is still stuck after 5 minutes, it is unlikely
 	 * to become unstuck. Use a signed comparison to avoid triggering
···
  */
 static void __csd_lock_wait(call_single_data_t *csd)
 {
+	unsigned long nmessages = 0;
 	int bug_id = 0;
 	u64 ts0, ts1;
 
 	ts1 = ts0 = sched_clock();
 	for (;;) {
-		if (csd_lock_wait_toolong(csd, ts0, &ts1, &bug_id))
+		if (csd_lock_wait_toolong(csd, ts0, &ts1, &bug_id, &nmessages))
 			break;
 		cpu_relax();
 	}
+1 -1
kernel/stop_machine.c
···
 			 */
 			touch_nmi_watchdog();
 		}
-		rcu_momentary_dyntick_idle();
+		rcu_momentary_eqs();
 	} while (curstate != MULTI_STOP_EXIT);
 
 	local_irq_restore(flags);
+2 -2
kernel/trace/trace_osnoise.c
···
 	 * This will eventually cause unwarranted noise as PREEMPT_RCU
 	 * will force preemption as the means of ending the current
 	 * grace period. We avoid this problem by calling
-	 * rcu_momentary_dyntick_idle(), which performs a zero duration
+	 * rcu_momentary_eqs(), which performs a zero duration
 	 * EQS allowing PREEMPT_RCU to end the current grace period.
 	 * This call shouldn't be wrapped inside an RCU critical
 	 * section.
···
 		if (!disable_irq)
 			local_irq_disable();
 
-		rcu_momentary_dyntick_idle();
+		rcu_momentary_eqs();
 
 		if (!disable_irq)
 			local_irq_enable();
+1
lib/Kconfig.debug
···
 config CSD_LOCK_WAIT_DEBUG
 	bool "Debugging for csd_lock_wait(), called from smp_call_function*()"
 	depends on DEBUG_KERNEL
+	depends on SMP
 	depends on 64BIT
 	default n
 	help
-2
tools/rcu/rcu-updaters.sh
···
 bpftrace -e 'kprobe:kvfree_call_rcu,
 	     kprobe:call_rcu,
 	     kprobe:call_rcu_tasks,
-	     kprobe:call_rcu_tasks_rude,
 	     kprobe:call_rcu_tasks_trace,
 	     kprobe:call_srcu,
 	     kprobe:rcu_barrier,
 	     kprobe:rcu_barrier_tasks,
-	     kprobe:rcu_barrier_tasks_rude,
 	     kprobe:rcu_barrier_tasks_trace,
 	     kprobe:srcu_barrier,
 	     kprobe:synchronize_rcu,
+2
tools/testing/selftests/rcutorture/bin/kvm-test-1-run.sh
···
 config_override_param "--kasan options" KcList "$TORTURE_KCONFIG_KASAN_ARG"
 config_override_param "--kcsan options" KcList "$TORTURE_KCONFIG_KCSAN_ARG"
 config_override_param "--kconfig argument" KcList "$TORTURE_KCONFIG_ARG"
+config_override_param "$config_dir/CFcommon.$(uname -m)" KcList \
+	"`cat $config_dir/CFcommon.$(uname -m) 2> /dev/null`"
 cp $T/KcList $resdir/ConfigFragment
 
 base_resdir=`echo $resdir | sed -e 's/\.[0-9]\+$//'`
+27 -11
tools/testing/selftests/rcutorture/bin/torture.sh
···
 
 TORTURE_ALLOTED_CPUS="`identify_qemu_vcpus`"
 MAKE_ALLOTED_CPUS=$((TORTURE_ALLOTED_CPUS*2))
-HALF_ALLOTED_CPUS=$((TORTURE_ALLOTED_CPUS/2))
-if test "$HALF_ALLOTED_CPUS" -lt 1
+SCALE_ALLOTED_CPUS=$((TORTURE_ALLOTED_CPUS/2))
+if test "$SCALE_ALLOTED_CPUS" -lt 1
 then
-	HALF_ALLOTED_CPUS=1
+	SCALE_ALLOTED_CPUS=1
 fi
 VERBOSE_BATCH_CPUS=$((TORTURE_ALLOTED_CPUS/16))
 if test "$VERBOSE_BATCH_CPUS" -lt 2
···
 	echo "   --do-scftorture / --do-no-scftorture / --no-scftorture"
 	echo "   --do-srcu-lockdep / --do-no-srcu-lockdep / --no-srcu-lockdep"
 	echo "   --duration [ <minutes> | <hours>h | <days>d ]"
+	echo "   --guest-cpu-limit N"
 	echo "   --kcsan-kmake-arg kernel-make-arguments"
 	exit 1
 }
···
 		fi
 		ts=`echo $2 | sed -e 's/[smhd]$//'`
 		duration_base=$(($ts*mult))
+		shift
+		;;
+	--guest-cpu-limit|--guest-cpu-lim)
+		checkarg --guest-cpu-limit "(number)" "$#" "$2" '^[0-9]*$' '^--'
+		if (("$2" <= "$TORTURE_ALLOTED_CPUS" / 2))
+		then
+			SCALE_ALLOTED_CPUS="$2"
+			VERBOSE_BATCH_CPUS="$((SCALE_ALLOTED_CPUS/8))"
+			if (("$VERBOSE_BATCH_CPUS" < 2))
+			then
+				VERBOSE_BATCH_CPUS=0
+			fi
+		else
+			echo "Ignoring value of $2 for --guest-cpu-limit which is greater than (("$TORTURE_ALLOTED_CPUS" / 2))."
+		fi
 		shift
 		;;
 	--kcsan-kmake-arg|--kcsan-kmake-args)
···
 if test "$do_scftorture" = "yes"
 then
 	# Scale memory based on the number of CPUs.
-	scfmem=$((3+HALF_ALLOTED_CPUS/16))
-	torture_bootargs="scftorture.nthreads=$HALF_ALLOTED_CPUS torture.disable_onoff_at_boot csdlock_debug=1"
-	torture_set "scftorture" tools/testing/selftests/rcutorture/bin/kvm.sh --torture scf --allcpus --duration "$duration_scftorture" --configs "$configs_scftorture" --kconfig "CONFIG_NR_CPUS=$HALF_ALLOTED_CPUS" --memory ${scfmem}G --trust-make
+	scfmem=$((3+SCALE_ALLOTED_CPUS/16))
+	torture_bootargs="scftorture.nthreads=$SCALE_ALLOTED_CPUS torture.disable_onoff_at_boot csdlock_debug=1"
+	torture_set "scftorture" tools/testing/selftests/rcutorture/bin/kvm.sh --torture scf --allcpus --duration "$duration_scftorture" --configs "$configs_scftorture" --kconfig "CONFIG_NR_CPUS=$SCALE_ALLOTED_CPUS" --memory ${scfmem}G --trust-make
 fi
 
 if test "$do_rt" = "yes"
···
 	do
 		if test -n "$firsttime"
 		then
-			torture_bootargs="refscale.scale_type="$prim" refscale.nreaders=$HALF_ALLOTED_CPUS refscale.loops=10000 refscale.holdoff=20 torture.disable_onoff_at_boot"
-			torture_set "refscale-$prim" tools/testing/selftests/rcutorture/bin/kvm.sh --torture refscale --allcpus --duration 5 --kconfig "CONFIG_TASKS_TRACE_RCU=y CONFIG_NR_CPUS=$HALF_ALLOTED_CPUS" --bootargs "refscale.verbose_batched=$VERBOSE_BATCH_CPUS torture.verbose_sleep_frequency=8 torture.verbose_sleep_duration=$VERBOSE_BATCH_CPUS" --trust-make
+			torture_bootargs="refscale.scale_type="$prim" refscale.nreaders=$SCALE_ALLOTED_CPUS refscale.loops=10000 refscale.holdoff=20 torture.disable_onoff_at_boot"
+			torture_set "refscale-$prim" tools/testing/selftests/rcutorture/bin/kvm.sh --torture refscale --allcpus --duration 5 --kconfig "CONFIG_TASKS_TRACE_RCU=y CONFIG_NR_CPUS=$SCALE_ALLOTED_CPUS" --bootargs "refscale.verbose_batched=$VERBOSE_BATCH_CPUS torture.verbose_sleep_frequency=8 torture.verbose_sleep_duration=$VERBOSE_BATCH_CPUS" --trust-make
 			mv $T/last-resdir-nodebug $T/first-resdir-nodebug || :
 			if test -f "$T/last-resdir-kasan"
 			then
···
 	do
 		if test -n "$firsttime"
 		then
-			torture_bootargs="rcuscale.scale_type="$prim" rcuscale.nwriters=$HALF_ALLOTED_CPUS rcuscale.holdoff=20 torture.disable_onoff_at_boot"
-			torture_set "rcuscale-$prim" tools/testing/selftests/rcutorture/bin/kvm.sh --torture rcuscale --allcpus --duration 5 --kconfig "CONFIG_TASKS_TRACE_RCU=y CONFIG_NR_CPUS=$HALF_ALLOTED_CPUS" --trust-make
+			torture_bootargs="rcuscale.scale_type="$prim" rcuscale.nwriters=$SCALE_ALLOTED_CPUS rcuscale.holdoff=20 torture.disable_onoff_at_boot"
+			torture_set "rcuscale-$prim" tools/testing/selftests/rcutorture/bin/kvm.sh --torture rcuscale --allcpus --duration 5 --kconfig "CONFIG_TASKS_TRACE_RCU=y CONFIG_NR_CPUS=$SCALE_ALLOTED_CPUS" --trust-make
 			mv $T/last-resdir-nodebug $T/first-resdir-nodebug || :
 			if test -f "$T/last-resdir-kasan"
 			then
···
 if test "$do_kvfree" = "yes"
 then
 	torture_bootargs="rcuscale.kfree_rcu_test=1 rcuscale.kfree_nthreads=16 rcuscale.holdoff=20 rcuscale.kfree_loops=10000 torture.disable_onoff_at_boot"
-	torture_set "rcuscale-kvfree" tools/testing/selftests/rcutorture/bin/kvm.sh --torture rcuscale --allcpus --duration $duration_rcutorture --kconfig "CONFIG_NR_CPUS=$HALF_ALLOTED_CPUS" --memory 2G --trust-make
+	torture_set "rcuscale-kvfree" tools/testing/selftests/rcutorture/bin/kvm.sh --torture rcuscale --allcpus --duration $duration_rcutorture --kconfig "CONFIG_NR_CPUS=$SCALE_ALLOTED_CPUS" --memory 2G --trust-make
 fi
 
 if test "$do_clocksourcewd" = "yes"
-2
tools/testing/selftests/rcutorture/configs/rcu/CFcommon
···
 CONFIG_RCU_TORTURE_TEST=y
 CONFIG_PRINTK_TIME=y
-CONFIG_HYPERVISOR_GUEST=y
 CONFIG_PARAVIRT=y
-CONFIG_KVM_GUEST=y
 CONFIG_KCSAN_ASSUME_PLAIN_WRITES_ATOMIC=n
 CONFIG_KCSAN_REPORT_VALUE_CHANGE_ONLY=n
+2
tools/testing/selftests/rcutorture/configs/rcu/CFcommon.i686
···
+CONFIG_HYPERVISOR_GUEST=y
+CONFIG_KVM_GUEST=y
+1
tools/testing/selftests/rcutorture/configs/rcu/CFcommon.ppc64le
···
+CONFIG_KVM_GUEST=y
+2
tools/testing/selftests/rcutorture/configs/rcu/CFcommon.x86_64
···
+CONFIG_HYPERVISOR_GUEST=y
+CONFIG_KVM_GUEST=y
+1
tools/testing/selftests/rcutorture/configs/rcu/TREE07.boot
···
 rcutorture.stall_cpu=14
 rcutorture.stall_cpu_holdoff=90
 rcutorture.fwd_progress=0
+rcutree.nohz_full_patience_delay=1000
+20
tools/testing/selftests/rcutorture/configs/refscale/TINY
···
+CONFIG_SMP=n
+CONFIG_PREEMPT_NONE=y
+CONFIG_PREEMPT_VOLUNTARY=n
+CONFIG_PREEMPT=n
+CONFIG_PREEMPT_DYNAMIC=n
+#CHECK#CONFIG_PREEMPT_RCU=n
+CONFIG_HZ_PERIODIC=n
+CONFIG_NO_HZ_IDLE=y
+CONFIG_NO_HZ_FULL=n
+CONFIG_HOTPLUG_CPU=n
+CONFIG_SUSPEND=n
+CONFIG_HIBERNATION=n
+CONFIG_RCU_NOCB_CPU=n
+CONFIG_DEBUG_LOCK_ALLOC=n
+CONFIG_PROVE_LOCKING=n
+CONFIG_RCU_BOOST=n
+CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
+CONFIG_RCU_EXPERT=y
+CONFIG_KPROBES=n
+CONFIG_FTRACE=n