Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

rcu: Add noinstr-fast rcu_read_{,un}lock_tasks_trace() APIs

When expressing RCU Tasks Trace in terms of SRCU-fast, it was
necessary to keep a nesting count and per-CPU srcu_ctr structure
pointer in the task_struct structure, which is slow to access.
An alternative is to instead provide rcu_read_lock_tasks_trace() and
rcu_read_unlock_tasks_trace() functions that match the underlying
SRCU-fast semantics, thus avoiding the task_struct accesses.

When all callers have switched to the new API, the previous
rcu_read_lock_trace() and rcu_read_unlock_trace() APIs will be removed.
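
To illustrate the shape of the new API, here is a minimal userspace sketch in which the lock operation returns a token that the caller hands back to the matching unlock, so no per-task nesting count or saved pointer is needed. Every name in it (model_ctr, model_read_lock(), model_read_unlock(), model_readers_active(), model_nested_demo()) is hypothetical, and the counters are a deliberately simplified model, not the SRCU-fast implementation.

```c
#include <assert.h>

/* Hypothetical userspace stand-ins; none of these are kernel definitions. */
#define NR_MODEL_CTRS 4

struct model_ctr {
	long locks;   /* read-side entries recorded in this counter */
	long unlocks; /* read-side exits recorded in this counter */
};

static struct model_ctr model_ctrs[NR_MODEL_CTRS];

/* Entry returns a token; the caller keeps it, typically on its own stack. */
static struct model_ctr *model_read_lock(int idx)
{
	struct model_ctr *scp = &model_ctrs[idx];

	scp->locks++;
	return scp;
}

/* Exit takes the token returned by the matching model_read_lock(). */
static void model_read_unlock(struct model_ctr *scp)
{
	scp->unlocks++;
}

/* Readers still inside a critical section, summed over all counters. */
static long model_readers_active(void)
{
	long sum = 0;

	for (int i = 0; i < NR_MODEL_CTRS; i++)
		sum += model_ctrs[i].locks - model_ctrs[i].unlocks;
	return sum;
}

/* Nesting needs no shared nesting count: each level holds its own token. */
static long model_nested_demo(void)
{
	struct model_ctr *outer = model_read_lock(0);
	struct model_ctr *inner = model_read_lock(1);
	long during = model_readers_active();

	model_read_unlock(inner);
	model_read_unlock(outer);
	return during;
}
```

A caller converting from the older style would, under these assumptions, replace a void lock/unlock pair with a local variable that carries the token from lock to unlock.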

The rcu_read_{,un}lock_{,tasks_}trace() functions need to use smp_mb()
only if invoked where RCU is not watching, that is, from locations where
a call to rcu_is_watching() would return false. In architectures that
define the ARCH_WANTS_NO_INSTR Kconfig option, use of noinstr and friends
ensures that tracing happens only where RCU is watching, so those
architectures can dispense entirely with the read-side calls to smp_mb().

Other architectures include these read-side calls by default, but in many
installations there might be either larger than average tolerance for
risk, prohibition of removing tracing on a running system, or careful
review and approval of removing tracing. Such installations can
build their kernels with CONFIG_TASKS_TRACE_RCU_NO_MB=y to avoid those
read-side calls to smp_mb(), thus accepting responsibility for run-time
removal of tracing from code regions that RCU is not watching.
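
The compile-time choice described above can be sketched in userspace as follows. This is only a model: MODEL_NO_MB is a hypothetical stand-in for CONFIG_TASKS_TRACE_RCU_NO_MB, model_smp_mb() approximates smp_mb() with a C11 sequentially consistent fence, and the barrier counter exists purely so that the effect can be observed.

```c
#include <assert.h>
#include <stdatomic.h>

/*
 * Hypothetical stand-in for CONFIG_TASKS_TRACE_RCU_NO_MB. Build with
 * -DMODEL_NO_MB=1 to model a kernel that omits the read-side barriers.
 */
#ifndef MODEL_NO_MB
#define MODEL_NO_MB 0
#endif

static int model_barriers_executed; /* instrumentation for this sketch only */

/* Userspace approximation of smp_mb(): a sequentially consistent fence. */
static void model_smp_mb(void)
{
	atomic_thread_fence(memory_order_seq_cst);
	model_barriers_executed++;
}

static void model_reader_enter(void)
{
	/* Reader entry accounting would go here. */
	if (!MODEL_NO_MB)
		model_smp_mb(); /* barrier follows the entry accounting */
}

static void model_reader_exit(void)
{
	if (!MODEL_NO_MB)
		model_smp_mb(); /* barrier precedes the exit accounting */
	/* Reader exit accounting would go here. */
}

/* Count the barriers executed by one enter/exit pair. */
static int model_barriers_per_section(void)
{
	int before = model_barriers_executed;

	model_reader_enter();
	model_reader_exit();
	return model_barriers_executed - before;
}
```

Because the condition is a compile-time constant, a compiler can drop the barrier call entirely when MODEL_NO_MB is 1, which is the effect the patch obtains with IS_ENABLED().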

Those wishing to disable read-side memory barriers for an entire
architecture can select this TASKS_TRACE_RCU_NO_MB Kconfig option,
hence the polarity.

[ paulmck: Apply Peter Zijlstra feedback. ]

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: bpf@vger.kernel.org
Reviewed-by: Joel Fernandes <joelagnelf@nvidia.com>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>

authored by Paul E. McKenney and committed by Boqun Feng
1a72f4bb 176a6aea

+80 -8
+57 -8
include/linux/rcupdate_trace.h
···
 #ifdef CONFIG_TASKS_TRACE_RCU
 
 /**
+ * rcu_read_lock_tasks_trace - mark beginning of RCU-trace read-side critical section
+ *
+ * When synchronize_rcu_tasks_trace() is invoked by one task, then that
+ * task is guaranteed to block until all other tasks exit their read-side
+ * critical sections. Similarly, if call_rcu_trace() is invoked on one
+ * task while other tasks are within RCU read-side critical sections,
+ * invocation of the corresponding RCU callback is deferred until after
+ * the all the other tasks exit their critical sections.
+ *
+ * For more details, please see the documentation for
+ * srcu_read_lock_fast(). For a description of how implicit RCU
+ * readers provide the needed ordering for architectures defining the
+ * ARCH_WANTS_NO_INSTR Kconfig option (and thus promising never to trace
+ * code where RCU is not watching), please see the __srcu_read_lock_fast()
+ * (non-kerneldoc) header comment. Otherwise, the smp_mb() below provided
+ * the needed ordering.
+ */
+static inline struct srcu_ctr __percpu *rcu_read_lock_tasks_trace(void)
+{
+	struct srcu_ctr __percpu *ret = __srcu_read_lock_fast(&rcu_tasks_trace_srcu_struct);
+
+	rcu_try_lock_acquire(&rcu_tasks_trace_srcu_struct.dep_map);
+	if (!IS_ENABLED(CONFIG_TASKS_TRACE_RCU_NO_MB))
+		smp_mb(); // Provide ordering on noinstr-incomplete architectures.
+	return ret;
+}
+
+/**
+ * rcu_read_unlock_tasks_trace - mark end of RCU-trace read-side critical section
+ * @scp: return value from corresponding rcu_read_lock_tasks_trace().
+ *
+ * Pairs with the preceding call to rcu_read_lock_tasks_trace() that
+ * returned the value passed in via scp.
+ *
+ * For more details, please see the documentation for rcu_read_unlock().
+ * For memory-ordering information, please see the header comment for the
+ * rcu_read_lock_tasks_trace() function.
+ */
+static inline void rcu_read_unlock_tasks_trace(struct srcu_ctr __percpu *scp)
+{
+	if (!IS_ENABLED(CONFIG_TASKS_TRACE_RCU_NO_MB))
+		smp_mb(); // Provide ordering on noinstr-incomplete architectures.
+	__srcu_read_unlock_fast(&rcu_tasks_trace_srcu_struct, scp);
+	srcu_lock_release(&rcu_tasks_trace_srcu_struct.dep_map);
+}
+
+/**
  * rcu_read_lock_trace - mark beginning of RCU-trace read-side critical section
  *
  * When synchronize_rcu_tasks_trace() is invoked by one task, then that
···
 {
 	struct task_struct *t = current;
 
+	rcu_try_lock_acquire(&rcu_tasks_trace_srcu_struct.dep_map);
 	if (t->trc_reader_nesting++) {
 		// In case we interrupted a Tasks Trace RCU reader.
-		rcu_try_lock_acquire(&rcu_tasks_trace_srcu_struct.dep_map);
 		return;
 	}
 	barrier(); // nesting before scp to protect against interrupt handler.
-	t->trc_reader_scp = srcu_read_lock_fast(&rcu_tasks_trace_srcu_struct);
-	smp_mb(); // Placeholder for more selective ordering
+	t->trc_reader_scp = __srcu_read_lock_fast(&rcu_tasks_trace_srcu_struct);
+	if (!IS_ENABLED(CONFIG_TASKS_TRACE_RCU_NO_MB))
+		smp_mb(); // Placeholder for more selective ordering
 }
 
 /**
···
 	struct srcu_ctr __percpu *scp;
 	struct task_struct *t = current;
 
-	smp_mb(); // Placeholder for more selective ordering
 	scp = t->trc_reader_scp;
 	barrier(); // scp before nesting to protect against interrupt handler.
-	if (!--t->trc_reader_nesting)
-		srcu_read_unlock_fast(&rcu_tasks_trace_srcu_struct, scp);
-	else
-		srcu_lock_release(&rcu_tasks_trace_srcu_struct.dep_map);
+	if (!--t->trc_reader_nesting) {
+		if (!IS_ENABLED(CONFIG_TASKS_TRACE_RCU_NO_MB))
+			smp_mb(); // Placeholder for more selective ordering
+		__srcu_read_unlock_fast(&rcu_tasks_trace_srcu_struct, scp);
+	}
+	srcu_lock_release(&rcu_tasks_trace_srcu_struct.dep_map);
 }
 
 /**
+23
kernel/rcu/Kconfig
···
 	default n
 	select IRQ_WORK
 
+config TASKS_TRACE_RCU_NO_MB
+	bool "Override RCU Tasks Trace inclusion of read-side memory barriers"
+	depends on RCU_EXPERT && TASKS_TRACE_RCU
+	default ARCH_WANTS_NO_INSTR
+	help
+	  This option prevents the use of read-side memory barriers in
+	  rcu_read_lock_tasks_trace() and rcu_read_unlock_tasks_trace()
+	  even in kernels built with CONFIG_ARCH_WANTS_NO_INSTR=n, that is,
+	  in kernels that do not have noinstr set up in entry/exit code.
+	  By setting this option, you are promising to carefully review
+	  use of ftrace, BPF, and friends to ensure that no tracing
+	  operation is attached to a function that runs in that portion
+	  of the entry/exit code that RCU does not watch, that is,
+	  where rcu_is_watching() returns false. Alternatively, you
+	  might choose to never remove traces except by rebooting.
+
+	  Those wishing to disable read-side memory barriers for an entire
+	  architecture can select this Kconfig option, hence the polarity.
+
+	  Say Y here if you need speed and will review use of tracing.
+	  Say N here for certain esoteric testing of RCU itself.
+	  Take the default if you are unsure.
+
 config RCU_STALL_COMMON
 	def_bool TREE_RCU
 	help