Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

tracing: Replace syscall RCU pointer assignment with READ/WRITE_ONCE()

The syscall events are pseudo events that hook to the raw syscalls. The
ftrace_syscall_enter/exit() callback is called by the raw_syscall
enter/exit tracepoints respectively whenever any of the syscall events are
enabled.

The trace_array has an array of syscall "files" that correspond to the
system calls based on their __NR_SYSCALL number. The array is read and if
there's a pointer to a trace_event_file then it is considered enabled and
if it is NULL that syscall event is considered disabled.
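The lookup described above can be sketched in plain C. This is only an illustration: the `demo_` types and names and the table size are hypothetical stand-ins, not the kernel's definitions. It shows the convention that a NULL slot means the syscall event is disabled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; the size is arbitrary. */
#define NR_SYSCALLS_DEMO 8

struct demo_event_file {
	const char *name;
};

struct demo_trace_array {
	/* One slot per syscall number; NULL means the event is disabled. */
	struct demo_event_file *enter_syscall_files[NR_SYSCALLS_DEMO];
};

/* Return the file for syscall_nr, or NULL if out of range or disabled. */
static struct demo_event_file *
demo_lookup_enter(struct demo_trace_array *tr, int syscall_nr)
{
	if (syscall_nr < 0 || syscall_nr >= NR_SYSCALLS_DEMO)
		return NULL;
	return tr->enter_syscall_files[syscall_nr];
}

/* Small self-check showing the enabled/disabled convention. */
static void demo_lookup_check(void)
{
	static struct demo_event_file openat_file = { .name = "sys_enter_openat" };
	struct demo_trace_array tr = { 0 };

	assert(demo_lookup_enter(&tr, 3) == NULL);	/* disabled by default */
	tr.enter_syscall_files[3] = &openat_file;	/* enable slot 3 */
	assert(demo_lookup_enter(&tr, 3) == &openat_file);
	assert(demo_lookup_enter(&tr, -1) == NULL);	/* out of range */
}
```

Calling `demo_lookup_check()` from a `main()` exercises the three cases: a disabled slot, an enabled slot, and an out-of-range number.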

Currently it uses rcu_dereference_sched() to read this pointer and
rcu_assign_pointer() or RCU_INIT_POINTER() to write it. This is
unnecessary, as the file pointer will not go away outside the
synchronization of the tracepoint logic itself, and this code adds no RCU
synchronization of its own that would rely on these accessors.

Replace these functions with a simple READ_ONCE() and WRITE_ONCE() which
is all they need. This will also allow this code to not depend on
preemption being disabled as system call tracepoints are now allowed to
fault.
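As a rough userspace illustration of the resulting pattern, the sketch below defines simplified stand-ins for READ_ONCE() and WRITE_ONCE(). This is an assumption for the sake of a self-contained example: the kernel's real macros are more elaborate. The `sketch_` names are hypothetical. The reader loads the slot exactly once and then works only on its local copy.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Simplified stand-ins for the kernel macros (assumption: not the kernel's
 * actual definitions). The volatile access asks the compiler for a single
 * load or store. __typeof__ is a GCC/Clang extension.
 */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

#define SKETCH_NR_SYSCALLS 8

struct sketch_file {
	int hits;
};

static struct sketch_file *sketch_enter_files[SKETCH_NR_SYSCALLS];

/* Writer side: in the patch this runs with syscall_trace_lock held. */
static void sketch_enable(int num, struct sketch_file *file)
{
	WRITE_ONCE(sketch_enter_files[num], file);
}

static void sketch_disable(int num)
{
	WRITE_ONCE(sketch_enter_files[num], NULL);
}

/* Reader side: returns 1 if the event was enabled and recorded, else 0. */
static int sketch_syscall_enter(int syscall_nr)
{
	struct sketch_file *file;

	if (syscall_nr < 0 || syscall_nr >= SKETCH_NR_SYSCALLS)
		return 0;

	file = READ_ONCE(sketch_enter_files[syscall_nr]);
	if (!file)
		return 0;	/* event disabled */

	file->hits++;
	return 1;
}

/* Self-check: enable, hit, disable; leaves the table empty again. */
static void sketch_check(void)
{
	struct sketch_file f = { 0 };

	assert(sketch_syscall_enter(1) == 0);
	sketch_enable(1, &f);
	assert(sketch_syscall_enter(1) == 1 && f.hits == 1);
	sketch_disable(1);
	assert(sketch_syscall_enter(1) == 0);
}
```

The sketch deliberately leaves out the mutex and the tracepoint registration. As the message explains, the lifetime of the file pointer is guaranteed by the tracepoint logic, not by these accessors.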

Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Takaya Saeki <takayas@google.com>
Cc: Tom Zanussi <zanussi@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ian Rogers <irogers@google.com>
Cc: Douglas Raillard <douglas.raillard@arm.com>
Link: https://lore.kernel.org/20250923130713.594320290@kernel.org
Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>

+8 -10
+2 -2
kernel/trace/trace.h
···
 #ifdef CONFIG_FTRACE_SYSCALLS
 	int sys_refcount_enter;
 	int sys_refcount_exit;
-	struct trace_event_file __rcu *enter_syscall_files[NR_syscalls];
-	struct trace_event_file __rcu *exit_syscall_files[NR_syscalls];
+	struct trace_event_file *enter_syscall_files[NR_syscalls];
+	struct trace_event_file *exit_syscall_files[NR_syscalls];
 #endif
 	int stop_count;
 	int clock_id;
+6 -8
kernel/trace/trace_syscalls.c
···
 	if (syscall_nr < 0 || syscall_nr >= NR_syscalls)
 		return;
 
-	/* Here we're inside tp handler's rcu_read_lock_sched (__DO_TRACE) */
-	trace_file = rcu_dereference_sched(tr->enter_syscall_files[syscall_nr]);
+	trace_file = READ_ONCE(tr->enter_syscall_files[syscall_nr]);
 	if (!trace_file)
 		return;
 
···
 	if (syscall_nr < 0 || syscall_nr >= NR_syscalls)
 		return;
 
-	/* Here we're inside tp handler's rcu_read_lock_sched (__DO_TRACE()) */
-	trace_file = rcu_dereference_sched(tr->exit_syscall_files[syscall_nr]);
+	trace_file = READ_ONCE(tr->exit_syscall_files[syscall_nr]);
 	if (!trace_file)
 		return;
 
···
 	if (!tr->sys_refcount_enter)
 		ret = register_trace_sys_enter(ftrace_syscall_enter, tr);
 	if (!ret) {
-		rcu_assign_pointer(tr->enter_syscall_files[num], file);
+		WRITE_ONCE(tr->enter_syscall_files[num], file);
 		tr->sys_refcount_enter++;
 	}
 	mutex_unlock(&syscall_trace_lock);
···
 		return;
 	mutex_lock(&syscall_trace_lock);
 	tr->sys_refcount_enter--;
-	RCU_INIT_POINTER(tr->enter_syscall_files[num], NULL);
+	WRITE_ONCE(tr->enter_syscall_files[num], NULL);
 	if (!tr->sys_refcount_enter)
 		unregister_trace_sys_enter(ftrace_syscall_enter, tr);
 	mutex_unlock(&syscall_trace_lock);
···
 	if (!tr->sys_refcount_exit)
 		ret = register_trace_sys_exit(ftrace_syscall_exit, tr);
 	if (!ret) {
-		rcu_assign_pointer(tr->exit_syscall_files[num], file);
+		WRITE_ONCE(tr->exit_syscall_files[num], file);
 		tr->sys_refcount_exit++;
 	}
 	mutex_unlock(&syscall_trace_lock);
···
 		return;
 	mutex_lock(&syscall_trace_lock);
 	tr->sys_refcount_exit--;
-	RCU_INIT_POINTER(tr->exit_syscall_files[num], NULL);
+	WRITE_ONCE(tr->exit_syscall_files[num], NULL);
 	if (!tr->sys_refcount_exit)
 		unregister_trace_sys_exit(ftrace_syscall_exit, tr);
 	mutex_unlock(&syscall_trace_lock);