Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

ring-buffer: Use a housekeeping CPU to wake up waiters

Avoid running the wakeup irq_work on an isolated CPU. Since the wakeup can
run on any CPU, let's pick a housekeeping CPU to do the job.
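As a rough illustration of "pick a housekeeping CPU", here is a minimal userspace sketch, not kernel code. The names model_is_housekeeping() and model_pick_wakeup_cpu(), the 64-bit mask representation, and the convention that a zero mask means "isolation not configured" are all assumptions made for this example. The kernel's housekeeping_any_cpu() takes locality into account; this model simply keeps the current CPU if it is a housekeeping CPU and otherwise takes the lowest-numbered one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_NR_CPUS 64	/* assumption: at most 64 CPUs, one bit each */

/* Hypothetical helper: true if @cpu is set in the housekeeping bitmask. */
static bool model_is_housekeeping(uint64_t hk_mask, int cpu)
{
	return (hk_mask >> cpu) & 1;
}

/*
 * Hypothetical model of choosing the CPU that runs the wakeup.
 * hk_mask == 0 stands for "CPU isolation not configured" in this sketch.
 */
static int model_pick_wakeup_cpu(uint64_t hk_mask, int this_cpu)
{
	assert(this_cpu >= 0 && this_cpu < MODEL_NR_CPUS);

	/* No isolation, or already on a housekeeping CPU: stay local. */
	if (hk_mask == 0 || model_is_housekeeping(hk_mask, this_cpu))
		return this_cpu;

	/* Otherwise take the lowest-numbered housekeeping CPU. */
	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		if (model_is_housekeeping(hk_mask, cpu))
			return cpu;

	return this_cpu;	/* not reached: hk_mask is non-zero here */
}
```

With CPU 2 isolated (every bit set except bit 2), a caller on CPU 2 is redirected to CPU 0, while a caller on CPU 5 stays on CPU 5.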

This change reduces additional noise when tracing isolated CPUs. For
example, the following ipi_send_cpu stack trace was captured with
nohz_full=2 on the isolated CPU:

<idle>-0 [002] d.h4. 1255.379293: ipi_send_cpu: cpu=2 callsite=irq_work_queue+0x2d/0x50 callback=rb_wake_up_waiters+0x0/0x80
<idle>-0 [002] d.h4. 1255.379329: <stack trace>
=> trace_event_raw_event_ipi_send_cpu
=> __irq_work_queue_local
=> irq_work_queue
=> ring_buffer_unlock_commit
=> trace_buffer_unlock_commit_regs
=> trace_event_buffer_commit
=> trace_event_raw_event_x86_irq_vector
=> __sysvec_apic_timer_interrupt
=> sysvec_apic_timer_interrupt
=> asm_sysvec_apic_timer_interrupt
=> pv_native_safe_halt
=> default_idle
=> default_idle_call
=> do_idle
=> cpu_startup_entry
=> start_secondary
=> common_startup_64

The IRQ work interrupt alone adds considerable noise, but the impact can
get even worse with PREEMPT_RT, because the IRQ work interrupt is then
handled by a separate kernel thread. This requires a task switch and makes
tracing useless for analyzing latency on an isolated CPU.

After applying the patch, the trace is similar, but ipi_send_cpu always
targets a non-isolated CPU.

Unfortunately, irq_work_queue_on() is not NMI-safe. When running in NMI
context, fall back to queuing the irq work on the local CPU.
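The NMI fallback can be pictured with a small userspace sketch. This is a model of the decision only, not the kernel implementation: struct model_wakeup_target and model_route_wakeup() are hypothetical names, and the housekeeping CPU is passed in as a plain argument instead of being looked up.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical result: which CPU runs the wakeup, and whether that is remote. */
struct model_wakeup_target {
	int cpu;
	bool remote;
};

/*
 * Hypothetical model of the routing decision: in NMI context the work
 * stays on the local CPU, otherwise it goes to the housekeeping CPU.
 */
static struct model_wakeup_target
model_route_wakeup(bool in_nmi, int this_cpu, int housekeeping_cpu)
{
	struct model_wakeup_target t = { .cpu = this_cpu, .remote = false };

	assert(this_cpu >= 0 && housekeeping_cpu >= 0);

	/* Cross-CPU queuing is not NMI-safe: keep the work local. */
	if (in_nmi)
		return t;

	t.cpu = housekeeping_cpu;
	t.remote = (housekeeping_cpu != this_cpu);
	return t;
}
```

In this model, an event recorded on isolated CPU 2 with housekeeping CPU 0 is routed to CPU 0 in normal context, and stays on CPU 2 only when recorded from an NMI.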

Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Clark Williams <clrkwllms@kernel.org>
Cc: Frederic Weisbecker <frederic@kernel.org>
Link: https://patch.msgid.link/20260108132132.2473515-1-ptesarik@suse.com
Signed-off-by: Petr Tesarik <ptesarik@suse.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>

Authored by Petr Tesarik and committed by Steven Rostedt (Google)
8aa76aa4 e4ef389e

+21 -3
kernel/trace/ring_buffer.c
···
  *
  * Copyright (C) 2008 Steven Rostedt <srostedt@redhat.com>
  */
+#include <linux/sched/isolation.h>
 #include <linux/trace_recursion.h>
 #include <linux/trace_events.h>
 #include <linux/ring_buffer.h>
···
 	rb_end_commit(cpu_buffer);
 }
 
+static bool
+rb_irq_work_queue(struct rb_irq_work *irq_work)
+{
+	int cpu;
+
+	/* irq_work_queue_on() is not NMI-safe */
+	if (unlikely(in_nmi()))
+		return irq_work_queue(&irq_work->work);
+
+	/*
+	 * If CPU isolation is not active, cpu is always the current
+	 * CPU, and the following is equivallent to irq_work_queue().
+	 */
+	cpu = housekeeping_any_cpu(HK_TYPE_KERNEL_NOISE);
+	return irq_work_queue_on(&irq_work->work, cpu);
+}
+
 static __always_inline void
 rb_wakeups(struct trace_buffer *buffer, struct ring_buffer_per_cpu *cpu_buffer)
 {
 	if (buffer->irq_work.waiters_pending) {
 		buffer->irq_work.waiters_pending = false;
 		/* irq_work_queue() supplies it's own memory barriers */
-		irq_work_queue(&buffer->irq_work.work);
+		rb_irq_work_queue(&buffer->irq_work);
 	}
 
 	if (cpu_buffer->irq_work.waiters_pending) {
 		cpu_buffer->irq_work.waiters_pending = false;
 		/* irq_work_queue() supplies it's own memory barriers */
-		irq_work_queue(&cpu_buffer->irq_work.work);
+		rb_irq_work_queue(&cpu_buffer->irq_work);
 	}
 
 	if (cpu_buffer->last_pages_touch == local_read(&cpu_buffer->pages_touched))
···
 	cpu_buffer->irq_work.wakeup_full = true;
 	cpu_buffer->irq_work.full_waiters_pending = false;
 	/* irq_work_queue() supplies it's own memory barriers */
-	irq_work_queue(&cpu_buffer->irq_work.work);
+	rb_irq_work_queue(&cpu_buffer->irq_work);
 }
 
 #ifdef CONFIG_RING_BUFFER_RECORD_RECURSION