Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge tag 'trace-ringbuffer-v7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull ring-buffer updates from Steven Rostedt:

- Add remote buffers for pKVM

pKVM has a hypervisor component that is used to protect the guest
from the host kernel. This hypervisor is a black box to the kernel as
the kernel is to user space. The remote buffers are used to have a
memory mapping between the hypervisor and the kernel where the kernel may
send commands to enable tracing within the hypervisor. Then the
kernel will read this memory mapping just like user space can read
the memory mapped ring buffer of the kernel tracing system.

Since the hypervisor only has a single context, it doesn't need to
worry about races between normal context, interrupt context and NMIs
like the kernel does. The ring buffer it uses doesn't need to be as
complex. The remote buffers are a simple version of the ring buffer
that works in a single context. They are still per-CPU and use sub
buffers. The data layout is the same as the kernel's ring buffer to
share the same parsing.

Currently, only ARM64 implements pKVM, but there's work to implement
it also in x86. The remote buffer code is separated out from the ARM
implementation so that it can be used in the future by x86.

The ARM64 updates for pKVM are in the ARM/KVM tree, which merged in
the remote buffers of this tree.

- Make the backup instance non reusable

The backup instance is a copy of the persistent ring buffer so that
the persistent ring buffer could start recording again without using
the data from the previous boot. The backup isn't for normal tracing.
It is made read-only, and after it is consumed, it is automatically
removed.

- Have backup copy persistent instance before it starts recording

To allow the persistent ring buffer to start recording from the
kernel command line commands, move the copy of the backup instance to
before the command line options start recording.

- Report header_page overwrite field as "char" and not "int"

The rust parser of the header_page file was triggering a warning when
it defined the overwrite variable as "int" but it was only a single
byte in size.

- Fix memory barriers for the trace_buffer CPU mask

When a CPU comes online, the bit is set to allow readers to know that
the CPU buffer is allocated. The bit is set after the allocation is
done, and a smp_wmb() is performed after the allocation and before
the setting of the bit. Rather than adding a smp_rmb() to all
readers, the fix relies on the fact that once a buffer is created for
a CPU it is not deleted if that CPU goes offline, so this allocation
is almost always done at boot up before any readers exist.

In the unlikely case where a CPU comes online for the first time
after the system boot has finished, an IPI is sent to all CPUs to
force the smp_rmb() on each CPU.

- Show clock function being used in debugging ring buffer data

When the ring buffer checks are enabled and the ring buffer detects
an inconsistency in the times of the events, print out the clock
being used when the error occurred. There was a very hard to hit bug
that would happen every so often and it ended up being only triggered
when the jiffies clock was being used. If the bug showed the clock
being used, it would have been much easier to find the problem (which
was that an internal function was being traced, which caused the clock
accounting to go off).
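
The barrier pairing described in the trace_buffer CPU mask item can be illustrated in user space. The following is a minimal sketch using C11 atomics, not the kernel code: a release operation stands in for smp_wmb() plus setting the bit, and an acquire load stands in for testing the bit plus smp_rmb(). The names MAX_CPUS, struct cpu_buffer, publish_cpu_buffer() and lookup_cpu_buffer() are hypothetical. It shows the conventional reader-side barrier that the change avoids adding to every reader; the IPI-based approach is not reproduced here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_CPUS 8	/* hypothetical limit for this sketch */

struct cpu_buffer {
	int cpu;
};

static struct cpu_buffer *buffers[MAX_CPUS];
static atomic_uint cpumask;	/* bit N set: buffers[N] is ready */

/* Writer side: allocate first, then publish the bit with release ordering. */
static int publish_cpu_buffer(int cpu)
{
	struct cpu_buffer *buf;

	assert(cpu >= 0 && cpu < MAX_CPUS);

	buf = malloc(sizeof(*buf));
	if (!buf)
		return -1;
	buf->cpu = cpu;
	buffers[cpu] = buf;

	/* Plays the role of smp_wmb() followed by setting the mask bit. */
	atomic_fetch_or_explicit(&cpumask, 1u << cpu, memory_order_release);
	return 0;
}

/* Reader side: the acquire load plays the role of the bit test plus smp_rmb(). */
static struct cpu_buffer *lookup_cpu_buffer(int cpu)
{
	unsigned int mask;

	assert(cpu >= 0 && cpu < MAX_CPUS);

	mask = atomic_load_explicit(&cpumask, memory_order_acquire);
	if (!(mask & (1u << cpu)))
		return NULL;
	return buffers[cpu];
}
```

A reader that sees the bit set is guaranteed to also see the initialized buffer pointer, which is the property the kernel change preserves without paying for a read barrier on every access.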

* tag 'trace-ringbuffer-v7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (26 commits)
ring-buffer: Prevent off-by-one array access in ring_buffer_desc_page()
ring-buffer: Report header_page overwrite as char
tracing: Allow backup to save persistent ring buffer before it starts
tracing/Documentation: Add a section about backup instance
tracing: Remove the backup instance automatically after read
tracing: Make the backup instance non-reusable
ring-buffer: Enforce read ordering of trace_buffer cpumask and buffers
ring-buffer: Show what clock function is used on timestamp errors
tracing: Check for undefined symbols in simple_ring_buffer
tracing: load/unload page callbacks for simple_ring_buffer
Documentation: tracing: Add tracing remotes
tracing: selftests: Add trace remote tests
tracing: Add a trace remote module for testing
tracing: Introduce simple_ring_buffer
ring-buffer: Export buffer_data_page and macros
tracing: Add helpers to create trace remote events
tracing: Add events/ root files to trace remotes
tracing: Add events to trace remotes
tracing: Add init callback to trace remotes
tracing: Add non-consuming read to trace remotes
...

+3658 -140
+19
Documentation/trace/debugging.rst
···
 disable tracing with the "traceoff" flag, and enable tracing after boot up.
 Otherwise the trace from the most recent boot will be mixed with the trace
 from the previous boot, and may make it confusing to read.
+
+Using a backup instance for keeping previous boot data
+------------------------------------------------------
+
+It is also possible to record trace data at system boot time by specifying
+events with the persistent ring buffer, but in this case the data before the
+reboot will be lost before it can be read. This problem can be solved by a
+backup instance. From the kernel command line::
+
+  reserve_mem=12M:4096:trace trace_instance=boot_map@trace,sched,irq trace_instance=backup=boot_map
+
+On boot up, the previous data in the "boot_map" is copied to the "backup"
+instance, and the "sched:*" and "irq:*" events for the current boot are traced
+in the "boot_map". Thus the user can read the previous boot data from the "backup"
+instance without stopping the trace.
+
+Note that this "backup" instance is read-only, and will be removed automatically
+if you clear the trace data or read out all trace data from the "trace_pipe"
+or the "trace_pipe_raw" files.
+11
Documentation/trace/index.rst
···
    user_events
    uprobetracer
 
+Remote Tracing
+--------------
+
+This section covers the framework to read compatible ring-buffers, written by
+entities outside of the kernel (most likely firmware or hypervisor).
+
+.. toctree::
+   :maxdepth: 1
+
+   remotes
+
 Additional Resources
 --------------------
 
+66
Documentation/trace/remotes.rst
···
+.. SPDX-License-Identifier: GPL-2.0
+
+===============
+Tracing Remotes
+===============
+
+:Author: Vincent Donnefort <vdonnefort@google.com>
+
+Overview
+========
+Firmware and hypervisors are black boxes to the kernel. Having a way to see what
+they are doing can be useful to debug both. This is where remote tracing buffers
+come in. A remote tracing buffer is a ring buffer written by the firmware or
+hypervisor into memory that is memory mapped to the host kernel. This is similar
+to how user space memory maps the kernel ring buffer but in this case the kernel
+is acting like user space and the firmware or hypervisor is the "kernel" side.
+With a trace remote ring buffer, the firmware and hypervisor can record events
+that the host kernel can see and expose to user space.
+
+Register a remote
+=================
+A remote must provide a set of callbacks `struct trace_remote_callbacks` whose
+description can be found below. Those callbacks allow Tracefs to enable and
+disable tracing and events, to load and unload a tracing buffer (a set of
+ring-buffers) and to swap a reader page with the head page, which enables
+consuming reads.
+
+.. kernel-doc:: include/linux/trace_remote.h
+
+Once registered, an instance will appear for this remote in the Tracefs
+directory **remotes/**. Buffers can then be read using the usual Tracefs files
+**trace_pipe** and **trace**.
+
+Declare a remote event
+======================
+Macros are provided to ease the declaration of remote events, in a similar
+fashion to in-kernel events. A declaration must provide an ID, a description of
+the event arguments and how to print the event:
+
+.. code-block:: c
+
+	REMOTE_EVENT(foo, EVENT_FOO_ID,
+		RE_STRUCT(
+			re_field(u64, bar)
+		),
+		RE_PRINTK("bar=%llu", __entry->bar)
+	);
+
+Then those events must be declared in a C file with the following:
+
+.. code-block:: c
+
+	#define REMOTE_EVENT_INCLUDE_FILE foo_events.h
+	#include <trace/define_remote_events.h>
+
+This will provide a `struct remote_event remote_event_foo` that can be given to
+`trace_remote_register`.
+
+Registered events appear in the remote directory under **events/**.
+
+Simple ring-buffer
+==================
+A simple implementation for a ring-buffer writer can be found in
+kernel/trace/simple_ring_buffer.c.
+
+.. kernel-doc:: include/linux/simple_ring_buffer.h
+1
fs/tracefs/inode.c
···
 	fsnotify_create(d_inode(dentry->d_parent), dentry);
 	return tracefs_end_creating(dentry);
 }
+EXPORT_SYMBOL_GPL(tracefs_create_file);
 
 static struct dentry *__create_dir(const char *name, struct dentry *parent,
 				   const struct inode_operations *ops)
+58
include/linux/ring_buffer.h
···
 void ring_buffer_map_dup(struct trace_buffer *buffer, int cpu);
 int ring_buffer_unmap(struct trace_buffer *buffer, int cpu);
 int ring_buffer_map_get_reader(struct trace_buffer *buffer, int cpu);
+
+struct ring_buffer_desc {
+	int cpu;
+	unsigned int nr_page_va; /* excludes the meta page */
+	unsigned long meta_va;
+	unsigned long page_va[] __counted_by(nr_page_va);
+};
+
+struct trace_buffer_desc {
+	int nr_cpus;
+	size_t struct_len;
+	char __data[]; /* list of ring_buffer_desc */
+};
+
+static inline struct ring_buffer_desc *__next_ring_buffer_desc(struct ring_buffer_desc *desc)
+{
+	size_t len = struct_size(desc, page_va, desc->nr_page_va);
+
+	return (struct ring_buffer_desc *)((void *)desc + len);
+}
+
+static inline struct ring_buffer_desc *__first_ring_buffer_desc(struct trace_buffer_desc *desc)
+{
+	return (struct ring_buffer_desc *)(&desc->__data[0]);
+}
+
+static inline size_t trace_buffer_desc_size(size_t buffer_size, unsigned int nr_cpus)
+{
+	unsigned int nr_pages = max(DIV_ROUND_UP(buffer_size, PAGE_SIZE), 2UL) + 1;
+	struct ring_buffer_desc *rbdesc;
+
+	return size_add(offsetof(struct trace_buffer_desc, __data),
+			size_mul(nr_cpus, struct_size(rbdesc, page_va, nr_pages)));
+}
+
+#define for_each_ring_buffer_desc(__pdesc, __cpu, __trace_pdesc) \
+	for (__pdesc = __first_ring_buffer_desc(__trace_pdesc), __cpu = 0; \
+	     (__cpu) < (__trace_pdesc)->nr_cpus; \
+	     (__cpu)++, __pdesc = __next_ring_buffer_desc(__pdesc))
+
+struct ring_buffer_remote {
+	struct trace_buffer_desc *desc;
+	int (*swap_reader_page)(unsigned int cpu, void *priv);
+	int (*reset)(unsigned int cpu, void *priv);
+	void *priv;
+};
+
+int ring_buffer_poll_remote(struct trace_buffer *buffer, int cpu);
+
+struct trace_buffer *
+__ring_buffer_alloc_remote(struct ring_buffer_remote *remote,
+			   struct lock_class_key *key);
+
+#define ring_buffer_alloc_remote(remote) \
+({ \
+	static struct lock_class_key __key; \
+	__ring_buffer_alloc_remote(remote, &__key); \
+})
 #endif /* _LINUX_RING_BUFFER_H */
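The descriptor layout added to include/linux/ring_buffer.h packs one variable-length ring_buffer_desc per CPU back to back inside a trace_buffer_desc. The following user-space sketch shows how such a layout can be built and walked. It is only an illustration under stated assumptions: struct_size() is replaced by a hand-written rb_desc_len(), the __counted_by() annotation is dropped, and the helpers build_desc() and find_desc() are hypothetical names that do not exist in the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct ring_buffer_desc {
	int cpu;
	unsigned int nr_page_va;	/* excludes the meta page */
	unsigned long meta_va;
	unsigned long page_va[];
};

struct trace_buffer_desc {
	int nr_cpus;
	size_t struct_len;
	char __data[];			/* list of ring_buffer_desc */
};

/* User-space stand-in for struct_size(desc, page_va, nr_page_va). */
static size_t rb_desc_len(unsigned int nr_page_va)
{
	return sizeof(struct ring_buffer_desc) + nr_page_va * sizeof(unsigned long);
}

static struct ring_buffer_desc *first_desc(struct trace_buffer_desc *t)
{
	return (struct ring_buffer_desc *)&t->__data[0];
}

/* The next entry starts right after the current entry's page_va[] array. */
static struct ring_buffer_desc *next_desc(struct ring_buffer_desc *d)
{
	return (struct ring_buffer_desc *)((char *)d + rb_desc_len(d->nr_page_va));
}

/* Hypothetical helper: one entry per CPU, nr_pages pages each. Caller frees. */
static struct trace_buffer_desc *build_desc(int nr_cpus, unsigned int nr_pages)
{
	struct trace_buffer_desc *t;
	struct ring_buffer_desc *d;
	size_t len;
	int cpu;

	assert(nr_cpus > 0);

	len = offsetof(struct trace_buffer_desc, __data) +
	      (size_t)nr_cpus * rb_desc_len(nr_pages);
	t = calloc(1, len);
	if (!t)
		return NULL;

	t->nr_cpus = nr_cpus;
	t->struct_len = len;

	d = first_desc(t);
	for (cpu = 0; cpu < nr_cpus; cpu++) {
		d->cpu = cpu;
		d->nr_page_va = nr_pages;
		d = next_desc(d);
	}
	return t;
}

/* Linear walk, in the spirit of for_each_ring_buffer_desc(). */
static struct ring_buffer_desc *find_desc(struct trace_buffer_desc *t, int cpu)
{
	struct ring_buffer_desc *d = first_desc(t);
	int i;

	for (i = 0; i < t->nr_cpus; i++, d = next_desc(d)) {
		if (d->cpu == cpu)
			return d;
	}
	return NULL;
}
```

Because every entry carries its own nr_page_va, the walk never needs a separate offset table; the cost is that a lookup by CPU is linear unless all entries have the same size.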
+41
include/linux/ring_buffer_types.h
···
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LINUX_RING_BUFFER_TYPES_H
+#define _LINUX_RING_BUFFER_TYPES_H
+
+#include <asm/local.h>
+
+#define TS_SHIFT 27
+#define TS_MASK ((1ULL << TS_SHIFT) - 1)
+#define TS_DELTA_TEST (~TS_MASK)
+
+/*
+ * We need to fit the time_stamp delta into 27 bits.
+ */
+static inline bool test_time_stamp(u64 delta)
+{
+	return !!(delta & TS_DELTA_TEST);
+}
+
+#define BUF_PAGE_HDR_SIZE offsetof(struct buffer_data_page, data)
+
+#define RB_EVNT_HDR_SIZE (offsetof(struct ring_buffer_event, array))
+#define RB_ALIGNMENT 4U
+#define RB_MAX_SMALL_DATA (RB_ALIGNMENT * RINGBUF_TYPE_DATA_TYPE_LEN_MAX)
+#define RB_EVNT_MIN_SIZE 8U /* two 32bit words */
+
+#ifndef CONFIG_HAVE_64BIT_ALIGNED_ACCESS
+# define RB_FORCE_8BYTE_ALIGNMENT 0
+# define RB_ARCH_ALIGNMENT RB_ALIGNMENT
+#else
+# define RB_FORCE_8BYTE_ALIGNMENT 1
+# define RB_ARCH_ALIGNMENT 8U
+#endif
+
+#define RB_ALIGN_DATA __aligned(RB_ARCH_ALIGNMENT)
+
+struct buffer_data_page {
+	u64 time_stamp; /* page time stamp */
+	local_t commit; /* write committed index */
+	unsigned char data[] RB_ALIGN_DATA; /* data of buffer page */
+};
+#endif
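The time stamp delta check exported by this header can be tried outside the kernel. The snippet below is a user-space sketch, not kernel code: it keeps the names TS_SHIFT, TS_MASK, TS_DELTA_TEST and test_time_stamp() from the header, but swaps the kernel's u64 for uint64_t, which is an assumption made only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TS_SHIFT	27
#define TS_MASK		((1ULL << TS_SHIFT) - 1)
#define TS_DELTA_TEST	(~TS_MASK)

/* 27 bits set: the largest delta that still fits in the event header. */
static_assert(TS_MASK == 0x7ffffffULL, "TS_MASK must cover exactly 27 bits");

/* Returns true when the delta does not fit into 27 bits. */
static inline bool test_time_stamp(uint64_t delta)
{
	return !!(delta & TS_DELTA_TEST);
}
```

A delta of TS_MASK is the last value that fits; anything larger sets a bit covered by TS_DELTA_TEST and makes test_time_stamp() return true.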
+65
include/linux/simple_ring_buffer.h
···
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LINUX_SIMPLE_RING_BUFFER_H
+#define _LINUX_SIMPLE_RING_BUFFER_H
+
+#include <linux/list.h>
+#include <linux/ring_buffer.h>
+#include <linux/ring_buffer_types.h>
+#include <linux/types.h>
+
+/*
+ * Ideally those structs would stay private but the caller needs to know
+ * the allocation size for simple_ring_buffer_init().
+ */
+struct simple_buffer_page {
+	struct list_head link;
+	struct buffer_data_page *page;
+	u64 entries;
+	u32 write;
+	u32 id;
+};
+
+struct simple_rb_per_cpu {
+	struct simple_buffer_page *tail_page;
+	struct simple_buffer_page *reader_page;
+	struct simple_buffer_page *head_page;
+	struct simple_buffer_page *bpages;
+	struct trace_buffer_meta *meta;
+	u32 nr_pages;
+
+#define SIMPLE_RB_UNAVAILABLE 0
+#define SIMPLE_RB_READY 1
+#define SIMPLE_RB_WRITING 2
+	u32 status;
+
+	u64 last_overrun;
+	u64 write_stamp;
+
+	struct simple_rb_cbs *cbs;
+};
+
+int simple_ring_buffer_init(struct simple_rb_per_cpu *cpu_buffer, struct simple_buffer_page *bpages,
+			    const struct ring_buffer_desc *desc);
+
+void simple_ring_buffer_unload(struct simple_rb_per_cpu *cpu_buffer);
+
+void *simple_ring_buffer_reserve(struct simple_rb_per_cpu *cpu_buffer, unsigned long length,
+				 u64 timestamp);
+
+void simple_ring_buffer_commit(struct simple_rb_per_cpu *cpu_buffer);
+
+int simple_ring_buffer_enable_tracing(struct simple_rb_per_cpu *cpu_buffer, bool enable);
+
+int simple_ring_buffer_reset(struct simple_rb_per_cpu *cpu_buffer);
+
+int simple_ring_buffer_swap_reader_page(struct simple_rb_per_cpu *cpu_buffer);
+
+int simple_ring_buffer_init_mm(struct simple_rb_per_cpu *cpu_buffer,
+			       struct simple_buffer_page *bpages,
+			       const struct ring_buffer_desc *desc,
+			       void *(*load_page)(unsigned long va),
+			       void (*unload_page)(void *va));
+
+void simple_ring_buffer_unload_mm(struct simple_rb_per_cpu *cpu_buffer,
+				  void (*unload_page)(void *));
+#endif
+48
include/linux/trace_remote.h
···
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef _LINUX_TRACE_REMOTE_H
+#define _LINUX_TRACE_REMOTE_H
+
+#include <linux/dcache.h>
+#include <linux/ring_buffer.h>
+#include <linux/trace_remote_event.h>
+
+/**
+ * struct trace_remote_callbacks - Callbacks used by Tracefs to control the remote
+ * @init:		Called once the remote has been registered. Allows the
+ *			caller to extend the Tracefs remote directory
+ * @load_trace_buffer:	Called before Tracefs accesses the trace buffer for the first
+ *			time. Must return a &trace_buffer_desc
+ *			(most likely filled with trace_remote_alloc_buffer())
+ * @unload_trace_buffer:
+ *			Called once Tracefs has no use for the trace buffer
+ *			(most likely call trace_remote_free_buffer())
+ * @enable_tracing:	Called on Tracefs tracing_on. It is expected from the
+ *			remote to allow writing.
+ * @swap_reader_page:	Called when Tracefs consumes a new page from a
+ *			ring-buffer. It is expected from the remote to isolate a
+ *			new reader-page from the @cpu ring-buffer.
+ * @reset:		Called on `echo 0 > trace`. It is expected from the
+ *			remote to reset all ring-buffer pages.
+ * @enable_event:	Called on events/event_name/enable. It is expected from
+ *			the remote to allow the writing event @id.
+ */
+struct trace_remote_callbacks {
+	int (*init)(struct dentry *d, void *priv);
+	struct trace_buffer_desc *(*load_trace_buffer)(unsigned long size, void *priv);
+	void (*unload_trace_buffer)(struct trace_buffer_desc *desc, void *priv);
+	int (*enable_tracing)(bool enable, void *priv);
+	int (*swap_reader_page)(unsigned int cpu, void *priv);
+	int (*reset)(unsigned int cpu, void *priv);
+	int (*enable_event)(unsigned short id, bool enable, void *priv);
+};
+
+int trace_remote_register(const char *name, struct trace_remote_callbacks *cbs, void *priv,
+			  struct remote_event *events, size_t nr_events);
+
+int trace_remote_alloc_buffer(struct trace_buffer_desc *desc, size_t desc_size, size_t buffer_size,
+			      const struct cpumask *cpumask);
+
+void trace_remote_free_buffer(struct trace_buffer_desc *desc);
+
+#endif
+33
include/linux/trace_remote_event.h
···
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef _LINUX_TRACE_REMOTE_EVENTS_H
+#define _LINUX_TRACE_REMOTE_EVENTS_H
+
+struct trace_remote;
+struct trace_event_fields;
+struct trace_seq;
+
+struct remote_event_hdr {
+	unsigned short id;
+};
+
+#define REMOTE_EVENT_NAME_MAX 30
+struct remote_event {
+	char name[REMOTE_EVENT_NAME_MAX];
+	unsigned short id;
+	bool enabled;
+	struct trace_remote *remote;
+	struct trace_event_fields *fields;
+	char *print_fmt;
+	void (*print)(void *evt, struct trace_seq *seq);
+};
+
+#define RE_STRUCT(__args...) __args
+#define re_field(__type, __field) __type __field;
+
+#define REMOTE_EVENT_FORMAT(__name, __struct) \
+	struct remote_event_format_##__name { \
+		struct remote_event_hdr hdr; \
+		__struct \
+	}
+#endif
+73
include/trace/define_remote_events.h
···
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#include <linux/trace_events.h>
+#include <linux/trace_remote_event.h>
+#include <linux/trace_seq.h>
+#include <linux/stringify.h>
+
+#define REMOTE_EVENT_INCLUDE(__file) __stringify(../../__file)
+
+#ifdef REMOTE_EVENT_SECTION
+# define __REMOTE_EVENT_SECTION(__name) __used __section(REMOTE_EVENT_SECTION"."#__name)
+#else
+# define __REMOTE_EVENT_SECTION(__name)
+#endif
+
+#define REMOTE_PRINTK_COUNT_ARGS(__args...) \
+	__COUNT_ARGS(, ##__args, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 0)
+
+#define __remote_printk0() \
+	trace_seq_putc(seq, '\n')
+
+#define __remote_printk1(__fmt) \
+	trace_seq_puts(seq, " " __fmt "\n")
+
+#define __remote_printk2(__fmt, __args...) \
+do { \
+	trace_seq_putc(seq, ' '); \
+	trace_seq_printf(seq, __fmt, __args); \
+	trace_seq_putc(seq, '\n'); \
+} while (0)
+
+/* Apply the appropriate trace_seq sequence according to the number of arguments */
+#define remote_printk(__args...) \
+	CONCATENATE(__remote_printk, REMOTE_PRINTK_COUNT_ARGS(__args))(__args)
+
+#define RE_PRINTK(__args...) __args
+
+#define REMOTE_EVENT(__name, __id, __struct, __printk) \
+	REMOTE_EVENT_FORMAT(__name, __struct); \
+	static void remote_event_print_##__name(void *evt, struct trace_seq *seq) \
+	{ \
+		struct remote_event_format_##__name __maybe_unused *__entry = evt; \
+		trace_seq_puts(seq, #__name); \
+		remote_printk(__printk); \
+	}
+#include REMOTE_EVENT_INCLUDE(REMOTE_EVENT_INCLUDE_FILE)
+
+#undef REMOTE_EVENT
+#undef RE_PRINTK
+#undef re_field
+#define re_field(__type, __field) \
+	{ \
+		.type = #__type, .name = #__field, \
+		.size = sizeof(__type), .align = __alignof__(__type), \
+		.is_signed = is_signed_type(__type), \
+	},
+#define __entry REC
+#define RE_PRINTK(__fmt, __args...) "\"" __fmt "\", " __stringify(__args)
+#define REMOTE_EVENT(__name, __id, __struct, __printk) \
+	static struct trace_event_fields remote_event_fields_##__name[] = { \
+		__struct \
+		{} \
+	}; \
+	static char remote_event_print_fmt_##__name[] = __printk; \
+	static struct remote_event __REMOTE_EVENT_SECTION(__name) \
+	remote_event_##__name = { \
+		.name = #__name, \
+		.id = __id, \
+		.fields = remote_event_fields_##__name, \
+		.print_fmt = remote_event_print_fmt_##__name, \
+		.print = remote_event_print_##__name, \
+	}
+#include REMOTE_EVENT_INCLUDE(REMOTE_EVENT_INCLUDE_FILE)
+4 -4
include/uapi/linux/trace_mmap.h
···
  * @entries: Number of entries in the ring-buffer.
  * @overrun: Number of entries lost in the ring-buffer.
  * @read: Number of entries that have been read.
- * @Reserved1: Internal use only.
- * @Reserved2: Internal use only.
+ * @pages_lost: Number of pages overwritten by the writer.
+ * @pages_touched: Number of pages written by the writer.
  */
 struct trace_buffer_meta {
 	__u32 meta_page_size;
···
 	__u64 overrun;
 	__u64 read;
 
-	__u64 Reserved1;
-	__u64 Reserved2;
+	__u64 pages_lost;
+	__u64 pages_touched;
 };
 
 #define TRACE_MMAP_IOCTL_GET_READER _IO('R', 0x20)
+14
kernel/trace/Kconfig
···
 
 source "kernel/trace/rv/Kconfig"
 
+config TRACE_REMOTE
+	bool
+
+config SIMPLE_RING_BUFFER
+	bool
+
+config TRACE_REMOTE_TEST
+	tristate "Test module for remote tracing"
+	select TRACE_REMOTE
+	select SIMPLE_RING_BUFFER
+	help
+	  This trace remote includes a ring-buffer writer implementation using
+	  "simple_ring_buffer". This is solely intended for testing.
+
 endif # FTRACE
+20
kernel/trace/Makefile
···
 obj-$(CONFIG_TRACEPOINT_BENCHMARK) += trace_benchmark.o
 obj-$(CONFIG_RV) += rv/
 
+obj-$(CONFIG_TRACE_REMOTE) += trace_remote.o
+obj-$(CONFIG_SIMPLE_RING_BUFFER) += simple_ring_buffer.o
+obj-$(CONFIG_TRACE_REMOTE_TEST) += remote_test.o
+
+#
+# simple_ring_buffer is used by the pKVM hypervisor which does not have access
+# to all kernel symbols. Fail the build if forbidden symbols are found.
+#
+UNDEFINED_ALLOWLIST := memset alt_cb_patch_nops __x86 __ubsan __asan __kasan __gcov __aeabi_unwind
+UNDEFINED_ALLOWLIST += __stack_chk_fail stackleak_track_stack __ref_stack __sanitizer
+UNDEFINED_ALLOWLIST := $(addprefix -e , $(UNDEFINED_ALLOWLIST))
+
+quiet_cmd_check_undefined = NM $<
+      cmd_check_undefined = test -z "`$(NM) -u $< | grep -v $(UNDEFINED_ALLOWLIST)`"
+
+$(obj)/%.o.checked: $(obj)/%.o FORCE
+	$(call if_changed,check_undefined)
+
+always-$(CONFIG_SIMPLE_RING_BUFFER) += simple_ring_buffer.o.checked
+
 libftrace-y := ftrace.o
+261
kernel/trace/remote_test.c
···
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2025 - Google LLC
+ * Author: Vincent Donnefort <vdonnefort@google.com>
+ */
+
+#include <linux/module.h>
+#include <linux/simple_ring_buffer.h>
+#include <linux/trace_remote.h>
+#include <linux/tracefs.h>
+#include <linux/types.h>
+
+#define REMOTE_EVENT_INCLUDE_FILE kernel/trace/remote_test_events.h
+#include <trace/define_remote_events.h>
+
+static DEFINE_PER_CPU(struct simple_rb_per_cpu *, simple_rbs);
+static struct trace_buffer_desc *remote_test_buffer_desc;
+
+/*
+ * The trace_remote lock already serializes accesses from the trace_remote_callbacks.
+ * However write_event can still race with load/unload.
+ */
+static DEFINE_MUTEX(simple_rbs_lock);
+
+static int remote_test_load_simple_rb(int cpu, struct ring_buffer_desc *rb_desc)
+{
+	struct simple_rb_per_cpu *cpu_buffer;
+	struct simple_buffer_page *bpages;
+	int ret = -ENOMEM;
+
+	cpu_buffer = kmalloc_obj(*cpu_buffer);
+	if (!cpu_buffer)
+		return ret;
+
+	bpages = kmalloc_objs(*bpages, rb_desc->nr_page_va);
+	if (!bpages)
+		goto err_free_cpu_buffer;
+
+	ret = simple_ring_buffer_init(cpu_buffer, bpages, rb_desc);
+	if (ret)
+		goto err_free_bpages;
+
+	scoped_guard(mutex, &simple_rbs_lock) {
+		WARN_ON(*per_cpu_ptr(&simple_rbs, cpu));
+		*per_cpu_ptr(&simple_rbs, cpu) = cpu_buffer;
+	}
+
+	return 0;
+
+err_free_bpages:
+	kfree(bpages);
+
+err_free_cpu_buffer:
+	kfree(cpu_buffer);
+
+	return ret;
+}
+
+static void remote_test_unload_simple_rb(int cpu)
+{
+	struct simple_rb_per_cpu *cpu_buffer = *per_cpu_ptr(&simple_rbs, cpu);
+	struct simple_buffer_page *bpages;
+
+	if (!cpu_buffer)
+		return;
+
+	guard(mutex)(&simple_rbs_lock);
+
+	bpages = cpu_buffer->bpages;
+	simple_ring_buffer_unload(cpu_buffer);
+	kfree(bpages);
+	kfree(cpu_buffer);
+	*per_cpu_ptr(&simple_rbs, cpu) = NULL;
+}
+
+static struct trace_buffer_desc *remote_test_load(unsigned long size, void *unused)
+{
+	struct ring_buffer_desc *rb_desc;
+	struct trace_buffer_desc *desc;
+	size_t desc_size;
+	int cpu, ret;
+
+	if (WARN_ON(remote_test_buffer_desc))
+		return ERR_PTR(-EINVAL);
+
+	desc_size = trace_buffer_desc_size(size, num_possible_cpus());
+	if (desc_size == SIZE_MAX) {
+		ret = -E2BIG;
+		goto err;
+	}
+
+	desc = kmalloc(desc_size, GFP_KERNEL);
+	if (!desc) {
+		ret = -ENOMEM;
+		goto err;
+	}
+
+	ret = trace_remote_alloc_buffer(desc, desc_size, size, cpu_possible_mask);
+	if (ret)
+		goto err_free_desc;
+
+	for_each_ring_buffer_desc(rb_desc, cpu, desc) {
+		ret = remote_test_load_simple_rb(rb_desc->cpu, rb_desc);
+		if (ret)
+			goto err_unload;
+	}
+
+	remote_test_buffer_desc = desc;
+
+	return remote_test_buffer_desc;
+
+err_unload:
+	/* remote_test_buffer_desc is still NULL here, so walk the local desc */
+	for_each_ring_buffer_desc(rb_desc, cpu, desc)
+		remote_test_unload_simple_rb(rb_desc->cpu);
+	trace_remote_free_buffer(desc);
+
+err_free_desc:
+	kfree(desc);
+
+err:
+	return ERR_PTR(ret);
+}
+
+static void remote_test_unload(struct trace_buffer_desc *desc, void *unused)
+{
+	struct ring_buffer_desc *rb_desc;
+	int cpu;
+
+	if (WARN_ON(desc != remote_test_buffer_desc))
+		return;
+
+	for_each_ring_buffer_desc(rb_desc, cpu, desc)
+		remote_test_unload_simple_rb(rb_desc->cpu);
+
+	remote_test_buffer_desc = NULL;
+	trace_remote_free_buffer(desc);
+	kfree(desc);
+}
+
+static int remote_test_enable_tracing(bool enable, void *unused)
+{
+	struct ring_buffer_desc *rb_desc;
+	int cpu;
+
+	if (!remote_test_buffer_desc)
+		return -ENODEV;
+
+	for_each_ring_buffer_desc(rb_desc, cpu, remote_test_buffer_desc)
+		WARN_ON(simple_ring_buffer_enable_tracing(*per_cpu_ptr(&simple_rbs, rb_desc->cpu),
+							  enable));
+	return 0;
+}
+
+static int remote_test_swap_reader_page(unsigned int cpu, void *unused)
+{
+	struct simple_rb_per_cpu *cpu_buffer;
+
+	if (cpu >= NR_CPUS)
+		return -EINVAL;
+
+	cpu_buffer = *per_cpu_ptr(&simple_rbs, cpu);
+	if (!cpu_buffer)
+		return -EINVAL;
+
+	return simple_ring_buffer_swap_reader_page(cpu_buffer);
+}
+
+static int remote_test_reset(unsigned int cpu, void *unused)
+{
+	struct simple_rb_per_cpu *cpu_buffer;
+
+	if (cpu >= NR_CPUS)
+		return -EINVAL;
+
+	cpu_buffer = *per_cpu_ptr(&simple_rbs, cpu);
+	if (!cpu_buffer)
+		return -EINVAL;
+
+	return simple_ring_buffer_reset(cpu_buffer);
+}
+
+static int remote_test_enable_event(unsigned short id, bool enable, void *unused)
+{
+	if (id != REMOTE_TEST_EVENT_ID)
+		return -EINVAL;
+
+	/*
+	 * Let's just use the struct remote_event enabled field that is turned on and off by
+	 * trace_remote. This is a bit racy but good enough for a simple test module.
+	 */
+	return 0;
+}
+
+static ssize_t
+write_event_write(struct file *filp, const char __user *ubuf, size_t cnt, loff_t *pos)
+{
+	struct remote_event_format_selftest *evt_test;
+	struct simple_rb_per_cpu *cpu_buffer;
+	unsigned long val;
+	int ret;
+
+	ret = kstrtoul_from_user(ubuf, cnt, 10, &val);
+	if (ret)
+		return ret;
+
+	guard(mutex)(&simple_rbs_lock);
+
+	if (!remote_event_selftest.enabled)
+		return -ENODEV;
+
+	guard(preempt)();
+
+	cpu_buffer = *this_cpu_ptr(&simple_rbs);
+	if (!cpu_buffer)
+		return -ENODEV;
+
+	evt_test = simple_ring_buffer_reserve(cpu_buffer,
+					      sizeof(struct remote_event_format_selftest),
+					      trace_clock_global());
+	if (!evt_test)
+		return -ENODEV;
+
+	evt_test->hdr.id = REMOTE_TEST_EVENT_ID;
+	evt_test->id = val;
+
+	simple_ring_buffer_commit(cpu_buffer);
+
+	return cnt;
+}
+
+static const struct file_operations write_event_fops = {
+	.write = write_event_write,
+};
+
+static int remote_test_init_tracefs(struct dentry *d, void *unused)
+{
+	return tracefs_create_file("write_event", 0200, d, NULL, &write_event_fops) ?
+		0 : -ENOMEM;
+}
+
+static struct trace_remote_callbacks trace_remote_callbacks = {
+	.init = remote_test_init_tracefs,
+	.load_trace_buffer = remote_test_load,
+	.unload_trace_buffer = remote_test_unload,
+	.enable_tracing = remote_test_enable_tracing,
+	.swap_reader_page = remote_test_swap_reader_page,
+	.reset = remote_test_reset,
+	.enable_event = remote_test_enable_event,
+};
+
+static int __init remote_test_init(void)
+{
+	return trace_remote_register("test", &trace_remote_callbacks, NULL,
+				     &remote_event_selftest, 1);
+}
+
+module_init(remote_test_init);
+
+MODULE_DESCRIPTION("Test module for the trace remote interface");
+MODULE_AUTHOR("Vincent Donnefort");
+MODULE_LICENSE("GPL");
+10
kernel/trace/remote_test_events.h
···
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#define REMOTE_TEST_EVENT_ID 1
+
+REMOTE_EVENT(selftest, REMOTE_TEST_EVENT_ID,
+	RE_STRUCT(
+		re_field(u64, id)
+	),
+	RE_PRINTK("id=%llu", __entry->id)
+);
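A REMOTE_EVENT() declaration such as the one in remote_test_events.h is expanded twice by include/trace/define_remote_events.h: once to emit the record layout and print function, and once to emit the event descriptor. The user-space sketch below illustrates the same two-pass idea with a swapped-in technique: instead of re-including a header, it uses a list macro (an X-macro). All names here (EVENT_LIST, DEFINE_FORMAT, DEFINE_DESC, struct event_desc, print_event) are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* One line per event: name, id, field type, field name, print format. */
#define EVENT_LIST(X) \
	X(selftest, 1, unsigned long long, id, "id=%llu")

/* Pass 1: generate the record layout and a print function per event. */
#define DEFINE_FORMAT(name, evt_id, type, field, fmt) \
	struct event_format_##name { \
		unsigned short hdr_id; \
		type field; \
	}; \
	static int event_print_##name(const void *evt, char *buf, size_t len) \
	{ \
		const struct event_format_##name *entry = evt; \
		return snprintf(buf, len, #name " " fmt, entry->field); \
	}
EVENT_LIST(DEFINE_FORMAT)

struct event_desc {
	const char *name;
	unsigned short id;
	int (*print)(const void *evt, char *buf, size_t len);
};

/* Pass 2: generate one descriptor per event from the same list. */
#define DEFINE_DESC(name, evt_id, type, field, fmt) \
	static const struct event_desc event_desc_##name = { \
		#name, evt_id, event_print_##name, \
	};
EVENT_LIST(DEFINE_DESC)

/* Dispatch through the descriptor, as a reader of the buffer would. */
static int print_event(const struct event_desc *desc, const void *evt,
		       char *buf, size_t len)
{
	assert(desc && desc->print);
	return desc->print(evt, buf, len);
}
```

Keeping the event description in a single place means the record layout, the printer and the descriptor cannot drift apart, which is the same benefit the double include provides in the kernel header.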
+329 -58
kernel/trace/ring_buffer.c
··· 4 4 * 5 5 * Copyright (C) 2008 Steven Rostedt <srostedt@redhat.com> 6 6 */ 7 + #include <linux/ring_buffer_types.h> 7 8 #include <linux/sched/isolation.h> 8 9 #include <linux/trace_recursion.h> 9 10 #include <linux/trace_events.h> ··· 158 157 /* Used for individual buffers (after the counter) */ 159 158 #define RB_BUFFER_OFF (1 << 20) 160 159 161 - #define BUF_PAGE_HDR_SIZE offsetof(struct buffer_data_page, data) 162 - 163 - #define RB_EVNT_HDR_SIZE (offsetof(struct ring_buffer_event, array)) 164 - #define RB_ALIGNMENT 4U 165 - #define RB_MAX_SMALL_DATA (RB_ALIGNMENT * RINGBUF_TYPE_DATA_TYPE_LEN_MAX) 166 - #define RB_EVNT_MIN_SIZE 8U /* two 32bit words */ 167 - 168 - #ifndef CONFIG_HAVE_64BIT_ALIGNED_ACCESS 169 - # define RB_FORCE_8BYTE_ALIGNMENT 0 170 - # define RB_ARCH_ALIGNMENT RB_ALIGNMENT 171 - #else 172 - # define RB_FORCE_8BYTE_ALIGNMENT 1 173 - # define RB_ARCH_ALIGNMENT 8U 174 - #endif 175 - 176 - #define RB_ALIGN_DATA __aligned(RB_ARCH_ALIGNMENT) 177 - 178 160 /* define RINGBUF_TYPE_DATA for 'case RINGBUF_TYPE_DATA:' */ 179 161 #define RINGBUF_TYPE_DATA 0 ... 
RINGBUF_TYPE_DATA_TYPE_LEN_MAX 180 162 ··· 300 316 #define for_each_online_buffer_cpu(buffer, cpu) \ 301 317 for_each_cpu_and(cpu, buffer->cpumask, cpu_online_mask) 302 318 303 - #define TS_SHIFT 27 304 - #define TS_MASK ((1ULL << TS_SHIFT) - 1) 305 - #define TS_DELTA_TEST (~TS_MASK) 306 - 307 319 static u64 rb_event_time_stamp(struct ring_buffer_event *event) 308 320 { 309 321 u64 ts; ··· 317 337 #define RB_MISSED_STORED (1 << 30) 318 338 319 339 #define RB_MISSED_MASK (3 << 30) 320 - 321 - struct buffer_data_page { 322 - u64 time_stamp; /* page time stamp */ 323 - local_t commit; /* write committed index */ 324 - unsigned char data[] RB_ALIGN_DATA; /* data of buffer page */ 325 - }; 326 340 327 341 struct buffer_data_read_page { 328 342 unsigned order; /* order of the page */ ··· 409 435 rb_init_page(dpage); 410 436 411 437 return dpage; 412 - } 413 - 414 - /* 415 - * We need to fit the time_stamp delta into 27 bits. 416 - */ 417 - static inline bool test_time_stamp(u64 delta) 418 - { 419 - return !!(delta & TS_DELTA_TEST); 420 438 } 421 439 422 440 struct rb_irq_work { ··· 521 555 unsigned int mapped; 522 556 unsigned int user_mapped; /* user space mapping */ 523 557 struct mutex mapping_lock; 524 - unsigned long *subbuf_ids; /* ID to subbuf VA */ 558 + struct buffer_page **subbuf_ids; /* ID to subbuf VA */ 525 559 struct trace_buffer_meta *meta_page; 526 560 struct ring_buffer_cpu_meta *ring_meta; 561 + 562 + struct ring_buffer_remote *remote; 527 563 528 564 /* ring buffer pages to update, > 0 to add, < 0 to remove */ 529 565 long nr_pages_to_update; ··· 548 580 struct mutex mutex; 549 581 550 582 struct ring_buffer_per_cpu **buffers; 583 + 584 + struct ring_buffer_remote *remote; 551 585 552 586 struct hlist_node node; 553 587 u64 (*clock)(void); ··· 597 627 (unsigned int)sizeof(field.commit), 598 628 (unsigned int)is_signed_type(long)); 599 629 600 - trace_seq_printf(s, "\tfield: int overwrite;\t" 630 + trace_seq_printf(s, "\tfield: char overwrite;\t" 601 
631 "offset:%u;\tsize:%u;\tsigned:%u;\n", 602 632 (unsigned int)offsetof(typeof(field), commit), 603 633 1, 604 - (unsigned int)is_signed_type(long)); 634 + (unsigned int)is_signed_type(char)); 605 635 606 636 trace_seq_printf(s, "\tfield: char data;\t" 607 637 "offset:%u;\tsize:%u;\tsigned:%u;\n", 608 638 (unsigned int)offsetof(typeof(field), data), 609 - (unsigned int)buffer->subbuf_size, 639 + (unsigned int)(buffer ? buffer->subbuf_size : 640 + PAGE_SIZE - BUF_PAGE_HDR_SIZE), 610 641 (unsigned int)is_signed_type(char)); 611 642 612 643 return !trace_seq_has_overflowed(s); ··· 2209 2238 } 2210 2239 } 2211 2240 2241 + static struct ring_buffer_desc *ring_buffer_desc(struct trace_buffer_desc *trace_desc, int cpu) 2242 + { 2243 + struct ring_buffer_desc *desc, *end; 2244 + size_t len; 2245 + int i; 2246 + 2247 + if (!trace_desc) 2248 + return NULL; 2249 + 2250 + if (cpu >= trace_desc->nr_cpus) 2251 + return NULL; 2252 + 2253 + end = (struct ring_buffer_desc *)((void *)trace_desc + trace_desc->struct_len); 2254 + desc = __first_ring_buffer_desc(trace_desc); 2255 + len = struct_size(desc, page_va, desc->nr_page_va); 2256 + desc = (struct ring_buffer_desc *)((void *)desc + (len * cpu)); 2257 + 2258 + if (desc < end && desc->cpu == cpu) 2259 + return desc; 2260 + 2261 + /* Missing CPUs, need to linear search */ 2262 + for_each_ring_buffer_desc(desc, i, trace_desc) { 2263 + if (desc->cpu == cpu) 2264 + return desc; 2265 + } 2266 + 2267 + return NULL; 2268 + } 2269 + 2270 + static void *ring_buffer_desc_page(struct ring_buffer_desc *desc, unsigned int page_id) 2271 + { 2272 + return page_id >= desc->nr_page_va ? 
NULL : (void *)desc->page_va[page_id]; 2273 + } 2274 + 2212 2275 static int __rb_allocate_pages(struct ring_buffer_per_cpu *cpu_buffer, 2213 2276 long nr_pages, struct list_head *pages) 2214 2277 { ··· 2250 2245 struct ring_buffer_cpu_meta *meta = NULL; 2251 2246 struct buffer_page *bpage, *tmp; 2252 2247 bool user_thread = current->mm != NULL; 2248 + struct ring_buffer_desc *desc = NULL; 2253 2249 long i; 2254 2250 2255 2251 /* ··· 2279 2273 if (buffer->range_addr_start) 2280 2274 meta = rb_range_meta(buffer, nr_pages, cpu_buffer->cpu); 2281 2275 2276 + if (buffer->remote) { 2277 + desc = ring_buffer_desc(buffer->remote->desc, cpu_buffer->cpu); 2278 + if (!desc || WARN_ON(desc->nr_page_va != (nr_pages + 1))) 2279 + return -EINVAL; 2280 + } 2281 + 2282 2282 for (i = 0; i < nr_pages; i++) { 2283 2283 2284 2284 bpage = alloc_cpu_page(cpu_buffer->cpu); ··· 2309 2297 rb_meta_buffer_update(cpu_buffer, bpage); 2310 2298 bpage->range = 1; 2311 2299 bpage->id = i + 1; 2300 + } else if (desc) { 2301 + void *p = ring_buffer_desc_page(desc, i + 1); 2302 + 2303 + if (WARN_ON(!p)) 2304 + goto free_pages; 2305 + 2306 + bpage->page = p; 2307 + bpage->range = 1; /* bpage->page can't be freed */ 2308 + bpage->id = i + 1; 2309 + cpu_buffer->subbuf_ids[i + 1] = bpage; 2312 2310 } else { 2313 2311 int order = cpu_buffer->buffer->subbuf_order; 2314 2312 bpage->page = alloc_cpu_data(cpu_buffer->cpu, order); ··· 2416 2394 if (cpu_buffer->ring_meta->head_buffer) 2417 2395 rb_meta_buffer_update(cpu_buffer, bpage); 2418 2396 bpage->range = 1; 2397 + } else if (buffer->remote) { 2398 + struct ring_buffer_desc *desc = ring_buffer_desc(buffer->remote->desc, cpu); 2399 + 2400 + if (!desc) 2401 + goto fail_free_reader; 2402 + 2403 + cpu_buffer->remote = buffer->remote; 2404 + cpu_buffer->meta_page = (struct trace_buffer_meta *)(void *)desc->meta_va; 2405 + cpu_buffer->nr_pages = nr_pages; 2406 + cpu_buffer->subbuf_ids = kcalloc(cpu_buffer->nr_pages + 1, 2407 + sizeof(*cpu_buffer->subbuf_ids), 
GFP_KERNEL); 2408 + if (!cpu_buffer->subbuf_ids) 2409 + goto fail_free_reader; 2410 + 2411 + /* Remote buffers are read-only and immutable */ 2412 + atomic_inc(&cpu_buffer->record_disabled); 2413 + atomic_inc(&cpu_buffer->resize_disabled); 2414 + 2415 + bpage->page = ring_buffer_desc_page(desc, cpu_buffer->meta_page->reader.id); 2416 + if (!bpage->page) 2417 + goto fail_free_reader; 2418 + 2419 + bpage->range = 1; 2420 + cpu_buffer->subbuf_ids[0] = bpage; 2419 2421 } else { 2420 2422 int order = cpu_buffer->buffer->subbuf_order; 2421 2423 bpage->page = alloc_cpu_data(cpu, order); ··· 2499 2453 2500 2454 irq_work_sync(&cpu_buffer->irq_work.work); 2501 2455 2456 + if (cpu_buffer->remote) 2457 + kfree(cpu_buffer->subbuf_ids); 2458 + 2502 2459 free_buffer_page(cpu_buffer->reader_page); 2503 2460 2504 2461 if (head) { ··· 2524 2475 int order, unsigned long start, 2525 2476 unsigned long end, 2526 2477 unsigned long scratch_size, 2527 - struct lock_class_key *key) 2478 + struct lock_class_key *key, 2479 + struct ring_buffer_remote *remote) 2528 2480 { 2529 2481 struct trace_buffer *buffer __free(kfree) = NULL; 2530 2482 long nr_pages; ··· 2564 2514 GFP_KERNEL); 2565 2515 if (!buffer->buffers) 2566 2516 goto fail_free_cpumask; 2517 + 2518 + cpu = raw_smp_processor_id(); 2567 2519 2568 2520 /* If start/end are specified, then that overrides size */ 2569 2521 if (start && end) { ··· 2622 2570 buffer->range_addr_end = end; 2623 2571 2624 2572 rb_range_meta_init(buffer, nr_pages, scratch_size); 2573 + } else if (remote) { 2574 + struct ring_buffer_desc *desc = ring_buffer_desc(remote->desc, cpu); 2575 + 2576 + buffer->remote = remote; 2577 + /* The writer is remote. 
This ring-buffer is read-only */ 2578 + atomic_inc(&buffer->record_disabled); 2579 + nr_pages = desc->nr_page_va - 1; 2580 + if (nr_pages < 2) 2581 + goto fail_free_buffers; 2625 2582 } else { 2626 2583 2627 2584 /* need at least two pages */ ··· 2639 2578 nr_pages = 2; 2640 2579 } 2641 2580 2642 - cpu = raw_smp_processor_id(); 2643 2581 cpumask_set_cpu(cpu, buffer->cpumask); 2644 2582 buffer->buffers[cpu] = rb_allocate_cpu_buffer(buffer, nr_pages, cpu); 2645 2583 if (!buffer->buffers[cpu]) ··· 2680 2620 struct lock_class_key *key) 2681 2621 { 2682 2622 /* Default buffer page size - one system page */ 2683 - return alloc_buffer(size, flags, 0, 0, 0, 0, key); 2623 + return alloc_buffer(size, flags, 0, 0, 0, 0, key, NULL); 2684 2624 2685 2625 } 2686 2626 EXPORT_SYMBOL_GPL(__ring_buffer_alloc); ··· 2707 2647 struct lock_class_key *key) 2708 2648 { 2709 2649 return alloc_buffer(size, flags, order, start, start + range_size, 2710 - scratch_size, key); 2650 + scratch_size, key, NULL); 2651 + } 2652 + 2653 + /** 2654 + * __ring_buffer_alloc_remote - allocate a new ring_buffer from a remote 2655 + * @remote: Contains a description of the ring-buffer pages and remote callbacks. 2656 + * @key: ring buffer reader_lock_key. 
2657 + */ 2658 + struct trace_buffer *__ring_buffer_alloc_remote(struct ring_buffer_remote *remote, 2659 + struct lock_class_key *key) 2660 + { 2661 + return alloc_buffer(0, 0, 0, 0, 0, 0, key, remote); 2711 2662 } 2712 2663 2713 2664 void *ring_buffer_meta_scratch(struct trace_buffer *buffer, unsigned int *size) ··· 4506 4435 ret = rb_read_data_buffer(bpage, tail, cpu_buffer->cpu, &ts, &delta); 4507 4436 if (ret < 0) { 4508 4437 if (delta < ts) { 4509 - buffer_warn_return("[CPU: %d]ABSOLUTE TIME WENT BACKWARDS: last ts: %lld absolute ts: %lld\n", 4510 - cpu_buffer->cpu, ts, delta); 4438 + buffer_warn_return("[CPU: %d]ABSOLUTE TIME WENT BACKWARDS: last ts: %lld absolute ts: %lld clock:%pS\n", 4439 + cpu_buffer->cpu, ts, delta, 4440 + cpu_buffer->buffer->clock); 4511 4441 goto out; 4512 4442 } 4513 4443 } 4514 4444 if ((full && ts > info->ts) || 4515 4445 (!full && ts + info->delta != info->ts)) { 4516 - buffer_warn_return("[CPU: %d]TIME DOES NOT MATCH expected:%lld actual:%lld delta:%lld before:%lld after:%lld%s context:%s\n", 4446 + buffer_warn_return("[CPU: %d]TIME DOES NOT MATCH expected:%lld actual:%lld delta:%lld before:%lld after:%lld%s context:%s\ntrace clock:%pS", 4517 4447 cpu_buffer->cpu, 4518 4448 ts + info->delta, info->ts, info->delta, 4519 4449 info->before, info->after, 4520 - full ? " (full)" : "", show_interrupt_level()); 4450 + full ? 
" (full)" : "", show_interrupt_level(), 4451 + cpu_buffer->buffer->clock); 4521 4452 } 4522 4453 out: 4523 4454 atomic_dec(this_cpu_ptr(&checking)); ··· 5347 5274 } 5348 5275 EXPORT_SYMBOL_GPL(ring_buffer_overruns); 5349 5276 5277 + static bool rb_read_remote_meta_page(struct ring_buffer_per_cpu *cpu_buffer) 5278 + { 5279 + local_set(&cpu_buffer->entries, READ_ONCE(cpu_buffer->meta_page->entries)); 5280 + local_set(&cpu_buffer->overrun, READ_ONCE(cpu_buffer->meta_page->overrun)); 5281 + local_set(&cpu_buffer->pages_touched, READ_ONCE(cpu_buffer->meta_page->pages_touched)); 5282 + local_set(&cpu_buffer->pages_lost, READ_ONCE(cpu_buffer->meta_page->pages_lost)); 5283 + 5284 + return rb_num_of_entries(cpu_buffer); 5285 + } 5286 + 5287 + static void rb_update_remote_head(struct ring_buffer_per_cpu *cpu_buffer) 5288 + { 5289 + struct buffer_page *next, *orig; 5290 + int retry = 3; 5291 + 5292 + orig = next = cpu_buffer->head_page; 5293 + rb_inc_page(&next); 5294 + 5295 + /* Run after the writer */ 5296 + while (cpu_buffer->head_page->page->time_stamp > next->page->time_stamp) { 5297 + rb_inc_page(&next); 5298 + 5299 + rb_list_head_clear(cpu_buffer->head_page->list.prev); 5300 + rb_inc_page(&cpu_buffer->head_page); 5301 + rb_set_list_to_head(cpu_buffer->head_page->list.prev); 5302 + 5303 + if (cpu_buffer->head_page == orig) { 5304 + if (WARN_ON_ONCE(!(--retry))) 5305 + return; 5306 + } 5307 + } 5308 + 5309 + orig = cpu_buffer->commit_page = cpu_buffer->head_page; 5310 + retry = 3; 5311 + 5312 + while (cpu_buffer->commit_page->page->time_stamp < next->page->time_stamp) { 5313 + rb_inc_page(&next); 5314 + rb_inc_page(&cpu_buffer->commit_page); 5315 + 5316 + if (cpu_buffer->commit_page == orig) { 5317 + if (WARN_ON_ONCE(!(--retry))) 5318 + return; 5319 + } 5320 + } 5321 + } 5322 + 5350 5323 static void rb_iter_reset(struct ring_buffer_iter *iter) 5351 5324 { 5352 5325 struct ring_buffer_per_cpu *cpu_buffer = iter->cpu_buffer; 5326 + 5327 + if (cpu_buffer->remote) { 5328 + 
rb_read_remote_meta_page(cpu_buffer); 5329 + rb_update_remote_head(cpu_buffer); 5330 + } 5353 5331 5354 5332 /* Iterator usage is expected to have record disabled */ 5355 5333 iter->head_page = cpu_buffer->reader_page; ··· 5552 5428 } 5553 5429 5554 5430 static struct buffer_page * 5555 - rb_get_reader_page(struct ring_buffer_per_cpu *cpu_buffer) 5431 + __rb_get_reader_page_from_remote(struct ring_buffer_per_cpu *cpu_buffer) 5432 + { 5433 + struct buffer_page *new_reader, *prev_reader, *prev_head, *new_head, *last; 5434 + 5435 + if (!rb_read_remote_meta_page(cpu_buffer)) 5436 + return NULL; 5437 + 5438 + /* More to read on the reader page */ 5439 + if (cpu_buffer->reader_page->read < rb_page_size(cpu_buffer->reader_page)) { 5440 + if (!cpu_buffer->reader_page->read) 5441 + cpu_buffer->read_stamp = cpu_buffer->reader_page->page->time_stamp; 5442 + return cpu_buffer->reader_page; 5443 + } 5444 + 5445 + prev_reader = cpu_buffer->subbuf_ids[cpu_buffer->meta_page->reader.id]; 5446 + 5447 + WARN_ON_ONCE(cpu_buffer->remote->swap_reader_page(cpu_buffer->cpu, 5448 + cpu_buffer->remote->priv)); 5449 + /* nr_pages doesn't include the reader page */ 5450 + if (WARN_ON_ONCE(cpu_buffer->meta_page->reader.id > cpu_buffer->nr_pages)) 5451 + return NULL; 5452 + 5453 + new_reader = cpu_buffer->subbuf_ids[cpu_buffer->meta_page->reader.id]; 5454 + 5455 + WARN_ON_ONCE(prev_reader == new_reader); 5456 + 5457 + prev_head = new_reader; /* New reader was also the previous head */ 5458 + new_head = prev_head; 5459 + rb_inc_page(&new_head); 5460 + last = prev_head; 5461 + rb_dec_page(&last); 5462 + 5463 + /* Clear the old HEAD flag */ 5464 + rb_list_head_clear(cpu_buffer->head_page->list.prev); 5465 + 5466 + prev_reader->list.next = prev_head->list.next; 5467 + prev_reader->list.prev = prev_head->list.prev; 5468 + 5469 + /* Swap prev_reader with new_reader */ 5470 + last->list.next = &prev_reader->list; 5471 + new_head->list.prev = &prev_reader->list; 5472 + 5473 + new_reader->list.prev = 
&new_reader->list; 5474 + new_reader->list.next = &new_head->list; 5475 + 5476 + /* Reactivate the HEAD flag */ 5477 + rb_set_list_to_head(&last->list); 5478 + 5479 + cpu_buffer->head_page = new_head; 5480 + cpu_buffer->reader_page = new_reader; 5481 + cpu_buffer->pages = &new_head->list; 5482 + cpu_buffer->read_stamp = new_reader->page->time_stamp; 5483 + cpu_buffer->lost_events = cpu_buffer->meta_page->reader.lost_events; 5484 + 5485 + return rb_page_size(cpu_buffer->reader_page) ? cpu_buffer->reader_page : NULL; 5486 + } 5487 + 5488 + static struct buffer_page * 5489 + __rb_get_reader_page(struct ring_buffer_per_cpu *cpu_buffer) 5556 5490 { 5557 5491 struct buffer_page *reader = NULL; 5558 5492 unsigned long bsize = READ_ONCE(cpu_buffer->buffer->subbuf_size); ··· 5778 5596 5779 5597 5780 5598 return reader; 5599 + } 5600 + 5601 + static struct buffer_page * 5602 + rb_get_reader_page(struct ring_buffer_per_cpu *cpu_buffer) 5603 + { 5604 + return cpu_buffer->remote ? __rb_get_reader_page_from_remote(cpu_buffer) : 5605 + __rb_get_reader_page(cpu_buffer); 5781 5606 } 5782 5607 5783 5608 static void rb_advance_reader(struct ring_buffer_per_cpu *cpu_buffer) ··· 6343 6154 meta->entries = local_read(&cpu_buffer->entries); 6344 6155 meta->overrun = local_read(&cpu_buffer->overrun); 6345 6156 meta->read = cpu_buffer->read; 6157 + meta->pages_lost = local_read(&cpu_buffer->pages_lost); 6158 + meta->pages_touched = local_read(&cpu_buffer->pages_touched); 6346 6159 6347 6160 /* Some archs do not have data cache coherency between kernel and user-space */ 6348 6161 flush_kernel_vmap_range(cpu_buffer->meta_page, PAGE_SIZE); ··· 6354 6163 rb_reset_cpu(struct ring_buffer_per_cpu *cpu_buffer) 6355 6164 { 6356 6165 struct buffer_page *page; 6166 + 6167 + if (cpu_buffer->remote) { 6168 + if (!cpu_buffer->remote->reset) 6169 + return; 6170 + 6171 + cpu_buffer->remote->reset(cpu_buffer->cpu, cpu_buffer->remote->priv); 6172 + rb_read_remote_meta_page(cpu_buffer); 6173 + 6174 + /* Read 
related values, not covered by the meta-page */ 6175 + local_set(&cpu_buffer->pages_read, 0); 6176 + cpu_buffer->read = 0; 6177 + cpu_buffer->read_bytes = 0; 6178 + cpu_buffer->last_overrun = 0; 6179 + cpu_buffer->reader_page->read = 0; 6180 + 6181 + return; 6182 + } 6357 6183 6358 6184 rb_head_page_deactivate(cpu_buffer); 6359 6185 ··· 6601 6393 return ret; 6602 6394 } 6603 6395 EXPORT_SYMBOL_GPL(ring_buffer_empty_cpu); 6396 + 6397 + int ring_buffer_poll_remote(struct trace_buffer *buffer, int cpu) 6398 + { 6399 + struct ring_buffer_per_cpu *cpu_buffer; 6400 + 6401 + if (cpu != RING_BUFFER_ALL_CPUS) { 6402 + if (!cpumask_test_cpu(cpu, buffer->cpumask)) 6403 + return -EINVAL; 6404 + 6405 + cpu_buffer = buffer->buffers[cpu]; 6406 + 6407 + guard(raw_spinlock)(&cpu_buffer->reader_lock); 6408 + if (rb_read_remote_meta_page(cpu_buffer)) 6409 + rb_wakeups(buffer, cpu_buffer); 6410 + 6411 + return 0; 6412 + } 6413 + 6414 + guard(cpus_read_lock)(); 6415 + 6416 + /* 6417 + * Make sure all the ring buffers are up to date before we start reading 6418 + * them. 
6419 + */ 6420 + for_each_buffer_cpu(buffer, cpu) { 6421 + cpu_buffer = buffer->buffers[cpu]; 6422 + 6423 + guard(raw_spinlock)(&cpu_buffer->reader_lock); 6424 + rb_read_remote_meta_page(cpu_buffer); 6425 + } 6426 + 6427 + for_each_buffer_cpu(buffer, cpu) { 6428 + cpu_buffer = buffer->buffers[cpu]; 6429 + 6430 + if (rb_num_of_entries(cpu_buffer)) 6431 + rb_wakeups(buffer, cpu_buffer); 6432 + } 6433 + 6434 + return 0; 6435 + } 6604 6436 6605 6437 #ifdef CONFIG_RING_BUFFER_ALLOW_SWAP 6606 6438 /** ··· 6880 6632 unsigned int commit; 6881 6633 unsigned int read; 6882 6634 u64 save_timestamp; 6635 + bool force_memcpy; 6883 6636 6884 6637 if (!cpumask_test_cpu(cpu, buffer->cpumask)) 6885 6638 return -1; ··· 6918 6669 /* Check if any events were dropped */ 6919 6670 missed_events = cpu_buffer->lost_events; 6920 6671 6672 + force_memcpy = cpu_buffer->mapped || cpu_buffer->remote; 6673 + 6921 6674 /* 6922 6675 * If this page has been partially read or 6923 6676 * if len is not big enough to read the rest of the page or ··· 6929 6678 */ 6930 6679 if (read || (len < (commit - read)) || 6931 6680 cpu_buffer->reader_page == cpu_buffer->commit_page || 6932 - cpu_buffer->mapped) { 6681 + force_memcpy) { 6933 6682 struct buffer_data_page *rpage = cpu_buffer->reader_page->page; 6934 6683 unsigned int rpos = read; 6935 6684 unsigned int pos = 0; ··· 7285 7034 } 7286 7035 7287 7036 static void rb_setup_ids_meta_page(struct ring_buffer_per_cpu *cpu_buffer, 7288 - unsigned long *subbuf_ids) 7037 + struct buffer_page **subbuf_ids) 7289 7038 { 7290 7039 struct trace_buffer_meta *meta = cpu_buffer->meta_page; 7291 7040 unsigned int nr_subbufs = cpu_buffer->nr_pages + 1; ··· 7294 7043 int id = 0; 7295 7044 7296 7045 id = rb_page_id(cpu_buffer, cpu_buffer->reader_page, id); 7297 - subbuf_ids[id++] = (unsigned long)cpu_buffer->reader_page->page; 7046 + subbuf_ids[id++] = cpu_buffer->reader_page; 7298 7047 cnt++; 7299 7048 7300 7049 first_subbuf = subbuf = rb_set_head_page(cpu_buffer); ··· 
7304 7053 if (WARN_ON(id >= nr_subbufs)) 7305 7054 break; 7306 7055 7307 - subbuf_ids[id] = (unsigned long)subbuf->page; 7056 + subbuf_ids[id] = subbuf; 7308 7057 7309 7058 rb_inc_page(&subbuf); 7310 7059 id++; ··· 7313 7062 7314 7063 WARN_ON(cnt != nr_subbufs); 7315 7064 7316 - /* install subbuf ID to kern VA translation */ 7065 + /* install subbuf ID to bpage translation */ 7317 7066 cpu_buffer->subbuf_ids = subbuf_ids; 7318 7067 7319 7068 meta->meta_struct_len = sizeof(*meta); ··· 7469 7218 } 7470 7219 7471 7220 while (p < nr_pages) { 7221 + struct buffer_page *subbuf; 7472 7222 struct page *page; 7473 7223 int off = 0; 7474 7224 7475 7225 if (WARN_ON_ONCE(s >= nr_subbufs)) 7476 7226 return -EINVAL; 7477 7227 7478 - page = virt_to_page((void *)cpu_buffer->subbuf_ids[s]); 7228 + subbuf = cpu_buffer->subbuf_ids[s]; 7229 + page = virt_to_page((void *)subbuf->page); 7479 7230 7480 7231 for (; off < (1 << (subbuf_order)); off++, page++) { 7481 7232 if (p >= nr_pages) ··· 7504 7251 struct vm_area_struct *vma) 7505 7252 { 7506 7253 struct ring_buffer_per_cpu *cpu_buffer; 7507 - unsigned long flags, *subbuf_ids; 7254 + struct buffer_page **subbuf_ids; 7255 + unsigned long flags; 7508 7256 int err; 7509 7257 7510 - if (!cpumask_test_cpu(cpu, buffer->cpumask)) 7258 + if (!cpumask_test_cpu(cpu, buffer->cpumask) || buffer->remote) 7511 7259 return -EINVAL; 7512 7260 7513 7261 cpu_buffer = buffer->buffers[cpu]; ··· 7529 7275 if (err) 7530 7276 return err; 7531 7277 7532 - /* subbuf_ids include the reader while nr_pages does not */ 7278 + /* subbuf_ids includes the reader while nr_pages does not */ 7533 7279 subbuf_ids = kcalloc(cpu_buffer->nr_pages + 1, sizeof(*subbuf_ids), GFP_KERNEL); 7534 7280 if (!subbuf_ids) { 7535 7281 rb_free_meta_page(cpu_buffer); ··· 7722 7468 return 0; 7723 7469 } 7724 7470 7471 + static void rb_cpu_sync(void *data) 7472 + { 7473 + /* Not really needed, but documents what is happening */ 7474 + smp_rmb(); 7475 + } 7476 + 7725 7477 /* 7726 7478 * We 
only allocate new buffers, never free them if the CPU goes down. 7727 7479 * If we were to free the buffer, then the user would lose any trace that was in ··· 7766 7506 cpu); 7767 7507 return -ENOMEM; 7768 7508 } 7769 - smp_wmb(); 7509 + 7510 + /* 7511 + * Ensure trace_buffer readers observe the newly allocated 7512 + * ring_buffer_per_cpu before they check the cpumask. Instead of using a 7513 + * read barrier for all readers, send an IPI. 7514 + */ 7515 + if (unlikely(system_state == SYSTEM_RUNNING)) { 7516 + on_each_cpu(rb_cpu_sync, NULL, 1); 7517 + /* Not really needed, but documents what is happening */ 7518 + smp_wmb(); 7519 + } 7520 + 7770 7521 cpumask_set_cpu(cpu, buffer->cpumask); 7771 7522 return 0; 7772 7523 }
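The ring_buffer_desc() helper added in the ring_buffer.c changes above finds a CPU's descriptor in two steps: it first guesses that descriptor N sits in slot N (all slots having the size of the first one), and only falls back to a linear walk when that guess fails. The userspace sketch below illustrates that lookup pattern. Everything prefixed with demo_ is hypothetical: the field names follow the diff, but the exact kernel types, the __first_ring_buffer_desc() and for_each_ring_buffer_desc() helpers, and the precise meaning of nr_cpus are not shown in this section, so the layout here is an assumption. In this sketch nr_cpus only bounds the CPU ids, and the walk stops at struct_len.

```c
#include <stddef.h>
#include <stdlib.h>

/* Assumed per-CPU descriptor: field names from the diff, types are guesses */
struct demo_rb_desc {
	int cpu;
	unsigned int nr_page_va;	/* reader page + ring pages */
	unsigned long meta_va;
	unsigned long page_va[];	/* nr_page_va entries */
};

/* Assumed table header; the per-CPU descriptors follow it in memory */
struct demo_trace_desc {
	int nr_cpus;			/* sketch assumption: upper bound on CPU ids */
	size_t struct_len;		/* total size in bytes, header included */
};

static size_t demo_desc_size(unsigned int nr_page_va)
{
	return sizeof(struct demo_rb_desc) + nr_page_va * sizeof(unsigned long);
}

static struct demo_rb_desc *demo_desc_at(struct demo_trace_desc *td, size_t off)
{
	return (struct demo_rb_desc *)((char *)td + off);
}

struct demo_rb_desc *demo_find_desc(struct demo_trace_desc *td, int cpu)
{
	struct demo_rb_desc *desc;
	size_t off, len;

	if (!td || cpu < 0 || cpu >= td->nr_cpus)
		return NULL;

	off = sizeof(*td);

	/* Empty table: there is no first descriptor to size the slots from */
	if (off + sizeof(*desc) > td->struct_len)
		return NULL;

	/* Fast path: assume slot N holds CPU N and all slots are the same size */
	len = demo_desc_size(demo_desc_at(td, off)->nr_page_va);
	if (off + len * (size_t)cpu + len <= td->struct_len) {
		desc = demo_desc_at(td, off + len * (size_t)cpu);
		if (desc->cpu == cpu)
			return desc;
	}

	/* Slow path: some CPUs are missing, walk every descriptor */
	while (off + sizeof(*desc) <= td->struct_len) {
		desc = demo_desc_at(td, off);
		if (desc->cpu == cpu)
			return desc;
		off += demo_desc_size(desc->nr_page_va);
	}

	return NULL;
}

/* Test helper: build a table holding nr_desc equally sized descriptors */
struct demo_trace_desc *demo_build(const int *cpus, int nr_desc, int nr_cpus,
				   unsigned int nr_page_va)
{
	size_t len = demo_desc_size(nr_page_va);
	size_t total = sizeof(struct demo_trace_desc) + len * (size_t)nr_desc;
	struct demo_trace_desc *td = calloc(1, total);
	int i;

	if (!td)
		return NULL;

	td->nr_cpus = nr_cpus;
	td->struct_len = total;
	for (i = 0; i < nr_desc; i++) {
		struct demo_rb_desc *desc =
			demo_desc_at(td, sizeof(*td) + len * (size_t)i);

		desc->cpu = cpus[i];
		desc->nr_page_va = nr_page_va;
	}

	return td;
}
```

The sketch compares byte offsets against struct_len instead of comparing pointers, which avoids forming a pointer past the end of the allocation when the guessed slot is out of range. With descriptors for CPUs 0 and 2 only, a lookup of CPU 0 takes the fast path, a lookup of CPU 2 falls back to the walk, and a lookup of CPU 1 returns NULL.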
+517
kernel/trace/simple_ring_buffer.c
··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + /* 3 + * Copyright (C) 2025 - Google LLC 4 + * Author: Vincent Donnefort <vdonnefort@google.com> 5 + */ 6 + 7 + #include <linux/atomic.h> 8 + #include <linux/simple_ring_buffer.h> 9 + 10 + #include <asm/barrier.h> 11 + #include <asm/local.h> 12 + 13 + enum simple_rb_link_type { 14 + SIMPLE_RB_LINK_NORMAL = 0, 15 + SIMPLE_RB_LINK_HEAD = 1, 16 + SIMPLE_RB_LINK_HEAD_MOVING 17 + }; 18 + 19 + #define SIMPLE_RB_LINK_MASK ~(SIMPLE_RB_LINK_HEAD | SIMPLE_RB_LINK_HEAD_MOVING) 20 + 21 + static void simple_bpage_set_head_link(struct simple_buffer_page *bpage) 22 + { 23 + unsigned long link = (unsigned long)bpage->link.next; 24 + 25 + link &= SIMPLE_RB_LINK_MASK; 26 + link |= SIMPLE_RB_LINK_HEAD; 27 + 28 + /* 29 + * Paired with simple_rb_find_head() to order access between the head 30 + * link and overrun. It ensures we always report an up-to-date value 31 + * after swapping the reader page. 32 + */ 33 + smp_store_release(&bpage->link.next, (struct list_head *)link); 34 + } 35 + 36 + static bool simple_bpage_unset_head_link(struct simple_buffer_page *bpage, 37 + struct simple_buffer_page *dst, 38 + enum simple_rb_link_type new_type) 39 + { 40 + unsigned long *link = (unsigned long *)(&bpage->link.next); 41 + unsigned long old = (*link & SIMPLE_RB_LINK_MASK) | SIMPLE_RB_LINK_HEAD; 42 + unsigned long new = (unsigned long)(&dst->link) | new_type; 43 + 44 + return try_cmpxchg(link, &old, new); 45 + } 46 + 47 + static void simple_bpage_set_normal_link(struct simple_buffer_page *bpage) 48 + { 49 + unsigned long link = (unsigned long)bpage->link.next; 50 + 51 + WRITE_ONCE(bpage->link.next, (struct list_head *)(link & SIMPLE_RB_LINK_MASK)); 52 + } 53 + 54 + static struct simple_buffer_page *simple_bpage_from_link(struct list_head *link) 55 + { 56 + unsigned long ptr = (unsigned long)link & SIMPLE_RB_LINK_MASK; 57 + 58 + return container_of((struct list_head *)ptr, struct simple_buffer_page, link); 59 + } 60 + 61 + static struct 
simple_buffer_page *simple_bpage_next_page(struct simple_buffer_page *bpage) 62 + { 63 + return simple_bpage_from_link(bpage->link.next); 64 + } 65 + 66 + static void simple_bpage_reset(struct simple_buffer_page *bpage) 67 + { 68 + bpage->write = 0; 69 + bpage->entries = 0; 70 + 71 + local_set(&bpage->page->commit, 0); 72 + } 73 + 74 + static void simple_bpage_init(struct simple_buffer_page *bpage, void *page) 75 + { 76 + INIT_LIST_HEAD(&bpage->link); 77 + bpage->page = (struct buffer_data_page *)page; 78 + 79 + simple_bpage_reset(bpage); 80 + } 81 + 82 + #define simple_rb_meta_inc(__meta, __inc) \ 83 + WRITE_ONCE((__meta), (__meta + __inc)) 84 + 85 + static bool simple_rb_loaded(struct simple_rb_per_cpu *cpu_buffer) 86 + { 87 + return !!cpu_buffer->bpages; 88 + } 89 + 90 + static int simple_rb_find_head(struct simple_rb_per_cpu *cpu_buffer) 91 + { 92 + int retry = cpu_buffer->nr_pages * 2; 93 + struct simple_buffer_page *head; 94 + 95 + head = cpu_buffer->head_page; 96 + 97 + while (retry--) { 98 + unsigned long link; 99 + 100 + spin: 101 + /* See smp_store_release in simple_bpage_set_head_link() */ 102 + link = (unsigned long)smp_load_acquire(&head->link.prev->next); 103 + 104 + switch (link & ~SIMPLE_RB_LINK_MASK) { 105 + /* Found the head */ 106 + case SIMPLE_RB_LINK_HEAD: 107 + cpu_buffer->head_page = head; 108 + return 0; 109 + /* The writer caught the head, we can spin, that won't be long */ 110 + case SIMPLE_RB_LINK_HEAD_MOVING: 111 + goto spin; 112 + } 113 + 114 + head = simple_bpage_next_page(head); 115 + } 116 + 117 + return -EBUSY; 118 + } 119 + 120 + /** 121 + * simple_ring_buffer_swap_reader_page - Swap ring-buffer head with the reader 122 + * @cpu_buffer: A simple_rb_per_cpu 123 + * 124 + * This function enables consuming reading. It ensures the current head page will not be overwritten 125 + * and can be safely read. 126 + * 127 + * Returns 0 on success, -ENODEV if @cpu_buffer was unloaded or -EBUSY if we failed to catch the 128 + * head page. 
129 + */ 130 + int simple_ring_buffer_swap_reader_page(struct simple_rb_per_cpu *cpu_buffer) 131 + { 132 + struct simple_buffer_page *last, *head, *reader; 133 + unsigned long overrun; 134 + int retry = 8; 135 + int ret; 136 + 137 + if (!simple_rb_loaded(cpu_buffer)) 138 + return -ENODEV; 139 + 140 + reader = cpu_buffer->reader_page; 141 + 142 + do { 143 + /* Run after the writer to find the head */ 144 + ret = simple_rb_find_head(cpu_buffer); 145 + if (ret) 146 + return ret; 147 + 148 + head = cpu_buffer->head_page; 149 + 150 + /* Connect the reader page around the header page */ 151 + reader->link.next = head->link.next; 152 + reader->link.prev = head->link.prev; 153 + 154 + /* The last page before the head */ 155 + last = simple_bpage_from_link(head->link.prev); 156 + 157 + /* The reader page points to the new header page */ 158 + simple_bpage_set_head_link(reader); 159 + 160 + overrun = cpu_buffer->meta->overrun; 161 + } while (!simple_bpage_unset_head_link(last, reader, SIMPLE_RB_LINK_NORMAL) && retry--); 162 + 163 + if (!retry) 164 + return -EINVAL; 165 + 166 + cpu_buffer->head_page = simple_bpage_from_link(reader->link.next); 167 + cpu_buffer->head_page->link.prev = &reader->link; 168 + cpu_buffer->reader_page = head; 169 + cpu_buffer->meta->reader.lost_events = overrun - cpu_buffer->last_overrun; 170 + cpu_buffer->meta->reader.id = cpu_buffer->reader_page->id; 171 + cpu_buffer->last_overrun = overrun; 172 + 173 + return 0; 174 + } 175 + EXPORT_SYMBOL_GPL(simple_ring_buffer_swap_reader_page); 176 + 177 + static struct simple_buffer_page *simple_rb_move_tail(struct simple_rb_per_cpu *cpu_buffer) 178 + { 179 + struct simple_buffer_page *tail, *new_tail; 180 + 181 + tail = cpu_buffer->tail_page; 182 + new_tail = simple_bpage_next_page(tail); 183 + 184 + if (simple_bpage_unset_head_link(tail, new_tail, SIMPLE_RB_LINK_HEAD_MOVING)) { 185 + /* 186 + * Oh no! we've caught the head. There is none anymore and 187 + * swap_reader will spin until we set the new one. 
Overrun must 188 + * be written first, to make sure we report the correct number 189 + * of lost events. 190 + */ 191 + simple_rb_meta_inc(cpu_buffer->meta->overrun, new_tail->entries); 192 + simple_rb_meta_inc(cpu_buffer->meta->pages_lost, 1); 193 + 194 + simple_bpage_set_head_link(new_tail); 195 + simple_bpage_set_normal_link(tail); 196 + } 197 + 198 + simple_bpage_reset(new_tail); 199 + cpu_buffer->tail_page = new_tail; 200 + 201 + simple_rb_meta_inc(cpu_buffer->meta->pages_touched, 1); 202 + 203 + return new_tail; 204 + } 205 + 206 + static unsigned long rb_event_size(unsigned long length) 207 + { 208 + struct ring_buffer_event *event; 209 + 210 + return length + RB_EVNT_HDR_SIZE + sizeof(event->array[0]); 211 + } 212 + 213 + static struct ring_buffer_event * 214 + rb_event_add_ts_extend(struct ring_buffer_event *event, u64 delta) 215 + { 216 + event->type_len = RINGBUF_TYPE_TIME_EXTEND; 217 + event->time_delta = delta & TS_MASK; 218 + event->array[0] = delta >> TS_SHIFT; 219 + 220 + return (struct ring_buffer_event *)((unsigned long)event + 8); 221 + } 222 + 223 + static struct ring_buffer_event * 224 + simple_rb_reserve_next(struct simple_rb_per_cpu *cpu_buffer, unsigned long length, u64 timestamp) 225 + { 226 + unsigned long ts_ext_size = 0, event_size = rb_event_size(length); 227 + struct simple_buffer_page *tail = cpu_buffer->tail_page; 228 + struct ring_buffer_event *event; 229 + u32 write, prev_write; 230 + u64 time_delta; 231 + 232 + time_delta = timestamp - cpu_buffer->write_stamp; 233 + 234 + if (test_time_stamp(time_delta)) 235 + ts_ext_size = 8; 236 + 237 + prev_write = tail->write; 238 + write = prev_write + event_size + ts_ext_size; 239 + 240 + if (unlikely(write > (PAGE_SIZE - BUF_PAGE_HDR_SIZE))) 241 + tail = simple_rb_move_tail(cpu_buffer); 242 + 243 + if (!tail->entries) { 244 + tail->page->time_stamp = timestamp; 245 + time_delta = 0; 246 + ts_ext_size = 0; 247 + write = event_size; 248 + prev_write = 0; 249 + } 250 + 251 + tail->write = 
write; 252 + tail->entries++; 253 + 254 + cpu_buffer->write_stamp = timestamp; 255 + 256 + event = (struct ring_buffer_event *)(tail->page->data + prev_write); 257 + if (ts_ext_size) { 258 + event = rb_event_add_ts_extend(event, time_delta); 259 + time_delta = 0; 260 + } 261 + 262 + event->type_len = 0; 263 + event->time_delta = time_delta; 264 + event->array[0] = event_size - RB_EVNT_HDR_SIZE; 265 + 266 + return event; 267 + } 268 + 269 + /** 270 + * simple_ring_buffer_reserve - Reserve an entry in @cpu_buffer 271 + * @cpu_buffer: A simple_rb_per_cpu 272 + * @length: Size of the entry in bytes 273 + * @timestamp: Timestamp of the entry 274 + * 275 + * Returns the address of the entry where to write data or NULL 276 + */ 277 + void *simple_ring_buffer_reserve(struct simple_rb_per_cpu *cpu_buffer, unsigned long length, 278 + u64 timestamp) 279 + { 280 + struct ring_buffer_event *rb_event; 281 + 282 + if (cmpxchg(&cpu_buffer->status, SIMPLE_RB_READY, SIMPLE_RB_WRITING) != SIMPLE_RB_READY) 283 + return NULL; 284 + 285 + rb_event = simple_rb_reserve_next(cpu_buffer, length, timestamp); 286 + 287 + return &rb_event->array[1]; 288 + } 289 + EXPORT_SYMBOL_GPL(simple_ring_buffer_reserve); 290 + 291 + /** 292 + * simple_ring_buffer_commit - Commit the entry reserved with simple_ring_buffer_reserve() 293 + * @cpu_buffer: The simple_rb_per_cpu where the entry has been reserved 294 + */ 295 + void simple_ring_buffer_commit(struct simple_rb_per_cpu *cpu_buffer) 296 + { 297 + local_set(&cpu_buffer->tail_page->page->commit, 298 + cpu_buffer->tail_page->write); 299 + simple_rb_meta_inc(cpu_buffer->meta->entries, 1); 300 + 301 + /* 302 + * Paired with simple_rb_enable_tracing() to ensure data is 303 + * written to the ring-buffer before teardown. 
304 + */ 305 + smp_store_release(&cpu_buffer->status, SIMPLE_RB_READY); 306 + } 307 + EXPORT_SYMBOL_GPL(simple_ring_buffer_commit); 308 + 309 + static u32 simple_rb_enable_tracing(struct simple_rb_per_cpu *cpu_buffer, bool enable) 310 + { 311 + u32 prev_status; 312 + 313 + if (enable) 314 + return cmpxchg(&cpu_buffer->status, SIMPLE_RB_UNAVAILABLE, SIMPLE_RB_READY); 315 + 316 + /* Wait for the buffer to be released */ 317 + do { 318 + prev_status = cmpxchg_acquire(&cpu_buffer->status, 319 + SIMPLE_RB_READY, 320 + SIMPLE_RB_UNAVAILABLE); 321 + } while (prev_status == SIMPLE_RB_WRITING); 322 + 323 + return prev_status; 324 + } 325 + 326 + /** 327 + * simple_ring_buffer_reset - Reset @cpu_buffer 328 + * @cpu_buffer: A simple_rb_per_cpu 329 + * 330 + * This will not clear the content of the data, only reset counters and pointers 331 + * 332 + * Returns 0 on success or -ENODEV if @cpu_buffer was unloaded. 333 + */ 334 + int simple_ring_buffer_reset(struct simple_rb_per_cpu *cpu_buffer) 335 + { 336 + struct simple_buffer_page *bpage; 337 + u32 prev_status; 338 + int ret; 339 + 340 + if (!simple_rb_loaded(cpu_buffer)) 341 + return -ENODEV; 342 + 343 + prev_status = simple_rb_enable_tracing(cpu_buffer, false); 344 + 345 + ret = simple_rb_find_head(cpu_buffer); 346 + if (ret) 347 + return ret; 348 + 349 + bpage = cpu_buffer->tail_page = cpu_buffer->head_page; 350 + do { 351 + simple_bpage_reset(bpage); 352 + bpage = simple_bpage_next_page(bpage); 353 + } while (bpage != cpu_buffer->head_page); 354 + 355 + simple_bpage_reset(cpu_buffer->reader_page); 356 + 357 + cpu_buffer->last_overrun = 0; 358 + cpu_buffer->write_stamp = 0; 359 + 360 + cpu_buffer->meta->reader.read = 0; 361 + cpu_buffer->meta->reader.lost_events = 0; 362 + cpu_buffer->meta->entries = 0; 363 + cpu_buffer->meta->overrun = 0; 364 + cpu_buffer->meta->read = 0; 365 + cpu_buffer->meta->pages_lost = 0; 366 + cpu_buffer->meta->pages_touched = 0; 367 + 368 + if (prev_status == SIMPLE_RB_READY) 369 + 
simple_rb_enable_tracing(cpu_buffer, true); 370 + 371 + return 0; 372 + } 373 + EXPORT_SYMBOL_GPL(simple_ring_buffer_reset); 374 + 375 + int simple_ring_buffer_init_mm(struct simple_rb_per_cpu *cpu_buffer, 376 + struct simple_buffer_page *bpages, 377 + const struct ring_buffer_desc *desc, 378 + void *(*load_page)(unsigned long va), 379 + void (*unload_page)(void *va)) 380 + { 381 + struct simple_buffer_page *bpage = bpages; 382 + int ret = 0; 383 + void *page; 384 + int i; 385 + 386 + /* At least 1 reader page and two pages in the ring-buffer */ 387 + if (desc->nr_page_va < 3) 388 + return -EINVAL; 389 + 390 + memset(cpu_buffer, 0, sizeof(*cpu_buffer)); 391 + 392 + cpu_buffer->meta = load_page(desc->meta_va); 393 + if (!cpu_buffer->meta) 394 + return -EINVAL; 395 + 396 + memset(cpu_buffer->meta, 0, sizeof(*cpu_buffer->meta)); 397 + cpu_buffer->meta->meta_page_size = PAGE_SIZE; 398 + cpu_buffer->meta->nr_subbufs = cpu_buffer->nr_pages; 399 + 400 + /* The reader page is not part of the ring initially */ 401 + page = load_page(desc->page_va[0]); 402 + if (!page) { 403 + unload_page(cpu_buffer->meta); 404 + return -EINVAL; 405 + } 406 + 407 + simple_bpage_init(bpage, page); 408 + bpage->id = 0; 409 + 410 + cpu_buffer->nr_pages = 1; 411 + 412 + cpu_buffer->reader_page = bpage; 413 + cpu_buffer->tail_page = bpage + 1; 414 + cpu_buffer->head_page = bpage + 1; 415 + 416 + for (i = 1; i < desc->nr_page_va; i++) { 417 + page = load_page(desc->page_va[i]); 418 + if (!page) { 419 + ret = -EINVAL; 420 + break; 421 + } 422 + 423 + simple_bpage_init(++bpage, page); 424 + 425 + bpage->link.next = &(bpage + 1)->link; 426 + bpage->link.prev = &(bpage - 1)->link; 427 + bpage->id = i; 428 + 429 + cpu_buffer->nr_pages = i + 1; 430 + } 431 + 432 + if (ret) { 433 + for (i--; i >= 0; i--) 434 + unload_page((void *)desc->page_va[i]); 435 + unload_page(cpu_buffer->meta); 436 + 437 + return ret; 438 + } 439 + 440 + /* Close the ring */ 441 + bpage->link.next = &cpu_buffer->tail_page->link; 
442 + cpu_buffer->tail_page->link.prev = &bpage->link; 443 + 444 + /* The last init'ed page points to the head page */ 445 + simple_bpage_set_head_link(bpage); 446 + 447 + cpu_buffer->bpages = bpages; 448 + 449 + return 0; 450 + } 451 + 452 + static void *__load_page(unsigned long page) 453 + { 454 + return (void *)page; 455 + } 456 + 457 + static void __unload_page(void *page) { } 458 + 459 + /** 460 + * simple_ring_buffer_init - Init @cpu_buffer based on @desc 461 + * @cpu_buffer: A simple_rb_per_cpu buffer to init, allocated by the caller. 462 + * @bpages: Array of simple_buffer_pages, with as many elements as @desc->nr_page_va 463 + * @desc: A ring_buffer_desc 464 + * 465 + * Returns 0 on success or -EINVAL if the content of @desc is invalid 466 + */ 467 + int simple_ring_buffer_init(struct simple_rb_per_cpu *cpu_buffer, struct simple_buffer_page *bpages, 468 + const struct ring_buffer_desc *desc) 469 + { 470 + return simple_ring_buffer_init_mm(cpu_buffer, bpages, desc, __load_page, __unload_page); 471 + } 472 + EXPORT_SYMBOL_GPL(simple_ring_buffer_init); 473 + 474 + void simple_ring_buffer_unload_mm(struct simple_rb_per_cpu *cpu_buffer, 475 + void (*unload_page)(void *)) 476 + { 477 + int p; 478 + 479 + if (!simple_rb_loaded(cpu_buffer)) 480 + return; 481 + 482 + simple_rb_enable_tracing(cpu_buffer, false); 483 + 484 + unload_page(cpu_buffer->meta); 485 + for (p = 0; p < cpu_buffer->nr_pages; p++) 486 + unload_page(cpu_buffer->bpages[p].page); 487 + 488 + cpu_buffer->bpages = NULL; 489 + } 490 + 491 + /** 492 + * simple_ring_buffer_unload - Prepare @cpu_buffer for deletion 493 + * @cpu_buffer: A simple_rb_per_cpu that will be deleted. 
494 + */ 495 + void simple_ring_buffer_unload(struct simple_rb_per_cpu *cpu_buffer) 496 + { 497 + return simple_ring_buffer_unload_mm(cpu_buffer, __unload_page); 498 + } 499 + EXPORT_SYMBOL_GPL(simple_ring_buffer_unload); 500 + 501 + /** 502 + * simple_ring_buffer_enable_tracing - Enable or disable writing to @cpu_buffer 503 + * @cpu_buffer: A simple_rb_per_cpu 504 + * @enable: True to enable tracing, False to disable it 505 + * 506 + * Returns 0 on success or -ENODEV if @cpu_buffer was unloaded 507 + */ 508 + int simple_ring_buffer_enable_tracing(struct simple_rb_per_cpu *cpu_buffer, bool enable) 509 + { 510 + if (!simple_rb_loaded(cpu_buffer)) 511 + return -ENODEV; 512 + 513 + simple_rb_enable_tracing(cpu_buffer, enable); 514 + 515 + return 0; 516 + } 517 + EXPORT_SYMBOL_GPL(simple_ring_buffer_enable_tracing);
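Note on the status handshake used by simple_ring_buffer_reserve(), simple_ring_buffer_commit() and simple_rb_enable_tracing() in the diff above. The following is a minimal user-space sketch of that three-state protocol, not the kernel code. It assumes C11 atomics in place of the kernel's cmpxchg() and smp_store_release(); struct model_rb, the RB_* values and the model_* helpers are hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for SIMPLE_RB_UNAVAILABLE/READY/WRITING. */
enum { RB_UNAVAILABLE, RB_READY, RB_WRITING };

struct model_rb {
	_Atomic int status;
	int committed;	/* stands in for the committed entry count */
};

/* Control side: UNAVAILABLE -> READY. Returns the status seen before. */
static int model_enable(struct model_rb *rb)
{
	int prev = RB_UNAVAILABLE;

	atomic_compare_exchange_strong(&rb->status, &prev, RB_READY);
	return prev;
}

/* Writer side: claim the buffer; fails if it is disabled or busy. */
static bool model_reserve(struct model_rb *rb)
{
	int expected = RB_READY;

	return atomic_compare_exchange_strong(&rb->status, &expected,
					      RB_WRITING);
}

/* Writer side: publish the entry, then hand the buffer back. */
static void model_commit(struct model_rb *rb)
{
	assert(atomic_load(&rb->status) == RB_WRITING);
	rb->committed++;
	/* Release: the entry is visible before the status flips to READY. */
	atomic_store_explicit(&rb->status, RB_READY, memory_order_release);
}

/*
 * Control side: READY -> UNAVAILABLE, retrying while a writer holds the
 * buffer. Returns the status seen before the transition.
 */
static int model_disable(struct model_rb *rb)
{
	int prev;

	do {
		prev = RB_READY;
		atomic_compare_exchange_strong_explicit(&rb->status, &prev,
							RB_UNAVAILABLE,
							memory_order_acquire,
							memory_order_acquire);
	} while (prev == RB_WRITING);

	return prev;
}
```

In this model a writer calls model_reserve(), fills its entry, then calls model_commit(). A caller of model_disable() spins only while a writer is between those two calls, and the returned previous status lets a reset path re-enable the buffer only if it was enabled before, as simple_ring_buffer_reset() does in the diff.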
+148 -40
kernel/trace/trace.c
··· 578 578 tr->ring_buffer_expanded = true; 579 579 } 580 580 581 + static void trace_array_autoremove(struct work_struct *work) 582 + { 583 + struct trace_array *tr = container_of(work, struct trace_array, autoremove_work); 584 + 585 + trace_array_destroy(tr); 586 + } 587 + 588 + static struct workqueue_struct *autoremove_wq; 589 + 590 + static void trace_array_kick_autoremove(struct trace_array *tr) 591 + { 592 + if (autoremove_wq) 593 + queue_work(autoremove_wq, &tr->autoremove_work); 594 + } 595 + 596 + static void trace_array_cancel_autoremove(struct trace_array *tr) 597 + { 598 + /* 599 + * Since this can be called inside trace_array_autoremove(), 600 + * it has to avoid deadlock of the workqueue. 601 + */ 602 + if (work_pending(&tr->autoremove_work)) 603 + cancel_work_sync(&tr->autoremove_work); 604 + } 605 + 606 + static void trace_array_init_autoremove(struct trace_array *tr) 607 + { 608 + INIT_WORK(&tr->autoremove_work, trace_array_autoremove); 609 + } 610 + 611 + static void trace_array_start_autoremove(void) 612 + { 613 + if (autoremove_wq) 614 + return; 615 + 616 + autoremove_wq = alloc_workqueue("tr_autoremove_wq", 617 + WQ_UNBOUND | WQ_HIGHPRI, 0); 618 + if (!autoremove_wq) 619 + pr_warn("Unable to allocate tr_autoremove_wq. autoremove disabled.\n"); 620 + } 621 + 581 622 LIST_HEAD(ftrace_trace_arrays); 623 + 624 + static int __trace_array_get(struct trace_array *this_tr) 625 + { 626 + /* When free_on_close is set, this is not available anymore. 
*/ 627 + if (autoremove_wq && this_tr->free_on_close) 628 + return -ENODEV; 629 + 630 + this_tr->ref++; 631 + return 0; 632 + } 582 633 583 634 int trace_array_get(struct trace_array *this_tr) 584 635 { ··· 638 587 guard(mutex)(&trace_types_lock); 639 588 list_for_each_entry(tr, &ftrace_trace_arrays, list) { 640 589 if (tr == this_tr) { 641 - tr->ref++; 642 - return 0; 590 + return __trace_array_get(tr); 643 591 } 644 592 } 645 593 ··· 649 599 { 650 600 WARN_ON(!this_tr->ref); 651 601 this_tr->ref--; 602 + /* 603 + * When free_on_close is set, prepare removing the array 604 + * when the last reference is released. 605 + */ 606 + if (this_tr->ref == 1 && this_tr->free_on_close) 607 + trace_array_kick_autoremove(this_tr); 652 608 } 653 609 654 610 /** ··· 3912 3856 * Should be used after trace_array_get(), trace_types_lock 3913 3857 * ensures that i_cdev was already initialized. 3914 3858 */ 3915 - static inline int tracing_get_cpu(struct inode *inode) 3859 + int tracing_get_cpu(struct inode *inode) 3916 3860 { 3917 3861 if (inode->i_cdev) /* See trace_create_cpu_file() */ 3918 3862 return (long)inode->i_cdev - 1; ··· 4077 4021 ret = tracing_check_open_get_tr(tr); 4078 4022 if (ret) 4079 4023 return ret; 4024 + 4025 + if ((filp->f_mode & FMODE_WRITE) && trace_array_is_readonly(tr)) { 4026 + trace_array_put(tr); 4027 + return -EACCES; 4028 + } 4080 4029 4081 4030 filp->private_data = inode->i_private; 4082 4031 ··· 5522 5461 5523 5462 /* Only if the buffer has previous boot data clear and update it. */ 5524 5463 tr->flags &= ~TRACE_ARRAY_FL_LAST_BOOT; 5464 + 5465 + /* If this is a backup instance, mark it for autoremove. 
*/ 5466 + if (tr->flags & TRACE_ARRAY_FL_VMALLOC) 5467 + tr->free_on_close = true; 5525 5468 5526 5469 /* Reset the module list and reload them */ 5527 5470 if (tr->scratch) { ··· 7162 7097 if (ret) 7163 7098 return ret; 7164 7099 7100 + if ((file->f_mode & FMODE_WRITE) && trace_array_is_readonly(tr)) { 7101 + trace_array_put(tr); 7102 + return -EACCES; 7103 + } 7104 + 7165 7105 ret = single_open(file, tracing_clock_show, inode->i_private); 7166 7106 if (ret < 0) 7167 7107 trace_array_put(tr); ··· 8676 8606 return tr->percpu_dir; 8677 8607 } 8678 8608 8679 - static struct dentry * 8609 + struct dentry * 8680 8610 trace_create_cpu_file(const char *name, umode_t mode, struct dentry *parent, 8681 8611 void *data, long cpu, const struct file_operations *fops) 8682 8612 { ··· 9597 9527 9598 9528 guard(mutex)(&trace_types_lock); 9599 9529 tr = trace_array_find(instance); 9600 - if (tr) 9601 - tr->ref++; 9530 + if (tr && __trace_array_get(tr) < 0) 9531 + tr = NULL; 9602 9532 9603 9533 return tr; 9604 9534 } ··· 9694 9624 9695 9625 if (ftrace_allocate_ftrace_ops(tr) < 0) 9696 9626 goto out_free_tr; 9627 + 9628 + trace_array_init_autoremove(tr); 9697 9629 9698 9630 ftrace_init_trace_array(tr); 9699 9631 ··· 9807 9735 9808 9736 list_for_each_entry(tr, &ftrace_trace_arrays, list) { 9809 9737 if (tr->name && strcmp(tr->name, name) == 0) { 9810 - tr->ref++; 9738 + /* if this fails, @tr is going to be removed. 
*/ 9739 + if (__trace_array_get(tr) < 0) 9740 + tr = NULL; 9811 9741 return tr; 9812 9742 } 9813 9743 } ··· 9848 9774 set_tracer_flag(tr, 1ULL << i, 0); 9849 9775 } 9850 9776 9777 + trace_array_cancel_autoremove(tr); 9851 9778 tracing_set_nop(tr); 9852 9779 clear_ftrace_function_probes(tr); 9853 9780 event_trace_del_tracer(tr); ··· 9941 9866 static void 9942 9867 init_tracer_tracefs(struct trace_array *tr, struct dentry *d_tracer) 9943 9868 { 9869 + umode_t writable_mode = TRACE_MODE_WRITE; 9944 9870 int cpu; 9945 9871 9872 + if (trace_array_is_readonly(tr)) 9873 + writable_mode = TRACE_MODE_READ; 9874 + 9946 9875 trace_create_file("available_tracers", TRACE_MODE_READ, d_tracer, 9947 - tr, &show_traces_fops); 9876 + tr, &show_traces_fops); 9948 9877 9949 - trace_create_file("current_tracer", TRACE_MODE_WRITE, d_tracer, 9950 - tr, &set_tracer_fops); 9878 + trace_create_file("current_tracer", writable_mode, d_tracer, 9879 + tr, &set_tracer_fops); 9951 9880 9952 - trace_create_file("tracing_cpumask", TRACE_MODE_WRITE, d_tracer, 9881 + trace_create_file("tracing_cpumask", writable_mode, d_tracer, 9953 9882 tr, &tracing_cpumask_fops); 9954 9883 9884 + /* Options are used for changing print-format even for readonly instance. 
*/ 9955 9885 trace_create_file("trace_options", TRACE_MODE_WRITE, d_tracer, 9956 9886 tr, &tracing_iter_fops); 9957 9887 ··· 9966 9886 trace_create_file("trace_pipe", TRACE_MODE_READ, d_tracer, 9967 9887 tr, &tracing_pipe_fops); 9968 9888 9969 - trace_create_file("buffer_size_kb", TRACE_MODE_WRITE, d_tracer, 9889 + trace_create_file("buffer_size_kb", writable_mode, d_tracer, 9970 9890 tr, &tracing_entries_fops); 9971 9891 9972 9892 trace_create_file("buffer_total_size_kb", TRACE_MODE_READ, d_tracer, 9973 9893 tr, &tracing_total_entries_fops); 9894 + 9895 + trace_create_file("trace_clock", writable_mode, d_tracer, tr, 9896 + &trace_clock_fops); 9897 + 9898 + trace_create_file("timestamp_mode", TRACE_MODE_READ, d_tracer, tr, 9899 + &trace_time_stamp_mode_fops); 9900 + 9901 + tr->buffer_percent = 50; 9902 + 9903 + trace_create_file("buffer_subbuf_size_kb", writable_mode, d_tracer, 9904 + tr, &buffer_subbuf_size_fops); 9905 + 9906 + create_trace_options_dir(tr); 9907 + 9908 + if (tr->range_addr_start) 9909 + trace_create_file("last_boot_info", TRACE_MODE_READ, d_tracer, 9910 + tr, &last_boot_fops); 9911 + 9912 + for_each_tracing_cpu(cpu) 9913 + tracing_init_tracefs_percpu(tr, cpu); 9914 + 9915 + /* Read-only instance has above files only. 
*/ 9916 + if (trace_array_is_readonly(tr)) 9917 + return; 9974 9918 9975 9919 trace_create_file("free_buffer", 0200, d_tracer, 9976 9920 tr, &tracing_free_buffer_fops); ··· 10007 9903 trace_create_file("trace_marker_raw", 0220, d_tracer, 10008 9904 tr, &tracing_mark_raw_fops); 10009 9905 10010 - trace_create_file("trace_clock", TRACE_MODE_WRITE, d_tracer, tr, 10011 - &trace_clock_fops); 9906 + trace_create_file("buffer_percent", TRACE_MODE_WRITE, d_tracer, 9907 + tr, &buffer_percent_fops); 9908 + 9909 + trace_create_file("syscall_user_buf_size", TRACE_MODE_WRITE, d_tracer, 9910 + tr, &tracing_syscall_buf_fops); 10012 9911 10013 9912 trace_create_file("tracing_on", TRACE_MODE_WRITE, d_tracer, 10014 9913 tr, &rb_simple_fops); 10015 - 10016 - trace_create_file("timestamp_mode", TRACE_MODE_READ, d_tracer, tr, 10017 - &trace_time_stamp_mode_fops); 10018 - 10019 - tr->buffer_percent = 50; 10020 - 10021 - trace_create_file("buffer_percent", TRACE_MODE_WRITE, d_tracer, 10022 - tr, &buffer_percent_fops); 10023 - 10024 - trace_create_file("buffer_subbuf_size_kb", TRACE_MODE_WRITE, d_tracer, 10025 - tr, &buffer_subbuf_size_fops); 10026 - 10027 - trace_create_file("syscall_user_buf_size", TRACE_MODE_WRITE, d_tracer, 10028 - tr, &tracing_syscall_buf_fops); 10029 - 10030 - create_trace_options_dir(tr); 10031 9914 10032 9915 trace_create_maxlat_file(tr, d_tracer); 10033 9916 10034 9917 if (ftrace_create_function_files(tr, d_tracer)) 10035 9918 MEM_FAIL(1, "Could not allocate function filter files"); 10036 9919 10037 - if (tr->range_addr_start) { 10038 - trace_create_file("last_boot_info", TRACE_MODE_READ, d_tracer, 10039 - tr, &last_boot_fops); 10040 9920 #ifdef CONFIG_TRACER_SNAPSHOT 10041 - } else { 9921 + if (!tr->range_addr_start) 10042 9922 trace_create_file("snapshot", TRACE_MODE_WRITE, d_tracer, 10043 9923 tr, &snapshot_fops); 10044 9924 #endif 10045 - } 10046 9925 10047 9926 trace_create_file("error_log", TRACE_MODE_WRITE, d_tracer, 10048 9927 tr, &tracing_err_log_fops); 
10049 - 10050 - for_each_tracing_cpu(cpu) 10051 - tracing_init_tracefs_percpu(tr, cpu); 10052 9928 10053 9929 ftrace_init_tracefs(tr, d_tracer); 10054 9930 } ··· 10855 10771 /* 10856 10772 * Backup buffers can be freed but need vfree(). 10857 10773 */ 10858 - if (backup) 10859 - tr->flags |= TRACE_ARRAY_FL_VMALLOC; 10774 + if (backup) { 10775 + tr->flags |= TRACE_ARRAY_FL_VMALLOC | TRACE_ARRAY_FL_RDONLY; 10776 + trace_array_start_autoremove(); 10777 + } 10860 10778 10861 10779 if (start || backup) { 10862 10780 tr->flags |= TRACE_ARRAY_FL_BOOT | TRACE_ARRAY_FL_LAST_BOOT; 10863 10781 tr->range_name = no_free_ptr(rname); 10864 10782 } 10865 10783 10784 + /* 10785 + * Save the events to start and enabled them after all boot instances 10786 + * have been created. 10787 + */ 10788 + tr->boot_events = curr_str; 10789 + } 10790 + 10791 + /* Enable the events after all boot instances have been created */ 10792 + list_for_each_entry(tr, &ftrace_trace_arrays, list) { 10793 + 10794 + if (!tr->boot_events || !(*tr->boot_events)) { 10795 + tr->boot_events = NULL; 10796 + continue; 10797 + } 10798 + 10799 + curr_str = tr->boot_events; 10800 + 10801 + /* Clear the instance if this is a persistent buffer */ 10802 + if (tr->flags & TRACE_ARRAY_FL_LAST_BOOT) 10803 + update_last_data(tr); 10804 + 10866 10805 while ((tok = strsep(&curr_str, ","))) { 10867 10806 early_enable_events(tr, tok, true); 10868 10807 } 10808 + tr->boot_events = NULL; 10869 10809 } 10870 10810 } 10871 10811
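Note on the reference counting added to kernel/trace/trace.c above. The sketch below models how __trace_array_get() and trace_array_put() interact once free_on_close is set. It is a simplified user-space illustration under stated assumptions: struct model_instance and the model_* helpers are hypothetical, the workqueue is replaced by a removal_queued flag, the autoremove_wq check is omitted, and the count is assumed to start at 1 (a base reference held by the instance itself), which is one reading of why the diff triggers removal at ref == 1.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

struct model_instance {
	int ref;		/* assumed to start at 1 (base reference) */
	bool free_on_close;	/* set once the backup data has been consumed */
	bool removal_queued;	/* stands in for queue_work() on the workqueue */
};

/* New references are refused once the instance is marked for removal. */
static int model_get(struct model_instance *inst)
{
	if (inst->free_on_close)
		return -ENODEV;

	inst->ref++;
	return 0;
}

/* Dropping the last user reference schedules the removal. */
static void model_put(struct model_instance *inst)
{
	assert(inst->ref > 0);
	inst->ref--;

	if (inst->ref == 1 && inst->free_on_close)
		inst->removal_queued = true;
}
```

With this shape, a reader that already holds a reference can finish normally, while any later open attempt fails with -ENODEV instead of racing with the pending removal.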
+24 -1
kernel/trace/trace.h
···
 	unsigned char		trace_flags_index[TRACE_FLAGS_MAX_SIZE];
 	unsigned int		flags;
 	raw_spinlock_t		start_lock;
-	const char		*system_names;
+	union {
+		const char	*system_names;
+		char		*boot_events;
+	};
 	struct list_head	err_log;
 	struct dentry		*dir;
 	struct dentry		*options;
···
 	 * we do not waste memory on systems that are not using tracing.
 	 */
 	bool ring_buffer_expanded;
+	/*
+	 * If the ring buffer is a read only backup instance, it will be
+	 * removed after dumping all data via pipe, because no readable data.
+	 */
+	bool free_on_close;
+	struct work_struct	autoremove_work;
 };

 enum {
···
 	TRACE_ARRAY_FL_MOD_INIT		= BIT(3),
 	TRACE_ARRAY_FL_MEMMAP		= BIT(4),
 	TRACE_ARRAY_FL_VMALLOC		= BIT(5),
+	TRACE_ARRAY_FL_RDONLY		= BIT(6),
 };

 #ifdef CONFIG_MODULES
···
 extern unsigned long trace_adjust_address(struct trace_array *tr, unsigned long addr);

 extern struct trace_array *printk_trace;
+
+static inline bool trace_array_is_readonly(struct trace_array *tr)
+{
+	/* backup instance is read only. */
+	return tr->flags & TRACE_ARRAY_FL_RDONLY;
+}

 /*
  * The global tracer (top) should be the first trace array added,
···
 				 struct dentry *parent,
 				 void *data,
 				 const struct file_operations *fops);
+struct dentry *trace_create_cpu_file(const char *name,
+				     umode_t mode,
+				     struct dentry *parent,
+				     void *data,
+				     long cpu,
+				     const struct file_operations *fops);
+int tracing_get_cpu(struct inode *inode);


 /**
+3 -2
kernel/trace/trace_boot.c
···
 		v = memparse(p, NULL);
 		if (v < PAGE_SIZE)
 			pr_err("Buffer size is too small: %s\n", p);
-		if (tracing_resize_ring_buffer(tr, v, RING_BUFFER_ALL_CPUS) < 0)
+		if (trace_array_is_readonly(tr) ||
+		    tracing_resize_ring_buffer(tr, v, RING_BUFFER_ALL_CPUS) < 0)
 			pr_err("Failed to resize trace buffer to %s\n", p);
 	}

···

 	p = xbc_node_find_value(node, "tracer", NULL);
 	if (p && *p != '\0') {
-		if (tracing_set_tracer(tr, p) < 0)
+		if (trace_array_is_readonly(tr) || tracing_set_tracer(tr, p) < 0)
 			pr_err("Failed to set given tracer: %s\n", p);
 	}

+47 -35
kernel/trace/trace_events.c
··· 1401 1401 { 1402 1402 int ret; 1403 1403 1404 + if (trace_array_is_readonly(tr)) 1405 + return -EACCES; 1406 + 1404 1407 mutex_lock(&event_mutex); 1405 1408 ret = __ftrace_set_clr_event_nolock(tr, match, sub, event, set, mod); 1406 1409 mutex_unlock(&event_mutex); ··· 2976 2973 } else 2977 2974 __get_system(system); 2978 2975 2979 - /* ftrace only has directories no files */ 2980 - if (strcmp(name, "ftrace") == 0) 2976 + /* ftrace only has directories no files, readonly instance too. */ 2977 + if (strcmp(name, "ftrace") == 0 || trace_array_is_readonly(tr)) 2981 2978 nr_entries = 0; 2982 2979 else 2983 2980 nr_entries = ARRAY_SIZE(system_entries); ··· 3142 3139 int ret; 3143 3140 static struct eventfs_entry event_entries[] = { 3144 3141 { 3142 + .name = "format", 3143 + .callback = event_callback, 3144 + }, 3145 + #ifdef CONFIG_PERF_EVENTS 3146 + { 3147 + .name = "id", 3148 + .callback = event_callback, 3149 + }, 3150 + #endif 3151 + #define NR_RO_EVENT_ENTRIES (1 + IS_ENABLED(CONFIG_PERF_EVENTS)) 3152 + /* Readonly files must be above this line and counted by NR_RO_EVENT_ENTRIES. 
*/ 3153 + { 3145 3154 .name = "enable", 3146 3155 .callback = event_callback, 3147 3156 .release = event_release, ··· 3166 3151 .name = "trigger", 3167 3152 .callback = event_callback, 3168 3153 }, 3169 - { 3170 - .name = "format", 3171 - .callback = event_callback, 3172 - }, 3173 - #ifdef CONFIG_PERF_EVENTS 3174 - { 3175 - .name = "id", 3176 - .callback = event_callback, 3177 - }, 3178 - #endif 3179 3154 #ifdef CONFIG_HIST_TRIGGERS 3180 3155 { 3181 3156 .name = "hist", ··· 3198 3193 if (!e_events) 3199 3194 return -ENOMEM; 3200 3195 3201 - nr_entries = ARRAY_SIZE(event_entries); 3196 + if (trace_array_is_readonly(tr)) 3197 + nr_entries = NR_RO_EVENT_ENTRIES; 3198 + else 3199 + nr_entries = ARRAY_SIZE(event_entries); 3202 3200 3203 3201 name = trace_event_name(call); 3204 3202 ei = eventfs_create_dir(name, e_events, event_entries, nr_entries, file); ··· 4544 4536 int nr_entries; 4545 4537 static struct eventfs_entry events_entries[] = { 4546 4538 { 4547 - .name = "enable", 4548 - .callback = events_callback, 4549 - }, 4550 - { 4551 4539 .name = "header_page", 4552 4540 .callback = events_callback, 4553 4541 }, ··· 4551 4547 .name = "header_event", 4552 4548 .callback = events_callback, 4553 4549 }, 4550 + #define NR_RO_TOP_ENTRIES 2 4551 + /* Readonly files must be above this line and counted by NR_RO_TOP_ENTRIES. 
*/ 4552 + { 4553 + .name = "enable", 4554 + .callback = events_callback, 4555 + }, 4554 4556 }; 4555 4557 4556 - entry = trace_create_file("set_event", TRACE_MODE_WRITE, parent, 4557 - tr, &ftrace_set_event_fops); 4558 - if (!entry) 4559 - return -ENOMEM; 4558 + if (!trace_array_is_readonly(tr)) { 4559 + entry = trace_create_file("set_event", TRACE_MODE_WRITE, parent, 4560 + tr, &ftrace_set_event_fops); 4561 + if (!entry) 4562 + return -ENOMEM; 4560 4563 4561 - trace_create_file("show_event_filters", TRACE_MODE_READ, parent, tr, 4562 - &ftrace_show_event_filters_fops); 4564 + /* There are not as crucial, just warn if they are not created */ 4565 + trace_create_file("show_event_filters", TRACE_MODE_READ, parent, tr, 4566 + &ftrace_show_event_filters_fops); 4563 4567 4564 - trace_create_file("show_event_triggers", TRACE_MODE_READ, parent, tr, 4565 - &ftrace_show_event_triggers_fops); 4568 + trace_create_file("show_event_triggers", TRACE_MODE_READ, parent, tr, 4569 + &ftrace_show_event_triggers_fops); 4566 4570 4567 - nr_entries = ARRAY_SIZE(events_entries); 4571 + trace_create_file("set_event_pid", TRACE_MODE_WRITE, parent, 4572 + tr, &ftrace_set_event_pid_fops); 4573 + 4574 + trace_create_file("set_event_notrace_pid", 4575 + TRACE_MODE_WRITE, parent, tr, 4576 + &ftrace_set_event_notrace_pid_fops); 4577 + nr_entries = ARRAY_SIZE(events_entries); 4578 + } else { 4579 + nr_entries = NR_RO_TOP_ENTRIES; 4580 + } 4568 4581 4569 4582 e_events = eventfs_create_events_dir("events", parent, events_entries, 4570 4583 nr_entries, tr); ··· 4589 4568 pr_warn("Could not create tracefs 'events' directory\n"); 4590 4569 return -ENOMEM; 4591 4570 } 4592 - 4593 - /* There are not as crucial, just warn if they are not created */ 4594 - 4595 - trace_create_file("set_event_pid", TRACE_MODE_WRITE, parent, 4596 - tr, &ftrace_set_event_pid_fops); 4597 - 4598 - trace_create_file("set_event_notrace_pid", 4599 - TRACE_MODE_WRITE, parent, tr, 4600 - &ftrace_set_event_notrace_pid_fops); 4601 
4571 4602 4572 tr->event_dir = e_events; 4603 4573
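Note on the entry reordering in kernel/trace/trace_events.c above. The diff moves the read-only files to the front of each eventfs_entry table and passes a smaller count (NR_RO_EVENT_ENTRIES or NR_RO_TOP_ENTRIES) for read-only instances, so only a prefix of the table is exposed. The sketch below illustrates that prefix-count technique in plain C. struct model_entry, MODEL_NR_RO_ENTRIES and the model_* helpers are hypothetical; the table lists only names visible in the diff, and the fixed value 2 assumes the "id" entry is built in (the diff computes it as 1 + IS_ENABLED(CONFIG_PERF_EVENTS)).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_entry {
	const char *name;
};

/* Read-only entries come first, so a prefix of the table is the RO view. */
static const struct model_entry model_entries[] = {
	{ .name = "format" },
	{ .name = "id" },
#define MODEL_NR_RO_ENTRIES 2
	/* Writable entries follow; read-only instances never see them. */
	{ .name = "enable" },
	{ .name = "trigger" },
};

#define MODEL_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Number of entries to expose for a given kind of instance. */
static size_t model_nr_entries(bool readonly)
{
	return readonly ? MODEL_NR_RO_ENTRIES : MODEL_ARRAY_SIZE(model_entries);
}

/* Look up an entry name, refusing indexes outside the exposed prefix. */
static const char *model_entry_name(size_t i, bool readonly)
{
	assert(i < model_nr_entries(readonly));
	return model_entries[i].name;
}
```

The design relies on ordering alone: no per-entry flag is needed, but every new read-only entry must be placed above the marker and counted, which is what the comment next to the macro in the diff asks for.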
+1368
kernel/trace/trace_remote.c
··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + /* 3 + * Copyright (C) 2025 - Google LLC 4 + * Author: Vincent Donnefort <vdonnefort@google.com> 5 + */ 6 + 7 + #include <linux/kstrtox.h> 8 + #include <linux/lockdep.h> 9 + #include <linux/mutex.h> 10 + #include <linux/tracefs.h> 11 + #include <linux/trace_remote.h> 12 + #include <linux/trace_seq.h> 13 + #include <linux/types.h> 14 + 15 + #include "trace.h" 16 + 17 + #define TRACEFS_DIR "remotes" 18 + #define TRACEFS_MODE_WRITE 0640 19 + #define TRACEFS_MODE_READ 0440 20 + 21 + enum tri_type { 22 + TRI_CONSUMING, 23 + TRI_NONCONSUMING, 24 + }; 25 + 26 + struct trace_remote_iterator { 27 + struct trace_remote *remote; 28 + struct trace_seq seq; 29 + struct delayed_work poll_work; 30 + unsigned long lost_events; 31 + u64 ts; 32 + struct ring_buffer_iter *rb_iter; 33 + struct ring_buffer_iter **rb_iters; 34 + struct remote_event_hdr *evt; 35 + int cpu; 36 + int evt_cpu; 37 + loff_t pos; 38 + enum tri_type type; 39 + }; 40 + 41 + struct trace_remote { 42 + struct trace_remote_callbacks *cbs; 43 + void *priv; 44 + struct trace_buffer *trace_buffer; 45 + struct trace_buffer_desc *trace_buffer_desc; 46 + struct dentry *dentry; 47 + struct eventfs_inode *eventfs; 48 + struct remote_event *events; 49 + unsigned long nr_events; 50 + unsigned long trace_buffer_size; 51 + struct ring_buffer_remote rb_remote; 52 + struct mutex lock; 53 + struct rw_semaphore reader_lock; 54 + struct rw_semaphore *pcpu_reader_locks; 55 + unsigned int nr_readers; 56 + unsigned int poll_ms; 57 + bool tracing_on; 58 + }; 59 + 60 + static bool trace_remote_loaded(struct trace_remote *remote) 61 + { 62 + return !!remote->trace_buffer; 63 + } 64 + 65 + static int trace_remote_load(struct trace_remote *remote) 66 + { 67 + struct ring_buffer_remote *rb_remote = &remote->rb_remote; 68 + struct trace_buffer_desc *desc; 69 + 70 + lockdep_assert_held(&remote->lock); 71 + 72 + if (trace_remote_loaded(remote)) 73 + return 0; 74 + 75 + desc = 
remote->cbs->load_trace_buffer(remote->trace_buffer_size, remote->priv); 76 + if (IS_ERR(desc)) 77 + return PTR_ERR(desc); 78 + 79 + rb_remote->desc = desc; 80 + rb_remote->swap_reader_page = remote->cbs->swap_reader_page; 81 + rb_remote->priv = remote->priv; 82 + rb_remote->reset = remote->cbs->reset; 83 + remote->trace_buffer = ring_buffer_alloc_remote(rb_remote); 84 + if (!remote->trace_buffer) { 85 + remote->cbs->unload_trace_buffer(desc, remote->priv); 86 + return -ENOMEM; 87 + } 88 + 89 + remote->trace_buffer_desc = desc; 90 + 91 + return 0; 92 + } 93 + 94 + static void trace_remote_try_unload(struct trace_remote *remote) 95 + { 96 + lockdep_assert_held(&remote->lock); 97 + 98 + if (!trace_remote_loaded(remote)) 99 + return; 100 + 101 + /* The buffer is being read or writable */ 102 + if (remote->nr_readers || remote->tracing_on) 103 + return; 104 + 105 + /* The buffer has readable data */ 106 + if (!ring_buffer_empty(remote->trace_buffer)) 107 + return; 108 + 109 + ring_buffer_free(remote->trace_buffer); 110 + remote->trace_buffer = NULL; 111 + remote->cbs->unload_trace_buffer(remote->trace_buffer_desc, remote->priv); 112 + } 113 + 114 + static int trace_remote_enable_tracing(struct trace_remote *remote) 115 + { 116 + int ret; 117 + 118 + lockdep_assert_held(&remote->lock); 119 + 120 + if (remote->tracing_on) 121 + return 0; 122 + 123 + ret = trace_remote_load(remote); 124 + if (ret) 125 + return ret; 126 + 127 + ret = remote->cbs->enable_tracing(true, remote->priv); 128 + if (ret) { 129 + trace_remote_try_unload(remote); 130 + return ret; 131 + } 132 + 133 + remote->tracing_on = true; 134 + 135 + return 0; 136 + } 137 + 138 + static int trace_remote_disable_tracing(struct trace_remote *remote) 139 + { 140 + int ret; 141 + 142 + lockdep_assert_held(&remote->lock); 143 + 144 + if (!remote->tracing_on) 145 + return 0; 146 + 147 + ret = remote->cbs->enable_tracing(false, remote->priv); 148 + if (ret) 149 + return ret; 150 + 151 + 
ring_buffer_poll_remote(remote->trace_buffer, RING_BUFFER_ALL_CPUS); 152 + remote->tracing_on = false; 153 + trace_remote_try_unload(remote); 154 + 155 + return 0; 156 + } 157 + 158 + static void trace_remote_reset(struct trace_remote *remote, int cpu) 159 + { 160 + lockdep_assert_held(&remote->lock); 161 + 162 + if (!trace_remote_loaded(remote)) 163 + return; 164 + 165 + if (cpu == RING_BUFFER_ALL_CPUS) 166 + ring_buffer_reset(remote->trace_buffer); 167 + else 168 + ring_buffer_reset_cpu(remote->trace_buffer, cpu); 169 + 170 + trace_remote_try_unload(remote); 171 + } 172 + 173 + static ssize_t 174 + tracing_on_write(struct file *filp, const char __user *ubuf, size_t cnt, loff_t *ppos) 175 + { 176 + struct seq_file *seq = filp->private_data; 177 + struct trace_remote *remote = seq->private; 178 + unsigned long val; 179 + int ret; 180 + 181 + ret = kstrtoul_from_user(ubuf, cnt, 10, &val); 182 + if (ret) 183 + return ret; 184 + 185 + guard(mutex)(&remote->lock); 186 + 187 + ret = val ? trace_remote_enable_tracing(remote) : trace_remote_disable_tracing(remote); 188 + if (ret) 189 + return ret; 190 + 191 + return cnt; 192 + } 193 + static int tracing_on_show(struct seq_file *s, void *unused) 194 + { 195 + struct trace_remote *remote = s->private; 196 + 197 + seq_printf(s, "%d\n", remote->tracing_on); 198 + 199 + return 0; 200 + } 201 + DEFINE_SHOW_STORE_ATTRIBUTE(tracing_on); 202 + 203 + static ssize_t buffer_size_kb_write(struct file *filp, const char __user *ubuf, size_t cnt, 204 + loff_t *ppos) 205 + { 206 + struct seq_file *seq = filp->private_data; 207 + struct trace_remote *remote = seq->private; 208 + unsigned long val; 209 + int ret; 210 + 211 + ret = kstrtoul_from_user(ubuf, cnt, 10, &val); 212 + if (ret) 213 + return ret; 214 + 215 + /* KiB to Bytes */ 216 + if (!val || check_shl_overflow(val, 10, &val)) 217 + return -EINVAL; 218 + 219 + guard(mutex)(&remote->lock); 220 + 221 + if (trace_remote_loaded(remote)) 222 + return -EBUSY; 223 + 224 + 
remote->trace_buffer_size = val; 225 + 226 + return cnt; 227 + } 228 + 229 + static int buffer_size_kb_show(struct seq_file *s, void *unused) 230 + { 231 + struct trace_remote *remote = s->private; 232 + 233 + seq_printf(s, "%lu (%s)\n", remote->trace_buffer_size >> 10, 234 + trace_remote_loaded(remote) ? "loaded" : "unloaded"); 235 + 236 + return 0; 237 + } 238 + DEFINE_SHOW_STORE_ATTRIBUTE(buffer_size_kb); 239 + 240 + static int trace_remote_get(struct trace_remote *remote, int cpu) 241 + { 242 + int ret; 243 + 244 + if (remote->nr_readers == UINT_MAX) 245 + return -EBUSY; 246 + 247 + ret = trace_remote_load(remote); 248 + if (ret) 249 + return ret; 250 + 251 + if (cpu != RING_BUFFER_ALL_CPUS && !remote->pcpu_reader_locks) { 252 + int lock_cpu; 253 + 254 + remote->pcpu_reader_locks = kcalloc(nr_cpu_ids, sizeof(*remote->pcpu_reader_locks), 255 + GFP_KERNEL); 256 + if (!remote->pcpu_reader_locks) { 257 + trace_remote_try_unload(remote); 258 + return -ENOMEM; 259 + } 260 + 261 + for_each_possible_cpu(lock_cpu) 262 + init_rwsem(&remote->pcpu_reader_locks[lock_cpu]); 263 + } 264 + 265 + remote->nr_readers++; 266 + 267 + return 0; 268 + } 269 + 270 + static void trace_remote_put(struct trace_remote *remote) 271 + { 272 + if (WARN_ON(!remote->nr_readers)) 273 + return; 274 + 275 + remote->nr_readers--; 276 + if (remote->nr_readers) 277 + return; 278 + 279 + kfree(remote->pcpu_reader_locks); 280 + remote->pcpu_reader_locks = NULL; 281 + 282 + trace_remote_try_unload(remote); 283 + } 284 + 285 + static void __poll_remote(struct work_struct *work) 286 + { 287 + struct delayed_work *dwork = to_delayed_work(work); 288 + struct trace_remote_iterator *iter; 289 + 290 + iter = container_of(dwork, struct trace_remote_iterator, poll_work); 291 + ring_buffer_poll_remote(iter->remote->trace_buffer, iter->cpu); 292 + schedule_delayed_work((struct delayed_work *)work, 293 + msecs_to_jiffies(iter->remote->poll_ms)); 294 + } 295 + 296 + static void __free_ring_buffer_iter(struct 
trace_remote_iterator *iter, int cpu) 297 + { 298 + if (cpu != RING_BUFFER_ALL_CPUS) { 299 + ring_buffer_read_finish(iter->rb_iter); 300 + return; 301 + } 302 + 303 + for_each_possible_cpu(cpu) { 304 + if (iter->rb_iters[cpu]) 305 + ring_buffer_read_finish(iter->rb_iters[cpu]); 306 + } 307 + 308 + kfree(iter->rb_iters); 309 + } 310 + 311 + static int __alloc_ring_buffer_iter(struct trace_remote_iterator *iter, int cpu) 312 + { 313 + if (cpu != RING_BUFFER_ALL_CPUS) { 314 + iter->rb_iter = ring_buffer_read_start(iter->remote->trace_buffer, cpu, GFP_KERNEL); 315 + 316 + return iter->rb_iter ? 0 : -ENOMEM; 317 + } 318 + 319 + iter->rb_iters = kcalloc(nr_cpu_ids, sizeof(*iter->rb_iters), GFP_KERNEL); 320 + if (!iter->rb_iters) 321 + return -ENOMEM; 322 + 323 + for_each_possible_cpu(cpu) { 324 + iter->rb_iters[cpu] = ring_buffer_read_start(iter->remote->trace_buffer, cpu, 325 + GFP_KERNEL); 326 + if (!iter->rb_iters[cpu]) { 327 + __free_ring_buffer_iter(iter, RING_BUFFER_ALL_CPUS); 328 + return -ENOMEM; 329 + } 330 + } 331 + 332 + return 0; 333 + } 334 + 335 + static struct trace_remote_iterator 336 + *trace_remote_iter(struct trace_remote *remote, int cpu, enum tri_type type) 337 + { 338 + struct trace_remote_iterator *iter = NULL; 339 + int ret; 340 + 341 + lockdep_assert_held(&remote->lock); 342 + 343 + if (type == TRI_NONCONSUMING && !trace_remote_loaded(remote)) 344 + return NULL; 345 + 346 + ret = trace_remote_get(remote, cpu); 347 + if (ret) 348 + return ERR_PTR(ret); 349 + 350 + /* Test the CPU */ 351 + ret = ring_buffer_poll_remote(remote->trace_buffer, cpu); 352 + if (ret) 353 + goto err; 354 + 355 + iter = kzalloc_obj(*iter); 356 + if (iter) { 357 + iter->remote = remote; 358 + iter->cpu = cpu; 359 + iter->type = type; 360 + trace_seq_init(&iter->seq); 361 + 362 + switch (type) { 363 + case TRI_CONSUMING: 364 + INIT_DELAYED_WORK(&iter->poll_work, __poll_remote); 365 + schedule_delayed_work(&iter->poll_work, msecs_to_jiffies(remote->poll_ms)); 366 + break; 367 
+		case TRI_NONCONSUMING:
+			ret = __alloc_ring_buffer_iter(iter, cpu);
+			break;
+		}
+
+		if (ret)
+			goto err;
+
+		return iter;
+	}
+	ret = -ENOMEM;
+
+err:
+	kfree(iter);
+	trace_remote_put(remote);
+
+	return ERR_PTR(ret);
+}
+
+static void trace_remote_iter_free(struct trace_remote_iterator *iter)
+{
+	struct trace_remote *remote;
+
+	if (!iter)
+		return;
+
+	remote = iter->remote;
+
+	lockdep_assert_held(&remote->lock);
+
+	switch (iter->type) {
+	case TRI_CONSUMING:
+		cancel_delayed_work_sync(&iter->poll_work);
+		break;
+	case TRI_NONCONSUMING:
+		__free_ring_buffer_iter(iter, iter->cpu);
+		break;
+	}
+
+	kfree(iter);
+	trace_remote_put(remote);
+}
+
+static void trace_remote_iter_read_start(struct trace_remote_iterator *iter)
+{
+	struct trace_remote *remote = iter->remote;
+	int cpu = iter->cpu;
+
+	/* Acquire global reader lock */
+	if (cpu == RING_BUFFER_ALL_CPUS && iter->type == TRI_CONSUMING)
+		down_write(&remote->reader_lock);
+	else
+		down_read(&remote->reader_lock);
+
+	if (cpu == RING_BUFFER_ALL_CPUS)
+		return;
+
+	/*
+	 * No need for the remote lock here, iter holds a reference on
+	 * remote->nr_readers
+	 */
+
+	/* Get the per-CPU one */
+	if (WARN_ON_ONCE(!remote->pcpu_reader_locks))
+		return;
+
+	if (iter->type == TRI_CONSUMING)
+		down_write(&remote->pcpu_reader_locks[cpu]);
+	else
+		down_read(&remote->pcpu_reader_locks[cpu]);
+}
+
+static void trace_remote_iter_read_finished(struct trace_remote_iterator *iter)
+{
+	struct trace_remote *remote = iter->remote;
+	int cpu = iter->cpu;
+
+	/* Release per-CPU reader lock */
+	if (cpu != RING_BUFFER_ALL_CPUS) {
+		/*
+		 * No need for the remote lock here, iter holds a reference on
+		 * remote->nr_readers
+		 */
+		if (iter->type == TRI_CONSUMING)
+			up_write(&remote->pcpu_reader_locks[cpu]);
+		else
+			up_read(&remote->pcpu_reader_locks[cpu]);
+	}
+
+	/* Release global reader lock */
+	if (cpu == RING_BUFFER_ALL_CPUS && iter->type == TRI_CONSUMING)
+		up_write(&remote->reader_lock);
+	else
+		up_read(&remote->reader_lock);
+}
+
+static struct ring_buffer_iter *__get_rb_iter(struct trace_remote_iterator *iter, int cpu)
+{
+	return iter->cpu != RING_BUFFER_ALL_CPUS ? iter->rb_iter : iter->rb_iters[cpu];
+}
+
+static struct ring_buffer_event *
+__peek_event(struct trace_remote_iterator *iter, int cpu, u64 *ts, unsigned long *lost_events)
+{
+	struct ring_buffer_event *rb_evt;
+	struct ring_buffer_iter *rb_iter;
+
+	switch (iter->type) {
+	case TRI_CONSUMING:
+		return ring_buffer_peek(iter->remote->trace_buffer, cpu, ts, lost_events);
+	case TRI_NONCONSUMING:
+		rb_iter = __get_rb_iter(iter, cpu);
+		rb_evt = ring_buffer_iter_peek(rb_iter, ts);
+		if (!rb_evt)
+			return NULL;
+
+		*lost_events = ring_buffer_iter_dropped(rb_iter);
+
+		return rb_evt;
+	}
+
+	return NULL;
+}
+
+static bool trace_remote_iter_read_event(struct trace_remote_iterator *iter)
+{
+	struct trace_buffer *trace_buffer = iter->remote->trace_buffer;
+	struct ring_buffer_event *rb_evt;
+	int cpu = iter->cpu;
+
+	if (cpu != RING_BUFFER_ALL_CPUS) {
+		if (ring_buffer_empty_cpu(trace_buffer, cpu))
+			return false;
+
+		rb_evt = __peek_event(iter, cpu, &iter->ts, &iter->lost_events);
+		if (!rb_evt)
+			return false;
+
+		iter->evt_cpu = cpu;
+		iter->evt = ring_buffer_event_data(rb_evt);
+		return true;
+	}
+
+	iter->ts = U64_MAX;
+	for_each_possible_cpu(cpu) {
+		unsigned long lost_events;
+		u64 ts;
+
+		if (ring_buffer_empty_cpu(trace_buffer, cpu))
+			continue;
+
+		rb_evt = __peek_event(iter, cpu, &ts, &lost_events);
+		if (!rb_evt)
+			continue;
+
+		if (ts >= iter->ts)
+			continue;
+
+		iter->ts = ts;
+		iter->evt_cpu = cpu;
+		iter->evt = ring_buffer_event_data(rb_evt);
+		iter->lost_events = lost_events;
+	}
+
+	return iter->ts != U64_MAX;
+}
+
+static void trace_remote_iter_move(struct trace_remote_iterator *iter)
+{
+	struct trace_buffer *trace_buffer = iter->remote->trace_buffer;
+
+	switch (iter->type) {
+	case TRI_CONSUMING:
+		ring_buffer_consume(trace_buffer, iter->evt_cpu, NULL, NULL);
+		break;
+	case TRI_NONCONSUMING:
+		ring_buffer_iter_advance(__get_rb_iter(iter, iter->evt_cpu));
+		break;
+	}
+}
+
+static struct remote_event *trace_remote_find_event(struct trace_remote *remote, unsigned short id);
+
+static int trace_remote_iter_print_event(struct trace_remote_iterator *iter)
+{
+	struct remote_event *evt;
+	unsigned long usecs_rem;
+	u64 ts = iter->ts;
+
+	if (iter->lost_events)
+		trace_seq_printf(&iter->seq, "CPU:%d [LOST %lu EVENTS]\n",
+				 iter->evt_cpu, iter->lost_events);
+
+	do_div(ts, 1000);
+	usecs_rem = do_div(ts, USEC_PER_SEC);
+
+	trace_seq_printf(&iter->seq, "[%03d]\t%5llu.%06lu: ", iter->evt_cpu,
+			 ts, usecs_rem);
+
+	evt = trace_remote_find_event(iter->remote, iter->evt->id);
+	if (!evt)
+		trace_seq_printf(&iter->seq, "UNKNOWN id=%d\n", iter->evt->id);
+	else
+		evt->print(iter->evt, &iter->seq);
+
+	return trace_seq_has_overflowed(&iter->seq) ? -EOVERFLOW : 0;
+}
+
+static int trace_pipe_open(struct inode *inode, struct file *filp)
+{
+	struct trace_remote *remote = inode->i_private;
+	struct trace_remote_iterator *iter;
+	int cpu = tracing_get_cpu(inode);
+
+	guard(mutex)(&remote->lock);
+
+	iter = trace_remote_iter(remote, cpu, TRI_CONSUMING);
+	if (IS_ERR(iter))
+		return PTR_ERR(iter);
+
+	filp->private_data = iter;
+
+	return IS_ERR(iter) ? PTR_ERR(iter) : 0;
+}
+
+static int trace_pipe_release(struct inode *inode, struct file *filp)
+{
+	struct trace_remote_iterator *iter = filp->private_data;
+	struct trace_remote *remote = iter->remote;
+
+	guard(mutex)(&remote->lock);
+
+	trace_remote_iter_free(iter);
+
+	return 0;
+}
+
+static ssize_t trace_pipe_read(struct file *filp, char __user *ubuf, size_t cnt, loff_t *ppos)
+{
+	struct trace_remote_iterator *iter = filp->private_data;
+	struct trace_buffer *trace_buffer = iter->remote->trace_buffer;
+	int ret;
+
+copy_to_user:
+	ret = trace_seq_to_user(&iter->seq, ubuf, cnt);
+	if (ret != -EBUSY)
+		return ret;
+
+	trace_seq_init(&iter->seq);
+
+	ret = ring_buffer_wait(trace_buffer, iter->cpu, 0, NULL, NULL);
+	if (ret < 0)
+		return ret;
+
+	trace_remote_iter_read_start(iter);
+
+	while (trace_remote_iter_read_event(iter)) {
+		int prev_len = iter->seq.seq.len;
+
+		if (trace_remote_iter_print_event(iter)) {
+			iter->seq.seq.len = prev_len;
+			break;
+		}
+
+		trace_remote_iter_move(iter);
+	}
+
+	trace_remote_iter_read_finished(iter);
+
+	goto copy_to_user;
+}
+
+static const struct file_operations trace_pipe_fops = {
+	.open = trace_pipe_open,
+	.read = trace_pipe_read,
+	.release = trace_pipe_release,
+};
+
+static void *trace_next(struct seq_file *m, void *v, loff_t *pos)
+{
+	struct trace_remote_iterator *iter = m->private;
+
+	++*pos;
+
+	if (!iter || !trace_remote_iter_read_event(iter))
+		return NULL;
+
+	trace_remote_iter_move(iter);
+	iter->pos++;
+
+	return iter;
+}
+
+static void *trace_start(struct seq_file *m, loff_t *pos)
+{
+	struct trace_remote_iterator *iter = m->private;
+	loff_t i;
+
+	if (!iter)
+		return NULL;
+
+	trace_remote_iter_read_start(iter);
+
+	if (!*pos) {
+		iter->pos = -1;
+		return trace_next(m, NULL, &i);
+	}
+
+	i = iter->pos;
+	while (i < *pos) {
+		iter = trace_next(m, NULL, &i);
+		if (!iter)
+			return NULL;
+	}
+
+	return iter;
+}
+
+static int trace_show(struct seq_file *m, void *v)
+{
+	struct trace_remote_iterator *iter = v;
+
+	trace_seq_init(&iter->seq);
+
+	if (trace_remote_iter_print_event(iter)) {
+		seq_printf(m, "[EVENT %d PRINT TOO BIG]\n", iter->evt->id);
+		return 0;
+	}
+
+	return trace_print_seq(m, &iter->seq);
+}
+
+static void trace_stop(struct seq_file *m, void *v)
+{
+	struct trace_remote_iterator *iter = m->private;
+
+	if (iter)
+		trace_remote_iter_read_finished(iter);
+}
+
+static const struct seq_operations trace_sops = {
+	.start = trace_start,
+	.next = trace_next,
+	.show = trace_show,
+	.stop = trace_stop,
+};
+
+static int trace_open(struct inode *inode, struct file *filp)
+{
+	struct trace_remote *remote = inode->i_private;
+	struct trace_remote_iterator *iter = NULL;
+	int cpu = tracing_get_cpu(inode);
+	int ret;
+
+	if (!(filp->f_mode & FMODE_READ))
+		return 0;
+
+	guard(mutex)(&remote->lock);
+
+	iter = trace_remote_iter(remote, cpu, TRI_NONCONSUMING);
+	if (IS_ERR(iter))
+		return PTR_ERR(iter);
+
+	ret = seq_open(filp, &trace_sops);
+	if (ret) {
+		trace_remote_iter_free(iter);
+		return ret;
+	}
+
+	((struct seq_file *)filp->private_data)->private = (void *)iter;
+
+	return 0;
+}
+
+static int trace_release(struct inode *inode, struct file *filp)
+{
+	struct trace_remote_iterator *iter;
+
+	if (!(filp->f_mode & FMODE_READ))
+		return 0;
+
+	iter = ((struct seq_file *)filp->private_data)->private;
+	seq_release(inode, filp);
+
+	if (!iter)
+		return 0;
+
+	guard(mutex)(&iter->remote->lock);
+
+	trace_remote_iter_free(iter);
+
+	return 0;
+}
+
+static ssize_t trace_write(struct file *filp, const char __user *ubuf, size_t cnt, loff_t *ppos)
+{
+	struct inode *inode = file_inode(filp);
+	struct trace_remote *remote = inode->i_private;
+	int cpu = tracing_get_cpu(inode);
+
+	guard(mutex)(&remote->lock);
+
+	trace_remote_reset(remote, cpu);
+
+	return cnt;
+}
+
+static const struct file_operations trace_fops = {
+	.open = trace_open,
+	.write = trace_write,
+	.read = seq_read,
+	.read_iter = seq_read_iter,
+	.release = trace_release,
+};
+
+static int trace_remote_init_tracefs(const char *name, struct trace_remote *remote)
+{
+	struct dentry *remote_d, *percpu_d, *d;
+	static struct dentry *root;
+	static DEFINE_MUTEX(lock);
+	bool root_inited = false;
+	int cpu;
+
+	guard(mutex)(&lock);
+
+	if (!root) {
+		root = tracefs_create_dir(TRACEFS_DIR, NULL);
+		if (!root) {
+			pr_err("Failed to create tracefs dir "TRACEFS_DIR"\n");
+			return -ENOMEM;
+		}
+		root_inited = true;
+	}
+
+	remote_d = tracefs_create_dir(name, root);
+	if (!remote_d) {
+		pr_err("Failed to create tracefs dir "TRACEFS_DIR"%s/\n", name);
+		goto err;
+	}
+
+	d = trace_create_file("tracing_on", TRACEFS_MODE_WRITE, remote_d, remote, &tracing_on_fops);
+	if (!d)
+		goto err;
+
+	d = trace_create_file("buffer_size_kb", TRACEFS_MODE_WRITE, remote_d, remote,
+			      &buffer_size_kb_fops);
+	if (!d)
+		goto err;
+
+	d = trace_create_file("trace_pipe", TRACEFS_MODE_READ, remote_d, remote, &trace_pipe_fops);
+	if (!d)
+		goto err;
+
+	d = trace_create_file("trace", TRACEFS_MODE_WRITE, remote_d, remote, &trace_fops);
+	if (!d)
+		goto err;
+
+	percpu_d = tracefs_create_dir("per_cpu", remote_d);
+	if (!percpu_d) {
+		pr_err("Failed to create tracefs dir "TRACEFS_DIR"%s/per_cpu/\n", name);
+		goto err;
+	}
+
+	for_each_possible_cpu(cpu) {
+		struct dentry *cpu_d;
+		char cpu_name[16];
+
+		snprintf(cpu_name, sizeof(cpu_name), "cpu%d", cpu);
+		cpu_d = tracefs_create_dir(cpu_name, percpu_d);
+		if (!cpu_d) {
+			pr_err("Failed to create tracefs dir "TRACEFS_DIR"%s/percpu/cpu%d\n",
+			       name, cpu);
+			goto err;
+		}
+
+		d = trace_create_cpu_file("trace_pipe", TRACEFS_MODE_READ, cpu_d, remote, cpu,
+					  &trace_pipe_fops);
+		if (!d)
+			goto err;
+
+		d = trace_create_cpu_file("trace", TRACEFS_MODE_WRITE, cpu_d, remote, cpu,
+					  &trace_fops);
+		if (!d)
+			goto err;
+	}
+
+	remote->dentry = remote_d;
+
+	return 0;
+
+err:
+	if (root_inited) {
+		tracefs_remove(root);
+		root = NULL;
+	} else {
+		tracefs_remove(remote_d);
+	}
+
+	return -ENOMEM;
+}
+
+static int trace_remote_register_events(const char *remote_name, struct trace_remote *remote,
+					struct remote_event *events, size_t nr_events);
+
+/**
+ * trace_remote_register() - Register a Tracefs remote
+ * @name: Name of the remote, used for the Tracefs remotes/ directory.
+ * @cbs: Set of callbacks used to control the remote.
+ * @priv: Private data, passed to each callback from @cbs.
+ * @events: Array of events. &remote_event.name and &remote_event.id must be
+ *          filled by the caller.
+ * @nr_events: Number of events in the @events array.
+ *
+ * A trace remote is an entity, outside of the kernel (most likely firmware or
+ * hypervisor) capable of writing events into a Tracefs compatible ring-buffer.
+ * The kernel would then act as a reader.
+ *
+ * The registered remote will be found under the Tracefs directory
+ * remotes/<name>.
+ *
+ * Return: 0 on success, negative error code on failure.
+ */
+int trace_remote_register(const char *name, struct trace_remote_callbacks *cbs, void *priv,
+			  struct remote_event *events, size_t nr_events)
+{
+	struct trace_remote *remote;
+	int ret;
+
+	remote = kzalloc_obj(*remote);
+	if (!remote)
+		return -ENOMEM;
+
+	remote->cbs = cbs;
+	remote->priv = priv;
+	remote->trace_buffer_size = 7 << 10;
+	remote->poll_ms = 100;
+	mutex_init(&remote->lock);
+	init_rwsem(&remote->reader_lock);
+
+	if (trace_remote_init_tracefs(name, remote)) {
+		kfree(remote);
+		return -ENOMEM;
+	}
+
+	ret = trace_remote_register_events(name, remote, events, nr_events);
+	if (ret) {
+		pr_err("Failed to register events for trace remote '%s' (%d)\n",
+		       name, ret);
+		return ret;
+	}
+
+	ret = cbs->init ? cbs->init(remote->dentry, priv) : 0;
+	if (ret)
+		pr_err("Init failed for trace remote '%s' (%d)\n", name, ret);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(trace_remote_register);
+
+/**
+ * trace_remote_free_buffer() - Free trace buffer allocated with trace_remote_alloc_buffer()
+ * @desc: Descriptor of the per-CPU ring-buffers, originally filled by
+ *        trace_remote_alloc_buffer()
+ *
+ * Most likely called from &trace_remote_callbacks.unload_trace_buffer.
+ */
+void trace_remote_free_buffer(struct trace_buffer_desc *desc)
+{
+	struct ring_buffer_desc *rb_desc;
+	int cpu;
+
+	for_each_ring_buffer_desc(rb_desc, cpu, desc) {
+		unsigned int id;
+
+		free_page(rb_desc->meta_va);
+
+		for (id = 0; id < rb_desc->nr_page_va; id++)
+			free_page(rb_desc->page_va[id]);
+	}
+}
+EXPORT_SYMBOL_GPL(trace_remote_free_buffer);
+
+/**
+ * trace_remote_alloc_buffer() - Dynamically allocate a trace buffer
+ * @desc: Uninitialized trace_buffer_desc
+ * @desc_size: Size of the trace_buffer_desc. Must be at least equal to
+ *             trace_buffer_desc_size()
+ * @buffer_size: Size in bytes of each per-CPU ring-buffer
+ * @cpumask: CPUs to allocate a ring-buffer for
+ *
+ * Helper to dynamically allocate a set of pages (enough to cover @buffer_size)
+ * for each CPU from @cpumask and fill @desc. Most likely called from
+ * &trace_remote_callbacks.load_trace_buffer.
+ *
+ * Return: 0 on success, negative error code on failure.
+ */
+int trace_remote_alloc_buffer(struct trace_buffer_desc *desc, size_t desc_size, size_t buffer_size,
+			      const struct cpumask *cpumask)
+{
+	unsigned int nr_pages = max(DIV_ROUND_UP(buffer_size, PAGE_SIZE), 2UL) + 1;
+	void *desc_end = desc + desc_size;
+	struct ring_buffer_desc *rb_desc;
+	int cpu, ret = -ENOMEM;
+
+	if (desc_size < struct_size(desc, __data, 0))
+		return -EINVAL;
+
+	desc->nr_cpus = 0;
+	desc->struct_len = struct_size(desc, __data, 0);
+
+	rb_desc = (struct ring_buffer_desc *)&desc->__data[0];
+
+	for_each_cpu(cpu, cpumask) {
+		unsigned int id;
+
+		if ((void *)rb_desc + struct_size(rb_desc, page_va, nr_pages) > desc_end) {
+			ret = -EINVAL;
+			goto err;
+		}
+
+		rb_desc->cpu = cpu;
+		rb_desc->nr_page_va = 0;
+		rb_desc->meta_va = (unsigned long)__get_free_page(GFP_KERNEL);
+		if (!rb_desc->meta_va)
+			goto err;
+
+		for (id = 0; id < nr_pages; id++) {
+			rb_desc->page_va[id] = (unsigned long)__get_free_page(GFP_KERNEL);
+			if (!rb_desc->page_va[id])
+				goto err;
+
+			rb_desc->nr_page_va++;
+		}
+		desc->nr_cpus++;
+		desc->struct_len += offsetof(struct ring_buffer_desc, page_va);
+		desc->struct_len += struct_size(rb_desc, page_va, rb_desc->nr_page_va);
+		rb_desc = __next_ring_buffer_desc(rb_desc);
+	}
+
+	return 0;
+
+err:
+	trace_remote_free_buffer(desc);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(trace_remote_alloc_buffer);
+
+static int
+trace_remote_enable_event(struct trace_remote *remote, struct remote_event *evt, bool enable)
+{
+	int ret;
+
+	lockdep_assert_held(&remote->lock);
+
+	if (evt->enabled == enable)
+		return 0;
+
+	ret = remote->cbs->enable_event(evt->id, enable, remote->priv);
+	if (ret)
+		return ret;
+
+	evt->enabled = enable;
+
+	return 0;
+}
+
+static int remote_event_enable_show(struct seq_file *s, void *unused)
+{
+	struct remote_event *evt = s->private;
+
+	seq_printf(s, "%d\n", evt->enabled);
+
+	return 0;
+}
+
+static ssize_t remote_event_enable_write(struct file *filp, const char __user *ubuf,
+					 size_t count, loff_t *ppos)
+{
+	struct seq_file *seq = filp->private_data;
+	struct remote_event *evt = seq->private;
+	struct trace_remote *remote = evt->remote;
+	u8 enable;
+	int ret;
+
+	ret = kstrtou8_from_user(ubuf, count, 10, &enable);
+	if (ret)
+		return ret;
+
+	guard(mutex)(&remote->lock);
+
+	ret = trace_remote_enable_event(remote, evt, enable);
+	if (ret)
+		return ret;
+
+	return count;
+}
+DEFINE_SHOW_STORE_ATTRIBUTE(remote_event_enable);
+
+static int remote_event_id_show(struct seq_file *s, void *unused)
+{
+	struct remote_event *evt = s->private;
+
+	seq_printf(s, "%d\n", evt->id);
+
+	return 0;
+}
+DEFINE_SHOW_ATTRIBUTE(remote_event_id);
+
+static int remote_event_format_show(struct seq_file *s, void *unused)
+{
+	size_t offset = sizeof(struct remote_event_hdr);
+	struct remote_event *evt = s->private;
+	struct trace_event_fields *field;
+
+	seq_printf(s, "name: %s\n", evt->name);
+	seq_printf(s, "ID: %d\n", evt->id);
+	seq_puts(s,
+		 "format:\n\tfield:unsigned short common_type;\toffset:0;\tsize:2;\tsigned:0;\n\n");
+
+	field = &evt->fields[0];
+	while (field->name) {
+		seq_printf(s, "\tfield:%s %s;\toffset:%zu;\tsize:%u;\tsigned:%d;\n",
+			   field->type, field->name, offset, field->size,
+			   field->is_signed);
+		offset += field->size;
+		field++;
+	}
+
+	if (field != &evt->fields[0])
+		seq_puts(s, "\n");
+
+	seq_printf(s, "print fmt: %s\n", evt->print_fmt);
+
+	return 0;
+}
+DEFINE_SHOW_ATTRIBUTE(remote_event_format);
+
+static int remote_event_callback(const char *name, umode_t *mode, void **data,
+				 const struct file_operations **fops)
+{
+	if (!strcmp(name, "enable")) {
+		*mode = TRACEFS_MODE_WRITE;
+		*fops = &remote_event_enable_fops;
+		return 1;
+	}
+
+	if (!strcmp(name, "id")) {
+		*mode = TRACEFS_MODE_READ;
+		*fops = &remote_event_id_fops;
+		return 1;
+	}
+
+	if (!strcmp(name, "format")) {
+		*mode = TRACEFS_MODE_READ;
+		*fops = &remote_event_format_fops;
+		return 1;
+	}
+
+	return 0;
+}
+
+static ssize_t remote_events_dir_enable_write(struct file *filp, const char __user *ubuf,
+					      size_t count, loff_t *ppos)
+{
+	struct trace_remote *remote = file_inode(filp)->i_private;
+	int i, ret;
+	u8 enable;
+
+	ret = kstrtou8_from_user(ubuf, count, 10, &enable);
+	if (ret)
+		return ret;
+
+	guard(mutex)(&remote->lock);
+
+	for (i = 0; i < remote->nr_events; i++) {
+		struct remote_event *evt = &remote->events[i];
+
+		trace_remote_enable_event(remote, evt, enable);
+	}
+
+	return count;
+}
+
+static ssize_t remote_events_dir_enable_read(struct file *filp, char __user *ubuf, size_t cnt,
+					     loff_t *ppos)
+{
+	struct trace_remote *remote = file_inode(filp)->i_private;
+	const char enabled_char[] = {'0', '1', 'X'};
+	char enabled_str[] = " \n";
+	int i, enabled = -1;
+
+	guard(mutex)(&remote->lock);
+
+	for (i = 0; i < remote->nr_events; i++) {
+		struct remote_event *evt = &remote->events[i];
+
+		if (enabled == -1) {
+			enabled = evt->enabled;
+		} else if (enabled != evt->enabled) {
+			enabled = 2;
+			break;
+		}
+	}
+
+	enabled_str[0] = enabled_char[enabled == -1 ? 0 : enabled];
+
+	return simple_read_from_buffer(ubuf, cnt, ppos, enabled_str, 2);
+}
+
+static const struct file_operations remote_events_dir_enable_fops = {
+	.write = remote_events_dir_enable_write,
+	.read = remote_events_dir_enable_read,
+};
+
+static ssize_t
+remote_events_dir_header_page_read(struct file *filp, char __user *ubuf, size_t cnt, loff_t *ppos)
+{
+	struct trace_seq *s;
+	int ret;
+
+	s = kmalloc(sizeof(*s), GFP_KERNEL);
+	if (!s)
+		return -ENOMEM;
+
+	trace_seq_init(s);
+
+	ring_buffer_print_page_header(NULL, s);
+	ret = simple_read_from_buffer(ubuf, cnt, ppos, s->buffer, trace_seq_used(s));
+	kfree(s);
+
+	return ret;
+}
+
+static const struct file_operations remote_events_dir_header_page_fops = {
+	.read = remote_events_dir_header_page_read,
+};
+
+static ssize_t
+remote_events_dir_header_event_read(struct file *filp, char __user *ubuf, size_t cnt, loff_t *ppos)
+{
+	struct trace_seq *s;
+	int ret;
+
+	s = kmalloc(sizeof(*s), GFP_KERNEL);
+	if (!s)
+		return -ENOMEM;
+
+	trace_seq_init(s);
+
+	ring_buffer_print_entry_header(s);
+	ret = simple_read_from_buffer(ubuf, cnt, ppos, s->buffer, trace_seq_used(s));
+	kfree(s);
+
+	return ret;
+}
+
+static const struct file_operations remote_events_dir_header_event_fops = {
+	.read = remote_events_dir_header_event_read,
+};
+
+static int remote_events_dir_callback(const char *name, umode_t *mode, void **data,
+				      const struct file_operations **fops)
+{
+	if (!strcmp(name, "enable")) {
+		*mode = TRACEFS_MODE_WRITE;
+		*fops = &remote_events_dir_enable_fops;
+		return 1;
+	}
+
+	if (!strcmp(name, "header_page")) {
+		*mode = TRACEFS_MODE_READ;
+		*fops = &remote_events_dir_header_page_fops;
+		return 1;
+	}
+
+	if (!strcmp(name, "header_event")) {
+		*mode = TRACEFS_MODE_READ;
+		*fops = &remote_events_dir_header_event_fops;
+		return 1;
+	}
+
+	return 0;
+}
+
+static int trace_remote_init_eventfs(const char *remote_name, struct trace_remote *remote,
+				     struct remote_event *evt)
+{
+	struct eventfs_inode *eventfs = remote->eventfs;
+	static struct eventfs_entry dir_entries[] = {
+		{
+			.name = "enable",
+			.callback = remote_events_dir_callback,
+		}, {
+			.name = "header_page",
+			.callback = remote_events_dir_callback,
+		}, {
+			.name = "header_event",
+			.callback = remote_events_dir_callback,
+		}
+	};
+	static struct eventfs_entry entries[] = {
+		{
+			.name = "enable",
+			.callback = remote_event_callback,
+		}, {
+			.name = "id",
+			.callback = remote_event_callback,
+		}, {
+			.name = "format",
+			.callback = remote_event_callback,
+		}
+	};
+	bool eventfs_create = false;
+
+	if (!eventfs) {
+		eventfs = eventfs_create_events_dir("events", remote->dentry, dir_entries,
+						    ARRAY_SIZE(dir_entries), remote);
+		if (IS_ERR(eventfs))
+			return PTR_ERR(eventfs);
+
+		/*
+		 * Create similar hierarchy as local events even if a single system is supported at
+		 * the moment
+		 */
+		eventfs = eventfs_create_dir(remote_name, eventfs, NULL, 0, NULL);
+		if (IS_ERR(eventfs))
+			return PTR_ERR(eventfs);
+
+		remote->eventfs = eventfs;
+		eventfs_create = true;
+	}
+
+	eventfs = eventfs_create_dir(evt->name, eventfs, entries, ARRAY_SIZE(entries), evt);
+	if (IS_ERR(eventfs)) {
+		if (eventfs_create) {
+			eventfs_remove_events_dir(remote->eventfs);
+			remote->eventfs = NULL;
+		}
+		return PTR_ERR(eventfs);
+	}
+
+	return 0;
+}
+
+static int trace_remote_attach_events(struct trace_remote *remote, struct remote_event *events,
+				      size_t nr_events)
+{
+	int i;
+
+	for (i = 0; i < nr_events; i++) {
+		struct remote_event *evt = &events[i];
+
+		if (evt->remote)
+			return -EEXIST;
+
+		evt->remote = remote;
+
+		/* We need events to be sorted for efficient lookup */
+		if (i && evt->id <= events[i - 1].id)
+			return -EINVAL;
+	}
+
+	remote->events = events;
+	remote->nr_events = nr_events;
+
+	return 0;
+}
+
+static int trace_remote_register_events(const char *remote_name, struct trace_remote *remote,
+					struct remote_event *events, size_t nr_events)
+{
+	int i, ret;
+
+	ret = trace_remote_attach_events(remote, events, nr_events);
+	if (ret)
+		return ret;
+
+	for (i = 0; i < nr_events; i++) {
+		struct remote_event *evt = &events[i];
+
+		ret = trace_remote_init_eventfs(remote_name, remote, evt);
+		if (ret)
+			pr_warn("Failed to init eventfs for event '%s' (%d)",
+				evt->name, ret);
+	}
+
+	return 0;
+}
+
+static int __cmp_events(const void *key, const void *data)
+{
+	const struct remote_event *evt = data;
+	int id = (int)((long)key);
+
+	return id - (int)evt->id;
+}
+
+static struct remote_event *trace_remote_find_event(struct trace_remote *remote, unsigned short id)
+{
+	return bsearch((const void *)(unsigned long)id, remote->events, remote->nr_events,
+		       sizeof(*remote->events), __cmp_events);
+}
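The event lookup at the end of this file relies on the ids being strictly increasing, which trace_remote_attach_events() enforces, so that bsearch() can be used with the id passed by value in the key pointer. Below is a minimal userspace sketch of that pattern, not the kernel code itself: `struct demo_event`, `demo_events`, `demo_cmp_events()`, `demo_events_sorted()` and `demo_find_event()` are hypothetical names introduced only for illustration, and the C library's bsearch() stands in for the kernel helper of the same name.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, trimmed-down stand-in for the kernel's struct remote_event. */
struct demo_event {
	unsigned short id;
	const char *name;
};

/* Illustrative table; ids must be strictly increasing for bsearch() to work. */
const struct demo_event demo_events[] = {
	{ 1, "alpha" },
	{ 4, "beta" },
	{ 9, "gamma" },
};

/* Comparator: the key pointer carries the id itself, not the address of an id. */
int demo_cmp_events(const void *key, const void *data)
{
	const struct demo_event *evt = data;
	int id = (int)(long)key;

	return id - (int)evt->id;
}

/* Same ordering rule as the attach step: every id greater than the previous one. */
int demo_events_sorted(const struct demo_event *events, size_t nr_events)
{
	size_t i;

	for (i = 1; i < nr_events; i++) {
		if (events[i].id <= events[i - 1].id)
			return 0;
	}

	return 1;
}

/* Binary search by id; returns NULL when no event has that id. */
const struct demo_event *demo_find_event(const struct demo_event *events,
					 size_t nr_events, unsigned short id)
{
	return bsearch((const void *)(unsigned long)id, events, nr_events,
		       sizeof(*events), demo_cmp_events);
}
```

With this table, `demo_find_event(demo_events, 3, 4)` returns the "beta" entry, and an id that is absent from the table yields NULL, which corresponds to the "UNKNOWN id=" branch in trace_remote_iter_print_event(). Packing the id into the key pointer avoids building a temporary event just to search for it.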
+25
tools/testing/selftests/ftrace/test.d/remotes/buffer_size.tc
···
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0
+# description: Test trace remote buffer size
+# requires: remotes/test
+
+. $TEST_DIR/remotes/functions
+
+test_buffer_size()
+{
+	echo 0 > tracing_on
+	assert_unloaded
+
+	echo 4096 > buffer_size_kb
+	echo 1 > tracing_on
+	assert_loaded
+
+	echo 0 > tracing_on
+	echo 7 > buffer_size_kb
+}
+
+if [ -z "$SOURCE_REMOTE_TEST" ]; then
+	set -e
+	setup_remote_test
+	test_buffer_size
+fi
+88
tools/testing/selftests/ftrace/test.d/remotes/functions
···
+# SPDX-License-Identifier: GPL-2.0
+
+setup_remote()
+{
+	local name=$1
+
+	[ -e $TRACING_DIR/remotes/$name/write_event ] || exit_unresolved
+
+	cd remotes/$name/
+	echo 0 > tracing_on
+	clear_trace
+	echo 7 > buffer_size_kb
+	echo 0 > events/enable
+	echo 1 > events/$name/selftest/enable
+	echo 1 > tracing_on
+}
+
+setup_remote_test()
+{
+	[ -d $TRACING_DIR/remotes/test/ ] || modprobe remote_test || exit_unresolved
+
+	setup_remote "test"
+}
+
+assert_loaded()
+{
+	grep -q "(loaded)" buffer_size_kb
+}
+
+assert_unloaded()
+{
+	grep -q "(unloaded)" buffer_size_kb
+}
+
+dump_trace_pipe()
+{
+	output=$(mktemp $TMPDIR/remote_test.XXXXXX)
+	cat trace_pipe > $output &
+	pid=$!
+	sleep 1
+	kill -1 $pid
+
+	echo $output
+}
+
+check_trace()
+{
+	start_id="$1"
+	end_id="$2"
+	file="$3"
+
+	# Ensure the file is not empty
+	test -n "$(head $file)"
+
+	prev_ts=0
+	id=0
+
+	# Only keep <timestamp> <id>
+	tmp=$(mktemp $TMPDIR/remote_test.XXXXXX)
+	sed -e 's/\[[0-9]*\]\s*\([0-9]*.[0-9]*\): [a-z]* id=\([0-9]*\)/\1 \2/' $file > $tmp
+
+	while IFS= read -r line; do
+		ts=$(echo $line | cut -d ' ' -f 1)
+		id=$(echo $line | cut -d ' ' -f 2)
+
+		test $(echo "$ts>$prev_ts" | bc) -eq 1
+		test $id -eq $start_id
+
+		prev_ts=$ts
+		start_id=$((start_id + 1))
+	done < $tmp
+
+	test $id -eq $end_id
+	rm $tmp
+}
+
+get_cpu_ids()
+{
+	sed -n 's/^processor\s*:\s*\([0-9]\+\).*/\1/p' /proc/cpuinfo
+}
+
+get_page_size() {
+	sed -ne 's/^.*data.*size:\([0-9][0-9]*\).*/\1/p' events/header_page
+}
+
+get_selftest_event_size() {
+	sed -ne 's/^.*field:.*;.*size:\([0-9][0-9]*\);.*/\1/p' events/*/selftest/format | awk '{s+=$1} END {print s}'
+}
+90
tools/testing/selftests/ftrace/test.d/remotes/reset.tc
···
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0
+# description: Test trace remote reset
+# requires: remotes/test
+
+. $TEST_DIR/remotes/functions
+
+check_reset()
+{
+	write_event_path="write_event"
+	taskset=""
+
+	clear_trace
+
+	# Is the buffer empty?
+	output=$(dump_trace_pipe)
+	test $(wc -l $output | cut -d ' ' -f1) -eq 0
+
+	if $(echo $(pwd) | grep -q "per_cpu/cpu"); then
+		write_event_path="../../write_event"
+		cpu_id=$(echo $(pwd) | sed -e 's/.*per_cpu\/cpu//')
+		taskset="taskset -c $cpu_id"
+	fi
+	rm $output
+
+	# Can we properly write a new event?
+	$taskset echo 7890 > $write_event_path
+	output=$(dump_trace_pipe)
+	test $(wc -l $output | cut -d ' ' -f1) -eq 1
+	grep -q "id=7890" $output
+	rm $output
+}
+
+test_global_interface()
+{
+	output=$(mktemp $TMPDIR/remote_test.XXXXXX)
+
+	# Confidence check
+	echo 123456 > write_event
+	output=$(dump_trace_pipe)
+	grep -q "id=123456" $output
+	rm $output
+
+	# Reset single event
+	echo 1 > write_event
+	check_reset
+
+	# Reset lost events
+	for i in $(seq 1 10000); do
+		echo 1 > write_event
+	done
+	check_reset
+}
+
+test_percpu_interface()
+{
+	[ "$(get_cpu_ids | wc -l)" -ge 2 ] || return 0
+
+	for cpu in $(get_cpu_ids); do
+		taskset -c $cpu echo 1 > write_event
+	done
+
+	check_non_empty=0
+	for cpu in $(get_cpu_ids); do
+		cd per_cpu/cpu$cpu/
+
+		if [ $check_non_empty -eq 0 ]; then
+			check_reset
+			check_non_empty=1
+		else
+			# Check we have only reset 1 CPU
+			output=$(dump_trace_pipe)
+			test $(wc -l $output | cut -d ' ' -f1) -eq 1
+			rm $output
+		fi
+		cd -
+	done
+}
+
+test_reset()
+{
+	test_global_interface
+	test_percpu_interface
+}
+
+if [ -z "$SOURCE_REMOTE_TEST" ]; then
+	set -e
+	setup_remote_test
+	test_reset
+fi
+127
tools/testing/selftests/ftrace/test.d/remotes/trace.tc
···
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0
+# description: Test trace remote non-consuming read
+# requires: remotes/test
+
+. $TEST_DIR/remotes/functions
+
+test_trace()
+{
+	echo 0 > tracing_on
+	assert_unloaded
+
+	echo 7 > buffer_size_kb
+	echo 1 > tracing_on
+	assert_loaded
+
+	# Simple test: Emit few events and try to read them
+	for i in $(seq 1 8); do
+		echo $i > write_event
+	done
+
+	check_trace 1 8 trace
+
+	#
+	# Test interaction with consuming read
+	#
+
+	cat trace_pipe > /dev/null &
+	pid=$!
+
+	sleep 1
+	kill $pid
+
+	test $(wc -l < trace) -eq 0
+
+	for i in $(seq 16 32); do
+		echo $i > write_event
+	done
+
+	check_trace 16 32 trace
+
+	#
+	# Test interaction with reset
+	#
+
+	echo 0 > trace
+
+	test $(wc -l < trace) -eq 0
+
+	for i in $(seq 1 8); do
+		echo $i > write_event
+	done
+
+	check_trace 1 8 trace
+
+	#
+	# Test interaction with lost events
+	#
+
+	# Ensure the writer is not on the reader page by reloading the buffer
+	echo 0 > tracing_on
+	echo 0 > trace
+	assert_unloaded
+	echo 1 > tracing_on
+	assert_loaded
+
+	# Ensure ring-buffer overflow by emitting events from the same CPU
+	for cpu in $(get_cpu_ids); do
+		break
+	done
+
+	events_per_page=$(($(get_page_size) / $(get_selftest_event_size))) # Approx: does not take TS into account
+	nr_events=$(($events_per_page * 2))
+	for i in $(seq 1 $nr_events); do
+		taskset -c $cpu echo $i > write_event
+	done
+
+	id=$(sed -n -e '1s/\[[0-9]*\]\s*[0-9]*.[0-9]*: [a-z]* id=\([0-9]*\)/\1/p' trace)
+	test $id -ne 1
+
+	check_trace $id $nr_events trace
+
+	#
+	# Test per-CPU interface
+	#
+	echo 0 > trace
+
+	for cpu in $(get_cpu_ids) ; do
+		taskset -c $cpu echo $cpu > write_event
+	done
+
+	for cpu in $(get_cpu_ids); do
+		cd per_cpu/cpu$cpu/
+
+		check_trace $cpu $cpu trace
+
+		cd - > /dev/null
+	done
+
+	#
+	# Test with hotplug
+	#
+
+	[ "$(get_cpu_ids | wc -l)" -ge 2 ] || return 0
+
+	echo 0 > trace
+
+	for cpu in $(get_cpu_ids); do
+		echo 0 > /sys/devices/system/cpu/cpu$cpu/online || return 0
+		break
+	done
+
+	for i in $(seq 1 8); do
+		echo $i > write_event
+	done
+
+	check_trace 1 8 trace
+
+	echo 1 > /sys/devices/system/cpu/cpu$cpu/online
+}
+
+if [ -z "$SOURCE_REMOTE_TEST" ]; then
+	set -e
+
+	setup_remote_test
+	test_trace
+fi
+127
tools/testing/selftests/ftrace/test.d/remotes/trace_pipe.tc
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0
+# description: Test trace remote consuming read
+# requires: remotes/test
+
+. $TEST_DIR/remotes/functions
+
+test_trace_pipe()
+{
+	echo 0 > tracing_on
+	assert_unloaded
+
+	# Emit events from the same CPU
+	for cpu in $(get_cpu_ids); do
+		break
+	done
+
+	#
+	# Simple test: Emit enough events to fill few pages
+	#
+
+	echo 1024 > buffer_size_kb
+	echo 1 > tracing_on
+	assert_loaded
+
+	events_per_page=$(($(get_page_size) / $(get_selftest_event_size)))
+	nr_events=$(($events_per_page * 4))
+
+	output=$(mktemp $TMPDIR/remote_test.XXXXXX)
+
+	cat trace_pipe > $output &
+	pid=$!
+
+	for i in $(seq 1 $nr_events); do
+		taskset -c $cpu echo $i > write_event
+	done
+
+	echo 0 > tracing_on
+	sleep 1
+	kill $pid
+
+	check_trace 1 $nr_events $output
+
+	rm $output
+
+	#
+	# Test interaction with lost events
+	#
+
+	assert_unloaded
+	echo 7 > buffer_size_kb
+	echo 1 > tracing_on
+	assert_loaded
+
+	nr_events=$((events_per_page * 2))
+	for i in $(seq 1 $nr_events); do
+		taskset -c $cpu echo $i > write_event
+	done
+
+	output=$(dump_trace_pipe)
+
+	lost_events=$(sed -n -e '1s/CPU:.*\[LOST \([0-9]*\) EVENTS\]/\1/p' $output)
+	test -n "$lost_events"
+
+	id=$(sed -n -e '2s/\[[0-9]*\]\s*[0-9]*.[0-9]*: [a-z]* id=\([0-9]*\)/\1/p' $output)
+	test "$id" -eq $(($lost_events + 1))
+
+	# Drop [LOST EVENTS] line
+	sed -i '1d' $output
+
+	check_trace $id $nr_events $output
+
+	rm $output
+
+	#
+	# Test per-CPU interface
+	#
+
+	echo 0 > trace
+	echo 1 > tracing_on
+
+	for cpu in $(get_cpu_ids); do
+		taskset -c $cpu echo $cpu > write_event
+	done
+
+	for cpu in $(get_cpu_ids); do
+		cd per_cpu/cpu$cpu/
+		output=$(dump_trace_pipe)
+
+		check_trace $cpu $cpu $output
+
+		rm $output
+		cd - > /dev/null
+	done
+
+	#
+	# Test interaction with hotplug
+	#
+
+	[ "$(get_cpu_ids | wc -l)" -ge 2 ] || return 0
+
+	echo 0 > trace
+
+	for cpu in $(get_cpu_ids); do
+		echo 0 > /sys/devices/system/cpu/cpu$cpu/online || return 0
+		break
+	done
+
+	for i in $(seq 1 8); do
+		echo $i > write_event
+	done
+
+	output=$(dump_trace_pipe)
+
+	check_trace 1 8 $output
+
+	rm $output
+
+	echo 1 > /sys/devices/system/cpu/cpu$cpu/online
+}
+
+if [ -z "$SOURCE_REMOTE_TEST" ]; then
+	set -e
+
+	setup_remote_test
+	test_trace_pipe
+fi
+41
tools/testing/selftests/ftrace/test.d/remotes/unloading.tc
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0
+# description: Test trace remote unloading
+# requires: remotes/test
+
+. $TEST_DIR/remotes/functions
+
+test_unloading()
+{
+	# No reader, writing
+	assert_loaded
+
+	# No reader, no writing
+	echo 0 > tracing_on
+	assert_unloaded
+
+	# 1 reader, no writing
+	cat trace_pipe &
+	pid=$!
+	sleep 1
+	assert_loaded
+	kill $pid
+	assert_unloaded
+
+	# No reader, no writing, events
+	echo 1 > tracing_on
+	echo 1 > write_event
+	echo 0 > tracing_on
+	assert_loaded
+
+	# Test reset
+	clear_trace
+	assert_unloaded
+}
+
+if [ -z "$SOURCE_REMOTE_TEST" ]; then
+	set -e
+
+	setup_remote_test
+	test_unloading
+fi