Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
kernel os linux

Merge tag 'trace-ringbuffer-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull tracing ring buffer updates from Steven Rostedt:
"Add ring_buffer memory mappings.

The tracing ring buffer was created based on being mostly used with
the splice system call. It is broken up into page ordered sub-buffers
and the reader swaps a new sub-buffer with an existing sub-buffer
that's part of the write buffer. It then has total access to the
swapped out sub-buffer and can do copyless movements of the memory
into other mediums (file system, network, etc).

The buffer is great for passing around the ring buffer contents in the
kernel, but is not so good for when the consumer is the user space
task itself.

A new interface is added that allows user space to memory map the ring
buffer. It will get all the write sub-buffers as well as reader
sub-buffer (that is not written to). It can send an ioctl to change
which sub-buffer is the new reader sub-buffer.

The ring buffer is read only to user space. It only needs to call the
ioctl when it is finished with a sub-buffer and needs a new sub-buffer
that the writer will not write over.

A self test program was also created for testing and can be used as an
example for the interface to user space. The libtracefs (external to
the kernel) also has code that interacts with this, although it is
disabled until the interface is in an official release. It can be
enabled by compiling the library with a special flag. This was used
for testing applications that perform better with the buffer being
mapped.

Memory mapped buffers have limitations. The main one is that they
cannot be used with the snapshot logic. If the buffer is mapped,
snapshots will be disabled. If any logic is set to trigger snapshots
on a buffer, that buffer will not be allowed to be mapped."

* tag 'trace-ringbuffer-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
ring-buffer: Add cast to unsigned long addr passed to virt_to_page()
ring-buffer: Have mmapped ring buffer keep track of missed events
ring-buffer/selftest: Add ring-buffer mapping test
Documentation: tracing: Add ring-buffer mapping
tracing: Allow user-space mapping of the ring-buffer
ring-buffer: Introducing ring-buffer mapping functions
ring-buffer: Allocate sub-buffers with __GFP_COMP

+1026 -16
+1
Documentation/trace/index.rst
···
    timerlat-tracer
    intel_th
    ring-buffer-design
+   ring-buffer-map
    stm
    sys-t
    coresight/index
+106
Documentation/trace/ring-buffer-map.rst
···
+.. SPDX-License-Identifier: GPL-2.0
+
+==================================
+Tracefs ring-buffer memory mapping
+==================================
+
+:Author: Vincent Donnefort <vdonnefort@google.com>
+
+Overview
+========
+Tracefs ring-buffer memory map provides an efficient method to stream data
+as no memory copy is necessary. The application mapping the ring-buffer then
+becomes a consumer for that ring-buffer, in a similar fashion to trace_pipe.
+
+Memory mapping setup
+====================
+The mapping works with a mmap() of the trace_pipe_raw interface.
+
+The first system page of the mapping contains ring-buffer statistics and
+description. It is referred to as the meta-page. One of the most important
+fields of the meta-page is the reader. It contains the sub-buffer ID which can
+be safely read by the mapper (see ring-buffer-design.rst).
+
+The meta-page is followed by all the sub-buffers, ordered by ascending ID. It is
+therefore effortless to know where the reader starts in the mapping:
+
+.. code-block:: c
+
+	reader_id = meta->reader.id;
+	reader_offset = meta->meta_page_size + reader_id * meta->subbuf_size;
+
+When the application is done with the current reader, it can get a new one using
+the trace_pipe_raw ioctl() TRACE_MMAP_IOCTL_GET_READER. This ioctl also updates
+the meta-page fields.
+
+Limitations
+===========
+When a mapping is in place on a Tracefs ring-buffer, it is not possible to
+resize it (either by increasing the entire size of the ring-buffer or
+each subbuf). It is also not possible to use snapshot, and splice will copy
+the ring buffer data instead of using the copyless swap from the ring buffer.
+
+Concurrent readers (either another application mapping that ring-buffer or the
+kernel with trace_pipe) are allowed but not recommended. They will compete for
+the ring-buffer and the output is unpredictable, just like concurrent readers on
+trace_pipe would be.
+
+Example
+=======
+
+.. code-block:: c
+
+	#include <fcntl.h>
+	#include <stdio.h>
+	#include <stdlib.h>
+	#include <unistd.h>
+
+	#include <linux/trace_mmap.h>
+
+	#include <sys/mman.h>
+	#include <sys/ioctl.h>
+
+	#define TRACE_PIPE_RAW "/sys/kernel/tracing/per_cpu/cpu0/trace_pipe_raw"
+
+	int main(void)
+	{
+		int page_size = getpagesize(), fd, reader_id;
+		unsigned long meta_len, data_len;
+		struct trace_buffer_meta *meta;
+		void *map, *reader, *data;
+
+		fd = open(TRACE_PIPE_RAW, O_RDONLY | O_NONBLOCK);
+		if (fd < 0)
+			exit(EXIT_FAILURE);
+
+		map = mmap(NULL, page_size, PROT_READ, MAP_SHARED, fd, 0);
+		if (map == MAP_FAILED)
+			exit(EXIT_FAILURE);
+
+		meta = (struct trace_buffer_meta *)map;
+		meta_len = meta->meta_page_size;
+
+		printf("entries:        %llu\n", meta->entries);
+		printf("overrun:        %llu\n", meta->overrun);
+		printf("read:           %llu\n", meta->read);
+		printf("nr_subbufs:     %u\n", meta->nr_subbufs);
+
+		data_len = meta->subbuf_size * meta->nr_subbufs;
+		data = mmap(NULL, data_len, PROT_READ, MAP_SHARED, fd, meta_len);
+		if (data == MAP_FAILED)
+			exit(EXIT_FAILURE);
+
+		if (ioctl(fd, TRACE_MMAP_IOCTL_GET_READER) < 0)
+			exit(EXIT_FAILURE);
+
+		reader_id = meta->reader.id;
+		reader = data + meta->subbuf_size * reader_id;
+
+		printf("Current reader address: %p\n", reader);
+
+		munmap(data, data_len);
+		munmap(meta, meta_len);
+		close(fd);
+
+		return 0;
+	}
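
The offset arithmetic described in the documentation hunk above can be checked in isolation. This is a small sketch, not code from the patch: `reader_offset()` and `mapping_length()` are hypothetical helper names, and the inputs stand in for the `meta_page_size`, `subbuf_size`, `nr_subbufs` and `reader.id` fields of the meta-page.

```c
#include <stddef.h>
#include <stdint.h>

/* Byte offset of the reader sub-buffer from the start of the mapping. */
static size_t reader_offset(uint32_t meta_page_size, uint32_t subbuf_size,
			    uint32_t reader_id)
{
	return (size_t)meta_page_size + (size_t)reader_id * subbuf_size;
}

/* Total bytes needed to map the meta-page plus every sub-buffer. */
static size_t mapping_length(uint32_t meta_page_size, uint32_t subbuf_size,
			     uint32_t nr_subbufs)
{
	return (size_t)meta_page_size + (size_t)nr_subbufs * subbuf_size;
}
```

For instance, with a 4096-byte meta-page and 4096-byte sub-buffers, reader ID 3 starts 16384 bytes into the mapping. The widening casts avoid 32-bit overflow when a ring buffer has many large sub-buffers.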
+6
include/linux/ring_buffer.h
···
 #include <linux/seq_file.h>
 #include <linux/poll.h>

+#include <uapi/linux/trace_mmap.h>
+
 struct trace_buffer;
 struct ring_buffer_iter;

···
 #define trace_rb_cpu_prepare	NULL
 #endif

+int ring_buffer_map(struct trace_buffer *buffer, int cpu,
+		    struct vm_area_struct *vma);
+int ring_buffer_unmap(struct trace_buffer *buffer, int cpu);
+int ring_buffer_map_get_reader(struct trace_buffer *buffer, int cpu);
 #endif /* _LINUX_RING_BUFFER_H */
+48
include/uapi/linux/trace_mmap.h
···
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+#ifndef _TRACE_MMAP_H_
+#define _TRACE_MMAP_H_
+
+#include <linux/types.h>
+
+/**
+ * struct trace_buffer_meta - Ring-buffer Meta-page description
+ * @meta_page_size:	Size of this meta-page.
+ * @meta_struct_len:	Size of this structure.
+ * @subbuf_size:	Size of each sub-buffer.
+ * @nr_subbufs:		Number of subbufs in the ring-buffer, including the reader.
+ * @reader.lost_events:	Number of events lost at the time of the reader swap.
+ * @reader.id:		subbuf ID of the current reader. ID range [0 : @nr_subbufs - 1]
+ * @reader.read:	Number of bytes read on the reader subbuf.
+ * @flags:		Placeholder for now, 0 until new features are supported.
+ * @entries:		Number of entries in the ring-buffer.
+ * @overrun:		Number of entries lost in the ring-buffer.
+ * @read:		Number of entries that have been read.
+ * @Reserved1:		Internal use only.
+ * @Reserved2:		Internal use only.
+ */
+struct trace_buffer_meta {
+	__u32		meta_page_size;
+	__u32		meta_struct_len;
+
+	__u32		subbuf_size;
+	__u32		nr_subbufs;
+
+	struct {
+		__u64	lost_events;
+		__u32	id;
+		__u32	read;
+	} reader;
+
+	__u64	flags;
+
+	__u64	entries;
+	__u64	overrun;
+	__u64	read;
+
+	__u64	Reserved1;
+	__u64	Reserved2;
+};
+
+#define TRACE_MMAP_IOCTL_GET_READER		_IO('T', 0x1)
+
+#endif /* _TRACE_MMAP_H_ */
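
The meta-page structure above packs without padding, and it carries its own length in `meta_struct_len`. The sketch below mirrors the layout with `<stdint.h>` types so the size can be checked outside the kernel tree. `struct meta_mirror` and `meta_covers()` are hypothetical names. Using `meta_struct_len` to test whether a field is present is an assumption about how a consumer might handle future growth of the structure; the header does not prescribe it.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Local mirror of struct trace_buffer_meta, for layout checks only. */
struct meta_mirror {
	uint32_t meta_page_size;
	uint32_t meta_struct_len;
	uint32_t subbuf_size;
	uint32_t nr_subbufs;
	struct {
		uint64_t lost_events;
		uint32_t id;
		uint32_t read;
	} reader;
	uint64_t flags;
	uint64_t entries;
	uint64_t overrun;
	uint64_t read;
	uint64_t Reserved1;
	uint64_t Reserved2;
};

/* 4 * 4 bytes + 16 bytes of reader + 6 * 8 bytes = 80 bytes, no padding. */
_Static_assert(sizeof(struct meta_mirror) == 80, "unexpected padding");

/* True if a field at [offset, offset + size) lies within the reported length. */
static bool meta_covers(uint32_t meta_struct_len, size_t offset, size_t size)
{
	return offset + size <= meta_struct_len;
}
```

A reader built against a newer header could call `meta_covers(meta->meta_struct_len, offsetof(...), sizeof(...))` before touching a field that an older kernel might not provide.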
+460 -11
kernel/trace/ring_buffer.c
···
 #include <linux/ring_buffer.h>
 #include <linux/trace_clock.h>
 #include <linux/sched/clock.h>
+#include <linux/cacheflush.h>
 #include <linux/trace_seq.h>
 #include <linux/spinlock.h>
 #include <linux/irq_work.h>
···
 #include <linux/list.h>
 #include <linux/cpu.h>
 #include <linux/oom.h>
+#include <linux/mm.h>

 #include <asm/local64.h>
 #include <asm/local.h>
···
 /* Missed count stored at end */
 #define RB_MISSED_STORED	(1 << 30)

+#define RB_MISSED_MASK		(3 << 30)
+
 struct buffer_data_page {
 	u64		 time_stamp;	/* page time stamp */
 	local_t		 commit;	/* write committed index */
···
 	local_t		 entries;	/* entries on this page */
 	unsigned long	 real_end;	/* real end of data */
 	unsigned	 order;		/* order of the page */
+	u32		 id;		/* ID for external mapping */
 	struct buffer_data_page *page;	/* Actual data page */
 };

···
 	u64				read_stamp;
 	/* pages removed since last reset */
 	unsigned long			pages_removed;
+
+	unsigned int			mapped;
+	struct mutex			mapping_lock;
+	unsigned long			*subbuf_ids;	/* ID to subbuf VA */
+	struct trace_buffer_meta	*meta_page;
+
 	/* ring buffer pages to update, > 0 to add, < 0 to remove */
 	long				nr_pages_to_update;
 	struct list_head		new_pages; /* new pages to add */
···
 		list_add(&bpage->list, pages);

 		page = alloc_pages_node(cpu_to_node(cpu_buffer->cpu),
-					mflags | __GFP_ZERO,
+					mflags | __GFP_COMP | __GFP_ZERO,
 					cpu_buffer->buffer->subbuf_order);
 		if (!page)
 			goto free_pages;
···
 	init_irq_work(&cpu_buffer->irq_work.work, rb_wake_up_waiters);
 	init_waitqueue_head(&cpu_buffer->irq_work.waiters);
 	init_waitqueue_head(&cpu_buffer->irq_work.full_waiters);
+	mutex_init(&cpu_buffer->mapping_lock);

 	bpage = kzalloc_node(ALIGN(sizeof(*bpage), cache_line_size()),
 			    GFP_KERNEL, cpu_to_node(cpu));
···

 	cpu_buffer->reader_page = bpage;

-	page = alloc_pages_node(cpu_to_node(cpu), GFP_KERNEL | __GFP_ZERO,
+	page = alloc_pages_node(cpu_to_node(cpu), GFP_KERNEL | __GFP_COMP | __GFP_ZERO,
 				cpu_buffer->buffer->subbuf_order);
 	if (!page)
 		goto fail_free_reader;
···
 {
 	return buffer->time_stamp_abs;
 }
-
-static void rb_reset_cpu(struct ring_buffer_per_cpu *cpu_buffer);

 static inline unsigned long rb_page_entries(struct buffer_page *bpage)
 {
···
 /* Size is determined by what has been committed */
 static __always_inline unsigned rb_page_size(struct buffer_page *bpage)
 {
-	return rb_page_commit(bpage);
+	return rb_page_commit(bpage) & ~RB_MISSED_MASK;
 }

 static __always_inline unsigned
···
 		return true;

 	/* Reader should exhaust content in reader page */
-	if (reader->read != rb_page_commit(reader))
+	if (reader->read != rb_page_size(reader))
 		return false;

 	/*
···
 	return ((iter->head_page == commit_page && iter->head >= commit) ||
 		(iter->head_page == reader && commit_page == head_page &&
 		 head_page->read == commit &&
-		 iter->head == rb_page_commit(cpu_buffer->reader_page)));
+		 iter->head == rb_page_size(cpu_buffer->reader_page)));
 }
 EXPORT_SYMBOL_GPL(ring_buffer_iter_empty);

···
 	page->read = 0;
 }

+static void rb_update_meta_page(struct ring_buffer_per_cpu *cpu_buffer)
+{
+	struct trace_buffer_meta *meta = cpu_buffer->meta_page;
+
+	meta->reader.read = cpu_buffer->reader_page->read;
+	meta->reader.id = cpu_buffer->reader_page->id;
+	meta->reader.lost_events = cpu_buffer->lost_events;
+
+	meta->entries = local_read(&cpu_buffer->entries);
+	meta->overrun = local_read(&cpu_buffer->overrun);
+	meta->read = cpu_buffer->read;
+
+	/* Some archs do not have data cache coherency between kernel and user-space */
+	flush_dcache_folio(virt_to_folio(cpu_buffer->meta_page));
+}
+
 static void
 rb_reset_cpu(struct ring_buffer_per_cpu *cpu_buffer)
 {
···

 	cpu_buffer->lost_events = 0;
 	cpu_buffer->last_overrun = 0;
+
+	if (cpu_buffer->mapped)
+		rb_update_meta_page(cpu_buffer);

 	rb_head_page_activate(cpu_buffer);
 	cpu_buffer->pages_removed = 0;
···
 	cpu_buffer_a = buffer_a->buffers[cpu];
 	cpu_buffer_b = buffer_b->buffers[cpu];

+	/* It's up to the callers to not try to swap mapped buffers */
+	if (WARN_ON_ONCE(cpu_buffer_a->mapped || cpu_buffer_b->mapped)) {
+		ret = -EBUSY;
+		goto out;
+	}
+
 	/* At least make sure the two buffers are somewhat the same */
 	if (cpu_buffer_a->nr_pages != cpu_buffer_b->nr_pages)
 		goto out;
···
 		goto out;

 	page = alloc_pages_node(cpu_to_node(cpu),
-				GFP_KERNEL | __GFP_NORETRY | __GFP_ZERO,
+				GFP_KERNEL | __GFP_NORETRY | __GFP_COMP | __GFP_ZERO,
 				cpu_buffer->buffer->subbuf_order);
 	if (!page) {
 		kfree(bpage);
···
 	event = rb_reader_event(cpu_buffer);

 	read = reader->read;
-	commit = rb_page_commit(reader);
+	commit = rb_page_size(reader);

 	/* Check if any events were dropped */
 	missed_events = cpu_buffer->lost_events;
···
 	 * Otherwise, we can simply swap the page with the one passed in.
 	 */
 	if (read || (len < (commit - read)) ||
-	    cpu_buffer->reader_page == cpu_buffer->commit_page) {
+	    cpu_buffer->reader_page == cpu_buffer->commit_page ||
+	    cpu_buffer->mapped) {
 		struct buffer_data_page *rpage = cpu_buffer->reader_page->page;
 		unsigned int rpos = read;
 		unsigned int pos = 0;
···
 	} else {
 		/* update the entry counter */
 		cpu_buffer->read += rb_page_entries(reader);
-		cpu_buffer->read_bytes += rb_page_commit(reader);
+		cpu_buffer->read_bytes += rb_page_size(reader);

 		/* swap the pages */
 		rb_init_page(bpage);
···

 		cpu_buffer = buffer->buffers[cpu];

+		if (cpu_buffer->mapped) {
+			err = -EBUSY;
+			goto error;
+		}
+
 		/* Update the number of pages to match the new size */
 		nr_pages = old_size * buffer->buffers[cpu]->nr_pages;
 		nr_pages = DIV_ROUND_UP(nr_pages, buffer->subbuf_size);
···
 	return err;
 }
 EXPORT_SYMBOL_GPL(ring_buffer_subbuf_order_set);
+
+static int rb_alloc_meta_page(struct ring_buffer_per_cpu *cpu_buffer)
+{
+	struct page *page;
+
+	if (cpu_buffer->meta_page)
+		return 0;
+
+	page = alloc_page(GFP_USER | __GFP_ZERO);
+	if (!page)
+		return -ENOMEM;
+
+	cpu_buffer->meta_page = page_to_virt(page);
+
+	return 0;
+}
+
+static void rb_free_meta_page(struct ring_buffer_per_cpu *cpu_buffer)
+{
+	unsigned long addr = (unsigned long)cpu_buffer->meta_page;
+
+	free_page(addr);
+	cpu_buffer->meta_page = NULL;
+}
+
+static void rb_setup_ids_meta_page(struct ring_buffer_per_cpu *cpu_buffer,
+				   unsigned long *subbuf_ids)
+{
+	struct trace_buffer_meta *meta = cpu_buffer->meta_page;
+	unsigned int nr_subbufs = cpu_buffer->nr_pages + 1;
+	struct buffer_page *first_subbuf, *subbuf;
+	int id = 0;
+
+	subbuf_ids[id] = (unsigned long)cpu_buffer->reader_page->page;
+	cpu_buffer->reader_page->id = id++;
+
+	first_subbuf = subbuf = rb_set_head_page(cpu_buffer);
+	do {
+		if (WARN_ON(id >= nr_subbufs))
+			break;
+
+		subbuf_ids[id] = (unsigned long)subbuf->page;
+		subbuf->id = id;
+
+		rb_inc_page(&subbuf);
+		id++;
+	} while (subbuf != first_subbuf);
+
+	/* install subbuf ID to kern VA translation */
+	cpu_buffer->subbuf_ids = subbuf_ids;
+
+	meta->meta_page_size = PAGE_SIZE;
+	meta->meta_struct_len = sizeof(*meta);
+	meta->nr_subbufs = nr_subbufs;
+	meta->subbuf_size = cpu_buffer->buffer->subbuf_size + BUF_PAGE_HDR_SIZE;
+
+	rb_update_meta_page(cpu_buffer);
+}
+
+static struct ring_buffer_per_cpu *
+rb_get_mapped_buffer(struct trace_buffer *buffer, int cpu)
+{
+	struct ring_buffer_per_cpu *cpu_buffer;
+
+	if (!cpumask_test_cpu(cpu, buffer->cpumask))
+		return ERR_PTR(-EINVAL);
+
+	cpu_buffer = buffer->buffers[cpu];
+
+	mutex_lock(&cpu_buffer->mapping_lock);
+
+	if (!cpu_buffer->mapped) {
+		mutex_unlock(&cpu_buffer->mapping_lock);
+		return ERR_PTR(-ENODEV);
+	}
+
+	return cpu_buffer;
+}
+
+static void rb_put_mapped_buffer(struct ring_buffer_per_cpu *cpu_buffer)
+{
+	mutex_unlock(&cpu_buffer->mapping_lock);
+}
+
+/*
+ * Fast-path for rb_buffer_(un)map(). Called whenever the meta-page doesn't need
+ * to be set-up or torn-down.
+ */
+static int __rb_inc_dec_mapped(struct ring_buffer_per_cpu *cpu_buffer,
+			       bool inc)
+{
+	unsigned long flags;
+
+	lockdep_assert_held(&cpu_buffer->mapping_lock);
+
+	if (inc && cpu_buffer->mapped == UINT_MAX)
+		return -EBUSY;
+
+	if (WARN_ON(!inc && cpu_buffer->mapped == 0))
+		return -EINVAL;
+
+	mutex_lock(&cpu_buffer->buffer->mutex);
+	raw_spin_lock_irqsave(&cpu_buffer->reader_lock, flags);
+
+	if (inc)
+		cpu_buffer->mapped++;
+	else
+		cpu_buffer->mapped--;
+
+	raw_spin_unlock_irqrestore(&cpu_buffer->reader_lock, flags);
+	mutex_unlock(&cpu_buffer->buffer->mutex);
+
+	return 0;
+}
+
+/*
+ *   +--------------+  pgoff == 0
+ *   |   meta page  |
+ *   +--------------+  pgoff == 1
+ *   | subbuffer 0  |
+ *   |              |
+ *   +--------------+  pgoff == (1 + (1 << subbuf_order))
+ *   | subbuffer 1  |
+ *   |              |
+ *         ...
+ */
+#ifdef CONFIG_MMU
+static int __rb_map_vma(struct ring_buffer_per_cpu *cpu_buffer,
+			struct vm_area_struct *vma)
+{
+	unsigned long nr_subbufs, nr_pages, vma_pages, pgoff = vma->vm_pgoff;
+	unsigned int subbuf_pages, subbuf_order;
+	struct page **pages;
+	int p = 0, s = 0;
+	int err;
+
+	/* Refuse MAP_PRIVATE or writable mappings */
+	if (vma->vm_flags & VM_WRITE || vma->vm_flags & VM_EXEC ||
+	    !(vma->vm_flags & VM_MAYSHARE))
+		return -EPERM;
+
+	/*
+	 * Make sure the mapping cannot become writable later. Also tell the VM
+	 * to not touch these pages (VM_DONTCOPY | VM_DONTEXPAND).
+	 */
+	vm_flags_mod(vma, VM_DONTCOPY | VM_DONTEXPAND | VM_DONTDUMP,
+		     VM_MAYWRITE);
+
+	lockdep_assert_held(&cpu_buffer->mapping_lock);
+
+	subbuf_order = cpu_buffer->buffer->subbuf_order;
+	subbuf_pages = 1 << subbuf_order;
+
+	nr_subbufs = cpu_buffer->nr_pages + 1; /* + reader-subbuf */
+	nr_pages = ((nr_subbufs) << subbuf_order) - pgoff + 1; /* + meta-page */
+
+	vma_pages = (vma->vm_end - vma->vm_start) >> PAGE_SHIFT;
+	if (!vma_pages || vma_pages > nr_pages)
+		return -EINVAL;
+
+	nr_pages = vma_pages;
+
+	pages = kcalloc(nr_pages, sizeof(*pages), GFP_KERNEL);
+	if (!pages)
+		return -ENOMEM;
+
+	if (!pgoff) {
+		pages[p++] = virt_to_page(cpu_buffer->meta_page);
+
+		/*
+		 * TODO: Align sub-buffers on their size, once
+		 * vm_insert_pages() supports the zero-page.
+		 */
+	} else {
+		/* Skip the meta-page */
+		pgoff--;
+
+		if (pgoff % subbuf_pages) {
+			err = -EINVAL;
+			goto out;
+		}
+
+		s += pgoff / subbuf_pages;
+	}
+
+	while (p < nr_pages) {
+		struct page *page = virt_to_page((void *)cpu_buffer->subbuf_ids[s]);
+		int off = 0;
+
+		if (WARN_ON_ONCE(s >= nr_subbufs)) {
+			err = -EINVAL;
+			goto out;
+		}
+
+		for (; off < (1 << (subbuf_order)); off++, page++) {
+			if (p >= nr_pages)
+				break;
+
+			pages[p++] = page;
+		}
+		s++;
+	}
+
+	err = vm_insert_pages(vma, vma->vm_start, pages, &nr_pages);
+
+out:
+	kfree(pages);
+
+	return err;
+}
+#else
+static int __rb_map_vma(struct ring_buffer_per_cpu *cpu_buffer,
+			struct vm_area_struct *vma)
+{
+	return -EOPNOTSUPP;
+}
+#endif
+
+int ring_buffer_map(struct trace_buffer *buffer, int cpu,
+		    struct vm_area_struct *vma)
+{
+	struct ring_buffer_per_cpu *cpu_buffer;
+	unsigned long flags, *subbuf_ids;
+	int err = 0;
+
+	if (!cpumask_test_cpu(cpu, buffer->cpumask))
+		return -EINVAL;
+
+	cpu_buffer = buffer->buffers[cpu];
+
+	mutex_lock(&cpu_buffer->mapping_lock);
+
+	if (cpu_buffer->mapped) {
+		err = __rb_map_vma(cpu_buffer, vma);
+		if (!err)
+			err = __rb_inc_dec_mapped(cpu_buffer, true);
+		mutex_unlock(&cpu_buffer->mapping_lock);
+		return err;
+	}
+
+	/* prevent another thread from changing buffer/sub-buffer sizes */
+	mutex_lock(&buffer->mutex);
+
+	err = rb_alloc_meta_page(cpu_buffer);
+	if (err)
+		goto unlock;
+
+	/* subbuf_ids include the reader while nr_pages does not */
+	subbuf_ids = kcalloc(cpu_buffer->nr_pages + 1, sizeof(*subbuf_ids), GFP_KERNEL);
+	if (!subbuf_ids) {
+		rb_free_meta_page(cpu_buffer);
+		err = -ENOMEM;
+		goto unlock;
+	}
+
+	atomic_inc(&cpu_buffer->resize_disabled);
+
+	/*
+	 * Lock all readers to block any subbuf swap until the subbuf IDs are
+	 * assigned.
+	 */
+	raw_spin_lock_irqsave(&cpu_buffer->reader_lock, flags);
+	rb_setup_ids_meta_page(cpu_buffer, subbuf_ids);
+	raw_spin_unlock_irqrestore(&cpu_buffer->reader_lock, flags);
+
+	err = __rb_map_vma(cpu_buffer, vma);
+	if (!err) {
+		raw_spin_lock_irqsave(&cpu_buffer->reader_lock, flags);
+		cpu_buffer->mapped = 1;
+		raw_spin_unlock_irqrestore(&cpu_buffer->reader_lock, flags);
+	} else {
+		kfree(cpu_buffer->subbuf_ids);
+		cpu_buffer->subbuf_ids = NULL;
+		rb_free_meta_page(cpu_buffer);
+	}
+
+unlock:
+	mutex_unlock(&buffer->mutex);
+	mutex_unlock(&cpu_buffer->mapping_lock);
+
+	return err;
+}
+
+int ring_buffer_unmap(struct trace_buffer *buffer, int cpu)
+{
+	struct ring_buffer_per_cpu *cpu_buffer;
+	unsigned long flags;
+	int err = 0;
+
+	if (!cpumask_test_cpu(cpu, buffer->cpumask))
+		return -EINVAL;
+
+	cpu_buffer = buffer->buffers[cpu];
+
+	mutex_lock(&cpu_buffer->mapping_lock);
+
+	if (!cpu_buffer->mapped) {
+		err = -ENODEV;
+		goto out;
+	} else if (cpu_buffer->mapped > 1) {
+		__rb_inc_dec_mapped(cpu_buffer, false);
+		goto out;
+	}
+
+	mutex_lock(&buffer->mutex);
+	raw_spin_lock_irqsave(&cpu_buffer->reader_lock, flags);
+
+	cpu_buffer->mapped = 0;
+
+	raw_spin_unlock_irqrestore(&cpu_buffer->reader_lock, flags);
+
+	kfree(cpu_buffer->subbuf_ids);
+	cpu_buffer->subbuf_ids = NULL;
+	rb_free_meta_page(cpu_buffer);
+	atomic_dec(&cpu_buffer->resize_disabled);
+
+	mutex_unlock(&buffer->mutex);
+
+out:
+	mutex_unlock(&cpu_buffer->mapping_lock);
+
+	return err;
+}
+
+int ring_buffer_map_get_reader(struct trace_buffer *buffer, int cpu)
+{
+	struct ring_buffer_per_cpu *cpu_buffer;
+	struct buffer_page *reader;
+	unsigned long missed_events;
+	unsigned long reader_size;
+	unsigned long flags;
+
+	cpu_buffer = rb_get_mapped_buffer(buffer, cpu);
+	if (IS_ERR(cpu_buffer))
+		return (int)PTR_ERR(cpu_buffer);
+
+	raw_spin_lock_irqsave(&cpu_buffer->reader_lock, flags);
+
+consume:
+	if (rb_per_cpu_empty(cpu_buffer))
+		goto out;
+
+	reader_size = rb_page_size(cpu_buffer->reader_page);
+
+	/*
+	 * There are data to be read on the current reader page, we can
+	 * return to the caller. But before that, we assume the latter will read
+	 * everything. Let's update the kernel reader accordingly.
+	 */
+	if (cpu_buffer->reader_page->read < reader_size) {
+		while (cpu_buffer->reader_page->read < reader_size)
+			rb_advance_reader(cpu_buffer);
+		goto out;
+	}
+
+	reader = rb_get_reader_page(cpu_buffer);
+	if (WARN_ON(!reader))
+		goto out;
+
+	/* Check if any events were dropped */
+	missed_events = cpu_buffer->lost_events;
+
+	if (cpu_buffer->reader_page != cpu_buffer->commit_page) {
+		if (missed_events) {
+			struct buffer_data_page *bpage = reader->page;
+			unsigned int commit;
+			/*
+			 * Use the real_end for the data size,
+			 * This gives us a chance to store the lost events
+			 * on the page.
+			 */
+			if (reader->real_end)
+				local_set(&bpage->commit, reader->real_end);
+			/*
+			 * If there is room at the end of the page to save the
+			 * missed events, then record it there.
+			 */
+			commit = rb_page_size(reader);
+			if (buffer->subbuf_size - commit >= sizeof(missed_events)) {
+				memcpy(&bpage->data[commit], &missed_events,
+				       sizeof(missed_events));
+				local_add(RB_MISSED_STORED, &bpage->commit);
+			}
+			local_add(RB_MISSED_EVENTS, &bpage->commit);
+		}
+	} else {
+		/*
+		 * There really shouldn't be any missed events if the commit
+		 * is on the reader page.
+		 */
+		WARN_ON_ONCE(missed_events);
+	}
+
+	cpu_buffer->lost_events = 0;
+
+	goto consume;
+
+out:
+	/* Some archs do not have data cache coherency between kernel and user-space */
+	flush_dcache_folio(virt_to_folio(cpu_buffer->reader_page->page));
+
+	rb_update_meta_page(cpu_buffer);
+
+	raw_spin_unlock_irqrestore(&cpu_buffer->reader_lock, flags);
+	rb_put_mapped_buffer(cpu_buffer);
+
+	return 0;
+}

 /*
  * We only allocate new buffers, never free them if the CPU goes down.
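
The page-offset layout drawn in the comment above __rb_map_vma() can be modelled in user space. The sketch below mirrors the arithmetic from the hunk: page offset 0 is the meta-page, and any other offset must land on a sub-buffer boundary. `first_subbuf_for_pgoff()` and `max_mappable_pages()` are hypothetical names, and the early return for an out-of-range offset is an addition of this sketch, not something the kernel function does in that form.

```c
#include <errno.h>

/*
 * First sub-buffer index covered by a mapping that starts at pgoff.
 * pgoff 0 starts at the meta-page, so sub-buffer 0 follows it.
 * Returns -EINVAL when pgoff falls in the middle of a sub-buffer.
 */
static long first_subbuf_for_pgoff(unsigned long pgoff, unsigned int subbuf_order)
{
	unsigned long subbuf_pages = 1UL << subbuf_order;

	if (pgoff == 0)
		return 0;

	pgoff--; /* skip the meta-page */
	if (pgoff % subbuf_pages)
		return -EINVAL;

	return (long)(pgoff / subbuf_pages);
}

/* Largest number of pages a mapping starting at pgoff may cover. */
static unsigned long max_mappable_pages(unsigned long nr_subbufs,
					unsigned int subbuf_order,
					unsigned long pgoff)
{
	unsigned long total = (nr_subbufs << subbuf_order) + 1; /* + meta-page */

	if (pgoff >= total) /* guard added in this sketch */
		return 0;

	return total - pgoff;
}
```

With order-1 sub-buffers (two pages each), page offsets 1 and 3 start sub-buffers 0 and 1, while offset 2 is rejected. This is why the documentation example maps the data region at `meta_page_size` rather than at an arbitrary offset.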
+99 -5
kernel/trace/trace.c
···
 		return;
 	}

+	if (tr->mapped) {
+		trace_array_puts(tr, "*** BUFFER MEMORY MAPPED ***\n");
+		trace_array_puts(tr, "*** Can not use snapshot (sorry) ***\n");
+		return;
+	}
+
 	local_irq_save(flags);
 	update_max_tr(tr, current, smp_processor_id(), cond_data);
 	local_irq_restore(flags);
···
 	lockdep_assert_held(&trace_types_lock);

 	spin_lock(&tr->snapshot_trigger_lock);
-	if (tr->snapshot == UINT_MAX) {
+	if (tr->snapshot == UINT_MAX || tr->mapped) {
 		spin_unlock(&tr->snapshot_trigger_lock);
 		return -EBUSY;
 	}
···
 {
 	if (tr->current_trace == &nop_trace)
 		return;
-	
+
 	tr->current_trace->enabled--;

 	if (tr->current_trace->reset)
···
 	return ret;
 }

-/* An ioctl call with cmd 0 to the ring buffer file will wake up all waiters */
 static long tracing_buffers_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
 {
 	struct ftrace_buffer_info *info = file->private_data;
 	struct trace_iterator *iter = &info->iter;
+	int err;

-	if (cmd)
-		return -ENOIOCTLCMD;
+	if (cmd == TRACE_MMAP_IOCTL_GET_READER) {
+		if (!(file->f_flags & O_NONBLOCK)) {
+			err = ring_buffer_wait(iter->array_buffer->buffer,
+					       iter->cpu_file,
+					       iter->tr->buffer_percent,
+					       NULL, NULL);
+			if (err)
+				return err;
+		}

+		return ring_buffer_map_get_reader(iter->array_buffer->buffer,
+						  iter->cpu_file);
+	} else if (cmd) {
+		return -ENOTTY;
+	}
+
+	/*
+	 * An ioctl call with cmd 0 to the ring buffer file will wake up all
+	 * waiters
+	 */
 	mutex_lock(&trace_types_lock);

 	/* Make sure the waiters see the new wait_index */
···
 	return 0;
 }

+#ifdef CONFIG_TRACER_MAX_TRACE
+static int get_snapshot_map(struct trace_array *tr)
+{
+	int err = 0;
+
+	/*
+	 * Called with mmap_lock held. lockdep would be unhappy if we would now
+	 * take trace_types_lock. Instead use the specific
+	 * snapshot_trigger_lock.
+	 */
+	spin_lock(&tr->snapshot_trigger_lock);
+
+	if (tr->snapshot || tr->mapped == UINT_MAX)
+		err = -EBUSY;
+	else
+		tr->mapped++;
+
+	spin_unlock(&tr->snapshot_trigger_lock);
+
+	/* Wait for update_max_tr() to observe iter->tr->mapped */
+	if (tr->mapped == 1)
+		synchronize_rcu();
+
+	return err;
+
+}
+static void put_snapshot_map(struct trace_array *tr)
+{
+	spin_lock(&tr->snapshot_trigger_lock);
+	if (!WARN_ON(!tr->mapped))
+		tr->mapped--;
+	spin_unlock(&tr->snapshot_trigger_lock);
+}
+#else
+static inline int get_snapshot_map(struct trace_array *tr) { return 0; }
+static inline void put_snapshot_map(struct trace_array *tr) { }
+#endif
+
+static void tracing_buffers_mmap_close(struct vm_area_struct *vma)
+{
+	struct ftrace_buffer_info *info = vma->vm_file->private_data;
+	struct trace_iterator *iter = &info->iter;
+
+	WARN_ON(ring_buffer_unmap(iter->array_buffer->buffer, iter->cpu_file));
+	put_snapshot_map(iter->tr);
+}
+
+static const struct vm_operations_struct tracing_buffers_vmops = {
+	.close		= tracing_buffers_mmap_close,
+};
+
+static int tracing_buffers_mmap(struct file *filp, struct vm_area_struct *vma)
+{
+	struct ftrace_buffer_info *info = filp->private_data;
+	struct trace_iterator *iter = &info->iter;
+	int ret = 0;
+
+	ret = get_snapshot_map(iter->tr);
+	if (ret)
+		return ret;
+
+	ret = ring_buffer_map(iter->array_buffer->buffer, iter->cpu_file, vma);
+	if (ret)
+		put_snapshot_map(iter->tr);
+
+	vma->vm_ops = &tracing_buffers_vmops;
+
+	return ret;
+}
+
 static const struct file_operations tracing_buffers_fops = {
 	.open		= tracing_buffers_open,
 	.read		= tracing_buffers_read,
···
 	.splice_read	= tracing_buffers_splice_read,
 	.unlocked_ioctl = tracing_buffers_ioctl,
 	.llseek		= no_llseek,
+	.mmap		= tracing_buffers_mmap,
 };

 static ssize_t
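
The trace.c hunks make snapshots and mappings mutually exclusive through two counters checked against each other. A single-threaded model of that rule is sketched below. `struct tr_model` and the `model_*` functions are hypothetical stand-ins, the spinlock and the synchronize_rcu() call are left out, and the increment in `model_arm_snapshot()` is assumed, since the hunk shows only the check on the arming path.

```c
#include <errno.h>
#include <limits.h>

/* Hypothetical stand-in for the two counters in struct trace_array. */
struct tr_model {
	unsigned int snapshot; /* armed snapshot triggers */
	unsigned int mapped;   /* live user-space mappings */
};

/* Mirrors the check in get_snapshot_map(): refuse while a snapshot is armed. */
static int model_get_map(struct tr_model *tr)
{
	if (tr->snapshot || tr->mapped == UINT_MAX)
		return -EBUSY;

	tr->mapped++;
	return 0;
}

/* Mirrors put_snapshot_map(): drop one mapping reference. */
static void model_put_map(struct tr_model *tr)
{
	if (tr->mapped)
		tr->mapped--;
}

/* Mirrors the check added to the snapshot arming path: refuse while mapped. */
static int model_arm_snapshot(struct tr_model *tr)
{
	if (tr->snapshot == UINT_MAX || tr->mapped)
		return -EBUSY;

	tr->snapshot++; /* assumed: the hunk shows only the check */
	return 0;
}
```

Whichever side takes its reference first wins, and the other gets -EBUSY until that reference is dropped. In the kernel both checks run under snapshot_trigger_lock, which is what makes the test-and-increment atomic.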
+1
kernel/trace/trace.h
···
 	bool allocated_snapshot;
 	spinlock_t snapshot_trigger_lock;
 	unsigned int snapshot;
+	unsigned int mapped;
 	unsigned long max_latency;
 #ifdef CONFIG_FSNOTIFY
 	struct dentry *d_max_latency;
+1
tools/testing/selftests/ring-buffer/.gitignore
···
+map_test
+8
tools/testing/selftests/ring-buffer/Makefile
···
+# SPDX-License-Identifier: GPL-2.0
+CFLAGS += -Wl,-no-as-needed -Wall
+CFLAGS += $(KHDR_INCLUDES)
+CFLAGS += -D_GNU_SOURCE
+
+TEST_GEN_PROGS = map_test
+
+include ../lib.mk
+2
tools/testing/selftests/ring-buffer/config
···
+CONFIG_FTRACE=y
+CONFIG_TRACER_SNAPSHOT=y
+294
tools/testing/selftests/ring-buffer/map_test.c
···
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Ring-buffer memory mapping tests
+ *
+ * Copyright (c) 2024 Vincent Donnefort <vdonnefort@google.com>
+ */
+#include <fcntl.h>
+#include <sched.h>
+#include <stdbool.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <unistd.h>
+
+#include <linux/trace_mmap.h>
+
+#include <sys/mman.h>
+#include <sys/ioctl.h>
+
+#include "../user_events/user_events_selftests.h" /* share tracefs setup */
+#include "../kselftest_harness.h"
+
+#define TRACEFS_ROOT "/sys/kernel/tracing"
+
+static int __tracefs_write(const char *path, const char *value)
+{
+	int fd, ret;
+
+	fd = open(path, O_WRONLY | O_TRUNC);
+	if (fd < 0)
+		return fd;
+
+	ret = write(fd, value, strlen(value));
+
+	close(fd);
+
+	return ret == -1 ? -errno : 0;
+}
+
+static int __tracefs_write_int(const char *path, int value)
+{
+	char *str;
+	int ret;
+
+	if (asprintf(&str, "%d", value) < 0)
+		return -1;
+
+	ret = __tracefs_write(path, str);
+
+	free(str);
+
+	return ret;
+}
+
+#define tracefs_write_int(path, value) \
+	ASSERT_EQ(__tracefs_write_int((path), (value)), 0)
+
+#define tracefs_write(path, value) \
+	ASSERT_EQ(__tracefs_write((path), (value)), 0)
+
+static int tracefs_reset(void)
+{
+	if (__tracefs_write_int(TRACEFS_ROOT"/tracing_on", 0))
+		return -1;
+	if (__tracefs_write(TRACEFS_ROOT"/trace", ""))
+		return -1;
+	if (__tracefs_write(TRACEFS_ROOT"/set_event", ""))
+		return -1;
+	if (__tracefs_write(TRACEFS_ROOT"/current_tracer", "nop"))
+		return -1;
+
+	return 0;
+}
+
+struct tracefs_cpu_map_desc {
+	struct trace_buffer_meta *meta;
+	int cpu_fd;
+};
+
+int tracefs_cpu_map(struct tracefs_cpu_map_desc *desc, int cpu)
+{
+	int page_size = getpagesize();
+	char *cpu_path;
+	void *map;
+
+	if (asprintf(&cpu_path,
+		     TRACEFS_ROOT"/per_cpu/cpu%d/trace_pipe_raw",
+		     cpu) < 0)
+		return -ENOMEM;
+
+	desc->cpu_fd = open(cpu_path, O_RDONLY | O_NONBLOCK);
+	free(cpu_path);
+	if (desc->cpu_fd < 0)
+		return -ENODEV;
+
+	map = mmap(NULL, page_size, PROT_READ, MAP_SHARED, desc->cpu_fd, 0);
+	if (map == MAP_FAILED)
+		return -errno;
+
+	desc->meta = (struct trace_buffer_meta *)map;
+
+	return 0;
+}
+
+void tracefs_cpu_unmap(struct tracefs_cpu_map_desc *desc)
+{
+	munmap(desc->meta, desc->meta->meta_page_size);
+	close(desc->cpu_fd);
+}
+
+FIXTURE(map) {
+	struct tracefs_cpu_map_desc map_desc;
+	bool umount;
+};
+
+FIXTURE_VARIANT(map) {
+	int subbuf_size;
+};
+
+FIXTURE_VARIANT_ADD(map, subbuf_size_4k) {
+	.subbuf_size = 4,
+};
+
+FIXTURE_VARIANT_ADD(map, subbuf_size_8k) {
+	.subbuf_size = 8,
+};
+
+FIXTURE_SETUP(map)
+{
+	int cpu = sched_getcpu();
+	cpu_set_t cpu_mask;
+	bool fail, umount;
+	char *message;
+
+	if (getuid() != 0)
+		SKIP(return, "Skipping: %s", "Please run the test as root");
+
+	if (!tracefs_enabled(&message, &fail, &umount)) {
+		if (fail) {
+			TH_LOG("Tracefs setup failed: %s", message);
+			ASSERT_FALSE(fail);
+		}
+		SKIP(return, "Skipping: %s", message);
+	}
+
+	self->umount = umount;
+
+	ASSERT_GE(cpu, 0);
+
+	ASSERT_EQ(tracefs_reset(), 0);
+
+	tracefs_write_int(TRACEFS_ROOT"/buffer_subbuf_size_kb", variant->subbuf_size);
+
+	ASSERT_EQ(tracefs_cpu_map(&self->map_desc, cpu), 0);
+
+	/*
+	 * Ensure generated events will be found on this very same ring-buffer.
+	 */
+	CPU_ZERO(&cpu_mask);
+	CPU_SET(cpu, &cpu_mask);
+	ASSERT_EQ(sched_setaffinity(0, sizeof(cpu_mask), &cpu_mask), 0);
+}
+
+FIXTURE_TEARDOWN(map)
+{
+	tracefs_reset();
+
+	if (self->umount)
+		tracefs_unmount();
+
+	tracefs_cpu_unmap(&self->map_desc);
+}
+
+TEST_F(map, meta_page_check)
+{
+	struct tracefs_cpu_map_desc *desc = &self->map_desc;
+	int cnt = 0;
+
+	ASSERT_EQ(desc->meta->entries, 0);
+	ASSERT_EQ(desc->meta->overrun, 0);
+	ASSERT_EQ(desc->meta->read, 0);
+
+	ASSERT_EQ(desc->meta->reader.id, 0);
+	ASSERT_EQ(desc->meta->reader.read, 0);
+
+	ASSERT_EQ(ioctl(desc->cpu_fd, TRACE_MMAP_IOCTL_GET_READER), 0);
+	ASSERT_EQ(desc->meta->reader.id, 0);
+
+	tracefs_write_int(TRACEFS_ROOT"/tracing_on", 1);
+	for (int i = 0; i < 16; i++)
+		tracefs_write_int(TRACEFS_ROOT"/trace_marker", i);
+again:
+	ASSERT_EQ(ioctl(desc->cpu_fd, TRACE_MMAP_IOCTL_GET_READER), 0);
+
+	ASSERT_EQ(desc->meta->entries, 16);
+	ASSERT_EQ(desc->meta->overrun, 0);
+	ASSERT_EQ(desc->meta->read, 16);
+
+	ASSERT_EQ(desc->meta->reader.id, 1);
+
+	if (!(cnt++))
+		goto again;
+}
+
+TEST_F(map, data_mmap)
+{
+	struct tracefs_cpu_map_desc *desc = &self->map_desc;
+	unsigned long meta_len, data_len;
+	void *data;
+
+	meta_len = desc->meta->meta_page_size;
+	data_len = desc->meta->subbuf_size * desc->meta->nr_subbufs;
+
+	/* Map all the available subbufs */
+	data = mmap(NULL, data_len, PROT_READ, MAP_SHARED,
+		    desc->cpu_fd, meta_len);
+	ASSERT_NE(data, MAP_FAILED);
+	munmap(data, data_len);
+
+	/* Map all the available subbufs - 1 */
+	data_len -= desc->meta->subbuf_size;
+	data = mmap(NULL, data_len, PROT_READ, MAP_SHARED,
+		    desc->cpu_fd, meta_len);
+	ASSERT_NE(data, MAP_FAILED);
+	munmap(data, data_len);
+
+	/* Overflow the available subbufs by 1 */
+	meta_len += desc->meta->subbuf_size * 2;
+	data = mmap(NULL, data_len, PROT_READ, MAP_SHARED,
+		    desc->cpu_fd, meta_len);
+	ASSERT_EQ(data, MAP_FAILED);
+}
+
+FIXTURE(snapshot) {
+	bool umount;
+};
+
+FIXTURE_SETUP(snapshot)
+{
+	bool fail, umount;
+	struct stat sb;
+	char *message;
+
+	if (getuid() != 0)
+		SKIP(return, "Skipping: %s", "Please run the test as root");
+
+	if (stat(TRACEFS_ROOT"/snapshot", &sb))
+		SKIP(return, "Skipping: %s", "snapshot not available");
+
+	if (!tracefs_enabled(&message, &fail, &umount)) {
+		if (fail) {
+			TH_LOG("Tracefs setup failed: %s", message);
+			ASSERT_FALSE(fail);
+		}
+		SKIP(return, "Skipping: %s", message);
+	}
+
+	self->umount = umount;
+}
+
+FIXTURE_TEARDOWN(snapshot)
+{
+	__tracefs_write(TRACEFS_ROOT"/events/sched/sched_switch/trigger",
+			"!snapshot");
+	tracefs_reset();
+
+	if (self->umount)
+		tracefs_unmount();
+}
+
+TEST_F(snapshot, excludes_map)
+{
+	struct tracefs_cpu_map_desc map_desc;
+	int cpu = sched_getcpu();
+
+	ASSERT_GE(cpu, 0);
+	tracefs_write(TRACEFS_ROOT"/events/sched/sched_switch/trigger",
+		      "snapshot");
+	ASSERT_EQ(tracefs_cpu_map(&map_desc, cpu), -EBUSY);
+}
+
+TEST_F(snapshot, excluded_by_map)
+{
+	struct tracefs_cpu_map_desc map_desc;
+	int cpu = sched_getcpu();
+
+	ASSERT_EQ(tracefs_cpu_map(&map_desc, cpu), 0);
+
+	ASSERT_EQ(__tracefs_write(TRACEFS_ROOT"/events/sched/sched_switch/trigger",
+				  "snapshot"), -EBUSY);
+	ASSERT_EQ(__tracefs_write(TRACEFS_ROOT"/snapshot",
+				  "1"), -EBUSY);
+}
+
+TEST_HARNESS_MAIN
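Reviewer note: the three cases in `data_mmap` come down to simple offset arithmetic over a layout of one meta page followed by `nr_subbufs` sub-buffers. The sketch below restates that arithmetic as a pure function so it can be checked without tracefs. `struct map_layout`, `layout_total()` and `request_fits()` are hypothetical names, and the bounds rule is an assumption read off the test's expectations, not the kernel's actual validation code.

```c
#include <stdbool.h>

/* Hypothetical description of what one per-CPU mapping exposes. */
struct map_layout {
	unsigned long meta_page_size;	/* bytes used by the meta page */
	unsigned long subbuf_size;	/* bytes per sub-buffer */
	unsigned long nr_subbufs;	/* number of mappable sub-buffers */
};

/* Total mappable bytes: meta page first, then all sub-buffers. */
unsigned long layout_total(const struct map_layout *l)
{
	return l->meta_page_size + l->subbuf_size * l->nr_subbufs;
}

/*
 * Assumed rule: a request [offset, offset + len) is acceptable only if
 * it is non-empty and ends within the mappable area. The comparison is
 * written so that offset + len is never computed and cannot wrap.
 */
bool request_fits(const struct map_layout *l,
		  unsigned long offset, unsigned long len)
{
	unsigned long total = layout_total(l);

	return len != 0 && offset <= total && len <= total - offset;
}
```

For a layout with a 4096-byte meta page and eight 4096-byte sub-buffers, a request at offset 4096 for all eight sub-buffers fits, a request for seven fits, and a request for seven starting two sub-buffers later does not, which matches the order of the test's three `mmap()` calls.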