Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge tag 'probes-v6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull probes updates from Masami Hiramatsu:
"x86 kprobes:

- Use boolean for some function return instead of 0 and 1

- Prohibit probing on INT/UD. This prevents users from putting a kprobe
on INTn/INT1/INT3/INTO and UD0/UD1/UD2, because these instructions are
used for special purposes in the kernel

- Boost Grp instructions. A few percent of kernel instructions are
Grp 2/3/4/5, and those are safe to execute without an ip register
fixup, so allow them to be boosted (direct execution on the
trampoline buffer with a JMP)

tracing:

- Add function argument access from return events (kretprobe and
fprobe). This lets users compare how a data structure field changes
across the execution of a function. With BTF, return events also
accept function argument access by name.

- Fix a wrong comment (using "Kretprobe" in fprobe)

- Split the big probe argument parser function into three parts: a type
parser, a post-processing function, and the main parser

- Set the nr_args field when initializing trace_probe instead of
counting it up while parsing

- Remove a redundant #else block from tracefs/README source code

- Update selftests to check entry argument access from return probes

- Documentation update about entry argument access from return
probes"

* tag 'probes-v6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
Documentation: tracing: Add entry argument access at function exit
selftests/ftrace: Add test cases for entry args at function exit
tracing/probes: Support $argN in return probe (kprobe and fprobe)
tracing: Remove redundant #else block for BTF args from README
tracing/probes: cleanup: Set trace_probe::nr_args at trace_probe_init
tracing/probes: Cleanup probe argument parser
tracing/fprobe-event: cleanup: Fix a wrong comment in fprobe event
x86/kprobes: Boost more instructions from grp2/3/4/5
x86/kprobes: Prohibit kprobing on INT and UD
x86/kprobes: Refactor can_{probe,boost} return type to bool
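
The "Prohibit kprobing on INT and UD" change listed above comes down to an opcode check. Below is a minimal user-space sketch of that check. It assumes a simplified `struct opcode` (a hypothetical stand-in; the kernel code works on `struct insn` from its x86 instruction decoder) and a hypothetical function name, so treat it as an illustration rather than the kernel implementation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the opcode part of the kernel's struct insn. */
struct opcode {
	/* bytes[0] is the primary opcode; bytes[1] follows a 0x0f escape */
	uint8_t bytes[2];
};

/* Return true for INT3 / INT n / INTO / INT1 and UD0 / UD1 / UD2. */
bool is_exception_opcode(const struct opcode *op)
{
	if (op->bytes[0] == 0x0f) {
		/* Two-byte opcodes: UD0 (0f ff), UD1 (0f b9), UD2 (0f 0b) */
		return op->bytes[1] == 0xff ||
		       op->bytes[1] == 0xb9 ||
		       op->bytes[1] == 0x0b;
	}

	/* One-byte opcodes: INT3 (cc), INT n (cd), INTO (ce), INT1 (f1) */
	return op->bytes[0] == 0xcc ||
	       op->bytes[0] == 0xcd ||
	       op->bytes[0] == 0xce ||
	       op->bytes[0] == 0xf1;
}
```

A probe-placement check would reject any address whose decoded instruction makes this function return true.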

+585 -200
+31
Documentation/trace/fprobetrace.rst
···
 
 For the details of TYPE, see :ref:`kprobetrace documentation <kprobetrace_types>`.
 
+Function arguments at exit
+--------------------------
+Function arguments can be accessed at exit probe using $arg<N> fetcharg. This
+is useful to record the function parameter and return value at once, and
+trace the difference of structure fields (for debugging a function whether it
+correctly updates the given data structure or not)
+See the :ref:`sample<fprobetrace_exit_args_sample>` below for how it works.
+
 BTF arguments
 -------------
 BTF (BPF Type Format) argument allows user to trace function and tracepoint
···
  <idle>-0 [000] d..3. 5606.690317: sched_switch: (__probestub_sched_switch+0x4/0x10) comm="kworker/0:1" usage=1 start_time=137000000
  kworker/0:1-14 [000] d..3. 5606.690339: sched_switch: (__probestub_sched_switch+0x4/0x10) comm="swapper/0" usage=2 start_time=0
  <idle>-0 [000] d..3. 5606.692368: sched_switch: (__probestub_sched_switch+0x4/0x10) comm="kworker/0:1" usage=1 start_time=137000000
+
+.. _fprobetrace_exit_args_sample:
+
+The return probe allows us to access the results of some functions, which return
+the error code and its results are passed via function parameter, such as a
+structure-initialization function.
+
+For example, vfs_open() will link the file structure to the inode and update
+mode. You can trace those changes with return probe.
+::
+
+ # echo 'f vfs_open mode=file->f_mode:x32 inode=file->f_inode:x64' >> dynamic_events
+ # echo 'f vfs_open%return mode=file->f_mode:x32 inode=file->f_inode:x64' >> dynamic_events
+ # echo 1 > events/fprobes/enable
+ # cat trace
+ sh-131 [006] ...1. 1945.714346: vfs_open__entry: (vfs_open+0x4/0x40) mode=0x2 inode=0x0
+ sh-131 [006] ...1. 1945.714358: vfs_open__exit: (do_open+0x274/0x3d0 <- vfs_open) mode=0x4d801e inode=0xffff888008470168
+ cat-143 [007] ...1. 1945.717949: vfs_open__entry: (vfs_open+0x4/0x40) mode=0x1 inode=0x0
+ cat-143 [007] ...1. 1945.717956: vfs_open__exit: (do_open+0x274/0x3d0 <- vfs_open) mode=0x4a801d inode=0xffff888005f78d28
+ cat-143 [007] ...1. 1945.720616: vfs_open__entry: (vfs_open+0x4/0x40) mode=0x1 inode=0x0
+ cat-143 [007] ...1. 1945.728263: vfs_open__exit: (do_open+0x274/0x3d0 <- vfs_open) mode=0xa800d inode=0xffff888004ada8d8
+
+You can see the `file::f_mode` and `file::f_inode` are updated in `vfs_open()`.
+9
Documentation/trace/kprobetrace.rst
···
 (\*3) this is useful for fetching a field of data structures.
 (\*4) "u" means user-space dereference. See :ref:`user_mem_access`.
 
+Function arguments at kretprobe
+-------------------------------
+Function arguments can be accessed at kretprobe using $arg<N> fetcharg. This
+is useful to record the function parameter and return value at once, and
+trace the difference of structure fields (for debugging a function whether it
+correctly updates the given data structure or not).
+See the :ref:`sample<fprobetrace_exit_args_sample>` in fprobe event for how
+it works.
+
 .. _kprobetrace_types:
 
 Types
+1 -1
arch/x86/kernel/kprobes/common.h
···
 #endif
 
 /* Ensure if the instruction can be boostable */
-extern int can_boost(struct insn *insn, void *orig_addr);
+extern bool can_boost(struct insn *insn, void *orig_addr);
 /* Recover instruction if given address is probed */
 extern unsigned long recover_probed_instruction(kprobe_opcode_t *buf,
 					 unsigned long addr);
+69 -31
arch/x86/kernel/kprobes/core.c
··· 137 137 * Returns non-zero if INSN is boostable. 138 138 * RIP relative instructions are adjusted at copying time in 64 bits mode 139 139 */ 140 - int can_boost(struct insn *insn, void *addr) 140 + bool can_boost(struct insn *insn, void *addr) 141 141 { 142 142 kprobe_opcode_t opcode; 143 143 insn_byte_t prefix; 144 144 int i; 145 145 146 146 if (search_exception_tables((unsigned long)addr)) 147 - return 0; /* Page fault may occur on this address. */ 147 + return false; /* Page fault may occur on this address. */ 148 148 149 149 /* 2nd-byte opcode */ 150 150 if (insn->opcode.nbytes == 2) ··· 152 152 (unsigned long *)twobyte_is_boostable); 153 153 154 154 if (insn->opcode.nbytes != 1) 155 - return 0; 155 + return false; 156 156 157 157 for_each_insn_prefix(insn, i, prefix) { 158 158 insn_attr_t attr; ··· 160 160 attr = inat_get_opcode_attribute(prefix); 161 161 /* Can't boost Address-size override prefix and CS override prefix */ 162 162 if (prefix == 0x2e || inat_is_address_size_prefix(attr)) 163 - return 0; 163 + return false; 164 164 } 165 165 166 166 opcode = insn->opcode.bytes[0]; ··· 169 169 case 0x62: /* bound */ 170 170 case 0x70 ... 0x7f: /* Conditional jumps */ 171 171 case 0x9a: /* Call far */ 172 - case 0xc0 ... 0xc1: /* Grp2 */ 173 172 case 0xcc ... 0xce: /* software exceptions */ 174 - case 0xd0 ... 0xd3: /* Grp2 */ 175 173 case 0xd6: /* (UD) */ 176 174 case 0xd8 ... 0xdf: /* ESC */ 177 175 case 0xe0 ... 0xe3: /* LOOP*, JCXZ */ 178 176 case 0xe8 ... 0xe9: /* near Call, JMP */ 179 177 case 0xeb: /* Short JMP */ 180 178 case 0xf0 ... 0xf4: /* LOCK/REP, HLT */ 181 - case 0xf6 ... 0xf7: /* Grp3 */ 182 - case 0xfe: /* Grp4 */ 183 179 /* ... are not boostable */ 184 - return 0; 180 + return false; 181 + case 0xc0 ... 0xc1: /* Grp2 */ 182 + case 0xd0 ... 0xd3: /* Grp2 */ 183 + /* 184 + * AMD uses nnn == 110 as SHL/SAL, but Intel makes it reserved. 185 + */ 186 + return X86_MODRM_REG(insn->modrm.bytes[0]) != 0b110; 187 + case 0xf6 ... 
0xf7: /* Grp3 */ 188 + /* AMD uses nnn == 001 as TEST, but Intel makes it reserved. */ 189 + return X86_MODRM_REG(insn->modrm.bytes[0]) != 0b001; 190 + case 0xfe: /* Grp4 */ 191 + /* Only INC and DEC are boostable */ 192 + return X86_MODRM_REG(insn->modrm.bytes[0]) == 0b000 || 193 + X86_MODRM_REG(insn->modrm.bytes[0]) == 0b001; 185 194 case 0xff: /* Grp5 */ 186 - /* Only indirect jmp is boostable */ 187 - return X86_MODRM_REG(insn->modrm.bytes[0]) == 4; 195 + /* Only INC, DEC, and indirect JMP are boostable */ 196 + return X86_MODRM_REG(insn->modrm.bytes[0]) == 0b000 || 197 + X86_MODRM_REG(insn->modrm.bytes[0]) == 0b001 || 198 + X86_MODRM_REG(insn->modrm.bytes[0]) == 0b100; 188 199 default: 189 - return 1; 200 + return true; 190 201 } 191 202 } 192 203 ··· 263 252 return __recover_probed_insn(buf, addr); 264 253 } 265 254 266 - /* Check if paddr is at an instruction boundary */ 267 - static int can_probe(unsigned long paddr) 255 + /* Check if insn is INT or UD */ 256 + static inline bool is_exception_insn(struct insn *insn) 257 + { 258 + /* UD uses 0f escape */ 259 + if (insn->opcode.bytes[0] == 0x0f) { 260 + /* UD0 / UD1 / UD2 */ 261 + return insn->opcode.bytes[1] == 0xff || 262 + insn->opcode.bytes[1] == 0xb9 || 263 + insn->opcode.bytes[1] == 0x0b; 264 + } 265 + 266 + /* INT3 / INT n / INTO / INT1 */ 267 + return insn->opcode.bytes[0] == 0xcc || 268 + insn->opcode.bytes[0] == 0xcd || 269 + insn->opcode.bytes[0] == 0xce || 270 + insn->opcode.bytes[0] == 0xf1; 271 + } 272 + 273 + /* 274 + * Check if paddr is at an instruction boundary and that instruction can 275 + * be probed 276 + */ 277 + static bool can_probe(unsigned long paddr) 268 278 { 269 279 unsigned long addr, __addr, offset = 0; 270 280 struct insn insn; 271 281 kprobe_opcode_t buf[MAX_INSN_SIZE]; 272 282 273 283 if (!kallsyms_lookup_size_offset(paddr, NULL, &offset)) 274 - return 0; 284 + return false; 275 285 276 286 /* Decode instructions */ 277 287 addr = paddr - offset; 278 288 while (addr < paddr) 
{ 279 - int ret; 280 - 281 289 /* 282 290 * Check if the instruction has been modified by another 283 291 * kprobe, in which case we replace the breakpoint by the ··· 307 277 */ 308 278 __addr = recover_probed_instruction(buf, addr); 309 279 if (!__addr) 310 - return 0; 280 + return false; 311 281 312 - ret = insn_decode_kernel(&insn, (void *)__addr); 313 - if (ret < 0) 314 - return 0; 282 + if (insn_decode_kernel(&insn, (void *)__addr) < 0) 283 + return false; 315 284 316 285 #ifdef CONFIG_KGDB 317 286 /* ··· 319 290 */ 320 291 if (insn.opcode.bytes[0] == INT3_INSN_OPCODE && 321 292 kgdb_has_hit_break(addr)) 322 - return 0; 293 + return false; 323 294 #endif 324 295 addr += insn.length; 325 296 } 297 + 298 + /* Check if paddr is at an instruction boundary */ 299 + if (addr != paddr) 300 + return false; 301 + 302 + __addr = recover_probed_instruction(buf, addr); 303 + if (!__addr) 304 + return false; 305 + 306 + if (insn_decode_kernel(&insn, (void *)__addr) < 0) 307 + return false; 308 + 309 + /* INT and UD are special and should not be kprobed */ 310 + if (is_exception_insn(&insn)) 311 + return false; 312 + 326 313 if (IS_ENABLED(CONFIG_CFI_CLANG)) { 327 314 /* 328 315 * The compiler generates the following instruction sequence ··· 353 308 * Also, these movl and addl are used for showing expected 354 309 * type. So those must not be touched. 355 310 */ 356 - __addr = recover_probed_instruction(buf, addr); 357 - if (!__addr) 358 - return 0; 359 - 360 - if (insn_decode_kernel(&insn, (void *)__addr) < 0) 361 - return 0; 362 - 363 311 if (insn.opcode.value == 0xBA) 364 312 offset = 12; 365 313 else if (insn.opcode.value == 0x3) ··· 362 324 363 325 /* This movl/addl is used for decoding CFI. */ 364 326 if (is_cfi_trap(addr + offset)) 365 - return 0; 327 + return false; 366 328 } 367 329 368 330 out: 369 - return (addr == paddr); 331 + return true; 370 332 } 371 333 372 334 /* If x86 supports IBT (ENDBR) it must be skipped. */
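
The new Grp2/3/4/5 handling in `can_boost()` above decides boostability from the reg (nnn) field of the ModRM byte. The following user-space sketch reproduces only that part of the decision. The function names `modrm_reg` and `grp_is_boostable` are hypothetical, the kernel's `X86_MODRM_REG()` macro is replaced by a plain helper, and every opcode outside the four groups is simply reported as not boostable, which is a simplification and not what the kernel does.

```c
#include <stdbool.h>
#include <stdint.h>

/* Extract the reg/nnn field (bits 5:3) of a ModRM byte. */
unsigned int modrm_reg(uint8_t modrm)
{
	return (modrm >> 3) & 0x7;
}

/*
 * Simplified boost decision for the Grp2/3/4/5 opcodes only.
 * Other opcodes are out of scope for this sketch and return false.
 */
bool grp_is_boostable(uint8_t opcode, uint8_t modrm)
{
	unsigned int reg = modrm_reg(modrm);

	switch (opcode) {
	case 0xc0: case 0xc1:
	case 0xd0: case 0xd1: case 0xd2: case 0xd3:	/* Grp2 */
		/* nnn == 110 is SHL/SAL on AMD but reserved on Intel */
		return reg != 6;
	case 0xf6: case 0xf7:				/* Grp3 */
		/* nnn == 001 is TEST on AMD but reserved on Intel */
		return reg != 1;
	case 0xfe:					/* Grp4 */
		/* Only INC and DEC */
		return reg == 0 || reg == 1;
	case 0xff:					/* Grp5 */
		/* Only INC, DEC, and indirect JMP */
		return reg == 0 || reg == 1 || reg == 4;
	default:
		return false;
	}
}
```

For instance, `ff e0` (an indirect `jmp` through a register) has nnn == 100 and is accepted, while `ff d0` (an indirect `call`) has nnn == 010 and is rejected.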
+2 -3
kernel/trace/trace.c
···
 	"\t args: <name>=fetcharg[:type]\n"
 	"\t fetcharg: (%<register>|$<efield>), @<address>, @<symbol>[+|-<offset>],\n"
 #ifdef CONFIG_HAVE_FUNCTION_ARG_ACCESS_API
+	"\t $stack<index>, $stack, $retval, $comm, $arg<N>,\n"
 #ifdef CONFIG_PROBE_EVENTS_BTF_ARGS
-	"\t $stack<index>, $stack, $retval, $comm, $arg<N>,\n"
 	"\t <argname>[->field[->field|.field...]],\n"
-#else
-	"\t $stack<index>, $stack, $retval, $comm, $arg<N>,\n"
 #endif
 #else
 	"\t $stack<index>, $stack, $retval, $comm,\n"
 #endif
 	"\t +|-[u]<offset>(<fetcharg>), \\imm-value, \\\"imm-string\"\n"
+	"\t kernel return probes support: $retval, $arg<N>, $comm\n"
 	"\t type: s8/16/32/64, u8/16/32/64, x8/16/32/64, char, string, symbol,\n"
 	"\t b<bit-width>@<bit-offset>/<container-size>, ustring,\n"
 	"\t symstr, <type>\\[<array-size>\\]\n"
+4 -4
kernel/trace/trace_eprobe.c
···
 	if (!ep->event_system)
 		goto error;
 
-	ret = trace_probe_init(&ep->tp, this_event, group, false);
+	ret = trace_probe_init(&ep->tp, this_event, group, false, nargs);
 	if (ret < 0)
 		goto error;
 
···
 
 /* Note that we don't verify it, since the code does not come from user space */
 static int
-process_fetch_insn(struct fetch_insn *code, void *rec, void *dest,
-		   void *base)
+process_fetch_insn(struct fetch_insn *code, void *rec, void *edata,
+		   void *dest, void *base)
 {
 	unsigned long val;
 	int ret;
···
 		return;
 
 	entry = fbuffer.entry = ring_buffer_event_data(fbuffer.event);
-	store_trace_args(&entry[1], &edata->ep->tp, rec, sizeof(*entry), dsize);
+	store_trace_args(&entry[1], &edata->ep->tp, rec, NULL, sizeof(*entry), dsize);
 
 	trace_event_buffer_commit(&fbuffer);
 }
+41 -18
kernel/trace/trace_fprobe.c
··· 4 4 * Copyright (C) 2022 Google LLC. 5 5 */ 6 6 #define pr_fmt(fmt) "trace_fprobe: " fmt 7 + #include <asm/ptrace.h> 7 8 8 9 #include <linux/fprobe.h> 9 10 #include <linux/module.h> ··· 130 129 * from user space. 131 130 */ 132 131 static int 133 - process_fetch_insn(struct fetch_insn *code, void *rec, void *dest, 134 - void *base) 132 + process_fetch_insn(struct fetch_insn *code, void *rec, void *edata, 133 + void *dest, void *base) 135 134 { 136 135 struct pt_regs *regs = rec; 137 136 unsigned long val; ··· 152 151 #ifdef CONFIG_HAVE_FUNCTION_ARG_ACCESS_API 153 152 case FETCH_OP_ARG: 154 153 val = regs_get_kernel_argument(regs, code->param); 154 + break; 155 + case FETCH_OP_EDATA: 156 + val = *(unsigned long *)((unsigned long)edata + code->offset); 155 157 break; 156 158 #endif 157 159 case FETCH_NOP_SYMBOL: /* Ignore a place holder */ ··· 188 184 if (trace_trigger_soft_disabled(trace_file)) 189 185 return; 190 186 191 - dsize = __get_data_size(&tf->tp, regs); 187 + dsize = __get_data_size(&tf->tp, regs, NULL); 192 188 193 189 entry = trace_event_buffer_reserve(&fbuffer, trace_file, 194 190 sizeof(*entry) + tf->tp.size + dsize); ··· 198 194 fbuffer.regs = regs; 199 195 entry = fbuffer.entry = ring_buffer_event_data(fbuffer.event); 200 196 entry->ip = entry_ip; 201 - store_trace_args(&entry[1], &tf->tp, regs, sizeof(*entry), dsize); 197 + store_trace_args(&entry[1], &tf->tp, regs, NULL, sizeof(*entry), dsize); 202 198 203 199 trace_event_buffer_commit(&fbuffer); 204 200 } ··· 214 210 } 215 211 NOKPROBE_SYMBOL(fentry_trace_func); 216 212 217 - /* Kretprobe handler */ 213 + /* function exit handler */ 214 + static int trace_fprobe_entry_handler(struct fprobe *fp, unsigned long entry_ip, 215 + unsigned long ret_ip, struct pt_regs *regs, 216 + void *entry_data) 217 + { 218 + struct trace_fprobe *tf = container_of(fp, struct trace_fprobe, fp); 219 + 220 + if (tf->tp.entry_arg) 221 + store_trace_entry_data(entry_data, &tf->tp, regs); 222 + 223 + return 0; 224 + } 
225 + NOKPROBE_SYMBOL(trace_fprobe_entry_handler) 226 + 218 227 static nokprobe_inline void 219 228 __fexit_trace_func(struct trace_fprobe *tf, unsigned long entry_ip, 220 229 unsigned long ret_ip, struct pt_regs *regs, 221 - struct trace_event_file *trace_file) 230 + void *entry_data, struct trace_event_file *trace_file) 222 231 { 223 232 struct fexit_trace_entry_head *entry; 224 233 struct trace_event_buffer fbuffer; ··· 244 227 if (trace_trigger_soft_disabled(trace_file)) 245 228 return; 246 229 247 - dsize = __get_data_size(&tf->tp, regs); 230 + dsize = __get_data_size(&tf->tp, regs, entry_data); 248 231 249 232 entry = trace_event_buffer_reserve(&fbuffer, trace_file, 250 233 sizeof(*entry) + tf->tp.size + dsize); ··· 255 238 entry = fbuffer.entry = ring_buffer_event_data(fbuffer.event); 256 239 entry->func = entry_ip; 257 240 entry->ret_ip = ret_ip; 258 - store_trace_args(&entry[1], &tf->tp, regs, sizeof(*entry), dsize); 241 + store_trace_args(&entry[1], &tf->tp, regs, entry_data, sizeof(*entry), dsize); 259 242 260 243 trace_event_buffer_commit(&fbuffer); 261 244 } 262 245 263 246 static void 264 247 fexit_trace_func(struct trace_fprobe *tf, unsigned long entry_ip, 265 - unsigned long ret_ip, struct pt_regs *regs) 248 + unsigned long ret_ip, struct pt_regs *regs, void *entry_data) 266 249 { 267 250 struct event_file_link *link; 268 251 269 252 trace_probe_for_each_link_rcu(link, &tf->tp) 270 - __fexit_trace_func(tf, entry_ip, ret_ip, regs, link->file); 253 + __fexit_trace_func(tf, entry_ip, ret_ip, regs, entry_data, link->file); 271 254 } 272 255 NOKPROBE_SYMBOL(fexit_trace_func); 273 256 ··· 286 269 if (hlist_empty(head)) 287 270 return 0; 288 271 289 - dsize = __get_data_size(&tf->tp, regs); 272 + dsize = __get_data_size(&tf->tp, regs, NULL); 290 273 __size = sizeof(*entry) + tf->tp.size + dsize; 291 274 size = ALIGN(__size + sizeof(u32), sizeof(u64)); 292 275 size -= sizeof(u32); ··· 297 280 298 281 entry->ip = entry_ip; 299 282 memset(&entry[1], 0, 
dsize); 300 - store_trace_args(&entry[1], &tf->tp, regs, sizeof(*entry), dsize); 283 + store_trace_args(&entry[1], &tf->tp, regs, NULL, sizeof(*entry), dsize); 301 284 perf_trace_buf_submit(entry, size, rctx, call->event.type, 1, regs, 302 285 head, NULL); 303 286 return 0; ··· 306 289 307 290 static void 308 291 fexit_perf_func(struct trace_fprobe *tf, unsigned long entry_ip, 309 - unsigned long ret_ip, struct pt_regs *regs) 292 + unsigned long ret_ip, struct pt_regs *regs, 293 + void *entry_data) 310 294 { 311 295 struct trace_event_call *call = trace_probe_event_call(&tf->tp); 312 296 struct fexit_trace_entry_head *entry; ··· 319 301 if (hlist_empty(head)) 320 302 return; 321 303 322 - dsize = __get_data_size(&tf->tp, regs); 304 + dsize = __get_data_size(&tf->tp, regs, entry_data); 323 305 __size = sizeof(*entry) + tf->tp.size + dsize; 324 306 size = ALIGN(__size + sizeof(u32), sizeof(u64)); 325 307 size -= sizeof(u32); ··· 330 312 331 313 entry->func = entry_ip; 332 314 entry->ret_ip = ret_ip; 333 - store_trace_args(&entry[1], &tf->tp, regs, sizeof(*entry), dsize); 315 + store_trace_args(&entry[1], &tf->tp, regs, entry_data, sizeof(*entry), dsize); 334 316 perf_trace_buf_submit(entry, size, rctx, call->event.type, 1, regs, 335 317 head, NULL); 336 318 } ··· 361 343 struct trace_fprobe *tf = container_of(fp, struct trace_fprobe, fp); 362 344 363 345 if (trace_probe_test_flag(&tf->tp, TP_FLAG_TRACE)) 364 - fexit_trace_func(tf, entry_ip, ret_ip, regs); 346 + fexit_trace_func(tf, entry_ip, ret_ip, regs, entry_data); 365 347 #ifdef CONFIG_PERF_EVENTS 366 348 if (trace_probe_test_flag(&tf->tp, TP_FLAG_PROFILE)) 367 - fexit_perf_func(tf, entry_ip, ret_ip, regs); 349 + fexit_perf_func(tf, entry_ip, ret_ip, regs, entry_data); 368 350 #endif 369 351 } 370 352 NOKPROBE_SYMBOL(fexit_dispatcher); ··· 407 389 tf->tpoint = tpoint; 408 390 tf->fp.nr_maxactive = maxactive; 409 391 410 - ret = trace_probe_init(&tf->tp, event, group, false); 392 + ret = trace_probe_init(&tf->tp, 
event, group, false, nargs); 411 393 if (ret < 0) 412 394 goto error; 413 395 ··· 1125 1107 ret = traceprobe_parse_probe_arg(&tf->tp, i, argv[i], &ctx); 1126 1108 if (ret) 1127 1109 goto error; /* This can be -ENOMEM */ 1110 + } 1111 + 1112 + if (is_return && tf->tp.entry_arg) { 1113 + tf->fp.entry_handler = trace_fprobe_entry_handler; 1114 + tf->fp.entry_data_size = traceprobe_get_entry_data_size(&tf->tp); 1128 1115 } 1129 1116 1130 1117 ret = traceprobe_set_print_fmt(&tf->tp,
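
The fprobe changes above pair an entry handler, which saves selected arguments into a per-call buffer, with an exit-side `FETCH_OP_EDATA` read at a fixed offset. The sketch below models that round trip in user space. All types and names here (`struct fake_regs`, `store_entry_data`, `fetch_entry_data`, the `OP_*` values) are hypothetical simplifications; in particular, the argument index is assumed to be in range, and real register access goes through `regs_get_kernel_argument()`.

```c
#include <stddef.h>
#include <string.h>

enum fetch_op { OP_END, OP_ARG, OP_ST_EDATA };

/* Hypothetical, trimmed-down fetch instruction. */
struct fetch_insn {
	enum fetch_op op;
	unsigned int param;	/* OP_ARG: argument index */
	size_t offset;		/* OP_ST_EDATA: byte offset in the entry buffer */
};

/* Hypothetical register snapshot: just the first few arguments. */
struct fake_regs {
	unsigned long args[6];
};

/* At function entry: run the ARG/ST_EDATA pairs and fill the buffer. */
void store_entry_data(void *edata, const struct fetch_insn *code,
		      size_t ncode, const struct fake_regs *regs)
{
	unsigned long val = 0;
	size_t i;

	for (i = 0; i < ncode; i++) {
		switch (code[i].op) {
		case OP_ARG:
			/* Load the argument; assumes param < 6 */
			val = regs->args[code[i].param];
			break;
		case OP_ST_EDATA:
			/* Store the last loaded value at its slot */
			memcpy((char *)edata + code[i].offset, &val, sizeof(val));
			break;
		case OP_END:
			return;
		}
	}
}

/* At function exit: read a saved argument back by its offset. */
unsigned long fetch_entry_data(const void *edata, size_t offset)
{
	unsigned long val;

	memcpy(&val, (const char *)edata + offset, sizeof(val));
	return val;
}
```

Saving the values at entry matters because argument registers are generally clobbered by the time the function returns, so the exit handler cannot read them from its own register snapshot.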
+47 -11
kernel/trace/trace_kprobe.c
··· 290 290 INIT_HLIST_NODE(&tk->rp.kp.hlist); 291 291 INIT_LIST_HEAD(&tk->rp.kp.list); 292 292 293 - ret = trace_probe_init(&tk->tp, event, group, false); 293 + ret = trace_probe_init(&tk->tp, event, group, false, nargs); 294 294 if (ret < 0) 295 295 goto error; 296 296 ··· 740 740 return ctx.count; 741 741 } 742 742 743 + static int trace_kprobe_entry_handler(struct kretprobe_instance *ri, 744 + struct pt_regs *regs); 745 + 743 746 static int __trace_kprobe_create(int argc, const char *argv[]) 744 747 { 745 748 /* ··· 950 947 ret = traceprobe_parse_probe_arg(&tk->tp, i, argv[i], &ctx); 951 948 if (ret) 952 949 goto error; /* This can be -ENOMEM */ 950 + } 951 + /* entry handler for kretprobe */ 952 + if (is_return && tk->tp.entry_arg) { 953 + tk->rp.entry_handler = trace_kprobe_entry_handler; 954 + tk->rp.data_size = traceprobe_get_entry_data_size(&tk->tp); 953 955 } 954 956 955 957 ptype = is_return ? PROBE_PRINT_RETURN : PROBE_PRINT_NORMAL; ··· 1311 1303 1312 1304 /* Note that we don't verify it, since the code does not come from user space */ 1313 1305 static int 1314 - process_fetch_insn(struct fetch_insn *code, void *rec, void *dest, 1315 - void *base) 1306 + process_fetch_insn(struct fetch_insn *code, void *rec, void *edata, 1307 + void *dest, void *base) 1316 1308 { 1317 1309 struct pt_regs *regs = rec; 1318 1310 unsigned long val; ··· 1336 1328 #ifdef CONFIG_HAVE_FUNCTION_ARG_ACCESS_API 1337 1329 case FETCH_OP_ARG: 1338 1330 val = regs_get_kernel_argument(regs, code->param); 1331 + break; 1332 + case FETCH_OP_EDATA: 1333 + val = *(unsigned long *)((unsigned long)edata + code->offset); 1339 1334 break; 1340 1335 #endif 1341 1336 case FETCH_NOP_SYMBOL: /* Ignore a place holder */ ··· 1370 1359 if (trace_trigger_soft_disabled(trace_file)) 1371 1360 return; 1372 1361 1373 - dsize = __get_data_size(&tk->tp, regs); 1362 + dsize = __get_data_size(&tk->tp, regs, NULL); 1374 1363 1375 1364 entry = trace_event_buffer_reserve(&fbuffer, trace_file, 1376 1365 
sizeof(*entry) + tk->tp.size + dsize); ··· 1379 1368 1380 1369 fbuffer.regs = regs; 1381 1370 entry->ip = (unsigned long)tk->rp.kp.addr; 1382 - store_trace_args(&entry[1], &tk->tp, regs, sizeof(*entry), dsize); 1371 + store_trace_args(&entry[1], &tk->tp, regs, NULL, sizeof(*entry), dsize); 1383 1372 1384 1373 trace_event_buffer_commit(&fbuffer); 1385 1374 } ··· 1395 1384 NOKPROBE_SYMBOL(kprobe_trace_func); 1396 1385 1397 1386 /* Kretprobe handler */ 1387 + 1388 + static int trace_kprobe_entry_handler(struct kretprobe_instance *ri, 1389 + struct pt_regs *regs) 1390 + { 1391 + struct kretprobe *rp = get_kretprobe(ri); 1392 + struct trace_kprobe *tk; 1393 + 1394 + /* 1395 + * There is a small chance that get_kretprobe(ri) returns NULL when 1396 + * the kretprobe is unregister on another CPU between kretprobe's 1397 + * trampoline_handler and this function. 1398 + */ 1399 + if (unlikely(!rp)) 1400 + return -ENOENT; 1401 + 1402 + tk = container_of(rp, struct trace_kprobe, rp); 1403 + 1404 + /* store argument values into ri->data as entry data */ 1405 + if (tk->tp.entry_arg) 1406 + store_trace_entry_data(ri->data, &tk->tp, regs); 1407 + 1408 + return 0; 1409 + } 1410 + 1411 + 1398 1412 static nokprobe_inline void 1399 1413 __kretprobe_trace_func(struct trace_kprobe *tk, struct kretprobe_instance *ri, 1400 1414 struct pt_regs *regs, ··· 1435 1399 if (trace_trigger_soft_disabled(trace_file)) 1436 1400 return; 1437 1401 1438 - dsize = __get_data_size(&tk->tp, regs); 1402 + dsize = __get_data_size(&tk->tp, regs, ri->data); 1439 1403 1440 1404 entry = trace_event_buffer_reserve(&fbuffer, trace_file, 1441 1405 sizeof(*entry) + tk->tp.size + dsize); ··· 1445 1409 fbuffer.regs = regs; 1446 1410 entry->func = (unsigned long)tk->rp.kp.addr; 1447 1411 entry->ret_ip = get_kretprobe_retaddr(ri); 1448 - store_trace_args(&entry[1], &tk->tp, regs, sizeof(*entry), dsize); 1412 + store_trace_args(&entry[1], &tk->tp, regs, ri->data, sizeof(*entry), dsize); 1449 1413 1450 1414 
trace_event_buffer_commit(&fbuffer); 1451 1415 } ··· 1593 1557 if (hlist_empty(head)) 1594 1558 return 0; 1595 1559 1596 - dsize = __get_data_size(&tk->tp, regs); 1560 + dsize = __get_data_size(&tk->tp, regs, NULL); 1597 1561 __size = sizeof(*entry) + tk->tp.size + dsize; 1598 1562 size = ALIGN(__size + sizeof(u32), sizeof(u64)); 1599 1563 size -= sizeof(u32); ··· 1604 1568 1605 1569 entry->ip = (unsigned long)tk->rp.kp.addr; 1606 1570 memset(&entry[1], 0, dsize); 1607 - store_trace_args(&entry[1], &tk->tp, regs, sizeof(*entry), dsize); 1571 + store_trace_args(&entry[1], &tk->tp, regs, NULL, sizeof(*entry), dsize); 1608 1572 perf_trace_buf_submit(entry, size, rctx, call->event.type, 1, regs, 1609 1573 head, NULL); 1610 1574 return 0; ··· 1629 1593 if (hlist_empty(head)) 1630 1594 return; 1631 1595 1632 - dsize = __get_data_size(&tk->tp, regs); 1596 + dsize = __get_data_size(&tk->tp, regs, ri->data); 1633 1597 __size = sizeof(*entry) + tk->tp.size + dsize; 1634 1598 size = ALIGN(__size + sizeof(u32), sizeof(u64)); 1635 1599 size -= sizeof(u32); ··· 1640 1604 1641 1605 entry->func = (unsigned long)tk->rp.kp.addr; 1642 1606 entry->ret_ip = get_kretprobe_retaddr(ri); 1643 - store_trace_args(&entry[1], &tk->tp, regs, sizeof(*entry), dsize); 1607 + store_trace_args(&entry[1], &tk->tp, regs, ri->data, sizeof(*entry), dsize); 1644 1608 perf_trace_buf_submit(entry, size, rctx, call->event.type, 1, regs, 1645 1609 head, NULL); 1646 1610 }
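
The kretprobe setup above sizes `rp.data_size` from `traceprobe_get_entry_data_size()`, which depends on how entry-argument slots are assigned while parsing. The sketch below is a user-space model of that slot assignment, loosely following `__store_entry_arg()` in `kernel/trace/trace_probe.c` from this same merge. The fixed-size array, `ENTRY_CODE_MAX`, and the function names are assumptions made for illustration; the kernel allocates the buffer dynamically from `nr_args` and returns `-ENOSPC` rather than -1.

```c
#include <stddef.h>

enum fetch_op { OP_END = 0, OP_ARG, OP_ST_EDATA };

struct fetch_insn {
	enum fetch_op op;
	unsigned int param;	/* OP_ARG: argument index */
	size_t offset;		/* OP_ST_EDATA: byte offset in the entry buffer */
};

#define ENTRY_CODE_MAX 9	/* hypothetical: 2 * 4 arguments + 1 terminator */

struct entry_arg {
	struct fetch_insn code[ENTRY_CODE_MAX];	/* zero-initialized: all OP_END */
};

/*
 * Return the buffer offset assigned to argument @argnum, appending a new
 * ARG/ST_EDATA pair if this argument has not been seen yet.
 * Returns -1 when the code buffer is full.
 */
long entry_arg_offset(struct entry_arg *earg, unsigned int argnum)
{
	size_t offset = 0;
	int match = 0;
	int i;

	for (i = 0; i < ENTRY_CODE_MAX - 1; i++) {
		switch (earg->code[i].op) {
		case OP_END:
			/* First free pair: record the argument and its slot */
			earg->code[i].op = OP_ARG;
			earg->code[i].param = argnum;
			earg->code[i + 1].op = OP_ST_EDATA;
			earg->code[i + 1].offset = offset;
			return (long)offset;
		case OP_ARG:
			match = (earg->code[i].param == argnum);
			break;
		case OP_ST_EDATA:
			offset = earg->code[i].offset;
			if (match)	/* already stored: reuse the slot */
				return (long)offset;
			offset += sizeof(unsigned long);
			break;
		}
	}
	return -1;
}

/* Total bytes needed: the end of the last ST_EDATA slot. */
size_t entry_data_size(const struct entry_arg *earg)
{
	size_t size = 0;
	int i;

	for (i = 0; i < ENTRY_CODE_MAX; i++) {
		if (earg->code[i].op == OP_END)
			break;
		if (earg->code[i].op == OP_ST_EDATA)
			size = earg->code[i].offset + sizeof(unsigned long);
	}
	return size;
}
```

Because a repeated argument reuses its existing slot, a probe definition that refers to the same argument several times still needs only one `unsigned long` of entry data for it.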
+299 -118
kernel/trace/trace_probe.c
··· 594 594 return 0; 595 595 } 596 596 597 + static int __store_entry_arg(struct trace_probe *tp, int argnum); 598 + 597 599 static int parse_btf_arg(char *varname, 598 600 struct fetch_insn **pcode, struct fetch_insn *end, 599 601 struct traceprobe_parse_context *ctx) ··· 620 618 return -EOPNOTSUPP; 621 619 } 622 620 623 - if (ctx->flags & TPARG_FL_RETURN) { 624 - if (strcmp(varname, "$retval") != 0) { 625 - trace_probe_log_err(ctx->offset, NO_BTFARG); 626 - return -ENOENT; 627 - } 621 + if (ctx->flags & TPARG_FL_RETURN && !strcmp(varname, "$retval")) { 628 622 code->op = FETCH_OP_RETVAL; 629 623 /* Check whether the function return type is not void */ 630 624 if (query_btf_context(ctx) == 0) { ··· 652 654 const char *name = btf_name_by_offset(ctx->btf, params[i].name_off); 653 655 654 656 if (name && !strcmp(name, varname)) { 655 - code->op = FETCH_OP_ARG; 656 - if (ctx->flags & TPARG_FL_TPOINT) 657 - code->param = i + 1; 658 - else 659 - code->param = i; 657 + if (tparg_is_function_entry(ctx->flags)) { 658 + code->op = FETCH_OP_ARG; 659 + if (ctx->flags & TPARG_FL_TPOINT) 660 + code->param = i + 1; 661 + else 662 + code->param = i; 663 + } else if (tparg_is_function_return(ctx->flags)) { 664 + code->op = FETCH_OP_EDATA; 665 + ret = __store_entry_arg(ctx->tp, i); 666 + if (ret < 0) { 667 + /* internal error */ 668 + return ret; 669 + } 670 + code->offset = ret; 671 + } 660 672 tid = params[i].type; 661 673 goto found; 662 674 } ··· 763 755 764 756 #endif 765 757 758 + #ifdef CONFIG_HAVE_FUNCTION_ARG_ACCESS_API 759 + 760 + static int __store_entry_arg(struct trace_probe *tp, int argnum) 761 + { 762 + struct probe_entry_arg *earg = tp->entry_arg; 763 + bool match = false; 764 + int i, offset; 765 + 766 + if (!earg) { 767 + earg = kzalloc(sizeof(*tp->entry_arg), GFP_KERNEL); 768 + if (!earg) 769 + return -ENOMEM; 770 + earg->size = 2 * tp->nr_args + 1; 771 + earg->code = kcalloc(earg->size, sizeof(struct fetch_insn), 772 + GFP_KERNEL); 773 + if (!earg->code) { 774 
+ kfree(earg); 775 + return -ENOMEM; 776 + } 777 + /* Fill the code buffer with 'end' to simplify it */ 778 + for (i = 0; i < earg->size; i++) 779 + earg->code[i].op = FETCH_OP_END; 780 + tp->entry_arg = earg; 781 + } 782 + 783 + offset = 0; 784 + for (i = 0; i < earg->size - 1; i++) { 785 + switch (earg->code[i].op) { 786 + case FETCH_OP_END: 787 + earg->code[i].op = FETCH_OP_ARG; 788 + earg->code[i].param = argnum; 789 + earg->code[i + 1].op = FETCH_OP_ST_EDATA; 790 + earg->code[i + 1].offset = offset; 791 + return offset; 792 + case FETCH_OP_ARG: 793 + match = (earg->code[i].param == argnum); 794 + break; 795 + case FETCH_OP_ST_EDATA: 796 + offset = earg->code[i].offset; 797 + if (match) 798 + return offset; 799 + offset += sizeof(unsigned long); 800 + break; 801 + default: 802 + break; 803 + } 804 + } 805 + return -ENOSPC; 806 + } 807 + 808 + int traceprobe_get_entry_data_size(struct trace_probe *tp) 809 + { 810 + struct probe_entry_arg *earg = tp->entry_arg; 811 + int i, size = 0; 812 + 813 + if (!earg) 814 + return 0; 815 + 816 + for (i = 0; i < earg->size; i++) { 817 + switch (earg->code[i].op) { 818 + case FETCH_OP_END: 819 + goto out; 820 + case FETCH_OP_ST_EDATA: 821 + size = earg->code[i].offset + sizeof(unsigned long); 822 + break; 823 + default: 824 + break; 825 + } 826 + } 827 + out: 828 + return size; 829 + } 830 + 831 + void store_trace_entry_data(void *edata, struct trace_probe *tp, struct pt_regs *regs) 832 + { 833 + struct probe_entry_arg *earg = tp->entry_arg; 834 + unsigned long val; 835 + int i; 836 + 837 + if (!earg) 838 + return; 839 + 840 + for (i = 0; i < earg->size; i++) { 841 + struct fetch_insn *code = &earg->code[i]; 842 + 843 + switch (code->op) { 844 + case FETCH_OP_ARG: 845 + val = regs_get_kernel_argument(regs, code->param); 846 + break; 847 + case FETCH_OP_ST_EDATA: 848 + *(unsigned long *)((unsigned long)edata + code->offset) = val; 849 + break; 850 + case FETCH_OP_END: 851 + goto end; 852 + default: 853 + break; 854 + } 855 + } 
+end:
+	return;
+}
+NOKPROBE_SYMBOL(store_trace_entry_data)
+#endif
+
 #define PARAM_MAX_STACK (THREAD_SIZE / sizeof(unsigned long))
 
 /* Parse $vars. @orig_arg points '$', which syncs to @ctx->offset */
···
 
 #ifdef CONFIG_HAVE_FUNCTION_ARG_ACCESS_API
 	len = str_has_prefix(arg, "arg");
-	if (len && tparg_is_function_entry(ctx->flags)) {
+	if (len) {
 		ret = kstrtoul(arg + len, 10, &param);
 		if (ret)
 			goto inval;
···
 			err = TP_ERR_BAD_ARG_NUM;
 			goto inval;
 		}
+		param--; /* argN starts from 1, but internal arg[N] starts from 0 */
 
-		code->op = FETCH_OP_ARG;
-		code->param = (unsigned int)param - 1;
-		/*
-		 * The tracepoint probe will probe a stub function, and the
-		 * first parameter of the stub is a dummy and should be ignored.
-		 */
-		if (ctx->flags & TPARG_FL_TPOINT)
-			code->param++;
+		if (tparg_is_function_entry(ctx->flags)) {
+			code->op = FETCH_OP_ARG;
+			code->param = (unsigned int)param;
+			/*
+			 * The tracepoint probe will probe a stub function, and the
+			 * first parameter of the stub is a dummy and should be ignored.
+			 */
+			if (ctx->flags & TPARG_FL_TPOINT)
+				code->param++;
+		} else if (tparg_is_function_return(ctx->flags)) {
+			/* function entry argument access from return probe */
+			ret = __store_entry_arg(ctx->tp, param);
+			if (ret < 0) /* This error should be an internal error */
+				return ret;
+
+			code->op = FETCH_OP_EDATA;
+			code->offset = ret;
+		} else {
+			err = TP_ERR_NOFENTRY_ARGS;
+			goto inval;
+		}
 		return 0;
 	}
 #endif
···
 		break;
 	default:
 		if (isalpha(arg[0]) || arg[0] == '_') { /* BTF variable */
-			if (!tparg_is_function_entry(ctx->flags)) {
+			if (!tparg_is_function_entry(ctx->flags) &&
+			    !tparg_is_function_return(ctx->flags)) {
 				trace_probe_log_err(ctx->offset, NOSUP_BTFARG);
 				return -EINVAL;
 			}
···
 	return (BYTES_TO_BITS(t->size) < (bw + bo)) ? -EINVAL : 0;
 }
 
-/* String length checking wrapper */
-static int traceprobe_parse_probe_arg_body(const char *argv, ssize_t *size,
-		struct probe_arg *parg,
-		struct traceprobe_parse_context *ctx)
+/* Split type part from @arg and return it. */
+static char *parse_probe_arg_type(char *arg, struct probe_arg *parg,
+				  struct traceprobe_parse_context *ctx)
 {
-	struct fetch_insn *code, *scode, *tmp = NULL;
-	char *t, *t2, *t3;
-	int ret, len;
-	char *arg;
+	char *t = NULL, *t2, *t3;
+	int offs;
 
-	arg = kstrdup(argv, GFP_KERNEL);
-	if (!arg)
-		return -ENOMEM;
-
-	ret = -EINVAL;
-	len = strlen(arg);
-	if (len > MAX_ARGSTR_LEN) {
-		trace_probe_log_err(ctx->offset, ARG_TOO_LONG);
-		goto out;
-	} else if (len == 0) {
-		trace_probe_log_err(ctx->offset, NO_ARG_BODY);
-		goto out;
-	}
-
-	ret = -ENOMEM;
-	parg->comm = kstrdup(arg, GFP_KERNEL);
-	if (!parg->comm)
-		goto out;
-
-	ret = -EINVAL;
 	t = strchr(arg, ':');
 	if (t) {
-		*t = '\0';
-		t2 = strchr(++t, '[');
+		*t++ = '\0';
+		t2 = strchr(t, '[');
 		if (t2) {
 			*t2++ = '\0';
 			t3 = strchr(t2, ']');
 			if (!t3) {
-				int offs = t2 + strlen(t2) - arg;
+				offs = t2 + strlen(t2) - arg;
 
 				trace_probe_log_err(ctx->offset + offs,
 						    ARRAY_NO_CLOSE);
-				goto out;
+				return ERR_PTR(-EINVAL);
 			} else if (t3[1] != '\0') {
 				trace_probe_log_err(ctx->offset + t3 + 1 - arg,
 						    BAD_ARRAY_SUFFIX);
-				goto out;
+				return ERR_PTR(-EINVAL);
 			}
 			*t3 = '\0';
 			if (kstrtouint(t2, 0, &parg->count) || !parg->count) {
 				trace_probe_log_err(ctx->offset + t2 - arg,
 						    BAD_ARRAY_NUM);
-				goto out;
+				return ERR_PTR(-EINVAL);
 			}
 			if (parg->count > MAX_ARRAY_LEN) {
 				trace_probe_log_err(ctx->offset + t2 - arg,
 						    ARRAY_TOO_BIG);
-				goto out;
+				return ERR_PTR(-EINVAL);
 			}
 		}
 	}
+	offs = t ? t - arg : 0;
 
 	/*
 	 * Since $comm and immediate string can not be dereferenced,
···
 	     strncmp(arg, "\\\"", 2) == 0)) {
 		/* The type of $comm must be "string", and not an array type. */
 		if (parg->count || (t && strcmp(t, "string"))) {
-			trace_probe_log_err(ctx->offset + (t ? (t - arg) : 0),
-					NEED_STRING_TYPE);
-			goto out;
+			trace_probe_log_err(ctx->offset + offs, NEED_STRING_TYPE);
+			return ERR_PTR(-EINVAL);
 		}
 		parg->type = find_fetch_type("string", ctx->flags);
 	} else
 		parg->type = find_fetch_type(t, ctx->flags);
+
 	if (!parg->type) {
-		trace_probe_log_err(ctx->offset + (t ? (t - arg) : 0), BAD_TYPE);
-		goto out;
+		trace_probe_log_err(ctx->offset + offs, BAD_TYPE);
+		return ERR_PTR(-EINVAL);
 	}
 
-	code = tmp = kcalloc(FETCH_INSN_MAX, sizeof(*code), GFP_KERNEL);
-	if (!code)
-		goto out;
-	code[FETCH_INSN_MAX - 1].op = FETCH_OP_END;
+	return t;
+}
 
-	ctx->last_type = NULL;
-	ret = parse_probe_arg(arg, parg->type, &code, &code[FETCH_INSN_MAX - 1],
-			      ctx);
-	if (ret)
-		goto fail;
+/* After parsing, adjust the fetch_insn according to the probe_arg */
+static int finalize_fetch_insn(struct fetch_insn *code,
+			       struct probe_arg *parg,
+			       char *type,
+			       int type_offset,
+			       struct traceprobe_parse_context *ctx)
+{
+	struct fetch_insn *scode;
+	int ret;
 
-	/* Update storing type if BTF is available */
-	if (IS_ENABLED(CONFIG_PROBE_EVENTS_BTF_ARGS) &&
-	    ctx->last_type) {
-		if (!t) {
-			parg->type = find_fetch_type_from_btf_type(ctx);
-		} else if (strstr(t, "string")) {
-			ret = check_prepare_btf_string_fetch(t, &code, ctx);
-			if (ret)
-				goto fail;
-		}
-	}
-	parg->offset = *size;
-	*size += parg->type->size * (parg->count ?: 1);
-
-	if (parg->count) {
-		len = strlen(parg->type->fmttype) + 6;
-		parg->fmt = kmalloc(len, GFP_KERNEL);
-		if (!parg->fmt) {
-			ret = -ENOMEM;
-			goto out;
-		}
-		snprintf(parg->fmt, len, "%s[%d]", parg->type->fmttype,
-			 parg->count);
-	}
-
-	ret = -EINVAL;
 	/* Store operation */
 	if (parg->type->is_string) {
+		/* Check bad combination of the type and the last fetch_insn. */
 		if (!strcmp(parg->type->name, "symstr")) {
 			if (code->op != FETCH_OP_REG && code->op != FETCH_OP_STACK &&
 			    code->op != FETCH_OP_RETVAL && code->op != FETCH_OP_ARG &&
 			    code->op != FETCH_OP_DEREF && code->op != FETCH_OP_TP_ARG) {
-				trace_probe_log_err(ctx->offset + (t ? (t - arg) : 0),
+				trace_probe_log_err(ctx->offset + type_offset,
 						    BAD_SYMSTRING);
-				goto fail;
+				return -EINVAL;
 			}
 		} else {
 			if (code->op != FETCH_OP_DEREF && code->op != FETCH_OP_UDEREF &&
 			    code->op != FETCH_OP_IMM && code->op != FETCH_OP_COMM &&
 			    code->op != FETCH_OP_DATA && code->op != FETCH_OP_TP_ARG) {
-				trace_probe_log_err(ctx->offset + (t ? (t - arg) : 0),
+				trace_probe_log_err(ctx->offset + type_offset,
 						    BAD_STRING);
-				goto fail;
+				return -EINVAL;
 			}
 		}
+
 		if (!strcmp(parg->type->name, "symstr") ||
 		    (code->op == FETCH_OP_IMM || code->op == FETCH_OP_COMM ||
 		     code->op == FETCH_OP_DATA) || code->op == FETCH_OP_TP_ARG ||
···
 			code++;
 			if (code->op != FETCH_OP_NOP) {
 				trace_probe_log_err(ctx->offset, TOO_MANY_OPS);
-				goto fail;
+				return -EINVAL;
 			}
 		}
+
 		/* If op == DEREF, replace it with STRING */
 		if (!strcmp(parg->type->name, "ustring") ||
 		    code->op == FETCH_OP_UDEREF)
···
 		code++;
 		if (code->op != FETCH_OP_NOP) {
 			trace_probe_log_err(ctx->offset, TOO_MANY_OPS);
-			goto fail;
+			return -E2BIG;
 		}
 		code->op = FETCH_OP_ST_RAW;
 		code->size = parg->type->size;
 	}
+
+	/* Save storing fetch_insn. */
 	scode = code;
+
 	/* Modify operation */
-	if (t != NULL) {
-		ret = __parse_bitfield_probe_arg(t, parg->type, &code);
+	if (type != NULL) {
+		/* Bitfield needs a special fetch_insn. */
+		ret = __parse_bitfield_probe_arg(type, parg->type, &code);
 		if (ret) {
-			trace_probe_log_err(ctx->offset + t - arg, BAD_BITFIELD);
-			goto fail;
+			trace_probe_log_err(ctx->offset + type_offset, BAD_BITFIELD);
+			return ret;
 		}
 	} else if (IS_ENABLED(CONFIG_PROBE_EVENTS_BTF_ARGS) &&
 		   ctx->last_type) {
+		/* If user not specified the type, try parsing BTF bitfield. */
 		ret = parse_btf_bitfield(&code, ctx);
 		if (ret)
-			goto fail;
+			return ret;
 	}
-	ret = -EINVAL;
+
 	/* Loop(Array) operation */
 	if (parg->count) {
 		if (scode->op != FETCH_OP_ST_MEM &&
 		    scode->op != FETCH_OP_ST_STRING &&
 		    scode->op != FETCH_OP_ST_USTRING) {
-			trace_probe_log_err(ctx->offset + (t ? (t - arg) : 0),
-					    BAD_STRING);
-			goto fail;
+			trace_probe_log_err(ctx->offset + type_offset, BAD_STRING);
+			return -EINVAL;
 		}
 		code++;
 		if (code->op != FETCH_OP_NOP) {
 			trace_probe_log_err(ctx->offset, TOO_MANY_OPS);
-			goto fail;
+			return -E2BIG;
 		}
 		code->op = FETCH_OP_LP_ARRAY;
 		code->param = parg->count;
 	}
+
+	/* Finalize the fetch_insn array. */
 	code++;
 	code->op = FETCH_OP_END;
 
-	ret = 0;
+	return 0;
+}
+
+/* String length checking wrapper */
+static int traceprobe_parse_probe_arg_body(const char *argv, ssize_t *size,
+					   struct probe_arg *parg,
+					   struct traceprobe_parse_context *ctx)
+{
+	struct fetch_insn *code, *tmp = NULL;
+	char *type, *arg;
+	int ret, len;
+
+	len = strlen(argv);
+	if (len > MAX_ARGSTR_LEN) {
+		trace_probe_log_err(ctx->offset, ARG_TOO_LONG);
+		return -E2BIG;
+	} else if (len == 0) {
+		trace_probe_log_err(ctx->offset, NO_ARG_BODY);
+		return -EINVAL;
+	}
+
+	arg = kstrdup(argv, GFP_KERNEL);
+	if (!arg)
+		return -ENOMEM;
+
+	parg->comm = kstrdup(arg, GFP_KERNEL);
+	if (!parg->comm) {
+		ret = -ENOMEM;
+		goto out;
+	}
+
+	type = parse_probe_arg_type(arg, parg, ctx);
+	if (IS_ERR(type)) {
+		ret = PTR_ERR(type);
+		goto out;
+	}
+
+	code = tmp = kcalloc(FETCH_INSN_MAX, sizeof(*code), GFP_KERNEL);
+	if (!code) {
+		ret = -ENOMEM;
+		goto out;
+	}
+	code[FETCH_INSN_MAX - 1].op = FETCH_OP_END;
+
+	ctx->last_type = NULL;
+	ret = parse_probe_arg(arg, parg->type, &code, &code[FETCH_INSN_MAX - 1],
+			      ctx);
+	if (ret < 0)
+		goto fail;
+
+	/* Update storing type if BTF is available */
+	if (IS_ENABLED(CONFIG_PROBE_EVENTS_BTF_ARGS) &&
+	    ctx->last_type) {
+		if (!type) {
+			parg->type = find_fetch_type_from_btf_type(ctx);
+		} else if (strstr(type, "string")) {
+			ret = check_prepare_btf_string_fetch(type, &code, ctx);
+			if (ret)
+				goto fail;
+		}
+	}
+	parg->offset = *size;
+	*size += parg->type->size * (parg->count ?: 1);
+
+	if (parg->count) {
+		len = strlen(parg->type->fmttype) + 6;
+		parg->fmt = kmalloc(len, GFP_KERNEL);
+		if (!parg->fmt) {
+			ret = -ENOMEM;
+			goto out;
+		}
+		snprintf(parg->fmt, len, "%s[%d]", parg->type->fmttype,
+			 parg->count);
+	}
+
+	ret = finalize_fetch_insn(code, parg, type, type ? type - arg : 0, ctx);
+	if (ret < 0)
+		goto fail;
+
+	for (; code < tmp + FETCH_INSN_MAX; code++)
+		if (code->op == FETCH_OP_END)
+			break;
 	/* Shrink down the code buffer */
 	parg->code = kcalloc(code - tmp + 1, sizeof(*code), GFP_KERNEL);
 	if (!parg->code)
···
 		memcpy(parg->code, tmp, sizeof(*code) * (code - tmp + 1));
 
 fail:
-	if (ret) {
+	if (ret < 0) {
 		for (code = tmp; code < tmp + FETCH_INSN_MAX; code++)
 			if (code->op == FETCH_NOP_SYMBOL ||
 			    code->op == FETCH_OP_DATA)
···
 	struct probe_arg *parg = &tp->args[i];
 	const char *body;
 
-	/* Increment count for freeing args in error case */
-	tp->nr_args++;
-
+	ctx->tp = tp;
 	body = strchr(arg, '=');
 	if (body) {
 		if (body - arg > MAX_ARG_NAME_LEN) {
···
 		if (str_has_prefix(argv[i], "$arg")) {
 			trace_probe_log_set_index(i + 2);
 
-			if (!tparg_is_function_entry(ctx->flags)) {
+			if (!tparg_is_function_entry(ctx->flags) &&
+			    !tparg_is_function_return(ctx->flags)) {
 				trace_probe_log_err(0, NOFENTRY_ARGS);
 				return -EINVAL;
 			}
···
 	for (i = 0; i < tp->nr_args; i++)
 		traceprobe_free_probe_arg(&tp->args[i]);
 
+	if (tp->entry_arg) {
+		kfree(tp->entry_arg->code);
+		kfree(tp->entry_arg);
+		tp->entry_arg = NULL;
+	}
+
 	if (tp->event)
 		trace_probe_unlink(tp);
 }
 
 int trace_probe_init(struct trace_probe *tp, const char *event,
-		     const char *group, bool alloc_filter)
+		     const char *group, bool alloc_filter, int nargs)
 {
 	struct trace_event_call *call;
 	size_t size = sizeof(struct trace_probe_event);
···
 		ret = -ENOMEM;
 		goto error;
 	}
+
+	tp->nr_args = nargs;
+	/* Make sure pointers in args[] are NULL */
+	if (nargs)
+		memset(tp->args, 0, sizeof(tp->args[0]) * nargs);
 
 	return 0;
 
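Reviewer note: the `$argN` hunk in kernel/trace/trace_probe.c above turns a single entry-only branch into a three-way dispatch (function entry, function return, anything else). The following is a minimal userspace sketch of that decision, not the kernel code itself. Every name in it (`probe_loc`, `fetch_sketch`, `parse_argn_sketch`, `SKETCH_MAX_ARGS`) is hypothetical, and the fixed "one slot per argument index" offset is an assumption made for illustration; in the patch the offset comes from `__store_entry_arg()`.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the kernel's probe-location flags and fetch ops. */
enum probe_loc { LOC_FUNC_ENTRY, LOC_FUNC_RETURN, LOC_OTHER };
enum fetch_op { OP_NONE, OP_ARG, OP_EDATA };

#define SKETCH_MAX_ARGS 64	/* assumed bound; the patch checks PARAM_MAX_STACK */

struct fetch_sketch {
	enum fetch_op op;
	unsigned int param;	/* argument index, meaningful for OP_ARG */
	int offset;		/* byte offset into entry data, for OP_EDATA */
};

/*
 * Parse "argN" (the text after '$') and fill @out.
 * Returns 0 on success and -1 on any error.
 */
static int parse_argn_sketch(const char *var, enum probe_loc loc,
			     int is_tracepoint, struct fetch_sketch *out)
{
	unsigned long n;
	char *end;

	if (strncmp(var, "arg", 3) != 0)
		return -1;
	n = strtoul(var + 3, &end, 10);
	if (end == var + 3 || *end != '\0' || n == 0 || n > SKETCH_MAX_ARGS)
		return -1;
	n--;	/* $argN counts from 1, the internal index from 0 */

	out->op = OP_NONE;
	out->param = 0;
	out->offset = 0;

	switch (loc) {
	case LOC_FUNC_ENTRY:
		/* Read the argument directly; skip the tracepoint stub's dummy arg. */
		out->op = OP_ARG;
		out->param = (unsigned int)n + (is_tracepoint ? 1 : 0);
		return 0;
	case LOC_FUNC_RETURN:
		/* Read a copy saved at entry (assumed layout: one slot per index). */
		out->op = OP_EDATA;
		out->offset = (int)(n * sizeof(unsigned long));
		return 0;
	default:
		/* Corresponds to the NOFENTRY_ARGS error path in the patch. */
		return -1;
	}
}
```

The point of the sketch is only the control flow: the index is validated and decremented once, and the probe location then selects between a direct argument read and a read from saved entry data.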
+28 -2
kernel/trace/trace_probe.h
···
 	FETCH_OP_ARG,		/* Function argument : .param */
 	FETCH_OP_FOFFS,		/* File offset: .immediate */
 	FETCH_OP_DATA,		/* Allocated data: .data */
+	FETCH_OP_EDATA,		/* Entry data: .offset */
 	// Stage 2 (dereference) op
 	FETCH_OP_DEREF,		/* Dereference: .offset */
 	FETCH_OP_UDEREF,	/* User-space Dereference: .offset */
···
 	FETCH_OP_ST_STRING,	/* String: .offset, .size */
 	FETCH_OP_ST_USTRING,	/* User String: .offset, .size */
 	FETCH_OP_ST_SYMSTR,	/* Kernel Symbol String: .offset, .size */
+	FETCH_OP_ST_EDATA,	/* Store Entry Data: .offset */
 	// Stage 4 (modify) op
 	FETCH_OP_MOD_BF,	/* Bitfield: .basesize, .lshift, .rshift */
 	// Stage 5 (loop) op
···
 	const struct fetch_type *type;	/* Type of this argument */
 };
 
+struct probe_entry_arg {
+	struct fetch_insn *code;
+	unsigned int size;	/* The entry data size */
+};
+
 struct trace_uprobe_filter {
 	rwlock_t rwlock;
 	int nr_systemwide;
···
 	struct trace_probe_event *event;
 	ssize_t size;	/* trace entry size */
 	unsigned int nr_args;
+	struct probe_entry_arg *entry_arg;	/* This is only for return probe */
 	struct probe_arg args[];
 };
 
···
 }
 
 int trace_probe_init(struct trace_probe *tp, const char *event,
-		     const char *group, bool alloc_filter);
+		     const char *group, bool alloc_filter, int nargs);
 void trace_probe_cleanup(struct trace_probe *tp);
 int trace_probe_append(struct trace_probe *tp, struct trace_probe *to);
 void trace_probe_unlink(struct trace_probe *tp);
···
 int trace_probe_create(const char *raw_command, int (*createfn)(int, const char **));
 int trace_probe_print_args(struct trace_seq *s, struct probe_arg *args, int nr_args,
 		 u8 *data, void *field);
+
+#ifdef CONFIG_HAVE_FUNCTION_ARG_ACCESS_API
+int traceprobe_get_entry_data_size(struct trace_probe *tp);
+/* This is a runtime function to store entry data */
+void store_trace_entry_data(void *edata, struct trace_probe *tp, struct pt_regs *regs);
+#else /* !CONFIG_HAVE_FUNCTION_ARG_ACCESS_API */
+static inline int traceprobe_get_entry_data_size(struct trace_probe *tp)
+{
+	return 0;
+}
+#define store_trace_entry_data(edata, tp, regs) do { } while (0)
+#endif
 
 #define trace_probe_for_each_link(pos, tp) \
 	list_for_each_entry(pos, &(tp)->event->files, list)
···
 	return (flags & TPARG_FL_LOC_MASK) == (TPARG_FL_KERNEL | TPARG_FL_FENTRY);
 }
 
+static inline bool tparg_is_function_return(unsigned int flags)
+{
+	return (flags & TPARG_FL_LOC_MASK) == (TPARG_FL_KERNEL | TPARG_FL_RETURN);
+}
+
 struct traceprobe_parse_context {
 	struct trace_event_call *event;
 	/* BTF related parameters */
···
 	const struct btf_type *last_type;	/* Saved type */
 	u32 last_bitoffs;			/* Saved bitoffs */
 	u32 last_bitsize;			/* Saved bitsize */
+	struct trace_probe *tp;
 	unsigned int flags;
 	int offset;
 };
···
 	C(NO_BTFARG, "This variable is not found at this probe point"),\
 	C(NO_BTF_ENTRY, "No BTF entry for this probe point"), \
 	C(BAD_VAR_ARGS, "$arg* must be an independent parameter without name etc."),\
-	C(NOFENTRY_ARGS, "$arg* can be used only on function entry"), \
+	C(NOFENTRY_ARGS, "$arg* can be used only on function entry or exit"), \
 	C(DOUBLE_ARGS, "$arg* can be used only once in the parameters"), \
 	C(ARGS_2LONG, "$arg* failed because the argument list is too long"), \
 	C(ARGIDX_2BIG, "$argN index is too big"), \
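Reviewer note: the header changes above add a store op (`FETCH_OP_ST_EDATA`), a load op (`FETCH_OP_EDATA`) and a per-probe `probe_entry_arg`, which together suggest a "save at entry, read back at return" scheme. The sketch below models that idea in plain userspace C under stated assumptions: `entry_data_sketch`, `save_entry_args_sketch`, `load_entry_arg_sketch` and `SKETCH_SLOTS` are hypothetical names, and the slot layout (one `unsigned long` per requested argument, in request order) is an illustration rather than the kernel's actual buffer format.

```c
#include <string.h>

#define SKETCH_SLOTS 8	/* assumed capacity of the per-call entry buffer */

/* Hypothetical model of the entry data kept for one function invocation. */
struct entry_data_sketch {
	unsigned long slot[SKETCH_SLOTS];
};

/*
 * At function entry: copy the requested arguments into the buffer.
 * @wanted lists argument indexes (0-based); slot i receives args[wanted[i]].
 * This plays the role of an argument fetch followed by a store to entry data.
 */
static void save_entry_args_sketch(struct entry_data_sketch *ed,
				   const unsigned long *args,
				   const unsigned int *wanted,
				   unsigned int nr_wanted)
{
	unsigned int i;

	for (i = 0; i < nr_wanted && i < SKETCH_SLOTS; i++)
		ed->slot[i] = args[wanted[i]];
}

/*
 * At function return: read a saved value back by byte offset, which is
 * how the return-side fetch instruction addresses it (.offset).
 */
static unsigned long load_entry_arg_sketch(const struct entry_data_sketch *ed,
					   unsigned int offset)
{
	unsigned long val;

	memcpy(&val, (const char *)ed->slot + offset, sizeof(val));
	return val;
}
```

Because the values are copied at entry, a return event still reports what the caller passed in even if the function has since overwritten its argument registers, which is the behaviour the merge message describes for comparing state before and after a call.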
+5 -5
kernel/trace/trace_probe_tmpl.h
···
  * If dest is NULL, don't store result and return required dynamic data size.
  */
 static int
-process_fetch_insn(struct fetch_insn *code, void *rec,
+process_fetch_insn(struct fetch_insn *code, void *rec, void *edata,
 		   void *dest, void *base);
 static nokprobe_inline int fetch_store_strlen(unsigned long addr);
 static nokprobe_inline int
···
 
 /* Sum up total data length for dynamic arrays (strings) */
 static nokprobe_inline int
-__get_data_size(struct trace_probe *tp, struct pt_regs *regs)
+__get_data_size(struct trace_probe *tp, struct pt_regs *regs, void *edata)
 {
 	struct probe_arg *arg;
 	int i, len, ret = 0;
···
 	for (i = 0; i < tp->nr_args; i++) {
 		arg = tp->args + i;
 		if (unlikely(arg->dynamic)) {
-			len = process_fetch_insn(arg->code, regs, NULL, NULL);
+			len = process_fetch_insn(arg->code, regs, edata, NULL, NULL);
 			if (len > 0)
 				ret += len;
 		}
···
 
 /* Store the value of each argument */
 static nokprobe_inline void
-store_trace_args(void *data, struct trace_probe *tp, void *rec,
+store_trace_args(void *data, struct trace_probe *tp, void *rec, void *edata,
 		 int header_size, int maxlen)
 {
 	struct probe_arg *arg;
···
 		/* Point the dynamic data area if needed */
 		if (unlikely(arg->dynamic))
 			*dl = make_data_loc(maxlen, dyndata - base);
-		ret = process_fetch_insn(arg->code, rec, dl, base);
+		ret = process_fetch_insn(arg->code, rec, edata, dl, base);
 		if (arg->dynamic && likely(ret > 0)) {
 			dyndata += ret;
 			maxlen -= ret;
+7 -7
kernel/trace/trace_uprobe.c
···
 
 /* Note that we don't verify it, since the code does not come from user space */
 static int
-process_fetch_insn(struct fetch_insn *code, void *rec, void *dest,
-		   void *base)
+process_fetch_insn(struct fetch_insn *code, void *rec, void *edata,
+		   void *dest, void *base)
 {
 	struct pt_regs *regs = rec;
 	unsigned long val;
···
 	if (!tu)
 		return ERR_PTR(-ENOMEM);
 
-	ret = trace_probe_init(&tu->tp, event, group, true);
+	ret = trace_probe_init(&tu->tp, event, group, true, nargs);
 	if (ret < 0)
 		goto error;
 
···
 	if (WARN_ON_ONCE(!uprobe_cpu_buffer))
 		return 0;
 
-	dsize = __get_data_size(&tu->tp, regs);
+	dsize = __get_data_size(&tu->tp, regs, NULL);
 	esize = SIZEOF_TRACE_ENTRY(is_ret_probe(tu));
 
 	ucb = uprobe_buffer_get();
-	store_trace_args(ucb->buf, &tu->tp, regs, esize, dsize);
+	store_trace_args(ucb->buf, &tu->tp, regs, NULL, esize, dsize);
 
 	if (trace_probe_test_flag(&tu->tp, TP_FLAG_TRACE))
 		ret |= uprobe_trace_func(tu, regs, ucb, dsize);
···
 	if (WARN_ON_ONCE(!uprobe_cpu_buffer))
 		return 0;
 
-	dsize = __get_data_size(&tu->tp, regs);
+	dsize = __get_data_size(&tu->tp, regs, NULL);
 	esize = SIZEOF_TRACE_ENTRY(is_ret_probe(tu));
 
 	ucb = uprobe_buffer_get();
-	store_trace_args(ucb->buf, &tu->tp, regs, esize, dsize);
+	store_trace_args(ucb->buf, &tu->tp, regs, NULL, esize, dsize);
 
 	if (trace_probe_test_flag(&tu->tp, TP_FLAG_TRACE))
 		uretprobe_trace_func(tu, func, regs, ucb, dsize);
+18
tools/testing/selftests/ftrace/test.d/dynevent/fprobe_entry_arg.tc
···
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0
+# description: Function return probe entry argument access
+# requires: dynamic_events 'f[:[<group>/][<event>]] <func-name>':README 'kernel return probes support:':README
+
+echo 'f:tests/myevent1 vfs_open arg=$arg1' >> dynamic_events
+echo 'f:tests/myevent2 vfs_open%return arg=$arg1' >> dynamic_events
+
+echo 1 > events/tests/enable
+
+echo > trace
+cat trace > /dev/null
+
+function streq() {
+	test $1 = $2
+}
+
+streq `grep -A 1 -m 1 myevent1 trace | sed -r 's/^.*(arg=.*)/\1/' `
+4
tools/testing/selftests/ftrace/test.d/dynevent/fprobe_syntax_errors.tc
···
 
 check_error 'f vfs_read ^$arg10000' # BAD_ARG_NUM
 
+if !grep -q 'kernel return probes support:' README; then
 check_error 'f vfs_read $retval ^$arg1' # BAD_VAR
+fi
 check_error 'f vfs_read ^$none_var' # BAD_VAR
 check_error 'f vfs_read ^'$REG # BAD_VAR
 
···
 check_error 'f vfs_read args=^$arg*' # BAD_VAR_ARGS
 check_error 'f vfs_read +0(^$arg*)' # BAD_VAR_ARGS
 check_error 'f vfs_read $arg* ^$arg*' # DOUBLE_ARGS
+if !grep -q 'kernel return probes support:' README; then
 check_error 'f vfs_read%return ^$arg*' # NOFENTRY_ARGS
+fi
 check_error 'f vfs_read ^hoge' # NO_BTFARG
 check_error 'f kfree ^$arg10' # NO_BTFARG (exceed the number of parameters)
 check_error 'f kfree%return ^$retval' # NO_RETVAL
+2
tools/testing/selftests/ftrace/test.d/kprobe/kprobe_syntax_errors.tc
···
 check_error 'p vfs_read args=^$arg*' # BAD_VAR_ARGS
 check_error 'p vfs_read +0(^$arg*)' # BAD_VAR_ARGS
 check_error 'p vfs_read $arg* ^$arg*' # DOUBLE_ARGS
+if !grep -q 'kernel return probes support:' README; then
 check_error 'r vfs_read ^$arg*' # NOFENTRY_ARGS
+fi
 check_error 'p vfs_read+8 ^$arg*' # NOFENTRY_ARGS
 check_error 'p vfs_read ^hoge' # NO_BTFARG
 check_error 'p kfree ^$arg10' # NO_BTFARG (exceed the number of parameters)
+18
tools/testing/selftests/ftrace/test.d/kprobe/kretprobe_entry_arg.tc
···
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0
+# description: Kretprobe entry argument access
+# requires: kprobe_events 'kernel return probes support:':README
+
+echo 'p:myevent1 vfs_open arg=$arg1' >> kprobe_events
+echo 'r:myevent2 vfs_open arg=$arg1' >> kprobe_events
+
+echo 1 > events/kprobes/enable
+
+echo > trace
+cat trace > /dev/null
+
+function streq() {
+	test $1 = $2
+}
+
+streq `grep -A 1 -m 1 myevent1 trace | sed -r 's/^.*(arg=.*)/\1/' `