Merge tag 'bpf-next-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

+35 -4

Documentation/bpf/btf.rst

···
  * ``info.kind_flag``: 0
  * ``info.kind``: BTF_KIND_FUNC
  * ``info.vlen``: linkage information (BTF_FUNC_STATIC, BTF_FUNC_GLOBAL
-   or BTF_FUNC_EXTERN)
+   or BTF_FUNC_EXTERN - see :ref:`BTF_Function_Linkage_Constants`)
  * ``type``: a BTF_KIND_FUNC_PROTO type

  No additional type data follow ``btf_type``.
···
      __u32 linkage;
  };

- ``struct btf_var`` encoding:
-   * ``linkage``: currently only static variable 0, or globally allocated
-     variable in ELF sections 1
+ ``btf_var.linkage`` may take the values: BTF_VAR_STATIC, BTF_VAR_GLOBAL_ALLOCATED or BTF_VAR_GLOBAL_EXTERN -
+ see :ref:`BTF_Var_Linkage_Constants`.

  Not all type of global variables are supported by LLVM at this point.
  The following is currently available:
···

  If the original enum value is signed and the size is less than 8,
  that value will be sign extended into 8 bytes.
+
+ 2.3 Constant Values
+ -------------------
+
+ .. _BTF_Function_Linkage_Constants:
+
+ 2.3.1 Function Linkage Constant Values
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ .. table:: Function Linkage Values and Meanings
+
+   =================== ===== ===========
+   kind                value description
+   =================== ===== ===========
+   ``BTF_FUNC_STATIC`` 0x0   definition of subprogram not visible outside containing compilation unit
+   ``BTF_FUNC_GLOBAL`` 0x1   definition of subprogram visible outside containing compilation unit
+   ``BTF_FUNC_EXTERN`` 0x2   declaration of a subprogram whose definition is outside the containing compilation unit
+   =================== ===== ===========
+
+
+ .. _BTF_Var_Linkage_Constants:
+
+ 2.3.2 Variable Linkage Constant Values
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ .. table:: Variable Linkage Values and Meanings
+
+   ============================ ===== ===========
+   kind                         value description
+   ============================ ===== ===========
+   ``BTF_VAR_STATIC``           0x0   definition of global variable not visible outside containing compilation unit
+   ``BTF_VAR_GLOBAL_ALLOCATED`` 0x1   definition of global variable visible outside containing compilation unit
+   ``BTF_VAR_GLOBAL_EXTERN``    0x2   declaration of global variable whose definition is outside the containing compilation unit
+   ============================ ===== ===========

  3. BTF Kernel API
  =================
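
Note on the function linkage table in this hunk: the documentation says the linkage of a BTF_KIND_FUNC type is carried in ``info.vlen``. The following is a minimal sketch of how that could be decoded, assuming the usual ``btf_type.info`` layout in which ``vlen`` occupies bits 0-15 and ``kind`` occupies bits 24-28. The ``SKETCH_*`` constants and the ``sketch_*`` functions are hypothetical names made up for illustration; they are not kernel UAPI or libbpf symbols. Only the values 0x0, 0x1 and 0x2 are taken from the table.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical local copies of the values listed in the documentation table. */
enum sketch_func_linkage {
	SKETCH_FUNC_STATIC = 0x0,
	SKETCH_FUNC_GLOBAL = 0x1,
	SKETCH_FUNC_EXTERN = 0x2,
};

/* Assumption: vlen is stored in bits 0-15 of btf_type.info. */
static uint32_t sketch_info_vlen(uint32_t info)
{
	return info & 0xffff;
}

/* Map the linkage stored in info.vlen of a FUNC type to a readable name. */
const char *sketch_func_linkage_name(uint32_t info)
{
	switch (sketch_info_vlen(info)) {
	case SKETCH_FUNC_STATIC:
		return "static";
	case SKETCH_FUNC_GLOBAL:
		return "global";
	case SKETCH_FUNC_EXTERN:
		return "extern";
	default:
		return "unknown";
	}
}

/* Usage check: kind 12 (BTF_KIND_FUNC) in bits 24-28, global linkage in vlen. */
void sketch_check(void)
{
	uint32_t info = (12u << 24) | SKETCH_FUNC_GLOBAL;

	assert(strcmp(sketch_func_linkage_name(info), "global") == 0);
}
```

The variable linkage values in the second table use the same three numbers, but according to the hunk above they live in ``btf_var.linkage`` rather than in ``info.vlen``, so a decoder for them would read that field directly.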

+26 -4

Documentation/bpf/libbpf/program_types.rst

···
  +-------------------------------------------+----------------------------------------+----------------------------------+-----------+
  | ``BPF_PROG_TYPE_LWT_XMIT``                |                                        | ``lwt_xmit``                     |           |
  +-------------------------------------------+----------------------------------------+----------------------------------+-----------+
+ | ``BPF_PROG_TYPE_NETFILTER``               |                                        | ``netfilter``                    |           |
+ +-------------------------------------------+----------------------------------------+----------------------------------+-----------+
  | ``BPF_PROG_TYPE_PERF_EVENT``              |                                        | ``perf_event``                   |           |
  +-------------------------------------------+----------------------------------------+----------------------------------+-----------+
  | ``BPF_PROG_TYPE_RAW_TRACEPOINT_WRITABLE`` |                                        | ``raw_tp.w+`` [#rawtp]_          |           |
···
  +                                           +                                        +----------------------------------+-----------+
  |                                           |                                        | ``raw_tracepoint+``              |           |
  +-------------------------------------------+----------------------------------------+----------------------------------+-----------+
- | ``BPF_PROG_TYPE_SCHED_ACT``               |                                        | ``action``                       |           |
+ | ``BPF_PROG_TYPE_SCHED_ACT``               |                                        | ``action`` [#tc_legacy]_         |           |
  +-------------------------------------------+----------------------------------------+----------------------------------+-----------+
- | ``BPF_PROG_TYPE_SCHED_CLS``               |                                        | ``classifier``                   |           |
+ | ``BPF_PROG_TYPE_SCHED_CLS``               |                                        | ``classifier`` [#tc_legacy]_     |           |
  +                                           +                                        +----------------------------------+-----------+
- |                                           |                                        | ``tc``                           |           |
+ |                                           |                                        | ``tc`` [#tc_legacy]_             |           |
+ +                                           +----------------------------------------+----------------------------------+-----------+
+ |                                           | ``BPF_NETKIT_PRIMARY``                 | ``netkit/primary``               |           |
+ +                                           +----------------------------------------+----------------------------------+-----------+
+ |                                           | ``BPF_NETKIT_PEER``                    | ``netkit/peer``                  |           |
+ +                                           +----------------------------------------+----------------------------------+-----------+
+ |                                           | ``BPF_TCX_INGRESS``                    | ``tc/ingress``                   |           |
+ +                                           +----------------------------------------+----------------------------------+-----------+
+ |                                           | ``BPF_TCX_EGRESS``                     | ``tc/egress``                    |           |
+ +                                           +----------------------------------------+----------------------------------+-----------+
+ |                                           | ``BPF_TCX_INGRESS``                    | ``tcx/ingress``                  |           |
+ +                                           +----------------------------------------+----------------------------------+-----------+
+ |                                           | ``BPF_TCX_EGRESS``                     | ``tcx/egress``                   |           |
  +-------------------------------------------+----------------------------------------+----------------------------------+-----------+
  | ``BPF_PROG_TYPE_SK_LOOKUP``               | ``BPF_SK_LOOKUP``                      | ``sk_lookup``                    |           |
  +-------------------------------------------+----------------------------------------+----------------------------------+-----------+
···
  +-------------------------------------------+----------------------------------------+----------------------------------+-----------+
  | ``BPF_PROG_TYPE_SOCK_OPS``                | ``BPF_CGROUP_SOCK_OPS``                | ``sockops``                      |           |
  +-------------------------------------------+----------------------------------------+----------------------------------+-----------+
- | ``BPF_PROG_TYPE_STRUCT_OPS``              |                                        | ``struct_ops+``                  |           |
+ | ``BPF_PROG_TYPE_STRUCT_OPS``              |                                        | ``struct_ops+`` [#struct_ops]_   |           |
+ +                                           +                                        +----------------------------------+-----------+
+ |                                           |                                        | ``struct_ops.s+`` [#struct_ops]_ | Yes       |
  +-------------------------------------------+----------------------------------------+----------------------------------+-----------+
  | ``BPF_PROG_TYPE_SYSCALL``                 |                                        | ``syscall``                      | Yes       |
  +-------------------------------------------+----------------------------------------+----------------------------------+-----------+
···
                ``a-zA-Z0-9_.*?``.
  .. [#lsm] The ``lsm`` attachment format is ``lsm[.s]/<hook>``.
  .. [#rawtp] The ``raw_tp`` attach format is ``raw_tracepoint[.w]/<tracepoint>``.
+ .. [#tc_legacy] The ``tc``, ``classifier`` and ``action`` attach types are deprecated, use
+                 ``tcx/*`` instead.
+ .. [#struct_ops] The ``struct_ops`` attach format supports ``struct_ops[.s]/<name>`` convention,
+                  but ``name`` is ignored and it is recommended to just use plain
+                  ``SEC("struct_ops[.s]")``. The attachments are defined in a struct initializer
+                  that is tagged with ``SEC(".struct_ops[.link]")``.
  .. [#tp] The ``tracepoint`` attach format is ``tracepoint/<category>/<name>``.
  .. [#iter] The ``iter`` attach format is ``iter[.s]/<struct-name>``.

+1 -1

Documentation/bpf/verifier.rst

···
  linked to the registers and stack slots of the parent state with the same
  indices.

- * For the outer stack frames, only caller saved registers (r6-r9) and stack
+ * For the outer stack frames, only callee saved registers (r6-r9) and stack
  slots are linked to the registers and stack slots of the parent state with the
  same indices.


+3 -1

MAINTAINERS

···
  F: drivers/iio/imu/bmi323/

  BPF JIT for ARC
- M: Shahab Vahedi <shahab@synopsys.com>
+ M: Shahab Vahedi <list+bpf@vahedi.org>
  L: bpf@vger.kernel.org
  S: Maintained
  F: arch/arc/net/
···
  F: include/uapi/linux/filter.h
  F: kernel/bpf/
  F: kernel/trace/bpf_trace.c
+ F: lib/buildid.c
  F: lib/test_bpf.c
  F: net/bpf/
  F: net/core/filter.c
···
  S: Maintained
  F: kernel/bpf/stackmap.c
  F: kernel/trace/bpf_trace.c
+ F: lib/buildid.c

  BROADCOM ASP 2.0 ETHERNET DRIVER
  M: Justin Chen <justin.chen@broadcom.com>

+291 -217

arch/arm64/net/bpf_jit_comp.c

··· 26 26 27 27 #define TMP_REG_1 (MAX_BPF_JIT_REG + 0) 28 28 #define TMP_REG_2 (MAX_BPF_JIT_REG + 1) 29 - #define TCALL_CNT (MAX_BPF_JIT_REG + 2) 29 + #define TCCNT_PTR (MAX_BPF_JIT_REG + 2) 30 30 #define TMP_REG_3 (MAX_BPF_JIT_REG + 3) 31 - #define FP_BOTTOM (MAX_BPF_JIT_REG + 4) 32 31 #define ARENA_VM_START (MAX_BPF_JIT_REG + 5) 33 32 34 33 #define check_imm(bits, imm) do { \ ··· 62 63 [TMP_REG_1] = A64_R(10), 63 64 [TMP_REG_2] = A64_R(11), 64 65 [TMP_REG_3] = A64_R(12), 65 - /* tail_call_cnt */ 66 - [TCALL_CNT] = A64_R(26), 66 + /* tail_call_cnt_ptr */ 67 + [TCCNT_PTR] = A64_R(26), 67 68 /* temporary register for blinding constants */ 68 69 [BPF_REG_AX] = A64_R(9), 69 - [FP_BOTTOM] = A64_R(27), 70 70 /* callee saved register for kern_vm_start address */ 71 71 [ARENA_VM_START] = A64_R(28), 72 72 }; ··· 76 78 int epilogue_offset; 77 79 int *offset; 78 80 int exentry_idx; 81 + int nr_used_callee_reg; 82 + u8 used_callee_reg[8]; /* r6~r9, fp, arena_vm_start */ 79 83 __le32 *image; 80 84 __le32 *ro_image; 81 85 u32 stack_size; 82 - int fpb_offset; 83 86 u64 user_vm_start; 87 + u64 arena_vm_start; 88 + bool fp_used; 89 + bool write; 84 90 }; 85 91 86 92 struct bpf_plt { ··· 98 96 99 97 static inline void emit(const u32 insn, struct jit_ctx *ctx) 100 98 { 101 - if (ctx->image != NULL) 99 + if (ctx->image != NULL && ctx->write) 102 100 ctx->image[ctx->idx] = cpu_to_le32(insn); 103 101 104 102 ctx->idx++; ··· 183 181 } 184 182 } 185 183 186 - static inline void emit_call(u64 target, struct jit_ctx *ctx) 184 + static bool should_emit_indirect_call(long target, const struct jit_ctx *ctx) 187 185 { 188 - u8 tmp = bpf2a64[TMP_REG_1]; 186 + long offset; 189 187 188 + /* when ctx->ro_image is not allocated or the target is unknown, 189 + * emit indirect call 190 + */ 191 + if (!ctx->ro_image || !target) 192 + return true; 193 + 194 + offset = target - (long)&ctx->ro_image[ctx->idx]; 195 + return offset < -SZ_128M || offset >= SZ_128M; 196 + } 197 + 198 + static void 
emit_direct_call(u64 target, struct jit_ctx *ctx) 199 + { 200 + u32 insn; 201 + unsigned long pc; 202 + 203 + pc = (unsigned long)&ctx->ro_image[ctx->idx]; 204 + insn = aarch64_insn_gen_branch_imm(pc, target, AARCH64_INSN_BRANCH_LINK); 205 + emit(insn, ctx); 206 + } 207 + 208 + static void emit_indirect_call(u64 target, struct jit_ctx *ctx) 209 + { 210 + u8 tmp; 211 + 212 + tmp = bpf2a64[TMP_REG_1]; 190 213 emit_addr_mov_i64(tmp, target, ctx); 191 214 emit(A64_BLR(tmp), ctx); 215 + } 216 + 217 + static void emit_call(u64 target, struct jit_ctx *ctx) 218 + { 219 + if (should_emit_indirect_call((long)target, ctx)) 220 + emit_indirect_call(target, ctx); 221 + else 222 + emit_direct_call(target, ctx); 192 223 } 193 224 194 225 static inline int bpf2a64_offset(int bpf_insn, int off, ··· 308 273 return true; 309 274 } 310 275 311 - /* generated prologue: 276 + /* generated main prog prologue: 312 277 * bti c // if CONFIG_ARM64_BTI_KERNEL 313 278 * mov x9, lr 314 279 * nop // POKE_OFFSET 315 280 * paciasp // if CONFIG_ARM64_PTR_AUTH_KERNEL 316 281 * stp x29, lr, [sp, #-16]! 317 282 * mov x29, sp 318 - * stp x19, x20, [sp, #-16]! 319 - * stp x21, x22, [sp, #-16]! 320 - * stp x25, x26, [sp, #-16]! 321 - * stp x27, x28, [sp, #-16]! 322 - * mov x25, sp 323 - * mov tcc, #0 283 + * stp xzr, x26, [sp, #-16]! 284 + * mov x26, sp 324 285 * // PROLOGUE_OFFSET 286 + * // save callee-saved registers 325 287 */ 288 + static void prepare_bpf_tail_call_cnt(struct jit_ctx *ctx) 289 + { 290 + const bool is_main_prog = !bpf_is_subprog(ctx->prog); 291 + const u8 ptr = bpf2a64[TCCNT_PTR]; 292 + 293 + if (is_main_prog) { 294 + /* Initialize tail_call_cnt. 
*/ 295 + emit(A64_PUSH(A64_ZR, ptr, A64_SP), ctx); 296 + emit(A64_MOV(1, ptr, A64_SP), ctx); 297 + } else 298 + emit(A64_PUSH(ptr, ptr, A64_SP), ctx); 299 + } 300 + 301 + static void find_used_callee_regs(struct jit_ctx *ctx) 302 + { 303 + int i; 304 + const struct bpf_prog *prog = ctx->prog; 305 + const struct bpf_insn *insn = &prog->insnsi[0]; 306 + int reg_used = 0; 307 + 308 + for (i = 0; i < prog->len; i++, insn++) { 309 + if (insn->dst_reg == BPF_REG_6 || insn->src_reg == BPF_REG_6) 310 + reg_used |= 1; 311 + 312 + if (insn->dst_reg == BPF_REG_7 || insn->src_reg == BPF_REG_7) 313 + reg_used |= 2; 314 + 315 + if (insn->dst_reg == BPF_REG_8 || insn->src_reg == BPF_REG_8) 316 + reg_used |= 4; 317 + 318 + if (insn->dst_reg == BPF_REG_9 || insn->src_reg == BPF_REG_9) 319 + reg_used |= 8; 320 + 321 + if (insn->dst_reg == BPF_REG_FP || insn->src_reg == BPF_REG_FP) { 322 + ctx->fp_used = true; 323 + reg_used |= 16; 324 + } 325 + } 326 + 327 + i = 0; 328 + if (reg_used & 1) 329 + ctx->used_callee_reg[i++] = bpf2a64[BPF_REG_6]; 330 + 331 + if (reg_used & 2) 332 + ctx->used_callee_reg[i++] = bpf2a64[BPF_REG_7]; 333 + 334 + if (reg_used & 4) 335 + ctx->used_callee_reg[i++] = bpf2a64[BPF_REG_8]; 336 + 337 + if (reg_used & 8) 338 + ctx->used_callee_reg[i++] = bpf2a64[BPF_REG_9]; 339 + 340 + if (reg_used & 16) 341 + ctx->used_callee_reg[i++] = bpf2a64[BPF_REG_FP]; 342 + 343 + if (ctx->arena_vm_start) 344 + ctx->used_callee_reg[i++] = bpf2a64[ARENA_VM_START]; 345 + 346 + ctx->nr_used_callee_reg = i; 347 + } 348 + 349 + /* Save callee-saved registers */ 350 + static void push_callee_regs(struct jit_ctx *ctx) 351 + { 352 + int reg1, reg2, i; 353 + 354 + /* 355 + * Program acting as exception boundary should save all ARM64 356 + * Callee-saved registers as the exception callback needs to recover 357 + * all ARM64 Callee-saved registers in its epilogue. 
358 + */ 359 + if (ctx->prog->aux->exception_boundary) { 360 + emit(A64_PUSH(A64_R(19), A64_R(20), A64_SP), ctx); 361 + emit(A64_PUSH(A64_R(21), A64_R(22), A64_SP), ctx); 362 + emit(A64_PUSH(A64_R(23), A64_R(24), A64_SP), ctx); 363 + emit(A64_PUSH(A64_R(25), A64_R(26), A64_SP), ctx); 364 + emit(A64_PUSH(A64_R(27), A64_R(28), A64_SP), ctx); 365 + } else { 366 + find_used_callee_regs(ctx); 367 + for (i = 0; i + 1 < ctx->nr_used_callee_reg; i += 2) { 368 + reg1 = ctx->used_callee_reg[i]; 369 + reg2 = ctx->used_callee_reg[i + 1]; 370 + emit(A64_PUSH(reg1, reg2, A64_SP), ctx); 371 + } 372 + if (i < ctx->nr_used_callee_reg) { 373 + reg1 = ctx->used_callee_reg[i]; 374 + /* keep SP 16-byte aligned */ 375 + emit(A64_PUSH(reg1, A64_ZR, A64_SP), ctx); 376 + } 377 + } 378 + } 379 + 380 + /* Restore callee-saved registers */ 381 + static void pop_callee_regs(struct jit_ctx *ctx) 382 + { 383 + struct bpf_prog_aux *aux = ctx->prog->aux; 384 + int reg1, reg2, i; 385 + 386 + /* 387 + * Program acting as exception boundary pushes R23 and R24 in addition 388 + * to BPF callee-saved registers. Exception callback uses the boundary 389 + * program's stack frame, so recover these extra registers in the above 390 + * two cases. 
391 + */ 392 + if (aux->exception_boundary || aux->exception_cb) { 393 + emit(A64_POP(A64_R(27), A64_R(28), A64_SP), ctx); 394 + emit(A64_POP(A64_R(25), A64_R(26), A64_SP), ctx); 395 + emit(A64_POP(A64_R(23), A64_R(24), A64_SP), ctx); 396 + emit(A64_POP(A64_R(21), A64_R(22), A64_SP), ctx); 397 + emit(A64_POP(A64_R(19), A64_R(20), A64_SP), ctx); 398 + } else { 399 + i = ctx->nr_used_callee_reg - 1; 400 + if (ctx->nr_used_callee_reg % 2 != 0) { 401 + reg1 = ctx->used_callee_reg[i]; 402 + emit(A64_POP(reg1, A64_ZR, A64_SP), ctx); 403 + i--; 404 + } 405 + while (i > 0) { 406 + reg1 = ctx->used_callee_reg[i - 1]; 407 + reg2 = ctx->used_callee_reg[i]; 408 + emit(A64_POP(reg1, reg2, A64_SP), ctx); 409 + i -= 2; 410 + } 411 + } 412 + } 326 413 327 414 #define BTI_INSNS (IS_ENABLED(CONFIG_ARM64_BTI_KERNEL) ? 1 : 0) 328 415 #define PAC_INSNS (IS_ENABLED(CONFIG_ARM64_PTR_AUTH_KERNEL) ? 1 : 0) ··· 453 296 #define POKE_OFFSET (BTI_INSNS + 1) 454 297 455 298 /* Tail call offset to jump into */ 456 - #define PROLOGUE_OFFSET (BTI_INSNS + 2 + PAC_INSNS + 8) 299 + #define PROLOGUE_OFFSET (BTI_INSNS + 2 + PAC_INSNS + 4) 457 300 458 - static int build_prologue(struct jit_ctx *ctx, bool ebpf_from_cbpf, 459 - bool is_exception_cb, u64 arena_vm_start) 301 + static int build_prologue(struct jit_ctx *ctx, bool ebpf_from_cbpf) 460 302 { 461 303 const struct bpf_prog *prog = ctx->prog; 462 304 const bool is_main_prog = !bpf_is_subprog(prog); 463 - const u8 r6 = bpf2a64[BPF_REG_6]; 464 - const u8 r7 = bpf2a64[BPF_REG_7]; 465 - const u8 r8 = bpf2a64[BPF_REG_8]; 466 - const u8 r9 = bpf2a64[BPF_REG_9]; 467 305 const u8 fp = bpf2a64[BPF_REG_FP]; 468 - const u8 tcc = bpf2a64[TCALL_CNT]; 469 - const u8 fpb = bpf2a64[FP_BOTTOM]; 470 306 const u8 arena_vm_base = bpf2a64[ARENA_VM_START]; 471 307 const int idx0 = ctx->idx; 472 308 int cur_offset; ··· 498 348 emit(A64_MOV(1, A64_R(9), A64_LR), ctx); 499 349 emit(A64_NOP, ctx); 500 350 501 - if (!is_exception_cb) { 351 + if (!prog->aux->exception_cb) { 
502 352 /* Sign lr */ 503 353 if (IS_ENABLED(CONFIG_ARM64_PTR_AUTH_KERNEL)) 504 354 emit(A64_PACIASP, ctx); 355 + 505 356 /* Save FP and LR registers to stay align with ARM64 AAPCS */ 506 357 emit(A64_PUSH(A64_FP, A64_LR, A64_SP), ctx); 507 358 emit(A64_MOV(1, A64_FP, A64_SP), ctx); 508 359 509 - /* Save callee-saved registers */ 510 - emit(A64_PUSH(r6, r7, A64_SP), ctx); 511 - emit(A64_PUSH(r8, r9, A64_SP), ctx); 512 - emit(A64_PUSH(fp, tcc, A64_SP), ctx); 513 - emit(A64_PUSH(fpb, A64_R(28), A64_SP), ctx); 360 + prepare_bpf_tail_call_cnt(ctx); 361 + 362 + if (!ebpf_from_cbpf && is_main_prog) { 363 + cur_offset = ctx->idx - idx0; 364 + if (cur_offset != PROLOGUE_OFFSET) { 365 + pr_err_once("PROLOGUE_OFFSET = %d, expected %d!\n", 366 + cur_offset, PROLOGUE_OFFSET); 367 + return -1; 368 + } 369 + /* BTI landing pad for the tail call, done with a BR */ 370 + emit_bti(A64_BTI_J, ctx); 371 + } 372 + push_callee_regs(ctx); 514 373 } else { 515 374 /* 516 375 * Exception callback receives FP of Main Program as third ··· 531 372 * callee-saved registers. The exception callback will not push 532 373 * anything and re-use the main program's stack. 
533 374 * 534 - * 10 registers are on the stack 375 + * 12 registers are on the stack 535 376 */ 536 - emit(A64_SUB_I(1, A64_SP, A64_FP, 80), ctx); 377 + emit(A64_SUB_I(1, A64_SP, A64_FP, 96), ctx); 537 378 } 538 379 539 - /* Set up BPF prog stack base register */ 540 - emit(A64_MOV(1, fp, A64_SP), ctx); 541 - 542 - if (!ebpf_from_cbpf && is_main_prog) { 543 - /* Initialize tail_call_cnt */ 544 - emit(A64_MOVZ(1, tcc, 0, 0), ctx); 545 - 546 - cur_offset = ctx->idx - idx0; 547 - if (cur_offset != PROLOGUE_OFFSET) { 548 - pr_err_once("PROLOGUE_OFFSET = %d, expected %d!\n", 549 - cur_offset, PROLOGUE_OFFSET); 550 - return -1; 551 - } 552 - 553 - /* BTI landing pad for the tail call, done with a BR */ 554 - emit_bti(A64_BTI_J, ctx); 555 - } 556 - 557 - /* 558 - * Program acting as exception boundary should save all ARM64 559 - * Callee-saved registers as the exception callback needs to recover 560 - * all ARM64 Callee-saved registers in its epilogue. 561 - */ 562 - if (prog->aux->exception_boundary) { 563 - /* 564 - * As we are pushing two more registers, BPF_FP should be moved 565 - * 16 bytes 566 - */ 567 - emit(A64_SUB_I(1, fp, fp, 16), ctx); 568 - emit(A64_PUSH(A64_R(23), A64_R(24), A64_SP), ctx); 569 - } 570 - 571 - emit(A64_SUB_I(1, fpb, fp, ctx->fpb_offset), ctx); 380 + if (ctx->fp_used) 381 + /* Set up BPF prog stack base register */ 382 + emit(A64_MOV(1, fp, A64_SP), ctx); 572 383 573 384 /* Stack must be multiples of 16B */ 574 385 ctx->stack_size = round_up(prog->aux->stack_depth, 16); 575 386 576 387 /* Set up function call stack */ 577 - emit(A64_SUB_I(1, A64_SP, A64_SP, ctx->stack_size), ctx); 388 + if (ctx->stack_size) 389 + emit(A64_SUB_I(1, A64_SP, A64_SP, ctx->stack_size), ctx); 578 390 579 - if (arena_vm_start) 580 - emit_a64_mov_i64(arena_vm_base, arena_vm_start, ctx); 391 + if (ctx->arena_vm_start) 392 + emit_a64_mov_i64(arena_vm_base, ctx->arena_vm_start, ctx); 581 393 582 394 return 0; 583 395 } 584 396 585 - static int out_offset = -1; /* 
initialized on the first pass of build_body() */ 586 397 static int emit_bpf_tail_call(struct jit_ctx *ctx) 587 398 { 588 399 /* bpf_tail_call(void *prog_ctx, struct bpf_array *array, u64 index) */ ··· 561 432 562 433 const u8 tmp = bpf2a64[TMP_REG_1]; 563 434 const u8 prg = bpf2a64[TMP_REG_2]; 564 - const u8 tcc = bpf2a64[TCALL_CNT]; 565 - const int idx0 = ctx->idx; 566 - #define cur_offset (ctx->idx - idx0) 567 - #define jmp_offset (out_offset - (cur_offset)) 435 + const u8 tcc = bpf2a64[TMP_REG_3]; 436 + const u8 ptr = bpf2a64[TCCNT_PTR]; 568 437 size_t off; 438 + __le32 *branch1 = NULL; 439 + __le32 *branch2 = NULL; 440 + __le32 *branch3 = NULL; 569 441 570 442 /* if (index >= array->map.max_entries) 571 443 * goto out; ··· 576 446 emit(A64_LDR32(tmp, r2, tmp), ctx); 577 447 emit(A64_MOV(0, r3, r3), ctx); 578 448 emit(A64_CMP(0, r3, tmp), ctx); 579 - emit(A64_B_(A64_COND_CS, jmp_offset), ctx); 449 + branch1 = ctx->image + ctx->idx; 450 + emit(A64_NOP, ctx); 580 451 581 452 /* 582 - * if (tail_call_cnt >= MAX_TAIL_CALL_CNT) 453 + * if ((*tail_call_cnt_ptr) >= MAX_TAIL_CALL_CNT) 583 454 * goto out; 584 - * tail_call_cnt++; 585 455 */ 586 456 emit_a64_mov_i64(tmp, MAX_TAIL_CALL_CNT, ctx); 457 + emit(A64_LDR64I(tcc, ptr, 0), ctx); 587 458 emit(A64_CMP(1, tcc, tmp), ctx); 588 - emit(A64_B_(A64_COND_CS, jmp_offset), ctx); 459 + branch2 = ctx->image + ctx->idx; 460 + emit(A64_NOP, ctx); 461 + 462 + /* (*tail_call_cnt_ptr)++; */ 589 463 emit(A64_ADD_I(1, tcc, tcc, 1), ctx); 590 464 591 465 /* prog = array->ptrs[index]; ··· 601 467 emit(A64_ADD(1, tmp, r2, tmp), ctx); 602 468 emit(A64_LSL(1, prg, r3, 3), ctx); 603 469 emit(A64_LDR64(prg, tmp, prg), ctx); 604 - emit(A64_CBZ(1, prg, jmp_offset), ctx); 470 + branch3 = ctx->image + ctx->idx; 471 + emit(A64_NOP, ctx); 472 + 473 + /* Update tail_call_cnt if the slot is populated. 
*/ 474 + emit(A64_STR64I(tcc, ptr, 0), ctx); 475 + 476 + /* restore SP */ 477 + if (ctx->stack_size) 478 + emit(A64_ADD_I(1, A64_SP, A64_SP, ctx->stack_size), ctx); 479 + 480 + pop_callee_regs(ctx); 605 481 606 482 /* goto *(prog->bpf_func + prologue_offset); */ 607 483 off = offsetof(struct bpf_prog, bpf_func); 608 484 emit_a64_mov_i64(tmp, off, ctx); 609 485 emit(A64_LDR64(tmp, prg, tmp), ctx); 610 486 emit(A64_ADD_I(1, tmp, tmp, sizeof(u32) * PROLOGUE_OFFSET), ctx); 611 - emit(A64_ADD_I(1, A64_SP, A64_SP, ctx->stack_size), ctx); 612 487 emit(A64_BR(tmp), ctx); 613 488 614 - /* out: */ 615 - if (out_offset == -1) 616 - out_offset = cur_offset; 617 - if (cur_offset != out_offset) { 618 - pr_err_once("tail_call out_offset = %d, expected %d!\n", 619 - cur_offset, out_offset); 620 - return -1; 489 + if (ctx->image) { 490 + off = &ctx->image[ctx->idx] - branch1; 491 + *branch1 = cpu_to_le32(A64_B_(A64_COND_CS, off)); 492 + 493 + off = &ctx->image[ctx->idx] - branch2; 494 + *branch2 = cpu_to_le32(A64_B_(A64_COND_CS, off)); 495 + 496 + off = &ctx->image[ctx->idx] - branch3; 497 + *branch3 = cpu_to_le32(A64_CBZ(1, prg, off)); 621 498 } 499 + 622 500 return 0; 623 - #undef cur_offset 624 - #undef jmp_offset 625 501 } 626 502 627 503 #ifdef CONFIG_ARM64_LSE_ATOMICS ··· 857 713 plt->target = (u64)&dummy_tramp; 858 714 } 859 715 860 - static void build_epilogue(struct jit_ctx *ctx, bool is_exception_cb) 716 + static void build_epilogue(struct jit_ctx *ctx) 861 717 { 862 718 const u8 r0 = bpf2a64[BPF_REG_0]; 863 - const u8 r6 = bpf2a64[BPF_REG_6]; 864 - const u8 r7 = bpf2a64[BPF_REG_7]; 865 - const u8 r8 = bpf2a64[BPF_REG_8]; 866 - const u8 r9 = bpf2a64[BPF_REG_9]; 867 - const u8 fp = bpf2a64[BPF_REG_FP]; 868 - const u8 fpb = bpf2a64[FP_BOTTOM]; 719 + const u8 ptr = bpf2a64[TCCNT_PTR]; 869 720 870 721 /* We're done with BPF stack */ 871 - emit(A64_ADD_I(1, A64_SP, A64_SP, ctx->stack_size), ctx); 722 + if (ctx->stack_size) 723 + emit(A64_ADD_I(1, A64_SP, A64_SP, 
ctx->stack_size), ctx); 872 724 873 - /* 874 - * Program acting as exception boundary pushes R23 and R24 in addition 875 - * to BPF callee-saved registers. Exception callback uses the boundary 876 - * program's stack frame, so recover these extra registers in the above 877 - * two cases. 878 - */ 879 - if (ctx->prog->aux->exception_boundary || is_exception_cb) 880 - emit(A64_POP(A64_R(23), A64_R(24), A64_SP), ctx); 725 + pop_callee_regs(ctx); 881 726 882 - /* Restore x27 and x28 */ 883 - emit(A64_POP(fpb, A64_R(28), A64_SP), ctx); 884 - /* Restore fs (x25) and x26 */ 885 - emit(A64_POP(fp, A64_R(26), A64_SP), ctx); 886 - 887 - /* Restore callee-saved register */ 888 - emit(A64_POP(r8, r9, A64_SP), ctx); 889 - emit(A64_POP(r6, r7, A64_SP), ctx); 727 + emit(A64_POP(A64_ZR, ptr, A64_SP), ctx); 890 728 891 729 /* Restore FP/LR registers */ 892 730 emit(A64_POP(A64_FP, A64_LR, A64_SP), ctx); ··· 988 862 const u8 tmp = bpf2a64[TMP_REG_1]; 989 863 const u8 tmp2 = bpf2a64[TMP_REG_2]; 990 864 const u8 fp = bpf2a64[BPF_REG_FP]; 991 - const u8 fpb = bpf2a64[FP_BOTTOM]; 992 865 const u8 arena_vm_base = bpf2a64[ARENA_VM_START]; 993 866 const s16 off = insn->off; 994 867 const s32 imm = insn->imm; ··· 1439 1314 emit(A64_ADD(1, tmp2, src, arena_vm_base), ctx); 1440 1315 src = tmp2; 1441 1316 } 1442 - if (ctx->fpb_offset > 0 && src == fp && BPF_MODE(insn->code) != BPF_PROBE_MEM32) { 1443 - src_adj = fpb; 1444 - off_adj = off + ctx->fpb_offset; 1317 + if (src == fp) { 1318 + src_adj = A64_SP; 1319 + off_adj = off + ctx->stack_size; 1445 1320 } else { 1446 1321 src_adj = src; 1447 1322 off_adj = off; ··· 1532 1407 emit(A64_ADD(1, tmp2, dst, arena_vm_base), ctx); 1533 1408 dst = tmp2; 1534 1409 } 1535 - if (ctx->fpb_offset > 0 && dst == fp && BPF_MODE(insn->code) != BPF_PROBE_MEM32) { 1536 - dst_adj = fpb; 1537 - off_adj = off + ctx->fpb_offset; 1410 + if (dst == fp) { 1411 + dst_adj = A64_SP; 1412 + off_adj = off + ctx->stack_size; 1538 1413 } else { 1539 1414 dst_adj = dst; 1540 
1415 off_adj = off; ··· 1594 1469 emit(A64_ADD(1, tmp2, dst, arena_vm_base), ctx); 1595 1470 dst = tmp2; 1596 1471 } 1597 - if (ctx->fpb_offset > 0 && dst == fp && BPF_MODE(insn->code) != BPF_PROBE_MEM32) { 1598 - dst_adj = fpb; 1599 - off_adj = off + ctx->fpb_offset; 1472 + if (dst == fp) { 1473 + dst_adj = A64_SP; 1474 + off_adj = off + ctx->stack_size; 1600 1475 } else { 1601 1476 dst_adj = dst; 1602 1477 off_adj = off; ··· 1665 1540 return 0; 1666 1541 } 1667 1542 1668 - /* 1669 - * Return 0 if FP may change at runtime, otherwise find the minimum negative 1670 - * offset to FP, converts it to positive number, and align down to 8 bytes. 1671 - */ 1672 - static int find_fpb_offset(struct bpf_prog *prog) 1673 - { 1674 - int i; 1675 - int offset = 0; 1676 - 1677 - for (i = 0; i < prog->len; i++) { 1678 - const struct bpf_insn *insn = &prog->insnsi[i]; 1679 - const u8 class = BPF_CLASS(insn->code); 1680 - const u8 mode = BPF_MODE(insn->code); 1681 - const u8 src = insn->src_reg; 1682 - const u8 dst = insn->dst_reg; 1683 - const s32 imm = insn->imm; 1684 - const s16 off = insn->off; 1685 - 1686 - switch (class) { 1687 - case BPF_STX: 1688 - case BPF_ST: 1689 - /* fp holds atomic operation result */ 1690 - if (class == BPF_STX && mode == BPF_ATOMIC && 1691 - ((imm == BPF_XCHG || 1692 - imm == (BPF_FETCH | BPF_ADD) || 1693 - imm == (BPF_FETCH | BPF_AND) || 1694 - imm == (BPF_FETCH | BPF_XOR) || 1695 - imm == (BPF_FETCH | BPF_OR)) && 1696 - src == BPF_REG_FP)) 1697 - return 0; 1698 - 1699 - if (mode == BPF_MEM && dst == BPF_REG_FP && 1700 - off < offset) 1701 - offset = insn->off; 1702 - break; 1703 - 1704 - case BPF_JMP32: 1705 - case BPF_JMP: 1706 - break; 1707 - 1708 - case BPF_LDX: 1709 - case BPF_LD: 1710 - /* fp holds load result */ 1711 - if (dst == BPF_REG_FP) 1712 - return 0; 1713 - 1714 - if (class == BPF_LDX && mode == BPF_MEM && 1715 - src == BPF_REG_FP && off < offset) 1716 - offset = off; 1717 - break; 1718 - 1719 - case BPF_ALU: 1720 - case BPF_ALU64: 
1721 - default: 1722 - /* fp holds ALU result */ 1723 - if (dst == BPF_REG_FP) 1724 - return 0; 1725 - } 1726 - } 1727 - 1728 - if (offset < 0) { 1729 - /* 1730 - * safely be converted to a positive 'int', since insn->off 1731 - * is 's16' 1732 - */ 1733 - offset = -offset; 1734 - /* align down to 8 bytes */ 1735 - offset = ALIGN_DOWN(offset, 8); 1736 - } 1737 - 1738 - return offset; 1739 - } 1740 - 1741 1543 static int build_body(struct jit_ctx *ctx, bool extra_pass) 1742 1544 { 1743 1545 const struct bpf_prog *prog = ctx->prog; ··· 1683 1631 const struct bpf_insn *insn = &prog->insnsi[i]; 1684 1632 int ret; 1685 1633 1686 - if (ctx->image == NULL) 1687 - ctx->offset[i] = ctx->idx; 1634 + ctx->offset[i] = ctx->idx; 1688 1635 ret = build_insn(insn, ctx, extra_pass); 1689 1636 if (ret > 0) { 1690 1637 i++; 1691 - if (ctx->image == NULL) 1692 - ctx->offset[i] = ctx->idx; 1638 + ctx->offset[i] = ctx->idx; 1693 1639 continue; 1694 1640 } 1695 1641 if (ret) ··· 1698 1648 * the last element with the offset after the last 1699 1649 * instruction (end of program) 1700 1650 */ 1701 - if (ctx->image == NULL) 1702 - ctx->offset[i] = ctx->idx; 1651 + ctx->offset[i] = ctx->idx; 1703 1652 1704 1653 return 0; 1705 1654 } ··· 1750 1701 bool tmp_blinded = false; 1751 1702 bool extra_pass = false; 1752 1703 struct jit_ctx ctx; 1753 - u64 arena_vm_start; 1754 1704 u8 *image_ptr; 1755 1705 u8 *ro_image_ptr; 1706 + int body_idx; 1707 + int exentry_idx; 1756 1708 1757 1709 if (!prog->jit_requested) 1758 1710 return orig_prog; ··· 1769 1719 prog = tmp; 1770 1720 } 1771 1721 1772 - arena_vm_start = bpf_arena_get_kern_vm_start(prog->aux->arena); 1773 1722 jit_data = prog->aux->jit_data; 1774 1723 if (!jit_data) { 1775 1724 jit_data = kzalloc(sizeof(*jit_data), GFP_KERNEL); ··· 1798 1749 goto out_off; 1799 1750 } 1800 1751 1801 - ctx.fpb_offset = find_fpb_offset(prog); 1802 1752 ctx.user_vm_start = bpf_arena_get_user_vm_start(prog->aux->arena); 1753 + ctx.arena_vm_start = 
bpf_arena_get_kern_vm_start(prog->aux->arena); 1803 1754 1804 - /* 1805 - * 1. Initial fake pass to compute ctx->idx and ctx->offset. 1755 + /* Pass 1: Estimate the maximum image size. 1806 1756 * 1807 1757 * BPF line info needs ctx->offset[i] to be the offset of 1808 1758 * instruction[i] in jited image, so build prologue first. 1809 1759 */ 1810 - if (build_prologue(&ctx, was_classic, prog->aux->exception_cb, 1811 - arena_vm_start)) { 1760 + if (build_prologue(&ctx, was_classic)) { 1812 1761 prog = orig_prog; 1813 1762 goto out_off; 1814 1763 } ··· 1817 1770 } 1818 1771 1819 1772 ctx.epilogue_offset = ctx.idx; 1820 - build_epilogue(&ctx, prog->aux->exception_cb); 1773 + build_epilogue(&ctx); 1821 1774 build_plt(&ctx); 1822 1775 1823 1776 extable_align = __alignof__(struct exception_table_entry); 1824 1777 extable_size = prog->aux->num_exentries * 1825 1778 sizeof(struct exception_table_entry); 1826 1779 1827 - /* Now we know the actual image size. */ 1780 + /* Now we know the maximum image size. */ 1828 1781 prog_size = sizeof(u32) * ctx.idx; 1829 1782 /* also allocate space for plt target */ 1830 1783 extable_offset = round_up(prog_size + PLT_TARGET_SIZE, extable_align); ··· 1837 1790 goto out_off; 1838 1791 } 1839 1792 1840 - /* 2. Now, the actual pass. */ 1793 + /* Pass 2: Determine jited position and result for each instruction */ 1841 1794 1842 1795 /* 1843 1796 * Use the image(RW) for writing the JITed instructions. 
But also save ··· 1853 1806 skip_init_ctx: 1854 1807 ctx.idx = 0; 1855 1808 ctx.exentry_idx = 0; 1809 + ctx.write = true; 1856 1810 1857 - build_prologue(&ctx, was_classic, prog->aux->exception_cb, arena_vm_start); 1811 + build_prologue(&ctx, was_classic); 1812 + 1813 + /* Record exentry_idx and body_idx before first build_body */ 1814 + exentry_idx = ctx.exentry_idx; 1815 + body_idx = ctx.idx; 1816 + /* Dont write body instructions to memory for now */ 1817 + ctx.write = false; 1858 1818 1859 1819 if (build_body(&ctx, extra_pass)) { 1860 1820 prog = orig_prog; 1861 1821 goto out_free_hdr; 1862 1822 } 1863 1823 1864 - build_epilogue(&ctx, prog->aux->exception_cb); 1824 + ctx.epilogue_offset = ctx.idx; 1825 + ctx.exentry_idx = exentry_idx; 1826 + ctx.idx = body_idx; 1827 + ctx.write = true; 1828 + 1829 + /* Pass 3: Adjust jump offset and write final image */ 1830 + if (build_body(&ctx, extra_pass) || 1831 + WARN_ON_ONCE(ctx.idx != ctx.epilogue_offset)) { 1832 + prog = orig_prog; 1833 + goto out_free_hdr; 1834 + } 1835 + 1836 + build_epilogue(&ctx); 1865 1837 build_plt(&ctx); 1866 1838 1867 - /* 3. Extra pass to validate JITed code. */ 1839 + /* Extra pass to validate JITed code. */ 1868 1840 if (validate_ctx(&ctx)) { 1869 1841 prog = orig_prog; 1870 1842 goto out_free_hdr; 1871 1843 } 1844 + 1845 + /* update the real prog size */ 1846 + prog_size = sizeof(u32) * ctx.idx; 1872 1847 1873 1848 /* And we're done. */ 1874 1849 if (bpf_jit_enable > 1) 1875 1850 bpf_jit_dump(prog->len, prog_size, 2, ctx.image); 1876 1851 1877 1852 if (!prog->is_func || extra_pass) { 1878 - if (extra_pass && ctx.idx != jit_data->ctx.idx) { 1879 - pr_err_once("multi-func JIT bug %d != %d\n", 1853 + /* The jited image may shrink since the jited result for 1854 + * BPF_CALL to subprog may be changed from indirect call 1855 + * to direct call. 
1856 + */ 1857 + if (extra_pass && ctx.idx > jit_data->ctx.idx) { 1858 + pr_err_once("multi-func JIT bug %d > %d\n", 1880 1859 ctx.idx, jit_data->ctx.idx); 1881 1860 prog->bpf_func = NULL; 1882 1861 prog->jited = 0; ··· 2373 2300 .image = image, 2374 2301 .ro_image = ro_image, 2375 2302 .idx = 0, 2303 + .write = true, 2376 2304 }; 2377 2305 2378 2306 nregs = btf_func_model_nregs(m);
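The arm64 hunk above runs the body twice after the sizing pass: once with `ctx.write = false` to record positions, then again from `body_idx` with `ctx.write = true`, checking that the index lands on `epilogue_offset`. The following is a minimal userspace sketch of that control flow, not the kernel code. `toy_ctx`, `toy_emit`, `toy_build_body` and `toy_jit` are hypothetical names, and the assumption that the emit helper always advances `idx` but stores only when `write` is set is inferred from the diff rather than shown in it.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy context; only the idx/write idea mirrors the diff. */
struct toy_ctx {
	uint32_t *image;	/* destination buffer, may be NULL while sizing */
	int idx;		/* next instruction slot */
	bool write;		/* store into image only when true */
};

/* Always advance idx; store the word only when writing is enabled. */
static void toy_emit(struct toy_ctx *ctx, uint32_t insn)
{
	if (ctx->image && ctx->write)
		ctx->image[ctx->idx] = insn;
	ctx->idx++;
}

/* Record the slot of every source instruction, then emit it. */
static void toy_build_body(struct toy_ctx *ctx, const uint32_t *src, int n,
			   int *offsets)
{
	for (int i = 0; i < n; i++) {
		offsets[i] = ctx->idx;
		toy_emit(ctx, src[i]);
	}
}

/* Returns the number of emitted words, or -1 if the two body runs disagree. */
static int toy_jit(const uint32_t *src, int n, uint32_t *image, int *offsets)
{
	struct toy_ctx ctx = { .image = image, .idx = 0, .write = true };
	int body_idx, end_idx;

	toy_emit(&ctx, 0xAAAAAAAAu);	/* stand-in prologue */
	body_idx = ctx.idx;

	/* Dry run: positions are computed, nothing is stored. */
	ctx.write = false;
	toy_build_body(&ctx, src, n, offsets);
	end_idx = ctx.idx;

	/* Final run: rewind to the body start and store for real. */
	ctx.idx = body_idx;
	ctx.write = true;
	toy_build_body(&ctx, src, n, offsets);
	if (ctx.idx != end_idx)
		return -1;

	toy_emit(&ctx, 0xEEEEEEEEu);	/* stand-in epilogue */
	return ctx.idx;
}
```

In this toy every instruction has a fixed size, so the two runs always agree. In the real JIT the second run can use offsets learned in the first, which is why the consistency check matters.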

+131 -30

arch/x86/net/bpf_jit_comp.c

··· 64 64 return value <= 127 && value >= -128; 65 65 } 66 66 67 + /* 68 + * Let us limit the positive offset to be <= 123. 69 + * This is to ensure eventual jit convergence For the following patterns: 70 + * ... 71 + * pass4, final_proglen=4391: 72 + * ... 73 + * 20e: 48 85 ff test rdi,rdi 74 + * 211: 74 7d je 0x290 75 + * 213: 48 8b 77 00 mov rsi,QWORD PTR [rdi+0x0] 76 + * ... 77 + * 289: 48 85 ff test rdi,rdi 78 + * 28c: 74 17 je 0x2a5 79 + * 28e: e9 7f ff ff ff jmp 0x212 80 + * 293: bf 03 00 00 00 mov edi,0x3 81 + * Note that insn at 0x211 is 2-byte cond jump insn for offset 0x7d (-125) 82 + * and insn at 0x28e is 5-byte jmp insn with offset -129. 83 + * 84 + * pass5, final_proglen=4392: 85 + * ... 86 + * 20e: 48 85 ff test rdi,rdi 87 + * 211: 0f 84 80 00 00 00 je 0x297 88 + * 217: 48 8b 77 00 mov rsi,QWORD PTR [rdi+0x0] 89 + * ... 90 + * 28d: 48 85 ff test rdi,rdi 91 + * 290: 74 1a je 0x2ac 92 + * 292: eb 84 jmp 0x218 93 + * 294: bf 03 00 00 00 mov edi,0x3 94 + * Note that insn at 0x211 is 6-byte cond jump insn now since its offset 95 + * becomes 0x80 based on previous round (0x293 - 0x213 = 0x80). 96 + * At the same time, insn at 0x292 is a 2-byte insn since its offset is 97 + * -124. 98 + * 99 + * pass6 will repeat the same code as in pass4 and this will prevent 100 + * eventual convergence. 101 + * 102 + * To fix this issue, we need to break je (2->6 bytes) <-> jmp (5->2 bytes) 103 + * cycle in the above. In the above example je offset <= 0x7c should work. 104 + * 105 + * For other cases, je <-> je needs offset <= 0x7b to avoid no convergence 106 + * issue. For jmp <-> je and jmp <-> jmp cases, jmp offset <= 0x7c should 107 + * avoid no convergence issue. 108 + * 109 + * Overall, let us limit the positive offset for 8bit cond/uncond jmp insn 110 + * to maximum 123 (0x7b). This way, the jit pass can eventually converge. 
111 + */ 112 + static bool is_imm8_jmp_offset(int value) 113 + { 114 + return value <= 123 && value >= -128; 115 + } 116 + 67 117 static bool is_simm32(s64 value) 68 118 { 69 119 return value == (s64)(s32)value; ··· 323 273 /* Number of bytes emit_patch() needs to generate instructions */ 324 274 #define X86_PATCH_SIZE 5 325 275 /* Number of bytes that will be skipped on tailcall */ 326 - #define X86_TAIL_CALL_OFFSET (11 + ENDBR_INSN_SIZE) 276 + #define X86_TAIL_CALL_OFFSET (12 + ENDBR_INSN_SIZE) 327 277 328 278 static void push_r12(u8 **pprog) 329 279 { ··· 453 403 *pprog = prog; 454 404 } 455 405 406 + static void emit_prologue_tail_call(u8 **pprog, bool is_subprog) 407 + { 408 + u8 *prog = *pprog; 409 + 410 + if (!is_subprog) { 411 + /* cmp rax, MAX_TAIL_CALL_CNT */ 412 + EMIT4(0x48, 0x83, 0xF8, MAX_TAIL_CALL_CNT); 413 + EMIT2(X86_JA, 6); /* ja 6 */ 414 + /* rax is tail_call_cnt if <= MAX_TAIL_CALL_CNT. 415 + * case1: entry of main prog. 416 + * case2: tail callee of main prog. 417 + */ 418 + EMIT1(0x50); /* push rax */ 419 + /* Make rax as tail_call_cnt_ptr. */ 420 + EMIT3(0x48, 0x89, 0xE0); /* mov rax, rsp */ 421 + EMIT2(0xEB, 1); /* jmp 1 */ 422 + /* rax is tail_call_cnt_ptr if > MAX_TAIL_CALL_CNT. 423 + * case: tail callee of subprog. 424 + */ 425 + EMIT1(0x50); /* push rax */ 426 + /* push tail_call_cnt_ptr */ 427 + EMIT1(0x50); /* push rax */ 428 + } else { /* is_subprog */ 429 + /* rax is tail_call_cnt_ptr. */ 430 + EMIT1(0x50); /* push rax */ 431 + EMIT1(0x50); /* push rax */ 432 + } 433 + 434 + *pprog = prog; 435 + } 436 + 456 437 /* 457 438 * Emit x86-64 prologue code for BPF program. 458 439 * bpf_tail_call helper will skip the first X86_TAIL_CALL_OFFSET bytes ··· 505 424 /* When it's the entry of the whole tailcall context, 506 425 * zeroing rax means initialising tail_call_cnt. 507 426 */ 508 - EMIT2(0x31, 0xC0); /* xor eax, eax */ 427 + EMIT3(0x48, 0x31, 0xC0); /* xor rax, rax */ 509 428 else 510 429 /* Keep the same instruction layout. 
*/ 511 - EMIT2(0x66, 0x90); /* nop2 */ 430 + emit_nops(&prog, 3); /* nop3 */ 512 431 } 513 432 /* Exception callback receives FP as third parameter */ 514 433 if (is_exception_cb) { ··· 534 453 if (stack_depth) 535 454 EMIT3_off32(0x48, 0x81, 0xEC, round_up(stack_depth, 8)); 536 455 if (tail_call_reachable) 537 - EMIT1(0x50); /* push rax */ 456 + emit_prologue_tail_call(&prog, is_subprog); 538 457 *pprog = prog; 539 458 } 540 459 ··· 670 589 *pprog = prog; 671 590 } 672 591 592 + #define BPF_TAIL_CALL_CNT_PTR_STACK_OFF(stack) (-16 - round_up(stack, 8)) 593 + 673 594 /* 674 595 * Generate the following code: 675 596 * 676 597 * ... bpf_tail_call(void *ctx, struct bpf_array *array, u64 index) ... 677 598 * if (index >= array->map.max_entries) 678 599 * goto out; 679 - * if (tail_call_cnt++ >= MAX_TAIL_CALL_CNT) 600 + * if ((*tcc_ptr)++ >= MAX_TAIL_CALL_CNT) 680 601 * goto out; 681 602 * prog = array->ptrs[index]; 682 603 * if (prog == NULL) ··· 691 608 u32 stack_depth, u8 *ip, 692 609 struct jit_context *ctx) 693 610 { 694 - int tcc_off = -4 - round_up(stack_depth, 8); 611 + int tcc_ptr_off = BPF_TAIL_CALL_CNT_PTR_STACK_OFF(stack_depth); 695 612 u8 *prog = *pprog, *start = *pprog; 696 613 int offset; 697 614 ··· 713 630 EMIT2(X86_JBE, offset); /* jbe out */ 714 631 715 632 /* 716 - * if (tail_call_cnt++ >= MAX_TAIL_CALL_CNT) 633 + * if ((*tcc_ptr)++ >= MAX_TAIL_CALL_CNT) 717 634 * goto out; 718 635 */ 719 - EMIT2_off32(0x8B, 0x85, tcc_off); /* mov eax, dword ptr [rbp - tcc_off] */ 720 - EMIT3(0x83, 0xF8, MAX_TAIL_CALL_CNT); /* cmp eax, MAX_TAIL_CALL_CNT */ 636 + EMIT3_off32(0x48, 0x8B, 0x85, tcc_ptr_off); /* mov rax, qword ptr [rbp - tcc_ptr_off] */ 637 + EMIT4(0x48, 0x83, 0x38, MAX_TAIL_CALL_CNT); /* cmp qword ptr [rax], MAX_TAIL_CALL_CNT */ 721 638 722 639 offset = ctx->tail_call_indirect_label - (prog + 2 - start); 723 640 EMIT2(X86_JAE, offset); /* jae out */ 724 - EMIT3(0x83, 0xC0, 0x01); /* add eax, 1 */ 725 - EMIT2_off32(0x89, 0x85, tcc_off); /* mov dword ptr 
[rbp - tcc_off], eax */ 726 641 727 642 /* prog = array->ptrs[index]; */ 728 643 EMIT4_off32(0x48, 0x8B, 0x8C, 0xD6, /* mov rcx, [rsi + rdx * 8 + offsetof(...)] */ ··· 735 654 offset = ctx->tail_call_indirect_label - (prog + 2 - start); 736 655 EMIT2(X86_JE, offset); /* je out */ 737 656 657 + /* Inc tail_call_cnt if the slot is populated. */ 658 + EMIT4(0x48, 0x83, 0x00, 0x01); /* add qword ptr [rax], 1 */ 659 + 738 660 if (bpf_prog->aux->exception_boundary) { 739 661 pop_callee_regs(&prog, all_callee_regs_used); 740 662 pop_r12(&prog); ··· 747 663 pop_r12(&prog); 748 664 } 749 665 666 + /* Pop tail_call_cnt_ptr. */ 667 + EMIT1(0x58); /* pop rax */ 668 + /* Pop tail_call_cnt, if it's main prog. 669 + * Pop tail_call_cnt_ptr, if it's subprog. 670 + */ 750 671 EMIT1(0x58); /* pop rax */ 751 672 if (stack_depth) 752 673 EMIT3_off32(0x48, 0x81, 0xC4, /* add rsp, sd */ ··· 780 691 bool *callee_regs_used, u32 stack_depth, 781 692 struct jit_context *ctx) 782 693 { 783 - int tcc_off = -4 - round_up(stack_depth, 8); 694 + int tcc_ptr_off = BPF_TAIL_CALL_CNT_PTR_STACK_OFF(stack_depth); 784 695 u8 *prog = *pprog, *start = *pprog; 785 696 int offset; 786 697 787 698 /* 788 - * if (tail_call_cnt++ >= MAX_TAIL_CALL_CNT) 699 + * if ((*tcc_ptr)++ >= MAX_TAIL_CALL_CNT) 789 700 * goto out; 790 701 */ 791 - EMIT2_off32(0x8B, 0x85, tcc_off); /* mov eax, dword ptr [rbp - tcc_off] */ 792 - EMIT3(0x83, 0xF8, MAX_TAIL_CALL_CNT); /* cmp eax, MAX_TAIL_CALL_CNT */ 702 + EMIT3_off32(0x48, 0x8B, 0x85, tcc_ptr_off); /* mov rax, qword ptr [rbp - tcc_ptr_off] */ 703 + EMIT4(0x48, 0x83, 0x38, MAX_TAIL_CALL_CNT); /* cmp qword ptr [rax], MAX_TAIL_CALL_CNT */ 793 704 794 705 offset = ctx->tail_call_direct_label - (prog + 2 - start); 795 706 EMIT2(X86_JAE, offset); /* jae out */ 796 - EMIT3(0x83, 0xC0, 0x01); /* add eax, 1 */ 797 - EMIT2_off32(0x89, 0x85, tcc_off); /* mov dword ptr [rbp - tcc_off], eax */ 798 707 799 708 poke->tailcall_bypass = ip + (prog - start); 800 709 poke->adj_off = 
X86_TAIL_CALL_OFFSET; ··· 801 714 802 715 emit_jump(&prog, (u8 *)poke->tailcall_target + X86_PATCH_SIZE, 803 716 poke->tailcall_bypass); 717 + 718 + /* Inc tail_call_cnt if the slot is populated. */ 719 + EMIT4(0x48, 0x83, 0x00, 0x01); /* add qword ptr [rax], 1 */ 804 720 805 721 if (bpf_prog->aux->exception_boundary) { 806 722 pop_callee_regs(&prog, all_callee_regs_used); ··· 814 724 pop_r12(&prog); 815 725 } 816 726 727 + /* Pop tail_call_cnt_ptr. */ 728 + EMIT1(0x58); /* pop rax */ 729 + /* Pop tail_call_cnt, if it's main prog. 730 + * Pop tail_call_cnt_ptr, if it's subprog. 731 + */ 817 732 EMIT1(0x58); /* pop rax */ 818 733 if (stack_depth) 819 734 EMIT3_off32(0x48, 0x81, 0xC4, round_up(stack_depth, 8)); ··· 1406 1311 1407 1312 #define INSN_SZ_DIFF (((addrs[i] - addrs[i - 1]) - (prog - temp))) 1408 1313 1409 - /* mov rax, qword ptr [rbp - rounded_stack_depth - 8] */ 1410 - #define RESTORE_TAIL_CALL_CNT(stack) \ 1411 - EMIT3_off32(0x48, 0x8B, 0x85, -round_up(stack, 8) - 8) 1314 + #define __LOAD_TCC_PTR(off) \ 1315 + EMIT3_off32(0x48, 0x8B, 0x85, off) 1316 + /* mov rax, qword ptr [rbp - rounded_stack_depth - 16] */ 1317 + #define LOAD_TAIL_CALL_CNT_PTR(stack) \ 1318 + __LOAD_TCC_PTR(BPF_TAIL_CALL_CNT_PTR_STACK_OFF(stack)) 1412 1319 1413 1320 static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image, u8 *rw_image, 1414 1321 int oldproglen, struct jit_context *ctx, bool jmp_padding) ··· 2128 2031 2129 2032 func = (u8 *) __bpf_call_base + imm32; 2130 2033 if (tail_call_reachable) { 2131 - RESTORE_TAIL_CALL_CNT(bpf_prog->aux->stack_depth); 2034 + LOAD_TAIL_CALL_CNT_PTR(bpf_prog->aux->stack_depth); 2132 2035 ip += 7; 2133 2036 } 2134 2037 if (!imm32) ··· 2281 2184 return -EFAULT; 2282 2185 } 2283 2186 jmp_offset = addrs[i + insn->off] - addrs[i]; 2284 - if (is_imm8(jmp_offset)) { 2187 + if (is_imm8_jmp_offset(jmp_offset)) { 2285 2188 if (jmp_padding) { 2286 2189 /* To keep the jmp_offset valid, the extra bytes are 2287 2190 * padded before the jump insn, so 
we subtract the ··· 2363 2266 break; 2364 2267 } 2365 2268 emit_jmp: 2366 - if (is_imm8(jmp_offset)) { 2269 + if (is_imm8_jmp_offset(jmp_offset)) { 2367 2270 if (jmp_padding) { 2368 2271 /* To avoid breaking jmp_offset, the extra bytes 2369 2272 * are padded before the actual jmp insn, so ··· 2803 2706 return 0; 2804 2707 } 2805 2708 2709 + /* mov rax, qword ptr [rbp - rounded_stack_depth - 8] */ 2710 + #define LOAD_TRAMP_TAIL_CALL_CNT_PTR(stack) \ 2711 + __LOAD_TCC_PTR(-round_up(stack, 8) - 8) 2712 + 2806 2713 /* Example: 2807 2714 * __be16 eth_type_trans(struct sk_buff *skb, struct net_device *dev); 2808 2715 * its 'struct btf_func_model' will be nr_args=2 ··· 2927 2826 * [ ... ] 2928 2827 * [ stack_arg2 ] 2929 2828 * RBP - arg_stack_off [ stack_arg1 ] 2930 - * RSP [ tail_call_cnt ] BPF_TRAMP_F_TAIL_CALL_CTX 2829 + * RSP [ tail_call_cnt_ptr ] BPF_TRAMP_F_TAIL_CALL_CTX 2931 2830 */ 2932 2831 2933 2832 /* room for return value of orig_call or fentry prog */ ··· 3056 2955 save_args(m, &prog, arg_stack_off, true); 3057 2956 3058 2957 if (flags & BPF_TRAMP_F_TAIL_CALL_CTX) { 3059 - /* Before calling the original function, restore the 3060 - * tail_call_cnt from stack to rax. 2958 + /* Before calling the original function, load the 2959 + * tail_call_cnt_ptr from stack to rax. 3061 2960 */ 3062 - RESTORE_TAIL_CALL_CNT(stack_size); 2961 + LOAD_TRAMP_TAIL_CALL_CNT_PTR(stack_size); 3063 2962 } 3064 2963 3065 2964 if (flags & BPF_TRAMP_F_ORIG_STACK) { ··· 3118 3017 goto cleanup; 3119 3018 } 3120 3019 } else if (flags & BPF_TRAMP_F_TAIL_CALL_CTX) { 3121 - /* Before running the original function, restore the 3122 - * tail_call_cnt from stack to rax. 3020 + /* Before running the original function, load the 3021 + * tail_call_cnt_ptr from stack to rax. 3123 3022 */ 3124 - RESTORE_TAIL_CALL_CNT(stack_size); 3023 + LOAD_TRAMP_TAIL_CALL_CNT_PTR(stack_size); 3125 3024 } 3126 3025 3127 3026 /* restore return value of orig_call or fentry prog back into RAX */
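The effect of `is_imm8_jmp_offset()` in the x86 hunks above can be sketched by looking only at the encoding size it selects. The predicate below is copied from the patch. `cond_jmp_size` and `uncond_jmp_size` are hypothetical helpers added for illustration, and the byte counts (2 and 6 for `je`, 2 and 5 for `jmp`) are taken from the disassembly in the patch comment. The sketch ignores padding and the special handling of zero-length jumps.

```c
#include <stdbool.h>

/* Same predicate as the patch: positive short offsets are capped at 123. */
static bool is_imm8_jmp_offset(int value)
{
	return value <= 123 && value >= -128;
}

/* Hypothetical helper: bytes used by a conditional jump (rel8 vs rel32). */
static int cond_jmp_size(int jmp_offset)
{
	return is_imm8_jmp_offset(jmp_offset) ? 2 : 6;
}

/* Hypothetical helper: bytes used by an unconditional jump (rel8 vs rel32). */
static int uncond_jmp_size(int jmp_offset)
{
	return is_imm8_jmp_offset(jmp_offset) ? 2 : 5;
}
```

With this cap, the offset 0x7d from the pass4 listing no longer selects the 2-byte form, so the `je` stays at 6 bytes across passes instead of flipping between sizes.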

+1

fs/Makefile

··· 129 129 obj-$(CONFIG_EROFS_FS) += erofs/ 130 130 obj-$(CONFIG_VBOXSF_FS) += vboxsf/ 131 131 obj-$(CONFIG_ZONEFS_FS) += zonefs/ 132 + obj-$(CONFIG_BPF_LSM) += bpf_fs_kfuncs.o

+185

fs/bpf_fs_kfuncs.c

··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + /* Copyright (c) 2024 Google LLC. */ 3 + 4 + #include <linux/bpf.h> 5 + #include <linux/btf.h> 6 + #include <linux/btf_ids.h> 7 + #include <linux/dcache.h> 8 + #include <linux/fs.h> 9 + #include <linux/file.h> 10 + #include <linux/mm.h> 11 + #include <linux/xattr.h> 12 + 13 + __bpf_kfunc_start_defs(); 14 + 15 + /** 16 + * bpf_get_task_exe_file - get a reference on the exe_file struct file member of 17 + * the mm_struct that is nested within the supplied 18 + * task_struct 19 + * @task: task_struct of which the nested mm_struct exe_file member to get a 20 + * reference on 21 + * 22 + * Get a reference on the exe_file struct file member field of the mm_struct 23 + * nested within the supplied *task*. The referenced file pointer acquired by 24 + * this BPF kfunc must be released using bpf_put_file(). Failing to call 25 + * bpf_put_file() on the returned referenced struct file pointer that has been 26 + * acquired by this BPF kfunc will result in the BPF program being rejected by 27 + * the BPF verifier. 28 + * 29 + * This BPF kfunc may only be called from BPF LSM programs. 30 + * 31 + * Internally, this BPF kfunc leans on get_task_exe_file(), such that calling 32 + * bpf_get_task_exe_file() would be analogous to calling get_task_exe_file() 33 + * directly in kernel context. 34 + * 35 + * Return: A referenced struct file pointer to the exe_file member of the 36 + * mm_struct that is nested within the supplied *task*. On error, NULL is 37 + * returned. 38 + */ 39 + __bpf_kfunc struct file *bpf_get_task_exe_file(struct task_struct *task) 40 + { 41 + return get_task_exe_file(task); 42 + } 43 + 44 + /** 45 + * bpf_put_file - put a reference on the supplied file 46 + * @file: file to put a reference on 47 + * 48 + * Put a reference on the supplied *file*. Only referenced file pointers may be 49 + * passed to this BPF kfunc. 
Attempting to pass an unreferenced file pointer, or 50 + * any other arbitrary pointer for that matter, will result in the BPF program 51 + * being rejected by the BPF verifier. 52 + * 53 + * This BPF kfunc may only be called from BPF LSM programs. 54 + */ 55 + __bpf_kfunc void bpf_put_file(struct file *file) 56 + { 57 + fput(file); 58 + } 59 + 60 + /** 61 + * bpf_path_d_path - resolve the pathname for the supplied path 62 + * @path: path to resolve the pathname for 63 + * @buf: buffer to return the resolved pathname in 64 + * @buf__sz: length of the supplied buffer 65 + * 66 + * Resolve the pathname for the supplied *path* and store it in *buf*. This BPF 67 + * kfunc is the safer variant of the legacy bpf_d_path() helper and should be 68 + * used in place of bpf_d_path() whenever possible. It enforces KF_TRUSTED_ARGS 69 + * semantics, meaning that the supplied *path* must itself hold a valid 70 + * reference, or else the BPF program will be outright rejected by the BPF 71 + * verifier. 72 + * 73 + * This BPF kfunc may only be called from BPF LSM programs. 74 + * 75 + * Return: A positive integer corresponding to the length of the resolved 76 + * pathname in *buf*, including the NUL termination character. On error, a 77 + * negative integer is returned. 78 + */ 79 + __bpf_kfunc int bpf_path_d_path(struct path *path, char *buf, size_t buf__sz) 80 + { 81 + int len; 82 + char *ret; 83 + 84 + if (!buf__sz) 85 + return -EINVAL; 86 + 87 + ret = d_path(path, buf, buf__sz); 88 + if (IS_ERR(ret)) 89 + return PTR_ERR(ret); 90 + 91 + len = buf + buf__sz - ret; 92 + memmove(buf, ret, len); 93 + return len; 94 + } 95 + 96 + /** 97 + * bpf_get_dentry_xattr - get xattr of a dentry 98 + * @dentry: dentry to get xattr from 99 + * @name__str: name of the xattr 100 + * @value_p: output buffer of the xattr value 101 + * 102 + * Get xattr *name__str* of *dentry* and store the output in *value_ptr*. 103 + * 104 + * For security reasons, only *name__str* with prefix "user." is allowed. 
105 + * 106 + * Return: 0 on success, a negative value on error. 107 + */ 108 + __bpf_kfunc int bpf_get_dentry_xattr(struct dentry *dentry, const char *name__str, 109 + struct bpf_dynptr *value_p) 110 + { 111 + struct bpf_dynptr_kern *value_ptr = (struct bpf_dynptr_kern *)value_p; 112 + struct inode *inode = d_inode(dentry); 113 + u32 value_len; 114 + void *value; 115 + int ret; 116 + 117 + if (WARN_ON(!inode)) 118 + return -EINVAL; 119 + 120 + if (strncmp(name__str, XATTR_USER_PREFIX, XATTR_USER_PREFIX_LEN)) 121 + return -EPERM; 122 + 123 + value_len = __bpf_dynptr_size(value_ptr); 124 + value = __bpf_dynptr_data_rw(value_ptr, value_len); 125 + if (!value) 126 + return -EINVAL; 127 + 128 + ret = inode_permission(&nop_mnt_idmap, inode, MAY_READ); 129 + if (ret) 130 + return ret; 131 + return __vfs_getxattr(dentry, inode, name__str, value, value_len); 132 + } 133 + 134 + /** 135 + * bpf_get_file_xattr - get xattr of a file 136 + * @file: file to get xattr from 137 + * @name__str: name of the xattr 138 + * @value_p: output buffer of the xattr value 139 + * 140 + * Get xattr *name__str* of *file* and store the output in *value_ptr*. 141 + * 142 + * For security reasons, only *name__str* with prefix "user." is allowed. 143 + * 144 + * Return: 0 on success, a negative value on error. 
145 + */ 146 + __bpf_kfunc int bpf_get_file_xattr(struct file *file, const char *name__str, 147 + struct bpf_dynptr *value_p) 148 + { 149 + struct dentry *dentry; 150 + 151 + dentry = file_dentry(file); 152 + return bpf_get_dentry_xattr(dentry, name__str, value_p); 153 + } 154 + 155 + __bpf_kfunc_end_defs(); 156 + 157 + BTF_KFUNCS_START(bpf_fs_kfunc_set_ids) 158 + BTF_ID_FLAGS(func, bpf_get_task_exe_file, 159 + KF_ACQUIRE | KF_TRUSTED_ARGS | KF_RET_NULL) 160 + BTF_ID_FLAGS(func, bpf_put_file, KF_RELEASE) 161 + BTF_ID_FLAGS(func, bpf_path_d_path, KF_TRUSTED_ARGS) 162 + BTF_ID_FLAGS(func, bpf_get_dentry_xattr, KF_SLEEPABLE | KF_TRUSTED_ARGS) 163 + BTF_ID_FLAGS(func, bpf_get_file_xattr, KF_SLEEPABLE | KF_TRUSTED_ARGS) 164 + BTF_KFUNCS_END(bpf_fs_kfunc_set_ids) 165 + 166 + static int bpf_fs_kfuncs_filter(const struct bpf_prog *prog, u32 kfunc_id) 167 + { 168 + if (!btf_id_set8_contains(&bpf_fs_kfunc_set_ids, kfunc_id) || 169 + prog->type == BPF_PROG_TYPE_LSM) 170 + return 0; 171 + return -EACCES; 172 + } 173 + 174 + static const struct btf_kfunc_id_set bpf_fs_kfunc_set = { 175 + .owner = THIS_MODULE, 176 + .set = &bpf_fs_kfunc_set_ids, 177 + .filter = bpf_fs_kfuncs_filter, 178 + }; 179 + 180 + static int __init bpf_fs_kfuncs_init(void) 181 + { 182 + return register_btf_kfunc_id_set(BPF_PROG_TYPE_LSM, &bpf_fs_kfunc_set); 183 + } 184 + 185 + late_initcall(bpf_fs_kfuncs_init);
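The length arithmetic in `bpf_path_d_path()` above relies on `d_path()` placing the string at the tail of the buffer and returning a pointer into it. Here is a minimal userspace sketch of that step. `fake_d_path` and `path_to_front` are hypothetical stand-ins, not kernel functions, and the sketch returns -1 where the kernel code returns `-EINVAL` or the `PTR_ERR()` value.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for d_path(): writes the NUL-terminated string at
 * the tail of buf and returns a pointer to its first byte, or NULL if the
 * buffer is too small.
 */
static char *fake_d_path(const char *path, char *buf, size_t buflen)
{
	size_t need = strlen(path) + 1;

	if (need > buflen)
		return NULL;
	return memcpy(buf + buflen - need, path, need);
}

/* Mirrors the kfunc body: compute the length, then move it to the front. */
static int path_to_front(const char *path, char *buf, size_t buf_sz)
{
	char *ret;
	int len;

	if (!buf_sz)
		return -1;

	ret = fake_d_path(path, buf, buf_sz);
	if (!ret)
		return -1;

	/* Distance from the string start to the buffer end, NUL included. */
	len = (int)(buf + buf_sz - ret);
	memmove(buf, ret, len);
	return len;
}
```

`memmove()` rather than `memcpy()` is needed because the source and destination ranges overlap whenever the path fills more than half of the buffer.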

+25 -3

include/linux/bpf.h

··· 294 294 * same prog type, JITed flag and xdp_has_frags flag. 295 295 */ 296 296 struct { 297 + const struct btf_type *attach_func_proto; 297 298 spinlock_t lock; 298 299 enum bpf_prog_type type; 299 300 bool jited; ··· 695 694 /* DYNPTR points to xdp_buff */ 696 695 DYNPTR_TYPE_XDP = BIT(16 + BPF_BASE_TYPE_BITS), 697 696 697 + /* Memory must be aligned on some architectures, used in combination with 698 + * MEM_FIXED_SIZE. 699 + */ 700 + MEM_ALIGNED = BIT(17 + BPF_BASE_TYPE_BITS), 701 + 698 702 __BPF_TYPE_FLAG_MAX, 699 703 __BPF_TYPE_LAST_FLAG = __BPF_TYPE_FLAG_MAX - 1, 700 704 }; ··· 737 731 ARG_ANYTHING, /* any (initialized) argument is ok */ 738 732 ARG_PTR_TO_SPIN_LOCK, /* pointer to bpf_spin_lock */ 739 733 ARG_PTR_TO_SOCK_COMMON, /* pointer to sock_common */ 740 - ARG_PTR_TO_INT, /* pointer to int */ 741 - ARG_PTR_TO_LONG, /* pointer to long */ 742 734 ARG_PTR_TO_SOCKET, /* pointer to bpf_sock (fullsock) */ 743 735 ARG_PTR_TO_BTF_ID, /* pointer to in-kernel struct */ 744 736 ARG_PTR_TO_RINGBUF_MEM, /* pointer to dynamically reserved ringbuf memory */ ··· 747 743 ARG_PTR_TO_STACK, /* pointer to stack */ 748 744 ARG_PTR_TO_CONST_STR, /* pointer to a null terminated read-only string */ 749 745 ARG_PTR_TO_TIMER, /* pointer to bpf_timer */ 750 - ARG_PTR_TO_KPTR, /* pointer to referenced kptr */ 746 + ARG_KPTR_XCHG_DEST, /* pointer to destination that kptrs are bpf_kptr_xchg'd into */ 751 747 ARG_PTR_TO_DYNPTR, /* pointer to bpf_dynptr. 
See bpf_type_flag for dynptr type */ 752 748 __BPF_ARG_TYPE_MAX, 753 749 ··· 811 807 bool gpl_only; 812 808 bool pkt_access; 813 809 bool might_sleep; 810 + /* set to true if helper follows contract for llvm 811 + * attribute bpf_fastcall: 812 + * - void functions do not scratch r0 813 + * - functions taking N arguments scratch only registers r1-rN 814 + */ 815 + bool allow_fastcall; 814 816 enum bpf_return_type ret_type; 815 817 union { 816 818 struct { ··· 929 919 */ 930 920 struct bpf_insn_access_aux { 931 921 enum bpf_reg_type reg_type; 922 + bool is_ldsx; 932 923 union { 933 924 int ctx_field_size; 934 925 struct { ··· 938 927 }; 939 928 }; 940 929 struct bpf_verifier_log *log; /* for verbose logs */ 930 + bool is_retval; /* is accessing function return value ? */ 941 931 }; 942 932 943 933 static inline void ··· 977 965 struct bpf_insn_access_aux *info); 978 966 int (*gen_prologue)(struct bpf_insn *insn, bool direct_write, 979 967 const struct bpf_prog *prog); 968 + int (*gen_epilogue)(struct bpf_insn *insn, const struct bpf_prog *prog, 969 + s16 ctx_stack_off); 980 970 int (*gen_ld_abs)(const struct bpf_insn *orig, 981 971 struct bpf_insn *insn_buf); 982 972 u32 (*convert_ctx_access)(enum bpf_access_type type, ··· 1809 1795 #define BPF_MODULE_OWNER ((void *)((0xeB9FUL << 2) + POISON_POINTER_DELTA)) 1810 1796 bool bpf_struct_ops_get(const void *kdata); 1811 1797 void bpf_struct_ops_put(const void *kdata); 1798 + int bpf_struct_ops_supported(const struct bpf_struct_ops *st_ops, u32 moff); 1812 1799 int bpf_struct_ops_map_sys_lookup_elem(struct bpf_map *map, void *key, 1813 1800 void *value); 1814 1801 int bpf_struct_ops_prepare_trampoline(struct bpf_tramp_links *tlinks, ··· 1865 1850 static inline void bpf_module_put(const void *data, struct module *owner) 1866 1851 { 1867 1852 module_put(owner); 1853 + } 1854 + static inline int bpf_struct_ops_supported(const struct bpf_struct_ops *st_ops, u32 moff) 1855 + { 1856 + return -ENOTSUPP; 1868 1857 } 1869 1858 
static inline int bpf_struct_ops_map_sys_lookup_elem(struct bpf_map *map, 1870 1859 void *key, ··· 3203 3184 extern const struct bpf_func_proto bpf_get_current_comm_proto; 3204 3185 extern const struct bpf_func_proto bpf_get_stackid_proto; 3205 3186 extern const struct bpf_func_proto bpf_get_stack_proto; 3187 + extern const struct bpf_func_proto bpf_get_stack_sleepable_proto; 3206 3188 extern const struct bpf_func_proto bpf_get_task_stack_proto; 3189 + extern const struct bpf_func_proto bpf_get_task_stack_sleepable_proto; 3207 3190 extern const struct bpf_func_proto bpf_get_stackid_proto_pe; 3208 3191 extern const struct bpf_func_proto bpf_get_stack_proto_pe; 3209 3192 extern const struct bpf_func_proto bpf_sock_map_update_proto; ··· 3213 3192 extern const struct bpf_func_proto bpf_get_current_cgroup_id_proto; 3214 3193 extern const struct bpf_func_proto bpf_get_current_ancestor_cgroup_id_proto; 3215 3194 extern const struct bpf_func_proto bpf_get_cgroup_classid_curr_proto; 3195 + extern const struct bpf_func_proto bpf_current_task_under_cgroup_proto; 3216 3196 extern const struct bpf_func_proto bpf_msg_redirect_hash_proto; 3217 3197 extern const struct bpf_func_proto bpf_msg_redirect_map_proto; 3218 3198 extern const struct bpf_func_proto bpf_sk_redirect_hash_proto;

+8

include/linux/bpf_lsm.h

··· 9 9 10 10 #include <linux/sched.h> 11 11 #include <linux/bpf.h> 12 + #include <linux/bpf_verifier.h> 12 13 #include <linux/lsm_hooks.h> 13 14 14 15 #ifdef CONFIG_BPF_LSM ··· 46 45 47 46 void bpf_lsm_find_cgroup_shim(const struct bpf_prog *prog, bpf_func_t *bpf_func); 48 47 48 + int bpf_lsm_get_retval_range(const struct bpf_prog *prog, 49 + struct bpf_retval_range *range); 49 50 #else /* !CONFIG_BPF_LSM */ 50 51 51 52 static inline bool bpf_lsm_is_sleepable_hook(u32 btf_id) ··· 81 78 { 82 79 } 83 80 81 + static inline int bpf_lsm_get_retval_range(const struct bpf_prog *prog, 82 + struct bpf_retval_range *range) 83 + { 84 + return -EOPNOTSUPP; 85 + } 84 86 #endif /* CONFIG_BPF_LSM */ 85 87 86 88 #endif /* _LINUX_BPF_LSM_H */

+27

include/linux/bpf_verifier.h

··· 23 23 * (in the "-8,-16,...,-512" form) 24 24 */ 25 25 #define TMP_STR_BUF_LEN 320 26 + /* Patch buffer size */ 27 + #define INSN_BUF_SIZE 32 26 28 27 29 /* Liveness marks, used for registers and spilled-regs (in stack slots). 28 30 * Read marks propagate upwards until they find a write mark; they record that ··· 373 371 u32 prev_idx : 22; 374 372 /* special flags, e.g., whether insn is doing register stack spill/load */ 375 373 u32 flags : 10; 374 + /* additional registers that need precision tracking when this 375 + * jump is backtracked, vector of six 10-bit records 376 + */ 377 + u64 linked_regs; 376 378 }; 377 379 378 380 /* Maximum number of register states that can exist at once */ ··· 578 572 bool is_iter_next; /* bpf_iter_<type>_next() kfunc call */ 579 573 bool call_with_percpu_alloc_ptr; /* {this,per}_cpu_ptr() with prog percpu alloc */ 580 574 u8 alu_state; /* used in combination with alu_limit */ 575 + /* true if STX or LDX instruction is a part of a spill/fill 576 + * pattern for a bpf_fastcall call. 577 + */ 578 + u8 fastcall_pattern:1; 579 + /* for CALL instructions, a number of spill/fill pairs in the 580 + * bpf_fastcall pattern. 581 + */ 582 + u8 fastcall_spills_num:3; 581 583 582 584 /* below fields are initialized once */ 583 585 unsigned int orig_idx; /* original instruction index */ ··· 655 641 u32 linfo_idx; /* The idx to the main_prog->aux->linfo */ 656 642 u16 stack_depth; /* max. stack depth used by this function */ 657 643 u16 stack_extra; 644 + /* offsets in range [stack_depth .. fastcall_stack_off) 645 + * are used for bpf_fastcall spills and fills. 
646 + */ 647 + s16 fastcall_stack_off; 658 648 bool has_tail_call: 1; 659 649 bool tail_call_reachable: 1; 660 650 bool has_ld_abs: 1; ··· 666 648 bool is_async_cb: 1; 667 649 bool is_exception_cb: 1; 668 650 bool args_cached: 1; 651 + /* true if bpf_fastcall stack region is used by functions that can't be inlined */ 652 + bool keep_fastcall_stack: 1; 669 653 670 654 u8 arg_cnt; 671 655 struct bpf_subprog_arg_info args[MAX_BPF_FUNC_REG_ARGS]; ··· 782 762 * e.g., in reg_type_str() to generate reg_type string 783 763 */ 784 764 char tmp_str_buf[TMP_STR_BUF_LEN]; 765 + struct bpf_insn insn_buf[INSN_BUF_SIZE]; 766 + struct bpf_insn epilogue_buf[INSN_BUF_SIZE]; 785 767 }; 786 768 787 769 static inline struct bpf_func_info_aux *subprog_aux(struct bpf_verifier_env *env, int subprog) ··· 925 903 type == PTR_TO_SOCK_COMMON || 926 904 type == PTR_TO_TCP_SOCK || 927 905 type == PTR_TO_XDP_SOCK; 906 + } 907 + 908 + static inline bool type_may_be_null(u32 type) 909 + { 910 + return type & PTR_MAYBE_NULL; 928 911 } 929 912 930 913 static inline void mark_reg_scratched(struct bpf_verifier_env *env, u32 regno)

+5

include/linux/btf.h

··· 580 580 int get_kern_ctx_btf_id(struct bpf_verifier_log *log, enum bpf_prog_type prog_type); 581 581 bool btf_types_are_same(const struct btf *btf1, u32 id1, 582 582 const struct btf *btf2, u32 id2); 583 + int btf_check_iter_arg(struct btf *btf, const struct btf_type *func, int arg_idx); 583 584 #else 584 585 static inline const struct btf_type *btf_type_by_id(const struct btf *btf, 585 586 u32 type_id) ··· 654 653 const struct btf *btf2, u32 id2) 655 654 { 656 655 return false; 656 + } 657 + static inline int btf_check_iter_arg(struct btf *btf, const struct btf_type *func, int arg_idx) 658 + { 659 + return -EOPNOTSUPP; 657 660 } 658 661 #endif 659 662

+2 -2

include/linux/buildid.h

··· 7 7 #define BUILD_ID_SIZE_MAX 20 8 8 9 9 struct vm_area_struct; 10 - int build_id_parse(struct vm_area_struct *vma, unsigned char *build_id, 11 - __u32 *size); 10 + int build_id_parse(struct vm_area_struct *vma, unsigned char *build_id, __u32 *size); 11 + int build_id_parse_nofault(struct vm_area_struct *vma, unsigned char *build_id, __u32 *size); 12 12 int build_id_parse_buf(const void *buf, unsigned char *build_id, u32 buf_size); 13 13 14 14 #if IS_ENABLED(CONFIG_STACKTRACE_BUILD_ID) || IS_ENABLED(CONFIG_VMCORE_INFO)

+10

include/linux/filter.h

··· 437 437 .off = OFF, \ 438 438 .imm = 0 }) 439 439 440 + /* Unconditional jumps, gotol pc + imm32 */ 441 + 442 + #define BPF_JMP32_A(IMM) \ 443 + ((struct bpf_insn) { \ 444 + .code = BPF_JMP32 | BPF_JA, \ 445 + .dst_reg = 0, \ 446 + .src_reg = 0, \ 447 + .off = 0, \ 448 + .imm = IMM }) 449 + 440 450 /* Relative call */ 441 451 442 452 #define BPF_CALL_REL(TGT) \
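To show what the new `BPF_JMP32_A()` macro produces, the sketch below builds one instruction in userspace. The `struct bpf_insn` layout and the `BPF_JMP32`/`BPF_JA` values are local stand-ins that assume the UAPI definitions (0x06 and 0x00), and `make_gotol` is a hypothetical wrapper added only for illustration.

```c
#include <stdint.h>

/* Local stand-ins, assumed to match the UAPI opcode class and operation. */
#define BPF_JMP32	0x06
#define BPF_JA		0x00

/* Local copy of the instruction layout, for illustration only. */
struct bpf_insn {
	uint8_t code;		/* opcode */
	uint8_t dst_reg:4;	/* destination register */
	uint8_t src_reg:4;	/* source register */
	int16_t off;		/* signed 16-bit offset */
	int32_t imm;		/* signed 32-bit immediate */
};

/* Same shape as the macro added in the diff. */
#define BPF_JMP32_A(IMM)				\
	((struct bpf_insn) {				\
		.code = BPF_JMP32 | BPF_JA,		\
		.dst_reg = 0,				\
		.src_reg = 0,				\
		.off = 0,				\
		.imm = IMM })

/* Hypothetical wrapper: build a "gotol pc + rel" instruction. */
static struct bpf_insn make_gotol(int32_t rel)
{
	return BPF_JMP32_A(rel);
}
```

The jump distance is carried in `imm` and `off` stays zero, which is what lets this form reach targets outside the 16-bit `off` range.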

+14 -4

include/uapi/linux/bpf.h

··· 5519 5519 * **-EOPNOTSUPP** if the hash calculation failed or **-EINVAL** if 5520 5520 * invalid arguments are passed. 5521 5521 * 5522 - * void *bpf_kptr_xchg(void *map_value, void *ptr) 5522 + * void *bpf_kptr_xchg(void *dst, void *ptr) 5523 5523 * Description 5524 - * Exchange kptr at pointer *map_value* with *ptr*, and return the 5525 - * old value. *ptr* can be NULL, otherwise it must be a referenced 5526 - * pointer which will be released when this helper is called. 5524 + * Exchange kptr at pointer *dst* with *ptr*, and return the old value. 5525 + * *dst* can be map value or local kptr. *ptr* can be NULL, otherwise 5526 + * it must be a referenced pointer which will be released when this helper 5527 + * is called. 5527 5528 * Return 5528 5529 * The old value of kptr (which can be NULL). The returned pointer 5529 5530 * if not NULL, is a reference which must be released using its ··· 7513 7512 */ 7514 7513 __u64 __opaque[1]; 7515 7514 } __attribute__((aligned(8))); 7515 + 7516 + /* 7517 + * Flags to control BPF kfunc behaviour. 7518 + * - BPF_F_PAD_ZEROS: Pad destination buffer with zeros. (See the respective 7519 + * helper documentation for details.) 7520 + */ 7521 + enum bpf_kfunc_flags { 7522 + BPF_F_PAD_ZEROS = (1ULL << 0), 7523 + }; 7516 7524 7517 7525 #endif /* _UAPI__LINUX_BPF_H__ */

-6

kernel/bpf/Makefile

··· 52 52 obj-$(CONFIG_BPF_SYSCALL) += relo_core.o 53 53 obj-$(CONFIG_BPF_SYSCALL) += btf_iter.o 54 54 obj-$(CONFIG_BPF_SYSCALL) += btf_relocate.o 55 - 56 - # Some source files are common to libbpf. 57 - vpath %.c $(srctree)/kernel/bpf:$(srctree)/tools/lib/bpf 58 - 59 - $(obj)/%.o: %.c FORCE 60 - $(call if_changed_rule,cc_o_c)

+10 -7

kernel/bpf/arraymap.c

···
 	/* avoid overflow on round_up(map->value_size) */
 	if (attr->value_size > INT_MAX)
 		return -E2BIG;
+	/* percpu map value size is bound by PCPU_MIN_UNIT_SIZE */
+	if (percpu && round_up(attr->value_size, 8) > PCPU_MIN_UNIT_SIZE)
+		return -E2BIG;
 
 	return 0;
 }
···
 	if (map->btf_key_type_id)
 		seq_printf(m, "%u: ", *(u32 *)key);
 	btf_type_seq_show(map->btf, map->btf_value_type_id, value, m);
-	seq_puts(m, "\n");
+	seq_putc(m, '\n');
 
 	rcu_read_unlock();
 }
···
 		seq_printf(m, "\tcpu%d: ", cpu);
 		btf_type_seq_show(map->btf, map->btf_value_type_id,
 				  per_cpu_ptr(pptr, cpu), m);
-		seq_puts(m, "\n");
+		seq_putc(m, '\n');
 	}
 	seq_puts(m, "}\n");
 
···
 	array = container_of(map, struct bpf_array, map);
 	index = info->index & array->index_mask;
 	if (info->percpu_value_buf)
-		return array->pptrs[index];
+		return (void *)(uintptr_t)array->pptrs[index];
 	return array_map_elem_ptr(array, index);
 }
 
···
 	array = container_of(map, struct bpf_array, map);
 	index = info->index & array->index_mask;
 	if (info->percpu_value_buf)
-		return array->pptrs[index];
+		return (void *)(uintptr_t)array->pptrs[index];
 	return array_map_elem_ptr(array, index);
 }
 
···
 	struct bpf_iter_meta meta;
 	struct bpf_prog *prog;
 	int off = 0, cpu = 0;
-	void __percpu **pptr;
+	void __percpu *pptr;
 	u32 size;
 
 	meta.seq = seq;
···
 	if (!info->percpu_value_buf) {
 		ctx.value = v;
 	} else {
-		pptr = v;
+		pptr = (void __percpu *)(uintptr_t)v;
 		size = array->elem_size;
 		for_each_possible_cpu(cpu) {
 			copy_map_value_long(map, info->percpu_value_buf + off,
···
 			prog_id = prog_fd_array_sys_lookup_elem(ptr);
 			btf_type_seq_show(map->btf, map->btf_value_type_id,
 					  &prog_id, m);
-			seq_puts(m, "\n");
+			seq_putc(m, '\n');
 		}
 	}
 
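
Note on the new per-CPU size check: the hunk above rejects a per-CPU map whose value size, rounded up to 8 bytes, exceeds PCPU_MIN_UNIT_SIZE. The following is a minimal userspace sketch of that arithmetic, not kernel code. `MODEL_PCPU_MIN_UNIT_SIZE`, `round_up8` and `check_value_size` are hypothetical names, and the 32 KiB limit is an assumption chosen for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: stand-in for the kernel's PCPU_MIN_UNIT_SIZE (32 KiB here). */
#define MODEL_PCPU_MIN_UNIT_SIZE (32u << 10)

/* Round v up to the next multiple of 8, like round_up(v, 8). */
static uint64_t round_up8(uint64_t v)
{
	return (v + 7) & ~(uint64_t)7;
}

/* Model of the value-size checks: 0 on success, -E2BIG if too large. */
static int check_value_size(uint32_t value_size, bool percpu)
{
	/* avoid overflow when the size is later rounded up */
	if (value_size > INT32_MAX)
		return -E2BIG;
	/* per-CPU values must fit into one per-CPU allocation unit */
	if (percpu && round_up8(value_size) > MODEL_PCPU_MIN_UNIT_SIZE)
		return -E2BIG;
	return 0;
}
```

Under these assumptions a 32761-byte per-CPU value still passes (it rounds up to exactly 32768), while 32769 bytes is rejected for a per-CPU map but accepted for a regular one.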

+62 -3

kernel/bpf/bpf_lsm.c

···
 #include <linux/lsm_hooks.h>
 #include <linux/bpf_lsm.h>
 #include <linux/kallsyms.h>
-#include <linux/bpf_verifier.h>
 #include <net/bpf_sk_storage.h>
 #include <linux/bpf_local_storage.h>
 #include <linux/btf_ids.h>
···
 #include <linux/lsm_hook_defs.h>
 #undef LSM_HOOK
 BTF_SET_END(bpf_lsm_hooks)
+
+BTF_SET_START(bpf_lsm_disabled_hooks)
+BTF_ID(func, bpf_lsm_vm_enough_memory)
+BTF_ID(func, bpf_lsm_inode_need_killpriv)
+BTF_ID(func, bpf_lsm_inode_getsecurity)
+BTF_ID(func, bpf_lsm_inode_listsecurity)
+BTF_ID(func, bpf_lsm_inode_copy_up_xattr)
+BTF_ID(func, bpf_lsm_getselfattr)
+BTF_ID(func, bpf_lsm_getprocattr)
+BTF_ID(func, bpf_lsm_setprocattr)
+#ifdef CONFIG_KEYS
+BTF_ID(func, bpf_lsm_key_getsecurity)
+#endif
+#ifdef CONFIG_AUDIT
+BTF_ID(func, bpf_lsm_audit_rule_match)
+#endif
+BTF_ID(func, bpf_lsm_ismaclabel)
+BTF_SET_END(bpf_lsm_disabled_hooks)
 
 /* List of LSM hooks that should operate on 'current' cgroup regardless
  * of function signature.
···
 int bpf_lsm_verify_prog(struct bpf_verifier_log *vlog,
 			const struct bpf_prog *prog)
 {
+	u32 btf_id = prog->aux->attach_btf_id;
+	const char *func_name = prog->aux->attach_func_name;
+
 	if (!prog->gpl_compatible) {
 		bpf_log(vlog,
 			"LSM programs must have a GPL compatible license\n");
 		return -EINVAL;
 	}
 
-	if (!btf_id_set_contains(&bpf_lsm_hooks, prog->aux->attach_btf_id)) {
+	if (btf_id_set_contains(&bpf_lsm_disabled_hooks, btf_id)) {
+		bpf_log(vlog, "attach_btf_id %u points to disabled hook %s\n",
+			btf_id, func_name);
+		return -EINVAL;
+	}
+
+	if (!btf_id_set_contains(&bpf_lsm_hooks, btf_id)) {
 		bpf_log(vlog, "attach_btf_id %u points to wrong type name %s\n",
-			prog->aux->attach_btf_id, prog->aux->attach_func_name);
+			btf_id, func_name);
 		return -EINVAL;
 	}
 
···
 	.get_func_proto = bpf_lsm_func_proto,
 	.is_valid_access = btf_ctx_access,
 };
+
+/* hooks return 0 or 1 */
+BTF_SET_START(bool_lsm_hooks)
+#ifdef CONFIG_SECURITY_NETWORK_XFRM
+BTF_ID(func, bpf_lsm_xfrm_state_pol_flow_match)
+#endif
+#ifdef CONFIG_AUDIT
+BTF_ID(func, bpf_lsm_audit_rule_known)
+#endif
+BTF_ID(func, bpf_lsm_inode_xattr_skipcap)
+BTF_SET_END(bool_lsm_hooks)
+
+int bpf_lsm_get_retval_range(const struct bpf_prog *prog,
+			     struct bpf_retval_range *retval_range)
+{
+	/* no return value range for void hooks */
+	if (!prog->aux->attach_func_proto->type)
+		return -EINVAL;
+
+	if (btf_id_set_contains(&bool_lsm_hooks, prog->aux->attach_btf_id)) {
+		retval_range->minval = 0;
+		retval_range->maxval = 1;
+	} else {
+		/* All other available LSM hooks, except task_prctl, return 0
+		 * on success and negative error code on failure.
+		 * To keep things simple, we only allow bpf progs to return 0
+		 * or negative errno for task_prctl too.
+		 */
+		retval_range->minval = -MAX_ERRNO;
+		retval_range->maxval = 0;
+	}
+	return 0;
+}
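
Note on the return-value ranges: bpf_lsm_get_retval_range() above yields [0, 1] for the boolean hooks, [-MAX_ERRNO, 0] for the remaining non-void hooks, and no range at all for void hooks. The sketch below models that decision in plain userspace C. `lsm_retval_range`, `retval_allowed`, `struct retval_range` and the two boolean parameters are hypothetical stand-ins for the BTF lookups the kernel performs.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_ERRNO 4095	/* same value as the kernel's MAX_ERRNO */

/* Hypothetical stand-in for struct bpf_retval_range. */
struct retval_range {
	int minval;
	int maxval;
};

/* Fill *range for a hook; -EINVAL means the hook returns void. */
static int lsm_retval_range(bool hook_is_void, bool hook_is_bool,
			    struct retval_range *range)
{
	assert(range != NULL);

	if (hook_is_void)
		return -EINVAL;

	if (hook_is_bool) {
		range->minval = 0;
		range->maxval = 1;
	} else {
		/* 0 on success, negative errno on failure */
		range->minval = -MODEL_MAX_ERRNO;
		range->maxval = 0;
	}
	return 0;
}

/* True if retval lies inside the range computed for the hook. */
static bool retval_allowed(const struct retval_range *range, int retval)
{
	return retval >= range->minval && retval <= range->maxval;
}
```

In this model a program attached to an ordinary hook may return -EACCES or 0 but not 1, and a program attached to a boolean hook may return 0 or 1 but not a negative errno.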

+8 -1

kernel/bpf/bpf_struct_ops.c

···
 		btf_type_seq_show(st_map->btf,
 				  map->btf_vmlinux_value_type_id,
 				  value, m);
-		seq_puts(m, "\n");
+		seq_putc(m, '\n');
 	}
 
 	kfree(value);
···
 	st_map = container_of(kvalue, struct bpf_struct_ops_map, kvalue);
 
 	bpf_map_put(&st_map->map);
+}
+
+int bpf_struct_ops_supported(const struct bpf_struct_ops *st_ops, u32 moff)
+{
+	void *func_ptr = *(void **)(st_ops->cfi_stubs + moff);
+
+	return func_ptr ? 0 : -ENOTSUPP;
 }
 
 static bool bpf_struct_ops_valid_to_reg(struct bpf_map *map)
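
Note on bpf_struct_ops_supported(): the new function reads the function pointer stored at byte offset `moff` inside the stub table and reports -ENOTSUPP when that slot is NULL. A minimal userspace sketch of the same lookup follows. `struct demo_ops`, `demo_open_stub` and `ops_supported` are hypothetical, and `MODEL_ENOTSUPP` mirrors the kernel-internal ENOTSUPP value (524), which userspace `<errno.h>` does not define.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_ENOTSUPP 524	/* kernel-internal errno, assumed here */

/* Hypothetical ops table: one function pointer per member. */
struct demo_ops {
	int (*open)(void);
	int (*close)(void);
};

/* Hypothetical stub for the one member this table implements. */
static int demo_open_stub(void)
{
	return 0;
}

/* Return 0 if the member at byte offset moff has a stub, else -MODEL_ENOTSUPP. */
static int ops_supported(const void *stubs, size_t moff)
{
	void (*fn)(void);

	assert(stubs != NULL);
	/* copy the pointer out of the table without aliasing casts */
	memcpy(&fn, (const char *)stubs + moff, sizeof(fn));
	return fn ? 0 : -MODEL_ENOTSUPP;
}
```

A caller would pass `offsetof(struct demo_ops, member)` as the offset, which is how a member offset like `moff` is naturally obtained in this model.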

+115 -46

kernel/bpf/btf.c

··· 212 212 BTF_KFUNC_HOOK_TRACING, 213 213 BTF_KFUNC_HOOK_SYSCALL, 214 214 BTF_KFUNC_HOOK_FMODRET, 215 - BTF_KFUNC_HOOK_CGROUP_SKB, 215 + BTF_KFUNC_HOOK_CGROUP, 216 216 BTF_KFUNC_HOOK_SCHED_ACT, 217 217 BTF_KFUNC_HOOK_SK_SKB, 218 218 BTF_KFUNC_HOOK_SOCKET_FILTER, ··· 790 790 return NULL; 791 791 } 792 792 793 - static bool __btf_name_valid(const struct btf *btf, u32 offset) 793 + static bool btf_name_valid_identifier(const struct btf *btf, u32 offset) 794 794 { 795 795 /* offset must be valid */ 796 796 const char *src = btf_str_by_offset(btf, offset); ··· 809 809 } 810 810 811 811 return !*src; 812 - } 813 - 814 - static bool btf_name_valid_identifier(const struct btf *btf, u32 offset) 815 - { 816 - return __btf_name_valid(btf, offset); 817 812 } 818 813 819 814 /* Allow any printable character in DATASEC names */ ··· 3756 3761 return -EINVAL; 3757 3762 } 3758 3763 3764 + /* Callers have to ensure the life cycle of btf if it is program BTF */ 3759 3765 static int btf_parse_kptr(const struct btf *btf, struct btf_field *field, 3760 3766 struct btf_field_info *info) 3761 3767 { ··· 3785 3789 field->kptr.dtor = NULL; 3786 3790 id = info->kptr.type_id; 3787 3791 kptr_btf = (struct btf *)btf; 3788 - btf_get(kptr_btf); 3789 3792 goto found_dtor; 3790 3793 } 3791 3794 if (id < 0) ··· 4626 4631 } 4627 4632 4628 4633 if (!t->name_off || 4629 - !__btf_name_valid(env->btf, t->name_off)) { 4634 + !btf_name_valid_identifier(env->btf, t->name_off)) { 4630 4635 btf_verifier_log_type(env, t, "Invalid name"); 4631 4636 return -EINVAL; 4632 4637 } ··· 5514 5519 static struct btf_struct_metas * 5515 5520 btf_parse_struct_metas(struct bpf_verifier_log *log, struct btf *btf) 5516 5521 { 5517 - union { 5518 - struct btf_id_set set; 5519 - struct { 5520 - u32 _cnt; 5521 - u32 _ids[ARRAY_SIZE(alloc_obj_fields)]; 5522 - } _arr; 5523 - } aof; 5524 5522 struct btf_struct_metas *tab = NULL; 5523 + struct btf_id_set *aof; 5525 5524 int i, n, id, ret; 5526 5525 5527 5526 
BUILD_BUG_ON(offsetof(struct btf_id_set, cnt) != 0); 5528 5527 BUILD_BUG_ON(sizeof(struct btf_id_set) != sizeof(u32)); 5529 5528 5530 - memset(&aof, 0, sizeof(aof)); 5529 + aof = kmalloc(sizeof(*aof), GFP_KERNEL | __GFP_NOWARN); 5530 + if (!aof) 5531 + return ERR_PTR(-ENOMEM); 5532 + aof->cnt = 0; 5533 + 5531 5534 for (i = 0; i < ARRAY_SIZE(alloc_obj_fields); i++) { 5532 5535 /* Try to find whether this special type exists in user BTF, and 5533 5536 * if so remember its ID so we can easily find it among members 5534 5537 * of structs that we iterate in the next loop. 5535 5538 */ 5539 + struct btf_id_set *new_aof; 5540 + 5536 5541 id = btf_find_by_name_kind(btf, alloc_obj_fields[i], BTF_KIND_STRUCT); 5537 5542 if (id < 0) 5538 5543 continue; 5539 - aof.set.ids[aof.set.cnt++] = id; 5544 + 5545 + new_aof = krealloc(aof, offsetof(struct btf_id_set, ids[aof->cnt + 1]), 5546 + GFP_KERNEL | __GFP_NOWARN); 5547 + if (!new_aof) { 5548 + ret = -ENOMEM; 5549 + goto free_aof; 5550 + } 5551 + aof = new_aof; 5552 + aof->ids[aof->cnt++] = id; 5540 5553 } 5541 5554 5542 - if (!aof.set.cnt) 5543 - return NULL; 5544 - sort(&aof.set.ids, aof.set.cnt, sizeof(aof.set.ids[0]), btf_id_cmp_func, NULL); 5545 - 5546 5555 n = btf_nr_types(btf); 5556 + for (i = 1; i < n; i++) { 5557 + /* Try to find if there are kptrs in user BTF and remember their ID */ 5558 + struct btf_id_set *new_aof; 5559 + struct btf_field_info tmp; 5560 + const struct btf_type *t; 5561 + 5562 + t = btf_type_by_id(btf, i); 5563 + if (!t) { 5564 + ret = -EINVAL; 5565 + goto free_aof; 5566 + } 5567 + 5568 + ret = btf_find_kptr(btf, t, 0, 0, &tmp); 5569 + if (ret != BTF_FIELD_FOUND) 5570 + continue; 5571 + 5572 + new_aof = krealloc(aof, offsetof(struct btf_id_set, ids[aof->cnt + 1]), 5573 + GFP_KERNEL | __GFP_NOWARN); 5574 + if (!new_aof) { 5575 + ret = -ENOMEM; 5576 + goto free_aof; 5577 + } 5578 + aof = new_aof; 5579 + aof->ids[aof->cnt++] = i; 5580 + } 5581 + 5582 + if (!aof->cnt) { 5583 + kfree(aof); 5584 + return 
NULL; 5585 + } 5586 + sort(&aof->ids, aof->cnt, sizeof(aof->ids[0]), btf_id_cmp_func, NULL); 5587 + 5547 5588 for (i = 1; i < n; i++) { 5548 5589 struct btf_struct_metas *new_tab; 5549 5590 const struct btf_member *member; ··· 5589 5558 int j, tab_cnt; 5590 5559 5591 5560 t = btf_type_by_id(btf, i); 5592 - if (!t) { 5593 - ret = -EINVAL; 5594 - goto free; 5595 - } 5596 5561 if (!__btf_type_is_struct(t)) 5597 5562 continue; 5598 5563 5599 5564 cond_resched(); 5600 5565 5601 5566 for_each_member(j, t, member) { 5602 - if (btf_id_set_contains(&aof.set, member->type)) 5567 + if (btf_id_set_contains(aof, member->type)) 5603 5568 goto parse; 5604 5569 } 5605 5570 continue; ··· 5614 5587 type = &tab->types[tab->cnt]; 5615 5588 type->btf_id = i; 5616 5589 record = btf_parse_fields(btf, t, BPF_SPIN_LOCK | BPF_LIST_HEAD | BPF_LIST_NODE | 5617 - BPF_RB_ROOT | BPF_RB_NODE | BPF_REFCOUNT, t->size); 5590 + BPF_RB_ROOT | BPF_RB_NODE | BPF_REFCOUNT | 5591 + BPF_KPTR, t->size); 5618 5592 /* The record cannot be unset, treat it as an error if so */ 5619 5593 if (IS_ERR_OR_NULL(record)) { 5620 5594 ret = PTR_ERR_OR_ZERO(record) ?: -EFAULT; ··· 5624 5596 type->record = record; 5625 5597 tab->cnt++; 5626 5598 } 5599 + kfree(aof); 5627 5600 return tab; 5628 5601 free: 5629 5602 btf_struct_metas_free(tab); 5603 + free_aof: 5604 + kfree(aof); 5630 5605 return ERR_PTR(ret); 5631 5606 } 5632 5607 ··· 6276 6245 btf->kernel_btf = true; 6277 6246 snprintf(btf->name, sizeof(btf->name), "%s", module_name); 6278 6247 6279 - btf->data = kvmalloc(data_size, GFP_KERNEL | __GFP_NOWARN); 6248 + btf->data = kvmemdup(data, data_size, GFP_KERNEL | __GFP_NOWARN); 6280 6249 if (!btf->data) { 6281 6250 err = -ENOMEM; 6282 6251 goto errout; 6283 6252 } 6284 - memcpy(btf->data, data, data_size); 6285 6253 btf->data_size = data_size; 6286 6254 6287 6255 err = btf_parse_hdr(env); ··· 6448 6418 6449 6419 if (arg == nr_args) { 6450 6420 switch (prog->expected_attach_type) { 6451 - case BPF_LSM_CGROUP: 6452 6421 
case BPF_LSM_MAC: 6422 + /* mark we are accessing the return value */ 6423 + info->is_retval = true; 6424 + fallthrough; 6425 + case BPF_LSM_CGROUP: 6453 6426 case BPF_TRACE_FEXIT: 6454 6427 /* When LSM programs are attached to void LSM hooks 6455 6428 * they use FEXIT trampolines and when attached to ··· 8087 8054 BTF_TRACING_TYPE_xxx 8088 8055 #undef BTF_TRACING_TYPE 8089 8056 8057 + /* Validate well-formedness of iter argument type. 8058 + * On success, return positive BTF ID of iter state's STRUCT type. 8059 + * On error, negative error is returned. 8060 + */ 8061 + int btf_check_iter_arg(struct btf *btf, const struct btf_type *func, int arg_idx) 8062 + { 8063 + const struct btf_param *arg; 8064 + const struct btf_type *t; 8065 + const char *name; 8066 + int btf_id; 8067 + 8068 + if (btf_type_vlen(func) <= arg_idx) 8069 + return -EINVAL; 8070 + 8071 + arg = &btf_params(func)[arg_idx]; 8072 + t = btf_type_skip_modifiers(btf, arg->type, NULL); 8073 + if (!t || !btf_type_is_ptr(t)) 8074 + return -EINVAL; 8075 + t = btf_type_skip_modifiers(btf, t->type, &btf_id); 8076 + if (!t || !__btf_type_is_struct(t)) 8077 + return -EINVAL; 8078 + 8079 + name = btf_name_by_offset(btf, t->name_off); 8080 + if (!name || strncmp(name, ITER_PREFIX, sizeof(ITER_PREFIX) - 1)) 8081 + return -EINVAL; 8082 + 8083 + return btf_id; 8084 + } 8085 + 8090 8086 static int btf_check_iter_kfuncs(struct btf *btf, const char *func_name, 8091 8087 const struct btf_type *func, u32 func_flags) 8092 8088 { 8093 8089 u32 flags = func_flags & (KF_ITER_NEW | KF_ITER_NEXT | KF_ITER_DESTROY); 8094 - const char *name, *sfx, *iter_name; 8095 - const struct btf_param *arg; 8090 + const char *sfx, *iter_name; 8096 8091 const struct btf_type *t; 8097 8092 char exp_name[128]; 8098 8093 u32 nr_args; 8094 + int btf_id; 8099 8095 8100 8096 /* exactly one of KF_ITER_{NEW,NEXT,DESTROY} can be set */ 8101 8097 if (!flags || (flags & (flags - 1))) ··· 8135 8073 if (nr_args < 1) 8136 8074 return -EINVAL; 8137 8075 8138 
- arg = &btf_params(func)[0]; 8139 - t = btf_type_skip_modifiers(btf, arg->type, NULL); 8140 - if (!t || !btf_type_is_ptr(t)) 8141 - return -EINVAL; 8142 - t = btf_type_skip_modifiers(btf, t->type, NULL); 8143 - if (!t || !__btf_type_is_struct(t)) 8144 - return -EINVAL; 8145 - 8146 - name = btf_name_by_offset(btf, t->name_off); 8147 - if (!name || strncmp(name, ITER_PREFIX, sizeof(ITER_PREFIX) - 1)) 8148 - return -EINVAL; 8076 + btf_id = btf_check_iter_arg(btf, func, 0); 8077 + if (btf_id < 0) 8078 + return btf_id; 8149 8079 8150 8080 /* sizeof(struct bpf_iter_<type>) should be a multiple of 8 to 8151 8081 * fit nicely in stack slots 8152 8082 */ 8083 + t = btf_type_by_id(btf, btf_id); 8153 8084 if (t->size == 0 || (t->size % 8)) 8154 8085 return -EINVAL; 8155 8086 8156 8087 /* validate bpf_iter_<type>_{new,next,destroy}(struct bpf_iter_<type> *) 8157 8088 * naming pattern 8158 8089 */ 8159 - iter_name = name + sizeof(ITER_PREFIX) - 1; 8090 + iter_name = btf_name_by_offset(btf, t->name_off) + sizeof(ITER_PREFIX) - 1; 8160 8091 if (flags & KF_ITER_NEW) 8161 8092 sfx = "new"; 8162 8093 else if (flags & KF_ITER_NEXT) ··· 8364 8309 case BPF_PROG_TYPE_STRUCT_OPS: 8365 8310 return BTF_KFUNC_HOOK_STRUCT_OPS; 8366 8311 case BPF_PROG_TYPE_TRACING: 8312 + case BPF_PROG_TYPE_TRACEPOINT: 8313 + case BPF_PROG_TYPE_PERF_EVENT: 8367 8314 case BPF_PROG_TYPE_LSM: 8368 8315 return BTF_KFUNC_HOOK_TRACING; 8369 8316 case BPF_PROG_TYPE_SYSCALL: 8370 8317 return BTF_KFUNC_HOOK_SYSCALL; 8371 8318 case BPF_PROG_TYPE_CGROUP_SKB: 8319 + case BPF_PROG_TYPE_CGROUP_SOCK: 8320 + case BPF_PROG_TYPE_CGROUP_DEVICE: 8372 8321 case BPF_PROG_TYPE_CGROUP_SOCK_ADDR: 8373 - return BTF_KFUNC_HOOK_CGROUP_SKB; 8322 + case BPF_PROG_TYPE_CGROUP_SOCKOPT: 8323 + case BPF_PROG_TYPE_CGROUP_SYSCTL: 8324 + return BTF_KFUNC_HOOK_CGROUP; 8374 8325 case BPF_PROG_TYPE_SCHED_ACT: 8375 8326 return BTF_KFUNC_HOOK_SCHED_ACT; 8376 8327 case BPF_PROG_TYPE_SK_SKB: ··· 8952 8891 struct bpf_core_cand_list cands = {}; 8953 8892 
struct bpf_core_relo_res targ_res; 8954 8893 struct bpf_core_spec *specs; 8894 + const struct btf_type *type; 8955 8895 int err; 8956 8896 8957 8897 /* ~4k of temp memory necessary to convert LLVM spec like "0:1:0:5" ··· 8961 8899 specs = kcalloc(3, sizeof(*specs), GFP_KERNEL); 8962 8900 if (!specs) 8963 8901 return -ENOMEM; 8902 + 8903 + type = btf_type_by_id(ctx->btf, relo->type_id); 8904 + if (!type) { 8905 + bpf_log(ctx->log, "relo #%u: bad type id %u\n", 8906 + relo_idx, relo->type_id); 8907 + return -EINVAL; 8908 + } 8964 8909 8965 8910 if (need_cands) { 8966 8911 struct bpf_cand_cache *cc;
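
Note on the reworked ID set in btf_parse_struct_metas(): the fixed-size on-stack union is replaced by a heap-allocated set that grows by one slot per discovered ID, is sorted once, and is then queried with btf_id_set_contains(). The sketch below shows that grow, sort and search pattern in userspace C, using realloc, qsort and bsearch in place of the kernel helpers. `struct id_set` and the `id_set_*` functions are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct btf_id_set: a count plus flexible array. */
struct id_set {
	uint32_t cnt;
	uint32_t ids[];
};

static int id_cmp(const void *a, const void *b)
{
	uint32_t x = *(const uint32_t *)a, y = *(const uint32_t *)b;

	return (x > y) - (x < y);
}

/* Allocate an empty set; NULL on allocation failure. */
static struct id_set *id_set_new(void)
{
	struct id_set *set = malloc(sizeof(*set));

	if (set)
		set->cnt = 0;
	return set;
}

/* Grow by one slot and append id. On failure the old set stays valid. */
static struct id_set *id_set_add(struct id_set *set, uint32_t id)
{
	struct id_set *grown;

	assert(set != NULL);
	grown = realloc(set, sizeof(*set) + (set->cnt + 1) * sizeof(set->ids[0]));
	if (!grown)
		return NULL;
	grown->ids[grown->cnt++] = id;
	return grown;
}

/* Sort once after all IDs are collected so lookups can binary-search. */
static void id_set_sort(struct id_set *set)
{
	qsort(set->ids, set->cnt, sizeof(set->ids[0]), id_cmp);
}

static int id_set_contains(const struct id_set *set, uint32_t id)
{
	return bsearch(&id, set->ids, set->cnt, sizeof(set->ids[0]), id_cmp) != NULL;
}
```

As in the diff, a failed grow leaves the previous allocation intact, so the caller remains responsible for freeing it on the error path.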

+2

kernel/bpf/btf_iter.c

···
+// SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
+#include "../../tools/lib/bpf/btf_iter.c"

+2

kernel/bpf/btf_relocate.c

···
+// SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
+#include "../../tools/lib/bpf/btf_relocate.c"

+2

kernel/bpf/cgroup.c

···
 	case BPF_FUNC_get_cgroup_classid:
 		return &bpf_get_cgroup_classid_curr_proto;
 #endif
+	case BPF_FUNC_current_task_under_cgroup:
+		return &bpf_current_task_under_cgroup_proto;
 	default:
 		return NULL;
 	}

+18 -3

kernel/bpf/core.c

···
 {
 	enum bpf_prog_type prog_type = resolve_prog_type(fp);
 	bool ret;
+	struct bpf_prog_aux *aux = fp->aux;
 
 	if (fp->kprobe_override)
 		return false;
···
 	 * in the case of devmap and cpumap). Until device checks
 	 * are implemented, prohibit adding dev-bound programs to program maps.
 	 */
-	if (bpf_prog_is_dev_bound(fp->aux))
+	if (bpf_prog_is_dev_bound(aux))
 		return false;
 
 	spin_lock(&map->owner.lock);
···
 		 */
 		map->owner.type = prog_type;
 		map->owner.jited = fp->jited;
-		map->owner.xdp_has_frags = fp->aux->xdp_has_frags;
+		map->owner.xdp_has_frags = aux->xdp_has_frags;
+		map->owner.attach_func_proto = aux->attach_func_proto;
 		ret = true;
 	} else {
 		ret = map->owner.type == prog_type &&
 		      map->owner.jited == fp->jited &&
-		      map->owner.xdp_has_frags == fp->aux->xdp_has_frags;
+		      map->owner.xdp_has_frags == aux->xdp_has_frags;
+		if (ret &&
+		    map->owner.attach_func_proto != aux->attach_func_proto) {
+			switch (prog_type) {
+			case BPF_PROG_TYPE_TRACING:
+			case BPF_PROG_TYPE_LSM:
+			case BPF_PROG_TYPE_EXT:
+			case BPF_PROG_TYPE_STRUCT_OPS:
+				ret = false;
+				break;
+			default:
+				break;
+			}
+		}
 	}
 	spin_unlock(&map->owner.lock);
 
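
Note on the map owner check: the hunk above records the first program's attach_func_proto as part of the map owner and, for tracing, LSM, EXT and struct_ops programs, refuses later programs whose prototype pointer differs. The following userspace sketch models that rule without locking. All type and function names (`struct owner`, `struct prog`, `proto_matters`, `map_compatible`, the `PROG_*` enumerators) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical subset of program types. */
enum prog_type { PROG_TRACING, PROG_LSM, PROG_EXT, PROG_STRUCT_OPS, PROG_XDP };

struct prog {
	enum prog_type type;
	bool jited;
	bool xdp_has_frags;
	const void *attach_func_proto;	/* identity of the attach prototype */
};

struct owner {
	bool set;			/* false until the first program is added */
	struct prog first;
};

/* Program types for which the attach prototype must match the owner's. */
static bool proto_matters(enum prog_type type)
{
	switch (type) {
	case PROG_TRACING:
	case PROG_LSM:
	case PROG_EXT:
	case PROG_STRUCT_OPS:
		return true;
	default:
		return false;
	}
}

/* The first program becomes the owner; later ones must be compatible. */
static bool map_compatible(struct owner *o, const struct prog *p)
{
	assert(o != NULL && p != NULL);

	if (!o->set) {
		o->first = *p;
		o->set = true;
		return true;
	}
	if (o->first.type != p->type || o->first.jited != p->jited ||
	    o->first.xdp_has_frags != p->xdp_has_frags)
		return false;
	if (o->first.attach_func_proto != p->attach_func_proto &&
	    proto_matters(p->type))
		return false;
	return true;
}
```

In this model two XDP programs with different prototype pointers still share a map, while two tracing programs attached to different functions do not.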

+10 -6

kernel/bpf/hashtab.c

···
 		 * kmalloc-able later in htab_map_update_elem()
 		 */
 		return -E2BIG;
+	/* percpu map value size is bound by PCPU_MIN_UNIT_SIZE */
+	if (percpu && round_up(attr->value_size, 8) > PCPU_MIN_UNIT_SIZE)
+		return -E2BIG;
 
 	return 0;
 }
···
 			pptr = htab_elem_get_ptr(l_new, key_size);
 		} else {
 			/* alloc_percpu zero-fills */
-			pptr = bpf_mem_cache_alloc(&htab->pcpu_ma);
-			if (!pptr) {
+			void *ptr = bpf_mem_cache_alloc(&htab->pcpu_ma);
+
+			if (!ptr) {
 				bpf_mem_cache_free(&htab->ma, l_new);
 				l_new = ERR_PTR(-ENOMEM);
 				goto dec_count;
 			}
-			l_new->ptr_to_pptr = pptr;
-			pptr = *(void **)pptr;
+			l_new->ptr_to_pptr = ptr;
+			pptr = *(void __percpu **)ptr;
 		}
 
 		pcpu_init_value(htab, pptr, value, onallcpus);
···
 	btf_type_seq_show(map->btf, map->btf_key_type_id, key, m);
 	seq_puts(m, ": ");
 	btf_type_seq_show(map->btf, map->btf_value_type_id, value, m);
-	seq_puts(m, "\n");
+	seq_putc(m, '\n');
 
 	rcu_read_unlock();
 }
···
 		seq_printf(m, "\tcpu%d: ", cpu);
 		btf_type_seq_show(map->btf, map->btf_value_type_id,
 				  per_cpu_ptr(pptr, cpu), m);
-		seq_puts(m, "\n");
+		seq_putc(m, '\n');
 	}
 	seq_puts(m, "}\n");
 

+81 -13

kernel/bpf/helpers.c

··· 158 158 .func = bpf_get_smp_processor_id, 159 159 .gpl_only = false, 160 160 .ret_type = RET_INTEGER, 161 + .allow_fastcall = true, 161 162 }; 162 163 163 164 BPF_CALL_0(bpf_get_numa_node_id) ··· 518 517 } 519 518 520 519 BPF_CALL_4(bpf_strtol, const char *, buf, size_t, buf_len, u64, flags, 521 - long *, res) 520 + s64 *, res) 522 521 { 523 522 long long _res; 524 523 int err; 525 524 525 + *res = 0; 526 526 err = __bpf_strtoll(buf, buf_len, flags, &_res); 527 527 if (err < 0) 528 528 return err; 529 - if (_res != (long)_res) 530 - return -ERANGE; 531 529 *res = _res; 532 530 return err; 533 531 } ··· 538 538 .arg1_type = ARG_PTR_TO_MEM | MEM_RDONLY, 539 539 .arg2_type = ARG_CONST_SIZE, 540 540 .arg3_type = ARG_ANYTHING, 541 - .arg4_type = ARG_PTR_TO_LONG, 541 + .arg4_type = ARG_PTR_TO_FIXED_SIZE_MEM | MEM_UNINIT | MEM_ALIGNED, 542 + .arg4_size = sizeof(s64), 542 543 }; 543 544 544 545 BPF_CALL_4(bpf_strtoul, const char *, buf, size_t, buf_len, u64, flags, 545 - unsigned long *, res) 546 + u64 *, res) 546 547 { 547 548 unsigned long long _res; 548 549 bool is_negative; 549 550 int err; 550 551 552 + *res = 0; 551 553 err = __bpf_strtoull(buf, buf_len, flags, &_res, &is_negative); 552 554 if (err < 0) 553 555 return err; 554 556 if (is_negative) 555 557 return -EINVAL; 556 - if (_res != (unsigned long)_res) 557 - return -ERANGE; 558 558 *res = _res; 559 559 return err; 560 560 } ··· 566 566 .arg1_type = ARG_PTR_TO_MEM | MEM_RDONLY, 567 567 .arg2_type = ARG_CONST_SIZE, 568 568 .arg3_type = ARG_ANYTHING, 569 - .arg4_type = ARG_PTR_TO_LONG, 569 + .arg4_type = ARG_PTR_TO_FIXED_SIZE_MEM | MEM_UNINIT | MEM_ALIGNED, 570 + .arg4_size = sizeof(u64), 570 571 }; 571 572 572 573 BPF_CALL_3(bpf_strncmp, const char *, s1, u32, s1_sz, const char *, s2) ··· 715 714 if (cpu >= nr_cpu_ids) 716 715 return (unsigned long)NULL; 717 716 718 - return (unsigned long)per_cpu_ptr((const void __percpu *)ptr, cpu); 717 + return (unsigned long)per_cpu_ptr((const void __percpu *)(const 
uintptr_t)ptr, cpu); 719 718 } 720 719 721 720 const struct bpf_func_proto bpf_per_cpu_ptr_proto = { ··· 728 727 729 728 BPF_CALL_1(bpf_this_cpu_ptr, const void *, percpu_ptr) 730 729 { 731 - return (unsigned long)this_cpu_ptr((const void __percpu *)percpu_ptr); 730 + return (unsigned long)this_cpu_ptr((const void __percpu *)(const uintptr_t)percpu_ptr); 732 731 } 733 732 734 733 const struct bpf_func_proto bpf_this_cpu_ptr_proto = { ··· 1619 1618 schedule_work(&work->delete_work); 1620 1619 } 1621 1620 1622 - BPF_CALL_2(bpf_kptr_xchg, void *, map_value, void *, ptr) 1621 + BPF_CALL_2(bpf_kptr_xchg, void *, dst, void *, ptr) 1623 1622 { 1624 - unsigned long *kptr = map_value; 1623 + unsigned long *kptr = dst; 1625 1624 1626 1625 /* This helper may be inlined by verifier. */ 1627 1626 return xchg(kptr, (unsigned long)ptr); ··· 1636 1635 .gpl_only = false, 1637 1636 .ret_type = RET_PTR_TO_BTF_ID_OR_NULL, 1638 1637 .ret_btf_id = BPF_PTR_POISON, 1639 - .arg1_type = ARG_PTR_TO_KPTR, 1638 + .arg1_type = ARG_KPTR_XCHG_DEST, 1640 1639 .arg2_type = ARG_PTR_TO_BTF_ID_OR_NULL | OBJ_RELEASE, 1641 1640 .arg2_btf_id = BPF_PTR_POISON, 1642 1641 }; ··· 2034 2033 return NULL; 2035 2034 } 2036 2035 } 2036 + EXPORT_SYMBOL_GPL(bpf_base_func_proto); 2037 2037 2038 2038 void bpf_list_head_free(const struct btf_field *field, void *list_head, 2039 2039 struct bpf_spin_lock *spin_lock) ··· 2458 2456 rcu_read_unlock(); 2459 2457 return ret; 2460 2458 } 2459 + 2460 + BPF_CALL_2(bpf_current_task_under_cgroup, struct bpf_map *, map, u32, idx) 2461 + { 2462 + struct bpf_array *array = container_of(map, struct bpf_array, map); 2463 + struct cgroup *cgrp; 2464 + 2465 + if (unlikely(idx >= array->map.max_entries)) 2466 + return -E2BIG; 2467 + 2468 + cgrp = READ_ONCE(array->ptrs[idx]); 2469 + if (unlikely(!cgrp)) 2470 + return -EAGAIN; 2471 + 2472 + return task_under_cgroup_hierarchy(current, cgrp); 2473 + } 2474 + 2475 + const struct bpf_func_proto bpf_current_task_under_cgroup_proto = { 2476 + 
.func = bpf_current_task_under_cgroup, 2477 + .gpl_only = false, 2478 + .ret_type = RET_INTEGER, 2479 + .arg1_type = ARG_CONST_MAP_PTR, 2480 + .arg2_type = ARG_ANYTHING, 2481 + }; 2461 2482 2462 2483 /** 2463 2484 * bpf_task_get_cgroup1 - Acquires the associated cgroup of a task within a ··· 2963 2938 bpf_mem_free(&bpf_global_ma, kit->bits); 2964 2939 } 2965 2940 2941 + /** 2942 + * bpf_copy_from_user_str() - Copy a string from an unsafe user address 2943 + * @dst: Destination address, in kernel space. This buffer must be 2944 + * at least @dst__sz bytes long. 2945 + * @dst__sz: Maximum number of bytes to copy, includes the trailing NUL. 2946 + * @unsafe_ptr__ign: Source address, in user space. 2947 + * @flags: The only supported flag is BPF_F_PAD_ZEROS 2948 + * 2949 + * Copies a NUL-terminated string from userspace to BPF space. If user string is 2950 + * too long this will still ensure zero termination in the dst buffer unless 2951 + * buffer size is 0. 2952 + * 2953 + * If BPF_F_PAD_ZEROS flag is set, memset the tail of @dst to 0 on success and 2954 + * memset all of @dst on failure. 
2955 + */ 2956 + __bpf_kfunc int bpf_copy_from_user_str(void *dst, u32 dst__sz, const void __user *unsafe_ptr__ign, u64 flags) 2957 + { 2958 + int ret; 2959 + 2960 + if (unlikely(flags & ~BPF_F_PAD_ZEROS)) 2961 + return -EINVAL; 2962 + 2963 + if (unlikely(!dst__sz)) 2964 + return 0; 2965 + 2966 + ret = strncpy_from_user(dst, unsafe_ptr__ign, dst__sz - 1); 2967 + if (ret < 0) { 2968 + if (flags & BPF_F_PAD_ZEROS) 2969 + memset((char *)dst, 0, dst__sz); 2970 + 2971 + return ret; 2972 + } 2973 + 2974 + if (flags & BPF_F_PAD_ZEROS) 2975 + memset((char *)dst + ret, 0, dst__sz - ret); 2976 + else 2977 + ((char *)dst)[ret] = '\0'; 2978 + 2979 + return ret + 1; 2980 + } 2981 + 2966 2982 __bpf_kfunc_end_defs(); 2967 2983 2968 2984 BTF_KFUNCS_START(generic_btf_ids) ··· 3089 3023 BTF_ID_FLAGS(func, bpf_iter_bits_new, KF_ITER_NEW) 3090 3024 BTF_ID_FLAGS(func, bpf_iter_bits_next, KF_ITER_NEXT | KF_RET_NULL) 3091 3025 BTF_ID_FLAGS(func, bpf_iter_bits_destroy, KF_ITER_DESTROY) 3026 + BTF_ID_FLAGS(func, bpf_copy_from_user_str, KF_SLEEPABLE) 3092 3027 BTF_KFUNCS_END(common_btf_ids) 3093 3028 3094 3029 static const struct btf_kfunc_id_set common_kfunc_set = { ··· 3118 3051 ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_XDP, &generic_kfunc_set); 3119 3052 ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_STRUCT_OPS, &generic_kfunc_set); 3120 3053 ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_SYSCALL, &generic_kfunc_set); 3054 + ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_CGROUP_SKB, &generic_kfunc_set); 3121 3055 ret = ret ?: register_btf_id_dtor_kfuncs(generic_dtors, 3122 3056 ARRAY_SIZE(generic_dtors), 3123 3057 THIS_MODULE);
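
Note on bpf_copy_from_user_str(): according to the kernel-doc and body above, the kfunc copies at most dst__sz - 1 bytes, always NUL-terminates a non-empty buffer, returns the number of bytes written including the NUL, and zero-fills the tail when BPF_F_PAD_ZEROS is set. The sketch below models only the success path in userspace C, with a plain pointer as the source, so the fault handling of strncpy_from_user() is not represented. `copy_str_model` and `MODEL_F_PAD_ZEROS` are hypothetical names, and the flag value is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MODEL_F_PAD_ZEROS (1ULL << 0)	/* assumed stand-in for BPF_F_PAD_ZEROS */

/*
 * Userspace model of the copy semantics: returns bytes written including
 * the trailing NUL, 0 for an empty buffer, -EINVAL for unknown flags.
 */
static long copy_str_model(char *dst, uint32_t dst_sz, const char *src,
			   uint64_t flags)
{
	uint32_t n = 0;

	if (flags & ~MODEL_F_PAD_ZEROS)
		return -EINVAL;
	if (!dst_sz)
		return 0;
	assert(dst != NULL && src != NULL);

	/* copy at most dst_sz - 1 bytes, stopping at the source NUL */
	while (n < dst_sz - 1 && src[n] != '\0') {
		dst[n] = src[n];
		n++;
	}

	if (flags & MODEL_F_PAD_ZEROS)
		memset(dst + n, 0, dst_sz - n);	/* terminate and pad the tail */
	else
		dst[n] = '\0';			/* terminate only */

	return (long)n + 1;
}
```

With an 8-byte buffer, copying "hello" yields 6 in this model, and copying a 10-character string into 4 bytes yields 4 with the buffer holding the first three characters plus the NUL.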

+2 -2

kernel/bpf/inode.c

··· 709 709 msk = 1ULL << e->val; 710 710 if (delegate_msk & msk) { 711 711 /* emit lower-case name without prefix */ 712 - seq_printf(m, "%c", first ? '=' : ':'); 712 + seq_putc(m, first ? '=' : ':'); 713 713 name += pfx_len; 714 714 while (*name) { 715 - seq_printf(m, "%c", tolower(*name)); 715 + seq_putc(m, tolower(*name)); 716 716 name++; 717 717 } 718 718

+2 -2

kernel/bpf/local_storage.c

···
 		seq_puts(m, ": ");
 		btf_type_seq_show(map->btf, map->btf_value_type_id,
 				  &READ_ONCE(storage->buf)->data[0], m);
-		seq_puts(m, "\n");
+		seq_putc(m, '\n');
 	} else {
 		seq_puts(m, ": {\n");
 		for_each_possible_cpu(cpu) {
···
 			btf_type_seq_show(map->btf, map->btf_value_type_id,
 					  per_cpu_ptr(storage->percpu_buf, cpu),
 					  m);
-			seq_puts(m, "\n");
+			seq_putc(m, '\n');
 		}
 		seq_puts(m, "}\n");
 	}

+6 -6

kernel/bpf/memalloc.c

···
 static void *__alloc(struct bpf_mem_cache *c, int node, gfp_t flags)
 {
 	if (c->percpu_size) {
-		void **obj = kmalloc_node(c->percpu_size, flags, node);
-		void *pptr = __alloc_percpu_gfp(c->unit_size, 8, flags);
+		void __percpu **obj = kmalloc_node(c->percpu_size, flags, node);
+		void __percpu *pptr = __alloc_percpu_gfp(c->unit_size, 8, flags);
 
 		if (!obj || !pptr) {
 			free_percpu(pptr);
···
 static void free_one(void *obj, bool percpu)
 {
 	if (percpu) {
-		free_percpu(((void **)obj)[1]);
+		free_percpu(((void __percpu **)obj)[1]);
 		kfree(obj);
 		return;
 	}
···
  */
 int bpf_mem_alloc_init(struct bpf_mem_alloc *ma, int size, bool percpu)
 {
-	struct bpf_mem_caches *cc, __percpu *pcc;
-	struct bpf_mem_cache *c, __percpu *pc;
+	struct bpf_mem_caches *cc; struct bpf_mem_caches __percpu *pcc;
+	struct bpf_mem_cache *c; struct bpf_mem_cache __percpu *pc;
 	struct obj_cgroup *objcg = NULL;
 	int cpu, i, unit_size, percpu_size = 0;
 
···
 
 int bpf_mem_alloc_percpu_unit_init(struct bpf_mem_alloc *ma, int size)
 {
-	struct bpf_mem_caches *cc, __percpu *pcc;
+	struct bpf_mem_caches *cc; struct bpf_mem_caches __percpu *pcc;
 	int cpu, i, unit_size, percpu_size;
 	struct obj_cgroup *objcg;
 	struct bpf_mem_cache *c;

+2

kernel/bpf/relo_core.c

···
+// SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
+#include "../../tools/lib/bpf/relo_core.c"

+1 -1

kernel/bpf/reuseport_array.c

···
 
 	spin_unlock_bh(&reuseport_lock);
 put_file:
-	fput(socket->file);
+	sockfd_put(socket);
 	return err;
 }
 

+101 -30

kernel/bpf/stackmap.c

··· 124 124 return ERR_PTR(err); 125 125 } 126 126 127 + static int fetch_build_id(struct vm_area_struct *vma, unsigned char *build_id, bool may_fault) 128 + { 129 + return may_fault ? build_id_parse(vma, build_id, NULL) 130 + : build_id_parse_nofault(vma, build_id, NULL); 131 + } 132 + 133 + /* 134 + * Expects all id_offs[i].ip values to be set to correct initial IPs. 135 + * They will be subsequently: 136 + * - either adjusted in place to a file offset, if build ID fetching 137 + * succeeds; in this case id_offs[i].build_id is set to correct build ID, 138 + * and id_offs[i].status is set to BPF_STACK_BUILD_ID_VALID; 139 + * - or IP will be kept intact, if build ID fetching failed; in this case 140 + * id_offs[i].build_id is zeroed out and id_offs[i].status is set to 141 + * BPF_STACK_BUILD_ID_IP. 142 + */ 127 143 static void stack_map_get_build_id_offset(struct bpf_stack_build_id *id_offs, 128 - u64 *ips, u32 trace_nr, bool user) 144 + u32 trace_nr, bool user, bool may_fault) 129 145 { 130 146 int i; 131 147 struct mmap_unlock_irq_work *work = NULL; ··· 158 142 /* cannot access current->mm, fall back to ips */ 159 143 for (i = 0; i < trace_nr; i++) { 160 144 id_offs[i].status = BPF_STACK_BUILD_ID_IP; 161 - id_offs[i].ip = ips[i]; 162 145 memset(id_offs[i].build_id, 0, BUILD_ID_SIZE_MAX); 163 146 } 164 147 return; 165 148 } 166 149 167 150 for (i = 0; i < trace_nr; i++) { 168 - if (range_in_vma(prev_vma, ips[i], ips[i])) { 151 + u64 ip = READ_ONCE(id_offs[i].ip); 152 + 153 + if (range_in_vma(prev_vma, ip, ip)) { 169 154 vma = prev_vma; 170 - memcpy(id_offs[i].build_id, prev_build_id, 171 - BUILD_ID_SIZE_MAX); 155 + memcpy(id_offs[i].build_id, prev_build_id, BUILD_ID_SIZE_MAX); 172 156 goto build_id_valid; 173 157 } 174 - vma = find_vma(current->mm, ips[i]); 175 - if (!vma || build_id_parse(vma, id_offs[i].build_id, NULL)) { 158 + vma = find_vma(current->mm, ip); 159 + if (!vma || fetch_build_id(vma, id_offs[i].build_id, may_fault)) { 176 160 /* per entry fall back 
to ips */ 177 161 id_offs[i].status = BPF_STACK_BUILD_ID_IP; 178 - id_offs[i].ip = ips[i]; 179 162 memset(id_offs[i].build_id, 0, BUILD_ID_SIZE_MAX); 180 163 continue; 181 164 } 182 165 build_id_valid: 183 - id_offs[i].offset = (vma->vm_pgoff << PAGE_SHIFT) + ips[i] 184 - - vma->vm_start; 166 + id_offs[i].offset = (vma->vm_pgoff << PAGE_SHIFT) + ip - vma->vm_start; 185 167 id_offs[i].status = BPF_STACK_BUILD_ID_VALID; 186 168 prev_vma = vma; 187 169 prev_build_id = id_offs[i].build_id; ··· 230 216 struct bpf_stack_map *smap = container_of(map, struct bpf_stack_map, map); 231 217 struct stack_map_bucket *bucket, *new_bucket, *old_bucket; 232 218 u32 skip = flags & BPF_F_SKIP_FIELD_MASK; 233 - u32 hash, id, trace_nr, trace_len; 219 + u32 hash, id, trace_nr, trace_len, i; 234 220 bool user = flags & BPF_F_USER_STACK; 235 221 u64 *ips; 236 222 bool hash_matches; ··· 252 238 return id; 253 239 254 240 if (stack_map_use_build_id(map)) { 241 + struct bpf_stack_build_id *id_offs; 242 + 255 243 /* for build_id+offset, pop a bucket before slow cmp */ 256 244 new_bucket = (struct stack_map_bucket *) 257 245 pcpu_freelist_pop(&smap->freelist); 258 246 if (unlikely(!new_bucket)) 259 247 return -ENOMEM; 260 248 new_bucket->nr = trace_nr; 261 - stack_map_get_build_id_offset( 262 - (struct bpf_stack_build_id *)new_bucket->data, 263 - ips, trace_nr, user); 249 + id_offs = (struct bpf_stack_build_id *)new_bucket->data; 250 + for (i = 0; i < trace_nr; i++) 251 + id_offs[i].ip = ips[i]; 252 + stack_map_get_build_id_offset(id_offs, trace_nr, user, false /* !may_fault */); 264 253 trace_len = trace_nr * sizeof(struct bpf_stack_build_id); 265 254 if (hash_matches && bucket->nr == trace_nr && 266 255 memcmp(bucket->data, new_bucket->data, trace_len) == 0) { ··· 404 387 405 388 static long __bpf_get_stack(struct pt_regs *regs, struct task_struct *task, 406 389 struct perf_callchain_entry *trace_in, 407 - void *buf, u32 size, u64 flags) 390 + void *buf, u32 size, u64 flags, bool may_fault) 
408 391 { 409 392 u32 trace_nr, copy_len, elem_size, num_elem, max_depth; 410 393 bool user_build_id = flags & BPF_F_USER_BUILD_ID; ··· 422 405 if (kernel && user_build_id) 423 406 goto clear; 424 407 425 - elem_size = (user && user_build_id) ? sizeof(struct bpf_stack_build_id) 426 - : sizeof(u64); 408 + elem_size = user_build_id ? sizeof(struct bpf_stack_build_id) : sizeof(u64); 427 409 if (unlikely(size % elem_size)) 428 410 goto clear; 429 411 ··· 443 427 if (sysctl_perf_event_max_stack < max_depth) 444 428 max_depth = sysctl_perf_event_max_stack; 445 429 430 + if (may_fault) 431 + rcu_read_lock(); /* need RCU for perf's callchain below */ 432 + 446 433 if (trace_in) 447 434 trace = trace_in; 448 435 else if (kernel && task) ··· 453 434 else 454 435 trace = get_perf_callchain(regs, 0, kernel, user, max_depth, 455 436 crosstask, false); 456 - if (unlikely(!trace)) 457 - goto err_fault; 458 437 459 - if (trace->nr < skip) 438 + if (unlikely(!trace) || trace->nr < skip) { 439 + if (may_fault) 440 + rcu_read_unlock(); 460 441 goto err_fault; 442 + } 461 443 462 444 trace_nr = trace->nr - skip; 463 445 trace_nr = (trace_nr <= num_elem) ? 
trace_nr : num_elem; 464 446 copy_len = trace_nr * elem_size; 465 447 466 448 ips = trace->ip + skip; 467 - if (user && user_build_id) 468 - stack_map_get_build_id_offset(buf, ips, trace_nr, user); 469 - else 449 + if (user_build_id) { 450 + struct bpf_stack_build_id *id_offs = buf; 451 + u32 i; 452 + 453 + for (i = 0; i < trace_nr; i++) 454 + id_offs[i].ip = ips[i]; 455 + } else { 470 456 memcpy(buf, ips, copy_len); 457 + } 458 + 459 + /* trace/ips should not be dereferenced after this point */ 460 + if (may_fault) 461 + rcu_read_unlock(); 462 + 463 + if (user_build_id) 464 + stack_map_get_build_id_offset(buf, trace_nr, user, may_fault); 471 465 472 466 if (size > copy_len) 473 467 memset(buf + copy_len, 0, size - copy_len); ··· 496 464 BPF_CALL_4(bpf_get_stack, struct pt_regs *, regs, void *, buf, u32, size, 497 465 u64, flags) 498 466 { 499 - return __bpf_get_stack(regs, NULL, NULL, buf, size, flags); 467 + return __bpf_get_stack(regs, NULL, NULL, buf, size, flags, false /* !may_fault */); 500 468 } 501 469 502 470 const struct bpf_func_proto bpf_get_stack_proto = { ··· 509 477 .arg4_type = ARG_ANYTHING, 510 478 }; 511 479 512 - BPF_CALL_4(bpf_get_task_stack, struct task_struct *, task, void *, buf, 513 - u32, size, u64, flags) 480 + BPF_CALL_4(bpf_get_stack_sleepable, struct pt_regs *, regs, void *, buf, u32, size, 481 + u64, flags) 482 + { 483 + return __bpf_get_stack(regs, NULL, NULL, buf, size, flags, true /* may_fault */); 484 + } 485 + 486 + const struct bpf_func_proto bpf_get_stack_sleepable_proto = { 487 + .func = bpf_get_stack_sleepable, 488 + .gpl_only = true, 489 + .ret_type = RET_INTEGER, 490 + .arg1_type = ARG_PTR_TO_CTX, 491 + .arg2_type = ARG_PTR_TO_UNINIT_MEM, 492 + .arg3_type = ARG_CONST_SIZE_OR_ZERO, 493 + .arg4_type = ARG_ANYTHING, 494 + }; 495 + 496 + static long __bpf_get_task_stack(struct task_struct *task, void *buf, u32 size, 497 + u64 flags, bool may_fault) 514 498 { 515 499 struct pt_regs *regs; 516 500 long res = -EINVAL; ··· 536 488 
537 489 regs = task_pt_regs(task); 538 490 if (regs) 539 - res = __bpf_get_stack(regs, task, NULL, buf, size, flags); 491 + res = __bpf_get_stack(regs, task, NULL, buf, size, flags, may_fault); 540 492 put_task_stack(task); 541 493 542 494 return res; 543 495 } 544 496 497 + BPF_CALL_4(bpf_get_task_stack, struct task_struct *, task, void *, buf, 498 + u32, size, u64, flags) 499 + { 500 + return __bpf_get_task_stack(task, buf, size, flags, false /* !may_fault */); 501 + } 502 + 545 503 const struct bpf_func_proto bpf_get_task_stack_proto = { 546 504 .func = bpf_get_task_stack, 505 + .gpl_only = false, 506 + .ret_type = RET_INTEGER, 507 + .arg1_type = ARG_PTR_TO_BTF_ID, 508 + .arg1_btf_id = &btf_tracing_ids[BTF_TRACING_TYPE_TASK], 509 + .arg2_type = ARG_PTR_TO_UNINIT_MEM, 510 + .arg3_type = ARG_CONST_SIZE_OR_ZERO, 511 + .arg4_type = ARG_ANYTHING, 512 + }; 513 + 514 + BPF_CALL_4(bpf_get_task_stack_sleepable, struct task_struct *, task, void *, buf, 515 + u32, size, u64, flags) 516 + { 517 + return __bpf_get_task_stack(task, buf, size, flags, true /* !may_fault */); 518 + } 519 + 520 + const struct bpf_func_proto bpf_get_task_stack_sleepable_proto = { 521 + .func = bpf_get_task_stack_sleepable, 547 522 .gpl_only = false, 548 523 .ret_type = RET_INTEGER, 549 524 .arg1_type = ARG_PTR_TO_BTF_ID, ··· 587 516 __u64 nr_kernel; 588 517 589 518 if (!(event->attr.sample_type & PERF_SAMPLE_CALLCHAIN)) 590 - return __bpf_get_stack(regs, NULL, NULL, buf, size, flags); 519 + return __bpf_get_stack(regs, NULL, NULL, buf, size, flags, false /* !may_fault */); 591 520 592 521 if (unlikely(flags & ~(BPF_F_SKIP_FIELD_MASK | BPF_F_USER_STACK | 593 522 BPF_F_USER_BUILD_ID))) ··· 607 536 __u64 nr = trace->nr; 608 537 609 538 trace->nr = nr_kernel; 610 - err = __bpf_get_stack(regs, NULL, trace, buf, size, flags); 539 + err = __bpf_get_stack(regs, NULL, trace, buf, size, flags, false /* !may_fault */); 611 540 612 541 /* restore nr */ 613 542 trace->nr = nr; ··· 619 548 goto clear; 620 549 
621 550 flags = (flags & ~BPF_F_SKIP_FIELD_MASK) | skip; 622 - err = __bpf_get_stack(regs, NULL, trace, buf, size, flags); 551 + err = __bpf_get_stack(regs, NULL, trace, buf, size, flags, false /* !may_fault */); 623 552 } 624 553 return err; 625 554
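Note on the stackmap.c change above: the callers now copy each raw instruction pointer into id_offs[i].ip first, and stack_map_get_build_id_offset() then rewrites the entries in place. In __bpf_get_stack() this is what lets the RCU read section end before the possibly faulting build ID lookup. What follows is only a userspace sketch of that two-phase shape, not the kernel code. Every sketch_* name, the single hard-coded mapping, the named union u and the 4 KiB page size are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SHIFT    12	/* assumption: 4 KiB pages */
#define SKETCH_BUILD_ID_SIZE 20

enum { SKETCH_ID_EMPTY = 0, SKETCH_ID_VALID = 1, SKETCH_ID_IP = 2 };

/* Simplified stand-in for struct bpf_stack_build_id: ip and offset share storage. */
struct sketch_build_id {
	int32_t status;
	unsigned char build_id[SKETCH_BUILD_ID_SIZE];
	union {
		uint64_t offset;
		uint64_t ip;
	} u;
};

/* Hypothetical stand-in for a file-backed VMA with a known build ID. */
struct sketch_mapping {
	uint64_t start, end, pgoff;
	unsigned char build_id[SKETCH_BUILD_ID_SIZE];
};

/* Phase 1: copy raw IPs while the callchain buffer is still safe to read
 * (in the kernel sketch this would sit inside the RCU read section).
 */
static void sketch_prefill_ips(struct sketch_build_id *id_offs,
			       const uint64_t *ips, uint32_t n)
{
	for (uint32_t i = 0; i < n; i++)
		id_offs[i].u.ip = ips[i];
}

/* Phase 2: resolve each entry in place; the callchain is no longer needed. */
static void sketch_resolve(struct sketch_build_id *id_offs, uint32_t n,
			   const struct sketch_mapping *m)
{
	for (uint32_t i = 0; i < n; i++) {
		uint64_t ip = id_offs[i].u.ip;

		if (ip < m->start || ip >= m->end) {
			/* fall back to the raw IP, which is already stored */
			id_offs[i].status = SKETCH_ID_IP;
			memset(id_offs[i].build_id, 0, SKETCH_BUILD_ID_SIZE);
			continue;
		}
		id_offs[i].u.offset = (m->pgoff << SKETCH_PAGE_SHIFT) + ip - m->start;
		id_offs[i].status = SKETCH_ID_VALID;
		memcpy(id_offs[i].build_id, m->build_id, SKETCH_BUILD_ID_SIZE);
	}
}
```

Because the fallback path leaves the stored IP untouched, the resolver no longer needs the separate ips array, which matches the removal of the "id_offs[i].ip = ips[i]" line in the diff.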

+20 -11

kernel/bpf/syscall.c

··· 550 550 case BPF_KPTR_PERCPU: 551 551 if (rec->fields[i].kptr.module) 552 552 module_put(rec->fields[i].kptr.module); 553 - btf_put(rec->fields[i].kptr.btf); 553 + if (btf_is_kernel(rec->fields[i].kptr.btf)) 554 + btf_put(rec->fields[i].kptr.btf); 554 555 break; 555 556 case BPF_LIST_HEAD: 556 557 case BPF_LIST_NODE: ··· 597 596 case BPF_KPTR_UNREF: 598 597 case BPF_KPTR_REF: 599 598 case BPF_KPTR_PERCPU: 600 - btf_get(fields[i].kptr.btf); 599 + if (btf_is_kernel(fields[i].kptr.btf)) 600 + btf_get(fields[i].kptr.btf); 601 601 if (fields[i].kptr.module && !try_module_get(fields[i].kptr.module)) { 602 602 ret = -ENXIO; 603 603 goto free; ··· 735 733 } 736 734 } 737 735 738 - /* called from workqueue */ 739 - static void bpf_map_free_deferred(struct work_struct *work) 736 + static void bpf_map_free(struct bpf_map *map) 740 737 { 741 - struct bpf_map *map = container_of(work, struct bpf_map, work); 742 738 struct btf_record *rec = map->record; 743 739 struct btf *btf = map->btf; 744 740 745 - security_bpf_map_free(map); 746 - bpf_map_release_memcg(map); 747 741 /* implementation dependent freeing */ 748 742 map->ops->map_free(map); 749 743 /* Delay freeing of btf_record for maps, as map_free ··· 756 758 * struct_meta info which will be freed with btf_put(). 
757 759 */ 758 760 btf_put(btf); 761 + } 762 + 763 + /* called from workqueue */ 764 + static void bpf_map_free_deferred(struct work_struct *work) 765 + { 766 + struct bpf_map *map = container_of(work, struct bpf_map, work); 767 + 768 + security_bpf_map_free(map); 769 + bpf_map_release_memcg(map); 770 + bpf_map_free(map); 759 771 } 760 772 761 773 static void bpf_map_put_uref(struct bpf_map *map) ··· 1419 1411 free_map_sec: 1420 1412 security_bpf_map_free(map); 1421 1413 free_map: 1422 - btf_put(map->btf); 1423 - map->ops->map_free(map); 1414 + bpf_map_free(map); 1424 1415 put_token: 1425 1416 bpf_token_put(token); 1426 1417 return err; ··· 5675 5668 return bpf_token_create(attr); 5676 5669 } 5677 5670 5678 - static int __sys_bpf(int cmd, bpfptr_t uattr, unsigned int size) 5671 + static int __sys_bpf(enum bpf_cmd cmd, bpfptr_t uattr, unsigned int size) 5679 5672 { 5680 5673 union bpf_attr attr; 5681 5674 int err; ··· 5939 5932 5940 5933 BPF_CALL_4(bpf_kallsyms_lookup_name, const char *, name, int, name_sz, int, flags, u64 *, res) 5941 5934 { 5935 + *res = 0; 5942 5936 if (flags) 5943 5937 return -EINVAL; 5944 5938 ··· 5960 5952 .arg1_type = ARG_PTR_TO_MEM, 5961 5953 .arg2_type = ARG_CONST_SIZE_OR_ZERO, 5962 5954 .arg3_type = ARG_ANYTHING, 5963 - .arg4_type = ARG_PTR_TO_LONG, 5955 + .arg4_type = ARG_PTR_TO_FIXED_SIZE_MEM | MEM_UNINIT | MEM_ALIGNED, 5956 + .arg4_size = sizeof(u64), 5964 5957 }; 5965 5958 5966 5959 static const struct bpf_func_proto *
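Note on the bpf_kallsyms_lookup_name() hunk above: once arg4 is declared as fixed-size memory with MEM_UNINIT, the caller's buffer may hold garbage on entry, so the helper has to write it on every return path. The diff does this by storing zero before any validation. A minimal userspace sketch of that pattern follows. sketch_symbol_addr() and its single known symbol are hypothetical stand-ins for the kernel symbol table, and the permission check done by the real helper is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the kernel symbol table: one known symbol. */
static uint64_t sketch_symbol_addr(const char *name)
{
	return strcmp(name, "sketch_symbol") == 0 ? 0x1000 : 0;
}

/* Write the output first so that no early return leaves it uninitialized. */
static long sketch_lookup_name(const char *name, int name_sz, int flags,
			       uint64_t *res)
{
	assert(name != NULL && res != NULL);

	*res = 0;
	if (flags)
		return -EINVAL;
	/* name_sz counts the terminating NUL, which must be present */
	if (name_sz <= 1 || name[name_sz - 1] != '\0')
		return -EINVAL;

	*res = sketch_symbol_addr(name);
	return *res ? 0 : -ENOENT;
}
```

With the store hoisted to the top, a caller that ignores the return value still reads a defined zero instead of stale stack contents.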

+996 -295

kernel/bpf/verifier.c
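Note on the linked_regs encoding introduced in this file: each jump history entry carries up to six linked registers or stack slots packed into a single u64, ten bits per entry (3 bits frameno, 6 bits spi or regno, 1 bit is_reg) plus a 4-bit count in the low bits, which adds up to exactly 64 bits. The userspace sketch below mirrors the pack and unpack logic from the diff so the bit layout can be checked by hand. The sketch_* function names and the merged spi_or_regno field are assumptions made for the sketch; the kernel uses a union for that field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define LR_FRAMENO_BITS 3
#define LR_SPI_BITS     6
#define LR_ENTRY_BITS   (LR_SPI_BITS + LR_FRAMENO_BITS + 1)
#define LR_SIZE_BITS    4
#define LR_FRAMENO_MASK ((1ull << LR_FRAMENO_BITS) - 1)
#define LR_SPI_MASK     ((1ull << LR_SPI_BITS) - 1)
#define LR_SIZE_MASK    ((1ull << LR_SIZE_BITS) - 1)
#define LR_SPI_OFF      LR_FRAMENO_BITS
#define LR_IS_REG_OFF   (LR_SPI_BITS + LR_FRAMENO_BITS)
#define LINKED_REGS_MAX 6

struct sketch_linked_reg {
	uint8_t frameno;
	uint8_t spi_or_regno;	/* stack slot index or register number */
	bool is_reg;
};

struct sketch_linked_regs {
	int cnt;
	struct sketch_linked_reg entries[LINKED_REGS_MAX];
};

/* Pack cnt entries, 10 bits each, then the 4-bit count in the low bits. */
static uint64_t sketch_linked_regs_pack(const struct sketch_linked_regs *s)
{
	uint64_t val = 0;

	assert(s->cnt >= 0 && s->cnt <= LINKED_REGS_MAX);
	for (int i = 0; i < s->cnt; ++i) {
		const struct sketch_linked_reg *e = &s->entries[i];
		uint64_t tmp = 0;

		tmp |= e->frameno;
		tmp |= (uint64_t)e->spi_or_regno << LR_SPI_OFF;
		tmp |= (uint64_t)(e->is_reg ? 1 : 0) << LR_IS_REG_OFF;

		val <<= LR_ENTRY_BITS;
		val |= tmp;
	}
	val <<= LR_SIZE_BITS;
	val |= (uint64_t)s->cnt;
	return val;
}

/* Unpack reads the lowest entry first, so entries come back in reverse order. */
static void sketch_linked_regs_unpack(uint64_t val, struct sketch_linked_regs *s)
{
	s->cnt = (int)(val & LR_SIZE_MASK);
	val >>= LR_SIZE_BITS;

	for (int i = 0; i < s->cnt; ++i) {
		struct sketch_linked_reg *e = &s->entries[i];

		e->frameno = val & LR_FRAMENO_MASK;
		e->spi_or_regno = (val >> LR_SPI_OFF) & LR_SPI_MASK;
		e->is_reg = (val >> LR_IS_REG_OFF) & 0x1;
		val >>= LR_ENTRY_BITS;
	}
}
```

The reversed order after a round trip appears harmless for the use shown in bt_sync_linked_regs(), which treats the entries as an unordered set: if any member is marked precise, all members are marked.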

··· 385 385 verbose(env, " should have been in [%d, %d]\n", range.minval, range.maxval); 386 386 } 387 387 388 - static bool type_may_be_null(u32 type) 389 - { 390 - return type & PTR_MAYBE_NULL; 391 - } 392 - 393 388 static bool reg_not_null(const struct bpf_reg_state *reg) 394 389 { 395 390 enum bpf_reg_type type; ··· 2179 2184 reg->smin_value = max_t(s64, reg->smin_value, new_smin); 2180 2185 reg->smax_value = min_t(s64, reg->smax_value, new_smax); 2181 2186 } 2187 + 2188 + /* Here we would like to handle a special case after sign extending load, 2189 + * when upper bits for a 64-bit range are all 1s or all 0s. 2190 + * 2191 + * Upper bits are all 1s when register is in a range: 2192 + * [0xffff_ffff_0000_0000, 0xffff_ffff_ffff_ffff] 2193 + * Upper bits are all 0s when register is in a range: 2194 + * [0x0000_0000_0000_0000, 0x0000_0000_ffff_ffff] 2195 + * Together this forms are continuous range: 2196 + * [0xffff_ffff_0000_0000, 0x0000_0000_ffff_ffff] 2197 + * 2198 + * Now, suppose that register range is in fact tighter: 2199 + * [0xffff_ffff_8000_0000, 0x0000_0000_ffff_ffff] (R) 2200 + * Also suppose that it's 32-bit range is positive, 2201 + * meaning that lower 32-bits of the full 64-bit register 2202 + * are in the range: 2203 + * [0x0000_0000, 0x7fff_ffff] (W) 2204 + * 2205 + * If this happens, then any value in a range: 2206 + * [0xffff_ffff_0000_0000, 0xffff_ffff_7fff_ffff] 2207 + * is smaller than a lowest bound of the range (R): 2208 + * 0xffff_ffff_8000_0000 2209 + * which means that upper bits of the full 64-bit register 2210 + * can't be all 1s, when lower bits are in range (W). 2211 + * 2212 + * Note that: 2213 + * - 0xffff_ffff_8000_0000 == (s64)S32_MIN 2214 + * - 0x0000_0000_7fff_ffff == (s64)S32_MAX 2215 + * These relations are used in the conditions below. 
2216 + */ 2217 + if (reg->s32_min_value >= 0 && reg->smin_value >= S32_MIN && reg->smax_value <= S32_MAX) { 2218 + reg->smin_value = reg->s32_min_value; 2219 + reg->smax_value = reg->s32_max_value; 2220 + reg->umin_value = reg->s32_min_value; 2221 + reg->umax_value = reg->s32_max_value; 2222 + reg->var_off = tnum_intersect(reg->var_off, 2223 + tnum_range(reg->smin_value, reg->smax_value)); 2224 + } 2182 2225 } 2183 2226 2184 2227 static void __reg_deduce_bounds(struct bpf_reg_state *reg) ··· 2367 2334 return; 2368 2335 } 2369 2336 __mark_reg_unknown(env, regs + regno); 2337 + } 2338 + 2339 + static int __mark_reg_s32_range(struct bpf_verifier_env *env, 2340 + struct bpf_reg_state *regs, 2341 + u32 regno, 2342 + s32 s32_min, 2343 + s32 s32_max) 2344 + { 2345 + struct bpf_reg_state *reg = regs + regno; 2346 + 2347 + reg->s32_min_value = max_t(s32, reg->s32_min_value, s32_min); 2348 + reg->s32_max_value = min_t(s32, reg->s32_max_value, s32_max); 2349 + 2350 + reg->smin_value = max_t(s64, reg->smin_value, s32_min); 2351 + reg->smax_value = min_t(s64, reg->smax_value, s32_max); 2352 + 2353 + reg_bounds_sync(reg); 2354 + 2355 + return reg_bounds_sanity_check(env, reg, "s32_range"); 2370 2356 } 2371 2357 2372 2358 static void __mark_reg_not_init(const struct bpf_verifier_env *env, ··· 3389 3337 return env->insn_aux_data[insn_idx].jmp_point; 3390 3338 } 3391 3339 3340 + #define LR_FRAMENO_BITS 3 3341 + #define LR_SPI_BITS 6 3342 + #define LR_ENTRY_BITS (LR_SPI_BITS + LR_FRAMENO_BITS + 1) 3343 + #define LR_SIZE_BITS 4 3344 + #define LR_FRAMENO_MASK ((1ull << LR_FRAMENO_BITS) - 1) 3345 + #define LR_SPI_MASK ((1ull << LR_SPI_BITS) - 1) 3346 + #define LR_SIZE_MASK ((1ull << LR_SIZE_BITS) - 1) 3347 + #define LR_SPI_OFF LR_FRAMENO_BITS 3348 + #define LR_IS_REG_OFF (LR_SPI_BITS + LR_FRAMENO_BITS) 3349 + #define LINKED_REGS_MAX 6 3350 + 3351 + struct linked_reg { 3352 + u8 frameno; 3353 + union { 3354 + u8 spi; 3355 + u8 regno; 3356 + }; 3357 + bool is_reg; 3358 + }; 3359 + 3360 + 
struct linked_regs { 3361 + int cnt; 3362 + struct linked_reg entries[LINKED_REGS_MAX]; 3363 + }; 3364 + 3365 + static struct linked_reg *linked_regs_push(struct linked_regs *s) 3366 + { 3367 + if (s->cnt < LINKED_REGS_MAX) 3368 + return &s->entries[s->cnt++]; 3369 + 3370 + return NULL; 3371 + } 3372 + 3373 + /* Use u64 as a vector of 6 10-bit values, use first 4-bits to track 3374 + * number of elements currently in stack. 3375 + * Pack one history entry for linked registers as 10 bits in the following format: 3376 + * - 3-bits frameno 3377 + * - 6-bits spi_or_reg 3378 + * - 1-bit is_reg 3379 + */ 3380 + static u64 linked_regs_pack(struct linked_regs *s) 3381 + { 3382 + u64 val = 0; 3383 + int i; 3384 + 3385 + for (i = 0; i < s->cnt; ++i) { 3386 + struct linked_reg *e = &s->entries[i]; 3387 + u64 tmp = 0; 3388 + 3389 + tmp |= e->frameno; 3390 + tmp |= e->spi << LR_SPI_OFF; 3391 + tmp |= (e->is_reg ? 1 : 0) << LR_IS_REG_OFF; 3392 + 3393 + val <<= LR_ENTRY_BITS; 3394 + val |= tmp; 3395 + } 3396 + val <<= LR_SIZE_BITS; 3397 + val |= s->cnt; 3398 + return val; 3399 + } 3400 + 3401 + static void linked_regs_unpack(u64 val, struct linked_regs *s) 3402 + { 3403 + int i; 3404 + 3405 + s->cnt = val & LR_SIZE_MASK; 3406 + val >>= LR_SIZE_BITS; 3407 + 3408 + for (i = 0; i < s->cnt; ++i) { 3409 + struct linked_reg *e = &s->entries[i]; 3410 + 3411 + e->frameno = val & LR_FRAMENO_MASK; 3412 + e->spi = (val >> LR_SPI_OFF) & LR_SPI_MASK; 3413 + e->is_reg = (val >> LR_IS_REG_OFF) & 0x1; 3414 + val >>= LR_ENTRY_BITS; 3415 + } 3416 + } 3417 + 3392 3418 /* for any branch, call, exit record the history of jmps in the given state */ 3393 3419 static int push_jmp_history(struct bpf_verifier_env *env, struct bpf_verifier_state *cur, 3394 - int insn_flags) 3420 + int insn_flags, u64 linked_regs) 3395 3421 { 3396 3422 u32 cnt = cur->jmp_history_cnt; 3397 3423 struct bpf_jmp_history_entry *p; ··· 3485 3355 "verifier insn history bug: insn_idx %d cur flags %x new flags %x\n", 3486 3356 
env->insn_idx, env->cur_hist_ent->flags, insn_flags); 3487 3357 env->cur_hist_ent->flags |= insn_flags; 3358 + WARN_ONCE(env->cur_hist_ent->linked_regs != 0, 3359 + "verifier insn history bug: insn_idx %d linked_regs != 0: %#llx\n", 3360 + env->insn_idx, env->cur_hist_ent->linked_regs); 3361 + env->cur_hist_ent->linked_regs = linked_regs; 3488 3362 return 0; 3489 3363 } 3490 3364 ··· 3503 3369 p->idx = env->insn_idx; 3504 3370 p->prev_idx = env->prev_insn_idx; 3505 3371 p->flags = insn_flags; 3372 + p->linked_regs = linked_regs; 3506 3373 cur->jmp_history_cnt = cnt; 3507 3374 env->cur_hist_ent = p; 3508 3375 ··· 3669 3534 return bt->reg_masks[bt->frame] & (1 << reg); 3670 3535 } 3671 3536 3537 + static inline bool bt_is_frame_reg_set(struct backtrack_state *bt, u32 frame, u32 reg) 3538 + { 3539 + return bt->reg_masks[frame] & (1 << reg); 3540 + } 3541 + 3672 3542 static inline bool bt_is_frame_slot_set(struct backtrack_state *bt, u32 frame, u32 slot) 3673 3543 { 3674 3544 return bt->stack_masks[frame] & (1ull << slot); ··· 3718 3578 } 3719 3579 } 3720 3580 3581 + /* If any register R in hist->linked_regs is marked as precise in bt, 3582 + * do bt_set_frame_{reg,slot}(bt, R) for all registers in hist->linked_regs. 
3583 + */ 3584 + static void bt_sync_linked_regs(struct backtrack_state *bt, struct bpf_jmp_history_entry *hist) 3585 + { 3586 + struct linked_regs linked_regs; 3587 + bool some_precise = false; 3588 + int i; 3589 + 3590 + if (!hist || hist->linked_regs == 0) 3591 + return; 3592 + 3593 + linked_regs_unpack(hist->linked_regs, &linked_regs); 3594 + for (i = 0; i < linked_regs.cnt; ++i) { 3595 + struct linked_reg *e = &linked_regs.entries[i]; 3596 + 3597 + if ((e->is_reg && bt_is_frame_reg_set(bt, e->frameno, e->regno)) || 3598 + (!e->is_reg && bt_is_frame_slot_set(bt, e->frameno, e->spi))) { 3599 + some_precise = true; 3600 + break; 3601 + } 3602 + } 3603 + 3604 + if (!some_precise) 3605 + return; 3606 + 3607 + for (i = 0; i < linked_regs.cnt; ++i) { 3608 + struct linked_reg *e = &linked_regs.entries[i]; 3609 + 3610 + if (e->is_reg) 3611 + bt_set_frame_reg(bt, e->frameno, e->regno); 3612 + else 3613 + bt_set_frame_slot(bt, e->frameno, e->spi); 3614 + } 3615 + } 3616 + 3721 3617 static bool calls_callback(struct bpf_verifier_env *env, int insn_idx); 3722 3618 3723 3619 /* For given verifier state backtrack_insn() is called from the last insn to ··· 3792 3616 verbose(env, "%d: ", idx); 3793 3617 print_bpf_insn(&cbs, insn, env->allow_ptr_leaks); 3794 3618 } 3619 + 3620 + /* If there is a history record that some registers gained range at this insn, 3621 + * propagate precision marks to those registers, so that bt_is_reg_set() 3622 + * accounts for these registers. 3623 + */ 3624 + bt_sync_linked_regs(bt, hist); 3795 3625 3796 3626 if (class == BPF_ALU || class == BPF_ALU64) { 3797 3627 if (!bt_is_reg_set(bt, dreg)) ··· 4028 3846 */ 4029 3847 bt_set_reg(bt, dreg); 4030 3848 bt_set_reg(bt, sreg); 4031 - /* else dreg <cond> K 3849 + } else if (BPF_SRC(insn->code) == BPF_K) { 3850 + /* dreg <cond> K 4032 3851 * Only dreg still needs precision before 4033 3852 * this insn, so for the K-based conditional 4034 3853 * there is nothing new to be marked. 
··· 4047 3864 /* to be analyzed */ 4048 3865 return -ENOTSUPP; 4049 3866 } 3867 + /* Propagate precision marks to linked registers, to account for 3868 + * registers marked as precise in this function. 3869 + */ 3870 + bt_sync_linked_regs(bt, hist); 4050 3871 return 0; 4051 3872 } 4052 3873 ··· 4176 3989 reg->precise = false; 4177 3990 } 4178 3991 } 4179 - } 4180 - 4181 - static bool idset_contains(struct bpf_idset *s, u32 id) 4182 - { 4183 - u32 i; 4184 - 4185 - for (i = 0; i < s->count; ++i) 4186 - if (s->ids[i] == (id & ~BPF_ADD_CONST)) 4187 - return true; 4188 - 4189 - return false; 4190 - } 4191 - 4192 - static int idset_push(struct bpf_idset *s, u32 id) 4193 - { 4194 - if (WARN_ON_ONCE(s->count >= ARRAY_SIZE(s->ids))) 4195 - return -EFAULT; 4196 - s->ids[s->count++] = id & ~BPF_ADD_CONST; 4197 - return 0; 4198 - } 4199 - 4200 - static void idset_reset(struct bpf_idset *s) 4201 - { 4202 - s->count = 0; 4203 - } 4204 - 4205 - /* Collect a set of IDs for all registers currently marked as precise in env->bt. 4206 - * Mark all registers with these IDs as precise. 
4207 - */ 4208 - static int mark_precise_scalar_ids(struct bpf_verifier_env *env, struct bpf_verifier_state *st) 4209 - { 4210 - struct bpf_idset *precise_ids = &env->idset_scratch; 4211 - struct backtrack_state *bt = &env->bt; 4212 - struct bpf_func_state *func; 4213 - struct bpf_reg_state *reg; 4214 - DECLARE_BITMAP(mask, 64); 4215 - int i, fr; 4216 - 4217 - idset_reset(precise_ids); 4218 - 4219 - for (fr = bt->frame; fr >= 0; fr--) { 4220 - func = st->frame[fr]; 4221 - 4222 - bitmap_from_u64(mask, bt_frame_reg_mask(bt, fr)); 4223 - for_each_set_bit(i, mask, 32) { 4224 - reg = &func->regs[i]; 4225 - if (!reg->id || reg->type != SCALAR_VALUE) 4226 - continue; 4227 - if (idset_push(precise_ids, reg->id)) 4228 - return -EFAULT; 4229 - } 4230 - 4231 - bitmap_from_u64(mask, bt_frame_stack_mask(bt, fr)); 4232 - for_each_set_bit(i, mask, 64) { 4233 - if (i >= func->allocated_stack / BPF_REG_SIZE) 4234 - break; 4235 - if (!is_spilled_scalar_reg(&func->stack[i])) 4236 - continue; 4237 - reg = &func->stack[i].spilled_ptr; 4238 - if (!reg->id) 4239 - continue; 4240 - if (idset_push(precise_ids, reg->id)) 4241 - return -EFAULT; 4242 - } 4243 - } 4244 - 4245 - for (fr = 0; fr <= st->curframe; ++fr) { 4246 - func = st->frame[fr]; 4247 - 4248 - for (i = BPF_REG_0; i < BPF_REG_10; ++i) { 4249 - reg = &func->regs[i]; 4250 - if (!reg->id) 4251 - continue; 4252 - if (!idset_contains(precise_ids, reg->id)) 4253 - continue; 4254 - bt_set_frame_reg(bt, fr, i); 4255 - } 4256 - for (i = 0; i < func->allocated_stack / BPF_REG_SIZE; ++i) { 4257 - if (!is_spilled_scalar_reg(&func->stack[i])) 4258 - continue; 4259 - reg = &func->stack[i].spilled_ptr; 4260 - if (!reg->id) 4261 - continue; 4262 - if (!idset_contains(precise_ids, reg->id)) 4263 - continue; 4264 - bt_set_frame_slot(bt, fr, i); 4265 - } 4266 - } 4267 - 4268 - return 0; 4269 3992 } 4270 3993 4271 3994 /* ··· 4309 4212 verbose(env, "mark_precise: frame%d: last_idx %d first_idx %d subseq_idx %d \n", 4310 4213 bt->frame, last_idx, 
first_idx, subseq_idx); 4311 4214 } 4312 - 4313 - /* If some register with scalar ID is marked as precise, 4314 - * make sure that all registers sharing this ID are also precise. 4315 - * This is needed to estimate effect of find_equal_scalars(). 4316 - * Do this at the last instruction of each state, 4317 - * bpf_reg_state::id fields are valid for these instructions. 4318 - * 4319 - * Allows to track precision in situation like below: 4320 - * 4321 - * r2 = unknown value 4322 - * ... 4323 - * --- state #0 --- 4324 - * ... 4325 - * r1 = r2 // r1 and r2 now share the same ID 4326 - * ... 4327 - * --- state #1 {r1.id = A, r2.id = A} --- 4328 - * ... 4329 - * if (r2 > 10) goto exit; // find_equal_scalars() assigns range to r1 4330 - * ... 4331 - * --- state #2 {r1.id = A, r2.id = A} --- 4332 - * r3 = r10 4333 - * r3 += r1 // need to mark both r1 and r2 4334 - */ 4335 - if (mark_precise_scalar_ids(env, st)) 4336 - return -EFAULT; 4337 4215 4338 4216 if (last_idx < 0) { 4339 4217 /* we are at the entry into subprog, which ··· 4530 4458 4531 4459 if (!src_reg->id && !tnum_is_const(src_reg->var_off)) 4532 4460 /* Ensure that src_reg has a valid ID that will be copied to 4533 - * dst_reg and then will be used by find_equal_scalars() to 4461 + * dst_reg and then will be used by sync_linked_regs() to 4534 4462 * propagate min/max range. 
4535 4463 */ 4536 4464 src_reg->id = ++env->id_gen; ··· 4574 4502 static int get_reg_width(struct bpf_reg_state *reg) 4575 4503 { 4576 4504 return fls64(reg->umax_value); 4505 + } 4506 + 4507 + /* See comment for mark_fastcall_pattern_for_call() */ 4508 + static void check_fastcall_stack_contract(struct bpf_verifier_env *env, 4509 + struct bpf_func_state *state, int insn_idx, int off) 4510 + { 4511 + struct bpf_subprog_info *subprog = &env->subprog_info[state->subprogno]; 4512 + struct bpf_insn_aux_data *aux = env->insn_aux_data; 4513 + int i; 4514 + 4515 + if (subprog->fastcall_stack_off <= off || aux[insn_idx].fastcall_pattern) 4516 + return; 4517 + /* access to the region [max_stack_depth .. fastcall_stack_off) 4518 + * from something that is not a part of the fastcall pattern, 4519 + * disable fastcall rewrites for current subprogram by setting 4520 + * fastcall_stack_off to a value smaller than any possible offset. 4521 + */ 4522 + subprog->fastcall_stack_off = S16_MIN; 4523 + /* reset fastcall aux flags within subprogram, 4524 + * happens at most once per subprogram 4525 + */ 4526 + for (i = subprog->start; i < (subprog + 1)->start; ++i) { 4527 + aux[i].fastcall_spills_num = 0; 4528 + aux[i].fastcall_pattern = 0; 4529 + } 4577 4530 } 4578 4531 4579 4532 /* check_stack_{read,write}_fixed_off functions track spill/fill of registers, ··· 4649 4552 if (err) 4650 4553 return err; 4651 4554 4555 + check_fastcall_stack_contract(env, state, insn_idx, off); 4652 4556 mark_stack_slot_scratched(env, spi); 4653 4557 if (reg && !(off % BPF_REG_SIZE) && reg->type == SCALAR_VALUE && env->bpf_capable) { 4654 4558 bool reg_value_fits; ··· 4725 4627 } 4726 4628 4727 4629 if (insn_flags) 4728 - return push_jmp_history(env, env->cur_state, insn_flags); 4630 + return push_jmp_history(env, env->cur_state, insn_flags, 0); 4729 4631 return 0; 4730 4632 } 4731 4633 ··· 4784 4686 return err; 4785 4687 } 4786 4688 4689 + check_fastcall_stack_contract(env, state, insn_idx, min_off); 
4787 4690 /* Variable offset writes destroy any spilled pointers in range. */ 4788 4691 for (i = min_off; i < max_off; i++) { 4789 4692 u8 new_type, *stype; ··· 4923 4824 reg = &reg_state->stack[spi].spilled_ptr; 4924 4825 4925 4826 mark_stack_slot_scratched(env, spi); 4827 + check_fastcall_stack_contract(env, state, env->insn_idx, off); 4926 4828 4927 4829 if (is_spilled_reg(&reg_state->stack[spi])) { 4928 4830 u8 spill_size = 1; ··· 5032 4932 insn_flags = 0; /* we are not restoring spilled register */ 5033 4933 } 5034 4934 if (insn_flags) 5035 - return push_jmp_history(env, env->cur_state, insn_flags); 4935 + return push_jmp_history(env, env->cur_state, insn_flags, 0); 5036 4936 return 0; 5037 4937 } 5038 4938 ··· 5084 4984 min_off = reg->smin_value + off; 5085 4985 max_off = reg->smax_value + off; 5086 4986 mark_reg_stack_read(env, ptr_state, min_off, max_off + size, dst_regno); 4987 + check_fastcall_stack_contract(env, ptr_state, env->insn_idx, min_off); 5087 4988 return 0; 5088 4989 } 5089 4990 ··· 5690 5589 /* check access to 'struct bpf_context' fields. Supports fixed offsets only */ 5691 5590 static int check_ctx_access(struct bpf_verifier_env *env, int insn_idx, int off, int size, 5692 5591 enum bpf_access_type t, enum bpf_reg_type *reg_type, 5693 - struct btf **btf, u32 *btf_id) 5592 + struct btf **btf, u32 *btf_id, bool *is_retval, bool is_ldsx) 5694 5593 { 5695 5594 struct bpf_insn_access_aux info = { 5696 5595 .reg_type = *reg_type, 5697 5596 .log = &env->log, 5597 + .is_retval = false, 5598 + .is_ldsx = is_ldsx, 5698 5599 }; 5699 5600 5700 5601 if (env->ops->is_valid_access && ··· 5709 5606 * type of narrower access. 
5710 5607 */ 5711 5608 *reg_type = info.reg_type; 5609 + *is_retval = info.is_retval; 5712 5610 5713 5611 if (base_type(*reg_type) == PTR_TO_BTF_ID) { 5714 5612 *btf = info.btf; ··· 6798 6694 struct bpf_func_state *state, 6799 6695 enum bpf_access_type t) 6800 6696 { 6801 - int min_valid_off; 6697 + struct bpf_insn_aux_data *aux = &env->insn_aux_data[env->insn_idx]; 6698 + int min_valid_off, max_bpf_stack; 6699 + 6700 + /* If accessing instruction is a spill/fill from bpf_fastcall pattern, 6701 + * add room for all caller saved registers below MAX_BPF_STACK. 6702 + * In case if bpf_fastcall rewrite won't happen maximal stack depth 6703 + * would be checked by check_max_stack_depth_subprog(). 6704 + */ 6705 + max_bpf_stack = MAX_BPF_STACK; 6706 + if (aux->fastcall_pattern) 6707 + max_bpf_stack += CALLER_SAVED_REGS * BPF_REG_SIZE; 6802 6708 6803 6709 if (t == BPF_WRITE || env->allow_uninit_stack) 6804 - min_valid_off = -MAX_BPF_STACK; 6710 + min_valid_off = -max_bpf_stack; 6805 6711 else 6806 6712 min_valid_off = -state->allocated_stack; 6807 6713 ··· 6886 6772 * size is -min_off, not -min_off+1. 
6887 6773 */ 6888 6774 return grow_stack_state(env, state, -min_off /* size */); 6775 + } 6776 + 6777 + static bool get_func_retval_range(struct bpf_prog *prog, 6778 + struct bpf_retval_range *range) 6779 + { 6780 + if (prog->type == BPF_PROG_TYPE_LSM && 6781 + prog->expected_attach_type == BPF_LSM_MAC && 6782 + !bpf_lsm_get_retval_range(prog, range)) { 6783 + return true; 6784 + } 6785 + return false; 6889 6786 } 6890 6787 6891 6788 /* check whether memory at (regno + off) is accessible for t = (read | write) ··· 7003 6878 if (!err && value_regno >= 0 && (t == BPF_READ || rdonly_mem)) 7004 6879 mark_reg_unknown(env, regs, value_regno); 7005 6880 } else if (reg->type == PTR_TO_CTX) { 6881 + bool is_retval = false; 6882 + struct bpf_retval_range range; 7006 6883 enum bpf_reg_type reg_type = SCALAR_VALUE; 7007 6884 struct btf *btf = NULL; 7008 6885 u32 btf_id = 0; ··· 7020 6893 return err; 7021 6894 7022 6895 err = check_ctx_access(env, insn_idx, off, size, t, &reg_type, &btf, 7023 - &btf_id); 6896 + &btf_id, &is_retval, is_ldsx); 7024 6897 if (err) 7025 6898 verbose_linfo(env, insn_idx, "; "); 7026 6899 if (!err && t == BPF_READ && value_regno >= 0) { ··· 7029 6902 * case, we know the offset is zero. 
7030 6903 */ 7031 6904 if (reg_type == SCALAR_VALUE) { 7032 - mark_reg_unknown(env, regs, value_regno); 6905 + if (is_retval && get_func_retval_range(env->prog, &range)) { 6906 + err = __mark_reg_s32_range(env, regs, value_regno, 6907 + range.minval, range.maxval); 6908 + if (err) 6909 + return err; 6910 + } else { 6911 + mark_reg_unknown(env, regs, value_regno); 6912 + } 7033 6913 } else { 7034 6914 mark_reg_known_zero(env, regs, 7035 6915 value_regno); ··· 7800 7666 struct bpf_call_arg_meta *meta) 7801 7667 { 7802 7668 struct bpf_reg_state *regs = cur_regs(env), *reg = &regs[regno]; 7803 - struct bpf_map *map_ptr = reg->map_ptr; 7804 7669 struct btf_field *kptr_field; 7670 + struct bpf_map *map_ptr; 7671 + struct btf_record *rec; 7805 7672 u32 kptr_off; 7673 + 7674 + if (type_is_ptr_alloc_obj(reg->type)) { 7675 + rec = reg_btf_record(reg); 7676 + } else { /* PTR_TO_MAP_VALUE */ 7677 + map_ptr = reg->map_ptr; 7678 + if (!map_ptr->btf) { 7679 + verbose(env, "map '%s' has to have BTF in order to use bpf_kptr_xchg\n", 7680 + map_ptr->name); 7681 + return -EINVAL; 7682 + } 7683 + rec = map_ptr->record; 7684 + meta->map_ptr = map_ptr; 7685 + } 7806 7686 7807 7687 if (!tnum_is_const(reg->var_off)) { 7808 7688 verbose(env, ··· 7824 7676 regno); 7825 7677 return -EINVAL; 7826 7678 } 7827 - if (!map_ptr->btf) { 7828 - verbose(env, "map '%s' has to have BTF in order to use bpf_kptr_xchg\n", 7829 - map_ptr->name); 7830 - return -EINVAL; 7831 - } 7832 - if (!btf_record_has_field(map_ptr->record, BPF_KPTR)) { 7833 - verbose(env, "map '%s' has no valid kptr\n", map_ptr->name); 7679 + 7680 + if (!btf_record_has_field(rec, BPF_KPTR)) { 7681 + verbose(env, "R%d has no valid kptr\n", regno); 7834 7682 return -EINVAL; 7835 7683 } 7836 7684 7837 - meta->map_ptr = map_ptr; 7838 7685 kptr_off = reg->off + reg->var_off.value; 7839 - kptr_field = btf_record_find(map_ptr->record, kptr_off, BPF_KPTR); 7686 + kptr_field = btf_record_find(rec, kptr_off, BPF_KPTR); 7840 7687 if (!kptr_field) 
{ 7841 7688 verbose(env, "off=%d doesn't point to kptr\n", kptr_off); 7842 7689 return -EACCES; ··· 7976 7833 return meta->kfunc_flags & KF_ITER_DESTROY; 7977 7834 } 7978 7835 7979 - static bool is_kfunc_arg_iter(struct bpf_kfunc_call_arg_meta *meta, int arg) 7836 + static bool is_kfunc_arg_iter(struct bpf_kfunc_call_arg_meta *meta, int arg_idx, 7837 + const struct btf_param *arg) 7980 7838 { 7981 7839 /* btf_check_iter_kfuncs() guarantees that first argument of any iter 7982 7840 * kfunc is iter state pointer 7983 7841 */ 7984 - return arg == 0 && is_iter_kfunc(meta); 7842 + if (is_iter_kfunc(meta)) 7843 + return arg_idx == 0; 7844 + 7845 + /* iter passed as an argument to a generic kfunc */ 7846 + return btf_param_match_suffix(meta->btf, arg, "__iter"); 7985 7847 } 7986 7848 7987 7849 static int process_iter_arg(struct bpf_verifier_env *env, int regno, int insn_idx, ··· 7994 7846 { 7995 7847 struct bpf_reg_state *regs = cur_regs(env), *reg = &regs[regno]; 7996 7848 const struct btf_type *t; 7997 - const struct btf_param *arg; 7998 - int spi, err, i, nr_slots; 7999 - u32 btf_id; 7849 + int spi, err, i, nr_slots, btf_id; 8000 7850 8001 - /* btf_check_iter_kfuncs() ensures we don't need to validate anything here */ 8002 - arg = &btf_params(meta->func_proto)[0]; 8003 - t = btf_type_skip_modifiers(meta->btf, arg->type, NULL); /* PTR */ 8004 - t = btf_type_skip_modifiers(meta->btf, t->type, &btf_id); /* STRUCT */ 7851 + /* For iter_{new,next,destroy} functions, btf_check_iter_kfuncs() 7852 + * ensures struct convention, so we wouldn't need to do any BTF 7853 + * validation here. But given iter state can be passed as a parameter 7854 + * to any kfunc, if arg has "__iter" suffix, we need to be a bit more 7855 + * conservative here. 
7856 + */ 7857 + btf_id = btf_check_iter_arg(meta->btf, meta->func_proto, regno - 1); 7858 + if (btf_id < 0) { 7859 + verbose(env, "expected valid iter pointer as arg #%d\n", regno); 7860 + return -EINVAL; 7861 + } 7862 + t = btf_type_by_id(meta->btf, btf_id); 8005 7863 nr_slots = t->size / BPF_REG_SIZE; 8006 7864 8007 7865 if (is_iter_new_kfunc(meta)) { ··· 8029 7875 if (err) 8030 7876 return err; 8031 7877 } else { 8032 - /* iter_next() or iter_destroy() expect initialized iter state*/ 7878 + /* iter_next() or iter_destroy(), as well as any kfunc 7879 + * accepting iter argument, expect initialized iter state 7880 + */ 8033 7881 err = is_iter_reg_valid_init(env, reg, meta->btf, btf_id, nr_slots); 8034 7882 switch (err) { 8035 7883 case 0: ··· 8145 7989 return 0; 8146 7990 } 8147 7991 7992 + static struct bpf_reg_state *get_iter_from_state(struct bpf_verifier_state *cur_st, 7993 + struct bpf_kfunc_call_arg_meta *meta) 7994 + { 7995 + int iter_frameno = meta->iter.frameno; 7996 + int iter_spi = meta->iter.spi; 7997 + 7998 + return &cur_st->frame[iter_frameno]->stack[iter_spi].spilled_ptr; 7999 + } 8000 + 8148 8001 /* process_iter_next_call() is called when verifier gets to iterator's next 8149 8002 * "method" (e.g., bpf_iter_num_next() for numbers iterator) call. We'll refer 8150 8003 * to it as just "iter_next()" in comments below. 
··· 8238 8073 struct bpf_verifier_state *cur_st = env->cur_state, *queued_st, *prev_st; 8239 8074 struct bpf_func_state *cur_fr = cur_st->frame[cur_st->curframe], *queued_fr; 8240 8075 struct bpf_reg_state *cur_iter, *queued_iter; 8241 - int iter_frameno = meta->iter.frameno; 8242 - int iter_spi = meta->iter.spi; 8243 8076 8244 8077 BTF_TYPE_EMIT(struct bpf_iter); 8245 8078 8246 - cur_iter = &env->cur_state->frame[iter_frameno]->stack[iter_spi].spilled_ptr; 8079 + cur_iter = get_iter_from_state(cur_st, meta); 8247 8080 8248 8081 if (cur_iter->iter.state != BPF_ITER_STATE_ACTIVE && 8249 8082 cur_iter->iter.state != BPF_ITER_STATE_DRAINED) { ··· 8269 8106 if (!queued_st) 8270 8107 return -ENOMEM; 8271 8108 8272 - queued_iter = &queued_st->frame[iter_frameno]->stack[iter_spi].spilled_ptr; 8109 + queued_iter = get_iter_from_state(queued_st, meta); 8273 8110 queued_iter->iter.state = BPF_ITER_STATE_ACTIVE; 8274 8111 queued_iter->iter.depth++; 8275 8112 if (prev_st) ··· 8293 8130 type == ARG_CONST_SIZE_OR_ZERO; 8294 8131 } 8295 8132 8133 + static bool arg_type_is_raw_mem(enum bpf_arg_type type) 8134 + { 8135 + return base_type(type) == ARG_PTR_TO_MEM && 8136 + type & MEM_UNINIT; 8137 + } 8138 + 8296 8139 static bool arg_type_is_release(enum bpf_arg_type type) 8297 8140 { 8298 8141 return type & OBJ_RELEASE; ··· 8307 8138 static bool arg_type_is_dynptr(enum bpf_arg_type type) 8308 8139 { 8309 8140 return base_type(type) == ARG_PTR_TO_DYNPTR; 8310 - } 8311 - 8312 - static int int_ptr_type_to_size(enum bpf_arg_type type) 8313 - { 8314 - if (type == ARG_PTR_TO_INT) 8315 - return sizeof(u32); 8316 - else if (type == ARG_PTR_TO_LONG) 8317 - return sizeof(u64); 8318 - 8319 - return -EINVAL; 8320 8141 } 8321 8142 8322 8143 static int resolve_map_arg_type(struct bpf_verifier_env *env, ··· 8381 8222 }, 8382 8223 }; 8383 8224 8384 - static const struct bpf_reg_types int_ptr_types = { 8385 - .types = { 8386 - PTR_TO_STACK, 8387 - PTR_TO_PACKET, 8388 - PTR_TO_PACKET_META, 8389 - 
PTR_TO_MAP_KEY, 8390 - PTR_TO_MAP_VALUE, 8391 - }, 8392 - }; 8393 - 8394 8225 static const struct bpf_reg_types spin_lock_types = { 8395 8226 .types = { 8396 8227 PTR_TO_MAP_VALUE, ··· 8411 8262 static const struct bpf_reg_types stack_ptr_types = { .types = { PTR_TO_STACK } }; 8412 8263 static const struct bpf_reg_types const_str_ptr_types = { .types = { PTR_TO_MAP_VALUE } }; 8413 8264 static const struct bpf_reg_types timer_types = { .types = { PTR_TO_MAP_VALUE } }; 8414 - static const struct bpf_reg_types kptr_types = { .types = { PTR_TO_MAP_VALUE } }; 8265 + static const struct bpf_reg_types kptr_xchg_dest_types = { 8266 + .types = { 8267 + PTR_TO_MAP_VALUE, 8268 + PTR_TO_BTF_ID | MEM_ALLOC 8269 + } 8270 + }; 8415 8271 static const struct bpf_reg_types dynptr_types = { 8416 8272 .types = { 8417 8273 PTR_TO_STACK, ··· 8441 8287 [ARG_PTR_TO_SPIN_LOCK] = &spin_lock_types, 8442 8288 [ARG_PTR_TO_MEM] = &mem_types, 8443 8289 [ARG_PTR_TO_RINGBUF_MEM] = &ringbuf_mem_types, 8444 - [ARG_PTR_TO_INT] = &int_ptr_types, 8445 - [ARG_PTR_TO_LONG] = &int_ptr_types, 8446 8290 [ARG_PTR_TO_PERCPU_BTF_ID] = &percpu_btf_ptr_types, 8447 8291 [ARG_PTR_TO_FUNC] = &func_ptr_types, 8448 8292 [ARG_PTR_TO_STACK] = &stack_ptr_types, 8449 8293 [ARG_PTR_TO_CONST_STR] = &const_str_ptr_types, 8450 8294 [ARG_PTR_TO_TIMER] = &timer_types, 8451 - [ARG_PTR_TO_KPTR] = &kptr_types, 8295 + [ARG_KPTR_XCHG_DEST] = &kptr_xchg_dest_types, 8452 8296 [ARG_PTR_TO_DYNPTR] = &dynptr_types, 8453 8297 }; 8454 8298 ··· 8485 8333 if (base_type(arg_type) == ARG_PTR_TO_MEM) 8486 8334 type &= ~DYNPTR_TYPE_FLAG_MASK; 8487 8335 8488 - if (meta->func_id == BPF_FUNC_kptr_xchg && type_is_alloc(type)) { 8336 + /* Local kptr types are allowed as the source argument of bpf_kptr_xchg */ 8337 + if (meta->func_id == BPF_FUNC_kptr_xchg && type_is_alloc(type) && regno == BPF_REG_2) { 8489 8338 type &= ~MEM_ALLOC; 8490 8339 type &= ~MEM_PERCPU; 8491 8340 } ··· 8579 8426 verbose(env, "verifier internal error: unimplemented handling 
of MEM_ALLOC\n"); 8580 8427 return -EFAULT; 8581 8428 } 8582 - if (meta->func_id == BPF_FUNC_kptr_xchg) { 8429 + /* Check if local kptr in src arg matches kptr in dst arg */ 8430 + if (meta->func_id == BPF_FUNC_kptr_xchg && regno == BPF_REG_2) { 8583 8431 if (map_kptr_match_type(env, meta->kptr_field, reg, regno)) 8584 8432 return -EACCES; 8585 8433 } ··· 8891 8737 meta->release_regno = regno; 8892 8738 } 8893 8739 8894 - if (reg->ref_obj_id) { 8740 + if (reg->ref_obj_id && base_type(arg_type) != ARG_KPTR_XCHG_DEST) { 8895 8741 if (meta->ref_obj_id) { 8896 8742 verbose(env, "verifier internal error: more than one arg with ref_obj_id R%d %u %u\n", 8897 8743 regno, reg->ref_obj_id, ··· 9003 8849 */ 9004 8850 meta->raw_mode = arg_type & MEM_UNINIT; 9005 8851 if (arg_type & MEM_FIXED_SIZE) { 9006 - err = check_helper_mem_access(env, regno, 9007 - fn->arg_size[arg], false, 9008 - meta); 8852 + err = check_helper_mem_access(env, regno, fn->arg_size[arg], false, meta); 8853 + if (err) 8854 + return err; 8855 + if (arg_type & MEM_ALIGNED) 8856 + err = check_ptr_alignment(env, reg, 0, fn->arg_size[arg], true); 9009 8857 } 9010 8858 break; 9011 8859 case ARG_CONST_SIZE: ··· 9032 8876 if (err) 9033 8877 return err; 9034 8878 break; 9035 - case ARG_PTR_TO_INT: 9036 - case ARG_PTR_TO_LONG: 9037 - { 9038 - int size = int_ptr_type_to_size(arg_type); 9039 - 9040 - err = check_helper_mem_access(env, regno, size, false, meta); 9041 - if (err) 9042 - return err; 9043 - err = check_ptr_alignment(env, reg, 0, size, true); 9044 - break; 9045 - } 9046 8879 case ARG_PTR_TO_CONST_STR: 9047 8880 { 9048 8881 err = check_reg_const_str(env, reg, regno); ··· 9039 8894 return err; 9040 8895 break; 9041 8896 } 9042 - case ARG_PTR_TO_KPTR: 8897 + case ARG_KPTR_XCHG_DEST: 9043 8898 err = process_kptr_func(env, regno, meta); 9044 8899 if (err) 9045 8900 return err; ··· 9348 9203 { 9349 9204 int count = 0; 9350 9205 9351 - if (fn->arg1_type == ARG_PTR_TO_UNINIT_MEM) 9206 + if 
(arg_type_is_raw_mem(fn->arg1_type)) 9352 9207 count++; 9353 - if (fn->arg2_type == ARG_PTR_TO_UNINIT_MEM) 9208 + if (arg_type_is_raw_mem(fn->arg2_type)) 9354 9209 count++; 9355 - if (fn->arg3_type == ARG_PTR_TO_UNINIT_MEM) 9210 + if (arg_type_is_raw_mem(fn->arg3_type)) 9356 9211 count++; 9357 - if (fn->arg4_type == ARG_PTR_TO_UNINIT_MEM) 9212 + if (arg_type_is_raw_mem(fn->arg4_type)) 9358 9213 count++; 9359 - if (fn->arg5_type == ARG_PTR_TO_UNINIT_MEM) 9214 + if (arg_type_is_raw_mem(fn->arg5_type)) 9360 9215 count++; 9361 9216 9362 9217 /* We only support one arg being in raw mode at the moment, ··· 10070 9925 return is_rbtree_lock_required_kfunc(kfunc_btf_id); 10071 9926 } 10072 9927 10073 - static bool retval_range_within(struct bpf_retval_range range, const struct bpf_reg_state *reg) 9928 + static bool retval_range_within(struct bpf_retval_range range, const struct bpf_reg_state *reg, 9929 + bool return_32bit) 10074 9930 { 10075 - return range.minval <= reg->smin_value && reg->smax_value <= range.maxval; 9931 + if (return_32bit) 9932 + return range.minval <= reg->s32_min_value && reg->s32_max_value <= range.maxval; 9933 + else 9934 + return range.minval <= reg->smin_value && reg->smax_value <= range.maxval; 10076 9935 } 10077 9936 10078 9937 static int prepare_func_exit(struct bpf_verifier_env *env, int *insn_idx) ··· 10113 9964 if (err) 10114 9965 return err; 10115 9966 10116 - /* enforce R0 return value range */ 10117 - if (!retval_range_within(callee->callback_ret_range, r0)) { 9967 + /* enforce R0 return value range, and bpf_callback_t returns 64bit */ 9968 + if (!retval_range_within(callee->callback_ret_range, r0, false)) { 10118 9969 verbose_invalid_scalar(env, r0, callee->callback_ret_range, 10119 9970 "At callback return", "R0"); 10120 9971 return -EINVAL; ··· 10416 10267 state->callback_subprogno == subprogno); 10417 10268 } 10418 10269 10270 + static int get_helper_proto(struct bpf_verifier_env *env, int func_id, 10271 + const struct bpf_func_proto 
**ptr) 10272 + { 10273 + if (func_id < 0 || func_id >= __BPF_FUNC_MAX_ID) 10274 + return -ERANGE; 10275 + 10276 + if (!env->ops->get_func_proto) 10277 + return -EINVAL; 10278 + 10279 + *ptr = env->ops->get_func_proto(func_id, env->prog); 10280 + return *ptr ? 0 : -EINVAL; 10281 + } 10282 + 10419 10283 static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn, 10420 10284 int *insn_idx_p) 10421 10285 { ··· 10445 10283 10446 10284 /* find function prototype */ 10447 10285 func_id = insn->imm; 10448 - if (func_id < 0 || func_id >= __BPF_FUNC_MAX_ID) { 10449 - verbose(env, "invalid func %s#%d\n", func_id_name(func_id), 10450 - func_id); 10286 + err = get_helper_proto(env, insn->imm, &fn); 10287 + if (err == -ERANGE) { 10288 + verbose(env, "invalid func %s#%d\n", func_id_name(func_id), func_id); 10451 10289 return -EINVAL; 10452 10290 } 10453 10291 10454 - if (env->ops->get_func_proto) 10455 - fn = env->ops->get_func_proto(func_id, env->prog); 10456 - if (!fn) { 10292 + if (err) { 10457 10293 verbose(env, "program of this type cannot use helper %s#%d\n", 10458 10294 func_id_name(func_id), func_id); 10459 - return -EINVAL; 10295 + return err; 10460 10296 } 10461 10297 10462 10298 /* eBPF programs must be GPL compatible to use GPL-ed functions */ ··· 11390 11230 if (is_kfunc_arg_dynptr(meta->btf, &args[argno])) 11391 11231 return KF_ARG_PTR_TO_DYNPTR; 11392 11232 11393 - if (is_kfunc_arg_iter(meta, argno)) 11233 + if (is_kfunc_arg_iter(meta, argno, &args[argno])) 11394 11234 return KF_ARG_PTR_TO_ITER; 11395 11235 11396 11236 if (is_kfunc_arg_list_head(meta->btf, &args[argno])) ··· 11492 11332 * btf_struct_ids_match() to walk the struct at the 0th offset, and 11493 11333 * resolve types. 
11494 11334 */ 11495 - if (is_kfunc_acquire(meta) || 11496 - (is_kfunc_release(meta) && reg->ref_obj_id) || 11335 + if ((is_kfunc_release(meta) && reg->ref_obj_id) || 11497 11336 btf_type_ids_nocast_alias(&env->log, reg_btf, reg_ref_id, meta->btf, ref_id)) 11498 11337 strict_type_match = true; 11499 11338 ··· 12109 11950 switch (kf_arg_type) { 12110 11951 case KF_ARG_PTR_TO_CTX: 12111 11952 if (reg->type != PTR_TO_CTX) { 12112 - verbose(env, "arg#%d expected pointer to ctx, but got %s\n", i, btf_type_str(t)); 11953 + verbose(env, "arg#%d expected pointer to ctx, but got %s\n", 11954 + i, reg_type_str(env, reg->type)); 12113 11955 return -EINVAL; 12114 11956 } 12115 11957 ··· 12833 12673 regs[BPF_REG_0].btf = desc_btf; 12834 12674 regs[BPF_REG_0].type = PTR_TO_BTF_ID; 12835 12675 regs[BPF_REG_0].btf_id = ptr_type_id; 12676 + 12677 + if (is_iter_next_kfunc(&meta)) { 12678 + struct bpf_reg_state *cur_iter; 12679 + 12680 + cur_iter = get_iter_from_state(env->cur_state, &meta); 12681 + 12682 + if (cur_iter->type & MEM_RCU) /* KF_RCU_PROTECTED */ 12683 + regs[BPF_REG_0].type |= MEM_RCU; 12684 + else 12685 + regs[BPF_REG_0].type |= PTR_TRUSTED; 12686 + } 12836 12687 } 12837 12688 12838 12689 if (is_kfunc_ret_null(&meta)) { ··· 14272 14101 u64 val = reg_const_value(src_reg, alu32); 14273 14102 14274 14103 if ((dst_reg->id & BPF_ADD_CONST) || 14275 - /* prevent overflow in find_equal_scalars() later */ 14104 + /* prevent overflow in sync_linked_regs() later */ 14276 14105 val > (u32)S32_MAX) { 14277 14106 /* 14278 14107 * If the register already went through rX += val ··· 14287 14116 } else { 14288 14117 /* 14289 14118 * Make sure ID is cleared otherwise dst_reg min/max could be 14290 - * incorrectly propagated into other registers by find_equal_scalars() 14119 + * incorrectly propagated into other registers by sync_linked_regs() 14291 14120 */ 14292 14121 dst_reg->id = 0; 14293 14122 } ··· 14437 14266 copy_register_state(dst_reg, src_reg); 14438 14267 /* Make sure ID is 
cleared if src_reg is not in u32 14439 14268 * range otherwise dst_reg min/max could be incorrectly 14440 - * propagated into src_reg by find_equal_scalars() 14269 + * propagated into src_reg by sync_linked_regs() 14441 14270 */ 14442 14271 if (!is_src_reg_u32) 14443 14272 dst_reg->id = 0; ··· 15260 15089 return true; 15261 15090 } 15262 15091 15263 - static void find_equal_scalars(struct bpf_verifier_state *vstate, 15264 - struct bpf_reg_state *known_reg) 15092 + static void __collect_linked_regs(struct linked_regs *reg_set, struct bpf_reg_state *reg, 15093 + u32 id, u32 frameno, u32 spi_or_reg, bool is_reg) 15094 + { 15095 + struct linked_reg *e; 15096 + 15097 + if (reg->type != SCALAR_VALUE || (reg->id & ~BPF_ADD_CONST) != id) 15098 + return; 15099 + 15100 + e = linked_regs_push(reg_set); 15101 + if (e) { 15102 + e->frameno = frameno; 15103 + e->is_reg = is_reg; 15104 + e->regno = spi_or_reg; 15105 + } else { 15106 + reg->id = 0; 15107 + } 15108 + } 15109 + 15110 + /* For all R being scalar registers or spilled scalar registers 15111 + * in verifier state, save R in linked_regs if R->id == id. 15112 + * If there are too many Rs sharing same id, reset id for leftover Rs. 
15113 + */ 15114 + static void collect_linked_regs(struct bpf_verifier_state *vstate, u32 id, 15115 + struct linked_regs *linked_regs) 15116 + { 15117 + struct bpf_func_state *func; 15118 + struct bpf_reg_state *reg; 15119 + int i, j; 15120 + 15121 + id = id & ~BPF_ADD_CONST; 15122 + for (i = vstate->curframe; i >= 0; i--) { 15123 + func = vstate->frame[i]; 15124 + for (j = 0; j < BPF_REG_FP; j++) { 15125 + reg = &func->regs[j]; 15126 + __collect_linked_regs(linked_regs, reg, id, i, j, true); 15127 + } 15128 + for (j = 0; j < func->allocated_stack / BPF_REG_SIZE; j++) { 15129 + if (!is_spilled_reg(&func->stack[j])) 15130 + continue; 15131 + reg = &func->stack[j].spilled_ptr; 15132 + __collect_linked_regs(linked_regs, reg, id, i, j, false); 15133 + } 15134 + } 15135 + } 15136 + 15137 + /* For all R in linked_regs, copy known_reg range into R 15138 + * if R->id == known_reg->id. 15139 + */ 15140 + static void sync_linked_regs(struct bpf_verifier_state *vstate, struct bpf_reg_state *known_reg, 15141 + struct linked_regs *linked_regs) 15265 15142 { 15266 15143 struct bpf_reg_state fake_reg; 15267 - struct bpf_func_state *state; 15268 15144 struct bpf_reg_state *reg; 15145 + struct linked_reg *e; 15146 + int i; 15269 15147 15270 - bpf_for_each_reg_in_vstate(vstate, state, reg, ({ 15148 + for (i = 0; i < linked_regs->cnt; ++i) { 15149 + e = &linked_regs->entries[i]; 15150 + reg = e->is_reg ? &vstate->frame[e->frameno]->regs[e->regno] 15151 + : &vstate->frame[e->frameno]->stack[e->spi].spilled_ptr; 15271 15152 if (reg->type != SCALAR_VALUE || reg == known_reg) 15272 15153 continue; 15273 15154 if ((reg->id & ~BPF_ADD_CONST) != (known_reg->id & ~BPF_ADD_CONST)) ··· 15337 15114 copy_register_state(reg, known_reg); 15338 15115 /* 15339 15116 * Must preserve off, id and add_const flag, 15340 - * otherwise another find_equal_scalars() will be incorrect. 15117 + * otherwise another sync_linked_regs() will be incorrect. 
15341 15118 */ 15342 15119 reg->off = saved_off; 15343 15120 ··· 15345 15122 scalar_min_max_add(reg, &fake_reg); 15346 15123 reg->var_off = tnum_add(reg->var_off, fake_reg.var_off); 15347 15124 } 15348 - })); 15125 + } 15349 15126 } 15350 15127 15351 15128 static int check_cond_jmp_op(struct bpf_verifier_env *env, ··· 15356 15133 struct bpf_reg_state *regs = this_branch->frame[this_branch->curframe]->regs; 15357 15134 struct bpf_reg_state *dst_reg, *other_branch_regs, *src_reg = NULL; 15358 15135 struct bpf_reg_state *eq_branch_regs; 15136 + struct linked_regs linked_regs = {}; 15359 15137 u8 opcode = BPF_OP(insn->code); 15360 15138 bool is_jmp32; 15361 15139 int pred = -1; ··· 15471 15247 return 0; 15472 15248 } 15473 15249 15250 + /* Push scalar registers sharing same ID to jump history, 15251 + * do this before creating 'other_branch', so that both 15252 + * 'this_branch' and 'other_branch' share this history 15253 + * if parent state is created. 15254 + */ 15255 + if (BPF_SRC(insn->code) == BPF_X && src_reg->type == SCALAR_VALUE && src_reg->id) 15256 + collect_linked_regs(this_branch, src_reg->id, &linked_regs); 15257 + if (dst_reg->type == SCALAR_VALUE && dst_reg->id) 15258 + collect_linked_regs(this_branch, dst_reg->id, &linked_regs); 15259 + if (linked_regs.cnt > 1) { 15260 + err = push_jmp_history(env, this_branch, 0, linked_regs_pack(&linked_regs)); 15261 + if (err) 15262 + return err; 15263 + } 15264 + 15474 15265 other_branch = push_stack(env, *insn_idx + insn->off + 1, *insn_idx, 15475 15266 false); 15476 15267 if (!other_branch) ··· 15516 15277 if (BPF_SRC(insn->code) == BPF_X && 15517 15278 src_reg->type == SCALAR_VALUE && src_reg->id && 15518 15279 !WARN_ON_ONCE(src_reg->id != other_branch_regs[insn->src_reg].id)) { 15519 - find_equal_scalars(this_branch, src_reg); 15520 - find_equal_scalars(other_branch, &other_branch_regs[insn->src_reg]); 15280 + sync_linked_regs(this_branch, src_reg, &linked_regs); 15281 + sync_linked_regs(other_branch, 
&other_branch_regs[insn->src_reg], &linked_regs); 15521 15282 } 15522 15283 if (dst_reg->type == SCALAR_VALUE && dst_reg->id && 15523 15284 !WARN_ON_ONCE(dst_reg->id != other_branch_regs[insn->dst_reg].id)) { 15524 - find_equal_scalars(this_branch, dst_reg); 15525 - find_equal_scalars(other_branch, &other_branch_regs[insn->dst_reg]); 15285 + sync_linked_regs(this_branch, dst_reg, &linked_regs); 15286 + sync_linked_regs(other_branch, &other_branch_regs[insn->dst_reg], &linked_regs); 15526 15287 } 15527 15288 15528 15289 /* if one pointer register is compared to another pointer ··· 15810 15571 int err; 15811 15572 struct bpf_func_state *frame = env->cur_state->frame[0]; 15812 15573 const bool is_subprog = frame->subprogno; 15574 + bool return_32bit = false; 15813 15575 15814 15576 /* LSM and struct_ops func-ptr's return type could be "void" */ 15815 15577 if (!is_subprog || frame->in_exception_callback_fn) { ··· 15916 15676 15917 15677 case BPF_PROG_TYPE_LSM: 15918 15678 if (env->prog->expected_attach_type != BPF_LSM_CGROUP) { 15919 - /* Regular BPF_PROG_TYPE_LSM programs can return 15920 - * any value. 15921 - */ 15922 - return 0; 15923 - } 15924 - if (!env->prog->aux->attach_func_proto->type) { 15679 + /* no range found, any return value is allowed */ 15680 + if (!get_func_retval_range(env->prog, &range)) 15681 + return 0; 15682 + /* no restricted range, any return value is allowed */ 15683 + if (range.minval == S32_MIN && range.maxval == S32_MAX) 15684 + return 0; 15685 + return_32bit = true; 15686 + } else if (!env->prog->aux->attach_func_proto->type) { 15925 15687 /* Make sure programs that attach to void 15926 15688 * hooks don't try to modify return value. 
15927 15689 */ ··· 15953 15711 if (err) 15954 15712 return err; 15955 15713 15956 - if (!retval_range_within(range, reg)) { 15714 + if (!retval_range_within(range, reg, return_32bit)) { 15957 15715 verbose_invalid_scalar(env, reg, range, exit_ctx, reg_name); 15958 15716 if (!is_subprog && 15959 15717 prog->expected_attach_type == BPF_LSM_CGROUP && ··· 16117 15875 ret = push_insn(t, t + insns[t].imm + 1, BRANCH, env); 16118 15876 } 16119 15877 return ret; 15878 + } 15879 + 15880 + /* Bitmask with 1s for all caller saved registers */ 15881 + #define ALL_CALLER_SAVED_REGS ((1u << CALLER_SAVED_REGS) - 1) 15882 + 15883 + /* Return a bitmask specifying which caller saved registers are 15884 + * clobbered by a call to a helper *as if* this helper follows 15885 + * bpf_fastcall contract: 15886 + * - includes R0 if function is non-void; 15887 + * - includes R1-R5 if corresponding parameter has is described 15888 + * in the function prototype. 15889 + */ 15890 + static u32 helper_fastcall_clobber_mask(const struct bpf_func_proto *fn) 15891 + { 15892 + u32 mask; 15893 + int i; 15894 + 15895 + mask = 0; 15896 + if (fn->ret_type != RET_VOID) 15897 + mask |= BIT(BPF_REG_0); 15898 + for (i = 0; i < ARRAY_SIZE(fn->arg_type); ++i) 15899 + if (fn->arg_type[i] != ARG_DONTCARE) 15900 + mask |= BIT(BPF_REG_1 + i); 15901 + return mask; 15902 + } 15903 + 15904 + /* True if do_misc_fixups() replaces calls to helper number 'imm', 15905 + * replacement patch is presumed to follow bpf_fastcall contract 15906 + * (see mark_fastcall_pattern_for_call() below). 
15907 + */ 15908 + static bool verifier_inlines_helper_call(struct bpf_verifier_env *env, s32 imm) 15909 + { 15910 + switch (imm) { 15911 + #ifdef CONFIG_X86_64 15912 + case BPF_FUNC_get_smp_processor_id: 15913 + return env->prog->jit_requested && bpf_jit_supports_percpu_insn(); 15914 + #endif 15915 + default: 15916 + return false; 15917 + } 15918 + } 15919 + 15920 + /* Same as helper_fastcall_clobber_mask() but for kfuncs, see comment above */ 15921 + static u32 kfunc_fastcall_clobber_mask(struct bpf_kfunc_call_arg_meta *meta) 15922 + { 15923 + u32 vlen, i, mask; 15924 + 15925 + vlen = btf_type_vlen(meta->func_proto); 15926 + mask = 0; 15927 + if (!btf_type_is_void(btf_type_by_id(meta->btf, meta->func_proto->type))) 15928 + mask |= BIT(BPF_REG_0); 15929 + for (i = 0; i < vlen; ++i) 15930 + mask |= BIT(BPF_REG_1 + i); 15931 + return mask; 15932 + } 15933 + 15934 + /* Same as verifier_inlines_helper_call() but for kfuncs, see comment above */ 15935 + static bool is_fastcall_kfunc_call(struct bpf_kfunc_call_arg_meta *meta) 15936 + { 15937 + if (meta->btf == btf_vmlinux) 15938 + return meta->func_id == special_kfunc_list[KF_bpf_cast_to_kern_ctx] || 15939 + meta->func_id == special_kfunc_list[KF_bpf_rdonly_cast]; 15940 + return false; 15941 + } 15942 + 15943 + /* LLVM define a bpf_fastcall function attribute. 15944 + * This attribute means that function scratches only some of 15945 + * the caller saved registers defined by ABI. 15946 + * For BPF the set of such registers could be defined as follows: 15947 + * - R0 is scratched only if function is non-void; 15948 + * - R1-R5 are scratched only if corresponding parameter type is defined 15949 + * in the function prototype. 
15950 + * 15951 + * The contract between kernel and clang allows to simultaneously use 15952 + * such functions and maintain backwards compatibility with old 15953 + * kernels that don't understand bpf_fastcall calls: 15954 + * 15955 + * - for bpf_fastcall calls clang allocates registers as-if relevant r0-r5 15956 + * registers are not scratched by the call; 15957 + * 15958 + * - as a post-processing step, clang visits each bpf_fastcall call and adds 15959 + * spill/fill for every live r0-r5; 15960 + * 15961 + * - stack offsets used for the spill/fill are allocated as lowest 15962 + * stack offsets in whole function and are not used for any other 15963 + * purposes; 15964 + * 15965 + * - when kernel loads a program, it looks for such patterns 15966 + * (bpf_fastcall function surrounded by spills/fills) and checks if 15967 + * spill/fill stack offsets are used exclusively in fastcall patterns; 15968 + * 15969 + * - if so, and if verifier or current JIT inlines the call to the 15970 + * bpf_fastcall function (e.g. a helper call), kernel removes unnecessary 15971 + * spill/fill pairs; 15972 + * 15973 + * - when old kernel loads a program, presence of spill/fill pairs 15974 + * keeps BPF program valid, albeit slightly less efficient. 
15975 + * 15976 + * For example: 15977 + * 15978 + * r1 = 1; 15979 + * r2 = 2; 15980 + * *(u64 *)(r10 - 8) = r1; r1 = 1; 15981 + * *(u64 *)(r10 - 16) = r2; r2 = 2; 15982 + * call %[to_be_inlined] --> call %[to_be_inlined] 15983 + * r2 = *(u64 *)(r10 - 16); r0 = r1; 15984 + * r1 = *(u64 *)(r10 - 8); r0 += r2; 15985 + * r0 = r1; exit; 15986 + * r0 += r2; 15987 + * exit; 15988 + * 15989 + * The purpose of mark_fastcall_pattern_for_call is to: 15990 + * - look for such patterns; 15991 + * - mark spill and fill instructions in env->insn_aux_data[*].fastcall_pattern; 15992 + * - mark set env->insn_aux_data[*].fastcall_spills_num for call instruction; 15993 + * - update env->subprog_info[*]->fastcall_stack_off to find an offset 15994 + * at which bpf_fastcall spill/fill stack slots start; 15995 + * - update env->subprog_info[*]->keep_fastcall_stack. 15996 + * 15997 + * The .fastcall_pattern and .fastcall_stack_off are used by 15998 + * check_fastcall_stack_contract() to check if every stack access to 15999 + * fastcall spill/fill stack slot originates from spill/fill 16000 + * instructions, members of fastcall patterns. 16001 + * 16002 + * If such condition holds true for a subprogram, fastcall patterns could 16003 + * be rewritten by remove_fastcall_spills_fills(). 16004 + * Otherwise bpf_fastcall patterns are not changed in the subprogram 16005 + * (code, presumably, generated by an older clang version). 16006 + * 16007 + * For example, it is *not* safe to remove spill/fill below: 16008 + * 16009 + * r1 = 1; 16010 + * *(u64 *)(r10 - 8) = r1; r1 = 1; 16011 + * call %[to_be_inlined] --> call %[to_be_inlined] 16012 + * r1 = *(u64 *)(r10 - 8); r0 = *(u64 *)(r10 - 8); <---- wrong !!! 
16013 + * r0 = *(u64 *)(r10 - 8); r0 += r1; 16014 + * r0 += r1; exit; 16015 + * exit; 16016 + */ 16017 + static void mark_fastcall_pattern_for_call(struct bpf_verifier_env *env, 16018 + struct bpf_subprog_info *subprog, 16019 + int insn_idx, s16 lowest_off) 16020 + { 16021 + struct bpf_insn *insns = env->prog->insnsi, *stx, *ldx; 16022 + struct bpf_insn *call = &env->prog->insnsi[insn_idx]; 16023 + const struct bpf_func_proto *fn; 16024 + u32 clobbered_regs_mask = ALL_CALLER_SAVED_REGS; 16025 + u32 expected_regs_mask; 16026 + bool can_be_inlined = false; 16027 + s16 off; 16028 + int i; 16029 + 16030 + if (bpf_helper_call(call)) { 16031 + if (get_helper_proto(env, call->imm, &fn) < 0) 16032 + /* error would be reported later */ 16033 + return; 16034 + clobbered_regs_mask = helper_fastcall_clobber_mask(fn); 16035 + can_be_inlined = fn->allow_fastcall && 16036 + (verifier_inlines_helper_call(env, call->imm) || 16037 + bpf_jit_inlines_helper_call(call->imm)); 16038 + } 16039 + 16040 + if (bpf_pseudo_kfunc_call(call)) { 16041 + struct bpf_kfunc_call_arg_meta meta; 16042 + int err; 16043 + 16044 + err = fetch_kfunc_meta(env, call, &meta, NULL); 16045 + if (err < 0) 16046 + /* error would be reported later */ 16047 + return; 16048 + 16049 + clobbered_regs_mask = kfunc_fastcall_clobber_mask(&meta); 16050 + can_be_inlined = is_fastcall_kfunc_call(&meta); 16051 + } 16052 + 16053 + if (clobbered_regs_mask == ALL_CALLER_SAVED_REGS) 16054 + return; 16055 + 16056 + /* e.g. if helper call clobbers r{0,1}, expect r{2,3,4,5} in the pattern */ 16057 + expected_regs_mask = ~clobbered_regs_mask & ALL_CALLER_SAVED_REGS; 16058 + 16059 + /* match pairs of form: 16060 + * 16061 + * *(u64 *)(r10 - Y) = rX (where Y % 8 == 0) 16062 + * ... 16063 + * call %[to_be_inlined] 16064 + * ... 
16065 + * rX = *(u64 *)(r10 - Y) 16066 + */ 16067 + for (i = 1, off = lowest_off; i <= ARRAY_SIZE(caller_saved); ++i, off += BPF_REG_SIZE) { 16068 + if (insn_idx - i < 0 || insn_idx + i >= env->prog->len) 16069 + break; 16070 + stx = &insns[insn_idx - i]; 16071 + ldx = &insns[insn_idx + i]; 16072 + /* must be a stack spill/fill pair */ 16073 + if (stx->code != (BPF_STX | BPF_MEM | BPF_DW) || 16074 + ldx->code != (BPF_LDX | BPF_MEM | BPF_DW) || 16075 + stx->dst_reg != BPF_REG_10 || 16076 + ldx->src_reg != BPF_REG_10) 16077 + break; 16078 + /* must be a spill/fill for the same reg */ 16079 + if (stx->src_reg != ldx->dst_reg) 16080 + break; 16081 + /* must be one of the previously unseen registers */ 16082 + if ((BIT(stx->src_reg) & expected_regs_mask) == 0) 16083 + break; 16084 + /* must be a spill/fill for the same expected offset, 16085 + * no need to check offset alignment, BPF_DW stack access 16086 + * is always 8-byte aligned. 16087 + */ 16088 + if (stx->off != off || ldx->off != off) 16089 + break; 16090 + expected_regs_mask &= ~BIT(stx->src_reg); 16091 + env->insn_aux_data[insn_idx - i].fastcall_pattern = 1; 16092 + env->insn_aux_data[insn_idx + i].fastcall_pattern = 1; 16093 + } 16094 + if (i == 1) 16095 + return; 16096 + 16097 + /* Conditionally set 'fastcall_spills_num' to allow forward 16098 + * compatibility when more helper functions are marked as 16099 + * bpf_fastcall at compile time than current kernel supports, e.g: 16100 + * 16101 + * 1: *(u64 *)(r10 - 8) = r1 16102 + * 2: call A ;; assume A is bpf_fastcall for current kernel 16103 + * 3: r1 = *(u64 *)(r10 - 8) 16104 + * 4: *(u64 *)(r10 - 8) = r1 16105 + * 5: call B ;; assume B is not bpf_fastcall for current kernel 16106 + * 6: r1 = *(u64 *)(r10 - 8) 16107 + * 16108 + * There is no need to block bpf_fastcall rewrite for such program. 
16109 + * Set 'fastcall_pattern' for both calls to keep check_fastcall_stack_contract() happy, 16110 + * don't set 'fastcall_spills_num' for call B so that remove_fastcall_spills_fills() 16111 + * does not remove spill/fill pair {4,6}. 16112 + */ 16113 + if (can_be_inlined) 16114 + env->insn_aux_data[insn_idx].fastcall_spills_num = i - 1; 16115 + else 16116 + subprog->keep_fastcall_stack = 1; 16117 + subprog->fastcall_stack_off = min(subprog->fastcall_stack_off, off); 16118 + } 16119 + 16120 + static int mark_fastcall_patterns(struct bpf_verifier_env *env) 16121 + { 16122 + struct bpf_subprog_info *subprog = env->subprog_info; 16123 + struct bpf_insn *insn; 16124 + s16 lowest_off; 16125 + int s, i; 16126 + 16127 + for (s = 0; s < env->subprog_cnt; ++s, ++subprog) { 16128 + /* find lowest stack spill offset used in this subprog */ 16129 + lowest_off = 0; 16130 + for (i = subprog->start; i < (subprog + 1)->start; ++i) { 16131 + insn = env->prog->insnsi + i; 16132 + if (insn->code != (BPF_STX | BPF_MEM | BPF_DW) || 16133 + insn->dst_reg != BPF_REG_10) 16134 + continue; 16135 + lowest_off = min(lowest_off, insn->off); 16136 + } 16137 + /* use this offset to find fastcall patterns */ 16138 + for (i = subprog->start; i < (subprog + 1)->start; ++i) { 16139 + insn = env->prog->insnsi + i; 16140 + if (insn->code != (BPF_JMP | BPF_CALL)) 16141 + continue; 16142 + mark_fastcall_pattern_for_call(env, subprog, i, lowest_off); 16143 + } 16144 + } 16145 + return 0; 16120 16146 } 16121 16147 16122 16148 /* Visits the instruction at index t and returns one of the following: ··· 17282 16772 * 17283 16773 * First verification path is [1-6]: 17284 16774 * - at (4) same bpf_reg_state::id (b) would be assigned to r6 and r7; 17285 - * - at (5) r6 would be marked <= X, find_equal_scalars() would also mark 16775 + * - at (5) r6 would be marked <= X, sync_linked_regs() would also mark 17286 16776 * r7 <= X, because r6 and r7 share same id. 17287 16777 * Next verification path is [1-4, 6]. 
17288 16778 * ··· 18076 17566 * the current state. 18077 17567 */ 18078 17568 if (is_jmp_point(env, env->insn_idx)) 18079 - err = err ? : push_jmp_history(env, cur, 0); 17569 + err = err ? : push_jmp_history(env, cur, 0, 0); 18080 17570 err = err ? : propagate_precision(env, &sl->state); 18081 17571 if (err) 18082 17572 return err; ··· 18344 17834 } 18345 17835 18346 17836 if (is_jmp_point(env, env->insn_idx)) { 18347 - err = push_jmp_history(env, state, 0); 17837 + err = push_jmp_history(env, state, 0, 0); 18348 17838 if (err) 18349 17839 return err; 18350 17840 } ··· 19277 18767 for (i = 0; i < insn_cnt; i++, insn++) { 19278 18768 u8 code = insn->code; 19279 18769 18770 + if (tgt_idx <= i && i < tgt_idx + delta) 18771 + continue; 18772 + 19280 18773 if ((BPF_CLASS(code) != BPF_JMP && BPF_CLASS(code) != BPF_JMP32) || 19281 18774 BPF_OP(code) == BPF_CALL || BPF_OP(code) == BPF_EXIT) 19282 18775 continue; ··· 19539 19026 return 0; 19540 19027 } 19541 19028 19029 + static const struct bpf_insn NOP = BPF_JMP_IMM(BPF_JA, 0, 0, 0); 19030 + 19542 19031 static int opt_remove_nops(struct bpf_verifier_env *env) 19543 19032 { 19544 - const struct bpf_insn ja = BPF_JMP_IMM(BPF_JA, 0, 0, 0); 19033 + const struct bpf_insn ja = NOP; 19545 19034 struct bpf_insn *insn = env->prog->insnsi; 19546 19035 int insn_cnt = env->prog->len; 19547 19036 int i, err; ··· 19668 19153 */ 19669 19154 static int convert_ctx_accesses(struct bpf_verifier_env *env) 19670 19155 { 19156 + struct bpf_subprog_info *subprogs = env->subprog_info; 19671 19157 const struct bpf_verifier_ops *ops = env->ops; 19672 - int i, cnt, size, ctx_field_size, delta = 0; 19158 + int i, cnt, size, ctx_field_size, delta = 0, epilogue_cnt = 0; 19673 19159 const int insn_cnt = env->prog->len; 19674 - struct bpf_insn insn_buf[16], *insn; 19160 + struct bpf_insn *epilogue_buf = env->epilogue_buf; 19161 + struct bpf_insn *insn_buf = env->insn_buf; 19162 + struct bpf_insn *insn; 19675 19163 u32 target_size, size_default, off; 
19676 19164 struct bpf_prog *new_prog; 19677 19165 enum bpf_access_type type; 19678 19166 bool is_narrower_load; 19167 + int epilogue_idx = 0; 19168 + 19169 + if (ops->gen_epilogue) { 19170 + epilogue_cnt = ops->gen_epilogue(epilogue_buf, env->prog, 19171 + -(subprogs[0].stack_depth + 8)); 19172 + if (epilogue_cnt >= INSN_BUF_SIZE) { 19173 + verbose(env, "bpf verifier is misconfigured\n"); 19174 + return -EINVAL; 19175 + } else if (epilogue_cnt) { 19176 + /* Save the ARG_PTR_TO_CTX for the epilogue to use */ 19177 + cnt = 0; 19178 + subprogs[0].stack_depth += 8; 19179 + insn_buf[cnt++] = BPF_STX_MEM(BPF_DW, BPF_REG_FP, BPF_REG_1, 19180 + -subprogs[0].stack_depth); 19181 + insn_buf[cnt++] = env->prog->insnsi[0]; 19182 + new_prog = bpf_patch_insn_data(env, 0, insn_buf, cnt); 19183 + if (!new_prog) 19184 + return -ENOMEM; 19185 + env->prog = new_prog; 19186 + delta += cnt - 1; 19187 + } 19188 + } 19679 19189 19680 19190 if (ops->gen_prologue || env->seen_direct_write) { 19681 19191 if (!ops->gen_prologue) { ··· 19709 19169 } 19710 19170 cnt = ops->gen_prologue(insn_buf, env->seen_direct_write, 19711 19171 env->prog); 19712 - if (cnt >= ARRAY_SIZE(insn_buf)) { 19172 + if (cnt >= INSN_BUF_SIZE) { 19713 19173 verbose(env, "bpf verifier is misconfigured\n"); 19714 19174 return -EINVAL; 19715 19175 } else if (cnt) { ··· 19721 19181 delta += cnt - 1; 19722 19182 } 19723 19183 } 19184 + 19185 + if (delta) 19186 + WARN_ON(adjust_jmp_off(env->prog, 0, delta)); 19724 19187 19725 19188 if (bpf_prog_is_offloaded(env->prog->aux)) 19726 19189 return 0; ··· 19757 19214 insn->code = BPF_STX | BPF_PROBE_ATOMIC | BPF_SIZE(insn->code); 19758 19215 env->prog->aux->num_exentries++; 19759 19216 continue; 19217 + } else if (insn->code == (BPF_JMP | BPF_EXIT) && 19218 + epilogue_cnt && 19219 + i + delta < subprogs[1].start) { 19220 + /* Generate epilogue for the main prog */ 19221 + if (epilogue_idx) { 19222 + /* jump back to the earlier generated epilogue */ 19223 + insn_buf[0] = 
BPF_JMP32_A(epilogue_idx - i - delta - 1); 19224 + cnt = 1; 19225 + } else { 19226 + memcpy(insn_buf, epilogue_buf, 19227 + epilogue_cnt * sizeof(*epilogue_buf)); 19228 + cnt = epilogue_cnt; 19229 + /* epilogue_idx cannot be 0. It must have at 19230 + * least one ctx ptr saving insn before the 19231 + * epilogue. 19232 + */ 19233 + epilogue_idx = i + delta; 19234 + } 19235 + goto patch_insn_buf; 19760 19236 } else { 19761 19237 continue; 19762 19238 } ··· 19878 19316 target_size = 0; 19879 19317 cnt = convert_ctx_access(type, insn, insn_buf, env->prog, 19880 19318 &target_size); 19881 - if (cnt == 0 || cnt >= ARRAY_SIZE(insn_buf) || 19319 + if (cnt == 0 || cnt >= INSN_BUF_SIZE || 19882 19320 (ctx_field_size && !target_size)) { 19883 19321 verbose(env, "bpf verifier is misconfigured\n"); 19884 19322 return -EINVAL; ··· 19887 19325 if (is_narrower_load && size < target_size) { 19888 19326 u8 shift = bpf_ctx_narrow_access_offset( 19889 19327 off, size, size_default) * 8; 19890 - if (shift && cnt + 1 >= ARRAY_SIZE(insn_buf)) { 19328 + if (shift && cnt + 1 >= INSN_BUF_SIZE) { 19891 19329 verbose(env, "bpf verifier narrow ctx load misconfigured\n"); 19892 19330 return -EINVAL; 19893 19331 } ··· 19912 19350 insn->dst_reg, insn->dst_reg, 19913 19351 size * 8, 0); 19914 19352 19353 + patch_insn_buf: 19915 19354 new_prog = bpf_patch_insn_data(env, i + delta, insn_buf, cnt); 19916 19355 if (!new_prog) 19917 19356 return -ENOMEM; ··· 20433 19870 const int insn_cnt = prog->len; 20434 19871 const struct bpf_map_ops *ops; 20435 19872 struct bpf_insn_aux_data *aux; 20436 - struct bpf_insn insn_buf[16]; 19873 + struct bpf_insn *insn_buf = env->insn_buf; 20437 19874 struct bpf_prog *new_prog; 20438 19875 struct bpf_map *map_ptr; 20439 19876 int i, ret, cnt, delta = 0, cur_subprog = 0; ··· 20476 19913 /* Convert BPF_CLASS(insn->code) == BPF_ALU64 to 32-bit ALU */ 20477 19914 insn->code = BPF_ALU | BPF_OP(insn->code) | BPF_SRC(insn->code); 20478 19915 20479 - /* Make divide-by-zero 
exceptions impossible. */ 19916 + /* Make sdiv/smod divide-by-minus-one exceptions impossible. */ 19917 + if ((insn->code == (BPF_ALU64 | BPF_MOD | BPF_K) || 19918 + insn->code == (BPF_ALU64 | BPF_DIV | BPF_K) || 19919 + insn->code == (BPF_ALU | BPF_MOD | BPF_K) || 19920 + insn->code == (BPF_ALU | BPF_DIV | BPF_K)) && 19921 + insn->off == 1 && insn->imm == -1) { 19922 + bool is64 = BPF_CLASS(insn->code) == BPF_ALU64; 19923 + bool isdiv = BPF_OP(insn->code) == BPF_DIV; 19924 + struct bpf_insn *patchlet; 19925 + struct bpf_insn chk_and_sdiv[] = { 19926 + BPF_RAW_INSN((is64 ? BPF_ALU64 : BPF_ALU) | 19927 + BPF_NEG | BPF_K, insn->dst_reg, 19928 + 0, 0, 0), 19929 + }; 19930 + struct bpf_insn chk_and_smod[] = { 19931 + BPF_MOV32_IMM(insn->dst_reg, 0), 19932 + }; 19933 + 19934 + patchlet = isdiv ? chk_and_sdiv : chk_and_smod; 19935 + cnt = isdiv ? ARRAY_SIZE(chk_and_sdiv) : ARRAY_SIZE(chk_and_smod); 19936 + 19937 + new_prog = bpf_patch_insn_data(env, i + delta, patchlet, cnt); 19938 + if (!new_prog) 19939 + return -ENOMEM; 19940 + 19941 + delta += cnt - 1; 19942 + env->prog = prog = new_prog; 19943 + insn = new_prog->insnsi + i + delta; 19944 + goto next_insn; 19945 + } 19946 + 19947 + /* Make divide-by-zero and divide-by-minus-one exceptions impossible. 
*/ 20480 19948 if (insn->code == (BPF_ALU64 | BPF_MOD | BPF_X) || 20481 19949 insn->code == (BPF_ALU64 | BPF_DIV | BPF_X) || 20482 19950 insn->code == (BPF_ALU | BPF_MOD | BPF_X) || 20483 19951 insn->code == (BPF_ALU | BPF_DIV | BPF_X)) { 20484 19952 bool is64 = BPF_CLASS(insn->code) == BPF_ALU64; 20485 19953 bool isdiv = BPF_OP(insn->code) == BPF_DIV; 19954 + bool is_sdiv = isdiv && insn->off == 1; 19955 + bool is_smod = !isdiv && insn->off == 1; 20486 19956 struct bpf_insn *patchlet; 20487 19957 struct bpf_insn chk_and_div[] = { 20488 19958 /* [R,W]x div 0 -> 0 */ ··· 20535 19939 BPF_JMP_IMM(BPF_JA, 0, 0, 1), 20536 19940 BPF_MOV32_REG(insn->dst_reg, insn->dst_reg), 20537 19941 }; 19942 + struct bpf_insn chk_and_sdiv[] = { 19943 + /* [R,W]x sdiv 0 -> 0 19944 + * LLONG_MIN sdiv -1 -> LLONG_MIN 19945 + * INT_MIN sdiv -1 -> INT_MIN 19946 + */ 19947 + BPF_MOV64_REG(BPF_REG_AX, insn->src_reg), 19948 + BPF_RAW_INSN((is64 ? BPF_ALU64 : BPF_ALU) | 19949 + BPF_ADD | BPF_K, BPF_REG_AX, 19950 + 0, 0, 1), 19951 + BPF_RAW_INSN((is64 ? BPF_JMP : BPF_JMP32) | 19952 + BPF_JGT | BPF_K, BPF_REG_AX, 19953 + 0, 4, 1), 19954 + BPF_RAW_INSN((is64 ? BPF_JMP : BPF_JMP32) | 19955 + BPF_JEQ | BPF_K, BPF_REG_AX, 19956 + 0, 1, 0), 19957 + BPF_RAW_INSN((is64 ? BPF_ALU64 : BPF_ALU) | 19958 + BPF_MOV | BPF_K, insn->dst_reg, 19959 + 0, 0, 0), 19960 + /* BPF_NEG(LLONG_MIN) == -LLONG_MIN == LLONG_MIN */ 19961 + BPF_RAW_INSN((is64 ? BPF_ALU64 : BPF_ALU) | 19962 + BPF_NEG | BPF_K, insn->dst_reg, 19963 + 0, 0, 0), 19964 + BPF_JMP_IMM(BPF_JA, 0, 0, 1), 19965 + *insn, 19966 + }; 19967 + struct bpf_insn chk_and_smod[] = { 19968 + /* [R,W]x mod 0 -> [R,W]x */ 19969 + /* [R,W]x mod -1 -> 0 */ 19970 + BPF_MOV64_REG(BPF_REG_AX, insn->src_reg), 19971 + BPF_RAW_INSN((is64 ? BPF_ALU64 : BPF_ALU) | 19972 + BPF_ADD | BPF_K, BPF_REG_AX, 19973 + 0, 0, 1), 19974 + BPF_RAW_INSN((is64 ? BPF_JMP : BPF_JMP32) | 19975 + BPF_JGT | BPF_K, BPF_REG_AX, 19976 + 0, 3, 1), 19977 + BPF_RAW_INSN((is64 ? 
BPF_JMP : BPF_JMP32) | 19978 + BPF_JEQ | BPF_K, BPF_REG_AX, 19979 + 0, 3 + (is64 ? 0 : 1), 1), 19980 + BPF_MOV32_IMM(insn->dst_reg, 0), 19981 + BPF_JMP_IMM(BPF_JA, 0, 0, 1), 19982 + *insn, 19983 + BPF_JMP_IMM(BPF_JA, 0, 0, 1), 19984 + BPF_MOV32_REG(insn->dst_reg, insn->dst_reg), 19985 + }; 20538 19986 20539 - patchlet = isdiv ? chk_and_div : chk_and_mod; 20540 - cnt = isdiv ? ARRAY_SIZE(chk_and_div) : 20541 - ARRAY_SIZE(chk_and_mod) - (is64 ? 2 : 0); 19987 + if (is_sdiv) { 19988 + patchlet = chk_and_sdiv; 19989 + cnt = ARRAY_SIZE(chk_and_sdiv); 19990 + } else if (is_smod) { 19991 + patchlet = chk_and_smod; 19992 + cnt = ARRAY_SIZE(chk_and_smod) - (is64 ? 2 : 0); 19993 + } else { 19994 + patchlet = isdiv ? chk_and_div : chk_and_mod; 19995 + cnt = isdiv ? ARRAY_SIZE(chk_and_div) : 19996 + ARRAY_SIZE(chk_and_mod) - (is64 ? 2 : 0); 19997 + } 20542 19998 20543 19999 new_prog = bpf_patch_insn_data(env, i + delta, patchlet, cnt); 20544 20000 if (!new_prog) ··· 20637 19989 (BPF_MODE(insn->code) == BPF_ABS || 20638 19990 BPF_MODE(insn->code) == BPF_IND)) { 20639 19991 cnt = env->ops->gen_ld_abs(insn, insn_buf); 20640 - if (cnt == 0 || cnt >= ARRAY_SIZE(insn_buf)) { 19992 + if (cnt == 0 || cnt >= INSN_BUF_SIZE) { 20641 19993 verbose(env, "bpf verifier is misconfigured\n"); 20642 19994 return -EINVAL; 20643 19995 } ··· 20930 20282 cnt = ops->map_gen_lookup(map_ptr, insn_buf); 20931 20283 if (cnt == -EOPNOTSUPP) 20932 20284 goto patch_map_ops_generic; 20933 - if (cnt <= 0 || cnt >= ARRAY_SIZE(insn_buf)) { 20285 + if (cnt <= 0 || cnt >= INSN_BUF_SIZE) { 20934 20286 verbose(env, "bpf verifier is misconfigured\n"); 20935 20287 return -EINVAL; 20936 20288 } ··· 21032 20384 #if defined(CONFIG_X86_64) && !defined(CONFIG_UML) 21033 20385 /* Implement bpf_get_smp_processor_id() inline. 
*/ 21034 20386 if (insn->imm == BPF_FUNC_get_smp_processor_id && 21035 - prog->jit_requested && bpf_jit_supports_percpu_insn()) { 20387 + verifier_inlines_helper_call(env, insn->imm)) { 21036 20388 /* BPF_FUNC_get_smp_processor_id inlining is an 21037 20389 * optimization, so if pcpu_hot.cpu_number is ever 21038 20390 * changed in some incompatible and hard to support ··· 21290 20642 int position, 21291 20643 s32 stack_base, 21292 20644 u32 callback_subprogno, 21293 - u32 *cnt) 20645 + u32 *total_cnt) 21294 20646 { 21295 20647 s32 r6_offset = stack_base + 0 * BPF_REG_SIZE; 21296 20648 s32 r7_offset = stack_base + 1 * BPF_REG_SIZE; ··· 21299 20651 int reg_loop_cnt = BPF_REG_7; 21300 20652 int reg_loop_ctx = BPF_REG_8; 21301 20653 20654 + struct bpf_insn *insn_buf = env->insn_buf; 21302 20655 struct bpf_prog *new_prog; 21303 20656 u32 callback_start; 21304 20657 u32 call_insn_offset; 21305 20658 s32 callback_offset; 20659 + u32 cnt = 0; 21306 20660 21307 20661 /* This represents an inlined version of bpf_iter.c:bpf_loop, 21308 20662 * be careful to modify this code in sync. 21309 20663 */ 21310 - struct bpf_insn insn_buf[] = { 21311 - /* Return error and jump to the end of the patch if 21312 - * expected number of iterations is too big. 
21313 - */ 21314 - BPF_JMP_IMM(BPF_JLE, BPF_REG_1, BPF_MAX_LOOPS, 2), 21315 - BPF_MOV32_IMM(BPF_REG_0, -E2BIG), 21316 - BPF_JMP_IMM(BPF_JA, 0, 0, 16), 21317 - /* spill R6, R7, R8 to use these as loop vars */ 21318 - BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_6, r6_offset), 21319 - BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_7, r7_offset), 21320 - BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_8, r8_offset), 21321 - /* initialize loop vars */ 21322 - BPF_MOV64_REG(reg_loop_max, BPF_REG_1), 21323 - BPF_MOV32_IMM(reg_loop_cnt, 0), 21324 - BPF_MOV64_REG(reg_loop_ctx, BPF_REG_3), 21325 - /* loop header, 21326 - * if reg_loop_cnt >= reg_loop_max skip the loop body 21327 - */ 21328 - BPF_JMP_REG(BPF_JGE, reg_loop_cnt, reg_loop_max, 5), 21329 - /* callback call, 21330 - * correct callback offset would be set after patching 21331 - */ 21332 - BPF_MOV64_REG(BPF_REG_1, reg_loop_cnt), 21333 - BPF_MOV64_REG(BPF_REG_2, reg_loop_ctx), 21334 - BPF_CALL_REL(0), 21335 - /* increment loop counter */ 21336 - BPF_ALU64_IMM(BPF_ADD, reg_loop_cnt, 1), 21337 - /* jump to loop header if callback returned 0 */ 21338 - BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, -6), 21339 - /* return value of bpf_loop, 21340 - * set R0 to the number of iterations 21341 - */ 21342 - BPF_MOV64_REG(BPF_REG_0, reg_loop_cnt), 21343 - /* restore original values of R6, R7, R8 */ 21344 - BPF_LDX_MEM(BPF_DW, BPF_REG_6, BPF_REG_10, r6_offset), 21345 - BPF_LDX_MEM(BPF_DW, BPF_REG_7, BPF_REG_10, r7_offset), 21346 - BPF_LDX_MEM(BPF_DW, BPF_REG_8, BPF_REG_10, r8_offset), 21347 - }; 21348 20664 21349 - *cnt = ARRAY_SIZE(insn_buf); 21350 - new_prog = bpf_patch_insn_data(env, position, insn_buf, *cnt); 20665 + /* Return error and jump to the end of the patch if 20666 + * expected number of iterations is too big. 
20667 + */ 20668 + insn_buf[cnt++] = BPF_JMP_IMM(BPF_JLE, BPF_REG_1, BPF_MAX_LOOPS, 2); 20669 + insn_buf[cnt++] = BPF_MOV32_IMM(BPF_REG_0, -E2BIG); 20670 + insn_buf[cnt++] = BPF_JMP_IMM(BPF_JA, 0, 0, 16); 20671 + /* spill R6, R7, R8 to use these as loop vars */ 20672 + insn_buf[cnt++] = BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_6, r6_offset); 20673 + insn_buf[cnt++] = BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_7, r7_offset); 20674 + insn_buf[cnt++] = BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_8, r8_offset); 20675 + /* initialize loop vars */ 20676 + insn_buf[cnt++] = BPF_MOV64_REG(reg_loop_max, BPF_REG_1); 20677 + insn_buf[cnt++] = BPF_MOV32_IMM(reg_loop_cnt, 0); 20678 + insn_buf[cnt++] = BPF_MOV64_REG(reg_loop_ctx, BPF_REG_3); 20679 + /* loop header, 20680 + * if reg_loop_cnt >= reg_loop_max skip the loop body 20681 + */ 20682 + insn_buf[cnt++] = BPF_JMP_REG(BPF_JGE, reg_loop_cnt, reg_loop_max, 5); 20683 + /* callback call, 20684 + * correct callback offset would be set after patching 20685 + */ 20686 + insn_buf[cnt++] = BPF_MOV64_REG(BPF_REG_1, reg_loop_cnt); 20687 + insn_buf[cnt++] = BPF_MOV64_REG(BPF_REG_2, reg_loop_ctx); 20688 + insn_buf[cnt++] = BPF_CALL_REL(0); 20689 + /* increment loop counter */ 20690 + insn_buf[cnt++] = BPF_ALU64_IMM(BPF_ADD, reg_loop_cnt, 1); 20691 + /* jump to loop header if callback returned 0 */ 20692 + insn_buf[cnt++] = BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, -6); 20693 + /* return value of bpf_loop, 20694 + * set R0 to the number of iterations 20695 + */ 20696 + insn_buf[cnt++] = BPF_MOV64_REG(BPF_REG_0, reg_loop_cnt); 20697 + /* restore original values of R6, R7, R8 */ 20698 + insn_buf[cnt++] = BPF_LDX_MEM(BPF_DW, BPF_REG_6, BPF_REG_10, r6_offset); 20699 + insn_buf[cnt++] = BPF_LDX_MEM(BPF_DW, BPF_REG_7, BPF_REG_10, r7_offset); 20700 + insn_buf[cnt++] = BPF_LDX_MEM(BPF_DW, BPF_REG_8, BPF_REG_10, r8_offset); 20701 + 20702 + *total_cnt = cnt; 20703 + new_prog = bpf_patch_insn_data(env, position, insn_buf, cnt); 21351 20704 if (!new_prog) 21352 
20705 return new_prog; 21353 20706 ··· 21419 20770 } 21420 20771 21421 20772 env->prog->aux->stack_depth = env->subprog_info[0].stack_depth; 20773 + 20774 + return 0; 20775 + } 20776 + 20777 + /* Remove unnecessary spill/fill pairs, members of fastcall pattern, 20778 + * adjust subprograms stack depth when possible. 20779 + */ 20780 + static int remove_fastcall_spills_fills(struct bpf_verifier_env *env) 20781 + { 20782 + struct bpf_subprog_info *subprog = env->subprog_info; 20783 + struct bpf_insn_aux_data *aux = env->insn_aux_data; 20784 + struct bpf_insn *insn = env->prog->insnsi; 20785 + int insn_cnt = env->prog->len; 20786 + u32 spills_num; 20787 + bool modified = false; 20788 + int i, j; 20789 + 20790 + for (i = 0; i < insn_cnt; i++, insn++) { 20791 + if (aux[i].fastcall_spills_num > 0) { 20792 + spills_num = aux[i].fastcall_spills_num; 20793 + /* NOPs would be removed by opt_remove_nops() */ 20794 + for (j = 1; j <= spills_num; ++j) { 20795 + *(insn - j) = NOP; 20796 + *(insn + j) = NOP; 20797 + } 20798 + modified = true; 20799 + } 20800 + if ((subprog + 1)->start == i + 1) { 20801 + if (modified && !subprog->keep_fastcall_stack) 20802 + subprog->stack_depth = -subprog->fastcall_stack_off; 20803 + subprog++; 20804 + modified = false; 20805 + } 20806 + } 21422 20807 21423 20808 return 0; 21424 20809 } ··· 21730 21047 u32 btf_id, member_idx; 21731 21048 struct btf *btf; 21732 21049 const char *mname; 21050 + int err; 21733 21051 21734 21052 if (!prog->gpl_compatible) { 21735 21053 verbose(env, "struct ops programs must have a GPL compatible license\n"); ··· 21778 21094 return -EINVAL; 21779 21095 } 21780 21096 21097 + err = bpf_struct_ops_supported(st_ops, __btf_member_bit_offset(t, member) / 8); 21098 + if (err) { 21099 + verbose(env, "attach to unsupported member %s of struct %s\n", 21100 + mname, st_ops->name); 21101 + return err; 21102 + } 21103 + 21781 21104 if (st_ops->check_member) { 21782 - int err = st_ops->check_member(t, member, prog); 21105 + err = 
st_ops->check_member(t, member, prog); 21783 21106 21784 21107 if (err) { 21785 21108 verbose(env, "attach to unsupported member %s of struct %s\n", ··· 22397 21706 if (ret < 0) 22398 21707 goto skip_full_check; 22399 21708 21709 + ret = mark_fastcall_patterns(env); 21710 + if (ret < 0) 21711 + goto skip_full_check; 21712 + 22400 21713 ret = do_check_main(env); 22401 21714 ret = ret ?: do_check_subprogs(env); 22402 21715 ··· 22409 21714 22410 21715 skip_full_check: 22411 21716 kvfree(env->explored_states); 21717 + 21718 + /* might decrease stack depth, keep it before passes that 21719 + * allocate additional slots. 21720 + */ 21721 + if (ret == 0) 21722 + ret = remove_fastcall_spills_fills(env); 22412 21723 22413 21724 if (ret == 0) 22414 21725 ret = check_max_stack_depth(env);
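Note on the sdiv/smod patchlets in the verifier hunks above: the rewritten instruction sequences are easier to check as plain arithmetic. What follows is a minimal user-space sketch of the 64-bit results they appear to produce, read off the comments in `chk_and_sdiv` and `chk_and_smod`. `model_sdiv64` and `model_smod64` are hypothetical names used only for illustration, not kernel functions, and the fall-through case assumes that C's truncating division matches the BPF instruction.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical user-space model of the patched 64-bit sdiv result.
 * Not kernel code; names and structure are illustrative only.
 */
static int64_t model_sdiv64(int64_t dst, int64_t src)
{
	if (src == 0)
		return 0;	/* Rx sdiv 0 -> 0 */
	if (src == -1)
		/* Negate through unsigned arithmetic so that LLONG_MIN
		 * wraps back to LLONG_MIN instead of overflowing. The
		 * conversion back to int64_t assumes two's complement.
		 */
		return (int64_t)(0 - (uint64_t)dst);
	return dst / src;	/* assumed to match the unpatched insn */
}

/* Hypothetical user-space model of the patched 64-bit smod result. */
static int64_t model_smod64(int64_t dst, int64_t src)
{
	if (src == 0)
		return dst;	/* Rx mod 0 -> Rx */
	if (src == -1)
		return 0;	/* Rx mod -1 -> 0 */
	return dst % src;	/* assumed to match the unpatched insn */
}
```

The two special cases are the ones where a native signed divide could trap: a zero divisor and `LLONG_MIN / -1`. The patchlet tests both with a single `src + 1` followed by an unsigned compare against 1, which the sketch spells out as two separate branches for readability.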

+1 -1

kernel/events/core.c

···
8964 8964 mmap_event->event_id.header.size = sizeof(mmap_event->event_id) + size;
8965 8965
8966 8966 if (atomic_read(&nr_build_id_events))
8967 - build_id_parse(vma, mmap_event->build_id, &mmap_event->build_id_size);
8967 + build_id_parse_nofault(vma, mmap_event->build_id, &mmap_event->build_id_size);
8968 8968
8969 8969 perf_iterate_sb(perf_event_mmap_output,
8970 8970 mmap_event,

+10 -98

kernel/trace/bpf_trace.c

··· 24 24 #include <linux/key.h> 25 25 #include <linux/verification.h> 26 26 #include <linux/namei.h> 27 - #include <linux/fileattr.h> 28 27 29 28 #include <net/bpf_sk_storage.h> 30 29 ··· 797 798 .ret_btf_id = &bpf_task_pt_regs_ids[0], 798 799 }; 799 800 800 - BPF_CALL_2(bpf_current_task_under_cgroup, struct bpf_map *, map, u32, idx) 801 - { 802 - struct bpf_array *array = container_of(map, struct bpf_array, map); 803 - struct cgroup *cgrp; 804 - 805 - if (unlikely(idx >= array->map.max_entries)) 806 - return -E2BIG; 807 - 808 - cgrp = READ_ONCE(array->ptrs[idx]); 809 - if (unlikely(!cgrp)) 810 - return -EAGAIN; 811 - 812 - return task_under_cgroup_hierarchy(current, cgrp); 813 - } 814 - 815 - static const struct bpf_func_proto bpf_current_task_under_cgroup_proto = { 816 - .func = bpf_current_task_under_cgroup, 817 - .gpl_only = false, 818 - .ret_type = RET_INTEGER, 819 - .arg1_type = ARG_CONST_MAP_PTR, 820 - .arg2_type = ARG_ANYTHING, 821 - }; 822 - 823 801 struct send_signal_irq_work { 824 802 struct irq_work irq_work; 825 803 struct task_struct *task; ··· 1202 1226 .ret_type = RET_INTEGER, 1203 1227 .arg1_type = ARG_PTR_TO_CTX, 1204 1228 .arg2_type = ARG_ANYTHING, 1205 - .arg3_type = ARG_PTR_TO_LONG, 1229 + .arg3_type = ARG_PTR_TO_FIXED_SIZE_MEM | MEM_UNINIT | MEM_ALIGNED, 1230 + .arg3_size = sizeof(u64), 1206 1231 }; 1207 1232 1208 1233 BPF_CALL_2(get_func_ret, void *, ctx, u64 *, value) ··· 1219 1242 .func = get_func_ret, 1220 1243 .ret_type = RET_INTEGER, 1221 1244 .arg1_type = ARG_PTR_TO_CTX, 1222 - .arg2_type = ARG_PTR_TO_LONG, 1245 + .arg2_type = ARG_PTR_TO_FIXED_SIZE_MEM | MEM_UNINIT | MEM_ALIGNED, 1246 + .arg2_size = sizeof(u64), 1223 1247 }; 1224 1248 1225 1249 BPF_CALL_1(get_func_arg_cnt, void *, ctx) ··· 1417 1439 late_initcall(bpf_key_sig_kfuncs_init); 1418 1440 #endif /* CONFIG_KEYS */ 1419 1441 1420 - /* filesystem kfuncs */ 1421 - __bpf_kfunc_start_defs(); 1422 - 1423 - /** 1424 - * bpf_get_file_xattr - get xattr of a file 1425 - * @file: file to 
get xattr from 1426 - * @name__str: name of the xattr 1427 - * @value_p: output buffer of the xattr value 1428 - * 1429 - * Get xattr *name__str* of *file* and store the output in *value_ptr*. 1430 - * 1431 - * For security reasons, only *name__str* with prefix "user." is allowed. 1432 - * 1433 - * Return: 0 on success, a negative value on error. 1434 - */ 1435 - __bpf_kfunc int bpf_get_file_xattr(struct file *file, const char *name__str, 1436 - struct bpf_dynptr *value_p) 1437 - { 1438 - struct bpf_dynptr_kern *value_ptr = (struct bpf_dynptr_kern *)value_p; 1439 - struct dentry *dentry; 1440 - u32 value_len; 1441 - void *value; 1442 - int ret; 1443 - 1444 - if (strncmp(name__str, XATTR_USER_PREFIX, XATTR_USER_PREFIX_LEN)) 1445 - return -EPERM; 1446 - 1447 - value_len = __bpf_dynptr_size(value_ptr); 1448 - value = __bpf_dynptr_data_rw(value_ptr, value_len); 1449 - if (!value) 1450 - return -EINVAL; 1451 - 1452 - dentry = file_dentry(file); 1453 - ret = inode_permission(&nop_mnt_idmap, dentry->d_inode, MAY_READ); 1454 - if (ret) 1455 - return ret; 1456 - return __vfs_getxattr(dentry, dentry->d_inode, name__str, value, value_len); 1457 - } 1458 - 1459 - __bpf_kfunc_end_defs(); 1460 - 1461 - BTF_KFUNCS_START(fs_kfunc_set_ids) 1462 - BTF_ID_FLAGS(func, bpf_get_file_xattr, KF_SLEEPABLE | KF_TRUSTED_ARGS) 1463 - BTF_KFUNCS_END(fs_kfunc_set_ids) 1464 - 1465 - static int bpf_get_file_xattr_filter(const struct bpf_prog *prog, u32 kfunc_id) 1466 - { 1467 - if (!btf_id_set8_contains(&fs_kfunc_set_ids, kfunc_id)) 1468 - return 0; 1469 - 1470 - /* Only allow to attach from LSM hooks, to avoid recursion */ 1471 - return prog->type != BPF_PROG_TYPE_LSM ? 
-EACCES : 0; 1472 - } 1473 - 1474 - static const struct btf_kfunc_id_set bpf_fs_kfunc_set = { 1475 - .owner = THIS_MODULE, 1476 - .set = &fs_kfunc_set_ids, 1477 - .filter = bpf_get_file_xattr_filter, 1478 - }; 1479 - 1480 - static int __init bpf_fs_kfuncs_init(void) 1481 - { 1482 - return register_btf_kfunc_id_set(BPF_PROG_TYPE_LSM, &bpf_fs_kfunc_set); 1483 - } 1484 - 1485 - late_initcall(bpf_fs_kfuncs_init); 1486 - 1487 1442 static const struct bpf_func_proto * 1488 1443 bpf_tracing_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) 1489 1444 { ··· 1459 1548 return &bpf_get_numa_node_id_proto; 1460 1549 case BPF_FUNC_perf_event_read: 1461 1550 return &bpf_perf_event_read_proto; 1462 - case BPF_FUNC_current_task_under_cgroup: 1463 - return &bpf_current_task_under_cgroup_proto; 1464 1551 case BPF_FUNC_get_prandom_u32: 1465 1552 return &bpf_get_prandom_u32_proto; 1466 1553 case BPF_FUNC_probe_write_user: ··· 1487 1578 return &bpf_cgrp_storage_get_proto; 1488 1579 case BPF_FUNC_cgrp_storage_delete: 1489 1580 return &bpf_cgrp_storage_delete_proto; 1581 + case BPF_FUNC_current_task_under_cgroup: 1582 + return &bpf_current_task_under_cgroup_proto; 1490 1583 #endif 1491 1584 case BPF_FUNC_send_signal: 1492 1585 return &bpf_send_signal_proto; ··· 1509 1598 case BPF_FUNC_jiffies64: 1510 1599 return &bpf_jiffies64_proto; 1511 1600 case BPF_FUNC_get_task_stack: 1512 - return &bpf_get_task_stack_proto; 1601 + return prog->sleepable ? &bpf_get_task_stack_sleepable_proto 1602 + : &bpf_get_task_stack_proto; 1513 1603 case BPF_FUNC_copy_from_user: 1514 1604 return &bpf_copy_from_user_proto; 1515 1605 case BPF_FUNC_copy_from_user_task: ··· 1566 1654 case BPF_FUNC_get_stackid: 1567 1655 return &bpf_get_stackid_proto; 1568 1656 case BPF_FUNC_get_stack: 1569 - return &bpf_get_stack_proto; 1657 + return prog->sleepable ? 
&bpf_get_stack_sleepable_proto : &bpf_get_stack_proto; 1570 1658 #ifdef CONFIG_BPF_KPROBE_OVERRIDE 1571 1659 case BPF_FUNC_override_return: 1572 1660 return &bpf_override_return_proto; ··· 3211 3299 struct bpf_run_ctx *old_run_ctx; 3212 3300 int err = 0; 3213 3301 3214 - if (link->task && current->mm != link->task->mm) 3302 + if (link->task && !same_thread_group(current, link->task)) 3215 3303 return 0; 3216 3304 3217 3305 if (sleepable)

+8 -4

kernel/trace/trace_syscalls.c

···
564 564 BUILD_BUG_ON(sizeof(param.ent) < sizeof(void *));
565 565
566 566 /* bpf prog requires 'regs' to be the first member in the ctx (a.k.a. &param) */
567 + perf_fetch_caller_regs(regs);
567 568 *(struct pt_regs **)&param = regs;
568 569 param.syscall_nr = rec->nr;
569 570 for (i = 0; i < sys_data->nb_args; i++)
···
576 575 {
577 576 struct syscall_metadata *sys_data;
578 577 struct syscall_trace_enter *rec;
578 + struct pt_regs *fake_regs;
579 579 struct hlist_head *head;
580 580 unsigned long args[6];
581 581 bool valid_prog_array;
···
604 602 size = ALIGN(size + sizeof(u32), sizeof(u64));
605 603 size -= sizeof(u32);
606 604
607 - rec = perf_trace_buf_alloc(size, NULL, &rctx);
605 + rec = perf_trace_buf_alloc(size, &fake_regs, &rctx);
608 606 if (!rec)
609 607 return;
610 608
···
613 611 memcpy(&rec->args, args, sizeof(unsigned long) * sys_data->nb_args);
614 612
615 613 if ((valid_prog_array &&
616 - !perf_call_bpf_enter(sys_data->enter_event, regs, sys_data, rec)) ||
614 + !perf_call_bpf_enter(sys_data->enter_event, fake_regs, sys_data, rec)) ||
617 615 hlist_empty(head)) {
618 616 perf_swevent_put_recursion_context(rctx);
619 617 return;
···
668 666 } __aligned(8) param;
669 667
670 668 /* bpf prog requires 'regs' to be the first member in the ctx (a.k.a. &param) */
669 + perf_fetch_caller_regs(regs);
671 670 *(struct pt_regs **)&param = regs;
672 671 param.syscall_nr = rec->nr;
673 672 param.ret = rec->ret;
···
679 676 {
680 677 struct syscall_metadata *sys_data;
681 678 struct syscall_trace_exit *rec;
679 + struct pt_regs *fake_regs;
682 680 struct hlist_head *head;
683 681 bool valid_prog_array;
684 682 int syscall_nr;
···
705 701 size = ALIGN(sizeof(*rec) + sizeof(u32), sizeof(u64));
706 702 size -= sizeof(u32);
707 703
708 - rec = perf_trace_buf_alloc(size, NULL, &rctx);
704 + rec = perf_trace_buf_alloc(size, &fake_regs, &rctx);
709 705 if (!rec)
710 706 return;
711 707
···
713 709 rec->ret = syscall_get_return_value(current, regs);
714 710
715 711 if ((valid_prog_array &&
716 - !perf_call_bpf_exit(sys_data->exit_event, regs, rec)) ||
712 + !perf_call_bpf_exit(sys_data->exit_event, fake_regs, rec)) ||
717 713 hlist_empty(head)) {
718 714 perf_swevent_put_recursion_context(rctx);
719 715 return;

+5 -3

lib/Kconfig.debug

··· 379 379 depends on !DEBUG_INFO_SPLIT && !DEBUG_INFO_REDUCED 380 380 depends on !GCC_PLUGIN_RANDSTRUCT || COMPILE_TEST 381 381 depends on BPF_SYSCALL 382 - depends on !DEBUG_INFO_DWARF5 || PAHOLE_VERSION >= 121 382 + depends on PAHOLE_VERSION >= 116 383 + depends on DEBUG_INFO_DWARF4 || PAHOLE_VERSION >= 121 383 384 # pahole uses elfutils, which does not have support for Hexagon relocations 384 385 depends on !HEXAGON 385 386 help 386 387 Generate deduplicated BTF type information from DWARF debug info. 387 - Turning this on expects presence of pahole tool, which will convert 388 - DWARF type info into equivalent deduplicated BTF type info. 388 + Turning this on requires pahole v1.16 or later (v1.21 or later to 389 + support DWARF 5), which will convert DWARF type info into equivalent 390 + deduplicated BTF type info. 389 391 390 392 config PAHOLE_HAS_SPLIT_BTF 391 393 def_bool PAHOLE_VERSION >= 119

+292 -105

lib/buildid.c

··· 8 8 9 9 #define BUILD_ID 3 10 10 11 + #define MAX_PHDR_CNT 256 12 + 13 + struct freader { 14 + void *buf; 15 + u32 buf_sz; 16 + int err; 17 + union { 18 + struct { 19 + struct file *file; 20 + struct folio *folio; 21 + void *addr; 22 + loff_t folio_off; 23 + bool may_fault; 24 + }; 25 + struct { 26 + const char *data; 27 + u64 data_sz; 28 + }; 29 + }; 30 + }; 31 + 32 + static void freader_init_from_file(struct freader *r, void *buf, u32 buf_sz, 33 + struct file *file, bool may_fault) 34 + { 35 + memset(r, 0, sizeof(*r)); 36 + r->buf = buf; 37 + r->buf_sz = buf_sz; 38 + r->file = file; 39 + r->may_fault = may_fault; 40 + } 41 + 42 + static void freader_init_from_mem(struct freader *r, const char *data, u64 data_sz) 43 + { 44 + memset(r, 0, sizeof(*r)); 45 + r->data = data; 46 + r->data_sz = data_sz; 47 + } 48 + 49 + static void freader_put_folio(struct freader *r) 50 + { 51 + if (!r->folio) 52 + return; 53 + kunmap_local(r->addr); 54 + folio_put(r->folio); 55 + r->folio = NULL; 56 + } 57 + 58 + static int freader_get_folio(struct freader *r, loff_t file_off) 59 + { 60 + /* check if we can just reuse current folio */ 61 + if (r->folio && file_off >= r->folio_off && 62 + file_off < r->folio_off + folio_size(r->folio)) 63 + return 0; 64 + 65 + freader_put_folio(r); 66 + 67 + r->folio = filemap_get_folio(r->file->f_mapping, file_off >> PAGE_SHIFT); 68 + 69 + /* if sleeping is allowed, wait for the page, if necessary */ 70 + if (r->may_fault && (IS_ERR(r->folio) || !folio_test_uptodate(r->folio))) { 71 + filemap_invalidate_lock_shared(r->file->f_mapping); 72 + r->folio = read_cache_folio(r->file->f_mapping, file_off >> PAGE_SHIFT, 73 + NULL, r->file); 74 + filemap_invalidate_unlock_shared(r->file->f_mapping); 75 + } 76 + 77 + if (IS_ERR(r->folio) || !folio_test_uptodate(r->folio)) { 78 + if (!IS_ERR(r->folio)) 79 + folio_put(r->folio); 80 + r->folio = NULL; 81 + return -EFAULT; 82 + } 83 + 84 + r->folio_off = folio_pos(r->folio); 85 + r->addr = 
kmap_local_folio(r->folio, 0); 86 + 87 + return 0; 88 + } 89 + 90 + static const void *freader_fetch(struct freader *r, loff_t file_off, size_t sz) 91 + { 92 + size_t folio_sz; 93 + 94 + /* provided internal temporary buffer should be sized correctly */ 95 + if (WARN_ON(r->buf && sz > r->buf_sz)) { 96 + r->err = -E2BIG; 97 + return NULL; 98 + } 99 + 100 + if (unlikely(file_off + sz < file_off)) { 101 + r->err = -EOVERFLOW; 102 + return NULL; 103 + } 104 + 105 + /* working with memory buffer is much more straightforward */ 106 + if (!r->buf) { 107 + if (file_off + sz > r->data_sz) { 108 + r->err = -ERANGE; 109 + return NULL; 110 + } 111 + return r->data + file_off; 112 + } 113 + 114 + /* fetch or reuse folio for given file offset */ 115 + r->err = freader_get_folio(r, file_off); 116 + if (r->err) 117 + return NULL; 118 + 119 + /* if requested data is crossing folio boundaries, we have to copy 120 + * everything into our local buffer to keep a simple linear memory 121 + * access interface 122 + */ 123 + folio_sz = folio_size(r->folio); 124 + if (file_off + sz > r->folio_off + folio_sz) { 125 + int part_sz = r->folio_off + folio_sz - file_off; 126 + 127 + /* copy the part that resides in the current folio */ 128 + memcpy(r->buf, r->addr + (file_off - r->folio_off), part_sz); 129 + 130 + /* fetch next folio */ 131 + r->err = freader_get_folio(r, r->folio_off + folio_sz); 132 + if (r->err) 133 + return NULL; 134 + 135 + /* copy the rest of requested data */ 136 + memcpy(r->buf + part_sz, r->addr, sz - part_sz); 137 + 138 + return r->buf; 139 + } 140 + 141 + /* if data fits in a single folio, just return direct pointer */ 142 + return r->addr + (file_off - r->folio_off); 143 + } 144 + 145 + static void freader_cleanup(struct freader *r) 146 + { 147 + if (!r->buf) 148 + return; /* non-file-backed mode */ 149 + 150 + freader_put_folio(r); 151 + } 152 + 11 153 /* 12 154 * Parse build id from the note segment. 
This logic can be shared between 13 155 * 32-bit and 64-bit system, because Elf32_Nhdr and Elf64_Nhdr are 14 156 * identical. 15 157 */ 16 - static int parse_build_id_buf(unsigned char *build_id, 17 - __u32 *size, 18 - const void *note_start, 19 - Elf32_Word note_size) 158 + static int parse_build_id(struct freader *r, unsigned char *build_id, __u32 *size, 159 + loff_t note_off, Elf32_Word note_size) 20 160 { 21 - Elf32_Word note_offs = 0, new_offs; 161 + const char note_name[] = "GNU"; 162 + const size_t note_name_sz = sizeof(note_name); 163 + u32 build_id_off, new_off, note_end, name_sz, desc_sz; 164 + const Elf32_Nhdr *nhdr; 165 + const char *data; 22 166 23 - while (note_offs + sizeof(Elf32_Nhdr) < note_size) { 24 - Elf32_Nhdr *nhdr = (Elf32_Nhdr *)(note_start + note_offs); 167 + if (check_add_overflow(note_off, note_size, &note_end)) 168 + return -EINVAL; 169 + 170 + while (note_end - note_off > sizeof(Elf32_Nhdr) + note_name_sz) { 171 + nhdr = freader_fetch(r, note_off, sizeof(Elf32_Nhdr) + note_name_sz); 172 + if (!nhdr) 173 + return r->err; 174 + 175 + name_sz = READ_ONCE(nhdr->n_namesz); 176 + desc_sz = READ_ONCE(nhdr->n_descsz); 177 + 178 + new_off = note_off + sizeof(Elf32_Nhdr); 179 + if (check_add_overflow(new_off, ALIGN(name_sz, 4), &new_off) || 180 + check_add_overflow(new_off, ALIGN(desc_sz, 4), &new_off) || 181 + new_off > note_end) 182 + break; 25 183 26 184 if (nhdr->n_type == BUILD_ID && 27 - nhdr->n_namesz == sizeof("GNU") && 28 - !strcmp((char *)(nhdr + 1), "GNU") && 29 - nhdr->n_descsz > 0 && 30 - nhdr->n_descsz <= BUILD_ID_SIZE_MAX) { 31 - memcpy(build_id, 32 - note_start + note_offs + 33 - ALIGN(sizeof("GNU"), 4) + sizeof(Elf32_Nhdr), 34 - nhdr->n_descsz); 35 - memset(build_id + nhdr->n_descsz, 0, 36 - BUILD_ID_SIZE_MAX - nhdr->n_descsz); 185 + name_sz == note_name_sz && 186 + memcmp(nhdr + 1, note_name, note_name_sz) == 0 && 187 + desc_sz > 0 && desc_sz <= BUILD_ID_SIZE_MAX) { 188 + build_id_off = note_off + sizeof(Elf32_Nhdr) + 
ALIGN(note_name_sz, 4); 189 + 190 + /* freader_fetch() will invalidate nhdr pointer */ 191 + data = freader_fetch(r, build_id_off, desc_sz); 192 + if (!data) 193 + return r->err; 194 + 195 + memcpy(build_id, data, desc_sz); 196 + memset(build_id + desc_sz, 0, BUILD_ID_SIZE_MAX - desc_sz); 37 197 if (size) 38 - *size = nhdr->n_descsz; 198 + *size = desc_sz; 39 199 return 0; 40 200 } 41 - new_offs = note_offs + sizeof(Elf32_Nhdr) + 42 - ALIGN(nhdr->n_namesz, 4) + ALIGN(nhdr->n_descsz, 4); 43 - if (new_offs <= note_offs) /* overflow */ 44 - break; 45 - note_offs = new_offs; 201 + 202 + note_off = new_off; 46 203 } 47 204 48 205 return -EINVAL; 49 206 } 50 207 51 - static inline int parse_build_id(const void *page_addr, 52 - unsigned char *build_id, 53 - __u32 *size, 54 - const void *note_start, 55 - Elf32_Word note_size) 56 - { 57 - /* check for overflow */ 58 - if (note_start < page_addr || note_start + note_size < note_start) 59 - return -EINVAL; 60 - 61 - /* only supports note that fits in the first page */ 62 - if (note_start + note_size > page_addr + PAGE_SIZE) 63 - return -EINVAL; 64 - 65 - return parse_build_id_buf(build_id, size, note_start, note_size); 66 - } 67 - 68 208 /* Parse build ID from 32-bit ELF */ 69 - static int get_build_id_32(const void *page_addr, unsigned char *build_id, 70 - __u32 *size) 209 + static int get_build_id_32(struct freader *r, unsigned char *build_id, __u32 *size) 71 210 { 72 - Elf32_Ehdr *ehdr = (Elf32_Ehdr *)page_addr; 73 - Elf32_Phdr *phdr; 74 - int i; 211 + const Elf32_Ehdr *ehdr; 212 + const Elf32_Phdr *phdr; 213 + __u32 phnum, phoff, i; 75 214 76 - /* 77 - * FIXME 78 - * Neither ELF spec nor ELF loader require that program headers 79 - * start immediately after ELF header. 
80 - */ 81 - if (ehdr->e_phoff != sizeof(Elf32_Ehdr)) 215 + ehdr = freader_fetch(r, 0, sizeof(Elf32_Ehdr)); 216 + if (!ehdr) 217 + return r->err; 218 + 219 + /* subsequent freader_fetch() calls invalidate pointers, so remember locally */ 220 + phnum = READ_ONCE(ehdr->e_phnum); 221 + phoff = READ_ONCE(ehdr->e_phoff); 222 + 223 + /* set upper bound on amount of segments (phdrs) we iterate */ 224 + if (phnum > MAX_PHDR_CNT) 225 + phnum = MAX_PHDR_CNT; 226 + 227 + /* check that phoff is not large enough to cause an overflow */ 228 + if (phoff + phnum * sizeof(Elf32_Phdr) < phoff) 82 229 return -EINVAL; 83 - /* only supports phdr that fits in one page */ 84 - if (ehdr->e_phnum > 85 - (PAGE_SIZE - sizeof(Elf32_Ehdr)) / sizeof(Elf32_Phdr)) 86 - return -EINVAL; 87 230 88 - phdr = (Elf32_Phdr *)(page_addr + sizeof(Elf32_Ehdr)); 231 + for (i = 0; i < phnum; ++i) { 232 + phdr = freader_fetch(r, phoff + i * sizeof(Elf32_Phdr), sizeof(Elf32_Phdr)); 233 + if (!phdr) 234 + return r->err; 89 235 90 - for (i = 0; i < ehdr->e_phnum; ++i) { 91 - if (phdr[i].p_type == PT_NOTE && 92 - !parse_build_id(page_addr, build_id, size, 93 - page_addr + phdr[i].p_offset, 94 - phdr[i].p_filesz)) 236 + if (phdr->p_type == PT_NOTE && 237 + !parse_build_id(r, build_id, size, READ_ONCE(phdr->p_offset), 238 + READ_ONCE(phdr->p_filesz))) 95 239 return 0; 96 240 } 97 241 return -EINVAL; 98 242 } 99 243 100 244 /* Parse build ID from 64-bit ELF */ 101 - static int get_build_id_64(const void *page_addr, unsigned char *build_id, 102 - __u32 *size) 245 + static int get_build_id_64(struct freader *r, unsigned char *build_id, __u32 *size) 103 246 { 104 - Elf64_Ehdr *ehdr = (Elf64_Ehdr *)page_addr; 105 - Elf64_Phdr *phdr; 106 - int i; 247 + const Elf64_Ehdr *ehdr; 248 + const Elf64_Phdr *phdr; 249 + __u32 phnum, i; 250 + __u64 phoff; 107 251 108 - /* 109 - * FIXME 110 - * Neither ELF spec nor ELF loader require that program headers 111 - * start immediately after ELF header. 
112 - */ 113 - if (ehdr->e_phoff != sizeof(Elf64_Ehdr)) 252 + ehdr = freader_fetch(r, 0, sizeof(Elf64_Ehdr)); 253 + if (!ehdr) 254 + return r->err; 255 + 256 + /* subsequent freader_fetch() calls invalidate pointers, so remember locally */ 257 + phnum = READ_ONCE(ehdr->e_phnum); 258 + phoff = READ_ONCE(ehdr->e_phoff); 259 + 260 + /* set upper bound on amount of segments (phdrs) we iterate */ 261 + if (phnum > MAX_PHDR_CNT) 262 + phnum = MAX_PHDR_CNT; 263 + 264 + /* check that phoff is not large enough to cause an overflow */ 265 + if (phoff + phnum * sizeof(Elf64_Phdr) < phoff) 114 266 return -EINVAL; 115 - /* only supports phdr that fits in one page */ 116 - if (ehdr->e_phnum > 117 - (PAGE_SIZE - sizeof(Elf64_Ehdr)) / sizeof(Elf64_Phdr)) 118 - return -EINVAL; 119 267 120 - phdr = (Elf64_Phdr *)(page_addr + sizeof(Elf64_Ehdr)); 268 + for (i = 0; i < phnum; ++i) { 269 + phdr = freader_fetch(r, phoff + i * sizeof(Elf64_Phdr), sizeof(Elf64_Phdr)); 270 + if (!phdr) 271 + return r->err; 121 272 122 - for (i = 0; i < ehdr->e_phnum; ++i) { 123 - if (phdr[i].p_type == PT_NOTE && 124 - !parse_build_id(page_addr, build_id, size, 125 - page_addr + phdr[i].p_offset, 126 - phdr[i].p_filesz)) 273 + if (phdr->p_type == PT_NOTE && 274 + !parse_build_id(r, build_id, size, READ_ONCE(phdr->p_offset), 275 + READ_ONCE(phdr->p_filesz))) 127 276 return 0; 128 277 } 278 + 129 279 return -EINVAL; 130 280 } 131 281 132 - /* 133 - * Parse build ID of ELF file mapped to vma 134 - * @vma: vma object 135 - * @build_id: buffer to store build id, at least BUILD_ID_SIZE long 136 - * @size: returns actual build id size in case of success 137 - * 138 - * Return: 0 on success, -EINVAL otherwise 139 - */ 140 - int build_id_parse(struct vm_area_struct *vma, unsigned char *build_id, 141 - __u32 *size) 282 + /* enough for Elf64_Ehdr, Elf64_Phdr, and all the smaller requests */ 283 + #define MAX_FREADER_BUF_SZ 64 284 + 285 + static int __build_id_parse(struct vm_area_struct *vma, unsigned char *build_id, 
286 + __u32 *size, bool may_fault) 142 287 { 143 - Elf32_Ehdr *ehdr; 144 - struct page *page; 145 - void *page_addr; 288 + const Elf32_Ehdr *ehdr; 289 + struct freader r; 290 + char buf[MAX_FREADER_BUF_SZ]; 146 291 int ret; 147 292 148 293 /* only works for page backed storage */ 149 294 if (!vma->vm_file) 150 295 return -EINVAL; 151 296 152 - page = find_get_page(vma->vm_file->f_mapping, 0); 153 - if (!page) 154 - return -EFAULT; /* page not mapped */ 297 + freader_init_from_file(&r, buf, sizeof(buf), vma->vm_file, may_fault); 298 + 299 + /* fetch first 18 bytes of ELF header for checks */ 300 + ehdr = freader_fetch(&r, 0, offsetofend(Elf32_Ehdr, e_type)); 301 + if (!ehdr) { 302 + ret = r.err; 303 + goto out; 304 + } 155 305 156 306 ret = -EINVAL; 157 - page_addr = kmap_local_page(page); 158 - ehdr = (Elf32_Ehdr *)page_addr; 159 307 160 308 /* compare magic x7f "ELF" */ 161 309 if (memcmp(ehdr->e_ident, ELFMAG, SELFMAG) != 0) ··· 314 166 goto out; 315 167 316 168 if (ehdr->e_ident[EI_CLASS] == ELFCLASS32) 317 - ret = get_build_id_32(page_addr, build_id, size); 169 + ret = get_build_id_32(&r, build_id, size); 318 170 else if (ehdr->e_ident[EI_CLASS] == ELFCLASS64) 319 - ret = get_build_id_64(page_addr, build_id, size); 171 + ret = get_build_id_64(&r, build_id, size); 320 172 out: 321 - kunmap_local(page_addr); 322 - put_page(page); 173 + freader_cleanup(&r); 323 174 return ret; 175 + } 176 + 177 + /* 178 + * Parse build ID of ELF file mapped to vma 179 + * @vma: vma object 180 + * @build_id: buffer to store build id, at least BUILD_ID_SIZE long 181 + * @size: returns actual build id size in case of success 182 + * 183 + * Assumes no page fault can be taken, so if relevant portions of ELF file are 184 + * not already paged in, fetching of build ID fails. 
185 + * 186 + * Return: 0 on success; negative error, otherwise 187 + */ 188 + int build_id_parse_nofault(struct vm_area_struct *vma, unsigned char *build_id, __u32 *size) 189 + { 190 + return __build_id_parse(vma, build_id, size, false /* !may_fault */); 191 + } 192 + 193 + /* 194 + * Parse build ID of ELF file mapped to VMA 195 + * @vma: vma object 196 + * @build_id: buffer to store build id, at least BUILD_ID_SIZE long 197 + * @size: returns actual build id size in case of success 198 + * 199 + * Assumes faultable context and can cause page faults to bring in file data 200 + * into page cache. 201 + * 202 + * Return: 0 on success; negative error, otherwise 203 + */ 204 + int build_id_parse(struct vm_area_struct *vma, unsigned char *build_id, __u32 *size) 205 + { 206 + return __build_id_parse(vma, build_id, size, true /* may_fault */); 324 207 } 325 208 326 209 /** ··· 364 185 */ 365 186 int build_id_parse_buf(const void *buf, unsigned char *build_id, u32 buf_size) 366 187 { 367 - return parse_build_id_buf(build_id, NULL, buf, buf_size); 188 + struct freader r; 189 + int err; 190 + 191 + freader_init_from_mem(&r, buf, buf_size); 192 + 193 + err = parse_build_id(&r, build_id, NULL, 0, buf_size); 194 + 195 + freader_cleanup(&r); 196 + return err; 368 197 } 369 198 370 199 #if IS_ENABLED(CONFIG_STACKTRACE_BUILD_ID) || IS_ENABLED(CONFIG_VMCORE_INFO)
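A note on the reworked `parse_build_id()` loop above: it snapshots `n_namesz` and `n_descsz` once, computes the next note offset with overflow-checked additions, and only then looks at the payload. The following is a minimal userspace sketch of that idea, not the kernel code. `find_build_id`, `struct note_hdr`, `ALIGN4` and `BUILD_ID_MAX` are hypothetical names, and the GCC/Clang builtin `__builtin_add_overflow` stands in for the kernel's `check_add_overflow()`.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NOTE_BUILD_ID 3                 /* value of NT_GNU_BUILD_ID */
#define BUILD_ID_MAX  20                /* stand-in for BUILD_ID_SIZE_MAX */
#define ALIGN4(x)     (((uint32_t)(x) + 3u) & ~3u)

/* Same layout as Elf32_Nhdr / Elf64_Nhdr: three 32-bit words. */
struct note_hdr {
	uint32_t n_namesz;
	uint32_t n_descsz;
	uint32_t n_type;
};

/* Hypothetical helper: returns 0 and fills build_id on success, -1 otherwise. */
static int find_build_id(const unsigned char *buf, uint32_t buf_sz,
			 unsigned char *build_id, uint32_t *size)
{
	static const char name[] = "GNU";
	uint32_t off = 0, next, name_sz, desc_sz;
	struct note_hdr nh;

	assert(buf && build_id);

	/* need room for a header plus the "GNU\0" name and at least one byte */
	while (buf_sz - off > sizeof(nh) + sizeof(name)) {
		/* copy the header once so later checks use stable values */
		memcpy(&nh, buf + off, sizeof(nh));
		name_sz = nh.n_namesz;
		desc_sz = nh.n_descsz;

		/* next = off + header + ALIGN(name) + ALIGN(desc), checked per step */
		next = off + (uint32_t)sizeof(nh);
		if (__builtin_add_overflow(next, ALIGN4(name_sz), &next) ||
		    __builtin_add_overflow(next, ALIGN4(desc_sz), &next) ||
		    next > buf_sz)
			break;

		if (nh.n_type == NOTE_BUILD_ID &&
		    name_sz == sizeof(name) &&
		    memcmp(buf + off + sizeof(nh), name, sizeof(name)) == 0 &&
		    desc_sz > 0 && desc_sz <= BUILD_ID_MAX) {
			memcpy(build_id,
			       buf + off + sizeof(nh) + ALIGN4(sizeof(name)),
			       desc_sz);
			/* zero-pad the tail, as the kernel version does */
			memset(build_id + desc_sz, 0, BUILD_ID_MAX - desc_sz);
			if (size)
				*size = desc_sz;
			return 0;
		}

		off = next;
	}

	return -1;
}
```

Because `next > buf_sz` is checked before the payload is touched, a note whose sizes point past the buffer is rejected rather than read.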

+1 -1

net/bpf/bpf_dummy_struct_ops.c

···

 		offset = btf_ctx_arg_offset(bpf_dummy_ops_btf, func_proto, arg_no);
 		info = find_ctx_arg_info(prog->aux, offset);
-		if (info && (info->reg_type & PTR_MAYBE_NULL))
+		if (info && type_may_be_null(info->reg_type))
 			continue;

 		return -EINVAL;

+45 -30

net/core/filter.c

··· 1266 1266 * so we need to keep the user BPF around until the 2nd 1267 1267 * pass. At this time, the user BPF is stored in fp->insns. 1268 1268 */ 1269 - old_prog = kmemdup(fp->insns, old_len * sizeof(struct sock_filter), 1270 - GFP_KERNEL | __GFP_NOWARN); 1269 + old_prog = kmemdup_array(fp->insns, old_len, sizeof(struct sock_filter), 1270 + GFP_KERNEL | __GFP_NOWARN); 1271 1271 if (!old_prog) { 1272 1272 err = -ENOMEM; 1273 1273 goto out_err; ··· 6280 6280 int ret = BPF_MTU_CHK_RET_FRAG_NEEDED; 6281 6281 struct net_device *dev = skb->dev; 6282 6282 int skb_len, dev_len; 6283 - int mtu; 6283 + int mtu = 0; 6284 6284 6285 - if (unlikely(flags & ~(BPF_MTU_CHK_SEGS))) 6286 - return -EINVAL; 6285 + if (unlikely(flags & ~(BPF_MTU_CHK_SEGS))) { 6286 + ret = -EINVAL; 6287 + goto out; 6288 + } 6287 6289 6288 - if (unlikely(flags & BPF_MTU_CHK_SEGS && (len_diff || *mtu_len))) 6289 - return -EINVAL; 6290 + if (unlikely(flags & BPF_MTU_CHK_SEGS && (len_diff || *mtu_len))) { 6291 + ret = -EINVAL; 6292 + goto out; 6293 + } 6290 6294 6291 6295 dev = __dev_via_ifindex(dev, ifindex); 6292 - if (unlikely(!dev)) 6293 - return -ENODEV; 6296 + if (unlikely(!dev)) { 6297 + ret = -ENODEV; 6298 + goto out; 6299 + } 6294 6300 6295 6301 mtu = READ_ONCE(dev->mtu); 6296 - 6297 6302 dev_len = mtu + dev->hard_header_len; 6298 6303 6299 6304 /* If set use *mtu_len as input, L3 as iph->tot_len (like fib_lookup) */ ··· 6316 6311 */ 6317 6312 if (skb_is_gso(skb)) { 6318 6313 ret = BPF_MTU_CHK_RET_SUCCESS; 6319 - 6320 6314 if (flags & BPF_MTU_CHK_SEGS && 6321 6315 !skb_gso_validate_network_len(skb, mtu)) 6322 6316 ret = BPF_MTU_CHK_RET_SEGS_TOOBIG; 6323 6317 } 6324 6318 out: 6325 - /* BPF verifier guarantees valid pointer */ 6326 6319 *mtu_len = mtu; 6327 - 6328 6320 return ret; 6329 6321 } 6330 6322 ··· 6331 6329 struct net_device *dev = xdp->rxq->dev; 6332 6330 int xdp_len = xdp->data_end - xdp->data; 6333 6331 int ret = BPF_MTU_CHK_RET_SUCCESS; 6334 - int mtu, dev_len; 6332 + int mtu = 0, 
dev_len; 6335 6333 6336 6334 /* XDP variant doesn't support multi-buffer segment check (yet) */ 6337 - if (unlikely(flags)) 6338 - return -EINVAL; 6335 + if (unlikely(flags)) { 6336 + ret = -EINVAL; 6337 + goto out; 6338 + } 6339 6339 6340 6340 dev = __dev_via_ifindex(dev, ifindex); 6341 - if (unlikely(!dev)) 6342 - return -ENODEV; 6341 + if (unlikely(!dev)) { 6342 + ret = -ENODEV; 6343 + goto out; 6344 + } 6343 6345 6344 6346 mtu = READ_ONCE(dev->mtu); 6345 - 6346 - /* Add L2-header as dev MTU is L3 size */ 6347 6347 dev_len = mtu + dev->hard_header_len; 6348 6348 6349 6349 /* Use *mtu_len as input, L3 as iph->tot_len (like fib_lookup) */ ··· 6355 6351 xdp_len += len_diff; /* minus result pass check */ 6356 6352 if (xdp_len > dev_len) 6357 6353 ret = BPF_MTU_CHK_RET_FRAG_NEEDED; 6358 - 6359 - /* BPF verifier guarantees valid pointer */ 6354 + out: 6360 6355 *mtu_len = mtu; 6361 - 6362 6356 return ret; 6363 6357 } 6364 6358 ··· 6366 6364 .ret_type = RET_INTEGER, 6367 6365 .arg1_type = ARG_PTR_TO_CTX, 6368 6366 .arg2_type = ARG_ANYTHING, 6369 - .arg3_type = ARG_PTR_TO_INT, 6367 + .arg3_type = ARG_PTR_TO_FIXED_SIZE_MEM | MEM_UNINIT | MEM_ALIGNED, 6368 + .arg3_size = sizeof(u32), 6370 6369 .arg4_type = ARG_ANYTHING, 6371 6370 .arg5_type = ARG_ANYTHING, 6372 6371 }; ··· 6378 6375 .ret_type = RET_INTEGER, 6379 6376 .arg1_type = ARG_PTR_TO_CTX, 6380 6377 .arg2_type = ARG_ANYTHING, 6381 - .arg3_type = ARG_PTR_TO_INT, 6378 + .arg3_type = ARG_PTR_TO_FIXED_SIZE_MEM | MEM_UNINIT | MEM_ALIGNED, 6379 + .arg3_size = sizeof(u32), 6382 6380 .arg4_type = ARG_ANYTHING, 6383 6381 .arg5_type = ARG_ANYTHING, 6384 6382 }; ··· 8601 8597 if (off + size > offsetofend(struct __sk_buff, cb[4])) 8602 8598 return false; 8603 8599 break; 8600 + case bpf_ctx_range(struct __sk_buff, data): 8601 + case bpf_ctx_range(struct __sk_buff, data_meta): 8602 + case bpf_ctx_range(struct __sk_buff, data_end): 8603 + if (info->is_ldsx || size != size_default) 8604 + return false; 8605 + break; 8604 8606 case 
bpf_ctx_range_till(struct __sk_buff, remote_ip6[0], remote_ip6[3]): 8605 8607 case bpf_ctx_range_till(struct __sk_buff, local_ip6[0], local_ip6[3]): 8606 8608 case bpf_ctx_range_till(struct __sk_buff, remote_ip4, remote_ip4): 8607 8609 case bpf_ctx_range_till(struct __sk_buff, local_ip4, local_ip4): 8608 - case bpf_ctx_range(struct __sk_buff, data): 8609 - case bpf_ctx_range(struct __sk_buff, data_meta): 8610 - case bpf_ctx_range(struct __sk_buff, data_end): 8611 8610 if (size != size_default) 8612 8611 return false; 8613 8612 break; ··· 9054 9047 } 9055 9048 } 9056 9049 return false; 9050 + } else { 9051 + switch (off) { 9052 + case offsetof(struct xdp_md, data_meta): 9053 + case offsetof(struct xdp_md, data): 9054 + case offsetof(struct xdp_md, data_end): 9055 + if (info->is_ldsx) 9056 + return false; 9057 + } 9057 9058 } 9058 9059 9059 9060 switch (off) { ··· 9387 9372 9388 9373 switch (off) { 9389 9374 case bpf_ctx_range(struct __sk_buff, data): 9390 - if (size != size_default) 9375 + if (info->is_ldsx || size != size_default) 9391 9376 return false; 9392 9377 info->reg_type = PTR_TO_PACKET; 9393 9378 return true; 9394 9379 case bpf_ctx_range(struct __sk_buff, data_end): 9395 - if (size != size_default) 9380 + if (info->is_ldsx || size != size_default) 9396 9381 return false; 9397 9382 info->reg_type = PTR_TO_PACKET_END; 9398 9383 return true;

-26

net/ipv4/bpf_tcp_ca.c

··· 14 14 /* "extern" is to avoid sparse warning. It is only used in bpf_struct_ops.c. */ 15 15 static struct bpf_struct_ops bpf_tcp_congestion_ops; 16 16 17 - static u32 unsupported_ops[] = { 18 - offsetof(struct tcp_congestion_ops, get_info), 19 - }; 20 - 21 17 static const struct btf_type *tcp_sock_type; 22 18 static u32 tcp_sock_id, sock_id; 23 19 static const struct btf_type *tcp_congestion_ops_type; ··· 39 43 tcp_congestion_ops_type = btf_type_by_id(btf, type_id); 40 44 41 45 return 0; 42 - } 43 - 44 - static bool is_unsupported(u32 member_offset) 45 - { 46 - unsigned int i; 47 - 48 - for (i = 0; i < ARRAY_SIZE(unsupported_ops); i++) { 49 - if (member_offset == unsupported_ops[i]) 50 - return true; 51 - } 52 - 53 - return false; 54 46 } 55 47 56 48 static bool bpf_tcp_ca_is_valid_access(int off, int size, ··· 235 251 return 0; 236 252 } 237 253 238 - static int bpf_tcp_ca_check_member(const struct btf_type *t, 239 - const struct btf_member *member, 240 - const struct bpf_prog *prog) 241 - { 242 - if (is_unsupported(__btf_member_bit_offset(t, member) / 8)) 243 - return -ENOTSUPP; 244 - return 0; 245 - } 246 - 247 254 static int bpf_tcp_ca_reg(void *kdata, struct bpf_link *link) 248 255 { 249 256 return tcp_register_congestion_control(kdata); ··· 329 354 .reg = bpf_tcp_ca_reg, 330 355 .unreg = bpf_tcp_ca_unreg, 331 356 .update = bpf_tcp_ca_update, 332 - .check_member = bpf_tcp_ca_check_member, 333 357 .init_member = bpf_tcp_ca_init_member, 334 358 .init = bpf_tcp_ca_init, 335 359 .validate = bpf_tcp_ca_validate,

+12 -11

net/xdp/xsk.c

···
 	__u32 headroom;
 };

-struct xdp_umem_reg_v2 {
-	__u64 addr; /* Start of packet data area */
-	__u64 len; /* Length of packet data area */
-	__u32 chunk_size;
-	__u32 headroom;
-	__u32 flags;
-};
-
 static int xsk_setsockopt(struct socket *sock, int level, int optname,
 			  sockptr_t optval, unsigned int optlen)
 {
···

 		if (optlen < sizeof(struct xdp_umem_reg_v1))
 			return -EINVAL;
-		else if (optlen < sizeof(struct xdp_umem_reg_v2))
-			mr_size = sizeof(struct xdp_umem_reg_v1);
 		else if (optlen < sizeof(mr))
-			mr_size = sizeof(struct xdp_umem_reg_v2);
+			mr_size = sizeof(struct xdp_umem_reg_v1);
+
+		BUILD_BUG_ON(sizeof(struct xdp_umem_reg_v1) >= sizeof(struct xdp_umem_reg));
+
+		/* Make sure the last field of the struct doesn't have
+		 * uninitialized padding. All padding has to be explicit
+		 * and has to be set to zero by the userspace to make
+		 * struct xdp_umem_reg extensible in the future.
+		 */
+		BUILD_BUG_ON(offsetof(struct xdp_umem_reg, tx_metadata_len) +
+			     sizeof_field(struct xdp_umem_reg, tx_metadata_len) !=
+			     sizeof(struct xdp_umem_reg));

 		if (copy_from_sockptr(&mr, optval, mr_size))
 			return -EFAULT;
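The second `BUILD_BUG_ON` in the xsk.c hunk asserts at compile time that the struct ends exactly where its last field ends, so there is no implicit tail padding. A minimal sketch of the same check in standard C11 follows. `struct umem_reg_example` and `END_OF_FIELD` are hypothetical; the field list is assumed to resemble `struct xdp_umem_reg` and is not copied from the UAPI header.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout, assumed for illustration only. */
struct umem_reg_example {
	uint64_t addr;
	uint64_t len;
	uint32_t chunk_size;
	uint32_t headroom;
	uint32_t flags;
	uint32_t tx_metadata_len;
};

/* Offset of the first byte after `field` inside `type`. */
#define END_OF_FIELD(type, field) \
	(offsetof(type, field) + sizeof(((type *)0)->field))

/* Fails to compile if the compiler added padding after the last field. */
_Static_assert(END_OF_FIELD(struct umem_reg_example, tx_metadata_len) ==
	       sizeof(struct umem_reg_example),
	       "struct must not end in implicit padding");
```

If a later field left the struct with tail padding (for example a lone `uint32_t` after the 64-bit members), the assertion would fail at build time. That is when explicit, zeroed padding has to be added.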

+5 -4

samples/bpf/Makefile

··· 13 13 tprogs-y += sockex2 14 14 tprogs-y += sockex3 15 15 tprogs-y += tracex1 16 - tprogs-y += tracex2 17 16 tprogs-y += tracex3 18 17 tprogs-y += tracex4 19 18 tprogs-y += tracex5 ··· 62 63 sockex2-objs := sockex2_user.o 63 64 sockex3-objs := sockex3_user.o 64 65 tracex1-objs := tracex1_user.o $(TRACE_HELPERS) 65 - tracex2-objs := tracex2_user.o 66 66 tracex3-objs := tracex3_user.o 67 67 tracex4-objs := tracex4_user.o 68 68 tracex5-objs := tracex5_user.o $(TRACE_HELPERS) ··· 103 105 always-y += sockex2_kern.o 104 106 always-y += sockex3_kern.o 105 107 always-y += tracex1.bpf.o 106 - always-y += tracex2.bpf.o 107 108 always-y += tracex3.bpf.o 108 109 always-y += tracex4.bpf.o 109 110 always-y += tracex5.bpf.o ··· 164 167 BPF_EXTRA_CFLAGS += -I$(srctree)/arch/mips/include/asm/mach-loongson64 165 168 BPF_EXTRA_CFLAGS += -I$(srctree)/arch/mips/include/asm/mach-generic 166 169 endif 170 + endif 171 + 172 + ifeq ($(ARCH), x86) 173 + BPF_EXTRA_CFLAGS += -fcf-protection 167 174 endif 168 175 169 176 TPROGS_CFLAGS += -Wall -O2 ··· 406 405 -Wno-gnu-variable-sized-type-not-at-end \ 407 406 -Wno-address-of-packed-member -Wno-tautological-compare \ 408 407 -Wno-unknown-warning-option $(CLANG_ARCH_ARGS) \ 409 - -fno-asynchronous-unwind-tables -fcf-protection \ 408 + -fno-asynchronous-unwind-tables \ 410 409 -I$(srctree)/samples/bpf/ -include asm_goto_workaround.h \ 411 410 -O2 -emit-llvm -Xclang -disable-llvm-passes -c $< -o - | \ 412 411 $(OPT) -O2 -mtriple=bpf-pc-linux | $(LLVM_DIS) | \

-99

samples/bpf/tracex2.bpf.c

··· 1 - /* Copyright (c) 2013-2015 PLUMgrid, http://plumgrid.com 2 - * 3 - * This program is free software; you can redistribute it and/or 4 - * modify it under the terms of version 2 of the GNU General Public 5 - * License as published by the Free Software Foundation. 6 - */ 7 - #include "vmlinux.h" 8 - #include <linux/version.h> 9 - #include <bpf/bpf_helpers.h> 10 - #include <bpf/bpf_tracing.h> 11 - #include <bpf/bpf_core_read.h> 12 - 13 - struct { 14 - __uint(type, BPF_MAP_TYPE_HASH); 15 - __type(key, long); 16 - __type(value, long); 17 - __uint(max_entries, 1024); 18 - } my_map SEC(".maps"); 19 - 20 - /* kprobe is NOT a stable ABI. If kernel internals change this bpf+kprobe 21 - * example will no longer be meaningful 22 - */ 23 - SEC("kprobe/kfree_skb_reason") 24 - int bpf_prog2(struct pt_regs *ctx) 25 - { 26 - long loc = 0; 27 - long init_val = 1; 28 - long *value; 29 - 30 - /* read ip of kfree_skb_reason caller. 31 - * non-portable version of __builtin_return_address(0) 32 - */ 33 - BPF_KPROBE_READ_RET_IP(loc, ctx); 34 - 35 - value = bpf_map_lookup_elem(&my_map, &loc); 36 - if (value) 37 - *value += 1; 38 - else 39 - bpf_map_update_elem(&my_map, &loc, &init_val, BPF_ANY); 40 - return 0; 41 - } 42 - 43 - static unsigned int log2(unsigned int v) 44 - { 45 - unsigned int r; 46 - unsigned int shift; 47 - 48 - r = (v > 0xFFFF) << 4; v >>= r; 49 - shift = (v > 0xFF) << 3; v >>= shift; r |= shift; 50 - shift = (v > 0xF) << 2; v >>= shift; r |= shift; 51 - shift = (v > 0x3) << 1; v >>= shift; r |= shift; 52 - r |= (v >> 1); 53 - return r; 54 - } 55 - 56 - static unsigned int log2l(unsigned long v) 57 - { 58 - unsigned int hi = v >> 32; 59 - if (hi) 60 - return log2(hi) + 32; 61 - else 62 - return log2(v); 63 - } 64 - 65 - struct hist_key { 66 - char comm[16]; 67 - u64 pid_tgid; 68 - u64 uid_gid; 69 - u64 index; 70 - }; 71 - 72 - struct { 73 - __uint(type, BPF_MAP_TYPE_PERCPU_HASH); 74 - __uint(key_size, sizeof(struct hist_key)); 75 - __uint(value_size, sizeof(long)); 
76 - __uint(max_entries, 1024); 77 - } my_hist_map SEC(".maps"); 78 - 79 - SEC("ksyscall/write") 80 - int BPF_KSYSCALL(bpf_prog3, unsigned int fd, const char *buf, size_t count) 81 - { 82 - long init_val = 1; 83 - long *value; 84 - struct hist_key key; 85 - 86 - key.index = log2l(count); 87 - key.pid_tgid = bpf_get_current_pid_tgid(); 88 - key.uid_gid = bpf_get_current_uid_gid(); 89 - bpf_get_current_comm(&key.comm, sizeof(key.comm)); 90 - 91 - value = bpf_map_lookup_elem(&my_hist_map, &key); 92 - if (value) 93 - __sync_fetch_and_add(value, 1); 94 - else 95 - bpf_map_update_elem(&my_hist_map, &key, &init_val, BPF_ANY); 96 - return 0; 97 - } 98 - char _license[] SEC("license") = "GPL"; 99 - u32 _version SEC("version") = LINUX_VERSION_CODE;

-187

samples/bpf/tracex2_user.c

··· 1 - // SPDX-License-Identifier: GPL-2.0 2 - #include <stdio.h> 3 - #include <unistd.h> 4 - #include <stdlib.h> 5 - #include <signal.h> 6 - #include <string.h> 7 - 8 - #include <bpf/bpf.h> 9 - #include <bpf/libbpf.h> 10 - #include "bpf_util.h" 11 - 12 - #define MAX_INDEX 64 13 - #define MAX_STARS 38 14 - 15 - /* my_map, my_hist_map */ 16 - static int map_fd[2]; 17 - 18 - static void stars(char *str, long val, long max, int width) 19 - { 20 - int i; 21 - 22 - for (i = 0; i < (width * val / max) - 1 && i < width - 1; i++) 23 - str[i] = '*'; 24 - if (val > max) 25 - str[i - 1] = '+'; 26 - str[i] = '\0'; 27 - } 28 - 29 - struct task { 30 - char comm[16]; 31 - __u64 pid_tgid; 32 - __u64 uid_gid; 33 - }; 34 - 35 - struct hist_key { 36 - struct task t; 37 - __u32 index; 38 - }; 39 - 40 - #define SIZE sizeof(struct task) 41 - 42 - static void print_hist_for_pid(int fd, void *task) 43 - { 44 - unsigned int nr_cpus = bpf_num_possible_cpus(); 45 - struct hist_key key = {}, next_key; 46 - long values[nr_cpus]; 47 - char starstr[MAX_STARS]; 48 - long value; 49 - long data[MAX_INDEX] = {}; 50 - int max_ind = -1; 51 - long max_value = 0; 52 - int i, ind; 53 - 54 - while (bpf_map_get_next_key(fd, &key, &next_key) == 0) { 55 - if (memcmp(&next_key, task, SIZE)) { 56 - key = next_key; 57 - continue; 58 - } 59 - bpf_map_lookup_elem(fd, &next_key, values); 60 - value = 0; 61 - for (i = 0; i < nr_cpus; i++) 62 - value += values[i]; 63 - ind = next_key.index; 64 - data[ind] = value; 65 - if (value && ind > max_ind) 66 - max_ind = ind; 67 - if (value > max_value) 68 - max_value = value; 69 - key = next_key; 70 - } 71 - 72 - printf(" syscall write() stats\n"); 73 - printf(" byte_size : count distribution\n"); 74 - for (i = 1; i <= max_ind + 1; i++) { 75 - stars(starstr, data[i - 1], max_value, MAX_STARS); 76 - printf("%8ld -> %-8ld : %-8ld |%-*s|\n", 77 - (1l << i) >> 1, (1l << i) - 1, data[i - 1], 78 - MAX_STARS, starstr); 79 - } 80 - } 81 - 82 - static void print_hist(int fd) 83 - { 
84 - struct hist_key key = {}, next_key; 85 - static struct task tasks[1024]; 86 - int task_cnt = 0; 87 - int i; 88 - 89 - while (bpf_map_get_next_key(fd, &key, &next_key) == 0) { 90 - int found = 0; 91 - 92 - for (i = 0; i < task_cnt; i++) 93 - if (memcmp(&tasks[i], &next_key, SIZE) == 0) 94 - found = 1; 95 - if (!found) 96 - memcpy(&tasks[task_cnt++], &next_key, SIZE); 97 - key = next_key; 98 - } 99 - 100 - for (i = 0; i < task_cnt; i++) { 101 - printf("\npid %d cmd %s uid %d\n", 102 - (__u32) tasks[i].pid_tgid, 103 - tasks[i].comm, 104 - (__u32) tasks[i].uid_gid); 105 - print_hist_for_pid(fd, &tasks[i]); 106 - } 107 - 108 - } 109 - 110 - static void int_exit(int sig) 111 - { 112 - print_hist(map_fd[1]); 113 - exit(0); 114 - } 115 - 116 - int main(int ac, char **argv) 117 - { 118 - long key, next_key, value; 119 - struct bpf_link *links[2]; 120 - struct bpf_program *prog; 121 - struct bpf_object *obj; 122 - char filename[256]; 123 - int i, j = 0; 124 - FILE *f; 125 - 126 - snprintf(filename, sizeof(filename), "%s.bpf.o", argv[0]); 127 - obj = bpf_object__open_file(filename, NULL); 128 - if (libbpf_get_error(obj)) { 129 - fprintf(stderr, "ERROR: opening BPF object file failed\n"); 130 - return 0; 131 - } 132 - 133 - /* load BPF program */ 134 - if (bpf_object__load(obj)) { 135 - fprintf(stderr, "ERROR: loading BPF object file failed\n"); 136 - goto cleanup; 137 - } 138 - 139 - map_fd[0] = bpf_object__find_map_fd_by_name(obj, "my_map"); 140 - map_fd[1] = bpf_object__find_map_fd_by_name(obj, "my_hist_map"); 141 - if (map_fd[0] < 0 || map_fd[1] < 0) { 142 - fprintf(stderr, "ERROR: finding a map in obj file failed\n"); 143 - goto cleanup; 144 - } 145 - 146 - signal(SIGINT, int_exit); 147 - signal(SIGTERM, int_exit); 148 - 149 - /* start 'ping' in the background to have some kfree_skb_reason 150 - * events */ 151 - f = popen("ping -4 -c5 localhost", "r"); 152 - (void) f; 153 - 154 - /* start 'dd' in the background to have plenty of 'write' syscalls */ 155 - f = 
popen("dd if=/dev/zero of=/dev/null count=5000000", "r"); 156 - (void) f; 157 - 158 - bpf_object__for_each_program(prog, obj) { 159 - links[j] = bpf_program__attach(prog); 160 - if (libbpf_get_error(links[j])) { 161 - fprintf(stderr, "ERROR: bpf_program__attach failed\n"); 162 - links[j] = NULL; 163 - goto cleanup; 164 - } 165 - j++; 166 - } 167 - 168 - for (i = 0; i < 5; i++) { 169 - key = 0; 170 - while (bpf_map_get_next_key(map_fd[0], &key, &next_key) == 0) { 171 - bpf_map_lookup_elem(map_fd[0], &next_key, &value); 172 - printf("location 0x%lx count %ld\n", next_key, value); 173 - key = next_key; 174 - } 175 - if (key) 176 - printf("\n"); 177 - sleep(1); 178 - } 179 - print_hist(map_fd[1]); 180 - 181 - cleanup: 182 - for (j--; j >= 0; j--) 183 - bpf_link__destroy(links[j]); 184 - 185 - bpf_object__close(obj); 186 - return 0; 187 - }

+2 -2

samples/bpf/tracex4.bpf.c

···
 	return 0;
 }

-SEC("kretprobe/kmem_cache_alloc_node")
+SEC("kretprobe/kmem_cache_alloc_node_noprof")
 int bpf_prog2(struct pt_regs *ctx)
 {
 	long ptr = PT_REGS_RC(ctx);
 	long ip = 0;

-	/* get ip address of kmem_cache_alloc_node() caller */
+	/* get ip address of kmem_cache_alloc_node_noprof() caller */
 	BPF_KRETPROBE_READ_RET_IP(ip, ctx);

 	struct pair v = {

+1 -13

scripts/link-vmlinux.sh

···
 # ${1} - vmlinux image
 gen_btf()
 {
-	local pahole_ver
 	local btf_data=${1}.btf.o
-
-	if ! [ -x "$(command -v ${PAHOLE})" ]; then
-		echo >&2 "BTF: ${1}: pahole (${PAHOLE}) is not available"
-		return 1
-	fi
-
-	pahole_ver=$(${PAHOLE} --version | sed -E 's/v([0-9]+)\.([0-9]+)/\1\2/')
-	if [ "${pahole_ver}" -lt "116" ]; then
-		echo >&2 "BTF: ${1}: pahole version $(${PAHOLE} --version) is too old, need at least v1.16"
-		return 1
-	fi

 	info BTF "${btf_data}"
 	LLVM_OBJCOPY="${OBJCOPY}" ${PAHOLE} -J ${PAHOLE_FLAGS} ${1}
···
 vmlinux_link vmlinux

 # fill in BTF IDs
-if is_enabled CONFIG_DEBUG_INFO_BTF && is_enabled CONFIG_BPF; then
+if is_enabled CONFIG_DEBUG_INFO_BTF; then
 	info BTFIDS vmlinux
 	${RESOLVE_BTFIDS} vmlinux
 fi

-1

security/bpf/hooks.c

···

 struct lsm_blob_sizes bpf_lsm_blob_sizes __ro_after_init = {
 	.lbs_inode = sizeof(struct bpf_storage_blob),
-	.lbs_task = sizeof(struct bpf_storage_blob),
 };

 DEFINE_LSM(bpf) = {

+2 -2

tools/bpf/bpftool/Documentation/bpftool-gen.rst

···

 - **example__load**.
   This function creates maps, loads and verifies BPF programs, initializes
-  global data maps. It corresponds to libppf's **bpf_object__load**\ ()
+  global data maps. It corresponds to libbpf's **bpf_object__load**\ ()
   API.

 - **example__open_and_load** combines **example__open** and
···
     CO-RE based application, turning the application portable to different
     kernel versions.

-    Check examples bellow for more information how to use it.
+    Check examples below for more information on how to use it.

 bpftool gen help
     Print short help message.

+23 -1

tools/bpf/bpftool/Documentation/bpftool-net.rst

··· 29 29 | **bpftool** **net help** 30 30 | 31 31 | *PROG* := { **id** *PROG_ID* | **pinned** *FILE* | **tag** *PROG_TAG* | **name** *PROG_NAME* } 32 - | *ATTACH_TYPE* := { **xdp** | **xdpgeneric** | **xdpdrv** | **xdpoffload** } 32 + | *ATTACH_TYPE* := { **xdp** | **xdpgeneric** | **xdpdrv** | **xdpoffload** | **tcx_ingress** | **tcx_egress** } 33 33 34 34 DESCRIPTION 35 35 =========== ··· 69 69 **xdpgeneric** - Generic XDP. runs at generic XDP hook when packet already enters receive path as skb; 70 70 **xdpdrv** - Native XDP. runs earliest point in driver's receive path; 71 71 **xdpoffload** - Offload XDP. runs directly on NIC on each packet reception; 72 + **tcx_ingress** - Ingress TCX. runs on ingress net traffic; 73 + **tcx_egress** - Egress TCX. runs on egress net traffic; 72 74 73 75 bpftool net detach *ATTACH_TYPE* dev *NAME* 74 76 Detach bpf program attached to network interface *NAME* with type specified ··· 180 178 :: 181 179 182 180 xdp: 181 + 182 + | 183 + | **# bpftool net attach tcx_ingress name tc_prog dev lo** 184 + | **# bpftool net** 185 + | 186 + 187 + :: 188 + 189 + tc: 190 + lo(1) tcx/ingress tc_prog prog_id 29 191 + 192 + | 193 + | **# bpftool net attach tcx_ingress name tc_prog dev lo** 194 + | **# bpftool net detach tcx_ingress dev lo** 195 + | **# bpftool net** 196 + | 197 + 198 + :: 199 + 200 + tc:

+1 -1

tools/bpf/bpftool/bash-completion/bpftool

···
             esac
             ;;
         net)
-            local ATTACH_TYPES='xdp xdpgeneric xdpdrv xdpoffload'
+            local ATTACH_TYPES='xdp xdpgeneric xdpdrv xdpoffload tcx_ingress tcx_egress'
             case $command in
                 show|list)
                     [[ $prev != "$command" ]] && return 0

+79 -8

tools/bpf/bpftool/btf.c

··· 50 50 int type_rank; 51 51 const char *sort_name; 52 52 const char *own_name; 53 + __u64 disambig_hash; 53 54 }; 54 55 55 56 static const char *btf_int_enc_str(__u8 encoding) ··· 562 561 case BTF_KIND_ENUM64: { 563 562 int name_off = t->name_off; 564 563 565 - /* Use name of the first element for anonymous enums if allowed */ 566 - if (!from_ref && !t->name_off && btf_vlen(t)) 567 - name_off = btf_enum(t)->name_off; 564 + if (!from_ref && !name_off && btf_vlen(t)) 565 + name_off = btf_kind(t) == BTF_KIND_ENUM64 ? 566 + btf_enum64(t)->name_off : 567 + btf_enum(t)->name_off; 568 568 569 569 return btf__name_by_offset(btf, name_off); 570 570 } ··· 585 583 return NULL; 586 584 } 587 585 586 + static __u64 hasher(__u64 hash, __u64 val) 587 + { 588 + return hash * 31 + val; 589 + } 590 + 591 + static __u64 btf_name_hasher(__u64 hash, const struct btf *btf, __u32 name_off) 592 + { 593 + if (!name_off) 594 + return hash; 595 + 596 + return hasher(hash, str_hash(btf__name_by_offset(btf, name_off))); 597 + } 598 + 599 + static __u64 btf_type_disambig_hash(const struct btf *btf, __u32 id, bool include_members) 600 + { 601 + const struct btf_type *t = btf__type_by_id(btf, id); 602 + int i; 603 + size_t hash = 0; 604 + 605 + hash = btf_name_hasher(hash, btf, t->name_off); 606 + 607 + switch (btf_kind(t)) { 608 + case BTF_KIND_ENUM: 609 + case BTF_KIND_ENUM64: 610 + for (i = 0; i < btf_vlen(t); i++) { 611 + __u32 name_off = btf_is_enum(t) ? 
612 + btf_enum(t)[i].name_off : 613 + btf_enum64(t)[i].name_off; 614 + 615 + hash = btf_name_hasher(hash, btf, name_off); 616 + } 617 + break; 618 + case BTF_KIND_STRUCT: 619 + case BTF_KIND_UNION: 620 + if (!include_members) 621 + break; 622 + for (i = 0; i < btf_vlen(t); i++) { 623 + const struct btf_member *m = btf_members(t) + i; 624 + 625 + hash = btf_name_hasher(hash, btf, m->name_off); 626 + /* resolve field type's name and hash it as well */ 627 + hash = hasher(hash, btf_type_disambig_hash(btf, m->type, false)); 628 + } 629 + break; 630 + case BTF_KIND_TYPE_TAG: 631 + case BTF_KIND_CONST: 632 + case BTF_KIND_PTR: 633 + case BTF_KIND_VOLATILE: 634 + case BTF_KIND_RESTRICT: 635 + case BTF_KIND_TYPEDEF: 636 + case BTF_KIND_DECL_TAG: 637 + hash = hasher(hash, btf_type_disambig_hash(btf, t->type, include_members)); 638 + break; 639 + case BTF_KIND_ARRAY: { 640 + struct btf_array *arr = btf_array(t); 641 + 642 + hash = hasher(hash, arr->nelems); 643 + hash = hasher(hash, btf_type_disambig_hash(btf, arr->type, include_members)); 644 + break; 645 + } 646 + default: 647 + break; 648 + } 649 + return hash; 650 + } 651 + 588 652 static int btf_type_compare(const void *left, const void *right) 589 653 { 590 654 const struct sort_datum *d1 = (const struct sort_datum *)left; 591 655 const struct sort_datum *d2 = (const struct sort_datum *)right; 592 656 int r; 593 657 594 - if (d1->type_rank != d2->type_rank) 595 - return d1->type_rank < d2->type_rank ? -1 : 1; 596 - 597 - r = strcmp(d1->sort_name, d2->sort_name); 658 + r = d1->type_rank - d2->type_rank; 659 + r = r ?: strcmp(d1->sort_name, d2->sort_name); 660 + r = r ?: strcmp(d1->own_name, d2->own_name); 598 661 if (r) 599 662 return r; 600 663 601 - return strcmp(d1->own_name, d2->own_name); 664 + if (d1->disambig_hash != d2->disambig_hash) 665 + return d1->disambig_hash < d2->disambig_hash ? 
-1 : 1; 666 + 667 + return d1->index - d2->index; 602 668 } 603 669 604 670 static struct sort_datum *sort_btf_c(const struct btf *btf) ··· 687 617 d->type_rank = btf_type_rank(btf, i, false); 688 618 d->sort_name = btf_type_sort_name(btf, i, false); 689 619 d->own_name = btf__name_by_offset(btf, t->name_off); 620 + d->disambig_hash = btf_type_disambig_hash(btf, i, true); 690 621 } 691 622 692 623 qsort(datums, n, sizeof(struct sort_datum), btf_type_compare);
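
Note on the comparator change above: the new `btf_type_compare()` falls through rank, sort name, own name, a content hash, and finally the original index, so that no two distinct entries compare equal and the `qsort()` output is the same from run to run. The following is a minimal sketch of that tie-break chain, not the bpftool code itself. `struct datum`, `mix()`, `datum_compare()` and `sort_datums()` are hypothetical stand-ins, and plain `if (!r)` is used in place of the GNU `?:` extension seen in the diff.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for bpftool's struct sort_datum. */
struct datum {
	int index;
	int type_rank;
	const char *sort_name;
	const char *own_name;
	uint64_t disambig_hash;
};

/* Same mixing step as hasher() in the diff: hash * 31 + val. */
static uint64_t mix(uint64_t hash, uint64_t val)
{
	return hash * 31 + val;
}

/* Compare by rank, then names, then hash, then original index. */
static int datum_compare(const void *left, const void *right)
{
	const struct datum *d1 = left;
	const struct datum *d2 = right;
	int r;

	r = d1->type_rank - d2->type_rank;
	if (!r)
		r = strcmp(d1->sort_name, d2->sort_name);
	if (!r)
		r = strcmp(d1->own_name, d2->own_name);
	if (r)
		return r;

	/* 64-bit hashes are compared explicitly; subtracting could truncate. */
	if (d1->disambig_hash != d2->disambig_hash)
		return d1->disambig_hash < d2->disambig_hash ? -1 : 1;

	/* Last resort: original position, which is unique per entry. */
	return d1->index - d2->index;
}

static void sort_datums(struct datum *d, size_t n)
{
	assert(d != NULL || n == 0);
	qsort(d, n, sizeof(*d), datum_compare);
}
```

Because the final key is unique, the result does not depend on how a particular `qsort()` implementation orders equal elements.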

+5 -5

tools/bpf/bpftool/feature.c

··· 196 196 { 197 197 long res; 198 198 199 - /* No support for C-style ouptut */ 199 + /* No support for C-style output */ 200 200 201 201 res = read_procfs("/proc/sys/kernel/unprivileged_bpf_disabled"); 202 202 if (json_output) { ··· 225 225 { 226 226 long res; 227 227 228 - /* No support for C-style ouptut */ 228 + /* No support for C-style output */ 229 229 230 230 res = read_procfs("/proc/sys/net/core/bpf_jit_enable"); 231 231 if (json_output) { ··· 255 255 { 256 256 long res; 257 257 258 - /* No support for C-style ouptut */ 258 + /* No support for C-style output */ 259 259 260 260 res = read_procfs("/proc/sys/net/core/bpf_jit_harden"); 261 261 if (json_output) { ··· 285 285 { 286 286 long res; 287 287 288 - /* No support for C-style ouptut */ 288 + /* No support for C-style output */ 289 289 290 290 res = read_procfs("/proc/sys/net/core/bpf_jit_kallsyms"); 291 291 if (json_output) { ··· 311 311 { 312 312 long res; 313 313 314 - /* No support for C-style ouptut */ 314 + /* No support for C-style output */ 315 315 316 316 res = read_procfs("/proc/sys/net/core/bpf_jit_limit"); 317 317 if (json_output) {

+70 -10

tools/bpf/bpftool/net.c

··· 67 67 NET_ATTACH_TYPE_XDP_GENERIC, 68 68 NET_ATTACH_TYPE_XDP_DRIVER, 69 69 NET_ATTACH_TYPE_XDP_OFFLOAD, 70 + NET_ATTACH_TYPE_TCX_INGRESS, 71 + NET_ATTACH_TYPE_TCX_EGRESS, 70 72 }; 71 73 72 74 static const char * const attach_type_strings[] = { ··· 76 74 [NET_ATTACH_TYPE_XDP_GENERIC] = "xdpgeneric", 77 75 [NET_ATTACH_TYPE_XDP_DRIVER] = "xdpdrv", 78 76 [NET_ATTACH_TYPE_XDP_OFFLOAD] = "xdpoffload", 77 + [NET_ATTACH_TYPE_TCX_INGRESS] = "tcx_ingress", 78 + [NET_ATTACH_TYPE_TCX_EGRESS] = "tcx_egress", 79 79 }; 80 80 81 81 static const char * const attach_loc_strings[] = { ··· 486 482 if (prog_flags[i] || json_output) { 487 483 NET_START_ARRAY("prog_flags", "%s "); 488 484 for (j = 0; prog_flags[i] && j < 32; j++) { 489 - if (!(prog_flags[i] & (1 << j))) 485 + if (!(prog_flags[i] & (1U << j))) 490 486 continue; 491 - NET_DUMP_UINT_ONLY(1 << j); 487 + NET_DUMP_UINT_ONLY(1U << j); 492 488 } 493 489 NET_END_ARRAY(""); 494 490 } ··· 497 493 if (link_flags[i] || json_output) { 498 494 NET_START_ARRAY("link_flags", "%s "); 499 495 for (j = 0; link_flags[i] && j < 32; j++) { 500 - if (!(link_flags[i] & (1 << j))) 496 + if (!(link_flags[i] & (1U << j))) 501 497 continue; 502 - NET_DUMP_UINT_ONLY(1 << j); 498 + NET_DUMP_UINT_ONLY(1U << j); 503 499 } 504 500 NET_END_ARRAY(""); 505 501 } ··· 651 647 return bpf_xdp_attach(ifindex, progfd, flags, NULL); 652 648 } 653 649 650 + static int get_tcx_type(enum net_attach_type attach_type) 651 + { 652 + switch (attach_type) { 653 + case NET_ATTACH_TYPE_TCX_INGRESS: 654 + return BPF_TCX_INGRESS; 655 + case NET_ATTACH_TYPE_TCX_EGRESS: 656 + return BPF_TCX_EGRESS; 657 + default: 658 + return -1; 659 + } 660 + } 661 + 662 + static int do_attach_tcx(int progfd, enum net_attach_type attach_type, int ifindex) 663 + { 664 + int type = get_tcx_type(attach_type); 665 + 666 + return bpf_prog_attach(progfd, ifindex, type, 0); 667 + } 668 + 669 + static int do_detach_tcx(int targetfd, enum net_attach_type attach_type) 670 + { 671 + int type = 
get_tcx_type(attach_type); 672 + 673 + return bpf_prog_detach(targetfd, type); 674 + } 675 + 654 676 static int do_attach(int argc, char **argv) 655 677 { 656 678 enum net_attach_type attach_type; ··· 714 684 } 715 685 } 716 686 687 + switch (attach_type) { 717 688 /* attach xdp prog */ 718 - if (is_prefix("xdp", attach_type_strings[attach_type])) 719 - err = do_attach_detach_xdp(progfd, attach_type, ifindex, 720 - overwrite); 689 + case NET_ATTACH_TYPE_XDP: 690 + case NET_ATTACH_TYPE_XDP_GENERIC: 691 + case NET_ATTACH_TYPE_XDP_DRIVER: 692 + case NET_ATTACH_TYPE_XDP_OFFLOAD: 693 + err = do_attach_detach_xdp(progfd, attach_type, ifindex, overwrite); 694 + break; 695 + /* attach tcx prog */ 696 + case NET_ATTACH_TYPE_TCX_INGRESS: 697 + case NET_ATTACH_TYPE_TCX_EGRESS: 698 + err = do_attach_tcx(progfd, attach_type, ifindex); 699 + break; 700 + default: 701 + break; 702 + } 703 + 721 704 if (err) { 722 705 p_err("interface %s attach failed: %s", 723 706 attach_type_strings[attach_type], strerror(-err)); ··· 764 721 if (ifindex < 1) 765 722 return -EINVAL; 766 723 724 + switch (attach_type) { 767 725 /* detach xdp prog */ 768 - progfd = -1; 769 - if (is_prefix("xdp", attach_type_strings[attach_type])) 726 + case NET_ATTACH_TYPE_XDP: 727 + case NET_ATTACH_TYPE_XDP_GENERIC: 728 + case NET_ATTACH_TYPE_XDP_DRIVER: 729 + case NET_ATTACH_TYPE_XDP_OFFLOAD: 730 + progfd = -1; 770 731 err = do_attach_detach_xdp(progfd, attach_type, ifindex, NULL); 732 + break; 733 + /* detach tcx prog */ 734 + case NET_ATTACH_TYPE_TCX_INGRESS: 735 + case NET_ATTACH_TYPE_TCX_EGRESS: 736 + err = do_detach_tcx(ifindex, attach_type); 737 + break; 738 + default: 739 + break; 740 + } 771 741 772 742 if (err < 0) { 773 743 p_err("interface %s detach failed: %s", ··· 879 823 nf_link_info[nf_link_count] = info; 880 824 nf_link_count++; 881 825 } 826 + 827 + if (!nf_link_info) 828 + return; 882 829 883 830 qsort(nf_link_info, nf_link_count, sizeof(*nf_link_info), netfilter_link_compar); 884 831 ··· 987 
928 " %1$s %2$s help\n" 988 929 "\n" 989 930 " " HELP_SPEC_PROGRAM "\n" 990 - " ATTACH_TYPE := { xdp | xdpgeneric | xdpdrv | xdpoffload }\n" 931 + " ATTACH_TYPE := { xdp | xdpgeneric | xdpdrv | xdpoffload | tcx_ingress\n" 932 + " | tcx_egress }\n" 991 933 " " HELP_SPEC_OPTIONS " }\n" 992 934 "\n" 993 935 "Note: Only xdp, tcx, tc, netkit, flow_dissector and netfilter attachments\n"
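
Note on the `1 << j` to `1U << j` change above: with `j` running up to 31, shifting the signed constant `1` left by 31 positions overflows `int`, which is undefined behaviour in C, while `1U << 31` is well defined. The following is a minimal sketch of the same loop shape under that assumption. `split_flags()` is a hypothetical helper for illustration, not a bpftool function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Collect each set bit of 'flags' as its own single-bit value,
 * lowest bit first. Returns the number of values written to 'out',
 * which must have room for 32 entries.
 */
static size_t split_flags(uint32_t flags, uint32_t out[32])
{
	size_t n = 0;
	unsigned int j;

	assert(out != NULL);
	for (j = 0; flags && j < 32; j++) {
		/* 1U keeps the shift unsigned; 1 << 31 would overflow int. */
		if (!(flags & (1U << j)))
			continue;
		out[n++] = 1U << j;
	}
	return n;
}
```

Calling it with a value that has bit 31 set exercises exactly the case the diff is guarding against.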

+2 -2

tools/bpf/bpftool/xlated_dumper.c

···

 		double_insn = insn[i].code == (BPF_LD | BPF_IMM | BPF_DW);

-		printf("% 4d: ", i);
+		printf("%4u: ", i);
 		print_bpf_insn(&cbs, insn + i, true);

 		if (opcodes) {
···
 			}
 		}

-		printf("%d: ", insn_off);
+		printf("%u: ", insn_off);
 		print_bpf_insn(&cbs, cur, true);

 		if (opcodes) {

+2 -1

tools/bpf/runqslower/Makefile

···
 CFLAGS := -g -Wall $(CLANG_CROSS_FLAGS)
 CFLAGS += $(EXTRA_CFLAGS)
 LDFLAGS += $(EXTRA_LDFLAGS)
+LDLIBS += -lelf -lz

 # Try to detect best kernel BTF source
 KERNEL_REL := $(shell uname -r)
···
 libbpf_hdrs: $(BPFOBJ)

 $(OUTPUT)/runqslower: $(OUTPUT)/runqslower.o $(BPFOBJ)
-	$(QUIET_LINK)$(CC) $(CFLAGS) $^ -lelf -lz -o $@
+	$(QUIET_LINK)$(CC) $(CFLAGS) $(LDFLAGS) $^ $(LDLIBS) -o $@

 $(OUTPUT)/runqslower.o: runqslower.h $(OUTPUT)/runqslower.skel.h \
 	$(OUTPUT)/runqslower.bpf.o | libbpf_hdrs

+9

tools/include/uapi/linux/bpf.h

···
 	__u64 __opaque[1];
 } __attribute__((aligned(8)));

+/*
+ * Flags to control BPF kfunc behaviour.
+ * - BPF_F_PAD_ZEROS: Pad destination buffer with zeros. (See the respective
+ *   helper documentation for details.)
+ */
+enum bpf_kfunc_flags {
+	BPF_F_PAD_ZEROS = (1ULL << 0),
+};
+
 #endif /* _UAPI__LINUX_BPF_H__ */

+2 -2

tools/lib/bpf/bpf.h

··· 100 100 __u32 log_level; 101 101 __u32 log_size; 102 102 char *log_buf; 103 - /* output: actual total log contents size (including termintaing zero). 103 + /* output: actual total log contents size (including terminating zero). 104 104 * It could be both larger than original log_size (if log was 105 105 * truncated), or smaller (if log buffer wasn't filled completely). 106 106 * If kernel doesn't support this feature, log_size is left unchanged. ··· 129 129 char *log_buf; 130 130 __u32 log_level; 131 131 __u32 log_size; 132 - /* output: actual total log contents size (including termintaing zero). 132 + /* output: actual total log contents size (including terminating zero). 133 133 * It could be both larger than original log_size (if log was 134 134 * truncated), or smaller (if log buffer wasn't filled completely). 135 135 * If kernel doesn't support this feature, log_size is left unchanged.

+1 -1

tools/lib/bpf/bpf_helpers.h

··· 341 341 * I.e., it looks almost like high-level for each loop in other languages, 342 342 * supports continue/break, and is verifiable by BPF verifier. 343 343 * 344 - * For iterating integers, the difference betwen bpf_for_each(num, i, N, M) 344 + * For iterating integers, the difference between bpf_for_each(num, i, N, M) 345 345 * and bpf_for(i, N, M) is in that bpf_for() provides additional proof to 346 346 * verifier that i is in [N, M) range, and in bpf_for_each() case i is `int 347 347 * *`, not just `int`. So for integers bpf_for() is more convenient.

+16 -9

tools/lib/bpf/bpf_tracing.h

··· 163 163 164 164 struct pt_regs___s390 { 165 165 unsigned long orig_gpr2; 166 - }; 166 + } __attribute__((preserve_access_index)); 167 167 168 168 /* s390 provides user_pt_regs instead of struct pt_regs to userspace */ 169 169 #define __PT_REGS_CAST(x) ((const user_pt_regs *)(x)) ··· 179 179 #define __PT_PARM4_SYSCALL_REG __PT_PARM4_REG 180 180 #define __PT_PARM5_SYSCALL_REG __PT_PARM5_REG 181 181 #define __PT_PARM6_SYSCALL_REG gprs[7] 182 - #define PT_REGS_PARM1_SYSCALL(x) PT_REGS_PARM1_CORE_SYSCALL(x) 182 + #define PT_REGS_PARM1_SYSCALL(x) (((const struct pt_regs___s390 *)(x))->__PT_PARM1_SYSCALL_REG) 183 183 #define PT_REGS_PARM1_CORE_SYSCALL(x) \ 184 184 BPF_CORE_READ((const struct pt_regs___s390 *)(x), __PT_PARM1_SYSCALL_REG) 185 185 ··· 222 222 223 223 struct pt_regs___arm64 { 224 224 unsigned long orig_x0; 225 - }; 225 + } __attribute__((preserve_access_index)); 226 226 227 227 /* arm64 provides struct user_pt_regs instead of struct pt_regs to userspace */ 228 228 #define __PT_REGS_CAST(x) ((const struct user_pt_regs *)(x)) ··· 241 241 #define __PT_PARM4_SYSCALL_REG __PT_PARM4_REG 242 242 #define __PT_PARM5_SYSCALL_REG __PT_PARM5_REG 243 243 #define __PT_PARM6_SYSCALL_REG __PT_PARM6_REG 244 - #define PT_REGS_PARM1_SYSCALL(x) PT_REGS_PARM1_CORE_SYSCALL(x) 244 + #define PT_REGS_PARM1_SYSCALL(x) (((const struct pt_regs___arm64 *)(x))->__PT_PARM1_SYSCALL_REG) 245 245 #define PT_REGS_PARM1_CORE_SYSCALL(x) \ 246 246 BPF_CORE_READ((const struct pt_regs___arm64 *)(x), __PT_PARM1_SYSCALL_REG) 247 247 ··· 351 351 * https://github.com/riscv-non-isa/riscv-elf-psabi-doc/blob/master/riscv-cc.adoc#risc-v-calling-conventions 352 352 */ 353 353 354 + struct pt_regs___riscv { 355 + unsigned long orig_a0; 356 + } __attribute__((preserve_access_index)); 357 + 354 358 /* riscv provides struct user_regs_struct instead of struct pt_regs to userspace */ 355 359 #define __PT_REGS_CAST(x) ((const struct user_regs_struct *)(x)) 356 360 #define __PT_PARM1_REG a0 ··· 366 362 #define 
__PT_PARM7_REG a6 367 363 #define __PT_PARM8_REG a7 368 364 369 - #define __PT_PARM1_SYSCALL_REG __PT_PARM1_REG 365 + #define __PT_PARM1_SYSCALL_REG orig_a0 370 366 #define __PT_PARM2_SYSCALL_REG __PT_PARM2_REG 371 367 #define __PT_PARM3_SYSCALL_REG __PT_PARM3_REG 372 368 #define __PT_PARM4_SYSCALL_REG __PT_PARM4_REG 373 369 #define __PT_PARM5_SYSCALL_REG __PT_PARM5_REG 374 370 #define __PT_PARM6_SYSCALL_REG __PT_PARM6_REG 371 + #define PT_REGS_PARM1_SYSCALL(x) (((const struct pt_regs___riscv *)(x))->__PT_PARM1_SYSCALL_REG) 372 + #define PT_REGS_PARM1_CORE_SYSCALL(x) \ 373 + BPF_CORE_READ((const struct pt_regs___riscv *)(x), __PT_PARM1_SYSCALL_REG) 375 374 376 375 #define __PT_RET_REG ra 377 376 #define __PT_FP_REG s0 ··· 480 473 #endif 481 474 /* 482 475 * Similarly, syscall-specific conventions might differ between function call 483 - * conventions within each architecutre. All supported architectures pass 476 + * conventions within each architecture. All supported architectures pass 484 477 * either 6 or 7 syscall arguments in registers. 485 478 * 486 479 * See syscall(2) manpage for succinct table with information on each arch. ··· 522 515 #define BPF_KPROBE_READ_RET_IP(ip, ctx) ({ (ip) = (ctx)->link; }) 523 516 #define BPF_KRETPROBE_READ_RET_IP BPF_KPROBE_READ_RET_IP 524 517 525 - #elif defined(bpf_target_sparc) 518 + #elif defined(bpf_target_sparc) || defined(bpf_target_arm64) 526 519 527 520 #define BPF_KPROBE_READ_RET_IP(ip, ctx) ({ (ip) = PT_REGS_RET(ctx); }) 528 521 #define BPF_KRETPROBE_READ_RET_IP BPF_KPROBE_READ_RET_IP ··· 658 651 * BPF_PROG is a convenience wrapper for generic tp_btf/fentry/fexit and 659 652 * similar kinds of BPF programs, that accept input arguments as a single 660 653 * pointer to untyped u64 array, where each u64 can actually be a typed 661 - * pointer or integer of different size. Instead of requring user to write 654 + * pointer or integer of different size. 
Instead of requiring user to write 662 655 * manual casts and work with array elements by index, BPF_PROG macro 663 656 * allows user to declare a list of named and typed input arguments in the 664 657 * same syntax as for normal C function. All the casting is hidden and ··· 808 801 * tp_btf/fentry/fexit BPF programs. It hides the underlying platform-specific 809 802 * low-level way of getting kprobe input arguments from struct pt_regs, and 810 803 * provides a familiar typed and named function arguments syntax and 811 - * semantics of accessing kprobe input paremeters. 804 + * semantics of accessing kprobe input parameters. 812 805 * 813 806 * Original struct pt_regs* context is preserved as 'ctx' argument. This might 814 807 * be necessary when using BPF helpers like bpf_perf_event_output().

+6 -2

tools/lib/bpf/btf.c

··· 996 996 btf->base_btf = base_btf; 997 997 btf->start_id = btf__type_cnt(base_btf); 998 998 btf->start_str_off = base_btf->hdr->str_len; 999 + btf->swapped_endian = base_btf->swapped_endian; 999 1000 } 1000 1001 1001 1002 /* +1 for empty string at offset 0 */ ··· 4192 4191 * and canonical graphs are not compatible structurally, whole graphs are 4193 4192 * incompatible. If types are structurally equivalent (i.e., all information 4194 4193 * except referenced type IDs is exactly the same), a mapping from `canon_id` to 4195 - * a `cand_id` is recored in hypothetical mapping (`btf_dedup->hypot_map`). 4194 + * a `cand_id` is recoded in hypothetical mapping (`btf_dedup->hypot_map`). 4196 4195 * If a type references other types, then those referenced types are checked 4197 4196 * for equivalence recursively. 4198 4197 * ··· 4230 4229 * consists of portions of the graph that come from multiple compilation units. 4231 4230 * This is due to the fact that types within single compilation unit are always 4232 4231 * deduplicated and FWDs are already resolved, if referenced struct/union 4233 - * definiton is available. So, if we had unresolved FWD and found corresponding 4232 + * definition is available. So, if we had unresolved FWD and found corresponding 4234 4233 * STRUCT/UNION, they will be from different compilation units. This 4235 4234 * consequently means that when we "link" FWD to corresponding STRUCT/UNION, 4236 4235 * type graph will likely have at least two different BTF types that describe ··· 5395 5394 new_base = btf__new_empty(); 5396 5395 if (!new_base) 5397 5396 return libbpf_err(-ENOMEM); 5397 + 5398 + btf__set_endianness(new_base, btf__endianness(src_btf)); 5399 + 5398 5400 dist.id_map = calloc(n, sizeof(*dist.id_map)); 5399 5401 if (!dist.id_map) { 5400 5402 err = -ENOMEM;

+1 -1

tools/lib/bpf/btf.h

··· 286 286 LIBBPF_API int btf_dump__dump_type(struct btf_dump *d, __u32 id); 287 287 288 288 struct btf_dump_emit_type_decl_opts { 289 - /* size of this struct, for forward/backward compatiblity */ 289 + /* size of this struct, for forward/backward compatibility */ 290 290 size_t sz; 291 291 /* optional field name for type declaration, e.g.: 292 292 * - struct my_struct <FNAME>

+1 -1

tools/lib/bpf/btf_dump.c

··· 304 304 * definition, in which case they have to be declared inline as part of field 305 305 * type declaration; or as a top-level anonymous enum, typically used for 306 306 * declaring global constants. It's impossible to distinguish between two 307 - * without knowning whether given enum type was referenced from other type: 307 + * without knowing whether given enum type was referenced from other type: 308 308 * top-level anonymous enum won't be referenced by anything, while embedded 309 309 * one will. 310 310 */

+1 -1

tools/lib/bpf/btf_relocate.c

+3

tools/lib/bpf/elf.c

···
 	int fd, ret;
 	Elf *elf;

+	elf_fd->elf = NULL;
+	elf_fd->fd = -1;
+
 	if (elf_version(EV_CURRENT) == EV_NONE) {
 		pr_warn("elf: failed to init libelf for %s\n", binary_path);
 		return -LIBBPF_ERRNO__LIBELF;

+41 -47

tools/lib/bpf/libbpf.c

··· 496 496 }; 497 497 498 498 struct bpf_struct_ops { 499 - const char *tname; 500 - const struct btf_type *type; 501 499 struct bpf_program **progs; 502 500 __u32 *kern_func_off; 503 501 /* e.g. struct tcp_congestion_ops in bpf_prog's btf format */ ··· 986 988 { 987 989 const struct btf_type *kern_type, *kern_vtype; 988 990 const struct btf_member *kern_data_member; 989 - struct btf *btf; 991 + struct btf *btf = NULL; 990 992 __s32 kern_vtype_id, kern_type_id; 991 993 char tname[256]; 992 994 __u32 i; ··· 1081 1083 continue; 1082 1084 1083 1085 for (j = 0; j < obj->nr_maps; ++j) { 1086 + const struct btf_type *type; 1087 + 1084 1088 map = &obj->maps[j]; 1085 1089 if (!bpf_map__is_struct_ops(map)) 1086 1090 continue; 1087 1091 1088 - vlen = btf_vlen(map->st_ops->type); 1092 + type = btf__type_by_id(obj->btf, map->st_ops->type_id); 1093 + vlen = btf_vlen(type); 1089 1094 for (k = 0; k < vlen; ++k) { 1090 1095 slot_prog = map->st_ops->progs[k]; 1091 1096 if (prog != slot_prog) ··· 1116 1115 const struct btf *btf = obj->btf; 1117 1116 struct bpf_struct_ops *st_ops; 1118 1117 const struct btf *kern_btf; 1119 - struct module_btf *mod_btf; 1118 + struct module_btf *mod_btf = NULL; 1120 1119 void *data, *kern_data; 1121 1120 const char *tname; 1122 1121 int err; 1123 1122 1124 1123 st_ops = map->st_ops; 1125 - type = st_ops->type; 1126 - tname = st_ops->tname; 1124 + type = btf__type_by_id(btf, st_ops->type_id); 1125 + tname = btf__name_by_offset(btf, type->name_off); 1127 1126 err = find_struct_ops_kern_types(obj, tname, &mod_btf, 1128 1127 &kern_type, &kern_type_id, 1129 1128 &kern_vtype, &kern_vtype_id, ··· 1424 1423 memcpy(st_ops->data, 1425 1424 data->d_buf + vsi->offset, 1426 1425 type->size); 1427 - st_ops->tname = tname; 1428 - st_ops->type = type; 1429 1426 st_ops->type_id = type_id; 1430 1427 1431 1428 pr_debug("struct_ops init: struct %s(type_id=%u) %s found at offset %u\n", ··· 1848 1849 snprintf(map_name, sizeof(map_name), "%.*s%.*s", pfx_len, obj->name, 
1849 1850 sfx_len, real_name); 1850 1851 1851 - /* sanitise map name to characters allowed by kernel */ 1852 + /* sanities map name to characters allowed by kernel */ 1852 1853 for (p = map_name; *p && p < map_name + sizeof(map_name); p++) 1853 1854 if (!isalnum(*p) && *p != '_' && *p != '.') 1854 1855 *p = '_'; ··· 7905 7906 } 7906 7907 7907 7908 static struct bpf_object *bpf_object_open(const char *path, const void *obj_buf, size_t obj_buf_sz, 7909 + const char *obj_name, 7908 7910 const struct bpf_object_open_opts *opts) 7909 7911 { 7910 - const char *obj_name, *kconfig, *btf_tmp_path, *token_path; 7912 + const char *kconfig, *btf_tmp_path, *token_path; 7911 7913 struct bpf_object *obj; 7912 - char tmp_name[64]; 7913 7914 int err; 7914 7915 char *log_buf; 7915 7916 size_t log_size; 7916 7917 __u32 log_level; 7918 + 7919 + if (obj_buf && !obj_name) 7920 + return ERR_PTR(-EINVAL); 7917 7921 7918 7922 if (elf_version(EV_CURRENT) == EV_NONE) { 7919 7923 pr_warn("failed to init libelf for %s\n", ··· 7927 7925 if (!OPTS_VALID(opts, bpf_object_open_opts)) 7928 7926 return ERR_PTR(-EINVAL); 7929 7927 7930 - obj_name = OPTS_GET(opts, object_name, NULL); 7928 + obj_name = OPTS_GET(opts, object_name, NULL) ?: obj_name; 7931 7929 if (obj_buf) { 7932 - if (!obj_name) { 7933 - snprintf(tmp_name, sizeof(tmp_name), "%lx-%lx", 7934 - (unsigned long)obj_buf, 7935 - (unsigned long)obj_buf_sz); 7936 - obj_name = tmp_name; 7937 - } 7938 7930 path = obj_name; 7939 7931 pr_debug("loading object '%s' from buffer\n", obj_name); 7932 + } else { 7933 + pr_debug("loading object from %s\n", path); 7940 7934 } 7941 7935 7942 7936 log_buf = OPTS_GET(opts, kernel_log_buf, NULL); ··· 8016 8018 if (!path) 8017 8019 return libbpf_err_ptr(-EINVAL); 8018 8020 8019 - pr_debug("loading %s\n", path); 8020 - 8021 - return libbpf_ptr(bpf_object_open(path, NULL, 0, opts)); 8021 + return libbpf_ptr(bpf_object_open(path, NULL, 0, NULL, opts)); 8022 8022 } 8023 8023 8024 8024 struct bpf_object 
*bpf_object__open(const char *path) ··· 8028 8032 bpf_object__open_mem(const void *obj_buf, size_t obj_buf_sz, 8029 8033 const struct bpf_object_open_opts *opts) 8030 8034 { 8035 + char tmp_name[64]; 8036 + 8031 8037 if (!obj_buf || obj_buf_sz == 0) 8032 8038 return libbpf_err_ptr(-EINVAL); 8033 8039 8034 - return libbpf_ptr(bpf_object_open(NULL, obj_buf, obj_buf_sz, opts)); 8040 + /* create a (quite useless) default "name" for this memory buffer object */ 8041 + snprintf(tmp_name, sizeof(tmp_name), "%lx-%zx", (unsigned long)obj_buf, obj_buf_sz); 8042 + 8043 + return libbpf_ptr(bpf_object_open(NULL, obj_buf, obj_buf_sz, tmp_name, opts)); 8035 8044 } 8036 8045 8037 8046 static int bpf_object_unload(struct bpf_object *obj) ··· 8446 8445 8447 8446 static void bpf_map_prepare_vdata(const struct bpf_map *map) 8448 8447 { 8448 + const struct btf_type *type; 8449 8449 struct bpf_struct_ops *st_ops; 8450 8450 __u32 i; 8451 8451 8452 8452 st_ops = map->st_ops; 8453 - for (i = 0; i < btf_vlen(st_ops->type); i++) { 8453 + type = btf__type_by_id(map->obj->btf, st_ops->type_id); 8454 + for (i = 0; i < btf_vlen(type); i++) { 8454 8455 struct bpf_program *prog = st_ops->progs[i]; 8455 8456 void *kern_data; 8456 8457 int prog_fd; ··· 9057 9054 unsigned int bpf_object__kversion(const struct bpf_object *obj) 9058 9055 { 9059 9056 return obj ? 
obj->kern_version : 0; 9057 + } 9058 + 9059 + int bpf_object__token_fd(const struct bpf_object *obj) 9060 + { 9061 + return obj->token_fd ?: -1; 9060 9062 } 9061 9063 9062 9064 struct btf *bpf_object__btf(const struct bpf_object *obj) ··· 9720 9712 static int bpf_object__collect_st_ops_relos(struct bpf_object *obj, 9721 9713 Elf64_Shdr *shdr, Elf_Data *data) 9722 9714 { 9715 + const struct btf_type *type; 9723 9716 const struct btf_member *member; 9724 9717 struct bpf_struct_ops *st_ops; 9725 9718 struct bpf_program *prog; ··· 9780 9771 } 9781 9772 insn_idx = sym->st_value / BPF_INSN_SZ; 9782 9773 9783 - member = find_member_by_offset(st_ops->type, moff * 8); 9774 + type = btf__type_by_id(btf, st_ops->type_id); 9775 + member = find_member_by_offset(type, moff * 8); 9784 9776 if (!member) { 9785 9777 pr_warn("struct_ops reloc %s: cannot find member at moff %u\n", 9786 9778 map->name, moff); 9787 9779 return -EINVAL; 9788 9780 } 9789 - member_idx = member - btf_members(st_ops->type); 9781 + member_idx = member - btf_members(type); 9790 9782 name = btf__name_by_offset(btf, member->name_off); 9791 9783 9792 9784 if (!resolve_func_ptr(btf, member->type, NULL)) { ··· 11693 11683 ret = 0; 11694 11684 break; 11695 11685 case 3: 11696 - opts.retprobe = strcmp(probe_type, "uretprobe.multi") == 0; 11686 + opts.retprobe = str_has_pfx(probe_type, "uretprobe.multi"); 11697 11687 *link = bpf_program__attach_uprobe_multi(prog, -1, binary_path, func_name, &opts); 11698 11688 ret = libbpf_get_error(*link); 11699 11689 break; ··· 13768 13758 int bpf_object__open_skeleton(struct bpf_object_skeleton *s, 13769 13759 const struct bpf_object_open_opts *opts) 13770 13760 { 13771 - DECLARE_LIBBPF_OPTS(bpf_object_open_opts, skel_opts, 13772 - .object_name = s->name, 13773 - ); 13774 13761 struct bpf_object *obj; 13775 13762 int err; 13776 13763 13777 - /* Attempt to preserve opts->object_name, unless overriden by user 13778 - * explicitly. 
Overwriting object name for skeletons is discouraged, 13779 - * as it breaks global data maps, because they contain object name 13780 - * prefix as their own map name prefix. When skeleton is generated, 13781 - * bpftool is making an assumption that this name will stay the same. 13782 - */ 13783 - if (opts) { 13784 - memcpy(&skel_opts, opts, sizeof(*opts)); 13785 - if (!opts->object_name) 13786 - skel_opts.object_name = s->name; 13787 - } 13788 - 13789 - obj = bpf_object__open_mem(s->data, s->data_sz, &skel_opts); 13790 - err = libbpf_get_error(obj); 13791 - if (err) { 13792 - pr_warn("failed to initialize skeleton BPF object '%s': %d\n", 13793 - s->name, err); 13764 + obj = bpf_object_open(NULL, s->data, s->data_sz, s->name, opts); 13765 + if (IS_ERR(obj)) { 13766 + err = PTR_ERR(obj); 13767 + pr_warn("failed to initialize skeleton BPF object '%s': %d\n", s->name, err); 13794 13768 return libbpf_err(err); 13795 13769 } 13796 13770
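
Note on the `bpf_object_open()` refactoring above: the object name now comes from `opts->object_name` when the caller sets it, and otherwise from the name the caller passes in (the skeleton name, or a generated `"%lx-%zx"` string for a bare memory buffer). The following is a minimal sketch of that precedence only. `pick_obj_name()` and `default_mem_name()` are hypothetical helpers written for illustration and are not part of libbpf.

```c
#include <assert.h>
#include <stdio.h>

/* An explicit name from the options wins; otherwise use the fallback. */
static const char *pick_obj_name(const char *opts_name, const char *fallback)
{
	return opts_name ? opts_name : fallback;
}

/*
 * Build a default name for an in-memory object from its address and
 * size, using the same "%lx-%zx" format as the diff. Returns the
 * snprintf() result.
 */
static int default_mem_name(char *buf, size_t sz,
			    const void *obj_buf, size_t obj_buf_sz)
{
	assert(buf != NULL && sz > 0);
	return snprintf(buf, sz, "%lx-%zx", (unsigned long)obj_buf, obj_buf_sz);
}
```

With this ordering a skeleton keeps its own name unless the caller explicitly overrides it, which matches the intent described in the comment the diff removes.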

+13 -5

tools/lib/bpf/libbpf.h

··· 152 152 * log_buf and log_level settings. 153 153 * 154 154 * If specified, this log buffer will be passed for: 155 - * - each BPF progral load (BPF_PROG_LOAD) attempt, unless overriden 155 + * - each BPF progral load (BPF_PROG_LOAD) attempt, unless overridden 156 156 * with bpf_program__set_log() on per-program level, to get 157 157 * BPF verifier log output. 158 158 * - during BPF object's BTF load into kernel (BPF_BTF_LOAD) to get ··· 293 293 LIBBPF_API const char *bpf_object__name(const struct bpf_object *obj); 294 294 LIBBPF_API unsigned int bpf_object__kversion(const struct bpf_object *obj); 295 295 LIBBPF_API int bpf_object__set_kversion(struct bpf_object *obj, __u32 kern_version); 296 + 297 + /** 298 + * @brief **bpf_object__token_fd** is an accessor for BPF token FD associated 299 + * with BPF object. 300 + * @param obj Pointer to a valid BPF object 301 + * @return BPF token FD or -1, if it wasn't set 302 + */ 303 + LIBBPF_API int bpf_object__token_fd(const struct bpf_object *obj); 296 304 297 305 struct btf; 298 306 LIBBPF_API struct btf *bpf_object__btf(const struct bpf_object *obj); ··· 463 455 /** 464 456 * @brief **bpf_program__attach()** is a generic function for attaching 465 457 * a BPF program based on auto-detection of program type, attach type, 466 - * and extra paremeters, where applicable. 458 + * and extra parameters, where applicable. 467 459 * 468 460 * @param prog BPF program to attach 469 461 * @return Reference to the newly created BPF link; or NULL is returned on error, ··· 687 679 /** 688 680 * @brief **bpf_program__attach_uprobe()** attaches a BPF program 689 681 * to the userspace function which is found by binary path and 690 - * offset. You can optionally specify a particular proccess to attach 682 + * offset. You can optionally specify a particular process to attach 691 683 * to. You can also optionally attach the program to the function 692 684 * exit instead of entry. 
693 685 * ··· 1601 1593 * memory region of the ring buffer. 1602 1594 * This ring buffer can be used to implement a custom events consumer. 1603 1595 * The ring buffer starts with the *struct perf_event_mmap_page*, which 1604 - * holds the ring buffer managment fields, when accessing the header 1596 + * holds the ring buffer management fields, when accessing the header 1605 1597 * structure it's important to be SMP aware. 1606 1598 * You can refer to *perf_event_read_simple* for a simple example. 1607 1599 * @param pb the perf buffer structure 1608 - * @param buf_idx the buffer index to retreive 1600 + * @param buf_idx the buffer index to retrieve 1609 1601 * @param buf (out) gets the base pointer of the mmap()'ed memory 1610 1602 * @param buf_size (out) gets the size of the mmap()'ed region 1611 1603 * @return 0 on success, negative error code for failure

+1

tools/lib/bpf/libbpf.map

···
 		btf__relocate;
 		bpf_map__autoattach;
 		bpf_map__set_autoattach;
+		bpf_object__token_fd;
 		bpf_program__attach_sockmap;
 		ring__consume_n;
 		ring_buffer__consume_n;

+2 -2

tools/lib/bpf/libbpf_legacy.h

··· 76 76 * first BPF program or map creation operation. This is done only if 77 77 * kernel is too old to support memcg-based memory accounting for BPF 78 78 * subsystem. By default, RLIMIT_MEMLOCK limit is set to RLIM_INFINITY, 79 - * but it can be overriden with libbpf_set_memlock_rlim() API. 79 + * but it can be overridden with libbpf_set_memlock_rlim() API. 80 80 * Note that libbpf_set_memlock_rlim() needs to be called before 81 81 * the very first bpf_prog_load(), bpf_map_create() or bpf_object__load() 82 82 * operation. ··· 97 97 * @brief **libbpf_get_error()** extracts the error code from the passed 98 98 * pointer 99 99 * @param ptr pointer returned from libbpf API function 100 - * @return error code; or 0 if no error occured 100 + * @return error code; or 0 if no error occurred 101 101 * 102 102 * Note, as of libbpf 1.0 this function is not necessary and not recommended 103 103 * to be used. Libbpf doesn't return error code embedded into the pointer

+2 -2

tools/lib/bpf/linker.c

··· 1413 1413 return true; 1414 1414 case BTF_KIND_PTR: 1415 1415 /* just validate overall shape of the referenced type, so no 1416 - * contents comparison for struct/union, and allowd fwd vs 1416 + * contents comparison for struct/union, and allowed fwd vs 1417 1417 * struct/union 1418 1418 */ 1419 1419 exact = false; ··· 1962 1962 1963 1963 /* If existing symbol is a strong resolved symbol, bail out, 1964 1964 * because we lost resolution battle have nothing to 1965 - * contribute. We already checked abover that there is no 1965 + * contribute. We already checked above that there is no 1966 1966 * strong-strong conflict. We also already tightened binding 1967 1967 * and visibility, so nothing else to contribute at that point. 1968 1968 */

+1 -1

tools/lib/bpf/skel_internal.h

··· 107 107 * The loader program will perform probe_read_kernel() from maps.rodata.initial_value. 108 108 * skel_finalize_map_data() sets skel->rodata to point to actual value in a bpf map and 109 109 * does maps.rodata.initial_value = ~0ULL to signal skel_free_map_data() that kvfree 110 - * is not nessary. 110 + * is not necessary. 111 111 * 112 112 * For user space: 113 113 * skel_prep_map_data() mmaps anon memory into skel->rodata that can be accessed directly.

+1 -1

tools/lib/bpf/usdt.bpf.h

···
 struct __bpf_usdt_arg_spec {
 	/* u64 scalar interpreted depending on arg_type, see below */
 	__u64 val_off;
-	/* arg location case, see bpf_udst_arg() for details */
+	/* arg location case, see bpf_usdt_arg() for details */
 	enum __bpf_usdt_arg_type arg_type;
 	/* offset of referenced register within struct pt_regs */
 	short reg_off;

+2 -4

tools/testing/selftests/bpf/.gitignore

··· 8 8 test_lpm_map 9 9 test_tag 10 10 FEATURE-DUMP.libbpf 11 + FEATURE-DUMP.selftests 11 12 fixdep 12 - test_dev_cgroup 13 13 /test_progs 14 14 /test_progs-no_alu32 15 15 /test_progs-bpf_gcc ··· 20 20 urandom_read 21 21 test_sockmap 22 22 test_lirc_mode2_user 23 - get_cgroup_id_user 24 - test_skb_cgroup_id_user 25 - test_cgroup_storage 26 23 test_flow_dissector 27 24 flow_dissector_load 28 25 test_tcpnotify_user ··· 28 31 test_sysctl 29 32 xdping 30 33 test_cpp 34 + *.d 31 35 *.subskel.h 32 36 *.skel.h 33 37 *.lskel.h

+3

tools/testing/selftests/bpf/DENYLIST.riscv64

···
+# riscv64 deny list for BPF CI and local vmtest
+exceptions # JIT does not support exceptions
+tailcalls/tailcall_bpf2bpf* # JIT does not support mixing bpf2bpf and tailcalls

+118 -33

tools/testing/selftests/bpf/Makefile

··· 33 33 LIBELF_CFLAGS := $(shell $(PKG_CONFIG) libelf --cflags 2>/dev/null) 34 34 LIBELF_LIBS := $(shell $(PKG_CONFIG) libelf --libs 2>/dev/null || echo -lelf) 35 35 36 + ifeq ($(srctree),) 37 + srctree := $(patsubst %/,%,$(dir $(CURDIR))) 38 + srctree := $(patsubst %/,%,$(dir $(srctree))) 39 + srctree := $(patsubst %/,%,$(dir $(srctree))) 40 + srctree := $(patsubst %/,%,$(dir $(srctree))) 41 + endif 42 + 36 43 CFLAGS += -g $(OPT_FLAGS) -rdynamic \ 37 44 -Wall -Werror -fno-omit-frame-pointer \ 38 45 $(GENFLAGS) $(SAN_CFLAGS) $(LIBELF_CFLAGS) \ ··· 47 40 -I$(TOOLSINCDIR) -I$(APIDIR) -I$(OUTPUT) 48 41 LDFLAGS += $(SAN_LDFLAGS) 49 42 LDLIBS += $(LIBELF_LIBS) -lz -lrt -lpthread 43 + 44 + PCAP_CFLAGS := $(shell $(PKG_CONFIG) --cflags libpcap 2>/dev/null && echo "-DTRAFFIC_MONITOR=1") 45 + PCAP_LIBS := $(shell $(PKG_CONFIG) --libs libpcap 2>/dev/null) 46 + LDLIBS += $(PCAP_LIBS) 47 + CFLAGS += $(PCAP_CFLAGS) 50 48 51 49 # The following tests perform type punning and they may break strict 52 50 # aliasing rules, which are exploited by both GCC and clang by default ··· 66 54 progs/test_sk_lookup.c-CFLAGS := -fno-strict-aliasing 67 55 progs/timer_crash.c-CFLAGS := -fno-strict-aliasing 68 56 progs/test_global_func9.c-CFLAGS := -fno-strict-aliasing 57 + progs/verifier_nocsr.c-CFLAGS := -fno-strict-aliasing 58 + 59 + # Some utility functions use LLVM libraries 60 + jit_disasm_helpers.c-CFLAGS = $(LLVM_CFLAGS) 69 61 70 62 ifneq ($(LLVM),) 71 63 # Silence some warnings when compiled with clang ··· 83 67 84 68 # Order correspond to 'make run_tests' order 85 69 TEST_GEN_PROGS = test_verifier test_tag test_maps test_lru_map test_lpm_map test_progs \ 86 - test_dev_cgroup \ 87 - test_sock test_sockmap get_cgroup_id_user \ 88 - test_cgroup_storage \ 70 + test_sock test_sockmap \ 89 71 test_tcpnotify_user test_sysctl \ 90 72 test_progs-no_alu32 91 73 TEST_INST_SUBDIRS := no_alu32 ··· 129 115 test_xdp_redirect.sh \ 130 116 test_xdp_redirect_multi.sh \ 131 117 test_xdp_meta.sh \ 132 - 
test_xdp_veth.sh \ 133 118 test_tunnel.sh \ 134 119 test_lwt_seg6local.sh \ 135 120 test_lirc_mode2.sh \ ··· 153 140 test_xdp_vlan.sh test_bpftool.py 154 141 155 142 # Compile but not part of 'make run_tests' 156 - TEST_GEN_PROGS_EXTENDED = test_skb_cgroup_id_user \ 143 + TEST_GEN_PROGS_EXTENDED = \ 157 144 flow_dissector_load test_flow_dissector test_tcp_check_syncookie_user \ 158 145 test_lirc_mode2_user xdping test_cpp runqslower bench bpf_testmod.ko \ 159 146 xskxceiver xdp_redirect_multi xdp_synproxy veristat xdp_hw_metadata \ ··· 178 165 endef 179 166 180 167 include ../lib.mk 168 + 169 + NON_CHECK_FEAT_TARGETS := clean docs-clean 170 + CHECK_FEAT := $(filter-out $(NON_CHECK_FEAT_TARGETS),$(or $(MAKECMDGOALS), "none")) 171 + ifneq ($(CHECK_FEAT),) 172 + FEATURE_USER := .selftests 173 + FEATURE_TESTS := llvm 174 + FEATURE_DISPLAY := $(FEATURE_TESTS) 175 + 176 + # Makefile.feature expects OUTPUT to end with a slash 177 + ifeq ($(shell expr $(MAKE_VERSION) \>= 4.4), 1) 178 + $(let OUTPUT,$(OUTPUT)/,\ 179 + $(eval include ../../../build/Makefile.feature)) 180 + else 181 + OUTPUT := $(OUTPUT)/ 182 + $(eval include ../../../build/Makefile.feature) 183 + OUTPUT := $(patsubst %/,%,$(OUTPUT)) 184 + endif 185 + endif 186 + 187 + ifeq ($(feature-llvm),1) 188 + LLVM_CFLAGS += -DHAVE_LLVM_SUPPORT 189 + LLVM_CONFIG_LIB_COMPONENTS := mcdisassembler all-targets 190 + # both llvm-config and lib.mk add -D_GNU_SOURCE, which ends up as conflict 191 + LLVM_CFLAGS += $(filter-out -D_GNU_SOURCE,$(shell $(LLVM_CONFIG) --cflags)) 192 + LLVM_LDLIBS += $(shell $(LLVM_CONFIG) --link-static --libs $(LLVM_CONFIG_LIB_COMPONENTS)) 193 + LLVM_LDLIBS += $(shell $(LLVM_CONFIG) --link-static --system-libs $(LLVM_CONFIG_LIB_COMPONENTS)) 194 + LLVM_LDLIBS += -lstdc++ 195 + LLVM_LDFLAGS += $(shell $(LLVM_CONFIG) --ldflags) 196 + endif 181 197 182 198 SCRATCH_DIR := $(OUTPUT)/tools 183 199 BUILD_DIR := $(SCRATCH_DIR)/build ··· 335 293 CAP_HELPERS := $(OUTPUT)/cap_helpers.o 336 294 NETWORK_HELPERS 
:= $(OUTPUT)/network_helpers.o 337 295 338 - $(OUTPUT)/test_dev_cgroup: $(CGROUP_HELPERS) $(TESTING_HELPERS) 339 - $(OUTPUT)/test_skb_cgroup_id_user: $(CGROUP_HELPERS) $(TESTING_HELPERS) 340 296 $(OUTPUT)/test_sock: $(CGROUP_HELPERS) $(TESTING_HELPERS) 341 297 $(OUTPUT)/test_sockmap: $(CGROUP_HELPERS) $(TESTING_HELPERS) 342 298 $(OUTPUT)/test_tcpnotify_user: $(CGROUP_HELPERS) $(TESTING_HELPERS) $(TRACE_HELPERS) 343 - $(OUTPUT)/get_cgroup_id_user: $(CGROUP_HELPERS) $(TESTING_HELPERS) 344 - $(OUTPUT)/test_cgroup_storage: $(CGROUP_HELPERS) $(TESTING_HELPERS) 345 299 $(OUTPUT)/test_sock_fields: $(CGROUP_HELPERS) $(TESTING_HELPERS) 346 300 $(OUTPUT)/test_sysctl: $(CGROUP_HELPERS) $(TESTING_HELPERS) 347 301 $(OUTPUT)/test_tag: $(TESTING_HELPERS) ··· 403 365 DESTDIR=$(HOST_SCRATCH_DIR)/ prefix= all install_headers 404 366 endif 405 367 368 + # vmlinux.h is first dumped to a temprorary file and then compared to 369 + # the previous version. This helps to avoid unnecessary re-builds of 370 + # $(TRUNNER_BPF_OBJS) 406 371 $(INCLUDE_DIR)/vmlinux.h: $(VMLINUX_BTF) $(BPFTOOL) | $(INCLUDE_DIR) 407 372 ifeq ($(VMLINUX_H),) 408 373 $(call msg,GEN,,$@) 409 - $(Q)$(BPFTOOL) btf dump file $(VMLINUX_BTF) format c > $@ 374 + $(Q)$(BPFTOOL) btf dump file $(VMLINUX_BTF) format c > $(INCLUDE_DIR)/.vmlinux.h.tmp 375 + $(Q)cmp -s $(INCLUDE_DIR)/.vmlinux.h.tmp $@ || mv $(INCLUDE_DIR)/.vmlinux.h.tmp $@ 410 376 else 411 377 $(call msg,CP,,$@) 412 378 $(Q)cp "$(VMLINUX_H)" $@ ··· 438 396 $(shell $(1) $(2) -v -E - </dev/null 2>&1 \ 439 397 | sed -n '/<...> search starts here:/,/End of search list./{ s| $/.*$|-idirafter \1|p }') \ 440 398 $(shell $(1) $(2) -dM -E - </dev/null | grep '__riscv_xlen ' | awk '{printf("-D__riscv_xlen=%d -D__BITS_PER_LONG=%d", $$3, $$3)}') \ 441 - $(shell $(1) $(2) -dM -E - </dev/null | grep '__loongarch_grlen ' | awk '{printf("-D__BITS_PER_LONG=%d", $$3)}') 399 + $(shell $(1) $(2) -dM -E - </dev/null | grep '__loongarch_grlen ' | awk '{printf("-D__BITS_PER_LONG=%d", 
$$3)}') \ 400 + $(shell $(1) $(2) -dM -E - </dev/null | grep -E 'MIPS(EL|EB)|_MIPS_SZ(PTR|LONG) |_MIPS_SIM |_ABI(O32|N32|64) ' | awk '{printf("-D%s=%s ", $$2, $$3)}') 442 401 endef 443 402 444 403 # Determine target endianness. ··· 470 427 # $1 - input .c file 471 428 # $2 - output .o file 472 429 # $3 - CFLAGS 430 + # $4 - binary name 473 431 define CLANG_BPF_BUILD_RULE 474 - $(call msg,CLNG-BPF,$(TRUNNER_BINARY),$2) 432 + $(call msg,CLNG-BPF,$4,$2) 475 433 $(Q)$(CLANG) $3 -O2 --target=bpf -c $1 -mcpu=v3 -o $2 476 434 endef 477 435 # Similar to CLANG_BPF_BUILD_RULE, but with disabled alu32 478 436 define CLANG_NOALU32_BPF_BUILD_RULE 479 - $(call msg,CLNG-BPF,$(TRUNNER_BINARY),$2) 437 + $(call msg,CLNG-BPF,$4,$2) 480 438 $(Q)$(CLANG) $3 -O2 --target=bpf -c $1 -mcpu=v2 -o $2 481 439 endef 482 440 # Similar to CLANG_BPF_BUILD_RULE, but with cpu-v4 483 441 define CLANG_CPUV4_BPF_BUILD_RULE 484 - $(call msg,CLNG-BPF,$(TRUNNER_BINARY),$2) 442 + $(call msg,CLNG-BPF,$4,$2) 485 443 $(Q)$(CLANG) $3 -O2 --target=bpf -c $1 -mcpu=v4 -o $2 486 444 endef 487 445 # Build BPF object using GCC 488 446 define GCC_BPF_BUILD_RULE 489 - $(call msg,GCC-BPF,$(TRUNNER_BINARY),$2) 447 + $(call msg,GCC-BPF,$4,$2) 490 448 $(Q)$(BPF_GCC) $3 -DBPF_NO_PRESERVE_ACCESS_INDEX -Wno-attributes -O2 -c $1 -o $2 491 449 endef 492 450 ··· 521 477 xdp_hw_metadata.skel.h-deps := xdp_hw_metadata.bpf.o 522 478 xdp_features.skel.h-deps := xdp_features.bpf.o 523 479 524 - LINKED_BPF_SRCS := $(patsubst %.bpf.o,%.c,$(foreach skel,$(LINKED_SKELS),$($(skel)-deps))) 480 + LINKED_BPF_OBJS := $(foreach skel,$(LINKED_SKELS),$($(skel)-deps)) 481 + LINKED_BPF_SRCS := $(patsubst %.bpf.o,%.c,$(LINKED_BPF_OBJS)) 482 + 483 + HEADERS_FOR_BPF_OBJS := $(wildcard $(BPFDIR)/*.bpf.h) \ 484 + $(addprefix $(BPFDIR)/, bpf_core_read.h \ 485 + bpf_endian.h \ 486 + bpf_helpers.h \ 487 + bpf_tracing.h) 525 488 526 489 # Set up extra TRUNNER_XXX "temporary" variables in the environment (relies on 527 490 # $eval()) and pass control to 
DEFINE_TEST_RUNNER_RULES. ··· 580 529 $(TRUNNER_BPF_PROGS_DIR)/%.c \ 581 530 $(TRUNNER_BPF_PROGS_DIR)/*.h \ 582 531 $$(INCLUDE_DIR)/vmlinux.h \ 583 - $(wildcard $(BPFDIR)/bpf_*.h) \ 584 - $(wildcard $(BPFDIR)/*.bpf.h) \ 532 + $(HEADERS_FOR_BPF_OBJS) \ 585 533 | $(TRUNNER_OUTPUT) $$(BPFOBJ) 586 534 $$(call $(TRUNNER_BPF_BUILD_RULE),$$<,$$@, \ 587 535 $(TRUNNER_BPF_CFLAGS) \ 588 536 $$($$<-CFLAGS) \ 589 - $$($$<-$2-CFLAGS)) 537 + $$($$<-$2-CFLAGS),$(TRUNNER_BINARY)) 590 538 591 539 $(TRUNNER_BPF_SKELS): %.skel.h: %.bpf.o $(BPFTOOL) | $(TRUNNER_OUTPUT) 592 540 $$(call msg,GEN-SKEL,$(TRUNNER_BINARY),$$@) ··· 606 556 $(Q)$$(BPFTOOL) gen skeleton -L $$(<:.o=.llinked3.o) name $$(notdir $$(<:.bpf.o=_lskel)) > $$@ 607 557 $(Q)rm -f $$(<:.o=.llinked1.o) $$(<:.o=.llinked2.o) $$(<:.o=.llinked3.o) 608 558 609 - $(TRUNNER_BPF_SKELS_LINKED): $(TRUNNER_BPF_OBJS) $(BPFTOOL) | $(TRUNNER_OUTPUT) 559 + $(LINKED_BPF_OBJS): %: $(TRUNNER_OUTPUT)/% 560 + 561 + # .SECONDEXPANSION here allows to correctly expand %-deps variables as prerequisites 562 + .SECONDEXPANSION: 563 + $(TRUNNER_BPF_SKELS_LINKED): $(TRUNNER_OUTPUT)/%: $$$$(%-deps) $(BPFTOOL) | $(TRUNNER_OUTPUT) 610 564 $$(call msg,LINK-BPF,$(TRUNNER_BINARY),$$(@:.skel.h=.bpf.o)) 611 565 $(Q)$$(BPFTOOL) gen object $$(@:.skel.h=.linked1.o) $$(addprefix $(TRUNNER_OUTPUT)/,$$($$(@F)-deps)) 612 566 $(Q)$$(BPFTOOL) gen object $$(@:.skel.h=.linked2.o) $$(@:.skel.h=.linked1.o) ··· 620 566 $(Q)$$(BPFTOOL) gen skeleton $$(@:.skel.h=.linked3.o) name $$(notdir $$(@:.skel.h=)) > $$@ 621 567 $(Q)$$(BPFTOOL) gen subskeleton $$(@:.skel.h=.linked3.o) name $$(notdir $$(@:.skel.h=)) > $$(@:.skel.h=.subskel.h) 622 568 $(Q)rm -f $$(@:.skel.h=.linked1.o) $$(@:.skel.h=.linked2.o) $$(@:.skel.h=.linked3.o) 569 + 570 + # When the compiler generates a %.d file, only skel basenames (not 571 + # full paths) are specified as prerequisites for corresponding %.o 572 + # file. 
This target makes %.skel.h basename dependent on full paths, 573 + # linking generated %.d dependency with actual %.skel.h files. 574 + $(notdir %.skel.h): $(TRUNNER_OUTPUT)/%.skel.h 575 + @true 576 + 623 577 endif 624 578 625 579 # ensure we set up tests.h header generation rule just once ··· 645 583 # Note: we cd into output directory to ensure embedded BPF object is found 646 584 $(TRUNNER_TEST_OBJS): $(TRUNNER_OUTPUT)/%.test.o: \ 647 585 $(TRUNNER_TESTS_DIR)/%.c \ 648 - $(TRUNNER_EXTRA_HDRS) \ 649 - $(TRUNNER_BPF_OBJS) \ 650 - $(TRUNNER_BPF_SKELS) \ 651 - $(TRUNNER_BPF_LSKELS) \ 652 - $(TRUNNER_BPF_SKELS_LINKED) \ 653 - $$(BPFOBJ) | $(TRUNNER_OUTPUT) 586 + | $(TRUNNER_OUTPUT)/%.test.d 654 587 $$(call msg,TEST-OBJ,$(TRUNNER_BINARY),$$@) 655 - $(Q)cd $$(@D) && $$(CC) -I. $$(CFLAGS) -c $(CURDIR)/$$< $$(LDLIBS) -o $$(@F) 588 + $(Q)cd $$(@D) && $$(CC) -I. $$(CFLAGS) -MMD -MT $$@ -c $(CURDIR)/$$< $$(LDLIBS) -o $$(@F) 589 + 590 + $(TRUNNER_TEST_OBJS:.o=.d): $(TRUNNER_OUTPUT)/%.test.d: \ 591 + $(TRUNNER_TESTS_DIR)/%.c \ 592 + $(TRUNNER_EXTRA_HDRS) \ 593 + $(TRUNNER_BPF_SKELS) \ 594 + $(TRUNNER_BPF_LSKELS) \ 595 + $(TRUNNER_BPF_SKELS_LINKED) \ 596 + $$(BPFOBJ) | $(TRUNNER_OUTPUT) 597 + 598 + ifeq ($(filter clean docs-clean,$(MAKECMDGOALS)),) 599 + include $(wildcard $(TRUNNER_TEST_OBJS:.o=.d)) 600 + endif 601 + 602 + # add per extra obj CFGLAGS definitions 603 + $(foreach N,$(patsubst $(TRUNNER_OUTPUT)/%.o,%,$(TRUNNER_EXTRA_OBJS)), \ 604 + $(eval $(TRUNNER_OUTPUT)/$(N).o: CFLAGS += $($(N).c-CFLAGS))) 656 605 657 606 $(TRUNNER_EXTRA_OBJS): $(TRUNNER_OUTPUT)/%.o: \ 658 607 %.c \ ··· 681 608 $(Q)rsync -aq $$^ $(TRUNNER_OUTPUT)/ 682 609 endif 683 610 611 + $(OUTPUT)/$(TRUNNER_BINARY): LDLIBS += $$(LLVM_LDLIBS) 612 + $(OUTPUT)/$(TRUNNER_BINARY): LDFLAGS += $$(LLVM_LDFLAGS) 613 + 614 + # some X.test.o files have runtime dependencies on Y.bpf.o files 615 + $(OUTPUT)/$(TRUNNER_BINARY): | $(TRUNNER_BPF_OBJS) 616 + 684 617 $(OUTPUT)/$(TRUNNER_BINARY): $(TRUNNER_TEST_OBJS) \ 685 
618 $(TRUNNER_EXTRA_OBJS) $$(BPFOBJ) \ 686 619 $(RESOLVE_BTFIDS) \ 687 620 $(TRUNNER_BPFTOOL) \ 688 621 | $(TRUNNER_BINARY)-extras 689 622 $$(call msg,BINARY,,$$@) 690 - $(Q)$$(CC) $$(CFLAGS) $$(filter %.a %.o,$$^) $$(LDLIBS) -o $$@ 623 + $(Q)$$(CC) $$(CFLAGS) $$(filter %.a %.o,$$^) $$(LDLIBS) $$(LDFLAGS) -o $$@ 691 624 $(Q)$(RESOLVE_BTFIDS) --btf $(TRUNNER_OUTPUT)/btf_data.bpf.o $$@ 692 625 $(Q)ln -sf $(if $2,..,.)/tools/build/bpftool/$(USE_BOOTSTRAP)bpftool \ 693 626 $(OUTPUT)/$(if $2,$2/)bpftool ··· 712 633 cap_helpers.c \ 713 634 unpriv_helpers.c \ 714 635 netlink_helpers.c \ 636 + jit_disasm_helpers.c \ 715 637 test_loader.c \ 716 638 xsk.c \ 717 639 disasm.c \ 640 + disasm_helpers.c \ 718 641 json_writer.c \ 719 642 flow_dissector_load.h \ 720 643 ip_check_defrag_frags.h ··· 843 762 $(call msg,BINARY,,$@) 844 763 $(Q)$(CC) $(CFLAGS) $(LDFLAGS) $(filter %.a %.o,$^) $(LDLIBS) -o $@ 845 764 846 - $(OUTPUT)/uprobe_multi: uprobe_multi.c 765 + # Linking uprobe_multi can fail due to relocation overflows on mips. 766 + $(OUTPUT)/uprobe_multi: CFLAGS += $(if $(filter mips, $(ARCH)),-mxgot) 767 + $(OUTPUT)/uprobe_multi: uprobe_multi.c uprobe_multi.ld 847 768 $(call msg,BINARY,,$@) 848 - $(Q)$(CC) $(CFLAGS) -O0 $(LDFLAGS) $^ $(LDLIBS) -o $@ 769 + $(Q)$(CC) $(CFLAGS) -Wl,-T,uprobe_multi.ld -O0 $(LDFLAGS) \ 770 + $(filter-out %.ld,$^) $(LDLIBS) -o $@ 849 771 850 772 EXTRA_CLEAN := $(SCRATCH_DIR) $(HOST_SCRATCH_DIR) \ 851 773 prog_tests/tests.h map_tests/tests.h verifier/tests.h \ 852 - feature bpftool \ 853 - $(addprefix $(OUTPUT)/,*.o *.skel.h *.lskel.h *.subskel.h \ 774 + feature bpftool \ 775 + $(addprefix $(OUTPUT)/,*.o *.d *.skel.h *.lskel.h *.subskel.h \ 854 776 no_alu32 cpuv4 bpf_gcc bpf_testmod.ko \ 855 777 bpf_test_no_cfi.ko \ 856 - liburandom_read.so) 778 + liburandom_read.so) \ 779 + $(OUTPUT)/FEATURE-DUMP.selftests 857 780 858 781 .PHONY: docs docs-clean 859 782

+31 -1

tools/testing/selftests/bpf/README.rst

··· 85 85 If you want to change pahole and llvm, you can change `PATH` environment 86 86 variable in the beginning of script. 87 87 88 - .. note:: The script currently only supports x86_64 and s390x architectures. 88 + Running vmtest on RV64 89 + ====================== 90 + To speed up testing and avoid various dependency issues, it is recommended to 91 + run vmtest in a Docker container. Before running vmtest, we need to prepare 92 + Docker container and local rootfs image. The overall steps are as follows: 93 + 94 + 1. Create Docker container as shown in link [0]. 95 + 96 + 2. Use mkrootfs_debian.sh script [1] to build local rootfs image: 97 + 98 + .. code-block:: console 99 + 100 + $ sudo ./mkrootfs_debian.sh --arch riscv64 --distro noble 101 + 102 + 3. Start Docker container [0] and run vmtest in the container: 103 + 104 + .. code-block:: console 105 + 106 + $ PLATFORM=riscv64 CROSS_COMPILE=riscv64-linux-gnu- \ 107 + tools/testing/selftests/bpf/vmtest.sh \ 108 + -l <path of local rootfs image> -- \ 109 + ./test_progs -d \ 110 + \"$(cat tools/testing/selftests/bpf/DENYLIST.riscv64 \ 111 + | cut -d'#' -f1 \ 112 + | sed -e 's/^[[:space:]]*//' \ 113 + -e 's/[[:space:]]*$//' \ 114 + | tr -s '\n' ',' \ 115 + )\" 116 + 117 + Link: https://github.com/pulehui/riscv-bpf-vmtest.git [0] 118 + Link: https://github.com/libbpf/ci/blob/main/rootfs/mkrootfs_debian.sh [1] 89 119 90 120 Additional information about selftest failures are 91 121 documented here.

+13

tools/testing/selftests/bpf/bench.c

··· 10 10 #include <sys/sysinfo.h> 11 11 #include <signal.h> 12 12 #include "bench.h" 13 + #include "bpf_util.h" 13 14 #include "testing_helpers.h" 14 15 15 16 struct env env = { ··· 520 519 extern const struct bench bench_trig_uretprobe_push; 521 520 extern const struct bench bench_trig_uprobe_ret; 522 521 extern const struct bench bench_trig_uretprobe_ret; 522 + extern const struct bench bench_trig_uprobe_multi_nop; 523 + extern const struct bench bench_trig_uretprobe_multi_nop; 524 + extern const struct bench bench_trig_uprobe_multi_push; 525 + extern const struct bench bench_trig_uretprobe_multi_push; 526 + extern const struct bench bench_trig_uprobe_multi_ret; 527 + extern const struct bench bench_trig_uretprobe_multi_ret; 523 528 524 529 extern const struct bench bench_rb_libbpf; 525 530 extern const struct bench bench_rb_custom; ··· 580 573 &bench_trig_uretprobe_push, 581 574 &bench_trig_uprobe_ret, 582 575 &bench_trig_uretprobe_ret, 576 + &bench_trig_uprobe_multi_nop, 577 + &bench_trig_uretprobe_multi_nop, 578 + &bench_trig_uprobe_multi_push, 579 + &bench_trig_uretprobe_multi_push, 580 + &bench_trig_uprobe_multi_ret, 581 + &bench_trig_uretprobe_multi_ret, 583 582 /* ringbuf/perfbuf benchmarks */ 584 583 &bench_rb_libbpf, 585 584 &bench_rb_custom,

+1

tools/testing/selftests/bpf/bench.h

··· 10 10 #include <math.h> 11 11 #include <time.h> 12 12 #include <sys/syscall.h> 13 + #include <limits.h> 13 14 14 15 struct cpu_set { 15 16 bool *cpus;

+67 -16

tools/testing/selftests/bpf/benchs/bench_trigger.c

··· 276 276 * instructions. So use two different targets, one of which starts with nop 277 277 * and another doesn't. 278 278 * 279 - * GCC doesn't generate stack setup preample for these functions due to them 279 + * GCC doesn't generate stack setup preamble for these functions due to them 280 280 * having no input arguments and doing nothing in the body. 281 281 */ 282 282 __nocf_check __weak void uprobe_target_nop(void) ··· 332 332 return NULL; 333 333 } 334 334 335 - static void usetup(bool use_retprobe, void *target_addr) 335 + static void usetup(bool use_retprobe, bool use_multi, void *target_addr) 336 336 { 337 337 size_t uprobe_offset; 338 338 struct bpf_link *link; ··· 346 346 exit(1); 347 347 } 348 348 349 - bpf_program__set_autoload(ctx.skel->progs.bench_trigger_uprobe, true); 349 + if (use_multi) 350 + bpf_program__set_autoload(ctx.skel->progs.bench_trigger_uprobe_multi, true); 351 + else 352 + bpf_program__set_autoload(ctx.skel->progs.bench_trigger_uprobe, true); 350 353 351 354 err = trigger_bench__load(ctx.skel); 352 355 if (err) { ··· 358 355 } 359 356 360 357 uprobe_offset = get_uprobe_offset(target_addr); 361 - link = bpf_program__attach_uprobe(ctx.skel->progs.bench_trigger_uprobe, 362 - use_retprobe, 363 - -1 /* all PIDs */, 364 - "/proc/self/exe", 365 - uprobe_offset); 358 + if (use_multi) { 359 + LIBBPF_OPTS(bpf_uprobe_multi_opts, opts, 360 + .retprobe = use_retprobe, 361 + .cnt = 1, 362 + .offsets = &uprobe_offset, 363 + ); 364 + link = bpf_program__attach_uprobe_multi( 365 + ctx.skel->progs.bench_trigger_uprobe_multi, 366 + -1 /* all PIDs */, "/proc/self/exe", NULL, &opts); 367 + ctx.skel->links.bench_trigger_uprobe_multi = link; 368 + } else { 369 + link = bpf_program__attach_uprobe(ctx.skel->progs.bench_trigger_uprobe, 370 + use_retprobe, 371 + -1 /* all PIDs */, 372 + "/proc/self/exe", 373 + uprobe_offset); 374 + ctx.skel->links.bench_trigger_uprobe = link; 375 + } 366 376 if (!link) { 367 - fprintf(stderr, "failed to attach uprobe!\n"); 
377 + fprintf(stderr, "failed to attach %s!\n", use_multi ? "multi-uprobe" : "uprobe"); 368 378 exit(1); 369 379 } 370 - ctx.skel->links.bench_trigger_uprobe = link; 371 380 } 372 381 373 382 static void usermode_count_setup(void) ··· 389 374 390 375 static void uprobe_nop_setup(void) 391 376 { 392 - usetup(false, &uprobe_target_nop); 377 + usetup(false, false /* !use_multi */, &uprobe_target_nop); 393 378 } 394 379 395 380 static void uretprobe_nop_setup(void) 396 381 { 397 - usetup(true, &uprobe_target_nop); 382 + usetup(true, false /* !use_multi */, &uprobe_target_nop); 398 383 } 399 384 400 385 static void uprobe_push_setup(void) 401 386 { 402 - usetup(false, &uprobe_target_push); 387 + usetup(false, false /* !use_multi */, &uprobe_target_push); 403 388 } 404 389 405 390 static void uretprobe_push_setup(void) 406 391 { 407 - usetup(true, &uprobe_target_push); 392 + usetup(true, false /* !use_multi */, &uprobe_target_push); 408 393 } 409 394 410 395 static void uprobe_ret_setup(void) 411 396 { 412 - usetup(false, &uprobe_target_ret); 397 + usetup(false, false /* !use_multi */, &uprobe_target_ret); 413 398 } 414 399 415 400 static void uretprobe_ret_setup(void) 416 401 { 417 - usetup(true, &uprobe_target_ret); 402 + usetup(true, false /* !use_multi */, &uprobe_target_ret); 403 + } 404 + 405 + static void uprobe_multi_nop_setup(void) 406 + { 407 + usetup(false, true /* use_multi */, &uprobe_target_nop); 408 + } 409 + 410 + static void uretprobe_multi_nop_setup(void) 411 + { 412 + usetup(true, true /* use_multi */, &uprobe_target_nop); 413 + } 414 + 415 + static void uprobe_multi_push_setup(void) 416 + { 417 + usetup(false, true /* use_multi */, &uprobe_target_push); 418 + } 419 + 420 + static void uretprobe_multi_push_setup(void) 421 + { 422 + usetup(true, true /* use_multi */, &uprobe_target_push); 423 + } 424 + 425 + static void uprobe_multi_ret_setup(void) 426 + { 427 + usetup(false, true /* use_multi */, &uprobe_target_ret); 428 + } 429 + 430 + static void 
uretprobe_multi_ret_setup(void) 431 + { 432 + usetup(true, true /* use_multi */, &uprobe_target_ret); 418 433 } 419 434 420 435 const struct bench bench_trig_syscall_count = { ··· 499 454 BENCH_TRIG_USERMODE(uretprobe_nop, nop, "uretprobe-nop"); 500 455 BENCH_TRIG_USERMODE(uretprobe_push, push, "uretprobe-push"); 501 456 BENCH_TRIG_USERMODE(uretprobe_ret, ret, "uretprobe-ret"); 457 + BENCH_TRIG_USERMODE(uprobe_multi_nop, nop, "uprobe-multi-nop"); 458 + BENCH_TRIG_USERMODE(uprobe_multi_push, push, "uprobe-multi-push"); 459 + BENCH_TRIG_USERMODE(uprobe_multi_ret, ret, "uprobe-multi-ret"); 460 + BENCH_TRIG_USERMODE(uretprobe_multi_nop, nop, "uretprobe-multi-nop"); 461 + BENCH_TRIG_USERMODE(uretprobe_multi_push, push, "uretprobe-multi-push"); 462 + BENCH_TRIG_USERMODE(uretprobe_multi_ret, ret, "uretprobe-multi-ret");
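The new benchmarks follow a regular pattern: two booleans passed to `usetup()` (`use_retprobe`, `use_multi`) combined with three targets (`nop`, `push`, `ret`) give the twelve names registered through `BENCH_TRIG_USERMODE`. The sketch below only illustrates that naming scheme in plain userspace C. The `demo_*` helpers are hypothetical and are not part of `bench_trigger.c`, which spells out each setup function by hand.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Targets taken from the diff above: uprobe_target_{nop,push,ret}. */
static const char *const demo_targets[] = { "nop", "push", "ret" };

/*
 * Hypothetical helper: builds "u[ret]probe[-multi]-<target>" into buf.
 * Returns whatever snprintf() returns (length that would be written).
 */
static int demo_bench_name(bool use_retprobe, bool use_multi,
			   const char *target, char *buf, size_t sz)
{
	return snprintf(buf, sz, "%s%s-%s",
			use_retprobe ? "uretprobe" : "uprobe",
			use_multi ? "-multi" : "",
			target);
}

/* Prints every flag/target combination and returns how many were printed. */
static int demo_print_all(FILE *out)
{
	char buf[64];
	int cnt = 0;

	for (int multi = 0; multi <= 1; multi++)
		for (int ret = 0; ret <= 1; ret++)
			for (size_t t = 0; t < sizeof(demo_targets) / sizeof(demo_targets[0]); t++) {
				demo_bench_name(ret, multi, demo_targets[t], buf, sizeof(buf));
				fprintf(out, "%s\n", buf);
				cnt++;
			}
	return cnt;
}
```

Calling `demo_print_all(stdout)` lists the six existing single-uprobe names followed by the six `-multi` variants added in this change.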

+26

tools/testing/selftests/bpf/bpf_experimental.h

··· 195 195 */ 196 196 extern void bpf_throw(u64 cookie) __ksym; 197 197 198 + /* Description 199 + * Acquire a reference on the exe_file member field belonging to the 200 + * mm_struct that is nested within the supplied task_struct. The supplied 201 + * task_struct must be trusted/referenced. 202 + * Returns 203 + * A referenced file pointer pointing to the exe_file member field of the 204 + * mm_struct nested in the supplied task_struct, or NULL. 205 + */ 206 + extern struct file *bpf_get_task_exe_file(struct task_struct *task) __ksym; 207 + 208 + /* Description 209 + * Release a reference on the supplied file. The supplied file must be 210 + * acquired. 211 + */ 212 + extern void bpf_put_file(struct file *file) __ksym; 213 + 214 + /* Description 215 + * Resolve a pathname for the supplied path and store it in the supplied 216 + * buffer. The supplied path must be trusted/referenced. 217 + * Returns 218 + * A positive integer corresponding to the length of the resolved pathname, 219 + * including the NULL termination character, stored in the supplied 220 + * buffer. On error, a negative integer is returned. 221 + */ 222 + extern int bpf_path_d_path(struct path *path, char *buf, size_t buf__sz) __ksym; 223 + 198 224 /* This macro must be used to mark the exception callback corresponding to the 199 225 * main program. For example: 200 226 *

+10 -1

tools/testing/selftests/bpf/bpf_kfuncs.h

··· 45 45 46 46 /* Description 47 47 * Modify the address of a AF_UNIX sockaddr. 48 - * Returns__bpf_kfunc 48 + * Returns 49 49 * -EINVAL if the address size is too big or, 0 if the sockaddr was successfully modified. 50 50 */ 51 51 extern int bpf_sock_addr_set_sun_path(struct bpf_sock_addr_kern *sa_kern, ··· 78 78 79 79 extern bool bpf_session_is_return(void) __ksym __weak; 80 80 extern __u64 *bpf_session_cookie(void) __ksym __weak; 81 + 82 + struct dentry; 83 + /* Description 84 + * Returns xattr of a dentry 85 + * Returns 86 + * Error code 87 + */ 88 + extern int bpf_get_dentry_xattr(struct dentry *dentry, const char *name, 89 + struct bpf_dynptr *value_ptr) __ksym __weak; 81 90 #endif

+253 -4

tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c

··· 17 17 #include <linux/in.h> 18 18 #include <linux/in6.h> 19 19 #include <linux/un.h> 20 + #include <linux/filter.h> 20 21 #include <net/sock.h> 21 22 #include <linux/namei.h> 22 23 #include "bpf_testmod.h" ··· 142 141 143 142 __bpf_kfunc int bpf_iter_testmod_seq_new(struct bpf_iter_testmod_seq *it, s64 value, int cnt) 144 143 { 145 - if (cnt < 0) { 146 - it->cnt = 0; 144 + it->cnt = cnt; 145 + 146 + if (cnt < 0) 147 147 return -EINVAL; 148 - } 149 148 150 149 it->value = value; 151 - it->cnt = cnt; 152 150 153 151 return 0; 154 152 } ··· 162 162 return &it->value; 163 163 } 164 164 165 + __bpf_kfunc s64 bpf_iter_testmod_seq_value(int val, struct bpf_iter_testmod_seq* it__iter) 166 + { 167 + if (it__iter->cnt < 0) 168 + return 0; 169 + 170 + return val + it__iter->value; 171 + } 172 + 165 173 __bpf_kfunc void bpf_iter_testmod_seq_destroy(struct bpf_iter_testmod_seq *it) 166 174 { 167 175 it->cnt = 0; ··· 181 173 182 174 __bpf_kfunc void bpf_kfunc_dynptr_test(struct bpf_dynptr *ptr, 183 175 struct bpf_dynptr *ptr__nullable) 176 + { 177 + } 178 + 179 + __bpf_kfunc struct sk_buff *bpf_kfunc_nested_acquire_nonzero_offset_test(struct sk_buff_head *ptr) 180 + { 181 + return NULL; 182 + } 183 + 184 + __bpf_kfunc struct sk_buff *bpf_kfunc_nested_acquire_zero_offset_test(struct sock_common *ptr) 185 + { 186 + return NULL; 187 + } 188 + 189 + __bpf_kfunc void bpf_kfunc_nested_release_test(struct sk_buff *ptr) 190 + { 191 + } 192 + 193 + __bpf_kfunc void bpf_kfunc_trusted_vma_test(struct vm_area_struct *ptr) 194 + { 195 + } 196 + 197 + __bpf_kfunc void bpf_kfunc_trusted_task_test(struct task_struct *ptr) 198 + { 199 + } 200 + 201 + __bpf_kfunc void bpf_kfunc_trusted_num_test(int *ptr) 202 + { 203 + } 204 + 205 + __bpf_kfunc void bpf_kfunc_rcu_task_test(struct task_struct *ptr) 184 206 { 185 207 } 186 208 ··· 572 534 BTF_ID_FLAGS(func, bpf_iter_testmod_seq_new, KF_ITER_NEW) 573 535 BTF_ID_FLAGS(func, bpf_iter_testmod_seq_next, KF_ITER_NEXT | KF_RET_NULL) 574 536 
BTF_ID_FLAGS(func, bpf_iter_testmod_seq_destroy, KF_ITER_DESTROY) 537 + BTF_ID_FLAGS(func, bpf_iter_testmod_seq_value) 575 538 BTF_ID_FLAGS(func, bpf_kfunc_common_test) 576 539 BTF_ID_FLAGS(func, bpf_kfunc_dynptr_test) 540 + BTF_ID_FLAGS(func, bpf_kfunc_nested_acquire_nonzero_offset_test, KF_ACQUIRE) 541 + BTF_ID_FLAGS(func, bpf_kfunc_nested_acquire_zero_offset_test, KF_ACQUIRE) 542 + BTF_ID_FLAGS(func, bpf_kfunc_nested_release_test, KF_RELEASE) 543 + BTF_ID_FLAGS(func, bpf_kfunc_trusted_vma_test, KF_TRUSTED_ARGS) 544 + BTF_ID_FLAGS(func, bpf_kfunc_trusted_task_test, KF_TRUSTED_ARGS) 545 + BTF_ID_FLAGS(func, bpf_kfunc_trusted_num_test, KF_TRUSTED_ARGS) 546 + BTF_ID_FLAGS(func, bpf_kfunc_rcu_task_test, KF_RCU) 577 547 BTF_ID_FLAGS(func, bpf_testmod_ctx_create, KF_ACQUIRE | KF_RET_NULL) 578 548 BTF_ID_FLAGS(func, bpf_testmod_ctx_release, KF_RELEASE) 579 549 BTF_KFUNCS_END(bpf_testmod_common_kfunc_ids) ··· 969 923 return err; 970 924 } 971 925 926 + static DEFINE_MUTEX(st_ops_mutex); 927 + static struct bpf_testmod_st_ops *st_ops; 928 + 929 + __bpf_kfunc int bpf_kfunc_st_ops_test_prologue(struct st_ops_args *args) 930 + { 931 + int ret = -1; 932 + 933 + mutex_lock(&st_ops_mutex); 934 + if (st_ops && st_ops->test_prologue) 935 + ret = st_ops->test_prologue(args); 936 + mutex_unlock(&st_ops_mutex); 937 + 938 + return ret; 939 + } 940 + 941 + __bpf_kfunc int bpf_kfunc_st_ops_test_epilogue(struct st_ops_args *args) 942 + { 943 + int ret = -1; 944 + 945 + mutex_lock(&st_ops_mutex); 946 + if (st_ops && st_ops->test_epilogue) 947 + ret = st_ops->test_epilogue(args); 948 + mutex_unlock(&st_ops_mutex); 949 + 950 + return ret; 951 + } 952 + 953 + __bpf_kfunc int bpf_kfunc_st_ops_test_pro_epilogue(struct st_ops_args *args) 954 + { 955 + int ret = -1; 956 + 957 + mutex_lock(&st_ops_mutex); 958 + if (st_ops && st_ops->test_pro_epilogue) 959 + ret = st_ops->test_pro_epilogue(args); 960 + mutex_unlock(&st_ops_mutex); 961 + 962 + return ret; 963 + } 964 + 965 + __bpf_kfunc int 
bpf_kfunc_st_ops_inc10(struct st_ops_args *args) 966 + { 967 + args->a += 10; 968 + return args->a; 969 + } 970 + 972 971 BTF_KFUNCS_START(bpf_testmod_check_kfunc_ids) 973 972 BTF_ID_FLAGS(func, bpf_testmod_test_mod_kfunc) 974 973 BTF_ID_FLAGS(func, bpf_kfunc_call_test1) ··· 1050 959 BTF_ID_FLAGS(func, bpf_kfunc_call_sock_sendmsg, KF_SLEEPABLE) 1051 960 BTF_ID_FLAGS(func, bpf_kfunc_call_kernel_getsockname, KF_SLEEPABLE) 1052 961 BTF_ID_FLAGS(func, bpf_kfunc_call_kernel_getpeername, KF_SLEEPABLE) 962 + BTF_ID_FLAGS(func, bpf_kfunc_st_ops_test_prologue, KF_TRUSTED_ARGS | KF_SLEEPABLE) 963 + BTF_ID_FLAGS(func, bpf_kfunc_st_ops_test_epilogue, KF_TRUSTED_ARGS | KF_SLEEPABLE) 964 + BTF_ID_FLAGS(func, bpf_kfunc_st_ops_test_pro_epilogue, KF_TRUSTED_ARGS | KF_SLEEPABLE) 965 + BTF_ID_FLAGS(func, bpf_kfunc_st_ops_inc10, KF_TRUSTED_ARGS) 1053 966 BTF_KFUNCS_END(bpf_testmod_check_kfunc_ids) 1054 967 1055 968 static int bpf_testmod_ops_init(struct btf *btf) ··· 1122 1027 { 1123 1028 } 1124 1029 1030 + static int bpf_testmod_tramp(int value) 1031 + { 1032 + return 0; 1033 + } 1034 + 1125 1035 static int bpf_testmod_ops__test_maybe_null(int dummy, 1126 1036 struct task_struct *task__nullable) 1127 1037 { ··· 1173 1073 .owner = THIS_MODULE, 1174 1074 }; 1175 1075 1076 + static int bpf_test_mod_st_ops__test_prologue(struct st_ops_args *args) 1077 + { 1078 + return 0; 1079 + } 1080 + 1081 + static int bpf_test_mod_st_ops__test_epilogue(struct st_ops_args *args) 1082 + { 1083 + return 0; 1084 + } 1085 + 1086 + static int bpf_test_mod_st_ops__test_pro_epilogue(struct st_ops_args *args) 1087 + { 1088 + return 0; 1089 + } 1090 + 1091 + static int st_ops_gen_prologue(struct bpf_insn *insn_buf, bool direct_write, 1092 + const struct bpf_prog *prog) 1093 + { 1094 + struct bpf_insn *insn = insn_buf; 1095 + 1096 + if (strcmp(prog->aux->attach_func_name, "test_prologue") && 1097 + strcmp(prog->aux->attach_func_name, "test_pro_epilogue")) 1098 + return 0; 1099 + 1100 + /* r6 = r1[0]; // r6 will 
be "struct st_ops *args". r1 is "u64 *ctx". 1101 + * r7 = r6->a; 1102 + * r7 += 1000; 1103 + * r6->a = r7; 1104 + */ 1105 + *insn++ = BPF_LDX_MEM(BPF_DW, BPF_REG_6, BPF_REG_1, 0); 1106 + *insn++ = BPF_LDX_MEM(BPF_DW, BPF_REG_7, BPF_REG_6, offsetof(struct st_ops_args, a)); 1107 + *insn++ = BPF_ALU64_IMM(BPF_ADD, BPF_REG_7, 1000); 1108 + *insn++ = BPF_STX_MEM(BPF_DW, BPF_REG_6, BPF_REG_7, offsetof(struct st_ops_args, a)); 1109 + *insn++ = prog->insnsi[0]; 1110 + 1111 + return insn - insn_buf; 1112 + } 1113 + 1114 + static int st_ops_gen_epilogue(struct bpf_insn *insn_buf, const struct bpf_prog *prog, 1115 + s16 ctx_stack_off) 1116 + { 1117 + struct bpf_insn *insn = insn_buf; 1118 + 1119 + if (strcmp(prog->aux->attach_func_name, "test_epilogue") && 1120 + strcmp(prog->aux->attach_func_name, "test_pro_epilogue")) 1121 + return 0; 1122 + 1123 + /* r1 = stack[ctx_stack_off]; // r1 will be "u64 *ctx" 1124 + * r1 = r1[0]; // r1 will be "struct st_ops *args" 1125 + * r6 = r1->a; 1126 + * r6 += 10000; 1127 + * r1->a = r6; 1128 + * r0 = r6; 1129 + * r0 *= 2; 1130 + * BPF_EXIT; 1131 + */ 1132 + *insn++ = BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_FP, ctx_stack_off); 1133 + *insn++ = BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_1, 0); 1134 + *insn++ = BPF_LDX_MEM(BPF_DW, BPF_REG_6, BPF_REG_1, offsetof(struct st_ops_args, a)); 1135 + *insn++ = BPF_ALU64_IMM(BPF_ADD, BPF_REG_6, 10000); 1136 + *insn++ = BPF_STX_MEM(BPF_DW, BPF_REG_1, BPF_REG_6, offsetof(struct st_ops_args, a)); 1137 + *insn++ = BPF_MOV64_REG(BPF_REG_0, BPF_REG_6); 1138 + *insn++ = BPF_ALU64_IMM(BPF_MUL, BPF_REG_0, 2); 1139 + *insn++ = BPF_EXIT_INSN(); 1140 + 1141 + return insn - insn_buf; 1142 + } 1143 + 1144 + static int st_ops_btf_struct_access(struct bpf_verifier_log *log, 1145 + const struct bpf_reg_state *reg, 1146 + int off, int size) 1147 + { 1148 + if (off < 0 || off + size > sizeof(struct st_ops_args)) 1149 + return -EACCES; 1150 + return 0; 1151 + } 1152 + 1153 + static const struct bpf_verifier_ops 
st_ops_verifier_ops = { 1154 + .is_valid_access = bpf_testmod_ops_is_valid_access, 1155 + .btf_struct_access = st_ops_btf_struct_access, 1156 + .gen_prologue = st_ops_gen_prologue, 1157 + .gen_epilogue = st_ops_gen_epilogue, 1158 + .get_func_proto = bpf_base_func_proto, 1159 + }; 1160 + 1161 + static struct bpf_testmod_st_ops st_ops_cfi_stubs = { 1162 + .test_prologue = bpf_test_mod_st_ops__test_prologue, 1163 + .test_epilogue = bpf_test_mod_st_ops__test_epilogue, 1164 + .test_pro_epilogue = bpf_test_mod_st_ops__test_pro_epilogue, 1165 + }; 1166 + 1167 + static int st_ops_reg(void *kdata, struct bpf_link *link) 1168 + { 1169 + int err = 0; 1170 + 1171 + mutex_lock(&st_ops_mutex); 1172 + if (st_ops) { 1173 + pr_err("st_ops has already been registered\n"); 1174 + err = -EEXIST; 1175 + goto unlock; 1176 + } 1177 + st_ops = kdata; 1178 + 1179 + unlock: 1180 + mutex_unlock(&st_ops_mutex); 1181 + return err; 1182 + } 1183 + 1184 + static void st_ops_unreg(void *kdata, struct bpf_link *link) 1185 + { 1186 + mutex_lock(&st_ops_mutex); 1187 + st_ops = NULL; 1188 + mutex_unlock(&st_ops_mutex); 1189 + } 1190 + 1191 + static int st_ops_init(struct btf *btf) 1192 + { 1193 + return 0; 1194 + } 1195 + 1196 + static int st_ops_init_member(const struct btf_type *t, 1197 + const struct btf_member *member, 1198 + void *kdata, const void *udata) 1199 + { 1200 + return 0; 1201 + } 1202 + 1203 + static struct bpf_struct_ops testmod_st_ops = { 1204 + .verifier_ops = &st_ops_verifier_ops, 1205 + .init = st_ops_init, 1206 + .init_member = st_ops_init_member, 1207 + .reg = st_ops_reg, 1208 + .unreg = st_ops_unreg, 1209 + .cfi_stubs = &st_ops_cfi_stubs, 1210 + .name = "bpf_testmod_st_ops", 1211 + .owner = THIS_MODULE, 1212 + }; 1213 + 1176 1214 extern int bpf_fentry_test1(int a); 1177 1215 1178 1216 static int bpf_testmod_init(void) ··· 1321 1083 .kfunc_btf_id = bpf_testmod_dtor_ids[1] 1322 1084 }, 1323 1085 }; 1086 + void **tramp; 1324 1087 int ret; 1325 1088 1326 1089 ret = 
register_btf_kfunc_id_set(BPF_PROG_TYPE_UNSPEC, &bpf_testmod_common_kfunc_set); 1327 1090 ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_SCHED_CLS, &bpf_testmod_kfunc_set); 1328 1091 ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_TRACING, &bpf_testmod_kfunc_set); 1329 1092 ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_SYSCALL, &bpf_testmod_kfunc_set); 1093 + ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_STRUCT_OPS, &bpf_testmod_kfunc_set); 1330 1094 ret = ret ?: register_bpf_struct_ops(&bpf_bpf_testmod_ops, bpf_testmod_ops); 1331 1095 ret = ret ?: register_bpf_struct_ops(&bpf_testmod_ops2, bpf_testmod_ops2); 1096 + ret = ret ?: register_bpf_struct_ops(&testmod_st_ops, bpf_testmod_st_ops); 1332 1097 ret = ret ?: register_btf_id_dtor_kfuncs(bpf_testmod_dtors, 1333 1098 ARRAY_SIZE(bpf_testmod_dtors), 1334 1099 THIS_MODULE); ··· 1347 1106 ret = register_bpf_testmod_uprobe(); 1348 1107 if (ret < 0) 1349 1108 return ret; 1109 + 1110 + /* Ensure nothing is between tramp_1..tramp_40 */ 1111 + BUILD_BUG_ON(offsetof(struct bpf_testmod_ops, tramp_1) + 40 * sizeof(long) != 1112 + offsetofend(struct bpf_testmod_ops, tramp_40)); 1113 + tramp = (void **)&__bpf_testmod_ops.tramp_1; 1114 + while (tramp <= (void **)&__bpf_testmod_ops.tramp_40) 1115 + *tramp++ = bpf_testmod_tramp; 1116 + 1350 1117 return 0; 1351 1118 } 1352 1119
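The tail of `bpf_testmod_init()` fills `tramp_1` through `tramp_40` by walking a pointer across the struct, guarded by a `BUILD_BUG_ON` that checks the members are contiguous. A minimal userspace sketch of the same idea follows, assuming only four slots. `struct demo_ops`, `demo_tramp`, `demo_fill_tramps` and `demo_offsetofend` are hypothetical stand-ins, not kernel names. Like the original, it relies on adjacent members of identical pointer type having no padding between them, which the compile-time assertion verifies.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for bpf_testmod_ops with only four tramp slots. */
struct demo_ops {
	int (*test_1)(void);
	int (*tramp_1)(int value);
	int (*tramp_2)(int value);
	int (*tramp_3)(int value);
	int (*tramp_4)(int value);
	char onebyte;
};

#define DEMO_TRAMP_CNT 4

/* Userspace equivalent of the kernel's offsetofend() helper. */
#define demo_offsetofend(type, member) \
	(offsetof(type, member) + sizeof(((type *)0)->member))

/* Compile-time check that nothing sits between tramp_1 and tramp_4. */
static_assert(offsetof(struct demo_ops, tramp_1) +
	      DEMO_TRAMP_CNT * sizeof(int (*)(int)) ==
	      demo_offsetofend(struct demo_ops, tramp_4),
	      "tramp slots must be contiguous");

/* Stub installed in every slot, mirroring bpf_testmod_tramp(). */
static int demo_tramp(int value)
{
	(void)value;
	return 0;
}

/* Walks from tramp_1 to tramp_4 inclusive; returns the number of slots set. */
static int demo_fill_tramps(struct demo_ops *ops)
{
	int (**slot)(int) = &ops->tramp_1;
	int n = 0;

	while (slot <= &ops->tramp_4) {
		*slot++ = demo_tramp;
		n++;
	}
	return n;
}
```

If a field were later inserted between two tramp members, the static assertion would fail at build time instead of letting the loop overwrite an unrelated member.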

+12

tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.h

···
 	void (*test_2)(int a, int b);
 	/* Used to test nullable arguments. */
 	int (*test_maybe_null)(int dummy, struct task_struct *task);
+	int (*unsupported_ops)(void);
 
 	/* The following fields are used to test shadow copies. */
 	char onebyte;
···
 
 struct bpf_testmod_ops2 {
 	int (*test_1)(void);
+};
+
+struct st_ops_args {
+	u64 a;
+};
+
+struct bpf_testmod_st_ops {
+	int (*test_prologue)(struct st_ops_args *args);
+	int (*test_epilogue)(struct st_ops_args *args);
+	int (*test_pro_epilogue)(struct st_ops_args *args);
+	struct module *owner;
 };
 
 #endif /* _BPF_TESTMOD_H */

+15

tools/testing/selftests/bpf/bpf_testmod/bpf_testmod_kfunc.h

···
 struct bpf_testmod_ctx *bpf_testmod_ctx_create(int *err) __ksym;
 void bpf_testmod_ctx_release(struct bpf_testmod_ctx *ctx) __ksym;
 
+struct sk_buff *bpf_kfunc_nested_acquire_nonzero_offset_test(struct sk_buff_head *ptr) __ksym;
+struct sk_buff *bpf_kfunc_nested_acquire_zero_offset_test(struct sock_common *ptr) __ksym;
+void bpf_kfunc_nested_release_test(struct sk_buff *ptr) __ksym;
+
+struct st_ops_args;
+int bpf_kfunc_st_ops_test_prologue(struct st_ops_args *args) __ksym;
+int bpf_kfunc_st_ops_test_epilogue(struct st_ops_args *args) __ksym;
+int bpf_kfunc_st_ops_test_pro_epilogue(struct st_ops_args *args) __ksym;
+int bpf_kfunc_st_ops_inc10(struct st_ops_args *args) __ksym;
+
+void bpf_kfunc_trusted_vma_test(struct vm_area_struct *ptr) __ksym;
+void bpf_kfunc_trusted_task_test(struct task_struct *ptr) __ksym;
+void bpf_kfunc_trusted_num_test(int *ptr) __ksym;
+void bpf_kfunc_rcu_task_test(struct task_struct *ptr) __ksym;
+
 #endif /* _BPF_TESTMOD_KFUNC_H */

+1 -1

tools/testing/selftests/bpf/cgroup_helpers.c

···
 /**
  * get_cgroup1_hierarchy_id - Retrieves the ID of a cgroup1 hierarchy from the cgroup1 subsys name.
  * @subsys_name: The cgroup1 subsys name, which can be retrieved from /proc/self/cgroup. It can be
- * a named cgroup like "name=systemd", a controller name like "net_cls", or multi-contollers like
+ * a named cgroup like "name=systemd", a controller name like "net_cls", or multi-controllers like
  * "net_cls,net_prio".
  */
 int get_cgroup1_hierarchy_id(const char *subsys_name)

+84

tools/testing/selftests/bpf/config.riscv64

···
+CONFIG_AUDIT=y
+CONFIG_BLK_CGROUP=y
+CONFIG_BLK_DEV_INITRD=y
+CONFIG_BLK_DEV_RAM=y
+CONFIG_BONDING=y
+CONFIG_BPF_JIT_ALWAYS_ON=y
+CONFIG_BPF_PRELOAD=y
+CONFIG_BPF_PRELOAD_UMD=y
+CONFIG_CGROUPS=y
+CONFIG_CGROUP_CPUACCT=y
+CONFIG_CGROUP_DEVICE=y
+CONFIG_CGROUP_FREEZER=y
+CONFIG_CGROUP_HUGETLB=y
+CONFIG_CGROUP_NET_CLASSID=y
+CONFIG_CGROUP_PERF=y
+CONFIG_CGROUP_PIDS=y
+CONFIG_CGROUP_SCHED=y
+CONFIG_CPUSETS=y
+CONFIG_DEBUG_ATOMIC_SLEEP=y
+CONFIG_DEBUG_FS=y
+CONFIG_DETECT_HUNG_TASK=y
+CONFIG_DEVTMPFS=y
+CONFIG_DEVTMPFS_MOUNT=y
+CONFIG_EXPERT=y
+CONFIG_EXT4_FS=y
+CONFIG_EXT4_FS_POSIX_ACL=y
+CONFIG_EXT4_FS_SECURITY=y
+CONFIG_FRAME_POINTER=y
+CONFIG_HARDLOCKUP_DETECTOR=y
+CONFIG_HIGH_RES_TIMERS=y
+CONFIG_HUGETLBFS=y
+CONFIG_INET=y
+CONFIG_IPV6_SEG6_LWTUNNEL=y
+CONFIG_IP_ADVANCED_ROUTER=y
+CONFIG_IP_MULTICAST=y
+CONFIG_IP_MULTIPLE_TABLES=y
+CONFIG_JUMP_LABEL=y
+CONFIG_KALLSYMS_ALL=y
+CONFIG_KPROBES=y
+CONFIG_MEMCG=y
+CONFIG_NAMESPACES=y
+CONFIG_NET=y
+CONFIG_NETDEVICES=y
+CONFIG_NETFILTER_XT_MATCH_BPF=y
+CONFIG_NET_ACT_BPF=y
+CONFIG_NET_L3_MASTER_DEV=y
+CONFIG_NET_VRF=y
+CONFIG_NONPORTABLE=y
+CONFIG_NO_HZ_IDLE=y
+CONFIG_NR_CPUS=256
+CONFIG_PACKET=y
+CONFIG_PANIC_ON_OOPS=y
+CONFIG_PARTITION_ADVANCED=y
+CONFIG_PCI=y
+CONFIG_PCI_HOST_GENERIC=y
+CONFIG_POSIX_MQUEUE=y
+CONFIG_PRINTK_TIME=y
+CONFIG_PROC_KCORE=y
+CONFIG_PROFILING=y
+CONFIG_RCU_CPU_STALL_TIMEOUT=60
+CONFIG_RISCV_EFFICIENT_UNALIGNED_ACCESS=y
+CONFIG_RISCV_ISA_C=y
+CONFIG_RISCV_PMU=y
+CONFIG_RISCV_PMU_SBI=y
+CONFIG_RT_GROUP_SCHED=y
+CONFIG_SECURITY_NETWORK=y
+CONFIG_SERIAL_8250=y
+CONFIG_SERIAL_8250_CONSOLE=y
+CONFIG_SERIAL_OF_PLATFORM=y
+CONFIG_SMP=y
+CONFIG_SOC_VIRT=y
+CONFIG_SYSVIPC=y
+CONFIG_TCP_CONG_ADVANCED=y
+CONFIG_TLS=y
+CONFIG_TMPFS=y
+CONFIG_TMPFS_POSIX_ACL=y
+CONFIG_TUN=y
+CONFIG_UNIX=y
+CONFIG_UPROBES=y
+CONFIG_USER_NS=y
+CONFIG_VETH=y
+CONFIG_VLAN_8021Q=y
+CONFIG_VSOCKETS_LOOPBACK=y
+CONFIG_XFRM_USER=y

+69

tools/testing/selftests/bpf/disasm_helpers.c

···
+// SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
+
+#include <bpf/bpf.h>
+#include "disasm.h"
+
+struct print_insn_context {
+	char scratch[16];
+	char *buf;
+	size_t sz;
+};
+
+static void print_insn_cb(void *private_data, const char *fmt, ...)
+{
+	struct print_insn_context *ctx = private_data;
+	va_list args;
+
+	va_start(args, fmt);
+	vsnprintf(ctx->buf, ctx->sz, fmt, args);
+	va_end(args);
+}
+
+static const char *print_call_cb(void *private_data, const struct bpf_insn *insn)
+{
+	struct print_insn_context *ctx = private_data;
+
+	/* For pseudo calls verifier.c:jit_subprogs() hides original
+	 * imm to insn->off and changes insn->imm to be an index of
+	 * the subprog instead.
+	 */
+	if (insn->src_reg == BPF_PSEUDO_CALL) {
+		snprintf(ctx->scratch, sizeof(ctx->scratch), "%+d", insn->off);
+		return ctx->scratch;
+	}
+
+	return NULL;
+}
+
+struct bpf_insn *disasm_insn(struct bpf_insn *insn, char *buf, size_t buf_sz)
+{
+	struct print_insn_context ctx = {
+		.buf = buf,
+		.sz = buf_sz,
+	};
+	struct bpf_insn_cbs cbs = {
+		.cb_print	= print_insn_cb,
+		.cb_call	= print_call_cb,
+		.private_data	= &ctx,
+	};
+	char *tmp, *pfx_end, *sfx_start;
+	bool double_insn;
+	int len;
+
+	print_bpf_insn(&cbs, insn, true);
+	/* We share code with kernel BPF disassembler, it adds '(FF) ' prefix
+	 * for each instruction (FF stands for instruction `code` byte).
+	 * Remove the prefix inplace, and also simplify call instructions.
+	 * E.g.: "(85) call foo#10" -> "call foo".
+	 * Also remove newline in the end (the 'max(strlen(buf) - 1, 0)' thing).
+	 */
+	pfx_end = buf + 5;
+	sfx_start = buf + max((int)strlen(buf) - 1, 0);
+	if (strncmp(pfx_end, "call ", 5) == 0 && (tmp = strrchr(buf, '#')))
+		sfx_start = tmp;
+	len = sfx_start - pfx_end;
+	memmove(buf, pfx_end, len);
+	buf[len] = 0;
+	double_insn = insn->code == (BPF_LD | BPF_IMM | BPF_DW);
+	return insn + (double_insn ? 2 : 1);
+}
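
Note on the text cleanup in disasm_insn(): the in-place trimming described in the comment above can be tried outside the selftest harness. The following is only a minimal sketch; strip_insn_text() is a hypothetical name that does not appear in the patch, and the length guard is an assumption added here (the patch itself relies on print_bpf_insn() always emitting the 5-byte "(FF) " prefix).

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical standalone helper (not part of the patch): applies the same
 * in-place cleanup that disasm_insn() performs on print_bpf_insn() output.
 */
static void strip_insn_text(char *buf)
{
	size_t n = strlen(buf);
	char *pfx_end, *sfx_start, *tmp;
	size_t len;

	/* Assumption added for this sketch: text shorter than the "(FF) "
	 * prefix plus one trailing character is reduced to an empty string.
	 */
	if (n < 6) {
		buf[0] = '\0';
		return;
	}

	pfx_end = buf + 5;		/* skip the "(FF) " prefix */
	sfx_start = buf + n - 1;	/* drop the last character (the newline) */

	/* For calls, cut at the last '#' so "call foo#10" becomes "call foo" */
	if (strncmp(pfx_end, "call ", 5) == 0) {
		tmp = strrchr(buf, '#');
		if (tmp && tmp > pfx_end)
			sfx_start = tmp;
	}

	len = (size_t)(sfx_start - pfx_end);
	memmove(buf, pfx_end, len);	/* regions overlap, so memmove, not memcpy */
	buf[len] = '\0';
}
```

Calling it on a writable buffer holding "(85) call foo#10\n" leaves "call foo" in the buffer; a non-call line such as "(b7) r0 = 0\n" only loses the prefix and the trailing newline.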

+12

tools/testing/selftests/bpf/disasm_helpers.h

···
+/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */
+
+#ifndef __DISASM_HELPERS_H
+#define __DISASM_HELPERS_H
+
+#include <stdlib.h>
+
+struct bpf_insn;
+
+struct bpf_insn *disasm_insn(struct bpf_insn *insn, char *buf, size_t buf_sz);
+
+#endif /* __DISASM_HELPERS_H */

-151

tools/testing/selftests/bpf/get_cgroup_id_user.c

··· 1 - // SPDX-License-Identifier: GPL-2.0 2 - // Copyright (c) 2018 Facebook 3 - 4 - #include <stdio.h> 5 - #include <stdlib.h> 6 - #include <string.h> 7 - #include <errno.h> 8 - #include <fcntl.h> 9 - #include <syscall.h> 10 - #include <unistd.h> 11 - #include <linux/perf_event.h> 12 - #include <sys/ioctl.h> 13 - #include <sys/time.h> 14 - #include <sys/types.h> 15 - #include <sys/stat.h> 16 - 17 - #include <linux/bpf.h> 18 - #include <bpf/bpf.h> 19 - #include <bpf/libbpf.h> 20 - 21 - #include "cgroup_helpers.h" 22 - #include "testing_helpers.h" 23 - 24 - #define CHECK(condition, tag, format...) ({ \ 25 - int __ret = !!(condition); \ 26 - if (__ret) { \ 27 - printf("%s:FAIL:%s ", __func__, tag); \ 28 - printf(format); \ 29 - } else { \ 30 - printf("%s:PASS:%s\n", __func__, tag); \ 31 - } \ 32 - __ret; \ 33 - }) 34 - 35 - static int bpf_find_map(const char *test, struct bpf_object *obj, 36 - const char *name) 37 - { 38 - struct bpf_map *map; 39 - 40 - map = bpf_object__find_map_by_name(obj, name); 41 - if (!map) 42 - return -1; 43 - return bpf_map__fd(map); 44 - } 45 - 46 - #define TEST_CGROUP "/test-bpf-get-cgroup-id/" 47 - 48 - int main(int argc, char **argv) 49 - { 50 - const char *probe_name = "syscalls/sys_enter_nanosleep"; 51 - const char *file = "get_cgroup_id_kern.bpf.o"; 52 - int err, bytes, efd, prog_fd, pmu_fd; 53 - int cgroup_fd, cgidmap_fd, pidmap_fd; 54 - struct perf_event_attr attr = {}; 55 - struct bpf_object *obj; 56 - __u64 kcgid = 0, ucgid; 57 - __u32 key = 0, pid; 58 - int exit_code = 1; 59 - char buf[256]; 60 - const struct timespec req = { 61 - .tv_sec = 1, 62 - .tv_nsec = 0, 63 - }; 64 - 65 - cgroup_fd = cgroup_setup_and_join(TEST_CGROUP); 66 - if (CHECK(cgroup_fd < 0, "cgroup_setup_and_join", "err %d errno %d\n", cgroup_fd, errno)) 67 - return 1; 68 - 69 - /* Use libbpf 1.0 API mode */ 70 - libbpf_set_strict_mode(LIBBPF_STRICT_ALL); 71 - 72 - err = bpf_prog_test_load(file, BPF_PROG_TYPE_TRACEPOINT, &obj, &prog_fd); 73 - if (CHECK(err, 
"bpf_prog_test_load", "err %d errno %d\n", err, errno)) 74 - goto cleanup_cgroup_env; 75 - 76 - cgidmap_fd = bpf_find_map(__func__, obj, "cg_ids"); 77 - if (CHECK(cgidmap_fd < 0, "bpf_find_map", "err %d errno %d\n", 78 - cgidmap_fd, errno)) 79 - goto close_prog; 80 - 81 - pidmap_fd = bpf_find_map(__func__, obj, "pidmap"); 82 - if (CHECK(pidmap_fd < 0, "bpf_find_map", "err %d errno %d\n", 83 - pidmap_fd, errno)) 84 - goto close_prog; 85 - 86 - pid = getpid(); 87 - bpf_map_update_elem(pidmap_fd, &key, &pid, 0); 88 - 89 - if (access("/sys/kernel/tracing/trace", F_OK) == 0) { 90 - snprintf(buf, sizeof(buf), 91 - "/sys/kernel/tracing/events/%s/id", probe_name); 92 - } else { 93 - snprintf(buf, sizeof(buf), 94 - "/sys/kernel/debug/tracing/events/%s/id", probe_name); 95 - } 96 - efd = open(buf, O_RDONLY, 0); 97 - if (CHECK(efd < 0, "open", "err %d errno %d\n", efd, errno)) 98 - goto close_prog; 99 - bytes = read(efd, buf, sizeof(buf)); 100 - close(efd); 101 - if (CHECK(bytes <= 0 || bytes >= sizeof(buf), "read", 102 - "bytes %d errno %d\n", bytes, errno)) 103 - goto close_prog; 104 - 105 - attr.config = strtol(buf, NULL, 0); 106 - attr.type = PERF_TYPE_TRACEPOINT; 107 - attr.sample_type = PERF_SAMPLE_RAW; 108 - attr.sample_period = 1; 109 - attr.wakeup_events = 1; 110 - 111 - /* attach to this pid so the all bpf invocations will be in the 112 - * cgroup associated with this pid. 
113 - */ 114 - pmu_fd = syscall(__NR_perf_event_open, &attr, getpid(), -1, -1, 0); 115 - if (CHECK(pmu_fd < 0, "perf_event_open", "err %d errno %d\n", pmu_fd, 116 - errno)) 117 - goto close_prog; 118 - 119 - err = ioctl(pmu_fd, PERF_EVENT_IOC_ENABLE, 0); 120 - if (CHECK(err, "perf_event_ioc_enable", "err %d errno %d\n", err, 121 - errno)) 122 - goto close_pmu; 123 - 124 - err = ioctl(pmu_fd, PERF_EVENT_IOC_SET_BPF, prog_fd); 125 - if (CHECK(err, "perf_event_ioc_set_bpf", "err %d errno %d\n", err, 126 - errno)) 127 - goto close_pmu; 128 - 129 - /* trigger some syscalls */ 130 - syscall(__NR_nanosleep, &req, NULL); 131 - 132 - err = bpf_map_lookup_elem(cgidmap_fd, &key, &kcgid); 133 - if (CHECK(err, "bpf_map_lookup_elem", "err %d errno %d\n", err, errno)) 134 - goto close_pmu; 135 - 136 - ucgid = get_cgroup_id(TEST_CGROUP); 137 - if (CHECK(kcgid != ucgid, "compare_cgroup_id", 138 - "kern cgid %llx user cgid %llx", kcgid, ucgid)) 139 - goto close_pmu; 140 - 141 - exit_code = 0; 142 - printf("%s:PASS\n", argv[0]); 143 - 144 - close_pmu: 145 - close(pmu_fd); 146 - close_prog: 147 - bpf_object__close(obj); 148 - cleanup_cgroup_env: 149 - cleanup_cgroup_environment(); 150 - return exit_code; 151 - }

+245

tools/testing/selftests/bpf/jit_disasm_helpers.c

··· 1 + // SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) 2 + #include <bpf/bpf.h> 3 + #include <bpf/libbpf.h> 4 + #include <test_progs.h> 5 + 6 + #ifdef HAVE_LLVM_SUPPORT 7 + 8 + #include <llvm-c/Core.h> 9 + #include <llvm-c/Disassembler.h> 10 + #include <llvm-c/Target.h> 11 + #include <llvm-c/TargetMachine.h> 12 + 13 + /* The intent is to use get_jited_program_text() for small test 14 + * programs written in BPF assembly, thus assume that 32 local labels 15 + * would be sufficient. 16 + */ 17 + #define MAX_LOCAL_LABELS 32 18 + 19 + /* Local labels are encoded as 'L42', this requires 4 bytes of storage: 20 + * 3 characters + zero byte 21 + */ 22 + #define LOCAL_LABEL_LEN 4 23 + 24 + static bool llvm_initialized; 25 + 26 + struct local_labels { 27 + bool print_phase; 28 + __u32 prog_len; 29 + __u32 cnt; 30 + __u32 pcs[MAX_LOCAL_LABELS]; 31 + char names[MAX_LOCAL_LABELS][LOCAL_LABEL_LEN]; 32 + }; 33 + 34 + static const char *lookup_symbol(void *data, uint64_t ref_value, uint64_t *ref_type, 35 + uint64_t ref_pc, const char **ref_name) 36 + { 37 + struct local_labels *labels = data; 38 + uint64_t type = *ref_type; 39 + int i; 40 + 41 + *ref_type = LLVMDisassembler_ReferenceType_InOut_None; 42 + *ref_name = NULL; 43 + if (type != LLVMDisassembler_ReferenceType_In_Branch) 44 + return NULL; 45 + /* Depending on labels->print_phase either discover local labels or 46 + * return a name assigned with local jump target: 47 + * - if print_phase is true and ref_value is in labels->pcs, 48 + * return corresponding labels->name. 
49 + * - if print_phase is false, save program-local jump targets 50 + * in labels->pcs; 51 + */ 52 + if (labels->print_phase) { 53 + for (i = 0; i < labels->cnt; ++i) 54 + if (labels->pcs[i] == ref_value) 55 + return labels->names[i]; 56 + } else { 57 + if (labels->cnt < MAX_LOCAL_LABELS && ref_value < labels->prog_len) 58 + labels->pcs[labels->cnt++] = ref_value; 59 + } 60 + return NULL; 61 + } 62 + 63 + static int disasm_insn(LLVMDisasmContextRef ctx, uint8_t *image, __u32 len, __u32 pc, 64 + char *buf, __u32 buf_sz) 65 + { 66 + int i, cnt; 67 + 68 + cnt = LLVMDisasmInstruction(ctx, image + pc, len - pc, pc, 69 + buf, buf_sz); 70 + if (cnt > 0) 71 + return cnt; 72 + PRINT_FAIL("Can't disasm instruction at offset %d:", pc); 73 + for (i = 0; i < 16 && pc + i < len; ++i) 74 + printf(" %02x", image[pc + i]); 75 + printf("\n"); 76 + return -EINVAL; 77 + } 78 + 79 + static int cmp_u32(const void *_a, const void *_b) 80 + { 81 + __u32 a = *(__u32 *)_a; 82 + __u32 b = *(__u32 *)_b; 83 + 84 + if (a < b) 85 + return -1; 86 + if (a > b) 87 + return 1; 88 + return 0; 89 + } 90 + 91 + static int disasm_one_func(FILE *text_out, uint8_t *image, __u32 len) 92 + { 93 + char *label, *colon, *triple = NULL; 94 + LLVMDisasmContextRef ctx = NULL; 95 + struct local_labels labels = {}; 96 + __u32 *label_pc, pc; 97 + int i, cnt, err = 0; 98 + char buf[64]; 99 + 100 + triple = LLVMGetDefaultTargetTriple(); 101 + ctx = LLVMCreateDisasm(triple, &labels, 0, NULL, lookup_symbol); 102 + if (!ASSERT_OK_PTR(ctx, "LLVMCreateDisasm")) { 103 + err = -EINVAL; 104 + goto out; 105 + } 106 + 107 + cnt = LLVMSetDisasmOptions(ctx, LLVMDisassembler_Option_PrintImmHex); 108 + if (!ASSERT_EQ(cnt, 1, "LLVMSetDisasmOptions")) { 109 + err = -EINVAL; 110 + goto out; 111 + } 112 + 113 + /* discover labels */ 114 + labels.prog_len = len; 115 + pc = 0; 116 + while (pc < len) { 117 + cnt = disasm_insn(ctx, image, len, pc, buf, 1); 118 + if (cnt < 0) { 119 + err = cnt; 120 + goto out; 121 + } 122 + pc += cnt; 123 
+ } 124 + qsort(labels.pcs, labels.cnt, sizeof(*labels.pcs), cmp_u32); 125 + for (i = 0; i < labels.cnt; ++i) 126 + /* gcc is unable to infer upper bound for labels.cnt and assumes 127 + * it to be U32_MAX. U32_MAX takes 10 decimal digits. 128 + * snprintf below prints into labels.names[*], 129 + * which has space only for two digits and a letter. 130 + * To avoid truncation warning use (i % MAX_LOCAL_LABELS), 131 + * which informs gcc about printed value upper bound. 132 + */ 133 + snprintf(labels.names[i], sizeof(labels.names[i]), "L%d", i % MAX_LOCAL_LABELS); 134 + 135 + /* now print with labels */ 136 + labels.print_phase = true; 137 + pc = 0; 138 + while (pc < len) { 139 + cnt = disasm_insn(ctx, image, len, pc, buf, sizeof(buf)); 140 + if (cnt < 0) { 141 + err = cnt; 142 + goto out; 143 + } 144 + label_pc = bsearch(&pc, labels.pcs, labels.cnt, sizeof(*labels.pcs), cmp_u32); 145 + label = ""; 146 + colon = ""; 147 + if (label_pc) { 148 + label = labels.names[label_pc - labels.pcs]; 149 + colon = ":"; 150 + } 151 + fprintf(text_out, "%x:\t", pc); 152 + for (i = 0; i < cnt; ++i) 153 + fprintf(text_out, "%02x ", image[pc + i]); 154 + for (i = cnt * 3; i < 12 * 3; ++i) 155 + fputc(' ', text_out); 156 + fprintf(text_out, "%s%s%s\n", label, colon, buf); 157 + pc += cnt; 158 + } 159 + 160 + out: 161 + if (triple) 162 + LLVMDisposeMessage(triple); 163 + if (ctx) 164 + LLVMDisasmDispose(ctx); 165 + return err; 166 + } 167 + 168 + int get_jited_program_text(int fd, char *text, size_t text_sz) 169 + { 170 + struct bpf_prog_info info = {}; 171 + __u32 info_len = sizeof(info); 172 + __u32 jited_funcs, len, pc; 173 + __u32 *func_lens = NULL; 174 + FILE *text_out = NULL; 175 + uint8_t *image = NULL; 176 + int i, err = 0; 177 + 178 + if (!llvm_initialized) { 179 + LLVMInitializeAllTargetInfos(); 180 + LLVMInitializeAllTargetMCs(); 181 + LLVMInitializeAllDisassemblers(); 182 + llvm_initialized = 1; 183 + } 184 + 185 + text_out = fmemopen(text, text_sz, "w"); 186 + if 
(!ASSERT_OK_PTR(text_out, "open_memstream")) { 187 + err = -errno; 188 + goto out; 189 + } 190 + 191 + /* first call is to find out jited program len */ 192 + err = bpf_prog_get_info_by_fd(fd, &info, &info_len); 193 + if (!ASSERT_OK(err, "bpf_prog_get_info_by_fd #1")) 194 + goto out; 195 + 196 + len = info.jited_prog_len; 197 + image = malloc(len); 198 + if (!ASSERT_OK_PTR(image, "malloc(info.jited_prog_len)")) { 199 + err = -ENOMEM; 200 + goto out; 201 + } 202 + 203 + jited_funcs = info.nr_jited_func_lens; 204 + func_lens = malloc(jited_funcs * sizeof(__u32)); 205 + if (!ASSERT_OK_PTR(func_lens, "malloc(info.nr_jited_func_lens)")) { 206 + err = -ENOMEM; 207 + goto out; 208 + } 209 + 210 + memset(&info, 0, sizeof(info)); 211 + info.jited_prog_insns = (__u64)image; 212 + info.jited_prog_len = len; 213 + info.jited_func_lens = (__u64)func_lens; 214 + info.nr_jited_func_lens = jited_funcs; 215 + err = bpf_prog_get_info_by_fd(fd, &info, &info_len); 216 + if (!ASSERT_OK(err, "bpf_prog_get_info_by_fd #2")) 217 + goto out; 218 + 219 + for (pc = 0, i = 0; i < jited_funcs; ++i) { 220 + fprintf(text_out, "func #%d:\n", i); 221 + disasm_one_func(text_out, image + pc, func_lens[i]); 222 + fprintf(text_out, "\n"); 223 + pc += func_lens[i]; 224 + } 225 + 226 + out: 227 + if (text_out) 228 + fclose(text_out); 229 + if (image) 230 + free(image); 231 + if (func_lens) 232 + free(func_lens); 233 + return err; 234 + } 235 + 236 + #else /* HAVE_LLVM_SUPPORT */ 237 + 238 + int get_jited_program_text(int fd, char *text, size_t text_sz) 239 + { 240 + if (env.verbosity >= VERBOSE_VERY) 241 + printf("compiled w/o llvm development libraries, can't dis-assembly binary code"); 242 + return -EOPNOTSUPP; 243 + } 244 + 245 + #endif /* HAVE_LLVM_SUPPORT */
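
Note on the two-pass labelling in disasm_one_func(): the first pass collects branch targets, the targets are then sorted and named L0, L1, ..., and the second pass looks each pc up with bsearch(). The sketch below isolates that bookkeeping without LLVM. The names label_table, label_add(), label_finalize() and label_lookup() are hypothetical and chosen for illustration; only cmp_u32() and the two size macros mirror the patch. Like the code shown above, the sketch does not deduplicate targets.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_LOCAL_LABELS 32
#define LOCAL_LABEL_LEN 4	/* 'L' + up to two digits + NUL */

/* Hypothetical container, loosely modelled on struct local_labels */
struct label_table {
	uint32_t cnt;
	uint32_t pcs[MAX_LOCAL_LABELS];
	char names[MAX_LOCAL_LABELS][LOCAL_LABEL_LEN];
};

static int cmp_u32(const void *_a, const void *_b)
{
	uint32_t a = *(const uint32_t *)_a;
	uint32_t b = *(const uint32_t *)_b;

	if (a < b)
		return -1;
	if (a > b)
		return 1;
	return 0;
}

/* Pass 1: remember a jump target; targets beyond the limit are dropped */
static void label_add(struct label_table *t, uint32_t pc)
{
	if (t->cnt < MAX_LOCAL_LABELS)
		t->pcs[t->cnt++] = pc;
}

/* Between passes: sort the targets, then name them in address order */
static void label_finalize(struct label_table *t)
{
	uint32_t i;

	qsort(t->pcs, t->cnt, sizeof(t->pcs[0]), cmp_u32);
	for (i = 0; i < t->cnt; i++)
		snprintf(t->names[i], sizeof(t->names[i]), "L%u",
			 (unsigned int)(i % MAX_LOCAL_LABELS));
}

/* Pass 2: return the label for pc, or NULL if pc is not a jump target */
static const char *label_lookup(const struct label_table *t, uint32_t pc)
{
	const uint32_t *hit;

	hit = bsearch(&pc, t->pcs, t->cnt, sizeof(t->pcs[0]), cmp_u32);
	return hit ? t->names[hit - t->pcs] : NULL;
}
```

Sorting before naming is what makes the labels appear in ascending order in the printed listing, and it is also what allows the binary search in the print pass.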

+10

tools/testing/selftests/bpf/jit_disasm_helpers.h

···
+/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */
+
+#ifndef __JIT_DISASM_HELPERS_H
+#define __JIT_DISASM_HELPERS_H
+
+#include <stddef.h>
+
+int get_jited_program_text(int fd, char *text, size_t text_sz);
+
+#endif /* __JIT_DISASM_HELPERS_H */

+1 -1

tools/testing/selftests/bpf/map_tests/htab_map_batch_ops.c

···
 	CHECK(total != max_entries, "delete with steps",
 	      "total = %u, max_entries = %u\n", total, max_entries);
 
-	/* check map is empty, errono == ENOENT */
+	/* check map is empty, errno == ENOENT */
 	err = bpf_map_get_next_key(map_fd, NULL, &key);
 	CHECK(!err || errno != ENOENT, "bpf_map_get_next_key()",
 	      "error: %s\n", strerror(errno));

+1 -1

tools/testing/selftests/bpf/map_tests/lpm_trie_map_batch_ops.c

···
 	CHECK(total != max_entries, "delete with steps",
 	      "total = %u, max_entries = %u\n", total, max_entries);
 
-	/* check map is empty, errono == ENOENT */
+	/* check map is empty, errno == ENOENT */
 	err = bpf_map_get_next_key(map_fd, NULL, &key);
 	CHECK(!err || errno != ENOENT, "bpf_map_get_next_key()",
 	      "error: %s\n", strerror(errno));

+18

tools/testing/selftests/bpf/map_tests/map_percpu_stats.c

···
 #define MAX_ENTRIES_HASH_OF_MAPS 64
 #define N_THREADS 8
 #define MAX_MAP_KEY_SIZE 4
+#define PCPU_MIN_UNIT_SIZE 32768
 
 static void map_info(int map_fd, struct bpf_map_info *info)
 {
···
 	printf("test_%s:PASS\n", __func__);
 }
 
+static void map_percpu_stats_map_value_size(void)
+{
+	int fd;
+	int value_sz = PCPU_MIN_UNIT_SIZE + 1;
+	struct bpf_map_create_opts opts = { .sz = sizeof(opts) };
+	enum bpf_map_type map_types[] = { BPF_MAP_TYPE_PERCPU_ARRAY,
+					  BPF_MAP_TYPE_PERCPU_HASH,
+					  BPF_MAP_TYPE_LRU_PERCPU_HASH };
+	for (int i = 0; i < ARRAY_SIZE(map_types); i++) {
+		fd = bpf_map_create(map_types[i], NULL, sizeof(__u32), value_sz, 1, &opts);
+		CHECK(fd < 0 && errno != E2BIG, "percpu map value size",
+		      "error: %s\n", strerror(errno));
+	}
+	printf("test_%s:PASS\n", __func__);
+}
+
 void test_map_percpu_stats(void)
 {
 	map_percpu_stats_hash();
···
 	map_percpu_stats_percpu_lru_hash();
 	map_percpu_stats_percpu_lru_hash_no_common();
 	map_percpu_stats_hash_of_maps();
+	map_percpu_stats_map_value_size();
 }

+1 -1

tools/testing/selftests/bpf/map_tests/sk_storage_map.c

···
 		rlim_new.rlim_max = rlim_new.rlim_cur + 128;
 		err = setrlimit(RLIMIT_NOFILE, &rlim_new);
 		CHECK(err, "setrlimit(RLIMIT_NOFILE)", "rlim_new:%lu errno:%d",
-		      rlim_new.rlim_cur, errno);
+		      (unsigned long) rlim_new.rlim_cur, errno);
 	}
 
 	err = do_sk_storage_map_stress_free();

+543 -59

tools/testing/selftests/bpf/network_helpers.c

··· 11 11 #include <arpa/inet.h> 12 12 #include <sys/mount.h> 13 13 #include <sys/stat.h> 14 + #include <sys/types.h> 14 15 #include <sys/un.h> 16 + #include <sys/eventfd.h> 15 17 16 18 #include <linux/err.h> 17 19 #include <linux/in.h> 18 20 #include <linux/in6.h> 19 21 #include <linux/limits.h> 20 22 23 + #include <linux/ip.h> 24 + #include <linux/udp.h> 25 + #include <netinet/tcp.h> 26 + #include <net/if.h> 27 + 21 28 #include "bpf_util.h" 22 29 #include "network_helpers.h" 23 30 #include "test_progs.h" 31 + 32 + #ifdef TRAFFIC_MONITOR 33 + /* Prevent pcap.h from including pcap/bpf.h and causing conflicts */ 34 + #define PCAP_DONT_INCLUDE_PCAP_BPF_H 1 35 + #include <pcap/pcap.h> 36 + #include <pcap/dlt.h> 37 + #endif 24 38 25 39 #ifndef IPPROTO_MPTCP 26 40 #define IPPROTO_MPTCP 262 ··· 94 80 95 81 #define save_errno_close(fd) ({ int __save = errno; close(fd); errno = __save; }) 96 82 97 - static int __start_server(int type, const struct sockaddr *addr, socklen_t addrlen, 98 - const struct network_helper_opts *opts) 83 + int start_server_addr(int type, const struct sockaddr_storage *addr, socklen_t addrlen, 84 + const struct network_helper_opts *opts) 99 85 { 100 86 int fd; 101 87 102 - fd = socket(addr->sa_family, type, opts->proto); 88 + if (!opts) 89 + opts = &default_opts; 90 + 91 + fd = socket(addr->ss_family, type, opts->proto); 103 92 if (fd < 0) { 104 93 log_err("Failed to create server socket"); 105 94 return -1; ··· 117 100 goto error_close; 118 101 } 119 102 120 - if (bind(fd, addr, addrlen) < 0) { 103 + if (bind(fd, (struct sockaddr *)addr, addrlen) < 0) { 121 104 log_err("Failed to bind socket"); 122 105 goto error_close; 123 106 } ··· 148 131 if (make_sockaddr(family, addr_str, port, &addr, &addrlen)) 149 132 return -1; 150 133 151 - return __start_server(type, (struct sockaddr *)&addr, addrlen, opts); 134 + return start_server_addr(type, &addr, addrlen, opts); 152 135 } 153 136 154 137 int start_server(int family, int type, const char *addr_str, 
__u16 port, ··· 190 173 if (!fds) 191 174 return NULL; 192 175 193 - fds[0] = __start_server(type, (struct sockaddr *)&addr, addrlen, &opts); 176 + fds[0] = start_server_addr(type, &addr, addrlen, &opts); 194 177 if (fds[0] == -1) 195 178 goto close_fds; 196 179 nr_fds = 1; ··· 199 182 goto close_fds; 200 183 201 184 for (; nr_fds < nr_listens; nr_fds++) { 202 - fds[nr_fds] = __start_server(type, (struct sockaddr *)&addr, addrlen, &opts); 185 + fds[nr_fds] = start_server_addr(type, &addr, addrlen, &opts); 203 186 if (fds[nr_fds] == -1) 204 187 goto close_fds; 205 188 } ··· 209 192 close_fds: 210 193 free_fds(fds, nr_fds); 211 194 return NULL; 212 - } 213 - 214 - int start_server_addr(int type, const struct sockaddr_storage *addr, socklen_t len, 215 - const struct network_helper_opts *opts) 216 - { 217 - if (!opts) 218 - opts = &default_opts; 219 - 220 - return __start_server(type, (struct sockaddr *)addr, len, opts); 221 195 } 222 196 223 197 void free_fds(int *fds, unsigned int nr_close_fds) ··· 285 277 return -1; 286 278 } 287 279 288 - static int connect_fd_to_addr(int fd, 289 - const struct sockaddr_storage *addr, 290 - socklen_t addrlen, const bool must_fail) 291 - { 292 - int ret; 293 - 294 - errno = 0; 295 - ret = connect(fd, (const struct sockaddr *)addr, addrlen); 296 - if (must_fail) { 297 - if (!ret) { 298 - log_err("Unexpected success to connect to server"); 299 - return -1; 300 - } 301 - if (errno != EPERM) { 302 - log_err("Unexpected error from connect to server"); 303 - return -1; 304 - } 305 - } else { 306 - if (ret) { 307 - log_err("Failed to connect to server"); 308 - return -1; 309 - } 310 - } 311 - 312 - return 0; 313 - } 314 - 315 280 int connect_to_addr(int type, const struct sockaddr_storage *addr, socklen_t addrlen, 316 281 const struct network_helper_opts *opts) 317 282 { ··· 299 318 return -1; 300 319 } 301 320 302 - if (connect_fd_to_addr(fd, addr, addrlen, opts->must_fail)) 303 - goto error_close; 321 + if (connect(fd, (const struct 
sockaddr *)addr, addrlen)) { 322 + log_err("Failed to connect to server"); 323 + save_errno_close(fd); 324 + return -1; 325 + } 304 326 305 327 return fd; 306 - 307 - error_close: 308 - save_errno_close(fd); 309 - return -1; 310 328 } 311 329 312 - int connect_to_fd_opts(int server_fd, int type, const struct network_helper_opts *opts) 330 + int connect_to_addr_str(int family, int type, const char *addr_str, __u16 port, 331 + const struct network_helper_opts *opts) 313 332 { 314 333 struct sockaddr_storage addr; 315 334 socklen_t addrlen; 316 335 317 336 if (!opts) 318 337 opts = &default_opts; 338 + 339 + if (make_sockaddr(family, addr_str, port, &addr, &addrlen)) 340 + return -1; 341 + 342 + return connect_to_addr(type, &addr, addrlen, opts); 343 + } 344 + 345 + int connect_to_fd_opts(int server_fd, const struct network_helper_opts *opts) 346 + { 347 + struct sockaddr_storage addr; 348 + socklen_t addrlen, optlen; 349 + int type; 350 + 351 + if (!opts) 352 + opts = &default_opts; 353 + 354 + optlen = sizeof(type); 355 + if (getsockopt(server_fd, SOL_SOCKET, SO_TYPE, &type, &optlen)) { 356 + log_err("getsockopt(SOL_TYPE)"); 357 + return -1; 358 + } 319 359 320 360 addrlen = sizeof(addr); 321 361 if (getsockname(server_fd, (struct sockaddr *)&addr, &addrlen)) { ··· 352 350 struct network_helper_opts opts = { 353 351 .timeout_ms = timeout_ms, 354 352 }; 355 - int type, protocol; 356 353 socklen_t optlen; 357 - 358 - optlen = sizeof(type); 359 - if (getsockopt(server_fd, SOL_SOCKET, SO_TYPE, &type, &optlen)) { 360 - log_err("getsockopt(SOL_TYPE)"); 361 - return -1; 362 - } 354 + int protocol; 363 355 364 356 optlen = sizeof(protocol); 365 357 if (getsockopt(server_fd, SOL_SOCKET, SO_PROTOCOL, &protocol, &optlen)) { ··· 362 366 } 363 367 opts.proto = protocol; 364 368 365 - return connect_to_fd_opts(server_fd, type, &opts); 369 + return connect_to_fd_opts(server_fd, &opts); 366 370 } 367 371 368 372 int connect_fd_to_fd(int client_fd, int server_fd, int timeout_ms) ··· 
378 382 return -1; 379 383 } 380 384 381 - if (connect_fd_to_addr(client_fd, &addr, len, false)) 385 + if (connect(client_fd, (const struct sockaddr *)&addr, len)) { 386 + log_err("Failed to connect to server"); 382 387 return -1; 388 + } 383 389 384 390 return 0; 385 391 } ··· 444 446 return "ping -6"; 445 447 } 446 448 return "ping"; 449 + } 450 + 451 + int remove_netns(const char *name) 452 + { 453 + char *cmd; 454 + int r; 455 + 456 + r = asprintf(&cmd, "ip netns del %s >/dev/null 2>&1", name); 457 + if (r < 0) { 458 + log_err("Failed to malloc cmd"); 459 + return -1; 460 + } 461 + 462 + r = system(cmd); 463 + free(cmd); 464 + return r; 465 + } 466 + 467 + int make_netns(const char *name) 468 + { 469 + char *cmd; 470 + int r; 471 + 472 + r = asprintf(&cmd, "ip netns add %s", name); 473 + if (r < 0) { 474 + log_err("Failed to malloc cmd"); 475 + return -1; 476 + } 477 + 478 + r = system(cmd); 479 + free(cmd); 480 + 481 + if (r) 482 + return r; 483 + 484 + r = asprintf(&cmd, "ip -n %s link set lo up", name); 485 + if (r < 0) { 486 + log_err("Failed to malloc cmd for setting up lo"); 487 + remove_netns(name); 488 + return -1; 489 + } 490 + 491 + r = system(cmd); 492 + free(cmd); 493 + 494 + return r; 447 495 } 448 496 449 497 struct nstoken { ··· 720 676 721 677 return err; 722 678 } 679 + 680 + #ifdef TRAFFIC_MONITOR 681 + struct tmonitor_ctx { 682 + pcap_t *pcap; 683 + pcap_dumper_t *dumper; 684 + pthread_t thread; 685 + int wake_fd; 686 + 687 + volatile bool done; 688 + char pkt_fname[PATH_MAX]; 689 + int pcap_fd; 690 + }; 691 + 692 + /* Is this packet captured with a Ethernet protocol type? */ 693 + static bool is_ethernet(const u_char *packet) 694 + { 695 + u16 arphdr_type; 696 + 697 + memcpy(&arphdr_type, packet + 8, 2); 698 + arphdr_type = ntohs(arphdr_type); 699 + 700 + /* Except the following cases, the protocol type contains the 701 + * Ethernet protocol type for the packet. 
702 + * 703 + * https://www.tcpdump.org/linktypes/LINKTYPE_LINUX_SLL2.html 704 + */ 705 + switch (arphdr_type) { 706 + case 770: /* ARPHRD_FRAD */ 707 + case 778: /* ARPHDR_IPGRE */ 708 + case 803: /* ARPHRD_IEEE80211_RADIOTAP */ 709 + printf("Packet captured: arphdr_type=%d\n", arphdr_type); 710 + return false; 711 + } 712 + return true; 713 + } 714 + 715 + static const char * const pkt_types[] = { 716 + "In", 717 + "B", /* Broadcast */ 718 + "M", /* Multicast */ 719 + "C", /* Captured with the promiscuous mode */ 720 + "Out", 721 + }; 722 + 723 + static const char *pkt_type_str(u16 pkt_type) 724 + { 725 + if (pkt_type < ARRAY_SIZE(pkt_types)) 726 + return pkt_types[pkt_type]; 727 + return "Unknown"; 728 + } 729 + 730 + /* Show the information of the transport layer in the packet */ 731 + static void show_transport(const u_char *packet, u16 len, u32 ifindex, 732 + const char *src_addr, const char *dst_addr, 733 + u16 proto, bool ipv6, u8 pkt_type) 734 + { 735 + char *ifname, _ifname[IF_NAMESIZE]; 736 + const char *transport_str; 737 + u16 src_port, dst_port; 738 + struct udphdr *udp; 739 + struct tcphdr *tcp; 740 + 741 + ifname = if_indextoname(ifindex, _ifname); 742 + if (!ifname) { 743 + snprintf(_ifname, sizeof(_ifname), "unknown(%d)", ifindex); 744 + ifname = _ifname; 745 + } 746 + 747 + if (proto == IPPROTO_UDP) { 748 + udp = (struct udphdr *)packet; 749 + src_port = ntohs(udp->source); 750 + dst_port = ntohs(udp->dest); 751 + transport_str = "UDP"; 752 + } else if (proto == IPPROTO_TCP) { 753 + tcp = (struct tcphdr *)packet; 754 + src_port = ntohs(tcp->source); 755 + dst_port = ntohs(tcp->dest); 756 + transport_str = "TCP"; 757 + } else if (proto == IPPROTO_ICMP) { 758 + printf("%-7s %-3s IPv4 %s > %s: ICMP, length %d, type %d, code %d\n", 759 + ifname, pkt_type_str(pkt_type), src_addr, dst_addr, len, 760 + packet[0], packet[1]); 761 + return; 762 + } else if (proto == IPPROTO_ICMPV6) { 763 + printf("%-7s %-3s IPv6 %s > %s: ICMPv6, length %d, type %d, code 
%d\n", 764 + ifname, pkt_type_str(pkt_type), src_addr, dst_addr, len, 765 + packet[0], packet[1]); 766 + return; 767 + } else { 768 + printf("%-7s %-3s %s %s > %s: protocol %d\n", 769 + ifname, pkt_type_str(pkt_type), ipv6 ? "IPv6" : "IPv4", 770 + src_addr, dst_addr, proto); 771 + return; 772 + } 773 + 774 + /* TCP or UDP*/ 775 + 776 + flockfile(stdout); 777 + if (ipv6) 778 + printf("%-7s %-3s IPv6 %s.%d > %s.%d: %s, length %d", 779 + ifname, pkt_type_str(pkt_type), src_addr, src_port, 780 + dst_addr, dst_port, transport_str, len); 781 + else 782 + printf("%-7s %-3s IPv4 %s:%d > %s:%d: %s, length %d", 783 + ifname, pkt_type_str(pkt_type), src_addr, src_port, 784 + dst_addr, dst_port, transport_str, len); 785 + 786 + if (proto == IPPROTO_TCP) { 787 + if (tcp->fin) 788 + printf(", FIN"); 789 + if (tcp->syn) 790 + printf(", SYN"); 791 + if (tcp->rst) 792 + printf(", RST"); 793 + if (tcp->ack) 794 + printf(", ACK"); 795 + } 796 + 797 + printf("\n"); 798 + funlockfile(stdout); 799 + } 800 + 801 + static void show_ipv6_packet(const u_char *packet, u32 ifindex, u8 pkt_type) 802 + { 803 + char src_buf[INET6_ADDRSTRLEN], dst_buf[INET6_ADDRSTRLEN]; 804 + struct ipv6hdr *pkt = (struct ipv6hdr *)packet; 805 + const char *src, *dst; 806 + u_char proto; 807 + 808 + src = inet_ntop(AF_INET6, &pkt->saddr, src_buf, sizeof(src_buf)); 809 + if (!src) 810 + src = "<invalid>"; 811 + dst = inet_ntop(AF_INET6, &pkt->daddr, dst_buf, sizeof(dst_buf)); 812 + if (!dst) 813 + dst = "<invalid>"; 814 + proto = pkt->nexthdr; 815 + show_transport(packet + sizeof(struct ipv6hdr), 816 + ntohs(pkt->payload_len), 817 + ifindex, src, dst, proto, true, pkt_type); 818 + } 819 + 820 + static void show_ipv4_packet(const u_char *packet, u32 ifindex, u8 pkt_type) 821 + { 822 + char src_buf[INET_ADDRSTRLEN], dst_buf[INET_ADDRSTRLEN]; 823 + struct iphdr *pkt = (struct iphdr *)packet; 824 + const char *src, *dst; 825 + u_char proto; 826 + 827 + src = inet_ntop(AF_INET, &pkt->saddr, src_buf, sizeof(src_buf)); 
828 + if (!src) 829 + src = "<invalid>"; 830 + dst = inet_ntop(AF_INET, &pkt->daddr, dst_buf, sizeof(dst_buf)); 831 + if (!dst) 832 + dst = "<invalid>"; 833 + proto = pkt->protocol; 834 + show_transport(packet + sizeof(struct iphdr), 835 + ntohs(pkt->tot_len), 836 + ifindex, src, dst, proto, false, pkt_type); 837 + } 838 + 839 + static void *traffic_monitor_thread(void *arg) 840 + { 841 + char *ifname, _ifname[IF_NAMESIZE]; 842 + const u_char *packet, *payload; 843 + struct tmonitor_ctx *ctx = arg; 844 + pcap_dumper_t *dumper = ctx->dumper; 845 + int fd = ctx->pcap_fd, nfds, r; 846 + int wake_fd = ctx->wake_fd; 847 + struct pcap_pkthdr header; 848 + pcap_t *pcap = ctx->pcap; 849 + u32 ifindex; 850 + fd_set fds; 851 + u16 proto; 852 + u8 ptype; 853 + 854 + nfds = (fd > wake_fd ? fd : wake_fd) + 1; 855 + FD_ZERO(&fds); 856 + 857 + while (!ctx->done) { 858 + FD_SET(fd, &fds); 859 + FD_SET(wake_fd, &fds); 860 + r = select(nfds, &fds, NULL, NULL, NULL); 861 + if (!r) 862 + continue; 863 + if (r < 0) { 864 + if (errno == EINTR) 865 + continue; 866 + log_err("Fail to select on pcap fd and wake fd"); 867 + break; 868 + } 869 + 870 + /* This instance of pcap is non-blocking */ 871 + packet = pcap_next(pcap, &header); 872 + if (!packet) 873 + continue; 874 + 875 + /* According to the man page of pcap_dump(), first argument 876 + * is the pcap_dumper_t pointer even it's argument type is 877 + * u_char *. 878 + */ 879 + pcap_dump((u_char *)dumper, &header, packet); 880 + 881 + /* Not sure what other types of packets look like. Here, we 882 + * parse only Ethernet and compatible packets. 883 + */ 884 + if (!is_ethernet(packet)) 885 + continue; 886 + 887 + /* Skip SLL2 header 888 + * https://www.tcpdump.org/linktypes/LINKTYPE_LINUX_SLL2.html 889 + * 890 + * Although the document doesn't mention that, the payload 891 + * doesn't include the Ethernet header. The payload starts 892 + * from the first byte of the network layer header. 
893 + */ 894 + payload = packet + 20; 895 + 896 + memcpy(&proto, packet, 2); 897 + proto = ntohs(proto); 898 + memcpy(&ifindex, packet + 4, 4); 899 + ifindex = ntohl(ifindex); 900 + ptype = packet[10]; 901 + 902 + if (proto == ETH_P_IPV6) { 903 + show_ipv6_packet(payload, ifindex, ptype); 904 + } else if (proto == ETH_P_IP) { 905 + show_ipv4_packet(payload, ifindex, ptype); 906 + } else { 907 + ifname = if_indextoname(ifindex, _ifname); 908 + if (!ifname) { 909 + snprintf(_ifname, sizeof(_ifname), "unknown(%d)", ifindex); 910 + ifname = _ifname; 911 + } 912 + 913 + printf("%-7s %-3s Unknown network protocol type 0x%x\n", 914 + ifname, pkt_type_str(ptype), proto); 915 + } 916 + } 917 + 918 + return NULL; 919 + } 920 + 921 + /* Prepare the pcap handle to capture packets. 922 + * 923 + * This pcap is non-blocking and immediate mode is enabled to receive 924 + * captured packets as soon as possible. The snaplen is set to 1024 bytes 925 + * to limit the size of captured content. The format of the link-layer 926 + * header is set to DLT_LINUX_SLL2 to enable handling various link-layer 927 + * technologies. 
928 + */ 929 + static pcap_t *traffic_monitor_prepare_pcap(void) 930 + { 931 + char errbuf[PCAP_ERRBUF_SIZE]; 932 + pcap_t *pcap; 933 + int r; 934 + 935 + /* Listen on all NICs in the namespace */ 936 + pcap = pcap_create("any", errbuf); 937 + if (!pcap) { 938 + log_err("Failed to open pcap: %s", errbuf); 939 + return NULL; 940 + } 941 + /* Limit the size of the packet (first N bytes) */ 942 + r = pcap_set_snaplen(pcap, 1024); 943 + if (r) { 944 + log_err("Failed to set snaplen: %s", pcap_geterr(pcap)); 945 + goto error; 946 + } 947 + /* To receive packets as fast as possible */ 948 + r = pcap_set_immediate_mode(pcap, 1); 949 + if (r) { 950 + log_err("Failed to set immediate mode: %s", pcap_geterr(pcap)); 951 + goto error; 952 + } 953 + r = pcap_setnonblock(pcap, 1, errbuf); 954 + if (r) { 955 + log_err("Failed to set nonblock: %s", errbuf); 956 + goto error; 957 + } 958 + r = pcap_activate(pcap); 959 + if (r) { 960 + log_err("Failed to activate pcap: %s", pcap_geterr(pcap)); 961 + goto error; 962 + } 963 + /* Determine the format of the link-layer header */ 964 + r = pcap_set_datalink(pcap, DLT_LINUX_SLL2); 965 + if (r) { 966 + log_err("Failed to set datalink: %s", pcap_geterr(pcap)); 967 + goto error; 968 + } 969 + 970 + return pcap; 971 + error: 972 + pcap_close(pcap); 973 + return NULL; 974 + } 975 + 976 + static void encode_test_name(char *buf, size_t len, const char *test_name, const char *subtest_name) 977 + { 978 + char *p; 979 + 980 + if (subtest_name) 981 + snprintf(buf, len, "%s__%s", test_name, subtest_name); 982 + else 983 + snprintf(buf, len, "%s", test_name); 984 + while ((p = strchr(buf, '/'))) 985 + *p = '_'; 986 + while ((p = strchr(buf, ' '))) 987 + *p = '_'; 988 + } 989 + 990 + #define PCAP_DIR "/tmp/tmon_pcap" 991 + 992 + /* Start to monitor the network traffic in the given network namespace. 993 + * 994 + * netns: the name of the network namespace to monitor. If NULL, the 995 + * current network namespace is monitored. 
996 + * test_name: the name of the running test. 997 + * subtest_name: the name of the running subtest if there is. It should be 998 + * NULL if it is not a subtest. 999 + * 1000 + * This function will start a thread to capture packets going through NICs 1001 + * in the give network namespace. 1002 + */ 1003 + struct tmonitor_ctx *traffic_monitor_start(const char *netns, const char *test_name, 1004 + const char *subtest_name) 1005 + { 1006 + struct nstoken *nstoken = NULL; 1007 + struct tmonitor_ctx *ctx; 1008 + char test_name_buf[64]; 1009 + static int tmon_seq; 1010 + int r; 1011 + 1012 + if (netns) { 1013 + nstoken = open_netns(netns); 1014 + if (!nstoken) 1015 + return NULL; 1016 + } 1017 + ctx = malloc(sizeof(*ctx)); 1018 + if (!ctx) { 1019 + log_err("Failed to malloc ctx"); 1020 + goto fail_ctx; 1021 + } 1022 + memset(ctx, 0, sizeof(*ctx)); 1023 + 1024 + encode_test_name(test_name_buf, sizeof(test_name_buf), test_name, subtest_name); 1025 + snprintf(ctx->pkt_fname, sizeof(ctx->pkt_fname), 1026 + PCAP_DIR "/packets-%d-%d-%s-%s.log", getpid(), tmon_seq++, 1027 + test_name_buf, netns ? 
netns : "unknown"); 1028 + 1029 + r = mkdir(PCAP_DIR, 0755); 1030 + if (r && errno != EEXIST) { 1031 + log_err("Failed to create " PCAP_DIR); 1032 + goto fail_pcap; 1033 + } 1034 + 1035 + ctx->pcap = traffic_monitor_prepare_pcap(); 1036 + if (!ctx->pcap) 1037 + goto fail_pcap; 1038 + ctx->pcap_fd = pcap_get_selectable_fd(ctx->pcap); 1039 + if (ctx->pcap_fd < 0) { 1040 + log_err("Failed to get pcap fd"); 1041 + goto fail_dumper; 1042 + } 1043 + 1044 + /* Create a packet file */ 1045 + ctx->dumper = pcap_dump_open(ctx->pcap, ctx->pkt_fname); 1046 + if (!ctx->dumper) { 1047 + log_err("Failed to open pcap dump: %s", ctx->pkt_fname); 1048 + goto fail_dumper; 1049 + } 1050 + 1051 + /* Create an eventfd to wake up the monitor thread */ 1052 + ctx->wake_fd = eventfd(0, 0); 1053 + if (ctx->wake_fd < 0) { 1054 + log_err("Failed to create eventfd"); 1055 + goto fail_eventfd; 1056 + } 1057 + 1058 + r = pthread_create(&ctx->thread, NULL, traffic_monitor_thread, ctx); 1059 + if (r) { 1060 + log_err("Failed to create thread"); 1061 + goto fail; 1062 + } 1063 + 1064 + close_netns(nstoken); 1065 + 1066 + return ctx; 1067 + 1068 + fail: 1069 + close(ctx->wake_fd); 1070 + 1071 + fail_eventfd: 1072 + pcap_dump_close(ctx->dumper); 1073 + unlink(ctx->pkt_fname); 1074 + 1075 + fail_dumper: 1076 + pcap_close(ctx->pcap); 1077 + 1078 + fail_pcap: 1079 + free(ctx); 1080 + 1081 + fail_ctx: 1082 + close_netns(nstoken); 1083 + 1084 + return NULL; 1085 + } 1086 + 1087 + static void traffic_monitor_release(struct tmonitor_ctx *ctx) 1088 + { 1089 + pcap_close(ctx->pcap); 1090 + pcap_dump_close(ctx->dumper); 1091 + 1092 + close(ctx->wake_fd); 1093 + 1094 + free(ctx); 1095 + } 1096 + 1097 + /* Stop the network traffic monitor. 
1098 + * 1099 + * ctx: the context returned by traffic_monitor_start() 1100 + */ 1101 + void traffic_monitor_stop(struct tmonitor_ctx *ctx) 1102 + { 1103 + __u64 w = 1; 1104 + 1105 + if (!ctx) 1106 + return; 1107 + 1108 + /* Stop the monitor thread */ 1109 + ctx->done = true; 1110 + /* Wake up the background thread. */ 1111 + write(ctx->wake_fd, &w, sizeof(w)); 1112 + pthread_join(ctx->thread, NULL); 1113 + 1114 + printf("Packet file: %s\n", strrchr(ctx->pkt_fname, '/') + 1); 1115 + 1116 + traffic_monitor_release(ctx); 1117 + } 1118 + #endif /* TRAFFIC_MONITOR */
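The monitor thread in this file reads its fields directly out of the 20-byte LINKTYPE_LINUX_SLL2 pseudo-header before passing the payload to the IPv4/IPv6 printers. Below is a minimal standalone sketch of that decoding, assuming the offsets used in the patch (protocol at 0, interface index at 4, ARPHRD type at 8, packet type at 10). `struct sll2_info` and `parse_sll2()` are hypothetical names that do not exist in the selftests, and the length check is an addition the patch does not make.

```c
#include <arpa/inet.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SLL2_HDR_LEN 20	/* fixed size of the SLL2 pseudo-header */

/* Hypothetical container for the fields the monitor thread looks at. */
struct sll2_info {
	uint16_t proto;			/* EtherType, host byte order */
	uint32_t ifindex;		/* interface index, host byte order */
	uint16_t arphrd_type;		/* ARPHRD_* value, host byte order */
	uint8_t pkt_type;		/* 0 = to us, 1 = broadcast, ..., 4 = sent by us */
	const unsigned char *payload;	/* first byte after the header */
};

/* Decode the SLL2 header at the start of pkt. Returns 0 on success and
 * -1 when the buffer is too short to hold a full header.
 */
static int parse_sll2(const unsigned char *pkt, size_t len, struct sll2_info *out)
{
	uint16_t be16;
	uint32_t be32;

	if (len < SLL2_HDR_LEN)
		return -1;

	/* memcpy avoids unaligned loads, as the patch does. */
	memcpy(&be16, pkt, 2);
	out->proto = ntohs(be16);
	memcpy(&be32, pkt + 4, 4);
	out->ifindex = ntohl(be32);
	memcpy(&be16, pkt + 8, 2);
	out->arphrd_type = ntohs(be16);
	out->pkt_type = pkt[10];
	out->payload = pkt + SLL2_HDR_LEN;
	return 0;
}
```

The `pkt_type` value indexes the same five-entry table as `pkt_types[]` in the patch, so a value of 4 corresponds to "Out".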

+23 -2

tools/testing/selftests/bpf/network_helpers.h

···
 
 struct network_helper_opts {
 	int timeout_ms;
-	bool must_fail;
 	int proto;
 	/* +ve: Passed to listen() as-is.
 	 * 0: Default when the test does not set
···
 		const struct network_helper_opts *opts);
 int connect_to_addr(int type, const struct sockaddr_storage *addr, socklen_t len,
 		    const struct network_helper_opts *opts);
+int connect_to_addr_str(int family, int type, const char *addr_str, __u16 port,
+			const struct network_helper_opts *opts);
 int connect_to_fd(int server_fd, int timeout_ms);
-int connect_to_fd_opts(int server_fd, int type, const struct network_helper_opts *opts);
+int connect_to_fd_opts(int server_fd, const struct network_helper_opts *opts);
 int connect_fd_to_fd(int client_fd, int server_fd, int timeout_ms);
 int fastopen_connect(int server_fd, const char *data, unsigned int data_len,
 		     int timeout_ms);
···
 struct nstoken *open_netns(const char *name);
 void close_netns(struct nstoken *token);
 int send_recv_data(int lfd, int fd, uint32_t total_bytes);
+int make_netns(const char *name);
+int remove_netns(const char *name);
 
 static __u16 csum_fold(__u32 csum)
 {
···
 
 	return csum_fold((__u32)s);
 }
+
+struct tmonitor_ctx;
+
+#ifdef TRAFFIC_MONITOR
+struct tmonitor_ctx *traffic_monitor_start(const char *netns, const char *test_name,
+					   const char *subtest_name);
+void traffic_monitor_stop(struct tmonitor_ctx *ctx);
+#else
+static inline struct tmonitor_ctx *traffic_monitor_start(const char *netns, const char *test_name,
+							 const char *subtest_name)
+{
+	return NULL;
+}
+
+static inline void traffic_monitor_stop(struct tmonitor_ctx *ctx)
+{
+}
+#endif
 
 #endif
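The header change above declares the traffic monitor behind `TRAFFIC_MONITOR` and supplies `static inline` stubs otherwise, so callers need no `#ifdef` of their own. A minimal sketch of the same pattern with stand-in names follows. `WITH_MONITOR`, `monitor_start()`, `monitor_stop()` and `run_with_monitor()` are hypothetical and only illustrate the shape of the header.

```c
#include <stddef.h>

struct monitor_ctx;	/* opaque to callers, as tmonitor_ctx is in the header */

#ifdef WITH_MONITOR	/* hypothetical switch standing in for TRAFFIC_MONITOR */
struct monitor_ctx *monitor_start(const char *name);
void monitor_stop(struct monitor_ctx *ctx);
#else
/* Feature compiled out: start yields NULL and stop is a no-op. */
static inline struct monitor_ctx *monitor_start(const char *name)
{
	(void)name;
	return NULL;
}

static inline void monitor_stop(struct monitor_ctx *ctx)
{
	(void)ctx;
}
#endif

/* Caller code is identical in both configurations. Returns 1 when a
 * monitor was actually running for the duration of the call.
 */
static int run_with_monitor(const char *name)
{
	struct monitor_ctx *ctx = monitor_start(name);
	int monitored = ctx != NULL;

	/* ... the work being observed would go here ... */

	monitor_stop(ctx);
	return monitored;
}
```

Because the stop function accepts NULL in both configurations (the real `traffic_monitor_stop()` returns early on a NULL context), the caller never has to check whether the feature was built in.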

+5 -3

tools/testing/selftests/bpf/prog_tests/attach_probe.c

···
 	trigger_func3();
 
 	ASSERT_EQ(skel->bss->uprobe_byname3_sleepable_res, 9, "check_uprobe_byname3_sleepable_res");
-	ASSERT_EQ(skel->bss->uprobe_byname3_res, 10, "check_uprobe_byname3_res");
-	ASSERT_EQ(skel->bss->uretprobe_byname3_sleepable_res, 11, "check_uretprobe_byname3_sleepable_res");
-	ASSERT_EQ(skel->bss->uretprobe_byname3_res, 12, "check_uretprobe_byname3_res");
+	ASSERT_EQ(skel->bss->uprobe_byname3_str_sleepable_res, 10, "check_uprobe_byname3_str_sleepable_res");
+	ASSERT_EQ(skel->bss->uprobe_byname3_res, 11, "check_uprobe_byname3_res");
+	ASSERT_EQ(skel->bss->uretprobe_byname3_sleepable_res, 12, "check_uretprobe_byname3_sleepable_res");
+	ASSERT_EQ(skel->bss->uretprobe_byname3_str_sleepable_res, 13, "check_uretprobe_byname3_str_sleepable_res");
+	ASSERT_EQ(skel->bss->uretprobe_byname3_res, 14, "check_uretprobe_byname3_res");
 }
 
 void test_attach_probe(void)

+2 -2

tools/testing/selftests/bpf/prog_tests/bpf_iter.c

···
 	bpf_iter_bpf_sk_storage_helpers__destroy(skel);
 }
 
-static void test_bpf_sk_stoarge_map_iter_fd(void)
+static void test_bpf_sk_storage_map_iter_fd(void)
 {
 	struct bpf_iter_bpf_sk_storage_map *skel;
 
···
 	if (test__start_subtest("bpf_sk_storage_map"))
 		test_bpf_sk_storage_map();
 	if (test__start_subtest("bpf_sk_storage_map_iter_fd"))
-		test_bpf_sk_stoarge_map_iter_fd();
+		test_bpf_sk_storage_map_iter_fd();
 	if (test__start_subtest("bpf_sk_storage_delete"))
 		test_bpf_sk_storage_delete();
 	if (test__start_subtest("bpf_sk_storage_get"))

+1 -1

tools/testing/selftests/bpf/prog_tests/bpf_iter_setsockopt.c

···
 	struct sockaddr_in6 addr;
 	socklen_t addrlen = sizeof(addr);
 
-	if (!getsockname(fd, &addr, &addrlen))
+	if (!getsockname(fd, (struct sockaddr *)&addr, &addrlen))
 		return ntohs(addr.sin6_port);
 
 	return 0;
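The fix above adds the cast that `getsockname()` expects, since its second parameter is declared as `struct sockaddr *` rather than the family-specific structure. A self-contained sketch of the corrected lookup follows. The function name `local_port6()` is hypothetical, as the hunk does not show the name of the enclosing helper.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Return the local port of an AF_INET6 socket in host byte order,
 * or 0 if getsockname() fails (for example on an invalid descriptor).
 */
static unsigned short local_port6(int fd)
{
	struct sockaddr_in6 addr;
	socklen_t addrlen = sizeof(addr);

	/* getsockname() is declared to take struct sockaddr *, so the
	 * address-family-specific structure is cast explicitly.
	 */
	if (!getsockname(fd, (struct sockaddr *)&addr, &addrlen))
		return ntohs(addr.sin6_port);

	return 0;
}
```

Without the cast, compilers typically report an incompatible pointer type for the second argument, which is presumably what motivated the one-line change.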

+2 -2

tools/testing/selftests/bpf/prog_tests/bpf_tcp_ca.c

···
 		goto err;
 
 	/* connect to server */
-	*cli_fd = connect_to_fd_opts(*srv_fd, SOCK_STREAM, cli_opts);
+	*cli_fd = connect_to_fd_opts(*srv_fd, cli_opts);
 	if (!ASSERT_NEQ(*cli_fd, -1, "connect_to_fd_opts"))
 		goto err;
 
···
 	dctcp_skel = bpf_dctcp__open();
 	if (!ASSERT_OK_PTR(dctcp_skel, "dctcp_skel"))
 		return;
-	strcpy(dctcp_skel->rodata->fallback, "cubic");
+	strcpy(dctcp_skel->rodata->fallback_cc, "cubic");
 	if (!ASSERT_OK(bpf_dctcp__load(dctcp_skel), "bpf_dctcp__load"))
 		goto done;
 

+3 -3

tools/testing/selftests/bpf/prog_tests/btf.c

···
 static struct btf_raw_test pprint_test_template[] = {
 {
 	.raw_types = {
-		/* unsighed char */	/* [1] */
+		/* unsigned char */	/* [1] */
 		BTF_TYPE_INT_ENC(NAME_TBD, 0, 0, 8, 1),
 		/* unsigned short */	/* [2] */
 		BTF_TYPE_INT_ENC(NAME_TBD, 0, 0, 16, 2),
···
 	 * be encoded with kind_flag set.
 	 */
 	.raw_types = {
-		/* unsighed char */	/* [1] */
+		/* unsigned char */	/* [1] */
 		BTF_TYPE_INT_ENC(NAME_TBD, 0, 0, 8, 1),
 		/* unsigned short */	/* [2] */
 		BTF_TYPE_INT_ENC(NAME_TBD, 0, 0, 16, 2),
···
 	 * will have both int and enum types.
 	 */
 	.raw_types = {
-		/* unsighed char */	/* [1] */
+		/* unsigned char */	/* [1] */
 		BTF_TYPE_INT_ENC(NAME_TBD, 0, 0, 8, 1),
 		/* unsigned short */	/* [2] */
 		BTF_TYPE_INT_ENC(NAME_TBD, 0, 0, 16, 2),

+68

tools/testing/selftests/bpf/prog_tests/btf_distill.c

··· 535 535 btf__free(vmlinux_btf); 536 536 } 537 537 538 + /* Split and new base BTFs should inherit endianness from source BTF. */ 539 + static void test_distilled_endianness(void) 540 + { 541 + struct btf *base = NULL, *split = NULL, *new_base = NULL, *new_split = NULL; 542 + struct btf *new_base1 = NULL, *new_split1 = NULL; 543 + enum btf_endianness inverse_endianness; 544 + const void *raw_data; 545 + __u32 size; 546 + 547 + base = btf__new_empty(); 548 + if (!ASSERT_OK_PTR(base, "empty_main_btf")) 549 + return; 550 + inverse_endianness = btf__endianness(base) == BTF_LITTLE_ENDIAN ? BTF_BIG_ENDIAN 551 + : BTF_LITTLE_ENDIAN; 552 + btf__set_endianness(base, inverse_endianness); 553 + btf__add_int(base, "int", 4, BTF_INT_SIGNED); /* [1] int */ 554 + VALIDATE_RAW_BTF( 555 + base, 556 + "[1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED"); 557 + split = btf__new_empty_split(base); 558 + if (!ASSERT_OK_PTR(split, "empty_split_btf")) 559 + goto cleanup; 560 + btf__add_ptr(split, 1); 561 + VALIDATE_RAW_BTF( 562 + split, 563 + "[1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED", 564 + "[2] PTR '(anon)' type_id=1"); 565 + if (!ASSERT_EQ(0, btf__distill_base(split, &new_base, &new_split), 566 + "distilled_base") || 567 + !ASSERT_OK_PTR(new_base, "distilled_base") || 568 + !ASSERT_OK_PTR(new_split, "distilled_split") || 569 + !ASSERT_EQ(2, btf__type_cnt(new_base), "distilled_base_type_cnt")) 570 + goto cleanup; 571 + VALIDATE_RAW_BTF( 572 + new_split, 573 + "[1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED", 574 + "[2] PTR '(anon)' type_id=1"); 575 + 576 + raw_data = btf__raw_data(new_base, &size); 577 + if (!ASSERT_OK_PTR(raw_data, "btf__raw_data #1")) 578 + goto cleanup; 579 + new_base1 = btf__new(raw_data, size); 580 + if (!ASSERT_OK_PTR(new_base1, "new_base1 = btf__new()")) 581 + goto cleanup; 582 + raw_data = btf__raw_data(new_split, &size); 583 + if (!ASSERT_OK_PTR(raw_data, "btf__raw_data #2")) 584 + goto cleanup; 585 + 
new_split1 = btf__new_split(raw_data, size, new_base1); 586 + if (!ASSERT_OK_PTR(new_split1, "new_split1 = btf__new()")) 587 + goto cleanup; 588 + 589 + ASSERT_EQ(btf__endianness(new_base1), inverse_endianness, "new_base1 endianness"); 590 + ASSERT_EQ(btf__endianness(new_split1), inverse_endianness, "new_split1 endianness"); 591 + VALIDATE_RAW_BTF( 592 + new_split1, 593 + "[1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED", 594 + "[2] PTR '(anon)' type_id=1"); 595 + cleanup: 596 + btf__free(new_split1); 597 + btf__free(new_base1); 598 + btf__free(new_split); 599 + btf__free(new_base); 600 + btf__free(split); 601 + btf__free(base); 602 + } 603 + 538 604 void test_btf_distill(void) 539 605 { 540 606 if (test__start_subtest("distilled_base")) ··· 615 549 test_distilled_base_multi_err2(); 616 550 if (test__start_subtest("distilled_base_vmlinux")) 617 551 test_distilled_base_vmlinux(); 552 + if (test__start_subtest("distilled_endianness")) 553 + test_distilled_endianness(); 618 554 }

+2 -2

tools/testing/selftests/bpf/prog_tests/btf_dump.c

···
 	TEST_BTF_DUMP_VAR(btf, d, NULL, str, "cpu_number", int, BTF_F_COMPACT,
 			  "int cpu_number = (int)100", 100);
 #endif
-	TEST_BTF_DUMP_VAR(btf, d, NULL, str, "cpu_profile_flip", int, BTF_F_COMPACT,
-			  "static int cpu_profile_flip = (int)2", 2);
+	TEST_BTF_DUMP_VAR(btf, d, NULL, str, "bpf_cgrp_storage_busy", int, BTF_F_COMPACT,
+			  "static int bpf_cgrp_storage_busy = (int)2", 2);
 }
 
 static void test_btf_datasec(struct btf *btf, struct btf_dump *d, char *str,

+118

tools/testing/selftests/bpf/prog_tests/build_id.c

··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + /* Copyright (c) 2024 Meta Platforms, Inc. and affiliates. */ 3 + #include <test_progs.h> 4 + 5 + #include "test_build_id.skel.h" 6 + 7 + static char build_id[BPF_BUILD_ID_SIZE]; 8 + static int build_id_sz; 9 + 10 + static void print_stack(struct bpf_stack_build_id *stack, int frame_cnt) 11 + { 12 + int i, j; 13 + 14 + for (i = 0; i < frame_cnt; i++) { 15 + printf("FRAME #%02d: ", i); 16 + switch (stack[i].status) { 17 + case BPF_STACK_BUILD_ID_EMPTY: 18 + printf("<EMPTY>\n"); 19 + break; 20 + case BPF_STACK_BUILD_ID_VALID: 21 + printf("BUILD ID = "); 22 + for (j = 0; j < BPF_BUILD_ID_SIZE; j++) 23 + printf("%02hhx", (unsigned)stack[i].build_id[j]); 24 + printf(" OFFSET = %llx", (unsigned long long)stack[i].offset); 25 + break; 26 + case BPF_STACK_BUILD_ID_IP: 27 + printf("IP = %llx", (unsigned long long)stack[i].ip); 28 + break; 29 + default: 30 + printf("UNEXPECTED STATUS %d ", stack[i].status); 31 + break; 32 + } 33 + printf("\n"); 34 + } 35 + } 36 + 37 + static void subtest_nofault(bool build_id_resident) 38 + { 39 + struct test_build_id *skel; 40 + struct bpf_stack_build_id *stack; 41 + int frame_cnt; 42 + 43 + skel = test_build_id__open_and_load(); 44 + if (!ASSERT_OK_PTR(skel, "skel_open")) 45 + return; 46 + 47 + skel->links.uprobe_nofault = bpf_program__attach(skel->progs.uprobe_nofault); 48 + if (!ASSERT_OK_PTR(skel->links.uprobe_nofault, "link")) 49 + goto cleanup; 50 + 51 + if (build_id_resident) 52 + ASSERT_OK(system("./uprobe_multi uprobe-paged-in"), "trigger_uprobe"); 53 + else 54 + ASSERT_OK(system("./uprobe_multi uprobe-paged-out"), "trigger_uprobe"); 55 + 56 + if (!ASSERT_GT(skel->bss->res_nofault, 0, "res")) 57 + goto cleanup; 58 + 59 + stack = skel->bss->stack_nofault; 60 + frame_cnt = skel->bss->res_nofault / sizeof(struct bpf_stack_build_id); 61 + if (env.verbosity >= VERBOSE_NORMAL) 62 + print_stack(stack, frame_cnt); 63 + 64 + if (build_id_resident) { 65 + ASSERT_EQ(stack[0].status, 
BPF_STACK_BUILD_ID_VALID, "build_id_status"); 66 + ASSERT_EQ(memcmp(stack[0].build_id, build_id, build_id_sz), 0, "build_id_match"); 67 + } else { 68 + ASSERT_EQ(stack[0].status, BPF_STACK_BUILD_ID_IP, "build_id_status"); 69 + } 70 + 71 + cleanup: 72 + test_build_id__destroy(skel); 73 + } 74 + 75 + static void subtest_sleepable(void) 76 + { 77 + struct test_build_id *skel; 78 + struct bpf_stack_build_id *stack; 79 + int frame_cnt; 80 + 81 + skel = test_build_id__open_and_load(); 82 + if (!ASSERT_OK_PTR(skel, "skel_open")) 83 + return; 84 + 85 + skel->links.uprobe_sleepable = bpf_program__attach(skel->progs.uprobe_sleepable); 86 + if (!ASSERT_OK_PTR(skel->links.uprobe_sleepable, "link")) 87 + goto cleanup; 88 + 89 + /* force build ID to not be paged in */ 90 + ASSERT_OK(system("./uprobe_multi uprobe-paged-out"), "trigger_uprobe"); 91 + 92 + if (!ASSERT_GT(skel->bss->res_sleepable, 0, "res")) 93 + goto cleanup; 94 + 95 + stack = skel->bss->stack_sleepable; 96 + frame_cnt = skel->bss->res_sleepable / sizeof(struct bpf_stack_build_id); 97 + if (env.verbosity >= VERBOSE_NORMAL) 98 + print_stack(stack, frame_cnt); 99 + 100 + ASSERT_EQ(stack[0].status, BPF_STACK_BUILD_ID_VALID, "build_id_status"); 101 + ASSERT_EQ(memcmp(stack[0].build_id, build_id, build_id_sz), 0, "build_id_match"); 102 + 103 + cleanup: 104 + test_build_id__destroy(skel); 105 + } 106 + 107 + void serial_test_build_id(void) 108 + { 109 + build_id_sz = read_build_id("uprobe_multi", build_id, sizeof(build_id)); 110 + ASSERT_EQ(build_id_sz, BPF_BUILD_ID_SIZE, "parse_build_id"); 111 + 112 + if (test__start_subtest("nofault-paged-out")) 113 + subtest_nofault(false /* not resident */); 114 + if (test__start_subtest("nofault-paged-in")) 115 + subtest_nofault(true /* resident */); 116 + if (test__start_subtest("sleepable")) 117 + subtest_sleepable(); 118 + }

+1 -1

tools/testing/selftests/bpf/prog_tests/cg_storage_multi.c

···
 	/* Attach to parent and child cgroup, trigger packet from child.
 	 * Assert that there is six additional runs, parent cgroup egresses and
 	 * ingress, child cgroup egresses and ingress.
-	 * Assert that egree and ingress storages are separate.
+	 * Assert that egress and ingress storages are separate.
 	 */
 	child_egress1_link = bpf_program__attach_cgroup(obj->progs.egress1,
 							child_cgroup_fd);

+141

tools/testing/selftests/bpf/prog_tests/cgroup_ancestor.c

··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + 3 + #include "test_progs.h" 4 + #include "network_helpers.h" 5 + #include "cgroup_helpers.h" 6 + #include "cgroup_ancestor.skel.h" 7 + 8 + #define CGROUP_PATH "/skb_cgroup_test" 9 + #define TEST_NS "cgroup_ancestor_ns" 10 + #define NUM_CGROUP_LEVELS 4 11 + #define WAIT_AUTO_IP_MAX_ATTEMPT 10 12 + #define DST_ADDR "::1" 13 + #define DST_PORT 1234 14 + #define MAX_ASSERT_NAME 32 15 + 16 + struct test_data { 17 + struct cgroup_ancestor *skel; 18 + struct bpf_tc_hook qdisc; 19 + struct bpf_tc_opts tc_attach; 20 + struct nstoken *ns; 21 + }; 22 + 23 + static int send_datagram(void) 24 + { 25 + unsigned char buf[] = "some random test data"; 26 + struct sockaddr_in6 addr = { .sin6_family = AF_INET6, 27 + .sin6_port = htons(DST_PORT), }; 28 + int sock, n; 29 + 30 + if (!ASSERT_EQ(inet_pton(AF_INET6, DST_ADDR, &addr.sin6_addr), 1, 31 + "inet_pton")) 32 + return -1; 33 + 34 + sock = socket(AF_INET6, SOCK_DGRAM, 0); 35 + if (!ASSERT_OK_FD(sock, "create socket")) 36 + return sock; 37 + 38 + if (!ASSERT_OK(connect(sock, &addr, sizeof(addr)), "connect")) { 39 + close(sock); 40 + return -1; 41 + } 42 + 43 + n = sendto(sock, buf, sizeof(buf), 0, (const struct sockaddr *)&addr, 44 + sizeof(addr)); 45 + close(sock); 46 + return ASSERT_EQ(n, sizeof(buf), "send data") ? 
0 : -1; 47 + } 48 + 49 + static int setup_network(struct test_data *t) 50 + { 51 + SYS(fail, "ip netns add %s", TEST_NS); 52 + t->ns = open_netns(TEST_NS); 53 + if (!ASSERT_OK_PTR(t->ns, "open netns")) 54 + goto cleanup_ns; 55 + 56 + SYS(close_ns, "ip link set lo up"); 57 + 58 + memset(&t->qdisc, 0, sizeof(t->qdisc)); 59 + t->qdisc.sz = sizeof(t->qdisc); 60 + t->qdisc.attach_point = BPF_TC_EGRESS; 61 + t->qdisc.ifindex = if_nametoindex("lo"); 62 + if (!ASSERT_NEQ(t->qdisc.ifindex, 0, "if_nametoindex")) 63 + goto close_ns; 64 + if (!ASSERT_OK(bpf_tc_hook_create(&t->qdisc), "qdisc add")) 65 + goto close_ns; 66 + 67 + memset(&t->tc_attach, 0, sizeof(t->tc_attach)); 68 + t->tc_attach.sz = sizeof(t->tc_attach); 69 + t->tc_attach.prog_fd = bpf_program__fd(t->skel->progs.log_cgroup_id); 70 + if (!ASSERT_OK(bpf_tc_attach(&t->qdisc, &t->tc_attach), "filter add")) 71 + goto cleanup_qdisc; 72 + 73 + return 0; 74 + 75 + cleanup_qdisc: 76 + bpf_tc_hook_destroy(&t->qdisc); 77 + close_ns: 78 + close_netns(t->ns); 79 + cleanup_ns: 80 + SYS_NOFAIL("ip netns del %s", TEST_NS); 81 + fail: 82 + return 1; 83 + } 84 + 85 + static void cleanup_network(struct test_data *t) 86 + { 87 + bpf_tc_detach(&t->qdisc, &t->tc_attach); 88 + bpf_tc_hook_destroy(&t->qdisc); 89 + close_netns(t->ns); 90 + SYS_NOFAIL("ip netns del %s", TEST_NS); 91 + } 92 + 93 + static void check_ancestors_ids(struct test_data *t) 94 + { 95 + __u64 expected_ids[NUM_CGROUP_LEVELS]; 96 + char assert_name[MAX_ASSERT_NAME]; 97 + __u32 level; 98 + 99 + expected_ids[0] = get_cgroup_id("/.."); /* root cgroup */ 100 + expected_ids[1] = get_cgroup_id(""); 101 + expected_ids[2] = get_cgroup_id(CGROUP_PATH); 102 + expected_ids[3] = 0; /* non-existent cgroup */ 103 + 104 + for (level = 0; level < NUM_CGROUP_LEVELS; level++) { 105 + snprintf(assert_name, MAX_ASSERT_NAME, 106 + "ancestor id at level %d", level); 107 + ASSERT_EQ(t->skel->bss->cgroup_ids[level], expected_ids[level], 108 + assert_name); 109 + } 110 + } 111 + 112 + void 
test_cgroup_ancestor(void) 113 + { 114 + struct test_data t; 115 + int cgroup_fd; 116 + 117 + t.skel = cgroup_ancestor__open_and_load(); 118 + if (!ASSERT_OK_PTR(t.skel, "open and load")) 119 + return; 120 + 121 + t.skel->bss->dport = htons(DST_PORT); 122 + cgroup_fd = cgroup_setup_and_join(CGROUP_PATH); 123 + if (cgroup_fd < 0) 124 + goto cleanup_progs; 125 + 126 + if (setup_network(&t)) 127 + goto cleanup_cgroups; 128 + 129 + if (send_datagram()) 130 + goto cleanup_network; 131 + 132 + check_ancestors_ids(&t); 133 + 134 + cleanup_network: 135 + cleanup_network(&t); 136 + cleanup_cgroups: 137 + close(cgroup_fd); 138 + cleanup_cgroup_environment(); 139 + cleanup_progs: 140 + cgroup_ancestor__destroy(t.skel); 141 + }

+125

tools/testing/selftests/bpf/prog_tests/cgroup_dev.c

··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + 3 + #include <sys/stat.h> 4 + #include <sys/sysmacros.h> 5 + #include <errno.h> 6 + #include "test_progs.h" 7 + #include "cgroup_helpers.h" 8 + #include "dev_cgroup.skel.h" 9 + 10 + #define TEST_CGROUP "/test-bpf-based-device-cgroup/" 11 + #define TEST_BUFFER_SIZE 64 12 + 13 + static void test_mknod(const char *path, mode_t mode, int dev_major, 14 + int dev_minor, int expected_ret, int expected_errno) 15 + { 16 + int ret; 17 + 18 + unlink(path); 19 + ret = mknod(path, mode, makedev(dev_major, dev_minor)); 20 + ASSERT_EQ(ret, expected_ret, "mknod"); 21 + if (expected_ret) 22 + ASSERT_EQ(errno, expected_errno, "mknod errno"); 23 + else 24 + unlink(path); 25 + } 26 + 27 + static void test_read(const char *path, char *buf, int buf_size, 28 + int expected_ret, int expected_errno) 29 + { 30 + int ret, fd; 31 + 32 + fd = open(path, O_RDONLY); 33 + 34 + /* A bare open on unauthorized device should fail */ 35 + if (expected_ret < 0) { 36 + ASSERT_EQ(fd, expected_ret, "open ret for read"); 37 + ASSERT_EQ(errno, expected_errno, "open errno for read"); 38 + if (fd >= 0) 39 + close(fd); 40 + return; 41 + } 42 + 43 + if (!ASSERT_OK_FD(fd, "open ret for read")) 44 + return; 45 + 46 + ret = read(fd, buf, buf_size); 47 + ASSERT_EQ(ret, expected_ret, "read"); 48 + 49 + close(fd); 50 + } 51 + 52 + static void test_write(const char *path, char *buf, int buf_size, 53 + int expected_ret, int expected_errno) 54 + { 55 + int ret, fd; 56 + 57 + fd = open(path, O_WRONLY); 58 + 59 + /* A bare open on unauthorized device should fail */ 60 + if (expected_ret < 0) { 61 + ASSERT_EQ(fd, expected_ret, "open ret for write"); 62 + ASSERT_EQ(errno, expected_errno, "open errno for write"); 63 + if (fd >= 0) 64 + close(fd); 65 + return; 66 + } 67 + 68 + if (!ASSERT_OK_FD(fd, "open ret for write")) 69 + return; 70 + 71 + ret = write(fd, buf, buf_size); 72 + ASSERT_EQ(ret, expected_ret, "write"); 73 + 74 + close(fd); 75 + } 76 + 77 + void 
test_cgroup_dev(void) 78 + { 79 + char buf[TEST_BUFFER_SIZE] = "some random test data"; 80 + struct dev_cgroup *skel; 81 + int cgroup_fd; 82 + 83 + cgroup_fd = cgroup_setup_and_join(TEST_CGROUP); 84 + if (!ASSERT_OK_FD(cgroup_fd, "cgroup switch")) 85 + return; 86 + 87 + skel = dev_cgroup__open_and_load(); 88 + if (!ASSERT_OK_PTR(skel, "load program")) 89 + goto cleanup_cgroup; 90 + 91 + skel->links.bpf_prog1 = 92 + bpf_program__attach_cgroup(skel->progs.bpf_prog1, cgroup_fd); 93 + if (!ASSERT_OK_PTR(skel->links.bpf_prog1, "attach_program")) 94 + goto cleanup_progs; 95 + 96 + if (test__start_subtest("allow-mknod")) 97 + test_mknod("/dev/test_dev_cgroup_null", S_IFCHR, 1, 3, 0, 0); 98 + 99 + if (test__start_subtest("allow-read")) 100 + test_read("/dev/urandom", buf, TEST_BUFFER_SIZE, 101 + TEST_BUFFER_SIZE, 0); 102 + 103 + if (test__start_subtest("allow-write")) 104 + test_write("/dev/null", buf, TEST_BUFFER_SIZE, 105 + TEST_BUFFER_SIZE, 0); 106 + 107 + if (test__start_subtest("deny-mknod")) 108 + test_mknod("/dev/test_dev_cgroup_zero", S_IFCHR, 1, 5, -1, 109 + EPERM); 110 + 111 + if (test__start_subtest("deny-read")) 112 + test_read("/dev/random", buf, TEST_BUFFER_SIZE, -1, EPERM); 113 + 114 + if (test__start_subtest("deny-write")) 115 + test_write("/dev/zero", buf, TEST_BUFFER_SIZE, -1, EPERM); 116 + 117 + if (test__start_subtest("deny-mknod-wrong-type")) 118 + test_mknod("/dev/test_dev_cgroup_block", S_IFBLK, 1, 3, -1, 119 + EPERM); 120 + 121 + cleanup_progs: 122 + dev_cgroup__destroy(skel); 123 + cleanup_cgroup: 124 + cleanup_cgroup_environment(); 125 + }
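The mknod subtests above identify devices purely by their (major, minor) pair. On Linux, (1, 3) is conventionally /dev/null and (1, 5) is /dev/zero. The sketch below shows how such a pair is packed with `makedev()` and recovered with `major()`/`minor()`, plus a check of an existing node through `stat()`. Both helper names are hypothetical and not part of the test.

```c
#include <sys/stat.h>
#include <sys/sysmacros.h>
#include <sys/types.h>

/* Pack and unpack a device number the way the mknod() calls above do.
 * Returns 1 when major()/minor() recover the inputs, 0 otherwise.
 */
static int dev_roundtrip_ok(unsigned int maj, unsigned int min)
{
	dev_t dev = makedev(maj, min);

	return major(dev) == maj && minor(dev) == min;
}

/* Returns 1 if path exists and is a character device with the given
 * numbers, 0 otherwise (including when stat() fails).
 */
static int is_char_dev(const char *path, unsigned int maj, unsigned int min)
{
	struct stat st;

	if (stat(path, &st))
		return 0;
	return S_ISCHR(st.st_mode) &&
	       major(st.st_rdev) == maj && minor(st.st_rdev) == min;
}
```

On a typical Linux system `is_char_dev("/dev/null", 1, 3)` would be expected to hold, which is why the allow-mknod subtest can create a node with those numbers under a different name.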

+46

tools/testing/selftests/bpf/prog_tests/cgroup_get_current_cgroup_id.c

···
+// SPDX-License-Identifier: GPL-2.0
+
+#include <sys/stat.h>
+#include <sys/sysmacros.h>
+#include "test_progs.h"
+#include "cgroup_helpers.h"
+#include "get_cgroup_id_kern.skel.h"
+
+#define TEST_CGROUP "/test-bpf-get-cgroup-id/"
+
+void test_cgroup_get_current_cgroup_id(void)
+{
+	struct get_cgroup_id_kern *skel;
+	const struct timespec req = {
+		.tv_sec = 0,
+		.tv_nsec = 1,
+	};
+	int cgroup_fd;
+	__u64 ucgid;
+
+	cgroup_fd = cgroup_setup_and_join(TEST_CGROUP);
+	if (!ASSERT_OK_FD(cgroup_fd, "cgroup switch"))
+		return;
+
+	skel = get_cgroup_id_kern__open_and_load();
+	if (!ASSERT_OK_PTR(skel, "load program"))
+		goto cleanup_cgroup;
+
+	if (!ASSERT_OK(get_cgroup_id_kern__attach(skel), "attach bpf program"))
+		goto cleanup_progs;
+
+	skel->bss->expected_pid = getpid();
+	/* trigger the syscall on which is attached the tested prog */
+	if (!ASSERT_OK(syscall(__NR_nanosleep, &req, NULL), "nanosleep"))
+		goto cleanup_progs;
+
+	ucgid = get_cgroup_id(TEST_CGROUP);
+
+	ASSERT_EQ(skel->bss->cg_id, ucgid, "compare cgroup ids");
+
+cleanup_progs:
+	get_cgroup_id_kern__destroy(skel);
+cleanup_cgroup:
+	close(cgroup_fd);
+	cleanup_cgroup_environment();
+}
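The test above triggers the attached program by issuing `nanosleep` through `syscall()` rather than through the libc wrapper. A minimal sketch of that trigger in isolation follows. `trigger_nanosleep()` is a hypothetical name, and the rationale in the comment is an assumption about why a raw syscall is preferable here, not something the patch states.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Enter the kernel through the nanosleep syscall with a 1 ns request.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int trigger_nanosleep(void)
{
	const struct timespec req = {
		.tv_sec = 0,
		.tv_nsec = 1,
	};

	/* Assumption: calling syscall() directly pins the syscall number,
	 * whereas a libc nanosleep() wrapper may be implemented on top of
	 * a different syscall and would then miss the attach point.
	 */
	return (int)syscall(__NR_nanosleep, &req, NULL);
}
```

This sketch assumes an architecture that defines `__NR_nanosleep`, as the selftest itself does.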

+96

tools/testing/selftests/bpf/prog_tests/cgroup_storage.c

··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + 3 + #include <test_progs.h> 4 + #include "cgroup_helpers.h" 5 + #include "network_helpers.h" 6 + #include "cgroup_storage.skel.h" 7 + 8 + #define TEST_CGROUP "/test-bpf-cgroup-storage-buf/" 9 + #define TEST_NS "cgroup_storage_ns" 10 + #define PING_CMD "ping localhost -c 1 -W 1 -q" 11 + 12 + static int setup_network(struct nstoken **token) 13 + { 14 + SYS(fail, "ip netns add %s", TEST_NS); 15 + *token = open_netns(TEST_NS); 16 + if (!ASSERT_OK_PTR(*token, "open netns")) 17 + goto cleanup_ns; 18 + SYS(cleanup_ns, "ip link set lo up"); 19 + 20 + return 0; 21 + 22 + cleanup_ns: 23 + SYS_NOFAIL("ip netns del %s", TEST_NS); 24 + fail: 25 + return -1; 26 + } 27 + 28 + static void cleanup_network(struct nstoken *ns) 29 + { 30 + close_netns(ns); 31 + SYS_NOFAIL("ip netns del %s", TEST_NS); 32 + } 33 + 34 + void test_cgroup_storage(void) 35 + { 36 + struct bpf_cgroup_storage_key key; 37 + struct cgroup_storage *skel; 38 + struct nstoken *ns = NULL; 39 + unsigned long long value; 40 + int cgroup_fd; 41 + int err; 42 + 43 + cgroup_fd = cgroup_setup_and_join(TEST_CGROUP); 44 + if (!ASSERT_OK_FD(cgroup_fd, "create cgroup")) 45 + return; 46 + 47 + if (!ASSERT_OK(setup_network(&ns), "setup network")) 48 + goto cleanup_cgroup; 49 + 50 + skel = cgroup_storage__open_and_load(); 51 + if (!ASSERT_OK_PTR(skel, "load program")) 52 + goto cleanup_network; 53 + 54 + skel->links.bpf_prog = 55 + bpf_program__attach_cgroup(skel->progs.bpf_prog, cgroup_fd); 56 + if (!ASSERT_OK_PTR(skel->links.bpf_prog, "attach program")) 57 + goto cleanup_progs; 58 + 59 + /* Check that one out of every two packets is dropped */ 60 + err = SYS_NOFAIL(PING_CMD); 61 + ASSERT_OK(err, "first ping"); 62 + err = SYS_NOFAIL(PING_CMD); 63 + ASSERT_NEQ(err, 0, "second ping"); 64 + err = SYS_NOFAIL(PING_CMD); 65 + ASSERT_OK(err, "third ping"); 66 + 67 + err = bpf_map__get_next_key(skel->maps.cgroup_storage, NULL, &key, 68 + sizeof(key)); 69 + if (!ASSERT_OK(err, "get first 
key")) 70 + goto cleanup_progs; 71 + err = bpf_map__lookup_elem(skel->maps.cgroup_storage, &key, sizeof(key), 72 + &value, sizeof(value), 0); 73 + if (!ASSERT_OK(err, "first packet count read")) 74 + goto cleanup_progs; 75 + 76 + /* Add one to the packet counter, check again packet filtering */ 77 + value++; 78 + err = bpf_map__update_elem(skel->maps.cgroup_storage, &key, sizeof(key), 79 + &value, sizeof(value), 0); 80 + if (!ASSERT_OK(err, "increment packet counter")) 81 + goto cleanup_progs; 82 + err = SYS_NOFAIL(PING_CMD); 83 + ASSERT_OK(err, "fourth ping"); 84 + err = SYS_NOFAIL(PING_CMD); 85 + ASSERT_NEQ(err, 0, "fifth ping"); 86 + err = SYS_NOFAIL(PING_CMD); 87 + ASSERT_OK(err, "sixth ping"); 88 + 89 + cleanup_progs: 90 + cgroup_storage__destroy(skel); 91 + cleanup_network: 92 + cleanup_network(ns); 93 + cleanup_cgroup: 94 + close(cgroup_fd); 95 + cleanup_cgroup_environment(); 96 + }

+9 -7

tools/testing/selftests/bpf/prog_tests/cgroup_v1v2.c

···
 static int run_test(int cgroup_fd, int server_fd, bool classid)
 {
-	struct network_helper_opts opts = {
-		.must_fail = true,
-	};
 	struct connect4_dropper *skel;
 	int fd, err = 0;

···
 		goto out;
 	}

-	fd = connect_to_fd_opts(server_fd, SOCK_STREAM, &opts);
-	if (fd < 0)
+	errno = 0;
+	fd = connect_to_fd_opts(server_fd, NULL);
+	if (fd >= 0) {
+		log_err("Unexpected success to connect to server");
 		err = -1;
-	else
 		close(fd);
+	} else if (errno != EPERM) {
+		log_err("Unexpected errno from connect to server");
+		err = -1;
+	}
 out:
 	connect4_dropper__destroy(skel);
 	return err;
···
 	server_fd = start_server(AF_INET, SOCK_STREAM, NULL, port, 0);
 	if (!ASSERT_GE(server_fd, 0, "server_fd"))
 		return;
-	client_fd = connect_to_fd_opts(server_fd, SOCK_STREAM, &opts);
+	client_fd = connect_to_fd_opts(server_fd, &opts);
 	if (!ASSERT_GE(client_fd, 0, "client_fd")) {
 		close(server_fd);
 		return;
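
The rewritten check in `run_test()` no longer relies on a `must_fail` option. It treats a successful connect as a failure and additionally requires that the rejection surface as `EPERM`. A minimal sketch of that decision logic, assuming the caller has already captured the returned fd and `errno`; `classify_denied_connect()` is a hypothetical helper, not part of the selftests:

```c
#include <errno.h>

/* Hypothetical helper mirroring the check in run_test(): the connection
 * is expected to be rejected, and the rejection must surface as EPERM.
 * Returns 0 when the outcome is the expected one, -1 otherwise.
 */
static int classify_denied_connect(int fd, int saved_errno)
{
	if (fd >= 0)
		return -1;	/* unexpected success */
	if (saved_errno != EPERM)
		return -1;	/* failed, but for the wrong reason */
	return 0;		/* denied with EPERM, as expected */
}
```

As in the hunk, a caller that gets a valid fd back would still need to close it; the sketch only classifies the outcome.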

+1

tools/testing/selftests/bpf/prog_tests/core_reloc.c

···
 // SPDX-License-Identifier: GPL-2.0
+#define _GNU_SOURCE
 #include <test_progs.h>
 #include "progs/core_reloc_types.h"
 #include "bpf_testmod/bpf_testmod.h"

+125

tools/testing/selftests/bpf/prog_tests/core_reloc_raw.c

··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + 3 + /* Test cases that can't load programs using libbpf and need direct 4 + * BPF syscall access 5 + */ 6 + 7 + #include <sys/syscall.h> 8 + #include <bpf/libbpf.h> 9 + #include <bpf/btf.h> 10 + 11 + #include "test_progs.h" 12 + #include "test_btf.h" 13 + #include "bpf/libbpf_internal.h" 14 + 15 + static char log[16 * 1024]; 16 + 17 + /* Check that verifier rejects BPF program containing relocation 18 + * pointing to non-existent BTF type. 19 + */ 20 + static void test_bad_local_id(void) 21 + { 22 + struct test_btf { 23 + struct btf_header hdr; 24 + __u32 types[15]; 25 + char strings[128]; 26 + } raw_btf = { 27 + .hdr = { 28 + .magic = BTF_MAGIC, 29 + .version = BTF_VERSION, 30 + .hdr_len = sizeof(struct btf_header), 31 + .type_off = 0, 32 + .type_len = sizeof(raw_btf.types), 33 + .str_off = offsetof(struct test_btf, strings) - 34 + offsetof(struct test_btf, types), 35 + .str_len = sizeof(raw_btf.strings), 36 + }, 37 + .types = { 38 + BTF_PTR_ENC(0), /* [1] void* */ 39 + BTF_TYPE_INT_ENC(1, BTF_INT_SIGNED, 0, 32, 4), /* [2] int */ 40 + BTF_FUNC_PROTO_ENC(2, 1), /* [3] int (*)(void*) */ 41 + BTF_FUNC_PROTO_ARG_ENC(8, 1), 42 + BTF_FUNC_ENC(8, 3) /* [4] FUNC 'foo' type_id=2 */ 43 + }, 44 + .strings = "\0int\0 0\0foo\0" 45 + }; 46 + __u32 log_level = 1 | 2 | 4; 47 + LIBBPF_OPTS(bpf_btf_load_opts, opts, 48 + .log_buf = log, 49 + .log_size = sizeof(log), 50 + .log_level = log_level, 51 + ); 52 + struct bpf_insn insns[] = { 53 + BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 0), 54 + BPF_EXIT_INSN(), 55 + }; 56 + struct bpf_func_info funcs[] = { 57 + { 58 + .insn_off = 0, 59 + .type_id = 4, 60 + } 61 + }; 62 + struct bpf_core_relo relos[] = { 63 + { 64 + .insn_off = 0, /* patch first instruction (r0 = 0) */ 65 + .type_id = 100500, /* !!! 
this type id does not exist */ 66 + .access_str_off = 6, /* offset of "0" */ 67 + .kind = BPF_CORE_TYPE_ID_LOCAL, 68 + } 69 + }; 70 + union bpf_attr attr; 71 + int saved_errno; 72 + int prog_fd = -1; 73 + int btf_fd = -1; 74 + 75 + btf_fd = bpf_btf_load(&raw_btf, sizeof(raw_btf), &opts); 76 + saved_errno = errno; 77 + if (btf_fd < 0 || env.verbosity > VERBOSE_NORMAL) { 78 + printf("-------- BTF load log start --------\n"); 79 + printf("%s", log); 80 + printf("-------- BTF load log end ----------\n"); 81 + } 82 + if (btf_fd < 0) { 83 + PRINT_FAIL("bpf_btf_load() failed, errno=%d\n", saved_errno); 84 + return; 85 + } 86 + 87 + log[0] = 0; 88 + memset(&attr, 0, sizeof(attr)); 89 + attr.prog_btf_fd = btf_fd; 90 + attr.prog_type = BPF_TRACE_RAW_TP; 91 + attr.license = (__u64)"GPL"; 92 + attr.insns = (__u64)&insns; 93 + attr.insn_cnt = sizeof(insns) / sizeof(*insns); 94 + attr.log_buf = (__u64)log; 95 + attr.log_size = sizeof(log); 96 + attr.log_level = log_level; 97 + attr.func_info = (__u64)funcs; 98 + attr.func_info_cnt = sizeof(funcs) / sizeof(*funcs); 99 + attr.func_info_rec_size = sizeof(*funcs); 100 + attr.core_relos = (__u64)relos; 101 + attr.core_relo_cnt = sizeof(relos) / sizeof(*relos); 102 + attr.core_relo_rec_size = sizeof(*relos); 103 + prog_fd = sys_bpf_prog_load(&attr, sizeof(attr), 1); 104 + saved_errno = errno; 105 + if (prog_fd < 0 || env.verbosity > VERBOSE_NORMAL) { 106 + printf("-------- program load log start --------\n"); 107 + printf("%s", log); 108 + printf("-------- program load log end ----------\n"); 109 + } 110 + if (prog_fd >= 0) { 111 + PRINT_FAIL("sys_bpf_prog_load() expected to fail\n"); 112 + goto out; 113 + } 114 + ASSERT_HAS_SUBSTR(log, "relo #0: bad type id 100500", "program load log"); 115 + 116 + out: 117 + close(prog_fd); 118 + close(btf_fd); 119 + } 120 + 121 + void test_core_reloc_raw(void) 122 + { 123 + if (test__start_subtest("bad_local_id")) 124 + test_bad_local_id(); 125 + }

-1

tools/testing/selftests/bpf/prog_tests/crypto_sanity.c

···
 #include <sys/types.h>
 #include <sys/socket.h>
 #include <net/if.h>
-#include <linux/in6.h>
 #include <linux/if_alg.h>

 #include "test_progs.h"

+10 -64

tools/testing/selftests/bpf/prog_tests/ctx_rewrite.c

··· 10 10 #include "bpf/btf.h" 11 11 #include "bpf_util.h" 12 12 #include "linux/filter.h" 13 - #include "disasm.h" 13 + #include "linux/kernel.h" 14 + #include "disasm_helpers.h" 14 15 15 16 #define MAX_PROG_TEXT_SZ (32 * 1024) 16 17 ··· 629 628 return false; 630 629 } 631 630 632 - static void print_insn(void *private_data, const char *fmt, ...) 633 - { 634 - va_list args; 635 - 636 - va_start(args, fmt); 637 - vfprintf((FILE *)private_data, fmt, args); 638 - va_end(args); 639 - } 640 - 641 - /* Disassemble instructions to a stream */ 642 - static void print_xlated(FILE *out, struct bpf_insn *insn, __u32 len) 643 - { 644 - const struct bpf_insn_cbs cbs = { 645 - .cb_print = print_insn, 646 - .cb_call = NULL, 647 - .cb_imm = NULL, 648 - .private_data = out, 649 - }; 650 - bool double_insn = false; 651 - int i; 652 - 653 - for (i = 0; i < len; i++) { 654 - if (double_insn) { 655 - double_insn = false; 656 - continue; 657 - } 658 - 659 - double_insn = insn[i].code == (BPF_LD | BPF_IMM | BPF_DW); 660 - print_bpf_insn(&cbs, insn + i, true); 661 - } 662 - } 663 - 664 - /* We share code with kernel BPF disassembler, it adds '(FF) ' prefix 665 - * for each instruction (FF stands for instruction `code` byte). 666 - * This function removes the prefix inplace for each line in `str`. 
667 - */ 668 - static void remove_insn_prefix(char *str, int size) 669 - { 670 - const int prefix_size = 5; 671 - 672 - int write_pos = 0, read_pos = prefix_size; 673 - int len = strlen(str); 674 - char c; 675 - 676 - size = min(size, len); 677 - 678 - while (read_pos < size) { 679 - c = str[read_pos++]; 680 - if (c == 0) 681 - break; 682 - str[write_pos++] = c; 683 - if (c == '\n') 684 - read_pos += prefix_size; 685 - } 686 - str[write_pos] = 0; 687 - } 688 - 689 631 struct prog_info { 690 632 char *prog_kind; 691 633 enum bpf_prog_type prog_type; ··· 643 699 char *reg_map[][2], 644 700 bool skip_first_insn) 645 701 { 646 - struct bpf_insn *buf = NULL; 702 + struct bpf_insn *buf = NULL, *insn, *insn_end; 647 703 int err = 0, prog_fd = 0; 648 704 FILE *prog_out = NULL; 705 + char insn_buf[64]; 649 706 char *text = NULL; 650 707 __u32 cnt = 0; 651 708 ··· 684 739 PRINT_FAIL("Can't open memory stream\n"); 685 740 goto out; 686 741 } 687 - if (skip_first_insn) 688 - print_xlated(prog_out, buf + 1, cnt - 1); 689 - else 690 - print_xlated(prog_out, buf, cnt); 742 + insn_end = buf + cnt; 743 + insn = buf + (skip_first_insn ? 1 : 0); 744 + while (insn < insn_end) { 745 + insn = disasm_insn(insn, insn_buf, sizeof(insn_buf)); 746 + fprintf(prog_out, "%s\n", insn_buf); 747 + } 691 748 fclose(prog_out); 692 - remove_insn_prefix(text, MAX_PROG_TEXT_SZ); 693 749 694 750 ASSERT_TRUE(match_pattern(btf, pattern, text, reg_map), 695 751 pinfo->prog_kind);
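
The new loop in `ctx_rewrite.c` advances by assigning the return value of `disasm_insn()` back to `insn`, so the callee decides how many instruction slots were consumed, which replaces the old `double_insn` bookkeeping. A minimal sketch of that iteration pattern, using hypothetical stand-ins (`struct fake_insn`, `fake_disasm()`) rather than the real helper from `disasm_helpers.h`, whose exact behaviour is assumed here:

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical stand-in for struct bpf_insn; only the opcode matters here. */
struct fake_insn {
	unsigned char code;
};

#define FAKE_LD_IMM_DW 0x18	/* BPF_LD | BPF_IMM | BPF_DW: occupies two slots */

/* Hypothetical stand-in for disasm_insn(): formats one instruction into buf
 * and returns a pointer to the next instruction to decode.
 */
static const struct fake_insn *fake_disasm(const struct fake_insn *insn,
					   char *buf, size_t buf_sz)
{
	snprintf(buf, buf_sz, "(%02x)", insn->code);
	return insn + (insn->code == FAKE_LD_IMM_DW ? 2 : 1);
}

/* Walk the program the way the new loop does; returns lines produced. */
static int count_lines(const struct fake_insn *insn, size_t cnt)
{
	const struct fake_insn *end = insn + cnt;
	char buf[64];
	int lines = 0;

	while (insn < end) {
		insn = fake_disasm(insn, buf, sizeof(buf));
		lines++;
	}
	return lines;
}
```

With a four-slot program containing one double-wide load, the walk yields three lines, which is the same result the removed `print_xlated()` obtained by skipping the second slot explicitly.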

-1

tools/testing/selftests/bpf/prog_tests/decap_sanity.c

···
 #include <sys/types.h>
 #include <sys/socket.h>
 #include <net/if.h>
-#include <linux/in6.h>

 #include "test_progs.h"
 #include "network_helpers.h"

+2 -1

tools/testing/selftests/bpf/prog_tests/fexit_stress.c

···
 // SPDX-License-Identifier: GPL-2.0
 /* Copyright (c) 2019 Facebook */
 #include <test_progs.h>
+#include "bpf_util.h"

 void serial_test_fexit_stress(void)
 {
···
 	for (i = 0; i < bpf_max_tramp_links; i++) {
 		fexit_fd[i] = bpf_prog_load(BPF_PROG_TYPE_TRACING, NULL, "GPL",
 					    trace_program,
-					    sizeof(trace_program) / sizeof(struct bpf_insn),
+					    ARRAY_SIZE(trace_program),
 					    &trace_opts);
 		if (!ASSERT_GE(fexit_fd[i], 0, "fexit load"))
 			goto out;
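
This hunk, like several later ones in the series, swaps a hand-written `sizeof(array) / sizeof(struct bpf_insn)` for `ARRAY_SIZE()`. A minimal sketch of why the two agree, assuming a simplified macro definition (the selftests take theirs from `bpf_util.h`, which may differ in detail) and a hypothetical stand-in `struct fake_insn`:

```c
#include <stddef.h>

/* Assumed, simplified definition; the real macro comes from bpf_util.h. */
#ifndef ARRAY_SIZE
#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))
#endif

/* Hypothetical stand-in for struct bpf_insn (8 bytes per instruction). */
struct fake_insn {
	unsigned char code;
	unsigned char regs;
	short off;
	int imm;
};

static const struct fake_insn trace_program[] = {
	{ 0xb7, 0, 0, 0 },	/* r0 = 0 */
	{ 0x95, 0, 0, 0 },	/* exit */
};

/* Element count computed the old way, naming the element type explicitly. */
static size_t insn_cnt_manual(void)
{
	return sizeof(trace_program) / sizeof(struct fake_insn);
}

/* Element count computed from the array itself. */
static size_t insn_cnt_macro(void)
{
	return ARRAY_SIZE(trace_program);
}
```

The macro form does not repeat the element type, so it stays correct if the array's type is later changed.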

+1 -1

tools/testing/selftests/bpf/prog_tests/flow_dissector.c

···
 // SPDX-License-Identifier: GPL-2.0
+#define _GNU_SOURCE
 #include <test_progs.h>
 #include <network_helpers.h>
-#include <error.h>
 #include <linux/if_tun.h>
 #include <sys/uio.h>

+8 -1

tools/testing/selftests/bpf/prog_tests/fs_kfuncs.c

···
 {
 	struct test_get_xattr *skel = NULL;
 	int fd = -1, err;
+	int v[32];

 	fd = open(testfile, O_CREAT | O_RDONLY, 0644);
 	if (!ASSERT_GE(fd, 0, "create_file"))
···
 	if (!ASSERT_GE(fd, 0, "open_file"))
 		goto out;

-	ASSERT_EQ(skel->bss->found_xattr, 1, "found_xattr");
+	ASSERT_EQ(skel->bss->found_xattr_from_file, 1, "found_xattr_from_file");
+
+	/* Trigger security_inode_getxattr */
+	err = getxattr(testfile, "user.kfuncs", v, sizeof(v));
+	ASSERT_EQ(err, -1, "getxattr_return");
+	ASSERT_EQ(errno, EINVAL, "getxattr_errno");
+	ASSERT_EQ(skel->bss->found_xattr_from_dentry, 1, "found_xattr_from_dentry");

 out:
 	close(fd);

+4 -1

tools/testing/selftests/bpf/prog_tests/iters.c

···
 #include "iters_state_safety.skel.h"
 #include "iters_looping.skel.h"
 #include "iters_num.skel.h"
+#include "iters_testmod.skel.h"
 #include "iters_testmod_seq.skel.h"
 #include "iters_task_vma.skel.h"
 #include "iters_task.skel.h"
···
 	RUN_TESTS(iters);
 	RUN_TESTS(iters_css_task);

-	if (env.has_testmod)
+	if (env.has_testmod) {
+		RUN_TESTS(iters_testmod);
 		RUN_TESTS(iters_testmod_seq);
+	}

 	if (test__start_subtest("num"))
 		subtest_num_iters();

+1

tools/testing/selftests/bpf/prog_tests/kfree_skb.c

···
 // SPDX-License-Identifier: GPL-2.0
+#define _GNU_SOURCE
 #include <test_progs.h>
 #include <network_helpers.h>
 #include "kfree_skb.skel.h"

+1

tools/testing/selftests/bpf/prog_tests/kfunc_call.c

···
 	TC_FAIL(kfunc_call_test_get_mem_fail_oob, 0, "min value is outside of the allowed memory range"),
 	TC_FAIL(kfunc_call_test_get_mem_fail_not_const, 0, "is not a const"),
 	TC_FAIL(kfunc_call_test_mem_acquire_fail, 0, "acquire kernel function does not return PTR_TO_BTF_ID"),
+	TC_FAIL(kfunc_call_test_pointer_arg_type_mismatch, 0, "arg#0 expected pointer to ctx, but got scalar"),

 	/* success cases */
 	TC_TEST(kfunc_call_test1, 12),

+5 -4

tools/testing/selftests/bpf/prog_tests/log_buf.c

···
 #include <bpf/btf.h>

 #include "test_log_buf.skel.h"
+#include "bpf_util.h"

 static size_t libbpf_log_pos;
 static char libbpf_log_buf[1024 * 1024];
···
 		BPF_MOV64_IMM(BPF_REG_0, 0),
 		BPF_EXIT_INSN(),
 	};
-	const size_t good_prog_insn_cnt = sizeof(good_prog_insns) / sizeof(struct bpf_insn);
+	const size_t good_prog_insn_cnt = ARRAY_SIZE(good_prog_insns);
 	const struct bpf_insn bad_prog_insns[] = {
 		BPF_EXIT_INSN(),
 	};
-	size_t bad_prog_insn_cnt = sizeof(bad_prog_insns) / sizeof(struct bpf_insn);
+	size_t bad_prog_insn_cnt = ARRAY_SIZE(bad_prog_insns);
 	LIBBPF_OPTS(bpf_prog_load_opts, opts);
 	const size_t log_buf_sz = 1024 * 1024;
 	char *log_buf;
···
 	opts.log_buf = log_buf;
 	opts.log_size = log_buf_sz;

-	/* with log_level == 0 log_buf shoud stay empty for good prog */
+	/* with log_level == 0 log_buf should stay empty for good prog */
 	log_buf[0] = '\0';
 	opts.log_level = 0;
 	fd = bpf_prog_load(BPF_PROG_TYPE_SOCKET_FILTER, "good_prog", "GPL",
···
 	opts.log_buf = log_buf;
 	opts.log_size = log_buf_sz;

-	/* with log_level == 0 log_buf shoud stay empty for good BTF */
+	/* with log_level == 0 log_buf should stay empty for good BTF */
 	log_buf[0] = '\0';
 	opts.log_level = 0;
 	fd = bpf_btf_load(raw_btf_data, raw_btf_size, &opts);

-1

tools/testing/selftests/bpf/prog_tests/lwt_redirect.c

···
 #include <linux/if_ether.h>
 #include <linux/if_packet.h>
 #include <linux/if_tun.h>
-#include <linux/icmp.h>
 #include <arpa/inet.h>
 #include <unistd.h>
 #include <errno.h>

+1

tools/testing/selftests/bpf/prog_tests/lwt_reroute.c

···
  * is not crashed, it is considered successful.
  */
 #define NETNS "ns_lwt_reroute"
+#include <netinet/in.h>
 #include "lwt_helpers.h"
 #include "network_helpers.h"
 #include <linux/net_tstamp.h>

+2 -1

tools/testing/selftests/bpf/prog_tests/module_fentry_shadow.c

···
 #include <bpf/btf.h>
 #include "bpf/libbpf_internal.h"
 #include "cgroup_helpers.h"
+#include "bpf_util.h"

 static const char *module_name = "bpf_testmod";
 static const char *symbol_name = "bpf_fentry_shadow_test";
···
 		load_opts.attach_btf_obj_fd = btf_fd[i];
 		prog_fd[i] = bpf_prog_load(BPF_PROG_TYPE_TRACING, NULL, "GPL",
 					   trace_program,
-					   sizeof(trace_program) / sizeof(struct bpf_insn),
+					   ARRAY_SIZE(trace_program),
 					   &load_opts);
 		if (!ASSERT_GE(prog_fd[i], 0, "bpf_prog_load"))
 			goto out;

+4

tools/testing/selftests/bpf/prog_tests/nested_trust.c

···
 #include <test_progs.h>
 #include "nested_trust_failure.skel.h"
 #include "nested_trust_success.skel.h"
+#include "nested_acquire.skel.h"

 void test_nested_trust(void)
 {
 	RUN_TESTS(nested_trust_success);
 	RUN_TESTS(nested_trust_failure);
+
+	if (env.has_testmod)
+		RUN_TESTS(nested_acquire);
 }

+1 -1

tools/testing/selftests/bpf/prog_tests/ns_current_pid_tgid.c

···
 #include <sched.h>
 #include <sys/wait.h>
 #include <sys/mount.h>
-#include <sys/fcntl.h>
+#include <fcntl.h>
 #include "network_helpers.h"

 #define STACK_SIZE (1024 * 1024)

+1

tools/testing/selftests/bpf/prog_tests/parse_tcp_hdr_opt.c

···
 // SPDX-License-Identifier: GPL-2.0

+#define _GNU_SOURCE
 #include <test_progs.h>
 #include <network_helpers.h>
 #include "test_parse_tcp_hdr_opt.skel.h"

+60

tools/testing/selftests/bpf/prog_tests/pro_epilogue.c

···
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright (c) 2024 Meta Platforms, Inc. and affiliates. */
+
+#include <test_progs.h>
+#include "pro_epilogue.skel.h"
+#include "epilogue_tailcall.skel.h"
+#include "pro_epilogue_goto_start.skel.h"
+#include "epilogue_exit.skel.h"
+
+struct st_ops_args {
+	__u64 a;
+};
+
+static void test_tailcall(void)
+{
+	LIBBPF_OPTS(bpf_test_run_opts, topts);
+	struct epilogue_tailcall *skel;
+	struct st_ops_args args;
+	int err, prog_fd;
+
+	skel = epilogue_tailcall__open_and_load();
+	if (!ASSERT_OK_PTR(skel, "epilogue_tailcall__open_and_load"))
+		return;
+
+	topts.ctx_in = &args;
+	topts.ctx_size_in = sizeof(args);
+
+	skel->links.epilogue_tailcall =
+		bpf_map__attach_struct_ops(skel->maps.epilogue_tailcall);
+	if (!ASSERT_OK_PTR(skel->links.epilogue_tailcall, "attach_struct_ops"))
+		goto done;
+
+	/* Both test_epilogue_tailcall and test_epilogue_subprog are
+	 * patched with epilogue. When syscall_epilogue_tailcall()
+	 * is run, test_epilogue_tailcall() is triggered.
+	 * It executes a tail call and control is transferred to
+	 * test_epilogue_subprog(). Only test_epilogue_subprog()
+	 * does args->a += 1, thus final args.a value of 10001
+	 * guarantees that only the epilogue of the
+	 * test_epilogue_subprog is executed.
+	 */
+	memset(&args, 0, sizeof(args));
+	prog_fd = bpf_program__fd(skel->progs.syscall_epilogue_tailcall);
+	err = bpf_prog_test_run_opts(prog_fd, &topts);
+	ASSERT_OK(err, "bpf_prog_test_run_opts");
+	ASSERT_EQ(args.a, 10001, "args.a");
+	ASSERT_EQ(topts.retval, 10001 * 2, "topts.retval");
+
+done:
+	epilogue_tailcall__destroy(skel);
+}
+
+void test_pro_epilogue(void)
+{
+	RUN_TESTS(pro_epilogue);
+	RUN_TESTS(pro_epilogue_goto_start);
+	RUN_TESTS(epilogue_exit);
+	if (test__start_subtest("tailcall"))
+		test_tailcall();
+}

+2 -1

tools/testing/selftests/bpf/prog_tests/raw_tp_writable_reject_nbd_invalid.c

···
 #include <test_progs.h>
 #include <linux/nbd.h>
+#include "bpf_util.h"

 void test_raw_tp_writable_reject_nbd_invalid(void)
 {
···
 	);

 	bpf_fd = bpf_prog_load(BPF_PROG_TYPE_RAW_TRACEPOINT_WRITABLE, NULL, "GPL v2",
-			       program, sizeof(program) / sizeof(struct bpf_insn),
+			       program, ARRAY_SIZE(program),
 			       &opts);
 	if (CHECK(bpf_fd < 0, "bpf_raw_tracepoint_writable load",
 		  "failed: %d errno %d\n", bpf_fd, errno))

+3 -2

tools/testing/selftests/bpf/prog_tests/raw_tp_writable_test_run.c

···
 #include <test_progs.h>
 #include <linux/nbd.h>
+#include "bpf_util.h"

 /* NOTE: conflict with other tests. */
 void serial_test_raw_tp_writable_test_run(void)
···
 	);

 	int bpf_fd = bpf_prog_load(BPF_PROG_TYPE_RAW_TRACEPOINT_WRITABLE, NULL, "GPL v2",
-				   trace_program, sizeof(trace_program) / sizeof(struct bpf_insn),
+				   trace_program, ARRAY_SIZE(trace_program),
 				   &trace_opts);
 	if (CHECK(bpf_fd < 0, "bpf_raw_tracepoint_writable loaded",
 		  "failed: %d errno %d\n", bpf_fd, errno))
···
 	);

 	int filter_fd = bpf_prog_load(BPF_PROG_TYPE_SOCKET_FILTER, NULL, "GPL v2",
-				      skb_program, sizeof(skb_program) / sizeof(struct bpf_insn),
+				      skb_program, ARRAY_SIZE(skb_program),
 				      &skb_opts);
 	if (CHECK(filter_fd < 0, "test_program_loaded", "failed: %d errno %d\n",
 		  filter_fd, errno))

+1

tools/testing/selftests/bpf/prog_tests/read_vsyscall.c

···
 	{ .name = "probe_read_user_str", .ret = -EFAULT },
 	{ .name = "copy_from_user", .ret = -EFAULT },
 	{ .name = "copy_from_user_task", .ret = -EFAULT },
+	{ .name = "copy_from_user_str", .ret = -EFAULT },
 };

 void test_read_vsyscall(void)

+24 -8

tools/testing/selftests/bpf/prog_tests/reg_bounds.c

··· 433 433 434 434 y_cast = range_cast(y_t, x_t, y); 435 435 436 + /* If we know that 437 + * - *x* is in the range of signed 32bit value, and 438 + * - *y_cast* range is 32-bit signed non-negative 439 + * then *x* range can be improved with *y_cast* such that *x* range 440 + * is 32-bit signed non-negative. Otherwise, if the new range for *x* 441 + * allows upper 32-bit * 0xffffffff then the eventual new range for 442 + * *x* will be out of signed 32-bit range which violates the origin 443 + * *x* range. 444 + */ 445 + if (x_t == S64 && y_t == S32 && y_cast.a <= S32_MAX && y_cast.b <= S32_MAX && 446 + (s64)x.a >= S32_MIN && (s64)x.b <= S32_MAX) 447 + return range_improve(x_t, x, y_cast); 448 + 436 449 /* the case when new range knowledge, *y*, is a 32-bit subregister 437 450 * range, while previous range knowledge, *x*, is a full register 438 451 * 64-bit range, needs special treatment to take into account upper 32 ··· 503 490 504 491 /* Can register with range [x.a, x.b] *EVER* satisfy 505 492 * OP (<, <=, >, >=, ==, !=) relation to 506 - * a regsiter with range [y.a, y.b] 493 + * a register with range [y.a, y.b] 507 494 * _in *num_t* domain_ 508 495 */ 509 496 static bool range_canbe_op(enum num_t t, struct range x, struct range y, enum op op) ··· 532 519 533 520 /* Does register with range [x.a, x.b] *ALWAYS* satisfy 534 521 * OP (<, <=, >, >=, ==, !=) relation to 535 - * a regsiter with range [y.a, y.b] 522 + * a register with range [y.a, y.b] 536 523 * _in *num_t* domain_ 537 524 */ 538 525 static bool range_always_op(enum num_t t, struct range x, struct range y, enum op op) ··· 543 530 544 531 /* Does register with range [x.a, x.b] *NEVER* satisfy 545 532 * OP (<, <=, >, >=, ==, !=) relation to 546 - * a regsiter with range [y.a, y.b] 533 + * a register with range [y.a, y.b] 547 534 * _in *num_t* domain_ 548 535 */ 549 536 static bool range_never_op(enum num_t t, struct range x, struct range y, enum op op) ··· 1018 1005 * - umin=%llu, if missing, assumed 0; 
1019 1006 * - umax=%llu, if missing, assumed U64_MAX; 1020 1007 * - smin=%lld, if missing, assumed S64_MIN; 1021 - * - smax=%lld, if missing, assummed S64_MAX; 1008 + * - smax=%lld, if missing, assumed S64_MAX; 1022 1009 * - umin32=%d, if missing, assumed 0; 1023 1010 * - umax32=%d, if missing, assumed U32_MAX; 1024 1011 * - smin32=%d, if missing, assumed S32_MIN; 1025 - * - smax32=%d, if missing, assummed S32_MAX; 1012 + * - smax32=%d, if missing, assumed S32_MAX; 1026 1013 * - var_off=(%#llx; %#llx), tnum part, we don't care about it. 1027 1014 * 1028 1015 * If some of the values are equal, they will be grouped (but min/max ··· 1487 1474 u64 elapsed_ns = get_time_ns() - ctx->start_ns; 1488 1475 double remain_ns = elapsed_ns / progress * (1 - progress); 1489 1476 1490 - fprintf(env.stderr, "PROGRESS (%s): %d/%d (%.2lf%%), " 1477 + fprintf(env.stderr_saved, "PROGRESS (%s): %d/%d (%.2lf%%), " 1491 1478 "elapsed %llu mins (%.2lf hrs), " 1492 1479 "ETA %.0lf mins (%.2lf hrs)\n", 1493 1480 ctx->progress_ctx, ··· 1884 1871 * envvar is not set, this test is skipped during test_progs testing. 1885 1872 * 1886 1873 * We split this up into smaller subsets based on initialization and 1887 - * conditiona numeric domains to get an easy parallelization with test_progs' 1874 + * conditional numeric domains to get an easy parallelization with test_progs' 1888 1875 * -j argument. 
1889 1876 */ 1890 1877 ··· 1938 1925 { 1939 1926 /* RAND_MAX is guaranteed to be at least 1<<15, but in practice it 1940 1927 * seems to be 1<<31, so we need to call it thrice to get full u64; 1941 - * we'll use rougly equal split: 22 + 21 + 21 bits 1928 + * we'll use roughly equal split: 22 + 21 + 21 bits 1942 1929 */ 1943 1930 return ((u64)random() << 42) | 1944 1931 (((u64)random() & RAND_21BIT_MASK) << 21) | ··· 2121 2108 {S32, U32, {(u32)S32_MIN, 0}, {0, 0}}, 2122 2109 {S32, U32, {(u32)S32_MIN, 0}, {(u32)S32_MIN, (u32)S32_MIN}}, 2123 2110 {S32, U32, {(u32)S32_MIN, S32_MAX}, {S32_MAX, S32_MAX}}, 2111 + {S64, U32, {0x0, 0x1f}, {0xffffffff80000000ULL, 0x000000007fffffffULL}}, 2112 + {S64, U32, {0x0, 0x1f}, {0xffffffffffff8000ULL, 0x0000000000007fffULL}}, 2113 + {S64, U32, {0x0, 0x1f}, {0xffffffffffffff80ULL, 0x000000000000007fULL}}, 2124 2114 }; 2125 2115 2126 2116 /* Go over crafted hard-coded cases. This is fast, so we do it as part of
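
The comment added to `reg_bounds.c` describes a special case: when the 64-bit signed range *x* already fits in signed 32 bits and the cast range *y_cast* is a non-negative signed 32-bit range, *x* can be narrowed with *y_cast* directly. A minimal sketch of that predicate, assuming a simplified `struct range` that stores raw 64-bit bounds; the helper name is hypothetical and the enum checks on `x_t`/`y_t` from the original are omitted:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified range: bounds stored as raw 64-bit values. */
struct range {
	uint64_t a, b;
};

/* True when x, viewed as a signed 64-bit range, fits in signed 32 bits
 * and y_cast is a non-negative signed 32-bit range. Assumption: this
 * mirrors the value checks of the condition added in the hunk above.
 */
static bool s64_x_fits_nonneg_s32_y(struct range x, struct range y_cast)
{
	return y_cast.a <= (uint64_t)INT32_MAX &&
	       y_cast.b <= (uint64_t)INT32_MAX &&
	       (int64_t)x.a >= INT32_MIN &&
	       (int64_t)x.b <= INT32_MAX;
}
```

The first new hard-coded case in the hunk, *x* = [0xffffffff80000000, 0x7fffffff] with *y* = [0x0, 0x1f], satisfies this predicate, since its bounds are exactly `S32_MIN` and `S32_MAX` when read as signed 64-bit values.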

+1 -1

tools/testing/selftests/bpf/prog_tests/resolve_btfids.c

···
 	btf = btf__parse_elf("btf_data.bpf.o", NULL);
 	if (CHECK(libbpf_get_error(btf), "resolve",
-		  "Failed to load BTF from btf_data.o\n"))
+		  "Failed to load BTF from btf_data.bpf.o\n"))
 		return -1;

 	nr = btf__type_cnt(btf);

+13 -24

tools/testing/selftests/bpf/prog_tests/select_reuseport.c

··· 37 37 static int reuseport_array = -1, outer_map = -1; 38 38 static enum bpf_map_type inner_map_type; 39 39 static int select_by_skb_data_prog; 40 - static int saved_tcp_syncookie = -1; 41 40 static struct bpf_object *obj; 42 - static int saved_tcp_fo = -1; 43 41 static __u32 index_zero; 44 42 static int epfd; 45 43 ··· 189 191 190 192 close(fd); 191 193 return 0; 192 - } 193 - 194 - static void restore_sysctls(void) 195 - { 196 - if (saved_tcp_fo != -1) 197 - write_int_sysctl(TCP_FO_SYSCTL, saved_tcp_fo); 198 - if (saved_tcp_syncookie != -1) 199 - write_int_sysctl(TCP_SYNCOOKIE_SYSCTL, saved_tcp_syncookie); 200 194 } 201 195 202 196 static int enable_fastopen(void) ··· 783 793 TEST_INIT(test_pass_on_err), 784 794 TEST_INIT(test_detach_bpf), 785 795 }; 796 + struct netns_obj *netns; 786 797 char s[MAX_TEST_NAME]; 787 798 const struct test *t; 788 799 ··· 799 808 if (!test__start_subtest(s)) 800 809 continue; 801 810 811 + netns = netns_new("select_reuseport", true); 812 + if (!ASSERT_OK_PTR(netns, "netns_new")) 813 + continue; 814 + 815 + if (CHECK_FAIL(enable_fastopen())) 816 + goto out; 817 + if (CHECK_FAIL(disable_syncookie())) 818 + goto out; 819 + 802 820 setup_per_test(sotype, family, inany, t->no_inner_map); 803 821 t->fn(sotype, family); 804 822 cleanup_per_test(t->no_inner_map); 823 + 824 + out: 825 + netns_free(netns); 805 826 } 806 827 } 807 828 ··· 853 850 854 851 void serial_test_select_reuseport(void) 855 852 { 856 - saved_tcp_fo = read_int_sysctl(TCP_FO_SYSCTL); 857 - if (saved_tcp_fo < 0) 858 - goto out; 859 - saved_tcp_syncookie = read_int_sysctl(TCP_SYNCOOKIE_SYSCTL); 860 - if (saved_tcp_syncookie < 0) 861 - goto out; 862 - 863 - if (enable_fastopen()) 864 - goto out; 865 - if (disable_syncookie()) 866 - goto out; 867 - 868 853 test_map_type(BPF_MAP_TYPE_REUSEPORT_SOCKARRAY); 869 854 test_map_type(BPF_MAP_TYPE_SOCKMAP); 870 855 test_map_type(BPF_MAP_TYPE_SOCKHASH); 871 - out: 872 - restore_sysctls(); 873 856 }

+30 -81

tools/testing/selftests/bpf/prog_tests/sk_lookup.c

··· 18 18 #include <arpa/inet.h> 19 19 #include <assert.h> 20 20 #include <errno.h> 21 - #include <error.h> 22 21 #include <fcntl.h> 23 22 #include <sched.h> 24 23 #include <stdio.h> ··· 45 46 #define INT_IP4_V6 "::ffff:127.0.0.2" 46 47 #define INT_IP6 "fd00::2" 47 48 #define INT_PORT 8008 48 - 49 - #define IO_TIMEOUT_SEC 3 50 49 51 50 enum server { 52 51 SERVER_A = 0, ··· 103 106 return -1; 104 107 105 108 return 0; 106 - } 107 - 108 - static socklen_t inetaddr_len(const struct sockaddr_storage *addr) 109 - { 110 - return (addr->ss_family == AF_INET ? sizeof(struct sockaddr_in) : 111 - addr->ss_family == AF_INET6 ? sizeof(struct sockaddr_in6) : 0); 112 - } 113 - 114 - static int make_socket(int sotype, const char *ip, int port, 115 - struct sockaddr_storage *addr) 116 - { 117 - struct timeval timeo = { .tv_sec = IO_TIMEOUT_SEC }; 118 - int err, family, fd; 119 - 120 - family = is_ipv6(ip) ? AF_INET6 : AF_INET; 121 - err = make_sockaddr(family, ip, port, addr, NULL); 122 - if (CHECK(err, "make_address", "failed\n")) 123 - return -1; 124 - 125 - fd = socket(addr->ss_family, sotype, 0); 126 - if (CHECK(fd < 0, "socket", "failed\n")) { 127 - log_err("failed to make socket"); 128 - return -1; 129 - } 130 - 131 - err = setsockopt(fd, SOL_SOCKET, SO_SNDTIMEO, &timeo, sizeof(timeo)); 132 - if (CHECK(err, "setsockopt(SO_SNDTIMEO)", "failed\n")) { 133 - log_err("failed to set SNDTIMEO"); 134 - close(fd); 135 - return -1; 136 - } 137 - 138 - err = setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &timeo, sizeof(timeo)); 139 - if (CHECK(err, "setsockopt(SO_RCVTIMEO)", "failed\n")) { 140 - log_err("failed to set RCVTIMEO"); 141 - close(fd); 142 - return -1; 143 - } 144 - 145 - return fd; 146 109 } 147 110 148 111 static int setsockopts(int fd, void *opts) ··· 178 221 log_err("failed to attach reuseport prog"); 179 222 goto fail; 180 223 } 181 - } 182 - 183 - return fd; 184 - fail: 185 - close(fd); 186 - return -1; 187 - } 188 - 189 - static int make_client(int sotype, const char *ip, 
int port) 190 - { 191 - struct sockaddr_storage addr = {0}; 192 - int err, fd; 193 - 194 - fd = make_socket(sotype, ip, port, &addr); 195 - if (fd < 0) 196 - return -1; 197 - 198 - err = connect(fd, (void *)&addr, inetaddr_len(&addr)); 199 - if (CHECK(err, "make_client", "connect")) { 200 - log_err("failed to connect client socket"); 201 - goto fail; 202 224 } 203 225 204 226 return fd; ··· 582 646 goto close; 583 647 } 584 648 585 - client_fd = make_client(t->sotype, t->connect_to.ip, t->connect_to.port); 586 - if (client_fd < 0) 649 + client_fd = connect_to_addr_str(is_ipv6(t->connect_to.ip) ? AF_INET6 : AF_INET, 650 + t->sotype, t->connect_to.ip, t->connect_to.port, NULL); 651 + if (!ASSERT_OK_FD(client_fd, "connect_to_addr_str")) 587 652 goto close; 588 653 589 654 if (t->sotype == SOCK_STREAM) ··· 799 862 800 863 static void drop_on_lookup(const struct test *t) 801 864 { 865 + int family = is_ipv6(t->connect_to.ip) ? AF_INET6 : AF_INET; 802 866 struct sockaddr_storage dst = {}; 803 867 int client_fd, server_fd, err; 804 868 struct bpf_link *lookup_link; 869 + socklen_t len; 805 870 ssize_t n; 806 871 807 872 lookup_link = attach_lookup_prog(t->lookup_prog); ··· 815 876 if (server_fd < 0) 816 877 goto detach; 817 878 818 - client_fd = make_socket(t->sotype, t->connect_to.ip, 819 - t->connect_to.port, &dst); 820 - if (client_fd < 0) 879 + client_fd = client_socket(family, t->sotype, NULL); 880 + if (!ASSERT_OK_FD(client_fd, "client_socket")) 821 881 goto close_srv; 822 882 823 - err = connect(client_fd, (void *)&dst, inetaddr_len(&dst)); 883 + err = make_sockaddr(family, t->connect_to.ip, t->connect_to.port, &dst, &len); 884 + if (!ASSERT_OK(err, "make_sockaddr")) 885 + goto close_all; 886 + err = connect(client_fd, (void *)&dst, len); 824 887 if (t->sotype == SOCK_DGRAM) { 825 888 err = send_byte(client_fd); 826 889 if (err) ··· 917 976 918 977 static void drop_on_reuseport(const struct test *t) 919 978 { 979 + int family = is_ipv6(t->connect_to.ip) ? 
AF_INET6 : AF_INET; 920 980 struct sockaddr_storage dst = { 0 }; 921 981 int client, server1, server2, err; 922 982 struct bpf_link *lookup_link; 983 + socklen_t len; 923 984 ssize_t n; 924 985 925 986 lookup_link = attach_lookup_prog(t->lookup_prog); ··· 943 1000 if (server2 < 0) 944 1001 goto close_srv1; 945 1002 946 - client = make_socket(t->sotype, t->connect_to.ip, 947 - t->connect_to.port, &dst); 948 - if (client < 0) 1003 + client = client_socket(family, t->sotype, NULL); 1004 + if (!ASSERT_OK_FD(client, "client_socket")) 949 1005 goto close_srv2; 950 1006 951 - err = connect(client, (void *)&dst, inetaddr_len(&dst)); 1007 + err = make_sockaddr(family, t->connect_to.ip, t->connect_to.port, &dst, &len); 1008 + if (!ASSERT_OK(err, "make_sockaddr")) 1009 + goto close_all; 1010 + err = connect(client, (void *)&dst, len); 952 1011 if (t->sotype == SOCK_DGRAM) { 953 1012 err = send_byte(client); 954 1013 if (err) ··· 1097 1152 if (server_fd < 0) 1098 1153 return; 1099 1154 1100 - connected_fd = make_client(sotype, EXT_IP4, EXT_PORT); 1101 - if (connected_fd < 0) 1155 + connected_fd = connect_to_addr_str(AF_INET, sotype, EXT_IP4, EXT_PORT, NULL); 1156 + if (!ASSERT_OK_FD(connected_fd, "connect_to_addr_str")) 1102 1157 goto out_close_server; 1103 1158 1104 1159 /* Put a connected socket in redirect map */ ··· 1111 1166 goto out_close_connected; 1112 1167 1113 1168 /* Try to redirect TCP SYN / UDP packet to a connected socket */ 1114 - client_fd = make_client(sotype, EXT_IP4, EXT_PORT); 1115 - if (client_fd < 0) 1169 + client_fd = connect_to_addr_str(AF_INET, sotype, EXT_IP4, EXT_PORT, NULL); 1170 + if (!ASSERT_OK_FD(client_fd, "connect_to_addr_str")) 1116 1171 goto out_unlink_prog; 1117 1172 if (sotype == SOCK_DGRAM) { 1118 1173 send_byte(client_fd); ··· 1164 1219 int map_fd, server_fd, client_fd; 1165 1220 struct bpf_link *link1, *link2; 1166 1221 int prog_idx, done, err; 1222 + socklen_t len; 1167 1223 1168 1224 map_fd = bpf_map__fd(t->run_map); 1169 1225 ··· 1194 
1248 if (err) 1195 1249 goto out_close_server; 1196 1250 1197 - client_fd = make_socket(SOCK_STREAM, EXT_IP4, EXT_PORT, &dst); 1198 - if (client_fd < 0) 1251 + client_fd = client_socket(AF_INET, SOCK_STREAM, NULL); 1252 + if (!ASSERT_OK_FD(client_fd, "client_socket")) 1199 1253 goto out_close_server; 1200 1254 1201 - err = connect(client_fd, (void *)&dst, inetaddr_len(&dst)); 1255 + err = make_sockaddr(AF_INET, EXT_IP4, EXT_PORT, &dst, &len); 1256 + if (!ASSERT_OK(err, "make_sockaddr")) 1257 + goto out_close_client; 1258 + err = connect(client_fd, (void *)&dst, len); 1202 1259 if (CHECK(err && !t->expect_errno, "connect", 1203 1260 "unexpected error %d\n", errno)) 1204 1261 goto out_close_client;

+1

tools/testing/selftests/bpf/prog_tests/sock_addr.c

···
2642 2642 break;
2643 2643 default:
2644 2644 ASSERT_TRUE(false, "Unknown sock addr test type");
2645 + err = -EINVAL;
2645 2646 break;
2646 2647 }
2647 2648

+8

tools/testing/selftests/bpf/prog_tests/sockmap_listen.c

···
1836 1836 int family)
1837 1837 {
1838 1838 const char *family_name, *map_name;
1839 + struct netns_obj *netns;
1839 1840 char s[MAX_TEST_NAME];
1840 1841
1841 1842 family_name = family_str(family);
···
1844 1843 snprintf(s, sizeof(s), "%s %s %s", map_name, family_name, __func__);
1845 1844 if (!test__start_subtest(s))
1846 1845 return;
1846 +
1847 + netns = netns_new("sockmap_listen", true);
1848 + if (!ASSERT_OK_PTR(netns, "netns_new"))
1849 + return;
1850 +
1847 1851 inet_unix_skb_redir_to_connected(skel, map, family);
1848 1852 unix_inet_skb_redir_to_connected(skel, map, family);
1853 +
1854 + netns_free(netns);
1849 1855 }
1850 1856
1851 1857 static void run_tests(struct test_sockmap_listen *skel, struct bpf_map *map,

+384 -1

tools/testing/selftests/bpf/prog_tests/tailcalls.c

··· 3 3 #include <test_progs.h> 4 4 #include <network_helpers.h> 5 5 #include "tailcall_poke.skel.h" 6 - 6 + #include "tailcall_bpf2bpf_hierarchy2.skel.h" 7 + #include "tailcall_bpf2bpf_hierarchy3.skel.h" 8 + #include "tailcall_freplace.skel.h" 9 + #include "tc_bpf2bpf.skel.h" 7 10 8 11 /* test_tailcall_1 checks basic functionality by patching multiple locations 9 12 * in a single program for a single tail call slot with nop->jmp, jmp->nop ··· 1190 1187 tailcall_poke__destroy(call); 1191 1188 } 1192 1189 1190 + static void test_tailcall_hierarchy_count(const char *which, bool test_fentry, 1191 + bool test_fexit, 1192 + bool test_fentry_entry) 1193 + { 1194 + int err, map_fd, prog_fd, main_data_fd, fentry_data_fd, fexit_data_fd, i, val; 1195 + struct bpf_object *obj = NULL, *fentry_obj = NULL, *fexit_obj = NULL; 1196 + struct bpf_link *fentry_link = NULL, *fexit_link = NULL; 1197 + struct bpf_program *prog, *fentry_prog; 1198 + struct bpf_map *prog_array, *data_map; 1199 + int fentry_prog_fd; 1200 + char buff[128] = {}; 1201 + 1202 + LIBBPF_OPTS(bpf_test_run_opts, topts, 1203 + .data_in = buff, 1204 + .data_size_in = sizeof(buff), 1205 + .repeat = 1, 1206 + ); 1207 + 1208 + err = bpf_prog_test_load(which, BPF_PROG_TYPE_SCHED_CLS, &obj, 1209 + &prog_fd); 1210 + if (!ASSERT_OK(err, "load obj")) 1211 + return; 1212 + 1213 + prog = bpf_object__find_program_by_name(obj, "entry"); 1214 + if (!ASSERT_OK_PTR(prog, "find entry prog")) 1215 + goto out; 1216 + 1217 + prog_fd = bpf_program__fd(prog); 1218 + if (!ASSERT_GE(prog_fd, 0, "prog_fd")) 1219 + goto out; 1220 + 1221 + if (test_fentry_entry) { 1222 + fentry_obj = bpf_object__open_file("tailcall_bpf2bpf_hierarchy_fentry.bpf.o", 1223 + NULL); 1224 + if (!ASSERT_OK_PTR(fentry_obj, "open fentry_obj file")) 1225 + goto out; 1226 + 1227 + fentry_prog = bpf_object__find_program_by_name(fentry_obj, 1228 + "fentry"); 1229 + if (!ASSERT_OK_PTR(prog, "find fentry prog")) 1230 + goto out; 1231 + 1232 + err = 
bpf_program__set_attach_target(fentry_prog, prog_fd, 1233 + "entry"); 1234 + if (!ASSERT_OK(err, "set_attach_target entry")) 1235 + goto out; 1236 + 1237 + err = bpf_object__load(fentry_obj); 1238 + if (!ASSERT_OK(err, "load fentry_obj")) 1239 + goto out; 1240 + 1241 + fentry_link = bpf_program__attach_trace(fentry_prog); 1242 + if (!ASSERT_OK_PTR(fentry_link, "attach_trace")) 1243 + goto out; 1244 + 1245 + fentry_prog_fd = bpf_program__fd(fentry_prog); 1246 + if (!ASSERT_GE(fentry_prog_fd, 0, "fentry_prog_fd")) 1247 + goto out; 1248 + 1249 + prog_array = bpf_object__find_map_by_name(fentry_obj, "jmp_table"); 1250 + if (!ASSERT_OK_PTR(prog_array, "find jmp_table")) 1251 + goto out; 1252 + 1253 + map_fd = bpf_map__fd(prog_array); 1254 + if (!ASSERT_GE(map_fd, 0, "map_fd")) 1255 + goto out; 1256 + 1257 + i = 0; 1258 + err = bpf_map_update_elem(map_fd, &i, &fentry_prog_fd, BPF_ANY); 1259 + if (!ASSERT_OK(err, "update jmp_table")) 1260 + goto out; 1261 + 1262 + data_map = bpf_object__find_map_by_name(fentry_obj, ".bss"); 1263 + if (!ASSERT_FALSE(!data_map || !bpf_map__is_internal(data_map), 1264 + "find data_map")) 1265 + goto out; 1266 + 1267 + } else { 1268 + prog_array = bpf_object__find_map_by_name(obj, "jmp_table"); 1269 + if (!ASSERT_OK_PTR(prog_array, "find jmp_table")) 1270 + goto out; 1271 + 1272 + map_fd = bpf_map__fd(prog_array); 1273 + if (!ASSERT_GE(map_fd, 0, "map_fd")) 1274 + goto out; 1275 + 1276 + i = 0; 1277 + err = bpf_map_update_elem(map_fd, &i, &prog_fd, BPF_ANY); 1278 + if (!ASSERT_OK(err, "update jmp_table")) 1279 + goto out; 1280 + 1281 + data_map = bpf_object__find_map_by_name(obj, ".bss"); 1282 + if (!ASSERT_FALSE(!data_map || !bpf_map__is_internal(data_map), 1283 + "find data_map")) 1284 + goto out; 1285 + } 1286 + 1287 + if (test_fentry) { 1288 + fentry_obj = bpf_object__open_file("tailcall_bpf2bpf_fentry.bpf.o", 1289 + NULL); 1290 + if (!ASSERT_OK_PTR(fentry_obj, "open fentry_obj file")) 1291 + goto out; 1292 + 1293 + prog = 
bpf_object__find_program_by_name(fentry_obj, "fentry"); 1294 + if (!ASSERT_OK_PTR(prog, "find fentry prog")) 1295 + goto out; 1296 + 1297 + err = bpf_program__set_attach_target(prog, prog_fd, 1298 + "subprog_tail"); 1299 + if (!ASSERT_OK(err, "set_attach_target subprog_tail")) 1300 + goto out; 1301 + 1302 + err = bpf_object__load(fentry_obj); 1303 + if (!ASSERT_OK(err, "load fentry_obj")) 1304 + goto out; 1305 + 1306 + fentry_link = bpf_program__attach_trace(prog); 1307 + if (!ASSERT_OK_PTR(fentry_link, "attach_trace")) 1308 + goto out; 1309 + } 1310 + 1311 + if (test_fexit) { 1312 + fexit_obj = bpf_object__open_file("tailcall_bpf2bpf_fexit.bpf.o", 1313 + NULL); 1314 + if (!ASSERT_OK_PTR(fexit_obj, "open fexit_obj file")) 1315 + goto out; 1316 + 1317 + prog = bpf_object__find_program_by_name(fexit_obj, "fexit"); 1318 + if (!ASSERT_OK_PTR(prog, "find fexit prog")) 1319 + goto out; 1320 + 1321 + err = bpf_program__set_attach_target(prog, prog_fd, 1322 + "subprog_tail"); 1323 + if (!ASSERT_OK(err, "set_attach_target subprog_tail")) 1324 + goto out; 1325 + 1326 + err = bpf_object__load(fexit_obj); 1327 + if (!ASSERT_OK(err, "load fexit_obj")) 1328 + goto out; 1329 + 1330 + fexit_link = bpf_program__attach_trace(prog); 1331 + if (!ASSERT_OK_PTR(fexit_link, "attach_trace")) 1332 + goto out; 1333 + } 1334 + 1335 + err = bpf_prog_test_run_opts(prog_fd, &topts); 1336 + ASSERT_OK(err, "tailcall"); 1337 + ASSERT_EQ(topts.retval, 1, "tailcall retval"); 1338 + 1339 + main_data_fd = bpf_map__fd(data_map); 1340 + if (!ASSERT_GE(main_data_fd, 0, "main_data_fd")) 1341 + goto out; 1342 + 1343 + i = 0; 1344 + err = bpf_map_lookup_elem(main_data_fd, &i, &val); 1345 + ASSERT_OK(err, "tailcall count"); 1346 + ASSERT_EQ(val, 34, "tailcall count"); 1347 + 1348 + if (test_fentry) { 1349 + data_map = bpf_object__find_map_by_name(fentry_obj, ".bss"); 1350 + if (!ASSERT_FALSE(!data_map || !bpf_map__is_internal(data_map), 1351 + "find tailcall_bpf2bpf_fentry.bss map")) 1352 + goto out; 1353 + 
1354 + fentry_data_fd = bpf_map__fd(data_map); 1355 + if (!ASSERT_GE(fentry_data_fd, 0, 1356 + "find tailcall_bpf2bpf_fentry.bss map fd")) 1357 + goto out; 1358 + 1359 + i = 0; 1360 + err = bpf_map_lookup_elem(fentry_data_fd, &i, &val); 1361 + ASSERT_OK(err, "fentry count"); 1362 + ASSERT_EQ(val, 68, "fentry count"); 1363 + } 1364 + 1365 + if (test_fexit) { 1366 + data_map = bpf_object__find_map_by_name(fexit_obj, ".bss"); 1367 + if (!ASSERT_FALSE(!data_map || !bpf_map__is_internal(data_map), 1368 + "find tailcall_bpf2bpf_fexit.bss map")) 1369 + goto out; 1370 + 1371 + fexit_data_fd = bpf_map__fd(data_map); 1372 + if (!ASSERT_GE(fexit_data_fd, 0, 1373 + "find tailcall_bpf2bpf_fexit.bss map fd")) 1374 + goto out; 1375 + 1376 + i = 0; 1377 + err = bpf_map_lookup_elem(fexit_data_fd, &i, &val); 1378 + ASSERT_OK(err, "fexit count"); 1379 + ASSERT_EQ(val, 68, "fexit count"); 1380 + } 1381 + 1382 + i = 0; 1383 + err = bpf_map_delete_elem(map_fd, &i); 1384 + if (!ASSERT_OK(err, "delete_elem from jmp_table")) 1385 + goto out; 1386 + 1387 + err = bpf_prog_test_run_opts(prog_fd, &topts); 1388 + ASSERT_OK(err, "tailcall"); 1389 + ASSERT_EQ(topts.retval, 1, "tailcall retval"); 1390 + 1391 + i = 0; 1392 + err = bpf_map_lookup_elem(main_data_fd, &i, &val); 1393 + ASSERT_OK(err, "tailcall count"); 1394 + ASSERT_EQ(val, 35, "tailcall count"); 1395 + 1396 + if (test_fentry) { 1397 + i = 0; 1398 + err = bpf_map_lookup_elem(fentry_data_fd, &i, &val); 1399 + ASSERT_OK(err, "fentry count"); 1400 + ASSERT_EQ(val, 70, "fentry count"); 1401 + } 1402 + 1403 + if (test_fexit) { 1404 + i = 0; 1405 + err = bpf_map_lookup_elem(fexit_data_fd, &i, &val); 1406 + ASSERT_OK(err, "fexit count"); 1407 + ASSERT_EQ(val, 70, "fexit count"); 1408 + } 1409 + 1410 + out: 1411 + bpf_link__destroy(fentry_link); 1412 + bpf_link__destroy(fexit_link); 1413 + bpf_object__close(fentry_obj); 1414 + bpf_object__close(fexit_obj); 1415 + bpf_object__close(obj); 1416 + } 1417 + 1418 + /* 
test_tailcall_bpf2bpf_hierarchy_1 checks that the count value of the tail 1419 + * call limit enforcement matches with expectations when tailcalls are preceded 1420 + * with two bpf2bpf calls. 1421 + * 1422 + * subprog --tailcall-> entry 1423 + * entry < 1424 + * subprog --tailcall-> entry 1425 + */ 1426 + static void test_tailcall_bpf2bpf_hierarchy_1(void) 1427 + { 1428 + test_tailcall_hierarchy_count("tailcall_bpf2bpf_hierarchy1.bpf.o", 1429 + false, false, false); 1430 + } 1431 + 1432 + /* test_tailcall_bpf2bpf_hierarchy_fentry checks that the count value of the 1433 + * tail call limit enforcement matches with expectations when tailcalls are 1434 + * preceded with two bpf2bpf calls, and the two subprogs are traced by fentry. 1435 + */ 1436 + static void test_tailcall_bpf2bpf_hierarchy_fentry(void) 1437 + { 1438 + test_tailcall_hierarchy_count("tailcall_bpf2bpf_hierarchy1.bpf.o", 1439 + true, false, false); 1440 + } 1441 + 1442 + /* test_tailcall_bpf2bpf_hierarchy_fexit checks that the count value of the tail 1443 + * call limit enforcement matches with expectations when tailcalls are preceded 1444 + * with two bpf2bpf calls, and the two subprogs are traced by fexit. 1445 + */ 1446 + static void test_tailcall_bpf2bpf_hierarchy_fexit(void) 1447 + { 1448 + test_tailcall_hierarchy_count("tailcall_bpf2bpf_hierarchy1.bpf.o", 1449 + false, true, false); 1450 + } 1451 + 1452 + /* test_tailcall_bpf2bpf_hierarchy_fentry_fexit checks that the count value of 1453 + * the tail call limit enforcement matches with expectations when tailcalls are 1454 + * preceded with two bpf2bpf calls, and the two subprogs are traced by both 1455 + * fentry and fexit. 
1456 + */ 1457 + static void test_tailcall_bpf2bpf_hierarchy_fentry_fexit(void) 1458 + { 1459 + test_tailcall_hierarchy_count("tailcall_bpf2bpf_hierarchy1.bpf.o", 1460 + true, true, false); 1461 + } 1462 + 1463 + /* test_tailcall_bpf2bpf_hierarchy_fentry_entry checks that the count value of 1464 + * the tail call limit enforcement matches with expectations when tailcalls are 1465 + * preceded with two bpf2bpf calls in fentry. 1466 + */ 1467 + static void test_tailcall_bpf2bpf_hierarchy_fentry_entry(void) 1468 + { 1469 + test_tailcall_hierarchy_count("tc_dummy.bpf.o", false, false, true); 1470 + } 1471 + 1472 + /* test_tailcall_bpf2bpf_hierarchy_2 checks that the count value of the tail 1473 + * call limit enforcement matches with expectations: 1474 + * 1475 + * subprog_tail0 --tailcall-> classifier_0 -> subprog_tail0 1476 + * entry < 1477 + * subprog_tail1 --tailcall-> classifier_1 -> subprog_tail1 1478 + */ 1479 + static void test_tailcall_bpf2bpf_hierarchy_2(void) 1480 + { 1481 + RUN_TESTS(tailcall_bpf2bpf_hierarchy2); 1482 + } 1483 + 1484 + /* test_tailcall_bpf2bpf_hierarchy_3 checks that the count value of the tail 1485 + * call limit enforcement matches with expectations: 1486 + * 1487 + * subprog with jmp_table0 to classifier_0 1488 + * entry --tailcall-> classifier_0 < 1489 + * subprog with jmp_table1 to classifier_0 1490 + */ 1491 + static void test_tailcall_bpf2bpf_hierarchy_3(void) 1492 + { 1493 + RUN_TESTS(tailcall_bpf2bpf_hierarchy3); 1494 + } 1495 + 1496 + /* test_tailcall_freplace checks that the attached freplace prog is OK to 1497 + * update the prog_array map. 
1498 + */ 1499 + static void test_tailcall_freplace(void) 1500 + { 1501 + struct tailcall_freplace *freplace_skel = NULL; 1502 + struct bpf_link *freplace_link = NULL; 1503 + struct bpf_program *freplace_prog; 1504 + struct tc_bpf2bpf *tc_skel = NULL; 1505 + int prog_fd, map_fd; 1506 + char buff[128] = {}; 1507 + int err, key; 1508 + 1509 + LIBBPF_OPTS(bpf_test_run_opts, topts, 1510 + .data_in = buff, 1511 + .data_size_in = sizeof(buff), 1512 + .repeat = 1, 1513 + ); 1514 + 1515 + freplace_skel = tailcall_freplace__open(); 1516 + if (!ASSERT_OK_PTR(freplace_skel, "tailcall_freplace__open")) 1517 + return; 1518 + 1519 + tc_skel = tc_bpf2bpf__open_and_load(); 1520 + if (!ASSERT_OK_PTR(tc_skel, "tc_bpf2bpf__open_and_load")) 1521 + goto out; 1522 + 1523 + prog_fd = bpf_program__fd(tc_skel->progs.entry_tc); 1524 + freplace_prog = freplace_skel->progs.entry_freplace; 1525 + err = bpf_program__set_attach_target(freplace_prog, prog_fd, "subprog"); 1526 + if (!ASSERT_OK(err, "set_attach_target")) 1527 + goto out; 1528 + 1529 + err = tailcall_freplace__load(freplace_skel); 1530 + if (!ASSERT_OK(err, "tailcall_freplace__load")) 1531 + goto out; 1532 + 1533 + freplace_link = bpf_program__attach_freplace(freplace_prog, prog_fd, 1534 + "subprog"); 1535 + if (!ASSERT_OK_PTR(freplace_link, "attach_freplace")) 1536 + goto out; 1537 + 1538 + map_fd = bpf_map__fd(freplace_skel->maps.jmp_table); 1539 + prog_fd = bpf_program__fd(freplace_prog); 1540 + key = 0; 1541 + err = bpf_map_update_elem(map_fd, &key, &prog_fd, BPF_ANY); 1542 + if (!ASSERT_OK(err, "update jmp_table")) 1543 + goto out; 1544 + 1545 + prog_fd = bpf_program__fd(tc_skel->progs.entry_tc); 1546 + err = bpf_prog_test_run_opts(prog_fd, &topts); 1547 + ASSERT_OK(err, "test_run"); 1548 + ASSERT_EQ(topts.retval, 34, "test_run retval"); 1549 + 1550 + out: 1551 + bpf_link__destroy(freplace_link); 1552 + tc_bpf2bpf__destroy(tc_skel); 1553 + tailcall_freplace__destroy(freplace_skel); 1554 + } 1555 + 1193 1556 void 
test_tailcalls(void) 1194 1557 { 1195 1558 if (test__start_subtest("tailcall_1")) ··· 1592 1223 test_tailcall_bpf2bpf_fentry_entry(); 1593 1224 if (test__start_subtest("tailcall_poke")) 1594 1225 test_tailcall_poke(); 1226 + if (test__start_subtest("tailcall_bpf2bpf_hierarchy_1")) 1227 + test_tailcall_bpf2bpf_hierarchy_1(); 1228 + if (test__start_subtest("tailcall_bpf2bpf_hierarchy_fentry")) 1229 + test_tailcall_bpf2bpf_hierarchy_fentry(); 1230 + if (test__start_subtest("tailcall_bpf2bpf_hierarchy_fexit")) 1231 + test_tailcall_bpf2bpf_hierarchy_fexit(); 1232 + if (test__start_subtest("tailcall_bpf2bpf_hierarchy_fentry_fexit")) 1233 + test_tailcall_bpf2bpf_hierarchy_fentry_fexit(); 1234 + if (test__start_subtest("tailcall_bpf2bpf_hierarchy_fentry_entry")) 1235 + test_tailcall_bpf2bpf_hierarchy_fentry_entry(); 1236 + test_tailcall_bpf2bpf_hierarchy_2(); 1237 + test_tailcall_bpf2bpf_hierarchy_3(); 1238 + if (test__start_subtest("tailcall_freplace")) 1239 + test_tailcall_freplace(); 1595 1240 }

+1 -1

tools/testing/selftests/bpf/prog_tests/tc_opts.c

···
2384 2384 BPF_MOV64_IMM(BPF_REG_0, 0),
2385 2385 BPF_EXIT_INSN(),
2386 2386 };
2387 - const size_t prog_insn_cnt = sizeof(prog_insns) / sizeof(struct bpf_insn);
2387 + const size_t prog_insn_cnt = ARRAY_SIZE(prog_insns);
2388 2388 LIBBPF_OPTS(bpf_prog_load_opts, opts);
2389 2389 const size_t log_buf_sz = 256;
2390 2390 char log_buf[log_buf_sz];
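The change in this hunk swaps a hand-written element count for `ARRAY_SIZE()`. A minimal sketch of why the two are equivalent, assuming a simplified `ARRAY_SIZE` definition (the one in the kernel tools headers also rejects non-array arguments at compile time) and a hypothetical `struct demo_insn` standing in for `struct bpf_insn`:

```c
#include <stddef.h>

/* Assumption: simplified definition for illustration only. */
#ifndef ARRAY_SIZE
#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))
#endif

/* Hypothetical stand-in for struct bpf_insn; only its size matters here. */
struct demo_insn {
	unsigned char code;
	unsigned char regs;
	short off;
	int imm;
};

static const struct demo_insn demo_insns[] = {
	{ 0xb7, 0, 0, 0 },	/* placeholder for BPF_MOV64_IMM(BPF_REG_0, 0) */
	{ 0x95, 0, 0, 0 },	/* placeholder for BPF_EXIT_INSN() */
};

/* Old style: the element type is spelled out a second time. */
size_t demo_insn_cnt_old(void)
{
	return sizeof(demo_insns) / sizeof(struct demo_insn);
}

/* New style: the element size is derived from the array itself. */
size_t demo_insn_cnt_new(void)
{
	return ARRAY_SIZE(demo_insns);
}
```

Both helpers return the same count; the macro form stays correct if the element type of the array is ever changed.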

+29 -14

tools/testing/selftests/bpf/prog_tests/tc_redirect.c

···
68 68 __FILE__, __LINE__, strerror(errno), ##__VA_ARGS__)
69 69
70 70 static const char * const namespaces[] = {NS_SRC, NS_FWD, NS_DST, NULL};
71 + static struct netns_obj *netns_objs[3];
71 72
72 73 static int write_file(const char *path, const char *newval)
73 74 {
···
88 87
89 88 static int netns_setup_namespaces(const char *verb)
90 89 {
90 + struct netns_obj **ns_obj = netns_objs;
91 91 const char * const *ns = namespaces;
92 - char cmd[128];
93 92
94 93 while (*ns) {
95 - snprintf(cmd, sizeof(cmd), "ip netns %s %s", verb, *ns);
96 - if (!ASSERT_OK(system(cmd), cmd))
97 - return -1;
94 + if (strcmp(verb, "add") == 0) {
95 + *ns_obj = netns_new(*ns, false);
96 + if (!ASSERT_OK_PTR(*ns_obj, "netns_new"))
97 + return -1;
98 + } else {
99 + if (!ASSERT_OK_PTR(*ns_obj, "netns_obj is NULL"))
100 + return -1;
101 + netns_free(*ns_obj);
102 + *ns_obj = NULL;
103 + }
98 104 ns++;
105 + ns_obj++;
99 106 }
100 107 return 0;
101 108 }
102 109
103 110 static void netns_setup_namespaces_nofail(const char *verb)
104 111 {
112 + struct netns_obj **ns_obj = netns_objs;
105 113 const char * const *ns = namespaces;
106 - char cmd[128];
107 114
108 115 while (*ns) {
109 - snprintf(cmd, sizeof(cmd), "ip netns %s %s > /dev/null 2>&1", verb, *ns);
110 - system(cmd);
116 + if (strcmp(verb, "add") == 0) {
117 + *ns_obj = netns_new(*ns, false);
118 + } else {
119 + if (*ns_obj)
120 + netns_free(*ns_obj);
121 + *ns_obj = NULL;
122 + }
111 123 ns++;
124 + ns_obj++;
112 125 }
113 126 }
114 127
···
486 471
487 472 static int __rcv_tstamp(int fd, const char *expected, size_t s, __u64 *tstamp)
488 473 {
489 - struct __kernel_timespec pkt_ts = {};
474 + struct timespec pkt_ts = {};
490 475 char ctl[CMSG_SPACE(sizeof(pkt_ts))];
491 476 struct timespec now_ts;
492 477 struct msghdr msg = {};
···
510 495
511 496 cmsg = CMSG_FIRSTHDR(&msg);
512 497 if (cmsg && cmsg->cmsg_level == SOL_SOCKET &&
513 - cmsg->cmsg_type == SO_TIMESTAMPNS_NEW)
498 + cmsg->cmsg_type == SO_TIMESTAMPNS)
514 499 memcpy(&pkt_ts, CMSG_DATA(cmsg), sizeof(pkt_ts));
515 500
516 501 pkt_ns = pkt_ts.tv_sec * NSEC_PER_SEC + pkt_ts.tv_nsec;
···
552 537 if (!ASSERT_GE(srv_fd, 0, "start_server"))
553 538 goto done;
554 539
555 - err = setsockopt(srv_fd, SOL_SOCKET, SO_TIMESTAMPNS_NEW,
540 + err = setsockopt(srv_fd, SOL_SOCKET, SO_TIMESTAMPNS,
556 541 &opt, sizeof(opt));
557 - if (!ASSERT_OK(err, "setsockopt(SO_TIMESTAMPNS_NEW)"))
542 + if (!ASSERT_OK(err, "setsockopt(SO_TIMESTAMPNS)"))
558 543 goto done;
559 544
560 545 cli_fd = connect_to_fd(srv_fd, TIMEOUT_MILLIS);
···
636 621 return;
637 622
638 623 /* Ensure the kernel puts the (rcv) timestamp for all skb */
639 - err = setsockopt(listen_fd, SOL_SOCKET, SO_TIMESTAMPNS_NEW,
624 + err = setsockopt(listen_fd, SOL_SOCKET, SO_TIMESTAMPNS,
640 625 &opt, sizeof(opt));
641 - if (!ASSERT_OK(err, "setsockopt(SO_TIMESTAMPNS_NEW)"))
626 + if (!ASSERT_OK(err, "setsockopt(SO_TIMESTAMPNS)"))
642 627 goto done;
643 628
644 629 if (type == SOCK_STREAM) {
···
872 857 test_inet_dtime(family, SOCK_STREAM, addr, 50000 + t);
873 858
874 859 /* fwdns_prio100 prog does not read delivery_time_type, so
875 - * kernel puts the (rcv) timetamp in __sk_buff->tstamp
860 + * kernel puts the (rcv) timestamp in __sk_buff->tstamp
876 861 */
877 862 ASSERT_EQ(dtimes[INGRESS_FWDNS_P100], 0,
878 863 dtime_cnt_str(t, INGRESS_FWDNS_P100));
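The timestamp hunks in this file copy a `struct timespec` out of the control message and fold it into a single nanosecond value with `pkt_ts.tv_sec * NSEC_PER_SEC + pkt_ts.tv_nsec`. A minimal sketch of that conversion, assuming `NSEC_PER_SEC` is 10^9; the helper name `timespec_to_ns` is hypothetical and not part of the selftest:

```c
#include <stdint.h>
#include <time.h>

/* Assumption: same value the selftest uses for NSEC_PER_SEC. */
#define DEMO_NSEC_PER_SEC 1000000000ULL

/* Hypothetical helper: flatten a timespec into nanoseconds. The casts keep
 * the multiplication in 64 bits even where time_t is narrower.
 */
uint64_t timespec_to_ns(const struct timespec *ts)
{
	return (uint64_t)ts->tv_sec * DEMO_NSEC_PER_SEC + (uint64_t)ts->tv_nsec;
}
```

The result can be compared directly against other nanosecond timestamps, which is how the test bounds the packet timestamp against the current time.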

+1

tools/testing/selftests/bpf/prog_tests/tcp_rtt.c

···
1 1 // SPDX-License-Identifier: GPL-2.0
2 + #define _GNU_SOURCE
2 3 #include <test_progs.h>
3 4 #include "cgroup_helpers.h"
4 5 #include "network_helpers.h"

-4

tools/testing/selftests/bpf/prog_tests/test_bpf_syscall_macro.c

···
38 38 /* check whether args of syscall are copied correctly */
39 39 prctl(exp_arg1, exp_arg2, exp_arg3, exp_arg4, exp_arg5);
40 40
41 - #if defined(__aarch64__) || defined(__s390__)
42 - ASSERT_NEQ(skel->bss->arg1, exp_arg1, "syscall_arg1");
43 - #else
44 41 ASSERT_EQ(skel->bss->arg1, exp_arg1, "syscall_arg1");
45 - #endif
46 42 ASSERT_EQ(skel->bss->arg2, exp_arg2, "syscall_arg2");
47 43 ASSERT_EQ(skel->bss->arg3, exp_arg3, "syscall_arg3");
48 44 /* it cannot copy arg4 when uses PT_REGS_PARM4 on x86_64 */

+1 -1

tools/testing/selftests/bpf/prog_tests/test_bprm_opts.c

···
51 51 exit(ret);
52 52
53 53 /* If the binary is executed with securexec=1, the dynamic
54 - * loader ingores and unsets certain variables like LD_PRELOAD,
54 + * loader ignores and unsets certain variables like LD_PRELOAD,
55 55 * TMPDIR etc. TMPDIR is used here to simplify the example, as
56 56 * LD_PRELOAD requires a real .so file.
57 57 *
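The comment in this hunk relies on the dynamic loader dropping variables such as LD_PRELOAD and TMPDIR when a binary runs in secure-execution mode. As a rough illustration only (this is not how the selftest checks it), a process on Linux with glibc can query that mode through the `AT_SECURE` entry of the auxiliary vector. The helper name `in_secure_exec` is hypothetical:

```c
#include <sys/auxv.h>

/* Hypothetical helper: returns 1 when the kernel flagged this process for
 * secure execution (AT_SECURE is non-zero in the auxiliary vector), and 0
 * otherwise.
 */
int in_secure_exec(void)
{
	return getauxval(AT_SECURE) != 0;
}
```

For an ordinary, non-setuid process without an LSM requesting secure execution, this is expected to return 0.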

+45 -1

tools/testing/selftests/bpf/prog_tests/test_lsm.c

···
12 12 #include <stdlib.h>
13 13
14 14 #include "lsm.skel.h"
15 + #include "lsm_tailcall.skel.h"
15 16
16 17 char *CMD_ARGS[] = {"true", NULL};
17 18
···
96 95 return 0;
97 96 }
98 97
99 - void test_test_lsm(void)
98 + static void test_lsm_basic(void)
100 99 {
101 100 struct lsm *skel = NULL;
102 101 int err;
···
114 113
115 114 close_prog:
116 115 lsm__destroy(skel);
116 + }
117 +
118 + static void test_lsm_tailcall(void)
119 + {
120 + struct lsm_tailcall *skel = NULL;
121 + int map_fd, prog_fd;
122 + int err, key;
123 +
124 + skel = lsm_tailcall__open_and_load();
125 + if (!ASSERT_OK_PTR(skel, "lsm_tailcall__skel_load"))
126 + goto close_prog;
127 +
128 + map_fd = bpf_map__fd(skel->maps.jmp_table);
129 + if (CHECK_FAIL(map_fd < 0))
130 + goto close_prog;
131 +
132 + prog_fd = bpf_program__fd(skel->progs.lsm_file_permission_prog);
133 + if (CHECK_FAIL(prog_fd < 0))
134 + goto close_prog;
135 +
136 + key = 0;
137 + err = bpf_map_update_elem(map_fd, &key, &prog_fd, BPF_ANY);
138 + if (CHECK_FAIL(!err))
139 + goto close_prog;
140 +
141 + prog_fd = bpf_program__fd(skel->progs.lsm_file_alloc_security_prog);
142 + if (CHECK_FAIL(prog_fd < 0))
143 + goto close_prog;
144 +
145 + err = bpf_map_update_elem(map_fd, &key, &prog_fd, BPF_ANY);
146 + if (CHECK_FAIL(err))
147 + goto close_prog;
148 +
149 + close_prog:
150 + lsm_tailcall__destroy(skel);
151 + }
152 +
153 + void test_test_lsm(void)
154 + {
155 + if (test__start_subtest("lsm_basic"))
156 + test_lsm_basic();
157 + if (test__start_subtest("lsm_tailcall"))
158 + test_lsm_tailcall();
117 159 }

+57

tools/testing/selftests/bpf/prog_tests/test_mmap_inner_array.c

···
1 + // SPDX-License-Identifier: GPL-2.0
2 + /* Copyright (c) 2024 Meta Platforms, Inc. and affiliates. */
3 + #include <test_progs.h>
4 + #include <sys/mman.h>
5 + #include "mmap_inner_array.skel.h"
6 +
7 + void test_mmap_inner_array(void)
8 + {
9 + const long page_size = sysconf(_SC_PAGE_SIZE);
10 + struct mmap_inner_array *skel;
11 + int inner_array_fd, err;
12 + void *tmp;
13 + __u64 *val;
14 +
15 + skel = mmap_inner_array__open_and_load();
16 +
17 + if (!ASSERT_OK_PTR(skel, "open_and_load"))
18 + return;
19 +
20 + inner_array_fd = bpf_map__fd(skel->maps.inner_array);
21 + tmp = mmap(NULL, page_size, PROT_READ | PROT_WRITE, MAP_SHARED, inner_array_fd, 0);
22 + if (!ASSERT_OK_PTR(tmp, "inner array mmap"))
23 + goto out;
24 + val = (void *)tmp;
25 +
26 + err = mmap_inner_array__attach(skel);
27 + if (!ASSERT_OK(err, "attach"))
28 + goto out_unmap;
29 +
30 + skel->bss->pid = getpid();
31 + usleep(1);
32 +
33 + /* pid is set, pid_match == true and outer_map_match == false */
34 + ASSERT_TRUE(skel->bss->pid_match, "pid match 1");
35 + ASSERT_FALSE(skel->bss->outer_map_match, "outer map match 1");
36 + ASSERT_FALSE(skel->bss->done, "done 1");
37 + ASSERT_EQ(*val, 0, "value match 1");
38 +
39 + err = bpf_map__update_elem(skel->maps.outer_map,
40 + &skel->bss->pid, sizeof(skel->bss->pid),
41 + &inner_array_fd, sizeof(inner_array_fd),
42 + BPF_ANY);
43 + if (!ASSERT_OK(err, "update elem"))
44 + goto out_unmap;
45 + usleep(1);
46 +
47 + /* outer map key is set, outer_map_match == true */
48 + ASSERT_TRUE(skel->bss->pid_match, "pid match 2");
49 + ASSERT_TRUE(skel->bss->outer_map_match, "outer map match 2");
50 + ASSERT_TRUE(skel->bss->done, "done 2");
51 + ASSERT_EQ(*val, skel->data->match_value, "value match 2");
52 +
53 + out_unmap:
54 + munmap(tmp, page_size);
55 + out:
56 + mmap_inner_array__destroy(skel);
57 + }

+1 -1

tools/testing/selftests/bpf/prog_tests/test_strncmp.c

···
72 72 got = trigger_strncmp(skel);
73 73 ASSERT_EQ(got, 0, "strncmp: same str");
74 74
75 - /* Not-null-termainted string */
75 + /* Not-null-terminated string */
76 76 memcpy(skel->bss->str, skel->rodata->target, sizeof(skel->bss->str));
77 77 skel->bss->str[sizeof(skel->bss->str) - 1] = 'A';
78 78 got = trigger_strncmp(skel);
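The context lines in this hunk build a buffer with no terminating NUL by overwriting the last byte with 'A' and then comparing it against the target. A minimal user-space analogy, assuming libc `strncmp` rather than the BPF helper the selftest actually exercises; the buffer contents and the function name `cmp_unterminated` are hypothetical:

```c
#include <string.h>

/* Hypothetical analogy using libc strncmp, not the BPF strncmp helper. */
int cmp_unterminated(void)
{
	const char target[] = "hello";	/* 6 bytes including the NUL */
	char str[sizeof(target)];

	memcpy(str, target, sizeof(str));
	str[sizeof(str) - 1] = 'A';	/* overwrite the terminating NUL */

	/* The length bound keeps the comparison inside both buffers even
	 * though str is no longer NUL-terminated. The first five bytes
	 * match, and the sixth compares 'A' against '\0'.
	 */
	return strncmp(str, target, sizeof(str));
}
```

Because 'A' is greater than '\0', the comparison reports the unterminated buffer as greater than the target instead of reading past its end.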

+2

tools/testing/selftests/bpf/prog_tests/test_struct_ops_module.c

···
9 9 #include "struct_ops_nulled_out_cb.skel.h"
10 10 #include "struct_ops_forgotten_cb.skel.h"
11 11 #include "struct_ops_detach.skel.h"
12 + #include "unsupported_ops.skel.h"
12 13
13 14 static void check_map_info(struct bpf_map_info *info)
14 15 {
···
312 311 test_struct_ops_forgotten_cb();
313 312 if (test__start_subtest("test_detach_link"))
314 313 test_detach_link();
314 + RUN_TESTS(unsupported_ops);
315 315 }
316 316

+213

tools/testing/selftests/bpf/prog_tests/test_xdp_veth.c

··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + 3 + /* Create 3 namespaces with 3 veth peers, and forward packets in-between using 4 + * native XDP 5 + * 6 + * XDP_TX 7 + * NS1(veth11) NS2(veth22) NS3(veth33) 8 + * | | | 9 + * | | | 10 + * (veth1, (veth2, (veth3, 11 + * id:111) id:122) id:133) 12 + * ^ | ^ | ^ | 13 + * | | XDP_REDIRECT | | XDP_REDIRECT | | 14 + * | ------------------ ------------------ | 15 + * ----------------------------------------- 16 + * XDP_REDIRECT 17 + */ 18 + 19 + #define _GNU_SOURCE 20 + #include <net/if.h> 21 + #include "test_progs.h" 22 + #include "network_helpers.h" 23 + #include "xdp_dummy.skel.h" 24 + #include "xdp_redirect_map.skel.h" 25 + #include "xdp_tx.skel.h" 26 + 27 + #define VETH_PAIRS_COUNT 3 28 + #define NS_SUFFIX_LEN 6 29 + #define VETH_NAME_MAX_LEN 16 30 + #define IP_SRC "10.1.1.11" 31 + #define IP_DST "10.1.1.33" 32 + #define IP_CMD_MAX_LEN 128 33 + 34 + struct skeletons { 35 + struct xdp_dummy *xdp_dummy; 36 + struct xdp_tx *xdp_tx; 37 + struct xdp_redirect_map *xdp_redirect_maps; 38 + }; 39 + 40 + struct veth_configuration { 41 + char local_veth[VETH_NAME_MAX_LEN]; /* Interface in main namespace */ 42 + char remote_veth[VETH_NAME_MAX_LEN]; /* Peer interface in dedicated namespace*/ 43 + const char *namespace; /* Namespace for the remote veth */ 44 + char next_veth[VETH_NAME_MAX_LEN]; /* Local interface to redirect traffic to */ 45 + char *remote_addr; /* IP address of the remote veth */ 46 + }; 47 + 48 + static struct veth_configuration config[VETH_PAIRS_COUNT] = { 49 + { 50 + .local_veth = "veth1", 51 + .remote_veth = "veth11", 52 + .next_veth = "veth2", 53 + .remote_addr = IP_SRC, 54 + .namespace = "ns-veth11" 55 + }, 56 + { 57 + .local_veth = "veth2", 58 + .remote_veth = "veth22", 59 + .next_veth = "veth3", 60 + .remote_addr = NULL, 61 + .namespace = "ns-veth22" 62 + }, 63 + { 64 + .local_veth = "veth3", 65 + .remote_veth = "veth33", 66 + .next_veth = "veth1", 67 + .remote_addr = IP_DST, 68 + .namespace = 
"ns-veth33" 69 + } 70 + }; 71 + 72 + static int attach_programs_to_veth_pair(struct skeletons *skeletons, int index) 73 + { 74 + struct bpf_program *local_prog, *remote_prog; 75 + struct bpf_link **local_link, **remote_link; 76 + struct nstoken *nstoken; 77 + struct bpf_link *link; 78 + int interface; 79 + 80 + switch (index) { 81 + case 0: 82 + local_prog = skeletons->xdp_redirect_maps->progs.xdp_redirect_map_0; 83 + local_link = &skeletons->xdp_redirect_maps->links.xdp_redirect_map_0; 84 + remote_prog = skeletons->xdp_dummy->progs.xdp_dummy_prog; 85 + remote_link = &skeletons->xdp_dummy->links.xdp_dummy_prog; 86 + break; 87 + case 1: 88 + local_prog = skeletons->xdp_redirect_maps->progs.xdp_redirect_map_1; 89 + local_link = &skeletons->xdp_redirect_maps->links.xdp_redirect_map_1; 90 + remote_prog = skeletons->xdp_tx->progs.xdp_tx; 91 + remote_link = &skeletons->xdp_tx->links.xdp_tx; 92 + break; 93 + case 2: 94 + local_prog = skeletons->xdp_redirect_maps->progs.xdp_redirect_map_2; 95 + local_link = &skeletons->xdp_redirect_maps->links.xdp_redirect_map_2; 96 + remote_prog = skeletons->xdp_dummy->progs.xdp_dummy_prog; 97 + remote_link = &skeletons->xdp_dummy->links.xdp_dummy_prog; 98 + break; 99 + } 100 + interface = if_nametoindex(config[index].local_veth); 101 + if (!ASSERT_NEQ(interface, 0, "non zero interface index")) 102 + return -1; 103 + link = bpf_program__attach_xdp(local_prog, interface); 104 + if (!ASSERT_OK_PTR(link, "attach xdp program to local veth")) 105 + return -1; 106 + *local_link = link; 107 + nstoken = open_netns(config[index].namespace); 108 + if (!ASSERT_OK_PTR(nstoken, "switch to remote veth namespace")) 109 + return -1; 110 + interface = if_nametoindex(config[index].remote_veth); 111 + if (!ASSERT_NEQ(interface, 0, "non zero interface index")) { 112 + close_netns(nstoken); 113 + return -1; 114 + } 115 + link = bpf_program__attach_xdp(remote_prog, interface); 116 + *remote_link = link; 117 + close_netns(nstoken); 118 + if 
(!ASSERT_OK_PTR(link, "attach xdp program to remote veth")) 119 + return -1; 120 + 121 + return 0; 122 + } 123 + 124 + static int configure_network(struct skeletons *skeletons) 125 + { 126 + int interface_id; 127 + int map_fd; 128 + int err; 129 + int i = 0; 130 + 131 + /* First create and configure all interfaces */ 132 + for (i = 0; i < VETH_PAIRS_COUNT; i++) { 133 + SYS(fail, "ip netns add %s", config[i].namespace); 134 + SYS(fail, "ip link add %s type veth peer name %s netns %s", 135 + config[i].local_veth, config[i].remote_veth, config[i].namespace); 136 + SYS(fail, "ip link set dev %s up", config[i].local_veth); 137 + if (config[i].remote_addr) 138 + SYS(fail, "ip -n %s addr add %s/24 dev %s", config[i].namespace, 139 + config[i].remote_addr, config[i].remote_veth); 140 + SYS(fail, "ip -n %s link set dev %s up", config[i].namespace, 141 + config[i].remote_veth); 142 + } 143 + 144 + /* Then configure the redirect map and attach programs to interfaces */ 145 + map_fd = bpf_map__fd(skeletons->xdp_redirect_maps->maps.tx_port); 146 + if (!ASSERT_GE(map_fd, 0, "open redirect map")) 147 + goto fail; 148 + for (i = 0; i < VETH_PAIRS_COUNT; i++) { 149 + interface_id = if_nametoindex(config[i].next_veth); 150 + if (!ASSERT_NEQ(interface_id, 0, "non zero interface index")) 151 + goto fail; 152 + err = bpf_map_update_elem(map_fd, &i, &interface_id, BPF_ANY); 153 + if (!ASSERT_OK(err, "configure interface redirection through map")) 154 + goto fail; 155 + if (attach_programs_to_veth_pair(skeletons, i)) 156 + goto fail; 157 + } 158 + 159 + return 0; 160 + 161 + fail: 162 + return -1; 163 + } 164 + 165 + static void cleanup_network(void) 166 + { 167 + int i; 168 + 169 + /* Deleting namespaces is enough to automatically remove veth pairs as well 170 + */ 171 + for (i = 0; i < VETH_PAIRS_COUNT; i++) 172 + SYS_NOFAIL("ip netns del %s", config[i].namespace); 173 + } 174 + 175 + static int check_ping(struct skeletons *skeletons) 176 + { 177 + /* Test: if all interfaces are 
properly configured, we must be able to ping 178 + * veth33 from veth11 179 + */ 180 + return SYS_NOFAIL("ip netns exec %s ping -c 1 -W 1 %s > /dev/null", 181 + config[0].namespace, IP_DST); 182 + } 183 + 184 + void test_xdp_veth_redirect(void) 185 + { 186 + struct skeletons skeletons = {}; 187 + 188 + skeletons.xdp_dummy = xdp_dummy__open_and_load(); 189 + if (!ASSERT_OK_PTR(skeletons.xdp_dummy, "xdp_dummy__open_and_load")) 190 + return; 191 + 192 + skeletons.xdp_tx = xdp_tx__open_and_load(); 193 + if (!ASSERT_OK_PTR(skeletons.xdp_tx, "xdp_tx__open_and_load")) 194 + goto destroy_xdp_dummy; 195 + 196 + skeletons.xdp_redirect_maps = xdp_redirect_map__open_and_load(); 197 + if (!ASSERT_OK_PTR(skeletons.xdp_redirect_maps, "xdp_redirect_map__open_and_load")) 198 + goto destroy_xdp_tx; 199 + 200 + if (configure_network(&skeletons)) 201 + goto destroy_xdp_redirect_map; 202 + 203 + ASSERT_OK(check_ping(&skeletons), "ping"); 204 + 205 + destroy_xdp_redirect_map: 206 + xdp_redirect_map__destroy(skeletons.xdp_redirect_maps); 207 + destroy_xdp_tx: 208 + xdp_tx__destroy(skeletons.xdp_tx); 209 + destroy_xdp_dummy: 210 + xdp_dummy__destroy(skeletons.xdp_dummy); 211 + 212 + cleanup_network(); 213 + }

+2 -2

tools/testing/selftests/bpf/prog_tests/token.c

··· 867 867 } 868 868 unsetenv(TOKEN_ENVVAR); 869 869 870 - /* now the same struct_ops skeleton should succeed thanks to libppf 870 + /* now the same struct_ops skeleton should succeed thanks to libbpf 871 871 * creating BPF token from /sys/fs/bpf mount point 872 872 */ 873 873 skel = dummy_st_ops_success__open_and_load(); ··· 929 929 if (!ASSERT_OK(err, "setenv_token_path")) 930 930 goto err_out; 931 931 932 - /* now the same struct_ops skeleton should succeed thanks to libppf 932 + /* now the same struct_ops skeleton should succeed thanks to libbpf 933 933 * creating BPF token from custom mount point 934 934 */ 935 935 skel = dummy_st_ops_success__open_and_load();

+2 -1

tools/testing/selftests/bpf/prog_tests/unpriv_bpf_disabled.c

··· 7 7 #include "test_unpriv_bpf_disabled.skel.h" 8 8 9 9 #include "cap_helpers.h" 10 + #include "bpf_util.h" 10 11 11 12 /* Using CAP_LAST_CAP is risky here, since it can get pulled in from 12 13 * an old /usr/include/linux/capability.h and be < CAP_BPF; as a result ··· 147 146 BPF_MOV64_IMM(BPF_REG_0, 0), 148 147 BPF_EXIT_INSN(), 149 148 }; 150 - const size_t prog_insn_cnt = sizeof(prog_insns) / sizeof(struct bpf_insn); 149 + const size_t prog_insn_cnt = ARRAY_SIZE(prog_insns); 151 150 LIBBPF_OPTS(bpf_prog_load_opts, load_opts); 152 151 struct bpf_map_info map_info = {}; 153 152 __u32 map_info_len = sizeof(map_info);
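Note on the ARRAY_SIZE() change above: the hunk replaces a hand-written sizeof division with the helper pulled in through bpf_util.h. A minimal sketch of why the two forms agree, assuming a simplified stand-in for the macro (the real helper may carry extra type checks) and a hypothetical struct demo_insn in place of struct bpf_insn:

```c
#include <stddef.h>

/* Assumption: simplified stand-in for the ARRAY_SIZE() helper that the
 * selftests get from bpf_util.h; the real macro may add type checks.
 */
#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))

/* Hypothetical instruction record, standing in for struct bpf_insn. */
struct demo_insn {
	unsigned char code;
	int imm;
};

static const struct demo_insn demo_prog[] = {
	{ 0xb7, 0 },	/* placeholder values, not decoded here */
	{ 0x95, 0 },
};

/* Old style: divide by the element type, which must be kept in sync by hand. */
size_t count_by_type(void)
{
	return sizeof(demo_prog) / sizeof(struct demo_insn);
}

/* New style: the element size is derived from the array itself. */
size_t count_by_macro(void)
{
	return ARRAY_SIZE(demo_prog);
}
```

Both functions return the same count; the macro form keeps working if the element type of the array is later changed.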

+468 -59

tools/testing/selftests/bpf/prog_tests/uprobe_multi_test.c

··· 6 6 #include "uprobe_multi.skel.h" 7 7 #include "uprobe_multi_bench.skel.h" 8 8 #include "uprobe_multi_usdt.skel.h" 9 + #include "uprobe_multi_consumers.skel.h" 10 + #include "uprobe_multi_pid_filter.skel.h" 9 11 #include "bpf/libbpf_internal.h" 10 12 #include "testing_helpers.h" 11 13 #include "../sdt.h" ··· 40 38 int pid; 41 39 int tid; 42 40 pthread_t thread; 41 + char stack[65536]; 43 42 }; 44 43 45 44 static void release_child(struct child *child) ··· 70 67 fflush(NULL); 71 68 } 72 69 73 - static struct child *spawn_child(void) 70 + static int child_func(void *arg) 74 71 { 75 - static struct child child; 76 - int err; 77 - int c; 72 + struct child *child = arg; 73 + int err, c; 78 74 75 + close(child->go[1]); 76 + 77 + /* wait for parent's kick */ 78 + err = read(child->go[0], &c, 1); 79 + if (err != 1) 80 + exit(err); 81 + 82 + uprobe_multi_func_1(); 83 + uprobe_multi_func_2(); 84 + uprobe_multi_func_3(); 85 + usdt_trigger(); 86 + 87 + exit(errno); 88 + } 89 + 90 + static int spawn_child_flag(struct child *child, bool clone_vm) 91 + { 79 92 /* pipe to notify child to execute the trigger functions */ 80 - if (pipe(child.go)) 81 - return NULL; 93 + if (pipe(child->go)) 94 + return -1; 82 95 83 - child.pid = child.tid = fork(); 84 - if (child.pid < 0) { 85 - release_child(&child); 96 + if (clone_vm) { 97 + child->pid = child->tid = clone(child_func, child->stack + sizeof(child->stack)/2, 98 + CLONE_VM|SIGCHLD, child); 99 + } else { 100 + child->pid = child->tid = fork(); 101 + } 102 + if (child->pid < 0) { 103 + release_child(child); 86 104 errno = EINVAL; 87 - return NULL; 105 + return -1; 88 106 } 89 107 90 - /* child */ 91 - if (child.pid == 0) { 92 - close(child.go[1]); 108 + /* fork-ed child */ 109 + if (!clone_vm && child->pid == 0) 110 + child_func(child); 93 111 94 - /* wait for parent's kick */ 95 - err = read(child.go[0], &c, 1); 96 - if (err != 1) 97 - exit(err); 112 + return 0; 113 + } 98 114 99 - uprobe_multi_func_1(); 100 - 
uprobe_multi_func_2(); 101 - uprobe_multi_func_3(); 102 - usdt_trigger(); 103 - 104 - exit(errno); 105 - } 106 - 107 - return &child; 115 + static int spawn_child(struct child *child) 116 + { 117 + return spawn_child_flag(child, false); 108 118 } 109 119 110 120 static void *child_thread(void *ctx) ··· 146 130 pthread_exit(&err); 147 131 } 148 132 149 - static struct child *spawn_thread(void) 133 + static int spawn_thread(struct child *child) 150 134 { 151 - static struct child child; 152 135 int c, err; 153 136 154 137 /* pipe to notify child to execute the trigger functions */ 155 - if (pipe(child.go)) 156 - return NULL; 138 + if (pipe(child->go)) 139 + return -1; 157 140 /* pipe to notify parent that child thread is ready */ 158 - if (pipe(child.c2p)) { 159 - close(child.go[0]); 160 - close(child.go[1]); 161 - return NULL; 141 + if (pipe(child->c2p)) { 142 + close(child->go[0]); 143 + close(child->go[1]); 144 + return -1; 162 145 } 163 146 164 - child.pid = getpid(); 147 + child->pid = getpid(); 165 148 166 - err = pthread_create(&child.thread, NULL, child_thread, &child); 149 + err = pthread_create(&child->thread, NULL, child_thread, child); 167 150 if (err) { 168 151 err = -errno; 169 - close(child.go[0]); 170 - close(child.go[1]); 171 - close(child.c2p[0]); 172 - close(child.c2p[1]); 152 + close(child->go[0]); 153 + close(child->go[1]); 154 + close(child->c2p[0]); 155 + close(child->c2p[1]); 173 156 errno = -err; 174 - return NULL; 157 + return -1; 175 158 } 176 159 177 - err = read(child.c2p[0], &c, 1); 160 + err = read(child->c2p[0], &c, 1); 178 161 if (!ASSERT_EQ(err, 1, "child_thread_ready")) 179 - return NULL; 162 + return -1; 180 163 181 - return &child; 164 + return 0; 182 165 } 183 166 184 167 static void uprobe_multi_test_run(struct uprobe_multi *skel, struct child *child) ··· 213 198 214 199 /* 215 200 * There are 2 entry and 2 exit probe called for each uprobe_multi_func_[123] 216 - * function and each slepable probe (6) increments 
uprobe_multi_sleep_result. 201 + * function and each sleepable probe (6) increments uprobe_multi_sleep_result. 217 202 */ 218 203 ASSERT_EQ(skel->bss->uprobe_multi_func_1_result, 2, "uprobe_multi_func_1_result"); 219 204 ASSERT_EQ(skel->bss->uprobe_multi_func_2_result, 2, "uprobe_multi_func_2_result"); ··· 318 303 static void 319 304 test_attach_api(const char *binary, const char *pattern, struct bpf_uprobe_multi_opts *opts) 320 305 { 321 - struct child *child; 306 + static struct child child; 322 307 323 308 /* no pid filter */ 324 309 __test_attach_api(binary, pattern, opts, NULL); 325 310 326 311 /* pid filter */ 327 - child = spawn_child(); 328 - if (!ASSERT_OK_PTR(child, "spawn_child")) 312 + if (!ASSERT_OK(spawn_child(&child), "spawn_child")) 329 313 return; 330 314 331 - __test_attach_api(binary, pattern, opts, child); 315 + __test_attach_api(binary, pattern, opts, &child); 332 316 333 317 /* pid filter (thread) */ 334 - child = spawn_thread(); 335 - if (!ASSERT_OK_PTR(child, "spawn_thread")) 318 + if (!ASSERT_OK(spawn_thread(&child), "spawn_thread")) 336 319 return; 337 320 338 - __test_attach_api(binary, pattern, opts, child); 321 + __test_attach_api(binary, pattern, opts, &child); 339 322 } 340 323 341 324 static void test_attach_api_pattern(void) ··· 529 516 uprobe_multi__destroy(skel); 530 517 } 531 518 519 + #ifdef __x86_64__ 520 + noinline void uprobe_multi_error_func(void) 521 + { 522 + /* 523 + * If --fcf-protection=branch is enabled the gcc generates endbr as 524 + * first instruction, so marking the exact address of int3 with the 525 + * symbol to be used in the attach_uprobe_fail_trap test below. 526 + */ 527 + asm volatile ( 528 + ".globl uprobe_multi_error_func_int3; \n" 529 + "uprobe_multi_error_func_int3: \n" 530 + "int3 \n" 531 + ); 532 + } 533 + 534 + /* 535 + * Attaching uprobe on uprobe_multi_error_func results in error 536 + * because it already starts with int3 instruction. 
537 + */ 538 + static void attach_uprobe_fail_trap(struct uprobe_multi *skel) 539 + { 540 + LIBBPF_OPTS(bpf_uprobe_multi_opts, opts); 541 + const char *syms[4] = { 542 + "uprobe_multi_func_1", 543 + "uprobe_multi_func_2", 544 + "uprobe_multi_func_3", 545 + "uprobe_multi_error_func_int3", 546 + }; 547 + 548 + opts.syms = syms; 549 + opts.cnt = ARRAY_SIZE(syms); 550 + 551 + skel->links.uprobe = bpf_program__attach_uprobe_multi(skel->progs.uprobe, -1, 552 + "/proc/self/exe", NULL, &opts); 553 + if (!ASSERT_ERR_PTR(skel->links.uprobe, "bpf_program__attach_uprobe_multi")) { 554 + bpf_link__destroy(skel->links.uprobe); 555 + skel->links.uprobe = NULL; 556 + } 557 + } 558 + #else 559 + static void attach_uprobe_fail_trap(struct uprobe_multi *skel) { } 560 + #endif 561 + 562 + short sema_1 __used, sema_2 __used; 563 + 564 + static void attach_uprobe_fail_refctr(struct uprobe_multi *skel) 565 + { 566 + unsigned long *tmp_offsets = NULL, *tmp_ref_ctr_offsets = NULL; 567 + unsigned long offsets[3], ref_ctr_offsets[3]; 568 + LIBBPF_OPTS(bpf_link_create_opts, opts); 569 + const char *path = "/proc/self/exe"; 570 + const char *syms[3] = { 571 + "uprobe_multi_func_1", 572 + "uprobe_multi_func_2", 573 + }; 574 + const char *sema[3] = { 575 + "sema_1", 576 + "sema_2", 577 + }; 578 + int prog_fd, link_fd, err; 579 + 580 + prog_fd = bpf_program__fd(skel->progs.uprobe_extra); 581 + 582 + err = elf_resolve_syms_offsets("/proc/self/exe", 2, (const char **) &syms, 583 + &tmp_offsets, STT_FUNC); 584 + if (!ASSERT_OK(err, "elf_resolve_syms_offsets_func")) 585 + return; 586 + 587 + err = elf_resolve_syms_offsets("/proc/self/exe", 2, (const char **) &sema, 588 + &tmp_ref_ctr_offsets, STT_OBJECT); 589 + if (!ASSERT_OK(err, "elf_resolve_syms_offsets_sema")) 590 + goto cleanup; 591 + 592 + /* 593 + * We attach to 3 uprobes on 2 functions, so 2 uprobes share single function, 594 + * but with different ref_ctr_offset which is not allowed and results in fail. 
595 + */ 596 + offsets[0] = tmp_offsets[0]; /* uprobe_multi_func_1 */ 597 + offsets[1] = tmp_offsets[1]; /* uprobe_multi_func_2 */ 598 + offsets[2] = tmp_offsets[1]; /* uprobe_multi_func_2 */ 599 + 600 + ref_ctr_offsets[0] = tmp_ref_ctr_offsets[0]; /* sema_1 */ 601 + ref_ctr_offsets[1] = tmp_ref_ctr_offsets[1]; /* sema_2 */ 602 + ref_ctr_offsets[2] = tmp_ref_ctr_offsets[0]; /* sema_1, error */ 603 + 604 + opts.uprobe_multi.path = path; 605 + opts.uprobe_multi.offsets = (const unsigned long *) &offsets; 606 + opts.uprobe_multi.ref_ctr_offsets = (const unsigned long *) &ref_ctr_offsets; 607 + opts.uprobe_multi.cnt = 3; 608 + 609 + link_fd = bpf_link_create(prog_fd, 0, BPF_TRACE_UPROBE_MULTI, &opts); 610 + if (!ASSERT_ERR(link_fd, "link_fd")) 611 + close(link_fd); 612 + 613 + cleanup: 614 + free(tmp_ref_ctr_offsets); 615 + free(tmp_offsets); 616 + } 617 + 618 + static void test_attach_uprobe_fails(void) 619 + { 620 + struct uprobe_multi *skel = NULL; 621 + 622 + skel = uprobe_multi__open_and_load(); 623 + if (!ASSERT_OK_PTR(skel, "uprobe_multi__open_and_load")) 624 + return; 625 + 626 + /* attach fails due to adding uprobe on trap instruction, x86_64 only */ 627 + attach_uprobe_fail_trap(skel); 628 + 629 + /* attach fail due to wrong ref_ctr_offs on one of the uprobes */ 630 + attach_uprobe_fail_refctr(skel); 631 + 632 + uprobe_multi__destroy(skel); 633 + } 634 + 532 635 static void __test_link_api(struct child *child) 533 636 { 534 637 int prog_fd, link1_fd = -1, link2_fd = -1, link3_fd = -1, link4_fd = -1; ··· 724 595 725 596 static void test_link_api(void) 726 597 { 727 - struct child *child; 598 + static struct child child; 728 599 729 600 /* no pid filter */ 730 601 __test_link_api(NULL); 731 602 732 603 /* pid filter */ 733 - child = spawn_child(); 734 - if (!ASSERT_OK_PTR(child, "spawn_child")) 604 + if (!ASSERT_OK(spawn_child(&child), "spawn_child")) 735 605 return; 736 606 737 - __test_link_api(child); 607 + __test_link_api(&child); 738 608 739 609 /* pid 
filter (thread) */ 740 - child = spawn_thread(); 741 - if (!ASSERT_OK_PTR(child, "spawn_thread")) 610 + if (!ASSERT_OK(spawn_thread(&child), "spawn_thread")) 742 611 return; 743 612 744 - __test_link_api(child); 613 + __test_link_api(&child); 614 + } 615 + 616 + static struct bpf_program * 617 + get_program(struct uprobe_multi_consumers *skel, int prog) 618 + { 619 + switch (prog) { 620 + case 0: 621 + return skel->progs.uprobe_0; 622 + case 1: 623 + return skel->progs.uprobe_1; 624 + case 2: 625 + return skel->progs.uprobe_2; 626 + case 3: 627 + return skel->progs.uprobe_3; 628 + default: 629 + ASSERT_FAIL("get_program"); 630 + return NULL; 631 + } 632 + } 633 + 634 + static struct bpf_link ** 635 + get_link(struct uprobe_multi_consumers *skel, int link) 636 + { 637 + switch (link) { 638 + case 0: 639 + return &skel->links.uprobe_0; 640 + case 1: 641 + return &skel->links.uprobe_1; 642 + case 2: 643 + return &skel->links.uprobe_2; 644 + case 3: 645 + return &skel->links.uprobe_3; 646 + default: 647 + ASSERT_FAIL("get_link"); 648 + return NULL; 649 + } 650 + } 651 + 652 + static int uprobe_attach(struct uprobe_multi_consumers *skel, int idx) 653 + { 654 + struct bpf_program *prog = get_program(skel, idx); 655 + struct bpf_link **link = get_link(skel, idx); 656 + LIBBPF_OPTS(bpf_uprobe_multi_opts, opts); 657 + 658 + if (!prog || !link) 659 + return -1; 660 + 661 + /* 662 + * bit/prog: 0,1 uprobe entry 663 + * bit/prog: 2,3 uprobe return 664 + */ 665 + opts.retprobe = idx == 2 || idx == 3; 666 + 667 + *link = bpf_program__attach_uprobe_multi(prog, 0, "/proc/self/exe", 668 + "uprobe_consumer_test", 669 + &opts); 670 + if (!ASSERT_OK_PTR(*link, "bpf_program__attach_uprobe_multi")) 671 + return -1; 672 + return 0; 673 + } 674 + 675 + static void uprobe_detach(struct uprobe_multi_consumers *skel, int idx) 676 + { 677 + struct bpf_link **link = get_link(skel, idx); 678 + 679 + bpf_link__destroy(*link); 680 + *link = NULL; 681 + } 682 + 683 + static bool test_bit(int bit, 
unsigned long val) 684 + { 685 + return val & (1 << bit); 686 + } 687 + 688 + noinline int 689 + uprobe_consumer_test(struct uprobe_multi_consumers *skel, 690 + unsigned long before, unsigned long after) 691 + { 692 + int idx; 693 + 694 + /* detach uprobe for each unset programs in 'before' state ... */ 695 + for (idx = 0; idx < 4; idx++) { 696 + if (test_bit(idx, before) && !test_bit(idx, after)) 697 + uprobe_detach(skel, idx); 698 + } 699 + 700 + /* ... and attach all new programs in 'after' state */ 701 + for (idx = 0; idx < 4; idx++) { 702 + if (!test_bit(idx, before) && test_bit(idx, after)) { 703 + if (!ASSERT_OK(uprobe_attach(skel, idx), "uprobe_attach_after")) 704 + return -1; 705 + } 706 + } 707 + return 0; 708 + } 709 + 710 + static void consumer_test(struct uprobe_multi_consumers *skel, 711 + unsigned long before, unsigned long after) 712 + { 713 + int err, idx; 714 + 715 + printf("consumer_test before %lu after %lu\n", before, after); 716 + 717 + /* 'before' is each, we attach uprobe for every set idx */ 718 + for (idx = 0; idx < 4; idx++) { 719 + if (test_bit(idx, before)) { 720 + if (!ASSERT_OK(uprobe_attach(skel, idx), "uprobe_attach_before")) 721 + goto cleanup; 722 + } 723 + } 724 + 725 + err = uprobe_consumer_test(skel, before, after); 726 + if (!ASSERT_EQ(err, 0, "uprobe_consumer_test")) 727 + goto cleanup; 728 + 729 + for (idx = 0; idx < 4; idx++) { 730 + const char *fmt = "BUG"; 731 + __u64 val = 0; 732 + 733 + if (idx < 2) { 734 + /* 735 + * uprobe entry 736 + * +1 if define in 'before' 737 + */ 738 + if (test_bit(idx, before)) 739 + val++; 740 + fmt = "prog 0/1: uprobe"; 741 + } else { 742 + /* 743 + * uprobe return is tricky ;-) 744 + * 745 + * to trigger uretprobe consumer, the uretprobe needs to be installed, 746 + * which means one of the 'return' uprobes was alive when probe was hit: 747 + * 748 + * idxs: 2/3 uprobe return in 'installed' mask 749 + * 750 + * in addition if 'after' state removes everything that was installed in 751 + * 
'before' state, then uprobe kernel object goes away and return uprobe 752 + * is not installed and we won't hit it even if it's in 'after' state. 753 + */ 754 + unsigned long had_uretprobes = before & 0b1100; /* is uretprobe installed */ 755 + unsigned long probe_preserved = before & after; /* did uprobe go away */ 756 + 757 + if (had_uretprobes && probe_preserved && test_bit(idx, after)) 758 + val++; 759 + fmt = "idx 2/3: uretprobe"; 760 + } 761 + 762 + ASSERT_EQ(skel->bss->uprobe_result[idx], val, fmt); 763 + skel->bss->uprobe_result[idx] = 0; 764 + } 765 + 766 + cleanup: 767 + for (idx = 0; idx < 4; idx++) 768 + uprobe_detach(skel, idx); 769 + } 770 + 771 + static void test_consumers(void) 772 + { 773 + struct uprobe_multi_consumers *skel; 774 + int before, after; 775 + 776 + skel = uprobe_multi_consumers__open_and_load(); 777 + if (!ASSERT_OK_PTR(skel, "uprobe_multi_consumers__open_and_load")) 778 + return; 779 + 780 + /* 781 + * The idea of this test is to try all possible combinations of 782 + * uprobes consumers attached on single function. 783 + * 784 + * - 2 uprobe entry consumer 785 + * - 2 uprobe exit consumers 786 + * 787 + * The test uses 4 uprobes attached on single function, but that 788 + * translates into single uprobe with 4 consumers in kernel. 
789 + * 790 + * The before/after values present the state of attached consumers 791 + * before and after the probed function: 792 + * 793 + * bit/prog 0,1 : uprobe entry 794 + * bit/prog 2,3 : uprobe return 795 + * 796 + * For example for: 797 + * 798 + * before = 0b0101 799 + * after = 0b0110 800 + * 801 + * it means that before we call 'uprobe_consumer_test' we attach 802 + * uprobes defined in 'before' value: 803 + * 804 + * - bit/prog 0: uprobe entry 805 + * - bit/prog 2: uprobe return 806 + * 807 + * uprobe_consumer_test is called and inside it we attach and detach 808 + * uprobes based on 'after' value: 809 + * 810 + * - bit/prog 0: stays untouched 811 + * - bit/prog 2: uprobe return is detached 812 + * 813 + * uprobe_consumer_test returns and we check counters values increased 814 + * by bpf programs on each uprobe to match the expected count based on 815 + * before/after bits. 816 + */ 817 + 818 + for (before = 0; before < 16; before++) { 819 + for (after = 0; after < 16; after++) 820 + consumer_test(skel, before, after); 821 + } 822 + 823 + uprobe_multi_consumers__destroy(skel); 824 + } 825 + 826 + static struct bpf_program *uprobe_multi_program(struct uprobe_multi_pid_filter *skel, int idx) 827 + { 828 + switch (idx) { 829 + case 0: return skel->progs.uprobe_multi_0; 830 + case 1: return skel->progs.uprobe_multi_1; 831 + case 2: return skel->progs.uprobe_multi_2; 832 + } 833 + return NULL; 834 + } 835 + 836 + #define TASKS 3 837 + 838 + static void run_pid_filter(struct uprobe_multi_pid_filter *skel, bool clone_vm, bool retprobe) 839 + { 840 + LIBBPF_OPTS(bpf_uprobe_multi_opts, opts, .retprobe = retprobe); 841 + struct bpf_link *link[TASKS] = {}; 842 + struct child child[TASKS] = {}; 843 + int i; 844 + 845 + memset(skel->bss->test, 0, sizeof(skel->bss->test)); 846 + 847 + for (i = 0; i < TASKS; i++) { 848 + if (!ASSERT_OK(spawn_child_flag(&child[i], clone_vm), "spawn_child")) 849 + goto cleanup; 850 + skel->bss->pids[i] = child[i].pid; 851 + } 852 + 853 + 
for (i = 0; i < TASKS; i++) { 854 + link[i] = bpf_program__attach_uprobe_multi(uprobe_multi_program(skel, i), 855 + child[i].pid, "/proc/self/exe", 856 + "uprobe_multi_func_1", &opts); 857 + if (!ASSERT_OK_PTR(link[i], "bpf_program__attach_uprobe_multi")) 858 + goto cleanup; 859 + } 860 + 861 + for (i = 0; i < TASKS; i++) 862 + kick_child(&child[i]); 863 + 864 + for (i = 0; i < TASKS; i++) { 865 + ASSERT_EQ(skel->bss->test[i][0], 1, "pid"); 866 + ASSERT_EQ(skel->bss->test[i][1], 0, "unknown"); 867 + } 868 + 869 + cleanup: 870 + for (i = 0; i < TASKS; i++) 871 + bpf_link__destroy(link[i]); 872 + for (i = 0; i < TASKS; i++) 873 + release_child(&child[i]); 874 + } 875 + 876 + static void test_pid_filter_process(bool clone_vm) 877 + { 878 + struct uprobe_multi_pid_filter *skel; 879 + 880 + skel = uprobe_multi_pid_filter__open_and_load(); 881 + if (!ASSERT_OK_PTR(skel, "uprobe_multi_pid_filter__open_and_load")) 882 + return; 883 + 884 + run_pid_filter(skel, clone_vm, false); 885 + run_pid_filter(skel, clone_vm, true); 886 + 887 + uprobe_multi_pid_filter__destroy(skel); 745 888 } 746 889 747 890 static void test_bench_attach_uprobe(void) ··· 1104 703 test_bench_attach_usdt(); 1105 704 if (test__start_subtest("attach_api_fails")) 1106 705 test_attach_api_fails(); 706 + if (test__start_subtest("attach_uprobe_fails")) 707 + test_attach_uprobe_fails(); 708 + if (test__start_subtest("consumers")) 709 + test_consumers(); 710 + if (test__start_subtest("filter_fork")) 711 + test_pid_filter_process(false); 712 + if (test__start_subtest("filter_clone_vm")) 713 + test_pid_filter_process(true); 1107 714 }
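Note on the "consumers" subtest above: the counter that consumer_test() expects for each program follows from the before/after bitmasks alone. The sketch below restates that rule as a pure function. It is an illustration only: expected_hits() is a hypothetical name that does not appear in the selftest, and hex masks are used in place of the 0b1100 literal.

```c
#include <stdbool.h>

/* Same bit test as in the selftest, written with an unsigned long shift. */
static bool test_bit(int bit, unsigned long val)
{
	return val & (1UL << bit);
}

/*
 * Hypothetical helper: expected value of uprobe_result[idx] after one call
 * of the probed function.
 *
 * idx 0,1: entry consumers, hit once if attached in 'before'.
 * idx 2,3: return consumers, hit once only if
 *   - some return consumer was attached in 'before' (uretprobe installed),
 *   - at least one consumer survives from 'before' to 'after'
 *     (the uprobe object did not go away), and
 *   - this consumer is attached in 'after'.
 */
unsigned int expected_hits(int idx, unsigned long before, unsigned long after)
{
	unsigned long had_uretprobes = before & 0xc;	/* bits 2 and 3 */
	unsigned long probe_preserved = before & after;

	if (idx < 2)
		return test_bit(idx, before) ? 1 : 0;

	return (had_uretprobes && probe_preserved && test_bit(idx, after)) ? 1 : 0;
}
```

For the example from the test comment (before = 0b0101, after = 0b0110) this gives 1, 0, 1, 0 for idx 0..3: program 0 fires on entry, and program 2 fires on return because it is attached both before and after the call.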

+2 -1

tools/testing/selftests/bpf/prog_tests/user_ringbuf.c

··· 4 4 #define _GNU_SOURCE 5 5 #include <linux/compiler.h> 6 6 #include <linux/ring_buffer.h> 7 + #include <linux/build_bug.h> 7 8 #include <pthread.h> 8 9 #include <stdio.h> 9 10 #include <stdlib.h> ··· 643 642 if (!ASSERT_EQ(err, 0, "deferred_kick_thread\n")) 644 643 goto cleanup; 645 644 646 - /* After spawning another thread that asychronously kicks the kernel to 645 + /* After spawning another thread that asynchronously kicks the kernel to 647 646 * drain the messages, we're able to block and successfully get a 648 647 * sample once we receive an event notification. 649 648 */

+14

tools/testing/selftests/bpf/prog_tests/verifier.c

··· 21 21 #include "verifier_cgroup_inv_retcode.skel.h" 22 22 #include "verifier_cgroup_skb.skel.h" 23 23 #include "verifier_cgroup_storage.skel.h" 24 + #include "verifier_const.skel.h" 24 25 #include "verifier_const_or.skel.h" 25 26 #include "verifier_ctx.skel.h" 26 27 #include "verifier_ctx_sk_msg.skel.h" ··· 40 39 #include "verifier_int_ptr.skel.h" 41 40 #include "verifier_iterating_callbacks.skel.h" 42 41 #include "verifier_jeq_infer_not_null.skel.h" 42 + #include "verifier_jit_convergence.skel.h" 43 43 #include "verifier_ld_ind.skel.h" 44 44 #include "verifier_ldsx.skel.h" 45 45 #include "verifier_leak_ptr.skel.h" ··· 55 53 #include "verifier_movsx.skel.h" 56 54 #include "verifier_netfilter_ctx.skel.h" 57 55 #include "verifier_netfilter_retcode.skel.h" 56 + #include "verifier_bpf_fastcall.skel.h" 58 57 #include "verifier_or_jmp32_k.skel.h" 59 58 #include "verifier_precision.skel.h" 60 59 #include "verifier_prevent_map_lookup.skel.h" ··· 77 74 #include "verifier_stack_ptr.skel.h" 78 75 #include "verifier_subprog_precision.skel.h" 79 76 #include "verifier_subreg.skel.h" 77 + #include "verifier_tailcall_jit.skel.h" 80 78 #include "verifier_typedef.skel.h" 81 79 #include "verifier_uninit.skel.h" 82 80 #include "verifier_unpriv.skel.h" ··· 88 84 #include "verifier_value_or_null.skel.h" 89 85 #include "verifier_value_ptr_arith.skel.h" 90 86 #include "verifier_var_off.skel.h" 87 + #include "verifier_vfs_accept.skel.h" 88 + #include "verifier_vfs_reject.skel.h" 91 89 #include "verifier_xadd.skel.h" 92 90 #include "verifier_xdp.skel.h" 93 91 #include "verifier_xdp_direct_packet_access.skel.h" 94 92 #include "verifier_bits_iter.skel.h" 93 + #include "verifier_lsm.skel.h" 95 94 96 95 #define MAX_ENTRIES 11 97 96 ··· 147 140 void test_verifier_cgroup_inv_retcode(void) { RUN(verifier_cgroup_inv_retcode); } 148 141 void test_verifier_cgroup_skb(void) { RUN(verifier_cgroup_skb); } 149 142 void test_verifier_cgroup_storage(void) { RUN(verifier_cgroup_storage); } 143 + void 
test_verifier_const(void) { RUN(verifier_const); } 150 144 void test_verifier_const_or(void) { RUN(verifier_const_or); } 151 145 void test_verifier_ctx(void) { RUN(verifier_ctx); } 152 146 void test_verifier_ctx_sk_msg(void) { RUN(verifier_ctx_sk_msg); } ··· 166 158 void test_verifier_int_ptr(void) { RUN(verifier_int_ptr); } 167 159 void test_verifier_iterating_callbacks(void) { RUN(verifier_iterating_callbacks); } 168 160 void test_verifier_jeq_infer_not_null(void) { RUN(verifier_jeq_infer_not_null); } 161 + void test_verifier_jit_convergence(void) { RUN(verifier_jit_convergence); } 169 162 void test_verifier_ld_ind(void) { RUN(verifier_ld_ind); } 170 163 void test_verifier_ldsx(void) { RUN(verifier_ldsx); } 171 164 void test_verifier_leak_ptr(void) { RUN(verifier_leak_ptr); } ··· 181 172 void test_verifier_movsx(void) { RUN(verifier_movsx); } 182 173 void test_verifier_netfilter_ctx(void) { RUN(verifier_netfilter_ctx); } 183 174 void test_verifier_netfilter_retcode(void) { RUN(verifier_netfilter_retcode); } 175 + void test_verifier_bpf_fastcall(void) { RUN(verifier_bpf_fastcall); } 184 176 void test_verifier_or_jmp32_k(void) { RUN(verifier_or_jmp32_k); } 185 177 void test_verifier_precision(void) { RUN(verifier_precision); } 186 178 void test_verifier_prevent_map_lookup(void) { RUN(verifier_prevent_map_lookup); } ··· 203 193 void test_verifier_stack_ptr(void) { RUN(verifier_stack_ptr); } 204 194 void test_verifier_subprog_precision(void) { RUN(verifier_subprog_precision); } 205 195 void test_verifier_subreg(void) { RUN(verifier_subreg); } 196 + void test_verifier_tailcall_jit(void) { RUN(verifier_tailcall_jit); } 206 197 void test_verifier_typedef(void) { RUN(verifier_typedef); } 207 198 void test_verifier_uninit(void) { RUN(verifier_uninit); } 208 199 void test_verifier_unpriv(void) { RUN(verifier_unpriv); } ··· 213 202 void test_verifier_value_illegal_alu(void) { RUN(verifier_value_illegal_alu); } 214 203 void test_verifier_value_or_null(void) { 
RUN(verifier_value_or_null); } 215 204 void test_verifier_var_off(void) { RUN(verifier_var_off); } 205 + void test_verifier_vfs_accept(void) { RUN(verifier_vfs_accept); } 206 + void test_verifier_vfs_reject(void) { RUN(verifier_vfs_reject); } 216 207 void test_verifier_xadd(void) { RUN(verifier_xadd); } 217 208 void test_verifier_xdp(void) { RUN(verifier_xdp); } 218 209 void test_verifier_xdp_direct_packet_access(void) { RUN(verifier_xdp_direct_packet_access); } 219 210 void test_verifier_bits_iter(void) { RUN(verifier_bits_iter); } 211 + void test_verifier_lsm(void) { RUN(verifier_lsm); } 220 212 221 213 static int init_test_val_map(struct bpf_object *obj, char *map_name) 222 214 {

+31 -1

tools/testing/selftests/bpf/progs/arena_atomics.c

··· 4 4 #include <bpf/bpf_helpers.h> 5 5 #include <bpf/bpf_tracing.h> 6 6 #include <stdbool.h> 7 + #include <stdatomic.h> 7 8 #include "bpf_arena_common.h" 8 9 9 10 struct { ··· 78 77 return 0; 79 78 } 80 79 80 + #ifdef __BPF_FEATURE_ATOMIC_MEM_ORDERING 81 + _Atomic __u64 __arena_global and64_value = (0x110ull << 32); 82 + _Atomic __u32 __arena_global and32_value = 0x110; 83 + #else 81 84 __u64 __arena_global and64_value = (0x110ull << 32); 82 85 __u32 __arena_global and32_value = 0x110; 86 + #endif 83 87 84 88 SEC("raw_tp/sys_enter") 85 89 int and(const void *ctx) ··· 92 86 if (pid != (bpf_get_current_pid_tgid() >> 32)) 93 87 return 0; 94 88 #ifdef ENABLE_ATOMICS_TESTS 95 - 89 + #ifdef __BPF_FEATURE_ATOMIC_MEM_ORDERING 90 + __c11_atomic_fetch_and(&and64_value, 0x011ull << 32, memory_order_relaxed); 91 + __c11_atomic_fetch_and(&and32_value, 0x011, memory_order_relaxed); 92 + #else 96 93 __sync_fetch_and_and(&and64_value, 0x011ull << 32); 97 94 __sync_fetch_and_and(&and32_value, 0x011); 95 + #endif 98 96 #endif 99 97 100 98 return 0; 101 99 } 102 100 101 + #ifdef __BPF_FEATURE_ATOMIC_MEM_ORDERING 102 + _Atomic __u32 __arena_global or32_value = 0x110; 103 + _Atomic __u64 __arena_global or64_value = (0x110ull << 32); 104 + #else 103 105 __u32 __arena_global or32_value = 0x110; 104 106 __u64 __arena_global or64_value = (0x110ull << 32); 107 + #endif 105 108 106 109 SEC("raw_tp/sys_enter") 107 110 int or(const void *ctx) ··· 118 103 if (pid != (bpf_get_current_pid_tgid() >> 32)) 119 104 return 0; 120 105 #ifdef ENABLE_ATOMICS_TESTS 106 + #ifdef __BPF_FEATURE_ATOMIC_MEM_ORDERING 107 + __c11_atomic_fetch_or(&or64_value, 0x011ull << 32, memory_order_relaxed); 108 + __c11_atomic_fetch_or(&or32_value, 0x011, memory_order_relaxed); 109 + #else 121 110 __sync_fetch_and_or(&or64_value, 0x011ull << 32); 122 111 __sync_fetch_and_or(&or32_value, 0x011); 112 + #endif 123 113 #endif 124 114 125 115 return 0; 126 116 } 127 117 118 + #ifdef __BPF_FEATURE_ATOMIC_MEM_ORDERING 119 + 
_Atomic __u64 __arena_global xor64_value = (0x110ull << 32); 120 + _Atomic __u32 __arena_global xor32_value = 0x110; 121 + #else 128 122 __u64 __arena_global xor64_value = (0x110ull << 32); 129 123 __u32 __arena_global xor32_value = 0x110; 124 + #endif 130 125 131 126 SEC("raw_tp/sys_enter") 132 127 int xor(const void *ctx) ··· 144 119 if (pid != (bpf_get_current_pid_tgid() >> 32)) 145 120 return 0; 146 121 #ifdef ENABLE_ATOMICS_TESTS 122 + #ifdef __BPF_FEATURE_ATOMIC_MEM_ORDERING 123 + __c11_atomic_fetch_xor(&xor64_value, 0x011ull << 32, memory_order_relaxed); 124 + __c11_atomic_fetch_xor(&xor32_value, 0x011, memory_order_relaxed); 125 + #else 147 126 __sync_fetch_and_xor(&xor64_value, 0x011ull << 32); 148 127 __sync_fetch_and_xor(&xor32_value, 0x011); 128 + #endif 149 129 #endif 150 130 151 131 return 0;
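Note on the arena_atomics.c hunks above: when __BPF_FEATURE_ATOMIC_MEM_ORDERING is defined, the programs switch from the __sync_fetch_and_*() builtins to C11-style atomics with memory_order_relaxed. The BPF code uses clang's __c11_atomic_fetch_*() builtins on arena globals; the user-space sketch below is an assumption-laden analogue that uses the portable <stdatomic.h> functions instead, with hypothetical function names, to show that both styles leave the same value behind for the constants used in the test.

```c
#include <stdatomic.h>
#include <stdint.h>

/* C11 style: explicit relaxed ordering, as in the new #ifdef branch. */
uint32_t and32_relaxed(uint32_t initial, uint32_t mask)
{
	_Atomic uint32_t v;

	atomic_init(&v, initial);
	atomic_fetch_and_explicit(&v, mask, memory_order_relaxed);
	return atomic_load_explicit(&v, memory_order_relaxed);
}

/* Legacy style: GCC/Clang __sync builtin, as in the #else branch. */
uint32_t and32_sync(uint32_t initial, uint32_t mask)
{
	uint32_t v = initial;

	__sync_fetch_and_and(&v, mask);
	return v;
}

/* Same comparison for the 64-bit OR and XOR cases. */
uint64_t or64_relaxed(uint64_t initial, uint64_t bits)
{
	_Atomic uint64_t v;

	atomic_init(&v, initial);
	atomic_fetch_or_explicit(&v, bits, memory_order_relaxed);
	return atomic_load_explicit(&v, memory_order_relaxed);
}

uint64_t xor64_relaxed(uint64_t initial, uint64_t bits)
{
	_Atomic uint64_t v;

	atomic_init(&v, initial);
	atomic_fetch_xor_explicit(&v, bits, memory_order_relaxed);
	return atomic_load_explicit(&v, memory_order_relaxed);
}
```

With the values from the diff, 0x110 AND 0x011 yields 0x010, OR yields 0x111 and XOR yields 0x101 (shifted left by 32 in the 64-bit variants). The sketch says nothing about how the BPF JIT lowers either form.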

+3 -3

tools/testing/selftests/bpf/progs/bpf_cubic.c

··· 1 1 // SPDX-License-Identifier: GPL-2.0-only 2 2 3 - /* WARNING: This implemenation is not necessarily the same 3 + /* WARNING: This implementation is not necessarily the same 4 4 * as the tcp_cubic.c. The purpose is mainly for testing 5 5 * the kernel BPF logic. 6 6 * ··· 314 314 * (so time^3 is done by using 64 bit) 315 315 * and without the support of division of 64bit numbers 316 316 * (so all divisions are done by using 32 bit) 317 - * also NOTE the unit of those veriables 317 + * also NOTE the unit of those variables 318 318 * time = (t - K) / 2^bictcp_HZ 319 319 * c = bic_scale >> 10 320 320 * rtt = (srtt >> 3) / HZ ··· 507 507 __u32 delay; 508 508 509 509 bpf_cubic_acked_called = 1; 510 - /* Some calls are for duplicates without timetamps */ 510 + /* Some calls are for duplicates without timestamps */ 511 511 if (sample->rtt_us < 0) 512 512 return; 513 513

+4 -4

tools/testing/selftests/bpf/progs/bpf_dctcp.c

··· 26 26 27 27 char _license[] SEC("license") = "GPL"; 28 28 29 - volatile const char fallback[TCP_CA_NAME_MAX]; 29 + volatile const char fallback_cc[TCP_CA_NAME_MAX]; 30 30 const char bpf_dctcp[] = "bpf_dctcp"; 31 31 const char tcp_cdg[] = "cdg"; 32 32 char cc_res[TCP_CA_NAME_MAX]; ··· 71 71 struct bpf_dctcp *ca = inet_csk_ca(sk); 72 72 int *stg; 73 73 74 - if (!(tp->ecn_flags & TCP_ECN_OK) && fallback[0]) { 74 + if (!(tp->ecn_flags & TCP_ECN_OK) && fallback_cc[0]) { 75 75 /* Switch to fallback */ 76 76 if (bpf_setsockopt(sk, SOL_TCP, TCP_CONGESTION, 77 - (void *)fallback, sizeof(fallback)) == -EBUSY) 77 + (void *)fallback_cc, sizeof(fallback_cc)) == -EBUSY) 78 78 ebusy_cnt++; 79 79 80 80 /* Switch back to myself and the recurred bpf_dctcp_init() ··· 87 87 88 88 /* Switch back to fallback */ 89 89 if (bpf_setsockopt(sk, SOL_TCP, TCP_CONGESTION, 90 - (void *)fallback, sizeof(fallback)) == -EBUSY) 90 + (void *)fallback_cc, sizeof(fallback_cc)) == -EBUSY) 91 91 ebusy_cnt++; 92 92 93 93 /* Expecting -ENOTSUPP for tcp_cdg_res */

+58 -6

tools/testing/selftests/bpf/progs/bpf_misc.h

··· 2 2 #ifndef __BPF_MISC_H__ 3 3 #define __BPF_MISC_H__ 4 4 5 + #define XSTR(s) STR(s) 6 + #define STR(s) #s 7 + 5 8 /* This set of attributes controls behavior of the 6 9 * test_loader.c:test_loader__run_subtests(). 7 10 * ··· 25 22 * 26 23 * __msg Message expected to be found in the verifier log. 27 24 * Multiple __msg attributes could be specified. 25 + * To match a regular expression use "{{" "}}" brackets, 26 + * e.g. "foo{{[0-9]+}}" matches strings like "foo007". 27 + * Extended POSIX regular expression syntax is allowed 28 + * inside the brackets. 28 29 * __msg_unpriv Same as __msg but for unprivileged mode. 29 30 * 30 - * __regex Same as __msg, but using a regular expression. 31 - * __regex_unpriv Same as __msg_unpriv but using a regular expression. 31 + * __xlated Expect a line in a disassembly log after verifier applies rewrites. 32 + * Multiple __xlated attributes could be specified. 33 + * Regular expressions could be specified same way as in __msg. 34 + * __xlated_unpriv Same as __xlated but for unprivileged mode. 35 + * 36 + * __jited Match a line in a disassembly of the jited BPF program. 37 + * Has to be used after __arch_* macro. 38 + * For example: 39 + * 40 + * __arch_x86_64 41 + * __jited(" endbr64") 42 + * __jited(" nopl (%rax,%rax)") 43 + * __jited(" xorq %rax, %rax") 44 + * ... 45 + * __naked void some_test(void) 46 + * { 47 + * asm volatile (... ::: __clobber_all); 48 + * } 49 + * 50 + * Regular expressions could be included in patterns same way 51 + * as in __msg. 52 + * 53 + * By default assume that each pattern has to be matched on the 54 + * next consecutive line of disassembly, e.g.: 55 + * 56 + * __jited(" endbr64") # matched on line N 57 + * __jited(" nopl (%rax,%rax)") # matched on line N+1 58 + * 59 + * If match occurs on a wrong line an error is reported. 
60 + * To override this behaviour use literal "...", e.g.: 61 + * 62 + * __jited(" endbr64") # matched on line N 63 + * __jited("...") # not matched 64 + * __jited(" nopl (%rax,%rax)") # matched on any line >= N 65 + * 66 + * __jited_unpriv Same as __jited but for unprivileged mode. 67 + * 32 68 * 33 69 * __success Expect program load success in privileged mode. 34 70 * __success_unpriv Expect program load success in unprivileged mode. ··· 102 60 * __auxiliary Annotated program is not a separate test, but used as auxiliary 103 61 * for some other test cases and should always be loaded. 104 62 * __auxiliary_unpriv Same, but load program in unprivileged mode. 63 + * 64 + * __arch_* Specify on which architecture the test case should be tested. 65 + * Several __arch_* annotations could be specified at once. 66 + * When test case is not run on current arch it is marked as skipped. 105 67 */ 106 - #define __msg(msg) __attribute__((btf_decl_tag("comment:test_expect_msg=" msg))) 107 - #define __regex(regex) __attribute__((btf_decl_tag("comment:test_expect_regex=" regex))) 68 + #define __msg(msg) __attribute__((btf_decl_tag("comment:test_expect_msg=" XSTR(__COUNTER__) "=" msg))) 69 + #define __xlated(msg) __attribute__((btf_decl_tag("comment:test_expect_xlated=" XSTR(__COUNTER__) "=" msg))) 70 + #define __jited(msg) __attribute__((btf_decl_tag("comment:test_jited=" XSTR(__COUNTER__) "=" msg))) 108 71 #define __failure __attribute__((btf_decl_tag("comment:test_expect_failure"))) 109 72 #define __success __attribute__((btf_decl_tag("comment:test_expect_success"))) 110 73 #define __description(desc) __attribute__((btf_decl_tag("comment:test_description=" desc))) 111 - #define __msg_unpriv(msg) __attribute__((btf_decl_tag("comment:test_expect_msg_unpriv=" msg))) 112 - #define __regex_unpriv(regex) __attribute__((btf_decl_tag("comment:test_expect_regex_unpriv=" regex))) 74 + #define __msg_unpriv(msg) __attribute__((btf_decl_tag("comment:test_expect_msg_unpriv=" XSTR(__COUNTER__) 
"=" msg))) 75 + #define __xlated_unpriv(msg) __attribute__((btf_decl_tag("comment:test_expect_xlated_unpriv=" XSTR(__COUNTER__) "=" msg))) 76 + #define __jited_unpriv(msg) __attribute__((btf_decl_tag("comment:test_jited=" XSTR(__COUNTER__) "=" msg))) 113 77 #define __failure_unpriv __attribute__((btf_decl_tag("comment:test_expect_failure_unpriv"))) 114 78 #define __success_unpriv __attribute__((btf_decl_tag("comment:test_expect_success_unpriv"))) 115 79 #define __log_level(lvl) __attribute__((btf_decl_tag("comment:test_log_level="#lvl))) ··· 125 77 #define __auxiliary __attribute__((btf_decl_tag("comment:test_auxiliary"))) 126 78 #define __auxiliary_unpriv __attribute__((btf_decl_tag("comment:test_auxiliary_unpriv"))) 127 79 #define __btf_path(path) __attribute__((btf_decl_tag("comment:test_btf_path=" path))) 80 + #define __arch(arch) __attribute__((btf_decl_tag("comment:test_arch=" arch))) 81 + #define __arch_x86_64 __arch("X86_64") 82 + #define __arch_arm64 __arch("ARM64") 83 + #define __arch_riscv64 __arch("RISCV64") 128 84 129 85 /* Convenience macro for use with 'asm volatile' blocks */ 130 86 #define __naked __attribute__((naked))

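The `{{` `}}` convention described in the bpf_misc.h comment above can be illustrated with a small user-space sketch. This is not the selftest loader's code: `pattern_to_ere()` and `line_matches()` are hypothetical helpers, and the sketch assumes that text outside the brackets is meant to match literally while text inside is passed to a POSIX extended regex unchanged. It also assumes the embedded regex itself contains no `}}` sequence.

```c
#include <assert.h>
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: translate an expectation such as
 * "foo{{[0-9]+}}" into a single POSIX extended regex. Characters
 * outside "{{" "}}" are escaped so they match literally; characters
 * inside are copied through as written.
 */
static bool pattern_to_ere(const char *pat, char *out, size_t out_sz)
{
	size_t n = 0;
	bool in_regex = false;

	while (*pat) {
		if (!in_regex && strncmp(pat, "{{", 2) == 0) {
			in_regex = true;
			pat += 2;
			continue;
		}
		if (in_regex && strncmp(pat, "}}", 2) == 0) {
			in_regex = false;
			pat += 2;
			continue;
		}
		/* Leave room for an escape, the character and the NUL. */
		if (n + 3 > out_sz)
			return false;
		if (!in_regex && strchr("^.[$()|*+?{\\", *pat))
			out[n++] = '\\';
		out[n++] = *pat++;
	}
	out[n] = '\0';
	return !in_regex;	/* an unterminated "{{" is rejected */
}

/* Returns true when some part of `line` matches `pat`. */
static bool line_matches(const char *pat, const char *line)
{
	char ere[512];
	regex_t re;
	bool ok;

	if (!pattern_to_ere(pat, ere, sizeof(ere)))
		return false;
	if (regcomp(&re, ere, REG_EXTENDED | REG_NOSUB) != 0)
		return false;
	ok = regexec(&re, line, 0, NULL, 0) == 0;
	regfree(&re);
	return ok;
}

/* Example use with the expectation quoted in the comment. */
static void example(void)
{
	assert(line_matches("foo{{[0-9]+}}", "foo007"));
	assert(!line_matches("foo{{[0-9]+}}", "foobar"));
}
```

Escaping the literal part is what lets an expectation such as `r1 = *(u64 *)(r1 +0)` be written without worrying about `*`, `(` or `+` being treated as regex operators.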
-2

tools/testing/selftests/bpf/progs/bpf_syscall_macro.c

··· 43 43 44 44 /* test for PT_REGS_PARM */ 45 45 46 - #if !defined(bpf_target_arm64) && !defined(bpf_target_s390) 47 46 bpf_probe_read_kernel(&tmp, sizeof(tmp), &PT_REGS_PARM1_SYSCALL(real_regs)); 48 - #endif 49 47 arg1 = tmp; 50 48 bpf_probe_read_kernel(&arg2, sizeof(arg2), &PT_REGS_PARM2_SYSCALL(real_regs)); 51 49 bpf_probe_read_kernel(&arg3, sizeof(arg3), &PT_REGS_PARM3_SYSCALL(real_regs));

-2

tools/testing/selftests/bpf/progs/cg_storage_multi.h

··· 3 3 #ifndef __PROGS_CG_STORAGE_MULTI_H 4 4 #define __PROGS_CG_STORAGE_MULTI_H 5 5 6 - #include <asm/types.h> 7 - 8 6 struct cgroup_value { 9 7 __u32 egress_pkts; 10 8 __u32 ingress_pkts;

+40

tools/testing/selftests/bpf/progs/cgroup_ancestor.c

··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + // Copyright (c) 2018 Facebook 3 + 4 + #include <vmlinux.h> 5 + #include <bpf/bpf_helpers.h> 6 + #include <bpf/bpf_core_read.h> 7 + #include "bpf_tracing_net.h" 8 + #define NUM_CGROUP_LEVELS 4 9 + 10 + __u64 cgroup_ids[NUM_CGROUP_LEVELS]; 11 + __u16 dport; 12 + 13 + static __always_inline void log_nth_level(struct __sk_buff *skb, __u32 level) 14 + { 15 + /* [1] &level passed to external function that may change it, it's 16 + * incompatible with loop unroll. 17 + */ 18 + cgroup_ids[level] = bpf_skb_ancestor_cgroup_id(skb, level); 19 + } 20 + 21 + SEC("tc") 22 + int log_cgroup_id(struct __sk_buff *skb) 23 + { 24 + struct sock *sk = (void *)skb->sk; 25 + 26 + if (!sk) 27 + return TC_ACT_OK; 28 + 29 + sk = bpf_core_cast(sk, struct sock); 30 + if (sk->sk_protocol == IPPROTO_UDP && sk->sk_dport == dport) { 31 + log_nth_level(skb, 0); 32 + log_nth_level(skb, 1); 33 + log_nth_level(skb, 2); 34 + log_nth_level(skb, 3); 35 + } 36 + 37 + return TC_ACT_OK; 38 + } 39 + 40 + char _license[] SEC("license") = "GPL";

+24

tools/testing/selftests/bpf/progs/cgroup_storage.c

··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + 3 + #include <linux/bpf.h> 4 + #include <bpf/bpf_helpers.h> 5 + 6 + struct { 7 + __uint(type, BPF_MAP_TYPE_CGROUP_STORAGE); 8 + __type(key, struct bpf_cgroup_storage_key); 9 + __type(value, __u64); 10 + } cgroup_storage SEC(".maps"); 11 + 12 + SEC("cgroup_skb/egress") 13 + int bpf_prog(struct __sk_buff *skb) 14 + { 15 + __u64 *counter; 16 + 17 + counter = bpf_get_local_storage(&cgroup_storage, 0); 18 + __sync_fetch_and_add(counter, 1); 19 + 20 + /* Drop one out of every two packets */ 21 + return (*counter & 1); 22 + } 23 + 24 + char _license[] SEC("license") = "GPL";

+2 -2

tools/testing/selftests/bpf/progs/dev_cgroup.c

··· 41 41 bpf_trace_printk(fmt, sizeof(fmt), ctx->major, ctx->minor); 42 42 #endif 43 43 44 - /* Allow access to /dev/zero and /dev/random. 44 + /* Allow access to /dev/null and /dev/urandom. 45 45 * Forbid everything else. 46 46 */ 47 47 if (ctx->major != 1 || type != BPF_DEVCG_DEV_CHAR) 48 48 return 0; 49 49 50 50 switch (ctx->minor) { 51 - case 5: /* 1:5 /dev/zero */ 51 + case 3: /* 1:3 /dev/null */ 52 52 case 9: /* 1:9 /dev/urandom */ 53 53 return 1; 54 54 }

+3 -3

tools/testing/selftests/bpf/progs/dynptr_fail.c

··· 965 965 * mem_or_null pointers. 966 966 */ 967 967 SEC("?raw_tp") 968 - __failure __regex("R[0-9]+ type=scalar expected=percpu_ptr_") 968 + __failure __msg("R{{[0-9]+}} type=scalar expected=percpu_ptr_") 969 969 int dynptr_invalidate_slice_or_null(void *ctx) 970 970 { 971 971 struct bpf_dynptr ptr; ··· 983 983 984 984 /* Destruction of dynptr should also invalidate any slices obtained from it */ 985 985 SEC("?raw_tp") 986 - __failure __regex("R[0-9]+ invalid mem access 'scalar'") 986 + __failure __msg("R{{[0-9]+}} invalid mem access 'scalar'") 987 987 int dynptr_invalidate_slice_failure(void *ctx) 988 988 { 989 989 struct bpf_dynptr ptr1; ··· 1070 1070 1071 1071 /* bpf_dynptr_slice()s are read-only and cannot be written to */ 1072 1072 SEC("?tc") 1073 - __failure __regex("R[0-9]+ cannot write into rdonly_mem") 1073 + __failure __msg("R{{[0-9]+}} cannot write into rdonly_mem") 1074 1074 int skb_invalid_slice_write(struct __sk_buff *skb) 1075 1075 { 1076 1076 struct bpf_dynptr ptr;

+82

tools/testing/selftests/bpf/progs/epilogue_exit.c

··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + /* Copyright (c) 2024 Meta Platforms, Inc. and affiliates. */ 3 + 4 + #include <vmlinux.h> 5 + #include <bpf/bpf_tracing.h> 6 + #include "bpf_misc.h" 7 + #include "../bpf_testmod/bpf_testmod.h" 8 + #include "../bpf_testmod/bpf_testmod_kfunc.h" 9 + 10 + char _license[] SEC("license") = "GPL"; 11 + 12 + __success 13 + /* save __u64 *ctx to stack */ 14 + __xlated("0: *(u64 *)(r10 -8) = r1") 15 + /* main prog */ 16 + __xlated("1: r1 = *(u64 *)(r1 +0)") 17 + __xlated("2: r2 = *(u64 *)(r1 +0)") 18 + __xlated("3: r3 = 0") 19 + __xlated("4: r4 = 1") 20 + __xlated("5: if r2 == 0x0 goto pc+10") 21 + __xlated("6: r0 = 0") 22 + __xlated("7: *(u64 *)(r1 +0) = r3") 23 + /* epilogue */ 24 + __xlated("8: r1 = *(u64 *)(r10 -8)") 25 + __xlated("9: r1 = *(u64 *)(r1 +0)") 26 + __xlated("10: r6 = *(u64 *)(r1 +0)") 27 + __xlated("11: r6 += 10000") 28 + __xlated("12: *(u64 *)(r1 +0) = r6") 29 + __xlated("13: r0 = r6") 30 + __xlated("14: r0 *= 2") 31 + __xlated("15: exit") 32 + /* 2nd part of the main prog after the first exit */ 33 + __xlated("16: *(u64 *)(r1 +0) = r4") 34 + __xlated("17: r0 = 1") 35 + /* Clear the r1 to ensure it does not have 36 + * off-by-1 error and ensure it jumps back to the 37 + * beginning of epilogue which initializes 38 + * the r1 with the ctx ptr. 
39 + */ 40 + __xlated("18: r1 = 0") 41 + __xlated("19: gotol pc-12") 42 + SEC("struct_ops/test_epilogue_exit") 43 + __naked int test_epilogue_exit(void) 44 + { 45 + asm volatile ( 46 + "r1 = *(u64 *)(r1 +0);" 47 + "r2 = *(u64 *)(r1 +0);" 48 + "r3 = 0;" 49 + "r4 = 1;" 50 + "if r2 == 0 goto +3;" 51 + "r0 = 0;" 52 + "*(u64 *)(r1 + 0) = r3;" 53 + "exit;" 54 + "*(u64 *)(r1 + 0) = r4;" 55 + "r0 = 1;" 56 + "r1 = 0;" 57 + "exit;" 58 + ::: __clobber_all); 59 + } 60 + 61 + SEC(".struct_ops.link") 62 + struct bpf_testmod_st_ops epilogue_exit = { 63 + .test_epilogue = (void *)test_epilogue_exit, 64 + }; 65 + 66 + SEC("syscall") 67 + __retval(20000) 68 + int syscall_epilogue_exit0(void *ctx) 69 + { 70 + struct st_ops_args args = { .a = 1 }; 71 + 72 + return bpf_kfunc_st_ops_test_epilogue(&args); 73 + } 74 + 75 + SEC("syscall") 76 + __retval(20002) 77 + int syscall_epilogue_exit1(void *ctx) 78 + { 79 + struct st_ops_args args = {}; 80 + 81 + return bpf_kfunc_st_ops_test_epilogue(&args); 82 + }
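The two `__retval()` expectations in epilogue_exit.c can be checked by hand against the `__xlated` listing. The sketch below is a plain-C model of that listing, not kernel or test-module code: `struct args_model` is a stand-in for `struct st_ops_args` (only the field `a` is modelled, and its width is an assumption), and both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the test module's argument struct; only the field the
 * program touches is modelled.
 */
struct args_model {
	uint64_t a;
};

/* Epilogue as listed in insns 8..15: add 10000 to a, return twice it. */
static uint64_t epilogue_model(struct args_model *args)
{
	args->a += 10000;
	return args->a * 2;
}

/* Main program: both "exit" instructions end up in the epilogue, which
 * overwrites r0, so only the value stored into a matters.
 */
static uint64_t test_epilogue_exit_model(struct args_model *args)
{
	if (args->a != 0)
		args->a = 0;	/* path of the first exit: a = r3 */
	else
		args->a = 1;	/* path of the second exit: a = r4 */
	return epilogue_model(args);
}

static void example(void)
{
	struct args_model one = { .a = 1 };
	struct args_model zero = { .a = 0 };

	assert(test_epilogue_exit_model(&one) == 20000);
	assert(test_epilogue_exit_model(&zero) == 20002);
}
```

With `.a = 1` the store of 0 gives (0 + 10000) * 2 = 20000; with a zeroed struct the store of 1 gives (1 + 10000) * 2 = 20002, matching `syscall_epilogue_exit0` and `syscall_epilogue_exit1`.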

+58

tools/testing/selftests/bpf/progs/epilogue_tailcall.c

··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + /* Copyright (c) 2024 Meta Platforms, Inc. and affiliates. */ 3 + 4 + #include <vmlinux.h> 5 + #include <bpf/bpf_tracing.h> 6 + #include "bpf_misc.h" 7 + #include "../bpf_testmod/bpf_testmod.h" 8 + #include "../bpf_testmod/bpf_testmod_kfunc.h" 9 + 10 + char _license[] SEC("license") = "GPL"; 11 + 12 + static __noinline __used int subprog(struct st_ops_args *args) 13 + { 14 + args->a += 1; 15 + return args->a; 16 + } 17 + 18 + SEC("struct_ops/test_epilogue_subprog") 19 + int BPF_PROG(test_epilogue_subprog, struct st_ops_args *args) 20 + { 21 + subprog(args); 22 + return args->a; 23 + } 24 + 25 + struct { 26 + __uint(type, BPF_MAP_TYPE_PROG_ARRAY); 27 + __uint(max_entries, 1); 28 + __uint(key_size, sizeof(__u32)); 29 + __uint(value_size, sizeof(__u32)); 30 + __array(values, void (void)); 31 + } epilogue_map SEC(".maps") = { 32 + .values = { 33 + [0] = (void *)&test_epilogue_subprog, 34 + } 35 + }; 36 + 37 + SEC("struct_ops/test_epilogue_tailcall") 38 + int test_epilogue_tailcall(unsigned long long *ctx) 39 + { 40 + bpf_tail_call(ctx, &epilogue_map, 0); 41 + return 0; 42 + } 43 + 44 + SEC(".struct_ops.link") 45 + struct bpf_testmod_st_ops epilogue_tailcall = { 46 + .test_epilogue = (void *)test_epilogue_tailcall, 47 + }; 48 + 49 + SEC(".struct_ops.link") 50 + struct bpf_testmod_st_ops epilogue_subprog = { 51 + .test_epilogue = (void *)test_epilogue_subprog, 52 + }; 53 + 54 + SEC("syscall") 55 + int syscall_epilogue_tailcall(struct st_ops_args *args) 56 + { 57 + return bpf_kfunc_st_ops_test_epilogue(args); 58 + }

+10

tools/testing/selftests/bpf/progs/err.h

··· 5 5 #define MAX_ERRNO 4095 6 6 #define IS_ERR_VALUE(x) (unsigned long)(void *)(x) >= (unsigned long)-MAX_ERRNO 7 7 8 + #define __STR(x) #x 9 + 10 + #define set_if_not_errno_or_zero(x, y) \ 11 + ({ \ 12 + asm volatile ("if %0 s< -4095 goto +1\n" \ 13 + "if %0 s<= 0 goto +1\n" \ 14 + "%0 = " __STR(y) "\n" \ 15 + : "+r"(x)); \ 16 + }) 17 + 8 18 static inline int IS_ERR_OR_NULL(const void *ptr) 9 19 { 10 20 return !ptr || IS_ERR_VALUE((unsigned long)ptr);
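The control flow of `set_if_not_errno_or_zero()` is easy to misread because both branches use `goto +1`. A plain-C model of the three instructions, written here only as an illustration (the function name is hypothetical and `long` stands in for the register), reads as follows:

```c
#include <assert.h>

#define MAX_ERRNO 4095

/* Model of the asm in set_if_not_errno_or_zero(x, y):
 *   if x s< -4095 goto +1   skips the second test, so "x = y" runs
 *   if x s<= 0    goto +1   skips the assignment, so x is kept
 *   x = y
 * x keeps its value only when it is zero or in the errno range.
 */
static long set_if_not_errno_or_zero_model(long x, long y)
{
	if (x < -MAX_ERRNO || x > 0)
		return y;
	return x;
}

static void example(void)
{
	assert(set_if_not_errno_or_zero_model(0, 1) == 0);	/* zero kept */
	assert(set_if_not_errno_or_zero_model(-22, 1) == -22);	/* errno kept */
	assert(set_if_not_errno_or_zero_model(7, 1) == 1);	/* replaced */
}
```

Because the macro pastes `y` into the asm text through `__STR(y)`, it appears to expect a literal constant for `y` rather than a variable or another macro.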

+4 -22

tools/testing/selftests/bpf/progs/get_cgroup_id_kern.c

··· 4 4 #include <linux/bpf.h> 5 5 #include <bpf/bpf_helpers.h> 6 6 7 - struct { 8 - __uint(type, BPF_MAP_TYPE_ARRAY); 9 - __uint(max_entries, 1); 10 - __type(key, __u32); 11 - __type(value, __u64); 12 - } cg_ids SEC(".maps"); 13 - 14 - struct { 15 - __uint(type, BPF_MAP_TYPE_ARRAY); 16 - __uint(max_entries, 1); 17 - __type(key, __u32); 18 - __type(value, __u32); 19 - } pidmap SEC(".maps"); 7 + __u64 cg_id; 8 + __u64 expected_pid; 20 9 21 10 SEC("tracepoint/syscalls/sys_enter_nanosleep") 22 11 int trace(void *ctx) 23 12 { 24 13 __u32 pid = bpf_get_current_pid_tgid(); 25 - __u32 key = 0, *expected_pid; 26 - __u64 *val; 27 14 28 - expected_pid = bpf_map_lookup_elem(&pidmap, &key); 29 - if (!expected_pid || *expected_pid != pid) 30 - return 0; 31 - 32 - val = bpf_map_lookup_elem(&cg_ids, &key); 33 - if (val) 34 - *val = bpf_get_current_cgroup_id(); 15 + if (expected_pid == pid) 16 + cg_id = bpf_get_current_cgroup_id(); 35 17 36 18 return 0; 37 19 }

+125

tools/testing/selftests/bpf/progs/iters_testmod.c

··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + 3 + #include "vmlinux.h" 4 + #include "bpf_experimental.h" 5 + #include <bpf/bpf_helpers.h> 6 + #include "bpf_misc.h" 7 + #include "../bpf_testmod/bpf_testmod_kfunc.h" 8 + 9 + char _license[] SEC("license") = "GPL"; 10 + 11 + SEC("raw_tp/sys_enter") 12 + __success 13 + int iter_next_trusted(const void *ctx) 14 + { 15 + struct task_struct *cur_task = bpf_get_current_task_btf(); 16 + struct bpf_iter_task_vma vma_it; 17 + struct vm_area_struct *vma_ptr; 18 + 19 + bpf_iter_task_vma_new(&vma_it, cur_task, 0); 20 + 21 + vma_ptr = bpf_iter_task_vma_next(&vma_it); 22 + if (vma_ptr == NULL) 23 + goto out; 24 + 25 + bpf_kfunc_trusted_vma_test(vma_ptr); 26 + out: 27 + bpf_iter_task_vma_destroy(&vma_it); 28 + return 0; 29 + } 30 + 31 + SEC("raw_tp/sys_enter") 32 + __failure __msg("Possibly NULL pointer passed to trusted arg0") 33 + int iter_next_trusted_or_null(const void *ctx) 34 + { 35 + struct task_struct *cur_task = bpf_get_current_task_btf(); 36 + struct bpf_iter_task_vma vma_it; 37 + struct vm_area_struct *vma_ptr; 38 + 39 + bpf_iter_task_vma_new(&vma_it, cur_task, 0); 40 + 41 + vma_ptr = bpf_iter_task_vma_next(&vma_it); 42 + 43 + bpf_kfunc_trusted_vma_test(vma_ptr); 44 + 45 + bpf_iter_task_vma_destroy(&vma_it); 46 + return 0; 47 + } 48 + 49 + SEC("raw_tp/sys_enter") 50 + __success 51 + int iter_next_rcu(const void *ctx) 52 + { 53 + struct task_struct *cur_task = bpf_get_current_task_btf(); 54 + struct bpf_iter_task task_it; 55 + struct task_struct *task_ptr; 56 + 57 + bpf_iter_task_new(&task_it, cur_task, 0); 58 + 59 + task_ptr = bpf_iter_task_next(&task_it); 60 + if (task_ptr == NULL) 61 + goto out; 62 + 63 + bpf_kfunc_rcu_task_test(task_ptr); 64 + out: 65 + bpf_iter_task_destroy(&task_it); 66 + return 0; 67 + } 68 + 69 + SEC("raw_tp/sys_enter") 70 + __failure __msg("Possibly NULL pointer passed to trusted arg0") 71 + int iter_next_rcu_or_null(const void *ctx) 72 + { 73 + struct task_struct *cur_task = 
bpf_get_current_task_btf(); 74 + struct bpf_iter_task task_it; 75 + struct task_struct *task_ptr; 76 + 77 + bpf_iter_task_new(&task_it, cur_task, 0); 78 + 79 + task_ptr = bpf_iter_task_next(&task_it); 80 + 81 + bpf_kfunc_rcu_task_test(task_ptr); 82 + 83 + bpf_iter_task_destroy(&task_it); 84 + return 0; 85 + } 86 + 87 + SEC("raw_tp/sys_enter") 88 + __failure __msg("R1 must be referenced or trusted") 89 + int iter_next_rcu_not_trusted(const void *ctx) 90 + { 91 + struct task_struct *cur_task = bpf_get_current_task_btf(); 92 + struct bpf_iter_task task_it; 93 + struct task_struct *task_ptr; 94 + 95 + bpf_iter_task_new(&task_it, cur_task, 0); 96 + 97 + task_ptr = bpf_iter_task_next(&task_it); 98 + if (task_ptr == NULL) 99 + goto out; 100 + 101 + bpf_kfunc_trusted_task_test(task_ptr); 102 + out: 103 + bpf_iter_task_destroy(&task_it); 104 + return 0; 105 + } 106 + 107 + SEC("raw_tp/sys_enter") 108 + __failure __msg("R1 cannot write into rdonly_mem") 109 + /* Message should not be 'R1 cannot write into rdonly_trusted_mem' */ 110 + int iter_next_ptr_mem_not_trusted(const void *ctx) 111 + { 112 + struct bpf_iter_num num_it; 113 + int *num_ptr; 114 + 115 + bpf_iter_num_new(&num_it, 0, 10); 116 + 117 + num_ptr = bpf_iter_num_next(&num_it); 118 + if (num_ptr == NULL) 119 + goto out; 120 + 121 + bpf_kfunc_trusted_num_test(num_ptr); 122 + out: 123 + bpf_iter_num_destroy(&num_it); 124 + return 0; 125 + }

+50

tools/testing/selftests/bpf/progs/iters_testmod_seq.c

··· 12 12 13 13 extern int bpf_iter_testmod_seq_new(struct bpf_iter_testmod_seq *it, s64 value, int cnt) __ksym; 14 14 extern s64 *bpf_iter_testmod_seq_next(struct bpf_iter_testmod_seq *it) __ksym; 15 + extern s64 bpf_iter_testmod_seq_value(int blah, struct bpf_iter_testmod_seq *it) __ksym; 15 16 extern void bpf_iter_testmod_seq_destroy(struct bpf_iter_testmod_seq *it) __ksym; 16 17 17 18 const volatile __s64 exp_empty = 0 + 1; ··· 75 74 res_truncated = sum; 76 75 77 76 return 0; 77 + } 78 + 79 + SEC("?raw_tp") 80 + __failure 81 + __msg("expected an initialized iter_testmod_seq as arg #2") 82 + int testmod_seq_getter_before_bad(const void *ctx) 83 + { 84 + struct bpf_iter_testmod_seq it; 85 + 86 + return bpf_iter_testmod_seq_value(0, &it); 87 + } 88 + 89 + SEC("?raw_tp") 90 + __failure 91 + __msg("expected an initialized iter_testmod_seq as arg #2") 92 + int testmod_seq_getter_after_bad(const void *ctx) 93 + { 94 + struct bpf_iter_testmod_seq it; 95 + s64 sum = 0, *v; 96 + 97 + bpf_iter_testmod_seq_new(&it, 100, 100); 98 + 99 + while ((v = bpf_iter_testmod_seq_next(&it))) { 100 + sum += *v; 101 + } 102 + 103 + bpf_iter_testmod_seq_destroy(&it); 104 + 105 + return sum + bpf_iter_testmod_seq_value(0, &it); 106 + } 107 + 108 + SEC("?socket") 109 + __success __retval(1000000) 110 + int testmod_seq_getter_good(const void *ctx) 111 + { 112 + struct bpf_iter_testmod_seq it; 113 + s64 sum = 0, *v; 114 + 115 + bpf_iter_testmod_seq_new(&it, 100, 100); 116 + 117 + while ((v = bpf_iter_testmod_seq_next(&it))) { 118 + sum += *v; 119 + } 120 + 121 + sum *= bpf_iter_testmod_seq_value(0, &it); 122 + 123 + bpf_iter_testmod_seq_destroy(&it); 124 + 125 + return sum; 78 126 } 79 127 80 128 char _license[] SEC("license") = "GPL";

+7

tools/testing/selftests/bpf/progs/kfunc_call_fail.c

··· 150 150 return ret; 151 151 } 152 152 153 + SEC("?tc") 154 + int kfunc_call_test_pointer_arg_type_mismatch(struct __sk_buff *skb) 155 + { 156 + bpf_kfunc_call_test_pass_ctx((void *)10); 157 + return 0; 158 + } 159 + 153 160 char _license[] SEC("license") = "GPL";

+28 -2

tools/testing/selftests/bpf/progs/local_kptr_stash.c

··· 8 8 #include "../bpf_experimental.h" 9 9 #include "../bpf_testmod/bpf_testmod_kfunc.h" 10 10 11 + struct plain_local; 12 + 11 13 struct node_data { 12 14 long key; 13 15 long data; 16 + struct plain_local __kptr * stashed_in_local_kptr; 14 17 struct bpf_rb_node node; 15 18 }; 16 19 ··· 88 85 89 86 static int create_and_stash(int idx, int val) 90 87 { 88 + struct plain_local *inner_local_kptr; 91 89 struct map_value *mapval; 92 90 struct node_data *res; 93 91 ··· 96 92 if (!mapval) 97 93 return 1; 98 94 95 + inner_local_kptr = bpf_obj_new(typeof(*inner_local_kptr)); 96 + if (!inner_local_kptr) 97 + return 2; 98 + 99 99 res = bpf_obj_new(typeof(*res)); 100 - if (!res) 101 - return 1; 100 + if (!res) { 101 + bpf_obj_drop(inner_local_kptr); 102 + return 3; 103 + } 102 104 res->key = val; 105 + 106 + inner_local_kptr = bpf_kptr_xchg(&res->stashed_in_local_kptr, inner_local_kptr); 107 + if (inner_local_kptr) { 108 + /* Should never happen, we just obj_new'd res */ 109 + bpf_obj_drop(inner_local_kptr); 110 + bpf_obj_drop(res); 111 + return 4; 112 + } 103 113 104 114 res = bpf_kptr_xchg(&mapval->node, res); 105 115 if (res) ··· 187 169 SEC("tc") 188 170 long unstash_rb_node(void *ctx) 189 171 { 172 + struct plain_local *inner_local_kptr = NULL; 190 173 struct map_value *mapval; 191 174 struct node_data *res; 192 175 long retval; ··· 199 180 200 181 res = bpf_kptr_xchg(&mapval->node, NULL); 201 182 if (res) { 183 + inner_local_kptr = bpf_kptr_xchg(&res->stashed_in_local_kptr, inner_local_kptr); 184 + if (!inner_local_kptr) { 185 + bpf_obj_drop(res); 186 + return 1; 187 + } 188 + bpf_obj_drop(inner_local_kptr); 189 + 202 190 retval = res->key; 203 191 bpf_obj_drop(res); 204 192 return retval;

+34

tools/testing/selftests/bpf/progs/lsm_tailcall.c

··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + /* Copyright (c) 2024 Huawei Technologies Co., Ltd */ 3 + 4 + #include "vmlinux.h" 5 + #include <errno.h> 6 + #include <bpf/bpf_helpers.h> 7 + 8 + char _license[] SEC("license") = "GPL"; 9 + 10 + struct { 11 + __uint(type, BPF_MAP_TYPE_PROG_ARRAY); 12 + __uint(max_entries, 1); 13 + __uint(key_size, sizeof(__u32)); 14 + __uint(value_size, sizeof(__u32)); 15 + } jmp_table SEC(".maps"); 16 + 17 + SEC("lsm/file_permission") 18 + int lsm_file_permission_prog(void *ctx) 19 + { 20 + return 0; 21 + } 22 + 23 + SEC("lsm/file_alloc_security") 24 + int lsm_file_alloc_security_prog(void *ctx) 25 + { 26 + return 0; 27 + } 28 + 29 + SEC("lsm/file_alloc_security") 30 + int lsm_file_alloc_security_entry(void *ctx) 31 + { 32 + bpf_tail_call_static(ctx, &jmp_table, 0); 33 + return 0; 34 + }

+57

tools/testing/selftests/bpf/progs/mmap_inner_array.c

··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + /* Copyright (c) 2024 Meta Platforms, Inc. and affiliates. */ 3 + 4 + #include "vmlinux.h" 5 + #include <bpf/bpf_helpers.h> 6 + 7 + #include "bpf_misc.h" 8 + 9 + char _license[] SEC("license") = "GPL"; 10 + 11 + struct inner_array_type { 12 + __uint(type, BPF_MAP_TYPE_ARRAY); 13 + __uint(map_flags, BPF_F_MMAPABLE); 14 + __type(key, __u32); 15 + __type(value, __u64); 16 + __uint(max_entries, 1); 17 + } inner_array SEC(".maps"); 18 + 19 + struct { 20 + __uint(type, BPF_MAP_TYPE_HASH_OF_MAPS); 21 + __uint(key_size, 4); 22 + __uint(value_size, 4); 23 + __uint(max_entries, 1); 24 + __array(values, struct inner_array_type); 25 + } outer_map SEC(".maps"); 26 + 27 + int pid = 0; 28 + __u64 match_value = 0x13572468; 29 + bool done = false; 30 + bool pid_match = false; 31 + bool outer_map_match = false; 32 + 33 + SEC("fentry/" SYS_PREFIX "sys_nanosleep") 34 + int add_to_list_in_inner_array(void *ctx) 35 + { 36 + __u32 curr_pid, zero = 0; 37 + struct bpf_map *map; 38 + __u64 *value; 39 + 40 + curr_pid = (u32)bpf_get_current_pid_tgid(); 41 + if (done || curr_pid != pid) 42 + return 0; 43 + 44 + pid_match = true; 45 + map = bpf_map_lookup_elem(&outer_map, &curr_pid); 46 + if (!map) 47 + return 0; 48 + 49 + outer_map_match = true; 50 + value = bpf_map_lookup_elem(map, &zero); 51 + if (!value) 52 + return 0; 53 + 54 + *value = match_value; 55 + done = true; 56 + return 0; 57 + }

+33

tools/testing/selftests/bpf/progs/nested_acquire.c

··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + 3 + #include <vmlinux.h> 4 + #include <bpf/bpf_tracing.h> 5 + #include <bpf/bpf_helpers.h> 6 + #include "bpf_misc.h" 7 + #include "../bpf_testmod/bpf_testmod_kfunc.h" 8 + 9 + char _license[] SEC("license") = "GPL"; 10 + 11 + SEC("tp_btf/tcp_probe") 12 + __success 13 + int BPF_PROG(test_nested_acquire_nonzero, struct sock *sk, struct sk_buff *skb) 14 + { 15 + struct sk_buff *ptr; 16 + 17 + ptr = bpf_kfunc_nested_acquire_nonzero_offset_test(&sk->sk_write_queue); 18 + 19 + bpf_kfunc_nested_release_test(ptr); 20 + return 0; 21 + } 22 + 23 + SEC("tp_btf/tcp_probe") 24 + __success 25 + int BPF_PROG(test_nested_acquire_zero, struct sock *sk, struct sk_buff *skb) 26 + { 27 + struct sk_buff *ptr; 28 + 29 + ptr = bpf_kfunc_nested_acquire_zero_offset_test(&sk->__sk_common); 30 + 31 + bpf_kfunc_nested_release_test(ptr); 32 + return 0; 33 + }

+154

tools/testing/selftests/bpf/progs/pro_epilogue.c

··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + /* Copyright (c) 2024 Meta Platforms, Inc. and affiliates. */ 3 + 4 + #include <vmlinux.h> 5 + #include <bpf/bpf_tracing.h> 6 + #include "bpf_misc.h" 7 + #include "../bpf_testmod/bpf_testmod.h" 8 + #include "../bpf_testmod/bpf_testmod_kfunc.h" 9 + 10 + char _license[] SEC("license") = "GPL"; 11 + 12 + void __kfunc_btf_root(void) 13 + { 14 + bpf_kfunc_st_ops_inc10(NULL); 15 + } 16 + 17 + static __noinline __used int subprog(struct st_ops_args *args) 18 + { 19 + args->a += 1; 20 + return args->a; 21 + } 22 + 23 + __success 24 + /* prologue */ 25 + __xlated("0: r6 = *(u64 *)(r1 +0)") 26 + __xlated("1: r7 = *(u64 *)(r6 +0)") 27 + __xlated("2: r7 += 1000") 28 + __xlated("3: *(u64 *)(r6 +0) = r7") 29 + /* main prog */ 30 + __xlated("4: r1 = *(u64 *)(r1 +0)") 31 + __xlated("5: r6 = r1") 32 + __xlated("6: call kernel-function") 33 + __xlated("7: r1 = r6") 34 + __xlated("8: call pc+1") 35 + __xlated("9: exit") 36 + SEC("struct_ops/test_prologue") 37 + __naked int test_prologue(void) 38 + { 39 + asm volatile ( 40 + "r1 = *(u64 *)(r1 +0);" 41 + "r6 = r1;" 42 + "call %[bpf_kfunc_st_ops_inc10];" 43 + "r1 = r6;" 44 + "call subprog;" 45 + "exit;" 46 + : 47 + : __imm(bpf_kfunc_st_ops_inc10) 48 + : __clobber_all); 49 + } 50 + 51 + __success 52 + /* save __u64 *ctx to stack */ 53 + __xlated("0: *(u64 *)(r10 -8) = r1") 54 + /* main prog */ 55 + __xlated("1: r1 = *(u64 *)(r1 +0)") 56 + __xlated("2: r6 = r1") 57 + __xlated("3: call kernel-function") 58 + __xlated("4: r1 = r6") 59 + __xlated("5: call pc+") 60 + /* epilogue */ 61 + __xlated("6: r1 = *(u64 *)(r10 -8)") 62 + __xlated("7: r1 = *(u64 *)(r1 +0)") 63 + __xlated("8: r6 = *(u64 *)(r1 +0)") 64 + __xlated("9: r6 += 10000") 65 + __xlated("10: *(u64 *)(r1 +0) = r6") 66 + __xlated("11: r0 = r6") 67 + __xlated("12: r0 *= 2") 68 + __xlated("13: exit") 69 + SEC("struct_ops/test_epilogue") 70 + __naked int test_epilogue(void) 71 + { 72 + asm volatile ( 73 + "r1 = *(u64 *)(r1 +0);" 74 + "r6 
= r1;" 75 + "call %[bpf_kfunc_st_ops_inc10];" 76 + "r1 = r6;" 77 + "call subprog;" 78 + "exit;" 79 + : 80 + : __imm(bpf_kfunc_st_ops_inc10) 81 + : __clobber_all); 82 + } 83 + 84 + __success 85 + /* prologue */ 86 + __xlated("0: r6 = *(u64 *)(r1 +0)") 87 + __xlated("1: r7 = *(u64 *)(r6 +0)") 88 + __xlated("2: r7 += 1000") 89 + __xlated("3: *(u64 *)(r6 +0) = r7") 90 + /* save __u64 *ctx to stack */ 91 + __xlated("4: *(u64 *)(r10 -8) = r1") 92 + /* main prog */ 93 + __xlated("5: r1 = *(u64 *)(r1 +0)") 94 + __xlated("6: r6 = r1") 95 + __xlated("7: call kernel-function") 96 + __xlated("8: r1 = r6") 97 + __xlated("9: call pc+") 98 + /* epilogue */ 99 + __xlated("10: r1 = *(u64 *)(r10 -8)") 100 + __xlated("11: r1 = *(u64 *)(r1 +0)") 101 + __xlated("12: r6 = *(u64 *)(r1 +0)") 102 + __xlated("13: r6 += 10000") 103 + __xlated("14: *(u64 *)(r1 +0) = r6") 104 + __xlated("15: r0 = r6") 105 + __xlated("16: r0 *= 2") 106 + __xlated("17: exit") 107 + SEC("struct_ops/test_pro_epilogue") 108 + __naked int test_pro_epilogue(void) 109 + { 110 + asm volatile ( 111 + "r1 = *(u64 *)(r1 +0);" 112 + "r6 = r1;" 113 + "call %[bpf_kfunc_st_ops_inc10];" 114 + "r1 = r6;" 115 + "call subprog;" 116 + "exit;" 117 + : 118 + : __imm(bpf_kfunc_st_ops_inc10) 119 + : __clobber_all); 120 + } 121 + 122 + SEC("syscall") 123 + __retval(1011) /* PROLOGUE_A [1000] + KFUNC_INC10 + SUBPROG_A [1] */ 124 + int syscall_prologue(void *ctx) 125 + { 126 + struct st_ops_args args = {}; 127 + 128 + return bpf_kfunc_st_ops_test_prologue(&args); 129 + } 130 + 131 + SEC("syscall") 132 + __retval(20022) /* (KFUNC_INC10 + SUBPROG_A [1] + EPILOGUE_A [10000]) * 2 */ 133 + int syscall_epilogue(void *ctx) 134 + { 135 + struct st_ops_args args = {}; 136 + 137 + return bpf_kfunc_st_ops_test_epilogue(&args); 138 + } 139 + 140 + SEC("syscall") 141 + __retval(22022) /* (PROLOGUE_A [1000] + KFUNC_INC10 + SUBPROG_A [1] + EPILOGUE_A [10000]) * 2 */ 142 + int syscall_pro_epilogue(void *ctx) 143 + { 144 + struct st_ops_args args = {}; 
145 + 146 + return bpf_kfunc_st_ops_test_pro_epilogue(&args); 147 + } 148 + 149 + SEC(".struct_ops.link") 150 + struct bpf_testmod_st_ops pro_epilogue = { 151 + .test_prologue = (void *)test_prologue, 152 + .test_epilogue = (void *)test_epilogue, 153 + .test_pro_epilogue = (void *)test_pro_epilogue, 154 + };
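The three `__retval()` values in pro_epilogue.c follow from the comments next to them. As a worked check, the sketch below spells the arithmetic out with compile-time assertions. The constant names are taken from those comments; `KFUNC_INC10 = 10` is an inference from the kfunc's name and from the totals, not something confirmed by the kfunc's source.

```c
#include <assert.h>

/* Contributions to args->a, named after the __retval() comments. */
enum {
	PROLOGUE_A  = 1000,	/* added by the generated prologue */
	KFUNC_INC10 = 10,	/* assumed increment of the kfunc */
	SUBPROG_A   = 1,	/* added by subprog() */
	EPILOGUE_A  = 10000,	/* added by the generated epilogue */
};

/* Prologue only: the program returns args->a unchanged. */
static_assert(PROLOGUE_A + KFUNC_INC10 + SUBPROG_A == 1011,
	      "syscall_prologue");

/* Epilogue only: the epilogue doubles the value it returns. */
static_assert((KFUNC_INC10 + SUBPROG_A + EPILOGUE_A) * 2 == 20022,
	      "syscall_epilogue");

/* Both: the prologue's 1000 is included before doubling. */
static_assert((PROLOGUE_A + KFUNC_INC10 + SUBPROG_A + EPILOGUE_A) * 2 == 22022,
	      "syscall_pro_epilogue");
```

The doubling corresponds to the `r0 *= 2` line at the end of each epilogue listing.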

+149

tools/testing/selftests/bpf/progs/pro_epilogue_goto_start.c

··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + /* Copyright (c) 2024 Meta Platforms, Inc. and affiliates. */ 3 + 4 + #include <vmlinux.h> 5 + #include <bpf/bpf_tracing.h> 6 + #include "bpf_misc.h" 7 + #include "../bpf_testmod/bpf_testmod.h" 8 + #include "../bpf_testmod/bpf_testmod_kfunc.h" 9 + 10 + char _license[] SEC("license") = "GPL"; 11 + 12 + __success 13 + /* prologue */ 14 + __xlated("0: r6 = *(u64 *)(r1 +0)") 15 + __xlated("1: r7 = *(u64 *)(r6 +0)") 16 + __xlated("2: r7 += 1000") 17 + __xlated("3: *(u64 *)(r6 +0) = r7") 18 + /* main prog */ 19 + __xlated("4: if r1 == 0x0 goto pc+5") 20 + __xlated("5: if r1 == 0x1 goto pc+2") 21 + __xlated("6: r1 = 1") 22 + __xlated("7: goto pc-3") 23 + __xlated("8: r1 = 0") 24 + __xlated("9: goto pc-6") 25 + __xlated("10: r0 = 0") 26 + __xlated("11: exit") 27 + SEC("struct_ops/test_prologue_goto_start") 28 + __naked int test_prologue_goto_start(void) 29 + { 30 + asm volatile ( 31 + "if r1 == 0 goto +5;" 32 + "if r1 == 1 goto +2;" 33 + "r1 = 1;" 34 + "goto -3;" 35 + "r1 = 0;" 36 + "goto -6;" 37 + "r0 = 0;" 38 + "exit;" 39 + ::: __clobber_all); 40 + } 41 + 42 + __success 43 + /* save __u64 *ctx to stack */ 44 + __xlated("0: *(u64 *)(r10 -8) = r1") 45 + /* main prog */ 46 + __xlated("1: if r1 == 0x0 goto pc+5") 47 + __xlated("2: if r1 == 0x1 goto pc+2") 48 + __xlated("3: r1 = 1") 49 + __xlated("4: goto pc-3") 50 + __xlated("5: r1 = 0") 51 + __xlated("6: goto pc-6") 52 + __xlated("7: r0 = 0") 53 + /* epilogue */ 54 + __xlated("8: r1 = *(u64 *)(r10 -8)") 55 + __xlated("9: r1 = *(u64 *)(r1 +0)") 56 + __xlated("10: r6 = *(u64 *)(r1 +0)") 57 + __xlated("11: r6 += 10000") 58 + __xlated("12: *(u64 *)(r1 +0) = r6") 59 + __xlated("13: r0 = r6") 60 + __xlated("14: r0 *= 2") 61 + __xlated("15: exit") 62 + SEC("struct_ops/test_epilogue_goto_start") 63 + __naked int test_epilogue_goto_start(void) 64 + { 65 + asm volatile ( 66 + "if r1 == 0 goto +5;" 67 + "if r1 == 1 goto +2;" 68 + "r1 = 1;" 69 + "goto -3;" 70 + "r1 = 0;" 71 + "goto -6;" 
72 + "r0 = 0;" 73 + "exit;" 74 + ::: __clobber_all); 75 + } 76 + 77 + __success 78 + /* prologue */ 79 + __xlated("0: r6 = *(u64 *)(r1 +0)") 80 + __xlated("1: r7 = *(u64 *)(r6 +0)") 81 + __xlated("2: r7 += 1000") 82 + __xlated("3: *(u64 *)(r6 +0) = r7") 83 + /* save __u64 *ctx to stack */ 84 + __xlated("4: *(u64 *)(r10 -8) = r1") 85 + /* main prog */ 86 + __xlated("5: if r1 == 0x0 goto pc+5") 87 + __xlated("6: if r1 == 0x1 goto pc+2") 88 + __xlated("7: r1 = 1") 89 + __xlated("8: goto pc-3") 90 + __xlated("9: r1 = 0") 91 + __xlated("10: goto pc-6") 92 + __xlated("11: r0 = 0") 93 + /* epilogue */ 94 + __xlated("12: r1 = *(u64 *)(r10 -8)") 95 + __xlated("13: r1 = *(u64 *)(r1 +0)") 96 + __xlated("14: r6 = *(u64 *)(r1 +0)") 97 + __xlated("15: r6 += 10000") 98 + __xlated("16: *(u64 *)(r1 +0) = r6") 99 + __xlated("17: r0 = r6") 100 + __xlated("18: r0 *= 2") 101 + __xlated("19: exit") 102 + SEC("struct_ops/test_pro_epilogue_goto_start") 103 + __naked int test_pro_epilogue_goto_start(void) 104 + { 105 + asm volatile ( 106 + "if r1 == 0 goto +5;" 107 + "if r1 == 1 goto +2;" 108 + "r1 = 1;" 109 + "goto -3;" 110 + "r1 = 0;" 111 + "goto -6;" 112 + "r0 = 0;" 113 + "exit;" 114 + ::: __clobber_all); 115 + } 116 + 117 + SEC(".struct_ops.link") 118 + struct bpf_testmod_st_ops epilogue_goto_start = { 119 + .test_prologue = (void *)test_prologue_goto_start, 120 + .test_epilogue = (void *)test_epilogue_goto_start, 121 + .test_pro_epilogue = (void *)test_pro_epilogue_goto_start, 122 + }; 123 + 124 + SEC("syscall") 125 + __retval(0) 126 + int syscall_prologue_goto_start(void *ctx) 127 + { 128 + struct st_ops_args args = {}; 129 + 130 + return bpf_kfunc_st_ops_test_prologue(&args); 131 + } 132 + 133 + SEC("syscall") 134 + __retval(20000) /* (EPILOGUE_A [10000]) * 2 */ 135 + int syscall_epilogue_goto_start(void *ctx) 136 + { 137 + struct st_ops_args args = {}; 138 + 139 + return bpf_kfunc_st_ops_test_epilogue(&args); 140 + } 141 + 142 + SEC("syscall") 143 + __retval(22000) /* (PROLOGUE_A 
[1000] + EPILOGUE_A [10000]) * 2 */ 144 + int syscall_pro_epilogue_goto_start(void *ctx) 145 + { 146 + struct st_ops_args args = {}; 147 + 148 + return bpf_kfunc_st_ops_test_pro_epilogue(&args); 149 + }

+1 -1

tools/testing/selftests/bpf/progs/rbtree_fail.c

··· 105 105 } 106 106 107 107 SEC("?tc") 108 - __failure __regex("Unreleased reference id=3 alloc_insn=[0-9]+") 108 + __failure __msg("Unreleased reference id=3 alloc_insn={{[0-9]+}}") 109 109 long rbtree_api_remove_no_drop(void *ctx) 110 110 { 111 111 struct bpf_rb_node *res;

+8 -1

tools/testing/selftests/bpf/progs/read_vsyscall.c

··· 1 1 // SPDX-License-Identifier: GPL-2.0 2 2 /* Copyright (C) 2024. Huawei Technologies Co., Ltd */ 3 + #include "vmlinux.h" 3 4 #include <linux/types.h> 4 5 #include <bpf/bpf_helpers.h> 5 6 ··· 8 7 9 8 int target_pid = 0; 10 9 void *user_ptr = 0; 11 - int read_ret[8]; 10 + int read_ret[9]; 12 11 13 12 char _license[] SEC("license") = "GPL"; 13 + 14 + /* 15 + * This is the only kfunc, the others are helpers 16 + */ 17 + int bpf_copy_from_user_str(void *dst, u32, const void *, u64) __weak __ksym; 14 18 15 19 SEC("fentry/" SYS_PREFIX "sys_nanosleep") 16 20 int do_probe_read(void *ctx) ··· 46 40 read_ret[6] = bpf_copy_from_user(buf, sizeof(buf), user_ptr); 47 41 read_ret[7] = bpf_copy_from_user_task(buf, sizeof(buf), user_ptr, 48 42 bpf_get_current_task_btf(), 0); 43 + read_ret[8] = bpf_copy_from_user_str((char *)buf, sizeof(buf), user_ptr, 0); 49 44 50 45 return 0; 51 46 }

+2 -2

tools/testing/selftests/bpf/progs/refcounted_kptr_fail.c

··· 32 32 } 33 33 34 34 SEC("?tc") 35 - __failure __regex("Unreleased reference id=4 alloc_insn=[0-9]+") 35 + __failure __msg("Unreleased reference id=4 alloc_insn={{[0-9]+}}") 36 36 long rbtree_refcounted_node_ref_escapes(void *ctx) 37 37 { 38 38 struct node_acquire *n, *m; ··· 73 73 } 74 74 75 75 SEC("?tc") 76 - __failure __regex("Unreleased reference id=3 alloc_insn=[0-9]+") 76 + __failure __msg("Unreleased reference id=3 alloc_insn={{[0-9]+}}") 77 77 long rbtree_refcounted_node_ref_escapes_owning_input(void *ctx) 78 78 { 79 79 struct node_acquire *n, *m;

+2 -2

tools/testing/selftests/bpf/progs/strobemeta.h

··· 373 373 len = bpf_probe_read_user_str(&data->payload[off], STROBE_MAX_STR_LEN, value->ptr); 374 374 /* 375 375 * if bpf_probe_read_user_str returns error (<0), due to casting to 376 - * unsinged int, it will become big number, so next check is 376 + * unsigned int, it will become big number, so next check is 377 377 * sufficient to check for errors AND prove to BPF verifier, that 378 378 * bpf_probe_read_user_str won't return anything bigger than 379 379 * STROBE_MAX_STR_LEN ··· 557 557 return NULL; 558 558 559 559 payload_off = ctx.payload_off; 560 - /* this should not really happen, here only to satisfy verifer */ 560 + /* this should not really happen, here only to satisfy verifier */ 561 561 if (payload_off > sizeof(data->payload)) 562 562 payload_off = sizeof(data->payload); 563 563 #else

+2 -1

tools/testing/selftests/bpf/progs/syscall.c

··· 8 8 #include <linux/btf.h> 9 9 #include <string.h> 10 10 #include <errno.h> 11 + #include "bpf_misc.h" 11 12 12 13 char _license[] SEC("license") = "GPL"; 13 14 ··· 120 119 static __u64 value = 34; 121 120 static union bpf_attr prog_load_attr = { 122 121 .prog_type = BPF_PROG_TYPE_XDP, 123 - .insn_cnt = sizeof(insns) / sizeof(insns[0]), 122 + .insn_cnt = ARRAY_SIZE(insns), 124 123 }; 125 124 int ret; 126 125
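
Note on the hunk above: the open-coded `sizeof(insns) / sizeof(insns[0])` is replaced by `ARRAY_SIZE(insns)`, which the selftest now gets from `bpf_misc.h`. That header is not shown in this diff, so the following is only a minimal sketch that assumes the conventional kernel-style definition; `demo_insns` and `demo_insn_cnt` are hypothetical names used for illustration.

```c
#include <stddef.h>

/* Illustrative stand-in, assumed to match the usual kernel-style macro. */
#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))

/* Hypothetical instruction buffer; only its element count matters here. */
static const unsigned long long demo_insns[7] = { 0 };

/* Element count of the array, independent of the element type. */
static size_t demo_insn_cnt(void)
{
	return ARRAY_SIZE(demo_insns);
}
```

The macro only works on true arrays; applied to a pointer it would silently divide the pointer size by the element size, which is one reason a shared, named macro is easier to audit than repeated divisions.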

+34

tools/testing/selftests/bpf/progs/tailcall_bpf2bpf_hierarchy1.c

··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + #include <linux/bpf.h> 3 + #include <bpf/bpf_helpers.h> 4 + #include "bpf_legacy.h" 5 + 6 + struct { 7 + __uint(type, BPF_MAP_TYPE_PROG_ARRAY); 8 + __uint(max_entries, 1); 9 + __uint(key_size, sizeof(__u32)); 10 + __uint(value_size, sizeof(__u32)); 11 + } jmp_table SEC(".maps"); 12 + 13 + int count = 0; 14 + 15 + static __noinline 16 + int subprog_tail(struct __sk_buff *skb) 17 + { 18 + bpf_tail_call_static(skb, &jmp_table, 0); 19 + return 0; 20 + } 21 + 22 + SEC("tc") 23 + int entry(struct __sk_buff *skb) 24 + { 25 + int ret = 1; 26 + 27 + count++; 28 + subprog_tail(skb); 29 + subprog_tail(skb); 30 + 31 + return ret; 32 + } 33 + 34 + char __license[] SEC("license") = "GPL";

+70

tools/testing/selftests/bpf/progs/tailcall_bpf2bpf_hierarchy2.c

··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + #include <linux/bpf.h> 3 + #include <bpf/bpf_helpers.h> 4 + #include "bpf_misc.h" 5 + 6 + int classifier_0(struct __sk_buff *skb); 7 + int classifier_1(struct __sk_buff *skb); 8 + 9 + struct { 10 + __uint(type, BPF_MAP_TYPE_PROG_ARRAY); 11 + __uint(max_entries, 2); 12 + __uint(key_size, sizeof(__u32)); 13 + __array(values, void (void)); 14 + } jmp_table SEC(".maps") = { 15 + .values = { 16 + [0] = (void *) &classifier_0, 17 + [1] = (void *) &classifier_1, 18 + }, 19 + }; 20 + 21 + int count0 = 0; 22 + int count1 = 0; 23 + 24 + static __noinline 25 + int subprog_tail0(struct __sk_buff *skb) 26 + { 27 + bpf_tail_call_static(skb, &jmp_table, 0); 28 + return 0; 29 + } 30 + 31 + __auxiliary 32 + SEC("tc") 33 + int classifier_0(struct __sk_buff *skb) 34 + { 35 + count0++; 36 + subprog_tail0(skb); 37 + return 0; 38 + } 39 + 40 + static __noinline 41 + int subprog_tail1(struct __sk_buff *skb) 42 + { 43 + bpf_tail_call_static(skb, &jmp_table, 1); 44 + return 0; 45 + } 46 + 47 + __auxiliary 48 + SEC("tc") 49 + int classifier_1(struct __sk_buff *skb) 50 + { 51 + count1++; 52 + subprog_tail1(skb); 53 + return 0; 54 + } 55 + 56 + __success 57 + __retval(33) 58 + SEC("tc") 59 + int tailcall_bpf2bpf_hierarchy_2(struct __sk_buff *skb) 60 + { 61 + int ret = 0; 62 + 63 + subprog_tail0(skb); 64 + subprog_tail1(skb); 65 + 66 + __sink(ret); 67 + return (count1 << 16) | count0; 68 + } 69 + 70 + char __license[] SEC("license") = "GPL";

+62

tools/testing/selftests/bpf/progs/tailcall_bpf2bpf_hierarchy3.c

··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + #include <linux/bpf.h> 3 + #include <bpf/bpf_helpers.h> 4 + #include "bpf_misc.h" 5 + 6 + int classifier_0(struct __sk_buff *skb); 7 + 8 + struct { 9 + __uint(type, BPF_MAP_TYPE_PROG_ARRAY); 10 + __uint(max_entries, 1); 11 + __uint(key_size, sizeof(__u32)); 12 + __array(values, void (void)); 13 + } jmp_table0 SEC(".maps") = { 14 + .values = { 15 + [0] = (void *) &classifier_0, 16 + }, 17 + }; 18 + 19 + struct { 20 + __uint(type, BPF_MAP_TYPE_PROG_ARRAY); 21 + __uint(max_entries, 1); 22 + __uint(key_size, sizeof(__u32)); 23 + __array(values, void (void)); 24 + } jmp_table1 SEC(".maps") = { 25 + .values = { 26 + [0] = (void *) &classifier_0, 27 + }, 28 + }; 29 + 30 + int count = 0; 31 + 32 + static __noinline 33 + int subprog_tail(struct __sk_buff *skb, void *jmp_table) 34 + { 35 + bpf_tail_call_static(skb, jmp_table, 0); 36 + return 0; 37 + } 38 + 39 + __auxiliary 40 + SEC("tc") 41 + int classifier_0(struct __sk_buff *skb) 42 + { 43 + count++; 44 + subprog_tail(skb, &jmp_table0); 45 + subprog_tail(skb, &jmp_table1); 46 + return count; 47 + } 48 + 49 + __success 50 + __retval(33) 51 + SEC("tc") 52 + int tailcall_bpf2bpf_hierarchy_3(struct __sk_buff *skb) 53 + { 54 + int ret = 0; 55 + 56 + bpf_tail_call_static(skb, &jmp_table0, 0); 57 + 58 + __sink(ret); 59 + return ret; 60 + } 61 + 62 + char __license[] SEC("license") = "GPL";

+35

tools/testing/selftests/bpf/progs/tailcall_bpf2bpf_hierarchy_fentry.c

··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + /* Copyright Leon Hwang */ 3 + 4 + #include "vmlinux.h" 5 + #include <bpf/bpf_helpers.h> 6 + #include <bpf/bpf_tracing.h> 7 + 8 + struct { 9 + __uint(type, BPF_MAP_TYPE_PROG_ARRAY); 10 + __uint(max_entries, 1); 11 + __uint(key_size, sizeof(__u32)); 12 + __uint(value_size, sizeof(__u32)); 13 + } jmp_table SEC(".maps"); 14 + 15 + int count = 0; 16 + 17 + static __noinline 18 + int subprog_tail(void *ctx) 19 + { 20 + bpf_tail_call_static(ctx, &jmp_table, 0); 21 + return 0; 22 + } 23 + 24 + SEC("fentry/dummy") 25 + int BPF_PROG(fentry, struct sk_buff *skb) 26 + { 27 + count++; 28 + subprog_tail(ctx); 29 + subprog_tail(ctx); 30 + 31 + return 0; 32 + } 33 + 34 + 35 + char _license[] SEC("license") = "GPL";

+23

tools/testing/selftests/bpf/progs/tailcall_freplace.c

··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + 3 + #include <linux/bpf.h> 4 + #include <bpf/bpf_helpers.h> 5 + 6 + struct { 7 + __uint(type, BPF_MAP_TYPE_PROG_ARRAY); 8 + __uint(max_entries, 1); 9 + __uint(key_size, sizeof(__u32)); 10 + __uint(value_size, sizeof(__u32)); 11 + } jmp_table SEC(".maps"); 12 + 13 + int count = 0; 14 + 15 + SEC("freplace") 16 + int entry_freplace(struct __sk_buff *skb) 17 + { 18 + count++; 19 + bpf_tail_call_static(skb, &jmp_table, 0); 20 + return count; 21 + } 22 + 23 + char __license[] SEC("license") = "GPL";

+54 -2

tools/testing/selftests/bpf/progs/task_kfunc_success.c

··· 5 5 #include <bpf/bpf_tracing.h> 6 6 #include <bpf/bpf_helpers.h> 7 7 8 + #include "../bpf_experimental.h" 8 9 #include "task_kfunc_common.h" 9 10 10 11 char _license[] SEC("license") = "GPL"; ··· 143 142 SEC("tp_btf/task_newtask") 144 143 int BPF_PROG(test_task_xchg_release, struct task_struct *task, u64 clone_flags) 145 144 { 146 - struct task_struct *kptr; 147 - struct __tasks_kfunc_map_value *v; 145 + struct task_struct *kptr, *acquired; 146 + struct __tasks_kfunc_map_value *v, *local; 147 + int refcnt, refcnt_after_drop; 148 148 long status; 149 149 150 150 if (!is_test_kfunc_task()) ··· 166 164 kptr = bpf_kptr_xchg(&v->task, NULL); 167 165 if (!kptr) { 168 166 err = 3; 167 + return 0; 168 + } 169 + 170 + local = bpf_obj_new(typeof(*local)); 171 + if (!local) { 172 + err = 4; 173 + bpf_task_release(kptr); 174 + return 0; 175 + } 176 + 177 + kptr = bpf_kptr_xchg(&local->task, kptr); 178 + if (kptr) { 179 + err = 5; 180 + bpf_obj_drop(local); 181 + bpf_task_release(kptr); 182 + return 0; 183 + } 184 + 185 + kptr = bpf_kptr_xchg(&local->task, NULL); 186 + if (!kptr) { 187 + err = 6; 188 + bpf_obj_drop(local); 189 + return 0; 190 + } 191 + 192 + /* Stash a copy into local kptr and check if it is released recursively */ 193 + acquired = bpf_task_acquire(kptr); 194 + if (!acquired) { 195 + err = 7; 196 + bpf_obj_drop(local); 197 + bpf_task_release(kptr); 198 + return 0; 199 + } 200 + bpf_probe_read_kernel(&refcnt, sizeof(refcnt), &acquired->rcu_users); 201 + 202 + acquired = bpf_kptr_xchg(&local->task, acquired); 203 + if (acquired) { 204 + err = 8; 205 + bpf_obj_drop(local); 206 + bpf_task_release(kptr); 207 + bpf_task_release(acquired); 208 + return 0; 209 + } 210 + 211 + bpf_obj_drop(local); 212 + 213 + bpf_probe_read_kernel(&refcnt_after_drop, sizeof(refcnt_after_drop), &kptr->rcu_users); 214 + if (refcnt != refcnt_after_drop + 1) { 215 + err = 9; 216 + bpf_task_release(kptr); 169 217 return 0; 170 218 } 171 219
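
Note on the hunk above: the added checks stash an acquired task reference in a locally allocated object, drop that object, and then expect the task's reference count to be exactly one lower than before the drop. The sketch below is a plain user-space model of that bookkeeping, not kernel code; every `model_*` name is hypothetical, and the single `refs` counter merely stands in for `rcu_users`.

```c
#include <stddef.h>

/* Hypothetical stand-ins for a refcounted task and a container holding one. */
struct model_task { int refs; };
struct model_box { struct model_task *task; };

static struct model_task *model_acquire(struct model_task *t)
{
	t->refs++;
	return t;
}

static void model_release(struct model_task *t)
{
	t->refs--;
}

/* Store a new pointer in the box and hand back whatever was there before. */
static struct model_task *model_xchg(struct model_box *b, struct model_task *n)
{
	struct model_task *old = b->task;

	b->task = n;
	return old;
}

/* Dropping the box also releases the reference stashed inside it. */
static void model_box_drop(struct model_box *b)
{
	if (b->task) {
		model_release(b->task);
		b->task = NULL;
	}
}

/* Returns 1 when the drop released exactly one reference, as the test expects. */
static int model_check(void)
{
	struct model_task task = { .refs = 1 };
	struct model_box box = { .task = NULL };
	struct model_task *acquired = model_acquire(&task);
	int refcnt = task.refs;

	if (model_xchg(&box, acquired))
		return -1;	/* box was expected to be empty */
	model_box_drop(&box);

	return refcnt == task.refs + 1;
}
```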

+22

tools/testing/selftests/bpf/progs/tc_bpf2bpf.c

··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + 3 + #include <linux/bpf.h> 4 + #include <bpf/bpf_helpers.h> 5 + #include "bpf_misc.h" 6 + 7 + __noinline 8 + int subprog(struct __sk_buff *skb) 9 + { 10 + int ret = 1; 11 + 12 + __sink(ret); 13 + return ret; 14 + } 15 + 16 + SEC("tc") 17 + int entry_tc(struct __sk_buff *skb) 18 + { 19 + return subprog(skb); 20 + } 21 + 22 + char __license[] SEC("license") = "GPL";

+12

tools/testing/selftests/bpf/progs/tc_dummy.c

··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + #include <linux/bpf.h> 3 + #include <bpf/bpf_helpers.h> 4 + #include "bpf_legacy.h" 5 + 6 + SEC("tc") 7 + int entry(struct __sk_buff *skb) 8 + { 9 + return 1; 10 + } 11 + 12 + char __license[] SEC("license") = "GPL";

+61 -3

tools/testing/selftests/bpf/progs/test_attach_probe.c

··· 5 5 #include <bpf/bpf_helpers.h> 6 6 #include <bpf/bpf_tracing.h> 7 7 #include <bpf/bpf_core_read.h> 8 + #include <errno.h> 8 9 #include "bpf_misc.h" 9 10 11 + u32 dynamic_sz = 1; 10 12 int kprobe2_res = 0; 11 13 int kretprobe2_res = 0; 12 14 int uprobe_byname_res = 0; ··· 16 14 int uprobe_byname2_res = 0; 17 15 int uretprobe_byname2_res = 0; 18 16 int uprobe_byname3_sleepable_res = 0; 17 + int uprobe_byname3_str_sleepable_res = 0; 19 18 int uprobe_byname3_res = 0; 20 19 int uretprobe_byname3_sleepable_res = 0; 20 + int uretprobe_byname3_str_sleepable_res = 0; 21 21 int uretprobe_byname3_res = 0; 22 22 void *user_ptr = 0; 23 + 24 + int bpf_copy_from_user_str(void *dst, u32, const void *, u64) __weak __ksym; 23 25 24 26 SEC("ksyscall/nanosleep") 25 27 int BPF_KSYSCALL(handle_kprobe_auto, struct __kernel_timespec *req, struct __kernel_timespec *rem) ··· 93 87 return bpf_strncmp(data, sizeof(data), "test_data") == 0; 94 88 } 95 89 90 + static __always_inline bool verify_sleepable_user_copy_str(void) 91 + { 92 + int ret; 93 + char data_long[20]; 94 + char data_long_pad[20]; 95 + char data_long_err[20]; 96 + char data_short[4]; 97 + char data_short_pad[4]; 98 + 99 + ret = bpf_copy_from_user_str(data_short, sizeof(data_short), user_ptr, 0); 100 + 101 + if (bpf_strncmp(data_short, 4, "tes\0") != 0 || ret != 4) 102 + return false; 103 + 104 + ret = bpf_copy_from_user_str(data_short_pad, sizeof(data_short_pad), user_ptr, BPF_F_PAD_ZEROS); 105 + 106 + if (bpf_strncmp(data_short, 4, "tes\0") != 0 || ret != 4) 107 + return false; 108 + 109 + /* Make sure this passes the verifier */ 110 + ret = bpf_copy_from_user_str(data_long, dynamic_sz & sizeof(data_long), user_ptr, 0); 111 + 112 + if (ret != 0) 113 + return false; 114 + 115 + ret = bpf_copy_from_user_str(data_long, sizeof(data_long), user_ptr, 0); 116 + 117 + if (bpf_strncmp(data_long, 10, "test_data\0") != 0 || ret != 10) 118 + return false; 119 + 120 + ret = bpf_copy_from_user_str(data_long_pad, sizeof(data_long_pad), 
user_ptr, BPF_F_PAD_ZEROS); 121 + 122 + if (bpf_strncmp(data_long_pad, 10, "test_data\0") != 0 || ret != 10 || data_long_pad[19] != '\0') 123 + return false; 124 + 125 + ret = bpf_copy_from_user_str(data_long_err, sizeof(data_long_err), (void *)data_long, BPF_F_PAD_ZEROS); 126 + 127 + if (ret > 0 || data_long_err[19] != '\0') 128 + return false; 129 + 130 + ret = bpf_copy_from_user_str(data_long, sizeof(data_long), user_ptr, 2); 131 + 132 + if (ret != -EINVAL) 133 + return false; 134 + 135 + return true; 136 + } 137 + 96 138 SEC("uprobe.s//proc/self/exe:trigger_func3") 97 139 int handle_uprobe_byname3_sleepable(struct pt_regs *ctx) 98 140 { 99 141 if (verify_sleepable_user_copy()) 100 142 uprobe_byname3_sleepable_res = 9; 143 + if (verify_sleepable_user_copy_str()) 144 + uprobe_byname3_str_sleepable_res = 10; 101 145 return 0; 102 146 } 103 147 ··· 158 102 SEC("uprobe//proc/self/exe:trigger_func3") 159 103 int handle_uprobe_byname3(struct pt_regs *ctx) 160 104 { 161 - uprobe_byname3_res = 10; 105 + uprobe_byname3_res = 11; 162 106 return 0; 163 107 } 164 108 ··· 166 110 int handle_uretprobe_byname3_sleepable(struct pt_regs *ctx) 167 111 { 168 112 if (verify_sleepable_user_copy()) 169 - uretprobe_byname3_sleepable_res = 11; 113 + uretprobe_byname3_sleepable_res = 12; 114 + if (verify_sleepable_user_copy_str()) 115 + uretprobe_byname3_str_sleepable_res = 13; 170 116 return 0; 171 117 } 172 118 173 119 SEC("uretprobe//proc/self/exe:trigger_func3") 174 120 int handle_uretprobe_byname3(struct pt_regs *ctx) 175 121 { 176 - uretprobe_byname3_res = 12; 122 + uretprobe_byname3_res = 14; 177 123 return 0; 178 124 } 179 125
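
Note on the hunk above: the assertions in `verify_sleepable_user_copy_str()` imply a contract for `bpf_copy_from_user_str()`: the copy is truncated to fit the buffer, the result is always NUL-terminated, the return value counts the terminator, a zero-sized buffer yields 0, and unknown flag bits yield `-EINVAL`. The sketch below is a user-space model of that contract only, inferred from the test. It is not the kfunc's implementation, `model_copy_str` and `MODEL_F_PAD_ZEROS` are hypothetical names, the flag value is an assumption, and the faulting-address case exercised with `data_long_err` is not modelled.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for BPF_F_PAD_ZEROS; the value is an assumption. */
#define MODEL_F_PAD_ZEROS 1u

/*
 * Copy at most dst_sz - 1 bytes from src, always NUL-terminate, and
 * return the number of bytes written including the NUL.
 */
static int model_copy_str(char *dst, size_t dst_sz, const char *src,
			  unsigned int flags)
{
	size_t n = 0;

	if (flags & ~MODEL_F_PAD_ZEROS)
		return -EINVAL;		/* unknown flag bits are rejected */
	if (dst_sz == 0)
		return 0;		/* nothing can be written */

	while (n < dst_sz - 1 && src[n] != '\0') {
		dst[n] = src[n];
		n++;
	}
	dst[n] = '\0';

	/* With the pad flag, zero the rest of the buffer after the NUL. */
	if (flags & MODEL_F_PAD_ZEROS)
		memset(dst + n + 1, 0, dst_sz - n - 1);

	return (int)(n + 1);
}
```

With the source string "test_data", a 4-byte destination ends up holding "tes" plus the terminator and the call returns 4, while a 20-byte destination returns 10, matching the values the selftest compares against.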

+31

tools/testing/selftests/bpf/progs/test_build_id.c

··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + /* Copyright (c) 2024 Meta Platforms, Inc. and affiliates. */ 3 + 4 + #include "vmlinux.h" 5 + #include <bpf/bpf_helpers.h> 6 + 7 + struct bpf_stack_build_id stack_sleepable[128]; 8 + int res_sleepable; 9 + 10 + struct bpf_stack_build_id stack_nofault[128]; 11 + int res_nofault; 12 + 13 + SEC("uprobe.multi/./uprobe_multi:uprobe") 14 + int uprobe_nofault(struct pt_regs *ctx) 15 + { 16 + res_nofault = bpf_get_stack(ctx, stack_nofault, sizeof(stack_nofault), 17 + BPF_F_USER_STACK | BPF_F_USER_BUILD_ID); 18 + 19 + return 0; 20 + } 21 + 22 + SEC("uprobe.multi.s/./uprobe_multi:uprobe") 23 + int uprobe_sleepable(struct pt_regs *ctx) 24 + { 25 + res_sleepable = bpf_get_stack(ctx, stack_sleepable, sizeof(stack_sleepable), 26 + BPF_F_USER_STACK | BPF_F_USER_BUILD_ID); 27 + 28 + return 0; 29 + } 30 + 31 + char _license[] SEC("license") = "GPL";

+1 -1

tools/testing/selftests/bpf/progs/test_cls_redirect_dynptr.c

··· 503 503 * 504 504 * fill_tuple(&t, foo, sizeof(struct iphdr), 123, 321) 505 505 * 506 - * clang will substitue a costant for sizeof, which allows the verifier 506 + * clang will substitute a constant for sizeof, which allows the verifier 507 507 * to track it's value. Based on this, it can figure out the constant 508 508 * return value, and calling code works while still being "generic" to 509 509 * IPv4 and IPv6.

+1 -1

tools/testing/selftests/bpf/progs/test_core_read_macros.c

··· 36 36 return 0; 37 37 38 38 /* next pointers for kernel address space have to be initialized from 39 - * BPF side, user-space mmaped addresses are stil user-space addresses 39 + * BPF side, user-space mmaped addresses are still user-space addresses 40 40 */ 41 41 k_probe_in.next = &k_probe_in; 42 42 __builtin_preserve_access_index(({k_core_in.next = &k_core_in;}));

+32 -5

tools/testing/selftests/bpf/progs/test_get_xattr.c

··· 2 2 /* Copyright (c) 2023 Meta Platforms, Inc. and affiliates. */ 3 3 4 4 #include "vmlinux.h" 5 + #include <errno.h> 5 6 #include <bpf/bpf_helpers.h> 6 7 #include <bpf/bpf_tracing.h> 7 8 #include "bpf_kfuncs.h" ··· 10 9 char _license[] SEC("license") = "GPL"; 11 10 12 11 __u32 monitored_pid; 13 - __u32 found_xattr; 12 + __u32 found_xattr_from_file; 13 + __u32 found_xattr_from_dentry; 14 14 15 15 static const char expected_value[] = "hello"; 16 - char value[32]; 16 + char value1[32]; 17 + char value2[32]; 17 18 18 19 SEC("lsm.s/file_open") 19 20 int BPF_PROG(test_file_open, struct file *f) ··· 28 25 if (pid != monitored_pid) 29 26 return 0; 30 27 31 - bpf_dynptr_from_mem(value, sizeof(value), 0, &value_ptr); 28 + bpf_dynptr_from_mem(value1, sizeof(value1), 0, &value_ptr); 32 29 33 30 ret = bpf_get_file_xattr(f, "user.kfuncs", &value_ptr); 34 31 if (ret != sizeof(expected_value)) 35 32 return 0; 36 - if (bpf_strncmp(value, ret, expected_value)) 33 + if (bpf_strncmp(value1, ret, expected_value)) 37 34 return 0; 38 - found_xattr = 1; 35 + found_xattr_from_file = 1; 39 36 return 0; 37 + } 38 + 39 + SEC("lsm.s/inode_getxattr") 40 + int BPF_PROG(test_inode_getxattr, struct dentry *dentry, char *name) 41 + { 42 + struct bpf_dynptr value_ptr; 43 + __u32 pid; 44 + int ret; 45 + 46 + pid = bpf_get_current_pid_tgid() >> 32; 47 + if (pid != monitored_pid) 48 + return 0; 49 + 50 + bpf_dynptr_from_mem(value2, sizeof(value2), 0, &value_ptr); 51 + 52 + ret = bpf_get_dentry_xattr(dentry, "user.kfuncs", &value_ptr); 53 + if (ret != sizeof(expected_value)) 54 + return 0; 55 + if (bpf_strncmp(value2, ret, expected_value)) 56 + return 0; 57 + found_xattr_from_dentry = 1; 58 + 59 + /* return non-zero to fail getxattr from user space */ 60 + return -EINVAL; 40 61 }

+1 -1

tools/testing/selftests/bpf/progs/test_global_func15.c

··· 44 44 * case we have a valid 1 stored in R0 register, but in 45 45 * a branch case we assign some random value to R0. So if 46 46 * there is something wrong with precision tracking for R0 at 47 - * program exit, we might erronenously prune branch case, 47 + * program exit, we might erroneously prune branch case, 48 48 * because R0 in fallthrough case is imprecise (and thus any 49 49 * value is valid from POV of verifier is_state_equal() logic) 50 50 */

+17 -1

tools/testing/selftests/bpf/progs/test_global_map_resize.c

··· 3 3 4 4 #include "vmlinux.h" 5 5 #include <bpf/bpf_helpers.h> 6 + #include <bpf/bpf_tracing.h> 6 7 7 8 char _license[] SEC("license") = "GPL"; 8 9 ··· 16 15 int sum = 0; 17 16 int array[1]; 18 17 19 - /* custom data secton */ 18 + /* custom data section */ 20 19 int my_array[1] SEC(".data.custom"); 21 20 22 21 /* custom data section which should NOT be resizable, ··· 61 60 62 61 return 0; 63 62 } 63 + 64 + SEC("struct_ops/test_1") 65 + int BPF_PROG(test_1) 66 + { 67 + return 0; 68 + } 69 + 70 + struct bpf_testmod_ops { 71 + int (*test_1)(void); 72 + }; 73 + 74 + SEC(".struct_ops.link") 75 + struct bpf_testmod_ops st_ops_resize = { 76 + .test_1 = (void *)test_1 77 + };

+1

tools/testing/selftests/bpf/progs/test_libbpf_get_fd_by_id_opts.c

··· 31 31 32 32 if (fmode & FMODE_WRITE) 33 33 return -EACCES; 34 + barrier(); 34 35 35 36 return 0; 36 37 }

+2 -1

tools/testing/selftests/bpf/progs/test_rdonly_maps.c

··· 4 4 #include <linux/ptrace.h> 5 5 #include <linux/bpf.h> 6 6 #include <bpf/bpf_helpers.h> 7 + #include "bpf_misc.h" 7 8 8 9 const struct { 9 10 unsigned a[4]; ··· 65 64 { 66 65 /* prevent compiler to optimize everything out */ 67 66 unsigned * volatile p = (void *)&rdonly_values.a; 68 - int i = sizeof(rdonly_values.a) / sizeof(rdonly_values.a[0]); 67 + int i = ARRAY_SIZE(rdonly_values.a); 69 68 unsigned iters = 0, sum = 0; 70 69 71 70 /* validate verifier can allow full loop as well */

+4

tools/testing/selftests/bpf/progs/test_sig_in_xattr.c

··· 6 6 #include <bpf/bpf_helpers.h> 7 7 #include <bpf/bpf_tracing.h> 8 8 #include "bpf_kfuncs.h" 9 + #include "err.h" 9 10 10 11 char _license[] SEC("license") = "GPL"; 11 12 ··· 80 79 ret = bpf_verify_pkcs7_signature(&digest_ptr, &sig_ptr, trusted_keyring); 81 80 82 81 bpf_key_put(trusted_keyring); 82 + 83 + set_if_not_errno_or_zero(ret, -EFAULT); 84 + 83 85 return ret; 84 86 }

-45

tools/testing/selftests/bpf/progs/test_skb_cgroup_id_kern.c

··· 1 - // SPDX-License-Identifier: GPL-2.0 2 - // Copyright (c) 2018 Facebook 3 - 4 - #include <linux/bpf.h> 5 - #include <linux/pkt_cls.h> 6 - 7 - #include <string.h> 8 - 9 - #include <bpf/bpf_helpers.h> 10 - 11 - #define NUM_CGROUP_LEVELS 4 12 - 13 - struct { 14 - __uint(type, BPF_MAP_TYPE_ARRAY); 15 - __type(key, __u32); 16 - __type(value, __u64); 17 - __uint(max_entries, NUM_CGROUP_LEVELS); 18 - } cgroup_ids SEC(".maps"); 19 - 20 - static __always_inline void log_nth_level(struct __sk_buff *skb, __u32 level) 21 - { 22 - __u64 id; 23 - 24 - /* [1] &level passed to external function that may change it, it's 25 - * incompatible with loop unroll. 26 - */ 27 - id = bpf_skb_ancestor_cgroup_id(skb, level); 28 - bpf_map_update_elem(&cgroup_ids, &level, &id, 0); 29 - } 30 - 31 - SEC("cgroup_id_logger") 32 - int log_cgroup_id(struct __sk_buff *skb) 33 - { 34 - /* Loop unroll can't be used here due to [1]. Unrolling manually. 35 - * Number of calls should be in sync with NUM_CGROUP_LEVELS. 36 - */ 37 - log_nth_level(skb, 0); 38 - log_nth_level(skb, 1); 39 - log_nth_level(skb, 2); 40 - log_nth_level(skb, 3); 41 - 42 - return TC_ACT_OK; 43 - } 44 - 45 - char _license[] SEC("license") = "GPL";

+21 -6

tools/testing/selftests/bpf/progs/test_tunnel_kern.c

··· 26 26 */ 27 27 #define ASSIGNED_ADDR_VETH1 0xac1001c8 28 28 29 + struct bpf_fou_encap___local { 30 + __be16 sport; 31 + __be16 dport; 32 + } __attribute__((preserve_access_index)); 33 + 34 + enum bpf_fou_encap_type___local { 35 + FOU_BPF_ENCAP_FOU___local, 36 + FOU_BPF_ENCAP_GUE___local, 37 + }; 38 + 39 + struct bpf_fou_encap; 40 + 29 41 int bpf_skb_set_fou_encap(struct __sk_buff *skb_ctx, 30 42 struct bpf_fou_encap *encap, int type) __ksym; 31 43 int bpf_skb_get_fou_encap(struct __sk_buff *skb_ctx, ··· 757 745 int ipip_gue_set_tunnel(struct __sk_buff *skb) 758 746 { 759 747 struct bpf_tunnel_key key = {}; 760 - struct bpf_fou_encap encap = {}; 748 + struct bpf_fou_encap___local encap = {}; 761 749 void *data = (void *)(long)skb->data; 762 750 struct iphdr *iph = data; 763 751 void *data_end = (void *)(long)skb->data_end; ··· 781 769 encap.sport = 0; 782 770 encap.dport = bpf_htons(5555); 783 771 784 - ret = bpf_skb_set_fou_encap(skb, &encap, FOU_BPF_ENCAP_GUE); 772 + ret = bpf_skb_set_fou_encap(skb, (struct bpf_fou_encap *)&encap, 773 + bpf_core_enum_value(enum bpf_fou_encap_type___local, 774 + FOU_BPF_ENCAP_GUE___local)); 785 775 if (ret < 0) { 786 776 log_err(ret); 787 777 return TC_ACT_SHOT; ··· 796 782 int ipip_fou_set_tunnel(struct __sk_buff *skb) 797 783 { 798 784 struct bpf_tunnel_key key = {}; 799 - struct bpf_fou_encap encap = {}; 785 + struct bpf_fou_encap___local encap = {}; 800 786 void *data = (void *)(long)skb->data; 801 787 struct iphdr *iph = data; 802 788 void *data_end = (void *)(long)skb->data_end; ··· 820 806 encap.sport = 0; 821 807 encap.dport = bpf_htons(5555); 822 808 823 - ret = bpf_skb_set_fou_encap(skb, &encap, FOU_BPF_ENCAP_FOU); 809 + ret = bpf_skb_set_fou_encap(skb, (struct bpf_fou_encap *)&encap, 810 + FOU_BPF_ENCAP_FOU___local); 824 811 if (ret < 0) { 825 812 log_err(ret); 826 813 return TC_ACT_SHOT; ··· 835 820 { 836 821 int ret; 837 822 struct bpf_tunnel_key key = {}; 838 - struct bpf_fou_encap encap = {}; 823 + struct 
bpf_fou_encap___local encap = {}; 839 824 840 825 ret = bpf_skb_get_tunnel_key(skb, &key, sizeof(key), 0); 841 826 if (ret < 0) { ··· 843 828 return TC_ACT_SHOT; 844 829 } 845 830 846 - ret = bpf_skb_get_fou_encap(skb, &encap); 831 + ret = bpf_skb_get_fou_encap(skb, (struct bpf_fou_encap *)&encap); 847 832 if (ret < 0) { 848 833 log_err(ret); 849 834 return TC_ACT_SHOT;

+6 -2

tools/testing/selftests/bpf/progs/test_verify_pkcs7_sig.c

··· 11 11 #include <bpf/bpf_helpers.h> 12 12 #include <bpf/bpf_tracing.h> 13 13 #include "bpf_kfuncs.h" 14 + #include "err.h" 14 15 15 16 #define MAX_DATA_SIZE (1024 * 1024) 16 17 #define MAX_SIG_SIZE 1024 ··· 56 55 57 56 ret = bpf_probe_read_kernel(&value, sizeof(value), &attr->value); 58 57 if (ret) 59 - return ret; 58 + goto out; 60 59 61 60 ret = bpf_copy_from_user(data_val, sizeof(struct data), 62 61 (void *)(unsigned long)value); 63 62 if (ret) 64 - return ret; 63 + goto out; 65 64 66 65 if (data_val->data_len > sizeof(data_val->data)) 67 66 return -EINVAL; ··· 84 83 ret = bpf_verify_pkcs7_signature(&data_ptr, &sig_ptr, trusted_keyring); 85 84 86 85 bpf_key_put(trusted_keyring); 86 + 87 + out: 88 + set_if_not_errno_or_zero(ret, -EFAULT); 87 89 88 90 return ret; 89 91 }
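
Note on the hunk above: every exit path now funnels through `set_if_not_errno_or_zero(ret, -EFAULT)`, presumably so the program's return value is visibly bounded to zero or a valid errno. The helper lives in the selftests' `err.h`, which is not part of this diff. The sketch below therefore only assumes, from the helper's name and its use here, that it replaces any value outside the range [-4095, 0] with the fallback; `model_set_if_not_errno_or_zero` and `MODEL_MAX_ERRNO` are hypothetical names, and the real helper is a macro that updates its argument in place rather than a function.

```c
#include <errno.h>

/* Assumed bound on errno magnitudes, mirroring the kernel's MAX_ERRNO. */
#define MODEL_MAX_ERRNO 4095

/* Keep ret if it is 0 or a negative errno; otherwise use the fallback. */
static int model_set_if_not_errno_or_zero(int ret, int fallback)
{
	if (ret < -MODEL_MAX_ERRNO || ret > 0)
		return fallback;
	return ret;
}
```

Under this model, 0 and `-EINVAL` pass through unchanged, while a positive value or something like -5000 is turned into `-EFAULT`.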

+2 -2

tools/testing/selftests/bpf/progs/token_lsm.c

··· 8 8 char _license[] SEC("license") = "GPL"; 9 9 10 10 int my_pid; 11 - bool reject_capable; 12 - bool reject_cmd; 11 + int reject_capable; 12 + int reject_cmd; 13 13 14 14 SEC("lsm/bpf_token_capable") 15 15 int BPF_PROG(token_capable, struct bpf_token *token, int cap)

+7

tools/testing/selftests/bpf/progs/trigger_bench.c

··· 32 32 return 0; 33 33 } 34 34 35 + SEC("?uprobe.multi") 36 + int bench_trigger_uprobe_multi(void *ctx) 37 + { 38 + inc_counter(); 39 + return 0; 40 + } 41 + 35 42 const volatile int batch_iters = 0; 36 43 37 44 SEC("?raw_tp")

+22

tools/testing/selftests/bpf/progs/unsupported_ops.c

··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + /* Copyright (c) 2024 Meta Platforms, Inc. and affiliates. */ 3 + 4 + #include <vmlinux.h> 5 + #include <bpf/bpf_tracing.h> 6 + #include "bpf_misc.h" 7 + #include "../bpf_testmod/bpf_testmod.h" 8 + 9 + char _license[] SEC("license") = "GPL"; 10 + 11 + SEC("struct_ops/unsupported_ops") 12 + __failure 13 + __msg("attach to unsupported member unsupported_ops of struct bpf_testmod_ops") 14 + int BPF_PROG(unsupported_ops) 15 + { 16 + return 0; 17 + } 18 + 19 + SEC(".struct_ops.link") 20 + struct bpf_testmod_ops testmod = { 21 + .unsupported_ops = (void *)unsupported_ops, 22 + };

+39

tools/testing/selftests/bpf/progs/uprobe_multi_consumers.c

··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + #include <linux/bpf.h> 3 + #include <bpf/bpf_helpers.h> 4 + #include <bpf/bpf_tracing.h> 5 + #include <stdbool.h> 6 + #include "bpf_kfuncs.h" 7 + #include "bpf_misc.h" 8 + 9 + char _license[] SEC("license") = "GPL"; 10 + 11 + __u64 uprobe_result[4]; 12 + 13 + SEC("uprobe.multi") 14 + int uprobe_0(struct pt_regs *ctx) 15 + { 16 + uprobe_result[0]++; 17 + return 0; 18 + } 19 + 20 + SEC("uprobe.multi") 21 + int uprobe_1(struct pt_regs *ctx) 22 + { 23 + uprobe_result[1]++; 24 + return 0; 25 + } 26 + 27 + SEC("uprobe.multi") 28 + int uprobe_2(struct pt_regs *ctx) 29 + { 30 + uprobe_result[2]++; 31 + return 0; 32 + } 33 + 34 + SEC("uprobe.multi") 35 + int uprobe_3(struct pt_regs *ctx) 36 + { 37 + uprobe_result[3]++; 38 + return 0; 39 + }

+40

tools/testing/selftests/bpf/progs/uprobe_multi_pid_filter.c

··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + #include "vmlinux.h" 3 + #include <bpf/bpf_helpers.h> 4 + #include <bpf/bpf_tracing.h> 5 + 6 + char _license[] SEC("license") = "GPL"; 7 + 8 + __u32 pids[3]; 9 + __u32 test[3][2]; 10 + 11 + static void update_pid(int idx) 12 + { 13 + __u32 pid = bpf_get_current_pid_tgid() >> 32; 14 + 15 + if (pid == pids[idx]) 16 + test[idx][0]++; 17 + else 18 + test[idx][1]++; 19 + } 20 + 21 + SEC("uprobe.multi") 22 + int uprobe_multi_0(struct pt_regs *ctx) 23 + { 24 + update_pid(0); 25 + return 0; 26 + } 27 + 28 + SEC("uprobe.multi") 29 + int uprobe_multi_1(struct pt_regs *ctx) 30 + { 31 + update_pid(1); 32 + return 0; 33 + } 34 + 35 + SEC("uprobe.multi") 36 + int uprobe_multi_2(struct pt_regs *ctx) 37 + { 38 + update_pid(2); 39 + return 0; 40 + }
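
Note on the file above: `update_pid()` compares `bpf_get_current_pid_tgid() >> 32` against the expected process ID. The helper returns a 64-bit value with the thread group ID (the user-space notion of PID) in the upper 32 bits and the thread ID in the lower 32 bits. A minimal user-space sketch of that packing, with hypothetical `model_*` names:

```c
#include <stdint.h>

/* Pack a tgid/tid pair the way the helper's return value is laid out. */
static uint64_t model_pid_tgid(uint32_t tgid, uint32_t tid)
{
	return ((uint64_t)tgid << 32) | tid;
}

/* Upper half: thread group ID, i.e. the process ID seen from user space. */
static uint32_t model_tgid(uint64_t pid_tgid)
{
	return (uint32_t)(pid_tgid >> 32);
}

/* Lower half: ID of the individual thread. */
static uint32_t model_tid(uint64_t pid_tgid)
{
	return (uint32_t)pid_tgid;
}
```

Shifting rather than truncating matters for the filter test: truncating would compare thread IDs, which differ between threads of the same process.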

+1 -1

tools/testing/selftests/bpf/progs/verifier_bits_iter.c

··· 87 87 int *bit; 88 88 89 89 __builtin_memset(&data, 0xf0, sizeof(data)); /* 4 * 16 */ 90 - bpf_for_each(bits, bit, &data[0], sizeof(data) / sizeof(u64)) 90 + bpf_for_each(bits, bit, &data[0], ARRAY_SIZE(data)) 91 91 nr++; 92 92 return nr; 93 93 }

+900

tools/testing/selftests/bpf/progs/verifier_bpf_fastcall.c

··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + 3 + #include <linux/bpf.h> 4 + #include <bpf/bpf_helpers.h> 5 + #include <bpf/bpf_core_read.h> 6 + #include "../../../include/linux/filter.h" 7 + #include "bpf_misc.h" 8 + #include <stdbool.h> 9 + #include "bpf_kfuncs.h" 10 + 11 + SEC("raw_tp") 12 + __arch_x86_64 13 + __log_level(4) __msg("stack depth 8") 14 + __xlated("4: r5 = 5") 15 + __xlated("5: w0 = ") 16 + __xlated("6: r0 = &(void __percpu *)(r0)") 17 + __xlated("7: r0 = *(u32 *)(r0 +0)") 18 + __xlated("8: exit") 19 + __success 20 + __naked void simple(void) 21 + { 22 + asm volatile ( 23 + "r1 = 1;" 24 + "r2 = 2;" 25 + "r3 = 3;" 26 + "r4 = 4;" 27 + "r5 = 5;" 28 + "*(u64 *)(r10 - 16) = r1;" 29 + "*(u64 *)(r10 - 24) = r2;" 30 + "*(u64 *)(r10 - 32) = r3;" 31 + "*(u64 *)(r10 - 40) = r4;" 32 + "*(u64 *)(r10 - 48) = r5;" 33 + "call %[bpf_get_smp_processor_id];" 34 + "r5 = *(u64 *)(r10 - 48);" 35 + "r4 = *(u64 *)(r10 - 40);" 36 + "r3 = *(u64 *)(r10 - 32);" 37 + "r2 = *(u64 *)(r10 - 24);" 38 + "r1 = *(u64 *)(r10 - 16);" 39 + "exit;" 40 + : 41 + : __imm(bpf_get_smp_processor_id) 42 + : __clobber_all); 43 + } 44 + 45 + /* The logic for detecting and verifying bpf_fastcall pattern is the same for 46 + * any arch, however x86 differs from arm64 or riscv64 in a way 47 + * bpf_get_smp_processor_id is rewritten: 48 + * - on x86 it is done by verifier 49 + * - on arm64 and riscv64 it is done by jit 50 + * 51 + * Which leads to different xlated patterns for different archs: 52 + * - on x86 the call is expanded as 3 instructions 53 + * - on arm64 and riscv64 the call remains as is 54 + * (but spills/fills are still removed) 55 + * 56 + * It is really desirable to check instruction indexes in the xlated 57 + * patterns, so add this canary test to check that function rewrite by 58 + * jit is correctly processed by bpf_fastcall logic, keep the rest of the 59 + * tests as x86. 
60 + */ 61 + SEC("raw_tp") 62 + __arch_arm64 63 + __arch_riscv64 64 + __xlated("0: r1 = 1") 65 + __xlated("1: call bpf_get_smp_processor_id") 66 + __xlated("2: exit") 67 + __success 68 + __naked void canary_arm64_riscv64(void) 69 + { 70 + asm volatile ( 71 + "r1 = 1;" 72 + "*(u64 *)(r10 - 16) = r1;" 73 + "call %[bpf_get_smp_processor_id];" 74 + "r1 = *(u64 *)(r10 - 16);" 75 + "exit;" 76 + : 77 + : __imm(bpf_get_smp_processor_id) 78 + : __clobber_all); 79 + } 80 + 81 + SEC("raw_tp") 82 + __arch_x86_64 83 + __xlated("1: r0 = &(void __percpu *)(r0)") 84 + __xlated("...") 85 + __xlated("3: exit") 86 + __success 87 + __naked void canary_zero_spills(void) 88 + { 89 + asm volatile ( 90 + "call %[bpf_get_smp_processor_id];" 91 + "exit;" 92 + : 93 + : __imm(bpf_get_smp_processor_id) 94 + : __clobber_all); 95 + } 96 + 97 + SEC("raw_tp") 98 + __arch_x86_64 99 + __log_level(4) __msg("stack depth 16") 100 + __xlated("1: *(u64 *)(r10 -16) = r1") 101 + __xlated("...") 102 + __xlated("3: r0 = &(void __percpu *)(r0)") 103 + __xlated("...") 104 + __xlated("5: r2 = *(u64 *)(r10 -16)") 105 + __success 106 + __naked void wrong_reg_in_pattern1(void) 107 + { 108 + asm volatile ( 109 + "r1 = 1;" 110 + "*(u64 *)(r10 - 16) = r1;" 111 + "call %[bpf_get_smp_processor_id];" 112 + "r2 = *(u64 *)(r10 - 16);" 113 + "exit;" 114 + : 115 + : __imm(bpf_get_smp_processor_id) 116 + : __clobber_all); 117 + } 118 + 119 + SEC("raw_tp") 120 + __arch_x86_64 121 + __xlated("1: *(u64 *)(r10 -16) = r6") 122 + __xlated("...") 123 + __xlated("3: r0 = &(void __percpu *)(r0)") 124 + __xlated("...") 125 + __xlated("5: r6 = *(u64 *)(r10 -16)") 126 + __success 127 + __naked void wrong_reg_in_pattern2(void) 128 + { 129 + asm volatile ( 130 + "r6 = 1;" 131 + "*(u64 *)(r10 - 16) = r6;" 132 + "call %[bpf_get_smp_processor_id];" 133 + "r6 = *(u64 *)(r10 - 16);" 134 + "exit;" 135 + : 136 + : __imm(bpf_get_smp_processor_id) 137 + : __clobber_all); 138 + } 139 + 140 + SEC("raw_tp") 141 + __arch_x86_64 142 + __xlated("1: 
*(u64 *)(r10 -16) = r0") 143 + __xlated("...") 144 + __xlated("3: r0 = &(void __percpu *)(r0)") 145 + __xlated("...") 146 + __xlated("5: r0 = *(u64 *)(r10 -16)") 147 + __success 148 + __naked void wrong_reg_in_pattern3(void) 149 + { 150 + asm volatile ( 151 + "r0 = 1;" 152 + "*(u64 *)(r10 - 16) = r0;" 153 + "call %[bpf_get_smp_processor_id];" 154 + "r0 = *(u64 *)(r10 - 16);" 155 + "exit;" 156 + : 157 + : __imm(bpf_get_smp_processor_id) 158 + : __clobber_all); 159 + } 160 + 161 + SEC("raw_tp") 162 + __arch_x86_64 163 + __xlated("2: *(u64 *)(r2 -16) = r1") 164 + __xlated("...") 165 + __xlated("4: r0 = &(void __percpu *)(r0)") 166 + __xlated("...") 167 + __xlated("6: r1 = *(u64 *)(r10 -16)") 168 + __success 169 + __naked void wrong_base_in_pattern(void) 170 + { 171 + asm volatile ( 172 + "r1 = 1;" 173 + "r2 = r10;" 174 + "*(u64 *)(r2 - 16) = r1;" 175 + "call %[bpf_get_smp_processor_id];" 176 + "r1 = *(u64 *)(r10 - 16);" 177 + "exit;" 178 + : 179 + : __imm(bpf_get_smp_processor_id) 180 + : __clobber_all); 181 + } 182 + 183 + SEC("raw_tp") 184 + __arch_x86_64 185 + __xlated("1: *(u64 *)(r10 -16) = r1") 186 + __xlated("...") 187 + __xlated("3: r0 = &(void __percpu *)(r0)") 188 + __xlated("...") 189 + __xlated("5: r2 = 1") 190 + __success 191 + __naked void wrong_insn_in_pattern(void) 192 + { 193 + asm volatile ( 194 + "r1 = 1;" 195 + "*(u64 *)(r10 - 16) = r1;" 196 + "call %[bpf_get_smp_processor_id];" 197 + "r2 = 1;" 198 + "r1 = *(u64 *)(r10 - 16);" 199 + "exit;" 200 + : 201 + : __imm(bpf_get_smp_processor_id) 202 + : __clobber_all); 203 + } 204 + 205 + SEC("raw_tp") 206 + __arch_x86_64 207 + __xlated("2: *(u64 *)(r10 -16) = r1") 208 + __xlated("...") 209 + __xlated("4: r0 = &(void __percpu *)(r0)") 210 + __xlated("...") 211 + __xlated("6: r1 = *(u64 *)(r10 -8)") 212 + __success 213 + __naked void wrong_off_in_pattern1(void) 214 + { 215 + asm volatile ( 216 + "r1 = 1;" 217 + "*(u64 *)(r10 - 8) = r1;" 218 + "*(u64 *)(r10 - 16) = r1;" 219 + "call 
%[bpf_get_smp_processor_id];" 220 + "r1 = *(u64 *)(r10 - 8);" 221 + "exit;" 222 + : 223 + : __imm(bpf_get_smp_processor_id) 224 + : __clobber_all); 225 + } 226 + 227 + SEC("raw_tp") 228 + __arch_x86_64 229 + __xlated("1: *(u32 *)(r10 -4) = r1") 230 + __xlated("...") 231 + __xlated("3: r0 = &(void __percpu *)(r0)") 232 + __xlated("...") 233 + __xlated("5: r1 = *(u32 *)(r10 -4)") 234 + __success 235 + __naked void wrong_off_in_pattern2(void) 236 + { 237 + asm volatile ( 238 + "r1 = 1;" 239 + "*(u32 *)(r10 - 4) = r1;" 240 + "call %[bpf_get_smp_processor_id];" 241 + "r1 = *(u32 *)(r10 - 4);" 242 + "exit;" 243 + : 244 + : __imm(bpf_get_smp_processor_id) 245 + : __clobber_all); 246 + } 247 + 248 + SEC("raw_tp") 249 + __arch_x86_64 250 + __xlated("1: *(u32 *)(r10 -16) = r1") 251 + __xlated("...") 252 + __xlated("3: r0 = &(void __percpu *)(r0)") 253 + __xlated("...") 254 + __xlated("5: r1 = *(u32 *)(r10 -16)") 255 + __success 256 + __naked void wrong_size_in_pattern(void) 257 + { 258 + asm volatile ( 259 + "r1 = 1;" 260 + "*(u32 *)(r10 - 16) = r1;" 261 + "call %[bpf_get_smp_processor_id];" 262 + "r1 = *(u32 *)(r10 - 16);" 263 + "exit;" 264 + : 265 + : __imm(bpf_get_smp_processor_id) 266 + : __clobber_all); 267 + } 268 + 269 + SEC("raw_tp") 270 + __arch_x86_64 271 + __xlated("2: *(u32 *)(r10 -8) = r1") 272 + __xlated("...") 273 + __xlated("4: r0 = &(void __percpu *)(r0)") 274 + __xlated("...") 275 + __xlated("6: r1 = *(u32 *)(r10 -8)") 276 + __success 277 + __naked void partial_pattern(void) 278 + { 279 + asm volatile ( 280 + "r1 = 1;" 281 + "r2 = 2;" 282 + "*(u32 *)(r10 - 8) = r1;" 283 + "*(u64 *)(r10 - 16) = r2;" 284 + "call %[bpf_get_smp_processor_id];" 285 + "r2 = *(u64 *)(r10 - 16);" 286 + "r1 = *(u32 *)(r10 - 8);" 287 + "exit;" 288 + : 289 + : __imm(bpf_get_smp_processor_id) 290 + : __clobber_all); 291 + } 292 + 293 + SEC("raw_tp") 294 + __arch_x86_64 295 + __xlated("0: r1 = 1") 296 + __xlated("1: r2 = 2") 297 + /* not patched, spills for -8, -16 not removed */ 298 + 
__xlated("2: *(u64 *)(r10 -8) = r1") 299 + __xlated("3: *(u64 *)(r10 -16) = r2") 300 + __xlated("...") 301 + __xlated("5: r0 = &(void __percpu *)(r0)") 302 + __xlated("...") 303 + __xlated("7: r2 = *(u64 *)(r10 -16)") 304 + __xlated("8: r1 = *(u64 *)(r10 -8)") 305 + /* patched, spills for -24, -32 removed */ 306 + __xlated("...") 307 + __xlated("10: r0 = &(void __percpu *)(r0)") 308 + __xlated("...") 309 + __xlated("12: exit") 310 + __success 311 + __naked void min_stack_offset(void) 312 + { 313 + asm volatile ( 314 + "r1 = 1;" 315 + "r2 = 2;" 316 + /* this call won't be patched */ 317 + "*(u64 *)(r10 - 8) = r1;" 318 + "*(u64 *)(r10 - 16) = r2;" 319 + "call %[bpf_get_smp_processor_id];" 320 + "r2 = *(u64 *)(r10 - 16);" 321 + "r1 = *(u64 *)(r10 - 8);" 322 + /* this call would be patched */ 323 + "*(u64 *)(r10 - 24) = r1;" 324 + "*(u64 *)(r10 - 32) = r2;" 325 + "call %[bpf_get_smp_processor_id];" 326 + "r2 = *(u64 *)(r10 - 32);" 327 + "r1 = *(u64 *)(r10 - 24);" 328 + "exit;" 329 + : 330 + : __imm(bpf_get_smp_processor_id) 331 + : __clobber_all); 332 + } 333 + 334 + SEC("raw_tp") 335 + __arch_x86_64 336 + __xlated("1: *(u64 *)(r10 -8) = r1") 337 + __xlated("...") 338 + __xlated("3: r0 = &(void __percpu *)(r0)") 339 + __xlated("...") 340 + __xlated("5: r1 = *(u64 *)(r10 -8)") 341 + __success 342 + __naked void bad_fixed_read(void) 343 + { 344 + asm volatile ( 345 + "r1 = 1;" 346 + "*(u64 *)(r10 - 8) = r1;" 347 + "call %[bpf_get_smp_processor_id];" 348 + "r1 = *(u64 *)(r10 - 8);" 349 + "r1 = r10;" 350 + "r1 += -8;" 351 + "r1 = *(u64 *)(r1 - 0);" 352 + "exit;" 353 + : 354 + : __imm(bpf_get_smp_processor_id) 355 + : __clobber_all); 356 + } 357 + 358 + SEC("raw_tp") 359 + __arch_x86_64 360 + __xlated("1: *(u64 *)(r10 -8) = r1") 361 + __xlated("...") 362 + __xlated("3: r0 = &(void __percpu *)(r0)") 363 + __xlated("...") 364 + __xlated("5: r1 = *(u64 *)(r10 -8)") 365 + __success 366 + __naked void bad_fixed_write(void) 367 + { 368 + asm volatile ( 369 + "r1 = 1;" 370 + 
"*(u64 *)(r10 - 8) = r1;" 371 + "call %[bpf_get_smp_processor_id];" 372 + "r1 = *(u64 *)(r10 - 8);" 373 + "r1 = r10;" 374 + "r1 += -8;" 375 + "*(u64 *)(r1 - 0) = r1;" 376 + "exit;" 377 + : 378 + : __imm(bpf_get_smp_processor_id) 379 + : __clobber_all); 380 + } 381 + 382 + SEC("raw_tp") 383 + __arch_x86_64 384 + __xlated("6: *(u64 *)(r10 -16) = r1") 385 + __xlated("...") 386 + __xlated("8: r0 = &(void __percpu *)(r0)") 387 + __xlated("...") 388 + __xlated("10: r1 = *(u64 *)(r10 -16)") 389 + __success 390 + __naked void bad_varying_read(void) 391 + { 392 + asm volatile ( 393 + "r6 = *(u64 *)(r1 + 0);" /* random scalar value */ 394 + "r6 &= 0x7;" /* r6 range [0..7] */ 395 + "r6 += 0x2;" /* r6 range [2..9] */ 396 + "r7 = 0;" 397 + "r7 -= r6;" /* r7 range [-9..-2] */ 398 + "r1 = 1;" 399 + "*(u64 *)(r10 - 16) = r1;" 400 + "call %[bpf_get_smp_processor_id];" 401 + "r1 = *(u64 *)(r10 - 16);" 402 + "r1 = r10;" 403 + "r1 += r7;" 404 + "r1 = *(u8 *)(r1 - 0);" /* touches slot [-16..-9] where spills are stored */ 405 + "exit;" 406 + : 407 + : __imm(bpf_get_smp_processor_id) 408 + : __clobber_all); 409 + } 410 + 411 + SEC("raw_tp") 412 + __arch_x86_64 413 + __xlated("6: *(u64 *)(r10 -16) = r1") 414 + __xlated("...") 415 + __xlated("8: r0 = &(void __percpu *)(r0)") 416 + __xlated("...") 417 + __xlated("10: r1 = *(u64 *)(r10 -16)") 418 + __success 419 + __naked void bad_varying_write(void) 420 + { 421 + asm volatile ( 422 + "r6 = *(u64 *)(r1 + 0);" /* random scalar value */ 423 + "r6 &= 0x7;" /* r6 range [0..7] */ 424 + "r6 += 0x2;" /* r6 range [2..9] */ 425 + "r7 = 0;" 426 + "r7 -= r6;" /* r7 range [-9..-2] */ 427 + "r1 = 1;" 428 + "*(u64 *)(r10 - 16) = r1;" 429 + "call %[bpf_get_smp_processor_id];" 430 + "r1 = *(u64 *)(r10 - 16);" 431 + "r1 = r10;" 432 + "r1 += r7;" 433 + "*(u8 *)(r1 - 0) = r7;" /* touches slot [-16..-9] where spills are stored */ 434 + "exit;" 435 + : 436 + : __imm(bpf_get_smp_processor_id) 437 + : __clobber_all); 438 + } 439 + 440 + SEC("raw_tp") 441 + 
__arch_x86_64 442 + __xlated("1: *(u64 *)(r10 -8) = r1") 443 + __xlated("...") 444 + __xlated("3: r0 = &(void __percpu *)(r0)") 445 + __xlated("...") 446 + __xlated("5: r1 = *(u64 *)(r10 -8)") 447 + __success 448 + __naked void bad_write_in_subprog(void) 449 + { 450 + asm volatile ( 451 + "r1 = 1;" 452 + "*(u64 *)(r10 - 8) = r1;" 453 + "call %[bpf_get_smp_processor_id];" 454 + "r1 = *(u64 *)(r10 - 8);" 455 + "r1 = r10;" 456 + "r1 += -8;" 457 + "call bad_write_in_subprog_aux;" 458 + "exit;" 459 + : 460 + : __imm(bpf_get_smp_processor_id) 461 + : __clobber_all); 462 + } 463 + 464 + __used 465 + __naked static void bad_write_in_subprog_aux(void) 466 + { 467 + asm volatile ( 468 + "r0 = 1;" 469 + "*(u64 *)(r1 - 0) = r0;" /* invalidates bpf_fastcall contract for caller: */ 470 + "exit;" /* caller stack at -8 used outside of the pattern */ 471 + ::: __clobber_all); 472 + } 473 + 474 + SEC("raw_tp") 475 + __arch_x86_64 476 + __xlated("1: *(u64 *)(r10 -8) = r1") 477 + __xlated("...") 478 + __xlated("3: r0 = &(void __percpu *)(r0)") 479 + __xlated("...") 480 + __xlated("5: r1 = *(u64 *)(r10 -8)") 481 + __success 482 + __naked void bad_helper_write(void) 483 + { 484 + asm volatile ( 485 + "r1 = 1;" 486 + /* bpf_fastcall pattern with stack offset -8 */ 487 + "*(u64 *)(r10 - 8) = r1;" 488 + "call %[bpf_get_smp_processor_id];" 489 + "r1 = *(u64 *)(r10 - 8);" 490 + "r1 = r10;" 491 + "r1 += -8;" 492 + "r2 = 1;" 493 + "r3 = 42;" 494 + /* read dst is fp[-8], thus bpf_fastcall rewrite not applied */ 495 + "call %[bpf_probe_read_kernel];" 496 + "exit;" 497 + : 498 + : __imm(bpf_get_smp_processor_id), 499 + __imm(bpf_probe_read_kernel) 500 + : __clobber_all); 501 + } 502 + 503 + SEC("raw_tp") 504 + __arch_x86_64 505 + /* main, not patched */ 506 + __xlated("1: *(u64 *)(r10 -8) = r1") 507 + __xlated("...") 508 + __xlated("3: r0 = &(void __percpu *)(r0)") 509 + __xlated("...") 510 + __xlated("5: r1 = *(u64 *)(r10 -8)") 511 + __xlated("...") 512 + __xlated("9: call pc+1") 513 + 
__xlated("...") 514 + __xlated("10: exit") 515 + /* subprogram, patched */ 516 + __xlated("11: r1 = 1") 517 + __xlated("...") 518 + __xlated("13: r0 = &(void __percpu *)(r0)") 519 + __xlated("...") 520 + __xlated("15: exit") 521 + __success 522 + __naked void invalidate_one_subprog(void) 523 + { 524 + asm volatile ( 525 + "r1 = 1;" 526 + "*(u64 *)(r10 - 8) = r1;" 527 + "call %[bpf_get_smp_processor_id];" 528 + "r1 = *(u64 *)(r10 - 8);" 529 + "r1 = r10;" 530 + "r1 += -8;" 531 + "r1 = *(u64 *)(r1 - 0);" 532 + "call invalidate_one_subprog_aux;" 533 + "exit;" 534 + : 535 + : __imm(bpf_get_smp_processor_id) 536 + : __clobber_all); 537 + } 538 + 539 + __used 540 + __naked static void invalidate_one_subprog_aux(void) 541 + { 542 + asm volatile ( 543 + "r1 = 1;" 544 + "*(u64 *)(r10 - 8) = r1;" 545 + "call %[bpf_get_smp_processor_id];" 546 + "r1 = *(u64 *)(r10 - 8);" 547 + "exit;" 548 + : 549 + : __imm(bpf_get_smp_processor_id) 550 + : __clobber_all); 551 + } 552 + 553 + SEC("raw_tp") 554 + __arch_x86_64 555 + /* main */ 556 + __xlated("0: r1 = 1") 557 + __xlated("...") 558 + __xlated("2: r0 = &(void __percpu *)(r0)") 559 + __xlated("...") 560 + __xlated("4: call pc+1") 561 + __xlated("5: exit") 562 + /* subprogram */ 563 + __xlated("6: r1 = 1") 564 + __xlated("...") 565 + __xlated("8: r0 = &(void __percpu *)(r0)") 566 + __xlated("...") 567 + __xlated("10: *(u64 *)(r10 -16) = r1") 568 + __xlated("11: exit") 569 + __success 570 + __naked void subprogs_use_independent_offsets(void) 571 + { 572 + asm volatile ( 573 + "r1 = 1;" 574 + "*(u64 *)(r10 - 16) = r1;" 575 + "call %[bpf_get_smp_processor_id];" 576 + "r1 = *(u64 *)(r10 - 16);" 577 + "call subprogs_use_independent_offsets_aux;" 578 + "exit;" 579 + : 580 + : __imm(bpf_get_smp_processor_id) 581 + : __clobber_all); 582 + } 583 + 584 + __used 585 + __naked static void subprogs_use_independent_offsets_aux(void) 586 + { 587 + asm volatile ( 588 + "r1 = 1;" 589 + "*(u64 *)(r10 - 24) = r1;" 590 + "call 
%[bpf_get_smp_processor_id];" 591 + "r1 = *(u64 *)(r10 - 24);" 592 + "*(u64 *)(r10 - 16) = r1;" 593 + "exit;" 594 + : 595 + : __imm(bpf_get_smp_processor_id) 596 + : __clobber_all); 597 + } 598 + 599 + SEC("raw_tp") 600 + __arch_x86_64 601 + __log_level(4) __msg("stack depth 8") 602 + __xlated("2: r0 = &(void __percpu *)(r0)") 603 + __success 604 + __naked void helper_call_does_not_prevent_bpf_fastcall(void) 605 + { 606 + asm volatile ( 607 + "r1 = 1;" 608 + "*(u64 *)(r10 - 8) = r1;" 609 + "call %[bpf_get_smp_processor_id];" 610 + "r1 = *(u64 *)(r10 - 8);" 611 + "*(u64 *)(r10 - 8) = r1;" 612 + "call %[bpf_get_prandom_u32];" 613 + "r1 = *(u64 *)(r10 - 8);" 614 + "exit;" 615 + : 616 + : __imm(bpf_get_smp_processor_id), 617 + __imm(bpf_get_prandom_u32) 618 + : __clobber_all); 619 + } 620 + 621 + SEC("raw_tp") 622 + __arch_x86_64 623 + __log_level(4) __msg("stack depth 16") 624 + /* may_goto counter at -16 */ 625 + __xlated("0: *(u64 *)(r10 -16) =") 626 + __xlated("1: r1 = 1") 627 + __xlated("...") 628 + __xlated("3: r0 = &(void __percpu *)(r0)") 629 + __xlated("...") 630 + /* may_goto expansion starts */ 631 + __xlated("5: r11 = *(u64 *)(r10 -16)") 632 + __xlated("6: if r11 == 0x0 goto pc+3") 633 + __xlated("7: r11 -= 1") 634 + __xlated("8: *(u64 *)(r10 -16) = r11") 635 + /* may_goto expansion ends */ 636 + __xlated("9: *(u64 *)(r10 -8) = r1") 637 + __xlated("10: exit") 638 + __success 639 + __naked void may_goto_interaction(void) 640 + { 641 + asm volatile ( 642 + "r1 = 1;" 643 + "*(u64 *)(r10 - 16) = r1;" 644 + "call %[bpf_get_smp_processor_id];" 645 + "r1 = *(u64 *)(r10 - 16);" 646 + ".8byte %[may_goto];" 647 + /* just touch some stack at -8 */ 648 + "*(u64 *)(r10 - 8) = r1;" 649 + "exit;" 650 + : 651 + : __imm(bpf_get_smp_processor_id), 652 + __imm_insn(may_goto, BPF_RAW_INSN(BPF_JMP | BPF_JCOND, 0, 0, +1 /* offset */, 0)) 653 + : __clobber_all); 654 + } 655 + 656 + __used 657 + __naked static void dummy_loop_callback(void) 658 + { 659 + asm volatile ( 660 + "r0 = 
0;" 661 + "exit;" 662 + ::: __clobber_all); 663 + } 664 + 665 + SEC("raw_tp") 666 + __arch_x86_64 667 + __log_level(4) __msg("stack depth 32+0") 668 + __xlated("2: r1 = 1") 669 + __xlated("3: w0 =") 670 + __xlated("4: r0 = &(void __percpu *)(r0)") 671 + __xlated("5: r0 = *(u32 *)(r0 +0)") 672 + /* bpf_loop params setup */ 673 + __xlated("6: r2 =") 674 + __xlated("7: r3 = 0") 675 + __xlated("8: r4 = 0") 676 + __xlated("...") 677 + /* ... part of the inlined bpf_loop */ 678 + __xlated("12: *(u64 *)(r10 -32) = r6") 679 + __xlated("13: *(u64 *)(r10 -24) = r7") 680 + __xlated("14: *(u64 *)(r10 -16) = r8") 681 + __xlated("...") 682 + __xlated("21: call pc+8") /* dummy_loop_callback */ 683 + /* ... last insns of the bpf_loop_interaction1 */ 684 + __xlated("...") 685 + __xlated("28: r0 = 0") 686 + __xlated("29: exit") 687 + /* dummy_loop_callback */ 688 + __xlated("30: r0 = 0") 689 + __xlated("31: exit") 690 + __success 691 + __naked int bpf_loop_interaction1(void) 692 + { 693 + asm volatile ( 694 + "r1 = 1;" 695 + /* bpf_fastcall stack region at -16, but could be removed */ 696 + "*(u64 *)(r10 - 16) = r1;" 697 + "call %[bpf_get_smp_processor_id];" 698 + "r1 = *(u64 *)(r10 - 16);" 699 + "r2 = %[dummy_loop_callback];" 700 + "r3 = 0;" 701 + "r4 = 0;" 702 + "call %[bpf_loop];" 703 + "r0 = 0;" 704 + "exit;" 705 + : 706 + : __imm_ptr(dummy_loop_callback), 707 + __imm(bpf_get_smp_processor_id), 708 + __imm(bpf_loop) 709 + : __clobber_common 710 + ); 711 + } 712 + 713 + SEC("raw_tp") 714 + __arch_x86_64 715 + __log_level(4) __msg("stack depth 40+0") 716 + /* call bpf_get_smp_processor_id */ 717 + __xlated("2: r1 = 42") 718 + __xlated("3: w0 =") 719 + __xlated("4: r0 = &(void __percpu *)(r0)") 720 + __xlated("5: r0 = *(u32 *)(r0 +0)") 721 + /* call bpf_get_prandom_u32 */ 722 + __xlated("6: *(u64 *)(r10 -16) = r1") 723 + __xlated("7: call") 724 + __xlated("8: r1 = *(u64 *)(r10 -16)") 725 + __xlated("...") 726 + /* ... 
part of the inlined bpf_loop */ 727 + __xlated("15: *(u64 *)(r10 -40) = r6") 728 + __xlated("16: *(u64 *)(r10 -32) = r7") 729 + __xlated("17: *(u64 *)(r10 -24) = r8") 730 + __success 731 + __naked int bpf_loop_interaction2(void) 732 + { 733 + asm volatile ( 734 + "r1 = 42;" 735 + /* bpf_fastcall stack region at -16, cannot be removed */ 736 + "*(u64 *)(r10 - 16) = r1;" 737 + "call %[bpf_get_smp_processor_id];" 738 + "r1 = *(u64 *)(r10 - 16);" 739 + "*(u64 *)(r10 - 16) = r1;" 740 + "call %[bpf_get_prandom_u32];" 741 + "r1 = *(u64 *)(r10 - 16);" 742 + "r2 = %[dummy_loop_callback];" 743 + "r3 = 0;" 744 + "r4 = 0;" 745 + "call %[bpf_loop];" 746 + "r0 = 0;" 747 + "exit;" 748 + : 749 + : __imm_ptr(dummy_loop_callback), 750 + __imm(bpf_get_smp_processor_id), 751 + __imm(bpf_get_prandom_u32), 752 + __imm(bpf_loop) 753 + : __clobber_common 754 + ); 755 + } 756 + 757 + SEC("raw_tp") 758 + __arch_x86_64 759 + __log_level(4) 760 + __msg("stack depth 512+0") 761 + /* just to print xlated version when debugging */ 762 + __xlated("r0 = &(void __percpu *)(r0)") 763 + __success 764 + /* cumulative_stack_depth() stack usage is MAX_BPF_STACK, 765 + * called subprogram uses an additional slot for bpf_fastcall spill/fill, 766 + * since bpf_fastcall spill/fill could be removed the program still fits 767 + * in MAX_BPF_STACK and should be accepted. 
768 + */ 769 + __naked int cumulative_stack_depth(void) 770 + { 771 + asm volatile( 772 + "r1 = 42;" 773 + "*(u64 *)(r10 - %[max_bpf_stack]) = r1;" 774 + "call cumulative_stack_depth_subprog;" 775 + "exit;" 776 + : 777 + : __imm_const(max_bpf_stack, MAX_BPF_STACK) 778 + : __clobber_all 779 + ); 780 + } 781 + 782 + __used 783 + __naked static void cumulative_stack_depth_subprog(void) 784 + { 785 + asm volatile ( 786 + "*(u64 *)(r10 - 8) = r1;" 787 + "call %[bpf_get_smp_processor_id];" 788 + "r1 = *(u64 *)(r10 - 8);" 789 + "exit;" 790 + :: __imm(bpf_get_smp_processor_id) : __clobber_all); 791 + } 792 + 793 + SEC("raw_tp") 794 + __arch_x86_64 795 + __log_level(4) 796 + __msg("stack depth 512") 797 + __xlated("0: r1 = 42") 798 + __xlated("1: *(u64 *)(r10 -512) = r1") 799 + __xlated("2: w0 = ") 800 + __xlated("3: r0 = &(void __percpu *)(r0)") 801 + __xlated("4: r0 = *(u32 *)(r0 +0)") 802 + __xlated("5: exit") 803 + __success 804 + __naked int bpf_fastcall_max_stack_ok(void) 805 + { 806 + asm volatile( 807 + "r1 = 42;" 808 + "*(u64 *)(r10 - %[max_bpf_stack]) = r1;" 809 + "*(u64 *)(r10 - %[max_bpf_stack_8]) = r1;" 810 + "call %[bpf_get_smp_processor_id];" 811 + "r1 = *(u64 *)(r10 - %[max_bpf_stack_8]);" 812 + "exit;" 813 + : 814 + : __imm_const(max_bpf_stack, MAX_BPF_STACK), 815 + __imm_const(max_bpf_stack_8, MAX_BPF_STACK + 8), 816 + __imm(bpf_get_smp_processor_id) 817 + : __clobber_all 818 + ); 819 + } 820 + 821 + SEC("raw_tp") 822 + __arch_x86_64 823 + __log_level(4) 824 + __msg("stack depth 520") 825 + __failure 826 + __naked int bpf_fastcall_max_stack_fail(void) 827 + { 828 + asm volatile( 829 + "r1 = 42;" 830 + "*(u64 *)(r10 - %[max_bpf_stack]) = r1;" 831 + "*(u64 *)(r10 - %[max_bpf_stack_8]) = r1;" 832 + "call %[bpf_get_smp_processor_id];" 833 + "r1 = *(u64 *)(r10 - %[max_bpf_stack_8]);" 834 + /* call to prandom blocks bpf_fastcall rewrite */ 835 + "*(u64 *)(r10 - %[max_bpf_stack_8]) = r1;" 836 + "call %[bpf_get_prandom_u32];" 837 + "r1 = *(u64 *)(r10 - 
%[max_bpf_stack_8]);" 838 + "exit;" 839 + : 840 + : __imm_const(max_bpf_stack, MAX_BPF_STACK), 841 + __imm_const(max_bpf_stack_8, MAX_BPF_STACK + 8), 842 + __imm(bpf_get_smp_processor_id), 843 + __imm(bpf_get_prandom_u32) 844 + : __clobber_all 845 + ); 846 + } 847 + 848 + SEC("cgroup/getsockname_unix") 849 + __xlated("0: r2 = 1") 850 + /* bpf_cast_to_kern_ctx is replaced by a single assignment */ 851 + __xlated("1: r0 = r1") 852 + __xlated("2: r0 = r2") 853 + __xlated("3: exit") 854 + __success 855 + __naked void kfunc_bpf_cast_to_kern_ctx(void) 856 + { 857 + asm volatile ( 858 + "r2 = 1;" 859 + "*(u64 *)(r10 - 32) = r2;" 860 + "call %[bpf_cast_to_kern_ctx];" 861 + "r2 = *(u64 *)(r10 - 32);" 862 + "r0 = r2;" 863 + "exit;" 864 + : 865 + : __imm(bpf_cast_to_kern_ctx) 866 + : __clobber_all); 867 + } 868 + 869 + SEC("raw_tp") 870 + __xlated("3: r3 = 1") 871 + /* bpf_rdonly_cast is replaced by a single assignment */ 872 + __xlated("4: r0 = r1") 873 + __xlated("5: r0 = r3") 874 + void kfunc_bpf_rdonly_cast(void) 875 + { 876 + asm volatile ( 877 + "r2 = %[btf_id];" 878 + "r3 = 1;" 879 + "*(u64 *)(r10 - 32) = r3;" 880 + "call %[bpf_rdonly_cast];" 881 + "r3 = *(u64 *)(r10 - 32);" 882 + "r0 = r3;" 883 + : 884 + : __imm(bpf_rdonly_cast), 885 + [btf_id]"r"(bpf_core_type_id_kernel(union bpf_attr)) 886 + : __clobber_common); 887 + } 888 + 889 + /* BTF FUNC records are not generated for kfuncs referenced 890 + * from inline assembly. These records are necessary for 891 + * libbpf to link the program. The function below is a hack 892 + * to ensure that BTF FUNC records are generated. 893 + */ 894 + void kfunc_root(void) 895 + { 896 + bpf_cast_to_kern_ctx(0); 897 + bpf_rdonly_cast(0, 0); 898 + } 899 + 900 + char _license[] SEC("license") = "GPL";
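The test names in verifier_bpf_fastcall.c (wrong_reg_in_pattern*, wrong_base_in_pattern, wrong_insn_in_pattern, wrong_off_in_pattern*, wrong_size_in_pattern) suggest the local shape the verifier looks for: spills immediately before the call mirrored by fills immediately after it, using the same register, the same r10-relative offset, and an 8-byte access. The sketch below is a simplified user-space model of that local check only. The `struct insn` layout, `is_spill_fill_pair` and `count_fastcall_pairs` are hypothetical names for illustration and are not the verifier's data structures; the restriction to r1..r5 is an assumption drawn from the tests, and the whole-subprogram stack-offset rules exercised by min_stack_offset and the bad_* tests are not modelled.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified instruction model (not struct bpf_insn). */
enum insn_kind { INSN_OTHER, INSN_SPILL, INSN_FILL, INSN_CALL };

struct insn {
	enum insn_kind kind;
	int reg;	/* register being spilled or filled */
	int base;	/* base register of the memory operand */
	int off;	/* offset from the base register */
	int size;	/* access size in bytes */
};

/* A spill/fill pair qualifies only if both sides use r10 as base,
 * access 8 bytes, name the same register at the same offset, and the
 * register is one of r1..r5 (assumption: r0 and r6+ never qualify,
 * as in wrong_reg_in_pattern2 and wrong_reg_in_pattern3).
 */
bool is_spill_fill_pair(const struct insn *spill, const struct insn *fill)
{
	return spill->kind == INSN_SPILL && fill->kind == INSN_FILL &&
	       spill->base == 10 && fill->base == 10 &&
	       spill->size == 8 && fill->size == 8 &&
	       spill->reg == fill->reg &&
	       spill->reg >= 1 && spill->reg <= 5 &&
	       spill->off == fill->off;
}

/* Count mirrored spill/fill pairs around the call at call_idx,
 * walking outwards and stopping at the first mismatch.
 */
int count_fastcall_pairs(const struct insn *insns, size_t n, size_t call_idx)
{
	int pairs = 0;
	size_t i;

	for (i = 1; i <= call_idx && call_idx + i < n && pairs < 5; i++) {
		if (!is_spill_fill_pair(&insns[call_idx - i],
					&insns[call_idx + i]))
			break;
		pairs++;
	}
	return pairs;
}
```

In this model a sequence shaped like canary_arm64_riscv64 (one spill of r1 at -16, the call, one fill of r1 from -16) yields one pair, while a fill into r2, as in wrong_reg_in_pattern1, yields zero.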

+69

tools/testing/selftests/bpf/progs/verifier_const.c

···
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2024 Isovalent */
+
+#include <linux/bpf.h>
+#include <bpf/bpf_helpers.h>
+#include "bpf_misc.h"
+
+const volatile long foo = 42;
+long bar;
+long bart = 96;
+
+SEC("tc/ingress")
+__description("rodata/strtol: write rejected")
+__failure __msg("write into map forbidden")
+int tcx1(struct __sk_buff *skb)
+{
+	char buff[] = { '8', '4', '\0' };
+	bpf_strtol(buff, sizeof(buff), 0, (long *)&foo);
+	return TCX_PASS;
+}
+
+SEC("tc/ingress")
+__description("bss/strtol: write accepted")
+__success
+int tcx2(struct __sk_buff *skb)
+{
+	char buff[] = { '8', '4', '\0' };
+	bpf_strtol(buff, sizeof(buff), 0, &bar);
+	return TCX_PASS;
+}
+
+SEC("tc/ingress")
+__description("data/strtol: write accepted")
+__success
+int tcx3(struct __sk_buff *skb)
+{
+	char buff[] = { '8', '4', '\0' };
+	bpf_strtol(buff, sizeof(buff), 0, &bart);
+	return TCX_PASS;
+}
+
+SEC("tc/ingress")
+__description("rodata/mtu: write rejected")
+__failure __msg("write into map forbidden")
+int tcx4(struct __sk_buff *skb)
+{
+	bpf_check_mtu(skb, skb->ifindex, (__u32 *)&foo, 0, 0);
+	return TCX_PASS;
+}
+
+SEC("tc/ingress")
+__description("bss/mtu: write accepted")
+__success
+int tcx5(struct __sk_buff *skb)
+{
+	bpf_check_mtu(skb, skb->ifindex, (__u32 *)&bar, 0, 0);
+	return TCX_PASS;
+}
+
+SEC("tc/ingress")
+__description("data/mtu: write accepted")
+__success
+int tcx6(struct __sk_buff *skb)
+{
+	bpf_check_mtu(skb, skb->ifindex, (__u32 *)&bart, 0, 0);
+	return TCX_PASS;
+}
+
+char LICENSE[] SEC("license") = "GPL";

+6 -1

tools/testing/selftests/bpf/progs/verifier_global_subprogs.c

···
 #include "bpf_misc.h"
 #include "xdp_metadata.h"
 #include "bpf_kfuncs.h"
+#include "err.h"

 /* The compiler may be able to detect the access to uninitialized
    memory in the routines performing out of bound memory accesses and
···
 __success __log_level(2)
 int BPF_PROG(arg_tag_ctx_lsm)
 {
-	return tracing_subprog_void(ctx) + tracing_subprog_u64(ctx);
+	int ret;
+
+	ret = tracing_subprog_void(ctx) + tracing_subprog_u64(ctx);
+	set_if_not_errno_or_zero(ret, -1);
+	return ret;
 }

 SEC("?struct_ops/test_1")
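The hunk above routes the LSM program's return value through `set_if_not_errno_or_zero(ret, -1)` from `err.h`, whose definition is not shown in this excerpt. Judging by its name, and by the LSM verifier tests in this same merge that require R0 to be within [-4095, 0], it presumably replaces any out-of-range value with the fallback. A minimal plain-C sketch of that assumed behaviour follows; `model_set_if_not_errno_or_zero` is a hypothetical stand-in written for illustration, not the selftest helper itself.

```c
/* Hypothetical model of the assumed semantics: keep ret when it is zero
 * or a negative errno in [-4095, 0], otherwise return the fallback.
 */
int model_set_if_not_errno_or_zero(int ret, int fallback)
{
	if (ret < -4095 || ret > 0)
		return fallback;
	return ret;
}
```

Under this assumption, a sum of subprogram results that happens to be positive would be turned into -1 before being returned, which keeps the program's return value inside the range the verifier accepts for this hook.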

+6 -9

tools/testing/selftests/bpf/progs/verifier_int_ptr.c

···
 #include "bpf_misc.h"

 SEC("socket")
-__description("ARG_PTR_TO_LONG uninitialized")
+__description("arg pointer to long uninitialized")
 __success
-__failure_unpriv __msg_unpriv("invalid indirect read from stack R4 off -16+0 size 8")
 __naked void arg_ptr_to_long_uninitialized(void)
 {
 	asm volatile ("					\
···
 }

 SEC("socket")
-__description("ARG_PTR_TO_LONG half-uninitialized")
-/* in privileged mode reads from uninitialized stack locations are permitted */
-__success __failure_unpriv
-__msg_unpriv("invalid indirect read from stack R4 off -16+4 size 8")
+__description("arg pointer to long half-uninitialized")
+__success
 __retval(0)
 __naked void ptr_to_long_half_uninitialized(void)
 {
···
 }

 SEC("cgroup/sysctl")
-__description("ARG_PTR_TO_LONG misaligned")
+__description("arg pointer to long misaligned")
 __failure __msg("misaligned stack access off 0+-20+0 size 8")
 __naked void arg_ptr_to_long_misaligned(void)
 {
···
 }

 SEC("cgroup/sysctl")
-__description("ARG_PTR_TO_LONG size < sizeof(long)")
+__description("arg pointer to long size < sizeof(long)")
 __failure __msg("invalid indirect access to stack R4 off=-4 size=8")
 __naked void to_long_size_sizeof_long(void)
 {
···
 }

 SEC("cgroup/sysctl")
-__description("ARG_PTR_TO_LONG initialized")
+__description("arg pointer to long initialized")
 __success
 __naked void arg_ptr_to_long_initialized(void)
 {

+114

tools/testing/selftests/bpf/progs/verifier_jit_convergence.c

··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + /* Copyright (c) 2024 Meta Platforms, Inc. and affiliates. */ 3 + 4 + #include <linux/bpf.h> 5 + #include <bpf/bpf_helpers.h> 6 + #include "bpf_misc.h" 7 + 8 + struct value_t { 9 + long long a[32]; 10 + }; 11 + 12 + struct { 13 + __uint(type, BPF_MAP_TYPE_HASH); 14 + __uint(max_entries, 1); 15 + __type(key, long long); 16 + __type(value, struct value_t); 17 + } map_hash SEC(".maps"); 18 + 19 + SEC("socket") 20 + __description("bpf_jit_convergence je <-> jmp") 21 + __success __retval(0) 22 + __arch_x86_64 23 + __jited(" pushq %rbp") 24 + __naked void btf_jit_convergence_je_jmp(void) 25 + { 26 + asm volatile ( 27 + "call %[bpf_get_prandom_u32];" 28 + "if r0 == 0 goto l20_%=;" 29 + "if r0 == 1 goto l21_%=;" 30 + "if r0 == 2 goto l22_%=;" 31 + "if r0 == 3 goto l23_%=;" 32 + "if r0 == 4 goto l24_%=;" 33 + "call %[bpf_get_prandom_u32];" 34 + "call %[bpf_get_prandom_u32];" 35 + "l20_%=:" 36 + "l21_%=:" 37 + "l22_%=:" 38 + "l23_%=:" 39 + "l24_%=:" 40 + "r1 = 0;" 41 + "*(u64 *)(r10 - 8) = r1;" 42 + "r2 = r10;" 43 + "r2 += -8;" 44 + "r1 = %[map_hash] ll;" 45 + "call %[bpf_map_lookup_elem];" 46 + "if r0 == 0 goto l1_%=;" 47 + "r6 = r0;" 48 + "call %[bpf_get_prandom_u32];" 49 + "r7 = r0;" 50 + "r5 = r6;" 51 + "if r0 != 0x0 goto l12_%=;" 52 + "call %[bpf_get_prandom_u32];" 53 + "r1 = r0;" 54 + "r2 = r6;" 55 + "if r1 == 0x0 goto l0_%=;" 56 + "l9_%=:" 57 + "r2 = *(u64 *)(r6 + 0x0);" 58 + "r2 += 0x1;" 59 + "*(u64 *)(r6 + 0x0) = r2;" 60 + "goto l1_%=;" 61 + "l12_%=:" 62 + "r1 = r7;" 63 + "r1 += 0x98;" 64 + "r2 = r5;" 65 + "r2 += 0x90;" 66 + "r2 = *(u32 *)(r2 + 0x0);" 67 + "r3 = r7;" 68 + "r3 &= 0x1;" 69 + "r2 *= 0xa8;" 70 + "if r3 == 0x0 goto l2_%=;" 71 + "r1 += r2;" 72 + "r1 -= r7;" 73 + "r1 += 0x8;" 74 + "if r1 <= 0xb20 goto l3_%=;" 75 + "r1 = 0x0;" 76 + "goto l4_%=;" 77 + "l3_%=:" 78 + "r1 += r7;" 79 + "l4_%=:" 80 + "if r1 == 0x0 goto l8_%=;" 81 + "goto l9_%=;" 82 + "l2_%=:" 83 + "r1 += r2;" 84 + "r1 -= r7;" 85 + "r1 += 0x10;" 86 + 
"if r1 <= 0xb20 goto l6_%=;" 87 + "r1 = 0x0;" 88 + "goto l7_%=;" 89 + "l6_%=:" 90 + "r1 += r7;" 91 + "l7_%=:" 92 + "if r1 == 0x0 goto l8_%=;" 93 + "goto l9_%=;" 94 + "l0_%=:" 95 + "r1 = 0x3;" 96 + "*(u64 *)(r10 - 0x10) = r1;" 97 + "r2 = r1;" 98 + "goto l1_%=;" 99 + "l8_%=:" 100 + "r1 = r5;" 101 + "r1 += 0x4;" 102 + "r1 = *(u32 *)(r1 + 0x0);" 103 + "*(u64 *)(r10 - 0x8) = r1;" 104 + "l1_%=:" 105 + "r0 = 0;" 106 + "exit;" 107 + : 108 + : __imm(bpf_get_prandom_u32), 109 + __imm(bpf_map_lookup_elem), 110 + __imm_addr(map_hash) 111 + : __clobber_all); 112 + } 113 + 114 + char _license[] SEC("license") = "GPL";

+48

tools/testing/selftests/bpf/progs/verifier_kfunc_prog_types.c

···
 	return 0;
 }

+SEC("tracepoint")
+__success
+int BPF_PROG(task_kfunc_tracepoint)
+{
+	task_kfunc_load_test();
+	return 0;
+}
+
+SEC("perf_event")
+__success
+int BPF_PROG(task_kfunc_perf_event)
+{
+	task_kfunc_load_test();
+	return 0;
+}
+
 /*****************
  * cgroup kfuncs *
  *****************/
···
 	return 0;
 }

+SEC("tracepoint")
+__success
+int BPF_PROG(cgrp_kfunc_tracepoint)
+{
+	cgrp_kfunc_load_test();
+	return 0;
+}
+
+SEC("perf_event")
+__success
+int BPF_PROG(cgrp_kfunc_perf_event)
+{
+	cgrp_kfunc_load_test();
+	return 0;
+}
+
 /******************
  * cpumask kfuncs *
  ******************/
···
 SEC("syscall")
 __success
 int BPF_PROG(cpumask_kfunc_syscall)
+{
+	cpumask_kfunc_load_test();
+	return 0;
+}
+
+SEC("tracepoint")
+__success
+int BPF_PROG(cpumask_kfunc_tracepoint)
+{
+	cpumask_kfunc_load_test();
+	return 0;
+}
+
+SEC("perf_event")
+__success
+int BPF_PROG(cpumask_kfunc_perf_event)
 {
 	cpumask_kfunc_load_test();
 	return 0;

+112

tools/testing/selftests/bpf/progs/verifier_ldsx.c

···
 	: __clobber_all);
 }

+SEC("xdp")
+__description("LDSX, xdp s32 xdp_md->data")
+__failure __msg("invalid bpf_context access")
+__naked void ldsx_ctx_1(void)
+{
+	asm volatile (
+	"r2 = *(s32 *)(r1 + %[xdp_md_data]);"
+	"r0 = 0;"
+	"exit;"
+	:
+	: __imm_const(xdp_md_data, offsetof(struct xdp_md, data))
+	: __clobber_all);
+}
+
+SEC("xdp")
+__description("LDSX, xdp s32 xdp_md->data_end")
+__failure __msg("invalid bpf_context access")
+__naked void ldsx_ctx_2(void)
+{
+	asm volatile (
+	"r2 = *(s32 *)(r1 + %[xdp_md_data_end]);"
+	"r0 = 0;"
+	"exit;"
+	:
+	: __imm_const(xdp_md_data_end, offsetof(struct xdp_md, data_end))
+	: __clobber_all);
+}
+
+SEC("xdp")
+__description("LDSX, xdp s32 xdp_md->data_meta")
+__failure __msg("invalid bpf_context access")
+__naked void ldsx_ctx_3(void)
+{
+	asm volatile (
+	"r2 = *(s32 *)(r1 + %[xdp_md_data_meta]);"
+	"r0 = 0;"
+	"exit;"
+	:
+	: __imm_const(xdp_md_data_meta, offsetof(struct xdp_md, data_meta))
+	: __clobber_all);
+}
+
+SEC("tcx/ingress")
+__description("LDSX, tcx s32 __sk_buff->data")
+__failure __msg("invalid bpf_context access")
+__naked void ldsx_ctx_4(void)
+{
+	asm volatile (
+	"r2 = *(s32 *)(r1 + %[sk_buff_data]);"
+	"r0 = 0;"
+	"exit;"
+	:
+	: __imm_const(sk_buff_data, offsetof(struct __sk_buff, data))
+	: __clobber_all);
+}
+
+SEC("tcx/ingress")
+__description("LDSX, tcx s32 __sk_buff->data_end")
+__failure __msg("invalid bpf_context access")
+__naked void ldsx_ctx_5(void)
+{
+	asm volatile (
+	"r2 = *(s32 *)(r1 + %[sk_buff_data_end]);"
+	"r0 = 0;"
+	"exit;"
+	:
+	: __imm_const(sk_buff_data_end, offsetof(struct __sk_buff, data_end))
+	: __clobber_all);
+}
+
+SEC("tcx/ingress")
+__description("LDSX, tcx s32 __sk_buff->data_meta")
+__failure __msg("invalid bpf_context access")
+__naked void ldsx_ctx_6(void)
+{
+	asm volatile (
+	"r2 = *(s32 *)(r1 + %[sk_buff_data_meta]);"
+	"r0 = 0;"
+	"exit;"
+	:
+	: __imm_const(sk_buff_data_meta, offsetof(struct __sk_buff, data_meta))
+	: __clobber_all);
+}
+
+SEC("flow_dissector")
+__description("LDSX, flow_dissector s32 __sk_buff->data")
+__failure __msg("invalid bpf_context access")
+__naked void ldsx_ctx_7(void)
+{
+	asm volatile (
+	"r2 = *(s32 *)(r1 + %[sk_buff_data]);"
+	"r0 = 0;"
+	"exit;"
+	:
+	: __imm_const(sk_buff_data, offsetof(struct __sk_buff, data))
+	: __clobber_all);
+}
+
+SEC("flow_dissector")
+__description("LDSX, flow_dissector s32 __sk_buff->data_end")
+__failure __msg("invalid bpf_context access")
+__naked void ldsx_ctx_8(void)
+{
+	asm volatile (
+	"r2 = *(s32 *)(r1 + %[sk_buff_data_end]);"
+	"r0 = 0;"
+	"exit;"
+	:
+	: __imm_const(sk_buff_data_end, offsetof(struct __sk_buff, data_end))
+	: __clobber_all);
+}
+
 #else

 SEC("socket")

+162

tools/testing/selftests/bpf/progs/verifier_lsm.c

··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + 3 + #include <linux/bpf.h> 4 + #include <bpf/bpf_helpers.h> 5 + #include "bpf_misc.h" 6 + 7 + SEC("lsm/file_alloc_security") 8 + __description("lsm bpf prog with -4095~0 retval. test 1") 9 + __success 10 + __naked int errno_zero_retval_test1(void *ctx) 11 + { 12 + asm volatile ( 13 + "r0 = 0;" 14 + "exit;" 15 + ::: __clobber_all); 16 + } 17 + 18 + SEC("lsm/file_alloc_security") 19 + __description("lsm bpf prog with -4095~0 retval. test 2") 20 + __success 21 + __naked int errno_zero_retval_test2(void *ctx) 22 + { 23 + asm volatile ( 24 + "r0 = -4095;" 25 + "exit;" 26 + ::: __clobber_all); 27 + } 28 + 29 + SEC("lsm/file_mprotect") 30 + __description("lsm bpf prog with -4095~0 retval. test 4") 31 + __failure __msg("R0 has smin=-4096 smax=-4096 should have been in [-4095, 0]") 32 + __naked int errno_zero_retval_test4(void *ctx) 33 + { 34 + asm volatile ( 35 + "r0 = -4096;" 36 + "exit;" 37 + ::: __clobber_all); 38 + } 39 + 40 + SEC("lsm/file_mprotect") 41 + __description("lsm bpf prog with -4095~0 retval. test 5") 42 + __failure __msg("R0 has smin=4096 smax=4096 should have been in [-4095, 0]") 43 + __naked int errno_zero_retval_test5(void *ctx) 44 + { 45 + asm volatile ( 46 + "r0 = 4096;" 47 + "exit;" 48 + ::: __clobber_all); 49 + } 50 + 51 + SEC("lsm/file_mprotect") 52 + __description("lsm bpf prog with -4095~0 retval. test 6") 53 + __failure __msg("R0 has smin=1 smax=1 should have been in [-4095, 0]") 54 + __naked int errno_zero_retval_test6(void *ctx) 55 + { 56 + asm volatile ( 57 + "r0 = 1;" 58 + "exit;" 59 + ::: __clobber_all); 60 + } 61 + 62 + SEC("lsm/audit_rule_known") 63 + __description("lsm bpf prog with bool retval. test 1") 64 + __success 65 + __naked int bool_retval_test1(void *ctx) 66 + { 67 + asm volatile ( 68 + "r0 = 1;" 69 + "exit;" 70 + ::: __clobber_all); 71 + } 72 + 73 + SEC("lsm/audit_rule_known") 74 + __description("lsm bpf prog with bool retval. 
test 2") 75 + __success 76 + __success 77 + __naked int bool_retval_test2(void *ctx) 78 + { 79 + asm volatile ( 80 + "r0 = 0;" 81 + "exit;" 82 + ::: __clobber_all); 83 + } 84 + 85 + SEC("lsm/audit_rule_known") 86 + __description("lsm bpf prog with bool retval. test 3") 87 + __failure __msg("R0 has smin=-1 smax=-1 should have been in [0, 1]") 88 + __naked int bool_retval_test3(void *ctx) 89 + { 90 + asm volatile ( 91 + "r0 = -1;" 92 + "exit;" 93 + ::: __clobber_all); 94 + } 95 + 96 + SEC("lsm/audit_rule_known") 97 + __description("lsm bpf prog with bool retval. test 4") 98 + __failure __msg("R0 has smin=2 smax=2 should have been in [0, 1]") 99 + __naked int bool_retval_test4(void *ctx) 100 + { 101 + asm volatile ( 102 + "r0 = 2;" 103 + "exit;" 104 + ::: __clobber_all); 105 + } 106 + 107 + SEC("lsm/file_free_security") 108 + __success 109 + __description("lsm bpf prog with void retval. test 1") 110 + __naked int void_retval_test1(void *ctx) 111 + { 112 + asm volatile ( 113 + "r0 = -4096;" 114 + "exit;" 115 + ::: __clobber_all); 116 + } 117 + 118 + SEC("lsm/file_free_security") 119 + __success 120 + __description("lsm bpf prog with void retval. 
test 2") 121 + __naked int void_retval_test2(void *ctx) 122 + { 123 + asm volatile ( 124 + "r0 = 4096;" 125 + "exit;" 126 + ::: __clobber_all); 127 + } 128 + 129 + SEC("lsm/getprocattr") 130 + __description("lsm disabled hook: getprocattr") 131 + __failure __msg("points to disabled hook") 132 + __naked int disabled_hook_test1(void *ctx) 133 + { 134 + asm volatile ( 135 + "r0 = 0;" 136 + "exit;" 137 + ::: __clobber_all); 138 + } 139 + 140 + SEC("lsm/setprocattr") 141 + __description("lsm disabled hook: setprocattr") 142 + __failure __msg("points to disabled hook") 143 + __naked int disabled_hook_test2(void *ctx) 144 + { 145 + asm volatile ( 146 + "r0 = 0;" 147 + "exit;" 148 + ::: __clobber_all); 149 + } 150 + 151 + SEC("lsm/ismaclabel") 152 + __description("lsm disabled hook: ismaclabel") 153 + __failure __msg("points to disabled hook") 154 + __naked int disabled_hook_test3(void *ctx) 155 + { 156 + asm volatile ( 157 + "r0 = 0;" 158 + "exit;" 159 + ::: __clobber_all); 160 + } 161 + 162 + char _license[] SEC("license") = "GPL";
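Reviewer note: the expectations encoded in the verifier_lsm.c tests above can be summarized as a small model. This is only an illustrative sketch, not the verifier's code. The enum and function names are hypothetical. The ranges are taken from the __msg strings in the tests: [-4095, 0] for errno-returning hooks, [0, 1] for bool-returning hooks, and no constraint for void hooks.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical classification of an LSM hook by its return type. */
enum lsm_ret_kind {
	LSM_RET_ERRNO,	/* int hook: 0 or a negative errno */
	LSM_RET_BOOL,	/* hook whose result is treated as a boolean */
	LSM_RET_VOID,	/* void hook: R0 is ignored */
};

/* Return true if a constant R0 value would fall inside the range
 * that the tests above expect for the given kind of hook.
 */
static inline bool lsm_retval_in_range(enum lsm_ret_kind kind, long long r0)
{
	switch (kind) {
	case LSM_RET_ERRNO:
		return r0 >= -4095 && r0 <= 0;
	case LSM_RET_BOOL:
		return r0 >= 0 && r0 <= 1;
	case LSM_RET_VOID:
		return true;
	}
	return false;
}

/* Mirrors a few of the selftest cases; assumes the model above. */
static inline void lsm_model_self_check(void)
{
	assert(lsm_retval_in_range(LSM_RET_ERRNO, -4095));
	assert(!lsm_retval_in_range(LSM_RET_ERRNO, -4096));
	assert(!lsm_retval_in_range(LSM_RET_BOOL, 2));
	assert(lsm_retval_in_range(LSM_RET_VOID, 4096));
}
```

Each selftest returns a single constant in R0, so smin and smax are equal in the expected messages. The real verifier checks a whole range rather than one value.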

+220 -116

tools/testing/selftests/bpf/progs/verifier_scalar_ids.c

··· 5 5 #include "bpf_misc.h" 6 6 7 7 /* Check that precision marks propagate through scalar IDs. 8 - * Registers r{0,1,2} have the same scalar ID at the moment when r0 is 9 - * marked to be precise, this mark is immediately propagated to r{1,2}. 8 + * Registers r{0,1,2} have the same scalar ID. 9 + * Range information is propagated for scalars sharing same ID. 10 + * Check that precision mark for r0 causes precision marks for r{1,2} 11 + * when range information is propagated for 'if <reg> <op> <const>' insn. 10 12 */ 11 13 SEC("socket") 12 14 __success __log_level(2) 13 - __msg("frame0: regs=r0,r1,r2 stack= before 4: (bf) r3 = r10") 14 - __msg("frame0: regs=r0,r1,r2 stack= before 3: (bf) r2 = r0") 15 - __msg("frame0: regs=r0,r1 stack= before 2: (bf) r1 = r0") 16 - __msg("frame0: regs=r0 stack= before 1: (57) r0 &= 255") 17 - __msg("frame0: regs=r0 stack= before 0: (85) call bpf_ktime_get_ns") 18 - __flag(BPF_F_TEST_STATE_FREQ) 19 - __naked void precision_same_state(void) 20 - { 21 - asm volatile ( 22 - /* r0 = random number up to 0xff */ 23 - "call %[bpf_ktime_get_ns];" 24 - "r0 &= 0xff;" 25 - /* tie r0.id == r1.id == r2.id */ 26 - "r1 = r0;" 27 - "r2 = r0;" 28 - /* force r0 to be precise, this immediately marks r1 and r2 as 29 - * precise as well because of shared IDs 30 - */ 31 - "r3 = r10;" 32 - "r3 += r0;" 33 - "r0 = 0;" 34 - "exit;" 35 - : 36 - : __imm(bpf_ktime_get_ns) 37 - : __clobber_all); 38 - } 39 - 40 - /* Same as precision_same_state, but mark propagates through state / 41 - * parent state boundary. 
42 - */ 43 - SEC("socket") 44 - __success __log_level(2) 45 - __msg("frame0: last_idx 6 first_idx 5 subseq_idx -1") 46 - __msg("frame0: regs=r0,r1,r2 stack= before 5: (bf) r3 = r10") 15 + /* first 'if' branch */ 16 + __msg("6: (0f) r3 += r0") 17 + __msg("frame0: regs=r0 stack= before 4: (25) if r1 > 0x7 goto pc+0") 47 18 __msg("frame0: parent state regs=r0,r1,r2 stack=:") 48 - __msg("frame0: regs=r0,r1,r2 stack= before 4: (05) goto pc+0") 49 19 __msg("frame0: regs=r0,r1,r2 stack= before 3: (bf) r2 = r0") 50 - __msg("frame0: regs=r0,r1 stack= before 2: (bf) r1 = r0") 51 - __msg("frame0: regs=r0 stack= before 1: (57) r0 &= 255") 52 - __msg("frame0: parent state regs=r0 stack=:") 53 - __msg("frame0: regs=r0 stack= before 0: (85) call bpf_ktime_get_ns") 20 + /* second 'if' branch */ 21 + __msg("from 4 to 5: ") 22 + __msg("6: (0f) r3 += r0") 23 + __msg("frame0: regs=r0 stack= before 5: (bf) r3 = r10") 24 + __msg("frame0: regs=r0 stack= before 4: (25) if r1 > 0x7 goto pc+0") 25 + /* parent state already has r{0,1,2} as precise */ 26 + __msg("frame0: parent state regs= stack=:") 54 27 __flag(BPF_F_TEST_STATE_FREQ) 55 - __naked void precision_cross_state(void) 28 + __naked void linked_regs_bpf_k(void) 56 29 { 57 30 asm volatile ( 58 31 /* r0 = random number up to 0xff */ ··· 34 61 /* tie r0.id == r1.id == r2.id */ 35 62 "r1 = r0;" 36 63 "r2 = r0;" 37 - /* force checkpoint */ 38 - "goto +0;" 39 - /* force r0 to be precise, this immediately marks r1 and r2 as 64 + "if r1 > 7 goto +0;" 65 + /* force r0 to be precise, this eventually marks r1 and r2 as 40 66 * precise as well because of shared IDs 41 67 */ 42 68 "r3 = r10;" ··· 47 75 : __clobber_all); 48 76 } 49 77 50 - /* Same as precision_same_state, but break one of the 78 + /* Registers r{0,1,2} share same ID when 'if r1 > ...' insn is processed, 79 + * check that verifier marks r{1,2} as precise while backtracking 80 + * 'if r1 > ...' with r0 already marked. 
81 + */ 82 + SEC("socket") 83 + __success __log_level(2) 84 + __flag(BPF_F_TEST_STATE_FREQ) 85 + __msg("frame0: regs=r0 stack= before 5: (2d) if r1 > r3 goto pc+0") 86 + __msg("frame0: parent state regs=r0,r1,r2,r3 stack=:") 87 + __msg("frame0: regs=r0,r1,r2,r3 stack= before 4: (b7) r3 = 7") 88 + __naked void linked_regs_bpf_x_src(void) 89 + { 90 + asm volatile ( 91 + /* r0 = random number up to 0xff */ 92 + "call %[bpf_ktime_get_ns];" 93 + "r0 &= 0xff;" 94 + /* tie r0.id == r1.id == r2.id */ 95 + "r1 = r0;" 96 + "r2 = r0;" 97 + "r3 = 7;" 98 + "if r1 > r3 goto +0;" 99 + /* force r0 to be precise, this eventually marks r1 and r2 as 100 + * precise as well because of shared IDs 101 + */ 102 + "r4 = r10;" 103 + "r4 += r0;" 104 + "r0 = 0;" 105 + "exit;" 106 + : 107 + : __imm(bpf_ktime_get_ns) 108 + : __clobber_all); 109 + } 110 + 111 + /* Registers r{0,1,2} share same ID when 'if r1 > r3' insn is processed, 112 + * check that verifier marks r{0,1,2} as precise while backtracking 113 + * 'if r1 > r3' with r3 already marked. 
114 + */ 115 + SEC("socket") 116 + __success __log_level(2) 117 + __flag(BPF_F_TEST_STATE_FREQ) 118 + __msg("frame0: regs=r3 stack= before 5: (2d) if r1 > r3 goto pc+0") 119 + __msg("frame0: parent state regs=r0,r1,r2,r3 stack=:") 120 + __msg("frame0: regs=r0,r1,r2,r3 stack= before 4: (b7) r3 = 7") 121 + __naked void linked_regs_bpf_x_dst(void) 122 + { 123 + asm volatile ( 124 + /* r0 = random number up to 0xff */ 125 + "call %[bpf_ktime_get_ns];" 126 + "r0 &= 0xff;" 127 + /* tie r0.id == r1.id == r2.id */ 128 + "r1 = r0;" 129 + "r2 = r0;" 130 + "r3 = 7;" 131 + "if r1 > r3 goto +0;" 132 + /* force r0 to be precise, this eventually marks r1 and r2 as 133 + * precise as well because of shared IDs 134 + */ 135 + "r4 = r10;" 136 + "r4 += r3;" 137 + "r0 = 0;" 138 + "exit;" 139 + : 140 + : __imm(bpf_ktime_get_ns) 141 + : __clobber_all); 142 + } 143 + 144 + /* Same as linked_regs_bpf_k, but break one of the 51 145 * links, note that r1 is absent from regs=... in __msg below. 52 146 */ 53 147 SEC("socket") 54 148 __success __log_level(2) 55 - __msg("frame0: regs=r0,r2 stack= before 5: (bf) r3 = r10") 56 - __msg("frame0: regs=r0,r2 stack= before 4: (b7) r1 = 0") 57 - __msg("frame0: regs=r0,r2 stack= before 3: (bf) r2 = r0") 58 - __msg("frame0: regs=r0 stack= before 2: (bf) r1 = r0") 59 - __msg("frame0: regs=r0 stack= before 1: (57) r0 &= 255") 60 - __msg("frame0: regs=r0 stack= before 0: (85) call bpf_ktime_get_ns") 149 + __msg("7: (0f) r3 += r0") 150 + __msg("frame0: regs=r0 stack= before 6: (bf) r3 = r10") 151 + __msg("frame0: parent state regs=r0 stack=:") 152 + __msg("frame0: regs=r0 stack= before 5: (25) if r0 > 0x7 goto pc+0") 153 + __msg("frame0: parent state regs=r0,r2 stack=:") 61 154 __flag(BPF_F_TEST_STATE_FREQ) 62 - __naked void precision_same_state_broken_link(void) 155 + __naked void linked_regs_broken_link(void) 63 156 { 64 157 asm volatile ( 65 158 /* r0 = random number up to 0xff */ ··· 137 100 * compared to the previous test 138 101 */ 139 102 "r1 = 0;" 
140 - /* force r0 to be precise, this immediately marks r1 and r2 as 141 - * precise as well because of shared IDs 142 - */ 143 - "r3 = r10;" 144 - "r3 += r0;" 145 - "r0 = 0;" 146 - "exit;" 147 - : 148 - : __imm(bpf_ktime_get_ns) 149 - : __clobber_all); 150 - } 151 - 152 - /* Same as precision_same_state_broken_link, but with state / 153 - * parent state boundary. 154 - */ 155 - SEC("socket") 156 - __success __log_level(2) 157 - __msg("frame0: regs=r0,r2 stack= before 6: (bf) r3 = r10") 158 - __msg("frame0: regs=r0,r2 stack= before 5: (b7) r1 = 0") 159 - __msg("frame0: parent state regs=r0,r2 stack=:") 160 - __msg("frame0: regs=r0,r1,r2 stack= before 4: (05) goto pc+0") 161 - __msg("frame0: regs=r0,r1,r2 stack= before 3: (bf) r2 = r0") 162 - __msg("frame0: regs=r0,r1 stack= before 2: (bf) r1 = r0") 163 - __msg("frame0: regs=r0 stack= before 1: (57) r0 &= 255") 164 - __msg("frame0: parent state regs=r0 stack=:") 165 - __msg("frame0: regs=r0 stack= before 0: (85) call bpf_ktime_get_ns") 166 - __flag(BPF_F_TEST_STATE_FREQ) 167 - __naked void precision_cross_state_broken_link(void) 168 - { 169 - asm volatile ( 170 - /* r0 = random number up to 0xff */ 171 - "call %[bpf_ktime_get_ns];" 172 - "r0 &= 0xff;" 173 - /* tie r0.id == r1.id == r2.id */ 174 - "r1 = r0;" 175 - "r2 = r0;" 176 - /* force checkpoint, although link between r1 and r{0,2} is 177 - * broken by the next statement current precision tracking 178 - * algorithm can't react to it and propagates mark for r1 to 179 - * the parent state. 
180 - */ 181 - "goto +0;" 182 - /* break link for r1, this is the only line that differs 183 - * compared to precision_cross_state() 184 - */ 185 - "r1 = 0;" 186 - /* force r0 to be precise, this immediately marks r1 and r2 as 187 - * precise as well because of shared IDs 103 + "if r0 > 7 goto +0;" 104 + /* force r0 to be precise, 105 + * this eventually marks r2 as precise because of shared IDs 188 106 */ 189 107 "r3 = r10;" 190 108 "r3 += r0;" ··· 156 164 */ 157 165 SEC("socket") 158 166 __success __log_level(2) 159 - __msg("11: (0f) r2 += r1") 167 + __msg("12: (0f) r2 += r1") 160 168 /* Current state */ 161 - __msg("frame2: last_idx 11 first_idx 10 subseq_idx -1") 162 - __msg("frame2: regs=r1 stack= before 10: (bf) r2 = r10") 169 + __msg("frame2: last_idx 12 first_idx 11 subseq_idx -1 ") 170 + __msg("frame2: regs=r1 stack= before 11: (bf) r2 = r10") 171 + __msg("frame2: parent state regs=r1 stack=") 172 + __msg("frame1: parent state regs= stack=") 173 + __msg("frame0: parent state regs= stack=") 174 + /* Parent state */ 175 + __msg("frame2: last_idx 10 first_idx 10 subseq_idx 11 ") 176 + __msg("frame2: regs=r1 stack= before 10: (25) if r1 > 0x7 goto pc+0") 163 177 __msg("frame2: parent state regs=r1 stack=") 164 178 /* frame1.r{6,7} are marked because mark_precise_scalar_ids() 165 179 * looks for all registers with frame2.r1.id in the current state ··· 190 192 __msg("frame0: parent state regs=r1,r6 stack=") 191 193 /* Parent state */ 192 194 __msg("frame0: last_idx 3 first_idx 1 subseq_idx 4") 193 - __msg("frame0: regs=r0,r1,r6 stack= before 3: (bf) r6 = r0") 195 + __msg("frame0: regs=r1,r6 stack= before 3: (bf) r6 = r0") 194 196 __msg("frame0: regs=r0,r1 stack= before 2: (bf) r1 = r0") 195 197 __msg("frame0: regs=r0 stack= before 1: (57) r0 &= 255") 196 198 __flag(BPF_F_TEST_STATE_FREQ) ··· 228 230 void precision_many_frames__bar(void) 229 231 { 230 232 asm volatile ( 231 - /* force r1 to be precise, this immediately marks: 233 + "if r1 > 7 goto +0;" 234 + /* 
force r1 to be precise, this eventually marks: 232 235 * - bar frame r1 233 236 * - foo frame r{1,6,7} 234 237 * - main frame r{1,6} ··· 246 247 */ 247 248 SEC("socket") 248 249 __success __log_level(2) 250 + __msg("11: (0f) r2 += r1") 249 251 /* foo frame */ 250 - __msg("frame1: regs=r1 stack=-8,-16 before 9: (bf) r2 = r10") 252 + __msg("frame1: regs=r1 stack= before 10: (bf) r2 = r10") 253 + __msg("frame1: regs=r1 stack= before 9: (25) if r1 > 0x7 goto pc+0") 251 254 __msg("frame1: regs=r1 stack=-8,-16 before 8: (7b) *(u64 *)(r10 -16) = r1") 252 255 __msg("frame1: regs=r1 stack=-8 before 7: (7b) *(u64 *)(r10 -8) = r1") 253 256 __msg("frame1: regs=r1 stack= before 4: (85) call pc+2") 254 257 /* main frame */ 255 - __msg("frame0: regs=r0,r1 stack=-8 before 3: (7b) *(u64 *)(r10 -8) = r1") 256 - __msg("frame0: regs=r0,r1 stack= before 2: (bf) r1 = r0") 258 + __msg("frame0: regs=r1 stack=-8 before 3: (7b) *(u64 *)(r10 -8) = r1") 259 + __msg("frame0: regs=r1 stack= before 2: (bf) r1 = r0") 257 260 __msg("frame0: regs=r0 stack= before 1: (57) r0 &= 255") 258 261 __flag(BPF_F_TEST_STATE_FREQ) 259 262 __naked void precision_stack(void) ··· 284 283 */ 285 284 "*(u64*)(r10 - 8) = r1;" 286 285 "*(u64*)(r10 - 16) = r1;" 287 - /* force r1 to be precise, this immediately marks: 286 + "if r1 > 7 goto +0;" 287 + /* force r1 to be precise, this eventually marks: 288 288 * - foo frame r1,fp{-8,-16} 289 289 * - main frame r1,fp{-8} 290 290 */ ··· 301 299 SEC("socket") 302 300 __success __log_level(2) 303 301 /* r{6,7} */ 304 - __msg("11: (0f) r3 += r7") 305 - __msg("frame0: regs=r6,r7 stack= before 10: (bf) r3 = r10") 302 + __msg("12: (0f) r3 += r7") 303 + __msg("frame0: regs=r7 stack= before 11: (bf) r3 = r10") 304 + __msg("frame0: regs=r7 stack= before 9: (25) if r7 > 0x7 goto pc+0") 306 305 /* ... skip some insns ... 
*/ 307 306 __msg("frame0: regs=r6,r7 stack= before 3: (bf) r7 = r0") 308 307 __msg("frame0: regs=r0,r6 stack= before 2: (bf) r6 = r0") 309 308 /* r{8,9} */ 310 - __msg("12: (0f) r3 += r9") 311 - __msg("frame0: regs=r8,r9 stack= before 11: (0f) r3 += r7") 309 + __msg("13: (0f) r3 += r9") 310 + __msg("frame0: regs=r9 stack= before 12: (0f) r3 += r7") 312 311 /* ... skip some insns ... */ 312 + __msg("frame0: regs=r9 stack= before 10: (25) if r9 > 0x7 goto pc+0") 313 313 __msg("frame0: regs=r8,r9 stack= before 7: (bf) r9 = r0") 314 314 __msg("frame0: regs=r0,r8 stack= before 6: (bf) r8 = r0") 315 315 __flag(BPF_F_TEST_STATE_FREQ) ··· 332 328 "r9 = r0;" 333 329 /* clear r0 id */ 334 330 "r0 = 0;" 335 - /* force checkpoint */ 336 - "goto +0;" 331 + /* propagate equal scalars precision */ 332 + "if r7 > 7 goto +0;" 333 + "if r9 > 7 goto +0;" 337 334 "r3 = r10;" 338 335 /* force r7 to be precise, this also marks r6 */ 339 336 "r3 += r7;" 340 337 /* force r9 to be precise, this also marks r8 */ 341 338 "r3 += r9;" 339 + "exit;" 340 + : 341 + : __imm(bpf_ktime_get_ns) 342 + : __clobber_all); 343 + } 344 + 345 + SEC("socket") 346 + __success __log_level(2) 347 + __flag(BPF_F_TEST_STATE_FREQ) 348 + /* check that r0 and r6 have different IDs after 'if', 349 + * collect_linked_regs() can't tie more than 6 registers for a single insn. 
350 + */ 351 + __msg("8: (25) if r0 > 0x7 goto pc+0 ; R0=scalar(id=1") 352 + __msg("9: (bf) r6 = r6 ; R6_w=scalar(id=2") 353 + /* check that r{0-5} are marked precise after 'if' */ 354 + __msg("frame0: regs=r0 stack= before 8: (25) if r0 > 0x7 goto pc+0") 355 + __msg("frame0: parent state regs=r0,r1,r2,r3,r4,r5 stack=:") 356 + __naked void linked_regs_too_many_regs(void) 357 + { 358 + asm volatile ( 359 + /* r0 = random number up to 0xff */ 360 + "call %[bpf_ktime_get_ns];" 361 + "r0 &= 0xff;" 362 + /* tie r{0-6} IDs */ 363 + "r1 = r0;" 364 + "r2 = r0;" 365 + "r3 = r0;" 366 + "r4 = r0;" 367 + "r5 = r0;" 368 + "r6 = r0;" 369 + /* propagate range for r{0-6} */ 370 + "if r0 > 7 goto +0;" 371 + /* make r6 appear in the log */ 372 + "r6 = r6;" 373 + /* force r0 to be precise, 374 + * this would cause r{0-4} to be precise because of shared IDs 375 + */ 376 + "r7 = r10;" 377 + "r7 += r0;" 378 + "r0 = 0;" 379 + "exit;" 380 + : 381 + : __imm(bpf_ktime_get_ns) 382 + : __clobber_all); 383 + } 384 + 385 + SEC("socket") 386 + __failure __log_level(2) 387 + __flag(BPF_F_TEST_STATE_FREQ) 388 + __msg("regs=r7 stack= before 5: (3d) if r8 >= r0") 389 + __msg("parent state regs=r0,r7,r8") 390 + __msg("regs=r0,r7,r8 stack= before 4: (25) if r0 > 0x1") 391 + __msg("div by zero") 392 + __naked void linked_regs_broken_link_2(void) 393 + { 394 + asm volatile ( 395 + "call %[bpf_get_prandom_u32];" 396 + "r7 = r0;" 397 + "r8 = r0;" 398 + "call %[bpf_get_prandom_u32];" 399 + "if r0 > 1 goto +0;" 400 + /* r7.id == r8.id, 401 + * thus r7 precision implies r8 precision, 402 + * which implies r0 precision because of the conditional below. 
403 + */ 404 + "if r8 >= r0 goto 1f;" 405 + /* break id relation between r7 and r8 */ 406 + "r8 += r8;" 407 + /* make r7 precise */ 408 + "if r7 == 0 goto 1f;" 409 + "r0 /= 0;" 410 + "1:" 411 + "r0 = 42;" 412 + "exit;" 413 + : 414 + : __imm(bpf_get_prandom_u32) 415 + : __clobber_all); 416 + } 417 + 418 + /* Check that mark_chain_precision() for one of the conditional jump 419 + * operands does not trigger equal scalars precision propagation. 420 + */ 421 + SEC("socket") 422 + __success __log_level(2) 423 + __msg("3: (25) if r1 > 0x100 goto pc+0") 424 + __msg("frame0: regs=r1 stack= before 2: (bf) r1 = r0") 425 + __naked void cjmp_no_linked_regs_trigger(void) 426 + { 427 + asm volatile ( 428 + /* r0 = random number up to 0xff */ 429 + "call %[bpf_ktime_get_ns];" 430 + "r0 &= 0xff;" 431 + /* tie r0.id == r1.id */ 432 + "r1 = r0;" 433 + /* the jump below would be predicted, thus r1 would be marked precise, 434 + * this should not imply precision mark for r0 435 + */ 436 + "if r1 > 256 goto +0;" 437 + "r0 = 0;" 342 438 "exit;" 343 439 : 344 440 : __imm(bpf_ktime_get_ns)

+439

tools/testing/selftests/bpf/progs/verifier_sdiv.c

··· 1 1 // SPDX-License-Identifier: GPL-2.0 2 2 3 3 #include <linux/bpf.h> 4 + #include <limits.h> 4 5 #include <bpf/bpf_helpers.h> 5 6 #include "bpf_misc.h" 6 7 ··· 769 768 r0 = r2; \ 770 769 exit; \ 771 770 " ::: __clobber_all); 771 + } 772 + 773 + SEC("socket") 774 + __description("SDIV64, overflow r/r, LLONG_MIN/-1") 775 + __success __retval(1) 776 + __arch_x86_64 777 + __xlated("0: r2 = 0x8000000000000000") 778 + __xlated("2: r3 = -1") 779 + __xlated("3: r4 = r2") 780 + __xlated("4: r11 = r3") 781 + __xlated("5: r11 += 1") 782 + __xlated("6: if r11 > 0x1 goto pc+4") 783 + __xlated("7: if r11 == 0x0 goto pc+1") 784 + __xlated("8: r2 = 0") 785 + __xlated("9: r2 = -r2") 786 + __xlated("10: goto pc+1") 787 + __xlated("11: r2 s/= r3") 788 + __xlated("12: r0 = 0") 789 + __xlated("13: if r2 != r4 goto pc+1") 790 + __xlated("14: r0 = 1") 791 + __xlated("15: exit") 792 + __naked void sdiv64_overflow_rr(void) 793 + { 794 + asm volatile (" \ 795 + r2 = %[llong_min] ll; \ 796 + r3 = -1; \ 797 + r4 = r2; \ 798 + r2 s/= r3; \ 799 + r0 = 0; \ 800 + if r2 != r4 goto +1; \ 801 + r0 = 1; \ 802 + exit; \ 803 + " : 804 + : __imm_const(llong_min, LLONG_MIN) 805 + : __clobber_all); 806 + } 807 + 808 + SEC("socket") 809 + __description("SDIV64, r/r, small_val/-1") 810 + __success __retval(-5) 811 + __arch_x86_64 812 + __xlated("0: r2 = 5") 813 + __xlated("1: r3 = -1") 814 + __xlated("2: r11 = r3") 815 + __xlated("3: r11 += 1") 816 + __xlated("4: if r11 > 0x1 goto pc+4") 817 + __xlated("5: if r11 == 0x0 goto pc+1") 818 + __xlated("6: r2 = 0") 819 + __xlated("7: r2 = -r2") 820 + __xlated("8: goto pc+1") 821 + __xlated("9: r2 s/= r3") 822 + __xlated("10: r0 = r2") 823 + __xlated("11: exit") 824 + __naked void sdiv64_rr_divisor_neg_1(void) 825 + { 826 + asm volatile (" \ 827 + r2 = 5; \ 828 + r3 = -1; \ 829 + r2 s/= r3; \ 830 + r0 = r2; \ 831 + exit; \ 832 + " : 833 + : 834 + : __clobber_all); 835 + } 836 + 837 + SEC("socket") 838 + __description("SDIV64, overflow r/i, LLONG_MIN/-1") 
839 + __success __retval(1) 840 + __arch_x86_64 841 + __xlated("0: r2 = 0x8000000000000000") 842 + __xlated("2: r4 = r2") 843 + __xlated("3: r2 = -r2") 844 + __xlated("4: r0 = 0") 845 + __xlated("5: if r2 != r4 goto pc+1") 846 + __xlated("6: r0 = 1") 847 + __xlated("7: exit") 848 + __naked void sdiv64_overflow_ri(void) 849 + { 850 + asm volatile (" \ 851 + r2 = %[llong_min] ll; \ 852 + r4 = r2; \ 853 + r2 s/= -1; \ 854 + r0 = 0; \ 855 + if r2 != r4 goto +1; \ 856 + r0 = 1; \ 857 + exit; \ 858 + " : 859 + : __imm_const(llong_min, LLONG_MIN) 860 + : __clobber_all); 861 + } 862 + 863 + SEC("socket") 864 + __description("SDIV64, r/i, small_val/-1") 865 + __success __retval(-5) 866 + __arch_x86_64 867 + __xlated("0: r2 = 5") 868 + __xlated("1: r4 = r2") 869 + __xlated("2: r2 = -r2") 870 + __xlated("3: r0 = r2") 871 + __xlated("4: exit") 872 + __naked void sdiv64_ri_divisor_neg_1(void) 873 + { 874 + asm volatile (" \ 875 + r2 = 5; \ 876 + r4 = r2; \ 877 + r2 s/= -1; \ 878 + r0 = r2; \ 879 + exit; \ 880 + " : 881 + : 882 + : __clobber_all); 883 + } 884 + 885 + SEC("socket") 886 + __description("SDIV32, overflow r/r, INT_MIN/-1") 887 + __success __retval(1) 888 + __arch_x86_64 889 + __xlated("0: w2 = -2147483648") 890 + __xlated("1: w3 = -1") 891 + __xlated("2: w4 = w2") 892 + __xlated("3: r11 = r3") 893 + __xlated("4: w11 += 1") 894 + __xlated("5: if w11 > 0x1 goto pc+4") 895 + __xlated("6: if w11 == 0x0 goto pc+1") 896 + __xlated("7: w2 = 0") 897 + __xlated("8: w2 = -w2") 898 + __xlated("9: goto pc+1") 899 + __xlated("10: w2 s/= w3") 900 + __xlated("11: r0 = 0") 901 + __xlated("12: if w2 != w4 goto pc+1") 902 + __xlated("13: r0 = 1") 903 + __xlated("14: exit") 904 + __naked void sdiv32_overflow_rr(void) 905 + { 906 + asm volatile (" \ 907 + w2 = %[int_min]; \ 908 + w3 = -1; \ 909 + w4 = w2; \ 910 + w2 s/= w3; \ 911 + r0 = 0; \ 912 + if w2 != w4 goto +1; \ 913 + r0 = 1; \ 914 + exit; \ 915 + " : 916 + : __imm_const(int_min, INT_MIN) 917 + : __clobber_all); 918 + } 919 + 
920 + SEC("socket") 921 + __description("SDIV32, r/r, small_val/-1") 922 + __success __retval(5) 923 + __arch_x86_64 924 + __xlated("0: w2 = -5") 925 + __xlated("1: w3 = -1") 926 + __xlated("2: w4 = w2") 927 + __xlated("3: r11 = r3") 928 + __xlated("4: w11 += 1") 929 + __xlated("5: if w11 > 0x1 goto pc+4") 930 + __xlated("6: if w11 == 0x0 goto pc+1") 931 + __xlated("7: w2 = 0") 932 + __xlated("8: w2 = -w2") 933 + __xlated("9: goto pc+1") 934 + __xlated("10: w2 s/= w3") 935 + __xlated("11: w0 = w2") 936 + __xlated("12: exit") 937 + __naked void sdiv32_rr_divisor_neg_1(void) 938 + { 939 + asm volatile (" \ 940 + w2 = -5; \ 941 + w3 = -1; \ 942 + w4 = w2; \ 943 + w2 s/= w3; \ 944 + w0 = w2; \ 945 + exit; \ 946 + " : 947 + : 948 + : __clobber_all); 949 + } 950 + 951 + SEC("socket") 952 + __description("SDIV32, overflow r/i, INT_MIN/-1") 953 + __success __retval(1) 954 + __arch_x86_64 955 + __xlated("0: w2 = -2147483648") 956 + __xlated("1: w4 = w2") 957 + __xlated("2: w2 = -w2") 958 + __xlated("3: r0 = 0") 959 + __xlated("4: if w2 != w4 goto pc+1") 960 + __xlated("5: r0 = 1") 961 + __xlated("6: exit") 962 + __naked void sdiv32_overflow_ri(void) 963 + { 964 + asm volatile (" \ 965 + w2 = %[int_min]; \ 966 + w4 = w2; \ 967 + w2 s/= -1; \ 968 + r0 = 0; \ 969 + if w2 != w4 goto +1; \ 970 + r0 = 1; \ 971 + exit; \ 972 + " : 973 + : __imm_const(int_min, INT_MIN) 974 + : __clobber_all); 975 + } 976 + 977 + SEC("socket") 978 + __description("SDIV32, r/i, small_val/-1") 979 + __success __retval(-5) 980 + __arch_x86_64 981 + __xlated("0: w2 = 5") 982 + __xlated("1: w4 = w2") 983 + __xlated("2: w2 = -w2") 984 + __xlated("3: w0 = w2") 985 + __xlated("4: exit") 986 + __naked void sdiv32_ri_divisor_neg_1(void) 987 + { 988 + asm volatile (" \ 989 + w2 = 5; \ 990 + w4 = w2; \ 991 + w2 s/= -1; \ 992 + w0 = w2; \ 993 + exit; \ 994 + " : 995 + : 996 + : __clobber_all); 997 + } 998 + 999 + SEC("socket") 1000 + __description("SMOD64, overflow r/r, LLONG_MIN/-1") 1001 + __success 
__retval(0) 1002 + __arch_x86_64 1003 + __xlated("0: r2 = 0x8000000000000000") 1004 + __xlated("2: r3 = -1") 1005 + __xlated("3: r4 = r2") 1006 + __xlated("4: r11 = r3") 1007 + __xlated("5: r11 += 1") 1008 + __xlated("6: if r11 > 0x1 goto pc+3") 1009 + __xlated("7: if r11 == 0x1 goto pc+3") 1010 + __xlated("8: w2 = 0") 1011 + __xlated("9: goto pc+1") 1012 + __xlated("10: r2 s%= r3") 1013 + __xlated("11: r0 = r2") 1014 + __xlated("12: exit") 1015 + __naked void smod64_overflow_rr(void) 1016 + { 1017 + asm volatile (" \ 1018 + r2 = %[llong_min] ll; \ 1019 + r3 = -1; \ 1020 + r4 = r2; \ 1021 + r2 s%%= r3; \ 1022 + r0 = r2; \ 1023 + exit; \ 1024 + " : 1025 + : __imm_const(llong_min, LLONG_MIN) 1026 + : __clobber_all); 1027 + } 1028 + 1029 + SEC("socket") 1030 + __description("SMOD64, r/r, small_val/-1") 1031 + __success __retval(0) 1032 + __arch_x86_64 1033 + __xlated("0: r2 = 5") 1034 + __xlated("1: r3 = -1") 1035 + __xlated("2: r4 = r2") 1036 + __xlated("3: r11 = r3") 1037 + __xlated("4: r11 += 1") 1038 + __xlated("5: if r11 > 0x1 goto pc+3") 1039 + __xlated("6: if r11 == 0x1 goto pc+3") 1040 + __xlated("7: w2 = 0") 1041 + __xlated("8: goto pc+1") 1042 + __xlated("9: r2 s%= r3") 1043 + __xlated("10: r0 = r2") 1044 + __xlated("11: exit") 1045 + __naked void smod64_rr_divisor_neg_1(void) 1046 + { 1047 + asm volatile (" \ 1048 + r2 = 5; \ 1049 + r3 = -1; \ 1050 + r4 = r2; \ 1051 + r2 s%%= r3; \ 1052 + r0 = r2; \ 1053 + exit; \ 1054 + " : 1055 + : 1056 + : __clobber_all); 1057 + } 1058 + 1059 + SEC("socket") 1060 + __description("SMOD64, overflow r/i, LLONG_MIN/-1") 1061 + __success __retval(0) 1062 + __arch_x86_64 1063 + __xlated("0: r2 = 0x8000000000000000") 1064 + __xlated("2: r4 = r2") 1065 + __xlated("3: w2 = 0") 1066 + __xlated("4: r0 = r2") 1067 + __xlated("5: exit") 1068 + __naked void smod64_overflow_ri(void) 1069 + { 1070 + asm volatile (" \ 1071 + r2 = %[llong_min] ll; \ 1072 + r4 = r2; \ 1073 + r2 s%%= -1; \ 1074 + r0 = r2; \ 1075 + exit; \ 1076 + " : 1077 + 
: __imm_const(llong_min, LLONG_MIN) 1078 + : __clobber_all); 1079 + } 1080 + 1081 + SEC("socket") 1082 + __description("SMOD64, r/i, small_val/-1") 1083 + __success __retval(0) 1084 + __arch_x86_64 1085 + __xlated("0: r2 = 5") 1086 + __xlated("1: r4 = r2") 1087 + __xlated("2: w2 = 0") 1088 + __xlated("3: r0 = r2") 1089 + __xlated("4: exit") 1090 + __naked void smod64_ri_divisor_neg_1(void) 1091 + { 1092 + asm volatile (" \ 1093 + r2 = 5; \ 1094 + r4 = r2; \ 1095 + r2 s%%= -1; \ 1096 + r0 = r2; \ 1097 + exit; \ 1098 + " : 1099 + : 1100 + : __clobber_all); 1101 + } 1102 + 1103 + SEC("socket") 1104 + __description("SMOD32, overflow r/r, INT_MIN/-1") 1105 + __success __retval(0) 1106 + __arch_x86_64 1107 + __xlated("0: w2 = -2147483648") 1108 + __xlated("1: w3 = -1") 1109 + __xlated("2: w4 = w2") 1110 + __xlated("3: r11 = r3") 1111 + __xlated("4: w11 += 1") 1112 + __xlated("5: if w11 > 0x1 goto pc+3") 1113 + __xlated("6: if w11 == 0x1 goto pc+4") 1114 + __xlated("7: w2 = 0") 1115 + __xlated("8: goto pc+1") 1116 + __xlated("9: w2 s%= w3") 1117 + __xlated("10: goto pc+1") 1118 + __xlated("11: w2 = w2") 1119 + __xlated("12: r0 = r2") 1120 + __xlated("13: exit") 1121 + __naked void smod32_overflow_rr(void) 1122 + { 1123 + asm volatile (" \ 1124 + w2 = %[int_min]; \ 1125 + w3 = -1; \ 1126 + w4 = w2; \ 1127 + w2 s%%= w3; \ 1128 + r0 = r2; \ 1129 + exit; \ 1130 + " : 1131 + : __imm_const(int_min, INT_MIN) 1132 + : __clobber_all); 1133 + } 1134 + 1135 + SEC("socket") 1136 + __description("SMOD32, r/r, small_val/-1") 1137 + __success __retval(0) 1138 + __arch_x86_64 1139 + __xlated("0: w2 = -5") 1140 + __xlated("1: w3 = -1") 1141 + __xlated("2: w4 = w2") 1142 + __xlated("3: r11 = r3") 1143 + __xlated("4: w11 += 1") 1144 + __xlated("5: if w11 > 0x1 goto pc+3") 1145 + __xlated("6: if w11 == 0x1 goto pc+4") 1146 + __xlated("7: w2 = 0") 1147 + __xlated("8: goto pc+1") 1148 + __xlated("9: w2 s%= w3") 1149 + __xlated("10: goto pc+1") 1150 + __xlated("11: w2 = w2") 1151 + 
__xlated("12: r0 = r2") 1152 + __xlated("13: exit") 1153 + __naked void smod32_rr_divisor_neg_1(void) 1154 + { 1155 + asm volatile (" \ 1156 + w2 = -5; \ 1157 + w3 = -1; \ 1158 + w4 = w2; \ 1159 + w2 s%%= w3; \ 1160 + r0 = r2; \ 1161 + exit; \ 1162 + " : 1163 + : 1164 + : __clobber_all); 1165 + } 1166 + 1167 + SEC("socket") 1168 + __description("SMOD32, overflow r/i, INT_MIN/-1") 1169 + __success __retval(0) 1170 + __arch_x86_64 1171 + __xlated("0: w2 = -2147483648") 1172 + __xlated("1: w4 = w2") 1173 + __xlated("2: w2 = 0") 1174 + __xlated("3: r0 = r2") 1175 + __xlated("4: exit") 1176 + __naked void smod32_overflow_ri(void) 1177 + { 1178 + asm volatile (" \ 1179 + w2 = %[int_min]; \ 1180 + w4 = w2; \ 1181 + w2 s%%= -1; \ 1182 + r0 = r2; \ 1183 + exit; \ 1184 + " : 1185 + : __imm_const(int_min, INT_MIN) 1186 + : __clobber_all); 1187 + } 1188 + 1189 + SEC("socket") 1190 + __description("SMOD32, r/i, small_val/-1") 1191 + __success __retval(0) 1192 + __arch_x86_64 1193 + __xlated("0: w2 = 5") 1194 + __xlated("1: w4 = w2") 1195 + __xlated("2: w2 = 0") 1196 + __xlated("3: w0 = w2") 1197 + __xlated("4: exit") 1198 + __naked void smod32_ri_divisor_neg_1(void) 1199 + { 1200 + asm volatile (" \ 1201 + w2 = 5; \ 1202 + w4 = w2; \ 1203 + w2 s%%= -1; \ 1204 + w0 = w2; \ 1205 + exit; \ 1206 + " : 1207 + : 1208 + : __clobber_all); 772 1209 } 773 1210 774 1211 #else
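Reviewer note: the __xlated sequences in the verifier_sdiv.c tests above special-case divisors of 0 and -1 before the real signed divide or modulo runs. The following is a minimal C model of the results those sequences produce, as read from the expected output above. The function names are hypothetical and this is a sketch, not the kernel's implementation. It assumes the usual two's-complement wrap when an out-of-range unsigned value is converted back to a signed type.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Model of 64-bit "s/=": divisor 0 yields 0, divisor -1 yields the
 * wrapping negation of the dividend, so LLONG_MIN / -1 stays LLONG_MIN
 * instead of trapping.
 */
static inline int64_t model_sdiv64(int64_t dividend, int64_t divisor)
{
	if (divisor == 0)
		return 0;
	if (divisor == -1)
		return (int64_t)(0 - (uint64_t)dividend);
	return dividend / divisor;
}

/* Model of 64-bit "s%=": divisor 0 leaves the dividend unchanged,
 * divisor -1 yields 0.
 */
static inline int64_t model_smod64(int64_t dividend, int64_t divisor)
{
	if (divisor == 0)
		return dividend;
	if (divisor == -1)
		return 0;
	return dividend % divisor;
}

/* 32-bit variants, following the same pattern on w-registers. */
static inline int32_t model_sdiv32(int32_t dividend, int32_t divisor)
{
	if (divisor == 0)
		return 0;
	if (divisor == -1)
		return (int32_t)(0u - (uint32_t)dividend);
	return dividend / divisor;
}

static inline int32_t model_smod32(int32_t dividend, int32_t divisor)
{
	if (divisor == 0)
		return dividend;
	if (divisor == -1)
		return 0;
	return dividend % divisor;
}

/* Mirrors the overflow cases from the selftests; assumes the model above. */
static inline void sdiv_model_self_check(void)
{
	assert(model_sdiv64(LLONG_MIN, -1) == LLONG_MIN);
	assert(model_sdiv32(INT_MIN, -1) == INT_MIN);
	assert(model_smod64(LLONG_MIN, -1) == 0);
	assert(model_smod32(INT_MIN, -1) == 0);
}
```

In the register/immediate tests the divisor is a known constant, so the expected output shows only the negation (for s/=) or the zeroing (for s%=), with no runtime comparison against r11.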

+12 -12

tools/testing/selftests/bpf/progs/verifier_spill_fill.c

··· 402 402 *(u32*)(r10 - 8) = r1; \ 403 403 /* 32-bit fill r2 from stack. */ \ 404 404 r2 = *(u32*)(r10 - 8); \ 405 - /* Compare r2 with another register to trigger find_equal_scalars.\ 405 + /* Compare r2 with another register to trigger sync_linked_regs.\ 406 406 * Having one random bit is important here, otherwise the verifier cuts\ 407 407 * the corners. If the ID was mistakenly preserved on spill, this would\ 408 408 * cause the verifier to think that r1 is also equal to zero in one of\ ··· 441 441 *(u16*)(r10 - 8) = r1; \ 442 442 /* 16-bit fill r2 from stack. */ \ 443 443 r2 = *(u16*)(r10 - 8); \ 444 - /* Compare r2 with another register to trigger find_equal_scalars.\ 444 + /* Compare r2 with another register to trigger sync_linked_regs.\ 445 445 * Having one random bit is important here, otherwise the verifier cuts\ 446 446 * the corners. If the ID was mistakenly preserved on spill, this would\ 447 447 * cause the verifier to think that r1 is also equal to zero in one of\ ··· 833 833 *(u64*)(r10 - 8) = r0; \ 834 834 /* 64-bit fill r1 from stack - should preserve the ID. */\ 835 835 r1 = *(u64*)(r10 - 8); \ 836 - /* Compare r1 with another register to trigger find_equal_scalars.\ 836 + /* Compare r1 with another register to trigger sync_linked_regs.\ 837 837 * Having one random bit is important here, otherwise the verifier cuts\ 838 838 * the corners. \ 839 839 */ \ ··· 866 866 *(u32*)(r10 - 8) = r0; \ 867 867 /* 32-bit fill r1 from stack - should preserve the ID. */\ 868 868 r1 = *(u32*)(r10 - 8); \ 869 - /* Compare r1 with another register to trigger find_equal_scalars.\ 869 + /* Compare r1 with another register to trigger sync_linked_regs.\ 870 870 * Having one random bit is important here, otherwise the verifier cuts\ 871 871 * the corners. \ 872 872 */ \ ··· 899 899 *(u16*)(r10 - 8) = r0; \ 900 900 /* 16-bit fill r1 from stack - should preserve the ID. 
*/\ 901 901 r1 = *(u16*)(r10 - 8); \ 902 - /* Compare r1 with another register to trigger find_equal_scalars.\ 902 + /* Compare r1 with another register to trigger sync_linked_regs.\ 903 903 * Having one random bit is important here, otherwise the verifier cuts\ 904 904 * the corners. \ 905 905 */ \ ··· 932 932 *(u8*)(r10 - 8) = r0; \ 933 933 /* 8-bit fill r1 from stack - should preserve the ID. */\ 934 934 r1 = *(u8*)(r10 - 8); \ 935 - /* Compare r1 with another register to trigger find_equal_scalars.\ 935 + /* Compare r1 with another register to trigger sync_linked_regs.\ 936 936 * Having one random bit is important here, otherwise the verifier cuts\ 937 937 * the corners. \ 938 938 */ \ ··· 1029 1029 "r1 = *(u32*)(r10 - 4);" 1030 1030 #endif 1031 1031 " \ 1032 - /* Compare r1 with another register to trigger find_equal_scalars. */\ 1032 + /* Compare r1 with another register to trigger sync_linked_regs. */\ 1033 1033 r2 = 0; \ 1034 1034 if r1 != r2 goto l0_%=; \ 1035 1035 /* The result of this comparison is predefined. */\ ··· 1070 1070 "r2 = *(u32*)(r10 - 4);" 1071 1071 #endif 1072 1072 " \ 1073 - /* Compare r2 with another register to trigger find_equal_scalars.\ 1073 + /* Compare r2 with another register to trigger sync_linked_regs.\ 1074 1074 * Having one random bit is important here, otherwise the verifier cuts\ 1075 1075 * the corners. If the ID was mistakenly preserved on fill, this would\ 1076 1076 * cause the verifier to think that r1 is also equal to zero in one of\ ··· 1213 1213 * - once for path entry - label 2; 1214 1214 * - once for path entry - label 1 - label 2. 
1215 1215 */ 1216 - __msg("r1 = *(u64 *)(r10 -8)") 1217 - __msg("exit") 1218 - __msg("r1 = *(u64 *)(r10 -8)") 1219 - __msg("exit") 1216 + __msg("8: (79) r1 = *(u64 *)(r10 -8)") 1217 + __msg("9: (95) exit") 1218 + __msg("from 2 to 7") 1219 + __msg("8: safe") 1220 1220 __msg("processed 11 insns") 1221 1221 __flag(BPF_F_TEST_STATE_FREQ) 1222 1222 __naked void old_stack_misc_vs_cur_ctx_ptr(void)

+1 -1

tools/testing/selftests/bpf/progs/verifier_subprog_precision.c

··· 278 278 __msg("mark_precise: frame0: regs=r6 stack= before 13: (bf) r1 = r7") 279 279 __msg("mark_precise: frame0: regs=r6 stack= before 12: (27) r6 *= 4") 280 280 __msg("mark_precise: frame0: regs=r6 stack= before 11: (25) if r6 > 0x3 goto pc+4") 281 - __msg("mark_precise: frame0: regs=r6 stack= before 10: (bf) r6 = r0") 281 + __msg("mark_precise: frame0: regs=r0,r6 stack= before 10: (bf) r6 = r0") 282 282 __msg("mark_precise: frame0: regs=r0 stack= before 9: (85) call bpf_loop") 283 283 /* State entering callback body popped from states stack */ 284 284 __msg("from 9 to 17: frame1:")

+105

tools/testing/selftests/bpf/progs/verifier_tailcall_jit.c

··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + #include <linux/bpf.h> 3 + #include <bpf/bpf_helpers.h> 4 + #include "bpf_misc.h" 5 + 6 + int main(void); 7 + 8 + struct { 9 + __uint(type, BPF_MAP_TYPE_PROG_ARRAY); 10 + __uint(max_entries, 1); 11 + __uint(key_size, sizeof(__u32)); 12 + __array(values, void (void)); 13 + } jmp_table SEC(".maps") = { 14 + .values = { 15 + [0] = (void *) &main, 16 + }, 17 + }; 18 + 19 + __noinline __auxiliary 20 + static __naked int sub(void) 21 + { 22 + asm volatile ( 23 + "r2 = %[jmp_table] ll;" 24 + "r3 = 0;" 25 + "call 12;" 26 + "exit;" 27 + : 28 + : __imm_addr(jmp_table) 29 + : __clobber_all); 30 + } 31 + 32 + __success 33 + __arch_x86_64 34 + /* program entry for main(), regular function prologue */ 35 + __jited(" endbr64") 36 + __jited(" nopl (%rax,%rax)") 37 + __jited(" xorq %rax, %rax") 38 + __jited(" pushq %rbp") 39 + __jited(" movq %rsp, %rbp") 40 + /* tail call prologue for program: 41 + * - establish memory location for tail call counter at &rbp[-8]; 42 + * - spill tail_call_cnt_ptr at &rbp[-16]; 43 + * - expect tail call counter to be passed in rax; 44 + * - for entry program rax is a raw counter, value < 33; 45 + * - for tail called program rax is tail_call_cnt_ptr (value > 33). 
46 + */ 47 + __jited(" endbr64") 48 + __jited(" cmpq $0x21, %rax") 49 + __jited(" ja L0") 50 + __jited(" pushq %rax") 51 + __jited(" movq %rsp, %rax") 52 + __jited(" jmp L1") 53 + __jited("L0: pushq %rax") /* rbp[-8] = rax */ 54 + __jited("L1: pushq %rax") /* rbp[-16] = rax */ 55 + /* on subprogram call restore rax to be tail_call_cnt_ptr from rbp[-16] 56 + * (cause original rax might be clobbered by this point) 57 + */ 58 + __jited(" movq -0x10(%rbp), %rax") 59 + __jited(" callq 0x{{.*}}") /* call to sub() */ 60 + __jited(" xorl %eax, %eax") 61 + __jited(" leave") 62 + __jited(" {{(retq|jmp 0x)}}") /* return or jump to rethunk */ 63 + __jited("...") 64 + /* subprogram entry for sub(), regular function prologue */ 65 + __jited(" endbr64") 66 + __jited(" nopl (%rax,%rax)") 67 + __jited(" nopl (%rax)") 68 + __jited(" pushq %rbp") 69 + __jited(" movq %rsp, %rbp") 70 + /* tail call prologue for subprogram address of tail call counter 71 + * stored at rbp[-16]. 72 + */ 73 + __jited(" endbr64") 74 + __jited(" pushq %rax") /* rbp[-8] = rax */ 75 + __jited(" pushq %rax") /* rbp[-16] = rax */ 76 + __jited(" movabsq ${{.*}}, %rsi") /* r2 = &jmp_table */ 77 + __jited(" xorl %edx, %edx") /* r3 = 0 */ 78 + /* bpf_tail_call implementation: 79 + * - load tail_call_cnt_ptr from rbp[-16]; 80 + * - if *tail_call_cnt_ptr < 33, increment it and jump to target; 81 + * - otherwise do nothing. 
82 + */ 83 + __jited(" movq -0x10(%rbp), %rax") 84 + __jited(" cmpq $0x21, (%rax)") 85 + __jited(" jae L0") 86 + __jited(" nopl (%rax,%rax)") 87 + __jited(" addq $0x1, (%rax)") /* *tail_call_cnt_ptr += 1 */ 88 + __jited(" popq %rax") 89 + __jited(" popq %rax") 90 + __jited(" jmp {{.*}}") /* jump to tail call tgt */ 91 + __jited("L0: leave") 92 + __jited(" {{(retq|jmp 0x)}}") /* return or jump to rethunk */ 93 + SEC("tc") 94 + __naked int main(void) 95 + { 96 + asm volatile ( 97 + "call %[sub];" 98 + "r0 = 0;" 99 + "exit;" 100 + : 101 + : __imm(sub) 102 + : __clobber_all); 103 + } 104 + 105 + char __license[] SEC("license") = "GPL";
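Review note: the `bpf_tail_call` comment in the expected disassembly describes a counter check through `tail_call_cnt_ptr`. A minimal C model of just that check follows. `tail_call_allowed` is a hypothetical helper, and `MAX_TAIL_CALL_CNT` is defined locally to mirror the `$0x21` immediate in the `__jited` lines; the real JIT emits machine code rather than calling anything like this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_TAIL_CALL_CNT 33	/* 0x21 in the expected disassembly */

/* Hypothetical model of the check performed before the tail-call jump:
 * returns true (and bumps the counter) if the jump may proceed.
 */
static bool tail_call_allowed(uint64_t *tail_call_cnt_ptr)
{
	if (*tail_call_cnt_ptr >= MAX_TAIL_CALL_CNT)
		return false;		/* "jae L0": skip the jump */
	*tail_call_cnt_ptr += 1;	/* "addq $0x1, (%rax)" */
	return true;
}
```

Because every program in the chain shares one counter through the pointer, a subprogram that tail-calls back into `main` (as `sub()` does here) is bounded by the same limit of 33 jumps.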

+85

tools/testing/selftests/bpf/progs/verifier_vfs_accept.c

··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + /* Copyright (c) 2024 Google LLC. */ 3 + 4 + #include <vmlinux.h> 5 + #include <bpf/bpf_helpers.h> 6 + #include <bpf/bpf_tracing.h> 7 + 8 + #include "bpf_misc.h" 9 + #include "bpf_experimental.h" 10 + 11 + static char buf[64]; 12 + 13 + SEC("lsm.s/file_open") 14 + __success 15 + int BPF_PROG(get_task_exe_file_and_put_kfunc_from_current_sleepable) 16 + { 17 + struct file *acquired; 18 + 19 + acquired = bpf_get_task_exe_file(bpf_get_current_task_btf()); 20 + if (!acquired) 21 + return 0; 22 + 23 + bpf_put_file(acquired); 24 + return 0; 25 + } 26 + 27 + SEC("lsm/file_open") 28 + __success 29 + int BPF_PROG(get_task_exe_file_and_put_kfunc_from_current_non_sleepable, struct file *file) 30 + { 31 + struct file *acquired; 32 + 33 + acquired = bpf_get_task_exe_file(bpf_get_current_task_btf()); 34 + if (!acquired) 35 + return 0; 36 + 37 + bpf_put_file(acquired); 38 + return 0; 39 + } 40 + 41 + SEC("lsm.s/task_alloc") 42 + __success 43 + int BPF_PROG(get_task_exe_file_and_put_kfunc_from_argument, 44 + struct task_struct *task) 45 + { 46 + struct file *acquired; 47 + 48 + acquired = bpf_get_task_exe_file(task); 49 + if (!acquired) 50 + return 0; 51 + 52 + bpf_put_file(acquired); 53 + return 0; 54 + } 55 + 56 + SEC("lsm.s/inode_getattr") 57 + __success 58 + int BPF_PROG(path_d_path_from_path_argument, struct path *path) 59 + { 60 + int ret; 61 + 62 + ret = bpf_path_d_path(path, buf, sizeof(buf)); 63 + __sink(ret); 64 + return 0; 65 + } 66 + 67 + SEC("lsm.s/file_open") 68 + __success 69 + int BPF_PROG(path_d_path_from_file_argument, struct file *file) 70 + { 71 + int ret; 72 + struct path *path; 73 + 74 + /* The f_path member is a path which is embedded directly within a 75 + * file. Therefore, a pointer to such embedded members are still 76 + * recognized by the BPF verifier as being PTR_TRUSTED as it's 77 + * essentially PTR_TRUSTED w/ a non-zero fixed offset. 
78 + */ 79 + path = &file->f_path; 80 + ret = bpf_path_d_path(path, buf, sizeof(buf)); 81 + __sink(ret); 82 + return 0; 83 + } 84 + 85 + char _license[] SEC("license") = "GPL";

+161

tools/testing/selftests/bpf/progs/verifier_vfs_reject.c

··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + /* Copyright (c) 2024 Google LLC. */ 3 + 4 + #include <vmlinux.h> 5 + #include <bpf/bpf_helpers.h> 6 + #include <bpf/bpf_tracing.h> 7 + #include <linux/limits.h> 8 + 9 + #include "bpf_misc.h" 10 + #include "bpf_experimental.h" 11 + 12 + static char buf[PATH_MAX]; 13 + 14 + SEC("lsm.s/file_open") 15 + __failure __msg("Possibly NULL pointer passed to trusted arg0") 16 + int BPF_PROG(get_task_exe_file_kfunc_null) 17 + { 18 + struct file *acquired; 19 + 20 + /* Can't pass a NULL pointer to bpf_get_task_exe_file(). */ 21 + acquired = bpf_get_task_exe_file(NULL); 22 + if (!acquired) 23 + return 0; 24 + 25 + bpf_put_file(acquired); 26 + return 0; 27 + } 28 + 29 + SEC("lsm.s/inode_getxattr") 30 + __failure __msg("arg#0 pointer type STRUCT task_struct must point to scalar, or struct with scalar") 31 + int BPF_PROG(get_task_exe_file_kfunc_fp) 32 + { 33 + u64 x; 34 + struct file *acquired; 35 + struct task_struct *task; 36 + 37 + task = (struct task_struct *)&x; 38 + /* Can't pass random frame pointer to bpf_get_task_exe_file(). */ 39 + acquired = bpf_get_task_exe_file(task); 40 + if (!acquired) 41 + return 0; 42 + 43 + bpf_put_file(acquired); 44 + return 0; 45 + } 46 + 47 + SEC("lsm.s/file_open") 48 + __failure __msg("R1 must be referenced or trusted") 49 + int BPF_PROG(get_task_exe_file_kfunc_untrusted) 50 + { 51 + struct file *acquired; 52 + struct task_struct *parent; 53 + 54 + /* Walking a trusted struct task_struct returned from 55 + * bpf_get_current_task_btf() yields an untrusted pointer. 56 + */ 57 + parent = bpf_get_current_task_btf()->parent; 58 + /* Can't pass untrusted pointer to bpf_get_task_exe_file(). 
*/ 59 + acquired = bpf_get_task_exe_file(parent); 60 + if (!acquired) 61 + return 0; 62 + 63 + bpf_put_file(acquired); 64 + return 0; 65 + } 66 + 67 + SEC("lsm.s/file_open") 68 + __failure __msg("Unreleased reference") 69 + int BPF_PROG(get_task_exe_file_kfunc_unreleased) 70 + { 71 + struct file *acquired; 72 + 73 + acquired = bpf_get_task_exe_file(bpf_get_current_task_btf()); 74 + if (!acquired) 75 + return 0; 76 + 77 + /* Acquired but never released. */ 78 + return 0; 79 + } 80 + 81 + SEC("lsm.s/file_open") 82 + __failure __msg("release kernel function bpf_put_file expects") 83 + int BPF_PROG(put_file_kfunc_unacquired, struct file *file) 84 + { 85 + /* Can't release an unacquired pointer. */ 86 + bpf_put_file(file); 87 + return 0; 88 + } 89 + 90 + SEC("lsm.s/file_open") 91 + __failure __msg("Possibly NULL pointer passed to trusted arg0") 92 + int BPF_PROG(path_d_path_kfunc_null) 93 + { 94 + /* Can't pass NULL value to bpf_path_d_path() kfunc. */ 95 + bpf_path_d_path(NULL, buf, sizeof(buf)); 96 + return 0; 97 + } 98 + 99 + SEC("lsm.s/task_alloc") 100 + __failure __msg("R1 must be referenced or trusted") 101 + int BPF_PROG(path_d_path_kfunc_untrusted_from_argument, struct task_struct *task) 102 + { 103 + struct path *root; 104 + 105 + /* Walking a trusted argument typically yields an untrusted 106 + * pointer. This is one example of that. 107 + */ 108 + root = &task->fs->root; 109 + bpf_path_d_path(root, buf, sizeof(buf)); 110 + return 0; 111 + } 112 + 113 + SEC("lsm.s/file_open") 114 + __failure __msg("R1 must be referenced or trusted") 115 + int BPF_PROG(path_d_path_kfunc_untrusted_from_current) 116 + { 117 + struct path *pwd; 118 + struct task_struct *current; 119 + 120 + current = bpf_get_current_task_btf(); 121 + /* Walking a trusted pointer returned from bpf_get_current_task_btf() 122 + * yields an untrusted pointer. 
123 + */ 124 + pwd = &current->fs->pwd; 125 + bpf_path_d_path(pwd, buf, sizeof(buf)); 126 + return 0; 127 + } 128 + 129 + SEC("lsm.s/file_open") 130 + __failure __msg("kernel function bpf_path_d_path args#0 expected pointer to STRUCT path but R1 has a pointer to STRUCT file") 131 + int BPF_PROG(path_d_path_kfunc_type_mismatch, struct file *file) 132 + { 133 + bpf_path_d_path((struct path *)&file->f_task_work, buf, sizeof(buf)); 134 + return 0; 135 + } 136 + 137 + SEC("lsm.s/file_open") 138 + __failure __msg("invalid access to map value, value_size=4096 off=0 size=8192") 139 + int BPF_PROG(path_d_path_kfunc_invalid_buf_sz, struct file *file) 140 + { 141 + /* bpf_path_d_path() enforces a constraint on the buffer size supplied 142 + * by the BPF LSM program via the __sz annotation. buf here is set to 143 + * PATH_MAX, so let's ensure that the BPF verifier rejects BPF_PROG_LOAD 144 + * attempts if the supplied size and the actual size of the buffer 145 + * mismatches. 146 + */ 147 + bpf_path_d_path(&file->f_path, buf, PATH_MAX * 2); 148 + return 0; 149 + } 150 + 151 + SEC("fentry/vfs_open") 152 + __failure __msg("calling kernel function bpf_path_d_path is not allowed") 153 + int BPF_PROG(path_d_path_kfunc_non_lsm, struct path *path, struct file *f) 154 + { 155 + /* Calling bpf_path_d_path() from a non-LSM BPF program isn't permitted. 156 + */ 157 + bpf_path_d_path(path, buf, sizeof(buf)); 158 + return 0; 159 + } 160 + 161 + char _license[] SEC("license") = "GPL";
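Review note: `path_d_path_kfunc_invalid_buf_sz` relies on the verifier comparing the size passed to the kfunc against the real size of `buf`. The sketch below is a deliberately simplified model of that bounds check, using the numbers from the expected message (`value_size=4096 off=0 size=8192`). `map_value_access_ok` is a hypothetical name; the verifier's actual logic tracks far more state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model of a map-value bounds check:
 * an access of `size` bytes at offset `off` must fit in `value_size`.
 */
static bool map_value_access_ok(size_t value_size, size_t off, size_t size)
{
	if (off > value_size)
		return false;
	return size <= value_size - off;	/* written to avoid overflow */
}
```

With `value_size` 4096 (the `PATH_MAX`-sized `buf`), a request for 4096 bytes passes and a request for `PATH_MAX * 2` bytes is rejected, which is the outcome the `__failure` annotation expects.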

+3 -3

tools/testing/selftests/bpf/progs/xdp_redirect_map.c

··· 10 10 __uint(value_size, sizeof(int)); 11 11 } tx_port SEC(".maps"); 12 12 13 - SEC("redirect_map_0") 13 + SEC("xdp") 14 14 int xdp_redirect_map_0(struct xdp_md *xdp) 15 15 { 16 16 return bpf_redirect_map(&tx_port, 0, 0); 17 17 } 18 18 19 - SEC("redirect_map_1") 19 + SEC("xdp") 20 20 int xdp_redirect_map_1(struct xdp_md *xdp) 21 21 { 22 22 return bpf_redirect_map(&tx_port, 1, 0); 23 23 } 24 24 25 - SEC("redirect_map_2") 25 + SEC("xdp") 26 26 int xdp_redirect_map_2(struct xdp_md *xdp) 27 27 { 28 28 return bpf_redirect_map(&tx_port, 2, 0);

-174

tools/testing/selftests/bpf/test_cgroup_storage.c

··· 1 - // SPDX-License-Identifier: GPL-2.0 2 - #include <assert.h> 3 - #include <bpf/bpf.h> 4 - #include <linux/filter.h> 5 - #include <stdio.h> 6 - #include <stdlib.h> 7 - #include <sys/sysinfo.h> 8 - 9 - #include "bpf_util.h" 10 - #include "cgroup_helpers.h" 11 - #include "testing_helpers.h" 12 - 13 - char bpf_log_buf[BPF_LOG_BUF_SIZE]; 14 - 15 - #define TEST_CGROUP "/test-bpf-cgroup-storage-buf/" 16 - 17 - int main(int argc, char **argv) 18 - { 19 - struct bpf_insn prog[] = { 20 - BPF_LD_MAP_FD(BPF_REG_1, 0), /* percpu map fd */ 21 - BPF_MOV64_IMM(BPF_REG_2, 0), /* flags, not used */ 22 - BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, 23 - BPF_FUNC_get_local_storage), 24 - BPF_LDX_MEM(BPF_DW, BPF_REG_3, BPF_REG_0, 0), 25 - BPF_ALU64_IMM(BPF_ADD, BPF_REG_3, 0x1), 26 - BPF_STX_MEM(BPF_DW, BPF_REG_0, BPF_REG_3, 0), 27 - 28 - BPF_LD_MAP_FD(BPF_REG_1, 0), /* map fd */ 29 - BPF_MOV64_IMM(BPF_REG_2, 0), /* flags, not used */ 30 - BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, 31 - BPF_FUNC_get_local_storage), 32 - BPF_MOV64_IMM(BPF_REG_1, 1), 33 - BPF_ATOMIC_OP(BPF_DW, BPF_ADD, BPF_REG_0, BPF_REG_1, 0), 34 - BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_0, 0), 35 - BPF_ALU64_IMM(BPF_AND, BPF_REG_1, 0x1), 36 - BPF_MOV64_REG(BPF_REG_0, BPF_REG_1), 37 - BPF_EXIT_INSN(), 38 - }; 39 - size_t insns_cnt = ARRAY_SIZE(prog); 40 - int error = EXIT_FAILURE; 41 - int map_fd, percpu_map_fd, prog_fd, cgroup_fd; 42 - struct bpf_cgroup_storage_key key; 43 - unsigned long long value; 44 - unsigned long long *percpu_value; 45 - int cpu, nproc; 46 - 47 - nproc = bpf_num_possible_cpus(); 48 - percpu_value = malloc(sizeof(*percpu_value) * nproc); 49 - if (!percpu_value) { 50 - printf("Not enough memory for per-cpu area (%d cpus)\n", nproc); 51 - goto err; 52 - } 53 - 54 - /* Use libbpf 1.0 API mode */ 55 - libbpf_set_strict_mode(LIBBPF_STRICT_ALL); 56 - 57 - map_fd = bpf_map_create(BPF_MAP_TYPE_CGROUP_STORAGE, NULL, sizeof(key), 58 - sizeof(value), 0, NULL); 59 - if (map_fd < 0) { 60 - printf("Failed to 
create map: %s\n", strerror(errno)); 61 - goto out; 62 - } 63 - 64 - percpu_map_fd = bpf_map_create(BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE, NULL, 65 - sizeof(key), sizeof(value), 0, NULL); 66 - if (percpu_map_fd < 0) { 67 - printf("Failed to create map: %s\n", strerror(errno)); 68 - goto out; 69 - } 70 - 71 - prog[0].imm = percpu_map_fd; 72 - prog[7].imm = map_fd; 73 - prog_fd = bpf_test_load_program(BPF_PROG_TYPE_CGROUP_SKB, 74 - prog, insns_cnt, "GPL", 0, 75 - bpf_log_buf, BPF_LOG_BUF_SIZE); 76 - if (prog_fd < 0) { 77 - printf("Failed to load bpf program: %s\n", bpf_log_buf); 78 - goto out; 79 - } 80 - 81 - cgroup_fd = cgroup_setup_and_join(TEST_CGROUP); 82 - 83 - /* Attach the bpf program */ 84 - if (bpf_prog_attach(prog_fd, cgroup_fd, BPF_CGROUP_INET_EGRESS, 0)) { 85 - printf("Failed to attach bpf program\n"); 86 - goto err; 87 - } 88 - 89 - if (bpf_map_get_next_key(map_fd, NULL, &key)) { 90 - printf("Failed to get the first key in cgroup storage\n"); 91 - goto err; 92 - } 93 - 94 - if (bpf_map_lookup_elem(map_fd, &key, &value)) { 95 - printf("Failed to lookup cgroup storage 0\n"); 96 - goto err; 97 - } 98 - 99 - for (cpu = 0; cpu < nproc; cpu++) 100 - percpu_value[cpu] = 1000; 101 - 102 - if (bpf_map_update_elem(percpu_map_fd, &key, percpu_value, 0)) { 103 - printf("Failed to update the data in the cgroup storage\n"); 104 - goto err; 105 - } 106 - 107 - /* Every second packet should be dropped */ 108 - assert(system("ping localhost -c 1 -W 1 -q > /dev/null") == 0); 109 - assert(system("ping localhost -c 1 -W 1 -q > /dev/null")); 110 - assert(system("ping localhost -c 1 -W 1 -q > /dev/null") == 0); 111 - 112 - /* Check the counter in the cgroup local storage */ 113 - if (bpf_map_lookup_elem(map_fd, &key, &value)) { 114 - printf("Failed to lookup cgroup storage\n"); 115 - goto err; 116 - } 117 - 118 - if (value != 3) { 119 - printf("Unexpected data in the cgroup storage: %llu\n", value); 120 - goto err; 121 - } 122 - 123 - /* Bump the counter in the cgroup local 
storage */ 124 - value++; 125 - if (bpf_map_update_elem(map_fd, &key, &value, 0)) { 126 - printf("Failed to update the data in the cgroup storage\n"); 127 - goto err; 128 - } 129 - 130 - /* Every second packet should be dropped */ 131 - assert(system("ping localhost -c 1 -W 1 -q > /dev/null") == 0); 132 - assert(system("ping localhost -c 1 -W 1 -q > /dev/null")); 133 - assert(system("ping localhost -c 1 -W 1 -q > /dev/null") == 0); 134 - 135 - /* Check the final value of the counter in the cgroup local storage */ 136 - if (bpf_map_lookup_elem(map_fd, &key, &value)) { 137 - printf("Failed to lookup the cgroup storage\n"); 138 - goto err; 139 - } 140 - 141 - if (value != 7) { 142 - printf("Unexpected data in the cgroup storage: %llu\n", value); 143 - goto err; 144 - } 145 - 146 - /* Check the final value of the counter in the percpu local storage */ 147 - 148 - for (cpu = 0; cpu < nproc; cpu++) 149 - percpu_value[cpu] = 0; 150 - 151 - if (bpf_map_lookup_elem(percpu_map_fd, &key, percpu_value)) { 152 - printf("Failed to lookup the per-cpu cgroup storage\n"); 153 - goto err; 154 - } 155 - 156 - value = 0; 157 - for (cpu = 0; cpu < nproc; cpu++) 158 - value += percpu_value[cpu]; 159 - 160 - if (value != nproc * 1000 + 6) { 161 - printf("Unexpected data in the per-cpu cgroup storage\n"); 162 - goto err; 163 - } 164 - 165 - error = 0; 166 - printf("test_cgroup_storage:PASS\n"); 167 - 168 - err: 169 - cleanup_cgroup_environment(); 170 - free(percpu_value); 171 - 172 - out: 173 - return error; 174 - }

+4

tools/testing/selftests/bpf/test_cpp.cpp

··· 6 6 #include <bpf/libbpf.h> 7 7 #include <bpf/bpf.h> 8 8 #include <bpf/btf.h> 9 + 10 + #ifndef _Bool 11 + #define _Bool bool 12 + #endif 9 13 #include "test_core_extern.skel.h" 10 14 #include "struct_ops_module.skel.h" 11 15

-85

tools/testing/selftests/bpf/test_dev_cgroup.c

··· 1 - // SPDX-License-Identifier: GPL-2.0-only 2 - /* Copyright (c) 2017 Facebook 3 - */ 4 - 5 - #include <stdio.h> 6 - #include <stdlib.h> 7 - #include <string.h> 8 - #include <errno.h> 9 - #include <assert.h> 10 - #include <sys/time.h> 11 - 12 - #include <linux/bpf.h> 13 - #include <bpf/bpf.h> 14 - #include <bpf/libbpf.h> 15 - 16 - #include "cgroup_helpers.h" 17 - #include "testing_helpers.h" 18 - 19 - #define DEV_CGROUP_PROG "./dev_cgroup.bpf.o" 20 - 21 - #define TEST_CGROUP "/test-bpf-based-device-cgroup/" 22 - 23 - int main(int argc, char **argv) 24 - { 25 - struct bpf_object *obj; 26 - int error = EXIT_FAILURE; 27 - int prog_fd, cgroup_fd; 28 - __u32 prog_cnt; 29 - 30 - /* Use libbpf 1.0 API mode */ 31 - libbpf_set_strict_mode(LIBBPF_STRICT_ALL); 32 - 33 - if (bpf_prog_test_load(DEV_CGROUP_PROG, BPF_PROG_TYPE_CGROUP_DEVICE, 34 - &obj, &prog_fd)) { 35 - printf("Failed to load DEV_CGROUP program\n"); 36 - goto out; 37 - } 38 - 39 - cgroup_fd = cgroup_setup_and_join(TEST_CGROUP); 40 - if (cgroup_fd < 0) { 41 - printf("Failed to create test cgroup\n"); 42 - goto out; 43 - } 44 - 45 - /* Attach bpf program */ 46 - if (bpf_prog_attach(prog_fd, cgroup_fd, BPF_CGROUP_DEVICE, 0)) { 47 - printf("Failed to attach DEV_CGROUP program"); 48 - goto err; 49 - } 50 - 51 - if (bpf_prog_query(cgroup_fd, BPF_CGROUP_DEVICE, 0, NULL, NULL, 52 - &prog_cnt)) { 53 - printf("Failed to query attached programs"); 54 - goto err; 55 - } 56 - 57 - /* All operations with /dev/zero and and /dev/urandom are allowed, 58 - * everything else is forbidden. 
59 - */ 60 - assert(system("rm -f /tmp/test_dev_cgroup_null") == 0); 61 - assert(system("mknod /tmp/test_dev_cgroup_null c 1 3")); 62 - assert(system("rm -f /tmp/test_dev_cgroup_null") == 0); 63 - 64 - /* /dev/zero is whitelisted */ 65 - assert(system("rm -f /tmp/test_dev_cgroup_zero") == 0); 66 - assert(system("mknod /tmp/test_dev_cgroup_zero c 1 5") == 0); 67 - assert(system("rm -f /tmp/test_dev_cgroup_zero") == 0); 68 - 69 - assert(system("dd if=/dev/urandom of=/dev/zero count=64") == 0); 70 - 71 - /* src is allowed, target is forbidden */ 72 - assert(system("dd if=/dev/urandom of=/dev/full count=64")); 73 - 74 - /* src is forbidden, target is allowed */ 75 - assert(system("dd if=/dev/random of=/dev/zero count=64")); 76 - 77 - error = 0; 78 - printf("test_dev_cgroup:PASS\n"); 79 - 80 - err: 81 - cleanup_cgroup_environment(); 82 - 83 - out: 84 - return error; 85 - }

+414 -94

tools/testing/selftests/bpf/test_loader.c

··· 7 7 #include <bpf/btf.h> 8 8 9 9 #include "autoconf_helper.h" 10 + #include "disasm_helpers.h" 10 11 #include "unpriv_helpers.h" 11 12 #include "cap_helpers.h" 13 + #include "jit_disasm_helpers.h" 12 14 13 15 #define str_has_pfx(str, pfx) \ 14 16 (strncmp(str, pfx, __builtin_constant_p(pfx) ? sizeof(pfx) - 1 : strlen(pfx)) == 0) ··· 20 18 #define TEST_TAG_EXPECT_FAILURE "comment:test_expect_failure" 21 19 #define TEST_TAG_EXPECT_SUCCESS "comment:test_expect_success" 22 20 #define TEST_TAG_EXPECT_MSG_PFX "comment:test_expect_msg=" 23 - #define TEST_TAG_EXPECT_REGEX_PFX "comment:test_expect_regex=" 21 + #define TEST_TAG_EXPECT_XLATED_PFX "comment:test_expect_xlated=" 24 22 #define TEST_TAG_EXPECT_FAILURE_UNPRIV "comment:test_expect_failure_unpriv" 25 23 #define TEST_TAG_EXPECT_SUCCESS_UNPRIV "comment:test_expect_success_unpriv" 26 24 #define TEST_TAG_EXPECT_MSG_PFX_UNPRIV "comment:test_expect_msg_unpriv=" 27 - #define TEST_TAG_EXPECT_REGEX_PFX_UNPRIV "comment:test_expect_regex_unpriv=" 25 + #define TEST_TAG_EXPECT_XLATED_PFX_UNPRIV "comment:test_expect_xlated_unpriv=" 28 26 #define TEST_TAG_LOG_LEVEL_PFX "comment:test_log_level=" 29 27 #define TEST_TAG_PROG_FLAGS_PFX "comment:test_prog_flags=" 30 28 #define TEST_TAG_DESCRIPTION_PFX "comment:test_description=" ··· 33 31 #define TEST_TAG_AUXILIARY "comment:test_auxiliary" 34 32 #define TEST_TAG_AUXILIARY_UNPRIV "comment:test_auxiliary_unpriv" 35 33 #define TEST_BTF_PATH "comment:test_btf_path=" 34 + #define TEST_TAG_ARCH "comment:test_arch=" 35 + #define TEST_TAG_JITED_PFX "comment:test_jited=" 36 + #define TEST_TAG_JITED_PFX_UNPRIV "comment:test_jited_unpriv=" 36 37 37 38 /* Warning: duplicated in bpf_misc.h */ 38 39 #define POINTER_VALUE 0xcafe4all ··· 56 51 57 52 struct expect_msg { 58 53 const char *substr; /* substring match */ 59 - const char *regex_str; /* regex-based match */ 60 54 regex_t regex; 55 + bool is_regex; 56 + bool on_next_line; 57 + }; 58 + 59 + struct expected_msgs { 60 + struct expect_msg 
*patterns; 61 + size_t cnt; 61 62 }; 62 63 63 64 struct test_subspec { 64 65 char *name; 65 66 bool expect_failure; 66 - struct expect_msg *expect_msgs; 67 - size_t expect_msg_cnt; 67 + struct expected_msgs expect_msgs; 68 + struct expected_msgs expect_xlated; 69 + struct expected_msgs jited; 68 70 int retval; 69 71 bool execute; 70 72 }; ··· 84 72 int log_level; 85 73 int prog_flags; 86 74 int mode_mask; 75 + int arch_mask; 87 76 bool auxiliary; 88 77 bool valid; 89 78 }; ··· 109 96 free(tester->log_buf); 110 97 } 111 98 112 - static void free_test_spec(struct test_spec *spec) 99 + static void free_msgs(struct expected_msgs *msgs) 113 100 { 114 101 int i; 115 102 103 + for (i = 0; i < msgs->cnt; i++) 104 + if (msgs->patterns[i].is_regex) 105 + regfree(&msgs->patterns[i].regex); 106 + free(msgs->patterns); 107 + msgs->patterns = NULL; 108 + msgs->cnt = 0; 109 + } 110 + 111 + static void free_test_spec(struct test_spec *spec) 112 + { 116 113 /* Deallocate expect_msgs arrays. */ 117 - for (i = 0; i < spec->priv.expect_msg_cnt; i++) 118 - if (spec->priv.expect_msgs[i].regex_str) 119 - regfree(&spec->priv.expect_msgs[i].regex); 120 - for (i = 0; i < spec->unpriv.expect_msg_cnt; i++) 121 - if (spec->unpriv.expect_msgs[i].regex_str) 122 - regfree(&spec->unpriv.expect_msgs[i].regex); 114 + free_msgs(&spec->priv.expect_msgs); 115 + free_msgs(&spec->unpriv.expect_msgs); 116 + free_msgs(&spec->priv.expect_xlated); 117 + free_msgs(&spec->unpriv.expect_xlated); 118 + free_msgs(&spec->priv.jited); 119 + free_msgs(&spec->unpriv.jited); 123 120 124 121 free(spec->priv.name); 125 122 free(spec->unpriv.name); 126 - free(spec->priv.expect_msgs); 127 - free(spec->unpriv.expect_msgs); 128 - 129 123 spec->priv.name = NULL; 130 124 spec->unpriv.name = NULL; 131 - spec->priv.expect_msgs = NULL; 132 - spec->unpriv.expect_msgs = NULL; 133 125 } 134 126 135 - static int push_msg(const char *substr, const char *regex_str, struct test_subspec *subspec) 127 + /* Compiles regular expression 
matching pattern. 128 + * Pattern has a special syntax: 129 + * 130 + * pattern := (<verbatim text> | regex)* 131 + * regex := "{{" <posix extended regular expression> "}}" 132 + * 133 + * In other words, pattern is a verbatim text with inclusion 134 + * of regular expressions enclosed in "{{" "}}" pairs. 135 + * For example, pattern "foo{{[0-9]+}}" matches strings like 136 + * "foo0", "foo007", etc. 137 + */ 138 + static int compile_regex(const char *pattern, regex_t *regex) 136 139 { 137 - void *tmp; 138 - int regcomp_res; 139 - char error_msg[100]; 140 - struct expect_msg *msg; 140 + char err_buf[256], buf[256] = {}, *ptr, *buf_end; 141 + const char *original_pattern = pattern; 142 + bool in_regex = false; 143 + int err; 141 144 142 - tmp = realloc(subspec->expect_msgs, 143 - (1 + subspec->expect_msg_cnt) * sizeof(struct expect_msg)); 145 + buf_end = buf + sizeof(buf); 146 + ptr = buf; 147 + while (*pattern && ptr < buf_end - 2) { 148 + if (!in_regex && str_has_pfx(pattern, "{{")) { 149 + in_regex = true; 150 + pattern += 2; 151 + continue; 152 + } 153 + if (in_regex && str_has_pfx(pattern, "}}")) { 154 + in_regex = false; 155 + pattern += 2; 156 + continue; 157 + } 158 + if (in_regex) { 159 + *ptr++ = *pattern++; 160 + continue; 161 + } 162 + /* list of characters that need escaping for extended posix regex */ 163 + if (strchr(".[]\\()*+?{}|^$", *pattern)) { 164 + *ptr++ = '\\'; 165 + *ptr++ = *pattern++; 166 + continue; 167 + } 168 + *ptr++ = *pattern++; 169 + } 170 + if (*pattern) { 171 + PRINT_FAIL("Regexp too long: '%s'\n", original_pattern); 172 + return -EINVAL; 173 + } 174 + if (in_regex) { 175 + PRINT_FAIL("Regexp has open '{{' but no closing '}}': '%s'\n", original_pattern); 176 + return -EINVAL; 177 + } 178 + err = regcomp(regex, buf, REG_EXTENDED | REG_NEWLINE); 179 + if (err != 0) { 180 + regerror(err, regex, err_buf, sizeof(err_buf)); 181 + PRINT_FAIL("Regexp compilation error in '%s': '%s'\n", buf, err_buf); 182 + return -EINVAL; 183 + } 184 + 
return 0; 185 + } 186 + 187 + static int __push_msg(const char *pattern, bool on_next_line, struct expected_msgs *msgs) 188 + { 189 + struct expect_msg *msg; 190 + void *tmp; 191 + int err; 192 + 193 + tmp = realloc(msgs->patterns, 194 + (1 + msgs->cnt) * sizeof(struct expect_msg)); 144 195 if (!tmp) { 145 196 ASSERT_FAIL("failed to realloc memory for messages\n"); 146 197 return -ENOMEM; 147 198 } 148 - subspec->expect_msgs = tmp; 149 - msg = &subspec->expect_msgs[subspec->expect_msg_cnt]; 150 - 151 - if (substr) { 152 - msg->substr = substr; 153 - msg->regex_str = NULL; 154 - } else { 155 - msg->regex_str = regex_str; 156 - msg->substr = NULL; 157 - regcomp_res = regcomp(&msg->regex, regex_str, REG_EXTENDED|REG_NEWLINE); 158 - if (regcomp_res != 0) { 159 - regerror(regcomp_res, &msg->regex, error_msg, sizeof(error_msg)); 160 - PRINT_FAIL("Regexp compilation error in '%s': '%s'\n", 161 - regex_str, error_msg); 162 - return -EINVAL; 163 - } 199 + msgs->patterns = tmp; 200 + msg = &msgs->patterns[msgs->cnt]; 201 + msg->on_next_line = on_next_line; 202 + msg->substr = pattern; 203 + msg->is_regex = false; 204 + if (strstr(pattern, "{{")) { 205 + err = compile_regex(pattern, &msg->regex); 206 + if (err) 207 + return err; 208 + msg->is_regex = true; 164 209 } 210 + msgs->cnt += 1; 211 + return 0; 212 + } 165 213 166 - subspec->expect_msg_cnt += 1; 214 + static int clone_msgs(struct expected_msgs *from, struct expected_msgs *to) 215 + { 216 + struct expect_msg *msg; 217 + int i, err; 218 + 219 + for (i = 0; i < from->cnt; i++) { 220 + msg = &from->patterns[i]; 221 + err = __push_msg(msg->substr, msg->on_next_line, to); 222 + if (err) 223 + return err; 224 + } 225 + return 0; 226 + } 227 + 228 + static int push_msg(const char *substr, struct expected_msgs *msgs) 229 + { 230 + return __push_msg(substr, false, msgs); 231 + } 232 + 233 + static int push_disasm_msg(const char *regex_str, bool *on_next_line, struct expected_msgs *msgs) 234 + { 235 + int err; 236 + 237 + if 
(strcmp(regex_str, "...") == 0) { 238 + *on_next_line = false; 239 + return 0; 240 + } 241 + err = __push_msg(regex_str, *on_next_line, msgs); 242 + if (err) 243 + return err; 244 + *on_next_line = true; 167 245 return 0; 168 246 } 169 247 ··· 306 202 *flags |= flag; 307 203 } 308 204 205 + /* Matches a string of form '<pfx>[^=]=.*' and returns it's suffix. 206 + * Used to parse btf_decl_tag values. 207 + * Such values require unique prefix because compiler does not add 208 + * same __attribute__((btf_decl_tag(...))) twice. 209 + * Test suite uses two-component tags for such cases: 210 + * 211 + * <pfx> __COUNTER__ '=' 212 + * 213 + * For example, two consecutive __msg tags '__msg("foo") __msg("foo")' 214 + * would be encoded as: 215 + * 216 + * [18] DECL_TAG 'comment:test_expect_msg=0=foo' type_id=15 component_idx=-1 217 + * [19] DECL_TAG 'comment:test_expect_msg=1=foo' type_id=15 component_idx=-1 218 + * 219 + * And the purpose of this function is to extract 'foo' from the above. 220 + */ 221 + static const char *skip_dynamic_pfx(const char *s, const char *pfx) 222 + { 223 + const char *msg; 224 + 225 + if (strncmp(s, pfx, strlen(pfx)) != 0) 226 + return NULL; 227 + msg = s + strlen(pfx); 228 + msg = strchr(msg, '='); 229 + if (!msg) 230 + return NULL; 231 + return msg + 1; 232 + } 233 + 234 + enum arch { 235 + ARCH_UNKNOWN = 0x1, 236 + ARCH_X86_64 = 0x2, 237 + ARCH_ARM64 = 0x4, 238 + ARCH_RISCV64 = 0x8, 239 + }; 240 + 241 + static int get_current_arch(void) 242 + { 243 + #if defined(__x86_64__) 244 + return ARCH_X86_64; 245 + #elif defined(__aarch64__) 246 + return ARCH_ARM64; 247 + #elif defined(__riscv) && __riscv_xlen == 64 248 + return ARCH_RISCV64; 249 + #endif 250 + return ARCH_UNKNOWN; 251 + } 252 + 309 253 /* Uses btf_decl_tag attributes to describe the expected test 310 254 * behavior, see bpf_misc.h for detailed description of each attribute 311 255 * and attribute combinations. 
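Review note: the `{{ }}` pattern syntax introduced by `compile_regex()` in this file can be tried outside the test runner. The sketch below is a standalone approximation: `compile_pattern` and `pattern_matches` are hypothetical names, error reporting is reduced to a return code, and unlike `__push_msg()` it always compiles a regex instead of falling back to substring matching when no `{{` is present.

```c
#include <assert.h>
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper modelled on compile_regex() from the diff:
 * text outside "{{" "}}" is escaped and matched verbatim, text inside
 * is passed through as a POSIX extended regular expression.
 * Returns 0 on success, -1 on error.
 */
static int compile_pattern(const char *pattern, regex_t *regex)
{
	char buf[256] = {0};
	char *ptr = buf, *buf_end = buf + sizeof(buf);
	bool in_regex = false;

	/* Leave room for an escape, one character and the trailing NUL. */
	while (*pattern && ptr < buf_end - 2) {
		if (!in_regex && strncmp(pattern, "{{", 2) == 0) {
			in_regex = true;
			pattern += 2;
			continue;
		}
		if (in_regex && strncmp(pattern, "}}", 2) == 0) {
			in_regex = false;
			pattern += 2;
			continue;
		}
		/* Escape regex metacharacters that appear in verbatim text. */
		if (!in_regex && strchr(".[]\\()*+?{}|^$", *pattern))
			*ptr++ = '\\';
		*ptr++ = *pattern++;
	}
	if (*pattern || in_regex)
		return -1;	/* pattern too long, or "{{" never closed */
	return regcomp(regex, buf, REG_EXTENDED | REG_NEWLINE) ? -1 : 0;
}

/* Returns true if text contains a match for pattern. */
static bool pattern_matches(const char *pattern, const char *text)
{
	regex_t regex;
	bool found;

	if (compile_pattern(pattern, &regex))
		return false;
	found = regexec(&regex, text, 0, NULL, 0) == 0;
	regfree(&regex);
	return found;
}
```

For instance, `pattern_matches("foo{{[0-9]+}}", "foo007")` succeeds, while a verbatim pattern such as `"r1 = *(u64 *)(r10 -8)"` only matches that exact text because the parentheses and asterisks are escaped before compilation.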
··· 366 214 const char *description = NULL; 367 215 bool has_unpriv_result = false; 368 216 bool has_unpriv_retval = false; 217 + bool unpriv_xlated_on_next_line = true; 218 + bool xlated_on_next_line = true; 219 + bool unpriv_jit_on_next_line; 220 + bool jit_on_next_line; 221 + bool collect_jit = false; 369 222 int func_id, i, err = 0; 223 + u32 arch_mask = 0; 370 224 struct btf *btf; 225 + enum arch arch; 371 226 372 227 memset(spec, 0, sizeof(*spec)); 373 228 ··· 429 270 } else if (strcmp(s, TEST_TAG_AUXILIARY_UNPRIV) == 0) { 430 271 spec->auxiliary = true; 431 272 spec->mode_mask |= UNPRIV; 432 - } else if (str_has_pfx(s, TEST_TAG_EXPECT_MSG_PFX)) { 433 - msg = s + sizeof(TEST_TAG_EXPECT_MSG_PFX) - 1; 434 - err = push_msg(msg, NULL, &spec->priv); 273 + } else if ((msg = skip_dynamic_pfx(s, TEST_TAG_EXPECT_MSG_PFX))) { 274 + err = push_msg(msg, &spec->priv.expect_msgs); 435 275 if (err) 436 276 goto cleanup; 437 277 spec->mode_mask |= PRIV; 438 - } else if (str_has_pfx(s, TEST_TAG_EXPECT_MSG_PFX_UNPRIV)) { 439 - msg = s + sizeof(TEST_TAG_EXPECT_MSG_PFX_UNPRIV) - 1; 440 - err = push_msg(msg, NULL, &spec->unpriv); 278 + } else if ((msg = skip_dynamic_pfx(s, TEST_TAG_EXPECT_MSG_PFX_UNPRIV))) { 279 + err = push_msg(msg, &spec->unpriv.expect_msgs); 441 280 if (err) 442 281 goto cleanup; 443 282 spec->mode_mask |= UNPRIV; 444 - } else if (str_has_pfx(s, TEST_TAG_EXPECT_REGEX_PFX)) { 445 - msg = s + sizeof(TEST_TAG_EXPECT_REGEX_PFX) - 1; 446 - err = push_msg(NULL, msg, &spec->priv); 283 + } else if ((msg = skip_dynamic_pfx(s, TEST_TAG_JITED_PFX))) { 284 + if (arch_mask == 0) { 285 + PRINT_FAIL("__jited used before __arch_*"); 286 + goto cleanup; 287 + } 288 + if (collect_jit) { 289 + err = push_disasm_msg(msg, &jit_on_next_line, 290 + &spec->priv.jited); 291 + if (err) 292 + goto cleanup; 293 + spec->mode_mask |= PRIV; 294 + } 295 + } else if ((msg = skip_dynamic_pfx(s, TEST_TAG_JITED_PFX_UNPRIV))) { 296 + if (arch_mask == 0) { 297 + PRINT_FAIL("__unpriv_jited used 
before __arch_*"); 298 + goto cleanup; 299 + } 300 + if (collect_jit) { 301 + err = push_disasm_msg(msg, &unpriv_jit_on_next_line, 302 + &spec->unpriv.jited); 303 + if (err) 304 + goto cleanup; 305 + spec->mode_mask |= UNPRIV; 306 + } 307 + } else if ((msg = skip_dynamic_pfx(s, TEST_TAG_EXPECT_XLATED_PFX))) { 308 + err = push_disasm_msg(msg, &xlated_on_next_line, 309 + &spec->priv.expect_xlated); 447 310 if (err) 448 311 goto cleanup; 449 312 spec->mode_mask |= PRIV; 450 - } else if (str_has_pfx(s, TEST_TAG_EXPECT_REGEX_PFX_UNPRIV)) { 451 - msg = s + sizeof(TEST_TAG_EXPECT_REGEX_PFX_UNPRIV) - 1; 452 - err = push_msg(NULL, msg, &spec->unpriv); 313 + } else if ((msg = skip_dynamic_pfx(s, TEST_TAG_EXPECT_XLATED_PFX_UNPRIV))) { 314 + err = push_disasm_msg(msg, &unpriv_xlated_on_next_line, 315 + &spec->unpriv.expect_xlated); 453 316 if (err) 454 317 goto cleanup; 455 318 spec->mode_mask |= UNPRIV; ··· 522 341 goto cleanup; 523 342 update_flags(&spec->prog_flags, flags, clear); 524 343 } 344 + } else if (str_has_pfx(s, TEST_TAG_ARCH)) { 345 + val = s + sizeof(TEST_TAG_ARCH) - 1; 346 + if (strcmp(val, "X86_64") == 0) { 347 + arch = ARCH_X86_64; 348 + } else if (strcmp(val, "ARM64") == 0) { 349 + arch = ARCH_ARM64; 350 + } else if (strcmp(val, "RISCV64") == 0) { 351 + arch = ARCH_RISCV64; 352 + } else { 353 + PRINT_FAIL("bad arch spec: '%s'", val); 354 + err = -EINVAL; 355 + goto cleanup; 356 + } 357 + arch_mask |= arch; 358 + collect_jit = get_current_arch() == arch; 359 + unpriv_jit_on_next_line = true; 360 + jit_on_next_line = true; 525 361 } else if (str_has_pfx(s, TEST_BTF_PATH)) { 526 362 spec->btf_custom_path = s + sizeof(TEST_BTF_PATH) - 1; 527 363 } 528 364 } 365 + 366 + spec->arch_mask = arch_mask ?: -1; 529 367 530 368 if (spec->mode_mask == 0) 531 369 spec->mode_mask = PRIV; ··· 587 387 spec->unpriv.execute = spec->priv.execute; 588 388 } 589 389 590 - if (!spec->unpriv.expect_msgs) { 591 - for (i = 0; i < spec->priv.expect_msg_cnt; i++) { 592 - struct 
expect_msg *msg = &spec->priv.expect_msgs[i]; 593 - 594 - err = push_msg(msg->substr, msg->regex_str, &spec->unpriv); 595 - if (err) 596 - goto cleanup; 597 - } 598 - } 390 + if (spec->unpriv.expect_msgs.cnt == 0) 391 + clone_msgs(&spec->priv.expect_msgs, &spec->unpriv.expect_msgs); 392 + if (spec->unpriv.expect_xlated.cnt == 0) 393 + clone_msgs(&spec->priv.expect_xlated, &spec->unpriv.expect_xlated); 394 + if (spec->unpriv.jited.cnt == 0) 395 + clone_msgs(&spec->priv.jited, &spec->unpriv.jited); 599 396 } 600 397 601 398 spec->valid = true; ··· 631 434 bpf_program__set_flags(prog, prog_flags | spec->prog_flags); 632 435 633 436 tester->log_buf[0] = '\0'; 634 - tester->next_match_pos = 0; 635 437 } 636 438 637 439 static void emit_verifier_log(const char *log_buf, bool force) ··· 640 444 fprintf(stdout, "VERIFIER LOG:\n=============\n%s=============\n", log_buf); 641 445 } 642 446 643 - static void validate_case(struct test_loader *tester, 644 - struct test_subspec *subspec, 645 - struct bpf_object *obj, 646 - struct bpf_program *prog, 647 - int load_err) 447 + static void emit_xlated(const char *xlated, bool force) 648 448 { 649 - int i, j, err; 650 - char *match; 449 + if (!force && env.verbosity == VERBOSE_NONE) 450 + return; 451 + fprintf(stdout, "XLATED:\n=============\n%s=============\n", xlated); 452 + } 453 + 454 + static void emit_jited(const char *jited, bool force) 455 + { 456 + if (!force && env.verbosity == VERBOSE_NONE) 457 + return; 458 + fprintf(stdout, "JITED:\n=============\n%s=============\n", jited); 459 + } 460 + 461 + static void validate_msgs(char *log_buf, struct expected_msgs *msgs, 462 + void (*emit_fn)(const char *buf, bool force)) 463 + { 464 + const char *log = log_buf, *prev_match; 651 465 regmatch_t reg_match[1]; 466 + int prev_match_line; 467 + int match_line; 468 + int i, j, err; 652 469 653 - for (i = 0; i < subspec->expect_msg_cnt; i++) { 654 - struct expect_msg *msg = &subspec->expect_msgs[i]; 470 + prev_match_line = -1; 471 + 
match_line = 0; 472 + prev_match = log; 473 + for (i = 0; i < msgs->cnt; i++) { 474 + struct expect_msg *msg = &msgs->patterns[i]; 475 + const char *match = NULL, *pat_status; 476 + bool wrong_line = false; 655 477 656 - if (msg->substr) { 657 - match = strstr(tester->log_buf + tester->next_match_pos, msg->substr); 478 + if (!msg->is_regex) { 479 + match = strstr(log, msg->substr); 658 480 if (match) 659 - tester->next_match_pos = match - tester->log_buf + strlen(msg->substr); 481 + log = match + strlen(msg->substr); 660 482 } else { 661 - err = regexec(&msg->regex, 662 - tester->log_buf + tester->next_match_pos, 1, reg_match, 0); 483 + err = regexec(&msg->regex, log, 1, reg_match, 0); 663 484 if (err == 0) { 664 - match = tester->log_buf + tester->next_match_pos + reg_match[0].rm_so; 665 - tester->next_match_pos += reg_match[0].rm_eo; 666 - } else { 667 - match = NULL; 485 + match = log + reg_match[0].rm_so; 486 + log += reg_match[0].rm_eo; 668 487 } 669 488 } 670 489 671 - if (!ASSERT_OK_PTR(match, "expect_msg")) { 672 - if (env.verbosity == VERBOSE_NONE) 673 - emit_verifier_log(tester->log_buf, true /*force*/); 674 - for (j = 0; j <= i; j++) { 675 - msg = &subspec->expect_msgs[j]; 676 - fprintf(stderr, "%s %s: '%s'\n", 677 - j < i ? "MATCHED " : "EXPECTED", 678 - msg->substr ? 
"SUBSTR" : " REGEX", 679 - msg->substr ?: msg->regex_str); 680 - } 681 - return; 490 + if (match) { 491 + for (; prev_match < match; ++prev_match) 492 + if (*prev_match == '\n') 493 + ++match_line; 494 + wrong_line = msg->on_next_line && prev_match_line >= 0 && 495 + prev_match_line + 1 != match_line; 682 496 } 497 + 498 + if (!match || wrong_line) { 499 + PRINT_FAIL("expect_msg\n"); 500 + if (env.verbosity == VERBOSE_NONE) 501 + emit_fn(log_buf, true /*force*/); 502 + for (j = 0; j <= i; j++) { 503 + msg = &msgs->patterns[j]; 504 + if (j < i) 505 + pat_status = "MATCHED "; 506 + else if (wrong_line) 507 + pat_status = "WRONG LINE"; 508 + else 509 + pat_status = "EXPECTED "; 510 + msg = &msgs->patterns[j]; 511 + fprintf(stderr, "%s %s: '%s'\n", 512 + pat_status, 513 + msg->is_regex ? " REGEX" : "SUBSTR", 514 + msg->substr); 515 + } 516 + if (wrong_line) { 517 + fprintf(stderr, 518 + "expecting match at line %d, actual match is at line %d\n", 519 + prev_match_line + 1, match_line); 520 + } 521 + break; 522 + } 523 + 524 + prev_match_line = match_line; 683 525 } 684 526 } 685 527 ··· 845 611 return true; 846 612 } 847 613 614 + /* Get a disassembly of BPF program after verifier applies all rewrites */ 615 + static int get_xlated_program_text(int prog_fd, char *text, size_t text_sz) 616 + { 617 + struct bpf_insn *insn_start = NULL, *insn, *insn_end; 618 + __u32 insns_cnt = 0, i; 619 + char buf[64]; 620 + FILE *out = NULL; 621 + int err; 622 + 623 + err = get_xlated_program(prog_fd, &insn_start, &insns_cnt); 624 + if (!ASSERT_OK(err, "get_xlated_program")) 625 + goto out; 626 + out = fmemopen(text, text_sz, "w"); 627 + if (!ASSERT_OK_PTR(out, "open_memstream")) 628 + goto out; 629 + insn_end = insn_start + insns_cnt; 630 + insn = insn_start; 631 + while (insn < insn_end) { 632 + i = insn - insn_start; 633 + insn = disasm_insn(insn, buf, sizeof(buf)); 634 + fprintf(out, "%d: %s\n", i, buf); 635 + } 636 + fflush(out); 637 + 638 + out: 639 + free(insn_start); 640 + if 
(out) 641 + fclose(out); 642 + return err; 643 + } 644 + 848 645 /* this function is forced noinline and has short generic name to look better 849 646 * in test_progs output (in case of a failure) 850 647 */ ··· 890 625 { 891 626 struct test_subspec *subspec = unpriv ? &spec->unpriv : &spec->priv; 892 627 struct bpf_program *tprog = NULL, *tprog_iter; 628 + struct bpf_link *link, *links[32] = {}; 893 629 struct test_spec *spec_iter; 894 630 struct cap_state caps = {}; 895 631 struct bpf_object *tobj; 896 632 struct bpf_map *map; 897 633 int retval, err, i; 634 + int links_cnt = 0; 898 635 bool should_load; 899 636 900 637 if (!test__start_subtest(subspec->name)) 901 638 return; 639 + 640 + if ((get_current_arch() & spec->arch_mask) == 0) { 641 + test__skip(); 642 + return; 643 + } 902 644 903 645 if (unpriv) { 904 646 if (!can_execute_unpriv(tester, spec)) { ··· 967 695 goto tobj_cleanup; 968 696 } 969 697 } 970 - 971 698 emit_verifier_log(tester->log_buf, false /*force*/); 972 - validate_case(tester, subspec, tobj, tprog, err); 699 + validate_msgs(tester->log_buf, &subspec->expect_msgs, emit_verifier_log); 700 + 701 + if (subspec->expect_xlated.cnt) { 702 + err = get_xlated_program_text(bpf_program__fd(tprog), 703 + tester->log_buf, tester->log_buf_sz); 704 + if (err) 705 + goto tobj_cleanup; 706 + emit_xlated(tester->log_buf, false /*force*/); 707 + validate_msgs(tester->log_buf, &subspec->expect_xlated, emit_xlated); 708 + } 709 + 710 + if (subspec->jited.cnt) { 711 + err = get_jited_program_text(bpf_program__fd(tprog), 712 + tester->log_buf, tester->log_buf_sz); 713 + if (err == -EOPNOTSUPP) { 714 + printf("%s:SKIP: jited programs disassembly is not supported,\n", __func__); 715 + printf("%s:SKIP: tests are built w/o LLVM development libs\n", __func__); 716 + test__skip(); 717 + goto tobj_cleanup; 718 + } 719 + if (!ASSERT_EQ(err, 0, "get_jited_program_text")) 720 + goto tobj_cleanup; 721 + emit_jited(tester->log_buf, false /*force*/); 722 + 
validate_msgs(tester->log_buf, &subspec->jited, emit_jited); 723 + } 973 724 974 725 if (should_do_test_run(spec, subspec)) { 975 726 /* For some reason test_verifier executes programs ··· 1000 705 */ 1001 706 if (restore_capabilities(&caps)) 1002 707 goto tobj_cleanup; 708 + 709 + /* Do bpf_map__attach_struct_ops() for each struct_ops map. 710 + * This should trigger bpf_struct_ops->reg callback on kernel side. 711 + */ 712 + bpf_object__for_each_map(map, tobj) { 713 + if (!bpf_map__autocreate(map) || 714 + bpf_map__type(map) != BPF_MAP_TYPE_STRUCT_OPS) 715 + continue; 716 + if (links_cnt >= ARRAY_SIZE(links)) { 717 + PRINT_FAIL("too many struct_ops maps"); 718 + goto tobj_cleanup; 719 + } 720 + link = bpf_map__attach_struct_ops(map); 721 + if (!link) { 722 + PRINT_FAIL("bpf_map__attach_struct_ops failed for map %s: err=%d\n", 723 + bpf_map__name(map), err); 724 + goto tobj_cleanup; 725 + } 726 + links[links_cnt++] = link; 727 + } 1003 728 1004 729 if (tester->pre_execution_cb) { 1005 730 err = tester->pre_execution_cb(tobj); ··· 1035 720 PRINT_FAIL("Unexpected retval: %d != %d\n", retval, subspec->retval); 1036 721 goto tobj_cleanup; 1037 722 } 723 + /* redo bpf_map__attach_struct_ops for each test */ 724 + while (links_cnt > 0) 725 + bpf_link__destroy(links[--links_cnt]); 1038 726 } 1039 727 1040 728 tobj_cleanup: 729 + while (links_cnt > 0) 730 + bpf_link__destroy(links[--links_cnt]); 1041 731 bpf_object__close(tobj); 1042 732 subtest_cleanup: 1043 733 test__end_subtest();
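The `validate_msgs()` hunk above matches the expected patterns in order, advancing through the buffer after each hit, and for patterns flagged `on_next_line` it also requires the match to sit on the line directly after the previous match. A minimal, self-contained sketch of that idea follows. It handles plain substrings only (no `{{...}}` regex support), and `struct pattern` and `match_in_order()` are hypothetical names for illustration, not part of the selftests.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the selftests' struct expect_msg. */
struct pattern {
	const char *substr;
	bool on_next_line;
};

/* Returns -1 if every pattern matches in order, otherwise the index of
 * the first pattern that is missing or found on the wrong line.
 */
static int match_in_order(const char *text, const struct pattern *pats, size_t cnt)
{
	const char *log = text, *prev = text;
	int prev_line = -1, line = 0;
	size_t i;

	for (i = 0; i < cnt; i++) {
		const char *match = strstr(log, pats[i].substr);

		if (!match)
			return (int)i;
		/* Later patterns may only match after this one. */
		log = match + strlen(pats[i].substr);

		/* Count newlines between the previous scan position and this match. */
		for (; prev < match; prev++)
			if (*prev == '\n')
				line++;

		if (pats[i].on_next_line && prev_line >= 0 && prev_line + 1 != line)
			return (int)i;
		prev_line = line;
	}
	return -1;
}
```

With a three-line listing, a pattern sequence that skips the middle line is rejected at the pattern carrying the `on_next_line` flag, which corresponds to the "WRONG LINE" report in the diff.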

+2 -1

tools/testing/selftests/bpf/test_lru_map.c

···
 
 	while (next < nr_cpus) {
 		CPU_ZERO(&cpuset);
-		CPU_SET(next++, &cpuset);
+		CPU_SET(next, &cpuset);
+		next++;
 		if (!sched_setaffinity(pid, sizeof(cpuset), &cpuset)) {
 			ret = 0;
 			break;
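The hunk above moves the increment out of the `CPU_SET()` argument. A likely motivation (an assumption; the diff itself does not state it) is that `CPU_SET()` may be implemented as a macro that expands its first argument more than once, in which case `next++` would be evaluated repeatedly. The sketch below uses a hypothetical `SET_BIT()` macro, not any real libc definition, to show the safe calling pattern.

```c
#include <stddef.h>

/* Hypothetical macro that expands its index argument twice, similar in
 * spirit to how a libc might implement CPU_SET(). Passing `i++` here
 * would modify `i` twice without an intervening sequence point, which
 * is undefined behavior.
 */
#define SET_BIT(i, bits) ((bits)[(i) / 8] |= (unsigned char)(1u << ((i) % 8)))

/* Set bits [0, n) one at a time, keeping the increment outside the macro. */
static size_t set_first_bits(unsigned char *bits, size_t n)
{
	size_t next = 0;

	while (next < n) {
		SET_BIT(next, bits);	/* argument has no side effects */
		next++;			/* increment as a separate statement */
	}
	return next;
}
```

Keeping side effects out of macro arguments makes the code independent of how a particular C library happens to define the macro.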

+1 -1

tools/testing/selftests/bpf/test_maps.c

···
 		value == key);
 	}
 
-	/* Now let's delete all elemenets in parallel. */
+	/* Now let's delete all elements in parallel. */
 	data[1] = DO_DELETE;
 	run_parallel(TASKS, test_update_delete, data);

+206 -63

tools/testing/selftests/bpf/test_progs.c

··· 10 10 #include <sched.h> 11 11 #include <signal.h> 12 12 #include <string.h> 13 - #include <execinfo.h> /* backtrace */ 14 13 #include <sys/sysinfo.h> /* get_nprocs */ 15 14 #include <netinet/in.h> 16 15 #include <sys/select.h> ··· 17 18 #include <sys/un.h> 18 19 #include <bpf/btf.h> 19 20 #include "json_writer.h" 21 + 22 + #include "network_helpers.h" 23 + 24 + #ifdef __GLIBC__ 25 + #include <execinfo.h> /* backtrace */ 26 + #endif 27 + 28 + /* Default backtrace funcs if missing at link */ 29 + __weak int backtrace(void **buffer, int size) 30 + { 31 + return 0; 32 + } 33 + 34 + __weak void backtrace_symbols_fd(void *const *buffer, int size, int fd) 35 + { 36 + dprintf(fd, "<backtrace not supported>\n"); 37 + } 38 + 39 + int env_verbosity = 0; 20 40 21 41 static bool verbose(void) 22 42 { ··· 55 37 56 38 stdout = open_memstream(log_buf, log_cnt); 57 39 if (!stdout) { 58 - stdout = env.stdout; 40 + stdout = env.stdout_saved; 59 41 perror("open_memstream"); 60 42 return; 61 43 } 62 44 63 45 if (env.subtest_state) 64 - env.subtest_state->stdout = stdout; 46 + env.subtest_state->stdout_saved = stdout; 65 47 else 66 - env.test_state->stdout = stdout; 48 + env.test_state->stdout_saved = stdout; 67 49 68 50 stderr = stdout; 69 51 #endif ··· 77 59 return; 78 60 } 79 61 80 - env.stdout = stdout; 81 - env.stderr = stderr; 62 + env.stdout_saved = stdout; 63 + env.stderr_saved = stderr; 82 64 83 65 stdio_hijack_init(log_buf, log_cnt); 84 66 #endif ··· 95 77 fflush(stdout); 96 78 97 79 if (env.subtest_state) { 98 - fclose(env.subtest_state->stdout); 99 - env.subtest_state->stdout = NULL; 100 - stdout = env.test_state->stdout; 101 - stderr = env.test_state->stdout; 80 + fclose(env.subtest_state->stdout_saved); 81 + env.subtest_state->stdout_saved = NULL; 82 + stdout = env.test_state->stdout_saved; 83 + stderr = env.test_state->stdout_saved; 102 84 } else { 103 - fclose(env.test_state->stdout); 104 - env.test_state->stdout = NULL; 85 + fclose(env.test_state->stdout_saved); 86 
+ env.test_state->stdout_saved = NULL; 105 87 } 106 88 #endif 107 89 } ··· 114 96 return; 115 97 } 116 98 117 - if (stdout == env.stdout) 99 + if (stdout == env.stdout_saved) 118 100 return; 119 101 120 102 stdio_restore_cleanup(); 121 103 122 - stdout = env.stdout; 123 - stderr = env.stderr; 104 + stdout = env.stdout_saved; 105 + stderr = env.stderr_saved; 124 106 #endif 125 107 } 126 108 ··· 159 141 void (*run_serial_test)(void); 160 142 bool should_run; 161 143 bool need_cgroup_cleanup; 144 + bool should_tmon; 162 145 }; 163 146 164 147 /* Override C runtime library's usleep() implementation to ensure nanosleep() ··· 197 178 return num < sel->num_set_len && sel->num_set[num]; 198 179 } 199 180 181 + static bool match_subtest(struct test_filter_set *filter, 182 + const char *test_name, 183 + const char *subtest_name) 184 + { 185 + int i, j; 186 + 187 + for (i = 0; i < filter->cnt; i++) { 188 + if (glob_match(test_name, filter->tests[i].name)) { 189 + if (!filter->tests[i].subtest_cnt) 190 + return true; 191 + 192 + for (j = 0; j < filter->tests[i].subtest_cnt; j++) { 193 + if (glob_match(subtest_name, 194 + filter->tests[i].subtests[j])) 195 + return true; 196 + } 197 + } 198 + } 199 + 200 + return false; 201 + } 202 + 200 203 static bool should_run_subtest(struct test_selector *sel, 201 204 struct test_selector *subtest_sel, 202 205 int subtest_num, 203 206 const char *test_name, 204 207 const char *subtest_name) 205 208 { 206 - int i, j; 209 + if (match_subtest(&sel->blacklist, test_name, subtest_name)) 210 + return false; 207 211 208 - for (i = 0; i < sel->blacklist.cnt; i++) { 209 - if (glob_match(test_name, sel->blacklist.tests[i].name)) { 210 - if (!sel->blacklist.tests[i].subtest_cnt) 211 - return false; 212 - 213 - for (j = 0; j < sel->blacklist.tests[i].subtest_cnt; j++) { 214 - if (glob_match(subtest_name, 215 - sel->blacklist.tests[i].subtests[j])) 216 - return false; 217 - } 218 - } 219 - } 220 - 221 - for (i = 0; i < sel->whitelist.cnt; i++) { 222 - 
if (glob_match(test_name, sel->whitelist.tests[i].name)) { 223 - if (!sel->whitelist.tests[i].subtest_cnt) 224 - return true; 225 - 226 - for (j = 0; j < sel->whitelist.tests[i].subtest_cnt; j++) { 227 - if (glob_match(subtest_name, 228 - sel->whitelist.tests[i].subtests[j])) 229 - return true; 230 - } 231 - } 232 - } 212 + if (match_subtest(&sel->whitelist, test_name, subtest_name)) 213 + return true; 233 214 234 215 if (!sel->whitelist.cnt && !subtest_sel->num_set) 235 216 return true; 236 217 237 218 return subtest_num < subtest_sel->num_set_len && subtest_sel->num_set[subtest_num]; 219 + } 220 + 221 + static bool should_tmon(struct test_selector *sel, const char *name) 222 + { 223 + int i; 224 + 225 + for (i = 0; i < sel->whitelist.cnt; i++) { 226 + if (glob_match(name, sel->whitelist.tests[i].name) && 227 + !sel->whitelist.tests[i].subtest_cnt) 228 + return true; 229 + } 230 + 231 + return false; 238 232 } 239 233 240 234 static char *test_result(bool failed, bool skipped) ··· 262 230 int skipped_cnt = test_state->skip_cnt; 263 231 int subtests_cnt = test_state->subtest_num; 264 232 265 - fprintf(env.stdout, "#%-*d %s:", TEST_NUM_WIDTH, test->test_num, test->test_name); 233 + fprintf(env.stdout_saved, "#%-*d %s:", TEST_NUM_WIDTH, test->test_num, test->test_name); 266 234 if (test_state->error_cnt) 267 - fprintf(env.stdout, "FAIL"); 235 + fprintf(env.stdout_saved, "FAIL"); 268 236 else if (!skipped_cnt) 269 - fprintf(env.stdout, "OK"); 237 + fprintf(env.stdout_saved, "OK"); 270 238 else if (skipped_cnt == subtests_cnt || !subtests_cnt) 271 - fprintf(env.stdout, "SKIP"); 239 + fprintf(env.stdout_saved, "SKIP"); 272 240 else 273 - fprintf(env.stdout, "OK (SKIP: %d/%d)", skipped_cnt, subtests_cnt); 241 + fprintf(env.stdout_saved, "OK (SKIP: %d/%d)", skipped_cnt, subtests_cnt); 274 242 275 - fprintf(env.stdout, "\n"); 243 + fprintf(env.stdout_saved, "\n"); 276 244 } 277 245 278 246 static void print_test_log(char *log_buf, size_t log_cnt) 279 247 { 280 248 
log_buf[log_cnt] = '\0'; 281 - fprintf(env.stdout, "%s", log_buf); 249 + fprintf(env.stdout_saved, "%s", log_buf); 282 250 if (log_buf[log_cnt - 1] != '\n') 283 - fprintf(env.stdout, "\n"); 251 + fprintf(env.stdout_saved, "\n"); 284 252 } 285 253 286 254 static void print_subtest_name(int test_num, int subtest_num, ··· 291 259 292 260 snprintf(test_num_str, sizeof(test_num_str), "%d/%d", test_num, subtest_num); 293 261 294 - fprintf(env.stdout, "#%-*s %s/%s", 262 + fprintf(env.stdout_saved, "#%-*s %s/%s", 295 263 TEST_NUM_WIDTH, test_num_str, 296 264 test_name, subtest_name); 297 265 298 266 if (result) 299 - fprintf(env.stdout, ":%s", result); 267 + fprintf(env.stdout_saved, ":%s", result); 300 268 301 - fprintf(env.stdout, "\n"); 269 + fprintf(env.stdout_saved, "\n"); 302 270 } 303 271 304 272 static void jsonw_write_log_message(json_writer_t *w, char *log_buf, size_t log_cnt) ··· 483 451 memset(subtest_state, 0, sub_state_size); 484 452 485 453 if (!subtest_name || !subtest_name[0]) { 486 - fprintf(env.stderr, 454 + fprintf(env.stderr_saved, 487 455 "Subtest #%d didn't provide sub-test name!\n", 488 456 state->subtest_num); 489 457 return false; ··· 491 459 492 460 subtest_state->name = strdup(subtest_name); 493 461 if (!subtest_state->name) { 494 - fprintf(env.stderr, 462 + fprintf(env.stderr_saved, 495 463 "Subtest #%d: failed to copy subtest name!\n", 496 464 state->subtest_num); 497 465 return false; ··· 505 473 subtest_state->filtered = true; 506 474 return false; 507 475 } 476 + 477 + subtest_state->should_tmon = match_subtest(&env.tmon_selector.whitelist, 478 + test->test_name, 479 + subtest_name); 508 480 509 481 env.subtest_state = subtest_state; 510 482 stdio_hijack_init(&subtest_state->log_buf, &subtest_state->log_cnt); ··· 646 610 return err; 647 611 } 648 612 613 + struct netns_obj { 614 + char *nsname; 615 + struct tmonitor_ctx *tmon; 616 + struct nstoken *nstoken; 617 + }; 618 + 619 + /* Create a new network namespace with the given name. 
620 + * 621 + * Create a new network namespace and set the network namespace of the 622 + * current process to the new network namespace if the argument "open" is 623 + * true. This function should be paired with netns_free() to release the 624 + * resource and delete the network namespace. 625 + * 626 + * It also implements the functionality of the option "-m" by starting 627 + * traffic monitor on the background to capture the packets in this network 628 + * namespace if the current test or subtest matching the pattern. 629 + * 630 + * nsname: the name of the network namespace to create. 631 + * open: open the network namespace if true. 632 + * 633 + * Return: the network namespace object on success, NULL on failure. 634 + */ 635 + struct netns_obj *netns_new(const char *nsname, bool open) 636 + { 637 + struct netns_obj *netns_obj = malloc(sizeof(*netns_obj)); 638 + const char *test_name, *subtest_name; 639 + int r; 640 + 641 + if (!netns_obj) 642 + return NULL; 643 + memset(netns_obj, 0, sizeof(*netns_obj)); 644 + 645 + netns_obj->nsname = strdup(nsname); 646 + if (!netns_obj->nsname) 647 + goto fail; 648 + 649 + /* Create the network namespace */ 650 + r = make_netns(nsname); 651 + if (r) 652 + goto fail; 653 + 654 + /* Start traffic monitor */ 655 + if (env.test->should_tmon || 656 + (env.subtest_state && env.subtest_state->should_tmon)) { 657 + test_name = env.test->test_name; 658 + subtest_name = env.subtest_state ? 
env.subtest_state->name : NULL; 659 + netns_obj->tmon = traffic_monitor_start(nsname, test_name, subtest_name); 660 + if (!netns_obj->tmon) { 661 + fprintf(stderr, "Failed to start traffic monitor for %s\n", nsname); 662 + goto fail; 663 + } 664 + } else { 665 + netns_obj->tmon = NULL; 666 + } 667 + 668 + if (open) { 669 + netns_obj->nstoken = open_netns(nsname); 670 + if (!netns_obj->nstoken) 671 + goto fail; 672 + } 673 + 674 + return netns_obj; 675 + fail: 676 + traffic_monitor_stop(netns_obj->tmon); 677 + remove_netns(nsname); 678 + free(netns_obj->nsname); 679 + free(netns_obj); 680 + return NULL; 681 + } 682 + 683 + /* Delete the network namespace. 684 + * 685 + * This function should be paired with netns_new() to delete the namespace 686 + * created by netns_new(). 687 + */ 688 + void netns_free(struct netns_obj *netns_obj) 689 + { 690 + if (!netns_obj) 691 + return; 692 + traffic_monitor_stop(netns_obj->tmon); 693 + close_netns(netns_obj->nstoken); 694 + remove_netns(netns_obj->nsname); 695 + free(netns_obj->nsname); 696 + free(netns_obj); 697 + } 698 + 649 699 /* extern declarations for test funcs */ 650 700 #define DEFINE_TEST(name) \ 651 701 extern void test_##name(void) __weak; \ ··· 775 653 ARG_TEST_NAME_GLOB_DENYLIST = 'd', 776 654 ARG_NUM_WORKERS = 'j', 777 655 ARG_DEBUG = -1, 778 - ARG_JSON_SUMMARY = 'J' 656 + ARG_JSON_SUMMARY = 'J', 657 + ARG_TRAFFIC_MONITOR = 'm', 779 658 }; 780 659 781 660 static const struct argp_option opts[] = { ··· 803 680 { "debug", ARG_DEBUG, NULL, 0, 804 681 "print extra debug information for test_progs." }, 805 682 { "json-summary", ARG_JSON_SUMMARY, "FILE", 0, "Write report in json format to this file."}, 683 + #ifdef TRAFFIC_MONITOR 684 + { "traffic-monitor", ARG_TRAFFIC_MONITOR, "NAMES", 0, 685 + "Monitor network traffic of tests with name matching the pattern (supports '*' wildcard)." 
}, 686 + #endif 806 687 {}, 807 688 }; 808 689 ··· 975 848 return -EINVAL; 976 849 } 977 850 } 851 + env_verbosity = env->verbosity; 978 852 979 853 if (verbose()) { 980 854 if (setenv("SELFTESTS_VERBOSE", "1", 1) == -1) { ··· 1019 891 break; 1020 892 case ARGP_KEY_END: 1021 893 break; 894 + #ifdef TRAFFIC_MONITOR 895 + case ARG_TRAFFIC_MONITOR: 896 + if (arg[0] == '@') 897 + err = parse_test_list_file(arg + 1, 898 + &env->tmon_selector.whitelist, 899 + true); 900 + else 901 + err = parse_test_list(arg, 902 + &env->tmon_selector.whitelist, 903 + true); 904 + break; 905 + #endif 1022 906 default: 1023 907 return ARGP_ERR_UNKNOWN; 1024 908 } ··· 1169 1029 1170 1030 sz = backtrace(bt, ARRAY_SIZE(bt)); 1171 1031 1172 - if (env.stdout) 1032 + if (env.stdout_saved) 1173 1033 stdio_restore(); 1174 1034 if (env.test) { 1175 1035 env.test_state->error_cnt++; ··· 1485 1345 if (env->json) { 1486 1346 w = jsonw_new(env->json); 1487 1347 if (!w) 1488 - fprintf(env->stderr, "Failed to create new JSON stream."); 1348 + fprintf(env->stderr_saved, "Failed to create new JSON stream."); 1489 1349 } 1490 1350 1491 1351 if (w) { ··· 1500 1360 1501 1361 /* 1502 1362 * We only print error logs summary when there are failed tests and 1503 - * verbose mode is not enabled. Otherwise, results may be incosistent. 1363 + * verbose mode is not enabled. Otherwise, results may be inconsistent. 1504 1364 * 1505 1365 */ 1506 1366 if (!verbose() && fail_cnt) { ··· 1834 1694 return -1; 1835 1695 } 1836 1696 1837 - env.stdout = stdout; 1838 - env.stderr = stderr; 1697 + env.stdout_saved = stdout; 1698 + env.stderr_saved = stderr; 1839 1699 1840 1700 env.has_testmod = true; 1841 1701 if (!env.list_test_names) { ··· 1843 1703 unload_bpf_testmod(verbose()); 1844 1704 1845 1705 if (load_bpf_testmod(verbose())) { 1846 - fprintf(env.stderr, "WARNING! Selftests relying on bpf_testmod.ko will be skipped.\n"); 1706 + fprintf(env.stderr_saved, "WARNING! 
Selftests relying on bpf_testmod.ko will be skipped.\n"); 1847 1707 env.has_testmod = false; 1848 1708 } 1849 1709 } ··· 1862 1722 test->test_num, test->test_name, test->test_name, test->test_name); 1863 1723 exit(EXIT_ERR_SETUP_INFRA); 1864 1724 } 1725 + if (test->should_run) 1726 + test->should_tmon = should_tmon(&env.tmon_selector, test->test_name); 1865 1727 } 1866 1728 1867 1729 /* ignore workers if we are just listing */ ··· 1873 1731 /* launch workers if requested */ 1874 1732 env.worker_id = -1; /* main process */ 1875 1733 if (env.workers) { 1876 - env.worker_pids = calloc(sizeof(__pid_t), env.workers); 1734 + env.worker_pids = calloc(sizeof(pid_t), env.workers); 1877 1735 env.worker_socks = calloc(sizeof(int), env.workers); 1878 1736 if (env.debug) 1879 1737 fprintf(stdout, "Launching %d workers.\n", env.workers); ··· 1923 1781 } 1924 1782 1925 1783 if (env.list_test_names) { 1926 - fprintf(env.stdout, "%s\n", test->test_name); 1784 + fprintf(env.stdout_saved, "%s\n", test->test_name); 1927 1785 env.succ_cnt++; 1928 1786 continue; 1929 1787 } ··· 1948 1806 1949 1807 free_test_selector(&env.test_selector); 1950 1808 free_test_selector(&env.subtest_selector); 1809 + free_test_selector(&env.tmon_selector); 1951 1810 free_test_states(); 1952 1811 1953 1812 if (env.succ_cnt + env.fail_cnt + env.skip_cnt == 0)
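The new `match_subtest()` helper in the hunk above factors out the matching shared by the allow and deny lists: a test-level pattern with no subtest patterns selects every subtest, otherwise at least one subtest pattern must match. The sketch below mirrors that logic in isolation. It substitutes POSIX `fnmatch()` for the selftests' own `glob_match()`, and `struct filter` and `struct filter_entry` are simplified, hypothetical stand-ins for the real filter types.

```c
#include <fnmatch.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified, hypothetical versions of the selftests' filter types. */
struct filter_entry {
	const char *name;		/* glob for the test name */
	const char *const *subtests;	/* globs for subtest names, may be NULL */
	size_t subtest_cnt;
};

struct filter {
	const struct filter_entry *tests;
	size_t cnt;
};

static bool filter_matches(const struct filter *f, const char *test,
			   const char *subtest)
{
	size_t i, j;

	for (i = 0; i < f->cnt; i++) {
		const struct filter_entry *e = &f->tests[i];

		if (fnmatch(e->name, test, 0) != 0)
			continue;
		/* No subtest patterns: the whole test is selected. */
		if (e->subtest_cnt == 0)
			return true;
		for (j = 0; j < e->subtest_cnt; j++)
			if (fnmatch(e->subtests[j], subtest, 0) == 0)
				return true;
	}
	return false;
}
```

In the diff, the same helper is reused for the deny list (a match means "do not run"), the allow list (a match means "run"), and the traffic-monitor selector, which is why it only reports whether a match exists and leaves the decision to the caller.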

+12 -5

tools/testing/selftests/bpf/test_progs.h

··· 74 74 int error_cnt; 75 75 bool skipped; 76 76 bool filtered; 77 + bool should_tmon; 77 78 78 - FILE *stdout; 79 + FILE *stdout_saved; 79 80 }; 80 81 81 82 struct test_state { ··· 93 92 size_t log_cnt; 94 93 char *log_buf; 95 94 96 - FILE *stdout; 95 + FILE *stdout_saved; 97 96 }; 97 + 98 + extern int env_verbosity; 98 99 99 100 struct test_env { 100 101 struct test_selector test_selector; 101 102 struct test_selector subtest_selector; 103 + struct test_selector tmon_selector; 102 104 bool verifier_stats; 103 105 bool debug; 104 106 enum verbosity verbosity; ··· 115 111 struct test_state *test_state; /* current running test state */ 116 112 struct subtest_state *subtest_state; /* current running subtest state */ 117 113 118 - FILE *stdout; 119 - FILE *stderr; 114 + FILE *stdout_saved; 115 + FILE *stderr_saved; 120 116 int nr_cpus; 121 117 FILE *json; 122 118 ··· 432 428 int get_bpf_max_tramp_links_from(struct btf *btf); 433 429 int get_bpf_max_tramp_links(void); 434 430 431 + struct netns_obj; 432 + struct netns_obj *netns_new(const char *name, bool open); 433 + void netns_free(struct netns_obj *netns); 434 + 435 435 #ifdef __x86_64__ 436 436 #define SYS_NANOSLEEP_KPROBE_NAME "__x64_sys_nanosleep" 437 437 #elif defined(__s390x__) ··· 455 447 struct test_loader { 456 448 char *log_buf; 457 449 size_t log_buf_sz; 458 - size_t next_match_pos; 459 450 pre_execution_cb pre_execution_cb; 460 451 461 452 struct bpf_object *obj;
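The header hunk above renames the `stdout`/`stderr` fields to `stdout_saved`/`stderr_saved`; in test_progs.c these hold the original streams while test output is captured into a memory buffer via `open_memstream()`. A minimal sketch of that capture pattern follows. Unlike test_progs it does not reassign the global `stdout` (which is not portable across C libraries) and instead writes to an explicit `FILE *`; the helper name `capture_greeting()` is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>

/* Write a message into an in-memory stream and return the resulting
 * heap buffer (the caller frees it), or NULL on failure.
 */
static char *capture_greeting(const char *name, size_t *len)
{
	char *buf = NULL;
	FILE *mem = open_memstream(&buf, len);

	if (!mem)
		return NULL;
	fprintf(mem, "hello, %s\n", name);
	/* fclose() flushes the stream and finalizes buf and *len. */
	if (fclose(mem) != 0) {
		free(buf);
		return NULL;
	}
	return buf;
}
```

The buffer and length are only guaranteed to be up to date after `fflush()` or `fclose()` on the memory stream, which is why the diff's cleanup path closes the stream before the log is printed.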

-63

tools/testing/selftests/bpf/test_skb_cgroup_id.sh

··· 1 - #!/bin/sh 2 - # SPDX-License-Identifier: GPL-2.0 3 - # Copyright (c) 2018 Facebook 4 - 5 - set -eu 6 - 7 - wait_for_ip() 8 - { 9 - local _i 10 - echo -n "Wait for testing link-local IP to become available " 11 - for _i in $(seq ${MAX_PING_TRIES}); do 12 - echo -n "." 13 - if $PING6 -c 1 -W 1 ff02::1%${TEST_IF} >/dev/null 2>&1; then 14 - echo " OK" 15 - return 16 - fi 17 - sleep 1 18 - done 19 - echo 1>&2 "ERROR: Timeout waiting for test IP to become available." 20 - exit 1 21 - } 22 - 23 - setup() 24 - { 25 - # Create testing interfaces not to interfere with current environment. 26 - ip link add dev ${TEST_IF} type veth peer name ${TEST_IF_PEER} 27 - ip link set ${TEST_IF} up 28 - ip link set ${TEST_IF_PEER} up 29 - 30 - wait_for_ip 31 - 32 - tc qdisc add dev ${TEST_IF} clsact 33 - tc filter add dev ${TEST_IF} egress bpf obj ${BPF_PROG_OBJ} \ 34 - sec ${BPF_PROG_SECTION} da 35 - 36 - BPF_PROG_ID=$(tc filter show dev ${TEST_IF} egress | \ 37 - awk '/ id / {sub(/.* id /, "", $0); print($1)}') 38 - } 39 - 40 - cleanup() 41 - { 42 - ip link del ${TEST_IF} 2>/dev/null || : 43 - ip link del ${TEST_IF_PEER} 2>/dev/null || : 44 - } 45 - 46 - main() 47 - { 48 - trap cleanup EXIT 2 3 6 15 49 - setup 50 - ${PROG} ${TEST_IF} ${BPF_PROG_ID} 51 - } 52 - 53 - DIR=$(dirname $0) 54 - TEST_IF="test_cgid_1" 55 - TEST_IF_PEER="test_cgid_2" 56 - MAX_PING_TRIES=5 57 - BPF_PROG_OBJ="${DIR}/test_skb_cgroup_id_kern.bpf.o" 58 - BPF_PROG_SECTION="cgroup_id_logger" 59 - BPF_PROG_ID=0 60 - PROG="${DIR}/test_skb_cgroup_id_user" 61 - type ping6 >/dev/null 2>&1 && PING6="ping6" || PING6="ping -6" 62 - 63 - main

-183

tools/testing/selftests/bpf/test_skb_cgroup_id_user.c

··· 1 - // SPDX-License-Identifier: GPL-2.0 2 - // Copyright (c) 2018 Facebook 3 - 4 - #include <stdlib.h> 5 - #include <string.h> 6 - #include <unistd.h> 7 - 8 - #include <arpa/inet.h> 9 - #include <net/if.h> 10 - #include <netinet/in.h> 11 - #include <sys/socket.h> 12 - #include <sys/types.h> 13 - 14 - 15 - #include <bpf/bpf.h> 16 - #include <bpf/libbpf.h> 17 - 18 - #include "cgroup_helpers.h" 19 - 20 - #define CGROUP_PATH "/skb_cgroup_test" 21 - #define NUM_CGROUP_LEVELS 4 22 - 23 - /* RFC 4291, Section 2.7.1 */ 24 - #define LINKLOCAL_MULTICAST "ff02::1" 25 - 26 - static int mk_dst_addr(const char *ip, const char *iface, 27 - struct sockaddr_in6 *dst) 28 - { 29 - memset(dst, 0, sizeof(*dst)); 30 - 31 - dst->sin6_family = AF_INET6; 32 - dst->sin6_port = htons(1025); 33 - 34 - if (inet_pton(AF_INET6, ip, &dst->sin6_addr) != 1) { 35 - log_err("Invalid IPv6: %s", ip); 36 - return -1; 37 - } 38 - 39 - dst->sin6_scope_id = if_nametoindex(iface); 40 - if (!dst->sin6_scope_id) { 41 - log_err("Failed to get index of iface: %s", iface); 42 - return -1; 43 - } 44 - 45 - return 0; 46 - } 47 - 48 - static int send_packet(const char *iface) 49 - { 50 - struct sockaddr_in6 dst; 51 - char msg[] = "msg"; 52 - int err = 0; 53 - int fd = -1; 54 - 55 - if (mk_dst_addr(LINKLOCAL_MULTICAST, iface, &dst)) 56 - goto err; 57 - 58 - fd = socket(AF_INET6, SOCK_DGRAM, 0); 59 - if (fd == -1) { 60 - log_err("Failed to create UDP socket"); 61 - goto err; 62 - } 63 - 64 - if (sendto(fd, &msg, sizeof(msg), 0, (const struct sockaddr *)&dst, 65 - sizeof(dst)) == -1) { 66 - log_err("Failed to send datagram"); 67 - goto err; 68 - } 69 - 70 - goto out; 71 - err: 72 - err = -1; 73 - out: 74 - if (fd >= 0) 75 - close(fd); 76 - return err; 77 - } 78 - 79 - int get_map_fd_by_prog_id(int prog_id) 80 - { 81 - struct bpf_prog_info info = {}; 82 - __u32 info_len = sizeof(info); 83 - __u32 map_ids[1]; 84 - int prog_fd = -1; 85 - int map_fd = -1; 86 - 87 - prog_fd = bpf_prog_get_fd_by_id(prog_id); 88 - if 
(prog_fd < 0) { 89 - log_err("Failed to get fd by prog id %d", prog_id); 90 - goto err; 91 - } 92 - 93 - info.nr_map_ids = 1; 94 - info.map_ids = (__u64) (unsigned long) map_ids; 95 - 96 - if (bpf_prog_get_info_by_fd(prog_fd, &info, &info_len)) { 97 - log_err("Failed to get info by prog fd %d", prog_fd); 98 - goto err; 99 - } 100 - 101 - if (!info.nr_map_ids) { 102 - log_err("No maps found for prog fd %d", prog_fd); 103 - goto err; 104 - } 105 - 106 - map_fd = bpf_map_get_fd_by_id(map_ids[0]); 107 - if (map_fd < 0) 108 - log_err("Failed to get fd by map id %d", map_ids[0]); 109 - err: 110 - if (prog_fd >= 0) 111 - close(prog_fd); 112 - return map_fd; 113 - } 114 - 115 - int check_ancestor_cgroup_ids(int prog_id) 116 - { 117 - __u64 actual_ids[NUM_CGROUP_LEVELS], expected_ids[NUM_CGROUP_LEVELS]; 118 - __u32 level; 119 - int err = 0; 120 - int map_fd; 121 - 122 - expected_ids[0] = get_cgroup_id("/.."); /* root cgroup */ 123 - expected_ids[1] = get_cgroup_id(""); 124 - expected_ids[2] = get_cgroup_id(CGROUP_PATH); 125 - expected_ids[3] = 0; /* non-existent cgroup */ 126 - 127 - map_fd = get_map_fd_by_prog_id(prog_id); 128 - if (map_fd < 0) 129 - goto err; 130 - 131 - for (level = 0; level < NUM_CGROUP_LEVELS; ++level) { 132 - if (bpf_map_lookup_elem(map_fd, &level, &actual_ids[level])) { 133 - log_err("Failed to lookup key %d", level); 134 - goto err; 135 - } 136 - if (actual_ids[level] != expected_ids[level]) { 137 - log_err("%llx (actual) != %llx (expected), level: %u\n", 138 - actual_ids[level], expected_ids[level], level); 139 - goto err; 140 - } 141 - } 142 - 143 - goto out; 144 - err: 145 - err = -1; 146 - out: 147 - if (map_fd >= 0) 148 - close(map_fd); 149 - return err; 150 - } 151 - 152 - int main(int argc, char **argv) 153 - { 154 - int cgfd = -1; 155 - int err = 0; 156 - 157 - if (argc < 3) { 158 - fprintf(stderr, "Usage: %s iface prog_id\n", argv[0]); 159 - exit(EXIT_FAILURE); 160 - } 161 - 162 - /* Use libbpf 1.0 API mode */ 163 - 
libbpf_set_strict_mode(LIBBPF_STRICT_ALL); 164 - 165 - cgfd = cgroup_setup_and_join(CGROUP_PATH); 166 - if (cgfd < 0) 167 - goto err; 168 - 169 - if (send_packet(argv[1])) 170 - goto err; 171 - 172 - if (check_ancestor_cgroup_ids(atoi(argv[2]))) 173 - goto err; 174 - 175 - goto out; 176 - err: 177 - err = -1; 178 - out: 179 - close(cgfd); 180 - cleanup_cgroup_environment(); 181 - printf("[%s]\n", err ? "FAIL" : "PASS"); 182 - return err; 183 - }

-121

tools/testing/selftests/bpf/test_xdp_veth.sh

··· 1 - #!/bin/sh 2 - # SPDX-License-Identifier: GPL-2.0 3 - # 4 - # Create 3 namespaces with 3 veth peers, and 5 - # forward packets in-between using native XDP 6 - # 7 - # XDP_TX 8 - # NS1(veth11) NS2(veth22) NS3(veth33) 9 - # | | | 10 - # | | | 11 - # (veth1, (veth2, (veth3, 12 - # id:111) id:122) id:133) 13 - # ^ | ^ | ^ | 14 - # | | XDP_REDIRECT | | XDP_REDIRECT | | 15 - # | ------------------ ------------------ | 16 - # ----------------------------------------- 17 - # XDP_REDIRECT 18 - 19 - # Kselftest framework requirement - SKIP code is 4. 20 - ksft_skip=4 21 - 22 - TESTNAME=xdp_veth 23 - BPF_FS=$(awk '$3 == "bpf" {print $2; exit}' /proc/mounts) 24 - BPF_DIR=$BPF_FS/test_$TESTNAME 25 - readonly NS1="ns1-$(mktemp -u XXXXXX)" 26 - readonly NS2="ns2-$(mktemp -u XXXXXX)" 27 - readonly NS3="ns3-$(mktemp -u XXXXXX)" 28 - 29 - _cleanup() 30 - { 31 - set +e 32 - ip link del veth1 2> /dev/null 33 - ip link del veth2 2> /dev/null 34 - ip link del veth3 2> /dev/null 35 - ip netns del ${NS1} 2> /dev/null 36 - ip netns del ${NS2} 2> /dev/null 37 - ip netns del ${NS3} 2> /dev/null 38 - rm -rf $BPF_DIR 2> /dev/null 39 - } 40 - 41 - cleanup_skip() 42 - { 43 - echo "selftests: $TESTNAME [SKIP]" 44 - _cleanup 45 - 46 - exit $ksft_skip 47 - } 48 - 49 - cleanup() 50 - { 51 - if [ "$?" = 0 ]; then 52 - echo "selftests: $TESTNAME [PASS]" 53 - else 54 - echo "selftests: $TESTNAME [FAILED]" 55 - fi 56 - _cleanup 57 - } 58 - 59 - if [ $(id -u) -ne 0 ]; then 60 - echo "selftests: $TESTNAME [SKIP] Need root privileges" 61 - exit $ksft_skip 62 - fi 63 - 64 - if ! ip link set dev lo xdp off > /dev/null 2>&1; then 65 - echo "selftests: $TESTNAME [SKIP] Could not run test without the ip xdp support" 66 - exit $ksft_skip 67 - fi 68 - 69 - if [ -z "$BPF_FS" ]; then 70 - echo "selftests: $TESTNAME [SKIP] Could not run test without bpffs mounted" 71 - exit $ksft_skip 72 - fi 73 - 74 - if ! 
bpftool version > /dev/null 2>&1; then 75 - echo "selftests: $TESTNAME [SKIP] Could not run test without bpftool" 76 - exit $ksft_skip 77 - fi 78 - 79 - set -e 80 - 81 - trap cleanup_skip EXIT 82 - 83 - ip netns add ${NS1} 84 - ip netns add ${NS2} 85 - ip netns add ${NS3} 86 - 87 - ip link add veth1 index 111 type veth peer name veth11 netns ${NS1} 88 - ip link add veth2 index 122 type veth peer name veth22 netns ${NS2} 89 - ip link add veth3 index 133 type veth peer name veth33 netns ${NS3} 90 - 91 - ip link set veth1 up 92 - ip link set veth2 up 93 - ip link set veth3 up 94 - 95 - ip -n ${NS1} addr add 10.1.1.11/24 dev veth11 96 - ip -n ${NS3} addr add 10.1.1.33/24 dev veth33 97 - 98 - ip -n ${NS1} link set dev veth11 up 99 - ip -n ${NS2} link set dev veth22 up 100 - ip -n ${NS3} link set dev veth33 up 101 - 102 - mkdir $BPF_DIR 103 - bpftool prog loadall \ 104 - xdp_redirect_map.bpf.o $BPF_DIR/progs type xdp \ 105 - pinmaps $BPF_DIR/maps 106 - bpftool map update pinned $BPF_DIR/maps/tx_port key 0 0 0 0 value 122 0 0 0 107 - bpftool map update pinned $BPF_DIR/maps/tx_port key 1 0 0 0 value 133 0 0 0 108 - bpftool map update pinned $BPF_DIR/maps/tx_port key 2 0 0 0 value 111 0 0 0 109 - ip link set dev veth1 xdp pinned $BPF_DIR/progs/xdp_redirect_map_0 110 - ip link set dev veth2 xdp pinned $BPF_DIR/progs/xdp_redirect_map_1 111 - ip link set dev veth3 xdp pinned $BPF_DIR/progs/xdp_redirect_map_2 112 - 113 - ip -n ${NS1} link set dev veth11 xdp obj xdp_dummy.bpf.o sec xdp 114 - ip -n ${NS2} link set dev veth22 xdp obj xdp_tx.bpf.o sec xdp 115 - ip -n ${NS3} link set dev veth33 xdp obj xdp_dummy.bpf.o sec xdp 116 - 117 - trap cleanup EXIT 118 - 119 - ip netns exec ${NS1} ping -c 1 -W 1 10.1.1.33 120 - 121 - exit 0

+4 -3

tools/testing/selftests/bpf/testing_helpers.c

···
 #include <errno.h>
 #include <bpf/bpf.h>
 #include <bpf/libbpf.h>
+#include "disasm.h"
 #include "test_progs.h"
 #include "testing_helpers.h"
 #include <linux/membarrier.h>
···
 		  bool is_glob_pattern)
 {
 	char *input, *state = NULL, *test_spec;
-	int err = 0;
+	int err = 0, cnt = 0;
 
 	input = strdup(s);
 	if (!input)
 		return -ENOMEM;
 
-	while ((test_spec = strtok_r(state ? NULL : input, ",", &state))) {
+	while ((test_spec = strtok_r(cnt++ ? NULL : input, ",", &state))) {
 		err = insert_test(set, test_spec, is_glob_pattern);
 		if (err)
 			break;
···
 
 	*cnt = xlated_prog_len / buf_element_size;
 	*buf = calloc(*cnt, buf_element_size);
-	if (!buf) {
+	if (!*buf) {
 		perror("can't allocate xlated program buffer");
 		return -ENOMEM;
 	}

+89 -15

tools/testing/selftests/bpf/trace_helpers.c

··· 10 10 #include <pthread.h> 11 11 #include <unistd.h> 12 12 #include <linux/perf_event.h> 13 + #include <linux/fs.h> 14 + #include <sys/ioctl.h> 13 15 #include <sys/mman.h> 14 16 #include "trace_helpers.h" 15 17 #include <linux/limits.h> ··· 246 244 return err; 247 245 } 248 246 247 + #ifdef PROCMAP_QUERY 248 + int env_verbosity __weak = 0; 249 + 250 + static int procmap_query(int fd, const void *addr, __u32 query_flags, size_t *start, size_t *offset, int *flags) 251 + { 252 + char path_buf[PATH_MAX], build_id_buf[20]; 253 + struct procmap_query q; 254 + int err; 255 + 256 + memset(&q, 0, sizeof(q)); 257 + q.size = sizeof(q); 258 + q.query_flags = query_flags; 259 + q.query_addr = (__u64)addr; 260 + q.vma_name_addr = (__u64)path_buf; 261 + q.vma_name_size = sizeof(path_buf); 262 + q.build_id_addr = (__u64)build_id_buf; 263 + q.build_id_size = sizeof(build_id_buf); 264 + 265 + err = ioctl(fd, PROCMAP_QUERY, &q); 266 + if (err < 0) { 267 + err = -errno; 268 + if (err == -ENOTTY) 269 + return -EOPNOTSUPP; /* ioctl() not implemented yet */ 270 + if (err == -ENOENT) 271 + return -ESRCH; /* vma not found */ 272 + return err; 273 + } 274 + 275 + if (env_verbosity >= 1) { 276 + printf("VMA FOUND (addr %08lx): %08lx-%08lx %c%c%c%c %08lx %02x:%02x %ld %s (build ID: %s, %d bytes)\n", 277 + (long)addr, (long)q.vma_start, (long)q.vma_end, 278 + (q.vma_flags & PROCMAP_QUERY_VMA_READABLE) ? 'r' : '-', 279 + (q.vma_flags & PROCMAP_QUERY_VMA_WRITABLE) ? 'w' : '-', 280 + (q.vma_flags & PROCMAP_QUERY_VMA_EXECUTABLE) ? 'x' : '-', 281 + (q.vma_flags & PROCMAP_QUERY_VMA_SHARED) ? 's' : 'p', 282 + (long)q.vma_offset, q.dev_major, q.dev_minor, (long)q.inode, 283 + q.vma_name_size ? path_buf : "", 284 + q.build_id_size ? 
"YES" : "NO", 285 + q.build_id_size); 286 + } 287 + 288 + *start = q.vma_start; 289 + *offset = q.vma_offset; 290 + *flags = q.vma_flags; 291 + return 0; 292 + } 293 + #else 294 + static int procmap_query(int fd, const void *addr, __u32 query_flags, size_t *start, size_t *offset, int *flags) 295 + { 296 + return -EOPNOTSUPP; 297 + } 298 + #endif 299 + 249 300 ssize_t get_uprobe_offset(const void *addr) 250 301 { 251 - size_t start, end, base; 252 - char buf[256]; 253 - bool found = false; 302 + size_t start, base, end; 254 303 FILE *f; 304 + char buf[256]; 305 + int err, flags; 255 306 256 307 f = fopen("/proc/self/maps", "r"); 257 308 if (!f) 258 309 return -errno; 259 310 260 - while (fscanf(f, "%zx-%zx %s %zx %*[^\n]\n", &start, &end, buf, &base) == 4) { 261 - if (buf[2] == 'x' && (uintptr_t)addr >= start && (uintptr_t)addr < end) { 262 - found = true; 263 - break; 311 + /* requested executable VMA only */ 312 + err = procmap_query(fileno(f), addr, PROCMAP_QUERY_VMA_EXECUTABLE, &start, &base, &flags); 313 + if (err == -EOPNOTSUPP) { 314 + bool found = false; 315 + 316 + while (fscanf(f, "%zx-%zx %s %zx %*[^\n]\n", &start, &end, buf, &base) == 4) { 317 + if (buf[2] == 'x' && (uintptr_t)addr >= start && (uintptr_t)addr < end) { 318 + found = true; 319 + break; 320 + } 264 321 } 322 + if (!found) { 323 + fclose(f); 324 + return -ESRCH; 325 + } 326 + } else if (err) { 327 + fclose(f); 328 + return err; 265 329 } 266 - 267 330 fclose(f); 268 - 269 - if (!found) 270 - return -ESRCH; 271 331 272 332 #if defined(__powerpc64__) && defined(_CALL_ELF) && _CALL_ELF == 2 273 333 ··· 371 307 size_t start, end, offset; 372 308 char buf[256]; 373 309 FILE *f; 310 + int err, flags; 374 311 375 312 f = fopen("/proc/self/maps", "r"); 376 313 if (!f) 377 314 return -errno; 378 315 379 - while (fscanf(f, "%zx-%zx %s %zx %*[^\n]\n", &start, &end, buf, &offset) == 4) { 380 - if (addr >= start && addr < end) { 381 - fclose(f); 382 - return (size_t)addr - start + offset; 316 + err = 
procmap_query(fileno(f), (const void *)addr, 0, &start, &offset, &flags); 317 + if (err == 0) { 318 + fclose(f); 319 + return (size_t)addr - start + offset; 320 + } else if (err != -EOPNOTSUPP) { 321 + fclose(f); 322 + return err; 323 + } else if (err) { 324 + while (fscanf(f, "%zx-%zx %s %zx %*[^\n]\n", &start, &end, buf, &offset) == 4) { 325 + if (addr >= start && addr < end) { 326 + fclose(f); 327 + return (size_t)addr - start + offset; 328 + } 383 329 } 384 330 } 385 331
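
Note on the fallback path: when procmap_query() reports -EOPNOTSUPP, both callers drop back to scanning /proc/self/maps as text. A minimal standalone sketch of that text-based lookup is shown below. The name addr_to_file_offset() and the %255s field width are assumptions made for this illustration; the selftest code uses its own function names and an unbounded %s.

```c
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical helper: translate an address in the current process into
 * an offset within the file backing its mapping, by scanning
 * /proc/self/maps. Returns a negative errno value on failure.
 */
static ssize_t addr_to_file_offset(const void *addr)
{
	size_t start, end, offset;
	char perms[256];
	FILE *f;

	f = fopen("/proc/self/maps", "r");
	if (!f)
		return -errno;

	/* each line: start-end perms offset dev inode [path] */
	while (fscanf(f, "%zx-%zx %255s %zx %*[^\n]\n",
		      &start, &end, perms, &offset) == 4) {
		if ((uintptr_t)addr >= start && (uintptr_t)addr < end) {
			fclose(f);
			return (ssize_t)((uintptr_t)addr - start + offset);
		}
	}

	fclose(f);
	return -ESRCH; /* no mapping contains addr */
}
```

Unlike the ioctl() path, this scan reads every mapping up to the matching one, which is presumably why the query-based lookup is tried first when the kernel supports it.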

-1

tools/testing/selftests/bpf/unpriv_helpers.c

···
 
 #include <stdbool.h>
 #include <stdlib.h>
-#include <error.h>
 #include <stdio.h>
 #include <string.h>
 #include <unistd.h>

+41

tools/testing/selftests/bpf/uprobe_multi.c

···
 
 #include <stdio.h>
 #include <string.h>
+#include <stdbool.h>
+#include <stdint.h>
+#include <sys/mman.h>
+#include <unistd.h>
 #include <sdt.h>
+
+#ifndef MADV_POPULATE_READ
+#define MADV_POPULATE_READ 22
+#endif
+
+int __attribute__((weak)) uprobe(void)
+{
+	return 0;
+}
 
 #define __PASTE(a, b) a##b
 #define PASTE(a, b) __PASTE(a, b)
···
 	return 0;
 }
 
+extern char build_id_start[];
+extern char build_id_end[];
+
+int __attribute__((weak)) trigger_uprobe(bool build_id_resident)
+{
+	int page_sz = sysconf(_SC_PAGESIZE);
+	void *addr;
+
+	/* page-align build ID start */
+	addr = (void *)((uintptr_t)&build_id_start & ~(page_sz - 1));
+
+	/* to guarantee MADV_PAGEOUT work reliably, we need to ensure that
+	 * memory range is mapped into current process, so we unconditionally
+	 * do MADV_POPULATE_READ, and then MADV_PAGEOUT, if necessary
+	 */
+	madvise(addr, page_sz, MADV_POPULATE_READ);
+	if (!build_id_resident)
+		madvise(addr, page_sz, MADV_PAGEOUT);
+
+	(void)uprobe();
+
+	return 0;
+}
+
 int main(int argc, char **argv)
 {
 	if (argc != 2)
···
 		return bench();
 	if (!strcmp("usdt", argv[1]))
 		return usdt();
+	if (!strcmp("uprobe-paged-out", argv[1]))
+		return trigger_uprobe(false /* page-out build ID */);
+	if (!strcmp("uprobe-paged-in", argv[1]))
+		return trigger_uprobe(true /* page-in build ID */);
 
 error:
 	fprintf(stderr, "usage: %s <bench|usdt>\n", argv[0]);
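
Note on the page-alignment step: trigger_uprobe() rounds the build ID address down to a page boundary by masking off the low bits, which is needed because madvise() operates on page-aligned ranges. A minimal sketch of that masking, assuming a power-of-two page size, is below; page_align_down() is a hypothetical helper used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: round addr down to the start of its page.
 * page_sz must be a non-zero power of two for the mask to be valid.
 */
static uintptr_t page_align_down(uintptr_t addr, uintptr_t page_sz)
{
	assert(page_sz && (page_sz & (page_sz - 1)) == 0);

	/* page_sz - 1 has all in-page offset bits set; clear them */
	return addr & ~(page_sz - 1);
}
```

With 4096-byte pages, for example, 0x1234 maps to 0x1000 and an already aligned address is returned unchanged.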

+11

tools/testing/selftests/bpf/uprobe_multi.ld

···
+SECTIONS
+{
+	. = ALIGN(4096);
+	.note.gnu.build-id : { *(.note.gnu.build-id) }
+	. = ALIGN(4096);
+}
+INSERT AFTER .text;
+
+build_id_start = ADDR(.note.gnu.build-id);
+build_id_end = ADDR(.note.gnu.build-id) + SIZEOF(.note.gnu.build-id);
+

+1 -1

tools/testing/selftests/bpf/verifier/calls.c

···
 	},
 	.prog_type = BPF_PROG_TYPE_SCHED_CLS,
 	.result = REJECT,
-	.errstr = "arg#0 expected pointer to ctx, but got PTR",
+	.errstr = "arg#0 expected pointer to ctx, but got fp",
 	.fixup_kfunc_btf_id = {
 		{ "bpf_kfunc_call_test_pass_ctx", 2 },
 	},

+1 -1

tools/testing/selftests/bpf/verifier/map_kptr.c

···
 	.result = REJECT,
 	.errstr = "variable untrusted_ptr_ access var_off=(0x0; 0x7) disallowed",
 },
-/* Tests for unreferened PTR_TO_BTF_ID */
+/* Tests for unreferenced PTR_TO_BTF_ID */
 {
 	"map_kptr: unref: reject btf_struct_ids_match == false",
 	.insns = {

+14 -14

tools/testing/selftests/bpf/verifier/precise.c

··· 39 39 .result = VERBOSE_ACCEPT, 40 40 .errstr = 41 41 "mark_precise: frame0: last_idx 26 first_idx 20\ 42 - mark_precise: frame0: regs=r2,r9 stack= before 25\ 43 - mark_precise: frame0: regs=r2,r9 stack= before 24\ 44 - mark_precise: frame0: regs=r2,r9 stack= before 23\ 45 - mark_precise: frame0: regs=r2,r9 stack= before 22\ 46 - mark_precise: frame0: regs=r2,r9 stack= before 20\ 42 + mark_precise: frame0: regs=r2 stack= before 25\ 43 + mark_precise: frame0: regs=r2 stack= before 24\ 44 + mark_precise: frame0: regs=r2 stack= before 23\ 45 + mark_precise: frame0: regs=r2 stack= before 22\ 46 + mark_precise: frame0: regs=r2 stack= before 20\ 47 47 mark_precise: frame0: parent state regs=r2,r9 stack=:\ 48 48 mark_precise: frame0: last_idx 19 first_idx 10\ 49 49 mark_precise: frame0: regs=r2,r9 stack= before 19\ ··· 100 100 .errstr = 101 101 "26: (85) call bpf_probe_read_kernel#113\ 102 102 mark_precise: frame0: last_idx 26 first_idx 22\ 103 - mark_precise: frame0: regs=r2,r9 stack= before 25\ 104 - mark_precise: frame0: regs=r2,r9 stack= before 24\ 105 - mark_precise: frame0: regs=r2,r9 stack= before 23\ 106 - mark_precise: frame0: regs=r2,r9 stack= before 22\ 107 - mark_precise: frame0: parent state regs=r2,r9 stack=:\ 103 + mark_precise: frame0: regs=r2 stack= before 25\ 104 + mark_precise: frame0: regs=r2 stack= before 24\ 105 + mark_precise: frame0: regs=r2 stack= before 23\ 106 + mark_precise: frame0: regs=r2 stack= before 22\ 107 + mark_precise: frame0: parent state regs=r2 stack=:\ 108 108 mark_precise: frame0: last_idx 20 first_idx 20\ 109 - mark_precise: frame0: regs=r2,r9 stack= before 20\ 109 + mark_precise: frame0: regs=r2 stack= before 20\ 110 110 mark_precise: frame0: parent state regs=r2,r9 stack=:\ 111 111 mark_precise: frame0: last_idx 19 first_idx 17\ 112 112 mark_precise: frame0: regs=r2,r9 stack= before 19\ ··· 183 183 .prog_type = BPF_PROG_TYPE_XDP, 184 184 .flags = BPF_F_TEST_STATE_FREQ, 185 185 .errstr = "mark_precise: frame0: last_idx 7 
first_idx 7\ 186 - mark_precise: frame0: parent state regs=r4 stack=-8:\ 186 + mark_precise: frame0: parent state regs=r4 stack=:\ 187 187 mark_precise: frame0: last_idx 6 first_idx 4\ 188 - mark_precise: frame0: regs=r4 stack=-8 before 6: (b7) r0 = -1\ 189 - mark_precise: frame0: regs=r4 stack=-8 before 5: (79) r4 = *(u64 *)(r10 -8)\ 188 + mark_precise: frame0: regs=r4 stack= before 6: (b7) r0 = -1\ 189 + mark_precise: frame0: regs=r4 stack= before 5: (79) r4 = *(u64 *)(r10 -8)\ 190 190 mark_precise: frame0: regs= stack=-8 before 4: (7b) *(u64 *)(r3 -8) = r0\ 191 191 mark_precise: frame0: parent state regs=r0 stack=:\ 192 192 mark_precise: frame0: last_idx 3 first_idx 3\

+9 -7

tools/testing/selftests/bpf/veristat.c

··· 2 2 /* Copyright (c) 2022 Meta Platforms, Inc. and affiliates. */ 3 3 #define _GNU_SOURCE 4 4 #include <argp.h> 5 + #include <libgen.h> 5 6 #include <string.h> 6 7 #include <stdlib.h> 7 8 #include <sched.h> ··· 785 784 static int parse_stats(const char *stats_str, struct stat_specs *specs) 786 785 { 787 786 char *input, *state = NULL, *next; 788 - int err; 787 + int err, cnt = 0; 789 788 790 789 input = strdup(stats_str); 791 790 if (!input) 792 791 return -ENOMEM; 793 792 794 - while ((next = strtok_r(state ? NULL : input, ",", &state))) { 793 + while ((next = strtok_r(cnt++ ? NULL : input, ",", &state))) { 795 794 err = parse_stat(next, specs); 796 795 if (err) { 797 796 free(input); ··· 989 988 990 989 static int process_prog(const char *filename, struct bpf_object *obj, struct bpf_program *prog) 991 990 { 991 + const char *base_filename = basename(strdupa(filename)); 992 992 const char *prog_name = bpf_program__name(prog); 993 - const char *base_filename = basename(filename); 994 993 char *buf; 995 994 int buf_sz, log_level; 996 995 struct verif_stats *stats; ··· 1057 1056 1058 1057 static int process_obj(const char *filename) 1059 1058 { 1059 + const char *base_filename = basename(strdupa(filename)); 1060 1060 struct bpf_object *obj = NULL, *tobj; 1061 1061 struct bpf_program *prog, *tprog, *lprog; 1062 1062 libbpf_print_fn_t old_libbpf_print_fn; 1063 1063 LIBBPF_OPTS(bpf_object_open_opts, opts); 1064 1064 int err = 0, prog_cnt = 0; 1065 1065 1066 - if (!should_process_file_prog(basename(filename), NULL)) { 1066 + if (!should_process_file_prog(base_filename, NULL)) { 1067 1067 if (env.verbose) 1068 1068 printf("Skipping '%s' due to filters...\n", filename); 1069 1069 env.files_skipped++; ··· 1078 1076 } 1079 1077 1080 1078 if (!env.quiet && env.out_fmt == RESFMT_TABLE) 1081 - printf("Processing '%s'...\n", basename(filename)); 1079 + printf("Processing '%s'...\n", base_filename); 1082 1080 1083 1081 old_libbpf_print_fn = libbpf_set_print(libbpf_print_fn); 
1084 1082 obj = bpf_object__open_file(filename, &opts); ··· 1495 1493 while (fgets(line, sizeof(line), f)) { 1496 1494 char *input = line, *state = NULL, *next; 1497 1495 struct verif_stats *st = NULL; 1498 - int col = 0; 1496 + int col = 0, cnt = 0; 1499 1497 1500 1498 if (!header) { 1501 1499 void *tmp; ··· 1513 1511 *stat_cntp += 1; 1514 1512 } 1515 1513 1516 - while ((next = strtok_r(state ? NULL : input, ",\n", &state))) { 1514 + while ((next = strtok_r(cnt++ ? NULL : input, ",\n", &state))) { 1517 1515 if (header) { 1518 1516 /* for the first line, set up spec stats */ 1519 1517 err = parse_stat(next, specs);
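
Note on the basename() hunks: once <libgen.h> is included, basename() is the POSIX variant, which is allowed to modify the string it is given, so calling it directly on a const char * filename is unsafe. The diff copies the name with strdupa() first. A minimal sketch of the same idea using a caller-supplied buffer instead of strdupa() is below; base_of() is a hypothetical helper for illustration and is not what veristat.c does.

```c
#include <assert.h>
#include <libgen.h>
#include <stdio.h>

/* Hypothetical helper: return the final path component of 'path' without
 * modifying 'path'. The result may point into 'buf', so 'buf' must
 * outlive any use of the returned pointer.
 */
static const char *base_of(const char *path, char *buf, size_t buf_sz)
{
	assert(path && buf && buf_sz > 0);

	/* POSIX basename() may write to its argument: hand it a copy */
	snprintf(buf, buf_sz, "%s", path);
	return basename(buf);
}
```

A caller-owned buffer avoids the stack allocation that strdupa() performs, at the cost of silently truncating paths longer than the buffer.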

+73 -34

tools/testing/selftests/bpf/vmtest.sh

··· 1 1 #!/bin/bash 2 2 # SPDX-License-Identifier: GPL-2.0 3 3 4 - set -u 5 4 set -e 6 5 7 - # This script currently only works for x86_64 and s390x, as 8 - # it is based on the VM image used by the BPF CI, which is 9 - # available only for these architectures. 10 - ARCH="$(uname -m)" 11 - case "${ARCH}" in 6 + # This script currently only works for the following platforms, 7 + # as it is based on the VM image used by the BPF CI, which is 8 + # available only for these architectures. We can also specify 9 + # the local rootfs image generated by the following script: 10 + # https://github.com/libbpf/ci/blob/main/rootfs/mkrootfs_debian.sh 11 + PLATFORM="${PLATFORM:-$(uname -m)}" 12 + case "${PLATFORM}" in 12 13 s390x) 13 14 QEMU_BINARY=qemu-system-s390x 14 15 QEMU_CONSOLE="ttyS1" 15 - QEMU_FLAGS=(-smp 2) 16 + HOST_FLAGS=(-smp 2 -enable-kvm) 17 + CROSS_FLAGS=(-smp 2) 16 18 BZIMAGE="arch/s390/boot/vmlinux" 19 + ARCH="s390" 17 20 ;; 18 21 x86_64) 19 22 QEMU_BINARY=qemu-system-x86_64 20 23 QEMU_CONSOLE="ttyS0,115200" 21 - QEMU_FLAGS=(-cpu host -smp 8) 24 + HOST_FLAGS=(-cpu host -enable-kvm -smp 8) 25 + CROSS_FLAGS=(-smp 8) 22 26 BZIMAGE="arch/x86/boot/bzImage" 27 + ARCH="x86" 23 28 ;; 24 29 aarch64) 25 30 QEMU_BINARY=qemu-system-aarch64 26 31 QEMU_CONSOLE="ttyAMA0,115200" 27 - QEMU_FLAGS=(-M virt,gic-version=3 -cpu host -smp 8) 32 + HOST_FLAGS=(-M virt,gic-version=3 -cpu host -enable-kvm -smp 8) 33 + CROSS_FLAGS=(-M virt,gic-version=3 -cpu cortex-a76 -smp 8) 28 34 BZIMAGE="arch/arm64/boot/Image" 35 + ARCH="arm64" 36 + ;; 37 + riscv64) 38 + # required qemu version v7.2.0+ 39 + QEMU_BINARY=qemu-system-riscv64 40 + QEMU_CONSOLE="ttyS0,115200" 41 + HOST_FLAGS=(-M virt -cpu host -enable-kvm -smp 8) 42 + CROSS_FLAGS=(-M virt -cpu rv64,sscofpmf=true -smp 8) 43 + BZIMAGE="arch/riscv/boot/Image" 44 + ARCH="riscv" 29 45 ;; 30 46 *) 31 47 echo "Unsupported architecture" ··· 50 34 esac 51 35 DEFAULT_COMMAND="./test_progs" 52 36 MOUNT_DIR="mnt" 37 + LOCAL_ROOTFS_IMAGE="" 53 38 
ROOTFS_IMAGE="root.img" 54 39 OUTPUT_DIR="$HOME/.bpf_selftests" 55 40 KCONFIG_REL_PATHS=("tools/testing/selftests/bpf/config" 56 41 "tools/testing/selftests/bpf/config.vm" 57 - "tools/testing/selftests/bpf/config.${ARCH}") 42 + "tools/testing/selftests/bpf/config.${PLATFORM}") 58 43 INDEX_URL="https://raw.githubusercontent.com/libbpf/ci/master/INDEX" 59 44 NUM_COMPILE_JOBS="$(nproc)" 60 45 LOG_FILE_BASE="$(date +"bpf_selftests.%Y-%m-%d_%H-%M-%S")" ··· 75 58 If no command is specified and a debug shell (-s) is not requested, 76 59 "${DEFAULT_COMMAND}" will be run by default. 77 60 61 + Using PLATFORM= and CROSS_COMPILE= options will enable cross platform testing: 62 + 63 + PLATFORM=<platform> CROSS_COMPILE=<toolchain> $0 -- ./test_progs -t test_lsm 64 + 78 65 If you build your kernel using KBUILD_OUTPUT= or O= options, these 79 66 can be passed as environment variables to the script: 80 67 ··· 90 69 91 70 Options: 92 71 72 + -l) Specify the path to the local rootfs image. 93 73 -i) Update the rootfs image with a newer version. 94 74 -d) Update the output directory (default: ${OUTPUT_DIR}) 95 75 -j) Number of jobs for compilation, similar to -j in make ··· 114 92 fi 115 93 } 116 94 117 - download() 95 + newest_rootfs_version() 118 96 { 119 - local file="$1" 97 + { 98 + for file in "${!URLS[@]}"; do 99 + if [[ $file =~ ^"${PLATFORM}"/libbpf-vmtest-rootfs-(.*)\.tar\.zst$ ]]; then 100 + echo "${BASH_REMATCH[1]}" 101 + fi 102 + done 103 + } | sort -rV | head -1 104 + } 105 + 106 + download_rootfs() 107 + { 108 + populate_url_map 109 + 110 + local rootfsversion="$(newest_rootfs_version)" 111 + local file="${PLATFORM}/libbpf-vmtest-rootfs-$rootfsversion.tar.zst" 120 112 121 113 if [[ ! 
-v URLS[$file] ]]; then 122 114 echo "$file not found" >&2 ··· 141 105 curl -Lsf "${URLS[$file]}" "${@:2}" 142 106 } 143 107 144 - newest_rootfs_version() 108 + load_rootfs() 145 109 { 146 - { 147 - for file in "${!URLS[@]}"; do 148 - if [[ $file =~ ^"${ARCH}"/libbpf-vmtest-rootfs-(.*)\.tar\.zst$ ]]; then 149 - echo "${BASH_REMATCH[1]}" 150 - fi 151 - done 152 - } | sort -rV | head -1 153 - } 154 - 155 - download_rootfs() 156 - { 157 - local rootfsversion="$1" 158 - local dir="$2" 110 + local dir="$1" 159 111 160 112 if ! which zstd &> /dev/null; then 161 113 echo 'Could not find "zstd" on the system, please install zstd' 162 114 exit 1 163 115 fi 164 116 165 - download "${ARCH}/libbpf-vmtest-rootfs-$rootfsversion.tar.zst" | 166 - zstd -d | sudo tar -C "$dir" -x 117 + if [[ -n "${LOCAL_ROOTFS_IMAGE}" ]]; then 118 + cat "${LOCAL_ROOTFS_IMAGE}" | zstd -d | sudo tar -C "$dir" -x 119 + else 120 + download_rootfs | zstd -d | sudo tar -C "$dir" -x 121 + fi 167 122 } 168 123 169 124 recompile_kernel() ··· 254 227 mkfs.ext4 -q "${rootfs_img}" 255 228 256 229 mount_image 257 - download_rootfs "$(newest_rootfs_version)" "${mount_dir}" 230 + load_rootfs "${mount_dir}" 258 231 unmount_image 259 232 } 260 233 ··· 271 244 exit 1 272 245 fi 273 246 247 + if [[ "${PLATFORM}" != "$(uname -m)" ]]; then 248 + QEMU_FLAGS=("${CROSS_FLAGS[@]}") 249 + else 250 + QEMU_FLAGS=("${HOST_FLAGS[@]}") 251 + fi 252 + 274 253 ${QEMU_BINARY} \ 275 254 -nodefaults \ 276 255 -display none \ 277 256 -serial mon:stdio \ 278 257 "${QEMU_FLAGS[@]}" \ 279 - -enable-kvm \ 280 258 -m 4G \ 281 259 -drive file="${rootfs_img}",format=raw,index=1,media=disk,if=virtio,cache=none \ 282 260 -kernel "${kernel_bzimage}" \ ··· 373 341 local exit_command="poweroff -f" 374 342 local debug_shell="no" 375 343 376 - while getopts ':hskid:j:' opt; do 344 + while getopts ':hskl:id:j:' opt; do 377 345 case ${opt} in 346 + l) 347 + LOCAL_ROOTFS_IMAGE="$OPTARG" 348 + ;; 378 349 i) 379 350 update_image="yes" 380 351 ;; ··· 412 
377 413 378 trap 'catch "$?"' EXIT 414 379 380 + if [[ "${PLATFORM}" != "$(uname -m)" ]] && [[ -z "${CROSS_COMPILE}" ]]; then 381 + echo "Cross-platform testing needs to specify CROSS_COMPILE" 382 + exit 1 383 + fi 384 + 415 385 if [[ $# -eq 0 && "${debug_shell}" == "no" ]]; then 416 386 echo "No command specified, will run ${DEFAULT_COMMAND} in the vm" 417 387 else ··· 424 384 fi 425 385 426 386 local kconfig_file="${OUTPUT_DIR}/latest.config" 427 - local make_command="make -j ${NUM_COMPILE_JOBS} KCONFIG_CONFIG=${kconfig_file}" 387 + local make_command="make ARCH=${ARCH} CROSS_COMPILE=${CROSS_COMPILE} \ 388 + -j ${NUM_COMPILE_JOBS} KCONFIG_CONFIG=${kconfig_file}" 428 389 429 390 # Figure out where the kernel is being built. 430 391 # O takes precedence over KBUILD_OUTPUT. ··· 442 401 kernel_bzimage="${KBUILD_OUTPUT}/${BZIMAGE}" 443 402 make_command="${make_command} KBUILD_OUTPUT=${KBUILD_OUTPUT}" 444 403 fi 445 - 446 - populate_url_map 447 404 448 405 local rootfs_img="${OUTPUT_DIR}/${ROOTFS_IMAGE}" 449 406 local mount_dir="${OUTPUT_DIR}/${MOUNT_DIR}"

+1

tools/testing/selftests/bpf/xskxceiver.c

···
 #include <signal.h>
 #include <stdio.h>
 #include <stdlib.h>
+#include <libgen.h>
 #include <string.h>
 #include <stddef.h>
 #include <sys/mman.h>
