Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

libbpf: Rework feature-probing APIs

Create three extensible alternatives to inconsistently named
feature-probing APIs:

- libbpf_probe_bpf_prog_type() instead of bpf_probe_prog_type();
- libbpf_probe_bpf_map_type() instead of bpf_probe_map_type();
- libbpf_probe_bpf_helper() instead of bpf_probe_helper().

Set up return values such that libbpf can report errors (e.g., when some
combination of input arguments can't be validated) in addition to
reporting whether the feature is supported (return value 1) or not
supported (return value 0). Errors are reported as negative error codes.
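The three-way return convention described above can be handled along these lines. This is a minimal sketch: `enum feature_state`, `classify_probe_result()` and `feature_state_str()` are hypothetical names used for illustration only and are not part of libbpf.

```c
/* Hypothetical caller-side helpers (NOT libbpf API): map the tri-state
 * result of a libbpf_probe_bpf_*() call onto an explicit state.
 */
enum feature_state {
	FEATURE_UNSUPPORTED = 0,	/* probe returned 0 */
	FEATURE_SUPPORTED = 1,		/* probe returned 1 */
	FEATURE_UNKNOWN = 2,		/* probe returned a negative error code */
};

static enum feature_state classify_probe_result(int ret)
{
	if (ret < 0)
		return FEATURE_UNKNOWN;	/* detection failed or can't be performed */
	/* Assumption: any positive value is treated as "supported". */
	return ret == 0 ? FEATURE_UNSUPPORTED : FEATURE_SUPPORTED;
}

static const char *feature_state_str(enum feature_state st)
{
	switch (st) {
	case FEATURE_SUPPORTED:
		return "supported";
	case FEATURE_UNSUPPORTED:
		return "not supported";
	default:
		return "unknown (probe failed)";
	}
}
```

With libbpf available, a caller would pass the probe result straight in, e.g. `classify_probe_result(libbpf_probe_bpf_prog_type(BPF_PROG_TYPE_XDP, NULL))`. The point of the third state is that a failed probe is no longer silently reported as "not supported", which the old bool-returning APIs could not express.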

Also schedule the three old APIs, along with bpf_probe_large_insn_limit(),
for deprecation in libbpf 0.8.

Also fix all the existing detection logic for various program and map
types that never worked:

- BPF_PROG_TYPE_LIRC_MODE2;
- BPF_PROG_TYPE_TRACING;
- BPF_PROG_TYPE_LSM;
- BPF_PROG_TYPE_EXT;
- BPF_PROG_TYPE_SYSCALL;
- BPF_PROG_TYPE_STRUCT_OPS;
- BPF_MAP_TYPE_STRUCT_OPS;
- BPF_MAP_TYPE_BLOOM_FILTER.

The above prog/map types need special setup and detection logic to be
probed correctly. A subsequent patch adds selftests that make sure the
detection logic keeps working for all current and future program and map
types, avoiding otherwise inevitable bit rot.
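The "special detection logic" mentioned above amounts to treating one specific failure as proof of support: the probe deliberately passes an invalid attach target, and the kernel can only reject it with that particular error (and log message) if it knows the program or map type. The decision rule can be sketched in isolation as follows. `struct probe_expectation` and `interpret_probe()` are hypothetical names for illustration; the patch implements the same checks inline in probe_prog_load() and probe_map_create().

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical description of what a probe expects from the kernel. */
struct probe_expectation {
	int exp_err;		/* 0: a successful load/create proves support */
	const char *exp_msg;	/* optional verifier-log substring, may be NULL */
};

/* fd:  result of the load/create attempt (negative on failure)
 * err: -errno captured right after the attempt
 * log: verifier log buffer, may be NULL when no log was requested
 * Returns 1 if the feature looks supported, 0 otherwise.
 */
static int interpret_probe(int fd, int err, const char *log,
			   const struct probe_expectation *exp)
{
	if (exp->exp_err) {
		/* Success, or a different error, means the kernel did not
		 * get as far as the check we are relying on.
		 */
		if (fd >= 0 || err != exp->exp_err)
			return 0;
		if (exp->exp_msg && (!log || !strstr(log, exp->exp_msg)))
			return 0;
		return 1;
	}
	return fd >= 0 ? 1 : 0;
}
```

For BPF_PROG_TYPE_TRACING, for example, the patch sets the expected error to -EINVAL and the expected message to "attach_btf_id 1 is not a function"; any other outcome is reported as unsupported.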

[0] Closes: https://github.com/libbpf/libbpf/issues/312

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Dave Marchevsky <davemarchevsky@fb.com>
Cc: Julia Kartseva <hex@fb.com>
Link: https://lore.kernel.org/bpf/20211217171202.3352835-2-andrii@kernel.org

Authored by Andrii Nakryiko and committed by Daniel Borkmann
878d8def 496f3324

+237 -55
+48 -4
tools/lib/bpf/libbpf.h
···
  * user, causing subsequent probes to fail. In this case, the caller may want
  * to adjust that limit with setrlimit().
  */
-LIBBPF_API bool bpf_probe_prog_type(enum bpf_prog_type prog_type,
-				    __u32 ifindex);
+LIBBPF_DEPRECATED_SINCE(0, 8, "use libbpf_probe_bpf_prog_type() instead")
+LIBBPF_API bool bpf_probe_prog_type(enum bpf_prog_type prog_type, __u32 ifindex);
+LIBBPF_DEPRECATED_SINCE(0, 8, "use libbpf_probe_bpf_map_type() instead")
 LIBBPF_API bool bpf_probe_map_type(enum bpf_map_type map_type, __u32 ifindex);
-LIBBPF_API bool bpf_probe_helper(enum bpf_func_id id,
-				 enum bpf_prog_type prog_type, __u32 ifindex);
+LIBBPF_DEPRECATED_SINCE(0, 8, "use libbpf_probe_bpf_helper() instead")
+LIBBPF_API bool bpf_probe_helper(enum bpf_func_id id, enum bpf_prog_type prog_type, __u32 ifindex);
+LIBBPF_DEPRECATED_SINCE(0, 8, "implement your own or use bpftool for feature detection")
 LIBBPF_API bool bpf_probe_large_insn_limit(__u32 ifindex);
+
+/**
+ * @brief **libbpf_probe_bpf_prog_type()** detects if host kernel supports
+ * BPF programs of a given type.
+ * @param prog_type BPF program type to detect kernel support for
+ * @param opts reserved for future extensibility, should be NULL
+ * @return 1, if given program type is supported; 0, if given program type is
+ * not supported; negative error code if feature detection failed or can't be
+ * performed
+ *
+ * Make sure the process has required set of CAP_* permissions (or runs as
+ * root) when performing feature checking.
+ */
+LIBBPF_API int libbpf_probe_bpf_prog_type(enum bpf_prog_type prog_type, const void *opts);
+/**
+ * @brief **libbpf_probe_bpf_map_type()** detects if host kernel supports
+ * BPF maps of a given type.
+ * @param map_type BPF map type to detect kernel support for
+ * @param opts reserved for future extensibility, should be NULL
+ * @return 1, if given map type is supported; 0, if given map type is
+ * not supported; negative error code if feature detection failed or can't be
+ * performed
+ *
+ * Make sure the process has required set of CAP_* permissions (or runs as
+ * root) when performing feature checking.
+ */
+LIBBPF_API int libbpf_probe_bpf_map_type(enum bpf_map_type map_type, const void *opts);
+/**
+ * @brief **libbpf_probe_bpf_helper()** detects if host kernel supports the
+ * use of a given BPF helper from specified BPF program type.
+ * @param prog_type BPF program type used to check the support of BPF helper
+ * @param helper_id BPF helper ID (enum bpf_func_id) to check support for
+ * @param opts reserved for future extensibility, should be NULL
+ * @return 1, if given combination of program type and helper is supported; 0,
+ * if the combination is not supported; negative error code if feature
+ * detection for provided input arguments failed or can't be performed
+ *
+ * Make sure the process has required set of CAP_* permissions (or runs as
+ * root) when performing feature checking.
+ */
+LIBBPF_API int libbpf_probe_bpf_helper(enum bpf_prog_type prog_type,
+				       enum bpf_func_id helper_id, const void *opts);
 
 /*
  * Get bpf_prog_info in continuous memory
+3
tools/lib/bpf/libbpf.map
···
 		bpf_program__log_level;
 		bpf_program__set_log_buf;
 		bpf_program__set_log_level;
+		libbpf_probe_bpf_helper;
+		libbpf_probe_bpf_map_type;
+		libbpf_probe_bpf_prog_type;
 		libbpf_set_memlock_rlim_max;
 };
+186 -51
tools/lib/bpf/libbpf_probes.c
···
 	return (version << 16) + (subversion << 8) + patchlevel;
 }
 
-static void
-probe_load(enum bpf_prog_type prog_type, const struct bpf_insn *insns,
-	   size_t insns_cnt, char *buf, size_t buf_len, __u32 ifindex)
+static int probe_prog_load(enum bpf_prog_type prog_type,
+			   const struct bpf_insn *insns, size_t insns_cnt,
+			   char *log_buf, size_t log_buf_sz,
+			   __u32 ifindex)
 {
-	LIBBPF_OPTS(bpf_prog_load_opts, opts);
-	int fd;
+	LIBBPF_OPTS(bpf_prog_load_opts, opts,
+		.log_buf = log_buf,
+		.log_size = log_buf_sz,
+		.log_level = log_buf ? 1 : 0,
+		.prog_ifindex = ifindex,
+	);
+	int fd, err, exp_err = 0;
+	const char *exp_msg = NULL;
+	char buf[4096];
 
 	switch (prog_type) {
 	case BPF_PROG_TYPE_CGROUP_SOCK_ADDR:
···
 		break;
 	case BPF_PROG_TYPE_KPROBE:
 		opts.kern_version = get_kernel_version();
+		break;
+	case BPF_PROG_TYPE_LIRC_MODE2:
+		opts.expected_attach_type = BPF_LIRC_MODE2;
+		break;
+	case BPF_PROG_TYPE_TRACING:
+	case BPF_PROG_TYPE_LSM:
+		opts.log_buf = buf;
+		opts.log_size = sizeof(buf);
+		opts.log_level = 1;
+		if (prog_type == BPF_PROG_TYPE_TRACING)
+			opts.expected_attach_type = BPF_TRACE_FENTRY;
+		else
+			opts.expected_attach_type = BPF_MODIFY_RETURN;
+		opts.attach_btf_id = 1;
+
+		exp_err = -EINVAL;
+		exp_msg = "attach_btf_id 1 is not a function";
+		break;
+	case BPF_PROG_TYPE_EXT:
+		opts.log_buf = buf;
+		opts.log_size = sizeof(buf);
+		opts.log_level = 1;
+		opts.attach_btf_id = 1;
+
+		exp_err = -EINVAL;
+		exp_msg = "Cannot replace kernel functions";
+		break;
+	case BPF_PROG_TYPE_SYSCALL:
+		opts.prog_flags = BPF_F_SLEEPABLE;
+		break;
+	case BPF_PROG_TYPE_STRUCT_OPS:
+		exp_err = -524; /* -ENOTSUPP */
 		break;
 	case BPF_PROG_TYPE_UNSPEC:
 	case BPF_PROG_TYPE_SOCKET_FILTER:
···
 	case BPF_PROG_TYPE_RAW_TRACEPOINT:
 	case BPF_PROG_TYPE_RAW_TRACEPOINT_WRITABLE:
 	case BPF_PROG_TYPE_LWT_SEG6LOCAL:
-	case BPF_PROG_TYPE_LIRC_MODE2:
 	case BPF_PROG_TYPE_SK_REUSEPORT:
 	case BPF_PROG_TYPE_FLOW_DISSECTOR:
 	case BPF_PROG_TYPE_CGROUP_SYSCTL:
-	case BPF_PROG_TYPE_TRACING:
-	case BPF_PROG_TYPE_STRUCT_OPS:
-	case BPF_PROG_TYPE_EXT:
-	case BPF_PROG_TYPE_LSM:
-	default:
 		break;
+	default:
+		return -EOPNOTSUPP;
 	}
 
-	opts.prog_ifindex = ifindex;
-	opts.log_buf = buf;
-	opts.log_size = buf_len;
-
-	fd = bpf_prog_load(prog_type, NULL, "GPL", insns, insns_cnt, NULL);
+	fd = bpf_prog_load(prog_type, NULL, "GPL", insns, insns_cnt, &opts);
+	err = -errno;
 	if (fd >= 0)
 		close(fd);
+	if (exp_err) {
+		if (fd >= 0 || err != exp_err)
+			return 0;
+		if (exp_msg && !strstr(buf, exp_msg))
+			return 0;
+		return 1;
+	}
+	return fd >= 0 ? 1 : 0;
+}
+
+int libbpf_probe_bpf_prog_type(enum bpf_prog_type prog_type, const void *opts)
+{
+	struct bpf_insn insns[] = {
+		BPF_MOV64_IMM(BPF_REG_0, 0),
+		BPF_EXIT_INSN()
+	};
+	const size_t insn_cnt = ARRAY_SIZE(insns);
+	int ret;
+
+	if (opts)
+		return libbpf_err(-EINVAL);
+
+	ret = probe_prog_load(prog_type, insns, insn_cnt, NULL, 0, 0);
+	return libbpf_err(ret);
 }
 
 bool bpf_probe_prog_type(enum bpf_prog_type prog_type, __u32 ifindex)
···
 		BPF_EXIT_INSN()
 	};
 
+	/* prefer libbpf_probe_bpf_prog_type() unless offload is requested */
+	if (ifindex == 0)
+		return libbpf_probe_bpf_prog_type(prog_type, NULL) == 1;
+
 	if (ifindex && prog_type == BPF_PROG_TYPE_SCHED_CLS)
 		/* nfp returns -EINVAL on exit(0) with TC offload */
 		insns[0].imm = 2;
 
 	errno = 0;
-	probe_load(prog_type, insns, ARRAY_SIZE(insns), NULL, 0, ifindex);
+	probe_prog_load(prog_type, insns, ARRAY_SIZE(insns), NULL, 0, ifindex);
 
 	return errno != EINVAL && errno != EOPNOTSUPP;
 }
···
 				     strs, sizeof(strs));
 }
 
-bool bpf_probe_map_type(enum bpf_map_type map_type, __u32 ifindex)
+static int probe_map_create(enum bpf_map_type map_type, __u32 ifindex)
 {
-	int key_size, value_size, max_entries, map_flags;
+	LIBBPF_OPTS(bpf_map_create_opts, opts);
+	int key_size, value_size, max_entries;
 	__u32 btf_key_type_id = 0, btf_value_type_id = 0;
-	int fd = -1, btf_fd = -1, fd_inner;
+	int fd = -1, btf_fd = -1, fd_inner = -1, exp_err = 0, err;
+
+	opts.map_ifindex = ifindex;
 
 	key_size	= sizeof(__u32);
 	value_size	= sizeof(__u32);
 	max_entries	= 1;
-	map_flags	= 0;
 
 	switch (map_type) {
 	case BPF_MAP_TYPE_STACK_TRACE:
···
 	case BPF_MAP_TYPE_LPM_TRIE:
 		key_size	= sizeof(__u64);
 		value_size	= sizeof(__u64);
-		map_flags	= BPF_F_NO_PREALLOC;
+		opts.map_flags	= BPF_F_NO_PREALLOC;
 		break;
 	case BPF_MAP_TYPE_CGROUP_STORAGE:
 	case BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE:
···
 		btf_value_type_id = 3;
 		value_size = 8;
 		max_entries = 0;
-		map_flags = BPF_F_NO_PREALLOC;
+		opts.map_flags = BPF_F_NO_PREALLOC;
 		btf_fd = load_local_storage_btf();
 		if (btf_fd < 0)
-			return false;
+			return btf_fd;
 		break;
 	case BPF_MAP_TYPE_RINGBUF:
 		key_size = 0;
 		value_size = 0;
 		max_entries = 4096;
 		break;
-	case BPF_MAP_TYPE_UNSPEC:
+	case BPF_MAP_TYPE_STRUCT_OPS:
+		/* we'll get -ENOTSUPP for invalid BTF type ID for struct_ops */
+		opts.btf_vmlinux_value_type_id = 1;
+		exp_err = -524; /* -ENOTSUPP */
+		break;
+	case BPF_MAP_TYPE_BLOOM_FILTER:
+		key_size = 0;
+		max_entries = 1;
+		break;
 	case BPF_MAP_TYPE_HASH:
 	case BPF_MAP_TYPE_ARRAY:
 	case BPF_MAP_TYPE_PROG_ARRAY:
···
 	case BPF_MAP_TYPE_XSKMAP:
 	case BPF_MAP_TYPE_SOCKHASH:
 	case BPF_MAP_TYPE_REUSEPORT_SOCKARRAY:
-	case BPF_MAP_TYPE_STRUCT_OPS:
-	default:
 		break;
+	case BPF_MAP_TYPE_UNSPEC:
+	default:
+		return -EOPNOTSUPP;
 	}
 
 	if (map_type == BPF_MAP_TYPE_ARRAY_OF_MAPS ||
 	    map_type == BPF_MAP_TYPE_HASH_OF_MAPS) {
-		LIBBPF_OPTS(bpf_map_create_opts, opts);
-
 		/* TODO: probe for device, once libbpf has a function to create
 		 * map-in-map for offload
 		 */
 		if (ifindex)
-			return false;
+			goto cleanup;
 
 		fd_inner = bpf_map_create(BPF_MAP_TYPE_HASH, NULL,
 					  sizeof(__u32), sizeof(__u32), 1, NULL);
 		if (fd_inner < 0)
-			return false;
+			goto cleanup;
 
 		opts.inner_map_fd = fd_inner;
-		fd = bpf_map_create(map_type, NULL, sizeof(__u32), sizeof(__u32), 1, &opts);
-		close(fd_inner);
-	} else {
-		LIBBPF_OPTS(bpf_map_create_opts, opts);
-
-		/* Note: No other restriction on map type probes for offload */
-		opts.map_flags = map_flags;
-		opts.map_ifindex = ifindex;
-		if (btf_fd >= 0) {
-			opts.btf_fd = btf_fd;
-			opts.btf_key_type_id = btf_key_type_id;
-			opts.btf_value_type_id = btf_value_type_id;
-		}
-
-		fd = bpf_map_create(map_type, NULL, key_size, value_size, max_entries, &opts);
 	}
+
+	if (btf_fd >= 0) {
+		opts.btf_fd = btf_fd;
+		opts.btf_key_type_id = btf_key_type_id;
+		opts.btf_value_type_id = btf_value_type_id;
+	}
+
+	fd = bpf_map_create(map_type, NULL, key_size, value_size, max_entries, &opts);
+	err = -errno;
+
+cleanup:
 	if (fd >= 0)
 		close(fd);
+	if (fd_inner >= 0)
+		close(fd_inner);
 	if (btf_fd >= 0)
 		close(btf_fd);
 
-	return fd >= 0;
+	if (exp_err)
+		return fd < 0 && err == exp_err ? 1 : 0;
+	else
+		return fd >= 0 ? 1 : 0;
+}
+
+int libbpf_probe_bpf_map_type(enum bpf_map_type map_type, const void *opts)
+{
+	int ret;
+
+	if (opts)
+		return libbpf_err(-EINVAL);
+
+	ret = probe_map_create(map_type, 0);
+	return libbpf_err(ret);
+}
+
+bool bpf_probe_map_type(enum bpf_map_type map_type, __u32 ifindex)
+{
+	return probe_map_create(map_type, ifindex) == 1;
+}
+
+int libbpf_probe_bpf_helper(enum bpf_prog_type prog_type, enum bpf_func_id helper_id,
+			    const void *opts)
+{
+	struct bpf_insn insns[] = {
+		BPF_EMIT_CALL((__u32)helper_id),
+		BPF_EXIT_INSN(),
+	};
+	const size_t insn_cnt = ARRAY_SIZE(insns);
+	char buf[4096];
+	int ret;
+
+	if (opts)
+		return libbpf_err(-EINVAL);
+
+	/* we can't successfully load all prog types to check for BPF helper
+	 * support, so bail out with -EOPNOTSUPP error
+	 */
+	switch (prog_type) {
+	case BPF_PROG_TYPE_TRACING:
+	case BPF_PROG_TYPE_EXT:
+	case BPF_PROG_TYPE_LSM:
+	case BPF_PROG_TYPE_STRUCT_OPS:
+		return -EOPNOTSUPP;
+	default:
+		break;
+	}
+
+	buf[0] = '\0';
+	ret = probe_prog_load(prog_type, insns, insn_cnt, buf, sizeof(buf), 0);
+	if (ret < 0)
+		return libbpf_err(ret);
+
+	/* If BPF verifier doesn't recognize BPF helper ID (enum bpf_func_id)
+	 * at all, it will emit something like "invalid func unknown#181".
+	 * If BPF verifier recognizes BPF helper but it's not supported for
+	 * given BPF program type, it will emit "unknown func bpf_sys_bpf#166".
+	 * In both cases, provided combination of BPF program type and BPF
+	 * helper is not supported by the kernel.
+	 * In all other cases, probe_prog_load() above will either succeed (e.g.,
+	 * because BPF helper happens to accept no input arguments or it
+	 * accepts one input argument and initial PTR_TO_CTX is fine for
+	 * that), or we'll get some more specific BPF verifier error about
+	 * some unsatisfied conditions.
+	 */
+	if (ret == 0 && (strstr(buf, "invalid func ") || strstr(buf, "unknown func ")))
+		return 0;
+	return 1; /* assume supported */
 }
 
 bool bpf_probe_helper(enum bpf_func_id id, enum bpf_prog_type prog_type,
···
 	char buf[4096] = {};
 	bool res;
 
-	probe_load(prog_type, insns, ARRAY_SIZE(insns), buf, sizeof(buf),
-		   ifindex);
+	probe_prog_load(prog_type, insns, ARRAY_SIZE(insns), buf, sizeof(buf), ifindex);
 	res = !grep(buf, "invalid func ") && !grep(buf, "unknown func ");
 
 	if (ifindex) {
···
 	insns[BPF_MAXINSNS] = BPF_EXIT_INSN();
 
 	errno = 0;
-	probe_load(BPF_PROG_TYPE_SCHED_CLS, insns, ARRAY_SIZE(insns), NULL, 0,
-		   ifindex);
+	probe_prog_load(BPF_PROG_TYPE_SCHED_CLS, insns, ARRAY_SIZE(insns), NULL, 0,
+			ifindex);
 
 	return errno != E2BIG && errno != EINVAL;
 }
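The verifier-log rule used by libbpf_probe_bpf_helper() in the diff above can be read as a small pure function. The sketch below restates it outside libbpf for clarity: `helper_supported_from_log()` is a hypothetical name, and `load_ret` stands in for the value returned by probe_prog_load().

```c
#include <string.h>

/* Hypothetical restatement of the decision made after the probe load:
 *   load_ret < 0  -> the probe itself failed, propagate the error
 *   load_ret == 0 -> load was rejected; inspect the verifier log
 *   load_ret == 1 -> load succeeded, so the helper is callable
 * log must be a NUL-terminated string (possibly empty).
 */
static int helper_supported_from_log(int load_ret, const char *log)
{
	if (load_ret < 0)
		return load_ret;
	/* Both messages mean "this helper can't be used from this program
	 * type": either the helper ID is unknown to the kernel, or it is
	 * known but not allowed for the program type.
	 */
	if (load_ret == 0 &&
	    (strstr(log, "invalid func ") || strstr(log, "unknown func ")))
		return 0;
	/* Any other rejection is about how the helper was called, not
	 * whether it exists, so assume it is supported.
	 */
	return 1;
}
```

The design choice is deliberately optimistic: the two-instruction probe program rarely passes valid arguments to the helper, so most rejections are unrelated to helper availability and only the two specific messages are treated as "not supported".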