Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Daniel Borkmann says:

====================
pull-request: bpf-next 2022-07-09

We've added 94 non-merge commits during the last 19 day(s) which contain
a total of 125 files changed, 5141 insertions(+), 6701 deletions(-).

The main changes are:

1) Add new way for performing BTF type queries to BPF, from Daniel Müller.

2) Add inlining of calls to bpf_loop() helper when its function callback is
statically known, from Eduard Zingerman.

3) Implement BPF TCP CC framework usability improvements, from Jörn-Thorben Hinz.

4) Add LSM flavor for attaching per-cgroup BPF programs to existing LSM
hooks, from Stanislav Fomichev.

5) Remove all deprecated libbpf APIs in prep for 1.0 release, from Andrii Nakryiko.

6) Add benchmarks around local_storage to BPF selftests, from Dave Marchevsky.

7) AF_XDP sample removal (given move to libxdp) and various improvements around AF_XDP
selftests, from Magnus Karlsson & Maciej Fijalkowski.

8) Add bpftool improvements for memcg probing and bash completion, from Quentin Monnet.

9) Add arm64 JIT support for BPF-2-BPF coupled with tail calls, from Jakub Sitnicki.

10) Sockmap optimizations around throughput of UDP transmissions which have been
improved by 61%, from Cong Wang.

11) Rework perf's BPF prologue code to remove deprecated functions, from Jiri Olsa.

12) Fix sockmap teardown path to avoid sleepable sk_psock_stop, from John Fastabend.

13) Fix libbpf's cleanup around legacy kprobe/uprobe on error case, from Chuang Wang.

14) Fix libbpf's bpf_helpers.h to work with gcc for the case of its sec/pragma
macro, from James Hilliard.

15) Fix libbpf's pt_regs macros for riscv to use a0 for RC register, from Yixun Lan.

16) Fix bpftool to show the name of type BPF_OBJ_LINK, from Yafang Shao.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (94 commits)
selftests/bpf: Fix xdp_synproxy build failure if CONFIG_NF_CONNTRACK=m/n
bpf: Correctly propagate errors up from bpf_core_composites_match
libbpf: Disable SEC pragma macro on GCC
bpf: Check attach_func_proto more carefully in check_return_code
selftests/bpf: Add test involving restrict type qualifier
bpftool: Add support for KIND_RESTRICT to gen min_core_btf command
MAINTAINERS: Add entry for AF_XDP selftests files
selftests, xsk: Rename AF_XDP testing app
bpf, docs: Remove deprecated xsk libbpf APIs description
selftests/bpf: Add benchmark for local_storage RCU Tasks Trace usage
libbpf, riscv: Use a0 for RC register
libbpf: Remove unnecessary usdt_rel_ip assignments
selftests/bpf: Fix few more compiler warnings
selftests/bpf: Fix bogus uninitialized variable warning
bpftool: Remove zlib feature test from Makefile
libbpf: Cleanup the legacy uprobe_event on failed add/attach_event()
libbpf: Fix wrong variable used in perf_event_uprobe_open_legacy()
libbpf: Cleanup the legacy kprobe_event on failed add/attach_event()
selftests/bpf: Add type match test against kernel's task_struct
selftests/bpf: Add nested type to type based tests
...
====================

Link: https://lore.kernel.org/r/20220708233145.32365-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

+5153 -6713
+1 -1
Documentation/bpf/instruction-set.rst
@@ -351,7 +351,7 @@
   * Register R0 is an implicit output which contains the data fetched from
     the packet.
   * Registers R1-R5 are scratch registers that are clobbered after a call to
-    ``BPF_ABS | BPF_LD`` or ``BPF_IND`` | BPF_LD instructions.
+    ``BPF_ABS | BPF_LD`` or ``BPF_IND | BPF_LD`` instructions.
 
 These instructions have an implicit program exit condition as well. When an
 eBPF program is trying to access the data beyond the packet boundary, the
+2 -11
Documentation/bpf/libbpf/libbpf_naming_convention.rst
@@ -9,8 +9,8 @@
 new function or type is added to keep libbpf API clean and consistent.
 
 All types and functions provided by libbpf API should have one of the
-following prefixes: ``bpf_``, ``btf_``, ``libbpf_``, ``xsk_``,
-``btf_dump_``, ``ring_buffer_``, ``perf_buffer_``.
+following prefixes: ``bpf_``, ``btf_``, ``libbpf_``, ``btf_dump_``,
+``ring_buffer_``, ``perf_buffer_``.
 
 System call wrappers
 --------------------
@@ -58,15 +58,6 @@
 Auxiliary functions and types that don't fit well in any of categories
 described above should have ``libbpf_`` prefix, e.g.
 ``libbpf_get_error`` or ``libbpf_prog_type_by_name``.
-
-AF_XDP functions
--------------------
-
-AF_XDP functions should have an ``xsk_`` prefix, e.g.
-``xsk_umem__get_data`` or ``xsk_umem__create``. The interface consists
-of both low-level ring access functions and high-level configuration
-functions. These can be mixed and matched. Note that these functions
-are not reentrant for performance reasons.
 
 ABI
 ---
+1 -2
MAINTAINERS
@@ -21917,8 +21917,7 @@
 F:	include/uapi/linux/xdp_diag.h
 F:	include/net/netns/xdp.h
 F:	net/xdp/
-F:	samples/bpf/xdpsock*
-F:	tools/lib/bpf/xsk*
+F:	tools/testing/selftests/bpf/*xsk*
 
 XEN BLOCK SUBSYSTEM
 M:	Roger Pau Monné <roger.pau@citrix.com>
+8 -1
arch/arm64/net/bpf_jit_comp.c
@@ -246,6 +246,7 @@
 static int build_prologue(struct jit_ctx *ctx, bool ebpf_from_cbpf)
 {
 	const struct bpf_prog *prog = ctx->prog;
+	const bool is_main_prog = prog->aux->func_idx == 0;
 	const u8 r6 = bpf2a64[BPF_REG_6];
 	const u8 r7 = bpf2a64[BPF_REG_7];
 	const u8 r8 = bpf2a64[BPF_REG_8];
@@ -299,7 +300,7 @@
 	/* Set up BPF prog stack base register */
 	emit(A64_MOV(1, fp, A64_SP), ctx);
 
-	if (!ebpf_from_cbpf) {
+	if (!ebpf_from_cbpf && is_main_prog) {
 		/* Initialize tail_call_cnt */
 		emit(A64_MOVZ(1, tcc, 0, 0), ctx);
 
@@ -1529,4 +1530,10 @@
 void bpf_jit_free_exec(void *addr)
 {
 	return vfree(addr);
+}
+
+/* Indicate the JIT backend supports mixing bpf2bpf and tailcalls. */
+bool bpf_jit_supports_subprog_tailcalls(void)
+{
+	return true;
 }
+22 -8
arch/x86/net/bpf_jit_comp.c
@@ -1771,6 +1771,10 @@
 			   struct bpf_tramp_link *l, int stack_size,
 			   int run_ctx_off, bool save_ret)
 {
+	void (*exit)(struct bpf_prog *prog, u64 start,
+		     struct bpf_tramp_run_ctx *run_ctx) = __bpf_prog_exit;
+	u64 (*enter)(struct bpf_prog *prog,
+		     struct bpf_tramp_run_ctx *run_ctx) = __bpf_prog_enter;
 	u8 *prog = *pprog;
 	u8 *jmp_insn;
 	int ctx_cookie_off = offsetof(struct bpf_tramp_run_ctx, bpf_cookie);
@@ -1789,15 +1793,21 @@
 	 */
 	emit_stx(&prog, BPF_DW, BPF_REG_FP, BPF_REG_1, -run_ctx_off + ctx_cookie_off);
 
+	if (p->aux->sleepable) {
+		enter = __bpf_prog_enter_sleepable;
+		exit = __bpf_prog_exit_sleepable;
+	} else if (p->expected_attach_type == BPF_LSM_CGROUP) {
+		enter = __bpf_prog_enter_lsm_cgroup;
+		exit = __bpf_prog_exit_lsm_cgroup;
+	}
+
 	/* arg1: mov rdi, progs[i] */
 	emit_mov_imm64(&prog, BPF_REG_1, (long) p >> 32, (u32) (long) p);
 	/* arg2: lea rsi, [rbp - ctx_cookie_off] */
 	EMIT4(0x48, 0x8D, 0x75, -run_ctx_off);
 
-	if (emit_call(&prog,
-		      p->aux->sleepable ? __bpf_prog_enter_sleepable :
-		      __bpf_prog_enter, prog))
-		return -EINVAL;
+	if (emit_call(&prog, enter, prog))
+		return -EINVAL;
 	/* remember prog start time returned by __bpf_prog_enter */
 	emit_mov_reg(&prog, true, BPF_REG_6, BPF_REG_0);
 
@@ -1841,10 +1851,8 @@
 	emit_mov_reg(&prog, true, BPF_REG_2, BPF_REG_6);
 	/* arg3: lea rdx, [rbp - run_ctx_off] */
 	EMIT4(0x48, 0x8D, 0x55, -run_ctx_off);
-	if (emit_call(&prog,
-		      p->aux->sleepable ? __bpf_prog_exit_sleepable :
-		      __bpf_prog_exit, prog))
-		return -EINVAL;
+	if (emit_call(&prog, exit, prog))
+		return -EINVAL;
 
 	*pprog = prog;
 	return 0;
@@ -2491,4 +2499,10 @@
 	if (text_poke_copy(dst, src, len) == NULL)
 		return ERR_PTR(-EINVAL);
 	return dst;
+}
+
+/* Indicate the JIT backend supports mixing bpf2bpf and tailcalls. */
+bool bpf_jit_supports_subprog_tailcalls(void)
+{
+	return true;
 }
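Note on the x86 trampoline change: the enter/exit pair is now picked once, up front, and the sleepable case takes precedence over BPF_LSM_CGROUP. The following is a minimal userspace sketch of that selection order only. The names `select_hooks`, `struct hooks` and the stub functions are hypothetical stand-ins, not kernel symbols, and the stubs do nothing.

```c
#include <stdbool.h>

/* Hypothetical stubs standing in for the three enter/exit pairs. */
static void enter_plain(void) {}
static void exit_plain(void) {}
static void enter_sleepable(void) {}
static void exit_sleepable(void) {}
static void enter_lsm_cgroup(void) {}
static void exit_lsm_cgroup(void) {}

struct hooks {
	void (*enter)(void);
	void (*exit)(void);
};

/* Mirrors the if / else-if chain in the hunk above: start from the
 * default pair, then override for sleepable programs, and only
 * otherwise for programs attached as BPF_LSM_CGROUP.
 */
static struct hooks select_hooks(bool sleepable, bool lsm_cgroup)
{
	struct hooks h = { enter_plain, exit_plain };

	if (sleepable) {
		h.enter = enter_sleepable;
		h.exit = exit_sleepable;
	} else if (lsm_cgroup) {
		h.enter = enter_lsm_cgroup;
		h.exit = exit_lsm_cgroup;
	}
	return h;
}
```

Selecting both pointers in one place keeps the enter and exit hooks paired, which the two separate ternaries in the removed code did not enforce structurally.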
+11 -2
include/linux/bpf-cgroup-defs.h
@@ -10,6 +10,13 @@
 
 struct bpf_prog_array;
 
+#ifdef CONFIG_BPF_LSM
+/* Maximum number of concurrently attachable per-cgroup LSM hooks. */
+#define CGROUP_LSM_NUM 10
+#else
+#define CGROUP_LSM_NUM 0
+#endif
+
 enum cgroup_bpf_attach_type {
 	CGROUP_BPF_ATTACH_TYPE_INVALID = -1,
 	CGROUP_INET_INGRESS = 0,
@@ -35,6 +42,8 @@
 	CGROUP_INET4_GETSOCKNAME,
 	CGROUP_INET6_GETSOCKNAME,
 	CGROUP_INET_SOCK_RELEASE,
+	CGROUP_LSM_START,
+	CGROUP_LSM_END = CGROUP_LSM_START + CGROUP_LSM_NUM - 1,
 	MAX_CGROUP_BPF_ATTACH_TYPE
 };
 
@@ -47,8 +56,8 @@
 	 * have either zero or one element
 	 * when BPF_F_ALLOW_MULTI the list can have up to BPF_CGROUP_MAX_PROGS
 	 */
-	struct list_head progs[MAX_CGROUP_BPF_ATTACH_TYPE];
-	u32 flags[MAX_CGROUP_BPF_ATTACH_TYPE];
+	struct hlist_head progs[MAX_CGROUP_BPF_ATTACH_TYPE];
+	u8 flags[MAX_CGROUP_BPF_ATTACH_TYPE];
 
 	/* list of cgroup shared storages */
 	struct list_head storages;
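Note on CGROUP_LSM_NUM: the ten attach-type slots between CGROUP_LSM_START and CGROUP_LSM_END are shared by all LSM hooks, and the find/get/put helpers added to kernel/bpf/cgroup.c in this merge hand them out by attach_btf_id. A rough userspace model of that bookkeeping, assuming single-threaded use (the kernel serializes on cgroup_mutex), might look like the sketch below. `slot_find`, `slot_get`, `slot_put` and `LSM_SLOTS` are hypothetical names, and slot indices here are relative to CGROUP_LSM_START.

```c
#include <errno.h>
#include <stdint.h>

#define LSM_SLOTS 10	/* mirrors CGROUP_LSM_NUM with CONFIG_BPF_LSM set */

struct lsm_slot {
	uint32_t attach_btf_id;	/* 0 means the slot is free */
	int refcnt;
};

static struct lsm_slot slots[LSM_SLOTS];

/* Prefer a slot already bound to this hook, else the first free one. */
static int slot_find(uint32_t attach_btf_id)
{
	int i;

	for (i = 0; i < LSM_SLOTS; i++)
		if (slots[i].attach_btf_id == attach_btf_id)
			return i;
	for (i = 0; i < LSM_SLOTS; i++)
		if (slots[i].attach_btf_id == 0)
			return i;
	return -E2BIG;
}

/* Bind the slot to the hook and take a reference. */
static void slot_get(int i, uint32_t attach_btf_id)
{
	slots[i].attach_btf_id = attach_btf_id;
	slots[i].refcnt++;
}

/* Drop a reference; the last put frees the slot for another hook. */
static void slot_put(int i)
{
	if (--slots[i].refcnt <= 0)
		slots[i].attach_btf_id = 0;
}
```

In this model an eleventh distinct hook gets -E2BIG until some other hook's last program is released, which matches the "maximum number of concurrently attachable" wording in the header comment.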
+8 -1
include/linux/bpf-cgroup.h
@@ -23,6 +23,13 @@
 struct ctl_table_header;
 struct task_struct;
 
+unsigned int __cgroup_bpf_run_lsm_sock(const void *ctx,
+				       const struct bpf_insn *insn);
+unsigned int __cgroup_bpf_run_lsm_socket(const void *ctx,
+					 const struct bpf_insn *insn);
+unsigned int __cgroup_bpf_run_lsm_current(const void *ctx,
+					  const struct bpf_insn *insn);
+
 #ifdef CONFIG_CGROUP_BPF
 
 #define CGROUP_ATYPE(type) \
@@ -95,7 +102,7 @@
 };
 
 struct bpf_prog_list {
-	struct list_head node;
+	struct hlist_node node;
 	struct bpf_prog *prog;
 	struct bpf_cgroup_link *link;
 	struct bpf_cgroup_storage *storage[MAX_BPF_CGROUP_STORAGE_TYPE];
+41 -6
include/linux/bpf.h
··· 56 56 typedef int (*bpf_iter_init_seq_priv_t)(void *private_data, 57 57 struct bpf_iter_aux_info *aux); 58 58 typedef void (*bpf_iter_fini_seq_priv_t)(void *private_data); 59 + typedef unsigned int (*bpf_func_t)(const void *, 60 + const struct bpf_insn *); 59 61 struct bpf_iter_seq_info { 60 62 const struct seq_operations *seq_ops; 61 63 bpf_iter_init_seq_priv_t init_seq_private; ··· 794 792 u64 notrace __bpf_prog_enter_sleepable(struct bpf_prog *prog, struct bpf_tramp_run_ctx *run_ctx); 795 793 void notrace __bpf_prog_exit_sleepable(struct bpf_prog *prog, u64 start, 796 794 struct bpf_tramp_run_ctx *run_ctx); 795 + u64 notrace __bpf_prog_enter_lsm_cgroup(struct bpf_prog *prog, 796 + struct bpf_tramp_run_ctx *run_ctx); 797 + void notrace __bpf_prog_exit_lsm_cgroup(struct bpf_prog *prog, u64 start, 798 + struct bpf_tramp_run_ctx *run_ctx); 797 799 void notrace __bpf_tramp_enter(struct bpf_tramp_image *tr); 798 800 void notrace __bpf_tramp_exit(struct bpf_tramp_image *tr); 799 801 ··· 885 879 static __always_inline __nocfi unsigned int bpf_dispatcher_nop_func( 886 880 const void *ctx, 887 881 const struct bpf_insn *insnsi, 888 - unsigned int (*bpf_func)(const void *, 889 - const struct bpf_insn *)) 882 + bpf_func_t bpf_func) 890 883 { 891 884 return bpf_func(ctx, insnsi); 892 885 } ··· 914 909 noinline __nocfi unsigned int bpf_dispatcher_##name##_func( \ 915 910 const void *ctx, \ 916 911 const struct bpf_insn *insnsi, \ 917 - unsigned int (*bpf_func)(const void *, \ 918 - const struct bpf_insn *)) \ 912 + bpf_func_t bpf_func) \ 919 913 { \ 920 914 return bpf_func(ctx, insnsi); \ 921 915 } \ ··· 925 921 unsigned int bpf_dispatcher_##name##_func( \ 926 922 const void *ctx, \ 927 923 const struct bpf_insn *insnsi, \ 928 - unsigned int (*bpf_func)(const void *, \ 929 - const struct bpf_insn *)); \ 924 + bpf_func_t bpf_func); \ 930 925 extern struct bpf_dispatcher bpf_dispatcher_##name; 931 926 #define BPF_DISPATCHER_FUNC(name) bpf_dispatcher_##name##_func 932 927 
#define BPF_DISPATCHER_PTR(name) (&bpf_dispatcher_##name) ··· 1064 1061 struct user_struct *user; 1065 1062 u64 load_time; /* ns since boottime */ 1066 1063 u32 verified_insns; 1064 + int cgroup_atype; /* enum cgroup_bpf_attach_type */ 1067 1065 struct bpf_map *cgroup_storage[MAX_BPF_CGROUP_STORAGE_TYPE]; 1068 1066 char name[BPF_OBJ_NAME_LEN]; 1069 1067 #ifdef CONFIG_SECURITY ··· 1172 1168 u64 cookie; 1173 1169 }; 1174 1170 1171 + struct bpf_shim_tramp_link { 1172 + struct bpf_tramp_link link; 1173 + struct bpf_trampoline *trampoline; 1174 + }; 1175 + 1175 1176 struct bpf_tracing_link { 1176 1177 struct bpf_tramp_link link; 1177 1178 enum bpf_attach_type attach_type; ··· 1255 1246 int bpf_struct_ops_test_run(struct bpf_prog *prog, const union bpf_attr *kattr, 1256 1247 union bpf_attr __user *uattr); 1257 1248 #endif 1249 + int bpf_trampoline_link_cgroup_shim(struct bpf_prog *prog, 1250 + int cgroup_atype); 1251 + void bpf_trampoline_unlink_cgroup_shim(struct bpf_prog *prog); 1258 1252 #else 1259 1253 static inline const struct bpf_struct_ops *bpf_struct_ops_find(u32 type_id) 1260 1254 { ··· 1281 1269 { 1282 1270 return -EINVAL; 1283 1271 } 1272 + static inline int bpf_trampoline_link_cgroup_shim(struct bpf_prog *prog, 1273 + int cgroup_atype) 1274 + { 1275 + return -EOPNOTSUPP; 1276 + } 1277 + static inline void bpf_trampoline_unlink_cgroup_shim(struct bpf_prog *prog) 1278 + { 1279 + } 1284 1280 #endif 1285 1281 1286 1282 struct bpf_array { ··· 1305 1285 1306 1286 #define BPF_COMPLEXITY_LIMIT_INSNS 1000000 /* yes. 
1M insns */ 1307 1287 #define MAX_TAIL_CALL_CNT 33 1288 + 1289 + /* Maximum number of loops for bpf_loop */ 1290 + #define BPF_MAX_LOOPS BIT(23) 1308 1291 1309 1292 #define BPF_F_ACCESS_MASK (BPF_F_RDONLY | \ 1310 1293 BPF_F_RDONLY_PROG | \ ··· 2386 2363 extern const struct bpf_func_proto bpf_btf_find_by_name_kind_proto; 2387 2364 extern const struct bpf_func_proto bpf_sk_setsockopt_proto; 2388 2365 extern const struct bpf_func_proto bpf_sk_getsockopt_proto; 2366 + extern const struct bpf_func_proto bpf_unlocked_sk_setsockopt_proto; 2367 + extern const struct bpf_func_proto bpf_unlocked_sk_getsockopt_proto; 2389 2368 extern const struct bpf_func_proto bpf_find_vma_proto; 2390 2369 extern const struct bpf_func_proto bpf_loop_proto; 2391 2370 extern const struct bpf_func_proto bpf_copy_from_user_task_proto; 2371 + extern const struct bpf_func_proto bpf_set_retval_proto; 2372 + extern const struct bpf_func_proto bpf_get_retval_proto; 2392 2373 2393 2374 const struct bpf_func_proto *tracing_prog_func_proto( 2394 2375 enum bpf_func_id func_id, const struct bpf_prog *prog); ··· 2545 2518 enum bpf_dynptr_type type, u32 offset, u32 size); 2546 2519 void bpf_dynptr_set_null(struct bpf_dynptr_kern *ptr); 2547 2520 int bpf_dynptr_check_size(u32 size); 2521 + 2522 + #ifdef CONFIG_BPF_LSM 2523 + void bpf_cgroup_atype_get(u32 attach_btf_id, int cgroup_atype); 2524 + void bpf_cgroup_atype_put(int cgroup_atype); 2525 + #else 2526 + static inline void bpf_cgroup_atype_get(u32 attach_btf_id, int cgroup_atype) {} 2527 + static inline void bpf_cgroup_atype_put(int cgroup_atype) {} 2528 + #endif /* CONFIG_BPF_LSM */ 2548 2529 2549 2530 #endif /* _LINUX_BPF_H */
+7
include/linux/bpf_lsm.h
@@ -42,6 +42,8 @@
 extern const struct bpf_func_proto bpf_inode_storage_delete_proto;
 void bpf_inode_storage_free(struct inode *inode);
 
+void bpf_lsm_find_cgroup_shim(const struct bpf_prog *prog, bpf_func_t *bpf_func);
+
 #else /* !CONFIG_BPF_LSM */
 
 static inline bool bpf_lsm_is_sleepable_hook(u32 btf_id)
@@ -62,6 +64,11 @@
 }
 
 static inline void bpf_inode_storage_free(struct inode *inode)
+{
+}
+
+static inline void bpf_lsm_find_cgroup_shim(const struct bpf_prog *prog,
+					   bpf_func_t *bpf_func)
 {
 }
 
+12
include/linux/bpf_verifier.h
@@ -344,6 +344,14 @@
 	int miss_cnt, hit_cnt;
 };
 
+struct bpf_loop_inline_state {
+	int initialized:1; /* set to true upon first entry */
+	int fit_for_inline:1; /* true if callback function is the same
+			       * at each call and flags are always zero
+			       */
+	u32 callback_subprogno; /* valid when fit_for_inline is true */
+};
+
 /* Possible states for alu_state member. */
 #define BPF_ALU_SANITIZE_SRC		(1U << 0)
 #define BPF_ALU_SANITIZE_DST		(1U << 1)
@@ -373,6 +381,10 @@
 				u32 mem_size;	/* mem_size for non-struct typed var */
 			};
 		} btf_var;
+		/* if instruction is a call to bpf_loop this field tracks
+		 * the state of the relevant registers to make decision about inlining
+		 */
+		struct bpf_loop_inline_state loop_inline_state;
 	};
 	u64 map_key_state; /* constant (32 bit) key tracking for maps */
 	int ctx_field_size; /* the ctx field size for load insn, maybe 0 */
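Note on bpf_loop_inline_state: the field comments say a call site is fit for inlining only if every verifier visit sees the same callback and zero flags. The verifier code that updates this state is not part of this hunk, so the sketch below is an assumption derived from those comments alone. `observe_call` is a hypothetical name, and the bitfields are declared unsigned here so that storing 1 is well defined in portable C.

```c
#include <stdbool.h>
#include <stdint.h>

struct loop_inline_state {
	unsigned int initialized:1;	/* set on the first visit */
	unsigned int fit_for_inline:1;	/* same callback and zero flags so far */
	uint32_t callback_subprogno;	/* meaningful while fit_for_inline is set */
};

/* Record one verifier visit of a bpf_loop call site. */
static void observe_call(struct loop_inline_state *st,
			 uint32_t subprogno, bool flags_zero)
{
	if (!st->initialized) {
		st->initialized = 1;
		st->fit_for_inline = flags_zero;
		st->callback_subprogno = subprogno;
		return;
	}

	/* Once a visit disqualifies the site, it stays disqualified. */
	if (!st->fit_for_inline)
		return;

	st->fit_for_inline = flags_zero &&
			     st->callback_subprogno == subprogno;
}
```

The property is monotone: a single visit with a different callback or non-zero flags clears fit_for_inline for good, so the decision does not depend on the order in which the verifier explores paths.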
+2 -1
include/linux/btf_ids.h
@@ -179,7 +179,8 @@
 	BTF_SOCK_TYPE(BTF_SOCK_TYPE_UDP, udp_sock)			\
 	BTF_SOCK_TYPE(BTF_SOCK_TYPE_UDP6, udp6_sock)			\
 	BTF_SOCK_TYPE(BTF_SOCK_TYPE_UNIX, unix_sock)			\
-	BTF_SOCK_TYPE(BTF_SOCK_TYPE_MPTCP, mptcp_sock)
+	BTF_SOCK_TYPE(BTF_SOCK_TYPE_MPTCP, mptcp_sock)			\
+	BTF_SOCK_TYPE(BTF_SOCK_TYPE_SOCKET, socket)
 
 enum {
 #define BTF_SOCK_TYPE(name, str) name,
+1
include/linux/filter.h
@@ -914,6 +914,7 @@
 struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog);
 void bpf_jit_compile(struct bpf_prog *prog);
 bool bpf_jit_needs_zext(void);
+bool bpf_jit_supports_subprog_tailcalls(void);
 bool bpf_jit_supports_kfunc_call(void);
 bool bpf_helper_changes_pkt_data(void *func);
 
+4
include/linux/net.h
@@ -152,6 +152,8 @@
 struct sk_buff;
 typedef int (*sk_read_actor_t)(read_descriptor_t *, struct sk_buff *,
 			       unsigned int, size_t);
+typedef int (*skb_read_actor_t)(struct sock *, struct sk_buff *);
+
 
 struct proto_ops {
 	int		family;
@@ -214,6 +216,8 @@
 	 */
 	int		(*read_sock)(struct sock *sk, read_descriptor_t *desc,
 				     sk_read_actor_t recv_actor);
+	/* This is different from read_sock(), it reads an entire skb at a time. */
+	int		(*read_skb)(struct sock *sk, skb_read_actor_t recv_actor);
 	int		(*sendpage_locked)(struct sock *sk, struct page *page,
 					   int offset, size_t size, int flags);
 	int		(*sendmsg_locked)(struct sock *sk, struct msghdr *msg,
+1
include/net/tcp.h
@@ -672,6 +672,7 @@
 /* Read 'sendfile()'-style from a TCP socket */
 int tcp_read_sock(struct sock *sk, read_descriptor_t *desc,
 		  sk_read_actor_t recv_actor);
+int tcp_read_skb(struct sock *sk, skb_read_actor_t recv_actor);
 
 void tcp_initialize_rcv_mss(struct sock *sk);
 
+1 -2
include/net/udp.h
@@ -306,8 +306,7 @@
 			       struct sk_buff *skb);
 struct sock *udp6_lib_lookup_skb(const struct sk_buff *skb,
 				 __be16 sport, __be16 dport);
-int udp_read_sock(struct sock *sk, read_descriptor_t *desc,
-		  sk_read_actor_t recv_actor);
+int udp_read_skb(struct sock *sk, skb_read_actor_t recv_actor);
 
 /* UDP uses skb->dev_scratch to cache as much information as possible and avoid
  * possibly multiple cache miss on dequeue()
+5
include/uapi/linux/bpf.h
@@ -998,6 +998,7 @@
 	BPF_SK_REUSEPORT_SELECT_OR_MIGRATE,
 	BPF_PERF_EVENT,
 	BPF_TRACE_KPROBE_MULTI,
+	BPF_LSM_CGROUP,
 	__MAX_BPF_ATTACH_TYPE
 };
 
@@ -1431,6 +1432,7 @@
 		__u32		attach_flags;
 		__aligned_u64	prog_ids;
 		__u32		prog_cnt;
+		__aligned_u64	prog_attach_flags; /* output: per-program attach_flags */
 	} query;
 
 	struct { /* anonymous struct used by BPF_RAW_TRACEPOINT_OPEN command */
@@ -6075,6 +6077,8 @@
 	__u64 run_cnt;
 	__u64 recursion_misses;
 	__u32 verified_insns;
+	__u32 attach_btf_obj_id;
+	__u32 attach_btf_id;
 } __attribute__((aligned(8)));
 
 struct bpf_map_info {
@@ -6782,6 +6786,7 @@
 	BPF_CORE_TYPE_SIZE = 9,		/* type size in bytes */
 	BPF_CORE_ENUMVAL_EXISTS = 10,	/* enum value existence in target kernel */
 	BPF_CORE_ENUMVAL_VALUE = 11,	/* enum value integer value */
+	BPF_CORE_TYPE_MATCHES = 12,	/* type match in target kernel */
 };
 
 /*
+5 -4
kernel/bpf/bpf_iter.c
@@ -723,9 +723,6 @@
 	.arg4_type	= ARG_ANYTHING,
 };
 
-/* maximum number of loops */
-#define MAX_LOOPS	BIT(23)
-
 BPF_CALL_4(bpf_loop, u32, nr_loops, void *, callback_fn, void *, callback_ctx,
 	   u64, flags)
 {
@@ -733,9 +730,13 @@
 	u64 ret;
 	u32 i;
 
+	/* Note: these safety checks are also verified when bpf_loop
+	 * is inlined, be careful to modify this code in sync. See
+	 * function verifier.c:inline_bpf_loop.
+	 */
 	if (flags)
 		return -EINVAL;
-	if (nr_loops > MAX_LOOPS)
+	if (nr_loops > BPF_MAX_LOOPS)
 		return -E2BIG;
 
 	for (i = 0; i < nr_loops; i++) {
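Note on the bpf_loop() checks: the two guards shown above are the contract that the inlined version has to reproduce. A small userspace model of the helper may make that contract easier to see. The guards follow the hunk; the loop body is not visible in this hunk, so the stop rule (a non-zero callback return ends the loop and the helper returns the number of iterations performed) is stated here as an assumption. `model_bpf_loop`, `loop_cb` and `stop_at_three` are hypothetical names.

```c
#include <errno.h>
#include <stdint.h>

#define MODEL_MAX_LOOPS (1u << 23)	/* same value as BPF_MAX_LOOPS, BIT(23) */

typedef long (*loop_cb)(uint32_t index, void *ctx);

/* Userspace model of the helper, not the kernel implementation. */
static long model_bpf_loop(uint32_t nr_loops, loop_cb cb, void *ctx,
			   uint64_t flags)
{
	uint32_t i;

	if (flags)
		return -EINVAL;
	if (nr_loops > MODEL_MAX_LOOPS)
		return -E2BIG;

	for (i = 0; i < nr_loops; i++) {
		/* Assumed rule: 0 means continue, non-zero means stop. */
		if (cb(i, ctx))
			return (long)i + 1;
	}
	return i;
}

/* Example callback: accumulate the index and stop after index 3. */
static long stop_at_three(uint32_t index, void *ctx)
{
	long *sum = ctx;

	*sum += index;
	return index == 3;
}
```

Calling `model_bpf_loop(10, stop_at_three, &sum, 0)` with `sum` starting at zero visits indices 0 through 3, so the model returns 4 and leaves `sum` at 6. Any non-zero `flags` value is rejected before the callback runs.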
+81
kernel/bpf/bpf_lsm.c
··· 16 16 #include <linux/bpf_local_storage.h> 17 17 #include <linux/btf_ids.h> 18 18 #include <linux/ima.h> 19 + #include <linux/bpf-cgroup.h> 19 20 20 21 /* For every LSM hook that allows attachment of BPF programs, declare a nop 21 22 * function where a BPF program can be attached. ··· 35 34 #include <linux/lsm_hook_defs.h> 36 35 #undef LSM_HOOK 37 36 BTF_SET_END(bpf_lsm_hooks) 37 + 38 + /* List of LSM hooks that should operate on 'current' cgroup regardless 39 + * of function signature. 40 + */ 41 + BTF_SET_START(bpf_lsm_current_hooks) 42 + /* operate on freshly allocated sk without any cgroup association */ 43 + BTF_ID(func, bpf_lsm_sk_alloc_security) 44 + BTF_ID(func, bpf_lsm_sk_free_security) 45 + BTF_SET_END(bpf_lsm_current_hooks) 46 + 47 + /* List of LSM hooks that trigger while the socket is properly locked. 48 + */ 49 + BTF_SET_START(bpf_lsm_locked_sockopt_hooks) 50 + BTF_ID(func, bpf_lsm_socket_sock_rcv_skb) 51 + BTF_ID(func, bpf_lsm_sock_graft) 52 + BTF_ID(func, bpf_lsm_inet_csk_clone) 53 + BTF_ID(func, bpf_lsm_inet_conn_established) 54 + BTF_SET_END(bpf_lsm_locked_sockopt_hooks) 55 + 56 + /* List of LSM hooks that trigger while the socket is _not_ locked, 57 + * but it's ok to call bpf_{g,s}etsockopt because the socket is still 58 + * in the early init phase. 
59 + */ 60 + BTF_SET_START(bpf_lsm_unlocked_sockopt_hooks) 61 + BTF_ID(func, bpf_lsm_socket_post_create) 62 + BTF_ID(func, bpf_lsm_socket_socketpair) 63 + BTF_SET_END(bpf_lsm_unlocked_sockopt_hooks) 64 + 65 + void bpf_lsm_find_cgroup_shim(const struct bpf_prog *prog, 66 + bpf_func_t *bpf_func) 67 + { 68 + const struct btf_param *args; 69 + 70 + if (btf_type_vlen(prog->aux->attach_func_proto) < 1 || 71 + btf_id_set_contains(&bpf_lsm_current_hooks, 72 + prog->aux->attach_btf_id)) { 73 + *bpf_func = __cgroup_bpf_run_lsm_current; 74 + return; 75 + } 76 + 77 + args = btf_params(prog->aux->attach_func_proto); 78 + 79 + #ifdef CONFIG_NET 80 + if (args[0].type == btf_sock_ids[BTF_SOCK_TYPE_SOCKET]) 81 + *bpf_func = __cgroup_bpf_run_lsm_socket; 82 + else if (args[0].type == btf_sock_ids[BTF_SOCK_TYPE_SOCK]) 83 + *bpf_func = __cgroup_bpf_run_lsm_sock; 84 + else 85 + #endif 86 + *bpf_func = __cgroup_bpf_run_lsm_current; 87 + } 38 88 39 89 int bpf_lsm_verify_prog(struct bpf_verifier_log *vlog, 40 90 const struct bpf_prog *prog) ··· 210 158 return prog->aux->sleepable ? &bpf_ima_file_hash_proto : NULL; 211 159 case BPF_FUNC_get_attach_cookie: 212 160 return bpf_prog_has_trampoline(prog) ? &bpf_get_attach_cookie_proto : NULL; 161 + case BPF_FUNC_get_local_storage: 162 + return prog->expected_attach_type == BPF_LSM_CGROUP ? 163 + &bpf_get_local_storage_proto : NULL; 164 + case BPF_FUNC_set_retval: 165 + return prog->expected_attach_type == BPF_LSM_CGROUP ? 166 + &bpf_set_retval_proto : NULL; 167 + case BPF_FUNC_get_retval: 168 + return prog->expected_attach_type == BPF_LSM_CGROUP ? 
169 + &bpf_get_retval_proto : NULL; 170 + case BPF_FUNC_setsockopt: 171 + if (prog->expected_attach_type != BPF_LSM_CGROUP) 172 + return NULL; 173 + if (btf_id_set_contains(&bpf_lsm_locked_sockopt_hooks, 174 + prog->aux->attach_btf_id)) 175 + return &bpf_sk_setsockopt_proto; 176 + if (btf_id_set_contains(&bpf_lsm_unlocked_sockopt_hooks, 177 + prog->aux->attach_btf_id)) 178 + return &bpf_unlocked_sk_setsockopt_proto; 179 + return NULL; 180 + case BPF_FUNC_getsockopt: 181 + if (prog->expected_attach_type != BPF_LSM_CGROUP) 182 + return NULL; 183 + if (btf_id_set_contains(&bpf_lsm_locked_sockopt_hooks, 184 + prog->aux->attach_btf_id)) 185 + return &bpf_sk_getsockopt_proto; 186 + if (btf_id_set_contains(&bpf_lsm_unlocked_sockopt_hooks, 187 + prog->aux->attach_btf_id)) 188 + return &bpf_unlocked_sk_getsockopt_proto; 189 + return NULL; 213 190 default: 214 191 return tracing_prog_func_proto(func_id, prog); 215 192 }
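Note on bpf_lsm_find_cgroup_shim(): the shim is chosen from the hook's first argument, with an explicit override list for hooks that must run against the current task's cgroup. The decision order can be modelled in userspace as below. The enum values, `struct hook_model` and `pick_shim` are hypothetical, the CONFIG_NET guard around the socket cases is left out, and the argument counts used in any example are illustrative only.

```c
#include <stdbool.h>
#include <stddef.h>

enum arg_kind { ARG_OTHER, ARG_SOCKET, ARG_SOCK };
enum shim { SHIM_CURRENT, SHIM_SOCKET, SHIM_SOCK };

struct hook_model {
	size_t nr_args;			/* number of hook arguments */
	enum arg_kind first_arg;	/* type of the first argument */
	bool in_current_set;		/* listed in bpf_lsm_current_hooks */
};

/* Same decision order as the kernel function in the hunk above. */
static enum shim pick_shim(const struct hook_model *h)
{
	/* No arguments, or explicitly forced to the current cgroup. */
	if (h->nr_args < 1 || h->in_current_set)
		return SHIM_CURRENT;

	if (h->first_arg == ARG_SOCKET)
		return SHIM_SOCKET;	/* struct socket *: use sock->sk's cgroup */
	if (h->first_arg == ARG_SOCK)
		return SHIM_SOCK;	/* struct sock *: use sk's cgroup */

	return SHIM_CURRENT;
}
```

The override matters for hooks such as sk_alloc_security: their first argument is a struct sock, but the socket is freshly allocated and has no cgroup association yet, so the current task's cgroup is used instead.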
+3 -4
kernel/bpf/bpf_struct_ops.c
@@ -503,10 +503,9 @@
 		goto unlock;
 	}
 
-	/* Error during st_ops->reg(). It is very unlikely since
-	 * the above init_member() should have caught it earlier
-	 * before reg(). The only possibility is if there was a race
-	 * in registering the struct_ops (under the same name) to
+	/* Error during st_ops->reg(). Can happen if this struct_ops needs to be
+	 * verified as a whole, after all init_member() calls. Can also happen if
+	 * there was a race in registering the struct_ops (under the same name) to
 	 * a sub-system through different struct_ops's maps.
 	 */
 	set_memory_nx((long)st_map->image, 1);
+11 -83
kernel/bpf/btf.c
··· 5368 5368 5369 5369 if (arg == nr_args) { 5370 5370 switch (prog->expected_attach_type) { 5371 + case BPF_LSM_CGROUP: 5371 5372 case BPF_LSM_MAC: 5372 5373 case BPF_TRACE_FEXIT: 5373 5374 /* When LSM programs are attached to void LSM hooks ··· 7422 7421 7423 7422 #define MAX_TYPES_ARE_COMPAT_DEPTH 2 7424 7423 7425 - static 7426 - int __bpf_core_types_are_compat(const struct btf *local_btf, __u32 local_id, 7427 - const struct btf *targ_btf, __u32 targ_id, 7428 - int level) 7429 - { 7430 - const struct btf_type *local_type, *targ_type; 7431 - int depth = 32; /* max recursion depth */ 7432 - 7433 - /* caller made sure that names match (ignoring flavor suffix) */ 7434 - local_type = btf_type_by_id(local_btf, local_id); 7435 - targ_type = btf_type_by_id(targ_btf, targ_id); 7436 - if (btf_kind(local_type) != btf_kind(targ_type)) 7437 - return 0; 7438 - 7439 - recur: 7440 - depth--; 7441 - if (depth < 0) 7442 - return -EINVAL; 7443 - 7444 - local_type = btf_type_skip_modifiers(local_btf, local_id, &local_id); 7445 - targ_type = btf_type_skip_modifiers(targ_btf, targ_id, &targ_id); 7446 - if (!local_type || !targ_type) 7447 - return -EINVAL; 7448 - 7449 - if (btf_kind(local_type) != btf_kind(targ_type)) 7450 - return 0; 7451 - 7452 - switch (btf_kind(local_type)) { 7453 - case BTF_KIND_UNKN: 7454 - case BTF_KIND_STRUCT: 7455 - case BTF_KIND_UNION: 7456 - case BTF_KIND_ENUM: 7457 - case BTF_KIND_FWD: 7458 - case BTF_KIND_ENUM64: 7459 - return 1; 7460 - case BTF_KIND_INT: 7461 - /* just reject deprecated bitfield-like integers; all other 7462 - * integers are by default compatible between each other 7463 - */ 7464 - return btf_int_offset(local_type) == 0 && btf_int_offset(targ_type) == 0; 7465 - case BTF_KIND_PTR: 7466 - local_id = local_type->type; 7467 - targ_id = targ_type->type; 7468 - goto recur; 7469 - case BTF_KIND_ARRAY: 7470 - local_id = btf_array(local_type)->type; 7471 - targ_id = btf_array(targ_type)->type; 7472 - goto recur; 7473 - case BTF_KIND_FUNC_PROTO: 
{ 7474 - struct btf_param *local_p = btf_params(local_type); 7475 - struct btf_param *targ_p = btf_params(targ_type); 7476 - __u16 local_vlen = btf_vlen(local_type); 7477 - __u16 targ_vlen = btf_vlen(targ_type); 7478 - int i, err; 7479 - 7480 - if (local_vlen != targ_vlen) 7481 - return 0; 7482 - 7483 - for (i = 0; i < local_vlen; i++, local_p++, targ_p++) { 7484 - if (level <= 0) 7485 - return -EINVAL; 7486 - 7487 - btf_type_skip_modifiers(local_btf, local_p->type, &local_id); 7488 - btf_type_skip_modifiers(targ_btf, targ_p->type, &targ_id); 7489 - err = __bpf_core_types_are_compat(local_btf, local_id, 7490 - targ_btf, targ_id, 7491 - level - 1); 7492 - if (err <= 0) 7493 - return err; 7494 - } 7495 - 7496 - /* tail recurse for return type check */ 7497 - btf_type_skip_modifiers(local_btf, local_type->type, &local_id); 7498 - btf_type_skip_modifiers(targ_btf, targ_type->type, &targ_id); 7499 - goto recur; 7500 - } 7501 - default: 7502 - return 0; 7503 - } 7504 - } 7505 - 7506 7424 /* Check local and target types for compatibility. This check is used for 7507 7425 * type-based CO-RE relocations and follow slightly different rules than 7508 7426 * field-based relocations. This function assumes that root types were already ··· 7444 7524 int bpf_core_types_are_compat(const struct btf *local_btf, __u32 local_id, 7445 7525 const struct btf *targ_btf, __u32 targ_id) 7446 7526 { 7447 - return __bpf_core_types_are_compat(local_btf, local_id, 7448 - targ_btf, targ_id, 7527 + return __bpf_core_types_are_compat(local_btf, local_id, targ_btf, targ_id, 7449 7528 MAX_TYPES_ARE_COMPAT_DEPTH); 7529 + } 7530 + 7531 + #define MAX_TYPES_MATCH_DEPTH 2 7532 + 7533 + int bpf_core_types_match(const struct btf *local_btf, u32 local_id, 7534 + const struct btf *targ_btf, u32 targ_id) 7535 + { 7536 + return __bpf_core_types_match(local_btf, local_id, targ_btf, targ_id, false, 7537 + MAX_TYPES_MATCH_DEPTH); 7450 7538 } 7451 7539 7452 7540 static bool bpf_core_is_flavor_sep(const char *s)
+276 -72
kernel/bpf/cgroup.c
··· 14 14 #include <linux/string.h> 15 15 #include <linux/bpf.h> 16 16 #include <linux/bpf-cgroup.h> 17 + #include <linux/bpf_lsm.h> 18 + #include <linux/bpf_verifier.h> 17 19 #include <net/sock.h> 18 20 #include <net/bpf_sk_storage.h> 19 21 ··· 62 60 migrate_enable(); 63 61 return run_ctx.retval; 64 62 } 63 + 64 + unsigned int __cgroup_bpf_run_lsm_sock(const void *ctx, 65 + const struct bpf_insn *insn) 66 + { 67 + const struct bpf_prog *shim_prog; 68 + struct sock *sk; 69 + struct cgroup *cgrp; 70 + int ret = 0; 71 + u64 *args; 72 + 73 + args = (u64 *)ctx; 74 + sk = (void *)(unsigned long)args[0]; 75 + /*shim_prog = container_of(insn, struct bpf_prog, insnsi);*/ 76 + shim_prog = (const struct bpf_prog *)((void *)insn - offsetof(struct bpf_prog, insnsi)); 77 + 78 + cgrp = sock_cgroup_ptr(&sk->sk_cgrp_data); 79 + if (likely(cgrp)) 80 + ret = bpf_prog_run_array_cg(&cgrp->bpf, 81 + shim_prog->aux->cgroup_atype, 82 + ctx, bpf_prog_run, 0, NULL); 83 + return ret; 84 + } 85 + 86 + unsigned int __cgroup_bpf_run_lsm_socket(const void *ctx, 87 + const struct bpf_insn *insn) 88 + { 89 + const struct bpf_prog *shim_prog; 90 + struct socket *sock; 91 + struct cgroup *cgrp; 92 + int ret = 0; 93 + u64 *args; 94 + 95 + args = (u64 *)ctx; 96 + sock = (void *)(unsigned long)args[0]; 97 + /*shim_prog = container_of(insn, struct bpf_prog, insnsi);*/ 98 + shim_prog = (const struct bpf_prog *)((void *)insn - offsetof(struct bpf_prog, insnsi)); 99 + 100 + cgrp = sock_cgroup_ptr(&sock->sk->sk_cgrp_data); 101 + if (likely(cgrp)) 102 + ret = bpf_prog_run_array_cg(&cgrp->bpf, 103 + shim_prog->aux->cgroup_atype, 104 + ctx, bpf_prog_run, 0, NULL); 105 + return ret; 106 + } 107 + 108 + unsigned int __cgroup_bpf_run_lsm_current(const void *ctx, 109 + const struct bpf_insn *insn) 110 + { 111 + const struct bpf_prog *shim_prog; 112 + struct cgroup *cgrp; 113 + int ret = 0; 114 + 115 + /*shim_prog = container_of(insn, struct bpf_prog, insnsi);*/ 116 + shim_prog = (const struct bpf_prog *)((void 
*)insn - offsetof(struct bpf_prog, insnsi)); 117 + 118 + /* We rely on trampoline's __bpf_prog_enter_lsm_cgroup to grab RCU read lock. */ 119 + cgrp = task_dfl_cgroup(current); 120 + if (likely(cgrp)) 121 + ret = bpf_prog_run_array_cg(&cgrp->bpf, 122 + shim_prog->aux->cgroup_atype, 123 + ctx, bpf_prog_run, 0, NULL); 124 + return ret; 125 + } 126 + 127 + #ifdef CONFIG_BPF_LSM 128 + struct cgroup_lsm_atype { 129 + u32 attach_btf_id; 130 + int refcnt; 131 + }; 132 + 133 + static struct cgroup_lsm_atype cgroup_lsm_atype[CGROUP_LSM_NUM]; 134 + 135 + static enum cgroup_bpf_attach_type 136 + bpf_cgroup_atype_find(enum bpf_attach_type attach_type, u32 attach_btf_id) 137 + { 138 + int i; 139 + 140 + lockdep_assert_held(&cgroup_mutex); 141 + 142 + if (attach_type != BPF_LSM_CGROUP) 143 + return to_cgroup_bpf_attach_type(attach_type); 144 + 145 + for (i = 0; i < ARRAY_SIZE(cgroup_lsm_atype); i++) 146 + if (cgroup_lsm_atype[i].attach_btf_id == attach_btf_id) 147 + return CGROUP_LSM_START + i; 148 + 149 + for (i = 0; i < ARRAY_SIZE(cgroup_lsm_atype); i++) 150 + if (cgroup_lsm_atype[i].attach_btf_id == 0) 151 + return CGROUP_LSM_START + i; 152 + 153 + return -E2BIG; 154 + 155 + } 156 + 157 + void bpf_cgroup_atype_get(u32 attach_btf_id, int cgroup_atype) 158 + { 159 + int i = cgroup_atype - CGROUP_LSM_START; 160 + 161 + lockdep_assert_held(&cgroup_mutex); 162 + 163 + WARN_ON_ONCE(cgroup_lsm_atype[i].attach_btf_id && 164 + cgroup_lsm_atype[i].attach_btf_id != attach_btf_id); 165 + 166 + cgroup_lsm_atype[i].attach_btf_id = attach_btf_id; 167 + cgroup_lsm_atype[i].refcnt++; 168 + } 169 + 170 + void bpf_cgroup_atype_put(int cgroup_atype) 171 + { 172 + int i = cgroup_atype - CGROUP_LSM_START; 173 + 174 + mutex_lock(&cgroup_mutex); 175 + if (--cgroup_lsm_atype[i].refcnt <= 0) 176 + cgroup_lsm_atype[i].attach_btf_id = 0; 177 + WARN_ON_ONCE(cgroup_lsm_atype[i].refcnt < 0); 178 + mutex_unlock(&cgroup_mutex); 179 + } 180 + #else 181 + static enum cgroup_bpf_attach_type 182 + 
bpf_cgroup_atype_find(enum bpf_attach_type attach_type, u32 attach_btf_id) 183 + { 184 + if (attach_type != BPF_LSM_CGROUP) 185 + return to_cgroup_bpf_attach_type(attach_type); 186 + return -EOPNOTSUPP; 187 + } 188 + #endif /* CONFIG_BPF_LSM */ 65 189 66 190 void cgroup_bpf_offline(struct cgroup *cgrp) 67 191 { ··· 285 157 mutex_lock(&cgroup_mutex); 286 158 287 159 for (atype = 0; atype < ARRAY_SIZE(cgrp->bpf.progs); atype++) { 288 - struct list_head *progs = &cgrp->bpf.progs[atype]; 289 - struct bpf_prog_list *pl, *pltmp; 160 + struct hlist_head *progs = &cgrp->bpf.progs[atype]; 161 + struct bpf_prog_list *pl; 162 + struct hlist_node *pltmp; 290 163 291 - list_for_each_entry_safe(pl, pltmp, progs, node) { 292 - list_del(&pl->node); 293 - if (pl->prog) 164 + hlist_for_each_entry_safe(pl, pltmp, progs, node) { 165 + hlist_del(&pl->node); 166 + if (pl->prog) { 167 + if (pl->prog->expected_attach_type == BPF_LSM_CGROUP) 168 + bpf_trampoline_unlink_cgroup_shim(pl->prog); 294 169 bpf_prog_put(pl->prog); 295 - if (pl->link) 170 + } 171 + if (pl->link) { 172 + if (pl->link->link.prog->expected_attach_type == BPF_LSM_CGROUP) 173 + bpf_trampoline_unlink_cgroup_shim(pl->link->link.prog); 296 174 bpf_cgroup_link_auto_detach(pl->link); 175 + } 297 176 kfree(pl); 298 177 static_branch_dec(&cgroup_bpf_enabled_key[atype]); 299 178 } ··· 352 217 /* count number of elements in the list. 
353 218 * it's slow but the list cannot be long 354 219 */ 355 - static u32 prog_list_length(struct list_head *head) 220 + static u32 prog_list_length(struct hlist_head *head) 356 221 { 357 222 struct bpf_prog_list *pl; 358 223 u32 cnt = 0; 359 224 360 - list_for_each_entry(pl, head, node) { 225 + hlist_for_each_entry(pl, head, node) { 361 226 if (!prog_list_prog(pl)) 362 227 continue; 363 228 cnt++; ··· 426 291 if (cnt > 0 && !(p->bpf.flags[atype] & BPF_F_ALLOW_MULTI)) 427 292 continue; 428 293 429 - list_for_each_entry(pl, &p->bpf.progs[atype], node) { 294 + hlist_for_each_entry(pl, &p->bpf.progs[atype], node) { 430 295 if (!prog_list_prog(pl)) 431 296 continue; 432 297 ··· 477 342 cgroup_bpf_get(p); 478 343 479 344 for (i = 0; i < NR; i++) 480 - INIT_LIST_HEAD(&cgrp->bpf.progs[i]); 345 + INIT_HLIST_HEAD(&cgrp->bpf.progs[i]); 481 346 482 347 INIT_LIST_HEAD(&cgrp->bpf.storages); 483 348 ··· 553 418 554 419 #define BPF_CGROUP_MAX_PROGS 64 555 420 556 - static struct bpf_prog_list *find_attach_entry(struct list_head *progs, 421 + static struct bpf_prog_list *find_attach_entry(struct hlist_head *progs, 557 422 struct bpf_prog *prog, 558 423 struct bpf_cgroup_link *link, 559 424 struct bpf_prog *replace_prog, ··· 563 428 564 429 /* single-attach case */ 565 430 if (!allow_multi) { 566 - if (list_empty(progs)) 431 + if (hlist_empty(progs)) 567 432 return NULL; 568 - return list_first_entry(progs, typeof(*pl), node); 433 + return hlist_entry(progs->first, typeof(*pl), node); 569 434 } 570 435 571 - list_for_each_entry(pl, progs, node) { 436 + hlist_for_each_entry(pl, progs, node) { 572 437 if (prog && pl->prog == prog && prog != replace_prog) 573 438 /* disallow attaching the same prog twice */ 574 439 return ERR_PTR(-EINVAL); ··· 579 444 580 445 /* direct prog multi-attach w/ replacement case */ 581 446 if (replace_prog) { 582 - list_for_each_entry(pl, progs, node) { 447 + hlist_for_each_entry(pl, progs, node) { 583 448 if (pl->prog == replace_prog) 584 449 /* a match 
found */ 585 450 return pl; ··· 613 478 struct bpf_prog *old_prog = NULL; 614 479 struct bpf_cgroup_storage *storage[MAX_BPF_CGROUP_STORAGE_TYPE] = {}; 615 480 struct bpf_cgroup_storage *new_storage[MAX_BPF_CGROUP_STORAGE_TYPE] = {}; 481 + struct bpf_prog *new_prog = prog ? : link->link.prog; 616 482 enum cgroup_bpf_attach_type atype; 617 483 struct bpf_prog_list *pl; 618 - struct list_head *progs; 484 + struct hlist_head *progs; 619 485 int err; 620 486 621 487 if (((flags & BPF_F_ALLOW_OVERRIDE) && (flags & BPF_F_ALLOW_MULTI)) || ··· 630 494 /* replace_prog implies BPF_F_REPLACE, and vice versa */ 631 495 return -EINVAL; 632 496 633 - atype = to_cgroup_bpf_attach_type(type); 497 + atype = bpf_cgroup_atype_find(type, new_prog->aux->attach_btf_id); 634 498 if (atype < 0) 635 499 return -EINVAL; 636 500 ··· 639 503 if (!hierarchy_allows_attach(cgrp, atype)) 640 504 return -EPERM; 641 505 642 - if (!list_empty(progs) && cgrp->bpf.flags[atype] != saved_flags) 506 + if (!hlist_empty(progs) && cgrp->bpf.flags[atype] != saved_flags) 643 507 /* Disallow attaching non-overridable on top 644 508 * of existing overridable in this cgroup. 
645 509 * Disallow attaching multi-prog if overridable or none ··· 661 525 if (pl) { 662 526 old_prog = pl->prog; 663 527 } else { 528 + struct hlist_node *last = NULL; 529 + 664 530 pl = kmalloc(sizeof(*pl), GFP_KERNEL); 665 531 if (!pl) { 666 532 bpf_cgroup_storages_free(new_storage); 667 533 return -ENOMEM; 668 534 } 669 - list_add_tail(&pl->node, progs); 535 + if (hlist_empty(progs)) 536 + hlist_add_head(&pl->node, progs); 537 + else 538 + hlist_for_each(last, progs) { 539 + if (last->next) 540 + continue; 541 + hlist_add_behind(&pl->node, last); 542 + break; 543 + } 670 544 } 671 545 672 546 pl->prog = prog; ··· 684 538 bpf_cgroup_storages_assign(pl->storage, storage); 685 539 cgrp->bpf.flags[atype] = saved_flags; 686 540 541 + if (type == BPF_LSM_CGROUP) { 542 + err = bpf_trampoline_link_cgroup_shim(new_prog, atype); 543 + if (err) 544 + goto cleanup; 545 + } 546 + 687 547 err = update_effective_progs(cgrp, atype); 688 548 if (err) 689 - goto cleanup; 549 + goto cleanup_trampoline; 690 550 691 - if (old_prog) 551 + if (old_prog) { 552 + if (type == BPF_LSM_CGROUP) 553 + bpf_trampoline_unlink_cgroup_shim(old_prog); 692 554 bpf_prog_put(old_prog); 693 - else 555 + } else { 694 556 static_branch_inc(&cgroup_bpf_enabled_key[atype]); 557 + } 695 558 bpf_cgroup_storages_link(new_storage, cgrp, type); 696 559 return 0; 560 + 561 + cleanup_trampoline: 562 + if (type == BPF_LSM_CGROUP) 563 + bpf_trampoline_unlink_cgroup_shim(new_prog); 697 564 698 565 cleanup: 699 566 if (old_prog) { ··· 715 556 } 716 557 bpf_cgroup_storages_free(new_storage); 717 558 if (!old_prog) { 718 - list_del(&pl->node); 559 + hlist_del(&pl->node); 719 560 kfree(pl); 720 561 } 721 562 return err; ··· 746 587 struct cgroup_subsys_state *css; 747 588 struct bpf_prog_array *progs; 748 589 struct bpf_prog_list *pl; 749 - struct list_head *head; 590 + struct hlist_head *head; 750 591 struct cgroup *cg; 751 592 int pos; 752 593 ··· 762 603 continue; 763 604 764 605 head = &cg->bpf.progs[atype]; 765 - 
list_for_each_entry(pl, head, node) { 606 + hlist_for_each_entry(pl, head, node) { 766 607 if (!prog_list_prog(pl)) 767 608 continue; 768 609 if (pl->link == link) ··· 796 637 enum cgroup_bpf_attach_type atype; 797 638 struct bpf_prog *old_prog; 798 639 struct bpf_prog_list *pl; 799 - struct list_head *progs; 640 + struct hlist_head *progs; 800 641 bool found = false; 801 642 802 - atype = to_cgroup_bpf_attach_type(link->type); 643 + atype = bpf_cgroup_atype_find(link->type, new_prog->aux->attach_btf_id); 803 644 if (atype < 0) 804 645 return -EINVAL; 805 646 ··· 808 649 if (link->link.prog->type != new_prog->type) 809 650 return -EINVAL; 810 651 811 - list_for_each_entry(pl, progs, node) { 652 + hlist_for_each_entry(pl, progs, node) { 812 653 if (pl->link == link) { 813 654 found = true; 814 655 break; ··· 847 688 return ret; 848 689 } 849 690 850 - static struct bpf_prog_list *find_detach_entry(struct list_head *progs, 691 + static struct bpf_prog_list *find_detach_entry(struct hlist_head *progs, 851 692 struct bpf_prog *prog, 852 693 struct bpf_cgroup_link *link, 853 694 bool allow_multi) ··· 855 696 struct bpf_prog_list *pl; 856 697 857 698 if (!allow_multi) { 858 - if (list_empty(progs)) 699 + if (hlist_empty(progs)) 859 700 /* report error when trying to detach and nothing is attached */ 860 701 return ERR_PTR(-ENOENT); 861 702 862 703 /* to maintain backward compatibility NONE and OVERRIDE cgroups 863 704 * allow detaching with invalid FD (prog==NULL) in legacy mode 864 705 */ 865 - return list_first_entry(progs, typeof(*pl), node); 706 + return hlist_entry(progs->first, typeof(*pl), node); 866 707 } 867 708 868 709 if (!prog && !link) ··· 872 713 return ERR_PTR(-EINVAL); 873 714 874 715 /* find the prog or link and detach it */ 875 - list_for_each_entry(pl, progs, node) { 716 + hlist_for_each_entry(pl, progs, node) { 876 717 if (pl->prog == prog && pl->link == link) 877 718 return pl; 878 719 } ··· 896 737 struct cgroup_subsys_state *css; 897 738 struct 
bpf_prog_array *progs; 898 739 struct bpf_prog_list *pl; 899 - struct list_head *head; 740 + struct hlist_head *head; 900 741 struct cgroup *cg; 901 742 int pos; 902 743 ··· 913 754 continue; 914 755 915 756 head = &cg->bpf.progs[atype]; 916 - list_for_each_entry(pl, head, node) { 757 + hlist_for_each_entry(pl, head, node) { 917 758 if (!prog_list_prog(pl)) 918 759 continue; 919 760 if (pl->prog == prog && pl->link == link) ··· 950 791 enum cgroup_bpf_attach_type atype; 951 792 struct bpf_prog *old_prog; 952 793 struct bpf_prog_list *pl; 953 - struct list_head *progs; 794 + struct hlist_head *progs; 795 + u32 attach_btf_id = 0; 954 796 u32 flags; 955 797 956 - atype = to_cgroup_bpf_attach_type(type); 798 + if (prog) 799 + attach_btf_id = prog->aux->attach_btf_id; 800 + if (link) 801 + attach_btf_id = link->link.prog->aux->attach_btf_id; 802 + 803 + atype = bpf_cgroup_atype_find(type, attach_btf_id); 957 804 if (atype < 0) 958 805 return -EINVAL; 959 806 ··· 987 822 } 988 823 989 824 /* now can actually delete it from this cgroup list */ 990 - list_del(&pl->node); 825 + hlist_del(&pl->node); 826 + 991 827 kfree(pl); 992 - if (list_empty(progs)) 828 + if (hlist_empty(progs)) 993 829 /* last program was detached, reset flags to zero */ 994 830 cgrp->bpf.flags[atype] = 0; 995 - if (old_prog) 831 + if (old_prog) { 832 + if (type == BPF_LSM_CGROUP) 833 + bpf_trampoline_unlink_cgroup_shim(old_prog); 996 834 bpf_prog_put(old_prog); 835 + } 997 836 static_branch_dec(&cgroup_bpf_enabled_key[atype]); 998 837 return 0; 999 838 } ··· 1017 848 static int __cgroup_bpf_query(struct cgroup *cgrp, const union bpf_attr *attr, 1018 849 union bpf_attr __user *uattr) 1019 850 { 851 + __u32 __user *prog_attach_flags = u64_to_user_ptr(attr->query.prog_attach_flags); 1020 852 __u32 __user *prog_ids = u64_to_user_ptr(attr->query.prog_ids); 1021 853 enum bpf_attach_type type = attr->query.attach_type; 854 + enum cgroup_bpf_attach_type from_atype, to_atype; 1022 855 enum 
cgroup_bpf_attach_type atype; 1023 856 struct bpf_prog_array *effective; 1024 - struct list_head *progs; 1025 - struct bpf_prog *prog; 1026 857 int cnt, ret = 0, i; 858 + int total_cnt = 0; 1027 859 u32 flags; 1028 860 1029 - atype = to_cgroup_bpf_attach_type(type); 1030 - if (atype < 0) 1031 - return -EINVAL; 861 + if (type == BPF_LSM_CGROUP) { 862 + if (attr->query.prog_cnt && prog_ids && !prog_attach_flags) 863 + return -EINVAL; 1032 864 1033 - progs = &cgrp->bpf.progs[atype]; 1034 - flags = cgrp->bpf.flags[atype]; 865 + from_atype = CGROUP_LSM_START; 866 + to_atype = CGROUP_LSM_END; 867 + flags = 0; 868 + } else { 869 + from_atype = to_cgroup_bpf_attach_type(type); 870 + if (from_atype < 0) 871 + return -EINVAL; 872 + to_atype = from_atype; 873 + flags = cgrp->bpf.flags[from_atype]; 874 + } 1035 875 1036 - effective = rcu_dereference_protected(cgrp->bpf.effective[atype], 1037 - lockdep_is_held(&cgroup_mutex)); 1038 - 1039 - if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE) 1040 - cnt = bpf_prog_array_length(effective); 1041 - else 1042 - cnt = prog_list_length(progs); 876 + for (atype = from_atype; atype <= to_atype; atype++) { 877 + if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE) { 878 + effective = rcu_dereference_protected(cgrp->bpf.effective[atype], 879 + lockdep_is_held(&cgroup_mutex)); 880 + total_cnt += bpf_prog_array_length(effective); 881 + } else { 882 + total_cnt += prog_list_length(&cgrp->bpf.progs[atype]); 883 + } 884 + } 1043 885 1044 886 if (copy_to_user(&uattr->query.attach_flags, &flags, sizeof(flags))) 1045 887 return -EFAULT; 1046 - if (copy_to_user(&uattr->query.prog_cnt, &cnt, sizeof(cnt))) 888 + if (copy_to_user(&uattr->query.prog_cnt, &total_cnt, sizeof(total_cnt))) 1047 889 return -EFAULT; 1048 - if (attr->query.prog_cnt == 0 || !prog_ids || !cnt) 890 + if (attr->query.prog_cnt == 0 || !prog_ids || !total_cnt) 1049 891 /* return early if user requested only program count + flags */ 1050 892 return 0; 1051 - if 
(attr->query.prog_cnt < cnt) { 1052 - cnt = attr->query.prog_cnt; 893 + 894 + if (attr->query.prog_cnt < total_cnt) { 895 + total_cnt = attr->query.prog_cnt; 1053 896 ret = -ENOSPC; 1054 897 } 1055 898 1056 - if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE) { 1057 - return bpf_prog_array_copy_to_user(effective, prog_ids, cnt); 1058 - } else { 1059 - struct bpf_prog_list *pl; 1060 - u32 id; 899 + for (atype = from_atype; atype <= to_atype && total_cnt; atype++) { 900 + if (attr->query.query_flags & BPF_F_QUERY_EFFECTIVE) { 901 + effective = rcu_dereference_protected(cgrp->bpf.effective[atype], 902 + lockdep_is_held(&cgroup_mutex)); 903 + cnt = min_t(int, bpf_prog_array_length(effective), total_cnt); 904 + ret = bpf_prog_array_copy_to_user(effective, prog_ids, cnt); 905 + } else { 906 + struct hlist_head *progs; 907 + struct bpf_prog_list *pl; 908 + struct bpf_prog *prog; 909 + u32 id; 1061 910 1062 - i = 0; 1063 - list_for_each_entry(pl, progs, node) { 1064 - prog = prog_list_prog(pl); 1065 - id = prog->aux->id; 1066 - if (copy_to_user(prog_ids + i, &id, sizeof(id))) 1067 - return -EFAULT; 1068 - if (++i == cnt) 1069 - break; 911 + progs = &cgrp->bpf.progs[atype]; 912 + cnt = min_t(int, prog_list_length(progs), total_cnt); 913 + i = 0; 914 + hlist_for_each_entry(pl, progs, node) { 915 + prog = prog_list_prog(pl); 916 + id = prog->aux->id; 917 + if (copy_to_user(prog_ids + i, &id, sizeof(id))) 918 + return -EFAULT; 919 + if (++i == cnt) 920 + break; 921 + } 1070 922 } 923 + 924 + if (prog_attach_flags) { 925 + flags = cgrp->bpf.flags[atype]; 926 + 927 + for (i = 0; i < cnt; i++) 928 + if (copy_to_user(prog_attach_flags + i, &flags, sizeof(flags))) 929 + return -EFAULT; 930 + prog_attach_flags += cnt; 931 + } 932 + 933 + prog_ids += cnt; 934 + total_cnt -= cnt; 1071 935 } 1072 936 return ret; 1073 937 } ··· 1189 987 1190 988 WARN_ON(__cgroup_bpf_detach(cg_link->cgroup, NULL, cg_link, 1191 989 cg_link->type)); 990 + if (cg_link->type == BPF_LSM_CGROUP) 991 + 
bpf_trampoline_unlink_cgroup_shim(cg_link->link.prog); 1192 992 1193 993 cg = cg_link->cgroup; 1194 994 cg_link->cgroup = NULL; ··· 1535 1331 return ctx->retval; 1536 1332 } 1537 1333 1538 - static const struct bpf_func_proto bpf_get_retval_proto = { 1334 + const struct bpf_func_proto bpf_get_retval_proto = { 1539 1335 .func = bpf_get_retval, 1540 1336 .gpl_only = false, 1541 1337 .ret_type = RET_INTEGER, ··· 1550 1346 return 0; 1551 1347 } 1552 1348 1553 - static const struct bpf_func_proto bpf_set_retval_proto = { 1349 + const struct bpf_func_proto bpf_set_retval_proto = { 1554 1350 .func = bpf_set_retval, 1555 1351 .gpl_only = false, 1556 1352 .ret_type = RET_INTEGER,
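Note on the `cgroup_lsm_atype` table added above: each `BPF_LSM_CGROUP` hook is identified by its `attach_btf_id` and is mapped on demand to one of a small number of cgroup attach-type slots. The slot is refcounted and freed when the last user goes away. The following is a minimal userspace sketch of that find/get/put pattern, not the kernel code. The `demo_` names, the slot count, and the base value are hypothetical. Locking (the kernel uses `cgroup_mutex`) is left out, and the sketch returns -1 where the kernel returns `-E2BIG`.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_LSM_SLOTS  4    /* hypothetical; plays the role of CGROUP_LSM_NUM */
#define DEMO_LSM_START  100  /* hypothetical; plays the role of CGROUP_LSM_START */

struct demo_lsm_slot {
	uint32_t attach_btf_id;  /* 0 means the slot is free */
	int refcnt;
};

struct demo_lsm_slot demo_slots[DEMO_LSM_SLOTS];

/* Return the slot already bound to attach_btf_id, else the first free
 * slot, else -1 when every slot is taken by some other hook.
 */
int demo_atype_find(uint32_t attach_btf_id)
{
	int i;

	for (i = 0; i < DEMO_LSM_SLOTS; i++)
		if (demo_slots[i].attach_btf_id == attach_btf_id)
			return DEMO_LSM_START + i;

	for (i = 0; i < DEMO_LSM_SLOTS; i++)
		if (demo_slots[i].attach_btf_id == 0)
			return DEMO_LSM_START + i;

	return -1;
}

/* Bind the slot to the hook (if not bound yet) and take a reference. */
void demo_atype_get(uint32_t attach_btf_id, int atype)
{
	struct demo_lsm_slot *s = &demo_slots[atype - DEMO_LSM_START];

	assert(s->attach_btf_id == 0 || s->attach_btf_id == attach_btf_id);
	s->attach_btf_id = attach_btf_id;
	s->refcnt++;
}

/* Drop a reference; the slot becomes reusable when the count hits zero. */
void demo_atype_put(int atype)
{
	struct demo_lsm_slot *s = &demo_slots[atype - DEMO_LSM_START];

	assert(s->refcnt > 0);
	if (--s->refcnt == 0)
		s->attach_btf_id = 0;
}
```

Note that `find` only picks a slot and does not reserve it. In the diff the reservation happens later, when the shim program is allocated and `bpf_cgroup_atype_get()` is called.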
+15
kernel/bpf/core.c
@@ -107,6 +107,9 @@
 	fp->aux->prog = fp;
 	fp->jit_requested = ebpf_jit_enabled();
 	fp->blinding_requested = bpf_jit_blinding_enabled(fp);
+#ifdef CONFIG_CGROUP_BPF
+	aux->cgroup_atype = CGROUP_BPF_ATTACH_TYPE_INVALID;
+#endif

 	INIT_LIST_HEAD_RCU(&fp->aux->ksym.lnode);
 	mutex_init(&fp->aux->used_maps_mutex);
@@ -2573,6 +2570,10 @@
 #ifdef CONFIG_BPF_SYSCALL
 	bpf_free_kfunc_btf_tab(aux->kfunc_btf_tab);
 #endif
+#ifdef CONFIG_CGROUP_BPF
+	if (aux->cgroup_atype != CGROUP_BPF_ATTACH_TYPE_INVALID)
+		bpf_cgroup_atype_put(aux->cgroup_atype);
+#endif
 	bpf_free_used_maps(aux);
 	bpf_free_used_btfs(aux);
 	if (bpf_prog_is_dev_bound(aux))
@@ -2673,6 +2666,8 @@
 const struct bpf_func_proto bpf_get_ns_current_pid_tgid_proto __weak;
 const struct bpf_func_proto bpf_snprintf_btf_proto __weak;
 const struct bpf_func_proto bpf_seq_printf_btf_proto __weak;
+const struct bpf_func_proto bpf_set_retval_proto __weak;
+const struct bpf_func_proto bpf_get_retval_proto __weak;

 const struct bpf_func_proto * __weak bpf_get_trace_printk_proto(void)
 {
@@ -2734,6 +2725,12 @@
  * them using insn_is_zext.
  */
 bool __weak bpf_jit_needs_zext(void)
+{
+	return false;
+}
+
+/* Return TRUE if the JIT backend supports mixing bpf2bpf and tailcalls. */
+bool __weak bpf_jit_supports_subprog_tailcalls(void)
 {
 	return false;
 }
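Note on `bpf_jit_supports_subprog_tailcalls()`: the `__weak` definition above is a default that an architecture's JIT can override with a strong definition of the same name. In the verifier.c part of this diff, the result is combined with `jit_requested` in place of the old `IS_ENABLED(CONFIG_X86_64)` check. A minimal userspace sketch of the same idea follows, assuming a GCC or Clang toolchain (the `weak` attribute is a compiler extension); the `demo_` names are hypothetical.

```c
#include <stdbool.h>

/* Default answer, used only when no other object file provides a
 * strong definition of the same symbol.
 */
__attribute__((weak)) bool demo_jit_supports_subprog_tailcalls(void)
{
	return false;
}

/* Caller side: the feature is allowed only if a JIT was requested and
 * the backend (default or override) reports support.
 */
bool demo_allow_tail_call_in_subprogs(bool jit_requested)
{
	return jit_requested && demo_jit_supports_subprog_tailcalls();
}
```

With only this file linked in, the weak default is used and the feature stays disabled. A backend that supports it would supply its own non-weak `demo_jit_supports_subprog_tailcalls()` returning `true`, with no change to the caller.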
+17 -1
kernel/bpf/syscall.c
@@ -3416,6 +3416,8 @@
 		return BPF_PROG_TYPE_SK_LOOKUP;
 	case BPF_XDP:
 		return BPF_PROG_TYPE_XDP;
+	case BPF_LSM_CGROUP:
+		return BPF_PROG_TYPE_LSM;
 	default:
 		return BPF_PROG_TYPE_UNSPEC;
 	}
@@ -3471,6 +3469,11 @@
 	case BPF_PROG_TYPE_CGROUP_SOCKOPT:
 	case BPF_PROG_TYPE_CGROUP_SYSCTL:
 	case BPF_PROG_TYPE_SOCK_OPS:
+	case BPF_PROG_TYPE_LSM:
+		if (ptype == BPF_PROG_TYPE_LSM &&
+		    prog->expected_attach_type != BPF_LSM_CGROUP)
+			return -EINVAL;
+
 		ret = cgroup_bpf_prog_attach(attr, ptype, prog);
 		break;
 	default:
@@ -3513,13 +3506,14 @@
 	case BPF_PROG_TYPE_CGROUP_SOCKOPT:
 	case BPF_PROG_TYPE_CGROUP_SYSCTL:
 	case BPF_PROG_TYPE_SOCK_OPS:
+	case BPF_PROG_TYPE_LSM:
 		return cgroup_bpf_prog_detach(attr, ptype);
 	default:
 		return -EINVAL;
 	}
 }

-#define BPF_PROG_QUERY_LAST_FIELD query.prog_cnt
+#define BPF_PROG_QUERY_LAST_FIELD query.prog_attach_flags

 static int bpf_prog_query(const union bpf_attr *attr,
 			  union bpf_attr __user *uattr)
@@ -3556,6 +3548,7 @@
 	case BPF_CGROUP_SYSCTL:
 	case BPF_CGROUP_GETSOCKOPT:
 	case BPF_CGROUP_SETSOCKOPT:
+	case BPF_LSM_CGROUP:
 		return cgroup_bpf_prog_query(attr, uattr);
 	case BPF_LIRC_MODE2:
 		return lirc_prog_query(attr, uattr);
@@ -4067,6 +4058,11 @@

 	if (prog->aux->btf)
 		info.btf_id = btf_obj_id(prog->aux->btf);
+	info.attach_btf_id = prog->aux->attach_btf_id;
+	if (prog->aux->attach_btf)
+		info.attach_btf_obj_id = btf_obj_id(prog->aux->attach_btf);
+	else if (prog->aux->dst_prog)
+		info.attach_btf_obj_id = btf_obj_id(prog->aux->dst_prog->aux->attach_btf);

 	ulen = info.nr_func_info;
 	info.nr_func_info = prog->aux->func_info_cnt;
@@ -4554,6 +4540,8 @@
 			ret = bpf_raw_tp_link_attach(prog, NULL);
 		else if (prog->expected_attach_type == BPF_TRACE_ITER)
 			ret = bpf_iter_link_attach(attr, uattr, prog);
+		else if (prog->expected_attach_type == BPF_LSM_CGROUP)
+			ret = cgroup_bpf_link_attach(attr, prog);
 		else
 			ret = bpf_tracing_prog_attach(prog,
 						      attr->link_create.target_fd,
+232 -30
kernel/bpf/trampoline.c
··· 11 11 #include <linux/rcupdate_wait.h> 12 12 #include <linux/module.h> 13 13 #include <linux/static_call.h> 14 + #include <linux/bpf_verifier.h> 15 + #include <linux/bpf_lsm.h> 14 16 15 17 /* dummy _ops. The verifier will operate on target program's ops. */ 16 18 const struct bpf_verifier_ops bpf_extension_verifier_ops = { ··· 412 410 } 413 411 } 414 412 415 - int bpf_trampoline_link_prog(struct bpf_tramp_link *link, struct bpf_trampoline *tr) 413 + static int __bpf_trampoline_link_prog(struct bpf_tramp_link *link, struct bpf_trampoline *tr) 416 414 { 417 415 enum bpf_tramp_prog_type kind; 418 416 struct bpf_tramp_link *link_exiting; ··· 420 418 int cnt = 0, i; 421 419 422 420 kind = bpf_attach_type_to_tramp(link->link.prog); 423 - mutex_lock(&tr->mutex); 424 - if (tr->extension_prog) { 421 + if (tr->extension_prog) 425 422 /* cannot attach fentry/fexit if extension prog is attached. 426 423 * cannot overwrite extension prog either. 427 424 */ 428 - err = -EBUSY; 429 - goto out; 430 - } 425 + return -EBUSY; 431 426 432 427 for (i = 0; i < BPF_TRAMP_MAX; i++) 433 428 cnt += tr->progs_cnt[i]; 434 429 435 430 if (kind == BPF_TRAMP_REPLACE) { 436 431 /* Cannot attach extension if fentry/fexit are in use. 
*/ 437 - if (cnt) { 438 - err = -EBUSY; 439 - goto out; 440 - } 432 + if (cnt) 433 + return -EBUSY; 441 434 tr->extension_prog = link->link.prog; 442 - err = bpf_arch_text_poke(tr->func.addr, BPF_MOD_JUMP, NULL, 443 - link->link.prog->bpf_func); 444 - goto out; 435 + return bpf_arch_text_poke(tr->func.addr, BPF_MOD_JUMP, NULL, 436 + link->link.prog->bpf_func); 445 437 } 446 - if (cnt >= BPF_MAX_TRAMP_LINKS) { 447 - err = -E2BIG; 448 - goto out; 449 - } 450 - if (!hlist_unhashed(&link->tramp_hlist)) { 438 + if (cnt >= BPF_MAX_TRAMP_LINKS) 439 + return -E2BIG; 440 + if (!hlist_unhashed(&link->tramp_hlist)) 451 441 /* prog already linked */ 452 - err = -EBUSY; 453 - goto out; 454 - } 442 + return -EBUSY; 455 443 hlist_for_each_entry(link_exiting, &tr->progs_hlist[kind], tramp_hlist) { 456 444 if (link_exiting->link.prog != link->link.prog) 457 445 continue; 458 446 /* prog already linked */ 459 - err = -EBUSY; 460 - goto out; 447 + return -EBUSY; 461 448 } 462 449 463 450 hlist_add_head(&link->tramp_hlist, &tr->progs_hlist[kind]); ··· 456 465 hlist_del_init(&link->tramp_hlist); 457 466 tr->progs_cnt[kind]--; 458 467 } 459 - out: 468 + return err; 469 + } 470 + 471 + int bpf_trampoline_link_prog(struct bpf_tramp_link *link, struct bpf_trampoline *tr) 472 + { 473 + int err; 474 + 475 + mutex_lock(&tr->mutex); 476 + err = __bpf_trampoline_link_prog(link, tr); 460 477 mutex_unlock(&tr->mutex); 461 478 return err; 462 479 } 463 480 464 - /* bpf_trampoline_unlink_prog() should never fail. 
*/ 465 - int bpf_trampoline_unlink_prog(struct bpf_tramp_link *link, struct bpf_trampoline *tr) 481 + static int __bpf_trampoline_unlink_prog(struct bpf_tramp_link *link, struct bpf_trampoline *tr) 466 482 { 467 483 enum bpf_tramp_prog_type kind; 468 484 int err; 469 485 470 486 kind = bpf_attach_type_to_tramp(link->link.prog); 471 - mutex_lock(&tr->mutex); 472 487 if (kind == BPF_TRAMP_REPLACE) { 473 488 WARN_ON_ONCE(!tr->extension_prog); 474 489 err = bpf_arch_text_poke(tr->func.addr, BPF_MOD_JUMP, 475 490 tr->extension_prog->bpf_func, NULL); 476 491 tr->extension_prog = NULL; 477 - goto out; 492 + return err; 478 493 } 479 494 hlist_del_init(&link->tramp_hlist); 480 495 tr->progs_cnt[kind]--; 481 - err = bpf_trampoline_update(tr); 482 - out: 496 + return bpf_trampoline_update(tr); 497 + } 498 + 499 + /* bpf_trampoline_unlink_prog() should never fail. */ 500 + int bpf_trampoline_unlink_prog(struct bpf_tramp_link *link, struct bpf_trampoline *tr) 501 + { 502 + int err; 503 + 504 + mutex_lock(&tr->mutex); 505 + err = __bpf_trampoline_unlink_prog(link, tr); 483 506 mutex_unlock(&tr->mutex); 484 507 return err; 485 508 } 509 + 510 + #if defined(CONFIG_BPF_JIT) && defined(CONFIG_BPF_SYSCALL) 511 + static void bpf_shim_tramp_link_release(struct bpf_link *link) 512 + { 513 + struct bpf_shim_tramp_link *shim_link = 514 + container_of(link, struct bpf_shim_tramp_link, link.link); 515 + 516 + /* paired with 'shim_link->trampoline = tr' in bpf_trampoline_link_cgroup_shim */ 517 + if (!shim_link->trampoline) 518 + return; 519 + 520 + WARN_ON_ONCE(bpf_trampoline_unlink_prog(&shim_link->link, shim_link->trampoline)); 521 + bpf_trampoline_put(shim_link->trampoline); 522 + } 523 + 524 + static void bpf_shim_tramp_link_dealloc(struct bpf_link *link) 525 + { 526 + struct bpf_shim_tramp_link *shim_link = 527 + container_of(link, struct bpf_shim_tramp_link, link.link); 528 + 529 + kfree(shim_link); 530 + } 531 + 532 + static const struct bpf_link_ops bpf_shim_tramp_link_lops = { 533 
+ .release = bpf_shim_tramp_link_release, 534 + .dealloc = bpf_shim_tramp_link_dealloc, 535 + }; 536 + 537 + static struct bpf_shim_tramp_link *cgroup_shim_alloc(const struct bpf_prog *prog, 538 + bpf_func_t bpf_func, 539 + int cgroup_atype) 540 + { 541 + struct bpf_shim_tramp_link *shim_link = NULL; 542 + struct bpf_prog *p; 543 + 544 + shim_link = kzalloc(sizeof(*shim_link), GFP_USER); 545 + if (!shim_link) 546 + return NULL; 547 + 548 + p = bpf_prog_alloc(1, 0); 549 + if (!p) { 550 + kfree(shim_link); 551 + return NULL; 552 + } 553 + 554 + p->jited = false; 555 + p->bpf_func = bpf_func; 556 + 557 + p->aux->cgroup_atype = cgroup_atype; 558 + p->aux->attach_func_proto = prog->aux->attach_func_proto; 559 + p->aux->attach_btf_id = prog->aux->attach_btf_id; 560 + p->aux->attach_btf = prog->aux->attach_btf; 561 + btf_get(p->aux->attach_btf); 562 + p->type = BPF_PROG_TYPE_LSM; 563 + p->expected_attach_type = BPF_LSM_MAC; 564 + bpf_prog_inc(p); 565 + bpf_link_init(&shim_link->link.link, BPF_LINK_TYPE_UNSPEC, 566 + &bpf_shim_tramp_link_lops, p); 567 + bpf_cgroup_atype_get(p->aux->attach_btf_id, cgroup_atype); 568 + 569 + return shim_link; 570 + } 571 + 572 + static struct bpf_shim_tramp_link *cgroup_shim_find(struct bpf_trampoline *tr, 573 + bpf_func_t bpf_func) 574 + { 575 + struct bpf_tramp_link *link; 576 + int kind; 577 + 578 + for (kind = 0; kind < BPF_TRAMP_MAX; kind++) { 579 + hlist_for_each_entry(link, &tr->progs_hlist[kind], tramp_hlist) { 580 + struct bpf_prog *p = link->link.prog; 581 + 582 + if (p->bpf_func == bpf_func) 583 + return container_of(link, struct bpf_shim_tramp_link, link); 584 + } 585 + } 586 + 587 + return NULL; 588 + } 589 + 590 + int bpf_trampoline_link_cgroup_shim(struct bpf_prog *prog, 591 + int cgroup_atype) 592 + { 593 + struct bpf_shim_tramp_link *shim_link = NULL; 594 + struct bpf_attach_target_info tgt_info = {}; 595 + struct bpf_trampoline *tr; 596 + bpf_func_t bpf_func; 597 + u64 key; 598 + int err; 599 + 600 + err = 
bpf_check_attach_target(NULL, prog, NULL, 601 + prog->aux->attach_btf_id, 602 + &tgt_info); 603 + if (err) 604 + return err; 605 + 606 + key = bpf_trampoline_compute_key(NULL, prog->aux->attach_btf, 607 + prog->aux->attach_btf_id); 608 + 609 + bpf_lsm_find_cgroup_shim(prog, &bpf_func); 610 + tr = bpf_trampoline_get(key, &tgt_info); 611 + if (!tr) 612 + return -ENOMEM; 613 + 614 + mutex_lock(&tr->mutex); 615 + 616 + shim_link = cgroup_shim_find(tr, bpf_func); 617 + if (shim_link) { 618 + /* Reusing existing shim attached by the other program. */ 619 + bpf_link_inc(&shim_link->link.link); 620 + 621 + mutex_unlock(&tr->mutex); 622 + bpf_trampoline_put(tr); /* bpf_trampoline_get above */ 623 + return 0; 624 + } 625 + 626 + /* Allocate and install new shim. */ 627 + 628 + shim_link = cgroup_shim_alloc(prog, bpf_func, cgroup_atype); 629 + if (!shim_link) { 630 + err = -ENOMEM; 631 + goto err; 632 + } 633 + 634 + err = __bpf_trampoline_link_prog(&shim_link->link, tr); 635 + if (err) 636 + goto err; 637 + 638 + shim_link->trampoline = tr; 639 + /* note, we're still holding tr refcnt from above */ 640 + 641 + mutex_unlock(&tr->mutex); 642 + 643 + return 0; 644 + err: 645 + mutex_unlock(&tr->mutex); 646 + 647 + if (shim_link) 648 + bpf_link_put(&shim_link->link.link); 649 + 650 + /* have to release tr while _not_ holding its mutex */ 651 + bpf_trampoline_put(tr); /* bpf_trampoline_get above */ 652 + 653 + return err; 654 + } 655 + 656 + void bpf_trampoline_unlink_cgroup_shim(struct bpf_prog *prog) 657 + { 658 + struct bpf_shim_tramp_link *shim_link = NULL; 659 + struct bpf_trampoline *tr; 660 + bpf_func_t bpf_func; 661 + u64 key; 662 + 663 + key = bpf_trampoline_compute_key(NULL, prog->aux->attach_btf, 664 + prog->aux->attach_btf_id); 665 + 666 + bpf_lsm_find_cgroup_shim(prog, &bpf_func); 667 + tr = bpf_trampoline_lookup(key); 668 + if (WARN_ON_ONCE(!tr)) 669 + return; 670 + 671 + mutex_lock(&tr->mutex); 672 + shim_link = cgroup_shim_find(tr, bpf_func); 673 + 
mutex_unlock(&tr->mutex); 674 + 675 + if (shim_link) 676 + bpf_link_put(&shim_link->link.link); 677 + 678 + bpf_trampoline_put(tr); /* bpf_trampoline_lookup above */ 679 + } 680 + #endif 486 681 487 682 struct bpf_trampoline *bpf_trampoline_get(u64 key, 488 683 struct bpf_attach_target_info *tgt_info) ··· 798 621 799 622 update_prog_stats(prog, start); 800 623 __this_cpu_dec(*(prog->active)); 624 + migrate_enable(); 625 + rcu_read_unlock(); 626 + } 627 + 628 + u64 notrace __bpf_prog_enter_lsm_cgroup(struct bpf_prog *prog, 629 + struct bpf_tramp_run_ctx *run_ctx) 630 + __acquires(RCU) 631 + { 632 + /* Runtime stats are exported via actual BPF_LSM_CGROUP 633 + * programs, not the shims. 634 + */ 635 + rcu_read_lock(); 636 + migrate_disable(); 637 + 638 + run_ctx->saved_run_ctx = bpf_set_run_ctx(&run_ctx->run_ctx); 639 + 640 + return NO_START_TIME; 641 + } 642 + 643 + void notrace __bpf_prog_exit_lsm_cgroup(struct bpf_prog *prog, u64 start, 644 + struct bpf_tramp_run_ctx *run_ctx) 645 + __releases(RCU) 646 + { 647 + bpf_reset_run_ctx(run_ctx->saved_run_ctx); 648 + 801 649 migrate_enable(); 802 650 rcu_read_unlock(); 803 651 }
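Note on shim sharing in `bpf_trampoline_link_cgroup_shim()` and `bpf_trampoline_unlink_cgroup_shim()` above: the first program attached to a hook installs a shim on that hook's trampoline. Later programs find the existing shim by comparing `bpf_func` and only take another reference, and the shim is torn down when the last reference is dropped. The following is a rough userspace sketch of that find-or-create pattern with a plain counter. The `demo_` names and list layout are hypothetical, and the `tr->mutex` locking and `bpf_link` refcounting of the real code are omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef int (*demo_shim_fn)(void *ctx);

struct demo_shim {
	demo_shim_fn fn;         /* identifies the shim, like p->bpf_func */
	int refcnt;              /* number of programs sharing this shim */
	struct demo_shim *next;
};

struct demo_tramp {
	struct demo_shim *shims; /* shims installed on this trampoline */
};

/* Placeholder shim bodies; they differ only so the two are distinct. */
int demo_shim_sock(void *ctx)    { (void)ctx; return 1; }
int demo_shim_current(void *ctx) { (void)ctx; return 2; }

struct demo_shim *demo_shim_find(struct demo_tramp *tr, demo_shim_fn fn)
{
	struct demo_shim *s;

	for (s = tr->shims; s; s = s->next)
		if (s->fn == fn)
			return s;
	return NULL;
}

/* Reuse an installed shim if there is one, otherwise install a new one.
 * Returns 0 on success and -1 on allocation failure.
 */
int demo_shim_link(struct demo_tramp *tr, demo_shim_fn fn)
{
	struct demo_shim *s = demo_shim_find(tr, fn);

	if (s) {
		s->refcnt++;
		return 0;
	}

	s = calloc(1, sizeof(*s));
	if (!s)
		return -1;
	s->fn = fn;
	s->refcnt = 1;
	s->next = tr->shims;
	tr->shims = s;
	return 0;
}

/* Drop one reference; remove and free the shim when none remain. */
void demo_shim_unlink(struct demo_tramp *tr, demo_shim_fn fn)
{
	struct demo_shim **pp;

	for (pp = &tr->shims; *pp; pp = &(*pp)->next) {
		struct demo_shim *s = *pp;

		if (s->fn != fn)
			continue;
		assert(s->refcnt > 0);
		if (--s->refcnt == 0) {
			*pp = s->next;
			free(s);
		}
		return;
	}
}
```

In the diff itself the release is indirect: `bpf_link_put()` on the shim link eventually runs `bpf_shim_tramp_link_release()`, which unlinks the shim from the trampoline and drops the trampoline reference.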
+226 -12
kernel/bpf/verifier.c
···

 static bool allow_tail_call_in_subprogs(struct bpf_verifier_env *env)
 {
-	return env->prog->jit_requested && IS_ENABLED(CONFIG_X86_64);
+	return env->prog->jit_requested &&
+	       bpf_jit_supports_subprog_tailcalls();
 }

 static int check_map_func_compatibility(struct bpf_verifier_env *env,
···
 	return -ENOTSUPP;
 }

+static struct bpf_insn_aux_data *cur_aux(struct bpf_verifier_env *env)
+{
+	return &env->insn_aux_data[env->insn_idx];
+}
+
+static bool loop_flag_is_zero(struct bpf_verifier_env *env)
+{
+	struct bpf_reg_state *regs = cur_regs(env);
+	struct bpf_reg_state *reg = &regs[BPF_REG_4];
+	bool reg_is_null = register_is_null(reg);
+
+	if (reg_is_null)
+		mark_chain_precision(env, BPF_REG_4);
+
+	return reg_is_null;
+}
+
+static void update_loop_inline_state(struct bpf_verifier_env *env, u32 subprogno)
+{
+	struct bpf_loop_inline_state *state = &cur_aux(env)->loop_inline_state;
+
+	if (!state->initialized) {
+		state->initialized = 1;
+		state->fit_for_inline = loop_flag_is_zero(env);
+		state->callback_subprogno = subprogno;
+		return;
+	}
+
+	if (!state->fit_for_inline)
+		return;
+
+	state->fit_for_inline = (loop_flag_is_zero(env) &&
+				 state->callback_subprogno == subprogno);
+}
+
 static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
 			     int *insn_idx_p)
 {
···
 		err = check_bpf_snprintf_call(env, regs);
 		break;
 	case BPF_FUNC_loop:
+		update_loop_inline_state(env, meta.subprogno);
 		err = __check_func_call(env, insn, insn_idx_p, meta.subprogno,
 					set_loop_callback_state);
 		break;
···
 				reg_type_str(env, regs[BPF_REG_1].type));
 			return -EACCES;
 		}
+		break;
+	case BPF_FUNC_set_retval:
+		if (env->prog->expected_attach_type == BPF_LSM_CGROUP) {
+			if (!env->prog->aux->attach_func_proto->type) {
+				/* Make sure programs that attach to void
+				 * hooks don't try to modify return value.
+				 */
+				verbose(env, "BPF_LSM_CGROUP that attach to void LSM hooks can't modify return value!\n");
+				return -EINVAL;
+			}
+		}
+		break;
 	}

 	if (err)
···
 	}

 	return true;
-}
-
-static struct bpf_insn_aux_data *cur_aux(struct bpf_verifier_env *env)
-{
-	return &env->insn_aux_data[env->insn_idx];
 }

 enum {
···

 	if (opcode == BPF_END || opcode == BPF_NEG) {
 		if (opcode == BPF_NEG) {
-			if (BPF_SRC(insn->code) != 0 ||
+			if (BPF_SRC(insn->code) != BPF_K ||
 			    insn->src_reg != BPF_REG_0 ||
 			    insn->off != 0 || insn->imm != 0) {
 				verbose(env, "BPF_NEG uses reserved fields\n");
···
 	const bool is_subprog = frame->subprogno;

 	/* LSM and struct_ops func-ptr's return type could be "void" */
-	if (!is_subprog &&
-	    (prog_type == BPF_PROG_TYPE_STRUCT_OPS ||
-	     prog_type == BPF_PROG_TYPE_LSM) &&
-	    !prog->aux->attach_func_proto->type)
-		return 0;
+	if (!is_subprog) {
+		switch (prog_type) {
+		case BPF_PROG_TYPE_LSM:
+			if (prog->expected_attach_type == BPF_LSM_CGROUP)
+				/* See below, can be 0 or 0-1 depending on hook. */
+				break;
+			fallthrough;
+		case BPF_PROG_TYPE_STRUCT_OPS:
+			if (!prog->aux->attach_func_proto->type)
+				return 0;
+			break;
+		default:
+			break;
+		}
+	}

 	/* eBPF calling convention is such that R0 is used
 	 * to return the value from eBPF program.
···
 	case BPF_PROG_TYPE_SK_LOOKUP:
 		range = tnum_range(SK_DROP, SK_PASS);
 		break;
+
+	case BPF_PROG_TYPE_LSM:
+		if (env->prog->expected_attach_type != BPF_LSM_CGROUP) {
+			/* Regular BPF_PROG_TYPE_LSM programs can return
+			 * any value.
+			 */
+			return 0;
+		}
+		if (!env->prog->aux->attach_func_proto->type) {
+			/* Make sure programs that attach to void
+			 * hooks don't try to modify return value.
+			 */
+			range = tnum_range(1, 1);
+		}
+		break;
+
 	case BPF_PROG_TYPE_EXT:
 		/* freplace program can return anything as its return value
 		 * depends on the to-be-replaced kernel func or bpf program.
···

 	if (!tnum_in(range, reg->var_off)) {
 		verbose_invalid_scalar(env, reg, &range, "program exit", "R0");
+		if (prog->expected_attach_type == BPF_LSM_CGROUP &&
+		    prog_type == BPF_PROG_TYPE_LSM &&
+		    !prog->aux->attach_func_proto->type)
+			verbose(env, "Note, BPF_LSM_CGROUP that attach to void LSM hooks can't modify return value!\n");
 		return -EINVAL;
 	}

···
 	return 0;
 }

+static struct bpf_prog *inline_bpf_loop(struct bpf_verifier_env *env,
+					int position,
+					s32 stack_base,
+					u32 callback_subprogno,
+					u32 *cnt)
+{
+	s32 r6_offset = stack_base + 0 * BPF_REG_SIZE;
+	s32 r7_offset = stack_base + 1 * BPF_REG_SIZE;
+	s32 r8_offset = stack_base + 2 * BPF_REG_SIZE;
+	int reg_loop_max = BPF_REG_6;
+	int reg_loop_cnt = BPF_REG_7;
+	int reg_loop_ctx = BPF_REG_8;
+
+	struct bpf_prog *new_prog;
+	u32 callback_start;
+	u32 call_insn_offset;
+	s32 callback_offset;
+
+	/* This represents an inlined version of bpf_iter.c:bpf_loop,
+	 * be careful to modify this code in sync.
+	 */
+	struct bpf_insn insn_buf[] = {
+		/* Return error and jump to the end of the patch if
+		 * expected number of iterations is too big.
+		 */
+		BPF_JMP_IMM(BPF_JLE, BPF_REG_1, BPF_MAX_LOOPS, 2),
+		BPF_MOV32_IMM(BPF_REG_0, -E2BIG),
+		BPF_JMP_IMM(BPF_JA, 0, 0, 16),
+		/* spill R6, R7, R8 to use these as loop vars */
+		BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_6, r6_offset),
+		BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_7, r7_offset),
+		BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_8, r8_offset),
+		/* initialize loop vars */
+		BPF_MOV64_REG(reg_loop_max, BPF_REG_1),
+		BPF_MOV32_IMM(reg_loop_cnt, 0),
+		BPF_MOV64_REG(reg_loop_ctx, BPF_REG_3),
+		/* loop header,
+		 * if reg_loop_cnt >= reg_loop_max skip the loop body
+		 */
+		BPF_JMP_REG(BPF_JGE, reg_loop_cnt, reg_loop_max, 5),
+		/* callback call,
+		 * correct callback offset would be set after patching
+		 */
+		BPF_MOV64_REG(BPF_REG_1, reg_loop_cnt),
+		BPF_MOV64_REG(BPF_REG_2, reg_loop_ctx),
+		BPF_CALL_REL(0),
+		/* increment loop counter */
+		BPF_ALU64_IMM(BPF_ADD, reg_loop_cnt, 1),
+		/* jump to loop header if callback returned 0 */
+		BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, -6),
+		/* return value of bpf_loop,
+		 * set R0 to the number of iterations
+		 */
+		BPF_MOV64_REG(BPF_REG_0, reg_loop_cnt),
+		/* restore original values of R6, R7, R8 */
+		BPF_LDX_MEM(BPF_DW, BPF_REG_6, BPF_REG_10, r6_offset),
+		BPF_LDX_MEM(BPF_DW, BPF_REG_7, BPF_REG_10, r7_offset),
+		BPF_LDX_MEM(BPF_DW, BPF_REG_8, BPF_REG_10, r8_offset),
+	};
+
+	*cnt = ARRAY_SIZE(insn_buf);
+	new_prog = bpf_patch_insn_data(env, position, insn_buf, *cnt);
+	if (!new_prog)
+		return new_prog;
+
+	/* callback start is known only after patching */
+	callback_start = env->subprog_info[callback_subprogno].start;
+	/* Note: insn_buf[12] is an offset of BPF_CALL_REL instruction */
+	call_insn_offset = position + 12;
+	callback_offset = callback_start - call_insn_offset - 1;
+	new_prog->insnsi[call_insn_offset].imm = callback_offset;
+
+	return new_prog;
+}
+
+static bool is_bpf_loop_call(struct bpf_insn *insn)
+{
+	return insn->code == (BPF_JMP | BPF_CALL) &&
+		insn->src_reg == 0 &&
+		insn->imm == BPF_FUNC_loop;
+}
+
+/* For all sub-programs in the program (including main) check
+ * insn_aux_data to see if there are bpf_loop calls that require
+ * inlining. If such calls are found the calls are replaced with a
+ * sequence of instructions produced by `inline_bpf_loop` function and
+ * subprog stack_depth is increased by the size of 3 registers.
+ * This stack space is used to spill values of the R6, R7, R8.  These
+ * registers are used to store the loop bound, counter and context
+ * variables.
+ */
+static int optimize_bpf_loop(struct bpf_verifier_env *env)
+{
+	struct bpf_subprog_info *subprogs = env->subprog_info;
+	int i, cur_subprog = 0, cnt, delta = 0;
+	struct bpf_insn *insn = env->prog->insnsi;
+	int insn_cnt = env->prog->len;
+	u16 stack_depth = subprogs[cur_subprog].stack_depth;
+	u16 stack_depth_roundup = round_up(stack_depth, 8) - stack_depth;
+	u16 stack_depth_extra = 0;
+
+	for (i = 0; i < insn_cnt; i++, insn++) {
+		struct bpf_loop_inline_state *inline_state =
+			&env->insn_aux_data[i + delta].loop_inline_state;
+
+		if (is_bpf_loop_call(insn) && inline_state->fit_for_inline) {
+			struct bpf_prog *new_prog;
+
+			stack_depth_extra = BPF_REG_SIZE * 3 + stack_depth_roundup;
+			new_prog = inline_bpf_loop(env,
+						   i + delta,
+						   -(stack_depth + stack_depth_extra),
+						   inline_state->callback_subprogno,
+						   &cnt);
+			if (!new_prog)
+				return -ENOMEM;
+
+			delta     += cnt - 1;
+			env->prog  = new_prog;
+			insn       = new_prog->insnsi + i + delta;
+		}
+
+		if (subprogs[cur_subprog + 1].start == i + delta + 1) {
+			subprogs[cur_subprog].stack_depth += stack_depth_extra;
+			cur_subprog++;
+			stack_depth = subprogs[cur_subprog].stack_depth;
+			stack_depth_roundup = round_up(stack_depth, 8) - stack_depth;
+			stack_depth_extra = 0;
+		}
+	}
+
+	env->prog->aux->stack_depth = env->subprog_info[0].stack_depth;
+
+	return 0;
+}
+
 static void free_states(struct bpf_verifier_env *env)
 {
 	struct bpf_verifier_state_list *sl, *sln;
···
 		fallthrough;
 	case BPF_MODIFY_RETURN:
 	case BPF_LSM_MAC:
+	case BPF_LSM_CGROUP:
 	case BPF_TRACE_FENTRY:
 	case BPF_TRACE_FEXIT:
 		if (!btf_type_is_func(t)) {
···
 		ret = check_max_stack_depth(env);

 	/* instruction rewrites happen after this point */
+	if (ret == 0)
+		ret = optimize_bpf_loop(env);
+
 	if (is_priv) {
 		if (ret == 0)
 			opt_hard_wire_dead_code_branches(env);
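The instruction sequence emitted by inline_bpf_loop() can be read as an ordinary counted loop. The following is a minimal user-space sketch of that control flow, not kernel code: loop_sketch, loop_cb_t and SKETCH_MAX_LOOPS are hypothetical names, and the limit value is an assumption standing in for BPF_MAX_LOOPS. It shows the bound check, the counter being incremented before the callback result is tested, and the iteration count being returned.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed stand-in for the kernel's BPF_MAX_LOOPS; the value is a guess. */
#define SKETCH_MAX_LOOPS (8u * 1024u * 1024u)

/* Hypothetical callback type: returns 0 to continue, non-zero to stop. */
typedef long (*loop_cb_t)(uint32_t index, void *ctx);

static long loop_sketch(uint32_t nr_loops, loop_cb_t cb, void *ctx)
{
	uint32_t cnt = 0;	/* plays the role of reg_loop_cnt (R7) */

	assert(cb != NULL);
	if (nr_loops > SKETCH_MAX_LOOPS)
		return -E2BIG;	/* mirrors the first three instructions */

	while (cnt < nr_loops) {	/* loop header: exit when cnt >= max */
		long ret = cb(cnt, ctx);

		cnt++;		/* counter is bumped before the result is tested */
		if (ret != 0)
			break;
	}
	return cnt;		/* R0 = number of iterations performed */
}

/* Example callback: counts its invocations and stops at index 3. */
static long stop_at_three(uint32_t index, void *ctx)
{
	(*(int *)ctx)++;
	return index == 3;
}
```

Because the increment happens before the test, a callback that stops at index 3 yields a return value of 4, which is the number of callback invocations rather than the last index.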
+2
kernel/trace/trace_uprobe.c
···
 	int size, esize;
 	int rctx;

+#ifdef CONFIG_BPF_EVENTS
 	if (bpf_prog_array_valid(call)) {
 		u32 ret;

···
 		if (!ret)
 			return;
 	}
+#endif /* CONFIG_BPF_EVENTS */

 	esize = SIZEOF_TRACE_ENTRY(is_ret_probe(tu));
+2 -2
lib/test_bpf.c
···
 		.build_skb = build_test_skb_linear_no_head_frag,
 		.features = NETIF_F_SG | NETIF_F_FRAGLIST |
 			    NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_GSO |
-			    NETIF_F_LLTX_BIT | NETIF_F_GRO |
+			    NETIF_F_LLTX | NETIF_F_GRO |
 			    NETIF_F_IPV6_CSUM | NETIF_F_RXCSUM |
-			    NETIF_F_HW_VLAN_STAG_TX_BIT
+			    NETIF_F_HW_VLAN_STAG_TX
 	}
 };
+55 -10
net/core/filter.c
···
 	.arg1_type = ARG_PTR_TO_CTX,
 };

-static int _bpf_setsockopt(struct sock *sk, int level, int optname,
-			   char *optval, int optlen)
+static int __bpf_setsockopt(struct sock *sk, int level, int optname,
+			    char *optval, int optlen)
 {
 	char devname[IFNAMSIZ];
 	int val, valbool;
···

 	if (!sk_fullsock(sk))
 		return -EINVAL;
-
-	sock_owned_by_me(sk);

 	if (level == SOL_SOCKET) {
 		if (optlen != sizeof(int) && optname != SO_BINDTODEVICE)
···
 	return ret;
 }

-static int _bpf_getsockopt(struct sock *sk, int level, int optname,
+static int _bpf_setsockopt(struct sock *sk, int level, int optname,
 			   char *optval, int optlen)
+{
+	if (sk_fullsock(sk))
+		sock_owned_by_me(sk);
+	return __bpf_setsockopt(sk, level, optname, optval, optlen);
+}
+
+static int __bpf_getsockopt(struct sock *sk, int level, int optname,
+			    char *optval, int optlen)
 {
 	if (!sk_fullsock(sk))
 		goto err_clear;
-
-	sock_owned_by_me(sk);

 	if (level == SOL_SOCKET) {
 		if (optlen != sizeof(int))
···
 	return -EINVAL;
 }

+static int _bpf_getsockopt(struct sock *sk, int level, int optname,
+			   char *optval, int optlen)
+{
+	if (sk_fullsock(sk))
+		sock_owned_by_me(sk);
+	return __bpf_getsockopt(sk, level, optname, optval, optlen);
+}
+
 BPF_CALL_5(bpf_sk_setsockopt, struct sock *, sk, int, level,
 	   int, optname, char *, optval, int, optlen)
 {
···

 const struct bpf_func_proto bpf_sk_getsockopt_proto = {
 	.func = bpf_sk_getsockopt,
+	.gpl_only = false,
+	.ret_type = RET_INTEGER,
+	.arg1_type = ARG_PTR_TO_BTF_ID_SOCK_COMMON,
+	.arg2_type = ARG_ANYTHING,
+	.arg3_type = ARG_ANYTHING,
+	.arg4_type = ARG_PTR_TO_UNINIT_MEM,
+	.arg5_type = ARG_CONST_SIZE,
+};
+
+BPF_CALL_5(bpf_unlocked_sk_setsockopt, struct sock *, sk, int, level,
+	   int, optname, char *, optval, int, optlen)
+{
+	return __bpf_setsockopt(sk, level, optname, optval, optlen);
+}
+
+const struct bpf_func_proto bpf_unlocked_sk_setsockopt_proto = {
+	.func = bpf_unlocked_sk_setsockopt,
+	.gpl_only = false,
+	.ret_type = RET_INTEGER,
+	.arg1_type = ARG_PTR_TO_BTF_ID_SOCK_COMMON,
+	.arg2_type = ARG_ANYTHING,
+	.arg3_type = ARG_ANYTHING,
+	.arg4_type = ARG_PTR_TO_MEM | MEM_RDONLY,
+	.arg5_type = ARG_CONST_SIZE,
+};
+
+BPF_CALL_5(bpf_unlocked_sk_getsockopt, struct sock *, sk, int, level,
+	   int, optname, char *, optval, int, optlen)
+{
+	return __bpf_getsockopt(sk, level, optname, optval, optlen);
+}
+
+const struct bpf_func_proto bpf_unlocked_sk_getsockopt_proto = {
+	.func = bpf_unlocked_sk_getsockopt,
 	.gpl_only = false,
 	.ret_type = RET_INTEGER,
 	.arg1_type = ARG_PTR_TO_BTF_ID_SOCK_COMMON,
···
 		 u64 flags)
 {
 	struct sock *sk = NULL;
-	u8 family = AF_UNSPEC;
 	struct net *net;
+	u8 family;
 	int sdif;

 	if (len == sizeof(tuple->ipv4))
···
 	else
 		return NULL;

-	if (unlikely(family == AF_UNSPEC || flags ||
-		     !((s32)netns_id < 0 || netns_id <= S32_MAX)))
+	if (unlikely(flags || !((s32)netns_id < 0 || netns_id <= S32_MAX)))
 		goto out;

 	if (family == AF_INET)
+18 -30
net/core/skmsg.c
···
 }
 EXPORT_SYMBOL_GPL(sk_msg_is_readable);

-static struct sk_msg *sk_psock_create_ingress_msg(struct sock *sk,
-						  struct sk_buff *skb)
+static struct sk_msg *alloc_sk_msg(void)
 {
 	struct sk_msg *msg;

+	msg = kzalloc(sizeof(*msg), __GFP_NOWARN | GFP_KERNEL);
+	if (unlikely(!msg))
+		return NULL;
+	sg_init_marker(msg->sg.data, NR_MSG_FRAG_IDS);
+	return msg;
+}
+
+static struct sk_msg *sk_psock_create_ingress_msg(struct sock *sk,
+						  struct sk_buff *skb)
+{
 	if (atomic_read(&sk->sk_rmem_alloc) > sk->sk_rcvbuf)
 		return NULL;

 	if (!sk_rmem_schedule(sk, skb, skb->truesize))
 		return NULL;

-	msg = kzalloc(sizeof(*msg), __GFP_NOWARN | GFP_KERNEL);
-	if (unlikely(!msg))
-		return NULL;
-
-	sk_msg_init(msg);
-	return msg;
+	return alloc_sk_msg();
 }

 static int sk_psock_skb_ingress_enqueue(struct sk_buff *skb,
···
 static int sk_psock_skb_ingress_self(struct sk_psock *psock, struct sk_buff *skb,
 				     u32 off, u32 len)
 {
-	struct sk_msg *msg = kzalloc(sizeof(*msg), __GFP_NOWARN | GFP_ATOMIC);
+	struct sk_msg *msg = alloc_sk_msg();
 	struct sock *sk = psock->sk;
 	int err;

 	if (unlikely(!msg))
 		return -EAGAIN;
-	sk_msg_init(msg);
 	skb_set_owner_r(skb, sk);
 	err = sk_psock_skb_ingress_enqueue(skb, off, len, psock, sk, msg);
 	if (err < 0)
···
 }
 #endif /* CONFIG_BPF_STREAM_PARSER */

-static int sk_psock_verdict_recv(read_descriptor_t *desc, struct sk_buff *skb,
-				 unsigned int offset, size_t orig_len)
+static int sk_psock_verdict_recv(struct sock *sk, struct sk_buff *skb)
 {
-	struct sock *sk = (struct sock *)desc->arg.data;
 	struct sk_psock *psock;
 	struct bpf_prog *prog;
 	int ret = __SK_DROP;
-	int len = orig_len;
+	int len = skb->len;

-	/* clone here so sk_eat_skb() in tcp_read_sock does not drop our data */
-	skb = skb_clone(skb, GFP_ATOMIC);
-	if (!skb) {
-		desc->error = -ENOMEM;
-		return 0;
-	}
+	skb_get(skb);

 	rcu_read_lock();
 	psock = sk_psock(sk);
···
 	if (!prog)
 		prog = READ_ONCE(psock->progs.skb_verdict);
 	if (likely(prog)) {
-		skb->sk = sk;
 		skb_dst_drop(skb);
 		skb_bpf_redirect_clear(skb);
 		ret = bpf_prog_run_pin_on_cpu(prog, skb);
 		ret = sk_psock_map_verd(ret, skb_bpf_redirect_fetch(skb));
-		skb->sk = NULL;
 	}
 	if (sk_psock_verdict_apply(psock, skb, ret) < 0)
 		len = 0;
···
 static void sk_psock_verdict_data_ready(struct sock *sk)
 {
 	struct socket *sock = sk->sk_socket;
-	read_descriptor_t desc;

-	if (unlikely(!sock || !sock->ops || !sock->ops->read_sock))
+	if (unlikely(!sock || !sock->ops || !sock->ops->read_skb))
 		return;
-
-	desc.arg.data = sk;
-	desc.error = 0;
-	desc.count = 1;
-
-	sock->ops->read_sock(sk, &desc, sk_psock_verdict_recv);
+	sock->ops->read_skb(sk, sk_psock_verdict_recv);
 }

 void sk_psock_start_verdict(struct sock *sk, struct sk_psock *psock)
+1 -1
net/core/sock_map.c
···
 	saved_destroy = psock->saved_destroy;
 	sock_map_remove_links(sk, psock);
 	rcu_read_unlock();
-	sk_psock_stop(psock, true);
+	sk_psock_stop(psock, false);
 	sk_psock_put(sk, psock);
 	saved_destroy(sk);
 }
+2 -1
net/ipv4/af_inet.c
···
 	.sendpage = inet_sendpage,
 	.splice_read = tcp_splice_read,
 	.read_sock = tcp_read_sock,
+	.read_skb = tcp_read_skb,
 	.sendmsg_locked = tcp_sendmsg_locked,
 	.sendpage_locked = tcp_sendpage_locked,
 	.peek_len = tcp_peek_len,
···
 	.setsockopt = sock_common_setsockopt,
 	.getsockopt = sock_common_getsockopt,
 	.sendmsg = inet_sendmsg,
-	.read_sock = udp_read_sock,
+	.read_skb = udp_read_skb,
 	.recvmsg = inet_recvmsg,
 	.mmap = sock_no_mmap,
 	.sendpage = inet_sendpage,
+6 -33
net/ipv4/bpf_tcp_ca.c
···
 /* "extern" is to avoid sparse warning. It is only used in bpf_struct_ops.c. */
 extern struct bpf_struct_ops bpf_tcp_congestion_ops;

-static u32 optional_ops[] = {
-	offsetof(struct tcp_congestion_ops, init),
-	offsetof(struct tcp_congestion_ops, release),
-	offsetof(struct tcp_congestion_ops, set_state),
-	offsetof(struct tcp_congestion_ops, cwnd_event),
-	offsetof(struct tcp_congestion_ops, in_ack_event),
-	offsetof(struct tcp_congestion_ops, pkts_acked),
-	offsetof(struct tcp_congestion_ops, min_tso_segs),
-	offsetof(struct tcp_congestion_ops, sndbuf_expand),
-	offsetof(struct tcp_congestion_ops, cong_control),
-};
-
 static u32 unsupported_ops[] = {
 	offsetof(struct tcp_congestion_ops, get_info),
 };
···
 	tcp_sock_type = btf_type_by_id(btf, tcp_sock_id);

 	return 0;
-}
-
-static bool is_optional(u32 member_offset)
-{
-	unsigned int i;
-
-	for (i = 0; i < ARRAY_SIZE(optional_ops); i++) {
-		if (member_offset == optional_ops[i])
-			return true;
-	}
-
-	return false;
 }

 static bool is_unsupported(u32 member_offset)
···
 	}

 	switch (off) {
+	case offsetof(struct sock, sk_pacing_rate):
+		end = offsetofend(struct sock, sk_pacing_rate);
+		break;
+	case offsetof(struct sock, sk_pacing_status):
+		end = offsetofend(struct sock, sk_pacing_status);
+		break;
 	case bpf_ctx_range(struct inet_connection_sock, icsk_ca_priv):
 		end = offsetofend(struct inet_connection_sock, icsk_ca_priv);
 		break;
···
 {
 	const struct tcp_congestion_ops *utcp_ca;
 	struct tcp_congestion_ops *tcp_ca;
-	int prog_fd;
 	u32 moff;

 	utcp_ca = (const struct tcp_congestion_ops *)udata;
···
 			return -EEXIST;
 		return 1;
 	}
-
-	if (!btf_type_resolve_func_ptr(btf_vmlinux, member->type, NULL))
-		return 0;
-
-	/* Ensure bpf_prog is provided for compulsory func ptr */
-	prog_fd = (int)(*(unsigned long *)(udata + moff));
-	if (!prog_fd && !is_optional(moff) && !is_unsupported(moff))
-		return -EINVAL;

 	return 0;
 }
+44
net/ipv4/tcp.c
···
 }
 EXPORT_SYMBOL(tcp_read_sock);

+int tcp_read_skb(struct sock *sk, skb_read_actor_t recv_actor)
+{
+	struct tcp_sock *tp = tcp_sk(sk);
+	u32 seq = tp->copied_seq;
+	struct sk_buff *skb;
+	int copied = 0;
+	u32 offset;
+
+	if (sk->sk_state == TCP_LISTEN)
+		return -ENOTCONN;
+
+	while ((skb = tcp_recv_skb(sk, seq, &offset)) != NULL) {
+		int used;
+
+		__skb_unlink(skb, &sk->sk_receive_queue);
+		used = recv_actor(sk, skb);
+		if (used <= 0) {
+			if (!copied)
+				copied = used;
+			break;
+		}
+		seq += used;
+		copied += used;
+
+		if (TCP_SKB_CB(skb)->tcp_flags & TCPHDR_FIN) {
+			consume_skb(skb);
+			++seq;
+			break;
+		}
+		consume_skb(skb);
+		break;
+	}
+	WRITE_ONCE(tp->copied_seq, seq);
+
+	tcp_rcv_space_adjust(sk);
+
+	/* Clean up data we have read: This will do ACK frames. */
+	if (copied > 0)
+		tcp_cleanup_rbuf(sk, copied);
+
+	return copied;
+}
+EXPORT_SYMBOL(tcp_read_skb);
+
 int tcp_peek_len(struct socket *sock)
 {
 	return tcp_inq(sock->sk);
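The read_skb() callbacks introduced here all share one shape: take a single buffer off the receive queue, pass it to an actor as (sk, skb), and report how many bytes the actor used. A rough user-space sketch of that shape follows; pkt_sketch, queue_sketch, pkt_actor_t and read_one_sketch are hypothetical stand-ins, not kernel types, and the TCP sequence-number and ACK bookkeeping is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct sk_buff and the socket receive queue. */
struct pkt_sketch {
	int len;
};

struct queue_sketch {
	struct pkt_sketch *items;
	size_t head;
	size_t count;
};

/* Same shape as the new actor: (owner, packet) -> bytes used, or <= 0. */
typedef int (*pkt_actor_t)(void *owner, struct pkt_sketch *pkt);

/* Dequeue at most one packet per call and hand it to the actor. */
static int read_one_sketch(struct queue_sketch *q, void *owner, pkt_actor_t actor)
{
	struct pkt_sketch *pkt;

	assert(q != NULL && actor != NULL);
	if (q->head == q->count)
		return 0;		/* nothing queued */

	pkt = &q->items[q->head++];	/* unlink before calling the actor */
	return actor(owner, pkt);	/* > 0: bytes used; <= 0: passed through */
}

/* Example actor: accumulates the packet length into an int owned by the caller. */
static int take_all(void *owner, struct pkt_sketch *pkt)
{
	*(int *)owner += pkt->len;
	return pkt->len;
}
```

The design point the sketch tries to capture is that the actor no longer needs a read descriptor, offset, or length: the owner and the buffer are enough, and the caller processes one buffer per invocation.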
+5 -6
net/ipv4/udp.c
···
 }
 EXPORT_SYMBOL(__skb_recv_udp);

-int udp_read_sock(struct sock *sk, read_descriptor_t *desc,
-		  sk_read_actor_t recv_actor)
+int udp_read_skb(struct sock *sk, skb_read_actor_t recv_actor)
 {
 	int copied = 0;

···
 			continue;
 		}

-		used = recv_actor(desc, skb, 0, skb->len);
+		WARN_ON(!skb_set_owner_sk_safe(skb, sk));
+		used = recv_actor(sk, skb);
 		if (used <= 0) {
 			if (!copied)
 				copied = used;
···
 		}

 		kfree_skb(skb);
-		if (!desc->count)
-			break;
+		break;
 	}

 	return copied;
 }
-EXPORT_SYMBOL(udp_read_sock);
+EXPORT_SYMBOL(udp_read_skb);

 /*
  * This should be easy, if there is something there we
+2 -1
net/ipv6/af_inet6.c
···
 	.sendpage_locked = tcp_sendpage_locked,
 	.splice_read = tcp_splice_read,
 	.read_sock = tcp_read_sock,
+	.read_skb = tcp_read_skb,
 	.peek_len = tcp_peek_len,
 #ifdef CONFIG_COMPAT
 	.compat_ioctl = inet6_compat_ioctl,
···
 	.getsockopt = sock_common_getsockopt, /* ok */
 	.sendmsg = inet6_sendmsg, /* retpoline's sake */
 	.recvmsg = inet6_recvmsg, /* retpoline's sake */
-	.read_sock = udp_read_sock,
+	.read_skb = udp_read_skb,
 	.mmap = sock_no_mmap,
 	.sendpage = sock_no_sendpage,
 	.set_peek_off = sk_set_peek_off,
+9 -14
net/unix/af_unix.c
···
 				  unsigned int flags);
 static int unix_dgram_sendmsg(struct socket *, struct msghdr *, size_t);
 static int unix_dgram_recvmsg(struct socket *, struct msghdr *, size_t, int);
-static int unix_read_sock(struct sock *sk, read_descriptor_t *desc,
-			  sk_read_actor_t recv_actor);
-static int unix_stream_read_sock(struct sock *sk, read_descriptor_t *desc,
-				 sk_read_actor_t recv_actor);
+static int unix_read_skb(struct sock *sk, skb_read_actor_t recv_actor);
+static int unix_stream_read_skb(struct sock *sk, skb_read_actor_t recv_actor);
 static int unix_dgram_connect(struct socket *, struct sockaddr *,
 			      int, int);
 static int unix_seqpacket_sendmsg(struct socket *, struct msghdr *, size_t);
···
 	.shutdown = unix_shutdown,
 	.sendmsg = unix_stream_sendmsg,
 	.recvmsg = unix_stream_recvmsg,
-	.read_sock = unix_stream_read_sock,
+	.read_skb = unix_stream_read_skb,
 	.mmap = sock_no_mmap,
 	.sendpage = unix_stream_sendpage,
 	.splice_read = unix_stream_splice_read,
···
 	.listen = sock_no_listen,
 	.shutdown = unix_shutdown,
 	.sendmsg = unix_dgram_sendmsg,
-	.read_sock = unix_read_sock,
+	.read_skb = unix_read_skb,
 	.recvmsg = unix_dgram_recvmsg,
 	.mmap = sock_no_mmap,
 	.sendpage = sock_no_sendpage,
···
 	return __unix_dgram_recvmsg(sk, msg, size, flags);
 }

-static int unix_read_sock(struct sock *sk, read_descriptor_t *desc,
-			  sk_read_actor_t recv_actor)
+static int unix_read_skb(struct sock *sk, skb_read_actor_t recv_actor)
 {
 	int copied = 0;

···
 		if (!skb)
 			return err;

-		used = recv_actor(desc, skb, 0, skb->len);
+		used = recv_actor(sk, skb);
 		if (used <= 0) {
 			if (!copied)
 				copied = used;
···
 		}

 		kfree_skb(skb);
-		if (!desc->count)
-			break;
+		break;
 	}

 	return copied;
···
 }
 #endif

-static int unix_stream_read_sock(struct sock *sk, read_descriptor_t *desc,
-				 sk_read_actor_t recv_actor)
+static int unix_stream_read_skb(struct sock *sk, skb_read_actor_t recv_actor)
 {
 	if (unlikely(sk->sk_state != TCP_ESTABLISHED))
 		return -ENOTCONN;

-	return unix_read_sock(sk, desc, recv_actor);
+	return unix_read_skb(sk, recv_actor);
 }

 static int unix_stream_read_generic(struct unix_stream_read_state *state,
-9
samples/bpf/Makefile
···
 tprogs-y += syscall_tp
 tprogs-y += cpustat
 tprogs-y += xdp_adjust_tail
-tprogs-y += xdpsock
-tprogs-y += xdpsock_ctrl_proc
-tprogs-y += xsk_fwd
 tprogs-y += xdp_fwd
 tprogs-y += task_fd_query
 tprogs-y += xdp_sample_pkts
···
 syscall_tp-objs := syscall_tp_user.o
 cpustat-objs := cpustat_user.o
 xdp_adjust_tail-objs := xdp_adjust_tail_user.o
-xdpsock-objs := xdpsock_user.o
-xdpsock_ctrl_proc-objs := xdpsock_ctrl_proc.o
-xsk_fwd-objs := xsk_fwd.o
 xdp_fwd-objs := xdp_fwd_user.o
 task_fd_query-objs := task_fd_query_user.o $(TRACE_HELPERS)
 xdp_sample_pkts-objs := xdp_sample_pkts_user.o
···
 always-y += ibumad_kern.o
 always-y += hbm_out_kern.o
 always-y += hbm_edt_kern.o
-always-y += xdpsock_kern.o

 ifeq ($(ARCH), arm)
 # Strip all except -D__LINUX_ARM_ARCH__ option needed to handle linux
···
 TPROGLDLIBS_trace_output += -lrt
 TPROGLDLIBS_map_perf_test += -lrt
 TPROGLDLIBS_test_overhead += -lrt
-TPROGLDLIBS_xdpsock += -pthread -lcap
-TPROGLDLIBS_xsk_fwd += -pthread

 # Allows pointing LLC/CLANG to a LLVM backend with bpf support, redefine on cmdline:
 # make M=samples/bpf LLC=~/git/llvm-project/llvm/build/bin/llc CLANG=~/git/llvm-project/llvm/build/bin/clang
+8 -3
samples/bpf/xdp1_kern.c
···
 	return ip6h->nexthdr;
 }

-SEC("xdp1")
+#define XDPBUFSIZE 64
+SEC("xdp.frags")
 int xdp_prog1(struct xdp_md *ctx)
 {
-	void *data_end = (void *)(long)ctx->data_end;
-	void *data = (void *)(long)ctx->data;
+	__u8 pkt[XDPBUFSIZE] = {};
+	void *data_end = &pkt[XDPBUFSIZE-1];
+	void *data = pkt;
 	struct ethhdr *eth = data;
 	int rc = XDP_DROP;
 	long *value;
 	u16 h_proto;
 	u64 nh_off;
 	u32 ipproto;
+
+	if (bpf_xdp_load_bytes(ctx, 0, pkt, sizeof(pkt)))
+		return rc;

 	nh_off = sizeof(*eth);
 	if (data + nh_off > data_end)
+8 -3
samples/bpf/xdp2_kern.c
···
 	return ip6h->nexthdr;
 }

-SEC("xdp1")
+#define XDPBUFSIZE 64
+SEC("xdp.frags")
 int xdp_prog1(struct xdp_md *ctx)
 {
-	void *data_end = (void *)(long)ctx->data_end;
-	void *data = (void *)(long)ctx->data;
+	__u8 pkt[XDPBUFSIZE] = {};
+	void *data_end = &pkt[XDPBUFSIZE-1];
+	void *data = pkt;
 	struct ethhdr *eth = data;
 	int rc = XDP_DROP;
 	long *value;
 	u16 h_proto;
 	u64 nh_off;
 	u32 ipproto;
+
+	if (bpf_xdp_load_bytes(ctx, 0, pkt, sizeof(pkt)))
+		return rc;

 	nh_off = sizeof(*eth);
 	if (data + nh_off > data_end)
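The xdp1 and xdp2 samples now copy the packet head into a fixed stack buffer and parse the copy instead of dereferencing ctx->data directly. A user-space sketch of that pattern is below; load_bytes_sketch is a hypothetical stand-in for bpf_xdp_load_bytes() and is assumed to fail when the requested range does not fit in the frame, and the sketch uses a one-past-the-end data_end where the sample uses the address of the last buffer element.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BUF_SIZE_SKETCH 64

/* Hypothetical stand-in for bpf_xdp_load_bytes(): 0 on success, -1 if the
 * requested range does not fit inside the frame (assumed behaviour).
 */
static int load_bytes_sketch(const uint8_t *frame, size_t frame_len,
			     size_t off, void *buf, size_t len)
{
	if (off > frame_len || len > frame_len - off)
		return -1;
	memcpy(buf, frame + off, len);
	return 0;
}

/* Copy the first BUF_SIZE_SKETCH bytes, then read the EtherType from the copy. */
static int ether_type_sketch(const uint8_t *frame, size_t frame_len,
			     uint16_t *proto)
{
	uint8_t pkt[BUF_SIZE_SKETCH] = {0};
	const uint8_t *data = pkt;
	const uint8_t *data_end = pkt + sizeof(pkt);	/* one past the copy */
	const size_t eth_hlen = 14;			/* dst(6) + src(6) + type(2) */

	assert(frame != NULL && proto != NULL);
	if (load_bytes_sketch(frame, frame_len, 0, pkt, sizeof(pkt)))
		return -1;	/* frame shorter than the buffer */
	if (data + eth_hlen > data_end)
		return -1;
	*proto = (uint16_t)((data[12] << 8) | data[13]);	/* network byte order */
	return 0;
}
```

Under the stated assumption, a consequence of this pattern is that frames shorter than the buffer never reach the parsing code, which in the samples corresponds to returning the default XDP_DROP verdict.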
+1 -1
samples/bpf/xdp_tx_iptunnel_kern.c
···
 	return XDP_TX;
 }

-SEC("xdp_tx_iptunnel")
+SEC("xdp.frags")
 int _xdp_tx_iptunnel(struct xdp_md *xdp)
 {
 	void *data_end = (void *)(long)xdp->data_end;
-19
samples/bpf/xdpsock.h
···
-/* SPDX-License-Identifier: GPL-2.0
- *
- * Copyright(c) 2019 Intel Corporation.
- */
-
-#ifndef XDPSOCK_H_
-#define XDPSOCK_H_
-
-#define MAX_SOCKS 4
-
-#define SOCKET_NAME "sock_cal_bpf_fd"
-#define MAX_NUM_OF_CLIENTS 10
-
-#define CLOSE_CONN 1
-
-typedef __u64 u64;
-typedef __u32 u32;
-
-#endif /* XDPSOCK_H */
-190
samples/bpf/xdpsock_ctrl_proc.c
···
-// SPDX-License-Identifier: GPL-2.0
-/* Copyright(c) 2017 - 2018 Intel Corporation. */
-
-#include <errno.h>
-#include <getopt.h>
-#include <libgen.h>
-#include <net/if.h>
-#include <stdio.h>
-#include <stdlib.h>
-#include <sys/socket.h>
-#include <sys/un.h>
-#include <unistd.h>
-
-#include <bpf/bpf.h>
-#include <bpf/xsk.h>
-#include "xdpsock.h"
-
-/* libbpf APIs for AF_XDP are deprecated starting from v0.7 */
-#pragma GCC diagnostic ignored "-Wdeprecated-declarations"
-
-static const char *opt_if = "";
-
-static struct option long_options[] = {
-	{"interface", required_argument, 0, 'i'},
-	{0, 0, 0, 0}
-};
-
-static void usage(const char *prog)
-{
-	const char *str =
-		" Usage: %s [OPTIONS]\n"
-		" Options:\n"
-		" -i, --interface=n Run on interface n\n"
-		"\n";
-	fprintf(stderr, "%s\n", str);
-
-	exit(0);
-}
-
-static void parse_command_line(int argc, char **argv)
-{
-	int option_index, c;
-
-	opterr = 0;
-
-	for (;;) {
-		c = getopt_long(argc, argv, "i:",
-				long_options, &option_index);
-		if (c == -1)
-			break;
-
-		switch (c) {
-		case 'i':
-			opt_if = optarg;
-			break;
-		default:
-			usage(basename(argv[0]));
-		}
-	}
-}
-
-static int send_xsks_map_fd(int sock, int fd)
-{
-	char cmsgbuf[CMSG_SPACE(sizeof(int))];
-	struct msghdr msg;
-	struct iovec iov;
-	int value = 0;
-
-	if (fd == -1) {
-		fprintf(stderr, "Incorrect fd = %d\n", fd);
-		return -1;
-	}
-	iov.iov_base = &value;
-	iov.iov_len = sizeof(int);
-
-	msg.msg_name = NULL;
-	msg.msg_namelen = 0;
-	msg.msg_iov = &iov;
-	msg.msg_iovlen = 1;
-	msg.msg_flags = 0;
-	msg.msg_control = cmsgbuf;
-	msg.msg_controllen = CMSG_LEN(sizeof(int));
-
-	struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
-
-	cmsg->cmsg_level = SOL_SOCKET;
-	cmsg->cmsg_type = SCM_RIGHTS;
-	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
-
-	*(int *)CMSG_DATA(cmsg) = fd;
-	int ret = sendmsg(sock, &msg, 0);
-
-	if (ret == -1) {
-		fprintf(stderr, "Sendmsg failed with %s", strerror(errno));
-		return -errno;
-	}
-
-	return ret;
-}
-
-int
-main(int argc, char **argv)
-{
-	struct sockaddr_un server;
-	int listening = 1;
-	int rval, msgsock;
-	int ifindex = 0;
-	int flag = 1;
-	int cmd = 0;
-	int sock;
-	int err;
-	int xsks_map_fd;
-
-	parse_command_line(argc, argv);
-
-	ifindex = if_nametoindex(opt_if);
-	if (ifindex == 0) {
-		fprintf(stderr, "Unable to get ifindex for Interface %s. Reason:%s",
-			opt_if, strerror(errno));
-		return -errno;
-	}
-
-	sock = socket(AF_UNIX, SOCK_STREAM, 0);
-	if (sock < 0) {
-		fprintf(stderr, "Opening socket stream failed: %s", strerror(errno));
-		return -errno;
-	}
-
-	server.sun_family = AF_UNIX;
-	strcpy(server.sun_path, SOCKET_NAME);
-
-	setsockopt(sock, SOL_SOCKET, SO_REUSEADDR, &flag, sizeof(int));
-
-	if (bind(sock, (struct sockaddr *)&server, sizeof(struct sockaddr_un))) {
-		fprintf(stderr, "Binding to socket stream failed: %s", strerror(errno));
-		return -errno;
-	}
-
-	listen(sock, MAX_NUM_OF_CLIENTS);
-
-	err = xsk_setup_xdp_prog(ifindex, &xsks_map_fd);
-	if (err) {
-		fprintf(stderr, "Setup of xdp program failed\n");
-		goto close_sock;
-	}
-
-	while (listening) {
-		msgsock = accept(sock, 0, 0);
-		if (msgsock == -1) {
-			fprintf(stderr, "Error accepting connection: %s", strerror(errno));
-			err = -errno;
-			goto close_sock;
-		}
-		err = send_xsks_map_fd(msgsock, xsks_map_fd);
-		if (err <= 0) {
-			fprintf(stderr, "Error %d sending xsks_map_fd\n", err);
-			goto cleanup;
-		}
-		do {
-			rval = read(msgsock, &cmd, sizeof(int));
-			if (rval < 0) {
-				fprintf(stderr, "Error reading stream message");
-			} else {
-				if (cmd != CLOSE_CONN)
-					fprintf(stderr, "Recv unknown cmd = %d\n", cmd);
-				listening = 0;
-				break;
-			}
-		} while (rval > 0);
-	}
-	close(msgsock);
-	close(sock);
-	unlink(SOCKET_NAME);
-
-	/* Unset fd for given ifindex */
-	err = bpf_xdp_detach(ifindex, 0, NULL);
-	if (err) {
-		fprintf(stderr, "Error when unsetting bpf prog_fd for ifindex(%d)\n", ifindex);
-		return err;
-	}
-
-	return 0;
-
-cleanup:
-	close(msgsock);
-close_sock:
-	close(sock);
-	unlink(SOCKET_NAME);
-	return err;
-}
-24
samples/bpf/xdpsock_kern.c
···
-// SPDX-License-Identifier: GPL-2.0
-#include <linux/bpf.h>
-#include <bpf/bpf_helpers.h>
-#include "xdpsock.h"
-
-/* This XDP program is only needed for the XDP_SHARED_UMEM mode.
- * If you do not use this mode, libbpf can supply an XDP program for you.
- */
-
-struct {
-	__uint(type, BPF_MAP_TYPE_XSKMAP);
-	__uint(max_entries, MAX_SOCKS);
-	__uint(key_size, sizeof(int));
-	__uint(value_size, sizeof(int));
-} xsks_map SEC(".maps");
-
-static unsigned int rr;
-
-SEC("xdp_sock") int xdp_sock_prog(struct xdp_md *ctx)
-{
-	rr = (rr + 1) & (MAX_SOCKS - 1);
-
-	return bpf_redirect_map(&xsks_map, rr, XDP_DROP);
-}
-2019
samples/bpf/xdpsock_user.c
··· 1 - // SPDX-License-Identifier: GPL-2.0 2 - /* Copyright(c) 2017 - 2018 Intel Corporation. */ 3 - 4 - #include <errno.h> 5 - #include <getopt.h> 6 - #include <libgen.h> 7 - #include <linux/bpf.h> 8 - #include <linux/if_link.h> 9 - #include <linux/if_xdp.h> 10 - #include <linux/if_ether.h> 11 - #include <linux/ip.h> 12 - #include <linux/limits.h> 13 - #include <linux/udp.h> 14 - #include <arpa/inet.h> 15 - #include <locale.h> 16 - #include <net/ethernet.h> 17 - #include <netinet/ether.h> 18 - #include <net/if.h> 19 - #include <poll.h> 20 - #include <pthread.h> 21 - #include <signal.h> 22 - #include <stdbool.h> 23 - #include <stdio.h> 24 - #include <stdlib.h> 25 - #include <string.h> 26 - #include <sys/capability.h> 27 - #include <sys/mman.h> 28 - #include <sys/socket.h> 29 - #include <sys/types.h> 30 - #include <sys/un.h> 31 - #include <time.h> 32 - #include <unistd.h> 33 - #include <sched.h> 34 - 35 - #include <bpf/libbpf.h> 36 - #include <bpf/xsk.h> 37 - #include <bpf/bpf.h> 38 - #include "xdpsock.h" 39 - 40 - /* libbpf APIs for AF_XDP are deprecated starting from v0.7 */ 41 - #pragma GCC diagnostic ignored "-Wdeprecated-declarations" 42 - 43 - #ifndef SOL_XDP 44 - #define SOL_XDP 283 45 - #endif 46 - 47 - #ifndef AF_XDP 48 - #define AF_XDP 44 49 - #endif 50 - 51 - #ifndef PF_XDP 52 - #define PF_XDP AF_XDP 53 - #endif 54 - 55 - #define NUM_FRAMES (4 * 1024) 56 - #define MIN_PKT_SIZE 64 57 - 58 - #define DEBUG_HEXDUMP 0 59 - 60 - #define VLAN_PRIO_MASK 0xe000 /* Priority Code Point */ 61 - #define VLAN_PRIO_SHIFT 13 62 - #define VLAN_VID_MASK 0x0fff /* VLAN Identifier */ 63 - #define VLAN_VID__DEFAULT 1 64 - #define VLAN_PRI__DEFAULT 0 65 - 66 - #define NSEC_PER_SEC 1000000000UL 67 - #define NSEC_PER_USEC 1000 68 - 69 - #define SCHED_PRI__DEFAULT 0 70 - 71 - typedef __u64 u64; 72 - typedef __u32 u32; 73 - typedef __u16 u16; 74 - typedef __u8 u8; 75 - 76 - static unsigned long prev_time; 77 - static long tx_cycle_diff_min; 78 - static long tx_cycle_diff_max; 79 
- static double tx_cycle_diff_ave; 80 - static long tx_cycle_cnt; 81 - 82 - enum benchmark_type { 83 - BENCH_RXDROP = 0, 84 - BENCH_TXONLY = 1, 85 - BENCH_L2FWD = 2, 86 - }; 87 - 88 - static enum benchmark_type opt_bench = BENCH_RXDROP; 89 - static u32 opt_xdp_flags = XDP_FLAGS_UPDATE_IF_NOEXIST; 90 - static const char *opt_if = ""; 91 - static int opt_ifindex; 92 - static int opt_queue; 93 - static unsigned long opt_duration; 94 - static unsigned long start_time; 95 - static bool benchmark_done; 96 - static u32 opt_batch_size = 64; 97 - static int opt_pkt_count; 98 - static u16 opt_pkt_size = MIN_PKT_SIZE; 99 - static u32 opt_pkt_fill_pattern = 0x12345678; 100 - static bool opt_vlan_tag; 101 - static u16 opt_pkt_vlan_id = VLAN_VID__DEFAULT; 102 - static u16 opt_pkt_vlan_pri = VLAN_PRI__DEFAULT; 103 - static struct ether_addr opt_txdmac = {{ 0x3c, 0xfd, 0xfe, 104 - 0x9e, 0x7f, 0x71 }}; 105 - static struct ether_addr opt_txsmac = {{ 0xec, 0xb1, 0xd7, 106 - 0x98, 0x3a, 0xc0 }}; 107 - static bool opt_extra_stats; 108 - static bool opt_quiet; 109 - static bool opt_app_stats; 110 - static const char *opt_irq_str = ""; 111 - static u32 irq_no; 112 - static int irqs_at_init = -1; 113 - static u32 sequence; 114 - static int opt_poll; 115 - static int opt_interval = 1; 116 - static int opt_retries = 3; 117 - static u32 opt_xdp_bind_flags = XDP_USE_NEED_WAKEUP; 118 - static u32 opt_umem_flags; 119 - static int opt_unaligned_chunks; 120 - static int opt_mmap_flags; 121 - static int opt_xsk_frame_size = XSK_UMEM__DEFAULT_FRAME_SIZE; 122 - static int opt_timeout = 1000; 123 - static bool opt_need_wakeup = true; 124 - static u32 opt_num_xsks = 1; 125 - static u32 prog_id; 126 - static bool opt_busy_poll; 127 - static bool opt_reduced_cap; 128 - static clockid_t opt_clock = CLOCK_MONOTONIC; 129 - static unsigned long opt_tx_cycle_ns; 130 - static int opt_schpolicy = SCHED_OTHER; 131 - static int opt_schprio = SCHED_PRI__DEFAULT; 132 - static bool opt_tstamp; 133 - 134 - struct 
vlan_ethhdr { 135 - unsigned char h_dest[6]; 136 - unsigned char h_source[6]; 137 - __be16 h_vlan_proto; 138 - __be16 h_vlan_TCI; 139 - __be16 h_vlan_encapsulated_proto; 140 - }; 141 - 142 - #define PKTGEN_MAGIC 0xbe9be955 143 - struct pktgen_hdr { 144 - __be32 pgh_magic; 145 - __be32 seq_num; 146 - __be32 tv_sec; 147 - __be32 tv_usec; 148 - }; 149 - 150 - struct xsk_ring_stats { 151 - unsigned long rx_npkts; 152 - unsigned long tx_npkts; 153 - unsigned long rx_dropped_npkts; 154 - unsigned long rx_invalid_npkts; 155 - unsigned long tx_invalid_npkts; 156 - unsigned long rx_full_npkts; 157 - unsigned long rx_fill_empty_npkts; 158 - unsigned long tx_empty_npkts; 159 - unsigned long prev_rx_npkts; 160 - unsigned long prev_tx_npkts; 161 - unsigned long prev_rx_dropped_npkts; 162 - unsigned long prev_rx_invalid_npkts; 163 - unsigned long prev_tx_invalid_npkts; 164 - unsigned long prev_rx_full_npkts; 165 - unsigned long prev_rx_fill_empty_npkts; 166 - unsigned long prev_tx_empty_npkts; 167 - }; 168 - 169 - struct xsk_driver_stats { 170 - unsigned long intrs; 171 - unsigned long prev_intrs; 172 - }; 173 - 174 - struct xsk_app_stats { 175 - unsigned long rx_empty_polls; 176 - unsigned long fill_fail_polls; 177 - unsigned long copy_tx_sendtos; 178 - unsigned long tx_wakeup_sendtos; 179 - unsigned long opt_polls; 180 - unsigned long prev_rx_empty_polls; 181 - unsigned long prev_fill_fail_polls; 182 - unsigned long prev_copy_tx_sendtos; 183 - unsigned long prev_tx_wakeup_sendtos; 184 - unsigned long prev_opt_polls; 185 - }; 186 - 187 - struct xsk_umem_info { 188 - struct xsk_ring_prod fq; 189 - struct xsk_ring_cons cq; 190 - struct xsk_umem *umem; 191 - void *buffer; 192 - }; 193 - 194 - struct xsk_socket_info { 195 - struct xsk_ring_cons rx; 196 - struct xsk_ring_prod tx; 197 - struct xsk_umem_info *umem; 198 - struct xsk_socket *xsk; 199 - struct xsk_ring_stats ring_stats; 200 - struct xsk_app_stats app_stats; 201 - struct xsk_driver_stats drv_stats; 202 - u32 
outstanding_tx; 203 - }; 204 - 205 - static const struct clockid_map { 206 - const char *name; 207 - clockid_t clockid; 208 - } clockids_map[] = { 209 - { "REALTIME", CLOCK_REALTIME }, 210 - { "TAI", CLOCK_TAI }, 211 - { "BOOTTIME", CLOCK_BOOTTIME }, 212 - { "MONOTONIC", CLOCK_MONOTONIC }, 213 - { NULL } 214 - }; 215 - 216 - static const struct sched_map { 217 - const char *name; 218 - int policy; 219 - } schmap[] = { 220 - { "OTHER", SCHED_OTHER }, 221 - { "FIFO", SCHED_FIFO }, 222 - { NULL } 223 - }; 224 - 225 - static int num_socks; 226 - struct xsk_socket_info *xsks[MAX_SOCKS]; 227 - int sock; 228 - 229 - static int get_clockid(clockid_t *id, const char *name) 230 - { 231 - const struct clockid_map *clk; 232 - 233 - for (clk = clockids_map; clk->name; clk++) { 234 - if (strcasecmp(clk->name, name) == 0) { 235 - *id = clk->clockid; 236 - return 0; 237 - } 238 - } 239 - 240 - return -1; 241 - } 242 - 243 - static int get_schpolicy(int *policy, const char *name) 244 - { 245 - const struct sched_map *sch; 246 - 247 - for (sch = schmap; sch->name; sch++) { 248 - if (strcasecmp(sch->name, name) == 0) { 249 - *policy = sch->policy; 250 - return 0; 251 - } 252 - } 253 - 254 - return -1; 255 - } 256 - 257 - static unsigned long get_nsecs(void) 258 - { 259 - struct timespec ts; 260 - 261 - clock_gettime(opt_clock, &ts); 262 - return ts.tv_sec * 1000000000UL + ts.tv_nsec; 263 - } 264 - 265 - static void print_benchmark(bool running) 266 - { 267 - const char *bench_str = "INVALID"; 268 - 269 - if (opt_bench == BENCH_RXDROP) 270 - bench_str = "rxdrop"; 271 - else if (opt_bench == BENCH_TXONLY) 272 - bench_str = "txonly"; 273 - else if (opt_bench == BENCH_L2FWD) 274 - bench_str = "l2fwd"; 275 - 276 - printf("%s:%d %s ", opt_if, opt_queue, bench_str); 277 - if (opt_xdp_flags & XDP_FLAGS_SKB_MODE) 278 - printf("xdp-skb "); 279 - else if (opt_xdp_flags & XDP_FLAGS_DRV_MODE) 280 - printf("xdp-drv "); 281 - else 282 - printf(" "); 283 - 284 - if (opt_poll) 285 - printf("poll() 
"); 286 - 287 - if (running) { 288 - printf("running..."); 289 - fflush(stdout); 290 - } 291 - } 292 - 293 - static int xsk_get_xdp_stats(int fd, struct xsk_socket_info *xsk) 294 - { 295 - struct xdp_statistics stats; 296 - socklen_t optlen; 297 - int err; 298 - 299 - optlen = sizeof(stats); 300 - err = getsockopt(fd, SOL_XDP, XDP_STATISTICS, &stats, &optlen); 301 - if (err) 302 - return err; 303 - 304 - if (optlen == sizeof(struct xdp_statistics)) { 305 - xsk->ring_stats.rx_dropped_npkts = stats.rx_dropped; 306 - xsk->ring_stats.rx_invalid_npkts = stats.rx_invalid_descs; 307 - xsk->ring_stats.tx_invalid_npkts = stats.tx_invalid_descs; 308 - xsk->ring_stats.rx_full_npkts = stats.rx_ring_full; 309 - xsk->ring_stats.rx_fill_empty_npkts = stats.rx_fill_ring_empty_descs; 310 - xsk->ring_stats.tx_empty_npkts = stats.tx_ring_empty_descs; 311 - return 0; 312 - } 313 - 314 - return -EINVAL; 315 - } 316 - 317 - static void dump_app_stats(long dt) 318 - { 319 - int i; 320 - 321 - for (i = 0; i < num_socks && xsks[i]; i++) { 322 - char *fmt = "%-18s %'-14.0f %'-14lu\n"; 323 - double rx_empty_polls_ps, fill_fail_polls_ps, copy_tx_sendtos_ps, 324 - tx_wakeup_sendtos_ps, opt_polls_ps; 325 - 326 - rx_empty_polls_ps = (xsks[i]->app_stats.rx_empty_polls - 327 - xsks[i]->app_stats.prev_rx_empty_polls) * 1000000000. / dt; 328 - fill_fail_polls_ps = (xsks[i]->app_stats.fill_fail_polls - 329 - xsks[i]->app_stats.prev_fill_fail_polls) * 1000000000. / dt; 330 - copy_tx_sendtos_ps = (xsks[i]->app_stats.copy_tx_sendtos - 331 - xsks[i]->app_stats.prev_copy_tx_sendtos) * 1000000000. / dt; 332 - tx_wakeup_sendtos_ps = (xsks[i]->app_stats.tx_wakeup_sendtos - 333 - xsks[i]->app_stats.prev_tx_wakeup_sendtos) 334 - * 1000000000. / dt; 335 - opt_polls_ps = (xsks[i]->app_stats.opt_polls - 336 - xsks[i]->app_stats.prev_opt_polls) * 1000000000. 
/ dt; 337 - 338 - printf("\n%-18s %-14s %-14s\n", "", "calls/s", "count"); 339 - printf(fmt, "rx empty polls", rx_empty_polls_ps, xsks[i]->app_stats.rx_empty_polls); 340 - printf(fmt, "fill fail polls", fill_fail_polls_ps, 341 - xsks[i]->app_stats.fill_fail_polls); 342 - printf(fmt, "copy tx sendtos", copy_tx_sendtos_ps, 343 - xsks[i]->app_stats.copy_tx_sendtos); 344 - printf(fmt, "tx wakeup sendtos", tx_wakeup_sendtos_ps, 345 - xsks[i]->app_stats.tx_wakeup_sendtos); 346 - printf(fmt, "opt polls", opt_polls_ps, xsks[i]->app_stats.opt_polls); 347 - 348 - xsks[i]->app_stats.prev_rx_empty_polls = xsks[i]->app_stats.rx_empty_polls; 349 - xsks[i]->app_stats.prev_fill_fail_polls = xsks[i]->app_stats.fill_fail_polls; 350 - xsks[i]->app_stats.prev_copy_tx_sendtos = xsks[i]->app_stats.copy_tx_sendtos; 351 - xsks[i]->app_stats.prev_tx_wakeup_sendtos = xsks[i]->app_stats.tx_wakeup_sendtos; 352 - xsks[i]->app_stats.prev_opt_polls = xsks[i]->app_stats.opt_polls; 353 - } 354 - 355 - if (opt_tx_cycle_ns) { 356 - printf("\n%-18s %-10s %-10s %-10s %-10s %-10s\n", 357 - "", "period", "min", "ave", "max", "cycle"); 358 - printf("%-18s %-10lu %-10lu %-10lu %-10lu %-10lu\n", 359 - "Cyclic TX", opt_tx_cycle_ns, tx_cycle_diff_min, 360 - (long)(tx_cycle_diff_ave / tx_cycle_cnt), 361 - tx_cycle_diff_max, tx_cycle_cnt); 362 - } 363 - } 364 - 365 - static bool get_interrupt_number(void) 366 - { 367 - FILE *f_int_proc; 368 - char line[4096]; 369 - bool found = false; 370 - 371 - f_int_proc = fopen("/proc/interrupts", "r"); 372 - if (f_int_proc == NULL) { 373 - printf("Failed to open /proc/interrupts.\n"); 374 - return found; 375 - } 376 - 377 - while (!feof(f_int_proc) && !found) { 378 - /* Make sure to read a full line at a time */ 379 - if (fgets(line, sizeof(line), f_int_proc) == NULL || 380 - line[strlen(line) - 1] != '\n') { 381 - printf("Error reading from interrupts file\n"); 382 - break; 383 - } 384 - 385 - /* Extract interrupt number from line */ 386 - if (strstr(line, opt_irq_str) 
!= NULL) { 387 - irq_no = atoi(line); 388 - found = true; 389 - break; 390 - } 391 - } 392 - 393 - fclose(f_int_proc); 394 - 395 - return found; 396 - } 397 - 398 - static int get_irqs(void) 399 - { 400 - char count_path[PATH_MAX]; 401 - int total_intrs = -1; 402 - FILE *f_count_proc; 403 - char line[4096]; 404 - 405 - snprintf(count_path, sizeof(count_path), 406 - "/sys/kernel/irq/%i/per_cpu_count", irq_no); 407 - f_count_proc = fopen(count_path, "r"); 408 - if (f_count_proc == NULL) { 409 - printf("Failed to open %s\n", count_path); 410 - return total_intrs; 411 - } 412 - 413 - if (fgets(line, sizeof(line), f_count_proc) == NULL || 414 - line[strlen(line) - 1] != '\n') { 415 - printf("Error reading from %s\n", count_path); 416 - } else { 417 - static const char com[2] = ","; 418 - char *token; 419 - 420 - total_intrs = 0; 421 - token = strtok(line, com); 422 - while (token != NULL) { 423 - /* sum up interrupts across all cores */ 424 - total_intrs += atoi(token); 425 - token = strtok(NULL, com); 426 - } 427 - } 428 - 429 - fclose(f_count_proc); 430 - 431 - return total_intrs; 432 - } 433 - 434 - static void dump_driver_stats(long dt) 435 - { 436 - int i; 437 - 438 - for (i = 0; i < num_socks && xsks[i]; i++) { 439 - char *fmt = "%-18s %'-14.0f %'-14lu\n"; 440 - double intrs_ps; 441 - int n_ints = get_irqs(); 442 - 443 - if (n_ints < 0) { 444 - printf("error getting intr info for intr %i\n", irq_no); 445 - return; 446 - } 447 - xsks[i]->drv_stats.intrs = n_ints - irqs_at_init; 448 - 449 - intrs_ps = (xsks[i]->drv_stats.intrs - xsks[i]->drv_stats.prev_intrs) * 450 - 1000000000. 
/ dt; 451 - 452 - printf("\n%-18s %-14s %-14s\n", "", "intrs/s", "count"); 453 - printf(fmt, "irqs", intrs_ps, xsks[i]->drv_stats.intrs); 454 - 455 - xsks[i]->drv_stats.prev_intrs = xsks[i]->drv_stats.intrs; 456 - } 457 - } 458 - 459 - static void dump_stats(void) 460 - { 461 - unsigned long now = get_nsecs(); 462 - long dt = now - prev_time; 463 - int i; 464 - 465 - prev_time = now; 466 - 467 - for (i = 0; i < num_socks && xsks[i]; i++) { 468 - char *fmt = "%-18s %'-14.0f %'-14lu\n"; 469 - double rx_pps, tx_pps, dropped_pps, rx_invalid_pps, full_pps, fill_empty_pps, 470 - tx_invalid_pps, tx_empty_pps; 471 - 472 - rx_pps = (xsks[i]->ring_stats.rx_npkts - xsks[i]->ring_stats.prev_rx_npkts) * 473 - 1000000000. / dt; 474 - tx_pps = (xsks[i]->ring_stats.tx_npkts - xsks[i]->ring_stats.prev_tx_npkts) * 475 - 1000000000. / dt; 476 - 477 - printf("\n sock%d@", i); 478 - print_benchmark(false); 479 - printf("\n"); 480 - 481 - printf("%-18s %-14s %-14s %-14.2f\n", "", "pps", "pkts", 482 - dt / 1000000000.); 483 - printf(fmt, "rx", rx_pps, xsks[i]->ring_stats.rx_npkts); 484 - printf(fmt, "tx", tx_pps, xsks[i]->ring_stats.tx_npkts); 485 - 486 - xsks[i]->ring_stats.prev_rx_npkts = xsks[i]->ring_stats.rx_npkts; 487 - xsks[i]->ring_stats.prev_tx_npkts = xsks[i]->ring_stats.tx_npkts; 488 - 489 - if (opt_extra_stats) { 490 - if (!xsk_get_xdp_stats(xsk_socket__fd(xsks[i]->xsk), xsks[i])) { 491 - dropped_pps = (xsks[i]->ring_stats.rx_dropped_npkts - 492 - xsks[i]->ring_stats.prev_rx_dropped_npkts) * 493 - 1000000000. / dt; 494 - rx_invalid_pps = (xsks[i]->ring_stats.rx_invalid_npkts - 495 - xsks[i]->ring_stats.prev_rx_invalid_npkts) * 496 - 1000000000. / dt; 497 - tx_invalid_pps = (xsks[i]->ring_stats.tx_invalid_npkts - 498 - xsks[i]->ring_stats.prev_tx_invalid_npkts) * 499 - 1000000000. / dt; 500 - full_pps = (xsks[i]->ring_stats.rx_full_npkts - 501 - xsks[i]->ring_stats.prev_rx_full_npkts) * 502 - 1000000000. 
/ dt; 503 - fill_empty_pps = (xsks[i]->ring_stats.rx_fill_empty_npkts - 504 - xsks[i]->ring_stats.prev_rx_fill_empty_npkts) * 505 - 1000000000. / dt; 506 - tx_empty_pps = (xsks[i]->ring_stats.tx_empty_npkts - 507 - xsks[i]->ring_stats.prev_tx_empty_npkts) * 508 - 1000000000. / dt; 509 - 510 - printf(fmt, "rx dropped", dropped_pps, 511 - xsks[i]->ring_stats.rx_dropped_npkts); 512 - printf(fmt, "rx invalid", rx_invalid_pps, 513 - xsks[i]->ring_stats.rx_invalid_npkts); 514 - printf(fmt, "tx invalid", tx_invalid_pps, 515 - xsks[i]->ring_stats.tx_invalid_npkts); 516 - printf(fmt, "rx queue full", full_pps, 517 - xsks[i]->ring_stats.rx_full_npkts); 518 - printf(fmt, "fill ring empty", fill_empty_pps, 519 - xsks[i]->ring_stats.rx_fill_empty_npkts); 520 - printf(fmt, "tx ring empty", tx_empty_pps, 521 - xsks[i]->ring_stats.tx_empty_npkts); 522 - 523 - xsks[i]->ring_stats.prev_rx_dropped_npkts = 524 - xsks[i]->ring_stats.rx_dropped_npkts; 525 - xsks[i]->ring_stats.prev_rx_invalid_npkts = 526 - xsks[i]->ring_stats.rx_invalid_npkts; 527 - xsks[i]->ring_stats.prev_tx_invalid_npkts = 528 - xsks[i]->ring_stats.tx_invalid_npkts; 529 - xsks[i]->ring_stats.prev_rx_full_npkts = 530 - xsks[i]->ring_stats.rx_full_npkts; 531 - xsks[i]->ring_stats.prev_rx_fill_empty_npkts = 532 - xsks[i]->ring_stats.rx_fill_empty_npkts; 533 - xsks[i]->ring_stats.prev_tx_empty_npkts = 534 - xsks[i]->ring_stats.tx_empty_npkts; 535 - } else { 536 - printf("%-15s\n", "Error retrieving extra stats"); 537 - } 538 - } 539 - } 540 - 541 - if (opt_app_stats) 542 - dump_app_stats(dt); 543 - if (irq_no) 544 - dump_driver_stats(dt); 545 - } 546 - 547 - static bool is_benchmark_done(void) 548 - { 549 - if (opt_duration > 0) { 550 - unsigned long dt = (get_nsecs() - start_time); 551 - 552 - if (dt >= opt_duration) 553 - benchmark_done = true; 554 - } 555 - return benchmark_done; 556 - } 557 - 558 - static void *poller(void *arg) 559 - { 560 - (void)arg; 561 - while (!is_benchmark_done()) { 562 - sleep(opt_interval); 
563 - dump_stats(); 564 - } 565 - 566 - return NULL; 567 - } 568 - 569 - static void remove_xdp_program(void) 570 - { 571 - u32 curr_prog_id = 0; 572 - 573 - if (bpf_xdp_query_id(opt_ifindex, opt_xdp_flags, &curr_prog_id)) { 574 - printf("bpf_xdp_query_id failed\n"); 575 - exit(EXIT_FAILURE); 576 - } 577 - 578 - if (prog_id == curr_prog_id) 579 - bpf_xdp_detach(opt_ifindex, opt_xdp_flags, NULL); 580 - else if (!curr_prog_id) 581 - printf("couldn't find a prog id on a given interface\n"); 582 - else 583 - printf("program on interface changed, not removing\n"); 584 - } 585 - 586 - static void int_exit(int sig) 587 - { 588 - benchmark_done = true; 589 - } 590 - 591 - static void __exit_with_error(int error, const char *file, const char *func, 592 - int line) 593 - { 594 - fprintf(stderr, "%s:%s:%i: errno: %d/\"%s\"\n", file, func, 595 - line, error, strerror(error)); 596 - 597 - if (opt_num_xsks > 1) 598 - remove_xdp_program(); 599 - exit(EXIT_FAILURE); 600 - } 601 - 602 - #define exit_with_error(error) __exit_with_error(error, __FILE__, __func__, __LINE__) 603 - 604 - static void xdpsock_cleanup(void) 605 - { 606 - struct xsk_umem *umem = xsks[0]->umem->umem; 607 - int i, cmd = CLOSE_CONN; 608 - 609 - dump_stats(); 610 - for (i = 0; i < num_socks; i++) 611 - xsk_socket__delete(xsks[i]->xsk); 612 - (void)xsk_umem__delete(umem); 613 - 614 - if (opt_reduced_cap) { 615 - if (write(sock, &cmd, sizeof(int)) < 0) 616 - exit_with_error(errno); 617 - } 618 - 619 - if (opt_num_xsks > 1) 620 - remove_xdp_program(); 621 - } 622 - 623 - static void swap_mac_addresses(void *data) 624 - { 625 - struct ether_header *eth = (struct ether_header *)data; 626 - struct ether_addr *src_addr = (struct ether_addr *)&eth->ether_shost; 627 - struct ether_addr *dst_addr = (struct ether_addr *)&eth->ether_dhost; 628 - struct ether_addr tmp; 629 - 630 - tmp = *src_addr; 631 - *src_addr = *dst_addr; 632 - *dst_addr = tmp; 633 - } 634 - 635 - static void hex_dump(void *pkt, size_t length, u64 addr) 
636 - { 637 - const unsigned char *address = (unsigned char *)pkt; 638 - const unsigned char *line = address; 639 - size_t line_size = 32; 640 - unsigned char c; 641 - char buf[32]; 642 - int i = 0; 643 - 644 - if (!DEBUG_HEXDUMP) 645 - return; 646 - 647 - sprintf(buf, "addr=%llu", addr); 648 - printf("length = %zu\n", length); 649 - printf("%s | ", buf); 650 - while (length-- > 0) { 651 - printf("%02X ", *address++); 652 - if (!(++i % line_size) || (length == 0 && i % line_size)) { 653 - if (length == 0) { 654 - while (i++ % line_size) 655 - printf("__ "); 656 - } 657 - printf(" | "); /* right close */ 658 - while (line < address) { 659 - c = *line++; 660 - printf("%c", (c < 33 || c == 255) ? 0x2E : c); 661 - } 662 - printf("\n"); 663 - if (length > 0) 664 - printf("%s | ", buf); 665 - } 666 - } 667 - printf("\n"); 668 - } 669 - 670 - static void *memset32_htonl(void *dest, u32 val, u32 size) 671 - { 672 - u32 *ptr = (u32 *)dest; 673 - int i; 674 - 675 - val = htonl(val); 676 - 677 - for (i = 0; i < (size & (~0x3)); i += 4) 678 - ptr[i >> 2] = val; 679 - 680 - for (; i < size; i++) 681 - ((char *)dest)[i] = ((char *)&val)[i & 3]; 682 - 683 - return dest; 684 - } 685 - 686 - /* 687 - * This function code has been taken from 688 - * Linux kernel lib/checksum.c 689 - */ 690 - static inline unsigned short from32to16(unsigned int x) 691 - { 692 - /* add up 16-bit and 16-bit for 16+c bit */ 693 - x = (x & 0xffff) + (x >> 16); 694 - /* add up carry.. 
*/ 695 - x = (x & 0xffff) + (x >> 16); 696 - return x; 697 - } 698 - 699 - /* 700 - * This function code has been taken from 701 - * Linux kernel lib/checksum.c 702 - */ 703 - static unsigned int do_csum(const unsigned char *buff, int len) 704 - { 705 - unsigned int result = 0; 706 - int odd; 707 - 708 - if (len <= 0) 709 - goto out; 710 - odd = 1 & (unsigned long)buff; 711 - if (odd) { 712 - #ifdef __LITTLE_ENDIAN 713 - result += (*buff << 8); 714 - #else 715 - result = *buff; 716 - #endif 717 - len--; 718 - buff++; 719 - } 720 - if (len >= 2) { 721 - if (2 & (unsigned long)buff) { 722 - result += *(unsigned short *)buff; 723 - len -= 2; 724 - buff += 2; 725 - } 726 - if (len >= 4) { 727 - const unsigned char *end = buff + 728 - ((unsigned int)len & ~3); 729 - unsigned int carry = 0; 730 - 731 - do { 732 - unsigned int w = *(unsigned int *)buff; 733 - 734 - buff += 4; 735 - result += carry; 736 - result += w; 737 - carry = (w > result); 738 - } while (buff < end); 739 - result += carry; 740 - result = (result & 0xffff) + (result >> 16); 741 - } 742 - if (len & 2) { 743 - result += *(unsigned short *)buff; 744 - buff += 2; 745 - } 746 - } 747 - if (len & 1) 748 - #ifdef __LITTLE_ENDIAN 749 - result += *buff; 750 - #else 751 - result += (*buff << 8); 752 - #endif 753 - result = from32to16(result); 754 - if (odd) 755 - result = ((result >> 8) & 0xff) | ((result & 0xff) << 8); 756 - out: 757 - return result; 758 - } 759 - 760 - /* 761 - * This is a version of ip_compute_csum() optimized for IP headers, 762 - * which always checksum on 4 octet boundaries. 
763 - * This function code has been taken from 764 - * Linux kernel lib/checksum.c 765 - */ 766 - static inline __sum16 ip_fast_csum(const void *iph, unsigned int ihl) 767 - { 768 - return (__sum16)~do_csum(iph, ihl * 4); 769 - } 770 - 771 - /* 772 - * Fold a partial checksum 773 - * This function code has been taken from 774 - * Linux kernel include/asm-generic/checksum.h 775 - */ 776 - static inline __sum16 csum_fold(__wsum csum) 777 - { 778 - u32 sum = (u32)csum; 779 - 780 - sum = (sum & 0xffff) + (sum >> 16); 781 - sum = (sum & 0xffff) + (sum >> 16); 782 - return (__sum16)~sum; 783 - } 784 - 785 - /* 786 - * This function code has been taken from 787 - * Linux kernel lib/checksum.c 788 - */ 789 - static inline u32 from64to32(u64 x) 790 - { 791 - /* add up 32-bit and 32-bit for 32+c bit */ 792 - x = (x & 0xffffffff) + (x >> 32); 793 - /* add up carry.. */ 794 - x = (x & 0xffffffff) + (x >> 32); 795 - return (u32)x; 796 - } 797 - 798 - __wsum csum_tcpudp_nofold(__be32 saddr, __be32 daddr, 799 - __u32 len, __u8 proto, __wsum sum); 800 - 801 - /* 802 - * This function code has been taken from 803 - * Linux kernel lib/checksum.c 804 - */ 805 - __wsum csum_tcpudp_nofold(__be32 saddr, __be32 daddr, 806 - __u32 len, __u8 proto, __wsum sum) 807 - { 808 - unsigned long long s = (u32)sum; 809 - 810 - s += (u32)saddr; 811 - s += (u32)daddr; 812 - #ifdef __BIG_ENDIAN__ 813 - s += proto + len; 814 - #else 815 - s += (proto + len) << 8; 816 - #endif 817 - return (__wsum)from64to32(s); 818 - } 819 - 820 - /* 821 - * This function has been taken from 822 - * Linux kernel include/asm-generic/checksum.h 823 - */ 824 - static inline __sum16 825 - csum_tcpudp_magic(__be32 saddr, __be32 daddr, __u32 len, 826 - __u8 proto, __wsum sum) 827 - { 828 - return csum_fold(csum_tcpudp_nofold(saddr, daddr, len, proto, sum)); 829 - } 830 - 831 - static inline u16 udp_csum(u32 saddr, u32 daddr, u32 len, 832 - u8 proto, u16 *udp_pkt) 833 - { 834 - u32 csum = 0; 835 - u32 cnt = 0; 836 - 837 - /* 
udp hdr and data */ 838 - for (; cnt < len; cnt += 2) 839 - csum += udp_pkt[cnt >> 1]; 840 - 841 - return csum_tcpudp_magic(saddr, daddr, len, proto, csum); 842 - } 843 - 844 - #define ETH_FCS_SIZE 4 845 - 846 - #define ETH_HDR_SIZE (opt_vlan_tag ? sizeof(struct vlan_ethhdr) : \ 847 - sizeof(struct ethhdr)) 848 - #define PKTGEN_HDR_SIZE (opt_tstamp ? sizeof(struct pktgen_hdr) : 0) 849 - #define PKT_HDR_SIZE (ETH_HDR_SIZE + sizeof(struct iphdr) + \ 850 - sizeof(struct udphdr) + PKTGEN_HDR_SIZE) 851 - #define PKTGEN_HDR_OFFSET (ETH_HDR_SIZE + sizeof(struct iphdr) + \ 852 - sizeof(struct udphdr)) 853 - #define PKTGEN_SIZE_MIN (PKTGEN_HDR_OFFSET + sizeof(struct pktgen_hdr) + \ 854 - ETH_FCS_SIZE) 855 - 856 - #define PKT_SIZE (opt_pkt_size - ETH_FCS_SIZE) 857 - #define IP_PKT_SIZE (PKT_SIZE - ETH_HDR_SIZE) 858 - #define UDP_PKT_SIZE (IP_PKT_SIZE - sizeof(struct iphdr)) 859 - #define UDP_PKT_DATA_SIZE (UDP_PKT_SIZE - \ 860 - (sizeof(struct udphdr) + PKTGEN_HDR_SIZE)) 861 - 862 - static u8 pkt_data[XSK_UMEM__DEFAULT_FRAME_SIZE]; 863 - 864 - static void gen_eth_hdr_data(void) 865 - { 866 - struct pktgen_hdr *pktgen_hdr; 867 - struct udphdr *udp_hdr; 868 - struct iphdr *ip_hdr; 869 - 870 - if (opt_vlan_tag) { 871 - struct vlan_ethhdr *veth_hdr = (struct vlan_ethhdr *)pkt_data; 872 - u16 vlan_tci = 0; 873 - 874 - udp_hdr = (struct udphdr *)(pkt_data + 875 - sizeof(struct vlan_ethhdr) + 876 - sizeof(struct iphdr)); 877 - ip_hdr = (struct iphdr *)(pkt_data + 878 - sizeof(struct vlan_ethhdr)); 879 - pktgen_hdr = (struct pktgen_hdr *)(pkt_data + 880 - sizeof(struct vlan_ethhdr) + 881 - sizeof(struct iphdr) + 882 - sizeof(struct udphdr)); 883 - /* ethernet & VLAN header */ 884 - memcpy(veth_hdr->h_dest, &opt_txdmac, ETH_ALEN); 885 - memcpy(veth_hdr->h_source, &opt_txsmac, ETH_ALEN); 886 - veth_hdr->h_vlan_proto = htons(ETH_P_8021Q); 887 - vlan_tci = opt_pkt_vlan_id & VLAN_VID_MASK; 888 - vlan_tci |= (opt_pkt_vlan_pri << VLAN_PRIO_SHIFT) & VLAN_PRIO_MASK; 889 - 
veth_hdr->h_vlan_TCI = htons(vlan_tci); 890 - veth_hdr->h_vlan_encapsulated_proto = htons(ETH_P_IP); 891 - } else { 892 - struct ethhdr *eth_hdr = (struct ethhdr *)pkt_data; 893 - 894 - udp_hdr = (struct udphdr *)(pkt_data + 895 - sizeof(struct ethhdr) + 896 - sizeof(struct iphdr)); 897 - ip_hdr = (struct iphdr *)(pkt_data + 898 - sizeof(struct ethhdr)); 899 - pktgen_hdr = (struct pktgen_hdr *)(pkt_data + 900 - sizeof(struct ethhdr) + 901 - sizeof(struct iphdr) + 902 - sizeof(struct udphdr)); 903 - /* ethernet header */ 904 - memcpy(eth_hdr->h_dest, &opt_txdmac, ETH_ALEN); 905 - memcpy(eth_hdr->h_source, &opt_txsmac, ETH_ALEN); 906 - eth_hdr->h_proto = htons(ETH_P_IP); 907 - } 908 - 909 - 910 - /* IP header */ 911 - ip_hdr->version = IPVERSION; 912 - ip_hdr->ihl = 0x5; /* 20 byte header */ 913 - ip_hdr->tos = 0x0; 914 - ip_hdr->tot_len = htons(IP_PKT_SIZE); 915 - ip_hdr->id = 0; 916 - ip_hdr->frag_off = 0; 917 - ip_hdr->ttl = IPDEFTTL; 918 - ip_hdr->protocol = IPPROTO_UDP; 919 - ip_hdr->saddr = htonl(0x0a0a0a10); 920 - ip_hdr->daddr = htonl(0x0a0a0a20); 921 - 922 - /* IP header checksum */ 923 - ip_hdr->check = 0; 924 - ip_hdr->check = ip_fast_csum((const void *)ip_hdr, ip_hdr->ihl); 925 - 926 - /* UDP header */ 927 - udp_hdr->source = htons(0x1000); 928 - udp_hdr->dest = htons(0x1000); 929 - udp_hdr->len = htons(UDP_PKT_SIZE); 930 - 931 - if (opt_tstamp) 932 - pktgen_hdr->pgh_magic = htonl(PKTGEN_MAGIC); 933 - 934 - /* UDP data */ 935 - memset32_htonl(pkt_data + PKT_HDR_SIZE, opt_pkt_fill_pattern, 936 - UDP_PKT_DATA_SIZE); 937 - 938 - /* UDP header checksum */ 939 - udp_hdr->check = 0; 940 - udp_hdr->check = udp_csum(ip_hdr->saddr, ip_hdr->daddr, UDP_PKT_SIZE, 941 - IPPROTO_UDP, (u16 *)udp_hdr); 942 - } 943 - 944 - static void gen_eth_frame(struct xsk_umem_info *umem, u64 addr) 945 - { 946 - memcpy(xsk_umem__get_data(umem->buffer, addr), pkt_data, 947 - PKT_SIZE); 948 - } 949 - 950 - static struct xsk_umem_info *xsk_configure_umem(void *buffer, u64 size) 951 - { 
952 - struct xsk_umem_info *umem; 953 - struct xsk_umem_config cfg = { 954 - /* We recommend that you set the fill ring size >= HW RX ring size + 955 - * AF_XDP RX ring size. Make sure you fill up the fill ring 956 - * with buffers at regular intervals, and you will with this setting 957 - * avoid allocation failures in the driver. These are usually quite 958 - * expensive since drivers have not been written to assume that 959 - * allocation failures are common. For regular sockets, kernel 960 - * allocated memory is used that only runs out in OOM situations 961 - * that should be rare. 962 - */ 963 - .fill_size = XSK_RING_PROD__DEFAULT_NUM_DESCS * 2, 964 - .comp_size = XSK_RING_CONS__DEFAULT_NUM_DESCS, 965 - .frame_size = opt_xsk_frame_size, 966 - .frame_headroom = XSK_UMEM__DEFAULT_FRAME_HEADROOM, 967 - .flags = opt_umem_flags 968 - }; 969 - int ret; 970 - 971 - umem = calloc(1, sizeof(*umem)); 972 - if (!umem) 973 - exit_with_error(errno); 974 - 975 - ret = xsk_umem__create(&umem->umem, buffer, size, &umem->fq, &umem->cq, 976 - &cfg); 977 - if (ret) 978 - exit_with_error(-ret); 979 - 980 - umem->buffer = buffer; 981 - return umem; 982 - } 983 - 984 - static void xsk_populate_fill_ring(struct xsk_umem_info *umem) 985 - { 986 - int ret, i; 987 - u32 idx; 988 - 989 - ret = xsk_ring_prod__reserve(&umem->fq, 990 - XSK_RING_PROD__DEFAULT_NUM_DESCS * 2, &idx); 991 - if (ret != XSK_RING_PROD__DEFAULT_NUM_DESCS * 2) 992 - exit_with_error(-ret); 993 - for (i = 0; i < XSK_RING_PROD__DEFAULT_NUM_DESCS * 2; i++) 994 - *xsk_ring_prod__fill_addr(&umem->fq, idx++) = 995 - i * opt_xsk_frame_size; 996 - xsk_ring_prod__submit(&umem->fq, XSK_RING_PROD__DEFAULT_NUM_DESCS * 2); 997 - } 998 - 999 - static struct xsk_socket_info *xsk_configure_socket(struct xsk_umem_info *umem, 1000 - bool rx, bool tx) 1001 - { 1002 - struct xsk_socket_config cfg; 1003 - struct xsk_socket_info *xsk; 1004 - struct xsk_ring_cons *rxr; 1005 - struct xsk_ring_prod *txr; 1006 - int ret; 1007 - 1008 - xsk = 
calloc(1, sizeof(*xsk)); 1009 - if (!xsk) 1010 - exit_with_error(errno); 1011 - 1012 - xsk->umem = umem; 1013 - cfg.rx_size = XSK_RING_CONS__DEFAULT_NUM_DESCS; 1014 - cfg.tx_size = XSK_RING_PROD__DEFAULT_NUM_DESCS; 1015 - if (opt_num_xsks > 1 || opt_reduced_cap) 1016 - cfg.libbpf_flags = XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD; 1017 - else 1018 - cfg.libbpf_flags = 0; 1019 - cfg.xdp_flags = opt_xdp_flags; 1020 - cfg.bind_flags = opt_xdp_bind_flags; 1021 - 1022 - rxr = rx ? &xsk->rx : NULL; 1023 - txr = tx ? &xsk->tx : NULL; 1024 - ret = xsk_socket__create(&xsk->xsk, opt_if, opt_queue, umem->umem, 1025 - rxr, txr, &cfg); 1026 - if (ret) 1027 - exit_with_error(-ret); 1028 - 1029 - ret = bpf_xdp_query_id(opt_ifindex, opt_xdp_flags, &prog_id); 1030 - if (ret) 1031 - exit_with_error(-ret); 1032 - 1033 - xsk->app_stats.rx_empty_polls = 0; 1034 - xsk->app_stats.fill_fail_polls = 0; 1035 - xsk->app_stats.copy_tx_sendtos = 0; 1036 - xsk->app_stats.tx_wakeup_sendtos = 0; 1037 - xsk->app_stats.opt_polls = 0; 1038 - xsk->app_stats.prev_rx_empty_polls = 0; 1039 - xsk->app_stats.prev_fill_fail_polls = 0; 1040 - xsk->app_stats.prev_copy_tx_sendtos = 0; 1041 - xsk->app_stats.prev_tx_wakeup_sendtos = 0; 1042 - xsk->app_stats.prev_opt_polls = 0; 1043 - 1044 - return xsk; 1045 - } 1046 - 1047 - static struct option long_options[] = { 1048 - {"rxdrop", no_argument, 0, 'r'}, 1049 - {"txonly", no_argument, 0, 't'}, 1050 - {"l2fwd", no_argument, 0, 'l'}, 1051 - {"interface", required_argument, 0, 'i'}, 1052 - {"queue", required_argument, 0, 'q'}, 1053 - {"poll", no_argument, 0, 'p'}, 1054 - {"xdp-skb", no_argument, 0, 'S'}, 1055 - {"xdp-native", no_argument, 0, 'N'}, 1056 - {"interval", required_argument, 0, 'n'}, 1057 - {"retries", required_argument, 0, 'O'}, 1058 - {"zero-copy", no_argument, 0, 'z'}, 1059 - {"copy", no_argument, 0, 'c'}, 1060 - {"frame-size", required_argument, 0, 'f'}, 1061 - {"no-need-wakeup", no_argument, 0, 'm'}, 1062 - {"unaligned", no_argument, 0, 'u'}, 1063 - 
{"shared-umem", no_argument, 0, 'M'}, 1064 - {"force", no_argument, 0, 'F'}, 1065 - {"duration", required_argument, 0, 'd'}, 1066 - {"clock", required_argument, 0, 'w'}, 1067 - {"batch-size", required_argument, 0, 'b'}, 1068 - {"tx-pkt-count", required_argument, 0, 'C'}, 1069 - {"tx-pkt-size", required_argument, 0, 's'}, 1070 - {"tx-pkt-pattern", required_argument, 0, 'P'}, 1071 - {"tx-vlan", no_argument, 0, 'V'}, 1072 - {"tx-vlan-id", required_argument, 0, 'J'}, 1073 - {"tx-vlan-pri", required_argument, 0, 'K'}, 1074 - {"tx-dmac", required_argument, 0, 'G'}, 1075 - {"tx-smac", required_argument, 0, 'H'}, 1076 - {"tx-cycle", required_argument, 0, 'T'}, 1077 - {"tstamp", no_argument, 0, 'y'}, 1078 - {"policy", required_argument, 0, 'W'}, 1079 - {"schpri", required_argument, 0, 'U'}, 1080 - {"extra-stats", no_argument, 0, 'x'}, 1081 - {"quiet", no_argument, 0, 'Q'}, 1082 - {"app-stats", no_argument, 0, 'a'}, 1083 - {"irq-string", no_argument, 0, 'I'}, 1084 - {"busy-poll", no_argument, 0, 'B'}, 1085 - {"reduce-cap", no_argument, 0, 'R'}, 1086 - {0, 0, 0, 0} 1087 - }; 1088 - 1089 - static void usage(const char *prog) 1090 - { 1091 - const char *str = 1092 - " Usage: %s [OPTIONS]\n" 1093 - " Options:\n" 1094 - " -r, --rxdrop Discard all incoming packets (default)\n" 1095 - " -t, --txonly Only send packets\n" 1096 - " -l, --l2fwd MAC swap L2 forwarding\n" 1097 - " -i, --interface=n Run on interface n\n" 1098 - " -q, --queue=n Use queue n (default 0)\n" 1099 - " -p, --poll Use poll syscall\n" 1100 - " -S, --xdp-skb=n Use XDP skb-mod\n" 1101 - " -N, --xdp-native=n Enforce XDP native mode\n" 1102 - " -n, --interval=n Specify statistics update interval (default 1 sec).\n" 1103 - " -O, --retries=n Specify time-out retries (1s interval) attempt (default 3).\n" 1104 - " -z, --zero-copy Force zero-copy mode.\n" 1105 - " -c, --copy Force copy mode.\n" 1106 - " -m, --no-need-wakeup Turn off use of driver need wakeup flag.\n" 1107 - " -f, --frame-size=n Set the frame size (must be 
a power of two in aligned mode, default is %d).\n" 1108 - " -u, --unaligned Enable unaligned chunk placement\n" 1109 - " -M, --shared-umem Enable XDP_SHARED_UMEM (cannot be used with -R)\n" 1110 - " -F, --force Force loading the XDP prog\n" 1111 - " -d, --duration=n Duration in secs to run command.\n" 1112 - " Default: forever.\n" 1113 - " -w, --clock=CLOCK Clock NAME (default MONOTONIC).\n" 1114 - " -b, --batch-size=n Batch size for sending or receiving\n" 1115 - " packets. Default: %d\n" 1116 - " -C, --tx-pkt-count=n Number of packets to send.\n" 1117 - " Default: Continuous packets.\n" 1118 - " -s, --tx-pkt-size=n Transmit packet size.\n" 1119 - " (Default: %d bytes)\n" 1120 - " Min size: %d, Max size %d.\n" 1121 - " -P, --tx-pkt-pattern=nPacket fill pattern. Default: 0x%x\n" 1122 - " -V, --tx-vlan Send VLAN tagged packets (For -t|--txonly)\n" 1123 - " -J, --tx-vlan-id=n Tx VLAN ID [1-4095]. Default: %d (For -V|--tx-vlan)\n" 1124 - " -K, --tx-vlan-pri=n Tx VLAN Priority [0-7]. Default: %d (For -V|--tx-vlan)\n" 1125 - " -G, --tx-dmac=<MAC> Dest MAC addr of TX frame in aa:bb:cc:dd:ee:ff format (For -V|--tx-vlan)\n" 1126 - " -H, --tx-smac=<MAC> Src MAC addr of TX frame in aa:bb:cc:dd:ee:ff format (For -V|--tx-vlan)\n" 1127 - " -T, --tx-cycle=n Tx cycle time in micro-seconds (For -t|--txonly).\n" 1128 - " -y, --tstamp Add time-stamp to packet (For -t|--txonly).\n" 1129 - " -W, --policy=POLICY Schedule policy. Default: SCHED_OTHER\n" 1130 - " -U, --schpri=n Schedule priority. 
Default: %d\n" 1131 - " -x, --extra-stats Display extra statistics.\n" 1132 - " -Q, --quiet Do not display any stats.\n" 1133 - " -a, --app-stats Display application (syscall) statistics.\n" 1134 - " -I, --irq-string Display driver interrupt statistics for interface associated with irq-string.\n" 1135 - " -B, --busy-poll Busy poll.\n" 1136 - " -R, --reduce-cap Use reduced capabilities (cannot be used with -M)\n" 1137 - "\n"; 1138 - fprintf(stderr, str, prog, XSK_UMEM__DEFAULT_FRAME_SIZE, 1139 - opt_batch_size, MIN_PKT_SIZE, MIN_PKT_SIZE, 1140 - XSK_UMEM__DEFAULT_FRAME_SIZE, opt_pkt_fill_pattern, 1141 - VLAN_VID__DEFAULT, VLAN_PRI__DEFAULT, 1142 - SCHED_PRI__DEFAULT); 1143 - 1144 - exit(EXIT_FAILURE); 1145 - } 1146 - 1147 - static void parse_command_line(int argc, char **argv) 1148 - { 1149 - int option_index, c; 1150 - 1151 - opterr = 0; 1152 - 1153 - for (;;) { 1154 - c = getopt_long(argc, argv, 1155 - "Frtli:q:pSNn:w:O:czf:muMd:b:C:s:P:VJ:K:G:H:T:yW:U:xQaI:BR", 1156 - long_options, &option_index); 1157 - if (c == -1) 1158 - break; 1159 - 1160 - switch (c) { 1161 - case 'r': 1162 - opt_bench = BENCH_RXDROP; 1163 - break; 1164 - case 't': 1165 - opt_bench = BENCH_TXONLY; 1166 - break; 1167 - case 'l': 1168 - opt_bench = BENCH_L2FWD; 1169 - break; 1170 - case 'i': 1171 - opt_if = optarg; 1172 - break; 1173 - case 'q': 1174 - opt_queue = atoi(optarg); 1175 - break; 1176 - case 'p': 1177 - opt_poll = 1; 1178 - break; 1179 - case 'S': 1180 - opt_xdp_flags |= XDP_FLAGS_SKB_MODE; 1181 - opt_xdp_bind_flags |= XDP_COPY; 1182 - break; 1183 - case 'N': 1184 - /* default, set below */ 1185 - break; 1186 - case 'n': 1187 - opt_interval = atoi(optarg); 1188 - break; 1189 - case 'w': 1190 - if (get_clockid(&opt_clock, optarg)) { 1191 - fprintf(stderr, 1192 - "ERROR: Invalid clock %s. 
Default to CLOCK_MONOTONIC.\n", 1193 - optarg); 1194 - opt_clock = CLOCK_MONOTONIC; 1195 - } 1196 - break; 1197 - case 'O': 1198 - opt_retries = atoi(optarg); 1199 - break; 1200 - case 'z': 1201 - opt_xdp_bind_flags |= XDP_ZEROCOPY; 1202 - break; 1203 - case 'c': 1204 - opt_xdp_bind_flags |= XDP_COPY; 1205 - break; 1206 - case 'u': 1207 - opt_umem_flags |= XDP_UMEM_UNALIGNED_CHUNK_FLAG; 1208 - opt_unaligned_chunks = 1; 1209 - opt_mmap_flags = MAP_HUGETLB; 1210 - break; 1211 - case 'F': 1212 - opt_xdp_flags &= ~XDP_FLAGS_UPDATE_IF_NOEXIST; 1213 - break; 1214 - case 'f': 1215 - opt_xsk_frame_size = atoi(optarg); 1216 - break; 1217 - case 'm': 1218 - opt_need_wakeup = false; 1219 - opt_xdp_bind_flags &= ~XDP_USE_NEED_WAKEUP; 1220 - break; 1221 - case 'M': 1222 - opt_num_xsks = MAX_SOCKS; 1223 - break; 1224 - case 'd': 1225 - opt_duration = atoi(optarg); 1226 - opt_duration *= 1000000000; 1227 - break; 1228 - case 'b': 1229 - opt_batch_size = atoi(optarg); 1230 - break; 1231 - case 'C': 1232 - opt_pkt_count = atoi(optarg); 1233 - break; 1234 - case 's': 1235 - opt_pkt_size = atoi(optarg); 1236 - if (opt_pkt_size > (XSK_UMEM__DEFAULT_FRAME_SIZE) || 1237 - opt_pkt_size < MIN_PKT_SIZE) { 1238 - fprintf(stderr, 1239 - "ERROR: Invalid frame size %d\n", 1240 - opt_pkt_size); 1241 - usage(basename(argv[0])); 1242 - } 1243 - break; 1244 - case 'P': 1245 - opt_pkt_fill_pattern = strtol(optarg, NULL, 16); 1246 - break; 1247 - case 'V': 1248 - opt_vlan_tag = true; 1249 - break; 1250 - case 'J': 1251 - opt_pkt_vlan_id = atoi(optarg); 1252 - break; 1253 - case 'K': 1254 - opt_pkt_vlan_pri = atoi(optarg); 1255 - break; 1256 - case 'G': 1257 - if (!ether_aton_r(optarg, 1258 - (struct ether_addr *)&opt_txdmac)) { 1259 - fprintf(stderr, "Invalid dmac address:%s\n", 1260 - optarg); 1261 - usage(basename(argv[0])); 1262 - } 1263 - break; 1264 - case 'H': 1265 - if (!ether_aton_r(optarg, 1266 - (struct ether_addr *)&opt_txsmac)) { 1267 - fprintf(stderr, "Invalid smac address:%s\n", 1268 - 
optarg); 1269 - usage(basename(argv[0])); 1270 - } 1271 - break; 1272 - case 'T': 1273 - opt_tx_cycle_ns = atoi(optarg); 1274 - opt_tx_cycle_ns *= NSEC_PER_USEC; 1275 - break; 1276 - case 'y': 1277 - opt_tstamp = 1; 1278 - break; 1279 - case 'W': 1280 - if (get_schpolicy(&opt_schpolicy, optarg)) { 1281 - fprintf(stderr, 1282 - "ERROR: Invalid policy %s. Default to SCHED_OTHER.\n", 1283 - optarg); 1284 - opt_schpolicy = SCHED_OTHER; 1285 - } 1286 - break; 1287 - case 'U': 1288 - opt_schprio = atoi(optarg); 1289 - break; 1290 - case 'x': 1291 - opt_extra_stats = 1; 1292 - break; 1293 - case 'Q': 1294 - opt_quiet = 1; 1295 - break; 1296 - case 'a': 1297 - opt_app_stats = 1; 1298 - break; 1299 - case 'I': 1300 - opt_irq_str = optarg; 1301 - if (get_interrupt_number()) 1302 - irqs_at_init = get_irqs(); 1303 - if (irqs_at_init < 0) { 1304 - fprintf(stderr, "ERROR: Failed to get irqs for %s\n", opt_irq_str); 1305 - usage(basename(argv[0])); 1306 - } 1307 - break; 1308 - case 'B': 1309 - opt_busy_poll = 1; 1310 - break; 1311 - case 'R': 1312 - opt_reduced_cap = true; 1313 - break; 1314 - default: 1315 - usage(basename(argv[0])); 1316 - } 1317 - } 1318 - 1319 - if (!(opt_xdp_flags & XDP_FLAGS_SKB_MODE)) 1320 - opt_xdp_flags |= XDP_FLAGS_DRV_MODE; 1321 - 1322 - opt_ifindex = if_nametoindex(opt_if); 1323 - if (!opt_ifindex) { 1324 - fprintf(stderr, "ERROR: interface \"%s\" does not exist\n", 1325 - opt_if); 1326 - usage(basename(argv[0])); 1327 - } 1328 - 1329 - if ((opt_xsk_frame_size & (opt_xsk_frame_size - 1)) && 1330 - !opt_unaligned_chunks) { 1331 - fprintf(stderr, "--frame-size=%d is not a power of two\n", 1332 - opt_xsk_frame_size); 1333 - usage(basename(argv[0])); 1334 - } 1335 - 1336 - if (opt_reduced_cap && opt_num_xsks > 1) { 1337 - fprintf(stderr, "ERROR: -M and -R cannot be used together\n"); 1338 - usage(basename(argv[0])); 1339 - } 1340 - } 1341 - 1342 - static void kick_tx(struct xsk_socket_info *xsk) 1343 - { 1344 - int ret; 1345 - 1346 - ret = 
sendto(xsk_socket__fd(xsk->xsk), NULL, 0, MSG_DONTWAIT, NULL, 0); 1347 - if (ret >= 0 || errno == ENOBUFS || errno == EAGAIN || 1348 - errno == EBUSY || errno == ENETDOWN) 1349 - return; 1350 - exit_with_error(errno); 1351 - } 1352 - 1353 - static inline void complete_tx_l2fwd(struct xsk_socket_info *xsk) 1354 - { 1355 - struct xsk_umem_info *umem = xsk->umem; 1356 - u32 idx_cq = 0, idx_fq = 0; 1357 - unsigned int rcvd; 1358 - size_t ndescs; 1359 - 1360 - if (!xsk->outstanding_tx) 1361 - return; 1362 - 1363 - /* In copy mode, Tx is driven by a syscall so we need to use e.g. sendto() to 1364 - * really send the packets. In zero-copy mode we do not have to do this, since Tx 1365 - * is driven by the NAPI loop. So as an optimization, we do not have to call 1366 - * sendto() all the time in zero-copy mode for l2fwd. 1367 - */ 1368 - if (opt_xdp_bind_flags & XDP_COPY) { 1369 - xsk->app_stats.copy_tx_sendtos++; 1370 - kick_tx(xsk); 1371 - } 1372 - 1373 - ndescs = (xsk->outstanding_tx > opt_batch_size) ? 
opt_batch_size : 1374 - xsk->outstanding_tx; 1375 - 1376 - /* re-add completed Tx buffers */ 1377 - rcvd = xsk_ring_cons__peek(&umem->cq, ndescs, &idx_cq); 1378 - if (rcvd > 0) { 1379 - unsigned int i; 1380 - int ret; 1381 - 1382 - ret = xsk_ring_prod__reserve(&umem->fq, rcvd, &idx_fq); 1383 - while (ret != rcvd) { 1384 - if (ret < 0) 1385 - exit_with_error(-ret); 1386 - if (opt_busy_poll || xsk_ring_prod__needs_wakeup(&umem->fq)) { 1387 - xsk->app_stats.fill_fail_polls++; 1388 - recvfrom(xsk_socket__fd(xsk->xsk), NULL, 0, MSG_DONTWAIT, NULL, 1389 - NULL); 1390 - } 1391 - ret = xsk_ring_prod__reserve(&umem->fq, rcvd, &idx_fq); 1392 - } 1393 - 1394 - for (i = 0; i < rcvd; i++) 1395 - *xsk_ring_prod__fill_addr(&umem->fq, idx_fq++) = 1396 - *xsk_ring_cons__comp_addr(&umem->cq, idx_cq++); 1397 - 1398 - xsk_ring_prod__submit(&xsk->umem->fq, rcvd); 1399 - xsk_ring_cons__release(&xsk->umem->cq, rcvd); 1400 - xsk->outstanding_tx -= rcvd; 1401 - } 1402 - } 1403 - 1404 - static inline void complete_tx_only(struct xsk_socket_info *xsk, 1405 - int batch_size) 1406 - { 1407 - unsigned int rcvd; 1408 - u32 idx; 1409 - 1410 - if (!xsk->outstanding_tx) 1411 - return; 1412 - 1413 - if (!opt_need_wakeup || xsk_ring_prod__needs_wakeup(&xsk->tx)) { 1414 - xsk->app_stats.tx_wakeup_sendtos++; 1415 - kick_tx(xsk); 1416 - } 1417 - 1418 - rcvd = xsk_ring_cons__peek(&xsk->umem->cq, batch_size, &idx); 1419 - if (rcvd > 0) { 1420 - xsk_ring_cons__release(&xsk->umem->cq, rcvd); 1421 - xsk->outstanding_tx -= rcvd; 1422 - } 1423 - } 1424 - 1425 - static void rx_drop(struct xsk_socket_info *xsk) 1426 - { 1427 - unsigned int rcvd, i; 1428 - u32 idx_rx = 0, idx_fq = 0; 1429 - int ret; 1430 - 1431 - rcvd = xsk_ring_cons__peek(&xsk->rx, opt_batch_size, &idx_rx); 1432 - if (!rcvd) { 1433 - if (opt_busy_poll || xsk_ring_prod__needs_wakeup(&xsk->umem->fq)) { 1434 - xsk->app_stats.rx_empty_polls++; 1435 - recvfrom(xsk_socket__fd(xsk->xsk), NULL, 0, MSG_DONTWAIT, NULL, NULL); 1436 - } 1437 - return; 1438 
- } 1439 - 1440 - ret = xsk_ring_prod__reserve(&xsk->umem->fq, rcvd, &idx_fq); 1441 - while (ret != rcvd) { 1442 - if (ret < 0) 1443 - exit_with_error(-ret); 1444 - if (opt_busy_poll || xsk_ring_prod__needs_wakeup(&xsk->umem->fq)) { 1445 - xsk->app_stats.fill_fail_polls++; 1446 - recvfrom(xsk_socket__fd(xsk->xsk), NULL, 0, MSG_DONTWAIT, NULL, NULL); 1447 - } 1448 - ret = xsk_ring_prod__reserve(&xsk->umem->fq, rcvd, &idx_fq); 1449 - } 1450 - 1451 - for (i = 0; i < rcvd; i++) { 1452 - u64 addr = xsk_ring_cons__rx_desc(&xsk->rx, idx_rx)->addr; 1453 - u32 len = xsk_ring_cons__rx_desc(&xsk->rx, idx_rx++)->len; 1454 - u64 orig = xsk_umem__extract_addr(addr); 1455 - 1456 - addr = xsk_umem__add_offset_to_addr(addr); 1457 - char *pkt = xsk_umem__get_data(xsk->umem->buffer, addr); 1458 - 1459 - hex_dump(pkt, len, addr); 1460 - *xsk_ring_prod__fill_addr(&xsk->umem->fq, idx_fq++) = orig; 1461 - } 1462 - 1463 - xsk_ring_prod__submit(&xsk->umem->fq, rcvd); 1464 - xsk_ring_cons__release(&xsk->rx, rcvd); 1465 - xsk->ring_stats.rx_npkts += rcvd; 1466 - } 1467 - 1468 - static void rx_drop_all(void) 1469 - { 1470 - struct pollfd fds[MAX_SOCKS] = {}; 1471 - int i, ret; 1472 - 1473 - for (i = 0; i < num_socks; i++) { 1474 - fds[i].fd = xsk_socket__fd(xsks[i]->xsk); 1475 - fds[i].events = POLLIN; 1476 - } 1477 - 1478 - for (;;) { 1479 - if (opt_poll) { 1480 - for (i = 0; i < num_socks; i++) 1481 - xsks[i]->app_stats.opt_polls++; 1482 - ret = poll(fds, num_socks, opt_timeout); 1483 - if (ret <= 0) 1484 - continue; 1485 - } 1486 - 1487 - for (i = 0; i < num_socks; i++) 1488 - rx_drop(xsks[i]); 1489 - 1490 - if (benchmark_done) 1491 - break; 1492 - } 1493 - } 1494 - 1495 - static int tx_only(struct xsk_socket_info *xsk, u32 *frame_nb, 1496 - int batch_size, unsigned long tx_ns) 1497 - { 1498 - u32 idx, tv_sec, tv_usec; 1499 - unsigned int i; 1500 - 1501 - while (xsk_ring_prod__reserve(&xsk->tx, batch_size, &idx) < 1502 - batch_size) { 1503 - complete_tx_only(xsk, batch_size); 1504 - if 
(benchmark_done) 1505 - return 0; 1506 - } 1507 - 1508 - if (opt_tstamp) { 1509 - tv_sec = (u32)(tx_ns / NSEC_PER_SEC); 1510 - tv_usec = (u32)((tx_ns % NSEC_PER_SEC) / 1000); 1511 - } 1512 - 1513 - for (i = 0; i < batch_size; i++) { 1514 - struct xdp_desc *tx_desc = xsk_ring_prod__tx_desc(&xsk->tx, 1515 - idx + i); 1516 - tx_desc->addr = (*frame_nb + i) * opt_xsk_frame_size; 1517 - tx_desc->len = PKT_SIZE; 1518 - 1519 - if (opt_tstamp) { 1520 - struct pktgen_hdr *pktgen_hdr; 1521 - u64 addr = tx_desc->addr; 1522 - char *pkt; 1523 - 1524 - pkt = xsk_umem__get_data(xsk->umem->buffer, addr); 1525 - pktgen_hdr = (struct pktgen_hdr *)(pkt + PKTGEN_HDR_OFFSET); 1526 - 1527 - pktgen_hdr->seq_num = htonl(sequence++); 1528 - pktgen_hdr->tv_sec = htonl(tv_sec); 1529 - pktgen_hdr->tv_usec = htonl(tv_usec); 1530 - 1531 - hex_dump(pkt, PKT_SIZE, addr); 1532 - } 1533 - } 1534 - 1535 - xsk_ring_prod__submit(&xsk->tx, batch_size); 1536 - xsk->ring_stats.tx_npkts += batch_size; 1537 - xsk->outstanding_tx += batch_size; 1538 - *frame_nb += batch_size; 1539 - *frame_nb %= NUM_FRAMES; 1540 - complete_tx_only(xsk, batch_size); 1541 - 1542 - return batch_size; 1543 - } 1544 - 1545 - static inline int get_batch_size(int pkt_cnt) 1546 - { 1547 - if (!opt_pkt_count) 1548 - return opt_batch_size; 1549 - 1550 - if (pkt_cnt + opt_batch_size <= opt_pkt_count) 1551 - return opt_batch_size; 1552 - 1553 - return opt_pkt_count - pkt_cnt; 1554 - } 1555 - 1556 - static void complete_tx_only_all(void) 1557 - { 1558 - bool pending; 1559 - int i; 1560 - 1561 - do { 1562 - pending = false; 1563 - for (i = 0; i < num_socks; i++) { 1564 - if (xsks[i]->outstanding_tx) { 1565 - complete_tx_only(xsks[i], opt_batch_size); 1566 - pending = !!xsks[i]->outstanding_tx; 1567 - } 1568 - } 1569 - sleep(1); 1570 - } while (pending && opt_retries-- > 0); 1571 - } 1572 - 1573 - static void tx_only_all(void) 1574 - { 1575 - struct pollfd fds[MAX_SOCKS] = {}; 1576 - u32 frame_nb[MAX_SOCKS] = {}; 1577 - unsigned long 
next_tx_ns = 0; 1578 - int pkt_cnt = 0; 1579 - int i, ret; 1580 - 1581 - if (opt_poll && opt_tx_cycle_ns) { 1582 - fprintf(stderr, 1583 - "Error: --poll and --tx-cycles are both set\n"); 1584 - return; 1585 - } 1586 - 1587 - for (i = 0; i < num_socks; i++) { 1588 - fds[0].fd = xsk_socket__fd(xsks[i]->xsk); 1589 - fds[0].events = POLLOUT; 1590 - } 1591 - 1592 - if (opt_tx_cycle_ns) { 1593 - /* Align Tx time to micro-second boundary */ 1594 - next_tx_ns = (get_nsecs() / NSEC_PER_USEC + 1) * 1595 - NSEC_PER_USEC; 1596 - next_tx_ns += opt_tx_cycle_ns; 1597 - 1598 - /* Initialize periodic Tx scheduling variance */ 1599 - tx_cycle_diff_min = 1000000000; 1600 - tx_cycle_diff_max = 0; 1601 - tx_cycle_diff_ave = 0.0; 1602 - } 1603 - 1604 - while ((opt_pkt_count && pkt_cnt < opt_pkt_count) || !opt_pkt_count) { 1605 - int batch_size = get_batch_size(pkt_cnt); 1606 - unsigned long tx_ns = 0; 1607 - struct timespec next; 1608 - int tx_cnt = 0; 1609 - long diff; 1610 - int err; 1611 - 1612 - if (opt_poll) { 1613 - for (i = 0; i < num_socks; i++) 1614 - xsks[i]->app_stats.opt_polls++; 1615 - ret = poll(fds, num_socks, opt_timeout); 1616 - if (ret <= 0) 1617 - continue; 1618 - 1619 - if (!(fds[0].revents & POLLOUT)) 1620 - continue; 1621 - } 1622 - 1623 - if (opt_tx_cycle_ns) { 1624 - next.tv_sec = next_tx_ns / NSEC_PER_SEC; 1625 - next.tv_nsec = next_tx_ns % NSEC_PER_SEC; 1626 - err = clock_nanosleep(opt_clock, TIMER_ABSTIME, &next, NULL); 1627 - if (err) { 1628 - if (err != EINTR) 1629 - fprintf(stderr, 1630 - "clock_nanosleep failed. 
Err:%d errno:%d\n", 1631 - err, errno); 1632 - break; 1633 - } 1634 - 1635 - /* Measure periodic Tx scheduling variance */ 1636 - tx_ns = get_nsecs(); 1637 - diff = tx_ns - next_tx_ns; 1638 - if (diff < tx_cycle_diff_min) 1639 - tx_cycle_diff_min = diff; 1640 - 1641 - if (diff > tx_cycle_diff_max) 1642 - tx_cycle_diff_max = diff; 1643 - 1644 - tx_cycle_diff_ave += (double)diff; 1645 - tx_cycle_cnt++; 1646 - } else if (opt_tstamp) { 1647 - tx_ns = get_nsecs(); 1648 - } 1649 - 1650 - for (i = 0; i < num_socks; i++) 1651 - tx_cnt += tx_only(xsks[i], &frame_nb[i], batch_size, tx_ns); 1652 - 1653 - pkt_cnt += tx_cnt; 1654 - 1655 - if (benchmark_done) 1656 - break; 1657 - 1658 - if (opt_tx_cycle_ns) 1659 - next_tx_ns += opt_tx_cycle_ns; 1660 - } 1661 - 1662 - if (opt_pkt_count) 1663 - complete_tx_only_all(); 1664 - } 1665 - 1666 - static void l2fwd(struct xsk_socket_info *xsk) 1667 - { 1668 - unsigned int rcvd, i; 1669 - u32 idx_rx = 0, idx_tx = 0; 1670 - int ret; 1671 - 1672 - complete_tx_l2fwd(xsk); 1673 - 1674 - rcvd = xsk_ring_cons__peek(&xsk->rx, opt_batch_size, &idx_rx); 1675 - if (!rcvd) { 1676 - if (opt_busy_poll || xsk_ring_prod__needs_wakeup(&xsk->umem->fq)) { 1677 - xsk->app_stats.rx_empty_polls++; 1678 - recvfrom(xsk_socket__fd(xsk->xsk), NULL, 0, MSG_DONTWAIT, NULL, NULL); 1679 - } 1680 - return; 1681 - } 1682 - xsk->ring_stats.rx_npkts += rcvd; 1683 - 1684 - ret = xsk_ring_prod__reserve(&xsk->tx, rcvd, &idx_tx); 1685 - while (ret != rcvd) { 1686 - if (ret < 0) 1687 - exit_with_error(-ret); 1688 - complete_tx_l2fwd(xsk); 1689 - if (opt_busy_poll || xsk_ring_prod__needs_wakeup(&xsk->tx)) { 1690 - xsk->app_stats.tx_wakeup_sendtos++; 1691 - kick_tx(xsk); 1692 - } 1693 - ret = xsk_ring_prod__reserve(&xsk->tx, rcvd, &idx_tx); 1694 - } 1695 - 1696 - for (i = 0; i < rcvd; i++) { 1697 - u64 addr = xsk_ring_cons__rx_desc(&xsk->rx, idx_rx)->addr; 1698 - u32 len = xsk_ring_cons__rx_desc(&xsk->rx, idx_rx++)->len; 1699 - u64 orig = addr; 1700 - 1701 - addr = 
xsk_umem__add_offset_to_addr(addr); 1702 - char *pkt = xsk_umem__get_data(xsk->umem->buffer, addr); 1703 - 1704 - swap_mac_addresses(pkt); 1705 - 1706 - hex_dump(pkt, len, addr); 1707 - xsk_ring_prod__tx_desc(&xsk->tx, idx_tx)->addr = orig; 1708 - xsk_ring_prod__tx_desc(&xsk->tx, idx_tx++)->len = len; 1709 - } 1710 - 1711 - xsk_ring_prod__submit(&xsk->tx, rcvd); 1712 - xsk_ring_cons__release(&xsk->rx, rcvd); 1713 - 1714 - xsk->ring_stats.tx_npkts += rcvd; 1715 - xsk->outstanding_tx += rcvd; 1716 - } 1717 - 1718 - static void l2fwd_all(void) 1719 - { 1720 - struct pollfd fds[MAX_SOCKS] = {}; 1721 - int i, ret; 1722 - 1723 - for (;;) { 1724 - if (opt_poll) { 1725 - for (i = 0; i < num_socks; i++) { 1726 - fds[i].fd = xsk_socket__fd(xsks[i]->xsk); 1727 - fds[i].events = POLLOUT | POLLIN; 1728 - xsks[i]->app_stats.opt_polls++; 1729 - } 1730 - ret = poll(fds, num_socks, opt_timeout); 1731 - if (ret <= 0) 1732 - continue; 1733 - } 1734 - 1735 - for (i = 0; i < num_socks; i++) 1736 - l2fwd(xsks[i]); 1737 - 1738 - if (benchmark_done) 1739 - break; 1740 - } 1741 - } 1742 - 1743 - static void load_xdp_program(char **argv, struct bpf_object **obj) 1744 - { 1745 - struct bpf_prog_load_attr prog_load_attr = { 1746 - .prog_type = BPF_PROG_TYPE_XDP, 1747 - }; 1748 - char xdp_filename[256]; 1749 - int prog_fd; 1750 - 1751 - snprintf(xdp_filename, sizeof(xdp_filename), "%s_kern.o", argv[0]); 1752 - prog_load_attr.file = xdp_filename; 1753 - 1754 - if (bpf_prog_load_xattr(&prog_load_attr, obj, &prog_fd)) 1755 - exit(EXIT_FAILURE); 1756 - if (prog_fd < 0) { 1757 - fprintf(stderr, "ERROR: no program found: %s\n", 1758 - strerror(prog_fd)); 1759 - exit(EXIT_FAILURE); 1760 - } 1761 - 1762 - if (bpf_xdp_attach(opt_ifindex, prog_fd, opt_xdp_flags, NULL) < 0) { 1763 - fprintf(stderr, "ERROR: link set xdp fd failed\n"); 1764 - exit(EXIT_FAILURE); 1765 - } 1766 - } 1767 - 1768 - static void enter_xsks_into_map(struct bpf_object *obj) 1769 - { 1770 - struct bpf_map *map; 1771 - int i, 
xsks_map; 1772 - 1773 - map = bpf_object__find_map_by_name(obj, "xsks_map"); 1774 - xsks_map = bpf_map__fd(map); 1775 - if (xsks_map < 0) { 1776 - fprintf(stderr, "ERROR: no xsks map found: %s\n", 1777 - strerror(xsks_map)); 1778 - exit(EXIT_FAILURE); 1779 - } 1780 - 1781 - for (i = 0; i < num_socks; i++) { 1782 - int fd = xsk_socket__fd(xsks[i]->xsk); 1783 - int key, ret; 1784 - 1785 - key = i; 1786 - ret = bpf_map_update_elem(xsks_map, &key, &fd, 0); 1787 - if (ret) { 1788 - fprintf(stderr, "ERROR: bpf_map_update_elem %d\n", i); 1789 - exit(EXIT_FAILURE); 1790 - } 1791 - } 1792 - } 1793 - 1794 - static void apply_setsockopt(struct xsk_socket_info *xsk) 1795 - { 1796 - int sock_opt; 1797 - 1798 - if (!opt_busy_poll) 1799 - return; 1800 - 1801 - sock_opt = 1; 1802 - if (setsockopt(xsk_socket__fd(xsk->xsk), SOL_SOCKET, SO_PREFER_BUSY_POLL, 1803 - (void *)&sock_opt, sizeof(sock_opt)) < 0) 1804 - exit_with_error(errno); 1805 - 1806 - sock_opt = 20; 1807 - if (setsockopt(xsk_socket__fd(xsk->xsk), SOL_SOCKET, SO_BUSY_POLL, 1808 - (void *)&sock_opt, sizeof(sock_opt)) < 0) 1809 - exit_with_error(errno); 1810 - 1811 - sock_opt = opt_batch_size; 1812 - if (setsockopt(xsk_socket__fd(xsk->xsk), SOL_SOCKET, SO_BUSY_POLL_BUDGET, 1813 - (void *)&sock_opt, sizeof(sock_opt)) < 0) 1814 - exit_with_error(errno); 1815 - } 1816 - 1817 - static int recv_xsks_map_fd_from_ctrl_node(int sock, int *_fd) 1818 - { 1819 - char cms[CMSG_SPACE(sizeof(int))]; 1820 - struct cmsghdr *cmsg; 1821 - struct msghdr msg; 1822 - struct iovec iov; 1823 - int value; 1824 - int len; 1825 - 1826 - iov.iov_base = &value; 1827 - iov.iov_len = sizeof(int); 1828 - 1829 - msg.msg_name = 0; 1830 - msg.msg_namelen = 0; 1831 - msg.msg_iov = &iov; 1832 - msg.msg_iovlen = 1; 1833 - msg.msg_flags = 0; 1834 - msg.msg_control = (caddr_t)cms; 1835 - msg.msg_controllen = sizeof(cms); 1836 - 1837 - len = recvmsg(sock, &msg, 0); 1838 - 1839 - if (len < 0) { 1840 - fprintf(stderr, "Recvmsg failed length incorrect.\n"); 1841 - 
return -EINVAL; 1842 - } 1843 - 1844 - if (len == 0) { 1845 - fprintf(stderr, "Recvmsg failed no data\n"); 1846 - return -EINVAL; 1847 - } 1848 - 1849 - cmsg = CMSG_FIRSTHDR(&msg); 1850 - *_fd = *(int *)CMSG_DATA(cmsg); 1851 - 1852 - return 0; 1853 - } 1854 - 1855 - static int 1856 - recv_xsks_map_fd(int *xsks_map_fd) 1857 - { 1858 - struct sockaddr_un server; 1859 - int err; 1860 - 1861 - sock = socket(AF_UNIX, SOCK_STREAM, 0); 1862 - if (sock < 0) { 1863 - fprintf(stderr, "Error opening socket stream: %s", strerror(errno)); 1864 - return errno; 1865 - } 1866 - 1867 - server.sun_family = AF_UNIX; 1868 - strcpy(server.sun_path, SOCKET_NAME); 1869 - 1870 - if (connect(sock, (struct sockaddr *)&server, sizeof(struct sockaddr_un)) < 0) { 1871 - close(sock); 1872 - fprintf(stderr, "Error connecting stream socket: %s", strerror(errno)); 1873 - return errno; 1874 - } 1875 - 1876 - err = recv_xsks_map_fd_from_ctrl_node(sock, xsks_map_fd); 1877 - if (err) { 1878 - fprintf(stderr, "Error %d receiving fd\n", err); 1879 - return err; 1880 - } 1881 - return 0; 1882 - } 1883 - 1884 - int main(int argc, char **argv) 1885 - { 1886 - struct __user_cap_header_struct hdr = { _LINUX_CAPABILITY_VERSION_3, 0 }; 1887 - struct __user_cap_data_struct data[2] = { { 0 } }; 1888 - bool rx = false, tx = false; 1889 - struct sched_param schparam; 1890 - struct xsk_umem_info *umem; 1891 - struct bpf_object *obj; 1892 - int xsks_map_fd = 0; 1893 - pthread_t pt; 1894 - int i, ret; 1895 - void *bufs; 1896 - 1897 - parse_command_line(argc, argv); 1898 - 1899 - if (opt_reduced_cap) { 1900 - if (capget(&hdr, data) < 0) 1901 - fprintf(stderr, "Error getting capabilities\n"); 1902 - 1903 - data->effective &= CAP_TO_MASK(CAP_NET_RAW); 1904 - data->permitted &= CAP_TO_MASK(CAP_NET_RAW); 1905 - 1906 - if (capset(&hdr, data) < 0) 1907 - fprintf(stderr, "Setting capabilities failed\n"); 1908 - 1909 - if (capget(&hdr, data) < 0) { 1910 - fprintf(stderr, "Error getting capabilities\n"); 1911 - } else { 1912 - 
fprintf(stderr, "Capabilities EFF %x Caps INH %x Caps Per %x\n", 1913 - data[0].effective, data[0].inheritable, data[0].permitted); 1914 - fprintf(stderr, "Capabilities EFF %x Caps INH %x Caps Per %x\n", 1915 - data[1].effective, data[1].inheritable, data[1].permitted); 1916 - } 1917 - } else { 1918 - /* Use libbpf 1.0 API mode */ 1919 - libbpf_set_strict_mode(LIBBPF_STRICT_ALL); 1920 - 1921 - if (opt_num_xsks > 1) 1922 - load_xdp_program(argv, &obj); 1923 - } 1924 - 1925 - /* Reserve memory for the umem. Use hugepages if unaligned chunk mode */ 1926 - bufs = mmap(NULL, NUM_FRAMES * opt_xsk_frame_size, 1927 - PROT_READ | PROT_WRITE, 1928 - MAP_PRIVATE | MAP_ANONYMOUS | opt_mmap_flags, -1, 0); 1929 - if (bufs == MAP_FAILED) { 1930 - printf("ERROR: mmap failed\n"); 1931 - exit(EXIT_FAILURE); 1932 - } 1933 - 1934 - /* Create sockets... */ 1935 - umem = xsk_configure_umem(bufs, NUM_FRAMES * opt_xsk_frame_size); 1936 - if (opt_bench == BENCH_RXDROP || opt_bench == BENCH_L2FWD) { 1937 - rx = true; 1938 - xsk_populate_fill_ring(umem); 1939 - } 1940 - if (opt_bench == BENCH_L2FWD || opt_bench == BENCH_TXONLY) 1941 - tx = true; 1942 - for (i = 0; i < opt_num_xsks; i++) 1943 - xsks[num_socks++] = xsk_configure_socket(umem, rx, tx); 1944 - 1945 - for (i = 0; i < opt_num_xsks; i++) 1946 - apply_setsockopt(xsks[i]); 1947 - 1948 - if (opt_bench == BENCH_TXONLY) { 1949 - if (opt_tstamp && opt_pkt_size < PKTGEN_SIZE_MIN) 1950 - opt_pkt_size = PKTGEN_SIZE_MIN; 1951 - 1952 - gen_eth_hdr_data(); 1953 - 1954 - for (i = 0; i < NUM_FRAMES; i++) 1955 - gen_eth_frame(umem, i * opt_xsk_frame_size); 1956 - } 1957 - 1958 - if (opt_num_xsks > 1 && opt_bench != BENCH_TXONLY) 1959 - enter_xsks_into_map(obj); 1960 - 1961 - if (opt_reduced_cap) { 1962 - ret = recv_xsks_map_fd(&xsks_map_fd); 1963 - if (ret) { 1964 - fprintf(stderr, "Error %d receiving xsks_map_fd\n", ret); 1965 - exit_with_error(ret); 1966 - } 1967 - if (xsks[0]->xsk) { 1968 - ret = xsk_socket__update_xskmap(xsks[0]->xsk, 
xsks_map_fd); 1969 - if (ret) { 1970 - fprintf(stderr, "Update of BPF map failed(%d)\n", ret); 1971 - exit_with_error(ret); 1972 - } 1973 - } 1974 - } 1975 - 1976 - signal(SIGINT, int_exit); 1977 - signal(SIGTERM, int_exit); 1978 - signal(SIGABRT, int_exit); 1979 - 1980 - setlocale(LC_ALL, ""); 1981 - 1982 - prev_time = get_nsecs(); 1983 - start_time = prev_time; 1984 - 1985 - if (!opt_quiet) { 1986 - ret = pthread_create(&pt, NULL, poller, NULL); 1987 - if (ret) 1988 - exit_with_error(ret); 1989 - } 1990 - 1991 - /* Configure sched priority for better wake-up accuracy */ 1992 - memset(&schparam, 0, sizeof(schparam)); 1993 - schparam.sched_priority = opt_schprio; 1994 - ret = sched_setscheduler(0, opt_schpolicy, &schparam); 1995 - if (ret) { 1996 - fprintf(stderr, "Error(%d) in setting priority(%d): %s\n", 1997 - errno, opt_schprio, strerror(errno)); 1998 - goto out; 1999 - } 2000 - 2001 - if (opt_bench == BENCH_RXDROP) 2002 - rx_drop_all(); 2003 - else if (opt_bench == BENCH_TXONLY) 2004 - tx_only_all(); 2005 - else 2006 - l2fwd_all(); 2007 - 2008 - out: 2009 - benchmark_done = true; 2010 - 2011 - if (!opt_quiet) 2012 - pthread_join(pt, NULL); 2013 - 2014 - xdpsock_cleanup(); 2015 - 2016 - munmap(bufs, NUM_FRAMES * opt_xsk_frame_size); 2017 - 2018 - return 0; 2019 - }
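Note on the removed xdpsock sample: two small pieces of its logic are easy to lose in a diff of this size. One is the power-of-two test applied to --frame-size in aligned mode (in parse_command_line()). The other is the batch clamping done by get_batch_size() when --tx-pkt-count is set. The following is a minimal sketch of both, assuming the sample's globals are passed as parameters instead. The helper names is_power_of_two() and next_batch_size() are hypothetical and do not appear in the removed file.

```c
#include <stdbool.h>

/* Hypothetical helper: same (x & (x - 1)) test the sample applied to the
 * frame size in aligned mode. Unlike the original check, zero is rejected
 * explicitly here.
 */
static bool is_power_of_two(unsigned int x)
{
	return x != 0 && (x & (x - 1)) == 0;
}

/* Hypothetical helper: same decision as get_batch_size() in the sample,
 * with opt_batch_size and opt_pkt_count turned into parameters.
 * pkt_count == 0 means "send continuously", so the full batch is used.
 * Otherwise the last batch is shortened so that exactly pkt_count
 * packets are sent in total.
 */
static int next_batch_size(int sent, int batch_size, int pkt_count)
{
	if (!pkt_count)
		return batch_size;

	if (sent + batch_size <= pkt_count)
		return batch_size;

	return pkt_count - sent;
}
```

With a batch size of 64 and a packet count of 150, repeated calls to next_batch_size() yield batches of 64, 64 and 22, which is why tx_only_all() in the sample could stop exactly at the requested count.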
-1085
samples/bpf/xsk_fwd.c
-// SPDX-License-Identifier: GPL-2.0
-/* Copyright(c) 2020 Intel Corporation. */
-
-#define _GNU_SOURCE
-#include <poll.h>
-#include <pthread.h>
-#include <signal.h>
-#include <sched.h>
-#include <stdio.h>
-#include <stdlib.h>
-#include <string.h>
-#include <sys/mman.h>
-#include <sys/socket.h>
-#include <sys/types.h>
-#include <time.h>
-#include <unistd.h>
-#include <getopt.h>
-#include <netinet/ether.h>
-#include <net/if.h>
-
-#include <linux/bpf.h>
-#include <linux/if_link.h>
-#include <linux/if_xdp.h>
-
-#include <bpf/libbpf.h>
-#include <bpf/xsk.h>
-#include <bpf/bpf.h>
-
-/* libbpf APIs for AF_XDP are deprecated starting from v0.7 */
-#pragma GCC diagnostic ignored "-Wdeprecated-declarations"
-
-#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
-
-typedef __u64 u64;
-typedef __u32 u32;
-typedef __u16 u16;
-typedef __u8 u8;
-
-/* This program illustrates the packet forwarding between multiple AF_XDP
- * sockets in multi-threaded environment. All threads are sharing a common
- * buffer pool, with each socket having its own private buffer cache.
- *
- * Example 1: Single thread handling two sockets. The packets received by socket
- * A (interface IFA, queue QA) are forwarded to socket B (interface IFB, queue
- * QB), while the packets received by socket B are forwarded to socket A. The
- * thread is running on CPU core X:
- *
- *         ./xsk_fwd -i IFA -q QA -i IFB -q QB -c X
- *
- * Example 2: Two threads, each handling two sockets. The thread running on CPU
- * core X forwards all the packets received by socket A to socket B, and all the
- * packets received by socket B to socket A. The thread running on CPU core Y is
- * performing the same packet forwarding between sockets C and D:
- *
- *         ./xsk_fwd -i IFA -q QA -i IFB -q QB -i IFC -q QC -i IFD -q QD
- *         -c CX -c CY
- */
-
-/*
- * Buffer pool and buffer cache
- *
- * For packet forwarding, the packet buffers are typically allocated from the
- * pool for packet reception and freed back to the pool for further reuse once
- * the packet transmission is completed.
- *
- * The buffer pool is shared between multiple threads. In order to minimize the
- * access latency to the shared buffer pool, each thread creates one (or
- * several) buffer caches, which, unlike the buffer pool, are private to the
- * thread that creates them and therefore cannot be shared with other threads.
- * The access to the shared pool is only needed either (A) when the cache gets
- * empty due to repeated buffer allocations and it needs to be replenished from
- * the pool, or (B) when the cache gets full due to repeated buffer free and it
- * needs to be flushed back to the pull.
- *
- * In a packet forwarding system, a packet received on any input port can
- * potentially be transmitted on any output port, depending on the forwarding
- * configuration. For AF_XDP sockets, for this to work with zero-copy of the
- * packet buffers when, it is required that the buffer pool memory fits into the
- * UMEM area shared by all the sockets.
- */
-
-struct bpool_params {
-	u32 n_buffers;
-	u32 buffer_size;
-	int mmap_flags;
-
-	u32 n_users_max;
-	u32 n_buffers_per_slab;
-};
-
-/* This buffer pool implementation organizes the buffers into equally sized
- * slabs of *n_buffers_per_slab*. Initially, there are *n_slabs* slabs in the
- * pool that are completely filled with buffer pointers (full slabs).
- *
- * Each buffer cache has a slab for buffer allocation and a slab for buffer
- * free, with both of these slabs initially empty. When the cache's allocation
- * slab goes empty, it is swapped with one of the available full slabs from the
- * pool, if any is available. When the cache's free slab goes full, it is
- * swapped for one of the empty slabs from the pool, which is guaranteed to
- * succeed.
- *
- * Partially filled slabs never get traded between the cache and the pool
- * (except when the cache itself is destroyed), which enables fast operation
- * through pointer swapping.
- */
-struct bpool {
-	struct bpool_params params;
-	pthread_mutex_t lock;
-	void *addr;
-
-	u64 **slabs;
-	u64 **slabs_reserved;
-	u64 *buffers;
-	u64 *buffers_reserved;
-
-	u64 n_slabs;
-	u64 n_slabs_reserved;
-	u64 n_buffers;
-
-	u64 n_slabs_available;
-	u64 n_slabs_reserved_available;
-
-	struct xsk_umem_config umem_cfg;
-	struct xsk_ring_prod umem_fq;
-	struct xsk_ring_cons umem_cq;
-	struct xsk_umem *umem;
-};
-
-static struct bpool *
-bpool_init(struct bpool_params *params,
-	   struct xsk_umem_config *umem_cfg)
-{
-	u64 n_slabs, n_slabs_reserved, n_buffers, n_buffers_reserved;
-	u64 slabs_size, slabs_reserved_size;
-	u64 buffers_size, buffers_reserved_size;
-	u64 total_size, i;
-	struct bpool *bp;
-	u8 *p;
-	int status;
-
-	/* Use libbpf 1.0 API mode */
-	libbpf_set_strict_mode(LIBBPF_STRICT_ALL);
-
-	/* bpool internals dimensioning. */
-	n_slabs = (params->n_buffers + params->n_buffers_per_slab - 1) /
-		params->n_buffers_per_slab;
-	n_slabs_reserved = params->n_users_max * 2;
-	n_buffers = n_slabs * params->n_buffers_per_slab;
-	n_buffers_reserved = n_slabs_reserved * params->n_buffers_per_slab;
-
-	slabs_size = n_slabs * sizeof(u64 *);
-	slabs_reserved_size = n_slabs_reserved * sizeof(u64 *);
-	buffers_size = n_buffers * sizeof(u64);
-	buffers_reserved_size = n_buffers_reserved * sizeof(u64);
-
-	total_size = sizeof(struct bpool) +
-		slabs_size + slabs_reserved_size +
-		buffers_size + buffers_reserved_size;
-
-	/* bpool memory allocation. */
-	p = calloc(total_size, sizeof(u8));
-	if (!p)
-		return NULL;
-
-	/* bpool memory initialization. */
-	bp = (struct bpool *)p;
-	memcpy(&bp->params, params, sizeof(*params));
-	bp->params.n_buffers = n_buffers;
-
-	bp->slabs = (u64 **)&p[sizeof(struct bpool)];
-	bp->slabs_reserved = (u64 **)&p[sizeof(struct bpool) +
-		slabs_size];
-	bp->buffers = (u64 *)&p[sizeof(struct bpool) +
-		slabs_size + slabs_reserved_size];
-	bp->buffers_reserved = (u64 *)&p[sizeof(struct bpool) +
-		slabs_size + slabs_reserved_size + buffers_size];
-
-	bp->n_slabs = n_slabs;
-	bp->n_slabs_reserved = n_slabs_reserved;
-	bp->n_buffers = n_buffers;
-
-	for (i = 0; i < n_slabs; i++)
-		bp->slabs[i] = &bp->buffers[i * params->n_buffers_per_slab];
-	bp->n_slabs_available = n_slabs;
-
-	for (i = 0; i < n_slabs_reserved; i++)
-		bp->slabs_reserved[i] = &bp->buffers_reserved[i *
-			params->n_buffers_per_slab];
-	bp->n_slabs_reserved_available = n_slabs_reserved;
-
-	for (i = 0; i < n_buffers; i++)
-		bp->buffers[i] = i * params->buffer_size;
-
-	/* lock. */
-	status = pthread_mutex_init(&bp->lock, NULL);
-	if (status) {
-		free(p);
-		return NULL;
-	}
-
-	/* mmap.
*/ 145 - n_slabs = (params->n_buffers + params->n_buffers_per_slab - 1) / 146 - params->n_buffers_per_slab; 147 - n_slabs_reserved = params->n_users_max * 2; 148 - n_buffers = n_slabs * params->n_buffers_per_slab; 149 - n_buffers_reserved = n_slabs_reserved * params->n_buffers_per_slab; 150 - 151 - slabs_size = n_slabs * sizeof(u64 *); 152 - slabs_reserved_size = n_slabs_reserved * sizeof(u64 *); 153 - buffers_size = n_buffers * sizeof(u64); 154 - buffers_reserved_size = n_buffers_reserved * sizeof(u64); 155 - 156 - total_size = sizeof(struct bpool) + 157 - slabs_size + slabs_reserved_size + 158 - buffers_size + buffers_reserved_size; 159 - 160 - /* bpool memory allocation. */ 161 - p = calloc(total_size, sizeof(u8)); 162 - if (!p) 163 - return NULL; 164 - 165 - /* bpool memory initialization. */ 166 - bp = (struct bpool *)p; 167 - memcpy(&bp->params, params, sizeof(*params)); 168 - bp->params.n_buffers = n_buffers; 169 - 170 - bp->slabs = (u64 **)&p[sizeof(struct bpool)]; 171 - bp->slabs_reserved = (u64 **)&p[sizeof(struct bpool) + 172 - slabs_size]; 173 - bp->buffers = (u64 *)&p[sizeof(struct bpool) + 174 - slabs_size + slabs_reserved_size]; 175 - bp->buffers_reserved = (u64 *)&p[sizeof(struct bpool) + 176 - slabs_size + slabs_reserved_size + buffers_size]; 177 - 178 - bp->n_slabs = n_slabs; 179 - bp->n_slabs_reserved = n_slabs_reserved; 180 - bp->n_buffers = n_buffers; 181 - 182 - for (i = 0; i < n_slabs; i++) 183 - bp->slabs[i] = &bp->buffers[i * params->n_buffers_per_slab]; 184 - bp->n_slabs_available = n_slabs; 185 - 186 - for (i = 0; i < n_slabs_reserved; i++) 187 - bp->slabs_reserved[i] = &bp->buffers_reserved[i * 188 - params->n_buffers_per_slab]; 189 - bp->n_slabs_reserved_available = n_slabs_reserved; 190 - 191 - for (i = 0; i < n_buffers; i++) 192 - bp->buffers[i] = i * params->buffer_size; 193 - 194 - /* lock. */ 195 - status = pthread_mutex_init(&bp->lock, NULL); 196 - if (status) { 197 - free(p); 198 - return NULL; 199 - } 200 - 201 - /* mmap. 
*/ 202 - bp->addr = mmap(NULL, 203 - n_buffers * params->buffer_size, 204 - PROT_READ | PROT_WRITE, 205 - MAP_PRIVATE | MAP_ANONYMOUS | params->mmap_flags, 206 - -1, 207 - 0); 208 - if (bp->addr == MAP_FAILED) { 209 - pthread_mutex_destroy(&bp->lock); 210 - free(p); 211 - return NULL; 212 - } 213 - 214 - /* umem. */ 215 - status = xsk_umem__create(&bp->umem, 216 - bp->addr, 217 - bp->params.n_buffers * bp->params.buffer_size, 218 - &bp->umem_fq, 219 - &bp->umem_cq, 220 - umem_cfg); 221 - if (status) { 222 - munmap(bp->addr, bp->params.n_buffers * bp->params.buffer_size); 223 - pthread_mutex_destroy(&bp->lock); 224 - free(p); 225 - return NULL; 226 - } 227 - memcpy(&bp->umem_cfg, umem_cfg, sizeof(*umem_cfg)); 228 - 229 - return bp; 230 - } 231 - 232 - static void 233 - bpool_free(struct bpool *bp) 234 - { 235 - if (!bp) 236 - return; 237 - 238 - xsk_umem__delete(bp->umem); 239 - munmap(bp->addr, bp->params.n_buffers * bp->params.buffer_size); 240 - pthread_mutex_destroy(&bp->lock); 241 - free(bp); 242 - } 243 - 244 - struct bcache { 245 - struct bpool *bp; 246 - 247 - u64 *slab_cons; 248 - u64 *slab_prod; 249 - 250 - u64 n_buffers_cons; 251 - u64 n_buffers_prod; 252 - }; 253 - 254 - static u32 255 - bcache_slab_size(struct bcache *bc) 256 - { 257 - struct bpool *bp = bc->bp; 258 - 259 - return bp->params.n_buffers_per_slab; 260 - } 261 - 262 - static struct bcache * 263 - bcache_init(struct bpool *bp) 264 - { 265 - struct bcache *bc; 266 - 267 - bc = calloc(1, sizeof(struct bcache)); 268 - if (!bc) 269 - return NULL; 270 - 271 - bc->bp = bp; 272 - bc->n_buffers_cons = 0; 273 - bc->n_buffers_prod = 0; 274 - 275 - pthread_mutex_lock(&bp->lock); 276 - if (bp->n_slabs_reserved_available == 0) { 277 - pthread_mutex_unlock(&bp->lock); 278 - free(bc); 279 - return NULL; 280 - } 281 - 282 - bc->slab_cons = bp->slabs_reserved[bp->n_slabs_reserved_available - 1]; 283 - bc->slab_prod = bp->slabs_reserved[bp->n_slabs_reserved_available - 2]; 284 - bp->n_slabs_reserved_available 
-= 2; 285 - pthread_mutex_unlock(&bp->lock); 286 - 287 - return bc; 288 - } 289 - 290 - static void 291 - bcache_free(struct bcache *bc) 292 - { 293 - struct bpool *bp; 294 - 295 - if (!bc) 296 - return; 297 - 298 - /* In order to keep this example simple, the case of freeing any 299 - * existing buffers from the cache back to the pool is ignored. 300 - */ 301 - 302 - bp = bc->bp; 303 - pthread_mutex_lock(&bp->lock); 304 - bp->slabs_reserved[bp->n_slabs_reserved_available] = bc->slab_prod; 305 - bp->slabs_reserved[bp->n_slabs_reserved_available + 1] = bc->slab_cons; 306 - bp->n_slabs_reserved_available += 2; 307 - pthread_mutex_unlock(&bp->lock); 308 - 309 - free(bc); 310 - } 311 - 312 - /* To work correctly, the implementation requires that the *n_buffers* input 313 - * argument is never greater than the buffer pool's *n_buffers_per_slab*. This 314 - * is typically the case, with one exception taking place when large number of 315 - * buffers are allocated at init time (e.g. for the UMEM fill queue setup). 316 - */ 317 - static inline u32 318 - bcache_cons_check(struct bcache *bc, u32 n_buffers) 319 - { 320 - struct bpool *bp = bc->bp; 321 - u64 n_buffers_per_slab = bp->params.n_buffers_per_slab; 322 - u64 n_buffers_cons = bc->n_buffers_cons; 323 - u64 n_slabs_available; 324 - u64 *slab_full; 325 - 326 - /* 327 - * Consumer slab is not empty: Use what's available locally. Do not 328 - * look for more buffers from the pool when the ask can only be 329 - * partially satisfied. 330 - */ 331 - if (n_buffers_cons) 332 - return (n_buffers_cons < n_buffers) ? 333 - n_buffers_cons : 334 - n_buffers; 335 - 336 - /* 337 - * Consumer slab is empty: look to trade the current consumer slab 338 - * (full) for a full slab from the pool, if any is available. 
339 - */ 340 - pthread_mutex_lock(&bp->lock); 341 - n_slabs_available = bp->n_slabs_available; 342 - if (!n_slabs_available) { 343 - pthread_mutex_unlock(&bp->lock); 344 - return 0; 345 - } 346 - 347 - n_slabs_available--; 348 - slab_full = bp->slabs[n_slabs_available]; 349 - bp->slabs[n_slabs_available] = bc->slab_cons; 350 - bp->n_slabs_available = n_slabs_available; 351 - pthread_mutex_unlock(&bp->lock); 352 - 353 - bc->slab_cons = slab_full; 354 - bc->n_buffers_cons = n_buffers_per_slab; 355 - return n_buffers; 356 - } 357 - 358 - static inline u64 359 - bcache_cons(struct bcache *bc) 360 - { 361 - u64 n_buffers_cons = bc->n_buffers_cons - 1; 362 - u64 buffer; 363 - 364 - buffer = bc->slab_cons[n_buffers_cons]; 365 - bc->n_buffers_cons = n_buffers_cons; 366 - return buffer; 367 - } 368 - 369 - static inline void 370 - bcache_prod(struct bcache *bc, u64 buffer) 371 - { 372 - struct bpool *bp = bc->bp; 373 - u64 n_buffers_per_slab = bp->params.n_buffers_per_slab; 374 - u64 n_buffers_prod = bc->n_buffers_prod; 375 - u64 n_slabs_available; 376 - u64 *slab_empty; 377 - 378 - /* 379 - * Producer slab is not yet full: store the current buffer to it. 380 - */ 381 - if (n_buffers_prod < n_buffers_per_slab) { 382 - bc->slab_prod[n_buffers_prod] = buffer; 383 - bc->n_buffers_prod = n_buffers_prod + 1; 384 - return; 385 - } 386 - 387 - /* 388 - * Producer slab is full: trade the cache's current producer slab 389 - * (full) for an empty slab from the pool, then store the current 390 - * buffer to the new producer slab. As one full slab exists in the 391 - * cache, it is guaranteed that there is at least one empty slab 392 - * available in the pool. 
393 - */ 394 - pthread_mutex_lock(&bp->lock); 395 - n_slabs_available = bp->n_slabs_available; 396 - slab_empty = bp->slabs[n_slabs_available]; 397 - bp->slabs[n_slabs_available] = bc->slab_prod; 398 - bp->n_slabs_available = n_slabs_available + 1; 399 - pthread_mutex_unlock(&bp->lock); 400 - 401 - slab_empty[0] = buffer; 402 - bc->slab_prod = slab_empty; 403 - bc->n_buffers_prod = 1; 404 - } 405 - 406 - /* 407 - * Port 408 - * 409 - * Each of the forwarding ports sits on top of an AF_XDP socket. In order for 410 - * packet forwarding to happen with no packet buffer copy, all the sockets need 411 - * to share the same UMEM area, which is used as the buffer pool memory. 412 - */ 413 - #ifndef MAX_BURST_RX 414 - #define MAX_BURST_RX 64 415 - #endif 416 - 417 - #ifndef MAX_BURST_TX 418 - #define MAX_BURST_TX 64 419 - #endif 420 - 421 - struct burst_rx { 422 - u64 addr[MAX_BURST_RX]; 423 - u32 len[MAX_BURST_RX]; 424 - }; 425 - 426 - struct burst_tx { 427 - u64 addr[MAX_BURST_TX]; 428 - u32 len[MAX_BURST_TX]; 429 - u32 n_pkts; 430 - }; 431 - 432 - struct port_params { 433 - struct xsk_socket_config xsk_cfg; 434 - struct bpool *bp; 435 - const char *iface; 436 - u32 iface_queue; 437 - }; 438 - 439 - struct port { 440 - struct port_params params; 441 - 442 - struct bcache *bc; 443 - 444 - struct xsk_ring_cons rxq; 445 - struct xsk_ring_prod txq; 446 - struct xsk_ring_prod umem_fq; 447 - struct xsk_ring_cons umem_cq; 448 - struct xsk_socket *xsk; 449 - int umem_fq_initialized; 450 - 451 - u64 n_pkts_rx; 452 - u64 n_pkts_tx; 453 - }; 454 - 455 - static void 456 - port_free(struct port *p) 457 - { 458 - if (!p) 459 - return; 460 - 461 - /* To keep this example simple, the code to free the buffers from the 462 - * socket's receive and transmit queues, as well as from the UMEM fill 463 - * and completion queues, is not included. 
464 - */ 465 - 466 - if (p->xsk) 467 - xsk_socket__delete(p->xsk); 468 - 469 - bcache_free(p->bc); 470 - 471 - free(p); 472 - } 473 - 474 - static struct port * 475 - port_init(struct port_params *params) 476 - { 477 - struct port *p; 478 - u32 umem_fq_size, pos = 0; 479 - int status, i; 480 - 481 - /* Memory allocation and initialization. */ 482 - p = calloc(sizeof(struct port), 1); 483 - if (!p) 484 - return NULL; 485 - 486 - memcpy(&p->params, params, sizeof(p->params)); 487 - umem_fq_size = params->bp->umem_cfg.fill_size; 488 - 489 - /* bcache. */ 490 - p->bc = bcache_init(params->bp); 491 - if (!p->bc || 492 - (bcache_slab_size(p->bc) < umem_fq_size) || 493 - (bcache_cons_check(p->bc, umem_fq_size) < umem_fq_size)) { 494 - port_free(p); 495 - return NULL; 496 - } 497 - 498 - /* xsk socket. */ 499 - status = xsk_socket__create_shared(&p->xsk, 500 - params->iface, 501 - params->iface_queue, 502 - params->bp->umem, 503 - &p->rxq, 504 - &p->txq, 505 - &p->umem_fq, 506 - &p->umem_cq, 507 - &params->xsk_cfg); 508 - if (status) { 509 - port_free(p); 510 - return NULL; 511 - } 512 - 513 - /* umem fq. */ 514 - xsk_ring_prod__reserve(&p->umem_fq, umem_fq_size, &pos); 515 - 516 - for (i = 0; i < umem_fq_size; i++) 517 - *xsk_ring_prod__fill_addr(&p->umem_fq, pos + i) = 518 - bcache_cons(p->bc); 519 - 520 - xsk_ring_prod__submit(&p->umem_fq, umem_fq_size); 521 - p->umem_fq_initialized = 1; 522 - 523 - return p; 524 - } 525 - 526 - static inline u32 527 - port_rx_burst(struct port *p, struct burst_rx *b) 528 - { 529 - u32 n_pkts, pos, i; 530 - 531 - /* Free buffers for FQ replenish. */ 532 - n_pkts = ARRAY_SIZE(b->addr); 533 - 534 - n_pkts = bcache_cons_check(p->bc, n_pkts); 535 - if (!n_pkts) 536 - return 0; 537 - 538 - /* RXQ. 
*/ 539 - n_pkts = xsk_ring_cons__peek(&p->rxq, n_pkts, &pos); 540 - if (!n_pkts) { 541 - if (xsk_ring_prod__needs_wakeup(&p->umem_fq)) { 542 - struct pollfd pollfd = { 543 - .fd = xsk_socket__fd(p->xsk), 544 - .events = POLLIN, 545 - }; 546 - 547 - poll(&pollfd, 1, 0); 548 - } 549 - return 0; 550 - } 551 - 552 - for (i = 0; i < n_pkts; i++) { 553 - b->addr[i] = xsk_ring_cons__rx_desc(&p->rxq, pos + i)->addr; 554 - b->len[i] = xsk_ring_cons__rx_desc(&p->rxq, pos + i)->len; 555 - } 556 - 557 - xsk_ring_cons__release(&p->rxq, n_pkts); 558 - p->n_pkts_rx += n_pkts; 559 - 560 - /* UMEM FQ. */ 561 - for ( ; ; ) { 562 - int status; 563 - 564 - status = xsk_ring_prod__reserve(&p->umem_fq, n_pkts, &pos); 565 - if (status == n_pkts) 566 - break; 567 - 568 - if (xsk_ring_prod__needs_wakeup(&p->umem_fq)) { 569 - struct pollfd pollfd = { 570 - .fd = xsk_socket__fd(p->xsk), 571 - .events = POLLIN, 572 - }; 573 - 574 - poll(&pollfd, 1, 0); 575 - } 576 - } 577 - 578 - for (i = 0; i < n_pkts; i++) 579 - *xsk_ring_prod__fill_addr(&p->umem_fq, pos + i) = 580 - bcache_cons(p->bc); 581 - 582 - xsk_ring_prod__submit(&p->umem_fq, n_pkts); 583 - 584 - return n_pkts; 585 - } 586 - 587 - static inline void 588 - port_tx_burst(struct port *p, struct burst_tx *b) 589 - { 590 - u32 n_pkts, pos, i; 591 - int status; 592 - 593 - /* UMEM CQ. */ 594 - n_pkts = p->params.bp->umem_cfg.comp_size; 595 - 596 - n_pkts = xsk_ring_cons__peek(&p->umem_cq, n_pkts, &pos); 597 - 598 - for (i = 0; i < n_pkts; i++) { 599 - u64 addr = *xsk_ring_cons__comp_addr(&p->umem_cq, pos + i); 600 - 601 - bcache_prod(p->bc, addr); 602 - } 603 - 604 - xsk_ring_cons__release(&p->umem_cq, n_pkts); 605 - 606 - /* TXQ. 
*/ 607 - n_pkts = b->n_pkts; 608 - 609 - for ( ; ; ) { 610 - status = xsk_ring_prod__reserve(&p->txq, n_pkts, &pos); 611 - if (status == n_pkts) 612 - break; 613 - 614 - if (xsk_ring_prod__needs_wakeup(&p->txq)) 615 - sendto(xsk_socket__fd(p->xsk), NULL, 0, MSG_DONTWAIT, 616 - NULL, 0); 617 - } 618 - 619 - for (i = 0; i < n_pkts; i++) { 620 - xsk_ring_prod__tx_desc(&p->txq, pos + i)->addr = b->addr[i]; 621 - xsk_ring_prod__tx_desc(&p->txq, pos + i)->len = b->len[i]; 622 - } 623 - 624 - xsk_ring_prod__submit(&p->txq, n_pkts); 625 - if (xsk_ring_prod__needs_wakeup(&p->txq)) 626 - sendto(xsk_socket__fd(p->xsk), NULL, 0, MSG_DONTWAIT, NULL, 0); 627 - p->n_pkts_tx += n_pkts; 628 - } 629 - 630 - /* 631 - * Thread 632 - * 633 - * Packet forwarding threads. 634 - */ 635 - #ifndef MAX_PORTS_PER_THREAD 636 - #define MAX_PORTS_PER_THREAD 16 637 - #endif 638 - 639 - struct thread_data { 640 - struct port *ports_rx[MAX_PORTS_PER_THREAD]; 641 - struct port *ports_tx[MAX_PORTS_PER_THREAD]; 642 - u32 n_ports_rx; 643 - struct burst_rx burst_rx; 644 - struct burst_tx burst_tx[MAX_PORTS_PER_THREAD]; 645 - u32 cpu_core_id; 646 - int quit; 647 - }; 648 - 649 - static void swap_mac_addresses(void *data) 650 - { 651 - struct ether_header *eth = (struct ether_header *)data; 652 - struct ether_addr *src_addr = (struct ether_addr *)&eth->ether_shost; 653 - struct ether_addr *dst_addr = (struct ether_addr *)&eth->ether_dhost; 654 - struct ether_addr tmp; 655 - 656 - tmp = *src_addr; 657 - *src_addr = *dst_addr; 658 - *dst_addr = tmp; 659 - } 660 - 661 - static void * 662 - thread_func(void *arg) 663 - { 664 - struct thread_data *t = arg; 665 - cpu_set_t cpu_cores; 666 - u32 i; 667 - 668 - CPU_ZERO(&cpu_cores); 669 - CPU_SET(t->cpu_core_id, &cpu_cores); 670 - pthread_setaffinity_np(pthread_self(), sizeof(cpu_set_t), &cpu_cores); 671 - 672 - for (i = 0; !t->quit; i = (i + 1) & (t->n_ports_rx - 1)) { 673 - struct port *port_rx = t->ports_rx[i]; 674 - struct port *port_tx = t->ports_tx[i]; 675 - 
struct burst_rx *brx = &t->burst_rx; 676 - struct burst_tx *btx = &t->burst_tx[i]; 677 - u32 n_pkts, j; 678 - 679 - /* RX. */ 680 - n_pkts = port_rx_burst(port_rx, brx); 681 - if (!n_pkts) 682 - continue; 683 - 684 - /* Process & TX. */ 685 - for (j = 0; j < n_pkts; j++) { 686 - u64 addr = xsk_umem__add_offset_to_addr(brx->addr[j]); 687 - u8 *pkt = xsk_umem__get_data(port_rx->params.bp->addr, 688 - addr); 689 - 690 - swap_mac_addresses(pkt); 691 - 692 - btx->addr[btx->n_pkts] = brx->addr[j]; 693 - btx->len[btx->n_pkts] = brx->len[j]; 694 - btx->n_pkts++; 695 - 696 - if (btx->n_pkts == MAX_BURST_TX) { 697 - port_tx_burst(port_tx, btx); 698 - btx->n_pkts = 0; 699 - } 700 - } 701 - } 702 - 703 - return NULL; 704 - } 705 - 706 - /* 707 - * Process 708 - */ 709 - static const struct bpool_params bpool_params_default = { 710 - .n_buffers = 64 * 1024, 711 - .buffer_size = XSK_UMEM__DEFAULT_FRAME_SIZE, 712 - .mmap_flags = 0, 713 - 714 - .n_users_max = 16, 715 - .n_buffers_per_slab = XSK_RING_PROD__DEFAULT_NUM_DESCS * 2, 716 - }; 717 - 718 - static const struct xsk_umem_config umem_cfg_default = { 719 - .fill_size = XSK_RING_PROD__DEFAULT_NUM_DESCS * 2, 720 - .comp_size = XSK_RING_CONS__DEFAULT_NUM_DESCS, 721 - .frame_size = XSK_UMEM__DEFAULT_FRAME_SIZE, 722 - .frame_headroom = XSK_UMEM__DEFAULT_FRAME_HEADROOM, 723 - .flags = 0, 724 - }; 725 - 726 - static const struct port_params port_params_default = { 727 - .xsk_cfg = { 728 - .rx_size = XSK_RING_CONS__DEFAULT_NUM_DESCS, 729 - .tx_size = XSK_RING_PROD__DEFAULT_NUM_DESCS, 730 - .libbpf_flags = 0, 731 - .xdp_flags = XDP_FLAGS_DRV_MODE, 732 - .bind_flags = XDP_USE_NEED_WAKEUP | XDP_ZEROCOPY, 733 - }, 734 - 735 - .bp = NULL, 736 - .iface = NULL, 737 - .iface_queue = 0, 738 - }; 739 - 740 - #ifndef MAX_PORTS 741 - #define MAX_PORTS 64 742 - #endif 743 - 744 - #ifndef MAX_THREADS 745 - #define MAX_THREADS 64 746 - #endif 747 - 748 - static struct bpool_params bpool_params; 749 - static struct xsk_umem_config umem_cfg; 750 - 
static struct bpool *bp; 751 - 752 - static struct port_params port_params[MAX_PORTS]; 753 - static struct port *ports[MAX_PORTS]; 754 - static u64 n_pkts_rx[MAX_PORTS]; 755 - static u64 n_pkts_tx[MAX_PORTS]; 756 - static int n_ports; 757 - 758 - static pthread_t threads[MAX_THREADS]; 759 - static struct thread_data thread_data[MAX_THREADS]; 760 - static int n_threads; 761 - 762 - static void 763 - print_usage(char *prog_name) 764 - { 765 - const char *usage = 766 - "Usage:\n" 767 - "\t%s [ -b SIZE ] -c CORE -i INTERFACE [ -q QUEUE ]\n" 768 - "\n" 769 - "-c CORE CPU core to run a packet forwarding thread\n" 770 - " on. May be invoked multiple times.\n" 771 - "\n" 772 - "-b SIZE Number of buffers in the buffer pool shared\n" 773 - " by all the forwarding threads. Default: %u.\n" 774 - "\n" 775 - "-i INTERFACE Network interface. Each (INTERFACE, QUEUE)\n" 776 - " pair specifies one forwarding port. May be\n" 777 - " invoked multiple times.\n" 778 - "\n" 779 - "-q QUEUE Network interface queue for RX and TX. Each\n" 780 - " (INTERFACE, QUEUE) pair specified one\n" 781 - " forwarding port. Default: %u. May be invoked\n" 782 - " multiple times.\n" 783 - "\n"; 784 - printf(usage, 785 - prog_name, 786 - bpool_params_default.n_buffers, 787 - port_params_default.iface_queue); 788 - } 789 - 790 - static int 791 - parse_args(int argc, char **argv) 792 - { 793 - struct option lgopts[] = { 794 - { NULL, 0, 0, 0 } 795 - }; 796 - int opt, option_index; 797 - 798 - /* Parse the input arguments. 
*/ 799 - for ( ; ;) { 800 - opt = getopt_long(argc, argv, "c:i:q:", lgopts, &option_index); 801 - if (opt == EOF) 802 - break; 803 - 804 - switch (opt) { 805 - case 'b': 806 - bpool_params.n_buffers = atoi(optarg); 807 - break; 808 - 809 - case 'c': 810 - if (n_threads == MAX_THREADS) { 811 - printf("Max number of threads (%d) reached.\n", 812 - MAX_THREADS); 813 - return -1; 814 - } 815 - 816 - thread_data[n_threads].cpu_core_id = atoi(optarg); 817 - n_threads++; 818 - break; 819 - 820 - case 'i': 821 - if (n_ports == MAX_PORTS) { 822 - printf("Max number of ports (%d) reached.\n", 823 - MAX_PORTS); 824 - return -1; 825 - } 826 - 827 - port_params[n_ports].iface = optarg; 828 - port_params[n_ports].iface_queue = 0; 829 - n_ports++; 830 - break; 831 - 832 - case 'q': 833 - if (n_ports == 0) { 834 - printf("No port specified for queue.\n"); 835 - return -1; 836 - } 837 - port_params[n_ports - 1].iface_queue = atoi(optarg); 838 - break; 839 - 840 - default: 841 - printf("Illegal argument.\n"); 842 - return -1; 843 - } 844 - } 845 - 846 - optind = 1; /* reset getopt lib */ 847 - 848 - /* Check the input arguments. 
*/ 849 - if (!n_ports) { 850 - printf("No ports specified.\n"); 851 - return -1; 852 - } 853 - 854 - if (!n_threads) { 855 - printf("No threads specified.\n"); 856 - return -1; 857 - } 858 - 859 - if (n_ports % n_threads) { 860 - printf("Ports cannot be evenly distributed to threads.\n"); 861 - return -1; 862 - } 863 - 864 - return 0; 865 - } 866 - 867 - static void 868 - print_port(u32 port_id) 869 - { 870 - struct port *port = ports[port_id]; 871 - 872 - printf("Port %u: interface = %s, queue = %u\n", 873 - port_id, port->params.iface, port->params.iface_queue); 874 - } 875 - 876 - static void 877 - print_thread(u32 thread_id) 878 - { 879 - struct thread_data *t = &thread_data[thread_id]; 880 - u32 i; 881 - 882 - printf("Thread %u (CPU core %u): ", 883 - thread_id, t->cpu_core_id); 884 - 885 - for (i = 0; i < t->n_ports_rx; i++) { 886 - struct port *port_rx = t->ports_rx[i]; 887 - struct port *port_tx = t->ports_tx[i]; 888 - 889 - printf("(%s, %u) -> (%s, %u), ", 890 - port_rx->params.iface, 891 - port_rx->params.iface_queue, 892 - port_tx->params.iface, 893 - port_tx->params.iface_queue); 894 - } 895 - 896 - printf("\n"); 897 - } 898 - 899 - static void 900 - print_port_stats_separator(void) 901 - { 902 - printf("+-%4s-+-%12s-+-%13s-+-%12s-+-%13s-+\n", 903 - "----", 904 - "------------", 905 - "-------------", 906 - "------------", 907 - "-------------"); 908 - } 909 - 910 - static void 911 - print_port_stats_header(void) 912 - { 913 - print_port_stats_separator(); 914 - printf("| %4s | %12s | %13s | %12s | %13s |\n", 915 - "Port", 916 - "RX packets", 917 - "RX rate (pps)", 918 - "TX packets", 919 - "TX_rate (pps)"); 920 - print_port_stats_separator(); 921 - } 922 - 923 - static void 924 - print_port_stats_trailer(void) 925 - { 926 - print_port_stats_separator(); 927 - printf("\n"); 928 - } 929 - 930 - static void 931 - print_port_stats(int port_id, u64 ns_diff) 932 - { 933 - struct port *p = ports[port_id]; 934 - double rx_pps, tx_pps; 935 - 936 - rx_pps = 
(p->n_pkts_rx - n_pkts_rx[port_id]) * 1000000000. / ns_diff; 937 - tx_pps = (p->n_pkts_tx - n_pkts_tx[port_id]) * 1000000000. / ns_diff; 938 - 939 - printf("| %4d | %12llu | %13.0f | %12llu | %13.0f |\n", 940 - port_id, 941 - p->n_pkts_rx, 942 - rx_pps, 943 - p->n_pkts_tx, 944 - tx_pps); 945 - 946 - n_pkts_rx[port_id] = p->n_pkts_rx; 947 - n_pkts_tx[port_id] = p->n_pkts_tx; 948 - } 949 - 950 - static void 951 - print_port_stats_all(u64 ns_diff) 952 - { 953 - int i; 954 - 955 - print_port_stats_header(); 956 - for (i = 0; i < n_ports; i++) 957 - print_port_stats(i, ns_diff); 958 - print_port_stats_trailer(); 959 - } 960 - 961 - static int quit; 962 - 963 - static void 964 - signal_handler(int sig) 965 - { 966 - quit = 1; 967 - } 968 - 969 - static void remove_xdp_program(void) 970 - { 971 - int i; 972 - 973 - for (i = 0 ; i < n_ports; i++) 974 - bpf_xdp_detach(if_nametoindex(port_params[i].iface), 975 - port_params[i].xsk_cfg.xdp_flags, NULL); 976 - } 977 - 978 - int main(int argc, char **argv) 979 - { 980 - struct timespec time; 981 - u64 ns0; 982 - int i; 983 - 984 - /* Parse args. */ 985 - memcpy(&bpool_params, &bpool_params_default, 986 - sizeof(struct bpool_params)); 987 - memcpy(&umem_cfg, &umem_cfg_default, 988 - sizeof(struct xsk_umem_config)); 989 - for (i = 0; i < MAX_PORTS; i++) 990 - memcpy(&port_params[i], &port_params_default, 991 - sizeof(struct port_params)); 992 - 993 - if (parse_args(argc, argv)) { 994 - print_usage(argv[0]); 995 - return -1; 996 - } 997 - 998 - /* Buffer pool initialization. */ 999 - bp = bpool_init(&bpool_params, &umem_cfg); 1000 - if (!bp) { 1001 - printf("Buffer pool initialization failed.\n"); 1002 - return -1; 1003 - } 1004 - printf("Buffer pool created successfully.\n"); 1005 - 1006 - /* Ports initialization. 
*/ 1007 - for (i = 0; i < MAX_PORTS; i++) 1008 - port_params[i].bp = bp; 1009 - 1010 - for (i = 0; i < n_ports; i++) { 1011 - ports[i] = port_init(&port_params[i]); 1012 - if (!ports[i]) { 1013 - printf("Port %d initialization failed.\n", i); 1014 - return -1; 1015 - } 1016 - print_port(i); 1017 - } 1018 - printf("All ports created successfully.\n"); 1019 - 1020 - /* Threads. */ 1021 - for (i = 0; i < n_threads; i++) { 1022 - struct thread_data *t = &thread_data[i]; 1023 - u32 n_ports_per_thread = n_ports / n_threads, j; 1024 - 1025 - for (j = 0; j < n_ports_per_thread; j++) { 1026 - t->ports_rx[j] = ports[i * n_ports_per_thread + j]; 1027 - t->ports_tx[j] = ports[i * n_ports_per_thread + 1028 - (j + 1) % n_ports_per_thread]; 1029 - } 1030 - 1031 - t->n_ports_rx = n_ports_per_thread; 1032 - 1033 - print_thread(i); 1034 - } 1035 - 1036 - for (i = 0; i < n_threads; i++) { 1037 - int status; 1038 - 1039 - status = pthread_create(&threads[i], 1040 - NULL, 1041 - thread_func, 1042 - &thread_data[i]); 1043 - if (status) { 1044 - printf("Thread %d creation failed.\n", i); 1045 - return -1; 1046 - } 1047 - } 1048 - printf("All threads created successfully.\n"); 1049 - 1050 - /* Print statistics. */ 1051 - signal(SIGINT, signal_handler); 1052 - signal(SIGTERM, signal_handler); 1053 - signal(SIGABRT, signal_handler); 1054 - 1055 - clock_gettime(CLOCK_MONOTONIC, &time); 1056 - ns0 = time.tv_sec * 1000000000UL + time.tv_nsec; 1057 - for ( ; !quit; ) { 1058 - u64 ns1, ns_diff; 1059 - 1060 - sleep(1); 1061 - clock_gettime(CLOCK_MONOTONIC, &time); 1062 - ns1 = time.tv_sec * 1000000000UL + time.tv_nsec; 1063 - ns_diff = ns1 - ns0; 1064 - ns0 = ns1; 1065 - 1066 - print_port_stats_all(ns_diff); 1067 - } 1068 - 1069 - /* Threads completion. 
*/ 1070 - printf("Quit.\n"); 1071 - for (i = 0; i < n_threads; i++) 1072 - thread_data[i].quit = 1; 1073 - 1074 - for (i = 0; i < n_threads; i++) 1075 - pthread_join(threads[i], NULL); 1076 - 1077 - for (i = 0; i < n_ports; i++) 1078 - port_free(ports[i]); 1079 - 1080 - bpool_free(bp); 1081 - 1082 - remove_xdp_program(); 1083 - 1084 - return 0; 1085 - }
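The slab-swapping scheme described in the comments of the removed sample (full slabs traded from the pool when a cache's allocation slab runs empty, empty slabs traded back when its free slab fills up) can be sketched in isolation. This is a simplified, single-threaded illustration and not the sample's code: the names `pool`, `cache`, `N_SLABS`, `SLAB_SIZE` and `slab_roundtrip()` are assumptions made for the sketch, storage is static instead of one `calloc()` region, there is no UMEM or AF_XDP socket, and the `pthread_mutex` that the sample takes around every pool access is left out.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define N_SLABS   4	/* slabs owned by the pool (hypothetical size) */
#define SLAB_SIZE 8	/* buffers per slab (hypothetical size) */

/* Pool: slabs[0 .. n_full-1] are full, slabs[n_full .. N_SLABS-1] are empty. */
struct pool {
	uint64_t *slabs[N_SLABS];
	uint32_t n_full;
};

/* Cache: one slab to allocate from (cons), one slab to free into (prod). */
struct cache {
	struct pool *pool;
	uint64_t *slab_cons;
	uint64_t *slab_prod;
	uint32_t n_cons;
	uint32_t n_prod;
};

static uint64_t pool_storage[N_SLABS][SLAB_SIZE];
static uint64_t cache_storage[2][SLAB_SIZE];

static void pool_init(struct pool *p, uint64_t buffer_size)
{
	uint32_t i, j;

	/* Each entry is a buffer offset, as in the sample's bp->buffers[]. */
	for (i = 0; i < N_SLABS; i++) {
		p->slabs[i] = pool_storage[i];
		for (j = 0; j < SLAB_SIZE; j++)
			pool_storage[i][j] = (uint64_t)(i * SLAB_SIZE + j) * buffer_size;
	}
	p->n_full = N_SLABS;
}

static void cache_init(struct cache *c, struct pool *p)
{
	memset(c, 0, sizeof(*c));
	c->pool = p;
	c->slab_cons = cache_storage[0];	/* starts empty */
	c->slab_prod = cache_storage[1];	/* starts empty */
}

/* Return how many of the n requested buffers can be taken right now. */
static uint32_t cache_cons_check(struct cache *c, uint32_t n)
{
	struct pool *p = c->pool;
	uint64_t *full;

	/* Allocation slab not empty: serve from it, no pool access. */
	if (c->n_cons)
		return c->n_cons < n ? c->n_cons : n;

	/* Allocation slab empty: trade it for a full slab, if there is one. */
	if (!p->n_full)
		return 0;

	p->n_full--;
	full = p->slabs[p->n_full];
	p->slabs[p->n_full] = c->slab_cons;	/* empty slab goes to the pool */
	c->slab_cons = full;
	c->n_cons = SLAB_SIZE;

	/* The sketch clamps the request to one slab. */
	return n < SLAB_SIZE ? n : SLAB_SIZE;
}

/* Take one buffer; only valid after cache_cons_check() returned non-zero. */
static uint64_t cache_cons(struct cache *c)
{
	assert(c->n_cons > 0);
	return c->slab_cons[--c->n_cons];
}

/* Give one buffer back. */
static void cache_prod(struct cache *c, uint64_t buffer)
{
	struct pool *p = c->pool;
	uint64_t *empty;

	/* Free slab not full yet: store locally, no pool access. */
	if (c->n_prod < SLAB_SIZE) {
		c->slab_prod[c->n_prod++] = buffer;
		return;
	}

	/*
	 * Free slab full: trade it for an empty slab. One must exist,
	 * because a full slab held by the cache means the pool cannot
	 * be holding all of the buffers.
	 */
	assert(p->n_full < N_SLABS);
	empty = p->slabs[p->n_full];
	p->slabs[p->n_full] = c->slab_prod;
	p->n_full++;

	empty[0] = buffer;
	c->slab_prod = empty;
	c->n_prod = 1;
}

/* Allocate up to n buffers, free them again, report full slabs left in the pool. */
static uint32_t slab_roundtrip(uint32_t n)
{
	uint64_t held[N_SLABS * SLAB_SIZE];
	struct pool p;
	struct cache c;
	uint32_t i, got = 0;

	pool_init(&p, 4096);
	cache_init(&c, &p);

	for (i = 0; i < n && got < N_SLABS * SLAB_SIZE; i++) {
		if (!cache_cons_check(&c, 1))
			break;
		held[got++] = cache_cons(&c);
	}
	for (i = 0; i < got; i++)
		cache_prod(&c, held[i]);

	return p.n_full;
}
```

Calling `slab_roundtrip()` from a small `main()` and printing the result shows that partially filled slabs stay in the cache: the pool only ever sees whole slabs, which is why each trade is a pointer swap and why a real implementation needs the lock only on that slow path.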
+12
tools/bpf/bpftool/Documentation/bpftool-feature.rst
··· 24 24 ================ 25 25 26 26 | **bpftool** **feature probe** [*COMPONENT*] [**full**] [**unprivileged**] [**macros** [**prefix** *PREFIX*]] 27 + | **bpftool** **feature list_builtins** *GROUP* 27 28 | **bpftool** **feature help** 28 29 | 29 30 | *COMPONENT* := { **kernel** | **dev** *NAME* } 31 + | *GROUP* := { **prog_types** | **map_types** | **attach_types** | **link_types** | **helpers** } 30 32 31 33 DESCRIPTION 32 34 =========== ··· 71 69 72 70 The keywords **full**, **macros** and **prefix** have the 73 71 same role as when probing the kernel. 72 + 73 + **bpftool feature list_builtins** *GROUP* 74 + List items known to bpftool. These can be BPF program types 75 + (**prog_types**), BPF map types (**map_types**), attach types 76 + (**attach_types**), link types (**link_types**), or BPF helper 77 + functions (**helpers**). The command does not probe the system, but 78 + simply lists the elements that bpftool knows from compilation time, 79 + as provided from libbpf (for all object types) or from the BPF UAPI 80 + header (list of helpers). This can be used in scripts to iterate over 81 + BPF types or helpers. 74 82 75 83 **bpftool feature help** 76 84 Print short help message.
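The documentation above says that `list_builtins` does not probe the system and only prints names known at compile time. A minimal sketch of that idea follows: walk the values of an enum and print whatever a to-string helper returns for each. It is not bpftool's implementation. `enum demo_map_type`, `demo_map_type_str()` and `list_builtins()` are hypothetical stand-ins for a UAPI enum, a libbpf-style `*_str()` helper and the listing loop, chosen so that the example builds without libbpf.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a UAPI enum such as enum bpf_map_type. */
enum demo_map_type {
	DEMO_MAP_UNSPEC,
	DEMO_MAP_HASH,
	DEMO_MAP_ARRAY,
	DEMO_MAP_TYPE_MAX,
};

/*
 * Hypothetical stand-in for a libbpf-style "<type>_str()" helper:
 * returns a static name, or NULL for a value it does not know.
 */
static const char *demo_map_type_str(unsigned int type)
{
	switch (type) {
	case DEMO_MAP_UNSPEC:
		return "unspec";
	case DEMO_MAP_HASH:
		return "hash";
	case DEMO_MAP_ARRAY:
		return "array";
	default:
		return NULL;
	}
}

/*
 * Print one known name per line to 'out' and return how many were
 * printed. Nothing is queried from the running kernel: the list is
 * fixed by the enum and helper the binary was built against.
 */
static unsigned int list_builtins(FILE *out, unsigned int n_types,
				  const char *(*type_str)(unsigned int))
{
	unsigned int i, n = 0;

	for (i = 0; i < n_types; i++) {
		const char *name = type_str(i);

		if (!name)
			continue;
		fprintf(out, "%s\n", name);
		n++;
	}
	return n;
}
```

Because the output is one plain name per line, a consumer can filter it with ordinary text tools. The bash completion change in this same series does exactly that, dropping the `unspec` entry from the map type list and keeping only `cgroup_`-prefixed attach types.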
+2 -9
tools/bpf/bpftool/Makefile
··· 93 93 RM ?= rm -f 94 94 95 95 FEATURE_USER = .bpftool 96 - FEATURE_TESTS = libbfd disassembler-four-args zlib libcap \ 97 - clang-bpf-co-re 98 - FEATURE_DISPLAY = libbfd disassembler-four-args zlib libcap \ 99 - clang-bpf-co-re 96 + FEATURE_TESTS = libbfd disassembler-four-args libcap clang-bpf-co-re 97 + FEATURE_DISPLAY = libbfd disassembler-four-args libcap clang-bpf-co-re 100 98 101 99 check_feat := 1 102 100 NON_CHECK_FEAT_TARGETS := clean uninstall doc doc-clean doc-install doc-uninstall ··· 201 203 202 204 $(OUTPUT)disasm.o: $(srctree)/kernel/bpf/disasm.c 203 205 $(QUIET_CC)$(CC) $(CFLAGS) -c -MMD $< -o $@ 204 - 205 - $(OUTPUT)feature.o: 206 - ifneq ($(feature-zlib), 1) 207 - $(error "No zlib found") 208 - endif 209 206 210 207 $(BPFTOOL_BOOTSTRAP): $(BOOTSTRAP_OBJS) $(LIBBPF_BOOTSTRAP) 211 208 $(QUIET_LINK)$(HOSTCC) $(HOST_CFLAGS) $(LDFLAGS) $(BOOTSTRAP_OBJS) $(LIBS_BOOTSTRAP) -o $@
+10 -18
tools/bpf/bpftool/bash-completion/bpftool
··· 703 703 return 0 704 704 ;; 705 705 type) 706 - local BPFTOOL_MAP_CREATE_TYPES='hash array \ 707 - prog_array perf_event_array percpu_hash \ 708 - percpu_array stack_trace cgroup_array lru_hash \ 709 - lru_percpu_hash lpm_trie array_of_maps \ 710 - hash_of_maps devmap devmap_hash sockmap cpumap \ 711 - xskmap sockhash cgroup_storage reuseport_sockarray \ 712 - percpu_cgroup_storage queue stack sk_storage \ 713 - struct_ops ringbuf inode_storage task_storage \ 714 - bloom_filter' 706 + local BPFTOOL_MAP_CREATE_TYPES="$(bpftool feature list_builtins map_types 2>/dev/null | \ 707 + grep -v '^unspec$')" 715 708 COMPREPLY=( $( compgen -W "$BPFTOOL_MAP_CREATE_TYPES" -- "$cur" ) ) 716 709 return 0 717 710 ;; ··· 1032 1039 return 0 1033 1040 ;; 1034 1041 attach|detach) 1035 - local BPFTOOL_CGROUP_ATTACH_TYPES='cgroup_inet_ingress cgroup_inet_egress \ 1036 - cgroup_inet_sock_create cgroup_sock_ops cgroup_device cgroup_inet4_bind \ 1037 - cgroup_inet6_bind cgroup_inet4_post_bind cgroup_inet6_post_bind \ 1038 - cgroup_inet4_connect cgroup_inet6_connect cgroup_inet4_getpeername \ 1039 - cgroup_inet6_getpeername cgroup_inet4_getsockname cgroup_inet6_getsockname \ 1040 - cgroup_udp4_sendmsg cgroup_udp6_sendmsg cgroup_udp4_recvmsg \ 1041 - cgroup_udp6_recvmsg cgroup_sysctl cgroup_getsockopt cgroup_setsockopt \ 1042 - cgroup_inet_sock_release' 1042 + local BPFTOOL_CGROUP_ATTACH_TYPES="$(bpftool feature list_builtins attach_types 2>/dev/null | \ 1043 + grep '^cgroup_')" 1043 1044 local ATTACH_FLAGS='multi override' 1044 1045 local PROG_TYPE='id pinned tag name' 1045 1046 # Check for $prev = $command first ··· 1162 1175 _bpftool_once_attr 'full unprivileged' 1163 1176 return 0 1164 1177 ;; 1178 + list_builtins) 1179 + [[ $prev != "$command" ]] && return 0 1180 + COMPREPLY=( $( compgen -W 'prog_types map_types \ 1181 + attach_types link_types helpers' -- "$cur" ) ) 1182 + ;; 1165 1183 *) 1166 1184 [[ $prev == $object ]] && \ 1167 - COMPREPLY=( $( compgen -W 'help probe' -- "$cur" 
) ) 1185 + COMPREPLY=( $( compgen -W 'help list_builtins probe' -- "$cur" ) ) 1168 1186 ;; 1169 1187 esac 1170 1188 ;;
+87 -22
tools/bpf/bpftool/cgroup.c
···
 #include <unistd.h>

 #include <bpf/bpf.h>
+#include <bpf/btf.h>

 #include "main.h"

···
 	" cgroup_inet_sock_release }"

 static unsigned int query_flags;
+static struct btf *btf_vmlinux;
+static __u32 btf_vmlinux_id;

 static enum bpf_attach_type parse_attach_type(const char *str)
 {
···
 	return __MAX_BPF_ATTACH_TYPE;
 }

+static void guess_vmlinux_btf_id(__u32 attach_btf_obj_id)
+{
+	struct bpf_btf_info btf_info = {};
+	__u32 btf_len = sizeof(btf_info);
+	char name[16] = {};
+	int err;
+	int fd;
+
+	btf_info.name = ptr_to_u64(name);
+	btf_info.name_len = sizeof(name);
+
+	fd = bpf_btf_get_fd_by_id(attach_btf_obj_id);
+	if (fd < 0)
+		return;
+
+	err = bpf_obj_get_info_by_fd(fd, &btf_info, &btf_len);
+	if (err)
+		goto out;
+
+	if (btf_info.kernel_btf && strncmp(name, "vmlinux", sizeof(name)) == 0)
+		btf_vmlinux_id = btf_info.id;
+
+out:
+	close(fd);
+}
+
 static int show_bpf_prog(int id, enum bpf_attach_type attach_type,
 			 const char *attach_flags_str,
 			 int level)
 {
 	char prog_name[MAX_PROG_FULL_NAME];
+	const char *attach_btf_name = NULL;
 	struct bpf_prog_info info = {};
 	const char *attach_type_str;
 	__u32 info_len = sizeof(info);
···
 	}

 	attach_type_str = libbpf_bpf_attach_type_str(attach_type);
+
+	if (btf_vmlinux) {
+		if (!btf_vmlinux_id)
+			guess_vmlinux_btf_id(info.attach_btf_obj_id);
+
+		if (btf_vmlinux_id == info.attach_btf_obj_id &&
+		    info.attach_btf_id < btf__type_cnt(btf_vmlinux)) {
+			const struct btf_type *t =
+				btf__type_by_id(btf_vmlinux, info.attach_btf_id);
+			attach_btf_name =
+				btf__name_by_offset(btf_vmlinux, t->name_off);
+		}
+	}
+
 	get_prog_full_name(&info, prog_fd, prog_name, sizeof(prog_name));
 	if (json_output) {
 		jsonw_start_object(json_wtr);
···
 		jsonw_string_field(json_wtr, "attach_flags",
 				   attach_flags_str);
 		jsonw_string_field(json_wtr, "name", prog_name);
+		if (attach_btf_name)
+			jsonw_string_field(json_wtr, "attach_btf_name", attach_btf_name);
+		jsonw_uint_field(json_wtr, "attach_btf_obj_id", info.attach_btf_obj_id);
+		jsonw_uint_field(json_wtr, "attach_btf_id", info.attach_btf_id);
 		jsonw_end_object(json_wtr);
 	} else {
 		printf("%s%-8u ", level ? " " : "", info.id);
···
 			printf("%-15s", attach_type_str);
 		else
 			printf("type %-10u", attach_type);
-		printf(" %-15s %-15s\n", attach_flags_str, prog_name);
+		printf(" %-15s %-15s", attach_flags_str, prog_name);
+		if (attach_btf_name)
+			printf(" %-15s", attach_btf_name);
+		else if (info.attach_btf_id)
+			printf(" attach_btf_obj_id=%d attach_btf_id=%d",
+			       info.attach_btf_obj_id, info.attach_btf_id);
+		printf("\n");
 	}

 	close(prog_fd);
···
 static int show_attached_bpf_progs(int cgroup_fd, enum bpf_attach_type type,
 				   int level)
 {
+	LIBBPF_OPTS(bpf_prog_query_opts, p);
+	__u32 prog_attach_flags[1024] = {0};
 	const char *attach_flags_str;
 	__u32 prog_ids[1024] = {0};
-	__u32 prog_cnt, iter;
-	__u32 attach_flags;
 	char buf[32];
+	__u32 iter;
 	int ret;

-	prog_cnt = ARRAY_SIZE(prog_ids);
-	ret = bpf_prog_query(cgroup_fd, type, query_flags, &attach_flags,
-			     prog_ids, &prog_cnt);
+	p.query_flags = query_flags;
+	p.prog_cnt = ARRAY_SIZE(prog_ids);
+	p.prog_ids = prog_ids;
+	p.prog_attach_flags = prog_attach_flags;
+
+	ret = bpf_prog_query_opts(cgroup_fd, type, &p);
 	if (ret)
 		return ret;

-	if (prog_cnt == 0)
+	if (p.prog_cnt == 0)
 		return 0;

-	switch (attach_flags) {
-	case BPF_F_ALLOW_MULTI:
-		attach_flags_str = "multi";
-		break;
-	case BPF_F_ALLOW_OVERRIDE:
-		attach_flags_str = "override";
-		break;
-	case 0:
-		attach_flags_str = "";
-		break;
-	default:
-		snprintf(buf, sizeof(buf), "unknown(%x)", attach_flags);
-		attach_flags_str = buf;
-	}
+	for (iter = 0; iter < p.prog_cnt; iter++) {
+		__u32 attach_flags;

-	for (iter = 0; iter < prog_cnt; iter++)
+		attach_flags = prog_attach_flags[iter] ?: p.attach_flags;
+
+		switch (attach_flags) {
+		case BPF_F_ALLOW_MULTI:
+			attach_flags_str = "multi";
+			break;
+		case BPF_F_ALLOW_OVERRIDE:
+			attach_flags_str = "override";
+			break;
+		case 0:
+			attach_flags_str = "";
+			break;
+		default:
+			snprintf(buf, sizeof(buf), "unknown(%x)", attach_flags);
+			attach_flags_str = buf;
+		}
+
 		show_bpf_prog(prog_ids[iter], type,
 			      attach_flags_str, level);
+	}

 	return 0;
 }
···
 		printf("%-8s %-15s %-15s %-15s\n", "ID", "AttachType",
 		       "AttachFlags", "Name");

+	btf_vmlinux = libbpf_find_kernel_btf();
 	for (type = 0; type < __MAX_BPF_ATTACH_TYPE; type++) {
 		/*
 		 * Not all attach types may be supported, so it's expected,
···
 		printf("%s\n", fpath);
 	}

+	btf_vmlinux = libbpf_find_kernel_btf();
 	for (type = 0; type < __MAX_BPF_ATTACH_TYPE; type++)
 		show_attached_bpf_progs(cgroup_fd, type, ftw->level);

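The per-program flag fallback used in show_attached_bpf_progs() can be looked at in isolation. The following is a minimal sketch, not bpftool code: the two flag values mirror BPF_F_ALLOW_OVERRIDE and BPF_F_ALLOW_MULTI from the UAPI header, while the SKETCH_ names and both helper functions are made up for illustration.

```c
#include <stdio.h>
#include <stdint.h>

/* Values mirror BPF_F_ALLOW_OVERRIDE / BPF_F_ALLOW_MULTI in the UAPI header. */
#define SKETCH_F_ALLOW_OVERRIDE (1U << 0)
#define SKETCH_F_ALLOW_MULTI    (1U << 1)

/* Prefer the per-program flags; fall back to the query-wide flags when zero. */
static uint32_t effective_attach_flags(uint32_t per_prog, uint32_t global)
{
	return per_prog ? per_prog : global;
}

/* Map flags to a printable label; buf is only written for unknown values. */
static const char *attach_flags_label(uint32_t flags, char *buf, size_t len)
{
	switch (flags) {
	case SKETCH_F_ALLOW_MULTI:
		return "multi";
	case SKETCH_F_ALLOW_OVERRIDE:
		return "override";
	case 0:
		return "";
	default:
		snprintf(buf, len, "unknown(%x)", flags);
		return buf;
	}
}
```

The portable `a ? a : b` form is used here in place of the GNU `?:` shorthand that appears in the diff; the two behave the same for these operands.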
+69 -3
tools/bpf/bpftool/common.c
···
 #include <stdlib.h>
 #include <string.h>
 #include <unistd.h>
-#include <linux/limits.h>
-#include <linux/magic.h>
 #include <net/if.h>
 #include <sys/mount.h>
 #include <sys/resource.h>
 #include <sys/stat.h>
 #include <sys/vfs.h>
+
+#include <linux/filter.h>
+#include <linux/limits.h>
+#include <linux/magic.h>
+#include <linux/unistd.h>

 #include <bpf/bpf.h>
 #include <bpf/hashmap.h>
···
 	return (unsigned long)st_fs.f_type == BPF_FS_MAGIC;
 }

+/* Probe whether kernel switched from memlock-based (RLIMIT_MEMLOCK) to
+ * memcg-based memory accounting for BPF maps and programs. This was done in
+ * commit 97306be45fbe ("Merge branch 'switch to memcg-based memory
+ * accounting'"), in Linux 5.11.
+ *
+ * Libbpf also offers to probe for memcg-based accounting vs rlimit, but does
+ * so by checking for the availability of a given BPF helper and this has
+ * failed on some kernels with backports in the past, see commit 6b4384ff1088
+ * ("Revert "bpftool: Use libbpf 1.0 API mode instead of RLIMIT_MEMLOCK"").
+ * Instead, we can probe by lowering the process-based rlimit to 0, trying to
+ * load a BPF object, and resetting the rlimit. If the load succeeds then
+ * memcg-based accounting is supported.
+ *
+ * This would be too dangerous to do in the library, because multithreaded
+ * applications might attempt to load items while the rlimit is at 0. Given
+ * that bpftool is single-threaded, this is fine to do here.
+ */
+static bool known_to_need_rlimit(void)
+{
+	struct rlimit rlim_init, rlim_cur_zero = {};
+	struct bpf_insn insns[] = {
+		BPF_MOV64_IMM(BPF_REG_0, 0),
+		BPF_EXIT_INSN(),
+	};
+	size_t insn_cnt = ARRAY_SIZE(insns);
+	union bpf_attr attr;
+	int prog_fd, err;
+
+	memset(&attr, 0, sizeof(attr));
+	attr.prog_type = BPF_PROG_TYPE_SOCKET_FILTER;
+	attr.insns = ptr_to_u64(insns);
+	attr.insn_cnt = insn_cnt;
+	attr.license = ptr_to_u64("GPL");
+
+	if (getrlimit(RLIMIT_MEMLOCK, &rlim_init))
+		return false;
+
+	/* Drop the soft limit to zero. We maintain the hard limit to its
+	 * current value, because lowering it would be a permanent operation
+	 * for unprivileged users.
+	 */
+	rlim_cur_zero.rlim_max = rlim_init.rlim_max;
+	if (setrlimit(RLIMIT_MEMLOCK, &rlim_cur_zero))
+		return false;
+
+	/* Do not use bpf_prog_load() from libbpf here, because it calls
+	 * bump_rlimit_memlock(), interfering with the current probe.
+	 */
+	prog_fd = syscall(__NR_bpf, BPF_PROG_LOAD, &attr, sizeof(attr));
+	err = errno;
+
+	/* reset soft rlimit to its initial value */
+	setrlimit(RLIMIT_MEMLOCK, &rlim_init);
+
+	if (prog_fd < 0)
+		return err == EPERM;
+
+	close(prog_fd);
+	return false;
+}
+
 void set_max_rlimit(void)
 {
 	struct rlimit rinf = { RLIM_INFINITY, RLIM_INFINITY };

-	setrlimit(RLIMIT_MEMLOCK, &rinf);
+	if (known_to_need_rlimit())
+		setrlimit(RLIMIT_MEMLOCK, &rinf);
 }

 static int
···
 		[BPF_OBJ_UNKNOWN]	= "unknown",
 		[BPF_OBJ_PROG]		= "prog",
 		[BPF_OBJ_MAP]		= "map",
+		[BPF_OBJ_LINK]		= "link",
 	};

 	if (type < 0 || type >= ARRAY_SIZE(names) || !names[type])
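The "drop the soft limit, probe, restore" sequence in known_to_need_rlimit() does not depend on BPF and can be tried on its own. The sketch below assumes a callback stands in for the BPF_PROG_LOAD attempt; run_with_zero_memlock() and probe_soft_limit_is_zero() are hypothetical names, not bpftool functions.

```c
#include <sys/resource.h>

/* Run probe() while the RLIMIT_MEMLOCK soft limit is 0, then restore it.
 * Returns probe()'s result, or -1 if the limit could not be read or lowered.
 */
static int run_with_zero_memlock(int (*probe)(void))
{
	struct rlimit init, zero = {0};
	int ret;

	if (getrlimit(RLIMIT_MEMLOCK, &init))
		return -1;

	/* Keep the hard limit: an unprivileged process cannot raise it again. */
	zero.rlim_max = init.rlim_max;
	if (setrlimit(RLIMIT_MEMLOCK, &zero))
		return -1;

	ret = probe();

	/* Put the original soft limit back before returning. */
	setrlimit(RLIMIT_MEMLOCK, &init);
	return ret;
}

/* Hypothetical stand-in for the BPF_PROG_LOAD attempt:
 * returns 1 if the soft limit it observes is zero, 0 otherwise.
 */
static int probe_soft_limit_is_zero(void)
{
	struct rlimit cur;

	if (getrlimit(RLIMIT_MEMLOCK, &cur))
		return -1;
	return cur.rlim_cur == 0;
}
```

As the comment in the diff notes, this is only reasonable in a single-threaded program, since every thread shares the lowered limit while the probe runs.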
+57 -2
tools/bpf/bpftool/feature.c
···
 	return 0;
 }

+static const char *get_helper_name(unsigned int id)
+{
+	if (id >= ARRAY_SIZE(helper_name))
+		return NULL;
+
+	return helper_name[id];
+}
+
+static int do_list_builtins(int argc, char **argv)
+{
+	const char *(*get_name)(unsigned int id);
+	unsigned int id = 0;
+
+	if (argc < 1)
+		usage();
+
+	if (is_prefix(*argv, "prog_types")) {
+		get_name = (const char *(*)(unsigned int))libbpf_bpf_prog_type_str;
+	} else if (is_prefix(*argv, "map_types")) {
+		get_name = (const char *(*)(unsigned int))libbpf_bpf_map_type_str;
+	} else if (is_prefix(*argv, "attach_types")) {
+		get_name = (const char *(*)(unsigned int))libbpf_bpf_attach_type_str;
+	} else if (is_prefix(*argv, "link_types")) {
+		get_name = (const char *(*)(unsigned int))libbpf_bpf_link_type_str;
+	} else if (is_prefix(*argv, "helpers")) {
+		get_name = get_helper_name;
+	} else {
+		p_err("expected 'prog_types', 'map_types', 'attach_types', 'link_types' or 'helpers', got: %s", *argv);
+		return -1;
+	}
+
+	if (json_output)
+		jsonw_start_array(json_wtr);	/* root array */
+
+	while (true) {
+		const char *name;
+
+		name = get_name(id++);
+		if (!name)
+			break;
+		if (json_output)
+			jsonw_string(json_wtr, name);
+		else
+			printf("%s\n", name);
+	}
+
+	if (json_output)
+		jsonw_end_array(json_wtr);	/* root array */
+
+	return 0;
+}
+
 static int do_help(int argc, char **argv)
 {
 	if (json_output) {
···

 	fprintf(stderr,
 		"Usage: %1$s %2$s probe [COMPONENT] [full] [unprivileged] [macros [prefix PREFIX]]\n"
+		" %1$s %2$s list_builtins GROUP\n"
 		" %1$s %2$s help\n"
 		"\n"
 		" COMPONENT := { kernel | dev NAME }\n"
+		" GROUP := { prog_types | map_types | attach_types | link_types | helpers }\n"
 		" " HELP_SPEC_OPTIONS " }\n"
 		"",
 		bin_name, argv[-2]);
···
 }

 static const struct cmd cmds[] = {
-	{ "probe",	do_probe },
-	{ "help",	do_help },
+	{ "probe",		do_probe },
+	{ "list_builtins",	do_list_builtins },
+	{ "help",		do_help },
 	{ 0 }
 };

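The enumeration loop in do_list_builtins() relies on one convention: a name getter is called with ids 0, 1, 2, ... and the first NULL ends the list. A minimal sketch of that convention follows; the name table and both functions are hypothetical stand-ins, whereas bpftool itself gets the getters from libbpf and from its helper_name array.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical name table standing in for a libbpf *_str() getter. */
static const char * const sketch_names[] = { "unspec", "socket_filter", "kprobe" };

static const char *sketch_get_name(unsigned int id)
{
	if (id >= sizeof(sketch_names) / sizeof(sketch_names[0]))
		return NULL;
	return sketch_names[id];
}

/* Print names for ids 0, 1, 2, ... until the getter returns NULL.
 * Returns the number of names printed.
 */
static unsigned int list_names(const char *(*get_name)(unsigned int), FILE *out)
{
	unsigned int id = 0;
	const char *name;

	while ((name = get_name(id)) != NULL) {
		fprintf(out, "%s\n", name);
		id++;
	}
	return id;
}
```

The loop stops at the first gap, so it only lists everything when the valid ids are contiguous from zero.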
+109
tools/bpf/bpftool/gen.c
···
 		}
 		break;
 	case BTF_KIND_CONST:
+	case BTF_KIND_RESTRICT:
 	case BTF_KIND_VOLATILE:
 	case BTF_KIND_TYPEDEF:
 		err = btfgen_mark_type(info, btf_type->type, follow_pointers);
···
 	return 0;
 }

+/* Mark types, members, and member types. Compared to btfgen_record_field_relo,
+ * this function does not rely on the target spec for inferring members, but
+ * uses the associated BTF.
+ *
+ * The `behind_ptr` argument is used to stop marking of composite types reached
+ * through a pointer. This way, we can keep BTF size in check while providing
+ * reasonable match semantics.
+ */
+static int btfgen_mark_type_match(struct btfgen_info *info, __u32 type_id, bool behind_ptr)
+{
+	const struct btf_type *btf_type;
+	struct btf *btf = info->src_btf;
+	struct btf_type *cloned_type;
+	int i, err;
+
+	if (type_id == 0)
+		return 0;
+
+	btf_type = btf__type_by_id(btf, type_id);
+	/* mark type on cloned BTF as used */
+	cloned_type = (struct btf_type *)btf__type_by_id(info->marked_btf, type_id);
+	cloned_type->name_off = MARKED;
+
+	switch (btf_kind(btf_type)) {
+	case BTF_KIND_UNKN:
+	case BTF_KIND_INT:
+	case BTF_KIND_FLOAT:
+	case BTF_KIND_ENUM:
+	case BTF_KIND_ENUM64:
+		break;
+	case BTF_KIND_STRUCT:
+	case BTF_KIND_UNION: {
+		struct btf_member *m = btf_members(btf_type);
+		__u16 vlen = btf_vlen(btf_type);
+
+		if (behind_ptr)
+			break;
+
+		for (i = 0; i < vlen; i++, m++) {
+			/* mark member */
+			btfgen_mark_member(info, type_id, i);
+
+			/* mark member's type */
+			err = btfgen_mark_type_match(info, m->type, false);
+			if (err)
+				return err;
+		}
+		break;
+	}
+	case BTF_KIND_CONST:
+	case BTF_KIND_FWD:
+	case BTF_KIND_RESTRICT:
+	case BTF_KIND_TYPEDEF:
+	case BTF_KIND_VOLATILE:
+		return btfgen_mark_type_match(info, btf_type->type, behind_ptr);
+	case BTF_KIND_PTR:
+		return btfgen_mark_type_match(info, btf_type->type, true);
+	case BTF_KIND_ARRAY: {
+		struct btf_array *array;
+
+		array = btf_array(btf_type);
+		/* mark array type */
+		err = btfgen_mark_type_match(info, array->type, false);
+		/* mark array's index type */
+		err = err ? : btfgen_mark_type_match(info, array->index_type, false);
+		if (err)
+			return err;
+		break;
+	}
+	case BTF_KIND_FUNC_PROTO: {
+		__u16 vlen = btf_vlen(btf_type);
+		struct btf_param *param;
+
+		/* mark ret type */
+		err = btfgen_mark_type_match(info, btf_type->type, false);
+		if (err)
+			return err;
+
+		/* mark parameters types */
+		param = btf_params(btf_type);
+		for (i = 0; i < vlen; i++) {
+			err = btfgen_mark_type_match(info, param->type, false);
+			if (err)
+				return err;
+			param++;
+		}
+		break;
+	}
+	/* tells if some other type needs to be handled */
+	default:
+		p_err("unsupported kind: %s (%d)", btf_kind_str(btf_type), type_id);
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+/* Mark types, members, and member types. Compared to btfgen_record_field_relo,
+ * this function does not rely on the target spec for inferring members, but
+ * uses the associated BTF.
+ */
+static int btfgen_record_type_match_relo(struct btfgen_info *info, struct bpf_core_spec *targ_spec)
+{
+	return btfgen_mark_type_match(info, targ_spec->root_type_id, false);
+}
+
 static int btfgen_record_type_relo(struct btfgen_info *info, struct bpf_core_spec *targ_spec)
 {
 	return btfgen_mark_type(info, targ_spec->root_type_id, true);
···
 	case BPF_CORE_TYPE_EXISTS:
 	case BPF_CORE_TYPE_SIZE:
 		return btfgen_record_type_relo(info, res);
+	case BPF_CORE_TYPE_MATCHES:
+		return btfgen_record_type_match_relo(info, res);
 	case BPF_CORE_ENUMVAL_EXISTS:
 	case BPF_CORE_ENUMVAL_VALUE:
 		return btfgen_record_enumval_relo(info, res);
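The effect of the `behind_ptr` argument is easier to see on a toy type graph than on real BTF. The sketch below is an illustration under simplified assumptions: it has only three kinds (integer, pointer, struct), and every name in it (sketch_type, mark_type_match, demo_types) is hypothetical rather than part of bpftool.

```c
#include <stdbool.h>

enum sketch_kind { SK_INT, SK_PTR, SK_STRUCT };

struct sketch_type {
	enum sketch_kind kind;
	int target;		/* SK_PTR: id of the pointee */
	int members[4];		/* SK_STRUCT: member type ids */
	int nr_members;
	bool marked;		/* the type itself is kept */
	bool members_marked;	/* its members are kept as well */
};

/* Mark a type and what it references. A struct reached through a pointer is
 * kept, but its members are not followed. Always returns 0 in this sketch.
 */
static int mark_type_match(struct sketch_type *types, int id, bool behind_ptr)
{
	struct sketch_type *t = &types[id];
	int i;

	t->marked = true;
	switch (t->kind) {
	case SK_INT:
		break;
	case SK_PTR:
		return mark_type_match(types, t->target, true);
	case SK_STRUCT:
		if (behind_ptr)
			break;
		t->members_marked = true;
		for (i = 0; i < t->nr_members; i++)
			mark_type_match(types, t->members[i], false);
		break;
	}
	return 0;
}

/* 0: int, 1: struct inner { int; }, 2: struct inner *,
 * 3: struct outer { int; struct inner *; }
 */
static struct sketch_type demo_types[] = {
	{ .kind = SK_INT },
	{ .kind = SK_STRUCT, .members = { 0 }, .nr_members = 1 },
	{ .kind = SK_PTR, .target = 1 },
	{ .kind = SK_STRUCT, .members = { 0, 2 }, .nr_members = 2 },
};
```

Marking `struct outer` (id 3) keeps all four types, but `struct inner` keeps no members because it is only reached through a pointer. The same cut-off is what ends the recursion on self-referential structs such as linked-list nodes.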
-2
tools/bpf/bpftool/main.h
···
 #define HELP_SPEC_LINK							\
 	"LINK := { id LINK_ID | pinned FILE }"

-extern const char * const attach_type_name[__MAX_BPF_ATTACH_TYPE];
-
 /* keep in sync with the definition in skeleton/pid_iter.bpf.c */
 enum bpf_obj_type {
 	BPF_OBJ_UNKNOWN,
+28 -7
tools/include/linux/btf_ids.h
···
 __BTF_ID_LIST(name, local)			\
 extern u32 name[];

-#define BTF_ID_LIST_GLOBAL(name)		\
+#define BTF_ID_LIST_GLOBAL(name, n)		\
 __BTF_ID_LIST(name, globl)

 /* The BTF_ID_LIST_SINGLE macro defines a BTF_ID_LIST with
···
  */
 #define BTF_ID_LIST_SINGLE(name, prefix, typename)	\
 	BTF_ID_LIST(name) \
+	BTF_ID(prefix, typename)
+#define BTF_ID_LIST_GLOBAL_SINGLE(name, prefix, typename) \
+	BTF_ID_LIST_GLOBAL(name, 1) \
 	BTF_ID(prefix, typename)

 /*
···

 #else

-#define BTF_ID_LIST(name) static u32 name[5];
+#define BTF_ID_LIST(name) static u32 __maybe_unused name[5];
 #define BTF_ID(prefix, name)
 #define BTF_ID_UNUSED
-#define BTF_ID_LIST_GLOBAL(name) u32 name[1];
-#define BTF_ID_LIST_SINGLE(name, prefix, typename) static u32 name[1];
-#define BTF_SET_START(name) static struct btf_id_set name = { 0 };
-#define BTF_SET_START_GLOBAL(name) static struct btf_id_set name = { 0 };
+#define BTF_ID_LIST_GLOBAL(name, n) u32 __maybe_unused name[n];
+#define BTF_ID_LIST_SINGLE(name, prefix, typename) static u32 __maybe_unused name[1];
+#define BTF_ID_LIST_GLOBAL_SINGLE(name, prefix, typename) u32 __maybe_unused name[1];
+#define BTF_SET_START(name) static struct btf_id_set __maybe_unused name = { 0 };
+#define BTF_SET_START_GLOBAL(name) static struct btf_id_set __maybe_unused name = { 0 };
 #define BTF_SET_END(name)

 #endif /* CONFIG_DEBUG_INFO_BTF */
···
 	BTF_SOCK_TYPE(BTF_SOCK_TYPE_TCP_TW, tcp_timewait_sock)	\
 	BTF_SOCK_TYPE(BTF_SOCK_TYPE_TCP6, tcp6_sock)		\
 	BTF_SOCK_TYPE(BTF_SOCK_TYPE_UDP, udp_sock)		\
-	BTF_SOCK_TYPE(BTF_SOCK_TYPE_UDP6, udp6_sock)
+	BTF_SOCK_TYPE(BTF_SOCK_TYPE_UDP6, udp6_sock)		\
+	BTF_SOCK_TYPE(BTF_SOCK_TYPE_UNIX, unix_sock)		\
+	BTF_SOCK_TYPE(BTF_SOCK_TYPE_MPTCP, mptcp_sock)		\
+	BTF_SOCK_TYPE(BTF_SOCK_TYPE_SOCKET, socket)

 enum {
 #define BTF_SOCK_TYPE(name, str) name,
···

 extern u32 btf_sock_ids[];
 #endif
+
+#define BTF_TRACING_TYPE_xxx	\
+	BTF_TRACING_TYPE(BTF_TRACING_TYPE_TASK, task_struct)	\
+	BTF_TRACING_TYPE(BTF_TRACING_TYPE_FILE, file)		\
+	BTF_TRACING_TYPE(BTF_TRACING_TYPE_VMA, vm_area_struct)
+
+enum {
+#define BTF_TRACING_TYPE(name, type) name,
+BTF_TRACING_TYPE_xxx
+#undef BTF_TRACING_TYPE
+MAX_BTF_TRACING_TYPE,
+};
+
+extern u32 btf_tracing_ids[];

 #endif
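BTF_TRACING_TYPE_xxx follows the "X-macro" idiom: one list is expanded several times with different definitions of the inner macro, so the enum and any parallel table stay in sync. The sketch below shows the idiom with hypothetical SKETCH_ names. Its second expansion builds a string table purely for illustration; it is not how the kernel fills btf_tracing_ids, which presumably uses the BTF_ID machinery.

```c
/* One list, expanded more than once below. */
#define SKETCH_TRACING_TYPE_xxx \
	SKETCH_TRACING_TYPE(SKETCH_TYPE_TASK, task_struct) \
	SKETCH_TRACING_TYPE(SKETCH_TYPE_FILE, file) \
	SKETCH_TRACING_TYPE(SKETCH_TYPE_VMA, vm_area_struct)

/* First expansion: enum constants, followed by a count. */
enum {
#define SKETCH_TRACING_TYPE(name, type) name,
SKETCH_TRACING_TYPE_xxx
#undef SKETCH_TRACING_TYPE
	MAX_SKETCH_TRACING_TYPE,
};

/* Second expansion: a table indexed by the enum, holding the type names. */
static const char * const sketch_type_names[] = {
#define SKETCH_TRACING_TYPE(name, type) [name] = #type,
SKETCH_TRACING_TYPE_xxx
#undef SKETCH_TRACING_TYPE
};
```

Adding a new entry to the list extends both the enum and the table without touching either definition.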
+5
tools/include/uapi/linux/bpf.h
···
 	BPF_SK_REUSEPORT_SELECT_OR_MIGRATE,
 	BPF_PERF_EVENT,
 	BPF_TRACE_KPROBE_MULTI,
+	BPF_LSM_CGROUP,
 	__MAX_BPF_ATTACH_TYPE
 };

···
 		__u32		attach_flags;
 		__aligned_u64	prog_ids;
 		__u32		prog_cnt;
+		__aligned_u64	prog_attach_flags; /* output: per-program attach_flags */
 	} query;

 	struct { /* anonymous struct used by BPF_RAW_TRACEPOINT_OPEN command */
···
 	__u64 run_cnt;
 	__u64 recursion_misses;
 	__u32 verified_insns;
+	__u32 attach_btf_obj_id;
+	__u32 attach_btf_id;
 } __attribute__((aligned(8)));

 struct bpf_map_info {
···
 	BPF_CORE_TYPE_SIZE = 9,		/* type size in bytes */
 	BPF_CORE_ENUMVAL_EXISTS = 10,	/* enum value existence in target kernel */
 	BPF_CORE_ENUMVAL_VALUE = 11,	/* enum value integer value */
+	BPF_CORE_TYPE_MATCHES = 12,	/* type match in target kernel */
 };

 /*
+1 -1
tools/lib/bpf/Build
···
 libbpf-y := libbpf.o bpf.o nlattr.o btf.o libbpf_errno.o str_error.o \
-	    netlink.o bpf_prog_linfo.o libbpf_probes.o xsk.o hashmap.o \
+	    netlink.o bpf_prog_linfo.o libbpf_probes.o hashmap.o \
 	    btf_dump.o ringbuf.o strset.o linker.o gen_loader.o relo_core.o \
 	    usdt.o
+1 -1
tools/lib/bpf/Makefile
···
 		$(call do_install_mkdir,$(libdir_SQ)); \
 		cp -fpR $(LIB_FILE) $(DESTDIR)$(libdir_SQ)

-SRC_HDRS := bpf.h libbpf.h btf.h libbpf_common.h libbpf_legacy.h xsk.h \
+SRC_HDRS := bpf.h libbpf.h btf.h libbpf_common.h libbpf_legacy.h \
 	    bpf_helpers.h bpf_tracing.h bpf_endian.h bpf_core_read.h \
 	    skel_internal.h libbpf_version.h usdt.bpf.h
 GEN_HDRS := $(BPF_GENERATED)
+37 -183
tools/lib/bpf/bpf.c
···
 {
 	struct rlimit rlim;

-	/* this the default in libbpf 1.0, but for now user has to opt-in explicitly */
-	if (!(libbpf_mode & LIBBPF_STRICT_AUTO_RLIMIT_MEMLOCK))
-		return 0;
-
 	/* if kernel supports memcg-based accounting, skip bumping RLIMIT_MEMLOCK */
 	if (memlock_bumped || kernel_supports(NULL, FEAT_MEMCG_ACCOUNT))
 		return 0;
···
 	return info;
 }

-DEFAULT_VERSION(bpf_prog_load_v0_6_0, bpf_prog_load, LIBBPF_0.6.0)
-int bpf_prog_load_v0_6_0(enum bpf_prog_type prog_type,
-			 const char *prog_name, const char *license,
-			 const struct bpf_insn *insns, size_t insn_cnt,
-			 const struct bpf_prog_load_opts *opts)
+int bpf_prog_load(enum bpf_prog_type prog_type,
+		  const char *prog_name, const char *license,
+		  const struct bpf_insn *insns, size_t insn_cnt,
+		  const struct bpf_prog_load_opts *opts)
 {
 	void *finfo = NULL, *linfo = NULL;
 	const char *func_info, *line_info;
···
 	/* free() doesn't affect errno, so we don't need to restore it */
 	free(finfo);
 	free(linfo);
-	return libbpf_err_errno(fd);
-}
-
-__attribute__((alias("bpf_load_program_xattr2")))
-int bpf_load_program_xattr(const struct bpf_load_program_attr *load_attr,
-			   char *log_buf, size_t log_buf_sz);
-
-static int bpf_load_program_xattr2(const struct bpf_load_program_attr *load_attr,
-				   char *log_buf, size_t log_buf_sz)
-{
-	LIBBPF_OPTS(bpf_prog_load_opts, p);
-
-	if (!load_attr || !log_buf != !log_buf_sz)
-		return libbpf_err(-EINVAL);
-
-	p.expected_attach_type = load_attr->expected_attach_type;
-	switch (load_attr->prog_type) {
-	case BPF_PROG_TYPE_STRUCT_OPS:
-	case BPF_PROG_TYPE_LSM:
-		p.attach_btf_id = load_attr->attach_btf_id;
-		break;
-	case BPF_PROG_TYPE_TRACING:
-	case BPF_PROG_TYPE_EXT:
-		p.attach_btf_id = load_attr->attach_btf_id;
-		p.attach_prog_fd = load_attr->attach_prog_fd;
-		break;
-	default:
-		p.prog_ifindex = load_attr->prog_ifindex;
-		p.kern_version = load_attr->kern_version;
-	}
-	p.log_level = load_attr->log_level;
-	p.log_buf = log_buf;
-	p.log_size = log_buf_sz;
-	p.prog_btf_fd = load_attr->prog_btf_fd;
-	p.func_info_rec_size = load_attr->func_info_rec_size;
-	p.func_info_cnt = load_attr->func_info_cnt;
-	p.func_info = load_attr->func_info;
-	p.line_info_rec_size = load_attr->line_info_rec_size;
-	p.line_info_cnt = load_attr->line_info_cnt;
-	p.line_info = load_attr->line_info;
-	p.prog_flags = load_attr->prog_flags;
-
-	return bpf_prog_load(load_attr->prog_type, load_attr->name, load_attr->license,
-			     load_attr->insns, load_attr->insns_cnt, &p);
-}
-
-int bpf_load_program(enum bpf_prog_type type, const struct bpf_insn *insns,
-		     size_t insns_cnt, const char *license,
-		     __u32 kern_version, char *log_buf,
-		     size_t log_buf_sz)
-{
-	struct bpf_load_program_attr load_attr;
-
-	memset(&load_attr, 0, sizeof(struct bpf_load_program_attr));
-	load_attr.prog_type = type;
-	load_attr.expected_attach_type = 0;
-	load_attr.name = NULL;
-	load_attr.insns = insns;
-	load_attr.insns_cnt = insns_cnt;
-	load_attr.license = license;
-	load_attr.kern_version = kern_version;
-
-	return bpf_load_program_xattr2(&load_attr, log_buf, log_buf_sz);
-}
-
-int bpf_verify_program(enum bpf_prog_type type, const struct bpf_insn *insns,
-		       size_t insns_cnt, __u32 prog_flags, const char *license,
-		       __u32 kern_version, char *log_buf, size_t log_buf_sz,
-		       int log_level)
-{
-	union bpf_attr attr;
-	int fd;
-
-	bump_rlimit_memlock();
-
-	memset(&attr, 0, sizeof(attr));
-	attr.prog_type = type;
-	attr.insn_cnt = (__u32)insns_cnt;
-	attr.insns = ptr_to_u64(insns);
-	attr.license = ptr_to_u64(license);
-	attr.log_buf = ptr_to_u64(log_buf);
-	attr.log_size = log_buf_sz;
-	attr.log_level = log_level;
-	log_buf[0] = 0;
-	attr.kern_version = kern_version;
-	attr.prog_flags = prog_flags;
-
-	fd = sys_bpf_prog_load(&attr, sizeof(attr), PROG_LOAD_ATTEMPTS);
 	return libbpf_err_errno(fd);
 }

···
 	return libbpf_err_errno(fd);
 }

-int bpf_prog_query(int target_fd, enum bpf_attach_type type, __u32 query_flags,
-		   __u32 *attach_flags, __u32 *prog_ids, __u32 *prog_cnt)
+int bpf_prog_query_opts(int target_fd,
+			enum bpf_attach_type type,
+			struct bpf_prog_query_opts *opts)
 {
 	union bpf_attr attr;
 	int ret;

-	memset(&attr, 0, sizeof(attr));
-	attr.query.target_fd = target_fd;
-	attr.query.attach_type = type;
-	attr.query.query_flags = query_flags;
-	attr.query.prog_cnt = *prog_cnt;
-	attr.query.prog_ids = ptr_to_u64(prog_ids);
-
-	ret = sys_bpf(BPF_PROG_QUERY, &attr, sizeof(attr));
-
-	if (attach_flags)
-		*attach_flags = attr.query.attach_flags;
-	*prog_cnt = attr.query.prog_cnt;
-
-	return libbpf_err_errno(ret);
-}
-
-int bpf_prog_test_run(int prog_fd, int repeat, void *data, __u32 size,
-		      void *data_out, __u32 *size_out, __u32 *retval,
-		      __u32 *duration)
-{
-	union bpf_attr attr;
-	int ret;
-
-	memset(&attr, 0, sizeof(attr));
-	attr.test.prog_fd = prog_fd;
-	attr.test.data_in = ptr_to_u64(data);
-	attr.test.data_out = ptr_to_u64(data_out);
-	attr.test.data_size_in = size;
-	attr.test.repeat = repeat;
-
-	ret = sys_bpf(BPF_PROG_TEST_RUN, &attr, sizeof(attr));
-
-	if (size_out)
-		*size_out = attr.test.data_size_out;
-	if (retval)
-		*retval = attr.test.retval;
-	if (duration)
-		*duration = attr.test.duration;
-
-	return libbpf_err_errno(ret);
-}
-
-int bpf_prog_test_run_xattr(struct bpf_prog_test_run_attr *test_attr)
-{
-	union bpf_attr attr;
-	int ret;
-
-	if (!test_attr->data_out && test_attr->data_size_out > 0)
+	if (!OPTS_VALID(opts, bpf_prog_query_opts))
 		return libbpf_err(-EINVAL);

 	memset(&attr, 0, sizeof(attr));
-	attr.test.prog_fd = test_attr->prog_fd;
-	attr.test.data_in = ptr_to_u64(test_attr->data_in);
-	attr.test.data_out = ptr_to_u64(test_attr->data_out);
-	attr.test.data_size_in = test_attr->data_size_in;
-	attr.test.data_size_out = test_attr->data_size_out;
-	attr.test.ctx_in = ptr_to_u64(test_attr->ctx_in);
-	attr.test.ctx_out = ptr_to_u64(test_attr->ctx_out);
-	attr.test.ctx_size_in = test_attr->ctx_size_in;
-	attr.test.ctx_size_out = test_attr->ctx_size_out;
-	attr.test.repeat = test_attr->repeat;

-	ret = sys_bpf(BPF_PROG_TEST_RUN, &attr, sizeof(attr));
+	attr.query.target_fd = target_fd;
+	attr.query.attach_type = type;
+	attr.query.query_flags = OPTS_GET(opts, query_flags, 0);
+	attr.query.prog_cnt = OPTS_GET(opts, prog_cnt, 0);
+	attr.query.prog_ids = ptr_to_u64(OPTS_GET(opts, prog_ids, NULL));
+	attr.query.prog_attach_flags = ptr_to_u64(OPTS_GET(opts, prog_attach_flags, NULL));

-	test_attr->data_size_out = attr.test.data_size_out;
-	test_attr->ctx_size_out = attr.test.ctx_size_out;
-	test_attr->retval = attr.test.retval;
-	test_attr->duration = attr.test.duration;
+	ret = sys_bpf(BPF_PROG_QUERY, &attr, sizeof(attr));
+
+	OPTS_SET(opts, attach_flags, attr.query.attach_flags);
+	OPTS_SET(opts, prog_cnt, attr.query.prog_cnt);
+
+	return libbpf_err_errno(ret);
+}
+
+int bpf_prog_query(int target_fd, enum bpf_attach_type type, __u32 query_flags,
+		   __u32 *attach_flags, __u32 *prog_ids, __u32 *prog_cnt)
+{
+	LIBBPF_OPTS(bpf_prog_query_opts, opts);
+	int ret;
+
+	opts.query_flags = query_flags;
+	opts.prog_ids = prog_ids;
+	opts.prog_cnt = *prog_cnt;
+
+	ret = bpf_prog_query_opts(target_fd, type, &opts);
+
+	if (attach_flags)
+		*attach_flags = opts.attach_flags;
+	*prog_cnt = opts.prog_cnt;

 	return libbpf_err_errno(ret);
 }
···
 		attr.btf_log_level = 1;
 		fd = sys_bpf_fd(BPF_BTF_LOAD, &attr, attr_sz);
 	}
-	return libbpf_err_errno(fd);
-}
-
-int bpf_load_btf(const void *btf, __u32 btf_size, char *log_buf, __u32 log_buf_size, bool do_log)
-{
-	LIBBPF_OPTS(bpf_btf_load_opts, opts);
-	int fd;
-
-retry:
-	if (do_log && log_buf && log_buf_size) {
-		opts.log_buf = log_buf;
-		opts.log_size = log_buf_size;
-		opts.log_level = 1;
-	}
-
-	fd = bpf_btf_load(btf, btf_size, &opts);
-	if (fd < 0 && !do_log && log_buf && log_buf_size) {
-		do_log = true;
-		goto retry;
-	}
-
 	return libbpf_err_errno(fd);
 }

+15 -83
tools/lib/bpf/bpf.h
···
 			     const char *prog_name, const char *license,
 			     const struct bpf_insn *insns, size_t insn_cnt,
 			     const struct bpf_prog_load_opts *opts);
-/* this "specialization" should go away in libbpf 1.0 */
-LIBBPF_API int bpf_prog_load_v0_6_0(enum bpf_prog_type prog_type,
-				    const char *prog_name, const char *license,
-				    const struct bpf_insn *insns, size_t insn_cnt,
-				    const struct bpf_prog_load_opts *opts);
-
-/* This is an elaborate way to not conflict with deprecated bpf_prog_load()
- * API, defined in libbpf.h. Once we hit libbpf 1.0, all this will be gone.
- * With this approach, if someone is calling bpf_prog_load() with
- * 4 arguments, they will use the deprecated API, which keeps backwards
- * compatibility (both source code and binary). If bpf_prog_load() is called
- * with 6 arguments, though, it gets redirected to __bpf_prog_load.
- * So looking forward to libbpf 1.0 when this hack will be gone and
- * __bpf_prog_load() will be called just bpf_prog_load().
- */
-#ifndef bpf_prog_load
-#define bpf_prog_load(...) ___libbpf_overload(___bpf_prog_load, __VA_ARGS__)
-#define ___bpf_prog_load4(file, type, pobj, prog_fd) \
-	bpf_prog_load_deprecated(file, type, pobj, prog_fd)
-#define ___bpf_prog_load6(prog_type, prog_name, license, insns, insn_cnt, opts) \
-	bpf_prog_load(prog_type, prog_name, license, insns, insn_cnt, opts)
-#endif /* bpf_prog_load */
-
-struct bpf_load_program_attr {
-	enum bpf_prog_type prog_type;
-	enum bpf_attach_type expected_attach_type;
-	const char *name;
-	const struct bpf_insn *insns;
-	size_t insns_cnt;
-	const char *license;
-	union {
-		__u32 kern_version;
-		__u32 attach_prog_fd;
-	};
-	union {
-		__u32 prog_ifindex;
-		__u32 attach_btf_id;
-	};
-	__u32 prog_btf_fd;
-	__u32 func_info_rec_size;
-	const void *func_info;
-	__u32 func_info_cnt;
-	__u32 line_info_rec_size;
-	const void *line_info;
-	__u32 line_info_cnt;
-	__u32 log_level;
-	__u32 prog_flags;
-};

 /* Flags to direct loading requirements */
 #define MAPS_RELAX_COMPAT	0x01

 /* Recommended log buffer size */
 #define BPF_LOG_BUF_SIZE (UINT32_MAX >> 8) /* verifier maximum in kernels <= 5.1 */
-
-LIBBPF_DEPRECATED_SINCE(0, 7, "use bpf_prog_load() instead")
-LIBBPF_API int bpf_load_program_xattr(const struct bpf_load_program_attr *load_attr,
-				      char *log_buf, size_t log_buf_sz);
-LIBBPF_DEPRECATED_SINCE(0, 7, "use bpf_prog_load() instead")
-LIBBPF_API int bpf_load_program(enum bpf_prog_type type,
-				const struct bpf_insn *insns, size_t insns_cnt,
-				const char *license, __u32 kern_version,
-				char *log_buf, size_t log_buf_sz);
-LIBBPF_DEPRECATED_SINCE(0, 7, "use bpf_prog_load() instead")
-LIBBPF_API int bpf_verify_program(enum bpf_prog_type type,
-				  const struct bpf_insn *insns,
-				  size_t insns_cnt, __u32 prog_flags,
-				  const char *license, __u32 kern_version,
-				  char *log_buf, size_t log_buf_sz,
-				  int log_level);

 struct bpf_btf_load_opts {
 	size_t sz; /* size of this struct for forward/backward compatibility */
···

 LIBBPF_API int bpf_btf_load(const void *btf_data, size_t btf_size,
 			    const struct bpf_btf_load_opts *opts);
-
-LIBBPF_DEPRECATED_SINCE(0, 8, "use bpf_btf_load() instead")
-LIBBPF_API int bpf_load_btf(const void *btf, __u32 btf_size, char *log_buf,
-			    __u32 log_buf_size, bool do_log);

 LIBBPF_API int bpf_map_update_elem(int fd, const void *key, const void *value,
 				   __u64 flags);
···
 LIBBPF_API int bpf_prog_attach_opts(int prog_fd, int attachable_fd,
 				     enum bpf_attach_type type,
 				     const struct bpf_prog_attach_opts *opts);
-LIBBPF_DEPRECATED_SINCE(0, 8, "use bpf_prog_attach_opts() instead")
-LIBBPF_API int bpf_prog_attach_xattr(int prog_fd, int attachable_fd,
-				     enum bpf_attach_type type,
-				     const struct bpf_prog_attach_opts *opts);
 LIBBPF_API int bpf_prog_detach(int attachable_fd, enum bpf_attach_type type);
 LIBBPF_API int bpf_prog_detach2(int prog_fd, int attachable_fd,
 				enum bpf_attach_type type);
···
 			      * out: length of cxt_out */
 };

-LIBBPF_DEPRECATED_SINCE(0, 7, "use bpf_prog_test_run_opts() instead")
-LIBBPF_API int bpf_prog_test_run_xattr(struct bpf_prog_test_run_attr *test_attr);
-
-/*
- * bpf_prog_test_run does not check that data_out is large enough. Consider
- * using bpf_prog_test_run_opts instead.
- */
-LIBBPF_DEPRECATED_SINCE(0, 7, "use bpf_prog_test_run_opts() instead")
-LIBBPF_API int bpf_prog_test_run(int prog_fd, int repeat, void *data,
-				 __u32 size, void *data_out, __u32 *size_out,
-				 __u32 *retval, __u32 *duration);
 LIBBPF_API int bpf_prog_get_next_id(__u32 start_id, __u32 *next_id);
 LIBBPF_API int bpf_map_get_next_id(__u32 start_id, __u32 *next_id);
 LIBBPF_API int bpf_btf_get_next_id(__u32 start_id, __u32 *next_id);
···
 LIBBPF_API int bpf_btf_get_fd_by_id(__u32 id);
 LIBBPF_API int bpf_link_get_fd_by_id(__u32 id);
 LIBBPF_API int bpf_obj_get_info_by_fd(int bpf_fd, void *info, __u32 *info_len);
+
+struct bpf_prog_query_opts {
+	size_t sz; /* size of this struct for forward/backward compatibility */
+	__u32 query_flags;
+	__u32 attach_flags; /* output argument */
+	__u32 *prog_ids;
+	__u32 prog_cnt; /* input+output argument */
+	__u32 *prog_attach_flags;
+};
+#define bpf_prog_query_opts__last_field prog_attach_flags
+
+LIBBPF_API int bpf_prog_query_opts(int target_fd,
+				   enum bpf_attach_type type,
+				   struct bpf_prog_query_opts *opts);
 LIBBPF_API int bpf_prog_query(int target_fd, enum bpf_attach_type type,
 			      __u32 query_flags, __u32 *attach_flags,
 			      __u32 *prog_ids, __u32 *prog_cnt);
+
 LIBBPF_API int bpf_raw_tracepoint_open(const char *name, int prog_fd);
 LIBBPF_API int bpf_task_fd_query(int pid, int fd, __u32 flags, char *buf,
 				 __u32 *buf_len, __u32 *prog_id, __u32 *fd_type,
+11
tools/lib/bpf/bpf_core_read.h
···
 enum bpf_type_info_kind {
 	BPF_TYPE_EXISTS = 0,		/* type existence in target kernel */
 	BPF_TYPE_SIZE = 1,		/* type size in target kernel */
+	BPF_TYPE_MATCHES = 2,		/* type match in target kernel */
 };

 /* second argument to __builtin_preserve_enum_value() built-in */
···
  */
 #define bpf_core_type_exists(type)					    \
 	__builtin_preserve_type_info(*(typeof(type) *)0, BPF_TYPE_EXISTS)
+
+/*
+ * Convenience macro to check that provided named type
+ * (struct/union/enum/typedef) "matches" that in a target kernel.
+ * Returns:
+ *    1, if the type matches in the target kernel's BTF;
+ *    0, if the type does not match any in the target kernel
+ */
+#define bpf_core_type_matches(type)					    \
+	__builtin_preserve_type_info(*(typeof(type) *)0, BPF_TYPE_MATCHES)

 /*
  * Convenience macro to get the byte size of a provided named type
+13
tools/lib/bpf/bpf_helpers.h
···
  * To allow use of SEC() with externs (e.g., for extern .maps declarations),
  * make sure __attribute__((unused)) doesn't trigger compilation warning.
  */
+#if __GNUC__ && !__clang__
+
+/*
+ * Pragma macros are broken on GCC
+ * https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55578
+ * https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90400
+ */
+#define SEC(name) __attribute__((section(name), used))
+
+#else
+
 #define SEC(name) \
 	_Pragma("GCC diagnostic push")					    \
 	_Pragma("GCC diagnostic ignored \"-Wignored-attributes\"")	    \
 	__attribute__((section(name), used))				    \
 	_Pragma("GCC diagnostic pop")					    \
+
+#endif

 /* Avoid 'linux/stddef.h' definition of '__always_inline'. */
 #undef __always_inline
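The compiler split in the hunk above can be tried outside of BPF with a small host-side sketch. It assumes a Linux/ELF toolchain; the section name "demo_sec" and the demo_* identifiers are hypothetical and exist only for this illustration. The trailing line continuation after the last _Pragma is dropped here so that the macro ends cleanly.

```c
#include <assert.h>

/*
 * Same shape as the macro in the diff: a plain attribute on GCC,
 * a pragma-wrapped attribute on other compilers such as Clang.
 */
#if __GNUC__ && !__clang__
#define SEC(name) __attribute__((section(name), used))
#else
#define SEC(name) \
	_Pragma("GCC diagnostic push") \
	_Pragma("GCC diagnostic ignored \"-Wignored-attributes\"") \
	__attribute__((section(name), used)) \
	_Pragma("GCC diagnostic pop")
#endif

/* "demo_sec" is a hypothetical section name chosen for this sketch. */
SEC("demo_sec") int demo_a = 1;
SEC("demo_sec") int demo_b = 2;

/* Both variables behave like ordinary globals; only their placement differs. */
int demo_sum(void)
{
	return demo_a + demo_b;
}

/* Sanity check that the annotated variables are readable as usual. */
void demo_selfcheck(void)
{
	assert(demo_sum() == 3);
}
```

Either branch should place both variables in the named section. Inspecting the object file's section headers (for example with `readelf -S`) is one way to confirm that on a given toolchain.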
+1 -1
tools/lib/bpf/bpf_tracing.h
···
 #define __PT_PARM5_REG a4
 #define __PT_RET_REG ra
 #define __PT_FP_REG s0
-#define __PT_RC_REG a5
+#define __PT_RC_REG a0
 #define __PT_SP_REG sp
 #define __PT_IP_REG pc
 /* riscv does not select ARCH_HAS_SYSCALL_WRAPPER. */
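To see why the return-code register has to be a0, here is a minimal host-side sketch. In the RISC-V calling convention a0 carries the first argument on entry and the return value on exit, while a5 is only the sixth argument register. The struct and the MOCK_* macros below are hypothetical stand-ins, not the kernel's pt_regs layout or libbpf's real accessors.

```c
#include <assert.h>

/* Hypothetical, trimmed-down register file; not the kernel's struct layout. */
struct mock_riscv_regs {
	unsigned long pc, ra, sp, s0;
	unsigned long a0, a1, a2, a3, a4, a5;
};

/* Register selection after the fix: return code lives in a0. */
#define MOCK_PT_RC_REG a0
#define MOCK_PT_REGS_RC(x) ((x)->MOCK_PT_RC_REG)

/* Simulate a callee returning: the ABI puts the return value in a0. */
void mock_function_return(struct mock_riscv_regs *regs, unsigned long value)
{
	regs->a0 = value;
}

/* What a return probe would read through the accessor macro. */
unsigned long mock_read_rc(const struct mock_riscv_regs *regs)
{
	return MOCK_PT_REGS_RC(regs);
}

/* a5 may hold unrelated data (a sixth argument) and must not be read as RC. */
void mock_selfcheck(void)
{
	struct mock_riscv_regs regs = { .a5 = 99 };

	mock_function_return(&regs, 42);
	assert(mock_read_rc(&regs) == 42);
}
```

With the old a5 mapping, the same read would have produced whatever happened to be left in the sixth argument register.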
+1 -182
tools/lib/bpf/btf.c
··· 448 448 return 0; 449 449 } 450 450 451 - __u32 btf__get_nr_types(const struct btf *btf) 452 - { 453 - return btf->start_id + btf->nr_types - 1; 454 - } 455 - 456 451 __u32 btf__type_cnt(const struct btf *btf) 457 452 { 458 453 return btf->start_id + btf->nr_types; ··· 1401 1406 struct btf *btf__load_from_kernel_by_id(__u32 id) 1402 1407 { 1403 1408 return btf__load_from_kernel_by_id_split(id, NULL); 1404 - } 1405 - 1406 - int btf__get_from_id(__u32 id, struct btf **btf) 1407 - { 1408 - struct btf *res; 1409 - int err; 1410 - 1411 - *btf = NULL; 1412 - res = btf__load_from_kernel_by_id(id); 1413 - err = libbpf_get_error(res); 1414 - 1415 - if (err) 1416 - return libbpf_err(err); 1417 - 1418 - *btf = res; 1419 - return 0; 1420 - } 1421 - 1422 - int btf__get_map_kv_tids(const struct btf *btf, const char *map_name, 1423 - __u32 expected_key_size, __u32 expected_value_size, 1424 - __u32 *key_type_id, __u32 *value_type_id) 1425 - { 1426 - const struct btf_type *container_type; 1427 - const struct btf_member *key, *value; 1428 - const size_t max_name = 256; 1429 - char container_name[max_name]; 1430 - __s64 key_size, value_size; 1431 - __s32 container_id; 1432 - 1433 - if (snprintf(container_name, max_name, "____btf_map_%s", map_name) == max_name) { 1434 - pr_warn("map:%s length of '____btf_map_%s' is too long\n", 1435 - map_name, map_name); 1436 - return libbpf_err(-EINVAL); 1437 - } 1438 - 1439 - container_id = btf__find_by_name(btf, container_name); 1440 - if (container_id < 0) { 1441 - pr_debug("map:%s container_name:%s cannot be found in BTF. 
Missing BPF_ANNOTATE_KV_PAIR?\n", 1442 - map_name, container_name); 1443 - return libbpf_err(container_id); 1444 - } 1445 - 1446 - container_type = btf__type_by_id(btf, container_id); 1447 - if (!container_type) { 1448 - pr_warn("map:%s cannot find BTF type for container_id:%u\n", 1449 - map_name, container_id); 1450 - return libbpf_err(-EINVAL); 1451 - } 1452 - 1453 - if (!btf_is_struct(container_type) || btf_vlen(container_type) < 2) { 1454 - pr_warn("map:%s container_name:%s is an invalid container struct\n", 1455 - map_name, container_name); 1456 - return libbpf_err(-EINVAL); 1457 - } 1458 - 1459 - key = btf_members(container_type); 1460 - value = key + 1; 1461 - 1462 - key_size = btf__resolve_size(btf, key->type); 1463 - if (key_size < 0) { 1464 - pr_warn("map:%s invalid BTF key_type_size\n", map_name); 1465 - return libbpf_err(key_size); 1466 - } 1467 - 1468 - if (expected_key_size != key_size) { 1469 - pr_warn("map:%s btf_key_type_size:%u != map_def_key_size:%u\n", 1470 - map_name, (__u32)key_size, expected_key_size); 1471 - return libbpf_err(-EINVAL); 1472 - } 1473 - 1474 - value_size = btf__resolve_size(btf, value->type); 1475 - if (value_size < 0) { 1476 - pr_warn("map:%s invalid BTF value_type_size\n", map_name); 1477 - return libbpf_err(value_size); 1478 - } 1479 - 1480 - if (expected_value_size != value_size) { 1481 - pr_warn("map:%s btf_value_type_size:%u != map_def_value_size:%u\n", 1482 - map_name, (__u32)value_size, expected_value_size); 1483 - return libbpf_err(-EINVAL); 1484 - } 1485 - 1486 - *key_type_id = key->type; 1487 - *value_type_id = value->type; 1488 - 1489 - return 0; 1490 1409 } 1491 1410 1492 1411 static void btf_invalidate_raw_data(struct btf *btf) ··· 2874 2965 return btf_ext->data; 2875 2966 } 2876 2967 2877 - static int btf_ext_reloc_info(const struct btf *btf, 2878 - const struct btf_ext_info *ext_info, 2879 - const char *sec_name, __u32 insns_cnt, 2880 - void **info, __u32 *cnt) 2881 - { 2882 - __u32 sec_hdrlen = sizeof(struct 
btf_ext_info_sec); 2883 - __u32 i, record_size, existing_len, records_len; 2884 - struct btf_ext_info_sec *sinfo; 2885 - const char *info_sec_name; 2886 - __u64 remain_len; 2887 - void *data; 2888 - 2889 - record_size = ext_info->rec_size; 2890 - sinfo = ext_info->info; 2891 - remain_len = ext_info->len; 2892 - while (remain_len > 0) { 2893 - records_len = sinfo->num_info * record_size; 2894 - info_sec_name = btf__name_by_offset(btf, sinfo->sec_name_off); 2895 - if (strcmp(info_sec_name, sec_name)) { 2896 - remain_len -= sec_hdrlen + records_len; 2897 - sinfo = (void *)sinfo + sec_hdrlen + records_len; 2898 - continue; 2899 - } 2900 - 2901 - existing_len = (*cnt) * record_size; 2902 - data = realloc(*info, existing_len + records_len); 2903 - if (!data) 2904 - return libbpf_err(-ENOMEM); 2905 - 2906 - memcpy(data + existing_len, sinfo->data, records_len); 2907 - /* adjust insn_off only, the rest data will be passed 2908 - * to the kernel. 2909 - */ 2910 - for (i = 0; i < sinfo->num_info; i++) { 2911 - __u32 *insn_off; 2912 - 2913 - insn_off = data + existing_len + (i * record_size); 2914 - *insn_off = *insn_off / sizeof(struct bpf_insn) + insns_cnt; 2915 - } 2916 - *info = data; 2917 - *cnt += sinfo->num_info; 2918 - return 0; 2919 - } 2920 - 2921 - return libbpf_err(-ENOENT); 2922 - } 2923 - 2924 - int btf_ext__reloc_func_info(const struct btf *btf, 2925 - const struct btf_ext *btf_ext, 2926 - const char *sec_name, __u32 insns_cnt, 2927 - void **func_info, __u32 *cnt) 2928 - { 2929 - return btf_ext_reloc_info(btf, &btf_ext->func_info, sec_name, 2930 - insns_cnt, func_info, cnt); 2931 - } 2932 - 2933 - int btf_ext__reloc_line_info(const struct btf *btf, 2934 - const struct btf_ext *btf_ext, 2935 - const char *sec_name, __u32 insns_cnt, 2936 - void **line_info, __u32 *cnt) 2937 - { 2938 - return btf_ext_reloc_info(btf, &btf_ext->line_info, sec_name, 2939 - insns_cnt, line_info, cnt); 2940 - } 2941 - 2942 - __u32 btf_ext__func_info_rec_size(const struct btf_ext 
*btf_ext) 2943 - { 2944 - return btf_ext->func_info.rec_size; 2945 - } 2946 - 2947 - __u32 btf_ext__line_info_rec_size(const struct btf_ext *btf_ext) 2948 - { 2949 - return btf_ext->line_info.rec_size; 2950 - } 2951 - 2952 2968 struct btf_dedup; 2953 2969 2954 2970 static struct btf_dedup *btf_dedup_new(struct btf *btf, const struct btf_dedup_opts *opts); ··· 3023 3189 * deduplicating structs/unions is described in greater details in comments for 3024 3190 * `btf_dedup_is_equiv` function. 3025 3191 */ 3026 - 3027 - DEFAULT_VERSION(btf__dedup_v0_6_0, btf__dedup, LIBBPF_0.6.0) 3028 - int btf__dedup_v0_6_0(struct btf *btf, const struct btf_dedup_opts *opts) 3192 + int btf__dedup(struct btf *btf, const struct btf_dedup_opts *opts) 3029 3193 { 3030 3194 struct btf_dedup *d; 3031 3195 int err; ··· 3081 3249 done: 3082 3250 btf_dedup_free(d); 3083 3251 return libbpf_err(err); 3084 - } 3085 - 3086 - COMPAT_VERSION(btf__dedup_deprecated, btf__dedup, LIBBPF_0.0.2) 3087 - int btf__dedup_deprecated(struct btf *btf, struct btf_ext *btf_ext, const void *unused_opts) 3088 - { 3089 - LIBBPF_OPTS(btf_dedup_opts, opts, .btf_ext = btf_ext); 3090 - 3091 - if (unused_opts) { 3092 - pr_warn("please use new version of btf__dedup() that supports options\n"); 3093 - return libbpf_err(-ENOTSUP); 3094 - } 3095 - 3096 - return btf__dedup(btf, &opts); 3097 3252 } 3098 3253 3099 3254 #define BTF_UNPROCESSED_ID ((__u32)-1)
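For callers migrating from the removed btf__get_nr_types() to btf__type_cnt(), the two function bodies shown in this file's diff differ by exactly one, so loop bounds change from "<=" to "<". The sketch below illustrates that with a hypothetical mock_btf struct holding only the two fields those bodies use. It is not libbpf's struct btf, and the start_id value of 1 for a base BTF is an assumption of this example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in with just the two fields used in the diff. */
struct mock_btf {
	uint32_t start_id;	/* first type ID owned by this BTF (assumed 1 for base BTF) */
	uint32_t nr_types;	/* number of types owned by this BTF */
};

/* Mirrors the removed btf__get_nr_types(): the largest valid type ID. */
uint32_t mock_get_nr_types(const struct mock_btf *btf)
{
	return btf->start_id + btf->nr_types - 1;
}

/* Mirrors the surviving btf__type_cnt(): one past the largest valid type ID. */
uint32_t mock_type_cnt(const struct mock_btf *btf)
{
	return btf->start_id + btf->nr_types;
}

/* Old iteration style: inclusive upper bound. */
uint32_t mock_count_ids_old(const struct mock_btf *btf)
{
	uint32_t id, n = 0;

	for (id = 1; id <= mock_get_nr_types(btf); id++)
		n++;
	return n;
}

/* New iteration style: exclusive upper bound, visits the same IDs. */
uint32_t mock_count_ids_new(const struct mock_btf *btf)
{
	uint32_t id, n = 0;

	for (id = 1; id < mock_type_cnt(btf); id++)
		n++;
	return n;
}

/* Both loops visit the same IDs for a base BTF with five types. */
void mock_btf_selfcheck(void)
{
	struct mock_btf btf = { .start_id = 1, .nr_types = 5 };

	assert(mock_get_nr_types(&btf) == mock_type_cnt(&btf) - 1);
	assert(mock_count_ids_old(&btf) == mock_count_ids_new(&btf));
}
```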
+2 -84
tools/lib/bpf/btf.h
···

 LIBBPF_API struct btf *btf__load_from_kernel_by_id(__u32 id);
 LIBBPF_API struct btf *btf__load_from_kernel_by_id_split(__u32 id, struct btf *base_btf);
-LIBBPF_DEPRECATED_SINCE(0, 6, "use btf__load_from_kernel_by_id instead")
-LIBBPF_API int btf__get_from_id(__u32 id, struct btf **btf);

-LIBBPF_DEPRECATED_SINCE(0, 6, "intended for internal libbpf use only")
-LIBBPF_API int btf__finalize_data(struct bpf_object *obj, struct btf *btf);
-LIBBPF_DEPRECATED_SINCE(0, 6, "use btf__load_into_kernel instead")
-LIBBPF_API int btf__load(struct btf *btf);
 LIBBPF_API int btf__load_into_kernel(struct btf *btf);
 LIBBPF_API __s32 btf__find_by_name(const struct btf *btf,
 				   const char *type_name);
 LIBBPF_API __s32 btf__find_by_name_kind(const struct btf *btf,
 					const char *type_name, __u32 kind);
-LIBBPF_DEPRECATED_SINCE(0, 7, "use btf__type_cnt() instead; note that btf__get_nr_types() == btf__type_cnt() - 1")
-LIBBPF_API __u32 btf__get_nr_types(const struct btf *btf);
 LIBBPF_API __u32 btf__type_cnt(const struct btf *btf);
 LIBBPF_API const struct btf *btf__base_btf(const struct btf *btf);
 LIBBPF_API const struct btf_type *btf__type_by_id(const struct btf *btf,
···
 LIBBPF_API const void *btf__raw_data(const struct btf *btf, __u32 *size);
 LIBBPF_API const char *btf__name_by_offset(const struct btf *btf, __u32 offset);
 LIBBPF_API const char *btf__str_by_offset(const struct btf *btf, __u32 offset);
-LIBBPF_DEPRECATED_SINCE(0, 7, "this API is not necessary when BTF-defined maps are used")
-LIBBPF_API int btf__get_map_kv_tids(const struct btf *btf, const char *map_name,
-				    __u32 expected_key_size,
-				    __u32 expected_value_size,
-				    __u32 *key_type_id, __u32 *value_type_id);

 LIBBPF_API struct btf_ext *btf_ext__new(const __u8 *data, __u32 size);
 LIBBPF_API void btf_ext__free(struct btf_ext *btf_ext);
 LIBBPF_API const void *btf_ext__raw_data(const struct btf_ext *btf_ext, __u32 *size);
-LIBBPF_API LIBBPF_DEPRECATED("btf_ext__reloc_func_info was never meant as a public API and has wrong assumptions embedded in it; it will be removed in the future libbpf versions")
-int btf_ext__reloc_func_info(const struct btf *btf,
-			     const struct btf_ext *btf_ext,
-			     const char *sec_name, __u32 insns_cnt,
-			     void **func_info, __u32 *cnt);
-LIBBPF_API LIBBPF_DEPRECATED("btf_ext__reloc_line_info was never meant as a public API and has wrong assumptions embedded in it; it will be removed in the future libbpf versions")
-int btf_ext__reloc_line_info(const struct btf *btf,
-			     const struct btf_ext *btf_ext,
-			     const char *sec_name, __u32 insns_cnt,
-			     void **line_info, __u32 *cnt);
-LIBBPF_API LIBBPF_DEPRECATED("btf_ext__reloc_func_info is deprecated; write custom func_info parsing to fetch rec_size")
-__u32 btf_ext__func_info_rec_size(const struct btf_ext *btf_ext);
-LIBBPF_API LIBBPF_DEPRECATED("btf_ext__reloc_line_info is deprecated; write custom line_info parsing to fetch rec_size")
-__u32 btf_ext__line_info_rec_size(const struct btf_ext *btf_ext);

 LIBBPF_API int btf__find_str(struct btf *btf, const char *s);
 LIBBPF_API int btf__add_str(struct btf *btf, const char *s);
···

 LIBBPF_API int btf__dedup(struct btf *btf, const struct btf_dedup_opts *opts);

-LIBBPF_API int btf__dedup_v0_6_0(struct btf *btf, const struct btf_dedup_opts *opts);
-
-LIBBPF_DEPRECATED_SINCE(0, 7, "use btf__dedup() instead")
-LIBBPF_API int btf__dedup_deprecated(struct btf *btf, struct btf_ext *btf_ext, const void *opts);
-#define btf__dedup(...) ___libbpf_overload(___btf_dedup, __VA_ARGS__)
-#define ___btf_dedup3(btf, btf_ext, opts) btf__dedup_deprecated(btf, btf_ext, opts)
-#define ___btf_dedup2(btf, opts) btf__dedup(btf, opts)
-
 struct btf_dump;

 struct btf_dump_opts {
-	union {
-		size_t sz;
-		void *ctx; /* DEPRECATED: will be gone in v1.0 */
-	};
+	size_t sz;
 };
+#define btf_dump_opts__last_field sz

 typedef void (*btf_dump_printf_fn_t)(void *ctx, const char *fmt, va_list args);

···
 					  btf_dump_printf_fn_t printf_fn,
 					  void *ctx,
 					  const struct btf_dump_opts *opts);
-
-LIBBPF_API struct btf_dump *btf_dump__new_v0_6_0(const struct btf *btf,
-						 btf_dump_printf_fn_t printf_fn,
-						 void *ctx,
-						 const struct btf_dump_opts *opts);
-
-LIBBPF_API struct btf_dump *btf_dump__new_deprecated(const struct btf *btf,
-						     const struct btf_ext *btf_ext,
-						     const struct btf_dump_opts *opts,
-						     btf_dump_printf_fn_t printf_fn);
-
-/* Choose either btf_dump__new() or btf_dump__new_deprecated() based on the
- * type of 4th argument. If it's btf_dump's print callback, use deprecated
- * API; otherwise, choose the new btf_dump__new(). ___libbpf_override()
- * doesn't work here because both variants have 4 input arguments.
- *
- * (void *) casts are necessary to avoid compilation warnings about type
- * mismatches, because even though __builtin_choose_expr() only ever evaluates
- * one side the other side still has to satisfy type constraints (this is
- * compiler implementation limitation which might be lifted eventually,
- * according to the documentation). So passing struct btf_ext in place of
- * btf_dump_printf_fn_t would be generating compilation warning. Casting to
- * void * avoids this issue.
- *
- * Also, two type compatibility checks for a function and function pointer are
- * required because passing function reference into btf_dump__new() as
- * btf_dump__new(..., my_callback, ...) and as btf_dump__new(...,
- * &my_callback, ...) (not explicit ampersand in the latter case) actually
- * differs as far as __builtin_types_compatible_p() is concerned. Thus two
- * checks are combined to detect callback argument.
- *
- * The rest works just like in case of ___libbpf_override() usage with symbol
- * versioning.
- *
- * C++ compilers don't support __builtin_types_compatible_p(), so at least
- * don't screw up compilation for them and let C++ users pick btf_dump__new
- * vs btf_dump__new_deprecated explicitly.
- */
-#ifndef __cplusplus
-#define btf_dump__new(a1, a2, a3, a4) __builtin_choose_expr(				\
-	__builtin_types_compatible_p(typeof(a4), btf_dump_printf_fn_t) ||		\
-	__builtin_types_compatible_p(typeof(a4), void(void *, const char *, va_list)),	\
-	btf_dump__new_deprecated((void *)a1, (void *)a2, (void *)a3, (void *)a4),	\
-	btf_dump__new((void *)a1, (void *)a2, (void *)a3, (void *)a4))
-#endif

 LIBBPF_API void btf_dump__free(struct btf_dump *d);

+7 -16
tools/lib/bpf/btf_dump.c
···
 static int btf_dump_mark_referenced(struct btf_dump *d);
 static int btf_dump_resize(struct btf_dump *d);

-DEFAULT_VERSION(btf_dump__new_v0_6_0, btf_dump__new, LIBBPF_0.6.0)
-struct btf_dump *btf_dump__new_v0_6_0(const struct btf *btf,
-				      btf_dump_printf_fn_t printf_fn,
-				      void *ctx,
-				      const struct btf_dump_opts *opts)
+struct btf_dump *btf_dump__new(const struct btf *btf,
+			       btf_dump_printf_fn_t printf_fn,
+			       void *ctx,
+			       const struct btf_dump_opts *opts)
 {
 	struct btf_dump *d;
 	int err;
+
+	if (!OPTS_VALID(opts, btf_dump_opts))
+		return libbpf_err_ptr(-EINVAL);

 	if (!printf_fn)
 		return libbpf_err_ptr(-EINVAL);
···
 err:
 	btf_dump__free(d);
 	return libbpf_err_ptr(err);
-}
-
-COMPAT_VERSION(btf_dump__new_deprecated, btf_dump__new, LIBBPF_0.0.4)
-struct btf_dump *btf_dump__new_deprecated(const struct btf *btf,
-					  const struct btf_ext *btf_ext,
-					  const struct btf_dump_opts *opts,
-					  btf_dump_printf_fn_t printf_fn)
-{
-	if (!printf_fn)
-		return libbpf_err_ptr(-EINVAL);
-	return btf_dump__new_v0_6_0(btf, printf_fn, opts ? opts->ctx : NULL, opts);
 }

 static int btf_dump_resize(struct btf_dump *d)
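The new OPTS_VALID() check relies on the convention that every options struct starts with a caller-supplied sz field ("size of this struct for forward/backward compatibility"). The following is a simplified, self-contained sketch of that idea, not libbpf's implementation: the demo_* names are hypothetical, and the rule that unknown trailing bytes must be zero is an assumption about what such a validation typically enforces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical options struct as the library knows it today. */
struct demo_opts {
	size_t sz;	/* size of this struct, set by the caller */
};

/* Hypothetical newer caller-side layout with one extra field. */
struct demo_opts_v2 {
	size_t sz;
	size_t extra;	/* unknown to the older library */
};

/*
 * Accept NULL (all defaults), reject a size too small to hold sz itself,
 * and reject any non-zero byte beyond the part the library understands.
 */
bool demo_opts_valid(const void *opts, size_t known_sz)
{
	const unsigned char *p = opts;
	size_t user_sz, i;

	assert(known_sz >= sizeof(size_t));

	if (!opts)
		return true;

	/* sz is always the first member, so it can be read without knowing the rest. */
	memcpy(&user_sz, p, sizeof(user_sz));
	if (user_sz < sizeof(size_t))
		return false;

	/* A newer caller is fine as long as it did not set fields we cannot honor. */
	for (i = known_sz; i < user_sz; i++) {
		if (p[i] != 0)
			return false;
	}
	return true;
}
```

The design choice this illustrates: an application built against newer headers can still run against an older library, provided it leaves the fields that library does not know about at zero.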
+175 -1368
tools/lib/bpf/libbpf.c
··· 31 31 #include <linux/bpf.h> 32 32 #include <linux/btf.h> 33 33 #include <linux/filter.h> 34 - #include <linux/list.h> 35 34 #include <linux/limits.h> 36 35 #include <linux/perf_event.h> 37 36 #include <linux/ring_buffer.h> ··· 106 107 [BPF_TRACE_FEXIT] = "trace_fexit", 107 108 [BPF_MODIFY_RETURN] = "modify_return", 108 109 [BPF_LSM_MAC] = "lsm_mac", 110 + [BPF_LSM_CGROUP] = "lsm_cgroup", 109 111 [BPF_SK_LOOKUP] = "sk_lookup", 110 112 [BPF_TRACE_ITER] = "trace_iter", 111 113 [BPF_XDP_DEVMAP] = "xdp_devmap", ··· 279 279 return (__u64) (unsigned long) ptr; 280 280 } 281 281 282 - /* this goes away in libbpf 1.0 */ 283 - enum libbpf_strict_mode libbpf_mode = LIBBPF_STRICT_NONE; 284 - 285 282 int libbpf_set_strict_mode(enum libbpf_strict_mode mode) 286 283 { 287 - libbpf_mode = mode; 284 + /* as of v1.0 libbpf_set_strict_mode() is a no-op */ 288 285 return 0; 289 286 } 290 287 ··· 344 347 SEC_ATTACH_BTF = 4, 345 348 /* BPF program type allows sleeping/blocking in kernel */ 346 349 SEC_SLEEPABLE = 8, 347 - /* allow non-strict prefix matching */ 348 - SEC_SLOPPY_PFX = 16, 349 350 /* BPF program support non-linear XDP buffer */ 350 - SEC_XDP_FRAGS = 32, 351 - /* deprecated sec definitions not supposed to be used */ 352 - SEC_DEPRECATED = 64, 351 + SEC_XDP_FRAGS = 16, 353 352 }; 354 353 355 354 struct bpf_sec_def { ··· 365 372 * linux/filter.h. 366 373 */ 367 374 struct bpf_program { 368 - const struct bpf_sec_def *sec_def; 375 + char *name; 369 376 char *sec_name; 370 377 size_t sec_idx; 378 + const struct bpf_sec_def *sec_def; 371 379 /* this program's instruction offset (in number of instructions) 372 380 * within its containing ELF section 373 381 */ ··· 387 393 * if yes, at which instruction offset. 
388 394 */ 389 395 size_t sub_insn_off; 390 - 391 - char *name; 392 - /* name with / replaced by _; makes recursive pinning 393 - * in bpf_object__pin_programs easier 394 - */ 395 - char *pin_name; 396 396 397 397 /* instructions that belong to BPF program; insns[0] is located at 398 398 * sec_insn_off instruction within its ELF section in ELF file, so ··· 408 420 size_t log_size; 409 421 __u32 log_level; 410 422 411 - struct { 412 - int nr; 413 - int *fds; 414 - } instances; 415 - bpf_program_prep_t preprocessor; 416 - 417 423 struct bpf_object *obj; 418 - void *priv; 419 - bpf_program_clear_priv_t clear_priv; 420 424 425 + int fd; 421 426 bool autoload; 422 427 bool mark_btf_static; 423 428 enum bpf_prog_type type; 424 429 enum bpf_attach_type expected_attach_type; 430 + 425 431 int prog_ifindex; 426 432 __u32 attach_btf_obj_fd; 427 433 __u32 attach_btf_id; 428 434 __u32 attach_prog_fd; 435 + 429 436 void *func_info; 430 437 __u32 func_info_rec_size; 431 438 __u32 func_info_cnt; ··· 467 484 LIBBPF_MAP_KCONFIG, 468 485 }; 469 486 487 + struct bpf_map_def { 488 + unsigned int type; 489 + unsigned int key_size; 490 + unsigned int value_size; 491 + unsigned int max_entries; 492 + unsigned int map_flags; 493 + }; 494 + 470 495 struct bpf_map { 471 496 struct bpf_object *obj; 472 497 char *name; ··· 495 504 __u32 btf_key_type_id; 496 505 __u32 btf_value_type_id; 497 506 __u32 btf_vmlinux_value_type_id; 498 - void *priv; 499 - bpf_map_clear_priv_t clear_priv; 500 507 enum libbpf_map_type libbpf_type; 501 508 void *mmaped; 502 509 struct bpf_struct_ops *st_ops; ··· 556 567 } ksym; 557 568 }; 558 569 }; 559 - 560 - static LIST_HEAD(bpf_objects_list); 561 570 562 571 struct module_btf { 563 572 struct btf *btf; ··· 625 638 626 639 /* Information when doing ELF related work. Only valid if efile.elf is not NULL */ 627 640 struct elf_state efile; 628 - /* 629 - * All loaded bpf_object are linked in a list, which is 630 - * hidden to caller. 
bpf_objects__<func> handlers deal with 631 - * all objects. 632 - */ 633 - struct list_head list; 634 641 635 642 struct btf *btf; 636 643 struct btf_ext *btf_ext; ··· 650 669 size_t log_size; 651 670 __u32 log_level; 652 671 653 - void *priv; 654 - bpf_object_clear_priv_t clear_priv; 655 - 656 672 int *fd_array; 657 673 size_t fd_array_cap; 658 674 size_t fd_array_cnt; ··· 671 693 672 694 void bpf_program__unload(struct bpf_program *prog) 673 695 { 674 - int i; 675 - 676 696 if (!prog) 677 697 return; 678 698 679 - /* 680 - * If the object is opened but the program was never loaded, 681 - * it is possible that prog->instances.nr == -1. 682 - */ 683 - if (prog->instances.nr > 0) { 684 - for (i = 0; i < prog->instances.nr; i++) 685 - zclose(prog->instances.fds[i]); 686 - } else if (prog->instances.nr != -1) { 687 - pr_warn("Internal error: instances.nr is %d\n", 688 - prog->instances.nr); 689 - } 690 - 691 - prog->instances.nr = -1; 692 - zfree(&prog->instances.fds); 699 + zclose(prog->fd); 693 700 694 701 zfree(&prog->func_info); 695 702 zfree(&prog->line_info); ··· 685 722 if (!prog) 686 723 return; 687 724 688 - if (prog->clear_priv) 689 - prog->clear_priv(prog, prog->priv); 690 - 691 - prog->priv = NULL; 692 - prog->clear_priv = NULL; 693 - 694 725 bpf_program__unload(prog); 695 726 zfree(&prog->name); 696 727 zfree(&prog->sec_name); 697 - zfree(&prog->pin_name); 698 728 zfree(&prog->insns); 699 729 zfree(&prog->reloc_desc); 700 730 701 731 prog->nr_reloc = 0; 702 732 prog->insns_cnt = 0; 703 733 prog->sec_idx = -1; 704 - } 705 - 706 - static char *__bpf_program__pin_name(struct bpf_program *prog) 707 - { 708 - char *name, *p; 709 - 710 - if (libbpf_mode & LIBBPF_STRICT_SEC_NAME) 711 - name = strdup(prog->name); 712 - else 713 - name = strdup(prog->sec_name); 714 - 715 - if (!name) 716 - return NULL; 717 - 718 - p = name; 719 - 720 - while ((p = strchr(p, '/'))) 721 - *p = '_'; 722 - 723 - return name; 724 734 } 725 735 726 736 static bool 
insn_is_subprog_call(const struct bpf_insn *insn) ··· 737 801 prog->insns_cnt = prog->sec_insn_cnt; 738 802 739 803 prog->type = BPF_PROG_TYPE_UNSPEC; 804 + prog->fd = -1; 740 805 741 806 /* libbpf's convention for SEC("?abc...") is that it's just like 742 807 * SEC("abc...") but the corresponding bpf_program starts out with ··· 751 814 prog->autoload = true; 752 815 } 753 816 754 - prog->instances.fds = NULL; 755 - prog->instances.nr = -1; 756 - 757 817 /* inherit object's log_level */ 758 818 prog->log_level = obj->log_level; 759 819 ··· 760 826 761 827 prog->name = strdup(name); 762 828 if (!prog->name) 763 - goto errout; 764 - 765 - prog->pin_name = __bpf_program__pin_name(prog); 766 - if (!prog->pin_name) 767 829 goto errout; 768 830 769 831 prog->insns = malloc(insn_data_sz); ··· 1243 1313 size_t obj_buf_sz, 1244 1314 const char *obj_name) 1245 1315 { 1246 - bool strict = (libbpf_mode & LIBBPF_STRICT_NO_OBJECT_LIST); 1247 1316 struct bpf_object *obj; 1248 1317 char *end; 1249 1318 ··· 1280 1351 obj->kern_version = get_kernel_version(); 1281 1352 obj->loaded = false; 1282 1353 1283 - INIT_LIST_HEAD(&obj->list); 1284 - if (!strict) 1285 - list_add(&obj->list, &bpf_objects_list); 1286 1354 return obj; 1287 1355 } 1288 1356 ··· 1312 1386 } 1313 1387 1314 1388 if (obj->efile.obj_buf_sz > 0) { 1315 - /* 1316 - * obj_buf should have been validated by 1317 - * bpf_object__open_buffer(). 1318 - */ 1389 + /* obj_buf should have been validated by bpf_object__open_mem(). 
*/ 1319 1390 elf = elf_memory((char *)obj->efile.obj_buf, obj->efile.obj_buf_sz); 1320 1391 } else { 1321 1392 obj->efile.fd = open(obj->path, O_RDONLY | O_CLOEXEC); ··· 1975 2052 return 0; 1976 2053 } 1977 2054 1978 - static int bpf_object__init_user_maps(struct bpf_object *obj, bool strict) 1979 - { 1980 - Elf_Data *symbols = obj->efile.symbols; 1981 - int i, map_def_sz = 0, nr_maps = 0, nr_syms; 1982 - Elf_Data *data = NULL; 1983 - Elf_Scn *scn; 1984 - 1985 - if (obj->efile.maps_shndx < 0) 1986 - return 0; 1987 - 1988 - if (libbpf_mode & LIBBPF_STRICT_MAP_DEFINITIONS) { 1989 - pr_warn("legacy map definitions in SEC(\"maps\") are not supported\n"); 1990 - return -EOPNOTSUPP; 1991 - } 1992 - 1993 - if (!symbols) 1994 - return -EINVAL; 1995 - 1996 - scn = elf_sec_by_idx(obj, obj->efile.maps_shndx); 1997 - data = elf_sec_data(obj, scn); 1998 - if (!scn || !data) { 1999 - pr_warn("elf: failed to get legacy map definitions for %s\n", 2000 - obj->path); 2001 - return -EINVAL; 2002 - } 2003 - 2004 - /* 2005 - * Count number of maps. Each map has a name. 2006 - * Array of maps is not supported: only the first element is 2007 - * considered. 2008 - * 2009 - * TODO: Detect array of map and report error. 
2010 - */ 2011 - nr_syms = symbols->d_size / sizeof(Elf64_Sym); 2012 - for (i = 0; i < nr_syms; i++) { 2013 - Elf64_Sym *sym = elf_sym_by_idx(obj, i); 2014 - 2015 - if (sym->st_shndx != obj->efile.maps_shndx) 2016 - continue; 2017 - if (ELF64_ST_TYPE(sym->st_info) == STT_SECTION) 2018 - continue; 2019 - nr_maps++; 2020 - } 2021 - /* Assume equally sized map definitions */ 2022 - pr_debug("elf: found %d legacy map definitions (%zd bytes) in %s\n", 2023 - nr_maps, data->d_size, obj->path); 2024 - 2025 - if (!data->d_size || nr_maps == 0 || (data->d_size % nr_maps) != 0) { 2026 - pr_warn("elf: unable to determine legacy map definition size in %s\n", 2027 - obj->path); 2028 - return -EINVAL; 2029 - } 2030 - map_def_sz = data->d_size / nr_maps; 2031 - 2032 - /* Fill obj->maps using data in "maps" section. */ 2033 - for (i = 0; i < nr_syms; i++) { 2034 - Elf64_Sym *sym = elf_sym_by_idx(obj, i); 2035 - const char *map_name; 2036 - struct bpf_map_def *def; 2037 - struct bpf_map *map; 2038 - 2039 - if (sym->st_shndx != obj->efile.maps_shndx) 2040 - continue; 2041 - if (ELF64_ST_TYPE(sym->st_info) == STT_SECTION) 2042 - continue; 2043 - 2044 - map = bpf_object__add_map(obj); 2045 - if (IS_ERR(map)) 2046 - return PTR_ERR(map); 2047 - 2048 - map_name = elf_sym_str(obj, sym->st_name); 2049 - if (!map_name) { 2050 - pr_warn("failed to get map #%d name sym string for obj %s\n", 2051 - i, obj->path); 2052 - return -LIBBPF_ERRNO__FORMAT; 2053 - } 2054 - 2055 - pr_warn("map '%s' (legacy): legacy map definitions are deprecated, use BTF-defined maps instead\n", map_name); 2056 - 2057 - if (ELF64_ST_BIND(sym->st_info) == STB_LOCAL) { 2058 - pr_warn("map '%s' (legacy): static maps are not supported\n", map_name); 2059 - return -ENOTSUP; 2060 - } 2061 - 2062 - map->libbpf_type = LIBBPF_MAP_UNSPEC; 2063 - map->sec_idx = sym->st_shndx; 2064 - map->sec_offset = sym->st_value; 2065 - pr_debug("map '%s' (legacy): at sec_idx %d, offset %zu.\n", 2066 - map_name, map->sec_idx, map->sec_offset); 
2067 - if (sym->st_value + map_def_sz > data->d_size) { 2068 - pr_warn("corrupted maps section in %s: last map \"%s\" too small\n", 2069 - obj->path, map_name); 2070 - return -EINVAL; 2071 - } 2072 - 2073 - map->name = strdup(map_name); 2074 - if (!map->name) { 2075 - pr_warn("map '%s': failed to alloc map name\n", map_name); 2076 - return -ENOMEM; 2077 - } 2078 - pr_debug("map %d is \"%s\"\n", i, map->name); 2079 - def = (struct bpf_map_def *)(data->d_buf + sym->st_value); 2080 - /* 2081 - * If the definition of the map in the object file fits in 2082 - * bpf_map_def, copy it. Any extra fields in our version 2083 - * of bpf_map_def will default to zero as a result of the 2084 - * calloc above. 2085 - */ 2086 - if (map_def_sz <= sizeof(struct bpf_map_def)) { 2087 - memcpy(&map->def, def, map_def_sz); 2088 - } else { 2089 - /* 2090 - * Here the map structure being read is bigger than what 2091 - * we expect, truncate if the excess bits are all zero. 2092 - * If they are not zero, reject this map as 2093 - * incompatible. 
2094 - */ 2095 - char *b; 2096 - 2097 - for (b = ((char *)def) + sizeof(struct bpf_map_def); 2098 - b < ((char *)def) + map_def_sz; b++) { 2099 - if (*b != 0) { 2100 - pr_warn("maps section in %s: \"%s\" has unrecognized, non-zero options\n", 2101 - obj->path, map_name); 2102 - if (strict) 2103 - return -EINVAL; 2104 - } 2105 - } 2106 - memcpy(&map->def, def, sizeof(struct bpf_map_def)); 2107 - } 2108 - 2109 - /* btf info may not exist but fill it in if it does exist */ 2110 - (void) bpf_map_find_btf_info(obj, map); 2111 - } 2112 - return 0; 2113 - } 2114 - 2115 2055 const struct btf_type * 2116 2056 skip_mods_and_typedefs(const struct btf *btf, __u32 id, __u32 *res_id) 2117 2057 { ··· 2091 2305 2092 2306 return bpf_map__set_pin_path(map, buf); 2093 2307 } 2308 + 2309 + /* should match definition in bpf_helpers.h */ 2310 + enum libbpf_pin_type { 2311 + LIBBPF_PIN_NONE, 2312 + /* PIN_BY_NAME: pin maps by name (in /sys/fs/bpf by default) */ 2313 + LIBBPF_PIN_BY_NAME, 2314 + }; 2094 2315 2095 2316 int parse_btf_map_def(const char *map_name, struct btf *btf, 2096 2317 const struct btf_type *def_t, bool strict, ··· 2531 2738 { 2532 2739 const char *pin_root_path; 2533 2740 bool strict; 2534 - int err; 2741 + int err = 0; 2535 2742 2536 2743 strict = !OPTS_GET(opts, relaxed_maps, false); 2537 2744 pin_root_path = OPTS_GET(opts, pin_root_path, NULL); 2538 2745 2539 - err = bpf_object__init_user_maps(obj, strict); 2540 2746 err = err ?: bpf_object__init_user_btf_maps(obj, strict, pin_root_path); 2541 2747 err = err ?: bpf_object__init_global_data_maps(obj); 2542 2748 err = err ?: bpf_object__init_kconfig_map(obj); ··· 2851 3059 } 2852 3060 2853 3061 return libbpf_err(err); 2854 - } 2855 - 2856 - int btf__finalize_data(struct bpf_object *obj, struct btf *btf) 2857 - { 2858 - return btf_finalize_data(obj, btf); 2859 3062 } 2860 3063 2861 3064 static int bpf_object__finalize_btf(struct bpf_object *obj) ··· 3809 4022 return 0; 3810 4023 } 3811 4024 3812 - struct bpf_program * 
3813 - bpf_object__find_program_by_title(const struct bpf_object *obj,
3814 - const char *title)
4025 + static bool prog_is_subprog(const struct bpf_object *obj, const struct bpf_program *prog)
3815 4026 {
3816 - struct bpf_program *pos;
3817 -
3818 - bpf_object__for_each_program(pos, obj) {
3819 - if (pos->sec_name && !strcmp(pos->sec_name, title))
3820 - return pos;
3821 - }
3822 - return errno = ENOENT, NULL;
3823 - }
3824 -
3825 - static bool prog_is_subprog(const struct bpf_object *obj,
3826 - const struct bpf_program *prog)
3827 - {
3828 - /* For legacy reasons, libbpf supports an entry-point BPF programs
3829 - * without SEC() attribute, i.e., those in the .text section. But if
3830 - * there are 2 or more such programs in the .text section, they all
3831 - * must be subprograms called from entry-point BPF programs in
3832 - * designated SEC()'tions, otherwise there is no way to distinguish
3833 - * which of those programs should be loaded vs which are a subprogram.
3834 - * Similarly, if there is a function/program in .text and at least one
3835 - * other BPF program with custom SEC() attribute, then we just assume
3836 - * .text programs are subprograms (even if they are not called from
3837 - * other programs), because libbpf never explicitly supported mixing
3838 - * SEC()-designated BPF programs and .text entry-point BPF programs.
3839 - *
3840 - * In libbpf 1.0 strict mode, we always consider .text
3841 - * programs to be subprograms.
3842 - */
3843 -
3844 - if (libbpf_mode & LIBBPF_STRICT_SEC_NAME)
3845 - return prog->sec_idx == obj->efile.text_shndx;
3846 -
3847 4027 return prog->sec_idx == obj->efile.text_shndx && obj->nr_programs > 1;
3848 4028 }
3849 4029
···
4151 4397
4152 4398 static int bpf_map_find_btf_info(struct bpf_object *obj, struct bpf_map *map)
4153 4399 {
4154 - struct bpf_map_def *def = &map->def;
4155 - __u32 key_type_id = 0, value_type_id = 0;
4156 - int ret;
4400 + int id;
4157 4401
4158 4402 if (!obj->btf)
4159 4403 return -ENOENT;
···
4160 4408 * For struct_ops map, it does not need btf_key_type_id and
4161 4409 * btf_value_type_id.
4162 4410 */
4163 - if (map->sec_idx == obj->efile.btf_maps_shndx ||
4164 - bpf_map__is_struct_ops(map))
4411 + if (map->sec_idx == obj->efile.btf_maps_shndx || bpf_map__is_struct_ops(map))
4165 4412 return 0;
4166 4413
4167 - if (!bpf_map__is_internal(map)) {
4168 - pr_warn("Use of BPF_ANNOTATE_KV_PAIR is deprecated, use BTF-defined maps in .maps section instead\n");
4169 - #pragma GCC diagnostic push
4170 - #pragma GCC diagnostic ignored "-Wdeprecated-declarations"
4171 - ret = btf__get_map_kv_tids(obj->btf, map->name, def->key_size,
4172 - def->value_size, &key_type_id,
4173 - &value_type_id);
4174 - #pragma GCC diagnostic pop
4175 - } else {
4176 - /*
4177 - * LLVM annotates global data differently in BTF, that is,
4178 - * only as '.data', '.bss' or '.rodata'.
4179 - */
4180 - ret = btf__find_by_name(obj->btf, map->real_name);
4181 - }
4182 - if (ret < 0)
4183 - return ret;
4414 + /*
4415 + * LLVM annotates global data differently in BTF, that is,
4416 + * only as '.data', '.bss' or '.rodata'.
4417 + */
4418 + if (!bpf_map__is_internal(map))
4419 + return -ENOENT;
4184 4420
4185 - map->btf_key_type_id = key_type_id;
4186 - map->btf_value_type_id = bpf_map__is_internal(map) ?
4187 - ret : value_type_id;
4421 + id = btf__find_by_name(obj->btf, map->real_name);
4422 + if (id < 0)
4423 + return id;
4424 +
4425 + map->btf_key_type_id = 0;
4426 + map->btf_value_type_id = id;
4188 4427 return 0;
4189 4428 }
4190 4429
···
4305 4562 return libbpf_err(-EBUSY);
4306 4563 map->def.max_entries = max_entries;
4307 4564 return 0;
4308 - }
4309 -
4310 - int bpf_map__resize(struct bpf_map *map, __u32 max_entries)
4311 - {
4312 - if (!map || !max_entries)
4313 - return libbpf_err(-EINVAL);
4314 -
4315 - return bpf_map__set_max_entries(map, max_entries);
4316 4565 }
4317 4566
4318 4567 static int
···
5467 5732 int bpf_core_types_are_compat(const struct btf *local_btf, __u32 local_id,
5468 5733 const struct btf *targ_btf, __u32 targ_id)
5469 5734 {
5470 - const struct btf_type *local_type, *targ_type;
5471 - int depth = 32; /* max recursion depth */
5735 + return __bpf_core_types_are_compat(local_btf, local_id, targ_btf, targ_id, 32);
5736 + }
5472 5737
5473 - /* caller made sure that names match (ignoring flavor suffix) */
5474 - local_type = btf__type_by_id(local_btf, local_id);
5475 - targ_type = btf__type_by_id(targ_btf, targ_id);
5476 - if (!btf_kind_core_compat(local_type, targ_type))
5477 - return 0;
5478 -
5479 - recur:
5480 - depth--;
5481 - if (depth < 0)
5482 - return -EINVAL;
5483 -
5484 - local_type = skip_mods_and_typedefs(local_btf, local_id, &local_id);
5485 - targ_type = skip_mods_and_typedefs(targ_btf, targ_id, &targ_id);
5486 - if (!local_type || !targ_type)
5487 - return -EINVAL;
5488 -
5489 - if (!btf_kind_core_compat(local_type, targ_type))
5490 - return 0;
5491 -
5492 - switch (btf_kind(local_type)) {
5493 - case BTF_KIND_UNKN:
5494 - case BTF_KIND_STRUCT:
5495 - case BTF_KIND_UNION:
5496 - case BTF_KIND_ENUM:
5497 - case BTF_KIND_ENUM64:
5498 - case BTF_KIND_FWD:
5499 - return 1;
5500 - case BTF_KIND_INT:
5501 - /* just reject deprecated bitfield-like integers; all other
5502 - * integers are by default compatible between each other
5503 - */
5504 - return btf_int_offset(local_type) == 0 && btf_int_offset(targ_type) == 0;
5505 - case BTF_KIND_PTR:
5506 - local_id = local_type->type;
5507 - targ_id = targ_type->type;
5508 - goto recur;
5509 - case BTF_KIND_ARRAY:
5510 - local_id = btf_array(local_type)->type;
5511 - targ_id = btf_array(targ_type)->type;
5512 - goto recur;
5513 - case BTF_KIND_FUNC_PROTO: {
5514 - struct btf_param *local_p = btf_params(local_type);
5515 - struct btf_param *targ_p = btf_params(targ_type);
5516 - __u16 local_vlen = btf_vlen(local_type);
5517 - __u16 targ_vlen = btf_vlen(targ_type);
5518 - int i, err;
5519 -
5520 - if (local_vlen != targ_vlen)
5521 - return 0;
5522 -
5523 - for (i = 0; i < local_vlen; i++, local_p++, targ_p++) {
5524 - skip_mods_and_typedefs(local_btf, local_p->type, &local_id);
5525 - skip_mods_and_typedefs(targ_btf, targ_p->type, &targ_id);
5526 - err = bpf_core_types_are_compat(local_btf, local_id, targ_btf, targ_id);
5527 - if (err <= 0)
5528 - return err;
5529 - }
5530 -
5531 - /* tail recurse for return type check */
5532 - skip_mods_and_typedefs(local_btf, local_type->type, &local_id);
5533 - skip_mods_and_typedefs(targ_btf, targ_type->type, &targ_id);
5534 - goto recur;
5535 - }
5536 - default:
5537 - pr_warn("unexpected kind %s relocated, local [%d], target [%d]\n",
5538 - btf_kind_str(local_type), local_id, targ_id);
5539 - return 0;
5540 - }
5738 + int bpf_core_types_match(const struct btf *local_btf, __u32 local_id,
5739 + const struct btf *targ_btf, __u32 targ_id)
5740 + {
5741 + return __bpf_core_types_match(local_btf, local_id, targ_btf, targ_id, false, 32);
5541 5742 }
5542 5743
5543 5744 static size_t bpf_core_hash_fn(const void *key, void *ctx)
···
6597 6926 if (prog->type == BPF_PROG_TYPE_XDP && (def & SEC_XDP_FRAGS))
6598 6927 opts->prog_flags |= BPF_F_XDP_HAS_FRAGS;
6599 6928
6600 - if (def & SEC_DEPRECATED) {
6601 - pr_warn("SEC(\"%s\") is deprecated, please see https://github.com/libbpf/libbpf/wiki/Libbpf-1.0-migration-guide#bpf-program-sec-annotation-deprecations for details\n",
6602 - prog->sec_name);
6603 - }
6604 -
6605 6929 if ((def & SEC_ATTACH_BTF) && !prog->attach_btf_id) {
6606 6930 int btf_obj_fd = 0, btf_type_id = 0, err;
6607 6931 const char *attach_name;
···
6639 6973
6640 6974 static void fixup_verifier_log(struct bpf_program *prog, char *buf, size_t buf_sz);
6641 6975
6642 - static int bpf_object_load_prog_instance(struct bpf_object *obj, struct bpf_program *prog,
6643 - struct bpf_insn *insns, int insns_cnt,
6644 - const char *license, __u32 kern_version,
6645 - int *prog_fd)
6976 + static int bpf_object_load_prog(struct bpf_object *obj, struct bpf_program *prog,
6977 + struct bpf_insn *insns, int insns_cnt,
6978 + const char *license, __u32 kern_version, int *prog_fd)
6646 6979 {
6647 6980 LIBBPF_OPTS(bpf_prog_load_opts, load_attr);
6648 6981 const char *prog_name = NULL;
···
7008 7343 return 0;
7009 7344 }
7010 7345
7011 - static int bpf_object_load_prog(struct bpf_object *obj, struct bpf_program *prog,
7012 - const char *license, __u32 kern_ver)
7013 - {
7014 - int err = 0, fd, i;
7015 -
7016 - if (obj->loaded) {
7017 - pr_warn("prog '%s': can't load after object was loaded\n", prog->name);
7018 - return libbpf_err(-EINVAL);
7019 - }
7020 -
7021 - if (prog->instances.nr < 0 || !prog->instances.fds) {
7022 - if (prog->preprocessor) {
7023 - pr_warn("Internal error: can't load program '%s'\n",
7024 - prog->name);
7025 - return libbpf_err(-LIBBPF_ERRNO__INTERNAL);
7026 - }
7027 -
7028 - prog->instances.fds = malloc(sizeof(int));
7029 - if (!prog->instances.fds) {
7030 - pr_warn("Not enough memory for BPF fds\n");
7031 - return libbpf_err(-ENOMEM);
7032 - }
7033 - prog->instances.nr = 1;
7034 - prog->instances.fds[0] = -1;
7035 - }
7036 -
7037 - if (!prog->preprocessor) {
7038 - if (prog->instances.nr != 1) {
7039 - pr_warn("prog '%s': inconsistent nr(%d) != 1\n",
7040 - prog->name, prog->instances.nr);
7041 - }
7042 - if (obj->gen_loader)
7043 - bpf_program_record_relos(prog);
7044 - err = bpf_object_load_prog_instance(obj, prog,
7045 - prog->insns, prog->insns_cnt,
7046 - license, kern_ver, &fd);
7047 - if (!err)
7048 - prog->instances.fds[0] = fd;
7049 - goto out;
7050 - }
7051 -
7052 - for (i = 0; i < prog->instances.nr; i++) {
7053 - struct bpf_prog_prep_result result;
7054 - bpf_program_prep_t preprocessor = prog->preprocessor;
7055 -
7056 - memset(&result, 0, sizeof(result));
7057 - err = preprocessor(prog, i, prog->insns,
7058 - prog->insns_cnt, &result);
7059 - if (err) {
7060 - pr_warn("Preprocessing the %dth instance of program '%s' failed\n",
7061 - i, prog->name);
7062 - goto out;
7063 - }
7064 -
7065 - if (!result.new_insn_ptr || !result.new_insn_cnt) {
7066 - pr_debug("Skip loading the %dth instance of program '%s'\n",
7067 - i, prog->name);
7068 - prog->instances.fds[i] = -1;
7069 - if (result.pfd)
7070 - *result.pfd = -1;
7071 - continue;
7072 - }
7073 -
7074 - err = bpf_object_load_prog_instance(obj, prog,
7075 - result.new_insn_ptr, result.new_insn_cnt,
7076 - license, kern_ver, &fd);
7077 - if (err) {
7078 - pr_warn("Loading the %dth instance of program '%s' failed\n",
7079 - i, prog->name);
7080 - goto out;
7081 - }
7082 -
7083 - if (result.pfd)
7084 - *result.pfd = fd;
7085 - prog->instances.fds[i] = fd;
7086 - }
7087 - out:
7088 - if (err)
7089 - pr_warn("failed to load program '%s'\n", prog->name);
7090 - return libbpf_err(err);
7091 - }
7092 -
7093 - int bpf_program__load(struct bpf_program *prog, const char *license, __u32 kern_ver)
7094 - {
7095 - return bpf_object_load_prog(prog->obj, prog, license, kern_ver);
7096 - }
7097 -
7098 7346 static int
7099 7347 bpf_object__load_progs(struct bpf_object *obj, int log_level)
7100 7348 {
···
7031 7453 continue;
7032 7454 }
7033 7455 prog->log_level |= log_level;
7034 - err = bpf_object_load_prog(obj, prog, obj->license, obj->kern_version);
7035 - if (err)
7456 +
7457 + if (obj->gen_loader)
7458 + bpf_program_record_relos(prog);
7459 +
7460 + err = bpf_object_load_prog(obj, prog, prog->insns, prog->insns_cnt,
7461 + obj->license, obj->kern_version, &prog->fd);
7462 + if (err) {
7463 + pr_warn("prog '%s': failed to load: %d\n", prog->name, err);
7036 7464 return err;
7465 + }
7037 7466 }
7038 7467
7039 7468 bpf_object__free_relocs(obj);
···
7065 7480
7066 7481 prog->type = prog->sec_def->prog_type;
7067 7482 prog->expected_attach_type = prog->sec_def->expected_attach_type;
7068 -
7069 - #pragma GCC diagnostic push
7070 - #pragma GCC diagnostic ignored "-Wdeprecated-declarations"
7071 - if (prog->sec_def->prog_type == BPF_PROG_TYPE_TRACING ||
7072 - prog->sec_def->prog_type == BPF_PROG_TYPE_EXT)
7073 - prog->attach_prog_fd = OPTS_GET(opts, attach_prog_fd, 0);
7074 - #pragma GCC diagnostic pop
7075 7483
7076 7484 /* sec_def can have custom callback which should be called
7077 7485 * after bpf_program is initialized to adjust its properties
···
7171 7593 return ERR_PTR(err);
7172 7594 }
7173 7595
7174 - static struct bpf_object *
7175 - __bpf_object__open_xattr(struct bpf_object_open_attr *attr, int flags)
7176 - {
7177 - DECLARE_LIBBPF_OPTS(bpf_object_open_opts, opts,
7178 - .relaxed_maps = flags & MAPS_RELAX_COMPAT,
7179 - );
7180 -
7181 - /* param validation */
7182 - if (!attr->file)
7183 - return NULL;
7184 -
7185 - pr_debug("loading %s\n", attr->file);
7186 - return bpf_object_open(attr->file, NULL, 0, &opts);
7187 - }
7188 -
7189 - struct bpf_object *bpf_object__open_xattr(struct bpf_object_open_attr *attr)
7190 - {
7191 - return libbpf_ptr(__bpf_object__open_xattr(attr, 0));
7192 - }
7193 -
7194 - struct bpf_object *bpf_object__open(const char *path)
7195 - {
7196 - struct bpf_object_open_attr attr = {
7197 - .file = path,
7198 - .prog_type = BPF_PROG_TYPE_UNSPEC,
7199 - };
7200 -
7201 - return libbpf_ptr(__bpf_object__open_xattr(&attr, 0));
7202 - }
7203 -
7204 7596 struct bpf_object *
7205 7597 bpf_object__open_file(const char *path, const struct bpf_object_open_opts *opts)
7206 7598 {
···
7182 7634 return libbpf_ptr(bpf_object_open(path, NULL, 0, opts));
7183 7635 }
7184 7636
7637 + struct bpf_object *bpf_object__open(const char *path)
7638 + {
7639 + return bpf_object__open_file(path, NULL);
7640 + }
7641 +
7185 7642 struct bpf_object *
7186 7643 bpf_object__open_mem(const void *obj_buf, size_t obj_buf_sz,
7187 7644 const struct bpf_object_open_opts *opts)
···
7195 7642 return libbpf_err_ptr(-EINVAL);
7196 7643
7197 7644 return libbpf_ptr(bpf_object_open(NULL, obj_buf, obj_buf_sz, opts));
7198 - }
7199 -
7200 - struct bpf_object *
7201 - bpf_object__open_buffer(const void *obj_buf, size_t obj_buf_sz,
7202 - const char *name)
7203 - {
7204 - DECLARE_LIBBPF_OPTS(bpf_object_open_opts, opts,
7205 - .object_name = name,
7206 - /* wrong default, but backwards-compatible */
7207 - .relaxed_maps = true,
7208 - );
7209 -
7210 - /* returning NULL is wrong, but backwards-compatible */
7211 - if (!obj_buf || obj_buf_sz == 0)
7212 - return errno = EINVAL, NULL;
7213 -
7214 - return libbpf_ptr(bpf_object_open(NULL, obj_buf, obj_buf_sz, &opts));
7215 7645 }
7216 7646
7217 7647 static int bpf_object_unload(struct bpf_object *obj)
···
7629 8093 return libbpf_err(err);
7630 8094 }
7631 8095
7632 - int bpf_object__load_xattr(struct bpf_object_load_attr *attr)
7633 - {
7634 - return bpf_object_load(attr->obj, attr->log_level, attr->target_btf_path);
7635 - }
7636 -
7637 8096 int bpf_object__load(struct bpf_object *obj)
7638 8097 {
7639 8098 return bpf_object_load(obj, 0, NULL);
···
7686 8155 return err;
7687 8156 }
7688 8157
7689 - static int bpf_program_pin_instance(struct bpf_program *prog, const char *path, int instance)
8158 + int bpf_program__pin(struct bpf_program *prog, const char *path)
7690 8159 {
7691 8160 char *cp, errmsg[STRERR_BUFSIZE];
7692 8161 int err;
7693 8162
8163 + if (prog->fd < 0) {
8164 + pr_warn("prog '%s': can't pin program that wasn't loaded\n", prog->name);
8165 + return libbpf_err(-EINVAL);
8166 + }
8167 +
7694 8168 err = make_parent_dir(path);
7695 8169 if (err)
7696 8170 return libbpf_err(err);
···
7704 8168 if (err)
7705 8169 return libbpf_err(err);
7706 8170
7707 - if (prog == NULL) {
7708 - pr_warn("invalid program pointer\n");
7709 - return libbpf_err(-EINVAL);
7710 - }
7711 -
7712 - if (instance < 0 || instance >= prog->instances.nr) {
7713 - pr_warn("invalid prog instance %d of prog %s (max %d)\n",
7714 - instance, prog->name, prog->instances.nr);
7715 - return libbpf_err(-EINVAL);
7716 - }
7717 -
7718 - if (bpf_obj_pin(prog->instances.fds[instance], path)) {
8171 + if (bpf_obj_pin(prog->fd, path)) {
7719 8172 err = -errno;
7720 8173 cp = libbpf_strerror_r(err, errmsg, sizeof(errmsg));
7721 - pr_warn("failed to pin program: %s\n", cp);
8174 + pr_warn("prog '%s': failed to pin at '%s': %s\n", prog->name, path, cp);
7722 8175 return libbpf_err(err);
7723 8176 }
7724 - pr_debug("pinned program '%s'\n", path);
7725 8177
8178 + pr_debug("prog '%s': pinned at '%s'\n", prog->name, path);
7726 8179 return 0;
7727 - }
7728 -
7729 - static int bpf_program_unpin_instance(struct bpf_program *prog, const char *path, int instance)
7730 - {
7731 - int err;
7732 -
7733 - err = check_path(path);
7734 - if (err)
7735 - return libbpf_err(err);
7736 -
7737 - if (prog == NULL) {
7738 - pr_warn("invalid program pointer\n");
7739 - return libbpf_err(-EINVAL);
7740 - }
7741 -
7742 - if (instance < 0 || instance >= prog->instances.nr) {
7743 - pr_warn("invalid prog instance %d of prog %s (max %d)\n",
7744 - instance, prog->name, prog->instances.nr);
7745 - return libbpf_err(-EINVAL);
7746 - }
7747 -
7748 - err = unlink(path);
7749 - if (err != 0)
7750 - return libbpf_err(-errno);
7751 -
7752 - pr_debug("unpinned program '%s'\n", path);
7753 -
7754 - return 0;
7755 - }
7756 -
7757 - __attribute__((alias("bpf_program_pin_instance")))
7758 - int bpf_object__pin_instance(struct bpf_program *prog, const char *path, int instance);
7759 -
7760 - __attribute__((alias("bpf_program_unpin_instance")))
7761 - int bpf_program__unpin_instance(struct bpf_program *prog, const char *path, int instance);
7762 -
7763 - int bpf_program__pin(struct bpf_program *prog, const char *path)
7764 - {
7765 - int i, err;
7766 -
7767 - err = make_parent_dir(path);
7768 - if (err)
7769 - return libbpf_err(err);
7770 -
7771 - err = check_path(path);
7772 - if (err)
7773 - return libbpf_err(err);
7774 -
7775 - if (prog == NULL) {
7776 - pr_warn("invalid program pointer\n");
7777 - return libbpf_err(-EINVAL);
7778 - }
7779 -
7780 - if (prog->instances.nr <= 0) {
7781 - pr_warn("no instances of prog %s to pin\n", prog->name);
7782 - return libbpf_err(-EINVAL);
7783 - }
7784 -
7785 - if (prog->instances.nr == 1) {
7786 - /* don't create subdirs when pinning single instance */
7787 - return bpf_program_pin_instance(prog, path, 0);
7788 - }
7789 -
7790 - for (i = 0; i < prog->instances.nr; i++) {
7791 - char buf[PATH_MAX];
7792 - int len;
7793 -
7794 - len = snprintf(buf, PATH_MAX, "%s/%d", path, i);
7795 - if (len < 0) {
7796 - err = -EINVAL;
7797 - goto err_unpin;
7798 - } else if (len >= PATH_MAX) {
7799 - err = -ENAMETOOLONG;
7800 - goto err_unpin;
7801 - }
7802 -
7803 - err = bpf_program_pin_instance(prog, buf, i);
7804 - if (err)
7805 - goto err_unpin;
7806 - }
7807 -
7808 - return 0;
7809 -
7810 - err_unpin:
7811 - for (i = i - 1; i >= 0; i--) {
7812 - char buf[PATH_MAX];
7813 - int len;
7814 -
7815 - len = snprintf(buf, PATH_MAX, "%s/%d", path, i);
7816 - if (len < 0)
7817 - continue;
7818 - else if (len >= PATH_MAX)
7819 - continue;
7820 -
7821 - bpf_program_unpin_instance(prog, buf, i);
7822 - }
7823 -
7824 - rmdir(path);
7825 -
7826 - return libbpf_err(err);
7827 8180 }
7828 8181
7829 8182 int bpf_program__unpin(struct bpf_program *prog, const char *path)
7830 8183 {
7831 - int i, err;
8184 + int err;
8185 +
8186 + if (prog->fd < 0) {
8187 + pr_warn("prog '%s': can't unpin program that wasn't loaded\n", prog->name);
8188 + return libbpf_err(-EINVAL);
8189 + }
7832 8190
7833 8191 err = check_path(path);
7834 8192 if (err)
7835 8193 return libbpf_err(err);
7836 8194
7837 - if (prog == NULL) {
7838 - pr_warn("invalid program pointer\n");
7839 - return libbpf_err(-EINVAL);
7840 - }
7841 -
7842 - if (prog->instances.nr <= 0) {
7843 - pr_warn("no instances of prog %s to pin\n", prog->name);
7844 - return libbpf_err(-EINVAL);
7845 - }
7846 -
7847 - if (prog->instances.nr == 1) {
7848 - /* don't create subdirs when pinning single instance */
7849 - return bpf_program_unpin_instance(prog, path, 0);
7850 - }
7851 -
7852 - for (i = 0; i < prog->instances.nr; i++) {
7853 - char buf[PATH_MAX];
7854 - int len;
7855 -
7856 - len = snprintf(buf, PATH_MAX, "%s/%d", path, i);
7857 - if (len < 0)
7858 - return libbpf_err(-EINVAL);
7859 - else if (len >= PATH_MAX)
7860 - return libbpf_err(-ENAMETOOLONG);
7861 -
7862 - err = bpf_program_unpin_instance(prog, buf, i);
7863 - if (err)
7864 - return err;
7865 - }
7866 -
7867 - err = rmdir(path);
8195 + err = unlink(path);
7868 8196 if (err)
7869 8197 return libbpf_err(-errno);
7870 8198
8199 + pr_debug("prog '%s': unpinned from '%s'\n", prog->name, path);
7871 8200 return 0;
7872 8201 }
7873 8202
···
7979 8578 char buf[PATH_MAX];
7980 8579 int len;
7981 8580
7982 - len = snprintf(buf, PATH_MAX, "%s/%s", path,
7983 - prog->pin_name);
8581 + len = snprintf(buf, PATH_MAX, "%s/%s", path, prog->name);
7984 8582 if (len < 0) {
7985 8583 err = -EINVAL;
7986 8584 goto err_unpin_programs;
···
8000 8600 char buf[PATH_MAX];
8001 8601 int len;
8002 8602
8003 - len = snprintf(buf, PATH_MAX, "%s/%s", path,
8004 - prog->pin_name);
8603 + len = snprintf(buf, PATH_MAX, "%s/%s", path, prog->name);
8005 8604 if (len < 0)
8006 8605 continue;
8007 8606 else if (len >= PATH_MAX)
···
8024 8625 char buf[PATH_MAX];
8025 8626 int len;
8026 8627
8027 - len = snprintf(buf, PATH_MAX, "%s/%s", path,
8028 - prog->pin_name);
8628 + len = snprintf(buf, PATH_MAX, "%s/%s", path, prog->name);
8029 8629 if (len < 0)
8030 8630 return libbpf_err(-EINVAL);
8031 8631 else if (len >= PATH_MAX)
···
8057 8659
8058 8660 static void bpf_map__destroy(struct bpf_map *map)
8059 8661 {
8060 - if (map->clear_priv)
8061 - map->clear_priv(map, map->priv);
8062 - map->priv = NULL;
8063 - map->clear_priv = NULL;
8064 -
8065 8662 if (map->inner_map) {
8066 8663 bpf_map__destroy(map->inner_map);
8067 8664 zfree(&map->inner_map);
···
8092 8699 if (IS_ERR_OR_NULL(obj))
8093 8700 return;
8094 8701
8095 - if (obj->clear_priv)
8096 - obj->clear_priv(obj, obj->priv);
8097 -
8098 8702 usdt_manager_free(obj->usdt_man);
8099 8703 obj->usdt_man = NULL;
8100 8704
···
8118 8728 }
8119 8729 zfree(&obj->programs);
8120 8730
8121 - list_del(&obj->list);
8122 8731 free(obj);
8123 - }
8124 -
8125 - struct bpf_object *
8126 - bpf_object__next(struct bpf_object *prev)
8127 - {
8128 - struct bpf_object *next;
8129 - bool strict = (libbpf_mode & LIBBPF_STRICT_NO_OBJECT_LIST);
8130 -
8131 - if (strict)
8132 - return NULL;
8133 -
8134 - if (!prev)
8135 - next = list_first_entry(&bpf_objects_list,
8136 - struct bpf_object,
8137 - list);
8138 - else
8139 - next = list_next_entry(prev, list);
8140 -
8141 - /* Empty list is noticed here so don't need checking on entry. */
8142 - if (&next->list == &bpf_objects_list)
8143 - return NULL;
8144 -
8145 - return next;
8146 8732 }
8147 8733
8148 8734 const char *bpf_object__name(const struct bpf_object *obj)
···
8149 8783 obj->kern_version = kern_version;
8150 8784
8151 8785 return 0;
8152 - }
8153 -
8154 - int bpf_object__set_priv(struct bpf_object *obj, void *priv,
8155 - bpf_object_clear_priv_t clear_priv)
8156 - {
8157 - if (obj->priv && obj->clear_priv)
8158 - obj->clear_priv(obj, obj->priv);
8159 -
8160 - obj->priv = priv;
8161 - obj->clear_priv = clear_priv;
8162 - return 0;
8163 - }
8164 -
8165 - void *bpf_object__priv(const struct bpf_object *obj)
8166 - {
8167 - return obj ? obj->priv : libbpf_err_ptr(-EINVAL);
8168 8786 }
8169 8787
8170 8788 int bpf_object__gen_loader(struct bpf_object *obj, struct gen_loader_opts *opts)
···
8194 8844 }
8195 8845
8196 8846 struct bpf_program *
8197 - bpf_program__next(struct bpf_program *prev, const struct bpf_object *obj)
8198 - {
8199 - return bpf_object__next_program(obj, prev);
8200 - }
8201 -
8202 - struct bpf_program *
8203 8847 bpf_object__next_program(const struct bpf_object *obj, struct bpf_program *prev)
8204 8848 {
8205 8849 struct bpf_program *prog = prev;
···
8206 8862 }
8207 8863
8208 8864 struct bpf_program *
8209 - bpf_program__prev(struct bpf_program *next, const struct bpf_object *obj)
8210 - {
8211 - return bpf_object__prev_program(obj, next);
8212 - }
8213 -
8214 - struct bpf_program *
8215 8865 bpf_object__prev_program(const struct bpf_object *obj, struct bpf_program *next)
8216 8866 {
8217 8867 struct bpf_program *prog = next;
···
8215 8877 } while (prog && prog_is_subprog(obj, prog));
8216 8878
8217 8879 return prog;
8218 - }
8219 -
8220 - int bpf_program__set_priv(struct bpf_program *prog, void *priv,
8221 - bpf_program_clear_priv_t clear_priv)
8222 - {
8223 - if (prog->priv && prog->clear_priv)
8224 - prog->clear_priv(prog, prog->priv);
8225 -
8226 - prog->priv = priv;
8227 - prog->clear_priv = clear_priv;
8228 - return 0;
8229 - }
8230 -
8231 - void *bpf_program__priv(const struct bpf_program *prog)
8232 - {
8233 - return prog ? prog->priv : libbpf_err_ptr(-EINVAL);
8234 8880 }
8235 8881
8236 8882 void bpf_program__set_ifindex(struct bpf_program *prog, __u32 ifindex)
···
8232 8910 return prog->sec_name;
8233 8911 }
8234 8912
8235 - const char *bpf_program__title(const struct bpf_program *prog, bool needs_copy)
8236 - {
8237 - const char *title;
8238 -
8239 - title = prog->sec_name;
8240 - if (needs_copy) {
8241 - title = strdup(title);
8242 - if (!title) {
8243 - pr_warn("failed to strdup program title\n");
8244 - return libbpf_err_ptr(-ENOMEM);
8245 - }
8246 - }
8247 -
8248 - return title;
8249 - }
8250 -
8251 8913 bool bpf_program__autoload(const struct bpf_program *prog)
8252 8914 {
8253 8915 return prog->autoload;
···
8244 8938
8245 8939 prog->autoload = autoload;
8246 8940 return 0;
8247 - }
8248 -
8249 - static int bpf_program_nth_fd(const struct bpf_program *prog, int n);
8250 -
8251 - int bpf_program__fd(const struct bpf_program *prog)
8252 - {
8253 - return bpf_program_nth_fd(prog, 0);
8254 - }
8255 -
8256 - size_t bpf_program__size(const struct bpf_program *prog)
8257 - {
8258 - return prog->insns_cnt * BPF_INSN_SZ;
8259 8941 }
8260 8942
8261 8943 const struct bpf_insn *bpf_program__insns(const struct bpf_program *prog)
···
8276 8982 return 0;
8277 8983 }
8278 8984
8279 - int bpf_program__set_prep(struct bpf_program *prog, int nr_instances,
8280 - bpf_program_prep_t prep)
8985 + int bpf_program__fd(const struct bpf_program *prog)
8281 8986 {
8282 - int *instances_fds;
8283 -
8284 - if (nr_instances <= 0 || !prep)
8285 - return libbpf_err(-EINVAL);
8286 -
8287 - if (prog->instances.nr > 0 || prog->instances.fds) {
8288 - pr_warn("Can't set pre-processor after loading\n");
8289 - return libbpf_err(-EINVAL);
8290 - }
8291 -
8292 - instances_fds = malloc(sizeof(int) * nr_instances);
8293 - if (!instances_fds) {
8294 - pr_warn("alloc memory failed for fds\n");
8295 - return libbpf_err(-ENOMEM);
8296 - }
8297 -
8298 - /* fill all fd with -1 */
8299 - memset(instances_fds, -1, sizeof(int) * nr_instances);
8300 -
8301 - prog->instances.nr = nr_instances;
8302 - prog->instances.fds = instances_fds;
8303 - prog->preprocessor = prep;
8304 - return 0;
8305 - }
8306 -
8307 - __attribute__((alias("bpf_program_nth_fd")))
8308 - int bpf_program__nth_fd(const struct bpf_program *prog, int n);
8309 -
8310 - static int bpf_program_nth_fd(const struct bpf_program *prog, int n)
8311 - {
8312 - int fd;
8313 -
8314 8987 if (!prog)
8315 8988 return libbpf_err(-EINVAL);
8316 8989
8317 - if (n >= prog->instances.nr || n < 0) {
8318 - pr_warn("Can't get the %dth fd from program %s: only %d instances\n",
8319 - n, prog->name, prog->instances.nr);
8320 - return libbpf_err(-EINVAL);
8321 - }
8322 -
8323 - fd = prog->instances.fds[n];
8324 - if (fd < 0) {
8325 - pr_warn("%dth instance of program '%s' is invalid\n",
8326 - n, prog->name);
8990 + if (prog->fd < 0)
8327 8991 return libbpf_err(-ENOENT);
8328 - }
8329 8992
8330 - return fd;
8993 + return prog->fd;
8331 8994 }
8332 8995
8333 8996 __alias(bpf_program__type)
···
8303 9052 prog->type = type;
8304 9053 return 0;
8305 9054 }
8306 -
8307 - static bool bpf_program__is_type(const struct bpf_program *prog,
8308 - enum bpf_prog_type type)
8309 - {
8310 - return prog ? (prog->type == type) : false;
8311 - }
8312 -
8313 - #define BPF_PROG_TYPE_FNS(NAME, TYPE) \
8314 - int bpf_program__set_##NAME(struct bpf_program *prog) \
8315 - { \
8316 - if (!prog) \
8317 - return libbpf_err(-EINVAL); \
8318 - return bpf_program__set_type(prog, TYPE); \
8319 - } \
8320 - \
8321 - bool bpf_program__is_##NAME(const struct bpf_program *prog) \
8322 - { \
8323 - return bpf_program__is_type(prog, TYPE); \
8324 - } \
8325 -
8326 - BPF_PROG_TYPE_FNS(socket_filter, BPF_PROG_TYPE_SOCKET_FILTER);
8327 - BPF_PROG_TYPE_FNS(lsm, BPF_PROG_TYPE_LSM);
8328 - BPF_PROG_TYPE_FNS(kprobe, BPF_PROG_TYPE_KPROBE);
8329 - BPF_PROG_TYPE_FNS(sched_cls, BPF_PROG_TYPE_SCHED_CLS);
8330 - BPF_PROG_TYPE_FNS(sched_act, BPF_PROG_TYPE_SCHED_ACT);
8331 - BPF_PROG_TYPE_FNS(tracepoint, BPF_PROG_TYPE_TRACEPOINT);
8332 - BPF_PROG_TYPE_FNS(raw_tracepoint, BPF_PROG_TYPE_RAW_TRACEPOINT);
8333 - BPF_PROG_TYPE_FNS(xdp, BPF_PROG_TYPE_XDP);
8334 - BPF_PROG_TYPE_FNS(perf_event, BPF_PROG_TYPE_PERF_EVENT);
8335 - BPF_PROG_TYPE_FNS(tracing, BPF_PROG_TYPE_TRACING);
8336 - BPF_PROG_TYPE_FNS(struct_ops, BPF_PROG_TYPE_STRUCT_OPS);
8337 - BPF_PROG_TYPE_FNS(extension, BPF_PROG_TYPE_EXT);
8338 - BPF_PROG_TYPE_FNS(sk_lookup, BPF_PROG_TYPE_SK_LOOKUP);
8339 9055
8340 9056 __alias(bpf_program__expected_attach_type)
8341 9057 enum bpf_attach_type bpf_program__get_expected_attach_type(const struct bpf_program *prog);
···
8390 9172 static int attach_iter(const struct bpf_program *prog, long cookie, struct bpf_link **link);
8391 9173
8392 9174 static const struct bpf_sec_def section_defs[] = {
8393 - SEC_DEF("socket", SOCKET_FILTER, 0, SEC_NONE | SEC_SLOPPY_PFX),
8394 - SEC_DEF("sk_reuseport/migrate", SK_REUSEPORT, BPF_SK_REUSEPORT_SELECT_OR_MIGRATE, SEC_ATTACHABLE | SEC_SLOPPY_PFX),
8395 - SEC_DEF("sk_reuseport", SK_REUSEPORT, BPF_SK_REUSEPORT_SELECT, SEC_ATTACHABLE | SEC_SLOPPY_PFX),
9175 + SEC_DEF("socket", SOCKET_FILTER, 0, SEC_NONE),
9176 + SEC_DEF("sk_reuseport/migrate", SK_REUSEPORT, BPF_SK_REUSEPORT_SELECT_OR_MIGRATE, SEC_ATTACHABLE),
9177 + SEC_DEF("sk_reuseport", SK_REUSEPORT, BPF_SK_REUSEPORT_SELECT, SEC_ATTACHABLE),
8396 9178 SEC_DEF("kprobe+", KPROBE, 0, SEC_NONE, attach_kprobe),
8397 9179 SEC_DEF("uprobe+", KPROBE, 0, SEC_NONE, attach_uprobe),
8398 9180 SEC_DEF("uprobe.s+", KPROBE, 0, SEC_SLEEPABLE, attach_uprobe),
···
8403 9185 SEC_DEF("kretprobe.multi+", KPROBE, BPF_TRACE_KPROBE_MULTI, SEC_NONE, attach_kprobe_multi),
8404 9186 SEC_DEF("usdt+", KPROBE, 0, SEC_NONE, attach_usdt),
8405 9187 SEC_DEF("tc", SCHED_CLS, 0, SEC_NONE),
8406 - SEC_DEF("classifier", SCHED_CLS, 0, SEC_NONE | SEC_SLOPPY_PFX | SEC_DEPRECATED),
8407 - SEC_DEF("action", SCHED_ACT, 0, SEC_NONE | SEC_SLOPPY_PFX),
9188 + SEC_DEF("classifier", SCHED_CLS, 0, SEC_NONE),
9189 + SEC_DEF("action", SCHED_ACT, 0, SEC_NONE),
8408 9190 SEC_DEF("tracepoint+", TRACEPOINT, 0, SEC_NONE, attach_tp),
8409 9191 SEC_DEF("tp+", TRACEPOINT, 0, SEC_NONE, attach_tp),
8410 9192 SEC_DEF("raw_tracepoint+", RAW_TRACEPOINT, 0, SEC_NONE, attach_raw_tp),
···
8421 9203 SEC_DEF("freplace+", EXT, 0, SEC_ATTACH_BTF, attach_trace),
8422 9204 SEC_DEF("lsm+", LSM, BPF_LSM_MAC, SEC_ATTACH_BTF, attach_lsm),
8423 9205 SEC_DEF("lsm.s+", LSM, BPF_LSM_MAC, SEC_ATTACH_BTF | SEC_SLEEPABLE, attach_lsm),
9206 + SEC_DEF("lsm_cgroup+", LSM, BPF_LSM_CGROUP, SEC_ATTACH_BTF),
8424 9207 SEC_DEF("iter+", TRACING, BPF_TRACE_ITER, SEC_ATTACH_BTF, attach_iter),
8425 9208 SEC_DEF("iter.s+", TRACING, BPF_TRACE_ITER, SEC_ATTACH_BTF | SEC_SLEEPABLE, attach_iter),
8426 9209 SEC_DEF("syscall", SYSCALL, 0, SEC_SLEEPABLE),
8427 9210 SEC_DEF("xdp.frags/devmap", XDP, BPF_XDP_DEVMAP, SEC_XDP_FRAGS),
8428 9211 SEC_DEF("xdp/devmap", XDP, BPF_XDP_DEVMAP, SEC_ATTACHABLE),
8429 - SEC_DEF("xdp_devmap/", XDP, BPF_XDP_DEVMAP, SEC_ATTACHABLE | SEC_DEPRECATED),
8430 9212 SEC_DEF("xdp.frags/cpumap", XDP, BPF_XDP_CPUMAP, SEC_XDP_FRAGS),
8431 9213 SEC_DEF("xdp/cpumap", XDP, BPF_XDP_CPUMAP, SEC_ATTACHABLE),
8432 - SEC_DEF("xdp_cpumap/", XDP, BPF_XDP_CPUMAP, SEC_ATTACHABLE | SEC_DEPRECATED),
8433 9214 SEC_DEF("xdp.frags", XDP, BPF_XDP, SEC_XDP_FRAGS),
8434 - SEC_DEF("xdp", XDP, BPF_XDP, SEC_ATTACHABLE_OPT | SEC_SLOPPY_PFX),
8435 - SEC_DEF("perf_event", PERF_EVENT, 0, SEC_NONE | SEC_SLOPPY_PFX),
8436 - SEC_DEF("lwt_in", LWT_IN, 0, SEC_NONE | SEC_SLOPPY_PFX),
8437 - SEC_DEF("lwt_out", LWT_OUT, 0, SEC_NONE | SEC_SLOPPY_PFX),
8438 - SEC_DEF("lwt_xmit", LWT_XMIT, 0, SEC_NONE | SEC_SLOPPY_PFX),
8439 - SEC_DEF("lwt_seg6local", LWT_SEG6LOCAL, 0, SEC_NONE | SEC_SLOPPY_PFX),
8440 - SEC_DEF("cgroup_skb/ingress", CGROUP_SKB, BPF_CGROUP_INET_INGRESS, SEC_ATTACHABLE_OPT | SEC_SLOPPY_PFX),
8441 - SEC_DEF("cgroup_skb/egress", CGROUP_SKB, BPF_CGROUP_INET_EGRESS, SEC_ATTACHABLE_OPT | SEC_SLOPPY_PFX),
8442 - SEC_DEF("cgroup/skb", CGROUP_SKB, 0, SEC_NONE | SEC_SLOPPY_PFX),
8443 - SEC_DEF("cgroup/sock_create", CGROUP_SOCK, BPF_CGROUP_INET_SOCK_CREATE, SEC_ATTACHABLE | SEC_SLOPPY_PFX),
8444 - SEC_DEF("cgroup/sock_release", CGROUP_SOCK, BPF_CGROUP_INET_SOCK_RELEASE, SEC_ATTACHABLE | SEC_SLOPPY_PFX),
8445 - SEC_DEF("cgroup/sock", CGROUP_SOCK, BPF_CGROUP_INET_SOCK_CREATE, SEC_ATTACHABLE_OPT | SEC_SLOPPY_PFX),
8446 - SEC_DEF("cgroup/post_bind4", CGROUP_SOCK, BPF_CGROUP_INET4_POST_BIND, SEC_ATTACHABLE | SEC_SLOPPY_PFX),
8447 - SEC_DEF("cgroup/post_bind6", CGROUP_SOCK, BPF_CGROUP_INET6_POST_BIND, SEC_ATTACHABLE | SEC_SLOPPY_PFX),
8448 - SEC_DEF("cgroup/dev", CGROUP_DEVICE, BPF_CGROUP_DEVICE, SEC_ATTACHABLE_OPT | SEC_SLOPPY_PFX),
8449 - SEC_DEF("sockops", SOCK_OPS, BPF_CGROUP_SOCK_OPS, SEC_ATTACHABLE_OPT | SEC_SLOPPY_PFX),
8450 - SEC_DEF("sk_skb/stream_parser", SK_SKB, BPF_SK_SKB_STREAM_PARSER, SEC_ATTACHABLE_OPT | SEC_SLOPPY_PFX),
8451 - SEC_DEF("sk_skb/stream_verdict",SK_SKB, BPF_SK_SKB_STREAM_VERDICT, SEC_ATTACHABLE_OPT | SEC_SLOPPY_PFX),
8452 - SEC_DEF("sk_skb", SK_SKB, 0, SEC_NONE | SEC_SLOPPY_PFX),
8453 - SEC_DEF("sk_msg", SK_MSG, BPF_SK_MSG_VERDICT, SEC_ATTACHABLE_OPT | SEC_SLOPPY_PFX),
8454 - SEC_DEF("lirc_mode2", LIRC_MODE2, BPF_LIRC_MODE2, SEC_ATTACHABLE_OPT | SEC_SLOPPY_PFX),
8455 - SEC_DEF("flow_dissector", FLOW_DISSECTOR, BPF_FLOW_DISSECTOR, SEC_ATTACHABLE_OPT | SEC_SLOPPY_PFX),
8456 - SEC_DEF("cgroup/bind4", CGROUP_SOCK_ADDR, BPF_CGROUP_INET4_BIND, SEC_ATTACHABLE | SEC_SLOPPY_PFX),
8457 - SEC_DEF("cgroup/bind6", CGROUP_SOCK_ADDR, BPF_CGROUP_INET6_BIND, SEC_ATTACHABLE | SEC_SLOPPY_PFX),
8458 - SEC_DEF("cgroup/connect4", CGROUP_SOCK_ADDR, BPF_CGROUP_INET4_CONNECT, SEC_ATTACHABLE | SEC_SLOPPY_PFX),
8459 - SEC_DEF("cgroup/connect6", CGROUP_SOCK_ADDR, BPF_CGROUP_INET6_CONNECT, SEC_ATTACHABLE | SEC_SLOPPY_PFX),
8460 - SEC_DEF("cgroup/sendmsg4", CGROUP_SOCK_ADDR, BPF_CGROUP_UDP4_SENDMSG, SEC_ATTACHABLE | SEC_SLOPPY_PFX),
8461 - SEC_DEF("cgroup/sendmsg6", CGROUP_SOCK_ADDR, BPF_CGROUP_UDP6_SENDMSG, SEC_ATTACHABLE | SEC_SLOPPY_PFX),
8462 - SEC_DEF("cgroup/recvmsg4", CGROUP_SOCK_ADDR, BPF_CGROUP_UDP4_RECVMSG, SEC_ATTACHABLE | SEC_SLOPPY_PFX),
8463 - SEC_DEF("cgroup/recvmsg6", CGROUP_SOCK_ADDR, BPF_CGROUP_UDP6_RECVMSG, SEC_ATTACHABLE | SEC_SLOPPY_PFX),
8464 - SEC_DEF("cgroup/getpeername4", CGROUP_SOCK_ADDR, BPF_CGROUP_INET4_GETPEERNAME, SEC_ATTACHABLE | SEC_SLOPPY_PFX),
8465 - SEC_DEF("cgroup/getpeername6", CGROUP_SOCK_ADDR, BPF_CGROUP_INET6_GETPEERNAME, SEC_ATTACHABLE | SEC_SLOPPY_PFX),
8466 - SEC_DEF("cgroup/getsockname4", CGROUP_SOCK_ADDR, BPF_CGROUP_INET4_GETSOCKNAME, SEC_ATTACHABLE | SEC_SLOPPY_PFX),
8467 - SEC_DEF("cgroup/getsockname6", CGROUP_SOCK_ADDR, BPF_CGROUP_INET6_GETSOCKNAME, SEC_ATTACHABLE | SEC_SLOPPY_PFX),
8468 - SEC_DEF("cgroup/sysctl", CGROUP_SYSCTL, BPF_CGROUP_SYSCTL, SEC_ATTACHABLE | SEC_SLOPPY_PFX),
8469 - SEC_DEF("cgroup/getsockopt", CGROUP_SOCKOPT, BPF_CGROUP_GETSOCKOPT, SEC_ATTACHABLE | SEC_SLOPPY_PFX),
8470 - SEC_DEF("cgroup/setsockopt", CGROUP_SOCKOPT, BPF_CGROUP_SETSOCKOPT, SEC_ATTACHABLE | SEC_SLOPPY_PFX),
9215 + SEC_DEF("xdp", XDP, BPF_XDP, SEC_ATTACHABLE_OPT),
9216 + SEC_DEF("perf_event", PERF_EVENT, 0, SEC_NONE),
9217 + SEC_DEF("lwt_in", LWT_IN, 0,
SEC_NONE), 9218 + SEC_DEF("lwt_out", LWT_OUT, 0, SEC_NONE), 9219 + SEC_DEF("lwt_xmit", LWT_XMIT, 0, SEC_NONE), 9220 + SEC_DEF("lwt_seg6local", LWT_SEG6LOCAL, 0, SEC_NONE), 9221 + SEC_DEF("sockops", SOCK_OPS, BPF_CGROUP_SOCK_OPS, SEC_ATTACHABLE_OPT), 9222 + SEC_DEF("sk_skb/stream_parser", SK_SKB, BPF_SK_SKB_STREAM_PARSER, SEC_ATTACHABLE_OPT), 9223 + SEC_DEF("sk_skb/stream_verdict",SK_SKB, BPF_SK_SKB_STREAM_VERDICT, SEC_ATTACHABLE_OPT), 9224 + SEC_DEF("sk_skb", SK_SKB, 0, SEC_NONE), 9225 + SEC_DEF("sk_msg", SK_MSG, BPF_SK_MSG_VERDICT, SEC_ATTACHABLE_OPT), 9226 + SEC_DEF("lirc_mode2", LIRC_MODE2, BPF_LIRC_MODE2, SEC_ATTACHABLE_OPT), 9227 + SEC_DEF("flow_dissector", FLOW_DISSECTOR, BPF_FLOW_DISSECTOR, SEC_ATTACHABLE_OPT), 9228 + SEC_DEF("cgroup_skb/ingress", CGROUP_SKB, BPF_CGROUP_INET_INGRESS, SEC_ATTACHABLE_OPT), 9229 + SEC_DEF("cgroup_skb/egress", CGROUP_SKB, BPF_CGROUP_INET_EGRESS, SEC_ATTACHABLE_OPT), 9230 + SEC_DEF("cgroup/skb", CGROUP_SKB, 0, SEC_NONE), 9231 + SEC_DEF("cgroup/sock_create", CGROUP_SOCK, BPF_CGROUP_INET_SOCK_CREATE, SEC_ATTACHABLE), 9232 + SEC_DEF("cgroup/sock_release", CGROUP_SOCK, BPF_CGROUP_INET_SOCK_RELEASE, SEC_ATTACHABLE), 9233 + SEC_DEF("cgroup/sock", CGROUP_SOCK, BPF_CGROUP_INET_SOCK_CREATE, SEC_ATTACHABLE_OPT), 9234 + SEC_DEF("cgroup/post_bind4", CGROUP_SOCK, BPF_CGROUP_INET4_POST_BIND, SEC_ATTACHABLE), 9235 + SEC_DEF("cgroup/post_bind6", CGROUP_SOCK, BPF_CGROUP_INET6_POST_BIND, SEC_ATTACHABLE), 9236 + SEC_DEF("cgroup/bind4", CGROUP_SOCK_ADDR, BPF_CGROUP_INET4_BIND, SEC_ATTACHABLE), 9237 + SEC_DEF("cgroup/bind6", CGROUP_SOCK_ADDR, BPF_CGROUP_INET6_BIND, SEC_ATTACHABLE), 9238 + SEC_DEF("cgroup/connect4", CGROUP_SOCK_ADDR, BPF_CGROUP_INET4_CONNECT, SEC_ATTACHABLE), 9239 + SEC_DEF("cgroup/connect6", CGROUP_SOCK_ADDR, BPF_CGROUP_INET6_CONNECT, SEC_ATTACHABLE), 9240 + SEC_DEF("cgroup/sendmsg4", CGROUP_SOCK_ADDR, BPF_CGROUP_UDP4_SENDMSG, SEC_ATTACHABLE), 9241 + SEC_DEF("cgroup/sendmsg6", CGROUP_SOCK_ADDR, BPF_CGROUP_UDP6_SENDMSG, 
SEC_ATTACHABLE), 9242 + SEC_DEF("cgroup/recvmsg4", CGROUP_SOCK_ADDR, BPF_CGROUP_UDP4_RECVMSG, SEC_ATTACHABLE), 9243 + SEC_DEF("cgroup/recvmsg6", CGROUP_SOCK_ADDR, BPF_CGROUP_UDP6_RECVMSG, SEC_ATTACHABLE), 9244 + SEC_DEF("cgroup/getpeername4", CGROUP_SOCK_ADDR, BPF_CGROUP_INET4_GETPEERNAME, SEC_ATTACHABLE), 9245 + SEC_DEF("cgroup/getpeername6", CGROUP_SOCK_ADDR, BPF_CGROUP_INET6_GETPEERNAME, SEC_ATTACHABLE), 9246 + SEC_DEF("cgroup/getsockname4", CGROUP_SOCK_ADDR, BPF_CGROUP_INET4_GETSOCKNAME, SEC_ATTACHABLE), 9247 + SEC_DEF("cgroup/getsockname6", CGROUP_SOCK_ADDR, BPF_CGROUP_INET6_GETSOCKNAME, SEC_ATTACHABLE), 9248 + SEC_DEF("cgroup/sysctl", CGROUP_SYSCTL, BPF_CGROUP_SYSCTL, SEC_ATTACHABLE), 9249 + SEC_DEF("cgroup/getsockopt", CGROUP_SOCKOPT, BPF_CGROUP_GETSOCKOPT, SEC_ATTACHABLE), 9250 + SEC_DEF("cgroup/setsockopt", CGROUP_SOCKOPT, BPF_CGROUP_SETSOCKOPT, SEC_ATTACHABLE), 9251 + SEC_DEF("cgroup/dev", CGROUP_DEVICE, BPF_CGROUP_DEVICE, SEC_ATTACHABLE_OPT), 8471 9252 SEC_DEF("struct_ops+", STRUCT_OPS, 0, SEC_NONE), 8472 - SEC_DEF("sk_lookup", SK_LOOKUP, BPF_SK_LOOKUP, SEC_ATTACHABLE | SEC_SLOPPY_PFX), 9253 + SEC_DEF("sk_lookup", SK_LOOKUP, BPF_SK_LOOKUP, SEC_ATTACHABLE), 8473 9254 }; 8474 9255 8475 9256 static size_t custom_sec_def_cnt; ··· 8563 9346 return 0; 8564 9347 } 8565 9348 8566 - static bool sec_def_matches(const struct bpf_sec_def *sec_def, const char *sec_name, 8567 - bool allow_sloppy) 9349 + static bool sec_def_matches(const struct bpf_sec_def *sec_def, const char *sec_name) 8568 9350 { 8569 9351 size_t len = strlen(sec_def->sec); 8570 9352 ··· 8588 9372 return false; 8589 9373 } 8590 9374 8591 - /* SEC_SLOPPY_PFX definitions are allowed to be just prefix 8592 - * matches, unless strict section name mode 8593 - * (LIBBPF_STRICT_SEC_NAME) is enabled, in which case the 8594 - * match has to be exact. 
8595 - 	 */
8596 - 	if (allow_sloppy && str_has_pfx(sec_name, sec_def->sec))
8597 - 		return true;
8598 -
8599 - 	/* Definitions not marked SEC_SLOPPY_PFX (e.g.,
8600 - 	 * SEC("syscall")) are exact matches in both modes.
8601 - 	 */
8602 9375 	return strcmp(sec_name, sec_def->sec) == 0;
8603 9376 }
8604 9377
···
8595 9390 {
8596 9391 	const struct bpf_sec_def *sec_def;
8597 9392 	int i, n;
8598 - 	bool strict = libbpf_mode & LIBBPF_STRICT_SEC_NAME, allow_sloppy;
8599 9393
8600 9394 	n = custom_sec_def_cnt;
8601 9395 	for (i = 0; i < n; i++) {
8602 9396 		sec_def = &custom_sec_defs[i];
8603 - 		if (sec_def_matches(sec_def, sec_name, false))
9397 + 		if (sec_def_matches(sec_def, sec_name))
8604 9398 			return sec_def;
8605 9399 	}
8606 9400
8607 9401 	n = ARRAY_SIZE(section_defs);
8608 9402 	for (i = 0; i < n; i++) {
8609 9403 		sec_def = &section_defs[i];
8610 - 		allow_sloppy = (sec_def->cookie & SEC_SLOPPY_PFX) && !strict;
8611 - 		if (sec_def_matches(sec_def, sec_name, allow_sloppy))
9404 + 		if (sec_def_matches(sec_def, sec_name))
8612 9405 			return sec_def;
8613 9406 	}
8614 9407
···
8859 9656 		*kind = BTF_KIND_TYPEDEF;
8860 9657 		break;
8861 9658 	case BPF_LSM_MAC:
9659 + 	case BPF_LSM_CGROUP:
8862 9660 		*prefix = BTF_LSM_PREFIX;
8863 9661 		*kind = BTF_KIND_FUNC;
8864 9662 		break;
···
9063 9859 	return map ? map->fd : libbpf_err(-EINVAL);
9064 9860 }
9065 9861
9066 - const struct bpf_map_def *bpf_map__def(const struct bpf_map *map)
9067 - {
9068 - 	return map ? &map->def : libbpf_err_ptr(-EINVAL);
9069 - }
9070 -
9071 9862 static bool map_uses_real_name(const struct bpf_map *map)
9072 9863 {
9073 9864 	/* Since libbpf started to support custom .data.* and .rodata.* maps,
···
9177 9978 	return map ? 
map->btf_value_type_id : 0; 9178 9979 } 9179 9980 9180 - int bpf_map__set_priv(struct bpf_map *map, void *priv, 9181 - bpf_map_clear_priv_t clear_priv) 9182 - { 9183 - if (!map) 9184 - return libbpf_err(-EINVAL); 9185 - 9186 - if (map->priv) { 9187 - if (map->clear_priv) 9188 - map->clear_priv(map, map->priv); 9189 - } 9190 - 9191 - map->priv = priv; 9192 - map->clear_priv = clear_priv; 9193 - return 0; 9194 - } 9195 - 9196 - void *bpf_map__priv(const struct bpf_map *map) 9197 - { 9198 - return map ? map->priv : libbpf_err_ptr(-EINVAL); 9199 - } 9200 - 9201 9981 int bpf_map__set_initial_value(struct bpf_map *map, 9202 9982 const void *data, size_t size) 9203 9983 { ··· 9194 10016 return NULL; 9195 10017 *psize = map->def.value_size; 9196 10018 return map->mmaped; 9197 - } 9198 - 9199 - bool bpf_map__is_offload_neutral(const struct bpf_map *map) 9200 - { 9201 - return map->def.type == BPF_MAP_TYPE_PERF_EVENT_ARRAY; 9202 10019 } 9203 10020 9204 10021 bool bpf_map__is_internal(const struct bpf_map *map) ··· 9257 10084 } 9258 10085 9259 10086 struct bpf_map * 9260 - bpf_map__next(const struct bpf_map *prev, const struct bpf_object *obj) 9261 - { 9262 - return bpf_object__next_map(obj, prev); 9263 - } 9264 - 9265 - struct bpf_map * 9266 10087 bpf_object__next_map(const struct bpf_object *obj, const struct bpf_map *prev) 9267 10088 { 9268 10089 if (prev == NULL) 9269 10090 return obj->maps; 9270 10091 9271 10092 return __bpf_map__iter(prev, obj, 1); 9272 - } 9273 - 9274 - struct bpf_map * 9275 - bpf_map__prev(const struct bpf_map *next, const struct bpf_object *obj) 9276 - { 9277 - return bpf_object__prev_map(obj, next); 9278 10093 } 9279 10094 9280 10095 struct bpf_map * ··· 9308 10147 bpf_object__find_map_fd_by_name(const struct bpf_object *obj, const char *name) 9309 10148 { 9310 10149 return bpf_map__fd(bpf_object__find_map_by_name(obj, name)); 9311 - } 9312 - 9313 - struct bpf_map * 9314 - bpf_object__find_map_by_offset(struct bpf_object *obj, size_t offset) 9315 - 
{ 9316 - return libbpf_err_ptr(-ENOTSUP); 9317 10150 } 9318 10151 9319 10152 static int validate_map_op(const struct bpf_map *map, size_t key_sz, ··· 9428 10273 * case. 9429 10274 */ 9430 10275 return -errno; 9431 - } 9432 - 9433 - __attribute__((alias("bpf_prog_load_xattr2"))) 9434 - int bpf_prog_load_xattr(const struct bpf_prog_load_attr *attr, 9435 - struct bpf_object **pobj, int *prog_fd); 9436 - 9437 - static int bpf_prog_load_xattr2(const struct bpf_prog_load_attr *attr, 9438 - struct bpf_object **pobj, int *prog_fd) 9439 - { 9440 - struct bpf_object_open_attr open_attr = {}; 9441 - struct bpf_program *prog, *first_prog = NULL; 9442 - struct bpf_object *obj; 9443 - struct bpf_map *map; 9444 - int err; 9445 - 9446 - if (!attr) 9447 - return libbpf_err(-EINVAL); 9448 - if (!attr->file) 9449 - return libbpf_err(-EINVAL); 9450 - 9451 - open_attr.file = attr->file; 9452 - open_attr.prog_type = attr->prog_type; 9453 - 9454 - obj = __bpf_object__open_xattr(&open_attr, 0); 9455 - err = libbpf_get_error(obj); 9456 - if (err) 9457 - return libbpf_err(-ENOENT); 9458 - 9459 - bpf_object__for_each_program(prog, obj) { 9460 - enum bpf_attach_type attach_type = attr->expected_attach_type; 9461 - /* 9462 - * to preserve backwards compatibility, bpf_prog_load treats 9463 - * attr->prog_type, if specified, as an override to whatever 9464 - * bpf_object__open guessed 9465 - */ 9466 - if (attr->prog_type != BPF_PROG_TYPE_UNSPEC) { 9467 - prog->type = attr->prog_type; 9468 - prog->expected_attach_type = attach_type; 9469 - } 9470 - if (bpf_program__type(prog) == BPF_PROG_TYPE_UNSPEC) { 9471 - /* 9472 - * we haven't guessed from section name and user 9473 - * didn't provide a fallback type, too bad... 
9474 - */ 9475 - bpf_object__close(obj); 9476 - return libbpf_err(-EINVAL); 9477 - } 9478 - 9479 - prog->prog_ifindex = attr->ifindex; 9480 - prog->log_level = attr->log_level; 9481 - prog->prog_flags |= attr->prog_flags; 9482 - if (!first_prog) 9483 - first_prog = prog; 9484 - } 9485 - 9486 - bpf_object__for_each_map(map, obj) { 9487 - if (map->def.type != BPF_MAP_TYPE_PERF_EVENT_ARRAY) 9488 - map->map_ifindex = attr->ifindex; 9489 - } 9490 - 9491 - if (!first_prog) { 9492 - pr_warn("object file doesn't contain bpf program\n"); 9493 - bpf_object__close(obj); 9494 - return libbpf_err(-ENOENT); 9495 - } 9496 - 9497 - err = bpf_object__load(obj); 9498 - if (err) { 9499 - bpf_object__close(obj); 9500 - return libbpf_err(err); 9501 - } 9502 - 9503 - *pobj = obj; 9504 - *prog_fd = bpf_program__fd(first_prog); 9505 - return 0; 9506 - } 9507 - 9508 - COMPAT_VERSION(bpf_prog_load_deprecated, bpf_prog_load, LIBBPF_0.0.1) 9509 - int bpf_prog_load_deprecated(const char *file, enum bpf_prog_type type, 9510 - struct bpf_object **pobj, int *prog_fd) 9511 - { 9512 - struct bpf_prog_load_attr attr; 9513 - 9514 - memset(&attr, 0, sizeof(struct bpf_prog_load_attr)); 9515 - attr.file = file; 9516 - attr.prog_type = type; 9517 - attr.expected_attach_type = 0; 9518 - 9519 - return bpf_prog_load_xattr2(&attr, pobj, prog_fd); 9520 10276 } 9521 10277 9522 10278 /* Replace link's underlying BPF program with the new one */ ··· 9877 10811 } 9878 10812 type = determine_kprobe_perf_type_legacy(probe_name, retprobe); 9879 10813 if (type < 0) { 10814 + err = type; 9880 10815 pr_warn("failed to determine legacy kprobe event id for '%s+0x%zx': %s\n", 9881 10816 kfunc_name, offset, 9882 - libbpf_strerror_r(type, errmsg, sizeof(errmsg))); 9883 - return type; 10817 + libbpf_strerror_r(err, errmsg, sizeof(errmsg))); 10818 + goto err_clean_legacy; 9884 10819 } 9885 10820 attr.size = sizeof(attr); 9886 10821 attr.config = type; ··· 9895 10828 err = -errno; 9896 10829 pr_warn("legacy kprobe 
perf_event_open() failed: %s\n",
9897 10830 			libbpf_strerror_r(err, errmsg, sizeof(errmsg)));
9898 - 		return err;
10831 + 		goto err_clean_legacy;
9899 10832 	}
9900 10833 	return pfd;
10834 +
10835 + err_clean_legacy:
10836 + 	/* Clear the newly added legacy kprobe_event */
10837 + 	remove_kprobe_event_legacy(probe_name, retprobe);
10838 + 	return err;
9901 10839 }
9902 10840
9903 10841 struct bpf_link *
···
9959 10887 			prog->name, retprobe ? "kretprobe" : "kprobe",
9960 10888 			func_name, offset,
9961 10889 			libbpf_strerror_r(err, errmsg, sizeof(errmsg)));
9962 - 		goto err_out;
10890 + 		goto err_clean_legacy;
9963 10891 	}
9964 10892 	if (legacy) {
9965 10893 		struct bpf_link_perf *perf_link = container_of(link, struct bpf_link_perf, link);
···
9970 10898 	}
9971 10899
9972 10900 	return link;
10901 +
10902 + err_clean_legacy:
10903 + 	if (legacy)
10904 + 		remove_kprobe_event_legacy(legacy_probe, retprobe);
9973 10905 err_out:
9974 10906 	free(legacy_probe);
9975 10907 	return libbpf_err_ptr(err);
···
10248 11172 	}
10249 11173 	type = determine_uprobe_perf_type_legacy(probe_name, retprobe);
10250 11174 	if (type < 0) {
11175 + 		err = type;
10251 11176 		pr_warn("failed to determine legacy uprobe event id for %s:0x%zx: %d\n",
10252 11177 			binary_path, offset, err);
10253 - 		return type;
11178 + 		goto err_clean_legacy;
10254 11179 	}
10255 11180
10256 11181 	memset(&attr, 0, sizeof(attr));
···
10266 11189 	if (pfd < 0) {
10267 11190 		err = -errno;
10268 11191 		pr_warn("legacy uprobe perf_event_open() failed: %d\n", err);
10269 - 		return err;
11192 + 		goto err_clean_legacy;
10270 11193 	}
10271 11194 	return pfd;
11195 +
11196 + err_clean_legacy:
11197 + 	/* Clear the newly added legacy uprobe_event */
11198 + 	remove_uprobe_event_legacy(probe_name, retprobe);
11199 + 	return err;
10272 11200 }
10273 11201
10274 11202 /* Return next ELF section of sh_type after scn, or first of that type if scn is NULL. */
···
10607 11525 			prog->name, retprobe ? "uretprobe" : "uprobe",
10608 11526 			binary_path, func_offset,
10609 11527 			libbpf_strerror_r(err, errmsg, sizeof(errmsg)));
10610 - 		goto err_out;
11528 + 		goto err_clean_legacy;
10611 11529 	}
10612 11530 	if (legacy) {
10613 11531 		struct bpf_link_perf *perf_link = container_of(link, struct bpf_link_perf, link);
···
10617 11535 		perf_link->legacy_is_retprobe = retprobe;
10618 11536 	}
10619 11537 	return link;
11538 +
11539 + err_clean_legacy:
11540 + 	if (legacy)
11541 + 		remove_uprobe_event_legacy(legacy_probe, retprobe);
10620 11542 err_out:
10621 11543 	free(legacy_probe);
10622 11544 	return libbpf_err_ptr(err);
10623 -
10624 11545 }
10625 11546
10626 11547 /* Format of u[ret]probe section definition supporting auto-attach:
···
11235 12150 	return link;
11236 12151 }
11237 12152
12153 + typedef enum bpf_perf_event_ret (*bpf_perf_event_print_t)(struct perf_event_header *hdr,
12154 + 							  void *private_data);
12155 +
11238 12156 static enum bpf_perf_event_ret
11239 12157 perf_event_read_simple(void *mmap_mem, size_t mmap_size, size_t page_size,
11240 12158 		       void **copy_mem, size_t *copy_size,
···
11285 12197 	ring_buffer_write_tail(header, data_tail);
11286 12198 	return libbpf_err(ret);
11287 12199 }
11288 -
11289 - __attribute__((alias("perf_event_read_simple")))
11290 - enum bpf_perf_event_ret
11291 - bpf_perf_event_read_simple(void *mmap_mem, size_t mmap_size, size_t page_size,
11292 - 			   void **copy_mem, size_t *copy_size,
11293 - 			   bpf_perf_event_print_t fn, void *private_data);
11294 12200
11295 12201 struct perf_buffer;
11296 12202
···
11419 12337 static struct perf_buffer *__perf_buffer__new(int map_fd, size_t page_cnt,
11420 12338 					      struct perf_buffer_params *p);
11421 12339
11422 - DEFAULT_VERSION(perf_buffer__new_v0_6_0, perf_buffer__new, LIBBPF_0.6.0)
11423 - struct perf_buffer *perf_buffer__new_v0_6_0(int map_fd, size_t page_cnt,
11424 - 					    perf_buffer_sample_fn sample_cb,
11425 - 					    perf_buffer_lost_fn lost_cb,
11426 - 					    void *ctx,
11427 - 					    const struct perf_buffer_opts *opts)
12340 
+ struct perf_buffer *perf_buffer__new(int map_fd, size_t page_cnt, 12341 + perf_buffer_sample_fn sample_cb, 12342 + perf_buffer_lost_fn lost_cb, 12343 + void *ctx, 12344 + const struct perf_buffer_opts *opts) 11428 12345 { 11429 12346 struct perf_buffer_params p = {}; 11430 12347 struct perf_event_attr attr = {}; ··· 11445 12364 return libbpf_ptr(__perf_buffer__new(map_fd, page_cnt, &p)); 11446 12365 } 11447 12366 11448 - COMPAT_VERSION(perf_buffer__new_deprecated, perf_buffer__new, LIBBPF_0.0.4) 11449 - struct perf_buffer *perf_buffer__new_deprecated(int map_fd, size_t page_cnt, 11450 - const struct perf_buffer_opts *opts) 11451 - { 11452 - return perf_buffer__new_v0_6_0(map_fd, page_cnt, 11453 - opts ? opts->sample_cb : NULL, 11454 - opts ? opts->lost_cb : NULL, 11455 - opts ? opts->ctx : NULL, 11456 - NULL); 11457 - } 11458 - 11459 - DEFAULT_VERSION(perf_buffer__new_raw_v0_6_0, perf_buffer__new_raw, LIBBPF_0.6.0) 11460 - struct perf_buffer *perf_buffer__new_raw_v0_6_0(int map_fd, size_t page_cnt, 11461 - struct perf_event_attr *attr, 11462 - perf_buffer_event_fn event_cb, void *ctx, 11463 - const struct perf_buffer_raw_opts *opts) 12367 + struct perf_buffer *perf_buffer__new_raw(int map_fd, size_t page_cnt, 12368 + struct perf_event_attr *attr, 12369 + perf_buffer_event_fn event_cb, void *ctx, 12370 + const struct perf_buffer_raw_opts *opts) 11464 12371 { 11465 12372 struct perf_buffer_params p = {}; 11466 12373 ··· 11466 12397 p.map_keys = OPTS_GET(opts, map_keys, NULL); 11467 12398 11468 12399 return libbpf_ptr(__perf_buffer__new(map_fd, page_cnt, &p)); 11469 - } 11470 - 11471 - COMPAT_VERSION(perf_buffer__new_raw_deprecated, perf_buffer__new_raw, LIBBPF_0.0.4) 11472 - struct perf_buffer *perf_buffer__new_raw_deprecated(int map_fd, size_t page_cnt, 11473 - const struct perf_buffer_raw_opts *opts) 11474 - { 11475 - LIBBPF_OPTS(perf_buffer_raw_opts, inner_opts, 11476 - .cpu_cnt = opts->cpu_cnt, 11477 - .cpus = opts->cpus, 11478 - .map_keys = opts->map_keys, 
11479 - ); 11480 - 11481 - return perf_buffer__new_raw_v0_6_0(map_fd, page_cnt, opts->attr, 11482 - opts->event_cb, opts->ctx, &inner_opts); 11483 12400 } 11484 12401 11485 12402 static struct perf_buffer *__perf_buffer__new(int map_fd, size_t page_cnt, ··· 11767 12712 } 11768 12713 } 11769 12714 return 0; 11770 - } 11771 - 11772 - struct bpf_prog_info_array_desc { 11773 - int array_offset; /* e.g. offset of jited_prog_insns */ 11774 - int count_offset; /* e.g. offset of jited_prog_len */ 11775 - int size_offset; /* > 0: offset of rec size, 11776 - * < 0: fix size of -size_offset 11777 - */ 11778 - }; 11779 - 11780 - static struct bpf_prog_info_array_desc bpf_prog_info_array_desc[] = { 11781 - [BPF_PROG_INFO_JITED_INSNS] = { 11782 - offsetof(struct bpf_prog_info, jited_prog_insns), 11783 - offsetof(struct bpf_prog_info, jited_prog_len), 11784 - -1, 11785 - }, 11786 - [BPF_PROG_INFO_XLATED_INSNS] = { 11787 - offsetof(struct bpf_prog_info, xlated_prog_insns), 11788 - offsetof(struct bpf_prog_info, xlated_prog_len), 11789 - -1, 11790 - }, 11791 - [BPF_PROG_INFO_MAP_IDS] = { 11792 - offsetof(struct bpf_prog_info, map_ids), 11793 - offsetof(struct bpf_prog_info, nr_map_ids), 11794 - -(int)sizeof(__u32), 11795 - }, 11796 - [BPF_PROG_INFO_JITED_KSYMS] = { 11797 - offsetof(struct bpf_prog_info, jited_ksyms), 11798 - offsetof(struct bpf_prog_info, nr_jited_ksyms), 11799 - -(int)sizeof(__u64), 11800 - }, 11801 - [BPF_PROG_INFO_JITED_FUNC_LENS] = { 11802 - offsetof(struct bpf_prog_info, jited_func_lens), 11803 - offsetof(struct bpf_prog_info, nr_jited_func_lens), 11804 - -(int)sizeof(__u32), 11805 - }, 11806 - [BPF_PROG_INFO_FUNC_INFO] = { 11807 - offsetof(struct bpf_prog_info, func_info), 11808 - offsetof(struct bpf_prog_info, nr_func_info), 11809 - offsetof(struct bpf_prog_info, func_info_rec_size), 11810 - }, 11811 - [BPF_PROG_INFO_LINE_INFO] = { 11812 - offsetof(struct bpf_prog_info, line_info), 11813 - offsetof(struct bpf_prog_info, nr_line_info), 11814 - offsetof(struct 
bpf_prog_info, line_info_rec_size), 11815 - }, 11816 - [BPF_PROG_INFO_JITED_LINE_INFO] = { 11817 - offsetof(struct bpf_prog_info, jited_line_info), 11818 - offsetof(struct bpf_prog_info, nr_jited_line_info), 11819 - offsetof(struct bpf_prog_info, jited_line_info_rec_size), 11820 - }, 11821 - [BPF_PROG_INFO_PROG_TAGS] = { 11822 - offsetof(struct bpf_prog_info, prog_tags), 11823 - offsetof(struct bpf_prog_info, nr_prog_tags), 11824 - -(int)sizeof(__u8) * BPF_TAG_SIZE, 11825 - }, 11826 - 11827 - }; 11828 - 11829 - static __u32 bpf_prog_info_read_offset_u32(struct bpf_prog_info *info, 11830 - int offset) 11831 - { 11832 - __u32 *array = (__u32 *)info; 11833 - 11834 - if (offset >= 0) 11835 - return array[offset / sizeof(__u32)]; 11836 - return -(int)offset; 11837 - } 11838 - 11839 - static __u64 bpf_prog_info_read_offset_u64(struct bpf_prog_info *info, 11840 - int offset) 11841 - { 11842 - __u64 *array = (__u64 *)info; 11843 - 11844 - if (offset >= 0) 11845 - return array[offset / sizeof(__u64)]; 11846 - return -(int)offset; 11847 - } 11848 - 11849 - static void bpf_prog_info_set_offset_u32(struct bpf_prog_info *info, int offset, 11850 - __u32 val) 11851 - { 11852 - __u32 *array = (__u32 *)info; 11853 - 11854 - if (offset >= 0) 11855 - array[offset / sizeof(__u32)] = val; 11856 - } 11857 - 11858 - static void bpf_prog_info_set_offset_u64(struct bpf_prog_info *info, int offset, 11859 - __u64 val) 11860 - { 11861 - __u64 *array = (__u64 *)info; 11862 - 11863 - if (offset >= 0) 11864 - array[offset / sizeof(__u64)] = val; 11865 - } 11866 - 11867 - struct bpf_prog_info_linear * 11868 - bpf_program__get_prog_info_linear(int fd, __u64 arrays) 11869 - { 11870 - struct bpf_prog_info_linear *info_linear; 11871 - struct bpf_prog_info info = {}; 11872 - __u32 info_len = sizeof(info); 11873 - __u32 data_len = 0; 11874 - int i, err; 11875 - void *ptr; 11876 - 11877 - if (arrays >> BPF_PROG_INFO_LAST_ARRAY) 11878 - return libbpf_err_ptr(-EINVAL); 11879 - 11880 - /* step 1: get array 
dimensions */ 11881 - err = bpf_obj_get_info_by_fd(fd, &info, &info_len); 11882 - if (err) { 11883 - pr_debug("can't get prog info: %s", strerror(errno)); 11884 - return libbpf_err_ptr(-EFAULT); 11885 - } 11886 - 11887 - /* step 2: calculate total size of all arrays */ 11888 - for (i = BPF_PROG_INFO_FIRST_ARRAY; i < BPF_PROG_INFO_LAST_ARRAY; ++i) { 11889 - bool include_array = (arrays & (1UL << i)) > 0; 11890 - struct bpf_prog_info_array_desc *desc; 11891 - __u32 count, size; 11892 - 11893 - desc = bpf_prog_info_array_desc + i; 11894 - 11895 - /* kernel is too old to support this field */ 11896 - if (info_len < desc->array_offset + sizeof(__u32) || 11897 - info_len < desc->count_offset + sizeof(__u32) || 11898 - (desc->size_offset > 0 && info_len < desc->size_offset)) 11899 - include_array = false; 11900 - 11901 - if (!include_array) { 11902 - arrays &= ~(1UL << i); /* clear the bit */ 11903 - continue; 11904 - } 11905 - 11906 - count = bpf_prog_info_read_offset_u32(&info, desc->count_offset); 11907 - size = bpf_prog_info_read_offset_u32(&info, desc->size_offset); 11908 - 11909 - data_len += count * size; 11910 - } 11911 - 11912 - /* step 3: allocate continuous memory */ 11913 - data_len = roundup(data_len, sizeof(__u64)); 11914 - info_linear = malloc(sizeof(struct bpf_prog_info_linear) + data_len); 11915 - if (!info_linear) 11916 - return libbpf_err_ptr(-ENOMEM); 11917 - 11918 - /* step 4: fill data to info_linear->info */ 11919 - info_linear->arrays = arrays; 11920 - memset(&info_linear->info, 0, sizeof(info)); 11921 - ptr = info_linear->data; 11922 - 11923 - for (i = BPF_PROG_INFO_FIRST_ARRAY; i < BPF_PROG_INFO_LAST_ARRAY; ++i) { 11924 - struct bpf_prog_info_array_desc *desc; 11925 - __u32 count, size; 11926 - 11927 - if ((arrays & (1UL << i)) == 0) 11928 - continue; 11929 - 11930 - desc = bpf_prog_info_array_desc + i; 11931 - count = bpf_prog_info_read_offset_u32(&info, desc->count_offset); 11932 - size = bpf_prog_info_read_offset_u32(&info, desc->size_offset); 
11933 - bpf_prog_info_set_offset_u32(&info_linear->info, 11934 - desc->count_offset, count); 11935 - bpf_prog_info_set_offset_u32(&info_linear->info, 11936 - desc->size_offset, size); 11937 - bpf_prog_info_set_offset_u64(&info_linear->info, 11938 - desc->array_offset, 11939 - ptr_to_u64(ptr)); 11940 - ptr += count * size; 11941 - } 11942 - 11943 - /* step 5: call syscall again to get required arrays */ 11944 - err = bpf_obj_get_info_by_fd(fd, &info_linear->info, &info_len); 11945 - if (err) { 11946 - pr_debug("can't get prog info: %s", strerror(errno)); 11947 - free(info_linear); 11948 - return libbpf_err_ptr(-EFAULT); 11949 - } 11950 - 11951 - /* step 6: verify the data */ 11952 - for (i = BPF_PROG_INFO_FIRST_ARRAY; i < BPF_PROG_INFO_LAST_ARRAY; ++i) { 11953 - struct bpf_prog_info_array_desc *desc; 11954 - __u32 v1, v2; 11955 - 11956 - if ((arrays & (1UL << i)) == 0) 11957 - continue; 11958 - 11959 - desc = bpf_prog_info_array_desc + i; 11960 - v1 = bpf_prog_info_read_offset_u32(&info, desc->count_offset); 11961 - v2 = bpf_prog_info_read_offset_u32(&info_linear->info, 11962 - desc->count_offset); 11963 - if (v1 != v2) 11964 - pr_warn("%s: mismatch in element count\n", __func__); 11965 - 11966 - v1 = bpf_prog_info_read_offset_u32(&info, desc->size_offset); 11967 - v2 = bpf_prog_info_read_offset_u32(&info_linear->info, 11968 - desc->size_offset); 11969 - if (v1 != v2) 11970 - pr_warn("%s: mismatch in rec size\n", __func__); 11971 - } 11972 - 11973 - /* step 7: update info_len and data_len */ 11974 - info_linear->info_len = sizeof(struct bpf_prog_info); 11975 - info_linear->data_len = data_len; 11976 - 11977 - return info_linear; 11978 - } 11979 - 11980 - void bpf_program__bpil_addr_to_offs(struct bpf_prog_info_linear *info_linear) 11981 - { 11982 - int i; 11983 - 11984 - for (i = BPF_PROG_INFO_FIRST_ARRAY; i < BPF_PROG_INFO_LAST_ARRAY; ++i) { 11985 - struct bpf_prog_info_array_desc *desc; 11986 - __u64 addr, offs; 11987 - 11988 - if ((info_linear->arrays & (1UL << 
i)) == 0) 11989 - continue; 11990 - 11991 - desc = bpf_prog_info_array_desc + i; 11992 - addr = bpf_prog_info_read_offset_u64(&info_linear->info, 11993 - desc->array_offset); 11994 - offs = addr - ptr_to_u64(info_linear->data); 11995 - bpf_prog_info_set_offset_u64(&info_linear->info, 11996 - desc->array_offset, offs); 11997 - } 11998 - } 11999 - 12000 - void bpf_program__bpil_offs_to_addr(struct bpf_prog_info_linear *info_linear) 12001 - { 12002 - int i; 12003 - 12004 - for (i = BPF_PROG_INFO_FIRST_ARRAY; i < BPF_PROG_INFO_LAST_ARRAY; ++i) { 12005 - struct bpf_prog_info_array_desc *desc; 12006 - __u64 addr, offs; 12007 - 12008 - if ((info_linear->arrays & (1UL << i)) == 0) 12009 - continue; 12010 - 12011 - desc = bpf_prog_info_array_desc + i; 12012 - offs = bpf_prog_info_read_offset_u64(&info_linear->info, 12013 - desc->array_offset); 12014 - addr = offs + ptr_to_u64(info_linear->data); 12015 - bpf_prog_info_set_offset_u64(&info_linear->info, 12016 - desc->array_offset, addr); 12017 - } 12018 12715 } 12019 12716 12020 12717 int bpf_program__set_attach_target(struct bpf_program *prog,
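Note on the section-name change in the libbpf.c hunks above: with SEC_SLOPPY_PFX gone, `sec_def_matches()` ends in a plain `strcmp()`, so a name such as `SEC("xdp_prog")` should no longer resolve to the `"xdp"` definition by prefix. The sketch below is a minimal, standalone illustration of that behavior, not libbpf's code. `match_section()` is a hypothetical helper; its handling of a trailing `+` or `/` in the pattern is an assumption inferred from table entries such as `"kprobe+"`, because the middle of `sec_def_matches()` is elided in this diff. Only the final exact-match step is taken directly from the hunk.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for libbpf's internal matcher (illustration only).
 * `pattern` plays the role of the name given to SEC_DEF().
 */
static bool match_section(const char *pattern, const char *sec_name)
{
	size_t len = strlen(pattern);

	if (len == 0)
		return false;

	/* Assumption: "type/" requires the "type/extras" form. */
	if (pattern[len - 1] == '/')
		return strncmp(sec_name, pattern, len) == 0;

	/* Assumption: "type+" accepts exactly "type" or "type/extras". */
	if (pattern[len - 1] == '+') {
		len--;
		if (strncmp(sec_name, pattern, len) != 0)
			return false;
		return sec_name[len] == '\0' || sec_name[len] == '/';
	}

	/* As in the diff: every other definition must match exactly. */
	return strcmp(sec_name, pattern) == 0;
}
```

Under these assumptions, `match_section("xdp", "xdp_prog")` is false while `match_section("kprobe+", "kprobe/do_sys_open")` is true, which is the practical effect of dropping the sloppy-prefix mode: programs that relied on loose section names need their `SEC()` annotations updated.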
+11 -458
tools/lib/bpf/libbpf.h
··· 101 101 /* Hide internal to user */ 102 102 struct bpf_object; 103 103 104 - struct bpf_object_open_attr { 105 - const char *file; 106 - enum bpf_prog_type prog_type; 107 - }; 108 - 109 104 struct bpf_object_open_opts { 110 105 /* size of this struct, for forward/backward compatibility */ 111 106 size_t sz; ··· 113 118 const char *object_name; 114 119 /* parse map definitions non-strictly, allowing extra attributes/data */ 115 120 bool relaxed_maps; 116 - /* DEPRECATED: handle CO-RE relocations non-strictly, allowing failures. 117 - * Value is ignored. Relocations always are processed non-strictly. 118 - * Non-relocatable instructions are replaced with invalid ones to 119 - * prevent accidental errors. 120 - * */ 121 - LIBBPF_DEPRECATED_SINCE(0, 6, "field has no effect") 122 - bool relaxed_core_relocs; 123 121 /* maps that set the 'pinning' attribute in their definition will have 124 122 * their pin_path attribute set to a file in this directory, and be 125 123 * auto-pinned to that path on load; defaults to "/sys/fs/bpf". 126 124 */ 127 125 const char *pin_root_path; 128 - 129 - LIBBPF_DEPRECATED_SINCE(0, 7, "use bpf_program__set_attach_target() on each individual bpf_program") 130 - __u32 attach_prog_fd; 126 + long :0; 131 127 /* Additional kernel config content that augments and overrides 132 128 * system Kconfig for CONFIG_xxx externs. 
133 129 */ ··· 201 215 bpf_object__open_mem(const void *obj_buf, size_t obj_buf_sz, 202 216 const struct bpf_object_open_opts *opts); 203 217 204 - /* deprecated bpf_object__open variants */ 205 - LIBBPF_DEPRECATED_SINCE(0, 8, "use bpf_object__open_mem() instead") 206 - LIBBPF_API struct bpf_object * 207 - bpf_object__open_buffer(const void *obj_buf, size_t obj_buf_sz, 208 - const char *name); 209 - LIBBPF_DEPRECATED_SINCE(0, 7, "use bpf_object__open_file() instead") 210 - LIBBPF_API struct bpf_object * 211 - bpf_object__open_xattr(struct bpf_object_open_attr *attr); 218 + /* Load/unload object into/from kernel */ 219 + LIBBPF_API int bpf_object__load(struct bpf_object *obj); 212 220 213 - enum libbpf_pin_type { 214 - LIBBPF_PIN_NONE, 215 - /* PIN_BY_NAME: pin maps by name (in /sys/fs/bpf by default) */ 216 - LIBBPF_PIN_BY_NAME, 217 - }; 221 + LIBBPF_API void bpf_object__close(struct bpf_object *object); 218 222 219 223 /* pin_maps and unpin_maps can both be called with a NULL path, in which case 220 224 * they will use the pin_path attribute of each map (and ignore all maps that ··· 218 242 LIBBPF_API int bpf_object__unpin_programs(struct bpf_object *obj, 219 243 const char *path); 220 244 LIBBPF_API int bpf_object__pin(struct bpf_object *object, const char *path); 221 - LIBBPF_API void bpf_object__close(struct bpf_object *object); 222 - 223 - struct bpf_object_load_attr { 224 - struct bpf_object *obj; 225 - int log_level; 226 - const char *target_btf_path; 227 - }; 228 - 229 - /* Load/unload object into/from kernel */ 230 - LIBBPF_API int bpf_object__load(struct bpf_object *obj); 231 - LIBBPF_DEPRECATED_SINCE(0, 8, "use bpf_object__load() instead") 232 - LIBBPF_API int bpf_object__load_xattr(struct bpf_object_load_attr *attr); 233 - LIBBPF_DEPRECATED_SINCE(0, 6, "bpf_object__unload() is deprecated, use bpf_object__close() instead") 234 - LIBBPF_API int bpf_object__unload(struct bpf_object *obj); 235 245 236 246 LIBBPF_API const char *bpf_object__name(const struct 
bpf_object *obj); 237 247 LIBBPF_API unsigned int bpf_object__kversion(const struct bpf_object *obj); ··· 227 265 LIBBPF_API struct btf *bpf_object__btf(const struct bpf_object *obj); 228 266 LIBBPF_API int bpf_object__btf_fd(const struct bpf_object *obj); 229 267 230 - LIBBPF_DEPRECATED_SINCE(0, 7, "use bpf_object__find_program_by_name() instead") 231 - LIBBPF_API struct bpf_program * 232 - bpf_object__find_program_by_title(const struct bpf_object *obj, 233 - const char *title); 234 268 LIBBPF_API struct bpf_program * 235 269 bpf_object__find_program_by_name(const struct bpf_object *obj, 236 270 const char *name); 237 - 238 - LIBBPF_API LIBBPF_DEPRECATED_SINCE(0, 7, "track bpf_objects in application code instead") 239 - struct bpf_object *bpf_object__next(struct bpf_object *prev); 240 - #define bpf_object__for_each_safe(pos, tmp) \ 241 - for ((pos) = bpf_object__next(NULL), \ 242 - (tmp) = bpf_object__next(pos); \ 243 - (pos) != NULL; \ 244 - (pos) = (tmp), (tmp) = bpf_object__next(tmp)) 245 - 246 - typedef void (*bpf_object_clear_priv_t)(struct bpf_object *, void *); 247 - LIBBPF_DEPRECATED_SINCE(0, 7, "storage via set_priv/priv is deprecated") 248 - LIBBPF_API int bpf_object__set_priv(struct bpf_object *obj, void *priv, 249 - bpf_object_clear_priv_t clear_priv); 250 - LIBBPF_DEPRECATED_SINCE(0, 7, "storage via set_priv/priv is deprecated") 251 - LIBBPF_API void *bpf_object__priv(const struct bpf_object *prog); 252 271 253 272 LIBBPF_API int 254 273 libbpf_prog_type_by_name(const char *name, enum bpf_prog_type *prog_type, ··· 241 298 242 299 /* Accessors of bpf_program */ 243 300 struct bpf_program; 244 - LIBBPF_API LIBBPF_DEPRECATED_SINCE(0, 7, "use bpf_object__next_program() instead") 245 - struct bpf_program *bpf_program__next(struct bpf_program *prog, 246 - const struct bpf_object *obj); 301 + 247 302 LIBBPF_API struct bpf_program * 248 303 bpf_object__next_program(const struct bpf_object *obj, struct bpf_program *prog); 249 304 ··· 250 309 (pos) != NULL; \ 
251 310 (pos) = bpf_object__next_program((obj), (pos))) 252 311 253 - LIBBPF_API LIBBPF_DEPRECATED_SINCE(0, 7, "use bpf_object__prev_program() instead") 254 - struct bpf_program *bpf_program__prev(struct bpf_program *prog, 255 - const struct bpf_object *obj); 256 312 LIBBPF_API struct bpf_program * 257 313 bpf_object__prev_program(const struct bpf_object *obj, struct bpf_program *prog); 258 314 259 - typedef void (*bpf_program_clear_priv_t)(struct bpf_program *, void *); 260 - 261 - LIBBPF_DEPRECATED_SINCE(0, 7, "storage via set_priv/priv is deprecated") 262 - LIBBPF_API int bpf_program__set_priv(struct bpf_program *prog, void *priv, 263 - bpf_program_clear_priv_t clear_priv); 264 - LIBBPF_DEPRECATED_SINCE(0, 7, "storage via set_priv/priv is deprecated") 265 - LIBBPF_API void *bpf_program__priv(const struct bpf_program *prog); 266 315 LIBBPF_API void bpf_program__set_ifindex(struct bpf_program *prog, 267 316 __u32 ifindex); 268 317 269 318 LIBBPF_API const char *bpf_program__name(const struct bpf_program *prog); 270 319 LIBBPF_API const char *bpf_program__section_name(const struct bpf_program *prog); 271 - LIBBPF_API LIBBPF_DEPRECATED("BPF program title is confusing term; please use bpf_program__section_name() instead") 272 - const char *bpf_program__title(const struct bpf_program *prog, bool needs_copy); 273 320 LIBBPF_API bool bpf_program__autoload(const struct bpf_program *prog); 274 321 LIBBPF_API int bpf_program__set_autoload(struct bpf_program *prog, bool autoload); 275 - 276 - /* returns program size in bytes */ 277 - LIBBPF_DEPRECATED_SINCE(0, 7, "use bpf_program__insn_cnt() instead") 278 - LIBBPF_API size_t bpf_program__size(const struct bpf_program *prog); 279 322 280 323 struct bpf_insn; 281 324 ··· 313 388 */ 314 389 LIBBPF_API size_t bpf_program__insn_cnt(const struct bpf_program *prog); 315 390 316 - LIBBPF_DEPRECATED_SINCE(0, 6, "use bpf_object__load() instead") 317 - LIBBPF_API int bpf_program__load(struct bpf_program *prog, const char *license, 
__u32 kern_version); 318 391 LIBBPF_API int bpf_program__fd(const struct bpf_program *prog); 319 - LIBBPF_DEPRECATED_SINCE(0, 7, "multi-instance bpf_program support is deprecated") 320 - LIBBPF_API int bpf_program__pin_instance(struct bpf_program *prog, 321 - const char *path, 322 - int instance); 323 - LIBBPF_DEPRECATED_SINCE(0, 7, "multi-instance bpf_program support is deprecated") 324 - LIBBPF_API int bpf_program__unpin_instance(struct bpf_program *prog, 325 - const char *path, 326 - int instance); 327 392 328 393 /** 329 394 * @brief **bpf_program__pin()** pins the BPF program to a file ··· 613 698 bpf_program__attach_iter(const struct bpf_program *prog, 614 699 const struct bpf_iter_attach_opts *opts); 615 700 616 - /* 617 - * Libbpf allows callers to adjust BPF programs before being loaded 618 - * into kernel. One program in an object file can be transformed into 619 - * multiple variants to be attached to different hooks. 620 - * 621 - * bpf_program_prep_t, bpf_program__set_prep and bpf_program__nth_fd 622 - * form an API for this purpose. 623 - * 624 - * - bpf_program_prep_t: 625 - * Defines a 'preprocessor', which is a caller defined function 626 - * passed to libbpf through bpf_program__set_prep(), and will be 627 - * called before program is loaded. The processor should adjust 628 - * the program one time for each instance according to the instance id 629 - * passed to it. 630 - * 631 - * - bpf_program__set_prep: 632 - * Attaches a preprocessor to a BPF program. The number of instances 633 - * that should be created is also passed through this function. 634 - * 635 - * - bpf_program__nth_fd: 636 - * After the program is loaded, get resulting FD of a given instance 637 - * of the BPF program. 638 - * 639 - * If bpf_program__set_prep() is not used, the program would be loaded 640 - * without adjustment during bpf_object__load(). The program has only 641 - * one instance. In this case bpf_program__fd(prog) is equal to 642 - * bpf_program__nth_fd(prog, 0). 
643 - */ 644 - struct bpf_prog_prep_result { 645 - /* 646 - * If not NULL, load new instruction array. 647 - * If set to NULL, don't load this instance. 648 - */ 649 - struct bpf_insn *new_insn_ptr; 650 - int new_insn_cnt; 651 - 652 - /* If not NULL, result FD is written to it. */ 653 - int *pfd; 654 - }; 655 - 656 - /* 657 - * Parameters of bpf_program_prep_t: 658 - * - prog: The bpf_program being loaded. 659 - * - n: Index of instance being generated. 660 - * - insns: BPF instructions array. 661 - * - insns_cnt:Number of instructions in insns. 662 - * - res: Output parameter, result of transformation. 663 - * 664 - * Return value: 665 - * - Zero: pre-processing success. 666 - * - Non-zero: pre-processing error, stop loading. 667 - */ 668 - typedef int (*bpf_program_prep_t)(struct bpf_program *prog, int n, 669 - struct bpf_insn *insns, int insns_cnt, 670 - struct bpf_prog_prep_result *res); 671 - 672 - LIBBPF_DEPRECATED_SINCE(0, 7, "use bpf_program__insns() for getting bpf_program instructions") 673 - LIBBPF_API int bpf_program__set_prep(struct bpf_program *prog, int nr_instance, 674 - bpf_program_prep_t prep); 675 - 676 - LIBBPF_DEPRECATED_SINCE(0, 7, "multi-instance bpf_program support is deprecated") 677 - LIBBPF_API int bpf_program__nth_fd(const struct bpf_program *prog, int n); 678 - 679 - /* 680 - * Adjust type of BPF program. Default is kprobe. 
681 - */ 682 - LIBBPF_DEPRECATED_SINCE(0, 8, "use bpf_program__set_type() instead") 683 - LIBBPF_API int bpf_program__set_socket_filter(struct bpf_program *prog); 684 - LIBBPF_DEPRECATED_SINCE(0, 8, "use bpf_program__set_type() instead") 685 - LIBBPF_API int bpf_program__set_tracepoint(struct bpf_program *prog); 686 - LIBBPF_DEPRECATED_SINCE(0, 8, "use bpf_program__set_type() instead") 687 - LIBBPF_API int bpf_program__set_raw_tracepoint(struct bpf_program *prog); 688 - LIBBPF_DEPRECATED_SINCE(0, 8, "use bpf_program__set_type() instead") 689 - LIBBPF_API int bpf_program__set_kprobe(struct bpf_program *prog); 690 - LIBBPF_DEPRECATED_SINCE(0, 8, "use bpf_program__set_type() instead") 691 - LIBBPF_API int bpf_program__set_lsm(struct bpf_program *prog); 692 - LIBBPF_DEPRECATED_SINCE(0, 8, "use bpf_program__set_type() instead") 693 - LIBBPF_API int bpf_program__set_sched_cls(struct bpf_program *prog); 694 - LIBBPF_DEPRECATED_SINCE(0, 8, "use bpf_program__set_type() instead") 695 - LIBBPF_API int bpf_program__set_sched_act(struct bpf_program *prog); 696 - LIBBPF_DEPRECATED_SINCE(0, 8, "use bpf_program__set_type() instead") 697 - LIBBPF_API int bpf_program__set_xdp(struct bpf_program *prog); 698 - LIBBPF_DEPRECATED_SINCE(0, 8, "use bpf_program__set_type() instead") 699 - LIBBPF_API int bpf_program__set_perf_event(struct bpf_program *prog); 700 - LIBBPF_DEPRECATED_SINCE(0, 8, "use bpf_program__set_type() instead") 701 - LIBBPF_API int bpf_program__set_tracing(struct bpf_program *prog); 702 - LIBBPF_DEPRECATED_SINCE(0, 8, "use bpf_program__set_type() instead") 703 - LIBBPF_API int bpf_program__set_struct_ops(struct bpf_program *prog); 704 - LIBBPF_DEPRECATED_SINCE(0, 8, "use bpf_program__set_type() instead") 705 - LIBBPF_API int bpf_program__set_extension(struct bpf_program *prog); 706 - LIBBPF_DEPRECATED_SINCE(0, 8, "use bpf_program__set_type() instead") 707 - LIBBPF_API int bpf_program__set_sk_lookup(struct bpf_program *prog); 708 - 709 701 LIBBPF_API enum bpf_prog_type 
bpf_program__type(const struct bpf_program *prog); 710 702 711 703 /** ··· 675 853 bpf_program__set_attach_target(struct bpf_program *prog, int attach_prog_fd, 676 854 const char *attach_func_name); 677 855 678 - LIBBPF_DEPRECATED_SINCE(0, 8, "use bpf_program__type() instead") 679 - LIBBPF_API bool bpf_program__is_socket_filter(const struct bpf_program *prog); 680 - LIBBPF_DEPRECATED_SINCE(0, 8, "use bpf_program__type() instead") 681 - LIBBPF_API bool bpf_program__is_tracepoint(const struct bpf_program *prog); 682 - LIBBPF_DEPRECATED_SINCE(0, 8, "use bpf_program__type() instead") 683 - LIBBPF_API bool bpf_program__is_raw_tracepoint(const struct bpf_program *prog); 684 - LIBBPF_DEPRECATED_SINCE(0, 8, "use bpf_program__type() instead") 685 - LIBBPF_API bool bpf_program__is_kprobe(const struct bpf_program *prog); 686 - LIBBPF_DEPRECATED_SINCE(0, 8, "use bpf_program__type() instead") 687 - LIBBPF_API bool bpf_program__is_lsm(const struct bpf_program *prog); 688 - LIBBPF_DEPRECATED_SINCE(0, 8, "use bpf_program__type() instead") 689 - LIBBPF_API bool bpf_program__is_sched_cls(const struct bpf_program *prog); 690 - LIBBPF_DEPRECATED_SINCE(0, 8, "use bpf_program__type() instead") 691 - LIBBPF_API bool bpf_program__is_sched_act(const struct bpf_program *prog); 692 - LIBBPF_DEPRECATED_SINCE(0, 8, "use bpf_program__type() instead") 693 - LIBBPF_API bool bpf_program__is_xdp(const struct bpf_program *prog); 694 - LIBBPF_DEPRECATED_SINCE(0, 8, "use bpf_program__type() instead") 695 - LIBBPF_API bool bpf_program__is_perf_event(const struct bpf_program *prog); 696 - LIBBPF_DEPRECATED_SINCE(0, 8, "use bpf_program__type() instead") 697 - LIBBPF_API bool bpf_program__is_tracing(const struct bpf_program *prog); 698 - LIBBPF_DEPRECATED_SINCE(0, 8, "use bpf_program__type() instead") 699 - LIBBPF_API bool bpf_program__is_struct_ops(const struct bpf_program *prog); 700 - LIBBPF_DEPRECATED_SINCE(0, 8, "use bpf_program__type() instead") 701 - LIBBPF_API bool bpf_program__is_extension(const 
struct bpf_program *prog); 702 - LIBBPF_DEPRECATED_SINCE(0, 8, "use bpf_program__type() instead") 703 - LIBBPF_API bool bpf_program__is_sk_lookup(const struct bpf_program *prog); 704 - 705 - /* 706 - * No need for __attribute__((packed)), all members of 'bpf_map_def' 707 - * are all aligned. In addition, using __attribute__((packed)) 708 - * would trigger a -Wpacked warning message, and lead to an error 709 - * if -Werror is set. 710 - */ 711 - struct bpf_map_def { 712 - unsigned int type; 713 - unsigned int key_size; 714 - unsigned int value_size; 715 - unsigned int max_entries; 716 - unsigned int map_flags; 717 - }; 718 - 719 856 /** 720 857 * @brief **bpf_object__find_map_by_name()** returns BPF map of 721 858 * the given name, if it exists within the passed BPF object ··· 689 908 LIBBPF_API int 690 909 bpf_object__find_map_fd_by_name(const struct bpf_object *obj, const char *name); 691 910 692 - /* 693 - * Get bpf_map through the offset of corresponding struct bpf_map_def 694 - * in the BPF object file. 
695 - */ 696 - LIBBPF_API LIBBPF_DEPRECATED_SINCE(0, 8, "use bpf_object__find_map_by_name() instead") 697 - struct bpf_map * 698 - bpf_object__find_map_by_offset(struct bpf_object *obj, size_t offset); 699 - 700 - LIBBPF_API LIBBPF_DEPRECATED_SINCE(0, 7, "use bpf_object__next_map() instead") 701 - struct bpf_map *bpf_map__next(const struct bpf_map *map, const struct bpf_object *obj); 702 911 LIBBPF_API struct bpf_map * 703 912 bpf_object__next_map(const struct bpf_object *obj, const struct bpf_map *map); 704 913 ··· 698 927 (pos) = bpf_object__next_map((obj), (pos))) 699 928 #define bpf_map__for_each bpf_object__for_each_map 700 929 701 - LIBBPF_API LIBBPF_DEPRECATED_SINCE(0, 7, "use bpf_object__prev_map() instead") 702 - struct bpf_map *bpf_map__prev(const struct bpf_map *map, const struct bpf_object *obj); 703 930 LIBBPF_API struct bpf_map * 704 931 bpf_object__prev_map(const struct bpf_object *obj, const struct bpf_map *map); 705 932 ··· 731 962 */ 732 963 LIBBPF_API int bpf_map__fd(const struct bpf_map *map); 733 964 LIBBPF_API int bpf_map__reuse_fd(struct bpf_map *map, int fd); 734 - /* get map definition */ 735 - LIBBPF_API LIBBPF_DEPRECATED_SINCE(0, 8, "use appropriate getters or setters instead") 736 - const struct bpf_map_def *bpf_map__def(const struct bpf_map *map); 737 965 /* get map name */ 738 966 LIBBPF_API const char *bpf_map__name(const struct bpf_map *map); 739 967 /* get/set map type */ ··· 739 973 /* get/set map size (max_entries) */ 740 974 LIBBPF_API __u32 bpf_map__max_entries(const struct bpf_map *map); 741 975 LIBBPF_API int bpf_map__set_max_entries(struct bpf_map *map, __u32 max_entries); 742 - LIBBPF_DEPRECATED_SINCE(0, 8, "use bpf_map__set_max_entries() instead") 743 - LIBBPF_API int bpf_map__resize(struct bpf_map *map, __u32 max_entries); 744 976 /* get/set map flags */ 745 977 LIBBPF_API __u32 bpf_map__map_flags(const struct bpf_map *map); 746 978 LIBBPF_API int bpf_map__set_map_flags(struct bpf_map *map, __u32 flags); ··· 761 997 
LIBBPF_API __u64 bpf_map__map_extra(const struct bpf_map *map); 762 998 LIBBPF_API int bpf_map__set_map_extra(struct bpf_map *map, __u64 map_extra); 763 999 764 - typedef void (*bpf_map_clear_priv_t)(struct bpf_map *, void *); 765 - LIBBPF_DEPRECATED_SINCE(0, 7, "storage via set_priv/priv is deprecated") 766 - LIBBPF_API int bpf_map__set_priv(struct bpf_map *map, void *priv, 767 - bpf_map_clear_priv_t clear_priv); 768 - LIBBPF_DEPRECATED_SINCE(0, 7, "storage via set_priv/priv is deprecated") 769 - LIBBPF_API void *bpf_map__priv(const struct bpf_map *map); 770 1000 LIBBPF_API int bpf_map__set_initial_value(struct bpf_map *map, 771 1001 const void *data, size_t size); 772 1002 LIBBPF_API const void *bpf_map__initial_value(struct bpf_map *map, size_t *psize); 773 - LIBBPF_DEPRECATED_SINCE(0, 8, "use bpf_map__type() instead") 774 - LIBBPF_API bool bpf_map__is_offload_neutral(const struct bpf_map *map); 775 1003 776 1004 /** 777 1005 * @brief **bpf_map__is_internal()** tells the caller whether or not the ··· 886 1130 LIBBPF_API int bpf_map__get_next_key(const struct bpf_map *map, 887 1131 const void *cur_key, void *next_key, size_t key_sz); 888 1132 889 - /** 890 - * @brief **libbpf_get_error()** extracts the error code from the passed 891 - * pointer 892 - * @param ptr pointer returned from libbpf API function 893 - * @return error code; or 0 if no error occured 894 - * 895 - * Many libbpf API functions which return pointers have logic to encode error 896 - * codes as pointers, and do not return NULL. Meaning **libbpf_get_error()** 897 - * should be used on the return value from these functions immediately after 898 - * calling the API function, with no intervening calls that could clobber the 899 - * `errno` variable. Consult the individual functions documentation to verify 900 - * if this logic applies should be used. 901 - * 902 - * For these API functions, if `libbpf_set_strict_mode(LIBBPF_STRICT_CLEAN_PTRS)` 903 - * is enabled, NULL is returned on error instead. 
904 - * 905 - * If ptr is NULL, then errno should be already set by the failing 906 - * API, because libbpf never returns NULL on success and it now always 907 - * sets errno on error. 908 - * 909 - * Example usage: 910 - * 911 - * struct perf_buffer *pb; 912 - * 913 - * pb = perf_buffer__new(bpf_map__fd(obj->maps.events), PERF_BUFFER_PAGES, &opts); 914 - * err = libbpf_get_error(pb); 915 - * if (err) { 916 - * pb = NULL; 917 - * fprintf(stderr, "failed to open perf buffer: %d\n", err); 918 - * goto cleanup; 919 - * } 920 - */ 921 - LIBBPF_API long libbpf_get_error(const void *ptr); 922 - 923 - struct bpf_prog_load_attr { 924 - const char *file; 925 - enum bpf_prog_type prog_type; 926 - enum bpf_attach_type expected_attach_type; 927 - int ifindex; 928 - int log_level; 929 - int prog_flags; 930 - }; 931 - 932 - LIBBPF_DEPRECATED_SINCE(0, 8, "use bpf_object__open() and bpf_object__load() instead") 933 - LIBBPF_API int bpf_prog_load_xattr(const struct bpf_prog_load_attr *attr, 934 - struct bpf_object **pobj, int *prog_fd); 935 - LIBBPF_DEPRECATED_SINCE(0, 7, "use bpf_object__open() and bpf_object__load() instead") 936 - LIBBPF_API int bpf_prog_load_deprecated(const char *file, enum bpf_prog_type type, 937 - struct bpf_object **pobj, int *prog_fd); 938 - 939 - /* XDP related API */ 940 - struct xdp_link_info { 941 - __u32 prog_id; 942 - __u32 drv_prog_id; 943 - __u32 hw_prog_id; 944 - __u32 skb_prog_id; 945 - __u8 attach_mode; 946 - }; 947 - 948 1133 struct bpf_xdp_set_link_opts { 949 1134 size_t sz; 950 1135 int old_fd; 951 1136 size_t :0; 952 1137 }; 953 1138 #define bpf_xdp_set_link_opts__last_field old_fd 954 - 955 - LIBBPF_DEPRECATED_SINCE(0, 8, "use bpf_xdp_attach() instead") 956 - LIBBPF_API int bpf_set_link_xdp_fd(int ifindex, int fd, __u32 flags); 957 - LIBBPF_DEPRECATED_SINCE(0, 8, "use bpf_xdp_attach() instead") 958 - LIBBPF_API int bpf_set_link_xdp_fd_opts(int ifindex, int fd, __u32 flags, 959 - const struct bpf_xdp_set_link_opts *opts); 960 - 
LIBBPF_DEPRECATED_SINCE(0, 8, "use bpf_xdp_query_id() instead") 961 - LIBBPF_API int bpf_get_link_xdp_id(int ifindex, __u32 *prog_id, __u32 flags); 962 - LIBBPF_DEPRECATED_SINCE(0, 8, "use bpf_xdp_query() instead") 963 - LIBBPF_API int bpf_get_link_xdp_info(int ifindex, struct xdp_link_info *info, 964 - size_t info_size, __u32 flags); 965 1139 966 1140 struct bpf_xdp_attach_opts { 967 1141 size_t sz; ··· 991 1305 992 1306 /* common use perf buffer options */ 993 1307 struct perf_buffer_opts { 994 - union { 995 - size_t sz; 996 - struct { /* DEPRECATED: will be removed in v1.0 */ 997 - /* if specified, sample_cb is called for each sample */ 998 - perf_buffer_sample_fn sample_cb; 999 - /* if specified, lost_cb is called for each batch of lost samples */ 1000 - perf_buffer_lost_fn lost_cb; 1001 - /* ctx is provided to sample_cb and lost_cb */ 1002 - void *ctx; 1003 - }; 1004 - }; 1308 + size_t sz; 1005 1309 }; 1006 1310 #define perf_buffer_opts__last_field sz 1007 1311 ··· 1012 1336 perf_buffer_sample_fn sample_cb, perf_buffer_lost_fn lost_cb, void *ctx, 1013 1337 const struct perf_buffer_opts *opts); 1014 1338 1015 - LIBBPF_API struct perf_buffer * 1016 - perf_buffer__new_v0_6_0(int map_fd, size_t page_cnt, 1017 - perf_buffer_sample_fn sample_cb, perf_buffer_lost_fn lost_cb, void *ctx, 1018 - const struct perf_buffer_opts *opts); 1019 - 1020 - LIBBPF_API LIBBPF_DEPRECATED_SINCE(0, 7, "use new variant of perf_buffer__new() instead") 1021 - struct perf_buffer *perf_buffer__new_deprecated(int map_fd, size_t page_cnt, 1022 - const struct perf_buffer_opts *opts); 1023 - 1024 - #define perf_buffer__new(...) 
___libbpf_overload(___perf_buffer_new, __VA_ARGS__) 1025 - #define ___perf_buffer_new6(map_fd, page_cnt, sample_cb, lost_cb, ctx, opts) \ 1026 - perf_buffer__new(map_fd, page_cnt, sample_cb, lost_cb, ctx, opts) 1027 - #define ___perf_buffer_new3(map_fd, page_cnt, opts) \ 1028 - perf_buffer__new_deprecated(map_fd, page_cnt, opts) 1029 - 1030 1339 enum bpf_perf_event_ret { 1031 1340 LIBBPF_PERF_EVENT_DONE = 0, 1032 1341 LIBBPF_PERF_EVENT_ERROR = -1, ··· 1025 1364 1026 1365 /* raw perf buffer options, giving most power and control */ 1027 1366 struct perf_buffer_raw_opts { 1028 - union { 1029 - struct { 1030 - size_t sz; 1031 - long :0; 1032 - long :0; 1033 - }; 1034 - struct { /* DEPRECATED: will be removed in v1.0 */ 1035 - /* perf event attrs passed directly into perf_event_open() */ 1036 - struct perf_event_attr *attr; 1037 - /* raw event callback */ 1038 - perf_buffer_event_fn event_cb; 1039 - /* ctx is provided to event_cb */ 1040 - void *ctx; 1041 - }; 1042 - }; 1367 + size_t sz; 1368 + long :0; 1369 + long :0; 1043 1370 /* if cpu_cnt == 0, open all on all possible CPUs (up to the number of 1044 1371 * max_entries of given PERF_EVENT_ARRAY map) 1045 1372 */ ··· 1039 1390 }; 1040 1391 #define perf_buffer_raw_opts__last_field map_keys 1041 1392 1393 + struct perf_event_attr; 1394 + 1042 1395 LIBBPF_API struct perf_buffer * 1043 1396 perf_buffer__new_raw(int map_fd, size_t page_cnt, struct perf_event_attr *attr, 1044 1397 perf_buffer_event_fn event_cb, void *ctx, 1045 1398 const struct perf_buffer_raw_opts *opts); 1046 - 1047 - LIBBPF_API struct perf_buffer * 1048 - perf_buffer__new_raw_v0_6_0(int map_fd, size_t page_cnt, struct perf_event_attr *attr, 1049 - perf_buffer_event_fn event_cb, void *ctx, 1050 - const struct perf_buffer_raw_opts *opts); 1051 - 1052 - LIBBPF_API LIBBPF_DEPRECATED_SINCE(0, 7, "use new variant of perf_buffer__new_raw() instead") 1053 - struct perf_buffer *perf_buffer__new_raw_deprecated(int map_fd, size_t page_cnt, 1054 - const struct 
perf_buffer_raw_opts *opts); 1055 - 1056 - #define perf_buffer__new_raw(...) ___libbpf_overload(___perf_buffer_new_raw, __VA_ARGS__) 1057 - #define ___perf_buffer_new_raw6(map_fd, page_cnt, attr, event_cb, ctx, opts) \ 1058 - perf_buffer__new_raw(map_fd, page_cnt, attr, event_cb, ctx, opts) 1059 - #define ___perf_buffer_new_raw3(map_fd, page_cnt, opts) \ 1060 - perf_buffer__new_raw_deprecated(map_fd, page_cnt, opts) 1061 1399 1062 1400 LIBBPF_API void perf_buffer__free(struct perf_buffer *pb); 1063 1401 LIBBPF_API int perf_buffer__epoll_fd(const struct perf_buffer *pb); ··· 1053 1417 LIBBPF_API int perf_buffer__consume_buffer(struct perf_buffer *pb, size_t buf_idx); 1054 1418 LIBBPF_API size_t perf_buffer__buffer_cnt(const struct perf_buffer *pb); 1055 1419 LIBBPF_API int perf_buffer__buffer_fd(const struct perf_buffer *pb, size_t buf_idx); 1056 - 1057 - typedef enum bpf_perf_event_ret 1058 - (*bpf_perf_event_print_t)(struct perf_event_header *hdr, 1059 - void *private_data); 1060 - LIBBPF_DEPRECATED_SINCE(0, 8, "use perf_buffer__poll() or perf_buffer__consume() instead") 1061 - LIBBPF_API enum bpf_perf_event_ret 1062 - bpf_perf_event_read_simple(void *mmap_mem, size_t mmap_size, size_t page_size, 1063 - void **copy_mem, size_t *copy_size, 1064 - bpf_perf_event_print_t fn, void *private_data); 1065 1420 1066 1421 struct bpf_prog_linfo; 1067 1422 struct bpf_prog_info; ··· 1075 1448 * user, causing subsequent probes to fail. In this case, the caller may want 1076 1449 * to adjust that limit with setrlimit(). 
1077 1450 */ 1078 - LIBBPF_DEPRECATED_SINCE(0, 8, "use libbpf_probe_bpf_prog_type() instead") 1079 - LIBBPF_API bool bpf_probe_prog_type(enum bpf_prog_type prog_type, __u32 ifindex); 1080 - LIBBPF_DEPRECATED_SINCE(0, 8, "use libbpf_probe_bpf_map_type() instead") 1081 - LIBBPF_API bool bpf_probe_map_type(enum bpf_map_type map_type, __u32 ifindex); 1082 - LIBBPF_DEPRECATED_SINCE(0, 8, "use libbpf_probe_bpf_helper() instead") 1083 - LIBBPF_API bool bpf_probe_helper(enum bpf_func_id id, enum bpf_prog_type prog_type, __u32 ifindex); 1084 - LIBBPF_DEPRECATED_SINCE(0, 8, "implement your own or use bpftool for feature detection") 1085 - LIBBPF_API bool bpf_probe_large_insn_limit(__u32 ifindex); 1086 1451 1087 1452 /** 1088 1453 * @brief **libbpf_probe_bpf_prog_type()** detects if host kernel supports ··· 1117 1498 */ 1118 1499 LIBBPF_API int libbpf_probe_bpf_helper(enum bpf_prog_type prog_type, 1119 1500 enum bpf_func_id helper_id, const void *opts); 1120 - 1121 - /* 1122 - * Get bpf_prog_info in continuous memory 1123 - * 1124 - * struct bpf_prog_info has multiple arrays. The user has option to choose 1125 - * arrays to fetch from kernel. The following APIs provide an uniform way to 1126 - * fetch these data. All arrays in bpf_prog_info are stored in a single 1127 - * continuous memory region. This makes it easy to store the info in a 1128 - * file. 1129 - * 1130 - * Before writing bpf_prog_info_linear to files, it is necessary to 1131 - * translate pointers in bpf_prog_info to offsets. Helper functions 1132 - * bpf_program__bpil_addr_to_offs() and bpf_program__bpil_offs_to_addr() 1133 - * are introduced to switch between pointers and offsets. 
1134 - * 1135 - * Examples: 1136 - * # To fetch map_ids and prog_tags: 1137 - * __u64 arrays = (1UL << BPF_PROG_INFO_MAP_IDS) | 1138 - * (1UL << BPF_PROG_INFO_PROG_TAGS); 1139 - * struct bpf_prog_info_linear *info_linear = 1140 - * bpf_program__get_prog_info_linear(fd, arrays); 1141 - * 1142 - * # To save data in file 1143 - * bpf_program__bpil_addr_to_offs(info_linear); 1144 - * write(f, info_linear, sizeof(*info_linear) + info_linear->data_len); 1145 - * 1146 - * # To read data from file 1147 - * read(f, info_linear, <proper_size>); 1148 - * bpf_program__bpil_offs_to_addr(info_linear); 1149 - */ 1150 - enum bpf_prog_info_array { 1151 - BPF_PROG_INFO_FIRST_ARRAY = 0, 1152 - BPF_PROG_INFO_JITED_INSNS = 0, 1153 - BPF_PROG_INFO_XLATED_INSNS, 1154 - BPF_PROG_INFO_MAP_IDS, 1155 - BPF_PROG_INFO_JITED_KSYMS, 1156 - BPF_PROG_INFO_JITED_FUNC_LENS, 1157 - BPF_PROG_INFO_FUNC_INFO, 1158 - BPF_PROG_INFO_LINE_INFO, 1159 - BPF_PROG_INFO_JITED_LINE_INFO, 1160 - BPF_PROG_INFO_PROG_TAGS, 1161 - BPF_PROG_INFO_LAST_ARRAY, 1162 - }; 1163 - 1164 - struct bpf_prog_info_linear { 1165 - /* size of struct bpf_prog_info, when the tool is compiled */ 1166 - __u32 info_len; 1167 - /* total bytes allocated for data, round up to 8 bytes */ 1168 - __u32 data_len; 1169 - /* which arrays are included in data */ 1170 - __u64 arrays; 1171 - struct bpf_prog_info info; 1172 - __u8 data[]; 1173 - }; 1174 - 1175 - LIBBPF_DEPRECATED_SINCE(0, 6, "use a custom linear prog_info wrapper") 1176 - LIBBPF_API struct bpf_prog_info_linear * 1177 - bpf_program__get_prog_info_linear(int fd, __u64 arrays); 1178 - 1179 - LIBBPF_DEPRECATED_SINCE(0, 6, "use a custom linear prog_info wrapper") 1180 - LIBBPF_API void 1181 - bpf_program__bpil_addr_to_offs(struct bpf_prog_info_linear *info_linear); 1182 - 1183 - LIBBPF_DEPRECATED_SINCE(0, 6, "use a custom linear prog_info wrapper") 1184 - LIBBPF_API void 1185 - bpf_program__bpil_offs_to_addr(struct bpf_prog_info_linear *info_linear); 1186 1501 1187 1502 /** 1188 1503 * 
@brief **libbpf_num_possible_cpus()** is a helper function to get the
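The `perf_buffer_opts` and `perf_buffer_raw_opts` hunks above reduce each options struct to a leading `size_t sz` plus a `__last_field` marker. The following is a minimal, self-contained sketch of how a size-prefixed options struct of that shape can be validated. `demo_opts`, `DEMO_OPTS_KNOWN_SZ` and `demo_opts_valid()` are hypothetical names made up for illustration; this is not libbpf's own validation code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical options struct: sz must come first and is set by the caller. */
struct demo_opts {
	size_t sz;
	int page_cnt;
	int flags;
};

/* Size of the struct up to and including the last field this build knows. */
#define DEMO_OPTS_KNOWN_SZ \
	(offsetof(struct demo_opts, flags) + sizeof(((struct demo_opts *)0)->flags))

/*
 * Accept NULL (all defaults), reject structs too small to hold sz, and
 * reject callers built against a newer, larger struct if any byte past
 * the fields known to this build is non-zero.
 */
static bool demo_opts_valid(const void *opts, size_t known_sz)
{
	const unsigned char *bytes = opts;
	size_t user_sz, i;

	if (!opts)
		return true;
	memcpy(&user_sz, opts, sizeof(user_sz));
	if (user_sz < sizeof(size_t))
		return false;
	for (i = known_sz; i < user_sz; i++)
		if (bytes[i])
			return false;
	return true;
}
```

The point of the design is that a caller compiled against a newer header can still talk to an older library as long as it leaves the fields the library does not know about zeroed, which is presumably why the deprecated callback fields could be dropped from the union without changing the position of `sz`.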
+3 -111
tools/lib/bpf/libbpf.map
··· 1 1 LIBBPF_0.0.1 { 2 2 global: 3 3 bpf_btf_get_fd_by_id; 4 - bpf_create_map; 5 - bpf_create_map_in_map; 6 - bpf_create_map_in_map_node; 7 - bpf_create_map_name; 8 - bpf_create_map_node; 9 - bpf_create_map_xattr; 10 - bpf_load_btf; 11 - bpf_load_program; 12 - bpf_load_program_xattr; 13 4 bpf_map__btf_key_type_id; 14 5 bpf_map__btf_value_type_id; 15 - bpf_map__def; 16 6 bpf_map__fd; 17 - bpf_map__is_offload_neutral; 18 7 bpf_map__name; 19 - bpf_map__next; 20 8 bpf_map__pin; 21 - bpf_map__prev; 22 - bpf_map__priv; 23 9 bpf_map__reuse_fd; 24 10 bpf_map__set_ifindex; 25 11 bpf_map__set_inner_map_fd; 26 - bpf_map__set_priv; 27 12 bpf_map__unpin; 28 13 bpf_map_delete_elem; 29 14 bpf_map_get_fd_by_id; ··· 23 38 bpf_object__btf_fd; 24 39 bpf_object__close; 25 40 bpf_object__find_map_by_name; 26 - bpf_object__find_map_by_offset; 27 - bpf_object__find_program_by_title; 28 41 bpf_object__kversion; 29 42 bpf_object__load; 30 43 bpf_object__name; 31 - bpf_object__next; 32 44 bpf_object__open; 33 - bpf_object__open_buffer; 34 - bpf_object__open_xattr; 35 45 bpf_object__pin; 36 46 bpf_object__pin_maps; 37 47 bpf_object__pin_programs; 38 - bpf_object__priv; 39 - bpf_object__set_priv; 40 - bpf_object__unload; 41 48 bpf_object__unpin_maps; 42 49 bpf_object__unpin_programs; 43 - bpf_perf_event_read_simple; 44 50 bpf_prog_attach; 45 51 bpf_prog_detach; 46 52 bpf_prog_detach2; 47 53 bpf_prog_get_fd_by_id; 48 54 bpf_prog_get_next_id; 49 - bpf_prog_load; 50 - bpf_prog_load_xattr; 51 55 bpf_prog_query; 52 - bpf_prog_test_run; 53 - bpf_prog_test_run_xattr; 54 56 bpf_program__fd; 55 - bpf_program__is_kprobe; 56 - bpf_program__is_perf_event; 57 - bpf_program__is_raw_tracepoint; 58 - bpf_program__is_sched_act; 59 - bpf_program__is_sched_cls; 60 - bpf_program__is_socket_filter; 61 - bpf_program__is_tracepoint; 62 - bpf_program__is_xdp; 63 - bpf_program__load; 64 - bpf_program__next; 65 - bpf_program__nth_fd; 66 57 bpf_program__pin; 67 - bpf_program__pin_instance; 68 - bpf_program__prev; 69 
- bpf_program__priv; 70 58 bpf_program__set_expected_attach_type; 71 59 bpf_program__set_ifindex; 72 - bpf_program__set_kprobe; 73 - bpf_program__set_perf_event; 74 - bpf_program__set_prep; 75 - bpf_program__set_priv; 76 - bpf_program__set_raw_tracepoint; 77 - bpf_program__set_sched_act; 78 - bpf_program__set_sched_cls; 79 - bpf_program__set_socket_filter; 80 - bpf_program__set_tracepoint; 81 60 bpf_program__set_type; 82 - bpf_program__set_xdp; 83 - bpf_program__title; 84 61 bpf_program__unload; 85 62 bpf_program__unpin; 86 - bpf_program__unpin_instance; 87 63 bpf_prog_linfo__free; 88 64 bpf_prog_linfo__new; 89 65 bpf_prog_linfo__lfind_addr_func; 90 66 bpf_prog_linfo__lfind; 91 67 bpf_raw_tracepoint_open; 92 - bpf_set_link_xdp_fd; 93 68 bpf_task_fd_query; 94 - bpf_verify_program; 95 69 btf__fd; 96 70 btf__find_by_name; 97 71 btf__free; 98 - btf__get_from_id; 99 72 btf__name_by_offset; 100 73 btf__new; 101 74 btf__resolve_size; ··· 70 127 71 128 LIBBPF_0.0.2 { 72 129 global: 73 - bpf_probe_helper; 74 - bpf_probe_map_type; 75 - bpf_probe_prog_type; 76 - bpf_map__resize; 77 130 bpf_map_lookup_elem_flags; 78 131 bpf_object__btf; 79 132 bpf_object__find_map_fd_by_name; 80 - bpf_get_link_xdp_id; 81 - btf__dedup; 82 - btf__get_map_kv_tids; 83 - btf__get_nr_types; 84 133 btf__get_raw_data; 85 - btf__load; 86 134 btf_ext__free; 87 - btf_ext__func_info_rec_size; 88 135 btf_ext__get_raw_data; 89 - btf_ext__line_info_rec_size; 90 136 btf_ext__new; 91 - btf_ext__reloc_func_info; 92 - btf_ext__reloc_line_info; 93 - xsk_umem__create; 94 - xsk_socket__create; 95 - xsk_umem__delete; 96 - xsk_socket__delete; 97 - xsk_umem__fd; 98 - xsk_socket__fd; 99 - bpf_program__get_prog_info_linear; 100 - bpf_program__bpil_addr_to_offs; 101 - bpf_program__bpil_offs_to_addr; 102 137 } LIBBPF_0.0.1; 103 138 104 139 LIBBPF_0.0.3 { 105 140 global: 106 141 bpf_map__is_internal; 107 142 bpf_map_freeze; 108 - btf__finalize_data; 109 143 } LIBBPF_0.0.2; 110 144 111 145 LIBBPF_0.0.4 { 112 146 global: 113 
147 bpf_link__destroy; 114 - bpf_object__load_xattr; 115 148 bpf_program__attach_kprobe; 116 149 bpf_program__attach_perf_event; 117 150 bpf_program__attach_raw_tracepoint; ··· 95 176 bpf_program__attach_uprobe; 96 177 btf_dump__dump_type; 97 178 btf_dump__free; 98 - btf_dump__new; 99 179 btf__parse_elf; 100 180 libbpf_num_possible_cpus; 101 181 perf_buffer__free; 102 - perf_buffer__new; 103 - perf_buffer__new_raw; 104 182 perf_buffer__poll; 105 - xsk_umem__create; 106 183 } LIBBPF_0.0.3; 107 184 108 185 LIBBPF_0.0.5 { ··· 108 193 109 194 LIBBPF_0.0.6 { 110 195 global: 111 - bpf_get_link_xdp_info; 112 196 bpf_map__get_pin_path; 113 197 bpf_map__is_pinned; 114 198 bpf_map__set_pin_path; ··· 116 202 bpf_program__attach_trace; 117 203 bpf_program__get_expected_attach_type; 118 204 bpf_program__get_type; 119 - bpf_program__is_tracing; 120 - bpf_program__set_tracing; 121 - bpf_program__size; 122 205 btf__find_by_name_kind; 123 206 libbpf_find_vmlinux_btf_id; 124 207 } LIBBPF_0.0.5; ··· 135 224 bpf_object__detach_skeleton; 136 225 bpf_object__load_skeleton; 137 226 bpf_object__open_skeleton; 138 - bpf_probe_large_insn_limit; 139 - bpf_prog_attach_xattr; 140 227 bpf_program__attach; 141 228 bpf_program__name; 142 - bpf_program__is_extension; 143 - bpf_program__is_struct_ops; 144 - bpf_program__set_extension; 145 - bpf_program__set_struct_ops; 146 229 btf__align_of; 147 230 libbpf_find_kernel_btf; 148 231 } LIBBPF_0.0.6; ··· 155 250 bpf_prog_attach_opts; 156 251 bpf_program__attach_cgroup; 157 252 bpf_program__attach_lsm; 158 - bpf_program__is_lsm; 159 253 bpf_program__set_attach_target; 160 - bpf_program__set_lsm; 161 - bpf_set_link_xdp_fd_opts; 162 254 } LIBBPF_0.0.7; 163 255 164 256 LIBBPF_0.0.9 { ··· 193 291 bpf_map__value_size; 194 292 bpf_program__attach_xdp; 195 293 bpf_program__autoload; 196 - bpf_program__is_sk_lookup; 197 294 bpf_program__set_autoload; 198 - bpf_program__set_sk_lookup; 199 295 btf__parse; 200 296 btf__parse_raw; 201 297 btf__pointer_size; ··· 236 
336 perf_buffer__buffer_fd; 237 337 perf_buffer__epoll_fd; 238 338 perf_buffer__consume_buffer; 239 - xsk_socket__create_shared; 240 339 } LIBBPF_0.1.0; 241 340 242 341 LIBBPF_0.3.0 { ··· 247 348 btf__new_empty_split; 248 349 btf__new_split; 249 350 ring_buffer__epoll_fd; 250 - xsk_setup_xdp_prog; 251 - xsk_socket__update_xskmap; 252 351 } LIBBPF_0.2.0; 253 352 254 353 LIBBPF_0.4.0 { ··· 294 397 bpf_object__next_program; 295 398 bpf_object__prev_map; 296 399 bpf_object__prev_program; 297 - bpf_prog_load_deprecated; 298 400 bpf_prog_load; 299 401 bpf_program__flags; 300 402 bpf_program__insn_cnt; ··· 303 407 btf__add_decl_tag; 304 408 btf__add_type_tag; 305 409 btf__dedup; 306 - btf__dedup_deprecated; 307 410 btf__raw_data; 308 411 btf__type_cnt; 309 412 btf_dump__new; 310 - btf_dump__new_deprecated; 311 413 libbpf_major_version; 312 414 libbpf_minor_version; 313 415 libbpf_version_string; 314 416 perf_buffer__new; 315 - perf_buffer__new_deprecated; 316 417 perf_buffer__new_raw; 317 - perf_buffer__new_raw_deprecated; 318 418 } LIBBPF_0.5.0; 319 419 320 420 LIBBPF_0.7.0 { ··· 326 434 bpf_xdp_detach; 327 435 bpf_xdp_query; 328 436 bpf_xdp_query_id; 437 + btf_ext__raw_data; 329 438 libbpf_probe_bpf_helper; 330 439 libbpf_probe_bpf_map_type; 331 440 libbpf_probe_bpf_prog_type; 332 - libbpf_set_memlock_rlim_max; 441 + libbpf_set_memlock_rlim; 333 442 } LIBBPF_0.6.0; 334 443 335 444 LIBBPF_0.8.0 { ··· 355 462 356 463 LIBBPF_1.0.0 { 357 464 global: 465 + bpf_prog_query_opts; 358 466 btf__add_enum64; 359 467 btf__add_enum64_value; 360 468 libbpf_bpf_attach_type_str; 361 469 libbpf_bpf_link_type_str; 362 470 libbpf_bpf_map_type_str; 363 471 libbpf_bpf_prog_type_str; 364 - 365 - local: *; 366 472 };
+3 -13
tools/lib/bpf/libbpf_common.h
···
 /* Add checks for other versions below when planning deprecation of API symbols
  * with the LIBBPF_DEPRECATED_SINCE macro.
  */
-#if __LIBBPF_CURRENT_VERSION_GEQ(0, 6)
-#define __LIBBPF_MARK_DEPRECATED_0_6(X) X
+#if __LIBBPF_CURRENT_VERSION_GEQ(1, 0)
+#define __LIBBPF_MARK_DEPRECATED_1_0(X) X
 #else
-#define __LIBBPF_MARK_DEPRECATED_0_6(X)
-#endif
-#if __LIBBPF_CURRENT_VERSION_GEQ(0, 7)
-#define __LIBBPF_MARK_DEPRECATED_0_7(X) X
-#else
-#define __LIBBPF_MARK_DEPRECATED_0_7(X)
-#endif
-#if __LIBBPF_CURRENT_VERSION_GEQ(0, 8)
-#define __LIBBPF_MARK_DEPRECATED_0_8(X) X
-#else
-#define __LIBBPF_MARK_DEPRECATED_0_8(X)
+#define __LIBBPF_MARK_DEPRECATED_1_0(X)
 #endif

 /* This set of internal macros allows to do "function overloading" based on
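The hunk above collapses the per-release deprecation gates into a single 1.0 gate. The pattern can be sketched in isolation as follows. This is a minimal illustration, not libbpf's header: every `DEMO_` name and `demo_old_api()` are hypothetical, the version comparison is assumed to be a plain major/minor check, and `__attribute__((deprecated))` assumes GCC or Clang.

```c
#include <assert.h>

/* Hypothetical stand-ins for the library's version constants. */
#define DEMO_MAJOR_VERSION 1
#define DEMO_MINOR_VERSION 0

/* Non-zero when the current version is at least major.minor. */
#define DEMO_VERSION_GEQ(major, minor)					\
	(DEMO_MAJOR_VERSION > (major) ||				\
	 (DEMO_MAJOR_VERSION == (major) && DEMO_MINOR_VERSION >= (minor)))

/* A gate expands to its argument once its version is reached and to
 * nothing before that, so a deprecation attribute can be attached to a
 * declaration ahead of time and only start firing from that release.
 */
#if DEMO_VERSION_GEQ(1, 1)
#define DEMO_MARK_DEPRECATED_1_1(X) X
#else
#define DEMO_MARK_DEPRECATED_1_1(X)
#endif

/* Hypothetical API scheduled for deprecation in 1.1. With the version set
 * to 1.0 above, the gate is closed and the attribute disappears, so callers
 * compile without a deprecation warning.
 */
DEMO_MARK_DEPRECATED_1_1(__attribute__((deprecated("use demo_new_api()"))))
int demo_old_api(void);

int demo_old_api(void)
{
	return 42;
}
```

Bumping `DEMO_MINOR_VERSION` to 1 would open the gate and make every call to `demo_old_api()` emit a warning, without touching the declaration itself.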
+4 -20
tools/lib/bpf/libbpf_internal.h
···
 #include <linux/err.h>
 #include <fcntl.h>
 #include <unistd.h>
-#include "libbpf_legacy.h"
 #include "relo_core.h"

 /* make sure libbpf doesn't use kernel-only integer typedefs */
···
 __s32 btf__find_by_name_kind_own(const struct btf *btf, const char *type_name,
 				 __u32 kind);

-extern enum libbpf_strict_mode libbpf_mode;
-
 typedef int (*kallsyms_cb_t)(unsigned long long sym_addr, char sym_type,
 			     const char *sym_name, void *ctx);

···
  */
 static inline int libbpf_err_errno(int ret)
 {
-	if (libbpf_mode & LIBBPF_STRICT_DIRECT_ERRS)
-		/* errno is already assumed to be set on error */
-		return ret < 0 ? -errno : ret;
-
-	/* legacy: on error return -1 directly and don't touch errno */
-	return ret;
+	/* errno is already assumed to be set on error */
+	return ret < 0 ? -errno : ret;
 }

 /* handle error for pointer-returning APIs, err is assumed to be < 0 always */
···
 {
 	/* set errno on error, this doesn't break anything */
 	errno = -err;
-
-	if (libbpf_mode & LIBBPF_STRICT_CLEAN_PTRS)
-		return NULL;
-
-	/* legacy: encode err as ptr */
-	return ERR_PTR(err);
+	return NULL;
 }

 /* handle pointer-returning APIs' error handling */
···
 	if (IS_ERR(ret))
 		errno = -PTR_ERR(ret);

-	if (libbpf_mode & LIBBPF_STRICT_CLEAN_PTRS)
-		return IS_ERR(ret) ? NULL : ret;
-
-	/* legacy: pass-through original pointer */
-	return ret;
+	return IS_ERR(ret) ? NULL : ret;
 }

 static inline bool str_is_empty(const char *s)
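With the strict-mode branches gone, the helpers in this hunk always follow one convention: integer-returning paths yield `-errno`, pointer-returning paths yield NULL with the cause left in `errno`. A standalone model of that convention is sketched below. The `demo_` functions are hypothetical stand-ins written for illustration, not the library's internals, and `demo_failing_call()` merely imitates a syscall wrapper that fails with `ENOENT`.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Model of the integer path: a syscall-style result (-1 plus errno on
 * failure) is turned into a negative error code.
 */
int demo_err_errno(int ret)
{
	/* errno is assumed to be set already when ret < 0 */
	return ret < 0 ? -errno : ret;
}

/* Model of the pointer path: the error code is stored in errno and the
 * caller only ever sees NULL, never an error encoded in the pointer.
 */
void *demo_err_ptr(int err)
{
	assert(err < 0);
	errno = -err;
	return NULL;
}

/* Imitates a wrapper around a failing system call. */
int demo_failing_call(void)
{
	errno = ENOENT;
	return -1;
}
```

The practical effect for callers is that a plain `if (!ptr)` or `if (ret < 0)` check is sufficient; no `IS_ERR()`-style decoding of returned pointers is involved in this model.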
+26 -2
tools/lib/bpf/libbpf_legacy.h
···
 extern "C" {
 #endif

+/* As of libbpf 1.0 libbpf_set_strict_mode() and enum libbpf_struct_mode have
+ * no effect. But they are left in libbpf_legacy.h so that applications that
+ * prepared for libbpf 1.0 before final release by using
+ * libbpf_set_strict_mode() still work with libbpf 1.0+ without any changes.
+ */
 enum libbpf_strict_mode {
 	/* Turn on all supported strict features of libbpf to simulate libbpf
 	 * v1.0 behavior.
···
 	 * first BPF program or map creation operation. This is done only if
 	 * kernel is too old to support memcg-based memory accounting for BPF
 	 * subsystem. By default, RLIMIT_MEMLOCK limit is set to RLIM_INFINITY,
-	 * but it can be overriden with libbpf_set_memlock_rlim_max() API.
-	 * Note that libbpf_set_memlock_rlim_max() needs to be called before
+	 * but it can be overriden with libbpf_set_memlock_rlim() API.
+	 * Note that libbpf_set_memlock_rlim() needs to be called before
 	 * the very first bpf_prog_load(), bpf_map_create() or bpf_object__load()
 	 * operation.
 	 */
···
 };

 LIBBPF_API int libbpf_set_strict_mode(enum libbpf_strict_mode mode);
+
+/**
+ * @brief **libbpf_get_error()** extracts the error code from the passed
+ * pointer
+ * @param ptr pointer returned from libbpf API function
+ * @return error code; or 0 if no error occured
+ *
+ * Note, as of libbpf 1.0 this function is not necessary and not recommended
+ * to be used. Libbpf doesn't return error code embedded into the pointer
+ * itself. Instead, NULL is returned on error and error code is passed through
+ * thread-local errno variable. **libbpf_get_error()** is just returning -errno
+ * value if it receives NULL, which is correct only if errno hasn't been
+ * modified between libbpf API call and corresponding **libbpf_get_error()**
+ * call. Prefer to check return for NULL and use errno directly.
+ *
+ * This API is left in libbpf 1.0 to allow applications that were 1.0-ready
+ * before final libbpf 1.0 without needing to change them.
+ */
+LIBBPF_API long libbpf_get_error(const void *ptr);

 #define DECLARE_LIBBPF_OPTS LIBBPF_OPTS

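The doc comment added above recommends checking for NULL and reading `errno` directly, because an accessor that reads `errno` later is only right if nothing touched `errno` in between. The sketch below illustrates both sides with stand-ins: `demo_open()`, `demo_get_error()` and `demo_open_checked()` are hypothetical and simplified, written only to show the caller-side pattern, and are not libbpf functions.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical pointer-returning API following the 1.0 convention:
 * NULL on failure, with the cause in errno.
 */
void *demo_open(const char *path)
{
	static int handle;

	if (!path || !*path) {
		errno = EINVAL;
		return NULL;
	}
	return &handle;
}

/* Simplified model of a legacy accessor: for NULL it can only report
 * whatever errno holds at the time of the call.
 */
long demo_get_error(const void *ptr)
{
	return ptr ? 0 : -errno;
}

/* Recommended pattern: test for NULL and capture errno immediately. */
int demo_open_checked(const char *path, void **out)
{
	void *obj = demo_open(path);
	int err;

	if (!obj) {
		err = -errno;	/* save before anything can clobber errno */
		fprintf(stderr, "open failed: %s\n", strerror(-err));
		return err;
	}
	*out = obj;
	return 0;
}
```

If any call that modifies `errno` (logging, cleanup, another library call) runs between `demo_open()` and `demo_get_error()`, the accessor reports that later error instead of the original one, which is the failure mode the comment warns about.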
+5 -120
tools/lib/bpf/libbpf_probes.c
···
 #include "libbpf.h"
 #include "libbpf_internal.h"

-static bool grep(const char *buffer, const char *pattern)
-{
-	return !!strstr(buffer, pattern);
-}
-
-static int get_vendor_id(int ifindex)
-{
-	char ifname[IF_NAMESIZE], path[64], buf[8];
-	ssize_t len;
-	int fd;
-
-	if (!if_indextoname(ifindex, ifname))
-		return -1;
-
-	snprintf(path, sizeof(path), "/sys/class/net/%s/device/vendor", ifname);
-
-	fd = open(path, O_RDONLY | O_CLOEXEC);
-	if (fd < 0)
-		return -1;
-
-	len = read(fd, buf, sizeof(buf));
-	close(fd);
-	if (len < 0)
-		return -1;
-	if (len >= (ssize_t)sizeof(buf))
-		return -1;
-	buf[len] = '\0';
-
-	return strtol(buf, NULL, 0);
-}
-
 static int probe_prog_load(enum bpf_prog_type prog_type,
 			   const struct bpf_insn *insns, size_t insns_cnt,
-			   char *log_buf, size_t log_buf_sz,
-			   __u32 ifindex)
+			   char *log_buf, size_t log_buf_sz)
 {
 	LIBBPF_OPTS(bpf_prog_load_opts, opts,
 		.log_buf = log_buf,
 		.log_size = log_buf_sz,
 		.log_level = log_buf ? 1 : 0,
-		.prog_ifindex = ifindex,
 	);
 	int fd, err, exp_err = 0;
 	const char *exp_msg = NULL;
···
 	if (opts)
 		return libbpf_err(-EINVAL);

-	ret = probe_prog_load(prog_type, insns, insn_cnt, NULL, 0, 0);
+	ret = probe_prog_load(prog_type, insns, insn_cnt, NULL, 0);
 	return libbpf_err(ret);
-}
-
-bool bpf_probe_prog_type(enum bpf_prog_type prog_type, __u32 ifindex)
-{
-	struct bpf_insn insns[2] = {
-		BPF_MOV64_IMM(BPF_REG_0, 0),
-		BPF_EXIT_INSN()
-	};
-
-	/* prefer libbpf_probe_bpf_prog_type() unless offload is requested */
-	if (ifindex == 0)
-		return libbpf_probe_bpf_prog_type(prog_type, NULL) == 1;
-
-	if (ifindex && prog_type == BPF_PROG_TYPE_SCHED_CLS)
-		/* nfp returns -EINVAL on exit(0) with TC offload */
-		insns[0].imm = 2;
-
-	errno = 0;
-	probe_prog_load(prog_type, insns, ARRAY_SIZE(insns), NULL, 0, ifindex);
-
-	return errno != EINVAL && errno != EOPNOTSUPP;
 }

 int libbpf__load_raw_btf(const char *raw_types, size_t types_len,
···
 				     strs, sizeof(strs));
 }

-static int probe_map_create(enum bpf_map_type map_type, __u32 ifindex)
+static int probe_map_create(enum bpf_map_type map_type)
 {
 	LIBBPF_OPTS(bpf_map_create_opts, opts);
 	int key_size, value_size, max_entries;
 	__u32 btf_key_type_id = 0, btf_value_type_id = 0;
 	int fd = -1, btf_fd = -1, fd_inner = -1, exp_err = 0, err;
-
-	opts.map_ifindex = ifindex;

 	key_size = sizeof(__u32);
 	value_size = sizeof(__u32);
···

 	if (map_type == BPF_MAP_TYPE_ARRAY_OF_MAPS ||
 	    map_type == BPF_MAP_TYPE_HASH_OF_MAPS) {
-		/* TODO: probe for device, once libbpf has a function to create
-		 * map-in-map for offload
-		 */
-		if (ifindex)
-			goto cleanup;
-
 		fd_inner = bpf_map_create(BPF_MAP_TYPE_HASH, NULL,
 					  sizeof(__u32), sizeof(__u32), 1, NULL);
 		if (fd_inner < 0)
···
 	if (opts)
 		return libbpf_err(-EINVAL);

-	ret = probe_map_create(map_type, 0);
+	ret = probe_map_create(map_type);
 	return libbpf_err(ret);
-}
-
-bool bpf_probe_map_type(enum bpf_map_type map_type, __u32 ifindex)
-{
-	return probe_map_create(map_type, ifindex) == 1;
 }

 int libbpf_probe_bpf_helper(enum bpf_prog_type prog_type, enum bpf_func_id helper_id,
···
 	}

 	buf[0] = '\0';
-	ret = probe_prog_load(prog_type, insns, insn_cnt, buf, sizeof(buf), 0);
+	ret = probe_prog_load(prog_type, insns, insn_cnt, buf, sizeof(buf));
 	if (ret < 0)
 		return libbpf_err(ret);

···
 	if (ret == 0 && (strstr(buf, "invalid func ") || strstr(buf, "unknown func ")))
 		return 0;
 	return 1; /* assume supported */
-}
-
-bool bpf_probe_helper(enum bpf_func_id id, enum bpf_prog_type prog_type,
-		      __u32 ifindex)
-{
-	struct bpf_insn insns[2] = {
-		BPF_EMIT_CALL(id),
-		BPF_EXIT_INSN()
-	};
-	char buf[4096] = {};
-	bool res;
-
-	probe_prog_load(prog_type, insns, ARRAY_SIZE(insns), buf, sizeof(buf), ifindex);
-	res = !grep(buf, "invalid func ") && !grep(buf, "unknown func ");
-
-	if (ifindex) {
-		switch (get_vendor_id(ifindex)) {
-		case 0x19ee: /* Netronome specific */
-			res = res && !grep(buf, "not supported by FW") &&
-				!grep(buf, "unsupported function id");
-			break;
-		default:
-			break;
-		}
-	}
-
-	return res;
-}
-
-/*
- * Probe for availability of kernel commit (5.3):
- *
- * c04c0d2b968a ("bpf: increase complexity limit and maximum program size")
- */
-bool bpf_probe_large_insn_limit(__u32 ifindex)
-{
-	struct bpf_insn insns[BPF_MAXINSNS + 1];
-	int i;
-
-	for (i = 0; i < BPF_MAXINSNS; i++)
-		insns[i] = BPF_MOV64_IMM(BPF_REG_0, 1);
-	insns[BPF_MAXINSNS] = BPF_EXIT_INSN();
-
-	errno = 0;
-	probe_prog_load(BPF_PROG_TYPE_SCHED_CLS, insns, ARRAY_SIZE(insns), NULL, 0,
-			ifindex);
-
-	return errno != E2BIG && errno != EINVAL;
 }
+8 -54
tools/lib/bpf/netlink.c
···
 typedef int (*__dump_nlmsg_t)(struct nlmsghdr *nlmsg, libbpf_dump_nlmsg_t,
 			      void *cookie);

+struct xdp_link_info {
+	__u32 prog_id;
+	__u32 drv_prog_id;
+	__u32 hw_prog_id;
+	__u32 skb_prog_id;
+	__u8 attach_mode;
+};
+
 struct xdp_id_md {
 	int ifindex;
 	__u32 flags;
···
 	return bpf_xdp_attach(ifindex, -1, flags, opts);
 }

-int bpf_set_link_xdp_fd_opts(int ifindex, int fd, __u32 flags,
-			     const struct bpf_xdp_set_link_opts *opts)
-{
-	int old_fd = -1, ret;
-
-	if (!OPTS_VALID(opts, bpf_xdp_set_link_opts))
-		return libbpf_err(-EINVAL);
-
-	if (OPTS_HAS(opts, old_fd)) {
-		old_fd = OPTS_GET(opts, old_fd, -1);
-		flags |= XDP_FLAGS_REPLACE;
-	}
-
-	ret = __bpf_set_link_xdp_fd_replace(ifindex, fd, old_fd, flags);
-	return libbpf_err(ret);
-}
-
-int bpf_set_link_xdp_fd(int ifindex, int fd, __u32 flags)
-{
-	int ret;
-
-	ret = __bpf_set_link_xdp_fd_replace(ifindex, fd, 0, flags);
-	return libbpf_err(ret);
-}
-
 static int __dump_link_nlmsg(struct nlmsghdr *nlh,
 			     libbpf_dump_nlmsg_t dump_link_nlmsg, void *cookie)
 {
···
 	return 0;
 }

-int bpf_get_link_xdp_info(int ifindex, struct xdp_link_info *info,
-			  size_t info_size, __u32 flags)
-{
-	LIBBPF_OPTS(bpf_xdp_query_opts, opts);
-	size_t sz;
-	int err;
-
-	if (!info_size)
-		return libbpf_err(-EINVAL);
-
-	err = bpf_xdp_query(ifindex, flags, &opts);
-	if (err)
-		return libbpf_err(err);
-
-	/* struct xdp_link_info field layout matches struct bpf_xdp_query_opts
-	 * layout after sz field
-	 */
-	sz = min(info_size, offsetofend(struct xdp_link_info, attach_mode));
-	memcpy(info, &opts.prog_id, sz);
-	memset((void *)info + sz, 0, info_size - sz);
-
-	return 0;
-}
-
 int bpf_xdp_query_id(int ifindex, int flags, __u32 *prog_id)
 {
 	LIBBPF_OPTS(bpf_xdp_query_opts, opts);
···
 	return 0;
 }

-
-int bpf_get_link_xdp_id(int ifindex, __u32 *prog_id, __u32 flags)
-{
-	return bpf_xdp_query_id(ifindex, flags, prog_id);
-}

 typedef int (*qdisc_config_t)(struct libbpf_nla_req *req);

+362 -4
tools/lib/bpf/relo_core.c
···
 	case BPF_CORE_TYPE_ID_LOCAL: return "local_type_id";
 	case BPF_CORE_TYPE_ID_TARGET: return "target_type_id";
 	case BPF_CORE_TYPE_EXISTS: return "type_exists";
+	case BPF_CORE_TYPE_MATCHES: return "type_matches";
 	case BPF_CORE_TYPE_SIZE: return "type_size";
 	case BPF_CORE_ENUMVAL_EXISTS: return "enumval_exists";
 	case BPF_CORE_ENUMVAL_VALUE: return "enumval_value";
···
 	case BPF_CORE_TYPE_ID_LOCAL:
 	case BPF_CORE_TYPE_ID_TARGET:
 	case BPF_CORE_TYPE_EXISTS:
+	case BPF_CORE_TYPE_MATCHES:
 	case BPF_CORE_TYPE_SIZE:
 		return true;
 	default:
···
 		return true;
 	default:
 		return false;
+	}
+}
+
+int __bpf_core_types_are_compat(const struct btf *local_btf, __u32 local_id,
+				const struct btf *targ_btf, __u32 targ_id, int level)
+{
+	const struct btf_type *local_type, *targ_type;
+	int depth = 32; /* max recursion depth */
+
+	/* caller made sure that names match (ignoring flavor suffix) */
+	local_type = btf_type_by_id(local_btf, local_id);
+	targ_type = btf_type_by_id(targ_btf, targ_id);
+	if (!btf_kind_core_compat(local_type, targ_type))
+		return 0;
+
+recur:
+	depth--;
+	if (depth < 0)
+		return -EINVAL;
+
+	local_type = skip_mods_and_typedefs(local_btf, local_id, &local_id);
+	targ_type = skip_mods_and_typedefs(targ_btf, targ_id, &targ_id);
+	if (!local_type || !targ_type)
+		return -EINVAL;
+
+	if (!btf_kind_core_compat(local_type, targ_type))
+		return 0;
+
+	switch (btf_kind(local_type)) {
+	case BTF_KIND_UNKN:
+	case BTF_KIND_STRUCT:
+	case BTF_KIND_UNION:
+	case BTF_KIND_ENUM:
+	case BTF_KIND_FWD:
+	case BTF_KIND_ENUM64:
+		return 1;
+	case BTF_KIND_INT:
+		/* just reject deprecated bitfield-like integers; all other
+		 * integers are by default compatible between each other
+		 */
+		return btf_int_offset(local_type) == 0 && btf_int_offset(targ_type) == 0;
+	case BTF_KIND_PTR:
+		local_id = local_type->type;
+		targ_id = targ_type->type;
+		goto recur;
+	case BTF_KIND_ARRAY:
+		local_id = btf_array(local_type)->type;
+		targ_id = btf_array(targ_type)->type;
+		goto recur;
+	case BTF_KIND_FUNC_PROTO: {
+		struct btf_param *local_p = btf_params(local_type);
+		struct btf_param *targ_p = btf_params(targ_type);
+		__u16 local_vlen = btf_vlen(local_type);
+		__u16 targ_vlen = btf_vlen(targ_type);
+		int i, err;
+
+		if (local_vlen != targ_vlen)
+			return 0;
+
+		for (i = 0; i < local_vlen; i++, local_p++, targ_p++) {
+			if (level <= 0)
+				return -EINVAL;
+
+			skip_mods_and_typedefs(local_btf, local_p->type, &local_id);
+			skip_mods_and_typedefs(targ_btf, targ_p->type, &targ_id);
+			err = __bpf_core_types_are_compat(local_btf, local_id, targ_btf, targ_id,
+							  level - 1);
+			if (err <= 0)
+				return err;
+		}
+
+		/* tail recurse for return type check */
+		skip_mods_and_typedefs(local_btf, local_type->type, &local_id);
+		skip_mods_and_typedefs(targ_btf, targ_type->type, &targ_id);
+		goto recur;
+	}
+	default:
+		pr_warn("unexpected kind %s relocated, local [%d], target [%d]\n",
+			btf_kind_str(local_type), local_id, targ_id);
+		return 0;
 	}
 }

···
  * - field 'a' access (corresponds to '2' in low-level spec);
  * - array element #3 access (corresponds to '3' in low-level spec).
  *
- * Type-based relocations (TYPE_EXISTS/TYPE_SIZE,
+ * Type-based relocations (TYPE_EXISTS/TYPE_MATCHES/TYPE_SIZE,
  * TYPE_ID_LOCAL/TYPE_ID_TARGET) don't capture any field information. Their
  * spec and raw_spec are kept empty.
  *
···
 	targ_spec->relo_kind = local_spec->relo_kind;

 	if (core_relo_is_type_based(local_spec->relo_kind)) {
-		return bpf_core_types_are_compat(local_spec->btf,
-						 local_spec->root_type_id,
-						 targ_btf, targ_id);
+		if (local_spec->relo_kind == BPF_CORE_TYPE_MATCHES)
+			return bpf_core_types_match(local_spec->btf,
+						    local_spec->root_type_id,
+						    targ_btf, targ_id);
+		else
+			return bpf_core_types_are_compat(local_spec->btf,
+							 local_spec->root_type_id,
+							 targ_btf, targ_id);
 	}

 	local_acc = &local_spec->spec[0];
···
 		*validate = false;
 		break;
 	case BPF_CORE_TYPE_EXISTS:
+	case BPF_CORE_TYPE_MATCHES:
 		*val = 1;
 		break;
 	case BPF_CORE_TYPE_SIZE:
···
 	}

 	return 0;
+}
+
+static bool bpf_core_names_match(const struct btf *local_btf, size_t local_name_off,
+				 const struct btf *targ_btf, size_t targ_name_off)
+{
+	const char *local_n, *targ_n;
+	size_t local_len, targ_len;
+
+	local_n = btf__name_by_offset(local_btf, local_name_off);
+	targ_n = btf__name_by_offset(targ_btf, targ_name_off);
+
+	if (str_is_empty(targ_n))
+		return str_is_empty(local_n);
+
+	targ_len = bpf_core_essential_name_len(targ_n);
+	local_len = bpf_core_essential_name_len(local_n);
+
+	return targ_len == local_len && strncmp(local_n, targ_n, local_len) == 0;
+}
+
+static int bpf_core_enums_match(const struct btf *local_btf, const struct btf_type *local_t,
+				const struct btf *targ_btf, const struct btf_type *targ_t)
+{
+	__u16 local_vlen = btf_vlen(local_t);
+	__u16 targ_vlen = btf_vlen(targ_t);
+	int i, j;
+
+	if (local_t->size != targ_t->size)
+		return 0;
+
+	if (local_vlen > targ_vlen)
+		return 0;
+
+	/* iterate over the local enum's variants and make sure each has
+	 * a symbolic name correspondent in the target
+	 */
+	for (i = 0; i < local_vlen; i++) {
+		bool matched = false;
+		__u32 local_n_off, targ_n_off;
+
+		local_n_off = btf_is_enum(local_t) ? btf_enum(local_t)[i].name_off :
+						     btf_enum64(local_t)[i].name_off;
+
+		for (j = 0; j < targ_vlen; j++) {
+			targ_n_off = btf_is_enum(targ_t) ? btf_enum(targ_t)[j].name_off :
+							   btf_enum64(targ_t)[j].name_off;
+
+			if (bpf_core_names_match(local_btf, local_n_off, targ_btf, targ_n_off)) {
+				matched = true;
+				break;
+			}
+		}
+
+		if (!matched)
+			return 0;
+	}
+	return 1;
+}
+
+static int bpf_core_composites_match(const struct btf *local_btf, const struct btf_type *local_t,
+				     const struct btf *targ_btf, const struct btf_type *targ_t,
+				     bool behind_ptr, int level)
+{
+	const struct btf_member *local_m = btf_members(local_t);
+	__u16 local_vlen = btf_vlen(local_t);
+	__u16 targ_vlen = btf_vlen(targ_t);
+	int i, j, err;
+
+	if (local_vlen > targ_vlen)
+		return 0;
+
+	/* check that all local members have a match in the target */
+	for (i = 0; i < local_vlen; i++, local_m++) {
+		const struct btf_member *targ_m = btf_members(targ_t);
+		bool matched = false;
+
+		for (j = 0; j < targ_vlen; j++, targ_m++) {
+			if (!bpf_core_names_match(local_btf, local_m->name_off,
+						  targ_btf, targ_m->name_off))
+				continue;
+
+			err = __bpf_core_types_match(local_btf, local_m->type, targ_btf,
+						     targ_m->type, behind_ptr, level - 1);
+			if (err < 0)
+				return err;
+			if (err > 0) {
+				matched = true;
+				break;
+			}
+		}
+
+		if (!matched)
+			return 0;
+	}
+	return 1;
+}
+
+/* Check that two types "match". This function assumes that root types were
+ * already checked for name match.
+ *
+ * The matching relation is defined as follows:
+ * - modifiers and typedefs are stripped (and, hence, effectively ignored)
+ * - generally speaking types need to be of same kind (struct vs. struct, union
+ *   vs. union, etc.)
+ *   - exceptions are struct/union behind a pointer which could also match a
+ *     forward declaration of a struct or union, respectively, and enum vs.
+ *     enum64 (see below)
+ * Then, depending on type:
+ * - integers:
+ *   - match if size and signedness match
+ * - arrays & pointers:
+ *   - target types are recursively matched
+ * - structs & unions:
+ *   - local members need to exist in target with the same name
+ *   - for each member we recursively check match unless it is already behind a
+ *     pointer, in which case we only check matching names and compatible kind
+ * - enums:
+ *   - local variants have to have a match in target by symbolic name (but not
+ *     numeric value)
+ *   - size has to match (but enum may match enum64 and vice versa)
+ * - function pointers:
+ *   - number and position of arguments in local type has to match target
+ *   - for each argument and the return value we recursively check match
+ */
+int __bpf_core_types_match(const struct btf *local_btf, __u32 local_id, const struct btf *targ_btf,
+			   __u32 targ_id, bool behind_ptr, int level)
+{
+	const struct btf_type *local_t, *targ_t;
+	int depth = 32; /* max recursion depth */
+	__u16 local_k, targ_k;
+
+	if (level <= 0)
+		return -EINVAL;
+
+	local_t = btf_type_by_id(local_btf, local_id);
+	targ_t = btf_type_by_id(targ_btf, targ_id);
+
+recur:
+	depth--;
+	if (depth < 0)
+		return -EINVAL;
+
+	local_t = skip_mods_and_typedefs(local_btf, local_id, &local_id);
+	targ_t = skip_mods_and_typedefs(targ_btf, targ_id, &targ_id);
+	if (!local_t || !targ_t)
+		return -EINVAL;
+
+	/* While the name check happens after typedefs are skipped, root-level
+	 * typedefs would still be name-matched as that's the contract with
+	 * callers.
+	 */
+	if (!bpf_core_names_match(local_btf, local_t->name_off, targ_btf, targ_t->name_off))
+		return 0;
+
+	local_k = btf_kind(local_t);
+	targ_k = btf_kind(targ_t);
+
+	switch (local_k) {
+	case BTF_KIND_UNKN:
+		return local_k == targ_k;
+	case BTF_KIND_FWD: {
+		bool local_f = BTF_INFO_KFLAG(local_t->info);
+
+		if (behind_ptr) {
+			if (local_k == targ_k)
+				return local_f == BTF_INFO_KFLAG(targ_t->info);
+
+			/* for forward declarations kflag dictates whether the
+			 * target is a struct (0) or union (1)
+			 */
+			return (targ_k == BTF_KIND_STRUCT && !local_f) ||
+			       (targ_k == BTF_KIND_UNION && local_f);
+		} else {
+			if (local_k != targ_k)
+				return 0;
+
+			/* match if the forward declaration is for the same kind */
+			return local_f == BTF_INFO_KFLAG(targ_t->info);
+		}
+	}
+	case BTF_KIND_ENUM:
+	case BTF_KIND_ENUM64:
+		if (!btf_is_any_enum(targ_t))
+			return 0;
+
+		return bpf_core_enums_match(local_btf, local_t, targ_btf, targ_t);
+	case BTF_KIND_STRUCT:
+	case BTF_KIND_UNION:
+		if (behind_ptr) {
+			bool targ_f = BTF_INFO_KFLAG(targ_t->info);
+
+			if (local_k == targ_k)
+				return 1;
+
+			if (targ_k != BTF_KIND_FWD)
+				return 0;
+
+			return (local_k == BTF_KIND_UNION) == targ_f;
+		} else {
+			if (local_k != targ_k)
+				return 0;
+
+			return bpf_core_composites_match(local_btf, local_t, targ_btf, targ_t,
+							 behind_ptr, level);
+		}
+	case BTF_KIND_INT: {
+		__u8 local_sgn;
+		__u8 targ_sgn;
+
+		if (local_k != targ_k)
+			return 0;
+
+		local_sgn = btf_int_encoding(local_t) & BTF_INT_SIGNED;
+		targ_sgn = btf_int_encoding(targ_t) & BTF_INT_SIGNED;
+
+		return local_t->size == targ_t->size && local_sgn == targ_sgn;
+	}
+	case BTF_KIND_PTR:
+		if (local_k != targ_k)
+			return 0;
+
+		behind_ptr = true;
+
+		local_id = local_t->type;
+		targ_id = targ_t->type;
+		goto recur;
+	case BTF_KIND_ARRAY: {
+		const struct btf_array *local_array = btf_array(local_t);
+		const struct btf_array *targ_array = btf_array(targ_t);
+
+		if (local_k != targ_k)
+			return 0;
+
+		if (local_array->nelems != targ_array->nelems)
+			return 0;
+
+		local_id = local_array->type;
+		targ_id = targ_array->type;
+		goto recur;
+	}
+	case BTF_KIND_FUNC_PROTO: {
+		struct btf_param *local_p = btf_params(local_t);
+		struct btf_param *targ_p = btf_params(targ_t);
+		__u16 local_vlen = btf_vlen(local_t);
+		__u16 targ_vlen = btf_vlen(targ_t);
+		int i, err;
+
+		if (local_k != targ_k)
+			return 0;
+
+		if (local_vlen != targ_vlen)
+			return 0;
+
+		for (i = 0; i < local_vlen; i++, local_p++, targ_p++) {
+			err = __bpf_core_types_match(local_btf, local_p->type, targ_btf,
+						     targ_p->type, behind_ptr, level - 1);
+			if (err <= 0)
+				return err;
+		}
+
+		/* tail recurse for return type check */
+		local_id = local_t->type;
+		targ_id = targ_t->type;
+		goto recur;
+	}
+	default:
+		pr_warn("unexpected kind %s relocated, local [%d], target [%d]\n",
+			btf_kind_str(local_t), local_id, targ_id);
+		return 0;
+	}
 }
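Two of the matching rules in the hunk above lend themselves to a small standalone illustration: names are compared with any `___flavor` suffix ignored, and an enum matches when every local variant name also exists in the target. The sketch below works on plain strings instead of BTF and makes simplifying assumptions: all `demo_` names are hypothetical, the suffix detection simply cuts at the first `___` (the library's own helper is stricter about what counts as a separator), and the enum size check from the real code is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Length of a name with a trailing "___flavor" suffix removed.
 * Simplified rule: cut at the first "___" that is not at the very start.
 */
size_t demo_essential_len(const char *name)
{
	const char *sep = strstr(name, "___");

	return sep && sep != name ? (size_t)(sep - name) : strlen(name);
}

/* Two names match when their essential parts are identical; an anonymous
 * (empty) target name only matches an anonymous local name.
 */
bool demo_names_match(const char *local, const char *targ)
{
	size_t local_len, targ_len;

	assert(local && targ);
	if (targ[0] == '\0')
		return local[0] == '\0';

	local_len = demo_essential_len(local);
	targ_len = demo_essential_len(targ);
	return local_len == targ_len && strncmp(local, targ, local_len) == 0;
}

/* Every local variant needs a same-named variant in the target. The target
 * may carry extra variants, and numeric values are not compared.
 */
bool demo_enum_variants_match(const char *const *local, size_t local_cnt,
			      const char *const *targ, size_t targ_cnt)
{
	size_t i, j;

	if (local_cnt > targ_cnt)
		return false;

	for (i = 0; i < local_cnt; i++) {
		bool matched = false;

		for (j = 0; j < targ_cnt && !matched; j++)
			matched = demo_names_match(local[i], targ[j]);
		if (!matched)
			return false;
	}
	return true;
}
```

The relation is deliberately asymmetric: a local definition that lists only the variants a program cares about can match a larger target definition, but not the other way around.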
+6
tools/lib/bpf/relo_core.h
···
 	__u32 new_type_id;
 };

+int __bpf_core_types_are_compat(const struct btf *local_btf, __u32 local_id,
+				const struct btf *targ_btf, __u32 targ_id, int level);
 int bpf_core_types_are_compat(const struct btf *local_btf, __u32 local_id,
 			      const struct btf *targ_btf, __u32 targ_id);
+int __bpf_core_types_match(const struct btf *local_btf, __u32 local_id, const struct btf *targ_btf,
+			   __u32 targ_id, bool behind_ptr, int level);
+int bpf_core_types_match(const struct btf *local_btf, __u32 local_id, const struct btf *targ_btf,
+			 __u32 targ_id);

 size_t bpf_core_essential_name_len(const char *name);

+2 -4
tools/lib/bpf/usdt.c
···
 	 *
 	 * [0] https://sourceware.org/systemtap/wiki/UserSpaceProbeImplementation
 	 */
-	usdt_rel_ip = usdt_abs_ip = note.loc_addr;
-	if (base_addr) {
+	usdt_abs_ip = note.loc_addr;
+	if (base_addr)
 		usdt_abs_ip += base_addr - note.base_addr;
-		usdt_rel_ip += base_addr - note.base_addr;
-	}

 	/* When attaching uprobes (which is what USDTs basically are)
 	 * kernel expects file offset to be specified, not a relative
+50 -42
tools/lib/bpf/xsk.c → tools/testing/selftests/bpf/xsk.c
··· 30 30 #include <sys/types.h> 31 31 #include <linux/if_link.h> 32 32 33 - #include "bpf.h" 34 - #include "libbpf.h" 35 - #include "libbpf_internal.h" 33 + #include <bpf/bpf.h> 34 + #include <bpf/libbpf.h> 36 35 #include "xsk.h" 37 - 38 - /* entire xsk.h and xsk.c is going away in libbpf 1.0, so ignore all internal 39 - * uses of deprecated APIs 40 - */ 41 - #pragma GCC diagnostic ignored "-Wdeprecated-declarations" 42 36 43 37 #ifndef SOL_XDP 44 38 #define SOL_XDP 283 ··· 45 51 #ifndef PF_XDP 46 52 #define PF_XDP AF_XDP 47 53 #endif 54 + 55 + #define pr_warn(fmt, ...) fprintf(stderr, fmt, ##__VA_ARGS__) 48 56 49 57 enum xsk_prog { 50 58 XSK_PROG_FALLBACK, ··· 282 286 return err; 283 287 } 284 288 285 - DEFAULT_VERSION(xsk_umem__create_v0_0_4, xsk_umem__create, LIBBPF_0.0.4) 286 - int xsk_umem__create_v0_0_4(struct xsk_umem **umem_ptr, void *umem_area, 287 - __u64 size, struct xsk_ring_prod *fill, 288 - struct xsk_ring_cons *comp, 289 - const struct xsk_umem_config *usr_config) 289 + int xsk_umem__create(struct xsk_umem **umem_ptr, void *umem_area, 290 + __u64 size, struct xsk_ring_prod *fill, 291 + struct xsk_ring_cons *comp, 292 + const struct xsk_umem_config *usr_config) 290 293 { 291 294 struct xdp_umem_reg mr; 292 295 struct xsk_umem *umem; ··· 346 351 __u32 frame_headroom; 347 352 }; 348 353 349 - COMPAT_VERSION(xsk_umem__create_v0_0_2, xsk_umem__create, LIBBPF_0.0.2) 350 - int xsk_umem__create_v0_0_2(struct xsk_umem **umem_ptr, void *umem_area, 351 - __u64 size, struct xsk_ring_prod *fill, 352 - struct xsk_ring_cons *comp, 353 - const struct xsk_umem_config *usr_config) 354 - { 355 - struct xsk_umem_config config; 356 - 357 - memcpy(&config, usr_config, sizeof(struct xsk_umem_config_v1)); 358 - config.flags = 0; 359 - 360 - return xsk_umem__create_v0_0_4(umem_ptr, umem_area, size, fill, comp, 361 - &config); 362 - } 363 - 364 354 static enum xsk_prog get_xsk_prog(void) 365 355 { 366 356 enum xsk_prog detected = XSK_PROG_FALLBACK; 367 - __u32 size_out, 
retval, duration; 368 357 char data_in = 0, data_out; 369 358 struct bpf_insn insns[] = { 370 359 BPF_LD_MAP_FD(BPF_REG_1, 0), ··· 357 378 BPF_EMIT_CALL(BPF_FUNC_redirect_map), 358 379 BPF_EXIT_INSN(), 359 380 }; 381 + LIBBPF_OPTS(bpf_test_run_opts, opts, 382 + .data_in = &data_in, 383 + .data_size_in = 1, 384 + .data_out = &data_out, 385 + ); 386 + 360 387 int prog_fd, map_fd, ret, insn_cnt = ARRAY_SIZE(insns); 361 388 362 389 map_fd = bpf_map_create(BPF_MAP_TYPE_XSKMAP, NULL, sizeof(int), sizeof(int), 1, NULL); ··· 377 392 return detected; 378 393 } 379 394 380 - ret = bpf_prog_test_run(prog_fd, 0, &data_in, 1, &data_out, &size_out, &retval, &duration); 381 - if (!ret && retval == XDP_PASS) 395 + ret = bpf_prog_test_run_opts(prog_fd, &opts); 396 + if (!ret && opts.retval == XDP_PASS) 382 397 detected = XSK_PROG_REDIRECT_FLAGS; 383 398 close(prog_fd); 384 399 close(map_fd); ··· 495 510 int link_fd; 496 511 int err; 497 512 498 - err = bpf_get_link_xdp_id(ctx->ifindex, &prog_id, xsk->config.xdp_flags); 513 + err = bpf_xdp_query_id(ctx->ifindex, xsk->config.xdp_flags, &prog_id); 499 514 if (err) { 500 515 pr_warn("getting XDP prog id failed\n"); 501 516 return err; ··· 519 534 520 535 ctx->link_fd = link_fd; 521 536 return 0; 537 + } 538 + 539 + /* Copy up to sz - 1 bytes from zero-terminated src string and ensure that dst 540 + * is zero-terminated string no matter what (unless sz == 0, in which case 541 + * it's a no-op). It's conceptually close to FreeBSD's strlcpy(), but differs 542 + * in what is returned. Given this is internal helper, it's trivial to extend 543 + * this, when necessary. Use this instead of strncpy inside libbpf source code. 
544 + */ 545 + static inline void libbpf_strlcpy(char *dst, const char *src, size_t sz) 546 + { 547 + size_t i; 548 + 549 + if (sz == 0) 550 + return; 551 + 552 + sz--; 553 + for (i = 0; i < sz && src[i]; i++) 554 + dst[i] = src[i]; 555 + dst[i] = '\0'; 522 556 } 523 557 524 558 static int xsk_get_max_queues(struct xsk_socket *xsk) ··· 796 792 if (ctx->has_bpf_link) 797 793 err = xsk_create_bpf_link(xsk); 798 794 else 799 - err = bpf_set_link_xdp_fd(xsk->ctx->ifindex, ctx->prog_fd, 800 - xsk->config.xdp_flags); 795 + err = bpf_xdp_attach(xsk->ctx->ifindex, ctx->prog_fd, 796 + xsk->config.xdp_flags, NULL); 801 797 802 798 if (err) 803 799 goto err_attach_xdp_prog; ··· 815 811 if (ctx->has_bpf_link) 816 812 close(ctx->link_fd); 817 813 else 818 - bpf_set_link_xdp_fd(ctx->ifindex, -1, 0); 814 + bpf_xdp_detach(ctx->ifindex, 0, NULL); 819 815 err_attach_xdp_prog: 820 816 close(ctx->prog_fd); 821 817 err_load_xdp_prog: ··· 866 862 if (ctx->has_bpf_link) 867 863 err = xsk_link_lookup(ctx->ifindex, &prog_id, &ctx->link_fd); 868 864 else 869 - err = bpf_get_link_xdp_id(ctx->ifindex, &prog_id, xsk->config.xdp_flags); 865 + err = bpf_xdp_query_id(ctx->ifindex, xsk->config.xdp_flags, &prog_id); 870 866 871 867 if (err) 872 868 return err; ··· 878 874 *xsks_map_fd = ctx->xsks_map_fd; 879 875 880 876 return err; 877 + } 878 + 879 + int xsk_setup_xdp_prog_xsk(struct xsk_socket *xsk, int *xsks_map_fd) 880 + { 881 + return __xsk_setup_xdp_prog(xsk, xsks_map_fd); 881 882 } 882 883 883 884 static struct xsk_ctx *xsk_get_ctx(struct xsk_umem *umem, int ifindex, ··· 963 954 ctx->fill = fill; 964 955 ctx->comp = comp; 965 956 list_add(&ctx->list, &umem->ctx_list); 957 + ctx->has_bpf_link = xsk_probe_bpf_link(); 966 958 return ctx; 967 959 } 968 960 ··· 1065 1055 } 1066 1056 } 1067 1057 xsk->ctx = ctx; 1068 - xsk->ctx->has_bpf_link = xsk_probe_bpf_link(); 1069 1058 1070 1059 if (rx && !rx_setup_done) { 1071 1060 err = setsockopt(xsk->fd, SOL_XDP, XDP_RX_RING, ··· 1156 1147 goto 
out_mmap_tx; 1157 1148 } 1158 1149 1159 - ctx->prog_fd = -1; 1160 - 1161 1150 if (!(xsk->config.libbpf_flags & XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD)) { 1162 1151 err = __xsk_setup_xdp_prog(xsk, NULL); 1163 1152 if (err) ··· 1236 1229 1237 1230 ctx = xsk->ctx; 1238 1231 umem = ctx->umem; 1239 - if (ctx->prog_fd != -1) { 1232 + 1233 + xsk_put_ctx(ctx, true); 1234 + 1235 + if (!ctx->refcount) { 1240 1236 xsk_delete_bpf_maps(xsk); 1241 1237 close(ctx->prog_fd); 1242 1238 if (ctx->has_bpf_link) ··· 1257 1247 off.tx.desc + xsk->config.tx_size * desc_sz); 1258 1248 } 1259 1249 } 1260 - 1261 - xsk_put_ctx(ctx, true); 1262 1250 1263 1251 umem->refcount--; 1264 1252 /* Do not close an fd that also has an associated umem connected
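Note on the `libbpf_strlcpy()` helper added in the hunk above: its comment says it copies at most `sz - 1` bytes and always zero-terminates unless `sz == 0`. A minimal sketch of that behavior follows, assuming a local copy renamed `demo_strlcpy` (hypothetical name, not a libbpf symbol). `demo_would_truncate` is likewise an illustration-only helper, added because the original returns nothing.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Local copy of the helper from the hunk above, renamed (hypothetical name)
 * so it cannot clash with the real libbpf-internal function.
 */
static inline void demo_strlcpy(char *dst, const char *src, size_t sz)
{
	size_t i;

	if (sz == 0)
		return;		/* no room even for the terminator: no-op */

	assert(dst && src);	/* illustration-only precondition */

	sz--;			/* reserve one byte for the trailing '\0' */
	for (i = 0; i < sz && src[i]; i++)
		dst[i] = src[i];
	dst[i] = '\0';
}

/* Illustration-only: returns 1 if src does not fit into sz bytes
 * (including the terminator), 0 otherwise.
 */
static inline int demo_would_truncate(const char *src, size_t sz)
{
	return sz == 0 || strlen(src) >= sz;
}
```

Unlike `strncpy()`, the destination is always a valid C string afterwards, which is presumably why the comment recommends it inside libbpf sources.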
+5 -25
tools/lib/bpf/xsk.h → tools/testing/selftests/bpf/xsk.h
···
  * Author(s): Magnus Karlsson <magnus.karlsson@intel.com>
  */
 
-#ifndef __LIBBPF_XSK_H
-#define __LIBBPF_XSK_H
+#ifndef __XSK_H
+#define __XSK_H
 
 #include <stdio.h>
 #include <stdint.h>
 #include <stdbool.h>
 #include <linux/if_xdp.h>
 
-#include "libbpf.h"
+#include <bpf/libbpf.h>
 
 #ifdef __cplusplus
 extern "C" {
···
 	return xsk_umem__extract_addr(addr) + xsk_umem__extract_offset(addr);
 }
 
-LIBBPF_API LIBBPF_DEPRECATED_SINCE(0, 7, "AF_XDP support deprecated and moved to libxdp")
 int xsk_umem__fd(const struct xsk_umem *umem);
-LIBBPF_API LIBBPF_DEPRECATED_SINCE(0, 7, "AF_XDP support deprecated and moved to libxdp")
 int xsk_socket__fd(const struct xsk_socket *xsk);
 
 #define XSK_RING_CONS__DEFAULT_NUM_DESCS 2048
···
 	__u32 flags;
 };
 
-LIBBPF_API LIBBPF_DEPRECATED_SINCE(0, 7, "AF_XDP support deprecated and moved to libxdp")
+int xsk_setup_xdp_prog_xsk(struct xsk_socket *xsk, int *xsks_map_fd);
 int xsk_setup_xdp_prog(int ifindex, int *xsks_map_fd);
-LIBBPF_API LIBBPF_DEPRECATED_SINCE(0, 7, "AF_XDP support deprecated and moved to libxdp")
 int xsk_socket__update_xskmap(struct xsk_socket *xsk, int xsks_map_fd);
 
 /* Flags for the libbpf_flags field. */
···
 };
 
 /* Set config to NULL to get the default configuration. */
-LIBBPF_API LIBBPF_DEPRECATED_SINCE(0, 7, "AF_XDP support deprecated and moved to libxdp")
 int xsk_umem__create(struct xsk_umem **umem,
 		     void *umem_area, __u64 size,
 		     struct xsk_ring_prod *fill,
 		     struct xsk_ring_cons *comp,
 		     const struct xsk_umem_config *config);
-LIBBPF_API LIBBPF_DEPRECATED_SINCE(0, 7, "AF_XDP support deprecated and moved to libxdp")
-int xsk_umem__create_v0_0_2(struct xsk_umem **umem,
-			    void *umem_area, __u64 size,
-			    struct xsk_ring_prod *fill,
-			    struct xsk_ring_cons *comp,
-			    const struct xsk_umem_config *config);
-LIBBPF_API LIBBPF_DEPRECATED_SINCE(0, 7, "AF_XDP support deprecated and moved to libxdp")
-int xsk_umem__create_v0_0_4(struct xsk_umem **umem,
-			    void *umem_area, __u64 size,
-			    struct xsk_ring_prod *fill,
-			    struct xsk_ring_cons *comp,
-			    const struct xsk_umem_config *config);
-LIBBPF_API LIBBPF_DEPRECATED_SINCE(0, 7, "AF_XDP support deprecated and moved to libxdp")
 int xsk_socket__create(struct xsk_socket **xsk,
 		       const char *ifname, __u32 queue_id,
 		       struct xsk_umem *umem,
 		       struct xsk_ring_cons *rx,
 		       struct xsk_ring_prod *tx,
 		       const struct xsk_socket_config *config);
-LIBBPF_API LIBBPF_DEPRECATED_SINCE(0, 7, "AF_XDP support deprecated and moved to libxdp")
 int xsk_socket__create_shared(struct xsk_socket **xsk_ptr,
 			      const char *ifname,
 			      __u32 queue_id, struct xsk_umem *umem,
···
 			      const struct xsk_socket_config *config);
 
 /* Returns 0 for success and -EBUSY if the umem is still in use. */
-LIBBPF_API LIBBPF_DEPRECATED_SINCE(0, 7, "AF_XDP support deprecated and moved to libxdp")
 int xsk_umem__delete(struct xsk_umem *umem);
-LIBBPF_API LIBBPF_DEPRECATED_SINCE(0, 7, "AF_XDP support deprecated and moved to libxdp")
 void xsk_socket__delete(struct xsk_socket *xsk);
 
 #ifdef __cplusplus
 } /* extern "C" */
 #endif
 
-#endif /* __LIBBPF_XSK_H */
+#endif /* __XSK_H */
+175 -29
tools/perf/util/bpf-loader.c
··· 9 9 #include <linux/bpf.h> 10 10 #include <bpf/libbpf.h> 11 11 #include <bpf/bpf.h> 12 + #include <linux/filter.h> 12 13 #include <linux/err.h> 13 14 #include <linux/kernel.h> 14 15 #include <linux/string.h> ··· 50 49 struct bpf_insn *insns_buf; 51 50 int nr_types; 52 51 int *type_mapping; 52 + int *prologue_fds; 53 53 }; 54 54 55 55 struct bpf_perf_object { 56 56 struct list_head list; 57 57 struct bpf_object *obj; 58 + }; 59 + 60 + struct bpf_preproc_result { 61 + struct bpf_insn *new_insn_ptr; 62 + int new_insn_cnt; 58 63 }; 59 64 60 65 static LIST_HEAD(bpf_objects_list); ··· 93 86 (perf_obj) = (tmp), (tmp) = bpf_perf_object__next(tmp)) 94 87 95 88 static bool libbpf_initialized; 89 + static int libbpf_sec_handler; 96 90 97 91 static int bpf_perf_object__add(struct bpf_object *obj) 98 92 { ··· 107 99 return perf_obj ? 0 : -ENOMEM; 108 100 } 109 101 102 + static void *program_priv(const struct bpf_program *prog) 103 + { 104 + void *priv; 105 + 106 + if (IS_ERR_OR_NULL(bpf_program_hash)) 107 + return NULL; 108 + if (!hashmap__find(bpf_program_hash, prog, &priv)) 109 + return NULL; 110 + return priv; 111 + } 112 + 113 + static struct bpf_insn prologue_init_insn[] = { 114 + BPF_MOV64_IMM(BPF_REG_2, 0), 115 + BPF_MOV64_IMM(BPF_REG_3, 0), 116 + BPF_MOV64_IMM(BPF_REG_4, 0), 117 + BPF_MOV64_IMM(BPF_REG_5, 0), 118 + }; 119 + 120 + static int libbpf_prog_prepare_load_fn(struct bpf_program *prog, 121 + struct bpf_prog_load_opts *opts __maybe_unused, 122 + long cookie __maybe_unused) 123 + { 124 + size_t init_size_cnt = ARRAY_SIZE(prologue_init_insn); 125 + size_t orig_insn_cnt, insn_cnt, init_size, orig_size; 126 + struct bpf_prog_priv *priv = program_priv(prog); 127 + const struct bpf_insn *orig_insn; 128 + struct bpf_insn *insn; 129 + 130 + if (IS_ERR_OR_NULL(priv)) { 131 + pr_debug("bpf: failed to get private field\n"); 132 + return -BPF_LOADER_ERRNO__INTERNAL; 133 + } 134 + 135 + if (!priv->need_prologue) 136 + return 0; 137 + 138 + /* prepend initialization code 
to program instructions */ 139 + orig_insn = bpf_program__insns(prog); 140 + orig_insn_cnt = bpf_program__insn_cnt(prog); 141 + init_size = init_size_cnt * sizeof(*insn); 142 + orig_size = orig_insn_cnt * sizeof(*insn); 143 + 144 + insn_cnt = orig_insn_cnt + init_size_cnt; 145 + insn = malloc(insn_cnt * sizeof(*insn)); 146 + if (!insn) 147 + return -ENOMEM; 148 + 149 + memcpy(insn, prologue_init_insn, init_size); 150 + memcpy((char *) insn + init_size, orig_insn, orig_size); 151 + bpf_program__set_insns(prog, insn, insn_cnt); 152 + return 0; 153 + } 154 + 110 155 static int libbpf_init(void) 111 156 { 157 + LIBBPF_OPTS(libbpf_prog_handler_opts, handler_opts, 158 + .prog_prepare_load_fn = libbpf_prog_prepare_load_fn, 159 + ); 160 + 112 161 if (libbpf_initialized) 113 162 return 0; 114 163 115 164 libbpf_set_print(libbpf_perf_print); 165 + libbpf_sec_handler = libbpf_register_prog_handler(NULL, BPF_PROG_TYPE_KPROBE, 166 + 0, &handler_opts); 167 + if (libbpf_sec_handler < 0) { 168 + pr_debug("bpf: failed to register libbpf section handler: %d\n", 169 + libbpf_sec_handler); 170 + return -BPF_LOADER_ERRNO__INTERNAL; 171 + } 116 172 libbpf_initialized = true; 117 173 return 0; 118 174 } ··· 260 188 return obj; 261 189 } 262 190 191 + static void close_prologue_programs(struct bpf_prog_priv *priv) 192 + { 193 + struct perf_probe_event *pev; 194 + int i, fd; 195 + 196 + if (!priv->need_prologue) 197 + return; 198 + pev = &priv->pev; 199 + for (i = 0; i < pev->ntevs; i++) { 200 + fd = priv->prologue_fds[i]; 201 + if (fd != -1) 202 + close(fd); 203 + } 204 + } 205 + 263 206 static void 264 207 clear_prog_priv(const struct bpf_program *prog __maybe_unused, 265 208 void *_priv) 266 209 { 267 210 struct bpf_prog_priv *priv = _priv; 268 211 212 + close_prologue_programs(priv); 269 213 cleanup_perf_probe_events(&priv->pev, 1); 270 214 zfree(&priv->insns_buf); 215 + zfree(&priv->prologue_fds); 271 216 zfree(&priv->type_mapping); 272 217 zfree(&priv->sys_name); 273 218 
zfree(&priv->evt_name); ··· 330 241 void *ctx __maybe_unused) 331 242 { 332 243 return key1 == key2; 333 - } 334 - 335 - static void *program_priv(const struct bpf_program *prog) 336 - { 337 - void *priv; 338 - 339 - if (IS_ERR_OR_NULL(bpf_program_hash)) 340 - return NULL; 341 - if (!hashmap__find(bpf_program_hash, prog, &priv)) 342 - return NULL; 343 - return priv; 344 244 } 345 245 346 246 static int program_set_priv(struct bpf_program *prog, void *priv) ··· 636 558 637 559 static int 638 560 preproc_gen_prologue(struct bpf_program *prog, int n, 639 - struct bpf_insn *orig_insns, int orig_insns_cnt, 640 - struct bpf_prog_prep_result *res) 561 + const struct bpf_insn *orig_insns, int orig_insns_cnt, 562 + struct bpf_preproc_result *res) 641 563 { 642 564 struct bpf_prog_priv *priv = program_priv(prog); 643 565 struct probe_trace_event *tev; ··· 685 607 686 608 res->new_insn_ptr = buf; 687 609 res->new_insn_cnt = prologue_cnt + orig_insns_cnt; 688 - res->pfd = NULL; 689 610 return 0; 690 611 691 612 errout: ··· 792 715 struct bpf_prog_priv *priv = program_priv(prog); 793 716 struct perf_probe_event *pev; 794 717 bool need_prologue = false; 795 - int err, i; 718 + int i; 796 719 797 720 if (IS_ERR_OR_NULL(priv)) { 798 721 pr_debug("Internal error when hook preprocessor\n"); ··· 830 753 return -ENOMEM; 831 754 } 832 755 756 + priv->prologue_fds = malloc(sizeof(int) * pev->ntevs); 757 + if (!priv->prologue_fds) { 758 + pr_debug("Not enough memory: alloc prologue fds failed\n"); 759 + return -ENOMEM; 760 + } 761 + memset(priv->prologue_fds, -1, sizeof(int) * pev->ntevs); 762 + 833 763 priv->type_mapping = malloc(sizeof(int) * pev->ntevs); 834 764 if (!priv->type_mapping) { 835 765 pr_debug("Not enough memory: alloc type_mapping failed\n"); ··· 845 761 memset(priv->type_mapping, -1, 846 762 sizeof(int) * pev->ntevs); 847 763 848 - err = map_prologue(pev, priv->type_mapping, &priv->nr_types); 849 - if (err) 850 - return err; 851 - 852 - err = bpf_program__set_prep(prog, 
priv->nr_types, 853 - preproc_gen_prologue); 854 - return err; 764 + return map_prologue(pev, priv->type_mapping, &priv->nr_types); 855 765 } 856 766 857 767 int bpf__probe(struct bpf_object *obj) ··· 952 874 return ret; 953 875 } 954 876 877 + static int bpf_object__load_prologue(struct bpf_object *obj) 878 + { 879 + int init_cnt = ARRAY_SIZE(prologue_init_insn); 880 + const struct bpf_insn *orig_insns; 881 + struct bpf_preproc_result res; 882 + struct perf_probe_event *pev; 883 + struct bpf_program *prog; 884 + int orig_insns_cnt; 885 + 886 + bpf_object__for_each_program(prog, obj) { 887 + struct bpf_prog_priv *priv = program_priv(prog); 888 + int err, i, fd; 889 + 890 + if (IS_ERR_OR_NULL(priv)) { 891 + pr_debug("bpf: failed to get private field\n"); 892 + return -BPF_LOADER_ERRNO__INTERNAL; 893 + } 894 + 895 + if (!priv->need_prologue) 896 + continue; 897 + 898 + /* 899 + * For each program that needs prologue we do following: 900 + * 901 + * - take its current instructions and use them 902 + * to generate the new code with prologue 903 + * - load new instructions with bpf_prog_load 904 + * and keep the fd in prologue_fds 905 + * - new fd will be used in bpf__foreach_event 906 + * to connect this program with perf evsel 907 + */ 908 + orig_insns = bpf_program__insns(prog); 909 + orig_insns_cnt = bpf_program__insn_cnt(prog); 910 + 911 + pev = &priv->pev; 912 + for (i = 0; i < pev->ntevs; i++) { 913 + /* 914 + * Skipping artificall prologue_init_insn instructions 915 + * (init_cnt), so the prologue can be generated instead 916 + * of them. 
917 + */ 918 + err = preproc_gen_prologue(prog, i, 919 + orig_insns + init_cnt, 920 + orig_insns_cnt - init_cnt, 921 + &res); 922 + if (err) 923 + return err; 924 + 925 + fd = bpf_prog_load(bpf_program__get_type(prog), 926 + bpf_program__name(prog), "GPL", 927 + res.new_insn_ptr, 928 + res.new_insn_cnt, NULL); 929 + if (fd < 0) { 930 + char bf[128]; 931 + 932 + libbpf_strerror(-errno, bf, sizeof(bf)); 933 + pr_debug("bpf: load objects with prologue failed: err=%d: (%s)\n", 934 + -errno, bf); 935 + return -errno; 936 + } 937 + priv->prologue_fds[i] = fd; 938 + } 939 + /* 940 + * We no longer need the original program, 941 + * we can unload it. 942 + */ 943 + bpf_program__unload(prog); 944 + } 945 + return 0; 946 + } 947 + 955 948 int bpf__load(struct bpf_object *obj) 956 949 { 957 950 int err; ··· 1034 885 pr_debug("bpf: load objects failed: err=%d: (%s)\n", err, bf); 1035 886 return err; 1036 887 } 1037 - return 0; 888 + return bpf_object__load_prologue(obj); 1038 889 } 1039 890 1040 891 int bpf__foreach_event(struct bpf_object *obj, ··· 1069 920 for (i = 0; i < pev->ntevs; i++) { 1070 921 tev = &pev->tevs[i]; 1071 922 1072 - if (priv->need_prologue) { 1073 - int type = priv->type_mapping[i]; 1074 - 1075 - fd = bpf_program__nth_fd(prog, type); 1076 - } else { 923 + if (priv->need_prologue) 924 + fd = priv->prologue_fds[i]; 925 + else 1077 926 fd = bpf_program__fd(prog); 1078 - } 1079 927 1080 928 if (fd < 0) { 1081 929 pr_debug("bpf: failed to get file descriptor\n");
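Note on `libbpf_prog_prepare_load_fn()` in the diff above: it prepends a fixed block of placeholder instructions to the program by allocating one buffer and doing two `memcpy()` calls, and `bpf_object__load_prologue()` later skips that block by offsetting the pointer by `init_cnt`. The pattern can be sketched in isolation as below. `struct demo_insn` and `demo_prepend` are hypothetical stand-ins, not perf or libbpf names, and the sketch makes no claim about who owns the buffer in the real code.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct bpf_insn; only the size matters here. */
struct demo_insn {
	unsigned char code;
	int imm;
};

/* Build a new array holding prefix[0..prefix_cnt) followed by
 * orig[0..orig_cnt). Returns NULL on allocation failure or if the result
 * would be empty; otherwise the caller owns the buffer and must free() it.
 */
static struct demo_insn *demo_prepend(const struct demo_insn *prefix,
				      size_t prefix_cnt,
				      const struct demo_insn *orig,
				      size_t orig_cnt,
				      size_t *out_cnt)
{
	size_t total = prefix_cnt + orig_cnt;
	struct demo_insn *buf;

	if (total == 0)
		return NULL;

	buf = malloc(total * sizeof(*buf));
	if (!buf)
		return NULL;

	memcpy(buf, prefix, prefix_cnt * sizeof(*buf));
	/* pointer arithmetic on a typed pointer, so no byte-offset cast needed */
	memcpy(buf + prefix_cnt, orig, orig_cnt * sizeof(*buf));
	*out_cnt = total;
	return buf;
}
```

Skipping the prefix again is then just `buf + prefix_cnt` with a count of `total - prefix_cnt`, which mirrors the `orig_insns + init_cnt` arithmetic in the hunk.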
+1 -1
tools/testing/selftests/bpf/.gitignore
···
 /bench
 *.ko
 *.tmp
-xdpxceiver
+xskxceiver
 xdp_redirect_multi
 xdp_synproxy
+8 -2
tools/testing/selftests/bpf/Makefile
···
 TEST_GEN_PROGS_EXTENDED = test_sock_addr test_skb_cgroup_id_user \
 	flow_dissector_load test_flow_dissector test_tcp_check_syncookie_user \
 	test_lirc_mode2_user xdping test_cpp runqslower bench bpf_testmod.ko \
-	xdpxceiver xdp_redirect_multi xdp_synproxy
+	xskxceiver xdp_redirect_multi xdp_synproxy
 
 TEST_CUSTOM_PROGS = $(OUTPUT)/urandom_read
 
···
 $(OUTPUT)/flow_dissector_load: $(TESTING_HELPERS)
 $(OUTPUT)/test_maps: $(TESTING_HELPERS)
 $(OUTPUT)/test_verifier: $(TESTING_HELPERS) $(CAP_HELPERS)
+$(OUTPUT)/xsk.o: $(BPFOBJ)
+$(OUTPUT)/xskxceiver: $(OUTPUT)/xsk.o
 
 BPFTOOL ?= $(DEFAULT_BPFTOOL)
 $(DEFAULT_BPFTOOL): $(wildcard $(BPFTOOLDIR)/*.[ch] $(BPFTOOLDIR)/Makefile) \
···
 $(OUTPUT)/bench_bpf_loop.o: $(OUTPUT)/bpf_loop_bench.skel.h
 $(OUTPUT)/bench_strncmp.o: $(OUTPUT)/strncmp_bench.skel.h
 $(OUTPUT)/bench_bpf_hashmap_full_update.o: $(OUTPUT)/bpf_hashmap_full_update_bench.skel.h
+$(OUTPUT)/bench_local_storage.o: $(OUTPUT)/local_storage_bench.skel.h
+$(OUTPUT)/bench_local_storage_rcu_tasks_trace.o: $(OUTPUT)/local_storage_rcu_tasks_trace_bench.skel.h
 $(OUTPUT)/bench.o: bench.h testing_helpers.h $(BPFOBJ)
 $(OUTPUT)/bench: LDLIBS += -lm
 $(OUTPUT)/bench: $(OUTPUT)/bench.o \
···
 		 $(OUTPUT)/bench_bloom_filter_map.o \
 		 $(OUTPUT)/bench_bpf_loop.o \
 		 $(OUTPUT)/bench_strncmp.o \
-		 $(OUTPUT)/bench_bpf_hashmap_full_update.o
+		 $(OUTPUT)/bench_bpf_hashmap_full_update.o \
+		 $(OUTPUT)/bench_local_storage.o \
+		 $(OUTPUT)/bench_local_storage_rcu_tasks_trace.o
 	$(call msg,BINARY,,$@)
 	$(Q)$(CC) $(CFLAGS) $(LDFLAGS) $(filter %.a %.o,$^) $(LDLIBS) -o $@
 
+97
tools/testing/selftests/bpf/bench.c
··· 79 79 hits_per_sec, hits_per_prod, drops_per_sec, hits_per_sec + drops_per_sec); 80 80 } 81 81 82 + void 83 + grace_period_latency_basic_stats(struct bench_res res[], int res_cnt, struct basic_stats *gp_stat) 84 + { 85 + int i; 86 + 87 + memset(gp_stat, 0, sizeof(struct basic_stats)); 88 + 89 + for (i = 0; i < res_cnt; i++) 90 + gp_stat->mean += res[i].gp_ns / 1000.0 / (double)res[i].gp_ct / (0.0 + res_cnt); 91 + 92 + #define IT_MEAN_DIFF (res[i].gp_ns / 1000.0 / (double)res[i].gp_ct - gp_stat->mean) 93 + if (res_cnt > 1) { 94 + for (i = 0; i < res_cnt; i++) 95 + gp_stat->stddev += (IT_MEAN_DIFF * IT_MEAN_DIFF) / (res_cnt - 1.0); 96 + } 97 + gp_stat->stddev = sqrt(gp_stat->stddev); 98 + #undef IT_MEAN_DIFF 99 + } 100 + 101 + void 102 + grace_period_ticks_basic_stats(struct bench_res res[], int res_cnt, struct basic_stats *gp_stat) 103 + { 104 + int i; 105 + 106 + memset(gp_stat, 0, sizeof(struct basic_stats)); 107 + for (i = 0; i < res_cnt; i++) 108 + gp_stat->mean += res[i].stime / (double)res[i].gp_ct / (0.0 + res_cnt); 109 + 110 + #define IT_MEAN_DIFF (res[i].stime / (double)res[i].gp_ct - gp_stat->mean) 111 + if (res_cnt > 1) { 112 + for (i = 0; i < res_cnt; i++) 113 + gp_stat->stddev += (IT_MEAN_DIFF * IT_MEAN_DIFF) / (res_cnt - 1.0); 114 + } 115 + gp_stat->stddev = sqrt(gp_stat->stddev); 116 + #undef IT_MEAN_DIFF 117 + } 118 + 82 119 void hits_drops_report_final(struct bench_res res[], int res_cnt) 83 120 { 84 121 int i; ··· 187 150 printf("latency %8.3lf ns/op\n", 1000.0 / hits_mean * env.producer_cnt); 188 151 } 189 152 153 + void local_storage_report_progress(int iter, struct bench_res *res, 154 + long delta_ns) 155 + { 156 + double important_hits_per_sec, hits_per_sec; 157 + double delta_sec = delta_ns / 1000000000.0; 158 + 159 + hits_per_sec = res->hits / 1000000.0 / delta_sec; 160 + important_hits_per_sec = res->important_hits / 1000000.0 / delta_sec; 161 + 162 + printf("Iter %3d (%7.3lfus): ", iter, (delta_ns - 1000000000) / 1000.0); 163 + 164 + 
printf("hits %8.3lfM/s ", hits_per_sec); 165 + printf("important_hits %8.3lfM/s\n", important_hits_per_sec); 166 + } 167 + 168 + void local_storage_report_final(struct bench_res res[], int res_cnt) 169 + { 170 + double important_hits_mean = 0.0, important_hits_stddev = 0.0; 171 + double hits_mean = 0.0, hits_stddev = 0.0; 172 + int i; 173 + 174 + for (i = 0; i < res_cnt; i++) { 175 + hits_mean += res[i].hits / 1000000.0 / (0.0 + res_cnt); 176 + important_hits_mean += res[i].important_hits / 1000000.0 / (0.0 + res_cnt); 177 + } 178 + 179 + if (res_cnt > 1) { 180 + for (i = 0; i < res_cnt; i++) { 181 + hits_stddev += (hits_mean - res[i].hits / 1000000.0) * 182 + (hits_mean - res[i].hits / 1000000.0) / 183 + (res_cnt - 1.0); 184 + important_hits_stddev += 185 + (important_hits_mean - res[i].important_hits / 1000000.0) * 186 + (important_hits_mean - res[i].important_hits / 1000000.0) / 187 + (res_cnt - 1.0); 188 + } 189 + 190 + hits_stddev = sqrt(hits_stddev); 191 + important_hits_stddev = sqrt(important_hits_stddev); 192 + } 193 + printf("Summary: hits throughput %8.3lf \u00B1 %5.3lf M ops/s, ", 194 + hits_mean, hits_stddev); 195 + printf("hits latency %8.3lf ns/op, ", 1000.0 / hits_mean); 196 + printf("important_hits throughput %8.3lf \u00B1 %5.3lf M ops/s\n", 197 + important_hits_mean, important_hits_stddev); 198 + } 199 + 190 200 const char *argp_program_version = "benchmark"; 191 201 const char *argp_program_bug_address = "<bpf@vger.kernel.org>"; 192 202 const char argp_program_doc[] = ··· 272 188 extern struct argp bench_ringbufs_argp; 273 189 extern struct argp bench_bloom_map_argp; 274 190 extern struct argp bench_bpf_loop_argp; 191 + extern struct argp bench_local_storage_argp; 192 + extern struct argp bench_local_storage_rcu_tasks_trace_argp; 275 193 extern struct argp bench_strncmp_argp; 276 194 277 195 static const struct argp_child bench_parsers[] = { 278 196 { &bench_ringbufs_argp, 0, "Ring buffers benchmark", 0 }, 279 197 { &bench_bloom_map_argp, 0, 
"Bloom filter map benchmark", 0 }, 280 198 { &bench_bpf_loop_argp, 0, "bpf_loop helper benchmark", 0 }, 199 + { &bench_local_storage_argp, 0, "local_storage benchmark", 0 }, 281 200 { &bench_strncmp_argp, 0, "bpf_strncmp helper benchmark", 0 }, 201 + { &bench_local_storage_rcu_tasks_trace_argp, 0, 202 + "local_storage RCU Tasks Trace slowdown benchmark", 0 }, 282 203 {}, 283 204 }; 284 205 ··· 486 397 extern const struct bench bench_strncmp_no_helper; 487 398 extern const struct bench bench_strncmp_helper; 488 399 extern const struct bench bench_bpf_hashmap_full_update; 400 + extern const struct bench bench_local_storage_cache_seq_get; 401 + extern const struct bench bench_local_storage_cache_interleaved_get; 402 + extern const struct bench bench_local_storage_cache_hashmap_control; 403 + extern const struct bench bench_local_storage_tasks_trace; 489 404 490 405 static const struct bench *benchs[] = { 491 406 &bench_count_global, ··· 525 432 &bench_strncmp_no_helper, 526 433 &bench_strncmp_helper, 527 434 &bench_bpf_hashmap_full_update, 435 + &bench_local_storage_cache_seq_get, 436 + &bench_local_storage_cache_interleaved_get, 437 + &bench_local_storage_cache_hashmap_control, 438 + &bench_local_storage_tasks_trace, 528 439 }; 529 440 530 441 static void setup_benchmark()
+16
tools/testing/selftests/bpf/bench.h
···
 	struct cpu_set cons_cpus;
 };
 
+struct basic_stats {
+	double mean;
+	double stddev;
+};
+
 struct bench_res {
 	long hits;
 	long drops;
 	long false_hits;
+	long important_hits;
+	unsigned long gp_ns;
+	unsigned long gp_ct;
+	unsigned int stime;
 };
 
 struct bench {
···
 void false_hits_report_final(struct bench_res res[], int res_cnt);
 void ops_report_progress(int iter, struct bench_res *res, long delta_ns);
 void ops_report_final(struct bench_res res[], int res_cnt);
+void local_storage_report_progress(int iter, struct bench_res *res,
+				   long delta_ns);
+void local_storage_report_final(struct bench_res res[], int res_cnt);
+void grace_period_latency_basic_stats(struct bench_res res[], int res_cnt,
+				      struct basic_stats *gp_stat);
+void grace_period_ticks_basic_stats(struct bench_res res[], int res_cnt,
+				    struct basic_stats *gp_stat);
 
 static inline __u64 get_time_ns(void)
 {
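Note on `struct basic_stats`: the `grace_period_*_basic_stats()` helpers declared above fill it with a mean and a Bessel-corrected standard deviation (the bench.c hunks divide the squared deviations by `res_cnt - 1.0` and then call `sqrt()`). A minimal sketch of the same arithmetic over a plain array of doubles follows. `demo_stats` and `demo_basic_stats` are hypothetical names, and the sketch stops at the sample variance so it needs no libm; the selftest takes the square root of that value.

```c
#include <stddef.h>

/* Hypothetical counterpart of struct basic_stats, holding the variance
 * instead of the standard deviation.
 */
struct demo_stats {
	double mean;
	double variance;	/* sample variance; sqrt() of this is the stddev */
};

/* mean     = sum(x[i]) / n
 * variance = sum((x[i] - mean)^2) / (n - 1)   for n > 1, else 0
 */
static struct demo_stats demo_basic_stats(const double *x, size_t n)
{
	struct demo_stats s = { 0.0, 0.0 };
	size_t i;

	if (n == 0)
		return s;

	for (i = 0; i < n; i++)
		s.mean += x[i] / (double)n;

	if (n > 1) {
		for (i = 0; i < n; i++) {
			double d = x[i] - s.mean;

			s.variance += d * d / ((double)n - 1.0);
		}
	}
	return s;
}
```

Dividing by `n - 1` rather than `n` gives the unbiased estimate for a small number of benchmark iterations, which is presumably why the selftest guards the loop with `res_cnt > 1`.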
+287
tools/testing/selftests/bpf/benchs/bench_local_storage.c
··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + /* Copyright (c) 2022 Meta Platforms, Inc. and affiliates. */ 3 + 4 + #include <argp.h> 5 + #include <linux/btf.h> 6 + 7 + #include "local_storage_bench.skel.h" 8 + #include "bench.h" 9 + 10 + #include <test_btf.h> 11 + 12 + static struct { 13 + __u32 nr_maps; 14 + __u32 hashmap_nr_keys_used; 15 + } args = { 16 + .nr_maps = 1000, 17 + .hashmap_nr_keys_used = 1000, 18 + }; 19 + 20 + enum { 21 + ARG_NR_MAPS = 6000, 22 + ARG_HASHMAP_NR_KEYS_USED = 6001, 23 + }; 24 + 25 + static const struct argp_option opts[] = { 26 + { "nr_maps", ARG_NR_MAPS, "NR_MAPS", 0, 27 + "Set number of local_storage maps"}, 28 + { "hashmap_nr_keys_used", ARG_HASHMAP_NR_KEYS_USED, "NR_KEYS", 29 + 0, "When doing hashmap test, set number of hashmap keys test uses"}, 30 + {}, 31 + }; 32 + 33 + static error_t parse_arg(int key, char *arg, struct argp_state *state) 34 + { 35 + long ret; 36 + 37 + switch (key) { 38 + case ARG_NR_MAPS: 39 + ret = strtol(arg, NULL, 10); 40 + if (ret < 1 || ret > UINT_MAX) { 41 + fprintf(stderr, "invalid nr_maps"); 42 + argp_usage(state); 43 + } 44 + args.nr_maps = ret; 45 + break; 46 + case ARG_HASHMAP_NR_KEYS_USED: 47 + ret = strtol(arg, NULL, 10); 48 + if (ret < 1 || ret > UINT_MAX) { 49 + fprintf(stderr, "invalid hashmap_nr_keys_used"); 50 + argp_usage(state); 51 + } 52 + args.hashmap_nr_keys_used = ret; 53 + break; 54 + default: 55 + return ARGP_ERR_UNKNOWN; 56 + } 57 + 58 + return 0; 59 + } 60 + 61 + const struct argp bench_local_storage_argp = { 62 + .options = opts, 63 + .parser = parse_arg, 64 + }; 65 + 66 + /* Keep in sync w/ array of maps in bpf */ 67 + #define MAX_NR_MAPS 1000 68 + /* keep in sync w/ same define in bpf */ 69 + #define HASHMAP_SZ 4194304 70 + 71 + static void validate(void) 72 + { 73 + if (env.producer_cnt != 1) { 74 + fprintf(stderr, "benchmark doesn't support multi-producer!\n"); 75 + exit(1); 76 + } 77 + if (env.consumer_cnt != 1) { 78 + fprintf(stderr, "benchmark doesn't support 
multi-consumer!\n"); 79 + exit(1); 80 + } 81 + 82 + if (args.nr_maps > MAX_NR_MAPS) { 83 + fprintf(stderr, "nr_maps must be <= 1000\n"); 84 + exit(1); 85 + } 86 + 87 + if (args.hashmap_nr_keys_used > HASHMAP_SZ) { 88 + fprintf(stderr, "hashmap_nr_keys_used must be <= %u\n", HASHMAP_SZ); 89 + exit(1); 90 + } 91 + } 92 + 93 + static struct { 94 + struct local_storage_bench *skel; 95 + void *bpf_obj; 96 + struct bpf_map *array_of_maps; 97 + } ctx; 98 + 99 + static void prepopulate_hashmap(int fd) 100 + { 101 + int i, key, val; 102 + 103 + /* local_storage gets will have BPF_LOCAL_STORAGE_GET_F_CREATE flag set, so 104 + * populate the hashmap for a similar comparison 105 + */ 106 + for (i = 0; i < HASHMAP_SZ; i++) { 107 + key = val = i; 108 + if (bpf_map_update_elem(fd, &key, &val, 0)) { 109 + fprintf(stderr, "Error prepopulating hashmap (key %d)\n", key); 110 + exit(1); 111 + } 112 + } 113 + } 114 + 115 + static void __setup(struct bpf_program *prog, bool hashmap) 116 + { 117 + struct bpf_map *inner_map; 118 + int i, fd, mim_fd, err; 119 + 120 + LIBBPF_OPTS(bpf_map_create_opts, create_opts); 121 + 122 + if (!hashmap) 123 + create_opts.map_flags = BPF_F_NO_PREALLOC; 124 + 125 + ctx.skel->rodata->num_maps = args.nr_maps; 126 + ctx.skel->rodata->hashmap_num_keys = args.hashmap_nr_keys_used; 127 + inner_map = bpf_map__inner_map(ctx.array_of_maps); 128 + create_opts.btf_key_type_id = bpf_map__btf_key_type_id(inner_map); 129 + create_opts.btf_value_type_id = bpf_map__btf_value_type_id(inner_map); 130 + 131 + err = local_storage_bench__load(ctx.skel); 132 + if (err) { 133 + fprintf(stderr, "Error loading skeleton\n"); 134 + goto err_out; 135 + } 136 + 137 + create_opts.btf_fd = bpf_object__btf_fd(ctx.skel->obj); 138 + 139 + mim_fd = bpf_map__fd(ctx.array_of_maps); 140 + if (mim_fd < 0) { 141 + fprintf(stderr, "Error getting map_in_map fd\n"); 142 + goto err_out; 143 + } 144 + 145 + for (i = 0; i < args.nr_maps; i++) { 146 + if (hashmap) 147 + fd = 
bpf_map_create(BPF_MAP_TYPE_HASH, NULL, sizeof(int), 148 + sizeof(int), HASHMAP_SZ, &create_opts); 149 + else 150 + fd = bpf_map_create(BPF_MAP_TYPE_TASK_STORAGE, NULL, sizeof(int), 151 + sizeof(int), 0, &create_opts); 152 + if (fd < 0) { 153 + fprintf(stderr, "Error creating map %d: %d\n", i, fd); 154 + goto err_out; 155 + } 156 + 157 + if (hashmap) 158 + prepopulate_hashmap(fd); 159 + 160 + err = bpf_map_update_elem(mim_fd, &i, &fd, 0); 161 + if (err) { 162 + fprintf(stderr, "Error updating array-of-maps w/ map %d\n", i); 163 + goto err_out; 164 + } 165 + } 166 + 167 + if (!bpf_program__attach(prog)) { 168 + fprintf(stderr, "Error attaching bpf program\n"); 169 + goto err_out; 170 + } 171 + 172 + return; 173 + err_out: 174 + exit(1); 175 + } 176 + 177 + static void hashmap_setup(void) 178 + { 179 + struct local_storage_bench *skel; 180 + 181 + setup_libbpf(); 182 + 183 + skel = local_storage_bench__open(); 184 + ctx.skel = skel; 185 + ctx.array_of_maps = skel->maps.array_of_hash_maps; 186 + skel->rodata->use_hashmap = 1; 187 + skel->rodata->interleave = 0; 188 + 189 + __setup(skel->progs.get_local, true); 190 + } 191 + 192 + static void local_storage_cache_get_setup(void) 193 + { 194 + struct local_storage_bench *skel; 195 + 196 + setup_libbpf(); 197 + 198 + skel = local_storage_bench__open(); 199 + ctx.skel = skel; 200 + ctx.array_of_maps = skel->maps.array_of_local_storage_maps; 201 + skel->rodata->use_hashmap = 0; 202 + skel->rodata->interleave = 0; 203 + 204 + __setup(skel->progs.get_local, false); 205 + } 206 + 207 + static void local_storage_cache_get_interleaved_setup(void) 208 + { 209 + struct local_storage_bench *skel; 210 + 211 + setup_libbpf(); 212 + 213 + skel = local_storage_bench__open(); 214 + ctx.skel = skel; 215 + ctx.array_of_maps = skel->maps.array_of_local_storage_maps; 216 + skel->rodata->use_hashmap = 0; 217 + skel->rodata->interleave = 1; 218 + 219 + __setup(skel->progs.get_local, false); 220 + } 221 + 222 + static void measure(struct 
bench_res *res) 223 + { 224 + res->hits = atomic_swap(&ctx.skel->bss->hits, 0); 225 + res->important_hits = atomic_swap(&ctx.skel->bss->important_hits, 0); 226 + } 227 + 228 + static inline void trigger_bpf_program(void) 229 + { 230 + syscall(__NR_getpgid); 231 + } 232 + 233 + static void *consumer(void *input) 234 + { 235 + return NULL; 236 + } 237 + 238 + static void *producer(void *input) 239 + { 240 + while (true) 241 + trigger_bpf_program(); 242 + 243 + return NULL; 244 + } 245 + 246 + /* cache sequential and interleaved get benchs test local_storage get 247 + * performance, specifically they demonstrate performance cliff of 248 + * current list-plus-cache local_storage model. 249 + * 250 + * cache sequential get: call bpf_task_storage_get on n maps in order 251 + * cache interleaved get: like "sequential get", but interleave 4 calls to the 252 + * 'important' map (idx 0 in array_of_maps) for every 10 calls. Goal 253 + * is to mimic environment where many progs are accessing their local_storage 254 + * maps, with 'our' prog needing to access its map more often than others 255 + */ 256 + const struct bench bench_local_storage_cache_seq_get = { 257 + .name = "local-storage-cache-seq-get", 258 + .validate = validate, 259 + .setup = local_storage_cache_get_setup, 260 + .producer_thread = producer, 261 + .consumer_thread = consumer, 262 + .measure = measure, 263 + .report_progress = local_storage_report_progress, 264 + .report_final = local_storage_report_final, 265 + }; 266 + 267 + const struct bench bench_local_storage_cache_interleaved_get = { 268 + .name = "local-storage-cache-int-get", 269 + .validate = validate, 270 + .setup = local_storage_cache_get_interleaved_setup, 271 + .producer_thread = producer, 272 + .consumer_thread = consumer, 273 + .measure = measure, 274 + .report_progress = local_storage_report_progress, 275 + .report_final = local_storage_report_final, 276 + }; 277 + 278 + const struct bench bench_local_storage_cache_hashmap_control = { 279 + 
.name = "local-storage-cache-hashmap-control", 280 + .validate = validate, 281 + .setup = hashmap_setup, 282 + .producer_thread = producer, 283 + .consumer_thread = consumer, 284 + .measure = measure, 285 + .report_progress = local_storage_report_progress, 286 + .report_final = local_storage_report_final, 287 + };
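Note on `parse_arg()` in the file above: it converts `--nr_maps` and `--hashmap_nr_keys_used` with `strtol(arg, NULL, 10)` followed by a range check, which accepts trailing garbage such as `12abc`. A stricter variant could look like the sketch below. `demo_parse_u32` is a hypothetical helper, not part of the selftests, and this is a suggestion rather than what the commit does.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Parse s as a base-10 integer in [min, max]. Returns 0 and stores the
 * value in *out on success, -1 on empty input, trailing characters,
 * overflow, or an out-of-range value.
 */
static int demo_parse_u32(const char *s, unsigned int min, unsigned int max,
			  unsigned int *out)
{
	char *end;
	long long v;

	errno = 0;
	v = strtoll(s, &end, 10);
	if (errno || end == s || *end != '\0')
		return -1;	/* overflow, no digits, or junk after the number */
	if (v < (long long)min || v > (long long)max)
		return -1;

	*out = (unsigned int)v;
	return 0;
}
```

Using `long long` for the intermediate value keeps the comparison against `UINT_MAX` meaningful on platforms where `long` is 32 bits wide.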
+281
tools/testing/selftests/bpf/benchs/bench_local_storage_rcu_tasks_trace.c
··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + /* Copyright (c) 2022 Meta Platforms, Inc. and affiliates. */ 3 + 4 + #include <argp.h> 5 + 6 + #include <sys/prctl.h> 7 + #include "local_storage_rcu_tasks_trace_bench.skel.h" 8 + #include "bench.h" 9 + 10 + #include <signal.h> 11 + 12 + static struct { 13 + __u32 nr_procs; 14 + __u32 kthread_pid; 15 + bool quiet; 16 + } args = { 17 + .nr_procs = 1000, 18 + .kthread_pid = 0, 19 + .quiet = false, 20 + }; 21 + 22 + enum { 23 + ARG_NR_PROCS = 7000, 24 + ARG_KTHREAD_PID = 7001, 25 + ARG_QUIET = 7002, 26 + }; 27 + 28 + static const struct argp_option opts[] = { 29 + { "nr_procs", ARG_NR_PROCS, "NR_PROCS", 0, 30 + "Set number of user processes to spin up"}, 31 + { "kthread_pid", ARG_KTHREAD_PID, "PID", 0, 32 + "Pid of rcu_tasks_trace kthread for ticks tracking"}, 33 + { "quiet", ARG_QUIET, "{0,1}", 0, 34 + "If true, don't report progress"}, 35 + {}, 36 + }; 37 + 38 + static error_t parse_arg(int key, char *arg, struct argp_state *state) 39 + { 40 + long ret; 41 + 42 + switch (key) { 43 + case ARG_NR_PROCS: 44 + ret = strtol(arg, NULL, 10); 45 + if (ret < 1 || ret > UINT_MAX) { 46 + fprintf(stderr, "invalid nr_procs\n"); 47 + argp_usage(state); 48 + } 49 + args.nr_procs = ret; 50 + break; 51 + case ARG_KTHREAD_PID: 52 + ret = strtol(arg, NULL, 10); 53 + if (ret < 1) { 54 + fprintf(stderr, "invalid kthread_pid\n"); 55 + argp_usage(state); 56 + } 57 + args.kthread_pid = ret; 58 + break; 59 + case ARG_QUIET: 60 + ret = strtol(arg, NULL, 10); 61 + if (ret < 0 || ret > 1) { 62 + fprintf(stderr, "invalid quiet %ld\n", ret); 63 + argp_usage(state); 64 + } 65 + args.quiet = ret; 66 + break; 67 + break; 68 + default: 69 + return ARGP_ERR_UNKNOWN; 70 + } 71 + 72 + return 0; 73 + } 74 + 75 + const struct argp bench_local_storage_rcu_tasks_trace_argp = { 76 + .options = opts, 77 + .parser = parse_arg, 78 + }; 79 + 80 + #define MAX_SLEEP_PROCS 150000 81 + 82 + static void validate(void) 83 + { 84 + if (env.producer_cnt != 1) { 85 + 
fprintf(stderr, "benchmark doesn't support multi-producer!\n"); 86 + exit(1); 87 + } 88 + if (env.consumer_cnt != 1) { 89 + fprintf(stderr, "benchmark doesn't support multi-consumer!\n"); 90 + exit(1); 91 + } 92 + 93 + if (args.nr_procs > MAX_SLEEP_PROCS) { 94 + fprintf(stderr, "benchmark supports up to %u sleeper procs!\n", 95 + MAX_SLEEP_PROCS); 96 + exit(1); 97 + } 98 + } 99 + 100 + static long kthread_pid_ticks(void) 101 + { 102 + char procfs_path[100]; 103 + long stime; 104 + FILE *f; 105 + 106 + if (!args.kthread_pid) 107 + return -1; 108 + 109 + sprintf(procfs_path, "/proc/%u/stat", args.kthread_pid); 110 + f = fopen(procfs_path, "r"); 111 + if (!f) { 112 + fprintf(stderr, "couldn't open %s, exiting\n", procfs_path); 113 + goto err_out; 114 + } 115 + if (fscanf(f, "%*s %*s %*s %*s %*s %*s %*s %*s %*s %*s %*s %*s %*s %*s %ld", &stime) != 1) { 116 + fprintf(stderr, "fscanf of %s failed, exiting\n", procfs_path); 117 + goto err_out; 118 + } 119 + fclose(f); 120 + return stime; 121 + 122 + err_out: 123 + if (f) 124 + fclose(f); 125 + exit(1); 126 + return 0; 127 + } 128 + 129 + static struct { 130 + struct local_storage_rcu_tasks_trace_bench *skel; 131 + long prev_kthread_stime; 132 + } ctx; 133 + 134 + static void sleep_and_loop(void) 135 + { 136 + while (true) { 137 + sleep(rand() % 4); 138 + syscall(__NR_getpgid); 139 + } 140 + } 141 + 142 + static void local_storage_tasks_trace_setup(void) 143 + { 144 + int i, err, forkret, runner_pid; 145 + 146 + runner_pid = getpid(); 147 + 148 + for (i = 0; i < args.nr_procs; i++) { 149 + forkret = fork(); 150 + if (forkret < 0) { 151 + fprintf(stderr, "Error forking sleeper proc %u of %u, exiting\n", i, 152 + args.nr_procs); 153 + goto err_out; 154 + } 155 + 156 + if (!forkret) { 157 + err = prctl(PR_SET_PDEATHSIG, SIGKILL); 158 + if (err < 0) { 159 + fprintf(stderr, "prctl failed with err %d, exiting\n", errno); 160 + goto err_out; 161 + } 162 + 163 + if (getppid() != runner_pid) { 164 + fprintf(stderr, "Runner died 
while spinning up procs, exiting\n"); 165 + goto err_out; 166 + } 167 + sleep_and_loop(); 168 + } 169 + } 170 + printf("Spun up %u procs (our pid %d)\n", args.nr_procs, runner_pid); 171 + 172 + setup_libbpf(); 173 + 174 + ctx.skel = local_storage_rcu_tasks_trace_bench__open_and_load(); 175 + if (!ctx.skel) { 176 + fprintf(stderr, "Error doing open_and_load, exiting\n"); 177 + goto err_out; 178 + } 179 + 180 + ctx.prev_kthread_stime = kthread_pid_ticks(); 181 + 182 + if (!bpf_program__attach(ctx.skel->progs.get_local)) { 183 + fprintf(stderr, "Error attaching bpf program\n"); 184 + goto err_out; 185 + } 186 + 187 + if (!bpf_program__attach(ctx.skel->progs.pregp_step)) { 188 + fprintf(stderr, "Error attaching bpf program\n"); 189 + goto err_out; 190 + } 191 + 192 + if (!bpf_program__attach(ctx.skel->progs.postgp)) { 193 + fprintf(stderr, "Error attaching bpf program\n"); 194 + goto err_out; 195 + } 196 + 197 + return; 198 + err_out: 199 + exit(1); 200 + } 201 + 202 + static void measure(struct bench_res *res) 203 + { 204 + long ticks; 205 + 206 + res->gp_ct = atomic_swap(&ctx.skel->bss->gp_hits, 0); 207 + res->gp_ns = atomic_swap(&ctx.skel->bss->gp_times, 0); 208 + ticks = kthread_pid_ticks(); 209 + res->stime = ticks - ctx.prev_kthread_stime; 210 + ctx.prev_kthread_stime = ticks; 211 + } 212 + 213 + static void *consumer(void *input) 214 + { 215 + return NULL; 216 + } 217 + 218 + static void *producer(void *input) 219 + { 220 + while (true) 221 + syscall(__NR_getpgid); 222 + return NULL; 223 + } 224 + 225 + static void report_progress(int iter, struct bench_res *res, long delta_ns) 226 + { 227 + if (ctx.skel->bss->unexpected) { 228 + fprintf(stderr, "Error: Unexpected order of bpf prog calls (postgp after pregp)."); 229 + fprintf(stderr, "Data can't be trusted, exiting\n"); 230 + exit(1); 231 + } 232 + 233 + if (args.quiet) 234 + return; 235 + 236 + printf("Iter %d\t avg tasks_trace grace period latency\t%lf ns\n", 237 + iter, res->gp_ns / (double)res->gp_ct); 238 + 
printf("Iter %d\t avg ticks per tasks_trace grace period\t%lf\n", 239 + iter, res->stime / (double)res->gp_ct); 240 + } 241 + 242 + static void report_final(struct bench_res res[], int res_cnt) 243 + { 244 + struct basic_stats gp_stat; 245 + 246 + grace_period_latency_basic_stats(res, res_cnt, &gp_stat); 247 + printf("SUMMARY tasks_trace grace period latency"); 248 + printf("\tavg %.3lf us\tstddev %.3lf us\n", gp_stat.mean, gp_stat.stddev); 249 + grace_period_ticks_basic_stats(res, res_cnt, &gp_stat); 250 + printf("SUMMARY ticks per tasks_trace grace period"); 251 + printf("\tavg %.3lf\tstddev %.3lf\n", gp_stat.mean, gp_stat.stddev); 252 + } 253 + 254 + /* local-storage-tasks-trace: Benchmark performance of BPF local_storage's use 255 + * of RCU Tasks-Trace. 256 + * 257 + * Stress RCU Tasks-Trace by forking many tasks, all of which do no work aside 258 + * from a sleep() loop, and creating/destroying BPF task-local storage on wakeup. 259 + * The number of forked tasks is configurable. 260 + * 261 + * Exercising code paths which call call_rcu_tasks_trace while there are many 262 + * thousands of tasks on the system should result in RCU Tasks-Trace having to 263 + * do a noticeable amount of work. 264 + * 265 + * This should be observable by measuring rcu_tasks_trace_kthread CPU usage 266 + * after the grace period has ended, or by measuring grace period latency. 
267 + * 268 + * This benchmark uses both approaches, attaching to rcu_tasks_trace_pregp_step 269 + * and rcu_tasks_trace_postgp functions to measure grace period latency and 270 + * using /proc/PID/stat to measure rcu_tasks_trace_kthread kernel ticks. 271 + */ 272 + const struct bench bench_local_storage_tasks_trace = { 273 + .name = "local-storage-tasks-trace", 274 + .validate = validate, 275 + .setup = local_storage_tasks_trace_setup, 276 + .producer_thread = producer, 277 + .consumer_thread = consumer, 278 + .measure = measure, 279 + .report_progress = report_progress, 280 + .report_final = report_final, 281 + };
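The benchmark above reads the kthread's kernel ticks from /proc/PID/stat by skipping 14 whitespace-separated fields and taking the 15th (stime). That works for a comm with no spaces. As a minimal sketch only, and not part of this patch, the same field could be extracted from an already-read stat line in a way that tolerates spaces in comm. The helper name `stat_line_stime` is hypothetical, and taking a string instead of a `FILE *` is an assumption made to keep the example self-contained.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not in the patch): return stime, field 15 of a
 * /proc/<pid>/stat line, or -1 if the line cannot be parsed.
 */
long stat_line_stime(const char *line)
{
	const char *p;
	long stime;

	if (!line)
		return -1;

	/*
	 * comm (field 2) is wrapped in parentheses and may itself contain
	 * spaces or ')', so resume parsing after the last ')'.
	 */
	p = strrchr(line, ')');
	if (!p)
		return -1;

	/* Skip fields 3..14 (state through utime), then read stime. */
	if (sscanf(p + 1,
		   " %*s %*s %*s %*s %*s %*s %*s %*s %*s %*s %*s %*s %ld",
		   &stime) != 1)
		return -1;

	return stime;
}
```

A caller would read one line from /proc/PID/stat (for example with fgets()) and pass it in. As in measure() above, the difference between two successive readings gives the ticks consumed during the interval.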
+24
tools/testing/selftests/bpf/benchs/run_bench_local_storage.sh
··· 1 + #!/bin/bash 2 + # SPDX-License-Identifier: GPL-2.0 3 + 4 + source ./benchs/run_common.sh 5 + 6 + set -eufo pipefail 7 + 8 + header "Hashmap Control" 9 + for i in 10 1000 10000 100000 4194304; do 10 + subtitle "num keys: $i" 11 + summarize_local_storage "hashmap (control) sequential get: "\ 12 + "$(./bench --nr_maps 1 --hashmap_nr_keys_used=$i local-storage-cache-hashmap-control)" 13 + printf "\n" 14 + done 15 + 16 + header "Local Storage" 17 + for i in 1 10 16 17 24 32 100 1000; do 18 + subtitle "num_maps: $i" 19 + summarize_local_storage "local_storage cache sequential get: "\ 20 + "$(./bench --nr_maps $i local-storage-cache-seq-get)" 21 + summarize_local_storage "local_storage cache interleaved get: "\ 22 + "$(./bench --nr_maps $i local-storage-cache-int-get)" 23 + printf "\n" 24 + done
+11
tools/testing/selftests/bpf/benchs/run_bench_local_storage_rcu_tasks_trace.sh
··· 1 + #!/bin/bash 2 + # SPDX-License-Identifier: GPL-2.0 3 + 4 + kthread_pid=`pgrep rcu_tasks_trace_kthread` 5 + 6 + if [ -z "$kthread_pid" ]; then 7 + echo "error: Couldn't find rcu_tasks_trace_kthread" 8 + exit 1 9 + fi 10 + 11 + ./bench --nr_procs 15000 --kthread_pid "$kthread_pid" -d 600 --quiet 1 local-storage-tasks-trace
+17
tools/testing/selftests/bpf/benchs/run_common.sh
··· 41 41 echo "$*" | sed -E "s/.*latency\s+([0-9]+\.[0-9]+\sns\/op).*/\1/" 42 42 } 43 43 44 + function local_storage() 45 + { 46 + echo -n "hits throughput: " 47 + echo -n "$*" | sed -E "s/.* hits throughput\s+([0-9]+\.[0-9]+ ± [0-9]+\.[0-9]+\sM\sops\/s).*/\1/" 48 + echo -n -e ", hits latency: " 49 + echo -n "$*" | sed -E "s/.* hits latency\s+([0-9]+\.[0-9]+\sns\/op).*/\1/" 50 + echo -n ", important_hits throughput: " 51 + echo "$*" | sed -E "s/.*important_hits throughput\s+([0-9]+\.[0-9]+ ± [0-9]+\.[0-9]+\sM\sops\/s).*/\1/" 52 + } 53 + 44 54 function total() 45 55 { 46 56 echo "$*" | sed -E "s/.*total operations\s+([0-9]+\.[0-9]+ ± [0-9]+\.[0-9]+M\/s).*/\1/" ··· 75 65 bench="$1" 76 66 summary=$(echo $2 | tail -n1) 77 67 printf "%-20s %s\n" "$bench" "$(ops $summary)" 68 + } 69 + 70 + function summarize_local_storage() 71 + { 72 + bench="$1" 73 + summary=$(echo $2 | tail -n1) 74 + printf "%-20s %s\n" "$bench" "$(local_storage $summary)" 78 75 } 79 76 80 77 function summarize_total()
-9
tools/testing/selftests/bpf/bpf_legacy.h
··· 2 2 #ifndef __BPF_LEGACY__ 3 3 #define __BPF_LEGACY__ 4 4 5 - #define BPF_ANNOTATE_KV_PAIR(name, type_key, type_val) \ 6 - struct ____btf_map_##name { \ 7 - type_key key; \ 8 - type_val value; \ 9 - }; \ 10 - struct ____btf_map_##name \ 11 - __attribute__ ((section(".maps." #name), used)) \ 12 - ____btf_map_##name = { } 13 - 14 5 /* llvm builtin functions that eBPF C program may use to 15 6 * emit BPF_LD_ABS and BPF_LD_IND instructions 16 7 */
+6
tools/testing/selftests/bpf/config
··· 57 57 CONFIG_IKCONFIG=y 58 58 CONFIG_IKCONFIG_PROC=y 59 59 CONFIG_MPTCP=y 60 + CONFIG_NETFILTER_SYNPROXY=y 61 + CONFIG_NETFILTER_XT_TARGET_CT=y 62 + CONFIG_NETFILTER_XT_MATCH_STATE=y 63 + CONFIG_IP_NF_FILTER=y 64 + CONFIG_IP_NF_TARGET_SYNPROXY=y 65 + CONFIG_IP_NF_RAW=y
+1 -1
tools/testing/selftests/bpf/network_helpers.c
··· 436 436 int err; 437 437 struct nstoken *token; 438 438 439 - token = malloc(sizeof(struct nstoken)); 439 + token = calloc(1, sizeof(struct nstoken)); 440 440 if (!ASSERT_OK_PTR(token, "malloc token")) 441 441 return NULL; 442 442
+62
tools/testing/selftests/bpf/prog_tests/bpf_loop.c
··· 120 120 bpf_link__destroy(link); 121 121 } 122 122 123 + static void check_non_constant_callback(struct bpf_loop *skel) 124 + { 125 + struct bpf_link *link = 126 + bpf_program__attach(skel->progs.prog_non_constant_callback); 127 + 128 + if (!ASSERT_OK_PTR(link, "link")) 129 + return; 130 + 131 + skel->bss->callback_selector = 0x0F; 132 + usleep(1); 133 + ASSERT_EQ(skel->bss->g_output, 0x0F, "g_output #1"); 134 + 135 + skel->bss->callback_selector = 0xF0; 136 + usleep(1); 137 + ASSERT_EQ(skel->bss->g_output, 0xF0, "g_output #2"); 138 + 139 + bpf_link__destroy(link); 140 + } 141 + 142 + static void check_stack(struct bpf_loop *skel) 143 + { 144 + struct bpf_link *link = bpf_program__attach(skel->progs.stack_check); 145 + const int max_key = 12; 146 + int key; 147 + int map_fd; 148 + 149 + if (!ASSERT_OK_PTR(link, "link")) 150 + return; 151 + 152 + map_fd = bpf_map__fd(skel->maps.map1); 153 + 154 + if (!ASSERT_GE(map_fd, 0, "bpf_map__fd")) 155 + goto out; 156 + 157 + for (key = 1; key <= max_key; ++key) { 158 + int val = key; 159 + int err = bpf_map_update_elem(map_fd, &key, &val, BPF_NOEXIST); 160 + 161 + if (!ASSERT_OK(err, "bpf_map_update_elem")) 162 + goto out; 163 + } 164 + 165 + usleep(1); 166 + 167 + for (key = 1; key <= max_key; ++key) { 168 + int val; 169 + int err = bpf_map_lookup_elem(map_fd, &key, &val); 170 + 171 + if (!ASSERT_OK(err, "bpf_map_lookup_elem")) 172 + goto out; 173 + if (!ASSERT_EQ(val, key + 1, "bad value in the map")) 174 + goto out; 175 + } 176 + 177 + out: 178 + bpf_link__destroy(link); 179 + } 180 + 123 181 void test_bpf_loop(void) 124 182 { 125 183 struct bpf_loop *skel; ··· 198 140 check_invalid_flags(skel); 199 141 if (test__start_subtest("check_nested_calls")) 200 142 check_nested_calls(skel); 143 + if (test__start_subtest("check_non_constant_callback")) 144 + check_non_constant_callback(skel); 145 + if (test__start_subtest("check_stack")) 146 + check_stack(skel); 201 147 202 148 bpf_loop__destroy(skel); 203 149 }
+61
tools/testing/selftests/bpf/prog_tests/bpf_tcp_ca.c
··· 9 9 #include "bpf_cubic.skel.h" 10 10 #include "bpf_tcp_nogpl.skel.h" 11 11 #include "bpf_dctcp_release.skel.h" 12 + #include "tcp_ca_write_sk_pacing.skel.h" 13 + #include "tcp_ca_incompl_cong_ops.skel.h" 14 + #include "tcp_ca_unsupp_cong_op.skel.h" 12 15 13 16 #ifndef ENOTSUPP 14 17 #define ENOTSUPP 524 ··· 325 322 bpf_dctcp_release__destroy(rel_skel); 326 323 } 327 324 325 + static void test_write_sk_pacing(void) 326 + { 327 + struct tcp_ca_write_sk_pacing *skel; 328 + struct bpf_link *link; 329 + 330 + skel = tcp_ca_write_sk_pacing__open_and_load(); 331 + if (!ASSERT_OK_PTR(skel, "open_and_load")) 332 + return; 333 + 334 + link = bpf_map__attach_struct_ops(skel->maps.write_sk_pacing); 335 + ASSERT_OK_PTR(link, "attach_struct_ops"); 336 + 337 + bpf_link__destroy(link); 338 + tcp_ca_write_sk_pacing__destroy(skel); 339 + } 340 + 341 + static void test_incompl_cong_ops(void) 342 + { 343 + struct tcp_ca_incompl_cong_ops *skel; 344 + struct bpf_link *link; 345 + 346 + skel = tcp_ca_incompl_cong_ops__open_and_load(); 347 + if (!ASSERT_OK_PTR(skel, "open_and_load")) 348 + return; 349 + 350 + /* That cong_avoid() and cong_control() are missing is only reported at 351 + * this point: 352 + */ 353 + link = bpf_map__attach_struct_ops(skel->maps.incompl_cong_ops); 354 + ASSERT_ERR_PTR(link, "attach_struct_ops"); 355 + 356 + bpf_link__destroy(link); 357 + tcp_ca_incompl_cong_ops__destroy(skel); 358 + } 359 + 360 + static void test_unsupp_cong_op(void) 361 + { 362 + libbpf_print_fn_t old_print_fn; 363 + struct tcp_ca_unsupp_cong_op *skel; 364 + 365 + err_str = "attach to unsupported member get_info"; 366 + found = false; 367 + old_print_fn = libbpf_set_print(libbpf_debug_print); 368 + 369 + skel = tcp_ca_unsupp_cong_op__open_and_load(); 370 + ASSERT_NULL(skel, "open_and_load"); 371 + ASSERT_EQ(found, true, "expected_err_msg"); 372 + 373 + tcp_ca_unsupp_cong_op__destroy(skel); 374 + libbpf_set_print(old_print_fn); 375 + } 376 + 328 377 void test_bpf_tcp_ca(void) 329 378 { 
330 379 if (test__start_subtest("dctcp")) ··· 389 334 test_dctcp_fallback(); 390 335 if (test__start_subtest("rel_setsockopt")) 391 336 test_rel_setsockopt(); 337 + if (test__start_subtest("write_sk_pacing")) 338 + test_write_sk_pacing(); 339 + if (test__start_subtest("incompl_cong_ops")) 340 + test_incompl_cong_ops(); 341 + if (test__start_subtest("unsupp_cong_op")) 342 + test_unsupp_cong_op(); 392 343 }
-2
tools/testing/selftests/bpf/prog_tests/btf.c
··· 34 34 #undef CHECK 35 35 #define CHECK(condition, format...) _CHECK(condition, "check", duration, format) 36 36 37 - #define BTF_END_RAW 0xdeadbeef 38 37 #define NAME_TBD 0xdeadb33f 39 38 40 39 #define NAME_NTH(N) (0xfffe0000 | N) ··· 4651 4652 }; 4652 4653 4653 4654 static struct btf_file_test file_tests[] = { 4654 - { .file = "test_btf_haskv.o", }, 4655 4655 { .file = "test_btf_newkv.o", }, 4656 4656 { .file = "test_btf_nokv.o", .btf_kv_notfound = true, }, 4657 4657 };
+73 -2
tools/testing/selftests/bpf/prog_tests/core_reloc.c
··· 543 543 return 0; 544 544 } 545 545 546 - 547 546 static const struct core_reloc_test_case test_cases[] = { 548 547 /* validate we can find kernel image and use its BTF for relocs */ 549 548 { ··· 555 556 .valid = { 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, }, 556 557 .comm = "test_progs", 557 558 .comm_len = sizeof("test_progs"), 559 + .local_task_struct_matches = true, 558 560 }, 559 561 .output_len = sizeof(struct core_reloc_kernel_output), 560 562 .raw_tp_name = "sys_enter", ··· 752 752 SIZE_CASE(size___diff_offs), 753 753 SIZE_ERR_CASE(size___err_ambiguous), 754 754 755 - /* validate type existence and size relocations */ 755 + /* validate type existence, match, and size relocations */ 756 756 TYPE_BASED_CASE(type_based, { 757 757 .struct_exists = 1, 758 + .complex_struct_exists = 1, 758 759 .union_exists = 1, 759 760 .enum_exists = 1, 760 761 .typedef_named_struct_exists = 1, ··· 764 763 .typedef_int_exists = 1, 765 764 .typedef_enum_exists = 1, 766 765 .typedef_void_ptr_exists = 1, 766 + .typedef_restrict_ptr_exists = 1, 767 767 .typedef_func_proto_exists = 1, 768 768 .typedef_arr_exists = 1, 769 + 770 + .struct_matches = 1, 771 + .complex_struct_matches = 1, 772 + .union_matches = 1, 773 + .enum_matches = 1, 774 + .typedef_named_struct_matches = 1, 775 + .typedef_anon_struct_matches = 1, 776 + .typedef_struct_ptr_matches = 1, 777 + .typedef_int_matches = 1, 778 + .typedef_enum_matches = 1, 779 + .typedef_void_ptr_matches = 1, 780 + .typedef_restrict_ptr_matches = 1, 781 + .typedef_func_proto_matches = 1, 782 + .typedef_arr_matches = 1, 783 + 769 784 .struct_sz = sizeof(struct a_struct), 770 785 .union_sz = sizeof(union a_union), 771 786 .enum_sz = sizeof(enum an_enum), ··· 797 780 TYPE_BASED_CASE(type_based___all_missing, { 798 781 /* all zeros */ 799 782 }), 783 + TYPE_BASED_CASE(type_based___diff, { 784 + .struct_exists = 1, 785 + .complex_struct_exists = 1, 786 + .union_exists = 1, 787 + .enum_exists = 1, 788 + .typedef_named_struct_exists = 1, 789 + 
.typedef_anon_struct_exists = 1, 790 + .typedef_struct_ptr_exists = 1, 791 + .typedef_int_exists = 1, 792 + .typedef_enum_exists = 1, 793 + .typedef_void_ptr_exists = 1, 794 + .typedef_func_proto_exists = 1, 795 + .typedef_arr_exists = 1, 796 + 797 + .struct_matches = 1, 798 + .complex_struct_matches = 1, 799 + .union_matches = 1, 800 + .enum_matches = 1, 801 + .typedef_named_struct_matches = 1, 802 + .typedef_anon_struct_matches = 1, 803 + .typedef_struct_ptr_matches = 1, 804 + .typedef_int_matches = 0, 805 + .typedef_enum_matches = 1, 806 + .typedef_void_ptr_matches = 1, 807 + .typedef_func_proto_matches = 0, 808 + .typedef_arr_matches = 0, 809 + 810 + .struct_sz = sizeof(struct a_struct___diff), 811 + .union_sz = sizeof(union a_union___diff), 812 + .enum_sz = sizeof(enum an_enum___diff), 813 + .typedef_named_struct_sz = sizeof(named_struct_typedef___diff), 814 + .typedef_anon_struct_sz = sizeof(anon_struct_typedef___diff), 815 + .typedef_struct_ptr_sz = sizeof(struct_ptr_typedef___diff), 816 + .typedef_int_sz = sizeof(int_typedef___diff), 817 + .typedef_enum_sz = sizeof(enum_typedef___diff), 818 + .typedef_void_ptr_sz = sizeof(void_ptr_typedef___diff), 819 + .typedef_func_proto_sz = sizeof(func_proto_typedef___diff), 820 + .typedef_arr_sz = sizeof(arr_typedef___diff), 821 + }), 800 822 TYPE_BASED_CASE(type_based___diff_sz, { 801 823 .struct_exists = 1, 802 824 .union_exists = 1, ··· 848 792 .typedef_void_ptr_exists = 1, 849 793 .typedef_func_proto_exists = 1, 850 794 .typedef_arr_exists = 1, 795 + 796 + .struct_matches = 0, 797 + .union_matches = 0, 798 + .enum_matches = 0, 799 + .typedef_named_struct_matches = 0, 800 + .typedef_anon_struct_matches = 0, 801 + .typedef_struct_ptr_matches = 1, 802 + .typedef_int_matches = 0, 803 + .typedef_enum_matches = 0, 804 + .typedef_void_ptr_matches = 1, 805 + .typedef_func_proto_matches = 0, 806 + .typedef_arr_matches = 0, 807 + 851 808 .struct_sz = sizeof(struct a_struct___diff_sz), 852 809 .union_sz = sizeof(union 
a_union___diff_sz), 853 810 .enum_sz = sizeof(enum an_enum___diff_sz), ··· 875 806 }), 876 807 TYPE_BASED_CASE(type_based___incompat, { 877 808 .enum_exists = 1, 809 + .enum_matches = 1, 878 810 .enum_sz = sizeof(enum an_enum), 879 811 }), 880 812 TYPE_BASED_CASE(type_based___fn_wrong_args, { 881 813 .struct_exists = 1, 814 + .struct_matches = 1, 882 815 .struct_sz = sizeof(struct a_struct), 883 816 }), 884 817
+2 -2
tools/testing/selftests/bpf/prog_tests/kprobe_multi_test.c
··· 329 329 struct hashmap *map; 330 330 char buf[256]; 331 331 FILE *f; 332 - int err; 332 + int err = 0; 333 333 334 334 /* 335 335 * The available_filter_functions contains many duplicates, ··· 407 407 double attach_delta, detach_delta; 408 408 struct bpf_link *link = NULL; 409 409 char **syms = NULL; 410 - size_t cnt, i; 410 + size_t cnt = 0, i; 411 411 412 412 if (!ASSERT_OK(get_syms(&syms, &cnt), "get_syms")) 413 413 return;
+313
tools/testing/selftests/bpf/prog_tests/lsm_cgroup.c
··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + 3 + #include <sys/types.h> 4 + #include <sys/socket.h> 5 + #include <test_progs.h> 6 + #include <bpf/btf.h> 7 + 8 + #include "lsm_cgroup.skel.h" 9 + #include "lsm_cgroup_nonvoid.skel.h" 10 + #include "cgroup_helpers.h" 11 + #include "network_helpers.h" 12 + 13 + #ifndef ENOTSUPP 14 + #define ENOTSUPP 524 15 + #endif 16 + 17 + static struct btf *btf; 18 + 19 + static __u32 query_prog_cnt(int cgroup_fd, const char *attach_func) 20 + { 21 + LIBBPF_OPTS(bpf_prog_query_opts, p); 22 + int cnt = 0; 23 + int i; 24 + 25 + ASSERT_OK(bpf_prog_query_opts(cgroup_fd, BPF_LSM_CGROUP, &p), "prog_query"); 26 + 27 + if (!attach_func) 28 + return p.prog_cnt; 29 + 30 + /* When attach_func is provided, count the number of progs that 31 + * attach to the given symbol. 32 + */ 33 + 34 + if (!btf) 35 + btf = btf__load_vmlinux_btf(); 36 + if (!ASSERT_OK(libbpf_get_error(btf), "btf_vmlinux")) 37 + return -1; 38 + 39 + p.prog_ids = malloc(sizeof(u32) * p.prog_cnt); 40 + p.prog_attach_flags = malloc(sizeof(u32) * p.prog_cnt); 41 + ASSERT_OK(bpf_prog_query_opts(cgroup_fd, BPF_LSM_CGROUP, &p), "prog_query"); 42 + 43 + for (i = 0; i < p.prog_cnt; i++) { 44 + struct bpf_prog_info info = {}; 45 + __u32 info_len = sizeof(info); 46 + int fd; 47 + 48 + fd = bpf_prog_get_fd_by_id(p.prog_ids[i]); 49 + ASSERT_GE(fd, 0, "prog_get_fd_by_id"); 50 + ASSERT_OK(bpf_obj_get_info_by_fd(fd, &info, &info_len), "prog_info_by_fd"); 51 + close(fd); 52 + 53 + if (info.attach_btf_id == 54 + btf__find_by_name_kind(btf, attach_func, BTF_KIND_FUNC)) 55 + cnt++; 56 + } 57 + 58 + free(p.prog_ids); 59 + free(p.prog_attach_flags); 60 + 61 + return cnt; 62 + } 63 + 64 + static void test_lsm_cgroup_functional(void) 65 + { 66 + DECLARE_LIBBPF_OPTS(bpf_prog_attach_opts, attach_opts); 67 + DECLARE_LIBBPF_OPTS(bpf_link_update_opts, update_opts); 68 + int cgroup_fd = -1, cgroup_fd2 = -1, cgroup_fd3 = -1; 69 + int listen_fd, client_fd, accepted_fd; 70 + struct lsm_cgroup *skel = NULL; 
71 + int post_create_prog_fd2 = -1; 72 + int post_create_prog_fd = -1; 73 + int bind_link_fd2 = -1; 74 + int bind_prog_fd2 = -1; 75 + int alloc_prog_fd = -1; 76 + int bind_prog_fd = -1; 77 + int bind_link_fd = -1; 78 + int clone_prog_fd = -1; 79 + int err, fd, prio; 80 + socklen_t socklen; 81 + 82 + cgroup_fd3 = test__join_cgroup("/sock_policy_empty"); 83 + if (!ASSERT_GE(cgroup_fd3, 0, "create empty cgroup")) 84 + goto close_cgroup; 85 + 86 + cgroup_fd2 = test__join_cgroup("/sock_policy_reuse"); 87 + if (!ASSERT_GE(cgroup_fd2, 0, "create cgroup for reuse")) 88 + goto close_cgroup; 89 + 90 + cgroup_fd = test__join_cgroup("/sock_policy"); 91 + if (!ASSERT_GE(cgroup_fd, 0, "join_cgroup")) 92 + goto close_cgroup; 93 + 94 + skel = lsm_cgroup__open_and_load(); 95 + if (!ASSERT_OK_PTR(skel, "open_and_load")) 96 + goto close_cgroup; 97 + 98 + post_create_prog_fd = bpf_program__fd(skel->progs.socket_post_create); 99 + post_create_prog_fd2 = bpf_program__fd(skel->progs.socket_post_create2); 100 + bind_prog_fd = bpf_program__fd(skel->progs.socket_bind); 101 + bind_prog_fd2 = bpf_program__fd(skel->progs.socket_bind2); 102 + alloc_prog_fd = bpf_program__fd(skel->progs.socket_alloc); 103 + clone_prog_fd = bpf_program__fd(skel->progs.socket_clone); 104 + 105 + ASSERT_EQ(query_prog_cnt(cgroup_fd, "bpf_lsm_sk_alloc_security"), 0, "prog count"); 106 + ASSERT_EQ(query_prog_cnt(cgroup_fd, NULL), 0, "total prog count"); 107 + err = bpf_prog_attach(alloc_prog_fd, cgroup_fd, BPF_LSM_CGROUP, 0); 108 + if (err == -ENOTSUPP) { 109 + test__skip(); 110 + goto close_cgroup; 111 + } 112 + if (!ASSERT_OK(err, "attach alloc_prog_fd")) 113 + goto detach_cgroup; 114 + ASSERT_EQ(query_prog_cnt(cgroup_fd, "bpf_lsm_sk_alloc_security"), 1, "prog count"); 115 + ASSERT_EQ(query_prog_cnt(cgroup_fd, NULL), 1, "total prog count"); 116 + 117 + ASSERT_EQ(query_prog_cnt(cgroup_fd, "bpf_lsm_inet_csk_clone"), 0, "prog count"); 118 + err = bpf_prog_attach(clone_prog_fd, cgroup_fd, BPF_LSM_CGROUP, 0); 119 + if 
(!ASSERT_OK(err, "attach clone_prog_fd")) 120 + goto detach_cgroup; 121 + ASSERT_EQ(query_prog_cnt(cgroup_fd, "bpf_lsm_inet_csk_clone"), 1, "prog count"); 122 + ASSERT_EQ(query_prog_cnt(cgroup_fd, NULL), 2, "total prog count"); 123 + 124 + /* Make sure replacing works. */ 125 + 126 + ASSERT_EQ(query_prog_cnt(cgroup_fd, "bpf_lsm_socket_post_create"), 0, "prog count"); 127 + err = bpf_prog_attach(post_create_prog_fd, cgroup_fd, 128 + BPF_LSM_CGROUP, 0); 129 + if (!ASSERT_OK(err, "attach post_create_prog_fd")) 130 + goto detach_cgroup; 131 + ASSERT_EQ(query_prog_cnt(cgroup_fd, "bpf_lsm_socket_post_create"), 1, "prog count"); 132 + ASSERT_EQ(query_prog_cnt(cgroup_fd, NULL), 3, "total prog count"); 133 + 134 + attach_opts.replace_prog_fd = post_create_prog_fd; 135 + err = bpf_prog_attach_opts(post_create_prog_fd2, cgroup_fd, 136 + BPF_LSM_CGROUP, &attach_opts); 137 + if (!ASSERT_OK(err, "prog replace post_create_prog_fd")) 138 + goto detach_cgroup; 139 + ASSERT_EQ(query_prog_cnt(cgroup_fd, "bpf_lsm_socket_post_create"), 1, "prog count"); 140 + ASSERT_EQ(query_prog_cnt(cgroup_fd, NULL), 3, "total prog count"); 141 + 142 + /* Try the same attach/replace via link API. 
*/ 143 + 144 + ASSERT_EQ(query_prog_cnt(cgroup_fd, "bpf_lsm_socket_bind"), 0, "prog count"); 145 + bind_link_fd = bpf_link_create(bind_prog_fd, cgroup_fd, 146 + BPF_LSM_CGROUP, NULL); 147 + if (!ASSERT_GE(bind_link_fd, 0, "link create bind_prog_fd")) 148 + goto detach_cgroup; 149 + ASSERT_EQ(query_prog_cnt(cgroup_fd, "bpf_lsm_socket_bind"), 1, "prog count"); 150 + ASSERT_EQ(query_prog_cnt(cgroup_fd, NULL), 4, "total prog count"); 151 + 152 + update_opts.old_prog_fd = bind_prog_fd; 153 + update_opts.flags = BPF_F_REPLACE; 154 + 155 + err = bpf_link_update(bind_link_fd, bind_prog_fd2, &update_opts); 156 + if (!ASSERT_OK(err, "link update bind_prog_fd")) 157 + goto detach_cgroup; 158 + ASSERT_EQ(query_prog_cnt(cgroup_fd, "bpf_lsm_socket_bind"), 1, "prog count"); 159 + ASSERT_EQ(query_prog_cnt(cgroup_fd, NULL), 4, "total prog count"); 160 + 161 + /* Attach another instance of bind program to another cgroup. 162 + * This should trigger the reuse of the trampoline shim (two 163 + * programs attaching to the same btf_id). 164 + */ 165 + 166 + ASSERT_EQ(query_prog_cnt(cgroup_fd, "bpf_lsm_socket_bind"), 1, "prog count"); 167 + ASSERT_EQ(query_prog_cnt(cgroup_fd2, "bpf_lsm_socket_bind"), 0, "prog count"); 168 + bind_link_fd2 = bpf_link_create(bind_prog_fd2, cgroup_fd2, 169 + BPF_LSM_CGROUP, NULL); 170 + if (!ASSERT_GE(bind_link_fd2, 0, "link create bind_prog_fd2")) 171 + goto detach_cgroup; 172 + ASSERT_EQ(query_prog_cnt(cgroup_fd2, "bpf_lsm_socket_bind"), 1, "prog count"); 173 + ASSERT_EQ(query_prog_cnt(cgroup_fd, NULL), 4, "total prog count"); 174 + ASSERT_EQ(query_prog_cnt(cgroup_fd2, NULL), 1, "total prog count"); 175 + 176 + /* AF_UNIX is prohibited. */ 177 + 178 + fd = socket(AF_UNIX, SOCK_STREAM, 0); 179 + ASSERT_LT(fd, 0, "socket(AF_UNIX)"); 180 + close(fd); 181 + 182 + /* AF_INET6 gets default policy (sk_priority). 
*/ 183 + 184 + fd = socket(AF_INET6, SOCK_STREAM, 0); 185 + if (!ASSERT_GE(fd, 0, "socket(SOCK_STREAM)")) 186 + goto detach_cgroup; 187 + 188 + prio = 0; 189 + socklen = sizeof(prio); 190 + ASSERT_GE(getsockopt(fd, SOL_SOCKET, SO_PRIORITY, &prio, &socklen), 0, 191 + "getsockopt"); 192 + ASSERT_EQ(prio, 123, "sk_priority"); 193 + 194 + close(fd); 195 + 196 + /* TX-only AF_PACKET is allowed. */ 197 + 198 + ASSERT_LT(socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL)), 0, 199 + "socket(AF_PACKET, ..., ETH_P_ALL)"); 200 + 201 + fd = socket(AF_PACKET, SOCK_RAW, 0); 202 + ASSERT_GE(fd, 0, "socket(AF_PACKET, ..., 0)"); 203 + 204 + /* TX-only AF_PACKET can not be rebound. */ 205 + 206 + struct sockaddr_ll sa = { 207 + .sll_family = AF_PACKET, 208 + .sll_protocol = htons(ETH_P_ALL), 209 + }; 210 + ASSERT_LT(bind(fd, (struct sockaddr *)&sa, sizeof(sa)), 0, 211 + "bind(ETH_P_ALL)"); 212 + 213 + close(fd); 214 + 215 + /* Trigger passive open. */ 216 + 217 + listen_fd = start_server(AF_INET6, SOCK_STREAM, "::1", 0, 0); 218 + ASSERT_GE(listen_fd, 0, "start_server"); 219 + client_fd = connect_to_fd(listen_fd, 0); 220 + ASSERT_GE(client_fd, 0, "connect_to_fd"); 221 + accepted_fd = accept(listen_fd, NULL, NULL); 222 + ASSERT_GE(accepted_fd, 0, "accept"); 223 + 224 + prio = 0; 225 + socklen = sizeof(prio); 226 + ASSERT_GE(getsockopt(accepted_fd, SOL_SOCKET, SO_PRIORITY, &prio, &socklen), 0, 227 + "getsockopt"); 228 + ASSERT_EQ(prio, 234, "sk_priority"); 229 + 230 + /* These are replaced and never called. 
*/ 231 + ASSERT_EQ(skel->bss->called_socket_post_create, 0, "called_create"); 232 + ASSERT_EQ(skel->bss->called_socket_bind, 0, "called_bind"); 233 + 234 + /* AF_INET6+SOCK_STREAM 235 + * AF_PACKET+SOCK_RAW 236 + * listen_fd 237 + * client_fd 238 + * accepted_fd 239 + */ 240 + ASSERT_EQ(skel->bss->called_socket_post_create2, 5, "called_create2"); 241 + 242 + /* start_server 243 + * bind(ETH_P_ALL) 244 + */ 245 + ASSERT_EQ(skel->bss->called_socket_bind2, 2, "called_bind2"); 246 + /* Single accept(). */ 247 + ASSERT_EQ(skel->bss->called_socket_clone, 1, "called_clone"); 248 + 249 + /* AF_UNIX+SOCK_STREAM (failed) 250 + * AF_INET6+SOCK_STREAM 251 + * AF_PACKET+SOCK_RAW (failed) 252 + * AF_PACKET+SOCK_RAW 253 + * listen_fd 254 + * client_fd 255 + * accepted_fd 256 + */ 257 + ASSERT_EQ(skel->bss->called_socket_alloc, 7, "called_alloc"); 258 + 259 + close(listen_fd); 260 + close(client_fd); 261 + close(accepted_fd); 262 + 263 + /* Make sure other cgroup doesn't trigger the programs. */ 264 + 265 + if (!ASSERT_OK(join_cgroup("/sock_policy_empty"), "join root cgroup")) 266 + goto detach_cgroup; 267 + 268 + fd = socket(AF_INET6, SOCK_STREAM, 0); 269 + if (!ASSERT_GE(fd, 0, "socket(SOCK_STREAM)")) 270 + goto detach_cgroup; 271 + 272 + prio = 0; 273 + socklen = sizeof(prio); 274 + ASSERT_GE(getsockopt(fd, SOL_SOCKET, SO_PRIORITY, &prio, &socklen), 0, 275 + "getsockopt"); 276 + ASSERT_EQ(prio, 0, "sk_priority"); 277 + 278 + close(fd); 279 + 280 + detach_cgroup: 281 + ASSERT_GE(bpf_prog_detach2(post_create_prog_fd2, cgroup_fd, 282 + BPF_LSM_CGROUP), 0, "detach_create"); 283 + close(bind_link_fd); 284 + /* Don't close bind_link_fd2, exercise cgroup release cleanup. 
*/ 285 + ASSERT_GE(bpf_prog_detach2(alloc_prog_fd, cgroup_fd, 286 + BPF_LSM_CGROUP), 0, "detach_alloc"); 287 + ASSERT_GE(bpf_prog_detach2(clone_prog_fd, cgroup_fd, 288 + BPF_LSM_CGROUP), 0, "detach_clone"); 289 + 290 + close_cgroup: 291 + close(cgroup_fd); 292 + close(cgroup_fd2); 293 + close(cgroup_fd3); 294 + lsm_cgroup__destroy(skel); 295 + } 296 + 297 + static void test_lsm_cgroup_nonvoid(void) 298 + { 299 + struct lsm_cgroup_nonvoid *skel = NULL; 300 + 301 + skel = lsm_cgroup_nonvoid__open_and_load(); 302 + ASSERT_NULL(skel, "open succeeds"); 303 + lsm_cgroup_nonvoid__destroy(skel); 304 + } 305 + 306 + void test_lsm_cgroup(void) 307 + { 308 + if (test__start_subtest("functional")) 309 + test_lsm_cgroup_functional(); 310 + if (test__start_subtest("nonvoid")) 311 + test_lsm_cgroup_nonvoid(); 312 + btf__free(btf); 313 + }
+1 -1
tools/testing/selftests/bpf/prog_tests/resolve_btfids.c
··· 44 44 BTF_ID(func, func) 45 45 46 46 extern __u32 test_list_global[]; 47 - BTF_ID_LIST_GLOBAL(test_list_global) 47 + BTF_ID_LIST_GLOBAL(test_list_global, 1) 48 48 BTF_ID_UNUSED 49 49 BTF_ID(typedef, S) 50 50 BTF_ID(typedef, T)
-1
tools/testing/selftests/bpf/prog_tests/sock_fields.c
··· 394 394 test(); 395 395 396 396 done: 397 - test_sock_fields__detach(skel); 398 397 test_sock_fields__destroy(skel); 399 398 if (child_cg_fd >= 0) 400 399 close(child_cg_fd);
+1 -1
tools/testing/selftests/bpf/prog_tests/usdt.c
··· 12 12 13 13 static volatile int idx = 2; 14 14 static volatile __u64 bla = 0xFEDCBA9876543210ULL; 15 - static volatile short nums[] = {-1, -2, -3, }; 15 + static volatile short nums[] = {-1, -2, -3, -4}; 16 16 17 17 static volatile struct { 18 18 int x;
+1 -1
tools/testing/selftests/bpf/prog_tests/xdp_synproxy.c
··· 63 63 static void test_synproxy(bool xdp) 64 64 { 65 65 int server_fd = -1, client_fd = -1, accept_fd = -1; 66 - char *prog_id, *prog_id_end; 66 + char *prog_id = NULL, *prog_id_end; 67 67 struct nstoken *ns = NULL; 68 68 FILE *ctrl_file = NULL; 69 69 char buf[CMD_OUT_BUF_SIZE];
+114
tools/testing/selftests/bpf/progs/bpf_loop.c
··· 11 11 int output; 12 12 }; 13 13 14 + struct { 15 + __uint(type, BPF_MAP_TYPE_HASH); 16 + __uint(max_entries, 32); 17 + __type(key, int); 18 + __type(value, int); 19 + } map1 SEC(".maps"); 20 + 14 21 /* These should be set by the user program */ 15 22 u32 nested_callback_nr_loops; 16 23 u32 stop_index = -1; 17 24 u32 nr_loops; 18 25 int pid; 26 + int callback_selector; 19 27 20 28 /* Making these global variables so that the userspace program 21 29 * can verify the output through the skeleton ··· 116 108 bpf_loop(nr_loops, nested_callback1, &data, 0); 117 109 118 110 g_output = data.output; 111 + 112 + return 0; 113 + } 114 + 115 + static int callback_set_f0(int i, void *ctx) 116 + { 117 + g_output = 0xF0; 118 + return 0; 119 + } 120 + 121 + static int callback_set_0f(int i, void *ctx) 122 + { 123 + g_output = 0x0F; 124 + return 0; 125 + } 126 + 127 + /* 128 + * non-constant callback is a corner case for bpf_loop inline logic 129 + */ 130 + SEC("fentry/" SYS_PREFIX "sys_nanosleep") 131 + int prog_non_constant_callback(void *ctx) 132 + { 133 + struct callback_ctx data = {}; 134 + 135 + if (bpf_get_current_pid_tgid() >> 32 != pid) 136 + return 0; 137 + 138 + int (*callback)(int i, void *ctx); 139 + 140 + g_output = 0; 141 + 142 + if (callback_selector == 0x0F) 143 + callback = callback_set_0f; 144 + else 145 + callback = callback_set_f0; 146 + 147 + bpf_loop(1, callback, NULL, 0); 148 + 149 + return 0; 150 + } 151 + 152 + static int stack_check_inner_callback(void *ctx) 153 + { 154 + return 0; 155 + } 156 + 157 + static int map1_lookup_elem(int key) 158 + { 159 + int *val = bpf_map_lookup_elem(&map1, &key); 160 + 161 + return val ? 
*val : -1; 162 + } 163 + 164 + static void map1_update_elem(int key, int val) 165 + { 166 + bpf_map_update_elem(&map1, &key, &val, BPF_ANY); 167 + } 168 + 169 + static int stack_check_outer_callback(void *ctx) 170 + { 171 + int a = map1_lookup_elem(1); 172 + int b = map1_lookup_elem(2); 173 + int c = map1_lookup_elem(3); 174 + int d = map1_lookup_elem(4); 175 + int e = map1_lookup_elem(5); 176 + int f = map1_lookup_elem(6); 177 + 178 + bpf_loop(1, stack_check_inner_callback, NULL, 0); 179 + 180 + map1_update_elem(1, a + 1); 181 + map1_update_elem(2, b + 1); 182 + map1_update_elem(3, c + 1); 183 + map1_update_elem(4, d + 1); 184 + map1_update_elem(5, e + 1); 185 + map1_update_elem(6, f + 1); 186 + 187 + return 0; 188 + } 189 + 190 + /* Some of the local variables in stack_check and 191 + * stack_check_outer_callback would be allocated on stack by 192 + * compiler. This test should verify that stack content for these 193 + * variables is preserved between calls to bpf_loop (might be an issue 194 + * if loop inlining allocates stack slots incorrectly). 195 + */ 196 + SEC("fentry/" SYS_PREFIX "sys_nanosleep") 197 + int stack_check(void *ctx) 198 + { 199 + if (bpf_get_current_pid_tgid() >> 32 != pid) 200 + return 0; 201 + 202 + int a = map1_lookup_elem(7); 203 + int b = map1_lookup_elem(8); 204 + int c = map1_lookup_elem(9); 205 + int d = map1_lookup_elem(10); 206 + int e = map1_lookup_elem(11); 207 + int f = map1_lookup_elem(12); 208 + 209 + bpf_loop(1, stack_check_outer_callback, NULL, 0); 210 + 211 + map1_update_elem(7, a + 1); 212 + map1_update_elem(8, b + 1); 213 + map1_update_elem(9, c + 1); 214 + map1_update_elem(10, d + 1); 215 + map1_update_elem(11, e + 1); 216 + map1_update_elem(12, f + 1); 119 217 120 218 return 0; 121 219 }
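The prog_non_constant_callback test above selects its callback at run time, which is the case the bpf_loop() inlining logic cannot resolve statically. A minimal userspace sketch of that shape follows. Note that emulated_loop() and run_non_constant() are hypothetical names introduced here for illustration; emulated_loop() is not the kernel helper, it only assumes the helper's documented contract (call the callback with the current index, stop when it returns non-zero, return the number of iterations performed).

```c
#include <assert.h>
#include <stddef.h>

typedef int (*loop_cb)(int i, void *ctx);

static int g_output;

static int callback_set_f0(int i, void *ctx)
{
	(void)i;
	(void)ctx;
	g_output = 0xF0;
	return 0;	/* 0 means "keep looping" */
}

static int callback_set_0f(int i, void *ctx)
{
	(void)i;
	(void)ctx;
	g_output = 0x0F;
	return 0;
}

/* Hypothetical userspace stand-in for bpf_loop(); not the kernel code. */
static int emulated_loop(unsigned int nr_loops, loop_cb cb, void *ctx)
{
	unsigned int i;

	assert(cb != NULL);
	for (i = 0; i < nr_loops; i++) {
		if (cb((int)i, ctx))
			return (int)i + 1;	/* callback asked to stop */
	}
	return (int)i;
}

/* Mirrors the selection logic of prog_non_constant_callback. */
static int run_non_constant(int callback_selector)
{
	loop_cb callback;

	g_output = 0;
	if (callback_selector == 0x0F)
		callback = callback_set_0f;
	else
		callback = callback_set_f0;

	/* The callee is only known at run time, so nothing can be inlined. */
	emulated_loop(1, callback, NULL);
	return g_output;
}
```

Calling run_non_constant(0x0F) leaves g_output at 0x0F, and any other selector leaves it at 0xF0, which is presumably what the userspace side of the selftest checks through the skeleton.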
+1
tools/testing/selftests/bpf/progs/bpf_tracing_net.h
···
 #define SOL_SOCKET 1
 #define SO_SNDBUF 7
 #define __SO_ACCEPTCON (1 << 16)
+#define SO_PRIORITY 12
 
 #define SOL_TCP 6
 #define TCP_CONGESTION 13
+3
tools/testing/selftests/bpf/progs/btf__core_reloc_type_based___diff.c
···
+#include "core_reloc_types.h"
+
+void f(struct core_reloc_type_based___diff x) {}
+101 -11
tools/testing/selftests/bpf/progs/core_reloc_types.h
··· 13 13 int valid[10]; 14 14 char comm[sizeof("test_progs")]; 15 15 int comm_len; 16 + bool local_task_struct_matches; 16 17 }; 17 18 18 19 /* ··· 861 860 }; 862 861 863 862 /* 864 - * TYPE EXISTENCE & SIZE 863 + * TYPE EXISTENCE, MATCH & SIZE 865 864 */ 866 865 struct core_reloc_type_based_output { 867 866 bool struct_exists; 867 + bool complex_struct_exists; 868 868 bool union_exists; 869 869 bool enum_exists; 870 870 bool typedef_named_struct_exists; ··· 874 872 bool typedef_int_exists; 875 873 bool typedef_enum_exists; 876 874 bool typedef_void_ptr_exists; 875 + bool typedef_restrict_ptr_exists; 877 876 bool typedef_func_proto_exists; 878 877 bool typedef_arr_exists; 878 + 879 + bool struct_matches; 880 + bool complex_struct_matches; 881 + bool union_matches; 882 + bool enum_matches; 883 + bool typedef_named_struct_matches; 884 + bool typedef_anon_struct_matches; 885 + bool typedef_struct_ptr_matches; 886 + bool typedef_int_matches; 887 + bool typedef_enum_matches; 888 + bool typedef_void_ptr_matches; 889 + bool typedef_restrict_ptr_matches; 890 + bool typedef_func_proto_matches; 891 + bool typedef_arr_matches; 879 892 880 893 int struct_sz; 881 894 int union_sz; ··· 907 890 908 891 struct a_struct { 909 892 int x; 893 + }; 894 + 895 + struct a_complex_struct { 896 + union { 897 + struct a_struct * restrict a; 898 + void *b; 899 + } x; 900 + volatile long y; 910 901 }; 911 902 912 903 union a_union { ··· 941 916 typedef enum { TYPEDEF_ENUM_VAL1, TYPEDEF_ENUM_VAL2 } enum_typedef; 942 917 943 918 typedef void *void_ptr_typedef; 919 + typedef int *restrict restrict_ptr_typedef; 944 920 945 921 typedef int (*func_proto_typedef)(long); 946 922 ··· 949 923 950 924 struct core_reloc_type_based { 951 925 struct a_struct f1; 952 - union a_union f2; 953 - enum an_enum f3; 954 - named_struct_typedef f4; 955 - anon_struct_typedef f5; 956 - struct_ptr_typedef f6; 957 - int_typedef f7; 958 - enum_typedef f8; 959 - void_ptr_typedef f9; 960 - func_proto_typedef f10; 961 - 
arr_typedef f11; 926 + struct a_complex_struct f2; 927 + union a_union f3; 928 + enum an_enum f4; 929 + named_struct_typedef f5; 930 + anon_struct_typedef f6; 931 + struct_ptr_typedef f7; 932 + int_typedef f8; 933 + enum_typedef f9; 934 + void_ptr_typedef f10; 935 + restrict_ptr_typedef f11; 936 + func_proto_typedef f12; 937 + arr_typedef f13; 962 938 }; 963 939 964 940 /* no types in target */ 965 941 struct core_reloc_type_based___all_missing { 942 + }; 943 + 944 + /* different member orders, enum variant values, signedness, etc */ 945 + struct a_struct___diff { 946 + int x; 947 + int a; 948 + }; 949 + 950 + struct a_struct___forward; 951 + 952 + struct a_complex_struct___diff { 953 + union { 954 + struct a_struct___forward *a; 955 + void *b; 956 + } x; 957 + volatile long y; 958 + }; 959 + 960 + union a_union___diff { 961 + int z; 962 + int y; 963 + }; 964 + 965 + typedef struct a_struct___diff named_struct_typedef___diff; 966 + 967 + typedef struct { int z, x, y; } anon_struct_typedef___diff; 968 + 969 + typedef struct { 970 + int c; 971 + int b; 972 + int a; 973 + } *struct_ptr_typedef___diff; 974 + 975 + enum an_enum___diff { 976 + AN_ENUM_VAL2___diff = 0, 977 + AN_ENUM_VAL1___diff = 42, 978 + AN_ENUM_VAL3___diff = 1, 979 + }; 980 + 981 + typedef unsigned int int_typedef___diff; 982 + 983 + typedef enum { TYPEDEF_ENUM_VAL2___diff, TYPEDEF_ENUM_VAL1___diff = 50 } enum_typedef___diff; 984 + 985 + typedef const void *void_ptr_typedef___diff; 986 + 987 + typedef int_typedef___diff (*func_proto_typedef___diff)(long); 988 + 989 + typedef char arr_typedef___diff[3]; 990 + 991 + struct core_reloc_type_based___diff { 992 + struct a_struct___diff f1; 993 + struct a_complex_struct___diff f2; 994 + union a_union___diff f3; 995 + enum an_enum___diff f4; 996 + named_struct_typedef___diff f5; 997 + anon_struct_typedef___diff f6; 998 + struct_ptr_typedef___diff f7; 999 + int_typedef___diff f8; 1000 + enum_typedef___diff f9; 1001 + void_ptr_typedef___diff f10; 1002 + 
func_proto_typedef___diff f11; 1003 + arr_typedef___diff f12; 966 1004 }; 967 1005 968 1006 /* different type sizes, extra modifiers, anon vs named enums, etc */
+104
tools/testing/selftests/bpf/progs/local_storage_bench.c
··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + /* Copyright (c) 2022 Meta Platforms, Inc. and affiliates. */ 3 + 4 + #include "vmlinux.h" 5 + #include <bpf/bpf_helpers.h> 6 + #include "bpf_misc.h" 7 + 8 + #define HASHMAP_SZ 4194304 9 + 10 + struct { 11 + __uint(type, BPF_MAP_TYPE_ARRAY_OF_MAPS); 12 + __uint(max_entries, 1000); 13 + __type(key, int); 14 + __type(value, int); 15 + __array(values, struct { 16 + __uint(type, BPF_MAP_TYPE_TASK_STORAGE); 17 + __uint(map_flags, BPF_F_NO_PREALLOC); 18 + __type(key, int); 19 + __type(value, int); 20 + }); 21 + } array_of_local_storage_maps SEC(".maps"); 22 + 23 + struct { 24 + __uint(type, BPF_MAP_TYPE_ARRAY_OF_MAPS); 25 + __uint(max_entries, 1000); 26 + __type(key, int); 27 + __type(value, int); 28 + __array(values, struct { 29 + __uint(type, BPF_MAP_TYPE_HASH); 30 + __uint(max_entries, HASHMAP_SZ); 31 + __type(key, int); 32 + __type(value, int); 33 + }); 34 + } array_of_hash_maps SEC(".maps"); 35 + 36 + long important_hits; 37 + long hits; 38 + 39 + /* set from user-space */ 40 + const volatile unsigned int use_hashmap; 41 + const volatile unsigned int hashmap_num_keys; 42 + const volatile unsigned int num_maps; 43 + const volatile unsigned int interleave; 44 + 45 + struct loop_ctx { 46 + struct task_struct *task; 47 + long loop_hits; 48 + long loop_important_hits; 49 + }; 50 + 51 + static int do_lookup(unsigned int elem, struct loop_ctx *lctx) 52 + { 53 + void *map, *inner_map; 54 + int idx = 0; 55 + 56 + if (use_hashmap) 57 + map = &array_of_hash_maps; 58 + else 59 + map = &array_of_local_storage_maps; 60 + 61 + inner_map = bpf_map_lookup_elem(map, &elem); 62 + if (!inner_map) 63 + return -1; 64 + 65 + if (use_hashmap) { 66 + idx = bpf_get_prandom_u32() % hashmap_num_keys; 67 + bpf_map_lookup_elem(inner_map, &idx); 68 + } else { 69 + bpf_task_storage_get(inner_map, lctx->task, &idx, 70 + BPF_LOCAL_STORAGE_GET_F_CREATE); 71 + } 72 + 73 + lctx->loop_hits++; 74 + if (!elem) 75 + lctx->loop_important_hits++; 76 + return 0; 
77 + } 78 + 79 + static long loop(u32 index, void *ctx) 80 + { 81 + struct loop_ctx *lctx = (struct loop_ctx *)ctx; 82 + unsigned int map_idx = index % num_maps; 83 + 84 + do_lookup(map_idx, lctx); 85 + if (interleave && map_idx % 3 == 0) 86 + do_lookup(0, lctx); 87 + return 0; 88 + } 89 + 90 + SEC("fentry/" SYS_PREFIX "sys_getpgid") 91 + int get_local(void *ctx) 92 + { 93 + struct loop_ctx lctx; 94 + 95 + lctx.task = bpf_get_current_task_btf(); 96 + lctx.loop_hits = 0; 97 + lctx.loop_important_hits = 0; 98 + bpf_loop(10000, &loop, &lctx, 0); 99 + __sync_add_and_fetch(&hits, lctx.loop_hits); 100 + __sync_add_and_fetch(&important_hits, lctx.loop_important_hits); 101 + return 0; 102 + } 103 + 104 + char _license[] SEC("license") = "GPL";
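The counting done by loop() and do_lookup() in the benchmark above can be checked with plain arithmetic. The sketch below is a userspace simulation under stated assumptions: every map lookup succeeds (so do_lookup() never returns early), and num_maps = 10 is an arbitrary illustrative value, not one taken from the benchmark. The names loop_counts, count_lookup() and simulate() are hypothetical.

```c
#include <assert.h>

struct loop_counts {
	long hits;
	long important_hits;
};

/* Counting part of do_lookup(), assuming the lookup itself succeeded. */
static void count_lookup(unsigned int elem, struct loop_counts *c)
{
	c->hits++;
	if (!elem)
		c->important_hits++;	/* map 0 is the "important" map */
}

/* Replays the index arithmetic of loop() for nr_loops iterations. */
static struct loop_counts simulate(unsigned int nr_loops,
				   unsigned int num_maps, int interleave)
{
	struct loop_counts c = { 0, 0 };
	unsigned int index;

	assert(num_maps > 0);	/* index % num_maps would otherwise trap */
	for (index = 0; index < nr_loops; index++) {
		unsigned int map_idx = index % num_maps;

		count_lookup(map_idx, &c);
		if (interleave && map_idx % 3 == 0)
			count_lookup(0, &c);	/* extra lookup in map 0 */
	}
	return c;
}
```

With 10000 iterations and 10 maps, interleaving adds one extra lookup for map indices 0, 3, 6 and 9, that is 4000 extra lookups, all of them in the important map.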
+67
tools/testing/selftests/bpf/progs/local_storage_rcu_tasks_trace_bench.c
···
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2022 Meta Platforms, Inc. and affiliates. */
+
+#include "vmlinux.h"
+#include <bpf/bpf_helpers.h>
+#include "bpf_misc.h"
+
+struct {
+	__uint(type, BPF_MAP_TYPE_TASK_STORAGE);
+	__uint(map_flags, BPF_F_NO_PREALLOC);
+	__type(key, int);
+	__type(value, int);
+} task_storage SEC(".maps");
+
+long hits;
+long gp_hits;
+long gp_times;
+long current_gp_start;
+long unexpected;
+bool postgp_seen;
+
+SEC("fentry/" SYS_PREFIX "sys_getpgid")
+int get_local(void *ctx)
+{
+	struct task_struct *task;
+	int idx;
+	int *s;
+
+	idx = 0;
+	task = bpf_get_current_task_btf();
+	s = bpf_task_storage_get(&task_storage, task, &idx,
+				 BPF_LOCAL_STORAGE_GET_F_CREATE);
+	if (!s)
+		return 0;
+
+	*s = 3;
+	bpf_task_storage_delete(&task_storage, task);
+	__sync_add_and_fetch(&hits, 1);
+	return 0;
+}
+
+SEC("fentry/rcu_tasks_trace_pregp_step")
+int pregp_step(struct pt_regs *ctx)
+{
+	current_gp_start = bpf_ktime_get_ns();
+	return 0;
+}
+
+SEC("fentry/rcu_tasks_trace_postgp")
+int postgp(struct pt_regs *ctx)
+{
+	if (!current_gp_start && postgp_seen) {
+		/* Will only happen if prog tracing rcu_tasks_trace_pregp_step doesn't
+		 * execute before this prog
+		 */
+		__sync_add_and_fetch(&unexpected, 1);
+		return 0;
+	}
+
+	__sync_add_and_fetch(&gp_times, bpf_ktime_get_ns() - current_gp_start);
+	__sync_add_and_fetch(&gp_hits, 1);
+	current_gp_start = 0;
+	postgp_seen = true;
+	return 0;
+}
+
+char _license[] SEC("license") = "GPL";
+180
tools/testing/selftests/bpf/progs/lsm_cgroup.c
··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + 3 + #include "vmlinux.h" 4 + #include "bpf_tracing_net.h" 5 + #include <bpf/bpf_helpers.h> 6 + #include <bpf/bpf_tracing.h> 7 + 8 + char _license[] SEC("license") = "GPL"; 9 + 10 + #ifndef AF_PACKET 11 + #define AF_PACKET 17 12 + #endif 13 + 14 + #ifndef AF_UNIX 15 + #define AF_UNIX 1 16 + #endif 17 + 18 + #ifndef EPERM 19 + #define EPERM 1 20 + #endif 21 + 22 + struct { 23 + __uint(type, BPF_MAP_TYPE_CGROUP_STORAGE); 24 + __type(key, __u64); 25 + __type(value, __u64); 26 + } cgroup_storage SEC(".maps"); 27 + 28 + int called_socket_post_create; 29 + int called_socket_post_create2; 30 + int called_socket_bind; 31 + int called_socket_bind2; 32 + int called_socket_alloc; 33 + int called_socket_clone; 34 + 35 + static __always_inline int test_local_storage(void) 36 + { 37 + __u64 *val; 38 + 39 + val = bpf_get_local_storage(&cgroup_storage, 0); 40 + if (!val) 41 + return 0; 42 + *val += 1; 43 + 44 + return 1; 45 + } 46 + 47 + static __always_inline int real_create(struct socket *sock, int family, 48 + int protocol) 49 + { 50 + struct sock *sk; 51 + int prio = 123; 52 + 53 + /* Reject non-tx-only AF_PACKET. */ 54 + if (family == AF_PACKET && protocol != 0) 55 + return 0; /* EPERM */ 56 + 57 + sk = sock->sk; 58 + if (!sk) 59 + return 1; 60 + 61 + /* The rest of the sockets get default policy. */ 62 + if (bpf_setsockopt(sk, SOL_SOCKET, SO_PRIORITY, &prio, sizeof(prio))) 63 + return 0; /* EPERM */ 64 + 65 + /* Make sure bpf_getsockopt is allowed and works. */ 66 + prio = 0; 67 + if (bpf_getsockopt(sk, SOL_SOCKET, SO_PRIORITY, &prio, sizeof(prio))) 68 + return 0; /* EPERM */ 69 + if (prio != 123) 70 + return 0; /* EPERM */ 71 + 72 + /* Can access cgroup local storage. 
*/ 73 + if (!test_local_storage()) 74 + return 0; /* EPERM */ 75 + 76 + return 1; 77 + } 78 + 79 + /* __cgroup_bpf_run_lsm_socket */ 80 + SEC("lsm_cgroup/socket_post_create") 81 + int BPF_PROG(socket_post_create, struct socket *sock, int family, 82 + int type, int protocol, int kern) 83 + { 84 + called_socket_post_create++; 85 + return real_create(sock, family, protocol); 86 + } 87 + 88 + /* __cgroup_bpf_run_lsm_socket */ 89 + SEC("lsm_cgroup/socket_post_create") 90 + int BPF_PROG(socket_post_create2, struct socket *sock, int family, 91 + int type, int protocol, int kern) 92 + { 93 + called_socket_post_create2++; 94 + return real_create(sock, family, protocol); 95 + } 96 + 97 + static __always_inline int real_bind(struct socket *sock, 98 + struct sockaddr *address, 99 + int addrlen) 100 + { 101 + struct sockaddr_ll sa = {}; 102 + 103 + if (sock->sk->__sk_common.skc_family != AF_PACKET) 104 + return 1; 105 + 106 + if (sock->sk->sk_kern_sock) 107 + return 1; 108 + 109 + bpf_probe_read_kernel(&sa, sizeof(sa), address); 110 + if (sa.sll_protocol) 111 + return 0; /* EPERM */ 112 + 113 + /* Can access cgroup local storage. 
*/ 114 + if (!test_local_storage()) 115 + return 0; /* EPERM */ 116 + 117 + return 1; 118 + } 119 + 120 + /* __cgroup_bpf_run_lsm_socket */ 121 + SEC("lsm_cgroup/socket_bind") 122 + int BPF_PROG(socket_bind, struct socket *sock, struct sockaddr *address, 123 + int addrlen) 124 + { 125 + called_socket_bind++; 126 + return real_bind(sock, address, addrlen); 127 + } 128 + 129 + /* __cgroup_bpf_run_lsm_socket */ 130 + SEC("lsm_cgroup/socket_bind") 131 + int BPF_PROG(socket_bind2, struct socket *sock, struct sockaddr *address, 132 + int addrlen) 133 + { 134 + called_socket_bind2++; 135 + return real_bind(sock, address, addrlen); 136 + } 137 + 138 + /* __cgroup_bpf_run_lsm_current (via bpf_lsm_current_hooks) */ 139 + SEC("lsm_cgroup/sk_alloc_security") 140 + int BPF_PROG(socket_alloc, struct sock *sk, int family, gfp_t priority) 141 + { 142 + called_socket_alloc++; 143 + if (family == AF_UNIX) 144 + return 0; /* EPERM */ 145 + 146 + /* Can access cgroup local storage. */ 147 + if (!test_local_storage()) 148 + return 0; /* EPERM */ 149 + 150 + return 1; 151 + } 152 + 153 + /* __cgroup_bpf_run_lsm_sock */ 154 + SEC("lsm_cgroup/inet_csk_clone") 155 + int BPF_PROG(socket_clone, struct sock *newsk, const struct request_sock *req) 156 + { 157 + int prio = 234; 158 + 159 + if (!newsk) 160 + return 1; 161 + 162 + /* Accepted request sockets get a different priority. */ 163 + if (bpf_setsockopt(newsk, SOL_SOCKET, SO_PRIORITY, &prio, sizeof(prio))) 164 + return 1; 165 + 166 + /* Make sure bpf_getsockopt is allowed and works. */ 167 + prio = 0; 168 + if (bpf_getsockopt(newsk, SOL_SOCKET, SO_PRIORITY, &prio, sizeof(prio))) 169 + return 1; 170 + if (prio != 234) 171 + return 1; 172 + 173 + /* Can access cgroup local storage. */ 174 + if (!test_local_storage()) 175 + return 1; 176 + 177 + called_socket_clone++; 178 + 179 + return 1; 180 + }
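The lsm_cgroup programs above use a verdict convention in which returning 1 allows the operation and returning 0 rejects it (the source marks those returns as EPERM). The sketch below isolates only the pure decision logic of real_create() and socket_alloc() so it can run in userspace; the bpf_setsockopt(), bpf_getsockopt() and cgroup storage steps are left out. The function names, the VERDICT_* constants and verdict_to_errno() are hypothetical, and the mapping to -EPERM is an assumption drawn from the comments in the test, not from the kernel code.

```c
#include <assert.h>

#ifndef AF_UNIX
#define AF_UNIX 1
#endif

#ifndef AF_PACKET
#define AF_PACKET 17
#endif

#ifndef EPERM
#define EPERM 1
#endif

/* Verdict convention used by the selftest: 1 = allow, 0 = reject. */
enum { VERDICT_REJECT = 0, VERDICT_ALLOW = 1 };

/* First check of real_create(): reject non-tx-only AF_PACKET sockets. */
static int create_verdict(int family, int protocol)
{
	if (family == AF_PACKET && protocol != 0)
		return VERDICT_REJECT;
	return VERDICT_ALLOW;
}

/* Family check of socket_alloc(): AF_UNIX allocations are rejected. */
static int alloc_verdict(int family)
{
	if (family == AF_UNIX)
		return VERDICT_REJECT;
	return VERDICT_ALLOW;
}

/* Hypothetical helper: how a caller might translate a verdict (assumption). */
static int verdict_to_errno(int verdict)
{
	assert(verdict == VERDICT_ALLOW || verdict == VERDICT_REJECT);
	return verdict == VERDICT_ALLOW ? 0 : -EPERM;
}
```

A tx-only packet socket (protocol 0) passes create_verdict(), while one with a non-zero protocol is rejected, matching the "Reject non-tx-only AF_PACKET" comment in the test.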
+14
tools/testing/selftests/bpf/progs/lsm_cgroup_nonvoid.c
···
+// SPDX-License-Identifier: GPL-2.0
+
+#include "vmlinux.h"
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+
+char _license[] SEC("license") = "GPL";
+
+SEC("lsm_cgroup/inet_csk_clone")
+int BPF_PROG(nonvoid_socket_clone, struct sock *newsk, const struct request_sock *req)
+{
+	/* Can not return any errors from void LSM hooks. */
+	return 0;
+}
+35
tools/testing/selftests/bpf/progs/tcp_ca_incompl_cong_ops.c
···
+// SPDX-License-Identifier: GPL-2.0
+
+#include "vmlinux.h"
+
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+
+char _license[] SEC("license") = "GPL";
+
+static inline struct tcp_sock *tcp_sk(const struct sock *sk)
+{
+	return (struct tcp_sock *)sk;
+}
+
+SEC("struct_ops/incompl_cong_ops_ssthresh")
+__u32 BPF_PROG(incompl_cong_ops_ssthresh, struct sock *sk)
+{
+	return tcp_sk(sk)->snd_ssthresh;
+}
+
+SEC("struct_ops/incompl_cong_ops_undo_cwnd")
+__u32 BPF_PROG(incompl_cong_ops_undo_cwnd, struct sock *sk)
+{
+	return tcp_sk(sk)->snd_cwnd;
+}
+
+SEC(".struct_ops")
+struct tcp_congestion_ops incompl_cong_ops = {
+	/* Intentionally leaving out any of the required cong_avoid() and
+	 * cong_control() here.
+	 */
+	.ssthresh = (void *)incompl_cong_ops_ssthresh,
+	.undo_cwnd = (void *)incompl_cong_ops_undo_cwnd,
+	.name = "bpf_incompl_ops",
+};
+21
tools/testing/selftests/bpf/progs/tcp_ca_unsupp_cong_op.c
···
+// SPDX-License-Identifier: GPL-2.0
+
+#include "vmlinux.h"
+
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+
+char _license[] SEC("license") = "GPL";
+
+SEC("struct_ops/unsupp_cong_op_get_info")
+size_t BPF_PROG(unsupp_cong_op_get_info, struct sock *sk, u32 ext, int *attr,
+		union tcp_cc_info *info)
+{
+	return 0;
+}
+
+SEC(".struct_ops")
+struct tcp_congestion_ops unsupp_cong_op = {
+	.get_info = (void *)unsupp_cong_op_get_info,
+	.name = "bpf_unsupp_op",
+};
+60
tools/testing/selftests/bpf/progs/tcp_ca_write_sk_pacing.c
···
+// SPDX-License-Identifier: GPL-2.0
+
+#include "vmlinux.h"
+
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+
+char _license[] SEC("license") = "GPL";
+
+#define USEC_PER_SEC 1000000UL
+
+#define min(a, b) ((a) < (b) ? (a) : (b))
+
+static inline struct tcp_sock *tcp_sk(const struct sock *sk)
+{
+	return (struct tcp_sock *)sk;
+}
+
+SEC("struct_ops/write_sk_pacing_init")
+void BPF_PROG(write_sk_pacing_init, struct sock *sk)
+{
+#ifdef ENABLE_ATOMICS_TESTS
+	__sync_bool_compare_and_swap(&sk->sk_pacing_status, SK_PACING_NONE,
+				     SK_PACING_NEEDED);
+#else
+	sk->sk_pacing_status = SK_PACING_NEEDED;
+#endif
+}
+
+SEC("struct_ops/write_sk_pacing_cong_control")
+void BPF_PROG(write_sk_pacing_cong_control, struct sock *sk,
+	      const struct rate_sample *rs)
+{
+	const struct tcp_sock *tp = tcp_sk(sk);
+	unsigned long rate =
+		((tp->snd_cwnd * tp->mss_cache * USEC_PER_SEC) << 3) /
+		(tp->srtt_us ?: 1U << 3);
+	sk->sk_pacing_rate = min(rate, sk->sk_max_pacing_rate);
+}
+
+SEC("struct_ops/write_sk_pacing_ssthresh")
+__u32 BPF_PROG(write_sk_pacing_ssthresh, struct sock *sk)
+{
+	return tcp_sk(sk)->snd_ssthresh;
+}
+
+SEC("struct_ops/write_sk_pacing_undo_cwnd")
+__u32 BPF_PROG(write_sk_pacing_undo_cwnd, struct sock *sk)
+{
+	return tcp_sk(sk)->snd_cwnd;
+}
+
+SEC(".struct_ops")
+struct tcp_congestion_ops write_sk_pacing = {
+	.init = (void *)write_sk_pacing_init,
+	.cong_control = (void *)write_sk_pacing_cong_control,
+	.ssthresh = (void *)write_sk_pacing_ssthresh,
+	.undo_cwnd = (void *)write_sk_pacing_undo_cwnd,
+	.name = "bpf_w_sk_pacing",
+};
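The pacing-rate expression in write_sk_pacing_cong_control is easier to follow with concrete numbers. The sketch below recomputes it in userspace, assuming (as in the kernel's struct tcp_sock) that srtt_us holds the smoothed RTT in microseconds shifted left by 3, so the << 3 in the numerator cancels that scaling and the result is in bytes per second. The function names are hypothetical, and the operands are widened to 64 bits up front, which differs slightly from the original expression where the first multiplication happens in 32 bits.

```c
#include <assert.h>
#include <stdint.h>

#define USEC_PER_SEC 1000000ULL

/* Hypothetical userspace re-computation of the selftest's pacing rate. */
static uint64_t pacing_rate(uint32_t snd_cwnd, uint32_t mss_cache,
			    uint32_t srtt_us, uint64_t max_pacing_rate)
{
	/* `srtt_us ?: 1U << 3` parses as `srtt_us ?: (1U << 3)`,
	 * so an unset RTT falls back to a divisor of 8.
	 */
	uint64_t divisor = srtt_us ? srtt_us : (1U << 3);
	uint64_t rate = (((uint64_t)snd_cwnd * mss_cache * USEC_PER_SEC) << 3) /
			divisor;

	/* Same clamp as min(rate, sk->sk_max_pacing_rate). */
	return rate < max_pacing_rate ? rate : max_pacing_rate;
}

/* Worked examples: 10 segments of 1448 bytes per RTT. */
static void pacing_rate_selfcheck(void)
{
	/* 20 ms RTT is stored as 20000 << 3 = 160000. */
	assert(pacing_rate(10, 1448, 160000, UINT64_MAX) == 724000);
	/* Unset RTT: divisor 8 gives a very large rate, so the cap applies. */
	assert(pacing_rate(10, 1448, 0, 1000000) == 1000000);
}
```

For a 20 ms RTT the window of 14480 bytes is sent 50 times per second, which gives the 724000 bytes per second checked above.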
-51
tools/testing/selftests/bpf/progs/test_btf_haskv.c
··· 1 - /* SPDX-License-Identifier: GPL-2.0 */ 2 - /* Copyright (c) 2018 Facebook */ 3 - #include <linux/bpf.h> 4 - #include <bpf/bpf_helpers.h> 5 - #include "bpf_legacy.h" 6 - 7 - struct ipv_counts { 8 - unsigned int v4; 9 - unsigned int v6; 10 - }; 11 - 12 - #pragma GCC diagnostic push 13 - #pragma GCC diagnostic ignored "-Wdeprecated-declarations" 14 - struct bpf_map_def SEC("maps") btf_map = { 15 - .type = BPF_MAP_TYPE_ARRAY, 16 - .key_size = sizeof(int), 17 - .value_size = sizeof(struct ipv_counts), 18 - .max_entries = 4, 19 - }; 20 - #pragma GCC diagnostic pop 21 - 22 - BPF_ANNOTATE_KV_PAIR(btf_map, int, struct ipv_counts); 23 - 24 - __attribute__((noinline)) 25 - int test_long_fname_2(void) 26 - { 27 - struct ipv_counts *counts; 28 - int key = 0; 29 - 30 - counts = bpf_map_lookup_elem(&btf_map, &key); 31 - if (!counts) 32 - return 0; 33 - 34 - counts->v6++; 35 - 36 - return 0; 37 - } 38 - 39 - __attribute__((noinline)) 40 - int test_long_fname_1(void) 41 - { 42 - return test_long_fname_2(); 43 - } 44 - 45 - SEC("dummy_tracepoint") 46 - int _dummy_tracepoint(void *arg) 47 - { 48 - return test_long_fname_1(); 49 - } 50 - 51 - char _license[] SEC("license") = "GPL";
-18
tools/testing/selftests/bpf/progs/test_btf_newkv.c
···
 	unsigned int v6;
 };
 
-#pragma GCC diagnostic push
-#pragma GCC diagnostic ignored "-Wdeprecated-declarations"
-/* just to validate we can handle maps in multiple sections */
-struct bpf_map_def SEC("maps") btf_map_legacy = {
-	.type = BPF_MAP_TYPE_ARRAY,
-	.key_size = sizeof(int),
-	.value_size = sizeof(long long),
-	.max_entries = 4,
-};
-#pragma GCC diagnostic pop
-
-BPF_ANNOTATE_KV_PAIR(btf_map_legacy, int, struct ipv_counts);
-
 struct {
 	__uint(type, BPF_MAP_TYPE_ARRAY);
 	__uint(max_entries, 4);
···
 		return 0;
 
 	counts->v6++;
-
-	/* just verify we can reference both maps */
-	counts = bpf_map_lookup_elem(&btf_map_legacy, &key);
-	if (!counts)
-		return 0;
 
 	return 0;
 }
+19
tools/testing/selftests/bpf/progs/test_core_reloc_kernel.c
··· 21 21 /* we have test_progs[-flavor], so cut flavor part */ 22 22 char comm[sizeof("test_progs")]; 23 23 int comm_len; 24 + bool local_task_struct_matches; 24 25 }; 25 26 26 27 struct task_struct { ··· 31 30 struct task_struct *group_leader; 32 31 }; 33 32 33 + struct mm_struct___wrong { 34 + int abc_whatever_should_not_exist; 35 + }; 36 + 37 + struct task_struct___local { 38 + int pid; 39 + struct mm_struct___wrong *mm; 40 + }; 41 + 34 42 #define CORE_READ(dst, src) bpf_core_read(dst, sizeof(*(dst)), src) 35 43 36 44 SEC("raw_tracepoint/sys_enter") 37 45 int test_core_kernel(void *ctx) 38 46 { 47 + /* Support for the BPF_TYPE_MATCHES argument to the 48 + * __builtin_preserve_type_info builtin was added at some point during 49 + * development of clang 15 and it's what we require for this test. 50 + */ 51 + #if __has_builtin(__builtin_preserve_type_info) && __clang_major__ >= 15 39 52 struct task_struct *task = (void *)bpf_get_current_task(); 40 53 struct core_reloc_kernel_output *out = (void *)&data.out; 41 54 uint64_t pid_tgid = bpf_get_current_pid_tgid(); ··· 108 93 group_leader, group_leader, group_leader, group_leader, 109 94 comm); 110 95 96 + out->local_task_struct_matches = bpf_core_type_matches(struct task_struct___local); 97 + #else 98 + data.skip = true; 99 + #endif 111 100 return 0; 112 101 } 113 102
+48 -1
tools/testing/selftests/bpf/progs/test_core_reloc_type_based.c
··· 19 19 int x; 20 20 }; 21 21 22 + struct a_complex_struct { 23 + union { 24 + struct a_struct *a; 25 + void *b; 26 + } x; 27 + volatile long y; 28 + }; 29 + 22 30 union a_union { 23 31 int y; 24 32 int z; ··· 51 43 typedef enum { TYPEDEF_ENUM_VAL1, TYPEDEF_ENUM_VAL2 } enum_typedef; 52 44 53 45 typedef void *void_ptr_typedef; 46 + typedef int *restrict restrict_ptr_typedef; 54 47 55 48 typedef int (*func_proto_typedef)(long); 56 49 ··· 59 50 60 51 struct core_reloc_type_based_output { 61 52 bool struct_exists; 53 + bool complex_struct_exists; 62 54 bool union_exists; 63 55 bool enum_exists; 64 56 bool typedef_named_struct_exists; ··· 68 58 bool typedef_int_exists; 69 59 bool typedef_enum_exists; 70 60 bool typedef_void_ptr_exists; 61 + bool typedef_restrict_ptr_exists; 71 62 bool typedef_func_proto_exists; 72 63 bool typedef_arr_exists; 64 + 65 + bool struct_matches; 66 + bool complex_struct_matches; 67 + bool union_matches; 68 + bool enum_matches; 69 + bool typedef_named_struct_matches; 70 + bool typedef_anon_struct_matches; 71 + bool typedef_struct_ptr_matches; 72 + bool typedef_int_matches; 73 + bool typedef_enum_matches; 74 + bool typedef_void_ptr_matches; 75 + bool typedef_restrict_ptr_matches; 76 + bool typedef_func_proto_matches; 77 + bool typedef_arr_matches; 73 78 74 79 int struct_sz; 75 80 int union_sz; ··· 102 77 SEC("raw_tracepoint/sys_enter") 103 78 int test_core_type_based(void *ctx) 104 79 { 105 - #if __has_builtin(__builtin_preserve_type_info) 80 + /* Support for the BPF_TYPE_MATCHES argument to the 81 + * __builtin_preserve_type_info builtin was added at some point during 82 + * development of clang 15 and it's what we require for this test. Part of it 83 + * could run with merely __builtin_preserve_type_info (which could be checked 84 + * separately), but we have to find an upper bound. 
85 + */ 86 + #if __has_builtin(__builtin_preserve_type_info) && __clang_major__ >= 15 106 87 struct core_reloc_type_based_output *out = (void *)&data.out; 107 88 108 89 out->struct_exists = bpf_core_type_exists(struct a_struct); 90 + out->complex_struct_exists = bpf_core_type_exists(struct a_complex_struct); 109 91 out->union_exists = bpf_core_type_exists(union a_union); 110 92 out->enum_exists = bpf_core_type_exists(enum an_enum); 111 93 out->typedef_named_struct_exists = bpf_core_type_exists(named_struct_typedef); ··· 121 89 out->typedef_int_exists = bpf_core_type_exists(int_typedef); 122 90 out->typedef_enum_exists = bpf_core_type_exists(enum_typedef); 123 91 out->typedef_void_ptr_exists = bpf_core_type_exists(void_ptr_typedef); 92 + out->typedef_restrict_ptr_exists = bpf_core_type_exists(restrict_ptr_typedef); 124 93 out->typedef_func_proto_exists = bpf_core_type_exists(func_proto_typedef); 125 94 out->typedef_arr_exists = bpf_core_type_exists(arr_typedef); 95 + 96 + out->struct_matches = bpf_core_type_matches(struct a_struct); 97 + out->complex_struct_matches = bpf_core_type_matches(struct a_complex_struct); 98 + out->union_matches = bpf_core_type_matches(union a_union); 99 + out->enum_matches = bpf_core_type_matches(enum an_enum); 100 + out->typedef_named_struct_matches = bpf_core_type_matches(named_struct_typedef); 101 + out->typedef_anon_struct_matches = bpf_core_type_matches(anon_struct_typedef); 102 + out->typedef_struct_ptr_matches = bpf_core_type_matches(struct_ptr_typedef); 103 + out->typedef_int_matches = bpf_core_type_matches(int_typedef); 104 + out->typedef_enum_matches = bpf_core_type_matches(enum_typedef); 105 + out->typedef_void_ptr_matches = bpf_core_type_matches(void_ptr_typedef); 106 + out->typedef_restrict_ptr_matches = bpf_core_type_matches(restrict_ptr_typedef); 107 + out->typedef_func_proto_matches = bpf_core_type_matches(func_proto_typedef); 108 + out->typedef_arr_matches = bpf_core_type_matches(arr_typedef); 126 109 127 110 
out->struct_sz = bpf_core_type_size(struct a_struct); 128 111 out->union_sz = bpf_core_type_size(union a_union);
+17 -7
tools/testing/selftests/bpf/progs/xdp_synproxy_kern.c
··· 77 77 __uint(max_entries, MAX_ALLOWED_PORTS); 78 78 } allowed_ports SEC(".maps"); 79 79 80 + /* Some symbols defined in net/netfilter/nf_conntrack_bpf.c are unavailable in 81 + * vmlinux.h if CONFIG_NF_CONNTRACK=m, so they are redefined locally. 82 + */ 83 + 84 + struct bpf_ct_opts___local { 85 + s32 netns_id; 86 + s32 error; 87 + u8 l4proto; 88 + u8 dir; 89 + u8 reserved[2]; 90 + } __attribute__((preserve_access_index)); 91 + 92 + #define BPF_F_CURRENT_NETNS (-1) 93 + 80 94 extern struct nf_conn *bpf_xdp_ct_lookup(struct xdp_md *xdp_ctx, 81 95 struct bpf_sock_tuple *bpf_tuple, 82 96 __u32 len_tuple, 83 - struct bpf_ct_opts *opts, 97 + struct bpf_ct_opts___local *opts, 84 98 __u32 len_opts) __ksym; 85 99 86 100 extern struct nf_conn *bpf_skb_ct_lookup(struct __sk_buff *skb_ctx, 87 101 struct bpf_sock_tuple *bpf_tuple, 88 102 u32 len_tuple, 89 - struct bpf_ct_opts *opts, 103 + struct bpf_ct_opts___local *opts, 90 104 u32 len_opts) __ksym; 91 105 92 106 extern void bpf_ct_release(struct nf_conn *ct) __ksym; ··· 407 393 408 394 static __always_inline int tcp_lookup(void *ctx, struct header_pointers *hdr, bool xdp) 409 395 { 410 - struct bpf_ct_opts ct_lookup_opts = { 396 + struct bpf_ct_opts___local ct_lookup_opts = { 411 397 .netns_id = BPF_F_CURRENT_NETNS, 412 398 .l4proto = IPPROTO_TCP, 413 399 }; ··· 728 714 static __always_inline int syncookie_part1(void *ctx, void *data, void *data_end, 729 715 struct header_pointers *hdr, bool xdp) 730 716 { 731 - struct bpf_ct_opts ct_lookup_opts = { 732 - .netns_id = BPF_F_CURRENT_NETNS, 733 - .l4proto = IPPROTO_TCP, 734 - }; 735 717 int ret; 736 718 737 719 ret = tcp_dissect(data, data_end, hdr);
+3 -17
tools/testing/selftests/bpf/test_bpftool_synctypes.py
··· 471 471 def get_prog_attach_types(self): 472 472 return self.get_bashcomp_list('BPFTOOL_PROG_ATTACH_TYPES') 473 473 474 - def get_map_types(self): 475 - return self.get_bashcomp_list('BPFTOOL_MAP_CREATE_TYPES') 476 - 477 - def get_cgroup_attach_types(self): 478 - return self.get_bashcomp_list('BPFTOOL_CGROUP_ATTACH_TYPES') 479 - 480 474 def verify(first_set, second_set, message): 481 475 """ 482 476 Print all values that differ between two sets. ··· 510 516 man_map_types = man_map_info.get_map_types() 511 517 man_map_info.close() 512 518 513 - bashcomp_info = BashcompExtractor() 514 - bashcomp_map_types = bashcomp_info.get_map_types() 515 - 516 519 verify(source_map_types, help_map_types, 517 520 f'Comparing {BpfHeaderExtractor.filename} (bpf_map_type) and {MapFileExtractor.filename} (do_help() TYPE):') 518 521 verify(source_map_types, man_map_types, 519 522 f'Comparing {BpfHeaderExtractor.filename} (bpf_map_type) and {ManMapExtractor.filename} (TYPE):') 520 523 verify(help_map_options, man_map_options, 521 524 f'Comparing {MapFileExtractor.filename} (do_help() OPTIONS) and {ManMapExtractor.filename} (OPTIONS):') 522 - verify(source_map_types, bashcomp_map_types, 523 - f'Comparing {BpfHeaderExtractor.filename} (bpf_map_type) and {BashcompExtractor.filename} (BPFTOOL_MAP_CREATE_TYPES):') 524 525 525 526 # Attach types (names) 526 527 ··· 531 542 man_prog_attach_types = man_prog_info.get_attach_types() 532 543 man_prog_info.close() 533 544 534 - bashcomp_info.reset_read() # We stopped at map types, rewind 545 + 546 + bashcomp_info = BashcompExtractor() 535 547 bashcomp_prog_attach_types = bashcomp_info.get_prog_attach_types() 548 + bashcomp_info.close() 536 549 537 550 verify(source_prog_attach_types, help_prog_attach_types, 538 551 f'Comparing {ProgFileExtractor.filename} (bpf_attach_type) and {ProgFileExtractor.filename} (do_help() ATTACH_TYPE):') ··· 559 568 man_cgroup_attach_types = man_cgroup_info.get_attach_types() 560 569 man_cgroup_info.close() 561 570 
562 - bashcomp_cgroup_attach_types = bashcomp_info.get_cgroup_attach_types() 563 - bashcomp_info.close() 564 - 565 571 verify(source_cgroup_attach_types, help_cgroup_attach_types, 566 572 f'Comparing {BpfHeaderExtractor.filename} (bpf_attach_type) and {CgroupFileExtractor.filename} (do_help() ATTACH_TYPE):') 567 573 verify(source_cgroup_attach_types, man_cgroup_attach_types, 568 574 f'Comparing {BpfHeaderExtractor.filename} (bpf_attach_type) and {ManCgroupExtractor.filename} (ATTACH_TYPE):') 569 575 verify(help_cgroup_options, man_cgroup_options, 570 576 f'Comparing {CgroupFileExtractor.filename} (do_help() OPTIONS) and {ManCgroupExtractor.filename} (OPTIONS):') 571 - verify(source_cgroup_attach_types, bashcomp_cgroup_attach_types, 572 - f'Comparing {BpfHeaderExtractor.filename} (bpf_attach_type) and {BashcompExtractor.filename} (BPFTOOL_CGROUP_ATTACH_TYPES):') 573 577 574 578 # Options for remaining commands 575 579
+2
tools/testing/selftests/bpf/test_btf.h
···
 #ifndef _TEST_BTF_H
 #define _TEST_BTF_H
 
+#define BTF_END_RAW 0xdeadbeef
+
 #define BTF_INFO_ENC(kind, kind_flag, vlen) \
 	((!!(kind_flag) << 31) | ((kind) << 24) | ((vlen) & BTF_MAX_VLEN))
 
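BTF_END_RAW is moved into test_btf.h so that other tests can terminate a raw array of BTF type words with it (test_verifier.c in this series requires its btf_types array to end with BTF_END_RAW). A small userspace sketch of how such a sentinel-terminated array might be built and measured follows. The functions btf_info_enc() and raw_btf_word_count() are hypothetical; the BTF_MAX_VLEN value and the kind numbers are taken from the UAPI linux/btf.h header and redefined locally so the block is self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BTF_END_RAW 0xdeadbeef

/* Local copies of UAPI values (linux/btf.h). */
#define BTF_MAX_VLEN 0xffff
#define BTF_KIND_INT 1
#define BTF_KIND_STRUCT 4

/* Function form of BTF_INFO_ENC(); unsigned math avoids shifting into
 * the sign bit of an int.
 */
static uint32_t btf_info_enc(uint32_t kind, int kind_flag, uint32_t vlen)
{
	return ((uint32_t)!!kind_flag << 31) | (kind << 24) |
	       (vlen & BTF_MAX_VLEN);
}

/* Hypothetical helper: number of 32-bit words before the sentinel. */
static size_t raw_btf_word_count(const uint32_t *raw)
{
	size_t n = 0;

	assert(raw != NULL);
	while (raw[n] != BTF_END_RAW)
		n++;
	return n;
}
```

For example, a struct with two members and no kind_flag encodes to 0x04000002, and an array of three words followed by BTF_END_RAW reports a length of 3.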
+350 -17
tools/testing/selftests/bpf/test_verifier.c
··· 51 51 #endif 52 52 53 53 #define MAX_INSNS BPF_MAXINSNS 54 + #define MAX_EXPECTED_INSNS 32 55 + #define MAX_UNEXPECTED_INSNS 32 54 56 #define MAX_TEST_INSNS 1000000 55 57 #define MAX_FIXUPS 8 56 58 #define MAX_NR_MAPS 23 57 59 #define MAX_TEST_RUNS 8 58 60 #define POINTER_VALUE 0xcafe4all 59 61 #define TEST_DATA_LEN 64 62 + #define MAX_FUNC_INFOS 8 63 + #define MAX_BTF_STRINGS 256 64 + #define MAX_BTF_TYPES 256 65 + 66 + #define INSN_OFF_MASK ((__s16)0xFFFF) 67 + #define INSN_IMM_MASK ((__s32)0xFFFFFFFF) 68 + #define SKIP_INSNS() BPF_RAW_INSN(0xde, 0xa, 0xd, 0xbeef, 0xdeadbeef) 69 + 70 + #define DEFAULT_LIBBPF_LOG_LEVEL 4 71 + #define VERBOSE_LIBBPF_LOG_LEVEL 1 60 72 61 73 #define F_NEEDS_EFFICIENT_UNALIGNED_ACCESS (1 << 0) 62 74 #define F_LOAD_WITH_STRICT_ALIGNMENT (1 << 1) ··· 91 79 const char *descr; 92 80 struct bpf_insn insns[MAX_INSNS]; 93 81 struct bpf_insn *fill_insns; 82 + /* If specified, test engine looks for this sequence of 83 + * instructions in the BPF program after loading. Allows to 84 + * test rewrites applied by verifier. Use values 85 + * INSN_OFF_MASK and INSN_IMM_MASK to mask `off` and `imm` 86 + * fields if content does not matter. The test case fails if 87 + * specified instructions are not found. 88 + * 89 + * The sequence could be split into sub-sequences by adding 90 + * SKIP_INSNS instruction at the end of each sub-sequence. In 91 + * such case sub-sequences are searched for one after another. 92 + */ 93 + struct bpf_insn expected_insns[MAX_EXPECTED_INSNS]; 94 + /* If specified, test engine applies same pattern matching 95 + * logic as for `expected_insns`. If the specified pattern is 96 + * matched test case is marked as failed. 
97 + */ 98 + struct bpf_insn unexpected_insns[MAX_UNEXPECTED_INSNS]; 94 99 int fixup_map_hash_8b[MAX_FIXUPS]; 95 100 int fixup_map_hash_48b[MAX_FIXUPS]; 96 101 int fixup_map_hash_16b[MAX_FIXUPS]; ··· 164 135 }; 165 136 enum bpf_attach_type expected_attach_type; 166 137 const char *kfunc; 138 + struct bpf_func_info func_info[MAX_FUNC_INFOS]; 139 + int func_info_cnt; 140 + char btf_strings[MAX_BTF_STRINGS]; 141 + /* A set of BTF types to load when specified, 142 + * use macro definitions from test_btf.h, 143 + * must end with BTF_END_RAW 144 + */ 145 + __u32 btf_types[MAX_BTF_TYPES]; 167 146 }; 168 147 169 148 /* Note we want this to be 64 bit aligned so that the end of our array is ··· 423 386 self->prog_len = 0; 424 387 break; 425 388 } 389 + } 390 + 391 + static void bpf_fill_big_prog_with_loop_1(struct bpf_test *self) 392 + { 393 + struct bpf_insn *insn = self->fill_insns; 394 + /* This test was added to catch a specific use after free 395 + * error, which happened upon BPF program reallocation. 396 + * Reallocation is handled by core.c:bpf_prog_realloc, which 397 + * reuses old memory if page boundary is not crossed. The 398 + * value of `len` is chosen to cross this boundary on bpf_loop 399 + * patching. 
400 + */ 401 + const int len = getpagesize() - 25; 402 + int callback_load_idx; 403 + int callback_idx; 404 + int i = 0; 405 + 406 + insn[i++] = BPF_ALU64_IMM(BPF_MOV, BPF_REG_1, 1); 407 + callback_load_idx = i; 408 + insn[i++] = BPF_RAW_INSN(BPF_LD | BPF_IMM | BPF_DW, 409 + BPF_REG_2, BPF_PSEUDO_FUNC, 0, 410 + 777 /* filled below */); 411 + insn[i++] = BPF_RAW_INSN(0, 0, 0, 0, 0); 412 + insn[i++] = BPF_ALU64_IMM(BPF_MOV, BPF_REG_3, 0); 413 + insn[i++] = BPF_ALU64_IMM(BPF_MOV, BPF_REG_4, 0); 414 + insn[i++] = BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_loop); 415 + 416 + while (i < len - 3) 417 + insn[i++] = BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 0); 418 + insn[i++] = BPF_EXIT_INSN(); 419 + 420 + callback_idx = i; 421 + insn[i++] = BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 0); 422 + insn[i++] = BPF_EXIT_INSN(); 423 + 424 + insn[callback_load_idx].imm = callback_idx - callback_load_idx - 1; 425 + self->func_info[1].insn_off = callback_idx; 426 + self->prog_len = i; 427 + assert(i == len); 426 428 } 427 429 428 430 /* BPF_SK_LOOKUP contains 13 instructions, if you need to fix up maps */ ··· 740 664 BTF_MEMBER_ENC(71, 13, 128), /* struct prog_test_member __kptr_ref *ptr; */ 741 665 }; 742 666 743 - static int load_btf(void) 667 + static char bpf_vlog[UINT_MAX >> 8]; 668 + 669 + static int load_btf_spec(__u32 *types, int types_len, 670 + const char *strings, int strings_len) 744 671 { 745 672 struct btf_header hdr = { 746 673 .magic = BTF_MAGIC, 747 674 .version = BTF_VERSION, 748 675 .hdr_len = sizeof(struct btf_header), 749 - .type_len = sizeof(btf_raw_types), 750 - .str_off = sizeof(btf_raw_types), 751 - .str_len = sizeof(btf_str_sec), 676 + .type_len = types_len, 677 + .str_off = types_len, 678 + .str_len = strings_len, 752 679 }; 753 680 void *ptr, *raw_btf; 754 681 int btf_fd; 682 + LIBBPF_OPTS(bpf_btf_load_opts, opts, 683 + .log_buf = bpf_vlog, 684 + .log_size = sizeof(bpf_vlog), 685 + .log_level = (verbose 686 + ? 
VERBOSE_LIBBPF_LOG_LEVEL 687 + : DEFAULT_LIBBPF_LOG_LEVEL), 688 + ); 755 689 756 - ptr = raw_btf = malloc(sizeof(hdr) + sizeof(btf_raw_types) + 757 - sizeof(btf_str_sec)); 690 + raw_btf = malloc(sizeof(hdr) + types_len + strings_len); 758 691 692 + ptr = raw_btf; 759 693 memcpy(ptr, &hdr, sizeof(hdr)); 760 694 ptr += sizeof(hdr); 761 - memcpy(ptr, btf_raw_types, hdr.type_len); 695 + memcpy(ptr, types, hdr.type_len); 762 696 ptr += hdr.type_len; 763 - memcpy(ptr, btf_str_sec, hdr.str_len); 697 + memcpy(ptr, strings, hdr.str_len); 764 698 ptr += hdr.str_len; 765 699 766 - btf_fd = bpf_btf_load(raw_btf, ptr - raw_btf, NULL); 767 - free(raw_btf); 700 + btf_fd = bpf_btf_load(raw_btf, ptr - raw_btf, &opts); 768 701 if (btf_fd < 0) 769 - return -1; 770 - return btf_fd; 702 + printf("Failed to load BTF spec: '%s'\n", strerror(errno)); 703 + 704 + free(raw_btf); 705 + 706 + return btf_fd < 0 ? -1 : btf_fd; 707 + } 708 + 709 + static int load_btf(void) 710 + { 711 + return load_btf_spec(btf_raw_types, sizeof(btf_raw_types), 712 + btf_str_sec, sizeof(btf_str_sec)); 713 + } 714 + 715 + static int load_btf_for_test(struct bpf_test *test) 716 + { 717 + int types_num = 0; 718 + 719 + while (types_num < MAX_BTF_TYPES && 720 + test->btf_types[types_num] != BTF_END_RAW) 721 + ++types_num; 722 + 723 + int types_len = types_num * sizeof(test->btf_types[0]); 724 + 725 + return load_btf_spec(test->btf_types, types_len, 726 + test->btf_strings, sizeof(test->btf_strings)); 771 727 } 772 728 773 729 static int create_map_spin_lock(void) ··· 877 769 printf("Failed to create map with btf_id pointer\n"); 878 770 return fd; 879 771 } 880 - 881 - static char bpf_vlog[UINT_MAX >> 8]; 882 772 883 773 static void do_test_fixup(struct bpf_test *test, enum bpf_prog_type prog_type, 884 774 struct bpf_insn *prog, int *map_fds) ··· 1232 1126 return true; 1233 1127 } 1234 1128 1129 + static int get_xlated_program(int fd_prog, struct bpf_insn **buf, int *cnt) 1130 + { 1131 + struct bpf_prog_info info = 
{}; 1132 + __u32 info_len = sizeof(info); 1133 + __u32 xlated_prog_len; 1134 + __u32 buf_element_size = sizeof(struct bpf_insn); 1135 + 1136 + if (bpf_obj_get_info_by_fd(fd_prog, &info, &info_len)) { 1137 + perror("bpf_obj_get_info_by_fd failed"); 1138 + return -1; 1139 + } 1140 + 1141 + xlated_prog_len = info.xlated_prog_len; 1142 + if (xlated_prog_len % buf_element_size) { 1143 + printf("Program length %d is not multiple of %d\n", 1144 + xlated_prog_len, buf_element_size); 1145 + return -1; 1146 + } 1147 + 1148 + *cnt = xlated_prog_len / buf_element_size; 1149 + *buf = calloc(*cnt, buf_element_size); 1150 + if (!buf) { 1151 + perror("can't allocate xlated program buffer"); 1152 + return -ENOMEM; 1153 + } 1154 + 1155 + bzero(&info, sizeof(info)); 1156 + info.xlated_prog_len = xlated_prog_len; 1157 + info.xlated_prog_insns = (__u64)*buf; 1158 + if (bpf_obj_get_info_by_fd(fd_prog, &info, &info_len)) { 1159 + perror("second bpf_obj_get_info_by_fd failed"); 1160 + goto out_free_buf; 1161 + } 1162 + 1163 + return 0; 1164 + 1165 + out_free_buf: 1166 + free(*buf); 1167 + return -1; 1168 + } 1169 + 1170 + static bool is_null_insn(struct bpf_insn *insn) 1171 + { 1172 + struct bpf_insn null_insn = {}; 1173 + 1174 + return memcmp(insn, &null_insn, sizeof(null_insn)) == 0; 1175 + } 1176 + 1177 + static bool is_skip_insn(struct bpf_insn *insn) 1178 + { 1179 + struct bpf_insn skip_insn = SKIP_INSNS(); 1180 + 1181 + return memcmp(insn, &skip_insn, sizeof(skip_insn)) == 0; 1182 + } 1183 + 1184 + static int null_terminated_insn_len(struct bpf_insn *seq, int max_len) 1185 + { 1186 + int i; 1187 + 1188 + for (i = 0; i < max_len; ++i) { 1189 + if (is_null_insn(&seq[i])) 1190 + return i; 1191 + } 1192 + return max_len; 1193 + } 1194 + 1195 + static bool compare_masked_insn(struct bpf_insn *orig, struct bpf_insn *masked) 1196 + { 1197 + struct bpf_insn orig_masked; 1198 + 1199 + memcpy(&orig_masked, orig, sizeof(orig_masked)); 1200 + if (masked->imm == INSN_IMM_MASK) 1201 + 
orig_masked.imm = INSN_IMM_MASK; 1202 + if (masked->off == INSN_OFF_MASK) 1203 + orig_masked.off = INSN_OFF_MASK; 1204 + 1205 + return memcmp(&orig_masked, masked, sizeof(orig_masked)) == 0; 1206 + } 1207 + 1208 + static int find_insn_subseq(struct bpf_insn *seq, struct bpf_insn *subseq, 1209 + int seq_len, int subseq_len) 1210 + { 1211 + int i, j; 1212 + 1213 + if (subseq_len > seq_len) 1214 + return -1; 1215 + 1216 + for (i = 0; i < seq_len - subseq_len + 1; ++i) { 1217 + bool found = true; 1218 + 1219 + for (j = 0; j < subseq_len; ++j) { 1220 + if (!compare_masked_insn(&seq[i + j], &subseq[j])) { 1221 + found = false; 1222 + break; 1223 + } 1224 + } 1225 + if (found) 1226 + return i; 1227 + } 1228 + 1229 + return -1; 1230 + } 1231 + 1232 + static int find_skip_insn_marker(struct bpf_insn *seq, int len) 1233 + { 1234 + int i; 1235 + 1236 + for (i = 0; i < len; ++i) 1237 + if (is_skip_insn(&seq[i])) 1238 + return i; 1239 + 1240 + return -1; 1241 + } 1242 + 1243 + /* Return true if all sub-sequences in `subseqs` could be found in 1244 + * `seq` one after another. Sub-sequences are separated by a single 1245 + * nil instruction. 1246 + */ 1247 + static bool find_all_insn_subseqs(struct bpf_insn *seq, struct bpf_insn *subseqs, 1248 + int seq_len, int max_subseqs_len) 1249 + { 1250 + int subseqs_len = null_terminated_insn_len(subseqs, max_subseqs_len); 1251 + 1252 + while (subseqs_len > 0) { 1253 + int skip_idx = find_skip_insn_marker(subseqs, subseqs_len); 1254 + int cur_subseq_len = skip_idx < 0 ? 
subseqs_len : skip_idx; 1255 + int subseq_idx = find_insn_subseq(seq, subseqs, 1256 + seq_len, cur_subseq_len); 1257 + 1258 + if (subseq_idx < 0) 1259 + return false; 1260 + seq += subseq_idx + cur_subseq_len; 1261 + seq_len -= subseq_idx + cur_subseq_len; 1262 + subseqs += cur_subseq_len + 1; 1263 + subseqs_len -= cur_subseq_len + 1; 1264 + } 1265 + 1266 + return true; 1267 + } 1268 + 1269 + static void print_insn(struct bpf_insn *buf, int cnt) 1270 + { 1271 + int i; 1272 + 1273 + printf(" addr op d s off imm\n"); 1274 + for (i = 0; i < cnt; ++i) { 1275 + struct bpf_insn *insn = &buf[i]; 1276 + 1277 + if (is_null_insn(insn)) 1278 + break; 1279 + 1280 + if (is_skip_insn(insn)) 1281 + printf(" ...\n"); 1282 + else 1283 + printf(" %04x: %02x %1x %x %04hx %08x\n", 1284 + i, insn->code, insn->dst_reg, 1285 + insn->src_reg, insn->off, insn->imm); 1286 + } 1287 + } 1288 + 1289 + static bool check_xlated_program(struct bpf_test *test, int fd_prog) 1290 + { 1291 + struct bpf_insn *buf; 1292 + int cnt; 1293 + bool result = true; 1294 + bool check_expected = !is_null_insn(test->expected_insns); 1295 + bool check_unexpected = !is_null_insn(test->unexpected_insns); 1296 + 1297 + if (!check_expected && !check_unexpected) 1298 + goto out; 1299 + 1300 + if (get_xlated_program(fd_prog, &buf, &cnt)) { 1301 + printf("FAIL: can't get xlated program\n"); 1302 + result = false; 1303 + goto out; 1304 + } 1305 + 1306 + if (check_expected && 1307 + !find_all_insn_subseqs(buf, test->expected_insns, 1308 + cnt, MAX_EXPECTED_INSNS)) { 1309 + printf("FAIL: can't find expected subsequence of instructions\n"); 1310 + result = false; 1311 + if (verbose) { 1312 + printf("Program:\n"); 1313 + print_insn(buf, cnt); 1314 + printf("Expected subsequence:\n"); 1315 + print_insn(test->expected_insns, MAX_EXPECTED_INSNS); 1316 + } 1317 + } 1318 + 1319 + if (check_unexpected && 1320 + find_all_insn_subseqs(buf, test->unexpected_insns, 1321 + cnt, MAX_UNEXPECTED_INSNS)) { 1322 + printf("FAIL: found 
unexpected subsequence of instructions\n"); 1323 + result = false; 1324 + if (verbose) { 1325 + printf("Program:\n"); 1326 + print_insn(buf, cnt); 1327 + printf("Un-expected subsequence:\n"); 1328 + print_insn(test->unexpected_insns, MAX_UNEXPECTED_INSNS); 1329 + } 1330 + } 1331 + 1332 + free(buf); 1333 + out: 1334 + return result; 1335 + } 1336 + 1235 1337 static void do_test_single(struct bpf_test *test, bool unpriv, 1236 1338 int *passes, int *errors) 1237 1339 { 1238 - int fd_prog, expected_ret, alignment_prevented_execution; 1340 + int fd_prog, btf_fd, expected_ret, alignment_prevented_execution; 1239 1341 int prog_len, prog_type = test->prog_type; 1240 1342 struct bpf_insn *prog = test->insns; 1241 1343 LIBBPF_OPTS(bpf_prog_load_opts, opts); ··· 1455 1141 __u32 pflags; 1456 1142 int i, err; 1457 1143 1144 + fd_prog = -1; 1458 1145 for (i = 0; i < MAX_NR_MAPS; i++) 1459 1146 map_fds[i] = -1; 1147 + btf_fd = -1; 1460 1148 1461 1149 if (!prog_type) 1462 1150 prog_type = BPF_PROG_TYPE_SOCKET_FILTER; ··· 1491 1175 1492 1176 opts.expected_attach_type = test->expected_attach_type; 1493 1177 if (verbose) 1494 - opts.log_level = 1; 1178 + opts.log_level = VERBOSE_LIBBPF_LOG_LEVEL; 1495 1179 else if (expected_ret == VERBOSE_ACCEPT) 1496 1180 opts.log_level = 2; 1497 1181 else 1498 - opts.log_level = 4; 1182 + opts.log_level = DEFAULT_LIBBPF_LOG_LEVEL; 1499 1183 opts.prog_flags = pflags; 1500 1184 1501 1185 if (prog_type == BPF_PROG_TYPE_TRACING && test->kfunc) { ··· 1511 1195 } 1512 1196 1513 1197 opts.attach_btf_id = attach_btf_id; 1198 + } 1199 + 1200 + if (test->btf_types[0] != 0) { 1201 + btf_fd = load_btf_for_test(test); 1202 + if (btf_fd < 0) 1203 + goto fail_log; 1204 + opts.prog_btf_fd = btf_fd; 1205 + } 1206 + 1207 + if (test->func_info_cnt != 0) { 1208 + opts.func_info = test->func_info; 1209 + opts.func_info_cnt = test->func_info_cnt; 1210 + opts.func_info_rec_size = sizeof(test->func_info[0]); 1514 1211 } 1515 1212 1516 1213 opts.log_buf = bpf_vlog; ··· 
1591 1262 if (verbose) 1592 1263 printf(", verifier log:\n%s", bpf_vlog); 1593 1264 1265 + if (!check_xlated_program(test, fd_prog)) 1266 + goto fail_log; 1267 + 1594 1268 run_errs = 0; 1595 1269 run_successes = 0; 1596 1270 if (!alignment_prevented_execution && fd_prog >= 0 && test->runs >= 0) { ··· 1637 1305 if (test->fill_insns) 1638 1306 free(test->fill_insns); 1639 1307 close(fd_prog); 1308 + close(btf_fd); 1640 1309 for (i = 0; i < MAX_NR_MAPS; i++) 1641 1310 close(map_fds[i]); 1642 1311 sched_yield();
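The expected_insns / unexpected_insns checks above boil down to a subsequence search in which the `off` and `imm` fields of a pattern instruction can act as wildcards. Below is a simplified sketch of that idea under stated assumptions. It uses struct insn, a hypothetical stand-in for struct bpf_insn, and compares field by field rather than with memcmp as the selftest does. The names OFF_ANY, IMM_ANY, insn_matches and find_subseq are illustrative only.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for struct bpf_insn (no bitfields, for clarity). */
struct insn {
	uint8_t code;
	uint8_t dst_reg;
	uint8_t src_reg;
	int16_t off;
	int32_t imm;
};

/* Wildcard markers, in the spirit of INSN_OFF_MASK / INSN_IMM_MASK. */
#define OFF_ANY ((int16_t)-1)
#define IMM_ANY ((int32_t)-1)

/* A pattern instruction matches when all fields agree, except that a
 * wildcard `off` or `imm` in the pattern accepts any actual value.
 */
static bool insn_matches(const struct insn *actual, const struct insn *pattern)
{
	return actual->code == pattern->code &&
	       actual->dst_reg == pattern->dst_reg &&
	       actual->src_reg == pattern->src_reg &&
	       (pattern->off == OFF_ANY || actual->off == pattern->off) &&
	       (pattern->imm == IMM_ANY || actual->imm == pattern->imm);
}

/* Return the index of the first occurrence of `pat` in `seq`, or -1. */
static int find_subseq(const struct insn *seq, int seq_len,
		       const struct insn *pat, int pat_len)
{
	for (int i = 0; i + pat_len <= seq_len; ++i) {
		int j = 0;

		while (j < pat_len && insn_matches(&seq[i + j], &pat[j]))
			++j;
		if (j == pat_len)
			return i;
	}
	return -1;
}
```

One consequence of this design, shared with the selftest, is that an instruction whose real `off` or `imm` is -1 cannot be matched exactly, because -1 always means "any value" in a pattern.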
+3 -3
tools/testing/selftests/bpf/test_xsk.sh
··· 47 47 # conflict with any existing interface 48 48 # * tests the veth and xsk layers of the topology 49 49 # 50 - # See the source xdpxceiver.c for information on each test 50 + # See the source xskxceiver.c for information on each test 51 51 # 52 52 # Kernel configuration: 53 53 # --------------------- ··· 160 160 161 161 TEST_NAME="XSK_SELFTESTS_SOFTIRQ" 162 162 163 - execxdpxceiver 163 + exec_xskxceiver 164 164 165 165 cleanup_exit ${VETH0} ${VETH1} ${NS1} 166 166 TEST_NAME="XSK_SELFTESTS_BUSY_POLL" 167 167 busy_poll=1 168 168 169 169 setup_vethPairs 170 - execxdpxceiver 170 + exec_xskxceiver 171 171 172 172 ## END TESTS 173 173
+263
tools/testing/selftests/bpf/verifier/bpf_loop_inline.c
··· 1 + #define BTF_TYPES \ 2 + .btf_strings = "\0int\0i\0ctx\0callback\0main\0", \ 3 + .btf_types = { \ 4 + /* 1: int */ BTF_TYPE_INT_ENC(1, BTF_INT_SIGNED, 0, 32, 4), \ 5 + /* 2: int* */ BTF_PTR_ENC(1), \ 6 + /* 3: void* */ BTF_PTR_ENC(0), \ 7 + /* 4: int __(void*) */ BTF_FUNC_PROTO_ENC(1, 1), \ 8 + BTF_FUNC_PROTO_ARG_ENC(7, 3), \ 9 + /* 5: int __(int, int*) */ BTF_FUNC_PROTO_ENC(1, 2), \ 10 + BTF_FUNC_PROTO_ARG_ENC(5, 1), \ 11 + BTF_FUNC_PROTO_ARG_ENC(7, 2), \ 12 + /* 6: main */ BTF_FUNC_ENC(20, 4), \ 13 + /* 7: callback */ BTF_FUNC_ENC(11, 5), \ 14 + BTF_END_RAW \ 15 + } 16 + 17 + #define MAIN_TYPE 6 18 + #define CALLBACK_TYPE 7 19 + 20 + /* can't use BPF_CALL_REL, jit_subprogs adjusts IMM & OFF 21 + * fields for pseudo calls 22 + */ 23 + #define PSEUDO_CALL_INSN() \ 24 + BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, BPF_PSEUDO_CALL, \ 25 + INSN_OFF_MASK, INSN_IMM_MASK) 26 + 27 + /* can't use BPF_FUNC_loop constant, 28 + * do_mix_fixups adjusts the IMM field 29 + */ 30 + #define HELPER_CALL_INSN() \ 31 + BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, INSN_OFF_MASK, INSN_IMM_MASK) 32 + 33 + { 34 + "inline simple bpf_loop call", 35 + .insns = { 36 + /* main */ 37 + /* force verifier state branching to verify logic on first and 38 + * subsequent bpf_loop insn processing steps 39 + */ 40 + BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_jiffies64), 41 + BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 777, 2), 42 + BPF_ALU64_IMM(BPF_MOV, BPF_REG_1, 1), 43 + BPF_JMP_IMM(BPF_JA, 0, 0, 1), 44 + BPF_ALU64_IMM(BPF_MOV, BPF_REG_1, 2), 45 + 46 + BPF_RAW_INSN(BPF_LD | BPF_IMM | BPF_DW, BPF_REG_2, BPF_PSEUDO_FUNC, 0, 6), 47 + BPF_RAW_INSN(0, 0, 0, 0, 0), 48 + BPF_ALU64_IMM(BPF_MOV, BPF_REG_3, 0), 49 + BPF_ALU64_IMM(BPF_MOV, BPF_REG_4, 0), 50 + BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_loop), 51 + BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 0), 52 + BPF_EXIT_INSN(), 53 + /* callback */ 54 + BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 1), 55 + BPF_EXIT_INSN(), 56 + }, 57 + .expected_insns = { PSEUDO_CALL_INSN() }, 58 
+ .unexpected_insns = { HELPER_CALL_INSN() }, 59 + .prog_type = BPF_PROG_TYPE_TRACEPOINT, 60 + .result = ACCEPT, 61 + .runs = 0, 62 + .func_info = { { 0, MAIN_TYPE }, { 12, CALLBACK_TYPE } }, 63 + .func_info_cnt = 2, 64 + BTF_TYPES 65 + }, 66 + { 67 + "don't inline bpf_loop call, flags non-zero", 68 + .insns = { 69 + /* main */ 70 + BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_jiffies64), 71 + BPF_ALU64_REG(BPF_MOV, BPF_REG_6, BPF_REG_0), 72 + BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_jiffies64), 73 + BPF_ALU64_REG(BPF_MOV, BPF_REG_7, BPF_REG_0), 74 + BPF_JMP_IMM(BPF_JNE, BPF_REG_6, 0, 9), 75 + BPF_ALU64_IMM(BPF_MOV, BPF_REG_4, 0), 76 + BPF_JMP_IMM(BPF_JNE, BPF_REG_7, 0, 0), 77 + BPF_ALU64_IMM(BPF_MOV, BPF_REG_1, 1), 78 + BPF_RAW_INSN(BPF_LD | BPF_IMM | BPF_DW, BPF_REG_2, BPF_PSEUDO_FUNC, 0, 7), 79 + BPF_RAW_INSN(0, 0, 0, 0, 0), 80 + BPF_ALU64_IMM(BPF_MOV, BPF_REG_3, 0), 81 + BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_loop), 82 + BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 0), 83 + BPF_EXIT_INSN(), 84 + BPF_ALU64_IMM(BPF_MOV, BPF_REG_4, 1), 85 + BPF_JMP_IMM(BPF_JA, 0, 0, -10), 86 + /* callback */ 87 + BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 1), 88 + BPF_EXIT_INSN(), 89 + }, 90 + .expected_insns = { HELPER_CALL_INSN() }, 91 + .unexpected_insns = { PSEUDO_CALL_INSN() }, 92 + .prog_type = BPF_PROG_TYPE_TRACEPOINT, 93 + .result = ACCEPT, 94 + .runs = 0, 95 + .func_info = { { 0, MAIN_TYPE }, { 16, CALLBACK_TYPE } }, 96 + .func_info_cnt = 2, 97 + BTF_TYPES 98 + }, 99 + { 100 + "don't inline bpf_loop call, callback non-constant", 101 + .insns = { 102 + /* main */ 103 + BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_jiffies64), 104 + BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 777, 4), /* pick a random callback */ 105 + 106 + BPF_ALU64_IMM(BPF_MOV, BPF_REG_1, 1), 107 + BPF_RAW_INSN(BPF_LD | BPF_IMM | BPF_DW, BPF_REG_2, BPF_PSEUDO_FUNC, 0, 10), 108 + BPF_RAW_INSN(0, 0, 0, 0, 0), 109 + BPF_JMP_IMM(BPF_JA, 0, 0, 3), 110 + 111 + BPF_ALU64_IMM(BPF_MOV, BPF_REG_1, 1), 112 + 
BPF_RAW_INSN(BPF_LD | BPF_IMM | BPF_DW, BPF_REG_2, BPF_PSEUDO_FUNC, 0, 8), 113 + BPF_RAW_INSN(0, 0, 0, 0, 0), 114 + 115 + BPF_ALU64_IMM(BPF_MOV, BPF_REG_3, 0), 116 + BPF_ALU64_IMM(BPF_MOV, BPF_REG_4, 0), 117 + BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_loop), 118 + BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 0), 119 + BPF_EXIT_INSN(), 120 + /* callback */ 121 + BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 1), 122 + BPF_EXIT_INSN(), 123 + /* callback #2 */ 124 + BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 1), 125 + BPF_EXIT_INSN(), 126 + }, 127 + .expected_insns = { HELPER_CALL_INSN() }, 128 + .unexpected_insns = { PSEUDO_CALL_INSN() }, 129 + .prog_type = BPF_PROG_TYPE_TRACEPOINT, 130 + .result = ACCEPT, 131 + .runs = 0, 132 + .func_info = { 133 + { 0, MAIN_TYPE }, 134 + { 14, CALLBACK_TYPE }, 135 + { 16, CALLBACK_TYPE } 136 + }, 137 + .func_info_cnt = 3, 138 + BTF_TYPES 139 + }, 140 + { 141 + "bpf_loop_inline and a dead func", 142 + .insns = { 143 + /* main */ 144 + 145 + /* A reference to callback #1 to make verifier count it as a func. 146 + * This reference is overwritten below and callback #1 is dead. 
147 + */ 148 + BPF_RAW_INSN(BPF_LD | BPF_IMM | BPF_DW, BPF_REG_2, BPF_PSEUDO_FUNC, 0, 9), 149 + BPF_RAW_INSN(0, 0, 0, 0, 0), 150 + BPF_ALU64_IMM(BPF_MOV, BPF_REG_1, 1), 151 + BPF_RAW_INSN(BPF_LD | BPF_IMM | BPF_DW, BPF_REG_2, BPF_PSEUDO_FUNC, 0, 8), 152 + BPF_RAW_INSN(0, 0, 0, 0, 0), 153 + BPF_ALU64_IMM(BPF_MOV, BPF_REG_3, 0), 154 + BPF_ALU64_IMM(BPF_MOV, BPF_REG_4, 0), 155 + BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_loop), 156 + BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 0), 157 + BPF_EXIT_INSN(), 158 + /* callback */ 159 + BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 1), 160 + BPF_EXIT_INSN(), 161 + /* callback #2 */ 162 + BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 1), 163 + BPF_EXIT_INSN(), 164 + }, 165 + .expected_insns = { PSEUDO_CALL_INSN() }, 166 + .unexpected_insns = { HELPER_CALL_INSN() }, 167 + .prog_type = BPF_PROG_TYPE_TRACEPOINT, 168 + .result = ACCEPT, 169 + .runs = 0, 170 + .func_info = { 171 + { 0, MAIN_TYPE }, 172 + { 10, CALLBACK_TYPE }, 173 + { 12, CALLBACK_TYPE } 174 + }, 175 + .func_info_cnt = 3, 176 + BTF_TYPES 177 + }, 178 + { 179 + "bpf_loop_inline stack locations for loop vars", 180 + .insns = { 181 + /* main */ 182 + BPF_ST_MEM(BPF_W, BPF_REG_10, -12, 0x77), 183 + /* bpf_loop call #1 */ 184 + BPF_ALU64_IMM(BPF_MOV, BPF_REG_1, 1), 185 + BPF_RAW_INSN(BPF_LD | BPF_IMM | BPF_DW, BPF_REG_2, BPF_PSEUDO_FUNC, 0, 22), 186 + BPF_RAW_INSN(0, 0, 0, 0, 0), 187 + BPF_ALU64_IMM(BPF_MOV, BPF_REG_3, 0), 188 + BPF_ALU64_IMM(BPF_MOV, BPF_REG_4, 0), 189 + BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_loop), 190 + /* bpf_loop call #2 */ 191 + BPF_ALU64_IMM(BPF_MOV, BPF_REG_1, 2), 192 + BPF_RAW_INSN(BPF_LD | BPF_IMM | BPF_DW, BPF_REG_2, BPF_PSEUDO_FUNC, 0, 16), 193 + BPF_RAW_INSN(0, 0, 0, 0, 0), 194 + BPF_ALU64_IMM(BPF_MOV, BPF_REG_3, 0), 195 + BPF_ALU64_IMM(BPF_MOV, BPF_REG_4, 0), 196 + BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_loop), 197 + /* call func and exit */ 198 + BPF_CALL_REL(2), 199 + BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 0), 200 + BPF_EXIT_INSN(), 201 + /* 
func */ 202 + BPF_ST_MEM(BPF_DW, BPF_REG_10, -32, 0x55), 203 + BPF_ALU64_IMM(BPF_MOV, BPF_REG_1, 2), 204 + BPF_RAW_INSN(BPF_LD | BPF_IMM | BPF_DW, BPF_REG_2, BPF_PSEUDO_FUNC, 0, 6), 205 + BPF_RAW_INSN(0, 0, 0, 0, 0), 206 + BPF_ALU64_IMM(BPF_MOV, BPF_REG_3, 0), 207 + BPF_ALU64_IMM(BPF_MOV, BPF_REG_4, 0), 208 + BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_loop), 209 + BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 0), 210 + BPF_EXIT_INSN(), 211 + /* callback */ 212 + BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 1), 213 + BPF_EXIT_INSN(), 214 + }, 215 + .expected_insns = { 216 + BPF_ST_MEM(BPF_W, BPF_REG_10, -12, 0x77), 217 + SKIP_INSNS(), 218 + BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_6, -40), 219 + BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_7, -32), 220 + BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_8, -24), 221 + SKIP_INSNS(), 222 + /* offsets are the same as in the first call */ 223 + BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_6, -40), 224 + BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_7, -32), 225 + BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_8, -24), 226 + SKIP_INSNS(), 227 + BPF_ST_MEM(BPF_DW, BPF_REG_10, -32, 0x55), 228 + SKIP_INSNS(), 229 + /* offsets differ from main because of different offset 230 + * in BPF_ST_MEM instruction 231 + */ 232 + BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_6, -56), 233 + BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_7, -48), 234 + BPF_STX_MEM(BPF_DW, BPF_REG_10, BPF_REG_8, -40), 235 + }, 236 + .unexpected_insns = { HELPER_CALL_INSN() }, 237 + .prog_type = BPF_PROG_TYPE_TRACEPOINT, 238 + .result = ACCEPT, 239 + .func_info = { 240 + { 0, MAIN_TYPE }, 241 + { 16, MAIN_TYPE }, 242 + { 25, CALLBACK_TYPE }, 243 + }, 244 + .func_info_cnt = 3, 245 + BTF_TYPES 246 + }, 247 + { 248 + "inline bpf_loop call in a big program", 249 + .insns = {}, 250 + .fill_helper = bpf_fill_big_prog_with_loop_1, 251 + .expected_insns = { PSEUDO_CALL_INSN() }, 252 + .unexpected_insns = { HELPER_CALL_INSN() }, 253 + .result = ACCEPT, 254 + .func_info = { { 0, MAIN_TYPE }, { 16, CALLBACK_TYPE } }, 255 + 
.func_info_cnt = 2, 256 + BTF_TYPES 257 + }, 258 + 259 + #undef HELPER_CALL_INSN 260 + #undef PSEUDO_CALL_INSN 261 + #undef CALLBACK_TYPE 262 + #undef MAIN_TYPE 263 + #undef BTF_TYPES
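For orientation, the tests above distinguish a program that still calls the bpf_loop() helper from one where the helper call has been replaced by a direct call to the callback inside a loop. The plain C sketch below pictures the second form, assuming the documented helper behaviour that the callback returns 0 to continue and non-zero to stop. It is not the verifier's actual rewrite. The names loop_cb, loop_inlined, sum_ctx and add_index are hypothetical, and the helper's error cases (non-zero flags, oversized loop counts) are left out.

```c
#include <stdint.h>

/* Hypothetical callback type: return 0 to continue, non-zero to stop. */
typedef long (*loop_cb)(uint32_t index, void *ctx);

/* Sketch of an "inlined" loop: the callback is called directly and the
 * number of iterations performed is returned.
 */
static long loop_inlined(uint32_t nr_loops, loop_cb cb, void *ctx)
{
	uint32_t i = 0;

	while (i < nr_loops) {
		long ret = cb(i, ctx);

		++i;
		if (ret)
			break;
	}
	return i;
}

/* Example callback state: accumulate indices until a limit is reached. */
struct sum_ctx {
	long sum;
	long limit;
};

static long add_index(uint32_t index, void *ctx)
{
	struct sum_ctx *c = ctx;

	c->sum += index;
	return c->sum >= c->limit;
}
```

The test cases with non-zero flags or a non-constant callback expect the helper call to remain, which matches the condition stated in the pull request summary that inlining happens only when the callback is statically known.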
+21 -4
tools/testing/selftests/bpf/xdpxceiver.c tools/testing/selftests/bpf/xskxceiver.c
···
  97   97   #include <time.h>
  98   98   #include <unistd.h>
  99   99   #include <stdatomic.h>
 100      - #include <bpf/xsk.h>
 101      - #include "xdpxceiver.h"
      100 + #include "xsk.h"
      101 + #include "xskxceiver.h"
 102  102   #include "../kselftest.h"
 103  103
 104  104   /* AF_XDP APIs were moved into libxdp and marked as deprecated in libbpf.
 105      -  * Until xdpxceiver is either moved or re-writed into libxdp, suppress
      105 +  * Until xskxceiver is either moved or re-writed into libxdp, suppress
 106  106    * deprecation warnings in this file
 107  107    */
 108  108   #pragma GCC diagnostic ignored "-Wdeprecated-declarations"
···
1085 1085   {
1086 1086   	u64 umem_sz = ifobject->umem->num_frames * ifobject->umem->frame_size;
1087 1087   	int mmap_flags = MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE;
     1088 + 	LIBBPF_OPTS(bpf_xdp_query_opts, opts);
1088 1089   	int ret, ifindex;
1089 1090   	void *bufs;
1090 1091   	u32 i;
···
1131 1130   	if (!ifindex)
1132 1131   		exit_with_error(errno);
1133 1132
1134      - 	ret = xsk_setup_xdp_prog(ifindex, &ifobject->xsk_map_fd);
     1133 + 	ret = xsk_setup_xdp_prog_xsk(ifobject->xsk->xsk, &ifobject->xsk_map_fd);
1135 1134   	if (ret)
1136 1135   		exit_with_error(-ret);
     1136 +
     1137 + 	ret = bpf_xdp_query(ifindex, ifobject->xdp_flags, &opts);
     1138 + 	if (ret)
     1139 + 		exit_with_error(-ret);
     1140 +
     1141 + 	if (ifobject->xdp_flags & XDP_FLAGS_SKB_MODE) {
     1142 + 		if (opts.attach_mode != XDP_ATTACHED_SKB) {
     1143 + 			ksft_print_msg("ERROR: [%s] XDP prog not in SKB mode\n");
     1144 + 			exit_with_error(-EINVAL);
     1145 + 		}
     1146 + 	} else if (ifobject->xdp_flags & XDP_FLAGS_DRV_MODE) {
     1147 + 		if (opts.attach_mode != XDP_ATTACHED_DRV) {
     1148 + 			ksft_print_msg("ERROR: [%s] XDP prog not in DRV mode\n");
     1149 + 			exit_with_error(-EINVAL);
     1150 + 		}
     1151 + 	}
1137 1152
1138 1153   	ret = xsk_socket__update_xskmap(ifobject->xsk->xsk, ifobject->xsk_map_fd);
1139 1154   	if (ret)
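The check added in the last hunk compares the attach mode that was requested through the XDP flags with the mode reported back by the query. The sketch below isolates that decision as a pure function so it can be read without the surrounding test harness. The SKETCH_ constants are local, illustrative copies; the real definitions are expected to come from <linux/if_link.h>, and their values here should be treated as assumptions. The function name attach_mode_matches is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative local copies (assumed values); the real definitions
 * live in <linux/if_link.h>.
 */
#define SKETCH_XDP_FLAGS_SKB_MODE (1U << 1)
#define SKETCH_XDP_FLAGS_DRV_MODE (1U << 2)

enum sketch_attach_mode {
	SKETCH_ATTACHED_NONE,
	SKETCH_ATTACHED_DRV,
	SKETCH_ATTACHED_SKB,
};

/* Return true when the reported attach mode agrees with the requested
 * flags. If neither SKB nor DRV mode was requested, any mode is accepted,
 * mirroring the if / else-if structure of the hunk above.
 */
static bool attach_mode_matches(uint32_t requested_flags, int attach_mode)
{
	if (requested_flags & SKETCH_XDP_FLAGS_SKB_MODE)
		return attach_mode == SKETCH_ATTACHED_SKB;
	if (requested_flags & SKETCH_XDP_FLAGS_DRV_MODE)
		return attach_mode == SKETCH_ATTACHED_DRV;
	return true;
}
```

A caller in the style of the hunk would report an error and exit with -EINVAL when this returns false.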
+3 -3
tools/testing/selftests/bpf/xdpxceiver.h tools/testing/selftests/bpf/xskxceiver.h
···
   2    2    * Copyright(c) 2020 Intel Corporation.
   3    3    */
   4    4
   5      - #ifndef XDPXCEIVER_H_
   6      - #define XDPXCEIVER_H_
        5 + #ifndef XSKXCEIVER_H_
        6 + #define XSKXCEIVER_H_
   7    7
   8    8   #ifndef SOL_XDP
   9    9   #define SOL_XDP 283
···
 169  169
 170  170   int pkts_in_flight;
 171  171
 172      - #endif /* XDPXCEIVER_H */
      172 + #endif /* XSKXCEIVER_H_ */
+2 -2
tools/testing/selftests/bpf/xsk_prereqs.sh
···
   8    8   ksft_xpass=3
   9    9   ksft_skip=4
  10   10
  11      - XSKOBJ=xdpxceiver
       11 + XSKOBJ=xskxceiver
  12   12
  13   13   validate_root_exec()
  14   14   {
···
  77   77   	[ ! $(type -P ip) ] && { echo "'ip' not found. Skipping tests."; test_exit $ksft_skip; }
  78   78   }
  79   79
  80      - execxdpxceiver()
       80 + exec_xskxceiver()
  81   81   {
  82   82           if [[ $busy_poll -eq 1 ]]; then
  83   83   	        ARGS+="-b "