Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Daniel Borkmann says:

====================
bpf-next 2022-11-02

We've added 70 non-merge commits during the last 14 day(s) which contain
a total of 96 files changed, 3203 insertions(+), 640 deletions(-).

The main changes are:

1) Make cgroup local storage available to non-cgroup attached BPF programs
such as tc BPF ones, from Yonghong Song.

2) Avoid unnecessary deadlock detection and failures wrt BPF task storage
helpers, from Martin KaFai Lau.

3) Add LLVM disassembler as default library for dumping JITed code
in bpftool, from Quentin Monnet.

4) Various kprobe_multi_link fixes related to kernel modules,
from Jiri Olsa.

5) Optimize x86-64 JIT with emitting BMI2-based shift instructions,
from Jie Meng.

6) Improve BPF verifier's memory type compatibility for map key/value
arguments, from Dave Marchevsky.

7) Only create mmap-able data section maps in libbpf when data is exposed
via skeletons, from Andrii Nakryiko.

8) Add an autoattach option for bpftool to load all object assets,
from Wang Yufen.

9) Various memory handling fixes for libbpf and BPF selftests,
from Xu Kuohai.

10) Initial support for BPF selftest's vmtest.sh on arm64,
from Manu Bretelle.

11) Improve libbpf's BTF handling to dedup identical structs,
from Alan Maguire.

12) Add BPF CI and denylist documentation for BPF selftests,
from Daniel Müller.

13) Check BPF cpumap max_entries before doing allocation work,
from Florian Lehner.

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (70 commits)
samples/bpf: Fix typo in README
bpf: Remove the obsolte u64_stats_fetch_*_irq() users.
bpf: check max_entries before allocating memory
bpf: Fix a typo in comment for DFS algorithm
bpftool: Fix spelling mistake "disasembler" -> "disassembler"
selftests/bpf: Fix bpftool synctypes checking failure
selftests/bpf: Panic on hard/soft lockup
docs/bpf: Add documentation for new cgroup local storage
selftests/bpf: Add test cgrp_local_storage to DENYLIST.s390x
selftests/bpf: Add selftests for new cgroup local storage
selftests/bpf: Fix test test_libbpf_str/bpf_map_type_str
bpftool: Support new cgroup local storage
libbpf: Support new cgroup local storage
bpf: Implement cgroup storage available to non-cgroup-attached bpf progs
bpf: Refactor some inode/task/sk storage functions for reuse
bpf: Make struct cgroup btf id global
selftests/bpf: Tracing prog can still do lookup under busy lock
selftests/bpf: Ensure no task storage failure for bpf_lsm.s prog due to deadlock detection
bpf: Add new bpf_task_storage_delete proto with no deadlock detection
bpf: bpf_task_storage_delete_recur does lookup first before the deadlock check
...
====================

Link: https://lore.kernel.org/r/20221102062120.5724-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

+3219 -656
+109
Documentation/bpf/map_cgrp_storage.rst
@@ -0,0 +1,109 @@
+.. SPDX-License-Identifier: GPL-2.0-only
+.. Copyright (C) 2022 Meta Platforms, Inc. and affiliates.
+
+=========================
+BPF_MAP_TYPE_CGRP_STORAGE
+=========================
+
+The ``BPF_MAP_TYPE_CGRP_STORAGE`` map type represents a local fix-sized
+storage for cgroups. It is only available with ``CONFIG_CGROUPS``.
+The programs are made available by the same Kconfig. The
+data for a particular cgroup can be retrieved by looking up the map
+with that cgroup.
+
+This document describes the usage and semantics of the
+``BPF_MAP_TYPE_CGRP_STORAGE`` map type.
+
+Usage
+=====
+
+The map key must be ``sizeof(int)`` representing a cgroup fd.
+To access the storage in a program, use ``bpf_cgrp_storage_get``::
+
+    void *bpf_cgrp_storage_get(struct bpf_map *map, struct cgroup *cgroup, void *value, u64 flags)
+
+``flags`` could be 0 or ``BPF_LOCAL_STORAGE_GET_F_CREATE`` which indicates that
+a new local storage will be created if one does not exist.
+
+The local storage can be removed with ``bpf_cgrp_storage_delete``::
+
+    long bpf_cgrp_storage_delete(struct bpf_map *map, struct cgroup *cgroup)
+
+The map is available to all program types.
+
+Examples
+========
+
+A BPF program example with BPF_MAP_TYPE_CGRP_STORAGE::
+
+    #include <vmlinux.h>
+    #include <bpf/bpf_helpers.h>
+    #include <bpf/bpf_tracing.h>
+
+    struct {
+            __uint(type, BPF_MAP_TYPE_CGRP_STORAGE);
+            __uint(map_flags, BPF_F_NO_PREALLOC);
+            __type(key, int);
+            __type(value, long);
+    } cgrp_storage SEC(".maps");
+
+    SEC("tp_btf/sys_enter")
+    int BPF_PROG(on_enter, struct pt_regs *regs, long id)
+    {
+            struct task_struct *task = bpf_get_current_task_btf();
+            long *ptr;
+
+            ptr = bpf_cgrp_storage_get(&cgrp_storage, task->cgroups->dfl_cgrp, 0,
+                                       BPF_LOCAL_STORAGE_GET_F_CREATE);
+            if (ptr)
+                __sync_fetch_and_add(ptr, 1);
+
+            return 0;
+    }
+
+Userspace accessing map declared above::
+
+    #include <linux/bpf.h>
+    #include <linux/libbpf.h>
+
+    __u32 map_lookup(struct bpf_map *map, int cgrp_fd)
+    {
+            __u32 *value;
+            value = bpf_map_lookup_elem(bpf_map__fd(map), &cgrp_fd);
+            if (value)
+                return *value;
+            return 0;
+    }
+
+Difference Between BPF_MAP_TYPE_CGRP_STORAGE and BPF_MAP_TYPE_CGROUP_STORAGE
+============================================================================
+
+The old cgroup storage map ``BPF_MAP_TYPE_CGROUP_STORAGE`` has been marked as
+deprecated (renamed to ``BPF_MAP_TYPE_CGROUP_STORAGE_DEPRECATED``). The new
+``BPF_MAP_TYPE_CGRP_STORAGE`` map should be used instead. The following
+illusates the main difference between ``BPF_MAP_TYPE_CGRP_STORAGE`` and
+``BPF_MAP_TYPE_CGROUP_STORAGE_DEPRECATED``.
+
+(1). ``BPF_MAP_TYPE_CGRP_STORAGE`` can be used by all program types while
+     ``BPF_MAP_TYPE_CGROUP_STORAGE_DEPRECATED`` is available only to cgroup program types
+     like BPF_CGROUP_INET_INGRESS or BPF_CGROUP_SOCK_OPS, etc.
+
+(2). ``BPF_MAP_TYPE_CGRP_STORAGE`` supports local storage for more than one
+     cgroup while ``BPF_MAP_TYPE_CGROUP_STORAGE_DEPRECATED`` only supports one cgroup
+     which is attached by a BPF program.
+
+(3). ``BPF_MAP_TYPE_CGROUP_STORAGE_DEPRECATED`` allocates local storage at attach time so
+     ``bpf_get_local_storage()`` always returns non-NULL local storage.
+     ``BPF_MAP_TYPE_CGRP_STORAGE`` allocates local storage at runtime so
+     it is possible that ``bpf_cgrp_storage_get()`` may return null local storage.
+     To avoid such null local storage issue, user space can do
+     ``bpf_map_update_elem()`` to pre-allocate local storage before a BPF program
+     is attached.
+
+(4). ``BPF_MAP_TYPE_CGRP_STORAGE`` supports deleting local storage by a BPF program
+     while ``BPF_MAP_TYPE_CGROUP_STORAGE_DEPRECATED`` only deletes storage during
+     prog detach time.
+
+So overall, ``BPF_MAP_TYPE_CGRP_STORAGE`` supports all ``BPF_MAP_TYPE_CGROUP_STORAGE_DEPRECATED``
+functionality and beyond. It is recommended to use ``BPF_MAP_TYPE_CGRP_STORAGE``
+instead of ``BPF_MAP_TYPE_CGROUP_STORAGE_DEPRECATED``.
+69 -40
Documentation/bpf/maps.rst
··· 1 1 2 - ========= 3 - eBPF maps 4 - ========= 2 + ======== 3 + BPF maps 4 + ======== 5 5 6 - 'maps' is a generic storage of different types for sharing data between kernel 7 - and userspace. 6 + BPF 'maps' provide generic storage of different types for sharing data between 7 + kernel and user space. There are several storage types available, including 8 + hash, array, bloom filter and radix-tree. Several of the map types exist to 9 + support specific BPF helpers that perform actions based on the map contents. The 10 + maps are accessed from BPF programs via BPF helpers which are documented in the 11 + `man-pages`_ for `bpf-helpers(7)`_. 8 12 9 - The maps are accessed from user space via BPF syscall, which has commands: 10 - 11 - - create a map with given type and attributes 12 - ``map_fd = bpf(BPF_MAP_CREATE, union bpf_attr *attr, u32 size)`` 13 - using attr->map_type, attr->key_size, attr->value_size, attr->max_entries 14 - returns process-local file descriptor or negative error 15 - 16 - - lookup key in a given map 17 - ``err = bpf(BPF_MAP_LOOKUP_ELEM, union bpf_attr *attr, u32 size)`` 18 - using attr->map_fd, attr->key, attr->value 19 - returns zero and stores found elem into value or negative error 20 - 21 - - create or update key/value pair in a given map 22 - ``err = bpf(BPF_MAP_UPDATE_ELEM, union bpf_attr *attr, u32 size)`` 23 - using attr->map_fd, attr->key, attr->value 24 - returns zero or negative error 25 - 26 - - find and delete element by key in a given map 27 - ``err = bpf(BPF_MAP_DELETE_ELEM, union bpf_attr *attr, u32 size)`` 28 - using attr->map_fd, attr->key 29 - 30 - - to delete map: close(fd) 31 - Exiting process will delete maps automatically 32 - 33 - userspace programs use this syscall to create/access maps that eBPF programs 34 - are concurrently updating. 35 - 36 - maps can have different types: hash, array, bloom filter, radix-tree, etc. 
37 - 38 - The map is defined by: 39 - 40 - - type 41 - - max number of elements 42 - - key size in bytes 43 - - value size in bytes 13 + BPF maps are accessed from user space via the ``bpf`` syscall, which provides 14 + commands to create maps, lookup elements, update elements and delete 15 + elements. More details of the BPF syscall are available in 16 + :doc:`/userspace-api/ebpf/syscall` and in the `man-pages`_ for `bpf(2)`_. 44 17 45 18 Map Types 46 19 ========= ··· 23 50 :glob: 24 51 25 52 map_* 53 + 54 + Usage Notes 55 + =========== 56 + 57 + .. c:function:: 58 + int bpf(int command, union bpf_attr *attr, u32 size) 59 + 60 + Use the ``bpf()`` system call to perform the operation specified by 61 + ``command``. The operation takes parameters provided in ``attr``. The ``size`` 62 + argument is the size of the ``union bpf_attr`` in ``attr``. 63 + 64 + **BPF_MAP_CREATE** 65 + 66 + Create a map with the desired type and attributes in ``attr``: 67 + 68 + .. code-block:: c 69 + 70 + int fd; 71 + union bpf_attr attr = { 72 + .map_type = BPF_MAP_TYPE_ARRAY; /* mandatory */ 73 + .key_size = sizeof(__u32); /* mandatory */ 74 + .value_size = sizeof(__u32); /* mandatory */ 75 + .max_entries = 256; /* mandatory */ 76 + .map_flags = BPF_F_MMAPABLE; 77 + .map_name = "example_array"; 78 + }; 79 + 80 + fd = bpf(BPF_MAP_CREATE, &attr, sizeof(attr)); 81 + 82 + Returns a process-local file descriptor on success, or negative error in case of 83 + failure. The map can be deleted by calling ``close(fd)``. Maps held by open 84 + file descriptors will be deleted automatically when a process exits. 85 + 86 + .. note:: Valid characters for ``map_name`` are ``A-Z``, ``a-z``, ``0-9``, 87 + ``'_'`` and ``'.'``. 88 + 89 + **BPF_MAP_LOOKUP_ELEM** 90 + 91 + Lookup key in a given map using ``attr->map_fd``, ``attr->key``, 92 + ``attr->value``. Returns zero and stores found elem into ``attr->value`` on 93 + success, or negative error on failure. 
94 + 95 + **BPF_MAP_UPDATE_ELEM** 96 + 97 + Create or update key/value pair in a given map using ``attr->map_fd``, ``attr->key``, 98 + ``attr->value``. Returns zero on success or negative error on failure. 99 + 100 + **BPF_MAP_DELETE_ELEM** 101 + 102 + Find and delete element by key in a given map using ``attr->map_fd``, 103 + ``attr->key``. Returns zero on success or negative error on failure. 104 + 105 + .. Links: 106 + .. _man-pages: https://www.kernel.org/doc/man-pages/ 107 + .. _bpf(2): https://man7.org/linux/man-pages/man2/bpf.2.html 108 + .. _bpf-helpers(7): https://man7.org/linux/man-pages/man7/bpf-helpers.7.html
+2 -7
arch/arm64/net/bpf_jit_comp.c
@@ -1649,13 +1649,8 @@
 	struct bpf_prog *p = l->link.prog;
 	int cookie_off = offsetof(struct bpf_tramp_run_ctx, bpf_cookie);
 
-	if (p->aux->sleepable) {
-		enter_prog = (u64)__bpf_prog_enter_sleepable;
-		exit_prog = (u64)__bpf_prog_exit_sleepable;
-	} else {
-		enter_prog = (u64)__bpf_prog_enter;
-		exit_prog = (u64)__bpf_prog_exit;
-	}
+	enter_prog = (u64)bpf_trampoline_enter(p);
+	exit_prog = (u64)bpf_trampoline_exit(p);
 
 	if (l->cookie == 0) {
 		/* if cookie is zero, one instruction is enough to store it */
+96 -29
arch/x86/net/bpf_jit_comp.c
··· 904 904 *pprog = prog; 905 905 } 906 906 907 + /* emit the 3-byte VEX prefix 908 + * 909 + * r: same as rex.r, extra bit for ModRM reg field 910 + * x: same as rex.x, extra bit for SIB index field 911 + * b: same as rex.b, extra bit for ModRM r/m, or SIB base 912 + * m: opcode map select, encoding escape bytes e.g. 0x0f38 913 + * w: same as rex.w (32 bit or 64 bit) or opcode specific 914 + * src_reg2: additional source reg (encoded as BPF reg) 915 + * l: vector length (128 bit or 256 bit) or reserved 916 + * pp: opcode prefix (none, 0x66, 0xf2 or 0xf3) 917 + */ 918 + static void emit_3vex(u8 **pprog, bool r, bool x, bool b, u8 m, 919 + bool w, u8 src_reg2, bool l, u8 pp) 920 + { 921 + u8 *prog = *pprog; 922 + const u8 b0 = 0xc4; /* first byte of 3-byte VEX prefix */ 923 + u8 b1, b2; 924 + u8 vvvv = reg2hex[src_reg2]; 925 + 926 + /* reg2hex gives only the lower 3 bit of vvvv */ 927 + if (is_ereg(src_reg2)) 928 + vvvv |= 1 << 3; 929 + 930 + /* 931 + * 2nd byte of 3-byte VEX prefix 932 + * ~ means bit inverted encoding 933 + * 934 + * 7 0 935 + * +---+---+---+---+---+---+---+---+ 936 + * |~R |~X |~B | m | 937 + * +---+---+---+---+---+---+---+---+ 938 + */ 939 + b1 = (!r << 7) | (!x << 6) | (!b << 5) | (m & 0x1f); 940 + /* 941 + * 3rd byte of 3-byte VEX prefix 942 + * 943 + * 7 0 944 + * +---+---+---+---+---+---+---+---+ 945 + * | W | ~vvvv | L | pp | 946 + * +---+---+---+---+---+---+---+---+ 947 + */ 948 + b2 = (w << 7) | ((~vvvv & 0xf) << 3) | (l << 2) | (pp & 3); 949 + 950 + EMIT3(b0, b1, b2); 951 + *pprog = prog; 952 + } 953 + 954 + /* emit BMI2 shift instruction */ 955 + static void emit_shiftx(u8 **pprog, u32 dst_reg, u8 src_reg, bool is64, u8 op) 956 + { 957 + u8 *prog = *pprog; 958 + bool r = is_ereg(dst_reg); 959 + u8 m = 2; /* escape code 0f38 */ 960 + 961 + emit_3vex(&prog, r, false, r, m, is64, src_reg, false, op); 962 + EMIT2(0xf7, add_2reg(0xC0, dst_reg, dst_reg)); 963 + *pprog = prog; 964 + } 965 + 907 966 #define INSN_SZ_DIFF (((addrs[i] - addrs[i - 
1]) - (prog - temp))) 908 967 909 968 static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image, u8 *rw_image, ··· 1209 1150 case BPF_ALU64 | BPF_LSH | BPF_X: 1210 1151 case BPF_ALU64 | BPF_RSH | BPF_X: 1211 1152 case BPF_ALU64 | BPF_ARSH | BPF_X: 1153 + /* BMI2 shifts aren't better when shift count is already in rcx */ 1154 + if (boot_cpu_has(X86_FEATURE_BMI2) && src_reg != BPF_REG_4) { 1155 + /* shrx/sarx/shlx dst_reg, dst_reg, src_reg */ 1156 + bool w = (BPF_CLASS(insn->code) == BPF_ALU64); 1157 + u8 op; 1212 1158 1213 - /* Check for bad case when dst_reg == rcx */ 1214 - if (dst_reg == BPF_REG_4) { 1215 - /* mov r11, dst_reg */ 1216 - EMIT_mov(AUX_REG, dst_reg); 1217 - dst_reg = AUX_REG; 1159 + switch (BPF_OP(insn->code)) { 1160 + case BPF_LSH: 1161 + op = 1; /* prefix 0x66 */ 1162 + break; 1163 + case BPF_RSH: 1164 + op = 3; /* prefix 0xf2 */ 1165 + break; 1166 + case BPF_ARSH: 1167 + op = 2; /* prefix 0xf3 */ 1168 + break; 1169 + } 1170 + 1171 + emit_shiftx(&prog, dst_reg, src_reg, w, op); 1172 + 1173 + break; 1218 1174 } 1219 1175 1220 1176 if (src_reg != BPF_REG_4) { /* common case */ 1221 - EMIT1(0x51); /* push rcx */ 1222 - 1177 + /* Check for bad case when dst_reg == rcx */ 1178 + if (dst_reg == BPF_REG_4) { 1179 + /* mov r11, dst_reg */ 1180 + EMIT_mov(AUX_REG, dst_reg); 1181 + dst_reg = AUX_REG; 1182 + } else { 1183 + EMIT1(0x51); /* push rcx */ 1184 + } 1223 1185 /* mov rcx, src_reg */ 1224 1186 EMIT_mov(BPF_REG_4, src_reg); 1225 1187 } ··· 1252 1172 b3 = simple_alu_opcodes[BPF_OP(insn->code)]; 1253 1173 EMIT2(0xD3, add_1reg(b3, dst_reg)); 1254 1174 1255 - if (src_reg != BPF_REG_4) 1256 - EMIT1(0x59); /* pop rcx */ 1175 + if (src_reg != BPF_REG_4) { 1176 + if (insn->dst_reg == BPF_REG_4) 1177 + /* mov dst_reg, r11 */ 1178 + EMIT_mov(insn->dst_reg, AUX_REG); 1179 + else 1180 + EMIT1(0x59); /* pop rcx */ 1181 + } 1257 1182 1258 - if (insn->dst_reg == BPF_REG_4) 1259 - /* mov dst_reg, r11 */ 1260 - EMIT_mov(insn->dst_reg, AUX_REG); 1261 1183 
break; 1262 1184 1263 1185 case BPF_ALU | BPF_END | BPF_FROM_BE: ··· 1907 1825 struct bpf_tramp_link *l, int stack_size, 1908 1826 int run_ctx_off, bool save_ret) 1909 1827 { 1910 - void (*exit)(struct bpf_prog *prog, u64 start, 1911 - struct bpf_tramp_run_ctx *run_ctx) = __bpf_prog_exit; 1912 - u64 (*enter)(struct bpf_prog *prog, 1913 - struct bpf_tramp_run_ctx *run_ctx) = __bpf_prog_enter; 1914 1828 u8 *prog = *pprog; 1915 1829 u8 *jmp_insn; 1916 1830 int ctx_cookie_off = offsetof(struct bpf_tramp_run_ctx, bpf_cookie); ··· 1925 1847 */ 1926 1848 emit_stx(&prog, BPF_DW, BPF_REG_FP, BPF_REG_1, -run_ctx_off + ctx_cookie_off); 1927 1849 1928 - if (p->aux->sleepable) { 1929 - enter = __bpf_prog_enter_sleepable; 1930 - exit = __bpf_prog_exit_sleepable; 1931 - } else if (p->type == BPF_PROG_TYPE_STRUCT_OPS) { 1932 - enter = __bpf_prog_enter_struct_ops; 1933 - exit = __bpf_prog_exit_struct_ops; 1934 - } else if (p->expected_attach_type == BPF_LSM_CGROUP) { 1935 - enter = __bpf_prog_enter_lsm_cgroup; 1936 - exit = __bpf_prog_exit_lsm_cgroup; 1937 - } 1938 - 1939 1850 /* arg1: mov rdi, progs[i] */ 1940 1851 emit_mov_imm64(&prog, BPF_REG_1, (long) p >> 32, (u32) (long) p); 1941 1852 /* arg2: lea rsi, [rbp - ctx_cookie_off] */ 1942 1853 EMIT4(0x48, 0x8D, 0x75, -run_ctx_off); 1943 1854 1944 - if (emit_call(&prog, enter, prog)) 1855 + if (emit_call(&prog, bpf_trampoline_enter(p), prog)) 1945 1856 return -EINVAL; 1946 1857 /* remember prog start time returned by __bpf_prog_enter */ 1947 1858 emit_mov_reg(&prog, true, BPF_REG_6, BPF_REG_0); ··· 1975 1908 emit_mov_reg(&prog, true, BPF_REG_2, BPF_REG_6); 1976 1909 /* arg3: lea rdx, [rbp - run_ctx_off] */ 1977 1910 EMIT4(0x48, 0x8D, 0x55, -run_ctx_off); 1978 - if (emit_call(&prog, exit, prog)) 1911 + if (emit_call(&prog, bpf_trampoline_exit(p), prog)) 1979 1912 return -EINVAL; 1980 1913 1981 1914 *pprog = prog;
+19 -14
include/linux/bpf.h
@@ -855,22 +855,18 @@
 				const struct btf_func_model *m, u32 flags,
 				struct bpf_tramp_links *tlinks,
 				void *orig_call);
-/* these two functions are called from generated trampoline */
-u64 notrace __bpf_prog_enter(struct bpf_prog *prog, struct bpf_tramp_run_ctx *run_ctx);
-void notrace __bpf_prog_exit(struct bpf_prog *prog, u64 start, struct bpf_tramp_run_ctx *run_ctx);
-u64 notrace __bpf_prog_enter_sleepable(struct bpf_prog *prog, struct bpf_tramp_run_ctx *run_ctx);
-void notrace __bpf_prog_exit_sleepable(struct bpf_prog *prog, u64 start,
-				       struct bpf_tramp_run_ctx *run_ctx);
-u64 notrace __bpf_prog_enter_lsm_cgroup(struct bpf_prog *prog,
-					struct bpf_tramp_run_ctx *run_ctx);
-void notrace __bpf_prog_exit_lsm_cgroup(struct bpf_prog *prog, u64 start,
-					struct bpf_tramp_run_ctx *run_ctx);
-u64 notrace __bpf_prog_enter_struct_ops(struct bpf_prog *prog,
-					struct bpf_tramp_run_ctx *run_ctx);
-void notrace __bpf_prog_exit_struct_ops(struct bpf_prog *prog, u64 start,
-					struct bpf_tramp_run_ctx *run_ctx);
+u64 notrace __bpf_prog_enter_sleepable_recur(struct bpf_prog *prog,
+					     struct bpf_tramp_run_ctx *run_ctx);
+void notrace __bpf_prog_exit_sleepable_recur(struct bpf_prog *prog, u64 start,
+					     struct bpf_tramp_run_ctx *run_ctx);
 void notrace __bpf_tramp_enter(struct bpf_tramp_image *tr);
 void notrace __bpf_tramp_exit(struct bpf_tramp_image *tr);
+typedef u64 (*bpf_trampoline_enter_t)(struct bpf_prog *prog,
+				      struct bpf_tramp_run_ctx *run_ctx);
+typedef void (*bpf_trampoline_exit_t)(struct bpf_prog *prog, u64 start,
+				      struct bpf_tramp_run_ctx *run_ctx);
+bpf_trampoline_enter_t bpf_trampoline_enter(const struct bpf_prog *prog);
+bpf_trampoline_exit_t bpf_trampoline_exit(const struct bpf_prog *prog);
 
 struct bpf_ksym {
 	unsigned long start;
@@ -2057,6 +2053,7 @@
 
 const struct bpf_func_proto *bpf_base_func_proto(enum bpf_func_id func_id);
 void bpf_task_storage_free(struct task_struct *task);
+void bpf_cgrp_storage_free(struct cgroup *cgroup);
 bool bpf_prog_has_kfunc_call(const struct bpf_prog *prog);
 const struct btf_func_model *
 bpf_jit_find_kfunc_model(const struct bpf_prog *prog,
@@ -2311,6 +2308,10 @@
 static inline void bpf_prog_inc_misses_counter(struct bpf_prog *prog)
 {
 }
+
+static inline void bpf_cgrp_storage_free(struct cgroup *cgroup)
+{
+}
 #endif /* CONFIG_BPF_SYSCALL */
 
 void __bpf_free_used_btfs(struct bpf_prog_aux *aux,
@@ -2535,7 +2536,9 @@
 extern const struct bpf_func_proto bpf_ktime_get_coarse_ns_proto;
 extern const struct bpf_func_proto bpf_sock_from_file_proto;
 extern const struct bpf_func_proto bpf_get_socket_ptr_cookie_proto;
+extern const struct bpf_func_proto bpf_task_storage_get_recur_proto;
 extern const struct bpf_func_proto bpf_task_storage_get_proto;
+extern const struct bpf_func_proto bpf_task_storage_delete_recur_proto;
 extern const struct bpf_func_proto bpf_task_storage_delete_proto;
 extern const struct bpf_func_proto bpf_for_each_map_elem_proto;
 extern const struct bpf_func_proto bpf_btf_find_by_name_kind_proto;
@@ -2549,6 +2552,8 @@
 extern const struct bpf_func_proto bpf_set_retval_proto;
 extern const struct bpf_func_proto bpf_get_retval_proto;
 extern const struct bpf_func_proto bpf_user_ringbuf_drain_proto;
+extern const struct bpf_func_proto bpf_cgrp_storage_get_proto;
+extern const struct bpf_func_proto bpf_cgrp_storage_delete_proto;
 
 const struct bpf_func_proto *tracing_prog_func_proto(
 	enum bpf_func_id func_id, const struct bpf_prog *prog);
+7 -10
include/linux/bpf_local_storage.h
@@ -116,21 +116,22 @@
 	.idx_lock = __SPIN_LOCK_UNLOCKED(name.idx_lock),	\
 }
 
-u16 bpf_local_storage_cache_idx_get(struct bpf_local_storage_cache *cache);
-void bpf_local_storage_cache_idx_free(struct bpf_local_storage_cache *cache,
-				      u16 idx);
-
 /* Helper functions for bpf_local_storage */
 int bpf_local_storage_map_alloc_check(union bpf_attr *attr);
 
-struct bpf_local_storage_map *bpf_local_storage_map_alloc(union bpf_attr *attr);
+struct bpf_map *
+bpf_local_storage_map_alloc(union bpf_attr *attr,
+			    struct bpf_local_storage_cache *cache);
 
 struct bpf_local_storage_data *
 bpf_local_storage_lookup(struct bpf_local_storage *local_storage,
 			 struct bpf_local_storage_map *smap,
 			 bool cacheit_lockit);
 
-void bpf_local_storage_map_free(struct bpf_local_storage_map *smap,
+bool bpf_local_storage_unlink_nolock(struct bpf_local_storage *local_storage);
+
+void bpf_local_storage_map_free(struct bpf_map *map,
+				struct bpf_local_storage_cache *cache,
 				int __percpu *busy_counter);
 
 int bpf_local_storage_map_check_btf(const struct bpf_map *map,
@@ -140,10 +141,6 @@
 
 void bpf_selem_link_storage_nolock(struct bpf_local_storage *local_storage,
 				   struct bpf_local_storage_elem *selem);
-
-bool bpf_selem_unlink_storage_nolock(struct bpf_local_storage *local_storage,
-				     struct bpf_local_storage_elem *selem,
-				     bool uncharge_omem, bool use_trace_rcu);
 
 void bpf_selem_unlink(struct bpf_local_storage_elem *selem, bool use_trace_rcu);
 
+1
include/linux/bpf_types.h
@@ -86,6 +86,7 @@
 BPF_MAP_TYPE(BPF_MAP_TYPE_PERF_EVENT_ARRAY, perf_event_array_map_ops)
 #ifdef CONFIG_CGROUPS
 BPF_MAP_TYPE(BPF_MAP_TYPE_CGROUP_ARRAY, cgroup_array_map_ops)
+BPF_MAP_TYPE(BPF_MAP_TYPE_CGRP_STORAGE, cgrp_storage_map_ops)
 #endif
 #ifdef CONFIG_CGROUP_BPF
 BPF_MAP_TYPE(BPF_MAP_TYPE_CGROUP_STORAGE, cgroup_storage_map_ops)
+14 -1
include/linux/bpf_verifier.h
@@ -642,10 +642,23 @@
 }
 
 /* only use after check_attach_btf_id() */
-static inline enum bpf_prog_type resolve_prog_type(struct bpf_prog *prog)
+static inline enum bpf_prog_type resolve_prog_type(const struct bpf_prog *prog)
 {
 	return prog->type == BPF_PROG_TYPE_EXT ?
 		prog->aux->dst_prog->type : prog->type;
+}
+
+static inline bool bpf_prog_check_recur(const struct bpf_prog *prog)
+{
+	switch (resolve_prog_type(prog)) {
+	case BPF_PROG_TYPE_TRACING:
+		return prog->expected_attach_type != BPF_TRACE_ITER;
+	case BPF_PROG_TYPE_STRUCT_OPS:
+	case BPF_PROG_TYPE_LSM:
+		return false;
+	default:
+		return true;
+	}
 }
 
 #endif /* _LINUX_BPF_VERIFIER_H */
+1
include/linux/btf_ids.h
@@ -265,5 +265,6 @@
 };
 
 extern u32 btf_tracing_ids[];
+extern u32 bpf_cgroup_btf_id[];
 
 #endif
+4
include/linux/cgroup-defs.h
@@ -507,6 +507,10 @@
 	/* Used to store internal freezer state */
 	struct cgroup_freezer_state freezer;
 
+#ifdef CONFIG_BPF_SYSCALL
+	struct bpf_local_storage __rcu *bpf_cgrp_storage;
+#endif
+
 	/* All ancestors including self */
 	struct cgroup *ancestors[];
 };
+9
include/linux/module.h
@@ -879,8 +879,17 @@
 }
 #endif /* CONFIG_MODULE_SIG */
 
+#if defined(CONFIG_MODULES) && defined(CONFIG_KALLSYMS)
 int module_kallsyms_on_each_symbol(int (*fn)(void *, const char *,
 					     struct module *, unsigned long),
 				   void *data);
+#else
+static inline int module_kallsyms_on_each_symbol(int (*fn)(void *, const char *,
+						 struct module *, unsigned long),
+						 void *data)
+{
+	return -EOPNOTSUPP;
+}
+#endif /* CONFIG_MODULES && CONFIG_KALLSYMS */
 
 #endif /* _LINUX_MODULE_H */
+49 -1
include/uapi/linux/bpf.h
··· 922 922 BPF_MAP_TYPE_CPUMAP, 923 923 BPF_MAP_TYPE_XSKMAP, 924 924 BPF_MAP_TYPE_SOCKHASH, 925 - BPF_MAP_TYPE_CGROUP_STORAGE, 925 + BPF_MAP_TYPE_CGROUP_STORAGE_DEPRECATED, 926 + /* BPF_MAP_TYPE_CGROUP_STORAGE is available to bpf programs attaching 927 + * to a cgroup. The newer BPF_MAP_TYPE_CGRP_STORAGE is available to 928 + * both cgroup-attached and other progs and supports all functionality 929 + * provided by BPF_MAP_TYPE_CGROUP_STORAGE. So mark 930 + * BPF_MAP_TYPE_CGROUP_STORAGE deprecated. 931 + */ 932 + BPF_MAP_TYPE_CGROUP_STORAGE = BPF_MAP_TYPE_CGROUP_STORAGE_DEPRECATED, 926 933 BPF_MAP_TYPE_REUSEPORT_SOCKARRAY, 927 934 BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE, 928 935 BPF_MAP_TYPE_QUEUE, ··· 942 935 BPF_MAP_TYPE_TASK_STORAGE, 943 936 BPF_MAP_TYPE_BLOOM_FILTER, 944 937 BPF_MAP_TYPE_USER_RINGBUF, 938 + BPF_MAP_TYPE_CGRP_STORAGE, 945 939 }; 946 940 947 941 /* Note that tracing related programs such as ··· 5443 5435 * **-E2BIG** if user-space has tried to publish a sample which is 5444 5436 * larger than the size of the ring buffer, or which cannot fit 5445 5437 * within a struct bpf_dynptr. 5438 + * 5439 + * void *bpf_cgrp_storage_get(struct bpf_map *map, struct cgroup *cgroup, void *value, u64 flags) 5440 + * Description 5441 + * Get a bpf_local_storage from the *cgroup*. 5442 + * 5443 + * Logically, it could be thought of as getting the value from 5444 + * a *map* with *cgroup* as the **key**. From this 5445 + * perspective, the usage is not much different from 5446 + * **bpf_map_lookup_elem**\ (*map*, **&**\ *cgroup*) except this 5447 + * helper enforces the key must be a cgroup struct and the map must also 5448 + * be a **BPF_MAP_TYPE_CGRP_STORAGE**. 5449 + * 5450 + * In reality, the local-storage value is embedded directly inside of the 5451 + * *cgroup* object itself, rather than being located in the 5452 + * **BPF_MAP_TYPE_CGRP_STORAGE** map. 
When the local-storage value is 5453 + * queried for some *map* on a *cgroup* object, the kernel will perform an 5454 + * O(n) iteration over all of the live local-storage values for that 5455 + * *cgroup* object until the local-storage value for the *map* is found. 5456 + * 5457 + * An optional *flags* (**BPF_LOCAL_STORAGE_GET_F_CREATE**) can be 5458 + * used such that a new bpf_local_storage will be 5459 + * created if one does not exist. *value* can be used 5460 + * together with **BPF_LOCAL_STORAGE_GET_F_CREATE** to specify 5461 + * the initial value of a bpf_local_storage. If *value* is 5462 + * **NULL**, the new bpf_local_storage will be zero initialized. 5463 + * Return 5464 + * A bpf_local_storage pointer is returned on success. 5465 + * 5466 + * **NULL** if not found or there was an error in adding 5467 + * a new bpf_local_storage. 5468 + * 5469 + * long bpf_cgrp_storage_delete(struct bpf_map *map, struct cgroup *cgroup) 5470 + * Description 5471 + * Delete a bpf_local_storage from a *cgroup*. 5472 + * Return 5473 + * 0 on success. 5474 + * 5475 + * **-ENOENT** if the bpf_local_storage cannot be found. 5446 5476 */ 5447 5477 #define ___BPF_FUNC_MAPPER(FN, ctx...) \ 5448 5478 FN(unspec, 0, ##ctx) \ ··· 5693 5647 FN(tcp_raw_check_syncookie_ipv6, 207, ##ctx) \ 5694 5648 FN(ktime_get_tai_ns, 208, ##ctx) \ 5695 5649 FN(user_ringbuf_drain, 209, ##ctx) \ 5650 + FN(cgrp_storage_get, 210, ##ctx) \ 5651 + FN(cgrp_storage_delete, 211, ##ctx) \ 5696 5652 /* */ 5697 5653 5698 5654 /* backwards-compatibility macros for users of __BPF_FUNC_MAPPER that don't
+1 -1
kernel/bpf/Makefile
@@ -25,7 +25,7 @@
 obj-$(CONFIG_BPF_SYSCALL) += stackmap.o
 endif
 ifeq ($(CONFIG_CGROUPS),y)
-obj-$(CONFIG_BPF_SYSCALL) += cgroup_iter.o
+obj-$(CONFIG_BPF_SYSCALL) += cgroup_iter.o bpf_cgrp_storage.o
 endif
 obj-$(CONFIG_CGROUP_BPF) += cgroup.o
 ifeq ($(CONFIG_INET),y)
+247
kernel/bpf/bpf_cgrp_storage.c
···
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2022 Meta Platforms, Inc. and affiliates.
+ */
+
+#include <linux/types.h>
+#include <linux/bpf.h>
+#include <linux/bpf_local_storage.h>
+#include <uapi/linux/btf.h>
+#include <linux/btf_ids.h>
+
+DEFINE_BPF_STORAGE_CACHE(cgroup_cache);
+
+static DEFINE_PER_CPU(int, bpf_cgrp_storage_busy);
+
+static void bpf_cgrp_storage_lock(void)
+{
+	migrate_disable();
+	this_cpu_inc(bpf_cgrp_storage_busy);
+}
+
+static void bpf_cgrp_storage_unlock(void)
+{
+	this_cpu_dec(bpf_cgrp_storage_busy);
+	migrate_enable();
+}
+
+static bool bpf_cgrp_storage_trylock(void)
+{
+	migrate_disable();
+	if (unlikely(this_cpu_inc_return(bpf_cgrp_storage_busy) != 1)) {
+		this_cpu_dec(bpf_cgrp_storage_busy);
+		migrate_enable();
+		return false;
+	}
+	return true;
+}
+
+static struct bpf_local_storage __rcu **cgroup_storage_ptr(void *owner)
+{
+	struct cgroup *cg = owner;
+
+	return &cg->bpf_cgrp_storage;
+}
+
+void bpf_cgrp_storage_free(struct cgroup *cgroup)
+{
+	struct bpf_local_storage *local_storage;
+	bool free_cgroup_storage = false;
+	unsigned long flags;
+
+	rcu_read_lock();
+	local_storage = rcu_dereference(cgroup->bpf_cgrp_storage);
+	if (!local_storage) {
+		rcu_read_unlock();
+		return;
+	}
+
+	bpf_cgrp_storage_lock();
+	raw_spin_lock_irqsave(&local_storage->lock, flags);
+	free_cgroup_storage = bpf_local_storage_unlink_nolock(local_storage);
+	raw_spin_unlock_irqrestore(&local_storage->lock, flags);
+	bpf_cgrp_storage_unlock();
+	rcu_read_unlock();
+
+	if (free_cgroup_storage)
+		kfree_rcu(local_storage, rcu);
+}
+
+static struct bpf_local_storage_data *
+cgroup_storage_lookup(struct cgroup *cgroup, struct bpf_map *map, bool cacheit_lockit)
+{
+	struct bpf_local_storage *cgroup_storage;
+	struct bpf_local_storage_map *smap;
+
+	cgroup_storage = rcu_dereference_check(cgroup->bpf_cgrp_storage,
+					       bpf_rcu_lock_held());
+	if (!cgroup_storage)
+		return NULL;
+
+	smap = (struct bpf_local_storage_map *)map;
+	return bpf_local_storage_lookup(cgroup_storage, smap, cacheit_lockit);
+}
+
+static void *bpf_cgrp_storage_lookup_elem(struct bpf_map *map, void *key)
+{
+	struct bpf_local_storage_data *sdata;
+	struct cgroup *cgroup;
+	int fd;
+
+	fd = *(int *)key;
+	cgroup = cgroup_get_from_fd(fd);
+	if (IS_ERR(cgroup))
+		return ERR_CAST(cgroup);
+
+	bpf_cgrp_storage_lock();
+	sdata = cgroup_storage_lookup(cgroup, map, true);
+	bpf_cgrp_storage_unlock();
+	cgroup_put(cgroup);
+	return sdata ? sdata->data : NULL;
+}
+
+static int bpf_cgrp_storage_update_elem(struct bpf_map *map, void *key,
+					 void *value, u64 map_flags)
+{
+	struct bpf_local_storage_data *sdata;
+	struct cgroup *cgroup;
+	int fd;
+
+	fd = *(int *)key;
+	cgroup = cgroup_get_from_fd(fd);
+	if (IS_ERR(cgroup))
+		return PTR_ERR(cgroup);
+
+	bpf_cgrp_storage_lock();
+	sdata = bpf_local_storage_update(cgroup, (struct bpf_local_storage_map *)map,
+					 value, map_flags, GFP_ATOMIC);
+	bpf_cgrp_storage_unlock();
+	cgroup_put(cgroup);
+	return PTR_ERR_OR_ZERO(sdata);
+}
+
+static int cgroup_storage_delete(struct cgroup *cgroup, struct bpf_map *map)
+{
+	struct bpf_local_storage_data *sdata;
+
+	sdata = cgroup_storage_lookup(cgroup, map, false);
+	if (!sdata)
+		return -ENOENT;
+
+	bpf_selem_unlink(SELEM(sdata), true);
+	return 0;
+}
+
+static int bpf_cgrp_storage_delete_elem(struct bpf_map *map, void *key)
+{
+	struct cgroup *cgroup;
+	int err, fd;
+
+	fd = *(int *)key;
+	cgroup = cgroup_get_from_fd(fd);
+	if (IS_ERR(cgroup))
+		return PTR_ERR(cgroup);
+
+	bpf_cgrp_storage_lock();
+	err = cgroup_storage_delete(cgroup, map);
+	bpf_cgrp_storage_unlock();
+	cgroup_put(cgroup);
+	return err;
+}
+
+static int notsupp_get_next_key(struct bpf_map *map, void *key, void *next_key)
+{
+	return -ENOTSUPP;
+}
+
+static struct bpf_map *cgroup_storage_map_alloc(union bpf_attr *attr)
+{
+	return bpf_local_storage_map_alloc(attr, &cgroup_cache);
+}
+
+static void cgroup_storage_map_free(struct bpf_map *map)
+{
+	bpf_local_storage_map_free(map, &cgroup_cache, NULL);
+}
+
+/* *gfp_flags* is a hidden argument provided by the verifier */
+BPF_CALL_5(bpf_cgrp_storage_get, struct bpf_map *, map, struct cgroup *, cgroup,
+	   void *, value, u64, flags, gfp_t, gfp_flags)
+{
+	struct bpf_local_storage_data *sdata;
+
+	WARN_ON_ONCE(!bpf_rcu_lock_held());
+	if (flags & ~(BPF_LOCAL_STORAGE_GET_F_CREATE))
+		return (unsigned long)NULL;
+
+	if (!cgroup)
+		return (unsigned long)NULL;
+
+	if (!bpf_cgrp_storage_trylock())
+		return (unsigned long)NULL;
+
+	sdata = cgroup_storage_lookup(cgroup, map, true);
+	if (sdata)
+		goto unlock;
+
+	/* only allocate new storage, when the cgroup is refcounted */
+	if (!percpu_ref_is_dying(&cgroup->self.refcnt) &&
+	    (flags & BPF_LOCAL_STORAGE_GET_F_CREATE))
+		sdata = bpf_local_storage_update(cgroup, (struct bpf_local_storage_map *)map,
+						 value, BPF_NOEXIST, gfp_flags);
+
+unlock:
+	bpf_cgrp_storage_unlock();
+	return IS_ERR_OR_NULL(sdata) ? (unsigned long)NULL : (unsigned long)sdata->data;
+}
+
+BPF_CALL_2(bpf_cgrp_storage_delete, struct bpf_map *, map, struct cgroup *, cgroup)
+{
+	int ret;
+
+	WARN_ON_ONCE(!bpf_rcu_lock_held());
+	if (!cgroup)
+		return -EINVAL;
+
+	if (!bpf_cgrp_storage_trylock())
+		return -EBUSY;
+
+	ret = cgroup_storage_delete(cgroup, map);
+	bpf_cgrp_storage_unlock();
+	return ret;
+}
+
+BTF_ID_LIST_SINGLE(cgroup_storage_map_btf_ids, struct, bpf_local_storage_map)
+const struct bpf_map_ops cgrp_storage_map_ops = {
+	.map_meta_equal = bpf_map_meta_equal,
+	.map_alloc_check = bpf_local_storage_map_alloc_check,
+	.map_alloc = cgroup_storage_map_alloc,
+	.map_free = cgroup_storage_map_free,
+	.map_get_next_key = notsupp_get_next_key,
+	.map_lookup_elem = bpf_cgrp_storage_lookup_elem,
+	.map_update_elem = bpf_cgrp_storage_update_elem,
+	.map_delete_elem = bpf_cgrp_storage_delete_elem,
+	.map_check_btf = bpf_local_storage_map_check_btf,
+	.map_btf_id = &cgroup_storage_map_btf_ids[0],
+	.map_owner_storage_ptr = cgroup_storage_ptr,
+};
+
+const struct bpf_func_proto bpf_cgrp_storage_get_proto = {
+	.func = bpf_cgrp_storage_get,
+	.gpl_only = false,
+	.ret_type = RET_PTR_TO_MAP_VALUE_OR_NULL,
+	.arg1_type = ARG_CONST_MAP_PTR,
+	.arg2_type = ARG_PTR_TO_BTF_ID,
+	.arg2_btf_id = &bpf_cgroup_btf_id[0],
+	.arg3_type = ARG_PTR_TO_MAP_VALUE_OR_NULL,
+	.arg4_type = ARG_ANYTHING,
+};
+
+const struct bpf_func_proto bpf_cgrp_storage_delete_proto = {
+	.func = bpf_cgrp_storage_delete,
+	.gpl_only = false,
+	.ret_type = RET_INTEGER,
+	.arg1_type = ARG_CONST_MAP_PTR,
+	.arg2_type = ARG_PTR_TO_BTF_ID,
+	.arg2_btf_id = &bpf_cgroup_btf_id[0],
+};
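Reviewer note: the busy-counter guard in kernel/bpf/bpf_cgrp_storage.c can be illustrated outside the kernel. The following is a minimal userspace sketch, not kernel code: a plain int stands in for the per-CPU bpf_cgrp_storage_busy counter, migration control is omitted, and the names storage_lock, storage_trylock, storage_get and nested_get are hypothetical. It shows why a helper that uses the trylock variant backs off when it is entered while the counter is already held.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-CPU bpf_cgrp_storage_busy counter. */
static int busy_counter;

/* Unconditional entry, as used on the syscall-side map operations. */
static void storage_lock(void)
{
	busy_counter++;
}

static void storage_unlock(void)
{
	assert(busy_counter > 0);
	busy_counter--;
}

/* Conditional entry: succeeds only if nobody on this "CPU" is inside. */
static bool storage_trylock(void)
{
	if (++busy_counter != 1) {
		busy_counter--;
		return false;
	}
	return true;
}

/*
 * Shaped like the helper path: give up (here: return -1) when the
 * counter shows that storage code is already running.
 * Assumes callers pass non-negative values so -1 is unambiguous.
 */
static int storage_get(int value)
{
	int ret;

	if (!storage_trylock())
		return -1;
	ret = value;
	storage_unlock();
	return ret;
}

/* Simulates a helper call that arrives while the counter is held. */
static int nested_get(int value)
{
	int ret;

	storage_lock();
	ret = storage_get(value);
	storage_unlock();
	return ret;
}
```

In this sketch a direct storage_get(7) returns 7, while nested_get(7) returns -1 because the inner trylock observes a counter of 2 and undoes its increment.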
+3 -35
kernel/bpf/bpf_inode_storage.c
···

 void bpf_inode_storage_free(struct inode *inode)
 {
-	struct bpf_local_storage_elem *selem;
 	struct bpf_local_storage *local_storage;
 	bool free_inode_storage = false;
 	struct bpf_storage_blob *bsb;
-	struct hlist_node *n;

 	bsb = bpf_inode(inode);
 	if (!bsb)
···
 		return;
 	}

-	/* Neither the bpf_prog nor the bpf-map's syscall
-	 * could be modifying the local_storage->list now.
-	 * Thus, no elem can be added-to or deleted-from the
-	 * local_storage->list by the bpf_prog or by the bpf-map's syscall.
-	 *
-	 * It is racing with bpf_local_storage_map_free() alone
-	 * when unlinking elem from the local_storage->list and
-	 * the map's bucket->list.
-	 */
 	raw_spin_lock_bh(&local_storage->lock);
-	hlist_for_each_entry_safe(selem, n, &local_storage->list, snode) {
-		/* Always unlink from map before unlinking from
-		 * local_storage.
-		 */
-		bpf_selem_unlink_map(selem);
-		free_inode_storage = bpf_selem_unlink_storage_nolock(
-			local_storage, selem, false, false);
-	}
+	free_inode_storage = bpf_local_storage_unlink_nolock(local_storage);
 	raw_spin_unlock_bh(&local_storage->lock);
 	rcu_read_unlock();

-	/* free_inoode_storage should always be true as long as
-	 * local_storage->list was non-empty.
-	 */
 	if (free_inode_storage)
 		kfree_rcu(local_storage, rcu);
 }
···

 static struct bpf_map *inode_storage_map_alloc(union bpf_attr *attr)
 {
-	struct bpf_local_storage_map *smap;
-
-	smap = bpf_local_storage_map_alloc(attr);
-	if (IS_ERR(smap))
-		return ERR_CAST(smap);
-
-	smap->cache_idx = bpf_local_storage_cache_idx_get(&inode_cache);
-	return &smap->map;
+	return bpf_local_storage_map_alloc(attr, &inode_cache);
 }

 static void inode_storage_map_free(struct bpf_map *map)
 {
-	struct bpf_local_storage_map *smap;
-
-	smap = (struct bpf_local_storage_map *)map;
-	bpf_local_storage_cache_idx_free(&inode_cache, smap->cache_idx);
-	bpf_local_storage_map_free(smap, NULL);
+	bpf_local_storage_map_free(map, &inode_cache, NULL);
 }

 BTF_ID_LIST_SINGLE(inode_storage_map_btf_ids, struct,
+131 -78
kernel/bpf/bpf_local_storage.c
···
  * The caller must ensure selem->smap is still valid to be
  * dereferenced for its smap->elem_size and smap->cache_idx.
  */
-bool bpf_selem_unlink_storage_nolock(struct bpf_local_storage *local_storage,
-				     struct bpf_local_storage_elem *selem,
-				     bool uncharge_mem, bool use_trace_rcu)
+static bool bpf_selem_unlink_storage_nolock(struct bpf_local_storage *local_storage,
+					    struct bpf_local_storage_elem *selem,
+					    bool uncharge_mem, bool use_trace_rcu)
 {
 	struct bpf_local_storage_map *smap;
 	bool free_local_storage;
···
 	__bpf_selem_unlink_storage(selem, use_trace_rcu);
 }

+/* If cacheit_lockit is false, this lookup function is lockless */
 struct bpf_local_storage_data *
 bpf_local_storage_lookup(struct bpf_local_storage *local_storage,
 			 struct bpf_local_storage_map *smap,
···
 	return ERR_PTR(err);
 }

-u16 bpf_local_storage_cache_idx_get(struct bpf_local_storage_cache *cache)
+static u16 bpf_local_storage_cache_idx_get(struct bpf_local_storage_cache *cache)
 {
 	u64 min_usage = U64_MAX;
 	u16 i, res = 0;
···
 	return res;
 }

-void bpf_local_storage_cache_idx_free(struct bpf_local_storage_cache *cache,
-				      u16 idx)
+static void bpf_local_storage_cache_idx_free(struct bpf_local_storage_cache *cache,
+					     u16 idx)
 {
 	spin_lock(&cache->idx_lock);
 	cache->idx_usage_counts[idx]--;
 	spin_unlock(&cache->idx_lock);
 }

-void bpf_local_storage_map_free(struct bpf_local_storage_map *smap,
-				int __percpu *busy_counter)
+int bpf_local_storage_map_alloc_check(union bpf_attr *attr)
+{
+	if (attr->map_flags & ~BPF_LOCAL_STORAGE_CREATE_FLAG_MASK ||
+	    !(attr->map_flags & BPF_F_NO_PREALLOC) ||
+	    attr->max_entries ||
+	    attr->key_size != sizeof(int) || !attr->value_size ||
+	    /* Enforce BTF for userspace sk dumping */
+	    !attr->btf_key_type_id || !attr->btf_value_type_id)
+		return -EINVAL;
+
+	if (!bpf_capable())
+		return -EPERM;
+
+	if (attr->value_size > BPF_LOCAL_STORAGE_MAX_VALUE_SIZE)
+		return -E2BIG;
+
+	return 0;
+}
+
+static struct bpf_local_storage_map *__bpf_local_storage_map_alloc(union bpf_attr *attr)
+{
+	struct bpf_local_storage_map *smap;
+	unsigned int i;
+	u32 nbuckets;
+
+	smap = bpf_map_area_alloc(sizeof(*smap), NUMA_NO_NODE);
+	if (!smap)
+		return ERR_PTR(-ENOMEM);
+	bpf_map_init_from_attr(&smap->map, attr);
+
+	nbuckets = roundup_pow_of_two(num_possible_cpus());
+	/* Use at least 2 buckets, select_bucket() is undefined behavior with 1 bucket */
+	nbuckets = max_t(u32, 2, nbuckets);
+	smap->bucket_log = ilog2(nbuckets);
+
+	smap->buckets = kvcalloc(sizeof(*smap->buckets), nbuckets,
+				 GFP_USER | __GFP_NOWARN | __GFP_ACCOUNT);
+	if (!smap->buckets) {
+		bpf_map_area_free(smap);
+		return ERR_PTR(-ENOMEM);
+	}
+
+	for (i = 0; i < nbuckets; i++) {
+		INIT_HLIST_HEAD(&smap->buckets[i].list);
+		raw_spin_lock_init(&smap->buckets[i].lock);
+	}
+
+	smap->elem_size =
+		sizeof(struct bpf_local_storage_elem) + attr->value_size;
+
+	return smap;
+}
+
+int bpf_local_storage_map_check_btf(const struct bpf_map *map,
+				    const struct btf *btf,
+				    const struct btf_type *key_type,
+				    const struct btf_type *value_type)
+{
+	u32 int_data;
+
+	if (BTF_INFO_KIND(key_type->info) != BTF_KIND_INT)
+		return -EINVAL;
+
+	int_data = *(u32 *)(key_type + 1);
+	if (BTF_INT_BITS(int_data) != 32 || BTF_INT_OFFSET(int_data))
+		return -EINVAL;
+
+	return 0;
+}
+
+bool bpf_local_storage_unlink_nolock(struct bpf_local_storage *local_storage)
 {
 	struct bpf_local_storage_elem *selem;
+	bool free_storage = false;
+	struct hlist_node *n;
+
+	/* Neither the bpf_prog nor the bpf_map's syscall
+	 * could be modifying the local_storage->list now.
+	 * Thus, no elem can be added to or deleted from the
+	 * local_storage->list by the bpf_prog or by the bpf_map's syscall.
+	 *
+	 * It is racing with bpf_local_storage_map_free() alone
+	 * when unlinking elem from the local_storage->list and
+	 * the map's bucket->list.
+	 */
+	hlist_for_each_entry_safe(selem, n, &local_storage->list, snode) {
+		/* Always unlink from map before unlinking from
+		 * local_storage.
+		 */
+		bpf_selem_unlink_map(selem);
+		/* If local_storage list has only one element, the
+		 * bpf_selem_unlink_storage_nolock() will return true.
+		 * Otherwise, it will return false. The current loop iteration
+		 * intends to remove all local storage. So the last iteration
+		 * of the loop will set the free_cgroup_storage to true.
+		 */
+		free_storage = bpf_selem_unlink_storage_nolock(
+			local_storage, selem, false, false);
+	}
+
+	return free_storage;
+}
+
+struct bpf_map *
+bpf_local_storage_map_alloc(union bpf_attr *attr,
+			    struct bpf_local_storage_cache *cache)
+{
+	struct bpf_local_storage_map *smap;
+
+	smap = __bpf_local_storage_map_alloc(attr);
+	if (IS_ERR(smap))
+		return ERR_CAST(smap);
+
+	smap->cache_idx = bpf_local_storage_cache_idx_get(cache);
+	return &smap->map;
+}
+
+void bpf_local_storage_map_free(struct bpf_map *map,
+				struct bpf_local_storage_cache *cache,
+				int __percpu *busy_counter)
+{
 	struct bpf_local_storage_map_bucket *b;
+	struct bpf_local_storage_elem *selem;
+	struct bpf_local_storage_map *smap;
 	unsigned int i;
+
+	smap = (struct bpf_local_storage_map *)map;
+	bpf_local_storage_cache_idx_free(cache, smap->cache_idx);

 	/* Note that this map might be concurrently cloned from
 	 * bpf_sk_storage_clone. Wait for any existing bpf_sk_storage_clone
···

 	kvfree(smap->buckets);
 	bpf_map_area_free(smap);
-}
-
-int bpf_local_storage_map_alloc_check(union bpf_attr *attr)
-{
-	if (attr->map_flags & ~BPF_LOCAL_STORAGE_CREATE_FLAG_MASK ||
-	    !(attr->map_flags & BPF_F_NO_PREALLOC) ||
-	    attr->max_entries ||
-	    attr->key_size != sizeof(int) || !attr->value_size ||
-	    /* Enforce BTF for userspace sk dumping */
-	    !attr->btf_key_type_id || !attr->btf_value_type_id)
-		return -EINVAL;
-
-	if (!bpf_capable())
-		return -EPERM;
-
-	if (attr->value_size > BPF_LOCAL_STORAGE_MAX_VALUE_SIZE)
-		return -E2BIG;
-
-	return 0;
-}
-
-struct bpf_local_storage_map *bpf_local_storage_map_alloc(union bpf_attr *attr)
-{
-	struct bpf_local_storage_map *smap;
-	unsigned int i;
-	u32 nbuckets;
-
-	smap = bpf_map_area_alloc(sizeof(*smap), NUMA_NO_NODE);
-	if (!smap)
-		return ERR_PTR(-ENOMEM);
-	bpf_map_init_from_attr(&smap->map, attr);
-
-	nbuckets = roundup_pow_of_two(num_possible_cpus());
-	/* Use at least 2 buckets, select_bucket() is undefined behavior with 1 bucket */
-	nbuckets = max_t(u32, 2, nbuckets);
-	smap->bucket_log = ilog2(nbuckets);
-
-	smap->buckets = kvcalloc(sizeof(*smap->buckets), nbuckets,
-				 GFP_USER | __GFP_NOWARN | __GFP_ACCOUNT);
-	if (!smap->buckets) {
-		bpf_map_area_free(smap);
-		return ERR_PTR(-ENOMEM);
-	}
-
-	for (i = 0; i < nbuckets; i++) {
-		INIT_HLIST_HEAD(&smap->buckets[i].list);
-		raw_spin_lock_init(&smap->buckets[i].lock);
-	}
-
-	smap->elem_size =
-		sizeof(struct bpf_local_storage_elem) + attr->value_size;
-
-	return smap;
-}
-
-int bpf_local_storage_map_check_btf(const struct bpf_map *map,
-				    const struct btf *btf,
-				    const struct btf_type *key_type,
-				    const struct btf_type *value_type)
-{
-	u32 int_data;
-
-	if (BTF_INFO_KIND(key_type->info) != BTF_KIND_INT)
-		return -EINVAL;
-
-	int_data = *(u32 *)(key_type + 1);
-	if (BTF_INT_BITS(int_data) != 32 || BTF_INT_OFFSET(int_data))
-		return -EINVAL;
-
-	return 0;
 }
+100 -63
kernel/bpf/bpf_task_storage.c
···

 void bpf_task_storage_free(struct task_struct *task)
 {
-	struct bpf_local_storage_elem *selem;
 	struct bpf_local_storage *local_storage;
 	bool free_task_storage = false;
-	struct hlist_node *n;
 	unsigned long flags;

 	rcu_read_lock();
···
 		return;
 	}

-	/* Neither the bpf_prog nor the bpf-map's syscall
-	 * could be modifying the local_storage->list now.
-	 * Thus, no elem can be added-to or deleted-from the
-	 * local_storage->list by the bpf_prog or by the bpf-map's syscall.
-	 *
-	 * It is racing with bpf_local_storage_map_free() alone
-	 * when unlinking elem from the local_storage->list and
-	 * the map's bucket->list.
-	 */
 	bpf_task_storage_lock();
 	raw_spin_lock_irqsave(&local_storage->lock, flags);
-	hlist_for_each_entry_safe(selem, n, &local_storage->list, snode) {
-		/* Always unlink from map before unlinking from
-		 * local_storage.
-		 */
-		bpf_selem_unlink_map(selem);
-		free_task_storage = bpf_selem_unlink_storage_nolock(
-			local_storage, selem, false, false);
-	}
+	free_task_storage = bpf_local_storage_unlink_nolock(local_storage);
 	raw_spin_unlock_irqrestore(&local_storage->lock, flags);
 	bpf_task_storage_unlock();
 	rcu_read_unlock();

-	/* free_task_storage should always be true as long as
-	 * local_storage->list was non-empty.
-	 */
 	if (free_task_storage)
 		kfree_rcu(local_storage, rcu);
 }
···
 	return err;
 }

-static int task_storage_delete(struct task_struct *task, struct bpf_map *map)
+static int task_storage_delete(struct task_struct *task, struct bpf_map *map,
+			       bool nobusy)
 {
 	struct bpf_local_storage_data *sdata;

 	sdata = task_storage_lookup(task, map, false);
 	if (!sdata)
 		return -ENOENT;
+
+	if (!nobusy)
+		return -EBUSY;

 	bpf_selem_unlink(SELEM(sdata), true);

···
 	}

 	bpf_task_storage_lock();
-	err = task_storage_delete(task, map);
+	err = task_storage_delete(task, map, true);
 	bpf_task_storage_unlock();
 out:
 	put_pid(pid);
 	return err;
 }

+/* Called by bpf_task_storage_get*() helpers */
+static void *__bpf_task_storage_get(struct bpf_map *map,
+				    struct task_struct *task, void *value,
+				    u64 flags, gfp_t gfp_flags, bool nobusy)
+{
+	struct bpf_local_storage_data *sdata;
+
+	sdata = task_storage_lookup(task, map, nobusy);
+	if (sdata)
+		return sdata->data;
+
+	/* only allocate new storage, when the task is refcounted */
+	if (refcount_read(&task->usage) &&
+	    (flags & BPF_LOCAL_STORAGE_GET_F_CREATE) && nobusy) {
+		sdata = bpf_local_storage_update(
+			task, (struct bpf_local_storage_map *)map, value,
+			BPF_NOEXIST, gfp_flags);
+		return IS_ERR(sdata) ? NULL : sdata->data;
+	}
+
+	return NULL;
+}
+
+/* *gfp_flags* is a hidden argument provided by the verifier */
+BPF_CALL_5(bpf_task_storage_get_recur, struct bpf_map *, map, struct task_struct *,
+	   task, void *, value, u64, flags, gfp_t, gfp_flags)
+{
+	bool nobusy;
+	void *data;
+
+	WARN_ON_ONCE(!bpf_rcu_lock_held());
+	if (flags & ~BPF_LOCAL_STORAGE_GET_F_CREATE || !task)
+		return (unsigned long)NULL;
+
+	nobusy = bpf_task_storage_trylock();
+	data = __bpf_task_storage_get(map, task, value, flags,
+				      gfp_flags, nobusy);
+	if (nobusy)
+		bpf_task_storage_unlock();
+	return (unsigned long)data;
+}
+
 /* *gfp_flags* is a hidden argument provided by the verifier */
 BPF_CALL_5(bpf_task_storage_get, struct bpf_map *, map, struct task_struct *,
 	   task, void *, value, u64, flags, gfp_t, gfp_flags)
 {
-	struct bpf_local_storage_data *sdata;
+	void *data;

 	WARN_ON_ONCE(!bpf_rcu_lock_held());
-	if (flags & ~(BPF_LOCAL_STORAGE_GET_F_CREATE))
+	if (flags & ~BPF_LOCAL_STORAGE_GET_F_CREATE || !task)
 		return (unsigned long)NULL;

-	if (!task)
-		return (unsigned long)NULL;
-
-	if (!bpf_task_storage_trylock())
-		return (unsigned long)NULL;
-
-	sdata = task_storage_lookup(task, map, true);
-	if (sdata)
-		goto unlock;
-
-	/* only allocate new storage, when the task is refcounted */
-	if (refcount_read(&task->usage) &&
-	    (flags & BPF_LOCAL_STORAGE_GET_F_CREATE))
-		sdata = bpf_local_storage_update(
-			task, (struct bpf_local_storage_map *)map, value,
-			BPF_NOEXIST, gfp_flags);
-
-unlock:
+	bpf_task_storage_lock();
+	data = __bpf_task_storage_get(map, task, value, flags,
+				      gfp_flags, true);
 	bpf_task_storage_unlock();
-	return IS_ERR_OR_NULL(sdata) ? (unsigned long)NULL :
-		(unsigned long)sdata->data;
+	return (unsigned long)data;
+}
+
+BPF_CALL_2(bpf_task_storage_delete_recur, struct bpf_map *, map, struct task_struct *,
+	   task)
+{
+	bool nobusy;
+	int ret;
+
+	WARN_ON_ONCE(!bpf_rcu_lock_held());
+	if (!task)
+		return -EINVAL;
+
+	nobusy = bpf_task_storage_trylock();
+	/* This helper must only be called from places where the lifetime of the task
+	 * is guaranteed. Either by being refcounted or by being protected
+	 * by an RCU read-side critical section.
+	 */
+	ret = task_storage_delete(task, map, nobusy);
+	if (nobusy)
+		bpf_task_storage_unlock();
+	return ret;
 }

 BPF_CALL_2(bpf_task_storage_delete, struct bpf_map *, map, struct task_struct *,
···
 	if (!task)
 		return -EINVAL;

-	if (!bpf_task_storage_trylock())
-		return -EBUSY;
-
+	bpf_task_storage_lock();
 	/* This helper must only be called from places where the lifetime of the task
 	 * is guaranteed. Either by being refcounted or by being protected
 	 * by an RCU read-side critical section.
 	 */
-	ret = task_storage_delete(task, map);
+	ret = task_storage_delete(task, map, true);
 	bpf_task_storage_unlock();
 	return ret;
 }
···

 static struct bpf_map *task_storage_map_alloc(union bpf_attr *attr)
 {
-	struct bpf_local_storage_map *smap;
-
-	smap = bpf_local_storage_map_alloc(attr);
-	if (IS_ERR(smap))
-		return ERR_CAST(smap);
-
-	smap->cache_idx = bpf_local_storage_cache_idx_get(&task_cache);
-	return &smap->map;
+	return bpf_local_storage_map_alloc(attr, &task_cache);
 }

 static void task_storage_map_free(struct bpf_map *map)
 {
-	struct bpf_local_storage_map *smap;
-
-	smap = (struct bpf_local_storage_map *)map;
-	bpf_local_storage_cache_idx_free(&task_cache, smap->cache_idx);
-	bpf_local_storage_map_free(smap, &bpf_task_storage_busy);
+	bpf_local_storage_map_free(map, &task_cache, &bpf_task_storage_busy);
 }

 BTF_ID_LIST_SINGLE(task_storage_map_btf_ids, struct, bpf_local_storage_map)
···
 	.map_owner_storage_ptr = task_storage_ptr,
 };

+const struct bpf_func_proto bpf_task_storage_get_recur_proto = {
+	.func = bpf_task_storage_get_recur,
+	.gpl_only = false,
+	.ret_type = RET_PTR_TO_MAP_VALUE_OR_NULL,
+	.arg1_type = ARG_CONST_MAP_PTR,
+	.arg2_type = ARG_PTR_TO_BTF_ID,
+	.arg2_btf_id = &btf_tracing_ids[BTF_TRACING_TYPE_TASK],
+	.arg3_type = ARG_PTR_TO_MAP_VALUE_OR_NULL,
+	.arg4_type = ARG_ANYTHING,
+};
+
 const struct bpf_func_proto bpf_task_storage_get_proto = {
 	.func = bpf_task_storage_get,
 	.gpl_only = false,
···
 	.arg2_btf_id = &btf_tracing_ids[BTF_TRACING_TYPE_TASK],
 	.arg3_type = ARG_PTR_TO_MAP_VALUE_OR_NULL,
 	.arg4_type = ARG_ANYTHING,
+};
+
+const struct bpf_func_proto bpf_task_storage_delete_recur_proto = {
+	.func = bpf_task_storage_delete_recur,
+	.gpl_only = false,
+	.ret_type = RET_INTEGER,
+	.arg1_type = ARG_CONST_MAP_PTR,
+	.arg2_type = ARG_PTR_TO_BTF_ID,
+	.arg2_btf_id = &btf_tracing_ids[BTF_TRACING_TYPE_TASK],
 };

 const struct bpf_func_proto bpf_task_storage_delete_proto = {
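Reviewer note: the nobusy handling in the new *_recur task storage helpers can be sketched in userspace. This is an illustration under stated assumptions, not kernel code: a single struct slot stands in for one task's storage element, a plain int stands in for the per-CPU busy counter, and get_recur, delete_recur and the other names are hypothetical. The point it shows is the one visible in the diff: when the trylock fails, an existing element is still returned, but creation is skipped and deletion reports -EBUSY.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one owner's storage element. */
struct slot {
	bool present;
	int value;
};

/* Hypothetical stand-in for the per-CPU busy counter. */
static int busy;

static bool trylock(void)
{
	if (++busy != 1) {
		busy--;
		return false;
	}
	return true;
}

static void unlock(void)
{
	assert(busy > 0);
	busy--;
}

/* Lookup always proceeds; creation happens only when nobusy is true. */
static int *get_recur(struct slot *s, bool create)
{
	bool nobusy = trylock();
	int *data = NULL;

	if (s->present) {
		data = &s->value;
	} else if (create && nobusy) {
		s->present = true;
		s->value = 0;
		data = &s->value;
	}

	if (nobusy)
		unlock();
	return data;
}

/* Missing element wins over busy, matching the order in the diff. */
static int delete_recur(struct slot *s)
{
	bool nobusy = trylock();
	int ret;

	if (!s->present)
		ret = -ENOENT;
	else if (!nobusy)
		ret = -EBUSY;
	else {
		s->present = false;
		ret = 0;
	}

	if (nobusy)
		unlock();
	return ret;
}
```

To exercise the busy path in this sketch, set busy to 1 before calling get_recur or delete_recur and restore it to 0 afterwards.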
+1 -1
kernel/bpf/cgroup_iter.c
···
 	.show = cgroup_iter_seq_show,
 };

-BTF_ID_LIST_SINGLE(bpf_cgroup_btf_id, struct, cgroup)
+BTF_ID_LIST_GLOBAL_SINGLE(bpf_cgroup_btf_id, struct, cgroup)

 static int cgroup_iter_seq_init(void *priv, struct bpf_iter_aux_info *aux)
 {
+8 -12
kernel/bpf/cpumap.c
···
 {
 	u32 value_size = attr->value_size;
 	struct bpf_cpu_map *cmap;
-	int err = -ENOMEM;

 	if (!bpf_capable())
 		return ERR_PTR(-EPERM);
···
 	    attr->map_flags & ~BPF_F_NUMA_NODE)
 		return ERR_PTR(-EINVAL);

+	/* Pre-limit array size based on NR_CPUS, not final CPU check */
+	if (attr->max_entries > NR_CPUS)
+		return ERR_PTR(-E2BIG);
+
 	cmap = bpf_map_area_alloc(sizeof(*cmap), NUMA_NO_NODE);
 	if (!cmap)
 		return ERR_PTR(-ENOMEM);

 	bpf_map_init_from_attr(&cmap->map, attr);

-	/* Pre-limit array size based on NR_CPUS, not final CPU check */
-	if (cmap->map.max_entries > NR_CPUS) {
-		err = -E2BIG;
-		goto free_cmap;
-	}
-
 	/* Alloc array for possible remote "destination" CPUs */
 	cmap->cpu_map = bpf_map_area_alloc(cmap->map.max_entries *
 					   sizeof(struct bpf_cpu_map_entry *),
 					   cmap->map.numa_node);
-	if (!cmap->cpu_map)
-		goto free_cmap;
+	if (!cmap->cpu_map) {
+		bpf_map_area_free(cmap);
+		return ERR_PTR(-ENOMEM);
+	}

 	return &cmap->map;
-free_cmap:
-	bpf_map_area_free(cmap);
-	return ERR_PTR(err);
 }

 static void get_cpu_map_entry(struct bpf_cpu_map_entry *rcpu)
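Reviewer note: the cpumap change moves the max_entries bound check ahead of the first allocation, which also lets the function drop its goto-based cleanup label. A minimal userspace sketch of the same ordering follows. It is an illustration only: MAX_SLOTS is a hypothetical stand-in for NR_CPUS, and struct table, table_create and table_destroy are invented names that use calloc/free in place of the kernel allocators.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

#define MAX_SLOTS 64 /* hypothetical stand-in for NR_CPUS */

struct table {
	size_t max_entries;
	void **slots;
};

/* Returns 0 and sets *out on success, or a negative errno with *out NULL. */
static int table_create(size_t max_entries, struct table **out)
{
	struct table *t;

	assert(out != NULL);
	*out = NULL;

	/* Validate every attribute first: no memory is held yet, so
	 * these error paths need no cleanup at all.
	 */
	if (max_entries == 0)
		return -EINVAL;
	if (max_entries > MAX_SLOTS)
		return -E2BIG;

	t = calloc(1, sizeof(*t));
	if (!t)
		return -ENOMEM;

	t->slots = calloc(max_entries, sizeof(*t->slots));
	if (!t->slots) {
		/* Only one failure point remains after the first allocation. */
		free(t);
		return -ENOMEM;
	}

	t->max_entries = max_entries;
	*out = t;
	return 0;
}

static void table_destroy(struct table *t)
{
	if (!t)
		return;
	free(t->slots);
	free(t);
}
```

With the checks in front, an oversized request such as table_create(MAX_SLOTS + 1, &t) fails with -E2BIG before anything has been allocated.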
+6
kernel/bpf/helpers.c
···
 		return &bpf_dynptr_write_proto;
 	case BPF_FUNC_dynptr_data:
 		return &bpf_dynptr_data_proto;
+#ifdef CONFIG_CGROUPS
+	case BPF_FUNC_cgrp_storage_get:
+		return &bpf_cgrp_storage_get_proto;
+	case BPF_FUNC_cgrp_storage_delete:
+		return &bpf_cgrp_storage_delete_proto;
+#endif
 	default:
 		break;
 	}
+7 -5
kernel/bpf/syscall.c
···
 		    map->map_type != BPF_MAP_TYPE_CGROUP_STORAGE &&
 		    map->map_type != BPF_MAP_TYPE_SK_STORAGE &&
 		    map->map_type != BPF_MAP_TYPE_INODE_STORAGE &&
-		    map->map_type != BPF_MAP_TYPE_TASK_STORAGE)
+		    map->map_type != BPF_MAP_TYPE_TASK_STORAGE &&
+		    map->map_type != BPF_MAP_TYPE_CGRP_STORAGE)
 			return -ENOTSUPP;
 		if (map->spin_lock_off + sizeof(struct bpf_spin_lock) >
 		    map->value_size) {
···

 		st = per_cpu_ptr(prog->stats, cpu);
 		do {
-			start = u64_stats_fetch_begin_irq(&st->syncp);
+			start = u64_stats_fetch_begin(&st->syncp);
 			tnsecs = u64_stats_read(&st->nsecs);
 			tcnt = u64_stats_read(&st->cnt);
 			tmisses = u64_stats_read(&st->misses);
-		} while (u64_stats_fetch_retry_irq(&st->syncp, start));
+		} while (u64_stats_fetch_retry(&st->syncp, start));
 		nsecs += tnsecs;
 		cnt += tcnt;
 		misses += tmisses;
···

 		run_ctx.bpf_cookie = 0;
 		run_ctx.saved_run_ctx = NULL;
-		if (!__bpf_prog_enter_sleepable(prog, &run_ctx)) {
+		if (!__bpf_prog_enter_sleepable_recur(prog, &run_ctx)) {
 			/* recursion detected */
 			bpf_prog_put(prog);
 			return -EBUSY;
 		}
 		attr->test.retval = bpf_prog_run(prog, (void *) (long) attr->test.ctx_in);
-		__bpf_prog_exit_sleepable(prog, 0 /* bpf_prog_run does runtime stats */, &run_ctx);
+		__bpf_prog_exit_sleepable_recur(prog, 0 /* bpf_prog_run does runtime stats */,
+						&run_ctx);
 		bpf_prog_put(prog);
 		return 0;
 #endif
+67 -13
kernel/bpf/trampoline.c
··· 864 864 * [2..MAX_U64] - execute bpf prog and record execution time. 865 865 * This is start time. 866 866 */ 867 - u64 notrace __bpf_prog_enter(struct bpf_prog *prog, struct bpf_tramp_run_ctx *run_ctx) 867 + static u64 notrace __bpf_prog_enter_recur(struct bpf_prog *prog, struct bpf_tramp_run_ctx *run_ctx) 868 868 __acquires(RCU) 869 869 { 870 870 rcu_read_lock(); ··· 901 901 } 902 902 } 903 903 904 - void notrace __bpf_prog_exit(struct bpf_prog *prog, u64 start, struct bpf_tramp_run_ctx *run_ctx) 904 + static void notrace __bpf_prog_exit_recur(struct bpf_prog *prog, u64 start, 905 + struct bpf_tramp_run_ctx *run_ctx) 905 906 __releases(RCU) 906 907 { 907 908 bpf_reset_run_ctx(run_ctx->saved_run_ctx); ··· 913 912 rcu_read_unlock(); 914 913 } 915 914 916 - u64 notrace __bpf_prog_enter_lsm_cgroup(struct bpf_prog *prog, 917 - struct bpf_tramp_run_ctx *run_ctx) 915 + static u64 notrace __bpf_prog_enter_lsm_cgroup(struct bpf_prog *prog, 916 + struct bpf_tramp_run_ctx *run_ctx) 918 917 __acquires(RCU) 919 918 { 920 919 /* Runtime stats are exported via actual BPF_LSM_CGROUP ··· 928 927 return NO_START_TIME; 929 928 } 930 929 931 - void notrace __bpf_prog_exit_lsm_cgroup(struct bpf_prog *prog, u64 start, 932 - struct bpf_tramp_run_ctx *run_ctx) 930 + static void notrace __bpf_prog_exit_lsm_cgroup(struct bpf_prog *prog, u64 start, 931 + struct bpf_tramp_run_ctx *run_ctx) 933 932 __releases(RCU) 934 933 { 935 934 bpf_reset_run_ctx(run_ctx->saved_run_ctx); ··· 938 937 rcu_read_unlock(); 939 938 } 940 939 941 - u64 notrace __bpf_prog_enter_sleepable(struct bpf_prog *prog, struct bpf_tramp_run_ctx *run_ctx) 940 + u64 notrace __bpf_prog_enter_sleepable_recur(struct bpf_prog *prog, 941 + struct bpf_tramp_run_ctx *run_ctx) 942 942 { 943 943 rcu_read_lock_trace(); 944 944 migrate_disable(); ··· 955 953 return bpf_prog_start_time(); 956 954 } 957 955 958 - void notrace __bpf_prog_exit_sleepable(struct bpf_prog *prog, u64 start, 959 - struct bpf_tramp_run_ctx *run_ctx) 956 + 
void notrace __bpf_prog_exit_sleepable_recur(struct bpf_prog *prog, u64 start, 957 + struct bpf_tramp_run_ctx *run_ctx) 960 958 { 961 959 bpf_reset_run_ctx(run_ctx->saved_run_ctx); 962 960 ··· 966 964 rcu_read_unlock_trace(); 967 965 } 968 966 969 - u64 notrace __bpf_prog_enter_struct_ops(struct bpf_prog *prog, 970 - struct bpf_tramp_run_ctx *run_ctx) 967 + static u64 notrace __bpf_prog_enter_sleepable(struct bpf_prog *prog, 968 + struct bpf_tramp_run_ctx *run_ctx) 969 + { 970 + rcu_read_lock_trace(); 971 + migrate_disable(); 972 + might_fault(); 973 + 974 + run_ctx->saved_run_ctx = bpf_set_run_ctx(&run_ctx->run_ctx); 975 + 976 + return bpf_prog_start_time(); 977 + } 978 + 979 + static void notrace __bpf_prog_exit_sleepable(struct bpf_prog *prog, u64 start, 980 + struct bpf_tramp_run_ctx *run_ctx) 981 + { 982 + bpf_reset_run_ctx(run_ctx->saved_run_ctx); 983 + 984 + update_prog_stats(prog, start); 985 + migrate_enable(); 986 + rcu_read_unlock_trace(); 987 + } 988 + 989 + static u64 notrace __bpf_prog_enter(struct bpf_prog *prog, 990 + struct bpf_tramp_run_ctx *run_ctx) 971 991 __acquires(RCU) 972 992 { 973 993 rcu_read_lock(); ··· 1000 976 return bpf_prog_start_time(); 1001 977 } 1002 978 1003 - void notrace __bpf_prog_exit_struct_ops(struct bpf_prog *prog, u64 start, 1004 - struct bpf_tramp_run_ctx *run_ctx) 979 + static void notrace __bpf_prog_exit(struct bpf_prog *prog, u64 start, 980 + struct bpf_tramp_run_ctx *run_ctx) 1005 981 __releases(RCU) 1006 982 { 1007 983 bpf_reset_run_ctx(run_ctx->saved_run_ctx); ··· 1019 995 void notrace __bpf_tramp_exit(struct bpf_tramp_image *tr) 1020 996 { 1021 997 percpu_ref_put(&tr->pcref); 998 + } 999 + 1000 + bpf_trampoline_enter_t bpf_trampoline_enter(const struct bpf_prog *prog) 1001 + { 1002 + bool sleepable = prog->aux->sleepable; 1003 + 1004 + if (bpf_prog_check_recur(prog)) 1005 + return sleepable ? 
__bpf_prog_enter_sleepable_recur : 1006 + __bpf_prog_enter_recur; 1007 + 1008 + if (resolve_prog_type(prog) == BPF_PROG_TYPE_LSM && 1009 + prog->expected_attach_type == BPF_LSM_CGROUP) 1010 + return __bpf_prog_enter_lsm_cgroup; 1011 + 1012 + return sleepable ? __bpf_prog_enter_sleepable : __bpf_prog_enter; 1013 + } 1014 + 1015 + bpf_trampoline_exit_t bpf_trampoline_exit(const struct bpf_prog *prog) 1016 + { 1017 + bool sleepable = prog->aux->sleepable; 1018 + 1019 + if (bpf_prog_check_recur(prog)) 1020 + return sleepable ? __bpf_prog_exit_sleepable_recur : 1021 + __bpf_prog_exit_recur; 1022 + 1023 + if (resolve_prog_type(prog) == BPF_PROG_TYPE_LSM && 1024 + prog->expected_attach_type == BPF_LSM_CGROUP) 1025 + return __bpf_prog_exit_lsm_cgroup; 1026 + 1027 + return sleepable ? __bpf_prog_exit_sleepable : __bpf_prog_exit; 1022 1028 } 1023 1029 1024 1030 int __weak
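Note on the hunk above: the decision order in bpf_trampoline_enter() and bpf_trampoline_exit() can be tried outside the kernel. The following is a minimal user-space sketch, not kernel code. `struct prog_sketch`, `enum enter_kind` and `pick_enter()` are hypothetical names, and the three booleans merely stand in for `bpf_prog_check_recur(prog)`, the `BPF_LSM_CGROUP` attach-type test and `prog->aux->sleepable`. It returns a tag instead of a function pointer to keep the example small.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct bpf_prog: only the three properties
 * inspected by the selector in the diff are modelled.
 */
struct prog_sketch {
	bool needs_recursion_check;	/* models bpf_prog_check_recur(prog) */
	bool is_lsm_cgroup;		/* models the BPF_LSM_CGROUP attach check */
	bool sleepable;			/* models prog->aux->sleepable */
};

/* One tag per enter handler chosen in the diff. */
enum enter_kind {
	ENTER_PLAIN,
	ENTER_SLEEPABLE,
	ENTER_RECUR,
	ENTER_SLEEPABLE_RECUR,
	ENTER_LSM_CGROUP,
};

/* Same order as the diff: the recursion check wins, then the LSM cgroup
 * case, and only then the plain sleepable / non-sleepable pair.
 */
enum enter_kind pick_enter(const struct prog_sketch *p)
{
	if (p->needs_recursion_check)
		return p->sleepable ? ENTER_SLEEPABLE_RECUR : ENTER_RECUR;

	if (p->is_lsm_cgroup)
		return ENTER_LSM_CGROUP;

	return p->sleepable ? ENTER_SLEEPABLE : ENTER_PLAIN;
}
```

Because the checks are ordered, a program that needs the recursion check never reaches the LSM cgroup branch in this sketch, which mirrors the early return in the hunk.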
+15 -14
kernel/bpf/verifier.c
··· 5634 5634 u32 *btf_id; 5635 5635 }; 5636 5636 5637 - static const struct bpf_reg_types map_key_value_types = { 5638 - .types = { 5639 - PTR_TO_STACK, 5640 - PTR_TO_PACKET, 5641 - PTR_TO_PACKET_META, 5642 - PTR_TO_MAP_KEY, 5643 - PTR_TO_MAP_VALUE, 5644 - }, 5645 - }; 5646 - 5647 5637 static const struct bpf_reg_types sock_types = { 5648 5638 .types = { 5649 5639 PTR_TO_SOCK_COMMON, ··· 5700 5710 }; 5701 5711 5702 5712 static const struct bpf_reg_types *compatible_reg_types[__BPF_ARG_TYPE_MAX] = { 5703 - [ARG_PTR_TO_MAP_KEY] = &map_key_value_types, 5704 - [ARG_PTR_TO_MAP_VALUE] = &map_key_value_types, 5713 + [ARG_PTR_TO_MAP_KEY] = &mem_types, 5714 + [ARG_PTR_TO_MAP_VALUE] = &mem_types, 5705 5715 [ARG_CONST_SIZE] = &scalar_types, 5706 5716 [ARG_CONST_SIZE_OR_ZERO] = &scalar_types, 5707 5717 [ARG_CONST_ALLOC_SIZE_OR_ZERO] = &scalar_types, ··· 6350 6360 func_id != BPF_FUNC_task_storage_delete) 6351 6361 goto error; 6352 6362 break; 6363 + case BPF_MAP_TYPE_CGRP_STORAGE: 6364 + if (func_id != BPF_FUNC_cgrp_storage_get && 6365 + func_id != BPF_FUNC_cgrp_storage_delete) 6366 + goto error; 6367 + break; 6353 6368 case BPF_MAP_TYPE_BLOOM_FILTER: 6354 6369 if (func_id != BPF_FUNC_map_peek_elem && 6355 6370 func_id != BPF_FUNC_map_push_elem) ··· 6465 6470 case BPF_FUNC_task_storage_get: 6466 6471 case BPF_FUNC_task_storage_delete: 6467 6472 if (map->map_type != BPF_MAP_TYPE_TASK_STORAGE) 6473 + goto error; 6474 + break; 6475 + case BPF_FUNC_cgrp_storage_get: 6476 + case BPF_FUNC_cgrp_storage_delete: 6477 + if (map->map_type != BPF_MAP_TYPE_CGRP_STORAGE) 6468 6478 goto error; 6469 6479 break; 6470 6480 default: ··· 10671 10671 * 3 let S be a stack 10672 10672 * 4 S.push(v) 10673 10673 * 5 while S is not empty 10674 - * 6 t <- S.pop() 10674 + * 6 t <- S.peek() 10675 10675 * 7 if t is what we're looking for: 10676 10676 * 8 return t 10677 10677 * 9 for all edges e in G.adjacentEdges(t) do ··· 14150 14150 14151 14151 if (insn->imm == BPF_FUNC_task_storage_get || 14152 14152 
insn->imm == BPF_FUNC_sk_storage_get || 14153 - insn->imm == BPF_FUNC_inode_storage_get) { 14153 + insn->imm == BPF_FUNC_inode_storage_get || 14154 + insn->imm == BPF_FUNC_cgrp_storage_get) { 14154 14155 if (env->prog->aux->sleepable) 14155 14156 insn_buf[0] = BPF_MOV64_IMM(BPF_REG_5, (__force __s32)GFP_KERNEL); 14156 14157 else
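Note on the pop() to peek() comment fix in the hunk above: a non-recursive DFS that needs to recognise back-edges has to keep a node on the stack until all of its edges are handled, which is presumably why the pseudocode was corrected. The following is a minimal sketch of that idea under stated assumptions: the adjacency-matrix representation, `MAX_NODES` and `has_back_edge()` are illustrative names and are not taken from the verifier.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_NODES 8

enum { UNSEEN = 0, DISCOVERED, EXPLORED };

/* Returns true if a back-edge (a cycle) is reachable from node 0.
 * adj[u][v] != 0 means there is a directed edge u -> v.
 */
bool has_back_edge(int n, unsigned char adj[MAX_NODES][MAX_NODES])
{
	int state[MAX_NODES] = { 0 };
	int stack[MAX_NODES];
	int top = 0;

	if (n <= 0 || n > MAX_NODES)
		return false;

	stack[top++] = 0;
	state[0] = DISCOVERED;

	while (top > 0) {
		int t = stack[top - 1];	/* peek: t stays on the stack */
		bool pushed = false;

		for (int w = 0; w < n; w++) {
			if (!adj[t][w])
				continue;
			if (state[w] == DISCOVERED)
				return true;	/* w is an ancestor still on the stack */
			if (state[w] == UNSEEN) {
				state[w] = DISCOVERED;
				stack[top++] = w;
				pushed = true;
				break;	/* descend into w before other edges */
			}
		}

		if (!pushed) {
			state[t] = EXPLORED;
			top--;	/* pop only after every edge of t was handled */
		}
	}

	return false;
}
```

With a pop at the top of the loop, a node would leave the stack before its children were visited, so "DISCOVERED" would no longer mean "currently on the DFS path" and the back-edge test above would stop being reliable.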
+1
kernel/cgroup/cgroup.c
··· 5349 5349 atomic_dec(&cgrp->root->nr_cgrps); 5350 5350 cgroup1_pidlist_destroy_all(cgrp); 5351 5351 cancel_work_sync(&cgrp->release_agent_work); 5352 + bpf_cgrp_storage_free(cgrp); 5352 5353 5353 5354 if (cgroup_parent(cgrp)) { 5354 5355 /*
-2
kernel/module/kallsyms.c
··· 494 494 return ret; 495 495 } 496 496 497 - #ifdef CONFIG_LIVEPATCH 498 497 int module_kallsyms_on_each_symbol(int (*fn)(void *, const char *, 499 498 struct module *, unsigned long), 500 499 void *data) ··· 530 531 mutex_unlock(&module_mutex); 531 532 return ret; 532 533 } 533 - #endif /* CONFIG_LIVEPATCH */
+104 -3
kernel/trace/bpf_trace.c
··· 6 6 #include <linux/types.h> 7 7 #include <linux/slab.h> 8 8 #include <linux/bpf.h> 9 + #include <linux/bpf_verifier.h> 9 10 #include <linux/bpf_perf_event.h> 10 11 #include <linux/btf.h> 11 12 #include <linux/filter.h> ··· 1457 1456 return &bpf_get_current_cgroup_id_proto; 1458 1457 case BPF_FUNC_get_current_ancestor_cgroup_id: 1459 1458 return &bpf_get_current_ancestor_cgroup_id_proto; 1459 + case BPF_FUNC_cgrp_storage_get: 1460 + return &bpf_cgrp_storage_get_proto; 1461 + case BPF_FUNC_cgrp_storage_delete: 1462 + return &bpf_cgrp_storage_delete_proto; 1460 1463 #endif 1461 1464 case BPF_FUNC_send_signal: 1462 1465 return &bpf_send_signal_proto; ··· 1495 1490 case BPF_FUNC_this_cpu_ptr: 1496 1491 return &bpf_this_cpu_ptr_proto; 1497 1492 case BPF_FUNC_task_storage_get: 1493 + if (bpf_prog_check_recur(prog)) 1494 + return &bpf_task_storage_get_recur_proto; 1498 1495 return &bpf_task_storage_get_proto; 1499 1496 case BPF_FUNC_task_storage_delete: 1497 + if (bpf_prog_check_recur(prog)) 1498 + return &bpf_task_storage_delete_recur_proto; 1500 1499 return &bpf_task_storage_delete_proto; 1501 1500 case BPF_FUNC_for_each_map_elem: 1502 1501 return &bpf_for_each_map_elem_proto; ··· 2461 2452 unsigned long *addrs; 2462 2453 u64 *cookies; 2463 2454 u32 cnt; 2455 + u32 mods_cnt; 2456 + struct module **mods; 2464 2457 }; 2465 2458 2466 2459 struct bpf_kprobe_multi_run_ctx { ··· 2518 2507 return err; 2519 2508 } 2520 2509 2510 + static void kprobe_multi_put_modules(struct module **mods, u32 cnt) 2511 + { 2512 + u32 i; 2513 + 2514 + for (i = 0; i < cnt; i++) 2515 + module_put(mods[i]); 2516 + } 2517 + 2521 2518 static void free_user_syms(struct user_syms *us) 2522 2519 { 2523 2520 kvfree(us->syms); ··· 2538 2519 2539 2520 kmulti_link = container_of(link, struct bpf_kprobe_multi_link, link); 2540 2521 unregister_fprobe(&kmulti_link->fp); 2522 + kprobe_multi_put_modules(kmulti_link->mods, kmulti_link->mods_cnt); 2541 2523 } 2542 2524 2543 2525 static void 
bpf_kprobe_multi_link_dealloc(struct bpf_link *link) ··· 2548 2528 kmulti_link = container_of(link, struct bpf_kprobe_multi_link, link); 2549 2529 kvfree(kmulti_link->addrs); 2550 2530 kvfree(kmulti_link->cookies); 2531 + kfree(kmulti_link->mods); 2551 2532 kfree(kmulti_link); 2552 2533 } 2553 2534 ··· 2571 2550 swap(*cookie_a, *cookie_b); 2572 2551 } 2573 2552 2574 - static int __bpf_kprobe_multi_cookie_cmp(const void *a, const void *b) 2553 + static int bpf_kprobe_multi_addrs_cmp(const void *a, const void *b) 2575 2554 { 2576 2555 const unsigned long *addr_a = a, *addr_b = b; 2577 2556 ··· 2582 2561 2583 2562 static int bpf_kprobe_multi_cookie_cmp(const void *a, const void *b, const void *priv) 2584 2563 { 2585 - return __bpf_kprobe_multi_cookie_cmp(a, b); 2564 + return bpf_kprobe_multi_addrs_cmp(a, b); 2586 2565 } 2587 2566 2588 2567 static u64 bpf_kprobe_multi_cookie(struct bpf_run_ctx *ctx) ··· 2600 2579 return 0; 2601 2580 entry_ip = run_ctx->entry_ip; 2602 2581 addr = bsearch(&entry_ip, link->addrs, link->cnt, sizeof(entry_ip), 2603 - __bpf_kprobe_multi_cookie_cmp); 2582 + bpf_kprobe_multi_addrs_cmp); 2604 2583 if (!addr) 2605 2584 return 0; 2606 2585 cookie = link->cookies + (addr - link->addrs); ··· 2682 2661 cookie_b = data->cookies + (name_b - data->funcs); 2683 2662 swap(*cookie_a, *cookie_b); 2684 2663 } 2664 + } 2665 + 2666 + struct module_addr_args { 2667 + unsigned long *addrs; 2668 + u32 addrs_cnt; 2669 + struct module **mods; 2670 + int mods_cnt; 2671 + int mods_cap; 2672 + }; 2673 + 2674 + static int module_callback(void *data, const char *name, 2675 + struct module *mod, unsigned long addr) 2676 + { 2677 + struct module_addr_args *args = data; 2678 + struct module **mods; 2679 + 2680 + /* We iterate all modules symbols and for each we: 2681 + * - search for it in provided addresses array 2682 + * - if found we check if we already have the module pointer stored 2683 + * (we iterate modules sequentially, so we can check just the last 2684 + * 
module pointer) 2685 + * - take module reference and store it 2686 + */ 2687 + if (!bsearch(&addr, args->addrs, args->addrs_cnt, sizeof(addr), 2688 + bpf_kprobe_multi_addrs_cmp)) 2689 + return 0; 2690 + 2691 + if (args->mods && args->mods[args->mods_cnt - 1] == mod) 2692 + return 0; 2693 + 2694 + if (args->mods_cnt == args->mods_cap) { 2695 + args->mods_cap = max(16, args->mods_cap * 3 / 2); 2696 + mods = krealloc_array(args->mods, args->mods_cap, sizeof(*mods), GFP_KERNEL); 2697 + if (!mods) 2698 + return -ENOMEM; 2699 + args->mods = mods; 2700 + } 2701 + 2702 + if (!try_module_get(mod)) 2703 + return -EINVAL; 2704 + 2705 + args->mods[args->mods_cnt] = mod; 2706 + args->mods_cnt++; 2707 + return 0; 2708 + } 2709 + 2710 + static int get_modules_for_addrs(struct module ***mods, unsigned long *addrs, u32 addrs_cnt) 2711 + { 2712 + struct module_addr_args args = { 2713 + .addrs = addrs, 2714 + .addrs_cnt = addrs_cnt, 2715 + }; 2716 + int err; 2717 + 2718 + /* We return either err < 0 in case of error, ... */ 2719 + err = module_kallsyms_on_each_symbol(module_callback, &args); 2720 + if (err) { 2721 + kprobe_multi_put_modules(args.mods, args.mods_cnt); 2722 + kfree(args.mods); 2723 + return err; 2724 + } 2725 + 2726 + /* or number of modules found if everything is ok. */ 2727 + *mods = args.mods; 2728 + return args.mods_cnt; 2685 2729 } 2686 2730 2687 2731 int bpf_kprobe_multi_link_attach(const union bpf_attr *attr, struct bpf_prog *prog) ··· 2859 2773 bpf_kprobe_multi_cookie_cmp, 2860 2774 bpf_kprobe_multi_cookie_swap, 2861 2775 link); 2776 + } else { 2777 + /* 2778 + * We need to sort addrs array even if there are no cookies 2779 + * provided, to allow bsearch in get_modules_for_addrs. 
2780 + */ 2781 + sort(addrs, cnt, sizeof(*addrs), 2782 + bpf_kprobe_multi_addrs_cmp, NULL); 2862 2783 } 2784 + 2785 + err = get_modules_for_addrs(&link->mods, addrs, cnt); 2786 + if (err < 0) { 2787 + bpf_link_cleanup(&link_primer); 2788 + return err; 2789 + } 2790 + link->mods_cnt = err; 2863 2791 2864 2792 err = register_fprobe_ips(&link->fp, addrs, cnt); 2865 2793 if (err) { 2794 + kprobe_multi_put_modules(link->mods, link->mods_cnt); 2866 2795 bpf_link_cleanup(&link_primer); 2867 2796 return err; 2868 2797 }
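Note on module_callback() in the hunk above: it combines two small techniques, skipping a module that equals the most recently stored one, and growing the array by a factor of 3/2 with a floor of 16 entries. The following is a user-space sketch of both, assuming `realloc()` in place of `krealloc_array()`; `struct owner_list` and `owner_list_add()` are hypothetical names. Unlike the kernel code, the sketch updates the capacity only after a successful allocation and does no reference counting.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical user-space analogue of struct module_addr_args. */
struct owner_list {
	const void **items;
	int cnt;
	int cap;
};

/* Append owner unless it equals the most recently stored entry.
 * Returns 0 on success (including the duplicate case) and -1 if the
 * allocation fails.
 */
int owner_list_add(struct owner_list *l, const void *owner)
{
	/* Owners arrive grouped, so comparing with the last entry is enough. */
	if (l->cnt > 0 && l->items[l->cnt - 1] == owner)
		return 0;

	if (l->cnt == l->cap) {
		int new_cap = l->cap * 3 / 2;
		const void **items;

		if (new_cap < 16)
			new_cap = 16;

		items = realloc(l->items, (size_t)new_cap * sizeof(*items));
		if (!items)
			return -1;

		l->items = items;
		l->cap = new_cap;
	}

	l->items[l->cnt++] = owner;
	return 0;
}
```

The last-entry comparison is only valid because symbols are visited module by module, as the comment in the hunk points out; with an arbitrary visiting order a full search of the stored entries would be needed.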
+11 -5
kernel/trace/ftrace.c
··· 8267 8267 size_t found; 8268 8268 }; 8269 8269 8270 + /* This function gets called for all kernel and module symbols 8271 + * and returns 1 in case we resolved all the requested symbols, 8272 + * 0 otherwise. 8273 + */ 8270 8274 static int kallsyms_callback(void *data, const char *name, 8271 8275 struct module *mod, unsigned long addr) 8272 8276 { ··· 8313 8309 int ftrace_lookup_symbols(const char **sorted_syms, size_t cnt, unsigned long *addrs) 8314 8310 { 8315 8311 struct kallsyms_data args; 8316 - int err; 8312 + int found_all; 8317 8313 8318 8314 memset(addrs, 0, sizeof(*addrs) * cnt); 8319 8315 args.addrs = addrs; 8320 8316 args.syms = sorted_syms; 8321 8317 args.cnt = cnt; 8322 8318 args.found = 0; 8323 - err = kallsyms_on_each_symbol(kallsyms_callback, &args); 8324 - if (err < 0) 8325 - return err; 8326 - return args.found == args.cnt ? 0 : -ESRCH; 8319 + 8320 + found_all = kallsyms_on_each_symbol(kallsyms_callback, &args); 8321 + if (found_all) 8322 + return 0; 8323 + found_all = module_kallsyms_on_each_symbol(kallsyms_callback, &args); 8324 + return found_all ? 0 : -ESRCH; 8327 8325 } 8328 8326 8329 8327 #ifdef CONFIG_SYSCTL
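Note on the ftrace_lookup_symbols() change above: the callback now reports "all requested symbols resolved" through its return value, so the core symbol walk can stop early and the module walk only runs when something is still missing. The following is a simplified sketch of that two-stage flow. All names (`struct sym`, `struct lookup_state`, `on_symbol()`, `walk()`, `lookup_all()`) are hypothetical, the tables are plain arrays, the match is a linear scan rather than the bsearch used in the kernel, and -1 stands in for -ESRCH.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct sym {
	const char *name;
	unsigned long addr;
};

struct lookup_state {
	const char **wanted;	/* names to resolve */
	unsigned long *addrs;	/* output, one slot per wanted name, 0 = unresolved */
	size_t cnt;
	size_t found;
};

/* Returns 1 once every wanted name has an address, 0 otherwise. */
int on_symbol(struct lookup_state *s, const char *name, unsigned long addr)
{
	for (size_t i = 0; i < s->cnt; i++) {
		if (s->addrs[i] == 0 && strcmp(s->wanted[i], name) == 0) {
			s->addrs[i] = addr;
			s->found++;
			break;
		}
	}
	return s->found == s->cnt;
}

/* Walk one symbol table and stop as soon as the callback returns non-zero. */
int walk(const struct sym *tab, size_t n, struct lookup_state *s)
{
	for (size_t i = 0; i < n; i++) {
		int ret = on_symbol(s, tab[i].name, tab[i].addr);

		if (ret)
			return ret;
	}
	return 0;
}

/* Try the core table first and fall back to the module table. */
int lookup_all(const struct sym *core, size_t ncore,
	       const struct sym *mods, size_t nmods,
	       struct lookup_state *s)
{
	if (walk(core, ncore, s))
		return 0;
	return walk(mods, nmods, s) ? 0 : -1;
}
```

The state is shared between both walks, so symbols resolved from the core table still count when the module table completes the set.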
+3 -32
net/core/bpf_sk_storage.c
··· 48 48 /* Called by __sk_destruct() & bpf_sk_storage_clone() */ 49 49 void bpf_sk_storage_free(struct sock *sk) 50 50 { 51 - struct bpf_local_storage_elem *selem; 52 51 struct bpf_local_storage *sk_storage; 53 52 bool free_sk_storage = false; 54 - struct hlist_node *n; 55 53 56 54 rcu_read_lock(); 57 55 sk_storage = rcu_dereference(sk->sk_bpf_storage); ··· 58 60 return; 59 61 } 60 62 61 - /* Netiher the bpf_prog nor the bpf-map's syscall 62 - * could be modifying the sk_storage->list now. 63 - * Thus, no elem can be added-to or deleted-from the 64 - * sk_storage->list by the bpf_prog or by the bpf-map's syscall. 65 - * 66 - * It is racing with bpf_local_storage_map_free() alone 67 - * when unlinking elem from the sk_storage->list and 68 - * the map's bucket->list. 69 - */ 70 63 raw_spin_lock_bh(&sk_storage->lock); 71 - hlist_for_each_entry_safe(selem, n, &sk_storage->list, snode) { 72 - /* Always unlink from map before unlinking from 73 - * sk_storage. 74 - */ 75 - bpf_selem_unlink_map(selem); 76 - free_sk_storage = bpf_selem_unlink_storage_nolock( 77 - sk_storage, selem, true, false); 78 - } 64 + free_sk_storage = bpf_local_storage_unlink_nolock(sk_storage); 79 65 raw_spin_unlock_bh(&sk_storage->lock); 80 66 rcu_read_unlock(); 81 67 ··· 69 87 70 88 static void bpf_sk_storage_map_free(struct bpf_map *map) 71 89 { 72 - struct bpf_local_storage_map *smap; 73 - 74 - smap = (struct bpf_local_storage_map *)map; 75 - bpf_local_storage_cache_idx_free(&sk_cache, smap->cache_idx); 76 - bpf_local_storage_map_free(smap, NULL); 90 + bpf_local_storage_map_free(map, &sk_cache, NULL); 77 91 } 78 92 79 93 static struct bpf_map *bpf_sk_storage_map_alloc(union bpf_attr *attr) 80 94 { 81 - struct bpf_local_storage_map *smap; 82 - 83 - smap = bpf_local_storage_map_alloc(attr); 84 - if (IS_ERR(smap)) 85 - return ERR_CAST(smap); 86 - 87 - smap->cache_idx = bpf_local_storage_cache_idx_get(&sk_cache); 88 - return &smap->map; 95 + return bpf_local_storage_map_alloc(attr, &sk_cache); 89 
96 } 90 97 91 98 static int notsupp_get_next_key(struct bpf_map *map, void *key,
+3 -3
samples/bpf/README.rst
··· 37 37 38 38 make headers_install 39 39 40 - This will creates a local "usr/include" directory in the git/build top 41 - level directory, that the make system automatically pickup first. 40 + This will create a local "usr/include" directory in the git/build top 41 + level directory, that the make system will automatically pick up first. 42 42 43 43 Compiling 44 44 ========= ··· 87 87 ----------------------- 88 88 In order to cross-compile, say for arm64 targets, export CROSS_COMPILE and ARCH 89 89 environment variables before calling make. But do this before clean, 90 - cofiguration and header install steps described above. This will direct make to 90 + configuration and header install steps described above. This will direct make to 91 91 build samples for the cross target:: 92 92 93 93 export ARCH=arm64
+1 -1
samples/bpf/hbm_edt_kern.c
··· 35 35 * 36 36 * If the credit is below the drop threshold, the packet is dropped. If it 37 37 * is a TCP packet, then it also calls tcp_cwr since packets dropped by 38 - * by a cgroup skb BPF program do not automatically trigger a call to 38 + * a cgroup skb BPF program do not automatically trigger a call to 39 39 * tcp_cwr in the current kernel code. 40 40 * 41 41 * This BPF program actually uses 2 drop thresholds, one threshold
+1 -1
samples/bpf/xdp1_user.c
··· 51 51 52 52 sleep(interval); 53 53 54 - while (bpf_map_get_next_key(map_fd, &key, &key) != -1) { 54 + while (bpf_map_get_next_key(map_fd, &key, &key) == 0) { 55 55 __u64 sum = 0; 56 56 57 57 assert(bpf_map_lookup_elem(map_fd, &key, values) == 0);
+4
samples/bpf/xdp2_kern.c
··· 112 112 113 113 if (ipproto == IPPROTO_UDP) { 114 114 swap_src_dst_mac(data); 115 + 116 + if (bpf_xdp_store_bytes(ctx, 0, pkt, sizeof(pkt))) 117 + return rc; 118 + 115 119 rc = XDP_TX; 116 120 } 117 121
+2
scripts/bpf_doc.py
··· 685 685 'struct udp6_sock', 686 686 'struct unix_sock', 687 687 'struct task_struct', 688 + 'struct cgroup', 688 689 689 690 'struct __sk_buff', 690 691 'struct sk_msg_md', ··· 743 742 'struct udp6_sock', 744 743 'struct unix_sock', 745 744 'struct task_struct', 745 + 'struct cgroup', 746 746 'struct path', 747 747 'struct btf_ptr', 748 748 'struct inode',
+1 -1
tools/bpf/bpftool/Documentation/bpftool-map.rst
··· 55 55 | | **devmap** | **devmap_hash** | **sockmap** | **cpumap** | **xskmap** | **sockhash** 56 56 | | **cgroup_storage** | **reuseport_sockarray** | **percpu_cgroup_storage** 57 57 | | **queue** | **stack** | **sk_storage** | **struct_ops** | **ringbuf** | **inode_storage** 58 - | | **task_storage** | **bloom_filter** | **user_ringbuf** } 58 + | | **task_storage** | **bloom_filter** | **user_ringbuf** | **cgrp_storage** } 59 59 60 60 DESCRIPTION 61 61 ===========
+13 -2
tools/bpf/bpftool/Documentation/bpftool-prog.rst
··· 31 31 | **bpftool** **prog dump xlated** *PROG* [{**file** *FILE* | **opcodes** | **visual** | **linum**}] 32 32 | **bpftool** **prog dump jited** *PROG* [{**file** *FILE* | **opcodes** | **linum**}] 33 33 | **bpftool** **prog pin** *PROG* *FILE* 34 - | **bpftool** **prog** { **load** | **loadall** } *OBJ* *PATH* [**type** *TYPE*] [**map** {**idx** *IDX* | **name** *NAME*} *MAP*] [**dev** *NAME*] [**pinmaps** *MAP_DIR*] 34 + | **bpftool** **prog** { **load** | **loadall** } *OBJ* *PATH* [**type** *TYPE*] [**map** {**idx** *IDX* | **name** *NAME*} *MAP*] [**dev** *NAME*] [**pinmaps** *MAP_DIR*] [**autoattach**] 35 35 | **bpftool** **prog attach** *PROG* *ATTACH_TYPE* [*MAP*] 36 36 | **bpftool** **prog detach** *PROG* *ATTACH_TYPE* [*MAP*] 37 37 | **bpftool** **prog tracelog** ··· 131 131 contain a dot character ('.'), which is reserved for future 132 132 extensions of *bpffs*. 133 133 134 - **bpftool prog { load | loadall }** *OBJ* *PATH* [**type** *TYPE*] [**map** {**idx** *IDX* | **name** *NAME*} *MAP*] [**dev** *NAME*] [**pinmaps** *MAP_DIR*] 134 + **bpftool prog { load | loadall }** *OBJ* *PATH* [**type** *TYPE*] [**map** {**idx** *IDX* | **name** *NAME*} *MAP*] [**dev** *NAME*] [**pinmaps** *MAP_DIR*] [**autoattach**] 135 135 Load bpf program(s) from binary *OBJ* and pin as *PATH*. 136 136 **bpftool prog load** pins only the first program from the 137 137 *OBJ* as *PATH*. **bpftool prog loadall** pins all programs ··· 149 149 given networking device (offload). 150 150 Optional **pinmaps** argument can be provided to pin all 151 151 maps under *MAP_DIR* directory. 152 + 153 + If **autoattach** is specified program will be attached 154 + before pin. In that case, only the link (representing the 155 + program attached to its hook) is pinned, not the program as 156 + such, so the path won't show in **bpftool prog show -f**, 157 + only show in **bpftool link show -f**. 
Also, this only works 158 + when bpftool (libbpf) is able to infer all necessary 159 + information from the object file, in particular, it's not 160 + supported for all program types. If a program does not 161 + support autoattach, bpftool falls back to regular pinning 162 + for that program instead. 152 163 153 164 Note: *PATH* must be located in *bpffs* mount. It must not 154 165 contain a dot character ('.'), which is reserved for future
+4 -4
tools/bpf/bpftool/Documentation/common_options.rst
··· 7 7 Print bpftool's version number (similar to **bpftool version**), the 8 8 number of the libbpf version in use, and optional features that were 9 9 included when bpftool was compiled. Optional features include linking 10 - against libbfd to provide the disassembler for JIT-ted programs 11 - (**bpftool prog dump jited**) and usage of BPF skeletons (some 12 - features like **bpftool prog profile** or showing pids associated to 13 - BPF objects may rely on it). 10 + against LLVM or libbfd to provide the disassembler for JIT-ted 11 + programs (**bpftool prog dump jited**) and usage of BPF skeletons 12 + (some features like **bpftool prog profile** or showing pids 13 + associated to BPF objects may rely on it). 14 14 15 15 -j, --json 16 16 Generate JSON output. For commands that cannot produce JSON, this
+48 -24
tools/bpf/bpftool/Makefile
··· 93 93 RM ?= rm -f 94 94 95 95 FEATURE_USER = .bpftool 96 - FEATURE_TESTS = libbfd libbfd-liberty libbfd-liberty-z \ 97 - disassembler-four-args disassembler-init-styled libcap \ 98 - clang-bpf-co-re 99 - FEATURE_DISPLAY = libbfd libbfd-liberty libbfd-liberty-z \ 100 - libcap clang-bpf-co-re 96 + 97 + FEATURE_TESTS := clang-bpf-co-re 98 + FEATURE_TESTS += llvm 99 + FEATURE_TESTS += libcap 100 + FEATURE_TESTS += libbfd 101 + FEATURE_TESTS += libbfd-liberty 102 + FEATURE_TESTS += libbfd-liberty-z 103 + FEATURE_TESTS += disassembler-four-args 104 + FEATURE_TESTS += disassembler-init-styled 105 + 106 + FEATURE_DISPLAY := clang-bpf-co-re 107 + FEATURE_DISPLAY += llvm 108 + FEATURE_DISPLAY += libcap 109 + FEATURE_DISPLAY += libbfd 110 + FEATURE_DISPLAY += libbfd-liberty 111 + FEATURE_DISPLAY += libbfd-liberty-z 101 112 102 113 check_feat := 1 103 114 NON_CHECK_FEAT_TARGETS := clean uninstall doc doc-clean doc-install doc-uninstall ··· 126 115 endif 127 116 endif 128 117 129 - ifeq ($(feature-disassembler-four-args), 1) 130 - CFLAGS += -DDISASM_FOUR_ARGS_SIGNATURE 131 - endif 132 - ifeq ($(feature-disassembler-init-styled), 1) 133 - CFLAGS += -DDISASM_INIT_STYLED 134 - endif 135 - 136 118 LIBS = $(LIBBPF) -lelf -lz 137 119 LIBS_BOOTSTRAP = $(LIBBPF_BOOTSTRAP) -lelf -lz 138 120 ifeq ($(feature-libcap), 1) ··· 137 133 138 134 all: $(OUTPUT)bpftool 139 135 140 - BFD_SRCS = jit_disasm.c 136 + SRCS := $(wildcard *.c) 141 137 142 - SRCS = $(filter-out $(BFD_SRCS),$(wildcard *.c)) 138 + ifeq ($(feature-llvm),1) 139 + # If LLVM is available, use it for JIT disassembly 140 + CFLAGS += -DHAVE_LLVM_SUPPORT 141 + LLVM_CONFIG_LIB_COMPONENTS := mcdisassembler all-targets 142 + CFLAGS += $(shell $(LLVM_CONFIG) --cflags --libs $(LLVM_CONFIG_LIB_COMPONENTS)) 143 + LIBS += $(shell $(LLVM_CONFIG) --libs $(LLVM_CONFIG_LIB_COMPONENTS)) 144 + LDFLAGS += $(shell $(LLVM_CONFIG) --ldflags) 145 + else 146 + # Fall back on libbfd 147 + ifeq ($(feature-libbfd),1) 148 + LIBS += -lbfd -ldl 
-lopcodes 149 + else ifeq ($(feature-libbfd-liberty),1) 150 + LIBS += -lbfd -ldl -lopcodes -liberty 151 + else ifeq ($(feature-libbfd-liberty-z),1) 152 + LIBS += -lbfd -ldl -lopcodes -liberty -lz 153 + endif 143 154 144 - ifeq ($(feature-libbfd),1) 145 - LIBS += -lbfd -ldl -lopcodes 146 - else ifeq ($(feature-libbfd-liberty),1) 147 - LIBS += -lbfd -ldl -lopcodes -liberty 148 - else ifeq ($(feature-libbfd-liberty-z),1) 149 - LIBS += -lbfd -ldl -lopcodes -liberty -lz 155 + # If one of the above feature combinations is set, we support libbfd 156 + ifneq ($(filter -lbfd,$(LIBS)),) 157 + CFLAGS += -DHAVE_LIBBFD_SUPPORT 158 + 159 + # Libbfd interface changed over time, figure out what we need 160 + ifeq ($(feature-disassembler-four-args), 1) 161 + CFLAGS += -DDISASM_FOUR_ARGS_SIGNATURE 162 + endif 163 + ifeq ($(feature-disassembler-init-styled), 1) 164 + CFLAGS += -DDISASM_INIT_STYLED 165 + endif 166 + endif 150 167 endif 151 - 152 - ifneq ($(filter -lbfd,$(LIBS)),) 153 - CFLAGS += -DHAVE_LIBBFD_SUPPORT 154 - SRCS += $(BFD_SRCS) 168 + ifeq ($(filter -DHAVE_LLVM_SUPPORT -DHAVE_LIBBFD_SUPPORT,$(CFLAGS)),) 169 + # No support for JIT disassembly 170 + SRCS := $(filter-out jit_disasm.c,$(SRCS)) 155 171 endif 156 172 157 173 HOST_CFLAGS = $(subst -I$(LIBBPF_INCLUDE),-I$(LIBBPF_BOOTSTRAP_INCLUDE),\
+1
tools/bpf/bpftool/bash-completion/bpftool
··· 505 505 _bpftool_once_attr 'type' 506 506 _bpftool_once_attr 'dev' 507 507 _bpftool_once_attr 'pinmaps' 508 + _bpftool_once_attr 'autoattach' 508 509 return 0 509 510 ;; 510 511 esac
+8 -4
tools/bpf/bpftool/common.c
··· 1 1 // SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) 2 2 /* Copyright (C) 2017-2018 Netronome Systems, Inc. */ 3 3 4 + #ifndef _GNU_SOURCE 4 5 #define _GNU_SOURCE 6 + #endif 5 7 #include <ctype.h> 6 8 #include <errno.h> 7 9 #include <fcntl.h> ··· 627 625 } 628 626 629 627 const char * 630 - ifindex_to_bfd_params(__u32 ifindex, __u64 ns_dev, __u64 ns_ino, 631 - const char **opt) 628 + ifindex_to_arch(__u32 ifindex, __u64 ns_dev, __u64 ns_ino, const char **opt) 632 629 { 630 + __maybe_unused int device_id; 633 631 char devname[IF_NAMESIZE]; 634 632 int vendor_id; 635 - int device_id; 636 633 637 634 if (!ifindex_to_name_ns(ifindex, ns_dev, ns_ino, devname)) { 638 635 p_err("Can't get net device name for ifindex %d: %s", ifindex, ··· 646 645 } 647 646 648 647 switch (vendor_id) { 648 + #ifdef HAVE_LIBBFD_SUPPORT 649 649 case 0x19ee: 650 650 device_id = read_sysfs_netdev_hex_int(devname, "device"); 651 651 if (device_id != 0x4000 && ··· 655 653 p_info("Unknown NFP device ID, assuming it is NFP-6xxx arch"); 656 654 *opt = "ctx4"; 657 655 return "NFP-6xxx"; 656 + #endif /* HAVE_LIBBFD_SUPPORT */ 657 + /* No NFP support in LLVM, we have no valid triple to return. */ 658 658 default: 659 - p_err("Can't get bfd arch name for device vendor id 0x%04x", 659 + p_err("Can't get arch name for device vendor id 0x%04x", 660 660 vendor_id); 661 661 return NULL; 662 662 }
+2
tools/bpf/bpftool/iter.c
··· 1 1 // SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) 2 2 // Copyright (C) 2020 Facebook 3 3 4 + #ifndef _GNU_SOURCE 4 5 #define _GNU_SOURCE 6 + #endif 5 7 #include <unistd.h> 6 8 #include <linux/err.h> 7 9 #include <bpf/libbpf.h>
+215 -46
tools/bpf/bpftool/jit_disasm.c
··· 11 11 * Licensed under the GNU General Public License, version 2.0 (GPLv2) 12 12 */ 13 13 14 + #ifndef _GNU_SOURCE 14 15 #define _GNU_SOURCE 16 + #endif 15 17 #include <stdio.h> 16 18 #include <stdarg.h> 17 19 #include <stdint.h> 18 20 #include <stdlib.h> 19 - #include <assert.h> 20 21 #include <unistd.h> 21 22 #include <string.h> 22 - #include <bfd.h> 23 - #include <dis-asm.h> 24 23 #include <sys/stat.h> 25 24 #include <limits.h> 26 25 #include <bpf/libbpf.h> 26 + 27 + #ifdef HAVE_LLVM_SUPPORT 28 + #include <llvm-c/Core.h> 29 + #include <llvm-c/Disassembler.h> 30 + #include <llvm-c/Target.h> 31 + #include <llvm-c/TargetMachine.h> 32 + #endif 33 + 34 + #ifdef HAVE_LIBBFD_SUPPORT 35 + #include <bfd.h> 36 + #include <dis-asm.h> 27 37 #include <tools/dis-asm-compat.h> 38 + #endif 28 39 29 40 #include "json_writer.h" 30 41 #include "main.h" 31 42 32 - static void get_exec_path(char *tpath, size_t size) 43 + static int oper_count; 44 + 45 + #ifdef HAVE_LLVM_SUPPORT 46 + #define DISASM_SPACER 47 + 48 + typedef LLVMDisasmContextRef disasm_ctx_t; 49 + 50 + static int printf_json(char *s) 51 + { 52 + s = strtok(s, " \t"); 53 + jsonw_string_field(json_wtr, "operation", s); 54 + 55 + jsonw_name(json_wtr, "operands"); 56 + jsonw_start_array(json_wtr); 57 + oper_count = 1; 58 + 59 + while ((s = strtok(NULL, " \t,()")) != 0) { 60 + jsonw_string(json_wtr, s); 61 + oper_count++; 62 + } 63 + return 0; 64 + } 65 + 66 + /* This callback to set the ref_type is necessary to have the LLVM disassembler 67 + * print PC-relative addresses instead of byte offsets for branch instruction 68 + * targets. 
69 + */ 70 + static const char * 71 + symbol_lookup_callback(__maybe_unused void *disasm_info, 72 + __maybe_unused uint64_t ref_value, 73 + uint64_t *ref_type, __maybe_unused uint64_t ref_PC, 74 + __maybe_unused const char **ref_name) 75 + { 76 + *ref_type = LLVMDisassembler_ReferenceType_InOut_None; 77 + return NULL; 78 + } 79 + 80 + static int 81 + init_context(disasm_ctx_t *ctx, const char *arch, 82 + __maybe_unused const char *disassembler_options, 83 + __maybe_unused unsigned char *image, __maybe_unused ssize_t len) 84 + { 85 + char *triple; 86 + 87 + if (arch) 88 + triple = LLVMNormalizeTargetTriple(arch); 89 + else 90 + triple = LLVMGetDefaultTargetTriple(); 91 + if (!triple) { 92 + p_err("Failed to retrieve triple"); 93 + return -1; 94 + } 95 + *ctx = LLVMCreateDisasm(triple, NULL, 0, NULL, symbol_lookup_callback); 96 + LLVMDisposeMessage(triple); 97 + 98 + if (!*ctx) { 99 + p_err("Failed to create disassembler"); 100 + return -1; 101 + } 102 + 103 + return 0; 104 + } 105 + 106 + static void destroy_context(disasm_ctx_t *ctx) 107 + { 108 + LLVMDisposeMessage(*ctx); 109 + } 110 + 111 + static int 112 + disassemble_insn(disasm_ctx_t *ctx, unsigned char *image, ssize_t len, int pc) 113 + { 114 + char buf[256]; 115 + int count; 116 + 117 + count = LLVMDisasmInstruction(*ctx, image + pc, len - pc, pc, 118 + buf, sizeof(buf)); 119 + if (json_output) 120 + printf_json(buf); 121 + else 122 + printf("%s", buf); 123 + 124 + return count; 125 + } 126 + 127 + int disasm_init(void) 128 + { 129 + LLVMInitializeAllTargetInfos(); 130 + LLVMInitializeAllTargetMCs(); 131 + LLVMInitializeAllDisassemblers(); 132 + return 0; 133 + } 134 + #endif /* HAVE_LLVM_SUPPORT */ 135 + 136 + #ifdef HAVE_LIBBFD_SUPPORT 137 + #define DISASM_SPACER "\t" 138 + 139 + typedef struct { 140 + struct disassemble_info *info; 141 + disassembler_ftype disassemble; 142 + bfd *bfdf; 143 + } disasm_ctx_t; 144 + 145 + static int get_exec_path(char *tpath, size_t size) 33 146 { 34 147 const char *path = 
"/proc/self/exe"; 35 148 ssize_t len; 36 149 37 150 len = readlink(path, tpath, size - 1); 38 - assert(len > 0); 151 + if (len <= 0) 152 + return -1; 153 + 39 154 tpath[len] = 0; 155 + 156 + return 0; 40 157 } 41 158 42 - static int oper_count; 43 159 static int printf_json(void *out, const char *fmt, va_list ap) 44 160 { 45 161 char *s; ··· 213 97 return r; 214 98 } 215 99 216 - void disasm_print_insn(unsigned char *image, ssize_t len, int opcodes, 217 - const char *arch, const char *disassembler_options, 218 - const struct btf *btf, 219 - const struct bpf_prog_linfo *prog_linfo, 220 - __u64 func_ksym, unsigned int func_idx, 221 - bool linum) 100 + static int init_context(disasm_ctx_t *ctx, const char *arch, 101 + const char *disassembler_options, 102 + unsigned char *image, ssize_t len) 222 103 { 223 - const struct bpf_line_info *linfo = NULL; 224 - disassembler_ftype disassemble; 225 - struct disassemble_info info; 226 - unsigned int nr_skip = 0; 227 - int count, i, pc = 0; 104 + struct disassemble_info *info; 228 105 char tpath[PATH_MAX]; 229 106 bfd *bfdf; 230 107 231 - if (!len) 232 - return; 233 - 234 108 memset(tpath, 0, sizeof(tpath)); 235 - get_exec_path(tpath, sizeof(tpath)); 109 + if (get_exec_path(tpath, sizeof(tpath))) { 110 + p_err("failed to create disassembler (get_exec_path)"); 111 + return -1; 112 + } 236 113 237 - bfdf = bfd_openr(tpath, NULL); 238 - assert(bfdf); 239 - assert(bfd_check_format(bfdf, bfd_object)); 114 + ctx->bfdf = bfd_openr(tpath, NULL); 115 + if (!ctx->bfdf) { 116 + p_err("failed to create disassembler (bfd_openr)"); 117 + return -1; 118 + } 119 + if (!bfd_check_format(ctx->bfdf, bfd_object)) { 120 + p_err("failed to create disassembler (bfd_check_format)"); 121 + goto err_close; 122 + } 123 + bfdf = ctx->bfdf; 124 + 125 + ctx->info = malloc(sizeof(struct disassemble_info)); 126 + if (!ctx->info) { 127 + p_err("mem alloc failed"); 128 + goto err_close; 129 + } 130 + info = ctx->info; 240 131 241 132 if (json_output) 242 - 
init_disassemble_info_compat(&info, stdout, 133 + init_disassemble_info_compat(info, stdout, 243 134 (fprintf_ftype) fprintf_json, 244 135 fprintf_json_styled); 245 136 else 246 - init_disassemble_info_compat(&info, stdout, 137 + init_disassemble_info_compat(info, stdout, 247 138 (fprintf_ftype) fprintf, 248 139 fprintf_styled); 249 140 ··· 262 139 bfdf->arch_info = inf; 263 140 } else { 264 141 p_err("No libbfd support for %s", arch); 265 - return; 142 + goto err_free; 266 143 } 267 144 } 268 145 269 - info.arch = bfd_get_arch(bfdf); 270 - info.mach = bfd_get_mach(bfdf); 146 + info->arch = bfd_get_arch(bfdf); 147 + info->mach = bfd_get_mach(bfdf); 271 148 if (disassembler_options) 272 - info.disassembler_options = disassembler_options; 273 - info.buffer = image; 274 - info.buffer_length = len; 149 + info->disassembler_options = disassembler_options; 150 + info->buffer = image; 151 + info->buffer_length = len; 275 152 276 - disassemble_init_for_target(&info); 153 + disassemble_init_for_target(info); 277 154 278 155 #ifdef DISASM_FOUR_ARGS_SIGNATURE 279 - disassemble = disassembler(info.arch, 280 - bfd_big_endian(bfdf), 281 - info.mach, 282 - bfdf); 156 + ctx->disassemble = disassembler(info->arch, 157 + bfd_big_endian(bfdf), 158 + info->mach, 159 + bfdf); 283 160 #else 284 - disassemble = disassembler(bfdf); 161 + ctx->disassemble = disassembler(bfdf); 285 162 #endif 286 - assert(disassemble); 163 + if (!ctx->disassemble) { 164 + p_err("failed to create disassembler"); 165 + goto err_free; 166 + } 167 + return 0; 168 + 169 + err_free: 170 + free(info); 171 + err_close: 172 + bfd_close(ctx->bfdf); 173 + return -1; 174 + } 175 + 176 + static void destroy_context(disasm_ctx_t *ctx) 177 + { 178 + free(ctx->info); 179 + bfd_close(ctx->bfdf); 180 + } 181 + 182 + static int 183 + disassemble_insn(disasm_ctx_t *ctx, __maybe_unused unsigned char *image, 184 + __maybe_unused ssize_t len, int pc) 185 + { 186 + return ctx->disassemble(pc, ctx->info); 187 + } 188 + 189 + int 
disasm_init(void) 190 + { 191 + bfd_init(); 192 + return 0; 193 + } 194 + #endif /* HAVE_LIBBPFD_SUPPORT */ 195 + 196 + int disasm_print_insn(unsigned char *image, ssize_t len, int opcodes, 197 + const char *arch, const char *disassembler_options, 198 + const struct btf *btf, 199 + const struct bpf_prog_linfo *prog_linfo, 200 + __u64 func_ksym, unsigned int func_idx, 201 + bool linum) 202 + { 203 + const struct bpf_line_info *linfo = NULL; 204 + unsigned int nr_skip = 0; 205 + int count, i, pc = 0; 206 + disasm_ctx_t ctx; 207 + 208 + if (!len) 209 + return -1; 210 + 211 + if (init_context(&ctx, arch, disassembler_options, image, len)) 212 + return -1; 287 213 288 214 if (json_output) 289 215 jsonw_start_array(json_wtr); ··· 357 185 if (linfo) 358 186 btf_dump_linfo_plain(btf, linfo, "; ", 359 187 linum); 360 - printf("%4x:\t", pc); 188 + printf("%4x:" DISASM_SPACER, pc); 361 189 } 362 190 363 - count = disassemble(pc, &info); 191 + count = disassemble_insn(&ctx, image, len, pc); 192 + 364 193 if (json_output) { 365 194 /* Operand array, was started in fprintf_json. Before 366 195 * that, make sure we have a _null_ value if no operand ··· 397 224 if (json_output) 398 225 jsonw_end_array(json_wtr); 399 226 400 - bfd_close(bfdf); 401 - } 227 + destroy_context(&ctx); 402 228 403 - int disasm_init(void) 404 - { 405 - bfd_init(); 406 229 return 0; 407 230 }
+57 -33
tools/bpf/bpftool/main.c
··· 71 71 return 0; 72 72 } 73 73 74 + static int do_batch(int argc, char **argv); 75 + static int do_version(int argc, char **argv); 76 + 77 + static const struct cmd commands[] = { 78 + { "help", do_help }, 79 + { "batch", do_batch }, 80 + { "prog", do_prog }, 81 + { "map", do_map }, 82 + { "link", do_link }, 83 + { "cgroup", do_cgroup }, 84 + { "perf", do_perf }, 85 + { "net", do_net }, 86 + { "feature", do_feature }, 87 + { "btf", do_btf }, 88 + { "gen", do_gen }, 89 + { "struct_ops", do_struct_ops }, 90 + { "iter", do_iter }, 91 + { "version", do_version }, 92 + { 0 } 93 + }; 94 + 74 95 #ifndef BPFTOOL_VERSION 75 96 /* bpftool's major and minor version numbers are aligned on libbpf's. There is 76 97 * an offset of 6 for the version number, because bpftool's version was higher ··· 103 82 #define BPFTOOL_PATCH_VERSION 0 104 83 #endif 105 84 85 + static void 86 + print_feature(const char *feature, bool state, unsigned int *nb_features) 87 + { 88 + if (state) { 89 + printf("%s %s", *nb_features ? "," : "", feature); 90 + *nb_features = *nb_features + 1; 91 + } 92 + } 93 + 106 94 static int do_version(int argc, char **argv) 107 95 { 108 96 #ifdef HAVE_LIBBFD_SUPPORT ··· 119 89 #else 120 90 const bool has_libbfd = false; 121 91 #endif 92 + #ifdef HAVE_LLVM_SUPPORT 93 + const bool has_llvm = true; 94 + #else 95 + const bool has_llvm = false; 96 + #endif 122 97 #ifdef BPFTOOL_WITHOUT_SKELETONS 123 98 const bool has_skeletons = false; 124 99 #else 125 100 const bool has_skeletons = true; 126 101 #endif 102 + bool bootstrap = false; 103 + int i; 104 + 105 + for (i = 0; commands[i].cmd; i++) { 106 + if (!strcmp(commands[i].cmd, "prog")) { 107 + /* Assume we run a bootstrap version if "bpftool prog" 108 + * is not available. 
109 + */ 110 + bootstrap = !commands[i].func; 111 + break; 112 + } 113 + } 127 114 128 115 if (json_output) { 129 116 jsonw_start_object(json_wtr); /* root object */ ··· 159 112 jsonw_name(json_wtr, "features"); 160 113 jsonw_start_object(json_wtr); /* features */ 161 114 jsonw_bool_field(json_wtr, "libbfd", has_libbfd); 115 + jsonw_bool_field(json_wtr, "llvm", has_llvm); 162 116 jsonw_bool_field(json_wtr, "libbpf_strict", !legacy_libbpf); 163 117 jsonw_bool_field(json_wtr, "skeletons", has_skeletons); 118 + jsonw_bool_field(json_wtr, "bootstrap", bootstrap); 164 119 jsonw_end_object(json_wtr); /* features */ 165 120 166 121 jsonw_end_object(json_wtr); /* root object */ ··· 177 128 #endif 178 129 printf("using libbpf %s\n", libbpf_version_string()); 179 130 printf("features:"); 180 - if (has_libbfd) { 181 - printf(" libbfd"); 182 - nb_features++; 183 - } 184 - if (!legacy_libbpf) { 185 - printf("%s libbpf_strict", nb_features++ ? "," : ""); 186 - nb_features++; 187 - } 188 - if (has_skeletons) 189 - printf("%s skeletons", nb_features++ ? 
"," : ""); 131 + print_feature("libbfd", has_libbfd, &nb_features); 132 + print_feature("llvm", has_llvm, &nb_features); 133 + print_feature("libbpf_strict", !legacy_libbpf, &nb_features); 134 + print_feature("skeletons", has_skeletons, &nb_features); 135 + print_feature("bootstrap", bootstrap, &nb_features); 190 136 printf("\n"); 191 137 } 192 138 return 0; ··· 323 279 return n_argc; 324 280 } 325 281 326 - static int do_batch(int argc, char **argv); 327 - 328 - static const struct cmd cmds[] = { 329 - { "help", do_help }, 330 - { "batch", do_batch }, 331 - { "prog", do_prog }, 332 - { "map", do_map }, 333 - { "link", do_link }, 334 - { "cgroup", do_cgroup }, 335 - { "perf", do_perf }, 336 - { "net", do_net }, 337 - { "feature", do_feature }, 338 - { "btf", do_btf }, 339 - { "gen", do_gen }, 340 - { "struct_ops", do_struct_ops }, 341 - { "iter", do_iter }, 342 - { "version", do_version }, 343 - { 0 } 344 - }; 345 - 346 282 static int do_batch(int argc, char **argv) 347 283 { 348 284 char buf[BATCH_LINE_LEN_MAX], contline[BATCH_LINE_LEN_MAX]; ··· 410 386 jsonw_name(json_wtr, "output"); 411 387 } 412 388 413 - err = cmd_select(cmds, n_argc, n_argv, do_help); 389 + err = cmd_select(commands, n_argc, n_argv, do_help); 414 390 415 391 if (json_output) 416 392 jsonw_end_object(json_wtr); ··· 474 450 json_output = false; 475 451 show_pinned = false; 476 452 block_mount = false; 477 - bin_name = argv[0]; 453 + bin_name = "bpftool"; 478 454 479 455 opterr = 0; 480 456 while ((opt = getopt_long(argc, argv, "VhpjfLmndB:l", ··· 552 528 if (version_requested) 553 529 return do_version(argc, argv); 554 530 555 - ret = cmd_select(cmds, argc, argv, do_help); 531 + ret = cmd_select(commands, argc, argv, do_help); 556 532 557 533 if (json_output) 558 534 jsonw_destroy(&json_wtr);
+16 -16
tools/bpf/bpftool/main.h
··· 172 172 int map_parse_fd_and_info(int *argc, char ***argv, void *info, __u32 *info_len); 173 173 174 174 struct bpf_prog_linfo; 175 - #ifdef HAVE_LIBBFD_SUPPORT 176 - void disasm_print_insn(unsigned char *image, ssize_t len, int opcodes, 177 - const char *arch, const char *disassembler_options, 178 - const struct btf *btf, 179 - const struct bpf_prog_linfo *prog_linfo, 180 - __u64 func_ksym, unsigned int func_idx, 181 - bool linum); 175 + #if defined(HAVE_LLVM_SUPPORT) || defined(HAVE_LIBBFD_SUPPORT) 176 + int disasm_print_insn(unsigned char *image, ssize_t len, int opcodes, 177 + const char *arch, const char *disassembler_options, 178 + const struct btf *btf, 179 + const struct bpf_prog_linfo *prog_linfo, 180 + __u64 func_ksym, unsigned int func_idx, 181 + bool linum); 182 182 int disasm_init(void); 183 183 #else 184 184 static inline 185 - void disasm_print_insn(unsigned char *image, ssize_t len, int opcodes, 186 - const char *arch, const char *disassembler_options, 187 - const struct btf *btf, 188 - const struct bpf_prog_linfo *prog_linfo, 189 - __u64 func_ksym, unsigned int func_idx, 190 - bool linum) 185 + int disasm_print_insn(unsigned char *image, ssize_t len, int opcodes, 186 + const char *arch, const char *disassembler_options, 187 + const struct btf *btf, 188 + const struct bpf_prog_linfo *prog_linfo, 189 + __u64 func_ksym, unsigned int func_idx, 190 + bool linum) 191 191 { 192 + return 0; 192 193 } 193 194 static inline int disasm_init(void) 194 195 { 195 - p_err("No libbfd support"); 196 + p_err("No JIT disassembly support"); 196 197 return -1; 197 198 } 198 199 #endif ··· 203 202 unsigned int get_page_size(void); 204 203 unsigned int get_possible_cpus(void); 205 204 const char * 206 - ifindex_to_bfd_params(__u32 ifindex, __u64 ns_dev, __u64 ns_ino, 207 - const char **opt); 205 + ifindex_to_arch(__u32 ifindex, __u64 ns_dev, __u64 ns_ino, const char **opt); 208 206 209 207 struct btf_dumper { 210 208 const struct btf *btf;
+1 -2
tools/bpf/bpftool/map.c
··· 1 1 // SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) 2 2 /* Copyright (C) 2017-2018 Netronome Systems, Inc. */ 3 3 4 - #include <assert.h> 5 4 #include <errno.h> 6 5 #include <fcntl.h> 7 6 #include <linux/err.h> ··· 1458 1459 " devmap | devmap_hash | sockmap | cpumap | xskmap | sockhash |\n" 1459 1460 " cgroup_storage | reuseport_sockarray | percpu_cgroup_storage |\n" 1460 1461 " queue | stack | sk_storage | struct_ops | ringbuf | inode_storage |\n" 1461 - " task_storage | bloom_filter | user_ringbuf }\n" 1462 + " task_storage | bloom_filter | user_ringbuf | cgrp_storage }\n" 1462 1463 " " HELP_SPEC_OPTIONS " |\n" 1463 1464 " {-f|--bpffs} | {-n|--nomount} }\n" 1464 1465 "",
+2
tools/bpf/bpftool/net.c
···
 // SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
 // Copyright (C) 2018 Facebook

+#ifndef _GNU_SOURCE
 #define _GNU_SOURCE
+#endif
 #include <errno.h>
 #include <fcntl.h>
 #include <stdlib.h>
+2
tools/bpf/bpftool/perf.c
···
 // Copyright (C) 2018 Facebook
 // Author: Yonghong Song <yhs@fb.com>

+#ifndef _GNU_SOURCE
 #define _GNU_SOURCE
+#endif
 #include <ctype.h>
 #include <errno.h>
 #include <fcntl.h>
+87 -12
tools/bpf/bpftool/prog.c
··· 1 1 // SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) 2 2 /* Copyright (C) 2017-2018 Netronome Systems, Inc. */ 3 3 4 + #ifndef _GNU_SOURCE 4 5 #define _GNU_SOURCE 6 + #endif 5 7 #include <errno.h> 6 8 #include <fcntl.h> 7 9 #include <signal.h> ··· 764 762 const char *name = NULL; 765 763 766 764 if (info->ifindex) { 767 - name = ifindex_to_bfd_params(info->ifindex, 768 - info->netns_dev, 769 - info->netns_ino, 770 - &disasm_opt); 765 + name = ifindex_to_arch(info->ifindex, info->netns_dev, 766 + info->netns_ino, &disasm_opt); 771 767 if (!name) 772 768 goto exit_free; 773 769 } ··· 820 820 printf("%s:\n", sym_name); 821 821 } 822 822 823 - disasm_print_insn(img, lens[i], opcodes, 824 - name, disasm_opt, btf, 825 - prog_linfo, ksyms[i], i, 826 - linum); 823 + if (disasm_print_insn(img, lens[i], opcodes, 824 + name, disasm_opt, btf, 825 + prog_linfo, ksyms[i], i, 826 + linum)) 827 + goto exit_free; 827 828 828 829 img += lens[i]; 829 830 ··· 837 836 if (json_output) 838 837 jsonw_end_array(json_wtr); 839 838 } else { 840 - disasm_print_insn(buf, member_len, opcodes, name, 841 - disasm_opt, btf, NULL, 0, 0, false); 839 + if (disasm_print_insn(buf, member_len, opcodes, name, 840 + disasm_opt, btf, NULL, 0, 0, 841 + false)) 842 + goto exit_free; 842 843 } 843 844 } else if (visual) { 844 845 if (json_output) ··· 1456 1453 return ret; 1457 1454 } 1458 1455 1456 + static int 1457 + auto_attach_program(struct bpf_program *prog, const char *path) 1458 + { 1459 + struct bpf_link *link; 1460 + int err; 1461 + 1462 + link = bpf_program__attach(prog); 1463 + if (!link) { 1464 + p_info("Program %s does not support autoattach, falling back to pinning", 1465 + bpf_program__name(prog)); 1466 + return bpf_obj_pin(bpf_program__fd(prog), path); 1467 + } 1468 + 1469 + err = bpf_link__pin(link, path); 1470 + bpf_link__destroy(link); 1471 + return err; 1472 + } 1473 + 1474 + static int pathname_concat(char *buf, size_t buf_sz, const char *path, const char *name) 1475 + { 
1476 + int len; 1477 + 1478 + len = snprintf(buf, buf_sz, "%s/%s", path, name); 1479 + if (len < 0) 1480 + return -EINVAL; 1481 + if ((size_t)len >= buf_sz) 1482 + return -ENAMETOOLONG; 1483 + 1484 + return 0; 1485 + } 1486 + 1487 + static int 1488 + auto_attach_programs(struct bpf_object *obj, const char *path) 1489 + { 1490 + struct bpf_program *prog; 1491 + char buf[PATH_MAX]; 1492 + int err; 1493 + 1494 + bpf_object__for_each_program(prog, obj) { 1495 + err = pathname_concat(buf, sizeof(buf), path, bpf_program__name(prog)); 1496 + if (err) 1497 + goto err_unpin_programs; 1498 + 1499 + err = auto_attach_program(prog, buf); 1500 + if (err) 1501 + goto err_unpin_programs; 1502 + } 1503 + 1504 + return 0; 1505 + 1506 + err_unpin_programs: 1507 + while ((prog = bpf_object__prev_program(obj, prog))) { 1508 + if (pathname_concat(buf, sizeof(buf), path, bpf_program__name(prog))) 1509 + continue; 1510 + 1511 + bpf_program__unpin(prog, buf); 1512 + } 1513 + 1514 + return err; 1515 + } 1516 + 1459 1517 static int load_with_options(int argc, char **argv, bool first_prog_only) 1460 1518 { 1461 1519 enum bpf_prog_type common_prog_type = BPF_PROG_TYPE_UNSPEC; ··· 1528 1464 struct bpf_program *prog = NULL, *pos; 1529 1465 unsigned int old_map_fds = 0; 1530 1466 const char *pinmaps = NULL; 1467 + bool auto_attach = false; 1531 1468 struct bpf_object *obj; 1532 1469 struct bpf_map *map; 1533 1470 const char *pinfile; ··· 1648 1583 goto err_free_reuse_maps; 1649 1584 1650 1585 pinmaps = GET_ARG(); 1586 + } else if (is_prefix(*argv, "autoattach")) { 1587 + auto_attach = true; 1588 + NEXT_ARG(); 1651 1589 } else { 1652 1590 p_err("expected no more arguments, 'type', 'map' or 'dev', got: '%s'?", 1653 1591 *argv); ··· 1760 1692 goto err_close_obj; 1761 1693 } 1762 1694 1763 - err = bpf_obj_pin(bpf_program__fd(prog), pinfile); 1695 + if (auto_attach) 1696 + err = auto_attach_program(prog, pinfile); 1697 + else 1698 + err = bpf_obj_pin(bpf_program__fd(prog), pinfile); 1764 1699 if 
(err) { 1765 1700 p_err("failed to pin program %s", 1766 1701 bpf_program__section_name(prog)); 1767 1702 goto err_close_obj; 1768 1703 } 1769 1704 } else { 1770 - err = bpf_object__pin_programs(obj, pinfile); 1705 + if (auto_attach) 1706 + err = auto_attach_programs(obj, pinfile); 1707 + else 1708 + err = bpf_object__pin_programs(obj, pinfile); 1771 1709 if (err) { 1772 1710 p_err("failed to pin all programs"); 1773 1711 goto err_close_obj; ··· 2412 2338 " [type TYPE] [dev NAME] \\\n" 2413 2339 " [map { idx IDX | name NAME } MAP]\\\n" 2414 2340 " [pinmaps MAP_DIR]\n" 2341 + " [autoattach]\n" 2415 2342 " %1$s %2$s attach PROG ATTACH_TYPE [MAP]\n" 2416 2343 " %1$s %2$s detach PROG ATTACH_TYPE [MAP]\n" 2417 2344 " %1$s %2$s run PROG \\\n"
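The pathname_concat() helper added above relies on the return value of snprintf() to detect truncation before a pin path is used. A minimal stand-alone sketch of that check follows; the function body mirrors the diff, while the pin directory and program name mentioned afterwards are made-up values for illustration only.

```c
#include <errno.h>
#include <stdio.h>

/* Same logic as pathname_concat() in the diff: join path and name with '/',
 * and refuse any result that would not fit in buf, terminating NUL included.
 */
static int pathname_concat(char *buf, size_t buf_sz, const char *path,
			   const char *name)
{
	int len;

	/* snprintf() returns the length the full string would have had */
	len = snprintf(buf, buf_sz, "%s/%s", path, name);
	if (len < 0)
		return -EINVAL;
	if ((size_t)len >= buf_sz)
		return -ENAMETOOLONG;

	return 0;
}
```

With a 32-byte buffer, joining the hypothetical directory "/sys/fs/bpf/dir" and program name "prog_a" succeeds, whereas an 8-byte buffer yields -ENAMETOOLONG instead of a silently truncated path.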
+2
tools/bpf/bpftool/xlated_dumper.c
···
 // SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
 /* Copyright (C) 2018 Netronome Systems, Inc. */

+#ifndef _GNU_SOURCE
 #define _GNU_SOURCE
+#endif
 #include <stdarg.h>
 #include <stdio.h>
 #include <stdlib.h>
+49 -1
tools/include/uapi/linux/bpf.h
··· 922 922 BPF_MAP_TYPE_CPUMAP, 923 923 BPF_MAP_TYPE_XSKMAP, 924 924 BPF_MAP_TYPE_SOCKHASH, 925 - BPF_MAP_TYPE_CGROUP_STORAGE, 925 + BPF_MAP_TYPE_CGROUP_STORAGE_DEPRECATED, 926 + /* BPF_MAP_TYPE_CGROUP_STORAGE is available to bpf programs attaching 927 + * to a cgroup. The newer BPF_MAP_TYPE_CGRP_STORAGE is available to 928 + * both cgroup-attached and other progs and supports all functionality 929 + * provided by BPF_MAP_TYPE_CGROUP_STORAGE. So mark 930 + * BPF_MAP_TYPE_CGROUP_STORAGE deprecated. 931 + */ 932 + BPF_MAP_TYPE_CGROUP_STORAGE = BPF_MAP_TYPE_CGROUP_STORAGE_DEPRECATED, 926 933 BPF_MAP_TYPE_REUSEPORT_SOCKARRAY, 927 934 BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE, 928 935 BPF_MAP_TYPE_QUEUE, ··· 942 935 BPF_MAP_TYPE_TASK_STORAGE, 943 936 BPF_MAP_TYPE_BLOOM_FILTER, 944 937 BPF_MAP_TYPE_USER_RINGBUF, 938 + BPF_MAP_TYPE_CGRP_STORAGE, 945 939 }; 946 940 947 941 /* Note that tracing related programs such as ··· 5443 5435 * **-E2BIG** if user-space has tried to publish a sample which is 5444 5436 * larger than the size of the ring buffer, or which cannot fit 5445 5437 * within a struct bpf_dynptr. 5438 + * 5439 + * void *bpf_cgrp_storage_get(struct bpf_map *map, struct cgroup *cgroup, void *value, u64 flags) 5440 + * Description 5441 + * Get a bpf_local_storage from the *cgroup*. 5442 + * 5443 + * Logically, it could be thought of as getting the value from 5444 + * a *map* with *cgroup* as the **key**. From this 5445 + * perspective, the usage is not much different from 5446 + * **bpf_map_lookup_elem**\ (*map*, **&**\ *cgroup*) except this 5447 + * helper enforces the key must be a cgroup struct and the map must also 5448 + * be a **BPF_MAP_TYPE_CGRP_STORAGE**. 5449 + * 5450 + * In reality, the local-storage value is embedded directly inside of the 5451 + * *cgroup* object itself, rather than being located in the 5452 + * **BPF_MAP_TYPE_CGRP_STORAGE** map. 
When the local-storage value is 5453 + * queried for some *map* on a *cgroup* object, the kernel will perform an 5454 + * O(n) iteration over all of the live local-storage values for that 5455 + * *cgroup* object until the local-storage value for the *map* is found. 5456 + * 5457 + * An optional *flags* (**BPF_LOCAL_STORAGE_GET_F_CREATE**) can be 5458 + * used such that a new bpf_local_storage will be 5459 + * created if one does not exist. *value* can be used 5460 + * together with **BPF_LOCAL_STORAGE_GET_F_CREATE** to specify 5461 + * the initial value of a bpf_local_storage. If *value* is 5462 + * **NULL**, the new bpf_local_storage will be zero initialized. 5463 + * Return 5464 + * A bpf_local_storage pointer is returned on success. 5465 + * 5466 + * **NULL** if not found or there was an error in adding 5467 + * a new bpf_local_storage. 5468 + * 5469 + * long bpf_cgrp_storage_delete(struct bpf_map *map, struct cgroup *cgroup) 5470 + * Description 5471 + * Delete a bpf_local_storage from a *cgroup*. 5472 + * Return 5473 + * 0 on success. 5474 + * 5475 + * **-ENOENT** if the bpf_local_storage cannot be found. 5446 5476 */ 5447 5477 #define ___BPF_FUNC_MAPPER(FN, ctx...) \ 5448 5478 FN(unspec, 0, ##ctx) \ ··· 5693 5647 FN(tcp_raw_check_syncookie_ipv6, 207, ##ctx) \ 5694 5648 FN(ktime_get_tai_ns, 208, ##ctx) \ 5695 5649 FN(user_ringbuf_drain, 209, ##ctx) \ 5650 + FN(cgrp_storage_get, 210, ##ctx) \ 5651 + FN(cgrp_storage_delete, 211, ##ctx) \ 5696 5652 /* */ 5697 5653 5698 5654 /* backwards-compatibility macros for users of __BPF_FUNC_MAPPER that don't
+5 -3
tools/lib/bpf/btf.c
···
 }

 /* Check if given two types are identical ARRAY definitions */
-static int btf_dedup_identical_arrays(struct btf_dedup *d, __u32 id1, __u32 id2)
+static bool btf_dedup_identical_arrays(struct btf_dedup *d, __u32 id1, __u32 id2)
 {
 	struct btf_type *t1, *t2;

 	t1 = btf_type_by_id(d->btf, id1);
 	t2 = btf_type_by_id(d->btf, id2);
 	if (!btf_is_array(t1) || !btf_is_array(t2))
-		return 0;
+		return false;

 	return btf_equal_array(t1, t2);
 }
···
 	m1 = btf_members(t1);
 	m2 = btf_members(t2);
 	for (i = 0, n = btf_vlen(t1); i < n; i++, m1++, m2++) {
-		if (m1->type != m2->type)
+		if (m1->type != m2->type &&
+		    !btf_dedup_identical_arrays(d, m1->type, m2->type) &&
+		    !btf_dedup_identical_structs(d, m1->type, m2->type))
 			return false;
 	}
 	return true;
+113 -65
tools/lib/bpf/libbpf.c
··· 164 164 [BPF_MAP_TYPE_TASK_STORAGE] = "task_storage", 165 165 [BPF_MAP_TYPE_BLOOM_FILTER] = "bloom_filter", 166 166 [BPF_MAP_TYPE_USER_RINGBUF] = "user_ringbuf", 167 + [BPF_MAP_TYPE_CGRP_STORAGE] = "cgrp_storage", 167 168 }; 168 169 169 170 static const char * const prog_type_name[] = { ··· 1462 1461 return -ENOENT; 1463 1462 } 1464 1463 1465 - static int find_elf_var_offset(const struct bpf_object *obj, const char *name, __u32 *off) 1464 + static Elf64_Sym *find_elf_var_sym(const struct bpf_object *obj, const char *name) 1466 1465 { 1467 1466 Elf_Data *symbols = obj->efile.symbols; 1468 1467 const char *sname; 1469 1468 size_t si; 1470 - 1471 - if (!name || !off) 1472 - return -EINVAL; 1473 1469 1474 1470 for (si = 0; si < symbols->d_size / sizeof(Elf64_Sym); si++) { 1475 1471 Elf64_Sym *sym = elf_sym_by_idx(obj, si); ··· 1481 1483 sname = elf_sym_str(obj, sym->st_name); 1482 1484 if (!sname) { 1483 1485 pr_warn("failed to get sym name string for var %s\n", name); 1484 - return -EIO; 1486 + return ERR_PTR(-EIO); 1485 1487 } 1486 - if (strcmp(name, sname) == 0) { 1487 - *off = sym->st_value; 1488 - return 0; 1489 - } 1488 + if (strcmp(name, sname) == 0) 1489 + return sym; 1490 1490 } 1491 1491 1492 - return -ENOENT; 1492 + return ERR_PTR(-ENOENT); 1493 1493 } 1494 1494 1495 1495 static struct bpf_map *bpf_object__add_map(struct bpf_object *obj) ··· 1578 1582 } 1579 1583 1580 1584 static int 1581 - bpf_map_find_btf_info(struct bpf_object *obj, struct bpf_map *map); 1585 + map_fill_btf_type_info(struct bpf_object *obj, struct bpf_map *map); 1586 + 1587 + /* Internal BPF map is mmap()'able only if at least one of corresponding 1588 + * DATASEC's VARs are to be exposed through BPF skeleton. I.e., it's a GLOBAL 1589 + * variable and it's not marked as __hidden (which turns it into, effectively, 1590 + * a STATIC variable). 
1591 + */ 1592 + static bool map_is_mmapable(struct bpf_object *obj, struct bpf_map *map) 1593 + { 1594 + const struct btf_type *t, *vt; 1595 + struct btf_var_secinfo *vsi; 1596 + int i, n; 1597 + 1598 + if (!map->btf_value_type_id) 1599 + return false; 1600 + 1601 + t = btf__type_by_id(obj->btf, map->btf_value_type_id); 1602 + if (!btf_is_datasec(t)) 1603 + return false; 1604 + 1605 + vsi = btf_var_secinfos(t); 1606 + for (i = 0, n = btf_vlen(t); i < n; i++, vsi++) { 1607 + vt = btf__type_by_id(obj->btf, vsi->type); 1608 + if (!btf_is_var(vt)) 1609 + continue; 1610 + 1611 + if (btf_var(vt)->linkage != BTF_VAR_STATIC) 1612 + return true; 1613 + } 1614 + 1615 + return false; 1616 + } 1582 1617 1583 1618 static int 1584 1619 bpf_object__init_internal_map(struct bpf_object *obj, enum libbpf_map_type type, ··· 1641 1614 def->max_entries = 1; 1642 1615 def->map_flags = type == LIBBPF_MAP_RODATA || type == LIBBPF_MAP_KCONFIG 1643 1616 ? BPF_F_RDONLY_PROG : 0; 1644 - def->map_flags |= BPF_F_MMAPABLE; 1617 + 1618 + /* failures are fine because of maps like .rodata.str1.1 */ 1619 + (void) map_fill_btf_type_info(obj, map); 1620 + 1621 + if (map_is_mmapable(obj, map)) 1622 + def->map_flags |= BPF_F_MMAPABLE; 1645 1623 1646 1624 pr_debug("map '%s' (global data): at sec_idx %d, offset %zu, flags %x.\n", 1647 1625 map->name, map->sec_idx, map->sec_offset, def->map_flags); ··· 1662 1630 zfree(&map->name); 1663 1631 return err; 1664 1632 } 1665 - 1666 - /* failures are fine because of maps like .rodata.str1.1 */ 1667 - (void) bpf_map_find_btf_info(obj, map); 1668 1633 1669 1634 if (data) 1670 1635 memcpy(map->mmaped, data, data_sz); ··· 2574 2545 fill_map_from_def(map->inner_map, &inner_def); 2575 2546 } 2576 2547 2577 - err = bpf_map_find_btf_info(obj, map); 2548 + err = map_fill_btf_type_info(obj, map); 2578 2549 if (err) 2579 2550 return err; 2580 2551 ··· 2879 2850 static int btf_fixup_datasec(struct bpf_object *obj, struct btf *btf, 2880 2851 struct btf_type *t) 2881 2852 { 
2882 - __u32 size = 0, off = 0, i, vars = btf_vlen(t); 2883 - const char *name = btf__name_by_offset(btf, t->name_off); 2884 - const struct btf_type *t_var; 2853 + __u32 size = 0, i, vars = btf_vlen(t); 2854 + const char *sec_name = btf__name_by_offset(btf, t->name_off); 2885 2855 struct btf_var_secinfo *vsi; 2886 - const struct btf_var *var; 2887 - int ret; 2856 + bool fixup_offsets = false; 2857 + int err; 2888 2858 2889 - if (!name) { 2859 + if (!sec_name) { 2890 2860 pr_debug("No name found in string section for DATASEC kind.\n"); 2891 2861 return -ENOENT; 2892 2862 } 2893 2863 2894 - /* .extern datasec size and var offsets were set correctly during 2895 - * extern collection step, so just skip straight to sorting variables 2864 + /* Extern-backing datasecs (.ksyms, .kconfig) have their size and 2865 + * variable offsets set at the previous step. Further, not every 2866 + * extern BTF VAR has corresponding ELF symbol preserved, so we skip 2867 + * all fixups altogether for such sections and go straight to sorting 2868 + * VARs within their DATASEC. 2896 2869 */ 2897 - if (t->size) 2870 + if (strcmp(sec_name, KCONFIG_SEC) == 0 || strcmp(sec_name, KSYMS_SEC) == 0) 2898 2871 goto sort_vars; 2899 2872 2900 - ret = find_elf_sec_sz(obj, name, &size); 2901 - if (ret || !size) { 2902 - pr_debug("Invalid size for section %s: %u bytes\n", name, size); 2903 - return -ENOENT; 2873 + /* Clang leaves DATASEC size and VAR offsets as zeroes, so we need to 2874 + * fix this up. But BPF static linker already fixes this up and fills 2875 + * all the sizes and offsets during static linking. So this step has 2876 + * to be optional. But the STV_HIDDEN handling is non-optional for any 2877 + * non-extern DATASEC, so the variable fixup loop below handles both 2878 + * functions at the same time, paying the cost of BTF VAR <-> ELF 2879 + * symbol matching just once. 
2880 + */ 2881 + if (t->size == 0) { 2882 + err = find_elf_sec_sz(obj, sec_name, &size); 2883 + if (err || !size) { 2884 + pr_debug("sec '%s': failed to determine size from ELF: size %u, err %d\n", 2885 + sec_name, size, err); 2886 + return -ENOENT; 2887 + } 2888 + 2889 + t->size = size; 2890 + fixup_offsets = true; 2904 2891 } 2905 2892 2906 - t->size = size; 2907 - 2908 2893 for (i = 0, vsi = btf_var_secinfos(t); i < vars; i++, vsi++) { 2894 + const struct btf_type *t_var; 2895 + struct btf_var *var; 2896 + const char *var_name; 2897 + Elf64_Sym *sym; 2898 + 2909 2899 t_var = btf__type_by_id(btf, vsi->type); 2910 2900 if (!t_var || !btf_is_var(t_var)) { 2911 - pr_debug("Non-VAR type seen in section %s\n", name); 2901 + pr_debug("sec '%s': unexpected non-VAR type found\n", sec_name); 2912 2902 return -EINVAL; 2913 2903 } 2914 2904 2915 2905 var = btf_var(t_var); 2916 - if (var->linkage == BTF_VAR_STATIC) 2906 + if (var->linkage == BTF_VAR_STATIC || var->linkage == BTF_VAR_GLOBAL_EXTERN) 2917 2907 continue; 2918 2908 2919 - name = btf__name_by_offset(btf, t_var->name_off); 2920 - if (!name) { 2921 - pr_debug("No name found in string section for VAR kind\n"); 2909 + var_name = btf__name_by_offset(btf, t_var->name_off); 2910 + if (!var_name) { 2911 + pr_debug("sec '%s': failed to find name of DATASEC's member #%d\n", 2912 + sec_name, i); 2922 2913 return -ENOENT; 2923 2914 } 2924 2915 2925 - ret = find_elf_var_offset(obj, name, &off); 2926 - if (ret) { 2927 - pr_debug("No offset found in symbol table for VAR %s\n", 2928 - name); 2916 + sym = find_elf_var_sym(obj, var_name); 2917 + if (IS_ERR(sym)) { 2918 + pr_debug("sec '%s': failed to find ELF symbol for VAR '%s'\n", 2919 + sec_name, var_name); 2929 2920 return -ENOENT; 2930 2921 } 2931 2922 2932 - vsi->offset = off; 2923 + if (fixup_offsets) 2924 + vsi->offset = sym->st_value; 2925 + 2926 + /* if variable is a global/weak symbol, but has restricted 2927 + * (STV_HIDDEN or STV_INTERNAL) visibility, mark its BTF VAR 
2928 + * as static. This follows similar logic for functions (BPF 2929 + * subprogs) and influences libbpf's further decisions about 2930 + * whether to make global data BPF array maps as 2931 + * BPF_F_MMAPABLE. 2932 + */ 2933 + if (ELF64_ST_VISIBILITY(sym->st_other) == STV_HIDDEN 2934 + || ELF64_ST_VISIBILITY(sym->st_other) == STV_INTERNAL) 2935 + var->linkage = BTF_VAR_STATIC; 2933 2936 } 2934 2937 2935 2938 sort_vars: ··· 2969 2908 return 0; 2970 2909 } 2971 2910 2972 - static int btf_finalize_data(struct bpf_object *obj, struct btf *btf) 2911 + static int bpf_object_fixup_btf(struct bpf_object *obj) 2973 2912 { 2974 - int err = 0; 2975 - __u32 i, n = btf__type_cnt(btf); 2913 + int i, n, err = 0; 2976 2914 2915 + if (!obj->btf) 2916 + return 0; 2917 + 2918 + n = btf__type_cnt(obj->btf); 2977 2919 for (i = 1; i < n; i++) { 2978 - struct btf_type *t = btf_type_by_id(btf, i); 2920 + struct btf_type *t = btf_type_by_id(obj->btf, i); 2979 2921 2980 2922 /* Loader needs to fix up some of the things compiler 2981 2923 * couldn't get its hands on while emitting BTF. This ··· 2986 2922 * the info from the ELF itself for this purpose. 
2987 2923 */ 2988 2924 if (btf_is_datasec(t)) { 2989 - err = btf_fixup_datasec(obj, btf, t); 2925 + err = btf_fixup_datasec(obj, obj->btf, t); 2990 2926 if (err) 2991 - break; 2927 + return err; 2992 2928 } 2993 - } 2994 - 2995 - return libbpf_err(err); 2996 - } 2997 - 2998 - static int bpf_object__finalize_btf(struct bpf_object *obj) 2999 - { 3000 - int err; 3001 - 3002 - if (!obj->btf) 3003 - return 0; 3004 - 3005 - err = btf_finalize_data(obj, obj->btf); 3006 - if (err) { 3007 - pr_warn("Error finalizing %s: %d.\n", BTF_ELF_SEC, err); 3008 - return err; 3009 2929 } 3010 2930 3011 2931 return 0; ··· 4283 4235 return 0; 4284 4236 } 4285 4237 4286 - static int bpf_map_find_btf_info(struct bpf_object *obj, struct bpf_map *map) 4238 + static int map_fill_btf_type_info(struct bpf_object *obj, struct bpf_map *map) 4287 4239 { 4288 4240 int id; 4289 4241 ··· 7281 7233 err = err ? : bpf_object__check_endianness(obj); 7282 7234 err = err ? : bpf_object__elf_collect(obj); 7283 7235 err = err ? : bpf_object__collect_externs(obj); 7284 - err = err ? : bpf_object__finalize_btf(obj); 7236 + err = err ? : bpf_object_fixup_btf(obj); 7285 7237 err = err ? : bpf_object__init_maps(obj, opts); 7286 7238 err = err ? : bpf_object_init_progs(obj, opts); 7287 7239 err = err ? : bpf_object__collect_relos(obj);
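The datasec fixup above demotes a variable to BTF_VAR_STATIC when its ELF symbol has restricted visibility, and map_is_mmapable() later keys off that linkage. The visibility test itself can be sketched in isolation as below; sym_has_restricted_visibility() is a hypothetical helper name, not a libbpf function, and the sketch assumes a Linux system providing <elf.h>.

```c
#include <elf.h>
#include <stdbool.h>

/* Hypothetical helper: true when the symbol's visibility bits (the low two
 * bits of st_other) are STV_HIDDEN or STV_INTERNAL, which is the condition
 * the diff uses before setting var->linkage to BTF_VAR_STATIC.
 */
static bool sym_has_restricted_visibility(const Elf64_Sym *sym)
{
	unsigned char vis = ELF64_ST_VISIBILITY(sym->st_other);

	return vis == STV_HIDDEN || vis == STV_INTERNAL;
}
```

A symbol with default or protected visibility is left alone by this check, so a data section containing at least one such global variable would still be created with BPF_F_MMAPABLE.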
+1
tools/lib/bpf/libbpf_probes.c
···
 	case BPF_MAP_TYPE_SK_STORAGE:
 	case BPF_MAP_TYPE_INODE_STORAGE:
 	case BPF_MAP_TYPE_TASK_STORAGE:
+	case BPF_MAP_TYPE_CGRP_STORAGE:
 		btf_key_type_id = 1;
 		btf_value_type_id = 3;
 		value_size = 8;
+6 -10
tools/lib/bpf/usdt.c
··· 1225 1225 1226 1226 static int parse_usdt_arg(const char *arg_str, int arg_num, struct usdt_arg_spec *arg) 1227 1227 { 1228 - char *reg_name = NULL; 1228 + char reg_name[16]; 1229 1229 int arg_sz, len, reg_off; 1230 1230 long off; 1231 1231 1232 - if (sscanf(arg_str, " %d @ %ld ( %%%m[^)] ) %n", &arg_sz, &off, &reg_name, &len) == 3) { 1232 + if (sscanf(arg_str, " %d @ %ld ( %%%15[^)] ) %n", &arg_sz, &off, reg_name, &len) == 3) { 1233 1233 /* Memory dereference case, e.g., -4@-20(%rbp) */ 1234 1234 arg->arg_type = USDT_ARG_REG_DEREF; 1235 1235 arg->val_off = off; 1236 1236 reg_off = calc_pt_regs_off(reg_name); 1237 - free(reg_name); 1238 1237 if (reg_off < 0) 1239 1238 return reg_off; 1240 1239 arg->reg_off = reg_off; 1241 - } else if (sscanf(arg_str, " %d @ %%%ms %n", &arg_sz, &reg_name, &len) == 2) { 1240 + } else if (sscanf(arg_str, " %d @ %%%15s %n", &arg_sz, reg_name, &len) == 2) { 1242 1241 /* Register read case, e.g., -4@%eax */ 1243 1242 arg->arg_type = USDT_ARG_REG; 1244 1243 arg->val_off = 0; 1245 1244 1246 1245 reg_off = calc_pt_regs_off(reg_name); 1247 - free(reg_name); 1248 1246 if (reg_off < 0) 1249 1247 return reg_off; 1250 1248 arg->reg_off = reg_off; ··· 1454 1456 1455 1457 static int parse_usdt_arg(const char *arg_str, int arg_num, struct usdt_arg_spec *arg) 1456 1458 { 1457 - char *reg_name = NULL; 1459 + char reg_name[16]; 1458 1460 int arg_sz, len, reg_off; 1459 1461 long off; 1460 1462 1461 - if (sscanf(arg_str, " %d @ %ld ( %m[a-z0-9] ) %n", &arg_sz, &off, &reg_name, &len) == 3) { 1463 + if (sscanf(arg_str, " %d @ %ld ( %15[a-z0-9] ) %n", &arg_sz, &off, reg_name, &len) == 3) { 1462 1464 /* Memory dereference case, e.g., -8@-88(s0) */ 1463 1465 arg->arg_type = USDT_ARG_REG_DEREF; 1464 1466 arg->val_off = off; 1465 1467 reg_off = calc_pt_regs_off(reg_name); 1466 - free(reg_name); 1467 1468 if (reg_off < 0) 1468 1469 return reg_off; 1469 1470 arg->reg_off = reg_off; ··· 1471 1474 arg->arg_type = USDT_ARG_CONST; 1472 1475 arg->val_off = off; 
1473 1476 arg->reg_off = 0; 1474 - } else if (sscanf(arg_str, " %d @ %m[a-z0-9] %n", &arg_sz, &reg_name, &len) == 2) { 1477 + } else if (sscanf(arg_str, " %d @ %15[a-z0-9] %n", &arg_sz, reg_name, &len) == 2) { 1475 1478 /* Register read case, e.g., -8@a1 */ 1476 1479 arg->arg_type = USDT_ARG_REG; 1477 1480 arg->val_off = 0; 1478 1481 reg_off = calc_pt_regs_off(reg_name); 1479 - free(reg_name); 1480 1482 if (reg_off < 0) 1481 1483 return reg_off; 1482 1484 arg->reg_off = reg_off;
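The usdt.c change above replaces the allocating %m conversions with a fixed 16-byte buffer and a field width of 15, which removes the free() calls and caps how much sscanf() may store. A minimal sketch of the x86-style memory dereference case, assuming a hypothetical wrapper named parse_reg_deref() that is not part of libbpf:

```c
#include <stdio.h>

/* Hypothetical stand-alone parser for the "size@offset(%reg)" form,
 * e.g. "-4@-20(%rbp)". Returns 0 when all three fields were converted,
 * -1 otherwise.
 */
static int parse_reg_deref(const char *arg_str, int *arg_sz, long *off,
			   char reg_name[16])
{
	int len = 0;

	/* %15[^)] stores at most 15 characters plus the terminating NUL,
	 * so a 16-byte buffer cannot overflow whatever the input is.
	 */
	if (sscanf(arg_str, " %d @ %ld ( %%%15[^)] ) %n",
		   arg_sz, off, reg_name, &len) != 3)
		return -1;

	return 0;
}
```

An over-long register name is cut at 15 characters rather than written past the buffer; in the real code the truncated name would then presumably be rejected by calc_pt_regs_off() as an unknown register.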
+81
tools/testing/selftests/bpf/DENYLIST.aarch64
··· 1 + bloom_filter_map # libbpf: prog 'check_bloom': failed to attach: ERROR: strerror_r(-524)=22 2 + bpf_cookie/lsm 3 + bpf_cookie/multi_kprobe_attach_api 4 + bpf_cookie/multi_kprobe_link_api 5 + bpf_cookie/trampoline 6 + bpf_loop/check_callback_fn_stop # link unexpected error: -524 7 + bpf_loop/check_invalid_flags 8 + bpf_loop/check_nested_calls 9 + bpf_loop/check_non_constant_callback 10 + bpf_loop/check_nr_loops 11 + bpf_loop/check_null_callback_ctx 12 + bpf_loop/check_stack 13 + bpf_mod_race # bpf_mod_kfunc_race__attach unexpected error: -524 (errno 524) 14 + bpf_tcp_ca/dctcp_fallback 15 + btf_dump/btf_dump: var_data # find type id unexpected find type id: actual -2 < expected 0 16 + cgroup_hierarchical_stats # attach unexpected error: -524 (errno 524) 17 + d_path/basic # setup attach failed: -524 18 + deny_namespace # attach unexpected error: -524 (errno 524) 19 + fentry_fexit # fentry_attach unexpected error: -1 (errno 524) 20 + fentry_test # fentry_attach unexpected error: -1 (errno 524) 21 + fexit_sleep # fexit_attach fexit attach failed: -1 22 + fexit_stress # fexit attach unexpected fexit attach: actual -524 < expected 0 23 + fexit_test # fexit_attach unexpected error: -1 (errno 524) 24 + get_func_args_test # get_func_args_test__attach unexpected error: -524 (errno 524) (trampoline) 25 + get_func_ip_test # get_func_ip_test__attach unexpected error: -524 (errno 524) (trampoline) 26 + htab_update/reenter_update 27 + kfree_skb # attach fentry unexpected error: -524 (trampoline) 28 + kfunc_call/subprog # extern (var ksym) 'bpf_prog_active': not found in kernel BTF 29 + kfunc_call/subprog_lskel # skel unexpected error: -2 30 + kfunc_dynptr_param/dynptr_data_null # libbpf: prog 'dynptr_data_null': failed to attach: ERROR: strerror_r(-524)=22 31 + kprobe_multi_test/attach_api_addrs # bpf_program__attach_kprobe_multi_opts unexpected error: -95 32 + kprobe_multi_test/attach_api_pattern # bpf_program__attach_kprobe_multi_opts unexpected error: -95 33 + 
kprobe_multi_test/attach_api_syms # bpf_program__attach_kprobe_multi_opts unexpected error: -95 34 + kprobe_multi_test/bench_attach # bpf_program__attach_kprobe_multi_opts unexpected error: -95 35 + kprobe_multi_test/link_api_addrs # link_fd unexpected link_fd: actual -95 < expected 0 36 + kprobe_multi_test/link_api_syms # link_fd unexpected link_fd: actual -95 < expected 0 37 + kprobe_multi_test/skel_api # kprobe_multi__attach unexpected error: -524 (errno 524) 38 + ksyms_module/libbpf # 'bpf_testmod_ksym_percpu': not found in kernel BTF 39 + ksyms_module/lskel # test_ksyms_module_lskel__open_and_load unexpected error: -2 40 + libbpf_get_fd_by_id_opts # test_libbpf_get_fd_by_id_opts__attach unexpected error: -524 (errno 524) 41 + lookup_key # test_lookup_key__attach unexpected error: -524 (errno 524) 42 + lru_bug # lru_bug__attach unexpected error: -524 (errno 524) 43 + modify_return # modify_return__attach failed unexpected error: -524 (errno 524) 44 + module_attach # skel_attach skeleton attach failed: -524 45 + mptcp/base # run_test mptcp unexpected error: -524 (errno 524) 46 + netcnt # packets unexpected packets: actual 10001 != expected 10000 47 + recursion # skel_attach unexpected error: -524 (errno 524) 48 + ringbuf # skel_attach skeleton attachment failed: -1 49 + setget_sockopt # attach_cgroup unexpected error: -524 50 + sk_storage_tracing # test_sk_storage_tracing__attach unexpected error: -524 (errno 524) 51 + skc_to_unix_sock # could not attach BPF object unexpected error: -524 (errno 524) 52 + socket_cookie # prog_attach unexpected error: -524 53 + stacktrace_build_id # compare_stack_ips stackmap vs. 
stack_amap err -1 errno 2 54 + task_local_storage/exit_creds # skel_attach unexpected error: -524 (errno 524) 55 + task_local_storage/recursion # skel_attach unexpected error: -524 (errno 524) 56 + test_bprm_opts # attach attach failed: -524 57 + test_ima # attach attach failed: -524 58 + test_local_storage # attach lsm attach failed: -524 59 + test_lsm # test_lsm_first_attach unexpected error: -524 (errno 524) 60 + test_overhead # attach_fentry unexpected error: -524 61 + timer # timer unexpected error: -524 (errno 524) 62 + timer_crash # timer_crash__attach unexpected error: -524 (errno 524) 63 + timer_mim # timer_mim unexpected error: -524 (errno 524) 64 + trace_printk # trace_printk__attach unexpected error: -1 (errno 524) 65 + trace_vprintk # trace_vprintk__attach unexpected error: -1 (errno 524) 66 + tracing_struct # tracing_struct__attach unexpected error: -524 (errno 524) 67 + trampoline_count # attach_prog unexpected error: -524 68 + unpriv_bpf_disabled # skel_attach unexpected error: -524 (errno 524) 69 + user_ringbuf/test_user_ringbuf_post_misaligned # misaligned_skel unexpected error: -524 (errno 524) 70 + user_ringbuf/test_user_ringbuf_post_producer_wrong_offset 71 + user_ringbuf/test_user_ringbuf_post_larger_than_ringbuf_sz 72 + user_ringbuf/test_user_ringbuf_basic # ringbuf_basic_skel unexpected error: -524 (errno 524) 73 + user_ringbuf/test_user_ringbuf_sample_full_ring_buffer 74 + user_ringbuf/test_user_ringbuf_post_alignment_autoadjust 75 + user_ringbuf/test_user_ringbuf_overfill 76 + user_ringbuf/test_user_ringbuf_discards_properly_ignored 77 + user_ringbuf/test_user_ringbuf_loop 78 + user_ringbuf/test_user_ringbuf_msg_protocol 79 + user_ringbuf/test_user_ringbuf_blocking_reserve 80 + verify_pkcs7_sig # test_verify_pkcs7_sig__attach unexpected error: -524 (errno 524) 81 + vmlinux # skel_attach skeleton attach failed: -524
+1
tools/testing/selftests/bpf/DENYLIST.s390x
···
 bpf_tcp_ca # JIT does not support calling kernel function (kfunc)
 cb_refs # expected error message unexpected error: -524 (trampoline)
 cgroup_hierarchical_stats # JIT does not support calling kernel function (kfunc)
+cgrp_local_storage # prog_attach unexpected error: -524 (trampoline)
 core_read_macros # unknown func bpf_probe_read#4 (overlapping)
 d_path # failed to auto-attach program 'prog_stat': -524 (trampoline)
 deny_namespace # failed to attach: ERROR: strerror_r(-524)=22 (trampoline)
+5 -3
tools/testing/selftests/bpf/Makefile
···
 	test_subskeleton.skel.h test_subskeleton_lib.skel.h \
 	test_usdt.skel.h

-LSKELS := fentry_test.c fexit_test.c fexit_sleep.c \
-	test_ringbuf.c atomics.c trace_printk.c trace_vprintk.c \
-	map_ptr_kern.c core_kern.c core_kern_overflow.c
+LSKELS := fentry_test.c fexit_test.c fexit_sleep.c atomics.c \
+	trace_printk.c trace_vprintk.c map_ptr_kern.c \
+	core_kern.c core_kern_overflow.c test_ringbuf.c \
+	test_ringbuf_map_key.c
+
 # Generate both light skeleton and libbpf skeleton for these
 LSKELS_EXTRA := test_ksyms_module.c test_ksyms_weak.c kfunc_call_test.c \
 	kfunc_call_test_subprog.c
+41 -1
tools/testing/selftests/bpf/README.rst
···

 __ /Documentation/bpf/bpf_devel_QA.rst#q-how-to-run-bpf-selftests

+=============
+BPF CI System
+=============
+
+BPF employs a continuous integration (CI) system to check patch submission in an
+automated fashion. The system runs selftests for each patch in a series. Results
+are propagated to patchwork, where failures are highlighted similar to
+violations of other checks (such as additional warnings being emitted or a
+``scripts/checkpatch.pl`` reported deficiency):
+
+https://patchwork.kernel.org/project/netdevbpf/list/?delegate=121173
+
+The CI system executes tests on multiple architectures. It uses a kernel
+configuration derived from both the generic and architecture specific config
+file fragments below ``tools/testing/selftests/bpf/`` (e.g., ``config`` and
+``config.x86_64``).
+
+Denylisting Tests
+=================
+
+It is possible for some architectures to not have support for all BPF features.
+In such a case tests in CI may fail. An example of such a shortcoming is BPF
+trampoline support on IBM's s390x architecture. For cases like this, an in-tree
+deny list file, located at ``tools/testing/selftests/bpf/DENYLIST.<arch>``, can
+be used to prevent the test from running on such an architecture.
+
+In addition to that, the generic ``tools/testing/selftests/bpf/DENYLIST`` is
+honored on every architecture running tests.
+
+These files are organized in three columns. The first column lists the test in
+question. This can be the name of a test suite or of an individual test. The
+remaining two columns provide additional meta data that helps identify and
+classify the entry: column two is a copy and paste of the error being reported
+when running the test in the setting in question. The third column, if
+available, summarizes the underlying problem. A value of ``trampoline``, for
+example, indicates that lack of trampoline support is causing the test to fail.
+This last entry helps identify tests that can be re-enabled once such support is
+added.
+
 =========================
 Running Selftests in a VM
 =========================

 It's now possible to run the selftests using ``tools/testing/selftests/bpf/vmtest.sh``.
 The script tries to ensure that the tests are run with the same environment as they
-would be run post-submit in the CI used by the Maintainers.
+would be run post-submit in the CI used by the Maintainers, with the exception
+that deny lists are not automatically honored.

 This script uses the in-tree kernel configuration and downloads a VM userspace
 image from the system used by the CI. It builds the kernel (without overwriting
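The three-column layout described in the README text above can be illustrated with a small parser. This is only a sketch under stated assumptions: the names `denylist_entry` and `parse_denylist_line` are hypothetical, the buffer sizes are arbitrary, and the rule that a trailing parenthesised group is the summary unless it starts with `errno` is a heuristic chosen for this example. It is not how the CI tooling parses these files.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* One parsed deny list entry; the field sizes are arbitrary for this sketch. */
struct denylist_entry {
	char test[128];    /* column one: test suite or suite/subtest name */
	char error[256];   /* column two: reported error text, may be empty */
	char summary[64];  /* column three: e.g. "trampoline", may be empty */
};

/* Copy [start, end) into dst with surrounding whitespace trimmed. */
static void copy_trimmed(char *dst, size_t dst_sz,
			 const char *start, const char *end)
{
	size_t len;

	while (start < end && isspace((unsigned char)*start))
		start++;
	while (end > start && isspace((unsigned char)end[-1]))
		end--;
	len = (size_t)(end - start);
	if (len >= dst_sz)
		len = dst_sz - 1;
	memcpy(dst, start, len);
	dst[len] = '\0';
}

/* Returns 0 on success, -1 if the line carries no test name. */
static int parse_denylist_line(const char *line, struct denylist_entry *e)
{
	const char *end = line + strlen(line);
	const char *hash = strchr(line, '#');
	const char *open;

	memset(e, 0, sizeof(*e));
	copy_trimmed(e->test, sizeof(e->test), line, hash ? hash : end);
	if (e->test[0] == '\0')
		return -1;
	if (!hash)
		return 0;

	while (end > hash && isspace((unsigned char)end[-1]))
		end--;

	/* Assumption: a trailing "(...)" is the summary column, except for
	 * groups such as "(errno 524)" that belong to the error text.
	 */
	if (end > hash + 1 && end[-1] == ')') {
		open = end - 1;
		while (open > hash && *open != '(')
			open--;
		if (*open == '(' && strncmp(open + 1, "errno", 5) != 0) {
			copy_trimmed(e->summary, sizeof(e->summary),
				     open + 1, end - 1);
			end = open;
		}
	}
	copy_trimmed(e->error, sizeof(e->error), hash + 1, end);
	return 0;
}
```

Feeding it the `cb_refs` line from `DENYLIST.s390x` would yield the test name `cb_refs`, the copied error text, and the summary `trampoline`; an entry without a `#` leaves both trailing columns empty.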
+24
tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
···
 	}
 }

+noinline int bpf_testmod_fentry_test1(int a)
+{
+	return a + 1;
+}
+
+noinline int bpf_testmod_fentry_test2(int a, u64 b)
+{
+	return a + b;
+}
+
+noinline int bpf_testmod_fentry_test3(char a, int b, u64 c)
+{
+	return a + b + c;
+}
+
+int bpf_testmod_fentry_ok;
+
 noinline ssize_t
 bpf_testmod_test_read(struct file *file, struct kobject *kobj,
 		      struct bin_attribute *bin_attr,
···
 		return snprintf(buf, len, "%d\n", writable.val);
 	}

+	if (bpf_testmod_fentry_test1(1) != 2 ||
+	    bpf_testmod_fentry_test2(2, 3) != 5 ||
+	    bpf_testmod_fentry_test3(4, 5, 6) != 15)
+		goto out;
+
+	bpf_testmod_fentry_ok = 1;
+out:
 	return -EIO; /* always fail */
 }
 EXPORT_SYMBOL(bpf_testmod_test_read);
+2
tools/testing/selftests/bpf/config
···
 CONFIG_BLK_DEV_LOOP=y
+CONFIG_BOOTPARAM_HARDLOCKUP_PANIC=y
+CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC=y
 CONFIG_BPF=y
 CONFIG_BPF_EVENTS=y
 CONFIG_BPF_JIT=y
+181
tools/testing/selftests/bpf/config.aarch64
··· 1 + CONFIG_9P_FS=y 2 + CONFIG_ARCH_VEXPRESS=y 3 + CONFIG_ARCH_WANT_DEFAULT_BPF_JIT=y 4 + CONFIG_ARM_SMMU_V3=y 5 + CONFIG_ATA=y 6 + CONFIG_AUDIT=y 7 + CONFIG_BINFMT_MISC=y 8 + CONFIG_BLK_CGROUP=y 9 + CONFIG_BLK_DEV_BSGLIB=y 10 + CONFIG_BLK_DEV_INITRD=y 11 + CONFIG_BLK_DEV_IO_TRACE=y 12 + CONFIG_BLK_DEV_RAM=y 13 + CONFIG_BLK_DEV_SD=y 14 + CONFIG_BONDING=y 15 + CONFIG_BPFILTER=y 16 + CONFIG_BPF_JIT_ALWAYS_ON=y 17 + CONFIG_BPF_JIT_DEFAULT_ON=y 18 + CONFIG_BPF_PRELOAD_UMD=y 19 + CONFIG_BPF_PRELOAD=y 20 + CONFIG_BRIDGE=m 21 + CONFIG_CGROUP_CPUACCT=y 22 + CONFIG_CGROUP_DEVICE=y 23 + CONFIG_CGROUP_FREEZER=y 24 + CONFIG_CGROUP_HUGETLB=y 25 + CONFIG_CGROUP_NET_CLASSID=y 26 + CONFIG_CGROUP_PERF=y 27 + CONFIG_CGROUP_PIDS=y 28 + CONFIG_CGROUP_SCHED=y 29 + CONFIG_CGROUPS=y 30 + CONFIG_CHECKPOINT_RESTORE=y 31 + CONFIG_CHR_DEV_SG=y 32 + CONFIG_COMPAT=y 33 + CONFIG_CPUSETS=y 34 + CONFIG_CRASH_DUMP=y 35 + CONFIG_CRYPTO_USER_API_RNG=y 36 + CONFIG_CRYPTO_USER_API_SKCIPHER=y 37 + CONFIG_DEBUG_ATOMIC_SLEEP=y 38 + CONFIG_DEBUG_INFO_BTF=y 39 + CONFIG_DEBUG_INFO_DWARF4=y 40 + CONFIG_DEBUG_LIST=y 41 + CONFIG_DEBUG_LOCKDEP=y 42 + CONFIG_DEBUG_NOTIFIERS=y 43 + CONFIG_DEBUG_PAGEALLOC=y 44 + CONFIG_DEBUG_SECTION_MISMATCH=y 45 + CONFIG_DEBUG_SG=y 46 + CONFIG_DETECT_HUNG_TASK=y 47 + CONFIG_DEVTMPFS_MOUNT=y 48 + CONFIG_DEVTMPFS=y 49 + CONFIG_DRM_VIRTIO_GPU=y 50 + CONFIG_DRM=y 51 + CONFIG_DUMMY=y 52 + CONFIG_EXPERT=y 53 + CONFIG_EXT4_FS_POSIX_ACL=y 54 + CONFIG_EXT4_FS_SECURITY=y 55 + CONFIG_EXT4_FS=y 56 + CONFIG_FANOTIFY=y 57 + CONFIG_FB=y 58 + CONFIG_FUNCTION_PROFILER=y 59 + CONFIG_FUSE_FS=y 60 + CONFIG_FW_CFG_SYSFS_CMDLINE=y 61 + CONFIG_FW_CFG_SYSFS=y 62 + CONFIG_GDB_SCRIPTS=y 63 + CONFIG_HAVE_EBPF_JIT=y 64 + CONFIG_HAVE_KPROBES_ON_FTRACE=y 65 + CONFIG_HAVE_KPROBES=y 66 + CONFIG_HAVE_KRETPROBES=y 67 + CONFIG_HEADERS_INSTALL=y 68 + CONFIG_HIGH_RES_TIMERS=y 69 + CONFIG_HUGETLBFS=y 70 + CONFIG_HW_RANDOM_VIRTIO=y 71 + CONFIG_HW_RANDOM=y 72 + CONFIG_HZ_100=y 73 + CONFIG_IDLE_PAGE_TRACKING=y 74 + 
CONFIG_IKHEADERS=y 75 + CONFIG_INET6_ESP=y 76 + CONFIG_INET_ESP=y 77 + CONFIG_INET=y 78 + CONFIG_INPUT_EVDEV=y 79 + CONFIG_IP_ADVANCED_ROUTER=y 80 + CONFIG_IP_MULTICAST=y 81 + CONFIG_IP_MULTIPLE_TABLES=y 82 + CONFIG_IP_NF_IPTABLES=y 83 + CONFIG_IPV6_SEG6_LWTUNNEL=y 84 + CONFIG_IPVLAN=y 85 + CONFIG_JUMP_LABEL=y 86 + CONFIG_KERNEL_UNCOMPRESSED=y 87 + CONFIG_KPROBES_ON_FTRACE=y 88 + CONFIG_KPROBES=y 89 + CONFIG_KRETPROBES=y 90 + CONFIG_KSM=y 91 + CONFIG_LATENCYTOP=y 92 + CONFIG_LIVEPATCH=y 93 + CONFIG_LOCK_STAT=y 94 + CONFIG_MACVLAN=y 95 + CONFIG_MACVTAP=y 96 + CONFIG_MAGIC_SYSRQ=y 97 + CONFIG_MAILBOX=y 98 + CONFIG_MEMCG=y 99 + CONFIG_MEMORY_HOTPLUG=y 100 + CONFIG_MEMORY_HOTREMOVE=y 101 + CONFIG_NAMESPACES=y 102 + CONFIG_NET_9P_VIRTIO=y 103 + CONFIG_NET_9P=y 104 + CONFIG_NET_ACT_BPF=y 105 + CONFIG_NET_ACT_GACT=y 106 + CONFIG_NETDEVICES=y 107 + CONFIG_NETFILTER_XT_MATCH_BPF=y 108 + CONFIG_NETFILTER_XT_TARGET_MARK=y 109 + CONFIG_NET_KEY=y 110 + CONFIG_NET_SCH_FQ=y 111 + CONFIG_NET_VRF=y 112 + CONFIG_NET=y 113 + CONFIG_NF_TABLES=y 114 + CONFIG_NLMON=y 115 + CONFIG_NO_HZ_IDLE=y 116 + CONFIG_NR_CPUS=256 117 + CONFIG_NUMA=y 118 + CONFIG_OVERLAY_FS=y 119 + CONFIG_PACKET_DIAG=y 120 + CONFIG_PACKET=y 121 + CONFIG_PANIC_ON_OOPS=y 122 + CONFIG_PARTITION_ADVANCED=y 123 + CONFIG_PCI_HOST_GENERIC=y 124 + CONFIG_PCI=y 125 + CONFIG_PL320_MBOX=y 126 + CONFIG_POSIX_MQUEUE=y 127 + CONFIG_PROC_KCORE=y 128 + CONFIG_PROFILING=y 129 + CONFIG_PROVE_LOCKING=y 130 + CONFIG_PTDUMP_DEBUGFS=y 131 + CONFIG_RC_DEVICES=y 132 + CONFIG_RC_LOOPBACK=y 133 + CONFIG_RTC_CLASS=y 134 + CONFIG_RTC_DRV_PL031=y 135 + CONFIG_RT_GROUP_SCHED=y 136 + CONFIG_SAMPLE_SECCOMP=y 137 + CONFIG_SAMPLES=y 138 + CONFIG_SCHED_AUTOGROUP=y 139 + CONFIG_SCHED_TRACER=y 140 + CONFIG_SCSI_CONSTANTS=y 141 + CONFIG_SCSI_LOGGING=y 142 + CONFIG_SCSI_SCAN_ASYNC=y 143 + CONFIG_SCSI_VIRTIO=y 144 + CONFIG_SCSI=y 145 + CONFIG_SECURITY_NETWORK=y 146 + CONFIG_SERIAL_AMBA_PL011_CONSOLE=y 147 + CONFIG_SERIAL_AMBA_PL011=y 148 + 
CONFIG_STACK_TRACER=y 149 + CONFIG_STATIC_KEYS_SELFTEST=y 150 + CONFIG_SYSVIPC=y 151 + CONFIG_TASK_DELAY_ACCT=y 152 + CONFIG_TASK_IO_ACCOUNTING=y 153 + CONFIG_TASKSTATS=y 154 + CONFIG_TASK_XACCT=y 155 + CONFIG_TCG_TIS=y 156 + CONFIG_TCG_TPM=y 157 + CONFIG_TCP_CONG_ADVANCED=y 158 + CONFIG_TCP_CONG_DCTCP=y 159 + CONFIG_TLS=y 160 + CONFIG_TMPFS_POSIX_ACL=y 161 + CONFIG_TMPFS=y 162 + CONFIG_TRACER_SNAPSHOT_PER_CPU_SWAP=y 163 + CONFIG_TRANSPARENT_HUGEPAGE=y 164 + CONFIG_TUN=y 165 + CONFIG_UNIX=y 166 + CONFIG_UPROBES=y 167 + CONFIG_USELIB=y 168 + CONFIG_USER_NS=y 169 + CONFIG_VETH=y 170 + CONFIG_VIRTIO_BALLOON=y 171 + CONFIG_VIRTIO_BLK=y 172 + CONFIG_VIRTIO_CONSOLE=y 173 + CONFIG_VIRTIO_FS=y 174 + CONFIG_VIRTIO_INPUT=y 175 + CONFIG_VIRTIO_MMIO_CMDLINE_DEVICES=y 176 + CONFIG_VIRTIO_MMIO=y 177 + CONFIG_VIRTIO_NET=y 178 + CONFIG_VIRTIO_PCI=y 179 + CONFIG_VLAN_8021Q=y 180 + CONFIG_VSOCKETS=y 181 + CONFIG_XFRM_USER=y
-3
tools/testing/selftests/bpf/config.s390x
···
 CONFIG_MEMCG=y
 CONFIG_MEMORY_HOTPLUG=y
 CONFIG_MEMORY_HOTREMOVE=y
-CONFIG_MODULE_SIG=y
-CONFIG_MODULE_UNLOAD=y
-CONFIG_MODULES=y
 CONFIG_NAMESPACES=y
 CONFIG_NET=y
 CONFIG_NET_9P=y
-1
tools/testing/selftests/bpf/config.x86_64
···
 CONFIG_BLK_DEV_RAM_SIZE=16384
 CONFIG_BLK_DEV_THROTTLING=y
 CONFIG_BONDING=y
-CONFIG_BOOTPARAM_HARDLOCKUP_PANIC=y
 CONFIG_BOOTTIME_TRACING=y
 CONFIG_BPF_JIT_ALWAYS_ON=y
 CONFIG_BPF_KPROBE_OVERRIDE=y
+14 -6
tools/testing/selftests/bpf/prog_tests/bpf_iter.c
···
 {
 	__u64 val, expected_val = 0, res_first_val, first_val = 0;
 	DECLARE_LIBBPF_OPTS(bpf_iter_attach_opts, opts);
-	__u32 expected_key = 0, res_first_key;
+	__u32 key, expected_key = 0, res_first_key;
+	int err, i, map_fd, hash_fd, iter_fd;
 	struct bpf_iter_bpf_array_map *skel;
 	union bpf_iter_link_info linfo;
-	int err, i, map_fd, iter_fd;
 	struct bpf_link *link;
 	char buf[64] = {};
 	int len, start;
···
 	if (!ASSERT_EQ(skel->bss->val_sum, expected_val, "val_sum"))
 		goto close_iter;

+	hash_fd = bpf_map__fd(skel->maps.hashmap1);
 	for (i = 0; i < bpf_map__max_entries(skel->maps.arraymap1); i++) {
 		err = bpf_map_lookup_elem(map_fd, &i, &val);
-		if (!ASSERT_OK(err, "map_lookup"))
-			goto out;
-		if (!ASSERT_EQ(i, val, "invalid_val"))
-			goto out;
+		if (!ASSERT_OK(err, "map_lookup arraymap1"))
+			goto close_iter;
+		if (!ASSERT_EQ(i, val, "invalid_val arraymap1"))
+			goto close_iter;
+
+		val = i + 4;
+		err = bpf_map_lookup_elem(hash_fd, &val, &key);
+		if (!ASSERT_OK(err, "map_lookup hashmap1"))
+			goto close_iter;
+		if (!ASSERT_EQ(key, val - 4, "invalid_val hashmap1"))
+			goto close_iter;
 	}

 close_iter:
+171
tools/testing/selftests/bpf/prog_tests/cgrp_local_storage.c
··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + /* Copyright (c) 2022 Meta Platforms, Inc. and affiliates.*/ 3 + 4 + #define _GNU_SOURCE 5 + #include <unistd.h> 6 + #include <sys/syscall.h> 7 + #include <sys/types.h> 8 + #include <test_progs.h> 9 + #include "cgrp_ls_tp_btf.skel.h" 10 + #include "cgrp_ls_recursion.skel.h" 11 + #include "cgrp_ls_attach_cgroup.skel.h" 12 + #include "cgrp_ls_negative.skel.h" 13 + #include "network_helpers.h" 14 + 15 + struct socket_cookie { 16 + __u64 cookie_key; 17 + __u32 cookie_value; 18 + }; 19 + 20 + static void test_tp_btf(int cgroup_fd) 21 + { 22 + struct cgrp_ls_tp_btf *skel; 23 + long val1 = 1, val2 = 0; 24 + int err; 25 + 26 + skel = cgrp_ls_tp_btf__open_and_load(); 27 + if (!ASSERT_OK_PTR(skel, "skel_open_and_load")) 28 + return; 29 + 30 + /* populate a value in map_b */ 31 + err = bpf_map_update_elem(bpf_map__fd(skel->maps.map_b), &cgroup_fd, &val1, BPF_ANY); 32 + if (!ASSERT_OK(err, "map_update_elem")) 33 + goto out; 34 + 35 + /* check value */ 36 + err = bpf_map_lookup_elem(bpf_map__fd(skel->maps.map_b), &cgroup_fd, &val2); 37 + if (!ASSERT_OK(err, "map_lookup_elem")) 38 + goto out; 39 + if (!ASSERT_EQ(val2, 1, "map_lookup_elem, invalid val")) 40 + goto out; 41 + 42 + /* delete value */ 43 + err = bpf_map_delete_elem(bpf_map__fd(skel->maps.map_b), &cgroup_fd); 44 + if (!ASSERT_OK(err, "map_delete_elem")) 45 + goto out; 46 + 47 + skel->bss->target_pid = syscall(SYS_gettid); 48 + 49 + err = cgrp_ls_tp_btf__attach(skel); 50 + if (!ASSERT_OK(err, "skel_attach")) 51 + goto out; 52 + 53 + syscall(SYS_gettid); 54 + syscall(SYS_gettid); 55 + 56 + skel->bss->target_pid = 0; 57 + 58 + /* 3x syscalls: 1x attach and 2x gettid */ 59 + ASSERT_EQ(skel->bss->enter_cnt, 3, "enter_cnt"); 60 + ASSERT_EQ(skel->bss->exit_cnt, 3, "exit_cnt"); 61 + ASSERT_EQ(skel->bss->mismatch_cnt, 0, "mismatch_cnt"); 62 + out: 63 + cgrp_ls_tp_btf__destroy(skel); 64 + } 65 + 66 + static void test_attach_cgroup(int cgroup_fd) 67 + { 68 + int server_fd = 0, 
client_fd = 0, err = 0; 69 + socklen_t addr_len = sizeof(struct sockaddr_in6); 70 + struct cgrp_ls_attach_cgroup *skel; 71 + __u32 cookie_expected_value; 72 + struct sockaddr_in6 addr; 73 + struct socket_cookie val; 74 + 75 + skel = cgrp_ls_attach_cgroup__open_and_load(); 76 + if (!ASSERT_OK_PTR(skel, "skel_open")) 77 + return; 78 + 79 + skel->links.set_cookie = bpf_program__attach_cgroup( 80 + skel->progs.set_cookie, cgroup_fd); 81 + if (!ASSERT_OK_PTR(skel->links.set_cookie, "prog_attach")) 82 + goto out; 83 + 84 + skel->links.update_cookie_sockops = bpf_program__attach_cgroup( 85 + skel->progs.update_cookie_sockops, cgroup_fd); 86 + if (!ASSERT_OK_PTR(skel->links.update_cookie_sockops, "prog_attach")) 87 + goto out; 88 + 89 + skel->links.update_cookie_tracing = bpf_program__attach( 90 + skel->progs.update_cookie_tracing); 91 + if (!ASSERT_OK_PTR(skel->links.update_cookie_tracing, "prog_attach")) 92 + goto out; 93 + 94 + server_fd = start_server(AF_INET6, SOCK_STREAM, "::1", 0, 0); 95 + if (!ASSERT_GE(server_fd, 0, "start_server")) 96 + goto out; 97 + 98 + client_fd = connect_to_fd(server_fd, 0); 99 + if (!ASSERT_GE(client_fd, 0, "connect_to_fd")) 100 + goto close_server_fd; 101 + 102 + err = bpf_map_lookup_elem(bpf_map__fd(skel->maps.socket_cookies), 103 + &cgroup_fd, &val); 104 + if (!ASSERT_OK(err, "map_lookup(socket_cookies)")) 105 + goto close_client_fd; 106 + 107 + err = getsockname(client_fd, (struct sockaddr *)&addr, &addr_len); 108 + if (!ASSERT_OK(err, "getsockname")) 109 + goto close_client_fd; 110 + 111 + cookie_expected_value = (ntohs(addr.sin6_port) << 8) | 0xFF; 112 + ASSERT_EQ(val.cookie_value, cookie_expected_value, "cookie_value"); 113 + 114 + close_client_fd: 115 + close(client_fd); 116 + close_server_fd: 117 + close(server_fd); 118 + out: 119 + cgrp_ls_attach_cgroup__destroy(skel); 120 + } 121 + 122 + static void test_recursion(int cgroup_fd) 123 + { 124 + struct cgrp_ls_recursion *skel; 125 + int err; 126 + 127 + skel = 
cgrp_ls_recursion__open_and_load(); 128 + if (!ASSERT_OK_PTR(skel, "skel_open_and_load")) 129 + return; 130 + 131 + err = cgrp_ls_recursion__attach(skel); 132 + if (!ASSERT_OK(err, "skel_attach")) 133 + goto out; 134 + 135 + /* trigger sys_enter, make sure it does not cause deadlock */ 136 + syscall(SYS_gettid); 137 + 138 + out: 139 + cgrp_ls_recursion__destroy(skel); 140 + } 141 + 142 + static void test_negative(void) 143 + { 144 + struct cgrp_ls_negative *skel; 145 + 146 + skel = cgrp_ls_negative__open_and_load(); 147 + if (!ASSERT_ERR_PTR(skel, "skel_open_and_load")) { 148 + cgrp_ls_negative__destroy(skel); 149 + return; 150 + } 151 + } 152 + 153 + void test_cgrp_local_storage(void) 154 + { 155 + int cgroup_fd; 156 + 157 + cgroup_fd = test__join_cgroup("/cgrp_local_storage"); 158 + if (!ASSERT_GE(cgroup_fd, 0, "join_cgroup /cgrp_local_storage")) 159 + return; 160 + 161 + if (test__start_subtest("tp_btf")) 162 + test_tp_btf(cgroup_fd); 163 + if (test__start_subtest("attach_cgroup")) 164 + test_attach_cgroup(cgroup_fd); 165 + if (test__start_subtest("recursion")) 166 + test_recursion(cgroup_fd); 167 + if (test__start_subtest("negative")) 168 + test_negative(); 169 + 170 + close(cgroup_fd); 171 + }
+89
tools/testing/selftests/bpf/prog_tests/kprobe_multi_testmod_test.c
··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + #include <test_progs.h> 3 + #include "kprobe_multi.skel.h" 4 + #include "trace_helpers.h" 5 + #include "bpf/libbpf_internal.h" 6 + 7 + static void kprobe_multi_testmod_check(struct kprobe_multi *skel) 8 + { 9 + ASSERT_EQ(skel->bss->kprobe_testmod_test1_result, 1, "kprobe_test1_result"); 10 + ASSERT_EQ(skel->bss->kprobe_testmod_test2_result, 1, "kprobe_test2_result"); 11 + ASSERT_EQ(skel->bss->kprobe_testmod_test3_result, 1, "kprobe_test3_result"); 12 + 13 + ASSERT_EQ(skel->bss->kretprobe_testmod_test1_result, 1, "kretprobe_test1_result"); 14 + ASSERT_EQ(skel->bss->kretprobe_testmod_test2_result, 1, "kretprobe_test2_result"); 15 + ASSERT_EQ(skel->bss->kretprobe_testmod_test3_result, 1, "kretprobe_test3_result"); 16 + } 17 + 18 + static void test_testmod_attach_api(struct bpf_kprobe_multi_opts *opts) 19 + { 20 + struct kprobe_multi *skel = NULL; 21 + 22 + skel = kprobe_multi__open_and_load(); 23 + if (!ASSERT_OK_PTR(skel, "fentry_raw_skel_load")) 24 + return; 25 + 26 + skel->bss->pid = getpid(); 27 + 28 + skel->links.test_kprobe_testmod = bpf_program__attach_kprobe_multi_opts( 29 + skel->progs.test_kprobe_testmod, 30 + NULL, opts); 31 + if (!skel->links.test_kprobe_testmod) 32 + goto cleanup; 33 + 34 + opts->retprobe = true; 35 + skel->links.test_kretprobe_testmod = bpf_program__attach_kprobe_multi_opts( 36 + skel->progs.test_kretprobe_testmod, 37 + NULL, opts); 38 + if (!skel->links.test_kretprobe_testmod) 39 + goto cleanup; 40 + 41 + ASSERT_OK(trigger_module_test_read(1), "trigger_read"); 42 + kprobe_multi_testmod_check(skel); 43 + 44 + cleanup: 45 + kprobe_multi__destroy(skel); 46 + } 47 + 48 + static void test_testmod_attach_api_addrs(void) 49 + { 50 + LIBBPF_OPTS(bpf_kprobe_multi_opts, opts); 51 + unsigned long long addrs[3]; 52 + 53 + addrs[0] = ksym_get_addr("bpf_testmod_fentry_test1"); 54 + ASSERT_NEQ(addrs[0], 0, "ksym_get_addr"); 55 + addrs[1] = ksym_get_addr("bpf_testmod_fentry_test2"); 56 + 
ASSERT_NEQ(addrs[1], 0, "ksym_get_addr"); 57 + addrs[2] = ksym_get_addr("bpf_testmod_fentry_test3"); 58 + ASSERT_NEQ(addrs[2], 0, "ksym_get_addr"); 59 + 60 + opts.addrs = (const unsigned long *) addrs; 61 + opts.cnt = ARRAY_SIZE(addrs); 62 + 63 + test_testmod_attach_api(&opts); 64 + } 65 + 66 + static void test_testmod_attach_api_syms(void) 67 + { 68 + LIBBPF_OPTS(bpf_kprobe_multi_opts, opts); 69 + const char *syms[3] = { 70 + "bpf_testmod_fentry_test1", 71 + "bpf_testmod_fentry_test2", 72 + "bpf_testmod_fentry_test3", 73 + }; 74 + 75 + opts.syms = syms; 76 + opts.cnt = ARRAY_SIZE(syms); 77 + test_testmod_attach_api(&opts); 78 + } 79 + 80 + void serial_test_kprobe_multi_testmod_test(void) 81 + { 82 + if (!ASSERT_OK(load_kallsyms_refresh(), "load_kallsyms_refresh")) 83 + return; 84 + 85 + if (test__start_subtest("testmod_attach_api_syms")) 86 + test_testmod_attach_api_syms(); 87 + if (test__start_subtest("testmod_attach_api_addrs")) 88 + test_testmod_attach_api_addrs(); 89 + }
+8
tools/testing/selftests/bpf/prog_tests/libbpf_str.c
···
 		snprintf(buf, sizeof(buf), "BPF_MAP_TYPE_%s", map_type_str);
 		uppercase(buf);

+		/* Special case for map_type_name BPF_MAP_TYPE_CGROUP_STORAGE_DEPRECATED
+		 * where it and BPF_MAP_TYPE_CGROUP_STORAGE have the same enum value
+		 * (map_type). For this enum value, libbpf_bpf_map_type_str() picks
+		 * BPF_MAP_TYPE_CGROUP_STORAGE.
+		 */
+		if (strcmp(map_type_name, "BPF_MAP_TYPE_CGROUP_STORAGE_DEPRECATED") == 0)
+			continue;
+
 		ASSERT_STREQ(buf, map_type_name, "exp_str_value");
 	}

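The special case in the comment above comes from two enumerator names sharing one numeric value, so a value-to-string lookup can return only one of them. A minimal sketch of that situation follows; the `demo_map_type` enum and `demo_map_type_str()` are hypothetical stand-ins for illustration, not the kernel's `enum bpf_map_type` or libbpf's `libbpf_bpf_map_type_str()`.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical enum for illustration only. */
enum demo_map_type {
	DEMO_MAP_TYPE_HASH = 1,
	DEMO_MAP_TYPE_CGROUP_STORAGE_DEPRECATED = 2,
	/* Alias: same numeric value as the _DEPRECATED name above. */
	DEMO_MAP_TYPE_CGROUP_STORAGE = DEMO_MAP_TYPE_CGROUP_STORAGE_DEPRECATED,
	DEMO_MAP_TYPE_CGRP_STORAGE = 3,
};

/* A switch can carry one case label per value, so only one of the two
 * aliased names can be mapped to a string.
 */
static const char *demo_map_type_str(enum demo_map_type t)
{
	switch (t) {
	case DEMO_MAP_TYPE_HASH:
		return "hash";
	case DEMO_MAP_TYPE_CGROUP_STORAGE:
		return "cgroup_storage";
	case DEMO_MAP_TYPE_CGRP_STORAGE:
		return "cgrp_storage";
	default:
		return NULL;
	}
}
```

Looking up `DEMO_MAP_TYPE_CGROUP_STORAGE_DEPRECATED` returns `"cgroup_storage"`, so a test that rebuilds the enumerator name from the string cannot match the deprecated name and has to skip it, which is what the `continue` in the hunk does.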
+7
tools/testing/selftests/bpf/prog_tests/module_attach.c
···
 	ASSERT_ERR(delete_module("bpf_testmod", 0), "delete_module");
 	bpf_link__destroy(link);

+	link = bpf_program__attach(skel->progs.kprobe_multi);
+	if (!ASSERT_OK_PTR(link, "attach_kprobe_multi"))
+		goto cleanup;
+
+	ASSERT_ERR(delete_module("bpf_testmod", 0), "delete_module");
+	bpf_link__destroy(link);
+
 cleanup:
 	test_module_attach__destroy(skel);
 }
+65 -1
tools/testing/selftests/bpf/prog_tests/ringbuf.c
··· 13 13 #include <linux/perf_event.h> 14 14 #include <linux/ring_buffer.h> 15 15 #include "test_ringbuf.lskel.h" 16 + #include "test_ringbuf_map_key.lskel.h" 16 17 17 18 #define EDONE 7777 18 19 ··· 59 58 } 60 59 } 61 60 61 + static struct test_ringbuf_map_key_lskel *skel_map_key; 62 62 static struct test_ringbuf_lskel *skel; 63 63 static struct ring_buffer *ringbuf; 64 64 ··· 83 81 return (void *)(long)ring_buffer__poll(ringbuf, timeout); 84 82 } 85 83 86 - void test_ringbuf(void) 84 + static void ringbuf_subtest(void) 87 85 { 88 86 const size_t rec_sz = BPF_RINGBUF_HDR_SZ + sizeof(struct sample); 89 87 pthread_t thread; ··· 298 296 cleanup: 299 297 ring_buffer__free(ringbuf); 300 298 test_ringbuf_lskel__destroy(skel); 299 + } 300 + 301 + static int process_map_key_sample(void *ctx, void *data, size_t len) 302 + { 303 + struct sample *s; 304 + int err, val; 305 + 306 + s = data; 307 + switch (s->seq) { 308 + case 1: 309 + ASSERT_EQ(s->value, 42, "sample_value"); 310 + err = bpf_map_lookup_elem(skel_map_key->maps.hash_map.map_fd, 311 + s, &val); 312 + ASSERT_OK(err, "hash_map bpf_map_lookup_elem"); 313 + ASSERT_EQ(val, 1, "hash_map val"); 314 + return -EDONE; 315 + default: 316 + return 0; 317 + } 318 + } 319 + 320 + static void ringbuf_map_key_subtest(void) 321 + { 322 + int err; 323 + 324 + skel_map_key = test_ringbuf_map_key_lskel__open(); 325 + if (!ASSERT_OK_PTR(skel_map_key, "test_ringbuf_map_key_lskel__open")) 326 + return; 327 + 328 + skel_map_key->maps.ringbuf.max_entries = getpagesize(); 329 + skel_map_key->bss->pid = getpid(); 330 + 331 + err = test_ringbuf_map_key_lskel__load(skel_map_key); 332 + if (!ASSERT_OK(err, "test_ringbuf_map_key_lskel__load")) 333 + goto cleanup; 334 + 335 + ringbuf = ring_buffer__new(skel_map_key->maps.ringbuf.map_fd, 336 + process_map_key_sample, NULL, NULL); 337 + if (!ASSERT_OK_PTR(ringbuf, "ring_buffer__new")) 338 + goto cleanup; 339 + 340 + err = test_ringbuf_map_key_lskel__attach(skel_map_key); 341 + if 
(!ASSERT_OK(err, "test_ringbuf_map_key_lskel__attach")) 342 + goto cleanup_ringbuf; 343 + 344 + syscall(__NR_getpgid); 345 + ASSERT_EQ(skel_map_key->bss->seq, 1, "skel_map_key->bss->seq"); 346 + err = ring_buffer__poll(ringbuf, -1); 347 + ASSERT_EQ(err, -EDONE, "ring_buffer__poll"); 348 + 349 + cleanup_ringbuf: 350 + ring_buffer__free(ringbuf); 351 + cleanup: 352 + test_ringbuf_map_key_lskel__destroy(skel_map_key); 353 + } 354 + 355 + void test_ringbuf(void) 356 + { 357 + if (test__start_subtest("ringbuf")) 358 + ringbuf_subtest(); 359 + if (test__start_subtest("ringbuf_map_key")) 360 + ringbuf_map_key_subtest(); 301 361 }
+10 -1
tools/testing/selftests/bpf/prog_tests/skeleton.c
···
 /* Copyright (c) 2019 Facebook */

 #include <test_progs.h>
+#include <sys/mman.h>

 struct s {
 	int a;
···
 	struct test_skeleton__kconfig *kcfg;
 	const void *elf_bytes;
 	size_t elf_bytes_sz = 0;
-	int i;
+	void *m;
+	int i, fd;

 	skel = test_skeleton__open();
 	if (CHECK(!skel, "skel_open", "failed to open skeleton\n"))
···
 	ASSERT_EQ(skel->bss->out_mostly_var, 123, "out_mostly_var");

 	ASSERT_EQ(bss->huge_arr[ARRAY_SIZE(bss->huge_arr) - 1], 123, "huge_arr");
+
+	fd = bpf_map__fd(skel->maps.data_non_mmapable);
+	m = mmap(NULL, getpagesize(), PROT_READ, MAP_SHARED, fd, 0);
+	if (!ASSERT_EQ(m, MAP_FAILED, "unexpected_mmap_success"))
+		munmap(m, getpagesize());
+
+	ASSERT_EQ(bpf_map__map_flags(skel->maps.data_non_mmapable), 0, "non_mmap_flags");

 	elf_bytes = test_skeleton__elf_bytes(&elf_bytes_sz);
 	ASSERT_OK_PTR(elf_bytes, "elf_bytes");
+159 -5
tools/testing/selftests/bpf/prog_tests/task_local_storage.c
··· 3 3 4 4 #define _GNU_SOURCE /* See feature_test_macros(7) */ 5 5 #include <unistd.h> 6 + #include <sched.h> 7 + #include <pthread.h> 6 8 #include <sys/syscall.h> /* For SYS_xxx definitions */ 7 9 #include <sys/types.h> 8 10 #include <test_progs.h> 11 + #include "task_local_storage_helpers.h" 9 12 #include "task_local_storage.skel.h" 10 13 #include "task_local_storage_exit_creds.skel.h" 11 14 #include "task_ls_recursion.skel.h" 15 + #include "task_storage_nodeadlock.skel.h" 12 16 13 17 static void test_sys_enter_exit(void) 14 18 { ··· 43 39 static void test_exit_creds(void) 44 40 { 45 41 struct task_local_storage_exit_creds *skel; 46 - int err; 42 + int err, run_count, sync_rcu_calls = 0; 43 + const int MAX_SYNC_RCU_CALLS = 1000; 47 44 48 45 skel = task_local_storage_exit_creds__open_and_load(); 49 46 if (!ASSERT_OK_PTR(skel, "skel_open_and_load")) ··· 58 53 if (CHECK_FAIL(system("ls > /dev/null"))) 59 54 goto out; 60 55 61 - /* sync rcu to make sure exit_creds() is called for "ls" */ 62 - kern_sync_rcu(); 56 + /* kern_sync_rcu is not enough on its own as the read section we want 57 + * to wait for may start after we enter synchronize_rcu, so our call 58 + * won't wait for the section to finish. Loop on the run counter 59 + * as well to ensure the program has run. 
60 + */ 61 + do { 62 + kern_sync_rcu(); 63 + run_count = __atomic_load_n(&skel->bss->run_count, __ATOMIC_SEQ_CST); 64 + } while (run_count == 0 && ++sync_rcu_calls < MAX_SYNC_RCU_CALLS); 65 + 66 + ASSERT_NEQ(sync_rcu_calls, MAX_SYNC_RCU_CALLS, 67 + "sync_rcu count too high"); 68 + ASSERT_NEQ(run_count, 0, "run_count"); 63 69 ASSERT_EQ(skel->bss->valid_ptr_count, 0, "valid_ptr_count"); 64 70 ASSERT_NEQ(skel->bss->null_ptr_count, 0, "null_ptr_count"); 65 71 out: ··· 79 63 80 64 static void test_recursion(void) 81 65 { 66 + int err, map_fd, prog_fd, task_fd; 82 67 struct task_ls_recursion *skel; 83 - int err; 68 + struct bpf_prog_info info; 69 + __u32 info_len = sizeof(info); 70 + long value; 71 + 72 + task_fd = sys_pidfd_open(getpid(), 0); 73 + if (!ASSERT_NEQ(task_fd, -1, "sys_pidfd_open")) 74 + return; 84 75 85 76 skel = task_ls_recursion__open_and_load(); 86 77 if (!ASSERT_OK_PTR(skel, "skel_open_and_load")) 87 - return; 78 + goto out; 88 79 89 80 err = task_ls_recursion__attach(skel); 90 81 if (!ASSERT_OK(err, "skel_attach")) 91 82 goto out; 92 83 93 84 /* trigger sys_enter, make sure it does not cause deadlock */ 85 + skel->bss->test_pid = getpid(); 94 86 syscall(SYS_gettid); 87 + skel->bss->test_pid = 0; 88 + task_ls_recursion__detach(skel); 89 + 90 + /* Refer to the comment in BPF_PROG(on_update) for 91 + * the explanation on the value 201 and 100. 
92 + */ 93 + map_fd = bpf_map__fd(skel->maps.map_a); 94 + err = bpf_map_lookup_elem(map_fd, &task_fd, &value); 95 + ASSERT_OK(err, "lookup map_a"); 96 + ASSERT_EQ(value, 201, "map_a value"); 97 + ASSERT_EQ(skel->bss->nr_del_errs, 1, "bpf_task_storage_delete busy"); 98 + 99 + map_fd = bpf_map__fd(skel->maps.map_b); 100 + err = bpf_map_lookup_elem(map_fd, &task_fd, &value); 101 + ASSERT_OK(err, "lookup map_b"); 102 + ASSERT_EQ(value, 100, "map_b value"); 103 + 104 + prog_fd = bpf_program__fd(skel->progs.on_lookup); 105 + memset(&info, 0, sizeof(info)); 106 + err = bpf_obj_get_info_by_fd(prog_fd, &info, &info_len); 107 + ASSERT_OK(err, "get prog info"); 108 + ASSERT_GT(info.recursion_misses, 0, "on_lookup prog recursion"); 109 + 110 + prog_fd = bpf_program__fd(skel->progs.on_update); 111 + memset(&info, 0, sizeof(info)); 112 + err = bpf_obj_get_info_by_fd(prog_fd, &info, &info_len); 113 + ASSERT_OK(err, "get prog info"); 114 + ASSERT_EQ(info.recursion_misses, 0, "on_update prog recursion"); 115 + 116 + prog_fd = bpf_program__fd(skel->progs.on_enter); 117 + memset(&info, 0, sizeof(info)); 118 + err = bpf_obj_get_info_by_fd(prog_fd, &info, &info_len); 119 + ASSERT_OK(err, "get prog info"); 120 + ASSERT_EQ(info.recursion_misses, 0, "on_enter prog recursion"); 95 121 96 122 out: 123 + close(task_fd); 97 124 task_ls_recursion__destroy(skel); 125 + } 126 + 127 + static bool stop; 128 + 129 + static void waitall(const pthread_t *tids, int nr) 130 + { 131 + int i; 132 + 133 + stop = true; 134 + for (i = 0; i < nr; i++) 135 + pthread_join(tids[i], NULL); 136 + } 137 + 138 + static void *sock_create_loop(void *arg) 139 + { 140 + struct task_storage_nodeadlock *skel = arg; 141 + int fd; 142 + 143 + while (!stop) { 144 + fd = socket(AF_INET, SOCK_STREAM, 0); 145 + close(fd); 146 + if (skel->bss->nr_get_errs || skel->bss->nr_del_errs) 147 + stop = true; 148 + } 149 + 150 + return NULL; 151 + } 152 + 153 + static void test_nodeadlock(void) 154 + { 155 + struct 
task_storage_nodeadlock *skel; 156 + struct bpf_prog_info info = {}; 157 + __u32 info_len = sizeof(info); 158 + const int nr_threads = 32; 159 + pthread_t tids[nr_threads]; 160 + int i, prog_fd, err; 161 + cpu_set_t old, new; 162 + 163 + /* Pin all threads to one cpu to increase the chance of preemption 164 + * in a sleepable bpf prog. 165 + */ 166 + CPU_ZERO(&new); 167 + CPU_SET(0, &new); 168 + err = sched_getaffinity(getpid(), sizeof(old), &old); 169 + if (!ASSERT_OK(err, "getaffinity")) 170 + return; 171 + err = sched_setaffinity(getpid(), sizeof(new), &new); 172 + if (!ASSERT_OK(err, "setaffinity")) 173 + return; 174 + 175 + skel = task_storage_nodeadlock__open_and_load(); 176 + if (!ASSERT_OK_PTR(skel, "open_and_load")) 177 + goto done; 178 + 179 + /* Unnecessary recursion and deadlock detection are reproducible 180 + * in the preemptible kernel. 181 + */ 182 + if (!skel->kconfig->CONFIG_PREEMPT) { 183 + test__skip(); 184 + goto done; 185 + } 186 + 187 + err = task_storage_nodeadlock__attach(skel); 188 + ASSERT_OK(err, "attach prog"); 189 + 190 + for (i = 0; i < nr_threads; i++) { 191 + err = pthread_create(&tids[i], NULL, sock_create_loop, skel); 192 + if (err) { 193 + /* Only assert once here to avoid excessive 194 + * PASS printing during test failure. 
195 + */ 196 + ASSERT_OK(err, "pthread_create"); 197 + waitall(tids, i); 198 + goto done; 199 + } 200 + } 201 + 202 + /* With 32 threads, 1s is enough to reproduce the issue */ 203 + sleep(1); 204 + waitall(tids, nr_threads); 205 + 206 + info_len = sizeof(info); 207 + prog_fd = bpf_program__fd(skel->progs.socket_post_create); 208 + err = bpf_obj_get_info_by_fd(prog_fd, &info, &info_len); 209 + ASSERT_OK(err, "get prog info"); 210 + ASSERT_EQ(info.recursion_misses, 0, "prog recursion"); 211 + 212 + ASSERT_EQ(skel->bss->nr_get_errs, 0, "bpf_task_storage_get busy"); 213 + ASSERT_EQ(skel->bss->nr_del_errs, 0, "bpf_task_storage_delete busy"); 214 + 215 + done: 216 + task_storage_nodeadlock__destroy(skel); 217 + sched_setaffinity(getpid(), sizeof(old), &old); 98 218 } 99 219 100 220 void test_task_local_storage(void) ··· 241 89 test_exit_creds(); 242 90 if (test__start_subtest("recursion")) 243 91 test_recursion(); 92 + if (test__start_subtest("nodeadlock")) 93 + test_nodeadlock(); 244 94 }
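The exit_creds hunk above replaces a single wait with a bounded polling loop: wait, atomically re-read `run_count`, and give up after a fixed number of attempts. A minimal user-space sketch of that pattern follows; `wait_once()`, `MAX_WAIT_CALLS` and the "program runs on the third wait" behaviour are hypothetical stand-ins for `kern_sync_rcu()`, `MAX_SYNC_RCU_CALLS` and the real BPF program, not code from the patch.

```c
#include <stdbool.h>

#define MAX_WAIT_CALLS 1000	/* hypothetical bound, stands in for MAX_SYNC_RCU_CALLS */

static int run_count;	/* stands in for skel->bss->run_count */
static int waits_done;	/* how many times the wait helper was called */

/* Hypothetical stand-in for kern_sync_rcu(): pretend the BPF program
 * gets to run during the third wait.
 */
static void wait_once(void)
{
	if (++waits_done == 3)
		__atomic_fetch_add(&run_count, 1, __ATOMIC_SEQ_CST);
}

/* Poll until run_count becomes non-zero or the bound is reached.
 * Returns true if the program was observed to run.
 */
static bool wait_for_run(void)
{
	int calls = 0, seen;

	do {
		wait_once();
		seen = __atomic_load_n(&run_count, __ATOMIC_SEQ_CST);
	} while (seen == 0 && ++calls < MAX_WAIT_CALLS);

	return seen != 0;
}
```

Calling `wait_for_run()` once is enough to exercise the loop. The bound turns a test that could otherwise hang into one that fails with a clear assertion.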
+20 -1
tools/testing/selftests/bpf/progs/bpf_iter_bpf_array_map.c
···
 	__type(value, __u64);
 } arraymap1 SEC(".maps");

+struct {
+	__uint(type, BPF_MAP_TYPE_HASH);
+	__uint(max_entries, 10);
+	__type(key, __u64);
+	__type(value, __u32);
+} hashmap1 SEC(".maps");
+
 __u32 key_sum = 0;
 __u64 val_sum = 0;

 SEC("iter/bpf_map_elem")
 int dump_bpf_array_map(struct bpf_iter__bpf_map_elem *ctx)
 {
-	__u32 *key = ctx->key;
+	__u32 *hmap_val, *key = ctx->key;
 	__u64 *val = ctx->value;

 	if (key == (void *)0 || val == (void *)0)
···
 	bpf_seq_write(ctx->meta->seq, val, sizeof(__u64));
 	key_sum += *key;
 	val_sum += *val;
+
+	/* workaround - It's necessary to do this convoluted (val, key)
+	 * write into hashmap1, instead of simply doing
+	 * bpf_map_update_elem(&hashmap1, val, key, BPF_ANY);
+	 * because key has MEM_RDONLY flag and bpf_map_update elem expects
+	 * types without this flag
+	 */
+	bpf_map_update_elem(&hashmap1, val, val, BPF_ANY);
+	hmap_val = bpf_map_lookup_elem(&hashmap1, val);
+	if (hmap_val)
+		*hmap_val = *key;
+
 	*val = *key;
 	return 0;
 }
+101
tools/testing/selftests/bpf/progs/cgrp_ls_attach_cgroup.c
···
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2022 Meta Platforms, Inc. and affiliates. */
+
+#include "vmlinux.h"
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+#include "bpf_tracing_net.h"
+
+char _license[] SEC("license") = "GPL";
+
+struct socket_cookie {
+	__u64 cookie_key;
+	__u64 cookie_value;
+};
+
+struct {
+	__uint(type, BPF_MAP_TYPE_CGRP_STORAGE);
+	__uint(map_flags, BPF_F_NO_PREALLOC);
+	__type(key, int);
+	__type(value, struct socket_cookie);
+} socket_cookies SEC(".maps");
+
+SEC("cgroup/connect6")
+int set_cookie(struct bpf_sock_addr *ctx)
+{
+	struct socket_cookie *p;
+	struct tcp_sock *tcp_sk;
+	struct bpf_sock *sk;
+
+	if (ctx->family != AF_INET6 || ctx->user_family != AF_INET6)
+		return 1;
+
+	sk = ctx->sk;
+	if (!sk)
+		return 1;
+
+	tcp_sk = bpf_skc_to_tcp_sock(sk);
+	if (!tcp_sk)
+		return 1;
+
+	p = bpf_cgrp_storage_get(&socket_cookies,
+		tcp_sk->inet_conn.icsk_inet.sk.sk_cgrp_data.cgroup, 0,
+		BPF_LOCAL_STORAGE_GET_F_CREATE);
+	if (!p)
+		return 1;
+
+	p->cookie_value = 0xF;
+	p->cookie_key = bpf_get_socket_cookie(ctx);
+	return 1;
+}
+
+SEC("sockops")
+int update_cookie_sockops(struct bpf_sock_ops *ctx)
+{
+	struct socket_cookie *p;
+	struct tcp_sock *tcp_sk;
+	struct bpf_sock *sk;
+
+	if (ctx->family != AF_INET6 || ctx->op != BPF_SOCK_OPS_TCP_CONNECT_CB)
+		return 1;
+
+	sk = ctx->sk;
+	if (!sk)
+		return 1;
+
+	tcp_sk = bpf_skc_to_tcp_sock(sk);
+	if (!tcp_sk)
+		return 1;
+
+	p = bpf_cgrp_storage_get(&socket_cookies,
+		tcp_sk->inet_conn.icsk_inet.sk.sk_cgrp_data.cgroup, 0, 0);
+	if (!p)
+		return 1;
+
+	if (p->cookie_key != bpf_get_socket_cookie(ctx))
+		return 1;
+
+	p->cookie_value |= (ctx->local_port << 8);
+	return 1;
+}
+
+SEC("fexit/inet_stream_connect")
+int BPF_PROG(update_cookie_tracing, struct socket *sock,
+	     struct sockaddr *uaddr, int addr_len, int flags)
+{
+	struct socket_cookie *p;
+	struct tcp_sock *tcp_sk;
+
+	if (uaddr->sa_family != AF_INET6)
+		return 0;
+
+	p = bpf_cgrp_storage_get(&socket_cookies, sock->sk->sk_cgrp_data.cgroup, 0, 0);
+	if (!p)
+		return 0;
+
+	if (p->cookie_key != bpf_get_socket_cookie(sock->sk))
+		return 0;
+
+	p->cookie_value |= 0xF0;
+	return 0;
+}
+26
tools/testing/selftests/bpf/progs/cgrp_ls_negative.c
···
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2022 Meta Platforms, Inc. and affiliates. */
+
+#include "vmlinux.h"
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+
+char _license[] SEC("license") = "GPL";
+
+struct {
+	__uint(type, BPF_MAP_TYPE_CGRP_STORAGE);
+	__uint(map_flags, BPF_F_NO_PREALLOC);
+	__type(key, int);
+	__type(value, long);
+} map_a SEC(".maps");
+
+SEC("tp_btf/sys_enter")
+int BPF_PROG(on_enter, struct pt_regs *regs, long id)
+{
+	struct task_struct *task;
+
+	task = bpf_get_current_task_btf();
+	(void)bpf_cgrp_storage_get(&map_a, (struct cgroup *)task, 0,
+				   BPF_LOCAL_STORAGE_GET_F_CREATE);
+	return 0;
+}
+70
tools/testing/selftests/bpf/progs/cgrp_ls_recursion.c
···
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2022 Meta Platforms, Inc. and affiliates. */
+
+#include "vmlinux.h"
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+
+char _license[] SEC("license") = "GPL";
+
+struct {
+	__uint(type, BPF_MAP_TYPE_CGRP_STORAGE);
+	__uint(map_flags, BPF_F_NO_PREALLOC);
+	__type(key, int);
+	__type(value, long);
+} map_a SEC(".maps");
+
+struct {
+	__uint(type, BPF_MAP_TYPE_CGRP_STORAGE);
+	__uint(map_flags, BPF_F_NO_PREALLOC);
+	__type(key, int);
+	__type(value, long);
+} map_b SEC(".maps");
+
+SEC("fentry/bpf_local_storage_lookup")
+int BPF_PROG(on_lookup)
+{
+	struct task_struct *task = bpf_get_current_task_btf();
+
+	bpf_cgrp_storage_delete(&map_a, task->cgroups->dfl_cgrp);
+	bpf_cgrp_storage_delete(&map_b, task->cgroups->dfl_cgrp);
+	return 0;
+}
+
+SEC("fentry/bpf_local_storage_update")
+int BPF_PROG(on_update)
+{
+	struct task_struct *task = bpf_get_current_task_btf();
+	long *ptr;
+
+	ptr = bpf_cgrp_storage_get(&map_a, task->cgroups->dfl_cgrp, 0,
+				   BPF_LOCAL_STORAGE_GET_F_CREATE);
+	if (ptr)
+		*ptr += 1;
+
+	ptr = bpf_cgrp_storage_get(&map_b, task->cgroups->dfl_cgrp, 0,
+				   BPF_LOCAL_STORAGE_GET_F_CREATE);
+	if (ptr)
+		*ptr += 1;
+
+	return 0;
+}
+
+SEC("tp_btf/sys_enter")
+int BPF_PROG(on_enter, struct pt_regs *regs, long id)
+{
+	struct task_struct *task;
+	long *ptr;
+
+	task = bpf_get_current_task_btf();
+	ptr = bpf_cgrp_storage_get(&map_a, task->cgroups->dfl_cgrp, 0,
+				   BPF_LOCAL_STORAGE_GET_F_CREATE);
+	if (ptr)
+		*ptr = 200;
+
+	ptr = bpf_cgrp_storage_get(&map_b, task->cgroups->dfl_cgrp, 0,
+				   BPF_LOCAL_STORAGE_GET_F_CREATE);
+	if (ptr)
+		*ptr = 100;
+	return 0;
+}
+88
tools/testing/selftests/bpf/progs/cgrp_ls_tp_btf.c
···
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2022 Meta Platforms, Inc. and affiliates. */
+
+#include "vmlinux.h"
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+
+char _license[] SEC("license") = "GPL";
+
+struct {
+	__uint(type, BPF_MAP_TYPE_CGRP_STORAGE);
+	__uint(map_flags, BPF_F_NO_PREALLOC);
+	__type(key, int);
+	__type(value, long);
+} map_a SEC(".maps");
+
+struct {
+	__uint(type, BPF_MAP_TYPE_CGRP_STORAGE);
+	__uint(map_flags, BPF_F_NO_PREALLOC);
+	__type(key, int);
+	__type(value, long);
+} map_b SEC(".maps");
+
+#define MAGIC_VALUE 0xabcd1234
+
+pid_t target_pid = 0;
+int mismatch_cnt = 0;
+int enter_cnt = 0;
+int exit_cnt = 0;
+
+SEC("tp_btf/sys_enter")
+int BPF_PROG(on_enter, struct pt_regs *regs, long id)
+{
+	struct task_struct *task;
+	long *ptr;
+	int err;
+
+	task = bpf_get_current_task_btf();
+	if (task->pid != target_pid)
+		return 0;
+
+	/* populate value 0 */
+	ptr = bpf_cgrp_storage_get(&map_a, task->cgroups->dfl_cgrp, 0,
+				   BPF_LOCAL_STORAGE_GET_F_CREATE);
+	if (!ptr)
+		return 0;
+
+	/* delete value 0 */
+	err = bpf_cgrp_storage_delete(&map_a, task->cgroups->dfl_cgrp);
+	if (err)
+		return 0;
+
+	/* value is not available */
+	ptr = bpf_cgrp_storage_get(&map_a, task->cgroups->dfl_cgrp, 0, 0);
+	if (ptr)
+		return 0;
+
+	/* re-populate the value */
+	ptr = bpf_cgrp_storage_get(&map_a, task->cgroups->dfl_cgrp, 0,
+				   BPF_LOCAL_STORAGE_GET_F_CREATE);
+	if (!ptr)
+		return 0;
+	__sync_fetch_and_add(&enter_cnt, 1);
+	*ptr = MAGIC_VALUE + enter_cnt;
+
+	return 0;
+}
+
+SEC("tp_btf/sys_exit")
+int BPF_PROG(on_exit, struct pt_regs *regs, long id)
+{
+	struct task_struct *task;
+	long *ptr;
+
+	task = bpf_get_current_task_btf();
+	if (task->pid != target_pid)
+		return 0;
+
+	ptr = bpf_cgrp_storage_get(&map_a, task->cgroups->dfl_cgrp, 0,
+				   BPF_LOCAL_STORAGE_GET_F_CREATE);
+	if (!ptr)
+		return 0;
+
+	__sync_fetch_and_add(&exit_cnt, 1);
+	if (*ptr != MAGIC_VALUE + exit_cnt)
+		__sync_fetch_and_add(&mismatch_cnt, 1);
+	return 0;
+}
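The sys_enter/sys_exit pair above relies on a simple invariant: the value written on entry is `MAGIC_VALUE + enter_cnt`, and as long as every entry is followed by exactly one exit, the exit handler reads back `MAGIC_VALUE + exit_cnt`. The sketch below models only that bookkeeping in plain C; the single `slot` variable is an assumed stand-in for the per-cgroup storage element, and the `model_*` names are hypothetical.

```c
#define MAGIC_VALUE 0xabcd1234L

static long slot;	/* assumed stand-in for the cgroup storage value */
static int enter_cnt, exit_cnt, mismatch_cnt;

/* Models on_enter: bump the counter, then stamp the slot with it. */
static void model_sys_enter(void)
{
	enter_cnt++;
	slot = MAGIC_VALUE + enter_cnt;
}

/* Models on_exit: bump the counter, then compare against the stamp. */
static void model_sys_exit(void)
{
	exit_cnt++;
	if (slot != MAGIC_VALUE + exit_cnt)
		mismatch_cnt++;
}
```

Paired calls keep `mismatch_cnt` at zero; an exit without a matching enter makes the counters drift apart and is counted as a mismatch, which is the condition the user-space side of such a test would be expected to check.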
+50
tools/testing/selftests/bpf/progs/kprobe_multi.c
···
 	kprobe_multi_check(ctx, true);
 	return 0;
 }
+
+extern const void bpf_testmod_fentry_test1 __ksym;
+extern const void bpf_testmod_fentry_test2 __ksym;
+extern const void bpf_testmod_fentry_test3 __ksym;
+
+__u64 kprobe_testmod_test1_result = 0;
+__u64 kprobe_testmod_test2_result = 0;
+__u64 kprobe_testmod_test3_result = 0;
+
+__u64 kretprobe_testmod_test1_result = 0;
+__u64 kretprobe_testmod_test2_result = 0;
+__u64 kretprobe_testmod_test3_result = 0;
+
+static void kprobe_multi_testmod_check(void *ctx, bool is_return)
+{
+	if (bpf_get_current_pid_tgid() >> 32 != pid)
+		return;
+
+	__u64 addr = bpf_get_func_ip(ctx);
+
+	if (is_return) {
+		if ((const void *) addr == &bpf_testmod_fentry_test1)
+			kretprobe_testmod_test1_result = 1;
+		if ((const void *) addr == &bpf_testmod_fentry_test2)
+			kretprobe_testmod_test2_result = 1;
+		if ((const void *) addr == &bpf_testmod_fentry_test3)
+			kretprobe_testmod_test3_result = 1;
+	} else {
+		if ((const void *) addr == &bpf_testmod_fentry_test1)
+			kprobe_testmod_test1_result = 1;
+		if ((const void *) addr == &bpf_testmod_fentry_test2)
+			kprobe_testmod_test2_result = 1;
+		if ((const void *) addr == &bpf_testmod_fentry_test3)
+			kprobe_testmod_test3_result = 1;
+	}
+}
+
+SEC("kprobe.multi")
+int test_kprobe_testmod(struct pt_regs *ctx)
+{
+	kprobe_multi_testmod_check(ctx, false);
+	return 0;
+}
+
+SEC("kretprobe.multi")
+int test_kretprobe_testmod(struct pt_regs *ctx)
+{
+	kprobe_multi_testmod_check(ctx, true);
+	return 0;
+}
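The filter at the top of `kprobe_multi_testmod_check()` depends on the layout of the value returned by `bpf_get_current_pid_tgid()`: the thread-group id (what user space calls the PID) sits in the upper 32 bits and the thread id in the lower 32. A small user-space sketch of that packing follows; the helper names are hypothetical and only illustrate the bit arithmetic.

```c
#include <stdint.h>

/* Build a value with the same layout as bpf_get_current_pid_tgid():
 * tgid in bits 63..32, thread id in bits 31..0.
 */
static inline uint64_t make_pid_tgid(uint32_t tgid, uint32_t tid)
{
	return ((uint64_t)tgid << 32) | tid;
}

/* Upper half: the process (thread-group) id. */
static inline uint32_t tgid_of(uint64_t pid_tgid)
{
	return (uint32_t)(pid_tgid >> 32);
}

/* Lower half: the id of the calling thread. */
static inline uint32_t tid_of(uint64_t pid_tgid)
{
	return (uint32_t)pid_tgid;
}

/* Mirrors the filter in the test: non-zero if the event belongs to
 * the process we are interested in, regardless of which thread fired it.
 */
static inline int from_target(uint64_t pid_tgid, uint32_t target_tgid)
{
	return tgid_of(pid_tgid) == target_tgid;
}
```

Comparing only the upper half means every thread of the test process passes the filter, while unrelated processes that happen to hit the same probes are ignored.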
+3
tools/testing/selftests/bpf/progs/task_local_storage_exit_creds.c
···
 	__type(value, __u64);
 } task_storage SEC(".maps");

+int run_count = 0;
 int valid_ptr_count = 0;
 int null_ptr_count = 0;

···
 		__sync_fetch_and_add(&valid_ptr_count, 1);
 	else
 		__sync_fetch_and_add(&null_ptr_count, 1);
+
+	__sync_fetch_and_add(&run_count, 1);
 	return 0;
 }
+41 -4
tools/testing/selftests/bpf/progs/task_ls_recursion.c
···
 #include <bpf/bpf_helpers.h>
 #include <bpf/bpf_tracing.h>

+#ifndef EBUSY
+#define EBUSY 16
+#endif
+
 char _license[] SEC("license") = "GPL";
+int nr_del_errs = 0;
+int test_pid = 0;

 struct {
 	__uint(type, BPF_MAP_TYPE_TASK_STORAGE);
···
 {
 	struct task_struct *task = bpf_get_current_task_btf();

+	if (!test_pid || task->pid != test_pid)
+		return 0;
+
+	/* The bpf_task_storage_delete will call
+	 * bpf_local_storage_lookup. The prog->active will
+	 * stop the recursion.
+	 */
 	bpf_task_storage_delete(&map_a, task);
 	bpf_task_storage_delete(&map_b, task);
 	return 0;
···
 	struct task_struct *task = bpf_get_current_task_btf();
 	long *ptr;

+	if (!test_pid || task->pid != test_pid)
+		return 0;
+
 	ptr = bpf_task_storage_get(&map_a, task, 0,
 				   BPF_LOCAL_STORAGE_GET_F_CREATE);
-	if (ptr)
-		*ptr += 1;
+	/* ptr will not be NULL when it is called from
+	 * the bpf_task_storage_get(&map_b,...F_CREATE) in
+	 * the BPF_PROG(on_enter) below. It is because
+	 * the value can be found in map_a and the kernel
+	 * does not need to acquire any spin_lock.
+	 */
+	if (ptr) {
+		int err;

+		*ptr += 1;
+		err = bpf_task_storage_delete(&map_a, task);
+		if (err == -EBUSY)
+			nr_del_errs++;
+	}
+
+	/* This will still fail because map_b is empty and
+	 * this BPF_PROG(on_update) has failed to acquire
+	 * the percpu busy lock => meaning potential
+	 * deadlock is detected and it will fail to create
+	 * new storage.
+	 */
 	ptr = bpf_task_storage_get(&map_b, task, 0,
 				   BPF_LOCAL_STORAGE_GET_F_CREATE);
 	if (ptr)
···
 	long *ptr;

 	task = bpf_get_current_task_btf();
+	if (!test_pid || task->pid != test_pid)
+		return 0;
+
 	ptr = bpf_task_storage_get(&map_a, task, 0,
 				   BPF_LOCAL_STORAGE_GET_F_CREATE);
-	if (ptr)
+	if (ptr && !*ptr)
 		*ptr = 200;

 	ptr = bpf_task_storage_get(&map_b, task, 0,
 				   BPF_LOCAL_STORAGE_GET_F_CREATE);
-	if (ptr)
+	if (ptr && !*ptr)
 		*ptr = 100;
 	return 0;
 }
+47
tools/testing/selftests/bpf/progs/task_storage_nodeadlock.c
···
+// SPDX-License-Identifier: GPL-2.0
+
+#include "vmlinux.h"
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+
+char _license[] SEC("license") = "GPL";
+
+#ifndef EBUSY
+#define EBUSY 16
+#endif
+
+extern bool CONFIG_PREEMPT __kconfig __weak;
+int nr_get_errs = 0;
+int nr_del_errs = 0;
+
+struct {
+	__uint(type, BPF_MAP_TYPE_TASK_STORAGE);
+	__uint(map_flags, BPF_F_NO_PREALLOC);
+	__type(key, int);
+	__type(value, int);
+} task_storage SEC(".maps");
+
+SEC("lsm.s/socket_post_create")
+int BPF_PROG(socket_post_create, struct socket *sock, int family, int type,
+	     int protocol, int kern)
+{
+	struct task_struct *task;
+	int ret, zero = 0;
+	int *value;
+
+	if (!CONFIG_PREEMPT)
+		return 0;
+
+	task = bpf_get_current_task_btf();
+	value = bpf_task_storage_get(&task_storage, task, &zero,
+				     BPF_LOCAL_STORAGE_GET_F_CREATE);
+	if (!value)
+		__sync_fetch_and_add(&nr_get_errs, 1);
+
+	ret = bpf_task_storage_delete(&task_storage,
+				      bpf_get_current_task_btf());
+	if (ret == -EBUSY)
+		__sync_fetch_and_add(&nr_del_errs, 1);
+
+	return 0;
+}
+6
tools/testing/selftests/bpf/progs/test_module_attach.c
···
 	return 0; /* don't override the exit code */
 }

+SEC("kprobe.multi/bpf_testmod_test_read")
+int BPF_PROG(kprobe_multi)
+{
+	return 0;
+}
+
 char _license[] SEC("license") = "GPL";
+70
tools/testing/selftests/bpf/progs/test_ringbuf_map_key.c
···
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2022 Meta Platforms, Inc. and affiliates. */
+
+#include <linux/bpf.h>
+#include <bpf/bpf_helpers.h>
+#include "bpf_misc.h"
+
+char _license[] SEC("license") = "GPL";
+
+struct sample {
+	int pid;
+	int seq;
+	long value;
+	char comm[16];
+};
+
+struct {
+	__uint(type, BPF_MAP_TYPE_RINGBUF);
+} ringbuf SEC(".maps");
+
+struct {
+	__uint(type, BPF_MAP_TYPE_HASH);
+	__uint(max_entries, 1000);
+	__type(key, struct sample);
+	__type(value, int);
+} hash_map SEC(".maps");
+
+/* inputs */
+int pid = 0;
+
+/* inner state */
+long seq = 0;
+
+SEC("fentry/" SYS_PREFIX "sys_getpgid")
+int test_ringbuf_mem_map_key(void *ctx)
+{
+	int cur_pid = bpf_get_current_pid_tgid() >> 32;
+	struct sample *sample, sample_copy;
+	int *lookup_val;
+
+	if (cur_pid != pid)
+		return 0;
+
+	sample = bpf_ringbuf_reserve(&ringbuf, sizeof(*sample), 0);
+	if (!sample)
+		return 0;
+
+	sample->pid = pid;
+	bpf_get_current_comm(sample->comm, sizeof(sample->comm));
+	sample->seq = ++seq;
+	sample->value = 42;
+
+	/* test using 'sample' (PTR_TO_MEM | MEM_ALLOC) as map key arg
+	 */
+	lookup_val = (int *)bpf_map_lookup_elem(&hash_map, sample);
+
+	/* workaround - memcpy is necessary so that verifier doesn't
+	 * complain with:
+	 * verifier internal error: more than one arg with ref_obj_id R3
+	 * when trying to do bpf_map_update_elem(&hash_map, sample, &sample->seq, BPF_ANY);
+	 *
+	 * Since bpf_map_lookup_elem above uses 'sample' as key, test using
+	 * sample field as value below
+	 */
+	__builtin_memcpy(&sample_copy, sample, sizeof(struct sample));
+	bpf_map_update_elem(&hash_map, &sample_copy, &sample->seq, BPF_ANY);
+
+	bpf_ringbuf_submit(sample, 0);
+	return 0;
+}
+17
tools/testing/selftests/bpf/progs/test_skeleton.c
···

 char huge_arr[16 * 1024 * 1024];

+/* non-mmapable custom .data section */
+
+struct my_value { int x, y, z; };
+
+__hidden int zero_key SEC(".data.non_mmapable");
+static struct my_value zero_value SEC(".data.non_mmapable");
+
+struct {
+	__uint(type, BPF_MAP_TYPE_ARRAY);
+	__type(key, int);
+	__type(value, struct my_value);
+	__uint(max_entries, 1);
+} my_map SEC(".maps");
+
 SEC("raw_tp/sys_enter")
 int handler(const void *ctx)
 {
···
 	out_mostly_var = read_mostly_var;

 	huge_arr[sizeof(huge_arr) - 1] = 123;
+
+	/* make sure zero_key and zero_value are not optimized out */
+	bpf_map_update_elem(&my_map, &zero_key, &zero_value, BPF_ANY);

 	return 0;
 }
+5 -2
tools/testing/selftests/bpf/test_bpftool_metadata.sh
···
 # Kselftest framework requirement - SKIP code is 4.
 ksft_skip=4

+BPF_FILE_USED="metadata_used.bpf.o"
+BPF_FILE_UNUSED="metadata_unused.bpf.o"
+
 TESTNAME=bpftool_metadata
 BPF_FS=$(awk '$3 == "bpf" {print $2; exit}' /proc/mounts)
 BPF_DIR=$BPF_FS/test_$TESTNAME
···

 trap cleanup EXIT

-bpftool prog load metadata_unused.o $BPF_DIR/unused
+bpftool prog load $BPF_FILE_UNUSED $BPF_DIR/unused

 METADATA_PLAIN="$(bpftool prog)"
 echo "$METADATA_PLAIN" | grep 'a = "foo"' > /dev/null
···

 rm $BPF_DIR/unused

-bpftool prog load metadata_used.o $BPF_DIR/used
+bpftool prog load $BPF_FILE_USED $BPF_DIR/used

 METADATA_PLAIN="$(bpftool prog)"
 echo "$METADATA_PLAIN" | grep 'a = "bar"' > /dev/null
+8
tools/testing/selftests/bpf/test_bpftool_synctypes.py
···
     source_map_types = set(bpf_info.get_map_type_map().values())
     source_map_types.discard('unspec')

+    # BPF_MAP_TYPE_CGROUP_STORAGE_DEPRECATED and BPF_MAP_TYPE_CGROUP_STORAGE
+    # share the same enum value and source_map_types picks
+    # BPF_MAP_TYPE_CGROUP_STORAGE_DEPRECATED/cgroup_storage_deprecated.
+    # Replace 'cgroup_storage_deprecated' with 'cgroup_storage'
+    # so it aligns with what `bpftool map help` shows.
+    source_map_types.remove('cgroup_storage_deprecated')
+    source_map_types.add('cgroup_storage')
+
     help_map_types = map_info.get_map_help()
     help_map_options = map_info.get_options()
     map_info.close()
+4 -2
tools/testing/selftests/bpf/test_flow_dissector.sh
···
 # SPDX-License-Identifier: GPL-2.0
 #
 # Load BPF flow dissector and verify it correctly dissects traffic
+
+BPF_FILE="bpf_flow.bpf.o"
 export TESTNAME=test_flow_dissector
 unmount=0

···
 if bpftool="$(which bpftool)"; then
 	echo "Testing global flow dissector..."

-	$bpftool prog loadall ./bpf_flow.o /sys/fs/bpf/flow \
+	$bpftool prog loadall $BPF_FILE /sys/fs/bpf/flow \
 		type flow_dissector

 	if ! unshare --net $bpftool prog attach pinned \
···
 fi

 # Attach BPF program
-./flow_dissector_load -p bpf_flow.o -s _dissect
+./flow_dissector_load -p $BPF_FILE -s _dissect

 # Setup
 tc qdisc add dev lo ingress
+9 -8
tools/testing/selftests/bpf/test_lwt_ip_encap.sh
···
 # ping: SRC->[encap at veth2:ingress]->GRE:decap->DST
 # ping replies go DST->SRC directly

+BPF_FILE="test_lwt_ip_encap.bpf.o"
 if [[ $EUID -ne 0 ]]; then
 	echo "This script must be run as root"
 	echo "FAIL"
···
 	# install replacement routes (LWT/eBPF), pings succeed
 	if [ "${ENCAP}" == "IPv4" ] ; then
 		ip -netns ${NS1} route add ${IPv4_DST} encap bpf xmit obj \
-			test_lwt_ip_encap.o sec encap_gre dev veth1 ${VRF}
+			${BPF_FILE} sec encap_gre dev veth1 ${VRF}
 		ip -netns ${NS1} -6 route add ${IPv6_DST} encap bpf xmit obj \
-			test_lwt_ip_encap.o sec encap_gre dev veth1 ${VRF}
+			${BPF_FILE} sec encap_gre dev veth1 ${VRF}
 	elif [ "${ENCAP}" == "IPv6" ] ; then
 		ip -netns ${NS1} route add ${IPv4_DST} encap bpf xmit obj \
-			test_lwt_ip_encap.o sec encap_gre6 dev veth1 ${VRF}
+			${BPF_FILE} sec encap_gre6 dev veth1 ${VRF}
 		ip -netns ${NS1} -6 route add ${IPv6_DST} encap bpf xmit obj \
-			test_lwt_ip_encap.o sec encap_gre6 dev veth1 ${VRF}
+			${BPF_FILE} sec encap_gre6 dev veth1 ${VRF}
 	else
 		echo " unknown encap ${ENCAP}"
 		TEST_STATUS=1
···
 	# install replacement routes (LWT/eBPF), pings succeed
 	if [ "${ENCAP}" == "IPv4" ] ; then
 		ip -netns ${NS2} route add ${IPv4_DST} encap bpf in obj \
-			test_lwt_ip_encap.o sec encap_gre dev veth2 ${VRF}
+			${BPF_FILE} sec encap_gre dev veth2 ${VRF}
 		ip -netns ${NS2} -6 route add ${IPv6_DST} encap bpf in obj \
-			test_lwt_ip_encap.o sec encap_gre dev veth2 ${VRF}
+			${BPF_FILE} sec encap_gre dev veth2 ${VRF}
 	elif [ "${ENCAP}" == "IPv6" ] ; then
 		ip -netns ${NS2} route add ${IPv4_DST} encap bpf in obj \
-			test_lwt_ip_encap.o sec encap_gre6 dev veth2 ${VRF}
+			${BPF_FILE} sec encap_gre6 dev veth2 ${VRF}
 		ip -netns ${NS2} -6 route add ${IPv6_DST} encap bpf in obj \
-			test_lwt_ip_encap.o sec encap_gre6 dev veth2 ${VRF}
+			${BPF_FILE} sec encap_gre6 dev veth2 ${VRF}
 	else
 		echo "FAIL: unknown encap ${ENCAP}"
 		TEST_STATUS=1
+5 -4
tools/testing/selftests/bpf/test_lwt_seg6local.sh
···

 # Kselftest framework requirement - SKIP code is 4.
 ksft_skip=4
+BPF_FILE="test_lwt_seg6local.bpf.o"
 readonly NS1="ns1-$(mktemp -u XXXXXX)"
 readonly NS2="ns2-$(mktemp -u XXXXXX)"
 readonly NS3="ns3-$(mktemp -u XXXXXX)"
···
 ip netns exec ${NS1} ip -6 addr add fb00::1/16 dev lo
 ip netns exec ${NS1} ip -6 route add fb00::6 dev veth1 via fb00::21

-ip netns exec ${NS2} ip -6 route add fb00::6 encap bpf in obj test_lwt_seg6local.o sec encap_srh dev veth2
+ip netns exec ${NS2} ip -6 route add fb00::6 encap bpf in obj ${BPF_FILE} sec encap_srh dev veth2
 ip netns exec ${NS2} ip -6 route add fd00::1 dev veth3 via fb00::43 scope link

 ip netns exec ${NS3} ip -6 route add fc42::1 dev veth5 via fb00::65
-ip netns exec ${NS3} ip -6 route add fd00::1 encap seg6local action End.BPF endpoint obj test_lwt_seg6local.o sec add_egr_x dev veth4
+ip netns exec ${NS3} ip -6 route add fd00::1 encap seg6local action End.BPF endpoint obj ${BPF_FILE} sec add_egr_x dev veth4

-ip netns exec ${NS4} ip -6 route add fd00::2 encap seg6local action End.BPF endpoint obj test_lwt_seg6local.o sec pop_egr dev veth6
+ip netns exec ${NS4} ip -6 route add fd00::2 encap seg6local action End.BPF endpoint obj ${BPF_FILE} sec pop_egr dev veth6
 ip netns exec ${NS4} ip -6 addr add fc42::1 dev lo
 ip netns exec ${NS4} ip -6 route add fd00::3 dev veth7 via fb00::87

 ip netns exec ${NS5} ip -6 route add fd00::4 table 117 dev veth9 via fb00::109
-ip netns exec ${NS5} ip -6 route add fd00::3 encap seg6local action End.BPF endpoint obj test_lwt_seg6local.o sec inspect_t dev veth8
+ip netns exec ${NS5} ip -6 route add fd00::3 encap seg6local action End.BPF endpoint obj ${BPF_FILE} sec inspect_t dev veth8

 ip netns exec ${NS6} ip -6 addr add fb00::6/16 dev lo
 ip netns exec ${NS6} ip -6 addr add fd00::4/16 dev lo
+2 -1
tools/testing/selftests/bpf/test_tc_edt.sh
···
 # with dst port = 9000 down to 5MBps. Then it measures actual
 # throughput of the flow.

+BPF_FILE="test_tc_edt.bpf.o"
 if [[ $EUID -ne 0 ]]; then
 	echo "This script must be run as root"
 	echo "FAIL"
···
 ip netns exec ${NS_SRC} tc qdisc add dev veth_src root fq
 ip netns exec ${NS_SRC} tc qdisc add dev veth_src clsact
 ip netns exec ${NS_SRC} tc filter add dev veth_src egress \
-	bpf da obj test_tc_edt.o sec cls_test
+	bpf da obj ${BPF_FILE} sec cls_test


 # start the listener
+3 -2
tools/testing/selftests/bpf/test_tc_tunnel.sh
···
 #
 # In-place tunneling

+BPF_FILE="test_tc_tunnel.bpf.o"
 # must match the port that the bpf program filters on
 readonly port=8000

···
 # client can no longer connect
 ip netns exec "${ns1}" tc qdisc add dev veth1 clsact
 ip netns exec "${ns1}" tc filter add dev veth1 egress \
-	bpf direct-action object-file ./test_tc_tunnel.o \
+	bpf direct-action object-file ${BPF_FILE} \
 	section "encap_${tuntype}_${mac}"
 echo "test bpf encap without decap (expect failure)"
 server_listen
···
 ip netns exec "${ns2}" ip link del dev testtun0
 ip netns exec "${ns2}" tc qdisc add dev veth2 clsact
 ip netns exec "${ns2}" tc filter add dev veth2 ingress \
-	bpf direct-action object-file ./test_tc_tunnel.o section decap
+	bpf direct-action object-file ${BPF_FILE} section decap
 echo "test bpf encap with bpf decap"
 client_connect
 verify_data
+3 -2
tools/testing/selftests/bpf/test_tunnel.sh
···
 # 5) Tunnel protocol handler, ex: vxlan_rcv, decap the packet
 # 6) Forward the packet to the overlay tnl dev

+BPF_FILE="test_tunnel_kern.bpf.o"
 BPF_PIN_TUNNEL_DIR="/sys/fs/bpf/tc/tunnel"
 PING_ARG="-c 3 -w 10 -q"
 ret=0
···
 	> /sys/kernel/debug/tracing/trace
 	setup_xfrm_tunnel
 	mkdir -p ${BPF_PIN_TUNNEL_DIR}
-	bpftool prog loadall ./test_tunnel_kern.o ${BPF_PIN_TUNNEL_DIR}
+	bpftool prog loadall ${BPF_FILE} ${BPF_PIN_TUNNEL_DIR}
 	tc qdisc add dev veth1 clsact
 	tc filter add dev veth1 proto ip ingress bpf da object-pinned \
 		${BPF_PIN_TUNNEL_DIR}/xfrm_get_state
···
 	SET=$2
 	GET=$3
 	mkdir -p ${BPF_PIN_TUNNEL_DIR}
-	bpftool prog loadall ./test_tunnel_kern.o ${BPF_PIN_TUNNEL_DIR}/
+	bpftool prog loadall ${BPF_FILE} ${BPF_PIN_TUNNEL_DIR}/
 	tc qdisc add dev $DEV clsact
 	tc filter add dev $DEV egress bpf da object-pinned ${BPF_PIN_TUNNEL_DIR}/$SET
 	tc filter add dev $DEV ingress bpf da object-pinned ${BPF_PIN_TUNNEL_DIR}/$GET
+5 -4
tools/testing/selftests/bpf/test_xdp_meta.sh
···
 #!/bin/sh

+BPF_FILE="test_xdp_meta.bpf.o"
 # Kselftest framework requirement - SKIP code is 4.
 readonly KSFT_SKIP=4
 readonly NS1="ns1-$(mktemp -u XXXXXX)"
···
 ip netns exec ${NS1} tc qdisc add dev veth1 clsact
 ip netns exec ${NS2} tc qdisc add dev veth2 clsact

-ip netns exec ${NS1} tc filter add dev veth1 ingress bpf da obj test_xdp_meta.o sec t
-ip netns exec ${NS2} tc filter add dev veth2 ingress bpf da obj test_xdp_meta.o sec t
+ip netns exec ${NS1} tc filter add dev veth1 ingress bpf da obj ${BPF_FILE} sec t
+ip netns exec ${NS2} tc filter add dev veth2 ingress bpf da obj ${BPF_FILE} sec t

-ip netns exec ${NS1} ip link set dev veth1 xdp obj test_xdp_meta.o sec x
-ip netns exec ${NS2} ip link set dev veth2 xdp obj test_xdp_meta.o sec x
+ip netns exec ${NS1} ip link set dev veth1 xdp obj ${BPF_FILE} sec x
+ip netns exec ${NS2} ip link set dev veth2 xdp obj ${BPF_FILE} sec x

 ip netns exec ${NS1} ip link set dev veth1 up
 ip netns exec ${NS2} ip link set dev veth2 up
+4 -4
tools/testing/selftests/bpf/test_xdp_vlan.sh
···
 # ----------------------------------------------------------------------
 # In ns1: ingress use XDP to remove VLAN tags
 export DEVNS1=veth1
-export FILE=test_xdp_vlan.o
+export BPF_FILE=test_xdp_vlan.bpf.o

 # First test: Remove VLAN by setting VLAN ID 0, using "xdp_vlan_change"
 export XDP_PROG=xdp_vlan_change
-ip netns exec ${NS1} ip link set $DEVNS1 $XDP_MODE object $FILE section $XDP_PROG
+ip netns exec ${NS1} ip link set $DEVNS1 $XDP_MODE object $BPF_FILE section $XDP_PROG

 # In ns1: egress use TC to add back VLAN tag 4011
 # (del cmd)
···
 #
 ip netns exec ${NS1} tc qdisc add dev $DEVNS1 clsact
 ip netns exec ${NS1} tc filter add dev $DEVNS1 egress \
-	prio 1 handle 1 bpf da obj $FILE sec tc_vlan_push
+	prio 1 handle 1 bpf da obj $BPF_FILE sec tc_vlan_push

 # Now the namespaces can reach each-other, test with ping:
 ip netns exec ${NS2} ping -i 0.2 -W 2 -c 2 $IPADDR1
···
 #
 export XDP_PROG=xdp_vlan_remove_outer2
 ip netns exec ${NS1} ip link set $DEVNS1 $XDP_MODE off
-ip netns exec ${NS1} ip link set $DEVNS1 $XDP_MODE object $FILE section $XDP_PROG
+ip netns exec ${NS1} ip link set $DEVNS1 $XDP_MODE object $BPF_FILE section $XDP_PROG

 # Now the namespaces should still be able reach each-other, test with ping:
 ip netns exec ${NS2} ping -i 0.2 -W 2 -c 2 $IPADDR1
+13 -7
tools/testing/selftests/bpf/trace_helpers.c
···
 	return ((struct ksym *)p1)->addr - ((struct ksym *)p2)->addr;
 }

-int load_kallsyms(void)
+int load_kallsyms_refresh(void)
 {
 	FILE *f;
 	char func[256], buf[256];
···
 	void *addr;
 	int i = 0;

-	/*
-	 * This is called/used from multiplace places,
-	 * load symbols just once.
-	 */
-	if (sym_cnt)
-		return 0;
+	sym_cnt = 0;

 	f = fopen("/proc/kallsyms", "r");
 	if (!f)
···
 	sym_cnt = i;
 	qsort(syms, sym_cnt, sizeof(struct ksym), ksym_cmp);
 	return 0;
+}
+
+int load_kallsyms(void)
+{
+	/*
+	 * This is called/used from multiplace places,
+	 * load symbols just once.
+	 */
+	if (sym_cnt)
+		return 0;
+	return load_kallsyms_refresh();
 }

 struct ksym *ksym_search(long key)
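The change above splits symbol loading into a cached entry point and an unconditional reload. The following is a minimal sketch of that pattern only, not the selftest code itself: it parses an in-memory listing in the `address type name` layout instead of reading `/proc/kallsyms`, so it behaves the same on any machine. The names `struct sym`, `load_syms`, `load_syms_refresh`, `MAX_SYMS`, and the `parse_calls` counter are hypothetical and exist purely for illustration.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_SYMS 64		/* hypothetical table size for this sketch */

struct sym {
	unsigned long addr;
	char name[64];
};

static struct sym syms[MAX_SYMS];
static int sym_cnt;
static int parse_calls;		/* counts real parses; demonstration only */

static int sym_cmp(const void *a, const void *b)
{
	unsigned long x = ((const struct sym *)a)->addr;
	unsigned long y = ((const struct sym *)b)->addr;

	return (x > y) - (x < y);
}

/* Always re-parse: drop the cached table and rebuild it from `text`. */
int load_syms_refresh(const char *text)
{
	const char *p = text;
	unsigned long addr;
	char type, name[64];
	int used, i = 0;

	sym_cnt = 0;
	parse_calls++;

	/* Each record is "<hex addr> <type char> <name>"; %n reports bytes consumed. */
	while (i < MAX_SYMS &&
	       sscanf(p, "%lx %c %63s%n", &addr, &type, name, &used) == 3) {
		syms[i].addr = addr;
		strcpy(syms[i].name, name);
		i++;
		p += used;
	}

	sym_cnt = i;
	qsort(syms, sym_cnt, sizeof(syms[0]), sym_cmp);
	return 0;
}

/* Cached entry point: parse only if nothing has been loaded yet. */
int load_syms(const char *text)
{
	if (sym_cnt)
		return 0;
	return load_syms_refresh(text);
}
```

With this split, existing callers keep the cheap load-once behaviour, while a caller that expects the symbol set to have changed since the first load can ask for a rebuild explicitly.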
+2
tools/testing/selftests/bpf/trace_helpers.h
···
 };

 int load_kallsyms(void);
+int load_kallsyms_refresh(void);
+
 struct ksym *ksym_search(long key);
 long ksym_get_addr(const char *name);

+24
tools/testing/selftests/bpf/verifier/jit.c
···
 	.retval = 2,
 },
 {
+	"jit: lsh, rsh, arsh by reg",
+	.insns = {
+	BPF_MOV64_IMM(BPF_REG_0, 1),
+	BPF_MOV64_IMM(BPF_REG_4, 1),
+	BPF_MOV64_IMM(BPF_REG_1, 0xff),
+	BPF_ALU64_REG(BPF_LSH, BPF_REG_1, BPF_REG_0),
+	BPF_ALU32_REG(BPF_LSH, BPF_REG_1, BPF_REG_4),
+	BPF_JMP_IMM(BPF_JEQ, BPF_REG_1, 0x3fc, 1),
+	BPF_EXIT_INSN(),
+	BPF_ALU64_REG(BPF_RSH, BPF_REG_1, BPF_REG_4),
+	BPF_MOV64_REG(BPF_REG_4, BPF_REG_1),
+	BPF_ALU32_REG(BPF_RSH, BPF_REG_4, BPF_REG_0),
+	BPF_JMP_IMM(BPF_JEQ, BPF_REG_4, 0xff, 1),
+	BPF_EXIT_INSN(),
+	BPF_ALU64_REG(BPF_ARSH, BPF_REG_4, BPF_REG_4),
+	BPF_JMP_IMM(BPF_JEQ, BPF_REG_4, 0, 1),
+	BPF_EXIT_INSN(),
+	BPF_MOV64_IMM(BPF_REG_0, 2),
+	BPF_EXIT_INSN(),
+	},
+	.result = ACCEPT,
+	.retval = 2,
+},
+{
 	"jit: mov32 for ldimm64, 1",
 	.insns = {
 	BPF_MOV64_IMM(BPF_REG_0, 2),
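To see why the new test expects `0x3fc`, `0xff`, `0`, and finally `.retval = 2`, the program can be traced by hand. The sketch below is a plain C transliteration of the instruction sequence, not kernel code. It assumes the usual BPF shift semantics: the shift count is masked to 6 bits for 64-bit operations and 5 bits for 32-bit operations, and 32-bit ALU results are zero-extended into the 64-bit register. The function name `run_shift_test` is hypothetical.

```c
#include <stdint.h>

/*
 * Hand-translation of "jit: lsh, rsh, arsh by reg".
 * Every early return mirrors a BPF_EXIT_INSN() that is reached when the
 * preceding JEQ does not skip it; r0 still holds 1 at those points.
 */
int run_shift_test(void)
{
	uint64_t r0 = 1, r4 = 1, r1 = 0xff;

	r1 = r1 << (r0 & 63);				/* ALU64 LSH: 0xff  -> 0x1fe */
	r1 = (uint32_t)((uint32_t)r1 << (r4 & 31));	/* ALU32 LSH: 0x1fe -> 0x3fc */
	if (r1 != 0x3fc)
		return (int)r0;

	r1 = r1 >> (r4 & 63);				/* ALU64 RSH: 0x3fc -> 0x1fe */
	r4 = r1;
	r4 = (uint32_t)r4 >> (r0 & 31);			/* ALU32 RSH: 0x1fe -> 0xff */
	if (r4 != 0xff)
		return (int)r0;

	/*
	 * ALU64 ARSH with source == destination: the count is 0xff,
	 * masked to 63, so the non-negative value 0xff shifts down to 0.
	 */
	r4 = (uint64_t)((int64_t)r4 >> (r4 & 63));
	if (r4 != 0)
		return (int)r0;

	r0 = 2;
	return (int)r0;
}
```

The last shift is the interesting case: it uses the same register as both value and count, so a JIT that overwrites the destination before reading the count would produce a different result and fail the final comparison.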
+6
tools/testing/selftests/bpf/vmtest.sh
···
 	QEMU_FLAGS=(-cpu host -smp 8)
 	BZIMAGE="arch/x86/boot/bzImage"
 	;;
+aarch64)
+	QEMU_BINARY=qemu-system-aarch64
+	QEMU_CONSOLE="ttyAMA0,115200"
+	QEMU_FLAGS=(-M virt,gic-version=3 -cpu host -smp 8)
+	BZIMAGE="arch/arm64/boot/Image"
+	;;
 *)
 	echo "Unsupported architecture"
 	exit 1