Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge branch 'irq-save-restore'

Kumar Kartikeya Dwivedi says:

====================
IRQ save/restore

This set introduces support for managing IRQ state from BPF programs.
Two new kfuncs, bpf_local_irq_save and bpf_local_irq_restore, are
introduced to enable this functionality.

Intended use cases are writing IRQ-safe data structures (e.g. a memory
allocator) natively in BPF programs, and use in new spin locking
primitives intended to be introduced in the next few weeks.

The set begins with some refactoring patches before the actual
functionality is introduced. Patch 1 consolidates all resource-related
state in bpf_verifier_state, and moves it out of bpf_func_state.

Patch 2 refactors the acquire and release functions for reference state
to make them reusable without duplication for other resource types.

After this, patch 3 refactors the stack slot liveness marking logic to
be shared between dynptrs and iterators, in preparation for introducing
the same logic for the irq flag object on the stack.

Finally, patches 4 and 7 introduce the new kfuncs and their selftests. For
more details, please inspect the patch commit logs. Patch 5 makes the
error message in case of resource leaks under BPF_EXIT a bit clearer.
Patch 6 expands coverage of existing preempt-disable selftest to cover
sleepable kfuncs.

See individual patches for more details.

Changelog:
----------
v5 -> v6
v5: https://lore.kernel.org/bpf/20241129001632.3828611-1-memxor@gmail.com

* Add Eduard's Acked-by on patch 2
* Remove gen_id parameter to acquire_reference_state (Alexei)
* Remove space before REF_TYPE_LOCK (Alexei)
* Fix link to v4 in changelog

v4 -> v5
v4: https://lore.kernel.org/bpf/20241127213535.3657472-1-memxor@gmail.com

* Do regno - 1 when printing argument
* Pass verifier state explicitly into print_{insn,verifier}_state (Eduard)
* Pass frameno instead of bpf_func_state (Eduard)
* Move bpf_reference_state *refs after parent to fill two holes in
  bpf_verifier_state (Eduard). The hunk fixing that bug is in the
  commit adding IRQ save/restore kfuncs, as it is only needed then.
* Fix bug in release_reference_state breaking stack property (Eduard)
* Add selftest irq_ooo_refs_array in the final patch to trigger and
  reproduce the bug found by Eduard
* Print insn_idx and active_irq_id on error (Eduard)
* Add more acks

v3 -> v4
v3: https://lore.kernel.org/bpf/20241127165846.2001009-1-memxor@gmail.com

* Add yet another missing kfunc declaration to silence s390 CI

v2 -> v3
v2: https://lore.kernel.org/bpf/20241127153306.1484562-1-memxor@gmail.com

* Drop REF_TYPE_LOCK_MASK
* Add kfunc declarations to selftest to silence s390 CI errors

v1 -> v2
v1: https://lore.kernel.org/bpf/20241121005329.408873-1-memxor@gmail.com

* Drop reference -> resource renaming in the verifier (Eduard, Alexei)
* Change verifier log for check_resource_leak for BPF_EXIT (Eduard)
* Remove id parameter from acquire_resource_state, read s->id (Eduard)
* Rename erase to release for reference state (Eduard)
* Move resource state to bpf_verifier_state (Eduard, Alexei)
* Drop unnecessary casting to/from u64 in helpers (Eduard)
* Add test for arg != PTR_TO_STACK (Eduard)
* Drop now redundant tests (Eduard)
* Address some other misc nits
* Add Reviewed-by and Acked-by from Eduard
====================

Link: https://patch.msgid.link/20241204030400.208005-1-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
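
The ordering rule that the verifier changes below enforce (an IRQ restore must match the most recent IRQ save) can be sketched in user space. This is a simplified model under stated assumptions, not the kernel code: `struct model_state`, `model_irq_save`, `model_irq_restore`, and `MAX_REFS` are hypothetical names, and the model tracks only IRQ ids, whereas the real refs array also holds pointer and lock references.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

#define MAX_REFS 8	/* hypothetical capacity; the kernel array grows dynamically */

/* Hypothetical, simplified stand-in for the verifier's bookkeeping. */
struct model_state {
	int ids[MAX_REFS];	/* acquired IRQ ids, oldest first (used as a stack) */
	int nr;			/* number of live entries */
	int active_irq_id;	/* id of the innermost saved IRQ state, 0 if none */
	int id_gen;		/* monotonically increasing id source */
};

/* Model of an IRQ save: push a fresh id and make it the active one. */
static int model_irq_save(struct model_state *s)
{
	if (s->nr == MAX_REFS)
		return -ENOMEM;
	s->ids[s->nr++] = ++s->id_gen;
	s->active_irq_id = s->id_gen;
	return s->active_irq_id;
}

/* Model of an IRQ restore: only the innermost id may be released. */
static int model_irq_restore(struct model_state *s, int id)
{
	int prev_id = 0;
	int i;

	assert(s->nr <= MAX_REFS);
	if (id != s->active_irq_id)
		return -EACCES;	/* out-of-order restore is rejected */

	for (i = 0; i < s->nr; i++) {
		if (s->ids[i] != id) {
			prev_id = s->ids[i];
			continue;
		}
		/* Shift the tail down so the remaining order is preserved. */
		memmove(&s->ids[i], &s->ids[i + 1],
			sizeof(s->ids[0]) * (s->nr - i - 1));
		s->nr--;
		/* The previous entry, if any, becomes the active one again. */
		s->active_irq_id = prev_id;
		return 0;
	}
	return -EINVAL;
}
```

Preserving element order on removal is what lets the previous id be recovered by a linear scan; swapping the last element into the freed slot would break that property.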

+959 -178
+16 -10
include/linux/bpf_verifier.h
···
 	 */
 	STACK_DYNPTR,
 	STACK_ITER,
+	STACK_IRQ_FLAG,
 };
 
 #define BPF_REG_SIZE 8	/* size of eBPF register in bytes */
···
 	 * default to pointer reference on zero initialization of a state.
 	 */
 	enum ref_state_type {
-		REF_TYPE_PTR = 0,
-		REF_TYPE_LOCK,
+		REF_TYPE_PTR = 1,
+		REF_TYPE_IRQ = 2,
+		REF_TYPE_LOCK = 3,
 	} type;
 	/* Track each reference created with a unique id, even if the same
 	 * instruction creates the reference multiple times (eg, via CALL).
···
 	u32 callback_depth;
 
 	/* The following fields should be last. See copy_func_state() */
-	int acquired_refs;
-	int active_locks;
-	struct bpf_reference_state *refs;
 	/* The state of the stack. Each element of the array describes BPF_REG_SIZE
 	 * (i.e. 8) bytes worth of stack memory.
 	 * stack[0] represents bytes [*(r10-8)..*(r10-1)]
···
 	/* call stack tracking */
 	struct bpf_func_state *frame[MAX_CALL_FRAMES];
 	struct bpf_verifier_state *parent;
+	/* Acquired reference states */
+	struct bpf_reference_state *refs;
 	/*
 	 * 'branches' field is the number of branches left to explore:
 	 * 0 - all possible paths from this state reached bpf_exit or
···
 	u32 insn_idx;
 	u32 curframe;
 
-	bool speculative;
+	u32 acquired_refs;
+	u32 active_locks;
+	u32 active_preempt_locks;
+	u32 active_irq_id;
 	bool active_rcu_lock;
-	u32 active_preempt_lock;
+
+	bool speculative;
 	/* If this state was ever pointed-to by other state's loop_entry field
 	 * this flag would be set to true. Used to avoid freeing such states
 	 * while they are still in use.
···
 const char *iter_type_str(const struct btf *btf, u32 btf_id);
 const char *iter_state_str(enum bpf_iter_state state);
 
-void print_verifier_state(struct bpf_verifier_env *env,
-			  const struct bpf_func_state *state, bool print_all);
-void print_insn_state(struct bpf_verifier_env *env, const struct bpf_func_state *state);
+void print_verifier_state(struct bpf_verifier_env *env, const struct bpf_verifier_state *vstate,
+			  u32 frameno, bool print_all);
+void print_insn_state(struct bpf_verifier_env *env, const struct bpf_verifier_state *vstate,
+		      u32 frameno);
 
 #endif /* _LINUX_BPF_VERIFIER_H */
+17
kernel/bpf/helpers.c
···
 	return ret + 1;
 }
 
+/* Keep unsigned long in prototype so that kfunc is usable when emitted to
+ * vmlinux.h in BPF programs directly, but note that while in BPF prog, the
+ * unsigned long always points to 8-byte region on stack, the kernel may only
+ * read and write the 4-bytes on 32-bit.
+ */
+__bpf_kfunc void bpf_local_irq_save(unsigned long *flags__irq_flag)
+{
+	local_irq_save(*flags__irq_flag);
+}
+
+__bpf_kfunc void bpf_local_irq_restore(unsigned long *flags__irq_flag)
+{
+	local_irq_restore(*flags__irq_flag);
+}
+
 __bpf_kfunc_end_defs();
 
 BTF_KFUNCS_START(generic_btf_ids)
···
 BTF_ID_FLAGS(func, bpf_iter_kmem_cache_new, KF_ITER_NEW | KF_SLEEPABLE)
 BTF_ID_FLAGS(func, bpf_iter_kmem_cache_next, KF_ITER_NEXT | KF_RET_NULL | KF_SLEEPABLE)
 BTF_ID_FLAGS(func, bpf_iter_kmem_cache_destroy, KF_ITER_DESTROY | KF_SLEEPABLE)
+BTF_ID_FLAGS(func, bpf_local_irq_save)
+BTF_ID_FLAGS(func, bpf_local_irq_restore)
 BTF_KFUNCS_END(common_btf_ids)
 
 static const struct btf_kfunc_id_set common_kfunc_set = {
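
The two kfuncs in this file wrap local_irq_save() and local_irq_restore(), so nested sections only re-enable interrupts once the outermost flags are restored. A minimal user-space sketch of that behaviour follows; it is an illustration only, and `model_irqs_enabled`, `model_local_irq_save`, `model_local_irq_restore`, and `model_nested_section` are hypothetical names that stand in for the CPU interrupt state and the kernel macros.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the CPU's "interrupts enabled" bit. */
static bool model_irqs_enabled = true;

/* Model of a save: remember the current state, then disable. */
static void model_local_irq_save(unsigned long *flags)
{
	*flags = model_irqs_enabled;
	model_irqs_enabled = false;
}

/* Model of a restore: return to whatever state was saved in flags. */
static void model_local_irq_restore(unsigned long *flags)
{
	model_irqs_enabled = (*flags != 0);
}

/* Two nested sections; returns the state seen after the inner restore. */
static bool model_nested_section(void)
{
	unsigned long outer, inner;
	bool after_inner;

	model_local_irq_save(&outer);
	model_local_irq_save(&inner);
	model_local_irq_restore(&inner);	/* inner saved "disabled" */
	after_inner = model_irqs_enabled;
	model_local_irq_restore(&outer);	/* outer saved the original state */
	return after_inner;
}
```

In the model, each flags variable lives on the caller's stack, which mirrors how the kfuncs take a pointer to a stack slot in the BPF program.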
+12 -9
kernel/bpf/log.c
···
 	[STACK_ZERO]	= '0',
 	[STACK_DYNPTR]	= 'd',
 	[STACK_ITER]	= 'i',
+	[STACK_IRQ_FLAG] = 'f'
 };
 
 static void print_liveness(struct bpf_verifier_env *env,
···
 	verbose(env, ")");
 }
 
-void print_verifier_state(struct bpf_verifier_env *env, const struct bpf_func_state *state,
-			  bool print_all)
+void print_verifier_state(struct bpf_verifier_env *env, const struct bpf_verifier_state *vstate,
+			  u32 frameno, bool print_all)
 {
+	const struct bpf_func_state *state = vstate->frame[frameno];
 	const struct bpf_reg_state *reg;
 	int i;
 
···
 			break;
 		}
 	}
-	if (state->acquired_refs && state->refs[0].id) {
-		verbose(env, " refs=%d", state->refs[0].id);
-		for (i = 1; i < state->acquired_refs; i++)
-			if (state->refs[i].id)
-				verbose(env, ",%d", state->refs[i].id);
+	if (vstate->acquired_refs && vstate->refs[0].id) {
+		verbose(env, " refs=%d", vstate->refs[0].id);
+		for (i = 1; i < vstate->acquired_refs; i++)
+			if (vstate->refs[i].id)
+				verbose(env, ",%d", vstate->refs[i].id);
 	}
 	if (state->in_callback_fn)
 		verbose(env, " cb");
···
 			BPF_LOG_MIN_ALIGNMENT) - pos - 1;
 }
 
-void print_insn_state(struct bpf_verifier_env *env, const struct bpf_func_state *state)
+void print_insn_state(struct bpf_verifier_env *env, const struct bpf_verifier_state *vstate,
+		      u32 frameno)
 {
 	if (env->prev_log_pos && env->prev_log_pos == env->log.end_pos) {
 		/* remove new line character */
···
 	} else {
 		verbose(env, "%d:", env->insn_idx);
 	}
-	print_verifier_state(env, state, false);
+	print_verifier_state(env, vstate, frameno, false);
 }
+444 -149
kernel/bpf/verifier.c
···
 
 #define BPF_PRIV_STACK_MIN_SIZE		64
 
-static int acquire_reference_state(struct bpf_verifier_env *env, int insn_idx);
+static int acquire_reference(struct bpf_verifier_env *env, int insn_idx);
+static int release_reference_nomark(struct bpf_verifier_state *state, int ref_obj_id);
 static int release_reference(struct bpf_verifier_env *env, int ref_obj_id);
 static void invalidate_non_owning_refs(struct bpf_verifier_env *env);
 static bool in_rbtree_lock_required_cb(struct bpf_verifier_env *env);
···
 	return stack_slot_obj_get_spi(env, reg, "iter", nr_slots);
 }
 
+static int irq_flag_get_spi(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
+{
+	return stack_slot_obj_get_spi(env, reg, "irq_flag", 1);
+}
+
 static enum bpf_dynptr_type arg_to_dynptr_type(enum bpf_arg_type arg_type)
 {
 	switch (arg_type & DYNPTR_TYPE_FLAG_MASK) {
···
 		if (clone_ref_obj_id)
 			id = clone_ref_obj_id;
 		else
-			id = acquire_reference_state(env, insn_idx);
+			id = acquire_reference(env, insn_idx);
 
 		if (id < 0)
 			return id;
···
 	if (spi < 0)
 		return spi;
 
-	id = acquire_reference_state(env, insn_idx);
+	id = acquire_reference(env, insn_idx);
 	if (id < 0)
 		return id;
 
···
 	return 0;
 }
 
+static int acquire_irq_state(struct bpf_verifier_env *env, int insn_idx);
+static int release_irq_state(struct bpf_verifier_state *state, int id);
+
+static int mark_stack_slot_irq_flag(struct bpf_verifier_env *env,
+				     struct bpf_kfunc_call_arg_meta *meta,
+				     struct bpf_reg_state *reg, int insn_idx)
+{
+	struct bpf_func_state *state = func(env, reg);
+	struct bpf_stack_state *slot;
+	struct bpf_reg_state *st;
+	int spi, i, id;
+
+	spi = irq_flag_get_spi(env, reg);
+	if (spi < 0)
+		return spi;
+
+	id = acquire_irq_state(env, insn_idx);
+	if (id < 0)
+		return id;
+
+	slot = &state->stack[spi];
+	st = &slot->spilled_ptr;
+
+	__mark_reg_known_zero(st);
+	st->type = PTR_TO_STACK; /* we don't have dedicated reg type */
+	st->live |= REG_LIVE_WRITTEN;
+	st->ref_obj_id = id;
+
+	for (i = 0; i < BPF_REG_SIZE; i++)
+		slot->slot_type[i] = STACK_IRQ_FLAG;
+
+	mark_stack_slot_scratched(env, spi);
+	return 0;
+}
+
+static int unmark_stack_slot_irq_flag(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
+{
+	struct bpf_func_state *state = func(env, reg);
+	struct bpf_stack_state *slot;
+	struct bpf_reg_state *st;
+	int spi, i, err;
+
+	spi = irq_flag_get_spi(env, reg);
+	if (spi < 0)
+		return spi;
+
+	slot = &state->stack[spi];
+	st = &slot->spilled_ptr;
+
+	err = release_irq_state(env->cur_state, st->ref_obj_id);
+	WARN_ON_ONCE(err && err != -EACCES);
+	if (err) {
+		int insn_idx = 0;
+
+		for (int i = 0; i < env->cur_state->acquired_refs; i++) {
+			if (env->cur_state->refs[i].id == env->cur_state->active_irq_id) {
+				insn_idx = env->cur_state->refs[i].insn_idx;
+				break;
+			}
+		}
+
+		verbose(env, "cannot restore irq state out of order, expected id=%d acquired at insn_idx=%d\n",
+			env->cur_state->active_irq_id, insn_idx);
+		return err;
+	}
+
+	__mark_reg_not_init(env, st);
+
+	/* see unmark_stack_slots_dynptr() for why we need to set REG_LIVE_WRITTEN */
+	st->live |= REG_LIVE_WRITTEN;
+
+	for (i = 0; i < BPF_REG_SIZE; i++)
+		slot->slot_type[i] = STACK_INVALID;
+
+	mark_stack_slot_scratched(env, spi);
+	return 0;
+}
+
+static bool is_irq_flag_reg_valid_uninit(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
+{
+	struct bpf_func_state *state = func(env, reg);
+	struct bpf_stack_state *slot;
+	int spi, i;
+
+	/* For -ERANGE (i.e. spi not falling into allocated stack slots), we
+	 * will do check_mem_access to check and update stack bounds later, so
+	 * return true for that case.
+	 */
+	spi = irq_flag_get_spi(env, reg);
+	if (spi == -ERANGE)
+		return true;
+	if (spi < 0)
+		return false;
+
+	slot = &state->stack[spi];
+
+	for (i = 0; i < BPF_REG_SIZE; i++)
+		if (slot->slot_type[i] == STACK_IRQ_FLAG)
+			return false;
+	return true;
+}
+
+static int is_irq_flag_reg_valid_init(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
+{
+	struct bpf_func_state *state = func(env, reg);
+	struct bpf_stack_state *slot;
+	struct bpf_reg_state *st;
+	int spi, i;
+
+	spi = irq_flag_get_spi(env, reg);
+	if (spi < 0)
+		return -EINVAL;
+
+	slot = &state->stack[spi];
+	st = &slot->spilled_ptr;
+
+	if (!st->ref_obj_id)
+		return -EINVAL;
+
+	for (i = 0; i < BPF_REG_SIZE; i++)
+		if (slot->slot_type[i] != STACK_IRQ_FLAG)
+			return -EINVAL;
+	return 0;
+}
+
 /* Check if given stack slot is "special":
  * - spilled register state (STACK_SPILL);
  * - dynptr state (STACK_DYNPTR);
  * - iter state (STACK_ITER).
+ * - irq flag state (STACK_IRQ_FLAG)
  */
 static bool is_stack_slot_special(const struct bpf_stack_state *stack)
 {
···
 	case STACK_SPILL:
 	case STACK_DYNPTR:
 	case STACK_ITER:
+	case STACK_IRQ_FLAG:
 		return true;
 	case STACK_INVALID:
 	case STACK_MISC:
···
 	return arr ? arr : ZERO_SIZE_PTR;
 }
 
-static int copy_reference_state(struct bpf_func_state *dst, const struct bpf_func_state *src)
+static int copy_reference_state(struct bpf_verifier_state *dst, const struct bpf_verifier_state *src)
 {
 	dst->refs = copy_array(dst->refs, src->refs, src->acquired_refs,
 			       sizeof(struct bpf_reference_state), GFP_KERNEL);
 	if (!dst->refs)
 		return -ENOMEM;
 
-	dst->active_locks = src->active_locks;
 	dst->acquired_refs = src->acquired_refs;
+	dst->active_locks = src->active_locks;
+	dst->active_preempt_locks = src->active_preempt_locks;
+	dst->active_rcu_lock = src->active_rcu_lock;
+	dst->active_irq_id = src->active_irq_id;
 	return 0;
 }
 
···
 	return 0;
 }
 
-static int resize_reference_state(struct bpf_func_state *state, size_t n)
+static int resize_reference_state(struct bpf_verifier_state *state, size_t n)
 {
 	state->refs = realloc_array(state->refs, state->acquired_refs, n,
 				    sizeof(struct bpf_reference_state));
···
  * On success, returns a valid pointer id to associate with the register
  * On failure, returns a negative errno.
  */
-static int acquire_reference_state(struct bpf_verifier_env *env, int insn_idx)
+static struct bpf_reference_state *acquire_reference_state(struct bpf_verifier_env *env, int insn_idx)
 {
-	struct bpf_func_state *state = cur_func(env);
-	int new_ofs = state->acquired_refs;
-	int id, err;
-
-	err = resize_reference_state(state, state->acquired_refs + 1);
-	if (err)
-		return err;
-	id = ++env->id_gen;
-	state->refs[new_ofs].type = REF_TYPE_PTR;
-	state->refs[new_ofs].id = id;
-	state->refs[new_ofs].insn_idx = insn_idx;
-
-	return id;
-}
-
-static int acquire_lock_state(struct bpf_verifier_env *env, int insn_idx, enum ref_state_type type,
-			      int id, void *ptr)
-{
-	struct bpf_func_state *state = cur_func(env);
+	struct bpf_verifier_state *state = env->cur_state;
 	int new_ofs = state->acquired_refs;
 	int err;
 
 	err = resize_reference_state(state, state->acquired_refs + 1);
 	if (err)
-		return err;
-	state->refs[new_ofs].type = type;
-	state->refs[new_ofs].id = id;
+		return NULL;
 	state->refs[new_ofs].insn_idx = insn_idx;
-	state->refs[new_ofs].ptr = ptr;
+
+	return &state->refs[new_ofs];
+}
+
+static int acquire_reference(struct bpf_verifier_env *env, int insn_idx)
+{
+	struct bpf_reference_state *s;
+
+	s = acquire_reference_state(env, insn_idx);
+	if (!s)
+		return -ENOMEM;
+	s->type = REF_TYPE_PTR;
+	s->id = ++env->id_gen;
+	return s->id;
+}
+
+static int acquire_lock_state(struct bpf_verifier_env *env, int insn_idx, enum ref_state_type type,
+			      int id, void *ptr)
+{
+	struct bpf_verifier_state *state = env->cur_state;
+	struct bpf_reference_state *s;
+
+	s = acquire_reference_state(env, insn_idx);
+	s->type = type;
+	s->id = id;
+	s->ptr = ptr;
 
 	state->active_locks++;
 	return 0;
 }
 
-/* release function corresponding to acquire_reference_state(). Idempotent. */
-static int release_reference_state(struct bpf_func_state *state, int ptr_id)
+static int acquire_irq_state(struct bpf_verifier_env *env, int insn_idx)
 {
-	int i, last_idx;
+	struct bpf_verifier_state *state = env->cur_state;
+	struct bpf_reference_state *s;
 
-	last_idx = state->acquired_refs - 1;
-	for (i = 0; i < state->acquired_refs; i++) {
-		if (state->refs[i].type != REF_TYPE_PTR)
-			continue;
-		if (state->refs[i].id == ptr_id) {
-			if (last_idx && i != last_idx)
-				memcpy(&state->refs[i], &state->refs[last_idx],
-				       sizeof(*state->refs));
-			memset(&state->refs[last_idx], 0, sizeof(*state->refs));
-			state->acquired_refs--;
-			return 0;
-		}
-	}
-	return -EINVAL;
+	s = acquire_reference_state(env, insn_idx);
+	if (!s)
+		return -ENOMEM;
+	s->type = REF_TYPE_IRQ;
+	s->id = ++env->id_gen;
+
+	state->active_irq_id = s->id;
+	return s->id;
 }
 
-static int release_lock_state(struct bpf_func_state *state, int type, int id, void *ptr)
+static void release_reference_state(struct bpf_verifier_state *state, int idx)
 {
-	int i, last_idx;
+	int last_idx;
+	size_t rem;
 
+	/* IRQ state requires the relative ordering of elements remaining the
+	 * same, since it relies on the refs array to behave as a stack, so that
+	 * it can detect out-of-order IRQ restore. Hence use memmove to shift
+	 * the array instead of swapping the final element into the deleted idx.
+	 */
 	last_idx = state->acquired_refs - 1;
+	rem = state->acquired_refs - idx - 1;
+	if (last_idx && idx != last_idx)
+		memmove(&state->refs[idx], &state->refs[idx + 1], sizeof(*state->refs) * rem);
+	memset(&state->refs[last_idx], 0, sizeof(*state->refs));
+	state->acquired_refs--;
+	return;
+}
+
+static int release_lock_state(struct bpf_verifier_state *state, int type, int id, void *ptr)
+{
+	int i;
+
 	for (i = 0; i < state->acquired_refs; i++) {
 		if (state->refs[i].type != type)
 			continue;
 		if (state->refs[i].id == id && state->refs[i].ptr == ptr) {
-			if (last_idx && i != last_idx)
-				memcpy(&state->refs[i], &state->refs[last_idx],
-				       sizeof(*state->refs));
-			memset(&state->refs[last_idx], 0, sizeof(*state->refs));
-			state->acquired_refs--;
+			release_reference_state(state, i);
 			state->active_locks--;
 			return 0;
 		}
···
 	return -EINVAL;
 }
 
-static struct bpf_reference_state *find_lock_state(struct bpf_verifier_env *env, enum ref_state_type type,
+static int release_irq_state(struct bpf_verifier_state *state, int id)
+{
+	u32 prev_id = 0;
+	int i;
+
+	if (id != state->active_irq_id)
+		return -EACCES;
+
+	for (i = 0; i < state->acquired_refs; i++) {
+		if (state->refs[i].type != REF_TYPE_IRQ)
+			continue;
+		if (state->refs[i].id == id) {
+			release_reference_state(state, i);
+			state->active_irq_id = prev_id;
+			return 0;
+		} else {
+			prev_id = state->refs[i].id;
+		}
+	}
+	return -EINVAL;
+}
+
+static struct bpf_reference_state *find_lock_state(struct bpf_verifier_state *state, enum ref_state_type type,
 						   int id, void *ptr)
 {
-	struct bpf_func_state *state = cur_func(env);
 	int i;
 
 	for (i = 0; i < state->acquired_refs; i++) {
 		struct bpf_reference_state *s = &state->refs[i];
 
-		if (s->type == REF_TYPE_PTR || s->type != type)
+		if (s->type != type)
 			continue;
 
 		if (s->id == id && s->ptr == ptr)
···
 {
 	if (!state)
 		return;
-	kfree(state->refs);
 	kfree(state->stack);
 	kfree(state);
 }
···
 		free_func_state(state->frame[i]);
 		state->frame[i] = NULL;
 	}
+	kfree(state->refs);
 	if (free_self)
 		kfree(state);
 }
···
 static int copy_func_state(struct bpf_func_state *dst,
 			   const struct bpf_func_state *src)
 {
-	int err;
-
-	memcpy(dst, src, offsetof(struct bpf_func_state, acquired_refs));
-	err = copy_reference_state(dst, src);
-	if (err)
-		return err;
+	memcpy(dst, src, offsetof(struct bpf_func_state, stack));
 	return copy_stack_state(dst, src);
 }
 
···
 		free_func_state(dst_state->frame[i]);
 		dst_state->frame[i] = NULL;
 	}
+	err = copy_reference_state(dst_state, src);
+	if (err)
+		return err;
 	dst_state->speculative = src->speculative;
-	dst_state->active_rcu_lock = src->active_rcu_lock;
-	dst_state->active_preempt_lock = src->active_preempt_lock;
 	dst_state->in_sleepable = src->in_sleepable;
 	dst_state->curframe = src->curframe;
 	dst_state->branches = src->branches;
···
 	return 0;
 }
 
-static int mark_dynptr_read(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
+static int mark_stack_slot_obj_read(struct bpf_verifier_env *env, struct bpf_reg_state *reg,
+				    int spi, int nr_slots)
 {
 	struct bpf_func_state *state = func(env, reg);
-	int spi, ret;
+	int err, i;
+
+	for (i = 0; i < nr_slots; i++) {
+		struct bpf_reg_state *st = &state->stack[spi - i].spilled_ptr;
+
+		err = mark_reg_read(env, st, st->parent, REG_LIVE_READ64);
+		if (err)
+			return err;
+
+		mark_stack_slot_scratched(env, spi - i);
+	}
+	return 0;
+}
+
+static int mark_dynptr_read(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
+{
+	int spi;
 
 	/* For CONST_PTR_TO_DYNPTR, it must have already been done by
 	 * check_reg_arg in check_helper_call and mark_btf_func_reg_size in
···
 	 * bounds and spi is the first dynptr slot. Simply mark stack slot as
 	 * read.
 	 */
-	ret = mark_reg_read(env, &state->stack[spi].spilled_ptr,
-			    state->stack[spi].spilled_ptr.parent, REG_LIVE_READ64);
-	if (ret)
-		return ret;
-	return mark_reg_read(env, &state->stack[spi - 1].spilled_ptr,
-			     state->stack[spi - 1].spilled_ptr.parent, REG_LIVE_READ64);
+	return mark_stack_slot_obj_read(env, reg, spi, BPF_DYNPTR_NR_SLOTS);
 }
 
 static int mark_iter_read(struct bpf_verifier_env *env, struct bpf_reg_state *reg,
 			  int spi, int nr_slots)
 {
-	struct bpf_func_state *state = func(env, reg);
-	int err, i;
+	return mark_stack_slot_obj_read(env, reg, spi, nr_slots);
+}
 
-	for (i = 0; i < nr_slots; i++) {
-		struct bpf_reg_state *st = &state->stack[spi - i].spilled_ptr;
+static int mark_irq_flag_read(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
+{
+	int spi;
 
-		err = mark_reg_read(env, st, st->parent, REG_LIVE_READ64);
-		if (err)
-			return err;
-
-		mark_stack_slot_scratched(env, spi - i);
-	}
-
-	return 0;
+	spi = irq_flag_get_spi(env, reg);
+	if (spi < 0)
+		return spi;
+	return mark_stack_slot_obj_read(env, reg, spi, 1);
 }
 
 /* This function is supposed to be used by the following 32-bit optimization
···
 				fmt_stack_mask(env->tmp_str_buf, TMP_STR_BUF_LEN,
 					       bt_frame_stack_mask(bt, fr));
 				verbose(env, "stack=%s: ", env->tmp_str_buf);
-				print_verifier_state(env, func, true);
+				print_verifier_state(env, st, fr, true);
 			}
 		}
 
···
 static bool in_rcu_cs(struct bpf_verifier_env *env)
 {
 	return env->cur_state->active_rcu_lock ||
-	       cur_func(env)->active_locks ||
+	       env->cur_state->active_locks ||
 	       !in_sleepable(env);
 }
 
···
  * Since only one bpf_spin_lock is allowed the checks are simpler than
  * reg_is_refcounted() logic. The verifier needs to remember only
  * one spin_lock instead of array of acquired_refs.
- * cur_func(env)->active_locks remembers which map value element or allocated
+ * env->cur_state->active_locks remembers which map value element or allocated
  * object got locked and clears it after bpf_spin_unlock.
  */
 static int process_spin_lock(struct bpf_verifier_env *env, int regno,
 			     bool is_lock)
 {
 	struct bpf_reg_state *regs = cur_regs(env), *reg = &regs[regno];
+	struct bpf_verifier_state *cur = env->cur_state;
 	bool is_const = tnum_is_const(reg->var_off);
-	struct bpf_func_state *cur = cur_func(env);
 	u64 val = reg->var_off.value;
 	struct bpf_map *map = NULL;
 	struct btf *btf = NULL;
···
 			return -EINVAL;
 		}
 
-		if (release_lock_state(cur_func(env), REF_TYPE_LOCK, reg->id, ptr)) {
+		if (release_lock_state(env->cur_state, REF_TYPE_LOCK, reg->id, ptr)) {
 			verbose(env, "bpf_spin_unlock of different lock\n");
 			return -EINVAL;
 		}
···
 		reg->range = AT_PKT_END;
 }
 
+static int release_reference_nomark(struct bpf_verifier_state *state, int ref_obj_id)
+{
+	int i;
+
+	for (i = 0; i < state->acquired_refs; i++) {
+		if (state->refs[i].type != REF_TYPE_PTR)
+			continue;
+		if (state->refs[i].id == ref_obj_id) {
+			release_reference_state(state, i);
+			return 0;
+		}
+	}
+	return -EINVAL;
+}
+
 /* The pointer with the specified id has released its reference to kernel
  * resources. Identify all copies of the same pointer and clear the reference.
+ *
+ * This is the release function corresponding to acquire_reference(). Idempotent.
  */
-static int release_reference(struct bpf_verifier_env *env,
-			     int ref_obj_id)
+static int release_reference(struct bpf_verifier_env *env, int ref_obj_id)
 {
+	struct bpf_verifier_state *vstate = env->cur_state;
 	struct bpf_func_state *state;
 	struct bpf_reg_state *reg;
 	int err;
 
-	err = release_reference_state(cur_func(env), ref_obj_id);
+	err = release_reference_nomark(vstate, ref_obj_id);
 	if (err)
 		return err;
 
-	bpf_for_each_reg_in_vstate(env->cur_state, state, reg, ({
+	bpf_for_each_reg_in_vstate(vstate, state, reg, ({
 		if (reg->ref_obj_id == ref_obj_id)
 			mark_reg_invalid(env, reg);
 	}));
···
 			callsite,
 			state->curframe + 1 /* frameno within this callchain */,
 			subprog /* subprog number within this prog */);
-	/* Transfer references to the callee */
-	err = copy_reference_state(callee, caller);
-	err = err ?: set_callee_state_cb(env, caller, callee, callsite);
+	err = set_callee_state_cb(env, caller, callee, callsite);
 	if (err)
 		goto err_out;
 
···
 		const char *sub_name = subprog_name(env, subprog);
 
 		/* Only global subprogs cannot be called with a lock held. */
-		if (cur_func(env)->active_locks) {
+		if (env->cur_state->active_locks) {
 			verbose(env, "global function calls are not allowed while holding a lock,\n"
 				     "use static function instead\n");
 			return -EINVAL;
 		}
 
 		/* Only global subprogs cannot be called with preemption disabled. */
-		if (env->cur_state->active_preempt_lock) {
+		if (env->cur_state->active_preempt_locks) {
 			verbose(env, "global function calls are not allowed with preemption disabled,\n"
+				     "use static function instead\n");
+			return -EINVAL;
+		}
+
+		if (env->cur_state->active_irq_id) {
+			verbose(env, "global function calls are not allowed with IRQs disabled,\n"
 				     "use static function instead\n");
 			return -EINVAL;
 		}
···
 
 	if (env->log.level & BPF_LOG_LEVEL) {
 		verbose(env, "caller:\n");
-		print_verifier_state(env, caller, true);
+		print_verifier_state(env, state, caller->frameno, true);
 		verbose(env, "callee:\n");
-		print_verifier_state(env, state->frame[state->curframe], true);
+		print_verifier_state(env, state, state->curframe, true);
 	}
 
 	return 0;
···
 		caller->regs[BPF_REG_0] = *r0;
 	}
 
-	/* Transfer references to the caller */
-	err = copy_reference_state(caller, callee);
-	if (err)
-		return err;
-
 	/* for callbacks like bpf_loop or bpf_for_each_map_elem go back to callsite,
 	 * there function call logic would reschedule callback visit. If iteration
 	 * converges is_state_visited() would prune that visit eventually.
···
 
 	if (env->log.level & BPF_LOG_LEVEL) {
 		verbose(env, "returning from callee:\n");
-		print_verifier_state(env, callee, true);
+		print_verifier_state(env, state, callee->frameno, true);
 		verbose(env, "to caller at %d:\n", *insn_idx);
-		print_verifier_state(env, caller, true);
+		print_verifier_state(env, state, caller->frameno, true);
 	}
 	/* clear everything in the callee. In case of exceptional exits using
 	 * bpf_throw, this will be done by copy_verifier_state for extra frames. */
···
 
 static int check_reference_leak(struct bpf_verifier_env *env, bool exception_exit)
 {
-	struct bpf_func_state *state = cur_func(env);
+	struct bpf_verifier_state *state = env->cur_state;
 	bool refs_lingering = false;
 	int i;
 
-	if (!exception_exit && state->frameno)
+	if (!exception_exit && cur_func(env)->frameno)
 		return 0;
 
 	for (i = 0; i < state->acquired_refs; i++) {
···
 {
 	int err;
 
-	if (check_lock && cur_func(env)->active_locks) {
+	if (check_lock && env->cur_state->active_locks) {
 		verbose(env, "%s cannot be used inside bpf_spin_lock-ed region\n", prefix);
 		return -EINVAL;
 	}
···
 		return err;
 	}
 
+	if (check_lock && env->cur_state->active_irq_id) {
+		verbose(env, "%s cannot be used inside bpf_local_irq_save-ed region\n", prefix);
+		return -EINVAL;
+	}
+
 	if (check_lock && env->cur_state->active_rcu_lock) {
 		verbose(env, "%s cannot be used inside bpf_rcu_read_lock-ed region\n", prefix);
 		return -EINVAL;
 	}
 
-	if (check_lock && env->cur_state->active_preempt_lock) {
+	if (check_lock && env->cur_state->active_preempt_locks) {
 		verbose(env, "%s cannot be used inside bpf_preempt_disable-ed region\n", prefix);
 		return -EINVAL;
 	}
···
 		env->insn_aux_data[insn_idx].storage_get_func_atomic = true;
 	}
 
-	if (env->cur_state->active_preempt_lock) {
+	if (env->cur_state->active_preempt_locks) {
 		if (fn->might_sleep) {
 			verbose(env, "sleepable helper %s#%d in non-preemptible region\n",
+				func_id_name(func_id), func_id);
+			return -EINVAL;
+		}
+
+		if (in_sleepable(env) && is_storage_get_function(func_id))
+ env->insn_aux_data[insn_idx].storage_get_func_atomic = true; 10739 + } 10740 + 10741 + if (env->cur_state->active_irq_id) { 10742 + if (fn->might_sleep) { 10743 + verbose(env, "sleepable helper %s#%d in IRQ-disabled region\n", 10929 10744 func_id_name(func_id), func_id); 10930 10745 return -EINVAL; 10931 10746 } ··· 10991 10784 struct bpf_func_state *state; 10992 10785 struct bpf_reg_state *reg; 10993 10786 10994 - err = release_reference_state(cur_func(env), ref_obj_id); 10787 + err = release_reference_nomark(env->cur_state, ref_obj_id); 10995 10788 if (!err) { 10996 10789 bpf_for_each_reg_in_vstate(env->cur_state, state, reg, ({ 10997 10790 if (reg->ref_obj_id == ref_obj_id) { ··· 11324 11117 /* For release_reference() */ 11325 11118 regs[BPF_REG_0].ref_obj_id = meta.ref_obj_id; 11326 11119 } else if (is_acquire_function(func_id, meta.map_ptr)) { 11327 - int id = acquire_reference_state(env, insn_idx); 11120 + int id = acquire_reference(env, insn_idx); 11328 11121 11329 11122 if (id < 0) 11330 11123 return id; ··· 11506 11299 return btf_param_match_suffix(btf, arg, "__str"); 11507 11300 } 11508 11301 11302 + static bool is_kfunc_arg_irq_flag(const struct btf *btf, const struct btf_param *arg) 11303 + { 11304 + return btf_param_match_suffix(btf, arg, "__irq_flag"); 11305 + } 11306 + 11509 11307 static bool is_kfunc_arg_scalar_with_name(const struct btf *btf, 11510 11308 const struct btf_param *arg, 11511 11309 const char *name) ··· 11664 11452 KF_ARG_PTR_TO_CONST_STR, 11665 11453 KF_ARG_PTR_TO_MAP, 11666 11454 KF_ARG_PTR_TO_WORKQUEUE, 11455 + KF_ARG_PTR_TO_IRQ_FLAG, 11667 11456 }; 11668 11457 11669 11458 enum special_kfunc_type { ··· 11696 11483 KF_bpf_iter_css_task_new, 11697 11484 KF_bpf_session_cookie, 11698 11485 KF_bpf_get_kmem_cache, 11486 + KF_bpf_local_irq_save, 11487 + KF_bpf_local_irq_restore, 11699 11488 }; 11700 11489 11701 11490 BTF_SET_START(special_kfunc_set) ··· 11764 11549 BTF_ID_UNUSED 11765 11550 #endif 11766 11551 BTF_ID(func, 
bpf_get_kmem_cache) 11552 + BTF_ID(func, bpf_local_irq_save) 11553 + BTF_ID(func, bpf_local_irq_restore) 11767 11554 11768 11555 static bool is_kfunc_ret_null(struct bpf_kfunc_call_arg_meta *meta) 11769 11556 { ··· 11855 11638 11856 11639 if (is_kfunc_arg_wq(meta->btf, &args[argno])) 11857 11640 return KF_ARG_PTR_TO_WORKQUEUE; 11641 + 11642 + if (is_kfunc_arg_irq_flag(meta->btf, &args[argno])) 11643 + return KF_ARG_PTR_TO_IRQ_FLAG; 11858 11644 11859 11645 if ((base_type(reg->type) == PTR_TO_BTF_ID || reg2btf_ids[base_type(reg->type)])) { 11860 11646 if (!btf_type_is_struct(ref_t)) { ··· 11962 11742 return 0; 11963 11743 } 11964 11744 11745 + static int process_irq_flag(struct bpf_verifier_env *env, int regno, 11746 + struct bpf_kfunc_call_arg_meta *meta) 11747 + { 11748 + struct bpf_reg_state *regs = cur_regs(env), *reg = &regs[regno]; 11749 + bool irq_save; 11750 + int err; 11751 + 11752 + if (meta->func_id == special_kfunc_list[KF_bpf_local_irq_save]) { 11753 + irq_save = true; 11754 + } else if (meta->func_id == special_kfunc_list[KF_bpf_local_irq_restore]) { 11755 + irq_save = false; 11756 + } else { 11757 + verbose(env, "verifier internal error: unknown irq flags kfunc\n"); 11758 + return -EFAULT; 11759 + } 11760 + 11761 + if (irq_save) { 11762 + if (!is_irq_flag_reg_valid_uninit(env, reg)) { 11763 + verbose(env, "expected uninitialized irq flag as arg#%d\n", regno - 1); 11764 + return -EINVAL; 11765 + } 11766 + 11767 + err = check_mem_access(env, env->insn_idx, regno, 0, BPF_DW, BPF_WRITE, -1, false, false); 11768 + if (err) 11769 + return err; 11770 + 11771 + err = mark_stack_slot_irq_flag(env, meta, reg, env->insn_idx); 11772 + if (err) 11773 + return err; 11774 + } else { 11775 + err = is_irq_flag_reg_valid_init(env, reg); 11776 + if (err) { 11777 + verbose(env, "expected an initialized irq flag as arg#%d\n", regno - 1); 11778 + return err; 11779 + } 11780 + 11781 + err = mark_irq_flag_read(env, reg); 11782 + if (err) 11783 + return err; 11784 + 11785 + 
err = unmark_stack_slot_irq_flag(env, reg); 11786 + if (err) 11787 + return err; 11788 + } 11789 + return 0; 11790 + } 11791 + 11792 + 11965 11793 static int ref_set_non_owning(struct bpf_verifier_env *env, struct bpf_reg_state *reg) 11966 11794 { 11967 11795 struct btf_record *rec = reg_btf_record(reg); 11968 11796 11969 - if (!cur_func(env)->active_locks) { 11797 + if (!env->cur_state->active_locks) { 11970 11798 verbose(env, "verifier internal error: ref_set_non_owning w/o active lock\n"); 11971 11799 return -EFAULT; 11972 11800 } ··· 12033 11765 12034 11766 static int ref_convert_owning_non_owning(struct bpf_verifier_env *env, u32 ref_obj_id) 12035 11767 { 12036 - struct bpf_func_state *state, *unused; 11768 + struct bpf_verifier_state *state = env->cur_state; 11769 + struct bpf_func_state *unused; 12037 11770 struct bpf_reg_state *reg; 12038 11771 int i; 12039 - 12040 - state = cur_func(env); 12041 11772 12042 11773 if (!ref_obj_id) { 12043 11774 verbose(env, "verifier internal error: ref_obj_id is zero for " ··· 12127 11860 } 12128 11861 id = reg->id; 12129 11862 12130 - if (!cur_func(env)->active_locks) 11863 + if (!env->cur_state->active_locks) 12131 11864 return -EINVAL; 12132 - s = find_lock_state(env, REF_TYPE_LOCK, id, ptr); 11865 + s = find_lock_state(env->cur_state, REF_TYPE_LOCK, id, ptr); 12133 11866 if (!s) { 12134 11867 verbose(env, "held lock and object are not in the same allocation\n"); 12135 11868 return -EINVAL; ··· 12598 12331 case KF_ARG_PTR_TO_REFCOUNTED_KPTR: 12599 12332 case KF_ARG_PTR_TO_CONST_STR: 12600 12333 case KF_ARG_PTR_TO_WORKQUEUE: 12334 + case KF_ARG_PTR_TO_IRQ_FLAG: 12601 12335 break; 12602 12336 default: 12603 12337 WARN_ON_ONCE(1); ··· 12893 12625 if (ret < 0) 12894 12626 return ret; 12895 12627 break; 12628 + case KF_ARG_PTR_TO_IRQ_FLAG: 12629 + if (reg->type != PTR_TO_STACK) { 12630 + verbose(env, "arg#%d doesn't point to an irq flag on stack\n", i); 12631 + return -EINVAL; 12632 + } 12633 + ret = process_irq_flag(env, 
regno, meta); 12634 + if (ret < 0) 12635 + return ret; 12636 + break; 12896 12637 } 12897 12638 } 12898 12639 ··· 13066 12789 return -EINVAL; 13067 12790 } 13068 12791 13069 - if (env->cur_state->active_preempt_lock) { 12792 + if (env->cur_state->active_preempt_locks) { 13070 12793 if (preempt_disable) { 13071 - env->cur_state->active_preempt_lock++; 12794 + env->cur_state->active_preempt_locks++; 13072 12795 } else if (preempt_enable) { 13073 - env->cur_state->active_preempt_lock--; 12796 + env->cur_state->active_preempt_locks--; 13074 12797 } else if (sleepable) { 13075 12798 verbose(env, "kernel func %s is sleepable within non-preemptible region\n", func_name); 13076 12799 return -EACCES; 13077 12800 } 13078 12801 } else if (preempt_disable) { 13079 - env->cur_state->active_preempt_lock++; 12802 + env->cur_state->active_preempt_locks++; 13080 12803 } else if (preempt_enable) { 13081 12804 verbose(env, "unmatched attempt to enable preemption (kernel function %s)\n", func_name); 13082 12805 return -EINVAL; 12806 + } 12807 + 12808 + if (env->cur_state->active_irq_id && sleepable) { 12809 + verbose(env, "kernel func %s is sleepable within IRQ-disabled region\n", func_name); 12810 + return -EACCES; 13083 12811 } 13084 12812 13085 12813 /* In case of release function, we get register number of refcounted ··· 13380 13098 } 13381 13099 mark_btf_func_reg_size(env, BPF_REG_0, sizeof(void *)); 13382 13100 if (is_kfunc_acquire(&meta)) { 13383 - int id = acquire_reference_state(env, insn_idx); 13101 + int id = acquire_reference(env, insn_idx); 13384 13102 13385 13103 if (id < 0) 13386 13104 return id; ··· 14777 14495 14778 14496 /* Got here implies adding two SCALAR_VALUEs */ 14779 14497 if (WARN_ON_ONCE(ptr_reg)) { 14780 - print_verifier_state(env, state, true); 14498 + print_verifier_state(env, vstate, vstate->curframe, true); 14781 14499 verbose(env, "verifier internal error: unexpected ptr_reg\n"); 14782 14500 return -EINVAL; 14783 14501 } 14784 14502 if 
(WARN_ON(!src_reg)) { 14785 - print_verifier_state(env, state, true); 14503 + print_verifier_state(env, vstate, vstate->curframe, true); 14786 14504 verbose(env, "verifier internal error: no src_reg\n"); 14787 14505 return -EINVAL; 14788 14506 } ··· 15680 15398 * No one could have freed the reference state before 15681 15399 * doing the NULL check. 15682 15400 */ 15683 - WARN_ON_ONCE(release_reference_state(state, id)); 15401 + WARN_ON_ONCE(release_reference_nomark(vstate, id)); 15684 15402 15685 15403 bpf_for_each_reg_in_vstate(vstate, state, reg, ({ 15686 15404 mark_ptr_or_null_reg(state, reg, id, is_null); ··· 15990 15708 *insn_idx)) 15991 15709 return -EFAULT; 15992 15710 if (env->log.level & BPF_LOG_LEVEL) 15993 - print_insn_state(env, this_branch->frame[this_branch->curframe]); 15711 + print_insn_state(env, this_branch, this_branch->curframe); 15994 15712 *insn_idx += insn->off; 15995 15713 return 0; 15996 15714 } else if (pred == 0) { ··· 16004 15722 *insn_idx)) 16005 15723 return -EFAULT; 16006 15724 if (env->log.level & BPF_LOG_LEVEL) 16007 - print_insn_state(env, this_branch->frame[this_branch->curframe]); 15725 + print_insn_state(env, this_branch, this_branch->curframe); 16008 15726 return 0; 16009 15727 } 16010 15728 ··· 16121 15839 return -EACCES; 16122 15840 } 16123 15841 if (env->log.level & BPF_LOG_LEVEL) 16124 - print_insn_state(env, this_branch->frame[this_branch->curframe]); 15842 + print_insn_state(env, this_branch, this_branch->curframe); 16125 15843 return 0; 16126 15844 } 16127 15845 ··· 18020 17738 !check_ids(old_reg->ref_obj_id, cur_reg->ref_obj_id, idmap)) 18021 17739 return false; 18022 17740 break; 17741 + case STACK_IRQ_FLAG: 17742 + old_reg = &old->stack[spi].spilled_ptr; 17743 + cur_reg = &cur->stack[spi].spilled_ptr; 17744 + if (!check_ids(old_reg->ref_obj_id, cur_reg->ref_obj_id, idmap)) 17745 + return false; 17746 + break; 18023 17747 case STACK_MISC: 18024 17748 case STACK_ZERO: 18025 17749 case STACK_INVALID: ··· 18038 17750 
return true; 18039 17751 } 18040 17752 18041 - static bool refsafe(struct bpf_func_state *old, struct bpf_func_state *cur, 17753 + static bool refsafe(struct bpf_verifier_state *old, struct bpf_verifier_state *cur, 18042 17754 struct bpf_idmap *idmap) 18043 17755 { 18044 17756 int i; 18045 17757 18046 17758 if (old->acquired_refs != cur->acquired_refs) 17759 + return false; 17760 + 17761 + if (old->active_locks != cur->active_locks) 17762 + return false; 17763 + 17764 + if (old->active_preempt_locks != cur->active_preempt_locks) 17765 + return false; 17766 + 17767 + if (old->active_rcu_lock != cur->active_rcu_lock) 17768 + return false; 17769 + 17770 + if (!check_ids(old->active_irq_id, cur->active_irq_id, idmap)) 18047 17771 return false; 18048 17772 18049 17773 for (i = 0; i < old->acquired_refs; i++) { ··· 18064 17764 return false; 18065 17765 switch (old->refs[i].type) { 18066 17766 case REF_TYPE_PTR: 17767 + case REF_TYPE_IRQ: 18067 17768 break; 18068 17769 case REF_TYPE_LOCK: 18069 17770 if (old->refs[i].ptr != cur->refs[i].ptr) ··· 18121 17820 if (!stacksafe(env, old, cur, &env->idmap_scratch, exact)) 18122 17821 return false; 18123 17822 18124 - if (!refsafe(old, cur, &env->idmap_scratch)) 18125 - return false; 18126 - 18127 17823 return true; 18128 17824 } 18129 17825 ··· 18148 17850 if (old->speculative && !cur->speculative) 18149 17851 return false; 18150 17852 18151 - if (old->active_rcu_lock != cur->active_rcu_lock) 18152 - return false; 18153 - 18154 - if (old->active_preempt_lock != cur->active_preempt_lock) 18155 - return false; 18156 - 18157 17853 if (old->in_sleepable != cur->in_sleepable) 17854 + return false; 17855 + 17856 + if (!refsafe(old, cur, &env->idmap_scratch)) 18158 17857 return false; 18159 17858 18160 17859 /* for states to be equal callsites have to be the same ··· 18544 18249 verbose_linfo(env, insn_idx, "; "); 18545 18250 verbose(env, "infinite loop detected at insn %d\n", insn_idx); 18546 18251 verbose(env, "cur state:"); 18547 - 
print_verifier_state(env, cur->frame[cur->curframe], true); 18252 + print_verifier_state(env, cur, cur->curframe, true); 18548 18253 verbose(env, "old state:"); 18549 - print_verifier_state(env, sl->state.frame[cur->curframe], true); 18254 + print_verifier_state(env, &sl->state, cur->curframe, true); 18550 18255 return -EINVAL; 18551 18256 } 18552 18257 /* if the verifier is processing a loop, avoid adding new state ··· 18902 18607 env->prev_insn_idx, env->insn_idx, 18903 18608 env->cur_state->speculative ? 18904 18609 " (speculative execution)" : ""); 18905 - print_verifier_state(env, state->frame[state->curframe], true); 18610 + print_verifier_state(env, state, state->curframe, true); 18906 18611 do_print_state = false; 18907 18612 } 18908 18613 ··· 18914 18619 }; 18915 18620 18916 18621 if (verifier_state_scratched(env)) 18917 - print_insn_state(env, state->frame[state->curframe]); 18622 + print_insn_state(env, state, state->curframe); 18918 18623 18919 18624 verbose_linfo(env, env->insn_idx, "; "); 18920 18625 env->prev_log_pos = env->log.end_pos; ··· 19046 18751 return -EINVAL; 19047 18752 } 19048 18753 19049 - if (cur_func(env)->active_locks) { 18754 + if (env->cur_state->active_locks) { 19050 18755 if ((insn->src_reg == BPF_REG_0 && insn->imm != BPF_FUNC_spin_unlock) || 19051 18756 (insn->src_reg == BPF_PSEUDO_KFUNC_CALL && 19052 18757 (insn->off != 0 || !is_bpf_graph_api_kfunc(insn->imm)))) { ··· 19102 18807 * match caller reference state when it exits. 19103 18808 */ 19104 18809 err = check_resource_leak(env, exception_exit, !env->cur_state->curframe, 19105 - "BPF_EXIT instruction"); 18810 + "BPF_EXIT instruction in main prog"); 19106 18811 if (err) 19107 18812 return err; 19108 18813
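The kfunc call path in the verifier diff above combines a preempt-disable nesting counter with a new check that rejects sleepable kfuncs while IRQs are disabled. The following is a minimal userspace sketch of that decision logic only, assuming a reduced state with two fields; `toy_state`, `toy_call`, and `toy_check_call` are hypothetical names for illustration and are not part of the kernel.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, reduced stand-in for the two verifier state fields involved. */
struct toy_state {
	unsigned int active_preempt_locks;	/* nesting depth of preempt-disable */
	unsigned int active_irq_id;		/* non-zero while an IRQ flag is held */
};

/* Hypothetical classification of the kfunc being called. */
enum toy_call {
	TOY_PREEMPT_DISABLE,
	TOY_PREEMPT_ENABLE,
	TOY_SLEEPABLE,
	TOY_OTHER,
};

/* Returns 0 if the call is allowed, or a negative errno mirroring the diff. */
static int toy_check_call(struct toy_state *st, enum toy_call call)
{
	bool preempt_disable = call == TOY_PREEMPT_DISABLE;
	bool preempt_enable = call == TOY_PREEMPT_ENABLE;
	bool sleepable = call == TOY_SLEEPABLE;

	if (st->active_preempt_locks) {
		if (preempt_disable)
			st->active_preempt_locks++;
		else if (preempt_enable)
			st->active_preempt_locks--;
		else if (sleepable)
			return -EACCES;	/* sleepable within non-preemptible region */
	} else if (preempt_disable) {
		st->active_preempt_locks++;
	} else if (preempt_enable) {
		return -EINVAL;		/* unmatched attempt to enable preemption */
	}

	if (st->active_irq_id && sleepable)
		return -EACCES;		/* sleepable within IRQ-disabled region */

	return 0;
}
```

In this sketch the IRQ check is independent of the preempt counter, which matches the ordering in the diff: a sleepable call is rejected if either region is active.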
+2
tools/testing/selftests/bpf/prog_tests/verifier.c
···
 #include "verifier_xdp_direct_packet_access.skel.h"
 #include "verifier_bits_iter.skel.h"
 #include "verifier_lsm.skel.h"
+#include "irq.skel.h"

 #define MAX_ENTRIES 11

···
 void test_verifier_xdp_direct_packet_access(void) { RUN(verifier_xdp_direct_packet_access); }
 void test_verifier_bits_iter(void)            { RUN(verifier_bits_iter); }
 void test_verifier_lsm(void)                  { RUN(verifier_lsm); }
+void test_irq(void)                           { RUN(irq); }

 void test_verifier_mtu(void)
 {
+2 -2
tools/testing/selftests/bpf/progs/exceptions_fail.c
···
 }

 SEC("?tc")
-__failure __msg("BPF_EXIT instruction cannot be used inside bpf_rcu_read_lock-ed region")
+__failure __msg("BPF_EXIT instruction in main prog cannot be used inside bpf_rcu_read_lock-ed region")
 int reject_with_rcu_read_lock(void *ctx)
 {
 	bpf_rcu_read_lock();
···
 }

 SEC("?tc")
-__failure __msg("BPF_EXIT instruction cannot be used inside bpf_rcu_read_lock-ed region")
+__failure __msg("BPF_EXIT instruction in main prog cannot be used inside bpf_rcu_read_lock-ed region")
 int reject_subprog_with_rcu_read_lock(void *ctx)
 {
 	bpf_rcu_read_lock();
+444
tools/testing/selftests/bpf/progs/irq.c
···
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2024 Meta Platforms, Inc. and affiliates. */
+#include <vmlinux.h>
+#include <bpf/bpf_helpers.h>
+#include "bpf_misc.h"
+#include "bpf_experimental.h"
+
+unsigned long global_flags;
+
+extern void bpf_local_irq_save(unsigned long *) __weak __ksym;
+extern void bpf_local_irq_restore(unsigned long *) __weak __ksym;
+extern int bpf_copy_from_user_str(void *dst, u32 dst__sz, const void *unsafe_ptr__ign, u64 flags) __weak __ksym;
+
+SEC("?tc")
+__failure __msg("arg#0 doesn't point to an irq flag on stack")
+int irq_save_bad_arg(struct __sk_buff *ctx)
+{
+	bpf_local_irq_save(&global_flags);
+	return 0;
+}
+
+SEC("?tc")
+__failure __msg("arg#0 doesn't point to an irq flag on stack")
+int irq_restore_bad_arg(struct __sk_buff *ctx)
+{
+	bpf_local_irq_restore(&global_flags);
+	return 0;
+}
+
+SEC("?tc")
+__failure __msg("BPF_EXIT instruction in main prog cannot be used inside bpf_local_irq_save-ed region")
+int irq_restore_missing_2(struct __sk_buff *ctx)
+{
+	unsigned long flags1;
+	unsigned long flags2;
+
+	bpf_local_irq_save(&flags1);
+	bpf_local_irq_save(&flags2);
+	return 0;
+}
+
+SEC("?tc")
+__failure __msg("BPF_EXIT instruction in main prog cannot be used inside bpf_local_irq_save-ed region")
+int irq_restore_missing_3(struct __sk_buff *ctx)
+{
+	unsigned long flags1;
+	unsigned long flags2;
+	unsigned long flags3;
+
+	bpf_local_irq_save(&flags1);
+	bpf_local_irq_save(&flags2);
+	bpf_local_irq_save(&flags3);
+	return 0;
+}
+
+SEC("?tc")
+__failure __msg("BPF_EXIT instruction in main prog cannot be used inside bpf_local_irq_save-ed region")
+int irq_restore_missing_3_minus_2(struct __sk_buff *ctx)
+{
+	unsigned long flags1;
+	unsigned long flags2;
+	unsigned long flags3;
+
+	bpf_local_irq_save(&flags1);
+	bpf_local_irq_save(&flags2);
+	bpf_local_irq_save(&flags3);
+	bpf_local_irq_restore(&flags3);
+	bpf_local_irq_restore(&flags2);
+	return 0;
+}
+
+static __noinline void local_irq_save(unsigned long *flags)
+{
+	bpf_local_irq_save(flags);
+}
+
+static __noinline void local_irq_restore(unsigned long *flags)
+{
+	bpf_local_irq_restore(flags);
+}
+
+SEC("?tc")
+__failure __msg("BPF_EXIT instruction in main prog cannot be used inside bpf_local_irq_save-ed region")
+int irq_restore_missing_1_subprog(struct __sk_buff *ctx)
+{
+	unsigned long flags;
+
+	local_irq_save(&flags);
+	return 0;
+}
+
+SEC("?tc")
+__failure __msg("BPF_EXIT instruction in main prog cannot be used inside bpf_local_irq_save-ed region")
+int irq_restore_missing_2_subprog(struct __sk_buff *ctx)
+{
+	unsigned long flags1;
+	unsigned long flags2;
+
+	local_irq_save(&flags1);
+	local_irq_save(&flags2);
+	return 0;
+}
+
+SEC("?tc")
+__failure __msg("BPF_EXIT instruction in main prog cannot be used inside bpf_local_irq_save-ed region")
+int irq_restore_missing_3_subprog(struct __sk_buff *ctx)
+{
+	unsigned long flags1;
+	unsigned long flags2;
+	unsigned long flags3;
+
+	local_irq_save(&flags1);
+	local_irq_save(&flags2);
+	local_irq_save(&flags3);
+	return 0;
+}
+
+SEC("?tc")
+__failure __msg("BPF_EXIT instruction in main prog cannot be used inside bpf_local_irq_save-ed region")
+int irq_restore_missing_3_minus_2_subprog(struct __sk_buff *ctx)
+{
+	unsigned long flags1;
+	unsigned long flags2;
+	unsigned long flags3;
+
+	local_irq_save(&flags1);
+	local_irq_save(&flags2);
+	local_irq_save(&flags3);
+	local_irq_restore(&flags3);
+	local_irq_restore(&flags2);
+	return 0;
+}
+
+SEC("?tc")
+__success
+int irq_balance(struct __sk_buff *ctx)
+{
+	unsigned long flags;
+
+	local_irq_save(&flags);
+	local_irq_restore(&flags);
+	return 0;
+}
+
+SEC("?tc")
+__success
+int irq_balance_n(struct __sk_buff *ctx)
+{
+	unsigned long flags1;
+	unsigned long flags2;
+	unsigned long flags3;
+
+	local_irq_save(&flags1);
+	local_irq_save(&flags2);
+	local_irq_save(&flags3);
+	local_irq_restore(&flags3);
+	local_irq_restore(&flags2);
+	local_irq_restore(&flags1);
+	return 0;
+}
+
+static __noinline void local_irq_balance(void)
+{
+	unsigned long flags;
+
+	local_irq_save(&flags);
+	local_irq_restore(&flags);
+}
+
+static __noinline void local_irq_balance_n(void)
+{
+	unsigned long flags1;
+	unsigned long flags2;
+	unsigned long flags3;
+
+	local_irq_save(&flags1);
+	local_irq_save(&flags2);
+	local_irq_save(&flags3);
+	local_irq_restore(&flags3);
+	local_irq_restore(&flags2);
+	local_irq_restore(&flags1);
+}
+
+SEC("?tc")
+__success
+int irq_balance_subprog(struct __sk_buff *ctx)
+{
+	local_irq_balance();
+	return 0;
+}
+
+SEC("?fentry.s/" SYS_PREFIX "sys_getpgid")
+__failure __msg("sleepable helper bpf_copy_from_user#")
+int irq_sleepable_helper(void *ctx)
+{
+	unsigned long flags;
+	u32 data;
+
+	local_irq_save(&flags);
+	bpf_copy_from_user(&data, sizeof(data), NULL);
+	local_irq_restore(&flags);
+	return 0;
+}
+
+SEC("?fentry.s/" SYS_PREFIX "sys_getpgid")
+__failure __msg("kernel func bpf_copy_from_user_str is sleepable within IRQ-disabled region")
+int irq_sleepable_kfunc(void *ctx)
+{
+	unsigned long flags;
+	u32 data;
+
+	local_irq_save(&flags);
+	bpf_copy_from_user_str(&data, sizeof(data), NULL, 0);
+	local_irq_restore(&flags);
+	return 0;
+}
+
+int __noinline global_local_irq_balance(void)
+{
+	local_irq_balance_n();
+	return 0;
+}
+
+SEC("?tc")
+__failure __msg("global function calls are not allowed with IRQs disabled")
+int irq_global_subprog(struct __sk_buff *ctx)
+{
+	unsigned long flags;
+
+	bpf_local_irq_save(&flags);
+	global_local_irq_balance();
+	bpf_local_irq_restore(&flags);
+	return 0;
+}
+
+SEC("?tc")
+__failure __msg("cannot restore irq state out of order")
+int irq_restore_ooo(struct __sk_buff *ctx)
+{
+	unsigned long flags1;
+	unsigned long flags2;
+
+	bpf_local_irq_save(&flags1);
+	bpf_local_irq_save(&flags2);
+	bpf_local_irq_restore(&flags1);
+	bpf_local_irq_restore(&flags2);
+	return 0;
+}
+
+SEC("?tc")
+__failure __msg("cannot restore irq state out of order")
+int irq_restore_ooo_3(struct __sk_buff *ctx)
+{
+	unsigned long flags1;
+	unsigned long flags2;
+	unsigned long flags3;
+
+	bpf_local_irq_save(&flags1);
+	bpf_local_irq_save(&flags2);
+	bpf_local_irq_restore(&flags2);
+	bpf_local_irq_save(&flags3);
+	bpf_local_irq_restore(&flags1);
+	bpf_local_irq_restore(&flags3);
+	return 0;
+}
+
+static __noinline void local_irq_save_3(unsigned long *flags1, unsigned long *flags2,
+					unsigned long *flags3)
+{
+	local_irq_save(flags1);
+	local_irq_save(flags2);
+	local_irq_save(flags3);
+}
+
+SEC("?tc")
+__success
+int irq_restore_3_subprog(struct __sk_buff *ctx)
+{
+	unsigned long flags1;
+	unsigned long flags2;
+	unsigned long flags3;
+
+	local_irq_save_3(&flags1, &flags2, &flags3);
+	bpf_local_irq_restore(&flags3);
+	bpf_local_irq_restore(&flags2);
+	bpf_local_irq_restore(&flags1);
+	return 0;
+}
+
+SEC("?tc")
+__failure __msg("cannot restore irq state out of order")
+int irq_restore_4_subprog(struct __sk_buff *ctx)
+{
+	unsigned long flags1;
+	unsigned long flags2;
+	unsigned long flags3;
+	unsigned long flags4;
+
+	local_irq_save_3(&flags1, &flags2, &flags3);
+	bpf_local_irq_restore(&flags3);
+	bpf_local_irq_save(&flags4);
+	bpf_local_irq_restore(&flags4);
+	bpf_local_irq_restore(&flags1);
+	return 0;
+}
+
+SEC("?tc")
+__failure __msg("cannot restore irq state out of order")
+int irq_restore_ooo_3_subprog(struct __sk_buff *ctx)
+{
+	unsigned long flags1;
+	unsigned long flags2;
+	unsigned long flags3;
+
+	local_irq_save_3(&flags1, &flags2, &flags3);
+	bpf_local_irq_restore(&flags3);
+	bpf_local_irq_restore(&flags2);
+	bpf_local_irq_save(&flags3);
+	bpf_local_irq_restore(&flags1);
+	return 0;
+}
+
+SEC("?tc")
+__failure __msg("expected an initialized")
+int irq_restore_invalid(struct __sk_buff *ctx)
+{
+	unsigned long flags1;
+	unsigned long flags = 0xfaceb00c;
+
+	bpf_local_irq_save(&flags1);
+	bpf_local_irq_restore(&flags);
+	return 0;
+}
+
+SEC("?tc")
+__failure __msg("expected uninitialized")
+int irq_save_invalid(struct __sk_buff *ctx)
+{
+	unsigned long flags1;
+
+	bpf_local_irq_save(&flags1);
+	bpf_local_irq_save(&flags1);
+	return 0;
+}
+
+SEC("?tc")
+__failure __msg("expected an initialized")
+int irq_restore_iter(struct __sk_buff *ctx)
+{
+	struct bpf_iter_num it;
+
+	bpf_iter_num_new(&it, 0, 42);
+	bpf_local_irq_restore((unsigned long *)&it);
+	return 0;
+}
+
+SEC("?tc")
+__failure __msg("Unreleased reference id=1")
+int irq_save_iter(struct __sk_buff *ctx)
+{
+	struct bpf_iter_num it;
+
+	/* Ensure same sized slot has st->ref_obj_id set, so we reject based on
+	 * slot_type != STACK_IRQ_FLAG...
+	 */
+	_Static_assert(sizeof(it) == sizeof(unsigned long), "broken iterator size");
+
+	bpf_iter_num_new(&it, 0, 42);
+	bpf_local_irq_save((unsigned long *)&it);
+	bpf_local_irq_restore((unsigned long *)&it);
+	return 0;
+}
+
+SEC("?tc")
+__failure __msg("expected an initialized")
+int irq_flag_overwrite(struct __sk_buff *ctx)
+{
+	unsigned long flags;
+
+	bpf_local_irq_save(&flags);
+	flags = 0xdeadbeef;
+	bpf_local_irq_restore(&flags);
+	return 0;
+}
+
+SEC("?tc")
+__failure __msg("expected an initialized")
+int irq_flag_overwrite_partial(struct __sk_buff *ctx)
+{
+	unsigned long flags;
+
+	bpf_local_irq_save(&flags);
+	*(((char *)&flags) + 1) = 0xff;
+	bpf_local_irq_restore(&flags);
+	return 0;
+}
+
+SEC("?tc")
+__failure __msg("cannot restore irq state out of order")
+int irq_ooo_refs_array(struct __sk_buff *ctx)
+{
+	unsigned long flags[4];
+	struct { int i; } *p;
+
+	/* refs=1 */
+	bpf_local_irq_save(&flags[0]);
+
+	/* refs=1,2 */
+	p = bpf_obj_new(typeof(*p));
+	if (!p) {
+		bpf_local_irq_restore(&flags[0]);
+		return 0;
+	}
+
+	/* refs=1,2,3 */
+	bpf_local_irq_save(&flags[1]);
+
+	/* refs=1,2,3,4 */
+	bpf_local_irq_save(&flags[2]);
+
+	/* Now when we remove ref=2, the verifier must not break the ordering in
+	 * the refs array between 1,3,4. With an older implementation, the
+	 * verifier would swap the last element with the removed element, but to
+	 * maintain the stack property we need to use memmove.
+	 */
+	bpf_obj_drop(p);
+
+	/* Save and restore to reset active_irq_id to 3, as the ordering is now
+	 * refs=1,4,3. When restoring the linear scan will find prev_id in order
+	 * as 3 instead of 4.
+	 */
+	bpf_local_irq_save(&flags[3]);
+	bpf_local_irq_restore(&flags[3]);
+
+	/* With the incorrect implementation, we can release flags[1], flags[2],
+	 * and flags[0], i.e. in the wrong order.
+	 */
+	bpf_local_irq_restore(&flags[1]);
+	bpf_local_irq_restore(&flags[2]);
+	bpf_local_irq_restore(&flags[0]);
+	return 0;
+}
+
+char _license[] SEC("license") = "GPL";
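The comment in irq_ooo_refs_array explains why removing an entry from the reference array has to preserve order: swapping the last element into the hole turns refs=1,2,3,4 into 1,4,3, while shifting the tail with memmove yields 1,3,4. The two strategies can be compared in isolation with a small sketch. This is an illustration only, not the verifier's implementation; `ref_stack`, `ref_remove_swap`, and `ref_remove_memmove` are hypothetical names.

```c
#include <assert.h>
#include <string.h>

#define MAX_REFS 8

/* Hypothetical, simplified stand-in for the verifier's array of reference ids. */
struct ref_stack {
	int ids[MAX_REFS];
	int n;
};

/* Swap-style removal: fill the hole with the last element. Cheap, but it
 * reorders the remaining entries.
 */
static void ref_remove_swap(struct ref_stack *s, int idx)
{
	s->ids[idx] = s->ids[s->n - 1];
	s->n--;
}

/* Order-preserving removal: shift every entry after idx down by one slot,
 * so the relative (stack) order of the survivors is unchanged.
 */
static void ref_remove_memmove(struct ref_stack *s, int idx)
{
	memmove(&s->ids[idx], &s->ids[idx + 1],
		(s->n - idx - 1) * sizeof(s->ids[0]));
	s->n--;
}
```

Starting from ids 1,2,3,4 and removing index 1, the swap variant leaves 1,4,3 and the memmove variant leaves 1,3,4. Only the second keeps the most recently saved IRQ flag last, which is the property an in-order restore check depends on.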
+21 -7
tools/testing/selftests/bpf/progs/preempt_lock.c
···
 #include "bpf_misc.h"
 #include "bpf_experimental.h"

+extern int bpf_copy_from_user_str(void *dst, u32 dst__sz, const void *unsafe_ptr__ign, u64 flags) __weak __ksym;
+
 SEC("?tc")
-__failure __msg("BPF_EXIT instruction cannot be used inside bpf_preempt_disable-ed region")
+__failure __msg("BPF_EXIT instruction in main prog cannot be used inside bpf_preempt_disable-ed region")
 int preempt_lock_missing_1(struct __sk_buff *ctx)
 {
 	bpf_preempt_disable();
···
 }

 SEC("?tc")
-__failure __msg("BPF_EXIT instruction cannot be used inside bpf_preempt_disable-ed region")
+__failure __msg("BPF_EXIT instruction in main prog cannot be used inside bpf_preempt_disable-ed region")
 int preempt_lock_missing_2(struct __sk_buff *ctx)
 {
 	bpf_preempt_disable();
···
 }

 SEC("?tc")
-__failure __msg("BPF_EXIT instruction cannot be used inside bpf_preempt_disable-ed region")
+__failure __msg("BPF_EXIT instruction in main prog cannot be used inside bpf_preempt_disable-ed region")
 int preempt_lock_missing_3(struct __sk_buff *ctx)
 {
 	bpf_preempt_disable();
···
 }

 SEC("?tc")
-__failure __msg("BPF_EXIT instruction cannot be used inside bpf_preempt_disable-ed region")
+__failure __msg("BPF_EXIT instruction in main prog cannot be used inside bpf_preempt_disable-ed region")
 int preempt_lock_missing_3_minus_2(struct __sk_buff *ctx)
 {
 	bpf_preempt_disable();
···
 }

 SEC("?tc")
-__failure __msg("BPF_EXIT instruction cannot be used inside bpf_preempt_disable-ed region")
+__failure __msg("BPF_EXIT instruction in main prog cannot be used inside bpf_preempt_disable-ed region")
 int preempt_lock_missing_1_subprog(struct __sk_buff *ctx)
 {
 	preempt_disable();
···
 }

 SEC("?tc")
-__failure __msg("BPF_EXIT instruction cannot be used inside bpf_preempt_disable-ed region")
+__failure __msg("BPF_EXIT instruction in main prog cannot be used inside bpf_preempt_disable-ed region")
 int preempt_lock_missing_2_subprog(struct __sk_buff *ctx)
 {
 	preempt_disable();
···
 }

 SEC("?tc")
-__failure __msg("BPF_EXIT instruction cannot be used inside bpf_preempt_disable-ed region")
+__failure __msg("BPF_EXIT instruction in main prog cannot be used inside bpf_preempt_disable-ed region")
 int preempt_lock_missing_2_minus_1_subprog(struct __sk_buff *ctx)
 {
 	preempt_disable();
···

 	bpf_preempt_disable();
 	bpf_copy_from_user(&data, sizeof(data), NULL);
+	bpf_preempt_enable();
+	return 0;
+}
+
+SEC("?fentry.s/" SYS_PREFIX "sys_getpgid")
+__failure __msg("kernel func bpf_copy_from_user_str is sleepable within non-preemptible region")
+int preempt_sleepable_kfunc(void *ctx)
+{
+	u32 data;
+
+	bpf_preempt_disable();
+	bpf_copy_from_user_str(&data, sizeof(data), NULL, 0);
 	bpf_preempt_enable();
 	return 0;
 }
+1 -1
tools/testing/selftests/bpf/progs/verifier_spin_lock.c
···

 SEC("cgroup/skb")
 __description("spin_lock: test6 missing unlock")
-__failure __msg("BPF_EXIT instruction cannot be used inside bpf_spin_lock-ed region")
+__failure __msg("BPF_EXIT instruction in main prog cannot be used inside bpf_spin_lock-ed region")
 __failure_unpriv __msg_unpriv("")
 __naked void spin_lock_test6_missing_unlock(void)
 {