Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

bpf: Use RCU-safe iteration in dev_map_redirect_multi() SKB path

The DEVMAP_HASH branch in dev_map_redirect_multi() uses
hlist_for_each_entry_safe() to iterate hash buckets, but this function
runs under RCU protection (called from xdp_do_generic_redirect_map()
in softirq context). Concurrent writers (__dev_map_hash_update_elem,
dev_map_hash_delete_elem) modify the list using RCU primitives
(hlist_add_head_rcu, hlist_del_rcu).

hlist_for_each_entry_safe() performs plain pointer dereferences without
rcu_dereference(), missing the acquire barrier needed to pair with
writers' rcu_assign_pointer(). On weakly-ordered architectures (ARM64,
POWER), a reader can observe a partially-constructed node. It also
defeats CONFIG_PROVE_RCU lockdep validation and KCSAN data-race
detection.
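
The ordering requirement described above can be illustrated outside the kernel with C11 atomics. This is a minimal userspace sketch, not kernel code: `struct node`, `struct list_head_sketch`, `publish_head()` and `sum_values()` are hypothetical names, a single writer is assumed, the release store stands in for rcu_assign_pointer(), and the acquire load is a simpler, stronger stand-in for rcu_dereference().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical node type, loosely modeled on a hash-bucket entry. */
struct node {
	int value;
	struct node *_Atomic next;
};

/* Hypothetical list head holding an atomically published pointer. */
struct list_head_sketch {
	struct node *_Atomic first;
};

/*
 * Writer side (single writer assumed): fully initialize the node,
 * then publish it with a release store, so that a reader who sees
 * the new pointer also sees the initialized fields.
 */
static struct node *publish_head(struct list_head_sketch *h, int value)
{
	struct node *n;

	assert(h != NULL);
	n = malloc(sizeof(*n));
	if (!n)
		return NULL;
	n->value = value;
	atomic_init(&n->next,
		    atomic_load_explicit(&h->first, memory_order_relaxed));
	atomic_store_explicit(&h->first, n, memory_order_release);
	return n;
}

/*
 * Reader side: every pointer is fetched with an acquire load that
 * pairs with the writer's release store. A plain load here would
 * give no such ordering guarantee.
 */
static int sum_values(struct list_head_sketch *h)
{
	struct node *n;
	int sum = 0;

	assert(h != NULL);
	for (n = atomic_load_explicit(&h->first, memory_order_acquire); n;
	     n = atomic_load_explicit(&n->next, memory_order_acquire))
		sum += n->value;
	return sum;
}
```

In this sketch, replacing the acquire loads in `sum_values()` with plain pointer reads would mirror the problem with hlist_for_each_entry_safe(): the traversal would still compile and usually work, but nothing would order the reads of `value` after the read of the pointer that published the node.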

Replace with hlist_for_each_entry_rcu() using rcu_read_lock_bh_held()
as the lockdep condition, consistent with the rcu_dereference_check()
used in the DEVMAP (non-hash) branch of the same functions. Also fix
the same incorrect lockdep_is_held(&dtab->index_lock) condition in
dev_map_enqueue_multi(), where the lock is not held either.

Fixes: e624d4ed4aa8 ("xdp: Extend xdp_redirect_map with broadcast support")
Signed-off-by: David Carlier <devnexen@gmail.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://patch.msgid.link/20260320072645.16731-1-devnexen@gmail.com
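
The point about the lockdep condition can be modeled in userspace as well. The sketch below is a deliberately simplified illustration and does not reproduce the kernel's actual lockdep logic: `reader_enter()`, `reader_exit()`, `reader_held()`, `writer_lock_held`, `violations` and `count_entries_checked()` are all hypothetical. It only shows why the condition passed to a checked traversal should describe protection the caller really holds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical list entry; not the kernel's struct bpf_dtab_netdev. */
struct entry {
	int ifindex;
	struct entry *next;
};

static int reader_depth;      /* hypothetical: > 0 inside a read-side section */
static bool writer_lock_held; /* hypothetical stand-in for an update-side lock */
static int violations;        /* traversals whose stated protection was absent */

static void reader_enter(void) { reader_depth++; }
static void reader_exit(void) { reader_depth--; }
static bool reader_held(void) { return reader_depth > 0; }

/*
 * Count the entries in a list. The caller passes the condition it
 * claims protects the traversal; a false condition is recorded as a
 * violation, loosely imitating a debug-time check.
 */
static int count_entries_checked(const struct entry *head, bool cond)
{
	int n = 0;

	if (!cond)
		violations++;
	for (; head; head = head->next)
		n++;
	return n;
}
```

Calling `count_entries_checked(list, writer_lock_held)` from a reader that never takes the update-side lock records a violation in this model, while `count_entries_checked(list, reader_held())` between `reader_enter()` and `reader_exit()` does not, because that condition matches the protection actually in effect.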

Authored by David Carlier and committed by Martin KaFai Lau
8ed82f80 7f5b0a60

+2 -3
kernel/bpf/devmap.c
···
 		for (i = 0; i < dtab->n_buckets; i++) {
 			head = dev_map_index_hash(dtab, i);
 			hlist_for_each_entry_rcu(dst, head, index_hlist,
-						 lockdep_is_held(&dtab->index_lock)) {
+						 rcu_read_lock_bh_held()) {
 				if (!is_valid_dst(dst, xdpf))
 					continue;
···
 	struct bpf_dtab_netdev *dst, *last_dst = NULL;
 	int excluded_devices[1+MAX_NEST_DEV];
 	struct hlist_head *head;
-	struct hlist_node *next;
 	int num_excluded = 0;
 	unsigned int i;
 	int err;
···
 	} else { /* BPF_MAP_TYPE_DEVMAP_HASH */
 		for (i = 0; i < dtab->n_buckets; i++) {
 			head = dev_map_index_hash(dtab, i);
-			hlist_for_each_entry_safe(dst, next, head, index_hlist) {
+			hlist_for_each_entry_rcu(dst, head, index_hlist, rcu_read_lock_bh_held()) {
 				if (is_ifindex_excluded(excluded_devices, num_excluded,
 							dst->dev->ifindex))
 					continue;