Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

RDMA/core: Fix stale RoCE GIDs during netdev events at registration

RoCE GID entries become stale when netdev properties change during the
IB device registration window. This is reproducible with a udev rule
that sets a MAC address when a VF netdev appears:

  ACTION=="add", SUBSYSTEM=="net", KERNEL=="eth4", \
    RUN+="/sbin/ip link set eth4 address 88:22:33:44:55:66"

After VF creation, show_gids displays GIDs derived from the original
random MAC rather than the configured one.
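
To make "GIDs derived from the MAC" concrete, here is a minimal user-space sketch. It assumes the default RoCE GID is the IPv6 link-local address built from the MAC with the modified EUI-64 rule (fe80::/64 prefix, ff:fe inserted in the middle, universal/local bit flipped). The function names are hypothetical and this is not the kernel's code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define GID_STR_LEN 40 /* 8 groups of 4 hex digits + 7 colons + NUL */

/*
 * Hypothetical helper: build a 16-byte link-local GID from a 6-byte MAC
 * using the modified EUI-64 rule (assumption stated in the text above).
 */
void mac_to_default_gid(const uint8_t mac[6], uint8_t gid[16])
{
	memset(gid, 0, 16);
	gid[0] = 0xfe;                /* fe80::/64 link-local prefix */
	gid[1] = 0x80;
	memcpy(&gid[8], mac, 3);      /* first three MAC bytes */
	gid[11] = 0xff;               /* ff:fe filler in the middle */
	gid[12] = 0xfe;
	memcpy(&gid[13], &mac[3], 3); /* last three MAC bytes */
	gid[8] ^= 0x02;               /* flip the universal/local bit */
}

/* Format a GID as eight colon-separated groups of four hex digits. */
void gid_to_string(const uint8_t gid[16], char *out, size_t len)
{
	size_t off = 0;

	assert(len >= GID_STR_LEN);
	for (int i = 0; i < 16; i += 2) {
		off += (size_t)snprintf(out + off, len - off, "%s%02x%02x",
					i ? ":" : "",
					(unsigned int)gid[i],
					(unsigned int)gid[i + 1]);
	}
}
```

Under that assumption, the MAC 88:22:33:44:55:66 from the udev rule maps to fe80:0000:0000:0000:8a22:33ff:fe44:5566. A stale entry would instead carry the bytes of the original random MAC.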

The root cause is a race between netdev event processing and device
registration:

  CPU 0 (driver)                        CPU 1 (udev/workqueue)
  ──────────────                        ──────────────────────
  ib_register_device()
    ib_cache_setup_one()
      gid_table_setup_one()
        _gid_table_setup_one()
          ← GID table allocated
        rdma_roce_rescan_device()
          ← GIDs populated with
            OLD MAC
                                        ip link set eth4 addr NEW_MAC
                                        NETDEV_CHANGEADDR queued
                                        netdevice_event_work_handler()
                                          ib_enum_all_roce_netdevs()
                                            ← Iterates DEVICE_REGISTERED
                                            ← Device NOT marked yet, SKIP!
    enable_device_and_get()
      xa_set_mark(DEVICE_REGISTERED)
        ← Too late, event was lost

The netdev event handler uses ib_enum_all_roce_netdevs() which only
iterates devices marked DEVICE_REGISTERED. However, this mark is set
late in the registration process, after the GID cache is already
populated. Events arriving in this window are silently dropped.

Fix this by introducing a new xarray mark DEVICE_GID_UPDATES that is
set immediately after the GID table is allocated and initialized. Use
the new mark in ib_enum_all_roce_netdevs() to iterate devices
instead of DEVICE_REGISTERED.
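
The effect of switching marks can be illustrated with a small user-space model. This is only a sketch: the table, the mark bits, and every model_* name are hypothetical stand-ins for the xarray and its marks, not kernel APIs. The point it shows is that an enumeration filtered on a mark silently skips devices that do not carry that mark yet.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the two xarray marks. */
enum dev_mark {
	MARK_REGISTERED  = 1u << 0,
	MARK_GID_UPDATES = 1u << 1,
};

#define MAX_DEVICES 4

struct model_device {
	bool present;        /* slot is in use */
	unsigned int marks;  /* bitmask of enum dev_mark */
	int gid_refreshes;   /* number of events that reached this device */
};

struct model_table {
	struct model_device devs[MAX_DEVICES];
};

void model_set_mark(struct model_table *t, size_t idx, enum dev_mark m)
{
	assert(idx < MAX_DEVICES && t->devs[idx].present);
	t->devs[idx].marks |= m;
}

void model_clear_mark(struct model_table *t, size_t idx, enum dev_mark m)
{
	assert(idx < MAX_DEVICES && t->devs[idx].present);
	t->devs[idx].marks &= ~(unsigned int)m;
}

/*
 * Deliver one netdev event to every device carrying the filter mark,
 * in the spirit of a marked iteration. Returns the number of devices
 * visited; unmarked devices never see the event.
 */
int model_dispatch_event(struct model_table *t, enum dev_mark filter)
{
	int visited = 0;

	for (size_t i = 0; i < MAX_DEVICES; i++) {
		struct model_device *d = &t->devs[i];

		if (!d->present || !(d->marks & filter))
			continue;
		d->gid_refreshes++;
		visited++;
	}
	return visited;
}
```

In this model, a device that has only MARK_GID_UPDATES set (GID table ready, registration not finished) is missed by a dispatch filtered on MARK_REGISTERED but reached by one filtered on MARK_GID_UPDATES. Clearing the mark before teardown makes the device invisible to later events again.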

This is safe because:
- After _gid_table_setup_one(), all required structures exist (port_data,
  immutable, cache.gid)
- The GID table mutex serializes concurrent access between the initial
  rescan and event handlers
- Event handlers correctly update stale GIDs even when racing with rescan
- The mark is cleared in ib_cache_cleanup_one() before teardown

This also fixes similar races for IP address events (inetaddr_event,
inet6addr_event) which use the same enumeration path.

Fixes: 0df91bb67334 ("RDMA/devices: Use xarray to store the client_data")
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Link: https://patch.msgid.link/20260127093839.126291-1-jiri@resnulli.us
Reported-by: syzbot+881d65229ca4f9ae8c84@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=881d65229ca4f9ae8c84
Signed-off-by: Leon Romanovsky <leon@kernel.org>

Authored by Jiri Pirko and committed by Leon Romanovsky
9af0feae 16cb1a64

+49 -1
+13
drivers/infiniband/core/cache.c
···
 	if (err)
 		return err;

+	/*
+	 * Mark the device as ready for GID cache updates. This allows netdev
+	 * event handlers to update the GID cache even before the device is
+	 * fully registered.
+	 */
+	ib_device_enable_gid_updates(ib_dev);
+
 	rdma_roce_rescan_device(ib_dev);

 	return err;
···

 void ib_cache_cleanup_one(struct ib_device *device)
 {
+	/*
+	 * Clear the GID updates mark first to prevent event handlers from
+	 * accessing the device while it's being torn down.
+	 */
+	ib_device_disable_gid_updates(device);
+
 	/* The cleanup function waits for all in-progress workqueue
 	 * elements and cleans up the GID cache. This function should be
 	 * called after the device was removed from the devices list and
+3
drivers/infiniband/core/core_priv.h
···
 			      roce_netdev_callback cb,
 			      void *cookie);

+void ib_device_enable_gid_updates(struct ib_device *device);
+void ib_device_disable_gid_updates(struct ib_device *device);
+
 typedef int (*nldev_callback)(struct ib_device *device,
 			      struct sk_buff *skb,
 			      struct netlink_callback *cb,
+33 -1
drivers/infiniband/core/device.c
···
 static DEFINE_XARRAY_FLAGS(devices, XA_FLAGS_ALLOC);
 static DECLARE_RWSEM(devices_rwsem);
 #define DEVICE_REGISTERED XA_MARK_1
+#define DEVICE_GID_UPDATES XA_MARK_2

 static u32 highest_client_id;
 #define CLIENT_REGISTERED XA_MARK_1
···
 	unsigned long index;

 	down_read(&devices_rwsem);
-	xa_for_each_marked (&devices, index, dev, DEVICE_REGISTERED)
+	xa_for_each_marked(&devices, index, dev, DEVICE_GID_UPDATES)
 		ib_enum_roce_netdev(dev, filter, filter_cookie, cb, cookie);
 	up_read(&devices_rwsem);
+}
+
+/**
+ * ib_device_enable_gid_updates - Mark device as ready for GID cache updates
+ * @device: Device to mark
+ *
+ * Called after GID table is allocated and initialized. After this mark is set,
+ * netdevice event handlers can update the device's GID cache. This allows
+ * events that arrive during device registration to be processed, avoiding
+ * stale GID entries when netdev properties change during the device
+ * registration process.
+ */
+void ib_device_enable_gid_updates(struct ib_device *device)
+{
+	down_write(&devices_rwsem);
+	xa_set_mark(&devices, device->index, DEVICE_GID_UPDATES);
+	up_write(&devices_rwsem);
+}
+
+/**
+ * ib_device_disable_gid_updates - Clear the GID updates mark
+ * @device: Device to unmark
+ *
+ * Called before GID table cleanup to prevent event handlers from accessing
+ * the device while it's being torn down.
+ */
+void ib_device_disable_gid_updates(struct ib_device *device)
+{
+	down_write(&devices_rwsem);
+	xa_clear_mark(&devices, device->index, DEVICE_GID_UPDATES);
+	up_write(&devices_rwsem);
 }

 /*