Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
kernel os linux
1
fork

Configure Feed

Select the types of activity you want to include in your feed.

net: don't mix device locking in dev_close_many() calls

Lockdep found the following dependency:

&dev_instance_lock_key#3 -->
&rdev->wiphy.mtx -->
&net->xdp.lock -->
&xs->mutex -->
&dev_instance_lock_key#3

The first dependency is the problem. wiphy mutex should be outside
the instance locks. The problem happens in notifiers (as always)
for CLOSE. We only hold the instance lock for ops locked devices
during CLOSE, and WiFi netdevs are not ops locked. Unfortunately,
when we dev_close_many() during netns dismantle we may be holding
the instance lock of _another_ netdev when issuing a CLOSE for
a WiFi device.

Lockdep's "Possible unsafe locking scenario" only prints 3 locks
and we have 4, plus I think we'd need 3 CPUs, like this:

CPU0 CPU1 CPU2
---- ---- ----
lock(&xs->mutex);
lock(&dev_instance_lock_key#3);
lock(&rdev->wiphy.mtx);
lock(&net->xdp.lock);
lock(&xs->mutex);
lock(&rdev->wiphy.mtx);
lock(&dev_instance_lock_key#3);

Tho, I don't think that's possible as CPU1 and CPU2 would
be under rtnl_lock. Even if we have per-netns rtnl_lock and
wiphy can span network namespaces - CPU0 and CPU1 must be
in the same netns to see dev_instance_lock, so CPU0 can't
be installing a socket as CPU1 is tearing the netns down.

Regardless, our expected lock ordering is that wiphy lock
is taken before instance locks, so let's fix this.

Go over the ops locked and non-locked devices separately.
Note that calling dev_close_many() on an empty list is perfectly
fine. All processing (including RCU syncs) are conditional
on the list not being empty, already.

Fixes: 7e4d784f5810 ("net: hold netdev instance lock during rtnetlink operations")
Reported-by: syzbot+6f588c78bf765b62b450@syzkaller.appspotmail.com
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250412233011.309762-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

+13 -4
+13 -4
net/core/dev.c
··· 11932 11932 BUG_ON(dev->reg_state != NETREG_REGISTERED); 11933 11933 } 11934 11934 11935 - /* If device is running, close it first. */ 11935 + /* If device is running, close it first. Start with ops locked... */ 11936 11936 list_for_each_entry(dev, head, unreg_list) { 11937 - list_add_tail(&dev->close_list, &close_head); 11938 - netdev_lock_ops(dev); 11937 + if (netdev_need_ops_lock(dev)) { 11938 + list_add_tail(&dev->close_list, &close_head); 11939 + netdev_lock(dev); 11940 + } 11941 + } 11942 + dev_close_many(&close_head, true); 11943 + /* ... now unlock them and go over the rest. */ 11944 + list_for_each_entry(dev, head, unreg_list) { 11945 + if (netdev_need_ops_lock(dev)) 11946 + netdev_unlock(dev); 11947 + else 11948 + list_add_tail(&dev->close_list, &close_head); 11939 11949 } 11940 11950 dev_close_many(&close_head, true); 11941 11951 11942 11952 list_for_each_entry(dev, head, unreg_list) { 11943 - netdev_unlock_ops(dev); 11944 11953 /* And unlink it from device chain. */ 11945 11954 unlist_netdevice(dev); 11946 11955 netdev_lock(dev);