Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

ipv4: give an IPv4 dev to blackhole_netdev

After commit 8d7017fd621d ("blackhole_netdev: use blackhole_netdev to
invalidate dst entries"), blackhole_netdev was introduced to invalidate
dst cache entries on the TX path whenever the cache times out or is
flushed.

When two UDP sockets (sk1 and sk2) send messages to the same destination
simultaneously, they are using the same dst cache. If the dst cache is
invalidated on one path (sk2) while the other (sk1) is still transmitting,
sk1 may try to use the invalid dst entry.

     CPU1                   CPU2

     udp_sendmsg(sk1)       udp_sendmsg(sk2)
     udp_send_skb()
     ip_output()
                                             <--- dst timeout or flushed
                            dst_dev_put()
     ip_finish_output2()
     ip_neigh_for_gw()

This results in a scenario where ip_neigh_for_gw() returns -EINVAL because
blackhole_dev lacks an in_dev, which is needed to initialize the neigh in
arp_constructor(). This error is then propagated back to userspace,
breaking the UDP application.

The patch fixes this issue by assigning an in_dev to blackhole_dev for
IPv4, similar to what was done for IPv6 in commit e5f80fcf869a ("ipv6:
give an IPv6 dev to blackhole_netdev"). This ensures that even when the
dst entry is invalidated with blackhole_dev, it will not fail to create
the neigh entry.
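
The failure and the fix can be illustrated with a small userspace model. This is only a sketch, not kernel code: every `model_*` name below is hypothetical, and the structures are reduced to the one field that matters here, namely whether the device carries IPv4 state. The only behaviour taken from the description above is that neighbour creation returns -EINVAL when the device has no in_dev.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for the kernel structures. */
struct model_in_device {
	int unused;
};

struct model_net_device {
	const char *name;
	struct model_in_device *ip_ptr;	/* NULL: device has no IPv4 state */
};

/*
 * Models the check described above: constructing a neighbour entry needs
 * the device's in_dev and fails with -EINVAL when it is missing.
 */
static int model_neigh_create(const struct model_net_device *dev)
{
	if (dev->ip_ptr == NULL)
		return -EINVAL;
	return 0;
}

/* Models the fix: attach an in_dev to the device once, at init time. */
static void model_inetdev_init(struct model_net_device *dev,
			       struct model_in_device *in_dev)
{
	dev->ip_ptr = in_dev;
}
```

In this model, calling model_neigh_create() on a device whose ip_ptr is NULL yields -EINVAL, which corresponds to the error that reaches the UDP application. After model_inetdev_init() has run for that device, the same call returns 0.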

As devinet_init() is called earlier than blackhole_netdev_init() during
system boot, it cannot assign the in_dev to blackhole_dev in devinet_init().
As Paolo suggested, add a separate late_initcall() in devinet.c to ensure
inet_blackhole_dev_init() is called after blackhole_netdev_init().
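
The ordering constraint can be sketched as a userspace model. The `model_*` functions and flags are hypothetical stand-ins, and the table encodes only the order stated above (devinet_init(), then blackhole_netdev_init(), then the new late initcall); it makes no claim about the actual initcall levels in the kernel.

```c
#include <stddef.h>

/* Hypothetical flags standing in for boot-time state. */
static int model_blackhole_ready;	/* blackhole device has been created */
static int model_in_dev_assigned;	/* blackhole device has IPv4 state */

static int model_devinet_init(void)
{
	/* Runs first: the blackhole device does not exist yet, so
	 * nothing can be attached to it from here. */
	return 0;
}

static int model_blackhole_netdev_init(void)
{
	model_blackhole_ready = 1;
	return 0;
}

static int model_inet_blackhole_dev_init(void)
{
	if (!model_blackhole_ready)
		return -1;	/* ordering assumption violated */
	model_in_dev_assigned = 1;
	return 0;
}

/* Run the calls in the order stated in the text, stopping on failure. */
static int model_boot(void)
{
	static int (*const calls[])(void) = {
		model_devinet_init,
		model_blackhole_netdev_init,
		model_inet_blackhole_dev_init,	/* the late step */
	};

	for (size_t i = 0; i < sizeof(calls) / sizeof(calls[0]); i++) {
		int err = calls[i]();

		if (err)
			return err;
	}
	return 0;
}
```

Moving model_inet_blackhole_dev_init() ahead of model_blackhole_netdev_init() in the table makes model_boot() fail, which is the situation the separate late_initcall() avoids.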

Fixes: 8d7017fd621d ("blackhole_netdev: use blackhole_netdev to invalidate dst entries")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/3000792d45ca44e16c785ebe2b092e610e5b3df1.1728499633.git.lucien.xin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

authored by Xin Long and committed by Jakub Kicinski
22600596 1d227fcc

+25 -10
net/ipv4/devinet.c
···
 	/* Account for reference dev->ip_ptr (below) */
 	refcount_set(&in_dev->refcnt, 1);
 
-	err = devinet_sysctl_register(in_dev);
-	if (err) {
-		in_dev->dead = 1;
-		neigh_parms_release(&arp_tbl, in_dev->arp_parms);
-		in_dev_put(in_dev);
-		in_dev = NULL;
-		goto out;
+	if (dev != blackhole_netdev) {
+		err = devinet_sysctl_register(in_dev);
+		if (err) {
+			in_dev->dead = 1;
+			neigh_parms_release(&arp_tbl, in_dev->arp_parms);
+			in_dev_put(in_dev);
+			in_dev = NULL;
+			goto out;
+		}
+		ip_mc_init_dev(in_dev);
+		if (dev->flags & IFF_UP)
+			ip_mc_up(in_dev);
 	}
-	ip_mc_init_dev(in_dev);
-	if (dev->flags & IFF_UP)
-		ip_mc_up(in_dev);
 
 	/* we can receive as soon as ip_ptr is set -- do this last */
 	rcu_assign_pointer(dev->ip_ptr, in_dev);
···
 
 	in_dev_put(in_dev);
 }
+
+static int __init inet_blackhole_dev_init(void)
+{
+	int err = 0;
+
+	rtnl_lock();
+	if (!inetdev_init(blackhole_netdev))
+		err = -ENOMEM;
+	rtnl_unlock();
+
+	return err;
+}
+late_initcall(inet_blackhole_dev_init);
 
 int inet_addr_onlink(struct in_device *in_dev, __be32 a, __be32 b)
 {