Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

net: Drop the lock in skb_may_tx_timestamp()

skb_may_tx_timestamp() may acquire sock::sk_callback_lock. The lock must
not be taken in IRQ context, only softirq is okay. A few drivers receive
the timestamp via a dedicated interrupt and complete the TX timestamp
from that handler. This will lead to a deadlock if the lock is already
write-locked on the same CPU.
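
The same-CPU deadlock described above can be illustrated with a small userspace model. This is only a sketch: `model_rwlock`, `model_read_trylock()` and `model_irq_handler()` are hypothetical names invented for the illustration, and the model returns `false` where a real `read_lock()` would spin. It relies on the fact that the `_bh` lock variants keep softirqs away while the lock is held but do not mask hard interrupts.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical single-CPU model of a reader/writer spinlock.
 * This is NOT the kernel's rwlock_t, only an illustration.
 */
struct model_rwlock {
	bool write_locked;	/* a writer on this CPU holds the lock */
	int readers;		/* number of active readers */
};

/* Process context takes the lock for writing. */
static void model_write_lock(struct model_rwlock *l)
{
	assert(!l->write_locked && l->readers == 0);
	l->write_locked = true;
}

static void model_write_unlock(struct model_rwlock *l)
{
	l->write_locked = false;
}

/* A real read_lock() would spin until the writer is gone. The model
 * returns false instead of spinning so the outcome can be inspected.
 */
static bool model_read_trylock(struct model_rwlock *l)
{
	if (l->write_locked)
		return false;
	l->readers++;
	return true;
}

/* Models a hard interrupt that fires on the same CPU and tries to
 * complete a TX timestamp. Returns true if it can make progress.
 */
static bool model_irq_handler(struct model_rwlock *l)
{
	if (!model_read_trylock(l))
		return false;	/* the writer is the context we interrupted */
	l->readers--;
	return true;
}
```

In the model, calling `model_irq_handler()` between `model_write_lock()` and `model_write_unlock()` returns `false`: the interrupted writer cannot run again until the handler returns, and the handler cannot return until the writer unlocks.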

Taking the lock can be avoided. The socket (pointed to by the skb) will
remain valid until the skb is released. The ->sk_socket and ->file
members will be set to NULL once the user closes the socket, which may
happen before the timestamp arrives.
If we happen to observe a pointer while the socket is closing but
before the pointer is set to NULL, then we may use it because the
structures behind both pointers (and the file's cred member) are RCU
freed.

Drop the lock. Use READ_ONCE() to obtain the individual pointers. Add a
matching WRITE_ONCE() where the pointers are cleared.
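
The read-once-then-check pattern can be sketched in userspace as follows. Everything here is an assumption made for illustration: the `model_*` types and functions are hypothetical, the `READ_ONCE()`/`WRITE_ONCE()` macros are simplified stand-ins for the kernel ones (they need the GCC/Clang `__typeof__` extension), the `allow_data` parameter stands in for the sysctl check, and a boolean replaces the capability check. The RCU read-side section that keeps the objects alive in the kernel is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified userspace stand-ins for the kernel macros. */
#define READ_ONCE(x)		(*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v)	(*(volatile __typeof__(x) *)&(x) = (v))

/* Hypothetical miniature versions of the kernel structures. */
struct model_file { bool has_cap_net_raw; };
struct model_socket { struct model_file *file; };
struct model_sock { struct model_socket *sk_socket; };

/* Each pointer is loaded exactly once into a local variable and only
 * the local copy is tested and dereferenced, so a concurrent clear
 * cannot slip in between the NULL check and the use.
 */
static bool model_may_tx_timestamp(struct model_sock *sk, bool tsonly,
				   bool allow_data)
{
	struct model_socket *sock;
	struct model_file *file;

	assert(sk != NULL);
	if (tsonly || allow_data)
		return true;

	/* In the kernel, rcu_read_lock() would be taken here. */
	sock = READ_ONCE(sk->sk_socket);
	if (!sock)
		return false;
	file = READ_ONCE(sock->file);
	if (!file)
		return false;
	return file->has_cap_net_raw;
}

/* Models the close path: pointers are cleared with WRITE_ONCE() so the
 * stores pair with the READ_ONCE() loads above.
 */
static void model_close(struct model_sock *sk)
{
	struct model_socket *sock = READ_ONCE(sk->sk_socket);

	if (sock)
		WRITE_ONCE(sock->file, NULL);
	WRITE_ONCE(sk->sk_socket, NULL);
}
```

After `model_close()` has run, `model_may_tx_timestamp(sk, false, false)` returns `false` because the first pointer it loads is already NULL. A reader that loaded the pointers just before the close would still use its local copies, which is only safe in the kernel because the objects are RCU freed.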

Link: https://lore.kernel.org/all/20260205145104.iWinkXHv@linutronix.de
Fixes: b245be1f4db1a ("net-timestamp: no-payload only sysctl")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260220183858.N4ERjFW6@linutronix.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

authored by Sebastian Andrzej Siewior and committed by Paolo Abeni
983512f3 82aec772

+20 -7
+1 -1
include/net/sock.h
@@ -2098,7 +2098,7 @@
 
 static inline void sk_set_socket(struct sock *sk, struct socket *sock)
 {
-	sk->sk_socket = sock;
+	WRITE_ONCE(sk->sk_socket, sock);
 	if (sock) {
 		WRITE_ONCE(sk->sk_uid, SOCK_INODE(sock)->i_uid);
 		WRITE_ONCE(sk->sk_ino, SOCK_INODE(sock)->i_ino);
+18 -5
net/core/skbuff.c
@@ -5590,15 +5590,28 @@
 
 static bool skb_may_tx_timestamp(struct sock *sk, bool tsonly)
 {
-	bool ret;
+	struct socket *sock;
+	struct file *file;
+	bool ret = false;
 
 	if (likely(tsonly || READ_ONCE(sock_net(sk)->core.sysctl_tstamp_allow_data)))
 		return true;
 
-	read_lock_bh(&sk->sk_callback_lock);
-	ret = sk->sk_socket && sk->sk_socket->file &&
-	      file_ns_capable(sk->sk_socket->file, &init_user_ns, CAP_NET_RAW);
-	read_unlock_bh(&sk->sk_callback_lock);
+	/* The sk pointer remains valid as long as the skb is. The sk_socket and
+	 * file pointer may become NULL if the socket is closed. Both structures
+	 * (including file->cred) are RCU freed which means they can be accessed
+	 * within a RCU read section.
+	 */
+	rcu_read_lock();
+	sock = READ_ONCE(sk->sk_socket);
+	if (!sock)
+		goto out;
+	file = READ_ONCE(sock->file);
+	if (!file)
+		goto out;
+	ret = file_ns_capable(file, &init_user_ns, CAP_NET_RAW);
+out:
+	rcu_read_unlock();
 	return ret;
 }
 
+1 -1
net/socket.c
@@ -674,7 +674,7 @@
 		iput(SOCK_INODE(sock));
 		return;
 	}
-	sock->file = NULL;
+	WRITE_ONCE(sock->file, NULL);
 }
 
 /**