Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

af_unix: Drop all SCM attributes for SOCKMAP.

SOCKMAP can hide inflight fd from AF_UNIX GC.

When a socket in SOCKMAP receives an skb carrying an inflight fd,
sk_psock_verdict_data_ready() looks up the mapped socket and
enqueues the skb to its psock->ingress_skb.
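
For readers unfamiliar with the term, an "inflight fd" is a file descriptor that has been sent over an AF_UNIX socket as an SCM_RIGHTS control message and not yet received. The following is a minimal user-space sketch of that mechanism, not part of this patch; the helper names send_fd(), recv_fd() and demo_roundtrip() are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer sized for exactly one fd, aligned for struct cmsghdr. */
union fd_ctl {
	char buf[CMSG_SPACE(sizeof(int))];
	struct cmsghdr align;
};

/* Send one data byte plus `fd` as an SCM_RIGHTS control message. */
static int send_fd(int sock, int fd)
{
	char data = 'x';
	struct iovec iov = { .iov_base = &data, .iov_len = 1 };
	union fd_ctl ctl;
	struct msghdr msg = {
		.msg_iov = &iov,
		.msg_iovlen = 1,
		.msg_control = ctl.buf,
		.msg_controllen = sizeof(ctl.buf),
	};
	struct cmsghdr *cmsg;

	memset(&ctl, 0, sizeof(ctl));
	cmsg = CMSG_FIRSTHDR(&msg);
	assert(cmsg != NULL);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	/* After this returns, the fd is "in flight" until recvmsg(). */
	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive the data byte and return the passed fd, or -1 if none arrived. */
static int recv_fd(int sock)
{
	char data;
	struct iovec iov = { .iov_base = &data, .iov_len = 1 };
	union fd_ctl ctl;
	struct msghdr msg = {
		.msg_iov = &iov,
		.msg_iovlen = 1,
		.msg_control = ctl.buf,
		.msg_controllen = sizeof(ctl.buf),
	};
	struct cmsghdr *cmsg;
	int fd;

	if (recvmsg(sock, &msg, 0) != 1)
		return -1;

	cmsg = CMSG_FIRSTHDR(&msg);
	if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS)
		return -1;

	memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
	return fd;
}

/* Pass a pipe's write end across a socketpair and write through the copy. */
static int demo_roundtrip(void)
{
	int sv[2], p[2], rfd;
	char c = 0;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) || pipe(p))
		return -1;
	if (send_fd(sv[0], p[1]))
		return -1;

	rfd = recv_fd(sv[1]);
	if (rfd < 0)
		return -1;
	if (write(rfd, "k", 1) != 1 || read(p[0], &c, 1) != 1)
		return -1;

	close(rfd);
	close(p[0]);
	close(p[1]);
	close(sv[0]);
	close(sv[1]);
	return c == 'k' ? 0 : -1;
}
```

Between send_fd() and recv_fd() the passed descriptor is held only by the queued skb. That is the state the AF_UNIX GC must be able to see.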

Since neither the old nor the new GC can inspect the psock
queue, the hidden skb leaks the inflight sockets. Note that
this cannot be detected via kmemleak because inflight sockets
are linked to a global list.

In addition, SOCKMAP redirect breaks the Tarjan-based GC's
assumption that unix_edge.successor is always alive, which
is no longer true once the skb is redirected, resulting in
the use-after-free below. [0]

Moreover, SOCKMAP does not call scm_stat_del() properly,
so unix_show_fdinfo() could report an incorrect fd count.

sk_msg_recvmsg() does not support any SCM attributes in the
first place.

Let's drop all SCM attributes before passing skb to the
SOCKMAP layer.

[0]:
BUG: KASAN: slab-use-after-free in unix_del_edges (net/unix/garbage.c:118 net/unix/garbage.c:181 net/unix/garbage.c:251)
Read of size 8 at addr ffff888125362670 by task kworker/56:1/496

CPU: 56 UID: 0 PID: 496 Comm: kworker/56:1 Not tainted 7.0.0-rc7-00263-gb9d8b856689d #3 PREEMPT(lazy)
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.17.0-debian-1.17.0-1 04/01/2014
Workqueue: events sk_psock_backlog
Call Trace:
<TASK>
dump_stack_lvl (lib/dump_stack.c:122)
print_report (mm/kasan/report.c:379)
kasan_report (mm/kasan/report.c:597)
unix_del_edges (net/unix/garbage.c:118 net/unix/garbage.c:181 net/unix/garbage.c:251)
unix_destroy_fpl (net/unix/garbage.c:317)
unix_destruct_scm (./include/net/scm.h:80 ./include/net/scm.h:86 net/unix/af_unix.c:1976)
sk_psock_backlog (./include/linux/skbuff.h:?)
process_scheduled_works (kernel/workqueue.c:?)
worker_thread (kernel/workqueue.c:?)
kthread (kernel/kthread.c:438)
ret_from_fork (arch/x86/kernel/process.c:164)
ret_from_fork_asm (arch/x86/entry/entry_64.S:258)
</TASK>

Allocated by task 955:
kasan_save_track (mm/kasan/common.c:58 mm/kasan/common.c:78)
__kasan_slab_alloc (mm/kasan/common.c:369)
kmem_cache_alloc_noprof (mm/slub.c:4539)
sk_prot_alloc (net/core/sock.c:2240)
sk_alloc (net/core/sock.c:2301)
unix_create1 (net/unix/af_unix.c:1099)
unix_create (net/unix/af_unix.c:1169)
__sock_create (net/socket.c:1606)
__sys_socketpair (net/socket.c:1811)
__x64_sys_socketpair (net/socket.c:1863 net/socket.c:1860 net/socket.c:1860)
do_syscall_64 (arch/x86/entry/syscall_64.c:?)
entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)

Freed by task 496:
kasan_save_track (mm/kasan/common.c:58 mm/kasan/common.c:78)
kasan_save_free_info (mm/kasan/generic.c:587)
__kasan_slab_free (mm/kasan/common.c:287)
kmem_cache_free (mm/slub.c:6165)
__sk_destruct (net/core/sock.c:2282 net/core/sock.c:2384)
sk_psock_destroy (./include/net/sock.h:?)
process_scheduled_works (kernel/workqueue.c:?)
worker_thread (kernel/workqueue.c:?)
kthread (kernel/kthread.c:438)
ret_from_fork (arch/x86/kernel/process.c:164)
ret_from_fork_asm (arch/x86/entry/entry_64.S:258)

Fixes: c63829182c37 ("af_unix: Implement ->psock_update_sk_prot()")
Fixes: 77462de14a43 ("af_unix: Add read_sock for stream socket types")
Reported-by: Xingyu Jin <xingyuj@google.com>
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20260415184830.3988432-1-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Authored by Kuniyuki Iwashima and committed by Jakub Kicinski
965dc934 8cff9dbe

+27 -8
net/unix/af_unix.c
···
 
 static void unix_destruct_scm(struct sk_buff *skb)
 {
-	struct scm_cookie scm;
+	struct scm_cookie scm = {};
 
-	memset(&scm, 0, sizeof(scm));
-	scm.pid = UNIXCB(skb).pid;
+	swap(scm.pid, UNIXCB(skb).pid);
+
 	if (UNIXCB(skb).fp)
 		unix_detach_fds(&scm, skb);
 
-	/* Alas, it calls VFS */
-	/* So fscking what? fput() had been SMP-safe since the last Summer */
 	scm_destroy(&scm);
+}
+
+static void unix_wfree(struct sk_buff *skb)
+{
+	unix_destruct_scm(skb);
 	sock_wfree(skb);
 }
 
···
 	if (scm->fp && send_fds)
 		err = unix_attach_fds(scm, skb);
 
-	skb->destructor = unix_destruct_scm;
+	skb->destructor = unix_wfree;
 	return err;
 }
 
···
 		atomic_sub(fp->count, &u->scm_stat.nr_fds);
 		unix_del_edges(fp);
 	}
+}
+
+static void unix_orphan_scm(struct sock *sk, struct sk_buff *skb)
+{
+	scm_stat_del(sk, skb);
+	unix_destruct_scm(skb);
+	skb->destructor = sock_wfree;
 }
 
 /*
···
 	int err;
 
 	mutex_lock(&u->iolock);
+
 	skb = skb_recv_datagram(sk, MSG_DONTWAIT, &err);
-	mutex_unlock(&u->iolock);
-	if (!skb)
+	if (!skb) {
+		mutex_unlock(&u->iolock);
 		return err;
+	}
+
+	unix_orphan_scm(sk, skb);
+
+	mutex_unlock(&u->iolock);
 
 	return recv_actor(sk, skb);
 }
···
 #endif
 
 	spin_unlock(&queue->lock);
+
+	unix_orphan_scm(sk, skb);
+
 	mutex_unlock(&u->iolock);
 
 	return recv_actor(sk, skb);
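
The diff splits the old destructor in two so that the SCM attributes can be dropped early and the remaining write-memory accounting can still run at free time. The sketch below is a simplified user-space model of that ownership hand-off, not kernel code. Every `model_*` name is hypothetical, the pid and fd list are reduced to plain integers, and the scm_stat_del() accounting step is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for an skb: 0 in pid or nr_fds means "no attribute attached". */
struct model_skb {
	int pid;	/* models UNIXCB(skb).pid */
	int nr_fds;	/* models UNIXCB(skb).fp */
	void (*destructor)(struct model_skb *skb);
};

/* Counters that let a caller observe how often each release ran. */
static int released_pids, released_fds, wfree_calls;

/* Models unix_destruct_scm(): take the attributes, leave the skb empty. */
static void model_destruct_scm(struct model_skb *skb)
{
	int pid = skb->pid, nr_fds = skb->nr_fds;

	/* Clearing the fields makes a second call a no-op. */
	skb->pid = 0;
	skb->nr_fds = 0;

	if (pid)
		released_pids++;
	released_fds += nr_fds;
}

/* Models sock_wfree(): only the write-memory accounting remains. */
static void model_sock_wfree(struct model_skb *skb)
{
	(void)skb;
	wfree_calls++;
}

/* Models unix_wfree(): the default destructor set at send time. */
static void model_unix_wfree(struct model_skb *skb)
{
	model_destruct_scm(skb);
	model_sock_wfree(skb);
}

/* Models unix_orphan_scm(): drop attributes now, keep only wfree for later. */
static void model_orphan_scm(struct model_skb *skb)
{
	model_destruct_scm(skb);
	skb->destructor = model_sock_wfree;
}

/* Models freeing the skb: run whatever destructor is still installed. */
static void model_kfree_skb(struct model_skb *skb)
{
	assert(skb != NULL);
	if (skb->destructor)
		skb->destructor(skb);
}
```

In this model, calling model_orphan_scm() and later model_kfree_skb() on the same skb releases the attributes exactly once and runs the wfree step exactly once. That is the property the read_skb paths appear to rely on before handing the skb to recv_actor().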