Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge branch 'net-use-skb_attempt_defer_free-in-napi_consume_skb'

Eric Dumazet says:

====================
net: use skb_attempt_defer_free() in napi_consume_skb()

There is a lack of NUMA awareness and more generally lack
of slab caches affinity on TX completion path.

Modern drivers are using napi_consume_skb(), hoping to cache sk_buff
in per-cpu caches so that they can be recycled in RX path.

Only use this if the skb was allocated on the same cpu,
otherwise use skb_attempt_defer_free() so that the skb
is freed on the original cpu.

This removes contention on SLUB spinlocks and data structures,
and this makes sure that recycled sk_buff have correct NUMA locality.

After this series, I get ~50% improvement for a UDP tx workload
on an AMD EPYC 9B45 (IDPF 200Gbit NIC with 32 TX queues).

I will later refactor skb_attempt_defer_free()
so that it no longer has to take care of skb_shared() and skb_release_head_state().
====================

Link: https://patch.msgid.link/20251106202935.1776179-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

+11 -7
+2 -2
Documentation/admin-guide/sysctl/net.rst
@@ -355,9 +355,9 @@
 -------------
 
 Max size (in skbs) of the per-cpu list of skbs being freed
-by the cpu which allocated them. Used by TCP stack so far.
+by the cpu which allocated them.
 
-Default: 64
+Default: 128
 
 optmem_max
 ----------
+1 -1
net/core/hotdata.c
@@ -20,7 +20,7 @@
 	.dev_tx_weight = 64,
 	.dev_rx_weight = 64,
 	.sysctl_max_skb_frags = MAX_SKB_FRAGS,
-	.sysctl_skb_defer_max = 64,
+	.sysctl_skb_defer_max = 128,
 	.sysctl_mem_pcpu_rsv = SK_MEMORY_PCPU_RESERVE
 };
 EXPORT_SYMBOL(net_hotdata);
+8 -4
net/core/skbuff.c
@@ -1149,11 +1149,10 @@
 		skb);
 
 #endif
+		skb->destructor = NULL;
 	}
-#if IS_ENABLED(CONFIG_NF_CONNTRACK)
-	nf_conntrack_put(skb_nfct(skb));
-#endif
-	skb_ext_put(skb);
+	nf_reset_ct(skb);
+	skb_ext_reset(skb);
 }
 
 /* Free everything but the sk_buff shell. */
@@ -1475,6 +1476,11 @@
 	}
 
 	DEBUG_NET_WARN_ON_ONCE(!in_softirq());
+
+	if (skb->alloc_cpu != smp_processor_id() && !skb_shared(skb)) {
+		skb_release_head_state(skb);
+		return skb_attempt_defer_free(skb);
+	}
 
 	if (!skb_unref(skb))
 		return;
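
The decision added to napi_consume_skb() can be modeled in user space. The following is only a rough sketch, not kernel code: `struct model_skb`, `model_consume_skb()`, `defer_count[]`, `NR_CPUS_MODEL` and the `enum free_path` values are hypothetical names made up for illustration. The fallback to an immediate free when the per-cpu defer list is full is an assumption about how skb_attempt_defer_free() behaves, which this diff does not show. The limit of 128 is the new sysctl_skb_defer_max default from the series.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS_MODEL 4      /* hypothetical: number of cpus in this toy model */
#define SKB_DEFER_MAX 128    /* new default of sysctl_skb_defer_max */

/* Hypothetical outcomes of the TX completion free path. */
enum free_path {
	FREE_LOCAL_PATH,     /* continue with the existing local path (unref, napi cache) */
	FREE_DEFER_REMOTE,   /* queued so that the allocating cpu frees it */
	FREE_IMMEDIATE,      /* assumption: defer list full, freed right away */
};

/* Hypothetical stand-in for the two sk_buff properties the new check reads. */
struct model_skb {
	int alloc_cpu;       /* cpu that allocated the skb */
	bool shared;         /* models skb_shared() */
};

/* Per-cpu count of skbs waiting to be freed by their allocating cpu. */
static unsigned int defer_count[NR_CPUS_MODEL];

/* Mirrors the shape of the check added to napi_consume_skb(). */
static enum free_path model_consume_skb(const struct model_skb *skb, int this_cpu)
{
	assert(skb->alloc_cpu >= 0 && skb->alloc_cpu < NR_CPUS_MODEL);

	if (skb->alloc_cpu != this_cpu && !skb->shared) {
		/* Assumption: a full per-cpu list falls back to an immediate free. */
		if (defer_count[skb->alloc_cpu] >= SKB_DEFER_MAX)
			return FREE_IMMEDIATE;
		defer_count[skb->alloc_cpu]++;
		return FREE_DEFER_REMOTE;
	}

	/* Same cpu, or shared skb: the pre-existing path is taken. */
	return FREE_LOCAL_PATH;
}
```

In this model, an unshared skb allocated on cpu 1 and completed on cpu 0 returns FREE_DEFER_REMOTE and bumps `defer_count[1]`, while the same skb completed on cpu 1 stays on the local path. Shared skbs always stay on the local path, which matches the `!skb_shared(skb)` condition in the hunk above.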