Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge branch '200GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue

Tony Nguyen says:

====================
libeth: add libeth_xdp helper lib

Alexander Lobakin says:

Time to add XDP helpers infra to libeth to greatly simplify adding
XDP to idpf and iavf, as well as improve and extend XDP in ice and
i40e. Any vendor is free to reuse helpers. If this happens, I'm fine
with moving the folder out of intel/.

The helpers greatly simplify building xdp_buff, running a prog,
handling the verdict, implementing XDP_TX, .ndo_xdp_xmit, and XDP
buffer completion. The same applies to XSk (with XSk xmit instead of
.ndo_xdp_xmit, plus stuff like XSk wakeup).
They are entirely generic with no HW definitions or assumptions.
HW-specific stuff like parsing Rx desc / filling Tx desc is passed
from the driver as inline callbacks.

For now, the key assumptions that optimize performance / avoid code
bloat, but might not fit every driver in drivers/net/:
* netmems holding the buffers are always order-0;
* the driver has separate XDP Tx queues and doesn't use stack queues
  for that. For best efficiency, you may want to have nr_cpu_ids XDP
  queues, but fewer (queue sharing) is also supported;
* XDP Tx queues are interrupt-less and use "lazy" cleaning only
  when fewer than 1/4 of the queue's Tx descriptors are free;
* main target platforms are 64-bit; 32-bit is also fully supported,
  but the code might not be as optimized for it.
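
The 1/4 cleaning threshold mentioned in the list above can be illustrated with a small userspace sketch. It follows the arithmetic of libeth_xdp_queue_threshold() from this series, but the helper names (`is_pow2`, `rounddown_pow2`, `roundup_pow2`, `queue_threshold`) are stand-ins invented here for the kernel's is_power_of_2(), rounddown_pow_of_two(), roundup_pow_of_two() and DIV_ROUND_CLOSEST(), so treat it as an approximation rather than the library code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's is_power_of_2() */
static bool is_pow2(uint32_t n)
{
	return n && !(n & (n - 1));
}

/* Largest power of two <= n; n must be non-zero */
static uint32_t rounddown_pow2(uint32_t n)
{
	uint32_t p = 1;

	assert(n);
	while (p <= n / 2)
		p *= 2;

	return p;
}

/* Smallest power of two >= n; n must be in [1, 2^31] */
static uint32_t roundup_pow2(uint32_t n)
{
	uint32_t p = rounddown_pow2(n);

	return p == n ? p : p * 2;
}

/*
 * Power-of-two threshold closest to 1/4 of the queue length.
 * Meant to be computed once at queue setup and cached, not on hotpath.
 */
uint32_t queue_threshold(uint32_t count)
{
	uint32_t quarter, low, high;

	assert(count);

	if (is_pow2(count))
		return count >> 2;

	/* count / 4 rounded to the closest integer, ties rounded up */
	quarter = count / 4 + (count % 4 >= 2);
	low = rounddown_pow2(quarter);
	high = roundup_pow2(quarter);

	/* Pick whichever power of two is nearer, preferring the higher */
	return high - quarter <= quarter - low ? high : low;
}
```

With these definitions, a 512-entry queue gets a threshold of 128, while a 1000-entry queue has its quarter (250) rounded to the nearest power of two, 256.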

Library code already supports multi-buffer for all kinds of Tx and
both header split and no split for Rx and Tx. Frags can come from
devmem/io_uring etc., direct `struct page *` is used only for header
buffers for which it's always true.
Drivers are free to pass their own Rx hints and XSK xmit hints ops.

XDP_TX and ndo_xdp_xmit use an onstack bulk for the frames to be sent
and send them in batches of 16 buffers. This eats ~280 bytes on the
stack, but gives good boosts and allows the main sending function to
be greatly optimized, leaving it without any error/exception paths.
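
A rough userspace sketch of the on-stack bulking idea described above: frames are collected in a fixed array of 16 and handed to the send callback in one go, and the frames that could not be sent are either kept at the front of the bulk for a retry or dropped. All names here (`struct frame`, `struct tx_bulk`, `tx_bulk_flush`, `tx_bulk_enqueue`, `count_xmit`) are hypothetical and only loosely modelled on the series; the real frame layout and flags differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define BULK_MAX	16	/* batch size named in the commit message */

struct frame {
	const void *data;
	uint32_t len;
};

/* Returns how many of the @n frames were actually queued to "HW" */
typedef uint32_t (*xmit_fn)(const struct frame *frames, uint32_t n,
			    void *priv);

struct tx_bulk {
	struct frame bulk[BULK_MAX];
	uint32_t count;
	xmit_fn xmit;
	void *priv;
};

/*
 * Send everything queued so far. Unsent frames are moved to the start
 * of the bulk for a later retry, unless @drop is set.
 */
uint32_t tx_bulk_flush(struct tx_bulk *bq, bool drop)
{
	uint32_t sent, left;

	if (!bq->count)
		return 0;

	sent = bq->xmit(bq->bulk, bq->count, bq->priv);
	assert(sent <= bq->count);

	left = bq->count - sent;
	if (left && !drop)
		memmove(bq->bulk, &bq->bulk[sent], left * sizeof(*bq->bulk));
	else
		left = 0;

	bq->count = left;

	return sent;
}

/* Queue one frame, flushing first when the bulk is full */
bool tx_bulk_enqueue(struct tx_bulk *bq, struct frame frm)
{
	if (bq->count == BULK_MAX && !tx_bulk_flush(bq, false))
		return false;	/* nothing was sent, still no room */

	bq->bulk[bq->count++] = frm;

	return true;
}

/* Sample callback for experiments: "sends" everything, counts frames */
uint32_t count_xmit(const struct frame *frames, uint32_t n, void *priv)
{
	(void)frames;
	*(uint32_t *)priv += n;

	return n;
}
```

The enqueue path stays a plain array store in the common case; the flush, retry and drop logic lives out of line, which is the property the commit message is after.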

XSk xmit fills Tx descriptors in a loop unrolled by 8. This was
proven to improve perf on ice and i40e. XDP_TX and ndo_xdp_xmit
don't use unrolling, as I wasn't able to get any improvements in
those scenarios from it, while +1 KB for their sending functions
for nothing doesn't sound reasonable.
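
For readers unfamiliar with the technique, here is a hand-unrolled sketch of filling descriptors eight at a time with a scalar tail loop. The descriptor and request layouts (`struct tx_desc`, `struct xmit_req`) and the function names are made up for illustration; the library may well rely on compiler unrolling hints instead of writing the eight calls out by hand.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical Tx descriptor and xmit request layouts */
struct tx_desc {
	uint64_t addr;
	uint32_t len;
	uint32_t flags;
};

struct xmit_req {
	uint64_t addr;
	uint32_t len;
};

static inline void fill_one(struct tx_desc *desc, const struct xmit_req *req)
{
	desc->addr = req->addr;
	desc->len = req->len;
	desc->flags = 0;
}

/* Fill @n descriptors: main loop unrolled by 8, remainder one by one */
void fill_descs(struct tx_desc *ring, const struct xmit_req *reqs, uint32_t n)
{
	uint32_t i = 0;

	assert(ring && reqs);

	while (n - i >= 8) {
		fill_one(&ring[i + 0], &reqs[i + 0]);
		fill_one(&ring[i + 1], &reqs[i + 1]);
		fill_one(&ring[i + 2], &reqs[i + 2]);
		fill_one(&ring[i + 3], &reqs[i + 3]);
		fill_one(&ring[i + 4], &reqs[i + 4]);
		fill_one(&ring[i + 5], &reqs[i + 5]);
		fill_one(&ring[i + 6], &reqs[i + 6]);
		fill_one(&ring[i + 7], &reqs[i + 7]);
		i += 8;
	}

	for (; i < n; i++)
		fill_one(&ring[i], &reqs[i]);
}
```

The trade-off is the one the message spells out: fewer loop-control branches per descriptor at the cost of a larger function body, which only pays off where it measurably helps.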

XSk wakeup, instead of traditionally used "SW interrupts" provided
by NICs, uses IPI to schedule NAPI on the CPU corresponding to the
given queue pair. It gives better control over CPU distribution and
in general performs way better than "SW interrupts", plus allows us
to not pass any HW-specific callbacks there.

The code is built in such a way that all callbacks passed from drivers
get inlined; in general, most of the hotpath gets inlined. Everything
slow/exception lands in .c files in the libeth folder, doesn't
create copies in the drivers themselves and doesn't bloat the
hotpath.
Sure, inlining means that hotpath will be compiled into every driver
that uses the lib, but the core code is written in one place, so no
copying of bugs happens. Fixed once -- works everywhere.
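
The "callbacks get inlined" approach can be sketched in plain C as a forced-inline generic routine that takes the HW-specific part as a function pointer; when a driver wrapper passes a compile-time constant callback, GCC and Clang can fold the indirect call away. Everything below (`generic_xmit`, `drv_fill_desc`, `drv_xmit`, the descriptor encoding) is hypothetical and only demonstrates the pattern, assuming a compiler that supports `__attribute__((always_inline))`.

```c
#include <assert.h>
#include <stdint.h>

/* HW-specific part supplied by the driver; returns bytes queued */
typedef uint32_t (*fill_desc_fn)(uint64_t *desc, uint32_t len);

/*
 * Generic core, written once. Forced inlining lets the compiler see
 * the constant @fill passed by each driver wrapper and inline it too.
 */
static inline __attribute__((always_inline)) uint32_t
generic_xmit(uint64_t *ring, const uint32_t *lens, uint32_t n,
	     fill_desc_fn fill)
{
	uint32_t bytes = 0;

	assert(fill);

	for (uint32_t i = 0; i < n; i++)
		bytes += fill(&ring[i], lens[i]);

	return bytes;
}

/* Hypothetical descriptor format: length in the upper half, flag bit 0 */
static inline uint32_t drv_fill_desc(uint64_t *desc, uint32_t len)
{
	*desc = ((uint64_t)len << 32) | 0x1;

	return len;
}

/* Driver-side function: a one-line wrapper around the generic core */
uint32_t drv_xmit(uint64_t *ring, const uint32_t *lens, uint32_t n)
{
	return generic_xmit(ring, lens, n, drv_fill_desc);
}
```

Each driver ends up with its own specialized copy of the hot loop in object code, while the source for that loop exists in exactly one place.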

The last commit might look like sort of a hack, but it gives really good
boosts and decreases object code size, plus there are checks that
all those wider accesses are fully safe, so I don't feel anything
bad about it.
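
The commit in question is not shown on this page, so the following is only a guess at the general shape of "accessing adjacent u32s as u64": two neighbouring 32-bit fields are overlaid with one 64-bit member, and compile-time checks guard the layout the wide access depends on. The structure and field names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical frame descriptor with two adjacent 32-bit fields */
struct tx_frame {
	union {
		struct {
			uint32_t len;
			uint32_t flags;
		};
		uint64_t len_flags;	/* both fields in one access */
	};
};

/* The wide access is only valid if the fields really are adjacent */
static_assert(offsetof(struct tx_frame, flags) ==
	      offsetof(struct tx_frame, len) + sizeof(uint32_t),
	      "len and flags must be adjacent");
static_assert(sizeof(struct tx_frame) == sizeof(uint64_t),
	      "no padding expected around len/flags");

/* One 64-bit store instead of two 32-bit ones */
void tx_frame_clear(struct tx_frame *frm)
{
	assert(frm);
	frm->len_flags = 0;
}

/* One 64-bit load/store pair instead of two 32-bit pairs */
void tx_frame_copy(struct tx_frame *dst, const struct tx_frame *src)
{
	assert(dst && src);
	dst->len_flags = src->len_flags;
}
```

Because both views share the same storage, the copy is independent of endianness; only code that builds the 64-bit value by hand would have to care about byte order.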

An example of using libeth_xdp can be found either on my GitHub or
on the mailing lists here ("XDP for idpf"). Thanks to the macros for
building driver XDP functions, some implementations (XDP_TX,
ndo_xdp_xmit etc.) consist of really only a few lines.

* '200GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
libeth: xdp, xsk: access adjacent u32s as u64 where applicable
libeth: xsk: add XSkFQ refill and XSk wakeup helpers
libeth: xsk: add XSk Rx processing support
libeth: xsk: add XSk xmit functions
libeth: xsk: add XSk XDP_TX sending helpers
libeth: xdp: add RSS hash hint and XDP features setup helpers
libeth: xdp: add templates for building driver-side callbacks
libeth: xdp: add XDP prog run and verdict result handling
libeth: xdp: add helpers for preparing/processing &libeth_xdp_buff
libeth: xdp: add XDPSQ cleanup timers
libeth: xdp: add XDPSQ locking helpers
libeth: xdp: add XDPSQE completion helpers
libeth: xdp: add .ndo_xdp_xmit() helpers
libeth: xdp: add XDP_TX buffers sending
libeth: support native XDP and register memory model
libeth: convert to netmem
libeth, libie: clean symbol exports up a little
====================

Link: https://patch.msgid.link/20250616201639.710420-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

+3596 -57
+8 -6
drivers/net/ethernet/intel/iavf/iavf_txrx.c
···
 	for (u32 i = rx_ring->next_to_clean; i != rx_ring->next_to_use; ) {
 		const struct libeth_fqe *rx_fqes = &rx_ring->rx_fqes[i];
 
-		page_pool_put_full_page(rx_ring->pp, rx_fqes->page, false);
+		libeth_rx_recycle_slow(rx_fqes->netmem);
 
 		if (unlikely(++i == rx_ring->count))
 			i = 0;
···
 			     const struct libeth_fqe *rx_buffer,
 			     unsigned int size)
 {
-	u32 hr = rx_buffer->page->pp->p.offset;
+	u32 hr = netmem_get_pp(rx_buffer->netmem)->p.offset;
 
-	skb_add_rx_frag(skb, skb_shinfo(skb)->nr_frags, rx_buffer->page,
-			rx_buffer->offset + hr, size, rx_buffer->truesize);
+	skb_add_rx_frag_netmem(skb, skb_shinfo(skb)->nr_frags,
+			       rx_buffer->netmem, rx_buffer->offset + hr,
+			       size, rx_buffer->truesize);
 }
 
 /**
···
 static struct sk_buff *iavf_build_skb(const struct libeth_fqe *rx_buffer,
 				      unsigned int size)
 {
-	u32 hr = rx_buffer->page->pp->p.offset;
+	struct page *buf_page = __netmem_to_page(rx_buffer->netmem);
+	u32 hr = buf_page->pp->p.offset;
 	struct sk_buff *skb;
 	void *va;
 
 	/* prefetch first cache line of first page */
-	va = page_address(rx_buffer->page) + rx_buffer->offset;
+	va = page_address(buf_page) + rx_buffer->offset;
 	net_prefetch(va + hr);
 
 	/* build an skb around the page buffer */
+1 -1
drivers/net/ethernet/intel/idpf/idpf_singleq_txrx.c
···
 			break;
 
 skip_data:
-		rx_buf->page = NULL;
+		rx_buf->netmem = 0;
 
 		IDPF_SINGLEQ_BUMP_RING_IDX(rx_q, ntc);
 		cleaned_count++;
+21 -15
drivers/net/ethernet/intel/idpf/idpf_txrx.c
···
  */
 static void idpf_rx_page_rel(struct libeth_fqe *rx_buf)
 {
-	if (unlikely(!rx_buf->page))
+	if (unlikely(!rx_buf->netmem))
 		return;
 
-	page_pool_put_full_page(rx_buf->page->pp, rx_buf->page, false);
+	libeth_rx_recycle_slow(rx_buf->netmem);
 
-	rx_buf->page = NULL;
+	rx_buf->netmem = 0;
 	rx_buf->offset = 0;
 }
 
···
 void idpf_rx_add_frag(struct idpf_rx_buf *rx_buf, struct sk_buff *skb,
 		      unsigned int size)
 {
-	u32 hr = rx_buf->page->pp->p.offset;
+	u32 hr = netmem_get_pp(rx_buf->netmem)->p.offset;
 
-	skb_add_rx_frag(skb, skb_shinfo(skb)->nr_frags, rx_buf->page,
-			rx_buf->offset + hr, size, rx_buf->truesize);
+	skb_add_rx_frag_netmem(skb, skb_shinfo(skb)->nr_frags, rx_buf->netmem,
+			       rx_buf->offset + hr, size, rx_buf->truesize);
 }
 
 /**
···
 			     struct libeth_fqe *buf, u32 data_len)
 {
 	u32 copy = data_len <= L1_CACHE_BYTES ? data_len : ETH_HLEN;
+	struct page *hdr_page, *buf_page;
 	const void *src;
 	void *dst;
 
-	if (!libeth_rx_sync_for_cpu(buf, copy))
+	if (unlikely(netmem_is_net_iov(buf->netmem)) ||
+	    !libeth_rx_sync_for_cpu(buf, copy))
 		return 0;
 
-	dst = page_address(hdr->page) + hdr->offset + hdr->page->pp->p.offset;
-	src = page_address(buf->page) + buf->offset + buf->page->pp->p.offset;
-	memcpy(dst, src, LARGEST_ALIGN(copy));
+	hdr_page = __netmem_to_page(hdr->netmem);
+	buf_page = __netmem_to_page(buf->netmem);
+	dst = page_address(hdr_page) + hdr->offset + hdr_page->pp->p.offset;
+	src = page_address(buf_page) + buf->offset + buf_page->pp->p.offset;
 
+	memcpy(dst, src, LARGEST_ALIGN(copy));
 	buf->offset += copy;
 
 	return copy;
···
  */
 struct sk_buff *idpf_rx_build_skb(const struct libeth_fqe *buf, u32 size)
 {
-	u32 hr = buf->page->pp->p.offset;
+	struct page *buf_page = __netmem_to_page(buf->netmem);
+	u32 hr = buf_page->pp->p.offset;
 	struct sk_buff *skb;
 	void *va;
 
-	va = page_address(buf->page) + buf->offset;
+	va = page_address(buf_page) + buf->offset;
 	prefetch(va + hr);
 
 	skb = napi_build_skb(va, buf->truesize);
···
 
 		if (unlikely(!hdr_len && !skb)) {
 			hdr_len = idpf_rx_hsplit_wa(hdr, rx_buf, pkt_len);
-			pkt_len -= hdr_len;
+			/* If failed, drop both buffers by setting len to 0 */
+			pkt_len -= hdr_len ? : pkt_len;
 
 			u64_stats_update_begin(&rxq->stats_sync);
 			u64_stats_inc(&rxq->q_stats.hsplit_buf_ovf);
···
 			u64_stats_update_end(&rxq->stats_sync);
 		}
 
-		hdr->page = NULL;
+		hdr->netmem = 0;
 
 payload:
 		if (!libeth_rx_sync_for_cpu(rx_buf, pkt_len))
···
 			break;
 
 skip_data:
-		rx_buf->page = NULL;
+		rx_buf->netmem = 0;
 
 		idpf_rx_post_buf_refill(refillq, buf_id);
 		IDPF_RX_BUMP_NTC(rxq, ntc);
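
The `pkt_len -= hdr_len ? : pkt_len;` line in the diff above uses the GNU C conditional with an omitted middle operand (`a ? : b` evaluates to `a` when it is non-zero, otherwise to `b`). A portable restatement of the same arithmetic, using a hypothetical helper name that does not exist in idpf, could look like this:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Remaining payload length after the header-split workaround copied
 * @hdr_len bytes. A zero @hdr_len means the copy failed, in which case
 * the whole packet length is subtracted so that nothing is left.
 */
uint32_t payload_len_after_hsplit_wa(uint32_t pkt_len, uint32_t hdr_len)
{
	assert(hdr_len <= pkt_len);

	return pkt_len - (hdr_len ? hdr_len : pkt_len);
}
```

So a 1500-byte packet with 14 bytes copied leaves 1486 bytes of payload, while a failed copy leaves 0, which is what makes the later code drop both buffers.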
+8 -2
drivers/net/ethernet/intel/libeth/Kconfig
···
 # SPDX-License-Identifier: GPL-2.0-only
-# Copyright (C) 2024 Intel Corporation
+# Copyright (C) 2024-2025 Intel Corporation
 
 config LIBETH
-	tristate
+	tristate "Common Ethernet library (libeth)" if COMPILE_TEST
 	select PAGE_POOL
 	help
 	  libeth is a common library containing routines shared between several
 	  drivers, but not yet promoted to the generic kernel API.
+
+config LIBETH_XDP
+	tristate "Common XDP library (libeth_xdp)" if COMPILE_TEST
+	select LIBETH
+	help
+	  XDP and XSk helpers based on libeth hotpath management.
+7 -1
drivers/net/ethernet/intel/libeth/Makefile
···
 # SPDX-License-Identifier: GPL-2.0-only
-# Copyright (C) 2024 Intel Corporation
+# Copyright (C) 2024-2025 Intel Corporation
 
 obj-$(CONFIG_LIBETH) += libeth.o
 
 libeth-y := rx.o
+libeth-y += tx.o
+
+obj-$(CONFIG_LIBETH_XDP) += libeth_xdp.o
+
+libeth_xdp-y += xdp.o
+libeth_xdp-y += xsk.o
+37
drivers/net/ethernet/intel/libeth/priv.h
···
+/* SPDX-License-Identifier: GPL-2.0-only */
+/* Copyright (C) 2025 Intel Corporation */
+
+#ifndef __LIBETH_PRIV_H
+#define __LIBETH_PRIV_H
+
+#include <linux/types.h>
+
+/* XDP */
+
+enum xdp_action;
+struct libeth_xdp_buff;
+struct libeth_xdp_tx_frame;
+struct skb_shared_info;
+struct xdp_frame_bulk;
+
+extern const struct xsk_tx_metadata_ops libeth_xsktmo_slow;
+
+void libeth_xsk_tx_return_bulk(const struct libeth_xdp_tx_frame *bq,
+			       u32 count);
+u32 libeth_xsk_prog_exception(struct libeth_xdp_buff *xdp, enum xdp_action act,
+			      int ret);
+
+struct libeth_xdp_ops {
+	void (*bulk)(const struct skb_shared_info *sinfo,
+		     struct xdp_frame_bulk *bq, bool frags);
+	void (*xsk)(struct libeth_xdp_buff *xdp);
+};
+
+void libeth_attach_xdp(const struct libeth_xdp_ops *ops);
+
+static inline void libeth_detach_xdp(void)
+{
+	libeth_attach_xdp(NULL);
+}
+
+#endif /* __LIBETH_PRIV_H */
+28 -14
drivers/net/ethernet/intel/libeth/rx.c
···
 // SPDX-License-Identifier: GPL-2.0-only
-/* Copyright (C) 2024 Intel Corporation */
+/* Copyright (C) 2024-2025 Intel Corporation */
+
+#define DEFAULT_SYMBOL_NAMESPACE "LIBETH"
+
+#include <linux/export.h>
 
 #include <net/libeth/rx.h>
 
···
 static bool libeth_rx_page_pool_params(struct libeth_fq *fq,
 				       struct page_pool_params *pp)
 {
-	pp->offset = LIBETH_SKB_HEADROOM;
+	pp->offset = fq->xdp ? LIBETH_XDP_HEADROOM : LIBETH_SKB_HEADROOM;
 	/* HW-writeable / syncable length per one page */
 	pp->max_len = LIBETH_RX_PAGE_LEN(pp->offset);
 
···
 		.dev = napi->dev->dev.parent,
 		.netdev = napi->dev,
 		.napi = napi,
-		.dma_dir = DMA_FROM_DEVICE,
 	};
 	struct libeth_fqe *fqes;
 	struct page_pool *pool;
-	bool ret;
+	int ret;
+
+	pp.dma_dir = fq->xdp ? DMA_BIDIRECTIONAL : DMA_FROM_DEVICE;
 
 	if (!fq->hsplit)
 		ret = libeth_rx_page_pool_params(fq, &pp);
···
 		return PTR_ERR(pool);
 
 	fqes = kvcalloc_node(fq->count, sizeof(*fqes), GFP_KERNEL, fq->nid);
-	if (!fqes)
+	if (!fqes) {
+		ret = -ENOMEM;
 		goto err_buf;
+	}
+
+	ret = xdp_reg_page_pool(pool);
+	if (ret)
+		goto err_mem;
 
 	fq->fqes = fqes;
 	fq->pp = pool;
 
 	return 0;
 
+err_mem:
+	kvfree(fqes);
 err_buf:
 	page_pool_destroy(pool);
 
-	return -ENOMEM;
+	return ret;
 }
-EXPORT_SYMBOL_NS_GPL(libeth_rx_fq_create, "LIBETH");
+EXPORT_SYMBOL_GPL(libeth_rx_fq_create);
 
 /**
  * libeth_rx_fq_destroy - destroy a &page_pool created by libeth
···
  */
 void libeth_rx_fq_destroy(struct libeth_fq *fq)
 {
+	xdp_unreg_page_pool(fq->pp);
 	kvfree(fq->fqes);
 	page_pool_destroy(fq->pp);
 }
-EXPORT_SYMBOL_NS_GPL(libeth_rx_fq_destroy, "LIBETH");
+EXPORT_SYMBOL_GPL(libeth_rx_fq_destroy);
 
 /**
- * libeth_rx_recycle_slow - recycle a libeth page from the NAPI context
- * @page: page to recycle
+ * libeth_rx_recycle_slow - recycle libeth netmem
+ * @netmem: network memory to recycle
  *
  * To be used on exceptions or rare cases not requiring fast inline recycling.
  */
-void libeth_rx_recycle_slow(struct page *page)
+void __cold libeth_rx_recycle_slow(netmem_ref netmem)
 {
-	page_pool_recycle_direct(page->pp, page);
+	page_pool_put_full_netmem(netmem_get_pp(netmem), netmem, false);
 }
-EXPORT_SYMBOL_NS_GPL(libeth_rx_recycle_slow, "LIBETH");
+EXPORT_SYMBOL_GPL(libeth_rx_recycle_slow);
 
 /* Converting abstract packet type numbers into a software structure with
  * the packet parameters to do O(1) lookup on Rx.
···
 	pt->hash_type |= libeth_rx_pt_xdp_iprot[pt->inner_prot];
 	pt->hash_type |= libeth_rx_pt_xdp_pl[pt->payload_layer];
 }
-EXPORT_SYMBOL_NS_GPL(libeth_rx_pt_gen_hash_type, "LIBETH");
+EXPORT_SYMBOL_GPL(libeth_rx_pt_gen_hash_type);
 
 /* Module */
 
+41
drivers/net/ethernet/intel/libeth/tx.c
···
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright (C) 2025 Intel Corporation */
+
+#define DEFAULT_SYMBOL_NAMESPACE "LIBETH"
+
+#include <net/libeth/xdp.h>
+
+#include "priv.h"
+
+/* Tx buffer completion */
+
+DEFINE_STATIC_CALL_NULL(bulk, libeth_xdp_return_buff_bulk);
+DEFINE_STATIC_CALL_NULL(xsk, libeth_xsk_buff_free_slow);
+
+/**
+ * libeth_tx_complete_any - perform Tx completion for one SQE of any type
+ * @sqe: Tx buffer to complete
+ * @cp: polling params
+ *
+ * Can be used to complete both regular and XDP SQEs, for example when
+ * destroying queues.
+ * When libeth_xdp is not loaded, XDPSQEs won't be handled.
+ */
+void libeth_tx_complete_any(struct libeth_sqe *sqe, struct libeth_cq_pp *cp)
+{
+	if (sqe->type >= __LIBETH_SQE_XDP_START)
+		__libeth_xdp_complete_tx(sqe, cp, static_call(bulk),
+					 static_call(xsk));
+	else
+		libeth_tx_complete(sqe, cp);
+}
+EXPORT_SYMBOL_GPL(libeth_tx_complete_any);
+
+/* Module */
+
+void libeth_attach_xdp(const struct libeth_xdp_ops *ops)
+{
+	static_call_update(bulk, ops ? ops->bulk : NULL);
+	static_call_update(xsk, ops ? ops->xsk : NULL);
+}
+EXPORT_SYMBOL_GPL(libeth_attach_xdp);
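
The attach/detach scheme in tx.c lets the base module complete XDP buffers only while the optional XDP module is loaded. A simplified userspace model of that dispatch is sketched below; it uses a plain ops pointer where the kernel code uses static calls, and all type and function names (`enum sqe_type`, `struct sqe`, `struct xdp_ops`, `attach_xdp`, `complete_any`, `sample_xdp_ops`) are invented for the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum sqe_type {
	SQE_EMPTY,
	SQE_SKB,
	SQE_XDP_START,			/* first XDP-only type */
	SQE_XDP_TX = SQE_XDP_START,
	SQE_XSK_TX,
};

struct sqe {
	enum sqe_type type;
	bool completed;
};

/* Callbacks provided by the optional XDP part */
struct xdp_ops {
	void (*bulk)(struct sqe *sqe);
	void (*xsk)(struct sqe *sqe);
};

/* NULL while the XDP part is not "loaded" */
static const struct xdp_ops *attached_ops;

void attach_xdp(const struct xdp_ops *ops)
{
	attached_ops = ops;
}

/* Complete an entry of any type; returns false if it was not handled */
bool complete_any(struct sqe *sqe)
{
	assert(sqe);

	if (sqe->type < SQE_XDP_START) {
		sqe->completed = true;
		return true;
	}

	if (!attached_ops)
		return false;	/* XDP entries need the XDP part */

	if (sqe->type == SQE_XSK_TX)
		attached_ops->xsk(sqe);
	else
		attached_ops->bulk(sqe);

	return true;
}

/* Sample ops for experiments: both callbacks just mark completion */
static void sample_complete(struct sqe *sqe)
{
	sqe->completed = true;
}

const struct xdp_ops sample_xdp_ops = {
	.bulk = sample_complete,
	.xsk = sample_complete,
};
```

The design keeps the base module free of a hard dependency on the XDP code: regular entries are always handled, and XDP entries are silently left alone until something attaches the ops.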
+451
drivers/net/ethernet/intel/libeth/xdp.c
···
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright (C) 2025 Intel Corporation */
+
+#define DEFAULT_SYMBOL_NAMESPACE "LIBETH_XDP"
+
+#include <linux/export.h>
+
+#include <net/libeth/xdp.h>
+
+#include "priv.h"
+
+/* XDPSQ sharing */
+
+DEFINE_STATIC_KEY_FALSE(libeth_xdpsq_share);
+EXPORT_SYMBOL_GPL(libeth_xdpsq_share);
+
+void __libeth_xdpsq_get(struct libeth_xdpsq_lock *lock,
+			const struct net_device *dev)
+{
+	bool warn;
+
+	spin_lock_init(&lock->lock);
+	lock->share = true;
+
+	warn = !static_key_enabled(&libeth_xdpsq_share);
+	static_branch_inc(&libeth_xdpsq_share);
+
+	if (warn && net_ratelimit())
+		netdev_warn(dev, "XDPSQ sharing enabled, possible XDP Tx slowdown\n");
+}
+EXPORT_SYMBOL_GPL(__libeth_xdpsq_get);
+
+void __libeth_xdpsq_put(struct libeth_xdpsq_lock *lock,
+			const struct net_device *dev)
+{
+	static_branch_dec(&libeth_xdpsq_share);
+
+	if (!static_key_enabled(&libeth_xdpsq_share) && net_ratelimit())
+		netdev_notice(dev, "XDPSQ sharing disabled\n");
+
+	lock->share = false;
+}
+EXPORT_SYMBOL_GPL(__libeth_xdpsq_put);
+
+void __acquires(&lock->lock)
+__libeth_xdpsq_lock(struct libeth_xdpsq_lock *lock)
+{
+	spin_lock(&lock->lock);
+}
+EXPORT_SYMBOL_GPL(__libeth_xdpsq_lock);
+
+void __releases(&lock->lock)
+__libeth_xdpsq_unlock(struct libeth_xdpsq_lock *lock)
+{
+	spin_unlock(&lock->lock);
+}
+EXPORT_SYMBOL_GPL(__libeth_xdpsq_unlock);
+
+/* XDPSQ clean-up timers */
+
+/**
+ * libeth_xdpsq_init_timer - initialize an XDPSQ clean-up timer
+ * @timer: timer to initialize
+ * @xdpsq: queue this timer belongs to
+ * @lock: corresponding XDPSQ lock
+ * @poll: queue polling/completion function
+ *
+ * XDPSQ clean-up timers must be set up before using at the queue configuration
+ * time. Set the required pointers and the cleaning callback.
+ */
+void libeth_xdpsq_init_timer(struct libeth_xdpsq_timer *timer, void *xdpsq,
+			     struct libeth_xdpsq_lock *lock,
+			     void (*poll)(struct work_struct *work))
+{
+	timer->xdpsq = xdpsq;
+	timer->lock = lock;
+
+	INIT_DELAYED_WORK(&timer->dwork, poll);
+}
+EXPORT_SYMBOL_GPL(libeth_xdpsq_init_timer);
+
+/* ``XDP_TX`` bulking */
+
+static void __cold
+libeth_xdp_tx_return_one(const struct libeth_xdp_tx_frame *frm)
+{
+	if (frm->len_fl & LIBETH_XDP_TX_MULTI)
+		libeth_xdp_return_frags(frm->data + frm->soff, true);
+
+	libeth_xdp_return_va(frm->data, true);
+}
+
+static void __cold
+libeth_xdp_tx_return_bulk(const struct libeth_xdp_tx_frame *bq, u32 count)
+{
+	for (u32 i = 0; i < count; i++) {
+		const struct libeth_xdp_tx_frame *frm = &bq[i];
+
+		if (!(frm->len_fl & LIBETH_XDP_TX_FIRST))
+			continue;
+
+		libeth_xdp_tx_return_one(frm);
+	}
+}
+
+static void __cold libeth_trace_xdp_exception(const struct net_device *dev,
+					      const struct bpf_prog *prog,
+					      u32 act)
+{
+	trace_xdp_exception(dev, prog, act);
+}
+
+/**
+ * libeth_xdp_tx_exception - handle Tx exceptions of XDP frames
+ * @bq: XDP Tx frame bulk
+ * @sent: number of frames sent successfully (from this bulk)
+ * @flags: internal libeth_xdp flags (XSk, .ndo_xdp_xmit etc.)
+ *
+ * Cold helper used by __libeth_xdp_tx_flush_bulk(), do not call directly.
+ * Reports XDP Tx exceptions, frees the frames that won't be sent or adjust
+ * the Tx bulk to try again later.
+ */
+void __cold libeth_xdp_tx_exception(struct libeth_xdp_tx_bulk *bq, u32 sent,
+				    u32 flags)
+{
+	const struct libeth_xdp_tx_frame *pos = &bq->bulk[sent];
+	u32 left = bq->count - sent;
+
+	if (!(flags & LIBETH_XDP_TX_NDO))
+		libeth_trace_xdp_exception(bq->dev, bq->prog, XDP_TX);
+
+	if (!(flags & LIBETH_XDP_TX_DROP)) {
+		memmove(bq->bulk, pos, left * sizeof(*bq->bulk));
+		bq->count = left;
+
+		return;
+	}
+
+	if (flags & LIBETH_XDP_TX_XSK)
+		libeth_xsk_tx_return_bulk(pos, left);
+	else if (!(flags & LIBETH_XDP_TX_NDO))
+		libeth_xdp_tx_return_bulk(pos, left);
+	else
+		libeth_xdp_xmit_return_bulk(pos, left, bq->dev);
+
+	bq->count = 0;
+}
+EXPORT_SYMBOL_GPL(libeth_xdp_tx_exception);
+
+/* .ndo_xdp_xmit() implementation */
+
+u32 __cold libeth_xdp_xmit_return_bulk(const struct libeth_xdp_tx_frame *bq,
+				       u32 count, const struct net_device *dev)
+{
+	u32 n = 0;
+
+	for (u32 i = 0; i < count; i++) {
+		const struct libeth_xdp_tx_frame *frm = &bq[i];
+		dma_addr_t dma;
+
+		if (frm->flags & LIBETH_XDP_TX_FIRST)
+			dma = *libeth_xdp_xmit_frame_dma(frm->xdpf);
+		else
+			dma = dma_unmap_addr(frm, dma);
+
+		dma_unmap_page(dev->dev.parent, dma, dma_unmap_len(frm, len),
+			       DMA_TO_DEVICE);
+
+		/* Actual xdp_frames are freed by the core */
+		n += !!(frm->flags & LIBETH_XDP_TX_FIRST);
+	}
+
+	return n;
+}
+EXPORT_SYMBOL_GPL(libeth_xdp_xmit_return_bulk);
+
+/* Rx polling path */
+
+/**
+ * libeth_xdp_load_stash - recreate an &xdp_buff from libeth_xdp buffer stash
+ * @dst: target &libeth_xdp_buff to initialize
+ * @src: source stash
+ *
+ * External helper used by libeth_xdp_init_buff(), do not call directly.
+ * Recreate an onstack &libeth_xdp_buff using the stash saved earlier.
+ * The only field untouched (rxq) is initialized later in the
+ * abovementioned function.
+ */
+void libeth_xdp_load_stash(struct libeth_xdp_buff *dst,
+			   const struct libeth_xdp_buff_stash *src)
+{
+	dst->data = src->data;
+	dst->base.data_end = src->data + src->len;
+	dst->base.data_meta = src->data;
+	dst->base.data_hard_start = src->data - src->headroom;
+
+	dst->base.frame_sz = src->frame_sz;
+	dst->base.flags = src->flags;
+}
+EXPORT_SYMBOL_GPL(libeth_xdp_load_stash);
+
+/**
+ * libeth_xdp_save_stash - convert &xdp_buff to a libeth_xdp buffer stash
+ * @dst: target &libeth_xdp_buff_stash to initialize
+ * @src: source XDP buffer
+ *
+ * External helper used by libeth_xdp_save_buff(), do not call directly.
+ * Use the fields from the passed XDP buffer to initialize the stash on the
+ * queue, so that a partially received frame can be finished later during
+ * the next NAPI poll.
+ */
+void libeth_xdp_save_stash(struct libeth_xdp_buff_stash *dst,
+			   const struct libeth_xdp_buff *src)
+{
+	dst->data = src->data;
+	dst->headroom = src->data - src->base.data_hard_start;
+	dst->len = src->base.data_end - src->data;
+
+	dst->frame_sz = src->base.frame_sz;
+	dst->flags = src->base.flags;
+
+	WARN_ON_ONCE(dst->flags != src->base.flags);
+}
+EXPORT_SYMBOL_GPL(libeth_xdp_save_stash);
+
+void __libeth_xdp_return_stash(struct libeth_xdp_buff_stash *stash)
+{
+	LIBETH_XDP_ONSTACK_BUFF(xdp);
+
+	libeth_xdp_load_stash(xdp, stash);
+	libeth_xdp_return_buff_slow(xdp);
+
+	stash->data = NULL;
+}
+EXPORT_SYMBOL_GPL(__libeth_xdp_return_stash);
+
+/**
+ * libeth_xdp_return_buff_slow - free &libeth_xdp_buff
+ * @xdp: buffer to free/return
+ *
+ * Slowpath version of libeth_xdp_return_buff() to be called on exceptions,
+ * queue clean-ups etc., without unwanted inlining.
+ */
+void __cold libeth_xdp_return_buff_slow(struct libeth_xdp_buff *xdp)
+{
+	__libeth_xdp_return_buff(xdp, false);
+}
+EXPORT_SYMBOL_GPL(libeth_xdp_return_buff_slow);
+
+/**
+ * libeth_xdp_buff_add_frag - add frag to XDP buffer
+ * @xdp: head XDP buffer
+ * @fqe: Rx buffer containing the frag
+ * @len: frag length reported by HW
+ *
+ * External helper used by libeth_xdp_process_buff(), do not call directly.
+ * Frees both head and frag buffers on error.
+ *
+ * Return: true success, false on error (no space for a new frag).
+ */
+bool libeth_xdp_buff_add_frag(struct libeth_xdp_buff *xdp,
+			      const struct libeth_fqe *fqe,
+			      u32 len)
+{
+	netmem_ref netmem = fqe->netmem;
+
+	if (!xdp_buff_add_frag(&xdp->base, netmem,
+			       fqe->offset + netmem_get_pp(netmem)->p.offset,
+			       len, fqe->truesize))
+		goto recycle;
+
+	return true;
+
+recycle:
+	libeth_rx_recycle_slow(netmem);
+	libeth_xdp_return_buff_slow(xdp);
+
+	return false;
+}
+EXPORT_SYMBOL_GPL(libeth_xdp_buff_add_frag);
+
+/**
+ * libeth_xdp_prog_exception - handle XDP prog exceptions
+ * @bq: XDP Tx bulk
+ * @xdp: buffer to process
+ * @act: original XDP prog verdict
+ * @ret: error code if redirect failed
+ *
+ * External helper used by __libeth_xdp_run_prog() and
+ * __libeth_xsk_run_prog_slow(), do not call directly.
+ * Reports invalid @act, XDP exception trace event and frees the buffer.
+ *
+ * Return: libeth_xdp XDP prog verdict.
+ */
+u32 __cold libeth_xdp_prog_exception(const struct libeth_xdp_tx_bulk *bq,
+				     struct libeth_xdp_buff *xdp,
+				     enum xdp_action act, int ret)
+{
+	if (act > XDP_REDIRECT)
+		bpf_warn_invalid_xdp_action(bq->dev, bq->prog, act);
+
+	libeth_trace_xdp_exception(bq->dev, bq->prog, act);
+
+	if (xdp->base.rxq->mem.type == MEM_TYPE_XSK_BUFF_POOL)
+		return libeth_xsk_prog_exception(xdp, act, ret);
+
+	libeth_xdp_return_buff_slow(xdp);
+
+	return LIBETH_XDP_DROP;
+}
+EXPORT_SYMBOL_GPL(libeth_xdp_prog_exception);
+
+/* Tx buffer completion */
+
+static void libeth_xdp_put_netmem_bulk(netmem_ref netmem,
+				       struct xdp_frame_bulk *bq)
+{
+	if (unlikely(bq->count == XDP_BULK_QUEUE_SIZE))
+		xdp_flush_frame_bulk(bq);
+
+	bq->q[bq->count++] = netmem;
+}
+
+/**
+ * libeth_xdp_return_buff_bulk - free &xdp_buff as part of a bulk
+ * @sinfo: shared info corresponding to the buffer
+ * @bq: XDP frame bulk to store the buffer
+ * @frags: whether the buffer has frags
+ *
+ * Same as xdp_return_frame_bulk(), but for &libeth_xdp_buff, speeds up Tx
+ * completion of ``XDP_TX`` buffers and allows to free them in same bulks
+ * with &xdp_frame buffers.
+ */
+void libeth_xdp_return_buff_bulk(const struct skb_shared_info *sinfo,
+				 struct xdp_frame_bulk *bq, bool frags)
+{
+	if (!frags)
+		goto head;
+
+	for (u32 i = 0; i < sinfo->nr_frags; i++)
+		libeth_xdp_put_netmem_bulk(skb_frag_netmem(&sinfo->frags[i]),
+					   bq);
+
+head:
+	libeth_xdp_put_netmem_bulk(virt_to_netmem(sinfo), bq);
+}
+EXPORT_SYMBOL_GPL(libeth_xdp_return_buff_bulk);
+
+/* Misc */
+
+/**
+ * libeth_xdp_queue_threshold - calculate XDP queue clean/refill threshold
+ * @count: number of descriptors in the queue
+ *
+ * The threshold is the limit at which RQs start to refill (when the number of
+ * empty buffers exceeds it) and SQs get cleaned up (when the number of free
+ * descriptors goes below it). To speed up hotpath processing, threshold is
+ * always pow-2, closest to 1/4 of the queue length.
+ * Don't call it on hotpath, calculate and cache the threshold during the
+ * queue initialization.
+ *
+ * Return: the calculated threshold.
+ */
+u32 libeth_xdp_queue_threshold(u32 count)
+{
+	u32 quarter, low, high;
+
+	if (likely(is_power_of_2(count)))
+		return count >> 2;
+
+	quarter = DIV_ROUND_CLOSEST(count, 4);
+	low = rounddown_pow_of_two(quarter);
+	high = roundup_pow_of_two(quarter);
+
+	return high - quarter <= quarter - low ? high : low;
+}
+EXPORT_SYMBOL_GPL(libeth_xdp_queue_threshold);
+
+/**
+ * __libeth_xdp_set_features - set XDP features for netdev
+ * @dev: &net_device to configure
+ * @xmo: XDP metadata ops (Rx hints)
+ * @zc_segs: maximum number of S/G frags the HW can transmit
+ * @tmo: XSk Tx metadata ops (Tx hints)
+ *
+ * Set all the features libeth_xdp supports. Only the first argument is
+ * necessary; without the third one (zero), XSk support won't be advertised.
+ * Use the non-underscored versions in drivers instead.
+ */
+void __libeth_xdp_set_features(struct net_device *dev,
+			       const struct xdp_metadata_ops *xmo,
+			       u32 zc_segs,
+			       const struct xsk_tx_metadata_ops *tmo)
+{
+	xdp_set_features_flag(dev,
+			      NETDEV_XDP_ACT_BASIC |
+			      NETDEV_XDP_ACT_REDIRECT |
+			      NETDEV_XDP_ACT_NDO_XMIT |
+			      (zc_segs ? NETDEV_XDP_ACT_XSK_ZEROCOPY : 0) |
+			      NETDEV_XDP_ACT_RX_SG |
+			      NETDEV_XDP_ACT_NDO_XMIT_SG);
+	dev->xdp_metadata_ops = xmo;
+
+	tmo = tmo == libeth_xsktmo ? &libeth_xsktmo_slow : tmo;
+
+	dev->xdp_zc_max_segs = zc_segs ? : 1;
+	dev->xsk_tx_metadata_ops = zc_segs ? tmo : NULL;
+}
+EXPORT_SYMBOL_GPL(__libeth_xdp_set_features);
+
+/**
+ * libeth_xdp_set_redirect - toggle the XDP redirect feature
+ * @dev: &net_device to configure
+ * @enable: whether XDP is enabled
+ *
+ * Use this when XDPSQs are not always available to dynamically enable
+ * and disable redirect feature.
+ */
+void libeth_xdp_set_redirect(struct net_device *dev, bool enable)
+{
+	if (enable)
+		xdp_features_set_redirect_target(dev, true);
+	else
+		xdp_features_clear_redirect_target(dev);
+}
+EXPORT_SYMBOL_GPL(libeth_xdp_set_redirect);
+
+/* Module */
+
+static const struct libeth_xdp_ops xdp_ops __initconst = {
+	.bulk = libeth_xdp_return_buff_bulk,
+	.xsk = libeth_xsk_buff_free_slow,
+};
+
+static int __init libeth_xdp_module_init(void)
+{
+	libeth_attach_xdp(&xdp_ops);
+
+	return 0;
+}
+module_init(libeth_xdp_module_init);
+
+static void __exit libeth_xdp_module_exit(void)
+{
+	libeth_detach_xdp();
+}
+module_exit(libeth_xdp_module_exit);
+
+MODULE_DESCRIPTION("Common Ethernet library - XDP infra");
+MODULE_IMPORT_NS("LIBETH");
+MODULE_LICENSE("GPL");
+271
drivers/net/ethernet/intel/libeth/xsk.c
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright (C) 2025 Intel Corporation */
+
+#define DEFAULT_SYMBOL_NAMESPACE	"LIBETH_XDP"
+
+#include <linux/export.h>
+
+#include <net/libeth/xsk.h>
+
+#include "priv.h"
+
+/* ``XDP_TX`` bulking */
+
+void __cold libeth_xsk_tx_return_bulk(const struct libeth_xdp_tx_frame *bq,
+				      u32 count)
+{
+	for (u32 i = 0; i < count; i++)
+		libeth_xsk_buff_free_slow(bq[i].xsk);
+}
+
+/* XSk TMO */
+
+const struct xsk_tx_metadata_ops libeth_xsktmo_slow = {
+	.tmo_request_checksum		= libeth_xsktmo_req_csum,
+};
+
+/* Rx polling path */
+
+/**
+ * libeth_xsk_buff_free_slow - free an XSk Rx buffer
+ * @xdp: buffer to free
+ *
+ * Slowpath version of xsk_buff_free() to be used on exceptions, cleanups etc.
+ * to avoid unwanted inlining.
+ */
+void libeth_xsk_buff_free_slow(struct libeth_xdp_buff *xdp)
+{
+	xsk_buff_free(&xdp->base);
+}
+EXPORT_SYMBOL_GPL(libeth_xsk_buff_free_slow);
+
+/**
+ * libeth_xsk_buff_add_frag - add frag to XSk Rx buffer
+ * @head: head buffer
+ * @xdp: frag buffer
+ *
+ * External helper used by libeth_xsk_process_buff(), do not call directly.
+ * Frees both main and frag buffers on error.
+ *
+ * Return: main buffer with attached frag on success, %NULL on error (no space
+ * for a new frag).
+ */
+struct libeth_xdp_buff *libeth_xsk_buff_add_frag(struct libeth_xdp_buff *head,
+						 struct libeth_xdp_buff *xdp)
+{
+	if (!xsk_buff_add_frag(&head->base, &xdp->base))
+		goto free;
+
+	return head;
+
+free:
+	libeth_xsk_buff_free_slow(xdp);
+	libeth_xsk_buff_free_slow(head);
+
+	return NULL;
+}
+EXPORT_SYMBOL_GPL(libeth_xsk_buff_add_frag);
+
+/**
+ * libeth_xsk_buff_stats_frags - update onstack RQ stats with XSk frags info
+ * @rs: onstack stats to update
+ * @xdp: buffer to account
+ *
+ * External helper used by __libeth_xsk_run_pass(), do not call directly.
+ * Adds buffer's frags count and total len to the onstack stats.
+ */
+void libeth_xsk_buff_stats_frags(struct libeth_rq_napi_stats *rs,
+				 const struct libeth_xdp_buff *xdp)
+{
+	libeth_xdp_buff_stats_frags(rs, xdp);
+}
+EXPORT_SYMBOL_GPL(libeth_xsk_buff_stats_frags);
+
+/**
+ * __libeth_xsk_run_prog_slow - process the non-``XDP_REDIRECT`` verdicts
+ * @xdp: buffer to process
+ * @bq: Tx bulk for queueing on ``XDP_TX``
+ * @act: verdict to process
+ * @ret: error code if ``XDP_REDIRECT`` failed
+ *
+ * External helper used by __libeth_xsk_run_prog(), do not call directly.
+ * ``XDP_REDIRECT`` is the most common and hottest verdict on XSk, thus
+ * it is processed inline. The rest goes here for out-of-line processing,
+ * together with redirect errors.
+ *
+ * Return: libeth_xdp XDP prog verdict.
+ */
+u32 __libeth_xsk_run_prog_slow(struct libeth_xdp_buff *xdp,
+			       const struct libeth_xdp_tx_bulk *bq,
+			       enum xdp_action act, int ret)
+{
+	switch (act) {
+	case XDP_DROP:
+		xsk_buff_free(&xdp->base);
+
+		return LIBETH_XDP_DROP;
+	case XDP_TX:
+		return LIBETH_XDP_TX;
+	case XDP_PASS:
+		return LIBETH_XDP_PASS;
+	default:
+		break;
+	}
+
+	return libeth_xdp_prog_exception(bq, xdp, act, ret);
+}
+EXPORT_SYMBOL_GPL(__libeth_xsk_run_prog_slow);
+
+/**
+ * libeth_xsk_prog_exception - handle XDP prog exceptions on XSk
+ * @xdp: buffer to process
+ * @act: verdict returned by the prog
+ * @ret: error code if ``XDP_REDIRECT`` failed
+ *
+ * Internal. Frees the buffer and, if the queue uses XSk wakeups, stop the
+ * current NAPI poll when there are no free buffers left.
+ *
+ * Return: libeth_xdp's XDP prog verdict.
+ */
+u32 __cold libeth_xsk_prog_exception(struct libeth_xdp_buff *xdp,
+				     enum xdp_action act, int ret)
+{
+	const struct xdp_buff_xsk *xsk;
+	u32 __ret = LIBETH_XDP_DROP;
+
+	if (act != XDP_REDIRECT)
+		goto drop;
+
+	xsk = container_of(&xdp->base, typeof(*xsk), xdp);
+	if (xsk_uses_need_wakeup(xsk->pool) && ret == -ENOBUFS)
+		__ret = LIBETH_XDP_ABORTED;
+
+drop:
+	libeth_xsk_buff_free_slow(xdp);
+
+	return __ret;
+}
+
+/* Refill */
+
+/**
+ * libeth_xskfq_create - create an XSkFQ
+ * @fq: fill queue to initialize
+ *
+ * Allocates the FQEs and initializes the fields used by libeth_xdp: number
+ * of buffers to refill, refill threshold and buffer len.
+ *
+ * Return: %0 on success, -errno otherwise.
+ */
+int libeth_xskfq_create(struct libeth_xskfq *fq)
+{
+	fq->fqes = kvcalloc_node(fq->count, sizeof(*fq->fqes), GFP_KERNEL,
+				 fq->nid);
+	if (!fq->fqes)
+		return -ENOMEM;
+
+	fq->pending = fq->count;
+	fq->thresh = libeth_xdp_queue_threshold(fq->count);
+	fq->buf_len = xsk_pool_get_rx_frame_size(fq->pool);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(libeth_xskfq_create);
+
+/**
+ * libeth_xskfq_destroy - destroy an XSkFQ
+ * @fq: fill queue to destroy
+ *
+ * Zeroes the used fields and frees the FQEs array.
+ */
+void libeth_xskfq_destroy(struct libeth_xskfq *fq)
+{
+	fq->buf_len = 0;
+	fq->thresh = 0;
+	fq->pending = 0;
+
+	kvfree(fq->fqes);
+}
+EXPORT_SYMBOL_GPL(libeth_xskfq_destroy);
+
+/* .ndo_xsk_wakeup */
+
+static void libeth_xsk_napi_sched(void *info)
+{
+	__napi_schedule_irqoff(info);
+}
+
+/**
+ * libeth_xsk_init_wakeup - initialize libeth XSk wakeup structure
+ * @csd: struct to initialize
+ * @napi: NAPI corresponding to this queue
+ *
+ * libeth_xdp uses inter-processor interrupts to perform XSk wakeups. In order
+ * to do that, the corresponding CSDs must be initialized when creating the
+ * queues.
+ */
+void libeth_xsk_init_wakeup(call_single_data_t *csd, struct napi_struct *napi)
+{
+	INIT_CSD(csd, libeth_xsk_napi_sched, napi);
+}
+EXPORT_SYMBOL_GPL(libeth_xsk_init_wakeup);
+
+/**
+ * libeth_xsk_wakeup - perform an XSk wakeup
+ * @csd: CSD corresponding to the queue
+ * @qid: the stack queue index
+ *
+ * Try to mark the NAPI as missed first, so that it could be rescheduled.
+ * If it's not, schedule it on the corresponding CPU using IPIs (or directly
+ * if already running on it).
+ */
+void libeth_xsk_wakeup(call_single_data_t *csd, u32 qid)
+{
+	struct napi_struct *napi = csd->info;
+
+	if (napi_if_scheduled_mark_missed(napi) ||
+	    unlikely(!napi_schedule_prep(napi)))
+		return;
+
+	if (unlikely(qid >= nr_cpu_ids))
+		qid %= nr_cpu_ids;
+
+	if (qid != raw_smp_processor_id() && cpu_online(qid))
+		smp_call_function_single_async(qid, csd);
+	else
+		__napi_schedule(napi);
+}
+EXPORT_SYMBOL_GPL(libeth_xsk_wakeup);
+
+/* Pool setup */
+
+#define LIBETH_XSK_DMA_ATTR \
+	(DMA_ATTR_WEAK_ORDERING | DMA_ATTR_SKIP_CPU_SYNC)
+
+/**
+ * libeth_xsk_setup_pool - setup or destroy an XSk pool for a queue
+ * @dev: target &net_device
+ * @qid: stack queue index to configure
+ * @enable: whether to enable or disable the pool
+ *
+ * Check that @qid is valid and then map or unmap the pool.
+ *
+ * Return: %0 on success, -errno otherwise.
+ */
+int libeth_xsk_setup_pool(struct net_device *dev, u32 qid, bool enable)
+{
+	struct xsk_buff_pool *pool;
+
+	pool = xsk_get_pool_from_qid(dev, qid);
+	if (!pool)
+		return -EINVAL;
+
+	if (enable)
+		return xsk_pool_dma_map(pool, dev->dev.parent,
+					LIBETH_XSK_DMA_ATTR);
+	else
+		xsk_pool_dma_unmap(pool, LIBETH_XSK_DMA_ATTR);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(libeth_xsk_setup_pool);
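The CPU selection at the end of libeth_xsk_wakeup() above can be isolated into a small userspace model. This is a sketch under stated assumptions: `sk_wakeup_target()` and `enum sk_wakeup` are hypothetical names, the online map is a plain array rather than the kernel's cpumask, and the NAPI state checks (napi_if_scheduled_mark_missed(), napi_schedule_prep()) are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum sk_wakeup {
	SK_WAKE_IPI,	/* models smp_call_function_single_async() */
	SK_WAKE_LOCAL,	/* models __napi_schedule() on the current CPU */
};

/*
 * Pick where the NAPI would be scheduled for stack queue @qid.
 * @online is a hypothetical per-CPU "is online" array of @nr_cpus entries.
 * The chosen CPU is stored in @cpu.
 */
static enum sk_wakeup sk_wakeup_target(uint32_t qid, uint32_t nr_cpus,
				       uint32_t this_cpu, const bool *online,
				       uint32_t *cpu)
{
	/* Queue indices beyond the CPU count wrap around */
	if (qid >= nr_cpus)
		qid %= nr_cpus;

	/* A remote, online CPU gets an IPI ... */
	if (qid != this_cpu && online[qid]) {
		*cpu = qid;
		return SK_WAKE_IPI;
	}

	/* ... anything else is scheduled directly on the calling CPU */
	*cpu = this_cpu;
	return SK_WAKE_LOCAL;
}
```

Note the fallback: when the target CPU is offline, the model (like the code above) schedules on the caller's CPU instead of dropping the wakeup.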
+5 -2
drivers/net/ethernet/intel/libie/rx.c
 // SPDX-License-Identifier: GPL-2.0-only
-/* Copyright (C) 2024 Intel Corporation */
+/* Copyright (C) 2024-2025 Intel Corporation */

+#define DEFAULT_SYMBOL_NAMESPACE	"LIBIE"
+
+#include <linux/export.h>
 #include <linux/net/intel/libie/rx.h>

 /* O(1) converting i40e/ice/iavf's 8/10-bit hardware packet type to a parsed
···
 	LIBIE_RX_PT_IP(4),
 	LIBIE_RX_PT_IP(6),
 };
-EXPORT_SYMBOL_NS_GPL(libie_rx_pt_lut, "LIBIE");
+EXPORT_SYMBOL_GPL(libie_rx_pt_lut);

 MODULE_DESCRIPTION("Intel(R) Ethernet common library");
 MODULE_IMPORT_NS("LIBETH");
+17 -11
include/net/libeth/rx.h
 /* SPDX-License-Identifier: GPL-2.0-only */
-/* Copyright (C) 2024 Intel Corporation */
+/* Copyright (C) 2024-2025 Intel Corporation */

 #ifndef __LIBETH_RX_H
 #define __LIBETH_RX_H
···

 /* Space reserved in front of each frame */
 #define LIBETH_SKB_HEADROOM	(NET_SKB_PAD + NET_IP_ALIGN)
+#define LIBETH_XDP_HEADROOM	(ALIGN(XDP_PACKET_HEADROOM, NET_SKB_PAD) + \
+				 NET_IP_ALIGN)
 /* Maximum headroom for worst-case calculations */
-#define LIBETH_MAX_HEADROOM	LIBETH_SKB_HEADROOM
+#define LIBETH_MAX_HEADROOM	LIBETH_XDP_HEADROOM
 /* Link layer / L2 overhead: Ethernet, 2 VLAN tags (C + S), FCS */
 #define LIBETH_RX_LL_LEN	(ETH_HLEN + 2 * VLAN_HLEN + ETH_FCS_LEN)
 /* Maximum supported L2-L4 header length */
···

 /**
  * struct libeth_fqe - structure representing an Rx buffer (fill queue element)
- * @page: page holding the buffer
+ * @netmem: network memory reference holding the buffer
  * @offset: offset from the page start (to the headroom)
  * @truesize: total space occupied by the buffer (w/ headroom and tailroom)
  *
···
  * former, @offset is always 0 and @truesize is always ``PAGE_SIZE``.
  */
 struct libeth_fqe {
-	struct page		*page;
+	netmem_ref		netmem;
 	u32			offset;
 	u32			truesize;
 } __aligned_largest;
···
  * @count: number of descriptors/buffers the queue has
  * @type: type of the buffers this queue has
  * @hsplit: flag whether header split is enabled
+ * @xdp: flag indicating whether XDP is enabled
  * @buf_len: HW-writeable length per each buffer
  * @nid: ID of the closest NUMA node with memory
  */
···
 	/* Cold fields */
 	enum libeth_fqe_type	type:2;
 	bool			hsplit:1;
+	bool			xdp:1;

 	u32			buf_len;
 	int			nid;
···
 	struct libeth_fqe *buf = &fq->fqes[i];

 	buf->truesize = fq->truesize;
-	buf->page = page_pool_dev_alloc(fq->pp, &buf->offset, &buf->truesize);
-	if (unlikely(!buf->page))
+	buf->netmem = page_pool_dev_alloc_netmem(fq->pp, &buf->offset,
+						 &buf->truesize);
+	if (unlikely(!buf->netmem))
 		return DMA_MAPPING_ERROR;

-	return page_pool_get_dma_addr(buf->page) + buf->offset +
+	return page_pool_get_dma_addr_netmem(buf->netmem) + buf->offset +
 	       fq->pp->p.offset;
 }

-void libeth_rx_recycle_slow(struct page *page);
+void libeth_rx_recycle_slow(netmem_ref netmem);

 /**
  * libeth_rx_sync_for_cpu - synchronize or recycle buffer post DMA
···
 static inline bool libeth_rx_sync_for_cpu(const struct libeth_fqe *fqe,
 					  u32 len)
 {
-	struct page *page = fqe->page;
+	netmem_ref netmem = fqe->netmem;

 	/* Very rare, but possible case. The most common reason:
 	 * the last fragment contained FCS only, which was then
 	 * stripped by the HW.
 	 */
 	if (unlikely(!len)) {
-		libeth_rx_recycle_slow(page);
+		libeth_rx_recycle_slow(netmem);
 		return false;
 	}

-	page_pool_dma_sync_for_cpu(page->pp, page, fqe->offset, len);
+	page_pool_dma_sync_netmem_for_cpu(netmem_get_pp(netmem), netmem,
+					  fqe->offset, len);

 	return true;
 }
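The arithmetic behind the new LIBETH_XDP_HEADROOM macro in rx.h above can be worked through in userspace. This is a sketch: `sk_skb_headroom()`, `sk_xdp_headroom()` and `SK_ALIGN()` are hypothetical names, and the pad and alignment values are passed in as parameters because NET_SKB_PAD and NET_IP_ALIGN depend on the architecture and kernel configuration.

```c
#include <assert.h>
#include <stdint.h>

/* Round x up to a multiple of a (a > 0); a stand-in for the kernel's ALIGN() */
#define SK_ALIGN(x, a)	((((x) + (a) - 1) / (a)) * (a))

/* Models LIBETH_SKB_HEADROOM: NET_SKB_PAD + NET_IP_ALIGN */
static uint32_t sk_skb_headroom(uint32_t skb_pad, uint32_t ip_align)
{
	return skb_pad + ip_align;
}

/*
 * Models LIBETH_XDP_HEADROOM:
 * ALIGN(XDP_PACKET_HEADROOM, NET_SKB_PAD) + NET_IP_ALIGN
 */
static uint32_t sk_xdp_headroom(uint32_t xdp_headroom, uint32_t skb_pad,
				uint32_t ip_align)
{
	return SK_ALIGN(xdp_headroom, skb_pad) + ip_align;
}
```

For example, with an XDP headroom of 256 bytes, a 64-byte pad and a 2-byte IP alignment (example inputs, not values taken from this diff), the XDP headroom is 258 bytes against 66 bytes for the skb case. That gap is why LIBETH_MAX_HEADROOM now follows the XDP value for worst-case calculations.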
+33 -3
include/net/libeth/tx.h
 /* SPDX-License-Identifier: GPL-2.0-only */
-/* Copyright (C) 2024 Intel Corporation */
+/* Copyright (C) 2024-2025 Intel Corporation */

 #ifndef __LIBETH_TX_H
 #define __LIBETH_TX_H
···

 /**
  * enum libeth_sqe_type - type of &libeth_sqe to act on Tx completion
- * @LIBETH_SQE_EMPTY: unused/empty, no action required
+ * @LIBETH_SQE_EMPTY: unused/empty OR XDP_TX/XSk frame, no action required
  * @LIBETH_SQE_CTX: context descriptor with empty SQE, no action required
  * @LIBETH_SQE_SLAB: kmalloc-allocated buffer, unmap and kfree()
  * @LIBETH_SQE_FRAG: mapped skb frag, only unmap DMA
  * @LIBETH_SQE_SKB: &sk_buff, unmap and napi_consume_skb(), update stats
+ * @__LIBETH_SQE_XDP_START: separator between skb and XDP types
+ * @LIBETH_SQE_XDP_TX: &skb_shared_info, libeth_xdp_return_buff_bulk(), stats
+ * @LIBETH_SQE_XDP_XMIT: &xdp_frame, unmap and xdp_return_frame_bulk(), stats
+ * @LIBETH_SQE_XDP_XMIT_FRAG: &xdp_frame frag, only unmap DMA
+ * @LIBETH_SQE_XSK_TX: &libeth_xdp_buff on XSk queue, xsk_buff_free(), stats
+ * @LIBETH_SQE_XSK_TX_FRAG: &libeth_xdp_buff frag on XSk queue, xsk_buff_free()
  */
 enum libeth_sqe_type {
 	LIBETH_SQE_EMPTY		= 0U,
···
 	LIBETH_SQE_SLAB,
 	LIBETH_SQE_FRAG,
 	LIBETH_SQE_SKB,
+
+	__LIBETH_SQE_XDP_START,
+	LIBETH_SQE_XDP_TX		= __LIBETH_SQE_XDP_START,
+	LIBETH_SQE_XDP_XMIT,
+	LIBETH_SQE_XDP_XMIT_FRAG,
+	LIBETH_SQE_XSK_TX,
+	LIBETH_SQE_XSK_TX_FRAG,
 };

 /**
···
  * @rs_idx: index of the last buffer from the batch this one was sent in
  * @raw: slab buffer to free via kfree()
  * @skb: &sk_buff to consume
+ * @sinfo: skb shared info of an XDP_TX frame
+ * @xdpf: XDP frame from ::ndo_xdp_xmit()
+ * @xsk: XSk Rx frame from XDP_TX action
  * @dma: DMA address to unmap
  * @len: length of the mapped region to unmap
  * @nr_frags: number of frags in the frame this buffer belongs to
···
 	union {
 		void				*raw;
 		struct sk_buff			*skb;
+		struct skb_shared_info		*sinfo;
+		struct xdp_frame		*xdpf;
+		struct libeth_xdp_buff		*xsk;
 	};

 	DEFINE_DMA_UNMAP_ADDR(dma);
···
 /**
  * struct libeth_cq_pp - completion queue poll params
  * @dev: &device to perform DMA unmapping
+ * @bq: XDP frame bulk to combine return operations
  * @ss: onstack NAPI stats to fill
+ * @xss: onstack XDPSQ NAPI stats to fill
+ * @xdp_tx: number of XDP-not-XSk frames processed
  * @napi: whether it's called from the NAPI context
  *
  * libeth uses this structure to access objects needed for performing full
···
  */
 struct libeth_cq_pp {
 	struct device			*dev;
-	struct libeth_sq_napi_stats	*ss;
+	struct xdp_frame_bulk		*bq;
+
+	union {
+		struct libeth_sq_napi_stats	*ss;
+		struct libeth_xdpsq_napi_stats	*xss;
+	};
+	u32				xdp_tx;

 	bool				napi;
 };
···

 	sqe->type = LIBETH_SQE_EMPTY;
 }
+
+void libeth_tx_complete_any(struct libeth_sqe *sqe, struct libeth_cq_pp *cp);

 #endif /* __LIBETH_TX_H */
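The `__LIBETH_SQE_XDP_START` separator in the enum above lets a completion routine tell skb-style entries from XDP ones with a single comparison. The sketch below shows that idea only. The `SK_SQE_*` names and `sk_sqe_is_xdp()` are hypothetical, and whether libeth_tx_complete_any() dispatches exactly this way is an assumption, since its body is not part of this excerpt.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace mirror of the enum layout shown in the diff above */
enum sk_sqe_type {
	SK_SQE_EMPTY = 0,
	SK_SQE_CTX,
	SK_SQE_SLAB,
	SK_SQE_FRAG,
	SK_SQE_SKB,

	/* Separator: everything from here on is an XDP or XSk type */
	SK_SQE_XDP_START,
	SK_SQE_XDP_TX = SK_SQE_XDP_START,
	SK_SQE_XDP_XMIT,
	SK_SQE_XDP_XMIT_FRAG,
	SK_SQE_XSK_TX,
	SK_SQE_XSK_TX_FRAG,
};

/* One range check replaces a five-way comparison against each XDP type */
static bool sk_sqe_is_xdp(enum sk_sqe_type type)
{
	return type >= SK_SQE_XDP_START;
}
```

Because the separator aliases the first XDP type rather than taking a value of its own, adding it does not shift any enumerator.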
+104 -2
include/net/libeth/types.h
 /* SPDX-License-Identifier: GPL-2.0-only */
-/* Copyright (C) 2024 Intel Corporation */
+/* Copyright (C) 2024-2025 Intel Corporation */

 #ifndef __LIBETH_TYPES_H
 #define __LIBETH_TYPES_H

-#include <linux/types.h>
+#include <linux/workqueue.h>
+
+/* Stats */
+
+/**
+ * struct libeth_rq_napi_stats - "hot" counters to update in Rx polling loop
+ * @packets: received frames counter
+ * @bytes: sum of bytes of received frames above
+ * @fragments: sum of fragments of received S/G frames
+ * @hsplit: number of frames the device performed the header split for
+ * @raw: alias to access all the fields as an array
+ */
+struct libeth_rq_napi_stats {
+	union {
+		struct {
+			u32 packets;
+			u32 bytes;
+			u32 fragments;
+			u32 hsplit;
+		};
+		DECLARE_FLEX_ARRAY(u32, raw);
+	};
+};

 /**
  * struct libeth_sq_napi_stats - "hot" counters to update in Tx completion loop
···
 		DECLARE_FLEX_ARRAY(u32, raw);
 	};
 };
+
+/**
+ * struct libeth_xdpsq_napi_stats - "hot" counters to update in XDP Tx
+ *				    completion loop
+ * @packets: completed frames counter
+ * @bytes: sum of bytes of completed frames above
+ * @fragments: sum of fragments of completed S/G frames
+ * @raw: alias to access all the fields as an array
+ */
+struct libeth_xdpsq_napi_stats {
+	union {
+		struct {
+			u32 packets;
+			u32 bytes;
+			u32 fragments;
+		};
+		DECLARE_FLEX_ARRAY(u32, raw);
+	};
+};
+
+/* XDP */
+
+/*
+ * The following structures should be embedded into driver's queue structure
+ * and passed to the libeth_xdp helpers, never used directly.
+ */
+
+/* XDPSQ sharing */
+
+/**
+ * struct libeth_xdpsq_lock - locking primitive for sharing XDPSQs
+ * @lock: spinlock for locking the queue
+ * @share: whether this particular queue is shared
+ */
+struct libeth_xdpsq_lock {
+	spinlock_t			lock;
+	bool				share;
+};
+
+/* XDPSQ clean-up timers */
+
+/**
+ * struct libeth_xdpsq_timer - timer for cleaning up XDPSQs w/o interrupts
+ * @xdpsq: queue this timer belongs to
+ * @lock: lock for the queue
+ * @dwork: work performing cleanups
+ *
+ * XDPSQs not using interrupts but lazy cleaning, i.e. only when there's no
+ * space for sending the current queued frame/bulk, must fire up timers to
+ * make sure there are no stale buffers to free.
+ */
+struct libeth_xdpsq_timer {
+	void				*xdpsq;
+	struct libeth_xdpsq_lock	*lock;
+
+	struct delayed_work		dwork;
+};
+
+/* Rx polling path */
+
+/**
+ * struct libeth_xdp_buff_stash - struct for stashing &xdp_buff onto a queue
+ * @data: pointer to the start of the frame, xdp_buff.data
+ * @headroom: frame headroom, xdp_buff.data - xdp_buff.data_hard_start
+ * @len: frame linear space length, xdp_buff.data_end - xdp_buff.data
+ * @frame_sz: truesize occupied by the frame, xdp_buff.frame_sz
+ * @flags: xdp_buff.flags
+ *
+ * &xdp_buff is 56 bytes long on x64, &libeth_xdp_buff is 64 bytes. This
+ * structure carries only necessary fields to save/restore a partially built
+ * frame on the queue structure to finish it during the next NAPI poll.
+ */
+struct libeth_xdp_buff_stash {
+	void				*data;
+	u16				headroom;
+	u16				len;
+
+	u32				frame_sz:24;
+	u32				flags:8;
+} __aligned_largest;

 #endif /* __LIBETH_TYPES_H */
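The kernel-doc for struct libeth_xdp_buff_stash above defines each field as a difference of xdp_buff pointers, which is enough to sketch a save/restore round trip. Everything below is a userspace model under assumptions: `struct sk_xdp_buff` keeps only the five fields the stash needs, and `sk_stash_save()` / `sk_stash_restore()` are hypothetical names. The actual libeth stash helpers are not part of this excerpt.

```c
#include <assert.h>
#include <stdint.h>

/* Reduced stand-in for &xdp_buff: only the fields the stash refers to */
struct sk_xdp_buff {
	uint8_t		*data;
	uint8_t		*data_end;
	uint8_t		*data_hard_start;
	uint32_t	frame_sz;
	uint32_t	flags;
};

/* Same field layout as the stash in the diff above */
struct sk_stash {
	void		*data;
	uint16_t	headroom;
	uint16_t	len;
	unsigned int	frame_sz:24;
	unsigned int	flags:8;
};

/* Compress a partially built frame into the stash */
static void sk_stash_save(struct sk_stash *dst, const struct sk_xdp_buff *src)
{
	dst->data = src->data;
	dst->headroom = (uint16_t)(src->data - src->data_hard_start);
	dst->len = (uint16_t)(src->data_end - src->data);
	dst->frame_sz = src->frame_sz;
	dst->flags = src->flags;
}

/* Rebuild the pointers from the stashed offsets on the next poll */
static void sk_stash_restore(struct sk_xdp_buff *dst,
			     const struct sk_stash *src)
{
	dst->data = src->data;
	dst->data_hard_start = dst->data - src->headroom;
	dst->data_end = dst->data + src->len;
	dst->frame_sz = src->frame_sz;
	dst->flags = src->flags;
}
```

Storing two 16-bit offsets instead of two pointers is what keeps the stash small; the price is that headroom and linear length must each fit in 16 bits, and frame_sz and flags in 24 and 8 bits respectively.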
+1879
include/net/libeth/xdp.h
+/* SPDX-License-Identifier: GPL-2.0-only */
+/* Copyright (C) 2025 Intel Corporation */
+
+#ifndef __LIBETH_XDP_H
+#define __LIBETH_XDP_H
+
+#include <linux/bpf_trace.h>
+#include <linux/unroll.h>
+
+#include <net/libeth/rx.h>
+#include <net/libeth/tx.h>
+#include <net/xsk_buff_pool.h>
+
+/*
+ * Defined as bits to be able to use them as a mask on Rx.
+ * Also used as internal return values on Tx.
+ */
+enum {
+	LIBETH_XDP_PASS			= 0U,
+	LIBETH_XDP_DROP			= BIT(0),
+	LIBETH_XDP_ABORTED		= BIT(1),
+	LIBETH_XDP_TX			= BIT(2),
+	LIBETH_XDP_REDIRECT		= BIT(3),
+};
+
+/*
+ * &xdp_buff_xsk is the largest structure &libeth_xdp_buff gets casted to,
+ * pick maximum pointer-compatible alignment.
+ */
+#define __LIBETH_XDP_BUFF_ALIGN \
+	(IS_ALIGNED(sizeof(struct xdp_buff_xsk), 16) ? 16 : \
+	 IS_ALIGNED(sizeof(struct xdp_buff_xsk), 8) ? 8 : \
+	 sizeof(long))
+
+/**
+ * struct libeth_xdp_buff - libeth extension over &xdp_buff
+ * @base: main &xdp_buff
+ * @data: shortcut for @base.data
+ * @desc: RQ descriptor containing metadata for this buffer
+ * @priv: driver-private scratchspace
+ *
+ * The main reason for this is to have a pointer to the descriptor to be able
+ * to quickly get frame metadata from xdpmo and driver buff-to-xdp callbacks
+ * (as well as bigger alignment).
+ * Pointer/layout-compatible with &xdp_buff and &xdp_buff_xsk.
+ */
+struct libeth_xdp_buff {
+	union {
+		struct xdp_buff		base;
+		void			*data;
+	};
+
+	const void		*desc;
+	unsigned long		priv[]
+				__aligned(__LIBETH_XDP_BUFF_ALIGN);
+} __aligned(__LIBETH_XDP_BUFF_ALIGN);
+static_assert(offsetof(struct libeth_xdp_buff, data) ==
+	      offsetof(struct xdp_buff_xsk, xdp.data));
+static_assert(offsetof(struct libeth_xdp_buff, desc) ==
+	      offsetof(struct xdp_buff_xsk, cb));
+static_assert(IS_ALIGNED(sizeof(struct xdp_buff_xsk),
+			 __alignof(struct libeth_xdp_buff)));
+
+/**
+ * __LIBETH_XDP_ONSTACK_BUFF - declare a &libeth_xdp_buff on the stack
+ * @name: name of the variable to declare
+ * @...: sizeof() of the driver-private data
+ */
+#define __LIBETH_XDP_ONSTACK_BUFF(name, ...) \
+	___LIBETH_XDP_ONSTACK_BUFF(name, ##__VA_ARGS__)
+/**
+ * LIBETH_XDP_ONSTACK_BUFF - declare a &libeth_xdp_buff on the stack
+ * @name: name of the variable to declare
+ * @...: type or variable name of the driver-private data
+ */
+#define LIBETH_XDP_ONSTACK_BUFF(name, ...) \
+	__LIBETH_XDP_ONSTACK_BUFF(name, __libeth_xdp_priv_sz(__VA_ARGS__))
+
+#define ___LIBETH_XDP_ONSTACK_BUFF(name, ...) \
+	__DEFINE_FLEX(struct libeth_xdp_buff, name, priv, \
+		      LIBETH_XDP_PRIV_SZ(__VA_ARGS__ + 0), \
+		      __uninitialized); \
+	LIBETH_XDP_ASSERT_PRIV_SZ(__VA_ARGS__ + 0)
+
+#define __libeth_xdp_priv_sz(...) \
+	CONCATENATE(__libeth_xdp_psz, COUNT_ARGS(__VA_ARGS__))(__VA_ARGS__)
+
+#define __libeth_xdp_psz0(...)
+#define __libeth_xdp_psz1(...)		sizeof(__VA_ARGS__)
+
+#define LIBETH_XDP_PRIV_SZ(sz) \
+	(ALIGN(sz, __alignof(struct libeth_xdp_buff)) / sizeof(long))
+
+/* Performs XSK_CHECK_PRIV_TYPE() */
+#define LIBETH_XDP_ASSERT_PRIV_SZ(sz) \
+	static_assert(offsetofend(struct xdp_buff_xsk, cb) >= \
+		      struct_size_t(struct libeth_xdp_buff, priv, \
+				    LIBETH_XDP_PRIV_SZ(sz)))
+
+/* XDPSQ sharing */
+
+DECLARE_STATIC_KEY_FALSE(libeth_xdpsq_share);
+
+/**
+ * libeth_xdpsq_num - calculate optimal number of XDPSQs for this device + sys
+ * @rxq: current number of active Rx queues
+ * @txq: current number of active Tx queues
+ * @max: maximum number of Tx queues
+ *
+ * Each RQ must have its own XDPSQ for XSk pairs, each CPU must have own XDPSQ
+ * for lockless sending (``XDP_TX``, .ndo_xdp_xmit()). Cap the maximum of these
+ * two with the number of SQs the device can have (minus used ones).
+ *
+ * Return: number of XDP Tx queues the device needs to use.
+ */
+static inline u32 libeth_xdpsq_num(u32 rxq, u32 txq, u32 max)
+{
+	return min(max(nr_cpu_ids, rxq), max - txq);
+}
+
+/**
+ * libeth_xdpsq_shared - whether XDPSQs can be shared between several CPUs
+ * @num: number of active XDPSQs
+ *
+ * Return: true if there's no 1:1 XDPSQ/CPU association, false otherwise.
+ */
+static inline bool libeth_xdpsq_shared(u32 num)
+{
+	return num < nr_cpu_ids;
+}
+
+/**
+ * libeth_xdpsq_id - get XDPSQ index corresponding to this CPU
+ * @num: number of active XDPSQs
+ *
+ * Helper for libeth_xdp routines, do not use in drivers directly.
+ *
+ * Return: XDPSQ index needs to be used on this CPU.
+ */
+static inline u32 libeth_xdpsq_id(u32 num)
+{
+	u32 ret = raw_smp_processor_id();
+
+	if (static_branch_unlikely(&libeth_xdpsq_share) &&
+	    libeth_xdpsq_shared(num))
+		ret %= num;
+
+	return ret;
+}
+
+void __libeth_xdpsq_get(struct libeth_xdpsq_lock *lock,
+			const struct net_device *dev);
+void __libeth_xdpsq_put(struct libeth_xdpsq_lock *lock,
+			const struct net_device *dev);
+
+/**
+ * libeth_xdpsq_get - initialize &libeth_xdpsq_lock
+ * @lock: lock to initialize
+ * @dev: netdev which this lock belongs to
+ * @share: whether XDPSQs can be shared
+ *
+ * Tracks the current XDPSQ association and enables the static lock
+ * if needed.
+ */
+static inline void libeth_xdpsq_get(struct libeth_xdpsq_lock *lock,
+				    const struct net_device *dev,
+				    bool share)
+{
+	if (unlikely(share))
+		__libeth_xdpsq_get(lock, dev);
+}
+
+/**
+ * libeth_xdpsq_put - deinitialize &libeth_xdpsq_lock
+ * @lock: lock to deinitialize
+ * @dev: netdev which this lock belongs to
+ *
+ * Tracks the current XDPSQ association and disables the static lock
+ * if needed.
+ */
+static inline void libeth_xdpsq_put(struct libeth_xdpsq_lock *lock,
+				    const struct net_device *dev)
+{
+	if (static_branch_unlikely(&libeth_xdpsq_share) && lock->share)
+		__libeth_xdpsq_put(lock, dev);
+}
+
+void __libeth_xdpsq_lock(struct libeth_xdpsq_lock *lock);
+void __libeth_xdpsq_unlock(struct libeth_xdpsq_lock *lock);
+
+/**
+ * libeth_xdpsq_lock - grab &libeth_xdpsq_lock if needed
+ * @lock: lock to take
+ *
+ * Touches the underlying spinlock only if the static key is enabled
+ * and the queue itself is marked as shareable.
+ */
+static inline void libeth_xdpsq_lock(struct libeth_xdpsq_lock *lock)
+{
+	if (static_branch_unlikely(&libeth_xdpsq_share) && lock->share)
+		__libeth_xdpsq_lock(lock);
+}
+
+/**
+ * libeth_xdpsq_unlock - free &libeth_xdpsq_lock if needed
+ * @lock: lock to free
+ *
+ * Touches the underlying spinlock only if the static key is enabled
+ * and the queue itself is marked as shareable.
+ */
+static inline void libeth_xdpsq_unlock(struct libeth_xdpsq_lock *lock)
+{
+	if (static_branch_unlikely(&libeth_xdpsq_share) && lock->share)
+		__libeth_xdpsq_unlock(lock);
+}
+
+/* XDPSQ clean-up timers */
+
+void libeth_xdpsq_init_timer(struct libeth_xdpsq_timer *timer, void *xdpsq,
+			     struct libeth_xdpsq_lock *lock,
+			     void (*poll)(struct work_struct *work));
+
+/**
+ * libeth_xdpsq_deinit_timer - deinitialize &libeth_xdpsq_timer
+ * @timer: timer to deinitialize
+ *
+ * Flush and disable the underlying workqueue.
+ */
+static inline void libeth_xdpsq_deinit_timer(struct libeth_xdpsq_timer *timer)
+{
+	cancel_delayed_work_sync(&timer->dwork);
+}
+
+/**
+ * libeth_xdpsq_queue_timer - run &libeth_xdpsq_timer
+ * @timer: timer to queue
+ *
+ * Should be called after the queue was filled and the transmission was run
+ * to complete the pending buffers if no further sending will be done in a
+ * second (-> lazy cleaning won't happen).
+ * If the timer was already run, it will be requeued back to one second
+ * timeout again.
+ */
+static inline void libeth_xdpsq_queue_timer(struct libeth_xdpsq_timer *timer)
+{
+	mod_delayed_work_on(raw_smp_processor_id(), system_bh_highpri_wq,
+			    &timer->dwork, HZ);
+}
+
+/**
+ * libeth_xdpsq_run_timer - wrapper to run a queue clean-up on a timer event
+ * @work: workqueue belonging to the corresponding timer
+ * @poll: driver-specific completion queue poll function
+ *
+ * Run the polling function on the locked queue and requeue the timer if
+ * there's more work to do.
+ * Designed to be used via LIBETH_XDP_DEFINE_TIMER() below.
+ */
+static __always_inline void
+libeth_xdpsq_run_timer(struct work_struct *work,
+		       u32 (*poll)(void *xdpsq, u32 budget))
+{
+	struct libeth_xdpsq_timer *timer = container_of(work, typeof(*timer),
+							dwork.work);
+
+	libeth_xdpsq_lock(timer->lock);
+
+	if (poll(timer->xdpsq, U32_MAX))
+		libeth_xdpsq_queue_timer(timer);
+
+	libeth_xdpsq_unlock(timer->lock);
+}
+
+/* Common Tx bits */
+
+/**
+ * enum - libeth_xdp internal Tx flags
+ * @LIBETH_XDP_TX_BULK: one bulk size at which it will be flushed to the queue
+ * @LIBETH_XDP_TX_BATCH: batch size for which the queue fill loop is unrolled
+ * @LIBETH_XDP_TX_DROP: indicates the send function must drop frames not sent
+ * @LIBETH_XDP_TX_NDO: whether the send function is called from .ndo_xdp_xmit()
+ * @LIBETH_XDP_TX_XSK: whether the function is called for ``XDP_TX`` for XSk
+ */
+enum {
+	LIBETH_XDP_TX_BULK		= DEV_MAP_BULK_SIZE,
+	LIBETH_XDP_TX_BATCH		= 8,
+
+	LIBETH_XDP_TX_DROP		= BIT(0),
+	LIBETH_XDP_TX_NDO		= BIT(1),
+	LIBETH_XDP_TX_XSK		= BIT(2),
+};
+
+/**
+ * enum - &libeth_xdp_tx_frame and &libeth_xdp_tx_desc flags
+ * @LIBETH_XDP_TX_LEN: only for ``XDP_TX``, [15:0] of ::len_fl is actual length
+ * @LIBETH_XDP_TX_CSUM: for XSk xmit, enable checksum offload
+ * @LIBETH_XDP_TX_XSKMD: for XSk xmit, mask of the metadata bits
+ * @LIBETH_XDP_TX_FIRST: indicates the frag is the first one of the frame
+ * @LIBETH_XDP_TX_LAST: whether the frag is the last one of the frame
+ * @LIBETH_XDP_TX_MULTI: whether the frame contains several frags
+ * @LIBETH_XDP_TX_FLAGS: only for ``XDP_TX``, [31:16] of ::len_fl is flags
+ */
+enum {
+	LIBETH_XDP_TX_LEN		= GENMASK(15, 0),
+
+	LIBETH_XDP_TX_CSUM		= XDP_TXMD_FLAGS_CHECKSUM,
+	LIBETH_XDP_TX_XSKMD		= LIBETH_XDP_TX_LEN,
+
+	LIBETH_XDP_TX_FIRST		= BIT(16),
+	LIBETH_XDP_TX_LAST		= BIT(17),
+	LIBETH_XDP_TX_MULTI		= BIT(18),
+
+	LIBETH_XDP_TX_FLAGS		= GENMASK(31, 16),
+};
+
+/**
+ * struct libeth_xdp_tx_frame - represents one XDP Tx element
+ * @data: frame start pointer for ``XDP_TX``
+ * @len_fl: ``XDP_TX``, combined flags [31:16] and len [15:0] field for speed
+ * @soff: ``XDP_TX``, offset from @data to the start of &skb_shared_info
+ * @frag: one (non-head) frag for ``XDP_TX``
+ * @xdpf: &xdp_frame for the head frag for .ndo_xdp_xmit()
+ * @dma: DMA address of the non-head frag for .ndo_xdp_xmit()
+ * @xsk: ``XDP_TX`` for XSk, XDP buffer for any frag
+ * @len: frag length for XSk ``XDP_TX`` and .ndo_xdp_xmit()
+ * @flags: Tx flags for the above
+ * @opts: combined @len + @flags for the above for speed
+ * @desc: XSk xmit descriptor for direct casting
+ */
+struct libeth_xdp_tx_frame {
+	union {
+		/* ``XDP_TX`` */
+		struct {
+			void				*data;
+			u32				len_fl;
+			u32				soff;
+		};
+
+		/* ``XDP_TX`` frag */
+		skb_frag_t			frag;
+
+		/* .ndo_xdp_xmit(), XSk ``XDP_TX`` */
+		struct {
+			union {
+				struct xdp_frame		*xdpf;
+				dma_addr_t			dma;
+
+				struct libeth_xdp_buff		*xsk;
+			};
+			union {
+				struct {
+					u32				len;
+					u32				flags;
+				};
+				aligned_u64			opts;
+			};
+		};
+
+		/* XSk xmit */
+		struct xdp_desc			desc;
+	};
+} __aligned(sizeof(struct xdp_desc));
+static_assert(offsetof(struct libeth_xdp_tx_frame, frag.len) ==
+	      offsetof(struct libeth_xdp_tx_frame, len_fl));
+static_assert(sizeof(struct libeth_xdp_tx_frame) == sizeof(struct xdp_desc));
+
+/**
+ * struct libeth_xdp_tx_bulk - XDP Tx frame bulk for bulk sending
+ * @prog: corresponding active XDP program, %NULL for .ndo_xdp_xmit()
+ * @dev: &net_device which the frames are transmitted on
+ * @xdpsq: shortcut to the corresponding driver-specific XDPSQ structure
+ * @act_mask: Rx only, mask of all the XDP prog verdicts for that NAPI session
+ * @count: current number of frames in @bulk
+ * @bulk: array of queued frames for bulk Tx
+ *
+ * All XDP Tx operations except XSk xmit queue each frame to the bulk first
+ * and flush it when @count reaches the array end. Bulk is always placed on
+ * the stack for performance. One bulk element contains all the data necessary
+ * for sending a frame and then freeing it on completion.
+ * For XSk xmit, Tx descriptor array from &xsk_buff_pool is casted directly
+ * to &libeth_xdp_tx_frame as they are compatible and the bulk structure is
+ * not used.
+ */
+struct libeth_xdp_tx_bulk {
+	const struct bpf_prog		*prog;
+	struct net_device		*dev;
+	void				*xdpsq;
+
+	u32				act_mask;
+	u32				count;
+	struct libeth_xdp_tx_frame	bulk[LIBETH_XDP_TX_BULK];
+} __aligned(sizeof(struct libeth_xdp_tx_frame));
+
+/**
+ * LIBETH_XDP_ONSTACK_BULK - declare &libeth_xdp_tx_bulk on the stack
+ * @bq: name of the variable to declare
+ *
+ * Helper to declare a bulk on the stack with a compiler hint that it should
+ * not be initialized automatically (with `CONFIG_INIT_STACK_ALL_*`) for
+ * performance reasons.
+ */
+#define LIBETH_XDP_ONSTACK_BULK(bq) \
+	struct libeth_xdp_tx_bulk bq __uninitialized
+
+/**
+ * struct libeth_xdpsq - abstraction for an XDPSQ
+ * @pool: XSk buffer pool for XSk ``XDP_TX`` and xmit
+ * @sqes: array of Tx buffers from the actual queue struct
+ * @descs: opaque pointer to the HW descriptor array
+ * @ntu: pointer to the next free descriptor index
+ * @count: number of descriptors on that queue
+ * @pending: pointer to the number of sent-not-completed descs on that queue
+ * @xdp_tx: pointer to the above, but only for non-XSk-xmit frames
+ * @lock: corresponding XDPSQ lock
+ *
+ * Abstraction for driver-independent implementation of Tx. Placed on the stack
+ * and filled by the driver before the transmission, so that the generic
+ * functions can access and modify driver-specific resources.
+ */
+struct libeth_xdpsq {
+	struct xsk_buff_pool		*pool;
+	struct libeth_sqe		*sqes;
+	void				*descs;
+
+	u32				*ntu;
+	u32				count;
+
+	u32				*pending;
+	u32				*xdp_tx;
+	struct libeth_xdpsq_lock	*lock;
+};
+
+/**
+ * struct libeth_xdp_tx_desc - abstraction for an XDP Tx descriptor
+ * @addr: DMA address of the frame
+ * @len: length of the frame
+ * @flags: XDP Tx flags
+ * @opts: combined @len + @flags for speed
+ *
+ * Filled by the generic functions and then passed to driver-specific functions
+ * to fill a HW Tx descriptor, always placed on the [function] stack.
+ */
+struct libeth_xdp_tx_desc {
+	dma_addr_t			addr;
+	union {
+		struct {
+			u32				len;
+			u32				flags;
+		};
+		aligned_u64			opts;
+	};
+} __aligned_largest;
+
+/**
+ * libeth_xdp_ptr_to_priv - convert pointer to a libeth_xdp u64 priv
+ * @ptr: pointer to convert
+ *
+ * The main sending function passes private data as the largest scalar, u64.
459 + * Use this helper when you want to pass a pointer there. 460 + */ 461 + #define libeth_xdp_ptr_to_priv(ptr) ({ \ 462 + typecheck_pointer(ptr); \ 463 + ((u64)(uintptr_t)(ptr)); \ 464 + }) 465 + /** 466 + * libeth_xdp_priv_to_ptr - convert libeth_xdp u64 priv to a pointer 467 + * @priv: private data to convert 468 + * 469 + * The main sending function passes private data as the largest scalar, u64. 470 + * Use this helper when your callback takes this u64 and you want to convert 471 + * it back to a pointer. 472 + */ 473 + #define libeth_xdp_priv_to_ptr(priv) ({ \ 474 + static_assert(__same_type(priv, u64)); \ 475 + ((const void *)(uintptr_t)(priv)); \ 476 + }) 477 + 478 + /* 479 + * On 64-bit systems, assigning one u64 is faster than two u32s. When ::len 480 + * occupies lowest 32 bits (LE), whole ::opts can be assigned directly instead. 481 + */ 482 + #ifdef __LITTLE_ENDIAN 483 + #define __LIBETH_WORD_ACCESS 1 484 + #endif 485 + #ifdef __LIBETH_WORD_ACCESS 486 + #define __libeth_xdp_tx_len(flen, ...) \ 487 + .opts = ((flen) | FIELD_PREP(GENMASK_ULL(63, 32), (__VA_ARGS__ + 0))) 488 + #else 489 + #define __libeth_xdp_tx_len(flen, ...) \ 490 + .len = (flen), .flags = (__VA_ARGS__ + 0) 491 + #endif 492 + 493 + /** 494 + * libeth_xdp_tx_xmit_bulk - main XDP Tx function 495 + * @bulk: array of frames to send 496 + * @xdpsq: pointer to the driver-specific XDPSQ struct 497 + * @n: number of frames to send 498 + * @unroll: whether to unroll the queue filling loop for speed 499 + * @priv: driver-specific private data 500 + * @prep: callback for cleaning the queue and filling abstract &libeth_xdpsq 501 + * @fill: internal callback for filling &libeth_sqe and &libeth_xdp_tx_desc 502 + * @xmit: callback for filling a HW descriptor with the frame info 503 + * 504 + * Internal abstraction for placing @n XDP Tx frames on the HW XDPSQ. Used for 505 + * all types of frames: ``XDP_TX``, .ndo_xdp_xmit(), XSk ``XDP_TX``, and XSk 506 + * xmit. 
507 + * @prep must lock the queue as this function releases it at the end. @unroll 508 + * greatly increases the object code size, but also greatly increases XSk xmit 509 + * performance; for other types of frames, it's not enabled. 510 + * The compilers inline all those onstack abstractions to direct data accesses. 511 + * 512 + * Return: number of frames actually placed on the queue, <= @n. The function 513 + * can't fail, but can send fewer frames if there aren't enough free descriptors 514 + * available. The actual free space is returned by @prep from the driver. 515 + */ 516 + static __always_inline u32 517 + libeth_xdp_tx_xmit_bulk(const struct libeth_xdp_tx_frame *bulk, void *xdpsq, 518 + u32 n, bool unroll, u64 priv, 519 + u32 (*prep)(void *xdpsq, struct libeth_xdpsq *sq), 520 + struct libeth_xdp_tx_desc 521 + (*fill)(struct libeth_xdp_tx_frame frm, u32 i, 522 + const struct libeth_xdpsq *sq, u64 priv), 523 + void (*xmit)(struct libeth_xdp_tx_desc desc, u32 i, 524 + const struct libeth_xdpsq *sq, u64 priv)) 525 + { 526 + struct libeth_xdpsq sq __uninitialized; 527 + u32 this, batched, off = 0; 528 + u32 ntu, i = 0; 529 + 530 + n = min(n, prep(xdpsq, &sq)); 531 + if (unlikely(!n)) 532 + goto unlock; 533 + 534 + ntu = *sq.ntu; 535 + 536 + this = sq.count - ntu; 537 + if (likely(this > n)) 538 + this = n; 539 + 540 + again: 541 + if (!unroll) 542 + goto linear; 543 + 544 + batched = ALIGN_DOWN(this, LIBETH_XDP_TX_BATCH); 545 + 546 + for ( ; i < off + batched; i += LIBETH_XDP_TX_BATCH) { 547 + u32 base = ntu + i - off; 548 + 549 + unrolled_count(LIBETH_XDP_TX_BATCH) 550 + for (u32 j = 0; j < LIBETH_XDP_TX_BATCH; j++) 551 + xmit(fill(bulk[i + j], base + j, &sq, priv), 552 + base + j, &sq, priv); 553 + } 554 + 555 + if (batched < this) { 556 + linear: 557 + for ( ; i < off + this; i++) 558 + xmit(fill(bulk[i], ntu + i - off, &sq, priv), 559 + ntu + i - off, &sq, priv); 560 + } 561 + 562 + ntu += this; 563 + if (likely(ntu < sq.count)) 564 + goto out; 565 + 566 + ntu 
= 0; 567 + 568 + if (i < n) { 569 + this = n - i; 570 + off = i; 571 + 572 + goto again; 573 + } 574 + 575 + out: 576 + *sq.ntu = ntu; 577 + *sq.pending += n; 578 + if (sq.xdp_tx) 579 + *sq.xdp_tx += n; 580 + 581 + unlock: 582 + libeth_xdpsq_unlock(sq.lock); 583 + 584 + return n; 585 + } 586 + 587 + /* ``XDP_TX`` bulking */ 588 + 589 + void libeth_xdp_return_buff_slow(struct libeth_xdp_buff *xdp); 590 + 591 + /** 592 + * libeth_xdp_tx_queue_head - internal helper for queueing one ``XDP_TX`` head 593 + * @bq: XDP Tx bulk to queue the head frag to 594 + * @xdp: XDP buffer with the head to queue 595 + * 596 + * Return: false if it's the only frag of the frame, true if it's an S/G frame. 597 + */ 598 + static inline bool libeth_xdp_tx_queue_head(struct libeth_xdp_tx_bulk *bq, 599 + const struct libeth_xdp_buff *xdp) 600 + { 601 + const struct xdp_buff *base = &xdp->base; 602 + 603 + bq->bulk[bq->count++] = (typeof(*bq->bulk)){ 604 + .data = xdp->data, 605 + .len_fl = (base->data_end - xdp->data) | LIBETH_XDP_TX_FIRST, 606 + .soff = xdp_data_hard_end(base) - xdp->data, 607 + }; 608 + 609 + if (!xdp_buff_has_frags(base)) 610 + return false; 611 + 612 + bq->bulk[bq->count - 1].len_fl |= LIBETH_XDP_TX_MULTI; 613 + 614 + return true; 615 + } 616 + 617 + /** 618 + * libeth_xdp_tx_queue_frag - internal helper for queueing one ``XDP_TX`` frag 619 + * @bq: XDP Tx bulk to queue the frag to 620 + * @frag: frag to queue 621 + */ 622 + static inline void libeth_xdp_tx_queue_frag(struct libeth_xdp_tx_bulk *bq, 623 + const skb_frag_t *frag) 624 + { 625 + bq->bulk[bq->count++].frag = *frag; 626 + } 627 + 628 + /** 629 + * libeth_xdp_tx_queue_bulk - internal helper for queueing one ``XDP_TX`` frame 630 + * @bq: XDP Tx bulk to queue the frame to 631 + * @xdp: XDP buffer to queue 632 + * @flush_bulk: driver callback to flush the bulk to the HW queue 633 + * 634 + * Return: true on success, false on flush error. 
635 + */ 636 + static __always_inline bool 637 + libeth_xdp_tx_queue_bulk(struct libeth_xdp_tx_bulk *bq, 638 + struct libeth_xdp_buff *xdp, 639 + bool (*flush_bulk)(struct libeth_xdp_tx_bulk *bq, 640 + u32 flags)) 641 + { 642 + const struct skb_shared_info *sinfo; 643 + bool ret = true; 644 + u32 nr_frags; 645 + 646 + if (unlikely(bq->count == LIBETH_XDP_TX_BULK) && 647 + unlikely(!flush_bulk(bq, 0))) { 648 + libeth_xdp_return_buff_slow(xdp); 649 + return false; 650 + } 651 + 652 + if (!libeth_xdp_tx_queue_head(bq, xdp)) 653 + goto out; 654 + 655 + sinfo = xdp_get_shared_info_from_buff(&xdp->base); 656 + nr_frags = sinfo->nr_frags; 657 + 658 + for (u32 i = 0; i < nr_frags; i++) { 659 + if (unlikely(bq->count == LIBETH_XDP_TX_BULK) && 660 + unlikely(!flush_bulk(bq, 0))) { 661 + ret = false; 662 + break; 663 + } 664 + 665 + libeth_xdp_tx_queue_frag(bq, &sinfo->frags[i]); 666 + } 667 + 668 + out: 669 + bq->bulk[bq->count - 1].len_fl |= LIBETH_XDP_TX_LAST; 670 + xdp->data = NULL; 671 + 672 + return ret; 673 + } 674 + 675 + /** 676 + * libeth_xdp_tx_fill_stats - fill &libeth_sqe with ``XDP_TX`` frame stats 677 + * @sqe: SQ element to fill 678 + * @desc: libeth_xdp Tx descriptor 679 + * @sinfo: &skb_shared_info for this frame 680 + * 681 + * Internal helper for filling an SQE with the frame stats, do not use in 682 + * drivers. Fills the number of frags and bytes for this frame. 
683 + */ 684 + #define libeth_xdp_tx_fill_stats(sqe, desc, sinfo) \ 685 + __libeth_xdp_tx_fill_stats(sqe, desc, sinfo, __UNIQUE_ID(sqe_), \ 686 + __UNIQUE_ID(desc_), __UNIQUE_ID(sinfo_)) 687 + 688 + #define __libeth_xdp_tx_fill_stats(sqe, desc, sinfo, ue, ud, us) do { \ 689 + const struct libeth_xdp_tx_desc *ud = (desc); \ 690 + const struct skb_shared_info *us; \ 691 + struct libeth_sqe *ue = (sqe); \ 692 + \ 693 + ue->nr_frags = 1; \ 694 + ue->bytes = ud->len; \ 695 + \ 696 + if (ud->flags & LIBETH_XDP_TX_MULTI) { \ 697 + us = (sinfo); \ 698 + ue->nr_frags += us->nr_frags; \ 699 + ue->bytes += us->xdp_frags_size; \ 700 + } \ 701 + } while (0) 702 + 703 + /** 704 + * libeth_xdp_tx_fill_buf - internal helper to fill one ``XDP_TX`` &libeth_sqe 705 + * @frm: XDP Tx frame from the bulk 706 + * @i: index on the HW queue 707 + * @sq: XDPSQ abstraction for the queue 708 + * @priv: private data 709 + * 710 + * Return: XDP Tx descriptor with the synced DMA and other info to pass to 711 + * the driver callback. 
712 + */ 713 + static inline struct libeth_xdp_tx_desc 714 + libeth_xdp_tx_fill_buf(struct libeth_xdp_tx_frame frm, u32 i, 715 + const struct libeth_xdpsq *sq, u64 priv) 716 + { 717 + struct libeth_xdp_tx_desc desc; 718 + struct skb_shared_info *sinfo; 719 + skb_frag_t *frag = &frm.frag; 720 + struct libeth_sqe *sqe; 721 + netmem_ref netmem; 722 + 723 + if (frm.len_fl & LIBETH_XDP_TX_FIRST) { 724 + sinfo = frm.data + frm.soff; 725 + skb_frag_fill_netmem_desc(frag, virt_to_netmem(frm.data), 726 + offset_in_page(frm.data), 727 + frm.len_fl); 728 + } else { 729 + sinfo = NULL; 730 + } 731 + 732 + netmem = skb_frag_netmem(frag); 733 + desc = (typeof(desc)){ 734 + .addr = page_pool_get_dma_addr_netmem(netmem) + 735 + skb_frag_off(frag), 736 + .len = skb_frag_size(frag) & LIBETH_XDP_TX_LEN, 737 + .flags = skb_frag_size(frag) & LIBETH_XDP_TX_FLAGS, 738 + }; 739 + 740 + dma_sync_single_for_device(__netmem_get_pp(netmem)->p.dev, desc.addr, 741 + desc.len, DMA_BIDIRECTIONAL); 742 + 743 + if (!sinfo) 744 + return desc; 745 + 746 + sqe = &sq->sqes[i]; 747 + sqe->type = LIBETH_SQE_XDP_TX; 748 + sqe->sinfo = sinfo; 749 + libeth_xdp_tx_fill_stats(sqe, &desc, sinfo); 750 + 751 + return desc; 752 + } 753 + 754 + void libeth_xdp_tx_exception(struct libeth_xdp_tx_bulk *bq, u32 sent, 755 + u32 flags); 756 + 757 + /** 758 + * __libeth_xdp_tx_flush_bulk - internal helper to flush one XDP Tx bulk 759 + * @bq: bulk to flush 760 + * @flags: XDP TX flags (.ndo_xdp_xmit(), XSk etc.) 761 + * @prep: driver-specific callback to prepare the queue for sending 762 + * @fill: libeth_xdp callback to fill &libeth_sqe and &libeth_xdp_tx_desc 763 + * @xmit: driver callback to fill a HW descriptor 764 + * 765 + * Internal abstraction to create bulk flush functions for drivers. Used for 766 + * everything except XSk xmit. 767 + * 768 + * Return: true if anything was sent, false otherwise. 
769 + */ 770 + static __always_inline bool 771 + __libeth_xdp_tx_flush_bulk(struct libeth_xdp_tx_bulk *bq, u32 flags, 772 + u32 (*prep)(void *xdpsq, struct libeth_xdpsq *sq), 773 + struct libeth_xdp_tx_desc 774 + (*fill)(struct libeth_xdp_tx_frame frm, u32 i, 775 + const struct libeth_xdpsq *sq, u64 priv), 776 + void (*xmit)(struct libeth_xdp_tx_desc desc, u32 i, 777 + const struct libeth_xdpsq *sq, 778 + u64 priv)) 779 + { 780 + u32 sent, drops; 781 + int err = 0; 782 + 783 + sent = libeth_xdp_tx_xmit_bulk(bq->bulk, bq->xdpsq, 784 + min(bq->count, LIBETH_XDP_TX_BULK), 785 + false, 0, prep, fill, xmit); 786 + drops = bq->count - sent; 787 + 788 + if (unlikely(drops)) { 789 + libeth_xdp_tx_exception(bq, sent, flags); 790 + err = -ENXIO; 791 + } else { 792 + bq->count = 0; 793 + } 794 + 795 + trace_xdp_bulk_tx(bq->dev, sent, drops, err); 796 + 797 + return likely(sent); 798 + } 799 + 800 + /** 801 + * libeth_xdp_tx_flush_bulk - wrapper to define flush of one ``XDP_TX`` bulk 802 + * @bq: bulk to flush 803 + * @flags: Tx flags, see above 804 + * @prep: driver callback to prepare the queue 805 + * @xmit: driver callback to fill a HW descriptor 806 + * 807 + * Use via LIBETH_XDP_DEFINE_FLUSH_TX() to define an ``XDP_TX`` driver 808 + * callback. 
809 + */ 810 + #define libeth_xdp_tx_flush_bulk(bq, flags, prep, xmit) \ 811 + __libeth_xdp_tx_flush_bulk(bq, flags, prep, libeth_xdp_tx_fill_buf, \ 812 + xmit) 813 + 814 + /* .ndo_xdp_xmit() implementation */ 815 + 816 + /** 817 + * libeth_xdp_xmit_init_bulk - internal helper to initialize bulk for XDP xmit 818 + * @bq: bulk to initialize 819 + * @dev: target &net_device 820 + * @xdpsqs: array of driver-specific XDPSQ structs 821 + * @num: number of active XDPSQs (the above array length) 822 + */ 823 + #define libeth_xdp_xmit_init_bulk(bq, dev, xdpsqs, num) \ 824 + __libeth_xdp_xmit_init_bulk(bq, dev, (xdpsqs)[libeth_xdpsq_id(num)]) 825 + 826 + static inline void __libeth_xdp_xmit_init_bulk(struct libeth_xdp_tx_bulk *bq, 827 + struct net_device *dev, 828 + void *xdpsq) 829 + { 830 + bq->dev = dev; 831 + bq->xdpsq = xdpsq; 832 + bq->count = 0; 833 + } 834 + 835 + /** 836 + * libeth_xdp_xmit_frame_dma - internal helper to access DMA of an &xdp_frame 837 + * @xf: pointer to the XDP frame 838 + * 839 + * There's no place in &libeth_xdp_tx_frame to store DMA address for an 840 + * &xdp_frame head. The headroom is used then, the address is placed right 841 + * after the frame struct, naturally aligned. 842 + * 843 + * Return: pointer to the DMA address to use. 
844 + */ 845 + #define libeth_xdp_xmit_frame_dma(xf) \ 846 + _Generic((xf), \ 847 + const struct xdp_frame *: \ 848 + (const dma_addr_t *)__libeth_xdp_xmit_frame_dma(xf), \ 849 + struct xdp_frame *: \ 850 + (dma_addr_t *)__libeth_xdp_xmit_frame_dma(xf) \ 851 + ) 852 + 853 + static inline void *__libeth_xdp_xmit_frame_dma(const struct xdp_frame *xdpf) 854 + { 855 + void *addr = (void *)(xdpf + 1); 856 + 857 + if (!IS_ENABLED(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS) && 858 + __alignof(*xdpf) < sizeof(dma_addr_t)) 859 + addr = PTR_ALIGN(addr, sizeof(dma_addr_t)); 860 + 861 + return addr; 862 + } 863 + 864 + /** 865 + * libeth_xdp_xmit_queue_head - internal helper for queueing one XDP xmit head 866 + * @bq: XDP Tx bulk to queue the head frag to 867 + * @xdpf: XDP frame with the head to queue 868 + * @dev: device to perform DMA mapping 869 + * 870 + * Return: ``LIBETH_XDP_DROP`` on DMA mapping error, 871 + * ``LIBETH_XDP_PASS`` if it's the only frag in the frame, 872 + * ``LIBETH_XDP_TX`` if it's an S/G frame. 
873 + */ 874 + static inline u32 libeth_xdp_xmit_queue_head(struct libeth_xdp_tx_bulk *bq, 875 + struct xdp_frame *xdpf, 876 + struct device *dev) 877 + { 878 + dma_addr_t dma; 879 + 880 + dma = dma_map_single(dev, xdpf->data, xdpf->len, DMA_TO_DEVICE); 881 + if (dma_mapping_error(dev, dma)) 882 + return LIBETH_XDP_DROP; 883 + 884 + *libeth_xdp_xmit_frame_dma(xdpf) = dma; 885 + 886 + bq->bulk[bq->count++] = (typeof(*bq->bulk)){ 887 + .xdpf = xdpf, 888 + __libeth_xdp_tx_len(xdpf->len, LIBETH_XDP_TX_FIRST), 889 + }; 890 + 891 + if (!xdp_frame_has_frags(xdpf)) 892 + return LIBETH_XDP_PASS; 893 + 894 + bq->bulk[bq->count - 1].flags |= LIBETH_XDP_TX_MULTI; 895 + 896 + return LIBETH_XDP_TX; 897 + } 898 + 899 + /** 900 + * libeth_xdp_xmit_queue_frag - internal helper for queueing one XDP xmit frag 901 + * @bq: XDP Tx bulk to queue the frag to 902 + * @frag: frag to queue 903 + * @dev: device to perform DMA mapping 904 + * 905 + * Return: true on success, false on DMA mapping error. 906 + */ 907 + static inline bool libeth_xdp_xmit_queue_frag(struct libeth_xdp_tx_bulk *bq, 908 + const skb_frag_t *frag, 909 + struct device *dev) 910 + { 911 + dma_addr_t dma; 912 + 913 + dma = skb_frag_dma_map(dev, frag); 914 + if (dma_mapping_error(dev, dma)) 915 + return false; 916 + 917 + bq->bulk[bq->count++] = (typeof(*bq->bulk)){ 918 + .dma = dma, 919 + __libeth_xdp_tx_len(skb_frag_size(frag)), 920 + }; 921 + 922 + return true; 923 + } 924 + 925 + /** 926 + * libeth_xdp_xmit_queue_bulk - internal helper for queueing one XDP xmit frame 927 + * @bq: XDP Tx bulk to queue the frame to 928 + * @xdpf: XDP frame to queue 929 + * @flush_bulk: driver callback to flush the bulk to the HW queue 930 + * 931 + * Return: ``LIBETH_XDP_TX`` on success, 932 + * ``LIBETH_XDP_DROP`` if the frame should be dropped by the stack, 933 + * ``LIBETH_XDP_ABORTED`` if the frame will be dropped by libeth_xdp. 
934 + */ 935 + static __always_inline u32 936 + libeth_xdp_xmit_queue_bulk(struct libeth_xdp_tx_bulk *bq, 937 + struct xdp_frame *xdpf, 938 + bool (*flush_bulk)(struct libeth_xdp_tx_bulk *bq, 939 + u32 flags)) 940 + { 941 + u32 head, nr_frags, i, ret = LIBETH_XDP_TX; 942 + struct device *dev = bq->dev->dev.parent; 943 + const struct skb_shared_info *sinfo; 944 + 945 + if (unlikely(bq->count == LIBETH_XDP_TX_BULK) && 946 + unlikely(!flush_bulk(bq, LIBETH_XDP_TX_NDO))) 947 + return LIBETH_XDP_DROP; 948 + 949 + head = libeth_xdp_xmit_queue_head(bq, xdpf, dev); 950 + if (head == LIBETH_XDP_PASS) 951 + goto out; 952 + else if (head == LIBETH_XDP_DROP) 953 + return LIBETH_XDP_DROP; 954 + 955 + sinfo = xdp_get_shared_info_from_frame(xdpf); 956 + nr_frags = sinfo->nr_frags; 957 + 958 + for (i = 0; i < nr_frags; i++) { 959 + if (unlikely(bq->count == LIBETH_XDP_TX_BULK) && 960 + unlikely(!flush_bulk(bq, LIBETH_XDP_TX_NDO))) 961 + break; 962 + 963 + if (!libeth_xdp_xmit_queue_frag(bq, &sinfo->frags[i], dev)) 964 + break; 965 + } 966 + 967 + if (unlikely(i < nr_frags)) 968 + ret = LIBETH_XDP_ABORTED; 969 + 970 + out: 971 + bq->bulk[bq->count - 1].flags |= LIBETH_XDP_TX_LAST; 972 + 973 + return ret; 974 + } 975 + 976 + /** 977 + * libeth_xdp_xmit_fill_buf - internal helper to fill one XDP xmit &libeth_sqe 978 + * @frm: XDP Tx frame from the bulk 979 + * @i: index on the HW queue 980 + * @sq: XDPSQ abstraction for the queue 981 + * @priv: private data 982 + * 983 + * Return: XDP Tx descriptor with the mapped DMA and other info to pass to 984 + * the driver callback. 
985 + */ 986 + static inline struct libeth_xdp_tx_desc 987 + libeth_xdp_xmit_fill_buf(struct libeth_xdp_tx_frame frm, u32 i, 988 + const struct libeth_xdpsq *sq, u64 priv) 989 + { 990 + struct libeth_xdp_tx_desc desc; 991 + struct libeth_sqe *sqe; 992 + struct xdp_frame *xdpf; 993 + 994 + if (frm.flags & LIBETH_XDP_TX_FIRST) { 995 + xdpf = frm.xdpf; 996 + desc.addr = *libeth_xdp_xmit_frame_dma(xdpf); 997 + } else { 998 + xdpf = NULL; 999 + desc.addr = frm.dma; 1000 + } 1001 + desc.opts = frm.opts; 1002 + 1003 + sqe = &sq->sqes[i]; 1004 + dma_unmap_addr_set(sqe, dma, desc.addr); 1005 + dma_unmap_len_set(sqe, len, desc.len); 1006 + 1007 + if (!xdpf) { 1008 + sqe->type = LIBETH_SQE_XDP_XMIT_FRAG; 1009 + return desc; 1010 + } 1011 + 1012 + sqe->type = LIBETH_SQE_XDP_XMIT; 1013 + sqe->xdpf = xdpf; 1014 + libeth_xdp_tx_fill_stats(sqe, &desc, 1015 + xdp_get_shared_info_from_frame(xdpf)); 1016 + 1017 + return desc; 1018 + } 1019 + 1020 + /** 1021 + * libeth_xdp_xmit_flush_bulk - wrapper to define flush of one XDP xmit bulk 1022 + * @bq: bulk to flush 1023 + * @flags: Tx flags, see __libeth_xdp_tx_flush_bulk() 1024 + * @prep: driver callback to prepare the queue 1025 + * @xmit: driver callback to fill a HW descriptor 1026 + * 1027 + * Use via LIBETH_XDP_DEFINE_FLUSH_XMIT() to define an XDP xmit driver 1028 + * callback. 
1029 + */ 1030 + #define libeth_xdp_xmit_flush_bulk(bq, flags, prep, xmit) \ 1031 + __libeth_xdp_tx_flush_bulk(bq, (flags) | LIBETH_XDP_TX_NDO, prep, \ 1032 + libeth_xdp_xmit_fill_buf, xmit) 1033 + 1034 + u32 libeth_xdp_xmit_return_bulk(const struct libeth_xdp_tx_frame *bq, 1035 + u32 count, const struct net_device *dev); 1036 + 1037 + /** 1038 + * __libeth_xdp_xmit_do_bulk - internal function to implement .ndo_xdp_xmit() 1039 + * @bq: XDP Tx bulk to queue frames to 1040 + * @frames: XDP frames passed by the stack 1041 + * @n: number of frames 1042 + * @flags: flags passed by the stack 1043 + * @flush_bulk: driver callback to flush an XDP xmit bulk 1044 + * @finalize: driver callback to finalize sending XDP Tx frames on the queue 1045 + * 1046 + * Perform common checks, map the frags and queue them to the bulk, then flush 1047 + * the bulk to the XDPSQ. If requested by the stack, finalize the queue. 1048 + * 1049 + * Return: number of frames sent or -errno on error. 1050 + */ 1051 + static __always_inline int 1052 + __libeth_xdp_xmit_do_bulk(struct libeth_xdp_tx_bulk *bq, 1053 + struct xdp_frame **frames, u32 n, u32 flags, 1054 + bool (*flush_bulk)(struct libeth_xdp_tx_bulk *bq, 1055 + u32 flags), 1056 + void (*finalize)(void *xdpsq, bool sent, bool flush)) 1057 + { 1058 + u32 nxmit = 0; 1059 + 1060 + if (unlikely(flags & ~XDP_XMIT_FLAGS_MASK)) 1061 + return -EINVAL; 1062 + 1063 + for (u32 i = 0; likely(i < n); i++) { 1064 + u32 ret; 1065 + 1066 + ret = libeth_xdp_xmit_queue_bulk(bq, frames[i], flush_bulk); 1067 + if (unlikely(ret != LIBETH_XDP_TX)) { 1068 + nxmit += ret == LIBETH_XDP_ABORTED; 1069 + break; 1070 + } 1071 + 1072 + nxmit++; 1073 + } 1074 + 1075 + if (bq->count) { 1076 + flush_bulk(bq, LIBETH_XDP_TX_NDO); 1077 + if (unlikely(bq->count)) 1078 + nxmit -= libeth_xdp_xmit_return_bulk(bq->bulk, 1079 + bq->count, 1080 + bq->dev); 1081 + } 1082 + 1083 + finalize(bq->xdpsq, nxmit, flags & XDP_XMIT_FLUSH); 1084 + 1085 + return nxmit; 1086 + } 1087 + 1088 + /** 
1089 + * libeth_xdp_xmit_do_bulk - implement full .ndo_xdp_xmit() in driver 1090 + * @dev: target &net_device 1091 + * @n: number of frames to send 1092 + * @fr: XDP frames to send 1093 + * @f: flags passed by the stack 1094 + * @xqs: array of XDPSQs driver structs 1095 + * @nqs: number of active XDPSQs, the above array length 1096 + * @fl: driver callback to flush an XDP xmit bulk 1097 + * @fin: driver callback to finalize the queue 1098 + * 1099 + * If the driver has active XDPSQs, perform common checks and send the frames. 1100 + * Finalize the queue, if requested. 1101 + * 1102 + * Return: number of frames sent or -errno on error. 1103 + */ 1104 + #define libeth_xdp_xmit_do_bulk(dev, n, fr, f, xqs, nqs, fl, fin) \ 1105 + _libeth_xdp_xmit_do_bulk(dev, n, fr, f, xqs, nqs, fl, fin, \ 1106 + __UNIQUE_ID(bq_), __UNIQUE_ID(ret_), \ 1107 + __UNIQUE_ID(nqs_)) 1108 + 1109 + #define _libeth_xdp_xmit_do_bulk(d, n, fr, f, xqs, nqs, fl, fin, ub, ur, un) \ 1110 + ({ \ 1111 + u32 un = (nqs); \ 1112 + int ur; \ 1113 + \ 1114 + if (likely(un)) { \ 1115 + LIBETH_XDP_ONSTACK_BULK(ub); \ 1116 + \ 1117 + libeth_xdp_xmit_init_bulk(&ub, d, xqs, un); \ 1118 + ur = __libeth_xdp_xmit_do_bulk(&ub, fr, n, f, fl, fin); \ 1119 + } else { \ 1120 + ur = -ENXIO; \ 1121 + } \ 1122 + \ 1123 + ur; \ 1124 + }) 1125 + 1126 + /* Rx polling path */ 1127 + 1128 + /** 1129 + * libeth_xdp_tx_init_bulk - initialize an XDP Tx bulk for Rx NAPI poll 1130 + * @bq: bulk to initialize 1131 + * @prog: RCU pointer to the XDP program (can be %NULL) 1132 + * @dev: target &net_device 1133 + * @xdpsqs: array of driver XDPSQ structs 1134 + * @num: number of active XDPSQs, the above array length 1135 + * 1136 + * Should be called on an onstack XDP Tx bulk before the NAPI polling loop. 1137 + * Initializes all the needed fields to run libeth_xdp functions. If @num == 0, 1138 + * assumes XDP is not enabled. 1139 + * Do not use for XSk, it has its own optimized helper. 
1140 + */ 1141 + #define libeth_xdp_tx_init_bulk(bq, prog, dev, xdpsqs, num) \ 1142 + __libeth_xdp_tx_init_bulk(bq, prog, dev, xdpsqs, num, false, \ 1143 + __UNIQUE_ID(bq_), __UNIQUE_ID(nqs_)) 1144 + 1145 + #define __libeth_xdp_tx_init_bulk(bq, pr, d, xdpsqs, num, xsk, ub, un) do { \ 1146 + typeof(bq) ub = (bq); \ 1147 + u32 un = (num); \ 1148 + \ 1149 + rcu_read_lock(); \ 1150 + \ 1151 + if (un || (xsk)) { \ 1152 + ub->prog = rcu_dereference(pr); \ 1153 + ub->dev = (d); \ 1154 + ub->xdpsq = (xdpsqs)[libeth_xdpsq_id(un)]; \ 1155 + } else { \ 1156 + ub->prog = NULL; \ 1157 + } \ 1158 + \ 1159 + ub->act_mask = 0; \ 1160 + ub->count = 0; \ 1161 + } while (0) 1162 + 1163 + void libeth_xdp_load_stash(struct libeth_xdp_buff *dst, 1164 + const struct libeth_xdp_buff_stash *src); 1165 + void libeth_xdp_save_stash(struct libeth_xdp_buff_stash *dst, 1166 + const struct libeth_xdp_buff *src); 1167 + void __libeth_xdp_return_stash(struct libeth_xdp_buff_stash *stash); 1168 + 1169 + /** 1170 + * libeth_xdp_init_buff - initialize a &libeth_xdp_buff for Rx NAPI poll 1171 + * @dst: onstack buffer to initialize 1172 + * @src: XDP buffer stash placed on the queue 1173 + * @rxq: registered &xdp_rxq_info corresponding to this queue 1174 + * 1175 + * Should be called before the main NAPI polling loop. Loads the content of 1176 + * the previously saved stash or initializes the buffer from scratch. 1177 + * Do not use for XSk. 
1178 + */ 1179 + static inline void 1180 + libeth_xdp_init_buff(struct libeth_xdp_buff *dst, 1181 + const struct libeth_xdp_buff_stash *src, 1182 + struct xdp_rxq_info *rxq) 1183 + { 1184 + if (likely(!src->data)) 1185 + dst->data = NULL; 1186 + else 1187 + libeth_xdp_load_stash(dst, src); 1188 + 1189 + dst->base.rxq = rxq; 1190 + } 1191 + 1192 + /** 1193 + * libeth_xdp_save_buff - save a partially built buffer on a queue 1194 + * @dst: XDP buffer stash placed on the queue 1195 + * @src: onstack buffer to save 1196 + * 1197 + * Should be called after the main NAPI polling loop. If the loop exited before 1198 + * the buffer was finished, saves its content on the queue, so that it can be 1199 + * completed during the next poll. Otherwise, clears the stash. 1200 + */ 1201 + static inline void libeth_xdp_save_buff(struct libeth_xdp_buff_stash *dst, 1202 + const struct libeth_xdp_buff *src) 1203 + { 1204 + if (likely(!src->data)) 1205 + dst->data = NULL; 1206 + else 1207 + libeth_xdp_save_stash(dst, src); 1208 + } 1209 + 1210 + /** 1211 + * libeth_xdp_return_stash - free an XDP buffer stash from a queue 1212 + * @stash: stash to free 1213 + * 1214 + * If the queue is about to be destroyed, but it still has an incompleted 1215 + * buffer stash, this helper should be called to free it. 
1216 + */ 1217 + static inline void libeth_xdp_return_stash(struct libeth_xdp_buff_stash *stash) 1218 + { 1219 + if (stash->data) 1220 + __libeth_xdp_return_stash(stash); 1221 + } 1222 + 1223 + static inline void libeth_xdp_return_va(const void *data, bool napi) 1224 + { 1225 + netmem_ref netmem = virt_to_netmem(data); 1226 + 1227 + page_pool_put_full_netmem(__netmem_get_pp(netmem), netmem, napi); 1228 + } 1229 + 1230 + static inline void libeth_xdp_return_frags(const struct skb_shared_info *sinfo, 1231 + bool napi) 1232 + { 1233 + for (u32 i = 0; i < sinfo->nr_frags; i++) { 1234 + netmem_ref netmem = skb_frag_netmem(&sinfo->frags[i]); 1235 + 1236 + page_pool_put_full_netmem(netmem_get_pp(netmem), netmem, napi); 1237 + } 1238 + } 1239 + 1240 + /** 1241 + * libeth_xdp_return_buff - free/recycle &libeth_xdp_buff 1242 + * @xdp: buffer to free 1243 + * 1244 + * Hotpath helper to free &libeth_xdp_buff. Compared to xdp_return_buff(), 1245 + * it's faster as it gets inlined and always assumes order-0 pages and safe 1246 + * direct recycling. Zeroes @xdp->data to avoid UAFs. 1247 + */ 1248 + #define libeth_xdp_return_buff(xdp) __libeth_xdp_return_buff(xdp, true) 1249 + 1250 + static inline void __libeth_xdp_return_buff(struct libeth_xdp_buff *xdp, 1251 + bool napi) 1252 + { 1253 + if (!xdp_buff_has_frags(&xdp->base)) 1254 + goto out; 1255 + 1256 + libeth_xdp_return_frags(xdp_get_shared_info_from_buff(&xdp->base), 1257 + napi); 1258 + 1259 + out: 1260 + libeth_xdp_return_va(xdp->data, napi); 1261 + xdp->data = NULL; 1262 + } 1263 + 1264 + bool libeth_xdp_buff_add_frag(struct libeth_xdp_buff *xdp, 1265 + const struct libeth_fqe *fqe, 1266 + u32 len); 1267 + 1268 + /** 1269 + * libeth_xdp_prepare_buff - fill &libeth_xdp_buff with head FQE data 1270 + * @xdp: XDP buffer to attach the head to 1271 + * @fqe: FQE containing the head buffer 1272 + * @len: buffer len passed from HW 1273 + * 1274 + * Internal, use libeth_xdp_process_buff() instead. 
Initializes XDP buffer 1275 + * head with the Rx buffer data: data pointer, length, headroom, and 1276 + * truesize/tailroom. Zeroes the flags. 1277 + * Uses faster single u64 write instead of per-field access. 1278 + */ 1279 + static inline void libeth_xdp_prepare_buff(struct libeth_xdp_buff *xdp, 1280 + const struct libeth_fqe *fqe, 1281 + u32 len) 1282 + { 1283 + const struct page *page = __netmem_to_page(fqe->netmem); 1284 + 1285 + #ifdef __LIBETH_WORD_ACCESS 1286 + static_assert(offsetofend(typeof(xdp->base), flags) - 1287 + offsetof(typeof(xdp->base), frame_sz) == 1288 + sizeof(u64)); 1289 + 1290 + *(u64 *)&xdp->base.frame_sz = fqe->truesize; 1291 + #else 1292 + xdp_init_buff(&xdp->base, fqe->truesize, xdp->base.rxq); 1293 + #endif 1294 + xdp_prepare_buff(&xdp->base, page_address(page) + fqe->offset, 1295 + page->pp->p.offset, len, true); 1296 + } 1297 + 1298 + /** 1299 + * libeth_xdp_process_buff - attach Rx buffer to &libeth_xdp_buff 1300 + * @xdp: XDP buffer to attach the Rx buffer to 1301 + * @fqe: Rx buffer to process 1302 + * @len: received data length from the descriptor 1303 + * 1304 + * If the XDP buffer is empty, attaches the Rx buffer as head and initializes 1305 + * the required fields. Otherwise, attaches the buffer as a frag. 1306 + * Already performs DMA sync-for-CPU and frame start prefetch 1307 + * (for head buffers only). 1308 + * 1309 + * Return: true on success, false if the descriptor must be skipped (empty or 1310 + * no space for a new frag). 
1311 + */ 1312 + static inline bool libeth_xdp_process_buff(struct libeth_xdp_buff *xdp, 1313 + const struct libeth_fqe *fqe, 1314 + u32 len) 1315 + { 1316 + if (!libeth_rx_sync_for_cpu(fqe, len)) 1317 + return false; 1318 + 1319 + if (xdp->data) 1320 + return libeth_xdp_buff_add_frag(xdp, fqe, len); 1321 + 1322 + libeth_xdp_prepare_buff(xdp, fqe, len); 1323 + 1324 + prefetch(xdp->data); 1325 + 1326 + return true; 1327 + } 1328 + 1329 + /** 1330 + * libeth_xdp_buff_stats_frags - update onstack RQ stats with XDP frags info 1331 + * @ss: onstack stats to update 1332 + * @xdp: buffer to account 1333 + * 1334 + * Internal helper used by __libeth_xdp_run_pass(), do not call directly. 1335 + * Adds buffer's frags count and total len to the onstack stats. 1336 + */ 1337 + static inline void 1338 + libeth_xdp_buff_stats_frags(struct libeth_rq_napi_stats *ss, 1339 + const struct libeth_xdp_buff *xdp) 1340 + { 1341 + const struct skb_shared_info *sinfo; 1342 + 1343 + sinfo = xdp_get_shared_info_from_buff(&xdp->base); 1344 + ss->bytes += sinfo->xdp_frags_size; 1345 + ss->fragments += sinfo->nr_frags + 1; 1346 + } 1347 + 1348 + u32 libeth_xdp_prog_exception(const struct libeth_xdp_tx_bulk *bq, 1349 + struct libeth_xdp_buff *xdp, 1350 + enum xdp_action act, int ret); 1351 + 1352 + /** 1353 + * __libeth_xdp_run_prog - run XDP program on an XDP buffer 1354 + * @xdp: XDP buffer to run the prog on 1355 + * @bq: buffer bulk for ``XDP_TX`` queueing 1356 + * 1357 + * Internal inline abstraction to run XDP program. Handles ``XDP_DROP`` 1358 + * and ``XDP_REDIRECT`` only, the rest is processed levels up. 1359 + * Reports an XDP prog exception on errors. 1360 + * 1361 + * Return: libeth_xdp prog verdict depending on the prog's verdict. 
1362 + */ 1363 + static __always_inline u32 1364 + __libeth_xdp_run_prog(struct libeth_xdp_buff *xdp, 1365 + const struct libeth_xdp_tx_bulk *bq) 1366 + { 1367 + enum xdp_action act; 1368 + 1369 + act = bpf_prog_run_xdp(bq->prog, &xdp->base); 1370 + if (unlikely(act < XDP_DROP || act > XDP_REDIRECT)) 1371 + goto out; 1372 + 1373 + switch (act) { 1374 + case XDP_PASS: 1375 + return LIBETH_XDP_PASS; 1376 + case XDP_DROP: 1377 + libeth_xdp_return_buff(xdp); 1378 + 1379 + return LIBETH_XDP_DROP; 1380 + case XDP_TX: 1381 + return LIBETH_XDP_TX; 1382 + case XDP_REDIRECT: 1383 + if (unlikely(xdp_do_redirect(bq->dev, &xdp->base, bq->prog))) 1384 + break; 1385 + 1386 + xdp->data = NULL; 1387 + 1388 + return LIBETH_XDP_REDIRECT; 1389 + default: 1390 + break; 1391 + } 1392 + 1393 + out: 1394 + return libeth_xdp_prog_exception(bq, xdp, act, 0); 1395 + } 1396 + 1397 + /** 1398 + * __libeth_xdp_run_flush - run XDP program and handle ``XDP_TX`` verdict 1399 + * @xdp: XDP buffer to run the prog on 1400 + * @bq: buffer bulk for ``XDP_TX`` queueing 1401 + * @run: internal callback for running XDP program 1402 + * @queue: internal callback for queuing ``XDP_TX`` frame 1403 + * @flush_bulk: driver callback for flushing a bulk 1404 + * 1405 + * Internal inline abstraction to run XDP program and additionally handle 1406 + * ``XDP_TX`` verdict. Used by both XDP and XSk, hence @run and @queue. 1407 + * Do not use directly. 1408 + * 1409 + * Return: libeth_xdp prog verdict depending on the prog's verdict. 
1410 + */ 1411 + static __always_inline u32 1412 + __libeth_xdp_run_flush(struct libeth_xdp_buff *xdp, 1413 + struct libeth_xdp_tx_bulk *bq, 1414 + u32 (*run)(struct libeth_xdp_buff *xdp, 1415 + const struct libeth_xdp_tx_bulk *bq), 1416 + bool (*queue)(struct libeth_xdp_tx_bulk *bq, 1417 + struct libeth_xdp_buff *xdp, 1418 + bool (*flush_bulk) 1419 + (struct libeth_xdp_tx_bulk *bq, 1420 + u32 flags)), 1421 + bool (*flush_bulk)(struct libeth_xdp_tx_bulk *bq, 1422 + u32 flags)) 1423 + { 1424 + u32 act; 1425 + 1426 + act = run(xdp, bq); 1427 + if (act == LIBETH_XDP_TX && unlikely(!queue(bq, xdp, flush_bulk))) 1428 + act = LIBETH_XDP_DROP; 1429 + 1430 + bq->act_mask |= act; 1431 + 1432 + return act; 1433 + } 1434 + 1435 + /** 1436 + * libeth_xdp_run_prog - run XDP program (non-XSk path) and handle all verdicts 1437 + * @xdp: XDP buffer to process 1438 + * @bq: XDP Tx bulk to queue ``XDP_TX`` buffers 1439 + * @fl: driver ``XDP_TX`` bulk flush callback 1440 + * 1441 + * Run the attached XDP program and handle all possible verdicts. XSk has its 1442 + * own version. 1443 + * Prefer using it via LIBETH_XDP_DEFINE_RUN{,_PASS,_PROG}(). 1444 + * 1445 + * Return: true if the buffer should be passed up the stack, false if the poll 1446 + * should go to the next buffer. 
1447 + */ 1448 + #define libeth_xdp_run_prog(xdp, bq, fl) \ 1449 + (__libeth_xdp_run_flush(xdp, bq, __libeth_xdp_run_prog, \ 1450 + libeth_xdp_tx_queue_bulk, \ 1451 + fl) == LIBETH_XDP_PASS) 1452 + 1453 + /** 1454 + * __libeth_xdp_run_pass - helper to run XDP program and handle the result 1455 + * @xdp: XDP buffer to process 1456 + * @bq: XDP Tx bulk to queue ``XDP_TX`` frames 1457 + * @napi: NAPI to build an skb and pass it up the stack 1458 + * @rs: onstack libeth RQ stats 1459 + * @md: metadata that should be filled to the XDP buffer 1460 + * @prep: callback for filling the metadata 1461 + * @run: driver wrapper to run XDP program 1462 + * @populate: driver callback to populate an skb with the HW descriptor data 1463 + * 1464 + * Inline abstraction that does the following (non-XSk path): 1465 + * 1) adds frame size and frag number (if needed) to the onstack stats; 1466 + * 2) fills the descriptor metadata to the onstack &libeth_xdp_buff; 1467 + * 3) runs XDP program if present; 1468 + * 4) handles all possible verdicts; 1469 + * 5) on ``XDP_PASS``, builds an skb from the buffer; 1470 + * 6) populates it with the descriptor metadata; 1471 + * 7) passes it up the stack. 1472 + * 1473 + * In most cases, number 2 means just writing the pointer to the HW descriptor 1474 + * to the XDP buffer. If so, please use LIBETH_XDP_DEFINE_RUN{,_PASS}() 1475 + * wrappers to build a driver function.
1476 + */ 1477 + static __always_inline void 1478 + __libeth_xdp_run_pass(struct libeth_xdp_buff *xdp, 1479 + struct libeth_xdp_tx_bulk *bq, struct napi_struct *napi, 1480 + struct libeth_rq_napi_stats *rs, const void *md, 1481 + void (*prep)(struct libeth_xdp_buff *xdp, 1482 + const void *md), 1483 + bool (*run)(struct libeth_xdp_buff *xdp, 1484 + struct libeth_xdp_tx_bulk *bq), 1485 + bool (*populate)(struct sk_buff *skb, 1486 + const struct libeth_xdp_buff *xdp, 1487 + struct libeth_rq_napi_stats *rs)) 1488 + { 1489 + struct sk_buff *skb; 1490 + 1491 + rs->bytes += xdp->base.data_end - xdp->data; 1492 + rs->packets++; 1493 + 1494 + if (xdp_buff_has_frags(&xdp->base)) 1495 + libeth_xdp_buff_stats_frags(rs, xdp); 1496 + 1497 + if (prep && (!__builtin_constant_p(!!md) || md)) 1498 + prep(xdp, md); 1499 + 1500 + if (!bq || !run || !bq->prog) 1501 + goto build; 1502 + 1503 + if (!run(xdp, bq)) 1504 + return; 1505 + 1506 + build: 1507 + skb = xdp_build_skb_from_buff(&xdp->base); 1508 + if (unlikely(!skb)) { 1509 + libeth_xdp_return_buff_slow(xdp); 1510 + return; 1511 + } 1512 + 1513 + xdp->data = NULL; 1514 + 1515 + if (unlikely(!populate(skb, xdp, rs))) { 1516 + napi_consume_skb(skb, true); 1517 + return; 1518 + } 1519 + 1520 + napi_gro_receive(napi, skb); 1521 + } 1522 + 1523 + static inline void libeth_xdp_prep_desc(struct libeth_xdp_buff *xdp, 1524 + const void *desc) 1525 + { 1526 + xdp->desc = desc; 1527 + } 1528 + 1529 + /** 1530 + * libeth_xdp_run_pass - helper to run XDP program and handle the result 1531 + * @xdp: XDP buffer to process 1532 + * @bq: XDP Tx bulk to queue ``XDP_TX`` frames 1533 + * @napi: NAPI to build an skb and pass it up the stack 1534 + * @ss: onstack libeth RQ stats 1535 + * @desc: pointer to the HW descriptor for that frame 1536 + * @run: driver wrapper to run XDP program 1537 + * @populate: driver callback to populate an skb with the HW descriptor data 1538 + * 1539 + * Wrapper around the underscored version when "fill the descriptor 
metadata" 1540 + * means just writing the pointer to the HW descriptor as @xdp->desc. 1541 + */ 1542 + #define libeth_xdp_run_pass(xdp, bq, napi, ss, desc, run, populate) \ 1543 + __libeth_xdp_run_pass(xdp, bq, napi, ss, desc, libeth_xdp_prep_desc, \ 1544 + run, populate) 1545 + 1546 + /** 1547 + * libeth_xdp_finalize_rx - finalize XDPSQ after a NAPI polling loop (non-XSk) 1548 + * @bq: ``XDP_TX`` frame bulk 1549 + * @flush: driver callback to flush the bulk 1550 + * @finalize: driver callback to start sending the frames and run the timer 1551 + * 1552 + * Flush the bulk if there are frames left to send, kick the queue and flush 1553 + * the XDP maps. 1554 + */ 1555 + #define libeth_xdp_finalize_rx(bq, flush, finalize) \ 1556 + __libeth_xdp_finalize_rx(bq, 0, flush, finalize) 1557 + 1558 + static __always_inline void 1559 + __libeth_xdp_finalize_rx(struct libeth_xdp_tx_bulk *bq, u32 flags, 1560 + bool (*flush_bulk)(struct libeth_xdp_tx_bulk *bq, 1561 + u32 flags), 1562 + void (*finalize)(void *xdpsq, bool sent, bool flush)) 1563 + { 1564 + if (bq->act_mask & LIBETH_XDP_TX) { 1565 + if (bq->count) 1566 + flush_bulk(bq, flags | LIBETH_XDP_TX_DROP); 1567 + finalize(bq->xdpsq, true, true); 1568 + } 1569 + if (bq->act_mask & LIBETH_XDP_REDIRECT) 1570 + xdp_do_flush(); 1571 + 1572 + rcu_read_unlock(); 1573 + } 1574 + 1575 + /* 1576 + * Helpers to reduce boilerplate code in drivers. 1577 + * 1578 + * Typical driver Rx flow would be (excl. bulk and buff init, frag attach): 1579 + * 1580 + * LIBETH_XDP_DEFINE_START(); 1581 + * LIBETH_XDP_DEFINE_FLUSH_TX(static driver_xdp_flush_tx, driver_xdp_tx_prep, 1582 + * driver_xdp_xmit); 1583 + * LIBETH_XDP_DEFINE_RUN(static driver_xdp_run, driver_xdp_run_prog, 1584 + * driver_xdp_flush_tx, driver_populate_skb); 1585 + * LIBETH_XDP_DEFINE_FINALIZE(static driver_xdp_finalize_rx, 1586 + * driver_xdp_flush_tx, driver_xdp_finalize_sq); 1587 + * LIBETH_XDP_DEFINE_END(); 1588 + * 1589 + * This will build a set of 4 static functions. 
The compiler is free to decide 1590 + * whether to inline them. 1591 + * Then, in the NAPI polling function: 1592 + * 1593 + * while (packets < budget) { 1594 + * // ... 1595 + * driver_xdp_run(xdp, &bq, napi, &rs, desc); 1596 + * } 1597 + * driver_xdp_finalize_rx(&bq); 1598 + */ 1599 + 1600 + #define LIBETH_XDP_DEFINE_START() \ 1601 + __diag_push(); \ 1602 + __diag_ignore(GCC, 8, "-Wold-style-declaration", \ 1603 + "Allow specifying \'static\' after the return type") 1604 + 1605 + /** 1606 + * LIBETH_XDP_DEFINE_TIMER - define a driver XDPSQ cleanup timer callback 1607 + * @name: name of the function to define 1608 + * @poll: Tx polling/completion function 1609 + */ 1610 + #define LIBETH_XDP_DEFINE_TIMER(name, poll) \ 1611 + void name(struct work_struct *work) \ 1612 + { \ 1613 + libeth_xdpsq_run_timer(work, poll); \ 1614 + } 1615 + 1616 + /** 1617 + * LIBETH_XDP_DEFINE_FLUSH_TX - define a driver ``XDP_TX`` bulk flush function 1618 + * @name: name of the function to define 1619 + * @prep: driver callback to clean an XDPSQ 1620 + * @xmit: driver callback to write a HW Tx descriptor 1621 + */ 1622 + #define LIBETH_XDP_DEFINE_FLUSH_TX(name, prep, xmit) \ 1623 + __LIBETH_XDP_DEFINE_FLUSH_TX(name, prep, xmit, xdp) 1624 + 1625 + #define __LIBETH_XDP_DEFINE_FLUSH_TX(name, prep, xmit, pfx) \ 1626 + bool name(struct libeth_xdp_tx_bulk *bq, u32 flags) \ 1627 + { \ 1628 + return libeth_##pfx##_tx_flush_bulk(bq, flags, prep, xmit); \ 1629 + } 1630 + 1631 + /** 1632 + * LIBETH_XDP_DEFINE_FLUSH_XMIT - define a driver XDP xmit bulk flush function 1633 + * @name: name of the function to define 1634 + * @prep: driver callback to clean an XDPSQ 1635 + * @xmit: driver callback to write a HW Tx descriptor 1636 + */ 1637 + #define LIBETH_XDP_DEFINE_FLUSH_XMIT(name, prep, xmit) \ 1638 + bool name(struct libeth_xdp_tx_bulk *bq, u32 flags) \ 1639 + { \ 1640 + return libeth_xdp_xmit_flush_bulk(bq, flags, prep, xmit); \ 1641 + } 1642 + 1643 + /** 1644 + * LIBETH_XDP_DEFINE_RUN_PROG - define 
a driver XDP program run function 1645 + * @name: name of the function to define 1646 + * @flush: driver callback to flush an ``XDP_TX`` bulk 1647 + */ 1648 + #define LIBETH_XDP_DEFINE_RUN_PROG(name, flush) \ 1649 + bool __LIBETH_XDP_DEFINE_RUN_PROG(name, flush, xdp) 1650 + 1651 + #define __LIBETH_XDP_DEFINE_RUN_PROG(name, flush, pfx) \ 1652 + name(struct libeth_xdp_buff *xdp, struct libeth_xdp_tx_bulk *bq) \ 1653 + { \ 1654 + return libeth_##pfx##_run_prog(xdp, bq, flush); \ 1655 + } 1656 + 1657 + /** 1658 + * LIBETH_XDP_DEFINE_RUN_PASS - define a driver buffer process + pass function 1659 + * @name: name of the function to define 1660 + * @run: driver callback to run XDP program (above) 1661 + * @populate: driver callback to fill an skb with HW descriptor info 1662 + */ 1663 + #define LIBETH_XDP_DEFINE_RUN_PASS(name, run, populate) \ 1664 + void __LIBETH_XDP_DEFINE_RUN_PASS(name, run, populate, xdp) 1665 + 1666 + #define __LIBETH_XDP_DEFINE_RUN_PASS(name, run, populate, pfx) \ 1667 + name(struct libeth_xdp_buff *xdp, struct libeth_xdp_tx_bulk *bq, \ 1668 + struct napi_struct *napi, struct libeth_rq_napi_stats *ss, \ 1669 + const void *desc) \ 1670 + { \ 1671 + return libeth_##pfx##_run_pass(xdp, bq, napi, ss, desc, run, \ 1672 + populate); \ 1673 + } 1674 + 1675 + /** 1676 + * LIBETH_XDP_DEFINE_RUN - define a driver buffer process, run + pass function 1677 + * @name: name of the function to define 1678 + * @run: name of the XDP prog run function to define 1679 + * @flush: driver callback to flush an ``XDP_TX`` bulk 1680 + * @populate: driver callback to fill an skb with HW descriptor info 1681 + */ 1682 + #define LIBETH_XDP_DEFINE_RUN(name, run, flush, populate) \ 1683 + __LIBETH_XDP_DEFINE_RUN(name, run, flush, populate, XDP) 1684 + 1685 + #define __LIBETH_XDP_DEFINE_RUN(name, run, flush, populate, pfx) \ 1686 + LIBETH_##pfx##_DEFINE_RUN_PROG(static run, flush); \ 1687 + LIBETH_##pfx##_DEFINE_RUN_PASS(name, run, populate) 1688 + 1689 + /** 1690 + * 
LIBETH_XDP_DEFINE_FINALIZE - define a driver Rx NAPI poll finalize function 1691 + * @name: name of the function to define 1692 + * @flush: driver callback to flush an ``XDP_TX`` bulk 1693 + * @finalize: driver callback to finalize an XDPSQ and run the timer 1694 + */ 1695 + #define LIBETH_XDP_DEFINE_FINALIZE(name, flush, finalize) \ 1696 + __LIBETH_XDP_DEFINE_FINALIZE(name, flush, finalize, xdp) 1697 + 1698 + #define __LIBETH_XDP_DEFINE_FINALIZE(name, flush, finalize, pfx) \ 1699 + void name(struct libeth_xdp_tx_bulk *bq) \ 1700 + { \ 1701 + libeth_##pfx##_finalize_rx(bq, flush, finalize); \ 1702 + } 1703 + 1704 + #define LIBETH_XDP_DEFINE_END() __diag_pop() 1705 + 1706 + /* XMO */ 1707 + 1708 + /** 1709 + * libeth_xdp_buff_to_rq - get RQ pointer from an XDP buffer pointer 1710 + * @xdp: &libeth_xdp_buff corresponding to the queue 1711 + * @type: typeof() of the driver Rx queue structure 1712 + * @member: name of &xdp_rxq_info inside @type 1713 + * 1714 + * Often times, pointer to the RQ is needed when reading/filling metadata from 1715 + * HW descriptors. The helper can be used to quickly jump from an XDP buffer 1716 + * to the queue corresponding to its &xdp_rxq_info without introducing 1717 + * additional fields (&libeth_xdp_buff is precisely 1 cacheline long on x64). 1718 + */ 1719 + #define libeth_xdp_buff_to_rq(xdp, type, member) \ 1720 + container_of_const((xdp)->base.rxq, type, member) 1721 + 1722 + /** 1723 + * libeth_xdpmo_rx_hash - convert &libeth_rx_pt to an XDP RSS hash metadata 1724 + * @hash: pointer to the variable to write the hash to 1725 + * @rss_type: pointer to the variable to write the hash type to 1726 + * @val: hash value from the HW descriptor 1727 + * @pt: libeth parsed packet type 1728 + * 1729 + * Handle zeroed/non-available hash and convert libeth parsed packet type to 1730 + * the corresponding XDP RSS hash type. To be called at the end of 1731 + * xdp_metadata_ops idpf_xdpmo::xmo_rx_hash() implementation. 
1732 + * Note that if the driver doesn't use a constant packet type lookup table but 1733 + * generates it at runtime, it must call libeth_rx_pt_gen_hash_type(pt) to 1734 + * generate XDP RSS hash type for each packet type. 1735 + * 1736 + * Return: 0 on success, -ENODATA when the hash is not available. 1737 + */ 1738 + static inline int libeth_xdpmo_rx_hash(u32 *hash, 1739 + enum xdp_rss_hash_type *rss_type, 1740 + u32 val, struct libeth_rx_pt pt) 1741 + { 1742 + if (unlikely(!val)) 1743 + return -ENODATA; 1744 + 1745 + *hash = val; 1746 + *rss_type = pt.hash_type; 1747 + 1748 + return 0; 1749 + } 1750 + 1751 + /* Tx buffer completion */ 1752 + 1753 + void libeth_xdp_return_buff_bulk(const struct skb_shared_info *sinfo, 1754 + struct xdp_frame_bulk *bq, bool frags); 1755 + void libeth_xsk_buff_free_slow(struct libeth_xdp_buff *xdp); 1756 + 1757 + /** 1758 + * __libeth_xdp_complete_tx - complete sent XDPSQE 1759 + * @sqe: SQ element / Tx buffer to complete 1760 + * @cp: Tx polling/completion params 1761 + * @bulk: internal callback to bulk-free ``XDP_TX`` buffers 1762 + * @xsk: internal callback to free XSk ``XDP_TX`` buffers 1763 + * 1764 + * Use the non-underscored version in drivers instead. This one is shared 1765 + * internally with libeth_tx_complete_any(). 1766 + * Complete an XDPSQE of any type of XDP frame. This includes DMA unmapping 1767 + * when needed, buffer freeing, stats update, and SQE invalidation. 
1768 + */ 1769 + static __always_inline void 1770 + __libeth_xdp_complete_tx(struct libeth_sqe *sqe, struct libeth_cq_pp *cp, 1771 + typeof(libeth_xdp_return_buff_bulk) bulk, 1772 + typeof(libeth_xsk_buff_free_slow) xsk) 1773 + { 1774 + enum libeth_sqe_type type = sqe->type; 1775 + 1776 + switch (type) { 1777 + case LIBETH_SQE_EMPTY: 1778 + return; 1779 + case LIBETH_SQE_XDP_XMIT: 1780 + case LIBETH_SQE_XDP_XMIT_FRAG: 1781 + dma_unmap_page(cp->dev, dma_unmap_addr(sqe, dma), 1782 + dma_unmap_len(sqe, len), DMA_TO_DEVICE); 1783 + break; 1784 + default: 1785 + break; 1786 + } 1787 + 1788 + switch (type) { 1789 + case LIBETH_SQE_XDP_TX: 1790 + bulk(sqe->sinfo, cp->bq, sqe->nr_frags != 1); 1791 + break; 1792 + case LIBETH_SQE_XDP_XMIT: 1793 + xdp_return_frame_bulk(sqe->xdpf, cp->bq); 1794 + break; 1795 + case LIBETH_SQE_XSK_TX: 1796 + case LIBETH_SQE_XSK_TX_FRAG: 1797 + xsk(sqe->xsk); 1798 + break; 1799 + default: 1800 + break; 1801 + } 1802 + 1803 + switch (type) { 1804 + case LIBETH_SQE_XDP_TX: 1805 + case LIBETH_SQE_XDP_XMIT: 1806 + case LIBETH_SQE_XSK_TX: 1807 + cp->xdp_tx -= sqe->nr_frags; 1808 + 1809 + cp->xss->packets++; 1810 + cp->xss->bytes += sqe->bytes; 1811 + break; 1812 + default: 1813 + break; 1814 + } 1815 + 1816 + sqe->type = LIBETH_SQE_EMPTY; 1817 + } 1818 + 1819 + static inline void libeth_xdp_complete_tx(struct libeth_sqe *sqe, 1820 + struct libeth_cq_pp *cp) 1821 + { 1822 + __libeth_xdp_complete_tx(sqe, cp, libeth_xdp_return_buff_bulk, 1823 + libeth_xsk_buff_free_slow); 1824 + } 1825 + 1826 + /* Misc */ 1827 + 1828 + u32 libeth_xdp_queue_threshold(u32 count); 1829 + 1830 + void __libeth_xdp_set_features(struct net_device *dev, 1831 + const struct xdp_metadata_ops *xmo, 1832 + u32 zc_segs, 1833 + const struct xsk_tx_metadata_ops *tmo); 1834 + void libeth_xdp_set_redirect(struct net_device *dev, bool enable); 1835 + 1836 + /** 1837 + * libeth_xdp_set_features - set XDP features for netdev 1838 + * @dev: &net_device to configure 1839 + * @...: optional 
params, see __libeth_xdp_set_features() 1840 + * 1841 + * Set all the features libeth_xdp supports, including .ndo_xdp_xmit(). That 1842 + * said, it should be used only when XDPSQs are always available regardless 1843 + * of whether an XDP prog is attached to @dev. 1844 + */ 1845 + #define libeth_xdp_set_features(dev, ...) \ 1846 + CONCATENATE(__libeth_xdp_feat, \ 1847 + COUNT_ARGS(__VA_ARGS__))(dev, ##__VA_ARGS__) 1848 + 1849 + #define __libeth_xdp_feat0(dev) \ 1850 + __libeth_xdp_set_features(dev, NULL, 0, NULL) 1851 + #define __libeth_xdp_feat1(dev, xmo) \ 1852 + __libeth_xdp_set_features(dev, xmo, 0, NULL) 1853 + #define __libeth_xdp_feat2(dev, xmo, zc_segs) \ 1854 + __libeth_xdp_set_features(dev, xmo, zc_segs, NULL) 1855 + #define __libeth_xdp_feat3(dev, xmo, zc_segs, tmo) \ 1856 + __libeth_xdp_set_features(dev, xmo, zc_segs, tmo) 1857 + 1858 + /** 1859 + * libeth_xdp_set_features_noredir - enable all libeth_xdp features w/o redir 1860 + * @dev: target &net_device 1861 + * @...: optional params, see __libeth_xdp_set_features() 1862 + * 1863 + * Enable everything except the .ndo_xdp_xmit() feature, use when XDPSQs are 1864 + * not available right after netdev registration. 1865 + */ 1866 + #define libeth_xdp_set_features_noredir(dev, ...) \ 1867 + __libeth_xdp_set_features_noredir(dev, __UNIQUE_ID(dev_), \ 1868 + ##__VA_ARGS__) 1869 + 1870 + #define __libeth_xdp_set_features_noredir(dev, ud, ...) do { \ 1871 + struct net_device *ud = (dev); \ 1872 + \ 1873 + libeth_xdp_set_features(ud, ##__VA_ARGS__); \ 1874 + libeth_xdp_set_redirect(ud, false); \ 1875 + } while (0) 1876 + 1877 + #define libeth_xsktmo ((const void *)GOLDEN_RATIO_PRIME) 1878 + 1879 + #endif /* __LIBETH_XDP_H */
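The libeth_xdp_set_features() wrapper above picks one of the __libeth_xdp_feat0..3 helpers from the number of optional arguments. The following is a minimal userspace sketch of that dispatch pattern, not the kernel implementation: every `demo_`/`DEMO_` name is hypothetical, and instead of the kernel's CONCATENATE() and COUNT_ARGS() it counts the optional arguments with a standard C99 argument-shifting macro, so it does not depend on `##__VA_ARGS__`.

```c
#include <stddef.h>

/* Hypothetical stand-in for the fields the real helper would configure. */
struct demo_dev {
	const void *xmo;	/* Rx metadata ops */
	unsigned int zc_segs;	/* zero-copy segment limit */
	const void *tmo;	/* XSk Tx metadata ops */
};

/* Full-argument function that every shorthand macro ends up calling. */
static inline void demo_set_features_full(struct demo_dev *dev,
					  const void *xmo,
					  unsigned int zc_segs,
					  const void *tmo)
{
	dev->xmo = xmo;
	dev->zc_segs = zc_segs;
	dev->tmo = tmo;
}

/* Two-level paste so that the count is expanded before concatenation. */
#define DEMO_CAT_(a, b)		a##b
#define DEMO_CAT(a, b)		DEMO_CAT_(a, b)

/*
 * Count the arguments that follow the mandatory first one (0..3).
 * Each extra argument shifts the "3, 2, 1, 0" list one slot to the right,
 * so the value landing in slot n is the number of optional arguments.
 */
#define DEMO_PICK(_1, _2, _3, _4, n, ...)	n
#define DEMO_COUNT_EXTRA(...)	DEMO_PICK(__VA_ARGS__, 3, 2, 1, 0, ~)

/* One helper per arity; omitted parameters get their defaults here. */
#define demo_feat0(dev)				\
	demo_set_features_full(dev, NULL, 0, NULL)
#define demo_feat1(dev, xmo)			\
	demo_set_features_full(dev, xmo, 0, NULL)
#define demo_feat2(dev, xmo, zc_segs)		\
	demo_set_features_full(dev, xmo, zc_segs, NULL)
#define demo_feat3(dev, xmo, zc_segs, tmo)	\
	demo_set_features_full(dev, xmo, zc_segs, tmo)

/* Public entry point: expands to demo_feat<N>(args...). */
#define demo_set_features(...)			\
	DEMO_CAT(demo_feat, DEMO_COUNT_EXTRA(__VA_ARGS__))(__VA_ARGS__)
```

With this in place, `demo_set_features(&dev)` and `demo_set_features(&dev, &xmo, 4)` resolve at preprocessing time to the 0- and 2-argument helpers respectively, so callers spell out only the parameters they care about and no runtime branching is involved.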
+685
include/net/libeth/xsk.h
··· 1 + /* SPDX-License-Identifier: GPL-2.0-only */ 2 + /* Copyright (C) 2025 Intel Corporation */ 3 + 4 + #ifndef __LIBETH_XSK_H 5 + #define __LIBETH_XSK_H 6 + 7 + #include <net/libeth/xdp.h> 8 + #include <net/xdp_sock_drv.h> 9 + 10 + /* ``XDP_TXMD_FLAGS_VALID`` is defined only under ``CONFIG_XDP_SOCKETS`` */ 11 + #ifdef XDP_TXMD_FLAGS_VALID 12 + static_assert(XDP_TXMD_FLAGS_VALID <= LIBETH_XDP_TX_XSKMD); 13 + #endif 14 + 15 + /* ``XDP_TX`` bulking */ 16 + 17 + /** 18 + * libeth_xsk_tx_queue_head - internal helper for queueing XSk ``XDP_TX`` head 19 + * @bq: XDP Tx bulk to queue the head frag to 20 + * @xdp: XSk buffer with the head to queue 21 + * 22 + * Return: false if it's the only frag of the frame, true if it's an S/G frame. 23 + */ 24 + static inline bool libeth_xsk_tx_queue_head(struct libeth_xdp_tx_bulk *bq, 25 + struct libeth_xdp_buff *xdp) 26 + { 27 + bq->bulk[bq->count++] = (typeof(*bq->bulk)){ 28 + .xsk = xdp, 29 + __libeth_xdp_tx_len(xdp->base.data_end - xdp->data, 30 + LIBETH_XDP_TX_FIRST), 31 + }; 32 + 33 + if (likely(!xdp_buff_has_frags(&xdp->base))) 34 + return false; 35 + 36 + bq->bulk[bq->count - 1].flags |= LIBETH_XDP_TX_MULTI; 37 + 38 + return true; 39 + } 40 + 41 + /** 42 + * libeth_xsk_tx_queue_frag - internal helper for queueing XSk ``XDP_TX`` frag 43 + * @bq: XDP Tx bulk to queue the frag to 44 + * @frag: XSk frag to queue 45 + */ 46 + static inline void libeth_xsk_tx_queue_frag(struct libeth_xdp_tx_bulk *bq, 47 + struct libeth_xdp_buff *frag) 48 + { 49 + bq->bulk[bq->count++] = (typeof(*bq->bulk)){ 50 + .xsk = frag, 51 + __libeth_xdp_tx_len(frag->base.data_end - frag->data), 52 + }; 53 + } 54 + 55 + /** 56 + * libeth_xsk_tx_queue_bulk - internal helper for queueing XSk ``XDP_TX`` frame 57 + * @bq: XDP Tx bulk to queue the frame to 58 + * @xdp: XSk buffer to queue 59 + * @flush_bulk: driver callback to flush the bulk to the HW queue 60 + * 61 + * Return: true on success, false on flush error. 
62 + */ 63 + static __always_inline bool 64 + libeth_xsk_tx_queue_bulk(struct libeth_xdp_tx_bulk *bq, 65 + struct libeth_xdp_buff *xdp, 66 + bool (*flush_bulk)(struct libeth_xdp_tx_bulk *bq, 67 + u32 flags)) 68 + { 69 + bool ret = true; 70 + 71 + if (unlikely(bq->count == LIBETH_XDP_TX_BULK) && 72 + unlikely(!flush_bulk(bq, LIBETH_XDP_TX_XSK))) { 73 + libeth_xsk_buff_free_slow(xdp); 74 + return false; 75 + } 76 + 77 + if (!libeth_xsk_tx_queue_head(bq, xdp)) 78 + goto out; 79 + 80 + for (const struct libeth_xdp_buff *head = xdp; ; ) { 81 + xdp = container_of(xsk_buff_get_frag(&head->base), 82 + typeof(*xdp), base); 83 + if (!xdp) 84 + break; 85 + 86 + if (unlikely(bq->count == LIBETH_XDP_TX_BULK) && 87 + unlikely(!flush_bulk(bq, LIBETH_XDP_TX_XSK))) { 88 + ret = false; 89 + break; 90 + } 91 + 92 + libeth_xsk_tx_queue_frag(bq, xdp); 93 + } 94 + 95 + out: 96 + bq->bulk[bq->count - 1].flags |= LIBETH_XDP_TX_LAST; 97 + 98 + return ret; 99 + } 100 + 101 + /** 102 + * libeth_xsk_tx_fill_buf - internal helper to fill XSk ``XDP_TX`` &libeth_sqe 103 + * @frm: XDP Tx frame from the bulk 104 + * @i: index on the HW queue 105 + * @sq: XDPSQ abstraction for the queue 106 + * @priv: private data 107 + * 108 + * Return: XDP Tx descriptor with the synced DMA and other info to pass to 109 + * the driver callback. 
110 + */ 111 + static inline struct libeth_xdp_tx_desc 112 + libeth_xsk_tx_fill_buf(struct libeth_xdp_tx_frame frm, u32 i, 113 + const struct libeth_xdpsq *sq, u64 priv) 114 + { 115 + struct libeth_xdp_buff *xdp = frm.xsk; 116 + struct libeth_xdp_tx_desc desc = { 117 + .addr = xsk_buff_xdp_get_dma(&xdp->base), 118 + .opts = frm.opts, 119 + }; 120 + struct libeth_sqe *sqe; 121 + 122 + xsk_buff_raw_dma_sync_for_device(sq->pool, desc.addr, desc.len); 123 + 124 + sqe = &sq->sqes[i]; 125 + sqe->xsk = xdp; 126 + 127 + if (!(desc.flags & LIBETH_XDP_TX_FIRST)) { 128 + sqe->type = LIBETH_SQE_XSK_TX_FRAG; 129 + return desc; 130 + } 131 + 132 + sqe->type = LIBETH_SQE_XSK_TX; 133 + libeth_xdp_tx_fill_stats(sqe, &desc, 134 + xdp_get_shared_info_from_buff(&xdp->base)); 135 + 136 + return desc; 137 + } 138 + 139 + /** 140 + * libeth_xsk_tx_flush_bulk - wrapper to define flush of XSk ``XDP_TX`` bulk 141 + * @bq: bulk to flush 142 + * @flags: Tx flags, see __libeth_xdp_tx_flush_bulk() 143 + * @prep: driver callback to prepare the queue 144 + * @xmit: driver callback to fill a HW descriptor 145 + * 146 + * Use via LIBETH_XSK_DEFINE_FLUSH_TX() to define an XSk ``XDP_TX`` driver 147 + * callback. 148 + */ 149 + #define libeth_xsk_tx_flush_bulk(bq, flags, prep, xmit) \ 150 + __libeth_xdp_tx_flush_bulk(bq, (flags) | LIBETH_XDP_TX_XSK, prep, \ 151 + libeth_xsk_tx_fill_buf, xmit) 152 + 153 + /* XSk TMO */ 154 + 155 + /** 156 + * libeth_xsktmo_req_csum - XSk Tx metadata op to request checksum offload 157 + * @csum_start: unused 158 + * @csum_offset: unused 159 + * @priv: &libeth_xdp_tx_desc from the filling helper 160 + * 161 + * Generic implementation of ::tmo_request_checksum. Works only when HW doesn't 162 + * require filling checksum offsets and other parameters beside the checksum 163 + * request bit. 164 + * Consider using within @libeth_xsktmo unless the driver requires HW-specific 165 + * callbacks. 
166 + */ 167 + static inline void libeth_xsktmo_req_csum(u16 csum_start, u16 csum_offset, 168 + void *priv) 169 + { 170 + ((struct libeth_xdp_tx_desc *)priv)->flags |= LIBETH_XDP_TX_CSUM; 171 + } 172 + 173 + /* Only to inline the callbacks below, use @libeth_xsktmo in drivers instead */ 174 + static const struct xsk_tx_metadata_ops __libeth_xsktmo = { 175 + .tmo_request_checksum = libeth_xsktmo_req_csum, 176 + }; 177 + 178 + /** 179 + * __libeth_xsk_xmit_fill_buf_md - internal helper to prepare XSk xmit w/meta 180 + * @xdesc: &xdp_desc from the XSk buffer pool 181 + * @sq: XDPSQ abstraction for the queue 182 + * @priv: XSk Tx metadata ops 183 + * 184 + * Same as __libeth_xsk_xmit_fill_buf(), but requests metadata pointer and 185 + * fills additional fields in &libeth_xdp_tx_desc to ask for metadata offload. 186 + * 187 + * Return: XDP Tx descriptor with the DMA, metadata request bits, and other 188 + * info to pass to the driver callback. 189 + */ 190 + static __always_inline struct libeth_xdp_tx_desc 191 + __libeth_xsk_xmit_fill_buf_md(const struct xdp_desc *xdesc, 192 + const struct libeth_xdpsq *sq, 193 + u64 priv) 194 + { 195 + const struct xsk_tx_metadata_ops *tmo = libeth_xdp_priv_to_ptr(priv); 196 + struct libeth_xdp_tx_desc desc; 197 + struct xdp_desc_ctx ctx; 198 + 199 + ctx = xsk_buff_raw_get_ctx(sq->pool, xdesc->addr); 200 + desc = (typeof(desc)){ 201 + .addr = ctx.dma, 202 + __libeth_xdp_tx_len(xdesc->len), 203 + }; 204 + 205 + BUILD_BUG_ON(!__builtin_constant_p(tmo == libeth_xsktmo)); 206 + tmo = tmo == libeth_xsktmo ? 
&__libeth_xsktmo : tmo; 207 + 208 + xsk_tx_metadata_request(ctx.meta, tmo, &desc); 209 + 210 + return desc; 211 + } 212 + 213 + /* XSk xmit implementation */ 214 + 215 + /** 216 + * __libeth_xsk_xmit_fill_buf - internal helper to prepare XSk xmit w/o meta 217 + * @xdesc: &xdp_desc from the XSk buffer pool 218 + * @sq: XDPSQ abstraction for the queue 219 + * 220 + * Return: XDP Tx descriptor with the DMA and other info to pass to 221 + * the driver callback. 222 + */ 223 + static inline struct libeth_xdp_tx_desc 224 + __libeth_xsk_xmit_fill_buf(const struct xdp_desc *xdesc, 225 + const struct libeth_xdpsq *sq) 226 + { 227 + return (struct libeth_xdp_tx_desc){ 228 + .addr = xsk_buff_raw_get_dma(sq->pool, xdesc->addr), 229 + __libeth_xdp_tx_len(xdesc->len), 230 + }; 231 + } 232 + 233 + /** 234 + * libeth_xsk_xmit_fill_buf - internal helper to prepare an XSk xmit 235 + * @frm: &xdp_desc from the XSk buffer pool 236 + * @i: index on the HW queue 237 + * @sq: XDPSQ abstraction for the queue 238 + * @priv: XSk Tx metadata ops 239 + * 240 + * Depending on the metadata ops presence (determined at compile time), calls 241 + * the quickest helper to build a libeth XDP Tx descriptor. 242 + * 243 + * Return: XDP Tx descriptor with the synced DMA, metadata request bits, 244 + * and other info to pass to the driver callback. 245 + */ 246 + static __always_inline struct libeth_xdp_tx_desc 247 + libeth_xsk_xmit_fill_buf(struct libeth_xdp_tx_frame frm, u32 i, 248 + const struct libeth_xdpsq *sq, u64 priv) 249 + { 250 + struct libeth_xdp_tx_desc desc; 251 + 252 + if (priv) 253 + desc = __libeth_xsk_xmit_fill_buf_md(&frm.desc, sq, priv); 254 + else 255 + desc = __libeth_xsk_xmit_fill_buf(&frm.desc, sq); 256 + 257 + desc.flags |= xsk_is_eop_desc(&frm.desc) ? 
LIBETH_XDP_TX_LAST : 0; 258 + 259 + xsk_buff_raw_dma_sync_for_device(sq->pool, desc.addr, desc.len); 260 + 261 + return desc; 262 + } 263 + 264 + /** 265 + * libeth_xsk_xmit_do_bulk - send XSk xmit frames 266 + * @pool: XSk buffer pool containing the frames to send 267 + * @xdpsq: opaque pointer to driver's XDPSQ struct 268 + * @budget: maximum number of frames that can be sent 269 + * @tmo: optional XSk Tx metadata ops 270 + * @prep: driver callback to build a &libeth_xdpsq 271 + * @xmit: driver callback to put frames to a HW queue 272 + * @finalize: driver callback to start a transmission 273 + * 274 + * Implements generic XSk xmit. Always turns on XSk Tx wakeup as it's assumed 275 + * lazy cleaning is used and interrupts are disabled for the queue. 276 + * HW descriptor filling is unrolled by ``LIBETH_XDP_TX_BATCH`` to optimize 277 + * writes. 278 + * Note that unlike other XDP Tx ops, the queue must be locked and cleaned 279 + * prior to calling this function, so that the available @budget is already known. 280 + * @prep must only build a &libeth_xdpsq and return ``U32_MAX``. 281 + * 282 + * Return: false if @budget was exhausted, true otherwise.
283 + */ 284 + static __always_inline bool 285 + libeth_xsk_xmit_do_bulk(struct xsk_buff_pool *pool, void *xdpsq, u32 budget, 286 + const struct xsk_tx_metadata_ops *tmo, 287 + u32 (*prep)(void *xdpsq, struct libeth_xdpsq *sq), 288 + void (*xmit)(struct libeth_xdp_tx_desc desc, u32 i, 289 + const struct libeth_xdpsq *sq, u64 priv), 290 + void (*finalize)(void *xdpsq, bool sent, bool flush)) 291 + { 292 + const struct libeth_xdp_tx_frame *bulk; 293 + bool wake; 294 + u32 n; 295 + 296 + wake = xsk_uses_need_wakeup(pool); 297 + if (wake) 298 + xsk_clear_tx_need_wakeup(pool); 299 + 300 + n = xsk_tx_peek_release_desc_batch(pool, budget); 301 + bulk = container_of(&pool->tx_descs[0], typeof(*bulk), desc); 302 + 303 + libeth_xdp_tx_xmit_bulk(bulk, xdpsq, n, true, 304 + libeth_xdp_ptr_to_priv(tmo), prep, 305 + libeth_xsk_xmit_fill_buf, xmit); 306 + finalize(xdpsq, n, true); 307 + 308 + if (wake) 309 + xsk_set_tx_need_wakeup(pool); 310 + 311 + return n < budget; 312 + } 313 + 314 + /* Rx polling path */ 315 + 316 + /** 317 + * libeth_xsk_tx_init_bulk - initialize XDP Tx bulk for an XSk Rx NAPI poll 318 + * @bq: bulk to initialize 319 + * @prog: RCU pointer to the XDP program (never %NULL) 320 + * @dev: target &net_device 321 + * @xdpsqs: array of driver XDPSQ structs 322 + * @num: number of active XDPSQs, the above array length 323 + * 324 + * Should be called on an onstack XDP Tx bulk before the XSk NAPI polling loop. 325 + * Initializes all the needed fields to run libeth_xdp functions. 326 + * Never checks if @prog is %NULL or @num == 0 as XDP must always be enabled 327 + * when hitting this path. 
328 + */ 329 + #define libeth_xsk_tx_init_bulk(bq, prog, dev, xdpsqs, num) \ 330 + __libeth_xdp_tx_init_bulk(bq, prog, dev, xdpsqs, num, true, \ 331 + __UNIQUE_ID(bq_), __UNIQUE_ID(nqs_)) 332 + 333 + struct libeth_xdp_buff *libeth_xsk_buff_add_frag(struct libeth_xdp_buff *head, 334 + struct libeth_xdp_buff *xdp); 335 + 336 + /** 337 + * libeth_xsk_process_buff - attach XSk Rx buffer to &libeth_xdp_buff 338 + * @head: head XSk buffer to attach the XSk buffer to (or %NULL) 339 + * @xdp: XSk buffer to process 340 + * @len: received data length from the descriptor 341 + * 342 + * If @head == %NULL, treats the XSk buffer as head and initializes 343 + * the required fields. Otherwise, attaches the buffer as a frag. 344 + * Already performs DMA sync-for-CPU and frame start prefetch 345 + * (for head buffers only). 346 + * 347 + * Return: head XSk buffer on success or if the descriptor must be skipped 348 + * (empty), %NULL if there is no space for a new frag. 349 + */ 350 + static inline struct libeth_xdp_buff * 351 + libeth_xsk_process_buff(struct libeth_xdp_buff *head, 352 + struct libeth_xdp_buff *xdp, u32 len) 353 + { 354 + if (unlikely(!len)) { 355 + libeth_xsk_buff_free_slow(xdp); 356 + return head; 357 + } 358 + 359 + xsk_buff_set_size(&xdp->base, len); 360 + xsk_buff_dma_sync_for_cpu(&xdp->base); 361 + 362 + if (head) 363 + return libeth_xsk_buff_add_frag(head, xdp); 364 + 365 + prefetch(xdp->data); 366 + 367 + return xdp; 368 + } 369 + 370 + void libeth_xsk_buff_stats_frags(struct libeth_rq_napi_stats *rs, 371 + const struct libeth_xdp_buff *xdp); 372 + 373 + u32 __libeth_xsk_run_prog_slow(struct libeth_xdp_buff *xdp, 374 + const struct libeth_xdp_tx_bulk *bq, 375 + enum xdp_action act, int ret); 376 + 377 + /** 378 + * __libeth_xsk_run_prog - run XDP program on XSk buffer 379 + * @xdp: XSk buffer to run the prog on 380 + * @bq: buffer bulk for ``XDP_TX`` queueing 381 + * 382 + * Internal inline abstraction to run XDP program on XSk Rx path. 
Handles 383 + * only the most common ``XDP_REDIRECT`` inline, the rest is processed 384 + * externally. 385 + * Reports an XDP prog exception on errors. 386 + * 387 + * Return: libeth_xdp prog verdict depending on the prog's verdict. 388 + */ 389 + static __always_inline u32 390 + __libeth_xsk_run_prog(struct libeth_xdp_buff *xdp, 391 + const struct libeth_xdp_tx_bulk *bq) 392 + { 393 + enum xdp_action act; 394 + int ret = 0; 395 + 396 + act = bpf_prog_run_xdp(bq->prog, &xdp->base); 397 + if (unlikely(act != XDP_REDIRECT)) 398 + rest: 399 + return __libeth_xsk_run_prog_slow(xdp, bq, act, ret); 400 + 401 + ret = xdp_do_redirect(bq->dev, &xdp->base, bq->prog); 402 + if (unlikely(ret)) 403 + goto rest; 404 + 405 + return LIBETH_XDP_REDIRECT; 406 + } 407 + 408 + /** 409 + * libeth_xsk_run_prog - run XDP program on XSk path and handle all verdicts 410 + * @xdp: XSk buffer to process 411 + * @bq: XDP Tx bulk to queue ``XDP_TX`` buffers 412 + * @fl: driver ``XDP_TX`` bulk flush callback 413 + * 414 + * Run the attached XDP program and handle all possible verdicts. 415 + * Prefer using it via LIBETH_XSK_DEFINE_RUN{,_PASS,_PROG}(). 416 + * 417 + * Return: libeth_xdp prog verdict depending on the prog's verdict. 
418 + */ 419 + #define libeth_xsk_run_prog(xdp, bq, fl) \ 420 + __libeth_xdp_run_flush(xdp, bq, __libeth_xsk_run_prog, \ 421 + libeth_xsk_tx_queue_bulk, fl) 422 + 423 + /** 424 + * __libeth_xsk_run_pass - helper to run XDP program and handle the result 425 + * @xdp: XSk buffer to process 426 + * @bq: XDP Tx bulk to queue ``XDP_TX`` frames 427 + * @napi: NAPI to build an skb and pass it up the stack 428 + * @rs: onstack libeth RQ stats 429 + * @md: metadata that should be filled to the XSk buffer 430 + * @prep: callback for filling the metadata 431 + * @run: driver wrapper to run XDP program 432 + * @populate: driver callback to populate an skb with the HW descriptor data 433 + * 434 + * Inline abstraction, XSk's counterpart of __libeth_xdp_run_pass(), see its 435 + * doc for details. 436 + * 437 + * Return: false if the polling loop must be exited due to lack of free 438 + * buffers, true otherwise. 439 + */ 440 + static __always_inline bool 441 + __libeth_xsk_run_pass(struct libeth_xdp_buff *xdp, 442 + struct libeth_xdp_tx_bulk *bq, struct napi_struct *napi, 443 + struct libeth_rq_napi_stats *rs, const void *md, 444 + void (*prep)(struct libeth_xdp_buff *xdp, 445 + const void *md), 446 + u32 (*run)(struct libeth_xdp_buff *xdp, 447 + struct libeth_xdp_tx_bulk *bq), 448 + bool (*populate)(struct sk_buff *skb, 449 + const struct libeth_xdp_buff *xdp, 450 + struct libeth_rq_napi_stats *rs)) 451 + { 452 + struct sk_buff *skb; 453 + u32 act; 454 + 455 + rs->bytes += xdp->base.data_end - xdp->data; 456 + rs->packets++; 457 + 458 + if (unlikely(xdp_buff_has_frags(&xdp->base))) 459 + libeth_xsk_buff_stats_frags(rs, xdp); 460 + 461 + if (prep && (!__builtin_constant_p(!!md) || md)) 462 + prep(xdp, md); 463 + 464 + act = run(xdp, bq); 465 + if (likely(act == LIBETH_XDP_REDIRECT)) 466 + return true; 467 + 468 + if (act != LIBETH_XDP_PASS) 469 + return act != LIBETH_XDP_ABORTED; 470 + 471 + skb = xdp_build_skb_from_zc(&xdp->base); 472 + if (unlikely(!skb)) { 473 + 
+		libeth_xsk_buff_free_slow(xdp);
+		return true;
+	}
+
+	if (unlikely(!populate(skb, xdp, rs))) {
+		napi_consume_skb(skb, true);
+		return true;
+	}
+
+	napi_gro_receive(napi, skb);
+
+	return true;
+}
+
+/**
+ * libeth_xsk_run_pass - helper to run XDP program and handle the result
+ * @xdp: XSk buffer to process
+ * @bq: XDP Tx bulk to queue ``XDP_TX`` frames
+ * @napi: NAPI to build an skb and pass it up the stack
+ * @rs: onstack libeth RQ stats
+ * @desc: pointer to the HW descriptor for that frame
+ * @run: driver wrapper to run XDP program
+ * @populate: driver callback to populate an skb with the HW descriptor data
+ *
+ * Wrapper around the underscored version when "fill the descriptor metadata"
+ * means just writing the pointer to the HW descriptor as @xdp->desc.
+ */
+#define libeth_xsk_run_pass(xdp, bq, napi, rs, desc, run, populate) \
+	__libeth_xsk_run_pass(xdp, bq, napi, rs, desc, libeth_xdp_prep_desc, \
+			      run, populate)
+
+/**
+ * libeth_xsk_finalize_rx - finalize XDPSQ after an XSk NAPI polling loop
+ * @bq: ``XDP_TX`` frame bulk
+ * @flush: driver callback to flush the bulk
+ * @finalize: driver callback to start sending the frames and run the timer
+ *
+ * Flush the bulk if there are frames left to send, kick the queue and flush
+ * the XDP maps.
+ */
+#define libeth_xsk_finalize_rx(bq, flush, finalize) \
+	__libeth_xdp_finalize_rx(bq, LIBETH_XDP_TX_XSK, flush, finalize)
+
+/*
+ * Helpers to reduce boilerplate code in drivers.
+ *
+ * Typical driver XSk Rx flow would be (excl. bulk and buff init, frag attach):
+ *
+ * LIBETH_XDP_DEFINE_START();
+ * LIBETH_XSK_DEFINE_FLUSH_TX(static driver_xsk_flush_tx, driver_xsk_tx_prep,
+ *			      driver_xdp_xmit);
+ * LIBETH_XSK_DEFINE_RUN(static driver_xsk_run, driver_xsk_run_prog,
+ *			 driver_xsk_flush_tx, driver_populate_skb);
+ * LIBETH_XSK_DEFINE_FINALIZE(static driver_xsk_finalize_rx,
+ *			      driver_xsk_flush_tx, driver_xdp_finalize_sq);
+ * LIBETH_XDP_DEFINE_END();
+ *
+ * This will build a set of 4 static functions. The compiler is free to decide
+ * whether to inline them.
+ * Then, in the NAPI polling function:
+ *
+ *	while (packets < budget) {
+ *		// ...
+ *		if (!driver_xsk_run(xdp, &bq, napi, &rs, desc))
+ *			break;
+ *	}
+ *	driver_xsk_finalize_rx(&bq);
+ */
+
+/**
+ * LIBETH_XSK_DEFINE_FLUSH_TX - define a driver XSk ``XDP_TX`` flush function
+ * @name: name of the function to define
+ * @prep: driver callback to clean an XDPSQ
+ * @xmit: driver callback to write a HW Tx descriptor
+ */
+#define LIBETH_XSK_DEFINE_FLUSH_TX(name, prep, xmit) \
+	__LIBETH_XDP_DEFINE_FLUSH_TX(name, prep, xmit, xsk)
+
+/**
+ * LIBETH_XSK_DEFINE_RUN_PROG - define a driver XDP program run function
+ * @name: name of the function to define
+ * @flush: driver callback to flush an XSk ``XDP_TX`` bulk
+ */
+#define LIBETH_XSK_DEFINE_RUN_PROG(name, flush) \
+	u32 __LIBETH_XDP_DEFINE_RUN_PROG(name, flush, xsk)
+
+/**
+ * LIBETH_XSK_DEFINE_RUN_PASS - define a driver buffer process + pass function
+ * @name: name of the function to define
+ * @run: driver callback to run XDP program (above)
+ * @populate: driver callback to fill an skb with HW descriptor info
+ */
+#define LIBETH_XSK_DEFINE_RUN_PASS(name, run, populate) \
+	bool __LIBETH_XDP_DEFINE_RUN_PASS(name, run, populate, xsk)
+
+/**
+ * LIBETH_XSK_DEFINE_RUN - define a driver buffer process, run + pass function
+ * @name: name of the function to define
+ * @run: name of the XDP prog run function to define
+ * @flush: driver callback to flush an XSk ``XDP_TX`` bulk
+ * @populate: driver callback to fill an skb with HW descriptor info
+ */
+#define LIBETH_XSK_DEFINE_RUN(name, run, flush, populate) \
+	__LIBETH_XDP_DEFINE_RUN(name, run, flush, populate, XSK)
+
+/**
+ * LIBETH_XSK_DEFINE_FINALIZE - define a driver XSk NAPI poll finalize function
+ * @name: name of the function to define
+ * @flush: driver callback to flush an XSk ``XDP_TX`` bulk
+ * @finalize: driver callback to finalize an XDPSQ and run the timer
+ */
+#define LIBETH_XSK_DEFINE_FINALIZE(name, flush, finalize) \
+	__LIBETH_XDP_DEFINE_FINALIZE(name, flush, finalize, xsk)
+
+/* Refilling */
+
+/**
+ * struct libeth_xskfq - structure representing an XSk buffer (fill) queue
+ * @fp: hotpath part of the structure
+ * @pool: &xsk_buff_pool for buffer management
+ * @fqes: array of XSk buffer pointers
+ * @descs: opaque pointer to the HW descriptor array
+ * @ntu: index of the next buffer to poll
+ * @count: number of descriptors/buffers the queue has
+ * @pending: current number of XSkFQEs to refill
+ * @thresh: threshold below which the queue is refilled
+ * @buf_len: HW-writeable length per each buffer
+ * @nid: ID of the closest NUMA node with memory
+ */
+struct libeth_xskfq {
+	struct_group_tagged(libeth_xskfq_fp, fp,
+		struct xsk_buff_pool *pool;
+		struct libeth_xdp_buff **fqes;
+		void *descs;
+
+		u32 ntu;
+		u32 count;
+	);
+
+	/* Cold fields */
+	u32 pending;
+	u32 thresh;
+
+	u32 buf_len;
+	int nid;
+};
+
+int libeth_xskfq_create(struct libeth_xskfq *fq);
+void libeth_xskfq_destroy(struct libeth_xskfq *fq);
+
+/**
+ * libeth_xsk_buff_xdp_get_dma - get DMA address of XSk &libeth_xdp_buff
+ * @xdp: buffer to get the DMA addr for
+ */
+#define libeth_xsk_buff_xdp_get_dma(xdp) \
+	xsk_buff_xdp_get_dma(&(xdp)->base)
+
+/**
+ * libeth_xskfqe_alloc - allocate @n XSk Rx buffers
+ * @fq: hotpath part of the XSkFQ, usually onstack
+ * @n: number of buffers to allocate
+ * @fill: driver callback to write DMA addresses to HW descriptors
+ *
+ * Note that @fq->ntu gets updated, but ::pending must be recalculated
+ * by the caller.
+ *
+ * Return: number of buffers refilled.
+ */
+static __always_inline u32
+libeth_xskfqe_alloc(struct libeth_xskfq_fp *fq, u32 n,
+		    void (*fill)(const struct libeth_xskfq_fp *fq, u32 i))
+{
+	u32 this, ret, done = 0;
+	struct xdp_buff **xskb;
+
+	this = fq->count - fq->ntu;
+	if (likely(this > n))
+		this = n;
+
+again:
+	xskb = (typeof(xskb))&fq->fqes[fq->ntu];
+	ret = xsk_buff_alloc_batch(fq->pool, xskb, this);
+
+	for (u32 i = 0, ntu = fq->ntu; likely(i < ret); i++)
+		fill(fq, ntu + i);
+
+	done += ret;
+	fq->ntu += ret;
+
+	if (likely(fq->ntu < fq->count) || unlikely(ret < this))
+		goto out;
+
+	fq->ntu = 0;
+
+	if (this < n) {
+		this = n - this;
+		goto again;
+	}
+
+out:
+	return done;
+}
+
+/* .ndo_xsk_wakeup */
+
+void libeth_xsk_init_wakeup(call_single_data_t *csd, struct napi_struct *napi);
+void libeth_xsk_wakeup(call_single_data_t *csd, u32 qid);
+
+/* Pool setup */
+
+int libeth_xsk_setup_pool(struct net_device *dev, u32 qid, bool enable);
+
+#endif /* __LIBETH_XSK_H */
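The wraparound handling in libeth_xskfqe_alloc() can be easier to follow outside the kernel. Below is a minimal userspace sketch of the same two-pass refill pattern, not the library code itself: `struct demo_fq`, `demo_alloc_batch()` and `demo_refill()` are hypothetical names invented for illustration, the mock allocator stands in for xsk_buff_alloc_batch(), and the per-buffer @fill callback is omitted. It assumes @n never exceeds the ring size.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the hotpath part of an XSk fill queue. */
struct demo_fq {
	void **fqes;	/* ring of buffer pointers */
	uint32_t ntu;	/* next-to-use index */
	uint32_t count;	/* number of slots in the ring */
	uint32_t avail;	/* buffers the mock pool can still hand out */
};

/*
 * Mock batch allocator (assumption: like a real pool, it may return
 * fewer buffers than requested when it runs dry).
 */
static uint32_t demo_alloc_batch(struct demo_fq *fq, void **slots, uint32_t n)
{
	static char dummy;
	uint32_t got = n < fq->avail ? n : fq->avail;

	for (uint32_t i = 0; i < got; i++)
		slots[i] = &dummy;

	fq->avail -= got;

	return got;
}

/* Refill up to @n slots starting at fq->ntu, wrapping at most once. */
static uint32_t demo_refill(struct demo_fq *fq, uint32_t n)
{
	uint32_t this, ret, done = 0;

	assert(fq->ntu < fq->count);

	/* First pass: never run past the end of the ring */
	this = fq->count - fq->ntu;
	if (this > n)
		this = n;

again:
	ret = demo_alloc_batch(fq, &fq->fqes[fq->ntu], this);

	done += ret;
	fq->ntu += ret;

	/* Stop if the ring end was not reached or the pool ran short */
	if (fq->ntu < fq->count || ret < this)
		return done;

	/* Ring end reached: wrap and allocate the remainder, if any */
	fq->ntu = 0;

	if (this < n) {
		this = n - this;
		goto again;
	}

	return done;
}
```

As in the header above, the sketch only advances the next-to-use index; a caller would still have to recompute its own count of pending slots from the return value.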