Linux kernel mirror (for testing): git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

net: tso: Introduce tso_dma_map and helpers

Add struct tso_dma_map and struct tso_dma_map_completion_state to tso.h
for tracking the DMA addresses of mapped GSO payload data.

The tso_dma_map combines DMA mapping storage with iterator state, allowing
drivers to walk pre-mapped DMA regions linearly. It includes fields for
the DMA IOVA path (iova_state, iova_offset, total_len) and for a fallback
per-region path (linear_dma, frags[], frag_idx, offset).

The tso_dma_map_completion_state makes the IOVA completion state opaque
to drivers. Drivers are expected to allocate this and use the added
helpers to update the completion state.
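
For illustration, a driver might embed this opaque state in its
per-descriptor SW ring entry, roughly as sketched below (the struct and
field names are hypothetical, not part of this commit):

	/* Hypothetical SW ring entry; names are illustrative only */
	struct tso_example_tx_buf {
		DEFINE_DMA_UNMAP_ADDR(dma);	/* fallback-path unmap address */
		DEFINE_DMA_UNMAP_LEN(len);	/* fallback-path unmap length */
		/* opaque IOVA completion state, saved at xmit time */
		struct tso_dma_map_completion_state tso_cstate;
	};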

Add skb_frag_phys() to skbuff.h, returning the physical address
of a paged fragment's data; it is used by the tso_dma_map helpers
introduced below.

The added TSO DMA map helpers, tied together in the usage sketch after
this list, are:

tso_dma_map_init(): DMA-maps the linear payload region and all frags
up front. Prefers the DMA IOVA API for a single contiguous mapping with
one IOTLB sync; falls back to per-region dma_map_phys() otherwise.
Returns 0 on success and cleans up partial mappings on failure.

tso_dma_map_cleanup(): handles both IOVA and fallback teardown paths.

tso_dma_map_count(): counts how many descriptors the next N bytes of
payload will need. Returns 1 if the IOVA path is in use, since the
mapping is contiguous.

tso_dma_map_next(): yields the next (dma_addr, chunk_len) pair.
On the IOVA path, each segment is a single contiguous chunk. On the
fallback path, it indicates when a chunk starts a new DMA mapping so
the driver can set dma_unmap_len on that descriptor for completion-time
unmapping.

tso_dma_map_completion_save(): updates the completion state. Drivers
call this at xmit time.

tso_dma_map_complete(): tears down the mapping at completion time and
returns true if the IOVA path was used. Otherwise it is a no-op and
returns false.
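
To make the flow concrete, here is a rough xmit-path sketch of how a
driver might combine these helpers. All tso_example_* names, the ring
layout, and the descriptor-queueing helpers are hypothetical, and
declarations and error handling are omitted; this is illustrative only,
not part of this commit:

	/* Hypothetical xmit path: map the whole payload once, then
	 * walk it segment by segment. Declarations omitted.
	 */
	hdr_len = tso_start(skb, &tso);
	if (tso_dma_map_init(&map, ring->dev, skb, hdr_len))
		return NETDEV_TX_BUSY;

	total = skb->len - hdr_len;
	while (total > 0) {
		unsigned int seg = min_t(unsigned int,
					 skb_shinfo(skb)->gso_size, total);

		total -= seg;

		/* one header descriptor plus the data descriptors
		 * this segment needs
		 */
		if (tso_example_descs_avail(ring) <
		    tso_dma_map_count(&map, seg) + 1)
			goto stop_queue;

		/* hypothetical helper: builds the per-segment header
		 * with tso_build_hdr() and advances the tso_t state
		 */
		tso_example_queue_hdr(ring, skb, &tso, seg, !total);

		while (tso_dma_map_next(&map, &dma, &chunk,
					&mapping_len, seg)) {
			/* mapping_len != 0: first chunk of a new DMA
			 * mapping; record it via dma_unmap_len() so the
			 * fallback path can unmap at completion time
			 */
			tso_example_queue_data(ring, dma, chunk,
					       mapping_len);
			seg -= chunk;
		}
	}

	/* save the opaque IOVA state on the skb's last SW ring entry */
	tso_dma_map_completion_save(&map, &last_buf->tso_cstate);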

Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Joe Damato <joe@dama.to>
Link: https://patch.msgid.link/20260408230607.2019402-2-joe@dama.to
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Authored by Joe Damato; committed by Jakub Kicinski
82db77f6 00667926

380 additions.

include/linux/skbuff.h (+11):
···
 }
 
 /**
+ * skb_frag_phys - gets the physical address of the data in a paged fragment
+ * @frag: the paged fragment buffer
+ *
+ * Returns: the physical address of the data within @frag.
+ */
+static inline phys_addr_t skb_frag_phys(const skb_frag_t *frag)
+{
+	return page_to_phys(skb_frag_page(frag)) + skb_frag_off(frag);
+}
+
+/**
  * skb_frag_page_copy() - sets the page in a fragment from another fragment
  * @fragto: skb fragment where page is set
  * @fragfrom: skb fragment page is copied from
include/net/tso.h (+100):

···
 #define _TSO_H
 
 #include <linux/skbuff.h>
+#include <linux/dma-mapping.h>
 #include <net/ip.h>
 
 #define TSO_HEADER_SIZE	256
···
 		   int size, bool is_last);
 void tso_build_data(const struct sk_buff *skb, struct tso_t *tso, int size);
 int tso_start(struct sk_buff *skb, struct tso_t *tso);
+
+/**
+ * struct tso_dma_map - DMA mapping state for GSO payload
+ * @dev: device used for DMA mapping
+ * @skb: the GSO skb being mapped
+ * @hdr_len: per-segment header length
+ * @iova_state: DMA IOVA state (when IOMMU available)
+ * @iova_offset: global byte offset into IOVA range (IOVA path only)
+ * @total_len: total payload length
+ * @frag_idx: current region (-1 = linear, 0..nr_frags-1 = frag)
+ * @offset: byte offset within current region
+ * @linear_dma: DMA address of the linear payload
+ * @linear_len: length of the linear payload
+ * @nr_frags: number of frags successfully DMA-mapped
+ * @frags: per-frag DMA address and length
+ *
+ * DMA-maps the payload regions of a GSO skb (linear data + frags).
+ * Prefers the DMA IOVA API for a single contiguous mapping with one
+ * IOTLB sync; falls back to per-region dma_map_phys() otherwise.
+ */
+struct tso_dma_map {
+	struct device *dev;
+	const struct sk_buff *skb;
+	unsigned int hdr_len;
+	/* IOVA path */
+	struct dma_iova_state iova_state;
+	size_t iova_offset;
+	size_t total_len;
+	/* Fallback path if IOVA path fails */
+	int frag_idx;
+	unsigned int offset;
+	dma_addr_t linear_dma;
+	unsigned int linear_len;
+	unsigned int nr_frags;
+	struct {
+		dma_addr_t dma;
+		unsigned int len;
+	} frags[MAX_SKB_FRAGS];
+};
+
+/**
+ * struct tso_dma_map_completion_state - Completion-time cleanup state
+ * @iova_state: DMA IOVA state (when IOMMU available)
+ * @total_len: total payload length of the IOVA mapping
+ *
+ * Drivers store this on their SW ring at xmit time via
+ * tso_dma_map_completion_save(), then call tso_dma_map_complete() at
+ * completion time.
+ */
+struct tso_dma_map_completion_state {
+	struct dma_iova_state iova_state;
+	size_t total_len;
+};
+
+int tso_dma_map_init(struct tso_dma_map *map, struct device *dev,
+		     const struct sk_buff *skb, unsigned int hdr_len);
+void tso_dma_map_cleanup(struct tso_dma_map *map);
+unsigned int tso_dma_map_count(struct tso_dma_map *map, unsigned int len);
+bool tso_dma_map_next(struct tso_dma_map *map, dma_addr_t *addr,
+		      unsigned int *chunk_len, unsigned int *mapping_len,
+		      unsigned int seg_remaining);
+
+/**
+ * tso_dma_map_completion_save - save state needed for completion-time cleanup
+ * @map: the xmit-time DMA map
+ * @cstate: driver-owned storage that persists until completion
+ *
+ * Should be called at xmit time to update the completion state and
+ * later passed to tso_dma_map_complete().
+ */
+static inline void
+tso_dma_map_completion_save(const struct tso_dma_map *map,
+			    struct tso_dma_map_completion_state *cstate)
+{
+	cstate->iova_state = map->iova_state;
+	cstate->total_len = map->total_len;
+}
+
+/**
+ * tso_dma_map_complete - tear down mapping at completion time
+ * @dev: the device that owns the mapping
+ * @cstate: state saved by tso_dma_map_completion_save()
+ *
+ * Return: true if the IOVA path was used and the mapping has been
+ * destroyed; false if the fallback per-region path was used and the
+ * driver must unmap via its normal completion path.
+ */
+static inline bool
+tso_dma_map_complete(struct device *dev,
+		     struct tso_dma_map_completion_state *cstate)
+{
+	if (dma_use_iova(&cstate->iova_state)) {
+		dma_iova_destroy(dev, &cstate->iova_state, cstate->total_len,
+				 DMA_TO_DEVICE, 0);
+		return true;
+	}
+
+	return false;
+}
 
 #endif /* _TSO_H */
net/core/tso.c (+269):

···
 #include <linux/if_vlan.h>
 #include <net/ip.h>
 #include <net/tso.h>
+#include <linux/dma-mapping.h>
 #include <linux/unaligned.h>
 
 void tso_build_hdr(const struct sk_buff *skb, char *hdr, struct tso_t *tso,
···
 	return hdr_len;
 }
 EXPORT_SYMBOL(tso_start);
+
+static int tso_dma_iova_try(struct device *dev, struct tso_dma_map *map,
+			    phys_addr_t phys, size_t linear_len,
+			    size_t total_len, size_t *offset)
+{
+	const struct sk_buff *skb;
+	unsigned int nr_frags;
+	int i;
+
+	if (!dma_iova_try_alloc(dev, &map->iova_state, phys, total_len))
+		return 1;
+
+	skb = map->skb;
+	nr_frags = skb_shinfo(skb)->nr_frags;
+
+	if (linear_len) {
+		if (dma_iova_link(dev, &map->iova_state,
+				  phys, *offset, linear_len,
+				  DMA_TO_DEVICE, 0))
+			goto iova_fail;
+		map->linear_len = linear_len;
+		*offset += linear_len;
+	}
+
+	for (i = 0; i < nr_frags; i++) {
+		skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
+		unsigned int frag_len = skb_frag_size(frag);
+
+		if (dma_iova_link(dev, &map->iova_state,
+				  skb_frag_phys(frag), *offset,
+				  frag_len, DMA_TO_DEVICE, 0)) {
+			map->nr_frags = i;
+			goto iova_fail;
+		}
+		map->frags[i].len = frag_len;
+		*offset += frag_len;
+		map->nr_frags = i + 1;
+	}
+
+	if (dma_iova_sync(dev, &map->iova_state, 0, total_len))
+		goto iova_fail;
+
+	return 0;
+
+iova_fail:
+	dma_iova_destroy(dev, &map->iova_state, *offset,
+			 DMA_TO_DEVICE, 0);
+	memset(&map->iova_state, 0, sizeof(map->iova_state));
+
+	/* reset map state */
+	map->frag_idx = -1;
+	map->offset = 0;
+	map->linear_len = 0;
+	map->nr_frags = 0;
+
+	return 1;
+}
+
+/**
+ * tso_dma_map_init - DMA-map GSO payload regions
+ * @map: map struct to initialize
+ * @dev: device for DMA mapping
+ * @skb: the GSO skb
+ * @hdr_len: per-segment header length in bytes
+ *
+ * DMA-maps the linear payload (after headers) and all frags.
+ * Prefers the DMA IOVA API (one contiguous mapping, one IOTLB sync);
+ * falls back to per-region dma_map_phys() when IOVA is not available.
+ * Positions the iterator at byte 0 of the payload.
+ *
+ * Return: 0 on success, -ENOMEM on DMA mapping failure (partial mappings
+ * are cleaned up internally).
+ */
+int tso_dma_map_init(struct tso_dma_map *map, struct device *dev,
+		     const struct sk_buff *skb, unsigned int hdr_len)
+{
+	unsigned int linear_len = skb_headlen(skb) - hdr_len;
+	unsigned int nr_frags = skb_shinfo(skb)->nr_frags;
+	size_t total_len = skb->len - hdr_len;
+	size_t offset = 0;
+	phys_addr_t phys;
+	int i;
+
+	map->dev = dev;
+	map->skb = skb;
+	map->hdr_len = hdr_len;
+	map->frag_idx = -1;
+	map->offset = 0;
+	map->iova_offset = 0;
+	map->total_len = total_len;
+	map->linear_len = 0;
+	map->nr_frags = 0;
+	memset(&map->iova_state, 0, sizeof(map->iova_state));
+
+	if (!total_len)
+		return 0;
+
+	if (linear_len)
+		phys = virt_to_phys(skb->data + hdr_len);
+	else
+		phys = skb_frag_phys(&skb_shinfo(skb)->frags[0]);
+
+	if (tso_dma_iova_try(dev, map, phys, linear_len, total_len, &offset)) {
+		/* IOVA path failed, map state was reset. Fallback to
+		 * per-region dma_map_phys()
+		 */
+		if (linear_len) {
+			map->linear_dma = dma_map_phys(dev, phys, linear_len,
+						       DMA_TO_DEVICE, 0);
+			if (dma_mapping_error(dev, map->linear_dma))
+				return -ENOMEM;
+			map->linear_len = linear_len;
+		}
+
+		for (i = 0; i < nr_frags; i++) {
+			skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
+			unsigned int frag_len = skb_frag_size(frag);
+
+			map->frags[i].len = frag_len;
+			map->frags[i].dma = dma_map_phys(dev, skb_frag_phys(frag),
+							 frag_len, DMA_TO_DEVICE, 0);
+			if (dma_mapping_error(dev, map->frags[i].dma)) {
+				tso_dma_map_cleanup(map);
+				return -ENOMEM;
+			}
+			map->nr_frags = i + 1;
+		}
+	}
+
+	if (linear_len == 0 && nr_frags > 0)
+		map->frag_idx = 0;
+
+	return 0;
+}
+EXPORT_SYMBOL(tso_dma_map_init);
+
+/**
+ * tso_dma_map_cleanup - unmap all DMA regions in a tso_dma_map
+ * @map: the map to clean up
+ *
+ * Handles both IOVA and fallback paths. For IOVA, calls
+ * dma_iova_destroy(). For fallback, unmaps each region individually.
+ */
+void tso_dma_map_cleanup(struct tso_dma_map *map)
+{
+	int i;
+
+	if (dma_use_iova(&map->iova_state)) {
+		dma_iova_destroy(map->dev, &map->iova_state, map->total_len,
+				 DMA_TO_DEVICE, 0);
+		memset(&map->iova_state, 0, sizeof(map->iova_state));
+	} else {
+		if (map->linear_len)
+			dma_unmap_phys(map->dev, map->linear_dma,
+				       map->linear_len, DMA_TO_DEVICE, 0);
+
+		for (i = 0; i < map->nr_frags; i++)
+			dma_unmap_phys(map->dev, map->frags[i].dma,
+				       map->frags[i].len, DMA_TO_DEVICE, 0);
+	}
+
+	map->linear_len = 0;
+	map->nr_frags = 0;
+}
+EXPORT_SYMBOL(tso_dma_map_cleanup);
+
+/**
+ * tso_dma_map_count - count descriptors for a payload range
+ * @map: the payload map
+ * @len: number of payload bytes in this segment
+ *
+ * Counts how many contiguous DMA region chunks the next @len bytes
+ * will span, without advancing the iterator. On the IOVA path this
+ * is always 1 (contiguous). On the fallback path, uses region sizes
+ * from the current position.
+ *
+ * Return: the number of descriptors needed for @len bytes of payload.
+ */
+unsigned int tso_dma_map_count(struct tso_dma_map *map, unsigned int len)
+{
+	unsigned int offset = map->offset;
+	int idx = map->frag_idx;
+	unsigned int count = 0;
+
+	if (!len)
+		return 0;
+
+	if (dma_use_iova(&map->iova_state))
+		return 1;
+
+	while (len > 0) {
+		unsigned int region_len, chunk;
+
+		if (idx == -1)
+			region_len = map->linear_len;
+		else
+			region_len = map->frags[idx].len;
+
+		chunk = min(len, region_len - offset);
+		len -= chunk;
+		count++;
+		offset = 0;
+		idx++;
+	}
+
+	return count;
+}
+EXPORT_SYMBOL(tso_dma_map_count);
+
+/**
+ * tso_dma_map_next - yield the next DMA address range
+ * @map: the payload map
+ * @addr: output DMA address
+ * @chunk_len: output chunk length
+ * @mapping_len: full DMA mapping length when this chunk starts a new
+ *		 mapping region, or 0 when continuing a previous one.
+ *		 On the IOVA path this is always 0 (driver must not
+ *		 do per-region unmaps; use tso_dma_map_cleanup instead).
+ * @seg_remaining: bytes left in current segment
+ *
+ * Yields the next (dma_addr, chunk_len) pair and advances the iterator.
+ * On the IOVA path, the entire payload is contiguous so each segment
+ * is always a single chunk.
+ *
+ * Return: true if a chunk was yielded, false when @seg_remaining is 0.
+ */
+bool tso_dma_map_next(struct tso_dma_map *map, dma_addr_t *addr,
+		      unsigned int *chunk_len, unsigned int *mapping_len,
+		      unsigned int seg_remaining)
+{
+	unsigned int region_len, chunk;
+
+	if (!seg_remaining)
+		return false;
+
+	/* IOVA path: contiguous DMA range, no region boundaries */
+	if (dma_use_iova(&map->iova_state)) {
+		*addr = map->iova_state.addr + map->iova_offset;
+		*chunk_len = seg_remaining;
+		*mapping_len = 0;
+		map->iova_offset += seg_remaining;
+		return true;
+	}
+
+	/* Fallback path: per-region iteration */
+	if (map->frag_idx == -1) {
+		region_len = map->linear_len;
+		chunk = min(seg_remaining, region_len - map->offset);
+		*addr = map->linear_dma + map->offset;
+	} else {
+		region_len = map->frags[map->frag_idx].len;
+		chunk = min(seg_remaining, region_len - map->offset);
+		*addr = map->frags[map->frag_idx].dma + map->offset;
+	}
+
+	*mapping_len = (map->offset == 0) ? region_len : 0;
+	*chunk_len = chunk;
+	map->offset += chunk;
+
+	if (map->offset >= region_len) {
+		map->frag_idx++;
+		map->offset = 0;
+	}
+
+	return true;
+}
+EXPORT_SYMBOL(tso_dma_map_next);
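
For completeness, a matching completion-time sketch under the same
hypothetical tso_example_* driver (again illustrative only, not part of
this commit): the SW ring entry carrying the saved state tries the IOVA
teardown first, and otherwise falls back to per-region unmapping.

	/* Hypothetical TX completion path for the SW ring entry that
	 * holds the saved completion state
	 */
	static void tso_example_clean_tx_buf(struct tso_example_ring *ring,
					     struct tso_example_tx_buf *buf)
	{
		/* IOVA path: a single call destroys the whole mapping */
		if (tso_dma_map_complete(ring->dev, &buf->tso_cstate))
			return;

		/* fallback path: unmap only entries that started a DMA
		 * mapping (mapping_len != 0 recorded at xmit time)
		 */
		if (dma_unmap_len(buf, len))
			dma_unmap_phys(ring->dev, dma_unmap_addr(buf, dma),
				       dma_unmap_len(buf, len),
				       DMA_TO_DEVICE, 0);
	}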