Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

net-zerocopy: Reduce compound page head access

When compound pages are in use, the mm layer still returns an array
of page pointers, but a subset (or all) of them may share the same
page head: a maximum-size 180 KB skb can span two hugepages if it sits
on a boundary, be a mix of regular pages and one hugepage, or fit
entirely within one hugepage. Instead of looking up the page head for
every page pointer, use page length arithmetic and call compound_head()
only when a page is known to belong to a different page head, which
avoids touching a cold cacheline.

Tested:
See next patch with changes to tcp_mmap

Correctness:
Run on a pair of separate hosts, because a send with MSG_ZEROCOPY is
forced to copy on tx when using loopback alone. Check that the SHA256
of the message sent matches the SHA256 of the message received; the
current program already checks the length.

echo 1024 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
./tcp_mmap -s -z
./tcp_mmap -H $DADDR -z

SHA256 is correct
received 2 MB (100 % mmap'ed) in 0.005914 s, 2.83686 Gbit
cpu usage user:0.001984 sys:0.000963, 1473.5 usec per MB, 10 c-switches

Performance:
Run neper between adjacent hosts with the same config
tcp_stream -Z --skip-rx-copy -6 -T 20 -F 1000 --stime-use-proc --test-length=30

Before patch: stime_end=37.670000
After patch: stime_end=30.310000

Signed-off-by: Coco Li <lixiaoyan@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20230321081202.2370275-1-lixiaoyan@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Authored by Xiaoyan Li and committed by Paolo Abeni
593ef60c ce1fdb06

+11 -3
net/core/datagram.c
@@ -622,12 +622,12 @@
 	frag = skb_shinfo(skb)->nr_frags;
 
 	while (length && iov_iter_count(from)) {
+		struct page *head, *last_head = NULL;
 		struct page *pages[MAX_SKB_FRAGS];
-		struct page *last_head = NULL;
+		int refs, order, n = 0;
 		size_t start;
 		ssize_t copied;
 		unsigned long truesize;
-		int refs, n = 0;
 
 		if (frag == MAX_SKB_FRAGS)
 			return -EMSGSIZE;
@@ -650,9 +650,17 @@
 		} else {
 			refcount_add(truesize, &skb->sk->sk_wmem_alloc);
 		}
+
+		head = compound_head(pages[n]);
+		order = compound_order(head);
+
 		for (refs = 0; copied != 0; start = 0) {
 			int size = min_t(int, copied, PAGE_SIZE - start);
-			struct page *head = compound_head(pages[n]);
+
+			if (pages[n] - head > (1UL << order) - 1) {
+				head = compound_head(pages[n]);
+				order = compound_order(head);
+			}
 
 			start += (pages[n] - head) << PAGE_SHIFT;
 			copied -= size;
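
The effect of the change above can be illustrated with a small userspace model. This is only a sketch, not kernel code: struct model_page, model_compound_head(), model_compound_order(), count_head_lookups() and the MODEL_* constants are hypothetical stand-ins invented for illustration, and the layout assumes 4 KiB base pages and two adjacent 2 MiB (order-9) hugepages. It counts how many head lookups each loop shape performs for a run of consecutive pages.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's struct page (hypothetical layout). */
struct model_page {
	struct model_page *head;	/* head of the compound page this page belongs to */
	unsigned int order;		/* compound order; meaningful on head pages only */
};

#define MODEL_NR_PAGES	1024	/* assumption: two order-9 compound pages */
#define MODEL_MAX_SPAN	64	/* assumption: room for 180 KB = 45 pages of 4 KiB */

static struct model_page mem[MODEL_NR_PAGES];
static unsigned long head_lookups;	/* number of simulated compound_head() calls */

/* Models compound_head(): the access the patch tries to avoid repeating. */
static struct model_page *model_compound_head(struct model_page *page)
{
	head_lookups++;
	return page->head;
}

/* Models compound_order(): reads the order stored on the head page. */
static unsigned int model_compound_order(const struct model_page *head)
{
	return head->order;
}

/* Mark mem[base .. base + 2^order) as one compound page headed by mem[base]. */
static void make_compound(size_t base, unsigned int order)
{
	for (size_t i = 0; i < ((size_t)1 << order); i++) {
		mem[base + i].head = &mem[base];
		mem[base + i].order = 0;
	}
	mem[base].order = order;
}

/* Loop shape before the patch: one head lookup per page pointer. */
static void walk_naive(struct model_page **pages, size_t n)
{
	for (size_t i = 0; i < n; i++)
		(void)model_compound_head(pages[i]);
}

/* Loop shape after the patch: look up again only when a page falls outside
 * the compound page that the cached head and order describe. */
static void walk_cached(struct model_page **pages, size_t n)
{
	struct model_page *head = model_compound_head(pages[0]);
	unsigned int order = model_compound_order(head);

	for (size_t i = 0; i < n; i++) {
		/* Unsigned compare, so a page located before head also triggers a lookup. */
		if ((unsigned long)(pages[i] - head) > (1UL << order) - 1) {
			head = model_compound_head(pages[i]);
			order = model_compound_order(head);
		}
	}
}

/* Count head lookups for n consecutive pages starting at index start. */
unsigned long count_head_lookups(size_t start, size_t n, int cached)
{
	struct model_page *pages[MODEL_MAX_SPAN];

	assert(n > 0 && n <= MODEL_MAX_SPAN && start + n <= MODEL_NR_PAGES);

	make_compound(0, 9);	/* first modelled hugepage: pages 0..511 */
	make_compound(512, 9);	/* second modelled hugepage: pages 512..1023 */

	for (size_t i = 0; i < n; i++)
		pages[i] = &mem[start + i];

	head_lookups = 0;
	if (cached)
		walk_cached(pages, n);
	else
		walk_naive(pages, n);
	return head_lookups;
}
```

Under these assumptions, a 45-page run that straddles the hugepage boundary (count_head_lookups(492, 45, 0) versus count_head_lookups(492, 45, 1)) costs 45 lookups with the per-page loop and 2 with the cached head and order. A run that fits inside one hugepage costs a single lookup in the cached variant.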