Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

treewide: provide a generic clear_user_page() variant

Patch series "mm: folio_zero_user: clear page ranges", v11.

This series adds clearing of contiguous page ranges for hugepages.

The series improves on the current discontiguous clearing approach in two
ways:

- clear pages in a contiguous fashion.
- use batched clearing via clear_pages() wherever exposed.

The first is useful because it allows us to make much better use of
hardware prefetchers.

The second enables advertising the real extent to the processor. Where
specific instructions support it (e.g. string instructions on x86, "mops"
on arm64, etc.), a processor can optimize based on this because, instead of
seeing a sequence of 8-byte stores, or a sequence of 4KB pages, it sees a
larger unit being operated on.

For instance, AMD Zen uarchs (for extents larger than LLC-size) switch to
a mode where they start eliding cacheline allocation. This is helpful not
just because it results in higher bandwidth, but also because now the
cache is not evicting useful cachelines and replacing them with zeroes.
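
As a rough user-space analogy of the two approaches (not the kernel implementation), the sketch below contrasts clearing one page at a time with clearing the whole contiguous extent in a single call. The names sketch_clear_page_by_page(), sketch_clear_range() and SKETCH_PAGE_SIZE are hypothetical, and memset() merely stands in for the architecture's clear_page()/clear_pages() primitives.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed 4KB base page, as in the text above */

/* Per-page clearing: each call only exposes a single 4KB extent. */
static void sketch_clear_page_by_page(unsigned char *base, size_t npages)
{
	assert(base != NULL);
	for (size_t i = 0; i < npages; i++)
		memset(base + i * SKETCH_PAGE_SIZE, 0, SKETCH_PAGE_SIZE);
}

/* Batched clearing: one call covers the whole contiguous extent. */
static void sketch_clear_range(unsigned char *base, size_t npages)
{
	assert(base != NULL);
	memset(base, 0, npages * SKETCH_PAGE_SIZE);
}
```

Both helpers leave the buffer in the same state. The difference is only in how much of the extent is visible to the clearing primitive at once, which is the property the series relies on.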

Demand faulting a 64GB region shows a performance improvement:

$ perf bench mem mmap -p $pg-sz -f demand -s 64GB -l 5

                 baseline             +series
                 (GBps +- %stdev)     (GBps +- %stdev)

   pg-sz=2MB     11.76 +- 1.10%       25.34 +- 1.18%  [*]   +115.47%   preempt=*

   pg-sz=1GB     24.85 +- 2.41%       39.22 +- 2.32%        + 57.82%   preempt=none|voluntary
   pg-sz=1GB     (similar)            52.73 +- 0.20%  [#]   +112.19%   preempt=full|lazy

[*] This improvement is because switching to sequential clearing
allows the hardware prefetchers to do a much better job.

[#] For pg-sz=1GB a large part of the improvement is because of the
cacheline elision mentioned above. preempt=full|lazy improves upon
that because, not needing explicit invocations of cond_resched() to
ensure reasonable preemption latency, it can clear the full extent
as a single unit. In comparison the maximum extent used for
preempt=none|voluntary is PROCESS_PAGES_NON_PREEMPT_BATCH (32MB).

When provided the full extent the processor forgoes allocating
cachelines on this path almost entirely.

(The hope is that eventually, in the fullness of time, the lazy
preemption model will be able to do the same job that none or
voluntary models are used for, allowing us to do away with
cond_resched().)
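
To illustrate the batching described in [#], here is a minimal user-space sketch, assuming a hypothetical sketch_resched_point() in place of cond_resched() and memset() in place of the real clearing primitive. It clears a contiguous extent in units of at most batch_pages and counts how many units the processor gets to see. None of these names come from the series itself.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed 4KB base page */

/* Hypothetical stand-in for a voluntary reschedule point such as cond_resched(). */
static unsigned long sketch_resched_points;

static void sketch_resched_point(void)
{
	sketch_resched_points++;
}

/*
 * Clear npages contiguous pages in units of at most batch_pages, with a
 * reschedule point between units. Returns the number of units used.
 * Passing batch_pages >= npages models clearing the full extent as a
 * single unit, as described for preempt=full|lazy.
 */
static size_t sketch_clear_batched(unsigned char *base, size_t npages,
				   size_t batch_pages)
{
	size_t units = 0;

	assert(base != NULL && batch_pages > 0);
	while (npages > 0) {
		size_t n = npages < batch_pages ? npages : batch_pages;

		memset(base, 0, n * SKETCH_PAGE_SIZE);
		base += n * SKETCH_PAGE_SIZE;
		npages -= n;
		units++;
		if (npages > 0)
			sketch_resched_point();
	}
	return units;
}
```

With 4KB pages, a 32MB batch corresponds to 8192 pages, so a 1GB extent (262144 pages) would be split into 32 units under this scheme, versus a single unit when no explicit reschedule points are needed.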

Raghavendra also tested a previous version of the series on AMD Genoa and
saw a similar improvement [1] with preempt=lazy.

$ perf bench mem mmap -p $page-size -f populate -s 64GB -l 10

                 base                  patched               change
   pg-sz=2MB     12.731939 GB/sec      26.304263 GB/sec      106.6%
   pg-sz=1GB     26.232423 GB/sec      61.174836 GB/sec      133.2%


This patch (of 8):

Let's drop all variants that effectively map to clear_page() and provide
it in a generic variant instead.

We'll use the macro clear_user_page to indicate whether an architecture
provides its own variant.

Also, clear_user_page() is only called from the generic variant of
clear_user_highpage(), so define it only if the architecture does not
provide a clear_user_highpage(). And, for simplicity, define it in
linux/highmem.h.

Note that for parisc, clear_page() and clear_user_page() map to
clear_page_asm(), so we can just get rid of the custom clear_user_page()
implementation. There is a clear_user_page_asm() function on parisc that
seems to be unused. Not sure what's up with that.
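
The macro convention described above can be shown in a small self-contained sketch. An architecture header defines a macro with the same name as its function, and the generic header only provides a fallback when that macro is absent. All sketch_* names and SKETCH_ARCH_HAS_CUSTOM_VARIANT are hypothetical and exist only for illustration. The real definitions are in the diff that follows.

```c
#include <assert.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed 4KB page */

static inline void sketch_clear_page(void *addr)
{
	assert(addr != NULL);
	memset(addr, 0, SKETCH_PAGE_SIZE);
}

/* --- what an architecture header might contain (hypothetical) --- */
#ifdef SKETCH_ARCH_HAS_CUSTOM_VARIANT
/* Defining the name to itself announces "this arch has its own variant". */
#define sketch_clear_user_page sketch_clear_user_page
static inline void sketch_clear_user_page(void *addr, unsigned long vaddr)
{
	(void)vaddr;
	sketch_clear_page(addr);
	/* an architecture with aliasing caches would flush here */
}
#endif

/* --- what the generic header might contain (hypothetical) --- */
#ifndef sketch_clear_user_page
/* Generic fallback: only compiled when no arch variant was announced. */
static inline void sketch_clear_user_page(void *addr, unsigned long vaddr)
{
	(void)vaddr;
	sketch_clear_page(addr);
}
#endif
```

Building with -DSKETCH_ARCH_HAS_CUSTOM_VARIANT selects the "architecture" definition, and building without it selects the generic one. Callers use sketch_clear_user_page() the same way in both cases.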

Link: https://lkml.kernel.org/r/20260107072009.1615991-1-ankur.a.arora@oracle.com
Link: https://lkml.kernel.org/r/20260107072009.1615991-2-ankur.a.arora@oracle.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Co-developed-by: Ankur Arora <ankur.a.arora@oracle.com>
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Ankur Arora <ankur.a.arora@oracle.com>
Cc: "Borislav Petkov (AMD)" <bp@alien8.de>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Hildenbrand <david@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Lance Yang <ioworker0@gmail.com>
Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com>
Cc: Li Zhe <lizhe.67@bytedance.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Raghavendra K T <raghavendra.kt@amd.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Authored by David Hildenbrand and committed by Andrew Morton
8e38607a 8b05d2d8

+29 -28
-1
arch/alpha/include/asm/page.h
@@ -11,7 +11,6 @@
 #define STRICT_MM_TYPECHECKS
 
 extern void clear_page(void *page);
-#define clear_user_page(page, vaddr, pg) clear_page(page)
 
 #define vma_alloc_zeroed_movable_folio(vma, vaddr) \
 	vma_alloc_folio(GFP_HIGHUSER_MOVABLE | __GFP_ZERO, 0, vma, vaddr)
+2
arch/arc/include/asm/page.h
@@ -32,6 +32,8 @@
 
 void copy_user_highpage(struct page *to, struct page *from,
 			unsigned long u_vaddr, struct vm_area_struct *vma);
+
+#define clear_user_page clear_user_page
 void clear_user_page(void *to, unsigned long u_vaddr, struct page *page);
 
 typedef struct {
-1
arch/arm/include/asm/page-nommu.h
@@ -11,7 +11,6 @@
 #define clear_page(page) memset((page), 0, PAGE_SIZE)
 #define copy_page(to,from) memcpy((to), (from), PAGE_SIZE)
 
-#define clear_user_page(page, vaddr, pg) clear_page(page)
 #define copy_user_page(to, from, vaddr, pg) copy_page(to, from)
 
 /*
-1
arch/arm64/include/asm/page.h
@@ -36,7 +36,6 @@
 bool tag_clear_highpages(struct page *to, int numpages);
 #define __HAVE_ARCH_TAG_CLEAR_HIGHPAGES
 
-#define clear_user_page(page, vaddr, pg) clear_page(page)
 #define copy_user_page(to, from, vaddr, pg) copy_page(to, from)
 
 typedef struct page *pgtable_t;
+1
arch/csky/abiv1/inc/abi/page.h
@@ -10,6 +10,7 @@
 	return (addr1 ^ addr2) & (SHMLBA-1);
 }
 
+#define clear_user_page clear_user_page
 static inline void clear_user_page(void *addr, unsigned long vaddr,
 				   struct page *page)
 {
-7
arch/csky/abiv2/inc/abi/page.h
@@ -1,11 +1,4 @@
 /* SPDX-License-Identifier: GPL-2.0 */
-
-static inline void clear_user_page(void *addr, unsigned long vaddr,
-				   struct page *page)
-{
-	clear_page(addr);
-}
-
 static inline void copy_user_page(void *to, void *from, unsigned long vaddr,
 				  struct page *page)
 {
-1
arch/hexagon/include/asm/page.h
@@ -113,7 +113,6 @@
 /*
  * Under assumption that kernel always "sees" user map...
  */
-#define clear_user_page(page, vaddr, pg) clear_page(page)
 #define copy_user_page(to, from, vaddr, pg) copy_page(to, from)
 
 static inline unsigned long virt_to_pfn(const void *kaddr)
-1
arch/loongarch/include/asm/page.h
@@ -30,7 +30,6 @@
 extern void clear_page(void *page);
 extern void copy_page(void *to, void *from);
 
-#define clear_user_page(page, vaddr, pg) clear_page(page)
 #define copy_user_page(to, from, vaddr, pg) copy_page(to, from)
 
 extern unsigned long shm_align_mask;
-1
arch/m68k/include/asm/page_no.h
@@ -10,7 +10,6 @@
 #define clear_page(page) memset((page), 0, PAGE_SIZE)
 #define copy_page(to,from) memcpy((to), (from), PAGE_SIZE)
 
-#define clear_user_page(page, vaddr, pg) clear_page(page)
 #define copy_user_page(to, from, vaddr, pg) copy_page(to, from)
 
 #define vma_alloc_zeroed_movable_folio(vma, vaddr) \
-1
arch/microblaze/include/asm/page.h
@@ -45,7 +45,6 @@
 # define copy_page(to, from) memcpy((to), (from), PAGE_SIZE)
 # define clear_page(pgaddr) memset((pgaddr), 0, PAGE_SIZE)
 
-# define clear_user_page(pgaddr, vaddr, page) memset((pgaddr), 0, PAGE_SIZE)
 # define copy_user_page(vto, vfrom, vaddr, topg) \
 			memcpy((vto), (vfrom), PAGE_SIZE)
 
+1
arch/mips/include/asm/page.h
@@ -90,6 +90,7 @@
 	if (pages_do_alias((unsigned long) addr, vaddr & PAGE_MASK))
 		flush_data_cache_page((unsigned long)addr);
 }
+#define clear_user_page clear_user_page
 
 struct vm_area_struct;
 extern void copy_user_highpage(struct page *to, struct page *from,
+1
arch/nios2/include/asm/page.h
@@ -45,6 +45,7 @@
 
 struct page;
 
+#define clear_user_page clear_user_page
 extern void clear_user_page(void *addr, unsigned long vaddr, struct page *page);
 extern void copy_user_page(void *vto, void *vfrom, unsigned long vaddr,
 				struct page *to);
-1
arch/openrisc/include/asm/page.h
@@ -30,7 +30,6 @@
 #define clear_page(page) memset((page), 0, PAGE_SIZE)
 #define copy_page(to, from) memcpy((to), (from), PAGE_SIZE)
 
-#define clear_user_page(page, vaddr, pg) clear_page(page)
 #define copy_user_page(to, from, vaddr, pg) copy_page(to, from)
 
 /*
-1
arch/parisc/include/asm/page.h
@@ -21,7 +21,6 @@
 
 void clear_page_asm(void *page);
 void copy_page_asm(void *to, void *from);
-#define clear_user_page(vto, vaddr, page) clear_page_asm(vto)
 void copy_user_highpage(struct page *to, struct page *from, unsigned long vaddr,
 		struct vm_area_struct *vma);
 #define __HAVE_ARCH_COPY_USER_HIGHPAGE
+1
arch/powerpc/include/asm/page.h
@@ -271,6 +271,7 @@
 
 struct page;
 extern void clear_user_page(void *page, unsigned long vaddr, struct page *pg);
+#define clear_user_page clear_user_page
 extern void copy_user_page(void *to, void *from, unsigned long vaddr,
 		struct page *p);
 extern int devmem_is_allowed(unsigned long pfn);
-1
arch/riscv/include/asm/page.h
@@ -50,7 +50,6 @@
 #endif
 #define copy_page(to, from) memcpy((to), (from), PAGE_SIZE)
 
-#define clear_user_page(pgaddr, vaddr, page) clear_page(pgaddr)
 #define copy_user_page(vto, vfrom, vaddr, topg) \
 			memcpy((vto), (vfrom), PAGE_SIZE)
 
-1
arch/s390/include/asm/page.h
@@ -65,7 +65,6 @@
 		: : "memory", "cc");
 }
 
-#define clear_user_page(page, vaddr, pg) clear_page(page)
 #define copy_user_page(to, from, vaddr, pg) copy_page(to, from)
 
 #define vma_alloc_zeroed_movable_folio(vma, vaddr) \
+1
arch/sparc/include/asm/page_64.h
@@ -43,6 +43,7 @@
 #define clear_page(X) _clear_page((void *)(X))
 struct page;
 void clear_user_page(void *addr, unsigned long vaddr, struct page *page);
+#define clear_user_page clear_user_page
 #define copy_page(X,Y) memcpy((void *)(X), (void *)(Y), PAGE_SIZE)
 void copy_user_page(void *to, void *from, unsigned long vaddr, struct page *topage);
 #define __HAVE_ARCH_COPY_USER_HIGHPAGE
-1
arch/um/include/asm/page.h
@@ -26,7 +26,6 @@
 #define clear_page(page) memset((void *)(page), 0, PAGE_SIZE)
 #define copy_page(to,from) memcpy((void *)(to), (void *)(from), PAGE_SIZE)
 
-#define clear_user_page(page, vaddr, pg) clear_page(page)
 #define copy_user_page(to, from, vaddr, pg) copy_page(to, from)
 
 typedef struct { unsigned long pte; } pte_t;
-6
arch/x86/include/asm/page.h
@@ -22,12 +22,6 @@
 extern struct range pfn_mapped[];
 extern int nr_pfn_mapped;
 
-static inline void clear_user_page(void *page, unsigned long vaddr,
-				   struct page *pg)
-{
-	clear_page(page);
-}
-
 static inline void copy_user_page(void *to, void *from, unsigned long vaddr,
 				  struct page *topage)
 {
-1
arch/xtensa/include/asm/page.h
@@ -126,7 +126,6 @@
 void copy_user_highpage(struct page *to, struct page *from,
 			unsigned long vaddr, struct vm_area_struct *vma);
 #else
-# define clear_user_page(page, vaddr, pg) clear_page(page)
 # define copy_user_page(to, from, vaddr, pg) copy_page(to, from)
 #endif
 
+22 -2
include/linux/highmem.h
@@ -197,15 +197,35 @@
 }
 #endif
 
-/* when CONFIG_HIGHMEM is not set these will be plain clear/copy_page */
 #ifndef clear_user_highpage
+#ifndef clear_user_page
+/**
+ * clear_user_page() - clear a page to be mapped to user space
+ * @addr: the address of the page
+ * @vaddr: the address of the user mapping
+ * @page: the page
+ *
+ * We condition the definition of clear_user_page() on the architecture
+ * not having a custom clear_user_highpage(). That's because if there
+ * is some special flushing needed for clear_user_highpage() then it
+ * is likely that clear_user_page() also needs some magic. And, since
+ * our only caller is the generic clear_user_highpage(), not defining
+ * is not much of a loss.
+ */
+static inline void clear_user_page(void *addr, unsigned long vaddr, struct page *page)
+{
+	clear_page(addr);
+}
+#endif
+
+/* when CONFIG_HIGHMEM is not set these will be plain clear/copy_page */
 static inline void clear_user_highpage(struct page *page, unsigned long vaddr)
 {
 	void *addr = kmap_local_page(page);
 	clear_user_page(addr, vaddr, page);
 	kunmap_local(addr);
 }
-#endif
+#endif /* clear_user_highpage */
 
 #ifndef vma_alloc_zeroed_movable_folio
 /**