Merge tag 'mm-hotfixes-stable-2026-03-09-16-36' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git


Pull misc fixes from Andrew Morton:
"15 hotfixes. 6 are cc:stable. 14 are for MM.

Singletons, with one doubleton - please see the changelogs for details"

* tag 'mm-hotfixes-stable-2026-03-09-16-36' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  MAINTAINERS, mailmap: update email address for Lorenzo Stoakes
  mm/mmu_notifier: clean up mmu_notifier.h kernel-doc
  uaccess: correct kernel-doc parameter format
  mm/huge_memory: fix a folio_split() race condition with folio_try_get()
  MAINTAINERS: add co-maintainer and reviewer for SLAB ALLOCATOR
  MAINTAINERS: add RELAY entry
  memcg: fix slab accounting in refill_obj_stock() trylock path
  mm/hugetlb.c: use __pa() instead of virt_to_phys() in early bootmem alloc code
  zram: rename writeback_compressed device attr
  tools/testing: fix testing/vma and testing/radix-tree build
  Revert "ptdesc: remove references to folios from __pagetable_ctor() and pagetable_dtor()"
  mm/cma: move put_page_testzero() out of VM_WARN_ON in cma_release()
  mm/damon/core: clear walk_control on inactive context in damos_walk()
  mm: memfd_luo: always dirty all folios
  mm: memfd_luo: always make all folios uptodate

Linus Torvalds 3 months ago b4f0dd31 1f318b96

+163 -73

18 changed files


.mailmap
Documentation/ABI/testing/sysfs-block-zram
Documentation/admin-guide/blockdev/zram.rst
MAINTAINERS
drivers/block/zram/zram_drv.c
drivers/block/zram/zram_drv.h
include/linux/mm.h
include/linux/mmu_notifier.h
include/linux/uaccess.h
mm/cma.c
mm/damon/core.c
mm/huge_memory.c
mm/hugetlb.c
mm/memcontrol.c
mm/memfd_luo.c
tools/include/linux/gfp.h
tools/include/linux/overflow.h
tools/include/linux/slab.h

+2 -1

.mailmap

···
 Loic Poulain <loic.poulain@oss.qualcomm.com> <loic.poulain@linaro.org>
 Loic Poulain <loic.poulain@oss.qualcomm.com> <loic.poulain@intel.com>
 Lorenzo Pieralisi <lpieralisi@kernel.org> <lorenzo.pieralisi@arm.com>
-Lorenzo Stoakes <lorenzo.stoakes@oracle.com> <lstoakes@gmail.com>
+Lorenzo Stoakes <ljs@kernel.org> <lstoakes@gmail.com>
+Lorenzo Stoakes <ljs@kernel.org> <lorenzo.stoakes@oracle.com>
 Luca Ceresoli <luca.ceresoli@bootlin.com> <luca@lucaceresoli.net>
 Luca Weiss <luca@lucaweiss.eu> <luca@z3ntu.xyz>
 Lucas De Marchi <demarchi@kernel.org> <lucas.demarchi@intel.com>

+2 -2

Documentation/ABI/testing/sysfs-block-zram

···
 		The algorithm_params file is write-only and is used to setup
 		compression algorithm parameters.
 
-What:		/sys/block/zram<id>/writeback_compressed
+What:		/sys/block/zram<id>/compressed_writeback
 Date:		Decemeber 2025
 Contact:	Richard Chang <richardycc@google.com>
 Description:
-		The writeback_compressed device atrribute toggles compressed
+		The compressed_writeback device atrribute toggles compressed
 		writeback feature.
 
 What:		/sys/block/zram<id>/writeback_batch_size

+3 -3

Documentation/admin-guide/blockdev/zram.rst

···
 writeback_limit_enable  RW      show and set writeback_limit feature
 writeback_batch_size    RW      show and set maximum number of in-flight
                                 writeback operations
-writeback_compressed    RW      show and set compressed writeback feature
+compressed_writeback    RW      show and set compressed writeback feature
 comp_algorithm          RW      show and change the compression algorithm
 algorithm_params        WO      setup compression algorithm parameters
 compact                 WO      trigger memory compaction
···
 By default zram stores written back pages in decompressed (raw) form, which
 means that writeback operation involves decompression of the page before
 writing it to the backing device. This behavior can be changed by enabling
-`writeback_compressed` feature, which causes zram to write compressed pages
+`compressed_writeback` feature, which causes zram to write compressed pages
 to the backing device, thus avoiding decompression overhead. To enable
 this feature, execute::
 
-	$ echo yes > /sys/block/zramX/writeback_compressed
+	$ echo yes > /sys/block/zramX/compressed_writeback
 
 Note that this feature should be configured before the `zramX` device is
 initialized.

+22 -11

MAINTAINERS

···
 MEMORY MANAGEMENT - CORE
 M:	Andrew Morton <akpm@linux-foundation.org>
 M:	David Hildenbrand <david@kernel.org>
-R:	Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
+R:	Lorenzo Stoakes <ljs@kernel.org>
 R:	Liam R. Howlett <Liam.Howlett@oracle.com>
 R:	Vlastimil Babka <vbabka@kernel.org>
 R:	Mike Rapoport <rppt@kernel.org>
···
 MEMORY MANAGEMENT - MISC
 M:	Andrew Morton <akpm@linux-foundation.org>
 M:	David Hildenbrand <david@kernel.org>
-R:	Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
+R:	Lorenzo Stoakes <ljs@kernel.org>
 R:	Liam R. Howlett <Liam.Howlett@oracle.com>
 R:	Vlastimil Babka <vbabka@kernel.org>
 R:	Mike Rapoport <rppt@kernel.org>
···
 R:	Michal Hocko <mhocko@kernel.org>
 R:	Qi Zheng <zhengqi.arch@bytedance.com>
 R:	Shakeel Butt <shakeel.butt@linux.dev>
-R:	Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
+R:	Lorenzo Stoakes <ljs@kernel.org>
 L:	linux-mm@kvack.org
 S:	Maintained
 F:	mm/vmscan.c
···
 MEMORY MANAGEMENT - RMAP (REVERSE MAPPING)
 M:	Andrew Morton <akpm@linux-foundation.org>
 M:	David Hildenbrand <david@kernel.org>
-M:	Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
+M:	Lorenzo Stoakes <ljs@kernel.org>
 R:	Rik van Riel <riel@surriel.com>
 R:	Liam R. Howlett <Liam.Howlett@oracle.com>
 R:	Vlastimil Babka <vbabka@kernel.org>
···
 MEMORY MANAGEMENT - THP (TRANSPARENT HUGE PAGE)
 M:	Andrew Morton <akpm@linux-foundation.org>
 M:	David Hildenbrand <david@kernel.org>
-M:	Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
+M:	Lorenzo Stoakes <ljs@kernel.org>
 R:	Zi Yan <ziy@nvidia.com>
 R:	Baolin Wang <baolin.wang@linux.alibaba.com>
 R:	Liam R. Howlett <Liam.Howlett@oracle.com>
···
 
 MEMORY MANAGEMENT - RUST
 M:	Alice Ryhl <aliceryhl@google.com>
-R:	Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
+R:	Lorenzo Stoakes <ljs@kernel.org>
 R:	Liam R. Howlett <Liam.Howlett@oracle.com>
 L:	linux-mm@kvack.org
 L:	rust-for-linux@vger.kernel.org
···
 MEMORY MAPPING
 M:	Andrew Morton <akpm@linux-foundation.org>
 M:	Liam R. Howlett <Liam.Howlett@oracle.com>
-M:	Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
+M:	Lorenzo Stoakes <ljs@kernel.org>
 R:	Vlastimil Babka <vbabka@kernel.org>
 R:	Jann Horn <jannh@google.com>
 R:	Pedro Falcato <pfalcato@suse.de>
···
 M:	Andrew Morton <akpm@linux-foundation.org>
 M:	Suren Baghdasaryan <surenb@google.com>
 M:	Liam R. Howlett <Liam.Howlett@oracle.com>
-M:	Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
+M:	Lorenzo Stoakes <ljs@kernel.org>
 R:	Vlastimil Babka <vbabka@kernel.org>
 R:	Shakeel Butt <shakeel.butt@linux.dev>
 L:	linux-mm@kvack.org
···
 MEMORY MAPPING - MADVISE (MEMORY ADVICE)
 M:	Andrew Morton <akpm@linux-foundation.org>
 M:	Liam R. Howlett <Liam.Howlett@oracle.com>
-M:	Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
+M:	Lorenzo Stoakes <ljs@kernel.org>
 M:	David Hildenbrand <david@kernel.org>
 R:	Vlastimil Babka <vbabka@kernel.org>
 R:	Jann Horn <jannh@google.com>
···
 S:	Orphan
 F:	drivers/net/wireless/rsi/
 
+RELAY
+M:	Andrew Morton <akpm@linux-foundation.org>
+M:	Jens Axboe <axboe@kernel.dk>
+M:	Jason Xing <kernelxing@tencent.com>
+L:	linux-kernel@vger.kernel.org
+S:	Maintained
+F:	Documentation/filesystems/relay.rst
+F:	include/linux/relay.h
+F:	kernel/relay.c
+
 REGISTER MAP ABSTRACTION
 M:	Mark Brown <broonie@kernel.org>
 L:	linux-kernel@vger.kernel.org
···
 
 RUST [ALLOC]
 M:	Danilo Krummrich <dakr@kernel.org>
-R:	Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
+R:	Lorenzo Stoakes <ljs@kernel.org>
 R:	Vlastimil Babka <vbabka@kernel.org>
 R:	Liam R. Howlett <Liam.Howlett@oracle.com>
 R:	Uladzislau Rezki <urezki@gmail.com>
···
 
 SLAB ALLOCATOR
 M:	Vlastimil Babka <vbabka@kernel.org>
+M:	Harry Yoo <harry.yoo@oracle.com>
 M:	Andrew Morton <akpm@linux-foundation.org>
+R:	Hao Li <hao.li@linux.dev>
 R:	Christoph Lameter <cl@gentwo.org>
 R:	David Rientjes <rientjes@google.com>
 R:	Roman Gushchin <roman.gushchin@linux.dev>
-R:	Harry Yoo <harry.yoo@oracle.com>
 L:	linux-mm@kvack.org
 S:	Maintained
 T:	git git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab.git

+12 -12

drivers/block/zram/zram_drv.c

···
 	return ret;
 }
 
-static ssize_t writeback_compressed_store(struct device *dev,
+static ssize_t compressed_writeback_store(struct device *dev,
 					  struct device_attribute *attr,
 					  const char *buf, size_t len)
 {
···
 		return -EBUSY;
 	}
 
-	zram->wb_compressed = val;
+	zram->compressed_wb = val;
 
 	return len;
 }
 
-static ssize_t writeback_compressed_show(struct device *dev,
+static ssize_t compressed_writeback_show(struct device *dev,
 					 struct device_attribute *attr,
 					 char *buf)
 {
···
 	struct zram *zram = dev_to_zram(dev);
 
 	guard(rwsem_read)(&zram->dev_lock);
-	val = zram->wb_compressed;
+	val = zram->compressed_wb;
 
 	return sysfs_emit(buf, "%d\n", val);
 }
···
 		goto out;
 	}
 
-	if (zram->wb_compressed) {
+	if (zram->compressed_wb) {
 		/*
 		 * ZRAM_WB slots get freed, we need to preserve data required
 		 * for read decompression.
···
 	set_slot_flag(zram, index, ZRAM_WB);
 	set_slot_handle(zram, index, req->blk_idx);
 
-	if (zram->wb_compressed) {
+	if (zram->compressed_wb) {
 		if (huge)
 			set_slot_flag(zram, index, ZRAM_HUGE);
 		set_slot_size(zram, index, size);
···
 		 */
 		if (!test_slot_flag(zram, index, ZRAM_PP_SLOT))
 			goto next;
-		if (zram->wb_compressed)
+		if (zram->compressed_wb)
 			err = read_from_zspool_raw(zram, req->page, index);
 		else
 			err = read_from_zspool(zram, req->page, index);
···
 	 *
 	 * Keep the existing behavior for now.
 	 */
-	if (zram->wb_compressed == false) {
+	if (zram->compressed_wb == false) {
 		/* No decompression needed, complete the parent IO */
 		bio_endio(req->parent);
 		bio_put(bio);
···
 	flush_work(&req.work);
 	destroy_work_on_stack(&req.work);
 
-	if (req.error || zram->wb_compressed == false)
+	if (req.error || zram->compressed_wb == false)
 		return req.error;
 
 	return decompress_bdev_page(zram, page, index);
···
 static DEVICE_ATTR_RW(writeback_limit);
 static DEVICE_ATTR_RW(writeback_limit_enable);
 static DEVICE_ATTR_RW(writeback_batch_size);
-static DEVICE_ATTR_RW(writeback_compressed);
+static DEVICE_ATTR_RW(compressed_writeback);
 #endif
 #ifdef CONFIG_ZRAM_MULTI_COMP
 static DEVICE_ATTR_RW(recomp_algorithm);
···
 	&dev_attr_writeback_limit.attr,
 	&dev_attr_writeback_limit_enable.attr,
 	&dev_attr_writeback_batch_size.attr,
-	&dev_attr_writeback_compressed.attr,
+	&dev_attr_compressed_writeback.attr,
 #endif
 	&dev_attr_io_stat.attr,
 	&dev_attr_mm_stat.attr,
···
 	init_rwsem(&zram->dev_lock);
 #ifdef CONFIG_ZRAM_WRITEBACK
 	zram->wb_batch_size = 32;
-	zram->wb_compressed = false;
+	zram->compressed_wb = false;
 #endif
 
 	/* gendisk structure */

+1 -1

drivers/block/zram/zram_drv.h

···
 #ifdef CONFIG_ZRAM_WRITEBACK
 	struct file *backing_dev;
 	bool wb_limit_enable;
-	bool wb_compressed;
+	bool compressed_wb;
 	u32 wb_batch_size;
 	u64 bd_wb_limit;
 	struct block_device *bdev;

+6 -11

include/linux/mm.h

···
 static inline void ptlock_free(struct ptdesc *ptdesc) {}
 #endif /* defined(CONFIG_SPLIT_PTE_PTLOCKS) */
 
-static inline unsigned long ptdesc_nr_pages(const struct ptdesc *ptdesc)
-{
-	return compound_nr(ptdesc_page(ptdesc));
-}
-
 static inline void __pagetable_ctor(struct ptdesc *ptdesc)
 {
-	pg_data_t *pgdat = NODE_DATA(memdesc_nid(ptdesc->pt_flags));
+	struct folio *folio = ptdesc_folio(ptdesc);
 
-	__SetPageTable(ptdesc_page(ptdesc));
-	mod_node_page_state(pgdat, NR_PAGETABLE, ptdesc_nr_pages(ptdesc));
+	__folio_set_pgtable(folio);
+	lruvec_stat_add_folio(folio, NR_PAGETABLE);
 }
 
 static inline void pagetable_dtor(struct ptdesc *ptdesc)
 {
-	pg_data_t *pgdat = NODE_DATA(memdesc_nid(ptdesc->pt_flags));
+	struct folio *folio = ptdesc_folio(ptdesc);
 
 	ptlock_free(ptdesc);
-	__ClearPageTable(ptdesc_page(ptdesc));
-	mod_node_page_state(pgdat, NR_PAGETABLE, -ptdesc_nr_pages(ptdesc));
+	__folio_clear_pgtable(folio);
+	lruvec_stat_sub_folio(folio, NR_PAGETABLE);
 }
 
 static inline void pagetable_dtor_free(struct ptdesc *ptdesc)

+16 -15

include/linux/mmu_notifier.h

···
 };
 
 /**
- * struct mmu_interval_notifier_ops
+ * struct mmu_interval_notifier_ops - callback for range notification
  * @invalidate: Upon return the caller must stop using any SPTEs within this
  *              range. This function can sleep. Return false only if sleeping
  *              was required but mmu_notifier_range_blockable(range) is false.
···
 
 /**
  * mmu_interval_set_seq - Save the invalidation sequence
- * @interval_sub - The subscription passed to invalidate
- * @cur_seq - The cur_seq passed to the invalidate() callback
+ * @interval_sub: The subscription passed to invalidate
+ * @cur_seq: The cur_seq passed to the invalidate() callback
  *
  * This must be called unconditionally from the invalidate callback of a
  * struct mmu_interval_notifier_ops under the same lock that is used to call
···
 
 /**
  * mmu_interval_read_retry - End a read side critical section against a VA range
- * interval_sub: The subscription
- * seq: The return of the paired mmu_interval_read_begin()
+ * @interval_sub: The subscription
+ * @seq: The return of the paired mmu_interval_read_begin()
  *
  * This MUST be called under a user provided lock that is also held
  * unconditionally by op->invalidate() when it calls mmu_interval_set_seq().
···
  * Each call should be paired with a single mmu_interval_read_begin() and
  * should be used to conclude the read side.
  *
- * Returns true if an invalidation collided with this critical section, and
+ * Returns: true if an invalidation collided with this critical section, and
  * the caller should retry.
  */
 static inline bool
···
 
 /**
  * mmu_interval_check_retry - Test if a collision has occurred
- * interval_sub: The subscription
- * seq: The return of the matching mmu_interval_read_begin()
+ * @interval_sub: The subscription
+ * @seq: The return of the matching mmu_interval_read_begin()
  *
  * This can be used in the critical section between mmu_interval_read_begin()
- * and mmu_interval_read_retry(). A return of true indicates an invalidation
- * has collided with this critical region and a future
- * mmu_interval_read_retry() will return true.
- *
- * False is not reliable and only suggests a collision may not have
- * occurred. It can be called many times and does not have to hold the user
- * provided lock.
+ * and mmu_interval_read_retry().
  *
  * This call can be used as part of loops and other expensive operations to
  * expedite a retry.
+ * It can be called many times and does not have to hold the user
+ * provided lock.
+ *
+ * Returns: true indicates an invalidation has collided with this critical
+ * region and a future mmu_interval_read_retry() will return true.
+ * False is not reliable and only suggests a collision may not have
+ * occurred.
  */
 static inline bool
 mmu_interval_check_retry(struct mmu_interval_notifier *interval_sub,

+2 -2

include/linux/uaccess.h

···
 
 /**
  * scoped_user_rw_access_size - Start a scoped user read/write access with given size
- * @uptr	Pointer to the user space address to read from and write to
+ * @uptr:	Pointer to the user space address to read from and write to
  * @size:	Size of the access starting from @uptr
  * @elbl:	Error label to goto when the access region is rejected
  *
···
 
 /**
  * scoped_user_rw_access - Start a scoped user read/write access
- * @uptr	Pointer to the user space address to read from and write to
+ * @uptr:	Pointer to the user space address to read from and write to
  * @elbl:	Error label to goto when the access region is rejected
  *
  * The size of the access starting from @uptr is determined via sizeof(*@uptr)).

+4 -1

mm/cma.c

···
 			  unsigned long count)
 {
 	struct cma_memrange *cmr;
+	unsigned long ret = 0;
 	unsigned long i, pfn;
 
 	cmr = find_cma_memrange(cma, pages, count);
···
 
 	pfn = page_to_pfn(pages);
 	for (i = 0; i < count; i++, pfn++)
-		VM_WARN_ON(!put_page_testzero(pfn_to_page(pfn)));
+		ret += !put_page_testzero(pfn_to_page(pfn));
+
+	WARN(ret, "%lu pages are still in use!\n", ret);
 
 	__cma_release_frozen(cma, cmr, pages, count);
 
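
The mm/cma.c change moves a call with a side effect out of a debug-only assertion. A minimal userspace sketch of why that matters follows. It assumes that a debug-only warning macro does not evaluate its argument when debugging is compiled out. DEBUG_WARN_ON, DEBUG_CHECKS, drop_ref() and the release_*() helpers are hypothetical names for illustration, not kernel APIs.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for a debug-only assertion. With DEBUG_CHECKS
 * undefined, the condition is only an operand of sizeof(), so it is
 * type-checked but never evaluated (an assumption about how such macros
 * behave in non-debug builds).
 */
#ifdef DEBUG_CHECKS
#define DEBUG_WARN_ON(cond) ((void)(cond))
#else
#define DEBUG_WARN_ON(cond) ((void)sizeof((long)(cond)))
#endif

/* Hypothetical refcount drop: returns 1 when the count reaches zero. */
static int drop_ref(int *refcount)
{
	return --(*refcount) == 0;
}

/* Problem pattern: the refcount drop lives inside the debug-only macro. */
static void release_inside_assert(int *refs, size_t n)
{
	for (size_t i = 0; i < n; i++)
		DEBUG_WARN_ON(!drop_ref(&refs[i]));
}

/* Fixed pattern: always drop, count the failures, report once afterwards. */
static unsigned long release_outside_assert(int *refs, size_t n)
{
	unsigned long still_used = 0;

	for (size_t i = 0; i < n; i++)
		still_used += !drop_ref(&refs[i]);
	return still_used;
}

/* Small self-check contrasting the two patterns. */
static void cma_sketch_demo(void)
{
	int a[2] = { 1, 2 };
	int b[2] = { 1, 2 };

	release_inside_assert(a, 2);	/* refcounts untouched without DEBUG_CHECKS */
	assert(a[0] == 1 && a[1] == 2);

	/* b[0] drops to zero; b[1] is still referenced and gets counted. */
	assert(release_outside_assert(b, 2) == 1);
}
```

In the sketch, the first helper leaves every refcount unchanged in a non-debug build, while the second always performs the drop and returns a count that a single warning can report, which mirrors the shape of the patched loop.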

+6 -1

mm/damon/core.c

···
 	}
 	ctx->walk_control = control;
 	mutex_unlock(&ctx->walk_control_lock);
-	if (!damon_is_running(ctx))
+	if (!damon_is_running(ctx)) {
+		mutex_lock(&ctx->walk_control_lock);
+		if (ctx->walk_control == control)
+			ctx->walk_control = NULL;
+		mutex_unlock(&ctx->walk_control_lock);
 		return -EINVAL;
+	}
 	wait_for_completion(&control->completion);
 	if (control->canceled)
 		return -ECANCELED;

+9 -4

mm/huge_memory.c

···
 	const bool is_anon = folio_test_anon(folio);
 	int old_order = folio_order(folio);
 	int start_order = split_type == SPLIT_TYPE_UNIFORM ? new_order : old_order - 1;
+	struct folio *old_folio = folio;
 	int split_order;
 
 	/*
···
 			 * uniform split has xas_split_alloc() called before
 			 * irq is disabled to allocate enough memory, whereas
 			 * non-uniform split can handle ENOMEM.
+			 * Use the to-be-split folio, so that a parallel
+			 * folio_try_get() waits on it until xarray is updated
+			 * with after-split folios and the original one is
+			 * unfrozen.
 			 */
-			if (split_type == SPLIT_TYPE_UNIFORM)
-				xas_split(xas, folio, old_order);
-			else {
+			if (split_type == SPLIT_TYPE_UNIFORM) {
+				xas_split(xas, old_folio, old_order);
+			} else {
 				xas_set_order(xas, folio->index, split_order);
-				xas_try_split(xas, folio, old_order);
+				xas_try_split(xas, old_folio, old_order);
 				if (xas_error(xas))
 					return xas_error(xas);
 			}

+2 -2

mm/hugetlb.c

···
 		 * extract the actual node first.
 		 */
 		if (m)
-			listnode = early_pfn_to_nid(PHYS_PFN(virt_to_phys(m)));
+			listnode = early_pfn_to_nid(PHYS_PFN(__pa(m)));
 	}
 
 	if (m) {
···
 	 * The head struct page is used to get folio information by the HugeTLB
 	 * subsystem like zone id and node id.
 	 */
-	memblock_reserved_mark_noinit(virt_to_phys((void *)m + PAGE_SIZE),
+	memblock_reserved_mark_noinit(__pa((void *)m + PAGE_SIZE),
 		huge_page_size(h) - PAGE_SIZE);
 
 	return 1;

+1 -1

mm/memcontrol.c

···
 
 	if (!local_trylock(&obj_stock.lock)) {
 		if (pgdat)
-			mod_objcg_mlstate(objcg, pgdat, idx, nr_bytes);
+			mod_objcg_mlstate(objcg, pgdat, idx, nr_acct);
 		nr_pages = nr_bytes >> PAGE_SHIFT;
 		nr_bytes = nr_bytes & (PAGE_SIZE - 1);
 		atomic_add(nr_bytes, &objcg->nr_charged_bytes);

+43 -6

mm/memfd_luo.c

···
 	for (i = 0; i < nr_folios; i++) {
 		struct memfd_luo_folio_ser *pfolio = &folios_ser[i];
 		struct folio *folio = folios[i];
-		unsigned int flags = 0;
 
 		err = kho_preserve_folio(folio);
 		if (err)
 			goto err_unpreserve;
 
-		if (folio_test_dirty(folio))
-			flags |= MEMFD_LUO_FOLIO_DIRTY;
-		if (folio_test_uptodate(folio))
-			flags |= MEMFD_LUO_FOLIO_UPTODATE;
+		folio_lock(folio);
+
+		/*
+		 * A dirty folio is one which has been written to. A clean folio
+		 * is its opposite. Since a clean folio does not carry user
+		 * data, it can be freed by page reclaim under memory pressure.
+		 *
+		 * Saving the dirty flag at prepare() time doesn't work since it
+		 * can change later. Saving it at freeze() also won't work
+		 * because the dirty bit is normally synced at unmap and there
+		 * might still be a mapping of the file at freeze().
+		 *
+		 * To see why this is a problem, say a folio is clean at
+		 * preserve, but gets dirtied later. The pfolio flags will mark
+		 * it as clean. After retrieve, the next kernel might try to
+		 * reclaim this folio under memory pressure, losing user data.
+		 *
+		 * Unconditionally mark it dirty to avoid this problem. This
+		 * comes at the cost of making clean folios un-reclaimable after
+		 * live update.
+		 */
+		folio_mark_dirty(folio);
+
+		/*
+		 * If the folio is not uptodate, it was fallocated but never
+		 * used. Saving this flag at prepare() doesn't work since it
+		 * might change later when someone uses the folio.
+		 *
+		 * Since we have taken the performance penalty of allocating,
+		 * zeroing, and pinning all the folios in the holes, take a bit
+		 * more and zero all non-uptodate folios too.
+		 *
+		 * NOTE: For someone looking to improve preserve performance,
+		 * this is a good place to look.
+		 */
+		if (!folio_test_uptodate(folio)) {
+			folio_zero_range(folio, 0, folio_size(folio));
+			flush_dcache_folio(folio);
+			folio_mark_uptodate(folio);
+		}
+
+		folio_unlock(folio);
 
 		pfolio->pfn = folio_pfn(folio);
-		pfolio->flags = flags;
+		pfolio->flags = MEMFD_LUO_FOLIO_DIRTY | MEMFD_LUO_FOLIO_UPTODATE;
 		pfolio->index = folio->index;
 	}
 

tools/include/linux/gfp.h

···
 #include <linux/types.h>
 #include <linux/gfp_types.h>
 
+/* Helper macro to avoid gfp flags if they are the default one */
+#define __default_gfp(a,...) a
+#define default_gfp(...) __default_gfp(__VA_ARGS__ __VA_OPT__(,) GFP_KERNEL)
+
 static inline bool gfpflags_allow_blocking(const gfp_t gfp_flags)
 {
 	return !!(gfp_flags & __GFP_DIRECT_RECLAIM);
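
The default_gfp() helper added above picks the caller's flags when given and falls back to GFP_KERNEL otherwise. The same preprocessor technique can be sketched in plain userspace C. Everything named here (FLAG_DEFAULT, FLAG_ATOMIC, first_arg, flags_or_default, alloc_flags_seen, alloc_thing) is hypothetical and chosen only for illustration. The sketch assumes a compiler that supports __VA_OPT__ (C23, or recent GCC or Clang).

```c
#include <assert.h>

/* Hypothetical flag values for illustration; not the kernel's gfp_t bits. */
#define FLAG_DEFAULT 0x1u
#define FLAG_ATOMIC  0x2u

/* Keep only the first of the arguments passed in. */
#define first_arg(a, ...) a

/*
 * Expands to the caller's argument when one is given, else to FLAG_DEFAULT.
 * __VA_OPT__(,) emits the comma only when __VA_ARGS__ is non-empty, so the
 * default becomes either the second argument (ignored) or the only one.
 */
#define flags_or_default(...) \
	first_arg(__VA_ARGS__ __VA_OPT__(,) FLAG_DEFAULT)

/* Hypothetical allocator front end that just reports the flags it was given. */
static unsigned int alloc_flags_seen(unsigned int flags)
{
	return flags;
}

/* Callers may omit the flags argument entirely. */
#define alloc_thing(...) alloc_flags_seen(flags_or_default(__VA_ARGS__))

/* Small self-check of both call forms. */
static void default_flags_demo(void)
{
	assert(alloc_thing() == FLAG_DEFAULT);
	assert(alloc_thing(FLAG_ATOMIC) == FLAG_ATOMIC);
}
```

The appeal of this design is that one macro serves both call forms, so callers that are happy with the default do not have to spell it out.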

+19

tools/include/linux/overflow.h

···
 })
 
 /**
+ * size_mul() - Calculate size_t multiplication with saturation at SIZE_MAX
+ * @factor1: first factor
+ * @factor2: second factor
+ *
+ * Returns: calculate @factor1 * @factor2, both promoted to size_t,
+ * with any overflow causing the return value to be SIZE_MAX. The
+ * lvalue must be size_t to avoid implicit type conversion.
+ */
+static inline size_t __must_check size_mul(size_t factor1, size_t factor2)
+{
+	size_t bytes;
+
+	if (check_mul_overflow(factor1, factor2, &bytes))
+		return SIZE_MAX;
+
+	return bytes;
+}
+
+/**
  * array_size() - Calculate size of 2-dimensional array.
  *
  * @a: dimension one
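
The saturating behaviour described in the size_mul() kernel-doc can be tried outside the kernel. The sketch below is a userspace analogue, not the tools/ header itself: sat_size_mul() is a hypothetical name, and the GCC/Clang builtin __builtin_mul_overflow() is assumed as a stand-in for check_mul_overflow().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Userspace sketch of a saturating size_t multiply. On overflow the result
 * is pinned to SIZE_MAX instead of wrapping around to a small value.
 */
static inline size_t sat_size_mul(size_t factor1, size_t factor2)
{
	size_t bytes;

	/* The builtin returns true when the product does not fit in size_t. */
	if (__builtin_mul_overflow(factor1, factor2, &bytes))
		return SIZE_MAX;

	return bytes;
}

/* Small self-check of the normal, zero, and overflow cases. */
static void sat_size_mul_demo(void)
{
	assert(sat_size_mul(6, 7) == 42);
	assert(sat_size_mul(0, SIZE_MAX) == 0);
	assert(sat_size_mul(SIZE_MAX, 2) == SIZE_MAX);
}
```

Saturating rather than wrapping means a size computed from untrusted counts ends up as an impossibly large request, which an allocator can be expected to reject, rather than as a small value that silently under-allocates.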

tools/include/linux/slab.h

···
 	return sheaf->size;
 }
 
+#define __alloc_objs(KMALLOC, GFP, TYPE, COUNT) \
+({ \
+	const size_t __obj_size = size_mul(sizeof(TYPE), COUNT); \
+	(TYPE *)KMALLOC(__obj_size, GFP); \
+})
+
+#define kzalloc_obj(P, ...) \
+	__alloc_objs(kzalloc, default_gfp(__VA_ARGS__), typeof(P), 1)
+
 #endif /* _TOOLS_SLAB_H */