Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
kernel os linux

Merge tag 'mm-hotfixes-stable-2026-03-09-16-36' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull misc fixes from Andrew Morton:
"15 hotfixes. 6 are cc:stable. 14 are for MM.

Singletons, with one doubleton - please see the changelogs for details"

* tag 'mm-hotfixes-stable-2026-03-09-16-36' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  MAINTAINERS, mailmap: update email address for Lorenzo Stoakes
  mm/mmu_notifier: clean up mmu_notifier.h kernel-doc
  uaccess: correct kernel-doc parameter format
  mm/huge_memory: fix a folio_split() race condition with folio_try_get()
  MAINTAINERS: add co-maintainer and reviewer for SLAB ALLOCATOR
  MAINTAINERS: add RELAY entry
  memcg: fix slab accounting in refill_obj_stock() trylock path
  mm/hugetlb.c: use __pa() instead of virt_to_phys() in early bootmem alloc code
  zram: rename writeback_compressed device attr
  tools/testing: fix testing/vma and testing/radix-tree build
  Revert "ptdesc: remove references to folios from __pagetable_ctor() and pagetable_dtor()"
  mm/cma: move put_page_testzero() out of VM_WARN_ON in cma_release()
  mm/damon/core: clear walk_control on inactive context in damos_walk()
  mm: memfd_luo: always dirty all folios
  mm: memfd_luo: always make all folios uptodate

+163 -73
+2 -1
.mailmap
@@ -498,7 +498,8 @@
 Loic Poulain <loic.poulain@oss.qualcomm.com> <loic.poulain@linaro.org>
 Loic Poulain <loic.poulain@oss.qualcomm.com> <loic.poulain@intel.com>
 Lorenzo Pieralisi <lpieralisi@kernel.org> <lorenzo.pieralisi@arm.com>
-Lorenzo Stoakes <lorenzo.stoakes@oracle.com> <lstoakes@gmail.com>
+Lorenzo Stoakes <ljs@kernel.org> <lstoakes@gmail.com>
+Lorenzo Stoakes <ljs@kernel.org> <lorenzo.stoakes@oracle.com>
 Luca Ceresoli <luca.ceresoli@bootlin.com> <luca@lucaceresoli.net>
 Luca Weiss <luca@lucaweiss.eu> <luca@z3ntu.xyz>
 Lucas De Marchi <demarchi@kernel.org> <lucas.demarchi@intel.com>
+2 -2
Documentation/ABI/testing/sysfs-block-zram
@@ -151,11 +151,11 @@
 		The algorithm_params file is write-only and is used to setup
 		compression algorithm parameters.

-What:		/sys/block/zram<id>/writeback_compressed
+What:		/sys/block/zram<id>/compressed_writeback
 Date:		Decemeber 2025
 Contact:	Richard Chang <richardycc@google.com>
 Description:
-		The writeback_compressed device atrribute toggles compressed
+		The compressed_writeback device atrribute toggles compressed
 		writeback feature.

 What:		/sys/block/zram<id>/writeback_batch_size
+3 -3
Documentation/admin-guide/blockdev/zram.rst
@@ -216,7 +216,7 @@
 writeback_limit_enable  RW      show and set writeback_limit feature
 writeback_batch_size    RW      show and set maximum number of in-flight
                                 writeback operations
-writeback_compressed    RW      show and set compressed writeback feature
+compressed_writeback    RW      show and set compressed writeback feature
 comp_algorithm          RW      show and change the compression algorithm
 algorithm_params        WO      setup compression algorithm parameters
 compact                 WO      trigger memory compaction
@@ -439,11 +439,11 @@
 By default zram stores written back pages in decompressed (raw) form, which
 means that writeback operation involves decompression of the page before
 writing it to the backing device. This behavior can be changed by enabling
-`writeback_compressed` feature, which causes zram to write compressed pages
+`compressed_writeback` feature, which causes zram to write compressed pages
 to the backing device, thus avoiding decompression overhead. To enable
 this feature, execute::

-	$ echo yes > /sys/block/zramX/writeback_compressed
+	$ echo yes > /sys/block/zramX/compressed_writeback

 Note that this feature should be configured before the `zramX` device is
 initialized.
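For tooling that sets the attribute programmatically rather than through `echo`, the rename only changes the last path component. The following is a minimal userspace sketch, not part of the patch: the helper names `zram_attr_path()` and `zram_write_attr()` are hypothetical, and only the path layout `/sys/block/zramX/compressed_writeback` and the value `yes` come from the documentation hunk above.

```c
#include <stdio.h>
#include <string.h>

/*
 * Build "/sys/block/zram<id>/<attr>" into buf.
 * Returns 0 on success, -1 if the path does not fit (hypothetical helper).
 */
int zram_attr_path(char *buf, size_t len, unsigned int id, const char *attr)
{
	int n = snprintf(buf, len, "/sys/block/zram%u/%s", id, attr);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Write a value such as "yes" to a zram attribute (hypothetical helper).
 * Per the documentation, compressed_writeback has to be set before the
 * device is initialized.
 */
int zram_write_attr(unsigned int id, const char *attr, const char *value)
{
	char path[128];
	FILE *f;
	int ret = 0;

	if (zram_attr_path(path, sizeof(path), id, attr))
		return -1;
	f = fopen(path, "w");
	if (!f)
		return -1;
	if (fputs(value, f) == EOF)
		ret = -1;
	if (fclose(f) == EOF)
		ret = -1;
	return ret;
}
```

A caller would use `zram_write_attr(0, "compressed_writeback", "yes")`; scripts that still use the old name `writeback_compressed` would presumably fail to open the file after this change.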
+22 -11
MAINTAINERS
@@ -16643,7 +16643,7 @@
 MEMORY MANAGEMENT - CORE
 M:	Andrew Morton <akpm@linux-foundation.org>
 M:	David Hildenbrand <david@kernel.org>
-R:	Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
+R:	Lorenzo Stoakes <ljs@kernel.org>
 R:	Liam R. Howlett <Liam.Howlett@oracle.com>
 R:	Vlastimil Babka <vbabka@kernel.org>
 R:	Mike Rapoport <rppt@kernel.org>
@@ -16773,7 +16773,7 @@
 MEMORY MANAGEMENT - MISC
 M:	Andrew Morton <akpm@linux-foundation.org>
 M:	David Hildenbrand <david@kernel.org>
-R:	Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
+R:	Lorenzo Stoakes <ljs@kernel.org>
 R:	Liam R. Howlett <Liam.Howlett@oracle.com>
 R:	Vlastimil Babka <vbabka@kernel.org>
 R:	Mike Rapoport <rppt@kernel.org>
@@ -16864,7 +16864,7 @@
 R:	Michal Hocko <mhocko@kernel.org>
 R:	Qi Zheng <zhengqi.arch@bytedance.com>
 R:	Shakeel Butt <shakeel.butt@linux.dev>
-R:	Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
+R:	Lorenzo Stoakes <ljs@kernel.org>
 L:	linux-mm@kvack.org
 S:	Maintained
 F:	mm/vmscan.c
@@ -16873,7 +16873,7 @@
 MEMORY MANAGEMENT - RMAP (REVERSE MAPPING)
 M:	Andrew Morton <akpm@linux-foundation.org>
 M:	David Hildenbrand <david@kernel.org>
-M:	Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
+M:	Lorenzo Stoakes <ljs@kernel.org>
 R:	Rik van Riel <riel@surriel.com>
 R:	Liam R. Howlett <Liam.Howlett@oracle.com>
 R:	Vlastimil Babka <vbabka@kernel.org>
@@ -16918,7 +16918,7 @@
 MEMORY MANAGEMENT - THP (TRANSPARENT HUGE PAGE)
 M:	Andrew Morton <akpm@linux-foundation.org>
 M:	David Hildenbrand <david@kernel.org>
-M:	Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
+M:	Lorenzo Stoakes <ljs@kernel.org>
 R:	Zi Yan <ziy@nvidia.com>
 R:	Baolin Wang <baolin.wang@linux.alibaba.com>
 R:	Liam R. Howlett <Liam.Howlett@oracle.com>
@@ -16958,7 +16958,7 @@

 MEMORY MANAGEMENT - RUST
 M:	Alice Ryhl <aliceryhl@google.com>
-R:	Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
+R:	Lorenzo Stoakes <ljs@kernel.org>
 R:	Liam R. Howlett <Liam.Howlett@oracle.com>
 L:	linux-mm@kvack.org
 L:	rust-for-linux@vger.kernel.org
@@ -16974,7 +16974,7 @@
 MEMORY MAPPING
 M:	Andrew Morton <akpm@linux-foundation.org>
 M:	Liam R. Howlett <Liam.Howlett@oracle.com>
-M:	Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
+M:	Lorenzo Stoakes <ljs@kernel.org>
 R:	Vlastimil Babka <vbabka@kernel.org>
 R:	Jann Horn <jannh@google.com>
 R:	Pedro Falcato <pfalcato@suse.de>
@@ -17004,7 +17004,7 @@
 M:	Andrew Morton <akpm@linux-foundation.org>
 M:	Suren Baghdasaryan <surenb@google.com>
 M:	Liam R. Howlett <Liam.Howlett@oracle.com>
-M:	Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
+M:	Lorenzo Stoakes <ljs@kernel.org>
 R:	Vlastimil Babka <vbabka@kernel.org>
 R:	Shakeel Butt <shakeel.butt@linux.dev>
 L:	linux-mm@kvack.org
@@ -17019,7 +17019,7 @@
 MEMORY MAPPING - MADVISE (MEMORY ADVICE)
 M:	Andrew Morton <akpm@linux-foundation.org>
 M:	Liam R. Howlett <Liam.Howlett@oracle.com>
-M:	Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
+M:	Lorenzo Stoakes <ljs@kernel.org>
 M:	David Hildenbrand <david@kernel.org>
 R:	Vlastimil Babka <vbabka@kernel.org>
 R:	Jann Horn <jannh@google.com>
@@ -22267,6 +22267,16 @@
 S:	Orphan
 F:	drivers/net/wireless/rsi/

+RELAY
+M:	Andrew Morton <akpm@linux-foundation.org>
+M:	Jens Axboe <axboe@kernel.dk>
+M:	Jason Xing <kernelxing@tencent.com>
+L:	linux-kernel@vger.kernel.org
+S:	Maintained
+F:	Documentation/filesystems/relay.rst
+F:	include/linux/relay.h
+F:	kernel/relay.c
+
 REGISTER MAP ABSTRACTION
 M:	Mark Brown <broonie@kernel.org>
 L:	linux-kernel@vger.kernel.org
@@ -23166,7 +23156,7 @@

 RUST [ALLOC]
 M:	Danilo Krummrich <dakr@kernel.org>
-R:	Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
+R:	Lorenzo Stoakes <ljs@kernel.org>
 R:	Vlastimil Babka <vbabka@kernel.org>
 R:	Liam R. Howlett <Liam.Howlett@oracle.com>
 R:	Uladzislau Rezki <urezki@gmail.com>
@@ -24343,11 +24333,12 @@

 SLAB ALLOCATOR
 M:	Vlastimil Babka <vbabka@kernel.org>
+M:	Harry Yoo <harry.yoo@oracle.com>
 M:	Andrew Morton <akpm@linux-foundation.org>
+R:	Hao Li <hao.li@linux.dev>
 R:	Christoph Lameter <cl@gentwo.org>
 R:	David Rientjes <rientjes@google.com>
 R:	Roman Gushchin <roman.gushchin@linux.dev>
-R:	Harry Yoo <harry.yoo@oracle.com>
 L:	linux-mm@kvack.org
 S:	Maintained
 T:	git git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab.git
+12 -12
drivers/block/zram/zram_drv.c
@@ -549,7 +549,7 @@
 	return ret;
 }

-static ssize_t writeback_compressed_store(struct device *dev,
+static ssize_t compressed_writeback_store(struct device *dev,
 					  struct device_attribute *attr,
 					  const char *buf, size_t len)
 {
@@ -564,12 +564,12 @@
 		return -EBUSY;
 	}

-	zram->wb_compressed = val;
+	zram->compressed_wb = val;

 	return len;
 }

-static ssize_t writeback_compressed_show(struct device *dev,
+static ssize_t compressed_writeback_show(struct device *dev,
 					 struct device_attribute *attr,
 					 char *buf)
 {
@@ -577,7 +577,7 @@
 	struct zram *zram = dev_to_zram(dev);

 	guard(rwsem_read)(&zram->dev_lock);
-	val = zram->wb_compressed;
+	val = zram->compressed_wb;

 	return sysfs_emit(buf, "%d\n", val);
 }
@@ -946,7 +946,7 @@
 		goto out;
 	}

-	if (zram->wb_compressed) {
+	if (zram->compressed_wb) {
 		/*
 		 * ZRAM_WB slots get freed, we need to preserve data required
 		 * for read decompression.
@@ -960,7 +960,7 @@
 	set_slot_flag(zram, index, ZRAM_WB);
 	set_slot_handle(zram, index, req->blk_idx);

-	if (zram->wb_compressed) {
+	if (zram->compressed_wb) {
 		if (huge)
 			set_slot_flag(zram, index, ZRAM_HUGE);
 		set_slot_size(zram, index, size);
@@ -1100,7 +1100,7 @@
 		 */
 		if (!test_slot_flag(zram, index, ZRAM_PP_SLOT))
 			goto next;
-		if (zram->wb_compressed)
+		if (zram->compressed_wb)
 			err = read_from_zspool_raw(zram, req->page, index);
 		else
 			err = read_from_zspool(zram, req->page, index);
@@ -1429,7 +1429,7 @@
 	 *
 	 * Keep the existing behavior for now.
 	 */
-	if (zram->wb_compressed == false) {
+	if (zram->compressed_wb == false) {
 		/* No decompression needed, complete the parent IO */
 		bio_endio(req->parent);
 		bio_put(bio);
@@ -1508,7 +1508,7 @@
 	flush_work(&req.work);
 	destroy_work_on_stack(&req.work);

-	if (req.error || zram->wb_compressed == false)
+	if (req.error || zram->compressed_wb == false)
 		return req.error;

 	return decompress_bdev_page(zram, page, index);
@@ -3007,7 +3007,7 @@
 static DEVICE_ATTR_RW(writeback_limit);
 static DEVICE_ATTR_RW(writeback_limit_enable);
 static DEVICE_ATTR_RW(writeback_batch_size);
-static DEVICE_ATTR_RW(writeback_compressed);
+static DEVICE_ATTR_RW(compressed_writeback);
 #endif
 #ifdef CONFIG_ZRAM_MULTI_COMP
 static DEVICE_ATTR_RW(recomp_algorithm);
@@ -3031,7 +3031,7 @@
 	&dev_attr_writeback_limit.attr,
 	&dev_attr_writeback_limit_enable.attr,
 	&dev_attr_writeback_batch_size.attr,
-	&dev_attr_writeback_compressed.attr,
+	&dev_attr_compressed_writeback.attr,
 #endif
 	&dev_attr_io_stat.attr,
 	&dev_attr_mm_stat.attr,
@@ -3091,7 +3091,7 @@
 	init_rwsem(&zram->dev_lock);
 #ifdef CONFIG_ZRAM_WRITEBACK
 	zram->wb_batch_size = 32;
-	zram->wb_compressed = false;
+	zram->compressed_wb = false;
 #endif

 	/* gendisk structure */
+1 -1
drivers/block/zram/zram_drv.h
@@ -133,7 +133,7 @@
 #ifdef CONFIG_ZRAM_WRITEBACK
 	struct file *backing_dev;
 	bool wb_limit_enable;
-	bool wb_compressed;
+	bool compressed_wb;
 	u32 wb_batch_size;
 	u64 bd_wb_limit;
 	struct block_device *bdev;
+6 -11
include/linux/mm.h
@@ -3514,26 +3514,21 @@
 static inline void ptlock_free(struct ptdesc *ptdesc) {}
 #endif /* defined(CONFIG_SPLIT_PTE_PTLOCKS) */

-static inline unsigned long ptdesc_nr_pages(const struct ptdesc *ptdesc)
-{
-	return compound_nr(ptdesc_page(ptdesc));
-}
-
 static inline void __pagetable_ctor(struct ptdesc *ptdesc)
 {
-	pg_data_t *pgdat = NODE_DATA(memdesc_nid(ptdesc->pt_flags));
+	struct folio *folio = ptdesc_folio(ptdesc);

-	__SetPageTable(ptdesc_page(ptdesc));
-	mod_node_page_state(pgdat, NR_PAGETABLE, ptdesc_nr_pages(ptdesc));
+	__folio_set_pgtable(folio);
+	lruvec_stat_add_folio(folio, NR_PAGETABLE);
 }

 static inline void pagetable_dtor(struct ptdesc *ptdesc)
 {
-	pg_data_t *pgdat = NODE_DATA(memdesc_nid(ptdesc->pt_flags));
+	struct folio *folio = ptdesc_folio(ptdesc);

 	ptlock_free(ptdesc);
-	__ClearPageTable(ptdesc_page(ptdesc));
-	mod_node_page_state(pgdat, NR_PAGETABLE, -ptdesc_nr_pages(ptdesc));
+	__folio_clear_pgtable(folio);
+	lruvec_stat_sub_folio(folio, NR_PAGETABLE);
 }

 static inline void pagetable_dtor_free(struct ptdesc *ptdesc)
+16 -15
include/linux/mmu_notifier.h
@@ -234,7 +234,7 @@
 };

 /**
- * struct mmu_interval_notifier_ops
+ * struct mmu_interval_notifier_ops - callback for range notification
  * @invalidate: Upon return the caller must stop using any SPTEs within this
  *              range. This function can sleep. Return false only if sleeping
  *              was required but mmu_notifier_range_blockable(range) is false.
@@ -309,8 +309,8 @@

 /**
  * mmu_interval_set_seq - Save the invalidation sequence
- * @interval_sub - The subscription passed to invalidate
- * @cur_seq - The cur_seq passed to the invalidate() callback
+ * @interval_sub: The subscription passed to invalidate
+ * @cur_seq: The cur_seq passed to the invalidate() callback
  *
  * This must be called unconditionally from the invalidate callback of a
  * struct mmu_interval_notifier_ops under the same lock that is used to call
@@ -329,8 +329,8 @@

 /**
  * mmu_interval_read_retry - End a read side critical section against a VA range
- * interval_sub: The subscription
- * seq: The return of the paired mmu_interval_read_begin()
+ * @interval_sub: The subscription
+ * @seq: The return of the paired mmu_interval_read_begin()
  *
  * This MUST be called under a user provided lock that is also held
  * unconditionally by op->invalidate() when it calls mmu_interval_set_seq().
@@ -338,7 +338,7 @@
  * Each call should be paired with a single mmu_interval_read_begin() and
  * should be used to conclude the read side.
  *
- * Returns true if an invalidation collided with this critical section, and
+ * Returns: true if an invalidation collided with this critical section, and
  * the caller should retry.
  */
 static inline bool
@@ -350,20 +350,21 @@

 /**
  * mmu_interval_check_retry - Test if a collision has occurred
- * interval_sub: The subscription
- * seq: The return of the matching mmu_interval_read_begin()
+ * @interval_sub: The subscription
+ * @seq: The return of the matching mmu_interval_read_begin()
  *
  * This can be used in the critical section between mmu_interval_read_begin()
- * and mmu_interval_read_retry().  A return of true indicates an invalidation
- * has collided with this critical region and a future
- * mmu_interval_read_retry() will return true.
- *
- * False is not reliable and only suggests a collision may not have
- * occurred. It can be called many times and does not have to hold the user
- * provided lock.
+ * and mmu_interval_read_retry().
  *
  * This call can be used as part of loops and other expensive operations to
  * expedite a retry.
+ * It can be called many times and does not have to hold the user
+ * provided lock.
+ *
+ * Returns: true indicates an invalidation has collided with this critical
+ * region and a future mmu_interval_read_retry() will return true.
+ * False is not reliable and only suggests a collision may not have
+ * occurred.
  */
 static inline bool
 mmu_interval_check_retry(struct mmu_interval_notifier *interval_sub,
+2 -2
include/linux/uaccess.h
@@ -792,7 +792,7 @@

 /**
  * scoped_user_rw_access_size - Start a scoped user read/write access with given size
- * @uptr	Pointer to the user space address to read from and write to
+ * @uptr:	Pointer to the user space address to read from and write to
  * @size:	Size of the access starting from @uptr
  * @elbl:	Error label to goto when the access region is rejected
  *
@@ -803,7 +803,7 @@

 /**
  * scoped_user_rw_access - Start a scoped user read/write access
- * @uptr	Pointer to the user space address to read from and write to
+ * @uptr:	Pointer to the user space address to read from and write to
  * @elbl:	Error label to goto when the access region is rejected
  *
  * The size of the access starting from @uptr is determined via sizeof(*@uptr)).
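Both kernel-doc cleanups (mmu_notifier.h and uaccess.h) converge on the same comment shape: parameter lines start with `@name:` and the return description starts with `Returns:`. The sketch below shows that shape on a made-up function; `clamp_len()` is hypothetical and does not exist in the kernel.

```c
#include <stddef.h>

/**
 * clamp_len() - Limit a length to a maximum
 * @len: requested length in bytes
 * @max: largest length the caller accepts
 *
 * Each parameter line uses the "@name:" form, with the colon directly
 * after the name, which is the form the hunks above switch to.
 *
 * Returns: @len if it does not exceed @max, otherwise @max.
 */
size_t clamp_len(size_t len, size_t max)
{
	return len > max ? max : len;
}
```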
+4 -1
mm/cma.c
@@ -1013,6 +1013,7 @@
 			 unsigned long count)
 {
 	struct cma_memrange *cmr;
+	unsigned long ret = 0;
 	unsigned long i, pfn;

 	cmr = find_cma_memrange(cma, pages, count);
@@ -1022,7 +1021,9 @@

 	pfn = page_to_pfn(pages);
 	for (i = 0; i < count; i++, pfn++)
-		VM_WARN_ON(!put_page_testzero(pfn_to_page(pfn)));
+		ret += !put_page_testzero(pfn_to_page(pfn));
+
+	WARN(ret, "%lu pages are still in use!\n", ret);

 	__cma_release_frozen(cma, cmr, pages, count);

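The changelog for this fix is not shown here, so the following is an assumption about its motivation: `VM_WARN_ON()` only evaluates its condition when `CONFIG_DEBUG_VM` is enabled, so a call with a side effect placed inside it disappears in non-debug builds. The userspace sketch below models that hazard with hypothetical names (`DEBUG_WARN_ON`, `put_ref_testzero`, `release_buggy`, `release_fixed`); it is not kernel code.

```c
#include <stdio.h>

/*
 * Model of a debug-only assertion: with DEBUG_CHECKS undefined, the
 * condition is only type-checked through sizeof and never evaluated.
 */
#ifdef DEBUG_CHECKS
#define DEBUG_WARN_ON(cond) \
	do { if (cond) fprintf(stderr, "warning: %s\n", #cond); } while (0)
#else
#define DEBUG_WARN_ON(cond) ((void)sizeof((long)(cond)))
#endif

/* Drop one reference; return 1 when the count reaches zero. */
int put_ref_testzero(int *refcount)
{
	return --(*refcount) == 0;
}

/* Buggy shape: the decrement lives inside the debug-only macro. */
void release_buggy(int *refs, unsigned long count)
{
	for (unsigned long i = 0; i < count; i++)
		DEBUG_WARN_ON(!put_ref_testzero(&refs[i]));
}

/* Fixed shape: always drop the reference, then report leftovers once. */
unsigned long release_fixed(int *refs, unsigned long count)
{
	unsigned long busy = 0;

	for (unsigned long i = 0; i < count; i++)
		busy += !put_ref_testzero(&refs[i]);
	if (busy)
		fprintf(stderr, "%lu pages are still in use!\n", busy);
	return busy;
}
```

Built without `DEBUG_CHECKS`, `release_buggy()` leaves every counter untouched, while `release_fixed()` decrements all of them and reports how many did not reach zero.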
+6 -1
mm/damon/core.c
@@ -1562,8 +1562,13 @@
 	}
 	ctx->walk_control = control;
 	mutex_unlock(&ctx->walk_control_lock);
-	if (!damon_is_running(ctx))
+	if (!damon_is_running(ctx)) {
+		mutex_lock(&ctx->walk_control_lock);
+		if (ctx->walk_control == control)
+			ctx->walk_control = NULL;
+		mutex_unlock(&ctx->walk_control_lock);
 		return -EINVAL;
+	}
 	wait_for_completion(&control->completion);
 	if (control->canceled)
 		return -ECANCELED;
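The pattern in this hunk is "undo the registration on the early-return path". A minimal userspace model with pthreads is sketched below. The struct and function names are hypothetical, and the `-EBUSY` branch is an assumption about the code above the hunk, which is not shown.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

struct walk_control {
	bool canceled;
};

struct ctx {
	pthread_mutex_t walk_control_lock;
	struct walk_control *walk_control;
	bool running;
};

/* Register a walk request; undo the registration if nobody will serve it. */
int request_walk(struct ctx *ctx, struct walk_control *control)
{
	pthread_mutex_lock(&ctx->walk_control_lock);
	if (ctx->walk_control) {
		/* Assumed behavior: only one request may be pending. */
		pthread_mutex_unlock(&ctx->walk_control_lock);
		return -EBUSY;
	}
	ctx->walk_control = control;
	pthread_mutex_unlock(&ctx->walk_control_lock);

	if (!ctx->running) {
		/*
		 * Without this cleanup the context would keep pointing at
		 * the caller's object after the call has failed.
		 */
		pthread_mutex_lock(&ctx->walk_control_lock);
		if (ctx->walk_control == control)
			ctx->walk_control = NULL;
		pthread_mutex_unlock(&ctx->walk_control_lock);
		return -EINVAL;
	}

	/* A real implementation would wait for completion here. */
	return 0;
}
```

In this model, calling `request_walk()` twice on an inactive context returns `-EINVAL` both times. Without the cleanup, the second call would see the stale pointer and return `-EBUSY`.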
+9 -4
mm/huge_memory.c
@@ -3631,6 +3631,7 @@
 	const bool is_anon = folio_test_anon(folio);
 	int old_order = folio_order(folio);
 	int start_order = split_type == SPLIT_TYPE_UNIFORM ? new_order : old_order - 1;
+	struct folio *old_folio = folio;
 	int split_order;

 	/*
@@ -3652,12 +3651,16 @@
 			 * uniform split has xas_split_alloc() called before
 			 * irq is disabled to allocate enough memory, whereas
 			 * non-uniform split can handle ENOMEM.
+			 * Use the to-be-split folio, so that a parallel
+			 * folio_try_get() waits on it until xarray is updated
+			 * with after-split folios and the original one is
+			 * unfrozen.
 			 */
-			if (split_type == SPLIT_TYPE_UNIFORM)
-				xas_split(xas, folio, old_order);
-			else {
+			if (split_type == SPLIT_TYPE_UNIFORM) {
+				xas_split(xas, old_folio, old_order);
+			} else {
 				xas_set_order(xas, folio->index, split_order);
-				xas_try_split(xas, folio, old_order);
+				xas_try_split(xas, old_folio, old_order);
 				if (xas_error(xas))
 					return xas_error(xas);
 			}
+2 -2
mm/hugetlb.c
@@ -3101,7 +3101,7 @@
 		 * extract the actual node first.
 		 */
 		if (m)
-			listnode = early_pfn_to_nid(PHYS_PFN(virt_to_phys(m)));
+			listnode = early_pfn_to_nid(PHYS_PFN(__pa(m)));
 	}

 	if (m) {
@@ -3160,7 +3160,7 @@
 	 * The head struct page is used to get folio information by the HugeTLB
 	 * subsystem like zone id and node id.
 	 */
-	memblock_reserved_mark_noinit(virt_to_phys((void *)m + PAGE_SIZE),
+	memblock_reserved_mark_noinit(__pa((void *)m + PAGE_SIZE),
 		huge_page_size(h) - PAGE_SIZE);

 	return 1;
+1 -1
mm/memcontrol.c
@@ -3086,7 +3086,7 @@

 	if (!local_trylock(&obj_stock.lock)) {
 		if (pgdat)
-			mod_objcg_mlstate(objcg, pgdat, idx, nr_bytes);
+			mod_objcg_mlstate(objcg, pgdat, idx, nr_acct);
 		nr_pages = nr_bytes >> PAGE_SHIFT;
 		nr_bytes = nr_bytes & (PAGE_SIZE - 1);
 		atomic_add(nr_bytes, &objcg->nr_charged_bytes);
+43 -6
mm/memfd_luo.c
@@ -146,19 +146,56 @@
 	for (i = 0; i < nr_folios; i++) {
 		struct memfd_luo_folio_ser *pfolio = &folios_ser[i];
 		struct folio *folio = folios[i];
-		unsigned int flags = 0;

 		err = kho_preserve_folio(folio);
 		if (err)
 			goto err_unpreserve;

-		if (folio_test_dirty(folio))
-			flags |= MEMFD_LUO_FOLIO_DIRTY;
-		if (folio_test_uptodate(folio))
-			flags |= MEMFD_LUO_FOLIO_UPTODATE;
+		folio_lock(folio);
+
+		/*
+		 * A dirty folio is one which has been written to. A clean folio
+		 * is its opposite. Since a clean folio does not carry user
+		 * data, it can be freed by page reclaim under memory pressure.
+		 *
+		 * Saving the dirty flag at prepare() time doesn't work since it
+		 * can change later. Saving it at freeze() also won't work
+		 * because the dirty bit is normally synced at unmap and there
+		 * might still be a mapping of the file at freeze().
+		 *
+		 * To see why this is a problem, say a folio is clean at
+		 * preserve, but gets dirtied later. The pfolio flags will mark
+		 * it as clean. After retrieve, the next kernel might try to
+		 * reclaim this folio under memory pressure, losing user data.
+		 *
+		 * Unconditionally mark it dirty to avoid this problem. This
+		 * comes at the cost of making clean folios un-reclaimable after
+		 * live update.
+		 */
+		folio_mark_dirty(folio);
+
+		/*
+		 * If the folio is not uptodate, it was fallocated but never
+		 * used. Saving this flag at prepare() doesn't work since it
+		 * might change later when someone uses the folio.
+		 *
+		 * Since we have taken the performance penalty of allocating,
+		 * zeroing, and pinning all the folios in the holes, take a bit
+		 * more and zero all non-uptodate folios too.
+		 *
+		 * NOTE: For someone looking to improve preserve performance,
+		 * this is a good place to look.
+		 */
+		if (!folio_test_uptodate(folio)) {
+			folio_zero_range(folio, 0, folio_size(folio));
+			flush_dcache_folio(folio);
+			folio_mark_uptodate(folio);
+		}
+
+		folio_unlock(folio);

 		pfolio->pfn = folio_pfn(folio);
-		pfolio->flags = flags;
+		pfolio->flags = MEMFD_LUO_FOLIO_DIRTY | MEMFD_LUO_FOLIO_UPTODATE;
 		pfolio->index = folio->index;
 	}

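The reasoning in the two new comments can be condensed into a toy model: instead of recording whatever state a folio has at preserve time, the code forces it into a state that stays valid even if the folio is touched later. The sketch below is a userspace illustration only. `struct model_folio`, `model_preserve()` and the `MODEL_FOLIO_*` values are hypothetical stand-ins, and locking and cache flushing are left out.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical flag values; the real MEMFD_LUO_FOLIO_* constants are not shown above. */
#define MODEL_FOLIO_DIRTY    (1u << 0)
#define MODEL_FOLIO_UPTODATE (1u << 1)

struct model_folio {
	bool dirty;
	bool uptodate;
	unsigned char data[16];
};

/* Normalize the folio state and return the flags to serialize. */
unsigned int model_preserve(struct model_folio *folio)
{
	/* Always dirty: a folio written to after this point must not look clean later. */
	folio->dirty = true;

	/* Never-used folios are zeroed now, so they can be recorded as uptodate. */
	if (!folio->uptodate) {
		memset(folio->data, 0, sizeof(folio->data));
		folio->uptodate = true;
	}

	return MODEL_FOLIO_DIRTY | MODEL_FOLIO_UPTODATE;
}
```

The serialized flags no longer depend on when the snapshot is taken. As the comment in the patch notes, the cost is that folios that were clean can no longer be reclaimed after live update.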
+4
tools/include/linux/gfp.h
@@ -5,6 +5,10 @@
 #include <linux/types.h>
 #include <linux/gfp_types.h>

+/* Helper macro to avoid gfp flags if they are the default one */
+#define __default_gfp(a,...) a
+#define default_gfp(...) __default_gfp(__VA_ARGS__ __VA_OPT__(,) GFP_KERNEL)
+
 static inline bool gfpflags_allow_blocking(const gfp_t gfp_flags)
 {
 	return !!(gfp_flags & __GFP_DIRECT_RECLAIM);
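The `default_gfp()` macro relies on `__VA_OPT__` to supply a default when no argument is given. The sketch below reproduces the trick in isolation with hypothetical names (`DEFAULT_FLAGS`, `first_arg_`, `flags_or_default`). It assumes a compiler that supports `__VA_OPT__`, such as recent GCC or Clang.

```c
#define DEFAULT_FLAGS 0x10u /* hypothetical stand-in for GFP_KERNEL */

/* Pick the first argument and ignore the rest. */
#define first_arg_(a, ...) a

/*
 * With no arguments this expands to first_arg_(DEFAULT_FLAGS).
 * With one argument x it expands to first_arg_(x, DEFAULT_FLAGS),
 * so the explicit value wins.
 */
#define flags_or_default(...) \
	first_arg_(__VA_ARGS__ __VA_OPT__(,) DEFAULT_FLAGS)

unsigned int pick_default(void)
{
	return flags_or_default();
}

unsigned int pick_explicit(void)
{
	return flags_or_default(0x2u);
}
```

The comma is emitted only when at least one argument is present, which is what keeps the zero-argument expansion well-formed.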
+19
tools/include/linux/overflow.h
@@ -69,6 +69,25 @@
 })

 /**
+ * size_mul() - Calculate size_t multiplication with saturation at SIZE_MAX
+ * @factor1: first factor
+ * @factor2: second factor
+ *
+ * Returns: calculate @factor1 * @factor2, both promoted to size_t,
+ * with any overflow causing the return value to be SIZE_MAX. The
+ * lvalue must be size_t to avoid implicit type conversion.
+ */
+static inline size_t __must_check size_mul(size_t factor1, size_t factor2)
+{
+	size_t bytes;
+
+	if (check_mul_overflow(factor1, factor2, &bytes))
+		return SIZE_MAX;
+
+	return bytes;
+}
+
+/**
  * array_size() - Calculate size of 2-dimensional array.
  *
  * @a: dimension one
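Outside the kernel tree, the same saturating multiply can be sketched with the GCC/Clang builtin `__builtin_mul_overflow()`, used here as a stand-in for `check_mul_overflow()`. The function name `sat_size_mul()` is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Multiply two sizes, returning SIZE_MAX instead of wrapping on overflow. */
size_t sat_size_mul(size_t a, size_t b)
{
	size_t bytes;

	/* GCC/Clang builtin: returns true when the product does not fit. */
	if (__builtin_mul_overflow(a, b, &bytes))
		return SIZE_MAX;

	return bytes;
}
```

Saturating to `SIZE_MAX` means an overflowing size computation turns into an allocation request that cannot succeed, rather than a small wrapped value that would silently under-allocate.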
+9
tools/include/linux/slab.h
@@ -202,4 +202,13 @@
 	return sheaf->size;
 }

+#define __alloc_objs(KMALLOC, GFP, TYPE, COUNT)				\
+({									\
+	const size_t __obj_size = size_mul(sizeof(TYPE), COUNT);	\
+	(TYPE *)KMALLOC(__obj_size, GFP);				\
+})
+
+#define kzalloc_obj(P, ...) \
+	__alloc_objs(kzalloc, default_gfp(__VA_ARGS__), typeof(P), 1)
+
 #endif /* _TOOLS_SLAB_H */
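To see how a `kzalloc_obj()`-style macro reads at a call site, the sketch below rebuilds the idea in userspace on top of `calloc()`. All names (`zalloc`, `zalloc_objs`, `zalloc_obj`, `struct point`) are hypothetical, and the GNU extensions `__typeof__` and statement expressions are assumed to be available, as they are with GCC and Clang.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for kzalloc(): zeroed allocation. */
void *zalloc(size_t size)
{
	return calloc(1, size);
}

/* Allocate COUNT zeroed objects of TYPE; the size saturates at SIZE_MAX. */
#define zalloc_objs(TYPE, COUNT)					\
({									\
	size_t obj_size_;						\
	if (__builtin_mul_overflow(sizeof(TYPE), (size_t)(COUNT),	\
				   &obj_size_))				\
		obj_size_ = SIZE_MAX;					\
	(TYPE *)zalloc(obj_size_);					\
})

/* Allocate one zeroed object with the type of expression P. */
#define zalloc_obj(P) zalloc_objs(__typeof__(P), 1)

struct point {
	int x;
	int y;
};

struct point *new_point(void)
{
	/* The type is derived from *p, so it cannot drift from the variable. */
	struct point *p = zalloc_obj(*p);

	return p;
}
```

The benefit of this shape is that the allocation size and the result type are both derived from the destination expression, so the `sizeof` argument cannot fall out of sync with the pointer's type.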