Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

mm: introduce deferred freeing for kernel page tables

Introduce a conditional asynchronous mechanism, gated by
CONFIG_ASYNC_KERNEL_PGTABLE_FREE. When enabled, it defers the freeing of
pages used as page tables for kernel address mappings: such pages are
queued on a work struct instead of being freed immediately.

This deferred freeing allows page tables to be batch-freed, providing a
safe context in which a single expensive operation (a TLB flush) can be
performed for a whole batch of kernel page tables, instead of performing
it once per page table.

Link: https://lkml.kernel.org/r/20251022082635.2462433-8-baolu.lu@linux.intel.com
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: Jean-Philippe Brucker <jean-philippe@linaro.org>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "Uladzislau Rezki (Sony)" <urezki@gmail.com>
Cc: Vasant Hegde <vasant.hegde@amd.com>
Cc: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Will Deacon <will@kernel.org>
Cc: Yi Lai <yi1.lai@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Authored by Dave Hansen, committed by Andrew Morton · 5ba2f0a1 bf9e4e30

3 files changed: +53 -3

include/linux/mm.h (+13 -3)
@@ -3053,6 +3053,14 @@
 	__free_pages(page, compound_order(page));
 }
 
+#ifdef CONFIG_ASYNC_KERNEL_PGTABLE_FREE
+void pagetable_free_kernel(struct ptdesc *pt);
+#else
+static inline void pagetable_free_kernel(struct ptdesc *pt)
+{
+	__pagetable_free(pt);
+}
+#endif
 /**
  * pagetable_free - Free pagetables
  * @pt: The page table descriptor
@@ -3062,10 +3070,12 @@
  */
 static inline void pagetable_free(struct ptdesc *pt)
 {
-	if (ptdesc_test_kernel(pt))
+	if (ptdesc_test_kernel(pt)) {
 		ptdesc_clear_kernel(pt);
-
-	__pagetable_free(pt);
+		pagetable_free_kernel(pt);
+	} else {
+		__pagetable_free(pt);
+	}
 }
 
 #if defined(CONFIG_SPLIT_PTE_PTLOCKS)
mm/Kconfig (+3)
@@ -906,6 +906,9 @@
 	def_bool (HUGETLB_PAGE && ARCH_HAS_GIGANTIC_PAGE) || \
 		  (ZONE_DEVICE && HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD)
 
+config ASYNC_KERNEL_PGTABLE_FREE
+	def_bool n
+
 # TODO: Allow to be enabled without THP
 config ARCH_SUPPORTS_HUGE_PFNMAP
 	def_bool n
mm/pgtable-generic.c (+37)
@@ -406,3 +406,40 @@
 		pte_unmap_unlock(pte, ptl);
 		goto again;
 	}
+
+#ifdef CONFIG_ASYNC_KERNEL_PGTABLE_FREE
+static void kernel_pgtable_work_func(struct work_struct *work);
+
+static struct {
+	struct list_head list;
+	/* protect above ptdesc lists */
+	spinlock_t lock;
+	struct work_struct work;
+} kernel_pgtable_work = {
+	.list = LIST_HEAD_INIT(kernel_pgtable_work.list),
+	.lock = __SPIN_LOCK_UNLOCKED(kernel_pgtable_work.lock),
+	.work = __WORK_INITIALIZER(kernel_pgtable_work.work, kernel_pgtable_work_func),
+};
+
+static void kernel_pgtable_work_func(struct work_struct *work)
+{
+	struct ptdesc *pt, *next;
+	LIST_HEAD(page_list);
+
+	spin_lock(&kernel_pgtable_work.lock);
+	list_splice_tail_init(&kernel_pgtable_work.list, &page_list);
+	spin_unlock(&kernel_pgtable_work.lock);
+
+	list_for_each_entry_safe(pt, next, &page_list, pt_list)
+		__pagetable_free(pt);
+}
+
+void pagetable_free_kernel(struct ptdesc *pt)
+{
+	spin_lock(&kernel_pgtable_work.lock);
+	list_add(&pt->pt_list, &kernel_pgtable_work.list);
+	spin_unlock(&kernel_pgtable_work.lock);
+
+	schedule_work(&kernel_pgtable_work.work);
+}
+#endif