Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

mm/huge_memory: simplify vma_is_special_huge()

Patch series "mm/huge_memory: refactor zap_huge_pmd()", v3.

zap_huge_pmd() is overly complicated. Clean it up and also add an assert
for the case where we encounter a buggy PMD entry that doesn't match
expectations.

This is motivated by a bug discovered [0] where the PMD entry seen in
zap_huge_pmd() was none of:

* A non-DAX, PFN or mixed map
* The huge zero folio
* A present PMD entry
* A softleaf entry

That should not be possible, but due to the bug we managed to reach this
code.

It is useful to explicitly call this out rather than have an arbitrary
NULL pointer dereference happen, which also improves understanding of
what's going on.

The series goes further to make use of vm_normal_folio_pmd() rather than
implementing custom logic for retrieving the folio, and extends softleaf
functionality to provide and use an equivalent softleaf function.


This patch (of 13):

This function is confused: it overloads the term 'special' yet again, and
it checks for DAX even though in many cases the callers explicitly exclude
DAX before invoking the predicate.

It also unnecessarily checks for vma->vm_file - this has to be present for
a driver to have set VMA_MIXEDMAP_BIT or VMA_PFNMAP_BIT.

In fact, a far simpler form of this is to reverse the DAX predicate and
return false if DAX is set.

This makes sense from the point of view of 'special' as in
vm_normal_page(), as DAX actually does potentially have retrievable
folios.

Also there's no need to have this in mm.h so move it to huge_memory.c.

No functional change intended.

Link: https://lkml.kernel.org/r/cover.1774029655.git.ljs@kernel.org
Link: https://lkml.kernel.org/r/d2b65883dc4895f197c4b4a69fbf27a063463412.1774029655.git.ljs@kernel.org
Link: https://lore.kernel.org/all/6b3d7ad7-49e1-407a-903d-3103704160d8@lucifer.local/ [0]
Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Nico Pache <npache@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Qi Zheng <zhengqi.arch@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Authored by Lorenzo Stoakes (Oracle) and committed by Andrew Morton
c0ea52c1 1a0fe419

+25 -25
+2 -2
include/linux/huge_mm.h
···
  * file is never split and the MAX_PAGECACHE_ORDER limit does not apply to
  * it. Same to PFNMAPs where there's neither page* nor pagecache.
  */
-#define THP_ORDERS_ALL_SPECIAL \
+#define THP_ORDERS_ALL_SPECIAL_DAX \
 	(BIT(PMD_ORDER) | BIT(PUD_ORDER))
 #define THP_ORDERS_ALL_FILE_DEFAULT \
 	((BIT(MAX_PAGECACHE_ORDER + 1) - 1) & ~BIT(0))
···
  * Mask of all large folio orders supported for THP.
  */
 #define THP_ORDERS_ALL \
-	(THP_ORDERS_ALL_ANON | THP_ORDERS_ALL_SPECIAL | THP_ORDERS_ALL_FILE_DEFAULT)
+	(THP_ORDERS_ALL_ANON | THP_ORDERS_ALL_SPECIAL_DAX | THP_ORDERS_ALL_FILE_DEFAULT)
 
 enum tva_type {
 	TVA_SMAPS,		/* Exposing "THPeligible:" in smaps. */
-16
include/linux/mm.h
···
 				   const void __user *usr_src,
 				   bool allow_pagefault);
 
-/**
- * vma_is_special_huge - Are transhuge page-table entries considered special?
- * @vma: Pointer to the struct vm_area_struct to consider
- *
- * Whether transhuge page-table entries are considered "special" following
- * the definition in vm_normal_page().
- *
- * Return: true if transhuge page-table entries should be considered special,
- * false otherwise.
- */
-static inline bool vma_is_special_huge(const struct vm_area_struct *vma)
-{
-	return vma_is_dax(vma) || (vma->vm_file &&
-				   (vma->vm_flags & (VM_PFNMAP | VM_MIXEDMAP)));
-}
-
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE || CONFIG_HUGETLBFS */
 
 #if MAX_NUMNODES > 1
+23 -7
mm/huge_memory.c
···
 	return !inode_is_open_for_write(inode) && S_ISREG(inode->i_mode);
 }
 
+/* If returns true, we are unable to access the VMA's folios. */
+static bool vma_is_special_huge(const struct vm_area_struct *vma)
+{
+	if (vma_is_dax(vma))
+		return false;
+	return vma_test_any(vma, VMA_PFNMAP_BIT, VMA_MIXEDMAP_BIT);
+}
+
 unsigned long __thp_vma_allowable_orders(struct vm_area_struct *vma,
 					 vm_flags_t vm_flags,
 					 enum tva_type type,
···
 	/* Check the intersection of requested and supported orders. */
 	if (vma_is_anonymous(vma))
 		supported_orders = THP_ORDERS_ALL_ANON;
-	else if (vma_is_special_huge(vma))
-		supported_orders = THP_ORDERS_ALL_SPECIAL;
+	else if (vma_is_dax(vma) || vma_is_special_huge(vma))
+		supported_orders = THP_ORDERS_ALL_SPECIAL_DAX;
 	else
 		supported_orders = THP_ORDERS_ALL_FILE_DEFAULT;
 
···
 						tlb->fullmm);
 	arch_check_zapped_pmd(vma, orig_pmd);
 	tlb_remove_pmd_tlb_entry(tlb, pmd, addr);
-	if (!vma_is_dax(vma) && vma_is_special_huge(vma)) {
+	if (vma_is_special_huge(vma)) {
 		if (arch_needs_pgtable_deposit())
 			zap_deposited_table(tlb->mm, pmd);
 		spin_unlock(ptl);
···
 	orig_pud = pudp_huge_get_and_clear_full(vma, addr, pud, tlb->fullmm);
 	arch_check_zapped_pud(vma, orig_pud);
 	tlb_remove_pud_tlb_entry(tlb, pud, addr);
-	if (!vma_is_dax(vma) && vma_is_special_huge(vma)) {
+	if (vma_is_special_huge(vma)) {
 		spin_unlock(ptl);
 		/* No zero page support yet */
 	} else {
···
 	 */
 	if (arch_needs_pgtable_deposit())
 		zap_deposited_table(mm, pmd);
-	if (!vma_is_dax(vma) && vma_is_special_huge(vma))
+	if (vma_is_special_huge(vma))
 		return;
 	if (unlikely(pmd_is_migration_entry(old_pmd))) {
 		const softleaf_t old_entry = softleaf_from_pmd(old_pmd);
···
 
 static inline bool vma_not_suitable_for_thp_split(struct vm_area_struct *vma)
 {
-	return vma_is_special_huge(vma) || (vma->vm_flags & VM_IO) ||
-		is_vm_hugetlb_page(vma);
+	if (vma_is_dax(vma))
+		return true;
+	if (vma_is_special_huge(vma))
+		return true;
+	if (vma_test(vma, VMA_IO_BIT))
+		return true;
+	if (is_vm_hugetlb_page(vma))
+		return true;
+
+	return false;
 }
 
 static int split_huge_pages_pid(int pid, unsigned long vaddr_start,