Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

userfaultfd: introduce mfill_establish_pmd() helper

There is a lengthy code chunk in mfill_atomic() that establishes the PMD
for UFFDIO operations. This code may be called twice: the first time when
the copy is performed with VMA/mm locks held, and a second time after the
copy is retried with locks dropped.

Move the code that establishes a PMD into a helper function so it can be
reused later during refactoring of mfill_atomic_pte_copy().

Link: https://lore.kernel.org/20260402041156.1377214-4-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Reviewed-by: Harry Yoo (Oracle) <harry@kernel.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Andrei Vagin <avagin@google.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: David Hildenbrand (Arm) <david@kernel.org>
Cc: Harry Yoo <harry.yoo@oracle.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: James Houghton <jthoughton@google.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Nikita Kalyazin <kalyazin@amazon.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: David Carlier <devnexen@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Authored by Mike Rapoport (Microsoft) and committed by Andrew Morton
e2e0b826 db0062d2

+53 -51
mm/userfaultfd.c
···
 }
 #endif
 
+static pmd_t *mm_alloc_pmd(struct mm_struct *mm, unsigned long address)
+{
+	pgd_t *pgd;
+	p4d_t *p4d;
+	pud_t *pud;
+
+	pgd = pgd_offset(mm, address);
+	p4d = p4d_alloc(mm, pgd, address);
+	if (!p4d)
+		return NULL;
+	pud = pud_alloc(mm, p4d, address);
+	if (!pud)
+		return NULL;
+	/*
+	 * Note that we didn't run this because the pmd was
+	 * missing, the *pmd may be already established and in
+	 * turn it may also be a trans_huge_pmd.
+	 */
+	return pmd_alloc(mm, pud, address);
+}
+
+static int mfill_establish_pmd(struct mfill_state *state)
+{
+	struct mm_struct *dst_mm = state->ctx->mm;
+	pmd_t *dst_pmd, dst_pmdval;
+
+	dst_pmd = mm_alloc_pmd(dst_mm, state->dst_addr);
+	if (unlikely(!dst_pmd))
+		return -ENOMEM;
+
+	dst_pmdval = pmdp_get_lockless(dst_pmd);
+	if (unlikely(pmd_none(dst_pmdval)) &&
+	    unlikely(__pte_alloc(dst_mm, dst_pmd)))
+		return -ENOMEM;
+
+	dst_pmdval = pmdp_get_lockless(dst_pmd);
+	/*
+	 * If the dst_pmd is THP don't override it and just be strict.
+	 * (This includes the case where the PMD used to be THP and
+	 * changed back to none after __pte_alloc().)
+	 */
+	if (unlikely(!pmd_present(dst_pmdval) || pmd_leaf(dst_pmdval)))
+		return -EEXIST;
+	if (unlikely(pmd_bad(dst_pmdval)))
+		return -EFAULT;
+
+	state->pmd = dst_pmd;
+	return 0;
+}
+
 /* Check if dst_addr is outside of file's size. Must be called with ptl held. */
 static bool mfill_file_over_size(struct vm_area_struct *dst_vma,
 				 unsigned long dst_addr)
···
 	return ret;
 }
 
-static pmd_t *mm_alloc_pmd(struct mm_struct *mm, unsigned long address)
-{
-	pgd_t *pgd;
-	p4d_t *p4d;
-	pud_t *pud;
-
-	pgd = pgd_offset(mm, address);
-	p4d = p4d_alloc(mm, pgd, address);
-	if (!p4d)
-		return NULL;
-	pud = pud_alloc(mm, p4d, address);
-	if (!pud)
-		return NULL;
-	/*
-	 * Note that we didn't run this because the pmd was
-	 * missing, the *pmd may be already established and in
-	 * turn it may also be a trans_huge_pmd.
-	 */
-	return pmd_alloc(mm, pud, address);
-}
-
 #ifdef CONFIG_HUGETLB_PAGE
 /*
  * mfill_atomic processing for HUGETLB vmas. Note that this routine is
···
 	struct vm_area_struct *dst_vma;
 	long copied = 0;
 	ssize_t err;
-	pmd_t *dst_pmd;
 
 	/*
 	 * Sanitize the command parameters:
···
 	while (state.src_addr < src_start + len) {
 		VM_WARN_ON_ONCE(state.dst_addr >= dst_start + len);
 
-		pmd_t dst_pmdval;
+		err = mfill_establish_pmd(&state);
+		if (err)
+			break;
 
-		dst_pmd = mm_alloc_pmd(dst_mm, state.dst_addr);
-		if (unlikely(!dst_pmd)) {
-			err = -ENOMEM;
-			break;
-		}
-
-		dst_pmdval = pmdp_get_lockless(dst_pmd);
-		if (unlikely(pmd_none(dst_pmdval)) &&
-		    unlikely(__pte_alloc(dst_mm, dst_pmd))) {
-			err = -ENOMEM;
-			break;
-		}
-		dst_pmdval = pmdp_get_lockless(dst_pmd);
-		/*
-		 * If the dst_pmd is THP don't override it and just be strict.
-		 * (This includes the case where the PMD used to be THP and
-		 * changed back to none after __pte_alloc().)
-		 */
-		if (unlikely(!pmd_present(dst_pmdval) ||
-			     pmd_trans_huge(dst_pmdval))) {
-			err = -EEXIST;
-			break;
-		}
-		if (unlikely(pmd_bad(dst_pmdval))) {
-			err = -EFAULT;
-			break;
-		}
 		/*
 		 * For shmem mappings, khugepaged is allowed to remove page
 		 * tables under us; pte_offset_map_lock() will deal with that.
 		 */
 
-		state.pmd = dst_pmd;
 		err = mfill_atomic_pte(&state);
 		cond_resched();
 
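
The control flow that the helper centralizes can be tried out in userspace with a toy model. The sketch below is not kernel code: enum toy_pmd, struct toy_state, toy_pte_alloc(), toy_establish_pmd() and toy_fill() are hypothetical stand-ins invented for illustration, and the four enum values only approximate what pmd_none(), pmd_leaf() and pmd_bad() distinguish. It is meant to show the shape of the refactoring (each failure becomes a return code, and the loop shrinks to one call plus one check), assuming the error mapping visible in the diff.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the PMD states the real helper distinguishes. */
enum toy_pmd { TOY_PMD_NONE, TOY_PMD_TABLE, TOY_PMD_LEAF, TOY_PMD_BAD };

struct toy_state {
	enum toy_pmd *slot;	/* models the PMD entry; NULL models a failed mm_alloc_pmd() */
	enum toy_pmd *pmd;	/* models state->pmd, written only on success */
};

/* Models __pte_alloc(): install a page table if the entry is still empty. */
static int toy_pte_alloc(enum toy_pmd *slot)
{
	assert(slot != NULL);
	if (*slot == TOY_PMD_NONE)
		*slot = TOY_PMD_TABLE;
	return 0;
}

/* Mirrors the return-code structure of mfill_establish_pmd() in the diff. */
static int toy_establish_pmd(struct toy_state *state)
{
	enum toy_pmd val;

	if (!state->slot)
		return -ENOMEM;

	val = *state->slot;
	if (val == TOY_PMD_NONE && toy_pte_alloc(state->slot))
		return -ENOMEM;

	/* Re-read the entry, as the real helper does after __pte_alloc(). */
	val = *state->slot;
	if (val == TOY_PMD_NONE || val == TOY_PMD_LEAF)
		return -EEXIST;
	if (val == TOY_PMD_BAD)
		return -EFAULT;

	state->pmd = state->slot;
	return 0;
}

/* Caller-side pattern after the refactoring: one call, one error check. */
static int toy_fill(enum toy_pmd *slots, size_t n, size_t *done)
{
	struct toy_state state = { NULL, NULL };
	int err = 0;

	for (*done = 0; *done < n; (*done)++) {
		state.slot = &slots[*done];
		err = toy_establish_pmd(&state);
		if (err)
			break;
		/* The real loop would call mfill_atomic_pte() here. */
	}
	return err;
}
```

Calling toy_fill() on an array whose third entry is TOY_PMD_LEAF stops after two successful iterations and returns -EEXIST, which is the behaviour the loop in mfill_atomic() keeps after the change.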