Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

fs/dax: use vmf_insert_folio_pmd() to insert the huge zero folio

Let's convert to vmf_insert_folio_pmd().

There is a theoretical change in behavior: in the unlikely case there is
already something mapped, we'll now still call trace_dax_pmd_load_hole()
and return VM_FAULT_NOPAGE.

Previously, we would have returned VM_FAULT_FALLBACK, and the caller would
have zapped the PMD to try a PTE fault.

However, that behavior was different to other PTE+PMD faults, when there
would already be something mapped, and it's not even clear if it could be
triggered.

Assuming the huge zero folio is already mapped, all good, no need to
fall back to PTEs.

Assuming there is already a leaf page table ... the behavior would be
just like when trying to insert a PMD mapping a folio through
dax_fault_iter()->vmf_insert_folio_pmd().

Assuming there is already something else mapped as PMD? It sounds like a
BUG, and the behavior would be just like when trying to insert a PMD
mapping a folio through dax_fault_iter()->vmf_insert_folio_pmd().

So, it sounds reasonable to not handle huge zero folios differently to
inserting PMDs mapping folios when there already is something mapped.

Link: https://lkml.kernel.org/r/20250811112631.759341-5-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Alistair Popple <apopple@nvidia.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jann Horn <jannh@google.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mariano Pache <npache@redhat.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Wei Yang <richard.weiyang@gmail.com>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Authored by David Hildenbrand and committed by Andrew Morton
b0f86aae 5528ef06

+10 -37
fs/dax.c
···
 		const struct iomap_iter *iter, void **entry)
 {
 	struct address_space *mapping = vmf->vma->vm_file->f_mapping;
-	unsigned long pmd_addr = vmf->address & PMD_MASK;
-	struct vm_area_struct *vma = vmf->vma;
 	struct inode *inode = mapping->host;
-	pgtable_t pgtable = NULL;
 	struct folio *zero_folio;
-	spinlock_t *ptl;
-	pmd_t pmd_entry;
-	unsigned long pfn;
+	vm_fault_t ret;
 
 	zero_folio = mm_get_huge_zero_folio(vmf->vma->vm_mm);
 
-	if (unlikely(!zero_folio))
-		goto fallback;
+	if (unlikely(!zero_folio)) {
+		trace_dax_pmd_load_hole_fallback(inode, vmf, zero_folio, *entry);
+		return VM_FAULT_FALLBACK;
+	}
 
-	pfn = page_to_pfn(&zero_folio->page);
-	*entry = dax_insert_entry(xas, vmf, iter, *entry, pfn,
+	*entry = dax_insert_entry(xas, vmf, iter, *entry, folio_pfn(zero_folio),
 				  DAX_PMD | DAX_ZERO_PAGE);
 
-	if (arch_needs_pgtable_deposit()) {
-		pgtable = pte_alloc_one(vma->vm_mm);
-		if (!pgtable)
-			return VM_FAULT_OOM;
-	}
-
-	ptl = pmd_lock(vmf->vma->vm_mm, vmf->pmd);
-	if (!pmd_none(*(vmf->pmd))) {
-		spin_unlock(ptl);
-		goto fallback;
-	}
-
-	if (pgtable) {
-		pgtable_trans_huge_deposit(vma->vm_mm, vmf->pmd, pgtable);
-		mm_inc_nr_ptes(vma->vm_mm);
-	}
-	pmd_entry = folio_mk_pmd(zero_folio, vmf->vma->vm_page_prot);
-	set_pmd_at(vmf->vma->vm_mm, pmd_addr, vmf->pmd, pmd_entry);
-	spin_unlock(ptl);
-	trace_dax_pmd_load_hole(inode, vmf, zero_folio, *entry);
-	return VM_FAULT_NOPAGE;
-
-fallback:
-	if (pgtable)
-		pte_free(vma->vm_mm, pgtable);
-	trace_dax_pmd_load_hole_fallback(inode, vmf, zero_folio, *entry);
-	return VM_FAULT_FALLBACK;
+	ret = vmf_insert_folio_pmd(vmf, zero_folio, false);
+	if (ret == VM_FAULT_NOPAGE)
+		trace_dax_pmd_load_hole(inode, vmf, zero_folio, *entry);
+	return ret;
 }
 #else
 static vm_fault_t dax_pmd_load_hole(struct xa_state *xas, struct vm_fault *vmf,
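The behavior change described in the commit message may be easier to follow as a small userspace model. This is only a sketch, not kernel code: the enum values, the struct and the function names (load_hole_before(), load_hole_after(), model_self_check()) are hypothetical stand-ins, and the model assumes, as the commit message states, that vmf_insert_folio_pmd() returns VM_FAULT_NOPAGE when something is already mapped. Allocation failures (the VM_FAULT_OOM path) are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for VM_FAULT_NOPAGE / VM_FAULT_FALLBACK. */
enum fault_result { FAULT_NOPAGE, FAULT_FALLBACK };

/* What the PMD slot holds when dax_pmd_load_hole() runs (model only). */
enum pmd_state {
	PMD_EMPTY,      /* pmd_none() is true */
	PMD_HUGE_ZERO,  /* the huge zero folio is already mapped */
	PMD_PAGE_TABLE, /* a leaf page table is already present */
	PMD_OTHER_HUGE  /* something else is mapped as a PMD */
};

struct outcome {
	enum fault_result ret;
	bool traced_load_hole; /* trace_dax_pmd_load_hole() fired */
	bool traced_fallback;  /* trace_dax_pmd_load_hole_fallback() fired */
};

/* Before the patch: any non-empty PMD takes the fallback path. */
static struct outcome load_hole_before(bool have_zero_folio, enum pmd_state pmd)
{
	struct outcome o = { FAULT_FALLBACK, false, true };

	if (!have_zero_folio || pmd != PMD_EMPTY)
		return o;

	o.ret = FAULT_NOPAGE;
	o.traced_load_hole = true;
	o.traced_fallback = false;
	return o;
}

/*
 * After the patch: only a missing huge zero folio falls back. The PMD
 * state no longer changes the return value, because the insertion helper
 * is assumed to report NOPAGE whether or not something was mapped.
 */
static struct outcome load_hole_after(bool have_zero_folio, enum pmd_state pmd)
{
	struct outcome o = { FAULT_FALLBACK, false, true };

	(void)pmd; /* intentionally ignored in the new flow */
	if (!have_zero_folio)
		return o;

	o.ret = FAULT_NOPAGE;
	o.traced_load_hole = true;
	o.traced_fallback = false;
	return o;
}

/* Checks the cases the commit message walks through. */
void model_self_check(void)
{
	/* Empty PMD: unchanged. */
	assert(load_hole_before(true, PMD_EMPTY).ret == FAULT_NOPAGE);
	assert(load_hole_after(true, PMD_EMPTY).ret == FAULT_NOPAGE);

	/* Already mapped: FALLBACK before, NOPAGE plus trace after. */
	assert(load_hole_before(true, PMD_HUGE_ZERO).ret == FAULT_FALLBACK);
	assert(load_hole_after(true, PMD_HUGE_ZERO).ret == FAULT_NOPAGE);
	assert(load_hole_after(true, PMD_PAGE_TABLE).traced_load_hole);

	/* No huge zero folio: FALLBACK in both versions. */
	assert(load_hole_before(false, PMD_EMPTY).ret == FAULT_FALLBACK);
	assert(load_hole_after(false, PMD_EMPTY).ret == FAULT_FALLBACK);
}
```

Calling model_self_check() from a test harness exercises each case. In the model, the two versions differ only in the three "already mapped" states, which is the theoretical change the commit message calls out.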