Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

mm/khugepaged: guard is_zero_pfn() calls with pte_present()

A non-present entry, like a swap PTE, contains completely different data
(swap type and offset). pte_pfn() doesn't know this, so if we feed it a
non-present entry, it will spit out a junk PFN.
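
To make the point above concrete, here is a minimal userspace sketch. The
PTE layout, the toy_* names and the bit positions are all assumptions made
for illustration; they do not match any real architecture or the kernel's
pgtable helpers. It only shows that a blind shift decodes a swap entry
into a meaningless "PFN".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy 64-bit PTE layout; NOT any real architecture's format (assumption). */
typedef struct { uint64_t val; } toy_pte_t;

#define TOY_PRESENT          ((uint64_t)1 << 0) /* bit 0: present */
#define TOY_PFN_SHIFT        12                 /* present: PFN in bits 12.. */
#define TOY_SWP_TYPE_SHIFT   1                  /* swap: type in bits 1..5 */
#define TOY_SWP_OFFSET_SHIFT 6                  /* swap: offset in bits 6.. */

/* Build a present entry that maps the given PFN. */
static toy_pte_t toy_mk_present(uint64_t pfn)
{
	return (toy_pte_t){ (pfn << TOY_PFN_SHIFT) | TOY_PRESENT };
}

/* Build a non-present swap entry; the present bit stays clear. */
static toy_pte_t toy_mk_swap(uint64_t type, uint64_t offset)
{
	return (toy_pte_t){ (type << TOY_SWP_TYPE_SHIFT) |
			    (offset << TOY_SWP_OFFSET_SHIFT) };
}

static bool toy_pte_present(toy_pte_t pte)
{
	return (pte.val & TOY_PRESENT) != 0;
}

/* Like pte_pfn() in spirit: shifts blindly, never checks the present bit. */
static uint64_t toy_pte_pfn(toy_pte_t pte)
{
	return pte.val >> TOY_PFN_SHIFT;
}
```

In this toy model, toy_pte_pfn(toy_mk_swap(3, 0x1234)) returns a value
derived from the swap offset bits, not the number of any page frame.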

If that junk PFN happens to match the zeropage's PFN, is_zero_pfn() would
wrongly treat the non-present entry as a zero page. This is very unlikely,
but the consequences would be serious.

So, let's fix this potential bug by ensuring all calls to is_zero_pfn() in
khugepaged.c are properly guarded by a pte_present() check.
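
The effect of the guard can be sketched in userspace as follows. This is
a toy model under stated assumptions: the toy_* helpers, the bit layout
and the TOY_ZERO_PFN value are hypothetical, and only the control flow of
toy_none_or_zero() mirrors the pte_none_or_zero() helper in the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy PTE model: layout and constants are illustrative assumptions. */
typedef struct { uint64_t val; } toy_pte_t;

#define TOY_PRESENT   ((uint64_t)1 << 0)
#define TOY_PFN_SHIFT 12
#define TOY_ZERO_PFN  ((uint64_t)0x48) /* hypothetical zero-page PFN */

static bool toy_pte_none(toy_pte_t pte)    { return pte.val == 0; }
static bool toy_pte_present(toy_pte_t pte) { return (pte.val & TOY_PRESENT) != 0; }
static uint64_t toy_pte_pfn(toy_pte_t pte) { return pte.val >> TOY_PFN_SHIFT; }
static bool toy_is_zero_pfn(uint64_t pfn)  { return pfn == TOY_ZERO_PFN; }

/* Pre-patch pattern: decodes a PFN even from non-present entries. */
static bool toy_unguarded(toy_pte_t pte)
{
	return toy_pte_none(pte) || toy_is_zero_pfn(toy_pte_pfn(pte));
}

/* Same shape as pte_none_or_zero() in the patch: check presence first. */
static bool toy_none_or_zero(toy_pte_t pte)
{
	if (toy_pte_none(pte))
		return true;
	return toy_pte_present(pte) && toy_is_zero_pfn(toy_pte_pfn(pte));
}

/* A non-present entry whose upper bits happen to equal TOY_ZERO_PFN. */
static toy_pte_t toy_colliding_swap_entry(void)
{
	return (toy_pte_t){ (TOY_ZERO_PFN << TOY_PFN_SHIFT) |
			    ((uint64_t)3 << 1) };
}
```

For the colliding entry, toy_unguarded() reports a zero page while
toy_none_or_zero() does not; for empty entries and for present entries
the two agree.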

Link: https://lkml.kernel.org/r/20251020151111.53561-1-lance.yang@linux.dev
Signed-off-by: Lance Yang <lance.yang@linux.dev>
Suggested-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Nico Pache <npache@redhat.com>
Reviewed-by: Dev Jain <dev.jain@arm.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Wei Yang <richard.weiyang@gmail.com>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Authored by Lance Yang and committed by Andrew Morton
074f027d 40d923ac

+21 -14
mm/khugepaged.c
···
 };
 #endif /* CONFIG_SYSFS */
 
+static bool pte_none_or_zero(pte_t pte)
+{
+	if (pte_none(pte))
+		return true;
+	return pte_present(pte) && is_zero_pfn(pte_pfn(pte));
+}
+
 int hugepage_madvise(struct vm_area_struct *vma,
 		     vm_flags_t *vm_flags, int advice)
 {
···
 
 		if (pte_none(pteval))
 			continue;
+		VM_WARN_ON_ONCE(!pte_present(pteval));
 		pfn = pte_pfn(pteval);
 		if (is_zero_pfn(pfn))
 			continue;
···
 	for (_pte = pte; _pte < pte + HPAGE_PMD_NR;
 	     _pte++, addr += PAGE_SIZE) {
 		pte_t pteval = ptep_get(_pte);
-		if (pte_none(pteval) || (pte_present(pteval) &&
-				is_zero_pfn(pte_pfn(pteval)))) {
+		if (pte_none_or_zero(pteval)) {
 			++none_or_zero;
 			if (!userfaultfd_armed(vma) &&
 			    (!cc->is_khugepaged ||
···
 	     address += nr_ptes * PAGE_SIZE) {
 		nr_ptes = 1;
 		pteval = ptep_get(_pte);
-		if (pte_none(pteval) || is_zero_pfn(pte_pfn(pteval))) {
+		if (pte_none_or_zero(pteval)) {
 			add_mm_counter(vma->vm_mm, MM_ANONPAGES, 1);
-			if (is_zero_pfn(pte_pfn(pteval))) {
-				/*
-				 * ptl mostly unnecessary.
-				 */
-				spin_lock(ptl);
-				ptep_clear(vma->vm_mm, address, _pte);
-				spin_unlock(ptl);
-				ksm_might_unmap_zero_page(vma->vm_mm, pteval);
-			}
+			if (pte_none(pteval))
+				continue;
+			/*
+			 * ptl mostly unnecessary.
+			 */
+			spin_lock(ptl);
+			ptep_clear(vma->vm_mm, address, _pte);
+			spin_unlock(ptl);
+			ksm_might_unmap_zero_page(vma->vm_mm, pteval);
 		} else {
 			struct page *src_page = pte_page(pteval);
 
···
 		unsigned long src_addr = address + i * PAGE_SIZE;
 		struct page *src_page;
 
-		if (pte_none(pteval) || is_zero_pfn(pte_pfn(pteval))) {
+		if (pte_none_or_zero(pteval)) {
 			clear_user_highpage(page, src_addr);
 			continue;
 		}
···
 				goto out_unmap;
 			}
 		}
-		if (pte_none(pteval) || is_zero_pfn(pte_pfn(pteval))) {
+		if (pte_none_or_zero(pteval)) {
 			++none_or_zero;
 			if (!userfaultfd_armed(vma) &&
 			    (!cc->is_khugepaged ||