Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

mm: avoid unnecessary uses of is_swap_pte()

There's an established convention in the kernel that we treat PTEs as
containing swap entries (and the unfortunately named non-swap swap
entries) should they be neither empty (i.e. pte_none() evaluating true)
nor present (i.e. pte_present() evaluating true).

However, there is some inconsistency in how this is applied, as we also
have the is_swap_pte() helper which explicitly performs this check:

	/* check whether a pte points to a swap entry */
	static inline int is_swap_pte(pte_t pte)
	{
		return !pte_none(pte) && !pte_present(pte);
	}

As this is a predicate, it is logical to assume that it must first be
checked in order to establish that a PTE can correctly be manipulated as
a swap/non-swap entry.

Instead, we far more often utilise the established convention of
checking pte_none() / pte_present() before operating on entries as if they
were swap/non-swap.

This patch works towards correcting this inconsistency by removing all
uses of is_swap_pte() where we already perform pte_none()/pte_present()
checks anyway, or where it is clearly logical to do so.
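
As an illustration only, the convention can be modelled in user space.
The sketch below is not kernel code: mock_pte_t, the single "present" bit
and every mock_* name are assumptions invented for the example. It checks
that ruling out none and present entries first leaves exactly the entries
the is_swap_pte() predicate would have accepted.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for a PTE; not the kernel's pte_t. */
typedef struct { unsigned long val; } mock_pte_t;

/* Assumption for this model: bit 0 marks a present entry. */
#define MOCK_PTE_PRESENT 0x1UL

enum mock_pte_kind { MOCK_NONE, MOCK_PRESENT, MOCK_SWAP };

/* Empty entry: no bits set at all. */
static inline bool mock_pte_none(mock_pte_t pte)
{
	return pte.val == 0;
}

static inline bool mock_pte_present(mock_pte_t pte)
{
	return (pte.val & MOCK_PTE_PRESENT) != 0;
}

/* Mirrors the is_swap_pte() helper quoted in the commit message. */
static inline bool mock_is_swap_pte(mock_pte_t pte)
{
	return !mock_pte_none(pte) && !mock_pte_present(pte);
}

/* Style being removed: explicit predicate on the swap branch. */
static inline enum mock_pte_kind classify_with_predicate(mock_pte_t pte)
{
	if (mock_pte_present(pte))
		return MOCK_PRESENT;
	else if (mock_is_swap_pte(pte))
		return MOCK_SWAP;
	else
		return MOCK_NONE;
}

/* Convention: rule out none and present; what remains is swap/non-swap. */
static inline enum mock_pte_kind classify_by_convention(mock_pte_t pte)
{
	if (mock_pte_none(pte))
		return MOCK_NONE;
	if (mock_pte_present(pte))
		return MOCK_PRESENT;
	return MOCK_SWAP;
}

/* Asserts that both styles agree for every raw value below limit. */
static inline void check_styles_agree(unsigned long limit)
{
	unsigned long v;

	for (v = 0; v < limit; v++) {
		mock_pte_t pte = { v };

		assert(classify_with_predicate(pte) ==
		       classify_by_convention(pte));
	}
}
```

In this model both "real" swap entries and non-swap swap entries land in
MOCK_SWAP; telling them apart is a separate step, as it is in the diffs
below.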

We also take advantage of the fact that the bit checked by
pte_swp_uffd_wp() is only set on swap entries.
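
A rough user-space sketch of why that helps, under stated assumptions:
the bit layout, mock_pte_t and all mock_* names below are hypothetical
and chosen only for illustration. Because a none entry has no bits set,
testing the swap-side uffd-wp bit on a non-present entry already implies
the entry is not none, so the extra is_swap_pte() guard adds nothing.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for a PTE; not the kernel's pte_t. */
typedef struct { unsigned long val; } mock_pte_t;

/* Assumed layout for this model only. */
#define MOCK_PTE_PRESENT	0x1UL	/* bit 0: entry is present */
#define MOCK_PTE_SWP_UFFD_WP	0x2UL	/* bit 1: swap-side uffd-wp bit */

static inline bool mock_pte_none(mock_pte_t pte)
{
	return pte.val == 0;
}

static inline bool mock_pte_present(mock_pte_t pte)
{
	return (pte.val & MOCK_PTE_PRESENT) != 0;
}

static inline bool mock_is_swap_pte(mock_pte_t pte)
{
	return !mock_pte_none(pte) && !mock_pte_present(pte);
}

static inline bool mock_pte_swp_uffd_wp(mock_pte_t pte)
{
	return (pte.val & MOCK_PTE_SWP_UFFD_WP) != 0;
}

/* Before: guard the bit test with the explicit predicate. */
static inline bool swp_uffd_wp_guarded(mock_pte_t pte)
{
	return mock_is_swap_pte(pte) && mock_pte_swp_uffd_wp(pte);
}

/* After: only rule out present entries; a none entry has no bits set. */
static inline bool swp_uffd_wp_unguarded(mock_pte_t pte)
{
	return !mock_pte_present(pte) && mock_pte_swp_uffd_wp(pte);
}

/* Asserts that the guard is redundant for every raw value below limit. */
static inline void check_guard_is_redundant(unsigned long limit)
{
	unsigned long v;

	for (v = 0; v < limit; v++) {
		mock_pte_t pte = { v };

		assert(swp_uffd_wp_guarded(pte) == swp_uffd_wp_unguarded(pte));
	}
}
```

The same shape appears in the pte_swp_uffd_wp_any() and
page_table_check_pte_flags() hunks, which keep only the pte_present()
test.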

Additionally, update comments referencing is_swap_pte() and
non_swap_entry().

No functional change intended.

Link: https://lkml.kernel.org/r/17fd6d7f46a846517fd455fadd640af47fcd7c55.1762812360.git.lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Byungchul Park <byungchul@sk.com>
Cc: Chengming Zhou <chengming.zhou@linux.dev>
Cc: Chris Li <chrisl@kernel.org>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Claudio Imbrenda <imbrenda@linux.ibm.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Cc: Gregory Price <gourry@gourry.net>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: "Huang, Ying" <ying.huang@linux.alibaba.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jann Horn <jannh@google.com>
Cc: Janosch Frank <frankja@linux.ibm.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Joshua Hahn <joshua.hahnjy@gmail.com>
Cc: Kairui Song <kasong@tencent.com>
Cc: Kemeng Shi <shikemeng@huaweicloud.com>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Mathew Brost <matthew.brost@intel.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
Cc: Nhat Pham <nphamcs@gmail.com>
Cc: Nico Pache <npache@redhat.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Rakie Kim <rakie.kim@sk.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: SeongJae Park <sj@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Wei Xu <weixugc@google.com>
Cc: xu xin <xu.xin16@zte.com.cn>
Cc: Yuanchu Xie <yuanchu@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

authored by Lorenzo Stoakes and committed by Andrew Morton
fb888710 68aa2fdb

+104 -85
+34 -15
fs/proc/task_mmu.c
···
 		young = pte_young(ptent);
 		dirty = pte_dirty(ptent);
 		present = true;
-	} else if (is_swap_pte(ptent)) {
+	} else if (pte_none(ptent)) {
+		smaps_pte_hole_lookup(addr, walk);
+	} else {
 		swp_entry_t swpent = pte_to_swp_entry(ptent);
 
 		if (!non_swap_entry(swpent)) {
···
 				present = true;
 			page = pfn_swap_entry_to_page(swpent);
 		}
-	} else {
-		smaps_pte_hole_lookup(addr, walk);
-		return;
 	}
 
 	if (!page)
···
 	 */
 	pte_t ptent = ptep_get(pte);
 
+	if (pte_none(ptent))
+		return;
+
 	if (pte_present(ptent)) {
 		pte_t old_pte;
 
···
 		ptent = pte_wrprotect(old_pte);
 		ptent = pte_clear_soft_dirty(ptent);
 		ptep_modify_prot_commit(vma, addr, pte, old_pte, ptent);
-	} else if (is_swap_pte(ptent)) {
+	} else {
 		ptent = pte_swp_clear_soft_dirty(ptent);
 		set_pte_at(vma->vm_mm, addr, pte, ptent);
 	}
···
 	struct page *page = NULL;
 	struct folio *folio;
 
+	if (pte_none(pte))
+		goto out;
+
 	if (pte_present(pte)) {
 		if (pm->show_pfn)
 			frame = pte_pfn(pte);
···
 			flags |= PM_SOFT_DIRTY;
 		if (pte_uffd_wp(pte))
 			flags |= PM_UFFD_WP;
-	} else if (is_swap_pte(pte)) {
+	} else {
 		swp_entry_t entry;
+
 		if (pte_swp_soft_dirty(pte))
 			flags |= PM_SOFT_DIRTY;
 		if (pte_swp_uffd_wp(pte))
···
 		entry = pte_to_swp_entry(pte);
 		if (pm->show_pfn) {
 			pgoff_t offset;
+
 			/*
 			 * For PFN swap offsets, keeping the offset field
 			 * to be PFN only to be compatible with old smaps.
···
 		    __folio_page_mapped_exclusively(folio, page))
 			flags |= PM_MMAP_EXCLUSIVE;
 	}
+
+out:
 	if (vma->vm_flags & VM_SOFTDIRTY)
 		flags |= PM_SOFT_DIRTY;
 
···
 					   struct vm_area_struct *vma,
 					   unsigned long addr, pte_t pte)
 {
-	unsigned long categories = 0;
+	unsigned long categories;
+
+	if (pte_none(pte))
+		return 0;
 
 	if (pte_present(pte)) {
 		struct page *page;
 
-		categories |= PAGE_IS_PRESENT;
+		categories = PAGE_IS_PRESENT;
+
 		if (!pte_uffd_wp(pte))
 			categories |= PAGE_IS_WRITTEN;
 
···
 			categories |= PAGE_IS_PFNZERO;
 		if (pte_soft_dirty(pte))
 			categories |= PAGE_IS_SOFT_DIRTY;
-	} else if (is_swap_pte(pte)) {
+	} else {
 		softleaf_t entry;
 
-		categories |= PAGE_IS_SWAPPED;
+		categories = PAGE_IS_SWAPPED;
+
 		if (!pte_swp_uffd_wp_any(pte))
 			categories |= PAGE_IS_WRITTEN;
 
···
 		old_pte = ptep_modify_prot_start(vma, addr, pte);
 		ptent = pte_mkuffd_wp(old_pte);
 		ptep_modify_prot_commit(vma, addr, pte, old_pte, ptent);
-	} else if (is_swap_pte(ptent)) {
-		ptent = pte_swp_mkuffd_wp(ptent);
-		set_pte_at(vma->vm_mm, addr, pte, ptent);
-	} else {
+	} else if (pte_none(ptent)) {
 		set_pte_at(vma->vm_mm, addr, pte,
 			   make_pte_marker(PTE_MARKER_UFFD_WP));
+	} else {
+		ptent = pte_swp_mkuffd_wp(ptent);
+		set_pte_at(vma->vm_mm, addr, pte, ptent);
 	}
 }
 
···
 {
 	unsigned long categories = PAGE_IS_HUGE;
 
+	if (pte_none(pte))
+		return categories;
+
 	/*
 	 * According to pagemap_hugetlb_range(), file-backed HugeTLB
 	 * page cannot be swapped. So PAGE_IS_FILE is not checked for
···
 	 */
 	if (pte_present(pte)) {
 		categories |= PAGE_IS_PRESENT;
+
 		if (!huge_pte_uffd_wp(pte))
 			categories |= PAGE_IS_WRITTEN;
 		if (!PageAnon(pte_page(pte)))
···
 			categories |= PAGE_IS_PFNZERO;
 		if (pte_soft_dirty(pte))
 			categories |= PAGE_IS_SOFT_DIRTY;
-	} else if (is_swap_pte(pte)) {
+	} else {
 		categories |= PAGE_IS_SWAPPED;
+
 		if (!pte_swp_uffd_wp_any(pte))
 			categories |= PAGE_IS_WRITTEN;
 		if (pte_swp_soft_dirty(pte))
+1 -2
include/linux/userfaultfd_k.h
···
 static inline bool pte_swp_uffd_wp_any(pte_t pte)
 {
 #ifdef CONFIG_PTE_MARKER_UFFD_WP
-	if (!is_swap_pte(pte))
+	if (pte_present(pte))
 		return false;
-
 	if (pte_swp_uffd_wp(pte))
 		return true;
 
+3 -3
mm/hugetlb.c
···
 
 	pte = huge_ptep_get_and_clear(mm, old_addr, src_pte, sz);
 
-	if (need_clear_uffd_wp && pte_is_uffd_wp_marker(pte))
+	if (need_clear_uffd_wp && pte_is_uffd_wp_marker(pte)) {
 		huge_pte_clear(mm, new_addr, dst_pte, sz);
-	else {
+	} else {
 		if (need_clear_uffd_wp) {
 			if (pte_present(pte))
 				pte = huge_pte_clear_uffd_wp(pte);
-			else if (is_swap_pte(pte))
+			else
 				pte = pte_swp_clear_uffd_wp(pte);
 		}
 		set_huge_pte_at(mm, new_addr, dst_pte, pte, sz);
+2 -4
mm/internal.h
···
 /**
  * pte_move_swp_offset - Move the swap entry offset field of a swap pte
  *	 forward or backward by delta
- * @pte: The initial pte state; is_swap_pte(pte) must be true and
- *	 non_swap_entry() must be false.
+ * @pte: The initial pte state; must be a swap entry
  * @delta: The direction and the offset we are moving; forward if delta
  *	 is positive; backward if delta is negative
  *
···
 
 /**
  * pte_next_swp_offset - Increment the swap entry offset field of a swap pte.
- * @pte: The initial pte state; is_swap_pte(pte) must be true and
- *	 non_swap_entry() must be false.
+ * @pte: The initial pte state; must be a swap entry.
  *
  * Increments the swap offset, while maintaining all other fields, including
  * swap type, and any swp pte bits. The resulting pte is returned.
+15 -14
mm/khugepaged.c
···
 		}
 
 		vmf.orig_pte = ptep_get_lockless(pte);
-		if (!is_swap_pte(vmf.orig_pte))
+		if (pte_none(vmf.orig_pte) ||
+		    pte_present(vmf.orig_pte))
 			continue;
 
 		vmf.pte = pte;
···
 	for (addr = start_addr, _pte = pte; _pte < pte + HPAGE_PMD_NR;
 	     _pte++, addr += PAGE_SIZE) {
 		pte_t pteval = ptep_get(_pte);
-		if (is_swap_pte(pteval)) {
+		if (pte_none_or_zero(pteval)) {
+			++none_or_zero;
+			if (!userfaultfd_armed(vma) &&
+			    (!cc->is_khugepaged ||
+			     none_or_zero <= khugepaged_max_ptes_none)) {
+				continue;
+			} else {
+				result = SCAN_EXCEED_NONE_PTE;
+				count_vm_event(THP_SCAN_EXCEED_NONE_PTE);
+				goto out_unmap;
+			}
+		}
+		if (!pte_present(pteval)) {
 			++unmapped;
 			if (!cc->is_khugepaged ||
 			    unmapped <= khugepaged_max_ptes_swap) {
···
 			} else {
 				result = SCAN_EXCEED_SWAP_PTE;
 				count_vm_event(THP_SCAN_EXCEED_SWAP_PTE);
-				goto out_unmap;
-			}
-		}
-		if (pte_none_or_zero(pteval)) {
-			++none_or_zero;
-			if (!userfaultfd_armed(vma) &&
-			    (!cc->is_khugepaged ||
-			     none_or_zero <= khugepaged_max_ptes_none)) {
-				continue;
-			} else {
-				result = SCAN_EXCEED_NONE_PTE;
-				count_vm_event(THP_SCAN_EXCEED_NONE_PTE);
 				goto out_unmap;
 			}
 		}
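
The second khugepaged hunk above moves the pte_none_or_zero() block ahead
of the swap check. A plausible reading is that a bare !pte_present() test
also matches none entries, so the none case has to be consumed first. The
sketch below models only the counting, not the thresholds or exits; the
enum, struct and function names are hypothetical and the "wrong order"
variant is a counter-example that does not appear in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical entry kinds for a user-space model of the scan loop. */
enum mock_pte_kind { MOCK_NONE, MOCK_ZERO_PAGE, MOCK_PRESENT, MOCK_SWAP };

struct scan_counts {
	int none_or_zero;
	int unmapped;
	int mapped;
};

/* In this model the zero page is a present mapping. */
static inline bool mock_pte_present(enum mock_pte_kind kind)
{
	return kind == MOCK_ZERO_PAGE || kind == MOCK_PRESENT;
}

static inline bool mock_pte_none_or_zero(enum mock_pte_kind kind)
{
	return kind == MOCK_NONE || kind == MOCK_ZERO_PAGE;
}

/* Order used by the patch: none/zero first, then "not present" is swap. */
static inline struct scan_counts scan_none_first(const enum mock_pte_kind *ptes,
						 size_t n)
{
	struct scan_counts c = { 0, 0, 0 };
	size_t i;

	for (i = 0; i < n; i++) {
		if (mock_pte_none_or_zero(ptes[i]))
			c.none_or_zero++;
		else if (!mock_pte_present(ptes[i]))
			c.unmapped++;
		else
			c.mapped++;
	}
	return c;
}

/* Counter-example: testing "not present" first miscounts none entries. */
static inline struct scan_counts scan_not_present_first(const enum mock_pte_kind *ptes,
							size_t n)
{
	struct scan_counts c = { 0, 0, 0 };
	size_t i;

	for (i = 0; i < n; i++) {
		if (!mock_pte_present(ptes[i]))
			c.unmapped++;
		else if (mock_pte_none_or_zero(ptes[i]))
			c.none_or_zero++;
		else
			c.mapped++;
	}
	return c;
}
```

With one entry of each kind, the first function attributes the none entry
to none_or_zero, while the counter-example attributes it to unmapped.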
+1 -1
mm/migrate.c
···
 	pte = ptep_get(ptep);
 	pte_unmap(ptep);
 
-	if (!is_swap_pte(pte))
+	if (pte_none(pte) || pte_present(pte))
 		goto out;
 
 	entry = pte_to_swp_entry(pte);
+20 -23
mm/mprotect.c
···
 			prot_commit_flush_ptes(vma, addr, pte, oldpte, ptent,
 				nr_ptes, /* idx = */ 0, /* set_write = */ false, tlb);
 			pages += nr_ptes;
-		} else if (is_swap_pte(oldpte)) {
+		} else if (pte_none(oldpte)) {
+			/*
+			 * Nobody plays with any none ptes besides
+			 * userfaultfd when applying the protections.
+			 */
+			if (likely(!uffd_wp))
+				continue;
+
+			if (userfaultfd_wp_use_markers(vma)) {
+				/*
+				 * For file-backed mem, we need to be able to
+				 * wr-protect a none pte, because even if the
+				 * pte is none, the page/swap cache could
+				 * exist. Doing that by install a marker.
+				 */
+				set_pte_at(vma->vm_mm, addr, pte,
+					   make_pte_marker(PTE_MARKER_UFFD_WP));
+				pages++;
+			}
+		} else {
 			swp_entry_t entry = pte_to_swp_entry(oldpte);
 			pte_t newpte;
 
···
 
 			if (!pte_same(oldpte, newpte)) {
 				set_pte_at(vma->vm_mm, addr, pte, newpte);
-				pages++;
-			}
-		} else {
-			/* It must be an none page, or what else?.. */
-			WARN_ON_ONCE(!pte_none(oldpte));
-
-			/*
-			 * Nobody plays with any none ptes besides
-			 * userfaultfd when applying the protections.
-			 */
-			if (likely(!uffd_wp))
-				continue;
-
-			if (userfaultfd_wp_use_markers(vma)) {
-				/*
-				 * For file-backed mem, we need to be able to
-				 * wr-protect a none pte, because even if the
-				 * pte is none, the page/swap cache could
-				 * exist. Doing that by install a marker.
-				 */
-				set_pte_at(vma->vm_mm, addr, pte,
-					   make_pte_marker(PTE_MARKER_UFFD_WP));
 				pages++;
 			}
 		}
+5 -2
mm/mremap.c
···
 
 static pte_t move_soft_dirty_pte(pte_t pte)
 {
+	if (pte_none(pte))
+		return pte;
+
 	/*
 	 * Set soft dirty bit so we can notice
 	 * in userspace the ptes were moved.
···
 #ifdef CONFIG_MEM_SOFT_DIRTY
 	if (pte_present(pte))
 		pte = pte_mksoft_dirty(pte);
-	else if (is_swap_pte(pte))
+	else
 		pte = pte_swp_mksoft_dirty(pte);
 #endif
 	return pte;
···
 			if (need_clear_uffd_wp) {
 				if (pte_present(pte))
 					pte = pte_clear_uffd_wp(pte);
-				else if (is_swap_pte(pte))
+				else
 					pte = pte_swp_clear_uffd_wp(pte);
 			}
 			set_ptes(mm, new_addr, new_ptep, pte, nr_ptes);
+8 -5
mm/page_table_check.c
···
 		is_writable_migration_entry(entry);
 }
 
-static inline void page_table_check_pte_flags(pte_t pte)
+static void page_table_check_pte_flags(pte_t pte)
 {
-	if (pte_present(pte) && pte_uffd_wp(pte))
-		WARN_ON_ONCE(pte_write(pte));
-	else if (is_swap_pte(pte) && pte_swp_uffd_wp(pte))
-		WARN_ON_ONCE(swap_cached_writable(pte_to_swp_entry(pte)));
+	if (pte_present(pte)) {
+		WARN_ON_ONCE(pte_uffd_wp(pte) && pte_write(pte));
+	} else if (pte_swp_uffd_wp(pte)) {
+		const swp_entry_t entry = pte_to_swp_entry(pte);
+
+		WARN_ON_ONCE(swap_cached_writable(entry));
+	}
 }
 
 void __page_table_check_ptes_set(struct mm_struct *mm, pte_t *ptep, pte_t pte,
+15 -16
mm/page_vma_mapped.c
···
 static bool map_pte(struct page_vma_mapped_walk *pvmw, pmd_t *pmdvalp,
 		    spinlock_t **ptlp)
 {
+	bool is_migration;
 	pte_t ptent;
 
 	if (pvmw->flags & PVMW_SYNC) {
···
 		return !!pvmw->pte;
 	}
 
+	is_migration = pvmw->flags & PVMW_MIGRATION;
 again:
 	/*
 	 * It is important to return the ptl corresponding to pte,
···
 
 	ptent = ptep_get(pvmw->pte);
 
-	if (pvmw->flags & PVMW_MIGRATION) {
-		if (!is_swap_pte(ptent))
+	if (pte_none(ptent)) {
+		return false;
+	} else if (pte_present(ptent)) {
+		if (is_migration)
 			return false;
-	} else if (is_swap_pte(ptent)) {
+	} else if (!is_migration) {
 		swp_entry_t entry;
+
 		/*
 		 * Handle un-addressable ZONE_DEVICE memory.
 		 *
···
 		if (!is_device_private_entry(entry) &&
 		    !is_device_exclusive_entry(entry))
 			return false;
-	} else if (!pte_present(ptent)) {
-		return false;
 	}
 	spin_lock(*ptlp);
 	if (unlikely(!pmd_same(*pmdvalp, pmdp_get_lockless(pvmw->pmd)))) {
···
 			return false;
 
 		pfn = softleaf_to_pfn(entry);
-	} else if (is_swap_pte(ptent)) {
-		swp_entry_t entry;
+	} else if (pte_present(ptent)) {
+		pfn = pte_pfn(ptent);
+	} else {
+		const softleaf_t entry = softleaf_from_pte(ptent);
 
 		/* Handle un-addressable ZONE_DEVICE memory */
-		entry = pte_to_swp_entry(ptent);
-		if (!is_device_private_entry(entry) &&
-		    !is_device_exclusive_entry(entry))
+		if (!softleaf_is_device_private(entry) &&
+		    !softleaf_is_device_exclusive(entry))
 			return false;
 
-		pfn = swp_offset_pfn(entry);
-	} else {
-		if (!pte_present(ptent))
-			return false;
-
-		pfn = pte_pfn(ptent);
+		pfn = softleaf_to_pfn(entry);
 	}
 
 	if ((pfn + pte_nr - 1) < pvmw->pfn)
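
The map_pte() rewrite above restructures the early-exit checks rather
than just dropping a predicate, so it is a useful case for spot-checking
the "no functional change intended" claim. The sketch below is a
user-space truth-table comparison under stated assumptions: the entry
kinds are an enum, is_device_entry stands in for "device private or
device exclusive", and all names are hypothetical. It models only the
early-exit decision, not the locking or retry that follows.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical entry kinds; MOCK_SWAP covers swap and non-swap entries. */
enum mock_pte_kind { MOCK_NONE, MOCK_PRESENT, MOCK_SWAP };

/* Old control flow, with is_swap_pte() modelled as kind == MOCK_SWAP. */
static inline bool map_pte_proceeds_old(enum mock_pte_kind kind,
					bool is_migration, bool is_device_entry)
{
	bool is_swap = (kind == MOCK_SWAP);

	if (is_migration) {
		if (!is_swap)
			return false;
	} else if (is_swap) {
		if (!is_device_entry)
			return false;
	} else if (kind != MOCK_PRESENT) {
		return false;
	}
	return true;
}

/* New control flow: none first, then present, then everything else. */
static inline bool map_pte_proceeds_new(enum mock_pte_kind kind,
					bool is_migration, bool is_device_entry)
{
	if (kind == MOCK_NONE) {
		return false;
	} else if (kind == MOCK_PRESENT) {
		if (is_migration)
			return false;
	} else if (!is_migration) {
		if (!is_device_entry)
			return false;
	}
	return true;
}

/* Asserts that old and new flows agree on all 3 x 2 x 2 input combinations. */
static inline void check_map_pte_equivalence(void)
{
	int kind, mig, dev;

	for (kind = MOCK_NONE; kind <= MOCK_SWAP; kind++)
		for (mig = 0; mig <= 1; mig++)
			for (dev = 0; dev <= 1; dev++)
				assert(map_pte_proceeds_old(kind, mig, dev) ==
				       map_pte_proceeds_new(kind, mig, dev));
}
```

Within this simplified model the two flows agree on every combination,
which is consistent with the commit message; it says nothing about
behaviour outside the modelled checks.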