Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

thp: fix page_referenced to modify mapcount/vm_flags only if page is found

When vmscan.c calls page_referenced(), if an anon page was created
before a process forked, rmap will search for it in both of the
processes, even though one of them might have since broken COW.

If the child process mlocks the vma that the COWed page belongs to,
page_referenced() running on the page mapped by the parent would
erroneously set VM_LOCKED in *vm_flags (so the references on the
parent page are ignored and the parent page is evicted too early).

*mapcount would also be decremented by page_referenced_one() even if the
page wasn't found by page_check_address().

This also lets pmdp_clear_flush_young_notify() go ahead on a
pmd_trans_splitting() pmd.

We hold the page_table_lock, so __split_huge_page_map() must wait for
pmdp_clear_flush_young_notify() to complete before it can modify the
pmd. The pmd is also still mapped in userland, so the young bit may
materialize through a TLB miss before __split_huge_page_map() runs.

This provides more accurate page_referenced() behavior during
split_huge_page().

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Reported-by: Michel Lespinasse <walken@google.com>
Reviewed-by: Michel Lespinasse <walken@google.com>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
Reviewed-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Rik van Riel <riel@redhat.com>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Authored by Andrea Arcangeli and committed by Linus Torvalds
2da28bfd 78f9bbb5

+35 -19
mm/rmap.c
···
 	struct mm_struct *mm = vma->vm_mm;
 	int referenced = 0;
 
-	/*
-	 * Don't want to elevate referenced for mlocked page that gets this far,
-	 * in order that it progresses to try_to_unmap and is moved to the
-	 * unevictable list.
-	 */
-	if (vma->vm_flags & VM_LOCKED) {
-		*mapcount = 0;	/* break early from loop */
-		*vm_flags |= VM_LOCKED;
-		goto out;
-	}
-
-	/* Pretend the page is referenced if the task has the
-	   swap token and is in the middle of a page fault. */
-	if (mm != current->mm && has_swap_token(mm) &&
-			rwsem_is_locked(&mm->mmap_sem))
-		referenced++;
-
 	if (unlikely(PageTransHuge(page))) {
 		pmd_t *pmd;
 
 		spin_lock(&mm->page_table_lock);
+		/*
+		 * rmap might return false positives; we must filter
+		 * these out using page_check_address_pmd().
+		 */
 		pmd = page_check_address_pmd(page, mm, address,
 					     PAGE_CHECK_ADDRESS_PMD_FLAG);
-		if (pmd && !pmd_trans_splitting(*pmd) &&
-		    pmdp_clear_flush_young_notify(vma, address, pmd))
+		if (!pmd) {
+			spin_unlock(&mm->page_table_lock);
+			goto out;
+		}
+
+		if (vma->vm_flags & VM_LOCKED) {
+			spin_unlock(&mm->page_table_lock);
+			*mapcount = 0;	/* break early from loop */
+			*vm_flags |= VM_LOCKED;
+			goto out;
+		}
+
+		/* go ahead even if the pmd is pmd_trans_splitting() */
+		if (pmdp_clear_flush_young_notify(vma, address, pmd))
 			referenced++;
 		spin_unlock(&mm->page_table_lock);
 	} else {
 		pte_t *pte;
 		spinlock_t *ptl;
 
+		/*
+		 * rmap might return false positives; we must filter
+		 * these out using page_check_address().
+		 */
 		pte = page_check_address(page, mm, address, &ptl, 0);
 		if (!pte)
 			goto out;
+
+		if (vma->vm_flags & VM_LOCKED) {
+			pte_unmap_unlock(pte, ptl);
+			*mapcount = 0;	/* break early from loop */
+			*vm_flags |= VM_LOCKED;
+			goto out;
+		}
 
 		if (ptep_clear_flush_young_notify(vma, address, pte)) {
 			/*
···
 		}
 		pte_unmap_unlock(pte, ptl);
 	}
+
+	/* Pretend the page is referenced if the task has the
+	   swap token and is in the middle of a page fault. */
+	if (mm != current->mm && has_swap_token(mm) &&
+			rwsem_is_locked(&mm->mmap_sem))
+		referenced++;
 
 	(*mapcount)--;
 
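
The reordering in this patch can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not kernel code: `struct toy_vma`, `TOY_VM_LOCKED`, `toy_referenced_one_old()` and `toy_referenced_one_new()` are hypothetical names invented here, the `maps_page` field stands in for the result of the page-table lookup, and locking, the swap token check and the difference between the pte and pmd branches are all left out. The "old" variant is loosely modeled on the pre-patch huge-page branch, where VM_LOCKED was tested before the lookup and `*mapcount` was decremented regardless of its result.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; none of these are kernel definitions. */
#define TOY_VM_LOCKED 0x1UL

struct toy_vma {
	unsigned long vm_flags;	/* TOY_VM_LOCKED if the vma is mlocked */
	bool maps_page;		/* would the page-table lookup find the page here? */
	bool young;		/* accessed bit set in this mapping */
};

/* Pre-patch ordering (simplified): VM_LOCKED is tested before the lookup. */
int toy_referenced_one_old(struct toy_vma *vma, int *mapcount,
			   unsigned long *vm_flags)
{
	int referenced = 0;

	assert(vma && mapcount && vm_flags);

	if (vma->vm_flags & TOY_VM_LOCKED) {
		*mapcount = 0;			/* break early from loop */
		*vm_flags |= TOY_VM_LOCKED;	/* set even if the page is not mapped here */
		return referenced;
	}

	if (vma->maps_page && vma->young) {
		vma->young = false;	/* test and clear the accessed bit */
		referenced++;
	}

	(*mapcount)--;	/* decremented even when the lookup failed */
	return referenced;
}

/* Post-patch ordering (simplified): filter out rmap false positives first. */
int toy_referenced_one_new(struct toy_vma *vma, int *mapcount,
			   unsigned long *vm_flags)
{
	int referenced = 0;

	assert(vma && mapcount && vm_flags);

	if (!vma->maps_page)
		return referenced;	/* page not found: leave outputs untouched */

	if (vma->vm_flags & TOY_VM_LOCKED) {
		*mapcount = 0;			/* break early from loop */
		*vm_flags |= TOY_VM_LOCKED;
		return referenced;
	}

	if (vma->young) {
		vma->young = false;	/* test and clear the accessed bit */
		referenced++;
	}

	(*mapcount)--;
	return referenced;
}
```

In this model, a child vma that is mlocked but no longer maps the page (it broke COW) makes the old variant zero `*mapcount` and report `TOY_VM_LOCKED`, while the new variant returns without touching either output, so a caller's walk can still reach the parent's mapping.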