Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

mm: stop leaking PageTables

4.10-rc loadtest (even on x86, and even without THPCache) fails with
"fork: Cannot allocate memory" or some such; and /proc/meminfo shows
PageTables growing.
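
To watch for the symptom described above, the PageTables line can be sampled before and after a load test. The following is a minimal userspace sketch, not part of the commit; the helper names pagetables_kb() and read_pagetables_kb() are hypothetical, and it assumes the usual "PageTables: <n> kB" layout of /proc/meminfo.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Return the PageTables value in kB from meminfo-formatted text,
 * or -1 if no such line is present. (Hypothetical helper.)
 */
long pagetables_kb(const char *text)
{
	const char *line = text;
	long kb;

	assert(text != NULL);
	while (line && *line) {
		/* Only a line that starts with "PageTables:" matches. */
		if (sscanf(line, "PageTables: %ld kB", &kb) == 1)
			return kb;
		/* Advance to the start of the next line, if any. */
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return -1;
}

/* Read /proc/meminfo and return PageTables in kB, or -1 on failure. */
long read_pagetables_kb(void)
{
	char buf[8192];
	size_t n;
	FILE *f = fopen("/proc/meminfo", "r");

	if (!f)
		return -1;
	n = fread(buf, 1, sizeof(buf) - 1, f);
	fclose(f);
	buf[n] = '\0';
	return pagetables_kb(buf);
}
```

Calling read_pagetables_kb() at intervals during the load test and seeing the value climb without settling would be consistent with the leak; a stable value after the fix is the expected outcome.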

Commit 953c66c2b22a ("mm: THP page cache support for ppc64") that got
merged in rc1 removed the freeing of an unused preallocated pagetable
after do_fault_around() has called map_pages().

This is usually a good optimization, so that the followup doesn't have
to reallocate one; but it's not sufficient to shift the freeing into
alloc_set_pte(), since there are failure cases (most commonly
VM_FAULT_RETRY) which never reach finish_fault().

Check and free it at the outer level in do_fault(), then we don't need
to worry in alloc_set_pte(), and can restore that to how it was (I
cannot find any reason to pte_free() under lock as it was doing).
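
The difference between freeing in the finish step and freeing at the outer level can be modelled in userspace. This is only a sketch under stated assumptions: every toy_* name and the FAULT_* codes are hypothetical stand-ins, a counter plays the role of PageTables, and none of this is kernel code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel's fault return codes. */
enum { FAULT_OK = 0, FAULT_RETRY = 1 };

/* Toy model of struct vm_fault: only the preallocated table matters. */
struct toy_fault {
	void *prealloc;   /* models vmf->prealloc_pte */
	int   fail_early; /* simulate a path such as VM_FAULT_RETRY */
};

static int live_tables; /* models the PageTables counter */

void *toy_table_alloc(void)
{
	void *t = malloc(64);

	if (t)
		live_tables++;
	return t;
}

void toy_table_free(void *t)
{
	free(t);
	live_tables--;
}

int toy_live_tables(void)
{
	return live_tables;
}

/* Inner handler: preallocates, then may bail out before the finish step. */
static int toy_inner(struct toy_fault *f, int free_in_finish)
{
	f->prealloc = toy_table_alloc();
	if (f->fail_early)
		return FAULT_RETRY; /* the finish step is never reached */

	/* Finish step: the table went unused in this model. */
	if (free_in_finish && f->prealloc) {
		toy_table_free(f->prealloc);
		f->prealloc = NULL;
	}
	return FAULT_OK;
}

/* Cleanup only in the finish step: leaks on the early-return path. */
int toy_do_fault_leaky(struct toy_fault *f)
{
	return toy_inner(f, 1);
}

/* Cleanup at the outer level: covers every return path of the inner call. */
int toy_do_fault_fixed(struct toy_fault *f)
{
	int ret = toy_inner(f, 0);

	if (f->prealloc) {
		toy_table_free(f->prealloc);
		f->prealloc = NULL;
	}
	return ret;
}
```

With fail_early set, toy_do_fault_leaky() returns with the table still allocated, while toy_do_fault_fixed() releases it regardless of which path the inner handler took.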

And fix a separate pagetable leak, or crash, introduced by the same
change, that could only show up on some ppc64: why does do_set_pmd()'s
failure case attempt to withdraw a pagetable when it never deposited
one, at the same time overwriting (so leaking) the vmf->prealloc_pte?
Residue of an earlier implementation, perhaps? Delete it.

Fixes: 953c66c2b22a ("mm: THP page cache support for ppc64")
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Authored by Hugh Dickins and committed by Linus Torvalds
b0b9b3df 87bc6107

+20 -27
mm/memory.c
···
 	ret = 0;
 	count_vm_event(THP_FILE_MAPPED);
 out:
-	/*
-	 * If we are going to fallback to pte mapping, do a
-	 * withdraw with pmd lock held.
-	 */
-	if (arch_needs_pgtable_deposit() && ret == VM_FAULT_FALLBACK)
-		vmf->prealloc_pte = pgtable_trans_huge_withdraw(vma->vm_mm,
-							       vmf->pmd);
 	spin_unlock(vmf->ptl);
 	return ret;
 }
···

 		ret = do_set_pmd(vmf, page);
 		if (ret != VM_FAULT_FALLBACK)
-			goto fault_handled;
+			return ret;
 	}

 	if (!vmf->pte) {
 		ret = pte_alloc_one_map(vmf);
 		if (ret)
-			goto fault_handled;
+			return ret;
 	}

 	/* Re-check under ptl */
-	if (unlikely(!pte_none(*vmf->pte))) {
-		ret = VM_FAULT_NOPAGE;
-		goto fault_handled;
-	}
+	if (unlikely(!pte_none(*vmf->pte)))
+		return VM_FAULT_NOPAGE;

 	flush_icache_page(vma, page);
 	entry = mk_pte(page, vma->vm_page_prot);
···

 	/* no need to invalidate: a not-present page won't be cached */
 	update_mmu_cache(vma, vmf->address, vmf->pte);
-	ret = 0;

-fault_handled:
-	/* preallocated pagetable is unused: free it */
-	if (vmf->prealloc_pte) {
-		pte_free(vmf->vma->vm_mm, vmf->prealloc_pte);
-		vmf->prealloc_pte = 0;
-	}
-	return ret;
+	return 0;
 }


···
 static int do_fault(struct vm_fault *vmf)
 {
 	struct vm_area_struct *vma = vmf->vma;
+	int ret;

 	/* The VMA was not fully populated on mmap() or missing VM_DONTEXPAND */
 	if (!vma->vm_ops->fault)
-		return VM_FAULT_SIGBUS;
-	if (!(vmf->flags & FAULT_FLAG_WRITE))
-		return do_read_fault(vmf);
-	if (!(vma->vm_flags & VM_SHARED))
-		return do_cow_fault(vmf);
-	return do_shared_fault(vmf);
+		ret = VM_FAULT_SIGBUS;
+	else if (!(vmf->flags & FAULT_FLAG_WRITE))
+		ret = do_read_fault(vmf);
+	else if (!(vma->vm_flags & VM_SHARED))
+		ret = do_cow_fault(vmf);
+	else
+		ret = do_shared_fault(vmf);
+
+	/* preallocated pagetable is unused: free it */
+	if (vmf->prealloc_pte) {
+		pte_free(vma->vm_mm, vmf->prealloc_pte);
+		vmf->prealloc_pte = 0;
+	}
+	return ret;
 }

 static int numa_migrate_prep(struct page *page, struct vm_area_struct *vma,