Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

mm: avoid leaving partial pfn mappings around in error case

As Jann points out, PFN mappings are special, because unlike normal
memory mappings, there is no lifetime information associated with the
mapping - it is just a raw mapping of PFNs with no reference counting of
a 'struct page'.

That's all very much intentional, but it does mean that it's easy to
mess up the cleanup in case of errors. Yes, a failed mmap() will always
eventually clean up any partial mappings, but without any explicit
lifetime in the page table mapping itself, it's very easy to do the
error handling in the wrong order.

In particular, it's easy to mistakenly free the physical backing store
before the page tables are actually cleaned up and (temporarily) have
stale dangling PTE entries.

To make this situation less error-prone, just make sure that any partial
pfn mapping is torn down early, before any other error handling.

Reported-and-tested-by: Jann Horn <jannh@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Simona Vetter <simona.vetter@ffwll.ch>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

+22 -5
mm/memory.c
···
 	return 0;
 }
 
-/*
- * Variant of remap_pfn_range that does not call track_pfn_remap. The caller
- * must have pre-validated the caching bits of the pgprot_t.
- */
-int remap_pfn_range_notrack(struct vm_area_struct *vma, unsigned long addr,
+static int remap_pfn_range_internal(struct vm_area_struct *vma, unsigned long addr,
 		unsigned long pfn, unsigned long size, pgprot_t prot)
 {
 	pgd_t *pgd;
···
 	} while (pgd++, addr = next, addr != end);
 
 	return 0;
+}
+
+/*
+ * Variant of remap_pfn_range that does not call track_pfn_remap. The caller
+ * must have pre-validated the caching bits of the pgprot_t.
+ */
+int remap_pfn_range_notrack(struct vm_area_struct *vma, unsigned long addr,
+		unsigned long pfn, unsigned long size, pgprot_t prot)
+{
+	int error = remap_pfn_range_internal(vma, addr, pfn, size, prot);
+
+	if (!error)
+		return 0;
+
+	/*
+	 * A partial pfn range mapping is dangerous: it does not
+	 * maintain page reference counts, and callers may free
+	 * pages due to the error. So zap it early.
+	 */
+	zap_page_range_single(vma, addr, size, NULL);
+	return error;
 }
 
 /**
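
The "tear down the partial mapping before returning the error" pattern from this commit can be illustrated outside the kernel with a toy model. The following is only a user-space sketch under stated assumptions: `struct toy_table`, `toy_map_internal()`, `toy_zap()`, `toy_map()` and `toy_count_mapped()` are hypothetical names invented for illustration, a fixed array of slots stands in for the page tables, and the `fail_at` parameter exists only to force a failure part-way through. None of this is kernel API.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define NSLOTS 8

/* Hypothetical stand-in for a page table: each slot is one "PTE". */
struct toy_table {
	bool present[NSLOTS];
	unsigned long pfn[NSLOTS];
};

/*
 * Hypothetical analogue of remap_pfn_range_internal(): fills slots
 * [start, start + count) one by one. If it reaches slot 'fail_at' it
 * returns an error and leaves the earlier slots populated, which is
 * the "partial mapping" the commit message warns about.
 */
static int toy_map_internal(struct toy_table *t, size_t start, size_t count,
			    unsigned long pfn, size_t fail_at)
{
	for (size_t i = 0; i < count; i++) {
		size_t slot = start + i;

		if (slot >= NSLOTS)
			return -EINVAL;
		if (slot == fail_at)
			return -ENOMEM;	/* simulated mid-way failure */
		t->present[slot] = true;
		t->pfn[slot] = pfn + i;
	}
	return 0;
}

/* Hypothetical analogue of zap_page_range_single(): clear the whole range. */
void toy_zap(struct toy_table *t, size_t start, size_t count)
{
	for (size_t i = 0; i < count && start + i < NSLOTS; i++) {
		t->present[start + i] = false;
		t->pfn[start + i] = 0;
	}
}

/*
 * Wrapper in the shape of the patched remap_pfn_range_notrack(): on
 * error, zap the range first, then report the error, so the caller
 * never sees a half-populated range.
 */
int toy_map(struct toy_table *t, size_t start, size_t count,
	    unsigned long pfn, size_t fail_at)
{
	int error = toy_map_internal(t, start, count, pfn, fail_at);

	if (!error)
		return 0;

	toy_zap(t, start, count);
	return error;
}

/* Count populated slots, for inspecting the table after a call. */
size_t toy_count_mapped(const struct toy_table *t)
{
	size_t n = 0;

	for (size_t i = 0; i < NSLOTS; i++)
		n += t->present[i];
	return n;
}
```

In this model, a caller that invokes `toy_map()` and gets an error back can release its backing store immediately, because no slot in the requested range still refers to it. Calling `toy_map_internal()` directly would leave the slots before `fail_at` populated, which corresponds to the stale entries described in the commit message.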