Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

mm: avoid unsafe VMA hook invocation when error arises on mmap hook

Patch series "fix error handling in mmap_region() and refactor
(hotfixes)", v4.

mmap_region() is somewhat terrifying, with spaghetti-like control flow and
numerous ways in which incomplete state, memory leaks and other
unpleasantness can arise.

A large amount of the complexity arises from trying to handle errors late
in the process of mapping a VMA, which forms the basis of recently
observed issues with resource leaks and observable inconsistent state.

This series goes to great lengths to simplify how mmap_region() works and
to avoid unwinding errors late on in the process of setting up the VMA for
the new mapping, and equally avoids such operations occurring while the
VMA is in an inconsistent state.

The patches in this series comprise the minimal changes required to
resolve existing issues in mmap_region() error handling, in order that
they can be hotfixed and backported. There is additionally a follow up
series which goes further, separated out from the v1 series and sent and
updated separately.


This patch (of 5):

After an attempted mmap() fails, we are no longer in a situation where we
can safely interact with VMA hooks. This is currently not enforced,
meaning that we need complicated handling to ensure we do not incorrectly
call these hooks.

We can avoid the whole issue by treating the VMA as suspect the moment
that the file->f_op->mmap() function reports an error by replacing
whatever VMA operations were installed with a dummy empty set of VMA
operations.

We do so through a new helper function internal to mm - mmap_file() -
which is both more logically named than the existing call_mmap() function
and correctly isolates handling of the vm_op reassignment to mm.
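
To make the idea concrete outside the kernel, here is a minimal userspace
sketch of the same pattern. Everything in it is an assumption made for
illustration: struct vma, struct vm_ops, model_mmap_file(), model_close() and
the two hooks are hypothetical stand-ins, not the kernel's types or functions.
It only shows why swapping in an empty hook table on failure makes later,
generic teardown code harmless.

```c
/* Userspace model only: these types mimic, but are not, the kernel's. */
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct vma;

/* Stand-in for a VMA operations table: a set of optional hooks. */
struct vm_ops {
	void (*close)(struct vma *vma);
};

struct vma {
	const struct vm_ops *ops;
	int close_calls;	/* counts real close-hook invocations */
};

/* Empty hook table, playing the role of the dummy VMA operations. */
static const struct vm_ops dummy_vm_ops = { .close = NULL };

static void driver_close(struct vma *vma)
{
	vma->close_calls++;
}

static const struct vm_ops driver_vm_ops = { .close = driver_close };

/* Hypothetical driver mmap hook: installs its ops, then fails. */
static int failing_mmap_hook(struct vma *vma)
{
	vma->ops = &driver_vm_ops;
	return -ENOMEM;
}

/* Hypothetical driver mmap hook that succeeds. */
static int working_mmap_hook(struct vma *vma)
{
	vma->ops = &driver_vm_ops;
	return 0;
}

/* Same shape as the new helper: on error, neutralise the hooks. */
static int model_mmap_file(int (*hook)(struct vma *), struct vma *vma)
{
	int err = hook(vma);

	if (!err)
		return 0;

	/* The mapping is suspect; make every later hook lookup a no-op. */
	vma->ops = &dummy_vm_ops;
	return err;
}

/* Generic teardown: calls ->close() only if one is installed. */
static void model_close(struct vma *vma)
{
	if (vma->ops && vma->ops->close)
		vma->ops->close(vma);
}
```

In this model, calling model_close() after a failed model_mmap_file() never
reaches driver_close(), because the driver's table has already been replaced;
after a successful call the driver's hook runs as usual.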

All the existing invocations of call_mmap() outside of mm are ultimately
nested within the call_mmap() from mm, which we now replace.

It is therefore safe to leave call_mmap() in place as a convenience
function (and to avoid churn). The invokers are:

ovl_file_operations -> mmap -> ovl_mmap() -> backing_file_mmap()
coda_file_operations -> mmap -> coda_file_mmap()
shm_file_operations -> shm_mmap()
shm_file_operations_huge -> shm_mmap()
dma_buf_fops -> dma_buf_mmap_internal -> i915_dmabuf_ops
-> i915_gem_dmabuf_mmap()

None of these callers interact with vm_ops or mappings in a problematic
way on error, quickly exiting out.
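
The nesting argument above can be illustrated with another small userspace
sketch. It assumes a hypothetical stacked file whose mmap hook simply forwards
to a backing file; struct file, struct vma, model_call_mmap(),
model_mmap_file() and the hooks are all invented for this example and are not
the kernel's definitions. The point is that the inner forwarding call needs no
error handling of its own, because the error travels up to the single
outermost call site, which resets the hooks.

```c
/* Userspace model only: hypothetical types, not the kernel's. */
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct vma;
struct file;

struct vm_ops {
	void (*close)(struct vma *vma);
};

struct vma {
	const struct vm_ops *ops;
};

struct file {
	int (*mmap)(struct file *file, struct vma *vma);
	struct file *backing;	/* lower file for a stacked file, else NULL */
};

static const struct vm_ops dummy_vm_ops = { .close = NULL };

static void lower_close(struct vma *vma)
{
	(void)vma;
}

static const struct vm_ops lower_vm_ops = { .close = lower_close };

/* Plain forwarding helper: invokes the hook, no error handling at all. */
static int model_call_mmap(struct file *file, struct vma *vma)
{
	return file->mmap(file, vma);
}

/* Lower file's hook: installs its ops and then reports failure. */
static int lower_mmap(struct file *file, struct vma *vma)
{
	(void)file;
	vma->ops = &lower_vm_ops;
	return -EINVAL;
}

/* Stacked file's hook: forwards to the backing file, exits on error. */
static int stacked_mmap(struct file *file, struct vma *vma)
{
	return model_call_mmap(file->backing, vma);
}

/* Outermost entry point: the only place that resets the hooks. */
static int model_mmap_file(struct file *file, struct vma *vma)
{
	int err = model_call_mmap(file, vma);

	if (err)
		vma->ops = &dummy_vm_ops;
	return err;
}
```

Mapping the stacked file here fails with the lower file's error, and by the
time the outermost caller sees it the VMA already points at the empty table,
even though the lower hook had installed its own operations.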

Link: https://lkml.kernel.org/r/cover.1730224667.git.lorenzo.stoakes@oracle.com
Link: https://lkml.kernel.org/r/d41fd763496fd0048a962f3fd9407dc72dd4fd86.1730224667.git.lorenzo.stoakes@oracle.com
Fixes: deb0f6562884 ("mm/mmap: undo ->mmap() when arch_validate_flags() fails")
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reported-by: Jann Horn <jannh@google.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Jann Horn <jannh@google.com>
Cc: Andreas Larsson <andreas@gaisler.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Helge Deller <deller@gmx.de>
Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Peter Xu <peterx@redhat.com>
Cc: Will Deacon <will@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Authored by Lorenzo Stoakes and committed by Andrew Morton
3dd6ed34 f8f931bb

+32 -5
+27
mm/internal.h
@@ -108,6 +108,33 @@
 	return (void *)(mapping & ~PAGE_MAPPING_FLAGS);
 }
 
+/*
+ * This is a file-backed mapping, and is about to be memory mapped - invoke its
+ * mmap hook and safely handle error conditions. On error, VMA hooks will be
+ * mutated.
+ *
+ * @file: File which backs the mapping.
+ * @vma:  VMA which we are mapping.
+ *
+ * Returns: 0 if success, error otherwise.
+ */
+static inline int mmap_file(struct file *file, struct vm_area_struct *vma)
+{
+	int err = call_mmap(file, vma);
+
+	if (likely(!err))
+		return 0;
+
+	/*
+	 * OK, we tried to call the file hook for mmap(), but an error
+	 * arose. The mapping is in an inconsistent state and we must not invoke
+	 * any further hooks on it.
+	 */
+	vma->vm_ops = &vma_dummy_vm_ops;
+
+	return err;
+}
+
 #ifdef CONFIG_MMU
 
 /* Flags for folio_pte_batch(). */
+3 -3
mm/mmap.c
@@ -1422,7 +1422,7 @@
 		/*
 		 * clear PTEs while the vma is still in the tree so that rmap
 		 * cannot race with the freeing later in the truncate scenario.
-		 * This is also needed for call_mmap(), which is why vm_ops
+		 * This is also needed for mmap_file(), which is why vm_ops
 		 * close function is called.
 		 */
 		vms_clean_up_area(&vms, &mas_detach);
@@ -1447,7 +1447,7 @@
 
 	if (file) {
 		vma->vm_file = get_file(file);
-		error = call_mmap(file, vma);
+		error = mmap_file(file, vma);
 		if (error)
 			goto unmap_and_free_vma;
 
@@ -1470,7 +1470,7 @@
 
 		vma_iter_config(&vmi, addr, end);
 		/*
-		 * If vm_flags changed after call_mmap(), we should try merge
+		 * If vm_flags changed after mmap_file(), we should try merge
 		 * vma again as we may succeed this time.
 		 */
 		if (unlikely(vm_flags != vma->vm_flags && vmg.prev)) {
+2 -2
mm/nommu.c
@@ -885,7 +885,7 @@
 {
 	int ret;
 
-	ret = call_mmap(vma->vm_file, vma);
+	ret = mmap_file(vma->vm_file, vma);
 	if (ret == 0) {
 		vma->vm_region->vm_top = vma->vm_region->vm_end;
 		return 0;
@@ -918,7 +918,7 @@
 	 * happy.
 	 */
 	if (capabilities & NOMMU_MAP_DIRECT) {
-		ret = call_mmap(vma->vm_file, vma);
+		ret = mmap_file(vma->vm_file, vma);
 		/* shouldn't return success if we're not sharing */
 		if (WARN_ON_ONCE(!is_nommu_shared_mapping(vma->vm_flags)))
 			ret = -ENOSYS;