Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
kernel os linux
1
fork

Configure Feed

Select the types of activity you want to include in your feed.

mm/vma: fix anon_vma UAF on mremap() faulted, unfaulted merge

Patch series "mm/vma: fix anon_vma UAF on mremap() faulted, unfaulted
merge", v2.

Commit 879bca0a2c4f ("mm/vma: fix incorrectly disallowed anonymous VMA
merges") introduced the ability to merge previously unavailable VMA merge
scenarios.

However, it is handling merges incorrectly when it comes to mremap() of a
faulted VMA adjacent to an unfaulted VMA. The issues arise in three
cases:

1. Previous VMA unfaulted:

copied -----|
v
|-----------|.............|
| unfaulted |(faulted VMA)|
|-----------|.............|
prev

2. Next VMA unfaulted:

copied -----|
v
|.............|-----------|
|(faulted VMA)| unfaulted |
|.............|-----------|
next

3. Both adjacent VMAs unfaulted:

copied -----|
v
|-----------|.............|-----------|
| unfaulted |(faulted VMA)| unfaulted |
|-----------|.............|-----------|
prev next

This series fixes each of these cases, and introduces self tests to assert
that the issues are corrected.

I also test a further case which was already handled, to assert that my
changes continues to correctly handle it:

4. prev unfaulted, next faulted:

copied -----|
v
|-----------|.............|-----------|
| unfaulted |(faulted VMA)| faulted |
|-----------|.............|-----------|
prev next

This bug was discovered via a syzbot report, linked to in the first patch
in the series, I confirmed that this series fixes the bug.

I also discovered that we are failing to check that the faulted VMA was
not forked when merging a copied VMA in cases 1-3 above, an issue this
series also addresses.

I also added self tests to assert that this is resolved (and confirmed
that the tests failed prior to this).

I also cleaned up vma_expand() as part of this work, renamed
vma_had_uncowed_parents() to vma_is_fork_child() as the previous name was
unduly confusing, and simplified the comments around this function.


This patch (of 4):

Commit 879bca0a2c4f ("mm/vma: fix incorrectly disallowed anonymous VMA
merges") introduced the ability to merge previously unavailable VMA merge
scenarios.

The key piece of logic introduced was the ability to merge a faulted VMA
immediately next to an unfaulted VMA, which relies upon dup_anon_vma() to
correctly handle anon_vma state.

In the case of the merge of an existing VMA (that is changing properties
of a VMA and then merging if those properties are shared by adjacent
VMAs), dup_anon_vma() is invoked correctly.

However in the case of the merge of a new VMA, a corner case peculiar to
mremap() was missed.

The issue is that vma_expand() only performs dup_anon_vma() if the target
(the VMA that will ultimately become the merged VMA): is not the next VMA,
i.e. the one that appears after the range in which the new VMA is to be
established.

A key insight here is that in all other cases other than mremap(), a new
VMA merge either expands an existing VMA, meaning that the target VMA will
be that VMA, or would have anon_vma be NULL.

Specifically:

* __mmap_region() - no anon_vma in place, initial mapping.
* do_brk_flags() - expanding an existing VMA.
* vma_merge_extend() - expanding an existing VMA.
* relocate_vma_down() - no anon_vma in place, initial mapping.

In addition, we are in the unique situation of needing to duplicate
anon_vma state from a VMA that is neither the previous or next VMA being
merged with.

dup_anon_vma() deals exclusively with the target=unfaulted, src=faulted
case. This leaves four possibilities, in each case where the copied VMA
is faulted:

1. Previous VMA unfaulted:

copied -----|
v
|-----------|.............|
| unfaulted |(faulted VMA)|
|-----------|.............|
prev

target = prev, expand prev to cover.

2. Next VMA unfaulted:

copied -----|
v
|.............|-----------|
|(faulted VMA)| unfaulted |
|.............|-----------|
next

target = next, expand next to cover.

3. Both adjacent VMAs unfaulted:

copied -----|
v
|-----------|.............|-----------|
| unfaulted |(faulted VMA)| unfaulted |
|-----------|.............|-----------|
prev next

target = prev, expand prev to cover.

4. prev unfaulted, next faulted:

copied -----|
v
|-----------|.............|-----------|
| unfaulted |(faulted VMA)| faulted |
|-----------|.............|-----------|
prev next

target = prev, expand prev to cover. Essentially equivalent to 3, but
with additional requirement that next's anon_vma is the same as the copied
VMA's. This is covered by the existing logic.

To account for this very explicitly, we introduce
vma_merge_copied_range(), which sets a newly introduced vmg->copied_from
field, then invokes vma_merge_new_range() which handles the rest of the
logic.

We then update the key vma_expand() function to clean up the logic and
make what's going on clearer, making the 'remove next' case less special,
before invoking dup_anon_vma() unconditionally should we be copying from a
VMA.

Note that in case 3, the if (remove_next) ... branch will be a no-op, as
next=src in this instance and src is unfaulted.

In case 4, it won't be, but since in this instance next=src and it is
faulted, this will have required tgt=faulted, src=faulted to be
compatible, meaning that next->anon_vma == vmg->copied_from->anon_vma, and
thus a single dup_anon_vma() of next suffices to copy anon_vma state for
the copied-from VMA also.

If we are copying from a VMA in a successful merge we must _always_
propagate anon_vma state.

This issue can be observed most directly by invoked mremap() to move
around a VMA and cause this kind of merge with the MREMAP_DONTUNMAP flag
specified.

This will result in unlink_anon_vmas() being called after failing to
duplicate anon_vma state to the target VMA, which results in the anon_vma
itself being freed with folios still possessing dangling pointers to the
anon_vma and thus a use-after-free bug.

This bug was discovered via a syzbot report, which this patch resolves.

We further make a change to update the mergeable anon_vma check to assert
the copied-from anon_vma did not have CoW parents, as otherwise
dup_anon_vma() might incorrectly propagate CoW ancestors from the next VMA
in case 4 despite the anon_vma's being identical for both VMAs.

Link: https://lkml.kernel.org/r/cover.1767638272.git.lorenzo.stoakes@oracle.com
Link: https://lkml.kernel.org/r/b7930ad2b1503a657e29fe928eb33061d7eadf5b.1767638272.git.lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Fixes: 879bca0a2c4f ("mm/vma: fix incorrectly disallowed anonymous VMA merges")
Reported-by: syzbot+b165fc2e11771c66d8ba@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/694a2745.050a0220.19928e.0017.GAE@google.com/
Reported-by: syzbot+5272541ccbbb14e2ec30@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/694e3dc6.050a0220.35954c.0066.GAE@google.com/
Reviewed-by: Harry Yoo <harry.yoo@oracle.com>
Reviewed-by: Jeongjun Park <aha310510@gmail.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: David Hildenbrand (Red Hat) <david@kernel.org>
Cc: Jann Horn <jannh@google.com>
Cc: Yeoreum Yun <yeoreum.yun@arm.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com>
Cc: Pedro Falcato <pfalcato@suse.de>
Cc: Rik van Riel <riel@surriel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

authored by

Lorenzo Stoakes and committed by
Andrew Morton
61f67c23 590b1366

+62 -25
+59 -25
mm/vma.c
··· 829 829 VM_WARN_ON_VMG(middle && 830 830 !(vma_iter_addr(vmg->vmi) >= middle->vm_start && 831 831 vma_iter_addr(vmg->vmi) < middle->vm_end), vmg); 832 + /* An existing merge can never be used by the mremap() logic. */ 833 + VM_WARN_ON_VMG(vmg->copied_from, vmg); 832 834 833 835 vmg->state = VMA_MERGE_NOMERGE; 834 836 ··· 1101 1099 } 1102 1100 1103 1101 /* 1102 + * vma_merge_copied_range - Attempt to merge a VMA that is being copied by 1103 + * mremap() 1104 + * 1105 + * @vmg: Describes the VMA we are adding, in the copied-to range @vmg->start to 1106 + * @vmg->end (exclusive), which we try to merge with any adjacent VMAs if 1107 + * possible. 1108 + * 1109 + * vmg->prev, next, start, end, pgoff should all be relative to the COPIED TO 1110 + * range, i.e. the target range for the VMA. 1111 + * 1112 + * Returns: In instances where no merge was possible, NULL. Otherwise, a pointer 1113 + * to the VMA we expanded. 1114 + * 1115 + * ASSUMPTIONS: Same as vma_merge_new_range(), except vmg->middle must contain 1116 + * the copied-from VMA. 1117 + */ 1118 + static struct vm_area_struct *vma_merge_copied_range(struct vma_merge_struct *vmg) 1119 + { 1120 + /* We must have a copied-from VMA. */ 1121 + VM_WARN_ON_VMG(!vmg->middle, vmg); 1122 + 1123 + vmg->copied_from = vmg->middle; 1124 + vmg->middle = NULL; 1125 + return vma_merge_new_range(vmg); 1126 + } 1127 + 1128 + /* 1104 1129 * vma_expand - Expand an existing VMA 1105 1130 * 1106 1131 * @vmg: Describes a VMA expansion operation. ··· 1146 1117 int vma_expand(struct vma_merge_struct *vmg) 1147 1118 { 1148 1119 struct vm_area_struct *anon_dup = NULL; 1149 - bool remove_next = false; 1150 1120 struct vm_area_struct *target = vmg->target; 1151 1121 struct vm_area_struct *next = vmg->next; 1122 + bool remove_next = false; 1152 1123 vm_flags_t sticky_flags; 1153 - 1154 - sticky_flags = vmg->vm_flags & VM_STICKY; 1155 - sticky_flags |= target->vm_flags & VM_STICKY; 1156 - 1157 - VM_WARN_ON_VMG(!target, vmg); 1124 + int ret = 0; 1158 1125 1159 1126 mmap_assert_write_locked(vmg->mm); 1160 - 1161 1127 vma_start_write(target); 1162 - if (next && (target != next) && (vmg->end == next->vm_end)) { 1163 - int ret; 1164 1128 1165 - sticky_flags |= next->vm_flags & VM_STICKY; 1129 + if (next && target != next && vmg->end == next->vm_end) 1166 1130 remove_next = true; 1167 - /* This should already have been checked by this point. */ 1168 - VM_WARN_ON_VMG(!can_merge_remove_vma(next), vmg); 1169 - vma_start_write(next); 1170 - /* 1171 - * In this case we don't report OOM, so vmg->give_up_on_mm is 1172 - * safe. 1173 - */ 1174 - ret = dup_anon_vma(target, next, &anon_dup); 1175 - if (ret) 1176 - return ret; 1177 - } 1178 1131 1132 + /* We must have a target. */ 1133 + VM_WARN_ON_VMG(!target, vmg); 1134 + /* This should have already been checked by this point. */ 1135 + VM_WARN_ON_VMG(remove_next && !can_merge_remove_vma(next), vmg); 1179 1136 /* Not merging but overwriting any part of next is not handled. */ 1180 1137 VM_WARN_ON_VMG(next && !remove_next && 1181 1138 next != target && vmg->end > next->vm_start, vmg); 1182 - /* Only handles expanding */ 1139 + /* Only handles expanding. */ 1183 1140 VM_WARN_ON_VMG(target->vm_start < vmg->start || 1184 1141 target->vm_end > vmg->end, vmg); 1185 1142 1143 + sticky_flags = vmg->vm_flags & VM_STICKY; 1144 + sticky_flags |= target->vm_flags & VM_STICKY; 1186 1145 if (remove_next) 1187 - vmg->__remove_next = true; 1146 + sticky_flags |= next->vm_flags & VM_STICKY; 1188 1147 1148 + /* 1149 + * If we are removing the next VMA or copying from a VMA 1150 + * (e.g. mremap()'ing), we must propagate anon_vma state. 1151 + * 1152 + * Note that, by convention, callers ignore OOM for this case, so 1153 + * we don't need to account for vmg->give_up_on_mm here. 1154 + */ 1155 + if (remove_next) 1156 + ret = dup_anon_vma(target, next, &anon_dup); 1157 + if (!ret && vmg->copied_from) 1158 + ret = dup_anon_vma(target, vmg->copied_from, &anon_dup); 1159 + if (ret) 1160 + return ret; 1161 + 1162 + if (remove_next) { 1163 + vma_start_write(next); 1164 + vmg->__remove_next = true; 1165 + } 1189 1166 if (commit_merge(vmg)) 1190 1167 goto nomem; 1191 1168 ··· 1863 1828 if (new_vma && new_vma->vm_start < addr + len) 1864 1829 return NULL; /* should never get here */ 1865 1830 1866 - vmg.middle = NULL; /* New VMA range. */ 1867 1831 vmg.pgoff = pgoff; 1868 1832 vmg.next = vma_iter_next_rewind(&vmi, NULL); 1869 - new_vma = vma_merge_new_range(&vmg); 1833 + new_vma = vma_merge_copied_range(&vmg); 1870 1834 1871 1835 if (new_vma) { 1872 1836 /*
+3
mm/vma.h
··· 106 106 struct anon_vma_name *anon_name; 107 107 enum vma_merge_state state; 108 108 109 + /* If copied from (i.e. mremap()'d) the VMA from which we are copying. */ 110 + struct vm_area_struct *copied_from; 111 + 109 112 /* Flags which callers can use to modify merge behaviour: */ 110 113 111 114 /*