Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
kernel os linux
1
fork

Configure Feed

Select the types of activity you want to include in your feed.

mm: avoid deadlock when holding rmap on mmap_prepare error

Commit ac0a3fc9c07d ("mm: add ability to take further action in
vm_area_desc") added the ability for drivers to instruct mm to take actions
after the .mmap_prepare callback is complete.

To make life simpler and safer, this is done before the VMA/mmap write lock
is dropped but when the VMA is completely established.

So on error, we simply munmap() the VMA.

As part of this implementation, unfortunately a horrible hack had to be
implemented to support some questionable behaviour hugetlb relies upon -
that is that the file rmap lock is held until the operation is complete.

The implementation, for convenience, did this in mmap_action_finish() so
both the VMA and mmap_prepare compatibility layer paths would have this
correctly handled.

However, it turns out there is a mistake here - the rmap lock cannot be
held on munmap, as free_pgtables() -> unlink_file_vma_batch_add() ->
unlink_file_vma_batch_process() takes the file rmap lock.

We therefore currently have a deadlock issue that might arise.

Resolve this by leaving it to callers to handle the unmap.

The compatibility layer does not support this rmap behaviour, so we simply
have it unmap on error after calling mmap_action_complete().

In the VMA implementation, we only perform the unmap after the rmap lock is
dropped.

This resolves the issue by ensuring the rmap lock is always dropped when
the unmap occurs.

Link: https://lkml.kernel.org/r/d44248be9da68258b07c2c59d4e73485ee0ca943.1774045440.git.ljs@kernel.org
Fixes: ac0a3fc9c07d ("mm: add ability to take further action in vm_area_desc")
Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Bodo Stroesser <bostroesser@gmail.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Clemens Ladisch <clemens@ladisch.de>
Cc: David Hildenbrand <david@kernel.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Dexuan Cui <decui@microsoft.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jann Horn <jannh@google.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: K. Y. Srinivasan <kys@microsoft.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Long Li <longli@microsoft.com>
Cc: Marc Dionne <marc.dionne@auristor.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Miquel Raynal <miquel.raynal@bootlin.com>
Cc: Pedro Falcato <pfalcato@suse.de>
Cc: Richard Weinberger <richard@nod.at>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vignesh Raghavendra <vigneshr@ti.com>
Cc: Vlastimil Babka (SUSE) <vbabka@kernel.org>
Cc: Wei Liu <wei.liu@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

authored by

Lorenzo Stoakes (Oracle) and committed by
Andrew Morton
f96e1d5f 827e97cf

+17 -8
+7 -5
mm/util.c
··· 1215 1215 return err; 1216 1216 1217 1217 set_vma_from_desc(vma, &desc); 1218 - return mmap_action_complete(vma, &desc.action); 1218 + err = mmap_action_complete(vma, &desc.action); 1219 + if (err) { 1220 + const size_t len = vma_pages(vma) << PAGE_SHIFT; 1221 + 1222 + do_munmap(current->mm, vma->vm_start, len, NULL); 1223 + } 1224 + return err; 1219 1225 } 1220 1226 EXPORT_SYMBOL(compat_vma_mmap); 1221 1227 ··· 1322 1316 * invoked if we do NOT merge, so we only clean up the VMA we created. 1323 1317 */ 1324 1318 if (err) { 1325 - const size_t len = vma_pages(vma) << PAGE_SHIFT; 1326 - 1327 - do_munmap(current->mm, vma->vm_start, len, NULL); 1328 - 1329 1319 if (action->error_hook) { 1330 1320 /* We may want to filter the error. */ 1331 1321 err = action->error_hook(err);
+10 -3
mm/vma.c
··· 2735 2735 struct mmap_action *action, 2736 2736 struct vm_area_struct *vma) 2737 2737 { 2738 - int ret; 2738 + int err; 2739 2739 2740 - ret = mmap_action_complete(vma, action); 2740 + err = mmap_action_complete(vma, action); 2741 2741 2742 2742 /* If we held the file rmap we need to release it. */ 2743 2743 if (map->hold_file_rmap_lock) { ··· 2745 2745 2746 2746 i_mmap_unlock_write(file->f_mapping); 2747 2747 } 2748 - return ret; 2748 + 2749 + if (err) { 2750 + const size_t len = vma_pages(vma) << PAGE_SHIFT; 2751 + 2752 + do_munmap(current->mm, vma->vm_start, len, NULL); 2753 + } 2754 + 2755 + return err; 2749 2756 } 2750 2757 2751 2758 static unsigned long __mmap_region(struct file *file, unsigned long addr,