Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

userfaultfd: retry copying with locks dropped in mfill_atomic_pte_copy()

The implementation of UFFDIO_COPY for anonymous memory might fail to copy
data from the userspace buffer when the destination VMA is locked (either
with mm_lock or with the per-VMA lock).

In that case, mfill_atomic() releases the locks, retries copying the data
with locks dropped and then re-locks the destination VMA and
re-establishes PMD.
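
The drop-locks, copy, re-lock sequence can be modelled in user space. This is a minimal sketch, not kernel code: struct model_state and every model_*() function are hypothetical stand-ins for struct mfill_state, mfill_put_vma(), mfill_get_vma() and the copy helpers. Re-establishing the PMD is folded into model_get_vma(), and the -ENOENT returned when revalidation fails is an assumption of the model.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096

/* Hypothetical stand-in for struct mfill_state; not the kernel's layout. */
struct model_state {
	bool locked;          /* models mmap_lock or the per-VMA lock being held */
	bool src_resident;    /* false: copying would need a page fault */
	bool vma_still_valid; /* false: the VMA went away while unlocked */
	const char *src;      /* models the userspace source buffer */
};

/* Models mfill_put_vma(): safe to call whether or not the lock is held. */
static void model_put_vma(struct model_state *st)
{
	st->locked = false;
}

/* Models mfill_get_vma() plus PMD re-establishment (assumed error code). */
static int model_get_vma(struct model_state *st)
{
	if (!st->vma_still_valid)
		return -ENOENT;
	st->locked = true;
	return 0;
}

/* Models the locked copy: it must not fault while the lock is held. */
static int model_copy_locked(struct model_state *st, char *dst)
{
	assert(st->locked);
	if (!st->src || !st->src_resident)
		return -EFAULT;
	memcpy(dst, st->src, MODEL_PAGE_SIZE);
	return 0;
}

/* Models the retry: drop the lock, copy (faults allowed), then re-lock. */
static int model_copy_retry(struct model_state *st, char *dst)
{
	model_put_vma(st);
	if (!st->src)
		return -EFAULT;
	memcpy(dst, st->src, MODEL_PAGE_SIZE);
	return model_get_vma(st);
}

/* Models the copy step of the operation: locked attempt, then retry. */
static int model_pte_copy(struct model_state *st, char *dst)
{
	int ret = model_copy_locked(st, dst);

	if (ret)
		ret = model_copy_retry(st, dst);
	return ret;
}
```

On success the model returns with the lock held again; on any failure the lock may or may not be held, which is why the caller's final model_put_vma() has to tolerate both cases.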

Since this retry-reget dance is only relevant for UFFDIO_COPY and it never
happens for other UFFDIO_ operations, make it a part of
mfill_atomic_pte_copy() that actually implements UFFDIO_COPY for anonymous
memory.

As a temporary safety measure to avoid breaking bisection,
mfill_atomic_pte_copy() makes sure never to return -ENOENT, so that the
loop in mfill_atomic() won't retry copying outside of mmap_lock. This
safeguard is removed later, when the shmem implementation is updated and
the loop in mfill_atomic() is adjusted.
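
The safeguard amounts to rewriting one error code on the failure path. A minimal sketch, assuming a caller that treats -ENOENT as "retry the copy outside the lock"; both helper names are hypothetical, since the commit open-codes the check under the out_release label:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of the caller's retry condition in mfill_atomic(). */
static bool model_caller_would_retry(int ret)
{
	return ret == -ENOENT;
}

/*
 * Hypothetical helper: map -ENOENT to -EFAULT so the caller's legacy
 * retry path is never triggered; every other value passes through.
 */
static int model_hide_retry_code(int ret)
{
	assert(ret <= 0); /* kernel-style: 0 on success, negative errno on error */
	return ret == -ENOENT ? -EFAULT : ret;
}
```

With this mapping, a failure that happens after the locks were dropped still reaches the caller as an error, but never as the one value that would restart the old copy-outside-the-lock loop.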

[akpm@linux-foundation.org: update mfill_copy_folio_retry()]
Link: https://lore.kernel.org/20260316173829.1126728-1-avagin@google.com
Link: https://lore.kernel.org/20260306171815.3160826-6-rppt@kernel.org
Link: https://lore.kernel.org/20260402041156.1377214-6-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Reviewed-by: Harry Yoo (Oracle) <harry@kernel.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: David Hildenbrand (Arm) <david@kernel.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: James Houghton <jthoughton@google.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Nikita Kalyazin <kalyazin@amazon.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: David Carlier <devnexen@gmail.com>
Cc: Harry Yoo <harry.yoo@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Authored by Mike Rapoport (Microsoft) and committed by Andrew Morton
f5f035a7 b8c03b7f

+51 -24
mm/userfaultfd.c
···
 	return ret;
 }
 
+static int mfill_copy_folio_retry(struct mfill_state *state, struct folio *folio)
+{
+	unsigned long src_addr = state->src_addr;
+	void *kaddr;
+	int err;
+
+	/* retry copying with mm_lock dropped */
+	mfill_put_vma(state);
+
+	kaddr = kmap_local_folio(folio, 0);
+	err = copy_from_user(kaddr, (const void __user *) src_addr, PAGE_SIZE);
+	kunmap_local(kaddr);
+	if (unlikely(err))
+		return -EFAULT;
+
+	flush_dcache_folio(folio);
+
+	/* reget VMA and PMD, they could change underneath us */
+	err = mfill_get_vma(state);
+	if (err)
+		return err;
+
+	err = mfill_establish_pmd(state);
+	if (err)
+		return err;
+
+	return 0;
+}
+
 static int mfill_atomic_pte_copy(struct mfill_state *state)
 {
-	struct vm_area_struct *dst_vma = state->vma;
 	unsigned long dst_addr = state->dst_addr;
 	unsigned long src_addr = state->src_addr;
 	uffd_flags_t flags = state->flags;
-	pmd_t *dst_pmd = state->pmd;
 	struct folio *folio;
 	int ret;
 
-	if (!state->folio) {
-		ret = -ENOMEM;
-		folio = vma_alloc_folio(GFP_HIGHUSER_MOVABLE, 0, dst_vma,
-					dst_addr);
-		if (!folio)
-			goto out;
+	folio = vma_alloc_folio(GFP_HIGHUSER_MOVABLE, 0, state->vma, dst_addr);
+	if (!folio)
+		return -ENOMEM;
 
-		ret = mfill_copy_folio_locked(folio, src_addr);
+	ret = -ENOMEM;
+	if (mem_cgroup_charge(folio, state->vma->vm_mm, GFP_KERNEL))
+		goto out_release;
 
-		/* fallback to copy_from_user outside mmap_lock */
-		if (unlikely(ret)) {
-			ret = -ENOENT;
-			state->folio = folio;
-			/* don't free the page */
-			goto out;
-		}
-	} else {
-		folio = state->folio;
-		state->folio = NULL;
+	ret = mfill_copy_folio_locked(folio, src_addr);
+	if (unlikely(ret)) {
+		/*
+		 * Fallback to copy_from_user outside mmap_lock.
+		 * If retry is successful, mfill_copy_folio_locked() returns
+		 * with locks retaken by mfill_get_vma().
+		 * If there was an error, we must mfill_put_vma() anyway and it
+		 * will take care of unlocking if needed.
+		 */
+		ret = mfill_copy_folio_retry(state, folio);
+		if (ret)
+			goto out_release;
 	}
 
 	/*
···
 	 */
 	__folio_mark_uptodate(folio);
 
-	ret = -ENOMEM;
-	if (mem_cgroup_charge(folio, dst_vma->vm_mm, GFP_KERNEL))
-		goto out_release;
-
-	ret = mfill_atomic_install_pte(dst_pmd, dst_vma, dst_addr,
+	ret = mfill_atomic_install_pte(state->pmd, state->vma, dst_addr,
 				       &folio->page, true, flags);
 	if (ret)
 		goto out_release;
 out:
 	return ret;
 out_release:
+	/* Don't return -ENOENT so that our caller won't retry */
+	if (ret == -ENOENT)
+		ret = -EFAULT;
 	folio_put(folio);
 	goto out;
 }