Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

mm: userfaultfd: correct dirty flags set for both present and swap pte

As David pointed out, what truly matters for mremap and userfaultfd move
operations is the soft dirty bit. The current comment and
implementation—which always sets the dirty bit for present PTEs and
fails to set the soft dirty bit for swap PTEs—are incorrect. This could
break features like Checkpoint-Restore in Userspace (CRIU).

This patch updates the behavior to correctly set the soft dirty bit for
both present and swap PTEs in accordance with mremap.
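
For context, userspace tools can observe the soft dirty bit through /proc/<pid>/pagemap, where the kernel's pagemap documentation places it at bit 55 of each 64-bit entry. The following is a minimal sketch of such a check, not CRIU's actual code; the helper names `pagemap_entry_soft_dirty` and `read_pagemap_entry` are hypothetical, and the bit is only meaningful on kernels built with CONFIG_MEM_SOFT_DIRTY.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L	/* for pread() under strict -std= modes */
#endif

#include <fcntl.h>
#include <stdbool.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Bit 55 of a pagemap entry reports the soft dirty state of the PTE. */
#define PM_SOFT_DIRTY	(1ULL << 55)

/* Hypothetical helper: decode the soft dirty bit from a raw pagemap entry. */
static bool pagemap_entry_soft_dirty(uint64_t entry)
{
	return (entry & PM_SOFT_DIRTY) != 0;
}

/*
 * Hypothetical helper: read the pagemap entry covering vaddr in the
 * calling process. Returns 0 on success and -1 on any failure.
 */
static int read_pagemap_entry(const void *vaddr, uint64_t *entry)
{
	long page_size = sysconf(_SC_PAGESIZE);
	off_t offset;
	ssize_t n;
	int fd;

	if (page_size <= 0)
		return -1;

	fd = open("/proc/self/pagemap", O_RDONLY);
	if (fd < 0)
		return -1;

	/* One 64-bit entry per virtual page, indexed by page frame number. */
	offset = (off_t)((uintptr_t)vaddr / (uintptr_t)page_size *
			 sizeof(uint64_t));
	n = pread(fd, entry, sizeof(*entry), offset);
	close(fd);

	return n == (ssize_t)sizeof(*entry) ? 0 : -1;
}
```

A tracker would typically reset the bits by writing "4" to /proc/<pid>/clear_refs, let the task run, and then call `read_pagemap_entry()` on each page of interest. Without this patch, a page moved by UFFDIO_MOVE could still read as clean in such a scan.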

Link: https://lkml.kernel.org/r/20250508220912.7275-1-21cnbao@gmail.com
Fixes: adef440691ba ("userfaultfd: UFFDIO_MOVE uABI")
Signed-off-by: Barry Song <v-songbaohua@oppo.com>
Reported-by: David Hildenbrand <david@redhat.com>
Closes: https://lore.kernel.org/linux-mm/02f14ee1-923f-47e3-a994-4950afb9afcc@redhat.com/
Acked-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Cc: Lokesh Gidra <lokeshgidra@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Authored by Barry Song and committed by Andrew Morton
75cb1cca 02f5bf89

+10 -2
mm/userfaultfd.c
···
 	src_folio->index = linear_page_index(dst_vma, dst_addr);

 	orig_dst_pte = mk_pte(&src_folio->page, dst_vma->vm_page_prot);
-	/* Follow mremap() behavior and treat the entry dirty after the move */
-	orig_dst_pte = pte_mkwrite(pte_mkdirty(orig_dst_pte), dst_vma);
+	/* Set soft dirty bit so userspace can notice the pte was moved */
+#ifdef CONFIG_MEM_SOFT_DIRTY
+	orig_dst_pte = pte_mksoft_dirty(orig_dst_pte);
+#endif
+	if (pte_dirty(orig_src_pte))
+		orig_dst_pte = pte_mkdirty(orig_dst_pte);
+	orig_dst_pte = pte_mkwrite(orig_dst_pte, dst_vma);

 	set_pte_at(mm, dst_addr, dst_pte, orig_dst_pte);
 out:
···
 	}

 	orig_src_pte = ptep_get_and_clear(mm, src_addr, src_pte);
+#ifdef CONFIG_MEM_SOFT_DIRTY
+	orig_src_pte = pte_swp_mksoft_dirty(orig_src_pte);
+#endif
 	set_pte_at(mm, dst_addr, dst_pte, orig_src_pte);
 	double_pt_unlock(dst_ptl, src_ptl);
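
The flag handling changed by this patch can be illustrated with a small userspace model. This is only a sketch under stated assumptions: the `toy_` types, flag bits, and functions are hypothetical and do not correspond to any architecture's real PTE layout, and the `soft_dirty_enabled` parameter stands in for CONFIG_MEM_SOFT_DIRTY.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits for a toy PTE; real layouts are arch-specific. */
enum {
	TOY_PTE_PRESENT    = 1u << 0,
	TOY_PTE_WRITE      = 1u << 1,
	TOY_PTE_DIRTY      = 1u << 2,
	TOY_PTE_SOFT_DIRTY = 1u << 3,
};

typedef uint32_t toy_pte_t;

/* Before the patch: a moved present PTE was always dirty, never soft dirty. */
static toy_pte_t toy_move_present_old(toy_pte_t src)
{
	assert(src & TOY_PTE_PRESENT);
	return TOY_PTE_PRESENT | TOY_PTE_DIRTY | TOY_PTE_WRITE;
}

/*
 * After the patch: soft dirty is set when the feature is enabled, and
 * the hardware dirty bit is carried over only if the source had it.
 */
static toy_pte_t toy_move_present_new(toy_pte_t src, bool soft_dirty_enabled)
{
	toy_pte_t dst = TOY_PTE_PRESENT;

	assert(src & TOY_PTE_PRESENT);
	if (soft_dirty_enabled)
		dst |= TOY_PTE_SOFT_DIRTY;
	if (src & TOY_PTE_DIRTY)
		dst |= TOY_PTE_DIRTY;
	return dst | TOY_PTE_WRITE;
}

/* After the patch: a moved swap PTE also gains the soft dirty bit. */
static toy_pte_t toy_move_swap_new(toy_pte_t src, bool soft_dirty_enabled)
{
	assert(!(src & TOY_PTE_PRESENT));
	return soft_dirty_enabled ? (src | TOY_PTE_SOFT_DIRTY) : src;
}
```

In this model a clean source page stays clean after the move but is reported as soft dirty, which matches the intent stated in the commit message.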