Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

mm: memfd_luo: always dirty all folios

A dirty folio is one which has been written to. A clean folio is its
opposite. Since a clean folio has no user data, it can be freed under
memory pressure.

memfd preservation with LUO saves the flag at preserve(). This is
problematic. The folio might get dirtied later. Saving it at freeze()
also doesn't work, since the dirty bit from PTE is normally synced at
unmap and there might still be mappings of the file at freeze().

To see why this is a problem, say a folio is clean at preserve, but gets
dirtied later. The serialized state of the folio will mark it as clean.
After retrieve, the next kernel will see the folio as clean and might try
to reclaim it under memory pressure. This will result in losing user
data.
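
The failure mode described above can be sketched as a small user-space toy model. Everything in it is hypothetical: the `toy_*` types and helpers are made up for illustration and are not the kernel's folio or LUO API. It only shows that a dirty flag recorded at preserve time goes stale once the folio is written afterwards, while recording "dirty" unconditionally does not.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: hypothetical types, not kernel structures. */
struct toy_folio {
	bool dirty;    /* dirty state in the running (old) kernel */
	bool has_data; /* whether user data has been written */
};

struct toy_ser {
	bool dirty;    /* dirty flag as recorded in the serialized state */
};

/* Old behaviour: record whatever the dirty state is at preserve time. */
static void toy_preserve_snapshot(const struct toy_folio *f, struct toy_ser *s)
{
	s->dirty = f->dirty;
}

/* New behaviour: mark the folio dirty and always record it as dirty. */
static void toy_preserve_always_dirty(struct toy_folio *f, struct toy_ser *s)
{
	f->dirty = true;
	s->dirty = true;
}

/* A write that happens after preserve; the serialized flag is not updated. */
static void toy_write(struct toy_folio *f)
{
	f->dirty = true;
	f->has_data = true;
}

/*
 * In the next kernel, reclaim may free any folio recorded as clean.
 * Returns true if doing so would throw away user data.
 */
static bool toy_reclaim_loses_data(const struct toy_folio *f,
				   const struct toy_ser *s)
{
	return !s->dirty && f->has_data;
}
```

Preserving a clean folio with `toy_preserve_snapshot()`, then calling `toy_write()`, leaves `toy_reclaim_loses_data()` returning true. The same sequence with `toy_preserve_always_dirty()` returns false.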

Mark all folios of the file as dirty, and always set the
MEMFD_LUO_FOLIO_DIRTY flag. This comes with the side effect of making all
clean folios un-reclaimable. This is a cost that has to be paid for
participants of live update. It is not expected to be a common use case
to preserve a lot of clean folios anyway.
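
The cost mentioned above can be illustrated with another hypothetical toy model (again, the `toy_*` names are assumptions for illustration, not kernel code): once every folio of the file is marked dirty, none of the previously clean folios remain candidates for reclaim.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: a folio reduced to its dirty bit. */
struct toy_folio {
	bool dirty;
};

/* Count folios that reclaim could free, i.e. the clean ones. */
static size_t toy_count_reclaimable(const struct toy_folio *folios, size_t n)
{
	size_t count = 0;

	for (size_t i = 0; i < n; i++) {
		if (!folios[i].dirty)
			count++;
	}
	return count;
}

/* Mirror of the fix: unconditionally mark every folio dirty. */
static void toy_mark_all_dirty(struct toy_folio *folios, size_t n)
{
	for (size_t i = 0; i < n; i++)
		folios[i].dirty = true;
}
```

For a file with one dirty and three clean folios, `toy_count_reclaimable()` drops from 3 to 0 after `toy_mark_all_dirty()`. That drop is the trade-off the commit accepts.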

Since the value of pfolio->flags is a constant now, drop the flags
variable and set it directly.

Link: https://lkml.kernel.org/r/20260223173931.2221759-3-pratyush@kernel.org
Fixes: b3749f174d68 ("mm: memfd_luo: allow preserving memfd")
Signed-off-by: Pratyush Yadav (Google) <pratyush@kernel.org>
Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Authored by Pratyush Yadav (Google) and committed by Andrew Morton
7e04bf1f 50d7b433

+21 -5
mm/memfd_luo.c
···
 	for (i = 0; i < nr_folios; i++) {
 		struct memfd_luo_folio_ser *pfolio = &folios_ser[i];
 		struct folio *folio = folios[i];
-		unsigned int flags = 0;
 
 		err = kho_preserve_folio(folio);
 		if (err)
···
 
 		folio_lock(folio);
 
-		if (folio_test_dirty(folio))
-			flags |= MEMFD_LUO_FOLIO_DIRTY;
+		/*
+		 * A dirty folio is one which has been written to. A clean folio
+		 * is its opposite. Since a clean folio does not carry user
+		 * data, it can be freed by page reclaim under memory pressure.
+		 *
+		 * Saving the dirty flag at prepare() time doesn't work since it
+		 * can change later. Saving it at freeze() also won't work
+		 * because the dirty bit is normally synced at unmap and there
+		 * might still be a mapping of the file at freeze().
+		 *
+		 * To see why this is a problem, say a folio is clean at
+		 * preserve, but gets dirtied later. The pfolio flags will mark
+		 * it as clean. After retrieve, the next kernel might try to
+		 * reclaim this folio under memory pressure, losing user data.
+		 *
+		 * Unconditionally mark it dirty to avoid this problem. This
+		 * comes at the cost of making clean folios un-reclaimable after
+		 * live update.
+		 */
+		folio_mark_dirty(folio);
 
 		/*
 		 * If the folio is not uptodate, it was fallocated but never
···
 			flush_dcache_folio(folio);
 			folio_mark_uptodate(folio);
 		}
-		flags |= MEMFD_LUO_FOLIO_UPTODATE;
 
 		folio_unlock(folio);
 
 		pfolio->pfn = folio_pfn(folio);
-		pfolio->flags = flags;
+		pfolio->flags = MEMFD_LUO_FOLIO_DIRTY | MEMFD_LUO_FOLIO_UPTODATE;
 		pfolio->index = folio->index;
 	}