Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

mm: readahead: improve mmap_miss heuristic for concurrent faults

If two or more threads of an application fault on the same folio, the
mmap_miss counter can be decreased multiple times. This breaks the
mmap_miss heuristic and keeps readahead enabled even under extreme
levels of memory pressure.

This happens often when file folios backing a multi-threaded application
are evicted and re-faulted.

Fix it by not decreasing mmap_miss if the folio is locked.

This change was evaluated on several hundred thousand hosts in Google's
production over a couple of weeks. The number of containers stuck in a
vicious reclaim cycle for a long time was reduced several fold
(~10-20x), and the overall fleet-wide CPU time spent in direct memory
reclaim was meaningfully reduced. No regressions were observed.

Link: https://lkml.kernel.org/r/20250815183224.62007-1-roman.gushchin@linux.dev
Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: David Hildenbrand <david@redhat.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

authored by Roman Gushchin and committed by Andrew Morton
e338d835 19de1e5d

+11 -3
mm/filemap.c
@@ -3323,9 +3323,17 @@
 	if (vmf->vma->vm_flags & VM_RAND_READ || !ra->ra_pages)
 		return fpin;
 
-	mmap_miss = READ_ONCE(ra->mmap_miss);
-	if (mmap_miss)
-		WRITE_ONCE(ra->mmap_miss, --mmap_miss);
+	/*
+	 * If the folio is locked, we're likely racing against another fault.
+	 * Don't touch the mmap_miss counter to avoid decreasing it multiple
+	 * times for a single folio and break the balance with mmap_miss
+	 * increase in do_sync_mmap_readahead().
+	 */
+	if (likely(!folio_test_locked(folio))) {
+		mmap_miss = READ_ONCE(ra->mmap_miss);
+		if (mmap_miss)
+			WRITE_ONCE(ra->mmap_miss, --mmap_miss);
+	}
 
 	if (folio_test_readahead(folio)) {
 		fpin = maybe_unlock_mmap_for_io(vmf, fpin);
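
The accounting imbalance described in the commit message can be illustrated with a small userspace model. This is only a sketch under simplifying assumptions, not kernel code: the types `ra_model` and `folio_model` and the functions `sync_miss()`, `async_hit()` and `simulate()` are hypothetical names invented for the illustration, and the racing faults are replayed sequentially rather than on real threads. Each round models one evicted folio being re-faulted: one thread misses, bumps the counter and starts a read (folio locked), and the remaining racers find the locked folio and take the "hit" path.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the readahead state; only the counter is modelled. */
struct ra_model {
	unsigned int mmap_miss;
};

/* Hypothetical stand-in for a folio; only the lock bit is modelled. */
struct folio_model {
	bool locked;
};

/* Miss path: one increment per folio that has to be read in. */
static void sync_miss(struct ra_model *ra, struct folio_model *folio)
{
	ra->mmap_miss++;
	folio->locked = true;	/* read is in flight */
}

/*
 * Hit path: decrement the counter, never below zero.
 * With skip_if_locked set, a locked folio leaves the counter untouched,
 * which mirrors the guard added by the patch.
 */
static void async_hit(struct ra_model *ra, const struct folio_model *folio,
		      bool skip_if_locked)
{
	if (skip_if_locked && folio->locked)
		return;
	if (ra->mmap_miss)
		ra->mmap_miss--;
}

/*
 * Replay `rounds` re-faults, each with `racers` additional threads
 * hitting the folio while it is still locked. Returns the final counter.
 */
unsigned int simulate(unsigned int rounds, unsigned int racers,
		      bool skip_if_locked)
{
	struct ra_model ra = { .mmap_miss = 0 };

	for (unsigned int r = 0; r < rounds; r++) {
		struct folio_model folio = { .locked = false };

		sync_miss(&ra, &folio);
		for (unsigned int t = 0; t < racers; t++)
			async_hit(&ra, &folio, skip_if_locked);
		folio.locked = false;	/* read completes */
	}
	return ra.mmap_miss;
}
```

In this model, `simulate(100, 4, false)` ends with the counter at 0, because the racers cancel every miss, while `simulate(100, 4, true)` ends at 100, one per re-faulted folio. A miss counter that never grows cannot signal that readahead is unhelpful, which is the behaviour the commit message attributes to the unguarded decrement.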