Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

mm/page_alloc: ignore the exact initial compaction result

Patch series "tweaks for __alloc_pages_slowpath()", v3.


This patch (of 3):

For allocations that are of costly order and __GFP_NORETRY (and can
perform compaction) we attempt direct compaction first. If that fails, we
continue with a single round of direct reclaim+compaction (as for other
__GFP_NORETRY allocations, except the compaction is of lower priority),
with two exceptions that fail immediately:

- __GFP_THISNODE is specified, to prevent zone_reclaim_mode-like
behavior for e.g. THP page faults

- compaction failed because it was deferred (i.e. has been failing
recently so further attempts are not done for a while) or skipped,
which means there are insufficient free base pages to defragment to
begin with
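
The decision described above can be sketched as a small standalone C model. This is an illustration only, not the kernel code: the enum, its constants, and the function `old_fails_immediately()` are hypothetical names, and the booleans stand in for the `costly_order`, `__GFP_NORETRY` and `__GFP_THISNODE` checks.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the outcome of the initial direct compaction. */
enum sketch_compact_result {
	SKETCH_COMPACT_SKIPPED,		/* too few free base pages to try */
	SKETCH_COMPACT_DEFERRED,	/* recent failures, attempts postponed */
	SKETCH_COMPACT_RAN_AND_FAILED,	/* compaction ran but found nothing */
};

/*
 * Model of the behavior before this patch: returns true when the
 * allocation fails right after the initial compaction attempt, and
 * false when it continues to one round of direct reclaim plus
 * lower-priority compaction.
 */
static bool old_fails_immediately(bool costly_order, bool noretry,
				  bool thisnode,
				  enum sketch_compact_result result)
{
	/* The special handling only applies to costly __GFP_NORETRY requests. */
	if (!(costly_order && noretry))
		return false;

	/* Second exception: compaction was skipped or deferred. */
	if (result == SKETCH_COMPACT_SKIPPED ||
	    result == SKETCH_COMPACT_DEFERRED)
		return true;

	/* First exception: do not put reclaim pressure on a single node. */
	if (thisnode)
		return true;

	return false;
}
```

In this model, a costly `__GFP_NORETRY` request without `__GFP_THISNODE` fails immediately when the result is skipped or deferred, but proceeds to reclaim when compaction ran and failed.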

Upon closer inspection, the reasoning behind the second condition is
somewhat flawed. If there are not enough base pages and reclaim could
create them, we instead fail. When there are enough base pages and
compaction has already run and failed, we proceed and hope that reclaim
and the subsequent compaction attempt will succeed. But it's unclear why
they should and whether it will be as inexpensive as intended.

It might therefore make more sense to just fail unconditionally after the
initial compaction attempt. However, that would change the semantics of
__GFP_NORETRY, which is to attempt reclaim at least once.

Alternatively we can remove the compaction result checks and proceed with
the single reclaim and (lower priority) compaction attempt, leaving only
the __GFP_THISNODE exception for failing immediately.
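
As a rough sketch of the resulting behavior, the check reduces to the model below. It is a standalone illustration, not the kernel code: `new_fails_immediately()` is a hypothetical name, and the booleans stand in for the `costly_order`, `__GFP_NORETRY` and `__GFP_THISNODE` checks.

```c
#include <stdbool.h>

/*
 * Model of the behavior after this patch: returns true when the
 * allocation fails right after the initial compaction attempt, and
 * false when it continues to one round of direct reclaim plus
 * lower-priority compaction. The compaction result is no longer an
 * input to the decision.
 */
static bool new_fails_immediately(bool costly_order, bool noretry,
				  bool thisnode)
{
	/* The special handling only applies to costly __GFP_NORETRY requests. */
	if (!(costly_order && noretry))
		return false;

	/* The only remaining exception: __GFP_THISNODE fails immediately. */
	return thisnode;
}
```

Because the compaction result is no longer a parameter, a skipped or deferred initial compaction is treated the same as one that ran and failed.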

Link: https://lkml.kernel.org/r/20260106-thp-thisnode-tweak-v3-0-f5d67c21a193@suse.cz
Link: https://lkml.kernel.org/r/20260106-thp-thisnode-tweak-v3-1-f5d67c21a193@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Brendan Jackman <jackmanb@google.com>
Cc: David Hildenbrand (Red Hat) <david@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Joshua Hahn <joshua.hahnjy@gmail.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Pedro Falcato <pfalcato@suse.de>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Authored by Vlastimil Babka and committed by Andrew Morton
66987218 d17f0241

+6 -28
mm/page_alloc.c
@@ -4798,43 +4798,21 @@
 		 */
 		if (costly_order && (gfp_mask & __GFP_NORETRY)) {
 			/*
-			 * If allocating entire pageblock(s) and compaction
-			 * failed because all zones are below low watermarks
-			 * or is prohibited because it recently failed at this
-			 * order, fail immediately unless the allocator has
-			 * requested compaction and reclaim retry.
-			 *
-			 * Reclaim is
-			 *  - potentially very expensive because zones are far
-			 *    below their low watermarks or this is part of very
-			 *    bursty high order allocations,
-			 *  - not guaranteed to help because isolate_freepages()
-			 *    may not iterate over freed pages as part of its
-			 *    linear scan, and
-			 *  - unlikely to make entire pageblocks free on its
-			 *    own.
-			 */
-			if (compact_result == COMPACT_SKIPPED ||
-			    compact_result == COMPACT_DEFERRED)
-				goto nopage;
-
-			/*
 			 * THP page faults may attempt local node only first,
 			 * but are then allowed to only compact, not reclaim,
 			 * see alloc_pages_mpol().
 			 *
-			 * Compaction can fail for other reasons than those
-			 * checked above and we don't want such THP allocations
-			 * to put reclaim pressure on a single node in a
-			 * situation where other nodes might have plenty of
-			 * available memory.
+			 * Compaction has failed above and we don't want such
+			 * THP allocations to put reclaim pressure on a single
+			 * node in a situation where other nodes might have
+			 * plenty of available memory.
 			 */
 			if (gfp_mask & __GFP_THISNODE)
 				goto nopage;
 
 			/*
-			 * Looks like reclaim/compaction is worth trying, but
-			 * sync compaction could be very expensive, so keep
+			 * Proceed with single round of reclaim/compaction, but
+			 * since sync compaction could be very expensive, keep
 			 * using async compaction.
 			 */
 			compact_priority = INIT_COMPACT_PRIORITY;