Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

mm/page_alloc: simplify __alloc_pages_slowpath() flow

The actions done before entering the main retry loop include waking up
kswapds and an allocation attempt with the precise alloc_flags. Then in
the loop we keep waking up kswapds, and we retry the allocation with flags
potentially further adjusted by being allowed to use reserves (due to e.g.
becoming an OOM killer victim).

We can adjust the retry loop to keep only one instance of waking up
kswapds and allocation attempt. Introduce the can_retry_reserves variable
for retrying once when we become eligible for reserves. It is still
useful not to evaluate reserve_flags immediately for the first allocation
attempt, because it's better to first try to succeed in a non-preferred zone
above the min watermark before allocating immediately from the preferred
zone below min watermark.
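
The retry-once idea described above can be sketched as a small user-space
toy. This is only an illustration under stated assumptions, not the kernel
code: toy_zone, toy_get_page() and toy_alloc_slowpath() are hypothetical
names, and only can_retry_reserves mirrors the variable the patch adds.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model of a zone: free pages and a min watermark. */
struct toy_zone {
	int free;	/* pages currently free */
	int min_wmark;	/* pages kept back unless reserves are allowed */
};

/* Take one page from the first zone that permits it; return its index or -1. */
static int toy_get_page(struct toy_zone *zones, int nr, bool use_reserves)
{
	for (int i = 0; i < nr; i++) {
		int floor = use_reserves ? 0 : zones[i].min_wmark;

		if (zones[i].free > floor) {
			zones[i].free--;
			return i;
		}
	}
	return -1;
}

/*
 * One allocation attempt per pass. Reserve eligibility is only applied after
 * the optimistic attempt has failed, and the first time it changes anything
 * we jump back to retry exactly once.
 */
static int toy_alloc_slowpath(struct toy_zone *zones, int nr,
			      bool reserves_eligible, int *attempts)
{
	bool use_reserves = false;
	bool can_retry_reserves = true;
	int zone;

	assert(zones && attempts);
	*attempts = 0;
retry:
	(*attempts)++;
	zone = toy_get_page(zones, nr, use_reserves);
	if (zone >= 0)
		return zone;

	if (reserves_eligible) {
		use_reserves = true;
		if (can_retry_reserves) {
			can_retry_reserves = false;
			goto retry;
		}
	}

	return -1;	/* the real slowpath would continue to reclaim/compaction */
}
```

In this toy, a caller that is eligible for reserves still gets its page from
a later zone above the min watermark when one exists, and only dips below
the watermark in the first zone on the single follow-up attempt.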

Additionally move the cpuset update checks introduced by e05741fb10c3
("mm/page_alloc.c: avoid infinite retries caused by cpuset race") further
down the retry loop. It's enough to do the checks only before reaching
any potentially infinite 'goto retry;' loop.
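
The placement rule for the checks can be sketched with another user-space
toy. All names here (toy_config, toy_state, toy_slowpath() and the cap of 16
attempts) are hypothetical stand-ins chosen for illustration; the sequence
counter only loosely models what check_retry_cpuset() and
check_retry_zonelist() compare against.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the cpuset/zonelist update counters. */
struct toy_config {
	unsigned int seq;	/* bumped on every configuration update */
};

struct toy_state {
	struct toy_config cfg;
	int fail_budget;	/* number of attempts that fail before one succeeds */
	int update_at;		/* attempt on which a racing update lands, 0 = never */
	int attempts;
	int restarts;
};

static bool toy_attempt(struct toy_state *st)
{
	st->attempts++;
	if (st->attempts == st->update_at)
		st->cfg.seq++;	/* simulate a concurrent configuration update */
	return st->attempts > st->fail_budget;
}

static bool toy_slowpath(struct toy_state *st)
{
	unsigned int cookie;
	bool one_shot_retry = true;

	assert(st);
restart:
	cookie = st->cfg.seq;
retry:
	if (toy_attempt(st))
		return true;

	/* A retry that can run only once may sit above the check. */
	if (one_shot_retry) {
		one_shot_retry = false;
		goto retry;
	}

	/* The check guards every retry below it that could repeat forever. */
	if (st->cfg.seq != cookie) {
		st->restarts++;
		goto restart;
	}

	/* Hypothetical cap so the toy terminates; the kernel's policy differs. */
	if (st->attempts < 16)
		goto retry;

	return false;
}
```

With a racing update injected on the second attempt, the toy restarts once
with a fresh cookie before it reaches the repeating retry, which is the
property the moved checks are meant to preserve.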

There should be no meaningful functional changes. Changing the exact
moments at which the retry for reserves and the cpuset updates are checked
should not result in different outcomes, modulo races with concurrent
allocator activity.

Link: https://lkml.kernel.org/r/20260106-thp-thisnode-tweak-v3-3-f5d67c21a193@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Joshua Hahn <joshua.hahnjy@gmail.com>
Cc: Brendan Jackman <jackmanb@google.com>
Cc: David Hildenbrand (Red Hat) <david@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Pedro Falcato <pfalcato@suse.de>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Authored by Vlastimil Babka and committed by Andrew Morton
2c4c3e29 53a9b464

+23 -18
mm/page_alloc.c
···
 	unsigned int zonelist_iter_cookie;
 	int reserve_flags;
 	bool compact_first = false;
+	bool can_retry_reserves = true;
 
 	if (unlikely(nofail)) {
 		/*
···
 			goto nopage;
 	}
 
+retry:
+	/* Ensure kswapd doesn't accidentally go to sleep as long as we loop */
 	if (alloc_flags & ALLOC_KSWAPD)
 		wake_all_kswapds(order, gfp_mask, ac);
 
···
 	page = get_page_from_freelist(gfp_mask, order, alloc_flags, ac);
 	if (page)
 		goto got_pg;
-
-retry:
-	/*
-	 * Deal with possible cpuset update races or zonelist updates to avoid
-	 * infinite retries.
-	 */
-	if (check_retry_cpuset(cpuset_mems_cookie, ac) ||
-	    check_retry_zonelist(zonelist_iter_cookie))
-		goto restart;
-
-	/* Ensure kswapd doesn't accidentally go to sleep as long as we loop */
-	if (alloc_flags & ALLOC_KSWAPD)
-		wake_all_kswapds(order, gfp_mask, ac);
 
 	reserve_flags = __gfp_pfmemalloc_flags(gfp_mask);
 	if (reserve_flags)
···
 		ac->nodemask = NULL;
 		ac->preferred_zoneref = first_zones_zonelist(ac->zonelist,
 					ac->highest_zoneidx, ac->nodemask);
-	}
 
-	/* Attempt with potentially adjusted zonelist and alloc_flags */
-	page = get_page_from_freelist(gfp_mask, order, alloc_flags, ac);
-	if (page)
-		goto got_pg;
+		/*
+		 * The first time we adjust anything due to being allowed to
+		 * ignore memory policies or watermarks, retry immediately. This
+		 * allows us to keep the first allocation attempt optimistic so
+		 * it can succeed in a zone that is still above watermarks.
+		 */
+		if (can_retry_reserves) {
+			can_retry_reserves = false;
+			goto retry;
+		}
+	}
 
 	/* Caller is not willing to reclaim, we can't balance anything */
 	if (!can_direct_reclaim)
···
 	if (costly_order && (!can_compact ||
 			     !(gfp_mask & __GFP_RETRY_MAYFAIL)))
 		goto nopage;
+
+	/*
+	 * Deal with possible cpuset update races or zonelist updates to avoid
+	 * infinite retries. No "goto retry;" can be placed above this check
+	 * unless it can execute just once.
+	 */
+	if (check_retry_cpuset(cpuset_mems_cookie, ac) ||
+	    check_retry_zonelist(zonelist_iter_cookie))
+		goto restart;
 
 	if (should_reclaim_retry(gfp_mask, order, ac, alloc_flags,
 				 did_some_progress > 0, &no_progress_loops))