Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

mm: page_isolation: introduce page_is_unmovable()

Patch series "mm: accelerate gigantic folio allocation".

Optimize pfn_range_valid_contig() and replace_free_hugepage_folios() in
alloc_contig_frozen_pages() to speed up gigantic folio allocation. The
allocation time for 120*1G folios drops from 3.605s to 0.431s.


This patch (of 5):

Factor out the check of whether a page is unmovable into a new helper,
which will be reused in a following patch.

No functional change intended; the minor differences are as follows:
1) Avoid unnecessary calls by checking CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION
2) Directly call PageCompound(), since PageTransCompound() may be dropped
3) Use folio_test_hugetlb()

Link: https://lkml.kernel.org/r/20260112150954.1802953-1-wangkefeng.wang@huawei.com
Link: https://lkml.kernel.org/r/20260112150954.1802953-2-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: Brendan Jackman <jackmanb@google.com>
Cc: David Hildenbrand <david@kernel.org>
Cc: Jane Chu <jane.chu@oracle.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Authored by Kefeng Wang, committed by Andrew Morton
c83109e9 fde83531

2 files changed: 101 insertions(+), 88 deletions(-)

include/linux/page-isolation.h (+2)
@@ -67,4 +67,6 @@

 int test_pages_isolated(unsigned long start_pfn, unsigned long end_pfn,
 		enum pb_isolate_mode mode);
+bool page_is_unmovable(struct zone *zone, struct page *page,
+		enum pb_isolate_mode mode, unsigned long *step);
 #endif
mm/page_isolation.c (+99, -88)
@@ -15,6 +15,100 @@
 #define CREATE_TRACE_POINTS
 #include <trace/events/page_isolation.h>

+bool page_is_unmovable(struct zone *zone, struct page *page,
+		       enum pb_isolate_mode mode, unsigned long *step)
+{
+	/*
+	 * Both, bootmem allocations and memory holes are marked
+	 * PG_reserved and are unmovable. We can even have unmovable
+	 * allocations inside ZONE_MOVABLE, for example when
+	 * specifying "movablecore".
+	 */
+	if (PageReserved(page))
+		return true;
+
+	/*
+	 * If the zone is movable and we have ruled out all reserved
+	 * pages then it should be reasonably safe to assume the rest
+	 * is movable.
+	 */
+	if (zone_idx(zone) == ZONE_MOVABLE)
+		return false;
+
+	/*
+	 * Hugepages are not in LRU lists, but they're movable.
+	 * THPs are on the LRU, but need to be counted as #small pages.
+	 * We need not scan over tail pages because we don't
+	 * handle each tail page individually in migration.
+	 */
+	if (PageHuge(page) || PageCompound(page)) {
+		struct folio *folio = page_folio(page);
+
+		if (folio_test_hugetlb(folio)) {
+			struct hstate *h;
+
+			if (!IS_ENABLED(CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION))
+				return true;
+
+			/*
+			 * The huge page may be freed so can not
+			 * use folio_hstate() directly.
+			 */
+			h = size_to_hstate(folio_size(folio));
+			if (h && !hugepage_migration_supported(h))
+				return true;
+
+		} else if (!folio_test_lru(folio)) {
+			return true;
+		}
+
+		*step = folio_nr_pages(folio) - folio_page_idx(folio, page);
+		return false;
+	}
+
+	/*
+	 * We can't use page_count without pin a page
+	 * because another CPU can free compound page.
+	 * This check already skips compound tails of THP
+	 * because their page->_refcount is zero at all time.
+	 */
+	if (!page_ref_count(page)) {
+		if (PageBuddy(page))
+			*step = (1 << buddy_order(page));
+		return false;
+	}
+
+	/*
+	 * The HWPoisoned page may be not in buddy system, and
+	 * page_count() is not 0.
+	 */
+	if ((mode == PB_ISOLATE_MODE_MEM_OFFLINE) && PageHWPoison(page))
+		return false;
+
+	/*
+	 * We treat all PageOffline() pages as movable when offlining
+	 * to give drivers a chance to decrement their reference count
+	 * in MEM_GOING_OFFLINE in order to indicate that these pages
+	 * can be offlined as there are no direct references anymore.
+	 * For actually unmovable PageOffline() where the driver does
+	 * not support this, we will fail later when trying to actually
+	 * move these pages that still have a reference count > 0.
+	 * (false negatives in this function only)
+	 */
+	if ((mode == PB_ISOLATE_MODE_MEM_OFFLINE) && PageOffline(page))
+		return false;
+
+	if (PageLRU(page) || page_has_movable_ops(page))
+		return false;
+
+	/*
+	 * If there are RECLAIMABLE pages, we need to check
+	 * it. But now, memory offline itself doesn't call
+	 * shrink_node_slabs() and it still to be fixed.
+	 */
+	return true;
+}
+
 /*
  * This function checks whether the range [start_pfn, end_pfn) includes
  * unmovable pages or not. The range must fall into a single pageblock and
@@ ... @@
 {
 	struct page *page = pfn_to_page(start_pfn);
 	struct zone *zone = page_zone(page);
-	unsigned long pfn;

 	VM_BUG_ON(pageblock_start_pfn(start_pfn) !=
 		  pageblock_start_pfn(end_pfn - 1));
@@ ... @@
 		return page;
 	}

-	for (pfn = start_pfn; pfn < end_pfn; pfn++) {
-		page = pfn_to_page(pfn);
+	while (start_pfn < end_pfn) {
+		unsigned long step = 1;

-		/*
-		 * Both, bootmem allocations and memory holes are marked
-		 * PG_reserved and are unmovable. We can even have unmovable
-		 * allocations inside ZONE_MOVABLE, for example when
-		 * specifying "movablecore".
-		 */
-		if (PageReserved(page))
+		page = pfn_to_page(start_pfn);
+		if (page_is_unmovable(zone, page, mode, &step))
 			return page;

-		/*
-		 * If the zone is movable and we have ruled out all reserved
-		 * pages then it should be reasonably safe to assume the rest
-		 * is movable.
-		 */
-		if (zone_idx(zone) == ZONE_MOVABLE)
-			continue;
-
-		/*
-		 * Hugepages are not in LRU lists, but they're movable.
-		 * THPs are on the LRU, but need to be counted as #small pages.
-		 * We need not scan over tail pages because we don't
-		 * handle each tail page individually in migration.
-		 */
-		if (PageHuge(page) || PageTransCompound(page)) {
-			struct folio *folio = page_folio(page);
-			unsigned int skip_pages;
-
-			if (PageHuge(page)) {
-				struct hstate *h;
-
-				/*
-				 * The huge page may be freed so can not
-				 * use folio_hstate() directly.
-				 */
-				h = size_to_hstate(folio_size(folio));
-				if (h && !hugepage_migration_supported(h))
-					return page;
-			} else if (!folio_test_lru(folio)) {
-				return page;
-			}
-
-			skip_pages = folio_nr_pages(folio) - folio_page_idx(folio, page);
-			pfn += skip_pages - 1;
-			continue;
-		}
-
-		/*
-		 * We can't use page_count without pin a page
-		 * because another CPU can free compound page.
-		 * This check already skips compound tails of THP
-		 * because their page->_refcount is zero at all time.
-		 */
-		if (!page_ref_count(page)) {
-			if (PageBuddy(page))
-				pfn += (1 << buddy_order(page)) - 1;
-			continue;
-		}
-
-		/*
-		 * The HWPoisoned page may be not in buddy system, and
-		 * page_count() is not 0.
-		 */
-		if ((mode == PB_ISOLATE_MODE_MEM_OFFLINE) && PageHWPoison(page))
-			continue;
-
-		/*
-		 * We treat all PageOffline() pages as movable when offlining
-		 * to give drivers a chance to decrement their reference count
-		 * in MEM_GOING_OFFLINE in order to indicate that these pages
-		 * can be offlined as there are no direct references anymore.
-		 * For actually unmovable PageOffline() where the driver does
-		 * not support this, we will fail later when trying to actually
-		 * move these pages that still have a reference count > 0.
-		 * (false negatives in this function only)
-		 */
-		if ((mode == PB_ISOLATE_MODE_MEM_OFFLINE) && PageOffline(page))
-			continue;
-
-		if (PageLRU(page) || page_has_movable_ops(page))
-			continue;
-
-		/*
-		 * If there are RECLAIMABLE pages, we need to check
-		 * it. But now, memory offline itself doesn't call
-		 * shrink_node_slabs() and it still to be fixed.
-		 */
-		return page;
+		start_pfn += step;
 	}
 	return NULL;
 }