Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

mm/page_alloc: place pages to tail in __free_pages_core()

__free_pages_core() is used when exposing fresh memory to the buddy during
system boot and when onlining memory in generic_online_page().

generic_online_page() is used in two cases:

1. Direct memory onlining in online_pages().
2. Deferred memory onlining in memory-ballooning-like mechanisms (Hyper-V
balloon and virtio-mem), when parts of a section are kept
fake-offline to be fake-onlined later on.

In 1, we already place pages to the tail of the freelist. Pages will be
freed to MIGRATE_ISOLATE lists first and moved to the tail of the
freelists via undo_isolate_page_range().

In 2, we currently don't implement a proper rule. In case of virtio-mem,
where we currently always online MAX_ORDER - 1 pages, the pages will be
placed to the HEAD of the freelist, which is undesirable. While the Hyper-V
balloon calls generic_online_page() with single pages, usually it will
call it on successive single pages in a larger block.
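
To illustrate why head placement matters, here is a minimal sketch of a
freelist that allocates from the head. It is a toy model, not kernel code:
the toy_* names are hypothetical, and fpi_t, FPI_NONE and FPI_TO_TAIL are
local stand-ins that only borrow the names used in mm/page_alloc.c.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins; these are NOT the kernel's definitions. */
typedef int fpi_t;
#define FPI_NONE    0
#define FPI_TO_TAIL 1

struct toy_page {
	int pfn;
	struct toy_page *prev, *next;
};

/* Circular doubly linked list with a sentinel node (simplified list_head). */
struct toy_freelist {
	struct toy_page head;
};

static void toy_freelist_init(struct toy_freelist *fl)
{
	fl->head.pfn = -1;
	fl->head.prev = fl->head.next = &fl->head;
}

/* Link @page between @prev and @next. */
static void toy_link(struct toy_page *page, struct toy_page *prev,
		     struct toy_page *next)
{
	page->prev = prev;
	page->next = next;
	prev->next = page;
	next->prev = page;
}

/* Free a page to the head (default) or to the tail (FPI_TO_TAIL). */
static void toy_free_page(struct toy_freelist *fl, struct toy_page *page,
			  fpi_t fpi_flags)
{
	assert(page != &fl->head);
	if (fpi_flags & FPI_TO_TAIL)
		toy_link(page, fl->head.prev, &fl->head);
	else
		toy_link(page, &fl->head, fl->head.next);
}

/* Allocate from the head; returns NULL when the list is empty. */
static struct toy_page *toy_alloc_page(struct toy_freelist *fl)
{
	struct toy_page *page = fl->head.next;

	if (page == &fl->head)
		return NULL;
	page->prev->next = page->next;
	page->next->prev = page->prev;
	page->prev = page->next = NULL;
	return page;
}
```

In this model, a page freed with FPI_NONE is the next one handed out, while a
page freed with FPI_TO_TAIL is handed out only after everything that was
already on the list.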

The pages are fresh, so place them to the tail of the freelist and avoid
the PCP. In __free_pages_core(), remove the now superfluous call to
set_page_refcounted() and add a comment regarding page initialization and
the refcount.
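
The refcount handling can be sketched with a toy model. This is an
illustration under simplifying assumptions, not the kernel implementation:
struct toy_rc_page and the toy_* helpers are hypothetical, and each page is
marked free individually rather than as one high-order block.

```c
#include <assert.h>

/* Toy page carrying only what the sketch needs; NOT struct page. */
struct toy_rc_page {
	int refcount;
	int on_freelist;
};

/* Models __init_single_page(): a fresh memmap entry starts at refcount 1. */
static void toy_init_single_page(struct toy_rc_page *p)
{
	p->refcount = 1;
	p->on_freelist = 0;
}

/* Models handing a page to the buddy, which expects refcount 0. */
static void toy_free_to_buddy(struct toy_rc_page *p)
{
	assert(p->refcount == 0);
	p->on_freelist = 1;
}

/* Old flow: zero all counts, take a reference on the first page, drop it. */
static void toy_free_pages_core_old(struct toy_rc_page *pages, unsigned int nr)
{
	unsigned int i;

	for (i = 0; i < nr; i++)
		pages[i].refcount = 0;

	pages[0].refcount = 1;		/* like set_page_refcounted() */
	if (--pages[0].refcount == 0)	/* like the put in __free_pages() */
		for (i = 0; i < nr; i++)
			toy_free_to_buddy(&pages[i]);
}

/* New flow: zero all counts and free directly, no temporary reference. */
static void toy_free_pages_core_new(struct toy_rc_page *pages, unsigned int nr)
{
	unsigned int i;

	for (i = 0; i < nr; i++)
		pages[i].refcount = 0;
	for (i = 0; i < nr; i++)
		toy_free_to_buddy(&pages[i]);
}
```

Both flows end in the same state. The old one only raised the refcount so
that __free_pages() could drop it again, which is why the call becomes
unnecessary once __free_pages_ok() is called directly.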

Note: In 2, we currently don't shuffle. If ever relevant (page shuffling
is usually of limited use in virtualized environments), we might want to
shuffle after a sequence of generic_online_page() calls in the relevant
callers.

Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Reviewed-by: Wei Yang <richard.weiyang@linux.alibaba.com>
Acked-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Wei Liu <wei.liu@kernel.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Scott Cheloha <cheloha@linux.ibm.com>
Link: https://lkml.kernel.org/r/20201005121534.15649-5-david@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Authored by David Hildenbrand and committed by Linus Torvalds
7fef431b 293ffa5e

+23 -10
mm/page_alloc.c
···
 unsigned int pageblock_order __read_mostly;
 #endif
 
-static void __free_pages_ok(struct page *page, unsigned int order);
+static void __free_pages_ok(struct page *page, unsigned int order,
+			    fpi_t fpi_flags);
 
 /*
  * results with 256, 32 in the lowmem_reserve sysctl:
···
 void free_compound_page(struct page *page)
 {
 	mem_cgroup_uncharge(page);
-	__free_pages_ok(page, compound_order(page));
+	__free_pages_ok(page, compound_order(page), FPI_NONE);
 }
 
 void prep_compound_page(struct page *page, unsigned int order)
···
 static void free_one_page(struct zone *zone,
 				struct page *page, unsigned long pfn,
 				unsigned int order,
-				int migratetype)
+				int migratetype, fpi_t fpi_flags)
 {
 	spin_lock(&zone->lock);
 	if (unlikely(has_isolate_pageblock(zone) ||
 		is_migrate_isolate(migratetype))) {
 		migratetype = get_pfnblock_migratetype(page, pfn);
 	}
-	__free_one_page(page, pfn, zone, order, migratetype, FPI_NONE);
+	__free_one_page(page, pfn, zone, order, migratetype, fpi_flags);
 	spin_unlock(&zone->lock);
 }
 
···
 	}
 }
 
-static void __free_pages_ok(struct page *page, unsigned int order)
+static void __free_pages_ok(struct page *page, unsigned int order,
+			    fpi_t fpi_flags)
 {
 	unsigned long flags;
 	int migratetype;
···
 	migratetype = get_pfnblock_migratetype(page, pfn);
 	local_irq_save(flags);
 	__count_vm_events(PGFREE, 1 << order);
-	free_one_page(page_zone(page), page, pfn, order, migratetype);
+	free_one_page(page_zone(page), page, pfn, order, migratetype,
+		      fpi_flags);
 	local_irq_restore(flags);
 }
 
···
 	struct page *p = page;
 	unsigned int loop;
 
+	/*
+	 * When initializing the memmap, __init_single_page() sets the refcount
+	 * of all pages to 1 ("allocated"/"not free"). We have to set the
+	 * refcount of all involved pages to 0.
+	 */
 	prefetchw(p);
 	for (loop = 0; loop < (nr_pages - 1); loop++, p++) {
 		prefetchw(p + 1);
···
 	set_page_count(p, 0);
 
 	atomic_long_add(nr_pages, &page_zone(page)->managed_pages);
-	set_page_refcounted(page);
-	__free_pages(page, order);
+
+	/*
+	 * Bypass PCP and place fresh pages right to the tail, primarily
+	 * relevant for memory onlining.
+	 */
+	__free_pages_ok(page, order, FPI_TO_TAIL);
 }
 
 #ifdef CONFIG_NEED_MULTIPLE_NODES
···
 	 */
 	if (migratetype >= MIGRATE_PCPTYPES) {
 		if (unlikely(is_migrate_isolate(migratetype))) {
-			free_one_page(zone, page, pfn, 0, migratetype);
+			free_one_page(zone, page, pfn, 0, migratetype,
+				      FPI_NONE);
 			return;
 		}
 		migratetype = MIGRATE_MOVABLE;
···
 	if (order == 0)		/* Via pcp? */
 		free_unref_page(page);
 	else
-		__free_pages_ok(page, order);
+		__free_pages_ok(page, order, FPI_NONE);
 }
 
 void __free_pages(struct page *page, unsigned int order)