Linux kernel mirror (for testing): git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

memblock: warn when freeing reserved memory before memory map is initialized

When CONFIG_DEFERRED_STRUCT_PAGE_INIT is enabled, freeing reserved
memory before the memory map is fully initialized in deferred_init_memmap()
causes accesses to uninitialized struct pages and may crash on spurious
list pointers, as was recently discovered during a discussion about memory
leaks in x86 EFI code [1].

The trace below is from an attempt to call free_reserved_page() before
page_alloc_init_late():

[ 0.076840] BUG: unable to handle page fault for address: ffffce1a005a0788
[ 0.078226] #PF: supervisor read access in kernel mode
[ 0.078226] #PF: error_code(0x0000) - not-present page
[ 0.078226] PGD 0 P4D 0
[ 0.078226] Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI
[ 0.078226] CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Not tainted 6.12.68-92.123.amzn2023.x86_64 #1
[ 0.078226] Hardware name: Amazon EC2 t3a.nano/, BIOS 1.0 10/16/2017
[ 0.078226] RIP: 0010:__list_del_entry_valid_or_report+0x32/0xb0
...
[ 0.078226] __free_one_page+0x170/0x520
[ 0.078226] free_pcppages_bulk+0x151/0x1e0
[ 0.078226] free_unref_page_commit+0x263/0x320
[ 0.078226] free_unref_page+0x2c8/0x5b0
[ 0.078226] ? srso_return_thunk+0x5/0x5f
[ 0.078226] free_reserved_page+0x1c/0x30
[ 0.078226] memblock_free_late+0x6c/0xc0

Currently there are not many callers of free_reserved_area(), and they all
appear to run at the right time.

Still, to protect against problematic code movement or the addition of new
callers, add a warning stating that reserved pages cannot be freed until
the memory map is fully initialized.

[1] https://lore.kernel.org/all/e5d5a1105d90ee1e7fe7eafaed2ed03bbad0c46b.camel@kernel.crashing.org/

Link: https://patch.msgid.link/20260323074836.3653702-10-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>

+15 -10

mm/internal.h (+10)
@@ -1233,7 +1233,17 @@
 #ifdef CONFIG_DEFERRED_STRUCT_PAGE_INIT
 DECLARE_STATIC_KEY_TRUE(deferred_pages);
 
+static inline bool deferred_pages_enabled(void)
+{
+	return static_branch_unlikely(&deferred_pages);
+}
+
 bool __init deferred_grow_zone(struct zone *zone, unsigned int order);
+#else
+static inline bool deferred_pages_enabled(void)
+{
+	return false;
+}
 #endif /* CONFIG_DEFERRED_STRUCT_PAGE_INIT */
 
 void init_deferred_page(unsigned long pfn, int nid);
mm/memblock.c (+5)
@@ -900,6 +900,11 @@
 {
 	unsigned long pages = 0, pfn;
 
+	if (deferred_pages_enabled()) {
+		WARN(1, "Cannot free reserved memory because of deferred initialization of the memory map");
+		return 0;
+	}
+
 	for_each_valid_pfn(pfn, PFN_UP(start), PFN_DOWN(end)) {
 		struct page *page = pfn_to_page(pfn);
 		void *direct_map_addr;
mm/page_alloc.c (-10)
@@ -331,11 +331,6 @@
  */
 DEFINE_STATIC_KEY_TRUE(deferred_pages);
 
-static inline bool deferred_pages_enabled(void)
-{
-	return static_branch_unlikely(&deferred_pages);
-}
-
 /*
  * deferred_grow_zone() is __init, but it is called from
  * get_page_from_freelist() during early boot until deferred_pages permanently
@@ -343,11 +348,6 @@
 	return deferred_grow_zone(zone, order);
 }
 #else
-static inline bool deferred_pages_enabled(void)
-{
-	return false;
-}
-
 static inline bool _deferred_grow_zone(struct zone *zone, unsigned int order)
 {
 	return false;