Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

mm/memory-failure: improve large block size folio handling

Large block size (LBS) folios cannot be split to order-0 folios, only
down to min_order_for_split(). The current split therefore fails
outright, which is not optimal. Split the folio to min_order_for_split()
instead, so that after the split only the folio containing the poisoned
page becomes unusable.

For soft offline, do not split the large folio if its
min_order_for_split() is not 0, since the folio is still accessible from
userspace and a premature split might cause performance loss.

Link: https://lkml.kernel.org/r/20251031162001.670503-3-ziy@nvidia.com
Signed-off-by: Zi Yan <ziy@nvidia.com>
Suggested-by: Jane Chu <jane.chu@oracle.com>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: Barry Song <baohua@kernel.org>
Reviewed-by: Lance Yang <lance.yang@linux.dev>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
Cc: Nico Pache <npache@redhat.com>
Cc: Pankaj Raghav <kernel@pankajraghav.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Yang Shi <shy828301@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Authored by Zi Yan and committed by Andrew Morton
689b8986 a7ef12c6

+27 -4
mm/memory-failure.c
···
  * there is still more to do, hence the page refcount we took earlier
  * is still needed.
  */
-static int try_to_split_thp_page(struct page *page, bool release)
+static int try_to_split_thp_page(struct page *page, unsigned int new_order,
+		bool release)
 {
 	int ret;
 
 	lock_page(page);
-	ret = split_huge_page(page);
+	ret = split_huge_page_to_order(page, new_order);
 	unlock_page(page);
 
 	if (ret && release)
···
 	folio_unlock(folio);
 
 	if (folio_test_large(folio)) {
+		const int new_order = min_order_for_split(folio);
+		int err;
+
 		/*
 		 * The flag must be set after the refcount is bumped
 		 * otherwise it may race with THP split.
···
 		 * page is a valid handlable page.
 		 */
 		folio_set_has_hwpoisoned(folio);
-		if (try_to_split_thp_page(p, false) < 0) {
+		err = try_to_split_thp_page(p, new_order, /* release= */ false);
+		/*
+		 * If splitting a folio to order-0 fails, kill the process.
+		 * Split the folio regardless to minimize unusable pages.
+		 * Because the memory failure code cannot handle large
+		 * folios, this split is always treated as if it failed.
+		 */
+		if (err || new_order) {
+			/* get folio again in case the original one is split */
+			folio = page_folio(p);
 			res = -EHWPOISON;
 			kill_procs_now(p, pfn, flags, folio);
 			put_page(p);
···
 	};
 
 	if (!huge && folio_test_large(folio)) {
-		if (try_to_split_thp_page(page, true)) {
+		const int new_order = min_order_for_split(folio);
+
+		/*
+		 * If new_order (target split order) is not 0, do not split the
+		 * folio at all to retain the still accessible large folio.
+		 * NOTE: if minimizing the number of soft offline pages is
+		 * preferred, split it to non-zero new_order like it is done in
+		 * memory_failure().
+		 */
+		if (new_order || try_to_split_thp_page(page, /* new_order= */ 0,
+				/* release= */ true)) {
 			pr_info("%#lx: thp split failed\n", pfn);
 			return -EBUSY;
 		}
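
The two call sites in the diff treat a non-zero minimum split order differently: the hard-failure path still attempts the split and then reports failure, while the soft-offline path bails out before splitting. The following is a minimal user-space sketch of that decision logic, not kernel code. It assumes the split result can be reduced to a boolean. The enum, the function names, and the pages_in_poisoned_folio() helper are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcome labels for this sketch; not kernel identifiers. */
enum outcome {
	OUTCOME_CONTINUE,	/* folio reached order 0, handling can proceed */
	OUTCOME_KILL_PROCS,	/* hard failure: the -EHWPOISON branch is taken */
	OUTCOME_EBUSY,		/* soft offline: the -EBUSY branch is taken */
};

/*
 * Model of the hard-failure branch: the split to min_order is always
 * attempted, but anything other than a successful split to order 0
 * is treated as a failure ("if (err || new_order)").
 */
static enum outcome hard_failure_outcome(unsigned int min_order, bool split_ok)
{
	int err = split_ok ? 0 : -1;

	if (err || min_order)
		return OUTCOME_KILL_PROCS;
	return OUTCOME_CONTINUE;
}

/*
 * Model of the soft-offline branch: a non-zero min_order short-circuits
 * the condition, so no split is attempted and the large folio is kept.
 */
static enum outcome soft_offline_outcome(unsigned int min_order, bool split_ok,
					 bool *split_attempted)
{
	*split_attempted = false;
	if (min_order)
		return OUTCOME_EBUSY;

	*split_attempted = true;
	return split_ok ? OUTCOME_CONTINUE : OUTCOME_EBUSY;
}

/*
 * Number of pages in the folio that still contains the poisoned page:
 * 2^min_order if the split succeeded, 2^folio_order otherwise.
 */
static unsigned long pages_in_poisoned_folio(unsigned int folio_order,
					     unsigned int min_order,
					     bool split_ok)
{
	assert(min_order <= folio_order);
	return 1UL << (split_ok ? min_order : folio_order);
}
```

Under these assumptions, an order-9 folio with a minimum order of 2 leaves 4 pages in the poisoned folio after a successful split, instead of all 512 when the split is skipped or fails. The hard-failure path reports failure in both cases.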