Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

mm: shmem: fallback to page size splice if large folio has poisoned pages

tmpfs already supports PMD-sized large folios, but splice() cannot read
any pages if the large folio has a poisoned page, which is not good, as
Matthew pointed out in a previous email[1]:

"so if we have hwpoison set on one page in a folio, we now can't read
bytes from any page in the folio? That seems like we've made a bad
situation worse."

Thus add a fallback to PAGE_SIZE splice(), which still allows reading
normal pages if the large folio has hwpoisoned pages.
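
The length computation behind this fallback can be modelled in userspace. The following is only a sketch under stated assumptions, not the kernel code: MODEL_PAGE_SIZE is fixed at 4 KiB, and model_splice_part() is a hypothetical helper that mirrors how the patch derives the per-iteration length from the file offset, the requested length, the file size, and the fallback flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4 KiB pages; the kernel takes PAGE_SIZE from the architecture. */
#define MODEL_PAGE_SIZE ((size_t)4096)

/*
 * Hypothetical userspace model of the length computation in the patch.
 *   pos      - current file offset (*ppos in the kernel)
 *   len      - bytes still requested
 *   isize    - file size
 *   fallback - true if the large folio contains a hwpoisoned page
 * Returns the number of bytes to splice in this iteration.
 */
static size_t model_splice_part(uint64_t pos, size_t len, uint64_t isize,
				bool fallback)
{
	size_t size = len;
	uint64_t remaining;

	/* The kernel loop breaks out before this point if pos >= isize. */
	assert(pos < isize);

	if (fallback) {
		/* Same value as "*ppos & ~PAGE_MASK": offset within the page. */
		size_t offset = (size_t)(pos & (MODEL_PAGE_SIZE - 1));
		size_t to_page_end = MODEL_PAGE_SIZE - offset;

		/* Never cross into the next (possibly poisoned) page. */
		if (to_page_end < size)
			size = to_page_end;
	}

	/* Never read past the end of the file. */
	remaining = isize - pos;
	return remaining < size ? (size_t)remaining : size;
}
```

With the fallback active, a request that starts 100 bytes into a page is clamped to the remaining 3996 bytes of that page, so the next iteration starts on a page boundary and can check that page's poison state on its own.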

[1] https://lore.kernel.org/all/Zw_d0EVAJkpNJEbA@casper.infradead.org/

[baolin.wang@linux.alibaba.com: code layout cleanup, per dhowells]
Link: https://lkml.kernel.org/r/32dd938c-3531-49f7-93e4-b7ff21fec569@linux.alibaba.com
Link: https://lkml.kernel.org/r/e3737fbd5366c4de4337bf5f2044817e77a5235b.1729915173.git.baolin.wang@linux.alibaba.com
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Yang Shi <shy828301@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Authored by Baolin Wang and committed by Andrew Morton
729881ff 477327e1

+30 -8
mm/shmem.c
···
 	len = min_t(size_t, len, npages * PAGE_SIZE);

 	do {
+		bool fallback_page_splice = false;
+		struct page *page = NULL;
+		pgoff_t index;
+		size_t size;
+
 		if (*ppos >= i_size_read(inode))
 			break;

-		error = shmem_get_folio(inode, *ppos / PAGE_SIZE, 0, &folio,
-					SGP_READ);
+		index = *ppos >> PAGE_SHIFT;
+		error = shmem_get_folio(inode, index, 0, &folio, SGP_READ);
 		if (error) {
 			if (error == -EINVAL)
 				error = 0;
···
 		if (folio) {
 			folio_unlock(folio);

-			if (folio_test_hwpoison(folio) ||
-			    (folio_test_large(folio) &&
-			     folio_test_has_hwpoisoned(folio))) {
+			page = folio_file_page(folio, index);
+			if (PageHWPoison(page)) {
 				error = -EIO;
 				break;
 			}
+
+			if (folio_test_large(folio) &&
+			    folio_test_has_hwpoisoned(folio))
+				fallback_page_splice = true;
 		}

 		/*
···
 		isize = i_size_read(inode);
 		if (unlikely(*ppos >= isize))
 			break;
-		part = min_t(loff_t, isize - *ppos, len);
+		/*
+		 * Fallback to PAGE_SIZE splice if the large folio has hwpoisoned
+		 * pages.
+		 */
+		size = len;
+		if (unlikely(fallback_page_splice)) {
+			size_t offset = *ppos & ~PAGE_MASK;
+
+			size = umin(size, PAGE_SIZE - offset);
+		}
+		part = min_t(loff_t, isize - *ppos, size);

 		if (folio) {
 			/*
···
 			 * virtual addresses, take care about potential aliasing
 			 * before reading the page on the kernel side.
 			 */
-			if (mapping_writably_mapped(mapping))
-				flush_dcache_folio(folio);
+			if (mapping_writably_mapped(mapping)) {
+				if (likely(!fallback_page_splice))
+					flush_dcache_folio(folio);
+				else
+					flush_dcache_page(page);
+			}
 			folio_mark_accessed(folio);
 			/*
 			 * Ok, we have the page, and it's up-to-date, so we can
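
The effect of the change on the read loop can be illustrated with a small userspace simulation. This is a sketch under loud assumptions, not kernel code: struct model_folio, model_has_hwpoisoned() and model_splice_read() are hypothetical names, the page size is fixed at 4 KiB, one folio is assumed to cover the whole file, and pipe-buffer limits are ignored. It only mirrors the control flow of the diff: check the poison state of the page at the current index, and clamp each step to one page when the large folio contains a poisoned page.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption: 4 KiB pages. */
#define MODEL_PAGE_SHIFT 12
#define MODEL_PAGE_SIZE  ((size_t)1 << MODEL_PAGE_SHIFT)

/* Hypothetical stand-in for a folio: a page count and a per-page poison flag. */
struct model_folio {
	size_t nr_pages;
	const bool *poisoned;
};

/* Models folio_test_has_hwpoisoned(): true if any page is poisoned. */
static bool model_has_hwpoisoned(const struct model_folio *f)
{
	for (size_t i = 0; i < f->nr_pages; i++)
		if (f->poisoned[i])
			return true;
	return false;
}

/*
 * Models the patched loop. Returns the number of bytes "spliced" and
 * stores 0 or -EIO in *error.
 */
static size_t model_splice_read(const struct model_folio *f, size_t pos,
				size_t len, int *error)
{
	size_t isize, total = 0;
	bool fallback;

	assert(f && error);
	isize = f->nr_pages * MODEL_PAGE_SIZE;
	fallback = f->nr_pages > 1 && model_has_hwpoisoned(f);
	*error = 0;

	while (len > 0 && pos < isize) {
		size_t index = pos >> MODEL_PAGE_SHIFT;
		size_t size = len, part;

		/* Only the page actually being read decides on -EIO. */
		if (f->poisoned[index]) {
			*error = -EIO;
			break;
		}

		/* Fall back to page-sized steps inside a damaged large folio. */
		if (fallback) {
			size_t offset = pos & (MODEL_PAGE_SIZE - 1);

			if (MODEL_PAGE_SIZE - offset < size)
				size = MODEL_PAGE_SIZE - offset;
		}

		part = isize - pos < size ? isize - pos : size;
		pos += part;
		len -= part;
		total += part;
	}
	return total;
}
```

In this model, a four-page folio with page 2 poisoned yields 8192 bytes followed by -EIO when read from offset 0, while a read that starts at page 3 succeeds. Under the removed check, any read touching that folio would have failed with -EIO.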