Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git


ublk: avoid unpinning pages under maple tree spinlock

ublk_shmem_remove_ranges() calls unpin_user_pages() while holding the
maple tree spinlock (mas_lock). Although unpin_user_pages() is safe in
atomic context, holding the spinlock across potentially many page
unpinning operations is not ideal.

Split out __ublk_shmem_remove_ranges(), which erases up to 64 matching
ranges under mas_lock, collecting each range's base_pfn and nr_pages
into a temporary xarray; it then drops the lock and unpins the pages
outside spinlock context. ublk_shmem_remove_ranges() loops, calling it
until all matching ranges have been processed.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Link: https://patch.msgid.link/20260423033058.2805135-4-tom.leiming@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

Authored by Ming Lei, committed by Jens Axboe
309e02dc ea1db795

+46 -10
drivers/block/ublk_drv.c
@@ -5441,32 +5441,68 @@
 }
 
 /*
- * Remove ranges from the maple tree matching buf_index, unpin pages
- * and free range structs. If buf_index < 0, remove all ranges.
+ * Inner loop: erase up to UBLK_REMOVE_BATCH matching ranges under
+ * mas_lock, collecting them into an xarray. Then drop the lock and
+ * unpin pages + free ranges outside spinlock context.
+ *
+ * Returns true if the tree walk completed, false if more ranges remain.
+ * Xarray key is the base PFN, value encodes nr_pages via xa_mk_value().
  */
-static int ublk_shmem_remove_ranges(struct ublk_device *ub, int buf_index)
+#define UBLK_REMOVE_BATCH 64
+
+static bool __ublk_shmem_remove_ranges(struct ublk_device *ub,
+				       int buf_index, int *ret)
 {
 	MA_STATE(mas, &ub->buf_tree, 0, ULONG_MAX);
 	struct ublk_buf_range *range;
-	int ret = -ENOENT;
+	struct xarray to_unpin;
+	unsigned long idx;
+	unsigned int count = 0;
+	bool done = false;
+	void *entry;
+
+	xa_init(&to_unpin);
 
 	mas_lock(&mas);
 	mas_for_each(&mas, range, ULONG_MAX) {
-		unsigned long base, nr;
+		unsigned long nr;
 
 		if (buf_index >= 0 && range->buf_index != buf_index)
 			continue;
 
-		ret = 0;
-		base = mas.index;
-		nr = mas.last - base + 1;
+		*ret = 0;
+		nr = mas.last - mas.index + 1;
+		if (xa_err(xa_store(&to_unpin, mas.index,
+				    xa_mk_value(nr), GFP_ATOMIC)))
+			goto unlock;
 		mas_erase(&mas);
-
-		ublk_unpin_range_pages(base, nr);
 		kfree(range);
+		if (++count >= UBLK_REMOVE_BATCH)
+			goto unlock;
 	}
+	done = true;
+unlock:
 	mas_unlock(&mas);
 
+	xa_for_each(&to_unpin, idx, entry)
+		ublk_unpin_range_pages(idx, xa_to_value(entry));
+	xa_destroy(&to_unpin);
+
+	return done;
+}
+
+/*
+ * Remove ranges from the maple tree matching buf_index, unpin pages
+ * and free range structs. If buf_index < 0, remove all ranges.
+ * Processes ranges in batches to avoid holding the maple tree spinlock
+ * across potentially expensive page unpinning.
+ */
+static int ublk_shmem_remove_ranges(struct ublk_device *ub, int buf_index)
+{
+	int ret = -ENOENT;
+
+	while (!__ublk_shmem_remove_ranges(ub, buf_index, &ret))
+		cond_resched();
 	return ret;
 }
 