Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

ublk: don't quiesce in ublk_ch_release

ublk_ch_release currently quiesces the device's request_queue while
setting force_abort/fail_io. This avoids data races by preventing
concurrent reads from the I/O path, but is not strictly needed - at this
point, canceling is already set and guaranteed to be observed by any
concurrently executing I/Os, so they will be handled properly even if
the changes to force_abort/fail_io propagate to the I/O path later.
Remove the quiesce/unquiesce calls from ublk_ch_release. This makes the
writes to force_abort/fail_io concurrent with the reads in the I/O path,
so make the accesses atomic.

Before this change, the call to blk_mq_quiesce_queue was responsible for
most (90%) of the runtime of ublk_ch_release. With that call eliminated,
ublk_ch_release runs much faster. Here is a comparison of the total time
spent in calls to ublk_ch_release when a server handling 128 devices
exits, before and after this change:

before: 1.11s
after: 0.09s
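
As a quick check on the figures above, using only the two measured totals and the device count of 128, the implied speedup, relative reduction, and average per-device cost work out roughly as follows (rounded):

```latex
\begin{align*}
\text{speedup} &= \frac{1.11\,\text{s}}{0.09\,\text{s}} \approx 12.3\times \\
\text{reduction} &= \frac{1.11 - 0.09}{1.11} = \frac{1.02}{1.11} \approx 92\% \\
\text{per device (before)} &= \frac{1.11\,\text{s}}{128} \approx 8.7\,\text{ms} \\
\text{per device (after)} &= \frac{0.09\,\text{s}}{128} \approx 0.7\,\text{ms}
\end{align*}
```

The roughly 92% reduction is consistent with the statement that blk_mq_quiesce_queue accounted for about 90% of the runtime.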

Signed-off-by: Uday Shankar <ushankar@purestorage.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250808-ublk_quiesce2-v1-1-f87ade33fa3d@purestorage.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>

authored by Uday Shankar and committed by Jens Axboe
212c928d d5dd4098

+5 -7
drivers/block/ublk_drv.c
···
 {
 	blk_status_t res;
 
-	if (unlikely(ubq->fail_io))
+	if (unlikely(READ_ONCE(ubq->fail_io)))
 		return BLK_STS_TARGET;
 
 	/* With recovery feature enabled, force_abort is set in
···
 	 * Note: force_abort is guaranteed to be seen because it is set
 	 * before request queue is unqiuesced.
 	 */
-	if (ublk_nosrv_should_queue_io(ubq) && unlikely(ubq->force_abort))
+	if (ublk_nosrv_should_queue_io(ubq) &&
+	    unlikely(READ_ONCE(ubq->force_abort)))
 		return BLK_STS_IOERR;
 
 	if (check_cancel && unlikely(ubq->canceling))
···
 	 * Transition the device to the nosrv state. What exactly this
 	 * means depends on the recovery flags
 	 */
-	blk_mq_quiesce_queue(disk->queue);
 	if (ublk_nosrv_should_stop_dev(ub)) {
 		/*
 		 * Allow any pending/future I/O to pass through quickly
···
 		 * waits for all pending I/O to complete
 		 */
 		for (i = 0; i < ub->dev_info.nr_hw_queues; i++)
-			ublk_get_queue(ub, i)->force_abort = true;
-		blk_mq_unquiesce_queue(disk->queue);
+			WRITE_ONCE(ublk_get_queue(ub, i)->force_abort, true);
 
 		ublk_stop_dev_unlocked(ub);
 	} else {
···
 		} else {
 			ub->dev_info.state = UBLK_S_DEV_FAIL_IO;
 			for (i = 0; i < ub->dev_info.nr_hw_queues; i++)
-				ublk_get_queue(ub, i)->fail_io = true;
+				WRITE_ONCE(ublk_get_queue(ub, i)->fail_io, true);
 		}
-		blk_mq_unquiesce_queue(disk->queue);
 	}
 unlock:
 	mutex_unlock(&ub->mutex);
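
The access pattern in this diff can be sketched in user-space C as below. This is an illustration only, not the driver code: `DEMO_READ_ONCE`/`DEMO_WRITE_ONCE` are simplified stand-ins for the kernel macros (they rely on the GNU `__typeof__` extension), and `struct demo_queue`, `enum demo_status`, `demo_prep`, and the `demo_mark_*` helpers are hypothetical names. Only the order of the two flag checks is taken from the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for READ_ONCE/WRITE_ONCE; NOT the kernel definitions.
 * The volatile access makes the compiler emit exactly one load or store. */
#define DEMO_READ_ONCE(x)     (*(const volatile __typeof__(x) *)&(x))
#define DEMO_WRITE_ONCE(x, v) (*(volatile __typeof__(x) *)&(x) = (v))

/* Hypothetical status codes standing in for the blk_status_t values. */
enum demo_status { DEMO_OK, DEMO_TARGET, DEMO_IOERR };

/* Hypothetical per-queue state holding the two flags touched by the diff. */
struct demo_queue {
	bool force_abort;
	bool fail_io;
};

/* I/O path: check fail_io first, then force_abort, as in the diff.
 * should_queue_io plays the role of ublk_nosrv_should_queue_io(). */
static enum demo_status demo_prep(struct demo_queue *q, bool should_queue_io)
{
	assert(q != NULL);
	if (DEMO_READ_ONCE(q->fail_io))
		return DEMO_TARGET;
	if (should_queue_io && DEMO_READ_ONCE(q->force_abort))
		return DEMO_IOERR;
	return DEMO_OK;
}

/* Release path: set the flag on every queue without pausing the I/O path. */
static void demo_mark_force_abort(struct demo_queue *qs, size_t n)
{
	for (size_t i = 0; i < n; i++)
		DEMO_WRITE_ONCE(qs[i].force_abort, true);
}

static void demo_mark_fail_io(struct demo_queue *qs, size_t n)
{
	for (size_t i = 0; i < n; i++)
		DEMO_WRITE_ONCE(qs[i].fail_io, true);
}
```

A caller would run `demo_mark_force_abort(queues, nr_queues)` from the release path while other threads keep calling `demo_prep()`. The sketch only shows how the flag accesses are marked. It does not model the `canceling` flag, which the commit message relies on to handle I/O that has not yet observed the new flag values.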