Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

ublk: avoid ublk_io_release() called after ublk char dev is closed

When running test_stress_04.sh, the following warning is triggered:

WARNING: CPU: 1 PID: 135 at drivers/block/ublk_drv.c:1933 ublk_ch_release+0x423/0x4b0 [ublk_drv]

This happens when the daemon is abruptly killed:

- some references may still be held, because registering an IO buffer
doesn't grab a ublk char device reference

OR

- io->task_registered_buffers won't be cleared, because the IO buffer is
released from non-daemon context

For zero-copy and auto buffer register modes, an I/O reference is held
across syscalls, so it may not be dropped naturally when the ublk server is
killed abruptly. However, releasing the io_uring context is guaranteed to
drop the reference eventually; see io_sqe_buffers_unregister(), called from
io_ring_ctx_free().

Fix this by deferring the release to a delayed work function,
ublk_ch_release_work_fn(), that:
- Waits asynchronously, by rescheduling the work function, until active
I/O references are dropped; this avoids a release-order dependency between
the ublk char device file and the io_uring file
- Reinitializes io->ref and io->task_registered_buffers to a clean state

This ensures the reference count state is clean when ublk_queue_reinit()
is called, preventing the warning and potential use-after-free.

Fixes: 1f6540e2aabb ("ublk: zc register/unregister bvec")
Fixes: 1ceeedb59749 ("ublk: optimize UBLK_IO_UNREGISTER_IO_BUF on daemon task")
Fixes: 8a8fe42d765b ("ublk: optimize UBLK_IO_REGISTER_IO_BUF on daemon task")
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250827121602.2619736-2-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
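
The "no active reference" test described in the commit message can be sketched in userspace. This is only a rough model, not the driver code: `struct model_io`, `model_check_and_reset()` and the value of `MODEL_REFCOUNT_INIT` are hypothetical stand-ins (the real `UBLK_REFCOUNT_INIT` value is not shown on this page). It assumes, as the patch does, that a slot is idle when `ref + task_registered_buffers` equals either the initial value or zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for UBLK_REFCOUNT_INIT; the real value is an assumption. */
#define MODEL_REFCOUNT_INIT 1000u

/* Hypothetical userspace model of the two per-io counters used by the patch. */
struct model_io {
	unsigned int ref;                     /* models io->ref */
	unsigned int task_registered_buffers; /* models io->task_registered_buffers */
};

/*
 * Return true as soon as a slot with an active reference is found.
 * Slots visited before that point are reset to a clean (0, 0) state,
 * mirroring the early-return shape of the loop in the patch.
 */
static bool model_check_and_reset(struct model_io *ios, size_t n)
{
	assert(ios != NULL);

	for (size_t i = 0; i < n; i++) {
		unsigned int refs = ios[i].ref + ios[i].task_registered_buffers;

		/* A sum of INIT or zero means nothing is outstanding. */
		if (refs != MODEL_REFCOUNT_INIT && refs != 0)
			return true;

		ios[i].ref = 0;
		ios[i].task_registered_buffers = 0;
	}
	return false;
}
```

A slot holding `(INIT - 1, 1)` models the second case from the commit message: a buffer counted in `task_registered_buffers` but released from another context. Its sum is still `INIT`, so the model treats it as idle and resets it.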

Authored by Ming Lei and committed by Jens Axboe
c5c5eb24 e3ef9445

+70 -2
drivers/block/ublk_drv.c
···
 	struct mutex cancel_mutex;
 	bool canceling;
 	pid_t ublksrv_tgid;
+	struct delayed_work exit_work;
 };
 
 /* header of ublk_params */
···
 		ublk_get_queue(ub, i)->canceling = canceling;
 }
 
-static int ublk_ch_release(struct inode *inode, struct file *filp)
+static bool ublk_check_and_reset_active_ref(struct ublk_device *ub)
 {
-	struct ublk_device *ub = filp->private_data;
+	int i, j;
+
+	if (!(ub->dev_info.flags & (UBLK_F_SUPPORT_ZERO_COPY |
+					UBLK_F_AUTO_BUF_REG)))
+		return false;
+
+	for (i = 0; i < ub->dev_info.nr_hw_queues; i++) {
+		struct ublk_queue *ubq = ublk_get_queue(ub, i);
+
+		for (j = 0; j < ubq->q_depth; j++) {
+			struct ublk_io *io = &ubq->ios[j];
+			unsigned int refs = refcount_read(&io->ref) +
+				io->task_registered_buffers;
+
+			/*
+			 * UBLK_REFCOUNT_INIT or zero means no active
+			 * reference
+			 */
+			if (refs != UBLK_REFCOUNT_INIT && refs != 0)
+				return true;
+
+			/* reset to zero if the io hasn't active references */
+			refcount_set(&io->ref, 0);
+			io->task_registered_buffers = 0;
+		}
+	}
+	return false;
+}
+
+static void ublk_ch_release_work_fn(struct work_struct *work)
+{
+	struct ublk_device *ub =
+		container_of(work, struct ublk_device, exit_work.work);
 	struct gendisk *disk;
 	int i;
+
+	/*
+	 * For zero-copy and auto buffer register modes, I/O references
+	 * might not be dropped naturally when the daemon is killed, but
+	 * io_uring guarantees that registered bvec kernel buffers are
+	 * unregistered finally when freeing io_uring context, then the
+	 * active references are dropped.
+	 *
+	 * Wait until active references are dropped for avoiding use-after-free
+	 *
+	 * registered buffer may be unregistered in io_ring's release hander,
+	 * so have to wait by scheduling work function for avoiding the two
+	 * file release dependency.
+	 */
+	if (ublk_check_and_reset_active_ref(ub)) {
+		schedule_delayed_work(&ub->exit_work, 1);
+		return;
+	}
 
 	/*
 	 * disk isn't attached yet, either device isn't live, or it has
···
 	ublk_reset_ch_dev(ub);
 out:
 	clear_bit(UB_STATE_OPEN, &ub->state);
+
+	/* put the reference grabbed in ublk_ch_release() */
+	ublk_put_device(ub);
+}
+
+static int ublk_ch_release(struct inode *inode, struct file *filp)
+{
+	struct ublk_device *ub = filp->private_data;
+
+	/*
+	 * Grab ublk device reference, so it won't be gone until we are
+	 * really released from work function.
+	 */
+	ublk_get_device(ub);
+
+	INIT_DELAYED_WORK(&ub->exit_work, ublk_ch_release_work_fn);
+	schedule_delayed_work(&ub->exit_work, 0);
 	return 0;
 }
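
The control flow of the patch (pin the device in the release handler, then let a work function poll until the I/O references are gone) can be illustrated with a small single-threaded userspace model. Everything here is hypothetical: `struct model_dev`, `model_release()`, `model_release_work()` and `model_run()` are not kernel names, and the tick loop merely stands in for the workqueue re-running the work after `schedule_delayed_work(&ub->exit_work, 1)`. It is a sketch of the pattern under those assumptions, not the driver's implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
struct model_dev {
	int dev_refs;       /* models the ublk device reference count */
	int active_io_refs; /* models I/O references still held via io_uring */
	bool open;          /* models the UB_STATE_OPEN bit */
};

/* Models ublk_ch_release(): pin the device, then defer the real teardown. */
static void model_release(struct model_dev *dev)
{
	assert(dev->dev_refs > 0);
	dev->dev_refs++;
}

/* One run of the work function. Returns true once the release has finished. */
static bool model_release_work(struct model_dev *dev)
{
	/* Still referenced: the caller must run the work again later. */
	if (dev->active_io_refs > 0)
		return false;

	dev->open = false;
	dev->dev_refs--; /* drop the reference taken in model_release() */
	return true;
}

/*
 * Drive the model: the work runs once per tick, and the outstanding I/O
 * references vanish at tick drop_tick (standing in for io_uring context
 * teardown). Returns how many times the work function ran, or -1 if the
 * release did not finish within max_ticks.
 */
static int model_run(struct model_dev *dev, int drop_tick, int max_ticks)
{
	model_release(dev);
	for (int tick = 0; tick < max_ticks; tick++) {
		if (tick == drop_tick)
			dev->active_io_refs = 0;
		if (model_release_work(dev))
			return tick + 1;
	}
	return -1;
}
```

In this model the extra device reference is what keeps `dev` valid across the retries. If the I/O references never go away, the work never completes and the pinned reference is never dropped, which is why the approach relies on io_uring eventually unregistering the buffers.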