Linux kernel mirror (for testing): git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

rbd: check for EOD after exclusive lock is ensured to be held

Similar to commit 870611e4877e ("rbd: get snapshot context after
exclusive lock is ensured to be held"), move the "beyond EOD" check
into the image request state machine so that it's performed after
exclusive lock is ensured to be held. This avoids various race
conditions which can arise when the image is shrunk under I/O (in
practice, mostly readahead). In one such scenario

rbd_assert(objno < rbd_dev->object_map_size);

can be triggered if a close-to-EOD read gets queued right before the
shrink is initiated and the EOD check is performed against an outdated
mapping_size. After the resize is done on the server side and exclusive
lock is (re)acquired bringing along the new (now shrunk) object map, the
read starts going through the state machine and rbd_obj_may_exist() gets
invoked on an object that is out of bounds of rbd_dev->object_map array.
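
To make the arithmetic concrete, here is a standalone userspace sketch
(not kernel code; the 4 MiB object order and the 8 GiB -> 4 GiB shrink
are made-up example values) of how a read that passed the EOD check
against the old mapping_size ends up indexing past the shrunk object
map:

#include <assert.h>
#include <inttypes.h>
#include <stdio.h>

#define OBJ_ORDER 22                    /* 4 MiB objects (example value) */
#define OBJ_SIZE  (1ULL << OBJ_ORDER)

int main(void)
{
	uint64_t old_size = 8ULL << 30;     /* mapping_size before the shrink */
	uint64_t new_size = 4ULL << 30;     /* mapping_size after the shrink */
	uint64_t off = old_size - OBJ_SIZE; /* readahead close to old EOD */

	/* EOD check done against the outdated mapping_size: passes */
	assert(off + OBJ_SIZE <= old_size);

	/*
	 * By the time exclusive lock is (re)acquired, the object map
	 * covers only the shrunk image, but the object number is still
	 * derived from the old offset.
	 */
	uint64_t objno = off >> OBJ_ORDER;
	uint64_t object_map_size = new_size >> OBJ_ORDER;

	printf("objno=%" PRIu64 " object_map_size=%" PRIu64 " -> %s\n",
	       objno, object_map_size,
	       objno < object_map_size ? "ok" : "rbd_assert would trip");
	return 0;
}

With these numbers, objno is 2047 while the shrunk object map has only
1024 entries, i.e. exactly the condition that the rbd_assert() above
catches.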

Cc: stable@vger.kernel.org
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Dongsheng Yang <dongsheng.yang@linux.dev>

+21 -12
drivers/block/rbd.c
@@ -3495,11 +3495,29 @@
 	rbd_assert(!need_exclusive_lock(img_req) ||
 		   __rbd_is_lock_owner(rbd_dev));
 
-	if (rbd_img_is_write(img_req)) {
-		rbd_assert(!img_req->snapc);
+	if (test_bit(IMG_REQ_CHILD, &img_req->flags)) {
+		rbd_assert(!rbd_img_is_write(img_req));
+	} else {
+		struct request *rq = blk_mq_rq_from_pdu(img_req);
+		u64 off = (u64)blk_rq_pos(rq) << SECTOR_SHIFT;
+		u64 len = blk_rq_bytes(rq);
+		u64 mapping_size;
+
 		down_read(&rbd_dev->header_rwsem);
-		img_req->snapc = ceph_get_snap_context(rbd_dev->header.snapc);
+		mapping_size = rbd_dev->mapping.size;
+		if (rbd_img_is_write(img_req)) {
+			rbd_assert(!img_req->snapc);
+			img_req->snapc =
+			    ceph_get_snap_context(rbd_dev->header.snapc);
+		}
 		up_read(&rbd_dev->header_rwsem);
+
+		if (unlikely(off + len > mapping_size)) {
+			rbd_warn(rbd_dev, "beyond EOD (%llu~%llu > %llu)",
+				 off, len, mapping_size);
+			img_req->pending.result = -EIO;
+			return;
+		}
 	}
 
 	for_each_obj_request(img_req, obj_req) {
@@ -4743,7 +4725,6 @@
 	struct request *rq = blk_mq_rq_from_pdu(img_request);
 	u64 offset = (u64)blk_rq_pos(rq) << SECTOR_SHIFT;
 	u64 length = blk_rq_bytes(rq);
-	u64 mapping_size;
 	int result;
 
 	/* Ignore/skip any zero-length requests */
@@ -4755,16 +4738,8 @@
 	blk_mq_start_request(rq);
 
 	down_read(&rbd_dev->header_rwsem);
-	mapping_size = rbd_dev->mapping.size;
 	rbd_img_capture_header(img_request);
 	up_read(&rbd_dev->header_rwsem);
-
-	if (offset + length > mapping_size) {
-		rbd_warn(rbd_dev, "beyond EOD (%llu~%llu > %llu)", offset,
-			 length, mapping_size);
-		result = -EIO;
-		goto err_img_request;
-	}
 
 	dout("%s rbd_dev %p img_req %p %s %llu~%llu\n", __func__, rbd_dev,
 	     img_request, obj_op_name(op_type), offset, length);
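
Two details of the new placement are worth noting. Child image requests
(IMG_REQ_CHILD) skip the check and are asserted to be reads, since the
extent to validate comes from the block-layer request
(blk_mq_rq_from_pdu()) that only top-level image requests have. And
because the check now runs in the state machine, a failure is reported
via img_req->pending.result = -EIO rather than through the old goto
err_img_request path in the queue code, which is why the queue-side
check and its mapping_size local go away.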