Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge tag 'io_uring-6.15-20250403' of git://git.kernel.dk/linux

Pull more io_uring updates from Jens Axboe:
"Set of fixes/updates for io_uring that should go into this release.

The ublk bits could've gone via either tree - usually I put them in
block, but they got a bit mixed this series with the zero-copy
support that ended up dipping into both trees.

This contains:

- Fix for sendmsg zc, include in pinned pages accounting like we do
for the other zc types

- Series for ublk fixing request aborting, doing various little
cleanups, fixing some zc issues, and adding queue_rqs support

- Another ublk series doing some code cleanups

- Series cleaning up the io_uring send path, mostly in preparation
for registered buffers

- Series doing little MSG_RING cleanups

- Fix for the newly added zc rx, fixing len being 0 for the last
invocation of the callback

- Add vectored registered buffer support for ublk. With that, then
ublk also supports this feature in the kernel revision where it
could be generically introduced for rw/net

- A bunch of selftest additions for ublk. This is the majority of the
diffstat

- Silence a KCSAN data race warning for io-wq

- Various little cleanups and fixes"

* tag 'io_uring-6.15-20250403' of git://git.kernel.dk/linux: (44 commits)
io_uring: always do atomic put from iowq
selftests: ublk: enable zero copy for stripe target
io_uring: support vectored kernel fixed buffer
block: add for_each_mp_bvec()
io_uring: add validate_fixed_range() for validate fixed buffer
selftests: ublk: kublk: fix an error log line
selftests: ublk: kublk: use ioctl-encoded opcodes
io_uring/zcrx: return early from io_zcrx_recv_skb if readlen is 0
io_uring/net: avoid import_ubuf for regvec send
io_uring/rsrc: check size when importing reg buffer
io_uring: cleanup {g,s]etsockopt sqe reading
io_uring: hide caches sqes from drivers
io_uring: make zcrx depend on CONFIG_IO_URING
io_uring: add req flag invariant build assertion
Documentation: ublk: remove dead footnote
selftests: ublk: specify io_cmd_buf pointer type
ublk: specify io_cmd_buf pointer type
io_uring: don't pass ctx to tw add remote helper
io_uring/msg: initialise msg request opcode
io_uring/msg: rename io_double_lock_ctx()
...

+673 -238
+25 -10
Documentation/block/ublk.rst
···
 ``UBLK_IO_COMMIT_AND_FETCH_REQ`` to the server, ublkdrv needs to copy
 the server buffer (pages) read to the IO request pages.
 
-Future development
-==================
-
 Zero copy
 ---------
 
-Zero copy is a generic requirement for nbd, fuse or similar drivers. A
-problem [#xiaoguang]_ Xiaoguang mentioned is that pages mapped to userspace
-can't be remapped any more in kernel with existing mm interfaces. This can
-occurs when destining direct IO to ``/dev/ublkb*``. Also, he reported that
-big requests (IO size >= 256 KB) may benefit a lot from zero copy.
+ublk zero copy relies on io_uring's fixed kernel buffer, which provides
+two APIs: `io_buffer_register_bvec()` and `io_buffer_unregister_bvec`.
 
+ublk adds IO command of `UBLK_IO_REGISTER_IO_BUF` to call
+`io_buffer_register_bvec()` for ublk server to register client request
+buffer into io_uring buffer table, then ublk server can submit io_uring
+IOs with the registered buffer index. IO command of `UBLK_IO_UNREGISTER_IO_BUF`
+calls `io_buffer_unregister_bvec()` to unregister the buffer, which is
+guaranteed to be live between calling `io_buffer_register_bvec()` and
+`io_buffer_unregister_bvec()`. Any io_uring operation which supports this
+kind of kernel buffer will grab one reference of the buffer until the
+operation is completed.
+
+ublk server implementing zero copy or user copy has to be CAP_SYS_ADMIN and
+be trusted, because it is ublk server's responsibility to make sure IO buffer
+filled with data for handling read command, and ublk server has to return
+correct result to ublk driver when handling READ command, and the result
+has to match with how many bytes filled to the IO buffer. Otherwise,
+uninitialized kernel IO buffer will be exposed to client application.
+
+ublk server needs to align the parameter of `struct ublk_param_dma_align`
+with backend for zero copy to work correctly.
+
+For reaching best IO performance, ublk server should align its segment
+parameter of `struct ublk_param_segment` with backend for avoiding
+unnecessary IO split, which usually hurts io_uring performance.
 
 References
 ==========
···
 .. [#userspace_nbdublk] https://gitlab.com/rwmjones/libnbd/-/tree/nbdublk
 
 .. [#userspace_readme] https://github.com/ming1/ubdsrv/blob/master/README
-
-.. [#xiaoguang] https://lore.kernel.org/linux-block/YoOr6jBfgVm8GvWg@stefanha-x1.localdomain/
+180 -45
drivers/block/ublk_drv.c
···
 #define UBLK_PARAM_TYPE_ALL \
 	(UBLK_PARAM_TYPE_BASIC | UBLK_PARAM_TYPE_DISCARD | \
 	 UBLK_PARAM_TYPE_DEVT | UBLK_PARAM_TYPE_ZONED | \
-	 UBLK_PARAM_TYPE_DMA_ALIGN)
+	 UBLK_PARAM_TYPE_DMA_ALIGN | UBLK_PARAM_TYPE_SEGMENT)
 
 struct ublk_rq_data {
 	struct kref ref;
 };
 
 struct ublk_uring_cmd_pdu {
+	/*
+	 * Store requests in same batch temporarily for queuing them to
+	 * daemon context.
+	 *
+	 * It should have been stored to request payload, but we do want
+	 * to avoid extra pre-allocation, and uring_cmd payload is always
+	 * free for us
+	 */
+	union {
+		struct request *req;
+		struct request *req_list;
+	};
+
+	/*
+	 * The following two are valid in this cmd whole lifetime, and
+	 * setup in ublk uring_cmd handler
+	 */
 	struct ublk_queue *ubq;
 	u16 tag;
 };
···
 
 	unsigned long flags;
 	struct task_struct *ubq_daemon;
-	char *io_cmd_buf;
+	struct ublksrv_io_desc *io_cmd_buf;
 
-	unsigned long io_addr; /* mapped vm address */
-	unsigned int max_io_sz;
 	bool force_abort;
 	bool timeout;
 	bool canceling;
···
 		return -EINVAL;
 	}
 
+	if (ub->params.types & UBLK_PARAM_TYPE_SEGMENT) {
+		const struct ublk_param_segment *p = &ub->params.seg;
+
+		if (!is_power_of_2(p->seg_boundary_mask + 1))
+			return -EINVAL;
+
+		if (p->seg_boundary_mask + 1 < UBLK_MIN_SEGMENT_SIZE)
+			return -EINVAL;
+		if (p->max_segment_size < UBLK_MIN_SEGMENT_SIZE)
+			return -EINVAL;
+	}
+
 	return 0;
 }
 
···
 static inline bool ublk_support_user_copy(const struct ublk_queue *ubq)
 {
 	return ubq->flags & (UBLK_F_USER_COPY | UBLK_F_SUPPORT_ZERO_COPY);
+}
+
+static inline bool ublk_need_map_io(const struct ublk_queue *ubq)
+{
+	return !ublk_support_user_copy(ubq);
 }
 
 static inline bool ublk_need_req_ref(const struct ublk_queue *ubq)
···
 static inline struct ublksrv_io_desc *ublk_get_iod(struct ublk_queue *ubq,
 		int tag)
 {
-	return (struct ublksrv_io_desc *)
-		&(ubq->io_cmd_buf[tag * sizeof(struct ublksrv_io_desc)]);
+	return &ubq->io_cmd_buf[tag];
 }
 
-static inline char *ublk_queue_cmd_buf(struct ublk_device *ub, int q_id)
+static inline struct ublksrv_io_desc *
+ublk_queue_cmd_buf(struct ublk_device *ub, int q_id)
 {
 	return ublk_get_queue(ub, q_id)->io_cmd_buf;
 }
···
 {
 	const unsigned int rq_bytes = blk_rq_bytes(req);
 
-	if (ublk_support_user_copy(ubq))
+	if (!ublk_need_map_io(ubq))
 		return rq_bytes;
 
 	/*
···
 {
 	const unsigned int rq_bytes = blk_rq_bytes(req);
 
-	if (ublk_support_user_copy(ubq))
+	if (!ublk_need_map_io(ubq))
 		return rq_bytes;
 
 	if (ublk_need_unmap_req(req)) {
···
 static inline struct ublk_uring_cmd_pdu *ublk_get_uring_cmd_pdu(
 		struct io_uring_cmd *ioucmd)
 {
-	return (struct ublk_uring_cmd_pdu *)&ioucmd->pdu;
+	return io_uring_cmd_to_pdu(ioucmd, struct ublk_uring_cmd_pdu);
 }
 
 static inline bool ubq_daemon_is_dying(struct ublk_queue *ubq)
···
 		blk_mq_end_request(rq, BLK_STS_IOERR);
 }
 
-static void ublk_rq_task_work_cb(struct io_uring_cmd *cmd,
-				 unsigned int issue_flags)
+static void ublk_dispatch_req(struct ublk_queue *ubq,
+			      struct request *req,
+			      unsigned int issue_flags)
 {
-	struct ublk_uring_cmd_pdu *pdu = ublk_get_uring_cmd_pdu(cmd);
-	struct ublk_queue *ubq = pdu->ubq;
-	int tag = pdu->tag;
-	struct request *req = blk_mq_tag_to_rq(
-		ubq->dev->tag_set.tags[ubq->q_id], tag);
+	int tag = req->tag;
 	struct ublk_io *io = &ubq->ios[tag];
 	unsigned int mapped_bytes;
 
···
 	ubq_complete_io_cmd(io, UBLK_IO_RES_OK, issue_flags);
 }
 
+static void ublk_cmd_tw_cb(struct io_uring_cmd *cmd,
+			   unsigned int issue_flags)
+{
+	struct ublk_uring_cmd_pdu *pdu = ublk_get_uring_cmd_pdu(cmd);
+	struct ublk_queue *ubq = pdu->ubq;
+
+	ublk_dispatch_req(ubq, pdu->req, issue_flags);
+}
+
 static void ublk_queue_cmd(struct ublk_queue *ubq, struct request *rq)
 {
-	struct ublk_io *io = &ubq->ios[rq->tag];
+	struct io_uring_cmd *cmd = ubq->ios[rq->tag].cmd;
+	struct ublk_uring_cmd_pdu *pdu = ublk_get_uring_cmd_pdu(cmd);
 
-	io_uring_cmd_complete_in_task(io->cmd, ublk_rq_task_work_cb);
+	pdu->req = rq;
+	io_uring_cmd_complete_in_task(cmd, ublk_cmd_tw_cb);
+}
+
+static void ublk_cmd_list_tw_cb(struct io_uring_cmd *cmd,
+		unsigned int issue_flags)
+{
+	struct ublk_uring_cmd_pdu *pdu = ublk_get_uring_cmd_pdu(cmd);
+	struct request *rq = pdu->req_list;
+	struct ublk_queue *ubq = pdu->ubq;
+	struct request *next;
+
+	do {
+		next = rq->rq_next;
+		rq->rq_next = NULL;
+		ublk_dispatch_req(ubq, rq, issue_flags);
+		rq = next;
+	} while (rq);
+}
+
+static void ublk_queue_cmd_list(struct ublk_queue *ubq, struct rq_list *l)
+{
+	struct request *rq = rq_list_peek(l);
+	struct io_uring_cmd *cmd = ubq->ios[rq->tag].cmd;
+	struct ublk_uring_cmd_pdu *pdu = ublk_get_uring_cmd_pdu(cmd);
+
+	pdu->req_list = rq;
+	rq_list_init(l);
+	io_uring_cmd_complete_in_task(cmd, ublk_cmd_list_tw_cb);
 }
 
 static enum blk_eh_timer_return ublk_timeout(struct request *rq)
···
 	return BLK_EH_RESET_TIMER;
 }
 
-static blk_status_t ublk_queue_rq(struct blk_mq_hw_ctx *hctx,
-		const struct blk_mq_queue_data *bd)
+static blk_status_t ublk_prep_req(struct ublk_queue *ubq, struct request *rq)
 {
-	struct ublk_queue *ubq = hctx->driver_data;
-	struct request *rq = bd->rq;
 	blk_status_t res;
 
-	if (unlikely(ubq->fail_io)) {
+	if (unlikely(ubq->fail_io))
 		return BLK_STS_TARGET;
-	}
-
-	/* fill iod to slot in io cmd buffer */
-	res = ublk_setup_iod(ubq, rq);
-	if (unlikely(res != BLK_STS_OK))
-		return BLK_STS_IOERR;
 
 	/* With recovery feature enabled, force_abort is set in
 	 * ublk_stop_dev() before calling del_gendisk(). We have to
···
 	if (ublk_nosrv_should_queue_io(ubq) && unlikely(ubq->force_abort))
 		return BLK_STS_IOERR;
 
+	if (unlikely(ubq->canceling))
+		return BLK_STS_IOERR;
+
+	/* fill iod to slot in io cmd buffer */
+	res = ublk_setup_iod(ubq, rq);
+	if (unlikely(res != BLK_STS_OK))
+		return BLK_STS_IOERR;
+
+	blk_mq_start_request(rq);
+	return BLK_STS_OK;
+}
+
+static blk_status_t ublk_queue_rq(struct blk_mq_hw_ctx *hctx,
+		const struct blk_mq_queue_data *bd)
+{
+	struct ublk_queue *ubq = hctx->driver_data;
+	struct request *rq = bd->rq;
+	blk_status_t res;
+
+	res = ublk_prep_req(ubq, rq);
+	if (res != BLK_STS_OK)
+		return res;
+
+	/*
+	 * ->canceling has to be handled after ->force_abort and ->fail_io
+	 * is dealt with, otherwise this request may not be failed in case
+	 * of recovery, and cause hang when deleting disk
+	 */
 	if (unlikely(ubq->canceling)) {
 		__ublk_abort_rq(ubq, rq);
 		return BLK_STS_OK;
 	}
 
-	blk_mq_start_request(bd->rq);
 	ublk_queue_cmd(ubq, rq);
-
 	return BLK_STS_OK;
+}
+
+static void ublk_queue_rqs(struct rq_list *rqlist)
+{
+	struct rq_list requeue_list = { };
+	struct rq_list submit_list = { };
+	struct ublk_queue *ubq = NULL;
+	struct request *req;
+
+	while ((req = rq_list_pop(rqlist))) {
+		struct ublk_queue *this_q = req->mq_hctx->driver_data;
+
+		if (ubq && ubq != this_q && !rq_list_empty(&submit_list))
+			ublk_queue_cmd_list(ubq, &submit_list);
+		ubq = this_q;
+
+		if (ublk_prep_req(ubq, req) == BLK_STS_OK)
+			rq_list_add_tail(&submit_list, req);
+		else
+			rq_list_add_tail(&requeue_list, req);
+	}
+
+	if (ubq && !rq_list_empty(&submit_list))
+		ublk_queue_cmd_list(ubq, &submit_list);
+	*rqlist = requeue_list;
 }
 
 static int ublk_init_hctx(struct blk_mq_hw_ctx *hctx, void *driver_data,
···
 
 static const struct blk_mq_ops ublk_mq_ops = {
 	.queue_rq = ublk_queue_rq,
+	.queue_rqs = ublk_queue_rqs,
 	.init_hctx = ublk_init_hctx,
 	.timeout = ublk_timeout,
 };
···
 	}
 }
 
-static bool ublk_abort_requests(struct ublk_device *ub, struct ublk_queue *ubq)
+/* Must be called when queue is frozen */
+static bool ublk_mark_queue_canceling(struct ublk_queue *ubq)
 {
-	struct gendisk *disk;
+	bool canceled;
 
 	spin_lock(&ubq->cancel_lock);
-	if (ubq->canceling) {
-		spin_unlock(&ubq->cancel_lock);
-		return false;
-	}
-	ubq->canceling = true;
+	canceled = ubq->canceling;
+	if (!canceled)
+		ubq->canceling = true;
 	spin_unlock(&ubq->cancel_lock);
+
+	return canceled;
+}
+
+static bool ublk_abort_requests(struct ublk_device *ub, struct ublk_queue *ubq)
+{
+	bool was_canceled = ubq->canceling;
+	struct gendisk *disk;
+
+	if (was_canceled)
+		return false;
 
 	spin_lock(&ub->lock);
 	disk = ub->ub_disk;
···
 	if (!disk)
 		return false;
 
-	/* Now we are serialized with ublk_queue_rq() */
+	/*
+	 * Now we are serialized with ublk_queue_rq()
+	 *
+	 * Make sure that ubq->canceling is set when queue is frozen,
+	 * because ublk_queue_rq() has to rely on this flag for avoiding to
+	 * touch completed uring_cmd
+	 */
 	blk_mq_quiesce_queue(disk->queue);
-	/* abort queue is for making forward progress */
-	ublk_abort_queue(ub, ubq);
+	was_canceled = ublk_mark_queue_canceling(ubq);
+	if (!was_canceled) {
+		/* abort queue is for making forward progress */
+		ublk_abort_queue(ub, ubq);
+	}
 	blk_mq_unquiesce_queue(disk->queue);
 	put_device(disk_to_dev(disk));
 
-	return true;
+	return !was_canceled;
 }
 
 static void ublk_cancel_cmd(struct ublk_queue *ubq, struct ublk_io *io,
···
 		if (io->flags & UBLK_IO_FLAG_OWNED_BY_SRV)
 			goto out;
 
-		if (!ublk_support_user_copy(ubq)) {
+		if (ublk_need_map_io(ubq)) {
 			/*
 			 * FETCH_RQ has to provide IO buffer if NEED GET
 			 * DATA is not enabled
···
 		if (!(io->flags & UBLK_IO_FLAG_OWNED_BY_SRV))
 			goto out;
 
-		if (!ublk_support_user_copy(ubq)) {
+		if (ublk_need_map_io(ubq)) {
 			/*
 			 * COMMIT_AND_FETCH_REQ has to provide IO buffer if
 			 * NEED GET DATA is not enabled or it is Read IO.
···
 
 	if (ub->params.types & UBLK_PARAM_TYPE_DMA_ALIGN)
 		lim.dma_alignment = ub->params.dma.alignment;
+
+	if (ub->params.types & UBLK_PARAM_TYPE_SEGMENT) {
+		lim.seg_boundary_mask = ub->params.seg.seg_boundary_mask;
+		lim.max_segment_size = ub->params.seg.max_segment_size;
+		lim.max_segments = ub->params.seg.max_segments;
+	}
 
 	if (wait_for_completion_interruptible(&ub->completion) != 0)
 		return -EINTR;
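The batching done by ublk_queue_rqs() in the diff above can be illustrated outside the kernel. What follows is a minimal userspace sketch, not the driver code: `struct req`, `struct req_list`, `flush_batch()` and `queue_reqs()` are hypothetical stand-ins for `struct request`, `struct rq_list`, ublk_queue_cmd_list() and ublk_queue_rqs(), and the `prep_fails` field simulates ublk_prep_req() returning an error. It only shows the control flow: consecutive requests for the same queue are collected into one batch, the batch is flushed when the queue changes, and requests that fail preparation are handed back to the caller.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct request. */
struct req {
	int queue_id;     /* stands in for req->mq_hctx->driver_data */
	int prep_fails;   /* nonzero simulates ublk_prep_req() failing */
	struct req *next;
};

/* Hypothetical stand-in for struct rq_list: singly linked, with tail. */
struct req_list {
	struct req *head;
	struct req *tail;
};

struct batch_stats {
	int batches;      /* number of flushes (one task-work item each) */
	int submitted;    /* total requests handed over in batches */
};

static struct req *list_pop(struct req_list *l)
{
	struct req *r = l->head;

	if (r) {
		l->head = r->next;
		if (!l->head)
			l->tail = NULL;
		r->next = NULL;
	}
	return r;
}

static void list_add_tail(struct req_list *l, struct req *r)
{
	r->next = NULL;
	if (l->tail)
		l->tail->next = r;
	else
		l->head = r;
	l->tail = r;
}

/* Hand the whole batch over at once and leave the list empty. */
static void flush_batch(struct req_list *l, struct batch_stats *st)
{
	assert(l->head != NULL);   /* callers only flush non-empty batches */
	st->batches++;
	while (list_pop(l))
		st->submitted++;
}

/* On return, *in holds only the requests that failed preparation. */
static void queue_reqs(struct req_list *in, struct batch_stats *st)
{
	struct req_list requeue = { NULL, NULL };
	struct req_list submit = { NULL, NULL };
	int cur_q = -1;            /* -1 means "no queue seen yet" */
	struct req *r;

	while ((r = list_pop(in))) {
		/* Queue changed: flush what was collected for the old one. */
		if (cur_q != -1 && cur_q != r->queue_id && submit.head)
			flush_batch(&submit, st);
		cur_q = r->queue_id;

		if (!r->prep_fails)
			list_add_tail(&submit, r);
		else
			list_add_tail(&requeue, r);
	}

	if (submit.head)
		flush_batch(&submit, st);
	*in = requeue;
}
```

With five requests on queues 0, 0, 1, 1, 0 where the first queue-1 request fails preparation, this sketch flushes three batches covering four requests and returns the failed one to the caller.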
+6
include/linux/bvec.h
···
 		((bvl = bvec_iter_bvec((bio_vec), (iter))), 1); \
 	     bvec_iter_advance_single((bio_vec), &(iter), (bvl).bv_len))
 
+#define for_each_mp_bvec(bvl, bio_vec, iter, start) \
+	for (iter = (start); \
+	     (iter).bi_size && \
+		((bvl = mp_bvec_iter_bvec((bio_vec), (iter))), 1); \
+	     bvec_iter_advance_single((bio_vec), &(iter), (bvl).bv_len))
+
 /* for iterating one bio from start to end */
 #define BVEC_ITER_ALL_INIT (struct bvec_iter) \
 { \
-1
include/linux/io_uring/cmd.h
···
 
 struct io_uring_cmd_data {
 	void *op_data;
-	struct io_uring_sqe sqes[2];
 };
 
 static inline const void *io_uring_sqe_cmd(const struct io_uring_sqe *sqe)
+25
include/uapi/linux/ublk_cmd.h
···
 	__u8 pad[4];
 };
 
+#define UBLK_MIN_SEGMENT_SIZE 4096
+/*
+ * If any one of the three segment parameter is set as 0, the behavior is
+ * undefined.
+ */
+struct ublk_param_segment {
+	/*
+	 * seg_boundary_mask + 1 needs to be power_of_2(), and the sum has
+	 * to be >= UBLK_MIN_SEGMENT_SIZE(4096)
+	 */
+	__u64 seg_boundary_mask;
+
+	/*
+	 * max_segment_size could be override by virt_boundary_mask, so be
+	 * careful when setting both.
+	 *
+	 * max_segment_size has to be >= UBLK_MIN_SEGMENT_SIZE(4096)
+	 */
+	__u32 max_segment_size;
+	__u16 max_segments;
+	__u8 pad[2];
+};
+
 struct ublk_params {
 	/*
 	 * Total length of parameters, userspace has to set 'len' for both
···
 #define UBLK_PARAM_TYPE_DEVT (1 << 2)
 #define UBLK_PARAM_TYPE_ZONED (1 << 3)
 #define UBLK_PARAM_TYPE_DMA_ALIGN (1 << 4)
+#define UBLK_PARAM_TYPE_SEGMENT (1 << 5)
 	__u32 types; /* types of parameter included */
 
 	struct ublk_param_basic basic;
···
 	struct ublk_param_devt devt;
 	struct ublk_param_zoned zoned;
 	struct ublk_param_dma_align dma;
+	struct ublk_param_segment seg;
 };
 
 #endif
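A ublk server could pre-check its segment parameters against the constraints stated in the comments above before handing them to the driver. The following is a minimal userspace sketch under stated assumptions: `struct seg_params`, `MIN_SEGMENT_SIZE` and `seg_params_valid()` are hypothetical names that mirror `struct ublk_param_segment` and `UBLK_MIN_SEGMENT_SIZE`; they are not part of the uapi header or of any ublk library. The rejection of `max_segments == 0` is an extra assumption derived from the header comment that zero values give undefined behavior.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mirrors UBLK_MIN_SEGMENT_SIZE from the diff (hypothetical local name). */
#define MIN_SEGMENT_SIZE 4096u

/* Hypothetical userspace mirror of struct ublk_param_segment. */
struct seg_params {
	uint64_t seg_boundary_mask;
	uint32_t max_segment_size;
	uint16_t max_segments;
	uint8_t  pad[2];
};

/* The mirrored layout is 8 + 4 + 2 + 2 bytes with no implicit padding. */
static_assert(sizeof(struct seg_params) == 16, "unexpected struct layout");

static bool is_pow2_u64(uint64_t v)
{
	return v != 0 && (v & (v - 1)) == 0;
}

/*
 * Returns true when the parameters satisfy the documented constraints:
 * seg_boundary_mask + 1 is a power of two and >= 4096, and
 * max_segment_size >= 4096. Zero max_segments is also rejected here,
 * which is this sketch's own assumption.
 */
static bool seg_params_valid(const struct seg_params *p)
{
	/* Wraps to 0 for UINT64_MAX, which is_pow2_u64() rejects. */
	uint64_t boundary = p->seg_boundary_mask + 1;

	if (!is_pow2_u64(boundary))
		return false;
	if (boundary < MIN_SEGMENT_SIZE)
		return false;
	if (p->max_segment_size < MIN_SEGMENT_SIZE)
		return false;
	if (p->max_segments == 0)
		return false;
	return true;
}
```

For instance, a mask of 4095 with a 4096-byte maximum segment size passes, while a mask of 511 (a 512-byte boundary) or a mask of 4096 (boundary 4097, not a power of two) is rejected.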
+1
io_uring/Kconfig
···
 
 config IO_URING_ZCRX
 	def_bool y
+	depends on IO_URING
 	depends on PAGE_POOL
 	depends on INET
 	depends on NET_RX_BUSY_POLL
+9 -9
io_uring/io_uring.c
···
 	WARN_ON_ONCE(ret);
 }
 
-static inline void io_req_local_work_add(struct io_kiocb *req,
-					 struct io_ring_ctx *ctx,
-					 unsigned flags)
+static void io_req_local_work_add(struct io_kiocb *req, unsigned flags)
 {
+	struct io_ring_ctx *ctx = req->ctx;
 	unsigned nr_wait, nr_tw, nr_tw_prev;
 	struct llist_node *head;
 
···
 void __io_req_task_work_add(struct io_kiocb *req, unsigned flags)
 {
 	if (req->ctx->flags & IORING_SETUP_DEFER_TASKRUN)
-		io_req_local_work_add(req, req->ctx, flags);
+		io_req_local_work_add(req, flags);
 	else
 		io_req_normal_work_add(req);
 }
 
-void io_req_task_work_add_remote(struct io_kiocb *req, struct io_ring_ctx *ctx,
-				 unsigned flags)
+void io_req_task_work_add_remote(struct io_kiocb *req, unsigned flags)
 {
-	if (WARN_ON_ONCE(!(ctx->flags & IORING_SETUP_DEFER_TASKRUN)))
+	if (WARN_ON_ONCE(!(req->ctx->flags & IORING_SETUP_DEFER_TASKRUN)))
 		return;
-	io_req_local_work_add(req, ctx, flags);
+	__io_req_task_work_add(req, flags);
 }
 
 static void __cold io_move_task_work_from_local(struct io_ring_ctx *ctx)
···
 {
 	io_req_flags_t res = 0;
 
+	BUILD_BUG_ON(REQ_F_ISREG_BIT != REQ_F_SUPPORT_NOWAIT_BIT + 1);
+
 	if (S_ISREG(file_inode(file)->i_mode))
 		res |= REQ_F_ISREG;
 	if ((file->f_flags & O_NONBLOCK) || (file->f_mode & FMODE_NOWAIT))
···
 	struct io_kiocb *req = container_of(work, struct io_kiocb, work);
 	struct io_kiocb *nxt = NULL;
 
-	if (req_ref_put_and_test(req)) {
+	if (req_ref_put_and_test_atomic(req)) {
 		if (req->flags & IO_REQ_LINK_FLAGS)
 			nxt = io_req_find_next(req);
 		io_free_req(req);
+1 -2
io_uring/io_uring.h
···
 				unsigned issue_flags);
 
 void __io_req_task_work_add(struct io_kiocb *req, unsigned flags);
-void io_req_task_work_add_remote(struct io_kiocb *req, struct io_ring_ctx *ctx,
-				 unsigned flags);
+void io_req_task_work_add_remote(struct io_kiocb *req, unsigned flags);
 void io_req_task_queue(struct io_kiocb *req);
 void io_req_task_complete(struct io_kiocb *req, io_tw_token_t tw);
 void io_req_task_queue_fail(struct io_kiocb *req, int ret);
+6 -5
io_uring/msg_ring.c
···
 	mutex_unlock(&octx->uring_lock);
 }
 
-static int io_double_lock_ctx(struct io_ring_ctx *octx,
-			      unsigned int issue_flags)
+static int io_lock_external_ctx(struct io_ring_ctx *octx,
+				unsigned int issue_flags)
 {
 	/*
 	 * To ensure proper ordering between the two ctxs, we can only
···
 		kmem_cache_free(req_cachep, req);
 		return -EOWNERDEAD;
 	}
+	req->opcode = IORING_OP_NOP;
 	req->cqe.user_data = user_data;
 	io_req_set_res(req, res, cflags);
 	percpu_ref_get(&ctx->refs);
 	req->ctx = ctx;
 	req->tctx = NULL;
 	req->io_task_work.func = io_msg_tw_complete;
-	io_req_task_work_add_remote(req, ctx, IOU_F_TWQ_LAZY_WAKE);
+	io_req_task_work_add_remote(req, IOU_F_TWQ_LAZY_WAKE);
 	return 0;
 }
 
···
 
 	ret = -EOVERFLOW;
 	if (target_ctx->flags & IORING_SETUP_IOPOLL) {
-		if (unlikely(io_double_lock_ctx(target_ctx, issue_flags)))
+		if (unlikely(io_lock_external_ctx(target_ctx, issue_flags)))
 			return -EAGAIN;
 	}
 	if (io_post_aux_cqe(target_ctx, msg->user_data, msg->len, flags))
···
 	struct file *src_file = msg->src_file;
 	int ret;
 
-	if (unlikely(io_double_lock_ctx(target_ctx, issue_flags)))
+	if (unlikely(io_lock_external_ctx(target_ctx, issue_flags)))
 		return -EAGAIN;
 
 	ret = __io_fixed_fd_install(target_ctx, src_file, msg->dst_fd);
+50 -85
io_uring/net.c
···
 	struct io_zcrx_ifq *ifq;
 };
 
+static int io_sg_from_iter_iovec(struct sk_buff *skb,
+				 struct iov_iter *from, size_t length);
+static int io_sg_from_iter(struct sk_buff *skb,
+			   struct iov_iter *from, size_t length);
+
 int io_shutdown_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 {
 	struct io_shutdown *shutdown = io_kiocb_to_cmd(req, struct io_shutdown);
···
 	return hdr;
 }
 
-/* assign new iovec to kmsg, if we need to */
-static void io_net_vec_assign(struct io_kiocb *req, struct io_async_msghdr *kmsg,
-			     struct iovec *iov)
-{
-	if (iov) {
-		req->flags |= REQ_F_NEED_CLEANUP;
-		io_vec_reset_iovec(&kmsg->vec, iov, kmsg->msg.msg_iter.nr_segs);
-	}
-}
-
 static inline void io_mshot_prep_retry(struct io_kiocb *req,
 				       struct io_async_msghdr *kmsg)
 {
···
 			     &iomsg->msg.msg_iter, io_is_compat(req->ctx));
 	if (unlikely(ret < 0))
 		return ret;
-	io_net_vec_assign(req, iomsg, iov);
+
+	if (iov) {
+		req->flags |= REQ_F_NEED_CLEANUP;
+		io_vec_reset_iovec(&iomsg->vec, iov, iomsg->msg.msg_iter.nr_segs);
+	}
 	return 0;
 }
 
···
 	return 0;
 }
 
-static int io_sendmsg_copy_hdr(struct io_kiocb *req,
-			       struct io_async_msghdr *iomsg)
-{
-	struct io_sr_msg *sr = io_kiocb_to_cmd(req, struct io_sr_msg);
-	struct user_msghdr msg;
-	int ret;
-
-	ret = io_msg_copy_hdr(req, iomsg, &msg, ITER_SOURCE, NULL);
-	if (unlikely(ret))
-		return ret;
-
-	if (!(req->flags & REQ_F_BUFFER_SELECT))
-		ret = io_net_import_vec(req, iomsg, msg.msg_iov, msg.msg_iovlen,
-					ITER_SOURCE);
-	/* save msg_control as sys_sendmsg() overwrites it */
-	sr->msg_control = iomsg->msg.msg_control_user;
-	return ret;
-}
-
 void io_sendmsg_recvmsg_cleanup(struct io_kiocb *req)
 {
 	struct io_async_msghdr *io = req->async_data;
···
 		kmsg->msg.msg_name = &kmsg->addr;
 		kmsg->msg.msg_namelen = addr_len;
 	}
+	if (sr->flags & IORING_RECVSEND_FIXED_BUF)
+		return 0;
 	if (!io_do_buffer_select(req)) {
 		ret = import_ubuf(ITER_SOURCE, sr->buf, sr->len,
 				  &kmsg->msg.msg_iter);
···
 {
 	struct io_sr_msg *sr = io_kiocb_to_cmd(req, struct io_sr_msg);
 	struct io_async_msghdr *kmsg = req->async_data;
-
-	sr->umsg = u64_to_user_ptr(READ_ONCE(sqe->addr));
-
-	return io_sendmsg_copy_hdr(req, kmsg);
-}
-
-static int io_sendmsg_zc_setup(struct io_kiocb *req, const struct io_uring_sqe *sqe)
-{
-	struct io_sr_msg *sr = io_kiocb_to_cmd(req, struct io_sr_msg);
-	struct io_async_msghdr *kmsg = req->async_data;
 	struct user_msghdr msg;
 	int ret;
 
-	if (!(sr->flags & IORING_RECVSEND_FIXED_BUF))
-		return io_sendmsg_setup(req, sqe);
-
 	sr->umsg = u64_to_user_ptr(READ_ONCE(sqe->addr));
-
 	ret = io_msg_copy_hdr(req, kmsg, &msg, ITER_SOURCE, NULL);
 	if (unlikely(ret))
 		return ret;
+	/* save msg_control as sys_sendmsg() overwrites it */
 	sr->msg_control = kmsg->msg.msg_control_user;
-	kmsg->msg.msg_iter.nr_segs = msg.msg_iovlen;
 
-	return io_prep_reg_iovec(req, &kmsg->vec, msg.msg_iov, msg.msg_iovlen);
+	if (sr->flags & IORING_RECVSEND_FIXED_BUF) {
+		kmsg->msg.msg_iter.nr_segs = msg.msg_iovlen;
+		return io_prep_reg_iovec(req, &kmsg->vec, msg.msg_iov,
+					 msg.msg_iovlen);
+	}
+	if (req->flags & REQ_F_BUFFER_SELECT)
+		return 0;
+	return io_net_import_vec(req, kmsg, msg.msg_iov, msg.msg_iovlen, ITER_SOURCE);
 }
 
 #define SENDMSG_FLAGS (IORING_RECVSEND_POLL_FIRST | IORING_RECVSEND_BUNDLE)
···
 
 	sr->done_io = 0;
 	sr->retry = false;
-
-	if (req->opcode != IORING_OP_SEND) {
-		if (sqe->addr2 || sqe->file_index)
-			return -EINVAL;
-	}
-
 	sr->len = READ_ONCE(sqe->len);
 	sr->flags = READ_ONCE(sqe->ioprio);
 	if (sr->flags & ~SENDMSG_FLAGS)
···
 		return -ENOMEM;
 	if (req->opcode != IORING_OP_SENDMSG)
 		return io_send_setup(req, sqe);
+	if (unlikely(sqe->addr2 || sqe->file_index))
+		return -EINVAL;
 	return io_sendmsg_setup(req, sqe);
 }
 
···
 {
 	struct io_sr_msg *zc = io_kiocb_to_cmd(req, struct io_sr_msg);
 	struct io_ring_ctx *ctx = req->ctx;
+	struct io_async_msghdr *iomsg;
 	struct io_kiocb *notif;
+	int ret;
 
 	zc->done_io = 0;
 	zc->retry = false;
-	req->flags |= REQ_F_POLL_NO_LAZY;
 
 	if (unlikely(READ_ONCE(sqe->__pad2[0]) || READ_ONCE(sqe->addr3)))
 		return -EINVAL;
···
 	notif->cqe.user_data = req->cqe.user_data;
 	notif->cqe.res = 0;
 	notif->cqe.flags = IORING_CQE_F_NOTIF;
-	req->flags |= REQ_F_NEED_CLEANUP;
+	req->flags |= REQ_F_NEED_CLEANUP | REQ_F_POLL_NO_LAZY;
 
 	zc->flags = READ_ONCE(sqe->ioprio);
 	if (unlikely(zc->flags & ~IO_ZC_FLAGS_COMMON)) {
···
 		}
 	}
 
-	if (req->opcode != IORING_OP_SEND_ZC) {
-		if (unlikely(sqe->addr2 || sqe->file_index))
-			return -EINVAL;
-	}
-
 	zc->len = READ_ONCE(sqe->len);
 	zc->msg_flags = READ_ONCE(sqe->msg_flags) | MSG_NOSIGNAL | MSG_ZEROCOPY;
 	req->buf_index = READ_ONCE(sqe->buf_index);
···
 	if (io_is_compat(req->ctx))
 		zc->msg_flags |= MSG_CMSG_COMPAT;
 
-	if (unlikely(!io_msg_alloc_async(req)))
+	iomsg = io_msg_alloc_async(req);
+	if (unlikely(!iomsg))
 		return -ENOMEM;
+
 	if (req->opcode == IORING_OP_SEND_ZC) {
-		req->flags |= REQ_F_IMPORT_BUFFER;
-		return io_send_setup(req, sqe);
+		if (zc->flags & IORING_RECVSEND_FIXED_BUF)
+			req->flags |= REQ_F_IMPORT_BUFFER;
+		ret = io_send_setup(req, sqe);
+	} else {
+		if (unlikely(sqe->addr2 || sqe->file_index))
+			return -EINVAL;
+		ret = io_sendmsg_setup(req, sqe);
 	}
-	return io_sendmsg_zc_setup(req, sqe);
+	if (unlikely(ret))
+		return ret;
+
+	if (!(zc->flags & IORING_RECVSEND_FIXED_BUF)) {
+		iomsg->msg.sg_from_iter = io_sg_from_iter_iovec;
+		return io_notif_account_mem(zc->notif, iomsg->msg.msg_iter.count);
+	}
+	iomsg->msg.sg_from_iter = io_sg_from_iter;
+	return 0;
 }
 
 static int io_sg_from_iter_iovec(struct sk_buff *skb,
···
 {
 	struct io_sr_msg *sr = io_kiocb_to_cmd(req, struct io_sr_msg);
 	struct io_async_msghdr *kmsg = req->async_data;
-	int ret;
 
-	if (sr->flags & IORING_RECVSEND_FIXED_BUF) {
-		sr->notif->buf_index = req->buf_index;
-		ret = io_import_reg_buf(sr->notif, &kmsg->msg.msg_iter,
-					(u64)(uintptr_t)sr->buf, sr->len,
-					ITER_SOURCE, issue_flags);
-		if (unlikely(ret))
-			return ret;
-		kmsg->msg.sg_from_iter = io_sg_from_iter;
-	} else {
-		ret = import_ubuf(ITER_SOURCE, sr->buf, sr->len, &kmsg->msg.msg_iter);
-		if (unlikely(ret))
-			return ret;
-		ret = io_notif_account_mem(sr->notif, sr->len);
-		if (unlikely(ret))
-			return ret;
-		kmsg->msg.sg_from_iter = io_sg_from_iter_iovec;
-	}
+	WARN_ON_ONCE(!(sr->flags & IORING_RECVSEND_FIXED_BUF));
 
-	return ret;
+	sr->notif->buf_index = req->buf_index;
+	return io_import_reg_buf(sr->notif, &kmsg->msg.msg_iter,
+				 (u64)(uintptr_t)sr->buf, sr->len,
+				 ITER_SOURCE, issue_flags);
 }
 
 int io_send_zc(struct io_kiocb *req, unsigned int issue_flags)
···
 	unsigned flags;
 	int ret, min_ret = 0;
 
-	kmsg->msg.sg_from_iter = io_sg_from_iter_iovec;
-
 	if (req->flags & REQ_F_IMPORT_BUFFER) {
 		unsigned uvec_segs = kmsg->msg.msg_iter.nr_segs;
 		int ret;
···
 					&kmsg->vec, uvec_segs, issue_flags);
 		if (unlikely(ret))
 			return ret;
-		kmsg->msg.sg_from_iter = io_sg_from_iter;
 		req->flags &= ~REQ_F_IMPORT_BUFFER;
 	}
 
+7
io_uring/refs.h
···
 	return atomic_inc_not_zero(&req->refs);
 }
 
+static inline bool req_ref_put_and_test_atomic(struct io_kiocb *req)
+{
+	WARN_ON_ONCE(!(data_race(req->flags) & REQ_F_REFCOUNT));
+	WARN_ON_ONCE(req_ref_zero_or_close_to_overflow(req));
+	return atomic_dec_and_test(&req->refs);
+}
+
 static inline bool req_ref_put_and_test(struct io_kiocb *req)
 {
 	if (likely(!(req->flags & REQ_F_REFCOUNT)))
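The new helper above always performs an atomic decrement and reports whether the caller dropped the last reference. The same put-and-test pattern can be sketched in userspace with C11 atomics. This is only an analogy under stated assumptions: `struct obj`, `obj_get()` and `obj_put_and_test()` are hypothetical names, `atomic_fetch_sub()` stands in for the kernel's atomic_dec_and_test(), and the `assert()` plays the role of the WARN_ON_ONCE() sanity check.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical refcounted object; not a kernel structure. */
struct obj {
	atomic_int refs;
};

static void obj_get(struct obj *o)
{
	atomic_fetch_add(&o->refs, 1);
}

/*
 * Drop one reference unconditionally with an atomic read-modify-write.
 * Returns true only for the caller that dropped the last reference,
 * so exactly one caller ends up responsible for freeing the object.
 */
static bool obj_put_and_test(struct obj *o)
{
	int old = atomic_fetch_sub(&o->refs, 1);

	assert(old > 0);   /* a put without a matching get is a bug */
	return old == 1;
}
```

A typical caller would free the object only when `obj_put_and_test()` returns true, which corresponds to the `io_free_req()` call guarded by the helper in io_uring.c.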
+112 -14
io_uring/rsrc.c
··· 1002 1002 } 1003 1003 EXPORT_SYMBOL_GPL(io_buffer_unregister_bvec); 1004 1004 1005 - static int io_import_fixed(int ddir, struct iov_iter *iter, 1006 - struct io_mapped_ubuf *imu, 1007 - u64 buf_addr, size_t len) 1005 + static int validate_fixed_range(u64 buf_addr, size_t len, 1006 + const struct io_mapped_ubuf *imu) 1008 1007 { 1009 1008 u64 buf_end; 1010 - size_t offset; 1011 1009 1012 - if (WARN_ON_ONCE(!imu)) 1013 - return -EFAULT; 1014 1010 if (unlikely(check_add_overflow(buf_addr, (u64)len, &buf_end))) 1015 1011 return -EFAULT; 1016 1012 /* not inside the mapped region */ 1017 1013 if (unlikely(buf_addr < imu->ubuf || buf_end > (imu->ubuf + imu->len))) 1018 1014 return -EFAULT; 1015 + if (unlikely(len > MAX_RW_COUNT)) 1016 + return -EFAULT; 1017 + return 0; 1018 + } 1019 + 1020 + static int io_import_fixed(int ddir, struct iov_iter *iter, 1021 + struct io_mapped_ubuf *imu, 1022 + u64 buf_addr, size_t len) 1023 + { 1024 + size_t offset; 1025 + int ret; 1026 + 1027 + if (WARN_ON_ONCE(!imu)) 1028 + return -EFAULT; 1029 + ret = validate_fixed_range(buf_addr, len, imu); 1030 + if (unlikely(ret)) 1031 + return ret; 1019 1032 if (!(imu->dir & (1 << ddir))) 1020 1033 return -EFAULT; 1021 1034 ··· 1318 1305 u64 buf_addr = (u64)(uintptr_t)iovec[iov_idx].iov_base; 1319 1306 struct bio_vec *src_bvec; 1320 1307 size_t offset; 1321 - u64 buf_end; 1308 + int ret; 1322 1309 1323 - if (unlikely(check_add_overflow(buf_addr, (u64)iov_len, &buf_end))) 1324 - return -EFAULT; 1325 - if (unlikely(buf_addr < imu->ubuf || buf_end > (imu->ubuf + imu->len))) 1326 - return -EFAULT; 1310 + ret = validate_fixed_range(buf_addr, iov_len, imu); 1311 + if (unlikely(ret)) 1312 + return ret; 1313 + 1327 1314 if (unlikely(!iov_len)) 1328 1315 return -EFAULT; 1329 1316 if (unlikely(check_add_overflow(total_len, iov_len, &total_len))) ··· 1362 1349 return max_segs; 1363 1350 } 1364 1351 1352 + static int io_vec_fill_kern_bvec(int ddir, struct iov_iter *iter, 1353 + struct io_mapped_ubuf *imu, 
1354 + struct iovec *iovec, unsigned nr_iovs, 1355 + struct iou_vec *vec) 1356 + { 1357 + const struct bio_vec *src_bvec = imu->bvec; 1358 + struct bio_vec *res_bvec = vec->bvec; 1359 + unsigned res_idx = 0; 1360 + size_t total_len = 0; 1361 + unsigned iov_idx; 1362 + 1363 + for (iov_idx = 0; iov_idx < nr_iovs; iov_idx++) { 1364 + size_t offset = (size_t)(uintptr_t)iovec[iov_idx].iov_base; 1365 + size_t iov_len = iovec[iov_idx].iov_len; 1366 + struct bvec_iter bi = { 1367 + .bi_size = offset + iov_len, 1368 + }; 1369 + struct bio_vec bv; 1370 + 1371 + bvec_iter_advance(src_bvec, &bi, offset); 1372 + for_each_mp_bvec(bv, src_bvec, bi, bi) 1373 + res_bvec[res_idx++] = bv; 1374 + total_len += iov_len; 1375 + } 1376 + iov_iter_bvec(iter, ddir, res_bvec, res_idx, total_len); 1377 + return 0; 1378 + } 1379 + 1380 + static int iov_kern_bvec_size(const struct iovec *iov, 1381 + const struct io_mapped_ubuf *imu, 1382 + unsigned int *nr_seg) 1383 + { 1384 + size_t offset = (size_t)(uintptr_t)iov->iov_base; 1385 + const struct bio_vec *bvec = imu->bvec; 1386 + int start = 0, i = 0; 1387 + size_t off = 0; 1388 + int ret; 1389 + 1390 + ret = validate_fixed_range(offset, iov->iov_len, imu); 1391 + if (unlikely(ret)) 1392 + return ret; 1393 + 1394 + for (i = 0; off < offset + iov->iov_len && i < imu->nr_bvecs; 1395 + off += bvec[i].bv_len, i++) { 1396 + if (offset >= off && offset < off + bvec[i].bv_len) 1397 + start = i; 1398 + } 1399 + *nr_seg = i - start; 1400 + return 0; 1401 + } 1402 + 1403 + static int io_kern_bvec_size(struct iovec *iov, unsigned nr_iovs, 1404 + struct io_mapped_ubuf *imu, unsigned *nr_segs) 1405 + { 1406 + unsigned max_segs = 0; 1407 + size_t total_len = 0; 1408 + unsigned i; 1409 + int ret; 1410 + 1411 + *nr_segs = 0; 1412 + for (i = 0; i < nr_iovs; i++) { 1413 + if (unlikely(!iov[i].iov_len)) 1414 + return -EFAULT; 1415 + if (unlikely(check_add_overflow(total_len, iov[i].iov_len, 1416 + &total_len))) 1417 + return -EOVERFLOW; 1418 + ret = 
iov_kern_bvec_size(&iov[i], imu, &max_segs); 1419 + if (unlikely(ret)) 1420 + return ret; 1421 + *nr_segs += max_segs; 1422 + } 1423 + if (total_len > MAX_RW_COUNT) 1424 + return -EINVAL; 1425 + return 0; 1426 + } 1427 + 1365 1428 int io_import_reg_vec(int ddir, struct iov_iter *iter, 1366 1429 struct io_kiocb *req, struct iou_vec *vec, 1367 1430 unsigned nr_iovs, unsigned issue_flags) ··· 1452 1363 if (!node) 1453 1364 return -EFAULT; 1454 1365 imu = node->buf; 1455 - if (imu->is_kbuf) 1456 - return -EOPNOTSUPP; 1457 1366 if (!(imu->dir & (1 << ddir))) 1458 1367 return -EFAULT; 1459 1368 1460 1369 iovec_off = vec->nr - nr_iovs; 1461 1370 iov = vec->iovec + iovec_off; 1462 - nr_segs = io_estimate_bvec_size(iov, nr_iovs, imu); 1371 + 1372 + if (imu->is_kbuf) { 1373 + int ret = io_kern_bvec_size(iov, nr_iovs, imu, &nr_segs); 1374 + 1375 + if (unlikely(ret)) 1376 + return ret; 1377 + } else { 1378 + nr_segs = io_estimate_bvec_size(iov, nr_iovs, imu); 1379 + } 1463 1380 1464 1381 if (sizeof(struct bio_vec) > sizeof(struct iovec)) { 1465 1382 size_t bvec_bytes; ··· 1491 1396 iov = vec->iovec + iovec_off; 1492 1397 req->flags |= REQ_F_NEED_CLEANUP; 1493 1398 } 1399 + 1400 + if (imu->is_kbuf) 1401 + return io_vec_fill_kern_bvec(ddir, iter, imu, iov, nr_iovs, vec); 1494 1402 1495 1403 return io_vec_fill_bvec(ddir, iter, imu, iov, nr_iovs, vec); 1496 1404 }
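The range check factored out into validate_fixed_range() and the segment counting loop in iov_kern_bvec_size() can be tried out in userspace with the sketch below. Assumptions: struct mapped_buf, check_fixed_range(), count_segments() and SKETCH_MAX_LEN are hypothetical names; SKETCH_MAX_LEN only approximates MAX_RW_COUNT; and __builtin_add_overflow() (GCC/Clang) stands in for check_add_overflow().

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed cap for illustration; the kernel uses MAX_RW_COUNT. */
#define SKETCH_MAX_LEN ((size_t)INT32_MAX & ~(size_t)4095)

/* Hypothetical stand-in for the ubuf/len pair of struct io_mapped_ubuf. */
struct mapped_buf {
	uint64_t start;
	uint64_t len;
};

/* Reject ranges that wrap, fall outside the mapping, or exceed the cap. */
static int check_fixed_range(uint64_t addr, size_t len,
			     const struct mapped_buf *buf)
{
	uint64_t end;

	if (__builtin_add_overflow(addr, (uint64_t)len, &end))
		return -EFAULT;
	if (addr < buf->start || end > buf->start + buf->len)
		return -EFAULT;
	if (len > SKETCH_MAX_LEN)
		return -EFAULT;
	return 0;
}

/* Count how many segments the byte range [offset, offset + len) touches. */
static unsigned count_segments(const size_t *seg_len, unsigned nr_segs,
			       size_t offset, size_t len)
{
	size_t off = 0;
	unsigned start = 0, i;

	for (i = 0; off < offset + len && i < nr_segs; off += seg_len[i], i++) {
		/* Remember the segment that contains the first byte. */
		if (offset >= off && offset < off + seg_len[i])
			start = i;
	}
	return i - start;
}
```

Sharing one helper between the single-buffer and vectored import paths keeps the overflow, bounds and size checks identical for both, which is the apparent intent of the refactoring in this hunk.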
+12 -10
io_uring/uring_cmd.c
···
 	 * that it doesn't read in per-op data, play it safe and ensure that
 	 * any SQE data is stable beyond prep. This can later get relaxed.
 	 */
-	memcpy(ac->data.sqes, sqe, uring_sqe_size(req->ctx));
-	ioucmd->sqe = ac->data.sqes;
+	memcpy(ac->sqes, sqe, uring_sqe_size(req->ctx));
+	ioucmd->sqe = ac->sqes;
 	return 0;
 }

···
 		struct io_uring_cmd *cmd,
 		unsigned int issue_flags)
 {
+	const struct io_uring_sqe *sqe = cmd->sqe;
 	bool compat = !!(issue_flags & IO_URING_F_COMPAT);
 	int optlen, optname, level, err;
 	void __user *optval;

-	level = READ_ONCE(cmd->sqe->level);
+	level = READ_ONCE(sqe->level);
 	if (level != SOL_SOCKET)
 		return -EOPNOTSUPP;

-	optval = u64_to_user_ptr(READ_ONCE(cmd->sqe->optval));
-	optname = READ_ONCE(cmd->sqe->optname);
-	optlen = READ_ONCE(cmd->sqe->optlen);
+	optval = u64_to_user_ptr(READ_ONCE(sqe->optval));
+	optname = READ_ONCE(sqe->optname);
+	optlen = READ_ONCE(sqe->optlen);

 	err = do_sock_getsockopt(sock, compat, level, optname,
 			USER_SOCKPTR(optval),
···
 		struct io_uring_cmd *cmd,
 		unsigned int issue_flags)
 {
+	const struct io_uring_sqe *sqe = cmd->sqe;
 	bool compat = !!(issue_flags & IO_URING_F_COMPAT);
 	int optname, optlen, level;
 	void __user *optval;
 	sockptr_t optval_s;

-	optval = u64_to_user_ptr(READ_ONCE(cmd->sqe->optval));
-	optname = READ_ONCE(cmd->sqe->optname);
-	optlen = READ_ONCE(cmd->sqe->optlen);
-	level = READ_ONCE(cmd->sqe->level);
+	optval = u64_to_user_ptr(READ_ONCE(sqe->optval));
+	optname = READ_ONCE(sqe->optname);
+	optlen = READ_ONCE(sqe->optlen);
+	level = READ_ONCE(sqe->level);
 	optval_s = USER_SOCKPTR(optval);

 	return do_sock_setsockopt(sock, compat, level, optname, optval_s,
+1
io_uring/uring_cmd.h
···
 struct io_async_cmd {
 	struct io_uring_cmd_data data;
 	struct iou_vec vec;
+	struct io_uring_sqe sqes[2];
 };

 int io_uring_cmd(struct io_kiocb *req, unsigned int issue_flags);
+8
io_uring/zcrx.c
···
 	int ret = 0;

 	len = min_t(size_t, len, desc->count);
+	/*
+	 * __tcp_read_sock() always calls io_zcrx_recv_skb one last time, even
+	 * if desc->count is already 0. This is caused by the if (offset + 1 !=
+	 * skb->len) check. Return early in this case to break out of
+	 * __tcp_read_sock().
+	 */
+	if (!len)
+		return 0;
 	if (unlikely(args->nr_skbs++ > IO_SKBS_PER_CALL_LIMIT))
 		return -EAGAIN;
+5
tools/testing/selftests/ublk/Makefile
···
 LDLIBS += -lpthread -lm -luring

 TEST_PROGS := test_generic_01.sh
+TEST_PROGS += test_generic_02.sh
+TEST_PROGS += test_generic_03.sh

 TEST_PROGS += test_null_01.sh
 TEST_PROGS += test_null_02.sh
···
 TEST_PROGS += test_loop_02.sh
 TEST_PROGS += test_loop_03.sh
 TEST_PROGS += test_loop_04.sh
+TEST_PROGS += test_loop_05.sh
 TEST_PROGS += test_stripe_01.sh
 TEST_PROGS += test_stripe_02.sh
+TEST_PROGS += test_stripe_03.sh
+TEST_PROGS += test_stripe_04.sh

 TEST_PROGS += test_stress_01.sh
 TEST_PROGS += test_stress_02.sh
+4 -4
tools/testing/selftests/ublk/kublk.c
···
 static int ublk_ctrl_stop_dev(struct ublk_dev *dev)
 {
 	struct ublk_ctrl_cmd_data data = {
-		.cmd_op = UBLK_CMD_STOP_DEV,
+		.cmd_op = UBLK_U_CMD_STOP_DEV,
 	};

 	return __ublk_ctrl_cmd(dev, &data);
···
 		struct ublk_params *params)
 {
 	struct ublk_ctrl_cmd_data data = {
-		.cmd_op = UBLK_CMD_GET_PARAMS,
+		.cmd_op = UBLK_U_CMD_GET_PARAMS,
 		.flags = CTRL_CMD_HAS_BUF,
 		.addr = (__u64)params,
 		.len = sizeof(*params),
···

 	ret = ublk_ctrl_get_params(dev, &p);
 	if (ret < 0) {
-		ublk_err("failed to get params %m\n");
+		ublk_err("failed to get params %d %s\n", ret, strerror(-ret));
 		return;
 	}

···

 	cmd_buf_size = ublk_queue_cmd_buf_sz(q);
 	off = UBLKSRV_CMD_BUF_OFFSET + q->q_id * ublk_queue_max_cmd_buf_sz();
-	q->io_cmd_buf = (char *)mmap(0, cmd_buf_size, PROT_READ,
+	q->io_cmd_buf = mmap(0, cmd_buf_size, PROT_READ,
 			MAP_SHARED | MAP_POPULATE, dev->fds[0], off);
 	if (q->io_cmd_buf == MAP_FAILED) {
 		ublk_err("ublk dev %d queue %d map io_cmd_buf failed %m\n",
+2 -2
tools/testing/selftests/ublk/kublk.h
···
 	unsigned int io_inflight;
 	struct ublk_dev *dev;
 	const struct ublk_tgt_ops *tgt_ops;
-	char *io_cmd_buf;
+	struct ublksrv_io_desc *io_cmd_buf;
 	struct io_uring ring;
 	struct ublk_io ios[UBLK_QUEUE_DEPTH];
 #define UBLKSRV_QUEUE_STOPPING (1U << 0)
···

 static inline const struct ublksrv_io_desc *ublk_get_iod(const struct ublk_queue *q, int tag)
 {
-	return (struct ublksrv_io_desc *)&(q->io_cmd_buf[tag * sizeof(struct ublksrv_io_desc)]);
+	return &q->io_cmd_buf[tag];
 }

 static inline void ublk_set_sqe_cmd_op(struct io_uring_sqe *sqe, __u32 cmd_op)
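The ublk_get_iod() change swaps manual byte-offset arithmetic on a char pointer for plain array indexing on a typed pointer. The sketch below illustrates why the two forms yield the same address. struct io_desc and both helper names are hypothetical; the real struct ublksrv_io_desc layout is defined by the ublk UAPI header and is not reproduced here.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor used only to give the array an element size. */
struct io_desc {
	uint32_t op_flags;
	uint32_t nr_sectors;
	uint64_t start_sector;
	uint64_t addr;
};

/* Old style: scale the tag by hand and cast the resulting byte address. */
static const struct io_desc *desc_by_bytes(const char *buf, int tag)
{
	return (const struct io_desc *)&buf[(size_t)tag * sizeof(struct io_desc)];
}

/* New style: let the compiler scale the index by the element size. */
static const struct io_desc *desc_by_index(const struct io_desc *buf, int tag)
{
	return &buf[tag];
}
```

The typed version drops both the cast and the explicit sizeof(), so the element size can no longer drift out of sync with the pointer type.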
+10 -1
tools/testing/selftests/ublk/null.c
···

 	dev->tgt.dev_size = dev_size;
 	dev->tgt.params = (struct ublk_params) {
-		.types = UBLK_PARAM_TYPE_BASIC,
+		.types = UBLK_PARAM_TYPE_BASIC | UBLK_PARAM_TYPE_DMA_ALIGN |
+			UBLK_PARAM_TYPE_SEGMENT,
 		.basic = {
 			.logical_bs_shift = 9,
 			.physical_bs_shift = 12,
···
 			.io_min_shift = 9,
 			.max_sectors = info->max_io_buf_bytes >> 9,
 			.dev_sectors = dev_size >> 9,
+		},
+		.dma = {
+			.alignment = 4095,
+		},
+		.seg = {
+			.seg_boundary_mask = 4095,
+			.max_segment_size = 32 << 10,
+			.max_segments = 32,
 		},
 	};
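The new .dma and .seg parameters are expressed as masks (4095 for a 4 KiB granule). As a rough illustration of what such masks constrain, the sketch below checks a buffer against an alignment mask and a segment boundary mask. Both helpers are hypothetical and assume the conventional "size minus one" mask encoding; they are not the block layer's own checks.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when both the start address and the length honor the alignment mask. */
static bool dma_aligned(uint64_t addr, uint64_t len, uint64_t align_mask)
{
	return ((addr | len) & align_mask) == 0;
}

/*
 * True when [addr, addr + len) stays inside one boundary window.
 * len must be non-zero.
 */
static bool within_seg_boundary(uint64_t addr, uint64_t len,
				uint64_t boundary_mask)
{
	return (addr | boundary_mask) == ((addr + len - 1) | boundary_mask);
}
```

With a mask of 4095, a 4096-byte buffer starting at 8192 passes both checks, while a 200-byte range starting at 4000 crosses a boundary.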
+52 -17
tools/testing/selftests/ublk/stripe.c
··· 111 111 } 112 112 } 113 113 114 - static inline enum io_uring_op stripe_to_uring_op(const struct ublksrv_io_desc *iod) 114 + static inline enum io_uring_op stripe_to_uring_op( 115 + const struct ublksrv_io_desc *iod, int zc) 115 116 { 116 117 unsigned ublk_op = ublksrv_get_op(iod); 117 118 118 119 if (ublk_op == UBLK_IO_OP_READ) 119 - return IORING_OP_READV; 120 + return zc ? IORING_OP_READV_FIXED : IORING_OP_READV; 120 121 else if (ublk_op == UBLK_IO_OP_WRITE) 121 - return IORING_OP_WRITEV; 122 + return zc ? IORING_OP_WRITEV_FIXED : IORING_OP_WRITEV; 122 123 assert(0); 123 124 } 124 125 125 126 static int stripe_queue_tgt_rw_io(struct ublk_queue *q, const struct ublksrv_io_desc *iod, int tag) 126 127 { 127 128 const struct stripe_conf *conf = get_chunk_shift(q); 128 - enum io_uring_op op = stripe_to_uring_op(iod); 129 + int zc = !!(ublk_queue_use_zc(q) != 0); 130 + enum io_uring_op op = stripe_to_uring_op(iod, zc); 129 131 struct io_uring_sqe *sqe[NR_STRIPE]; 130 132 struct stripe_array *s = alloc_stripe_array(conf, iod); 131 133 struct ublk_io *io = ublk_get_io(q, tag); 132 - int i; 134 + int i, extra = zc ? 
2 : 0; 133 135 134 136 io->private_data = s; 135 137 calculate_stripe_array(conf, iod, s); 136 138 137 - ublk_queue_alloc_sqes(q, sqe, s->nr); 138 - for (i = 0; i < s->nr; i++) { 139 - struct stripe *t = &s->s[i]; 139 + ublk_queue_alloc_sqes(q, sqe, s->nr + extra); 140 + 141 + if (zc) { 142 + io_uring_prep_buf_register(sqe[0], 0, tag, q->q_id, tag); 143 + sqe[0]->flags |= IOSQE_CQE_SKIP_SUCCESS | IOSQE_IO_HARDLINK; 144 + sqe[0]->user_data = build_user_data(tag, 145 + ublk_cmd_op_nr(sqe[0]->cmd_op), 0, 1); 146 + } 147 + 148 + for (i = zc; i < s->nr + extra - zc; i++) { 149 + struct stripe *t = &s->s[i - zc]; 140 150 141 151 io_uring_prep_rw(op, sqe[i], 142 152 t->seq + 1, 143 153 (void *)t->vec, 144 154 t->nr_vec, 145 155 t->start << 9); 146 - io_uring_sqe_set_flags(sqe[i], IOSQE_FIXED_FILE); 156 + if (zc) { 157 + sqe[i]->buf_index = tag; 158 + io_uring_sqe_set_flags(sqe[i], 159 + IOSQE_FIXED_FILE | IOSQE_IO_HARDLINK); 160 + } else { 161 + io_uring_sqe_set_flags(sqe[i], IOSQE_FIXED_FILE); 162 + } 147 163 /* bit63 marks us as tgt io */ 148 - sqe[i]->user_data = build_user_data(tag, ublksrv_get_op(iod), i, 1); 164 + sqe[i]->user_data = build_user_data(tag, ublksrv_get_op(iod), i - zc, 1); 149 165 } 150 - return s->nr; 166 + if (zc) { 167 + struct io_uring_sqe *unreg = sqe[s->nr + 1]; 168 + 169 + io_uring_prep_buf_unregister(unreg, 0, tag, q->q_id, tag); 170 + unreg->user_data = build_user_data(tag, ublk_cmd_op_nr(unreg->cmd_op), 0, 1); 171 + } 172 + 173 + /* register buffer is skip_success */ 174 + return s->nr + zc; 151 175 } 152 176 153 177 static int handle_flush(struct ublk_queue *q, const struct ublksrv_io_desc *iod, int tag) ··· 232 208 struct ublk_io *io = ublk_get_io(q, tag); 233 209 int res = cqe->res; 234 210 235 - if (res < 0) { 211 + if (res < 0 || op != ublk_cmd_op_nr(UBLK_U_IO_UNREGISTER_IO_BUF)) { 236 212 if (!io->result) 237 213 io->result = res; 238 - ublk_err("%s: io failure %d tag %u\n", __func__, res, tag); 214 + if (res < 0) 215 + ublk_err("%s: io 
failure %d tag %u\n", __func__, res, tag); 239 216 } 217 + 218 + /* buffer register op is IOSQE_CQE_SKIP_SUCCESS */ 219 + if (op == ublk_cmd_op_nr(UBLK_U_IO_REGISTER_IO_BUF)) 220 + io->tgt_ios += 1; 240 221 241 222 /* fail short READ/WRITE simply */ 242 223 if (op == UBLK_IO_OP_READ || op == UBLK_IO_OP_WRITE) { 243 224 unsigned seq = user_data_to_tgt_data(cqe->user_data); 244 225 struct stripe_array *s = io->private_data; 245 226 246 - if (res < s->s[seq].vec->iov_len) 227 + if (res < s->s[seq].nr_sects << 9) { 247 228 io->result = -EIO; 229 + ublk_err("%s: short rw op %u res %d exp %u tag %u\n", 230 + __func__, op, res, s->s[seq].vec->iov_len, tag); 231 + } 248 232 } 249 233 250 234 if (ublk_completed_tgt_io(q, tag)) { ··· 285 253 struct stripe_conf *conf; 286 254 unsigned chunk_shift; 287 255 loff_t bytes = 0; 288 - int ret, i; 256 + int ret, i, mul = 1; 289 257 290 258 if ((chunk_size & (chunk_size - 1)) || !chunk_size) { 291 259 ublk_err("invalid chunk size %u\n", chunk_size); ··· 327 295 dev->tgt.dev_size = bytes; 328 296 p.basic.dev_sectors = bytes >> 9; 329 297 dev->tgt.params = p; 330 - dev->tgt.sq_depth = dev->dev_info.queue_depth * conf->nr_files; 331 - dev->tgt.cq_depth = dev->dev_info.queue_depth * conf->nr_files; 298 + 299 + if (dev->dev_info.flags & UBLK_F_SUPPORT_ZERO_COPY) 300 + mul = 2; 301 + dev->tgt.sq_depth = mul * dev->dev_info.queue_depth * conf->nr_files; 302 + dev->tgt.cq_depth = mul * dev->dev_info.queue_depth * conf->nr_files; 332 303 333 304 printf("%s: shift %u files %u\n", __func__, conf->shift, conf->nr_files); 334 305
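The zero-copy path in stripe_queue_tgt_rw_io() brackets the read/write SQEs with a buffer-register and a buffer-unregister command, which changes both the number of SQEs allocated and the number of completions expected. The bookkeeping can be summarized with the sketch below; struct sqe_plan and plan_stripe_sqes() are hypothetical names, and only the success path is modeled (the register SQE carries IOSQE_CQE_SKIP_SUCCESS, so it posts no completion unless it fails).

```c
#include <stdbool.h>

/* Hypothetical summary of how one stripe request lays out its SQEs. */
struct sqe_plan {
	unsigned allocated;     /* SQEs requested from the ring */
	unsigned first_rw;      /* index of the first read/write SQE */
	unsigned unreg_slot;    /* index of the unregister SQE, zc only */
	unsigned expected_cqes; /* completions the target waits for */
};

static struct sqe_plan plan_stripe_sqes(unsigned nr_stripes, bool zc)
{
	unsigned extra = zc ? 2 : 0; /* register + unregister */
	struct sqe_plan p = {
		.allocated = nr_stripes + extra,
		.first_rw = zc ? 1 : 0,
		.unreg_slot = zc ? nr_stripes + 1 : 0,
		/* The register SQE is skipped on success, so count only one extra. */
		.expected_cqes = nr_stripes + (zc ? 1 : 0),
	};

	return p;
}
```

This also suggests why the target doubles sq_depth and cq_depth when UBLK_F_SUPPORT_ZERO_COPY is set: each request may need two SQEs beyond its per-file I/O.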
+6
tools/testing/selftests/ublk/test_common.sh
···
 	echo $(( (major & 0xfff) << 20 | (minor & 0xfffff) ))
 }

+_run_fio_verify_io() {
+	fio --name=verify --rw=randwrite --direct=1 --ioengine=libaio \
+		--bs=8k --iodepth=32 --verify=crc32c --do_verify=1 \
+		--verify_state_save=0 "$@" > /dev/null
+}
+
 _create_backfile() {
 	local my_size=$1
 	local my_file
+44
tools/testing/selftests/ublk/test_generic_02.sh
···
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+
+. "$(cd "$(dirname "$0")" && pwd)"/test_common.sh
+
+TID="generic_02"
+ERR_CODE=0
+
+if ! _have_program bpftrace; then
+	exit "$UBLK_SKIP_CODE"
+fi
+
+_prep_test "null" "sequential io order for MQ"
+
+dev_id=$(_add_ublk_dev -t null -q 2)
+_check_add_dev $TID $?
+
+dev_t=$(_get_disk_dev_t "$dev_id")
+bpftrace trace/seq_io.bt "$dev_t" "W" 1 > "$UBLK_TMP" 2>&1 &
+btrace_pid=$!
+sleep 2
+
+if ! kill -0 "$btrace_pid" > /dev/null 2>&1; then
+	_cleanup_test "null"
+	exit "$UBLK_SKIP_CODE"
+fi
+
+# run fio over this ublk disk
+fio --name=write_seq \
+    --filename=/dev/ublkb"${dev_id}" \
+    --ioengine=libaio --iodepth=16 \
+    --rw=write \
+    --size=512M \
+    --direct=1 \
+    --bs=4k > /dev/null 2>&1
+ERR_CODE=$?
+kill "$btrace_pid"
+wait
+if grep -q "io_out_of_order" "$UBLK_TMP"; then
+	cat "$UBLK_TMP"
+	ERR_CODE=255
+fi
+_cleanup_test "null"
+_show_result $TID $ERR_CODE
+28
tools/testing/selftests/ublk/test_generic_03.sh
···
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+
+. "$(cd "$(dirname "$0")" && pwd)"/test_common.sh
+
+TID="generic_03"
+ERR_CODE=0
+
+_prep_test "null" "check dma & segment limits for zero copy"
+
+dev_id=$(_add_ublk_dev -t null -z)
+_check_add_dev $TID $?
+
+sysfs_path=/sys/block/ublkb"${dev_id}"
+dma_align=$(cat "$sysfs_path"/queue/dma_alignment)
+max_segments=$(cat "$sysfs_path"/queue/max_segments)
+max_segment_size=$(cat "$sysfs_path"/queue/max_segment_size)
+if [ "$dma_align" != "4095" ]; then
+	ERR_CODE=255
+fi
+if [ "$max_segments" != "32" ]; then
+	ERR_CODE=255
+fi
+if [ "$max_segment_size" != "32768" ]; then
+	ERR_CODE=255
+fi
+_cleanup_test "null"
+_show_result $TID $ERR_CODE
+5 -9
tools/testing/selftests/ublk/test_loop_01.sh
···
 TID="loop_01"
 ERR_CODE=0

+if ! _have_program fio; then
+	exit "$UBLK_SKIP_CODE"
+fi
+
 _prep_test "loop" "write and verify test"

 backfile_0=$(_create_backfile 256M)
···
 _check_add_dev $TID $? "${backfile_0}"

 # run fio over the ublk disk
-fio --name=write_and_verify \
-    --filename=/dev/ublkb"${dev_id}" \
-    --ioengine=libaio --iodepth=16 \
-    --rw=write \
-    --size=256M \
-    --direct=1 \
-    --verify=crc32c \
-    --do_verify=1 \
-    --bs=4k > /dev/null 2>&1
+_run_fio_verify_io --filename=/dev/ublkb"${dev_id}" --size=256M
 ERR_CODE=$?

 _cleanup_test "loop"
+5 -9
tools/testing/selftests/ublk/test_loop_03.sh
···
 TID="loop_03"
 ERR_CODE=0

+if ! _have_program fio; then
+	exit "$UBLK_SKIP_CODE"
+fi
+
 _prep_test "loop" "write and verify over zero copy"

 backfile_0=$(_create_backfile 256M)
···
 _check_add_dev $TID $? "$backfile_0"

 # run fio over the ublk disk
-fio --name=write_and_verify \
-    --filename=/dev/ublkb"${dev_id}" \
-    --ioengine=libaio --iodepth=64 \
-    --rw=write \
-    --size=256M \
-    --direct=1 \
-    --verify=crc32c \
-    --do_verify=1 \
-    --bs=4k > /dev/null 2>&1
+_run_fio_verify_io --filename=/dev/ublkb"${dev_id}" --size=256M
 ERR_CODE=$?

 _cleanup_test "loop"
+28
tools/testing/selftests/ublk/test_loop_05.sh
···
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+
+. "$(cd "$(dirname "$0")" && pwd)"/test_common.sh
+
+TID="loop_05"
+ERR_CODE=0
+
+if ! _have_program fio; then
+	exit "$UBLK_SKIP_CODE"
+fi
+
+_prep_test "loop" "write and verify test"
+
+backfile_0=$(_create_backfile 256M)
+
+dev_id=$(_add_ublk_dev -q 2 -t loop "$backfile_0")
+_check_add_dev $TID $? "${backfile_0}"
+
+# run fio over the ublk disk
+_run_fio_verify_io --filename=/dev/ublkb"${dev_id}" --size=256M
+ERR_CODE=$?
+
+_cleanup_test "loop"
+
+_remove_backfile "$backfile_0"
+
+_show_result $TID $ERR_CODE
+3 -3
tools/testing/selftests/ublk/test_stress_01.sh
···

 _prep_test "stress" "run IO and remove device"

-ublk_io_and_remove 8G -t null
+ublk_io_and_remove 8G -t null -q 4
 ERR_CODE=$?
 if [ ${ERR_CODE} -ne 0 ]; then
 	_show_result $TID $ERR_CODE
 fi

 BACK_FILE=$(_create_backfile 256M)
-ublk_io_and_remove 256M -t loop "${BACK_FILE}"
+ublk_io_and_remove 256M -t loop -q 4 "${BACK_FILE}"
 ERR_CODE=$?
 if [ ${ERR_CODE} -ne 0 ]; then
 	_show_result $TID $ERR_CODE
 fi

-ublk_io_and_remove 256M -t loop -z "${BACK_FILE}"
+ublk_io_and_remove 256M -t loop -q 4 -z "${BACK_FILE}"
 ERR_CODE=$?
 _cleanup_test "stress"
 _remove_backfile "${BACK_FILE}"
+3 -3
tools/testing/selftests/ublk/test_stress_02.sh
···

 _prep_test "stress" "run IO and kill ublk server"

-ublk_io_and_kill_daemon 8G -t null
+ublk_io_and_kill_daemon 8G -t null -q 4
 ERR_CODE=$?
 if [ ${ERR_CODE} -ne 0 ]; then
 	_show_result $TID $ERR_CODE
 fi

 BACK_FILE=$(_create_backfile 256M)
-ublk_io_and_kill_daemon 256M -t loop "${BACK_FILE}"
+ublk_io_and_kill_daemon 256M -t loop -q 4 "${BACK_FILE}"
 ERR_CODE=$?
 if [ ${ERR_CODE} -ne 0 ]; then
 	_show_result $TID $ERR_CODE
 fi

-ublk_io_and_kill_daemon 256M -t loop -z "${BACK_FILE}"
+ublk_io_and_kill_daemon 256M -t loop -q 4 -z "${BACK_FILE}"
 ERR_CODE=$?
 _cleanup_test "stress"
 _remove_backfile "${BACK_FILE}"
+5 -9
tools/testing/selftests/ublk/test_stripe_01.sh
···
 TID="stripe_01"
 ERR_CODE=0

+if ! _have_program fio; then
+	exit "$UBLK_SKIP_CODE"
+fi
+
 _prep_test "stripe" "write and verify test"

 backfile_0=$(_create_backfile 256M)
···
 _check_add_dev $TID $? "${backfile_0}"

 # run fio over the ublk disk
-fio --name=write_and_verify \
-    --filename=/dev/ublkb"${dev_id}" \
-    --ioengine=libaio --iodepth=32 \
-    --rw=write \
-    --size=512M \
-    --direct=1 \
-    --verify=crc32c \
-    --do_verify=1 \
-    --bs=4k > /dev/null 2>&1
+_run_fio_verify_io --filename=/dev/ublkb"${dev_id}" --size=512M
 ERR_CODE=$?

 _cleanup_test "stripe"
+30
tools/testing/selftests/ublk/test_stripe_03.sh
···
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+
+. "$(cd "$(dirname "$0")" && pwd)"/test_common.sh
+
+TID="stripe_03"
+ERR_CODE=0
+
+if ! _have_program fio; then
+	exit "$UBLK_SKIP_CODE"
+fi
+
+_prep_test "stripe" "write and verify test"
+
+backfile_0=$(_create_backfile 256M)
+backfile_1=$(_create_backfile 256M)
+
+dev_id=$(_add_ublk_dev -q 2 -t stripe "$backfile_0" "$backfile_1")
+_check_add_dev $TID $? "${backfile_0}"
+
+# run fio over the ublk disk
+_run_fio_verify_io --filename=/dev/ublkb"${dev_id}" --size=512M
+ERR_CODE=$?
+
+_cleanup_test "stripe"
+
+_remove_backfile "$backfile_0"
+_remove_backfile "$backfile_1"
+
+_show_result $TID $ERR_CODE