Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

block: use bvec iterator helper for bio_may_need_split()

bio_may_need_split() uses bi_vcnt to determine if a bio has a single
segment, but bi_vcnt is unreliable for cloned bios. Cloned bios share
the parent's bi_io_vec array but iterate over a subset via bi_iter,
so bi_vcnt may not reflect the actual segment count being iterated.

Replace the bi_vcnt check with bvec iterator access via
__bvec_iter_bvec(), comparing bi_iter.bi_size against the current
bvec's length. This correctly handles both cloned and non-cloned bios.
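
The difference between the two checks can be sketched in userspace C. This is a minimal model, not kernel code: the structs are reduced to the fields the check touches, `make_clone()` is a hypothetical helper, and the model assumes a clone's bi_vcnt is left at 0. The `&bi_io_vec[bi_idx]` expression stands in for `__bvec_iter_bvec()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace stand-ins; the real kernel structs have many more fields. */
struct bio_vec { unsigned int bv_len, bv_offset; };
struct bvec_iter { unsigned int bi_size, bi_idx; };
struct bio {
	struct bio_vec *bi_io_vec;
	struct bvec_iter bi_iter;
	unsigned short bi_vcnt;
};
struct queue_limits { unsigned int chunk_sectors, max_fast_segment_size; };

/* Pre-patch logic: trusts bi_vcnt and always inspects the first bvec. */
static bool may_need_split_old(const struct bio *bio,
			       const struct queue_limits *lim)
{
	if (lim->chunk_sectors)
		return true;
	if (bio->bi_vcnt != 1)
		return true;
	return bio->bi_io_vec->bv_len + bio->bi_io_vec->bv_offset >
		lim->max_fast_segment_size;
}

/* Post-patch logic: inspects the bvec that bi_iter currently points at. */
static bool may_need_split_new(const struct bio *bio,
			       const struct queue_limits *lim)
{
	const struct bio_vec *bv;

	if (lim->chunk_sectors)
		return true;
	if (!bio->bi_io_vec)
		return true;

	/* Stand-in for __bvec_iter_bvec(bio->bi_io_vec, bio->bi_iter). */
	bv = &bio->bi_io_vec[bio->bi_iter.bi_idx];
	if (bio->bi_iter.bi_size > bv->bv_len)
		return true;
	return bv->bv_len + bv->bv_offset > lim->max_fast_segment_size;
}

/*
 * Hypothetical helper: a clone shares the parent's vec array and covers
 * `size` bytes starting at vecs[idx]. ASSUMPTION: its bi_vcnt stays 0.
 */
static struct bio make_clone(struct bio_vec *vecs, unsigned int idx,
			     unsigned int size)
{
	struct bio clone = {
		.bi_io_vec = vecs,
		.bi_iter = { .bi_size = size, .bi_idx = idx },
		.bi_vcnt = 0,
	};
	return clone;
}
```

In this model, a clone covering only the second of two 4096-byte bvecs is flagged by the old check purely because of bi_vcnt, while the new check sees a single segment that fits within a max_fast_segment_size of 4096. A bio whose bi_size spans both bvecs is still reported as possibly needing a split.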

Move bi_io_vec into the first cache line adjacent to bi_iter. This is
a sensible layout since bi_io_vec and bi_iter are commonly accessed
together throughout the block layer - every bvec iteration requires
both fields. This displaces bi_end_io to the second cache line, which
is acceptable since bi_end_io and bi_private are always fetched
together in bio_endio() anyway.
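
One way to reason about such a layout is to compare field offsets against the cache line size. The sketch below uses an invented struct, `toy_bio_layout`, whose fields only loosely resemble struct bio, and it assumes a 64-byte cache line; the real offsets depend on the kernel configuration and architecture and would normally be inspected with a tool such as pahole.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_CACHE_LINE 64	/* ASSUMPTION: 64-byte cache lines */

/* Invented iterator type, loosely modelled on struct bvec_iter. */
struct toy_iter {
	unsigned long long bi_sector;
	unsigned int bi_size, bi_idx, bi_bvec_done;
};

/* Invented layout for illustration only; NOT the real struct bio. */
struct toy_bio_layout {
	void *bi_next;
	void *bi_bdev;
	unsigned int bi_opf;
	unsigned short bi_flags;
	unsigned short bi_ioprio;
	int bi_status;
	int bi_remaining;
	void *bi_io_vec;		/* placed next to bi_iter */
	struct toy_iter bi_iter;
	void *bi_end_io;
	void *bi_private;
};

/* True if two byte offsets fall within the same cache line. */
static bool same_cache_line(size_t off_a, size_t off_b)
{
	return off_a / TOY_CACHE_LINE == off_b / TOY_CACHE_LINE;
}
```

A check such as `same_cache_line(offsetof(struct toy_bio_layout, bi_io_vec), offsetof(struct toy_bio_layout, bi_iter))` then expresses the property the paragraph above describes, namely that the two fields start in the same line.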

The struct layout change requires bio_reset() to preserve and restore
bi_io_vec across the memset, since it now falls within BIO_RESET_BYTES.
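
The save-and-restore pattern can be illustrated with a toy struct. This is a sketch under stated assumptions: `toy_bio`, `TOY_RESET_BYTES`, and `toy_bio_reset()` are hypothetical names, and the reset region is defined as everything before bi_max_vecs purely for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct bio_vec { unsigned int bv_len, bv_offset; };

/* Toy struct: fields before bi_max_vecs are wiped on reset. */
struct toy_bio {
	unsigned int bi_opf;
	struct bio_vec *bi_io_vec;	/* now inside the reset region */
	unsigned int bi_size;
	unsigned short bi_max_vecs;	/* first field that survives a reset */
};

/* Hypothetical analogue of BIO_RESET_BYTES for the toy struct. */
#define TOY_RESET_BYTES offsetof(struct toy_bio, bi_max_vecs)

static void toy_bio_reset(struct toy_bio *bio, unsigned int opf)
{
	/* Save the pointer before the memset clears it. */
	struct bio_vec *bv = bio->bi_io_vec;

	memset(bio, 0, TOY_RESET_BYTES);
	bio->bi_io_vec = bv;	/* restore: the vec array still belongs to this bio */
	bio->bi_opf = opf;
}
```

Without the save and restore, the memset would leave bi_io_vec NULL even though bi_max_vecs still claims the bio owns a vec array.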

Nitesh verified that this patch doesn't regress NVMe 512-byte IO perf [1].

Link: https://lore.kernel.org/linux-block/20251220081607.tvnrltcngl3cc2fh@green245.gost/ [1]
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Nitesh Shetty <nj.shetty@samsung.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

Authored by Ming Lei and committed by Jens Axboe
ee623c89 69d26698

+14 -5
+3
block/bio.c
···
  */
 void bio_reset(struct bio *bio, struct block_device *bdev, blk_opf_t opf)
 {
+	struct bio_vec *bv = bio->bi_io_vec;
+
 	bio_uninit(bio);
 	memset(bio, 0, BIO_RESET_BYTES);
 	atomic_set(&bio->__bi_remaining, 1);
+	bio->bi_io_vec = bv;
 	bio->bi_bdev = bdev;
 	if (bio->bi_bdev)
 		bio_associate_blkg(bio);
+9 -3
block/blk.h
···
 static inline bool bio_may_need_split(struct bio *bio,
 		const struct queue_limits *lim)
 {
+	const struct bio_vec *bv;
+
 	if (lim->chunk_sectors)
 		return true;
-	if (bio->bi_vcnt != 1)
+
+	if (!bio->bi_io_vec)
 		return true;
-	return bio->bi_io_vec->bv_len + bio->bi_io_vec->bv_offset >
-		lim->max_fast_segment_size;
+
+	bv = __bvec_iter_bvec(bio->bi_io_vec, bio->bi_iter);
+	if (bio->bi_iter.bi_size > bv->bv_len)
+		return true;
+	return bv->bv_len + bv->bv_offset > lim->max_fast_segment_size;
 }

 /**
+2 -2
include/linux/blk_types.h
···

 	atomic_t		__bi_remaining;

+	/* The actual vec list, preserved by bio_reset() */
+	struct bio_vec		*bi_io_vec;
 	struct bvec_iter	bi_iter;

 	union {
···
 	unsigned short		bi_max_vecs;	/* max bvl_vecs we can hold */

 	atomic_t		__bi_cnt;	/* pin count */
-
-	struct bio_vec		*bi_io_vec;	/* the actual vec list */

 	struct bio_set		*bi_pool;
 };