Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

block, bfq: split sync bfq_queues on a per-actuator basis

Single-LUN multi-actuator SCSI drives, as well as all multi-actuator
SATA drives, appear as a single device to the I/O subsystem [1]. Yet
they address commands to different actuators internally, as a function
of Logical Block Addresses (LBAs). A given sector is reachable by
only one of the actuators. For example, Seagate’s Serial Advanced
Technology Attachment (SATA) version contains two actuators and maps
the lower half of the SATA LBA space to the lower actuator and the
upper half to the upper actuator.
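
To make the LBA-to-actuator mapping concrete, here is a minimal sketch, assuming the LBA space is split into equal contiguous ranges, one per actuator. The helper `lba_to_actuator()` and its parameters are hypothetical names chosen for illustration; they are not part of this patch or of any kernel API, and a real drive may use a different layout.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not kernel code): return the index of the
 * actuator serving @lba, assuming @capacity sectors are divided into
 * @num_actuators contiguous ranges of (almost) equal size.
 */
static unsigned int lba_to_actuator(uint64_t lba, uint64_t capacity,
				    unsigned int num_actuators)
{
	uint64_t range_size;

	assert(num_actuators > 0);
	assert(capacity >= num_actuators);
	assert(lba < capacity);

	/*
	 * Round the range size up, so that every valid LBA maps to an
	 * index below num_actuators; the last range may be shorter.
	 */
	range_size = (capacity + num_actuators - 1) / num_actuators;

	return (unsigned int)(lba / range_size);
}
```

With two actuators and 1000 sectors, sectors 0..499 map to index 0 and sectors 500..999 map to index 1, which matches the lower-half/upper-half split described above.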

Evidently, to fully utilize actuators, no actuator must be left idle
or underutilized while there is pending I/O for it. The block layer
must somehow control the load of each actuator individually. This
commit lays the ground for allowing BFQ to provide such a per-actuator
control.

BFQ associates an I/O-request sync bfq_queue with each process doing
synchronous I/O, or with a group of processes, in case of queue
merging. Then BFQ serves one bfq_queue at a time. While in service, a
bfq_queue is emptied in request-position order. Yet the same process,
or group of processes, may generate I/O for different actuators. In
this case, different streams of I/O (each for a different actuator)
get all inserted into the same sync bfq_queue. So there is basically
no individual control on when each stream is served, i.e., on when the
I/O requests of the stream are picked from the bfq_queue and
dispatched to the drive.

This commit enables BFQ to control the service of each actuator
individually for synchronous I/O, by simply splitting each sync
bfq_queue into N queues, one for each actuator. In other words, a sync
bfq_queue is now associated with a pair (process, actuator). As a
consequence of this split, the per-queue proportional-share policy
implemented by BFQ will guarantee that the sync I/O generated for each
actuator, by each process, receives its fair share of service.
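
The split can be pictured as a small matrix of queue pointers per process, indexed by sync/async and by actuator. The following is a simplified sketch of that idea, not the kernel code: `struct queue`, `struct io_ctx`, `ctx_to_queue()` and `ctx_set_queue()` are hypothetical stand-ins for `struct bfq_queue`, `struct bfq_io_cq`, `bic_to_bfqq()` and `bic_set_bfqq()`, with all scheduler state omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_ACTUATORS 8	/* same value as BFQ_MAX_ACTUATORS in this patch */

/* Simplified stand-in for struct bfq_queue. */
struct queue {
	unsigned int actuator_idx;
};

/* Simplified stand-in for struct bfq_io_cq: one instance per process. */
struct io_ctx {
	/* Row 0: async queues, row 1: sync queues; one column per actuator. */
	struct queue *q[2][MAX_ACTUATORS];
};

/* Look up the queue for the pair (process, actuator). */
static struct queue *ctx_to_queue(const struct io_ctx *ctx, bool is_sync,
				  unsigned int act_idx)
{
	assert(act_idx < MAX_ACTUATORS);
	return ctx->q[is_sync ? 1 : 0][act_idx];
}

/* Attach (or detach, if @q is NULL) the queue for a given actuator. */
static void ctx_set_queue(struct io_ctx *ctx, struct queue *q, bool is_sync,
			  unsigned int act_idx)
{
	assert(act_idx < MAX_ACTUATORS);
	ctx->q[is_sync ? 1 : 0][act_idx] = q;
	if (q)
		q->actuator_idx = act_idx;
}
```

Because each column holds its own queue, a per-queue scheduling policy applied to these queues automatically becomes a per-(process, actuator) policy.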

This is just a preparatory patch. If the I/O of the same process
happens to be sent to different queues, then each of these queues may
undergo queue merging. To handle this event, the bfq_io_cq data
structure must be properly extended. In addition, stable merging must
be disabled to avoid loss of control on individual actuators. Finally,
async queues must be split too. These issues are described in detail
and addressed in the next commits. As for this commit, although multiple
per-process bfq_queues are provided, the I/O of each process or group
of processes is still sent to only one queue, regardless of the
actuator the I/O is for. The forwarding to distinct bfq_queues will be
enabled after addressing the above issues.
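
The effect of keeping the forwarding disabled can be sketched as follows. This is an illustration under stated assumptions, not kernel code: `actuator_index_stub()` mimics the behavior of `bfq_actuator_index()` in this commit (it always returns 0), while `route_requests()` is a hypothetical helper that only counts how many requests would land in each per-actuator queue.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ACTUATORS 8	/* same value as BFQ_MAX_ACTUATORS in this patch */

/*
 * Placeholder index function: like bfq_actuator_index() in this
 * commit, it ignores its argument and always selects actuator 0.
 */
static unsigned int actuator_index_stub(uint64_t sector)
{
	(void)sector;
	return 0;
}

/* Hypothetical helper: count the requests routed to each actuator. */
static void route_requests(const uint64_t *sectors, size_t n,
			   unsigned int hits[MAX_ACTUATORS])
{
	size_t i;

	for (i = 0; i < MAX_ACTUATORS; i++)
		hits[i] = 0;

	for (i = 0; i < n; i++) {
		unsigned int idx = actuator_index_stub(sectors[i]);

		assert(idx < MAX_ACTUATORS);
		hits[idx]++;
	}
}
```

Whatever sectors are passed in, only `hits[0]` becomes non-zero, so the extra per-actuator queues exist but stay unused until the index function is given a real implementation.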

[1] https://www.linaro.org/blog/budget-fair-queueing-bfq-linux-io-scheduler-optimizations-for-multi-actuator-sata-hard-drives/

Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Gabriele Felici <felicigb@gmail.com>
Signed-off-by: Carmine Zaccagnino <carmine@carminezacc.com>
Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
Link: https://lore.kernel.org/r/20230103145503.71712-2-paolo.valente@linaro.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>

authored by Paolo Valente and committed by Jens Axboe
9778369a 6d796c50

+197 -109
+49 -42
block/bfq-cgroup.c
···
 	bfq_put_queue(bfqq);
 }

+static void bfq_sync_bfqq_move(struct bfq_data *bfqd,
+			       struct bfq_queue *sync_bfqq,
+			       struct bfq_io_cq *bic,
+			       struct bfq_group *bfqg,
+			       unsigned int act_idx)
+{
+	struct bfq_queue *bfqq;
+
+	if (!sync_bfqq->new_bfqq && !bfq_bfqq_coop(sync_bfqq)) {
+		/* We are the only user of this bfqq, just move it */
+		if (sync_bfqq->entity.sched_data != &bfqg->sched_data)
+			bfq_bfqq_move(bfqd, sync_bfqq, bfqg);
+		return;
+	}
+
+	/*
+	 * The queue was merged to a different queue. Check
+	 * that the merge chain still belongs to the same
+	 * cgroup.
+	 */
+	for (bfqq = sync_bfqq; bfqq; bfqq = bfqq->new_bfqq)
+		if (bfqq->entity.sched_data != &bfqg->sched_data)
+			break;
+	if (bfqq) {
+		/*
+		 * Some queue changed cgroup so the merge is not valid
+		 * anymore. We cannot easily just cancel the merge (by
+		 * clearing new_bfqq) as there may be other processes
+		 * using this queue and holding refs to all queues
+		 * below sync_bfqq->new_bfqq. Similarly if the merge
+		 * already happened, we need to detach from bfqq now
+		 * so that we cannot merge bio to a request from the
+		 * old cgroup.
+		 */
+		bfq_put_cooperator(sync_bfqq);
+		bfq_release_process_ref(bfqd, sync_bfqq);
+		bic_set_bfqq(bic, NULL, true, act_idx);
+	}
+}
+
 /**
  * __bfq_bic_change_cgroup - move @bic to @bfqg.
  * @bfqd: the queue descriptor.
···
 				     struct bfq_io_cq *bic,
 				     struct bfq_group *bfqg)
 {
-	struct bfq_queue *async_bfqq = bic_to_bfqq(bic, false);
-	struct bfq_queue *sync_bfqq = bic_to_bfqq(bic, true);
-	struct bfq_entity *entity;
+	unsigned int act_idx;

-	if (async_bfqq) {
-		entity = &async_bfqq->entity;
+	for (act_idx = 0; act_idx < bfqd->num_actuators; act_idx++) {
+		struct bfq_queue *async_bfqq = bic_to_bfqq(bic, false, act_idx);
+		struct bfq_queue *sync_bfqq = bic_to_bfqq(bic, true, act_idx);

-		if (entity->sched_data != &bfqg->sched_data) {
-			bic_set_bfqq(bic, NULL, false);
+		if (async_bfqq &&
+		    async_bfqq->entity.sched_data != &bfqg->sched_data) {
+			bic_set_bfqq(bic, NULL, false, act_idx);
 			bfq_release_process_ref(bfqd, async_bfqq);
 		}
-	}

-	if (sync_bfqq) {
-		if (!sync_bfqq->new_bfqq && !bfq_bfqq_coop(sync_bfqq)) {
-			/* We are the only user of this bfqq, just move it */
-			if (sync_bfqq->entity.sched_data != &bfqg->sched_data)
-				bfq_bfqq_move(bfqd, sync_bfqq, bfqg);
-		} else {
-			struct bfq_queue *bfqq;
-
-			/*
-			 * The queue was merged to a different queue. Check
-			 * that the merge chain still belongs to the same
-			 * cgroup.
-			 */
-			for (bfqq = sync_bfqq; bfqq; bfqq = bfqq->new_bfqq)
-				if (bfqq->entity.sched_data !=
-				    &bfqg->sched_data)
-					break;
-			if (bfqq) {
-				/*
-				 * Some queue changed cgroup so the merge is
-				 * not valid anymore. We cannot easily just
-				 * cancel the merge (by clearing new_bfqq) as
-				 * there may be other processes using this
-				 * queue and holding refs to all queues below
-				 * sync_bfqq->new_bfqq. Similarly if the merge
-				 * already happened, we need to detach from
-				 * bfqq now so that we cannot merge bio to a
-				 * request from the old cgroup.
-				 */
-				bfq_put_cooperator(sync_bfqq);
-				bfq_release_process_ref(bfqd, sync_bfqq);
-				bic_set_bfqq(bic, NULL, true);
-			}
-		}
+		if (sync_bfqq)
+			bfq_sync_bfqq_move(bfqd, sync_bfqq, bic, bfqg, act_idx);
 	}
 }

+107 -57
block/bfq-iosched.c
···
 #define RQ_BIC(rq) ((struct bfq_io_cq *)((rq)->elv.priv[0]))
 #define RQ_BFQQ(rq) ((rq)->elv.priv[1])

-struct bfq_queue *bic_to_bfqq(struct bfq_io_cq *bic, bool is_sync)
+struct bfq_queue *bic_to_bfqq(struct bfq_io_cq *bic, bool is_sync,
+			      unsigned int actuator_idx)
 {
-	return bic->bfqq[is_sync];
+	if (is_sync)
+		return bic->bfqq[1][actuator_idx];
+
+	return bic->bfqq[0][actuator_idx];
 }

 static void bfq_put_stable_ref(struct bfq_queue *bfqq);

-void bic_set_bfqq(struct bfq_io_cq *bic, struct bfq_queue *bfqq, bool is_sync)
+void bic_set_bfqq(struct bfq_io_cq *bic,
+		  struct bfq_queue *bfqq,
+		  bool is_sync,
+		  unsigned int actuator_idx)
 {
-	struct bfq_queue *old_bfqq = bic->bfqq[is_sync];
+	struct bfq_queue *old_bfqq = bic->bfqq[is_sync][actuator_idx];

 	/* Clear bic pointer if bfqq is detached from this bic */
 	if (old_bfqq && old_bfqq->bic == bic)
···
 	 * we cancel the stable merge if
 	 * bic->stable_merge_bfqq == bfqq.
 	 */
-	bic->bfqq[is_sync] = bfqq;
+	if (is_sync)
+		bic->bfqq[1][actuator_idx] = bfqq;
+	else
+		bic->bfqq[0][actuator_idx] = bfqq;

 	if (bfqq && bic->stable_merge_bfqq == bfqq) {
 		/*
···
 {
 	struct bfq_data *bfqd = data->q->elevator->elevator_data;
 	struct bfq_io_cq *bic = bfq_bic_lookup(data->q);
-	struct bfq_queue *bfqq = bic ? bic_to_bfqq(bic, op_is_sync(opf)) : NULL;
 	int depth;
 	unsigned limit = data->q->nr_requests;
+	unsigned int act_idx;

 	/* Sync reads have full depth available */
 	if (op_is_sync(opf) && !op_is_write(opf)) {
···
 		limit = (limit * depth) >> bfqd->full_depth_shift;
 	}

-	/*
-	 * Does queue (or any parent entity) exceed number of requests that
-	 * should be available to it? Heavily limit depth so that it cannot
-	 * consume more available requests and thus starve other entities.
-	 */
-	if (bfqq && bfqq_request_over_limit(bfqq, limit))
-		depth = 1;
+	for (act_idx = 0; bic && act_idx < bfqd->num_actuators; act_idx++) {
+		struct bfq_queue *bfqq =
+			bic_to_bfqq(bic, op_is_sync(opf), act_idx);

+		/*
+		 * Does queue (or any parent entity) exceed number of
+		 * requests that should be available to it? Heavily
+		 * limit depth so that it cannot consume more
+		 * available requests and thus starve other entities.
+		 */
+		if (bfqq && bfqq_request_over_limit(bfqq, limit)) {
+			depth = 1;
+			break;
+		}
+	}
 	bfq_log(bfqd, "[%s] wr_busy %d sync %d depth %u",
 		__func__, bfqd->wr_busy_queues, op_is_sync(opf), depth);
 	if (depth)
···
 	return bfqq_weight > in_serv_weight;
 }

+/*
+ * Get the index of the actuator that will serve bio.
+ */
+static unsigned int bfq_actuator_index(struct bfq_data *bfqd, struct bio *bio)
+{
+	/*
+	 * Multi-actuator support not complete yet, so always return 0
+	 * for the moment (to keep incomplete mechanisms off).
+	 */
+	return 0;
+}
+
 static bool bfq_better_to_idle(struct bfq_queue *bfqq);

 static void bfq_bfqq_handle_idle_busy_switch(struct bfq_data *bfqd,
···
 	 * We reset waker detection logic also if too much time has passed
 	 * since the first detection. If wakeups are rare, pointless idling
 	 * doesn't hurt throughput that much. The condition below makes sure
-	 * we do not uselessly idle blocking waker in more than 1/64 cases.
+	 * we do not uselessly idle blocking waker in more than 1/64 cases.
 	 */
 	if (bfqd->last_completed_rq_bfqq !=
 	    bfqq->tentative_waker_bfqq ||
···
 		 */
 		bfq_bic_update_cgroup(bic, bio);

-		bfqd->bio_bfqq = bic_to_bfqq(bic, op_is_sync(bio->bi_opf));
+		bfqd->bio_bfqq = bic_to_bfqq(bic, op_is_sync(bio->bi_opf),
+					     bfq_actuator_index(bfqd, bio));
 	} else {
 		bfqd->bio_bfqq = NULL;
 	}
···
 	/*
 	 * Merge queues (that is, let bic redirect its requests to new_bfqq)
 	 */
-	bic_set_bfqq(bic, new_bfqq, true);
+	bic_set_bfqq(bic, new_bfqq, true, bfqq->actuator_idx);
 	bfq_mark_bfqq_coop(new_bfqq);
 	/*
 	 * new_bfqq now belongs to at least two bics (it is a shared queue):
···
 		 */
 		if (bfq_bfqq_wait_request(bfqq) ||
 		    (bfqq->dispatched != 0 && bfq_better_to_idle(bfqq))) {
-			struct bfq_queue *async_bfqq =
-				bfqq->bic && bfqq->bic->bfqq[0] &&
-				bfq_bfqq_busy(bfqq->bic->bfqq[0]) &&
-				bfqq->bic->bfqq[0]->next_rq ?
-				bfqq->bic->bfqq[0] : NULL;
+			unsigned int act_idx = bfqq->actuator_idx;
+			struct bfq_queue *async_bfqq = NULL;
 			struct bfq_queue *blocked_bfqq =
 				!hlist_empty(&bfqq->woken_list) ?
 				container_of(bfqq->woken_list.first,
···
 					     woken_list_node)
 				: NULL;

+			if (bfqq->bic && bfqq->bic->bfqq[0][act_idx] &&
+			    bfq_bfqq_busy(bfqq->bic->bfqq[0][act_idx]) &&
+			    bfqq->bic->bfqq[0][act_idx]->next_rq)
+				async_bfqq = bfqq->bic->bfqq[0][act_idx];
 			/*
 			 * The next four mutually-exclusive ifs decide
 			 * whether to try injection, and choose the queue to
···
 			    icq_to_bic(async_bfqq->next_rq->elv.icq) == bfqq->bic &&
 			    bfq_serv_to_charge(async_bfqq->next_rq, async_bfqq) <=
 			    bfq_bfqq_budget_left(async_bfqq))
-				bfqq = bfqq->bic->bfqq[0];
+				bfqq = bfqq->bic->bfqq[0][act_idx];
 			else if (bfqq->waker_bfqq &&
 				   bfq_bfqq_busy(bfqq->waker_bfqq) &&
 				   bfqq->waker_bfqq->next_rq &&
···
 	bfq_release_process_ref(bfqd, bfqq);
 }

-static void bfq_exit_icq_bfqq(struct bfq_io_cq *bic, bool is_sync)
+static void bfq_exit_icq_bfqq(struct bfq_io_cq *bic, bool is_sync,
+			      unsigned int actuator_idx)
 {
-	struct bfq_queue *bfqq = bic_to_bfqq(bic, is_sync);
+	struct bfq_queue *bfqq = bic_to_bfqq(bic, is_sync, actuator_idx);
 	struct bfq_data *bfqd;

 	if (bfqq)
 		bfqd = bfqq->bfqd; /* NULL if scheduler already exited */

 	if (bfqq && bfqd) {
-		unsigned long flags;
-
-		spin_lock_irqsave(&bfqd->lock, flags);
-		bic_set_bfqq(bic, NULL, is_sync);
+		bic_set_bfqq(bic, NULL, is_sync, actuator_idx);
 		bfq_exit_bfqq(bfqd, bfqq);
-		spin_unlock_irqrestore(&bfqd->lock, flags);
 	}
 }

 static void bfq_exit_icq(struct io_cq *icq)
 {
 	struct bfq_io_cq *bic = icq_to_bic(icq);
+	struct bfq_data *bfqd = bic_to_bfqd(bic);
+	unsigned long flags;
+	unsigned int act_idx;
+	/*
+	 * If bfqd and thus bfqd->num_actuators is not available any
+	 * longer, then cycle over all possible per-actuator bfqqs in
+	 * next loop. We rely on bic being zeroed on creation, and
+	 * therefore on its unused per-actuator fields being NULL.
+	 */
+	unsigned int num_actuators = BFQ_MAX_ACTUATORS;

-	if (bic->stable_merge_bfqq) {
-		struct bfq_data *bfqd = bic->stable_merge_bfqq->bfqd;
-
-		/*
-		 * bfqd is NULL if scheduler already exited, and in
-		 * that case this is the last time bfqq is accessed.
-		 */
-		if (bfqd) {
-			unsigned long flags;
-
-			spin_lock_irqsave(&bfqd->lock, flags);
-			bfq_put_stable_ref(bic->stable_merge_bfqq);
-			spin_unlock_irqrestore(&bfqd->lock, flags);
-		} else {
-			bfq_put_stable_ref(bic->stable_merge_bfqq);
-		}
+	/*
+	 * bfqd is NULL if scheduler already exited, and in that case
+	 * this is the last time these queues are accessed.
+	 */
+	if (bfqd) {
+		spin_lock_irqsave(&bfqd->lock, flags);
+		num_actuators = bfqd->num_actuators;
 	}

-	bfq_exit_icq_bfqq(bic, true);
-	bfq_exit_icq_bfqq(bic, false);
+	if (bic->stable_merge_bfqq)
+		bfq_put_stable_ref(bic->stable_merge_bfqq);
+
+	for (act_idx = 0; act_idx < num_actuators; act_idx++) {
+		bfq_exit_icq_bfqq(bic, true, act_idx);
+		bfq_exit_icq_bfqq(bic, false, act_idx);
+	}
+
+	if (bfqd)
+		spin_unlock_irqrestore(&bfqd->lock, flags);
 }

 /*
···

 	bic->ioprio = ioprio;

-	bfqq = bic_to_bfqq(bic, false);
+	bfqq = bic_to_bfqq(bic, false, bfq_actuator_index(bfqd, bio));
 	if (bfqq) {
 		bfq_release_process_ref(bfqd, bfqq);
 		bfqq = bfq_get_queue(bfqd, bio, false, bic, true);
-		bic_set_bfqq(bic, bfqq, false);
+		bic_set_bfqq(bic, bfqq, false, bfq_actuator_index(bfqd, bio));
 	}

-	bfqq = bic_to_bfqq(bic, true);
+	bfqq = bic_to_bfqq(bic, true, bfq_actuator_index(bfqd, bio));
 	if (bfqq)
 		bfq_set_next_ioprio_data(bfqq, bic);
 }

 static void bfq_init_bfqq(struct bfq_data *bfqd, struct bfq_queue *bfqq,
-			  struct bfq_io_cq *bic, pid_t pid, int is_sync)
+			  struct bfq_io_cq *bic, pid_t pid, int is_sync,
+			  unsigned int act_idx)
 {
 	u64 now_ns = ktime_get_ns();

+	bfqq->actuator_idx = act_idx;
 	RB_CLEAR_NODE(&bfqq->entity.rb_node);
 	INIT_LIST_HEAD(&bfqq->fifo);
 	INIT_HLIST_NODE(&bfqq->burst_list_node);
···

 	if (bfqq) {
 		bfq_init_bfqq(bfqd, bfqq, bic, current->pid,
-			      is_sync);
+			      is_sync, bfq_actuator_index(bfqd, bio));
 		bfq_init_entity(&bfqq->entity, bfqg);
 		bfq_log_bfqq(bfqd, bfqq, "allocated");
 	} else {
···
 		 * then complete the merge and redirect it to
 		 * new_bfqq.
 		 */
-		if (bic_to_bfqq(RQ_BIC(rq), 1) == bfqq)
+		if (bic_to_bfqq(RQ_BIC(rq), true,
+				bfq_actuator_index(bfqd, rq->bio)) == bfqq)
 			bfq_merge_bfqqs(bfqd, RQ_BIC(rq),
 					bfqq, new_bfqq);

···
 		return bfqq;
 	}

-	bic_set_bfqq(bic, NULL, true);
+	bic_set_bfqq(bic, NULL, true, bfqq->actuator_idx);

 	bfq_put_cooperator(bfqq);

···
 						   bool split, bool is_sync,
 						   bool *new_queue)
 {
-	struct bfq_queue *bfqq = bic_to_bfqq(bic, is_sync);
+	unsigned int act_idx = bfq_actuator_index(bfqd, bio);
+	struct bfq_queue *bfqq = bic_to_bfqq(bic, is_sync, act_idx);

 	if (likely(bfqq && bfqq != &bfqd->oom_bfqq))
 		return bfqq;
···
 		bfq_put_queue(bfqq);
 	bfqq = bfq_get_queue(bfqd, bio, is_sync, bic, split);

-	bic_set_bfqq(bic, bfqq, is_sync);
+	bic_set_bfqq(bic, bfqq, is_sync, act_idx);
 	if (split && is_sync) {
 		if ((bic->was_in_burst_list && bfqd->large_burst) ||
 		    bic->saved_in_large_burst)
···
 	 * Our fallback bfqq if bfq_find_alloc_queue() runs into OOM issues.
 	 * Grab a permanent reference to it, so that the normal code flow
 	 * will not attempt to free it.
+	 * Set zero as actuator index: we will pretend that
+	 * all I/O requests are for the same actuator.
 	 */
-	bfq_init_bfqq(bfqd, &bfqd->oom_bfqq, NULL, 1, 0);
+	bfq_init_bfqq(bfqd, &bfqd->oom_bfqq, NULL, 1, 0, 0);
 	bfqd->oom_bfqq.ref++;
 	bfqd->oom_bfqq.new_ioprio = BFQ_DEFAULT_QUEUE_IOPRIO;
 	bfqd->oom_bfqq.new_ioprio_class = IOPRIO_CLASS_BE;
···
 	bfqd->oom_bfqq.entity.prio_changed = 1;

 	bfqd->queue = q;
+
+	/*
+	 * Multi-actuator support not complete yet, unconditionally
+	 * set to only one actuator for the moment (to keep incomplete
+	 * mechanisms off).
+	 */
+	bfqd->num_actuators = 1;

 	INIT_LIST_HEAD(&bfqd->dispatch);

+41 -10
block/bfq-iosched.h
···
  */
 #define BFQ_SOFTRT_WEIGHT_FACTOR 100

+/*
+ * Maximum number of actuators supported. This constant is used simply
+ * to define the size of the static array that will contain
+ * per-actuator data. The current value is hopefully a good upper
+ * bound to the possible number of actuators of any actual drive.
+ */
+#define BFQ_MAX_ACTUATORS 8
+
 struct bfq_entity;

 /**
···
  * struct bfq_queue - leaf schedulable entity.
  *
  * A bfq_queue is a leaf request queue; it can be associated with an
- * io_context or more, if it is async or shared between cooperating
- * processes. @cgroup holds a reference to the cgroup, to be sure that it
- * does not disappear while a bfqq still references it (mostly to avoid
- * races between request issuing and task migration followed by cgroup
- * destruction).
- * All the fields are protected by the queue lock of the containing bfqd.
+ * io_context or more, if it is async or shared between cooperating
+ * processes. Besides, it contains I/O requests for only one actuator
+ * (an io_context is associated with a different bfq_queue for each
+ * actuator it generates I/O for). @cgroup holds a reference to the
+ * cgroup, to be sure that it does not disappear while a bfqq still
+ * references it (mostly to avoid races between request issuing and
+ * task migration followed by cgroup destruction). All the fields are
+ * protected by the queue lock of the containing bfqd.
  */
 struct bfq_queue {
 	/* reference counter */
···
 	 * the woken queues when this queue exits.
 	 */
 	struct hlist_head woken_list;
+
+	/* index of the actuator this queue is associated with */
+	unsigned int actuator_idx;
 };

 /**
···
 struct bfq_io_cq {
 	/* associated io_cq structure */
 	struct io_cq icq; /* must be the first member */
-	/* array of two process queues, the sync and the async */
-	struct bfq_queue *bfqq[2];
+	/*
+	 * Matrix of associated process queues: first row for async
+	 * queues, second row sync queues. Each row contains one
+	 * column for each actuator. An I/O request generated by the
+	 * process is inserted into the queue pointed by bfqq[i][j] if
+	 * the request is to be served by the j-th actuator of the
+	 * drive, where i==0 or i==1, depending on whether the request
+	 * is async or sync. So there is a distinct queue for each
+	 * actuator.
+	 */
+	struct bfq_queue *bfqq[2][BFQ_MAX_ACTUATORS];
 	/* per (request_queue, blkcg) ioprio */
 	int ioprio;
 #ifdef CONFIG_BFQ_GROUP_IOSCHED
···
 	 */
 	unsigned int word_depths[2][2];
 	unsigned int full_depth_shift;
+
+	/*
+	 * Number of independent actuators. This is equal to 1 in
+	 * case of single-actuator drives.
+	 */
+	unsigned int num_actuators;
+
 };

 enum bfqq_state_flags {
···

 extern const int bfq_timeout;

-struct bfq_queue *bic_to_bfqq(struct bfq_io_cq *bic, bool is_sync);
-void bic_set_bfqq(struct bfq_io_cq *bic, struct bfq_queue *bfqq, bool is_sync);
+struct bfq_queue *bic_to_bfqq(struct bfq_io_cq *bic, bool is_sync,
+				unsigned int actuator_idx);
+void bic_set_bfqq(struct bfq_io_cq *bic, struct bfq_queue *bfqq, bool is_sync,
+				unsigned int actuator_idx);
 struct bfq_data *bic_to_bfqd(struct bfq_io_cq *bic);
 void bfq_pos_tree_add_move(struct bfq_data *bfqd, struct bfq_queue *bfqq);
 void bfq_weights_tree_add(struct bfq_queue *bfqq);