Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

block: allow submitting all zone writes from a single context

In order to maintain sequential write patterns per zone with zoned block
devices, zone write plugging issues only a single write BIO per zone at
any time. This works well but has the side effect that when large
sequential write streams are issued by the user and these streams cross
zone boundaries, the device ends up receiving a discontiguous set of
write commands for different zones. The same also happens when a user
simultaneously writes to multiple zones at a high queue depth: the
device does not see all sequential writes per zone and receives
discontiguous writes to different zones. While this does not affect the
performance of solid state zoned block devices, when using an SMR HDD,
this pattern change from sequential writes to discontiguous writes to
different zones significantly increases head seeks, which results in
degraded write throughput.

In order to reduce this seek overhead for rotational media devices,
introduce a per-disk zone write plugs kernel thread that issues all
write BIOs to zones. This single zone write issuing context is enabled
for any zoned block device that has a request queue flagged with the
new QUEUE_FLAG_ZONED_QD1_WRITES flag.
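Setting the flag itself is a single queue flag update. A minimal sketch
(the sysfs store path shown in the blk-sysfs.c diff below does this
while the queue is frozen and quiesced; that surrounding code is
omitted here):

	/* Illustration only: enable the single zone write issuing context. */
	blk_queue_flag_set(QUEUE_FLAG_ZONED_QD1_WRITES, disk->queue);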

The QUEUE_FLAG_ZONED_QD1_WRITES flag is exposed as the sysfs queue
attribute zoned_qd1_writes for zoned devices; for regular block devices,
this attribute is not visible. For a zoned block device, a user can set
this attribute to force a global maximum write queue depth of 1, or
clear it to fall back to the default zone write plugging behavior, which
limits writes to QD=1 per sequential zone.
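From user space, the attribute is toggled with a plain sysfs write. A
minimal sketch, assuming a zoned disk named sdb (the device name is an
example only):

	#include <fcntl.h>
	#include <stdio.h>
	#include <unistd.h>

	int main(void)
	{
		/* Example path: substitute the actual zoned disk name. */
		const char *attr = "/sys/block/sdb/queue/zoned_qd1_writes";
		int fd = open(attr, O_WRONLY);

		if (fd < 0) {
			perror("open"); /* not zoned, or kernel lacks this patch */
			return 1;
		}
		/* "1" forces a global write QD of 1, "0" restores per-zone QD=1. */
		if (write(fd, "1", 1) != 1)
			perror("write");
		close(fd);
		return 0;
	}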

Writing to a zoned block device flagged with QUEUE_FLAG_ZONED_QD1_WRITES
is implemented using a list of the zone write plugs that have a
non-empty BIO list. Listed zone write plugs are processed by the disk
zone write plugs worker kthread in FIFO order, and all BIOs of a zone
write plug are processed before switching to the next listed zone write
plug. A newly submitted BIO for a non-FULL zone write plug that is not
yet listed causes the zone write plug to be added at the end of the
disk list of zone write plugs.
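Condensed from the blk-zoned.c changes below (all names are from the
patch), the enqueue side lists a plug in FIFO order when its BIO list
becomes non-empty, taking an extra reference so the plug stays alive
while listed:

	if (blk_queue_zoned_qd1_writes(disk->queue)) {
		spin_lock(&disk->zone_wplugs_list_lock);
		if (list_empty(&zwplug->entry)) {
			/* First plugged BIO: queue the plug for the worker. */
			list_add_tail(&zwplug->entry, &disk->zone_wplugs_list);
			refcount_inc(&zwplug->ref);
		}
		spin_unlock(&disk->zone_wplugs_list_lock);
	}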

Since the write BIOs queued in a zone write plug BIO list are
necessarily sequential, for rotational media, using the single zone
write plugs kthread to issue all BIOs maintains a sequential write
pattern and thus reduces seek overhead and improves write throughput.
This processing essentially results in always writing to HDDs at QD=1,
which is not an issue for HDDs operating with write caching enabled.
Performance with the write cache disabled is also not degraded, thanks
to the efficient write handling of modern SMR HDDs.

A disk list of zone write plugs is defined using the new struct gendisk
field zone_wplugs_list, and accesses to this list are protected using
the zone_wplugs_list_lock spinlock. The per-disk kthread
(zone_wplugs_worker) is implemented by the function
disk_zone_wplugs_worker(). A reference on a listed zone write plug is
always held until all BIOs of the zone write plug are processed by the
worker kthread. BIO issuing at QD=1 is driven using a completion
structure (zone_wplugs_worker_bio_done) and calls to blk_wait_io().
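Condensed from disk_zone_wplugs_worker() in the diff below, the
dispatch loop pops plugs off the list in FIFO order and drains each
plug's BIO list at QD=1, waiting on the completion after every write:

	while ((zwplug = disk_get_zone_wplugs_work(disk))) {
		/* Issue one BIO, then wait for it before issuing the next. */
		while (disk_zone_wplug_submit_bio(disk, zwplug))
			blk_wait_io(&disk->zone_wplugs_worker_bio_done);
		/* Drop the reference taken when the plug was listed. */
		disk_put_zone_wplug(zwplug);
	}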

With this change, performance when sequentially writing the zones of a
30 TB SMR SATA HDD connected to an AHCI adapter changes as follows
(1 MiB direct I/Os; results in MB/s):

                   +--------------------+
                   |  Write BW (MB/s)   |
+------------------+----------+---------+
| Sequential write | Baseline | Patched |
| Queue Depth      | 6.19-rc8 |         |
+------------------+----------+---------+
|                1 |      244 |     245 |
|                2 |      244 |     245 |
|                4 |      245 |     245 |
|                8 |      242 |     245 |
|               16 |      222 |     246 |
|               32 |      211 |     245 |
|               64 |      193 |     244 |
|              128 |      112 |     246 |
+------------------+----------+---------+

With the current code (baseline), as a sequential write stream crosses
a zone boundary, a higher queue depth creates a gap between the last
I/O to the previous zone and the first I/Os to the following zones,
causing head seeks and degrading performance. Using the disk zone
write plugs worker thread, this pattern disappears and the maximum
throughput of the drive is maintained, leading to over 100%
improvement in throughput for high queue depth writes.

Using 16 fio jobs, all writing to randomly chosen zones at QD=32 with
1 MiB direct I/Os, write throughput also increases significantly.

                   +--------------------+
                   |  Write BW (MB/s)   |
+------------------+----------+---------+
| Random write     | Baseline | Patched |
| Number of zones  | 6.19-rc7 |         |
+------------------+----------+---------+
|                1 |      191 |     192 |
|                2 |      101 |     128 |
|                4 |      115 |     123 |
|                8 |       90 |     120 |
|               16 |       64 |     115 |
|               32 |       58 |     105 |
|               64 |       56 |     101 |
|              128 |       55 |      99 |
+------------------+----------+---------+

Tests using XFS show that buffered write speed with 8 jobs writing
files increases by 12% to 35%, depending on the workload.

                   +--------------------+
                   |  Write BW (MB/s)   |
+------------------+----------+---------+
| Workload         | Baseline | Patched |
|                  | 6.19-rc7 |         |
+------------------+----------+---------+
| 256MiB file size |      212 |     238 |
+------------------+----------+---------+
| 4MiB .. 128 MiB  |      213 |     243 |
| random file size |          |         |
+------------------+----------+---------+
| 2MiB .. 8 MiB    |      179 |     242 |
| random file size |          |         |
+------------------+----------+---------+

Performance gains are even more significant when using an HBA that
limits the maximum command size to a small value, e.g. HBAs controlled
by the mpi3mr driver limit commands to a maximum of 1 MiB. In this
case, the write throughput gains are over 40%.

                   +--------------------+
                   |  Write BW (MB/s)   |
+------------------+----------+---------+
| Workload         | Baseline | Patched |
|                  | 6.19-rc7 |         |
+------------------+----------+---------+
| 256MiB file size |      175 |     245 |
+------------------+----------+---------+
| 4MiB .. 128 MiB  |      174 |     244 |
| random file size |          |         |
+------------------+----------+---------+
| 2MiB .. 8 MiB    |      171 |     243 |
| random file size |          |         |
+------------------+----------+---------+

Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

Authored by Damien Le Moal, committed by Jens Axboe
1365b690 b7cbc30e

4 files changed: +212 -22
block/blk-mq-debugfs.c (+1)
···
 	QUEUE_FLAG_NAME(NO_ELV_SWITCH),
 	QUEUE_FLAG_NAME(QOS_ENABLED),
 	QUEUE_FLAG_NAME(BIO_ISSUE_TIME),
+	QUEUE_FLAG_NAME(ZONED_QD1_WRITES),
 };
 #undef QUEUE_FLAG_NAME
···
block/blk-sysfs.c (+34 -1)
···
 	return queue_var_show(disk_nr_zones(disk), page);
 }
 
+static ssize_t queue_zoned_qd1_writes_show(struct gendisk *disk, char *page)
+{
+	return queue_var_show(!!blk_queue_zoned_qd1_writes(disk->queue),
+			      page);
+}
+
+static ssize_t queue_zoned_qd1_writes_store(struct gendisk *disk,
+					    const char *page, size_t count)
+{
+	struct request_queue *q = disk->queue;
+	unsigned long qd1_writes;
+	unsigned int memflags;
+	ssize_t ret;
+
+	ret = queue_var_store(&qd1_writes, page, count);
+	if (ret < 0)
+		return ret;
+
+	memflags = blk_mq_freeze_queue(q);
+	blk_mq_quiesce_queue(q);
+	if (qd1_writes)
+		blk_queue_flag_set(QUEUE_FLAG_ZONED_QD1_WRITES, q);
+	else
+		blk_queue_flag_clear(QUEUE_FLAG_ZONED_QD1_WRITES, q);
+	blk_mq_unquiesce_queue(q);
+	blk_mq_unfreeze_queue(q, memflags);
+
+	return count;
+}
+
 static ssize_t queue_iostats_passthrough_show(struct gendisk *disk, char *page)
 {
 	return queue_var_show(!!blk_queue_passthrough_stat(disk->queue), page);
···
 QUEUE_LIM_RO_ENTRY(queue_zone_write_granularity, "zone_write_granularity");
 
 QUEUE_LIM_RO_ENTRY(queue_zoned, "zoned");
+QUEUE_RW_ENTRY(queue_zoned_qd1_writes, "zoned_qd1_writes");
 QUEUE_RO_ENTRY(queue_nr_zones, "nr_zones");
 QUEUE_LIM_RO_ENTRY(queue_max_open_zones, "max_open_zones");
 QUEUE_LIM_RO_ENTRY(queue_max_active_zones, "max_active_zones");
···
 	&queue_nomerges_entry.attr,
 	&queue_poll_entry.attr,
 	&queue_poll_delay_entry.attr,
+	&queue_zoned_qd1_writes_entry.attr,
 
 	NULL,
 };
···
 	struct request_queue *q = disk->queue;
 
 	if ((attr == &queue_max_open_zones_entry.attr ||
-	     attr == &queue_max_active_zones_entry.attr) &&
+	     attr == &queue_max_active_zones_entry.attr ||
+	     attr == &queue_zoned_qd1_writes_entry.attr) &&
 	    !blk_queue_is_zoned(q))
 		return 0;
···
block/blk-zoned.c (+169 -21)
···
 #include <linux/spinlock.h>
 #include <linux/refcount.h>
 #include <linux/mempool.h>
+#include <linux/kthread.h>
+#include <linux/freezer.h>
 
 #include <trace/events/block.h>
···
 /*
  * Per-zone write plug.
  * @node: hlist_node structure for managing the plug using a hash table.
+ * @entry: list_head structure for listing the plug in the disk list of active
+ *	   zone write plugs.
  * @bio_list: The list of BIOs that are currently plugged.
  * @bio_work: Work struct to handle issuing of plugged BIOs
  * @rcu_head: RCU head to free zone write plugs with an RCU grace period.
···
 struct blk_zone_wplug {
 	struct hlist_node node;
+	struct list_head entry;
 	struct bio_list bio_list;
 	struct work_struct bio_work;
 	struct rcu_head rcu_head;
···
 	}
 }
 
-static void blk_zone_wplug_bio_work(struct work_struct *work);
+static bool disk_zone_wplug_submit_bio(struct gendisk *disk,
+				       struct blk_zone_wplug *zwplug);
+
+static void blk_zone_wplug_bio_work(struct work_struct *work)
+{
+	struct blk_zone_wplug *zwplug =
+		container_of(work, struct blk_zone_wplug, bio_work);
+
+	disk_zone_wplug_submit_bio(zwplug->disk, zwplug);
+
+	/* Drop the reference we took in disk_zone_wplug_schedule_work(). */
+	disk_put_zone_wplug(zwplug);
+}
 
 /*
  * Get a zone write plug for the zone containing @sector.
···
 	zwplug->wp_offset = bdev_offset_from_zone_start(disk->part0, sector);
 	bio_list_init(&zwplug->bio_list);
 	INIT_WORK(&zwplug->bio_work, blk_zone_wplug_bio_work);
+	INIT_LIST_HEAD(&zwplug->entry);
 	zwplug->disk = disk;
···
  */
 static void disk_zone_wplug_abort(struct blk_zone_wplug *zwplug)
 {
+	struct gendisk *disk = zwplug->disk;
 	struct bio *bio;
 
 	lockdep_assert_held(&zwplug->lock);
···
 		blk_zone_wplug_bio_io_error(zwplug, bio);
 
 	zwplug->flags &= ~BLK_ZONE_WPLUG_PLUGGED;
+
+	/*
+	 * If we are using the per disk zone write plugs worker thread, remove
+	 * the zone write plug from the work list and drop the reference we
+	 * took when the zone write plug was added to that list.
+	 */
+	if (blk_queue_zoned_qd1_writes(disk->queue)) {
+		spin_lock(&disk->zone_wplugs_list_lock);
+		if (!list_empty(&zwplug->entry)) {
+			list_del_init(&zwplug->entry);
+			disk_put_zone_wplug(zwplug);
+		}
+		spin_unlock(&disk->zone_wplugs_list_lock);
+	}
 }
···
 	}
 }
 
-static void disk_zone_wplug_schedule_bio_work(struct gendisk *disk,
-					      struct blk_zone_wplug *zwplug)
+static void disk_zone_wplug_schedule_work(struct gendisk *disk,
+					  struct blk_zone_wplug *zwplug)
 {
 	lockdep_assert_held(&zwplug->lock);
···
 	 * and we also drop this reference if the work is already scheduled.
 	 */
 	WARN_ON_ONCE(!(zwplug->flags & BLK_ZONE_WPLUG_PLUGGED));
+	WARN_ON_ONCE(blk_queue_zoned_qd1_writes(disk->queue));
 	refcount_inc(&zwplug->ref);
 	if (!queue_work(disk->zone_wplugs_wq, &zwplug->bio_work))
 		disk_put_zone_wplug(zwplug);
···
 	bio_list_add(&zwplug->bio_list, bio);
 	trace_disk_zone_wplug_add_bio(zwplug->disk->queue, zwplug->zone_no,
 				      bio->bi_iter.bi_sector, bio_sectors(bio));
+
+	/*
+	 * If we are using the disk zone write plugs worker instead of the per
+	 * zone write plug BIO work, add the zone write plug to the work list
+	 * if it is not already there. Make sure to also get an extra reference
+	 * on the zone write plug so that it does not go away until it is
+	 * removed from the work list.
+	 */
+	if (blk_queue_zoned_qd1_writes(disk->queue)) {
+		spin_lock(&disk->zone_wplugs_list_lock);
+		if (list_empty(&zwplug->entry)) {
+			list_add_tail(&zwplug->entry, &disk->zone_wplugs_list);
+			refcount_inc(&zwplug->ref);
+		}
+		spin_unlock(&disk->zone_wplugs_list_lock);
+	}
 }
···
 		goto queue_bio;
 	}
 
+	/*
+	 * For rotational devices, we will use the gendisk zone write plugs
+	 * work instead of the per zone write plug BIO work, so queue the BIO.
+	 */
+	if (blk_queue_zoned_qd1_writes(disk->queue))
+		goto queue_bio;
+
 	/* If the zone is already plugged, add the BIO to the BIO plug list. */
 	if (zwplug->flags & BLK_ZONE_WPLUG_PLUGGED)
 		goto queue_bio;
···
 
 	if (!(zwplug->flags & BLK_ZONE_WPLUG_PLUGGED)) {
 		zwplug->flags |= BLK_ZONE_WPLUG_PLUGGED;
-		disk_zone_wplug_schedule_bio_work(disk, zwplug);
+		if (blk_queue_zoned_qd1_writes(disk->queue))
+			wake_up_process(disk->zone_wplugs_worker);
+		else
+			disk_zone_wplug_schedule_work(disk, zwplug);
 	}
 
 	spin_unlock_irqrestore(&zwplug->lock, flags);
···
 
 	spin_lock_irqsave(&zwplug->lock, flags);
 
-	/* Schedule submission of the next plugged BIO if we have one. */
-	if (!bio_list_empty(&zwplug->bio_list)) {
-		disk_zone_wplug_schedule_bio_work(disk, zwplug);
-		spin_unlock_irqrestore(&zwplug->lock, flags);
-		return;
-	}
+	/*
+	 * For rotational devices, signal the BIO completion to the zone write
+	 * plug work. Otherwise, schedule submission of the next plugged BIO
+	 * if we have one.
+	 */
+	if (bio_list_empty(&zwplug->bio_list))
+		zwplug->flags &= ~BLK_ZONE_WPLUG_PLUGGED;
 
-	zwplug->flags &= ~BLK_ZONE_WPLUG_PLUGGED;
+	if (blk_queue_zoned_qd1_writes(disk->queue))
+		complete(&disk->zone_wplugs_worker_bio_done);
+	else if (!bio_list_empty(&zwplug->bio_list))
+		disk_zone_wplug_schedule_work(disk, zwplug);
+
 	if (!zwplug->wp_offset || disk_zone_wplug_is_full(disk, zwplug))
 		disk_mark_zone_wplug_dead(zwplug);
+
 	spin_unlock_irqrestore(&zwplug->lock, flags);
 }
···
 	disk_put_zone_wplug(zwplug);
 }
 
-static void blk_zone_wplug_bio_work(struct work_struct *work)
+static bool disk_zone_wplug_submit_bio(struct gendisk *disk,
+				       struct blk_zone_wplug *zwplug)
 {
-	struct blk_zone_wplug *zwplug =
-		container_of(work, struct blk_zone_wplug, bio_work);
 	struct block_device *bdev;
 	unsigned long flags;
 	struct bio *bio;
···
 	if (!bio) {
 		zwplug->flags &= ~BLK_ZONE_WPLUG_PLUGGED;
 		spin_unlock_irqrestore(&zwplug->lock, flags);
-		goto put_zwplug;
+		return false;
 	}
 
 	trace_blk_zone_wplug_bio(zwplug->disk->queue, zwplug->zone_no,
···
 		goto again;
 	}
 
-	bdev = bio->bi_bdev;
-
 	/*
 	 * blk-mq devices will reuse the extra reference on the request queue
 	 * usage counter we took when the BIO was plugged, but the submission
 	 * path for BIO-based devices will not do that. So drop this extra
 	 * reference here.
 	 */
+	if (blk_queue_zoned_qd1_writes(disk->queue))
+		reinit_completion(&disk->zone_wplugs_worker_bio_done);
+	bdev = bio->bi_bdev;
 	if (bdev_test_flag(bdev, BD_HAS_SUBMIT_BIO)) {
 		bdev->bd_disk->fops->submit_bio(bio);
 		blk_queue_exit(bdev->bd_disk->queue);
 	} else {
 		blk_mq_submit_bio(bio);
 	}
 
-put_zwplug:
-	/* Drop the reference we took in disk_zone_wplug_schedule_bio_work(). */
-	disk_put_zone_wplug(zwplug);
+	return true;
+}
+
+static struct blk_zone_wplug *disk_get_zone_wplugs_work(struct gendisk *disk)
+{
+	struct blk_zone_wplug *zwplug;
+
+	spin_lock_irq(&disk->zone_wplugs_list_lock);
+	zwplug = list_first_entry_or_null(&disk->zone_wplugs_list,
+					  struct blk_zone_wplug, entry);
+	if (zwplug)
+		list_del_init(&zwplug->entry);
+	spin_unlock_irq(&disk->zone_wplugs_list_lock);
+
+	return zwplug;
+}
+
+static int disk_zone_wplugs_worker(void *data)
+{
+	struct gendisk *disk = data;
+	struct blk_zone_wplug *zwplug;
+	unsigned int noio_flag;
+
+	noio_flag = memalloc_noio_save();
+	set_user_nice(current, MIN_NICE);
+	set_freezable();
+
+	for (;;) {
+		set_current_state(TASK_INTERRUPTIBLE | TASK_FREEZABLE);
+
+		zwplug = disk_get_zone_wplugs_work(disk);
+		if (zwplug) {
+			/*
+			 * Process all BIOs of this zone write plug and then
+			 * drop the reference we took when adding the zone
+			 * write plug to the active list.
+			 */
+			set_current_state(TASK_RUNNING);
+			while (disk_zone_wplug_submit_bio(disk, zwplug))
+				blk_wait_io(&disk->zone_wplugs_worker_bio_done);
+			disk_put_zone_wplug(zwplug);
+			continue;
+		}
+
+		/*
+		 * Only sleep if nothing sets the state to running. Else check
+		 * for zone write plugs work again as a newly submitted BIO
+		 * might have added a zone write plug to the work list.
+		 */
+		if (get_current_state() == TASK_RUNNING) {
+			try_to_freeze();
+		} else {
+			if (kthread_should_stop()) {
+				set_current_state(TASK_RUNNING);
+				break;
+			}
+			schedule();
+		}
+	}
+
+	WARN_ON_ONCE(!list_empty(&disk->zone_wplugs_list));
+	memalloc_noio_restore(noio_flag);
+
+	return 0;
 }
 
 void disk_init_zone_resources(struct gendisk *disk)
 {
 	spin_lock_init(&disk->zone_wplugs_hash_lock);
+	spin_lock_init(&disk->zone_wplugs_list_lock);
+	INIT_LIST_HEAD(&disk->zone_wplugs_list);
+	init_completion(&disk->zone_wplugs_worker_bio_done);
 }
 
 /*
···
 			     unsigned int pool_size)
 {
 	unsigned int i;
+	int ret = -ENOMEM;
 
 	atomic_set(&disk->nr_zone_wplugs, 0);
 	disk->zone_wplugs_hash_bits =
···
 	if (!disk->zone_wplugs_wq)
 		goto destroy_pool;
 
+	disk->zone_wplugs_worker =
+		kthread_create(disk_zone_wplugs_worker, disk,
+			       "%s_zwplugs_worker", disk->disk_name);
+	if (IS_ERR(disk->zone_wplugs_worker)) {
+		ret = PTR_ERR(disk->zone_wplugs_worker);
+		disk->zone_wplugs_worker = NULL;
+		goto destroy_wq;
+	}
+	wake_up_process(disk->zone_wplugs_worker);
+
 	return 0;
 
+destroy_wq:
+	destroy_workqueue(disk->zone_wplugs_wq);
+	disk->zone_wplugs_wq = NULL;
 destroy_pool:
 	mempool_destroy(disk->zone_wplugs_pool);
 	disk->zone_wplugs_pool = NULL;
···
 	kfree(disk->zone_wplugs_hash);
 	disk->zone_wplugs_hash = NULL;
 	disk->zone_wplugs_hash_bits = 0;
-	return -ENOMEM;
+	return ret;
 }
 
 static void disk_destroy_zone_wplugs_hash_table(struct gendisk *disk)
···
 
 void disk_free_zone_resources(struct gendisk *disk)
 {
+	if (disk->zone_wplugs_worker)
+		kthread_stop(disk->zone_wplugs_worker);
+	WARN_ON_ONCE(!list_empty(&disk->zone_wplugs_list));
+
 	if (disk->zone_wplugs_wq) {
 		destroy_workqueue(disk->zone_wplugs_wq);
 		disk->zone_wplugs_wq = NULL;
···
include/linux/blkdev.h (+8)
···
 #include <linux/minmax.h>
 #include <linux/timer.h>
 #include <linux/workqueue.h>
+#include <linux/completion.h>
 #include <linux/wait.h>
 #include <linux/bio.h>
 #include <linux/gfp.h>
···
 	struct mempool *zone_wplugs_pool;
 	struct hlist_head *zone_wplugs_hash;
 	struct workqueue_struct *zone_wplugs_wq;
+	spinlock_t zone_wplugs_list_lock;
+	struct list_head zone_wplugs_list;
+	struct task_struct *zone_wplugs_worker;
+	struct completion zone_wplugs_worker_bio_done;
 #endif /* CONFIG_BLK_DEV_ZONED */
 
 #if IS_ENABLED(CONFIG_CDROM)
···
 	QUEUE_FLAG_NO_ELV_SWITCH,	/* can't switch elevator any more */
 	QUEUE_FLAG_QOS_ENABLED,		/* qos is enabled */
 	QUEUE_FLAG_BIO_ISSUE_TIME,	/* record bio->issue_time_ns */
+	QUEUE_FLAG_ZONED_QD1_WRITES,	/* Limit zoned devices writes to QD=1 */
 	QUEUE_FLAG_MAX
 };
···
 	test_bit(QUEUE_FLAG_DISABLE_WBT_DEF, &(q)->queue_flags)
 #define blk_queue_no_elv_switch(q)	\
 	test_bit(QUEUE_FLAG_NO_ELV_SWITCH, &(q)->queue_flags)
+#define blk_queue_zoned_qd1_writes(q)	\
+	test_bit(QUEUE_FLAG_ZONED_QD1_WRITES, &(q)->queue_flags)
 
 extern void blk_set_pm_only(struct request_queue *q);
 extern void blk_clear_pm_only(struct request_queue *q);
···