Merge tag 'for-linus-20190112' of git://git.kernel.dk/linux-block

tjh.dev / kernel

Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

kernel os linux

Pull block fixes from Jens Axboe:

- NVMe pull request from Christoph, with little fixes all over the map

- Loop caching fix for offset/bs change (Jaegeuk Kim)

- Block documentation tweaks (Jeff, Jon, Weiping, John)

- null_blk zoned tweak (John)

- ahci mvebu suspend/resume support. Should have gone into the merge
  window, but there was some confusion on which tree had it. (Miquel)

* tag 'for-linus-20190112' of git://git.kernel.dk/linux-block: (22 commits)
ata: ahci: mvebu: request PHY suspend/resume for Armada 3700
ata: ahci: mvebu: add Armada 3700 initialization needed for S2RAM
ata: ahci: mvebu: do Armada 38x configuration only on relevant SoCs
ata: ahci: mvebu: remove stale comment
ata: libahci_platform: comply to PHY framework
loop: drop caches if offset or block_size are changed
block: fix kerneldoc comment for blk_attempt_plug_merge()
nvme: don't initlialize ctrl->cntlid twice
nvme: introduce NVME_QUIRK_IGNORE_DEV_SUBNQN
nvme: pad fake subsys NQN vid and ssvid with zeros
nvme-multipath: zero out ANA log buffer
nvme-fabrics: unset write/poll queues for discovery controllers
nvme-tcp: don't ask if controller is fabrics
nvme-tcp: remove dead code
nvme-pci: fix out of bounds access in nvme_cqe_pending
nvme-pci: rerun irq setup on IO queue init errors
nvme-pci: use the same attributes when freeing host_mem_desc_bufs.
nvme-pci: fix the wrong setting of nr_maps
block: doc: add slice_idle_us to bfq documentation
block: clarify documentation for blk_{start|finish}_plug
...

Linus Torvalds 7 years ago b8c3b899 66c56cfa

+229 -66

16 changed files


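Documentation/ABI/testing/sysfs-block
Documentation/block/bfq-iosched.txt
Documentation/block/null_blk.txt
Documentation/block/queue-sysfs.txt
block/blk-core.c
drivers/ata/ahci.h
drivers/ata/ahci_mvebu.c
drivers/ata/libahci_platform.c
drivers/block/loop.c
drivers/block/null_blk.h
drivers/nvme/host/core.c
drivers/nvme/host/fabrics.c
drivers/nvme/host/multipath.c
drivers/nvme/host/nvme.h
drivers/nvme/host/pci.c
drivers/nvme/host/tcp.c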

Documentation/ABI/testing/sysfs-block

···
 		size in 512B sectors of the zones of the device, with
 		the eventual exception of the last zone of the device
 		which may be smaller.
+
+What:		/sys/block/<disk>/queue/io_timeout
+Date:		November 2018
+Contact:	Weiping Zhang <zhangweiping@didiglobal.com>
+Description:
+		io_timeout is the request timeout in milliseconds. If a request
+		does not complete in this time then the block driver timeout
+		handler is invoked. That timeout handler can decide to retry
+		the request, to fail it or to start a device recovery strategy.
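The io_timeout attribute documented above is an ordinary sysfs file, so userspace can locate and parse it with plain string handling. The following is a minimal sketch, not part of the patch: the helper names `io_timeout_path` and `parse_io_timeout_ms` are hypothetical, and the assumption that the file holds a single decimal number of milliseconds followed by a newline is taken from the description text rather than verified against a running kernel.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper: build "/sys/block/<disk>/queue/io_timeout" in buf.
 * Returns the path length on success, -1 if the path does not fit.
 */
int io_timeout_path(const char *disk, char *buf, size_t len)
{
	int n = snprintf(buf, len, "/sys/block/%s/queue/io_timeout", disk);

	if (n < 0 || (size_t)n >= len)
		return -1;
	return n;
}

/*
 * Hypothetical helper: parse the attribute contents, assumed to be a
 * decimal millisecond count with an optional trailing newline.
 * Returns the timeout in ms, or -1 on malformed input.
 */
long parse_io_timeout_ms(const char *text)
{
	char *end;
	long ms;

	errno = 0;
	ms = strtol(text, &end, 10);
	if (errno || end == text || ms < 0)
		return -1;
	if (*end != '\0' && *end != '\n')
		return -1;
	return ms;
}
```

A caller would typically open the path returned by `io_timeout_path("sda", ...)`, read one line, and hand it to `parse_io_timeout_ms()`; writing a new value back requires appropriate privileges because the attribute is read-write.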

Documentation/block/bfq-iosched.txt

···
 than maximum throughput. In these cases, consider setting the
 strict_guarantees parameter.
 
+slice_idle_us
+-------------
+
+Controls the same tuning parameter as slice_idle, but in microseconds.
+Either tunable can be used to set idling behavior. Afterwards, the
+other tunable will reflect the newly set value in sysfs.
+
 strict_guarantees
 -----------------
 

+2 -1

Documentation/block/null_blk.txt

···
 
 zoned=[0/1]: Default: 0
   0: Block device is exposed as a random-access block device.
-  1: Block device is exposed as a host-managed zoned block device.
+  1: Block device is exposed as a host-managed zoned block device. Requires
+     CONFIG_BLK_DEV_ZONED.
 
 zone_size=[MB]: Default: 256
   Per zone size when exposed as a zoned block device. Must be a power of two.

Documentation/block/queue-sysfs.txt

···
 IO to sleep for this amount of microseconds before entering classic
 polling.
 
+io_timeout (RW)
+---------------
+io_timeout is the request timeout in milliseconds. If a request does not
+complete in this time then the block driver timeout handler is invoked.
+That timeout handler can decide to retry the request, to fail it or to start
+a device recovery strategy.
+
 iostats (RW)
 -------------
 This file is used to control (on/off) the iostats accounting of the

+19 -1

block/blk-core.c

···
  * blk_attempt_plug_merge - try to merge with %current's plugged list
  * @q: request_queue new bio is being queued at
  * @bio: new bio being queued
- * @request_count: out parameter for number of traversed plugged requests
  * @same_queue_rq: pointer to &struct request that gets filled in when
  *    another request associated with @q is found on the plug list
  *    (optional, may be %NULL)
···
  * @plug:	The &struct blk_plug that needs to be initialized
  *
  * Description:
+ *   blk_start_plug() indicates to the block layer an intent by the caller
+ *   to submit multiple I/O requests in a batch. The block layer may use
+ *   this hint to defer submitting I/Os from the caller until blk_finish_plug()
+ *   is called. However, the block layer may choose to submit requests
+ *   before a call to blk_finish_plug() if the number of queued I/Os
+ *   exceeds %BLK_MAX_REQUEST_COUNT, or if the size of the I/O is larger than
+ *   %BLK_PLUG_FLUSH_SIZE. The queued I/Os may also be submitted early if
+ *   the task schedules (see below).
+ *
  *   Tracking blk_plug inside the task_struct will help with auto-flushing the
  *   pending I/O should the task end up blocking between blk_start_plug() and
  *   blk_finish_plug(). This is important from a performance perspective, but
···
 		blk_mq_flush_plug_list(plug, from_schedule);
 }
 
+/**
+ * blk_finish_plug - mark the end of a batch of submitted I/O
+ * @plug:	The &struct blk_plug passed to blk_start_plug()
+ *
+ * Description:
+ * Indicate that a batch of I/O submissions is complete. This function
+ * must be paired with an initial call to blk_start_plug(). The intent
+ * is to allow the block layer to optimize I/O submission. See the
+ * documentation for blk_start_plug() for more information.
+ */
 void blk_finish_plug(struct blk_plug *plug)
 {
 	if (plug != current->plug)
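The batching behaviour that the new blk_start_plug() kernel-doc describes can be illustrated with a small userspace model. This is only a sketch under stated assumptions: the `plug_model_*` names are hypothetical, the threshold values (16 requests, 128 KiB) are assumed stand-ins for %BLK_MAX_REQUEST_COUNT and %BLK_PLUG_FLUSH_SIZE, and the exact comparison the kernel uses may differ. It shows requests being held back until either a limit is hit or the plug is finished.

```c
#include <stddef.h>

/* Assumed stand-ins for the limits named in the kernel-doc above. */
#define MODEL_MAX_REQUEST_COUNT 16
#define MODEL_PLUG_FLUSH_SIZE (128 * 1024)

/* Toy plug: counts held-back requests and how often they were submitted. */
struct plug_model {
	unsigned int queued;   /* requests currently deferred */
	unsigned int flushes;  /* number of batch submissions so far */
};

/* Submit everything that is currently deferred, if anything. */
static void plug_model_flush(struct plug_model *p)
{
	if (p->queued) {
		p->queued = 0;
		p->flushes++;
	}
}

/* Models blk_start_plug(): begin a fresh batch. */
void plug_model_start(struct plug_model *p)
{
	p->queued = 0;
	p->flushes = 0;
}

/* Queue one request; flush early if a count or size limit is reached. */
void plug_model_submit(struct plug_model *p, size_t bytes)
{
	p->queued++;
	if (p->queued >= MODEL_MAX_REQUEST_COUNT ||
	    bytes >= MODEL_PLUG_FLUSH_SIZE)
		plug_model_flush(p);
}

/* Models blk_finish_plug(): submit whatever is still deferred. */
void plug_model_finish(struct plug_model *p)
{
	plug_model_flush(p);
}
```

The model leaves out the third early-flush trigger mentioned in the comment, the task scheduling away, because that depends on scheduler hooks that have no userspace equivalent.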

drivers/ata/ahci.h

···
 	AHCI_HFLAG_IS_MOBILE		= (1 << 25), /* mobile chipset, use
 						        SATA_MOBILE_LPM_POLICY
 						        as default lpm_policy */
+	AHCI_HFLAG_SUSPEND_PHYS		= (1 << 26), /* handle PHYs during
+						        suspend/resume */
 
 	/* ap->flags bits */
 

+64 -23

drivers/ata/ahci_mvebu.c

···
 #define AHCI_WINDOW_BASE(win)	(0x64 + ((win) << 4))
 #define AHCI_WINDOW_SIZE(win)	(0x68 + ((win) << 4))
 
+struct ahci_mvebu_plat_data {
+	int (*plat_config)(struct ahci_host_priv *hpriv);
+	unsigned int flags;
+};
+
 static void ahci_mvebu_mbus_config(struct ahci_host_priv *hpriv,
 				   const struct mbus_dram_target_info *dram)
 {
···
 	 */
 	writel(0x4, hpriv->mmio + AHCI_VENDOR_SPECIFIC_0_ADDR);
 	writel(0x80, hpriv->mmio + AHCI_VENDOR_SPECIFIC_0_DATA);
+}
+
+static int ahci_mvebu_armada_380_config(struct ahci_host_priv *hpriv)
+{
+	const struct mbus_dram_target_info *dram;
+	int rc = 0;
+
+	dram = mv_mbus_dram_info();
+	if (dram)
+		ahci_mvebu_mbus_config(hpriv, dram);
+	else
+		rc = -ENODEV;
+
+	ahci_mvebu_regret_option(hpriv);
+
+	return rc;
+}
+
+static int ahci_mvebu_armada_3700_config(struct ahci_host_priv *hpriv)
+{
+	u32 reg;
+
+	writel(0, hpriv->mmio + AHCI_VENDOR_SPECIFIC_0_ADDR);
+
+	reg = readl(hpriv->mmio + AHCI_VENDOR_SPECIFIC_0_DATA);
+	reg |= BIT(6);
+	writel(reg, hpriv->mmio + AHCI_VENDOR_SPECIFIC_0_DATA);
+
+	return 0;
 }
 
 /**
···
 {
 	struct ata_host *host = platform_get_drvdata(pdev);
 	struct ahci_host_priv *hpriv = host->private_data;
-	const struct mbus_dram_target_info *dram;
+	const struct ahci_mvebu_plat_data *pdata = hpriv->plat_data;
 
-	dram = mv_mbus_dram_info();
-	if (dram)
-		ahci_mvebu_mbus_config(hpriv, dram);
-
-	ahci_mvebu_regret_option(hpriv);
+	pdata->plat_config(hpriv);
 
 	return ahci_platform_resume_host(&pdev->dev);
 }
···
 
 static int ahci_mvebu_probe(struct platform_device *pdev)
 {
+	const struct ahci_mvebu_plat_data *pdata;
 	struct ahci_host_priv *hpriv;
-	const struct mbus_dram_target_info *dram;
 	int rc;
+
+	pdata = of_device_get_match_data(&pdev->dev);
+	if (!pdata)
+		return -EINVAL;
 
 	hpriv = ahci_platform_get_resources(pdev, 0);
 	if (IS_ERR(hpriv))
 		return PTR_ERR(hpriv);
+
+	hpriv->flags |= pdata->flags;
+	hpriv->plat_data = (void *)pdata;
 
 	rc = ahci_platform_enable_resources(hpriv);
 	if (rc)
···
 
 	hpriv->stop_engine = ahci_mvebu_stop_engine;
 
-	if (of_device_is_compatible(pdev->dev.of_node,
-				    "marvell,armada-380-ahci")) {
-		dram = mv_mbus_dram_info();
-		if (!dram)
-			return -ENODEV;
-
-		ahci_mvebu_mbus_config(hpriv, dram);
-		ahci_mvebu_regret_option(hpriv);
-	}
+	rc = pdata->plat_config(hpriv);
+	if (rc)
+		goto disable_resources;
 
 	rc = ahci_platform_init_host(pdev, hpriv, &ahci_mvebu_port_info,
 				     &ahci_platform_sht);
···
 	return rc;
 }
 
+static const struct ahci_mvebu_plat_data ahci_mvebu_armada_380_plat_data = {
+	.plat_config = ahci_mvebu_armada_380_config,
+};
+
+static const struct ahci_mvebu_plat_data ahci_mvebu_armada_3700_plat_data = {
+	.plat_config = ahci_mvebu_armada_3700_config,
+	.flags = AHCI_HFLAG_SUSPEND_PHYS,
+};
+
 static const struct of_device_id ahci_mvebu_of_match[] = {
-	{ .compatible = "marvell,armada-380-ahci", },
-	{ .compatible = "marvell,armada-3700-ahci", },
+	{
+		.compatible = "marvell,armada-380-ahci",
+		.data = &ahci_mvebu_armada_380_plat_data,
+	},
+	{
+		.compatible = "marvell,armada-3700-ahci",
+		.data = &ahci_mvebu_armada_3700_plat_data,
+	},
 	{ },
 };
 MODULE_DEVICE_TABLE(of, ahci_mvebu_of_match);
 
-/*
- * We currently don't provide power management related operations,
- * since there is no suspend/resume support at the platform level for
- * Armada 38x for the moment.
- */
 static struct platform_driver ahci_mvebu_driver = {
 	.probe = ahci_mvebu_probe,
 	.remove = ata_platform_remove_one,
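The ahci_mvebu change replaces an explicit of_device_is_compatible() check with per-SoC data attached to the match table, so probe and resume both call one plat_config hook. A userspace sketch of that pattern follows. Everything prefixed `model_` or `MODEL_` and the `config_380`/`config_3700` stubs are hypothetical stand-ins; the register writes are reduced to operations on a plain int array and do not reproduce the driver's real MMIO sequence.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in for AHCI_HFLAG_SUSPEND_PHYS; the bit position mirrors the diff. */
#define MODEL_FLAG_SUSPEND_PHYS (1u << 26)

/* Per-SoC data: a setup hook plus host flags, as in ahci_mvebu_plat_data. */
struct plat_data_model {
	int (*plat_config)(int *regs);
	unsigned int flags;
};

/* One row of the match table: compatible string and its data pointer. */
struct match_model {
	const char *compatible;
	const struct plat_data_model *data;
};

/* Stub hooks: each SoC touches the fake register differently. */
static int config_380(int *regs)
{
	regs[0] = 0x80;
	return 0;
}

static int config_3700(int *regs)
{
	regs[0] |= 1 << 6;
	return 0;
}

static const struct plat_data_model data_380 = {
	.plat_config = config_380,
};

static const struct plat_data_model data_3700 = {
	.plat_config = config_3700,
	.flags = MODEL_FLAG_SUSPEND_PHYS,
};

static const struct match_model match_table[] = {
	{ "marvell,armada-380-ahci", &data_380 },
	{ "marvell,armada-3700-ahci", &data_3700 },
	{ NULL, NULL },
};

/* Rough analogue of of_device_get_match_data(): NULL when nothing matches. */
const struct plat_data_model *model_get_match_data(const char *compatible)
{
	const struct match_model *m;

	for (m = match_table; m->compatible; m++)
		if (strcmp(m->compatible, compatible) == 0)
			return m->data;
	return NULL;
}
```

The design benefit visible even in the model is that adding a third SoC means adding one table row and one hook, with no new branches in the probe or resume paths.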

+13

drivers/ata/libahci_platform.c

···
 		if (rc)
 			goto disable_phys;
 
+		rc = phy_set_mode(hpriv->phys[i], PHY_MODE_SATA);
+		if (rc) {
+			phy_exit(hpriv->phys[i]);
+			goto disable_phys;
+		}
+
 		rc = phy_power_on(hpriv->phys[i]);
 		if (rc) {
 			phy_exit(hpriv->phys[i]);
···
 	writel(ctl, mmio + HOST_CTL);
 	readl(mmio + HOST_CTL); /* flush */
 
+	if (hpriv->flags & AHCI_HFLAG_SUSPEND_PHYS)
+		ahci_platform_disable_phys(hpriv);
+
 	return ata_host_suspend(host, PMSG_SUSPEND);
 }
 EXPORT_SYMBOL_GPL(ahci_platform_suspend_host);
···
 int ahci_platform_resume_host(struct device *dev)
 {
 	struct ata_host *host = dev_get_drvdata(dev);
+	struct ahci_host_priv *hpriv = host->private_data;
 	int rc;
 
 	if (dev->power.power_state.event == PM_EVENT_SUSPEND) {
···
 
 		ahci_init_controller(host);
 	}
+
+	if (hpriv->flags & AHCI_HFLAG_SUSPEND_PHYS)
+		ahci_platform_enable_phys(hpriv);
 
 	ata_host_resume(host);
 

+33 -2

drivers/block/loop.c

···
 		goto out_unlock;
 	}
 
+	if (lo->lo_offset != info->lo_offset ||
+	    lo->lo_sizelimit != info->lo_sizelimit) {
+		sync_blockdev(lo->lo_device);
+		kill_bdev(lo->lo_device);
+	}
+
 	/* I/O need to be drained during transfer transition */
 	blk_mq_freeze_queue(lo->lo_queue);
 
···
 
 	if (lo->lo_offset != info->lo_offset ||
 	    lo->lo_sizelimit != info->lo_sizelimit) {
+		/* kill_bdev should have truncated all the pages */
+		if (lo->lo_device->bd_inode->i_mapping->nrpages) {
+			err = -EAGAIN;
+			pr_warn("%s: loop%d (%s) has still dirty pages (nrpages=%lu)\n",
+				__func__, lo->lo_number, lo->lo_file_name,
+				lo->lo_device->bd_inode->i_mapping->nrpages);
+			goto out_unfreeze;
+		}
 		if (figure_loop_size(lo, info->lo_offset, info->lo_sizelimit)) {
 			err = -EFBIG;
 			goto out_unfreeze;
···
 
 static int loop_set_block_size(struct loop_device *lo, unsigned long arg)
 {
+	int err = 0;
+
 	if (lo->lo_state != Lo_bound)
 		return -ENXIO;
 
 	if (arg < 512 || arg > PAGE_SIZE || !is_power_of_2(arg))
 		return -EINVAL;
 
+	if (lo->lo_queue->limits.logical_block_size != arg) {
+		sync_blockdev(lo->lo_device);
+		kill_bdev(lo->lo_device);
+	}
+
 	blk_mq_freeze_queue(lo->lo_queue);
+
+	/* kill_bdev should have truncated all the pages */
+	if (lo->lo_queue->limits.logical_block_size != arg &&
+			lo->lo_device->bd_inode->i_mapping->nrpages) {
+		err = -EAGAIN;
+		pr_warn("%s: loop%d (%s) has still dirty pages (nrpages=%lu)\n",
+			__func__, lo->lo_number, lo->lo_file_name,
+			lo->lo_device->bd_inode->i_mapping->nrpages);
+		goto out_unfreeze;
+	}
 
 	blk_queue_logical_block_size(lo->lo_queue, arg);
 	blk_queue_physical_block_size(lo->lo_queue, arg);
 	blk_queue_io_min(lo->lo_queue, arg);
 	loop_update_dio(lo);
-
+out_unfreeze:
 	blk_mq_unfreeze_queue(lo->lo_queue);
 
-	return 0;
+	return err;
 }
 
 static int lo_simple_ioctl(struct loop_device *lo, unsigned int cmd,
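The two checks at the top of loop_set_block_size() can be exercised in isolation: the size must be a power of two between 512 and PAGE_SIZE, and the cache flush is only needed when the size actually changes. The sketch below is a userspace approximation; `MODEL_PAGE_SIZE` assumes 4 KiB pages, and the function names are hypothetical rather than kernel API.

```c
#include <stdbool.h>

/* Assumption: 4 KiB pages. The kernel uses the architecture's PAGE_SIZE. */
#define MODEL_PAGE_SIZE 4096UL

/* Userspace equivalent of the kernel's is_power_of_2() helper. */
static bool model_is_power_of_2(unsigned long n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/* Mirrors the -EINVAL check: 512 <= arg <= PAGE_SIZE and a power of two. */
bool loop_block_size_valid(unsigned long arg)
{
	if (arg < 512 || arg > MODEL_PAGE_SIZE || !model_is_power_of_2(arg))
		return false;
	return true;
}

/*
 * Mirrors the new condition guarding sync_blockdev()/kill_bdev():
 * cached pages are only dropped when the block size really changes.
 */
bool loop_block_size_needs_cache_drop(unsigned long current_size,
				      unsigned long arg)
{
	return loop_block_size_valid(arg) && current_size != arg;
}
```

Keeping the cache drop conditional means that re-applying the current block size stays cheap and cannot fail with -EAGAIN.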

drivers/block/null_blk.h

···
 #else
 static inline int null_zone_init(struct nullb_device *dev)
 {
+	pr_err("null_blk: CONFIG_BLK_DEV_ZONED not enabled\n");
 	return -EINVAL;
 }
 static inline void null_zone_exit(struct nullb_device *dev) {}

+10 -9

drivers/nvme/host/core.c

···
 	size_t nqnlen;
 	int off;
 
-	nqnlen = strnlen(id->subnqn, NVMF_NQN_SIZE);
-	if (nqnlen > 0 && nqnlen < NVMF_NQN_SIZE) {
-		strlcpy(subsys->subnqn, id->subnqn, NVMF_NQN_SIZE);
-		return;
-	}
+	if(!(ctrl->quirks & NVME_QUIRK_IGNORE_DEV_SUBNQN)) {
+		nqnlen = strnlen(id->subnqn, NVMF_NQN_SIZE);
+		if (nqnlen > 0 && nqnlen < NVMF_NQN_SIZE) {
+			strlcpy(subsys->subnqn, id->subnqn, NVMF_NQN_SIZE);
+			return;
+		}
 
-	if (ctrl->vs >= NVME_VS(1, 2, 1))
-		dev_warn(ctrl->device, "missing or invalid SUBNQN field.\n");
+		if (ctrl->vs >= NVME_VS(1, 2, 1))
+			dev_warn(ctrl->device, "missing or invalid SUBNQN field.\n");
+	}
 
 	/* Generate a "fake" NQN per Figure 254 in NVMe 1.3 + ECN 001 */
 	off = snprintf(subsys->subnqn, NVMF_NQN_SIZE,
-			"nqn.2014.08.org.nvmexpress:%4x%4x",
+			"nqn.2014.08.org.nvmexpress:%04x%04x",
 			le16_to_cpu(id->vid), le16_to_cpu(id->ssvid));
 	memcpy(subsys->subnqn + off, id->sn, sizeof(id->sn));
 	off += sizeof(id->sn);
···
 	ctrl->oaes = le32_to_cpu(id->oaes);
 	atomic_set(&ctrl->abort_limit, id->acl + 1);
 	ctrl->vwc = id->vwc;
-	ctrl->cntlid = le16_to_cpup(&id->cntlid);
 	if (id->mdts)
 		max_hw_sectors = 1 << (id->mdts + page_shift - 9);
 	else
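The one-character format change in the fake NQN matters because `%4x` pads short values with spaces while `%04x` pads with zeros, so a vendor ID below 0x1000 used to put blanks into the generated name. The following sketch reproduces both formats in userspace; `fake_nqn_prefix` is a hypothetical helper, and the IDs passed in a caller are arbitrary example values, not data from the patch.

```c
#include <stdio.h>

/*
 * Hypothetical helper: format the vid/ssvid part of the fake NQN using
 * either the old space-padded format or the new zero-padded one.
 * Returns the number of characters snprintf() would have written.
 */
int fake_nqn_prefix(char *buf, size_t len, unsigned int vid,
		    unsigned int ssvid, int zero_pad)
{
	if (zero_pad)
		return snprintf(buf, len,
				"nqn.2014.08.org.nvmexpress:%04x%04x",
				vid, ssvid);

	return snprintf(buf, len,
			"nqn.2014.08.org.nvmexpress:%4x%4x",
			vid, ssvid);
}
```

Both variants produce strings of the same length, which is why the bug did not show up as a truncation; only the padding characters differ.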

drivers/nvme/host/fabrics.c

···
 	if (opts->discovery_nqn) {
 		opts->kato = 0;
 		opts->nr_io_queues = 0;
+		opts->nr_write_queues = 0;
+		opts->nr_poll_queues = 0;
 		opts->duplicate_connect = true;
 	}
 	if (ctrl_loss_tmo < 0)

drivers/nvme/host/multipath.c

···
 	return 0;
 out_free_ana_log_buf:
 	kfree(ctrl->ana_log_buf);
+	ctrl->ana_log_buf = NULL;
 out:
 	return error;
 }
···
 void nvme_mpath_uninit(struct nvme_ctrl *ctrl)
 {
 	kfree(ctrl->ana_log_buf);
+	ctrl->ana_log_buf = NULL;
 }
 
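Clearing the pointer right after freeing it, as both multipath.c hunks now do, makes a repeated teardown harmless, because freeing NULL is a no-op. A userspace sketch of the same idiom follows, using malloc()/free() in place of the kernel allocator; `struct ctrl_model` and its functions are hypothetical, and the presumed motivation (avoiding a double free when teardown runs again after a failed init) is inferred from the diff rather than stated in it.

```c
#include <stdlib.h>

/* Minimal stand-in for the controller field touched by the patch. */
struct ctrl_model {
	void *ana_log_buf;
};

/* Allocate the buffer; returns 0 on success, -1 on allocation failure. */
int ctrl_model_init(struct ctrl_model *c, size_t size)
{
	c->ana_log_buf = malloc(size);
	return c->ana_log_buf ? 0 : -1;
}

/*
 * Free and clear. Because the pointer is reset to NULL, calling this a
 * second time degenerates to free(NULL), which is defined to do nothing.
 */
void ctrl_model_uninit(struct ctrl_model *c)
{
	free(c->ana_log_buf);
	c->ana_log_buf = NULL;
}
```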

drivers/nvme/host/nvme.h

···
 	 * Set MEDIUM priority on SQ creation
 	 */
 	NVME_QUIRK_MEDIUM_PRIO_SQ		= (1 << 7),
+
+	/*
+	 * Ignore device provided subnqn.
+	 */
+	NVME_QUIRK_IGNORE_DEV_SUBNQN		= (1 << 8),
 };
 
 /*

+47 -20

drivers/nvme/host/pci.c

···
 struct nvme_queue;
 
 static void nvme_dev_disable(struct nvme_dev *dev, bool shutdown);
+static bool __nvme_disable_io_queues(struct nvme_dev *dev, u8 opcode);
 
 /*
  * Represents an NVM Express device. Each nvme_dev is a PCI function.
···
 
 static inline void nvme_update_cq_head(struct nvme_queue *nvmeq)
 {
-	if (++nvmeq->cq_head == nvmeq->q_depth) {
+	if (nvmeq->cq_head == nvmeq->q_depth - 1) {
 		nvmeq->cq_head = 0;
 		nvmeq->cq_phase = !nvmeq->cq_phase;
+	} else {
+		nvmeq->cq_head++;
 	}
 }
 
···
 	pci_free_irq(to_pci_dev(nvmeq->dev->dev), nvmeq->cq_vector, nvmeq);
 	nvmeq->cq_vector = -1;
 	return 0;
+}
+
+static void nvme_suspend_io_queues(struct nvme_dev *dev)
+{
+	int i;
+
+	for (i = dev->ctrl.queue_count - 1; i > 0; i--)
+		nvme_suspend_queue(&dev->queues[i]);
 }
 
 static void nvme_disable_admin_queue(struct nvme_dev *dev, bool shutdown)
···
 		struct nvme_host_mem_buf_desc *desc = &dev->host_mem_descs[i];
 		size_t size = le32_to_cpu(desc->size) * dev->ctrl.page_size;
 
-		dma_free_coherent(dev->dev, size, dev->host_mem_desc_bufs[i],
-				le64_to_cpu(desc->addr));
+		dma_free_attrs(dev->dev, size, dev->host_mem_desc_bufs[i],
+			       le64_to_cpu(desc->addr),
+			       DMA_ATTR_NO_KERNEL_MAPPING | DMA_ATTR_NO_WARN);
 	}
 
 	kfree(dev->host_mem_desc_bufs);
···
 	while (--i >= 0) {
 		size_t size = le32_to_cpu(descs[i].size) * dev->ctrl.page_size;
 
-		dma_free_coherent(dev->dev, size, bufs[i],
-				le64_to_cpu(descs[i].addr));
+		dma_free_attrs(dev->dev, size, bufs[i],
+			       le64_to_cpu(descs[i].addr),
+			       DMA_ATTR_NO_KERNEL_MAPPING | DMA_ATTR_NO_WARN);
 	}
 
 	kfree(bufs);
···
 	return result;
 }
 
+static void nvme_disable_io_queues(struct nvme_dev *dev)
+{
+	if (__nvme_disable_io_queues(dev, nvme_admin_delete_sq))
+		__nvme_disable_io_queues(dev, nvme_admin_delete_cq);
+}
+
 static int nvme_setup_io_queues(struct nvme_dev *dev)
 {
 	struct nvme_queue *adminq = &dev->queues[0];
···
 	} while (1);
 	adminq->q_db = dev->dbs;
 
+ retry:
 	/* Deregister the admin queue's interrupt */
 	pci_free_irq(pdev, 0, adminq);
 
···
 	result = max(result - 1, 1);
 	dev->max_qid = result + dev->io_queues[HCTX_TYPE_POLL];
 
-	dev_info(dev->ctrl.device, "%d/%d/%d default/read/poll queues\n",
-					dev->io_queues[HCTX_TYPE_DEFAULT],
-					dev->io_queues[HCTX_TYPE_READ],
-					dev->io_queues[HCTX_TYPE_POLL]);
-
 	/*
 	 * Should investigate if there's a performance win from allocating
 	 * more queues than interrupt vectors; it might allow the submission
 	 * path to scale better, even if the receive path is limited by the
 	 * number of interrupts.
 	 */
-
 	result = queue_request_irq(adminq);
 	if (result) {
 		adminq->cq_vector = -1;
 		return result;
 	}
 	set_bit(NVMEQ_ENABLED, &adminq->flags);
-	return nvme_create_io_queues(dev);
+
+	result = nvme_create_io_queues(dev);
+	if (result || dev->online_queues < 2)
+		return result;
+
+	if (dev->online_queues - 1 < dev->max_qid) {
+		nr_io_queues = dev->online_queues - 1;
+		nvme_disable_io_queues(dev);
+		nvme_suspend_io_queues(dev);
+		goto retry;
+	}
+	dev_info(dev->ctrl.device, "%d/%d/%d default/read/poll queues\n",
+					dev->io_queues[HCTX_TYPE_DEFAULT],
+					dev->io_queues[HCTX_TYPE_READ],
+					dev->io_queues[HCTX_TYPE_POLL]);
+	return 0;
 }
 
 static void nvme_del_queue_end(struct request *req, blk_status_t error)
···
 	return 0;
 }
 
-static bool nvme_disable_io_queues(struct nvme_dev *dev, u8 opcode)
+static bool __nvme_disable_io_queues(struct nvme_dev *dev, u8 opcode)
 {
 	int nr_queues = dev->online_queues - 1, sent = 0;
 	unsigned long timeout;
···
 		dev->tagset.nr_maps = 2; /* default + read */
 		if (dev->io_queues[HCTX_TYPE_POLL])
 			dev->tagset.nr_maps++;
-		dev->tagset.nr_maps = HCTX_MAX_TYPES;
 		dev->tagset.timeout = NVME_IO_TIMEOUT;
 		dev->tagset.numa_node = dev_to_node(dev->dev);
 		dev->tagset.queue_depth =
···
 
 static void nvme_dev_disable(struct nvme_dev *dev, bool shutdown)
 {
-	int i;
 	bool dead = true;
 	struct pci_dev *pdev = to_pci_dev(dev->dev);
 
···
 	nvme_stop_queues(&dev->ctrl);
 
 	if (!dead && dev->ctrl.queue_count > 0) {
-		if (nvme_disable_io_queues(dev, nvme_admin_delete_sq))
-			nvme_disable_io_queues(dev, nvme_admin_delete_cq);
+		nvme_disable_io_queues(dev);
 		nvme_disable_admin_queue(dev, shutdown);
 	}
-	for (i = dev->ctrl.queue_count - 1; i >= 0; i--)
-		nvme_suspend_queue(&dev->queues[i]);
-
+	nvme_suspend_io_queues(dev);
+	nvme_suspend_queue(&dev->queues[0]);
 	nvme_pci_disable(dev);
 
 	blk_mq_tagset_busy_iter(&dev->tagset, nvme_cancel_request, &dev->ctrl);
···
 	{ PCI_VDEVICE(INTEL, 0xf1a5),	/* Intel 600P/P3100 */
 		.driver_data = NVME_QUIRK_NO_DEEPEST_PS |
 				NVME_QUIRK_MEDIUM_PRIO_SQ },
+	{ PCI_VDEVICE(INTEL, 0xf1a6),	/* Intel 760p/Pro 7600p */
+		.driver_data = NVME_QUIRK_IGNORE_DEV_SUBNQN, },
 	{ PCI_VDEVICE(INTEL, 0x5845),	/* Qemu emulated controller */
 		.driver_data = NVME_QUIRK_IDENTIFY_CNS, },
 	{ PCI_DEVICE(0x1bb1, 0x0100),   /* Seagate Nytro Flash Storage */
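The nvme_update_cq_head() hunk above changes when the head index is written. The old code incremented cq_head first and only then compared it with q_depth, so the field briefly held an out-of-range value; the new code decides before writing, so the stored index always stays below q_depth. The userspace model below mirrors the new logic. The `cq_model` names are hypothetical, and the link to the out-of-bounds access in nvme_cqe_pending (another context reading cq_head while it transiently equals q_depth) is an inference from the commit subject, not something this diff states.

```c
#include <stdbool.h>
#include <stdint.h>

/* Minimal stand-in for the completion-queue fields used by the hunk. */
struct cq_model {
	uint16_t q_depth;   /* number of entries in the ring */
	uint16_t cq_head;   /* next entry to consume */
	bool cq_phase;      /* expected phase bit, flipped on every wrap */
};

/*
 * Advance the head as the patched code does: test for the last slot
 * first, so cq_head never holds the value q_depth, even momentarily.
 */
void cq_model_update_head(struct cq_model *q)
{
	if (q->cq_head == q->q_depth - 1) {
		q->cq_head = 0;
		q->cq_phase = !q->cq_phase;
	} else {
		q->cq_head++;
	}
}

/* True while cq_head is a valid index into a ring of q_depth entries. */
bool cq_model_head_in_bounds(const struct cq_model *q)
{
	return q->cq_head < q->q_depth;
}
```

Walking the model through one full lap of a four-entry ring returns the head to zero with the phase flipped, and the bounds check holds after every single step.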

+6 -10

drivers/nvme/host/tcp.c

···
 {
 	nvme_tcp_stop_io_queues(ctrl);
 	if (remove) {
-		if (ctrl->ops->flags & NVME_F_FABRICS)
-			blk_cleanup_queue(ctrl->connect_q);
+		blk_cleanup_queue(ctrl->connect_q);
 		blk_mq_free_tag_set(ctrl->tagset);
 	}
 	nvme_tcp_free_io_queues(ctrl);
···
 			goto out_free_io_queues;
 		}
 
-		if (ctrl->ops->flags & NVME_F_FABRICS) {
-			ctrl->connect_q = blk_mq_init_queue(ctrl->tagset);
-			if (IS_ERR(ctrl->connect_q)) {
-				ret = PTR_ERR(ctrl->connect_q);
-				goto out_free_tag_set;
-			}
+		ctrl->connect_q = blk_mq_init_queue(ctrl->tagset);
+		if (IS_ERR(ctrl->connect_q)) {
+			ret = PTR_ERR(ctrl->connect_q);
+			goto out_free_tag_set;
 		}
 	} else {
 		blk_mq_update_nr_hw_queues(ctrl->tagset,
···
 	return 0;
 
 out_cleanup_connect_q:
-	if (new && (ctrl->ops->flags & NVME_F_FABRICS))
+	if (new)
 		blk_cleanup_queue(ctrl->connect_q);
 out_free_tag_set:
 	if (new)
···
 {
 	nvme_tcp_stop_queue(ctrl, 0);
 	if (remove) {
-		free_opal_dev(ctrl->opal_dev);
 		blk_cleanup_queue(ctrl->admin_q);
 		blk_mq_free_tag_set(ctrl->admin_tagset);
 	}