Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull rdma updates from Jason Gunthorpe:
"A usual cycle for RDMA with a typical mix of driver and core subsystem
updates:

- Driver minor changes and bug fixes for mlx5, efa, rxe, vmw_pvrdma,
hns, usnic, qib, qedr, cxgb4, bnxt_re

- Various rtrs fixes and updates

- Bug fix for mlx4 CM emulation for virtualization scenarios where
MRA wasn't working right

- Use tracepoints instead of pr_debug in the CM code

- Scrub the locking in ucma and cma to close more syzkaller bugs

- Use tasklet_setup in the subsystem

- Revert the idea that 'destroy' operations are not allowed to fail
at the driver level. This proved unworkable from a HW perspective.

- Revise how the umem API works so drivers make fewer mistakes using
it

- XRC support for qedr

- Convert uverbs objects RWQ and MW to the new allocation scheme

- Large queue entry sizes for hns

- Use hmm_range_fault() for mlx5 On Demand Paging

- uverbs APIs to inspect the GID table instead of sysfs

- Move some of the RDMA code for building large page SGLs into
lib/scatterlist"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (191 commits)
RDMA/ucma: Fix use after free in destroy id flow
RDMA/rxe: Handle skb_clone() failure in rxe_recv.c
RDMA/rxe: Move the definitions for rxe_av.network_type to uAPI
RDMA: Explicitly pass in the dma_device to ib_register_device
lib/scatterlist: Do not limit max_segment to PAGE_ALIGNED values
IB/mlx4: Convert rej_tmout radix-tree to XArray
RDMA/rxe: Fix bug rejecting all multicast packets
RDMA/rxe: Fix skb lifetime in rxe_rcv_mcast_pkt()
RDMA/rxe: Remove duplicate entries in struct rxe_mr
IB/hfi,rdmavt,qib,opa_vnic: Update MAINTAINERS
IB/rdmavt: Fix sizeof mismatch
MAINTAINERS: CISCO VIC LOW LATENCY NIC DRIVER
RDMA/bnxt_re: Fix sizeof mismatch for allocation of pbl_tbl.
RDMA/bnxt_re: Use rdma_umem_for_each_dma_block()
RDMA/umem: Move to allocate SG table from pages
lib/scatterlist: Add support in dynamic allocation of SG table from pages
tools/testing/scatterlist: Show errors in human readable form
tools/testing/scatterlist: Rejuvenate bit-rotten test
RDMA/ipoib: Set rtnl_link_ops for ipoib interfaces
RDMA/uverbs: Expose the new GID query API to user space
...

+5220 -4710
+1
.clang-format
@@ -429,6 +429,7 @@
   - 'rbtree_postorder_for_each_entry_safe'
   - 'rdma_for_each_block'
   - 'rdma_for_each_port'
+  - 'rdma_umem_for_each_dma_block'
   - 'resource_list_for_each_entry'
   - 'resource_list_for_each_entry_safe'
   - 'rhl_for_each_entry_rcu'
-17
Documentation/ABI/stable/sysfs-class-infiniband
@@ -258,23 +258,6 @@
 		userspace ABI compatibility of umad & issm devices.


-What:		/sys/class/infiniband_cm/ucmN/ibdev
-Date:		Oct, 2005
-KernelVersion:	v2.6.14
-Contact:	linux-rdma@vger.kernel.org
-Description:
-		(RO) Display Infiniband (IB) device name
-
-
-What:		/sys/class/infiniband_cm/abi_version
-Date:		Oct, 2005
-KernelVersion:	v2.6.14
-Contact:	linux-rdma@vger.kernel.org
-Description:
-		(RO) Value is incremented if any changes are made that break
-		userspace ABI compatibility of ucm devices.
-
-
 What:		/sys/class/infiniband_verbs/uverbsN/ibdev
 What:		/sys/class/infiniband_verbs/uverbsN/abi_version
 Date:		Sept, 2005
+8 -9
MAINTAINERS
@@ -4256,7 +4256,6 @@
 CISCO VIC LOW LATENCY NIC DRIVER
 M:	Christian Benvenuti <benve@cisco.com>
 M:	Nelson Escobar <neescoba@cisco.com>
-M:	Parvi Kaustubhi <pkaustub@cisco.com>
 S:	Supported
 F:	drivers/infiniband/hw/usnic/

@@ -7793,8 +7792,8 @@
 F:	include/uapi/linux/cciss*.h

 HFI1 DRIVER
-M:	Mike Marciniszyn <mike.marciniszyn@intel.com>
-M:	Dennis Dalessandro <dennis.dalessandro@intel.com>
+M:	Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
+M:	Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
 L:	linux-rdma@vger.kernel.org
 S:	Supported
 F:	drivers/infiniband/hw/hfi1
@@ -12999,8 +12998,8 @@
 F:	drivers/char/hw_random/optee-rng.c

 OPA-VNIC DRIVER
-M:	Dennis Dalessandro <dennis.dalessandro@intel.com>
-M:	Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
+M:	Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
+M:	Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
 L:	linux-rdma@vger.kernel.org
 S:	Supported
 F:	drivers/infiniband/ulp/opa_vnic
@@ -14301,8 +14300,8 @@
 F:	include/uapi/linux/qemu_fw_cfg.h

 QIB DRIVER
-M:	Dennis Dalessandro <dennis.dalessandro@intel.com>
-M:	Mike Marciniszyn <mike.marciniszyn@intel.com>
+M:	Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
+M:	Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
 L:	linux-rdma@vger.kernel.org
 S:	Supported
 F:	drivers/infiniband/hw/qib/
@@ -14727,8 +14726,8 @@
 F:	drivers/net/ethernet/rdc/r6040.c

 RDMAVT - RDMA verbs software
-M:	Dennis Dalessandro <dennis.dalessandro@intel.com>
-M:	Mike Marciniszyn <mike.marciniszyn@intel.com>
+M:	Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
+M:	Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
 L:	linux-rdma@vger.kernel.org
 S:	Supported
 F:	drivers/infiniband/sw/rdmavt
+11 -14
drivers/gpu/drm/drm_prime.c
@@ -806,30 +806,27 @@
 struct sg_table *drm_prime_pages_to_sg(struct drm_device *dev,
 				       struct page **pages, unsigned int nr_pages)
 {
-	struct sg_table *sg = NULL;
+	struct sg_table *sg;
+	struct scatterlist *sge;
 	size_t max_segment = 0;
-	int ret;

 	sg = kmalloc(sizeof(struct sg_table), GFP_KERNEL);
-	if (!sg) {
-		ret = -ENOMEM;
-		goto out;
-	}
+	if (!sg)
+		return ERR_PTR(-ENOMEM);

 	if (dev)
 		max_segment = dma_max_mapping_size(dev->dev);
 	if (max_segment == 0 || max_segment > SCATTERLIST_MAX_SEGMENT)
 		max_segment = SCATTERLIST_MAX_SEGMENT;
-	ret = __sg_alloc_table_from_pages(sg, pages, nr_pages, 0,
+	sge = __sg_alloc_table_from_pages(sg, pages, nr_pages, 0,
 					  nr_pages << PAGE_SHIFT,
-					  max_segment, GFP_KERNEL);
-	if (ret)
-		goto out;
-
+					  max_segment,
+					  NULL, 0, GFP_KERNEL);
+	if (IS_ERR(sge)) {
+		kfree(sg);
+		sg = ERR_CAST(sge);
+	}
 	return sg;
-out:
-	kfree(sg);
-	return ERR_PTR(ret);
 }
 EXPORT_SYMBOL(drm_prime_pages_to_sg);

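The hunk above moves drm_prime_pages_to_sg() onto the kernel's error-pointer convention: __sg_alloc_table_from_pages() now returns a struct scatterlist * that carries an errno on failure, and the caller checks it with IS_ERR() and forwards it with ERR_CAST(). A minimal userspace sketch of that convention is shown here. The macros are simplified re-implementations written for illustration, not the kernel's <linux/err.h>, and struct table, alloc_entries(), build_table() and err_ptr_demo() are hypothetical names that do not appear in the commit.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Simplified userspace stand-ins for the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline int IS_ERR(const void *ptr)
{
	/* The top MAX_ERRNO addresses are reserved for encoded errnos. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}
static inline void *ERR_CAST(const void *ptr) { return (void *)ptr; }

/* Hypothetical types standing in for sg_table and scatterlist. */
struct entry { int value; };
struct table { struct entry *entries; size_t n; };

/* Hypothetical allocator: returns the last entry, or an encoded errno. */
static struct entry *alloc_entries(struct table *t, size_t n)
{
	if (n == 0)
		return ERR_PTR(-EINVAL);
	t->entries = calloc(n, sizeof(*t->entries));
	if (!t->entries)
		return ERR_PTR(-ENOMEM);
	t->n = n;
	return &t->entries[n - 1];
}

/* Same shape as the new drm_prime_pages_to_sg(): free, then forward. */
static struct table *build_table(size_t n)
{
	struct table *t;
	struct entry *last;

	t = malloc(sizeof(*t));
	if (!t)
		return ERR_PTR(-ENOMEM);

	last = alloc_entries(t, n);
	if (IS_ERR(last)) {
		free(t);
		t = ERR_CAST(last);
	}
	return t;
}

static void free_table(struct table *t)
{
	free(t->entries);
	free(t);
}

/* Small self-check covering the success path and the failure path. */
static int err_ptr_demo(void)
{
	struct table *ok = build_table(4);
	struct table *bad = build_table(0);

	assert(!IS_ERR(ok));
	assert(IS_ERR(bad) && PTR_ERR(bad) == -EINVAL);
	free_table(ok);
	return 0;
}
```

The single return path is what lets the hunk drop the `out:` label: the error value travels in the pointer itself, so no separate `ret` variable is needed.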
+6 -6
drivers/gpu/drm/i915/gem/i915_gem_userptr.c
@@ -403,6 +403,7 @@
 	unsigned int max_segment = i915_sg_segment_size();
 	struct sg_table *st;
 	unsigned int sg_page_sizes;
+	struct scatterlist *sg;
 	int ret;

 	st = kmalloc(sizeof(*st), GFP_KERNEL);
@@ -410,13 +411,12 @@
 		return ERR_PTR(-ENOMEM);

 alloc_table:
-	ret = __sg_alloc_table_from_pages(st, pvec, num_pages,
-					  0, num_pages << PAGE_SHIFT,
-					  max_segment,
-					  GFP_KERNEL);
-	if (ret) {
+	sg = __sg_alloc_table_from_pages(st, pvec, num_pages, 0,
+					 num_pages << PAGE_SHIFT, max_segment,
+					 NULL, 0, GFP_KERNEL);
+	if (IS_ERR(sg)) {
 		kfree(st);
-		return ERR_PTR(ret);
+		return ERR_CAST(sg);
 	}

 	ret = i915_gem_gtt_prepare_pages(obj, st);
+9 -6
drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c
@@ -432,6 +432,7 @@
 	int ret = 0;
 	static size_t sgl_size;
 	static size_t sgt_size;
+	struct scatterlist *sg;

 	if (vmw_tt->mapped)
 		return 0;
@@ -454,13 +455,15 @@
 		if (unlikely(ret != 0))
 			return ret;

-		ret = __sg_alloc_table_from_pages
-			(&vmw_tt->sgt, vsgt->pages, vsgt->num_pages, 0,
-			 (unsigned long) vsgt->num_pages << PAGE_SHIFT,
-			 dma_get_max_seg_size(dev_priv->dev->dev),
-			 GFP_KERNEL);
-		if (unlikely(ret != 0))
+		sg = __sg_alloc_table_from_pages(&vmw_tt->sgt, vsgt->pages,
+				vsgt->num_pages, 0,
+				(unsigned long) vsgt->num_pages << PAGE_SHIFT,
+				dma_get_max_seg_size(dev_priv->dev->dev),
+				NULL, 0, GFP_KERNEL);
+		if (IS_ERR(sg)) {
+			ret = PTR_ERR(sg);
 			goto out_sg_alloc_fail;
+		}

 		if (vsgt->num_pages > vmw_tt->sgt.orig_nents) {
 			uint64_t over_alloc =
+1
drivers/infiniband/Kconfig
@@ -48,6 +48,7 @@
 	depends on INFINIBAND_USER_MEM
 	select MMU_NOTIFIER
 	select INTERVAL_TREE
+	select HMM_MIRROR
 	default y
 	help
 	  On demand paging support for the InfiniBand subsystem.
+1 -1
drivers/infiniband/core/Makefile
@@ -17,7 +17,7 @@
 ib_core-$(CONFIG_SECURITY_INFINIBAND) += security.o
 ib_core-$(CONFIG_CGROUP_RDMA) += cgroup.o

-ib_cm-y := cm.o
+ib_cm-y := cm.o cm_trace.o

 iw_cm-y := iwcm.o iwpm_util.o iwpm_msg.o

+5 -6
drivers/infiniband/core/addr.c
@@ -647,13 +647,12 @@
 	req->callback = NULL;

 	spin_lock_bh(&lock);
+	/*
+	 * Although the work will normally have been canceled by the workqueue,
+	 * it can still be requeued as long as it is on the req_list.
+	 */
+	cancel_delayed_work(&req->work);
 	if (!list_empty(&req->list)) {
-		/*
-		 * Although the work will normally have been canceled by the
-		 * workqueue, it can still be requeued as long as it is on the
-		 * req_list.
-		 */
-		cancel_delayed_work(&req->work);
 		list_del_init(&req->list);
 		kfree(req);
 	}
+68 -4
drivers/infiniband/core/cache.c
@@ -133,7 +133,11 @@
 }

 static const char * const gid_type_str[] = {
+	/* IB/RoCE v1 value is set for IB_GID_TYPE_IB and IB_GID_TYPE_ROCE for
+	 * user space compatibility reasons.
+	 */
 	[IB_GID_TYPE_IB]	= "IB/RoCE v1",
+	[IB_GID_TYPE_ROCE]	= "IB/RoCE v1",
 	[IB_GID_TYPE_ROCE_UDP_ENCAP]	= "RoCE v2",
 };

@@ -1220,7 +1224,7 @@
 const struct ib_gid_attr *
 rdma_get_gid_attr(struct ib_device *device, u8 port_num, int index)
 {
-	const struct ib_gid_attr *attr = ERR_PTR(-EINVAL);
+	const struct ib_gid_attr *attr = ERR_PTR(-ENODATA);
 	struct ib_gid_table *table;
 	unsigned long flags;

@@ -1242,6 +1246,67 @@
 	return attr;
 }
 EXPORT_SYMBOL(rdma_get_gid_attr);
+
+/**
+ * rdma_query_gid_table - Reads GID table entries of all the ports of a device up to max_entries.
+ * @device: The device to query.
+ * @entries: Entries where GID entries are returned.
+ * @max_entries: Maximum number of entries that can be returned.
+ * Entries array must be allocated to hold max_entries number of entries.
+ * @num_entries: Updated to the number of entries that were successfully read.
+ *
+ * Returns number of entries on success or appropriate error code.
+ */
+ssize_t rdma_query_gid_table(struct ib_device *device,
+			     struct ib_uverbs_gid_entry *entries,
+			     size_t max_entries)
+{
+	const struct ib_gid_attr *gid_attr;
+	ssize_t num_entries = 0, ret;
+	struct ib_gid_table *table;
+	unsigned int port_num, i;
+	struct net_device *ndev;
+	unsigned long flags;
+
+	rdma_for_each_port(device, port_num) {
+		if (!rdma_ib_or_roce(device, port_num))
+			continue;
+
+		table = rdma_gid_table(device, port_num);
+		read_lock_irqsave(&table->rwlock, flags);
+		for (i = 0; i < table->sz; i++) {
+			if (!is_gid_entry_valid(table->data_vec[i]))
+				continue;
+			if (num_entries >= max_entries) {
+				ret = -EINVAL;
+				goto err;
+			}
+
+			gid_attr = &table->data_vec[i]->attr;
+
+			memcpy(&entries->gid, &gid_attr->gid,
+			       sizeof(gid_attr->gid));
+			entries->gid_index = gid_attr->index;
+			entries->port_num = gid_attr->port_num;
+			entries->gid_type = gid_attr->gid_type;
+			ndev = rcu_dereference_protected(
+				gid_attr->ndev,
+				lockdep_is_held(&table->rwlock));
+			if (ndev)
+				entries->netdev_ifindex = ndev->ifindex;
+
+			num_entries++;
+			entries++;
+		}
+		read_unlock_irqrestore(&table->rwlock, flags);
+	}
+
+	return num_entries;
+err:
+	read_unlock_irqrestore(&table->rwlock, flags);
+	return ret;
+}
+EXPORT_SYMBOL(rdma_query_gid_table);

 /**
  * rdma_put_gid_attr - Release reference to the GID attribute
@@ -1299,7 +1364,7 @@
 	struct ib_gid_table_entry *entry =
 			container_of(attr, struct ib_gid_table_entry, attr);
 	struct ib_device *device = entry->attr.device;
-	struct net_device *ndev = ERR_PTR(-ENODEV);
+	struct net_device *ndev = ERR_PTR(-EINVAL);
 	u8 port_num = entry->attr.port_num;
 	struct ib_gid_table *table;
 	unsigned long flags;
@@ -1311,8 +1376,7 @@
 	valid = is_gid_entry_valid(table->data_vec[attr->index]);
 	if (valid) {
 		ndev = rcu_dereference(attr->ndev);
-		if (!ndev ||
-		    (ndev && ((READ_ONCE(ndev->flags) & IFF_UP) == 0)))
+		if (!ndev)
 			ndev = ERR_PTR(-ENODEV);
 	}
 	read_unlock_irqrestore(&table->rwlock, flags);
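The new rdma_query_gid_table() in the diff above walks each port's GID table, copies every valid entry into a caller-sized array, and fails with -EINVAL instead of truncating when the array is too small. The control flow can be sketched in userspace roughly as follows. This is an illustration under stated assumptions, not the kernel function: struct gid_slot, struct gid_out, struct port_table, query_table() and query_table_demo() are hypothetical names, and the rwlock, the GID copy and the netdev lookup are left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>	/* ssize_t */

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct gid_slot {
	int valid;
	unsigned int index;
	unsigned int port;
};

struct gid_out {
	unsigned int gid_index;
	unsigned int port_num;
};

struct port_table {
	const struct gid_slot *slots;
	size_t sz;
};

/*
 * Copy every valid slot of every port into out[]. Returns the number of
 * entries copied, or -EINVAL if more than max_entries valid slots exist.
 */
static ssize_t query_table(const struct port_table *ports, size_t nports,
			   struct gid_out *out, size_t max_entries)
{
	ssize_t num_entries = 0;
	size_t p, i;

	for (p = 0; p < nports; p++) {
		for (i = 0; i < ports[p].sz; i++) {
			const struct gid_slot *slot = &ports[p].slots[i];

			if (!slot->valid)
				continue;
			if ((size_t)num_entries >= max_entries)
				return -EINVAL;

			out->gid_index = slot->index;
			out->port_num = slot->port;
			num_entries++;
			out++;
		}
	}
	return num_entries;
}

/* Usage sketch: two ports holding three valid slots in total. */
static int query_table_demo(void)
{
	static const struct gid_slot port1[] = {
		{ 1, 0, 1 }, { 0, 1, 1 }, { 1, 2, 1 },
	};
	static const struct gid_slot port2[] = { { 1, 0, 2 } };
	const struct port_table ports[] = { { port1, 3 }, { port2, 1 } };
	struct gid_out out[3];

	assert(query_table(ports, 2, out, 3) == 3);
	assert(out[1].gid_index == 2 && out[2].port_num == 2);
	assert(query_table(ports, 2, out, 2) == -EINVAL);
	return 0;
}
```

Returning an error for an undersized array, as the kernel hunk does, means a caller never mistakes a partial listing for the complete table.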
+46 -80
drivers/infiniband/core/cm.c
··· 27 27 #include <rdma/ib_cm.h> 28 28 #include "cm_msgs.h" 29 29 #include "core_priv.h" 30 + #include "cm_trace.h" 30 31 31 32 MODULE_AUTHOR("Sean Hefty"); 32 33 MODULE_DESCRIPTION("InfiniBand CM"); ··· 202 201 struct cm_port { 203 202 struct cm_device *cm_dev; 204 203 struct ib_mad_agent *mad_agent; 205 - struct kobject port_obj; 206 204 u8 port_num; 207 205 struct list_head cm_priv_prim_list; 208 206 struct list_head cm_priv_altr_list; ··· 1563 1563 cm_id_priv->local_qpn = cpu_to_be32(IBA_GET(CM_REQ_LOCAL_QPN, req_msg)); 1564 1564 cm_id_priv->rq_psn = cpu_to_be32(IBA_GET(CM_REQ_STARTING_PSN, req_msg)); 1565 1565 1566 + trace_icm_send_req(&cm_id_priv->id); 1566 1567 spin_lock_irqsave(&cm_id_priv->lock, flags); 1567 1568 ret = ib_post_send_mad(cm_id_priv->msg, NULL); 1568 1569 if (ret) { ··· 1611 1610 IBA_SET_MEM(CM_REJ_ARI, rej_msg, ari, ari_length); 1612 1611 } 1613 1612 1613 + trace_icm_issue_rej( 1614 + IBA_GET(CM_REJ_LOCAL_COMM_ID, rcv_msg), 1615 + IBA_GET(CM_REJ_REMOTE_COMM_ID, rcv_msg)); 1614 1616 ret = ib_post_send_mad(msg, NULL); 1615 1617 if (ret) 1616 1618 cm_free_msg(msg); ··· 1965 1961 } 1966 1962 spin_unlock_irq(&cm_id_priv->lock); 1967 1963 1964 + trace_icm_send_dup_req(&cm_id_priv->id); 1968 1965 ret = ib_post_send_mad(msg, NULL); 1969 1966 if (ret) 1970 1967 goto free; ··· 2129 2124 2130 2125 listen_cm_id_priv = cm_match_req(work, cm_id_priv); 2131 2126 if (!listen_cm_id_priv) { 2132 - pr_debug("%s: local_id %d, no listen_cm_id_priv\n", __func__, 2133 - be32_to_cpu(cm_id_priv->id.local_id)); 2127 + trace_icm_no_listener_err(&cm_id_priv->id); 2134 2128 cm_id_priv->id.state = IB_CM_IDLE; 2135 2129 ret = -EINVAL; 2136 2130 goto destroy; ··· 2278 2274 spin_lock_irqsave(&cm_id_priv->lock, flags); 2279 2275 if (cm_id->state != IB_CM_REQ_RCVD && 2280 2276 cm_id->state != IB_CM_MRA_REQ_SENT) { 2281 - pr_debug("%s: local_comm_id %d, cm_id->state: %d\n", __func__, 2282 - be32_to_cpu(cm_id_priv->id.local_id), cm_id->state); 2277 + 
trace_icm_send_rep_err(cm_id_priv->id.local_id, cm_id->state); 2283 2278 ret = -EINVAL; 2284 2279 goto out; 2285 2280 } ··· 2292 2289 msg->timeout_ms = cm_id_priv->timeout_ms; 2293 2290 msg->context[1] = (void *) (unsigned long) IB_CM_REP_SENT; 2294 2291 2292 + trace_icm_send_rep(cm_id); 2295 2293 ret = ib_post_send_mad(msg, NULL); 2296 2294 if (ret) { 2297 2295 spin_unlock_irqrestore(&cm_id_priv->lock, flags); ··· 2352 2348 spin_lock_irqsave(&cm_id_priv->lock, flags); 2353 2349 if (cm_id->state != IB_CM_REP_RCVD && 2354 2350 cm_id->state != IB_CM_MRA_REP_SENT) { 2355 - pr_debug("%s: local_id %d, cm_id->state %d\n", __func__, 2356 - be32_to_cpu(cm_id->local_id), cm_id->state); 2351 + trace_icm_send_cm_rtu_err(cm_id); 2357 2352 ret = -EINVAL; 2358 2353 goto error; 2359 2354 } ··· 2364 2361 cm_format_rtu((struct cm_rtu_msg *) msg->mad, cm_id_priv, 2365 2362 private_data, private_data_len); 2366 2363 2364 + trace_icm_send_rtu(cm_id); 2367 2365 ret = ib_post_send_mad(msg, NULL); 2368 2366 if (ret) { 2369 2367 spin_unlock_irqrestore(&cm_id_priv->lock, flags); ··· 2446 2442 goto unlock; 2447 2443 spin_unlock_irq(&cm_id_priv->lock); 2448 2444 2445 + trace_icm_send_dup_rep(&cm_id_priv->id); 2449 2446 ret = ib_post_send_mad(msg, NULL); 2450 2447 if (ret) 2451 2448 goto free; ··· 2470 2465 cpu_to_be32(IBA_GET(CM_REP_REMOTE_COMM_ID, rep_msg)), 0); 2471 2466 if (!cm_id_priv) { 2472 2467 cm_dup_rep_handler(work); 2473 - pr_debug("%s: remote_comm_id %d, no cm_id_priv\n", __func__, 2468 + trace_icm_remote_no_priv_err( 2474 2469 IBA_GET(CM_REP_REMOTE_COMM_ID, rep_msg)); 2475 2470 return -EINVAL; 2476 2471 } ··· 2484 2479 break; 2485 2480 default: 2486 2481 ret = -EINVAL; 2487 - pr_debug( 2488 - "%s: cm_id_priv->id.state: %d, local_comm_id %d, remote_comm_id %d\n", 2489 - __func__, cm_id_priv->id.state, 2482 + trace_icm_rep_unknown_err( 2490 2483 IBA_GET(CM_REP_LOCAL_COMM_ID, rep_msg), 2491 - IBA_GET(CM_REP_REMOTE_COMM_ID, rep_msg)); 2484 + IBA_GET(CM_REP_REMOTE_COMM_ID, rep_msg), 
2485 + cm_id_priv->id.state); 2492 2486 spin_unlock_irq(&cm_id_priv->lock); 2493 2487 goto error; 2494 2488 } ··· 2504 2500 spin_unlock(&cm.lock); 2505 2501 spin_unlock_irq(&cm_id_priv->lock); 2506 2502 ret = -EINVAL; 2507 - pr_debug("%s: Failed to insert remote id %d\n", __func__, 2503 + trace_icm_insert_failed_err( 2508 2504 IBA_GET(CM_REP_REMOTE_COMM_ID, rep_msg)); 2509 2505 goto error; 2510 2506 } ··· 2521 2517 IB_CM_REJ_STALE_CONN, CM_MSG_RESPONSE_REP, 2522 2518 NULL, 0); 2523 2519 ret = -EINVAL; 2524 - pr_debug( 2525 - "%s: Stale connection. local_comm_id %d, remote_comm_id %d\n", 2526 - __func__, IBA_GET(CM_REP_LOCAL_COMM_ID, rep_msg), 2520 + trace_icm_staleconn_err( 2521 + IBA_GET(CM_REP_LOCAL_COMM_ID, rep_msg), 2527 2522 IBA_GET(CM_REP_REMOTE_COMM_ID, rep_msg)); 2528 2523 2529 2524 if (cur_cm_id_priv) { ··· 2649 2646 return -EINVAL; 2650 2647 2651 2648 if (cm_id_priv->id.state != IB_CM_ESTABLISHED) { 2652 - pr_debug("%s: local_id %d, cm_id->state: %d\n", __func__, 2653 - be32_to_cpu(cm_id_priv->id.local_id), 2654 - cm_id_priv->id.state); 2649 + trace_icm_dreq_skipped(&cm_id_priv->id); 2655 2650 return -EINVAL; 2656 2651 } 2657 2652 ··· 2668 2667 msg->timeout_ms = cm_id_priv->timeout_ms; 2669 2668 msg->context[1] = (void *) (unsigned long) IB_CM_DREQ_SENT; 2670 2669 2670 + trace_icm_send_dreq(&cm_id_priv->id); 2671 2671 ret = ib_post_send_mad(msg, NULL); 2672 2672 if (ret) { 2673 2673 cm_enter_timewait(cm_id_priv); ··· 2724 2722 return -EINVAL; 2725 2723 2726 2724 if (cm_id_priv->id.state != IB_CM_DREQ_RCVD) { 2727 - pr_debug( 2728 - "%s: local_id %d, cm_idcm_id->state(%d) != IB_CM_DREQ_RCVD\n", 2729 - __func__, be32_to_cpu(cm_id_priv->id.local_id), 2730 - cm_id_priv->id.state); 2725 + trace_icm_send_drep_err(&cm_id_priv->id); 2731 2726 kfree(private_data); 2732 2727 return -EINVAL; 2733 2728 } ··· 2739 2740 cm_format_drep((struct cm_drep_msg *) msg->mad, cm_id_priv, 2740 2741 private_data, private_data_len); 2741 2742 2743 + 
trace_icm_send_drep(&cm_id_priv->id); 2742 2744 ret = ib_post_send_mad(msg, NULL); 2743 2745 if (ret) { 2744 2746 cm_free_msg(msg); ··· 2789 2789 IBA_SET(CM_DREP_LOCAL_COMM_ID, drep_msg, 2790 2790 IBA_GET(CM_DREQ_REMOTE_COMM_ID, dreq_msg)); 2791 2791 2792 + trace_icm_issue_drep( 2793 + IBA_GET(CM_DREQ_LOCAL_COMM_ID, dreq_msg), 2794 + IBA_GET(CM_DREQ_REMOTE_COMM_ID, dreq_msg)); 2792 2795 ret = ib_post_send_mad(msg, NULL); 2793 2796 if (ret) 2794 2797 cm_free_msg(msg); ··· 2813 2810 atomic_long_inc(&work->port->counter_group[CM_RECV_DUPLICATES]. 2814 2811 counter[CM_DREQ_COUNTER]); 2815 2812 cm_issue_drep(work->port, work->mad_recv_wc); 2816 - pr_debug( 2817 - "%s: no cm_id_priv, local_comm_id %d, remote_comm_id %d\n", 2818 - __func__, IBA_GET(CM_DREQ_LOCAL_COMM_ID, dreq_msg), 2813 + trace_icm_no_priv_err( 2814 + IBA_GET(CM_DREQ_LOCAL_COMM_ID, dreq_msg), 2819 2815 IBA_GET(CM_DREQ_REMOTE_COMM_ID, dreq_msg)); 2820 2816 return -EINVAL; 2821 2817 } ··· 2860 2858 counter[CM_DREQ_COUNTER]); 2861 2859 goto unlock; 2862 2860 default: 2863 - pr_debug("%s: local_id %d, cm_id_priv->id.state: %d\n", 2864 - __func__, be32_to_cpu(cm_id_priv->id.local_id), 2865 - cm_id_priv->id.state); 2861 + trace_icm_dreq_unknown_err(&cm_id_priv->id); 2866 2862 goto unlock; 2867 2863 } 2868 2864 cm_id_priv->id.state = IB_CM_DREQ_RCVD; ··· 2945 2945 state); 2946 2946 break; 2947 2947 default: 2948 - pr_debug("%s: local_id %d, cm_id->state: %d\n", __func__, 2949 - be32_to_cpu(cm_id_priv->id.local_id), 2950 - cm_id_priv->id.state); 2948 + trace_icm_send_unknown_rej_err(&cm_id_priv->id); 2951 2949 return -EINVAL; 2952 2950 } 2953 2951 2952 + trace_icm_send_rej(&cm_id_priv->id, reason); 2954 2953 ret = ib_post_send_mad(msg, NULL); 2955 2954 if (ret) { 2956 2955 cm_free_msg(msg); ··· 3059 3060 } 3060 3061 fallthrough; 3061 3062 default: 3062 - pr_debug("%s: local_id %d, cm_id_priv->id.state: %d\n", 3063 - __func__, be32_to_cpu(cm_id_priv->id.local_id), 3064 - cm_id_priv->id.state); 3063 + 
trace_icm_rej_unknown_err(&cm_id_priv->id); 3065 3064 spin_unlock_irq(&cm_id_priv->lock); 3066 3065 goto out; 3067 3066 } ··· 3115 3118 } 3116 3119 fallthrough; 3117 3120 default: 3118 - pr_debug("%s: local_id %d, cm_id_priv->id.state: %d\n", 3119 - __func__, be32_to_cpu(cm_id_priv->id.local_id), 3120 - cm_id_priv->id.state); 3121 + trace_icm_send_mra_unknown_err(&cm_id_priv->id); 3121 3122 ret = -EINVAL; 3122 3123 goto error1; 3123 3124 } ··· 3128 3133 cm_format_mra((struct cm_mra_msg *) msg->mad, cm_id_priv, 3129 3134 msg_response, service_timeout, 3130 3135 private_data, private_data_len); 3136 + trace_icm_send_mra(cm_id); 3131 3137 ret = ib_post_send_mad(msg, NULL); 3132 3138 if (ret) 3133 3139 goto error2; ··· 3225 3229 counter[CM_MRA_COUNTER]); 3226 3230 fallthrough; 3227 3231 default: 3228 - pr_debug("%s local_id %d, cm_id_priv->id.state: %d\n", 3229 - __func__, be32_to_cpu(cm_id_priv->id.local_id), 3230 - cm_id_priv->id.state); 3232 + trace_icm_mra_unknown_err(&cm_id_priv->id); 3231 3233 goto out; 3232 3234 } 3233 3235 ··· 3499 3505 msg->context[1] = (void *) (unsigned long) IB_CM_SIDR_REQ_SENT; 3500 3506 3501 3507 spin_lock_irqsave(&cm_id_priv->lock, flags); 3502 - if (cm_id->state == IB_CM_IDLE) 3508 + if (cm_id->state == IB_CM_IDLE) { 3509 + trace_icm_send_sidr_req(&cm_id_priv->id); 3503 3510 ret = ib_post_send_mad(msg, NULL); 3504 - else 3511 + } else { 3505 3512 ret = -EINVAL; 3513 + } 3506 3514 3507 3515 if (ret) { 3508 3516 spin_unlock_irqrestore(&cm_id_priv->lock, flags); ··· 3666 3670 3667 3671 cm_format_sidr_rep((struct cm_sidr_rep_msg *) msg->mad, cm_id_priv, 3668 3672 param); 3673 + trace_icm_send_sidr_rep(&cm_id_priv->id); 3669 3674 ret = ib_post_send_mad(msg, NULL); 3670 3675 if (ret) { 3671 3676 cm_free_msg(msg); ··· 3764 3767 if (msg != cm_id_priv->msg || state != cm_id_priv->id.state) 3765 3768 goto discard; 3766 3769 3767 - pr_debug_ratelimited("CM: failed sending MAD in state %d. 
(%s)\n", 3768 - state, ib_wc_status_msg(wc_status)); 3770 + trace_icm_mad_send_err(state, wc_status); 3769 3771 switch (state) { 3770 3772 case IB_CM_REQ_SENT: 3771 3773 case IB_CM_MRA_REQ_RCVD: ··· 3887 3891 ret = cm_timewait_handler(work); 3888 3892 break; 3889 3893 default: 3890 - pr_debug("cm_event.event: 0x%x\n", work->cm_event.event); 3894 + trace_icm_handler_err(work->cm_event.event); 3891 3895 ret = -EINVAL; 3892 3896 break; 3893 3897 } ··· 3923 3927 ret = -EISCONN; 3924 3928 break; 3925 3929 default: 3926 - pr_debug("%s: local_id %d, cm_id->state: %d\n", __func__, 3927 - be32_to_cpu(cm_id->local_id), cm_id->state); 3930 + trace_icm_establish_err(cm_id); 3928 3931 ret = -EINVAL; 3929 3932 break; 3930 3933 } ··· 4120 4125 ret = 0; 4121 4126 break; 4122 4127 default: 4123 - pr_debug("%s: local_id %d, cm_id_priv->id.state: %d\n", 4124 - __func__, be32_to_cpu(cm_id_priv->id.local_id), 4125 - cm_id_priv->id.state); 4128 + trace_icm_qp_init_err(&cm_id_priv->id); 4126 4129 ret = -EINVAL; 4127 4130 break; 4128 4131 } ··· 4168 4175 ret = 0; 4169 4176 break; 4170 4177 default: 4171 - pr_debug("%s: local_id %d, cm_id_priv->id.state: %d\n", 4172 - __func__, be32_to_cpu(cm_id_priv->id.local_id), 4173 - cm_id_priv->id.state); 4178 + trace_icm_qp_rtr_err(&cm_id_priv->id); 4174 4179 ret = -EINVAL; 4175 4180 break; 4176 4181 } ··· 4228 4237 ret = 0; 4229 4238 break; 4230 4239 default: 4231 - pr_debug("%s: local_id %d, cm_id_priv->id.state: %d\n", 4232 - __func__, be32_to_cpu(cm_id_priv->id.local_id), 4233 - cm_id_priv->id.state); 4240 + trace_icm_qp_rts_err(&cm_id_priv->id); 4234 4241 ret = -EINVAL; 4235 4242 break; 4236 4243 } ··· 4283 4294 .sysfs_ops = &cm_counter_ops, 4284 4295 .default_attrs = cm_counter_default_attrs 4285 4296 }; 4286 - 4287 - static char *cm_devnode(struct device *dev, umode_t *mode) 4288 - { 4289 - if (mode) 4290 - *mode = 0666; 4291 - return kasprintf(GFP_KERNEL, "infiniband/%s", dev_name(dev)); 4292 - } 4293 - 4294 - struct class cm_class = { 4295 
- .owner = THIS_MODULE, 4296 - .name = "infiniband_cm", 4297 - .devnode = cm_devnode, 4298 - }; 4299 - EXPORT_SYMBOL(cm_class); 4300 4297 4301 4298 static int cm_create_port_fs(struct cm_port *port) 4302 4299 { ··· 4486 4511 get_random_bytes(&cm.random_id_operand, sizeof cm.random_id_operand); 4487 4512 INIT_LIST_HEAD(&cm.timewait_list); 4488 4513 4489 - ret = class_register(&cm_class); 4490 - if (ret) { 4491 - ret = -ENOMEM; 4492 - goto error1; 4493 - } 4494 - 4495 4514 cm.wq = alloc_workqueue("ib_cm", 0, 1); 4496 4515 if (!cm.wq) { 4497 4516 ret = -ENOMEM; ··· 4500 4531 error3: 4501 4532 destroy_workqueue(cm.wq); 4502 4533 error2: 4503 - class_unregister(&cm_class); 4504 - error1: 4505 4534 return ret; 4506 4535 } 4507 4536 ··· 4520 4553 kfree(timewait_info); 4521 4554 } 4522 4555 4523 - class_unregister(&cm_class); 4524 4556 WARN_ON(!xa_empty(&cm.local_id_table)); 4525 4557 } 4526 4558
+15
drivers/infiniband/core/cm_trace.c
@@ -0,0 +1,15 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Trace points for the IB Connection Manager.
+ *
+ * Author: Chuck Lever <chuck.lever@oracle.com>
+ *
+ * Copyright (c) 2020, Oracle and/or its affiliates.
+ */
+
+#include <rdma/rdma_cm.h>
+#include "cma_priv.h"
+
+#define CREATE_TRACE_POINTS
+
+#include "cm_trace.h"
+414
drivers/infiniband/core/cm_trace.h
··· 1 + /* SPDX-License-Identifier: GPL-2.0-only */ 2 + /* 3 + * Trace point definitions for the RDMA Connect Manager. 4 + * 5 + * Author: Chuck Lever <chuck.lever@oracle.com> 6 + * 7 + * Copyright (c) 2020 Oracle and/or its affiliates. 8 + */ 9 + 10 + #undef TRACE_SYSTEM 11 + #define TRACE_SYSTEM ib_cma 12 + 13 + #if !defined(_TRACE_IB_CMA_H) || defined(TRACE_HEADER_MULTI_READ) 14 + 15 + #define _TRACE_IB_CMA_H 16 + 17 + #include <linux/tracepoint.h> 18 + #include <rdma/ib_cm.h> 19 + #include <trace/events/rdma.h> 20 + 21 + /* 22 + * enum ib_cm_state, from include/rdma/ib_cm.h 23 + */ 24 + #define IB_CM_STATE_LIST \ 25 + ib_cm_state(IDLE) \ 26 + ib_cm_state(LISTEN) \ 27 + ib_cm_state(REQ_SENT) \ 28 + ib_cm_state(REQ_RCVD) \ 29 + ib_cm_state(MRA_REQ_SENT) \ 30 + ib_cm_state(MRA_REQ_RCVD) \ 31 + ib_cm_state(REP_SENT) \ 32 + ib_cm_state(REP_RCVD) \ 33 + ib_cm_state(MRA_REP_SENT) \ 34 + ib_cm_state(MRA_REP_RCVD) \ 35 + ib_cm_state(ESTABLISHED) \ 36 + ib_cm_state(DREQ_SENT) \ 37 + ib_cm_state(DREQ_RCVD) \ 38 + ib_cm_state(TIMEWAIT) \ 39 + ib_cm_state(SIDR_REQ_SENT) \ 40 + ib_cm_state_end(SIDR_REQ_RCVD) 41 + 42 + #undef ib_cm_state 43 + #undef ib_cm_state_end 44 + #define ib_cm_state(x) TRACE_DEFINE_ENUM(IB_CM_##x); 45 + #define ib_cm_state_end(x) TRACE_DEFINE_ENUM(IB_CM_##x); 46 + 47 + IB_CM_STATE_LIST 48 + 49 + #undef ib_cm_state 50 + #undef ib_cm_state_end 51 + #define ib_cm_state(x) { IB_CM_##x, #x }, 52 + #define ib_cm_state_end(x) { IB_CM_##x, #x } 53 + 54 + #define show_ib_cm_state(x) \ 55 + __print_symbolic(x, IB_CM_STATE_LIST) 56 + 57 + /* 58 + * enum ib_cm_lap_state, from include/rdma/ib_cm.h 59 + */ 60 + #define IB_CM_LAP_STATE_LIST \ 61 + ib_cm_lap_state(LAP_UNINIT) \ 62 + ib_cm_lap_state(LAP_IDLE) \ 63 + ib_cm_lap_state(LAP_SENT) \ 64 + ib_cm_lap_state(LAP_RCVD) \ 65 + ib_cm_lap_state(MRA_LAP_SENT) \ 66 + ib_cm_lap_state_end(MRA_LAP_RCVD) 67 + 68 + #undef ib_cm_lap_state 69 + #undef ib_cm_lap_state_end 70 + #define ib_cm_lap_state(x) 
TRACE_DEFINE_ENUM(IB_CM_##x);
+#define ib_cm_lap_state_end(x)	TRACE_DEFINE_ENUM(IB_CM_##x);
+
+IB_CM_LAP_STATE_LIST
+
+#undef ib_cm_lap_state
+#undef ib_cm_lap_state_end
+#define ib_cm_lap_state(x)	{ IB_CM_##x, #x },
+#define ib_cm_lap_state_end(x)	{ IB_CM_##x, #x }
+
+#define show_ib_cm_lap_state(x) \
+		__print_symbolic(x, IB_CM_LAP_STATE_LIST)
+
+/*
+ * enum ib_cm_rej_reason, from include/rdma/ib_cm.h
+ */
+#define IB_CM_REJ_REASON_LIST \
+	ib_cm_rej_reason(REJ_NO_QP) \
+	ib_cm_rej_reason(REJ_NO_EEC) \
+	ib_cm_rej_reason(REJ_NO_RESOURCES) \
+	ib_cm_rej_reason(REJ_TIMEOUT) \
+	ib_cm_rej_reason(REJ_UNSUPPORTED) \
+	ib_cm_rej_reason(REJ_INVALID_COMM_ID) \
+	ib_cm_rej_reason(REJ_INVALID_COMM_INSTANCE) \
+	ib_cm_rej_reason(REJ_INVALID_SERVICE_ID) \
+	ib_cm_rej_reason(REJ_INVALID_TRANSPORT_TYPE) \
+	ib_cm_rej_reason(REJ_STALE_CONN) \
+	ib_cm_rej_reason(REJ_RDC_NOT_EXIST) \
+	ib_cm_rej_reason(REJ_INVALID_GID) \
+	ib_cm_rej_reason(REJ_INVALID_LID) \
+	ib_cm_rej_reason(REJ_INVALID_SL) \
+	ib_cm_rej_reason(REJ_INVALID_TRAFFIC_CLASS) \
+	ib_cm_rej_reason(REJ_INVALID_HOP_LIMIT) \
+	ib_cm_rej_reason(REJ_INVALID_PACKET_RATE) \
+	ib_cm_rej_reason(REJ_INVALID_ALT_GID) \
+	ib_cm_rej_reason(REJ_INVALID_ALT_LID) \
+	ib_cm_rej_reason(REJ_INVALID_ALT_SL) \
+	ib_cm_rej_reason(REJ_INVALID_ALT_TRAFFIC_CLASS) \
+	ib_cm_rej_reason(REJ_INVALID_ALT_HOP_LIMIT) \
+	ib_cm_rej_reason(REJ_INVALID_ALT_PACKET_RATE) \
+	ib_cm_rej_reason(REJ_PORT_CM_REDIRECT) \
+	ib_cm_rej_reason(REJ_PORT_REDIRECT) \
+	ib_cm_rej_reason(REJ_INVALID_MTU) \
+	ib_cm_rej_reason(REJ_INSUFFICIENT_RESP_RESOURCES) \
+	ib_cm_rej_reason(REJ_CONSUMER_DEFINED) \
+	ib_cm_rej_reason(REJ_INVALID_RNR_RETRY) \
+	ib_cm_rej_reason(REJ_DUPLICATE_LOCAL_COMM_ID) \
+	ib_cm_rej_reason(REJ_INVALID_CLASS_VERSION) \
+	ib_cm_rej_reason(REJ_INVALID_FLOW_LABEL) \
+	ib_cm_rej_reason(REJ_INVALID_ALT_FLOW_LABEL) \
+	ib_cm_rej_reason_end(REJ_VENDOR_OPTION_NOT_SUPPORTED)
+
+#undef ib_cm_rej_reason
+#undef ib_cm_rej_reason_end
+#define ib_cm_rej_reason(x)	TRACE_DEFINE_ENUM(IB_CM_##x);
+#define ib_cm_rej_reason_end(x)	TRACE_DEFINE_ENUM(IB_CM_##x);
+
+IB_CM_REJ_REASON_LIST
+
+#undef ib_cm_rej_reason
+#undef ib_cm_rej_reason_end
+#define ib_cm_rej_reason(x)	{ IB_CM_##x, #x },
+#define ib_cm_rej_reason_end(x)	{ IB_CM_##x, #x }
+
+#define show_ib_cm_rej_reason(x) \
+		__print_symbolic(x, IB_CM_REJ_REASON_LIST)
+
+DECLARE_EVENT_CLASS(icm_id_class,
+	TP_PROTO(
+		const struct ib_cm_id *cm_id
+	),
+
+	TP_ARGS(cm_id),
+
+	TP_STRUCT__entry(
+		__field(const void *, cm_id)	/* for eBPF scripts */
+		__field(unsigned int, local_id)
+		__field(unsigned int, remote_id)
+		__field(unsigned long, state)
+		__field(unsigned long, lap_state)
+	),
+
+	TP_fast_assign(
+		__entry->cm_id = cm_id;
+		__entry->local_id = be32_to_cpu(cm_id->local_id);
+		__entry->remote_id = be32_to_cpu(cm_id->remote_id);
+		__entry->state = cm_id->state;
+		__entry->lap_state = cm_id->lap_state;
+	),
+
+	TP_printk("local_id=%u remote_id=%u state=%s lap_state=%s",
+		__entry->local_id, __entry->remote_id,
+		show_ib_cm_state(__entry->state),
+		show_ib_cm_lap_state(__entry->lap_state)
+	)
+);
+
+#define DEFINE_CM_SEND_EVENT(name) \
+		DEFINE_EVENT(icm_id_class, \
+				icm_send_##name, \
+				TP_PROTO( \
+					const struct ib_cm_id *cm_id \
+				), \
+				TP_ARGS(cm_id))
+
+DEFINE_CM_SEND_EVENT(req);
+DEFINE_CM_SEND_EVENT(rep);
+DEFINE_CM_SEND_EVENT(dup_req);
+DEFINE_CM_SEND_EVENT(dup_rep);
+DEFINE_CM_SEND_EVENT(rtu);
+DEFINE_CM_SEND_EVENT(mra);
+DEFINE_CM_SEND_EVENT(sidr_req);
+DEFINE_CM_SEND_EVENT(sidr_rep);
+DEFINE_CM_SEND_EVENT(dreq);
+DEFINE_CM_SEND_EVENT(drep);
+
+TRACE_EVENT(icm_send_rej,
+	TP_PROTO(
+		const struct ib_cm_id *cm_id,
+		enum ib_cm_rej_reason reason
+	),
+
+	TP_ARGS(cm_id, reason),
+
+	TP_STRUCT__entry(
+		__field(const void *, cm_id)
+		__field(u32, local_id)
+		__field(u32, remote_id)
+		__field(unsigned long, state)
+		__field(unsigned long, reason)
+	),
+
+	TP_fast_assign(
+		__entry->cm_id = cm_id;
+		__entry->local_id = be32_to_cpu(cm_id->local_id);
+		__entry->remote_id = be32_to_cpu(cm_id->remote_id);
+		__entry->state = cm_id->state;
+		__entry->reason = reason;
+	),
+
+	TP_printk("local_id=%u remote_id=%u state=%s reason=%s",
+		__entry->local_id, __entry->remote_id,
+		show_ib_cm_state(__entry->state),
+		show_ib_cm_rej_reason(__entry->reason)
+	)
+);
+
+#define DEFINE_CM_ERR_EVENT(name) \
+		DEFINE_EVENT(icm_id_class, \
+				icm_##name##_err, \
+				TP_PROTO( \
+					const struct ib_cm_id *cm_id \
+				), \
+				TP_ARGS(cm_id))
+
+DEFINE_CM_ERR_EVENT(send_cm_rtu);
+DEFINE_CM_ERR_EVENT(establish);
+DEFINE_CM_ERR_EVENT(no_listener);
+DEFINE_CM_ERR_EVENT(send_drep);
+DEFINE_CM_ERR_EVENT(dreq_unknown);
+DEFINE_CM_ERR_EVENT(send_unknown_rej);
+DEFINE_CM_ERR_EVENT(rej_unknown);
+DEFINE_CM_ERR_EVENT(send_mra_unknown);
+DEFINE_CM_ERR_EVENT(mra_unknown);
+DEFINE_CM_ERR_EVENT(qp_init);
+DEFINE_CM_ERR_EVENT(qp_rtr);
+DEFINE_CM_ERR_EVENT(qp_rts);
+
+DEFINE_EVENT(icm_id_class, \
+	icm_dreq_skipped, \
+	TP_PROTO( \
+		const struct ib_cm_id *cm_id \
+	), \
+	TP_ARGS(cm_id) \
+);
+
+DECLARE_EVENT_CLASS(icm_local_class,
+	TP_PROTO(
+		unsigned int local_id,
+		unsigned int remote_id
+	),
+
+	TP_ARGS(local_id, remote_id),
+
+	TP_STRUCT__entry(
+		__field(unsigned int, local_id)
+		__field(unsigned int, remote_id)
+	),
+
+	TP_fast_assign(
+		__entry->local_id = local_id;
+		__entry->remote_id = remote_id;
+	),
+
+	TP_printk("local_id=%u remote_id=%u",
+		__entry->local_id, __entry->remote_id
+	)
+);
+
+#define DEFINE_CM_LOCAL_EVENT(name) \
+		DEFINE_EVENT(icm_local_class, \
+				icm_##name, \
+				TP_PROTO( \
+					unsigned int local_id, \
+					unsigned int remote_id \
+				), \
+				TP_ARGS(local_id, remote_id))
+
+DEFINE_CM_LOCAL_EVENT(issue_rej);
+DEFINE_CM_LOCAL_EVENT(issue_drep);
+DEFINE_CM_LOCAL_EVENT(staleconn_err);
+DEFINE_CM_LOCAL_EVENT(no_priv_err);
+
+DECLARE_EVENT_CLASS(icm_remote_class,
+	TP_PROTO(
+		u32 remote_id
+	),
+
+	TP_ARGS(remote_id),
+
+	TP_STRUCT__entry(
+		__field(u32, remote_id)
+	),
+
+	TP_fast_assign(
+		__entry->remote_id = remote_id;
+	),
+
+	TP_printk("remote_id=%u",
+		__entry->remote_id
+	)
+);
+
+#define DEFINE_CM_REMOTE_EVENT(name) \
+		DEFINE_EVENT(icm_remote_class, \
+				icm_##name, \
+				TP_PROTO( \
+					u32 remote_id \
+				), \
+				TP_ARGS(remote_id))
+
+DEFINE_CM_REMOTE_EVENT(remote_no_priv_err);
+DEFINE_CM_REMOTE_EVENT(insert_failed_err);
+
+TRACE_EVENT(icm_send_rep_err,
+	TP_PROTO(
+		__be32 local_id,
+		enum ib_cm_state state
+	),
+
+	TP_ARGS(local_id, state),
+
+	TP_STRUCT__entry(
+		__field(unsigned int, local_id)
+		__field(unsigned long, state)
+	),
+
+	TP_fast_assign(
+		__entry->local_id = be32_to_cpu(local_id);
+		__entry->state = state;
+	),
+
+	TP_printk("local_id=%u state=%s",
+		__entry->local_id, show_ib_cm_state(__entry->state)
+	)
+);
+
+TRACE_EVENT(icm_rep_unknown_err,
+	TP_PROTO(
+		unsigned int local_id,
+		unsigned int remote_id,
+		enum ib_cm_state state
+	),
+
+	TP_ARGS(local_id, remote_id, state),
+
+	TP_STRUCT__entry(
+		__field(unsigned int, local_id)
+		__field(unsigned int, remote_id)
+		__field(unsigned long, state)
+	),
+
+	TP_fast_assign(
+		__entry->local_id = local_id;
+		__entry->remote_id = remote_id;
+		__entry->state = state;
+	),
+
+	TP_printk("local_id=%u remote_id=%u state=%s",
+		__entry->local_id, __entry->remote_id,
+		show_ib_cm_state(__entry->state)
+	)
+);
+
+TRACE_EVENT(icm_handler_err,
+	TP_PROTO(
+		enum ib_cm_event_type event
+	),
+
+	TP_ARGS(event),
+
+	TP_STRUCT__entry(
+		__field(unsigned long, event)
+	),
+
+	TP_fast_assign(
+		__entry->event = event;
+	),
+
+	TP_printk("unhandled event=%s",
+		rdma_show_ib_cm_event(__entry->event)
+	)
+);
+
+TRACE_EVENT(icm_mad_send_err,
+	TP_PROTO(
+		enum ib_cm_state state,
+		enum ib_wc_status wc_status
+	),
+
+	TP_ARGS(state, wc_status),
+
+	TP_STRUCT__entry(
+		__field(unsigned long, state)
+		__field(unsigned long, wc_status)
+	),
+
+	TP_fast_assign(
+		__entry->state = state;
+		__entry->wc_status = wc_status;
+	),
+
+	TP_printk("state=%s completion status=%s",
+		show_ib_cm_state(__entry->state),
+		rdma_show_wc_status(__entry->wc_status)
+	)
+);
+
+#endif /* _TRACE_IB_CMA_H */
+
+#undef TRACE_INCLUDE_PATH
+#define TRACE_INCLUDE_PATH ../../drivers/infiniband/core
+#define TRACE_INCLUDE_FILE cm_trace
+
+#include <trace/define_trace.h>
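Note on the list macros in the trace header above: IB_CM_REJ_REASON_LIST is written once and expanded twice, first with the per-entry macros defined as TRACE_DEFINE_ENUM() and then redefined to emit { value, "name" } pairs for __print_symbolic(). The following is only a userspace sketch of that pattern, not kernel code. Every demo_/DEMO_ name is hypothetical, the enum values are made up for illustration, the first pass counts entries instead of exporting them to the tracing core, and the "UNKNOWN" fallback is a simplification that is not claimed to match what __print_symbolic() prints for unmatched values.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an enum such as ib_cm_rej_reason. */
enum demo_rej_reason {
	DEMO_REJ_NO_QP = 1,
	DEMO_REJ_TIMEOUT = 4,
	DEMO_REJ_STALE_CONN = 10,
};

/* Single source of truth: each entry is listed exactly once. */
#define DEMO_REJ_REASON_LIST \
	demo_rej_reason(REJ_NO_QP) \
	demo_rej_reason(REJ_TIMEOUT) \
	demo_rej_reason_end(REJ_STALE_CONN)

/* First expansion: count the entries (the header uses TRACE_DEFINE_ENUM here). */
#define demo_rej_reason(x)	+ 1
#define demo_rej_reason_end(x)	+ 1
enum { DEMO_REJ_REASON_COUNT = 0 DEMO_REJ_REASON_LIST };

/* Second expansion: redefine the same macros to build { value, "name" } pairs. */
#undef demo_rej_reason
#undef demo_rej_reason_end
#define demo_rej_reason(x)	{ DEMO_##x, #x },
#define demo_rej_reason_end(x)	{ DEMO_##x, #x }

struct demo_symbol {
	unsigned long value;
	const char *name;
};

static const struct demo_symbol demo_rej_symbols[] = {
	DEMO_REJ_REASON_LIST
};

/* Linear lookup over the generated table; "UNKNOWN" is this sketch's own fallback. */
static const char *demo_show_rej_reason(unsigned long value)
{
	size_t i;

	for (i = 0; i < sizeof(demo_rej_symbols) / sizeof(demo_rej_symbols[0]); i++)
		if (demo_rej_symbols[i].value == value)
			return demo_rej_symbols[i].name;
	return "UNKNOWN";
}

/* Self-check that exercises both expansions of the list. */
static void demo_check(void)
{
	assert(DEMO_REJ_REASON_COUNT == 3);
	assert(strcmp(demo_show_rej_reason(DEMO_REJ_TIMEOUT), "REJ_TIMEOUT") == 0);
	assert(strcmp(demo_show_rej_reason(99), "UNKNOWN") == 0);
}
```

Because the list is the only place an entry is named, adding a value means touching one line, and both the count and the name table pick it up on the next build.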
+334 -299
drivers/infiniband/core/cma.c
···
 	[RDMA_CM_EVENT_TIMEWAIT_EXIT]	 = "timewait exit",
 };

+static void cma_set_mgid(struct rdma_id_private *id_priv, struct sockaddr *addr,
+			 union ib_gid *mgid);
+
 const char *__attribute_const__ rdma_event_msg(enum rdma_cm_event_type event)
 {
 	size_t index = event;
···
 	if (!rdma_is_port_valid(cma_dev->device, port))
 		return -EINVAL;

+	if (default_gid_type == IB_GID_TYPE_IB &&
+	    rdma_protocol_roce_eth_encap(cma_dev->device, port))
+		default_gid_type = IB_GID_TYPE_ROCE;
+
 	supported_gids = roce_gid_type_mask_support(cma_dev->device, port);

 	if (!(supported_gids & 1 << default_gid_type))
···

 struct cma_multicast {
 	struct rdma_id_private *id_priv;
-	union {
-		struct ib_sa_multicast *ib;
-	} multicast;
+	struct ib_sa_multicast *sa_mc;
 	struct list_head	list;
 	void			*context;
 	struct sockaddr_storage	addr;
-	struct kref		mcref;
 	u8			join_state;
 };

···
 	enum rdma_cm_state	old_state;
 	enum rdma_cm_state	new_state;
 	struct rdma_cm_event	event;
-};
-
-struct cma_ndev_work {
-	struct work_struct	work;
-	struct rdma_id_private	*id;
-	struct rdma_cm_event	event;
-};
-
-struct iboe_mcast_work {
-	struct work_struct	 work;
-	struct rdma_id_private	*id;
-	struct cma_multicast	*mc;
 };

 union cma_ip_addr {
···
 	u16 pkey;
 };

-static int cma_comp(struct rdma_id_private *id_priv, enum rdma_cm_state comp)
-{
-	unsigned long flags;
-	int ret;
-
-	spin_lock_irqsave(&id_priv->lock, flags);
-	ret = (id_priv->state == comp);
-	spin_unlock_irqrestore(&id_priv->lock, flags);
-	return ret;
-}
-
 static int cma_comp_exch(struct rdma_id_private *id_priv,
 			 enum rdma_cm_state comp, enum rdma_cm_state exch)
 {
 	unsigned long flags;
 	int ret;
+
+	/*
+	 * The FSM uses a funny double locking where state is protected by both
+	 * the handler_mutex and the spinlock. State is not allowed to change
+	 * away from a handler_mutex protected value without also holding
+	 * handler_mutex.
+	 */
+	if (comp == RDMA_CM_CONNECT)
+		lockdep_assert_held(&id_priv->handler_mutex);

 	spin_lock_irqsave(&id_priv->lock, flags);
 	if ((ret = (id_priv->state == comp)))
···
 	id_priv->id.route.addr.dev_addr.transport =
 		rdma_node_get_transport(cma_dev->device->node_type);
 	list_add_tail(&id_priv->list, &cma_dev->id_list);
-	if (id_priv->res.kern_name)
-		rdma_restrack_kadd(&id_priv->res);
-	else
-		rdma_restrack_uadd(&id_priv->res);
+	rdma_restrack_add(&id_priv->res);
+
 	trace_cm_id_attach(id_priv, cma_dev->device);
 }

···
 	id_priv->gid_type =
 		cma_dev->default_gid_type[id_priv->id.port_num -
 					  rdma_start_port(cma_dev->device)];
-}
-
-static inline void release_mc(struct kref *kref)
-{
-	struct cma_multicast *mc = container_of(kref, struct cma_multicast, mcref);
-
-	kfree(mc->multicast.ib);
-	kfree(mc);
 }

 static void cma_release_dev(struct rdma_id_private *id_priv)
···
 	complete(&id_priv->comp);
 }

-struct rdma_cm_id *__rdma_create_id(struct net *net,
-				    rdma_cm_event_handler event_handler,
-				    void *context, enum rdma_ucm_port_space ps,
-				    enum ib_qp_type qp_type, const char *caller)
+static struct rdma_id_private *
+__rdma_create_id(struct net *net, rdma_cm_event_handler event_handler,
+		 void *context, enum rdma_ucm_port_space ps,
+		 enum ib_qp_type qp_type, const struct rdma_id_private *parent)
 {
 	struct rdma_id_private *id_priv;

···
 	if (!id_priv)
 		return ERR_PTR(-ENOMEM);

-	rdma_restrack_set_task(&id_priv->res, caller);
-	id_priv->res.type = RDMA_RESTRACK_CM_ID;
 	id_priv->state = RDMA_CM_IDLE;
 	id_priv->id.context = context;
 	id_priv->id.event_handler = event_handler;
···
 	id_priv->id.route.addr.dev_addr.net = get_net(net);
 	id_priv->seq_num &= 0x00ffffff;

-	return &id_priv->id;
+	rdma_restrack_new(&id_priv->res, RDMA_RESTRACK_CM_ID);
+	if (parent)
+		rdma_restrack_parent_name(&id_priv->res, &parent->res);
+
+	return id_priv;
 }
-EXPORT_SYMBOL(__rdma_create_id);
+
+struct rdma_cm_id *
+__rdma_create_kernel_id(struct net *net, rdma_cm_event_handler event_handler,
+			void *context, enum rdma_ucm_port_space ps,
+			enum ib_qp_type qp_type, const char *caller)
+{
+	struct rdma_id_private *ret;
+
+	ret = __rdma_create_id(net, event_handler, context, ps, qp_type, NULL);
+	if (IS_ERR(ret))
+		return ERR_CAST(ret);
+
+	rdma_restrack_set_name(&ret->res, caller);
+	return &ret->id;
+}
+EXPORT_SYMBOL(__rdma_create_kernel_id);
+
+struct rdma_cm_id *rdma_create_user_id(rdma_cm_event_handler event_handler,
+				       void *context,
+				       enum rdma_ucm_port_space ps,
+				       enum ib_qp_type qp_type)
+{
+	struct rdma_id_private *ret;
+
+	ret = __rdma_create_id(current->nsproxy->net_ns, event_handler, context,
+			       ps, qp_type, NULL);
+	if (IS_ERR(ret))
+		return ERR_CAST(ret);
+
+	rdma_restrack_set_name(&ret->res, NULL);
+	return &ret->id;
+}
+EXPORT_SYMBOL(rdma_create_user_id);

 static int cma_init_ud_qp(struct rdma_id_private *id_priv, struct ib_qp *qp)
 {
···
 	mutex_unlock(&lock);
 }

-static void cma_leave_roce_mc_group(struct rdma_id_private *id_priv,
-				    struct cma_multicast *mc)
+static void destroy_mc(struct rdma_id_private *id_priv,
+		       struct cma_multicast *mc)
 {
-	struct rdma_dev_addr *dev_addr = &id_priv->id.route.addr.dev_addr;
-	struct net_device *ndev = NULL;
+	if (rdma_cap_ib_mcast(id_priv->id.device, id_priv->id.port_num))
+		ib_sa_free_multicast(mc->sa_mc);

-	if (dev_addr->bound_dev_if)
-		ndev = dev_get_by_index(dev_addr->net, dev_addr->bound_dev_if);
-	if (ndev) {
-		cma_igmp_send(ndev, &mc->multicast.ib->rec.mgid, false);
-		dev_put(ndev);
+	if (rdma_protocol_roce(id_priv->id.device, id_priv->id.port_num)) {
+		struct rdma_dev_addr *dev_addr =
+			&id_priv->id.route.addr.dev_addr;
+		struct net_device *ndev = NULL;
+
+		if (dev_addr->bound_dev_if)
+			ndev = dev_get_by_index(dev_addr->net,
+						dev_addr->bound_dev_if);
+		if (ndev) {
+			union ib_gid mgid;
+
+			cma_set_mgid(id_priv, (struct sockaddr *)&mc->addr,
+				     &mgid);
+			cma_igmp_send(ndev, &mgid, false);
+			dev_put(ndev);
+		}
 	}
-	kref_put(&mc->mcref, release_mc);
+	kfree(mc);
 }

 static void cma_leave_mc_groups(struct rdma_id_private *id_priv)
···
 	struct cma_multicast *mc;

 	while (!list_empty(&id_priv->mc_list)) {
-		mc = container_of(id_priv->mc_list.next,
-				  struct cma_multicast, list);
+		mc = list_first_entry(&id_priv->mc_list, struct cma_multicast,
+				      list);
 		list_del(&mc->list);
-		if (rdma_cap_ib_mcast(id_priv->cma_dev->device,
-				      id_priv->id.port_num)) {
-			ib_sa_free_multicast(mc->multicast.ib);
-			kfree(mc);
-		} else {
-			cma_leave_roce_mc_group(id_priv, mc);
-		}
+		destroy_mc(id_priv, mc);
 	}
 }

···
 {
 	cma_cancel_operation(id_priv, state);

-	rdma_restrack_del(&id_priv->res);
 	if (id_priv->cma_dev) {
 		if (rdma_cap_ib_cm(id_priv->id.device, 1)) {
 			if (id_priv->cm_id.ib)
···
 	rdma_put_gid_attr(id_priv->id.route.addr.dev_addr.sgid_attr);

 	put_net(id_priv->id.route.addr.dev_addr.net);
+	rdma_restrack_del(&id_priv->res);
 	kfree(id_priv);
 }

···
 {
 	struct rdma_id_private *id_priv = cm_id->context;
 	struct rdma_cm_event event = {};
+	enum rdma_cm_state state;
 	int ret;

 	mutex_lock(&id_priv->handler_mutex);
+	state = READ_ONCE(id_priv->state);
 	if ((ib_event->event != IB_CM_TIMEWAIT_EXIT &&
-	     id_priv->state != RDMA_CM_CONNECT) ||
+	     state != RDMA_CM_CONNECT) ||
 	    (ib_event->event == IB_CM_TIMEWAIT_EXIT &&
-	     id_priv->state != RDMA_CM_DISCONNECT))
+	     state != RDMA_CM_DISCONNECT))
 		goto out;

 	switch (ib_event->event) {
···
 		event.status = -ETIMEDOUT;
 		break;
 	case IB_CM_REP_RECEIVED:
-		if (cma_comp(id_priv, RDMA_CM_CONNECT) &&
+		if (state == RDMA_CM_CONNECT &&
 		    (id_priv->id.qp_type != IB_QPT_UD)) {
 			trace_cm_send_mra(id_priv);
 			ib_send_cm_mra(cm_id, CMA_CM_MRA_SETTING, NULL, 0);
···
 	int ret;

 	listen_id_priv = container_of(listen_id, struct rdma_id_private, id);
-	id = __rdma_create_id(listen_id->route.addr.dev_addr.net,
-			      listen_id->event_handler, listen_id->context,
-			      listen_id->ps, ib_event->param.req_rcvd.qp_type,
-			      listen_id_priv->res.kern_name);
-	if (IS_ERR(id))
+	id_priv = __rdma_create_id(listen_id->route.addr.dev_addr.net,
+				   listen_id->event_handler, listen_id->context,
+				   listen_id->ps,
+				   ib_event->param.req_rcvd.qp_type,
+				   listen_id_priv);
+	if (IS_ERR(id_priv))
 		return NULL;

-	id_priv = container_of(id, struct rdma_id_private, id);
+	id = &id_priv->id;
 	if (cma_save_net_info((struct sockaddr *)&id->route.addr.src_addr,
 			      (struct sockaddr *)&id->route.addr.dst_addr,
 			      listen_id, ib_event, ss_family, service_id))
···
 	int ret;

 	listen_id_priv = container_of(listen_id, struct rdma_id_private, id);
-	id = __rdma_create_id(net, listen_id->event_handler, listen_id->context,
-			      listen_id->ps, IB_QPT_UD,
-			      listen_id_priv->res.kern_name);
-	if (IS_ERR(id))
+	id_priv = __rdma_create_id(net, listen_id->event_handler,
+				   listen_id->context, listen_id->ps, IB_QPT_UD,
+				   listen_id_priv);
+	if (IS_ERR(id_priv))
 		return NULL;

-	id_priv = container_of(id, struct rdma_id_private, id);
+	id = &id_priv->id;
 	if (cma_save_net_info((struct sockaddr *)&id->route.addr.src_addr,
 			      (struct sockaddr *)&id->route.addr.dst_addr,
 			      listen_id, ib_event, ss_family,
···
 	}

 	mutex_lock(&listen_id->handler_mutex);
-	if (listen_id->state != RDMA_CM_LISTEN) {
+	if (READ_ONCE(listen_id->state) != RDMA_CM_LISTEN) {
 		ret = -ECONNABORTED;
 		goto err_unlock;
 	}
···
 		goto net_dev_put;
 	}

-	if (cma_comp(conn_id, RDMA_CM_CONNECT) &&
-	    (conn_id->id.qp_type != IB_QPT_UD)) {
+	if (READ_ONCE(conn_id->state) == RDMA_CM_CONNECT &&
+	    conn_id->id.qp_type != IB_QPT_UD) {
 		trace_cm_send_mra(cm_id->context);
 		ib_send_cm_mra(cm_id, CMA_CM_MRA_SETTING, NULL, 0);
 	}
···
 	struct sockaddr *raddr = (struct sockaddr *)&iw_event->remote_addr;

 	mutex_lock(&id_priv->handler_mutex);
-	if (id_priv->state != RDMA_CM_CONNECT)
+	if (READ_ONCE(id_priv->state) != RDMA_CM_CONNECT)
 		goto out;

 	switch (iw_event->event) {
···
 static int iw_conn_req_handler(struct iw_cm_id *cm_id,
 			       struct iw_cm_event *iw_event)
 {
-	struct rdma_cm_id *new_cm_id;
 	struct rdma_id_private *listen_id, *conn_id;
 	struct rdma_cm_event event = {};
 	int ret = -ECONNABORTED;
···
 	listen_id = cm_id->context;

 	mutex_lock(&listen_id->handler_mutex);
-	if (listen_id->state != RDMA_CM_LISTEN)
+	if (READ_ONCE(listen_id->state) != RDMA_CM_LISTEN)
 		goto out;

 	/* Create a new RDMA id for the new IW CM ID */
-	new_cm_id = __rdma_create_id(listen_id->id.route.addr.dev_addr.net,
-				     listen_id->id.event_handler,
-				     listen_id->id.context,
-				     RDMA_PS_TCP, IB_QPT_RC,
-				     listen_id->res.kern_name);
-	if (IS_ERR(new_cm_id)) {
+	conn_id = __rdma_create_id(listen_id->id.route.addr.dev_addr.net,
+				   listen_id->id.event_handler,
+				   listen_id->id.context, RDMA_PS_TCP,
+				   IB_QPT_RC, listen_id);
+	if (IS_ERR(conn_id)) {
 		ret = -ENOMEM;
 		goto out;
 	}
-	conn_id = container_of(new_cm_id, struct rdma_id_private, id);
 	mutex_lock_nested(&conn_id->handler_mutex, SINGLE_DEPTH_NESTING);
 	conn_id->state = RDMA_CM_CONNECT;

···
 			      struct cma_device *cma_dev)
 {
 	struct rdma_id_private *dev_id_priv;
-	struct rdma_cm_id *id;
 	struct net *net = id_priv->id.route.addr.dev_addr.net;
 	int ret;

···
 	if (cma_family(id_priv) == AF_IB && !rdma_cap_ib_cm(cma_dev->device, 1))
 		return;

-	id = __rdma_create_id(net, cma_listen_handler, id_priv, id_priv->id.ps,
-			      id_priv->id.qp_type, id_priv->res.kern_name);
-	if (IS_ERR(id))
+	dev_id_priv =
+		__rdma_create_id(net, cma_listen_handler, id_priv,
+				 id_priv->id.ps, id_priv->id.qp_type, id_priv);
+	if (IS_ERR(dev_id_priv))
 		return;
-
-	dev_id_priv = container_of(id, struct rdma_id_private, id);

 	dev_id_priv->state = RDMA_CM_ADDR_BOUND;
 	memcpy(cma_src_addr(dev_id_priv), cma_src_addr(id_priv),
···
 	dev_id_priv->tos_set = id_priv->tos_set;
 	dev_id_priv->tos = id_priv->tos;

-	ret = rdma_listen(id, id_priv->backlog);
+	ret = rdma_listen(&dev_id_priv->id, id_priv->backlog);
 	if (ret)
 		dev_warn(&cma_dev->device->dev,
 			 "RDMA CMA: cma_listen_on_dev, error %d\n", ret);
···
 	struct rdma_id_private *id_priv = work->id;

 	mutex_lock(&id_priv->handler_mutex);
-	if (!cma_comp_exch(id_priv, work->old_state, work->new_state))
+	if (READ_ONCE(id_priv->state) == RDMA_CM_DESTROYING ||
+	    READ_ONCE(id_priv->state) == RDMA_CM_DEVICE_REMOVAL)
 		goto out_unlock;
+	if (work->old_state != 0 || work->new_state != 0) {
+		if (!cma_comp_exch(id_priv, work->old_state, work->new_state))
+			goto out_unlock;
+	}

 	if (cma_cm_event_handler(id_priv, &work->event)) {
 		cma_id_put(id_priv);
···
 	mutex_unlock(&id_priv->handler_mutex);
 	cma_id_put(id_priv);
 out_free:
-	kfree(work);
-}
-
-static void cma_ndev_work_handler(struct work_struct *_work)
-{
-	struct cma_ndev_work *work = container_of(_work, struct cma_ndev_work, work);
-	struct rdma_id_private *id_priv = work->id;
-
-	mutex_lock(&id_priv->handler_mutex);
-	if (id_priv->state == RDMA_CM_DESTROYING ||
-	    id_priv->state == RDMA_CM_DEVICE_REMOVAL)
-		goto out_unlock;
-
-	if (cma_cm_event_handler(id_priv, &work->event)) {
-		cma_id_put(id_priv);
-		destroy_id_handler_unlock(id_priv);
-		goto out_free;
-	}
-
-out_unlock:
-	mutex_unlock(&id_priv->handler_mutex);
-	cma_id_put(id_priv);
-out_free:
+	if (work->event.event == RDMA_CM_EVENT_MULTICAST_JOIN)
+		rdma_destroy_ah_attr(&work->event.param.ud.ah_attr);
 	kfree(work);
 }

···
 	return rdma_bind_addr(id, src_addr);
 }

-int rdma_resolve_addr(struct rdma_cm_id *id, struct sockaddr *src_addr,
-		      const struct sockaddr *dst_addr, unsigned long timeout_ms)
+/*
+ * If required, resolve the source address for bind and leave the id_priv in
+ * state RDMA_CM_ADDR_BOUND. This oddly uses the state to determine the prior
+ * calls made by ULP, a previously bound ID will not be re-bound and src_addr is
+ * ignored.
+ */
+static int resolve_prepare_src(struct rdma_id_private *id_priv,
+			       struct sockaddr *src_addr,
+			       const struct sockaddr *dst_addr)
 {
-	struct rdma_id_private *id_priv;
 	int ret;

-	id_priv = container_of(id, struct rdma_id_private, id);
 	memcpy(cma_dst_addr(id_priv), dst_addr, rdma_addr_size(dst_addr));
-	if (id_priv->state == RDMA_CM_IDLE) {
-		ret = cma_bind_addr(id, src_addr, dst_addr);
-		if (ret) {
-			memset(cma_dst_addr(id_priv), 0,
-			       rdma_addr_size(dst_addr));
-			return ret;
+	if (!cma_comp_exch(id_priv, RDMA_CM_ADDR_BOUND, RDMA_CM_ADDR_QUERY)) {
+		/* For a well behaved ULP state will be RDMA_CM_IDLE */
+		ret = cma_bind_addr(&id_priv->id, src_addr, dst_addr);
+		if (ret)
+			goto err_dst;
+		if (WARN_ON(!cma_comp_exch(id_priv, RDMA_CM_ADDR_BOUND,
+					   RDMA_CM_ADDR_QUERY))) {
+			ret = -EINVAL;
+			goto err_dst;
 		}
 	}

 	if (cma_family(id_priv) != dst_addr->sa_family) {
-		memset(cma_dst_addr(id_priv), 0, rdma_addr_size(dst_addr));
-		return -EINVAL;
+		ret = -EINVAL;
+		goto err_state;
 	}
+	return 0;

-	if (!cma_comp_exch(id_priv, RDMA_CM_ADDR_BOUND, RDMA_CM_ADDR_QUERY)) {
-		memset(cma_dst_addr(id_priv), 0, rdma_addr_size(dst_addr));
-		return -EINVAL;
-	}
+err_state:
+	cma_comp_exch(id_priv, RDMA_CM_ADDR_QUERY, RDMA_CM_ADDR_BOUND);
+err_dst:
+	memset(cma_dst_addr(id_priv), 0, rdma_addr_size(dst_addr));
+	return ret;
+}
+
+int rdma_resolve_addr(struct rdma_cm_id *id, struct sockaddr *src_addr,
+		      const struct sockaddr *dst_addr, unsigned long timeout_ms)
+{
+	struct rdma_id_private *id_priv =
+		container_of(id, struct rdma_id_private, id);
+	int ret;
+
+	ret = resolve_prepare_src(id_priv, src_addr, dst_addr);
+	if (ret)
+		return ret;

 	if (cma_any_addr(dst_addr)) {
 		ret = cma_resolve_loopback(id_priv);
···

 	id_priv = container_of(id, struct rdma_id_private, id);
 	spin_lock_irqsave(&id_priv->lock, flags);
-	if (reuse || id_priv->state == RDMA_CM_IDLE) {
+	if ((reuse && id_priv->state != RDMA_CM_LISTEN) ||
+	    id_priv->state == RDMA_CM_IDLE) {
 		id_priv->reuseaddr = reuse;
 		ret = 0;
 	} else {
···
 		if (id_priv == cur_id)
 			continue;

-		if ((cur_id->state != RDMA_CM_LISTEN) && reuseaddr &&
-		    cur_id->reuseaddr)
+		if (reuseaddr && cur_id->reuseaddr)
 			continue;

 		cur_addr = cma_src_addr(cur_id);
···
 		if (!ret)
 			cma_bind_port(bind_list, id_priv);
 	}
-	return ret;
-}
-
-static int cma_bind_listen(struct rdma_id_private *id_priv)
-{
-	struct rdma_bind_list *bind_list = id_priv->bind_list;
-	int ret = 0;
-
-	mutex_lock(&lock);
-	if (bind_list->owners.first->next)
-		ret = cma_check_port(bind_list, id_priv, 0);
-	mutex_unlock(&lock);
 	return ret;
 }

···

 int rdma_listen(struct rdma_cm_id *id, int backlog)
 {
-	struct rdma_id_private *id_priv;
+	struct rdma_id_private *id_priv =
+		container_of(id, struct rdma_id_private, id);
 	int ret;

-	id_priv = container_of(id, struct rdma_id_private, id);
-	if (id_priv->state == RDMA_CM_IDLE) {
+	if (!cma_comp_exch(id_priv, RDMA_CM_ADDR_BOUND, RDMA_CM_LISTEN)) {
+		/* For a well behaved ULP state will be RDMA_CM_IDLE */
 		id->route.addr.src_addr.ss_family = AF_INET;
 		ret = rdma_bind_addr(id, cma_src_addr(id_priv));
 		if (ret)
 			return ret;
+		if (WARN_ON(!cma_comp_exch(id_priv, RDMA_CM_ADDR_BOUND,
+					   RDMA_CM_LISTEN)))
+			return -EINVAL;
 	}

-	if (!cma_comp_exch(id_priv, RDMA_CM_ADDR_BOUND, RDMA_CM_LISTEN))
-		return -EINVAL;
-
+	/*
+	 * Once the ID reaches RDMA_CM_LISTEN it is not allowed to be reusable
+	 * any more, and has to be unique in the bind list.
+	 */
 	if (id_priv->reuseaddr) {
-		ret = cma_bind_listen(id_priv);
+		mutex_lock(&lock);
+		ret = cma_check_port(id_priv->bind_list, id_priv, 0);
+		if (!ret)
+			id_priv->reuseaddr = 0;
+		mutex_unlock(&lock);
 		if (ret)
 			goto err;
 	}
···
 	return 0;
 err:
 	id_priv->backlog = 0;
+	/*
+	 * All the failure paths that lead here will not allow the req_handler's
+	 * to have run.
+	 */
 	cma_comp_exch(id_priv, RDMA_CM_LISTEN, RDMA_CM_ADDR_BOUND);
 	return ret;
 }
···

 	return 0;
 err2:
-	rdma_restrack_del(&id_priv->res);
 	if (id_priv->cma_dev)
 		cma_release_dev(id_priv);
 err1:
···
 	int ret;

 	mutex_lock(&id_priv->handler_mutex);
-	if (id_priv->state != RDMA_CM_CONNECT)
+	if (READ_ONCE(id_priv->state) != RDMA_CM_CONNECT)
 		goto out;

 	switch (ib_event->event) {
···

 int rdma_connect(struct rdma_cm_id *id, struct rdma_conn_param *conn_param)
 {
-	struct rdma_id_private *id_priv;
+	struct rdma_id_private *id_priv =
+		container_of(id, struct rdma_id_private, id);
 	int ret;

-	id_priv = container_of(id, struct rdma_id_private, id);
-	if (!cma_comp_exch(id_priv, RDMA_CM_ROUTE_RESOLVED, RDMA_CM_CONNECT))
-		return -EINVAL;
+	mutex_lock(&id_priv->handler_mutex);
+	if (!cma_comp_exch(id_priv, RDMA_CM_ROUTE_RESOLVED, RDMA_CM_CONNECT)) {
+		ret = -EINVAL;
+		goto err_unlock;
+	}

 	if (!id->qp) {
 		id_priv->qp_num = conn_param->qp_num;
···
 	else
 		ret = -ENOSYS;
 	if (ret)
-		goto err;
-
+		goto err_state;
+	mutex_unlock(&id_priv->handler_mutex);
 	return 0;
-err:
+err_state:
 	cma_comp_exch(id_priv, RDMA_CM_CONNECT, RDMA_CM_ROUTE_RESOLVED);
+err_unlock:
+	mutex_unlock(&id_priv->handler_mutex);
 	return ret;
 }
 EXPORT_SYMBOL(rdma_connect);
···
 	return ib_send_cm_sidr_rep(id_priv->cm_id.ib, &rep);
 }

-int __rdma_accept(struct rdma_cm_id *id, struct rdma_conn_param *conn_param,
-		  const char *caller)
+/**
+ * rdma_accept - Called to accept a connection request or response.
+ * @id: Connection identifier associated with the request.
+ * @conn_param: Information needed to establish the connection.  This must be
+ *   provided if accepting a connection request.  If accepting a connection
+ *   response, this parameter must be NULL.
+ *
+ * Typically, this routine is only called by the listener to accept a connection
+ * request.  It must also be called on the active side of a connection if the
+ * user is performing their own QP transitions.
+ *
+ * In the case of error, a reject message is sent to the remote side and the
+ * state of the qp associated with the id is modified to error, such that any
+ * previously posted receive buffers would be flushed.
+ *
+ * This function is for use by kernel ULPs and must be called from under the
+ * handler callback.
+ */
+int rdma_accept(struct rdma_cm_id *id, struct rdma_conn_param *conn_param)
 {
-	struct rdma_id_private *id_priv;
+	struct rdma_id_private *id_priv =
+		container_of(id, struct rdma_id_private, id);
 	int ret;

-	id_priv = container_of(id, struct rdma_id_private, id);
+	lockdep_assert_held(&id_priv->handler_mutex);

-	rdma_restrack_set_task(&id_priv->res, caller);
-
-	if (!cma_comp(id_priv, RDMA_CM_CONNECT))
+	if (READ_ONCE(id_priv->state) != RDMA_CM_CONNECT)
 		return -EINVAL;

 	if (!id->qp && conn_param) {
···
 	rdma_reject(id, NULL, 0, IB_CM_REJ_CONSUMER_DEFINED);
 	return ret;
 }
-EXPORT_SYMBOL(__rdma_accept);
+EXPORT_SYMBOL(rdma_accept);

-int __rdma_accept_ece(struct rdma_cm_id *id, struct rdma_conn_param *conn_param,
-		      const char *caller, struct rdma_ucm_ece *ece)
+int rdma_accept_ece(struct rdma_cm_id *id, struct rdma_conn_param *conn_param,
+		    struct rdma_ucm_ece *ece)
 {
 	struct rdma_id_private *id_priv =
 		container_of(id, struct rdma_id_private, id);
···
 	id_priv->ece.vendor_id = ece->vendor_id;
 	id_priv->ece.attr_mod = ece->attr_mod;

-	return __rdma_accept(id, conn_param, caller);
+	return rdma_accept(id, conn_param);
 }
-EXPORT_SYMBOL(__rdma_accept_ece);
+EXPORT_SYMBOL(rdma_accept_ece);
+
+void rdma_lock_handler(struct rdma_cm_id *id)
+{
+	struct rdma_id_private *id_priv =
+		container_of(id, struct rdma_id_private, id);
+
+	mutex_lock(&id_priv->handler_mutex);
+}
+EXPORT_SYMBOL(rdma_lock_handler);
+
+void rdma_unlock_handler(struct rdma_cm_id *id)
+{
+	struct rdma_id_private *id_priv =
+		container_of(id, struct rdma_id_private, id);
+
+	mutex_unlock(&id_priv->handler_mutex);
+}
+EXPORT_SYMBOL(rdma_unlock_handler);

 int rdma_notify(struct rdma_cm_id *id, enum ib_event_type event)
 {
···
 }
 EXPORT_SYMBOL(rdma_disconnect);

-static int cma_ib_mc_handler(int status, struct ib_sa_multicast *multicast)
+static void cma_make_mc_event(int status, struct rdma_id_private *id_priv,
+			      struct ib_sa_multicast *multicast,
+			      struct rdma_cm_event *event,
+			      struct cma_multicast *mc)
 {
-	struct rdma_id_private *id_priv;
-	struct cma_multicast *mc = multicast->context;
-	struct rdma_cm_event event = {};
-	int ret = 0;
-
-	id_priv = mc->id_priv;
-	mutex_lock(&id_priv->handler_mutex);
-	if (id_priv->state != RDMA_CM_ADDR_BOUND &&
-	    id_priv->state != RDMA_CM_ADDR_RESOLVED)
-		goto out;
+	struct rdma_dev_addr *dev_addr;
+	enum ib_gid_type gid_type;
+	struct net_device *ndev;

 	if (!status)
 		status = cma_set_qkey(id_priv, be32_to_cpu(multicast->rec.qkey));
 	else
 		pr_debug_ratelimited("RDMA CM: MULTICAST_ERROR: failed to join multicast. status %d\n",
 				     status);
-	mutex_lock(&id_priv->qp_mutex);
-	if (!status && id_priv->id.qp) {
-		status = ib_attach_mcast(id_priv->id.qp, &multicast->rec.mgid,
-					 be16_to_cpu(multicast->rec.mlid));
-		if (status)
-			pr_debug_ratelimited("RDMA CM: MULTICAST_ERROR: failed to attach QP. status %d\n",
-					     status);
+
+	event->status = status;
+	event->param.ud.private_data = mc->context;
+	if (status) {
+		event->event = RDMA_CM_EVENT_MULTICAST_ERROR;
+		return;
 	}
-	mutex_unlock(&id_priv->qp_mutex);

-	event.status = status;
-	event.param.ud.private_data = mc->context;
-	if (!status) {
-		struct rdma_dev_addr *dev_addr =
-			&id_priv->id.route.addr.dev_addr;
-		struct net_device *ndev =
-			dev_get_by_index(dev_addr->net, dev_addr->bound_dev_if);
-		enum ib_gid_type gid_type =
-			id_priv->cma_dev->default_gid_type[id_priv->id.port_num -
-			rdma_start_port(id_priv->cma_dev->device)];
+	dev_addr = &id_priv->id.route.addr.dev_addr;
+	ndev = dev_get_by_index(dev_addr->net, dev_addr->bound_dev_if);
+	gid_type =
+		id_priv->cma_dev
+			->default_gid_type[id_priv->id.port_num -
+					   rdma_start_port(
+						   id_priv->cma_dev->device)];

-		event.event = RDMA_CM_EVENT_MULTICAST_JOIN;
-		ret = ib_init_ah_from_mcmember(id_priv->id.device,
-					       id_priv->id.port_num,
-					       &multicast->rec,
-					       ndev, gid_type,
-					       &event.param.ud.ah_attr);
-		if (ret)
-			event.event = RDMA_CM_EVENT_MULTICAST_ERROR;
+	event->event = RDMA_CM_EVENT_MULTICAST_JOIN;
+	if (ib_init_ah_from_mcmember(id_priv->id.device, id_priv->id.port_num,
+				     &multicast->rec, ndev, gid_type,
+				     &event->param.ud.ah_attr)) {
+		event->event = RDMA_CM_EVENT_MULTICAST_ERROR;
+		goto out;
+	}

-		event.param.ud.qp_num = 0xFFFFFF;
-		event.param.ud.qkey = be32_to_cpu(multicast->rec.qkey);
4414 - if (ndev) 4415 - dev_put(ndev); 4416 - } else 4417 - event.event = RDMA_CM_EVENT_MULTICAST_ERROR; 4340 + event->param.ud.qp_num = 0xFFFFFF; 4341 + event->param.ud.qkey = be32_to_cpu(multicast->rec.qkey); 4418 4342 4343 + out: 4344 + if (ndev) 4345 + dev_put(ndev); 4346 + } 4347 + 4348 + static int cma_ib_mc_handler(int status, struct ib_sa_multicast *multicast) 4349 + { 4350 + struct cma_multicast *mc = multicast->context; 4351 + struct rdma_id_private *id_priv = mc->id_priv; 4352 + struct rdma_cm_event event = {}; 4353 + int ret = 0; 4354 + 4355 + mutex_lock(&id_priv->handler_mutex); 4356 + if (READ_ONCE(id_priv->state) == RDMA_CM_DEVICE_REMOVAL || 4357 + READ_ONCE(id_priv->state) == RDMA_CM_DESTROYING) 4358 + goto out; 4359 + 4360 + cma_make_mc_event(status, id_priv, multicast, &event, mc); 4419 4361 ret = cma_cm_event_handler(id_priv, &event); 4420 - 4421 4362 rdma_destroy_ah_attr(&event.param.ud.ah_attr); 4422 4363 if (ret) { 4423 4364 destroy_id_handler_unlock(id_priv); ··· 4510 4445 IB_SA_MCMEMBER_REC_MTU | 4511 4446 IB_SA_MCMEMBER_REC_HOP_LIMIT; 4512 4447 4513 - mc->multicast.ib = ib_sa_join_multicast(&sa_client, id_priv->id.device, 4514 - id_priv->id.port_num, &rec, 4515 - comp_mask, GFP_KERNEL, 4516 - cma_ib_mc_handler, mc); 4517 - return PTR_ERR_OR_ZERO(mc->multicast.ib); 4518 - } 4519 - 4520 - static void iboe_mcast_work_handler(struct work_struct *work) 4521 - { 4522 - struct iboe_mcast_work *mw = container_of(work, struct iboe_mcast_work, work); 4523 - struct cma_multicast *mc = mw->mc; 4524 - struct ib_sa_multicast *m = mc->multicast.ib; 4525 - 4526 - mc->multicast.ib->context = mc; 4527 - cma_ib_mc_handler(0, m); 4528 - kref_put(&mc->mcref, release_mc); 4529 - kfree(mw); 4448 + mc->sa_mc = ib_sa_join_multicast(&sa_client, id_priv->id.device, 4449 + id_priv->id.port_num, &rec, comp_mask, 4450 + GFP_KERNEL, cma_ib_mc_handler, mc); 4451 + return PTR_ERR_OR_ZERO(mc->sa_mc); 4530 4452 } 4531 4453 4532 4454 static void cma_iboe_set_mgid(struct 
sockaddr *addr, union ib_gid *mgid, ··· 4548 4496 static int cma_iboe_join_multicast(struct rdma_id_private *id_priv, 4549 4497 struct cma_multicast *mc) 4550 4498 { 4551 - struct iboe_mcast_work *work; 4499 + struct cma_work *work; 4552 4500 struct rdma_dev_addr *dev_addr = &id_priv->id.route.addr.dev_addr; 4553 4501 int err = 0; 4554 4502 struct sockaddr *addr = (struct sockaddr *)&mc->addr; 4555 4503 struct net_device *ndev = NULL; 4504 + struct ib_sa_multicast ib; 4556 4505 enum ib_gid_type gid_type; 4557 4506 bool send_only; 4558 4507 4559 4508 send_only = mc->join_state == BIT(SENDONLY_FULLMEMBER_JOIN); 4560 4509 4561 - if (cma_zero_addr((struct sockaddr *)&mc->addr)) 4510 + if (cma_zero_addr(addr)) 4562 4511 return -EINVAL; 4563 4512 4564 4513 work = kzalloc(sizeof *work, GFP_KERNEL); 4565 4514 if (!work) 4566 4515 return -ENOMEM; 4567 4516 4568 - mc->multicast.ib = kzalloc(sizeof(struct ib_sa_multicast), GFP_KERNEL); 4569 - if (!mc->multicast.ib) { 4570 - err = -ENOMEM; 4571 - goto out1; 4572 - } 4573 - 4574 4517 gid_type = id_priv->cma_dev->default_gid_type[id_priv->id.port_num - 4575 4518 rdma_start_port(id_priv->cma_dev->device)]; 4576 - cma_iboe_set_mgid(addr, &mc->multicast.ib->rec.mgid, gid_type); 4519 + cma_iboe_set_mgid(addr, &ib.rec.mgid, gid_type); 4577 4520 4578 - mc->multicast.ib->rec.pkey = cpu_to_be16(0xffff); 4521 + ib.rec.pkey = cpu_to_be16(0xffff); 4579 4522 if (id_priv->id.ps == RDMA_PS_UDP) 4580 - mc->multicast.ib->rec.qkey = cpu_to_be32(RDMA_UDP_QKEY); 4523 + ib.rec.qkey = cpu_to_be32(RDMA_UDP_QKEY); 4581 4524 4582 4525 if (dev_addr->bound_dev_if) 4583 4526 ndev = dev_get_by_index(dev_addr->net, dev_addr->bound_dev_if); 4584 4527 if (!ndev) { 4585 4528 err = -ENODEV; 4586 - goto out2; 4529 + goto err_free; 4587 4530 } 4588 - mc->multicast.ib->rec.rate = iboe_get_rate(ndev); 4589 - mc->multicast.ib->rec.hop_limit = 1; 4590 - mc->multicast.ib->rec.mtu = iboe_get_mtu(ndev->mtu); 4531 + ib.rec.rate = iboe_get_rate(ndev); 4532 + 
ib.rec.hop_limit = 1; 4533 + ib.rec.mtu = iboe_get_mtu(ndev->mtu); 4591 4534 4592 4535 if (addr->sa_family == AF_INET) { 4593 4536 if (gid_type == IB_GID_TYPE_ROCE_UDP_ENCAP) { 4594 - mc->multicast.ib->rec.hop_limit = IPV6_DEFAULT_HOPLIMIT; 4537 + ib.rec.hop_limit = IPV6_DEFAULT_HOPLIMIT; 4595 4538 if (!send_only) { 4596 - err = cma_igmp_send(ndev, &mc->multicast.ib->rec.mgid, 4539 + err = cma_igmp_send(ndev, &ib.rec.mgid, 4597 4540 true); 4598 4541 } 4599 4542 } ··· 4597 4550 err = -ENOTSUPP; 4598 4551 } 4599 4552 dev_put(ndev); 4600 - if (err || !mc->multicast.ib->rec.mtu) { 4553 + if (err || !ib.rec.mtu) { 4601 4554 if (!err) 4602 4555 err = -EINVAL; 4603 - goto out2; 4556 + goto err_free; 4604 4557 } 4605 4558 rdma_ip2gid((struct sockaddr *)&id_priv->id.route.addr.src_addr, 4606 - &mc->multicast.ib->rec.port_gid); 4559 + &ib.rec.port_gid); 4607 4560 work->id = id_priv; 4608 - work->mc = mc; 4609 - INIT_WORK(&work->work, iboe_mcast_work_handler); 4610 - kref_get(&mc->mcref); 4561 + INIT_WORK(&work->work, cma_work_handler); 4562 + cma_make_mc_event(0, id_priv, &ib, &work->event, mc); 4563 + /* Balances with cma_id_put() in cma_work_handler */ 4564 + cma_id_get(id_priv); 4611 4565 queue_work(cma_wq, &work->work); 4612 - 4613 4566 return 0; 4614 4567 4615 - out2: 4616 - kfree(mc->multicast.ib); 4617 - out1: 4568 + err_free: 4618 4569 kfree(work); 4619 4570 return err; 4620 4571 } ··· 4620 4575 int rdma_join_multicast(struct rdma_cm_id *id, struct sockaddr *addr, 4621 4576 u8 join_state, void *context) 4622 4577 { 4623 - struct rdma_id_private *id_priv; 4578 + struct rdma_id_private *id_priv = 4579 + container_of(id, struct rdma_id_private, id); 4624 4580 struct cma_multicast *mc; 4625 4581 int ret; 4626 4582 4627 - if (!id->device) 4583 + /* Not supported for kernel QPs */ 4584 + if (WARN_ON(id->qp)) 4628 4585 return -EINVAL; 4629 4586 4630 - id_priv = container_of(id, struct rdma_id_private, id); 4631 - if (!cma_comp(id_priv, RDMA_CM_ADDR_BOUND) && 4632 - 
!cma_comp(id_priv, RDMA_CM_ADDR_RESOLVED)) 4587 + /* ULP is calling this wrong. */ 4588 + if (!id->device || (READ_ONCE(id_priv->state) != RDMA_CM_ADDR_BOUND && 4589 + READ_ONCE(id_priv->state) != RDMA_CM_ADDR_RESOLVED)) 4633 4590 return -EINVAL; 4634 4591 4635 - mc = kmalloc(sizeof *mc, GFP_KERNEL); 4592 + mc = kzalloc(sizeof(*mc), GFP_KERNEL); 4636 4593 if (!mc) 4637 4594 return -ENOMEM; 4638 4595 ··· 4644 4597 mc->join_state = join_state; 4645 4598 4646 4599 if (rdma_protocol_roce(id->device, id->port_num)) { 4647 - kref_init(&mc->mcref); 4648 4600 ret = cma_iboe_join_multicast(id_priv, mc); 4649 4601 if (ret) 4650 4602 goto out_err; ··· 4675 4629 id_priv = container_of(id, struct rdma_id_private, id); 4676 4630 spin_lock_irq(&id_priv->lock); 4677 4631 list_for_each_entry(mc, &id_priv->mc_list, list) { 4678 - if (!memcmp(&mc->addr, addr, rdma_addr_size(addr))) { 4679 - list_del(&mc->list); 4680 - spin_unlock_irq(&id_priv->lock); 4632 + if (memcmp(&mc->addr, addr, rdma_addr_size(addr)) != 0) 4633 + continue; 4634 + list_del(&mc->list); 4635 + spin_unlock_irq(&id_priv->lock); 4681 4636 4682 - if (id->qp) 4683 - ib_detach_mcast(id->qp, 4684 - &mc->multicast.ib->rec.mgid, 4685 - be16_to_cpu(mc->multicast.ib->rec.mlid)); 4686 - 4687 - BUG_ON(id_priv->cma_dev->device != id->device); 4688 - 4689 - if (rdma_cap_ib_mcast(id->device, id->port_num)) { 4690 - ib_sa_free_multicast(mc->multicast.ib); 4691 - kfree(mc); 4692 - } else if (rdma_protocol_roce(id->device, id->port_num)) { 4693 - cma_leave_roce_mc_group(id_priv, mc); 4694 - } 4695 - return; 4696 - } 4637 + WARN_ON(id_priv->cma_dev->device != id->device); 4638 + destroy_mc(id_priv, mc); 4639 + return; 4697 4640 } 4698 4641 spin_unlock_irq(&id_priv->lock); 4699 4642 } ··· 4691 4656 static int cma_netdev_change(struct net_device *ndev, struct rdma_id_private *id_priv) 4692 4657 { 4693 4658 struct rdma_dev_addr *dev_addr; 4694 - struct cma_ndev_work *work; 4659 + struct cma_work *work; 4695 4660 4696 4661 dev_addr = 
&id_priv->id.route.addr.dev_addr; 4697 4662 ··· 4704 4669 if (!work) 4705 4670 return -ENOMEM; 4706 4671 4707 - INIT_WORK(&work->work, cma_ndev_work_handler); 4672 + INIT_WORK(&work->work, cma_work_handler); 4708 4673 work->id = id_priv; 4709 4674 work->event.event = RDMA_CM_EVENT_ADDR_CHANGE; 4710 4675 cma_id_get(id_priv);
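The cma.c hunks above make rdma_accept() assert that handler_mutex is held and export rdma_lock_handler()/rdma_unlock_handler(), presumably so that a caller outside the event handler callback can take the lock around the call. The following is a minimal single-threaded userspace sketch of that calling convention, not kernel code: struct conn_id, the conn_* functions and the boolean standing in for the mutex are all hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
enum conn_state { CONN_IDLE, CONN_CONNECT };

struct conn_id {
	bool handler_locked;	/* models handler_mutex being held */
	enum conn_state state;
};

/* Models rdma_lock_handler(): take the per-ID handler lock. */
static void conn_lock_handler(struct conn_id *id)
{
	assert(!id->handler_locked);
	id->handler_locked = true;
}

/* Models rdma_unlock_handler(): release the per-ID handler lock. */
static void conn_unlock_handler(struct conn_id *id)
{
	assert(id->handler_locked);
	id->handler_locked = false;
}

/*
 * Models the reworked rdma_accept(): the caller must already hold the
 * handler lock, and the ID must be in the CONNECT state.
 */
static int conn_accept(struct conn_id *id)
{
	assert(id->handler_locked);	/* stands in for lockdep_assert_held() */
	if (id->state != CONN_CONNECT)
		return -EINVAL;
	return 0;
}

/* A caller that is not inside the handler brackets the accept itself. */
static int conn_accept_from_outside_handler(struct conn_id *id)
{
	int ret;

	conn_lock_handler(id);
	ret = conn_accept(id);
	conn_unlock_handler(id);
	return ret;
}
```

In this model a call to conn_accept() without the lock trips the assertion immediately, which is the same kind of early failure lockdep_assert_held() is meant to provide on debug kernels.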
+5 -4
drivers/infiniband/core/cma_configfs.c
···
 {
 	struct cma_device *cma_dev;
 	struct cma_dev_port_group *group;
-	int gid_type = ib_cache_gid_parse_type_str(buf);
+	int gid_type;
 	ssize_t ret;
-
-	if (gid_type < 0)
-		return -EINVAL;

 	ret = cma_configfs_params_get(item, &cma_dev, &group);
 	if (ret)
 		return ret;
+
+	gid_type = ib_cache_gid_parse_type_str(buf);
+	if (gid_type < 0)
+		return -EINVAL;

 	ret = cma_set_default_gid_type(cma_dev, group->port_num, gid_type);

-40
drivers/infiniband/core/cma_trace.h
···
 #include <linux/tracepoint.h>
 #include <trace/events/rdma.h>

-/*
- * enum ib_cm_event_type, from include/rdma/ib_cm.h
- */
-#define IB_CM_EVENT_LIST			\
-	ib_cm_event(REQ_ERROR)			\
-	ib_cm_event(REQ_RECEIVED)		\
-	ib_cm_event(REP_ERROR)			\
-	ib_cm_event(REP_RECEIVED)		\
-	ib_cm_event(RTU_RECEIVED)		\
-	ib_cm_event(USER_ESTABLISHED)		\
-	ib_cm_event(DREQ_ERROR)			\
-	ib_cm_event(DREQ_RECEIVED)		\
-	ib_cm_event(DREP_RECEIVED)		\
-	ib_cm_event(TIMEWAIT_EXIT)		\
-	ib_cm_event(MRA_RECEIVED)		\
-	ib_cm_event(REJ_RECEIVED)		\
-	ib_cm_event(LAP_ERROR)			\
-	ib_cm_event(LAP_RECEIVED)		\
-	ib_cm_event(APR_RECEIVED)		\
-	ib_cm_event(SIDR_REQ_ERROR)		\
-	ib_cm_event(SIDR_REQ_RECEIVED)		\
-	ib_cm_event_end(SIDR_REP_RECEIVED)
-
-#undef ib_cm_event
-#undef ib_cm_event_end
-
-#define ib_cm_event(x)		TRACE_DEFINE_ENUM(IB_CM_##x);
-#define ib_cm_event_end(x)	TRACE_DEFINE_ENUM(IB_CM_##x);
-
-IB_CM_EVENT_LIST
-
-#undef ib_cm_event
-#undef ib_cm_event_end
-
-#define ib_cm_event(x)		{ IB_CM_##x, #x },
-#define ib_cm_event_end(x)	{ IB_CM_##x, #x }
-
-#define rdma_show_ib_cm_event(x) \
-		__print_symbolic(x, IB_CM_EVENT_LIST)
-

 DECLARE_EVENT_CLASS(cma_fsm_class,
 	TP_PROTO(
+5 -8
drivers/infiniband/core/core_priv.h
···
 #include <rdma/ib_mad.h>
 #include <rdma/restrack.h>
 #include "mad_priv.h"
+#include "restrack.h"

 /* Total number of ports combined across all struct ib_devices's */
 #define RDMA_MAX_PORTS 8192
···
 	INIT_LIST_HEAD(&qp->rdma_mrs);
 	INIT_LIST_HEAD(&qp->sig_mrs);

+	rdma_restrack_new(&qp->res, RDMA_RESTRACK_QP);
 	/*
 	 * We don't track XRC QPs for now, because they don't have PD
 	 * and more importantly they are created internaly by driver,
···
 	 */
 	is_xrc = qp_type == IB_QPT_XRC_INI || qp_type == IB_QPT_XRC_TGT;
 	if ((qp_type < IB_QPT_MAX && !is_xrc) || qp_type == IB_QPT_DRIVER) {
-		qp->res.type = RDMA_RESTRACK_QP;
-		if (uobj)
-			rdma_restrack_uadd(&qp->res);
-		else
-			rdma_restrack_kadd(&qp->res);
-	} else
-		qp->res.valid = false;
-
+		rdma_restrack_parent_name(&qp->res, &pd->res);
+		rdma_restrack_add(&qp->res);
+	}
 	return qp;
 }

+6 -9
drivers/infiniband/core/counters.c
···

 	counter->device = dev;
 	counter->port = port;
-	counter->res.type = RDMA_RESTRACK_COUNTER;
-	counter->stats = dev->ops.counter_alloc_stats(counter);
+
+	rdma_restrack_new(&counter->res, RDMA_RESTRACK_COUNTER);
+	counter->stats = dev->ops.counter_alloc_stats(counter);
 	if (!counter->stats)
 		goto err_stats;

···
 	mutex_unlock(&port_counter->lock);
 	kfree(counter->stats);
 err_stats:
+	rdma_restrack_put(&counter->res);
 	kfree(counter);
 	return NULL;
 }
···
 static void rdma_counter_res_add(struct rdma_counter *counter,
 				 struct ib_qp *qp)
 {
-	if (rdma_is_kernel_res(&qp->res)) {
-		rdma_restrack_set_task(&counter->res, qp->res.kern_name);
-		rdma_restrack_kadd(&counter->res);
-	} else {
-		rdma_restrack_attach_task(&counter->res, qp->res.task);
-		rdma_restrack_uadd(&counter->res);
-	}
+	rdma_restrack_parent_name(&counter->res, &qp->res);
+	rdma_restrack_add(&counter->res);
 }

 static void counter_release(struct kref *kref)
+19 -20
drivers/infiniband/core/cq.c
···
 }

 /**
- * __ib_alloc_cq_user - allocate a completion queue
+ * __ib_alloc_cq allocate a completion queue
  * @dev: device to allocate the CQ for
  * @private: driver private data, accessible from cq->cq_context
  * @nr_cqe: number of CQEs to allocate
  * @comp_vector: HCA completion vectors for this CQ
  * @poll_ctx: context to poll the CQ from.
  * @caller: module owner name.
- * @udata: Valid user data or NULL for kernel object
  *
  * This is the proper interface to allocate a CQ for in-kernel users. A
  * CQ allocated with this interface will automatically be polled from the
  * specified context. The ULP must use wr->wr_cqe instead of wr->wr_id
  * to use this CQ abstraction.
  */
-struct ib_cq *__ib_alloc_cq_user(struct ib_device *dev, void *private,
-				 int nr_cqe, int comp_vector,
-				 enum ib_poll_context poll_ctx,
-				 const char *caller, struct ib_udata *udata)
+struct ib_cq *__ib_alloc_cq(struct ib_device *dev, void *private, int nr_cqe,
+			    int comp_vector, enum ib_poll_context poll_ctx,
+			    const char *caller)
 {
 	struct ib_cq_init_attr cq_attr = {
 		.cqe = nr_cqe,
···
 	if (!cq->wc)
 		goto out_free_cq;

-	cq->res.type = RDMA_RESTRACK_CQ;
-	rdma_restrack_set_task(&cq->res, caller);
+	rdma_restrack_new(&cq->res, RDMA_RESTRACK_CQ);
+	rdma_restrack_set_name(&cq->res, caller);

 	ret = dev->ops.create_cq(cq, &cq_attr, NULL);
 	if (ret)
 		goto out_free_wc;
-
-	rdma_restrack_kadd(&cq->res);

 	rdma_dim_init(cq);

···
 		goto out_destroy_cq;
 	}

+	rdma_restrack_add(&cq->res);
 	trace_cq_alloc(cq, nr_cqe, comp_vector, poll_ctx);
 	return cq;

 out_destroy_cq:
 	rdma_dim_destroy(cq);
-	rdma_restrack_del(&cq->res);
-	cq->device->ops.destroy_cq(cq, udata);
+	cq->device->ops.destroy_cq(cq, NULL);
 out_free_wc:
+	rdma_restrack_put(&cq->res);
 	kfree(cq->wc);
 out_free_cq:
 	kfree(cq);
 	trace_cq_alloc_error(nr_cqe, comp_vector, poll_ctx, ret);
 	return ERR_PTR(ret);
 }
-EXPORT_SYMBOL(__ib_alloc_cq_user);
+EXPORT_SYMBOL(__ib_alloc_cq);

 /**
  * __ib_alloc_cq_any - allocate a completion queue
···
 			atomic_inc_return(&counter) %
 			min_t(int, dev->num_comp_vectors, num_online_cpus());

-	return __ib_alloc_cq_user(dev, private, nr_cqe, comp_vector, poll_ctx,
-				  caller, NULL);
+	return __ib_alloc_cq(dev, private, nr_cqe, comp_vector, poll_ctx,
+			     caller);
 }
 EXPORT_SYMBOL(__ib_alloc_cq_any);

 /**
- * ib_free_cq_user - free a completion queue
+ * ib_free_cq - free a completion queue
  * @cq: completion queue to free.
- * @udata: User data or NULL for kernel object
  */
-void ib_free_cq_user(struct ib_cq *cq, struct ib_udata *udata)
+void ib_free_cq(struct ib_cq *cq)
 {
+	int ret;
+
 	if (WARN_ON_ONCE(atomic_read(&cq->usecnt)))
 		return;
 	if (WARN_ON_ONCE(cq->cqe_used))
···

 	rdma_dim_destroy(cq);
 	trace_cq_free(cq);
+	ret = cq->device->ops.destroy_cq(cq, NULL);
+	WARN_ONCE(ret, "Destroy of kernel CQ shouldn't fail");
 	rdma_restrack_del(&cq->res);
-	cq->device->ops.destroy_cq(cq, udata);
 	kfree(cq->wc);
 	kfree(cq);
 }
-EXPORT_SYMBOL(ib_free_cq_user);
+EXPORT_SYMBOL(ib_free_cq);

 void ib_cq_pool_init(struct ib_device *dev)
 {
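The cq.c hunks above reorder the allocation path: the tracking entry is initialized before the driver's create_cq call, published with rdma_restrack_add() only once every later step has succeeded, and released with rdma_restrack_put() on the unwind path. The following is a rough userspace sketch of that ordering with goto-based unwinding. struct cq_model, struct tracked and the two failure-injection flags are hypothetical and exist only to exercise the error labels.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical model of a tracked resource; not the kernel structure. */
struct tracked {
	int refs;	/* models the kref set up by rdma_restrack_new() */
	bool visible;	/* models res->valid after rdma_restrack_add() */
};

struct cq_model {
	struct tracked res;
	int *wc;		/* models the cq->wc array */
	bool hw_created;	/* models a successful driver create_cq */
};

/*
 * fail_create injects a failure of the driver create step; fail_late
 * injects a failure of a step that runs after the create succeeded.
 */
static struct cq_model *cq_model_alloc(bool fail_create, bool fail_late)
{
	struct cq_model *cq;

	cq = calloc(1, sizeof(*cq));
	if (!cq)
		return NULL;
	cq->wc = calloc(16, sizeof(*cq->wc));
	if (!cq->wc)
		goto out_free_cq;

	cq->res.refs = 1;		/* "new": before the driver create */
	if (fail_create)
		goto out_free_wc;
	cq->hw_created = true;

	if (fail_late)
		goto out_destroy_cq;

	cq->res.visible = true;		/* "add": only once nothing can fail */
	return cq;

out_destroy_cq:
	cq->hw_created = false;		/* driver destroy */
out_free_wc:
	cq->res.refs--;			/* "put": entry was never published */
	free(cq->wc);
out_free_cq:
	free(cq);
	return NULL;
}

/* Teardown mirrors the diff: driver destroy first, then untrack and free. */
static void cq_model_free(struct cq_model *cq)
{
	cq->hw_created = false;
	cq->res.visible = false;
	cq->res.refs--;
	free(cq->wc);
	free(cq);
}
```

Publishing last means a half-built object is never visible to lookups, so in this model the error labels only need to drop the reference and never have to remove the entry from a lookup table.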
+24 -53
drivers/infiniband/core/device.c
···
 	return ret;
 }

-static void setup_dma_device(struct ib_device *device)
+static void setup_dma_device(struct ib_device *device,
+			     struct device *dma_device)
 {
-	struct device *parent = device->dev.parent;
-
-	WARN_ON_ONCE(device->dma_device);
-
-#ifdef CONFIG_DMA_OPS
-	if (device->dev.dma_ops) {
-		/*
-		 * The caller provided custom DMA operations. Copy the
-		 * DMA-related fields that are used by e.g. dma_alloc_coherent()
-		 * into device->dev.
-		 */
-		device->dma_device = &device->dev;
-		if (!device->dev.dma_mask) {
-			if (parent)
-				device->dev.dma_mask = parent->dma_mask;
-			else
-				WARN_ON_ONCE(true);
-		}
-		if (!device->dev.coherent_dma_mask) {
-			if (parent)
-				device->dev.coherent_dma_mask =
-					parent->coherent_dma_mask;
-			else
-				WARN_ON_ONCE(true);
-		}
-	} else
-#endif /* CONFIG_DMA_OPS */
-	{
-		/*
-		 * The caller did not provide custom DMA operations. Use the
-		 * DMA mapping operations of the parent device.
-		 */
-		WARN_ON_ONCE(!parent);
-		device->dma_device = parent;
+	/*
+	 * If the caller does not provide a DMA capable device then the IB
+	 * device will be used. In this case the caller should fully setup the
+	 * ibdev for DMA. This usually means using dma_virt_ops.
+	 */
+#ifdef CONFIG_DMA_VIRT_OPS
+	if (!dma_device) {
+		device->dev.dma_ops = &dma_virt_ops;
+		dma_device = &device->dev;
 	}
-
-	if (!device->dev.dma_parms) {
-		if (parent) {
-			/*
-			 * The caller did not provide DMA parameters, so
-			 * 'parent' probably represents a PCI device. The PCI
-			 * core sets the maximum segment size to 64
-			 * KB. Increase this parameter to 2 GB.
-			 */
-			device->dev.dma_parms = parent->dma_parms;
-			dma_set_max_seg_size(device->dma_device, SZ_2G);
-		} else {
-			WARN_ON_ONCE(true);
-		}
-	}
+#endif
+	WARN_ON(!dma_device);
+	device->dma_device = dma_device;
+	WARN_ON(!device->dma_device->dma_parms);
 }

 /*
···
 	struct ib_udata uhw = {.outlen = 0, .inlen = 0};
 	int ret;

-	setup_dma_device(device);
 	ib_device_check_mandatory(device);

 	ret = setup_port_data(device);
···
  * ib_register_device - Register an IB device with IB core
  * @device: Device to register
  * @name: unique string device name. This may include a '%' which will
- * cause a unique index to be added to the passed device name.
+ * 	  cause a unique index to be added to the passed device name.
+ * @dma_device: pointer to a DMA-capable device. If %NULL, then the IB
+ *	        device will be used. In this case the caller should fully
+ *		setup the ibdev for DMA. This usually means using dma_virt_ops.
  *
  * Low-level drivers use ib_register_device() to register their
  * devices with the IB core. All registered clients will receive a
···
  * asynchronously then the device pointer may become freed as soon as this
  * function returns.
  */
-int ib_register_device(struct ib_device *device, const char *name)
+int ib_register_device(struct ib_device *device, const char *name,
+		       struct device *dma_device)
 {
 	int ret;

···
 	if (ret)
 		return ret;

+	setup_dma_device(device, dma_device);
 	ret = setup_device(device);
 	if (ret)
 		return ret;
···
 	SET_OBJ_SIZE(dev_ops, ib_ah);
 	SET_OBJ_SIZE(dev_ops, ib_counters);
 	SET_OBJ_SIZE(dev_ops, ib_cq);
+	SET_OBJ_SIZE(dev_ops, ib_mw);
 	SET_OBJ_SIZE(dev_ops, ib_pd);
+	SET_OBJ_SIZE(dev_ops, ib_rwq_ind_table);
 	SET_OBJ_SIZE(dev_ops, ib_srq);
 	SET_OBJ_SIZE(dev_ops, ib_ucontext);
 	SET_OBJ_SIZE(dev_ops, ib_xrcd);
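With this change the caller of ib_register_device() states explicitly which device performs DMA, and a NULL argument selects the IB device itself together with dma_virt_ops. The sketch below models only that selection rule in userspace. struct dev_model, struct ibdev_model and virt_ops_model are hypothetical placeholders, and the model ignores that the kernel's fallback is compiled in only under CONFIG_DMA_VIRT_OPS.

```c
#include <stddef.h>

/* Hypothetical placeholder for a DMA ops table such as dma_virt_ops. */
struct dma_ops_model {
	int unused;
};

struct dev_model {
	const char *name;
	const struct dma_ops_model *dma_ops;
};

struct ibdev_model {
	struct dev_model dev;		/* the embedded struct device */
	struct dev_model *dma_device;	/* device actually used for DMA */
};

static const struct dma_ops_model virt_ops_model = { 0 };

/*
 * Models the new setup_dma_device(): an explicit DMA device is used as
 * given; a NULL one falls back to the IB device with virtual DMA ops.
 */
static void setup_dma_device_model(struct ibdev_model *ibdev,
				   struct dev_model *dma_device)
{
	if (!dma_device) {
		ibdev->dev.dma_ops = &virt_ops_model;
		dma_device = &ibdev->dev;
	}
	ibdev->dma_device = dma_device;
}
```

A hardware driver in this model would pass its parent bus device, while a software-only device would pass NULL. Which drivers in the tree take which path is not visible in this hunk.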
+17 -17
drivers/infiniband/core/rdma_core.c
···
 	lockdep_assert_held(&ufile->hw_destroy_rwsem);
 	assert_uverbs_usecnt(uobj, UVERBS_LOOKUP_WRITE);

-	if (reason == RDMA_REMOVE_ABORT_HWOBJ) {
-		reason = RDMA_REMOVE_ABORT;
-		ret = uobj->uapi_object->type_class->destroy_hw(uobj, reason,
-								attrs);
-		/*
-		 * Drivers are not permitted to ignore RDMA_REMOVE_ABORT, see
-		 * ib_is_destroy_retryable, cleanup_retryable == false here.
-		 */
-		WARN_ON(ret);
-	}
-
 	if (reason == RDMA_REMOVE_ABORT) {
 		WARN_ON(!list_empty(&uobj->list));
 		WARN_ON(!uobj->context);
···
 			      bool hw_obj_valid)
 {
 	struct ib_uverbs_file *ufile = uobj->ufile;
+	int ret;

-	uverbs_destroy_uobject(uobj,
-			       hw_obj_valid ? RDMA_REMOVE_ABORT_HWOBJ :
-					      RDMA_REMOVE_ABORT,
-			       attrs);
+	if (hw_obj_valid) {
+		ret = uobj->uapi_object->type_class->destroy_hw(
+			uobj, RDMA_REMOVE_ABORT, attrs);
+		/*
+		 * If the driver couldn't destroy the object then go ahead and
+		 * commit it. Leaking objects that can't be destroyed is only
+		 * done during FD close after the driver has a few more tries to
+		 * destroy it.
+		 */
+		if (WARN_ON(ret))
+			return rdma_alloc_commit_uobject(uobj, attrs);
+	}
+
+	uverbs_destroy_uobject(uobj, RDMA_REMOVE_ABORT, attrs);

 	/* Matches the down_read in rdma_alloc_begin_uobject */
 	up_read(&ufile->hw_destroy_rwsem);
···
 	if (!ufile->ucontext)
 		goto done;

-	ufile->ucontext->closing = true;
 	ufile->ucontext->cleanup_retryable = true;
 	while (!list_empty(&ufile->uobjects))
 		if (__uverbs_cleanup_ufile(ufile, reason)) {
 			/*
 			 * No entry was cleaned-up successfully during this
-			 * iteration
+			 * iteration. It is a driver bug to fail destruction.
 			 */
+			WARN_ON(!list_empty(&ufile->uobjects));
 			break;
 		}

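The abort path above no longer assumes that the hardware destroy succeeds: if the driver reports a failure, the half-created object is committed instead of being dropped, so it stays on the file's list and can be retried at FD close. The following is a minimal userspace sketch of that decision only. struct uobj_model, the outcome enum and the two destroy callbacks are hypothetical and are not the uverbs types.

```c
#include <errno.h>
#include <stdbool.h>

enum uobj_outcome { UOBJ_DESTROYED, UOBJ_COMMITTED };

/* Hypothetical model of a user object with a driver destroy hook. */
struct uobj_model {
	bool hw_obj_valid;
	int (*destroy_hw)(struct uobj_model *obj);	/* 0 on success */
};

/*
 * Models the reworked abort path: try the hardware destroy first, and
 * if the driver refuses, keep (commit) the object rather than leak it.
 */
static enum uobj_outcome alloc_abort_model(struct uobj_model *obj)
{
	if (obj->hw_obj_valid && obj->destroy_hw(obj) != 0)
		return UOBJ_COMMITTED;

	obj->hw_obj_valid = false;
	return UOBJ_DESTROYED;
}

/* Two example drivers: one that always succeeds, one that is busy. */
static int destroy_ok(struct uobj_model *obj)
{
	(void)obj;
	return 0;
}

static int destroy_busy(struct uobj_model *obj)
{
	(void)obj;
	return -EBUSY;
}
```

An object whose hardware half was never created skips the destroy hook entirely and is simply dropped, which matches the hw_obj_valid check in the hunk.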
+76 -85
drivers/infiniband/core/restrack.c
···
 }
 EXPORT_SYMBOL(rdma_restrack_count);

-static void set_kern_name(struct rdma_restrack_entry *res)
-{
-	struct ib_pd *pd;
-
-	switch (res->type) {
-	case RDMA_RESTRACK_QP:
-		pd = container_of(res, struct ib_qp, res)->pd;
-		if (!pd) {
-			WARN_ONCE(true, "XRC QPs are not supported\n");
-			/* Survive, despite the programmer's error */
-			res->kern_name = " ";
-		}
-		break;
-	case RDMA_RESTRACK_MR:
-		pd = container_of(res, struct ib_mr, res)->pd;
-		break;
-	default:
-		/* Other types set kern_name directly */
-		pd = NULL;
-		break;
-	}
-
-	if (pd)
-		res->kern_name = pd->res.kern_name;
-}
-
 static struct ib_device *res_to_dev(struct rdma_restrack_entry *res)
 {
 	switch (res->type) {
···
 	}
 }

-void rdma_restrack_set_task(struct rdma_restrack_entry *res,
-			    const char *caller)
+/**
+ * rdma_restrack_attach_task() - attach the task onto this resource,
+ * valid for user space restrack entries.
+ * @res: resource entry
+ * @task: the task to attach
+ */
+static void rdma_restrack_attach_task(struct rdma_restrack_entry *res,
+				      struct task_struct *task)
+{
+	if (WARN_ON_ONCE(!task))
+		return;
+
+	if (res->task)
+		put_task_struct(res->task);
+	get_task_struct(task);
+	res->task = task;
+	res->user = true;
+}
+
+/**
+ * rdma_restrack_set_name() - set the task for this resource
+ * @res: resource entry
+ * @caller: kernel name, the current task will be used if the caller is NULL.
+ */
+void rdma_restrack_set_name(struct rdma_restrack_entry *res, const char *caller)
 {
 	if (caller) {
 		res->kern_name = caller;
 		return;
 	}

-	if (res->task)
-		put_task_struct(res->task);
-	get_task_struct(current);
-	res->task = current;
+	rdma_restrack_attach_task(res, current);
 }
-EXPORT_SYMBOL(rdma_restrack_set_task);
+EXPORT_SYMBOL(rdma_restrack_set_name);

 /**
- * rdma_restrack_attach_task() - attach the task onto this resource
- * @res: resource entry
- * @task: the task to attach, the current task will be used if it is NULL.
+ * rdma_restrack_parent_name() - set the restrack name properties based
+ * on parent restrack
+ * @dst: destination resource entry
+ * @parent: parent resource entry
  */
-void rdma_restrack_attach_task(struct rdma_restrack_entry *res,
-			       struct task_struct *task)
+void rdma_restrack_parent_name(struct rdma_restrack_entry *dst,
+			       const struct rdma_restrack_entry *parent)
 {
-	if (res->task)
-		put_task_struct(res->task);
-	get_task_struct(task);
-	res->task = task;
+	if (rdma_is_kernel_res(parent))
+		dst->kern_name = parent->kern_name;
+	else
+		rdma_restrack_attach_task(dst, parent->task);
 }
+EXPORT_SYMBOL(rdma_restrack_parent_name);

-static void rdma_restrack_add(struct rdma_restrack_entry *res)
+/**
+ * rdma_restrack_new() - Initializes new restrack entry to allow _put() interface
+ * to release memory in fully automatic way.
+ * @res - Entry to initialize
+ * @type - REstrack type
+ */
+void rdma_restrack_new(struct rdma_restrack_entry *res,
+		       enum rdma_restrack_type type)
+{
+	kref_init(&res->kref);
+	init_completion(&res->comp);
+	res->type = type;
+}
+EXPORT_SYMBOL(rdma_restrack_new);
+
+/**
+ * rdma_restrack_add() - add object to the reource tracking database
+ * @res: resource entry
+ */
+void rdma_restrack_add(struct rdma_restrack_entry *res)
 {
 	struct ib_device *dev = res_to_dev(res);
 	struct rdma_restrack_root *rt;
···

 	rt = &dev->res[res->type];

-	kref_init(&res->kref);
-	init_completion(&res->comp);
 	if (res->type == RDMA_RESTRACK_QP) {
 		/* Special case to ensure that LQPN points to right QP */
 		struct ib_qp *qp = container_of(res, struct ib_qp, res);
···
 	if (!ret)
 		res->valid = true;
 }
-
-/**
- * rdma_restrack_kadd() - add kernel object to the reource tracking database
- * @res: resource entry
- */
-void rdma_restrack_kadd(struct rdma_restrack_entry *res)
-{
-	res->task = NULL;
-	set_kern_name(res);
-	res->user = false;
-	rdma_restrack_add(res);
-}
-EXPORT_SYMBOL(rdma_restrack_kadd);
-
-/**
- * rdma_restrack_uadd() - add user object to the reource tracking database
- * @res: resource entry
- */
-void rdma_restrack_uadd(struct rdma_restrack_entry *res)
-{
-	if ((res->type != RDMA_RESTRACK_CM_ID) &&
-	    (res->type != RDMA_RESTRACK_COUNTER))
-		res->task = NULL;
-
-	if (!res->task)
-		rdma_restrack_set_task(res, NULL);
-	res->kern_name = NULL;
-
-	res->user = true;
-	rdma_restrack_add(res);
-}
-EXPORT_SYMBOL(rdma_restrack_uadd);
+EXPORT_SYMBOL(rdma_restrack_add);

 int __must_check rdma_restrack_get(struct rdma_restrack_entry *res)
 {
···
 	struct rdma_restrack_entry *res;

 	res = container_of(kref, struct rdma_restrack_entry, kref);
+	if (res->task) {
+		put_task_struct(res->task);
+		res->task = NULL;
+	}
 	complete(&res->comp);
 }

···
 }
 EXPORT_SYMBOL(rdma_restrack_put);

+/**
+ * rdma_restrack_del() - delete object from the reource tracking database
+ * @res: resource entry
+ */
 void rdma_restrack_del(struct rdma_restrack_entry *res)
 {
 	struct rdma_restrack_entry *old;
 	struct rdma_restrack_root *rt;
 	struct ib_device *dev;

-	if (!res->valid)
-		goto out;
+	if (!res->valid) {
+		if (res->task) {
+			put_task_struct(res->task);
+			res->task = NULL;
+		}
+		return;
+	}

 	dev = res_to_dev(res);
 	if (WARN_ON(!dev))
···
 	rt = &dev->res[res->type];

 	old = xa_erase(&rt->xa, res->id);
+	if (res->type == RDMA_RESTRACK_MR || res->type == RDMA_RESTRACK_QP)
+		return;
 	WARN_ON(old != res);
 	res->valid = false;

 	rdma_restrack_put(res);
 	wait_for_completion(&res->comp);
-
-out:
-	if (res->task) {
-		put_task_struct(res->task);
-		res->task = NULL;
-	}
 }
 EXPORT_SYMBOL(rdma_restrack_del);
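The restrack.c changes split tracking into two steps: rdma_restrack_new() sets up the reference count and completion, and rdma_restrack_add() publishes the entry. An object that fails before it is published can therefore be released with a plain rdma_restrack_put(). The sketch below is a deliberately simplified, single-threaded model of that lifecycle. struct res_model and the res_* helpers are hypothetical, a boolean stands in for the completion, and the MR/QP early return in rdma_restrack_del() is left out.

```c
#include <stdbool.h>

/* Hypothetical model of a restrack entry; not the kernel structure. */
struct res_model {
	int refs;	/* models the kref */
	bool valid;	/* models res->valid (entry is published) */
	bool released;	/* models complete(&res->comp) having run */
};

/* Models rdma_restrack_new(): one initial reference, not yet published. */
static void res_new(struct res_model *r)
{
	r->refs = 1;
	r->valid = false;
	r->released = false;
}

/* Models rdma_restrack_add(): make the entry visible to lookups. */
static void res_add(struct res_model *r)
{
	r->valid = true;
}

/* Models rdma_restrack_get(): fails once the count has reached zero. */
static bool res_get(struct res_model *r)
{
	if (r->refs == 0)
		return false;
	r->refs++;
	return true;
}

/* Models rdma_restrack_put(): the last put signals release. */
static void res_put(struct res_model *r)
{
	if (--r->refs == 0)
		r->released = true;
}

/*
 * Models rdma_restrack_del(): unpublish and drop the initial reference.
 * The kernel then blocks in wait_for_completion() until other holders
 * are gone; this single-threaded model only records the state.
 */
static void res_del(struct res_model *r)
{
	if (!r->valid)
		return;
	r->valid = false;
	res_put(r);
}
```

In this model an error path that runs before res_add() calls res_put() directly, while a published entry goes through res_del(), and any outstanding res_get() holder delays the release until its own put.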
+8 -2
drivers/infiniband/core/restrack.h
···

 int rdma_restrack_init(struct ib_device *dev);
 void rdma_restrack_clean(struct ib_device *dev);
-void rdma_restrack_attach_task(struct rdma_restrack_entry *res,
-			       struct task_struct *task);
+void rdma_restrack_add(struct rdma_restrack_entry *res);
+void rdma_restrack_del(struct rdma_restrack_entry *res);
+void rdma_restrack_new(struct rdma_restrack_entry *res,
+		       enum rdma_restrack_type type);
+void rdma_restrack_set_name(struct rdma_restrack_entry *res,
+			    const char *caller);
+void rdma_restrack_parent_name(struct rdma_restrack_entry *dst,
+			       const struct rdma_restrack_entry *parent);
 #endif /* _RDMA_CORE_RESTRACK_H_ */
+8 -7
drivers/infiniband/core/sysfs.c
···
 	struct gid_attr_group *gid_attr_group;
 	struct attribute_group gid_group;
 	struct attribute_group *pkey_group;
-	struct attribute_group *pma_table;
+	const struct attribute_group *pma_table;
 	struct attribute_group *hw_stats_ag;
 	struct rdma_hw_stats *hw_stats;
 	u8 port_num;
···

 	gid_attr = rdma_get_gid_attr(p->ibdev, p->port_num, tab_attr->index);
 	if (IS_ERR(gid_attr))
-		return PTR_ERR(gid_attr);
+		/* -EINVAL is returned for user space compatibility reasons. */
+		return -EINVAL;

 	ret = print(gid_attr, buf);
 	rdma_put_gid_attr(gid_attr);
···
 	NULL
 };

-static struct attribute_group pma_group = {
+static const struct attribute_group pma_group = {
 	.name = "counters",
 	.attrs = pma_attrs
 };

-static struct attribute_group pma_group_ext = {
+static const struct attribute_group pma_group_ext = {
 	.name = "counters",
 	.attrs = pma_attrs_ext
 };

-static struct attribute_group pma_group_noietf = {
+static const struct attribute_group pma_group_noietf = {
 	.name = "counters",
 	.attrs = pma_attrs_noietf
 };
···
  * Figure out which counter table to use depending on
  * the device capabilities.
  */
-static struct attribute_group *get_counter_table(struct ib_device *dev,
-						 int port_num)
+static const struct attribute_group *get_counter_table(struct ib_device *dev,
+						       int port_num)
 {
 	struct ib_class_port_info cpi;

+243 -309
drivers/infiniband/core/ucma.c
···
 	struct list_head ctx_list;
 	struct list_head event_list;
 	wait_queue_head_t poll_wait;
-	struct workqueue_struct *close_wq;
 };

 struct ucma_context {
···
 	struct completion comp;
 	refcount_t ref;
 	int events_reported;
-	int backlog;
+	atomic_t backlog;

 	struct ucma_file *file;
 	struct rdma_cm_id *cm_id;
···
 	u64 uid;

 	struct list_head list;
-	struct list_head mc_list;
-	/* mark that device is in process of destroying the internal HW
-	 * resources, protected by the ctx_table lock
-	 */
-	int closing;
 	/* sync between removal event and id destroy, protected by file mut */
 	int destroying;
 	struct work_struct close_work;
···

 	u64 uid;
 	u8 join_state;
-	struct list_head list;
 	struct sockaddr_storage addr;
 };

 struct ucma_event {
 	struct ucma_context *ctx;
+	struct ucma_context *conn_req_ctx;
 	struct ucma_multicast *mc;
 	struct list_head list;
-	struct rdma_cm_id *cm_id;
 	struct rdma_ucm_event_resp resp;
-	struct work_struct close_work;
 };

 static DEFINE_XARRAY_ALLOC(ctx_table);
 static DEFINE_XARRAY_ALLOC(multicast_table);

 static const struct file_operations ucma_fops;
+static int __destroy_id(struct ucma_context *ctx);

 static inline struct ucma_context *_ucma_find_context(int id,
 						      struct ucma_file *file)
···
 	ctx = xa_load(&ctx_table, id);
 	if (!ctx)
 		ctx = ERR_PTR(-ENOENT);
-	else if (ctx->file != file || !ctx->cm_id)
+	else if (ctx->file != file)
 		ctx = ERR_PTR(-EINVAL);
 	return ctx;
 }
···

 	xa_lock(&ctx_table);
 	ctx = _ucma_find_context(id, file);
-	if (!IS_ERR(ctx)) {
-		if (ctx->closing)
-			ctx = ERR_PTR(-EIO);
-		else
-			refcount_inc(&ctx->ref);
-	}
+	if (!IS_ERR(ctx))
+		if (!refcount_inc_not_zero(&ctx->ref))
+			ctx = ERR_PTR(-ENXIO);
 	xa_unlock(&ctx_table);
 	return ctx;
 }
···
 	return ctx;
 }

-static void ucma_close_event_id(struct work_struct *work)
-{
-	struct ucma_event *uevent_close = container_of(work, struct ucma_event, close_work);
-
-	rdma_destroy_id(uevent_close->cm_id);
-	kfree(uevent_close);
-}
-
 static void ucma_close_id(struct work_struct *work)
 {
 	struct ucma_context *ctx = container_of(work, struct ucma_context, close_work);
···
 	wait_for_completion(&ctx->comp);
 	/* No new events will be generated after destroying the id. */
 	rdma_destroy_id(ctx->cm_id);
+
+	/*
+	 * At this point ctx->ref is zero so the only place the ctx can be is in
+	 * a uevent or in __destroy_id(). Since the former doesn't touch
+	 * ctx->cm_id and the latter sync cancels this, there is no races with
+	 * this store.
+	 */
+	ctx->cm_id = NULL;
 }

 static struct ucma_context *ucma_alloc_ctx(struct ucma_file *file)
···
 	INIT_WORK(&ctx->close_work, ucma_close_id);
 	refcount_set(&ctx->ref, 1);
 	init_completion(&ctx->comp);
-	INIT_LIST_HEAD(&ctx->mc_list);
+	/* So list_del() will work if we don't do ucma_finish_ctx() */
+	INIT_LIST_HEAD(&ctx->list);
 	ctx->file = file;
 	mutex_init(&ctx->mutex);

-	if (xa_alloc(&ctx_table, &ctx->id, ctx, xa_limit_32b, GFP_KERNEL))
-		goto error;
-
-	list_add_tail(&ctx->list, &file->ctx_list);
+	if (xa_alloc(&ctx_table, &ctx->id, NULL, xa_limit_32b, GFP_KERNEL)) {
+		kfree(ctx);
+		return NULL;
+	}
 	return ctx;
-
-error:
-	kfree(ctx);
-	return NULL;
 }

-static struct ucma_multicast* ucma_alloc_multicast(struct ucma_context *ctx)
+static void ucma_finish_ctx(struct ucma_context *ctx)
 {
-	struct ucma_multicast *mc;
-
-	mc = kzalloc(sizeof(*mc), GFP_KERNEL);
-	if (!mc)
-		return NULL;
-
-	mc->ctx = ctx;
-	if (xa_alloc(&multicast_table, &mc->id, NULL, xa_limit_32b, GFP_KERNEL))
-		goto error;
-
-	list_add_tail(&mc->list, &ctx->mc_list);
-	return mc;
-
-error:
-	kfree(mc);
-	return NULL;
+	lockdep_assert_held(&ctx->file->mut);
+	list_add_tail(&ctx->list, &ctx->file->ctx_list);
+	xa_store(&ctx_table, ctx->id, ctx, GFP_KERNEL);
 }

 static void ucma_copy_conn_event(struct rdma_ucm_conn_param *dst,
···
 	dst->qkey = src->qkey;
 }

-static void ucma_set_event_context(struct ucma_context *ctx,
-				   struct rdma_cm_event *event,
-				   struct ucma_event *uevent)
+static struct ucma_event *ucma_create_uevent(struct ucma_context *ctx,
+					     struct rdma_cm_event *event)
 {
+	struct ucma_event *uevent;
+
+	uevent = kzalloc(sizeof(*uevent), GFP_KERNEL);
+	if (!uevent)
+		return NULL;
+
 	uevent->ctx = ctx;
 	switch (event->event) {
 	case RDMA_CM_EVENT_MULTICAST_JOIN:
···
 		uevent->resp.id = ctx->id;
 		break;
 	}
-}
-
-/* Called with file->mut locked for the relevant context. */
-static void ucma_removal_event_handler(struct rdma_cm_id *cm_id)
-{
-	struct ucma_context *ctx = cm_id->context;
-	struct ucma_event *con_req_eve;
-	int event_found = 0;
-
-	if (ctx->destroying)
-		return;
-
-	/* only if context is pointing to cm_id that it owns it and can be
-	 * queued to be closed, otherwise that cm_id is an inflight one that
-	 * is part of that context event list pending to be detached and
-	 * reattached to its new context as part of ucma_get_event,
-	 * handled separately below.
-	 */
-	if (ctx->cm_id == cm_id) {
-		xa_lock(&ctx_table);
-		ctx->closing = 1;
-		xa_unlock(&ctx_table);
-		queue_work(ctx->file->close_wq, &ctx->close_work);
-		return;
-	}
-
-	list_for_each_entry(con_req_eve, &ctx->file->event_list, list) {
-		if (con_req_eve->cm_id == cm_id &&
-		    con_req_eve->resp.event == RDMA_CM_EVENT_CONNECT_REQUEST) {
-			list_del(&con_req_eve->list);
-			INIT_WORK(&con_req_eve->close_work, ucma_close_event_id);
-			queue_work(ctx->file->close_wq, &con_req_eve->close_work);
-			event_found = 1;
-			break;
-		}
-	}
-	if (!event_found)
-		pr_err("ucma_removal_event_handler: warning: connect request event wasn't found\n");
-}
-
-static int ucma_event_handler(struct rdma_cm_id *cm_id,
-			      struct rdma_cm_event *event)
-{
-	struct ucma_event *uevent;
-	struct ucma_context *ctx = cm_id->context;
-	int ret = 0;
-
-	uevent = kzalloc(sizeof(*uevent), GFP_KERNEL);
-	if (!uevent)
-		return event->event == RDMA_CM_EVENT_CONNECT_REQUEST;
-
-	mutex_lock(&ctx->file->mut);
-	uevent->cm_id = cm_id;
-	ucma_set_event_context(ctx, event, uevent);
 	uevent->resp.event = event->event;
 	uevent->resp.status = event->status;
-	if (cm_id->qp_type == IB_QPT_UD)
-		ucma_copy_ud_event(cm_id->device, &uevent->resp.param.ud,
+	if (ctx->cm_id->qp_type == IB_QPT_UD)
+		ucma_copy_ud_event(ctx->cm_id->device, &uevent->resp.param.ud,
 				   &event->param.ud);
 	else
 		ucma_copy_conn_event(&uevent->resp.param.conn,
···

 	uevent->resp.ece.vendor_id = event->ece.vendor_id;
 	uevent->resp.ece.attr_mod = event->ece.attr_mod;
+	return uevent;
+}

-	if (event->event == RDMA_CM_EVENT_CONNECT_REQUEST) {
-		if (!ctx->backlog) {
-			ret = -ENOMEM;
-			kfree(uevent);
-			goto out;
-		}
-		ctx->backlog--;
-	} else if (!ctx->uid || ctx->cm_id != cm_id) {
-		/*
-		 * We ignore events for new connections until userspace has set
-		 * their context. This can only happen if an error occurs on a
-		 * new connection before the user accepts it. This is okay,
-		 * since the accept will just fail later. However, we do need
-		 * to release the underlying HW resources in case of a device
-		 * removal event.
-		 */
-		if (event->event == RDMA_CM_EVENT_DEVICE_REMOVAL)
-			ucma_removal_event_handler(cm_id);
+static int ucma_connect_event_handler(struct rdma_cm_id *cm_id,
+				      struct rdma_cm_event *event)
+{
+	struct ucma_context *listen_ctx = cm_id->context;
+	struct ucma_context *ctx;
+	struct ucma_event *uevent;

-		kfree(uevent);
-		goto out;
+	if (!atomic_add_unless(&listen_ctx->backlog, -1, 0))
+		return -ENOMEM;
+	ctx = ucma_alloc_ctx(listen_ctx->file);
+	if (!ctx)
+		goto err_backlog;
+	ctx->cm_id = cm_id;
+
+	uevent = ucma_create_uevent(listen_ctx, event);
+	if (!uevent)
+		goto err_alloc;
+	uevent->conn_req_ctx = ctx;
+	uevent->resp.id = ctx->id;
+
+	ctx->cm_id->context = ctx;
+
+	mutex_lock(&ctx->file->mut);
+	ucma_finish_ctx(ctx);
+	list_add_tail(&uevent->list, &ctx->file->event_list);
+	mutex_unlock(&ctx->file->mut);
+	wake_up_interruptible(&ctx->file->poll_wait);
+	return 0;
+
+err_alloc:
+	xa_erase(&ctx_table, ctx->id);
+	kfree(ctx);
+err_backlog:
+	atomic_inc(&listen_ctx->backlog);
+	/* Returning error causes the new ID to be destroyed */
+	return -ENOMEM;
+}
+
+static int ucma_event_handler(struct rdma_cm_id *cm_id,
+			      struct rdma_cm_event *event)
+{
+	struct ucma_event *uevent;
+	struct ucma_context *ctx = cm_id->context;
+
+	if (event->event == RDMA_CM_EVENT_CONNECT_REQUEST)
+		return ucma_connect_event_handler(cm_id, event);
+
+	/*
+	 * We ignore events for new connections until userspace has set their
+	 * context. This can only happen if an error occurs on a new connection
+	 * before the user accepts it. This is okay, since the accept will just
+	 * fail later. However, we do need to release the underlying HW
+	 * resources in case of a device removal event.
+	 */
+	if (ctx->uid) {
+		uevent = ucma_create_uevent(ctx, event);
+		if (!uevent)
+			return 0;
+
+		mutex_lock(&ctx->file->mut);
+		list_add_tail(&uevent->list, &ctx->file->event_list);
+		mutex_unlock(&ctx->file->mut);
+		wake_up_interruptible(&ctx->file->poll_wait);
 	}

-	list_add_tail(&uevent->list, &ctx->file->event_list);
-	wake_up_interruptible(&ctx->file->poll_wait);
-	if (event->event == RDMA_CM_EVENT_DEVICE_REMOVAL)
-		ucma_removal_event_handler(cm_id);
-out:
-	mutex_unlock(&ctx->file->mut);
-	return ret;
+	if (event->event == RDMA_CM_EVENT_DEVICE_REMOVAL && !ctx->destroying)
+		queue_work(system_unbound_wq, &ctx->close_work);
+	return 0;
 }

 static ssize_t ucma_get_event(struct ucma_file *file, const char __user *inbuf,
 			      int in_len, int out_len)
 {
-	struct ucma_context *ctx;
 	struct rdma_ucm_get_event cmd;
 	struct ucma_event *uevent;
-	int ret = 0;

 	/*
 	 * Old 32 bit user space does not send the 4 byte padding in the
···
 		mutex_lock(&file->mut);
 	}

-	uevent = list_entry(file->event_list.next, struct ucma_event, list);
-
-	if (uevent->resp.event == RDMA_CM_EVENT_CONNECT_REQUEST) {
-		ctx = ucma_alloc_ctx(file);
-		if (!ctx) {
-			ret = -ENOMEM;
-			goto done;
-		}
-		uevent->ctx->backlog++;
-		ctx->cm_id = uevent->cm_id;
-		ctx->cm_id->context = ctx;
-		uevent->resp.id = ctx->id;
-	}
+	uevent = list_first_entry(&file->event_list, struct ucma_event, list);

 	if (copy_to_user(u64_to_user_ptr(cmd.response),
 			 &uevent->resp,
 			 min_t(size_t, out_len, sizeof(uevent->resp)))) {
-		ret = -EFAULT;
-		goto done;
+		mutex_unlock(&file->mut);
+		return -EFAULT;
 	}

 	list_del(&uevent->list);
 	uevent->ctx->events_reported++;
 	if (uevent->mc)
 		uevent->mc->events_reported++;
-	kfree(uevent);
-done:
+	if (uevent->resp.event == RDMA_CM_EVENT_CONNECT_REQUEST)
+		atomic_inc(&uevent->ctx->backlog);
 	mutex_unlock(&file->mut);
-	return ret;
+
+	kfree(uevent);
+	return 0;
 }

 static int ucma_get_qp_type(struct rdma_ucm_create_id *cmd, enum ib_qp_type *qp_type)
···
 	if (ret)
 		return ret;

-	mutex_lock(&file->mut);
 	ctx = ucma_alloc_ctx(file);
-	mutex_unlock(&file->mut);
 	if (!ctx)
 		return -ENOMEM;

 	ctx->uid = cmd.uid;
-	cm_id = __rdma_create_id(current->nsproxy->net_ns,
-				 ucma_event_handler, ctx, cmd.ps, qp_type, NULL);
+	cm_id = rdma_create_user_id(ucma_event_handler, ctx, cmd.ps, qp_type);
 	if (IS_ERR(cm_id)) {
 		ret = PTR_ERR(cm_id);
 		goto err1;
 	}
+	ctx->cm_id = cm_id;

 	resp.id = ctx->id;
 	if (copy_to_user(u64_to_user_ptr(cmd.response),
 			 &resp, sizeof(resp))) {
-		ret = -EFAULT;
-		goto err2;
+		xa_erase(&ctx_table, ctx->id);
+		__destroy_id(ctx);
+		return -EFAULT;
 	}

-	ctx->cm_id = cm_id;
+	mutex_lock(&file->mut);
+	ucma_finish_ctx(ctx);
+	mutex_unlock(&file->mut);
 	return 0;

-err2:
-	rdma_destroy_id(cm_id);
 err1:
 	xa_erase(&ctx_table, ctx->id);
-	mutex_lock(&file->mut);
-	list_del(&ctx->list);
-	mutex_unlock(&file->mut);
 	kfree(ctx);
 	return ret;
 }

 static void ucma_cleanup_multicast(struct ucma_context *ctx)
 {
-	struct ucma_multicast *mc, *tmp;
+	struct ucma_multicast *mc;
+	unsigned long index;

-	mutex_lock(&ctx->file->mut);
-	list_for_each_entry_safe(mc, tmp, &ctx->mc_list, list) {
-		list_del(&mc->list);
-		xa_erase(&multicast_table, mc->id);
+	xa_for_each(&multicast_table, index, mc) {
+		if (mc->ctx != ctx)
+			continue;
+		/*
+		 * At this point mc->ctx->ref is 0 so the mc cannot leave the
+		 * lock on the reader and this is enough serialization
+		 */
+		xa_erase(&multicast_table, index);
 		kfree(mc);
 	}
-	mutex_unlock(&ctx->file->mut);
 }

 static void ucma_cleanup_mc_events(struct ucma_multicast *mc)
 {
 	struct ucma_event *uevent, *tmp;

+	rdma_lock_handler(mc->ctx->cm_id);
+	mutex_lock(&mc->ctx->file->mut);
 	list_for_each_entry_safe(uevent, tmp, &mc->ctx->file->event_list, list) {
 		if (uevent->mc != mc)
 			continue;
···
 		list_del(&uevent->list);
 		kfree(uevent);
 	}
+	mutex_unlock(&mc->ctx->file->mut);
+	rdma_unlock_handler(mc->ctx->cm_id);
 }

 /*
···
  * this point, no new events will be reported from the hardware. However, we
  * still need to cleanup the UCMA context for this ID. Specifically, there
  * might be events that have not yet been consumed by the user space software.
- * These might include pending connect requests which we have not completed
- * processing. We cannot call rdma_destroy_id while holding the lock of the
- * context (file->mut), as it might cause a deadlock. We therefore extract all
- * relevant events from the context pending events list while holding the
  * mutex. After that we release them as needed.
  */
 static int ucma_free_ctx(struct ucma_context *ctx)
···
 	struct ucma_event *uevent, *tmp;
 	LIST_HEAD(list);

-
 	ucma_cleanup_multicast(ctx);

 	/* Cleanup events not yet reported to the user. */
 	mutex_lock(&ctx->file->mut);
 	list_for_each_entry_safe(uevent, tmp, &ctx->file->event_list, list) {
-		if (uevent->ctx == ctx)
+		if (uevent->ctx == ctx || uevent->conn_req_ctx == ctx)
 			list_move_tail(&uevent->list, &list);
 	}
 	list_del(&ctx->list);
+	events_reported = ctx->events_reported;
 	mutex_unlock(&ctx->file->mut);

+	/*
+	 * If this was a listening ID then any connections spawned from it
+	 * that have not been delivered to userspace are cleaned up too.
+	 * Must be done outside any locks.
+	 */
 	list_for_each_entry_safe(uevent, tmp, &list, list) {
 		list_del(&uevent->list);
-		if (uevent->resp.event == RDMA_CM_EVENT_CONNECT_REQUEST)
-			rdma_destroy_id(uevent->cm_id);
+		if (uevent->resp.event == RDMA_CM_EVENT_CONNECT_REQUEST &&
+		    uevent->conn_req_ctx != ctx)
+			__destroy_id(uevent->conn_req_ctx);
 		kfree(uevent);
 	}

-	events_reported = ctx->events_reported;
 	mutex_destroy(&ctx->mutex);
 	kfree(ctx);
 	return events_reported;
+}
+
+static int __destroy_id(struct ucma_context *ctx)
+{
+	/*
+	 * If the refcount is already 0 then ucma_close_id() has already
+	 * destroyed the cm_id, otherwise holding the refcount keeps cm_id
+	 * valid. Prevent queue_work() from being called.
+	 */
+	if (refcount_inc_not_zero(&ctx->ref)) {
+		rdma_lock_handler(ctx->cm_id);
+		ctx->destroying = 1;
+		rdma_unlock_handler(ctx->cm_id);
+		ucma_put_ctx(ctx);
+	}
+
+	cancel_work_sync(&ctx->close_work);
+	/* At this point it's guaranteed that there is no inflight closing task */
+	if (ctx->cm_id)
+		ucma_close_id(&ctx->close_work);
+	return ucma_free_ctx(ctx);
 }

 static ssize_t ucma_destroy_id(struct ucma_file *file, const char __user *inbuf,
···
 	if (IS_ERR(ctx))
 		return PTR_ERR(ctx);

-	mutex_lock(&ctx->file->mut);
-	ctx->destroying = 1;
-	mutex_unlock(&ctx->file->mut);
-
-	flush_workqueue(ctx->file->close_wq);
-	/* At this point it's guaranteed that there is no inflight
-	 * closing task */
-	xa_lock(&ctx_table);
-	if (!ctx->closing) {
-		xa_unlock(&ctx_table);
-		ucma_put_ctx(ctx);
-		wait_for_completion(&ctx->comp);
-		rdma_destroy_id(ctx->cm_id);
-	} else {
-		xa_unlock(&ctx_table);
-	}
-
-	resp.events_reported = ucma_free_ctx(ctx);
+	resp.events_reported = __destroy_id(ctx);
 	if (copy_to_user(u64_to_user_ptr(cmd.response),
 			 &resp, sizeof(resp)))
 		ret = -EFAULT;
···
 	if (IS_ERR(ctx))
 		return PTR_ERR(ctx);

-	ctx->backlog = cmd.backlog > 0 && cmd.backlog < max_backlog ?
-		       cmd.backlog : max_backlog;
+	if (cmd.backlog <= 0 || cmd.backlog > max_backlog)
+		cmd.backlog = max_backlog;
+	atomic_set(&ctx->backlog, cmd.backlog);
+
 	mutex_lock(&ctx->mutex);
-	ret = rdma_listen(ctx->cm_id, ctx->backlog);
+	ret = rdma_listen(ctx->cm_id, cmd.backlog);
 	mutex_unlock(&ctx->mutex);
 	ucma_put_ctx(ctx);
 	return ret;
···

 	if (cmd.conn_param.valid) {
 		ucma_copy_conn_param(ctx->cm_id, &conn_param, &cmd.conn_param);
-		mutex_lock(&file->mut);
 		mutex_lock(&ctx->mutex);
-		ret = __rdma_accept_ece(ctx->cm_id, &conn_param, NULL, &ece);
-		mutex_unlock(&ctx->mutex);
-		if (!ret)
+		rdma_lock_handler(ctx->cm_id);
+		ret = rdma_accept_ece(ctx->cm_id, &conn_param, &ece);
+		if (!ret) {
+			/* The uid must be set atomically with the handler */
 			ctx->uid = cmd.uid;
-		mutex_unlock(&file->mut);
+		}
+		rdma_unlock_handler(ctx->cm_id);
+		mutex_unlock(&ctx->mutex);
 	} else {
 		mutex_lock(&ctx->mutex);
-		ret = __rdma_accept_ece(ctx->cm_id, NULL, NULL, &ece);
+		rdma_lock_handler(ctx->cm_id);
+		ret = rdma_accept_ece(ctx->cm_id, NULL, &ece);
+		rdma_unlock_handler(ctx->cm_id);
 		mutex_unlock(&ctx->mutex);
 	}
 	ucma_put_ctx(ctx);
···
 	if (IS_ERR(ctx))
 		return PTR_ERR(ctx);

-	mutex_lock(&file->mut);
-	mc = ucma_alloc_multicast(ctx);
+	mc = kzalloc(sizeof(*mc), GFP_KERNEL);
 	if (!mc) {
 		ret = -ENOMEM;
-		goto err1;
+		goto err_put_ctx;
 	}
+
+	mc->ctx = ctx;
 	mc->join_state = join_state;
 	mc->uid = cmd->uid;
 	memcpy(&mc->addr, addr, cmd->addr_size);
+
+	if (xa_alloc(&multicast_table, &mc->id, NULL, xa_limit_32b,
+		     GFP_KERNEL)) {
+		ret = -ENOMEM;
+		goto err_free_mc;
+	}
+
 	mutex_lock(&ctx->mutex);
 	ret = rdma_join_multicast(ctx->cm_id, (struct sockaddr *)&mc->addr,
 				  join_state, mc);
 	mutex_unlock(&ctx->mutex);
 	if (ret)
-		goto err2;
+		goto err_xa_erase;

 	resp.id = mc->id;
 	if (copy_to_user(u64_to_user_ptr(cmd->response),
 			 &resp, sizeof(resp))) {
 		ret = -EFAULT;
-		goto err3;
+		goto err_leave_multicast;
 	}

 	xa_store(&multicast_table, mc->id, mc, 0);

-	mutex_unlock(&file->mut);
 	ucma_put_ctx(ctx);
 	return 0;

-err3:
+err_leave_multicast:
+	mutex_lock(&ctx->mutex);
 	rdma_leave_multicast(ctx->cm_id, (struct sockaddr *) &mc->addr);
+	mutex_unlock(&ctx->mutex);
 	ucma_cleanup_mc_events(mc);
-err2:
+err_xa_erase:
 	xa_erase(&multicast_table, mc->id);
-	list_del(&mc->list);
+err_free_mc:
 	kfree(mc);
-err1:
-	mutex_unlock(&file->mut);
+err_put_ctx:
 	ucma_put_ctx(ctx);
 	return ret;
 }
···
 	mc = xa_load(&multicast_table, cmd.id);
 	if (!mc)
 		mc = ERR_PTR(-ENOENT);
-	else if (mc->ctx->file != file)
+	else if (READ_ONCE(mc->ctx->file) != file)
 		mc = ERR_PTR(-EINVAL);
 	else if (!refcount_inc_not_zero(&mc->ctx->ref))
 		mc = ERR_PTR(-ENXIO);
···
 	rdma_leave_multicast(mc->ctx->cm_id, (struct sockaddr *) &mc->addr);
 	mutex_unlock(&mc->ctx->mutex);

-	mutex_lock(&mc->ctx->file->mut);
 	ucma_cleanup_mc_events(mc);
-	list_del(&mc->list);
-	mutex_unlock(&mc->ctx->file->mut);

 	ucma_put_ctx(mc->ctx);
 	resp.events_reported = mc->events_reported;
···
 	return ret;
 }

-static void ucma_lock_files(struct ucma_file *file1, struct ucma_file *file2)
-{
-	/* Acquire mutex's based on pointer comparison to prevent deadlock. */
-	if (file1 < file2) {
-		mutex_lock(&file1->mut);
-		mutex_lock_nested(&file2->mut, SINGLE_DEPTH_NESTING);
-	} else {
-		mutex_lock(&file2->mut);
-		mutex_lock_nested(&file1->mut, SINGLE_DEPTH_NESTING);
-	}
-}
-
-static void ucma_unlock_files(struct ucma_file *file1, struct ucma_file *file2)
-{
-	if (file1 < file2) {
-		mutex_unlock(&file2->mut);
-		mutex_unlock(&file1->mut);
-	} else {
-		mutex_unlock(&file1->mut);
-		mutex_unlock(&file2->mut);
-	}
-}
-
-static void ucma_move_events(struct ucma_context *ctx, struct ucma_file *file)
-{
-	struct ucma_event *uevent, *tmp;
-
-	list_for_each_entry_safe(uevent, tmp, &ctx->file->event_list, list)
-		if (uevent->ctx == ctx)
-			list_move_tail(&uevent->list, &file->event_list);
-}
-
 static ssize_t ucma_migrate_id(struct ucma_file *new_file,
 			       const char __user *inbuf,
 			       int in_len, int out_len)
 {
 	struct rdma_ucm_migrate_id cmd;
 	struct rdma_ucm_migrate_resp resp;
+	struct ucma_event *uevent, *tmp;
 	struct ucma_context *ctx;
+	LIST_HEAD(event_list);
 	struct fd f;
 	struct ucma_file *cur_file;
 	int ret = 0;
···
 		ret = -EINVAL;
 		goto file_put;
 	}
+	cur_file = f.file->private_data;

 	/* Validate current fd and prevent destruction of id. */
-	ctx = ucma_get_ctx(f.file->private_data, cmd.id);
+	ctx = ucma_get_ctx(cur_file, cmd.id);
 	if (IS_ERR(ctx)) {
 		ret = PTR_ERR(ctx);
 		goto file_put;
 	}

-	cur_file = ctx->file;
-	if (cur_file == new_file) {
-		resp.events_reported = ctx->events_reported;
-		goto response;
-	}
-
+	rdma_lock_handler(ctx->cm_id);
 	/*
-	 * Migrate events between fd's, maintaining order, and avoiding new
-	 * events being added before existing events.
+	 * ctx->file can only be changed under the handler & xa_lock. xa_load()
+	 * must be checked again to ensure the ctx hasn't begun destruction
+	 * since the ucma_get_ctx().
 	 */
-	ucma_lock_files(cur_file, new_file);
 	xa_lock(&ctx_table);
-
-	list_move_tail(&ctx->list, &new_file->ctx_list);
-	ucma_move_events(ctx, new_file);
+	if (_ucma_find_context(cmd.id, cur_file) != ctx) {
+		xa_unlock(&ctx_table);
+		ret = -ENOENT;
+		goto err_unlock;
+	}
 	ctx->file = new_file;
-	resp.events_reported = ctx->events_reported;
-
 	xa_unlock(&ctx_table);
-	ucma_unlock_files(cur_file, new_file);

-response:
+	mutex_lock(&cur_file->mut);
+	list_del(&ctx->list);
+	/*
+	 * At this point lock_handler() prevents addition of new uevents for
+	 * this ctx.
+	 */
+	list_for_each_entry_safe(uevent, tmp, &cur_file->event_list, list)
+		if (uevent->ctx == ctx)
+			list_move_tail(&uevent->list, &event_list);
+	resp.events_reported = ctx->events_reported;
+	mutex_unlock(&cur_file->mut);
+
+	mutex_lock(&new_file->mut);
+	list_add_tail(&ctx->list, &new_file->ctx_list);
+	list_splice_tail(&event_list, &new_file->event_list);
+	mutex_unlock(&new_file->mut);
+
 	if (copy_to_user(u64_to_user_ptr(cmd.response),
 			 &resp, sizeof(resp)))
 		ret = -EFAULT;

+err_unlock:
+	rdma_unlock_handler(ctx->cm_id);
 	ucma_put_ctx(ctx);
 file_put:
 	fdput(f);
···
 	if (!file)
 		return -ENOMEM;

-	file->close_wq = alloc_ordered_workqueue("ucma_close_id",
-						 WQ_MEM_RECLAIM);
-	if (!file->close_wq) {
-		kfree(file);
-		return -ENOMEM;
-	}
-
 	INIT_LIST_HEAD(&file->event_list);
 	INIT_LIST_HEAD(&file->ctx_list);
 	init_waitqueue_head(&file->poll_wait);
···
 static int ucma_close(struct inode *inode, struct file *filp)
 {
 	struct ucma_file *file = filp->private_data;
-	struct ucma_context *ctx, *tmp;

-	mutex_lock(&file->mut);
-	list_for_each_entry_safe(ctx, tmp, &file->ctx_list, list) {
-		ctx->destroying = 1;
-		mutex_unlock(&file->mut);
+	/*
+	 * All paths that touch ctx_list or ctx_list starting from write() are
+	 * prevented by this being a FD release function. The list_add_tail() in
+	 * ucma_connect_event_handler() can run concurrently, however it only
+	 * adds to the list *after* a listening ID. By only reading the first of
+	 * the list, and relying on __destroy_id() to block
+	 * ucma_connect_event_handler(), no additional locking is needed.
+	 */
+	while (!list_empty(&file->ctx_list)) {
+		struct ucma_context *ctx = list_first_entry(
+			&file->ctx_list, struct ucma_context, list);

 		xa_erase(&ctx_table, ctx->id);
-		flush_workqueue(file->close_wq);
-		/* At that step once ctx was marked as destroying and workqueue
-		 * was flushed we are safe from any inflights handlers that
-		 * might put other closing task.
-		 */
-		xa_lock(&ctx_table);
-		if (!ctx->closing) {
-			xa_unlock(&ctx_table);
-			ucma_put_ctx(ctx);
-			wait_for_completion(&ctx->comp);
-			/* rdma_destroy_id ensures that no event handlers are
-			 * inflight for that id before releasing it.
-			 */
-			rdma_destroy_id(ctx->cm_id);
-		} else {
-			xa_unlock(&ctx_table);
-		}
-
-		ucma_free_ctx(ctx);
-		mutex_lock(&file->mut);
+		__destroy_id(ctx);
 	}
-	mutex_unlock(&file->mut);
-	destroy_workqueue(file->close_wq);
 	kfree(file);
 	return 0;
 }
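The ucma.c changes above replace the `closing` flag and the plain `int backlog` with two conditional atomic updates: `refcount_inc_not_zero()` for taking a context reference and `atomic_add_unless(..., -1, 0)` for consuming a backlog slot. The following is a minimal userspace sketch of that idea using C11 atomics. It is an analogue, not the kernel implementation; the helper names `ref_get_not_zero`, `ref_put`, `backlog_take` and `backlog_give` are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Take a reference only while the object is still live (count > 0).
 * Userspace analogue of the refcount_inc_not_zero() calls in the diff.
 */
static bool ref_get_not_zero(atomic_int *ref)
{
	int old = atomic_load(ref);

	while (old != 0) {
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(ref, &old, old + 1))
			return true;
	}
	return false; /* count already hit zero: teardown has begun */
}

/* Drop a reference; returns true when this was the last one. */
static bool ref_put(atomic_int *ref)
{
	int prev = atomic_fetch_sub(ref, 1);

	assert(prev > 0); /* an underflow would mean an unbalanced put */
	return prev == 1;
}

/*
 * Consume one backlog slot unless none are left.
 * Analogue of atomic_add_unless(&backlog, -1, 0) in the diff.
 */
static bool backlog_take(atomic_int *backlog)
{
	int old = atomic_load(backlog);

	while (old != 0) {
		if (atomic_compare_exchange_weak(backlog, &old, old - 1))
			return true;
	}
	return false;
}

/* Return a slot, as the diff does once the event is delivered or fails. */
static void backlog_give(atomic_int *backlog)
{
	atomic_fetch_add(backlog, 1);
}
```

Because a lookup can no longer resurrect a context whose count has reached zero, the teardown path does not need a separate "closing" flag to fence off new users; that is the property the diff relies on in `ucma_get_ctx()` and `__destroy_id()`.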
+39 -100
drivers/infiniband/core/umem.c
··· 39 39 #include <linux/export.h> 40 40 #include <linux/slab.h> 41 41 #include <linux/pagemap.h> 42 + #include <linux/count_zeros.h> 42 43 #include <rdma/ib_umem_odp.h> 43 44 44 45 #include "uverbs.h" ··· 61 60 sg_free_table(&umem->sg_head); 62 61 } 63 62 64 - /* ib_umem_add_sg_table - Add N contiguous pages to scatter table 65 - * 66 - * sg: current scatterlist entry 67 - * page_list: array of npage struct page pointers 68 - * npages: number of pages in page_list 69 - * max_seg_sz: maximum segment size in bytes 70 - * nents: [out] number of entries in the scatterlist 71 - * 72 - * Return new end of scatterlist 73 - */ 74 - static struct scatterlist *ib_umem_add_sg_table(struct scatterlist *sg, 75 - struct page **page_list, 76 - unsigned long npages, 77 - unsigned int max_seg_sz, 78 - int *nents) 79 - { 80 - unsigned long first_pfn; 81 - unsigned long i = 0; 82 - bool update_cur_sg = false; 83 - bool first = !sg_page(sg); 84 - 85 - /* Check if new page_list is contiguous with end of previous page_list. 86 - * sg->length here is a multiple of PAGE_SIZE and sg->offset is 0. 
87 - */ 88 - if (!first && (page_to_pfn(sg_page(sg)) + (sg->length >> PAGE_SHIFT) == 89 - page_to_pfn(page_list[0]))) 90 - update_cur_sg = true; 91 - 92 - while (i != npages) { 93 - unsigned long len; 94 - struct page *first_page = page_list[i]; 95 - 96 - first_pfn = page_to_pfn(first_page); 97 - 98 - /* Compute the number of contiguous pages we have starting 99 - * at i 100 - */ 101 - for (len = 0; i != npages && 102 - first_pfn + len == page_to_pfn(page_list[i]) && 103 - len < (max_seg_sz >> PAGE_SHIFT); 104 - len++) 105 - i++; 106 - 107 - /* Squash N contiguous pages from page_list into current sge */ 108 - if (update_cur_sg) { 109 - if ((max_seg_sz - sg->length) >= (len << PAGE_SHIFT)) { 110 - sg_set_page(sg, sg_page(sg), 111 - sg->length + (len << PAGE_SHIFT), 112 - 0); 113 - update_cur_sg = false; 114 - continue; 115 - } 116 - update_cur_sg = false; 117 - } 118 - 119 - /* Squash N contiguous pages into next sge or first sge */ 120 - if (!first) 121 - sg = sg_next(sg); 122 - 123 - (*nents)++; 124 - sg_set_page(sg, first_page, len << PAGE_SHIFT, 0); 125 - first = false; 126 - } 127 - 128 - return sg; 129 - } 130 - 131 63 /** 132 64 * ib_umem_find_best_pgsz - Find best HW page size to use for this MR 133 65 * ··· 80 146 unsigned long virt) 81 147 { 82 148 struct scatterlist *sg; 83 - unsigned int best_pg_bit; 84 149 unsigned long va, pgoff; 85 150 dma_addr_t mask; 86 151 int i; 152 + 153 + /* rdma_for_each_block() has a bug if the page size is smaller than the 154 + * page size used to build the umem. For now prevent smaller page sizes 155 + * from being returned. 
156 + */ 157 + pgsz_bitmap &= GENMASK(BITS_PER_LONG - 1, PAGE_SHIFT); 87 158 88 159 /* At minimum, drivers must support PAGE_SIZE or smaller */ 89 160 if (WARN_ON(!(pgsz_bitmap & GENMASK(PAGE_SHIFT, 0)))) 90 161 return 0; 91 162 92 - va = virt; 93 - /* max page size not to exceed MR length */ 94 - mask = roundup_pow_of_two(umem->length); 163 + umem->iova = va = virt; 164 + /* The best result is the smallest page size that results in the minimum 165 + * number of required pages. Compute the largest page size that could 166 + * work based on VA address bits that don't change. 167 + */ 168 + mask = pgsz_bitmap & 169 + GENMASK(BITS_PER_LONG - 1, 170 + bits_per((umem->length - 1 + virt) ^ virt)); 95 171 /* offset into first SGL */ 96 172 pgoff = umem->address & ~PAGE_MASK; 97 173 ··· 119 175 mask |= va; 120 176 pgoff = 0; 121 177 } 122 - best_pg_bit = rdma_find_pg_bit(mask, pgsz_bitmap); 123 178 124 - return BIT_ULL(best_pg_bit); 179 + /* The mask accumulates 1's in each position where the VA and physical 180 + * address differ, thus the length of trailing 0 is the largest page 181 + * size that can pass the VA through to the physical. 182 + */ 183 + if (mask) 184 + pgsz_bitmap &= GENMASK(count_trailing_zeros(mask), 0); 185 + return rounddown_pow_of_two(pgsz_bitmap); 125 186 } 126 187 EXPORT_SYMBOL(ib_umem_find_best_pgsz); 127 188 ··· 150 201 struct mm_struct *mm; 151 202 unsigned long npages; 152 203 int ret; 153 - struct scatterlist *sg; 204 + struct scatterlist *sg = NULL; 154 205 unsigned int gup_flags = FOLL_WRITE; 155 206 156 207 /* ··· 173 224 umem->ibdev = device; 174 225 umem->length = size; 175 226 umem->address = addr; 227 + /* 228 + * Drivers should call ib_umem_find_best_pgsz() to set the iova 229 + * correctly. 
230 + */ 231 + umem->iova = addr; 176 232 umem->writable = ib_access_writable(access); 177 233 umem->owning_mm = mm = current->mm; 178 234 mmgrab(mm); ··· 205 251 206 252 cur_base = addr & PAGE_MASK; 207 253 208 - ret = sg_alloc_table(&umem->sg_head, npages, GFP_KERNEL); 209 - if (ret) 210 - goto vma; 211 - 212 254 if (!umem->writable) 213 255 gup_flags |= FOLL_FORCE; 214 - 215 - sg = umem->sg_head.sgl; 216 256 217 257 while (npages) { 218 258 cond_resched(); ··· 219 271 goto umem_release; 220 272 221 273 cur_base += ret * PAGE_SIZE; 222 - npages -= ret; 223 - 224 - sg = ib_umem_add_sg_table(sg, page_list, ret, 225 - dma_get_max_seg_size(device->dma_device), 226 - &umem->sg_nents); 274 + npages -= ret; 275 + sg = __sg_alloc_table_from_pages( 276 + &umem->sg_head, page_list, ret, 0, ret << PAGE_SHIFT, 277 + dma_get_max_seg_size(device->dma_device), sg, npages, 278 + GFP_KERNEL); 279 + umem->sg_nents = umem->sg_head.nents; 280 + if (IS_ERR(sg)) { 281 + unpin_user_pages_dirty_lock(page_list, ret, 0); 282 + ret = PTR_ERR(sg); 283 + goto umem_release; 284 + } 227 285 } 228 - 229 - sg_mark_end(sg); 230 286 231 287 if (access & IB_ACCESS_RELAXED_ORDERING) 232 288 dma_attr |= DMA_ATTR_WEAK_ORDERING; ··· 249 297 250 298 umem_release: 251 299 __ib_umem_release(device, umem, 0); 252 - vma: 253 300 atomic64_sub(ib_umem_num_pages(umem), &mm->pinned_vm); 254 301 out: 255 302 free_page((unsigned long) page_list); ··· 279 328 kfree(umem); 280 329 } 281 330 EXPORT_SYMBOL(ib_umem_release); 282 - 283 - int ib_umem_page_count(struct ib_umem *umem) 284 - { 285 - int i, n = 0; 286 - struct scatterlist *sg; 287 - 288 - for_each_sg(umem->sg_head.sgl, sg, umem->nmap, i) 289 - n += sg_dma_len(sg) >> PAGE_SHIFT; 290 - 291 - return n; 292 - } 293 - EXPORT_SYMBOL(ib_umem_page_count); 294 331 295 332 /* 296 333 * Copy from the given ib_umem's pages to the given buffer.
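The final step of the reworked ib_umem_find_best_pgsz() above can be sketched in plain userspace C. This is only an illustration of the bit arithmetic, assuming the mask of differing VA/physical address bits has already been accumulated; the names sketch_ctz64, sketch_rounddown_pow2 and sketch_best_pgsz are hypothetical stand-ins for count_trailing_zeros(), rounddown_pow_of_two() and the tail of the kernel function, and returning 0 for an empty bitmap is a choice made for this sketch, not kernel behaviour.

```c
#include <stdint.h>

/* Index of the lowest set bit; the caller guarantees m != 0. */
static unsigned int sketch_ctz64(uint64_t m)
{
	unsigned int n = 0;

	while (!(m & 1)) {
		m >>= 1;
		n++;
	}
	return n;
}

/* Largest power of two that is <= x; the caller guarantees x != 0. */
static uint64_t sketch_rounddown_pow2(uint64_t x)
{
	while (x & (x - 1))
		x &= x - 1;	/* clear the lowest set bit */
	return x;
}

/*
 * pgsz_bitmap: one bit per page size the hardware supports.
 * mask: has a 1 in every bit position where VA and physical address differ.
 * Returns the largest supported page size whose in-page offset bits all
 * agree, or 0 if none fits (sketch-only convention).
 */
static uint64_t sketch_best_pgsz(uint64_t pgsz_bitmap, uint64_t mask)
{
	if (mask) {
		unsigned int tz = sketch_ctz64(mask);

		/* Equivalent of GENMASK(tz, 0): keep bits 0..tz. */
		pgsz_bitmap &= (tz >= 63) ? ~UINT64_C(0)
					  : (UINT64_C(1) << (tz + 1)) - 1;
	}
	return pgsz_bitmap ? sketch_rounddown_pow2(pgsz_bitmap) : 0;
}
```

With a bitmap advertising 4 KiB, 2 MiB and 1 GiB pages, a mask whose lowest set bit is bit 21 still permits 2 MiB pages, because bits 0..20 of the VA and the physical address agree, while a lowest set bit of 20 forces the result down to 4 KiB.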
+127 -166
drivers/infiniband/core/umem_odp.c
··· 40 40 #include <linux/vmalloc.h> 41 41 #include <linux/hugetlb.h> 42 42 #include <linux/interval_tree.h> 43 + #include <linux/hmm.h> 43 44 #include <linux/pagemap.h> 44 45 45 46 #include <rdma/ib_verbs.h> ··· 61 60 size_t page_size = 1UL << umem_odp->page_shift; 62 61 unsigned long start; 63 62 unsigned long end; 64 - size_t pages; 63 + size_t ndmas, npfns; 65 64 66 65 start = ALIGN_DOWN(umem_odp->umem.address, page_size); 67 66 if (check_add_overflow(umem_odp->umem.address, ··· 72 71 if (unlikely(end < page_size)) 73 72 return -EOVERFLOW; 74 73 75 - pages = (end - start) >> umem_odp->page_shift; 76 - if (!pages) 74 + ndmas = (end - start) >> umem_odp->page_shift; 75 + if (!ndmas) 77 76 return -EINVAL; 78 77 79 - umem_odp->page_list = kvcalloc( 80 - pages, sizeof(*umem_odp->page_list), GFP_KERNEL); 81 - if (!umem_odp->page_list) 78 + npfns = (end - start) >> PAGE_SHIFT; 79 + umem_odp->pfn_list = kvcalloc( 80 + npfns, sizeof(*umem_odp->pfn_list), GFP_KERNEL); 81 + if (!umem_odp->pfn_list) 82 82 return -ENOMEM; 83 83 84 84 umem_odp->dma_list = kvcalloc( 85 - pages, sizeof(*umem_odp->dma_list), GFP_KERNEL); 85 + ndmas, sizeof(*umem_odp->dma_list), GFP_KERNEL); 86 86 if (!umem_odp->dma_list) { 87 87 ret = -ENOMEM; 88 - goto out_page_list; 88 + goto out_pfn_list; 89 89 } 90 90 91 91 ret = mmu_interval_notifier_insert(&umem_odp->notifier, ··· 100 98 101 99 out_dma_list: 102 100 kvfree(umem_odp->dma_list); 103 - out_page_list: 104 - kvfree(umem_odp->page_list); 101 + out_pfn_list: 102 + kvfree(umem_odp->pfn_list); 105 103 return ret; 106 104 } 107 105 ··· 278 276 mutex_unlock(&umem_odp->umem_mutex); 279 277 mmu_interval_notifier_remove(&umem_odp->notifier); 280 278 kvfree(umem_odp->dma_list); 281 - kvfree(umem_odp->page_list); 279 + kvfree(umem_odp->pfn_list); 282 280 } 283 281 put_pid(umem_odp->tgid); 284 282 kfree(umem_odp); ··· 289 287 * Map for DMA and insert a single page into the on-demand paging page tables. 
290 288 * 291 289 * @umem: the umem to insert the page to. 292 - * @page_index: index in the umem to add the page to. 290 + * @dma_index: index in the umem to add the dma to. 293 291 * @page: the page struct to map and add. 294 292 * @access_mask: access permissions needed for this page. 295 293 * @current_seq: sequence number for synchronization with invalidations. 296 294 * the sequence number is taken from 297 295 * umem_odp->notifiers_seq. 298 296 * 299 - * The function returns -EFAULT if the DMA mapping operation fails. It returns 300 - * -EAGAIN if a concurrent invalidation prevents us from updating the page. 297 + * The function returns -EFAULT if the DMA mapping operation fails. 301 298 * 302 - * The page is released via put_page even if the operation failed. For on-demand 303 - * pinning, the page is released whenever it isn't stored in the umem. 304 299 */ 305 300 static int ib_umem_odp_map_dma_single_page( 306 301 struct ib_umem_odp *umem_odp, 307 - unsigned int page_index, 302 + unsigned int dma_index, 308 303 struct page *page, 309 - u64 access_mask, 310 - unsigned long current_seq) 304 + u64 access_mask) 311 305 { 312 306 struct ib_device *dev = umem_odp->umem.ibdev; 313 - dma_addr_t dma_addr; 314 - int ret = 0; 307 + dma_addr_t *dma_addr = &umem_odp->dma_list[dma_index]; 315 308 316 - if (mmu_interval_check_retry(&umem_odp->notifier, current_seq)) { 317 - ret = -EAGAIN; 318 - goto out; 319 - } 320 - if (!(umem_odp->dma_list[page_index])) { 321 - dma_addr = 322 - ib_dma_map_page(dev, page, 0, BIT(umem_odp->page_shift), 323 - DMA_BIDIRECTIONAL); 324 - if (ib_dma_mapping_error(dev, dma_addr)) { 325 - ret = -EFAULT; 326 - goto out; 327 - } 328 - umem_odp->dma_list[page_index] = dma_addr | access_mask; 329 - umem_odp->page_list[page_index] = page; 330 - umem_odp->npages++; 331 - } else if (umem_odp->page_list[page_index] == page) { 332 - umem_odp->dma_list[page_index] |= access_mask; 333 - } else { 309 + if (*dma_addr) { 334 310 /* 335 - * This is a race 
here where we could have done: 336 - * 337 - * CPU0 CPU1 338 - * get_user_pages() 339 - * invalidate() 340 - * page_fault() 341 - * mutex_lock(umem_mutex) 342 - * page from GUP != page in ODP 343 - * 344 - * It should be prevented by the retry test above as reading 345 - * the seq number should be reliable under the 346 - * umem_mutex. Thus something is really not working right if 347 - * things get here. 311 + * If the page is already dma mapped it means it went through 312 + * a non-invalidating trasition, like read-only to writable. 313 + * Resync the flags. 348 314 */ 349 - WARN(true, 350 - "Got different pages in IB device and from get_user_pages. IB device page: %p, gup page: %p\n", 351 - umem_odp->page_list[page_index], page); 352 - ret = -EAGAIN; 315 + *dma_addr = (*dma_addr & ODP_DMA_ADDR_MASK) | access_mask; 316 + return 0; 353 317 } 354 318 355 - out: 356 - put_page(page); 357 - return ret; 319 + *dma_addr = ib_dma_map_page(dev, page, 0, 1 << umem_odp->page_shift, 320 + DMA_BIDIRECTIONAL); 321 + if (ib_dma_mapping_error(dev, *dma_addr)) { 322 + *dma_addr = 0; 323 + return -EFAULT; 324 + } 325 + umem_odp->npages++; 326 + *dma_addr |= access_mask; 327 + return 0; 358 328 } 359 329 360 330 /** 361 - * ib_umem_odp_map_dma_pages - Pin and DMA map userspace memory in an ODP MR. 331 + * ib_umem_odp_map_dma_and_lock - DMA map userspace memory in an ODP MR and lock it. 362 332 * 363 - * Pins the range of pages passed in the argument, and maps them to 364 - * DMA addresses. The DMA addresses of the mapped pages is updated in 365 - * umem_odp->dma_list. 333 + * Maps the range passed in the argument to DMA addresses. 334 + * The DMA addresses of the mapped pages is updated in umem_odp->dma_list. 335 + * Upon success the ODP MR will be locked to let caller complete its device 336 + * page table update. 366 337 * 367 338 * Returns the number of pages mapped in success, negative error code 368 339 * for failure. 
369 - * An -EAGAIN error code is returned when a concurrent mmu notifier prevents 370 - * the function from completing its task. 371 - * An -ENOENT error code indicates that userspace process is being terminated 372 - * and mm was already destroyed. 373 340 * @umem_odp: the umem to map and pin 374 341 * @user_virt: the address from which we need to map. 375 342 * @bcnt: the minimal number of bytes to pin and map. The mapping might be ··· 347 376 * the return value. 348 377 * @access_mask: bit mask of the requested access permissions for the given 349 378 * range. 350 - * @current_seq: the MMU notifiers sequance value for synchronization with 351 - * invalidations. the sequance number is read from 352 - * umem_odp->notifiers_seq before calling this function 379 + * @fault: is faulting required for the given range 353 380 */ 354 - int ib_umem_odp_map_dma_pages(struct ib_umem_odp *umem_odp, u64 user_virt, 355 - u64 bcnt, u64 access_mask, 356 - unsigned long current_seq) 381 + int ib_umem_odp_map_dma_and_lock(struct ib_umem_odp *umem_odp, u64 user_virt, 382 + u64 bcnt, u64 access_mask, bool fault) 383 + __acquires(&umem_odp->umem_mutex) 357 384 { 358 385 struct task_struct *owning_process = NULL; 359 386 struct mm_struct *owning_mm = umem_odp->umem.owning_mm; 360 - struct page **local_page_list = NULL; 361 - u64 page_mask, off; 362 - int j, k, ret = 0, start_idx, npages = 0; 363 - unsigned int flags = 0, page_shift; 364 - phys_addr_t p = 0; 387 + int pfn_index, dma_index, ret = 0, start_idx; 388 + unsigned int page_shift, hmm_order, pfn_start_idx; 389 + unsigned long num_pfns, current_seq; 390 + struct hmm_range range = {}; 391 + unsigned long timeout; 365 392 366 393 if (access_mask == 0) 367 394 return -EINVAL; ··· 368 399 user_virt + bcnt > ib_umem_end(umem_odp)) 369 400 return -EFAULT; 370 401 371 - local_page_list = (struct page **)__get_free_page(GFP_KERNEL); 372 - if (!local_page_list) 373 - return -ENOMEM; 374 - 375 402 page_shift = umem_odp->page_shift; 376 - 
page_mask = ~(BIT(page_shift) - 1); 377 - off = user_virt & (~page_mask); 378 - user_virt = user_virt & page_mask; 379 - bcnt += off; /* Charge for the first page offset as well. */ 380 403 381 404 /* 382 405 * owning_process is allowed to be NULL, this means somehow the mm is ··· 381 420 goto out_put_task; 382 421 } 383 422 384 - if (access_mask & ODP_WRITE_ALLOWED_BIT) 385 - flags |= FOLL_WRITE; 423 + range.notifier = &umem_odp->notifier; 424 + range.start = ALIGN_DOWN(user_virt, 1UL << page_shift); 425 + range.end = ALIGN(user_virt + bcnt, 1UL << page_shift); 426 + pfn_start_idx = (range.start - ib_umem_start(umem_odp)) >> PAGE_SHIFT; 427 + num_pfns = (range.end - range.start) >> PAGE_SHIFT; 428 + if (fault) { 429 + range.default_flags = HMM_PFN_REQ_FAULT; 386 430 387 - start_idx = (user_virt - ib_umem_start(umem_odp)) >> page_shift; 388 - k = start_idx; 431 + if (access_mask & ODP_WRITE_ALLOWED_BIT) 432 + range.default_flags |= HMM_PFN_REQ_WRITE; 433 + } 389 434 390 - while (bcnt > 0) { 391 - const size_t gup_num_pages = min_t(size_t, 392 - ALIGN(bcnt, PAGE_SIZE) / PAGE_SIZE, 393 - PAGE_SIZE / sizeof(struct page *)); 435 + range.hmm_pfns = &(umem_odp->pfn_list[pfn_start_idx]); 436 + timeout = jiffies + msecs_to_jiffies(HMM_RANGE_DEFAULT_TIMEOUT); 394 437 395 - mmap_read_lock(owning_mm); 396 - /* 397 - * Note: this might result in redundent page getting. We can 398 - * avoid this by checking dma_list to be 0 before calling 399 - * get_user_pages. However, this make the code much more 400 - * complex (and doesn't gain us much performance in most use 401 - * cases). 
402 - */ 403 - npages = get_user_pages_remote(owning_mm, 404 - user_virt, gup_num_pages, 405 - flags, local_page_list, NULL, NULL); 406 - mmap_read_unlock(owning_mm); 438 + retry: 439 + current_seq = range.notifier_seq = 440 + mmu_interval_read_begin(&umem_odp->notifier); 407 441 408 - if (npages < 0) { 409 - if (npages != -EAGAIN) 410 - pr_warn("fail to get %zu user pages with error %d\n", gup_num_pages, npages); 411 - else 412 - pr_debug("fail to get %zu user pages with error %d\n", gup_num_pages, npages); 413 - break; 414 - } 442 + mmap_read_lock(owning_mm); 443 + ret = hmm_range_fault(&range); 444 + mmap_read_unlock(owning_mm); 445 + if (unlikely(ret)) { 446 + if (ret == -EBUSY && !time_after(jiffies, timeout)) 447 + goto retry; 448 + goto out_put_mm; 449 + } 415 450 416 - bcnt -= min_t(size_t, npages << PAGE_SHIFT, bcnt); 417 - mutex_lock(&umem_odp->umem_mutex); 418 - for (j = 0; j < npages; j++, user_virt += PAGE_SIZE) { 419 - if (user_virt & ~page_mask) { 420 - p += PAGE_SIZE; 421 - if (page_to_phys(local_page_list[j]) != p) { 422 - ret = -EFAULT; 423 - break; 424 - } 425 - put_page(local_page_list[j]); 451 + start_idx = (range.start - ib_umem_start(umem_odp)) >> page_shift; 452 + dma_index = start_idx; 453 + 454 + mutex_lock(&umem_odp->umem_mutex); 455 + if (mmu_interval_read_retry(&umem_odp->notifier, current_seq)) { 456 + mutex_unlock(&umem_odp->umem_mutex); 457 + goto retry; 458 + } 459 + 460 + for (pfn_index = 0; pfn_index < num_pfns; 461 + pfn_index += 1 << (page_shift - PAGE_SHIFT), dma_index++) { 462 + 463 + if (fault) { 464 + /* 465 + * Since we asked for hmm_range_fault() to populate 466 + * pages it shouldn't return an error entry on success. 
467 + */ 468 + WARN_ON(range.hmm_pfns[pfn_index] & HMM_PFN_ERROR); 469 + WARN_ON(!(range.hmm_pfns[pfn_index] & HMM_PFN_VALID)); 470 + } else { 471 + if (!(range.hmm_pfns[pfn_index] & HMM_PFN_VALID)) { 472 + WARN_ON(umem_odp->dma_list[dma_index]); 426 473 continue; 427 474 } 428 - 429 - ret = ib_umem_odp_map_dma_single_page( 430 - umem_odp, k, local_page_list[j], 431 - access_mask, current_seq); 432 - if (ret < 0) { 433 - if (ret != -EAGAIN) 434 - pr_warn("ib_umem_odp_map_dma_single_page failed with error %d\n", ret); 435 - else 436 - pr_debug("ib_umem_odp_map_dma_single_page failed with error %d\n", ret); 437 - break; 438 - } 439 - 440 - p = page_to_phys(local_page_list[j]); 441 - k++; 475 + access_mask = ODP_READ_ALLOWED_BIT; 476 + if (range.hmm_pfns[pfn_index] & HMM_PFN_WRITE) 477 + access_mask |= ODP_WRITE_ALLOWED_BIT; 442 478 } 443 - mutex_unlock(&umem_odp->umem_mutex); 444 479 480 + hmm_order = hmm_pfn_to_map_order(range.hmm_pfns[pfn_index]); 481 + /* If a hugepage was detected and ODP wasn't set for, the umem 482 + * page_shift will be used, the opposite case is an error. 483 + */ 484 + if (hmm_order + PAGE_SHIFT < page_shift) { 485 + ret = -EINVAL; 486 + ibdev_dbg(umem_odp->umem.ibdev, 487 + "%s: un-expected hmm_order %d, page_shift %d\n", 488 + __func__, hmm_order, page_shift); 489 + break; 490 + } 491 + 492 + ret = ib_umem_odp_map_dma_single_page( 493 + umem_odp, dma_index, hmm_pfn_to_page(range.hmm_pfns[pfn_index]), 494 + access_mask); 445 495 if (ret < 0) { 446 - /* 447 - * Release pages, remembering that the first page 448 - * to hit an error was already released by 449 - * ib_umem_odp_map_dma_single_page(). 
450 - */ 451 - if (npages - (j + 1) > 0) 452 - release_pages(&local_page_list[j+1], 453 - npages - (j + 1)); 496 + ibdev_dbg(umem_odp->umem.ibdev, 497 + "ib_umem_odp_map_dma_single_page failed with error %d\n", ret); 454 498 break; 455 499 } 456 500 } 501 + /* upon sucesss lock should stay on hold for the callee */ 502 + if (!ret) 503 + ret = dma_index - start_idx; 504 + else 505 + mutex_unlock(&umem_odp->umem_mutex); 457 506 458 - if (ret >= 0) { 459 - if (npages < 0 && k == start_idx) 460 - ret = npages; 461 - else 462 - ret = k - start_idx; 463 - } 464 - 507 + out_put_mm: 465 508 mmput(owning_mm); 466 509 out_put_task: 467 510 if (owning_process) 468 511 put_task_struct(owning_process); 469 - free_page((unsigned long)local_page_list); 470 512 return ret; 471 513 } 472 - EXPORT_SYMBOL(ib_umem_odp_map_dma_pages); 514 + EXPORT_SYMBOL(ib_umem_odp_map_dma_and_lock); 473 515 474 516 void ib_umem_odp_unmap_dma_pages(struct ib_umem_odp *umem_odp, u64 virt, 475 517 u64 bound) 476 518 { 519 + dma_addr_t dma_addr; 520 + dma_addr_t dma; 477 521 int idx; 478 522 u64 addr; 479 523 struct ib_device *dev = umem_odp->umem.ibdev; ··· 487 521 488 522 virt = max_t(u64, virt, ib_umem_start(umem_odp)); 489 523 bound = min_t(u64, bound, ib_umem_end(umem_odp)); 490 - /* Note that during the run of this function, the 491 - * notifiers_count of the MR is > 0, preventing any racing 492 - * faults from completion. We might be racing with other 493 - * invalidations, so we must make sure we free each page only 494 - * once. 
*/ 495 524 for (addr = virt; addr < bound; addr += BIT(umem_odp->page_shift)) { 496 525 idx = (addr - ib_umem_start(umem_odp)) >> umem_odp->page_shift; 497 - if (umem_odp->page_list[idx]) { 498 - struct page *page = umem_odp->page_list[idx]; 499 - dma_addr_t dma = umem_odp->dma_list[idx]; 500 - dma_addr_t dma_addr = dma & ODP_DMA_ADDR_MASK; 526 + dma = umem_odp->dma_list[idx]; 501 527 502 - WARN_ON(!dma_addr); 528 + /* The access flags guaranteed a valid DMA address in case was NULL */ 529 + if (dma) { 530 + unsigned long pfn_idx = (addr - ib_umem_start(umem_odp)) >> PAGE_SHIFT; 531 + struct page *page = hmm_pfn_to_page(umem_odp->pfn_list[pfn_idx]); 503 532 533 + dma_addr = dma & ODP_DMA_ADDR_MASK; 504 534 ib_dma_unmap_page(dev, dma_addr, 505 535 BIT(umem_odp->page_shift), 506 536 DMA_BIDIRECTIONAL); ··· 513 551 */ 514 552 set_page_dirty(head_page); 515 553 } 516 - umem_odp->page_list[idx] = NULL; 517 554 umem_odp->dma_list[idx] = 0; 518 555 umem_odp->npages--; 519 556 }
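The umem_odp.c change above keeps two arrays with different granularity: pfn_list has one entry per CPU page (PAGE_SHIFT), while dma_list has one entry per device page (umem_odp->page_shift). The index arithmetic can be sketched in userspace C as follows, assuming 4 KiB CPU pages; SKETCH_PAGE_SHIFT, struct sketch_odp_sizes and sketch_odp_compute_sizes are hypothetical names used only for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12	/* assumption: 4 KiB CPU pages */

struct sketch_odp_sizes {
	size_t npfns;		/* pfn_list entries, one per CPU page */
	size_t ndmas;		/* dma_list entries, one per device page */
	size_t pfn_stride;	/* pfn_list entries covered by one dma_list entry */
};

/*
 * start and end are assumed to be aligned to 1 << page_shift, and
 * page_shift is assumed to be >= SKETCH_PAGE_SHIFT.
 */
static struct sketch_odp_sizes
sketch_odp_compute_sizes(uint64_t start, uint64_t end, unsigned int page_shift)
{
	struct sketch_odp_sizes s;

	s.npfns = (size_t)((end - start) >> SKETCH_PAGE_SHIFT);
	s.ndmas = (size_t)((end - start) >> page_shift);
	s.pfn_stride = (size_t)1 << (page_shift - SKETCH_PAGE_SHIFT);
	return s;
}
```

This mirrors the loop in ib_umem_odp_map_dma_and_lock(), where pfn_index advances by 1 << (page_shift - PAGE_SHIFT) each time dma_index advances by one.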
+65 -28
drivers/infiniband/core/uverbs_cmd.c
··· 218 218 if (!ucontext) 219 219 return -ENOMEM; 220 220 221 - ucontext->res.type = RDMA_RESTRACK_CTX; 222 221 ucontext->device = ib_dev; 223 222 ucontext->ufile = ufile; 224 223 xa_init_flags(&ucontext->mmap_xa, XA_FLAGS_ALLOC); 224 + 225 + rdma_restrack_new(&ucontext->res, RDMA_RESTRACK_CTX); 226 + rdma_restrack_set_name(&ucontext->res, NULL); 225 227 attrs->context = ucontext; 226 228 return 0; 227 229 } ··· 252 250 if (ret) 253 251 goto err_uncharge; 254 252 255 - rdma_restrack_uadd(&ucontext->res); 253 + rdma_restrack_add(&ucontext->res); 256 254 257 255 /* 258 256 * Make sure that ib_uverbs_get_ucontext() sees the pointer update ··· 315 313 err_uobj: 316 314 rdma_alloc_abort_uobject(uobj, attrs, false); 317 315 err_ucontext: 316 + rdma_restrack_put(&attrs->context->res); 318 317 kfree(attrs->context); 319 318 attrs->context = NULL; 320 319 return ret; ··· 442 439 pd->device = ib_dev; 443 440 pd->uobject = uobj; 444 441 atomic_set(&pd->usecnt, 0); 445 - pd->res.type = RDMA_RESTRACK_PD; 442 + 443 + rdma_restrack_new(&pd->res, RDMA_RESTRACK_PD); 444 + rdma_restrack_set_name(&pd->res, NULL); 446 445 447 446 ret = ib_dev->ops.alloc_pd(pd, &attrs->driver_udata); 448 447 if (ret) 449 448 goto err_alloc; 450 - rdma_restrack_uadd(&pd->res); 449 + rdma_restrack_add(&pd->res); 451 450 452 451 uobj->object = pd; 453 452 uobj_finalize_uobj_create(uobj, attrs); ··· 458 453 return uverbs_response(attrs, &resp, sizeof(resp)); 459 454 460 455 err_alloc: 456 + rdma_restrack_put(&pd->res); 461 457 kfree(pd); 462 458 err: 463 459 uobj_alloc_abort(uobj, attrs); ··· 748 742 mr->sig_attrs = NULL; 749 743 mr->uobject = uobj; 750 744 atomic_inc(&pd->usecnt); 751 - mr->res.type = RDMA_RESTRACK_MR; 752 745 mr->iova = cmd.hca_va; 753 - rdma_restrack_uadd(&mr->res); 746 + 747 + rdma_restrack_new(&mr->res, RDMA_RESTRACK_MR); 748 + rdma_restrack_set_name(&mr->res, NULL); 749 + rdma_restrack_add(&mr->res); 754 750 755 751 uobj->object = mr; 756 752 uobj_put_obj_read(pd); ··· 866 858 
static int ib_uverbs_alloc_mw(struct uverbs_attr_bundle *attrs) 867 859 { 868 860 struct ib_uverbs_alloc_mw cmd; 869 - struct ib_uverbs_alloc_mw_resp resp; 861 + struct ib_uverbs_alloc_mw_resp resp = {}; 870 862 struct ib_uobject *uobj; 871 863 struct ib_pd *pd; 872 864 struct ib_mw *mw; ··· 892 884 goto err_put; 893 885 } 894 886 895 - mw = pd->device->ops.alloc_mw(pd, cmd.mw_type, &attrs->driver_udata); 896 - if (IS_ERR(mw)) { 897 - ret = PTR_ERR(mw); 887 + mw = rdma_zalloc_drv_obj(ib_dev, ib_mw); 888 + if (!mw) { 889 + ret = -ENOMEM; 898 890 goto err_put; 899 891 } 900 892 901 - mw->device = pd->device; 902 - mw->pd = pd; 893 + mw->device = ib_dev; 894 + mw->pd = pd; 903 895 mw->uobject = uobj; 896 + mw->type = cmd.mw_type; 897 + 898 + ret = pd->device->ops.alloc_mw(mw, &attrs->driver_udata); 899 + if (ret) 900 + goto err_alloc; 901 + 904 902 atomic_inc(&pd->usecnt); 905 903 906 904 uobj->object = mw; ··· 917 903 resp.mw_handle = uobj->id; 918 904 return uverbs_response(attrs, &resp, sizeof(resp)); 919 905 906 + err_alloc: 907 + kfree(mw); 920 908 err_put: 921 909 uobj_put_obj_read(pd); 922 910 err_free: ··· 1010 994 cq->event_handler = ib_uverbs_cq_event_handler; 1011 995 cq->cq_context = ev_file ? 
&ev_file->ev_queue : NULL; 1012 996 atomic_set(&cq->usecnt, 0); 1013 - cq->res.type = RDMA_RESTRACK_CQ; 997 + 998 + rdma_restrack_new(&cq->res, RDMA_RESTRACK_CQ); 999 + rdma_restrack_set_name(&cq->res, NULL); 1014 1000 1015 1001 ret = ib_dev->ops.create_cq(cq, &attr, &attrs->driver_udata); 1016 1002 if (ret) 1017 1003 goto err_free; 1018 - rdma_restrack_uadd(&cq->res); 1004 + rdma_restrack_add(&cq->res); 1019 1005 1020 1006 obj->uevent.uobject.object = cq; 1021 1007 obj->uevent.event_file = READ_ONCE(attrs->ufile->default_async_file); ··· 1031 1013 return uverbs_response(attrs, &resp, sizeof(resp)); 1032 1014 1033 1015 err_free: 1016 + rdma_restrack_put(&cq->res); 1034 1017 kfree(cq); 1035 1018 err_file: 1036 1019 if (ev_file) ··· 1256 1237 bool has_sq = true; 1257 1238 struct ib_device *ib_dev; 1258 1239 1259 - if (cmd->qp_type == IB_QPT_RAW_PACKET && !capable(CAP_NET_RAW)) 1260 - return -EPERM; 1240 + switch (cmd->qp_type) { 1241 + case IB_QPT_RAW_PACKET: 1242 + if (!capable(CAP_NET_RAW)) 1243 + return -EPERM; 1244 + break; 1245 + case IB_QPT_RC: 1246 + case IB_QPT_UC: 1247 + case IB_QPT_UD: 1248 + case IB_QPT_XRC_INI: 1249 + case IB_QPT_XRC_TGT: 1250 + case IB_QPT_DRIVER: 1251 + break; 1252 + default: 1253 + return -EINVAL; 1254 + } 1261 1255 1262 1256 obj = (struct ib_uqp_object *)uobj_alloc(UVERBS_OBJECT_QP, attrs, 1263 1257 &ib_dev); ··· 3017 2985 { 3018 2986 struct ib_uverbs_ex_create_rwq_ind_table cmd; 3019 2987 struct ib_uverbs_ex_create_rwq_ind_table_resp resp = {}; 3020 - struct ib_uobject *uobj; 2988 + struct ib_uobject *uobj; 3021 2989 int err; 3022 2990 struct ib_rwq_ind_table_init_attr init_attr = {}; 3023 2991 struct ib_rwq_ind_table *rwq_ind_tbl; 3024 - struct ib_wq **wqs = NULL; 2992 + struct ib_wq **wqs = NULL; 3025 2993 u32 *wqs_handles = NULL; 3026 2994 struct ib_wq *wq = NULL; 3027 2995 int i, num_read_wqs; ··· 3079 3047 goto put_wqs; 3080 3048 } 3081 3049 3082 - init_attr.log_ind_tbl_size = cmd.log_ind_tbl_size; 3083 - init_attr.ind_tbl = 
wqs; 3084 - 3085 - rwq_ind_tbl = ib_dev->ops.create_rwq_ind_table(ib_dev, &init_attr, 3086 - &attrs->driver_udata); 3087 - 3088 - if (IS_ERR(rwq_ind_tbl)) { 3089 - err = PTR_ERR(rwq_ind_tbl); 3050 + rwq_ind_tbl = rdma_zalloc_drv_obj(ib_dev, ib_rwq_ind_table); 3051 + if (!rwq_ind_tbl) { 3052 + err = -ENOMEM; 3090 3053 goto err_uobj; 3091 3054 } 3055 + 3056 + init_attr.log_ind_tbl_size = cmd.log_ind_tbl_size; 3057 + init_attr.ind_tbl = wqs; 3092 3058 3093 3059 rwq_ind_tbl->ind_tbl = wqs; 3094 3060 rwq_ind_tbl->log_ind_tbl_size = init_attr.log_ind_tbl_size; ··· 3094 3064 uobj->object = rwq_ind_tbl; 3095 3065 rwq_ind_tbl->device = ib_dev; 3096 3066 atomic_set(&rwq_ind_tbl->usecnt, 0); 3067 + 3068 + err = ib_dev->ops.create_rwq_ind_table(rwq_ind_tbl, &init_attr, 3069 + &attrs->driver_udata); 3070 + if (err) 3071 + goto err_create; 3097 3072 3098 3073 for (i = 0; i < num_wq_handles; i++) 3099 3074 rdma_lookup_put_uobject(&wqs[i]->uobject->uevent.uobject, ··· 3111 3076 resp.response_length = uverbs_response_length(attrs, sizeof(resp)); 3112 3077 return uverbs_response(attrs, &resp, sizeof(resp)); 3113 3078 3079 + err_create: 3080 + kfree(rwq_ind_tbl); 3114 3081 err_uobj: 3115 3082 uobj_alloc_abort(uobj, attrs); 3116 3083 put_wqs: ··· 3269 3232 goto err_free; 3270 3233 } 3271 3234 3272 - flow_id = qp->device->ops.create_flow( 3273 - qp, flow_attr, IB_FLOW_DOMAIN_USER, &attrs->driver_udata); 3235 + flow_id = qp->device->ops.create_flow(qp, flow_attr, 3236 + &attrs->driver_udata); 3274 3237 3275 3238 if (IS_ERR(flow_id)) { 3276 3239 err = PTR_ERR(flow_id);
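The MW and RWQ indirection table hunks above follow one pattern: the core allocates the object, fills in the core-owned fields, calls the driver hook that now returns an int, and frees the object itself if the driver fails. A minimal userspace sketch of that ownership rule is shown below; struct sketch_mw, sketch_alloc_mw_fn, sketch_create_mw and the two sample driver hooks are hypothetical and stand in for rdma_zalloc_drv_obj(), ops.alloc_mw() and the uverbs handler.

```c
#include <errno.h>
#include <stdlib.h>

struct sketch_mw {
	int type;	/* set by the core before the driver runs */
	int hw_handle;	/* set by the driver */
};

/* Hypothetical driver hook: fills in hardware state, returns 0 or -errno. */
typedef int (*sketch_alloc_mw_fn)(struct sketch_mw *mw);

static int sketch_create_mw(sketch_alloc_mw_fn drv_alloc, int type,
			    struct sketch_mw **out)
{
	struct sketch_mw *mw = calloc(1, sizeof(*mw));
	int ret;

	if (!mw)
		return -ENOMEM;

	/* Core-owned fields are initialised before the driver sees the object. */
	mw->type = type;

	ret = drv_alloc(mw);
	if (ret) {
		free(mw);	/* the core frees what the core allocated */
		return ret;
	}

	*out = mw;
	return 0;
}

/* Sample driver hook that succeeds. */
static int sketch_drv_ok(struct sketch_mw *mw)
{
	mw->hw_handle = 42;
	return 0;
}

/* Sample driver hook that fails. */
static int sketch_drv_fail(struct sketch_mw *mw)
{
	(void)mw;
	return -EINVAL;
}
```

Because allocation and freeing sit on the same side of the interface, a driver error path cannot leak or double-free the object, which appears to be the motivation for the err_alloc and err_create labels added in the diff.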
+5 -2
drivers/infiniband/core/uverbs_main.c
···
 	int ret;

 	ret = mw->device->ops.dealloc_mw(mw);
-	if (!ret)
-		atomic_dec(&pd->usecnt);
+	if (ret)
+		return ret;
+
+	atomic_dec(&pd->usecnt);
+	kfree(mw);
 	return ret;
 }

+11 -4
drivers/infiniband/core/uverbs_std_types.c
···
 {
 	struct ib_rwq_ind_table *rwq_ind_tbl = uobject->object;
 	struct ib_wq **ind_tbl = rwq_ind_tbl->ind_tbl;
-	int ret;
+	u32 table_size = (1 << rwq_ind_tbl->log_ind_tbl_size);
+	int ret, i;

-	ret = ib_destroy_rwq_ind_table(rwq_ind_tbl);
+	if (atomic_read(&rwq_ind_tbl->usecnt))
+		return -EBUSY;
+
+	ret = rwq_ind_tbl->device->ops.destroy_rwq_ind_table(rwq_ind_tbl);
 	if (ib_is_destroy_retryable(ret, why, uobject))
 		return ret;

+	for (i = 0; i < table_size; i++)
+		atomic_dec(&ind_tbl[i]->usecnt);
+
+	kfree(rwq_ind_tbl);
 	kfree(ind_tbl);
 	return ret;
 }
···
 	if (ret)
 		return ret;

-	ib_dealloc_pd_user(pd, &attrs->driver_udata);
-	return 0;
+	return ib_dealloc_pd_user(pd, &attrs->driver_udata);
 }

 void ib_uverbs_free_event_queue(struct ib_uverbs_event_queue *event_queue)
+3 -1
drivers/infiniband/core/uverbs_std_types_counters.c
···
 	if (ret)
 		return ret;

-	counters->device->ops.destroy_counters(counters);
+	ret = counters->device->ops.destroy_counters(counters);
+	if (ret)
+		return ret;
 	kfree(counters);
 	return 0;
 }
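The counters hunk above is a small instance of the "destroy may fail" rule from the pull message: the core checks the driver's return value and only frees the object once the hardware has let go of it. A rough userspace sketch, with hypothetical names (struct sketch_counters, sketch_hw_destroy, sketch_free_counters) and a hw_busy flag standing in for whatever makes a real device refuse the destroy:

```c
#include <errno.h>
#include <stdlib.h>

struct sketch_counters {
	int hw_busy;	/* stand-in for a device that cannot release yet */
};

/* Hypothetical driver hook: returns 0 or -errno. */
static int sketch_hw_destroy(struct sketch_counters *c)
{
	return c->hw_busy ? -EBUSY : 0;
}

static int sketch_free_counters(struct sketch_counters **pc)
{
	int ret = sketch_hw_destroy(*pc);

	if (ret)
		return ret;	/* object stays alive; the caller may retry */

	free(*pc);
	*pc = NULL;
	return 0;
}
```

Freeing only after a successful destroy keeps the memory valid for as long as the device might still reference it.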
+6 -2
drivers/infiniband/core/uverbs_std_types_cq.c
···
 #include <rdma/uverbs_std_types.h>
 #include "rdma_core.h"
 #include "uverbs.h"
+#include "restrack.h"

 static int uverbs_free_cq(struct ib_uobject *uobject,
 			  enum rdma_remove_reason why,
···
 	cq->event_handler = ib_uverbs_cq_event_handler;
 	cq->cq_context = ev_file ? &ev_file->ev_queue : NULL;
 	atomic_set(&cq->usecnt, 0);
-	cq->res.type = RDMA_RESTRACK_CQ;
+
+	rdma_restrack_new(&cq->res, RDMA_RESTRACK_CQ);
+	rdma_restrack_set_name(&cq->res, NULL);

 	ret = ib_dev->ops.create_cq(cq, &attr, &attrs->driver_udata);
 	if (ret)
···

 	obj->uevent.uobject.object = cq;
 	obj->uevent.uobject.user_handle = user_handle;
-	rdma_restrack_uadd(&cq->res);
+	rdma_restrack_add(&cq->res);
 	uverbs_finalize_uobj_create(attrs, UVERBS_ATTR_CREATE_CQ_HANDLE);

 	ret = uverbs_copy_to(attrs, UVERBS_ATTR_CREATE_CQ_RESP_CQE, &cq->cqe,
···
 	return ret;

 err_free:
+	rdma_restrack_put(&cq->res);
 	kfree(cq);
 err_event_file:
 	if (obj->uevent.event_file)
+197 -2
drivers/infiniband/core/uverbs_std_types_device.c
··· 3 3 * Copyright (c) 2018, Mellanox Technologies inc. All rights reserved. 4 4 */ 5 5 6 + #include <linux/overflow.h> 6 7 #include <rdma/uverbs_std_types.h> 7 8 #include "rdma_core.h" 8 9 #include "uverbs.h" 9 10 #include <rdma/uverbs_ioctl.h> 10 11 #include <rdma/opa_addr.h> 12 + #include <rdma/ib_cache.h> 11 13 12 14 /* 13 15 * This ioctl method allows calling any defined write or write_ex ··· 167 165 resp->subnet_timeout = attr->subnet_timeout; 168 166 resp->init_type_reply = attr->init_type_reply; 169 167 resp->active_width = attr->active_width; 170 - resp->active_speed = attr->active_speed; 168 + /* This ABI needs to be extended to provide any speed more than IB_SPEED_NDR */ 169 + resp->active_speed = min_t(u16, attr->active_speed, IB_SPEED_NDR); 171 170 resp->phys_state = attr->phys_state; 172 171 resp->link_layer = rdma_port_get_link_layer(ib_dev, port_num); 173 172 } ··· 268 265 return ucontext->device->ops.query_ucontext(ucontext, attrs); 269 266 } 270 267 268 + static int copy_gid_entries_to_user(struct uverbs_attr_bundle *attrs, 269 + struct ib_uverbs_gid_entry *entries, 270 + size_t num_entries, size_t user_entry_size) 271 + { 272 + const struct uverbs_attr *attr; 273 + void __user *user_entries; 274 + size_t copy_len; 275 + int ret; 276 + int i; 277 + 278 + if (user_entry_size == sizeof(*entries)) { 279 + ret = uverbs_copy_to(attrs, 280 + UVERBS_ATTR_QUERY_GID_TABLE_RESP_ENTRIES, 281 + entries, sizeof(*entries) * num_entries); 282 + return ret; 283 + } 284 + 285 + copy_len = min_t(size_t, user_entry_size, sizeof(*entries)); 286 + attr = uverbs_attr_get(attrs, UVERBS_ATTR_QUERY_GID_TABLE_RESP_ENTRIES); 287 + if (IS_ERR(attr)) 288 + return PTR_ERR(attr); 289 + 290 + user_entries = u64_to_user_ptr(attr->ptr_attr.data); 291 + for (i = 0; i < num_entries; i++) { 292 + if (copy_to_user(user_entries, entries, copy_len)) 293 + return -EFAULT; 294 + 295 + if (user_entry_size > sizeof(*entries)) { 296 + if (clear_user(user_entries + sizeof(*entries), 297 + 
user_entry_size - sizeof(*entries))) 298 + return -EFAULT; 299 + } 300 + 301 + entries++; 302 + user_entries += user_entry_size; 303 + } 304 + 305 + return uverbs_output_written(attrs, 306 + UVERBS_ATTR_QUERY_GID_TABLE_RESP_ENTRIES); 307 + } 308 + 309 + static int UVERBS_HANDLER(UVERBS_METHOD_QUERY_GID_TABLE)( 310 + struct uverbs_attr_bundle *attrs) 311 + { 312 + struct ib_uverbs_gid_entry *entries; 313 + struct ib_ucontext *ucontext; 314 + struct ib_device *ib_dev; 315 + size_t user_entry_size; 316 + ssize_t num_entries; 317 + size_t max_entries; 318 + size_t num_bytes; 319 + u32 flags; 320 + int ret; 321 + 322 + ret = uverbs_get_flags32(&flags, attrs, 323 + UVERBS_ATTR_QUERY_GID_TABLE_FLAGS, 0); 324 + if (ret) 325 + return ret; 326 + 327 + ret = uverbs_get_const(&user_entry_size, attrs, 328 + UVERBS_ATTR_QUERY_GID_TABLE_ENTRY_SIZE); 329 + if (ret) 330 + return ret; 331 + 332 + max_entries = uverbs_attr_ptr_get_array_size( 333 + attrs, UVERBS_ATTR_QUERY_GID_TABLE_RESP_ENTRIES, 334 + user_entry_size); 335 + if (max_entries <= 0) 336 + return -EINVAL; 337 + 338 + ucontext = ib_uverbs_get_ucontext(attrs); 339 + if (IS_ERR(ucontext)) 340 + return PTR_ERR(ucontext); 341 + ib_dev = ucontext->device; 342 + 343 + if (check_mul_overflow(max_entries, sizeof(*entries), &num_bytes)) 344 + return -EINVAL; 345 + 346 + entries = uverbs_zalloc(attrs, num_bytes); 347 + if (!entries) 348 + return -ENOMEM; 349 + 350 + num_entries = rdma_query_gid_table(ib_dev, entries, max_entries); 351 + if (num_entries < 0) 352 + return -EINVAL; 353 + 354 + ret = copy_gid_entries_to_user(attrs, entries, num_entries, 355 + user_entry_size); 356 + if (ret) 357 + return ret; 358 + 359 + ret = uverbs_copy_to(attrs, 360 + UVERBS_ATTR_QUERY_GID_TABLE_RESP_NUM_ENTRIES, 361 + &num_entries, sizeof(num_entries)); 362 + return ret; 363 + } 364 + 365 + static int UVERBS_HANDLER(UVERBS_METHOD_QUERY_GID_ENTRY)( 366 + struct uverbs_attr_bundle *attrs) 367 + { 368 + struct ib_uverbs_gid_entry entry = {}; 369 + 
const struct ib_gid_attr *gid_attr; 370 + struct ib_ucontext *ucontext; 371 + struct ib_device *ib_dev; 372 + struct net_device *ndev; 373 + u32 gid_index; 374 + u32 port_num; 375 + u32 flags; 376 + int ret; 377 + 378 + ret = uverbs_get_flags32(&flags, attrs, 379 + UVERBS_ATTR_QUERY_GID_ENTRY_FLAGS, 0); 380 + if (ret) 381 + return ret; 382 + 383 + ret = uverbs_get_const(&port_num, attrs, 384 + UVERBS_ATTR_QUERY_GID_ENTRY_PORT); 385 + if (ret) 386 + return ret; 387 + 388 + ret = uverbs_get_const(&gid_index, attrs, 389 + UVERBS_ATTR_QUERY_GID_ENTRY_GID_INDEX); 390 + if (ret) 391 + return ret; 392 + 393 + ucontext = ib_uverbs_get_ucontext(attrs); 394 + if (IS_ERR(ucontext)) 395 + return PTR_ERR(ucontext); 396 + ib_dev = ucontext->device; 397 + 398 + if (!rdma_is_port_valid(ib_dev, port_num)) 399 + return -EINVAL; 400 + 401 + if (!rdma_ib_or_roce(ib_dev, port_num)) 402 + return -EOPNOTSUPP; 403 + 404 + gid_attr = rdma_get_gid_attr(ib_dev, port_num, gid_index); 405 + if (IS_ERR(gid_attr)) 406 + return PTR_ERR(gid_attr); 407 + 408 + memcpy(&entry.gid, &gid_attr->gid, sizeof(gid_attr->gid)); 409 + entry.gid_index = gid_attr->index; 410 + entry.port_num = gid_attr->port_num; 411 + entry.gid_type = gid_attr->gid_type; 412 + 413 + rcu_read_lock(); 414 + ndev = rdma_read_gid_attr_ndev_rcu(gid_attr); 415 + if (IS_ERR(ndev)) { 416 + if (PTR_ERR(ndev) != -ENODEV) { 417 + ret = PTR_ERR(ndev); 418 + rcu_read_unlock(); 419 + goto out; 420 + } 421 + } else { 422 + entry.netdev_ifindex = ndev->ifindex; 423 + } 424 + rcu_read_unlock(); 425 + 426 + ret = uverbs_copy_to_struct_or_zero( 427 + attrs, UVERBS_ATTR_QUERY_GID_ENTRY_RESP_ENTRY, &entry, 428 + sizeof(entry)); 429 + out: 430 + rdma_put_gid_attr(gid_attr); 431 + return ret; 432 + } 433 + 271 434 DECLARE_UVERBS_NAMED_METHOD( 272 435 UVERBS_METHOD_GET_CONTEXT, 273 436 UVERBS_ATTR_PTR_OUT(UVERBS_ATTR_GET_CONTEXT_NUM_COMP_VECTORS, ··· 468 299 reserved), 469 300 UA_MANDATORY)); 470 301 302 + DECLARE_UVERBS_NAMED_METHOD( 303 + 
UVERBS_METHOD_QUERY_GID_TABLE, 304 + UVERBS_ATTR_CONST_IN(UVERBS_ATTR_QUERY_GID_TABLE_ENTRY_SIZE, u64, 305 + UA_MANDATORY), 306 + UVERBS_ATTR_FLAGS_IN(UVERBS_ATTR_QUERY_GID_TABLE_FLAGS, u32, 307 + UA_OPTIONAL), 308 + UVERBS_ATTR_PTR_OUT(UVERBS_ATTR_QUERY_GID_TABLE_RESP_ENTRIES, 309 + UVERBS_ATTR_MIN_SIZE(0), UA_MANDATORY), 310 + UVERBS_ATTR_PTR_OUT(UVERBS_ATTR_QUERY_GID_TABLE_RESP_NUM_ENTRIES, 311 + UVERBS_ATTR_TYPE(u64), UA_MANDATORY)); 312 + 313 + DECLARE_UVERBS_NAMED_METHOD( 314 + UVERBS_METHOD_QUERY_GID_ENTRY, 315 + UVERBS_ATTR_CONST_IN(UVERBS_ATTR_QUERY_GID_ENTRY_PORT, u32, 316 + UA_MANDATORY), 317 + UVERBS_ATTR_CONST_IN(UVERBS_ATTR_QUERY_GID_ENTRY_GID_INDEX, u32, 318 + UA_MANDATORY), 319 + UVERBS_ATTR_FLAGS_IN(UVERBS_ATTR_QUERY_GID_ENTRY_FLAGS, u32, 320 + UA_MANDATORY), 321 + UVERBS_ATTR_PTR_OUT(UVERBS_ATTR_QUERY_GID_ENTRY_RESP_ENTRY, 322 + UVERBS_ATTR_STRUCT(struct ib_uverbs_gid_entry, 323 + netdev_ifindex), 324 + UA_MANDATORY)); 325 + 471 326 DECLARE_UVERBS_GLOBAL_METHODS(UVERBS_OBJECT_DEVICE, 472 327 &UVERBS_METHOD(UVERBS_METHOD_GET_CONTEXT), 473 328 &UVERBS_METHOD(UVERBS_METHOD_INVOKE_WRITE), 474 329 &UVERBS_METHOD(UVERBS_METHOD_INFO_HANDLES), 475 330 &UVERBS_METHOD(UVERBS_METHOD_QUERY_PORT), 476 - &UVERBS_METHOD(UVERBS_METHOD_QUERY_CONTEXT)); 331 + &UVERBS_METHOD(UVERBS_METHOD_QUERY_CONTEXT), 332 + &UVERBS_METHOD(UVERBS_METHOD_QUERY_GID_TABLE), 333 + &UVERBS_METHOD(UVERBS_METHOD_QUERY_GID_ENTRY)); 477 334 478 335 const struct uapi_definition uverbs_def_obj_device[] = { 479 336 UAPI_DEF_CHAIN_OBJ_TREE_NAMED(UVERBS_OBJECT_DEVICE),
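The QUERY_GID_TABLE handler above lets the caller declare its own per-entry size and checks the `max_entries * sizeof(*entries)` product with `check_mul_overflow()` before allocating. A minimal user-space sketch of that idea follows. It is not the kernel code: `struct demo_entry` and `demo_copy_entries()` are hypothetical, `memcpy`/`memset` stand in for `copy_to_user`/`clear_user`, the GCC/Clang builtin `__builtin_mul_overflow` stands in for `check_mul_overflow()`, and truncating entries for a smaller caller stride is an assumption of this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical entry layout; NOT struct ib_uverbs_gid_entry. */
struct demo_entry {
	unsigned int gid_index;
	unsigned int port_num;
};

/*
 * Copy n entries into dst using the caller's per-entry stride.
 * A larger stride gets its tail zeroed; a smaller stride truncates
 * each entry (an assumption made for this sketch).
 */
static int demo_copy_entries(unsigned char *dst, size_t dst_len,
			     const struct demo_entry *src, size_t n,
			     size_t user_entry_size)
{
	size_t copy_size, total, i;

	/* Reject n * stride products that wrap around size_t. */
	if (__builtin_mul_overflow(n, user_entry_size, &total))
		return -EINVAL;
	if (total > dst_len)
		return -EINVAL;

	copy_size = user_entry_size < sizeof(*src) ? user_entry_size
						   : sizeof(*src);
	for (i = 0; i < n; i++) {
		memcpy(dst, &src[i], copy_size);
		if (user_entry_size > copy_size)
			memset(dst + copy_size, 0, user_entry_size - copy_size);
		dst += user_entry_size;
	}
	return 0;
}
```

Carrying the stride in the request is what allows the entry struct to grow later without breaking callers built against an older layout.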
+1 -1
drivers/infiniband/core/uverbs_std_types_wq.c
@@ -16,7 +16,7 @@
 		container_of(uobject, struct ib_uwq_object, uevent.uobject);
 	int ret;
 
-	ret = ib_destroy_wq(wq, &attrs->driver_udata);
+	ret = ib_destroy_wq_user(wq, &attrs->driver_udata);
 	if (ib_is_destroy_retryable(ret, why, uobject))
 		return ret;
 
+61 -53
drivers/infiniband/core/verbs.c
··· 272 272 atomic_set(&pd->usecnt, 0); 273 273 pd->flags = flags; 274 274 275 - pd->res.type = RDMA_RESTRACK_PD; 276 - rdma_restrack_set_task(&pd->res, caller); 275 + rdma_restrack_new(&pd->res, RDMA_RESTRACK_PD); 276 + rdma_restrack_set_name(&pd->res, caller); 277 277 278 278 ret = device->ops.alloc_pd(pd, NULL); 279 279 if (ret) { 280 + rdma_restrack_put(&pd->res); 280 281 kfree(pd); 281 282 return ERR_PTR(ret); 282 283 } 283 - rdma_restrack_kadd(&pd->res); 284 + rdma_restrack_add(&pd->res); 284 285 285 286 if (device->attrs.device_cap_flags & IB_DEVICE_LOCAL_DMA_LKEY) 286 287 pd->local_dma_lkey = device->local_dma_lkey; ··· 330 329 * exist. The caller is responsible to synchronously destroy them and 331 330 * guarantee no new allocations will happen. 332 331 */ 333 - void ib_dealloc_pd_user(struct ib_pd *pd, struct ib_udata *udata) 332 + int ib_dealloc_pd_user(struct ib_pd *pd, struct ib_udata *udata) 334 333 { 335 334 int ret; 336 335 ··· 344 343 requires the caller to guarantee we can't race here. 
*/ 345 344 WARN_ON(atomic_read(&pd->usecnt)); 346 345 346 + ret = pd->device->ops.dealloc_pd(pd, udata); 347 + if (ret) 348 + return ret; 349 + 347 350 rdma_restrack_del(&pd->res); 348 - pd->device->ops.dealloc_pd(pd, udata); 349 351 kfree(pd); 352 + return ret; 350 353 } 351 354 EXPORT_SYMBOL(ib_dealloc_pd_user); 352 355 ··· 733 728 (struct in6_addr *)dgid); 734 729 return 0; 735 730 } else if (net_type == RDMA_NETWORK_IPV6 || 736 - net_type == RDMA_NETWORK_IB) { 731 + net_type == RDMA_NETWORK_IB || RDMA_NETWORK_ROCE_V1) { 737 732 *dgid = hdr->ibgrh.dgid; 738 733 *sgid = hdr->ibgrh.sgid; 739 734 return 0; ··· 969 964 { 970 965 const struct ib_gid_attr *sgid_attr = ah->sgid_attr; 971 966 struct ib_pd *pd; 967 + int ret; 972 968 973 969 might_sleep_if(flags & RDMA_DESTROY_AH_SLEEPABLE); 974 970 975 971 pd = ah->pd; 976 972 977 - ah->device->ops.destroy_ah(ah, flags); 973 + ret = ah->device->ops.destroy_ah(ah, flags); 974 + if (ret) 975 + return ret; 976 + 978 977 atomic_dec(&pd->usecnt); 979 978 if (sgid_attr) 980 979 rdma_put_gid_attr(sgid_attr); 981 980 982 981 kfree(ah); 983 - return 0; 982 + return ret; 984 983 } 985 984 EXPORT_SYMBOL(rdma_destroy_ah_user); 986 985 ··· 1069 1060 1070 1061 int ib_destroy_srq_user(struct ib_srq *srq, struct ib_udata *udata) 1071 1062 { 1063 + int ret; 1064 + 1072 1065 if (atomic_read(&srq->usecnt)) 1073 1066 return -EBUSY; 1074 1067 1075 - srq->device->ops.destroy_srq(srq, udata); 1068 + ret = srq->device->ops.destroy_srq(srq, udata); 1069 + if (ret) 1070 + return ret; 1076 1071 1077 1072 atomic_dec(&srq->pd->usecnt); 1078 1073 if (srq->srq_type == IB_SRQT_XRC) ··· 1085 1072 atomic_dec(&srq->ext.cq->usecnt); 1086 1073 kfree(srq); 1087 1074 1088 - return 0; 1075 + return ret; 1089 1076 } 1090 1077 EXPORT_SYMBOL(ib_destroy_srq_user); 1091 1078 ··· 1794 1781 } 1795 1782 EXPORT_SYMBOL(ib_modify_qp_with_udata); 1796 1783 1797 - int ib_get_eth_speed(struct ib_device *dev, u8 port_num, u8 *speed, u8 *width) 1784 + int 
ib_get_eth_speed(struct ib_device *dev, u8 port_num, u16 *speed, u8 *width) 1798 1785 { 1799 1786 int rc; 1800 1787 u32 netdev_speed; ··· 1997 1984 cq->event_handler = event_handler; 1998 1985 cq->cq_context = cq_context; 1999 1986 atomic_set(&cq->usecnt, 0); 2000 - cq->res.type = RDMA_RESTRACK_CQ; 2001 - rdma_restrack_set_task(&cq->res, caller); 1987 + 1988 + rdma_restrack_new(&cq->res, RDMA_RESTRACK_CQ); 1989 + rdma_restrack_set_name(&cq->res, caller); 2002 1990 2003 1991 ret = device->ops.create_cq(cq, cq_attr, NULL); 2004 1992 if (ret) { 1993 + rdma_restrack_put(&cq->res); 2005 1994 kfree(cq); 2006 1995 return ERR_PTR(ret); 2007 1996 } 2008 1997 2009 - rdma_restrack_kadd(&cq->res); 1998 + rdma_restrack_add(&cq->res); 2010 1999 return cq; 2011 2000 } 2012 2001 EXPORT_SYMBOL(__ib_create_cq); ··· 2026 2011 2027 2012 int ib_destroy_cq_user(struct ib_cq *cq, struct ib_udata *udata) 2028 2013 { 2014 + int ret; 2015 + 2029 2016 if (WARN_ON_ONCE(cq->shared)) 2030 2017 return -EOPNOTSUPP; 2031 2018 2032 2019 if (atomic_read(&cq->usecnt)) 2033 2020 return -EBUSY; 2034 2021 2022 + ret = cq->device->ops.destroy_cq(cq, udata); 2023 + if (ret) 2024 + return ret; 2025 + 2035 2026 rdma_restrack_del(&cq->res); 2036 - cq->device->ops.destroy_cq(cq, udata); 2037 2027 kfree(cq); 2038 - return 0; 2028 + return ret; 2039 2029 } 2040 2030 EXPORT_SYMBOL(ib_destroy_cq_user); 2041 2031 ··· 2079 2059 mr->pd = pd; 2080 2060 mr->dm = NULL; 2081 2061 atomic_inc(&pd->usecnt); 2082 - mr->res.type = RDMA_RESTRACK_MR; 2083 - rdma_restrack_kadd(&mr->res); 2062 + 2063 + rdma_restrack_new(&mr->res, RDMA_RESTRACK_MR); 2064 + rdma_restrack_parent_name(&mr->res, &pd->res); 2065 + rdma_restrack_add(&mr->res); 2084 2066 2085 2067 return mr; 2086 2068 } ··· 2161 2139 mr->uobject = NULL; 2162 2140 atomic_inc(&pd->usecnt); 2163 2141 mr->need_inval = false; 2164 - mr->res.type = RDMA_RESTRACK_MR; 2165 - rdma_restrack_kadd(&mr->res); 2166 2142 mr->type = mr_type; 2167 2143 mr->sig_attrs = NULL; 2168 2144 
2145 + rdma_restrack_new(&mr->res, RDMA_RESTRACK_MR); 2146 + rdma_restrack_parent_name(&mr->res, &pd->res); 2147 + rdma_restrack_add(&mr->res); 2169 2148 out: 2170 2149 trace_mr_alloc(pd, mr_type, max_num_sg, mr); 2171 2150 return mr; ··· 2222 2199 mr->uobject = NULL; 2223 2200 atomic_inc(&pd->usecnt); 2224 2201 mr->need_inval = false; 2225 - mr->res.type = RDMA_RESTRACK_MR; 2226 - rdma_restrack_kadd(&mr->res); 2227 2202 mr->type = IB_MR_TYPE_INTEGRITY; 2228 2203 mr->sig_attrs = sig_attrs; 2229 2204 2205 + rdma_restrack_new(&mr->res, RDMA_RESTRACK_MR); 2206 + rdma_restrack_parent_name(&mr->res, &pd->res); 2207 + rdma_restrack_add(&mr->res); 2230 2208 out: 2231 2209 trace_mr_integ_alloc(pd, max_num_data_sg, max_num_meta_sg, mr); 2232 2210 return mr; ··· 2352 2328 */ 2353 2329 int ib_dealloc_xrcd_user(struct ib_xrcd *xrcd, struct ib_udata *udata) 2354 2330 { 2331 + int ret; 2332 + 2355 2333 if (atomic_read(&xrcd->usecnt)) 2356 2334 return -EBUSY; 2357 2335 2358 2336 WARN_ON(!xa_empty(&xrcd->tgt_qps)); 2359 - xrcd->device->ops.dealloc_xrcd(xrcd, udata); 2337 + ret = xrcd->device->ops.dealloc_xrcd(xrcd, udata); 2338 + if (ret) 2339 + return ret; 2360 2340 kfree(xrcd); 2361 - return 0; 2341 + return ret; 2362 2342 } 2363 2343 EXPORT_SYMBOL(ib_dealloc_xrcd_user); 2364 2344 ··· 2406 2378 EXPORT_SYMBOL(ib_create_wq); 2407 2379 2408 2380 /** 2409 - * ib_destroy_wq - Destroys the specified user WQ. 2381 + * ib_destroy_wq_user - Destroys the specified user WQ. 2410 2382 * @wq: The WQ to destroy. 
2411 2383 * @udata: Valid user data 2412 2384 */ 2413 - int ib_destroy_wq(struct ib_wq *wq, struct ib_udata *udata) 2385 + int ib_destroy_wq_user(struct ib_wq *wq, struct ib_udata *udata) 2414 2386 { 2415 2387 struct ib_cq *cq = wq->cq; 2416 2388 struct ib_pd *pd = wq->pd; 2389 + int ret; 2417 2390 2418 2391 if (atomic_read(&wq->usecnt)) 2419 2392 return -EBUSY; 2420 2393 2421 - wq->device->ops.destroy_wq(wq, udata); 2394 + ret = wq->device->ops.destroy_wq(wq, udata); 2395 + if (ret) 2396 + return ret; 2397 + 2422 2398 atomic_dec(&pd->usecnt); 2423 2399 atomic_dec(&cq->usecnt); 2424 - 2425 - return 0; 2400 + return ret; 2426 2401 } 2427 - EXPORT_SYMBOL(ib_destroy_wq); 2402 + EXPORT_SYMBOL(ib_destroy_wq_user); 2428 2403 2429 2404 /** 2430 2405 * ib_modify_wq - Modifies the specified WQ. ··· 2449 2418 return err; 2450 2419 } 2451 2420 EXPORT_SYMBOL(ib_modify_wq); 2452 - 2453 - /* 2454 - * ib_destroy_rwq_ind_table - Destroys the specified Indirection Table. 2455 - * @wq_ind_table: The Indirection Table to destroy. 2456 - */ 2457 - int ib_destroy_rwq_ind_table(struct ib_rwq_ind_table *rwq_ind_table) 2458 - { 2459 - int err, i; 2460 - u32 table_size = (1 << rwq_ind_table->log_ind_tbl_size); 2461 - struct ib_wq **ind_tbl = rwq_ind_table->ind_tbl; 2462 - 2463 - if (atomic_read(&rwq_ind_table->usecnt)) 2464 - return -EBUSY; 2465 - 2466 - err = rwq_ind_table->device->ops.destroy_rwq_ind_table(rwq_ind_table); 2467 - if (!err) { 2468 - for (i = 0; i < table_size; i++) 2469 - atomic_dec(&ind_tbl[i]->usecnt); 2470 - } 2471 - 2472 - return err; 2473 - } 2474 - EXPORT_SYMBOL(ib_destroy_rwq_ind_table); 2475 2421 2476 2422 int ib_check_mr_status(struct ib_mr *mr, u32 check_mask, 2477 2423 struct ib_mr_status *mr_status)
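The destroy paths in this file (`ib_dealloc_pd_user`, `rdma_destroy_ah_user`, `ib_destroy_srq_user`, `ib_destroy_cq_user`, `ib_dealloc_xrcd_user`, `ib_destroy_wq_user`) now share one shape: call the driver op first, return early if it fails, and only then drop reference counts and free the core object. A minimal user-space sketch of that ordering is below. Every `demo_*` name is hypothetical, and `free()` stands in for `kfree()`.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel types; not the real RDMA structs. */
struct demo_obj;

struct demo_ops {
	int (*destroy)(struct demo_obj *obj);	/* returns 0 or -errno */
};

struct demo_obj {
	const struct demo_ops *ops;
	int *parent_usecnt;	/* reference held on a parent object */
};

/* Core-side wrapper: release core state only after the driver succeeded. */
static int demo_destroy(struct demo_obj *obj)
{
	int ret;

	ret = obj->ops->destroy(obj);
	if (ret)
		return ret;	/* object is untouched, caller may retry */

	(*obj->parent_usecnt)--;
	free(obj);
	return 0;
}

/* Two fake drivers: one whose hardware refuses, one that succeeds. */
static int demo_hw_busy(struct demo_obj *obj) { (void)obj; return -EBUSY; }
static int demo_hw_ok(struct demo_obj *obj) { (void)obj; return 0; }

static const struct demo_ops demo_busy_ops = { .destroy = demo_hw_busy };
static const struct demo_ops demo_ok_ops = { .destroy = demo_hw_ok };
```

With `demo_busy_ops`, `demo_destroy()` returns `-EBUSY` and leaves both the object and the parent count intact. With `demo_ok_ops`, it decrements the count and frees the object.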
+1 -1
drivers/infiniband/hw/bnxt_re/bnxt_re.h
@@ -150,7 +150,7 @@
 
 	struct delayed_work		worker;
 	u8				cur_prio_map;
-	u8				active_speed;
+	u16				active_speed;
 	u8				active_width;
 
 	/* FP Notification Queue (CQ & SRQ) */
+32 -60
drivers/infiniband/hw/bnxt_re/ib_verbs.c
··· 532 532 } 533 533 534 534 /* Protection Domains */ 535 - void bnxt_re_dealloc_pd(struct ib_pd *ib_pd, struct ib_udata *udata) 535 + int bnxt_re_dealloc_pd(struct ib_pd *ib_pd, struct ib_udata *udata) 536 536 { 537 537 struct bnxt_re_pd *pd = container_of(ib_pd, struct bnxt_re_pd, ib_pd); 538 538 struct bnxt_re_dev *rdev = pd->rdev; ··· 542 542 if (pd->qplib_pd.id) 543 543 bnxt_qplib_dealloc_pd(&rdev->qplib_res, &rdev->qplib_res.pd_tbl, 544 544 &pd->qplib_pd); 545 + return 0; 545 546 } 546 547 547 548 int bnxt_re_alloc_pd(struct ib_pd *ibpd, struct ib_udata *udata) ··· 602 601 } 603 602 604 603 /* Address Handles */ 605 - void bnxt_re_destroy_ah(struct ib_ah *ib_ah, u32 flags) 604 + int bnxt_re_destroy_ah(struct ib_ah *ib_ah, u32 flags) 606 605 { 607 606 struct bnxt_re_ah *ah = container_of(ib_ah, struct bnxt_re_ah, ib_ah); 608 607 struct bnxt_re_dev *rdev = ah->rdev; 609 608 610 609 bnxt_qplib_destroy_ah(&rdev->qplib_res, &ah->qplib_ah, 611 610 !(flags & RDMA_DESTROY_AH_SLEEPABLE)); 611 + return 0; 612 612 } 613 613 614 614 static u8 bnxt_re_stack_to_dev_nw_type(enum rdma_network_type ntype) ··· 940 938 return PTR_ERR(umem); 941 939 942 940 qp->sumem = umem; 943 - qplib_qp->sq.sg_info.sghead = umem->sg_head.sgl; 944 - qplib_qp->sq.sg_info.npages = ib_umem_num_pages(umem); 945 - qplib_qp->sq.sg_info.nmap = umem->nmap; 941 + qplib_qp->sq.sg_info.umem = umem; 946 942 qplib_qp->sq.sg_info.pgsize = PAGE_SIZE; 947 943 qplib_qp->sq.sg_info.pgshft = PAGE_SHIFT; 948 944 qplib_qp->qp_handle = ureq.qp_handle; ··· 953 953 if (IS_ERR(umem)) 954 954 goto rqfail; 955 955 qp->rumem = umem; 956 - qplib_qp->rq.sg_info.sghead = umem->sg_head.sgl; 957 - qplib_qp->rq.sg_info.npages = ib_umem_num_pages(umem); 958 - qplib_qp->rq.sg_info.nmap = umem->nmap; 956 + qplib_qp->rq.sg_info.umem = umem; 959 957 qplib_qp->rq.sg_info.pgsize = PAGE_SIZE; 960 958 qplib_qp->rq.sg_info.pgshft = PAGE_SHIFT; 961 959 } ··· 1566 1568 } 1567 1569 1568 1570 /* Shared Receive Queues */ 1569 - void 
bnxt_re_destroy_srq(struct ib_srq *ib_srq, struct ib_udata *udata) 1571 + int bnxt_re_destroy_srq(struct ib_srq *ib_srq, struct ib_udata *udata) 1570 1572 { 1571 1573 struct bnxt_re_srq *srq = container_of(ib_srq, struct bnxt_re_srq, 1572 1574 ib_srq); ··· 1581 1583 atomic_dec(&rdev->srq_count); 1582 1584 if (nq) 1583 1585 nq->budget--; 1586 + return 0; 1584 1587 } 1585 1588 1586 1589 static int bnxt_re_init_user_srq(struct bnxt_re_dev *rdev, ··· 1607 1608 return PTR_ERR(umem); 1608 1609 1609 1610 srq->umem = umem; 1610 - qplib_srq->sg_info.sghead = umem->sg_head.sgl; 1611 - qplib_srq->sg_info.npages = ib_umem_num_pages(umem); 1612 - qplib_srq->sg_info.nmap = umem->nmap; 1611 + qplib_srq->sg_info.umem = umem; 1613 1612 qplib_srq->sg_info.pgsize = PAGE_SIZE; 1614 1613 qplib_srq->sg_info.pgshft = PAGE_SHIFT; 1615 1614 qplib_srq->srq_handle = ureq.srq_handle; ··· 2797 2800 } 2798 2801 2799 2802 /* Completion Queues */ 2800 - void bnxt_re_destroy_cq(struct ib_cq *ib_cq, struct ib_udata *udata) 2803 + int bnxt_re_destroy_cq(struct ib_cq *ib_cq, struct ib_udata *udata) 2801 2804 { 2802 2805 struct bnxt_re_cq *cq; 2803 2806 struct bnxt_qplib_nq *nq; ··· 2813 2816 atomic_dec(&rdev->cq_count); 2814 2817 nq->budget--; 2815 2818 kfree(cq->cql); 2819 + return 0; 2816 2820 } 2817 2821 2818 2822 int bnxt_re_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr, ··· 2858 2860 rc = PTR_ERR(cq->umem); 2859 2861 goto fail; 2860 2862 } 2861 - cq->qplib_cq.sg_info.sghead = cq->umem->sg_head.sgl; 2862 - cq->qplib_cq.sg_info.npages = ib_umem_num_pages(cq->umem); 2863 - cq->qplib_cq.sg_info.nmap = cq->umem->nmap; 2863 + cq->qplib_cq.sg_info.umem = cq->umem; 2864 2864 cq->qplib_cq.dpi = &uctx->dpi; 2865 2865 } else { 2866 2866 cq->max_cql = min_t(u32, entries, MAX_CQL_PER_POLL); ··· 3770 3774 return rc; 3771 3775 } 3772 3776 3773 - static int bnxt_re_page_size_ok(int page_shift) 3774 - { 3775 - switch (page_shift) { 3776 - case CMDQ_REGISTER_MR_LOG2_PBL_PG_SIZE_PG_4K: 3777 - 
case CMDQ_REGISTER_MR_LOG2_PBL_PG_SIZE_PG_8K: 3778 - case CMDQ_REGISTER_MR_LOG2_PBL_PG_SIZE_PG_64K: 3779 - case CMDQ_REGISTER_MR_LOG2_PBL_PG_SIZE_PG_2M: 3780 - case CMDQ_REGISTER_MR_LOG2_PBL_PG_SIZE_PG_256K: 3781 - case CMDQ_REGISTER_MR_LOG2_PBL_PG_SIZE_PG_1M: 3782 - case CMDQ_REGISTER_MR_LOG2_PBL_PG_SIZE_PG_4M: 3783 - case CMDQ_REGISTER_MR_LOG2_PBL_PG_SIZE_PG_1G: 3784 - return 1; 3785 - default: 3786 - return 0; 3787 - } 3788 - } 3789 - 3790 3777 static int fill_umem_pbl_tbl(struct ib_umem *umem, u64 *pbl_tbl_orig, 3791 3778 int page_shift) 3792 3779 { ··· 3777 3798 u64 page_size = BIT_ULL(page_shift); 3778 3799 struct ib_block_iter biter; 3779 3800 3780 - rdma_for_each_block(umem->sg_head.sgl, &biter, umem->nmap, page_size) 3801 + rdma_umem_for_each_dma_block(umem, &biter, page_size) 3781 3802 *pbl_tbl++ = rdma_block_iter_dma_address(&biter); 3782 3803 3783 3804 return pbl_tbl - pbl_tbl_orig; ··· 3793 3814 struct bnxt_re_mr *mr; 3794 3815 struct ib_umem *umem; 3795 3816 u64 *pbl_tbl = NULL; 3796 - int umem_pgs, page_shift, rc; 3817 + unsigned long page_size; 3818 + int umem_pgs, rc; 3797 3819 3798 3820 if (length > BNXT_RE_MAX_MR_SIZE) { 3799 3821 ibdev_err(&rdev->ibdev, "MR Size: %lld > Max supported:%lld\n", ··· 3828 3848 mr->ib_umem = umem; 3829 3849 3830 3850 mr->qplib_mr.va = virt_addr; 3831 - umem_pgs = ib_umem_page_count(umem); 3832 - if (!umem_pgs) { 3833 - ibdev_err(&rdev->ibdev, "umem is invalid!"); 3834 - rc = -EINVAL; 3851 + page_size = ib_umem_find_best_pgsz( 3852 + umem, BNXT_RE_PAGE_SIZE_4K | BNXT_RE_PAGE_SIZE_2M, virt_addr); 3853 + if (!page_size) { 3854 + ibdev_err(&rdev->ibdev, "umem page size unsupported!"); 3855 + rc = -EFAULT; 3835 3856 goto free_umem; 3836 3857 } 3837 3858 mr->qplib_mr.total_size = length; 3838 3859 3839 - pbl_tbl = kcalloc(umem_pgs, sizeof(u64 *), GFP_KERNEL); 3860 + if (page_size == BNXT_RE_PAGE_SIZE_4K && 3861 + length > BNXT_RE_MAX_MR_SIZE_LOW) { 3862 + ibdev_err(&rdev->ibdev, "Requested MR Sz:%llu Max sup:%llu", 3863 + 
length, (u64)BNXT_RE_MAX_MR_SIZE_LOW); 3864 + rc = -EINVAL; 3865 + goto free_umem; 3866 + } 3867 + 3868 + umem_pgs = ib_umem_num_dma_blocks(umem, page_size); 3869 + pbl_tbl = kcalloc(umem_pgs, sizeof(*pbl_tbl), GFP_KERNEL); 3840 3870 if (!pbl_tbl) { 3841 3871 rc = -ENOMEM; 3842 3872 goto free_umem; 3843 3873 } 3844 3874 3845 - page_shift = __ffs(ib_umem_find_best_pgsz(umem, 3846 - BNXT_RE_PAGE_SIZE_4K | BNXT_RE_PAGE_SIZE_2M, 3847 - virt_addr)); 3848 - 3849 - if (!bnxt_re_page_size_ok(page_shift)) { 3850 - ibdev_err(&rdev->ibdev, "umem page size unsupported!"); 3851 - rc = -EFAULT; 3852 - goto fail; 3853 - } 3854 - 3855 - if (page_shift == BNXT_RE_PAGE_SHIFT_4K && 3856 - length > BNXT_RE_MAX_MR_SIZE_LOW) { 3857 - ibdev_err(&rdev->ibdev, "Requested MR Sz:%llu Max sup:%llu", 3858 - length, (u64)BNXT_RE_MAX_MR_SIZE_LOW); 3859 - rc = -EINVAL; 3860 - goto fail; 3861 - } 3862 - 3863 3875 /* Map umem buf ptrs to the PBL */ 3864 - umem_pgs = fill_umem_pbl_tbl(umem, pbl_tbl, page_shift); 3876 + umem_pgs = fill_umem_pbl_tbl(umem, pbl_tbl, order_base_2(page_size)); 3865 3877 rc = bnxt_qplib_reg_mr(&rdev->qplib_res, &mr->qplib_mr, pbl_tbl, 3866 - umem_pgs, false, 1 << page_shift); 3878 + umem_pgs, false, page_size); 3867 3879 if (rc) { 3868 3880 ibdev_err(&rdev->ibdev, "Failed to register user MR"); 3869 3881 goto fail;
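The MR registration hunk above sizes `pbl_tbl` from `ib_umem_num_dma_blocks(umem, page_size)` instead of a per-`PAGE_SIZE` page count, so a 2M-aligned region needs far fewer PBL entries. The arithmetic behind such a block count can be sketched in user space as follows. This is an illustration under the assumption that the block size is a power of two; the `demo_*` helpers are hypothetical and operate on a plain address range rather than a `struct ib_umem`.

```c
#include <assert.h>
#include <stdint.h>

/* Round down/up to a power-of-two block size. */
static uint64_t demo_align_down(uint64_t x, uint64_t pgsz)
{
	return x & ~(pgsz - 1);
}

static uint64_t demo_align_up(uint64_t x, uint64_t pgsz)
{
	return (x + pgsz - 1) & ~(pgsz - 1);
}

/* Number of pgsz-sized blocks needed to cover [iova, iova + length). */
static uint64_t demo_num_dma_blocks(uint64_t iova, uint64_t length,
				    uint64_t pgsz)
{
	return (demo_align_up(iova + length, pgsz) -
		demo_align_down(iova, pgsz)) / pgsz;
}
```

A 4 KiB range that starts mid-page straddles two 4 KiB blocks but fits in a single 2 MiB block, which is why the count has to be computed per chosen page size.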
+4 -4
drivers/infiniband/hw/bnxt_re/ib_verbs.h
··· 163 163 enum rdma_link_layer bnxt_re_get_link_layer(struct ib_device *ibdev, 164 164 u8 port_num); 165 165 int bnxt_re_alloc_pd(struct ib_pd *pd, struct ib_udata *udata); 166 - void bnxt_re_dealloc_pd(struct ib_pd *pd, struct ib_udata *udata); 166 + int bnxt_re_dealloc_pd(struct ib_pd *pd, struct ib_udata *udata); 167 167 int bnxt_re_create_ah(struct ib_ah *ah, struct rdma_ah_init_attr *init_attr, 168 168 struct ib_udata *udata); 169 169 int bnxt_re_modify_ah(struct ib_ah *ah, struct rdma_ah_attr *ah_attr); 170 170 int bnxt_re_query_ah(struct ib_ah *ah, struct rdma_ah_attr *ah_attr); 171 - void bnxt_re_destroy_ah(struct ib_ah *ah, u32 flags); 171 + int bnxt_re_destroy_ah(struct ib_ah *ah, u32 flags); 172 172 int bnxt_re_create_srq(struct ib_srq *srq, 173 173 struct ib_srq_init_attr *srq_init_attr, 174 174 struct ib_udata *udata); ··· 176 176 enum ib_srq_attr_mask srq_attr_mask, 177 177 struct ib_udata *udata); 178 178 int bnxt_re_query_srq(struct ib_srq *srq, struct ib_srq_attr *srq_attr); 179 - void bnxt_re_destroy_srq(struct ib_srq *srq, struct ib_udata *udata); 179 + int bnxt_re_destroy_srq(struct ib_srq *srq, struct ib_udata *udata); 180 180 int bnxt_re_post_srq_recv(struct ib_srq *srq, const struct ib_recv_wr *recv_wr, 181 181 const struct ib_recv_wr **bad_recv_wr); 182 182 struct ib_qp *bnxt_re_create_qp(struct ib_pd *pd, ··· 193 193 const struct ib_recv_wr **bad_recv_wr); 194 194 int bnxt_re_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr, 195 195 struct ib_udata *udata); 196 - void bnxt_re_destroy_cq(struct ib_cq *cq, struct ib_udata *udata); 196 + int bnxt_re_destroy_cq(struct ib_cq *cq, struct ib_udata *udata); 197 197 int bnxt_re_poll_cq(struct ib_cq *cq, int num_entries, struct ib_wc *wc); 198 198 int bnxt_re_req_notify_cq(struct ib_cq *cq, enum ib_cq_notify_flags flags); 199 199 struct ib_mr *bnxt_re_get_dma_mr(struct ib_pd *pd, int mr_access_flags);
+2 -1
drivers/infiniband/hw/bnxt_re/main.c
@@ -736,7 +736,8 @@
 	if (ret)
 		return ret;
 
-	return ib_register_device(ibdev, "bnxt_re%d");
+	dma_set_max_seg_size(&rdev->en_dev->pdev->dev, UINT_MAX);
+	return ib_register_device(ibdev, "bnxt_re%d", &rdev->en_dev->pdev->dev);
 }
 
 static void bnxt_re_dev_remove(struct bnxt_re_dev *rdev)
+3 -4
drivers/infiniband/hw/bnxt_re/qplib_fp.c
@@ -295,9 +295,9 @@
 	}
 }
 
-static void bnxt_qplib_service_nq(unsigned long data)
+static void bnxt_qplib_service_nq(struct tasklet_struct *t)
 {
-	struct bnxt_qplib_nq *nq = (struct bnxt_qplib_nq *)data;
+	struct bnxt_qplib_nq *nq = from_tasklet(nq, t, nq_tasklet);
 	struct bnxt_qplib_hwq *hwq = &nq->hwq;
 	int num_srqne_processed = 0;
 	int num_cqne_processed = 0;
@@ -448,8 +448,7 @@
 
 	nq->msix_vec = msix_vector;
 	if (need_init)
-		tasklet_init(&nq->nq_tasklet, bnxt_qplib_service_nq,
-			     (unsigned long)nq);
+		tasklet_setup(&nq->nq_tasklet, bnxt_qplib_service_nq);
 	else
 		tasklet_enable(&nq->nq_tasklet);
 
+5 -6
drivers/infiniband/hw/bnxt_re/qplib_rcfw.c
@@ -50,7 +50,7 @@
 #include "qplib_sp.h"
 #include "qplib_fp.h"
 
-static void bnxt_qplib_service_creq(unsigned long data);
+static void bnxt_qplib_service_creq(struct tasklet_struct *t);
 
 /* Hardware communication channel */
 static int __wait_for_resp(struct bnxt_qplib_rcfw *rcfw, u16 cookie)
@@ -79,7 +79,7 @@
 		goto done;
 	do {
 		mdelay(1); /* 1m sec */
-		bnxt_qplib_service_creq((unsigned long)rcfw);
+		bnxt_qplib_service_creq(&rcfw->creq.creq_tasklet);
 	} while (test_bit(cbit, cmdq->cmdq_bitmap) && --count);
 done:
 	return count ? 0 : -ETIMEDOUT;
@@ -370,9 +370,9 @@
 }
 
 /* SP - CREQ Completion handlers */
-static void bnxt_qplib_service_creq(unsigned long data)
+static void bnxt_qplib_service_creq(struct tasklet_struct *t)
 {
-	struct bnxt_qplib_rcfw *rcfw = (struct bnxt_qplib_rcfw *)data;
+	struct bnxt_qplib_rcfw *rcfw = from_tasklet(rcfw, t, creq.creq_tasklet);
 	struct bnxt_qplib_creq_ctx *creq = &rcfw->creq;
 	u32 type, budget = CREQ_ENTRY_POLL_BUDGET;
 	struct bnxt_qplib_hwq *hwq = &creq->hwq;
@@ -687,8 +687,7 @@
 
 	creq->msix_vec = msix_vector;
 	if (need_init)
-		tasklet_init(&creq->creq_tasklet,
-			     bnxt_qplib_service_creq, (unsigned long)rcfw);
+		tasklet_setup(&creq->creq_tasklet, bnxt_qplib_service_creq);
 	else
 		tasklet_enable(&creq->creq_tasklet);
 	rc = request_irq(creq->msix_vec, bnxt_qplib_creq_irq, 0,
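The `tasklet_setup()` conversions in the two bnxt_re files replace an `unsigned long data` cookie with a pointer to the tasklet itself, and the handler recovers its owning structure with `from_tasklet()`. The mechanism is the usual container-of pointer arithmetic on an embedded member. A minimal user-space sketch follows; the `demo_*` types and `demo_container_of()` are hypothetical stand-ins, not the kernel's `struct tasklet_struct` or `container_of()`.

```c
#include <assert.h>
#include <stddef.h>

/* User-space stand-in for the kernel's container_of(). */
#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical miniature of a tasklet: just a typed callback. */
struct demo_tasklet {
	void (*callback)(struct demo_tasklet *t);
};

struct demo_nq {
	int serviced;
	struct demo_tasklet nq_tasklet;	/* embedded, not pointed to */
};

/* Registration no longer needs a separate data argument. */
static void demo_tasklet_setup(struct demo_tasklet *t,
			       void (*cb)(struct demo_tasklet *))
{
	t->callback = cb;
}

/* The handler walks back from the embedded member to its owner. */
static void demo_service_nq(struct demo_tasklet *t)
{
	struct demo_nq *nq = demo_container_of(t, struct demo_nq, nq_tasklet);

	nq->serviced++;
}
```

The same handler can also be called directly with the address of the embedded member, which mirrors how `__wait_for_resp()` now passes `&rcfw->creq.creq_tasklet` when it polls.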
+17 -13
drivers/infiniband/hw/bnxt_re/qplib_res.c
··· 45 45 #include <linux/dma-mapping.h> 46 46 #include <linux/if_vlan.h> 47 47 #include <linux/vmalloc.h> 48 + #include <rdma/ib_verbs.h> 49 + #include <rdma/ib_umem.h> 50 + 48 51 #include "roce_hsi.h" 49 52 #include "qplib_res.h" 50 53 #include "qplib_sp.h" ··· 90 87 static void bnxt_qplib_fill_user_dma_pages(struct bnxt_qplib_pbl *pbl, 91 88 struct bnxt_qplib_sg_info *sginfo) 92 89 { 93 - struct scatterlist *sghead = sginfo->sghead; 94 - struct sg_dma_page_iter sg_iter; 90 + struct ib_block_iter biter; 95 91 int i = 0; 96 92 97 - for_each_sg_dma_page(sghead, &sg_iter, sginfo->nmap, 0) { 98 - pbl->pg_map_arr[i] = sg_page_iter_dma_address(&sg_iter); 93 + rdma_umem_for_each_dma_block(sginfo->umem, &biter, sginfo->pgsize) { 94 + pbl->pg_map_arr[i] = rdma_block_iter_dma_address(&biter); 99 95 pbl->pg_arr[i] = NULL; 100 96 pbl->pg_count++; 101 97 i++; ··· 106 104 struct bnxt_qplib_sg_info *sginfo) 107 105 { 108 106 struct pci_dev *pdev = res->pdev; 109 - struct scatterlist *sghead; 110 107 bool is_umem = false; 111 108 u32 pages; 112 109 int i; 113 110 114 111 if (sginfo->nopte) 115 112 return 0; 116 - pages = sginfo->npages; 117 - sghead = sginfo->sghead; 113 + if (sginfo->umem) 114 + pages = ib_umem_num_dma_blocks(sginfo->umem, sginfo->pgsize); 115 + else 116 + pages = sginfo->npages; 118 117 /* page ptr arrays */ 119 118 pbl->pg_arr = vmalloc(pages * sizeof(void *)); 120 119 if (!pbl->pg_arr) ··· 130 127 pbl->pg_count = 0; 131 128 pbl->pg_size = sginfo->pgsize; 132 129 133 - if (!sghead) { 130 + if (!sginfo->umem) { 134 131 for (i = 0; i < pages; i++) { 135 132 pbl->pg_arr[i] = dma_alloc_coherent(&pdev->dev, 136 133 pbl->pg_size, ··· 186 183 struct bnxt_qplib_sg_info sginfo = {}; 187 184 u32 depth, stride, npbl, npde; 188 185 dma_addr_t *src_phys_ptr, **dst_virt_ptr; 189 - struct scatterlist *sghead = NULL; 190 186 struct bnxt_qplib_res *res; 191 187 struct pci_dev *pdev; 192 188 int i, rc, lvl; 193 189 194 190 res = hwq_attr->res; 195 191 pdev = res->pdev; 196 - 
sghead = hwq_attr->sginfo->sghead; 197 192 pg_size = hwq_attr->sginfo->pgsize; 198 193 hwq->level = PBL_LVL_MAX; 199 194 ··· 205 204 aux_pages++; 206 205 } 207 206 208 - if (!sghead) { 207 + if (!hwq_attr->sginfo->umem) { 209 208 hwq->is_user = false; 210 209 npages = (depth * stride) / pg_size + aux_pages; 211 210 if ((depth * stride) % pg_size) ··· 214 213 return -EINVAL; 215 214 hwq_attr->sginfo->npages = npages; 216 215 } else { 216 + unsigned long sginfo_num_pages = ib_umem_num_dma_blocks( 217 + hwq_attr->sginfo->umem, hwq_attr->sginfo->pgsize); 218 + 217 219 hwq->is_user = true; 218 - npages = hwq_attr->sginfo->npages; 220 + npages = sginfo_num_pages; 219 221 npages = (npages * PAGE_SIZE) / 220 222 BIT_ULL(hwq_attr->sginfo->pgshft); 221 - if ((hwq_attr->sginfo->npages * PAGE_SIZE) % 223 + if ((sginfo_num_pages * PAGE_SIZE) % 222 224 BIT_ULL(hwq_attr->sginfo->pgshft)) 223 225 if (!npages) 224 226 npages++;
+1 -2
drivers/infiniband/hw/bnxt_re/qplib_res.h
@@ -126,8 +126,7 @@
 };
 
 struct bnxt_qplib_sg_info {
-	struct scatterlist		*sghead;
-	u32				nmap;
+	struct ib_umem			*umem;
 	u32				npages;
 	u32				pgshft;
 	u32				pgsize;
+2 -2
drivers/infiniband/hw/cxgb4/cm.c
@@ -77,9 +77,9 @@
 module_param(enable_ecn, int, 0644);
 MODULE_PARM_DESC(enable_ecn, "Enable ECN (default=0/disabled)");
 
-static int dack_mode = 1;
+static int dack_mode;
 module_param(dack_mode, int, 0644);
-MODULE_PARM_DESC(dack_mode, "Delayed ack mode (default=1)");
+MODULE_PARM_DESC(dack_mode, "Delayed ack mode (default=0)");
 
 uint c4iw_max_read_depth = 32;
 module_param(c4iw_max_read_depth, int, 0644);
+2 -1
drivers/infiniband/hw/cxgb4/cq.c
@@ -967,7 +967,7 @@
 	return !err || err == -ENODATA ? npolled : err;
 }
 
-void c4iw_destroy_cq(struct ib_cq *ib_cq, struct ib_udata *udata)
+int c4iw_destroy_cq(struct ib_cq *ib_cq, struct ib_udata *udata)
 {
 	struct c4iw_cq *chp;
 	struct c4iw_ucontext *ucontext;
@@ -985,6 +985,7 @@
 		   ucontext ? &ucontext->uctx : &chp->cq.rdev->uctx,
 		   chp->destroy_skb, chp->wr_waitp);
 	c4iw_put_wr_wait(chp->wr_waitp);
+	return 0;
 }
 
 int c4iw_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
+3 -4
drivers/infiniband/hw/cxgb4/iw_cxgb4.h
··· 985 985 unsigned int *sg_offset); 986 986 int c4iw_dealloc_mw(struct ib_mw *mw); 987 987 void c4iw_dealloc(struct uld_ctx *ctx); 988 - struct ib_mw *c4iw_alloc_mw(struct ib_pd *pd, enum ib_mw_type type, 989 - struct ib_udata *udata); 988 + int c4iw_alloc_mw(struct ib_mw *mw, struct ib_udata *udata); 990 989 struct ib_mr *c4iw_reg_user_mr(struct ib_pd *pd, u64 start, 991 990 u64 length, u64 virt, int acc, 992 991 struct ib_udata *udata); 993 992 struct ib_mr *c4iw_get_dma_mr(struct ib_pd *pd, int acc); 994 993 int c4iw_dereg_mr(struct ib_mr *ib_mr, struct ib_udata *udata); 995 - void c4iw_destroy_cq(struct ib_cq *ib_cq, struct ib_udata *udata); 994 + int c4iw_destroy_cq(struct ib_cq *ib_cq, struct ib_udata *udata); 996 995 int c4iw_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr, 997 996 struct ib_udata *udata); 998 997 int c4iw_arm_cq(struct ib_cq *ibcq, enum ib_cq_notify_flags flags); 999 998 int c4iw_modify_srq(struct ib_srq *ib_srq, struct ib_srq_attr *attr, 1000 999 enum ib_srq_attr_mask srq_attr_mask, 1001 1000 struct ib_udata *udata); 1002 - void c4iw_destroy_srq(struct ib_srq *ib_srq, struct ib_udata *udata); 1001 + int c4iw_destroy_srq(struct ib_srq *ib_srq, struct ib_udata *udata); 1003 1002 int c4iw_create_srq(struct ib_srq *srq, struct ib_srq_init_attr *attrs, 1004 1003 struct ib_udata *udata); 1005 1004 int c4iw_destroy_qp(struct ib_qp *ib_qp, struct ib_udata *udata);
+15 -25
drivers/infiniband/hw/cxgb4/mem.c
··· 510 510 __be64 *pages; 511 511 int shift, n, i; 512 512 int err = -ENOMEM; 513 - struct sg_dma_page_iter sg_iter; 513 + struct ib_block_iter biter; 514 514 struct c4iw_dev *rhp; 515 515 struct c4iw_pd *php; 516 516 struct c4iw_mr *mhp; ··· 548 548 549 549 shift = PAGE_SHIFT; 550 550 551 - n = ib_umem_num_pages(mhp->umem); 551 + n = ib_umem_num_dma_blocks(mhp->umem, 1 << shift); 552 552 err = alloc_pbl(mhp, n); 553 553 if (err) 554 554 goto err_umem_release; ··· 561 561 562 562 i = n = 0; 563 563 564 - for_each_sg_dma_page(mhp->umem->sg_head.sgl, &sg_iter, mhp->umem->nmap, 0) { 565 - pages[i++] = cpu_to_be64(sg_page_iter_dma_address(&sg_iter)); 564 + rdma_umem_for_each_dma_block(mhp->umem, &biter, 1 << shift) { 565 + pages[i++] = cpu_to_be64(rdma_block_iter_dma_address(&biter)); 566 566 if (i == PAGE_SIZE / sizeof(*pages)) { 567 567 err = write_pbl(&mhp->rhp->rdev, pages, 568 568 mhp->attr.pbl_addr + (n << 3), i, ··· 611 611 return ERR_PTR(err); 612 612 } 613 613 614 - struct ib_mw *c4iw_alloc_mw(struct ib_pd *pd, enum ib_mw_type type, 615 - struct ib_udata *udata) 614 + int c4iw_alloc_mw(struct ib_mw *ibmw, struct ib_udata *udata) 616 615 { 616 + struct c4iw_mw *mhp = to_c4iw_mw(ibmw); 617 617 struct c4iw_dev *rhp; 618 618 struct c4iw_pd *php; 619 - struct c4iw_mw *mhp; 620 619 u32 mmid; 621 620 u32 stag = 0; 622 621 int ret; 623 622 624 - if (type != IB_MW_TYPE_1) 625 - return ERR_PTR(-EINVAL); 623 + if (ibmw->type != IB_MW_TYPE_1) 624 + return -EINVAL; 626 625 627 - php = to_c4iw_pd(pd); 626 + php = to_c4iw_pd(ibmw->pd); 628 627 rhp = php->rhp; 629 - mhp = kzalloc(sizeof(*mhp), GFP_KERNEL); 630 - if (!mhp) 631 - return ERR_PTR(-ENOMEM); 632 - 633 628 mhp->wr_waitp = c4iw_alloc_wr_wait(GFP_KERNEL); 634 - if (!mhp->wr_waitp) { 635 - ret = -ENOMEM; 636 - goto free_mhp; 637 - } 629 + if (!mhp->wr_waitp) 630 + return -ENOMEM; 638 631 639 632 mhp->dereg_skb = alloc_skb(SGE_MAX_WR_LEN, GFP_KERNEL); 640 633 if (!mhp->dereg_skb) { ··· 638 645 ret = 
allocate_window(&rhp->rdev, &stag, php->pdid, mhp->wr_waitp); 639 646 if (ret) 640 647 goto free_skb; 648 + 641 649 mhp->rhp = rhp; 642 650 mhp->attr.pdid = php->pdid; 643 651 mhp->attr.type = FW_RI_STAG_MW; 644 652 mhp->attr.stag = stag; 645 653 mmid = (stag) >> 8; 646 - mhp->ibmw.rkey = stag; 654 + ibmw->rkey = stag; 647 655 if (xa_insert_irq(&rhp->mrs, mmid, mhp, GFP_KERNEL)) { 648 656 ret = -ENOMEM; 649 657 goto dealloc_win; 650 658 } 651 659 pr_debug("mmid 0x%x mhp %p stag 0x%x\n", mmid, mhp, stag); 652 - return &(mhp->ibmw); 660 + return 0; 653 661 654 662 dealloc_win: 655 663 deallocate_window(&rhp->rdev, mhp->attr.stag, mhp->dereg_skb, ··· 659 665 kfree_skb(mhp->dereg_skb); 660 666 free_wr_wait: 661 667 c4iw_put_wr_wait(mhp->wr_waitp); 662 - free_mhp: 663 - kfree(mhp); 664 - return ERR_PTR(ret); 668 + return ret; 665 669 } 666 670 667 671 int c4iw_dealloc_mw(struct ib_mw *mw) ··· 676 684 mhp->wr_waitp); 677 685 kfree_skb(mhp->dereg_skb); 678 686 c4iw_put_wr_wait(mhp->wr_waitp); 679 - pr_debug("ib_mw %p mmid 0x%x ptr %p\n", mw, mmid, mhp); 680 - kfree(mhp); 681 687 return 0; 682 688 } 683 689
+8 -3
drivers/infiniband/hw/cxgb4/provider.c
··· 190 190 return ret; 191 191 } 192 192 193 - static void c4iw_deallocate_pd(struct ib_pd *pd, struct ib_udata *udata) 193 + static int c4iw_deallocate_pd(struct ib_pd *pd, struct ib_udata *udata) 194 194 { 195 195 struct c4iw_dev *rhp; 196 196 struct c4iw_pd *php; ··· 202 202 mutex_lock(&rhp->rdev.stats.lock); 203 203 rhp->rdev.stats.pd.cur--; 204 204 mutex_unlock(&rhp->rdev.stats.lock); 205 + return 0; 205 206 } 206 207 207 208 static int c4iw_allocate_pd(struct ib_pd *pd, struct ib_udata *udata) ··· 498 497 .query_qp = c4iw_ib_query_qp, 499 498 .reg_user_mr = c4iw_reg_user_mr, 500 499 .req_notify_cq = c4iw_arm_cq, 501 - INIT_RDMA_OBJ_SIZE(ib_pd, c4iw_pd, ibpd), 500 + 502 501 INIT_RDMA_OBJ_SIZE(ib_cq, c4iw_cq, ibcq), 502 + INIT_RDMA_OBJ_SIZE(ib_mw, c4iw_mw, ibmw), 503 + INIT_RDMA_OBJ_SIZE(ib_pd, c4iw_pd, ibpd), 503 504 INIT_RDMA_OBJ_SIZE(ib_srq, c4iw_srq, ibsrq), 504 505 INIT_RDMA_OBJ_SIZE(ib_ucontext, c4iw_ucontext, ibucontext), 505 506 }; ··· 570 567 ret = set_netdevs(&dev->ibdev, &dev->rdev); 571 568 if (ret) 572 569 goto err_dealloc_ctx; 573 - ret = ib_register_device(&dev->ibdev, "cxgb4_%d"); 570 + dma_set_max_seg_size(&dev->rdev.lldi.pdev->dev, UINT_MAX); 571 + ret = ib_register_device(&dev->ibdev, "cxgb4_%d", 572 + &dev->rdev.lldi.pdev->dev); 574 573 if (ret) 575 574 goto err_dealloc_ctx; 576 575 return;
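The cxgb4 memory-window change moves allocation out of the driver: the driver declares its object size with `INIT_RDMA_OBJ_SIZE(ib_mw, c4iw_mw, ibmw)`, and `c4iw_alloc_mw()` now receives an already-allocated `struct ib_mw` and returns an `int`. A minimal user-space sketch of that division of labour follows. All `demo_*` names are hypothetical, `calloc`/`free` stand in for the kernel allocators, and the requirement that the embedded core object sit at offset 0 is stated here as an assumption of the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

struct demo_mw {			/* core-visible object */
	int type;
	unsigned int rkey;
};

struct demo_drv_mw {			/* driver object embedding the core one */
	struct demo_mw ibmw;		/* assumed to be the first member */
	unsigned int stag;
};

/* The core allocates and frees through the embedded pointer. */
_Static_assert(offsetof(struct demo_drv_mw, ibmw) == 0,
	       "ibmw must be the first member");

struct demo_dev_ops {
	size_t size_mw;			/* driver struct size, declared to the core */
	int (*alloc_mw)(struct demo_mw *mw);
};

/* Driver side: initialise only, no allocation and no ERR_PTR returns. */
static int demo_drv_alloc_mw(struct demo_mw *mw)
{
	struct demo_drv_mw *mhp = (struct demo_drv_mw *)mw;

	if (mw->type != 1)
		return -EINVAL;
	mhp->stag = 0x1200;
	mw->rkey = mhp->stag;
	return 0;
}

static const struct demo_dev_ops demo_ops = {
	.size_mw = sizeof(struct demo_drv_mw),
	.alloc_mw = demo_drv_alloc_mw,
};

/* Core side: allocate, let the driver initialise, free on failure. */
static int demo_core_alloc_mw(const struct demo_dev_ops *ops, int type,
			      struct demo_mw **out)
{
	struct demo_mw *mw = calloc(1, ops->size_mw);
	int ret;

	if (!mw)
		return -ENOMEM;
	mw->type = type;
	ret = ops->alloc_mw(mw);
	if (ret) {
		free(mw);
		return ret;
	}
	*out = mw;
	return 0;
}
```

Because the core owns the allocation, the driver's error paths shrink to plain `return ret;`, which matches the removal of the `free_mhp` label and the `kfree(mhp)` calls in the mem.c hunks.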
+2 -1
drivers/infiniband/hw/cxgb4/qp.c
···
 	return ret;
 }

-void c4iw_destroy_srq(struct ib_srq *ibsrq, struct ib_udata *udata)
+int c4iw_destroy_srq(struct ib_srq *ibsrq, struct ib_udata *udata)
 {
 	struct c4iw_dev *rhp;
 	struct c4iw_srq *srq;
···
 		       srq->wr_waitp);
 	c4iw_free_srq_idx(&rhp->rdev, srq->idx);
 	c4iw_put_wr_wait(srq->wr_waitp);
+	return 0;
 }
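The void-to-int conversion of the destroy callbacks above follows from the pull message's note that 'destroy' operations may now fail at the driver level. A minimal userspace sketch of that calling convention is below; `demo_srq`, `demo_destroy_srq` and `demo_release` are hypothetical names for illustration, not kernel APIs, and the "busy hardware" condition is an assumption.

```c
#include <errno.h>

/* Hypothetical object standing in for a driver resource. */
struct demo_srq {
	int hw_busy;	/* non-zero: the pretend hardware refuses teardown */
	int destroyed;
};

/* Destroy callback that can fail, mirroring the void -> int conversion. */
static int demo_destroy_srq(struct demo_srq *srq)
{
	if (srq->hw_busy)
		return -EBUSY;	/* object stays intact so the caller may retry */
	srq->destroyed = 1;
	return 0;
}

/* Caller: drops its own bookkeeping only once destroy has succeeded. */
static int demo_release(struct demo_srq *srq, int *live_count)
{
	int ret = demo_destroy_srq(srq);

	if (ret)
		return ret;
	(*live_count)--;
	return 0;
}
```

Drivers whose teardown cannot fail, such as `c4iw_destroy_srq()` in this hunk, simply return 0 unconditionally.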
+5 -9
drivers/infiniband/hw/efa/efa.h
···
 	char name[EFA_IRQNAME_SIZE];
 };

-struct efa_sw_stats {
+/* Don't use anything other than atomic64 */
+struct efa_stats {
 	atomic64_t alloc_pd_err;
 	atomic64_t create_qp_err;
 	atomic64_t create_cq_err;
···
 	atomic64_t alloc_ucontext_err;
 	atomic64_t create_ah_err;
 	atomic64_t mmap_err;
-};
-
-/* Don't use anything other than atomic64 */
-struct efa_stats {
-	struct efa_sw_stats sw_stats;
 	atomic64_t keep_alive_rcvd;
 };

···
 int efa_query_pkey(struct ib_device *ibdev, u8 port, u16 index,
 		   u16 *pkey);
 int efa_alloc_pd(struct ib_pd *ibpd, struct ib_udata *udata);
-void efa_dealloc_pd(struct ib_pd *ibpd, struct ib_udata *udata);
+int efa_dealloc_pd(struct ib_pd *ibpd, struct ib_udata *udata);
 int efa_destroy_qp(struct ib_qp *ibqp, struct ib_udata *udata);
 struct ib_qp *efa_create_qp(struct ib_pd *ibpd,
 			    struct ib_qp_init_attr *init_attr,
 			    struct ib_udata *udata);
-void efa_destroy_cq(struct ib_cq *ibcq, struct ib_udata *udata);
+int efa_destroy_cq(struct ib_cq *ibcq, struct ib_udata *udata);
 int efa_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
 		  struct ib_udata *udata);
 struct ib_mr *efa_reg_mr(struct ib_pd *ibpd, u64 start, u64 length,
···
 int efa_create_ah(struct ib_ah *ibah,
 		  struct rdma_ah_init_attr *init_attr,
 		  struct ib_udata *udata);
-void efa_destroy_ah(struct ib_ah *ibah, u32 flags);
+int efa_destroy_ah(struct ib_ah *ibah, u32 flags);
 int efa_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *qp_attr,
 		  int qp_attr_mask, struct ib_udata *udata);
 enum rdma_link_layer efa_port_link_layer(struct ib_device *ibdev,
+53 -16
drivers/infiniband/hw/efa/efa_admin_cmds_defs.h
··· 61 61 62 62 enum efa_admin_get_stats_type { 63 63 EFA_ADMIN_GET_STATS_TYPE_BASIC = 0, 64 + EFA_ADMIN_GET_STATS_TYPE_MESSAGES = 1, 65 + EFA_ADMIN_GET_STATS_TYPE_RDMA_READ = 2, 64 66 }; 65 67 66 68 enum efa_admin_get_stats_scope { 67 69 EFA_ADMIN_GET_STATS_SCOPE_ALL = 0, 68 70 EFA_ADMIN_GET_STATS_SCOPE_QUEUE = 1, 69 - }; 70 - 71 - enum efa_admin_modify_qp_mask_bits { 72 - EFA_ADMIN_QP_STATE_BIT = 0, 73 - EFA_ADMIN_CUR_QP_STATE_BIT = 1, 74 - EFA_ADMIN_QKEY_BIT = 2, 75 - EFA_ADMIN_SQ_PSN_BIT = 3, 76 - EFA_ADMIN_SQ_DRAINED_ASYNC_NOTIFY_BIT = 4, 77 71 }; 78 72 79 73 /* ··· 193 199 struct efa_admin_aq_common_desc aq_common_desc; 194 200 195 201 /* 196 - * Mask indicating which fields should be updated see enum 197 - * efa_admin_modify_qp_mask_bits 202 + * Mask indicating which fields should be updated 203 + * 0 : qp_state 204 + * 1 : cur_qp_state 205 + * 2 : qkey 206 + * 3 : sq_psn 207 + * 4 : sq_drained_async_notify 208 + * 5 : rnr_retry 209 + * 31:6 : reserved 198 210 */ 199 211 u32 modify_mask; 200 212 ··· 222 222 /* Enable async notification when SQ is drained */ 223 223 u8 sq_drained_async_notify; 224 224 225 - /* MBZ */ 226 - u8 reserved1; 225 + /* Number of RNR retries (valid only for SRD QPs) */ 226 + u8 rnr_retry; 227 227 228 228 /* MBZ */ 229 229 u16 reserved2; ··· 258 258 /* Indicates that draining is in progress */ 259 259 u8 sq_draining; 260 260 261 - /* MBZ */ 262 - u8 reserved1; 261 + /* Number of RNR retries (valid only for SRD QPs) */ 262 + u8 rnr_retry; 263 263 264 264 /* MBZ */ 265 265 u16 reserved2; ··· 530 530 u64 rx_drops; 531 531 }; 532 532 533 + struct efa_admin_messages_stats { 534 + u64 send_bytes; 535 + 536 + u64 send_wrs; 537 + 538 + u64 recv_bytes; 539 + 540 + u64 recv_wrs; 541 + }; 542 + 543 + struct efa_admin_rdma_read_stats { 544 + u64 read_wrs; 545 + 546 + u64 read_bytes; 547 + 548 + u64 read_wr_err; 549 + 550 + u64 read_resp_bytes; 551 + }; 552 + 533 553 struct efa_admin_acq_get_stats_resp { 534 554 struct efa_admin_acq_common_desc 
acq_common_desc; 535 555 536 - struct efa_admin_basic_stats basic_stats; 556 + union { 557 + struct efa_admin_basic_stats basic_stats; 558 + 559 + struct efa_admin_messages_stats messages_stats; 560 + 561 + struct efa_admin_rdma_read_stats rdma_read_stats; 562 + } u; 537 563 }; 538 564 539 565 struct efa_admin_get_set_feature_common_desc { ··· 602 576 /* 603 577 * 0 : rdma_read - If set, RDMA Read is supported on 604 578 * TX queues 605 - * 31:1 : reserved - MBZ 579 + * 1 : rnr_retry - If set, RNR retry is supported on 580 + * modify QP command 581 + * 31:2 : reserved - MBZ 606 582 */ 607 583 u32 device_caps; 608 584 ··· 890 862 #define EFA_ADMIN_CREATE_QP_CMD_SQ_VIRT_MASK BIT(0) 891 863 #define EFA_ADMIN_CREATE_QP_CMD_RQ_VIRT_MASK BIT(1) 892 864 865 + /* modify_qp_cmd */ 866 + #define EFA_ADMIN_MODIFY_QP_CMD_QP_STATE_MASK BIT(0) 867 + #define EFA_ADMIN_MODIFY_QP_CMD_CUR_QP_STATE_MASK BIT(1) 868 + #define EFA_ADMIN_MODIFY_QP_CMD_QKEY_MASK BIT(2) 869 + #define EFA_ADMIN_MODIFY_QP_CMD_SQ_PSN_MASK BIT(3) 870 + #define EFA_ADMIN_MODIFY_QP_CMD_SQ_DRAINED_ASYNC_NOTIFY_MASK BIT(4) 871 + #define EFA_ADMIN_MODIFY_QP_CMD_RNR_RETRY_MASK BIT(5) 872 + 893 873 /* reg_mr_cmd */ 894 874 #define EFA_ADMIN_REG_MR_CMD_PHYS_PAGE_SIZE_SHIFT_MASK GENMASK(4, 0) 895 875 #define EFA_ADMIN_REG_MR_CMD_MEM_ADDR_PHY_MODE_EN_MASK BIT(7) ··· 914 878 915 879 /* feature_device_attr_desc */ 916 880 #define EFA_ADMIN_FEATURE_DEVICE_ATTR_DESC_RDMA_READ_MASK BIT(0) 881 + #define EFA_ADMIN_FEATURE_DEVICE_ATTR_DESC_RNR_RETRY_MASK BIT(1) 917 882 918 883 /* host_info */ 919 884 #define EFA_ADMIN_HOST_INFO_DRIVER_MODULE_TYPE_MASK GENMASK(7, 0)
+23 -5
drivers/infiniband/hw/efa/efa_com_cmd.c
···
 	cmd.qkey = params->qkey;
 	cmd.sq_psn = params->sq_psn;
 	cmd.sq_drained_async_notify = params->sq_drained_async_notify;
+	cmd.rnr_retry = params->rnr_retry;

 	err = efa_com_cmd_exec(aq,
 			       (struct efa_admin_aq_entry *)&cmd,
···
 	result->qkey = resp.qkey;
 	result->sq_draining = resp.sq_draining;
 	result->sq_psn = resp.sq_psn;
+	result->rnr_retry = resp.rnr_retry;

 	return 0;
 }
···
 		return err;
 	}

-	result->basic_stats.tx_bytes = resp.basic_stats.tx_bytes;
-	result->basic_stats.tx_pkts = resp.basic_stats.tx_pkts;
-	result->basic_stats.rx_bytes = resp.basic_stats.rx_bytes;
-	result->basic_stats.rx_pkts = resp.basic_stats.rx_pkts;
-	result->basic_stats.rx_drops = resp.basic_stats.rx_drops;
+	switch (cmd.type) {
+	case EFA_ADMIN_GET_STATS_TYPE_BASIC:
+		result->basic_stats.tx_bytes = resp.u.basic_stats.tx_bytes;
+		result->basic_stats.tx_pkts = resp.u.basic_stats.tx_pkts;
+		result->basic_stats.rx_bytes = resp.u.basic_stats.rx_bytes;
+		result->basic_stats.rx_pkts = resp.u.basic_stats.rx_pkts;
+		result->basic_stats.rx_drops = resp.u.basic_stats.rx_drops;
+		break;
+	case EFA_ADMIN_GET_STATS_TYPE_MESSAGES:
+		result->messages_stats.send_bytes = resp.u.messages_stats.send_bytes;
+		result->messages_stats.send_wrs = resp.u.messages_stats.send_wrs;
+		result->messages_stats.recv_bytes = resp.u.messages_stats.recv_bytes;
+		result->messages_stats.recv_wrs = resp.u.messages_stats.recv_wrs;
+		break;
+	case EFA_ADMIN_GET_STATS_TYPE_RDMA_READ:
+		result->rdma_read_stats.read_wrs = resp.u.rdma_read_stats.read_wrs;
+		result->rdma_read_stats.read_bytes = resp.u.rdma_read_stats.read_bytes;
+		result->rdma_read_stats.read_wr_err = resp.u.rdma_read_stats.read_wr_err;
+		result->rdma_read_stats.read_resp_bytes = resp.u.rdma_read_stats.read_resp_bytes;
+		break;
+	}

 	return 0;
 }
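The get-stats change above turns the response into a union and selects the member to copy by the request type. A reduced userspace sketch of that pattern follows; every `demo_*` name is hypothetical, only two stat types are modelled, and the `default` branch returning -1 is an addition for the sketch (the hunk above has no default case).

```c
#include <stdint.h>

enum demo_stats_type {
	DEMO_STATS_BASIC = 0,
	DEMO_STATS_MESSAGES = 1,
};

struct demo_basic_stats {
	uint64_t tx_bytes;
	uint64_t rx_bytes;
};

struct demo_messages_stats {
	uint64_t send_wrs;
	uint64_t recv_wrs;
};

/* Device response: one buffer, interpreted according to the request type. */
struct demo_resp {
	union {
		struct demo_basic_stats basic_stats;
		struct demo_messages_stats messages_stats;
	} u;
};

/* Caller-visible result, also a union keyed by the same type. */
union demo_result {
	struct demo_basic_stats basic_stats;
	struct demo_messages_stats messages_stats;
};

/* Copy only the member that matches the requested type. */
static int demo_decode(enum demo_stats_type type, const struct demo_resp *resp,
		       union demo_result *result)
{
	switch (type) {
	case DEMO_STATS_BASIC:
		result->basic_stats.tx_bytes = resp->u.basic_stats.tx_bytes;
		result->basic_stats.rx_bytes = resp->u.basic_stats.rx_bytes;
		break;
	case DEMO_STATS_MESSAGES:
		result->messages_stats.send_wrs = resp->u.messages_stats.send_wrs;
		result->messages_stats.recv_wrs = resp->u.messages_stats.recv_wrs;
		break;
	default:
		return -1;	/* sketch-only: reject unknown types */
	}
	return 0;
}
```

Because the result is a union, the caller has to read back the same member it requested, which is why `efa_get_hw_stats()` later in this merge issues one query per type and consumes each result before the next call.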
+18
drivers/infiniband/hw/efa/efa_com_cmd.h
···
 	u32 qkey;
 	u32 sq_psn;
 	u8 sq_drained_async_notify;
+	u8 rnr_retry;
 };

 struct efa_com_query_qp_params {
···
 	u32 qkey;
 	u32 sq_draining;
 	u32 sq_psn;
+	u8 rnr_retry;
 };

 struct efa_com_destroy_qp_params {
···
 	u64 rx_drops;
 };

+struct efa_com_messages_stats {
+	u64 send_bytes;
+	u64 send_wrs;
+	u64 recv_bytes;
+	u64 recv_wrs;
+};
+
+struct efa_com_rdma_read_stats {
+	u64 read_wrs;
+	u64 read_bytes;
+	u64 read_wr_err;
+	u64 read_resp_bytes;
+};
+
 union efa_com_get_stats_result {
 	struct efa_com_basic_stats basic_stats;
+	struct efa_com_messages_stats messages_stats;
+	struct efa_com_rdma_read_stats rdma_read_stats;
 };

 void efa_com_set_dma_addr(dma_addr_t addr, u32 *addr_high, u32 *addr_low);
+2 -2
drivers/infiniband/hw/efa/efa_main.c
···

 	ib_set_device_ops(&dev->ibdev, &efa_dev_ops);

-	err = ib_register_device(&dev->ibdev, "efa_%d");
+	err = ib_register_device(&dev->ibdev, "efa_%d", &pdev->dev);
 	if (err)
 		goto err_release_doorbell_bar;

···
 			err);
 		return err;
 	}
-
+	dma_set_max_seg_size(&pdev->dev, UINT_MAX);
 	return 0;
 }

+210 -48
drivers/infiniband/hw/efa/efa_verbs.c
··· 4 4 */ 5 5 6 6 #include <linux/vmalloc.h> 7 + #include <linux/log2.h> 7 8 8 9 #include <rdma/ib_addr.h> 9 10 #include <rdma/ib_umem.h> ··· 36 35 op(EFA_RX_BYTES, "rx_bytes") \ 37 36 op(EFA_RX_PKTS, "rx_pkts") \ 38 37 op(EFA_RX_DROPS, "rx_drops") \ 38 + op(EFA_SEND_BYTES, "send_bytes") \ 39 + op(EFA_SEND_WRS, "send_wrs") \ 40 + op(EFA_RECV_BYTES, "recv_bytes") \ 41 + op(EFA_RECV_WRS, "recv_wrs") \ 42 + op(EFA_RDMA_READ_WRS, "rdma_read_wrs") \ 43 + op(EFA_RDMA_READ_BYTES, "rdma_read_bytes") \ 44 + op(EFA_RDMA_READ_WR_ERR, "rdma_read_wr_err") \ 45 + op(EFA_RDMA_READ_RESP_BYTES, "rdma_read_resp_bytes") \ 39 46 op(EFA_SUBMITTED_CMDS, "submitted_cmds") \ 40 47 op(EFA_COMPLETED_CMDS, "completed_cmds") \ 41 48 op(EFA_CMDS_ERR, "cmds_err") \ ··· 151 142 return container_of(rdma_entry, struct efa_user_mmap_entry, rdma_entry); 152 143 } 153 144 154 - static inline bool is_rdma_read_cap(struct efa_dev *dev) 155 - { 156 - return dev->dev_attr.device_caps & EFA_ADMIN_FEATURE_DEVICE_ATTR_DESC_RDMA_READ_MASK; 157 - } 145 + #define EFA_DEV_CAP(dev, cap) \ 146 + ((dev)->dev_attr.device_caps & \ 147 + EFA_ADMIN_FEATURE_DEVICE_ATTR_DESC_##cap##_MASK) 158 148 159 149 #define is_reserved_cleared(reserved) \ 160 150 !memchr_inv(reserved, 0, sizeof(reserved)) ··· 229 221 resp.max_rq_wr = dev_attr->max_rq_depth; 230 222 resp.max_rdma_size = dev_attr->max_rdma_size; 231 223 232 - if (is_rdma_read_cap(dev)) 224 + if (EFA_DEV_CAP(dev, RDMA_READ)) 233 225 resp.device_caps |= EFA_QUERY_DEVICE_CAPS_RDMA_READ; 226 + 227 + if (EFA_DEV_CAP(dev, RNR_RETRY)) 228 + resp.device_caps |= EFA_QUERY_DEVICE_CAPS_RNR_RETRY; 234 229 235 230 err = ib_copy_to_udata(udata, &resp, 236 231 min(sizeof(resp), udata->outlen)); ··· 280 269 281 270 #define EFA_QUERY_QP_SUPP_MASK \ 282 271 (IB_QP_STATE | IB_QP_PKEY_INDEX | IB_QP_PORT | \ 283 - IB_QP_QKEY | IB_QP_SQ_PSN | IB_QP_CAP) 272 + IB_QP_QKEY | IB_QP_SQ_PSN | IB_QP_CAP | IB_QP_RNR_RETRY) 284 273 285 274 if (qp_attr_mask & ~EFA_QUERY_QP_SUPP_MASK) { 286 275 
ibdev_dbg(&dev->ibdev, ··· 302 291 qp_attr->sq_psn = result.sq_psn; 303 292 qp_attr->sq_draining = result.sq_draining; 304 293 qp_attr->port_num = 1; 294 + qp_attr->rnr_retry = result.rnr_retry; 305 295 306 296 qp_attr->cap.max_send_wr = qp->max_send_wr; 307 297 qp_attr->cap.max_recv_wr = qp->max_recv_wr; ··· 388 376 err_dealloc_pd: 389 377 efa_pd_dealloc(dev, result.pdn); 390 378 err_out: 391 - atomic64_inc(&dev->stats.sw_stats.alloc_pd_err); 379 + atomic64_inc(&dev->stats.alloc_pd_err); 392 380 return err; 393 381 } 394 382 395 - void efa_dealloc_pd(struct ib_pd *ibpd, struct ib_udata *udata) 383 + int efa_dealloc_pd(struct ib_pd *ibpd, struct ib_udata *udata) 396 384 { 397 385 struct efa_dev *dev = to_edev(ibpd->device); 398 386 struct efa_pd *pd = to_epd(ibpd); 399 387 400 388 ibdev_dbg(&dev->ibdev, "Dealloc pd[%d]\n", pd->pdn); 401 389 efa_pd_dealloc(dev, pd->pdn); 390 + return 0; 402 391 } 403 392 404 393 static int efa_destroy_qp_handle(struct efa_dev *dev, u32 qp_handle) ··· 750 737 err_free_qp: 751 738 kfree(qp); 752 739 err_out: 753 - atomic64_inc(&dev->stats.sw_stats.create_qp_err); 740 + atomic64_inc(&dev->stats.create_qp_err); 754 741 return ERR_PTR(err); 742 + } 743 + 744 + static const struct { 745 + int valid; 746 + enum ib_qp_attr_mask req_param; 747 + enum ib_qp_attr_mask opt_param; 748 + } srd_qp_state_table[IB_QPS_ERR + 1][IB_QPS_ERR + 1] = { 749 + [IB_QPS_RESET] = { 750 + [IB_QPS_RESET] = { .valid = 1 }, 751 + [IB_QPS_INIT] = { 752 + .valid = 1, 753 + .req_param = IB_QP_PKEY_INDEX | 754 + IB_QP_PORT | 755 + IB_QP_QKEY, 756 + }, 757 + }, 758 + [IB_QPS_INIT] = { 759 + [IB_QPS_RESET] = { .valid = 1 }, 760 + [IB_QPS_ERR] = { .valid = 1 }, 761 + [IB_QPS_INIT] = { 762 + .valid = 1, 763 + .opt_param = IB_QP_PKEY_INDEX | 764 + IB_QP_PORT | 765 + IB_QP_QKEY, 766 + }, 767 + [IB_QPS_RTR] = { 768 + .valid = 1, 769 + .opt_param = IB_QP_PKEY_INDEX | 770 + IB_QP_QKEY, 771 + }, 772 + }, 773 + [IB_QPS_RTR] = { 774 + [IB_QPS_RESET] = { .valid = 1 }, 775 + 
[IB_QPS_ERR] = { .valid = 1 }, 776 + [IB_QPS_RTS] = { 777 + .valid = 1, 778 + .req_param = IB_QP_SQ_PSN, 779 + .opt_param = IB_QP_CUR_STATE | 780 + IB_QP_QKEY | 781 + IB_QP_RNR_RETRY, 782 + 783 + } 784 + }, 785 + [IB_QPS_RTS] = { 786 + [IB_QPS_RESET] = { .valid = 1 }, 787 + [IB_QPS_ERR] = { .valid = 1 }, 788 + [IB_QPS_RTS] = { 789 + .valid = 1, 790 + .opt_param = IB_QP_CUR_STATE | 791 + IB_QP_QKEY, 792 + }, 793 + [IB_QPS_SQD] = { 794 + .valid = 1, 795 + .opt_param = IB_QP_EN_SQD_ASYNC_NOTIFY, 796 + }, 797 + }, 798 + [IB_QPS_SQD] = { 799 + [IB_QPS_RESET] = { .valid = 1 }, 800 + [IB_QPS_ERR] = { .valid = 1 }, 801 + [IB_QPS_RTS] = { 802 + .valid = 1, 803 + .opt_param = IB_QP_CUR_STATE | 804 + IB_QP_QKEY, 805 + }, 806 + [IB_QPS_SQD] = { 807 + .valid = 1, 808 + .opt_param = IB_QP_PKEY_INDEX | 809 + IB_QP_QKEY, 810 + } 811 + }, 812 + [IB_QPS_SQE] = { 813 + [IB_QPS_RESET] = { .valid = 1 }, 814 + [IB_QPS_ERR] = { .valid = 1 }, 815 + [IB_QPS_RTS] = { 816 + .valid = 1, 817 + .opt_param = IB_QP_CUR_STATE | 818 + IB_QP_QKEY, 819 + } 820 + }, 821 + [IB_QPS_ERR] = { 822 + [IB_QPS_RESET] = { .valid = 1 }, 823 + [IB_QPS_ERR] = { .valid = 1 }, 824 + } 825 + }; 826 + 827 + static bool efa_modify_srd_qp_is_ok(enum ib_qp_state cur_state, 828 + enum ib_qp_state next_state, 829 + enum ib_qp_attr_mask mask) 830 + { 831 + enum ib_qp_attr_mask req_param, opt_param; 832 + 833 + if (mask & IB_QP_CUR_STATE && 834 + cur_state != IB_QPS_RTR && cur_state != IB_QPS_RTS && 835 + cur_state != IB_QPS_SQD && cur_state != IB_QPS_SQE) 836 + return false; 837 + 838 + if (!srd_qp_state_table[cur_state][next_state].valid) 839 + return false; 840 + 841 + req_param = srd_qp_state_table[cur_state][next_state].req_param; 842 + opt_param = srd_qp_state_table[cur_state][next_state].opt_param; 843 + 844 + if ((mask & req_param) != req_param) 845 + return false; 846 + 847 + if (mask & ~(req_param | opt_param | IB_QP_STATE)) 848 + return false; 849 + 850 + return true; 755 851 } 756 852 757 853 static int 
efa_modify_qp_validate(struct efa_dev *dev, struct efa_qp *qp, ··· 868 746 enum ib_qp_state cur_state, 869 747 enum ib_qp_state new_state) 870 748 { 749 + int err; 750 + 871 751 #define EFA_MODIFY_QP_SUPP_MASK \ 872 752 (IB_QP_STATE | IB_QP_CUR_STATE | IB_QP_EN_SQD_ASYNC_NOTIFY | \ 873 - IB_QP_PKEY_INDEX | IB_QP_PORT | IB_QP_QKEY | IB_QP_SQ_PSN) 753 + IB_QP_PKEY_INDEX | IB_QP_PORT | IB_QP_QKEY | IB_QP_SQ_PSN | \ 754 + IB_QP_RNR_RETRY) 874 755 875 756 if (qp_attr_mask & ~EFA_MODIFY_QP_SUPP_MASK) { 876 757 ibdev_dbg(&dev->ibdev, ··· 882 757 return -EOPNOTSUPP; 883 758 } 884 759 885 - if (!ib_modify_qp_is_ok(cur_state, new_state, IB_QPT_UD, 886 - qp_attr_mask)) { 760 + if (qp->ibqp.qp_type == IB_QPT_DRIVER) 761 + err = !efa_modify_srd_qp_is_ok(cur_state, new_state, 762 + qp_attr_mask); 763 + else 764 + err = !ib_modify_qp_is_ok(cur_state, new_state, IB_QPT_UD, 765 + qp_attr_mask); 766 + 767 + if (err) { 887 768 ibdev_dbg(&dev->ibdev, "Invalid modify QP parameters\n"); 888 769 return -EINVAL; 889 770 } ··· 936 805 params.qp_handle = qp->qp_handle; 937 806 938 807 if (qp_attr_mask & IB_QP_STATE) { 939 - params.modify_mask |= BIT(EFA_ADMIN_QP_STATE_BIT) | 940 - BIT(EFA_ADMIN_CUR_QP_STATE_BIT); 808 + EFA_SET(&params.modify_mask, EFA_ADMIN_MODIFY_QP_CMD_QP_STATE, 809 + 1); 810 + EFA_SET(&params.modify_mask, 811 + EFA_ADMIN_MODIFY_QP_CMD_CUR_QP_STATE, 1); 941 812 params.cur_qp_state = qp_attr->cur_qp_state; 942 813 params.qp_state = qp_attr->qp_state; 943 814 } 944 815 945 816 if (qp_attr_mask & IB_QP_EN_SQD_ASYNC_NOTIFY) { 946 - params.modify_mask |= 947 - BIT(EFA_ADMIN_SQ_DRAINED_ASYNC_NOTIFY_BIT); 817 + EFA_SET(&params.modify_mask, 818 + EFA_ADMIN_MODIFY_QP_CMD_SQ_DRAINED_ASYNC_NOTIFY, 1); 948 819 params.sq_drained_async_notify = qp_attr->en_sqd_async_notify; 949 820 } 950 821 951 822 if (qp_attr_mask & IB_QP_QKEY) { 952 - params.modify_mask |= BIT(EFA_ADMIN_QKEY_BIT); 823 + EFA_SET(&params.modify_mask, EFA_ADMIN_MODIFY_QP_CMD_QKEY, 1); 953 824 params.qkey = 
qp_attr->qkey; 954 825 } 955 826 956 827 if (qp_attr_mask & IB_QP_SQ_PSN) { 957 - params.modify_mask |= BIT(EFA_ADMIN_SQ_PSN_BIT); 828 + EFA_SET(&params.modify_mask, EFA_ADMIN_MODIFY_QP_CMD_SQ_PSN, 1); 958 829 params.sq_psn = qp_attr->sq_psn; 830 + } 831 + 832 + if (qp_attr_mask & IB_QP_RNR_RETRY) { 833 + EFA_SET(&params.modify_mask, EFA_ADMIN_MODIFY_QP_CMD_RNR_RETRY, 834 + 1); 835 + params.rnr_retry = qp_attr->rnr_retry; 959 836 } 960 837 961 838 err = efa_com_modify_qp(&dev->edev, &params); ··· 982 843 return efa_com_destroy_cq(&dev->edev, &params); 983 844 } 984 845 985 - void efa_destroy_cq(struct ib_cq *ibcq, struct ib_udata *udata) 846 + int efa_destroy_cq(struct ib_cq *ibcq, struct ib_udata *udata) 986 847 { 987 848 struct efa_dev *dev = to_edev(ibcq->device); 988 849 struct efa_cq *cq = to_ecq(ibcq); ··· 995 856 efa_destroy_cq_idx(dev, cq->cq_idx); 996 857 efa_free_mapped(dev, cq->cpu_addr, cq->dma_addr, cq->size, 997 858 DMA_FROM_DEVICE); 859 + return 0; 998 860 } 999 861 1000 862 static int cq_mmap_entries_setup(struct efa_dev *dev, struct efa_cq *cq, ··· 1136 996 DMA_FROM_DEVICE); 1137 997 1138 998 err_out: 1139 - atomic64_inc(&dev->stats.sw_stats.create_cq_err); 999 + atomic64_inc(&dev->stats.create_cq_err); 1140 1000 return err; 1141 1001 } 1142 1002 ··· 1153 1013 ibdev_dbg(&dev->ibdev, "hp_cnt[%u], pages_in_hp[%u]\n", 1154 1014 hp_cnt, pages_in_hp); 1155 1015 1156 - rdma_for_each_block(umem->sg_head.sgl, &biter, umem->nmap, 1157 - BIT(hp_shift)) 1016 + rdma_umem_for_each_dma_block(umem, &biter, BIT(hp_shift)) 1158 1017 page_list[hp_idx++] = rdma_block_iter_dma_address(&biter); 1159 1018 1160 1019 return 0; ··· 1165 1026 struct page *pg; 1166 1027 int i; 1167 1028 1168 - sglist = kcalloc(page_cnt, sizeof(*sglist), GFP_KERNEL); 1029 + sglist = kmalloc_array(page_cnt, sizeof(*sglist), GFP_KERNEL); 1169 1030 if (!sglist) 1170 1031 return NULL; 1171 1032 sg_init_table(sglist, page_cnt); ··· 1509 1370 1510 1371 supp_access_flags = 1511 1372 
IB_ACCESS_LOCAL_WRITE | 1512 - (is_rdma_read_cap(dev) ? IB_ACCESS_REMOTE_READ : 0); 1373 + (EFA_DEV_CAP(dev, RDMA_READ) ? IB_ACCESS_REMOTE_READ : 0); 1513 1374 1514 1375 access_flags &= ~IB_ACCESS_OPTIONAL; 1515 1376 if (access_flags & ~supp_access_flags) { ··· 1549 1410 goto err_unmap; 1550 1411 } 1551 1412 1552 - params.page_shift = __ffs(pg_sz); 1553 - params.page_num = DIV_ROUND_UP(length + (start & (pg_sz - 1)), 1554 - pg_sz); 1413 + params.page_shift = order_base_2(pg_sz); 1414 + params.page_num = ib_umem_num_dma_blocks(mr->umem, pg_sz); 1555 1415 1556 1416 ibdev_dbg(&dev->ibdev, 1557 1417 "start %#llx length %#llx params.page_shift %u params.page_num %u\n", ··· 1589 1451 err_free: 1590 1452 kfree(mr); 1591 1453 err_out: 1592 - atomic64_inc(&dev->stats.sw_stats.reg_mr_err); 1454 + atomic64_inc(&dev->stats.reg_mr_err); 1593 1455 return ERR_PTR(err); 1594 1456 } 1595 1457 ··· 1707 1569 resp.max_tx_batch = dev->dev_attr.max_tx_batch; 1708 1570 resp.min_sq_wr = dev->dev_attr.min_sq_depth; 1709 1571 1710 - if (udata && udata->outlen) { 1711 - err = ib_copy_to_udata(udata, &resp, 1712 - min(sizeof(resp), udata->outlen)); 1713 - if (err) 1714 - goto err_dealloc_uar; 1715 - } 1572 + err = ib_copy_to_udata(udata, &resp, 1573 + min(sizeof(resp), udata->outlen)); 1574 + if (err) 1575 + goto err_dealloc_uar; 1716 1576 1717 1577 return 0; 1718 1578 1719 1579 err_dealloc_uar: 1720 1580 efa_dealloc_uar(dev, result.uarn); 1721 1581 err_out: 1722 - atomic64_inc(&dev->stats.sw_stats.alloc_ucontext_err); 1582 + atomic64_inc(&dev->stats.alloc_ucontext_err); 1723 1583 return err; 1724 1584 } 1725 1585 ··· 1750 1614 ibdev_dbg(&dev->ibdev, 1751 1615 "pgoff[%#lx] does not have valid entry\n", 1752 1616 vma->vm_pgoff); 1753 - atomic64_inc(&dev->stats.sw_stats.mmap_err); 1617 + atomic64_inc(&dev->stats.mmap_err); 1754 1618 return -EINVAL; 1755 1619 } 1756 1620 entry = to_emmap(rdma_entry); ··· 1792 1656 "Couldn't mmap address[%#llx] length[%#zx] mmap_flag[%d] err[%d]\n", 1793 1657 
entry->address, rdma_entry->npages * PAGE_SIZE, 1794 1658 entry->mmap_flag, err); 1795 - atomic64_inc(&dev->stats.sw_stats.mmap_err); 1659 + atomic64_inc(&dev->stats.mmap_err); 1796 1660 } 1797 1661 1798 1662 rdma_user_mmap_entry_put(rdma_entry); ··· 1877 1741 err_destroy_ah: 1878 1742 efa_ah_destroy(dev, ah); 1879 1743 err_out: 1880 - atomic64_inc(&dev->stats.sw_stats.create_ah_err); 1744 + atomic64_inc(&dev->stats.create_ah_err); 1881 1745 return err; 1882 1746 } 1883 1747 1884 - void efa_destroy_ah(struct ib_ah *ibah, u32 flags) 1748 + int efa_destroy_ah(struct ib_ah *ibah, u32 flags) 1885 1749 { 1886 1750 struct efa_dev *dev = to_edev(ibah->pd->device); 1887 1751 struct efa_ah *ah = to_eah(ibah); ··· 1891 1755 if (!(flags & RDMA_DESTROY_AH_SLEEPABLE)) { 1892 1756 ibdev_dbg(&dev->ibdev, 1893 1757 "Destroy address handle is not supported in atomic context\n"); 1894 - return; 1758 + return -EOPNOTSUPP; 1895 1759 } 1896 1760 1897 1761 efa_ah_destroy(dev, ah); 1762 + return 0; 1898 1763 } 1899 1764 1900 1765 struct rdma_hw_stats *efa_alloc_hw_stats(struct ib_device *ibdev, u8 port_num) ··· 1911 1774 struct efa_com_get_stats_params params = {}; 1912 1775 union efa_com_get_stats_result result; 1913 1776 struct efa_dev *dev = to_edev(ibdev); 1777 + struct efa_com_rdma_read_stats *rrs; 1778 + struct efa_com_messages_stats *ms; 1914 1779 struct efa_com_basic_stats *bs; 1915 1780 struct efa_com_stats_admin *as; 1916 1781 struct efa_stats *s; 1917 1782 int err; 1918 1783 1919 - params.type = EFA_ADMIN_GET_STATS_TYPE_BASIC; 1920 1784 params.scope = EFA_ADMIN_GET_STATS_SCOPE_ALL; 1785 + params.type = EFA_ADMIN_GET_STATS_TYPE_BASIC; 1921 1786 1922 1787 err = efa_com_get_stats(&dev->edev, &params, &result); 1923 1788 if (err) ··· 1932 1793 stats->value[EFA_RX_PKTS] = bs->rx_pkts; 1933 1794 stats->value[EFA_RX_DROPS] = bs->rx_drops; 1934 1795 1796 + params.type = EFA_ADMIN_GET_STATS_TYPE_MESSAGES; 1797 + err = efa_com_get_stats(&dev->edev, &params, &result); 1798 + if (err) 
1799 + return err; 1800 + 1801 + ms = &result.messages_stats; 1802 + stats->value[EFA_SEND_BYTES] = ms->send_bytes; 1803 + stats->value[EFA_SEND_WRS] = ms->send_wrs; 1804 + stats->value[EFA_RECV_BYTES] = ms->recv_bytes; 1805 + stats->value[EFA_RECV_WRS] = ms->recv_wrs; 1806 + 1807 + params.type = EFA_ADMIN_GET_STATS_TYPE_RDMA_READ; 1808 + err = efa_com_get_stats(&dev->edev, &params, &result); 1809 + if (err) 1810 + return err; 1811 + 1812 + rrs = &result.rdma_read_stats; 1813 + stats->value[EFA_RDMA_READ_WRS] = rrs->read_wrs; 1814 + stats->value[EFA_RDMA_READ_BYTES] = rrs->read_bytes; 1815 + stats->value[EFA_RDMA_READ_WR_ERR] = rrs->read_wr_err; 1816 + stats->value[EFA_RDMA_READ_RESP_BYTES] = rrs->read_resp_bytes; 1817 + 1935 1818 as = &dev->edev.aq.stats; 1936 1819 stats->value[EFA_SUBMITTED_CMDS] = atomic64_read(&as->submitted_cmd); 1937 1820 stats->value[EFA_COMPLETED_CMDS] = atomic64_read(&as->completed_cmd); ··· 1962 1801 1963 1802 s = &dev->stats; 1964 1803 stats->value[EFA_KEEP_ALIVE_RCVD] = atomic64_read(&s->keep_alive_rcvd); 1965 - stats->value[EFA_ALLOC_PD_ERR] = atomic64_read(&s->sw_stats.alloc_pd_err); 1966 - stats->value[EFA_CREATE_QP_ERR] = atomic64_read(&s->sw_stats.create_qp_err); 1967 - stats->value[EFA_CREATE_CQ_ERR] = atomic64_read(&s->sw_stats.create_cq_err); 1968 - stats->value[EFA_REG_MR_ERR] = atomic64_read(&s->sw_stats.reg_mr_err); 1969 - stats->value[EFA_ALLOC_UCONTEXT_ERR] = atomic64_read(&s->sw_stats.alloc_ucontext_err); 1970 - stats->value[EFA_CREATE_AH_ERR] = atomic64_read(&s->sw_stats.create_ah_err); 1971 - stats->value[EFA_MMAP_ERR] = atomic64_read(&s->sw_stats.mmap_err); 1804 + stats->value[EFA_ALLOC_PD_ERR] = atomic64_read(&s->alloc_pd_err); 1805 + stats->value[EFA_CREATE_QP_ERR] = atomic64_read(&s->create_qp_err); 1806 + stats->value[EFA_CREATE_CQ_ERR] = atomic64_read(&s->create_cq_err); 1807 + stats->value[EFA_REG_MR_ERR] = atomic64_read(&s->reg_mr_err); 1808 + stats->value[EFA_ALLOC_UCONTEXT_ERR] = 1809 + 
atomic64_read(&s->alloc_ucontext_err); 1810 + stats->value[EFA_CREATE_AH_ERR] = atomic64_read(&s->create_ah_err); 1811 + stats->value[EFA_MMAP_ERR] = atomic64_read(&s->mmap_err); 1972 1812 1973 1813 return ARRAY_SIZE(efa_stats_names); 1974 1814 }
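The `srd_qp_state_table` / `efa_modify_srd_qp_is_ok()` pair added in this file validates a modify-QP request against a table indexed by current and next state, with a required and an optional attribute mask per transition. The sketch below reproduces the three mask checks on a much smaller, hypothetical table; the `DEMO_*` states and attribute bits are stand-ins, not the `IB_QPS_*` / `IB_QP_*` values, and the `IB_QP_CUR_STATE` pre-check is left out.

```c
#include <stdbool.h>

enum demo_state { DEMO_RESET, DEMO_INIT, DEMO_RTR, DEMO_NSTATES };

/* Hypothetical attribute bits, standing in for the IB_QP_* mask. */
#define DEMO_ATTR_STATE	(1u << 0)
#define DEMO_ATTR_PORT	(1u << 1)
#define DEMO_ATTR_QKEY	(1u << 2)
#define DEMO_ATTR_PSN	(1u << 3)

struct demo_transition {
	bool valid;
	unsigned int req_param;	/* attributes that must be present */
	unsigned int opt_param;	/* attributes that may be present */
};

/* Unlisted transitions are zero-initialized, hence invalid. */
static const struct demo_transition demo_table[DEMO_NSTATES][DEMO_NSTATES] = {
	[DEMO_RESET] = {
		[DEMO_RESET] = { .valid = true },
		[DEMO_INIT] = {
			.valid = true,
			.req_param = DEMO_ATTR_PORT | DEMO_ATTR_QKEY,
		},
	},
	[DEMO_INIT] = {
		[DEMO_RTR] = {
			.valid = true,
			.opt_param = DEMO_ATTR_QKEY,
		},
	},
};

static bool demo_modify_is_ok(enum demo_state cur, enum demo_state next,
			      unsigned int mask)
{
	const struct demo_transition *t = &demo_table[cur][next];

	if (!t->valid)
		return false;
	/* every required attribute must be supplied */
	if ((mask & t->req_param) != t->req_param)
		return false;
	/* nothing outside required, optional and the state bit is allowed */
	if (mask & ~(t->req_param | t->opt_param | DEMO_ATTR_STATE))
		return false;
	return true;
}
```

Keeping the rules in a table means a new optional attribute (as with `IB_QP_RNR_RETRY` on the RTR to RTS transition in this merge) is a one-line data change rather than new control flow.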
+11 -11
drivers/infiniband/hw/hfi1/sdma.c
···
 static void sdma_complete(struct kref *);
 static void sdma_finalput(struct sdma_state *);
 static void sdma_get(struct sdma_state *);
-static void sdma_hw_clean_up_task(unsigned long);
+static void sdma_hw_clean_up_task(struct tasklet_struct *);
 static void sdma_put(struct sdma_state *);
 static void sdma_set_state(struct sdma_engine *, enum sdma_states);
 static void sdma_start_hw_clean_up(struct sdma_engine *);
-static void sdma_sw_clean_up_task(unsigned long);
+static void sdma_sw_clean_up_task(struct tasklet_struct *);
 static void sdma_sendctrl(struct sdma_engine *, unsigned);
 static void init_sdma_regs(struct sdma_engine *, u32, uint);
 static void sdma_process_event(
···
 	schedule_work(&sde->err_halt_worker);
 }

-static void sdma_hw_clean_up_task(unsigned long opaque)
+static void sdma_hw_clean_up_task(struct tasklet_struct *t)
 {
-	struct sdma_engine *sde = (struct sdma_engine *)opaque;
+	struct sdma_engine *sde = from_tasklet(sde, t,
+					       sdma_hw_clean_up_task);
 	u64 statuscsr;

 	while (1) {
···
 	sdma_desc_avail(sde, sdma_descq_freecnt(sde));
 }

-static void sdma_sw_clean_up_task(unsigned long opaque)
+static void sdma_sw_clean_up_task(struct tasklet_struct *t)
 {
-	struct sdma_engine *sde = (struct sdma_engine *)opaque;
+	struct sdma_engine *sde = from_tasklet(sde, t, sdma_sw_clean_up_task);
 	unsigned long flags;

 	spin_lock_irqsave(&sde->tail_lock, flags);
···
 		sde->tail_csr =
 			get_kctxt_csr_addr(dd, this_idx, SD(TAIL));

-		tasklet_init(&sde->sdma_hw_clean_up_task, sdma_hw_clean_up_task,
-			     (unsigned long)sde);
-
-		tasklet_init(&sde->sdma_sw_clean_up_task, sdma_sw_clean_up_task,
-			     (unsigned long)sde);
+		tasklet_setup(&sde->sdma_hw_clean_up_task,
+			      sdma_hw_clean_up_task);
+		tasklet_setup(&sde->sdma_sw_clean_up_task,
+			      sdma_sw_clean_up_task);
 		INIT_WORK(&sde->err_halt_worker, sdma_err_halt_wait);
 		INIT_WORK(&sde->flush_worker, sdma_field_flush);

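The tasklet conversion above replaces an `unsigned long` cookie with a pointer to the tasklet itself, from which the callback recovers its enclosing structure. The sketch below models this in userspace, assuming `from_tasklet()` behaves like a `container_of()` wrapper; `demo_tasklet`, `demo_engine` and the helper functions are hypothetical and do not schedule anything.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct tasklet_struct. */
struct demo_tasklet {
	void (*callback)(struct demo_tasklet *t);
};

/* Owner object with the tasklet embedded as a member. */
struct demo_engine {
	int cleaned;
	struct demo_tasklet clean_up_task;
};

/* Recover the enclosing object from a pointer to one of its members. */
#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Callback receives the tasklet, not an opaque cookie. */
static void demo_clean_up(struct demo_tasklet *t)
{
	struct demo_engine *e =
		demo_container_of(t, struct demo_engine, clean_up_task);

	e->cleaned = 1;
}

/* Setup no longer needs a data argument: the tasklet pointer is enough. */
static void demo_tasklet_setup(struct demo_tasklet *t,
			       void (*cb)(struct demo_tasklet *))
{
	t->callback = cb;
}

/* Run the callback synchronously; a real tasklet would be scheduled. */
static void demo_tasklet_run(struct demo_tasklet *t)
{
	t->callback(t);
}
```

In the hfi1 hunk the third argument to `from_tasklet()` is the member name inside `struct sdma_engine`, which happens to share its spelling with the callback function.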
+1 -1
drivers/infiniband/hw/hfi1/verbs.c
···
 	props->gid_tbl_len = HFI1_GUIDS_PER_PORT;
 	props->active_width = (u8)opa_width_to_ib(ppd->link_width_active);
 	/* see rate_show() in ib core/sysfs.c */
-	props->active_speed = (u8)opa_speed_to_ib(ppd->link_speed_active);
+	props->active_speed = opa_speed_to_ib(ppd->link_speed_active);
 	props->max_vl_num = ppd->vls_supported;

 	/* Once we are a "first class" citizen and have added the OPA MTUs to
+18 -5
drivers/infiniband/hw/hns/hns_roce_ah.c
···
 #define HNS_ROCE_VLAN_SL_BIT_MASK	7
 #define HNS_ROCE_VLAN_SL_SHIFT		13

+static inline u16 get_ah_udp_sport(const struct rdma_ah_attr *ah_attr)
+{
+	u32 fl = ah_attr->grh.flow_label;
+	u16 sport;
+
+	if (!fl)
+		sport = get_random_u32() %
+			(IB_ROCE_UDP_ENCAP_VALID_PORT_MAX + 1 -
+			 IB_ROCE_UDP_ENCAP_VALID_PORT_MIN) +
+			IB_ROCE_UDP_ENCAP_VALID_PORT_MIN;
+	else
+		sport = rdma_flow_label_to_udp_sport(fl);
+
+	return sport;
+}
+
 int hns_roce_create_ah(struct ib_ah *ibah, struct rdma_ah_init_attr *init_attr,
 		       struct ib_udata *udata)
 {
···

 	memcpy(ah->av.dgid, grh->dgid.raw, HNS_ROCE_GID_SIZE);
 	ah->av.sl = rdma_ah_get_sl(ah_attr);
+	ah->av.flowlabel = grh->flow_label;
+	ah->av.udp_sport = get_ah_udp_sport(ah_attr);

 	return 0;
 }
···
 	rdma_ah_set_dgid_raw(ah_attr, ah->av.dgid);

 	return 0;
-}
-
-void hns_roce_destroy_ah(struct ib_ah *ah, u32 flags)
-{
-	return;
 }
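When the flow label is zero, `get_ah_udp_sport()` above folds a random 32-bit value into the valid RoCE UDP source port window with a modulo and an offset. The sketch below isolates that arithmetic; the window bounds 0xC000 and 0xFFFF are assumed values for `IB_ROCE_UDP_ENCAP_VALID_PORT_MIN` / `_MAX` and should be checked against `ib_verbs.h`, and the random value is passed in as a parameter so the function is deterministic. The flow-label branch (`rdma_flow_label_to_udp_sport()`) is not reproduced.

```c
#include <stdint.h>

/* Assumed bounds of the UDP source port window; verify before relying on them. */
#define DEMO_UDP_PORT_MIN 0xC000u
#define DEMO_UDP_PORT_MAX 0xFFFFu

/* Map an arbitrary 32-bit value into [DEMO_UDP_PORT_MIN, DEMO_UDP_PORT_MAX]. */
static uint16_t demo_pick_sport(uint32_t rnd)
{
	uint32_t span = DEMO_UDP_PORT_MAX + 1 - DEMO_UDP_PORT_MIN;

	return (uint16_t)(rnd % span + DEMO_UDP_PORT_MIN);
}
```

The `+ 1` in the span makes the upper bound inclusive, matching the expression in the hunk.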
+1 -2
drivers/infiniband/hw/hns/hns_roce_alloc.c
···
 	}

 	/* convert system page cnt to hw page cnt */
-	rdma_for_each_block(umem->sg_head.sgl, &biter, umem->nmap,
-			    1 << page_shift) {
+	rdma_umem_for_each_dma_block(umem, &biter, 1 << page_shift) {
 		addr = rdma_block_iter_dma_address(&biter);
 		if (idx >= start) {
 			bufs[total++] = addr;
+23 -4
drivers/infiniband/hw/hns/hns_roce_cq.c
···
 	int err;

 	buf_attr.page_shift = hr_dev->caps.cqe_buf_pg_sz + HNS_HW_PAGE_SHIFT;
-	buf_attr.region[0].size = hr_cq->cq_depth * hr_dev->caps.cq_entry_sz;
+	buf_attr.region[0].size = hr_cq->cq_depth * hr_cq->cqe_size;
 	buf_attr.region[0].hopnum = hr_dev->caps.cqe_hop_num;
 	buf_attr.region_count = 1;
 	buf_attr.fixed_page = true;
···
 	}
 }

+static void set_cqe_size(struct hns_roce_cq *hr_cq, struct ib_udata *udata,
+			 struct hns_roce_ib_create_cq *ucmd)
+{
+	struct hns_roce_dev *hr_dev = to_hr_dev(hr_cq->ib_cq.device);
+
+	if (udata) {
+		if (udata->inlen >= offsetofend(typeof(*ucmd), cqe_size))
+			hr_cq->cqe_size = ucmd->cqe_size;
+		else
+			hr_cq->cqe_size = HNS_ROCE_V2_CQE_SIZE;
+	} else {
+		hr_cq->cqe_size = hr_dev->caps.cqe_sz;
+	}
+}
+
 int hns_roce_create_cq(struct ib_cq *ib_cq, const struct ib_cq_init_attr *attr,
 		       struct ib_udata *udata)
 {
···
 	INIT_LIST_HEAD(&hr_cq->rq_list);

 	if (udata) {
-		ret = ib_copy_from_udata(&ucmd, udata, sizeof(ucmd));
+		ret = ib_copy_from_udata(&ucmd, udata,
+					 min(sizeof(ucmd), udata->inlen));
 		if (ret) {
 			ibdev_err(ibdev, "Failed to copy CQ udata, err %d\n",
 				  ret);
 			return ret;
 		}
 	}
+
+	set_cqe_size(hr_cq, udata, &ucmd);

 	ret = alloc_cq_buf(hr_dev, hr_cq, udata, ucmd.buf_addr);
 	if (ret) {
···
 	/*
 	 * For the QP created by kernel space, tptr value should be initialized
 	 * to zero; For the QP created by user space, it will cause synchronous
-	 * problems if tptr is set to zero here, so we initialze it in user
+	 * problems if tptr is set to zero here, so we initialize it in user
 	 * space.
 	 */
 	if (!udata && hr_cq->tptr_addr)
···
 	return ret;
 }

-void hns_roce_destroy_cq(struct ib_cq *ib_cq, struct ib_udata *udata)
+int hns_roce_destroy_cq(struct ib_cq *ib_cq, struct ib_udata *udata)
 {
 	struct hns_roce_dev *hr_dev = to_hr_dev(ib_cq->device);
 	struct hns_roce_cq *hr_cq = to_hr_cq(ib_cq);
···
 	free_cq_buf(hr_dev, hr_cq);
 	free_cq_db(hr_dev, hr_cq, udata);
 	free_cqc(hr_dev, hr_cq);
+	return 0;
 }

 void hns_roce_cq_completion(struct hns_roce_dev *hr_dev, u32 cqn)
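`set_cqe_size()` above trusts the new `cqe_size` field only when the user buffer is long enough to contain it, and falls back to the older fixed size otherwise. The sketch below shows that length check in userspace; the `demo_create_cq` layout is a hypothetical stand-in for `struct hns_roce_ib_create_cq`, `demo_offsetofend` imitates the kernel's `offsetofend()`, and the legacy value 32 is taken from `HNS_ROCE_V2_CQE_SIZE` elsewhere in this merge.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical command layout; cqe_size was appended in a later ABI revision. */
struct demo_create_cq {
	uint64_t buf_addr;
	uint64_t db_addr;
	uint32_t cqe_size;
	uint32_t reserved;
};

/* Offset of the first byte past a member. */
#define demo_offsetofend(type, member) \
	(offsetof(type, member) + sizeof(((type *)0)->member))

#define DEMO_LEGACY_CQE_SIZE 32u

/* inlen is the number of bytes the caller actually supplied. */
static uint32_t demo_cqe_size(const struct demo_create_cq *ucmd, size_t inlen)
{
	if (inlen >= demo_offsetofend(struct demo_create_cq, cqe_size))
		return ucmd->cqe_size;	/* new userspace: field is present */
	return DEMO_LEGACY_CQE_SIZE;	/* old userspace: field was never sent */
}
```

This check only makes sense together with the bounded copy in the same hunk (`min(sizeof(ucmd), udata->inlen)`), so that an old, shorter command neither fails the copy nor leaves the new field holding caller data.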
+42 -32
drivers/infiniband/hw/hns/hns_roce_device.h
··· 37 37 38 38 #define DRV_NAME "hns_roce" 39 39 40 - /* hip08 is a pci device */ 41 40 #define PCI_REVISION_ID_HIP08 0x21 41 + #define PCI_REVISION_ID_HIP09 0x30 42 42 43 43 #define HNS_ROCE_HW_VER1 ('h' << 24 | 'i' << 16 | '0' << 8 | '6') 44 44 ··· 57 57 /* Hardware specification only for v1 engine */ 58 58 #define HNS_ROCE_MAX_INNER_MTPT_NUM 0x7 59 59 #define HNS_ROCE_MAX_MTPT_PBL_NUM 0x100000 60 - #define HNS_ROCE_MAX_SGE_NUM 2 61 60 62 61 #define HNS_ROCE_EACH_FREE_CQ_WAIT_MSECS 20 63 62 #define HNS_ROCE_MAX_FREE_CQ_WAIT_CNT \ ··· 75 76 #define HNS_ROCE_CEQ 0 76 77 #define HNS_ROCE_AEQ 1 77 78 78 - #define HNS_ROCE_CEQ_ENTRY_SIZE 0x4 79 - #define HNS_ROCE_AEQ_ENTRY_SIZE 0x10 79 + #define HNS_ROCE_CEQE_SIZE 0x4 80 + #define HNS_ROCE_AEQE_SIZE 0x10 80 81 81 - #define HNS_ROCE_SL_SHIFT 28 82 - #define HNS_ROCE_TCLASS_SHIFT 20 83 - #define HNS_ROCE_FLOW_LABEL_MASK 0xfffff 82 + #define HNS_ROCE_V3_EQE_SIZE 0x40 83 + 84 + #define HNS_ROCE_V2_CQE_SIZE 32 85 + #define HNS_ROCE_V3_CQE_SIZE 64 86 + 87 + #define HNS_ROCE_V2_QPC_SZ 256 88 + #define HNS_ROCE_V3_QPC_SZ 512 84 89 85 90 #define HNS_ROCE_MAX_PORTS 6 86 - #define HNS_ROCE_MAX_GID_NUM 16 87 91 #define HNS_ROCE_GID_SIZE 16 88 92 #define HNS_ROCE_SGE_SIZE 16 89 93 ··· 113 111 #define PAGES_SHIFT_16 16 114 112 #define PAGES_SHIFT_24 24 115 113 #define PAGES_SHIFT_32 32 116 - 117 - #define HNS_ROCE_PCI_BAR_NUM 2 118 114 119 115 #define HNS_ROCE_IDX_QUE_ENTRY_SZ 4 120 116 #define SRQ_DB_REG 0x230 ··· 467 467 void __iomem *cq_db_l; 468 468 u16 *tptr_addr; 469 469 int arm_sn; 470 + int cqe_size; 470 471 unsigned long cqn; 471 472 u32 vector; 472 473 atomic_t refcount; ··· 536 535 }; 537 536 538 537 struct hns_roce_av { 539 - u8 port; 540 - u8 gid_index; 541 - u8 stat_rate; 542 - u8 hop_limit; 543 - u32 flowlabel; 544 - u8 sl; 545 - u8 tclass; 546 - u8 dgid[HNS_ROCE_GID_SIZE]; 547 - u8 mac[ETH_ALEN]; 548 - u16 vlan_id; 549 - bool vlan_en; 538 + u8 port; 539 + u8 gid_index; 540 + u8 stat_rate; 541 + u8 hop_limit; 542 + 
u32 flowlabel; 543 + u16 udp_sport; 544 + u8 sl; 545 + u8 tclass; 546 + u8 dgid[HNS_ROCE_GID_SIZE]; 547 + u8 mac[ETH_ALEN]; 548 + u16 vlan_id; 549 + bool vlan_en; 550 550 }; 551 551 552 552 struct hns_roce_ah { ··· 657 655 658 656 struct hns_roce_sge sge; 659 657 u32 next_sge; 658 + enum ib_mtu path_mtu; 659 + u32 max_inline_data; 660 660 661 661 /* 0: flush needed, 1: unneeded */ 662 662 unsigned long flush_flag; ··· 682 678 }; 683 679 684 680 struct hns_roce_ceqe { 685 - __le32 comp; 681 + __le32 comp; 682 + __le32 rsv[15]; 686 683 }; 687 684 688 685 struct hns_roce_aeqe { ··· 720 715 u8 rsv0; 721 716 } __packed cmd; 722 717 } event; 718 + __le32 rsv[12]; 723 719 }; 724 720 725 721 struct hns_roce_eq { ··· 797 791 int num_pds; 798 792 int reserved_pds; 799 793 u32 mtt_entry_sz; 800 - u32 cq_entry_sz; 794 + u32 cqe_sz; 801 795 u32 page_size_cap; 802 796 u32 reserved_lkey; 803 797 int mtpt_entry_sz; 804 - int qpc_entry_sz; 798 + int qpc_sz; 805 799 int irrl_entry_sz; 806 800 int trrl_entry_sz; 807 801 int cqc_entry_sz; 808 - int sccc_entry_sz; 802 + int sccc_sz; 809 803 int qpc_timer_entry_sz; 810 804 int cqc_timer_entry_sz; 811 805 int srqc_entry_sz; ··· 815 809 u32 pbl_hop_num; 816 810 int aeqe_depth; 817 811 int ceqe_depth; 812 + u32 aeqe_size; 813 + u32 ceqe_size; 818 814 enum ib_mtu max_mtu; 819 815 u32 qpc_bt_num; 820 816 u32 qpc_timer_bt_num; ··· 938 930 int (*poll_cq)(struct ib_cq *ibcq, int num_entries, struct ib_wc *wc); 939 931 int (*dereg_mr)(struct hns_roce_dev *hr_dev, struct hns_roce_mr *mr, 940 932 struct ib_udata *udata); 941 - void (*destroy_cq)(struct ib_cq *ibcq, struct ib_udata *udata); 933 + int (*destroy_cq)(struct ib_cq *ibcq, struct ib_udata *udata); 942 934 int (*modify_cq)(struct ib_cq *cq, u16 cq_count, u16 cq_period); 943 935 int (*init_eq)(struct hns_roce_dev *hr_dev); 944 936 void (*cleanup_eq)(struct hns_roce_dev *hr_dev); ··· 1186 1178 int hns_roce_create_ah(struct ib_ah *ah, struct rdma_ah_init_attr *init_attr, 1187 1179 struct 
ib_udata *udata); 1188 1180 int hns_roce_query_ah(struct ib_ah *ibah, struct rdma_ah_attr *ah_attr); 1189 - void hns_roce_destroy_ah(struct ib_ah *ah, u32 flags); 1181 + static inline int hns_roce_destroy_ah(struct ib_ah *ah, u32 flags) 1182 + { 1183 + return 0; 1184 + } 1190 1185 1191 1186 int hns_roce_alloc_pd(struct ib_pd *pd, struct ib_udata *udata); 1192 - void hns_roce_dealloc_pd(struct ib_pd *pd, struct ib_udata *udata); 1187 + int hns_roce_dealloc_pd(struct ib_pd *pd, struct ib_udata *udata); 1193 1188 1194 1189 struct ib_mr *hns_roce_get_dma_mr(struct ib_pd *pd, int acc); 1195 1190 struct ib_mr *hns_roce_reg_user_mr(struct ib_pd *pd, u64 start, u64 length, ··· 1211 1200 unsigned long mpt_index); 1212 1201 unsigned long key_to_hw_index(u32 key); 1213 1202 1214 - struct ib_mw *hns_roce_alloc_mw(struct ib_pd *pd, enum ib_mw_type, 1215 - struct ib_udata *udata); 1203 + int hns_roce_alloc_mw(struct ib_mw *mw, struct ib_udata *udata); 1216 1204 int hns_roce_dealloc_mw(struct ib_mw *ibmw); 1217 1205 1218 1206 void hns_roce_buf_free(struct hns_roce_dev *hr_dev, struct hns_roce_buf *buf); ··· 1230 1220 int hns_roce_modify_srq(struct ib_srq *ibsrq, struct ib_srq_attr *srq_attr, 1231 1221 enum ib_srq_attr_mask srq_attr_mask, 1232 1222 struct ib_udata *udata); 1233 - void hns_roce_destroy_srq(struct ib_srq *ibsrq, struct ib_udata *udata); 1223 + int hns_roce_destroy_srq(struct ib_srq *ibsrq, struct ib_udata *udata); 1234 1224 1235 1225 struct ib_qp *hns_roce_create_qp(struct ib_pd *ib_pd, 1236 1226 struct ib_qp_init_attr *init_attr, ··· 1257 1247 int hns_roce_create_cq(struct ib_cq *ib_cq, const struct ib_cq_init_attr *attr, 1258 1248 struct ib_udata *udata); 1259 1249 1260 - void hns_roce_destroy_cq(struct ib_cq *ib_cq, struct ib_udata *udata); 1250 + int hns_roce_destroy_cq(struct ib_cq *ib_cq, struct ib_udata *udata); 1261 1251 int hns_roce_db_map_user(struct hns_roce_ucontext *context, 1262 1252 struct ib_udata *udata, unsigned long virt, 1263 1253 struct 
hns_roce_db *db);
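Note on the event queue entry changes above: the reserved words added to `struct hns_roce_ceqe` and `struct hns_roce_aeqe` appear to pad both entries up to the new `HNS_ROCE_V3_EQE_SIZE` of 0x40 bytes. For the CEQE this follows directly from the hunk (1 + 15 words of 4 bytes). For the AEQE it holds if the pre-existing fields occupy the old 0x10-byte entry size, which is an assumption here because the full struct is not shown. A small sketch of the arithmetic, using hypothetical names (`ceqe_sketch`, `rsv_words`):

```c
#include <assert.h>
#include <stdint.h>

#define CEQE_SIZE   0x4  /* mirrors HNS_ROCE_CEQE_SIZE */
#define AEQE_SIZE   0x10 /* mirrors HNS_ROCE_AEQE_SIZE */
#define V3_EQE_SIZE 0x40 /* mirrors HNS_ROCE_V3_EQE_SIZE */

/* Same shape as the patched CEQE, with uint32_t standing in for __le32. */
struct ceqe_sketch {
	uint32_t comp;
	uint32_t rsv[15];
};

/* Compile-time check that the padded entry is exactly 64 bytes. */
static_assert(sizeof(struct ceqe_sketch) == V3_EQE_SIZE,
	      "padded CEQE should match the V3 EQE size");

/* Number of 32-bit reserved words needed to grow an entry to new_size. */
static unsigned int rsv_words(unsigned int old_size, unsigned int new_size)
{
	return (new_size - old_size) / sizeof(uint32_t);
}
```

`rsv_words(CEQE_SIZE, V3_EQE_SIZE)` gives 15 and `rsv_words(AEQE_SIZE, V3_EQE_SIZE)` gives 12, matching the `rsv[15]` and `rsv[12]` arrays in the diff.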
+4 -4
drivers/infiniband/hw/hns/hns_roce_hem.c
··· 338 338 void __iomem *bt_cmd; 339 339 __le32 bt_cmd_val[2]; 340 340 __le32 bt_cmd_h = 0; 341 - __le32 bt_cmd_l = 0; 342 - u64 bt_ba = 0; 341 + __le32 bt_cmd_l; 342 + u64 bt_ba; 343 343 int ret = 0; 344 344 345 345 /* Find the HEM(Hardware Entry Memory) entry */ ··· 1027 1027 if (hr_dev->caps.cqc_timer_entry_sz) 1028 1028 hns_roce_cleanup_hem_table(hr_dev, 1029 1029 &hr_dev->cqc_timer_table); 1030 - if (hr_dev->caps.sccc_entry_sz) 1030 + if (hr_dev->caps.sccc_sz) 1031 1031 hns_roce_cleanup_hem_table(hr_dev, 1032 1032 &hr_dev->qp_table.sccc_table); 1033 1033 if (hr_dev->caps.trrl_entry_sz) ··· 1404 1404 { 1405 1405 const struct hns_roce_buf_region *r; 1406 1406 int ofs, end; 1407 - int ret = 0; 1407 + int ret; 1408 1408 int unit; 1409 1409 int i; 1410 1410
+24 -27
drivers/infiniband/hw/hns/hns_roce_hw_v1.c
··· 70 70 struct hns_roce_qp *qp = to_hr_qp(ibqp); 71 71 struct device *dev = &hr_dev->pdev->dev; 72 72 struct hns_roce_sq_db sq_db = {}; 73 - int ps_opcode = 0, i = 0; 73 + int ps_opcode, i; 74 74 unsigned long flags = 0; 75 75 void *wqe = NULL; 76 76 __le32 doorbell[2]; 77 - u32 wqe_idx = 0; 78 - int nreq = 0; 79 77 int ret = 0; 80 - u8 *smac; 81 78 int loopback; 79 + u32 wqe_idx; 80 + int nreq; 81 + u8 *smac; 82 82 83 83 if (unlikely(ibqp->qp_type != IB_QPT_GSI && 84 84 ibqp->qp_type != IB_QPT_RC)) { ··· 271 271 ps_opcode = HNS_ROCE_WQE_OPCODE_SEND; 272 272 break; 273 273 case IB_WR_LOCAL_INV: 274 - break; 275 274 case IB_WR_ATOMIC_CMP_AND_SWP: 276 275 case IB_WR_ATOMIC_FETCH_AND_ADD: 277 276 case IB_WR_LSO: ··· 887 888 u32 odb_ext_mod; 888 889 u32 sdb_evt_mod; 889 890 u32 odb_evt_mod; 890 - int ret = 0; 891 + int ret; 891 892 892 893 memset(db, 0, sizeof(*db)); 893 894 ··· 1147 1148 struct hns_roce_v1_priv *priv = hr_dev->priv; 1148 1149 struct hns_roce_raq_table *raq = &priv->raq_table; 1149 1150 struct device *dev = &hr_dev->pdev->dev; 1150 - int raq_shift = 0; 1151 1151 dma_addr_t addr; 1152 + int raq_shift; 1152 1153 __le32 tmp; 1153 1154 u32 val; 1154 1155 int ret; ··· 1359 1360 struct hns_roce_v1_priv *priv = hr_dev->priv; 1360 1361 struct hns_roce_free_mr *free_mr = &priv->free_mr; 1361 1362 struct device *dev = &hr_dev->pdev->dev; 1362 - int ret = 0; 1363 + int ret; 1363 1364 1364 1365 free_mr->free_mr_wq = create_singlethread_workqueue("hns_roce_free_mr"); 1365 1366 if (!free_mr->free_mr_wq) { ··· 1439 1440 1440 1441 static int hns_roce_v1_profile(struct hns_roce_dev *hr_dev) 1441 1442 { 1442 - int i = 0; 1443 1443 struct hns_roce_caps *caps = &hr_dev->caps; 1444 + int i; 1444 1445 1445 1446 hr_dev->vendor_id = roce_read(hr_dev, ROCEE_VENDOR_ID_REG); 1446 1447 hr_dev->vendor_part_id = roce_read(hr_dev, ROCEE_VENDOR_PART_ID_REG); ··· 1470 1471 caps->max_qp_dest_rdma = HNS_ROCE_V1_MAX_QP_DEST_RDMA; 1471 1472 caps->max_sq_desc_sz = 
HNS_ROCE_V1_MAX_SQ_DESC_SZ; 1472 1473 caps->max_rq_desc_sz = HNS_ROCE_V1_MAX_RQ_DESC_SZ; 1473 - caps->qpc_entry_sz = HNS_ROCE_V1_QPC_ENTRY_SIZE; 1474 + caps->qpc_sz = HNS_ROCE_V1_QPC_SIZE; 1474 1475 caps->irrl_entry_sz = HNS_ROCE_V1_IRRL_ENTRY_SIZE; 1475 1476 caps->cqc_entry_sz = HNS_ROCE_V1_CQC_ENTRY_SIZE; 1476 1477 caps->mtpt_entry_sz = HNS_ROCE_V1_MTPT_ENTRY_SIZE; 1477 1478 caps->mtt_entry_sz = HNS_ROCE_V1_MTT_ENTRY_SIZE; 1478 - caps->cq_entry_sz = HNS_ROCE_V1_CQE_ENTRY_SIZE; 1479 + caps->cqe_sz = HNS_ROCE_V1_CQE_SIZE; 1479 1480 caps->page_size_cap = HNS_ROCE_V1_PAGE_SIZE_SUPPORT; 1480 1481 caps->reserved_lkey = 0; 1481 1482 caps->reserved_pds = 0; ··· 1642 1643 unsigned long timeout) 1643 1644 { 1644 1645 u8 __iomem *hcr = hr_dev->reg_base + ROCEE_MB1_REG; 1645 - unsigned long end = 0; 1646 + unsigned long end; 1646 1647 u32 status = 0; 1647 1648 1648 1649 end = msecs_to_jiffies(timeout) + jiffies; ··· 1670 1671 { 1671 1672 unsigned long flags; 1672 1673 u32 *p = NULL; 1673 - u8 gid_idx = 0; 1674 + u8 gid_idx; 1674 1675 1675 1676 gid_idx = hns_get_gid_index(hr_dev, port, gid_index); 1676 1677 ··· 1896 1897 1897 1898 static void *get_cqe(struct hns_roce_cq *hr_cq, int n) 1898 1899 { 1899 - return hns_roce_buf_offset(hr_cq->mtr.kmem, 1900 - n * HNS_ROCE_V1_CQE_ENTRY_SIZE); 1900 + return hns_roce_buf_offset(hr_cq->mtr.kmem, n * HNS_ROCE_V1_CQE_SIZE); 1901 1901 } 1902 1902 1903 1903 static void *get_sw_cqe(struct hns_roce_cq *hr_cq, int n) ··· 2443 2445 2444 2446 struct hns_roce_cmd_mailbox *mailbox; 2445 2447 struct device *dev = &hr_dev->pdev->dev; 2446 - int ret = 0; 2448 + int ret; 2447 2449 2448 2450 if (cur_state >= HNS_ROCE_QP_NUM_STATE || 2449 2451 new_state >= HNS_ROCE_QP_NUM_STATE || ··· 3392 3394 struct hns_roce_qp *hr_qp = to_hr_qp(ibqp); 3393 3395 struct device *dev = &hr_dev->pdev->dev; 3394 3396 struct hns_roce_qp_context *context; 3395 - int tmp_qp_state = 0; 3397 + int tmp_qp_state; 3396 3398 int ret = 0; 3397 3399 int state; 3398 3400 ··· 3570 
3572 return 0; 3571 3573 } 3572 3574 3573 - static void hns_roce_v1_destroy_cq(struct ib_cq *ibcq, struct ib_udata *udata) 3575 + static int hns_roce_v1_destroy_cq(struct ib_cq *ibcq, struct ib_udata *udata) 3574 3576 { 3575 3577 struct hns_roce_dev *hr_dev = to_hr_dev(ibcq->device); 3576 3578 struct hns_roce_cq *hr_cq = to_hr_cq(ibcq); ··· 3601 3603 } 3602 3604 wait_time++; 3603 3605 } 3606 + return 0; 3604 3607 } 3605 3608 3606 3609 static void set_eq_cons_index_v1(struct hns_roce_eq *eq, int req_not) ··· 3774 3775 3775 3776 static struct hns_roce_aeqe *get_aeqe_v1(struct hns_roce_eq *eq, u32 entry) 3776 3777 { 3777 - unsigned long off = (entry & (eq->entries - 1)) * 3778 - HNS_ROCE_AEQ_ENTRY_SIZE; 3778 + unsigned long off = (entry & (eq->entries - 1)) * HNS_ROCE_AEQE_SIZE; 3779 3779 3780 3780 return (struct hns_roce_aeqe *)((u8 *) 3781 3781 (eq->buf_list[off / HNS_ROCE_BA_SIZE].buf) + ··· 3879 3881 3880 3882 static struct hns_roce_ceqe *get_ceqe_v1(struct hns_roce_eq *eq, u32 entry) 3881 3883 { 3882 - unsigned long off = (entry & (eq->entries - 1)) * 3883 - HNS_ROCE_CEQ_ENTRY_SIZE; 3884 + unsigned long off = (entry & (eq->entries - 1)) * HNS_ROCE_CEQE_SIZE; 3884 3885 3885 3886 return (struct hns_roce_ceqe *)((u8 *) 3886 3887 (eq->buf_list[off / HNS_ROCE_BA_SIZE].buf) + ··· 3931 3934 { 3932 3935 struct hns_roce_eq *eq = eq_ptr; 3933 3936 struct hns_roce_dev *hr_dev = eq->hr_dev; 3934 - int int_work = 0; 3937 + int int_work; 3935 3938 3936 3939 if (eq->type_flag == HNS_ROCE_CEQ) 3937 3940 /* CEQ irq routine, CEQ is pulse irq, not clear */ ··· 4129 4132 void __iomem *eqc = hr_dev->eq_table.eqc_base[eq->eqn]; 4130 4133 struct device *dev = &hr_dev->pdev->dev; 4131 4134 dma_addr_t tmp_dma_addr; 4132 - u32 eqconsindx_val = 0; 4133 4135 u32 eqcuridx_val = 0; 4134 - u32 eqshift_val = 0; 4136 + u32 eqconsindx_val; 4137 + u32 eqshift_val; 4135 4138 __le32 tmp2 = 0; 4136 4139 __le32 tmp1 = 0; 4137 4140 __le32 tmp = 0; ··· 4250 4253 CEQ_REG_OFFSET * i; 4251 4254 eq->entries 
= hr_dev->caps.ceqe_depth; 4252 4255 eq->log_entries = ilog2(eq->entries); 4253 - eq->eqe_size = HNS_ROCE_CEQ_ENTRY_SIZE; 4256 + eq->eqe_size = HNS_ROCE_CEQE_SIZE; 4254 4257 } else { 4255 4258 /* AEQ */ 4256 4259 eq_table->eqc_base[i] = hr_dev->reg_base + ··· 4260 4263 ROCEE_CAEP_AEQE_CONS_IDX_REG; 4261 4264 eq->entries = hr_dev->caps.aeqe_depth; 4262 4265 eq->log_entries = ilog2(eq->entries); 4263 - eq->eqe_size = HNS_ROCE_AEQ_ENTRY_SIZE; 4266 + eq->eqe_size = HNS_ROCE_AEQE_SIZE; 4264 4267 } 4265 4268 } 4266 4269
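Note on `get_aeqe_v1()` and `get_ceqe_v1()` above: both compute a byte offset into the queue buffer by masking the running entry counter with `entries - 1` and multiplying by the entry size. The mask only acts as a wrap-around when the queue depth is a power of two, which the v1 depths in this driver (0x8000 and 0x400) are. A minimal sketch of that offset computation, with `eqe_offset()` as a hypothetical helper name:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Byte offset of an event queue entry inside the ring.
 * Assumes `entries` is a power of two, so that `entry & (entries - 1)`
 * is equivalent to `entry % entries`.
 */
static unsigned long eqe_offset(uint32_t entry, uint32_t entries,
				uint32_t eqe_size)
{
	return (unsigned long)(entry & (entries - 1)) * eqe_size;
}
```

For a 0x400-entry queue with 0x10-byte entries, counter values 0 and 0x400 both map to offset 0, and 0x401 maps to offset 0x10.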
+2 -2
drivers/infiniband/hw/hns/hns_roce_hw_v1.h
···
68 68 #define HNS_ROCE_V1_COMP_EQE_NUM 0x8000
69 69 #define HNS_ROCE_V1_ASYNC_EQE_NUM 0x400
70 70
71 - #define HNS_ROCE_V1_QPC_ENTRY_SIZE 256
71 + #define HNS_ROCE_V1_QPC_SIZE 256
72 72 #define HNS_ROCE_V1_IRRL_ENTRY_SIZE 8
73 73 #define HNS_ROCE_V1_CQC_ENTRY_SIZE 64
74 74 #define HNS_ROCE_V1_MTPT_ENTRY_SIZE 64
75 75 #define HNS_ROCE_V1_MTT_ENTRY_SIZE 64
76 76
77 - #define HNS_ROCE_V1_CQE_ENTRY_SIZE 32
77 + #define HNS_ROCE_V1_CQE_SIZE 32
78 78 #define HNS_ROCE_V1_PAGE_SIZE_SUPPORT 0xFFFFF000
79 79
80 80 #define HNS_ROCE_V1_TABLE_CHUNK_SIZE (1 << 17)
+388 -146
drivers/infiniband/hw/hns/hns_roce_hw_v2.c
··· 153 153 V2_RC_SEND_WQE_BYTE_16_SGE_NUM_S, valid_num_sge); 154 154 } 155 155 156 + static int fill_ext_sge_inl_data(struct hns_roce_qp *qp, 157 + const struct ib_send_wr *wr, 158 + unsigned int *sge_idx, u32 msg_len) 159 + { 160 + struct ib_device *ibdev = &(to_hr_dev(qp->ibqp.device))->ib_dev; 161 + unsigned int dseg_len = sizeof(struct hns_roce_v2_wqe_data_seg); 162 + unsigned int ext_sge_sz = qp->sq.max_gs * dseg_len; 163 + unsigned int left_len_in_pg; 164 + unsigned int idx = *sge_idx; 165 + unsigned int i = 0; 166 + unsigned int len; 167 + void *addr; 168 + void *dseg; 169 + 170 + if (msg_len > ext_sge_sz) { 171 + ibdev_err(ibdev, 172 + "no enough extended sge space for inline data.\n"); 173 + return -EINVAL; 174 + } 175 + 176 + dseg = hns_roce_get_extend_sge(qp, idx & (qp->sge.sge_cnt - 1)); 177 + left_len_in_pg = hr_hw_page_align((uintptr_t)dseg) - (uintptr_t)dseg; 178 + len = wr->sg_list[0].length; 179 + addr = (void *)(unsigned long)(wr->sg_list[0].addr); 180 + 181 + /* When copying data to extended sge space, the left length in page may 182 + * not long enough for current user's sge. So the data should be 183 + * splited into several parts, one in the first page, and the others in 184 + * the subsequent pages. 
185 + */ 186 + while (1) { 187 + if (len <= left_len_in_pg) { 188 + memcpy(dseg, addr, len); 189 + 190 + idx += len / dseg_len; 191 + 192 + i++; 193 + if (i >= wr->num_sge) 194 + break; 195 + 196 + left_len_in_pg -= len; 197 + len = wr->sg_list[i].length; 198 + addr = (void *)(unsigned long)(wr->sg_list[i].addr); 199 + dseg += len; 200 + } else { 201 + memcpy(dseg, addr, left_len_in_pg); 202 + 203 + len -= left_len_in_pg; 204 + addr += left_len_in_pg; 205 + idx += left_len_in_pg / dseg_len; 206 + dseg = hns_roce_get_extend_sge(qp, 207 + idx & (qp->sge.sge_cnt - 1)); 208 + left_len_in_pg = 1 << HNS_HW_PAGE_SHIFT; 209 + } 210 + } 211 + 212 + *sge_idx = idx; 213 + 214 + return 0; 215 + } 216 + 156 217 static void set_extend_sge(struct hns_roce_qp *qp, const struct ib_send_wr *wr, 157 218 unsigned int *sge_ind, unsigned int valid_num_sge) 158 219 { ··· 238 177 *sge_ind = idx; 239 178 } 240 179 180 + static bool check_inl_data_len(struct hns_roce_qp *qp, unsigned int len) 181 + { 182 + struct hns_roce_dev *hr_dev = to_hr_dev(qp->ibqp.device); 183 + int mtu = ib_mtu_enum_to_int(qp->path_mtu); 184 + 185 + if (len > qp->max_inline_data || len > mtu) { 186 + ibdev_err(&hr_dev->ib_dev, 187 + "invalid length of data, data len = %u, max inline len = %u, path mtu = %d.\n", 188 + len, qp->max_inline_data, mtu); 189 + return false; 190 + } 191 + 192 + return true; 193 + } 194 + 195 + static int set_rc_inl(struct hns_roce_qp *qp, const struct ib_send_wr *wr, 196 + struct hns_roce_v2_rc_send_wqe *rc_sq_wqe, 197 + unsigned int *sge_idx) 198 + { 199 + struct hns_roce_dev *hr_dev = to_hr_dev(qp->ibqp.device); 200 + u32 msg_len = le32_to_cpu(rc_sq_wqe->msg_len); 201 + struct ib_device *ibdev = &hr_dev->ib_dev; 202 + unsigned int curr_idx = *sge_idx; 203 + void *dseg = rc_sq_wqe; 204 + unsigned int i; 205 + int ret; 206 + 207 + if (unlikely(wr->opcode == IB_WR_RDMA_READ)) { 208 + ibdev_err(ibdev, "invalid inline parameters!\n"); 209 + return -EINVAL; 210 + } 211 + 212 + if 
(!check_inl_data_len(qp, msg_len)) 213 + return -EINVAL; 214 + 215 + dseg += sizeof(struct hns_roce_v2_rc_send_wqe); 216 + 217 + roce_set_bit(rc_sq_wqe->byte_4, V2_RC_SEND_WQE_BYTE_4_INLINE_S, 1); 218 + 219 + if (msg_len <= HNS_ROCE_V2_MAX_RC_INL_INN_SZ) { 220 + roce_set_bit(rc_sq_wqe->byte_20, 221 + V2_RC_SEND_WQE_BYTE_20_INL_TYPE_S, 0); 222 + 223 + for (i = 0; i < wr->num_sge; i++) { 224 + memcpy(dseg, ((void *)wr->sg_list[i].addr), 225 + wr->sg_list[i].length); 226 + dseg += wr->sg_list[i].length; 227 + } 228 + } else { 229 + roce_set_bit(rc_sq_wqe->byte_20, 230 + V2_RC_SEND_WQE_BYTE_20_INL_TYPE_S, 1); 231 + 232 + ret = fill_ext_sge_inl_data(qp, wr, &curr_idx, msg_len); 233 + if (ret) 234 + return ret; 235 + 236 + roce_set_field(rc_sq_wqe->byte_16, 237 + V2_RC_SEND_WQE_BYTE_16_SGE_NUM_M, 238 + V2_RC_SEND_WQE_BYTE_16_SGE_NUM_S, 239 + curr_idx - *sge_idx); 240 + } 241 + 242 + *sge_idx = curr_idx; 243 + 244 + return 0; 245 + } 246 + 241 247 static int set_rwqe_data_seg(struct ib_qp *ibqp, const struct ib_send_wr *wr, 242 248 struct hns_roce_v2_rc_send_wqe *rc_sq_wqe, 243 249 unsigned int *sge_ind, 244 250 unsigned int valid_num_sge) 245 251 { 246 - struct hns_roce_dev *hr_dev = to_hr_dev(ibqp->device); 247 252 struct hns_roce_v2_wqe_data_seg *dseg = 248 253 (void *)rc_sq_wqe + sizeof(struct hns_roce_v2_rc_send_wqe); 249 - struct ib_device *ibdev = &hr_dev->ib_dev; 250 254 struct hns_roce_qp *qp = to_hr_qp(ibqp); 251 - void *wqe = dseg; 252 255 int j = 0; 253 256 int i; 254 257 255 - if (wr->send_flags & IB_SEND_INLINE && valid_num_sge) { 256 - if (unlikely(le32_to_cpu(rc_sq_wqe->msg_len) > 257 - hr_dev->caps.max_sq_inline)) { 258 - ibdev_err(ibdev, "inline len(1-%d)=%d, illegal", 259 - rc_sq_wqe->msg_len, 260 - hr_dev->caps.max_sq_inline); 261 - return -EINVAL; 262 - } 258 + roce_set_field(rc_sq_wqe->byte_20, 259 + V2_RC_SEND_WQE_BYTE_20_MSG_START_SGE_IDX_M, 260 + V2_RC_SEND_WQE_BYTE_20_MSG_START_SGE_IDX_S, 261 + (*sge_ind) & (qp->sge.sge_cnt - 1)); 263 262 264 - 
if (unlikely(wr->opcode == IB_WR_RDMA_READ)) { 265 - ibdev_err(ibdev, "Not support inline data!\n"); 266 - return -EINVAL; 267 - } 263 + if (wr->send_flags & IB_SEND_INLINE) 264 + return set_rc_inl(qp, wr, rc_sq_wqe, sge_ind); 268 265 266 + if (valid_num_sge <= HNS_ROCE_SGE_IN_WQE) { 269 267 for (i = 0; i < wr->num_sge; i++) { 270 - memcpy(wqe, ((void *)wr->sg_list[i].addr), 271 - wr->sg_list[i].length); 272 - wqe += wr->sg_list[i].length; 268 + if (likely(wr->sg_list[i].length)) { 269 + set_data_seg_v2(dseg, wr->sg_list + i); 270 + dseg++; 271 + } 273 272 } 274 - 275 - roce_set_bit(rc_sq_wqe->byte_4, V2_RC_SEND_WQE_BYTE_4_INLINE_S, 276 - 1); 277 273 } else { 278 - if (valid_num_sge <= HNS_ROCE_SGE_IN_WQE) { 279 - for (i = 0; i < wr->num_sge; i++) { 280 - if (likely(wr->sg_list[i].length)) { 281 - set_data_seg_v2(dseg, wr->sg_list + i); 282 - dseg++; 283 - } 274 + for (i = 0; i < wr->num_sge && j < HNS_ROCE_SGE_IN_WQE; i++) { 275 + if (likely(wr->sg_list[i].length)) { 276 + set_data_seg_v2(dseg, wr->sg_list + i); 277 + dseg++; 278 + j++; 284 279 } 285 - } else { 286 - roce_set_field(rc_sq_wqe->byte_20, 287 - V2_RC_SEND_WQE_BYTE_20_MSG_START_SGE_IDX_M, 288 - V2_RC_SEND_WQE_BYTE_20_MSG_START_SGE_IDX_S, 289 - (*sge_ind) & (qp->sge.sge_cnt - 1)); 290 - 291 - for (i = 0; i < wr->num_sge && j < HNS_ROCE_SGE_IN_WQE; 292 - i++) { 293 - if (likely(wr->sg_list[i].length)) { 294 - set_data_seg_v2(dseg, wr->sg_list + i); 295 - dseg++; 296 - j++; 297 - } 298 - } 299 - 300 - set_extend_sge(qp, wr, sge_ind, valid_num_sge); 301 280 } 302 281 303 - roce_set_field(rc_sq_wqe->byte_16, 304 - V2_RC_SEND_WQE_BYTE_16_SGE_NUM_M, 305 - V2_RC_SEND_WQE_BYTE_16_SGE_NUM_S, valid_num_sge); 282 + set_extend_sge(qp, wr, sge_ind, valid_num_sge); 306 283 } 284 + 285 + roce_set_field(rc_sq_wqe->byte_16, 286 + V2_RC_SEND_WQE_BYTE_16_SGE_NUM_M, 287 + V2_RC_SEND_WQE_BYTE_16_SGE_NUM_S, valid_num_sge); 307 288 308 289 return 0; 309 290 } ··· 395 292 return valid_num; 396 293 } 397 294 295 + static __le32 
get_immtdata(const struct ib_send_wr *wr) 296 + { 297 + switch (wr->opcode) { 298 + case IB_WR_SEND_WITH_IMM: 299 + case IB_WR_RDMA_WRITE_WITH_IMM: 300 + return cpu_to_le32(be32_to_cpu(wr->ex.imm_data)); 301 + default: 302 + return 0; 303 + } 304 + } 305 + 306 + static int set_ud_opcode(struct hns_roce_v2_ud_send_wqe *ud_sq_wqe, 307 + const struct ib_send_wr *wr) 308 + { 309 + u32 ib_op = wr->opcode; 310 + 311 + if (ib_op != IB_WR_SEND && ib_op != IB_WR_SEND_WITH_IMM) 312 + return -EINVAL; 313 + 314 + ud_sq_wqe->immtdata = get_immtdata(wr); 315 + 316 + roce_set_field(ud_sq_wqe->byte_4, V2_UD_SEND_WQE_BYTE_4_OPCODE_M, 317 + V2_UD_SEND_WQE_BYTE_4_OPCODE_S, to_hr_opcode(ib_op)); 318 + 319 + return 0; 320 + } 321 + 398 322 static inline int set_ud_wqe(struct hns_roce_qp *qp, 399 323 const struct ib_send_wr *wr, 400 324 void *wqe, unsigned int *sge_idx, ··· 435 305 u32 msg_len = 0; 436 306 bool loopback; 437 307 u8 *smac; 308 + int ret; 438 309 439 310 valid_num_sge = calc_wr_sge_num(wr, &msg_len); 440 311 memset(ud_sq_wqe, 0, sizeof(*ud_sq_wqe)); 312 + 313 + ret = set_ud_opcode(ud_sq_wqe, wr); 314 + if (WARN_ON(ret)) 315 + return ret; 441 316 442 317 roce_set_field(ud_sq_wqe->dmac, V2_UD_SEND_WQE_DMAC_0_M, 443 318 V2_UD_SEND_WQE_DMAC_0_S, ah->av.mac[0]); ··· 464 329 roce_set_bit(ud_sq_wqe->byte_40, 465 330 V2_UD_SEND_WQE_BYTE_40_LBI_S, loopback); 466 331 467 - roce_set_field(ud_sq_wqe->byte_4, 468 - V2_UD_SEND_WQE_BYTE_4_OPCODE_M, 469 - V2_UD_SEND_WQE_BYTE_4_OPCODE_S, 470 - HNS_ROCE_V2_WQE_OP_SEND); 471 - 472 332 ud_sq_wqe->msg_len = cpu_to_le32(msg_len); 473 - 474 - switch (wr->opcode) { 475 - case IB_WR_SEND_WITH_IMM: 476 - case IB_WR_RDMA_WRITE_WITH_IMM: 477 - ud_sq_wqe->immtdata = cpu_to_le32(be32_to_cpu(wr->ex.imm_data)); 478 - break; 479 - default: 480 - ud_sq_wqe->immtdata = 0; 481 - break; 482 - } 483 333 484 334 /* Set sig attr */ 485 335 roce_set_bit(ud_sq_wqe->byte_4, V2_UD_SEND_WQE_BYTE_4_CQE_S, ··· 489 369 curr_idx & (qp->sge.sge_cnt - 1)); 490 370 491 371 
roce_set_field(ud_sq_wqe->byte_24, V2_UD_SEND_WQE_BYTE_24_UDPSPN_M, 492 - V2_UD_SEND_WQE_BYTE_24_UDPSPN_S, 0); 372 + V2_UD_SEND_WQE_BYTE_24_UDPSPN_S, ah->av.udp_sport); 493 373 ud_sq_wqe->qkey = cpu_to_le32(ud_wr(wr)->remote_qkey & 0x80000000 ? 494 374 qp->qkey : ud_wr(wr)->remote_qkey); 495 375 roce_set_field(ud_sq_wqe->byte_32, V2_UD_SEND_WQE_BYTE_32_DQPN_M, ··· 522 402 return 0; 523 403 } 524 404 405 + static int set_rc_opcode(struct hns_roce_v2_rc_send_wqe *rc_sq_wqe, 406 + const struct ib_send_wr *wr) 407 + { 408 + u32 ib_op = wr->opcode; 409 + 410 + rc_sq_wqe->immtdata = get_immtdata(wr); 411 + 412 + switch (ib_op) { 413 + case IB_WR_RDMA_READ: 414 + case IB_WR_RDMA_WRITE: 415 + case IB_WR_RDMA_WRITE_WITH_IMM: 416 + rc_sq_wqe->rkey = cpu_to_le32(rdma_wr(wr)->rkey); 417 + rc_sq_wqe->va = cpu_to_le64(rdma_wr(wr)->remote_addr); 418 + break; 419 + case IB_WR_SEND: 420 + case IB_WR_SEND_WITH_IMM: 421 + break; 422 + case IB_WR_ATOMIC_CMP_AND_SWP: 423 + case IB_WR_ATOMIC_FETCH_AND_ADD: 424 + rc_sq_wqe->rkey = cpu_to_le32(atomic_wr(wr)->rkey); 425 + rc_sq_wqe->va = cpu_to_le64(atomic_wr(wr)->remote_addr); 426 + break; 427 + case IB_WR_REG_MR: 428 + set_frmr_seg(rc_sq_wqe, reg_wr(wr)); 429 + break; 430 + case IB_WR_LOCAL_INV: 431 + roce_set_bit(rc_sq_wqe->byte_4, V2_RC_SEND_WQE_BYTE_4_SO_S, 1); 432 + fallthrough; 433 + case IB_WR_SEND_WITH_INV: 434 + rc_sq_wqe->inv_key = cpu_to_le32(wr->ex.invalidate_rkey); 435 + break; 436 + default: 437 + return -EINVAL; 438 + } 439 + 440 + roce_set_field(rc_sq_wqe->byte_4, V2_RC_SEND_WQE_BYTE_4_OPCODE_M, 441 + V2_RC_SEND_WQE_BYTE_4_OPCODE_S, to_hr_opcode(ib_op)); 442 + 443 + return 0; 444 + } 525 445 static inline int set_rc_wqe(struct hns_roce_qp *qp, 526 446 const struct ib_send_wr *wr, 527 447 void *wqe, unsigned int *sge_idx, ··· 571 411 unsigned int curr_idx = *sge_idx; 572 412 unsigned int valid_num_sge; 573 413 u32 msg_len = 0; 574 - int ret = 0; 414 + int ret; 575 415 576 416 valid_num_sge = calc_wr_sge_num(wr, &msg_len); 
577 417 memset(rc_sq_wqe, 0, sizeof(*rc_sq_wqe)); 578 418 579 419 rc_sq_wqe->msg_len = cpu_to_le32(msg_len); 580 420 581 - switch (wr->opcode) { 582 - case IB_WR_SEND_WITH_IMM: 583 - case IB_WR_RDMA_WRITE_WITH_IMM: 584 - rc_sq_wqe->immtdata = cpu_to_le32(be32_to_cpu(wr->ex.imm_data)); 585 - break; 586 - case IB_WR_SEND_WITH_INV: 587 - rc_sq_wqe->inv_key = cpu_to_le32(wr->ex.invalidate_rkey); 588 - break; 589 - default: 590 - rc_sq_wqe->immtdata = 0; 591 - break; 592 - } 421 + ret = set_rc_opcode(rc_sq_wqe, wr); 422 + if (WARN_ON(ret)) 423 + return ret; 593 424 594 425 roce_set_bit(rc_sq_wqe->byte_4, V2_RC_SEND_WQE_BYTE_4_FENCE_S, 595 426 (wr->send_flags & IB_SEND_FENCE) ? 1 : 0); ··· 593 442 594 443 roce_set_bit(rc_sq_wqe->byte_4, V2_RC_SEND_WQE_BYTE_4_OWNER_S, 595 444 owner_bit); 596 - 597 - switch (wr->opcode) { 598 - case IB_WR_RDMA_READ: 599 - case IB_WR_RDMA_WRITE: 600 - case IB_WR_RDMA_WRITE_WITH_IMM: 601 - rc_sq_wqe->rkey = cpu_to_le32(rdma_wr(wr)->rkey); 602 - rc_sq_wqe->va = cpu_to_le64(rdma_wr(wr)->remote_addr); 603 - break; 604 - case IB_WR_LOCAL_INV: 605 - roce_set_bit(rc_sq_wqe->byte_4, V2_RC_SEND_WQE_BYTE_4_SO_S, 1); 606 - rc_sq_wqe->inv_key = cpu_to_le32(wr->ex.invalidate_rkey); 607 - break; 608 - case IB_WR_REG_MR: 609 - set_frmr_seg(rc_sq_wqe, reg_wr(wr)); 610 - break; 611 - case IB_WR_ATOMIC_CMP_AND_SWP: 612 - case IB_WR_ATOMIC_FETCH_AND_ADD: 613 - rc_sq_wqe->rkey = cpu_to_le32(atomic_wr(wr)->rkey); 614 - rc_sq_wqe->va = cpu_to_le64(atomic_wr(wr)->remote_addr); 615 - break; 616 - default: 617 - break; 618 - } 619 - 620 - roce_set_field(rc_sq_wqe->byte_4, V2_RC_SEND_WQE_BYTE_4_OPCODE_M, 621 - V2_RC_SEND_WQE_BYTE_4_OPCODE_S, 622 - to_hr_opcode(wr->opcode)); 623 445 624 446 if (wr->opcode == IB_WR_ATOMIC_CMP_AND_SWP || 625 447 wr->opcode == IB_WR_ATOMIC_FETCH_AND_ADD) ··· 1806 1682 caps->max_sq_desc_sz = HNS_ROCE_V2_MAX_SQ_DESC_SZ; 1807 1683 caps->max_rq_desc_sz = HNS_ROCE_V2_MAX_RQ_DESC_SZ; 1808 1684 caps->max_srq_desc_sz = 
HNS_ROCE_V2_MAX_SRQ_DESC_SZ; 1809 - caps->qpc_entry_sz = HNS_ROCE_V2_QPC_ENTRY_SZ; 1685 + caps->qpc_sz = HNS_ROCE_V2_QPC_SZ; 1810 1686 caps->irrl_entry_sz = HNS_ROCE_V2_IRRL_ENTRY_SZ; 1811 1687 caps->trrl_entry_sz = HNS_ROCE_V2_EXT_ATOMIC_TRRL_ENTRY_SZ; 1812 1688 caps->cqc_entry_sz = HNS_ROCE_V2_CQC_ENTRY_SZ; ··· 1814 1690 caps->mtpt_entry_sz = HNS_ROCE_V2_MTPT_ENTRY_SZ; 1815 1691 caps->mtt_entry_sz = HNS_ROCE_V2_MTT_ENTRY_SZ; 1816 1692 caps->idx_entry_sz = HNS_ROCE_V2_IDX_ENTRY_SZ; 1817 - caps->cq_entry_sz = HNS_ROCE_V2_CQE_ENTRY_SIZE; 1693 + caps->cqe_sz = HNS_ROCE_V2_CQE_SIZE; 1818 1694 caps->page_size_cap = HNS_ROCE_V2_PAGE_SIZE_SUPPORTED; 1819 1695 caps->reserved_lkey = 0; 1820 1696 caps->reserved_pds = 0; ··· 1863 1739 caps->gid_table_len[0] = HNS_ROCE_V2_GID_INDEX_NUM; 1864 1740 caps->ceqe_depth = HNS_ROCE_V2_COMP_EQE_NUM; 1865 1741 caps->aeqe_depth = HNS_ROCE_V2_ASYNC_EQE_NUM; 1742 + caps->aeqe_size = HNS_ROCE_AEQE_SIZE; 1743 + caps->ceqe_size = HNS_ROCE_CEQE_SIZE; 1866 1744 caps->local_ca_ack_delay = 0; 1867 1745 caps->max_mtu = IB_MTU_4096; 1868 1746 ··· 1886 1760 caps->cqc_timer_buf_pg_sz = 0; 1887 1761 caps->cqc_timer_hop_num = HNS_ROCE_HOP_NUM_0; 1888 1762 1889 - caps->sccc_entry_sz = HNS_ROCE_V2_SCCC_ENTRY_SZ; 1763 + caps->sccc_sz = HNS_ROCE_V2_SCCC_SZ; 1890 1764 caps->sccc_ba_pg_sz = 0; 1891 1765 caps->sccc_buf_pg_sz = 0; 1892 1766 caps->sccc_hop_num = HNS_ROCE_SCCC_HOP_NUM; 1767 + 1768 + if (hr_dev->pci_dev->revision >= PCI_REVISION_ID_HIP09) { 1769 + caps->aeqe_size = HNS_ROCE_V3_EQE_SIZE; 1770 + caps->ceqe_size = HNS_ROCE_V3_EQE_SIZE; 1771 + caps->cqe_sz = HNS_ROCE_V3_CQE_SIZE; 1772 + caps->qpc_sz = HNS_ROCE_V3_QPC_SZ; 1773 + } 1893 1774 } 1894 1775 1895 1776 static void calc_pg_sz(int obj_num, int obj_size, int hop_num, int ctx_bt_num, 1896 1777 int *buf_page_size, int *bt_page_size, u32 hem_type) 1897 1778 { 1898 1779 u64 obj_per_chunk; 1899 - int bt_chunk_size = 1 << PAGE_SHIFT; 1900 - int buf_chunk_size = 1 << PAGE_SHIFT; 1901 - int 
obj_per_chunk_default = buf_chunk_size / obj_size; 1780 + u64 bt_chunk_size = PAGE_SIZE; 1781 + u64 buf_chunk_size = PAGE_SIZE; 1782 + u64 obj_per_chunk_default = buf_chunk_size / obj_size; 1902 1783 1903 1784 *buf_page_size = 0; 1904 1785 *bt_page_size = 0; ··· 1988 1855 caps->max_sq_desc_sz = resp_a->max_sq_desc_sz; 1989 1856 caps->max_rq_desc_sz = resp_a->max_rq_desc_sz; 1990 1857 caps->max_srq_desc_sz = resp_a->max_srq_desc_sz; 1991 - caps->cq_entry_sz = resp_a->cq_entry_sz; 1858 + caps->cqe_sz = HNS_ROCE_V2_CQE_SIZE; 1992 1859 1993 1860 caps->mtpt_entry_sz = resp_b->mtpt_entry_sz; 1994 1861 caps->irrl_entry_sz = resp_b->irrl_entry_sz; ··· 1996 1863 caps->cqc_entry_sz = resp_b->cqc_entry_sz; 1997 1864 caps->srqc_entry_sz = resp_b->srqc_entry_sz; 1998 1865 caps->idx_entry_sz = resp_b->idx_entry_sz; 1999 - caps->sccc_entry_sz = resp_b->scc_ctx_entry_sz; 1866 + caps->sccc_sz = resp_b->sccc_sz; 2000 1867 caps->max_mtu = resp_b->max_mtu; 2001 - caps->qpc_entry_sz = le16_to_cpu(resp_b->qpc_entry_sz); 1868 + caps->qpc_sz = HNS_ROCE_V2_QPC_SZ; 2002 1869 caps->min_cqes = resp_b->min_cqes; 2003 1870 caps->min_wqes = resp_b->min_wqes; 2004 1871 caps->page_size_cap = le32_to_cpu(resp_b->page_size_cap); ··· 2091 1958 caps->cqc_timer_entry_sz = HNS_ROCE_V2_CQC_TIMER_ENTRY_SZ; 2092 1959 caps->mtt_entry_sz = HNS_ROCE_V2_MTT_ENTRY_SZ; 2093 1960 caps->num_mtt_segs = HNS_ROCE_V2_MAX_MTT_SEGS; 1961 + caps->ceqe_size = HNS_ROCE_CEQE_SIZE; 1962 + caps->aeqe_size = HNS_ROCE_AEQE_SIZE; 2094 1963 caps->mtt_ba_pg_sz = 0; 2095 1964 caps->num_cqe_segs = HNS_ROCE_V2_MAX_CQE_SEGS; 2096 1965 caps->num_srqwqe_segs = HNS_ROCE_V2_MAX_SRQWQE_SEGS; ··· 2116 1981 V2_QUERY_PF_CAPS_D_RQWQE_HOP_NUM_M, 2117 1982 V2_QUERY_PF_CAPS_D_RQWQE_HOP_NUM_S); 2118 1983 2119 - calc_pg_sz(caps->num_qps, caps->qpc_entry_sz, caps->qpc_hop_num, 1984 + if (hr_dev->pci_dev->revision >= PCI_REVISION_ID_HIP09) { 1985 + caps->ceqe_size = HNS_ROCE_V3_EQE_SIZE; 1986 + caps->aeqe_size = HNS_ROCE_V3_EQE_SIZE; 1987 + 
caps->cqe_sz = HNS_ROCE_V3_CQE_SIZE; 1988 + caps->qpc_sz = HNS_ROCE_V3_QPC_SZ; 1989 + caps->sccc_sz = HNS_ROCE_V3_SCCC_SZ; 1990 + } 1991 + 1992 + calc_pg_sz(caps->num_qps, caps->qpc_sz, caps->qpc_hop_num, 2120 1993 caps->qpc_bt_num, &caps->qpc_buf_pg_sz, &caps->qpc_ba_pg_sz, 2121 1994 HEM_TYPE_QPC); 2122 1995 calc_pg_sz(caps->num_mtpts, caps->mtpt_entry_sz, caps->mpt_hop_num, ··· 2141 1998 caps->qpc_timer_hop_num = HNS_ROCE_HOP_NUM_0; 2142 1999 caps->cqc_timer_hop_num = HNS_ROCE_HOP_NUM_0; 2143 2000 2144 - calc_pg_sz(caps->num_qps, caps->sccc_entry_sz, 2001 + calc_pg_sz(caps->num_qps, caps->sccc_sz, 2145 2002 caps->sccc_hop_num, caps->sccc_bt_num, 2146 2003 &caps->sccc_buf_pg_sz, &caps->sccc_ba_pg_sz, 2147 2004 HEM_TYPE_SCCC); ··· 2159 2016 1, &caps->idx_buf_pg_sz, &caps->idx_ba_pg_sz, HEM_TYPE_IDX); 2160 2017 2161 2018 return 0; 2019 + } 2020 + 2021 + static int hns_roce_config_qpc_size(struct hns_roce_dev *hr_dev) 2022 + { 2023 + struct hns_roce_cmq_desc desc; 2024 + struct hns_roce_cfg_entry_size *cfg_size = 2025 + (struct hns_roce_cfg_entry_size *)desc.data; 2026 + 2027 + hns_roce_cmq_setup_basic_desc(&desc, HNS_ROCE_OPC_CFG_ENTRY_SIZE, 2028 + false); 2029 + 2030 + cfg_size->type = cpu_to_le32(HNS_ROCE_CFG_QPC_SIZE); 2031 + cfg_size->size = cpu_to_le32(hr_dev->caps.qpc_sz); 2032 + 2033 + return hns_roce_cmq_send(hr_dev, &desc, 1); 2034 + } 2035 + 2036 + static int hns_roce_config_sccc_size(struct hns_roce_dev *hr_dev) 2037 + { 2038 + struct hns_roce_cmq_desc desc; 2039 + struct hns_roce_cfg_entry_size *cfg_size = 2040 + (struct hns_roce_cfg_entry_size *)desc.data; 2041 + 2042 + hns_roce_cmq_setup_basic_desc(&desc, HNS_ROCE_OPC_CFG_ENTRY_SIZE, 2043 + false); 2044 + 2045 + cfg_size->type = cpu_to_le32(HNS_ROCE_CFG_SCCC_SIZE); 2046 + cfg_size->size = cpu_to_le32(hr_dev->caps.sccc_sz); 2047 + 2048 + return hns_roce_cmq_send(hr_dev, &desc, 1); 2049 + } 2050 + 2051 + static int hns_roce_config_entry_size(struct hns_roce_dev *hr_dev) 2052 + { 2053 + int ret; 2054 + 
2055 + if (hr_dev->pci_dev->revision < PCI_REVISION_ID_HIP09) 2056 + return 0; 2057 + 2058 + ret = hns_roce_config_qpc_size(hr_dev); 2059 + if (ret) { 2060 + dev_err(hr_dev->dev, "failed to cfg qpc sz, ret = %d.\n", ret); 2061 + return ret; 2062 + } 2063 + 2064 + ret = hns_roce_config_sccc_size(hr_dev); 2065 + if (ret) 2066 + dev_err(hr_dev->dev, "failed to cfg sccc sz, ret = %d.\n", ret); 2067 + 2068 + return ret; 2162 2069 } 2163 2070 2164 2071 static int hns_roce_v2_profile(struct hns_roce_dev *hr_dev) ··· 2283 2090 } 2284 2091 2285 2092 ret = hns_roce_v2_set_bt(hr_dev); 2286 - if (ret) 2287 - dev_err(hr_dev->dev, "Configure bt attribute fail, ret = %d.\n", 2288 - ret); 2093 + if (ret) { 2094 + dev_err(hr_dev->dev, 2095 + "Configure bt attribute fail, ret = %d.\n", ret); 2096 + return ret; 2097 + } 2098 + 2099 + /* Configure the size of QPC, SCCC, etc. */ 2100 + ret = hns_roce_config_entry_size(hr_dev); 2289 2101 2290 2102 return ret; 2291 2103 } ··· 2955 2757 2956 2758 static void *get_cqe_v2(struct hns_roce_cq *hr_cq, int n) 2957 2759 { 2958 - return hns_roce_buf_offset(hr_cq->mtr.kmem, 2959 - n * HNS_ROCE_V2_CQE_ENTRY_SIZE); 2760 + return hns_roce_buf_offset(hr_cq->mtr.kmem, n * hr_cq->cqe_size); 2960 2761 } 2961 2762 2962 2763 static void *get_sw_cqe_v2(struct hns_roce_cq *hr_cq, int n) ··· 3054 2857 3055 2858 roce_set_field(cq_context->byte_8_cqn, V2_CQC_BYTE_8_CQN_M, 3056 2859 V2_CQC_BYTE_8_CQN_S, hr_cq->cqn); 2860 + 2861 + roce_set_field(cq_context->byte_8_cqn, V2_CQC_BYTE_8_CQE_SIZE_M, 2862 + V2_CQC_BYTE_8_CQE_SIZE_S, hr_cq->cqe_size == 2863 + HNS_ROCE_V3_CQE_SIZE ? 
1 : 0); 3057 2864 3058 2865 cq_context->cqe_cur_blk_addr = cpu_to_le32(to_hr_hw_page_addr(mtts[0])); 3059 2866 ··· 3226 3025 } 3227 3026 3228 3027 static void get_cqe_status(struct hns_roce_dev *hr_dev, struct hns_roce_qp *qp, 3229 - struct hns_roce_v2_cqe *cqe, struct ib_wc *wc) 3028 + struct hns_roce_cq *cq, struct hns_roce_v2_cqe *cqe, 3029 + struct ib_wc *wc) 3230 3030 { 3231 3031 static const struct { 3232 3032 u32 cqe_status; ··· 3268 3066 3269 3067 ibdev_err(&hr_dev->ib_dev, "error cqe status 0x%x:\n", cqe_status); 3270 3068 print_hex_dump(KERN_ERR, "", DUMP_PREFIX_NONE, 16, 4, cqe, 3271 - sizeof(*cqe), false); 3069 + cq->cqe_size, false); 3272 3070 3273 3071 /* 3274 3072 * For hns ROCEE, GENERAL_ERR is an error type that is not defined in ··· 3365 3163 ++wq->tail; 3366 3164 } 3367 3165 3368 - get_cqe_status(hr_dev, *cur_qp, cqe, wc); 3166 + get_cqe_status(hr_dev, *cur_qp, hr_cq, cqe, wc); 3369 3167 if (unlikely(wc->status != IB_WC_SUCCESS)) 3370 3168 return 0; 3371 3169 ··· 3716 3514 3717 3515 static int hns_roce_v2_qp_modify(struct hns_roce_dev *hr_dev, 3718 3516 struct hns_roce_v2_qp_context *context, 3517 + struct hns_roce_v2_qp_context *qpc_mask, 3719 3518 struct hns_roce_qp *hr_qp) 3720 3519 { 3721 3520 struct hns_roce_cmd_mailbox *mailbox; 3521 + int qpc_size; 3722 3522 int ret; 3723 3523 3724 3524 mailbox = hns_roce_alloc_cmd_mailbox(hr_dev); 3725 3525 if (IS_ERR(mailbox)) 3726 3526 return PTR_ERR(mailbox); 3727 3527 3728 - memcpy(mailbox->buf, context, sizeof(*context) * 2); 3528 + /* The qpc size of HIP08 is only 256B, which is half of HIP09 */ 3529 + qpc_size = hr_dev->caps.qpc_sz; 3530 + memcpy(mailbox->buf, context, qpc_size); 3531 + memcpy(mailbox->buf + qpc_size, qpc_mask, qpc_size); 3729 3532 3730 3533 ret = hns_roce_cmd_mbox(hr_dev, mailbox->dma, 0, hr_qp->qpn, 0, 3731 3534 HNS_ROCE_CMD_MODIFY_QPC, ··· 3847 3640 roce_set_bit(context->byte_76_srqn_op_en, 3848 3641 V2_QPC_BYTE_76_SRQ_EN_S, 1); 3849 3642 } 3850 - 3851 - 
roce_set_field(context->byte_172_sq_psn, V2_QPC_BYTE_172_ACK_REQ_FREQ_M, 3852 - V2_QPC_BYTE_172_ACK_REQ_FREQ_S, 4); 3853 3643 3854 3644 roce_set_bit(context->byte_172_sq_psn, V2_QPC_BYTE_172_FRE_S, 1); 3855 3645 ··· 4158 3954 dma_addr_t trrl_ba; 4159 3955 dma_addr_t irrl_ba; 4160 3956 enum ib_mtu mtu; 3957 + u8 lp_pktn_ini; 4161 3958 u8 port_num; 4162 3959 u64 *mtts; 4163 3960 u8 *dmac; ··· 4257 4052 V2_QPC_BYTE_52_DMAC_S, 0); 4258 4053 4259 4054 mtu = get_mtu(ibqp, attr); 4055 + hr_qp->path_mtu = mtu; 4260 4056 4261 4057 if (attr_mask & IB_QP_PATH_MTU) { 4262 4058 roce_set_field(context->byte_24_mtu_tc, V2_QPC_BYTE_24_MTU_M, ··· 4267 4061 } 4268 4062 4269 4063 #define MAX_LP_MSG_LEN 65536 4270 - /* MTU*(2^LP_PKTN_INI) shouldn't be bigger than 64kb */ 4064 + /* MTU * (2 ^ LP_PKTN_INI) shouldn't be bigger than 64KB */ 4065 + lp_pktn_ini = ilog2(MAX_LP_MSG_LEN / ib_mtu_enum_to_int(mtu)); 4066 + 4271 4067 roce_set_field(context->byte_56_dqpn_err, V2_QPC_BYTE_56_LP_PKTN_INI_M, 4272 - V2_QPC_BYTE_56_LP_PKTN_INI_S, 4273 - ilog2(MAX_LP_MSG_LEN / ib_mtu_enum_to_int(mtu))); 4068 + V2_QPC_BYTE_56_LP_PKTN_INI_S, lp_pktn_ini); 4274 4069 roce_set_field(qpc_mask->byte_56_dqpn_err, V2_QPC_BYTE_56_LP_PKTN_INI_M, 4275 4070 V2_QPC_BYTE_56_LP_PKTN_INI_S, 0); 4071 + 4072 + /* ACK_REQ_FREQ should be larger than or equal to LP_PKTN_INI */ 4073 + roce_set_field(context->byte_172_sq_psn, V2_QPC_BYTE_172_ACK_REQ_FREQ_M, 4074 + V2_QPC_BYTE_172_ACK_REQ_FREQ_S, lp_pktn_ini); 4075 + roce_set_field(qpc_mask->byte_172_sq_psn, 4076 + V2_QPC_BYTE_172_ACK_REQ_FREQ_M, 4077 + V2_QPC_BYTE_172_ACK_REQ_FREQ_S, 0); 4276 4078 4277 4079 roce_set_bit(qpc_mask->byte_108_rx_reqepsn, 4278 4080 V2_QPC_BYTE_108_RX_REQ_PSN_ERR_S, 0); ··· 4378 4164 return 0; 4379 4165 } 4380 4166 4167 + static inline u16 get_udp_sport(u32 fl, u32 lqpn, u32 rqpn) 4168 + { 4169 + if (!fl) 4170 + fl = rdma_calc_flow_label(lqpn, rqpn); 4171 + 4172 + return rdma_flow_label_to_udp_sport(fl); 4173 + } 4174 + 4381 4175 static int 
hns_roce_v2_set_path(struct ib_qp *ibqp, 4382 4176 const struct ib_qp_attr *attr, 4383 4177 int attr_mask, ··· 4449 4227 4450 4228 roce_set_field(context->byte_52_udpspn_dmac, V2_QPC_BYTE_52_UDPSPN_M, 4451 4229 V2_QPC_BYTE_52_UDPSPN_S, 4452 - is_udp ? 0x12b7 : 0); 4230 + is_udp ? get_udp_sport(grh->flow_label, ibqp->qp_num, 4231 + attr->dest_qp_num) : 0); 4453 4232 4454 4233 roce_set_field(qpc_mask->byte_52_udpspn_dmac, V2_QPC_BYTE_52_UDPSPN_M, 4455 4234 V2_QPC_BYTE_52_UDPSPN_S, 0); ··· 4482 4259 V2_QPC_BYTE_28_FL_S, 0); 4483 4260 memcpy(context->dgid, grh->dgid.raw, sizeof(grh->dgid.raw)); 4484 4261 memset(qpc_mask->dgid, 0, sizeof(grh->dgid.raw)); 4262 + 4263 + hr_qp->sl = rdma_ah_get_sl(&attr->ah_attr); 4264 + if (unlikely(hr_qp->sl > MAX_SERVICE_LEVEL)) { 4265 + ibdev_err(ibdev, 4266 + "failed to fill QPC, sl (%d) shouldn't be larger than %d.\n", 4267 + hr_qp->sl, MAX_SERVICE_LEVEL); 4268 + return -EINVAL; 4269 + } 4270 + 4485 4271 roce_set_field(context->byte_28_at_fl, V2_QPC_BYTE_28_SL_M, 4486 - V2_QPC_BYTE_28_SL_S, rdma_ah_get_sl(&attr->ah_attr)); 4272 + V2_QPC_BYTE_28_SL_S, hr_qp->sl); 4487 4273 roce_set_field(qpc_mask->byte_28_at_fl, V2_QPC_BYTE_28_SL_M, 4488 4274 V2_QPC_BYTE_28_SL_S, 0); 4489 - hr_qp->sl = rdma_ah_get_sl(&attr->ah_attr); 4490 4275 4491 4276 return 0; 4492 4277 } ··· 4540 4309 } 4541 4310 4542 4311 if (cur_state == IB_QPS_RESET && new_state == IB_QPS_INIT) { 4543 - memset(qpc_mask, 0, sizeof(*qpc_mask)); 4312 + memset(qpc_mask, 0, hr_dev->caps.qpc_sz); 4544 4313 modify_qp_reset_to_init(ibqp, attr, attr_mask, context, 4545 4314 qpc_mask); 4546 4315 } else if (cur_state == IB_QPS_INIT && new_state == IB_QPS_INIT) { ··· 4763 4532 * we should set all bits of the relevant fields in context mask to 4764 4533 * 0 at the same time, else set them to 0x1. 
4765 4534 */ 4766 - memset(context, 0, sizeof(*context)); 4767 - memset(qpc_mask, 0xff, sizeof(*qpc_mask)); 4535 + memset(context, 0, hr_dev->caps.qpc_sz); 4536 + memset(qpc_mask, 0xff, hr_dev->caps.qpc_sz); 4537 + 4768 4538 ret = hns_roce_v2_set_abs_fields(ibqp, attr, attr_mask, cur_state, 4769 4539 new_state, context, qpc_mask); 4770 4540 if (ret) ··· 4815 4583 V2_QPC_BYTE_60_QP_ST_S, 0); 4816 4584 4817 4585 /* SW pass context to HW */ 4818 - ret = hns_roce_v2_qp_modify(hr_dev, ctx, hr_qp); 4586 + ret = hns_roce_v2_qp_modify(hr_dev, context, qpc_mask, hr_qp); 4819 4587 if (ret) { 4820 4588 ibdev_err(ibdev, "failed to modify QP, ret = %d\n", ret); 4821 4589 goto out; ··· 4878 4646 if (ret) 4879 4647 goto out; 4880 4648 4881 - memcpy(hr_context, mailbox->buf, sizeof(*hr_context)); 4649 + memcpy(hr_context, mailbox->buf, hr_dev->caps.qpc_sz); 4882 4650 4883 4651 out: 4884 4652 hns_roce_free_cmd_mailbox(hr_dev, mailbox); ··· 4991 4759 qp_attr->retry_cnt = roce_get_field(context.byte_212_lsn, 4992 4760 V2_QPC_BYTE_212_RETRY_CNT_M, 4993 4761 V2_QPC_BYTE_212_RETRY_CNT_S); 4994 - qp_attr->rnr_retry = le32_to_cpu(context.rq_rnr_timer); 4762 + qp_attr->rnr_retry = roce_get_field(context.byte_244_rnr_rxack, 4763 + V2_QPC_BYTE_244_RNR_CNT_M, 4764 + V2_QPC_BYTE_244_RNR_CNT_S); 4995 4765 4996 4766 done: 4997 4767 qp_attr->cur_qp_state = qp_attr->qp_state; ··· 5009 4775 } 5010 4776 5011 4777 qp_init_attr->cap = qp_attr->cap; 4778 + qp_init_attr->sq_sig_type = hr_qp->sq_signal_bits; 5012 4779 5013 4780 out: 5014 4781 mutex_unlock(&hr_qp->mutex); ··· 5238 5003 struct hns_roce_srq_context *srqc_mask; 5239 5004 struct hns_roce_cmd_mailbox *mailbox; 5240 5005 int ret; 5006 + 5007 + /* Resizing SRQs is not supported yet */ 5008 + if (srq_attr_mask & IB_SRQ_MAX_WR) 5009 + return -EINVAL; 5241 5010 5242 5011 if (srq_attr_mask & IB_SRQ_LIMIT) { 5243 5012 if (srq_attr->srq_limit >= srq->wqe_cnt) ··· 5472 5233 5473 5234 aeqe = hns_roce_buf_offset(eq->mtr.kmem, 5474 5235 (eq->cons_index & 
(eq->entries - 1)) * 5475 - HNS_ROCE_AEQ_ENTRY_SIZE); 5236 + eq->eqe_size); 5476 5237 5477 5238 return (roce_get_bit(aeqe->asyn, HNS_ROCE_V2_AEQ_AEQE_OWNER_S) ^ 5478 5239 !!(eq->cons_index & eq->entries)) ? aeqe : NULL; ··· 5572 5333 5573 5334 ceqe = hns_roce_buf_offset(eq->mtr.kmem, 5574 5335 (eq->cons_index & (eq->entries - 1)) * 5575 - HNS_ROCE_CEQ_ENTRY_SIZE); 5336 + eq->eqe_size); 5337 + 5576 5338 return (!!(roce_get_bit(ceqe->comp, HNS_ROCE_V2_CEQ_CEQE_OWNER_S))) ^ 5577 5339 (!!(eq->cons_index & eq->entries)) ? ceqe : NULL; 5578 5340 } ··· 5614 5374 { 5615 5375 struct hns_roce_eq *eq = eq_ptr; 5616 5376 struct hns_roce_dev *hr_dev = eq->hr_dev; 5617 - int int_work = 0; 5377 + int int_work; 5618 5378 5619 5379 if (eq->type_flag == HNS_ROCE_CEQ) 5620 5380 /* Completion event interrupt */ ··· 5849 5609 roce_set_field(eqc->byte_36, HNS_ROCE_EQC_CONS_INDX_M, 5850 5610 HNS_ROCE_EQC_CONS_INDX_S, HNS_ROCE_EQ_INIT_CONS_IDX); 5851 5611 5852 - /* set nex_eqe_ba[43:12] */ 5853 - roce_set_field(eqc->nxt_eqe_ba0, HNS_ROCE_EQC_NXT_EQE_BA_L_M, 5612 + roce_set_field(eqc->byte_40, HNS_ROCE_EQC_NXT_EQE_BA_L_M, 5854 5613 HNS_ROCE_EQC_NXT_EQE_BA_L_S, eqe_ba[1] >> 12); 5855 5614 5856 - /* set nex_eqe_ba[63:44] */ 5857 - roce_set_field(eqc->nxt_eqe_ba1, HNS_ROCE_EQC_NXT_EQE_BA_H_M, 5615 + roce_set_field(eqc->byte_44, HNS_ROCE_EQC_NXT_EQE_BA_H_M, 5858 5616 HNS_ROCE_EQC_NXT_EQE_BA_H_S, eqe_ba[1] >> 44); 5617 + 5618 + roce_set_field(eqc->byte_44, HNS_ROCE_EQC_EQE_SIZE_M, 5619 + HNS_ROCE_EQC_EQE_SIZE_S, 5620 + eq->eqe_size == HNS_ROCE_V3_EQE_SIZE ? 
1 : 0); 5859 5621 5860 5622 return 0; 5861 5623 } ··· 6049 5807 eq_cmd = HNS_ROCE_CMD_CREATE_CEQC; 6050 5808 eq->type_flag = HNS_ROCE_CEQ; 6051 5809 eq->entries = hr_dev->caps.ceqe_depth; 6052 - eq->eqe_size = HNS_ROCE_CEQ_ENTRY_SIZE; 5810 + eq->eqe_size = hr_dev->caps.ceqe_size; 6053 5811 eq->irq = hr_dev->irq[i + other_num + aeq_num]; 6054 5812 eq->eq_max_cnt = HNS_ROCE_CEQ_DEFAULT_BURST_NUM; 6055 5813 eq->eq_period = HNS_ROCE_CEQ_DEFAULT_INTERVAL; ··· 6058 5816 eq_cmd = HNS_ROCE_CMD_CREATE_AEQC; 6059 5817 eq->type_flag = HNS_ROCE_AEQ; 6060 5818 eq->entries = hr_dev->caps.aeqe_depth; 6061 - eq->eqe_size = HNS_ROCE_AEQ_ENTRY_SIZE; 5819 + eq->eqe_size = hr_dev->caps.aeqe_size; 6062 5820 eq->irq = hr_dev->irq[i - comp_num + other_num]; 6063 5821 eq->eq_max_cnt = HNS_ROCE_AEQ_DEFAULT_BURST_NUM; 6064 5822 eq->eq_period = HNS_ROCE_AEQ_DEFAULT_INTERVAL;
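Note on the LP_PKTN_INI / ACK_REQ_FREQ hunk in hns_roce_hw_v2.c above: the value written to both fields is ilog2(MAX_LP_MSG_LEN / MTU), so that MTU * 2^LP_PKTN_INI never exceeds 64KB. The following is a minimal userspace sketch of that arithmetic only. The helper names `ilog2_u32` and `lp_pktn_ini_for_mtu` are hypothetical stand-ins, not driver functions, and the MTU is passed in bytes rather than as an `enum ib_mtu`.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_LP_MSG_LEN 65536u	/* same limit as in the diff: 64KB */

/* Hypothetical stand-in for the kernel's ilog2(): floor(log2(v)), v > 0. */
static unsigned int ilog2_u32(uint32_t v)
{
	unsigned int r = 0;

	while (v >>= 1)
		r++;
	return r;
}

/*
 * Hypothetical helper: largest exponent n such that
 * mtu_bytes * 2^n <= MAX_LP_MSG_LEN, for a power-of-two MTU.
 */
static unsigned int lp_pktn_ini_for_mtu(uint32_t mtu_bytes)
{
	return ilog2_u32(MAX_LP_MSG_LEN / mtu_bytes);
}
```

With the standard IB MTU sizes (256 to 4096 bytes) this yields values from 8 down to 4. The diff reuses the same value for ACK_REQ_FREQ, which satisfies the stated "larger than or equal to LP_PKTN_INI" constraint trivially.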
+35 -8
drivers/infiniband/hw/hns/hns_roce_hw_v2.h
···
 #define HNS_ROCE_V2_MAX_SQ_SGE_NUM	64
 #define HNS_ROCE_V2_MAX_EXTEND_SGE_NUM	0x200000
 #define HNS_ROCE_V2_MAX_SQ_INLINE	0x20
+#define HNS_ROCE_V2_MAX_RC_INL_INN_SZ	32
 #define HNS_ROCE_V2_UAR_NUM	256
 #define HNS_ROCE_V2_PHY_UAR_NUM	1
 #define HNS_ROCE_V2_MAX_IRQ_NUM	65
···
 #define HNS_ROCE_V2_MAX_SQ_DESC_SZ	64
 #define HNS_ROCE_V2_MAX_RQ_DESC_SZ	16
 #define HNS_ROCE_V2_MAX_SRQ_DESC_SZ	64
-#define HNS_ROCE_V2_QPC_ENTRY_SZ	256
 #define HNS_ROCE_V2_IRRL_ENTRY_SZ	64
 #define HNS_ROCE_V2_TRRL_ENTRY_SZ	48
 #define HNS_ROCE_V2_EXT_ATOMIC_TRRL_ENTRY_SZ	100
···
 #define HNS_ROCE_V2_MTPT_ENTRY_SZ	64
 #define HNS_ROCE_V2_MTT_ENTRY_SZ	64
 #define HNS_ROCE_V2_IDX_ENTRY_SZ	4
-#define HNS_ROCE_V2_CQE_ENTRY_SIZE	32
-#define HNS_ROCE_V2_SCCC_ENTRY_SZ	32
+
+#define HNS_ROCE_V2_SCCC_SZ	32
+#define HNS_ROCE_V3_SCCC_SZ	64
+
 #define HNS_ROCE_V2_QPC_TIMER_ENTRY_SZ	PAGE_SIZE
 #define HNS_ROCE_V2_CQC_TIMER_ENTRY_SZ	PAGE_SIZE
 #define HNS_ROCE_V2_PAGE_SIZE_SUPPORTED	0xFFFFF000
···
 	HNS_ROCE_OPC_CFG_TMOUT_LLM = 0x8404,
 	HNS_ROCE_OPC_QUERY_PF_TIMER_RES = 0x8406,
 	HNS_ROCE_OPC_QUERY_PF_CAPS_NUM = 0x8408,
+	HNS_ROCE_OPC_CFG_ENTRY_SIZE = 0x8409,
 	HNS_ROCE_OPC_CFG_SGID_TB = 0x8500,
 	HNS_ROCE_OPC_CFG_SMAC_TB = 0x8501,
 	HNS_ROCE_OPC_POST_MB = 0x8504,
···

 #define V2_CQC_BYTE_8_CQN_S 0
 #define V2_CQC_BYTE_8_CQN_M GENMASK(23, 0)
+
+#define V2_CQC_BYTE_8_CQE_SIZE_S 27
+#define V2_CQC_BYTE_8_CQE_SIZE_M GENMASK(28, 27)

 #define V2_CQC_BYTE_16_CQE_CUR_BLK_ADDR_S 0
 #define V2_CQC_BYTE_16_CQE_CUR_BLK_ADDR_M GENMASK(19, 0)
···
 	__le32 byte_248_ack_psn;
 	__le32 byte_252_err_txcqn;
 	__le32 byte_256_sqflush_rqcqe;
+	__le32 ext[64];
 };

 #define V2_QPC_BYTE_4_TST_S 0
···
 	u8 smac[4];
 	__le32 byte_28;
 	__le32 byte_32;
+	__le32 rsv[8];
 };

 #define V2_CQE_BYTE_4_OPCODE_S 0
···

 #define V2_RC_SEND_WQE_BYTE_20_MSG_START_SGE_IDX_S 0
 #define V2_RC_SEND_WQE_BYTE_20_MSG_START_SGE_IDX_M GENMASK(23, 0)
+
+#define V2_RC_SEND_WQE_BYTE_20_INL_TYPE_S 31

 struct hns_roce_wqe_frmr_seg {
 	__le32 pbl_size;
···
 	__le32 vf_sgid_h;
 	__le32 vf_sgid_type_rsv;
 };
+
+enum {
+	HNS_ROCE_CFG_QPC_SIZE = BIT(0),
+	HNS_ROCE_CFG_SCCC_SIZE = BIT(1),
+};
+
+struct hns_roce_cfg_entry_size {
+	__le32 type;
+	__le32 rsv[4];
+	__le32 size;
+};
+
 #define CFG_SGID_TB_TABLE_IDX_S 0
 #define CFG_SGID_TB_TABLE_IDX_M GENMASK(7, 0)

···
 	u8 max_sq_desc_sz;
 	u8 max_rq_desc_sz;
 	u8 max_srq_desc_sz;
-	u8 cq_entry_sz;
+	u8 cqe_sz;
 };

 struct hns_roce_query_pf_caps_b {
···
 	u8 cqc_entry_sz;
 	u8 srqc_entry_sz;
 	u8 idx_entry_sz;
-	u8 scc_ctx_entry_sz;
+	u8 sccc_sz;
 	u8 max_mtu;
-	__le16 qpc_entry_sz;
+	__le16 qpc_sz;
 	__le16 qpc_timer_entry_sz;
 	__le16 cqc_timer_entry_sz;
 	u8 min_cqes;
···
 	__le32 byte_28;
 	__le32 byte_32;
 	__le32 byte_36;
-	__le32 nxt_eqe_ba0;
-	__le32 nxt_eqe_ba1;
+	__le32 byte_40;
+	__le32 byte_44;
 	__le32 rsv[5];
 };

···
 #define HNS_ROCE_EQC_NXT_EQE_BA_H_S 0
 #define HNS_ROCE_EQC_NXT_EQE_BA_H_M GENMASK(19, 0)

+#define HNS_ROCE_EQC_EQE_SIZE_S 20
+#define HNS_ROCE_EQC_EQE_SIZE_M GENMASK(21, 20)
+
 #define HNS_ROCE_V2_CEQE_COMP_CQN_S 0
 #define HNS_ROCE_V2_CEQE_COMP_CQN_M GENMASK(23, 0)

···

 #define HNS_ROCE_V2_AEQE_EVENT_QUEUE_NUM_S 0
 #define HNS_ROCE_V2_AEQE_EVENT_QUEUE_NUM_M GENMASK(23, 0)
+
+#define MAX_SERVICE_LEVEL 0x7

 struct hns_roce_wqe_atomic_seg {
 	__le64 fetchadd_swap_data;
+12 -7
drivers/infiniband/hw/hns/hns_roce_main.c
···
 	struct net_device *dev = netdev_notifier_info_to_dev(ptr);
 	struct hns_roce_ib_iboe *iboe = NULL;
 	struct hns_roce_dev *hr_dev = NULL;
-	u8 port = 0;
-	int ret = 0;
+	int ret;
+	u8 port;

 	hr_dev = container_of(self, struct hns_roce_dev, iboe.nb);
 	iboe = &hr_dev->iboe;
···
 		mutex_init(&context->page_mutex);
 	}

+	resp.cqe_size = hr_dev->caps.cqe_sz;
+
 	ret = ib_copy_to_udata(udata, &resp, sizeof(resp));
 	if (ret)
 		goto error_fail_copy_to_udata;
···
 static const struct ib_device_ops hns_roce_dev_mw_ops = {
 	.alloc_mw = hns_roce_alloc_mw,
 	.dealloc_mw = hns_roce_dealloc_mw,
+
+	INIT_RDMA_OBJ_SIZE(ib_mw, hns_roce_mw, ibmw),
 };

 static const struct ib_device_ops hns_roce_dev_frmr_ops = {
···
 		if (ret)
 			return ret;
 	}
-	ret = ib_register_device(ib_dev, "hns_%d");
+	dma_set_max_seg_size(dev, UINT_MAX);
+	ret = ib_register_device(ib_dev, "hns_%d", dev);
 	if (ret) {
 		dev_err(dev, "ib_register_device failed!\n");
 		return ret;
···
 	}

 	ret = hns_roce_init_hem_table(hr_dev, &hr_dev->qp_table.qp_table,
-				      HEM_TYPE_QPC, hr_dev->caps.qpc_entry_sz,
+				      HEM_TYPE_QPC, hr_dev->caps.qpc_sz,
 				      hr_dev->caps.num_qps, 1);
 	if (ret) {
 		dev_err(dev, "Failed to init QP context memory, aborting.\n");
···
 		}
 	}

-	if (hr_dev->caps.sccc_entry_sz) {
+	if (hr_dev->caps.sccc_sz) {
 		ret = hns_roce_init_hem_table(hr_dev,
 					      &hr_dev->qp_table.sccc_table,
 					      HEM_TYPE_SCCC,
-					      hr_dev->caps.sccc_entry_sz,
+					      hr_dev->caps.sccc_sz,
 					      hr_dev->caps.num_qps, 1);
 		if (ret) {
 			dev_err(dev,
···
 		hns_roce_cleanup_hem_table(hr_dev, &hr_dev->qpc_timer_table);

 err_unmap_ctx:
-	if (hr_dev->caps.sccc_entry_sz)
+	if (hr_dev->caps.sccc_sz)
 		hns_roce_cleanup_hem_table(hr_dev,
 					   &hr_dev->qp_table.sccc_table);
 err_unmap_srq:
+29 -52
drivers/infiniband/hw/hns/hns_roce_mr.c
···
 	return ret;
 }

-struct ib_mw *hns_roce_alloc_mw(struct ib_pd *ib_pd, enum ib_mw_type type,
-				struct ib_udata *udata)
+int hns_roce_alloc_mw(struct ib_mw *ibmw, struct ib_udata *udata)
 {
-	struct hns_roce_dev *hr_dev = to_hr_dev(ib_pd->device);
-	struct hns_roce_mw *mw;
+	struct hns_roce_dev *hr_dev = to_hr_dev(ibmw->device);
+	struct hns_roce_mw *mw = to_hr_mw(ibmw);
 	unsigned long index = 0;
 	int ret;
-
-	mw = kmalloc(sizeof(*mw), GFP_KERNEL);
-	if (!mw)
-		return ERR_PTR(-ENOMEM);

 	/* Allocate a key for mw from bitmap */
 	ret = hns_roce_bitmap_alloc(&hr_dev->mr_table.mtpt_bitmap, &index);
 	if (ret)
-		goto err_bitmap;
+		return ret;

 	mw->rkey = hw_index_to_key(index);

-	mw->ibmw.rkey = mw->rkey;
-	mw->ibmw.type = type;
-	mw->pdn = to_hr_pd(ib_pd)->pdn;
+	ibmw->rkey = mw->rkey;
+	mw->pdn = to_hr_pd(ibmw->pd)->pdn;
 	mw->pbl_hop_num = hr_dev->caps.pbl_hop_num;
 	mw->pbl_ba_pg_sz = hr_dev->caps.pbl_ba_pg_sz;
 	mw->pbl_buf_pg_sz = hr_dev->caps.pbl_buf_pg_sz;
···
 	if (ret)
 		goto err_mw;

-	return &mw->ibmw;
+	return 0;

 err_mw:
 	hns_roce_mw_free(hr_dev, mw);
-
-err_bitmap:
-	kfree(mw);
-
-	return ERR_PTR(ret);
+	return ret;
 }

 int hns_roce_dealloc_mw(struct ib_mw *ibmw)
···
 	struct hns_roce_mw *mw = to_hr_mw(ibmw);

 	hns_roce_mw_free(hr_dev, mw);
-	kfree(mw);
-
 	return 0;
 }

···
 	return size;
 }

-static inline int mtr_umem_page_count(struct ib_umem *umem,
-				      unsigned int page_shift)
-{
-	int count = ib_umem_page_count(umem);
-
-	if (page_shift >= PAGE_SHIFT)
-		count >>= page_shift - PAGE_SHIFT;
-	else
-		count <<= PAGE_SHIFT - page_shift;
-
-	return count;
-}
-
 static inline size_t mtr_kmem_direct_size(bool is_direct, size_t alloc_size,
 					  unsigned int page_shift)
 {
···
 			  struct ib_udata *udata, unsigned long user_addr)
 {
 	struct ib_device *ibdev = &hr_dev->ib_dev;
-	unsigned int max_pg_shift = buf_attr->page_shift;
-	unsigned int best_pg_shift = 0;
+	unsigned int best_pg_shift;
 	int all_pg_count = 0;
 	size_t direct_size;
 	size_t total_size;
-	unsigned long tmp;
-	int ret = 0;
+	int ret;

 	total_size = mtr_bufs_size(buf_attr);
 	if (total_size < 1) {
···
 	}

 	if (udata) {
+		unsigned long pgsz_bitmap;
+		unsigned long page_size;
+
 		mtr->kmem = NULL;
 		mtr->umem = ib_umem_get(ibdev, user_addr, total_size,
 					buf_attr->user_access);
···
 				  PTR_ERR(mtr->umem));
 			return -ENOMEM;
 		}
-		if (buf_attr->fixed_page) {
-			best_pg_shift = max_pg_shift;
-		} else {
-			tmp = GENMASK(max_pg_shift, 0);
-			ret = ib_umem_find_best_pgsz(mtr->umem, tmp, user_addr);
-			best_pg_shift = (ret <= PAGE_SIZE) ?
-					PAGE_SHIFT : ilog2(ret);
-		}
-		all_pg_count = mtr_umem_page_count(mtr->umem, best_pg_shift);
+		if (buf_attr->fixed_page)
+			pgsz_bitmap = 1 << buf_attr->page_shift;
+		else
+			pgsz_bitmap = GENMASK(buf_attr->page_shift, PAGE_SHIFT);
+
+		page_size = ib_umem_find_best_pgsz(mtr->umem, pgsz_bitmap,
+						   user_addr);
+		if (!page_size)
+			return -EINVAL;
+		best_pg_shift = order_base_2(page_size);
+		all_pg_count = ib_umem_num_dma_blocks(mtr->umem, page_size);
 		ret = 0;
 	} else {
 		mtr->umem = NULL;
···
 			return -ENOMEM;
 		}
 		direct_size = mtr_kmem_direct_size(is_direct, total_size,
-						   max_pg_shift);
+						   buf_attr->page_shift);
 		ret = hns_roce_buf_alloc(hr_dev, total_size, direct_size,
-					 mtr->kmem, max_pg_shift);
+					 mtr->kmem, buf_attr->page_shift);
 		if (ret) {
 			ibdev_err(ibdev, "Failed to alloc kmem, ret %d\n", ret);
 			goto err_alloc_mem;
-		} else {
-			best_pg_shift = max_pg_shift;
-			all_pg_count = mtr->kmem->npages;
 		}
+		best_pg_shift = buf_attr->page_shift;
+		all_pg_count = mtr->kmem->npages;
 	}

 	/* must bigger than minimum hardware page shift */
···
 				 unsigned int *buf_page_shift)
 {
 	struct hns_roce_buf_region *r;
-	unsigned int page_shift = 0;
+	unsigned int page_shift;
 	int page_cnt = 0;
 	size_t buf_size;
 	int region_cnt;
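Note on the umem hunk in hns_roce_mr.c above: the new code passes ib_umem_find_best_pgsz() a bitmap of allowed page sizes, either the single bit for a fixed page size or every power of two from PAGE_SHIFT up to the requested shift. The removed code built its mask from bit 0 upward. The sketch below only illustrates how such a bitmap is formed in plain userspace C. `GENMASK_U64`, `DEMO_PAGE_SHIFT` (assumed to be 12, i.e. 4KB pages) and `pgsz_bitmap_for` are hypothetical names, not kernel or driver symbols.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for this sketch only: 4KB base pages. */
#define DEMO_PAGE_SHIFT 12

/* Userspace stand-in for the kernel's GENMASK(h, l): bits l..h set. */
#define GENMASK_U64(h, l) \
	((~UINT64_C(0) << (l)) & (~UINT64_C(0) >> (63 - (h))))

/*
 * Hypothetical helper mirroring the branch in the diff:
 * a fixed page size allows exactly one bit, otherwise every size
 * between the base page and the requested shift is allowed.
 */
static uint64_t pgsz_bitmap_for(int fixed_page, unsigned int page_shift)
{
	if (fixed_page)
		return UINT64_C(1) << page_shift;

	return GENMASK_U64(page_shift, DEMO_PAGE_SHIFT);
}
```

For example, a non-fixed request with page_shift = 14 allows 4KB, 8KB and 16KB blocks (bits 12 to 14), while a fixed request with the same shift allows only 16KB.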
+2 -1
drivers/infiniband/hw/hns/hns_roce_pd.c
···
 	return 0;
 }

-void hns_roce_dealloc_pd(struct ib_pd *pd, struct ib_udata *udata)
+int hns_roce_dealloc_pd(struct ib_pd *pd, struct ib_udata *udata)
 {
 	hns_roce_pd_free(to_hr_dev(pd->device), to_hr_pd(pd)->pdn);
+	return 0;
 }

 int hns_roce_uar_alloc(struct hns_roce_dev *hr_dev, struct hns_roce_uar *uar)
+31 -51
drivers/infiniband/hw/hns/hns_roce_qp.c
···
 #include "hns_roce_hem.h"
 #include <rdma/hns-abi.h>

-#define SQP_NUM				(2 * HNS_ROCE_MAX_PORTS)
-
 static void flush_work_handle(struct work_struct *work)
 {
 	struct hns_roce_work *flush_work = container_of(work,
···
 		}
 	}

-	if (hr_dev->caps.sccc_entry_sz) {
+	if (hr_dev->caps.sccc_sz) {
 		/* Alloc memory for SCC CTX */
 		ret = hns_roce_table_get(hr_dev, &qp_table->sccc_table,
 					 hr_qp->qpn);
···
 	int ret;

 	if (!cap->max_send_wr || cap->max_send_wr > hr_dev->caps.max_wqes ||
-	    cap->max_send_sge > hr_dev->caps.max_sq_sg ||
-	    cap->max_inline_data > hr_dev->caps.max_sq_inline) {
+	    cap->max_send_sge > hr_dev->caps.max_sq_sg) {
 		ibdev_err(ibdev,
-			  "failed to check SQ WR, SGE or inline num, ret = %d.\n",
+			  "failed to check SQ WR or SGE num, ret = %d.\n",
 			  -EINVAL);
 		return -EINVAL;
 	}
···
 	/* sync the parameters of kernel QP to user's configuration */
 	cap->max_send_wr = cnt;
 	cap->max_send_sge = hr_qp->sq.max_gs;
-
-	/* We don't support inline sends for kernel QPs (yet) */
-	cap->max_inline_data = 0;

 	return 0;
 }
···

 	hr_qp->ibqp.qp_type = init_attr->qp_type;

+	if (init_attr->cap.max_inline_data > hr_dev->caps.max_sq_inline)
+		init_attr->cap.max_inline_data = hr_dev->caps.max_sq_inline;
+
+	hr_qp->max_inline_data = init_attr->cap.max_inline_data;
+
 	if (init_attr->sq_sig_type == IB_SIGNAL_ALL_WR)
 		hr_qp->sq_signal_bits = IB_SIGNAL_ALL_WR;
 	else
···
 	int ret;

 	switch (init_attr->qp_type) {
-	case IB_QPT_RC: {
-		hr_qp = kzalloc(sizeof(*hr_qp), GFP_KERNEL);
-		if (!hr_qp)
-			return ERR_PTR(-ENOMEM);
-
-		ret = hns_roce_create_qp_common(hr_dev, pd, init_attr, udata,
-						hr_qp);
-		if (ret) {
-			ibdev_err(ibdev, "Create QP 0x%06lx failed(%d)\n",
-				  hr_qp->qpn, ret);
-			kfree(hr_qp);
-			return ERR_PTR(ret);
-		}
-
+	case IB_QPT_RC:
+	case IB_QPT_GSI:
 		break;
-	}
-	case IB_QPT_GSI: {
-		/* Userspace is not allowed to create special QPs: */
-		if (udata) {
-			ibdev_err(ibdev, "not support usr space GSI\n");
-			return ERR_PTR(-EINVAL);
-		}
-
-		hr_qp = kzalloc(sizeof(*hr_qp), GFP_KERNEL);
-		if (!hr_qp)
-			return ERR_PTR(-ENOMEM);
-
-		hr_qp->port = init_attr->port_num - 1;
-		hr_qp->phy_port = hr_dev->iboe.phy_port[hr_qp->port];
-
-		ret = hns_roce_create_qp_common(hr_dev, pd, init_attr, udata,
-						hr_qp);
-		if (ret) {
-			ibdev_err(ibdev, "Create GSI QP failed!\n");
-			kfree(hr_qp);
-			return ERR_PTR(ret);
-		}
-
-		break;
-	}
-	default:{
+	default:
 		ibdev_err(ibdev, "not support QP type %d\n",
 			  init_attr->qp_type);
 		return ERR_PTR(-EOPNOTSUPP);
 	}
+
+	hr_qp = kzalloc(sizeof(*hr_qp), GFP_KERNEL);
+	if (!hr_qp)
+		return ERR_PTR(-ENOMEM);
+
+	if (init_attr->qp_type == IB_QPT_GSI) {
+		hr_qp->port = init_attr->port_num - 1;
+		hr_qp->phy_port = hr_dev->iboe.phy_port[hr_qp->port];
 	}

+	ret = hns_roce_create_qp_common(hr_dev, pd, init_attr, udata, hr_qp);
+	if (ret) {
+		ibdev_err(ibdev, "Create QP type 0x%x failed(%d)\n",
+			  init_attr->qp_type, ret);
+		ibdev_err(ibdev, "Create GSI QP failed!\n");
+		kfree(hr_qp);
+		return ERR_PTR(ret);
+	}
 	return &hr_qp->ibqp;
 }

···

 	mutex_lock(&hr_qp->mutex);

-	cur_state = attr_mask & IB_QP_CUR_STATE ?
-		    attr->cur_qp_state : (enum ib_qp_state)hr_qp->state;
+	if (attr_mask & IB_QP_CUR_STATE && attr->cur_qp_state != hr_qp->state)
+		goto out;
+
+	cur_state = hr_qp->state;
 	new_state = attr_mask & IB_QP_STATE ? attr->qp_state : cur_state;

 	if (ibqp->uobject &&
+3 -2
drivers/infiniband/hw/hns/hns_roce_srq.c
···
 	struct hns_roce_srq *srq = to_hr_srq(ib_srq);
 	struct ib_device *ibdev = &hr_dev->ib_dev;
 	struct hns_roce_ib_create_srq ucmd = {};
-	int ret = 0;
+	int ret;
 	u32 cqn;

 	/* Check the actual SRQ wqe and SRQ sge num */
···
 	return ret;
 }

-void hns_roce_destroy_srq(struct ib_srq *ibsrq, struct ib_udata *udata)
+int hns_roce_destroy_srq(struct ib_srq *ibsrq, struct ib_udata *udata)
 {
 	struct hns_roce_dev *hr_dev = to_hr_dev(ibsrq->device);
 	struct hns_roce_srq *srq = to_hr_srq(ibsrq);
···
 	free_srq_idx(hr_dev, srq);
 	free_srq_wrid(srq);
 	free_srq_buf(hr_dev, srq);
+	return 0;
 }

 int hns_roce_init_srq_table(struct hns_roce_dev *hr_dev)
+4 -5
drivers/infiniband/hw/i40iw/i40iw.h
···
 }

 /* i40iw.c */
-void i40iw_add_ref(struct ib_qp *);
-void i40iw_rem_ref(struct ib_qp *);
+void i40iw_qp_add_ref(struct ib_qp *ibqp);
+void i40iw_qp_rem_ref(struct ib_qp *ibqp);
 struct ib_qp *i40iw_get_qp(struct ib_device *, int);

 void i40iw_flush_wqes(struct i40iw_device *iwdev,
···
 		      bool wait);
 void i40iw_receive_ilq(struct i40iw_sc_vsi *vsi, struct i40iw_puda_buf *rbuf);
 void i40iw_free_sqbuf(struct i40iw_sc_vsi *vsi, void *bufp);
-void i40iw_free_qp_resources(struct i40iw_device *iwdev,
-			     struct i40iw_qp *iwqp,
-			     u32 qp_num);
+void i40iw_free_qp_resources(struct i40iw_qp *iwqp);
+
 enum i40iw_status_code i40iw_obj_aligned_mem(struct i40iw_device *iwdev,
 					     struct i40iw_dma_mem *memptr,
 					     u32 size, u32 mask);
+5 -5
drivers/infiniband/hw/i40iw/i40iw_cm.c
···
 	iwqp = cm_node->iwqp;
 	if (iwqp) {
 		iwqp->cm_node = NULL;
-		i40iw_rem_ref(&iwqp->ibqp);
+		i40iw_qp_rem_ref(&iwqp->ibqp);
 		cm_node->iwqp = NULL;
 	} else if (cm_node->qhash_set) {
 		i40iw_get_addr_info(cm_node, &nfo);
···
 		kfree(work);
 		return;
 	}
-	i40iw_add_ref(&iwqp->ibqp);
+	i40iw_qp_add_ref(&iwqp->ibqp);
 	spin_unlock_irqrestore(&iwdev->qptable_lock, flags);

 	work->iwqp = iwqp;
···

 	kfree(dwork);
 	i40iw_cm_disconn_true(iwqp);
-	i40iw_rem_ref(&iwqp->ibqp);
+	i40iw_qp_rem_ref(&iwqp->ibqp);
 }

 /**
···
 	cm_node->lsmm_size = accept.size + conn_param->private_data_len;
 	i40iw_cm_init_tsa_conn(iwqp, cm_node);
 	cm_id->add_ref(cm_id);
-	i40iw_add_ref(&iwqp->ibqp);
+	i40iw_qp_add_ref(&iwqp->ibqp);

 	attr.qp_state = IB_QPS_RTS;
 	cm_node->qhash_set = false;
···
 	iwqp->cm_node = cm_node;
 	cm_node->iwqp = iwqp;
 	iwqp->cm_id = cm_id;
-	i40iw_add_ref(&iwqp->ibqp);
+	i40iw_qp_add_ref(&iwqp->ibqp);

 	if (cm_node->state != I40IW_CM_STATE_OFFLOADED) {
 		cm_node->state = I40IW_CM_STATE_SYN_SENT;
+2 -2
drivers/infiniband/hw/i40iw/i40iw_hw.c
···
 					    __func__, info->qp_cq_id);
 				continue;
 			}
-			i40iw_add_ref(&iwqp->ibqp);
+			i40iw_qp_add_ref(&iwqp->ibqp);
 			spin_unlock_irqrestore(&iwdev->qptable_lock, flags);
 			qp = &iwqp->sc_qp;
 			spin_lock_irqsave(&iwqp->lock, flags);
···
 			break;
 		}
 		if (info->qp)
-			i40iw_rem_ref(&iwqp->ibqp);
+			i40iw_qp_rem_ref(&iwqp->ibqp);
 	} while (1);

 	if (aeqcnt)
+8 -8
drivers/infiniband/hw/i40iw/i40iw_main.c
···
  * i40iw_dpc - tasklet for aeq and ceq 0
  * @data: iwarp device
  */
-static void i40iw_dpc(unsigned long data)
+static void i40iw_dpc(struct tasklet_struct *t)
 {
-	struct i40iw_device *iwdev = (struct i40iw_device *)data;
+	struct i40iw_device *iwdev = from_tasklet(iwdev, t, dpc_tasklet);

 	if (iwdev->msix_shared)
 		i40iw_process_ceq(iwdev, iwdev->ceqlist);
···
  * i40iw_ceq_dpc - dpc handler for CEQ
  * @data: data points to CEQ
  */
-static void i40iw_ceq_dpc(unsigned long data)
+static void i40iw_ceq_dpc(struct tasklet_struct *t)
 {
-	struct i40iw_ceq *iwceq = (struct i40iw_ceq *)data;
+	struct i40iw_ceq *iwceq = from_tasklet(iwceq, t, dpc_tasklet);
 	struct i40iw_device *iwdev = iwceq->iwdev;

 	i40iw_process_ceq(iwdev, iwceq);
···
 	enum i40iw_status_code status;

 	if (iwdev->msix_shared && !ceq_id) {
-		tasklet_init(&iwdev->dpc_tasklet, i40iw_dpc, (unsigned long)iwdev);
+		tasklet_setup(&iwdev->dpc_tasklet, i40iw_dpc);
 		status = request_irq(msix_vec->irq, i40iw_irq_handler, 0, "AEQCEQ", iwdev);
 	} else {
-		tasklet_init(&iwceq->dpc_tasklet, i40iw_ceq_dpc, (unsigned long)iwceq);
+		tasklet_setup(&iwceq->dpc_tasklet, i40iw_ceq_dpc);
 		status = request_irq(msix_vec->irq, i40iw_ceq_handler, 0, "CEQ", iwceq);
 	}

···
 	u32 ret = 0;

 	if (!iwdev->msix_shared) {
-		tasklet_init(&iwdev->dpc_tasklet, i40iw_dpc, (unsigned long)iwdev);
+		tasklet_setup(&iwdev->dpc_tasklet, i40iw_dpc);
 		ret = request_irq(msix_vec->irq, i40iw_irq_handler, 0, "i40iw", iwdev);
 	}
 	if (ret) {
···
 	status = i40iw_save_msix_info(iwdev, ldev);
 	if (status)
 		return status;
-	iwdev->hw.dev_context = (void *)ldev->pcidev;
+	iwdev->hw.pcidev = ldev->pcidev;
 	iwdev->hw.hw_addr = ldev->hw_addr;
 	status = i40iw_allocate_dma_mem(&iwdev->hw,
 					&iwdev->obj_mem, 8192, 4096);
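Note on the tasklet conversion in the hunks above: the callback now receives the `struct tasklet_struct *` itself and recovers its enclosing object with `from_tasklet()`, which is a `container_of()` wrapper. The following is a minimal userspace sketch of that pattern, not kernel code. `struct demo_device`, `demo_dpc`, `run_demo`, and the stand-in `struct tasklet_struct` and `tasklet_setup()` are hypothetical mock-ups written only to illustrate the pointer arithmetic, and `__typeof__` assumes GCC or Clang.

```c
#include <stddef.h>

/* Userspace stand-in for the kernel's tasklet type (hypothetical layout). */
struct tasklet_struct {
	void (*callback)(struct tasklet_struct *t);
};

/* Recover a pointer to the enclosing struct from a pointer to a member. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Same shape as the kernel helper: container_of() keyed on the variable's type. */
#define from_tasklet(var, callback_tasklet, tasklet_fieldname) \
	container_of(callback_tasklet, __typeof__(*var), tasklet_fieldname)

/* Hypothetical device with an embedded tasklet, like iwdev->dpc_tasklet. */
struct demo_device {
	int processed;
	struct tasklet_struct dpc_tasklet;
};

/* Mock of tasklet_setup(): only records the callback, no data argument. */
static void tasklet_setup(struct tasklet_struct *t,
			  void (*callback)(struct tasklet_struct *))
{
	t->callback = callback;
}

/* New-style callback: the device is derived from the tasklet pointer. */
static void demo_dpc(struct tasklet_struct *t)
{
	struct demo_device *dev = from_tasklet(dev, t, dpc_tasklet);

	dev->processed++;
}

/* Registers the callback, fires it once by hand, returns the counter. */
static int run_demo(void)
{
	struct demo_device dev = { 0 };

	tasklet_setup(&dev.dpc_tasklet, demo_dpc);
	dev.dpc_tasklet.callback(&dev.dpc_tasklet);
	return dev.processed;
}
```

Because the object is derived from the tasklet's address, the `(unsigned long)` cast of the device pointer that `tasklet_init()` needed is no longer required.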
+2 -2
drivers/infiniband/hw/i40iw/i40iw_pble.c
···
  */
 static void i40iw_free_vmalloc_mem(struct i40iw_hw *hw, struct i40iw_chunk *chunk)
 {
-	struct pci_dev *pcidev = (struct pci_dev *)hw->dev_context;
+	struct pci_dev *pcidev = hw->pcidev;
 	int i;

 	if (!chunk->pg_cnt)
···
 							struct i40iw_chunk *chunk,
 							int pg_cnt)
 {
-	struct pci_dev *pcidev = (struct pci_dev *)hw->dev_context;
+	struct pci_dev *pcidev = hw->pcidev;
 	struct page *page;
 	u8 *addr;
 	u32 size;
+2 -1
drivers/infiniband/hw/i40iw/i40iw_type.h
···
 struct i40iw_priv_qp_ops;
 struct i40iw_priv_cq_ops;
 struct i40iw_hmc_ops;
+struct pci_dev;

 enum i40iw_page_size {
 	I40IW_PAGE_SIZE_4K,
···

 struct i40iw_hw {
 	u8 __iomem *hw_addr;
-	void *dev_context;
+	struct pci_dev *pcidev;
 	struct i40iw_hmc_info hmc;
 };

+12 -51
drivers/infiniband/hw/i40iw/i40iw_utils.c
···
 }

 /**
- * i40iw_free_qp - callback after destroy cqp completes
- * @cqp_request: cqp request for destroy qp
- * @num: not used
- */
-static void i40iw_free_qp(struct i40iw_cqp_request *cqp_request, u32 num)
-{
-	struct i40iw_sc_qp *qp = (struct i40iw_sc_qp *)cqp_request->param;
-	struct i40iw_qp *iwqp = (struct i40iw_qp *)qp->back_qp;
-	struct i40iw_device *iwdev;
-	u32 qp_num = iwqp->ibqp.qp_num;
-
-	iwdev = iwqp->iwdev;
-
-	i40iw_rem_pdusecount(iwqp->iwpd, iwdev);
-	i40iw_free_qp_resources(iwdev, iwqp, qp_num);
-	i40iw_rem_devusecount(iwdev);
-}
-
-/**
  * i40iw_wait_event - wait for completion
  * @iwdev: iwarp device
  * @cqp_request: cqp request to wait
···
 }

 /**
- * i40iw_add_ref - add refcount for qp
+ * i40iw_qp_add_ref - add refcount for qp
  * @ibqp: iqarp qp
  */
-void i40iw_add_ref(struct ib_qp *ibqp)
+void i40iw_qp_add_ref(struct ib_qp *ibqp)
 {
 	struct i40iw_qp *iwqp = (struct i40iw_qp *)ibqp;

-	atomic_inc(&iwqp->refcount);
+	refcount_inc(&iwqp->refcount);
 }

 /**
- * i40iw_rem_ref - rem refcount for qp and free if 0
+ * i40iw_qp_rem_ref - rem refcount for qp and free if 0
  * @ibqp: iqarp qp
  */
-void i40iw_rem_ref(struct ib_qp *ibqp)
+void i40iw_qp_rem_ref(struct ib_qp *ibqp)
 {
 	struct i40iw_qp *iwqp;
-	enum i40iw_status_code status;
-	struct i40iw_cqp_request *cqp_request;
-	struct cqp_commands_info *cqp_info;
 	struct i40iw_device *iwdev;
 	u32 qp_num;
 	unsigned long flags;
···
 	iwqp = to_iwqp(ibqp);
 	iwdev = iwqp->iwdev;
 	spin_lock_irqsave(&iwdev->qptable_lock, flags);
-	if (!atomic_dec_and_test(&iwqp->refcount)) {
+	if (!refcount_dec_and_test(&iwqp->refcount)) {
 		spin_unlock_irqrestore(&iwdev->qptable_lock, flags);
 		return;
 	}
···
 	qp_num = iwqp->ibqp.qp_num;
 	iwdev->qp_table[qp_num] = NULL;
 	spin_unlock_irqrestore(&iwdev->qptable_lock, flags);
-	cqp_request = i40iw_get_cqp_request(&iwdev->cqp, false);
-	if (!cqp_request)
-		return;
+	complete(&iwqp->free_qp);

-	cqp_request->callback_fcn = i40iw_free_qp;
-	cqp_request->param = (void *)&iwqp->sc_qp;
-	cqp_info = &cqp_request->info;
-	cqp_info->cqp_cmd = OP_QP_DESTROY;
-	cqp_info->post_sq = 1;
-	cqp_info->in.u.qp_destroy.qp = &iwqp->sc_qp;
-	cqp_info->in.u.qp_destroy.scratch = (uintptr_t)cqp_request;
-	cqp_info->in.u.qp_destroy.remove_hash_idx = true;
-	status = i40iw_handle_cqp_op(iwdev, cqp_request);
-	if (!status)
-		return;
-
-	i40iw_rem_pdusecount(iwqp->iwpd, iwdev);
-	i40iw_free_qp_resources(iwdev, iwqp, qp_num);
-	i40iw_rem_devusecount(iwdev);
 }

 /**
···
 					      u64 size,
 					      u32 alignment)
 {
-	struct pci_dev *pcidev = (struct pci_dev *)hw->dev_context;
+	struct pci_dev *pcidev = hw->pcidev;

 	if (!mem)
 		return I40IW_ERR_PARAM;
···
  */
 void i40iw_free_dma_mem(struct i40iw_hw *hw, struct i40iw_dma_mem *mem)
 {
-	struct pci_dev *pcidev = (struct pci_dev *)hw->dev_context;
+	struct pci_dev *pcidev = hw->pcidev;

 	if (!mem || !mem->va)
 		return;
···
 	struct i40iw_sc_qp *qp = (struct i40iw_sc_qp *)&iwqp->sc_qp;

 	i40iw_terminate_done(qp, 1);
-	i40iw_rem_ref(&iwqp->ibqp);
+	i40iw_qp_rem_ref(&iwqp->ibqp);
 }

 /**
···
 	struct i40iw_qp *iwqp;

 	iwqp = (struct i40iw_qp *)qp->back_qp;
-	i40iw_add_ref(&iwqp->ibqp);
+	i40iw_qp_add_ref(&iwqp->ibqp);
 	timer_setup(&iwqp->terminate_timer, i40iw_terminate_timeout, 0);
 	iwqp->terminate_timer.expires = jiffies + HZ;
 	add_timer(&iwqp->terminate_timer);
···

 	iwqp = (struct i40iw_qp *)qp->back_qp;
 	if (del_timer(&iwqp->terminate_timer))
-		i40iw_rem_ref(&iwqp->ibqp);
+		i40iw_qp_rem_ref(&iwqp->ibqp);
 }

 /**
+34 -30
drivers/infiniband/hw/i40iw/i40iw_verbs.c
··· 328 328 * @ibpd: ptr of pd to be deallocated 329 329 * @udata: user data or null for kernel object 330 330 */ 331 - static void i40iw_dealloc_pd(struct ib_pd *ibpd, struct ib_udata *udata) 331 + static int i40iw_dealloc_pd(struct ib_pd *ibpd, struct ib_udata *udata) 332 332 { 333 333 struct i40iw_pd *iwpd = to_iwpd(ibpd); 334 334 struct i40iw_device *iwdev = to_iwdev(ibpd->device); 335 335 336 336 i40iw_rem_pdusecount(iwpd, iwdev); 337 + return 0; 337 338 } 338 339 339 340 /** ··· 364 363 * @iwqp: qp ptr (user or kernel) 365 364 * @qp_num: qp number assigned 366 365 */ 367 - void i40iw_free_qp_resources(struct i40iw_device *iwdev, 368 - struct i40iw_qp *iwqp, 369 - u32 qp_num) 366 + void i40iw_free_qp_resources(struct i40iw_qp *iwqp) 370 367 { 371 368 struct i40iw_pbl *iwpbl = &iwqp->iwpbl; 369 + struct i40iw_device *iwdev = iwqp->iwdev; 370 + u32 qp_num = iwqp->ibqp.qp_num; 372 371 373 372 i40iw_ieq_cleanup_qp(iwdev->vsi.ieq, &iwqp->sc_qp); 374 373 i40iw_dealloc_push_page(iwdev, &iwqp->sc_qp); ··· 380 379 i40iw_free_dma_mem(iwdev->sc_dev.hw, &iwqp->kqp.dma_mem); 381 380 kfree(iwqp->kqp.wrid_mem); 382 381 iwqp->kqp.wrid_mem = NULL; 383 - kfree(iwqp->allocated_buffer); 382 + kfree(iwqp); 384 383 } 385 384 386 385 /** ··· 402 401 static int i40iw_destroy_qp(struct ib_qp *ibqp, struct ib_udata *udata) 403 402 { 404 403 struct i40iw_qp *iwqp = to_iwqp(ibqp); 404 + struct ib_qp_attr attr; 405 + struct i40iw_device *iwdev = iwqp->iwdev; 406 + 407 + memset(&attr, 0, sizeof(attr)); 405 408 406 409 iwqp->destroyed = 1; 407 410 ··· 420 415 } 421 416 } 422 417 423 - i40iw_rem_ref(&iwqp->ibqp); 418 + attr.qp_state = IB_QPS_ERR; 419 + i40iw_modify_qp(&iwqp->ibqp, &attr, IB_QP_STATE, NULL); 420 + i40iw_qp_rem_ref(&iwqp->ibqp); 421 + wait_for_completion(&iwqp->free_qp); 422 + i40iw_cqp_qp_destroy_cmd(&iwdev->sc_dev, &iwqp->sc_qp); 423 + i40iw_rem_pdusecount(iwqp->iwpd, iwdev); 424 + i40iw_free_qp_resources(iwqp); 425 + i40iw_rem_devusecount(iwdev); 426 + 424 427 return 0; 425 
428 } 426 429 ··· 537 524 struct i40iw_create_qp_req req; 538 525 struct i40iw_create_qp_resp uresp; 539 526 u32 qp_num = 0; 540 - void *mem; 541 527 enum i40iw_status_code ret; 542 528 int err_code; 543 529 int sq_size; ··· 578 566 init_info.qp_uk_init_info.max_rq_frag_cnt = init_attr->cap.max_recv_sge; 579 567 init_info.qp_uk_init_info.max_inline_data = init_attr->cap.max_inline_data; 580 568 581 - mem = kzalloc(sizeof(*iwqp), GFP_KERNEL); 582 - if (!mem) 569 + iwqp = kzalloc(sizeof(*iwqp), GFP_KERNEL); 570 + if (!iwqp) 583 571 return ERR_PTR(-ENOMEM); 584 572 585 - iwqp = (struct i40iw_qp *)mem; 586 - iwqp->allocated_buffer = mem; 587 573 qp = &iwqp->sc_qp; 588 574 qp->back_qp = (void *)iwqp; 589 575 qp->push_idx = I40IW_INVALID_PUSH_PAGE_INDEX; 590 576 577 + iwqp->iwdev = iwdev; 591 578 iwqp->ctx_info.iwarp_info = &iwqp->iwarp_info; 592 579 593 580 if (i40iw_allocate_dma_mem(dev->hw, ··· 611 600 goto error; 612 601 } 613 602 614 - iwqp->iwdev = iwdev; 615 603 iwqp->iwpd = iwpd; 616 604 iwqp->ibqp.qp_num = qp_num; 617 605 qp = &iwqp->sc_qp; ··· 724 714 goto error; 725 715 } 726 716 727 - i40iw_add_ref(&iwqp->ibqp); 717 + refcount_set(&iwqp->refcount, 1); 728 718 spin_lock_init(&iwqp->lock); 729 719 iwqp->sig_all = (init_attr->sq_sig_type == IB_SIGNAL_ALL_WR) ? 
1 : 0; 730 720 iwdev->qp_table[qp_num] = iwqp; ··· 746 736 } 747 737 init_completion(&iwqp->sq_drained); 748 738 init_completion(&iwqp->rq_drained); 739 + init_completion(&iwqp->free_qp); 749 740 750 741 return &iwqp->ibqp; 751 742 error: 752 - i40iw_free_qp_resources(iwdev, iwqp, qp_num); 743 + i40iw_free_qp_resources(iwqp); 753 744 return ERR_PTR(err_code); 754 745 } 755 746 ··· 1063 1052 * @ib_cq: cq pointer 1064 1053 * @udata: user data or NULL for kernel object 1065 1054 */ 1066 - static void i40iw_destroy_cq(struct ib_cq *ib_cq, struct ib_udata *udata) 1055 + static int i40iw_destroy_cq(struct ib_cq *ib_cq, struct ib_udata *udata) 1067 1056 { 1068 1057 struct i40iw_cq *iwcq; 1069 1058 struct i40iw_device *iwdev; ··· 1075 1064 i40iw_cq_wq_destroy(iwdev, cq); 1076 1065 cq_free_resources(iwdev, iwcq); 1077 1066 i40iw_rem_devusecount(iwdev); 1067 + return 0; 1078 1068 } 1079 1069 1080 1070 /** ··· 1332 1320 if (iwmr->type == IW_MEMREG_TYPE_QP) 1333 1321 iwpbl->qp_mr.sq_page = sg_page(region->sg_head.sgl); 1334 1322 1335 - rdma_for_each_block(region->sg_head.sgl, &biter, region->nmap, 1336 - iwmr->page_size) { 1323 + rdma_umem_for_each_dma_block(region, &biter, iwmr->page_size) { 1337 1324 *pbl = rdma_block_iter_dma_address(&biter); 1338 1325 pbl = i40iw_next_pbl_addr(pbl, &pinfo, &idx); 1339 1326 } ··· 1755 1744 struct i40iw_mr *iwmr; 1756 1745 struct ib_umem *region; 1757 1746 struct i40iw_mem_reg_req req; 1758 - u64 pbl_depth = 0; 1759 1747 u32 stag = 0; 1760 1748 u16 access; 1761 - u64 region_length; 1762 1749 bool use_pbles = false; 1763 1750 unsigned long flags; 1764 1751 int err = -ENOSYS; 1765 1752 int ret; 1766 - int pg_shift; 1767 1753 1768 1754 if (!udata) 1769 1755 return ERR_PTR(-EOPNOTSUPP); ··· 1795 1787 if (req.reg_type == IW_MEMREG_TYPE_MEM) 1796 1788 iwmr->page_size = ib_umem_find_best_pgsz(region, SZ_4K | SZ_2M, 1797 1789 virt); 1798 - 1799 - region_length = region->length + (start & (iwmr->page_size - 1)); 1800 - pg_shift = ffs(iwmr->page_size) 
- 1; 1801 - pbl_depth = region_length >> pg_shift; 1802 - pbl_depth += (region_length & (iwmr->page_size - 1)) ? 1 : 0; 1803 1790 iwmr->length = region->length; 1804 1791 1805 1792 iwpbl->user_base = virt; 1806 1793 palloc = &iwpbl->pble_alloc; 1807 1794 1808 1795 iwmr->type = req.reg_type; 1809 - iwmr->page_cnt = (u32)pbl_depth; 1796 + iwmr->page_cnt = ib_umem_num_dma_blocks(region, iwmr->page_size); 1810 1797 1811 1798 switch (req.reg_type) { 1812 1799 case IW_MEMREG_TYPE_QP: ··· 2639 2636 .get_hw_stats = i40iw_get_hw_stats, 2640 2637 .get_port_immutable = i40iw_port_immutable, 2641 2638 .iw_accept = i40iw_accept, 2642 - .iw_add_ref = i40iw_add_ref, 2639 + .iw_add_ref = i40iw_qp_add_ref, 2643 2640 .iw_connect = i40iw_connect, 2644 2641 .iw_create_listen = i40iw_create_listen, 2645 2642 .iw_destroy_listen = i40iw_destroy_listen, 2646 2643 .iw_get_qp = i40iw_get_qp, 2647 2644 .iw_reject = i40iw_reject, 2648 - .iw_rem_ref = i40iw_rem_ref, 2645 + .iw_rem_ref = i40iw_qp_rem_ref, 2649 2646 .map_mr_sg = i40iw_map_mr_sg, 2650 2647 .mmap = i40iw_mmap, 2651 2648 .modify_qp = i40iw_modify_qp, ··· 2671 2668 { 2672 2669 struct i40iw_ib_device *iwibdev; 2673 2670 struct net_device *netdev = iwdev->netdev; 2674 - struct pci_dev *pcidev = (struct pci_dev *)iwdev->hw.dev_context; 2671 + struct pci_dev *pcidev = iwdev->hw.pcidev; 2675 2672 2676 2673 iwibdev = ib_alloc_device(i40iw_ib_device, ibdev); 2677 2674 if (!iwibdev) { ··· 2761 2758 if (ret) 2762 2759 goto error; 2763 2760 2764 - ret = ib_register_device(&iwibdev->ibdev, "i40iw%d"); 2761 + dma_set_max_seg_size(&iwdev->hw.pcidev->dev, UINT_MAX); 2762 + ret = ib_register_device(&iwibdev->ibdev, "i40iw%d", &iwdev->hw.pcidev->dev); 2765 2763 if (ret) 2766 2764 goto error; 2767 2765
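Note on the `page_cnt` change in the registration hunk above: the open-coded `pbl_depth` arithmetic is replaced by a call to `ib_umem_num_dma_blocks()`. The sketch below compares the removed arithmetic with an align-down/align-up block count, which is what the helper is understood to compute. It rests on two assumptions: that this is indeed the helper's formula, and that the same address is used for both calculations (the removed code took its offset from `start`). The function names `blocks_open_coded` and `blocks_aligned` are hypothetical, and `pgsz` must be a power of two.

```c
#include <stdint.h>

/* The arithmetic removed by the diff: pad the length by the in-page offset,
 * then round up to whole pages. */
static uint64_t blocks_open_coded(uint64_t start, uint64_t length, uint64_t pgsz)
{
	uint64_t region_length = length + (start & (pgsz - 1));
	uint64_t depth = region_length / pgsz;

	if (region_length & (pgsz - 1))
		depth++;
	return depth;
}

/* Align-based count: round the start down and the end up to pgsz, then
 * count the blocks in between (assumed to match the core helper). */
static uint64_t blocks_aligned(uint64_t iova, uint64_t length, uint64_t pgsz)
{
	uint64_t first = iova & ~(pgsz - 1);
	uint64_t last = (iova + length + pgsz - 1) & ~(pgsz - 1);

	return (last - first) / pgsz;
}
```

For example, a 4 KiB region starting at offset 0x800 within a 4 KiB page straddles two pages, and both functions return 2.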
+2 -1
drivers/infiniband/hw/i40iw/i40iw_verbs.h
···
 	struct i40iw_qp_host_ctx_info ctx_info;
 	struct i40iwarp_offload_info iwarp_info;
 	void *allocated_buffer;
-	atomic_t refcount;
+	refcount_t refcount;
 	struct iw_cm_id *cm_id;
 	void *cm_node;
 	struct ib_mr *lsmm_mr;
···
 	struct i40iw_dma_mem ietf_mem;
 	struct completion sq_drained;
 	struct completion rq_drained;
+	struct completion free_qp;
 };
 #endif
-5
drivers/infiniband/hw/mlx4/ah.c
···

 	return 0;
 }
-
-void mlx4_ib_destroy_ah(struct ib_ah *ah, u32 flags)
-{
-	return;
-}
+147 -5
drivers/infiniband/hw/mlx4/cm.c
··· 54 54 struct delayed_work timeout; 55 55 }; 56 56 57 + struct rej_tmout_entry { 58 + int slave; 59 + u32 rem_pv_cm_id; 60 + struct delayed_work timeout; 61 + struct xarray *xa_rej_tmout; 62 + }; 63 + 57 64 struct cm_generic_msg { 58 65 struct ib_mad_hdr hdr; 59 66 60 67 __be32 local_comm_id; 61 68 __be32 remote_comm_id; 69 + unsigned char unused[2]; 70 + __be16 rej_reason; 62 71 }; 63 72 64 73 struct cm_sidr_generic_msg { ··· 289 280 if (!sriov->is_going_down && !id->scheduled_delete) { 290 281 id->scheduled_delete = 1; 291 282 schedule_delayed_work(&id->timeout, CM_CLEANUP_CACHE_TIMEOUT); 283 + } else if (id->scheduled_delete) { 284 + /* Adjust timeout if already scheduled */ 285 + mod_delayed_work(system_wq, &id->timeout, CM_CLEANUP_CACHE_TIMEOUT); 292 286 } 293 287 spin_unlock_irqrestore(&sriov->going_down_lock, flags); 294 288 spin_unlock(&sriov->id_map_lock); 295 289 } 296 290 291 + #define REJ_REASON(m) be16_to_cpu(((struct cm_generic_msg *)(m))->rej_reason) 297 292 int mlx4_ib_multiplex_cm_handler(struct ib_device *ibdev, int port, int slave_id, 298 293 struct ib_mad *mad) 299 294 { ··· 306 293 int pv_cm_id = -1; 307 294 308 295 if (mad->mad_hdr.attr_id == CM_REQ_ATTR_ID || 309 - mad->mad_hdr.attr_id == CM_REP_ATTR_ID || 310 - mad->mad_hdr.attr_id == CM_SIDR_REQ_ATTR_ID) { 296 + mad->mad_hdr.attr_id == CM_REP_ATTR_ID || 297 + mad->mad_hdr.attr_id == CM_MRA_ATTR_ID || 298 + mad->mad_hdr.attr_id == CM_SIDR_REQ_ATTR_ID || 299 + (mad->mad_hdr.attr_id == CM_REJ_ATTR_ID && REJ_REASON(mad) == IB_CM_REJ_TIMEOUT)) { 311 300 sl_cm_id = get_local_comm_id(mad); 312 301 id = id_map_get(ibdev, &pv_cm_id, slave_id, sl_cm_id); 313 302 if (id) ··· 329 314 } 330 315 331 316 if (!id) { 332 - pr_debug("id{slave: %d, sl_cm_id: 0x%x} is NULL!\n", 333 - slave_id, sl_cm_id); 317 + pr_debug("id{slave: %d, sl_cm_id: 0x%x} is NULL! 
attr_id: 0x%x\n", 318 + slave_id, sl_cm_id, be16_to_cpu(mad->mad_hdr.attr_id)); 334 319 return -EINVAL; 335 320 } 336 321 ··· 342 327 return 0; 343 328 } 344 329 330 + static void rej_tmout_timeout(struct work_struct *work) 331 + { 332 + struct delayed_work *delay = to_delayed_work(work); 333 + struct rej_tmout_entry *item = container_of(delay, struct rej_tmout_entry, timeout); 334 + struct rej_tmout_entry *deleted; 335 + 336 + deleted = xa_cmpxchg(item->xa_rej_tmout, item->rem_pv_cm_id, item, NULL, 0); 337 + 338 + if (deleted != item) 339 + pr_debug("deleted(%p) != item(%p)\n", deleted, item); 340 + 341 + kfree(item); 342 + } 343 + 344 + static int alloc_rej_tmout(struct mlx4_ib_sriov *sriov, u32 rem_pv_cm_id, int slave) 345 + { 346 + struct rej_tmout_entry *item; 347 + struct rej_tmout_entry *old; 348 + int ret = 0; 349 + 350 + xa_lock(&sriov->xa_rej_tmout); 351 + item = xa_load(&sriov->xa_rej_tmout, (unsigned long)rem_pv_cm_id); 352 + 353 + if (item) { 354 + if (xa_err(item)) 355 + ret = xa_err(item); 356 + else 357 + /* If a retry, adjust delayed work */ 358 + mod_delayed_work(system_wq, &item->timeout, CM_CLEANUP_CACHE_TIMEOUT); 359 + goto err_or_exists; 360 + } 361 + xa_unlock(&sriov->xa_rej_tmout); 362 + 363 + item = kmalloc(sizeof(*item), GFP_KERNEL); 364 + if (!item) 365 + return -ENOMEM; 366 + 367 + INIT_DELAYED_WORK(&item->timeout, rej_tmout_timeout); 368 + item->slave = slave; 369 + item->rem_pv_cm_id = rem_pv_cm_id; 370 + item->xa_rej_tmout = &sriov->xa_rej_tmout; 371 + 372 + old = xa_cmpxchg(&sriov->xa_rej_tmout, (unsigned long)rem_pv_cm_id, NULL, item, GFP_KERNEL); 373 + if (old) { 374 + pr_debug( 375 + "Non-null old entry (%p) or error (%d) when inserting\n", 376 + old, xa_err(old)); 377 + kfree(item); 378 + return xa_err(old); 379 + } 380 + 381 + schedule_delayed_work(&item->timeout, CM_CLEANUP_CACHE_TIMEOUT); 382 + 383 + return 0; 384 + 385 + err_or_exists: 386 + xa_unlock(&sriov->xa_rej_tmout); 387 + return ret; 388 + } 389 + 390 + static int 
lookup_rej_tmout_slave(struct mlx4_ib_sriov *sriov, u32 rem_pv_cm_id) 391 + { 392 + struct rej_tmout_entry *item; 393 + int slave; 394 + 395 + xa_lock(&sriov->xa_rej_tmout); 396 + item = xa_load(&sriov->xa_rej_tmout, (unsigned long)rem_pv_cm_id); 397 + 398 + if (!item || xa_err(item)) { 399 + pr_debug("Could not find slave. rem_pv_cm_id 0x%x error: %d\n", 400 + rem_pv_cm_id, xa_err(item)); 401 + slave = !item ? -ENOENT : xa_err(item); 402 + } else { 403 + slave = item->slave; 404 + } 405 + xa_unlock(&sriov->xa_rej_tmout); 406 + 407 + return slave; 408 + } 409 + 345 410 int mlx4_ib_demux_cm_handler(struct ib_device *ibdev, int port, int *slave, 346 411 struct ib_mad *mad) 347 412 { 413 + struct mlx4_ib_sriov *sriov = &to_mdev(ibdev)->sriov; 414 + u32 rem_pv_cm_id = get_local_comm_id(mad); 348 415 u32 pv_cm_id; 349 416 struct id_map_entry *id; 417 + int sts; 350 418 351 419 if (mad->mad_hdr.attr_id == CM_REQ_ATTR_ID || 352 420 mad->mad_hdr.attr_id == CM_SIDR_REQ_ATTR_ID) { ··· 445 347 be64_to_cpu(gid.global.interface_id)); 446 348 return -ENOENT; 447 349 } 350 + 351 + sts = alloc_rej_tmout(sriov, rem_pv_cm_id, *slave); 352 + if (sts) 353 + /* Even if this fails, we pass on the REQ to the slave */ 354 + pr_debug("Could not allocate rej_tmout entry. rem_pv_cm_id 0x%x slave %d status %d\n", 355 + rem_pv_cm_id, *slave, sts); 356 + 448 357 return 0; 449 358 } 450 359 ··· 459 354 id = id_map_get(ibdev, (int *)&pv_cm_id, -1, -1); 460 355 461 356 if (!id) { 462 - pr_debug("Couldn't find an entry for pv_cm_id 0x%x\n", pv_cm_id); 357 + if (mad->mad_hdr.attr_id == CM_REJ_ATTR_ID && 358 + REJ_REASON(mad) == IB_CM_REJ_TIMEOUT && slave) { 359 + *slave = lookup_rej_tmout_slave(sriov, rem_pv_cm_id); 360 + 361 + return (*slave < 0) ? 
*slave : 0; 362 + } 363 + pr_debug("Couldn't find an entry for pv_cm_id 0x%x, attr_id 0x%x\n", 364 + pv_cm_id, be16_to_cpu(mad->mad_hdr.attr_id)); 463 365 return -ENOENT; 464 366 } 465 367 ··· 487 375 INIT_LIST_HEAD(&dev->sriov.cm_list); 488 376 dev->sriov.sl_id_map = RB_ROOT; 489 377 xa_init_flags(&dev->sriov.pv_id_table, XA_FLAGS_ALLOC); 378 + xa_init(&dev->sriov.xa_rej_tmout); 379 + } 380 + 381 + static void rej_tmout_xa_cleanup(struct mlx4_ib_sriov *sriov, int slave) 382 + { 383 + struct rej_tmout_entry *item; 384 + bool flush_needed = false; 385 + unsigned long id; 386 + int cnt = 0; 387 + 388 + xa_lock(&sriov->xa_rej_tmout); 389 + xa_for_each(&sriov->xa_rej_tmout, id, item) { 390 + if (slave < 0 || slave == item->slave) { 391 + mod_delayed_work(system_wq, &item->timeout, 0); 392 + flush_needed = true; 393 + ++cnt; 394 + } 395 + } 396 + xa_unlock(&sriov->xa_rej_tmout); 397 + 398 + if (flush_needed) { 399 + flush_scheduled_work(); 400 + pr_debug("Deleted %d entries in xarray for slave %d during cleanup\n", 401 + cnt, slave); 402 + } 403 + 404 + if (slave < 0) 405 + WARN_ON(!xa_empty(&sriov->xa_rej_tmout)); 490 406 } 491 407 492 408 /* slave = -1 ==> all slaves */ ··· 584 444 list_del(&map->list); 585 445 kfree(map); 586 446 } 447 + 448 + rej_tmout_xa_cleanup(sriov, slave); 587 449 }
+2 -2
drivers/infiniband/hw/mlx4/cq.c
···
 	if (IS_ERR(*umem))
 		return PTR_ERR(*umem);

-	n = ib_umem_page_count(*umem);
 	shift = mlx4_ib_umem_calc_optimal_mtt_size(*umem, 0, &n);
 	err = mlx4_mtt_init(dev->dev, n, shift, &buf->mtt);

···
 	return err;
 }

-void mlx4_ib_destroy_cq(struct ib_cq *cq, struct ib_udata *udata)
+int mlx4_ib_destroy_cq(struct ib_cq *cq, struct ib_udata *udata)
 {
 	struct mlx4_ib_dev *dev = to_mdev(cq->device);
 	struct mlx4_ib_cq *mcq = to_mcq(cq);
···
 		mlx4_db_free(dev->dev, &mcq->db);
 	}
 	ib_umem_release(mcq->umem);
+	return 0;
 }

 static void dump_cqe(void *cqe)
+91 -67
drivers/infiniband/hw/mlx4/mad.c
··· 500 500 sgid, dgid); 501 501 } 502 502 503 + static int is_proxy_qp0(struct mlx4_ib_dev *dev, int qpn, int slave) 504 + { 505 + int proxy_start = dev->dev->phys_caps.base_proxy_sqpn + 8 * slave; 506 + 507 + return (qpn >= proxy_start && qpn <= proxy_start + 1); 508 + } 509 + 503 510 int mlx4_ib_send_to_slave(struct mlx4_ib_dev *dev, int slave, u8 port, 504 511 enum ib_qp_type dest_qpt, struct ib_wc *wc, 505 512 struct ib_grh *grh, struct ib_mad *mad) ··· 527 520 u16 cached_pkey; 528 521 u8 is_eth = dev->dev->caps.port_type[port] == MLX4_PORT_TYPE_ETH; 529 522 530 - if (dest_qpt > IB_QPT_GSI) 523 + if (dest_qpt > IB_QPT_GSI) { 524 + pr_debug("dest_qpt (%d) > IB_QPT_GSI\n", dest_qpt); 531 525 return -EINVAL; 526 + } 532 527 533 528 tun_ctx = dev->sriov.demux[port-1].tun[slave]; 534 529 ··· 547 538 if (dest_qpt) { 548 539 u16 pkey_ix; 549 540 ret = ib_get_cached_pkey(&dev->ib_dev, port, wc->pkey_index, &cached_pkey); 550 - if (ret) 541 + if (ret) { 542 + pr_debug("unable to get %s cached pkey for index %d, ret %d\n", 543 + is_proxy_qp0(dev, wc->src_qp, slave) ? "SMI" : "GSI", 544 + wc->pkey_index, ret); 551 545 return -EINVAL; 546 + } 552 547 553 548 ret = find_slave_port_pkey_ix(dev, slave, port, cached_pkey, &pkey_ix); 554 - if (ret) 549 + if (ret) { 550 + pr_debug("unable to get %s pkey ix for pkey 0x%x, ret %d\n", 551 + is_proxy_qp0(dev, wc->src_qp, slave) ? "SMI" : "GSI", 552 + cached_pkey, ret); 555 553 return -EINVAL; 554 + } 556 555 tun_pkey_ix = pkey_ix; 557 556 } else 558 557 tun_pkey_ix = dev->pkeys.virt2phys_pkey[slave][port - 1][0]; ··· 732 715 733 716 err = mlx4_ib_send_to_slave(dev, slave, port, wc->qp->qp_type, wc, grh, mad); 734 717 if (err) 735 - pr_debug("failed sending to slave %d via tunnel qp (%d)\n", 718 + pr_debug("failed sending %s to slave %d via tunnel qp (%d)\n", 719 + is_proxy_qp0(dev, wc->src_qp, slave) ? 
"SMI" : "GSI", 736 720 slave, err); 737 721 return 0; 738 722 } ··· 812 794 813 795 err = mlx4_ib_send_to_slave(dev, slave, port, wc->qp->qp_type, wc, grh, mad); 814 796 if (err) 815 - pr_debug("failed sending to slave %d via tunnel qp (%d)\n", 797 + pr_debug("failed sending %s to slave %d via tunnel qp (%d)\n", 798 + is_proxy_qp0(dev, wc->src_qp, slave) ? "SMI" : "GSI", 816 799 slave, err); 817 800 return 0; 818 801 } ··· 825 806 u16 slid, prev_lid = 0; 826 807 int err; 827 808 struct ib_port_attr pattr; 828 - 829 - if (in_wc && in_wc->qp) { 830 - pr_debug("received MAD: port:%d slid:%d sqpn:%d " 831 - "dlid_bits:%d dqpn:%d wc_flags:0x%x tid:%016llx cls:%x mtd:%x atr:%x\n", 832 - port_num, 833 - in_wc->slid, in_wc->src_qp, 834 - in_wc->dlid_path_bits, 835 - in_wc->qp->qp_num, 836 - in_wc->wc_flags, 837 - be64_to_cpu(in_mad->mad_hdr.tid), 838 - in_mad->mad_hdr.mgmt_class, in_mad->mad_hdr.method, 839 - be16_to_cpu(in_mad->mad_hdr.attr_id)); 840 - if (in_wc->wc_flags & IB_WC_GRH) { 841 - pr_debug("sgid_hi:0x%016llx sgid_lo:0x%016llx\n", 842 - be64_to_cpu(in_grh->sgid.global.subnet_prefix), 843 - be64_to_cpu(in_grh->sgid.global.interface_id)); 844 - pr_debug("dgid_hi:0x%016llx dgid_lo:0x%016llx\n", 845 - be64_to_cpu(in_grh->dgid.global.subnet_prefix), 846 - be64_to_cpu(in_grh->dgid.global.interface_id)); 847 - } 848 - } 849 809 850 810 slid = in_wc ? 
ib_lid_cpu16(in_wc->slid) : be16_to_cpu(IB_LID_PERMISSIVE); 851 811 ··· 1297 1299 spin_unlock_irqrestore(&dev->sriov.going_down_lock, flags); 1298 1300 } 1299 1301 1302 + static void mlx4_ib_wire_comp_handler(struct ib_cq *cq, void *arg) 1303 + { 1304 + unsigned long flags; 1305 + struct mlx4_ib_demux_pv_ctx *ctx = cq->cq_context; 1306 + struct mlx4_ib_dev *dev = to_mdev(ctx->ib_dev); 1307 + 1308 + spin_lock_irqsave(&dev->sriov.going_down_lock, flags); 1309 + if (!dev->sriov.is_going_down && ctx->state == DEMUX_PV_STATE_ACTIVE) 1310 + queue_work(ctx->wi_wq, &ctx->work); 1311 + spin_unlock_irqrestore(&dev->sriov.going_down_lock, flags); 1312 + } 1313 + 1300 1314 static int mlx4_ib_post_pv_qp_buf(struct mlx4_ib_demux_pv_ctx *ctx, 1301 1315 struct mlx4_ib_demux_pv_qp *tun_qp, 1302 1316 int index) ··· 1350 1340 } 1351 1341 return ret; 1352 1342 } 1353 - 1354 - static int is_proxy_qp0(struct mlx4_ib_dev *dev, int qpn, int slave) 1355 - { 1356 - int proxy_start = dev->dev->phys_caps.base_proxy_sqpn + 8 * slave; 1357 - 1358 - return (qpn >= proxy_start && qpn <= proxy_start + 1); 1359 - } 1360 - 1361 1343 1362 1344 int mlx4_ib_send_to_wire(struct mlx4_ib_dev *dev, int slave, u8 port, 1363 1345 enum ib_qp_type dest_qpt, u16 pkey_index, ··· 1403 1401 1404 1402 spin_lock(&sqp->tx_lock); 1405 1403 if (sqp->tx_ix_head - sqp->tx_ix_tail >= 1406 - (MLX4_NUM_TUNNEL_BUFS - 1)) 1404 + (MLX4_NUM_WIRE_BUFS - 1)) 1407 1405 ret = -EAGAIN; 1408 1406 else 1409 - wire_tx_ix = (++sqp->tx_ix_head) & (MLX4_NUM_TUNNEL_BUFS - 1); 1407 + wire_tx_ix = (++sqp->tx_ix_head) & (MLX4_NUM_WIRE_BUFS - 1); 1410 1408 spin_unlock(&sqp->tx_lock); 1411 1409 if (ret) 1412 1410 goto out; ··· 1486 1484 u16 vlan_id; 1487 1485 u8 qos; 1488 1486 u8 *dmac; 1487 + int sts; 1489 1488 1490 1489 /* Get slave that sent this packet */ 1491 1490 if (wc->src_qp < dev->dev->phys_caps.base_proxy_sqpn || ··· 1583 1580 &vlan_id, &qos)) 1584 1581 rdma_ah_set_sl(&ah_attr, qos); 1585 1582 1586 - mlx4_ib_send_to_wire(dev, slave, 
ctx->port, 1587 - is_proxy_qp0(dev, wc->src_qp, slave) ? 1588 - IB_QPT_SMI : IB_QPT_GSI, 1589 - be16_to_cpu(tunnel->hdr.pkey_index), 1590 - be32_to_cpu(tunnel->hdr.remote_qpn), 1591 - be32_to_cpu(tunnel->hdr.qkey), 1592 - &ah_attr, wc->smac, vlan_id, &tunnel->mad); 1583 + sts = mlx4_ib_send_to_wire(dev, slave, ctx->port, 1584 + is_proxy_qp0(dev, wc->src_qp, slave) ? 1585 + IB_QPT_SMI : IB_QPT_GSI, 1586 + be16_to_cpu(tunnel->hdr.pkey_index), 1587 + be32_to_cpu(tunnel->hdr.remote_qpn), 1588 + be32_to_cpu(tunnel->hdr.qkey), 1589 + &ah_attr, wc->smac, vlan_id, &tunnel->mad); 1590 + if (sts) 1591 + pr_debug("failed sending %s to wire on behalf of slave %d (%d)\n", 1592 + is_proxy_qp0(dev, wc->src_qp, slave) ? "SMI" : "GSI", 1593 + slave, sts); 1593 1594 } 1594 1595 1595 1596 static int mlx4_ib_alloc_pv_bufs(struct mlx4_ib_demux_pv_ctx *ctx, ··· 1602 1595 int i; 1603 1596 struct mlx4_ib_demux_pv_qp *tun_qp; 1604 1597 int rx_buf_size, tx_buf_size; 1598 + const int nmbr_bufs = is_tun ? MLX4_NUM_TUNNEL_BUFS : MLX4_NUM_WIRE_BUFS; 1605 1599 1606 1600 if (qp_type > IB_QPT_GSI) 1607 1601 return -EINVAL; 1608 1602 1609 1603 tun_qp = &ctx->qp[qp_type]; 1610 1604 1611 - tun_qp->ring = kcalloc(MLX4_NUM_TUNNEL_BUFS, 1605 + tun_qp->ring = kcalloc(nmbr_bufs, 1612 1606 sizeof(struct mlx4_ib_buf), 1613 1607 GFP_KERNEL); 1614 1608 if (!tun_qp->ring) 1615 1609 return -ENOMEM; 1616 1610 1617 - tun_qp->tx_ring = kcalloc(MLX4_NUM_TUNNEL_BUFS, 1611 + tun_qp->tx_ring = kcalloc(nmbr_bufs, 1618 1612 sizeof (struct mlx4_ib_tun_tx_buf), 1619 1613 GFP_KERNEL); 1620 1614 if (!tun_qp->tx_ring) { ··· 1632 1624 tx_buf_size = sizeof (struct mlx4_mad_snd_buf); 1633 1625 } 1634 1626 1635 - for (i = 0; i < MLX4_NUM_TUNNEL_BUFS; i++) { 1627 + for (i = 0; i < nmbr_bufs; i++) { 1636 1628 tun_qp->ring[i].addr = kmalloc(rx_buf_size, GFP_KERNEL); 1637 1629 if (!tun_qp->ring[i].addr) 1638 1630 goto err; ··· 1646 1638 } 1647 1639 } 1648 1640 1649 - for (i = 0; i < MLX4_NUM_TUNNEL_BUFS; i++) { 1641 + for (i = 0; i 
< nmbr_bufs; i++) { 1650 1642 tun_qp->tx_ring[i].buf.addr = 1651 1643 kmalloc(tx_buf_size, GFP_KERNEL); 1652 1644 if (!tun_qp->tx_ring[i].buf.addr) ··· 1677 1669 tx_buf_size, DMA_TO_DEVICE); 1678 1670 kfree(tun_qp->tx_ring[i].buf.addr); 1679 1671 } 1680 - i = MLX4_NUM_TUNNEL_BUFS; 1672 + i = nmbr_bufs; 1681 1673 err: 1682 1674 while (i > 0) { 1683 1675 --i; ··· 1698 1690 int i; 1699 1691 struct mlx4_ib_demux_pv_qp *tun_qp; 1700 1692 int rx_buf_size, tx_buf_size; 1693 + const int nmbr_bufs = is_tun ? MLX4_NUM_TUNNEL_BUFS : MLX4_NUM_WIRE_BUFS; 1701 1694 1702 1695 if (qp_type > IB_QPT_GSI) 1703 1696 return; ··· 1713 1704 } 1714 1705 1715 1706 1716 - for (i = 0; i < MLX4_NUM_TUNNEL_BUFS; i++) { 1707 + for (i = 0; i < nmbr_bufs; i++) { 1717 1708 ib_dma_unmap_single(ctx->ib_dev, tun_qp->ring[i].map, 1718 1709 rx_buf_size, DMA_FROM_DEVICE); 1719 1710 kfree(tun_qp->ring[i].addr); 1720 1711 } 1721 1712 1722 - for (i = 0; i < MLX4_NUM_TUNNEL_BUFS; i++) { 1713 + for (i = 0; i < nmbr_bufs; i++) { 1723 1714 ib_dma_unmap_single(ctx->ib_dev, tun_qp->tx_ring[i].buf.map, 1724 1715 tx_buf_size, DMA_TO_DEVICE); 1725 1716 kfree(tun_qp->tx_ring[i].buf.addr); ··· 1753 1744 "buf:%lld\n", wc.wr_id); 1754 1745 break; 1755 1746 case IB_WC_SEND: 1756 - pr_debug("received tunnel send completion:" 1757 - "wrid=0x%llx, status=0x%x\n", 1758 - wc.wr_id, wc.status); 1759 1747 rdma_destroy_ah(tun_qp->tx_ring[wc.wr_id & 1760 1748 (MLX4_NUM_TUNNEL_BUFS - 1)].ah, 0); 1761 1749 tun_qp->tx_ring[wc.wr_id & (MLX4_NUM_TUNNEL_BUFS - 1)].ah ··· 1799 1793 struct mlx4_ib_qp_tunnel_init_attr qp_init_attr; 1800 1794 struct ib_qp_attr attr; 1801 1795 int qp_attr_mask_INIT; 1796 + const int nmbr_bufs = create_tun ? 
MLX4_NUM_TUNNEL_BUFS : MLX4_NUM_WIRE_BUFS;

 	if (qp_type > IB_QPT_GSI)
 		return -EINVAL;
···
 	qp_init_attr.init_attr.send_cq = ctx->cq;
 	qp_init_attr.init_attr.recv_cq = ctx->cq;
 	qp_init_attr.init_attr.sq_sig_type = IB_SIGNAL_ALL_WR;
-	qp_init_attr.init_attr.cap.max_send_wr = MLX4_NUM_TUNNEL_BUFS;
-	qp_init_attr.init_attr.cap.max_recv_wr = MLX4_NUM_TUNNEL_BUFS;
+	qp_init_attr.init_attr.cap.max_send_wr = nmbr_bufs;
+	qp_init_attr.init_attr.cap.max_recv_wr = nmbr_bufs;
 	qp_init_attr.init_attr.cap.max_send_sge = 1;
 	qp_init_attr.init_attr.cap.max_recv_sge = 1;
 	if (create_tun) {
···
 		goto err_qp;
 	}

-	for (i = 0; i < MLX4_NUM_TUNNEL_BUFS; i++) {
+	for (i = 0; i < nmbr_bufs; i++) {
 		ret = mlx4_ib_post_pv_qp_buf(ctx, tun_qp, i);
 		if (ret) {
 			pr_err(" mlx4_ib_post_pv_buf error"
···
 			switch (wc.opcode) {
 			case IB_WC_SEND:
 				kfree(sqp->tx_ring[wc.wr_id &
-				      (MLX4_NUM_TUNNEL_BUFS - 1)].ah);
-				sqp->tx_ring[wc.wr_id & (MLX4_NUM_TUNNEL_BUFS - 1)].ah
+				      (MLX4_NUM_WIRE_BUFS - 1)].ah);
+				sqp->tx_ring[wc.wr_id & (MLX4_NUM_WIRE_BUFS - 1)].ah
 					= NULL;
 				spin_lock(&sqp->tx_lock);
 				sqp->tx_ix_tail++;
···
 			case IB_WC_RECV:
 				mad = (struct ib_mad *) &(((struct mlx4_mad_rcv_buf *)
 						(sqp->ring[wc.wr_id &
-						(MLX4_NUM_TUNNEL_BUFS - 1)].addr))->payload);
+						(MLX4_NUM_WIRE_BUFS - 1)].addr))->payload);
 				grh = &(((struct mlx4_mad_rcv_buf *)
 						(sqp->ring[wc.wr_id &
-						(MLX4_NUM_TUNNEL_BUFS - 1)].addr))->grh);
+						(MLX4_NUM_WIRE_BUFS - 1)].addr))->grh);
 				mlx4_ib_demux_mad(ctx->ib_dev, ctx->port, &wc, grh, mad);
 				if (mlx4_ib_post_pv_qp_buf(ctx, sqp, wc.wr_id &
-							   (MLX4_NUM_TUNNEL_BUFS - 1)))
+							   (MLX4_NUM_WIRE_BUFS - 1)))
 					pr_err("Failed reposting SQP "
 					       "buf:%lld\n", wc.wr_id);
 				break;
···
 				 ctx->slave, wc.status, wc.wr_id);
 			if (!MLX4_TUN_IS_RECV(wc.wr_id)) {
 				kfree(sqp->tx_ring[wc.wr_id &
-				      (MLX4_NUM_TUNNEL_BUFS - 1)].ah);
-				sqp->tx_ring[wc.wr_id & (MLX4_NUM_TUNNEL_BUFS - 1)].ah
+				      (MLX4_NUM_WIRE_BUFS - 1)].ah);
+				sqp->tx_ring[wc.wr_id & (MLX4_NUM_WIRE_BUFS - 1)].ah
 					= NULL;
 				spin_lock(&sqp->tx_lock);
 				sqp->tx_ix_tail++;
···
 {
 	int ret, cq_size;
 	struct ib_cq_init_attr cq_attr = {};
+	const int nmbr_bufs = create_tun ? MLX4_NUM_TUNNEL_BUFS : MLX4_NUM_WIRE_BUFS;

 	if (ctx->state != DEMUX_PV_STATE_DOWN)
 		return -EEXIST;
···
 		goto err_out_qp0;
 	}

-	cq_size = 2 * MLX4_NUM_TUNNEL_BUFS;
+	cq_size = 2 * nmbr_bufs;
 	if (ctx->has_smi)
 		cq_size *= 2;

 	cq_attr.cqe = cq_size;
-	ctx->cq = ib_create_cq(ctx->ib_dev, mlx4_ib_tunnel_comp_handler,
+	ctx->cq = ib_create_cq(ctx->ib_dev,
+			       create_tun ? mlx4_ib_tunnel_comp_handler : mlx4_ib_wire_comp_handler,
 			       NULL, ctx, &cq_attr);
 	if (IS_ERR(ctx->cq)) {
 		ret = PTR_ERR(ctx->cq);
···
 		INIT_WORK(&ctx->work, mlx4_ib_sqp_comp_worker);

 	ctx->wq = to_mdev(ibdev)->sriov.demux[port - 1].wq;
+	ctx->wi_wq = to_mdev(ibdev)->sriov.demux[port - 1].wi_wq;

 	ret = ib_req_notify_cq(ctx->cq, IB_CQ_NEXT_COMP);
 	if (ret) {
···
 		goto err_mcg;
 	}

-	snprintf(name, sizeof name, "mlx4_ibt%d", port);
+	snprintf(name, sizeof(name), "mlx4_ibt%d", port);
 	ctx->wq = alloc_ordered_workqueue(name, WQ_MEM_RECLAIM);
 	if (!ctx->wq) {
 		pr_err("Failed to create tunnelling WQ for port %d\n", port);
···
 		goto err_wq;
 	}

-	snprintf(name, sizeof name, "mlx4_ibud%d", port);
+	snprintf(name, sizeof(name), "mlx4_ibwi%d", port);
+	ctx->wi_wq = alloc_ordered_workqueue(name, WQ_MEM_RECLAIM);
+	if (!ctx->wi_wq) {
+		pr_err("Failed to create wire WQ for port %d\n", port);
+		ret = -ENOMEM;
+		goto err_wiwq;
+	}
+
+	snprintf(name, sizeof(name), "mlx4_ibud%d", port);
 	ctx->ud_wq = alloc_ordered_workqueue(name, WQ_MEM_RECLAIM);
 	if (!ctx->ud_wq) {
 		pr_err("Failed to create up/down WQ for port %d\n", port);
···
 	return 0;

 err_udwq:
+	destroy_workqueue(ctx->wi_wq);
+	ctx->wi_wq = NULL;
+
+err_wiwq:
 	destroy_workqueue(ctx->wq);
 	ctx->wq = NULL;

···
 				ctx->tun[i]->state = DEMUX_PV_STATE_DOWNING;
 		}
 		flush_workqueue(ctx->wq);
+		flush_workqueue(ctx->wi_wq);
 		for (i = 0; i < dev->dev->caps.sqp_demux; i++) {
 			destroy_pv_resources(dev, i, ctx->port, ctx->tun[i], 0);
 			free_pv_object(dev, i, ctx->port);
 		}
 		kfree(ctx->tun);
 		destroy_workqueue(ctx->ud_wq);
+		destroy_workqueue(ctx->wi_wq);
 		destroy_workqueue(ctx->wq);
 	}
 }
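The hunks above index the send and receive rings with `wc.wr_id & (MLX4_NUM_WIRE_BUFS - 1)`, which only selects a valid slot when the ring size is a power of two and the mask matches the size of the ring actually posted. A minimal userspace sketch of that indexing follows. The sizes 512 and 2048 come from the mlx4_ib.h hunk of this merge; the helper names `is_pow2` and `ring_slot` are hypothetical and do not exist in the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Ring sizes as set in the mlx4_ib.h hunk of this merge. */
enum {
	NUM_TUNNEL_BUFS = 512,
	NUM_WIRE_BUFS   = 2048,
};

/* Hypothetical helper: non-zero when n is a non-zero power of two. */
static int is_pow2(uint32_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/*
 * Hypothetical helper: reduce a work-request ID to a ring slot.
 * Only correct when ring_size is a power of two, because the mask
 * (ring_size - 1) then keeps exactly the low log2(ring_size) bits.
 */
static uint32_t ring_slot(uint64_t wr_id, uint32_t ring_size)
{
	return (uint32_t)(wr_id & (ring_size - 1));
}
```

Masking with the wrong constant would silently fold slots together: slot 2047 of a 2048-entry ring maps to 511 under the 512-entry mask, which may be why the wire-side paths switch to `MLX4_NUM_WIRE_BUFS` once the two sizes differ.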
+19 -26
drivers/infiniband/hw/mlx4/main.c
···
 	return 0;
 }

-static void mlx4_ib_dealloc_pd(struct ib_pd *pd, struct ib_udata *udata)
+static int mlx4_ib_dealloc_pd(struct ib_pd *pd, struct ib_udata *udata)
 {
 	mlx4_pd_free(to_mdev(pd->device)->dev, to_mpd(pd)->pdn);
+	return 0;
 }

 static int mlx4_ib_alloc_xrcd(struct ib_xrcd *ibxrcd, struct ib_udata *udata)
···
 	return err;
 }

-static void mlx4_ib_dealloc_xrcd(struct ib_xrcd *xrcd, struct ib_udata *udata)
+static int mlx4_ib_dealloc_xrcd(struct ib_xrcd *xrcd, struct ib_udata *udata)
 {
 	ib_destroy_cq(to_mxrcd(xrcd)->cq);
 	ib_dealloc_pd(to_mxrcd(xrcd)->pd);
 	mlx4_xrcd_free(to_mdev(xrcd->device)->dev, to_mxrcd(xrcd)->xrcdn);
+	return 0;
 }

 static int add_gid_entry(struct ib_qp *ibqp, union ib_gid *gid)
···
 	struct mlx4_net_trans_rule_hw_ctrl *ctrl;
 	int default_flow;

-	static const u16 __mlx4_domain[] = {
-		[IB_FLOW_DOMAIN_USER] = MLX4_DOMAIN_UVERBS,
-		[IB_FLOW_DOMAIN_ETHTOOL] = MLX4_DOMAIN_ETHTOOL,
-		[IB_FLOW_DOMAIN_RFS] = MLX4_DOMAIN_RFS,
-		[IB_FLOW_DOMAIN_NIC] = MLX4_DOMAIN_NIC,
-	};
-
 	if (flow_attr->priority > MLX4_IB_FLOW_MAX_PRIO) {
 		pr_err("Invalid priority value %d\n", flow_attr->priority);
-		return -EINVAL;
-	}
-
-	if (domain >= IB_FLOW_DOMAIN_NUM) {
-		pr_err("Invalid domain value %d\n", domain);
 		return -EINVAL;
 	}

···
 		return PTR_ERR(mailbox);
 	ctrl = mailbox->buf;

-	ctrl->prio = cpu_to_be16(__mlx4_domain[domain] |
-				 flow_attr->priority);
+	ctrl->prio = cpu_to_be16(domain | flow_attr->priority);
 	ctrl->type = mlx4_map_sw_to_hw_steering_mode(mdev->dev, flow_type);
 	ctrl->port = flow_attr->port;
 	ctrl->qpn = cpu_to_be32(qp->qp_num);
···
 }

 static struct ib_flow *mlx4_ib_create_flow(struct ib_qp *qp,
-				    struct ib_flow_attr *flow_attr,
-				    int domain, struct ib_udata *udata)
+					   struct ib_flow_attr *flow_attr,
+					   struct ib_udata *udata)
 {
 	int err = 0, i = 0, j = 0;
 	struct mlx4_ib_flow *mflow;
···
 	}

 	while (i < ARRAY_SIZE(type) && type[i]) {
-		err = __mlx4_ib_create_flow(qp, flow_attr, domain, type[i],
-					    &mflow->reg_id[i].id);
+		err = __mlx4_ib_create_flow(qp, flow_attr, MLX4_DOMAIN_UVERBS,
+					    type[i], &mflow->reg_id[i].id);
 		if (err)
 			goto err_create_flow;
 		if (is_bonded) {
···
 			 */
 			flow_attr->port = 2;
 			err = __mlx4_ib_create_flow(qp, flow_attr,
-						    domain, type[j],
+						    MLX4_DOMAIN_UVERBS, type[j],
 						    &mflow->reg_id[j].mirror);
 			flow_attr->port = 1;
 			if (err)
···
 	.destroy_rwq_ind_table = mlx4_ib_destroy_rwq_ind_table,
 	.destroy_wq = mlx4_ib_destroy_wq,
 	.modify_wq = mlx4_ib_modify_wq,
+
+	INIT_RDMA_OBJ_SIZE(ib_rwq_ind_table, mlx4_ib_rwq_ind_table,
+			   ib_rwq_ind_tbl),
 };

 static const struct ib_device_ops mlx4_ib_dev_mw_ops = {
 	.alloc_mw = mlx4_ib_alloc_mw,
 	.dealloc_mw = mlx4_ib_dealloc_mw,
+
+	INIT_RDMA_OBJ_SIZE(ib_mw, mlx4_ib_mw, ibmw),
 };

 static const struct ib_device_ops mlx4_ib_dev_xrc_ops = {
···
 		goto err_steer_free_bitmap;

 	rdma_set_device_sysfs_group(&ibdev->ib_dev, &mlx4_attr_group);
-	if (ib_register_device(&ibdev->ib_dev, "mlx4_%d"))
+	if (ib_register_device(&ibdev->ib_dev, "mlx4_%d",
+			       &dev->persist->pdev->dev))
 		goto err_diag_counters;

 	if (mlx4_ib_mad_init(ibdev))
···
 		/* Add an empty rule for IB L2 */
 		memset(&ib_spec->mask, 0, sizeof(ib_spec->mask));

-		err = __mlx4_ib_create_flow(&mqp->ibqp, flow,
-					    IB_FLOW_DOMAIN_NIC,
-					    MLX4_FS_REGULAR,
-					    &mqp->reg_id);
+		err = __mlx4_ib_create_flow(&mqp->ibqp, flow, MLX4_DOMAIN_NIC,
+					    MLX4_FS_REGULAR, &mqp->reg_id);
 	} else {
 		err = __mlx4_ib_destroy_flow(mdev->dev, mqp->reg_id);
 	}
+49 -13
drivers/infiniband/hw/mlx4/mlx4_ib.h
···
 };

 enum {
-	MLX4_NUM_TUNNEL_BUFS = 256,
+	MLX4_NUM_TUNNEL_BUFS = 512,
+	MLX4_NUM_WIRE_BUFS = 2048,
 };

 struct mlx4_ib_tunnel_header {
···
 	u8 rss_key[MLX4_EN_RSS_KEY_SIZE];
 };

+enum {
+	/*
+	 * Largest possible UD header: send with GRH and immediate
+	 * data plus 18 bytes for an Ethernet header with VLAN/802.1Q
+	 * tag. (LRH would only use 8 bytes, so Ethernet is the
+	 * biggest case)
+	 */
+	MLX4_IB_UD_HEADER_SIZE = 82,
+	MLX4_IB_LSO_HEADER_SPARE = 128,
+};
+
+struct mlx4_ib_sqp {
+	int pkey_index;
+	u32 qkey;
+	u32 send_psn;
+	struct ib_ud_header ud_header;
+	u8 header_buf[MLX4_IB_UD_HEADER_SIZE];
+	struct ib_qp *roce_v2_gsi;
+};
+
 struct mlx4_ib_qp {
 	union {
 		struct ib_qp ibqp;
···
 	struct mlx4_wqn_range *wqn_range;
 	/* Number of RSS QP parents that uses this WQ */
 	u32 rss_usecnt;
-	struct mlx4_ib_rss *rss_ctx;
+	union {
+		struct mlx4_ib_rss *rss_ctx;
+		struct mlx4_ib_sqp *sqp;
+	};
 };

 struct mlx4_ib_srq {
···
 struct mlx4_ib_ah {
 	struct ib_ah ibah;
 	union mlx4_ext_av av;
+};
+
+struct mlx4_ib_rwq_ind_table {
+	struct ib_rwq_ind_table ib_rwq_ind_tbl;
 };

 /****************************************/
···
 	struct ib_pd *pd;
 	struct work_struct work;
 	struct workqueue_struct *wq;
+	struct workqueue_struct *wi_wq;
 	struct mlx4_ib_demux_pv_qp qp[2];
 };

···
 	struct ib_device *ib_dev;
 	int port;
 	struct workqueue_struct *wq;
+	struct workqueue_struct *wi_wq;
 	struct workqueue_struct *ud_wq;
 	spinlock_t ud_lock;
 	atomic64_t subnet_prefix;
···
 	spinlock_t id_map_lock;
 	struct rb_root sl_id_map;
 	struct list_head cm_list;
+	struct xarray xa_rej_tmout;
 };

 struct gid_cache_context {
···
 				  u64 virt_addr, int access_flags,
 				  struct ib_udata *udata);
 int mlx4_ib_dereg_mr(struct ib_mr *mr, struct ib_udata *udata);
-struct ib_mw *mlx4_ib_alloc_mw(struct ib_pd *pd, enum ib_mw_type type,
-			       struct ib_udata *udata);
+int mlx4_ib_alloc_mw(struct ib_mw *mw, struct ib_udata *udata);
 int mlx4_ib_dealloc_mw(struct ib_mw *mw);
 struct ib_mr *mlx4_ib_alloc_mr(struct ib_pd *pd, enum ib_mr_type mr_type,
 			       u32 max_num_sg);
···
 int mlx4_ib_resize_cq(struct ib_cq *ibcq, int entries, struct ib_udata *udata);
 int mlx4_ib_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
 		      struct ib_udata *udata);
-void mlx4_ib_destroy_cq(struct ib_cq *cq, struct ib_udata *udata);
+int mlx4_ib_destroy_cq(struct ib_cq *cq, struct ib_udata *udata);
 int mlx4_ib_poll_cq(struct ib_cq *ibcq, int num_entries, struct ib_wc *wc);
 int mlx4_ib_arm_cq(struct ib_cq *cq, enum ib_cq_notify_flags flags);
 void __mlx4_ib_cq_clean(struct mlx4_ib_cq *cq, u32 qpn, struct mlx4_ib_srq *srq);
···
 int mlx4_ib_create_ah_slave(struct ib_ah *ah, struct rdma_ah_attr *ah_attr,
 			    int slave_sgid_index, u8 *s_mac, u16 vlan_tag);
 int mlx4_ib_query_ah(struct ib_ah *ibah, struct rdma_ah_attr *ah_attr);
-void mlx4_ib_destroy_ah(struct ib_ah *ah, u32 flags);
+static inline int mlx4_ib_destroy_ah(struct ib_ah *ah, u32 flags)
+{
+	return 0;
+}

 int mlx4_ib_create_srq(struct ib_srq *srq, struct ib_srq_init_attr *init_attr,
 		       struct ib_udata *udata);
 int mlx4_ib_modify_srq(struct ib_srq *ibsrq, struct ib_srq_attr *attr,
 		       enum ib_srq_attr_mask attr_mask, struct ib_udata *udata);
 int mlx4_ib_query_srq(struct ib_srq *srq, struct ib_srq_attr *srq_attr);
-void mlx4_ib_destroy_srq(struct ib_srq *srq, struct ib_udata *udata);
+int mlx4_ib_destroy_srq(struct ib_srq *srq, struct ib_udata *udata);
 void mlx4_ib_free_srq_wqe(struct mlx4_ib_srq *srq, int wqe_index);
 int mlx4_ib_post_srq_recv(struct ib_srq *ibsrq, const struct ib_recv_wr *wr,
 			  const struct ib_recv_wr **bad_wr);
···
 struct ib_wq *mlx4_ib_create_wq(struct ib_pd *pd,
 				struct ib_wq_init_attr *init_attr,
 				struct ib_udata *udata);
-void mlx4_ib_destroy_wq(struct ib_wq *wq, struct ib_udata *udata);
+int mlx4_ib_destroy_wq(struct ib_wq *wq, struct ib_udata *udata);
 int mlx4_ib_modify_wq(struct ib_wq *wq, struct ib_wq_attr *wq_attr,
 		      u32 wq_attr_mask, struct ib_udata *udata);

-struct ib_rwq_ind_table
-*mlx4_ib_create_rwq_ind_table(struct ib_device *device,
-			      struct ib_rwq_ind_table_init_attr *init_attr,
-			      struct ib_udata *udata);
-int mlx4_ib_destroy_rwq_ind_table(struct ib_rwq_ind_table *wq_ind_table);
+int mlx4_ib_create_rwq_ind_table(struct ib_rwq_ind_table *ib_rwq_ind_tbl,
+				 struct ib_rwq_ind_table_init_attr *init_attr,
+				 struct ib_udata *udata);
+static inline int
+mlx4_ib_destroy_rwq_ind_table(struct ib_rwq_ind_table *wq_ind_table)
+{
+	return 0;
+}
 int mlx4_ib_umem_calc_optimal_mtt_size(struct ib_umem *umem, u64 start_va,
 				       int *num_of_mtts);

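The comment moved into this header gives `MLX4_IB_UD_HEADER_SIZE = 82` as the largest UD header: GRH and immediate data plus an 18-byte Ethernet header with VLAN tag, versus 8 bytes for LRH. The arithmetic can be checked with the sketch below, which assumes the standard InfiniBand header lengths (GRH 40, BTH 12, DETH 8, immediate data 4). The enum and function names are hypothetical stand-ins, not the kernel's own constants.

```c
#include <assert.h>

/* Assumed on-the-wire header lengths in bytes (hypothetical names). */
enum {
	UD_LRH_BYTES      = 8,   /* IB local route header */
	UD_ETH_VLAN_BYTES = 18,  /* 14-byte Ethernet header + 4-byte 802.1Q tag */
	UD_GRH_BYTES      = 40,  /* global route header */
	UD_BTH_BYTES      = 12,  /* base transport header */
	UD_DETH_BYTES     = 8,   /* datagram extended transport header */
	UD_IMM_BYTES      = 4,   /* immediate data */
};

/* Total UD header length for a given link-layer header length. */
static int ud_header_len(int l2_bytes)
{
	return l2_bytes + UD_GRH_BYTES + UD_BTH_BYTES + UD_DETH_BYTES +
	       UD_IMM_BYTES;
}
```

Under these assumptions the Ethernet case gives 18 + 40 + 12 + 8 + 4 = 82 bytes and the LRH case gives 72 bytes, which agrees with the comment's statement that Ethernet is the biggest case.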
+12 -23
drivers/infiniband/hw/mlx4/mr.c
···
 	u64 total_len = 0;
 	int i;

+	*num_of_mtts = ib_umem_num_dma_blocks(umem, PAGE_SIZE);
+
 	for_each_sg(umem->sg_head.sgl, sg, umem->nmap, i) {
 		/*
 		 * Initialization - save the first chunk start as the
···
 		goto err_free;
 	}

-	n = ib_umem_page_count(mr->umem);
 	shift = mlx4_ib_umem_calc_optimal_mtt_size(mr->umem, start, &n);

 	err = mlx4_mr_alloc(dev->dev, to_mpd(pd)->pdn, virt_addr, length,
···
 			mmr->umem = NULL;
 			goto release_mpt_entry;
 		}
-		n = ib_umem_page_count(mmr->umem);
+		n = ib_umem_num_dma_blocks(mmr->umem, PAGE_SIZE);
 		shift = PAGE_SHIFT;

 		err = mlx4_mr_rereg_mem_write(dev->dev, &mmr->mmr,
···
 	return 0;
 }

-struct ib_mw *mlx4_ib_alloc_mw(struct ib_pd *pd, enum ib_mw_type type,
-			       struct ib_udata *udata)
+int mlx4_ib_alloc_mw(struct ib_mw *ibmw, struct ib_udata *udata)
 {
-	struct mlx4_ib_dev *dev = to_mdev(pd->device);
-	struct mlx4_ib_mw *mw;
+	struct mlx4_ib_dev *dev = to_mdev(ibmw->device);
+	struct mlx4_ib_mw *mw = to_mmw(ibmw);
 	int err;

-	mw = kmalloc(sizeof(*mw), GFP_KERNEL);
-	if (!mw)
-		return ERR_PTR(-ENOMEM);
-
-	err = mlx4_mw_alloc(dev->dev, to_mpd(pd)->pdn,
-			    to_mlx4_type(type), &mw->mmw);
+	err = mlx4_mw_alloc(dev->dev, to_mpd(ibmw->pd)->pdn,
+			    to_mlx4_type(ibmw->type), &mw->mmw);
 	if (err)
-		goto err_free;
+		return err;

 	err = mlx4_mw_enable(dev->dev, &mw->mmw);
 	if (err)
 		goto err_mw;

-	mw->ibmw.rkey = mw->mmw.key;
-
-	return &mw->ibmw;
+	ibmw->rkey = mw->mmw.key;
+	return 0;

 err_mw:
 	mlx4_mw_free(dev->dev, &mw->mmw);
-
-err_free:
-	kfree(mw);
-
-	return ERR_PTR(err);
+	return err;
 }

 int mlx4_ib_dealloc_mw(struct ib_mw *ibmw)
···
 	struct mlx4_ib_mw *mw = to_mmw(ibmw);

 	mlx4_mw_free(to_mdev(ibmw->device)->dev, &mw->mmw);
-	kfree(mw);
-
 	return 0;
 }

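In the `mlx4_ib_alloc_mw()` change above, the driver no longer calls `kmalloc()`/`kfree()` itself. It receives an already-allocated `struct ib_mw`, recovers its private object with `to_mmw()`, and returns an `int`. The userspace sketch below illustrates that ownership split under the assumption that the core allocates an object of the size the driver registered (which is what the `INIT_RDMA_OBJ_SIZE` entry added in main.c appears to provide). Every type and function name here is a hypothetical stand-in, not the RDMA core API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Stand-in for the core object (struct ib_mw in the diff). */
struct core_mw {
	unsigned int rkey;
};

/* Stand-in for the driver object; the core part is embedded first. */
struct drv_mw {
	struct core_mw ibmw;
	unsigned int hw_key;
};

/* Stand-in for the ops table: the driver publishes its object size. */
struct core_ops {
	size_t size_mw;
	int (*alloc_mw)(struct core_mw *mw);
};

/* Driver callback: only initializes, never allocates or frees. */
static int drv_alloc_mw(struct core_mw *ibmw)
{
	struct drv_mw *mw = container_of(ibmw, struct drv_mw, ibmw);

	mw->hw_key = 0x1234;     /* pretend the hardware returned a key */
	ibmw->rkey = mw->hw_key;
	return 0;
}

static const struct core_ops drv_ops = {
	.size_mw  = sizeof(struct drv_mw),
	.alloc_mw = drv_alloc_mw,
};

/* Core side: owns the allocation and frees it if the driver fails. */
static struct core_mw *core_alloc_mw(const struct core_ops *ops)
{
	struct core_mw *mw = calloc(1, ops->size_mw);

	if (!mw)
		return NULL;
	if (ops->alloc_mw(mw)) {
		free(mw);
		return NULL;
	}
	return mw;
}
```

With allocation and release kept in one place, the driver's error path shrinks to undoing its own hardware state, which matches the removal of the `err_free:` label and the `kfree(mw)` calls in the hunks above.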
+139 -206
drivers/infiniband/hw/mlx4/qp.c
··· 68 68 }; 69 69 70 70 enum { 71 - /* 72 - * Largest possible UD header: send with GRH and immediate 73 - * data plus 18 bytes for an Ethernet header with VLAN/802.1Q 74 - * tag. (LRH would only use 8 bytes, so Ethernet is the 75 - * biggest case) 76 - */ 77 - MLX4_IB_UD_HEADER_SIZE = 82, 78 - MLX4_IB_LSO_HEADER_SPARE = 128, 79 - }; 80 - 81 - struct mlx4_ib_sqp { 82 - struct mlx4_ib_qp qp; 83 - int pkey_index; 84 - u32 qkey; 85 - u32 send_psn; 86 - struct ib_ud_header ud_header; 87 - u8 header_buf[MLX4_IB_UD_HEADER_SIZE]; 88 - struct ib_qp *roce_v2_gsi; 89 - }; 90 - 91 - enum { 92 71 MLX4_IB_MIN_SQ_STRIDE = 6, 93 72 MLX4_IB_CACHE_LINE_SIZE = 64, 94 73 }; ··· 101 122 MLX4_IB_QP_SRC = 0, 102 123 MLX4_IB_RWQ_SRC = 1, 103 124 }; 104 - 105 - static struct mlx4_ib_sqp *to_msqp(struct mlx4_ib_qp *mqp) 106 - { 107 - return container_of(mqp, struct mlx4_ib_sqp, qp); 108 - } 109 125 110 126 static int is_tunnel_qp(struct mlx4_ib_dev *dev, struct mlx4_ib_qp *qp) 111 127 { ··· 630 656 if (err) 631 657 goto err_qpn; 632 658 633 - mutex_init(&qp->mutex); 634 - 635 659 INIT_LIST_HEAD(&qp->gid_list); 636 660 INIT_LIST_HEAD(&qp->steering_rules); 637 661 ··· 668 696 return err; 669 697 } 670 698 671 - static struct ib_qp *_mlx4_ib_create_qp_rss(struct ib_pd *pd, 672 - struct ib_qp_init_attr *init_attr, 673 - struct ib_udata *udata) 699 + static int _mlx4_ib_create_qp_rss(struct ib_pd *pd, struct mlx4_ib_qp *qp, 700 + struct ib_qp_init_attr *init_attr, 701 + struct ib_udata *udata) 674 702 { 675 - struct mlx4_ib_qp *qp; 676 703 struct mlx4_ib_create_qp_rss ucmd = {}; 677 704 size_t required_cmd_sz; 678 705 int err; 679 706 680 707 if (!udata) { 681 708 pr_debug("RSS QP with NULL udata\n"); 682 - return ERR_PTR(-EINVAL); 709 + return -EINVAL; 683 710 } 684 711 685 712 if (udata->outlen) 686 - return ERR_PTR(-EOPNOTSUPP); 713 + return -EOPNOTSUPP; 687 714 688 715 required_cmd_sz = offsetof(typeof(ucmd), reserved1) + 689 716 sizeof(ucmd.reserved1); 690 717 if (udata->inlen < 
required_cmd_sz) { 691 718 pr_debug("invalid inlen\n"); 692 - return ERR_PTR(-EINVAL); 719 + return -EINVAL; 693 720 } 694 721 695 722 if (ib_copy_from_udata(&ucmd, udata, min(sizeof(ucmd), udata->inlen))) { 696 723 pr_debug("copy failed\n"); 697 - return ERR_PTR(-EFAULT); 724 + return -EFAULT; 698 725 } 699 726 700 727 if (memchr_inv(ucmd.reserved, 0, sizeof(ucmd.reserved))) 701 - return ERR_PTR(-EOPNOTSUPP); 728 + return -EOPNOTSUPP; 702 729 703 730 if (ucmd.comp_mask || ucmd.reserved1) 704 - return ERR_PTR(-EOPNOTSUPP); 731 + return -EOPNOTSUPP; 705 732 706 733 if (udata->inlen > sizeof(ucmd) && 707 734 !ib_is_udata_cleared(udata, sizeof(ucmd), 708 735 udata->inlen - sizeof(ucmd))) { 709 736 pr_debug("inlen is not supported\n"); 710 - return ERR_PTR(-EOPNOTSUPP); 737 + return -EOPNOTSUPP; 711 738 } 712 739 713 740 if (init_attr->qp_type != IB_QPT_RAW_PACKET) { 714 741 pr_debug("RSS QP with unsupported QP type %d\n", 715 742 init_attr->qp_type); 716 - return ERR_PTR(-EOPNOTSUPP); 743 + return -EOPNOTSUPP; 717 744 } 718 745 719 746 if (init_attr->create_flags) { 720 747 pr_debug("RSS QP doesn't support create flags\n"); 721 - return ERR_PTR(-EOPNOTSUPP); 748 + return -EOPNOTSUPP; 722 749 } 723 750 724 751 if (init_attr->send_cq || init_attr->cap.max_send_wr) { 725 752 pr_debug("RSS QP with unsupported send attributes\n"); 726 - return ERR_PTR(-EOPNOTSUPP); 753 + return -EOPNOTSUPP; 727 754 } 728 - 729 - qp = kzalloc(sizeof(*qp), GFP_KERNEL); 730 - if (!qp) 731 - return ERR_PTR(-ENOMEM); 732 755 733 756 qp->pri.vid = 0xFFFF; 734 757 qp->alt.vid = 0xFFFF; 735 758 736 759 err = create_qp_rss(to_mdev(pd->device), init_attr, &ucmd, qp); 737 - if (err) { 738 - kfree(qp); 739 - return ERR_PTR(err); 740 - } 760 + if (err) 761 + return err; 741 762 742 763 qp->ibqp.qp_num = qp->mqp.qpn; 743 - 744 - return &qp->ibqp; 764 + return 0; 745 765 } 746 766 747 767 /* ··· 837 873 838 874 qp->mlx4_ib_qp_type = MLX4_IB_QPT_RAW_PACKET; 839 875 840 - mutex_init(&qp->mutex); 841 876 
spin_lock_init(&qp->sq.lock); 842 877 spin_lock_init(&qp->rq.lock); 843 878 INIT_LIST_HEAD(&qp->gid_list); ··· 885 922 goto err; 886 923 } 887 924 888 - n = ib_umem_page_count(qp->umem); 889 925 shift = mlx4_ib_umem_calc_optimal_mtt_size(qp->umem, 0, &n); 890 926 err = mlx4_mtt_init(dev->dev, n, shift, &qp->mtt); 891 927 ··· 951 989 952 990 static int create_qp_common(struct ib_pd *pd, struct ib_qp_init_attr *init_attr, 953 991 struct ib_udata *udata, int sqpn, 954 - struct mlx4_ib_qp **caller_qp) 992 + struct mlx4_ib_qp *qp) 955 993 { 956 994 struct mlx4_ib_dev *dev = to_mdev(pd->device); 957 995 int qpn; 958 996 int err; 959 - struct mlx4_ib_sqp *sqp = NULL; 960 - struct mlx4_ib_qp *qp; 961 997 struct mlx4_ib_ucontext *context = rdma_udata_to_drv_context( 962 998 udata, struct mlx4_ib_ucontext, ibucontext); 963 999 enum mlx4_ib_qp_type qp_type = (enum mlx4_ib_qp_type) init_attr->qp_type; ··· 1003 1043 sqpn = qpn; 1004 1044 } 1005 1045 1006 - if (!*caller_qp) { 1007 - if (qp_type == MLX4_IB_QPT_SMI || qp_type == MLX4_IB_QPT_GSI || 1008 - (qp_type & (MLX4_IB_QPT_PROXY_SMI | MLX4_IB_QPT_PROXY_SMI_OWNER | 1009 - MLX4_IB_QPT_PROXY_GSI | MLX4_IB_QPT_TUN_SMI_OWNER))) { 1010 - sqp = kzalloc(sizeof(struct mlx4_ib_sqp), GFP_KERNEL); 1011 - if (!sqp) 1012 - return -ENOMEM; 1013 - qp = &sqp->qp; 1014 - } else { 1015 - qp = kzalloc(sizeof(struct mlx4_ib_qp), GFP_KERNEL); 1016 - if (!qp) 1017 - return -ENOMEM; 1018 - } 1019 - qp->pri.vid = 0xFFFF; 1020 - qp->alt.vid = 0xFFFF; 1021 - } else 1022 - qp = *caller_qp; 1046 + if (init_attr->qp_type == IB_QPT_SMI || 1047 + init_attr->qp_type == IB_QPT_GSI || qp_type == MLX4_IB_QPT_SMI || 1048 + qp_type == MLX4_IB_QPT_GSI || 1049 + (qp_type & (MLX4_IB_QPT_PROXY_SMI | MLX4_IB_QPT_PROXY_SMI_OWNER | 1050 + MLX4_IB_QPT_PROXY_GSI | MLX4_IB_QPT_TUN_SMI_OWNER))) { 1051 + qp->sqp = kzalloc(sizeof(struct mlx4_ib_sqp), GFP_KERNEL); 1052 + if (!qp->sqp) 1053 + return -ENOMEM; 1054 + } 1023 1055 1024 1056 qp->mlx4_ib_qp_type = qp_type; 1025 1057 
1026 - mutex_init(&qp->mutex); 1027 1058 spin_lock_init(&qp->sq.lock); 1028 1059 spin_lock_init(&qp->rq.lock); 1029 1060 INIT_LIST_HEAD(&qp->gid_list); ··· 1068 1117 goto err; 1069 1118 } 1070 1119 1071 - n = ib_umem_page_count(qp->umem); 1072 1120 shift = mlx4_ib_umem_calc_optimal_mtt_size(qp->umem, 0, &n); 1073 1121 err = mlx4_mtt_init(dev->dev, n, shift, &qp->mtt); 1074 1122 ··· 1189 1239 1190 1240 qp->mqp.event = mlx4_ib_qp_event; 1191 1241 1192 - if (!*caller_qp) 1193 - *caller_qp = qp; 1194 - 1195 1242 spin_lock_irqsave(&dev->reset_flow_resource_lock, flags); 1196 1243 mlx4_ib_lock_cqs(to_mcq(init_attr->send_cq), 1197 1244 to_mcq(init_attr->recv_cq)); ··· 1240 1293 mlx4_db_free(dev->dev, &qp->db); 1241 1294 1242 1295 err: 1243 - if (!sqp && !*caller_qp) 1244 - kfree(qp); 1245 - kfree(sqp); 1246 - 1296 + kfree(qp->sqp); 1247 1297 return err; 1248 1298 } 1249 1299 ··· 1354 1410 mlx4_qp_free(dev->dev, &qp->mqp); 1355 1411 mlx4_qp_release_range(dev->dev, qp->mqp.qpn, 1); 1356 1412 del_gid_entries(qp); 1357 - kfree(qp->rss_ctx); 1358 1413 } 1359 1414 1360 1415 static void destroy_qp_common(struct mlx4_ib_dev *dev, struct mlx4_ib_qp *qp, ··· 1472 1529 return dev->dev->caps.spec_qps[attr->port_num - 1].qp1_proxy; 1473 1530 } 1474 1531 1475 - static struct ib_qp *_mlx4_ib_create_qp(struct ib_pd *pd, 1476 - struct ib_qp_init_attr *init_attr, 1477 - struct ib_udata *udata) 1532 + static int _mlx4_ib_create_qp(struct ib_pd *pd, struct mlx4_ib_qp *qp, 1533 + struct ib_qp_init_attr *init_attr, 1534 + struct ib_udata *udata) 1478 1535 { 1479 - struct mlx4_ib_qp *qp = NULL; 1480 1536 int err; 1481 1537 int sup_u_create_flags = MLX4_IB_QP_BLOCK_MULTICAST_LOOPBACK; 1482 1538 u16 xrcdn = 0; 1483 1539 1484 1540 if (init_attr->rwq_ind_tbl) 1485 - return _mlx4_ib_create_qp_rss(pd, init_attr, udata); 1541 + return _mlx4_ib_create_qp_rss(pd, qp, init_attr, udata); 1486 1542 1487 1543 /* 1488 1544 * We only support LSO, vendor flag1, and multicast loopback blocking, ··· 1493 1551 
MLX4_IB_SRIOV_SQP | 1494 1552 MLX4_IB_QP_NETIF | 1495 1553 MLX4_IB_QP_CREATE_ROCE_V2_GSI)) 1496 - return ERR_PTR(-EINVAL); 1554 + return -EINVAL; 1497 1555 1498 1556 if (init_attr->create_flags & IB_QP_CREATE_NETIF_QP) { 1499 1557 if (init_attr->qp_type != IB_QPT_UD) 1500 - return ERR_PTR(-EINVAL); 1558 + return -EINVAL; 1501 1559 } 1502 1560 1503 1561 if (init_attr->create_flags) { 1504 1562 if (udata && init_attr->create_flags & ~(sup_u_create_flags)) 1505 - return ERR_PTR(-EINVAL); 1563 + return -EINVAL; 1506 1564 1507 1565 if ((init_attr->create_flags & ~(MLX4_IB_SRIOV_SQP | 1508 1566 MLX4_IB_QP_CREATE_ROCE_V2_GSI | ··· 1512 1570 init_attr->qp_type > IB_QPT_GSI) || 1513 1571 (init_attr->create_flags & MLX4_IB_QP_CREATE_ROCE_V2_GSI && 1514 1572 init_attr->qp_type != IB_QPT_GSI)) 1515 - return ERR_PTR(-EINVAL); 1573 + return -EINVAL; 1516 1574 } 1517 1575 1518 1576 switch (init_attr->qp_type) { ··· 1523 1581 fallthrough; 1524 1582 case IB_QPT_XRC_INI: 1525 1583 if (!(to_mdev(pd->device)->dev->caps.flags & MLX4_DEV_CAP_FLAG_XRC)) 1526 - return ERR_PTR(-ENOSYS); 1584 + return -ENOSYS; 1527 1585 init_attr->recv_cq = init_attr->send_cq; 1528 1586 fallthrough; 1529 1587 case IB_QPT_RC: 1530 1588 case IB_QPT_UC: 1531 1589 case IB_QPT_RAW_PACKET: 1532 - qp = kzalloc(sizeof(*qp), GFP_KERNEL); 1533 - if (!qp) 1534 - return ERR_PTR(-ENOMEM); 1590 + case IB_QPT_UD: 1535 1591 qp->pri.vid = 0xFFFF; 1536 1592 qp->alt.vid = 0xFFFF; 1537 - fallthrough; 1538 - case IB_QPT_UD: 1539 - { 1540 - err = create_qp_common(pd, init_attr, udata, 0, &qp); 1541 - if (err) { 1542 - kfree(qp); 1543 - return ERR_PTR(err); 1544 - } 1593 + err = create_qp_common(pd, init_attr, udata, 0, qp); 1594 + if (err) 1595 + return err; 1545 1596 1546 1597 qp->ibqp.qp_num = qp->mqp.qpn; 1547 1598 qp->xrcdn = xrcdn; 1548 - 1549 1599 break; 1550 - } 1551 1600 case IB_QPT_SMI: 1552 1601 case IB_QPT_GSI: 1553 1602 { 1554 1603 int sqpn; 1555 1604 1556 - /* Userspace is not allowed to create special QPs: */ 1557 
- if (udata) 1558 - return ERR_PTR(-EINVAL); 1559 1605 if (init_attr->create_flags & MLX4_IB_QP_CREATE_ROCE_V2_GSI) { 1560 1606 int res = mlx4_qp_reserve_range(to_mdev(pd->device)->dev, 1561 1607 1, 1, &sqpn, 0, 1562 1608 MLX4_RES_USAGE_DRIVER); 1563 1609 1564 1610 if (res) 1565 - return ERR_PTR(res); 1611 + return res; 1566 1612 } else { 1567 1613 sqpn = get_sqp_num(to_mdev(pd->device), init_attr); 1568 1614 } 1569 1615 1570 - err = create_qp_common(pd, init_attr, udata, sqpn, &qp); 1616 + qp->pri.vid = 0xFFFF; 1617 + qp->alt.vid = 0xFFFF; 1618 + err = create_qp_common(pd, init_attr, udata, sqpn, qp); 1571 1619 if (err) 1572 - return ERR_PTR(err); 1620 + return err; 1573 1621 1574 1622 qp->port = init_attr->port_num; 1575 1623 qp->ibqp.qp_num = init_attr->qp_type == IB_QPT_SMI ? 0 : ··· 1568 1636 } 1569 1637 default: 1570 1638 /* Don't support raw QPs */ 1571 - return ERR_PTR(-EOPNOTSUPP); 1639 + return -EOPNOTSUPP; 1572 1640 } 1573 - 1574 - return &qp->ibqp; 1641 + return 0; 1575 1642 } 1576 1643 1577 1644 struct ib_qp *mlx4_ib_create_qp(struct ib_pd *pd, 1578 1645 struct ib_qp_init_attr *init_attr, 1579 1646 struct ib_udata *udata) { 1580 1647 struct ib_device *device = pd ? 
pd->device : init_attr->xrcd->device; 1581 - struct ib_qp *ibqp; 1582 1648 struct mlx4_ib_dev *dev = to_mdev(device); 1649 + struct mlx4_ib_qp *qp; 1650 + int ret; 1583 1651 1584 - ibqp = _mlx4_ib_create_qp(pd, init_attr, udata); 1652 + qp = kzalloc(sizeof(*qp), GFP_KERNEL); 1653 + if (!qp) 1654 + return ERR_PTR(-ENOMEM); 1585 1655 1586 - if (!IS_ERR(ibqp) && 1587 - (init_attr->qp_type == IB_QPT_GSI) && 1656 + mutex_init(&qp->mutex); 1657 + ret = _mlx4_ib_create_qp(pd, qp, init_attr, udata); 1658 + if (ret) { 1659 + kfree(qp); 1660 + return ERR_PTR(ret); 1661 + } 1662 + 1663 + if (init_attr->qp_type == IB_QPT_GSI && 1588 1664 !(init_attr->create_flags & MLX4_IB_QP_CREATE_ROCE_V2_GSI)) { 1589 - struct mlx4_ib_sqp *sqp = to_msqp((to_mqp(ibqp))); 1665 + struct mlx4_ib_sqp *sqp = qp->sqp; 1590 1666 int is_eth = rdma_cap_eth_ah(&dev->ib_dev, init_attr->port_num); 1591 1667 1592 1668 if (is_eth && ··· 1606 1666 pr_err("Failed to create GSI QP for RoCEv2 (%ld)\n", PTR_ERR(sqp->roce_v2_gsi)); 1607 1667 sqp->roce_v2_gsi = NULL; 1608 1668 } else { 1609 - sqp = to_msqp(to_mqp(sqp->roce_v2_gsi)); 1610 - sqp->qp.flags |= MLX4_IB_ROCE_V2_GSI_QP; 1669 + to_mqp(sqp->roce_v2_gsi)->flags |= 1670 + MLX4_IB_ROCE_V2_GSI_QP; 1611 1671 } 1612 1672 1613 1673 init_attr->create_flags &= ~MLX4_IB_QP_CREATE_ROCE_V2_GSI; 1614 1674 } 1615 1675 } 1616 - return ibqp; 1676 + return &qp->ibqp; 1617 1677 } 1618 1678 1619 1679 static int _mlx4_ib_destroy_qp(struct ib_qp *qp, struct ib_udata *udata) ··· 1640 1700 destroy_qp_common(dev, mqp, MLX4_IB_QP_SRC, udata); 1641 1701 } 1642 1702 1643 - if (is_sqp(dev, mqp)) 1644 - kfree(to_msqp(mqp)); 1645 - else 1646 - kfree(mqp); 1703 + kfree(mqp->sqp); 1704 + kfree(mqp); 1647 1705 1648 1706 return 0; 1649 1707 } ··· 1651 1713 struct mlx4_ib_qp *mqp = to_mqp(qp); 1652 1714 1653 1715 if (mqp->mlx4_ib_qp_type == MLX4_IB_QPT_GSI) { 1654 - struct mlx4_ib_sqp *sqp = to_msqp(mqp); 1716 + struct mlx4_ib_sqp *sqp = mqp->sqp; 1655 1717 1656 1718 if (sqp->roce_v2_gsi) 
1657 1719 ib_destroy_qp(sqp->roce_v2_gsi); ··· 2513 2575 qp->alt_port = attr->alt_port_num; 2514 2576 2515 2577 if (is_sqp(dev, qp)) 2516 - store_sqp_attrs(to_msqp(qp), attr, attr_mask); 2578 + store_sqp_attrs(qp->sqp, attr, attr_mask); 2517 2579 2518 2580 /* 2519 2581 * If we moved QP0 to RTR, bring the IB link up; if we moved ··· 2790 2852 ret = _mlx4_ib_modify_qp(ibqp, attr, attr_mask, udata); 2791 2853 2792 2854 if (mqp->mlx4_ib_qp_type == MLX4_IB_QPT_GSI) { 2793 - struct mlx4_ib_sqp *sqp = to_msqp(mqp); 2855 + struct mlx4_ib_sqp *sqp = mqp->sqp; 2794 2856 int err = 0; 2795 2857 2796 2858 if (sqp->roce_v2_gsi) ··· 2815 2877 return -EINVAL; 2816 2878 } 2817 2879 2818 - static int build_sriov_qp0_header(struct mlx4_ib_sqp *sqp, 2880 + static int build_sriov_qp0_header(struct mlx4_ib_qp *qp, 2819 2881 const struct ib_ud_wr *wr, 2820 2882 void *wqe, unsigned *mlx_seg_len) 2821 2883 { 2822 - struct mlx4_ib_dev *mdev = to_mdev(sqp->qp.ibqp.device); 2823 - struct ib_device *ib_dev = &mdev->ib_dev; 2884 + struct mlx4_ib_dev *mdev = to_mdev(qp->ibqp.device); 2885 + struct mlx4_ib_sqp *sqp = qp->sqp; 2886 + struct ib_device *ib_dev = qp->ibqp.device; 2824 2887 struct mlx4_wqe_mlx_seg *mlx = wqe; 2825 2888 struct mlx4_wqe_inline_seg *inl = wqe + sizeof *mlx; 2826 2889 struct mlx4_ib_ah *ah = to_mah(wr->ah); ··· 2843 2904 2844 2905 /* for proxy-qp0 sends, need to add in size of tunnel header */ 2845 2906 /* for tunnel-qp0 sends, tunnel header is already in s/g list */ 2846 - if (sqp->qp.mlx4_ib_qp_type == MLX4_IB_QPT_PROXY_SMI_OWNER) 2907 + if (qp->mlx4_ib_qp_type == MLX4_IB_QPT_PROXY_SMI_OWNER) 2847 2908 send_size += sizeof (struct mlx4_ib_tunnel_header); 2848 2909 2849 2910 ib_ud_header_init(send_size, 1, 0, 0, 0, 0, 0, 0, &sqp->ud_header); 2850 2911 2851 - if (sqp->qp.mlx4_ib_qp_type == MLX4_IB_QPT_PROXY_SMI_OWNER) { 2912 + if (qp->mlx4_ib_qp_type == MLX4_IB_QPT_PROXY_SMI_OWNER) { 2852 2913 sqp->ud_header.lrh.service_level = 2853 2914 
 		be32_to_cpu(ah->av.ib.sl_tclass_flowlabel) >> 28;
 	sqp->ud_header.lrh.destination_lid =
···

 	sqp->ud_header.lrh.virtual_lane = 0;
 	sqp->ud_header.bth.solicited_event = !!(wr->wr.send_flags & IB_SEND_SOLICITED);
-	err = ib_get_cached_pkey(ib_dev, sqp->qp.port, 0, &pkey);
+	err = ib_get_cached_pkey(ib_dev, qp->port, 0, &pkey);
 	if (err)
 		return err;
 	sqp->ud_header.bth.pkey = cpu_to_be16(pkey);
-	if (sqp->qp.mlx4_ib_qp_type == MLX4_IB_QPT_TUN_SMI_OWNER)
+	if (qp->mlx4_ib_qp_type == MLX4_IB_QPT_TUN_SMI_OWNER)
 		sqp->ud_header.bth.destination_qpn = cpu_to_be32(wr->remote_qpn);
 	else
 		sqp->ud_header.bth.destination_qpn =
-			cpu_to_be32(mdev->dev->caps.spec_qps[sqp->qp.port - 1].qp0_tunnel);
+			cpu_to_be32(mdev->dev->caps.spec_qps[qp->port - 1].qp0_tunnel);

 	sqp->ud_header.bth.psn = cpu_to_be32((sqp->send_psn++) & ((1 << 24) - 1));
 	if (mlx4_is_master(mdev->dev)) {
-		if (mlx4_get_parav_qkey(mdev->dev, sqp->qp.mqp.qpn, &qkey))
+		if (mlx4_get_parav_qkey(mdev->dev, qp->mqp.qpn, &qkey))
 			return -EINVAL;
 	} else {
-		if (vf_get_qp0_qkey(mdev->dev, sqp->qp.mqp.qpn, &qkey))
+		if (vf_get_qp0_qkey(mdev->dev, qp->mqp.qpn, &qkey))
 			return -EINVAL;
 	}
 	sqp->ud_header.deth.qkey = cpu_to_be32(qkey);
-	sqp->ud_header.deth.source_qpn = cpu_to_be32(sqp->qp.mqp.qpn);
+	sqp->ud_header.deth.source_qpn = cpu_to_be32(qp->mqp.qpn);

 	sqp->ud_header.bth.opcode = IB_OPCODE_UD_SEND_ONLY;
 	sqp->ud_header.immediate_present = 0;
···
 }

 #define MLX4_ROCEV2_QP1_SPORT 0xC000
-static int build_mlx_header(struct mlx4_ib_sqp *sqp, const struct ib_ud_wr *wr,
+static int build_mlx_header(struct mlx4_ib_qp *qp, const struct ib_ud_wr *wr,
 			    void *wqe, unsigned *mlx_seg_len)
 {
-	struct ib_device *ib_dev = sqp->qp.ibqp.device;
+	struct mlx4_ib_sqp *sqp = qp->sqp;
+	struct ib_device *ib_dev = qp->ibqp.device;
 	struct mlx4_ib_dev *ibdev = to_mdev(ib_dev);
 	struct mlx4_wqe_mlx_seg *mlx = wqe;
 	struct mlx4_wqe_ctrl_seg *ctrl = wqe;
···
 	for (i = 0; i < wr->wr.num_sge; ++i)
 		send_size += wr->wr.sg_list[i].length;

-	is_eth = rdma_port_get_link_layer(sqp->qp.ibqp.device, sqp->qp.port) == IB_LINK_LAYER_ETHERNET;
+	is_eth = rdma_port_get_link_layer(qp->ibqp.device, qp->port) == IB_LINK_LAYER_ETHERNET;
 	is_grh = mlx4_ib_ah_grh_present(ah);
 	if (is_eth) {
 		enum ib_gid_type gid_type;
···
 			if (err)
 				return err;
 		} else {
-			err = fill_gid_by_hw_index(ibdev, sqp->qp.port,
-					    ah->av.ib.gid_index,
-					    &sgid, &gid_type);
+			err = fill_gid_by_hw_index(ibdev, qp->port,
+						   ah->av.ib.gid_index, &sgid,
+						   &gid_type);
 			if (!err) {
 				is_udp = gid_type == IB_GID_TYPE_ROCE_UDP_ENCAP;
 				if (is_udp) {
···
 				 * indexes don't necessarily match the hw ones, so
 				 * we must use our own cache
 				 */
-				sqp->ud_header.grh.source_gid.global.subnet_prefix =
-					cpu_to_be64(atomic64_read(&(to_mdev(ib_dev)->sriov.
-								    demux[sqp->qp.port - 1].
-								    subnet_prefix)));
-				sqp->ud_header.grh.source_gid.global.interface_id =
-					to_mdev(ib_dev)->sriov.demux[sqp->qp.port - 1].
-						       guid_cache[ah->av.ib.gid_index];
+				sqp->ud_header.grh.source_gid.global
+					.subnet_prefix =
+					cpu_to_be64(atomic64_read(
+						&(to_mdev(ib_dev)
+							  ->sriov
+							  .demux[qp->port - 1]
+							  .subnet_prefix)));
+				sqp->ud_header.grh.source_gid.global
+					.interface_id =
+					to_mdev(ib_dev)
+						->sriov.demux[qp->port - 1]
+						.guid_cache[ah->av.ib.gid_index];
 			} else {
 				sqp->ud_header.grh.source_gid =
 					ah->ibah.sgid_attr->gid;
···
 	mlx->flags &= cpu_to_be32(MLX4_WQE_CTRL_CQ_UPDATE);

 	if (!is_eth) {
-		mlx->flags |= cpu_to_be32((!sqp->qp.ibqp.qp_num ? MLX4_WQE_MLX_VL15 : 0) |
-					  (sqp->ud_header.lrh.destination_lid ==
-					   IB_LID_PERMISSIVE ? MLX4_WQE_MLX_SLR : 0) |
-					  (sqp->ud_header.lrh.service_level << 8));
+		mlx->flags |=
+			cpu_to_be32((!qp->ibqp.qp_num ? MLX4_WQE_MLX_VL15 : 0) |
+				    (sqp->ud_header.lrh.destination_lid ==
+						     IB_LID_PERMISSIVE ?
+					     MLX4_WQE_MLX_SLR :
+					     0) |
+				    (sqp->ud_header.lrh.service_level << 8));
 		if (ah->av.ib.port_pd & cpu_to_be32(0x80000000))
 			mlx->flags |= cpu_to_be32(0x1); /* force loopback */
 		mlx->rlid = sqp->ud_header.lrh.destination_lid;
···
 			sqp->ud_header.vlan.tag = cpu_to_be16(vlan | pcp);
 		}
 	} else {
-		sqp->ud_header.lrh.virtual_lane = !sqp->qp.ibqp.qp_num ? 15 :
-							sl_to_vl(to_mdev(ib_dev),
-								 sqp->ud_header.lrh.service_level,
-								 sqp->qp.port);
-		if (sqp->qp.ibqp.qp_num && sqp->ud_header.lrh.virtual_lane == 15)
+		sqp->ud_header.lrh.virtual_lane =
+			!qp->ibqp.qp_num ?
+				15 :
+				sl_to_vl(to_mdev(ib_dev),
+					 sqp->ud_header.lrh.service_level,
+					 qp->port);
+		if (qp->ibqp.qp_num && sqp->ud_header.lrh.virtual_lane == 15)
 			return -EINVAL;
 		if (sqp->ud_header.lrh.destination_lid == IB_LID_PERMISSIVE)
 			sqp->ud_header.lrh.source_lid = IB_LID_PERMISSIVE;
 	}
 	sqp->ud_header.bth.solicited_event = !!(wr->wr.send_flags & IB_SEND_SOLICITED);
-	if (!sqp->qp.ibqp.qp_num)
-		err = ib_get_cached_pkey(ib_dev, sqp->qp.port, sqp->pkey_index,
+	if (!qp->ibqp.qp_num)
+		err = ib_get_cached_pkey(ib_dev, qp->port, sqp->pkey_index,
 					 &pkey);
 	else
-		err = ib_get_cached_pkey(ib_dev, sqp->qp.port, wr->pkey_index,
+		err = ib_get_cached_pkey(ib_dev, qp->port, wr->pkey_index,
 					 &pkey);
 	if (err)
 		return err;
···
 	sqp->ud_header.bth.psn = cpu_to_be32((sqp->send_psn++) & ((1 << 24) - 1));
 	sqp->ud_header.deth.qkey = cpu_to_be32(wr->remote_qkey & 0x80000000 ?
 					       sqp->qkey : wr->remote_qkey);
-	sqp->ud_header.deth.source_qpn = cpu_to_be32(sqp->qp.ibqp.qp_num);
+	sqp->ud_header.deth.source_qpn = cpu_to_be32(qp->ibqp.qp_num);

 	header_size = ib_ud_header_pack(&sqp->ud_header, sqp->header_buf);

···
 	struct mlx4_ib_dev *mdev = to_mdev(ibqp->device);

 	if (qp->mlx4_ib_qp_type == MLX4_IB_QPT_GSI) {
-		struct mlx4_ib_sqp *sqp = to_msqp(qp);
+		struct mlx4_ib_sqp *sqp = qp->sqp;

 		if (sqp->roce_v2_gsi) {
 			struct mlx4_ib_ah *ah = to_mah(ud_wr(wr)->ah);
 			enum ib_gid_type gid_type;
 			union ib_gid gid;

-			if (!fill_gid_by_hw_index(mdev, sqp->qp.port,
+			if (!fill_gid_by_hw_index(mdev, qp->port,
 					   ah->av.ib.gid_index,
 					   &gid, &gid_type))
 				qp = (gid_type == IB_GID_TYPE_ROCE_UDP_ENCAP) ?
···
 			break;

 		case MLX4_IB_QPT_TUN_SMI_OWNER:
-			err = build_sriov_qp0_header(to_msqp(qp), ud_wr(wr),
-						     ctrl, &seglen);
+			err = build_sriov_qp0_header(qp, ud_wr(wr), ctrl,
+						     &seglen);
 			if (unlikely(err)) {
 				*bad_wr = wr;
 				goto out;
···
 			break;

 		case MLX4_IB_QPT_PROXY_SMI_OWNER:
-			err = build_sriov_qp0_header(to_msqp(qp), ud_wr(wr),
-						     ctrl, &seglen);
+			err = build_sriov_qp0_header(qp, ud_wr(wr), ctrl,
+						     &seglen);
 			if (unlikely(err)) {
 				*bad_wr = wr;
 				goto out;
···

 		case MLX4_IB_QPT_SMI:
 		case MLX4_IB_QPT_GSI:
-			err = build_mlx_header(to_msqp(qp), ud_wr(wr), ctrl,
-					       &seglen);
+			err = build_mlx_header(qp, ud_wr(wr), ctrl, &seglen);
 			if (unlikely(err)) {
 				*bad_wr = wr;
 				goto out;
···
 	if (!qp)
 		return ERR_PTR(-ENOMEM);

+	mutex_init(&qp->mutex);
 	qp->pri.vid = 0xFFFF;
 	qp->alt.vid = 0xFFFF;

···
 	return err;
 }

-void mlx4_ib_destroy_wq(struct ib_wq *ibwq, struct ib_udata *udata)
+int mlx4_ib_destroy_wq(struct ib_wq *ibwq, struct ib_udata *udata)
 {
 	struct mlx4_ib_dev *dev = to_mdev(ibwq->device);
 	struct mlx4_ib_qp *qp = to_mqp((struct ib_qp *)ibwq);
···
 	destroy_qp_common(dev, qp, MLX4_IB_RWQ_SRC, udata);

 	kfree(qp);
+	return 0;
 }

-struct ib_rwq_ind_table
-*mlx4_ib_create_rwq_ind_table(struct ib_device *device,
-			      struct ib_rwq_ind_table_init_attr *init_attr,
-			      struct ib_udata *udata)
+int mlx4_ib_create_rwq_ind_table(struct ib_rwq_ind_table *rwq_ind_table,
+				 struct ib_rwq_ind_table_init_attr *init_attr,
+				 struct ib_udata *udata)
 {
-	struct ib_rwq_ind_table *rwq_ind_table;
 	struct mlx4_ib_create_rwq_ind_tbl_resp resp = {};
 	unsigned int ind_tbl_size = 1 << init_attr->log_ind_tbl_size;
+	struct ib_device *device = rwq_ind_table->device;
 	unsigned int base_wqn;
 	size_t min_resp_len;
-	int i;
-	int err;
+	int i, err = 0;

 	if (udata->inlen > 0 &&
 	    !ib_is_udata_cleared(udata, 0,
 				 udata->inlen))
-		return ERR_PTR(-EOPNOTSUPP);
+		return -EOPNOTSUPP;

 	min_resp_len = offsetof(typeof(resp), reserved) + sizeof(resp.reserved);
 	if (udata->outlen && udata->outlen < min_resp_len)
-		return ERR_PTR(-EINVAL);
+		return -EINVAL;

 	if (ind_tbl_size >
 	    device->attrs.rss_caps.max_rwq_indirection_table_size) {
 		pr_debug("log_ind_tbl_size = %d is bigger than supported = %d\n",
 			 ind_tbl_size,
 			 device->attrs.rss_caps.max_rwq_indirection_table_size);
-		return ERR_PTR(-EINVAL);
+		return -EINVAL;
 	}

 	base_wqn = init_attr->ind_tbl[0]->wq_num;
···
 	if (base_wqn % ind_tbl_size) {
 		pr_debug("WQN=0x%x isn't aligned with indirection table size\n",
 			 base_wqn);
-		return ERR_PTR(-EINVAL);
+		return -EINVAL;
 	}

 	for (i = 1; i < ind_tbl_size; i++) {
 		if (++base_wqn != init_attr->ind_tbl[i]->wq_num) {
 			pr_debug("indirection table's WQNs aren't consecutive\n");
-			return ERR_PTR(-EINVAL);
+			return -EINVAL;
 		}
 	}
-
-	rwq_ind_table = kzalloc(sizeof(*rwq_ind_table), GFP_KERNEL);
-	if (!rwq_ind_table)
-		return ERR_PTR(-ENOMEM);

 	if (udata->outlen) {
 		resp.response_length = offsetof(typeof(resp), response_length) +
 					sizeof(resp.response_length);
 		err = ib_copy_to_udata(udata, &resp, resp.response_length);
-		if (err)
-			goto err;
 	}

-	return rwq_ind_table;
-
-err:
-	kfree(rwq_ind_table);
-	return ERR_PTR(err);
-}
-
-int mlx4_ib_destroy_rwq_ind_table(struct ib_rwq_ind_table *ib_rwq_ind_tbl)
-{
-	kfree(ib_rwq_ind_tbl);
-	return 0;
+	return err;
 }

 struct mlx4_ib_drain_cqe {
+5 -3
drivers/infiniband/hw/mlx4/srq.c
···
 		if (IS_ERR(srq->umem))
 			return PTR_ERR(srq->umem);

-		err = mlx4_mtt_init(dev->dev, ib_umem_page_count(srq->umem),
-				    PAGE_SHIFT, &srq->mtt);
+		err = mlx4_mtt_init(
+			dev->dev, ib_umem_num_dma_blocks(srq->umem, PAGE_SIZE),
+			PAGE_SHIFT, &srq->mtt);
 		if (err)
 			goto err_buf;

···
 	return 0;
 }

-void mlx4_ib_destroy_srq(struct ib_srq *srq, struct ib_udata *udata)
+int mlx4_ib_destroy_srq(struct ib_srq *srq, struct ib_udata *udata)
 {
 	struct mlx4_ib_dev *dev = to_mdev(srq->device);
 	struct mlx4_ib_srq *msrq = to_msrq(srq);
···
 		mlx4_db_free(dev->dev, &msrq->db);
 	}
 	ib_umem_release(msrq->umem);
+	return 0;
 }

 void mlx4_ib_free_srq_wqe(struct mlx4_ib_srq *srq, int wqe_index)
+2 -7
drivers/infiniband/hw/mlx5/ah.c
···
 	if (ah_type == RDMA_AH_ATTR_TYPE_ROCE && udata) {
 		int err;
 		struct mlx5_ib_create_ah_resp resp = {};
-		u32 min_resp_len = offsetof(typeof(resp), dmac) +
-				   sizeof(resp.dmac);
+		u32 min_resp_len =
+			offsetofend(struct mlx5_ib_create_ah_resp, dmac);

 		if (udata->outlen < min_resp_len)
 			return -EINVAL;
···
 	rdma_ah_set_sl(ah_attr, ah->av.stat_rate_sl & 0xf);

 	return 0;
-}
-
-void mlx5_ib_destroy_ah(struct ib_ah *ah, u32 flags)
-{
-	return;
 }
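The ah.c hunk replaces `offsetof(...) + sizeof(...)` with `offsetofend(...)`. The following is a minimal userspace sketch of that equivalence, not the driver's code: the `offsetofend` macro here is a local stand-in for the kernel helper of the same name, and `struct demo_resp` and both `min_len_*` functions are hypothetical names invented for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's offsetofend(): the offset of the
 * first byte after MEMBER inside TYPE. */
#define offsetofend(TYPE, MEMBER) \
	(offsetof(TYPE, MEMBER) + sizeof(((TYPE *)0)->MEMBER))

/* Hypothetical response layout, for illustration only. */
struct demo_resp {
	uint32_t response_length;
	uint8_t dmac[6];
	uint8_t reserved[6];
};

/* Old spelling: the member is named twice, once for its offset and once
 * for its size. */
static size_t min_len_old(void)
{
	struct demo_resp resp;

	return offsetof(struct demo_resp, dmac) + sizeof(resp.dmac);
}

/* New spelling: one macro, the member is named once. */
static size_t min_len_new(void)
{
	return offsetofend(struct demo_resp, dmac);
}
```

Both functions compute the minimum buffer length that still covers `dmac`; the second form cannot get out of sync if the member name is edited in only one place.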
+4 -4
drivers/infiniband/hw/mlx5/cmd.c
···
 	mlx5_cmd_exec_in(dev, destroy_tis, in);
 }

-void mlx5_cmd_destroy_rqt(struct mlx5_core_dev *dev, u32 rqtn, u16 uid)
+int mlx5_cmd_destroy_rqt(struct mlx5_core_dev *dev, u32 rqtn, u16 uid)
 {
 	u32 in[MLX5_ST_SZ_DW(destroy_rqt_in)] = {};

 	MLX5_SET(destroy_rqt_in, in, opcode, MLX5_CMD_OP_DESTROY_RQT);
 	MLX5_SET(destroy_rqt_in, in, rqtn, rqtn);
 	MLX5_SET(destroy_rqt_in, in, uid, uid);
-	mlx5_cmd_exec_in(dev, destroy_rqt, in);
+	return mlx5_cmd_exec_in(dev, destroy_rqt, in);
 }

 int mlx5_cmd_alloc_transport_domain(struct mlx5_core_dev *dev, u32 *tdn,
···
 	mlx5_cmd_exec_in(dev, dealloc_transport_domain, in);
 }

-void mlx5_cmd_dealloc_pd(struct mlx5_core_dev *dev, u32 pdn, u16 uid)
+int mlx5_cmd_dealloc_pd(struct mlx5_core_dev *dev, u32 pdn, u16 uid)
 {
 	u32 in[MLX5_ST_SZ_DW(dealloc_pd_in)] = {};

 	MLX5_SET(dealloc_pd_in, in, opcode, MLX5_CMD_OP_DEALLOC_PD);
 	MLX5_SET(dealloc_pd_in, in, pd, pdn);
 	MLX5_SET(dealloc_pd_in, in, uid, uid);
-	mlx5_cmd_exec_in(dev, dealloc_pd, in);
+	return mlx5_cmd_exec_in(dev, dealloc_pd, in);
 }

 int mlx5_cmd_attach_mcg(struct mlx5_core_dev *dev, union ib_gid *mgid,
+2 -2
drivers/infiniband/hw/mlx5/cmd.h
···
 int mlx5_cmd_alloc_memic(struct mlx5_dm *dm, phys_addr_t *addr,
 			 u64 length, u32 alignment);
 void mlx5_cmd_dealloc_memic(struct mlx5_dm *dm, phys_addr_t addr, u64 length);
-void mlx5_cmd_dealloc_pd(struct mlx5_core_dev *dev, u32 pdn, u16 uid);
+int mlx5_cmd_dealloc_pd(struct mlx5_core_dev *dev, u32 pdn, u16 uid);
 void mlx5_cmd_destroy_tir(struct mlx5_core_dev *dev, u32 tirn, u16 uid);
 void mlx5_cmd_destroy_tis(struct mlx5_core_dev *dev, u32 tisn, u16 uid);
-void mlx5_cmd_destroy_rqt(struct mlx5_core_dev *dev, u32 rqtn, u16 uid);
+int mlx5_cmd_destroy_rqt(struct mlx5_core_dev *dev, u32 rqtn, u16 uid);
 int mlx5_cmd_alloc_transport_domain(struct mlx5_core_dev *dev, u32 *tdn,
 				    u16 uid);
 void mlx5_cmd_dealloc_transport_domain(struct mlx5_core_dev *dev, u32 tdn,
+4 -3
drivers/infiniband/hw/mlx5/counters.c
···
 	return ret;
 }

-static void mlx5_ib_destroy_counters(struct ib_counters *counters)
+static int mlx5_ib_destroy_counters(struct ib_counters *counters)
 {
 	struct mlx5_ib_mcounters *mcounters = to_mcounters(counters);

···
 	if (mcounters->hw_cntrs_hndl)
 		mlx5_fc_destroy(to_mdev(counters->device)->mdev,
 				mcounters->hw_cntrs_hndl);
+	return 0;
 }

 static int mlx5_ib_create_counters(struct ib_counters *counters,
···
 		cnts->num_ext_ppcnt_counters = ARRAY_SIZE(ext_ppcnt_cnts);
 		num_counters += ARRAY_SIZE(ext_ppcnt_cnts);
 	}
-	cnts->names = kcalloc(num_counters, sizeof(cnts->names), GFP_KERNEL);
+	cnts->names = kcalloc(num_counters, sizeof(*cnts->names), GFP_KERNEL);
 	if (!cnts->names)
 		return -ENOMEM;

 	cnts->offsets = kcalloc(num_counters,
-				sizeof(cnts->offsets), GFP_KERNEL);
+				sizeof(*cnts->offsets), GFP_KERNEL);
 	if (!cnts->offsets)
 		goto err_names;
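The last counters.c hunk changes `sizeof(cnts->names)` to `sizeof(*cnts->names)`, that is, from the size of the pointer to the size of one element. Whether the old code actually mis-sized the allocation depends on the element types, which this hunk does not show. The sketch below illustrates the general idiom in userspace with `calloc`; `struct demo_entry`, `struct demo_table` and the `demo_*` helpers are hypothetical names, with an element type chosen to be visibly larger than a pointer.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical element type, deliberately wider than a pointer. */
struct demo_entry {
	uint64_t offset;
	uint64_t scale;
	char name[32];
};

struct demo_table {
	struct demo_entry *entries;
	size_t count;
};

/* sizeof(*t->entries) is the size of one element and keeps tracking the
 * element type if the declaration of `entries` changes later. */
static int demo_table_alloc(struct demo_table *t, size_t n)
{
	t->entries = calloc(n, sizeof(*t->entries));
	if (!t->entries)
		return -1;
	t->count = n;
	return 0;
}

/* Size of one element: what the allocation should be based on. */
static size_t demo_elem_size(const struct demo_table *t)
{
	return sizeof(*t->entries);
}

/* Size of the pointer itself: what sizeof(t->entries) yields. */
static size_t demo_ptr_size(const struct demo_table *t)
{
	return sizeof(t->entries);
}
```

With an element this large, allocating `n * sizeof(t->entries)` bytes would be too small for `n` entries, which is the class of bug the dereferenced form rules out.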
+11 -5
drivers/infiniband/hw/mlx5/cq.c
···
 {
 	enum rdma_link_layer ll = rdma_port_get_link_layer(qp->ibqp.device, 1);
 	struct mlx5_ib_dev *dev = to_mdev(qp->ibqp.device);
-	struct mlx5_ib_srq *srq;
+	struct mlx5_ib_srq *srq = NULL;
 	struct mlx5_ib_wq *wq;
 	u16 wqe_ctr;
 	u8 roce_packet_type;
···

 	if (qp->ibqp.xrcd) {
 		msrq = mlx5_cmd_get_srq(dev, be32_to_cpu(cqe->srqn));
-		srq = to_mibsrq(msrq);
+		if (msrq)
+			srq = to_mibsrq(msrq);
 	} else {
 		srq = to_msrq(qp->ibqp.srq);
 	}
···

 	switch (roce_packet_type) {
 	case MLX5_CQE_ROCE_L3_HEADER_TYPE_GRH:
-		wc->network_hdr_type = RDMA_NETWORK_IB;
+		wc->network_hdr_type = RDMA_NETWORK_ROCE_V1;
 		break;
 	case MLX5_CQE_ROCE_L3_HEADER_TYPE_IPV6:
 		wc->network_hdr_type = RDMA_NETWORK_IPV6;
···
 	return err;
 }

-void mlx5_ib_destroy_cq(struct ib_cq *cq, struct ib_udata *udata)
+int mlx5_ib_destroy_cq(struct ib_cq *cq, struct ib_udata *udata)
 {
 	struct mlx5_ib_dev *dev = to_mdev(cq->device);
 	struct mlx5_ib_cq *mcq = to_mcq(cq);
+	int ret;

-	mlx5_core_destroy_cq(dev->mdev, &mcq->mcq);
+	ret = mlx5_core_destroy_cq(dev->mdev, &mcq->mcq);
+	if (ret)
+		return ret;
+
 	if (udata)
 		destroy_cq_user(mcq, udata);
 	else
 		destroy_cq_kernel(dev, mcq);
+	return 0;
 }

 static int is_equal_rsn(struct mlx5_cqe64 *cqe64, u32 rsn)
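The `mlx5_ib_destroy_cq()` hunk shows the shape that the merge message's "destroy operations may fail" revert takes in a driver: ask the hardware first, and release software resources only if that succeeded. The sketch below models that ordering in plain userspace C under stated assumptions; `struct demo_cq`, `demo_hw_destroy()` and `demo_destroy_cq()` are hypothetical and do not exist in the mlx5 driver.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a completion queue object. */
struct demo_cq {
	bool hw_busy;	/* simulates the device refusing the destroy command */
	bool hw_alive;	/* true while the hardware object still exists */
	void *buf;	/* software resource owned by the object */
};

/* Simulated hardware destroy command: may fail with -EBUSY. */
static int demo_hw_destroy(struct demo_cq *cq)
{
	if (cq->hw_busy)
		return -EBUSY;
	cq->hw_alive = false;
	return 0;
}

/* Destroy in two steps: hardware first, software state only on success.
 * On failure nothing is freed, so the caller still holds a valid object
 * and can retry later. */
static int demo_destroy_cq(struct demo_cq *cq)
{
	int ret = demo_hw_destroy(cq);

	if (ret)
		return ret;

	free(cq->buf);
	cq->buf = NULL;
	return 0;
}
```

The design point is the early return: freeing the buffer before knowing the hardware let go of it would leave the device pointing at released memory.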
+84 -64
drivers/infiniband/hw/mlx5/fs.c
···
 #define LAST_COUNTERS_FIELD counters

 /* Field is the last supported field */
-#define FIELDS_NOT_SUPPORTED(filter, field)\
-	memchr_inv((void *)&filter.field +\
-		   sizeof(filter.field), 0,\
-		   sizeof(filter) -\
-		   offsetof(typeof(filter), field) -\
-		   sizeof(filter.field))
+#define FIELDS_NOT_SUPPORTED(filter, field) \
+	memchr_inv((void *)&filter.field + sizeof(filter.field), 0, \
+		   sizeof(filter) - offsetofend(typeof(filter), field))

 int parse_flow_flow_action(struct mlx5_ib_flow_action *maction,
 			   bool is_egress,
···
 {
 	bool dont_trap = flow_attr->flags & IB_FLOW_ATTR_FLAGS_DONT_TRAP;
 	struct mlx5_flow_namespace *ns = NULL;
+	enum mlx5_flow_namespace_type fn_type;
 	struct mlx5_ib_flow_prio *prio;
 	struct mlx5_flow_table *ft;
 	int max_table_size;
···
 						       log_max_ft_size));
 	esw_encap = mlx5_eswitch_get_encap_mode(dev->mdev) !=
 		DEVLINK_ESWITCH_ENCAP_MODE_NONE;
-	if (flow_attr->type == IB_FLOW_ATTR_NORMAL) {
-		enum mlx5_flow_namespace_type fn_type;
-
-		if (flow_is_multicast_only(flow_attr) &&
-		    !dont_trap)
+	switch (flow_attr->type) {
+	case IB_FLOW_ATTR_NORMAL:
+		if (flow_is_multicast_only(flow_attr) && !dont_trap)
 			priority = MLX5_IB_FLOW_MCAST_PRIO;
 		else
 			priority = ib_prio_to_core_prio(flow_attr->priority,
···
 				flags |= MLX5_FLOW_TABLE_TUNNEL_EN_DECAP;
 			if (!dev->is_rep && !esw_encap &&
 			    MLX5_CAP_FLOWTABLE_NIC_RX(dev->mdev,
-					reformat_l3_tunnel_to_l2))
+						      reformat_l3_tunnel_to_l2))
 				flags |= MLX5_FLOW_TABLE_TUNNEL_EN_REFORMAT;
 		} else {
-			max_table_size =
-				BIT(MLX5_CAP_FLOWTABLE_NIC_TX(dev->mdev,
-							      log_max_ft_size));
+			max_table_size = BIT(MLX5_CAP_FLOWTABLE_NIC_TX(
+				dev->mdev, log_max_ft_size));
 			fn_type = MLX5_FLOW_NAMESPACE_EGRESS;
 			prio = &dev->flow_db->egress_prios[priority];
 			if (!dev->is_rep && !esw_encap &&
···
 		ns = mlx5_get_flow_namespace(dev->mdev, fn_type);
 		num_entries = MLX5_FS_MAX_ENTRIES;
 		num_groups = MLX5_FS_MAX_TYPES;
-	} else if (flow_attr->type == IB_FLOW_ATTR_ALL_DEFAULT ||
-		   flow_attr->type == IB_FLOW_ATTR_MC_DEFAULT) {
+		break;
+	case IB_FLOW_ATTR_ALL_DEFAULT:
+	case IB_FLOW_ATTR_MC_DEFAULT:
 		ns = mlx5_get_flow_namespace(dev->mdev,
 					     MLX5_FLOW_NAMESPACE_LEFTOVERS);
-		build_leftovers_ft_param(&priority,
-					 &num_entries,
-					 &num_groups);
+		build_leftovers_ft_param(&priority, &num_entries, &num_groups);
 		prio = &dev->flow_db->prios[MLX5_IB_FLOW_LEFTOVERS_PRIO];
-	} else if (flow_attr->type == IB_FLOW_ATTR_SNIFFER) {
+		break;
+	case IB_FLOW_ATTR_SNIFFER:
 		if (!MLX5_CAP_FLOWTABLE(dev->mdev,
 					allow_sniffer_and_nic_rx_shared_tir))
 			return ERR_PTR(-EOPNOTSUPP);

-		ns = mlx5_get_flow_namespace(dev->mdev, ft_type == MLX5_IB_FT_RX ?
-					     MLX5_FLOW_NAMESPACE_SNIFFER_RX :
-					     MLX5_FLOW_NAMESPACE_SNIFFER_TX);
+		ns = mlx5_get_flow_namespace(
+			dev->mdev, ft_type == MLX5_IB_FT_RX ?
+					   MLX5_FLOW_NAMESPACE_SNIFFER_RX :
+					   MLX5_FLOW_NAMESPACE_SNIFFER_TX);

 		prio = &dev->flow_db->sniffer[ft_type];
 		priority = 0;
 		num_entries = 1;
 		num_groups = 1;
+		break;
+	default:
+		break;
 	}

 	if (!ns)
···
 	if (!flow_is_multicast_only(flow_attr))
 		set_underlay_qp(dev, spec, underlay_qpn);

-	if (dev->is_rep) {
+	if (dev->is_rep && flow_attr->type != IB_FLOW_ATTR_SNIFFER) {
 		struct mlx5_eswitch_rep *rep;

 		rep = dev->port[flow_attr->port - 1].rep;
···
 	int err;
 	static const struct ib_flow_attr flow_attr = {
 		.num_of_specs = 0,
+		.type = IB_FLOW_ATTR_SNIFFER,
 		.size = sizeof(flow_attr)
 	};

···
 	return ERR_PTR(err);
 }

-
 static struct ib_flow *mlx5_ib_create_flow(struct ib_qp *qp,
 					   struct ib_flow_attr *flow_attr,
-					   int domain,
 					   struct ib_udata *udata)
 {
 	struct mlx5_ib_dev *dev = to_mdev(qp->device);
···
 	int underlay_qpn;

 	if (udata && udata->inlen) {
-		min_ucmd_sz = offsetof(typeof(ucmd_hdr), reserved) +
-				sizeof(ucmd_hdr.reserved);
+		min_ucmd_sz = offsetofend(struct mlx5_ib_create_flow, reserved);
 		if (udata->inlen < min_ucmd_sz)
 			return ERR_PTR(-EOPNOTSUPP);

···
 		goto free_ucmd;
 	}

-	if (domain != IB_FLOW_DOMAIN_USER ||
-	    flow_attr->port > dev->num_ports ||
-	    (flow_attr->flags & ~(IB_FLOW_ATTR_FLAGS_DONT_TRAP |
-				  IB_FLOW_ATTR_FLAGS_EGRESS))) {
+	if (flow_attr->port > dev->num_ports ||
+	    (flow_attr->flags &
+	     ~(IB_FLOW_ATTR_FLAGS_DONT_TRAP | IB_FLOW_ATTR_FLAGS_EGRESS))) {
 		err = -EINVAL;
 		goto free_ucmd;
 	}
···
 			dst->tir_num = mqp->raw_packet_qp.rq.tirn;
 	}

-	if (flow_attr->type == IB_FLOW_ATTR_NORMAL) {
+	switch (flow_attr->type) {
+	case IB_FLOW_ATTR_NORMAL:
 		underlay_qpn = (mqp->flags & IB_QP_CREATE_SOURCE_QPN) ?
 				       mqp->underlay_qpn :
 				       0;
 		handler = _create_flow_rule(dev, ft_prio, flow_attr, dst,
 					    underlay_qpn, ucmd);
-	} else if (flow_attr->type == IB_FLOW_ATTR_ALL_DEFAULT ||
-		   flow_attr->type == IB_FLOW_ATTR_MC_DEFAULT) {
-		handler = create_leftovers_rule(dev, ft_prio, flow_attr,
-						dst);
-	} else if (flow_attr->type == IB_FLOW_ATTR_SNIFFER) {
+		break;
+	case IB_FLOW_ATTR_ALL_DEFAULT:
+	case IB_FLOW_ATTR_MC_DEFAULT:
+		handler = create_leftovers_rule(dev, ft_prio, flow_attr, dst);
+		break;
+	case IB_FLOW_ATTR_SNIFFER:
 		handler = create_sniffer_rule(dev, ft_prio, ft_prio_tx, dst);
-	} else {
+		break;
+	default:
 		err = -EINVAL;
 		goto destroy_ft;
 	}
···

 	esw_encap = mlx5_eswitch_get_encap_mode(dev->mdev) !=
 		DEVLINK_ESWITCH_ENCAP_MODE_NONE;
-	if (fs_matcher->ns_type == MLX5_FLOW_NAMESPACE_BYPASS) {
-		max_table_size = BIT(MLX5_CAP_FLOWTABLE_NIC_RX(dev->mdev,
-					log_max_ft_size));
+	switch (fs_matcher->ns_type) {
+	case MLX5_FLOW_NAMESPACE_BYPASS:
+		max_table_size = BIT(
+			MLX5_CAP_FLOWTABLE_NIC_RX(dev->mdev, log_max_ft_size));
 		if (MLX5_CAP_FLOWTABLE_NIC_RX(dev->mdev, decap) && !esw_encap)
 			flags |= MLX5_FLOW_TABLE_TUNNEL_EN_DECAP;
 		if (MLX5_CAP_FLOWTABLE_NIC_RX(dev->mdev,
 					      reformat_l3_tunnel_to_l2) &&
 		    !esw_encap)
 			flags |= MLX5_FLOW_TABLE_TUNNEL_EN_REFORMAT;
-	} else if (fs_matcher->ns_type == MLX5_FLOW_NAMESPACE_EGRESS) {
+		break;
+	case MLX5_FLOW_NAMESPACE_EGRESS:
 		max_table_size = BIT(
 			MLX5_CAP_FLOWTABLE_NIC_TX(dev->mdev, log_max_ft_size));
-		if (MLX5_CAP_FLOWTABLE_NIC_TX(dev->mdev, reformat) && !esw_encap)
+		if (MLX5_CAP_FLOWTABLE_NIC_TX(dev->mdev, reformat) &&
+		    !esw_encap)
 			flags |= MLX5_FLOW_TABLE_TUNNEL_EN_REFORMAT;
-	} else if (fs_matcher->ns_type == MLX5_FLOW_NAMESPACE_FDB) {
+		break;
+	case MLX5_FLOW_NAMESPACE_FDB:
 		max_table_size = BIT(
 			MLX5_CAP_ESW_FLOWTABLE_FDB(dev->mdev, log_max_ft_size));
 		if (MLX5_CAP_ESW_FLOWTABLE_FDB(dev->mdev, decap) && esw_encap)
 			flags |= MLX5_FLOW_TABLE_TUNNEL_EN_DECAP;
-		if (MLX5_CAP_ESW_FLOWTABLE_FDB(dev->mdev, reformat_l3_tunnel_to_l2) &&
+		if (MLX5_CAP_ESW_FLOWTABLE_FDB(dev->mdev,
+					       reformat_l3_tunnel_to_l2) &&
 		    esw_encap)
 			flags |= MLX5_FLOW_TABLE_TUNNEL_EN_REFORMAT;
 		priority = FDB_BYPASS_PATH;
-	} else if (fs_matcher->ns_type == MLX5_FLOW_NAMESPACE_RDMA_RX) {
-		max_table_size =
-			BIT(MLX5_CAP_FLOWTABLE_RDMA_RX(dev->mdev,
-						       log_max_ft_size));
+		break;
+	case MLX5_FLOW_NAMESPACE_RDMA_RX:
+		max_table_size = BIT(
+			MLX5_CAP_FLOWTABLE_RDMA_RX(dev->mdev, log_max_ft_size));
 		priority = fs_matcher->priority;
-	} else if (fs_matcher->ns_type == MLX5_FLOW_NAMESPACE_RDMA_TX) {
-		max_table_size =
-			BIT(MLX5_CAP_FLOWTABLE_RDMA_TX(dev->mdev,
-						       log_max_ft_size));
+		break;
+	case MLX5_FLOW_NAMESPACE_RDMA_TX:
+		max_table_size = BIT(
+			MLX5_CAP_FLOWTABLE_RDMA_TX(dev->mdev, log_max_ft_size));
 		priority = fs_matcher->priority;
+		break;
+	default:
+		break;
 	}

 	max_table_size = min_t(int, max_table_size, MLX5_FS_MAX_ENTRIES);
···
 	if (!ns)
 		return ERR_PTR(-EOPNOTSUPP);

-	if (fs_matcher->ns_type == MLX5_FLOW_NAMESPACE_BYPASS)
+	switch (fs_matcher->ns_type) {
+	case MLX5_FLOW_NAMESPACE_BYPASS:
 		prio = &dev->flow_db->prios[priority];
-	else if (fs_matcher->ns_type == MLX5_FLOW_NAMESPACE_EGRESS)
+		break;
+	case MLX5_FLOW_NAMESPACE_EGRESS:
 		prio = &dev->flow_db->egress_prios[priority];
-	else if (fs_matcher->ns_type == MLX5_FLOW_NAMESPACE_FDB)
+		break;
+	case MLX5_FLOW_NAMESPACE_FDB:
 		prio = &dev->flow_db->fdb;
-	else if (fs_matcher->ns_type == MLX5_FLOW_NAMESPACE_RDMA_RX)
+		break;
+	case MLX5_FLOW_NAMESPACE_RDMA_RX:
 		prio = &dev->flow_db->rdma_rx[priority];
-	else if (fs_matcher->ns_type == MLX5_FLOW_NAMESPACE_RDMA_TX)
+		break;
+	case MLX5_FLOW_NAMESPACE_RDMA_TX:
 		prio = &dev->flow_db->rdma_tx[priority];
+		break;
+	default: return ERR_PTR(-EINVAL);
+	}

 	if (!prio)
 		return ERR_PTR(-EINVAL);
···
 		goto unlock;
 	}

-	if (dest_type == MLX5_FLOW_DESTINATION_TYPE_TIR) {
+	switch (dest_type) {
+	case MLX5_FLOW_DESTINATION_TYPE_TIR:
 		dst[dst_num].type = dest_type;
 		dst[dst_num++].tir_num = dest_id;
 		flow_act->action |= MLX5_FLOW_CONTEXT_ACTION_FWD_DEST;
-	} else if (dest_type == MLX5_FLOW_DESTINATION_TYPE_FLOW_TABLE) {
+		break;
+	case MLX5_FLOW_DESTINATION_TYPE_FLOW_TABLE:
 		dst[dst_num].type = MLX5_FLOW_DESTINATION_TYPE_FLOW_TABLE_NUM;
 		dst[dst_num++].ft_num = dest_id;
 		flow_act->action |= MLX5_FLOW_CONTEXT_ACTION_FWD_DEST;
-	} else if (dest_type == MLX5_FLOW_DESTINATION_TYPE_PORT) {
+		break;
+	case MLX5_FLOW_DESTINATION_TYPE_PORT:
 		dst[dst_num++].type = MLX5_FLOW_DESTINATION_TYPE_PORT;
 		flow_act->action |= MLX5_FLOW_CONTEXT_ACTION_ALLOW;
+		break;
+	default:
+		break;
 	}
-

 	if (flow_act->action & MLX5_FLOW_CONTEXT_ACTION_COUNT) {
 		dst[dst_num].type = MLX5_FLOW_DESTINATION_TYPE_COUNTER;
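The `FIELDS_NOT_SUPPORTED()` macro at the top of the fs.c diff checks that every byte after the last supported field of a filter is zero, using the kernel's `memchr_inv()`. The sketch below reproduces that check in userspace under stated assumptions: `demo_memchr_inv()` is a simplified stand-in for the kernel function, `offsetofend` is redefined locally, and `struct demo_filter` and `DEMO_FIELDS_NOT_SUPPORTED` are hypothetical. Unlike the kernel macro, the demo macro takes the type explicitly because `typeof` is not standard C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Userspace stand-in for the kernel's offsetofend(). */
#define offsetofend(TYPE, MEMBER) \
	(offsetof(TYPE, MEMBER) + sizeof(((TYPE *)0)->MEMBER))

/* Simplified stand-in for memchr_inv(): returns the first byte in
 * [start, start + n) that differs from c, or NULL if all bytes match. */
static const void *demo_memchr_inv(const void *start, int c, size_t n)
{
	const unsigned char *p = start;
	size_t i;

	for (i = 0; i < n; i++)
		if (p[i] != (unsigned char)c)
			return &p[i];
	return NULL;
}

/* Hypothetical filter: the last member stands for a field that an older
 * consumer of the struct does not understand. */
struct demo_filter {
	uint32_t src_ip;
	uint32_t dst_ip;
	uint32_t future_field;
};

/* Non-NULL when anything after `field` is set, i.e. when the caller used
 * a field that lies beyond the last supported one. */
#define DEMO_FIELDS_NOT_SUPPORTED(filter, TYPE, field)                    \
	demo_memchr_inv((const unsigned char *)&(filter) +                \
				offsetofend(TYPE, field),                 \
			0, sizeof(filter) - offsetofend(TYPE, field))
```

A caller would reject the request whenever the macro returns non-NULL, which lets the struct grow over time without old code silently ignoring new fields.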
+61 -93
drivers/infiniband/hw/mlx5/gsi.c
···
 struct mlx5_ib_gsi_wr {
 	struct ib_cqe cqe;
 	struct ib_wc wc;
-	int send_flags;
 	bool completed:1;
 };
-
-struct mlx5_ib_gsi_qp {
-	struct ib_qp ibqp;
-	struct ib_qp *rx_qp;
-	u8 port_num;
-	struct ib_qp_cap cap;
-	enum ib_sig_type sq_sig_type;
-	/* Serialize qp state modifications */
-	struct mutex mutex;
-	struct ib_cq *cq;
-	struct mlx5_ib_gsi_wr *outstanding_wrs;
-	u32 outstanding_pi, outstanding_ci;
-	int num_qps;
-	/* Protects access to the tx_qps. Post send operations synchronize
-	 * with tx_qp creation in setup_qp(). Also protects the
-	 * outstanding_wrs array and indices.
-	 */
-	spinlock_t lock;
-	struct ib_qp **tx_qps;
-};
-
-static struct mlx5_ib_gsi_qp *gsi_qp(struct ib_qp *qp)
-{
-	return container_of(qp, struct mlx5_ib_gsi_qp, ibqp);
-}

 static bool mlx5_ib_deth_sqpn_cap(struct mlx5_ib_dev *dev)
 {
···
 }

 /* Call with gsi->lock locked */
-static void generate_completions(struct mlx5_ib_gsi_qp *gsi)
+static void generate_completions(struct mlx5_ib_qp *mqp)
 {
-	struct ib_cq *gsi_cq = gsi->ibqp.send_cq;
+	struct mlx5_ib_gsi_qp *gsi = &mqp->gsi;
+	struct ib_cq *gsi_cq = mqp->ibqp.send_cq;
 	struct mlx5_ib_gsi_wr *wr;
 	u32 index;

···
 		if (!wr->completed)
 			break;

-		if (gsi->sq_sig_type == IB_SIGNAL_ALL_WR ||
-		    wr->send_flags & IB_SEND_SIGNALED)
-			WARN_ON_ONCE(mlx5_ib_generate_wc(gsi_cq, &wr->wc));
-
+		WARN_ON_ONCE(mlx5_ib_generate_wc(gsi_cq, &wr->wc));
 		wr->completed = false;
 	}

···
 	struct mlx5_ib_gsi_qp *gsi = cq->cq_context;
 	struct mlx5_ib_gsi_wr *wr =
 		container_of(wc->wr_cqe, struct mlx5_ib_gsi_wr, cqe);
+	struct mlx5_ib_qp *mqp = container_of(gsi, struct mlx5_ib_qp, gsi);
 	u64 wr_id;
 	unsigned long flags;

···
 	wr_id = wr->wc.wr_id;
 	wr->wc = *wc;
 	wr->wc.wr_id = wr_id;
-	wr->wc.qp = &gsi->ibqp;
+	wr->wc.qp = &mqp->ibqp;

-	generate_completions(gsi);
+	generate_completions(mqp);
 	spin_unlock_irqrestore(&gsi->lock, flags);
 }

-struct ib_qp *mlx5_ib_gsi_create_qp(struct ib_pd *pd,
-				    struct ib_qp_init_attr *init_attr)
+int mlx5_ib_create_gsi(struct ib_pd *pd, struct mlx5_ib_qp *mqp,
+		       struct ib_qp_init_attr *attr)
 {
 	struct mlx5_ib_dev *dev = to_mdev(pd->device);
 	struct mlx5_ib_gsi_qp *gsi;
-	struct ib_qp_init_attr hw_init_attr = *init_attr;
-	const u8 port_num = init_attr->port_num;
+	struct ib_qp_init_attr hw_init_attr = *attr;
+	const u8 port_num = attr->port_num;
 	int num_qps = 0;
 	int ret;

···
 			num_qps = MLX5_MAX_PORTS;
 	}

-	gsi = kzalloc(sizeof(*gsi), GFP_KERNEL);
-	if (!gsi)
-		return ERR_PTR(-ENOMEM);
-
+	gsi = &mqp->gsi;
 	gsi->tx_qps = kcalloc(num_qps, sizeof(*gsi->tx_qps), GFP_KERNEL);
-	if (!gsi->tx_qps) {
-		ret = -ENOMEM;
-		goto err_free;
-	}
+	if (!gsi->tx_qps)
+		return -ENOMEM;

-	gsi->outstanding_wrs = kcalloc(init_attr->cap.max_send_wr,
-				       sizeof(*gsi->outstanding_wrs),
-				       GFP_KERNEL);
+	gsi->outstanding_wrs =
+		kcalloc(attr->cap.max_send_wr, sizeof(*gsi->outstanding_wrs),
+			GFP_KERNEL);
 	if (!gsi->outstanding_wrs) {
 		ret = -ENOMEM;
 		goto err_free_tx;
 	}
-
-	mutex_init(&gsi->mutex);

 	mutex_lock(&dev->devr.mutex);

···
 	gsi->num_qps = num_qps;
 	spin_lock_init(&gsi->lock);

-	gsi->cap = init_attr->cap;
-	gsi->sq_sig_type = init_attr->sq_sig_type;
-	gsi->ibqp.qp_num = 1;
+	gsi->cap = attr->cap;
 	gsi->port_num = port_num;

-	gsi->cq = ib_alloc_cq(pd->device, gsi, init_attr->cap.max_send_wr, 0,
+	gsi->cq = ib_alloc_cq(pd->device, gsi, attr->cap.max_send_wr, 0,
 			      IB_POLL_SOFTIRQ);
 	if (IS_ERR(gsi->cq)) {
 		mlx5_ib_warn(dev, "unable to create send CQ for GSI QP. error %ld\n",
···
 		hw_init_attr.cap.max_send_sge = 0;
 		hw_init_attr.cap.max_inline_data = 0;
 	}
-	gsi->rx_qp = ib_create_qp(pd, &hw_init_attr);
+
+	gsi->rx_qp = mlx5_ib_create_qp(pd, &hw_init_attr, NULL);
 	if (IS_ERR(gsi->rx_qp)) {
 		mlx5_ib_warn(dev, "unable to create hardware GSI QP. error %ld\n",
 			     PTR_ERR(gsi->rx_qp));
 		ret = PTR_ERR(gsi->rx_qp);
 		goto err_destroy_cq;
 	}
+	gsi->rx_qp->device = pd->device;
+	gsi->rx_qp->pd = pd;
+	gsi->rx_qp->real_qp = gsi->rx_qp;

-	dev->devr.ports[init_attr->port_num - 1].gsi = gsi;
+	gsi->rx_qp->qp_type = hw_init_attr.qp_type;
+	gsi->rx_qp->send_cq = hw_init_attr.send_cq;
+	gsi->rx_qp->recv_cq = hw_init_attr.recv_cq;
+	gsi->rx_qp->event_handler = hw_init_attr.event_handler;
+	spin_lock_init(&gsi->rx_qp->mr_lock);
+	INIT_LIST_HEAD(&gsi->rx_qp->rdma_mrs);
+	INIT_LIST_HEAD(&gsi->rx_qp->sig_mrs);
+
+	dev->devr.ports[attr->port_num - 1].gsi = gsi;

 	mutex_unlock(&dev->devr.mutex);

-	return &gsi->ibqp;
+	return 0;

 err_destroy_cq:
 	ib_free_cq(gsi->cq);
···
 	kfree(gsi->outstanding_wrs);
 err_free_tx:
 	kfree(gsi->tx_qps);
-err_free:
-	kfree(gsi);
-	return ERR_PTR(ret);
+	return ret;
 }

-int mlx5_ib_gsi_destroy_qp(struct ib_qp *qp)
+int mlx5_ib_destroy_gsi(struct mlx5_ib_qp *mqp)
 {
-	struct mlx5_ib_dev *dev = to_mdev(qp->device);
-	struct mlx5_ib_gsi_qp *gsi = gsi_qp(qp);
+	struct mlx5_ib_dev *dev = to_mdev(mqp->ibqp.device);
+	struct mlx5_ib_gsi_qp *gsi = &mqp->gsi;
 	const int port_num = gsi->port_num;
 	int qp_index;
 	int ret;

-	mlx5_ib_dbg(dev, "destroying GSI QP\n");
-
 	mutex_lock(&dev->devr.mutex);
-	ret = ib_destroy_qp(gsi->rx_qp);
+	ret = mlx5_ib_destroy_qp(gsi->rx_qp, NULL);
 	if (ret) {
 		mlx5_ib_warn(dev, "unable to destroy hardware GSI QP. error %d\n",
 			     ret);
···

 	kfree(gsi->outstanding_wrs);
 	kfree(gsi->tx_qps);
-	kfree(gsi);
+	kfree(mqp);

 	return 0;
 }
···
 			.max_send_sge = gsi->cap.max_send_sge,
 			.max_inline_data = gsi->cap.max_inline_data,
 		},
-		.sq_sig_type = gsi->sq_sig_type,
 		.qp_type = IB_QPT_UD,
 		.create_flags = MLX5_IB_QP_CREATE_SQPN_QP1,
 	};
···

 static void setup_qps(struct mlx5_ib_gsi_qp *gsi)
 {
+	struct mlx5_ib_dev *dev = to_mdev(gsi->rx_qp->device);
 	u16 qp_index;

+	mutex_lock(&dev->devr.mutex);
 	for (qp_index = 0; qp_index < gsi->num_qps; ++qp_index)
 		setup_qp(gsi, qp_index);
+	mutex_unlock(&dev->devr.mutex);
 }

 int mlx5_ib_gsi_modify_qp(struct ib_qp *qp, struct ib_qp_attr *attr,
 			  int attr_mask)
 {
 	struct mlx5_ib_dev *dev = to_mdev(qp->device);
-	struct mlx5_ib_gsi_qp *gsi = gsi_qp(qp);
+	struct mlx5_ib_qp *mqp = to_mqp(qp);
+	struct mlx5_ib_gsi_qp *gsi = &mqp->gsi;
 	int ret;

 	mlx5_ib_dbg(dev, "modifying GSI QP to state %d\n", attr->qp_state);

-	mutex_lock(&gsi->mutex);
 	ret = ib_modify_qp(gsi->rx_qp, attr, attr_mask);
 	if (ret) {
 		mlx5_ib_warn(dev, "unable to modify GSI rx QP: %d\n", ret);
-		goto unlock;
+		return ret;
 	}

 	if (to_mqp(gsi->rx_qp)->state == IB_QPS_RTS)
 		setup_qps(gsi);
-
-unlock:
-	mutex_unlock(&gsi->mutex);
-
-	return ret;
+	return 0;
 }

 int mlx5_ib_gsi_query_qp(struct ib_qp *qp, struct ib_qp_attr *qp_attr,
 			 int qp_attr_mask,
 			 struct ib_qp_init_attr *qp_init_attr)
 {
-	struct mlx5_ib_gsi_qp *gsi = gsi_qp(qp);
+	struct mlx5_ib_qp *mqp = to_mqp(qp);
+	struct mlx5_ib_gsi_qp *gsi = &mqp->gsi;
 	int ret;

-	mutex_lock(&gsi->mutex);
 	ret = ib_query_qp(gsi->rx_qp, qp_attr, qp_attr_mask, qp_init_attr);
 	qp_init_attr->cap = gsi->cap;
-	mutex_unlock(&gsi->mutex);
-
 	return ret;
 }

 /* Call with gsi->lock locked */
-static int mlx5_ib_add_outstanding_wr(struct mlx5_ib_gsi_qp *gsi,
+static int mlx5_ib_add_outstanding_wr(struct mlx5_ib_qp *mqp,
 				      struct ib_ud_wr *wr, struct ib_wc *wc)
 {
+	struct mlx5_ib_gsi_qp *gsi = &mqp->gsi;
 	struct mlx5_ib_dev *dev = to_mdev(gsi->rx_qp->device);
 	struct mlx5_ib_gsi_wr *gsi_wr;

···
 }

 /* Call with gsi->lock locked */
-static int mlx5_ib_gsi_silent_drop(struct mlx5_ib_gsi_qp *gsi,
-				    struct ib_ud_wr *wr)
+static int mlx5_ib_gsi_silent_drop(struct mlx5_ib_qp *mqp, struct ib_ud_wr *wr)
 {
 	struct ib_wc wc = {
 		{ .wr_id = wr->wr.wr_id },
 		.status = IB_WC_SUCCESS,
 		.opcode = IB_WC_SEND,
-		.qp = &gsi->ibqp,
+		.qp = &mqp->ibqp,
 	};
 	int ret;

-	ret = mlx5_ib_add_outstanding_wr(gsi, wr, &wc);
+	ret = mlx5_ib_add_outstanding_wr(mqp, wr, &wc);
 	if (ret)
 		return ret;

-	generate_completions(gsi);
+	generate_completions(mqp);

 	return 0;
 }
···
 int mlx5_ib_gsi_post_send(struct ib_qp *qp, const struct ib_send_wr *wr,
 			  const struct ib_send_wr **bad_wr)
 {
-	struct mlx5_ib_gsi_qp *gsi = gsi_qp(qp);
+	struct mlx5_ib_qp *mqp = to_mqp(qp);
+	struct mlx5_ib_gsi_qp *gsi = &mqp->gsi;
 	struct ib_qp *tx_qp;
 	unsigned long flags;
 	int ret;
···
 		spin_lock_irqsave(&gsi->lock, flags);
 		tx_qp = get_tx_qp(gsi, &cur_wr);
 		if (!tx_qp) {
-			ret = mlx5_ib_gsi_silent_drop(gsi, &cur_wr);
+			ret = mlx5_ib_gsi_silent_drop(mqp, &cur_wr);
 			if (ret)
 				goto err;
478 509 spin_unlock_irqrestore(&gsi->lock, flags); 479 510 continue; 480 511 } 481 512 482 - ret = mlx5_ib_add_outstanding_wr(gsi, &cur_wr, NULL); 513 + ret = mlx5_ib_add_outstanding_wr(mqp, &cur_wr, NULL); 483 514 if (ret) 484 515 goto err; 485 516 ··· 503 534 int mlx5_ib_gsi_post_recv(struct ib_qp *qp, const struct ib_recv_wr *wr, 504 535 const struct ib_recv_wr **bad_wr) 505 536 { 506 - struct mlx5_ib_gsi_qp *gsi = gsi_qp(qp); 537 + struct mlx5_ib_qp *mqp = to_mqp(qp); 538 + struct mlx5_ib_gsi_qp *gsi = &mqp->gsi; 507 539 508 540 return ib_post_recv(gsi->rx_qp, wr, bad_wr); 509 541 } ··· 514 544 if (!gsi) 515 545 return; 516 546 517 - mutex_lock(&gsi->mutex); 518 547 setup_qps(gsi); 519 - mutex_unlock(&gsi->mutex); 520 548 }
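Note on the GSI hunks above: the separately allocated `struct mlx5_ib_gsi_qp` becomes a member embedded in `struct mlx5_ib_qp`, so each entry point goes from `gsi_qp(qp)` to `to_mqp(qp)` followed by `&mqp->gsi`, and teardown frees `mqp` rather than `gsi`. The following is a minimal userspace sketch of that embedding pattern, not the driver code. All `demo_*` names are hypothetical stand-ins, and the real `gsi` member sits inside a union that is omitted here.

```c
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-in for the kernel's container_of() macro. */
#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical, heavily simplified stand-ins for the driver structs. */
struct demo_ib_qp {
	int qp_num;
};

struct demo_gsi {
	int num_qps;
};

struct demo_mqp {
	struct demo_ib_qp ibqp;	/* handle exposed to the core layer */
	struct demo_gsi gsi;	/* embedded, no separate allocation */
};

/* Mirrors the to_mqp() step: recover the wrapper from the core handle. */
static struct demo_mqp *demo_to_mqp(struct demo_ib_qp *qp)
{
	return demo_container_of(qp, struct demo_mqp, ibqp);
}

/* Mirrors "gsi = &mqp->gsi": the GSI state lives inside the wrapper. */
static struct demo_gsi *demo_gsi_from_qp(struct demo_ib_qp *qp)
{
	return &demo_to_mqp(qp)->gsi;
}

/* One allocation covers both the QP handle and the GSI state. */
static struct demo_ib_qp *demo_create(int num_qps)
{
	struct demo_mqp *mqp = calloc(1, sizeof(*mqp));

	if (!mqp)
		return NULL;
	mqp->gsi.num_qps = num_qps;
	return &mqp->ibqp;
}

/* One free releases both, like kfree(mqp) replacing kfree(gsi). */
static void demo_destroy(struct demo_ib_qp *qp)
{
	free(demo_to_mqp(qp));
}
```

Because the GSI state shares its lifetime with the wrapper, there is no second pointer whose lifetime has to be tracked, which is presumably why the per-GSI mutex could be dropped in favour of the device-level one.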
+32 -38
drivers/infiniband/hw/mlx5/main.c
··· 326 326 spin_unlock(&port->mp.mpi_lock); 327 327 } 328 328 329 - static int translate_eth_legacy_proto_oper(u32 eth_proto_oper, u8 *active_speed, 330 - u8 *active_width) 329 + static int translate_eth_legacy_proto_oper(u32 eth_proto_oper, 330 + u16 *active_speed, u8 *active_width) 331 331 { 332 332 switch (eth_proto_oper) { 333 333 case MLX5E_PROT_MASK(MLX5E_1000BASE_CX_SGMII): ··· 384 384 return 0; 385 385 } 386 386 387 - static int translate_eth_ext_proto_oper(u32 eth_proto_oper, u8 *active_speed, 387 + static int translate_eth_ext_proto_oper(u32 eth_proto_oper, u16 *active_speed, 388 388 u8 *active_width) 389 389 { 390 390 switch (eth_proto_oper) { ··· 436 436 return 0; 437 437 } 438 438 439 - static int translate_eth_proto_oper(u32 eth_proto_oper, u8 *active_speed, 439 + static int translate_eth_proto_oper(u32 eth_proto_oper, u16 *active_speed, 440 440 u8 *active_width, bool ext) 441 441 { 442 442 return ext ? ··· 546 546 unsigned int index, const union ib_gid *gid, 547 547 const struct ib_gid_attr *attr) 548 548 { 549 - enum ib_gid_type gid_type = IB_GID_TYPE_IB; 549 + enum ib_gid_type gid_type = IB_GID_TYPE_ROCE; 550 550 u16 vlan_id = 0xffff; 551 551 u8 roce_version = 0; 552 552 u8 roce_l3_type = 0; ··· 561 561 } 562 562 563 563 switch (gid_type) { 564 - case IB_GID_TYPE_IB: 564 + case IB_GID_TYPE_ROCE: 565 565 roce_version = MLX5_ROCE_VERSION_1; 566 566 break; 567 567 case IB_GID_TYPE_ROCE_UDP_ENCAP: ··· 840 840 /* We support 'Gappy' memory registration too */ 841 841 props->device_cap_flags |= IB_DEVICE_SG_GAPS_REG; 842 842 } 843 - props->device_cap_flags |= IB_DEVICE_MEM_MGT_EXTENSIONS; 843 + /* IB_WR_REG_MR always requires changing the entity size with UMR */ 844 + if (!MLX5_CAP_GEN(dev->mdev, umr_modify_entity_size_disabled)) 845 + props->device_cap_flags |= IB_DEVICE_MEM_MGT_EXTENSIONS; 844 846 if (MLX5_CAP_GEN(mdev, sho)) { 845 847 props->device_cap_flags |= IB_DEVICE_INTEGRITY_HANDOVER; 846 848 /* At this stage no support for signature handover */ 
··· 1177 1175 return 0; 1178 1176 } 1179 1177 1180 - enum mlx5_ib_width { 1181 - MLX5_IB_WIDTH_1X = 1 << 0, 1182 - MLX5_IB_WIDTH_2X = 1 << 1, 1183 - MLX5_IB_WIDTH_4X = 1 << 2, 1184 - MLX5_IB_WIDTH_8X = 1 << 3, 1185 - MLX5_IB_WIDTH_12X = 1 << 4 1186 - }; 1187 - 1188 - static void translate_active_width(struct ib_device *ibdev, u8 active_width, 1189 - u8 *ib_width) 1178 + static void translate_active_width(struct ib_device *ibdev, u16 active_width, 1179 + u8 *ib_width) 1190 1180 { 1191 1181 struct mlx5_ib_dev *dev = to_mdev(ibdev); 1192 1182 1193 - if (active_width & MLX5_IB_WIDTH_1X) 1183 + if (active_width & MLX5_PTYS_WIDTH_1X) 1194 1184 *ib_width = IB_WIDTH_1X; 1195 - else if (active_width & MLX5_IB_WIDTH_2X) 1185 + else if (active_width & MLX5_PTYS_WIDTH_2X) 1196 1186 *ib_width = IB_WIDTH_2X; 1197 - else if (active_width & MLX5_IB_WIDTH_4X) 1187 + else if (active_width & MLX5_PTYS_WIDTH_4X) 1198 1188 *ib_width = IB_WIDTH_4X; 1199 - else if (active_width & MLX5_IB_WIDTH_8X) 1189 + else if (active_width & MLX5_PTYS_WIDTH_8X) 1200 1190 *ib_width = IB_WIDTH_8X; 1201 - else if (active_width & MLX5_IB_WIDTH_12X) 1191 + else if (active_width & MLX5_PTYS_WIDTH_12X) 1202 1192 *ib_width = IB_WIDTH_12X; 1203 1193 else { 1204 1194 mlx5_ib_dbg(dev, "Invalid active_width %d, setting width to default value: 4x\n", 1205 - (int)active_width); 1195 + active_width); 1206 1196 *ib_width = IB_WIDTH_4X; 1207 1197 } 1208 1198 ··· 1271 1277 u16 max_mtu; 1272 1278 u16 oper_mtu; 1273 1279 int err; 1274 - u8 ib_link_width_oper; 1280 + u16 ib_link_width_oper; 1275 1281 u8 vl_hw_cap; 1276 1282 1277 1283 rep = kzalloc(sizeof(*rep), GFP_KERNEL); ··· 1304 1310 if (props->port_cap_flags & IB_PORT_CAP_MASK2_SUP) 1305 1311 props->port_cap_flags2 = rep->cap_mask2; 1306 1312 1307 - err = mlx5_query_port_link_width_oper(mdev, &ib_link_width_oper, port); 1313 + err = mlx5_query_ib_port_oper(mdev, &ib_link_width_oper, 1314 + &props->active_speed, port); 1308 1315 if (err) 1309 1316 goto out; 1310 1317 
1311 1318 translate_active_width(ibdev, ib_link_width_oper, &props->active_width); 1312 - 1313 - err = mlx5_query_port_ib_proto_oper(mdev, &props->active_speed, port); 1314 - if (err) 1315 - goto out; 1316 1319 1317 1320 mlx5_query_port_max_mtu(mdev, &max_mtu, port); 1318 1321 ··· 2345 2354 return -EPERM; 2346 2355 2347 2356 if (!(MLX5_CAP_FLOWTABLE_NIC_RX(dev->mdev, sw_owner) || 2348 - MLX5_CAP_FLOWTABLE_NIC_TX(dev->mdev, sw_owner))) 2357 + MLX5_CAP_FLOWTABLE_NIC_TX(dev->mdev, sw_owner) || 2358 + MLX5_CAP_FLOWTABLE_NIC_RX(dev->mdev, sw_owner_v2) || 2359 + MLX5_CAP_FLOWTABLE_NIC_TX(dev->mdev, sw_owner_v2))) 2349 2360 return -EOPNOTSUPP; 2350 2361 break; 2351 2362 } ··· 2562 2569 return 0; 2563 2570 } 2564 2571 2565 - static void mlx5_ib_dealloc_pd(struct ib_pd *pd, struct ib_udata *udata) 2572 + static int mlx5_ib_dealloc_pd(struct ib_pd *pd, struct ib_udata *udata) 2566 2573 { 2567 2574 struct mlx5_ib_dev *mdev = to_mdev(pd->device); 2568 2575 struct mlx5_ib_pd *mpd = to_mpd(pd); 2569 2576 2570 - mlx5_cmd_dealloc_pd(mdev->mdev, mpd->pdn, mpd->uid); 2577 + return mlx5_cmd_dealloc_pd(mdev->mdev, mpd->pdn, mpd->uid); 2571 2578 } 2572 2579 2573 2580 static int mlx5_ib_mcg_attach(struct ib_qp *ibqp, union ib_gid *gid, u16 lid) ··· 2692 2699 container_of(work, struct mlx5_ib_port_resources, 2693 2700 pkey_change_work); 2694 2701 2695 - mutex_lock(&ports->devr->mutex); 2696 2702 mlx5_ib_gsi_pkey_change(ports->gsi); 2697 - mutex_unlock(&ports->devr->mutex); 2698 2703 } 2699 2704 2700 2705 static void mlx5_ib_handle_internal_error(struct mlx5_ib_dev *ibdev) ··· 3118 3127 atomic_inc(&devr->p0->usecnt); 3119 3128 atomic_set(&devr->s1->usecnt, 0); 3120 3129 3121 - for (port = 0; port < ARRAY_SIZE(devr->ports); ++port) { 3130 + for (port = 0; port < ARRAY_SIZE(devr->ports); ++port) 3122 3131 INIT_WORK(&devr->ports[port].pkey_change_work, 3123 3132 pkey_change_handler); 3124 - devr->ports[port].devr = devr; 3125 - } 3126 3133 3127 3134 return 0; 3128 3135 ··· 4087 4098 static 
const struct ib_device_ops mlx5_ib_dev_mw_ops = { 4088 4099 .alloc_mw = mlx5_ib_alloc_mw, 4089 4100 .dealloc_mw = mlx5_ib_dealloc_mw, 4101 + 4102 + INIT_RDMA_OBJ_SIZE(ib_mw, mlx5_ib_mw, ibmw), 4090 4103 }; 4091 4104 4092 4105 static const struct ib_device_ops mlx5_ib_dev_xrc_ops = { ··· 4259 4268 .destroy_wq = mlx5_ib_destroy_wq, 4260 4269 .get_netdev = mlx5_ib_get_netdev, 4261 4270 .modify_wq = mlx5_ib_modify_wq, 4271 + 4272 + INIT_RDMA_OBJ_SIZE(ib_rwq_ind_table, mlx5_ib_rwq_ind_table, 4273 + ib_rwq_ind_tbl), 4262 4274 }; 4263 4275 4264 4276 static int mlx5_ib_roce_init(struct mlx5_ib_dev *dev) ··· 4380 4386 name = "mlx5_%d"; 4381 4387 else 4382 4388 name = "mlx5_bond_%d"; 4383 - return ib_register_device(&dev->ib_dev, name); 4389 + return ib_register_device(&dev->ib_dev, name, &dev->mdev->pdev->dev); 4384 4390 } 4385 4391 4386 4392 static void mlx5_ib_stage_pre_ib_reg_umr_cleanup(struct mlx5_ib_dev *dev)
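Note on `translate_active_width()` in the main.c hunks above: the function widens its input to `u16` and swaps the driver-local `MLX5_IB_WIDTH_*` enum for `MLX5_PTYS_WIDTH_*`, but the selection logic is unchanged. The first matching bit wins and an unknown mask falls back to 4x. The sketch below illustrates that logic in plain C. It assumes the new constants keep the bit positions of the removed enum, which this diff does not show, and the `demo_*` names are hypothetical.

```c
#include <stdint.h>

/* Bit positions copied from the removed enum mlx5_ib_width in the hunk. */
enum demo_ptys_width {
	DEMO_PTYS_WIDTH_1X  = 1 << 0,
	DEMO_PTYS_WIDTH_2X  = 1 << 1,
	DEMO_PTYS_WIDTH_4X  = 1 << 2,
	DEMO_PTYS_WIDTH_8X  = 1 << 3,
	DEMO_PTYS_WIDTH_12X = 1 << 4,
};

/* Hypothetical result codes; the driver stores IB_WIDTH_* values instead. */
enum demo_ib_width {
	DEMO_IB_WIDTH_1X,
	DEMO_IB_WIDTH_2X,
	DEMO_IB_WIDTH_4X,
	DEMO_IB_WIDTH_8X,
	DEMO_IB_WIDTH_12X,
};

/* First matching bit wins, tested from the narrowest width upward. */
static enum demo_ib_width demo_translate_active_width(uint16_t active_width)
{
	if (active_width & DEMO_PTYS_WIDTH_1X)
		return DEMO_IB_WIDTH_1X;
	if (active_width & DEMO_PTYS_WIDTH_2X)
		return DEMO_IB_WIDTH_2X;
	if (active_width & DEMO_PTYS_WIDTH_4X)
		return DEMO_IB_WIDTH_4X;
	if (active_width & DEMO_PTYS_WIDTH_8X)
		return DEMO_IB_WIDTH_8X;
	if (active_width & DEMO_PTYS_WIDTH_12X)
		return DEMO_IB_WIDTH_12X;

	/* Unknown mask: default to 4x, as the driver does after logging. */
	return DEMO_IB_WIDTH_4X;
}
```

A mask with several bits set resolves to the narrowest width present, so the order of the tests is part of the behaviour.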
+2 -2
drivers/infiniband/hw/mlx5/mem.c
@@ -169,8 +169,8 @@
 			  int page_shift, __be64 *pas, int access_flags)
 {
 	return __mlx5_ib_populate_pas(dev, umem, page_shift, 0,
-				      ib_umem_num_pages(umem), pas,
-				      access_flags);
+				      ib_umem_num_dma_blocks(umem, PAGE_SIZE),
+				      pas, access_flags);
 }
 int mlx5_ib_get_buf_offset(u64 addr, int page_shift, u32 *offset)
 {
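Note on the mem.c hunk above: `ib_umem_num_pages(umem)` is replaced by `ib_umem_num_dma_blocks(umem, PAGE_SIZE)`, which takes the block size explicitly. The helper's definition is not part of this diff. The sketch below assumes it counts the aligned blocks of the given size that the region touches, and shows only that arithmetic on raw integers. `demo_num_dma_blocks` is a hypothetical stand-in and does not take a `struct ib_umem`.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Number of pgsz-sized, pgsz-aligned blocks touched by the byte range
 * [iova, iova + length). pgsz must be a power of two.
 * Hypothetical stand-in, not the kernel helper.
 */
static size_t demo_num_dma_blocks(uint64_t iova, uint64_t length,
				  uint64_t pgsz)
{
	/* Round the start down and the end up to a block boundary. */
	uint64_t first = iova & ~(pgsz - 1);
	uint64_t last = (iova + length + pgsz - 1) & ~(pgsz - 1);

	return (size_t)((last - first) / pgsz);
}
```

Under this assumption an unaligned start can add one block compared with a plain `length / pgsz`, for example a 4096-byte region starting at offset 100 spans two 4 KiB blocks.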
+75 -25
drivers/infiniband/hw/mlx5/mlx5_ib.h
··· 384 384 u32 *in; 385 385 }; 386 386 387 + struct mlx5_ib_gsi_qp { 388 + struct ib_qp *rx_qp; 389 + u8 port_num; 390 + struct ib_qp_cap cap; 391 + struct ib_cq *cq; 392 + struct mlx5_ib_gsi_wr *outstanding_wrs; 393 + u32 outstanding_pi, outstanding_ci; 394 + int num_qps; 395 + /* Protects access to the tx_qps. Post send operations synchronize 396 + * with tx_qp creation in setup_qp(). Also protects the 397 + * outstanding_wrs array and indices. 398 + */ 399 + spinlock_t lock; 400 + struct ib_qp **tx_qps; 401 + }; 402 + 387 403 struct mlx5_ib_qp { 388 404 struct ib_qp ibqp; 389 405 union { ··· 407 391 struct mlx5_ib_raw_packet_qp raw_packet_qp; 408 392 struct mlx5_ib_rss_qp rss_qp; 409 393 struct mlx5_ib_dct dct; 394 + struct mlx5_ib_gsi_qp gsi; 410 395 }; 411 396 struct mlx5_frag_buf buf; 412 397 ··· 710 693 unsigned long last_add; 711 694 }; 712 695 713 - struct mlx5_ib_gsi_qp; 714 - 715 696 struct mlx5_ib_port_resources { 716 - struct mlx5_ib_resources *devr; 717 697 struct mlx5_ib_gsi_qp *gsi; 718 698 struct work_struct pkey_change_work; 719 699 }; ··· 1133 1119 int mlx5_ib_create_ah(struct ib_ah *ah, struct rdma_ah_init_attr *init_attr, 1134 1120 struct ib_udata *udata); 1135 1121 int mlx5_ib_query_ah(struct ib_ah *ibah, struct rdma_ah_attr *ah_attr); 1136 - void mlx5_ib_destroy_ah(struct ib_ah *ah, u32 flags); 1122 + static inline int mlx5_ib_destroy_ah(struct ib_ah *ah, u32 flags) 1123 + { 1124 + return 0; 1125 + } 1137 1126 int mlx5_ib_create_srq(struct ib_srq *srq, struct ib_srq_init_attr *init_attr, 1138 1127 struct ib_udata *udata); 1139 1128 int mlx5_ib_modify_srq(struct ib_srq *ibsrq, struct ib_srq_attr *attr, 1140 1129 enum ib_srq_attr_mask attr_mask, struct ib_udata *udata); 1141 1130 int mlx5_ib_query_srq(struct ib_srq *ibsrq, struct ib_srq_attr *srq_attr); 1142 - void mlx5_ib_destroy_srq(struct ib_srq *srq, struct ib_udata *udata); 1131 + int mlx5_ib_destroy_srq(struct ib_srq *srq, struct ib_udata *udata); 1143 1132 int 
mlx5_ib_post_srq_recv(struct ib_srq *ibsrq, const struct ib_recv_wr *wr, 1144 1133 const struct ib_recv_wr **bad_wr); 1145 1134 int mlx5_ib_enable_lb(struct mlx5_ib_dev *dev, bool td, bool qp); ··· 1165 1148 size_t buflen, size_t *bc); 1166 1149 int mlx5_ib_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr, 1167 1150 struct ib_udata *udata); 1168 - void mlx5_ib_destroy_cq(struct ib_cq *cq, struct ib_udata *udata); 1151 + int mlx5_ib_destroy_cq(struct ib_cq *cq, struct ib_udata *udata); 1169 1152 int mlx5_ib_poll_cq(struct ib_cq *ibcq, int num_entries, struct ib_wc *wc); 1170 1153 int mlx5_ib_arm_cq(struct ib_cq *ibcq, enum ib_cq_notify_flags flags); 1171 1154 int mlx5_ib_modify_cq(struct ib_cq *cq, u16 cq_count, u16 cq_period); ··· 1180 1163 struct ib_sge *sg_list, 1181 1164 u32 num_sge, 1182 1165 struct uverbs_attr_bundle *attrs); 1183 - struct ib_mw *mlx5_ib_alloc_mw(struct ib_pd *pd, enum ib_mw_type type, 1184 - struct ib_udata *udata); 1166 + int mlx5_ib_alloc_mw(struct ib_mw *mw, struct ib_udata *udata); 1185 1167 int mlx5_ib_dealloc_mw(struct ib_mw *mw); 1186 1168 int mlx5_ib_update_xlt(struct mlx5_ib_mr *mr, u64 idx, int npages, 1187 1169 int page_shift, int flags); ··· 1209 1193 const struct ib_mad *in, struct ib_mad *out, 1210 1194 size_t *out_mad_size, u16 *out_mad_pkey_index); 1211 1195 int mlx5_ib_alloc_xrcd(struct ib_xrcd *xrcd, struct ib_udata *udata); 1212 - void mlx5_ib_dealloc_xrcd(struct ib_xrcd *xrcd, struct ib_udata *udata); 1196 + int mlx5_ib_dealloc_xrcd(struct ib_xrcd *xrcd, struct ib_udata *udata); 1213 1197 int mlx5_ib_get_buf_offset(u64 addr, int page_shift, u32 *offset); 1214 1198 int mlx5_query_ext_port_caps(struct mlx5_ib_dev *dev, u8 port); 1215 1199 int mlx5_query_mad_ifc_smp_attr_node_info(struct ib_device *ibdev, ··· 1245 1229 int mlx5_mr_cache_cleanup(struct mlx5_ib_dev *dev); 1246 1230 1247 1231 struct mlx5_ib_mr *mlx5_mr_cache_alloc(struct mlx5_ib_dev *dev, 1248 - unsigned int entry); 1232 + unsigned int entry, int 
access_flags); 1249 1233 void mlx5_mr_cache_free(struct mlx5_ib_dev *dev, struct mlx5_ib_mr *mr); 1250 1234 int mlx5_mr_cache_invalidate(struct mlx5_ib_mr *mr); 1251 1235 ··· 1254 1238 struct ib_wq *mlx5_ib_create_wq(struct ib_pd *pd, 1255 1239 struct ib_wq_init_attr *init_attr, 1256 1240 struct ib_udata *udata); 1257 - void mlx5_ib_destroy_wq(struct ib_wq *wq, struct ib_udata *udata); 1241 + int mlx5_ib_destroy_wq(struct ib_wq *wq, struct ib_udata *udata); 1258 1242 int mlx5_ib_modify_wq(struct ib_wq *wq, struct ib_wq_attr *wq_attr, 1259 1243 u32 wq_attr_mask, struct ib_udata *udata); 1260 - struct ib_rwq_ind_table *mlx5_ib_create_rwq_ind_table(struct ib_device *device, 1261 - struct ib_rwq_ind_table_init_attr *init_attr, 1262 - struct ib_udata *udata); 1244 + int mlx5_ib_create_rwq_ind_table(struct ib_rwq_ind_table *ib_rwq_ind_table, 1245 + struct ib_rwq_ind_table_init_attr *init_attr, 1246 + struct ib_udata *udata); 1263 1247 int mlx5_ib_destroy_rwq_ind_table(struct ib_rwq_ind_table *wq_ind_table); 1264 1248 struct ib_dm *mlx5_ib_alloc_dm(struct ib_device *ibdev, 1265 1249 struct ib_ucontext *context, ··· 1283 1267 int mlx5_ib_advise_mr_prefetch(struct ib_pd *pd, 1284 1268 enum ib_uverbs_advise_mr_advice advice, 1285 1269 u32 flags, struct ib_sge *sg_list, u32 num_sge); 1270 + int mlx5_ib_init_odp_mr(struct mlx5_ib_mr *mr, bool enable); 1286 1271 #else /* CONFIG_INFINIBAND_ON_DEMAND_PAGING */ 1287 1272 static inline void mlx5_ib_internal_fill_odp_caps(struct mlx5_ib_dev *dev) 1288 1273 { ··· 1302 1285 mlx5_ib_advise_mr_prefetch(struct ib_pd *pd, 1303 1286 enum ib_uverbs_advise_mr_advice advice, u32 flags, 1304 1287 struct ib_sge *sg_list, u32 num_sge) 1288 + { 1289 + return -EOPNOTSUPP; 1290 + } 1291 + static inline int mlx5_ib_init_odp_mr(struct mlx5_ib_mr *mr, bool enable) 1305 1292 { 1306 1293 return -EOPNOTSUPP; 1307 1294 } ··· 1339 1318 void mlx5_ib_init_cong_debugfs(struct mlx5_ib_dev *dev, u8 port_num); 1340 1319 1341 1320 /* GSI QP helper functions */ 
1342 - struct ib_qp *mlx5_ib_gsi_create_qp(struct ib_pd *pd, 1343 - struct ib_qp_init_attr *init_attr); 1344 - int mlx5_ib_gsi_destroy_qp(struct ib_qp *qp); 1321 + int mlx5_ib_create_gsi(struct ib_pd *pd, struct mlx5_ib_qp *mqp, 1322 + struct ib_qp_init_attr *attr); 1323 + int mlx5_ib_destroy_gsi(struct mlx5_ib_qp *mqp); 1345 1324 int mlx5_ib_gsi_modify_qp(struct ib_qp *qp, struct ib_qp_attr *attr, 1346 1325 int attr_mask); 1347 1326 int mlx5_ib_gsi_query_qp(struct ib_qp *qp, struct ib_qp_attr *qp_attr, ··· 1379 1358 1380 1359 static inline int is_qp1(enum ib_qp_type qp_type) 1381 1360 { 1382 - return qp_type == MLX5_IB_QPT_HW_GSI; 1361 + return qp_type == MLX5_IB_QPT_HW_GSI || qp_type == IB_QPT_GSI; 1383 1362 } 1384 1363 1385 1364 #define MLX5_MAX_UMR_SHIFT 16 ··· 1463 1442 struct mlx5_bfreg_info *bfregi, u32 bfregn, 1464 1443 bool dyn_bfreg); 1465 1444 1466 - static inline bool mlx5_ib_can_use_umr(struct mlx5_ib_dev *dev, 1467 - bool do_modify_atomic, int access_flags) 1445 + static inline bool mlx5_ib_can_load_pas_with_umr(struct mlx5_ib_dev *dev, 1446 + size_t length) 1468 1447 { 1448 + /* 1449 + * umr_check_mkey_mask() rejects MLX5_MKEY_MASK_PAGE_SIZE which is 1450 + * always set if MLX5_IB_SEND_UMR_UPDATE_TRANSLATION (aka 1451 + * MLX5_IB_UPD_XLT_ADDR and MLX5_IB_UPD_XLT_ENABLE) is set. Thus, a mkey 1452 + * can never be enabled without this capability. Simplify this weird 1453 + * quirky hardware by just saying it can't use PAS lists with UMR at 1454 + * all. 1455 + */ 1469 1456 if (MLX5_CAP_GEN(dev->mdev, umr_modify_entity_size_disabled)) 1470 1457 return false; 1471 1458 1472 - if (do_modify_atomic && 1459 + /* 1460 + * length is the size of the MR in bytes when mlx5_ib_update_xlt() is 1461 + * used. 
1462 + */ 1463 + if (!MLX5_CAP_GEN(dev->mdev, umr_extended_translation_offset) && 1464 + length >= MLX5_MAX_UMR_PAGES * PAGE_SIZE) 1465 + return false; 1466 + return true; 1467 + } 1468 + 1469 + /* 1470 + * true if an existing MR can be reconfigured to new access_flags using UMR. 1471 + * Older HW cannot use UMR to update certain elements of the MKC. See 1472 + * umr_check_mkey_mask(), get_umr_update_access_mask() and umr_check_mkey_mask() 1473 + */ 1474 + static inline bool mlx5_ib_can_reconfig_with_umr(struct mlx5_ib_dev *dev, 1475 + unsigned int current_access_flags, 1476 + unsigned int target_access_flags) 1477 + { 1478 + unsigned int diffs = current_access_flags ^ target_access_flags; 1479 + 1480 + if ((diffs & IB_ACCESS_REMOTE_ATOMIC) && 1473 1481 MLX5_CAP_GEN(dev->mdev, atomic) && 1474 1482 MLX5_CAP_GEN(dev->mdev, umr_modify_atomic_disabled)) 1475 1483 return false; 1476 1484 1477 - if (access_flags & IB_ACCESS_RELAXED_ORDERING && 1485 + if ((diffs & IB_ACCESS_RELAXED_ORDERING) && 1478 1486 MLX5_CAP_GEN(dev->mdev, relaxed_ordering_write) && 1479 1487 !MLX5_CAP_GEN(dev->mdev, relaxed_ordering_write_umr)) 1480 1488 return false; 1481 1489 1482 - if (access_flags & IB_ACCESS_RELAXED_ORDERING && 1483 - MLX5_CAP_GEN(dev->mdev, relaxed_ordering_read) && 1484 - !MLX5_CAP_GEN(dev->mdev, relaxed_ordering_read_umr)) 1490 + if ((diffs & IB_ACCESS_RELAXED_ORDERING) && 1491 + MLX5_CAP_GEN(dev->mdev, relaxed_ordering_read) && 1492 + !MLX5_CAP_GEN(dev->mdev, relaxed_ordering_read_umr)) 1485 1493 return false; 1486 1494 1487 1495 return true;
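Note on `mlx5_ib_can_reconfig_with_umr()` in the mlx5_ib.h hunks above: the old `mlx5_ib_can_use_umr()` tested the requested access flags directly, while the new helper XORs the current and target flags and checks only the bits that change. The sketch below reproduces that check in userspace C. The flag values and `struct demo_caps` are hypothetical stand-ins for `IB_ACCESS_*` and the `MLX5_CAP_GEN()` queries.

```c
#include <stdbool.h>

/* Hypothetical flag values; the driver uses IB_ACCESS_* from core headers. */
#define DEMO_ACCESS_REMOTE_ATOMIC	(1u << 0)
#define DEMO_ACCESS_RELAXED_ORDERING	(1u << 1)

/* Hypothetical snapshot of the capability bits the helper consults. */
struct demo_caps {
	bool atomic;
	bool umr_modify_atomic_disabled;
	bool relaxed_ordering_write;
	bool relaxed_ordering_write_umr;
	bool relaxed_ordering_read;
	bool relaxed_ordering_read_umr;
};

static bool demo_can_reconfig_with_umr(const struct demo_caps *caps,
				       unsigned int current_access_flags,
				       unsigned int target_access_flags)
{
	/* Only bits that differ between the two sets need a UMR update. */
	unsigned int diffs = current_access_flags ^ target_access_flags;

	if ((diffs & DEMO_ACCESS_REMOTE_ATOMIC) && caps->atomic &&
	    caps->umr_modify_atomic_disabled)
		return false;

	if ((diffs & DEMO_ACCESS_RELAXED_ORDERING) &&
	    caps->relaxed_ordering_write &&
	    !caps->relaxed_ordering_write_umr)
		return false;

	if ((diffs & DEMO_ACCESS_RELAXED_ORDERING) &&
	    caps->relaxed_ordering_read &&
	    !caps->relaxed_ordering_read_umr)
		return false;

	return true;
}
```

With this shape, an MR that already has remote atomic access can keep it across a reconfiguration even on hardware that cannot modify the atomic bit through UMR, because an unchanged bit never appears in `diffs`.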
+94 -95
drivers/infiniband/hw/mlx5/mr.c
··· 50 50 static void 51 51 create_mkey_callback(int status, struct mlx5_async_work *context); 52 52 53 + static void set_mkc_access_pd_addr_fields(void *mkc, int acc, u64 start_addr, 54 + struct ib_pd *pd) 55 + { 56 + struct mlx5_ib_dev *dev = to_mdev(pd->device); 57 + 58 + MLX5_SET(mkc, mkc, a, !!(acc & IB_ACCESS_REMOTE_ATOMIC)); 59 + MLX5_SET(mkc, mkc, rw, !!(acc & IB_ACCESS_REMOTE_WRITE)); 60 + MLX5_SET(mkc, mkc, rr, !!(acc & IB_ACCESS_REMOTE_READ)); 61 + MLX5_SET(mkc, mkc, lw, !!(acc & IB_ACCESS_LOCAL_WRITE)); 62 + MLX5_SET(mkc, mkc, lr, 1); 63 + 64 + if (MLX5_CAP_GEN(dev->mdev, relaxed_ordering_write)) 65 + MLX5_SET(mkc, mkc, relaxed_ordering_write, 66 + !!(acc & IB_ACCESS_RELAXED_ORDERING)); 67 + if (MLX5_CAP_GEN(dev->mdev, relaxed_ordering_read)) 68 + MLX5_SET(mkc, mkc, relaxed_ordering_read, 69 + !!(acc & IB_ACCESS_RELAXED_ORDERING)); 70 + 71 + MLX5_SET(mkc, mkc, pd, to_mpd(pd)->pdn); 72 + MLX5_SET(mkc, mkc, qpn, 0xffffff); 73 + MLX5_SET64(mkc, mkc, start_addr, start_addr); 74 + } 75 + 53 76 static void 54 77 assign_mkey_variant(struct mlx5_ib_dev *dev, struct mlx5_core_mkey *mkey, 55 78 u32 *in) ··· 123 100 return mlx5_core_destroy_mkey(dev->mdev, &mr->mmkey); 124 101 } 125 102 126 - static bool use_umr_mtt_update(struct mlx5_ib_mr *mr, u64 start, u64 length) 103 + static inline bool mlx5_ib_pas_fits_in_mr(struct mlx5_ib_mr *mr, u64 start, 104 + u64 length) 127 105 { 128 106 return ((u64)1 << mr->order) * MLX5_ADAPTER_PAGE_SIZE >= 129 107 length + (start & (MLX5_ADAPTER_PAGE_SIZE - 1)); ··· 176 152 mr->cache_ent = ent; 177 153 mr->dev = ent->dev; 178 154 155 + set_mkc_access_pd_addr_fields(mkc, 0, 0, ent->dev->umrc.pd); 179 156 MLX5_SET(mkc, mkc, free, 1); 180 157 MLX5_SET(mkc, mkc, umr_en, 1); 181 158 MLX5_SET(mkc, mkc, access_mode_1_0, ent->access_mode & 0x3); 182 159 MLX5_SET(mkc, mkc, access_mode_4_2, (ent->access_mode >> 2) & 0x7); 183 160 184 - MLX5_SET(mkc, mkc, qpn, 0xffffff); 185 161 MLX5_SET(mkc, mkc, translations_octword_size, ent->xlt); 186 
162 MLX5_SET(mkc, mkc, log_page_size, ent->page); 187 163 return mr; ··· 558 534 559 535 /* Allocate a special entry from the cache */ 560 536 struct mlx5_ib_mr *mlx5_mr_cache_alloc(struct mlx5_ib_dev *dev, 561 - unsigned int entry) 537 + unsigned int entry, int access_flags) 562 538 { 563 539 struct mlx5_mr_cache *cache = &dev->cache; 564 540 struct mlx5_cache_ent *ent; ··· 567 543 if (WARN_ON(entry <= MR_CACHE_LAST_STD_ENTRY || 568 544 entry >= ARRAY_SIZE(cache->ent))) 569 545 return ERR_PTR(-EINVAL); 546 + 547 + /* Matches access in alloc_cache_mr() */ 548 + if (!mlx5_ib_can_reconfig_with_umr(dev, 0, access_flags)) 549 + return ERR_PTR(-EOPNOTSUPP); 570 550 571 551 ent = &cache->ent[entry]; 572 552 spin_lock_irq(&ent->lock); ··· 586 558 queue_adjust_cache_locked(ent); 587 559 spin_unlock_irq(&ent->lock); 588 560 } 561 + mr->access_flags = access_flags; 589 562 return mr; 590 563 } 591 564 ··· 759 730 MLX5_IB_UMR_OCTOWORD; 760 731 ent->access_mode = MLX5_MKC_ACCESS_MODE_MTT; 761 732 if ((dev->mdev->profile->mask & MLX5_PROF_MASK_MR_CACHE) && 762 - !dev->is_rep && 763 - mlx5_core_is_pf(dev->mdev)) 733 + !dev->is_rep && mlx5_core_is_pf(dev->mdev) && 734 + mlx5_ib_can_load_pas_with_umr(dev, 0)) 764 735 ent->limit = dev->mdev->profile->mr_cache[i].limit; 765 736 else 766 737 ent->limit = 0; ··· 801 772 del_timer_sync(&dev->delay_timer); 802 773 803 774 return 0; 804 - } 805 - 806 - static void set_mkc_access_pd_addr_fields(void *mkc, int acc, u64 start_addr, 807 - struct ib_pd *pd) 808 - { 809 - struct mlx5_ib_dev *dev = to_mdev(pd->device); 810 - 811 - MLX5_SET(mkc, mkc, a, !!(acc & IB_ACCESS_REMOTE_ATOMIC)); 812 - MLX5_SET(mkc, mkc, rw, !!(acc & IB_ACCESS_REMOTE_WRITE)); 813 - MLX5_SET(mkc, mkc, rr, !!(acc & IB_ACCESS_REMOTE_READ)); 814 - MLX5_SET(mkc, mkc, lw, !!(acc & IB_ACCESS_LOCAL_WRITE)); 815 - MLX5_SET(mkc, mkc, lr, 1); 816 - 817 - if (MLX5_CAP_GEN(dev->mdev, relaxed_ordering_write)) 818 - MLX5_SET(mkc, mkc, relaxed_ordering_write, 819 - !!(acc & 
IB_ACCESS_RELAXED_ORDERING)); 820 - if (MLX5_CAP_GEN(dev->mdev, relaxed_ordering_read)) 821 - MLX5_SET(mkc, mkc, relaxed_ordering_read, 822 - !!(acc & IB_ACCESS_RELAXED_ORDERING)); 823 - 824 - MLX5_SET(mkc, mkc, pd, to_mpd(pd)->pdn); 825 - MLX5_SET(mkc, mkc, qpn, 0xffffff); 826 - MLX5_SET64(mkc, mkc, start_addr, start_addr); 827 775 } 828 776 829 777 struct ib_mr *mlx5_ib_get_dma_mr(struct ib_pd *pd, int acc) ··· 985 979 986 980 if (!ent) 987 981 return ERR_PTR(-E2BIG); 982 + 983 + /* Matches access in alloc_cache_mr() */ 984 + if (!mlx5_ib_can_reconfig_with_umr(dev, 0, access_flags)) 985 + return ERR_PTR(-EOPNOTSUPP); 986 + 988 987 mr = get_cache_mr(ent); 989 988 if (!mr) { 990 989 mr = create_cache_mr(ent); ··· 1192 1181 goto err_1; 1193 1182 } 1194 1183 pas = (__be64 *)MLX5_ADDR_OF(create_mkey_in, in, klm_pas_mtt); 1195 - if (populate && !(access_flags & IB_ACCESS_ON_DEMAND)) 1184 + if (populate) { 1185 + if (WARN_ON(access_flags & IB_ACCESS_ON_DEMAND)) { 1186 + err = -EINVAL; 1187 + goto err_2; 1188 + } 1196 1189 mlx5_ib_populate_pas(dev, umem, page_shift, pas, 1197 1190 pg_cap ? MLX5_IB_MTT_PRESENT : 0); 1191 + } 1198 1192 1199 1193 /* The pg_access bit allows setting the access flags 1200 1194 * in the page list submitted with the command. */ 1201 1195 MLX5_SET(create_mkey_in, in, pg_access, !!(pg_cap)); 1202 1196 1203 1197 mkc = MLX5_ADDR_OF(create_mkey_in, in, memory_key_mkey_entry); 1198 + set_mkc_access_pd_addr_fields(mkc, access_flags, virt_addr, 1199 + populate ? 
pd : dev->umrc.pd); 1204 1200 MLX5_SET(mkc, mkc, free, !populate); 1205 1201 MLX5_SET(mkc, mkc, access_mode_1_0, MLX5_MKC_ACCESS_MODE_MTT); 1206 - if (MLX5_CAP_GEN(dev->mdev, relaxed_ordering_write)) 1207 - MLX5_SET(mkc, mkc, relaxed_ordering_write, 1208 - !!(access_flags & IB_ACCESS_RELAXED_ORDERING)); 1209 - if (MLX5_CAP_GEN(dev->mdev, relaxed_ordering_read)) 1210 - MLX5_SET(mkc, mkc, relaxed_ordering_read, 1211 - !!(access_flags & IB_ACCESS_RELAXED_ORDERING)); 1212 - MLX5_SET(mkc, mkc, a, !!(access_flags & IB_ACCESS_REMOTE_ATOMIC)); 1213 - MLX5_SET(mkc, mkc, rw, !!(access_flags & IB_ACCESS_REMOTE_WRITE)); 1214 - MLX5_SET(mkc, mkc, rr, !!(access_flags & IB_ACCESS_REMOTE_READ)); 1215 - MLX5_SET(mkc, mkc, lw, !!(access_flags & IB_ACCESS_LOCAL_WRITE)); 1216 - MLX5_SET(mkc, mkc, lr, 1); 1217 1202 MLX5_SET(mkc, mkc, umr_en, 1); 1218 1203 1219 - MLX5_SET64(mkc, mkc, start_addr, virt_addr); 1220 1204 MLX5_SET64(mkc, mkc, len, length); 1221 - MLX5_SET(mkc, mkc, pd, to_mpd(pd)->pdn); 1222 1205 MLX5_SET(mkc, mkc, bsf_octword_size, 0); 1223 1206 MLX5_SET(mkc, mkc, translations_octword_size, 1224 1207 get_octo_len(virt_addr, length, page_shift)); 1225 1208 MLX5_SET(mkc, mkc, log_page_size, page_shift); 1226 - MLX5_SET(mkc, mkc, qpn, 0xffffff); 1227 1209 if (populate) { 1228 1210 MLX5_SET(create_mkey_in, in, translations_octword_actual_size, 1229 1211 get_octo_len(virt_addr, length, page_shift)); ··· 1312 1308 struct uverbs_attr_bundle *attrs) 1313 1309 { 1314 1310 if (advice != IB_UVERBS_ADVISE_MR_ADVICE_PREFETCH && 1315 - advice != IB_UVERBS_ADVISE_MR_ADVICE_PREFETCH_WRITE) 1311 + advice != IB_UVERBS_ADVISE_MR_ADVICE_PREFETCH_WRITE && 1312 + advice != IB_UVERBS_ADVISE_MR_ADVICE_PREFETCH_NO_FAULT) 1316 1313 return -EOPNOTSUPP; 1317 1314 1318 1315 return mlx5_ib_advise_mr_prefetch(pd, advice, flags, ··· 1358 1353 { 1359 1354 struct mlx5_ib_dev *dev = to_mdev(pd->device); 1360 1355 struct mlx5_ib_mr *mr = NULL; 1361 - bool use_umr; 1356 + bool xlt_with_umr; 1362 1357 struct 
ib_umem *umem; 1363 1358 int page_shift; 1364 1359 int npages; ··· 1371 1366 1372 1367 mlx5_ib_dbg(dev, "start 0x%llx, virt_addr 0x%llx, length 0x%llx, access_flags 0x%x\n", 1373 1368 start, virt_addr, length, access_flags); 1369 + 1370 + xlt_with_umr = mlx5_ib_can_load_pas_with_umr(dev, length); 1371 + /* ODP requires xlt update via umr to work. */ 1372 + if (!xlt_with_umr && (access_flags & IB_ACCESS_ON_DEMAND)) 1373 + return ERR_PTR(-EINVAL); 1374 1374 1375 1375 if (IS_ENABLED(CONFIG_INFINIBAND_ON_DEMAND_PAGING) && !start && 1376 1376 length == U64_MAX) { ··· 1397 1387 if (err < 0) 1398 1388 return ERR_PTR(err); 1399 1389 1400 - use_umr = mlx5_ib_can_use_umr(dev, true, access_flags); 1401 - 1402 - if (order <= mr_cache_max_order(dev) && use_umr) { 1390 + if (xlt_with_umr) { 1403 1391 mr = alloc_mr_from_cache(pd, umem, virt_addr, length, ncont, 1404 1392 page_shift, order, access_flags); 1405 - if (PTR_ERR(mr) == -EAGAIN) { 1406 - mlx5_ib_dbg(dev, "cache empty for order %d\n", order); 1393 + if (IS_ERR(mr)) 1407 1394 mr = NULL; 1408 - } 1409 - } else if (!MLX5_CAP_GEN(dev->mdev, umr_extended_translation_offset)) { 1410 - if (access_flags & IB_ACCESS_ON_DEMAND) { 1411 - err = -EINVAL; 1412 - pr_err("Got MR registration for ODP MR > 512MB, not supported for Connect-IB\n"); 1413 - goto error; 1414 - } 1415 - use_umr = false; 1416 1395 } 1417 1396 1418 1397 if (!mr) { 1419 1398 mutex_lock(&dev->slow_path_mutex); 1420 1399 mr = reg_create(NULL, pd, virt_addr, length, umem, ncont, 1421 - page_shift, access_flags, !use_umr); 1400 + page_shift, access_flags, !xlt_with_umr); 1422 1401 mutex_unlock(&dev->slow_path_mutex); 1423 1402 } 1424 1403 ··· 1421 1422 mr->umem = umem; 1422 1423 set_mr_fields(dev, mr, npages, length, access_flags); 1423 1424 1424 - if (use_umr) { 1425 + if (xlt_with_umr && !(access_flags & IB_ACCESS_ON_DEMAND)) { 1426 + /* 1427 + * If the MR was created with reg_create then it will be 1428 + * configured properly but left disabled. 
It is safe to go ahead 1429 + * and configure it again via UMR while enabling it. 1430 + */ 1425 1431 int update_xlt_flags = MLX5_IB_UPD_XLT_ENABLE; 1426 - 1427 - if (access_flags & IB_ACCESS_ON_DEMAND) 1428 - update_xlt_flags |= MLX5_IB_UPD_XLT_ZAP; 1429 1432 1430 1433 err = mlx5_ib_update_xlt(mr, 0, ncont, page_shift, 1431 1434 update_xlt_flags); 1432 - 1433 1435 if (err) { 1434 1436 dereg_mr(dev, mr); 1435 1437 return ERR_PTR(err); ··· 1444 1444 err = xa_err(xa_store(&dev->odp_mkeys, 1445 1445 mlx5_base_mkey(mr->mmkey.key), &mr->mmkey, 1446 1446 GFP_KERNEL)); 1447 + if (err) { 1448 + dereg_mr(dev, mr); 1449 + return ERR_PTR(err); 1450 + } 1451 + 1452 + err = mlx5_ib_init_odp_mr(mr, xlt_with_umr); 1447 1453 if (err) { 1448 1454 dereg_mr(dev, mr); 1449 1455 return ERR_PTR(err); ··· 1561 1555 goto err; 1562 1556 } 1563 1557 1564 - if (!mlx5_ib_can_use_umr(dev, true, access_flags) || 1565 - (flags & IB_MR_REREG_TRANS && !use_umr_mtt_update(mr, addr, len))) { 1558 + if (!mlx5_ib_can_reconfig_with_umr(dev, mr->access_flags, 1559 + access_flags) || 1560 + !mlx5_ib_can_load_pas_with_umr(dev, len) || 1561 + (flags & IB_MR_REREG_TRANS && 1562 + !mlx5_ib_pas_fits_in_mr(mr, addr, len))) { 1566 1563 /* 1567 1564 * UMR can't be used - MKey needs to be replaced. 1568 1565 */ ··· 1736 1727 1737 1728 mkc = MLX5_ADDR_OF(create_mkey_in, in, memory_key_mkey_entry); 1738 1729 1730 + /* This is only used from the kernel, so setting the PD is OK. 
*/
+	set_mkc_access_pd_addr_fields(mkc, 0, 0, pd);
 	MLX5_SET(mkc, mkc, free, 1);
-	MLX5_SET(mkc, mkc, qpn, 0xffffff);
-	MLX5_SET(mkc, mkc, pd, to_mpd(pd)->pdn);
 	MLX5_SET(mkc, mkc, translations_octword_size, ndescs);
 	MLX5_SET(mkc, mkc, access_mode_1_0, access_mode & 0x3);
 	MLX5_SET(mkc, mkc, access_mode_4_2, (access_mode >> 2) & 0x7);
@@ -1973,12 +1982,11 @@
 				  max_num_meta_sg);
 }
 
-struct ib_mw *mlx5_ib_alloc_mw(struct ib_pd *pd, enum ib_mw_type type,
-			       struct ib_udata *udata)
+int mlx5_ib_alloc_mw(struct ib_mw *ibmw, struct ib_udata *udata)
 {
-	struct mlx5_ib_dev *dev = to_mdev(pd->device);
+	struct mlx5_ib_dev *dev = to_mdev(ibmw->device);
 	int inlen = MLX5_ST_SZ_BYTES(create_mkey_in);
-	struct mlx5_ib_mw *mw = NULL;
+	struct mlx5_ib_mw *mw = to_mmw(ibmw);
 	u32 *in = NULL;
 	void *mkc;
 	int ndescs;
@@ -1991,21 +1999,20 @@
 
 	err = ib_copy_from_udata(&req, udata, min(udata->inlen, sizeof(req)));
 	if (err)
-		return ERR_PTR(err);
+		return err;
 
 	if (req.comp_mask || req.reserved1 || req.reserved2)
-		return ERR_PTR(-EOPNOTSUPP);
+		return -EOPNOTSUPP;
 
 	if (udata->inlen > sizeof(req) &&
 	    !ib_is_udata_cleared(udata, sizeof(req),
 				 udata->inlen - sizeof(req)))
-		return ERR_PTR(-EOPNOTSUPP);
+		return -EOPNOTSUPP;
 
 	ndescs = req.num_klms ? roundup(req.num_klms, 4) : roundup(1, 4);
 
-	mw = kzalloc(sizeof(*mw), GFP_KERNEL);
 	in = kzalloc(inlen, GFP_KERNEL);
-	if (!mw || !in) {
+	if (!in) {
 		err = -ENOMEM;
 		goto free;
 	}
@@ -2014,11 +2021,11 @@
 
 	MLX5_SET(mkc, mkc, free, 1);
 	MLX5_SET(mkc, mkc, translations_octword_size, ndescs);
-	MLX5_SET(mkc, mkc, pd, to_mpd(pd)->pdn);
+	MLX5_SET(mkc, mkc, pd, to_mpd(ibmw->pd)->pdn);
 	MLX5_SET(mkc, mkc, umr_en, 1);
 	MLX5_SET(mkc, mkc, lr, 1);
 	MLX5_SET(mkc, mkc, access_mode_1_0, MLX5_MKC_ACCESS_MODE_KLMS);
-	MLX5_SET(mkc, mkc, en_rinval, !!((type == IB_MW_TYPE_2)));
+	MLX5_SET(mkc, mkc, en_rinval, !!((ibmw->type == IB_MW_TYPE_2)));
 	MLX5_SET(mkc, mkc, qpn, 0xffffff);
 
 	err = mlx5_ib_create_mkey(dev, &mw->mmkey, in, inlen);
@@ -2026,17 +2033,15 @@
 		goto free;
 
 	mw->mmkey.type = MLX5_MKEY_MW;
-	mw->ibmw.rkey = mw->mmkey.key;
+	ibmw->rkey = mw->mmkey.key;
 	mw->ndescs = ndescs;
 
-	resp.response_length = min(offsetof(typeof(resp), response_length) +
-				   sizeof(resp.response_length), udata->outlen);
+	resp.response_length =
+		min(offsetofend(typeof(resp), response_length), udata->outlen);
 	if (resp.response_length) {
 		err = ib_copy_to_udata(udata, &resp, resp.response_length);
-		if (err) {
-			mlx5_core_destroy_mkey(dev->mdev, &mw->mmkey);
-			goto free;
-		}
+		if (err)
+			goto free_mkey;
 	}
 
 	if (IS_ENABLED(CONFIG_INFINIBAND_ON_DEMAND_PAGING)) {
@@ -2048,21 +2053,19 @@
 	}
 
 	kfree(in);
-	return &mw->ibmw;
+	return 0;
 
 free_mkey:
 	mlx5_core_destroy_mkey(dev->mdev, &mw->mmkey);
 free:
-	kfree(mw);
 	kfree(in);
-	return ERR_PTR(err);
+	return err;
 }
 
 int mlx5_ib_dealloc_mw(struct ib_mw *mw)
 {
 	struct mlx5_ib_dev *dev = to_mdev(mw->device);
 	struct mlx5_ib_mw *mmw = to_mmw(mw);
-	int err;
 
 	if (IS_ENABLED(CONFIG_INFINIBAND_ON_DEMAND_PAGING)) {
 		xa_erase(&dev->odp_mkeys, mlx5_base_mkey(mmw->mmkey.key));
@@ -2073,11 +2076,7 @@
 		synchronize_srcu(&dev->odp_srcu);
 	}
 
-	err = mlx5_core_destroy_mkey(dev->mdev, &mmw->mmkey);
-	if (err)
-		return err;
-	kfree(mmw);
-	return 0;
+	return mlx5_core_destroy_mkey(dev->mdev, &mmw->mmkey);
 }
 
 int mlx5_ib_check_mr_status(struct ib_mr *ibmr, u32 check_mask,
+34 -22
drivers/infiniband/hw/mlx5/odp.c
@@ -382,7 +382,7 @@
 	memset(caps, 0, sizeof(*caps));
 
 	if (!MLX5_CAP_GEN(dev->mdev, pg) ||
-	    !mlx5_ib_can_use_umr(dev, true, 0))
+	    !mlx5_ib_can_load_pas_with_umr(dev, 0))
 		return;
 
 	caps->general_caps = IB_ODP_SUPPORT;
@@ -476,12 +476,12 @@
 	if (IS_ERR(odp))
 		return ERR_CAST(odp);
 
-	ret = mr = mlx5_mr_cache_alloc(imr->dev, MLX5_IMR_MTT_CACHE_ENTRY);
+	ret = mr = mlx5_mr_cache_alloc(imr->dev, MLX5_IMR_MTT_CACHE_ENTRY,
+				       imr->access_flags);
 	if (IS_ERR(mr))
 		goto out_umem;
 
 	mr->ibmr.pd = imr->ibmr.pd;
-	mr->access_flags = imr->access_flags;
 	mr->umem = &odp->umem;
 	mr->ibmr.lkey = mr->mmkey.key;
 	mr->ibmr.rkey = mr->mmkey.key;
@@ -540,14 +540,13 @@
 	if (IS_ERR(umem_odp))
 		return ERR_CAST(umem_odp);
 
-	imr = mlx5_mr_cache_alloc(dev, MLX5_IMR_KSM_CACHE_ENTRY);
+	imr = mlx5_mr_cache_alloc(dev, MLX5_IMR_KSM_CACHE_ENTRY, access_flags);
 	if (IS_ERR(imr)) {
 		err = PTR_ERR(imr);
 		goto out_umem;
 	}
 
 	imr->ibmr.pd = &pd->ibpd;
-	imr->access_flags = access_flags;
 	imr->mmkey.iova = 0;
 	imr->umem = &umem_odp->umem;
 	imr->ibmr.lkey = imr->mmkey.key;
@@ -666,15 +665,21 @@
 }
 
 #define MLX5_PF_FLAGS_DOWNGRADE BIT(1)
+#define MLX5_PF_FLAGS_SNAPSHOT BIT(2)
+#define MLX5_PF_FLAGS_ENABLE BIT(3)
 static int pagefault_real_mr(struct mlx5_ib_mr *mr, struct ib_umem_odp *odp,
 			     u64 user_va, size_t bcnt, u32 *bytes_mapped,
 			     u32 flags)
 {
 	int page_shift, ret, np;
 	bool downgrade = flags & MLX5_PF_FLAGS_DOWNGRADE;
-	unsigned long current_seq;
 	u64 access_mask;
 	u64 start_idx;
+	bool fault = !(flags & MLX5_PF_FLAGS_SNAPSHOT);
+	u32 xlt_flags = MLX5_IB_UPD_XLT_ATOMIC;
+
+	if (flags & MLX5_PF_FLAGS_ENABLE)
+		xlt_flags |= MLX5_IB_UPD_XLT_ENABLE;
 
 	page_shift = odp->page_shift;
 	start_idx = (user_va - ib_umem_start(odp)) >> page_shift;
@@ -683,25 +688,15 @@
 	if (odp->umem.writable && !downgrade)
 		access_mask |= ODP_WRITE_ALLOWED_BIT;
 
-	current_seq = mmu_interval_read_begin(&odp->notifier);
-
-	np = ib_umem_odp_map_dma_pages(odp, user_va, bcnt, access_mask,
-				       current_seq);
+	np = ib_umem_odp_map_dma_and_lock(odp, user_va, bcnt, access_mask, fault);
 	if (np < 0)
 		return np;
 
-	mutex_lock(&odp->umem_mutex);
-	if (!mmu_interval_read_retry(&odp->notifier, current_seq)) {
-		/*
-		 * No need to check whether the MTTs really belong to
-		 * this MR, since ib_umem_odp_map_dma_pages already
-		 * checks this.
-		 */
-		ret = mlx5_ib_update_xlt(mr, start_idx, np,
-					 page_shift, MLX5_IB_UPD_XLT_ATOMIC);
-	} else {
-		ret = -EAGAIN;
-	}
+	/*
+	 * No need to check whether the MTTs really belong to this MR, since
+	 * ib_umem_odp_map_dma_and_lock already checks this.
+	 */
+	ret = mlx5_ib_update_xlt(mr, start_idx, np, page_shift, xlt_flags);
 	mutex_unlock(&odp->umem_mutex);
 
 	if (ret < 0) {
@@ -834,6 +829,20 @@
 	}
 	return pagefault_implicit_mr(mr, odp, io_virt, bcnt, bytes_mapped,
 				     flags);
+}
+
+int mlx5_ib_init_odp_mr(struct mlx5_ib_mr *mr, bool enable)
+{
+	u32 flags = MLX5_PF_FLAGS_SNAPSHOT;
+	int ret;
+
+	if (enable)
+		flags |= MLX5_PF_FLAGS_ENABLE;
+
+	ret = pagefault_real_mr(mr, to_ib_umem_odp(mr->umem),
+				mr->umem->address, mr->umem->length, NULL,
+				flags);
+	return ret >= 0 ? 0 : ret;
 }
 
 struct pf_frame {
@@ -1861,6 +1870,9 @@
 
 	if (advice == IB_UVERBS_ADVISE_MR_ADVICE_PREFETCH)
 		pf_flags |= MLX5_PF_FLAGS_DOWNGRADE;
+
+	if (advice == IB_UVERBS_ADVISE_MR_ADVICE_PREFETCH_NO_FAULT)
+		pf_flags |= MLX5_PF_FLAGS_SNAPSHOT;
 
 	if (flags & IB_UVERBS_ADVISE_MR_FLAG_FLUSH)
 		return mlx5_ib_prefetch_sg_list(pd, advice, pf_flags, sg_list,
+100 -82
drivers/infiniband/hw/mlx5/qp.c
@@ -1477,7 +1477,8 @@
 			resp->comp_mask |= MLX5_IB_CREATE_QP_RESP_MASK_RQN;
 			resp->tirn = rq->tirn;
 			resp->comp_mask |= MLX5_IB_CREATE_QP_RESP_MASK_TIRN;
-			if (MLX5_CAP_FLOWTABLE_NIC_RX(dev->mdev, sw_owner)) {
+			if (MLX5_CAP_FLOWTABLE_NIC_RX(dev->mdev, sw_owner) ||
+			    MLX5_CAP_FLOWTABLE_NIC_RX(dev->mdev, sw_owner_v2)) {
 				resp->tir_icm_addr = MLX5_GET(
 					create_tir_out, out, icm_address_31_0);
 				resp->tir_icm_addr |=
@@ -1739,7 +1740,8 @@
 	if (mucontext->devx_uid) {
 		params->resp.comp_mask |= MLX5_IB_CREATE_QP_RESP_MASK_TIRN;
 		params->resp.tirn = qp->rss_qp.tirn;
-		if (MLX5_CAP_FLOWTABLE_NIC_RX(dev->mdev, sw_owner)) {
+		if (MLX5_CAP_FLOWTABLE_NIC_RX(dev->mdev, sw_owner) ||
+		    MLX5_CAP_FLOWTABLE_NIC_RX(dev->mdev, sw_owner_v2)) {
 			params->resp.tir_icm_addr =
 				MLX5_GET(create_tir_out, out, icm_address_31_0);
 			params->resp.tir_icm_addr |=
@@ -2409,6 +2411,9 @@
 	u32 uidx = params->uidx;
 	void *dctc;
 
+	if (mlx5_lag_is_active(dev->mdev) && !MLX5_CAP_GEN(dev->mdev, lag_dct))
+		return -EOPNOTSUPP;
+
 	qp->dct.in = kzalloc(MLX5_ST_SZ_BYTES(create_dct_in), GFP_KERNEL);
 	if (!qp->dct.in)
 		return -ENOMEM;
@@ -2504,18 +2509,6 @@
 			    "Wrong QP type %d for the RWQ indirect table\n",
 			    attr->qp_type);
 		return -EINVAL;
-	}
-
-	switch (attr->qp_type) {
-	case IB_QPT_SMI:
-	case MLX5_IB_QPT_HW_GSI:
-	case MLX5_IB_QPT_REG_UMR:
-	case IB_QPT_GSI:
-		mlx5_ib_dbg(dev, "Kernel doesn't support QP type %d\n",
-			    attr->qp_type);
-		return -EINVAL;
-	default:
-		break;
 	}
 
 	/*
@@ -2780,20 +2773,22 @@
 		goto out;
 	}
 
-	if (qp->type == MLX5_IB_QPT_DCT) {
+	switch (qp->type) {
+	case MLX5_IB_QPT_DCT:
 		err = create_dct(dev, pd, qp, params);
-		goto out;
-	}
-
-	if (qp->type == IB_QPT_XRC_TGT) {
+		break;
+	case IB_QPT_XRC_TGT:
 		err = create_xrc_tgt_qp(dev, qp, params);
-		goto out;
+		break;
+	case IB_QPT_GSI:
+		err = mlx5_ib_create_gsi(pd, qp, params->attr);
+		break;
+	default:
+		if (params->udata)
+			err = create_user_qp(dev, pd, qp, params);
+		else
+			err = create_kernel_qp(dev, pd, qp, params);
 	}
-
-	if (params->udata)
-		err = create_user_qp(dev, pd, qp, params);
-	else
-		err = create_kernel_qp(dev, pd, qp, params);
 
 out:
 	if (err) {
@@ -2934,9 +2929,6 @@
 	if (err)
 		return ERR_PTR(err);
 
-	if (attr->qp_type == IB_QPT_GSI)
-		return mlx5_ib_gsi_create_qp(pd, attr);
-
 	params.udata = udata;
 	params.uidx = MLX5_IB_DEFAULT_UIDX;
 	params.attr = attr;
@@ -3005,9 +2997,14 @@
 	return &qp->ibqp;
 
 destroy_qp:
-	if (qp->type == MLX5_IB_QPT_DCT) {
+	switch (qp->type) {
+	case MLX5_IB_QPT_DCT:
 		mlx5_ib_destroy_dct(qp);
-	} else {
+		break;
+	case IB_QPT_GSI:
+		mlx5_ib_destroy_gsi(qp);
+		break;
+	default:
 		/*
 		 * These lines below are temp solution till QP allocation
 		 * will be moved to be under IB/core responsiblity.
@@ -3032,7 +3029,7 @@
 	struct mlx5_ib_qp *mqp = to_mqp(qp);
 
 	if (unlikely(qp->qp_type == IB_QPT_GSI))
-		return mlx5_ib_gsi_destroy_qp(qp);
+		return mlx5_ib_destroy_gsi(mqp);
 
 	if (mqp->type == MLX5_IB_QPT_DCT)
 		return mlx5_ib_destroy_dct(mqp);
@@ -3088,20 +3085,44 @@
 	MLX5_PATH_FLAG_COUNTER = 1 << 2,
 };
 
+static int ib_to_mlx5_rate_map(u8 rate)
+{
+	switch (rate) {
+	case IB_RATE_PORT_CURRENT:
+		return 0;
+	case IB_RATE_56_GBPS:
+		return 1;
+	case IB_RATE_25_GBPS:
+		return 2;
+	case IB_RATE_100_GBPS:
+		return 3;
+	case IB_RATE_200_GBPS:
+		return 4;
+	case IB_RATE_50_GBPS:
+		return 5;
+	default:
+		return rate + MLX5_STAT_RATE_OFFSET;
+	};
+
+	return 0;
+}
+
 static int ib_rate_to_mlx5(struct mlx5_ib_dev *dev, u8 rate)
 {
+	u32 stat_rate_support;
+
 	if (rate == IB_RATE_PORT_CURRENT)
 		return 0;
 
 	if (rate < IB_RATE_2_5_GBPS || rate > IB_RATE_600_GBPS)
 		return -EINVAL;
 
+	stat_rate_support = MLX5_CAP_GEN(dev->mdev, stat_rate_support);
 	while (rate != IB_RATE_PORT_CURRENT &&
-	       !(1 << (rate + MLX5_STAT_RATE_OFFSET) &
-		 MLX5_CAP_GEN(dev->mdev, stat_rate_support)))
+	       !(1 << ib_to_mlx5_rate_map(rate) & stat_rate_support))
 		--rate;
 
-	return rate ? rate + MLX5_STAT_RATE_OFFSET : rate;
+	return ib_to_mlx5_rate_map(rate);
 }
 
 static int modify_raw_packet_eth_prio(struct mlx5_core_dev *dev,
@@ -3643,14 +3664,12 @@
 		MLX5_MAX_PORTS + 1;
 }
 
-static bool qp_supports_affinity(struct ib_qp *qp)
+static bool qp_supports_affinity(struct mlx5_ib_qp *qp)
 {
-	if ((qp->qp_type == IB_QPT_RC) ||
-	    (qp->qp_type == IB_QPT_UD) ||
-	    (qp->qp_type == IB_QPT_UC) ||
-	    (qp->qp_type == IB_QPT_RAW_PACKET) ||
-	    (qp->qp_type == IB_QPT_XRC_INI) ||
-	    (qp->qp_type == IB_QPT_XRC_TGT))
+	if ((qp->type == IB_QPT_RC) || (qp->type == IB_QPT_UD) ||
+	    (qp->type == IB_QPT_UC) || (qp->type == IB_QPT_RAW_PACKET) ||
+	    (qp->type == IB_QPT_XRC_INI) || (qp->type == IB_QPT_XRC_TGT) ||
+	    (qp->type == MLX5_IB_QPT_DCI))
 		return true;
 	return false;
 }
@@ -3668,7 +3687,7 @@
 	unsigned int tx_affinity;
 
 	if (!(mlx5_ib_lag_should_assign_affinity(dev) &&
-	      qp_supports_affinity(qp)))
+	      qp_supports_affinity(mqp)))
 		return 0;
 
 	if (mqp->flags & MLX5_IB_QP_CREATE_SQPN_QP1)
@@ -4161,7 +4180,11 @@
 			MLX5_SET(dctc, dctc, rae, 1);
 		}
 		MLX5_SET(dctc, dctc, pkey_index, attr->pkey_index);
-		MLX5_SET(dctc, dctc, port, attr->port_num);
+		if (mlx5_lag_is_active(dev->mdev))
+			MLX5_SET(dctc, dctc, port,
+				 get_tx_affinity_rr(dev, udata));
+		else
+			MLX5_SET(dctc, dctc, port, attr->port_num);
 
 		set_id = mlx5_ib_get_counters_id(dev, attr->port_num - 1);
 		MLX5_SET(dctc, dctc, counter_set_id, set_id);
@@ -4716,12 +4739,12 @@
 	return mlx5_cmd_xrcd_alloc(dev->mdev, &xrcd->xrcdn, 0);
 }
 
-void mlx5_ib_dealloc_xrcd(struct ib_xrcd *xrcd, struct ib_udata *udata)
+int mlx5_ib_dealloc_xrcd(struct ib_xrcd *xrcd, struct ib_udata *udata)
 {
 	struct mlx5_ib_dev *dev = to_mdev(xrcd->device);
 	u32 xrcdn = to_mxrcd(xrcd)->xrcdn;
 
-	mlx5_cmd_xrcd_dealloc(dev->mdev, xrcdn, 0);
+	return mlx5_cmd_xrcd_dealloc(dev->mdev, xrcdn, 0);
 }
 
 static void mlx5_ib_wq_event(struct mlx5_core_qp *core_qp, int type)
@@ -4921,8 +4944,8 @@
 	int err;
 	size_t required_cmd_sz;
 
-	required_cmd_sz = offsetof(typeof(ucmd), single_stride_log_num_of_bytes)
-		+ sizeof(ucmd.single_stride_log_num_of_bytes);
+	required_cmd_sz = offsetofend(struct mlx5_ib_create_wq,
+				      single_stride_log_num_of_bytes);
 	if (udata->inlen < required_cmd_sz) {
 		mlx5_ib_dbg(dev, "invalid inlen\n");
 		return -EINVAL;
@@ -5006,7 +5029,7 @@
 	if (!udata)
 		return ERR_PTR(-ENOSYS);
 
-	min_resp_len = offsetof(typeof(resp), reserved) + sizeof(resp.reserved);
+	min_resp_len = offsetofend(struct mlx5_ib_create_wq_resp, reserved);
 	if (udata->outlen && udata->outlen < min_resp_len)
 		return ERR_PTR(-EINVAL);
 
@@ -5036,8 +5059,8 @@
 	rwq->ibwq.wq_num = rwq->core_qp.qpn;
 	rwq->ibwq.state = IB_WQS_RESET;
 	if (udata->outlen) {
-		resp.response_length = offsetof(typeof(resp), response_length) +
-				sizeof(resp.response_length);
+		resp.response_length = offsetofend(
+			struct mlx5_ib_create_wq_resp, response_length);
 		err = ib_copy_to_udata(udata, &resp, resp.response_length);
 		if (err)
 			goto err_copy;
@@ -5056,22 +5079,27 @@
 	return ERR_PTR(err);
 }
 
-void mlx5_ib_destroy_wq(struct ib_wq *wq, struct ib_udata *udata)
+int mlx5_ib_destroy_wq(struct ib_wq *wq, struct ib_udata *udata)
 {
 	struct mlx5_ib_dev *dev = to_mdev(wq->device);
 	struct mlx5_ib_rwq *rwq = to_mrwq(wq);
+	int ret;
 
-	mlx5_core_destroy_rq_tracked(dev, &rwq->core_qp);
+	ret = mlx5_core_destroy_rq_tracked(dev, &rwq->core_qp);
+	if (ret)
+		return ret;
 	destroy_user_rq(dev, wq->pd, rwq, udata);
 	kfree(rwq);
+	return 0;
 }
 
-struct ib_rwq_ind_table *mlx5_ib_create_rwq_ind_table(struct ib_device *device,
-						      struct ib_rwq_ind_table_init_attr *init_attr,
-						      struct ib_udata *udata)
+int mlx5_ib_create_rwq_ind_table(struct ib_rwq_ind_table *ib_rwq_ind_table,
+				 struct ib_rwq_ind_table_init_attr *init_attr,
+				 struct ib_udata *udata)
 {
-	struct mlx5_ib_dev *dev = to_mdev(device);
-	struct mlx5_ib_rwq_ind_table *rwq_ind_tbl;
+	struct mlx5_ib_rwq_ind_table *rwq_ind_tbl =
+		to_mrwq_ind_table(ib_rwq_ind_table);
+	struct mlx5_ib_dev *dev = to_mdev(ib_rwq_ind_table->device);
 	int sz = 1 << init_attr->log_ind_tbl_size;
 	struct mlx5_ib_create_rwq_ind_tbl_resp resp = {};
 	size_t min_resp_len;
@@ -5084,30 +5112,25 @@
 	if (udata->inlen > 0 &&
 	    !ib_is_udata_cleared(udata, 0,
 				 udata->inlen))
-		return ERR_PTR(-EOPNOTSUPP);
+		return -EOPNOTSUPP;
 
 	if (init_attr->log_ind_tbl_size >
 	    MLX5_CAP_GEN(dev->mdev, log_max_rqt_size)) {
 		mlx5_ib_dbg(dev, "log_ind_tbl_size = %d is bigger than supported = %d\n",
 			    init_attr->log_ind_tbl_size,
 			    MLX5_CAP_GEN(dev->mdev, log_max_rqt_size));
-		return ERR_PTR(-EINVAL);
+		return -EINVAL;
 	}
 
-	min_resp_len = offsetof(typeof(resp), reserved) + sizeof(resp.reserved);
+	min_resp_len =
+		offsetofend(struct mlx5_ib_create_rwq_ind_tbl_resp, reserved);
 	if (udata->outlen && udata->outlen < min_resp_len)
-		return ERR_PTR(-EINVAL);
-
-	rwq_ind_tbl = kzalloc(sizeof(*rwq_ind_tbl), GFP_KERNEL);
-	if (!rwq_ind_tbl)
-		return ERR_PTR(-ENOMEM);
+		return -EINVAL;
 
 	inlen = MLX5_ST_SZ_BYTES(create_rqt_in) + sizeof(u32) * sz;
 	in = kvzalloc(inlen, GFP_KERNEL);
-	if (!in) {
-		err = -ENOMEM;
-		goto err;
-	}
+	if (!in)
+		return -ENOMEM;
 
 	rqtc = MLX5_ADDR_OF(create_rqt_in, in, rqt_context);
 
@@ -5122,26 +5145,24 @@
 
 	err = mlx5_core_create_rqt(dev->mdev, in, inlen, &rwq_ind_tbl->rqtn);
 	kvfree(in);
-
 	if (err)
-		goto err;
+		return err;
 
 	rwq_ind_tbl->ib_rwq_ind_tbl.ind_tbl_num = rwq_ind_tbl->rqtn;
 	if (udata->outlen) {
-		resp.response_length = offsetof(typeof(resp), response_length) +
-					sizeof(resp.response_length);
+		resp.response_length =
+			offsetofend(struct mlx5_ib_create_rwq_ind_tbl_resp,
+				    response_length);
 		err = ib_copy_to_udata(udata, &resp, resp.response_length);
 		if (err)
 			goto err_copy;
 	}
 
-	return &rwq_ind_tbl->ib_rwq_ind_tbl;
+	return 0;
 
 err_copy:
 	mlx5_cmd_destroy_rqt(dev->mdev, rwq_ind_tbl->rqtn, rwq_ind_tbl->uid);
-err:
-	kfree(rwq_ind_tbl);
-	return ERR_PTR(err);
+	return err;
 }
 
 int mlx5_ib_destroy_rwq_ind_table(struct ib_rwq_ind_table *ib_rwq_ind_tbl)
@@ -5149,10 +5170,7 @@
 	struct mlx5_ib_rwq_ind_table *rwq_ind_tbl = to_mrwq_ind_table(ib_rwq_ind_tbl);
 	struct mlx5_ib_dev *dev = to_mdev(ib_rwq_ind_tbl->device);
 
-	mlx5_cmd_destroy_rqt(dev->mdev, rwq_ind_tbl->rqtn, rwq_ind_tbl->uid);
-
-	kfree(rwq_ind_tbl);
-	return 0;
+	return mlx5_cmd_destroy_rqt(dev->mdev, rwq_ind_tbl->rqtn, rwq_ind_tbl->uid);
 }
 
 int mlx5_ib_modify_wq(struct ib_wq *wq, struct ib_wq_attr *wq_attr,
@@ -5169,7 +5187,7 @@
 	void *rqc;
 	void *in;
 
-	required_cmd_sz = offsetof(typeof(ucmd), reserved) + sizeof(ucmd.reserved);
+	required_cmd_sz = offsetofend(struct mlx5_ib_modify_wq, reserved);
 	if (udata->inlen < required_cmd_sz)
 		return -EINVAL;
 
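Several qp.c hunks above replace the open-coded `offsetof(...) + sizeof(...)` with `offsetofend()`. The following is a minimal userspace sketch of why the two forms are equivalent. It assumes a locally defined macro with the same meaning as the kernel helper; `struct demo_resp`, `resp_len_old()`, `resp_len_new()` and `demo_resp_len()` are hypothetical names for illustration and are not part of the driver.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Local stand-in with the same meaning as the kernel's offsetofend():
 * the offset of the first byte after MEMBER inside TYPE.
 */
#define offsetofend(TYPE, MEMBER) \
	(offsetof(TYPE, MEMBER) + sizeof(((TYPE *)0)->MEMBER))

/* Hypothetical response layout, for illustration only. */
struct demo_resp {
	uint32_t response_length;
	uint32_t reserved;
};

/* Open-coded form, as on the removed lines of the diff. */
static size_t resp_len_old(void)
{
	struct demo_resp resp;

	return offsetof(struct demo_resp, response_length) +
	       sizeof(resp.response_length);
}

/* Equivalent form, as on the added lines of the diff. */
static size_t resp_len_new(void)
{
	return offsetofend(struct demo_resp, response_length);
}

/* Never report more bytes than the caller's buffer can hold. */
static size_t demo_resp_len(size_t outlen)
{
	size_t want = resp_len_new();

	return outlen < want ? outlen : want;
}
```

Naming the struct type directly, as the added lines do, also removes the dependency on a local variable that `typeof(resp)` needed.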
+2 -2
drivers/infiniband/hw/mlx5/qp.h
@@ -26,8 +26,8 @@
 
 int mlx5_core_set_delay_drop(struct mlx5_ib_dev *dev, u32 timeout_usec);
 
-void mlx5_core_destroy_rq_tracked(struct mlx5_ib_dev *dev,
-				  struct mlx5_core_qp *rq);
+int mlx5_core_destroy_rq_tracked(struct mlx5_ib_dev *dev,
+				 struct mlx5_core_qp *rq);
 int mlx5_core_create_sq_tracked(struct mlx5_ib_dev *dev, u32 *in, int inlen,
 				struct mlx5_core_qp *sq);
 void mlx5_core_destroy_sq_tracked(struct mlx5_ib_dev *dev,
+3 -2
drivers/infiniband/hw/mlx5/qpc.c
@@ -576,11 +576,12 @@
 	return err;
 }
 
-void mlx5_core_destroy_rq_tracked(struct mlx5_ib_dev *dev,
-				  struct mlx5_core_qp *rq)
+int mlx5_core_destroy_rq_tracked(struct mlx5_ib_dev *dev,
+				 struct mlx5_core_qp *rq)
 {
 	destroy_resource_common(dev, rq);
 	destroy_rq_tracked(dev, rq->qpn, rq->uid);
+	return 0;
 }
 
 static void destroy_sq_tracked(struct mlx5_ib_dev *dev, u32 sqn, u16 uid)
+9 -12
drivers/infiniband/hw/mlx5/srq.c
@@ -389,24 +389,21 @@
 	return ret;
 }
 
-void mlx5_ib_destroy_srq(struct ib_srq *srq, struct ib_udata *udata)
+int mlx5_ib_destroy_srq(struct ib_srq *srq, struct ib_udata *udata)
 {
 	struct mlx5_ib_dev *dev = to_mdev(srq->device);
 	struct mlx5_ib_srq *msrq = to_msrq(srq);
+	int ret;
 
-	mlx5_cmd_destroy_srq(dev, &msrq->msrq);
+	ret = mlx5_cmd_destroy_srq(dev, &msrq->msrq);
+	if (ret)
+		return ret;
 
-	if (srq->uobject) {
-		mlx5_ib_db_unmap_user(
-			rdma_udata_to_drv_context(
-				udata,
-				struct mlx5_ib_ucontext,
-				ibucontext),
-			&msrq->db);
-		ib_umem_release(msrq->umem);
-	} else {
+	if (udata)
+		destroy_srq_user(srq->pd, msrq, udata);
+	else
 		destroy_srq_kernel(dev, msrq);
-	}
+	return 0;
 }
 
 void mlx5_ib_free_srq_wqe(struct mlx5_ib_srq *srq, int wqe_index)
+1 -1
drivers/infiniband/hw/mlx5/srq.h
@@ -56,7 +56,7 @@
 
 int mlx5_cmd_create_srq(struct mlx5_ib_dev *dev, struct mlx5_core_srq *srq,
 			struct mlx5_srq_attr *in);
-void mlx5_cmd_destroy_srq(struct mlx5_ib_dev *dev, struct mlx5_core_srq *srq);
+int mlx5_cmd_destroy_srq(struct mlx5_ib_dev *dev, struct mlx5_core_srq *srq);
 int mlx5_cmd_query_srq(struct mlx5_ib_dev *dev, struct mlx5_core_srq *srq,
 		       struct mlx5_srq_attr *out);
 int mlx5_cmd_arm_srq(struct mlx5_ib_dev *dev, struct mlx5_core_srq *srq,
+16 -6
drivers/infiniband/hw/mlx5/srq_cmd.c
@@ -590,22 +590,32 @@
 	return err;
 }
 
-void mlx5_cmd_destroy_srq(struct mlx5_ib_dev *dev, struct mlx5_core_srq *srq)
+int mlx5_cmd_destroy_srq(struct mlx5_ib_dev *dev, struct mlx5_core_srq *srq)
 {
 	struct mlx5_srq_table *table = &dev->srq_table;
 	struct mlx5_core_srq *tmp;
 	int err;
 
-	tmp = xa_erase_irq(&table->array, srq->srqn);
-	if (!tmp || tmp != srq)
-		return;
+	/* Delete entry, but leave index occupied */
+	tmp = xa_cmpxchg_irq(&table->array, srq->srqn, srq, XA_ZERO_ENTRY, 0);
+	if (WARN_ON(tmp != srq))
+		return xa_err(tmp) ?: -EINVAL;
 
 	err = destroy_srq_split(dev, srq);
-	if (err)
-		return;
+	if (err) {
+		/*
+		 * We don't need to check returned result for an error,
+		 * because we are storing in pre-allocated space xarray
+		 * entry and it can't fail at this stage.
+		 */
+		xa_cmpxchg_irq(&table->array, srq->srqn, XA_ZERO_ENTRY, srq, 0);
+		return err;
+	}
+	xa_erase_irq(&table->array, srq->srqn);
 
 	mlx5_core_res_put(&srq->common);
 	wait_for_completion(&srq->common.free);
+	return 0;
 }
 
 int mlx5_cmd_query_srq(struct mlx5_ib_dev *dev, struct mlx5_core_srq *srq,
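The srq_cmd.c change above hides the entry from lookups while keeping its index reserved, so that a failed hardware destroy can put the entry back without needing a new allocation. A rough userspace sketch of that reserve, attempt, roll back or erase sequence follows. It assumes a plain array in place of the XArray and performs no locking; `table`, `RESERVED`, `slot_cmpxchg()`, `destroy_entry()`, `hw_ok()` and `hw_busy()` are hypothetical names, not kernel APIs.

```c
#include <errno.h>
#include <stddef.h>

#define TABLE_SIZE 8

/* Hypothetical stand-in for XA_ZERO_ENTRY: marks a slot as reserved. */
static int reserved_marker;
#define RESERVED ((void *)&reserved_marker)

static void *table[TABLE_SIZE];

/* Replace the slot only if it currently holds 'old'; return what was there. */
static void *slot_cmpxchg(unsigned int idx, void *old, void *new)
{
	void *cur = table[idx];

	if (cur == old)
		table[idx] = new;
	return cur;
}

/* Hypothetical destroy command that may fail, like a firmware call. */
typedef int (*destroy_fn)(void *obj);

static int destroy_entry(unsigned int idx, void *obj, destroy_fn hw_destroy)
{
	int err;

	/* Step 1: hide the entry, but keep the index occupied. */
	if (slot_cmpxchg(idx, obj, RESERVED) != obj)
		return -EINVAL;

	/* Step 2: try the operation that is allowed to fail. */
	err = hw_destroy(obj);
	if (err) {
		/* Step 3a: roll back; the slot already exists, so this cannot fail. */
		slot_cmpxchg(idx, RESERVED, obj);
		return err;
	}

	/* Step 3b: success, release the index for real. */
	table[idx] = NULL;
	return 0;
}

static int hw_ok(void *obj)   { (void)obj; return 0; }
static int hw_busy(void *obj) { (void)obj; return -EBUSY; }
```

Calling `destroy_entry()` with `hw_busy` leaves the table unchanged, which is the property the reworked destroy path needs now that destroy operations may return an error to the caller.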
+14 -13
drivers/infiniband/hw/mlx5/wr.c
@@ -398,7 +398,8 @@
 	seg->status = MLX5_MKEY_STATUS_FREE;
 }
 
-static void set_reg_mkey_segment(struct mlx5_mkey_seg *seg,
+static void set_reg_mkey_segment(struct mlx5_ib_dev *dev,
+				 struct mlx5_mkey_seg *seg,
 				 const struct ib_send_wr *wr)
 {
 	const struct mlx5_umr_wr *umrwr = umr_wr(wr);
@@ -414,10 +415,12 @@
 	MLX5_SET(mkc, seg, rr, !!(umrwr->access_flags & IB_ACCESS_REMOTE_READ));
 	MLX5_SET(mkc, seg, lw, !!(umrwr->access_flags & IB_ACCESS_LOCAL_WRITE));
 	MLX5_SET(mkc, seg, lr, 1);
-	MLX5_SET(mkc, seg, relaxed_ordering_write,
-		 !!(umrwr->access_flags & IB_ACCESS_RELAXED_ORDERING));
-	MLX5_SET(mkc, seg, relaxed_ordering_read,
-		 !!(umrwr->access_flags & IB_ACCESS_RELAXED_ORDERING));
+	if (MLX5_CAP_GEN(dev->mdev, relaxed_ordering_write_umr))
+		MLX5_SET(mkc, seg, relaxed_ordering_write,
+			 !!(umrwr->access_flags & IB_ACCESS_RELAXED_ORDERING));
+	if (MLX5_CAP_GEN(dev->mdev, relaxed_ordering_read_umr))
+		MLX5_SET(mkc, seg, relaxed_ordering_read,
+			 !!(umrwr->access_flags & IB_ACCESS_RELAXED_ORDERING));
 
 	if (umrwr->pd)
 		MLX5_SET(mkc, seg, pd, to_mpd(umrwr->pd)->pdn);
@@ -863,13 +866,11 @@
 	bool atomic = wr->access & IB_ACCESS_REMOTE_ATOMIC;
 	u8 flags = 0;
 
-	if (!mlx5_ib_can_use_umr(dev, atomic, wr->access)) {
-		mlx5_ib_warn(to_mdev(qp->ibqp.device),
-			     "Fast update of %s for MR is disabled\n",
-			     (MLX5_CAP_GEN(dev->mdev,
-					   umr_modify_entity_size_disabled)) ?
-				     "entity size" :
-				     "atomic access");
+	/* Matches access in mlx5_set_umr_free_mkey() */
+	if (!mlx5_ib_can_reconfig_with_umr(dev, 0, wr->access)) {
+		mlx5_ib_warn(
+			to_mdev(qp->ibqp.device),
+			"Fast update for MR access flags is not possible\n");
 		return -EINVAL;
 	}
 
@@ -1263,7 +1264,7 @@
 	*seg += sizeof(struct mlx5_wqe_umr_ctrl_seg);
 	*size += sizeof(struct mlx5_wqe_umr_ctrl_seg) / 16;
 	handle_post_send_edge(&qp->sq, seg, *size, cur_edge);
-	set_reg_mkey_segment(*seg, wr);
+	set_reg_mkey_segment(dev, *seg, wr);
 	*seg += sizeof(struct mlx5_mkey_seg);
 	*size += sizeof(struct mlx5_mkey_seg) / 16;
 	handle_post_send_edge(&qp->sq, seg, *size, cur_edge);
+1 -1
drivers/infiniband/hw/mthca/mthca_dev.h
@@ -548,7 +548,7 @@
 		    struct ib_qp_cap *cap,
 		    int qpn,
 		    int port,
-		    struct mthca_sqp *sqp,
+		    struct mthca_qp *qp,
 		    struct ib_udata *udata);
 void mthca_free_qp(struct mthca_dev *dev, struct mthca_qp *qp);
 int mthca_create_ah(struct mthca_dev *dev,
+23 -16
drivers/infiniband/hw/mthca/mthca_provider.c
@@ -373,9 +373,10 @@
 	return 0;
 }
 
-static void mthca_dealloc_pd(struct ib_pd *pd, struct ib_udata *udata)
+static int mthca_dealloc_pd(struct ib_pd *pd, struct ib_udata *udata)
 {
 	mthca_pd_free(to_mdev(pd->device), to_mpd(pd));
+	return 0;
 }
 
 static int mthca_ah_create(struct ib_ah *ibah,
@@ -389,9 +390,10 @@
 			       init_attr->ah_attr, ah);
 }
 
-static void mthca_ah_destroy(struct ib_ah *ah, u32 flags)
+static int mthca_ah_destroy(struct ib_ah *ah, u32 flags)
 {
 	mthca_destroy_ah(to_mdev(ah->device), to_mah(ah));
+	return 0;
 }
 
 static int mthca_create_srq(struct ib_srq *ibsrq,
@@ -440,7 +442,7 @@
 	return 0;
 }
 
-static void mthca_destroy_srq(struct ib_srq *srq, struct ib_udata *udata)
+static int mthca_destroy_srq(struct ib_srq *srq, struct ib_udata *udata)
 {
 	if (udata) {
 		struct mthca_ucontext *context =
@@ -454,6 +456,7 @@
 	}
 
 	mthca_free_srq(to_mdev(srq->device), to_msrq(srq));
+	return 0;
 }
 
 static struct ib_qp *mthca_create_qp(struct ib_pd *pd,
@@ -532,13 +535,14 @@
 	case IB_QPT_SMI:
 	case IB_QPT_GSI:
 	{
-		/* Don't allow userspace to create special QPs */
-		if (udata)
-			return ERR_PTR(-EINVAL);
-
-		qp = kzalloc(sizeof(struct mthca_sqp), GFP_KERNEL);
+		qp = kzalloc(sizeof(*qp), GFP_KERNEL);
 		if (!qp)
 			return ERR_PTR(-ENOMEM);
+		qp->sqp = kzalloc(sizeof(struct mthca_sqp), GFP_KERNEL);
+		if (!qp->sqp) {
+			kfree(qp);
+			return ERR_PTR(-ENOMEM);
+		}
 
 		qp->ibqp.qp_num = init_attr->qp_type == IB_QPT_SMI ? 0 : 1;
 
@@ -547,7 +551,7 @@
 				      to_mcq(init_attr->recv_cq),
 				      init_attr->sq_sig_type, &init_attr->cap,
 				      qp->ibqp.qp_num, init_attr->port_num,
-				      to_msqp(qp), udata);
+				      qp, udata);
 		break;
 	}
 	default:
@@ -556,6 +560,7 @@
 	}
 
 	if (err) {
+		kfree(qp->sqp);
 		kfree(qp);
 		return ERR_PTR(err);
 	}
@@ -588,7 +593,8 @@
 				    to_mqp(qp)->rq.db_index);
 	}
 	mthca_free_qp(to_mdev(qp->device), to_mqp(qp));
-	kfree(qp);
+	kfree(to_mqp(qp)->sqp);
+	kfree(to_mqp(qp));
 	return 0;
 }
 
@@ -789,7 +795,7 @@
 	return ret;
 }
 
-static void mthca_destroy_cq(struct ib_cq *cq, struct ib_udata *udata)
+static int mthca_destroy_cq(struct ib_cq *cq, struct ib_udata *udata)
 {
 	if (udata) {
 		struct mthca_ucontext *context =
@@ -808,6 +814,7 @@
 				    to_mcq(cq)->set_ci_db_index);
 	}
 	mthca_free_cq(to_mdev(cq->device), to_mcq(cq));
+	return 0;
 }
 
 static inline u32 convert_access(int acc)
@@ -846,7 +853,7 @@
 				       u64 virt, int acc, struct ib_udata *udata)
 {
 	struct mthca_dev *dev = to_mdev(pd->device);
-	struct sg_dma_page_iter sg_iter;
+	struct ib_block_iter biter;
 	struct mthca_ucontext *context = rdma_udata_to_drv_context(
 		udata, struct mthca_ucontext, ibucontext);
 	struct mthca_mr *mr;
@@ -877,7 +884,7 @@
 		goto err;
 	}
 
-	n = ib_umem_num_pages(mr->umem);
+	n = ib_umem_num_dma_blocks(mr->umem, PAGE_SIZE);
 
 	mr->mtt = mthca_alloc_mtt(dev, n);
 	if (IS_ERR(mr->mtt)) {
@@ -895,8 +902,8 @@
 
 	write_mtt_size = min(mthca_write_mtt_size(dev), (int) (PAGE_SIZE / sizeof *pages));
 
-	for_each_sg_dma_page(mr->umem->sg_head.sgl, &sg_iter, mr->umem->nmap, 0) {
-		pages[i++] = sg_page_iter_dma_address(&sg_iter);
+	rdma_umem_for_each_dma_block(mr->umem, &biter, PAGE_SIZE) {
+		pages[i++] = rdma_block_iter_dma_address(&biter);
 
 		/*
 		 * Be friendly to write_mtt and pass it chunks
@@ -1199,7 +1206,7 @@
 	mutex_init(&dev->cap_mask_mutex);
 
 	rdma_set_device_sysfs_group(&dev->ib_dev, &mthca_attr_group);
-	ret = ib_register_device(&dev->ib_dev, "mthca%d");
+	ret = ib_register_device(&dev->ib_dev, "mthca%d", &dev->pdev->dev);
 	if (ret)
 		return ret;
 
+11 -16
drivers/infiniband/hw/mthca/mthca_provider.h
@@ -240,6 +240,16 @@
 	__be32 *db;
 };
 
+struct mthca_sqp {
+	int pkey_index;
+	u32 qkey;
+	u32 send_psn;
+	struct ib_ud_header ud_header;
+	int header_buf_size;
+	void *header_buf;
+	dma_addr_t header_dma;
+};
+
 struct mthca_qp {
 	struct ib_qp ibqp;
 	int refcount;
@@ -265,17 +275,7 @@
 
 	wait_queue_head_t wait;
 	struct mutex mutex;
-};
-
-struct mthca_sqp {
-	struct mthca_qp qp;
-	int pkey_index;
-	u32 qkey;
-	u32 send_psn;
-	struct ib_ud_header ud_header;
-	int header_buf_size;
-	void *header_buf;
-	dma_addr_t header_dma;
+	struct mthca_sqp *sqp;
 };
 
 static inline struct mthca_ucontext *to_mucontext(struct ib_ucontext *ibucontext)
@@ -311,11 +311,6 @@
 static inline struct mthca_qp *to_mqp(struct ib_qp *ibqp)
 {
 	return container_of(ibqp, struct mthca_qp, ibqp);
-}
-
-static inline struct mthca_sqp *to_msqp(struct mthca_qp *qp)
-{
-	return container_of(qp, struct mthca_sqp, qp);
 }
 
 #endif /* MTHCA_PROVIDER_H */
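The mthca_provider.h change above stops embedding `struct mthca_qp` inside `struct mthca_sqp` and instead gives every QP an optional pointer to the special-QP state, so all QPs share one allocation size and the `container_of()` helper is no longer needed. Below is a minimal userspace sketch of that ownership pattern, assuming `calloc`/`free` in place of `kzalloc`/`kfree`; `struct demo_qp`, `struct demo_sqp`, `demo_qp_alloc()` and `demo_qp_free()` are hypothetical names.

```c
#include <stdlib.h>

/* Hypothetical special-QP state, allocated only when needed. */
struct demo_sqp {
	int pkey_index;
};

/* Hypothetical QP object with an optional side allocation. */
struct demo_qp {
	int qpn;
	struct demo_sqp *sqp;	/* NULL for ordinary QPs */
};

/* Allocate the QP first, then the optional part; unwind on failure. */
static struct demo_qp *demo_qp_alloc(int special)
{
	struct demo_qp *qp = calloc(1, sizeof(*qp));

	if (!qp)
		return NULL;
	if (special) {
		qp->sqp = calloc(1, sizeof(*qp->sqp));
		if (!qp->sqp) {
			free(qp);
			return NULL;
		}
	}
	return qp;
}

/* Free in reverse order; free(NULL) is a no-op, so one path serves both kinds. */
static void demo_qp_free(struct demo_qp *qp)
{
	if (!qp)
		return;
	free(qp->sqp);
	free(qp);
}
```

The same shape shows up in the mthca_provider.c hunks, where the error and destroy paths free `qp->sqp` before `qp`.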
+37 -38
drivers/infiniband/hw/mthca/mthca_qp.c
@@ -809,7 +809,7 @@
 	qp->alt_port = attr->alt_port_num;
 
 	if (is_sqp(dev, qp))
-		store_attrs(to_msqp(qp), attr, attr_mask);
+		store_attrs(qp->sqp, attr, attr_mask);
 
 	/*
 	 * If we moved QP0 to RTR, bring the IB link up; if we moved
@@ -1368,39 +1368,40 @@
 		    struct ib_qp_cap *cap,
 		    int qpn,
 		    int port,
-		    struct mthca_sqp *sqp,
+		    struct mthca_qp *qp,
 		    struct ib_udata *udata)
 {
 	u32 mqpn = qpn * 2 + dev->qp_table.sqp_start + port - 1;
 	int err;
 
-	sqp->qp.transport = MLX;
-	err = mthca_set_qp_size(dev, cap, pd, &sqp->qp);
+	qp->transport = MLX;
+	err = mthca_set_qp_size(dev, cap, pd, qp);
 	if (err)
 		return err;
 
-	sqp->header_buf_size = sqp->qp.sq.max * MTHCA_UD_HEADER_SIZE;
-	sqp->header_buf = dma_alloc_coherent(&dev->pdev->dev, sqp->header_buf_size,
-					     &sqp->header_dma, GFP_KERNEL);
-	if (!sqp->header_buf)
+	qp->sqp->header_buf_size = qp->sq.max * MTHCA_UD_HEADER_SIZE;
+	qp->sqp->header_buf =
+		dma_alloc_coherent(&dev->pdev->dev, qp->sqp->header_buf_size,
+				   &qp->sqp->header_dma, GFP_KERNEL);
+	if (!qp->sqp->header_buf)
 		return -ENOMEM;
 
 	spin_lock_irq(&dev->qp_table.lock);
 	if (mthca_array_get(&dev->qp_table.qp, mqpn))
 		err = -EBUSY;
 	else
-		mthca_array_set(&dev->qp_table.qp, mqpn, sqp);
+		mthca_array_set(&dev->qp_table.qp, mqpn, qp->sqp);
 	spin_unlock_irq(&dev->qp_table.lock);
 
 	if (err)
 		goto err_out;
 
-	sqp->qp.port = port;
-	sqp->qp.qpn = mqpn;
-	sqp->qp.transport = MLX;
+	qp->port = port;
+	qp->qpn = mqpn;
+	qp->transport = MLX;
 
 	err = mthca_alloc_qp_common(dev, pd, send_cq, recv_cq,
-				    send_policy, &sqp->qp, udata);
+				    send_policy, qp, udata);
 	if (err)
 		goto err_out_free;
 
@@ -1421,10 +1422,9 @@
 
 	mthca_unlock_cqs(send_cq, recv_cq);
 
- err_out:
-	dma_free_coherent(&dev->pdev->dev, sqp->header_buf_size,
-			  sqp->header_buf, sqp->header_dma);
-
+err_out:
+	dma_free_coherent(&dev->pdev->dev, qp->sqp->header_buf_size,
+			  qp->sqp->header_buf, qp->sqp->header_dma);
 	return err;
 }
 
@@ -1487,20 +1487,19 @@
 
 	if (is_sqp(dev, qp)) {
 		atomic_dec(&(to_mpd(qp->ibqp.pd)->sqp_count));
-		dma_free_coherent(&dev->pdev->dev,
-				  to_msqp(qp)->header_buf_size,
-				  to_msqp(qp)->header_buf,
-				  to_msqp(qp)->header_dma);
+		dma_free_coherent(&dev->pdev->dev, qp->sqp->header_buf_size,
+				  qp->sqp->header_buf, qp->sqp->header_dma);
 	} else
 		mthca_free(&dev->qp_table.alloc, qp->qpn);
 }
 
 /* Create UD header for an MLX send and build a data segment for it */
-static int build_mlx_header(struct mthca_dev *dev, struct mthca_sqp *sqp,
-			    int ind, const struct ib_ud_wr *wr,
+static int build_mlx_header(struct mthca_dev *dev, struct mthca_qp *qp, int ind,
+			    const struct ib_ud_wr *wr,
 			    struct mthca_mlx_seg *mlx,
 			    struct mthca_data_seg *data)
 {
+	struct mthca_sqp *sqp = qp->sqp;
 	int header_size;
 	int err;
 	u16 pkey;
@@ -1513,7 +1512,7 @@
 	if (err)
 		return err;
 	mlx->flags &= ~cpu_to_be32(MTHCA_NEXT_SOLICIT | 1);
-	mlx->flags |= cpu_to_be32((!sqp->qp.ibqp.qp_num ? MTHCA_MLX_VL15 : 0) |
+	mlx->flags |= cpu_to_be32((!qp->ibqp.qp_num ? MTHCA_MLX_VL15 : 0) |
 				  (sqp->ud_header.lrh.destination_lid ==
 				   IB_LID_PERMISSIVE ? MTHCA_MLX_SLR : 0) |
 				  (sqp->ud_header.lrh.service_level << 8));
@@ -1534,29 +1533,29 @@
 		return -EINVAL;
 	}
 
-	sqp->ud_header.lrh.virtual_lane = !sqp->qp.ibqp.qp_num ? 15 : 0;
+	sqp->ud_header.lrh.virtual_lane = !qp->ibqp.qp_num ? 15 : 0;
 	if (sqp->ud_header.lrh.destination_lid == IB_LID_PERMISSIVE)
 		sqp->ud_header.lrh.source_lid = IB_LID_PERMISSIVE;
 	sqp->ud_header.bth.solicited_event = !!(wr->wr.send_flags & IB_SEND_SOLICITED);
-	if (!sqp->qp.ibqp.qp_num)
-		ib_get_cached_pkey(&dev->ib_dev, sqp->qp.port,
-				   sqp->pkey_index, &pkey);
+	if (!qp->ibqp.qp_num)
+		ib_get_cached_pkey(&dev->ib_dev, qp->port, sqp->pkey_index,
+				   &pkey);
 	else
-		ib_get_cached_pkey(&dev->ib_dev, sqp->qp.port,
-				   wr->pkey_index, &pkey);
+		ib_get_cached_pkey(&dev->ib_dev, qp->port, wr->pkey_index,
+				   &pkey);
 	sqp->ud_header.bth.pkey = cpu_to_be16(pkey);
 	sqp->ud_header.bth.destination_qpn = cpu_to_be32(wr->remote_qpn);
 	sqp->ud_header.bth.psn = cpu_to_be32((sqp->send_psn++) & ((1 << 24) - 1));
 	sqp->ud_header.deth.qkey = cpu_to_be32(wr->remote_qkey & 0x80000000 ?
 					       sqp->qkey : wr->remote_qkey);
-	sqp->ud_header.deth.source_qpn = cpu_to_be32(sqp->qp.ibqp.qp_num);
+	sqp->ud_header.deth.source_qpn = cpu_to_be32(qp->ibqp.qp_num);
 
 	header_size = ib_ud_header_pack(&sqp->ud_header,
 					sqp->header_buf +
 					ind * MTHCA_UD_HEADER_SIZE);
 
 	data->byte_count = cpu_to_be32(header_size);
-	data->lkey = cpu_to_be32(to_mpd(sqp->qp.ibqp.pd)->ntmr.ibmr.lkey);
+	data->lkey = cpu_to_be32(to_mpd(qp->ibqp.pd)->ntmr.ibmr.lkey);
 	data->addr = cpu_to_be64(sqp->header_dma +
 				 ind * MTHCA_UD_HEADER_SIZE);
 
@@ -1735,9 +1734,9 @@
 			break;
 
 		case MLX:
-			err = build_mlx_header(dev, to_msqp(qp), ind, ud_wr(wr),
-					       wqe - sizeof (struct mthca_next_seg),
-					       wqe);
+			err = build_mlx_header(
+				dev, qp, ind, ud_wr(wr),
+				wqe - sizeof(struct mthca_next_seg), wqe);
 			if (err) {
 				*bad_wr = wr;
 				goto out;
@@ -2065,9 +2064,9 @@
 			break;
 
 		case MLX:
-			err = build_mlx_header(dev, to_msqp(qp), ind, ud_wr(wr),
-					       wqe - sizeof (struct mthca_next_seg),
-					       wqe);
+			err = build_mlx_header(
+				dev, qp, ind, ud_wr(wr),
+				wqe - sizeof(struct mthca_next_seg), wqe);
 			if (err) {
 				*bad_wr = wr;
 				goto out;
-1
drivers/infiniband/hw/ocrdma/ocrdma.h
···
 	u32 num_pbes;
 	u32 pbl_size;
 	u32 pbe_size;
-	u64 fbo;
 	u64 va;
 };
+2 -1
drivers/infiniband/hw/ocrdma/ocrdma_ah.c
···
 	return status;
 }
 
-void ocrdma_destroy_ah(struct ib_ah *ibah, u32 flags)
+int ocrdma_destroy_ah(struct ib_ah *ibah, u32 flags)
 {
 	struct ocrdma_ah *ah = get_ocrdma_ah(ibah);
 	struct ocrdma_dev *dev = get_ocrdma_dev(ibah->device);
 
 	ocrdma_free_av(dev, ah);
+	return 0;
 }
 
 int ocrdma_query_ah(struct ib_ah *ibah, struct rdma_ah_attr *attr)
+1 -1
drivers/infiniband/hw/ocrdma/ocrdma_ah.h
···
 
 int ocrdma_create_ah(struct ib_ah *ah, struct rdma_ah_init_attr *init_attr,
 		     struct ib_udata *udata);
-void ocrdma_destroy_ah(struct ib_ah *ah, u32 flags);
+int ocrdma_destroy_ah(struct ib_ah *ah, u32 flags);
 int ocrdma_query_ah(struct ib_ah *ah, struct rdma_ah_attr *ah_attr);
 
 int ocrdma_process_mad(struct ib_device *dev, int process_mad_flags,
+3 -2
drivers/infiniband/hw/ocrdma/ocrdma_hw.c
···
 	int i;
 	struct ocrdma_reg_nsmr *cmd;
 	struct ocrdma_reg_nsmr_rsp *rsp;
+	u64 fbo = hwmr->va & (hwmr->pbe_size - 1);
 
 	cmd = ocrdma_init_emb_mqe(OCRDMA_CMD_REGISTER_NSMR, sizeof(*cmd));
 	if (!cmd)
···
 				OCRDMA_REG_NSMR_HPAGE_SIZE_SHIFT;
 	cmd->totlen_low = hwmr->len;
 	cmd->totlen_high = upper_32_bits(hwmr->len);
-	cmd->fbo_low = (u32) (hwmr->fbo & 0xffffffff);
-	cmd->fbo_high = (u32) upper_32_bits(hwmr->fbo);
+	cmd->fbo_low = lower_32_bits(fbo);
+	cmd->fbo_high = upper_32_bits(fbo);
 	cmd->va_loaddr = (u32) hwmr->va;
 	cmd->va_hiaddr = (u32) upper_32_bits(hwmr->va);
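Note on the hunk above: the stored `hwmr->fbo` field is replaced by a value derived from the virtual address and the PBE size. A minimal standalone sketch of that arithmetic follows, assuming `pbe_size` is a power of two (the diff sets it to `PAGE_SIZE`). The names `first_byte_offset`, `lo32` and `hi32` are hypothetical stand-ins for illustration and are not kernel APIs.

```c
#include <stdint.h>

/*
 * Hypothetical helper: offset of a virtual address within its block,
 * for a power-of-two block size. Mirrors `va & (pbe_size - 1)`.
 */
static uint64_t first_byte_offset(uint64_t va, uint64_t block_size)
{
	return va & (block_size - 1);
}

/* Hypothetical stand-ins for the kernel's lower_32_bits()/upper_32_bits(). */
static uint32_t lo32(uint64_t v)
{
	return (uint32_t)(v & 0xffffffffu);
}

static uint32_t hi32(uint64_t v)
{
	return (uint32_t)(v >> 32);
}
```

With a 4096-byte block, an address that is block-aligned yields an offset of zero, and any other address yields its low 12 bits. The two 32-bit halves are what the command structure's `fbo_low` and `fbo_high` fields would carry.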
+3 -1
drivers/infiniband/hw/ocrdma/ocrdma_main.c
···
 	if (ret)
 		return ret;
 
-	return ib_register_device(&dev->ibdev, "ocrdma%d");
+	dma_set_max_seg_size(&dev->nic_info.pdev->dev, UINT_MAX);
+	return ib_register_device(&dev->ibdev, "ocrdma%d",
+				  &dev->nic_info.pdev->dev);
 }
 
 static int ocrdma_alloc_resources(struct ocrdma_dev *dev)
+16 -22
drivers/infiniband/hw/ocrdma/ocrdma_verbs.c
··· 112 112 } 113 113 114 114 static inline void get_link_speed_and_width(struct ocrdma_dev *dev, 115 - u8 *ib_speed, u8 *ib_width) 115 + u16 *ib_speed, u8 *ib_width) 116 116 { 117 117 int status; 118 118 u8 speed; ··· 664 664 return status; 665 665 } 666 666 667 - void ocrdma_dealloc_pd(struct ib_pd *ibpd, struct ib_udata *udata) 667 + int ocrdma_dealloc_pd(struct ib_pd *ibpd, struct ib_udata *udata) 668 668 { 669 669 struct ocrdma_pd *pd = get_ocrdma_pd(ibpd); 670 670 struct ocrdma_dev *dev = get_ocrdma_dev(ibpd->device); ··· 682 682 683 683 if (is_ucontext_pd(uctx, pd)) { 684 684 ocrdma_release_ucontext_pd(uctx); 685 - return; 685 + return 0; 686 686 } 687 687 } 688 688 _ocrdma_dealloc_pd(dev, pd); 689 + return 0; 689 690 } 690 691 691 692 static int ocrdma_alloc_lkey(struct ocrdma_dev *dev, struct ocrdma_mr *mr, ··· 811 810 return status; 812 811 } 813 812 814 - static void build_user_pbes(struct ocrdma_dev *dev, struct ocrdma_mr *mr, 815 - u32 num_pbes) 813 + static void build_user_pbes(struct ocrdma_dev *dev, struct ocrdma_mr *mr) 816 814 { 817 815 struct ocrdma_pbe *pbe; 818 - struct sg_dma_page_iter sg_iter; 816 + struct ib_block_iter biter; 819 817 struct ocrdma_pbl *pbl_tbl = mr->hwmr.pbl_table; 820 - struct ib_umem *umem = mr->umem; 821 - int pbe_cnt, total_num_pbes = 0; 818 + int pbe_cnt; 822 819 u64 pg_addr; 823 820 824 821 if (!mr->hwmr.num_pbes) ··· 825 826 pbe = (struct ocrdma_pbe *)pbl_tbl->va; 826 827 pbe_cnt = 0; 827 828 828 - for_each_sg_dma_page (umem->sg_head.sgl, &sg_iter, umem->nmap, 0) { 829 + rdma_umem_for_each_dma_block (mr->umem, &biter, PAGE_SIZE) { 829 830 /* store the page address in pbe */ 830 - pg_addr = sg_page_iter_dma_address(&sg_iter); 831 + pg_addr = rdma_block_iter_dma_address(&biter); 831 832 pbe->pa_lo = cpu_to_le32(pg_addr); 832 833 pbe->pa_hi = cpu_to_le32(upper_32_bits(pg_addr)); 833 834 pbe_cnt += 1; 834 - total_num_pbes += 1; 835 835 pbe++; 836 - 837 - /* if done building pbes, issue the mbx cmd. 
*/ 838 - if (total_num_pbes == num_pbes) 839 - return; 840 836 841 837 /* if the given pbl is full storing the pbes, 842 838 * move to next pbl. ··· 851 857 struct ocrdma_dev *dev = get_ocrdma_dev(ibpd->device); 852 858 struct ocrdma_mr *mr; 853 859 struct ocrdma_pd *pd; 854 - u32 num_pbes; 855 860 856 861 pd = get_ocrdma_pd(ibpd); 857 862 ··· 865 872 status = -EFAULT; 866 873 goto umem_err; 867 874 } 868 - num_pbes = ib_umem_page_count(mr->umem); 869 - status = ocrdma_get_pbl_info(dev, mr, num_pbes); 875 + status = ocrdma_get_pbl_info( 876 + dev, mr, ib_umem_num_dma_blocks(mr->umem, PAGE_SIZE)); 870 877 if (status) 871 878 goto umem_err; 872 879 873 880 mr->hwmr.pbe_size = PAGE_SIZE; 874 - mr->hwmr.fbo = ib_umem_offset(mr->umem); 875 881 mr->hwmr.va = usr_addr; 876 882 mr->hwmr.len = len; 877 883 mr->hwmr.remote_wr = (acc & IB_ACCESS_REMOTE_WRITE) ? 1 : 0; ··· 881 889 status = ocrdma_build_pbl_tbl(dev, &mr->hwmr); 882 890 if (status) 883 891 goto umem_err; 884 - build_user_pbes(dev, mr, num_pbes); 892 + build_user_pbes(dev, mr); 885 893 status = ocrdma_reg_mr(dev, &mr->hwmr, pd->id, acc); 886 894 if (status) 887 895 goto mbx_err; ··· 1048 1056 spin_unlock_irqrestore(&cq->cq_lock, flags); 1049 1057 } 1050 1058 1051 - void ocrdma_destroy_cq(struct ib_cq *ibcq, struct ib_udata *udata) 1059 + int ocrdma_destroy_cq(struct ib_cq *ibcq, struct ib_udata *udata) 1052 1060 { 1053 1061 struct ocrdma_cq *cq = get_ocrdma_cq(ibcq); 1054 1062 struct ocrdma_eq *eq = NULL; ··· 1073 1081 ocrdma_get_db_addr(dev, pdid), 1074 1082 dev->nic_info.db_page_size); 1075 1083 } 1084 + return 0; 1076 1085 } 1077 1086 1078 1087 static int ocrdma_add_qpn_map(struct ocrdma_dev *dev, struct ocrdma_qp *qp) ··· 1850 1857 return status; 1851 1858 } 1852 1859 1853 - void ocrdma_destroy_srq(struct ib_srq *ibsrq, struct ib_udata *udata) 1860 + int ocrdma_destroy_srq(struct ib_srq *ibsrq, struct ib_udata *udata) 1854 1861 { 1855 1862 struct ocrdma_srq *srq; 1856 1863 struct ocrdma_dev *dev = 
get_ocrdma_dev(ibsrq->device); ··· 1865 1872 1866 1873 kfree(srq->idx_bit_fields); 1867 1874 kfree(srq->rqe_wr_id_tbl); 1875 + return 0; 1868 1876 } 1869 1877 1870 1878 /* unprivileged verbs and their support functions. */
+3 -3
drivers/infiniband/hw/ocrdma/ocrdma_verbs.h
···
 int ocrdma_mmap(struct ib_ucontext *, struct vm_area_struct *vma);
 
 int ocrdma_alloc_pd(struct ib_pd *pd, struct ib_udata *udata);
-void ocrdma_dealloc_pd(struct ib_pd *pd, struct ib_udata *udata);
+int ocrdma_dealloc_pd(struct ib_pd *pd, struct ib_udata *udata);
 
 int ocrdma_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
 		     struct ib_udata *udata);
 int ocrdma_resize_cq(struct ib_cq *, int cqe, struct ib_udata *);
-void ocrdma_destroy_cq(struct ib_cq *ibcq, struct ib_udata *udata);
+int ocrdma_destroy_cq(struct ib_cq *ibcq, struct ib_udata *udata);
 
 struct ib_qp *ocrdma_create_qp(struct ib_pd *,
 			       struct ib_qp_init_attr *attrs,
···
 int ocrdma_modify_srq(struct ib_srq *, struct ib_srq_attr *,
 		      enum ib_srq_attr_mask, struct ib_udata *);
 int ocrdma_query_srq(struct ib_srq *, struct ib_srq_attr *);
-void ocrdma_destroy_srq(struct ib_srq *ibsrq, struct ib_udata *udata);
+int ocrdma_destroy_srq(struct ib_srq *ibsrq, struct ib_udata *udata);
 int ocrdma_post_srq_recv(struct ib_srq *, const struct ib_recv_wr *,
 			 const struct ib_recv_wr **bad_recv_wr);
+29 -2
drivers/infiniband/hw/qedr/main.c
··· 177 177 } 178 178 179 179 static const struct ib_device_ops qedr_roce_dev_ops = { 180 + .alloc_xrcd = qedr_alloc_xrcd, 181 + .dealloc_xrcd = qedr_dealloc_xrcd, 180 182 .get_port_immutable = qedr_roce_port_immutable, 181 183 .query_pkey = qedr_query_pkey, 182 184 }; ··· 188 186 dev->ibdev.node_type = RDMA_NODE_IB_CA; 189 187 190 188 ib_set_device_ops(&dev->ibdev, &qedr_roce_dev_ops); 189 + 190 + dev->ibdev.uverbs_cmd_mask |= QEDR_UVERBS(OPEN_XRCD) | 191 + QEDR_UVERBS(CLOSE_XRCD) | 192 + QEDR_UVERBS(CREATE_XSRQ); 191 193 } 192 194 193 195 static const struct ib_device_ops qedr_dev_ops = { ··· 238 232 INIT_RDMA_OBJ_SIZE(ib_cq, qedr_cq, ibcq), 239 233 INIT_RDMA_OBJ_SIZE(ib_pd, qedr_pd, ibpd), 240 234 INIT_RDMA_OBJ_SIZE(ib_srq, qedr_srq, ibsrq), 235 + INIT_RDMA_OBJ_SIZE(ib_xrcd, qedr_xrcd, ibxrcd), 241 236 INIT_RDMA_OBJ_SIZE(ib_ucontext, qedr_ucontext, ibucontext), 242 237 }; 243 238 ··· 293 286 if (rc) 294 287 return rc; 295 288 296 - return ib_register_device(&dev->ibdev, "qedr%d"); 289 + dma_set_max_seg_size(&dev->pdev->dev, UINT_MAX); 290 + return ib_register_device(&dev->ibdev, "qedr%d", &dev->pdev->dev); 297 291 } 298 292 299 293 /* This function allocates fast-path status block memory */ ··· 610 602 qed_attr = dev->ops->rdma_query_device(dev->rdma_ctx); 611 603 612 604 /* Part 2 - check capabilities */ 613 - page_size = ~dev->attr.page_size_caps + 1; 605 + page_size = ~qed_attr->page_size_caps + 1; 614 606 if (page_size > PAGE_SIZE) { 615 607 DP_ERR(dev, 616 608 "Kernel PAGE_SIZE is %ld which is smaller than minimum page size (%d) required by qedr\n", ··· 712 704 case ROCE_ASYNC_EVENT_SRQ_EMPTY: 713 705 event.event = IB_EVENT_SRQ_ERR; 714 706 event_type = EVENT_TYPE_SRQ; 707 + break; 708 + case ROCE_ASYNC_EVENT_XRC_DOMAIN_ERR: 709 + event.event = IB_EVENT_QP_ACCESS_ERR; 710 + event_type = EVENT_TYPE_QP; 711 + break; 712 + case ROCE_ASYNC_EVENT_INVALID_XRCETH_ERR: 713 + event.event = IB_EVENT_QP_ACCESS_ERR; 714 + event_type = EVENT_TYPE_QP; 715 + break; 716 + 
case ROCE_ASYNC_EVENT_XRC_SRQ_CATASTROPHIC_ERR: 717 + event.event = IB_EVENT_CQ_ERR; 718 + event_type = EVENT_TYPE_CQ; 715 719 break; 716 720 default: 717 721 DP_ERR(dev, "unsupported event %d on handle=%llx\n", ··· 1045 1025 break; 1046 1026 case QEDE_CHANGE_ADDR: 1047 1027 qedr_mac_address_change(dev); 1028 + break; 1029 + case QEDE_CHANGE_MTU: 1030 + if (rdma_protocol_iwarp(&dev->ibdev, 1)) 1031 + if (dev->ndev->mtu != dev->iwarp_max_mtu) 1032 + DP_NOTICE(dev, 1033 + "Mtu was changed from %d to %d. This will not take affect for iWARP until qedr is reloaded\n", 1034 + dev->iwarp_max_mtu, dev->ndev->mtu); 1048 1035 break; 1049 1036 default: 1050 1037 pr_err("Event not supported\n");
+33
drivers/infiniband/hw/qedr/qedr.h
···
 	struct qedr_ucontext *uctx;
 };
 
+struct qedr_xrcd {
+	struct ib_xrcd ibxrcd;
+	u16 xrcd_id;
+};
+
 struct qedr_qp_hwq_info {
 	/* WQE Elements */
 	struct qed_chain pbl;
···
 	struct ib_umem *prod_umem;
 	u16 srq_id;
 	u32 srq_limit;
+	bool is_xrc;
 	/* lock to protect srq recv post */
 	spinlock_t lock;
 };
···
 	return container_of(ibpd, struct qedr_pd, ibpd);
 }
 
+static inline struct qedr_xrcd *get_qedr_xrcd(struct ib_xrcd *ibxrcd)
+{
+	return container_of(ibxrcd, struct qedr_xrcd, ibxrcd);
+}
+
 static inline struct qedr_cq *get_qedr_cq(struct ib_cq *ibcq)
 {
 	return container_of(ibcq, struct qedr_cq, ibcq);
···
 static inline struct qedr_srq *get_qedr_srq(struct ib_srq *ibsrq)
 {
 	return container_of(ibsrq, struct qedr_srq, ibsrq);
+}
+
+static inline bool qedr_qp_has_srq(struct qedr_qp *qp)
+{
+	return qp->srq;
+}
+
+static inline bool qedr_qp_has_sq(struct qedr_qp *qp)
+{
+	if (qp->qp_type == IB_QPT_GSI || qp->qp_type == IB_QPT_XRC_TGT)
+		return 0;
+
+	return 1;
+}
+
+static inline bool qedr_qp_has_rq(struct qedr_qp *qp)
+{
+	if (qp->qp_type == IB_QPT_GSI || qp->qp_type == IB_QPT_XRC_INI ||
+	    qp->qp_type == IB_QPT_XRC_TGT || qedr_qp_has_srq(qp))
+		return 0;
+
+	return 1;
 }
 
 static inline struct qedr_user_mmap_entry *
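Note on the predicates added above: `qedr_qp_has_sq()` and `qedr_qp_has_rq()` decide, per QP type, which queues the driver sets up. The sketch below restates that decision table in standalone C so it can be compiled and checked outside the kernel. The enum `qp_type`, the struct `qp` and the function names are hypothetical stand-ins, and the `has_srq` flag replaces the driver's `qp->srq` pointer test.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the IB_QPT_* values used by the driver. */
enum qp_type { QPT_GSI, QPT_RC, QPT_XRC_INI, QPT_XRC_TGT };

/* Hypothetical reduced QP: only the fields the predicates look at. */
struct qp {
	enum qp_type type;
	bool has_srq;	/* stands in for a non-NULL qp->srq */
};

/* GSI and XRC target QPs have no driver-managed send queue. */
static bool qp_has_sq(const struct qp *qp)
{
	return qp->type != QPT_GSI && qp->type != QPT_XRC_TGT;
}

/* Only a non-XRC, non-GSI QP without an SRQ owns a receive queue. */
static bool qp_has_rq(const struct qp *qp)
{
	if (qp->type == QPT_GSI || qp->type == QPT_XRC_INI ||
	    qp->type == QPT_XRC_TGT || qp->has_srq)
		return false;

	return true;
}
```

Under these assumptions an RC QP without an SRQ has both queues, an XRC initiator has only a send queue, and an XRC target has neither, which matches how the later hunks in verbs.c guard SQ and RQ setup.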
+4 -2
drivers/infiniband/hw/qedr/qedr_iw_cm.c
···
 	struct qedr_dev *dev = ep->dev;
 	struct qedr_qp *qp;
 	struct qed_iwarp_accept_in params;
-	int rc = 0;
+	int rc;
 
 	DP_DEBUG(dev, QEDR_MSG_IWARP, "Accept on qpid=%d\n", conn_param->qpn);
 
···
 	params.ord = conn_param->ord;
 
 	if (test_and_set_bit(QEDR_IWARP_CM_WAIT_FOR_CONNECT,
-			     &qp->iwarp_cm_flags))
+			     &qp->iwarp_cm_flags)) {
+		rc = -EINVAL;
 		goto err; /* QP already destroyed */
+	}
 
 	rc = dev->ops->iwarp_accept(dev->rdma_ctx, &params);
 	if (rc) {
+269 -171
drivers/infiniband/hw/qedr/verbs.c
··· 136 136 IB_DEVICE_RC_RNR_NAK_GEN | 137 137 IB_DEVICE_LOCAL_DMA_LKEY | IB_DEVICE_MEM_MGT_EXTENSIONS; 138 138 139 + if (!rdma_protocol_iwarp(&dev->ibdev, 1)) 140 + attr->device_cap_flags |= IB_DEVICE_XRC; 139 141 attr->max_send_sge = qattr->max_sge; 140 142 attr->max_recv_sge = qattr->max_sge; 141 143 attr->max_sge_rd = qattr->max_sge; ··· 159 157 160 158 attr->local_ca_ack_delay = qattr->dev_ack_delay; 161 159 attr->max_fast_reg_page_list_len = qattr->max_mr / 8; 162 - attr->max_pkeys = QEDR_ROCE_PKEY_MAX; 160 + attr->max_pkeys = qattr->max_pkey; 163 161 attr->max_ah = qattr->max_ah; 164 162 165 163 return 0; 166 164 } 167 165 168 - static inline void get_link_speed_and_width(int speed, u8 *ib_speed, 166 + static inline void get_link_speed_and_width(int speed, u16 *ib_speed, 169 167 u8 *ib_width) 170 168 { 171 169 switch (speed) { ··· 233 231 attr->phys_state = IB_PORT_PHYS_STATE_DISABLED; 234 232 } 235 233 attr->max_mtu = IB_MTU_4096; 236 - attr->active_mtu = iboe_get_mtu(dev->ndev->mtu); 237 234 attr->lid = 0; 238 235 attr->lmc = 0; 239 236 attr->sm_lid = 0; 240 237 attr->sm_sl = 0; 241 238 attr->ip_gids = true; 242 239 if (rdma_protocol_iwarp(&dev->ibdev, 1)) { 240 + attr->active_mtu = iboe_get_mtu(dev->iwarp_max_mtu); 243 241 attr->gid_tbl_len = 1; 244 242 } else { 243 + attr->active_mtu = iboe_get_mtu(dev->ndev->mtu); 245 244 attr->gid_tbl_len = QEDR_MAX_SGID; 246 245 attr->pkey_tbl_len = QEDR_ROCE_PKEY_TABLE_LEN; 247 246 } ··· 474 471 return 0; 475 472 } 476 473 477 - void qedr_dealloc_pd(struct ib_pd *ibpd, struct ib_udata *udata) 474 + int qedr_dealloc_pd(struct ib_pd *ibpd, struct ib_udata *udata) 478 475 { 479 476 struct qedr_dev *dev = get_qedr_dev(ibpd->device); 480 477 struct qedr_pd *pd = get_qedr_pd(ibpd); 481 478 482 479 DP_DEBUG(dev, QEDR_MSG_INIT, "Deallocating PD %d\n", pd->pd_id); 483 480 dev->ops->rdma_dealloc_pd(dev->rdma_ctx, pd->pd_id); 481 + return 0; 484 482 } 485 483 484 + 485 + int qedr_alloc_xrcd(struct ib_xrcd *ibxrcd, struct 
ib_udata *udata) 486 + { 487 + struct qedr_dev *dev = get_qedr_dev(ibxrcd->device); 488 + struct qedr_xrcd *xrcd = get_qedr_xrcd(ibxrcd); 489 + 490 + return dev->ops->rdma_alloc_xrcd(dev->rdma_ctx, &xrcd->xrcd_id); 491 + } 492 + 493 + int qedr_dealloc_xrcd(struct ib_xrcd *ibxrcd, struct ib_udata *udata) 494 + { 495 + struct qedr_dev *dev = get_qedr_dev(ibxrcd->device); 496 + u16 xrcd_id = get_qedr_xrcd(ibxrcd)->xrcd_id; 497 + 498 + dev->ops->rdma_dealloc_xrcd(dev->rdma_ctx, xrcd_id); 499 + return 0; 500 + } 486 501 static void qedr_free_pbl(struct qedr_dev *dev, 487 502 struct qedr_pbl_info *pbl_info, struct qedr_pbl *pbl) 488 503 { ··· 621 600 struct qedr_pbl_info *pbl_info, u32 pg_shift) 622 601 { 623 602 int pbe_cnt, total_num_pbes = 0; 624 - u32 fw_pg_cnt, fw_pg_per_umem_pg; 625 603 struct qedr_pbl *pbl_tbl; 626 - struct sg_dma_page_iter sg_iter; 604 + struct ib_block_iter biter; 627 605 struct regpair *pbe; 628 - u64 pg_addr; 629 606 630 607 if (!pbl_info->num_pbes) 631 608 return; ··· 644 625 645 626 pbe_cnt = 0; 646 627 647 - fw_pg_per_umem_pg = BIT(PAGE_SHIFT - pg_shift); 628 + rdma_umem_for_each_dma_block (umem, &biter, BIT(pg_shift)) { 629 + u64 pg_addr = rdma_block_iter_dma_address(&biter); 648 630 649 - for_each_sg_dma_page (umem->sg_head.sgl, &sg_iter, umem->nmap, 0) { 650 - pg_addr = sg_page_iter_dma_address(&sg_iter); 651 - for (fw_pg_cnt = 0; fw_pg_cnt < fw_pg_per_umem_pg;) { 652 - pbe->lo = cpu_to_le32(pg_addr); 653 - pbe->hi = cpu_to_le32(upper_32_bits(pg_addr)); 631 + pbe->lo = cpu_to_le32(pg_addr); 632 + pbe->hi = cpu_to_le32(upper_32_bits(pg_addr)); 654 633 655 - pg_addr += BIT(pg_shift); 656 - pbe_cnt++; 657 - total_num_pbes++; 658 - pbe++; 634 + pbe_cnt++; 635 + total_num_pbes++; 636 + pbe++; 659 637 660 - if (total_num_pbes == pbl_info->num_pbes) 661 - return; 638 + if (total_num_pbes == pbl_info->num_pbes) 639 + return; 662 640 663 - /* If the given pbl is full storing the pbes, 664 - * move to next pbl. 
665 - */ 666 - if (pbe_cnt == (pbl_info->pbl_size / sizeof(u64))) { 667 - pbl_tbl++; 668 - pbe = (struct regpair *)pbl_tbl->va; 669 - pbe_cnt = 0; 670 - } 671 - 672 - fw_pg_cnt++; 641 + /* If the given pbl is full storing the pbes, move to next pbl. 642 + */ 643 + if (pbe_cnt == (pbl_info->pbl_size / sizeof(u64))) { 644 + pbl_tbl++; 645 + pbe = (struct regpair *)pbl_tbl->va; 646 + pbe_cnt = 0; 673 647 } 674 648 } 675 649 } ··· 804 792 return PTR_ERR(q->umem); 805 793 } 806 794 807 - fw_pages = ib_umem_page_count(q->umem) << 808 - (PAGE_SHIFT - FW_PAGE_SHIFT); 809 - 795 + fw_pages = ib_umem_num_dma_blocks(q->umem, 1 << FW_PAGE_SHIFT); 810 796 rc = qedr_prepare_pbl_tbl(dev, &q->pbl_info, fw_pages, 0); 811 797 if (rc) 812 798 goto err0; ··· 1009 999 /* Generate doorbell address. */ 1010 1000 cq->db.data.icid = cq->icid; 1011 1001 cq->db_addr = dev->db_addr + db_offset; 1012 - cq->db.data.params = DB_AGG_CMD_SET << 1002 + cq->db.data.params = DB_AGG_CMD_MAX << 1013 1003 RDMA_PWM_VAL32_DATA_AGG_CMD_SHIFT; 1014 1004 1015 1005 /* point to the very last element, passing it we will toggle */ ··· 1061 1051 #define QEDR_DESTROY_CQ_MAX_ITERATIONS (10) 1062 1052 #define QEDR_DESTROY_CQ_ITER_DURATION (10) 1063 1053 1064 - void qedr_destroy_cq(struct ib_cq *ibcq, struct ib_udata *udata) 1054 + int qedr_destroy_cq(struct ib_cq *ibcq, struct ib_udata *udata) 1065 1055 { 1066 1056 struct qedr_dev *dev = get_qedr_dev(ibcq->device); 1067 1057 struct qed_rdma_destroy_cq_out_params oparams; ··· 1076 1066 /* GSIs CQs are handled by driver, so they don't exist in the FW */ 1077 1067 if (cq->cq_type == QEDR_CQ_TYPE_GSI) { 1078 1068 qedr_db_recovery_del(dev, cq->db_addr, &cq->db.data); 1079 - return; 1069 + return 0; 1080 1070 } 1081 1071 1082 1072 iparams.icid = cq->icid; ··· 1124 1114 * Since the destroy CQ ramrod has also been received on the EQ we can 1125 1115 * be certain that there's no event handler in process. 
1126 1116 */ 1117 + return 0; 1127 1118 } 1128 1119 1129 1120 static inline int get_gid_info_from_table(struct ib_qp *ibqp, ··· 1157 1146 SET_FIELD(qp_params->modify_flags, 1158 1147 QED_ROCE_MODIFY_QP_VALID_ROCE_MODE, 1); 1159 1148 break; 1160 - case RDMA_NETWORK_IB: 1149 + case RDMA_NETWORK_ROCE_V1: 1161 1150 memcpy(&qp_params->sgid.bytes[0], &gid_attr->gid.raw[0], 1162 1151 sizeof(qp_params->sgid)); 1163 1152 memcpy(&qp_params->dgid.bytes[0], ··· 1177 1166 QED_ROCE_MODIFY_QP_VALID_ROCE_MODE, 1); 1178 1167 qp_params->roce_mode = ROCE_V2_IPV4; 1179 1168 break; 1169 + default: 1170 + return -EINVAL; 1180 1171 } 1181 1172 1182 1173 for (i = 0; i < 4; i++) { ··· 1199 1186 struct qedr_device_attr *qattr = &dev->attr; 1200 1187 1201 1188 /* QP0... attrs->qp_type == IB_QPT_GSI */ 1202 - if (attrs->qp_type != IB_QPT_RC && attrs->qp_type != IB_QPT_GSI) { 1189 + if (attrs->qp_type != IB_QPT_RC && 1190 + attrs->qp_type != IB_QPT_GSI && 1191 + attrs->qp_type != IB_QPT_XRC_INI && 1192 + attrs->qp_type != IB_QPT_XRC_TGT) { 1203 1193 DP_DEBUG(dev, QEDR_MSG_QP, 1204 1194 "create qp: unsupported qp type=0x%x requested\n", 1205 1195 attrs->qp_type); ··· 1237 1221 return -EINVAL; 1238 1222 } 1239 1223 1240 - /* Unprivileged user space cannot create special QP */ 1241 - if (udata && attrs->qp_type == IB_QPT_GSI) { 1242 - DP_ERR(dev, 1243 - "create qp: userspace can't create special QPs of type=0x%x\n", 1244 - attrs->qp_type); 1245 - return -EINVAL; 1224 + /* verify consumer QPs are not trying to use GSI QP's CQ. 
1225 + * TGT QP isn't associated with RQ/SQ 1226 + */ 1227 + if ((attrs->qp_type != IB_QPT_GSI) && (dev->gsi_qp_created) && 1228 + (attrs->qp_type != IB_QPT_XRC_TGT)) { 1229 + struct qedr_cq *send_cq = get_qedr_cq(attrs->send_cq); 1230 + struct qedr_cq *recv_cq = get_qedr_cq(attrs->recv_cq); 1231 + 1232 + if ((send_cq->cq_type == QEDR_CQ_TYPE_GSI) || 1233 + (recv_cq->cq_type == QEDR_CQ_TYPE_GSI)) { 1234 + DP_ERR(dev, 1235 + "create qp: consumer QP cannot use GSI CQs.\n"); 1236 + return -EINVAL; 1237 + } 1246 1238 } 1247 1239 1248 1240 return 0; ··· 1272 1248 } 1273 1249 1274 1250 static void qedr_copy_rq_uresp(struct qedr_dev *dev, 1275 - struct qedr_create_qp_uresp *uresp, 1276 - struct qedr_qp *qp) 1251 + struct qedr_create_qp_uresp *uresp, 1252 + struct qedr_qp *qp) 1277 1253 { 1278 1254 /* iWARP requires two doorbells per RQ. */ 1279 1255 if (rdma_protocol_iwarp(&dev->ibdev, 1)) { ··· 1315 1291 int rc; 1316 1292 1317 1293 memset(uresp, 0, sizeof(*uresp)); 1318 - qedr_copy_sq_uresp(dev, uresp, qp); 1319 - qedr_copy_rq_uresp(dev, uresp, qp); 1294 + 1295 + if (qedr_qp_has_sq(qp)) 1296 + qedr_copy_sq_uresp(dev, uresp, qp); 1297 + 1298 + if (qedr_qp_has_rq(qp)) 1299 + qedr_copy_rq_uresp(dev, uresp, qp); 1320 1300 1321 1301 uresp->atomic_supported = dev->atomic_cap != IB_ATOMIC_NONE; 1322 1302 uresp->qp_id = qp->qp_id; ··· 1344 1316 kref_init(&qp->refcnt); 1345 1317 init_completion(&qp->iwarp_cm_comp); 1346 1318 } 1319 + 1347 1320 qp->pd = pd; 1348 1321 qp->qp_type = attrs->qp_type; 1349 1322 qp->max_inline_data = attrs->cap.max_inline_data; 1350 - qp->sq.max_sges = attrs->cap.max_send_sge; 1351 1323 qp->state = QED_ROCE_QP_STATE_RESET; 1352 1324 qp->signaled = (attrs->sq_sig_type == IB_SIGNAL_ALL_WR) ? 
true : false; 1353 - qp->sq_cq = get_qedr_cq(attrs->send_cq); 1354 1325 qp->dev = dev; 1326 + if (qedr_qp_has_sq(qp)) { 1327 + qp->sq.max_sges = attrs->cap.max_send_sge; 1328 + qp->sq_cq = get_qedr_cq(attrs->send_cq); 1329 + DP_DEBUG(dev, QEDR_MSG_QP, 1330 + "SQ params:\tsq_max_sges = %d, sq_cq_id = %d\n", 1331 + qp->sq.max_sges, qp->sq_cq->icid); 1332 + } 1355 1333 1356 - if (attrs->srq) { 1334 + if (attrs->srq) 1357 1335 qp->srq = get_qedr_srq(attrs->srq); 1358 - } else { 1336 + 1337 + if (qedr_qp_has_rq(qp)) { 1359 1338 qp->rq_cq = get_qedr_cq(attrs->recv_cq); 1360 1339 qp->rq.max_sges = attrs->cap.max_recv_sge; 1361 1340 DP_DEBUG(dev, QEDR_MSG_QP, ··· 1381 1346 1382 1347 static int qedr_set_roce_db_info(struct qedr_dev *dev, struct qedr_qp *qp) 1383 1348 { 1384 - int rc; 1349 + int rc = 0; 1385 1350 1386 - qp->sq.db = dev->db_addr + 1387 - DB_ADDR_SHIFT(DQ_PWM_OFFSET_XCM_RDMA_SQ_PROD); 1388 - qp->sq.db_data.data.icid = qp->icid + 1; 1389 - rc = qedr_db_recovery_add(dev, qp->sq.db, 1390 - &qp->sq.db_data, 1391 - DB_REC_WIDTH_32B, 1392 - DB_REC_KERNEL); 1393 - if (rc) 1394 - return rc; 1351 + if (qedr_qp_has_sq(qp)) { 1352 + qp->sq.db = dev->db_addr + 1353 + DB_ADDR_SHIFT(DQ_PWM_OFFSET_XCM_RDMA_SQ_PROD); 1354 + qp->sq.db_data.data.icid = qp->icid + 1; 1355 + rc = qedr_db_recovery_add(dev, qp->sq.db, &qp->sq.db_data, 1356 + DB_REC_WIDTH_32B, DB_REC_KERNEL); 1357 + if (rc) 1358 + return rc; 1359 + } 1395 1360 1396 - if (!qp->srq) { 1361 + if (qedr_qp_has_rq(qp)) { 1397 1362 qp->rq.db = dev->db_addr + 1398 1363 DB_ADDR_SHIFT(DQ_PWM_OFFSET_TCM_ROCE_RQ_PROD); 1399 1364 qp->rq.db_data.data.icid = qp->icid; 1400 - 1401 - rc = qedr_db_recovery_add(dev, qp->rq.db, 1402 - &qp->rq.db_data, 1403 - DB_REC_WIDTH_32B, 1404 - DB_REC_KERNEL); 1405 - if (rc) 1406 - qedr_db_recovery_del(dev, qp->sq.db, 1407 - &qp->sq.db_data); 1365 + rc = qedr_db_recovery_add(dev, qp->rq.db, &qp->rq.db_data, 1366 + DB_REC_WIDTH_32B, DB_REC_KERNEL); 1367 + if (rc && qedr_qp_has_sq(qp)) 1368 + 
qedr_db_recovery_del(dev, qp->sq.db, &qp->sq.db_data); 1408 1369 } 1409 1370 1410 1371 return rc; ··· 1423 1392 DP_ERR(dev, 1424 1393 "create srq: unsupported sge=0x%x requested (max_srq_sge=0x%x)\n", 1425 1394 attrs->attr.max_sge, qattr->max_sge); 1395 + } 1396 + 1397 + if (!udata && attrs->srq_type == IB_SRQT_XRC) { 1398 + DP_ERR(dev, "XRC SRQs are not supported in kernel-space\n"); 1426 1399 return -EINVAL; 1427 1400 } 1428 1401 ··· 1551 1516 return -EINVAL; 1552 1517 1553 1518 srq->dev = dev; 1519 + srq->is_xrc = (init_attr->srq_type == IB_SRQT_XRC); 1554 1520 hw_srq = &srq->hw_srq; 1555 1521 spin_lock_init(&srq->lock); 1556 1522 ··· 1593 1557 in_params.prod_pair_addr = phy_prod_pair_addr; 1594 1558 in_params.num_pages = page_cnt; 1595 1559 in_params.page_size = page_size; 1560 + if (srq->is_xrc) { 1561 + struct qedr_xrcd *xrcd = get_qedr_xrcd(init_attr->ext.xrc.xrcd); 1562 + struct qedr_cq *cq = get_qedr_cq(init_attr->ext.cq); 1563 + 1564 + in_params.is_xrc = 1; 1565 + in_params.xrcd_id = xrcd->xrcd_id; 1566 + in_params.cq_cid = cq->icid; 1567 + } 1596 1568 1597 1569 rc = dev->ops->rdma_create_srq(dev->rdma_ctx, &in_params, &out_params); 1598 1570 if (rc) ··· 1635 1591 return -EFAULT; 1636 1592 } 1637 1593 1638 - void qedr_destroy_srq(struct ib_srq *ibsrq, struct ib_udata *udata) 1594 + int qedr_destroy_srq(struct ib_srq *ibsrq, struct ib_udata *udata) 1639 1595 { 1640 1596 struct qed_rdma_destroy_srq_in_params in_params = {}; 1641 1597 struct qedr_dev *dev = get_qedr_dev(ibsrq->device); ··· 1643 1599 1644 1600 xa_erase_irq(&dev->srqs, srq->srq_id); 1645 1601 in_params.srq_id = srq->srq_id; 1602 + in_params.is_xrc = srq->is_xrc; 1646 1603 dev->ops->rdma_destroy_srq(dev->rdma_ctx, &in_params); 1647 1604 1648 1605 if (ibsrq->uobject) ··· 1654 1609 DP_DEBUG(dev, QEDR_MSG_SRQ, 1655 1610 "destroy srq: destroyed srq with srq_id=0x%0x\n", 1656 1611 srq->srq_id); 1612 + return 0; 1657 1613 } 1658 1614 1659 1615 int qedr_modify_srq(struct ib_srq *ibsrq, struct 
ib_srq_attr *attr, ··· 1695 1649 return 0; 1696 1650 } 1697 1651 1652 + static enum qed_rdma_qp_type qedr_ib_to_qed_qp_type(enum ib_qp_type ib_qp_type) 1653 + { 1654 + switch (ib_qp_type) { 1655 + case IB_QPT_RC: 1656 + return QED_RDMA_QP_TYPE_RC; 1657 + case IB_QPT_XRC_INI: 1658 + return QED_RDMA_QP_TYPE_XRC_INI; 1659 + case IB_QPT_XRC_TGT: 1660 + return QED_RDMA_QP_TYPE_XRC_TGT; 1661 + default: 1662 + return QED_RDMA_QP_TYPE_INVAL; 1663 + } 1664 + } 1665 + 1698 1666 static inline void 1699 1667 qedr_init_common_qp_in_params(struct qedr_dev *dev, 1700 1668 struct qedr_pd *pd, ··· 1723 1663 1724 1664 params->signal_all = (attrs->sq_sig_type == IB_SIGNAL_ALL_WR); 1725 1665 params->fmr_and_reserved_lkey = fmr_and_reserved_lkey; 1726 - params->pd = pd->pd_id; 1727 - params->dpi = pd->uctx ? pd->uctx->dpi : dev->dpi; 1728 - params->sq_cq_id = get_qedr_cq(attrs->send_cq)->icid; 1666 + params->qp_type = qedr_ib_to_qed_qp_type(attrs->qp_type); 1729 1667 params->stats_queue = 0; 1730 - params->srq_id = 0; 1731 - params->use_srq = false; 1732 1668 1733 - if (!qp->srq) { 1669 + if (pd) { 1670 + params->pd = pd->pd_id; 1671 + params->dpi = pd->uctx ? pd->uctx->dpi : dev->dpi; 1672 + } 1673 + 1674 + if (qedr_qp_has_sq(qp)) 1675 + params->sq_cq_id = get_qedr_cq(attrs->send_cq)->icid; 1676 + 1677 + if (qedr_qp_has_rq(qp)) 1734 1678 params->rq_cq_id = get_qedr_cq(attrs->recv_cq)->icid; 1735 1679 1736 - } else { 1680 + if (qedr_qp_has_srq(qp)) { 1737 1681 params->rq_cq_id = get_qedr_cq(attrs->recv_cq)->icid; 1738 1682 params->srq_id = qp->srq->srq_id; 1739 1683 params->use_srq = true; 1684 + } else { 1685 + params->srq_id = 0; 1686 + params->use_srq = false; 1740 1687 } 1741 1688 } 1742 1689 ··· 1757 1690 "rq_len=%zd" 1758 1691 "\n", 1759 1692 qp, 1760 - qp->usq.buf_addr, 1761 - qp->usq.buf_len, qp->urq.buf_addr, qp->urq.buf_len); 1693 + qedr_qp_has_sq(qp) ? qp->usq.buf_addr : 0x0, 1694 + qedr_qp_has_sq(qp) ? qp->usq.buf_len : 0, 1695 + qedr_qp_has_rq(qp) ? 
qp->urq.buf_addr : 0x0, 1696 + qedr_qp_has_sq(qp) ? qp->urq.buf_len : 0); 1762 1697 } 1763 1698 1764 1699 static inline void ··· 1786 1717 struct qedr_ucontext *ctx, 1787 1718 struct qedr_qp *qp) 1788 1719 { 1789 - ib_umem_release(qp->usq.umem); 1790 - qp->usq.umem = NULL; 1720 + if (qedr_qp_has_sq(qp)) { 1721 + ib_umem_release(qp->usq.umem); 1722 + qp->usq.umem = NULL; 1723 + } 1791 1724 1792 - ib_umem_release(qp->urq.umem); 1793 - qp->urq.umem = NULL; 1725 + if (qedr_qp_has_rq(qp)) { 1726 + ib_umem_release(qp->urq.umem); 1727 + qp->urq.umem = NULL; 1728 + } 1794 1729 1795 1730 if (rdma_protocol_roce(&dev->ibdev, 1)) { 1796 1731 qedr_free_pbl(dev, &qp->usq.pbl_info, qp->usq.pbl_tbl); ··· 1829 1756 { 1830 1757 struct qed_rdma_create_qp_in_params in_params; 1831 1758 struct qed_rdma_create_qp_out_params out_params; 1832 - struct qedr_pd *pd = get_qedr_pd(ibpd); 1833 - struct qedr_create_qp_uresp uresp; 1834 - struct qedr_ucontext *ctx = pd ? pd->uctx : NULL; 1835 - struct qedr_create_qp_ureq ureq; 1759 + struct qedr_create_qp_uresp uresp = {}; 1760 + struct qedr_create_qp_ureq ureq = {}; 1836 1761 int alloc_and_init = rdma_protocol_roce(&dev->ibdev, 1); 1837 - int rc = -EINVAL; 1762 + struct qedr_ucontext *ctx = NULL; 1763 + struct qedr_pd *pd = NULL; 1764 + int rc = 0; 1838 1765 1839 1766 qp->create_type = QEDR_QP_CREATE_USER; 1840 - memset(&ureq, 0, sizeof(ureq)); 1841 - rc = ib_copy_from_udata(&ureq, udata, min(sizeof(ureq), udata->inlen)); 1842 - if (rc) { 1843 - DP_ERR(dev, "Problem copying data from user space\n"); 1844 - return rc; 1767 + 1768 + if (ibpd) { 1769 + pd = get_qedr_pd(ibpd); 1770 + ctx = pd->uctx; 1845 1771 } 1846 1772 1847 - /* SQ - read access only (0) */ 1848 - rc = qedr_init_user_queue(udata, dev, &qp->usq, ureq.sq_addr, 1849 - ureq.sq_len, true, 0, alloc_and_init); 1850 - if (rc) 1851 - return rc; 1773 + if (udata) { 1774 + rc = ib_copy_from_udata(&ureq, udata, min(sizeof(ureq), 1775 + udata->inlen)); 1776 + if (rc) { 1777 + DP_ERR(dev, 
"Problem copying data from user space\n"); 1778 + return rc; 1779 + } 1780 + } 1852 1781 1853 - if (!qp->srq) { 1782 + if (qedr_qp_has_sq(qp)) { 1783 + /* SQ - read access only (0) */ 1784 + rc = qedr_init_user_queue(udata, dev, &qp->usq, ureq.sq_addr, 1785 + ureq.sq_len, true, 0, alloc_and_init); 1786 + if (rc) 1787 + return rc; 1788 + } 1789 + 1790 + if (qedr_qp_has_rq(qp)) { 1854 1791 /* RQ - read access only (0) */ 1855 1792 rc = qedr_init_user_queue(udata, dev, &qp->urq, ureq.rq_addr, 1856 1793 ureq.rq_len, true, 0, alloc_and_init); ··· 1872 1789 qedr_init_common_qp_in_params(dev, pd, qp, attrs, false, &in_params); 1873 1790 in_params.qp_handle_lo = ureq.qp_handle_lo; 1874 1791 in_params.qp_handle_hi = ureq.qp_handle_hi; 1875 - in_params.sq_num_pages = qp->usq.pbl_info.num_pbes; 1876 - in_params.sq_pbl_ptr = qp->usq.pbl_tbl->pa; 1877 - if (!qp->srq) { 1792 + 1793 + if (qp->qp_type == IB_QPT_XRC_TGT) { 1794 + struct qedr_xrcd *xrcd = get_qedr_xrcd(attrs->xrcd); 1795 + 1796 + in_params.xrcd_id = xrcd->xrcd_id; 1797 + in_params.qp_handle_lo = qp->qp_id; 1798 + in_params.use_srq = 1; 1799 + } 1800 + 1801 + if (qedr_qp_has_sq(qp)) { 1802 + in_params.sq_num_pages = qp->usq.pbl_info.num_pbes; 1803 + in_params.sq_pbl_ptr = qp->usq.pbl_tbl->pa; 1804 + } 1805 + 1806 + if (qedr_qp_has_rq(qp)) { 1878 1807 in_params.rq_num_pages = qp->urq.pbl_info.num_pbes; 1879 1808 in_params.rq_pbl_ptr = qp->urq.pbl_tbl->pa; 1880 1809 } ··· 1908 1813 qp->qp_id = out_params.qp_id; 1909 1814 qp->icid = out_params.icid; 1910 1815 1911 - rc = qedr_copy_qp_uresp(dev, qp, udata, &uresp); 1912 - if (rc) 1913 - goto err; 1914 - 1915 - /* db offset was calculated in copy_qp_uresp, now set in the user q */ 1916 - ctx = pd->uctx; 1917 - qp->usq.db_addr = ctx->dpi_addr + uresp.sq_db_offset; 1918 - qp->urq.db_addr = ctx->dpi_addr + uresp.rq_db_offset; 1919 - 1920 - if (rdma_protocol_iwarp(&dev->ibdev, 1)) { 1921 - qp->urq.db_rec_db2_addr = ctx->dpi_addr + uresp.rq_db2_offset; 1922 - 1923 - /* 
calculate the db_rec_db2 data since it is constant so no 1924 - * need to reflect from user 1925 - */ 1926 - qp->urq.db_rec_db2_data.data.icid = cpu_to_le16(qp->icid); 1927 - qp->urq.db_rec_db2_data.data.value = 1928 - cpu_to_le16(DQ_TCM_IWARP_POST_RQ_CF_CMD); 1816 + if (udata) { 1817 + rc = qedr_copy_qp_uresp(dev, qp, udata, &uresp); 1818 + if (rc) 1819 + goto err; 1929 1820 } 1930 1821 1931 - rc = qedr_db_recovery_add(dev, qp->usq.db_addr, 1932 - &qp->usq.db_rec_data->db_data, 1933 - DB_REC_WIDTH_32B, 1934 - DB_REC_USER); 1935 - if (rc) 1936 - goto err; 1822 + /* db offset was calculated in copy_qp_uresp, now set in the user q */ 1823 + if (qedr_qp_has_sq(qp)) { 1824 + qp->usq.db_addr = ctx->dpi_addr + uresp.sq_db_offset; 1825 + rc = qedr_db_recovery_add(dev, qp->usq.db_addr, 1826 + &qp->usq.db_rec_data->db_data, 1827 + DB_REC_WIDTH_32B, 1828 + DB_REC_USER); 1829 + if (rc) 1830 + goto err; 1831 + } 1937 1832 1938 - rc = qedr_db_recovery_add(dev, qp->urq.db_addr, 1939 - &qp->urq.db_rec_data->db_data, 1940 - DB_REC_WIDTH_32B, 1941 - DB_REC_USER); 1942 - if (rc) 1943 - goto err; 1833 + if (qedr_qp_has_rq(qp)) { 1834 + qp->urq.db_addr = ctx->dpi_addr + uresp.rq_db_offset; 1835 + rc = qedr_db_recovery_add(dev, qp->urq.db_addr, 1836 + &qp->urq.db_rec_data->db_data, 1837 + DB_REC_WIDTH_32B, 1838 + DB_REC_USER); 1839 + if (rc) 1840 + goto err; 1841 + } 1944 1842 1945 1843 if (rdma_protocol_iwarp(&dev->ibdev, 1)) { 1946 1844 rc = qedr_db_recovery_add(dev, qp->urq.db_rec_db2_addr, ··· 1944 1856 goto err; 1945 1857 } 1946 1858 qedr_qp_user_print(dev, qp); 1947 - 1948 1859 return rc; 1949 1860 err: 1950 1861 rc = dev->ops->rdma_destroy_qp(dev->rdma_ctx, qp->qed_qp); ··· 2199 2112 return rc; 2200 2113 } 2201 2114 2115 + static int qedr_free_qp_resources(struct qedr_dev *dev, struct qedr_qp *qp, 2116 + struct ib_udata *udata) 2117 + { 2118 + struct qedr_ucontext *ctx = 2119 + rdma_udata_to_drv_context(udata, struct qedr_ucontext, 2120 + ibucontext); 2121 + int rc; 2122 + 2123 
+ if (qp->qp_type != IB_QPT_GSI) { 2124 + rc = dev->ops->rdma_destroy_qp(dev->rdma_ctx, qp->qed_qp); 2125 + if (rc) 2126 + return rc; 2127 + } 2128 + 2129 + if (qp->create_type == QEDR_QP_CREATE_USER) 2130 + qedr_cleanup_user(dev, ctx, qp); 2131 + else 2132 + qedr_cleanup_kernel(dev, qp); 2133 + 2134 + return 0; 2135 + } 2136 + 2202 2137 struct ib_qp *qedr_create_qp(struct ib_pd *ibpd, 2203 2138 struct ib_qp_init_attr *attrs, 2204 2139 struct ib_udata *udata) 2205 2140 { 2206 - struct qedr_dev *dev = get_qedr_dev(ibpd->device); 2207 - struct qedr_pd *pd = get_qedr_pd(ibpd); 2141 + struct qedr_xrcd *xrcd = NULL; 2142 + struct qedr_pd *pd = NULL; 2143 + struct qedr_dev *dev; 2208 2144 struct qedr_qp *qp; 2209 2145 struct ib_qp *ibqp; 2210 2146 int rc = 0; 2147 + 2148 + if (attrs->qp_type == IB_QPT_XRC_TGT) { 2149 + xrcd = get_qedr_xrcd(attrs->xrcd); 2150 + dev = get_qedr_dev(xrcd->ibxrcd.device); 2151 + } else { 2152 + pd = get_qedr_pd(ibpd); 2153 + dev = get_qedr_dev(ibpd->device); 2154 + } 2211 2155 2212 2156 DP_DEBUG(dev, QEDR_MSG_QP, "create qp: called from %s, pd=%p\n", 2213 2157 udata ? 
"user library" : "kernel", pd); ··· 2270 2152 return ibqp; 2271 2153 } 2272 2154 2273 - if (udata) 2155 + if (udata || xrcd) 2274 2156 rc = qedr_create_user_qp(dev, qp, ibpd, udata, attrs); 2275 2157 else 2276 2158 rc = qedr_create_kernel_qp(dev, qp, ibpd, attrs); 2277 2159 2278 2160 if (rc) 2279 - goto err; 2161 + goto out_free_qp; 2280 2162 2281 2163 qp->ibqp.qp_num = qp->qp_id; 2282 2164 2283 2165 if (rdma_protocol_iwarp(&dev->ibdev, 1)) { 2284 2166 rc = xa_insert(&dev->qps, qp->qp_id, qp, GFP_KERNEL); 2285 2167 if (rc) 2286 - goto err; 2168 + goto out_free_qp_resources; 2287 2169 } 2288 2170 2289 2171 return &qp->ibqp; 2290 2172 2291 - err: 2173 + out_free_qp_resources: 2174 + qedr_free_qp_resources(dev, qp, udata); 2175 + out_free_qp: 2292 2176 kfree(qp); 2293 2177 2294 2178 return ERR_PTR(-EFAULT); ··· 2756 2636 qp_attr->cap.max_recv_wr = qp->rq.max_wr; 2757 2637 qp_attr->cap.max_send_sge = qp->sq.max_sges; 2758 2638 qp_attr->cap.max_recv_sge = qp->rq.max_sges; 2759 - qp_attr->cap.max_inline_data = ROCE_REQ_MAX_INLINE_DATA_SIZE; 2639 + qp_attr->cap.max_inline_data = dev->attr.max_inline; 2760 2640 qp_init_attr->cap = qp_attr->cap; 2761 2641 2762 2642 qp_attr->ah_attr.type = RDMA_AH_ATTR_TYPE_ROCE; ··· 2789 2669 2790 2670 err: 2791 2671 return rc; 2792 - } 2793 - 2794 - static int qedr_free_qp_resources(struct qedr_dev *dev, struct qedr_qp *qp, 2795 - struct ib_udata *udata) 2796 - { 2797 - struct qedr_ucontext *ctx = 2798 - rdma_udata_to_drv_context(udata, struct qedr_ucontext, 2799 - ibucontext); 2800 - int rc; 2801 - 2802 - if (qp->qp_type != IB_QPT_GSI) { 2803 - rc = dev->ops->rdma_destroy_qp(dev->rdma_ctx, qp->qed_qp); 2804 - if (rc) 2805 - return rc; 2806 - } 2807 - 2808 - if (qp->create_type == QEDR_QP_CREATE_USER) 2809 - qedr_cleanup_user(dev, ctx, qp); 2810 - else 2811 - qedr_cleanup_kernel(dev, qp); 2812 - 2813 - return 0; 2814 2672 } 2815 2673 2816 2674 int qedr_destroy_qp(struct ib_qp *ibqp, struct ib_udata *udata) ··· 2850 2752 2851 2753 if 
(rdma_protocol_iwarp(&dev->ibdev, 1)) 2852 2754 qedr_iw_qp_rem_ref(&qp->ibqp); 2755 + else 2756 + kfree(qp); 2853 2757 2854 2758 return 0; 2855 2759 } ··· 2866 2766 return 0; 2867 2767 } 2868 2768 2869 - void qedr_destroy_ah(struct ib_ah *ibah, u32 flags) 2769 + int qedr_destroy_ah(struct ib_ah *ibah, u32 flags) 2870 2770 { 2871 2771 struct qedr_ah *ah = get_qedr_ah(ibah); 2872 2772 2873 2773 rdma_destroy_ah_attr(&ah->attr); 2774 + return 0; 2874 2775 } 2875 2776 2876 2777 static void free_mr_info(struct qedr_dev *dev, struct mr_info *info) ··· 2962 2861 goto err0; 2963 2862 } 2964 2863 2965 - rc = init_mr_info(dev, &mr->info, ib_umem_page_count(mr->umem), 1); 2864 + rc = init_mr_info(dev, &mr->info, 2865 + ib_umem_num_dma_blocks(mr->umem, PAGE_SIZE), 1); 2966 2866 if (rc) 2967 2867 goto err1; 2968 2868 ··· 2990 2888 mr->hw_mr.pbl_two_level = mr->info.pbl_info.two_layered; 2991 2889 mr->hw_mr.pbl_page_size_log = ilog2(mr->info.pbl_info.pbl_size); 2992 2890 mr->hw_mr.page_size_log = PAGE_SHIFT; 2993 - mr->hw_mr.fbo = ib_umem_offset(mr->umem); 2994 2891 mr->hw_mr.length = len; 2995 2892 mr->hw_mr.vaddr = usr_addr; 2996 - mr->hw_mr.zbva = false; 2997 2893 mr->hw_mr.phy_mr = false; 2998 2894 mr->hw_mr.dma_mr = false; 2999 2895 ··· 3084 2984 mr->hw_mr.pbl_ptr = 0; 3085 2985 mr->hw_mr.pbl_two_level = mr->info.pbl_info.two_layered; 3086 2986 mr->hw_mr.pbl_page_size_log = ilog2(mr->info.pbl_info.pbl_size); 3087 - mr->hw_mr.fbo = 0; 3088 2987 mr->hw_mr.length = 0; 3089 2988 mr->hw_mr.vaddr = 0; 3090 - mr->hw_mr.zbva = false; 3091 2989 mr->hw_mr.phy_mr = true; 3092 2990 mr->hw_mr.dma_mr = false; 3093 2991 ··· 3863 3765 * in first 4 bytes and need to update WQE producer in 3864 3766 * next 4 bytes. 
3865 3767 */ 3866 - srq->hw_srq.virt_prod_pair_addr->sge_prod = hw_srq->sge_prod; 3768 + srq->hw_srq.virt_prod_pair_addr->sge_prod = cpu_to_le32(hw_srq->sge_prod); 3867 3769 /* Make sure sge producer is updated first */ 3868 3770 dma_wmb(); 3869 - srq->hw_srq.virt_prod_pair_addr->wqe_prod = hw_srq->wqe_prod; 3771 + srq->hw_srq.virt_prod_pair_addr->wqe_prod = cpu_to_le32(hw_srq->wqe_prod); 3870 3772 3871 3773 wr = wr->next; 3872 3774 }
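The `qedr_create_qp()` hunk above splits the single `err:` label into `out_free_qp_resources:` and `out_free_qp:`, so a failure after the hardware QP exists also releases its resources. A minimal userspace sketch of that layered unwind pattern follows; every `demo_*` name is hypothetical and only stands in for the driver's real helpers.

```c
#include <stdlib.h>

/* Hypothetical object: hw_resources is 1 while "hardware" state is held. */
struct demo_qp {
	int hw_resources;
};

/* Counts hardware resources that were acquired and not yet released. */
static int demo_live_hw_resources;

static int demo_create_hw(struct demo_qp *qp, int fail)
{
	if (fail)
		return -1;
	qp->hw_resources = 1;
	demo_live_hw_resources++;
	return 0;
}

static void demo_free_hw(struct demo_qp *qp)
{
	qp->hw_resources = 0;
	demo_live_hw_resources--;
}

/* Returns NULL on failure; fail_step selects the failing step (0 = none). */
static struct demo_qp *demo_create_qp(int fail_step)
{
	struct demo_qp *qp = calloc(1, sizeof(*qp));

	if (!qp)
		return NULL;

	if (demo_create_hw(qp, fail_step == 1))
		goto out_free_qp;		/* nothing else to undo yet */

	if (fail_step == 2)			/* e.g. a failed table insert */
		goto out_free_qp_resources;

	return qp;

out_free_qp_resources:
	demo_free_hw(qp);			/* undo the later step first */
out_free_qp:
	free(qp);
	return NULL;
}
```

Each label undoes exactly one step and falls through to the labels for earlier steps, so a later failure never leaks what an earlier step acquired.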
+6 -5
drivers/infiniband/hw/qedr/verbs.h
··· 47 47 int qedr_mmap(struct ib_ucontext *ucontext, struct vm_area_struct *vma); 48 48 void qedr_mmap_free(struct rdma_user_mmap_entry *rdma_entry); 49 49 int qedr_alloc_pd(struct ib_pd *pd, struct ib_udata *udata); 50 - void qedr_dealloc_pd(struct ib_pd *pd, struct ib_udata *udata); 51 - 50 + int qedr_dealloc_pd(struct ib_pd *pd, struct ib_udata *udata); 51 + int qedr_alloc_xrcd(struct ib_xrcd *ibxrcd, struct ib_udata *udata); 52 + int qedr_dealloc_xrcd(struct ib_xrcd *ibxrcd, struct ib_udata *udata); 52 53 int qedr_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr, 53 54 struct ib_udata *udata); 54 55 int qedr_resize_cq(struct ib_cq *, int cqe, struct ib_udata *); 55 - void qedr_destroy_cq(struct ib_cq *ibcq, struct ib_udata *udata); 56 + int qedr_destroy_cq(struct ib_cq *ibcq, struct ib_udata *udata); 56 57 int qedr_arm_cq(struct ib_cq *ibcq, enum ib_cq_notify_flags flags); 57 58 struct ib_qp *qedr_create_qp(struct ib_pd *, struct ib_qp_init_attr *attrs, 58 59 struct ib_udata *); ··· 68 67 int qedr_modify_srq(struct ib_srq *ibsrq, struct ib_srq_attr *attr, 69 68 enum ib_srq_attr_mask attr_mask, struct ib_udata *udata); 70 69 int qedr_query_srq(struct ib_srq *ibsrq, struct ib_srq_attr *attr); 71 - void qedr_destroy_srq(struct ib_srq *ibsrq, struct ib_udata *udata); 70 + int qedr_destroy_srq(struct ib_srq *ibsrq, struct ib_udata *udata); 72 71 int qedr_post_srq_recv(struct ib_srq *ibsrq, const struct ib_recv_wr *wr, 73 72 const struct ib_recv_wr **bad_recv_wr); 74 73 int qedr_create_ah(struct ib_ah *ibah, struct rdma_ah_init_attr *init_attr, 75 74 struct ib_udata *udata); 76 - void qedr_destroy_ah(struct ib_ah *ibah, u32 flags); 75 + int qedr_destroy_ah(struct ib_ah *ibah, u32 flags); 77 76 78 77 int qedr_dereg_mr(struct ib_mr *ib_mr, struct ib_udata *udata); 79 78 struct ib_mr *qedr_get_dma_mr(struct ib_pd *, int acc);
+3 -3
drivers/infiniband/hw/qib/qib.h
···
 	/* LID mask control */
 	u8 lmc;
 	u8 link_width_supported;
-	u8 link_speed_supported;
+	u16 link_speed_supported;
 	u8 link_width_enabled;
-	u8 link_speed_enabled;
+	u16 link_speed_enabled;
 	u8 link_width_active;
-	u8 link_speed_active;
+	u16 link_speed_active;
 	u8 vls_supported;
 	u8 vls_operational;
 	/* Rx Polarity inversion (compensate for ~tx on partner) */
+3 -4
drivers/infiniband/hw/qib/qib_iba7322.c
···
 	return;
 }
 
-static void qib_error_tasklet(unsigned long data)
+static void qib_error_tasklet(struct tasklet_struct *t)
 {
-	struct qib_devdata *dd = (struct qib_devdata *)data;
+	struct qib_devdata *dd = from_tasklet(dd, t, error_tasklet);
 
 	handle_7322_errors(dd);
 	qib_write_kreg(dd, kr_errmask, dd->cspec->errormask);
···
 	for (i = 0; i < ARRAY_SIZE(redirect); i++)
 		qib_write_kreg(dd, kr_intredirect + i, redirect[i]);
 	dd->cspec->main_int_mask = mask;
-	tasklet_init(&dd->error_tasklet, qib_error_tasklet,
-		     (unsigned long)dd);
+	tasklet_setup(&dd->error_tasklet, qib_error_tasklet);
 }
 
 /**
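The change above drops the `unsigned long` cookie: the handler now receives a pointer to the tasklet and recovers its enclosing structure from it. A userspace sketch of that idea follows. The `fake_*` names are hypothetical stand-ins, not the kernel's `tasklet_setup()` or `from_tasklet()`, and the pointer arithmetic is only an approximation of what a `container_of()`-style macro does.

```c
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct tasklet_struct. */
struct fake_tasklet {
	void (*callback)(struct fake_tasklet *t);
};

/* Recover the enclosing structure from a pointer to one of its members. */
#define fake_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical device structure that embeds its tasklet. */
struct fake_devdata {
	int errors_handled;
	struct fake_tasklet error_tasklet;
};

/* Handler takes the tasklet pointer instead of an opaque unsigned long. */
static void fake_error_tasklet(struct fake_tasklet *t)
{
	struct fake_devdata *dd =
		fake_container_of(t, struct fake_devdata, error_tasklet);

	dd->errors_handled++;
}

/* Registration no longer needs a separate data argument. */
static void fake_tasklet_setup(struct fake_tasklet *t,
			       void (*callback)(struct fake_tasklet *))
{
	t->callback = callback;
}

/* Sets up the tasklet, runs it once, and returns the updated counter. */
static int fake_run_once(void)
{
	struct fake_devdata dd = { 0 };

	fake_tasklet_setup(&dd.error_tasklet, fake_error_tasklet);
	dd.error_tasklet.callback(&dd.error_tasklet);
	return dd.errors_handled;
}
```

Because the context is derived from the tasklet's position inside its owner, no cast from an integer to a pointer is needed, and the callback signature stays type-checked.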
+13 -39
drivers/infiniband/hw/qib/qib_mad.c
··· 2293 2293 struct ib_mad *out_mad) 2294 2294 { 2295 2295 struct ib_cc_mad *ccp = (struct ib_cc_mad *)out_mad; 2296 - int ret; 2297 - 2298 2296 *out_mad = *in_mad; 2299 2297 2300 2298 if (ccp->class_version != 2) { 2301 2299 ccp->status |= IB_SMP_UNSUP_VERSION; 2302 - ret = reply((struct ib_smp *)ccp); 2303 - goto bail; 2300 + return reply((struct ib_smp *)ccp); 2304 2301 } 2305 2302 2306 2303 switch (ccp->method) { 2307 2304 case IB_MGMT_METHOD_GET: 2308 2305 switch (ccp->attr_id) { 2309 2306 case IB_CC_ATTR_CLASSPORTINFO: 2310 - ret = cc_get_classportinfo(ccp, ibdev); 2311 - goto bail; 2312 - 2307 + return cc_get_classportinfo(ccp, ibdev); 2313 2308 case IB_CC_ATTR_CONGESTION_INFO: 2314 - ret = cc_get_congestion_info(ccp, ibdev, port); 2315 - goto bail; 2316 - 2309 + return cc_get_congestion_info(ccp, ibdev, port); 2317 2310 case IB_CC_ATTR_CA_CONGESTION_SETTING: 2318 - ret = cc_get_congestion_setting(ccp, ibdev, port); 2319 - goto bail; 2320 - 2311 + return cc_get_congestion_setting(ccp, ibdev, port); 2321 2312 case IB_CC_ATTR_CONGESTION_CONTROL_TABLE: 2322 - ret = cc_get_congestion_control_table(ccp, ibdev, port); 2323 - goto bail; 2324 - 2325 - fallthrough; 2313 + return cc_get_congestion_control_table(ccp, ibdev, port); 2326 2314 default: 2327 2315 ccp->status |= IB_SMP_UNSUP_METH_ATTR; 2328 - ret = reply((struct ib_smp *) ccp); 2329 - goto bail; 2316 + return reply((struct ib_smp *) ccp); 2330 2317 } 2331 - 2332 2318 case IB_MGMT_METHOD_SET: 2333 2319 switch (ccp->attr_id) { 2334 2320 case IB_CC_ATTR_CA_CONGESTION_SETTING: 2335 - ret = cc_set_congestion_setting(ccp, ibdev, port); 2336 - goto bail; 2337 - 2321 + return cc_set_congestion_setting(ccp, ibdev, port); 2338 2322 case IB_CC_ATTR_CONGESTION_CONTROL_TABLE: 2339 - ret = cc_set_congestion_control_table(ccp, ibdev, port); 2340 - goto bail; 2341 - 2342 - fallthrough; 2323 + return cc_set_congestion_control_table(ccp, ibdev, port); 2343 2324 default: 2344 2325 ccp->status |= IB_SMP_UNSUP_METH_ATTR; 2345 
- ret = reply((struct ib_smp *) ccp); 2346 - goto bail; 2326 + return reply((struct ib_smp *) ccp); 2347 2327 } 2348 - 2349 2328 case IB_MGMT_METHOD_GET_RESP: 2350 2329 /* 2351 2330 * The ib_mad module will call us to process responses 2352 2331 * before checking for other consumers. 2353 2332 * Just tell the caller to process it normally. 2354 2333 */ 2355 - ret = IB_MAD_RESULT_SUCCESS; 2356 - goto bail; 2357 - 2358 - case IB_MGMT_METHOD_TRAP: 2359 - default: 2360 - ccp->status |= IB_SMP_UNSUP_METHOD; 2361 - ret = reply((struct ib_smp *) ccp); 2334 + return IB_MAD_RESULT_SUCCESS; 2362 2335 } 2363 2336 2364 - bail: 2365 - return ret; 2337 + /* method is unsupported */ 2338 + ccp->status |= IB_SMP_UNSUP_METHOD; 2339 + return reply((struct ib_smp *) ccp); 2366 2340 } 2367 2341 2368 2342 /**
+5 -5
drivers/infiniband/hw/qib/qib_sdma.c
···
 static void sdma_put(struct qib_sdma_state *);
 static void sdma_set_state(struct qib_pportdata *, enum qib_sdma_states);
 static void sdma_start_sw_clean_up(struct qib_pportdata *);
-static void sdma_sw_clean_up_task(unsigned long);
+static void sdma_sw_clean_up_task(struct tasklet_struct *);
 static void unmap_desc(struct qib_pportdata *, unsigned);
 
 static void sdma_get(struct qib_sdma_state *ss)
···
 	}
 }
 
-static void sdma_sw_clean_up_task(unsigned long opaque)
+static void sdma_sw_clean_up_task(struct tasklet_struct *t)
 {
-	struct qib_pportdata *ppd = (struct qib_pportdata *) opaque;
+	struct qib_pportdata *ppd = from_tasklet(ppd, t,
+						 sdma_sw_clean_up_task);
 	unsigned long flags;
 
 	spin_lock_irqsave(&ppd->sdma_lock, flags);
···
 
 	INIT_LIST_HEAD(&ppd->sdma_activelist);
 
-	tasklet_init(&ppd->sdma_sw_clean_up_task, sdma_sw_clean_up_task,
-		(unsigned long)ppd);
+	tasklet_setup(&ppd->sdma_sw_clean_up_task, sdma_sw_clean_up_task);
 
 	ret = dd->f_init_sdma_regs(ppd);
 	if (ret)
+2 -3
drivers/infiniband/hw/usnic/usnic_ib_main.c
···
 	if (err)
 		return err;
 
-	immutable->pkey_tbl_len = attr.pkey_tbl_len;
 	immutable->gid_tbl_len = attr.gid_tbl_len;
 
 	return 0;
···
 	.modify_qp = usnic_ib_modify_qp,
 	.query_device = usnic_ib_query_device,
 	.query_gid = usnic_ib_query_gid,
-	.query_pkey = usnic_ib_query_pkey,
 	.query_port = usnic_ib_query_port,
 	.query_qp = usnic_ib_query_qp,
 	.reg_user_mr = usnic_ib_reg_mr,
···
 	if (ret)
 		goto err_fwd_dealloc;
 
-	if (ib_register_device(&us_ibdev->ib_dev, "usnic_%d"))
+	dma_set_max_seg_size(&dev->dev, SZ_2G);
+	if (ib_register_device(&us_ibdev->ib_dev, "usnic_%d", &dev->dev))
 		goto err_fwd_dealloc;
 
 	usnic_fwd_set_mtu(us_ibdev->ufdev, us_ibdev->netdev->mtu);
+4 -14
drivers/infiniband/hw/usnic/usnic_ib_verbs.c
··· 367 367 368 368 props->port_cap_flags = 0; 369 369 props->gid_tbl_len = 1; 370 - props->pkey_tbl_len = 1; 371 370 props->bad_pkey_cntr = 0; 372 371 props->qkey_viol_cntr = 0; 373 372 props->max_mtu = IB_MTU_4096; ··· 436 437 return 0; 437 438 } 438 439 439 - int usnic_ib_query_pkey(struct ib_device *ibdev, u8 port, u16 index, 440 - u16 *pkey) 441 - { 442 - if (index > 0) 443 - return -EINVAL; 444 - 445 - *pkey = 0xffff; 446 - return 0; 447 - } 448 - 449 440 int usnic_ib_alloc_pd(struct ib_pd *ibpd, struct ib_udata *udata) 450 441 { 451 442 struct usnic_ib_pd *pd = to_upd(ibpd); ··· 449 460 return 0; 450 461 } 451 462 452 - void usnic_ib_dealloc_pd(struct ib_pd *pd, struct ib_udata *udata) 463 + int usnic_ib_dealloc_pd(struct ib_pd *pd, struct ib_udata *udata) 453 464 { 454 465 usnic_uiom_dealloc_pd((to_upd(pd))->umem_pd); 466 + return 0; 455 467 } 456 468 457 469 struct ib_qp *usnic_ib_create_qp(struct ib_pd *pd, ··· 586 596 return 0; 587 597 } 588 598 589 - void usnic_ib_destroy_cq(struct ib_cq *cq, struct ib_udata *udata) 599 + int usnic_ib_destroy_cq(struct ib_cq *cq, struct ib_udata *udata) 590 600 { 591 - return; 601 + return 0; 592 602 } 593 603 594 604 struct ib_mr *usnic_ib_reg_mr(struct ib_pd *pd, u64 start, u64 length,
+2 -4
drivers/infiniband/hw/usnic/usnic_ib_verbs.h
··· 48 48 struct ib_qp_init_attr *qp_init_attr); 49 49 int usnic_ib_query_gid(struct ib_device *ibdev, u8 port, int index, 50 50 union ib_gid *gid); 51 - int usnic_ib_query_pkey(struct ib_device *ibdev, u8 port, u16 index, 52 - u16 *pkey); 53 51 int usnic_ib_alloc_pd(struct ib_pd *ibpd, struct ib_udata *udata); 54 - void usnic_ib_dealloc_pd(struct ib_pd *pd, struct ib_udata *udata); 52 + int usnic_ib_dealloc_pd(struct ib_pd *pd, struct ib_udata *udata); 55 53 struct ib_qp *usnic_ib_create_qp(struct ib_pd *pd, 56 54 struct ib_qp_init_attr *init_attr, 57 55 struct ib_udata *udata); ··· 58 60 int attr_mask, struct ib_udata *udata); 59 61 int usnic_ib_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr, 60 62 struct ib_udata *udata); 61 - void usnic_ib_destroy_cq(struct ib_cq *cq, struct ib_udata *udata); 63 + int usnic_ib_destroy_cq(struct ib_cq *cq, struct ib_udata *udata); 62 64 struct ib_mr *usnic_ib_reg_mr(struct ib_pd *pd, u64 start, u64 length, 63 65 u64 virt_addr, int access_flags, 64 66 struct ib_udata *udata);
+4 -3
drivers/infiniband/hw/vmw_pvrdma/pvrdma_cq.c
··· 142 142 goto err_cq; 143 143 } 144 144 145 - npages = ib_umem_page_count(cq->umem); 145 + npages = ib_umem_num_dma_blocks(cq->umem, PAGE_SIZE); 146 146 } else { 147 147 /* One extra page for shared ring state */ 148 148 npages = 1 + (entries * sizeof(struct pvrdma_cqe) + ··· 235 235 * @cq: the completion queue to destroy. 236 236 * @udata: user data or null for kernel object 237 237 */ 238 - void pvrdma_destroy_cq(struct ib_cq *cq, struct ib_udata *udata) 238 + int pvrdma_destroy_cq(struct ib_cq *cq, struct ib_udata *udata) 239 239 { 240 240 struct pvrdma_cq *vcq = to_vcq(cq); 241 241 union pvrdma_cmd_req req; ··· 261 261 262 262 pvrdma_free_cq(dev, vcq); 263 263 atomic_dec(&dev->num_cqs); 264 + return 0; 264 265 } 265 266 266 267 static inline struct pvrdma_cqe *get_cqe(struct pvrdma_cq *cq, int i) ··· 376 375 * pvrdma_poll_cq - poll for work completion queue entries 377 376 * @ibcq: completion queue 378 377 * @num_entries: the maximum number of entries 379 - * @entry: pointer to work completion array 378 + * @wc: pointer to work completion array 380 379 * 381 380 * @return: number of polled completion entries 382 381 */
+2 -2
drivers/infiniband/hw/vmw_pvrdma/pvrdma_main.c
···
 	spin_lock_init(&dev->srq_tbl_lock);
 	rdma_set_device_sysfs_group(&dev->ib_dev, &pvrdma_attr_group);
 
-	ret = ib_register_device(&dev->ib_dev, "vmw_pvrdma%d");
+	ret = ib_register_device(&dev->ib_dev, "vmw_pvrdma%d", &dev->pdev->dev);
 	if (ret)
 		goto err_srq_free;
 
···
 			goto err_free_resource;
 		}
 	}
-
+	dma_set_max_seg_size(&pdev->dev, UINT_MAX);
 	pci_set_master(pdev);
 
 	/* Map register space */
+4 -5
drivers/infiniband/hw/vmw_pvrdma/pvrdma_misc.c
···
 int pvrdma_page_dir_insert_umem(struct pvrdma_page_dir *pdir,
 				struct ib_umem *umem, u64 offset)
 {
+	struct ib_block_iter biter;
 	u64 i = offset;
 	int ret = 0;
-	struct sg_dma_page_iter sg_iter;
 
 	if (offset >= pdir->npages)
 		return -EINVAL;
 
-	for_each_sg_dma_page(umem->sg_head.sgl, &sg_iter, umem->nmap, 0) {
-		dma_addr_t addr = sg_page_iter_dma_address(&sg_iter);
-
-		ret = pvrdma_page_dir_insert_dma(pdir, i, addr);
+	rdma_umem_for_each_dma_block (umem, &biter, PAGE_SIZE) {
+		ret = pvrdma_page_dir_insert_dma(
+			pdir, i, rdma_block_iter_dma_address(&biter));
 		if (ret)
 			goto exit;
 
+2 -1
drivers/infiniband/hw/vmw_pvrdma/pvrdma_mr.c
···
 		return ERR_CAST(umem);
 	}
 
-	npages = ib_umem_num_pages(umem);
+	npages = ib_umem_num_dma_blocks(umem, PAGE_SIZE);
 	if (npages < 0 || npages > PVRDMA_PAGE_DIR_MAX_PAGES) {
 		dev_warn(&dev->pdev->dev, "overflow %d pages in mem region\n",
 			 npages);
···
 /**
  * pvrdma_dereg_mr - deregister a memory region
  * @ibmr: memory region
+ * @udata: pointer to user data
  *
  * @return: 0 on success.
  */
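The pvrdma hunks above switch from page-count helpers to `ib_umem_num_dma_blocks(umem, PAGE_SIZE)`, which counts fixed-size blocks rather than pages. As an illustration of the arithmetic, here is one way to count the blocks of size `pgsz` that cover an address range. The helper name `demo_num_blocks` and its parameters are assumptions for this sketch, and it is not presented as the kernel's implementation.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: number of pgsz-sized, pgsz-aligned blocks needed to
 * cover [iova, iova + length). pgsz must be a power of two.
 */
static size_t demo_num_blocks(uint64_t iova, uint64_t length, uint64_t pgsz)
{
	uint64_t start = iova & ~(pgsz - 1);			 /* round down */
	uint64_t end = (iova + length + pgsz - 1) & ~(pgsz - 1); /* round up */

	return (size_t)((end - start) / pgsz);
}
```

A range that starts or ends in the middle of a block still occupies that whole block, so an unaligned 4 KiB range spans two 4 KiB blocks.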
+5 -4
drivers/infiniband/hw/vmw_pvrdma/pvrdma_qp.c
··· 232 232 switch (init_attr->qp_type) { 233 233 case IB_QPT_GSI: 234 234 if (init_attr->port_num == 0 || 235 - init_attr->port_num > pd->device->phys_port_cnt || 236 - udata) { 235 + init_attr->port_num > pd->device->phys_port_cnt) { 237 236 dev_warn(&dev->pdev->dev, "invalid queuepair attrs\n"); 238 237 ret = -EINVAL; 239 238 goto err_qp; ··· 297 298 goto err_qp; 298 299 } 299 300 300 - qp->npages_send = ib_umem_page_count(qp->sumem); 301 + qp->npages_send = 302 + ib_umem_num_dma_blocks(qp->sumem, PAGE_SIZE); 301 303 if (!is_srq) 302 - qp->npages_recv = ib_umem_page_count(qp->rumem); 304 + qp->npages_recv = ib_umem_num_dma_blocks( 305 + qp->rumem, PAGE_SIZE); 303 306 else 304 307 qp->npages_recv = 0; 305 308 qp->npages = qp->npages_send + qp->npages_recv;
+4 -3
drivers/infiniband/hw/vmw_pvrdma/pvrdma_srq.c
··· 90 90 91 91 /** 92 92 * pvrdma_create_srq - create shared receive queue 93 - * @pd: protection domain 93 + * @ibsrq: the IB shared receive queue 94 94 * @init_attr: shared receive queue attributes 95 95 * @udata: user data 96 96 * ··· 152 152 goto err_srq; 153 153 } 154 154 155 - srq->npages = ib_umem_page_count(srq->umem); 155 + srq->npages = ib_umem_num_dma_blocks(srq->umem, PAGE_SIZE); 156 156 157 157 if (srq->npages < 0 || srq->npages > PVRDMA_PAGE_DIR_MAX_PAGES) { 158 158 dev_warn(&dev->pdev->dev, ··· 240 240 * 241 241 * @return: 0 for success. 242 242 */ 243 - void pvrdma_destroy_srq(struct ib_srq *srq, struct ib_udata *udata) 243 + int pvrdma_destroy_srq(struct ib_srq *srq, struct ib_udata *udata) 244 244 { 245 245 struct pvrdma_srq *vsrq = to_vsrq(srq); 246 246 union pvrdma_cmd_req req; ··· 259 259 ret); 260 260 261 261 pvrdma_free_srq(dev, vsrq); 262 + return 0; 262 263 } 263 264 264 265 /**
+8 -7
drivers/infiniband/hw/vmw_pvrdma/pvrdma_verbs.c
··· 479 479 * @pd: the protection domain to be released 480 480 * @udata: user data or null for kernel object 481 481 * 482 - * @return: 0 on success, otherwise errno. 482 + * @return: Always 0 483 483 */ 484 - void pvrdma_dealloc_pd(struct ib_pd *pd, struct ib_udata *udata) 484 + int pvrdma_dealloc_pd(struct ib_pd *pd, struct ib_udata *udata) 485 485 { 486 486 struct pvrdma_dev *dev = to_vdev(pd->device); 487 487 union pvrdma_cmd_req req = {}; ··· 498 498 ret); 499 499 500 500 atomic_dec(&dev->num_pds); 501 + return 0; 501 502 } 502 503 503 504 /** 504 505 * pvrdma_create_ah - create an address handle 505 - * @pd: the protection domain 506 - * @ah_attr: the attributes of the AH 507 - * @udata: user data blob 508 - * @flags: create address handle flags (see enum rdma_create_ah_flags) 506 + * @ibah: the IB address handle 507 + * @init_attr: the attributes of the AH 508 + * @udata: pointer to user data 509 509 * 510 510 * @return: 0 on success, otherwise errno. 511 511 */ ··· 548 548 * @flags: destroy address handle flags (see enum rdma_destroy_ah_flags) 549 549 * 550 550 */ 551 - void pvrdma_destroy_ah(struct ib_ah *ah, u32 flags) 551 + int pvrdma_destroy_ah(struct ib_ah *ah, u32 flags) 552 552 { 553 553 struct pvrdma_dev *dev = to_vdev(ah->device); 554 554 555 555 atomic_dec(&dev->num_ahs); 556 + return 0; 556 557 }
+5 -5
drivers/infiniband/hw/vmw_pvrdma/pvrdma_verbs.h
··· 176 176 u8 subnet_timeout; 177 177 u8 init_type_reply; 178 178 u8 active_width; 179 - u8 active_speed; 179 + u16 active_speed; 180 180 u8 phys_state; 181 181 u8 reserved[2]; 182 182 }; ··· 399 399 int pvrdma_alloc_ucontext(struct ib_ucontext *uctx, struct ib_udata *udata); 400 400 void pvrdma_dealloc_ucontext(struct ib_ucontext *context); 401 401 int pvrdma_alloc_pd(struct ib_pd *pd, struct ib_udata *udata); 402 - void pvrdma_dealloc_pd(struct ib_pd *ibpd, struct ib_udata *udata); 402 + int pvrdma_dealloc_pd(struct ib_pd *ibpd, struct ib_udata *udata); 403 403 struct ib_mr *pvrdma_get_dma_mr(struct ib_pd *pd, int acc); 404 404 struct ib_mr *pvrdma_reg_user_mr(struct ib_pd *pd, u64 start, u64 length, 405 405 u64 virt_addr, int access_flags, ··· 411 411 int sg_nents, unsigned int *sg_offset); 412 412 int pvrdma_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr, 413 413 struct ib_udata *udata); 414 - void pvrdma_destroy_cq(struct ib_cq *cq, struct ib_udata *udata); 414 + int pvrdma_destroy_cq(struct ib_cq *cq, struct ib_udata *udata); 415 415 int pvrdma_poll_cq(struct ib_cq *ibcq, int num_entries, struct ib_wc *wc); 416 416 int pvrdma_req_notify_cq(struct ib_cq *cq, enum ib_cq_notify_flags flags); 417 417 int pvrdma_create_ah(struct ib_ah *ah, struct rdma_ah_init_attr *init_attr, 418 418 struct ib_udata *udata); 419 - void pvrdma_destroy_ah(struct ib_ah *ah, u32 flags); 419 + int pvrdma_destroy_ah(struct ib_ah *ah, u32 flags); 420 420 421 421 int pvrdma_create_srq(struct ib_srq *srq, struct ib_srq_init_attr *init_attr, 422 422 struct ib_udata *udata); 423 423 int pvrdma_modify_srq(struct ib_srq *ibsrq, struct ib_srq_attr *attr, 424 424 enum ib_srq_attr_mask attr_mask, struct ib_udata *udata); 425 425 int pvrdma_query_srq(struct ib_srq *srq, struct ib_srq_attr *srq_attr); 426 - void pvrdma_destroy_srq(struct ib_srq *srq, struct ib_udata *udata); 426 + int pvrdma_destroy_srq(struct ib_srq *srq, struct ib_udata *udata); 427 427 428 428 struct ib_qp 
*pvrdma_create_qp(struct ib_pd *pd, 429 429 struct ib_qp_init_attr *init_attr,
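The prototype changes above turn the driver `destroy`/`dealloc` callbacks from `void` into `int`, matching the pull message's note that destroy operations may fail at the driver level. A hedged sketch of what an `int`-returning callback lets a caller do is shown here. All `demo_*` names are hypothetical, and the caller's policy (keep the object when the driver reports an error) is an assumption, not a description of the RDMA core.

```c
#include <errno.h>

/* Hypothetical completion-queue object. */
struct demo_cq {
	int in_use;	/* nonzero while something still references the CQ */
	int destroyed;
};

/* Hypothetical ops table with an int-returning destroy callback. */
struct demo_ops {
	int (*destroy_cq)(struct demo_cq *cq);
};

/* Driver side: report failure instead of silently ignoring it. */
static int demo_driver_destroy_cq(struct demo_cq *cq)
{
	if (cq->in_use)
		return -EBUSY;
	cq->destroyed = 1;
	return 0;
}

/* Caller side: only treat the object as gone when the driver succeeded. */
static int demo_core_destroy_cq(const struct demo_ops *ops, struct demo_cq *cq)
{
	int ret = ops->destroy_cq(cq);

	if (ret)
		return ret;	/* object stays valid; caller may retry */
	return 0;
}
```

Drivers that cannot fail, like most of those in this series, simply add `return 0;`.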
+2 -1
drivers/infiniband/sw/rdmavt/ah.c
···
  *
  * Return: 0 on success
  */
-void rvt_destroy_ah(struct ib_ah *ibah, u32 destroy_flags)
+int rvt_destroy_ah(struct ib_ah *ibah, u32 destroy_flags)
 {
 	struct rvt_dev_info *dev = ib_to_rvt(ibah->device);
 	struct rvt_ah *ah = ibah_to_rvtah(ibah);
···
 	spin_unlock_irqrestore(&dev->n_ahs_lock, flags);
 
 	rdma_destroy_ah_attr(&ah->attr);
+	return 0;
 }
 
 /**
+1 -1
drivers/infiniband/sw/rdmavt/ah.h
···
 
 int rvt_create_ah(struct ib_ah *ah, struct rdma_ah_init_attr *init_attr,
 		  struct ib_udata *udata);
-void rvt_destroy_ah(struct ib_ah *ibah, u32 destroy_flags);
+int rvt_destroy_ah(struct ib_ah *ibah, u32 destroy_flags);
 int rvt_modify_ah(struct ib_ah *ibah, struct rdma_ah_attr *ah_attr);
 int rvt_query_ah(struct ib_ah *ibah, struct rdma_ah_attr *ah_attr);
 
+2 -1
drivers/infiniband/sw/rdmavt/cq.c
···
  *
  * Called by ib_destroy_cq() in the generic verbs code.
  */
-void rvt_destroy_cq(struct ib_cq *ibcq, struct ib_udata *udata)
+int rvt_destroy_cq(struct ib_cq *ibcq, struct ib_udata *udata)
 {
 	struct rvt_cq *cq = ibcq_to_rvtcq(ibcq);
 	struct rvt_dev_info *rdi = cq->rdi;
···
 		kref_put(&cq->ip->ref, rvt_release_mmap_info);
 	else
 		vfree(cq->kqueue);
+	return 0;
 }
 
 /**
+1 -1
drivers/infiniband/sw/rdmavt/cq.h
···
 
 int rvt_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
 		  struct ib_udata *udata);
-void rvt_destroy_cq(struct ib_cq *ibcq, struct ib_udata *udata);
+int rvt_destroy_cq(struct ib_cq *ibcq, struct ib_udata *udata);
 int rvt_req_notify_cq(struct ib_cq *ibcq, enum ib_cq_notify_flags notify_flags);
 int rvt_resize_cq(struct ib_cq *ibcq, int cqe, struct ib_udata *udata);
 int rvt_poll_cq(struct ib_cq *ibcq, int num_entries, struct ib_wc *entry);
+2 -1
drivers/infiniband/sw/rdmavt/pd.c
···
  *
  * Return: always 0
  */
-void rvt_dealloc_pd(struct ib_pd *ibpd, struct ib_udata *udata)
+int rvt_dealloc_pd(struct ib_pd *ibpd, struct ib_udata *udata)
 {
 	struct rvt_dev_info *dev = ib_to_rvt(ibpd->device);
 
 	spin_lock(&dev->n_pds_lock);
 	dev->n_pds_allocated--;
 	spin_unlock(&dev->n_pds_lock);
+	return 0;
 }
+1 -1
drivers/infiniband/sw/rdmavt/pd.h
···
 #include <rdma/rdma_vt.h>
 
 int rvt_alloc_pd(struct ib_pd *pd, struct ib_udata *udata);
-void rvt_dealloc_pd(struct ib_pd *ibpd, struct ib_udata *udata);
+int rvt_dealloc_pd(struct ib_pd *ibpd, struct ib_udata *udata);
 
 #endif /* DEF_RDMAVTPD_H */
+2 -1
drivers/infiniband/sw/rdmavt/srq.c
···
  * @ibsrq: srq object to destroy
  *
  */
-void rvt_destroy_srq(struct ib_srq *ibsrq, struct ib_udata *udata)
+int rvt_destroy_srq(struct ib_srq *ibsrq, struct ib_udata *udata)
 {
 	struct rvt_srq *srq = ibsrq_to_rvtsrq(ibsrq);
 	struct rvt_dev_info *dev = ib_to_rvt(ibsrq->device);
···
 	if (srq->ip)
 		kref_put(&srq->ip->ref, rvt_release_mmap_info);
 	kvfree(srq->rq.kwq);
+	return 0;
 }
+1 -1
drivers/infiniband/sw/rdmavt/srq.h
···
 		   enum ib_srq_attr_mask attr_mask,
 		   struct ib_udata *udata);
 int rvt_query_srq(struct ib_srq *ibsrq, struct ib_srq_attr *attr);
-void rvt_destroy_srq(struct ib_srq *ibsrq, struct ib_udata *udata);
+int rvt_destroy_srq(struct ib_srq *ibsrq, struct ib_udata *udata);
 
 #endif /* DEF_RVTSRQ_H */
+5 -5
drivers/infiniband/sw/rdmavt/vt.c
···
 	if (!rdi)
 		return rdi;
 
-	rdi->ports = kcalloc(nports,
-			     sizeof(struct rvt_ibport **),
-			     GFP_KERNEL);
+	rdi->ports = kcalloc(nports, sizeof(*rdi->ports), GFP_KERNEL);
 	if (!rdi->ports)
 		ib_dealloc_device(&rdi->ibdev);
 
···
 	spin_lock_init(&rdi->n_cqs_lock);
 
 	/* DMA Operations */
-	rdi->ibdev.dev.dma_ops = rdi->ibdev.dev.dma_ops ? : &dma_virt_ops;
+	rdi->ibdev.dev.dma_parms = rdi->ibdev.dev.parent->dma_parms;
+	dma_set_coherent_mask(&rdi->ibdev.dev,
+			      rdi->ibdev.dev.parent->coherent_dma_mask);
 
 	/* Protection Domain */
 	spin_lock_init(&rdi->n_pds_lock);
···
 	rdi->ibdev.num_comp_vectors = 1;
 
 	/* We are now good to announce we exist */
-	ret = ib_register_device(&rdi->ibdev, dev_name(&rdi->ibdev.dev));
+	ret = ib_register_device(&rdi->ibdev, dev_name(&rdi->ibdev.dev), NULL);
 	if (ret) {
 		rvt_pr_err(rdi, "Failed to register driver with ib core.\n");
 		goto bail_wss;
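The first hunk above replaces a hand-written element type in the `kcalloc()` call with `sizeof(*rdi->ports)`, which ties the element size to the pointer being assigned. A userspace sketch of the same idiom using `calloc()` follows; the `demo_*` types are hypothetical and are not the rdmavt structures.

```c
#include <stdlib.h>

/* Hypothetical per-port structure. */
struct demo_port {
	int id;
};

/* Hypothetical device info holding an array of port pointers. */
struct demo_dev_info {
	struct demo_port **ports;
	size_t nports;
};

/* Allocates a zeroed array of nports pointers; returns 0 or -1. */
static int demo_alloc_ports(struct demo_dev_info *di, size_t nports)
{
	/*
	 * sizeof(*di->ports) is the size of one array element, whatever
	 * type di->ports points to, so it cannot drift from the declaration.
	 */
	di->ports = calloc(nports, sizeof(*di->ports));
	if (!di->ports)
		return -1;
	di->nports = nports;
	return 0;
}
```

With this form, changing the declared type of `ports` later does not require touching the allocation.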
+7 -36
drivers/infiniband/sw/rxe/rxe.c
···
+// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
 /*
  * Copyright (c) 2016 Mellanox Technologies Ltd. All rights reserved.
  * Copyright (c) 2015 System Fabric Works, Inc. All rights reserved.
- *
- * This software is available to you under a choice of one of two
- * licenses. You may choose to be licensed under the terms of the GNU
- * General Public License (GPL) Version 2, available from the file
- * COPYING in the main directory of this source tree, or the
- * OpenIB.org BSD license below:
- *
- *     Redistribution and use in source and binary forms, with or
- *     without modification, are permitted provided that the following
- *     conditions are met:
- *
- *	- Redistributions of source code must retain the above
- *	  copyright notice, this list of conditions and the following
- *	  disclaimer.
- *
- *	- Redistributions in binary form must reproduce the above
- *	  copyright notice, this list of conditions and the following
- *	  disclaimer in the documentation and/or other materials
- *	  provided with the distribution.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
- * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
- * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
- * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
- * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
- * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
- * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
- * SOFTWARE.
  */

 #include <rdma/rdma_netlink.h>
···
 	struct rxe_dev *exists;
 	int err = 0;

+	if (is_vlan_dev(ndev)) {
+		pr_err("rxe creation allowed on top of a real device only\n");
+		err = -EPERM;
+		goto err;
+	}
+
 	exists = rxe_get_dev_from_net(ndev);
 	if (exists) {
 		ib_device_put(&exists->ib_dev);
···
 {
 	int err;

-	/* initialize slab caches for managed objects */
-	err = rxe_cache_init();
-	if (err) {
-		pr_err("unable to init object pools\n");
-		return err;
-	}
-
 	err = rxe_net_init();
 	if (err)
 		return err;
···
 	rdma_link_unregister(&rxe_link_ops);
 	ib_unregister_driver(RDMA_DRIVER_RXE);
 	rxe_net_exit();
-	rxe_cache_exit();

 	rxe_initialized = false;
 	pr_info("unloaded\n");
+1 -28
drivers/infiniband/sw/rxe/rxe.h
···
+/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */
 /*
  * Copyright (c) 2016 Mellanox Technologies Ltd. All rights reserved.
  * Copyright (c) 2015 System Fabric Works, Inc. All rights reserved.
- *
- * This software is available to you under a choice of one of two
- * licenses. You may choose to be licensed under the terms of the GNU
- * General Public License (GPL) Version 2, available from the file
- * COPYING in the main directory of this source tree, or the
- * OpenIB.org BSD license below:
- *
- *     Redistribution and use in source and binary forms, with or
- *     without modification, are permitted provided that the following
- *     conditions are met:
- *
- *	- Redistributions of source code must retain the above
- *	  copyright notice, this list of conditions and the following
- *	  disclaimer.
- *
- *	- Redistributions in binary form must reproduce the above
- *	  copyright notice, this list of conditions and the following
- *	  disclaimer in the documentation and/or other materials
- *	  provided with the distribution.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
- * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
- * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
- * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
- * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
- * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
- * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
- * SOFTWARE.
  */

 #ifndef RXE_H
+1 -28
drivers/infiniband/sw/rxe/rxe_av.c
···
+// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
 /*
  * Copyright (c) 2016 Mellanox Technologies Ltd. All rights reserved.
  * Copyright (c) 2015 System Fabric Works, Inc. All rights reserved.
- *
- * This software is available to you under a choice of one of two
- * licenses. You may choose to be licensed under the terms of the GNU
- * General Public License (GPL) Version 2, available from the file
- * COPYING in the main directory of this source tree, or the
- * OpenIB.org BSD license below:
- *
- *     Redistribution and use in source and binary forms, with or
- *     without modification, are permitted provided that the following
- *     conditions are met:
- *
- *	- Redistributions of source code must retain the above
- *	  copyright notice, this list of conditions and the following
- *	  disclaimer.
- *
- *	- Redistributions in binary form must reproduce the above
- *	  copyright notice, this list of conditions and the following
- *	  disclaimer in the documentation and/or other materials
- *	  provided with the distribution.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
- * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
- * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
- * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
- * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
- * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
- * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
- * SOFTWARE.
  */

 #include "rxe.h"
+2 -30
drivers/infiniband/sw/rxe/rxe_comp.c
···
+// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
 /*
  * Copyright (c) 2016 Mellanox Technologies Ltd. All rights reserved.
  * Copyright (c) 2015 System Fabric Works, Inc. All rights reserved.
- *
- * This software is available to you under a choice of one of two
- * licenses. You may choose to be licensed under the terms of the GNU
- * General Public License (GPL) Version 2, available from the file
- * COPYING in the main directory of this source tree, or the
- * OpenIB.org BSD license below:
- *
- *     Redistribution and use in source and binary forms, with or
- *     without modification, are permitted provided that the following
- *     conditions are met:
- *
- *	- Redistributions of source code must retain the above
- *	  copyright notice, this list of conditions and the following
- *	  disclaimer.
- *
- *	- Redistributions in binary form must reproduce the above
- *	  copyright notice, this list of conditions and the following
- *	  disclaimer in the documentation and/or other materials
- *	  provided with the distribution.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
- * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
- * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
- * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
- * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
- * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
- * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
- * SOFTWARE.
  */

 #include <linux/skbuff.h>
···
 		 */

 		/* there is nothing to retry in this case */
-		if (!wqe || (wqe->state == wqe_state_posted)) {
+		if (!wqe || (wqe->state == wqe_state_posted))
 			goto exit;
-		}

 		/* if we've started a retry, don't start another
 		 * retry sequence, unless this is a timeout.
+4 -31
drivers/infiniband/sw/rxe/rxe_cq.c
···
+// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
 /*
  * Copyright (c) 2016 Mellanox Technologies Ltd. All rights reserved.
  * Copyright (c) 2015 System Fabric Works, Inc. All rights reserved.
- *
- * This software is available to you under a choice of one of two
- * licenses. You may choose to be licensed under the terms of the GNU
- * General Public License (GPL) Version 2, available from the file
- * COPYING in the main directory of this source tree, or the
- * OpenIB.org BSD license below:
- *
- *     Redistribution and use in source and binary forms, with or
- *     without modification, are permitted provided that the following
- *     conditions are met:
- *
- *	- Redistributions of source code must retain the above
- *	  copyright notice, this list of conditions and the following
- *	  disclaimer.
- *
- *	- Redistributions in binary form must reproduce the above
- *	  copyright notice, this list of conditions and the following
- *	  disclaimer in the documentation and/or other materials
- *	  provided with the distribution.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
- * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
- * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
- * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
- * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
- * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
- * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
- * SOFTWARE.
  */
 #include <linux/vmalloc.h>
 #include "rxe.h"
···
 	return -EINVAL;
 }

-static void rxe_send_complete(unsigned long data)
+static void rxe_send_complete(struct tasklet_struct *t)
 {
-	struct rxe_cq *cq = (struct rxe_cq *)data;
+	struct rxe_cq *cq = from_tasklet(cq, t, comp_task);
 	unsigned long flags;

 	spin_lock_irqsave(&cq->cq_lock, flags);
···

 	cq->is_dying = false;

-	tasklet_init(&cq->comp_task, rxe_send_complete, (unsigned long)cq);
+	tasklet_setup(&cq->comp_task, rxe_send_complete);

 	spin_lock_init(&cq->cq_lock);
 	cq->ibcq.cqe = cqe;
+1 -28
drivers/infiniband/sw/rxe/rxe_hdr.h
···
+/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */
 /*
  * Copyright (c) 2016 Mellanox Technologies Ltd. All rights reserved.
  * Copyright (c) 2015 System Fabric Works, Inc. All rights reserved.
- *
- * This software is available to you under a choice of one of two
- * licenses. You may choose to be licensed under the terms of the GNU
- * General Public License (GPL) Version 2, available from the file
- * COPYING in the main directory of this source tree, or the
- * OpenIB.org BSD license below:
- *
- *     Redistribution and use in source and binary forms, with or
- *     without modification, are permitted provided that the following
- *     conditions are met:
- *
- *	- Redistributions of source code must retain the above
- *	  copyright notice, this list of conditions and the following
- *	  disclaimer.
- *
- *	- Redistributions in binary form must reproduce the above
- *	  copyright notice, this list of conditions and the following
- *	  disclaimer in the documentation and/or other materials
- *	  provided with the distribution.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
- * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
- * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
- * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
- * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
- * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
- * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
- * SOFTWARE.
  */

 #ifndef RXE_HDR_H
+1 -28
drivers/infiniband/sw/rxe/rxe_hw_counters.c
···
+// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
 /*
  * Copyright (c) 2017 Mellanox Technologies Ltd. All rights reserved.
- *
- * This software is available to you under a choice of one of two
- * licenses. You may choose to be licensed under the terms of the GNU
- * General Public License (GPL) Version 2, available from the file
- * COPYING in the main directory of this source tree, or the
- * OpenIB.org BSD license below:
- *
- *     Redistribution and use in source and binary forms, with or
- *     without modification, are permitted provided that the following
- *     conditions are met:
- *
- *	- Redistributions of source code must retain the above
- *	  copyright notice, this list of conditions and the following
- *	  disclaimer.
- *
- *	- Redistributions in binary form must reproduce the above
- *	  copyright notice, this list of conditions and the following
- *	  disclaimer in the documentation and/or other materials
- *	  provided with the distribution.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
- * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
- * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
- * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
- * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
- * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
- * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
- * SOFTWARE.
  */

 #include "rxe.h"
+1 -28
drivers/infiniband/sw/rxe/rxe_hw_counters.h
···
+/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */
 /*
  * Copyright (c) 2017 Mellanox Technologies Ltd. All rights reserved.
- *
- * This software is available to you under a choice of one of two
- * licenses. You may choose to be licensed under the terms of the GNU
- * General Public License (GPL) Version 2, available from the file
- * COPYING in the main directory of this source tree, or the
- * OpenIB.org BSD license below:
- *
- *     Redistribution and use in source and binary forms, with or
- *     without modification, are permitted provided that the following
- *     conditions are met:
- *
- *	- Redistributions of source code must retain the above
- *	  copyright notice, this list of conditions and the following
- *	  disclaimer.
- *
- *	- Redistributions in binary form must reproduce the above
- *	  copyright notice, this list of conditions and the following
- *	  disclaimer in the documentation and/or other materials
- *	  provided with the distribution.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
- * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
- * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
- * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
- * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
- * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
- * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
- * SOFTWARE.
  */

 #ifndef RXE_HW_COUNTERS_H
+1 -28
drivers/infiniband/sw/rxe/rxe_icrc.c
···
+// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
 /*
  * Copyright (c) 2016 Mellanox Technologies Ltd. All rights reserved.
  * Copyright (c) 2015 System Fabric Works, Inc. All rights reserved.
- *
- * This software is available to you under a choice of one of two
- * licenses. You may choose to be licensed under the terms of the GNU
- * General Public License (GPL) Version 2, available from the file
- * COPYING in the main directory of this source tree, or the
- * OpenIB.org BSD license below:
- *
- *     Redistribution and use in source and binary forms, with or
- *     without modification, are permitted provided that the following
- *     conditions are met:
- *
- *	- Redistributions of source code must retain the above
- *	  copyright notice, this list of conditions and the following
- *	  disclaimer.
- *
- *	- Redistributions in binary form must reproduce the above
- *	  copyright notice, this list of conditions and the following
- *	  disclaimer in the documentation and/or other materials
- *	  provided with the distribution.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
- * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
- * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
- * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
- * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
- * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
- * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
- * SOFTWARE.
  */

 #include "rxe.h"
+1 -28
drivers/infiniband/sw/rxe/rxe_loc.h
···
+/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */
 /*
  * Copyright (c) 2016 Mellanox Technologies Ltd. All rights reserved.
  * Copyright (c) 2015 System Fabric Works, Inc. All rights reserved.
- *
- * This software is available to you under a choice of one of two
- * licenses. You may choose to be licensed under the terms of the GNU
- * General Public License (GPL) Version 2, available from the file
- * COPYING in the main directory of this source tree, or the
- * OpenIB.org BSD license below:
- *
- *     Redistribution and use in source and binary forms, with or
- *     without modification, are permitted provided that the following
- *     conditions are met:
- *
- *	- Redistributions of source code must retain the above
- *	  copyright notice, this list of conditions and the following
- *	  disclaimer.
- *
- *	- Redistributions in binary form must reproduce the above
- *	  copyright notice, this list of conditions and the following
- *	  disclaimer in the documentation and/or other materials
- *	  provided with the distribution.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
- * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
- * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
- * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
- * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
- * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
- * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
- * SOFTWARE.
  */

 #ifndef RXE_LOC_H
+1 -28
drivers/infiniband/sw/rxe/rxe_mcast.c
···
+// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
 /*
  * Copyright (c) 2016 Mellanox Technologies Ltd. All rights reserved.
  * Copyright (c) 2015 System Fabric Works, Inc. All rights reserved.
- *
- * This software is available to you under a choice of one of two
- * licenses. You may choose to be licensed under the terms of the GNU
- * General Public License (GPL) Version 2, available from the file
- * COPYING in the main directory of this source tree, or the
- * OpenIB.org BSD license below:
- *
- *     Redistribution and use in source and binary forms, with or
- *     without modification, are permitted provided that the following
- *     conditions are met:
- *
- *	- Redistributions of source code must retain the above
- *	  copyright notice, this list of conditions and the following
- *	  disclaimer.
- *
- *	- Redistributions in binary form must reproduce the above
- *	  copyright notice, this list of conditions and the following
- *	  disclaimer in the documentation and/or other materials
- *	  provided with the distribution.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
- * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
- * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
- * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
- * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
- * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
- * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
- * SOFTWARE.
  */

 #include "rxe.h"
+1 -28
drivers/infiniband/sw/rxe/rxe_mmap.c
···
+// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
 /*
  * Copyright (c) 2016 Mellanox Technologies Ltd. All rights reserved.
  * Copyright (c) 2015 System Fabric Works, Inc. All rights reserved.
- *
- * This software is available to you under a choice of one of two
- * licenses. You may choose to be licensed under the terms of the GNU
- * General Public License (GPL) Version 2, available from the file
- * COPYING in the main directory of this source tree, or the
- * OpenIB.org BSD license below:
- *
- *     Redistribution and use in source and binary forms, with or
- *     without modification, are permitted provided that the following
- *     conditions are met:
- *
- *	- Redistributions of source code must retain the above
- *	  copyright notice, this list of conditions and the following
- *	  disclaimer.
- *
- *	- Redistributions in binary form must reproduce the above
- *	  copyright notice, this list of conditions and the following
- *	  disclaimer in the documentation and/or other materials
- *	  provided with the distribution.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
- * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
- * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
- * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
- * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
- * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
- * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
- * SOFTWARE.
  */

 #include <linux/module.h>
+11 -43
drivers/infiniband/sw/rxe/rxe_mr.c
···
+// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
 /*
  * Copyright (c) 2016 Mellanox Technologies Ltd. All rights reserved.
  * Copyright (c) 2015 System Fabric Works, Inc. All rights reserved.
- *
- * This software is available to you under a choice of one of two
- * licenses. You may choose to be licensed under the terms of the GNU
- * General Public License (GPL) Version 2, available from the file
- * COPYING in the main directory of this source tree, or the
- * OpenIB.org BSD license below:
- *
- *     Redistribution and use in source and binary forms, with or
- *     without modification, are permitted provided that the following
- *     conditions are met:
- *
- *	- Redistributions of source code must retain the above
- *	  copyright notice, this list of conditions and the following
- *	  disclaimer.
- *
- *	- Redistributions in binary form must reproduce the above
- *	  copyright notice, this list of conditions and the following
- *	  disclaimer in the documentation and/or other materials
- *	  provided with the distribution.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
- * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
- * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
- * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
- * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
- * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
- * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
- * SOFTWARE.
  */

 #include "rxe.h"
···
 	u32 lkey = mem->pelem.index << 8 | rxe_get_key();
 	u32 rkey = (access & IB_ACCESS_REMOTE) ? lkey : 0;

-	if (mem->pelem.pool->type == RXE_TYPE_MR) {
-		mem->ibmr.lkey = lkey;
-		mem->ibmr.rkey = rkey;
-	}
-
-	mem->lkey = lkey;
-	mem->rkey = rkey;
+	mem->ibmr.lkey = lkey;
+	mem->ibmr.rkey = rkey;
 	mem->state = RXE_MEM_STATE_INVALID;
 	mem->type = RXE_MEM_TYPE_NONE;
 	mem->map_shift = ilog2(RXE_BUF_PER_MAP);
···
 {
 	rxe_mem_init(access, mem);

-	mem->pd = pd;
+	mem->ibmr.pd = &pd->ibpd;
 	mem->access = access;
 	mem->state = RXE_MEM_STATE_VALID;
 	mem->type = RXE_MEM_TYPE_DMA;
···
 		}
 	}

-	mem->pd = pd;
+	mem->ibmr.pd = &pd->ibpd;
 	mem->umem = umem;
 	mem->access = access;
 	mem->length = length;
···
 	if (err)
 		goto err1;

-	mem->pd = pd;
+	mem->ibmr.pd = &pd->ibpd;
 	mem->max_buf = max_pages;
 	mem->state = RXE_MEM_STATE_FREE;
 	mem->type = RXE_MEM_TYPE_MR;
···
 		memcpy(dest, src, length);

 		if (crcp)
-			*crcp = rxe_crc32(to_rdev(mem->pd->ibpd.device),
+			*crcp = rxe_crc32(to_rdev(mem->ibmr.device),
 					*crcp, dest, length);

 		return 0;
···
 		memcpy(dest, src, bytes);

 		if (crcp)
-			crc = rxe_crc32(to_rdev(mem->pd->ibpd.device),
+			crc = rxe_crc32(to_rdev(mem->ibmr.device),
 					crc, dest, bytes);

 		length -= bytes;
···
 	if (!mem)
 		return NULL;

-	if (unlikely((type == lookup_local && mem->lkey != key) ||
-		     (type == lookup_remote && mem->rkey != key) ||
-		     mem->pd != pd ||
+	if (unlikely((type == lookup_local && mr_lkey(mem) != key) ||
+		     (type == lookup_remote && mr_rkey(mem) != key) ||
+		     mr_pd(mem) != pd ||
 		     (access && !(access & mem->access)) ||
 		     mem->state != RXE_MEM_STATE_VALID)) {
 		rxe_drop_ref(mem);
+6 -33
drivers/infiniband/sw/rxe/rxe_net.c
···
+// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
 /*
  * Copyright (c) 2016 Mellanox Technologies Ltd. All rights reserved.
  * Copyright (c) 2015 System Fabric Works, Inc. All rights reserved.
- *
- * This software is available to you under a choice of one of two
- * licenses. You may choose to be licensed under the terms of the GNU
- * General Public License (GPL) Version 2, available from the file
- * COPYING in the main directory of this source tree, or the
- * OpenIB.org BSD license below:
- *
- *     Redistribution and use in source and binary forms, with or
- *     without modification, are permitted provided that the following
- *     conditions are met:
- *
- *	- Redistributions of source code must retain the above
- *	  copyright notice, this list of conditions and the following
- *	  disclaimer.
- *
- *	- Redistributions in binary form must reproduce the above
- *	  copyright notice, this list of conditions and the following
- *	  disclaimer in the documentation and/or other materials
- *	  provided with the distribution.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
- * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
- * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
- * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
- * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
- * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
- * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
- * SOFTWARE.
  */

 #include <linux/skbuff.h>
···
 	ndst = ipv6_stub->ipv6_dst_lookup_flow(sock_net(recv_sockets.sk6->sk),
 					       recv_sockets.sk6->sk, &fl6,
 					       NULL);
-	if (unlikely(IS_ERR(ndst))) {
+	if (IS_ERR(ndst)) {
 		pr_err_ratelimited("no route to %pI6\n", daddr);
 		return NULL;
 	}
···
 		if (dst)
 			dst_release(dst);

-		if (av->network_type == RDMA_NETWORK_IPV4) {
+		if (av->network_type == RXE_NETWORK_TYPE_IPV4) {
 			struct in_addr *saddr;
 			struct in_addr *daddr;

 			saddr = &av->sgid_addr._sockaddr_in.sin_addr;
 			daddr = &av->dgid_addr._sockaddr_in.sin_addr;
 			dst = rxe_find_route4(ndev, saddr, daddr);
-		} else if (av->network_type == RDMA_NETWORK_IPV6) {
+		} else if (av->network_type == RXE_NETWORK_TYPE_IPV6) {
 			struct in6_addr *saddr6;
 			struct in6_addr *daddr6;

···
 	if (IS_ERR(attr))
 		return NULL;

-	if (av->network_type == RDMA_NETWORK_IPV4)
+	if (av->network_type == RXE_NETWORK_TYPE_IPV4)
 		hdr_len = ETH_HLEN + sizeof(struct udphdr) +
 			sizeof(struct iphdr);
 	else
···
 	skb->dev = ndev;
 	rcu_read_unlock();

-	if (av->network_type == RDMA_NETWORK_IPV4)
+	if (av->network_type == RXE_NETWORK_TYPE_IPV4)
 		skb->protocol = htons(ETH_P_IP);
 	else
 		skb->protocol = htons(ETH_P_IPV6);
+1 -28
drivers/infiniband/sw/rxe/rxe_net.h
···
+/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */
 /*
  * Copyright (c) 2016 Mellanox Technologies Ltd. All rights reserved.
  * Copyright (c) 2015 System Fabric Works, Inc. All rights reserved.
- *
- * This software is available to you under a choice of one of two
- * licenses. You may choose to be licensed under the terms of the GNU
- * General Public License (GPL) Version 2, available from the file
- * COPYING in the main directory of this source tree, or the
- * OpenIB.org BSD license below:
- *
- *     Redistribution and use in source and binary forms, with or
- *     without modification, are permitted provided that the following
- *     conditions are met:
- *
- *	- Redistributions of source code must retain the above
- *	  copyright notice, this list of conditions and the following
- *	  disclaimer.
- *
- *	- Redistributions in binary form must reproduce the above
- *	  copyright notice, this list of conditions and the following
- *	  disclaimer in the documentation and/or other materials
- *	  provided with the distribution.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
- * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
- * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
- * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
- * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
- * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
- * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
- * SOFTWARE.
  */

 #ifndef RXE_NET_H
+1 -28
drivers/infiniband/sw/rxe/rxe_opcode.c
···
+// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
 /*
  * Copyright (c) 2016 Mellanox Technologies Ltd. All rights reserved.
  * Copyright (c) 2015 System Fabric Works, Inc. All rights reserved.
- *
- * This software is available to you under a choice of one of two
- * licenses. You may choose to be licensed under the terms of the GNU
- * General Public License (GPL) Version 2, available from the file
- * COPYING in the main directory of this source tree, or the
- * OpenIB.org BSD license below:
- *
- *     Redistribution and use in source and binary forms, with or
- *     without modification, are permitted provided that the following
- *     conditions are met:
- *
- *	- Redistributions of source code must retain the above
- *	  copyright notice, this list of conditions and the following
- *	  disclaimer.
- *
- *	- Redistributions in binary form must reproduce the above
- *	  copyright notice, this list of conditions and the following
- *	  disclaimer in the documentation and/or other materials
- *	  provided with the distribution.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
- * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
- * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
- * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
- * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
- * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
- * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
- * SOFTWARE.
  */

 #include <rdma/ib_pack.h>
+1 -28
drivers/infiniband/sw/rxe/rxe_opcode.h
··· 1 + /* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */ 1 2 /* 2 3 * Copyright (c) 2016 Mellanox Technologies Ltd. All rights reserved. 3 4 * Copyright (c) 2015 System Fabric Works, Inc. All rights reserved. 4 - * 5 - * This software is available to you under a choice of one of two 6 - * licenses. You may choose to be licensed under the terms of the GNU 7 - * General Public License (GPL) Version 2, available from the file 8 - * COPYING in the main directory of this source tree, or the 9 - * OpenIB.org BSD license below: 10 - * 11 - * Redistribution and use in source and binary forms, with or 12 - * without modification, are permitted provided that the following 13 - * conditions are met: 14 - * 15 - * - Redistributions of source code must retain the above 16 - * copyright notice, this list of conditions and the following 17 - * disclaimer. 18 - * 19 - * - Redistributions in binary form must reproduce the above 20 - * copyright notice, this list of conditions and the following 21 - * disclaimer in the documentation and/or other materials 22 - * provided with the distribution. 23 - * 24 - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, 25 - * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF 26 - * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND 27 - * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS 28 - * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN 29 - * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN 30 - * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 31 - * SOFTWARE. 32 5 */ 33 6 34 7 #ifndef RXE_OPCODE_H
+1 -28
drivers/infiniband/sw/rxe/rxe_param.h
··· 1 + /* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */ 1 2 /* 2 3 * Copyright (c) 2016 Mellanox Technologies Ltd. All rights reserved. 3 4 * Copyright (c) 2015 System Fabric Works, Inc. All rights reserved. 4 - * 5 - * This software is available to you under a choice of one of two 6 - * licenses. You may choose to be licensed under the terms of the GNU 7 - * General Public License (GPL) Version 2, available from the file 8 - * COPYING in the main directory of this source tree, or the 9 - * OpenIB.org BSD license below: 10 - * 11 - * Redistribution and use in source and binary forms, with or 12 - * without modification, are permitted provided that the following 13 - * conditions are met: 14 - * 15 - * - Redistributions of source code must retain the above 16 - * copyright notice, this list of conditions and the following 17 - * disclaimer. 18 - * 19 - * - Redistributions in binary form must reproduce the above 20 - * copyright notice, this list of conditions and the following 21 - * disclaimer in the documentation and/or other materials 22 - * provided with the distribution. 23 - * 24 - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, 25 - * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF 26 - * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND 27 - * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS 28 - * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN 29 - * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN 30 - * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 31 - * SOFTWARE. 32 5 */ 33 6 34 7 #ifndef RXE_PARAM_H
+3 -86
drivers/infiniband/sw/rxe/rxe_pool.c
··· 1 + // SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB 1 2 /* 2 3 * Copyright (c) 2016 Mellanox Technologies Ltd. All rights reserved. 3 4 * Copyright (c) 2015 System Fabric Works, Inc. All rights reserved. 4 - * 5 - * This software is available to you under a choice of one of two 6 - * licenses. You may choose to be licensed under the terms of the GNU 7 - * General Public License (GPL) Version 2, available from the file 8 - * COPYING in the main directory of this source tree, or the 9 - * OpenIB.org BSD license below: 10 - * 11 - * Redistribution and use in source and binary forms, with or 12 - * without modification, are permitted provided that the following 13 - * conditions are met: 14 - * 15 - * - Redistributions of source code must retain the above 16 - * copyright notice, this list of conditions and the following 17 - * disclaimer. 18 - * 19 - * - Redistributions in binary form must reproduce the above 20 - * copyright notice, this list of conditions and the following 21 - * disclaimer in the documentation and/or other materials 22 - * provided with the distribution. 23 - * 24 - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, 25 - * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF 26 - * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND 27 - * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS 28 - * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN 29 - * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN 30 - * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 31 - * SOFTWARE. 
32 5 */ 33 6 34 7 #include "rxe.h" ··· 81 108 static inline const char *pool_name(struct rxe_pool *pool) 82 109 { 83 110 return rxe_type_info[pool->type].name; 84 - } 85 - 86 - static inline struct kmem_cache *pool_cache(struct rxe_pool *pool) 87 - { 88 - return rxe_type_info[pool->type].cache; 89 - } 90 - 91 - static void rxe_cache_clean(size_t cnt) 92 - { 93 - int i; 94 - struct rxe_type_info *type; 95 - 96 - for (i = 0; i < cnt; i++) { 97 - type = &rxe_type_info[i]; 98 - if (!(type->flags & RXE_POOL_NO_ALLOC)) { 99 - kmem_cache_destroy(type->cache); 100 - type->cache = NULL; 101 - } 102 - } 103 - } 104 - 105 - int rxe_cache_init(void) 106 - { 107 - int err; 108 - int i; 109 - size_t size; 110 - struct rxe_type_info *type; 111 - 112 - for (i = 0; i < RXE_NUM_TYPES; i++) { 113 - type = &rxe_type_info[i]; 114 - size = ALIGN(type->size, RXE_POOL_ALIGN); 115 - if (!(type->flags & RXE_POOL_NO_ALLOC)) { 116 - type->cache = 117 - kmem_cache_create(type->name, size, 118 - RXE_POOL_ALIGN, 119 - RXE_POOL_CACHE_FLAGS, NULL); 120 - if (!type->cache) { 121 - pr_err("Unable to init kmem cache for %s\n", 122 - type->name); 123 - err = -ENOMEM; 124 - goto err1; 125 - } 126 - } 127 - } 128 - 129 - return 0; 130 - 131 - err1: 132 - rxe_cache_clean(i); 133 - 134 - return err; 135 - } 136 - 137 - void rxe_cache_exit(void) 138 - { 139 - rxe_cache_clean(RXE_NUM_TYPES); 140 111 } 141 112 142 113 static int rxe_pool_init_index(struct rxe_pool *pool, u32 max, u32 min) ··· 323 406 if (atomic_inc_return(&pool->num_elem) > pool->max_elem) 324 407 goto out_cnt; 325 408 326 - elem = kmem_cache_zalloc(pool_cache(pool), 409 + elem = kzalloc(rxe_type_info[pool->type].size, 327 410 (pool->flags & RXE_POOL_ATOMIC) ? 
328 411 GFP_ATOMIC : GFP_KERNEL); 329 412 if (!elem) ··· 385 468 pool->cleanup(elem); 386 469 387 470 if (!(pool->flags & RXE_POOL_NO_ALLOC)) 388 - kmem_cache_free(pool_cache(pool), elem); 471 + kfree(elem); 389 472 atomic_dec(&pool->num_elem); 390 473 ib_device_put(&pool->rxe->ib_dev); 391 474 rxe_pool_put(pool);
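The rxe_pool.c hunk above drops the per-type slab caches and allocates each element with a plain zeroing allocation sized from the type table, still bounded by the pool's element counter. A minimal userspace sketch of that shape, assuming calloc()/free() as stand-ins for kzalloc()/kfree(); the names `type_info`, `struct pool`, `pool_alloc` and `pool_free` and the sizes in the table are hypothetical, not the driver's code:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical per-type table; only the size matters once the
 * per-type cache pointer is gone. */
struct type_info {
	const char *name;
	size_t size;
};

static const struct type_info type_info[] = {
	{ "pd", 32 },	/* sizes are illustrative only */
	{ "qp", 256 },
};

/* Hypothetical pool: a type index plus a bounded element count. */
struct pool {
	int type;
	int num_elem;
	int max_elem;
};

/* Allocate one zeroed element, refusing once max_elem is reached. */
static void *pool_alloc(struct pool *p)
{
	void *elem;

	if (++p->num_elem > p->max_elem)
		goto out_cnt;

	/* calloc() stands in for kzalloc(size, GFP_*). */
	elem = calloc(1, type_info[p->type].size);
	if (!elem)
		goto out_cnt;

	return elem;

out_cnt:
	p->num_elem--;	/* undo the reservation taken above */
	return NULL;
}

/* Release an element; free() stands in for kfree(). */
static void pool_free(struct pool *p, void *elem)
{
	free(elem);
	p->num_elem--;
}
```

In this sketch the element size is looked up at allocation time, so no per-type setup or teardown step is needed, which mirrors the removal of rxe_cache_init() and rxe_cache_exit() in the hunk.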
+1 -35
drivers/infiniband/sw/rxe/rxe_pool.h
··· 1 + /* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */ 1 2 /* 2 3 * Copyright (c) 2016 Mellanox Technologies Ltd. All rights reserved. 3 4 * Copyright (c) 2015 System Fabric Works, Inc. All rights reserved. 4 - * 5 - * This software is available to you under a choice of one of two 6 - * licenses. You may choose to be licensed under the terms of the GNU 7 - * General Public License (GPL) Version 2, available from the file 8 - * COPYING in the main directory of this source tree, or the 9 - * OpenIB.org BSD license below: 10 - * 11 - * Redistribution and use in source and binary forms, with or 12 - * without modification, are permitted provided that the following 13 - * conditions are met: 14 - * 15 - * - Redistributions of source code must retain the above 16 - * copyright notice, this list of conditions and the following 17 - * disclaimer. 18 - * 19 - * - Redistributions in binary form must reproduce the above 20 - * copyright notice, this list of conditions and the following 21 - * disclaimer in the documentation and/or other materials 22 - * provided with the distribution. 23 - * 24 - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, 25 - * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF 26 - * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND 27 - * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS 28 - * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN 29 - * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN 30 - * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 31 - * SOFTWARE. 
32 5 */ 33 6 34 7 #ifndef RXE_POOL_H ··· 42 69 u32 min_index; 43 70 size_t key_offset; 44 71 size_t key_size; 45 - struct kmem_cache *cache; 46 72 }; 47 73 48 74 extern struct rxe_type_info rxe_type_info[]; ··· 84 112 size_t key_offset; 85 113 size_t key_size; 86 114 }; 87 - 88 - /* initialize slab caches for managed objects */ 89 - int rxe_cache_init(void); 90 - 91 - /* cleanup slab caches for managed objects */ 92 - void rxe_cache_exit(void); 93 115 94 116 /* initialize a pool of objects with given limit on 95 117 * number of elements. gets parameters from rxe_type_info
+2 -30
drivers/infiniband/sw/rxe/rxe_qp.c
··· 1 + // SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB 1 2 /* 2 3 * Copyright (c) 2016 Mellanox Technologies Ltd. All rights reserved. 3 4 * Copyright (c) 2015 System Fabric Works, Inc. All rights reserved. 4 - * 5 - * This software is available to you under a choice of one of two 6 - * licenses. You may choose to be licensed under the terms of the GNU 7 - * General Public License (GPL) Version 2, available from the file 8 - * COPYING in the main directory of this source tree, or the 9 - * OpenIB.org BSD license below: 10 - * 11 - * Redistribution and use in source and binary forms, with or 12 - * without modification, are permitted provided that the following 13 - * conditions are met: 14 - * 15 - * - Redistributions of source code must retain the above 16 - * copyright notice, this list of conditions and the following 17 - * disclaimer. 18 - * 19 - * - Redistributions in binary form must reproduce the above 20 - * copyright notice, this list of conditions and the following 21 - * disclaimer in the documentation and/or other materials 22 - * provided with the distribution. 23 - * 24 - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, 25 - * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF 26 - * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND 27 - * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS 28 - * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN 29 - * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN 30 - * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 31 - * SOFTWARE. 32 5 */ 33 6 34 7 #include <linux/skbuff.h> ··· 601 628 if (mask & IB_QP_QKEY) 602 629 qp->attr.qkey = attr->qkey; 603 630 604 - if (mask & IB_QP_AV) { 631 + if (mask & IB_QP_AV) 605 632 rxe_init_av(&attr->ah_attr, &qp->pri_av); 606 - } 607 633 608 634 if (mask & IB_QP_ALT_PATH) { 609 635 rxe_init_av(&attr->alt_ah_attr, &qp->alt_av);
+1 -28
drivers/infiniband/sw/rxe/rxe_queue.c
··· 1 + // SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB 1 2 /* 2 3 * Copyright (c) 2016 Mellanox Technologies Ltd. All rights reserved. 3 4 * Copyright (c) 2015 System Fabric Works, Inc. All rights reserved. 4 - * 5 - * This software is available to you under a choice of one of two 6 - * licenses. You may choose to be licensed under the terms of the GNU 7 - * General Public License (GPL) Version 2, available from the file 8 - * COPYING in the main directory of this source tree, or the 9 - * OpenIB.org BSD license below: 10 - * 11 - * Redistribution and use in source and binary forms, with or 12 - * without modification, are permitted provided that the following 13 - * conditions are met: 14 - * 15 - * - Redistributions of source code must retain the above 16 - * copyright notice, this list of conditions and the following 17 - * disclaimer. 18 - * 19 - * - Redistributions in binary form must reproduce the above 20 - * copyright notice, this list of conditions and the following 21 - * disclaimer in the documentation and/or other materials 22 - * provided with the distribution. 23 - * 24 - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, 25 - * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF 26 - * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND 27 - * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS 28 - * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN 29 - * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN 30 - * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 31 - * SOFTWARE. 32 5 */ 33 6 34 7 #include <linux/vmalloc.h>
+1 -28
drivers/infiniband/sw/rxe/rxe_queue.h
··· 1 + /* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */ 1 2 /* 2 3 * Copyright (c) 2016 Mellanox Technologies Ltd. All rights reserved. 3 4 * Copyright (c) 2015 System Fabric Works, Inc. All rights reserved. 4 - * 5 - * This software is available to you under a choice of one of two 6 - * licenses. You may choose to be licensed under the terms of the GNU 7 - * General Public License (GPL) Version 2, available from the file 8 - * COPYING in the main directory of this source tree, or the 9 - * OpenIB.org BSD license below: 10 - * 11 - * Redistribution and use in source and binary forms, with or 12 - * without modification, are permitted provided that the following 13 - * conditions are met: 14 - * 15 - * - Redistributions of source code must retain the above 16 - * copyright notice, this list of conditions and the following 17 - * disclaimer. 18 - * 19 - * - Redistributions in binary form must reproduce the above 20 - * copyright notice, this list of conditions and the following 21 - * disclaimer in the documentation and/or other materials 22 - * provided with the distribution. 23 - * 24 - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, 25 - * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF 26 - * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND 27 - * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS 28 - * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN 29 - * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN 30 - * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 31 - * SOFTWARE. 32 5 */ 33 6 34 7 #ifndef RXE_QUEUE_H
+32 -36
drivers/infiniband/sw/rxe/rxe_recv.c
··· 1 + // SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB 1 2 /* 2 3 * Copyright (c) 2016 Mellanox Technologies Ltd. All rights reserved. 3 4 * Copyright (c) 2015 System Fabric Works, Inc. All rights reserved. 4 - * 5 - * This software is available to you under a choice of one of two 6 - * licenses. You may choose to be licensed under the terms of the GNU 7 - * General Public License (GPL) Version 2, available from the file 8 - * COPYING in the main directory of this source tree, or the 9 - * OpenIB.org BSD license below: 10 - * 11 - * Redistribution and use in source and binary forms, with or 12 - * without modification, are permitted provided that the following 13 - * conditions are met: 14 - * 15 - * - Redistributions of source code must retain the above 16 - * copyright notice, this list of conditions and the following 17 - * disclaimer. 18 - * 19 - * - Redistributions in binary form must reproduce the above 20 - * copyright notice, this list of conditions and the following 21 - * disclaimer in the documentation and/or other materials 22 - * provided with the distribution. 23 - * 24 - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, 25 - * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF 26 - * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND 27 - * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS 28 - * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN 29 - * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN 30 - * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 31 - * SOFTWARE. 
32 5 */ 33 6 34 7 #include <linux/skbuff.h> ··· 233 260 struct rxe_mc_elem *mce; 234 261 struct rxe_qp *qp; 235 262 union ib_gid dgid; 263 + struct sk_buff *per_qp_skb; 264 + struct rxe_pkt_info *per_qp_pkt; 236 265 int err; 237 266 238 267 if (skb->protocol == htons(ETH_P_IP)) ··· 263 288 if (err) 264 289 continue; 265 290 266 - /* if *not* the last qp in the list 267 - * increase the users of the skb then post to the next qp 291 + /* for all but the last qp create a new clone of the 292 + * skb and pass to the qp. 268 293 */ 269 294 if (mce->qp_list.next != &mcg->qp_list) 270 - skb_get(skb); 295 + per_qp_skb = skb_clone(skb, GFP_ATOMIC); 296 + else 297 + per_qp_skb = skb; 271 298 272 - pkt->qp = qp; 299 + if (unlikely(!per_qp_skb)) 300 + continue; 301 + 302 + per_qp_pkt = SKB_TO_PKT(per_qp_skb); 303 + per_qp_pkt->qp = qp; 273 304 rxe_add_ref(qp); 274 - rxe_rcv_pkt(pkt, skb); 305 + rxe_rcv_pkt(per_qp_pkt, per_qp_skb); 275 306 } 276 307 277 308 spin_unlock_bh(&mcg->mcg_lock); 278 309 279 310 rxe_drop_ref(mcg); /* drop ref from rxe_pool_get_key. 
*/ 280 311 312 + return; 313 + 281 314 err1: 282 315 kfree_skb(skb); 283 316 } 284 317 285 - static int rxe_match_dgid(struct rxe_dev *rxe, struct sk_buff *skb) 318 + /** 319 + * rxe_chk_dgid - validate destination IP address 320 + * @rxe: rxe device that received packet 321 + * @skb: the received packet buffer 322 + * 323 + * Accept any loopback packets 324 + * Extract IP address from packet and 325 + * Accept if multicast packet 326 + * Accept if matches an SGID table entry 327 + */ 328 + static int rxe_chk_dgid(struct rxe_dev *rxe, struct sk_buff *skb) 286 329 { 287 330 struct rxe_pkt_info *pkt = SKB_TO_PKT(skb); 288 331 const struct ib_gid_attr *gid_attr; ··· 317 324 } else { 318 325 pdgid = (union ib_gid *)&ipv6_hdr(skb)->daddr; 319 326 } 327 + 328 + if (rdma_is_multicast_addr((struct in6_addr *)pdgid)) 329 + return 0; 320 330 321 331 gid_attr = rdma_find_gid_by_port(&rxe->ib_dev, pdgid, 322 332 IB_GID_TYPE_ROCE_UDP_ENCAP, ··· 345 349 if (unlikely(skb->len < pkt->offset + RXE_BTH_BYTES)) 346 350 goto drop; 347 351 348 - if (rxe_match_dgid(rxe, skb) < 0) { 349 - pr_warn_ratelimited("failed matching dgid\n"); 352 + if (rxe_chk_dgid(rxe, skb) < 0) { 353 + pr_warn_ratelimited("failed checking dgid\n"); 350 354 goto drop; 351 355 } 352 356
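The rxe_recv.c hunk above stops sharing one skb among all multicast receivers: every QP except the last gets its own clone, the last QP consumes the original, and a failed clone skips that QP. A minimal userspace sketch of that ownership pattern, assuming a heap-allocated `struct buf` in place of an sk_buff; `buf_clone`, `deliver` and `deliver_all` are hypothetical names, and the empty-list handling is this sketch's own choice rather than something taken from the hunk:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an sk_buff carrying per-receiver state. */
struct buf {
	int qp;		/* per-receiver field, like pkt->qp */
	char data[16];
};

/* Hypothetical stand-in for skb_clone(); returns NULL on failure. */
static struct buf *buf_clone(const struct buf *b)
{
	struct buf *c = malloc(sizeof(*c));

	if (c)
		memcpy(c, b, sizeof(*c));
	return c;
}

/* The receiver takes ownership of the buffer and frees it when done. */
static void deliver(struct buf *b, int *seen_qp)
{
	seen_qp[b->qp] = 1;
	free(b);
}

/* Hand one buffer to nqp receivers. All but the last get a private
 * clone; the last one consumes the original. seen_qp must have nqp
 * entries. Returns the number of successful deliveries. */
static int deliver_all(struct buf *orig, int nqp, int *seen_qp)
{
	int i, delivered = 0;

	if (nqp <= 0) {
		free(orig);	/* sketch's choice: nobody to take ownership */
		return 0;
	}

	for (i = 0; i < nqp; i++) {
		struct buf *per_qp = (i != nqp - 1) ? buf_clone(orig) : orig;

		if (!per_qp)
			continue;	/* clone failed: skip this receiver */

		per_qp->qp = i;	/* safe: this copy has a single owner */
		deliver(per_qp, seen_qp);
		delivered++;
	}

	return delivered;
}
```

Because each receiver writes its identity into its own copy, no receiver can observe a `qp` value that was set for another, which is the hazard of bumping a reference count on a single shared buffer.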
+3 -30
drivers/infiniband/sw/rxe/rxe_req.c
··· 1 + // SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB 1 2 /* 2 3 * Copyright (c) 2016 Mellanox Technologies Ltd. All rights reserved. 3 4 * Copyright (c) 2015 System Fabric Works, Inc. All rights reserved. 4 - * 5 - * This software is available to you under a choice of one of two 6 - * licenses. You may choose to be licensed under the terms of the GNU 7 - * General Public License (GPL) Version 2, available from the file 8 - * COPYING in the main directory of this source tree, or the 9 - * OpenIB.org BSD license below: 10 - * 11 - * Redistribution and use in source and binary forms, with or 12 - * without modification, are permitted provided that the following 13 - * conditions are met: 14 - * 15 - * - Redistributions of source code must retain the above 16 - * copyright notice, this list of conditions and the following 17 - * disclaimer. 18 - * 19 - * - Redistributions in binary form must reproduce the above 20 - * copyright notice, this list of conditions and the following 21 - * disclaimer in the documentation and/or other materials 22 - * provided with the distribution. 23 - * 24 - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, 25 - * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF 26 - * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND 27 - * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS 28 - * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN 29 - * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN 30 - * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 31 - * SOFTWARE. 
32 5 */ 33 6 34 7 #include <linux/skbuff.h> ··· 617 644 618 645 rmr->state = RXE_MEM_STATE_VALID; 619 646 rmr->access = wqe->wr.wr.reg.access; 620 - rmr->lkey = wqe->wr.wr.reg.key; 621 - rmr->rkey = wqe->wr.wr.reg.key; 647 + rmr->ibmr.lkey = wqe->wr.wr.reg.key; 648 + rmr->ibmr.rkey = wqe->wr.wr.reg.key; 622 649 rmr->iova = wqe->wr.wr.reg.mr->iova; 623 650 wqe->state = wqe_state_done; 624 651 wqe->status = IB_WC_SUCCESS;
+1 -28
drivers/infiniband/sw/rxe/rxe_resp.c
··· 1 + // SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB 1 2 /* 2 3 * Copyright (c) 2016 Mellanox Technologies Ltd. All rights reserved. 3 4 * Copyright (c) 2015 System Fabric Works, Inc. All rights reserved. 4 - * 5 - * This software is available to you under a choice of one of two 6 - * licenses. You may choose to be licensed under the terms of the GNU 7 - * General Public License (GPL) Version 2, available from the file 8 - * COPYING in the main directory of this source tree, or the 9 - * OpenIB.org BSD license below: 10 - * 11 - * Redistribution and use in source and binary forms, with or 12 - * without modification, are permitted provided that the following 13 - * conditions are met: 14 - * 15 - * - Redistributions of source code must retain the above 16 - * copyright notice, this list of conditions and the following 17 - * disclaimer. 18 - * 19 - * - Redistributions in binary form must reproduce the above 20 - * copyright notice, this list of conditions and the following 21 - * disclaimer in the documentation and/or other materials 22 - * provided with the distribution. 23 - * 24 - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, 25 - * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF 26 - * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND 27 - * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS 28 - * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN 29 - * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN 30 - * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 31 - * SOFTWARE. 32 5 */ 33 6 34 7 #include <linux/skbuff.h>
+1 -28
drivers/infiniband/sw/rxe/rxe_srq.c
··· 1 + // SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB 1 2 /* 2 3 * Copyright (c) 2016 Mellanox Technologies Ltd. All rights reserved. 3 4 * Copyright (c) 2015 System Fabric Works, Inc. All rights reserved. 4 - * 5 - * This software is available to you under a choice of one of two 6 - * licenses. You may choose to be licensed under the terms of the GNU 7 - * General Public License (GPL) Version 2, available from the file 8 - * COPYING in the main directory of this source tree, or the 9 - * OpenIB.org BSD license below: 10 - * 11 - * Redistribution and use in source and binary forms, with or 12 - * without modification, are permitted provided that the following 13 - * conditions are met: 14 - * 15 - * - Redistributions of source code must retain the above 16 - * copyright notice, this list of conditions and the following 17 - * disclaimer. 18 - * 19 - * - Redistributions in binary form must reproduce the above 20 - * copyright notice, this list of conditions and the following 21 - * disclaimer in the documentation and/or other materials 22 - * provided with the distribution. 23 - * 24 - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, 25 - * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF 26 - * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND 27 - * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS 28 - * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN 29 - * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN 30 - * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 31 - * SOFTWARE. 32 5 */ 33 6 34 7 #include <linux/vmalloc.h>
+7 -28
drivers/infiniband/sw/rxe/rxe_sysfs.c
··· 1 + // SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB 1 2 /* 2 3 * Copyright (c) 2016 Mellanox Technologies Ltd. All rights reserved. 3 4 * Copyright (c) 2015 System Fabric Works, Inc. All rights reserved. 4 - * 5 - * This software is available to you under a choice of one of two 6 - * licenses. You may choose to be licensed under the terms of the GNU 7 - * General Public License (GPL) Version 2, available from the file 8 - * COPYING in the main directory of this source tree, or the 9 - * OpenIB.org BSD license below: 10 - * 11 - * Redistribution and use in source and binary forms, with or 12 - * without modification, are permitted provided that the following 13 - * conditions are met: 14 - * 15 - * - Redistributions of source code must retain the above 16 - * copyright notice, this list of conditions and the following 17 - * disclaimer. 18 - * 19 - * - Redistributions in binary form must reproduce the above 20 - * copyright notice, this list of conditions and the following 21 - * disclaimer in the documentation and/or other materials 22 - * provided with the distribution. 23 - * 24 - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, 25 - * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF 26 - * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND 27 - * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS 28 - * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN 29 - * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN 30 - * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 31 - * SOFTWARE. 32 5 */ 33 6 34 7 #include "rxe.h" ··· 49 76 if (!ndev) { 50 77 pr_err("interface %s not found\n", intf); 51 78 return -EINVAL; 79 + } 80 + 81 + if (is_vlan_dev(ndev)) { 82 + pr_err("rxe creation allowed on top of a real device only\n"); 83 + err = -EPERM; 84 + goto err; 52 85 } 53 86 54 87 exists = rxe_get_dev_from_net(ndev);
+5 -32
drivers/infiniband/sw/rxe/rxe_task.c
··· 1 + // SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB 1 2 /* 2 3 * Copyright (c) 2016 Mellanox Technologies Ltd. All rights reserved. 3 4 * Copyright (c) 2015 System Fabric Works, Inc. All rights reserved. 4 - * 5 - * This software is available to you under a choice of one of two 6 - * licenses. You may choose to be licensed under the terms of the GNU 7 - * General Public License (GPL) Version 2, available from the file 8 - * COPYING in the main directory of this source tree, or the 9 - * OpenIB.org BSD license below: 10 - * 11 - * Redistribution and use in source and binary forms, with or 12 - * without modification, are permitted provided that the following 13 - * conditions are met: 14 - * 15 - * - Redistributions of source code must retain the above 16 - * copyright notice, this list of conditions and the following 17 - * disclaimer. 18 - * 19 - * - Redistributions in binary form must reproduce the above 20 - * copyright notice, this list of conditions and the following 21 - * disclaimer in the documentation and/or other materials 22 - * provided with the distribution. 23 - * 24 - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, 25 - * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF 26 - * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND 27 - * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS 28 - * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN 29 - * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN 30 - * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 31 - * SOFTWARE. 
32 5 */ 33 6 34 7 #include <linux/kernel.h> ··· 28 55 * a second caller finds the task already running 29 56 * but looks just after the last call to func 30 57 */ 31 - void rxe_do_task(unsigned long data) 58 + void rxe_do_task(struct tasklet_struct *t) 32 59 { 33 60 int cont; 34 61 int ret; 35 62 unsigned long flags; 36 - struct rxe_task *task = (struct rxe_task *)data; 63 + struct rxe_task *task = from_tasklet(task, t, tasklet); 37 64 38 65 spin_lock_irqsave(&task->state_lock, flags); 39 66 switch (task->state) { ··· 96 123 snprintf(task->name, sizeof(task->name), "%s", name); 97 124 task->destroyed = false; 98 125 99 - tasklet_init(&task->tasklet, rxe_do_task, (unsigned long)task); 126 + tasklet_setup(&task->tasklet, rxe_do_task); 100 127 101 128 task->state = TASK_STATE_START; 102 129 spin_lock_init(&task->state_lock); ··· 132 159 if (sched) 133 160 tasklet_schedule(&task->tasklet); 134 161 else 135 - rxe_do_task((unsigned long)task); 162 + rxe_do_task(&task->tasklet); 136 163 } 137 164 138 165 void rxe_disable_task(struct rxe_task *task)
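The rxe_task.c hunk above converts from tasklet_init(), which passes an opaque unsigned long to the callback, to tasklet_setup(), whose callback receives the tasklet pointer and recovers the enclosing object via from_tasklet(), a container_of() wrapper. A minimal userspace sketch of that pattern, assuming stand-in types; `fake_tasklet`, `fake_task`, `fake_tasklet_setup`, `fake_do_task` and `run_tasklet` are hypothetical names, not kernel APIs:

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's container_of(): recover the
 * enclosing struct from a pointer to one of its members. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-in for struct tasklet_struct. */
struct fake_tasklet {
	void (*callback)(struct fake_tasklet *t);
};

/* Hypothetical stand-in for struct rxe_task: the tasklet is embedded. */
struct fake_task {
	int runs;
	struct fake_tasklet tasklet;
};

/* New-style callback: takes the tasklet pointer, not an unsigned long,
 * so no cast from an integer is needed to find the task. */
static void fake_do_task(struct fake_tasklet *t)
{
	struct fake_task *task = container_of(t, struct fake_task, tasklet);

	task->runs++;
}

/* Mirrors the shape of tasklet_setup(): no separate data argument. */
static void fake_tasklet_setup(struct fake_tasklet *t,
			       void (*callback)(struct fake_tasklet *))
{
	t->callback = callback;
}

/* Invoke the callback the way a scheduler would. */
static void run_tasklet(struct fake_tasklet *t)
{
	t->callback(t);
}
```

A direct caller passes the address of the embedded member, as in `fake_do_task(&task.tasklet)`, which corresponds to the `rxe_do_task(&task->tasklet)` call in the hunk.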
+3 -30
drivers/infiniband/sw/rxe/rxe_task.h
··· 1 + /* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */ 1 2 /* 2 3 * Copyright (c) 2016 Mellanox Technologies Ltd. All rights reserved. 3 4 * Copyright (c) 2015 System Fabric Works, Inc. All rights reserved. 4 - * 5 - * This software is available to you under a choice of one of two 6 - * licenses. You may choose to be licensed under the terms of the GNU 7 - * General Public License (GPL) Version 2, available from the file 8 - * COPYING in the main directory of this source tree, or the 9 - * OpenIB.org BSD license below: 10 - * 11 - * Redistribution and use in source and binary forms, with or 12 - * without modification, are permitted provided that the following 13 - * conditions are met: 14 - * 15 - * - Redistributions of source code must retain the above 16 - * copyright notice, this list of conditions and the following 17 - * disclaimer. 18 - * 19 - * - Redistributions in binary form must reproduce the above 20 - * copyright notice, this list of conditions and the following 21 - * disclaimer in the documentation and/or other materials 22 - * provided with the distribution. 23 - * 24 - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, 25 - * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF 26 - * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND 27 - * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS 28 - * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN 29 - * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN 30 - * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 31 - * SOFTWARE. 
32 5 */ 33 6 34 7 #ifndef RXE_TASK_H ··· 33 60 /* 34 61 * init rxe_task structure 35 62 * arg => parameter to pass to fcn 36 - * fcn => function to call until it returns != 0 63 + * func => function to call until it returns != 0 37 64 */ 38 65 int rxe_init_task(void *obj, struct rxe_task *task, 39 66 void *arg, int (*func)(void *), char *name); ··· 53 80 * work to do someone must reschedule the task before 54 81 * leaving 55 82 */ 56 - void rxe_do_task(unsigned long data); 83 + void rxe_do_task(struct tasklet_struct *t); 57 84 58 85 /* run a task, else schedule it to run as a tasklet, The decision 59 86 * to run or schedule tasklet is based on the parameter sched.
+13 -39
drivers/infiniband/sw/rxe/rxe_verbs.c
···
+// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
 /*
  * Copyright (c) 2016 Mellanox Technologies Ltd. All rights reserved.
  * Copyright (c) 2015 System Fabric Works, Inc. All rights reserved.
- *
- * This software is available to you under a choice of one of two
- * licenses. You may choose to be licensed under the terms of the GNU
- * General Public License (GPL) Version 2, available from the file
- * COPYING in the main directory of this source tree, or the
- * OpenIB.org BSD license below:
- *
- *     Redistribution and use in source and binary forms, with or
- *     without modification, are permitted provided that the following
- *     conditions are met:
- *
- *	- Redistributions of source code must retain the above
- *	  copyright notice, this list of conditions and the following
- *	  disclaimer.
- *
- *	- Redistributions in binary form must reproduce the above
- *	  copyright notice, this list of conditions and the following
- *	  disclaimer in the documentation and/or other materials
- *	  provided with the distribution.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
- * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
- * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
- * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
- * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
- * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
- * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
- * SOFTWARE.
  */
 
 #include <linux/dma-mapping.h>
···
 	return rxe_add_to_pool(&rxe->pd_pool, &pd->pelem);
 }
 
-static void rxe_dealloc_pd(struct ib_pd *ibpd, struct ib_udata *udata)
+static int rxe_dealloc_pd(struct ib_pd *ibpd, struct ib_udata *udata)
 {
 	struct rxe_pd *pd = to_rpd(ibpd);
 
 	rxe_drop_ref(pd);
+	return 0;
 }
 
 static int rxe_create_ah(struct ib_ah *ibah,
···
 	return 0;
 }
 
-static void rxe_destroy_ah(struct ib_ah *ibah, u32 flags)
+static int rxe_destroy_ah(struct ib_ah *ibah, u32 flags)
 {
 	struct rxe_ah *ah = to_rah(ibah);
 
 	rxe_drop_ref(ah);
+	return 0;
 }
 
 static int post_one_recv(struct rxe_rq *rq, const struct ib_recv_wr *ibwr)
···
 	return 0;
 }
 
-static void rxe_destroy_srq(struct ib_srq *ibsrq, struct ib_udata *udata)
+static int rxe_destroy_srq(struct ib_srq *ibsrq, struct ib_udata *udata)
 {
 	struct rxe_srq *srq = to_rsrq(ibsrq);
 
···
 
 	rxe_drop_ref(srq->pd);
 	rxe_drop_ref(srq);
+	return 0;
 }
 
 static int rxe_post_srq_recv(struct ib_srq *ibsrq, const struct ib_recv_wr *wr,
···
 	return rxe_add_to_pool(&rxe->cq_pool, &cq->pelem);
 }
 
-static void rxe_destroy_cq(struct ib_cq *ibcq, struct ib_udata *udata)
+static int rxe_destroy_cq(struct ib_cq *ibcq, struct ib_udata *udata)
 {
 	struct rxe_cq *cq = to_rcq(ibcq);
 
 	rxe_cq_disable(cq);
 
 	rxe_drop_ref(cq);
+	return 0;
 }
 
 static int rxe_resize_cq(struct ib_cq *ibcq, int cqe, struct ib_udata *udata)
···
 	struct rxe_mem *mr = to_rmr(ibmr);
 
 	mr->state = RXE_MEM_STATE_ZOMBIE;
-	rxe_drop_ref(mr->pd);
+	rxe_drop_ref(mr_pd(mr));
 	rxe_drop_index(mr);
 	rxe_drop_ref(mr);
 	return 0;
···
 	dev->local_dma_lkey = 0;
 	addrconf_addr_eui48((unsigned char *)&dev->node_guid,
 			    rxe->ndev->dev_addr);
-	dev->dev.dma_ops = &dma_virt_ops;
 	dev->dev.dma_parms = &rxe->dma_parms;
-	rxe->dma_parms = (struct device_dma_parameters)
-		{ .max_segment_size = SZ_2G };
-	dma_coerce_mask_and_coherent(&dev->dev,
-				     dma_get_required_mask(&dev->dev));
+	dma_set_max_seg_size(&dev->dev, UINT_MAX);
+	dma_set_coherent_mask(&dev->dev, dma_get_required_mask(&dev->dev));
 
 	dev->uverbs_cmd_mask = BIT_ULL(IB_USER_VERBS_CMD_GET_CONTEXT)
 	    | BIT_ULL(IB_USER_VERBS_CMD_CREATE_COMP_CHANNEL)
···
 	rxe->tfm = tfm;
 
 	rdma_set_device_sysfs_group(dev, &rxe_attr_group);
-	err = ib_register_device(dev, ibdev_name);
+	err = ib_register_device(dev, ibdev_name, NULL);
 	if (err)
 		pr_warn("%s failed with error %d\n", __func__, err);
 
+16 -32
drivers/infiniband/sw/rxe/rxe_verbs.h
···
+/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */
 /*
  * Copyright (c) 2016 Mellanox Technologies Ltd. All rights reserved.
  * Copyright (c) 2015 System Fabric Works, Inc. All rights reserved.
- *
- * This software is available to you under a choice of one of two
- * licenses. You may choose to be licensed under the terms of the GNU
- * General Public License (GPL) Version 2, available from the file
- * COPYING in the main directory of this source tree, or the
- * OpenIB.org BSD license below:
- *
- *     Redistribution and use in source and binary forms, with or
- *     without modification, are permitted provided that the following
- *     conditions are met:
- *
- *	- Redistributions of source code must retain the above
- *	  copyright notice, this list of conditions and the following
- *	  disclaimer.
- *
- *	- Redistributions in binary form must reproduce the above
- *	  copyright notice, this list of conditions and the following
- *	  disclaimer in the documentation and/or other materials
- *	  provided with the distribution.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
- * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
- * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
- * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
- * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
- * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
- * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
- * SOFTWARE.
  */
 
 #ifndef RXE_VERBS_H
···
 		struct ib_mw		ibmw;
 	};
 
-	struct rxe_pd		*pd;
 	struct ib_umem		*umem;
-
-	u32			lkey;
-	u32			rkey;
 
 	enum rxe_mem_state	state;
 	enum rxe_mem_type	type;
···
 static inline struct rxe_mem *to_rmw(struct ib_mw *mw)
 {
 	return mw ? container_of(mw, struct rxe_mem, ibmw) : NULL;
+}
+
+static inline struct rxe_pd *mr_pd(struct rxe_mem *mr)
+{
+	return to_rpd(mr->ibmr.pd);
+}
+
+static inline u32 mr_lkey(struct rxe_mem *mr)
+{
+	return mr->ibmr.lkey;
+}
+
+static inline u32 mr_rkey(struct rxe_mem *mr)
+{
+	return mr->ibmr.rkey;
 }
 
 int rxe_register_device(struct rxe_dev *rxe, const char *ibdev_name);
+4 -4
drivers/infiniband/sw/siw/siw_main.c
···
 
 	sdev->vendor_part_id = dev_id++;
 
-	rv = ib_register_device(base_dev, name);
+	rv = ib_register_device(base_dev, name, NULL);
 	if (rv) {
 		pr_warn("siw: device registration error %d\n", rv);
 		return rv;
···
 	 */
 	base_dev->phys_port_cnt = 1;
 	base_dev->dev.parent = parent;
-	base_dev->dev.dma_ops = &dma_virt_ops;
 	base_dev->dev.dma_parms = &sdev->dma_parms;
-	sdev->dma_parms = (struct device_dma_parameters)
-				{ .max_segment_size = SZ_2G };
+	dma_set_max_seg_size(&base_dev->dev, UINT_MAX);
+	dma_set_coherent_mask(&base_dev->dev,
+			      dma_get_required_mask(&base_dev->dev));
 	base_dev->num_comp_vectors = num_possible_cpus();
 
 	xa_init_flags(&sdev->qp_xa, XA_FLAGS_ALLOC1);
+6 -3
drivers/infiniband/sw/siw/siw_verbs.c
···
 	return 0;
 }
 
-void siw_dealloc_pd(struct ib_pd *pd, struct ib_udata *udata)
+int siw_dealloc_pd(struct ib_pd *pd, struct ib_udata *udata)
 {
 	struct siw_device *sdev = to_siw_dev(pd->device);
 
 	siw_dbg_pd(pd, "free PD\n");
 	atomic_dec(&sdev->num_pd);
+	return 0;
 }
 
 void siw_qp_get_ref(struct ib_qp *base_qp)
···
 	return rv > 0 ? 0 : rv;
 }
 
-void siw_destroy_cq(struct ib_cq *base_cq, struct ib_udata *udata)
+int siw_destroy_cq(struct ib_cq *base_cq, struct ib_udata *udata)
 {
 	struct siw_cq *cq = to_siw_cq(base_cq);
 	struct siw_device *sdev = to_siw_dev(base_cq->device);
···
 	atomic_dec(&sdev->num_cq);
 
 	vfree(cq->queue);
+	return 0;
 }
 
 /*
···
  * QP anymore - the code trusts the RDMA core environment to keep track
  * of QP references.
  */
-void siw_destroy_srq(struct ib_srq *base_srq, struct ib_udata *udata)
+int siw_destroy_srq(struct ib_srq *base_srq, struct ib_udata *udata)
 {
 	struct siw_srq *srq = to_siw_srq(base_srq);
 	struct siw_device *sdev = to_siw_dev(base_srq->device);
···
 		rdma_user_mmap_entry_remove(srq->srq_entry);
 	vfree(srq->recvq);
 	atomic_dec(&sdev->num_srq);
+	return 0;
 }
 
 /*
+3 -3
drivers/infiniband/sw/siw/siw_verbs.h
···
 int siw_query_gid(struct ib_device *base_dev, u8 port, int idx,
 		  union ib_gid *gid);
 int siw_alloc_pd(struct ib_pd *base_pd, struct ib_udata *udata);
-void siw_dealloc_pd(struct ib_pd *base_pd, struct ib_udata *udata);
+int siw_dealloc_pd(struct ib_pd *base_pd, struct ib_udata *udata);
 struct ib_qp *siw_create_qp(struct ib_pd *base_pd,
 			    struct ib_qp_init_attr *attr,
 			    struct ib_udata *udata);
···
 		  const struct ib_send_wr **bad_wr);
 int siw_post_receive(struct ib_qp *base_qp, const struct ib_recv_wr *wr,
 		     const struct ib_recv_wr **bad_wr);
-void siw_destroy_cq(struct ib_cq *base_cq, struct ib_udata *udata);
+int siw_destroy_cq(struct ib_cq *base_cq, struct ib_udata *udata);
 int siw_poll_cq(struct ib_cq *base_cq, int num_entries, struct ib_wc *wc);
 int siw_req_notify_cq(struct ib_cq *base_cq, enum ib_cq_notify_flags flags);
 struct ib_mr *siw_reg_user_mr(struct ib_pd *base_pd, u64 start, u64 len,
···
 int siw_modify_srq(struct ib_srq *base_srq, struct ib_srq_attr *attr,
 		   enum ib_srq_attr_mask mask, struct ib_udata *udata);
 int siw_query_srq(struct ib_srq *base_srq, struct ib_srq_attr *attr);
-void siw_destroy_srq(struct ib_srq *base_srq, struct ib_udata *udata);
+int siw_destroy_srq(struct ib_srq *base_srq, struct ib_udata *udata);
 int siw_post_srq_recv(struct ib_srq *base_srq, const struct ib_recv_wr *wr,
 		      const struct ib_recv_wr **bad_wr);
 int siw_mmap(struct ib_ucontext *ctx, struct vm_area_struct *vma);
+1 -5
drivers/infiniband/ulp/ipoib/ipoib_cm.c
···
 void ipoib_cm_dev_cleanup(struct net_device *dev)
 {
 	struct ipoib_dev_priv *priv = ipoib_priv(dev);
-	int ret;
 
 	if (!priv->cm.srq)
 		return;
 
 	ipoib_dbg(priv, "Cleanup ipoib connected mode.\n");
 
-	ret = ib_destroy_srq(priv->cm.srq);
-	if (ret)
-		ipoib_warn(priv, "ib_destroy_srq failed: %d\n", ret);
-
+	ib_destroy_srq(priv->cm.srq);
 	priv->cm.srq = NULL;
 	if (!priv->cm.srq_ring)
 		return;
+4 -46
drivers/infiniband/ulp/ipoib/ipoib_fs.c
···
 	return 0;
 }
 
-static const struct seq_operations ipoib_mcg_seq_ops = {
+static const struct seq_operations ipoib_mcg_sops = {
 	.start = ipoib_mcg_seq_start,
 	.next = ipoib_mcg_seq_next,
 	.stop = ipoib_mcg_seq_stop,
 	.show = ipoib_mcg_seq_show,
 };
 
-static int ipoib_mcg_open(struct inode *inode, struct file *file)
-{
-	struct seq_file *seq;
-	int ret;
-
-	ret = seq_open(file, &ipoib_mcg_seq_ops);
-	if (ret)
-		return ret;
-
-	seq = file->private_data;
-	seq->private = inode->i_private;
-
-	return 0;
-}
-
-static const struct file_operations ipoib_mcg_fops = {
-	.owner = THIS_MODULE,
-	.open = ipoib_mcg_open,
-	.read = seq_read,
-	.llseek = seq_lseek,
-	.release = seq_release
-};
+DEFINE_SEQ_ATTRIBUTE(ipoib_mcg);
 
 static void *ipoib_path_seq_start(struct seq_file *file, loff_t *pos)
 {
···
 	return 0;
 }
 
-static const struct seq_operations ipoib_path_seq_ops = {
+static const struct seq_operations ipoib_path_sops = {
 	.start = ipoib_path_seq_start,
 	.next = ipoib_path_seq_next,
 	.stop = ipoib_path_seq_stop,
 	.show = ipoib_path_seq_show,
 };
 
-static int ipoib_path_open(struct inode *inode, struct file *file)
-{
-	struct seq_file *seq;
-	int ret;
-
-	ret = seq_open(file, &ipoib_path_seq_ops);
-	if (ret)
-		return ret;
-
-	seq = file->private_data;
-	seq->private = inode->i_private;
-
-	return 0;
-}
-
-static const struct file_operations ipoib_path_fops = {
-	.owner = THIS_MODULE,
-	.open = ipoib_path_open,
-	.read = seq_read,
-	.llseek = seq_lseek,
-	.release = seq_release
-};
+DEFINE_SEQ_ATTRIBUTE(ipoib_path);
 
 void ipoib_create_debug_files(struct net_device *dev)
 {
+2
drivers/infiniband/ulp/ipoib/ipoib_main.c
···
 	/* call event handler to ensure pkey in sync */
 	queue_work(ipoib_workqueue, &priv->flush_heavy);
 
+	ndev->rtnl_link_ops = ipoib_get_link_ops();
+
 	result = register_netdev(ndev);
 	if (result) {
 		pr_warn("%s: couldn't register ipoib port %d; error %d\n",
+11
drivers/infiniband/ulp/ipoib/ipoib_netlink.c
···
 	return 0;
 }
 
+static void ipoib_del_child_link(struct net_device *dev, struct list_head *head)
+{
+	struct ipoib_dev_priv *priv = ipoib_priv(dev);
+
+	if (!priv->parent)
+		return;
+
+	unregister_netdevice_queue(dev, head);
+}
+
 static size_t ipoib_get_size(const struct net_device *dev)
 {
 	return nla_total_size(2) +	/* IFLA_IPOIB_PKEY */
···
 	.priv_size	= sizeof(struct ipoib_dev_priv),
 	.setup		= ipoib_setup_common,
 	.newlink	= ipoib_new_child_link,
+	.dellink	= ipoib_del_child_link,
 	.changelink	= ipoib_changelink,
 	.get_size	= ipoib_get_size,
 	.fill_info	= ipoib_fill_info,
+2
drivers/infiniband/ulp/ipoib/ipoib_vlan.c
···
 	}
 	priv = ipoib_priv(ndev);
 
+	ndev->rtnl_link_ops = ipoib_get_link_ops();
+
 	result = __ipoib_vlan_add(ppriv, priv, pkey, IPOIB_LEGACY_CHILD);
 
 	if (result && ndev->reg_state == NETREG_UNINITIALIZED)
+3 -12
drivers/infiniband/ulp/isert/ib_isert.c
···
 	 * multiple data-outs on the same command can arrive -
 	 * so post the buffer before hand
 	 */
-	rc = isert_post_recv(isert_conn, rx_desc);
-	if (rc) {
-		isert_err("ib_post_recv failed with %d\n", rc);
-		return rc;
-	}
-	return 0;
+	return isert_post_recv(isert_conn, rx_desc);
 }
 
 static int
···
 	int ret;
 
 	ret = isert_post_recv(isert_conn, isert_cmd->rx_desc);
-	if (ret) {
-		isert_err("ib_post_recv failed with %d\n", ret);
+	if (ret)
 		return ret;
-	}
 
 	ret = ib_post_send(isert_conn->qp, &isert_cmd->tx_desc.send_wr, NULL);
 	if (ret) {
···
 				   &isert_cmd->tx_desc.send_wr);
 
 		rc = isert_post_recv(isert_conn, isert_cmd->rx_desc);
-		if (rc) {
-			isert_err("ib_post_recv failed with %d\n", rc);
+		if (rc)
 			return rc;
-		}
 
 		chain_wr = &isert_cmd->tx_desc.send_wr;
 	}
+3 -3
drivers/infiniband/ulp/rtrs/rtrs-clt-sysfs.c
···
 	NULL
 };
 
-static struct attribute_group rtrs_clt_stats_attr_group = {
+static const struct attribute_group rtrs_clt_stats_attr_group = {
 	.attrs = rtrs_clt_stats_attrs,
 };
 
···
 	NULL,
 };
 
-static struct attribute_group rtrs_clt_sess_attr_group = {
+static const struct attribute_group rtrs_clt_sess_attr_group = {
 	.attrs = rtrs_clt_sess_attrs,
 };
 
···
 	NULL,
 };
 
-static struct attribute_group rtrs_clt_attr_group = {
+static const struct attribute_group rtrs_clt_attr_group = {
 	.attrs = rtrs_clt_attrs,
 };
 
-1
drivers/infiniband/ulp/rtrs/rtrs-pri.h
···
 
 /* rtrs information unit */
 struct rtrs_iu {
-	struct list_head	list;
 	struct ib_cqe		cqe;
 	dma_addr_t		dma_addr;
 	void			*buf;
+2 -2
drivers/infiniband/ulp/rtrs/rtrs-srv-sysfs.c
···
 	NULL,
 };
 
-static struct attribute_group rtrs_srv_sess_attr_group = {
+static const struct attribute_group rtrs_srv_sess_attr_group = {
 	.attrs = rtrs_srv_sess_attrs,
 };
 
···
 	NULL,
 };
 
-static struct attribute_group rtrs_srv_stats_attr_group = {
+static const struct attribute_group rtrs_srv_stats_attr_group = {
 	.attrs = rtrs_srv_stats_attrs,
 };
 
+73 -3
drivers/infiniband/ulp/rtrs/rtrs-srv.c
···
 #include "rtrs-srv.h"
 #include "rtrs-log.h"
 #include <rdma/ib_cm.h>
+#include <rdma/ib_verbs.h>
 
 MODULE_DESCRIPTION("RDMA Transport Server");
 MODULE_LICENSE("GPL");
···
 static struct rtrs_rdma_dev_pd dev_pd;
 static mempool_t *chunk_pool;
 struct class *rtrs_dev_class;
+static struct rtrs_srv_ib_ctx ib_ctx;
 
 static int __read_mostly max_chunk_size = DEFAULT_MAX_CHUNK_SIZE;
 static int __read_mostly sess_queue_depth = DEFAULT_SESS_QUEUE_DEPTH;
···
 	kfree(ctx);
 }
 
+static int rtrs_srv_add_one(struct ib_device *device)
+{
+	struct rtrs_srv_ctx *ctx;
+	int ret = 0;
+
+	mutex_lock(&ib_ctx.ib_dev_mutex);
+	if (ib_ctx.ib_dev_count)
+		goto out;
+
+	/*
+	 * Since our CM IDs are NOT bound to any ib device we will create them
+	 * only once
+	 */
+	ctx = ib_ctx.srv_ctx;
+	ret = rtrs_srv_rdma_init(ctx, ib_ctx.port);
+	if (ret) {
+		/*
+		 * We errored out here.
+		 * According to the ib code, if we encounter an error here then the
+		 * error code is ignored, and no more calls to our ops are made.
+		 */
+		pr_err("Failed to initialize RDMA connection");
+		goto err_out;
+	}
+
+out:
+	/*
+	 * Keep a track on the number of ib devices added
+	 */
+	ib_ctx.ib_dev_count++;
+
+err_out:
+	mutex_unlock(&ib_ctx.ib_dev_mutex);
+	return ret;
+}
+
+static void rtrs_srv_remove_one(struct ib_device *device, void *client_data)
+{
+	struct rtrs_srv_ctx *ctx;
+
+	mutex_lock(&ib_ctx.ib_dev_mutex);
+	ib_ctx.ib_dev_count--;
+
+	if (ib_ctx.ib_dev_count)
+		goto out;
+
+	/*
+	 * Since our CM IDs are NOT bound to any ib device we will remove them
+	 * only once, when the last device is removed
+	 */
+	ctx = ib_ctx.srv_ctx;
+	rdma_destroy_id(ctx->cm_id_ip);
+	rdma_destroy_id(ctx->cm_id_ib);
+
+out:
+	mutex_unlock(&ib_ctx.ib_dev_mutex);
+}
+
+static struct ib_client rtrs_srv_client = {
+	.name	= "rtrs_server",
+	.add	= rtrs_srv_add_one,
+	.remove	= rtrs_srv_remove_one
+};
+
 /**
  * rtrs_srv_open() - open RTRS server context
  * @ops:		callback functions
···
 	if (!ctx)
 		return ERR_PTR(-ENOMEM);
 
-	err = rtrs_srv_rdma_init(ctx, port);
+	mutex_init(&ib_ctx.ib_dev_mutex);
+	ib_ctx.srv_ctx = ctx;
+	ib_ctx.port = port;
+
+	err = ib_register_client(&rtrs_srv_client);
 	if (err) {
 		free_srv_ctx(ctx);
 		return ERR_PTR(err);
···
  */
 void rtrs_srv_close(struct rtrs_srv_ctx *ctx)
 {
-	rdma_destroy_id(ctx->cm_id_ip);
-	rdma_destroy_id(ctx->cm_id_ib);
+	ib_unregister_client(&rtrs_srv_client);
+	mutex_destroy(&ib_ctx.ib_dev_mutex);
 	close_ctx(ctx);
 	free_srv_ctx(ctx);
 }
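Note on the rtrs-srv.c hunks above: the CM listener is created when the first ib device is added and destroyed when the last one is removed, and a failed init leaves the count untouched. The sketch below models only that counting logic in plain userspace C. The names listener_ctx, listener_init and listener_destroy are hypothetical stand-ins, not kernel symbols, and the ib_dev_mutex locking of the real code is left out on the assumption of a single-threaded caller.

```c
#include <assert.h>

/* Hypothetical stand-in for struct rtrs_srv_ib_ctx; not the kernel type. */
struct listener_ctx {
	int dev_count;   /* number of devices currently added */
	int listening;   /* 1 while the shared listener exists */
	int init_calls;  /* how many times the listener was created */
};

/* Stand-in for rtrs_srv_rdma_init(); always succeeds in this sketch. */
static int listener_init(struct listener_ctx *c)
{
	c->init_calls++;
	c->listening = 1;
	return 0;
}

/* Stand-in for the pair of rdma_destroy_id() calls. */
static void listener_destroy(struct listener_ctx *c)
{
	c->listening = 0;
}

/* Mirrors rtrs_srv_add_one(): initialise only for the first device. */
static int add_one(struct listener_ctx *c)
{
	assert(c->dev_count >= 0);
	if (c->dev_count == 0) {
		int ret = listener_init(c);

		if (ret)
			return ret; /* count is not bumped on failure */
	}
	c->dev_count++;
	return 0;
}

/* Mirrors rtrs_srv_remove_one(): tear down only after the last device. */
static void remove_one(struct listener_ctx *c)
{
	c->dev_count--;
	if (c->dev_count == 0)
		listener_destroy(c);
}
```

In this model, adding two devices calls listener_init() once, and the listener stays up until both have been removed.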
+7
drivers/infiniband/ulp/rtrs/rtrs-srv.h
···
 	struct list_head srv_list;
 };
 
+struct rtrs_srv_ib_ctx {
+	struct rtrs_srv_ctx	*srv_ctx;
+	u16			port;
+	struct mutex		ib_dev_mutex;
+	int			ib_dev_count;
+};
+
 extern struct class *rtrs_dev_class;
 
 void close_sess(struct rtrs_srv_sess *sess);
+4 -27
drivers/net/ethernet/mellanox/mlx5/core/ipoib/ethtool.c
···
 	return mlx5e_ethtool_flash_device(priv, flash);
 }
 
-enum mlx5_ptys_width {
-	MLX5_PTYS_WIDTH_1X	= 1 << 0,
-	MLX5_PTYS_WIDTH_2X	= 1 << 1,
-	MLX5_PTYS_WIDTH_4X	= 1 << 2,
-	MLX5_PTYS_WIDTH_8X	= 1 << 3,
-	MLX5_PTYS_WIDTH_12X	= 1 << 4,
-};
-
 static inline int mlx5_ptys_width_enum_to_int(enum mlx5_ptys_width width)
 {
 	switch (width) {
···
 	}
 }
 
-static int mlx5i_get_port_settings(struct net_device *netdev,
-				   u16 *ib_link_width_oper, u16 *ib_proto_oper)
-{
-	struct mlx5e_priv *priv = mlx5i_epriv(netdev);
-	struct mlx5_core_dev *mdev = priv->mdev;
-	u32 out[MLX5_ST_SZ_DW(ptys_reg)] = {0};
-	int ret;
-
-	ret = mlx5_query_port_ptys(mdev, out, sizeof(out), MLX5_PTYS_IB, 1);
-	if (ret)
-		return ret;
-
-	*ib_link_width_oper = MLX5_GET(ptys_reg, out, ib_link_width_oper);
-	*ib_proto_oper = MLX5_GET(ptys_reg, out, ib_proto_oper);
-
-	return 0;
-}
-
 static int mlx5i_get_speed_settings(u16 ib_link_width_oper, u16 ib_proto_oper)
 {
 	int rate, width;
···
 static int mlx5i_get_link_ksettings(struct net_device *netdev,
 				    struct ethtool_link_ksettings *link_ksettings)
 {
+	struct mlx5e_priv *priv = mlx5i_epriv(netdev);
+	struct mlx5_core_dev *mdev = priv->mdev;
 	u16 ib_link_width_oper;
 	u16 ib_proto_oper;
 	int speed, ret;
 
-	ret = mlx5i_get_port_settings(netdev, &ib_link_width_oper, &ib_proto_oper);
+	ret = mlx5_query_ib_port_oper(mdev, &ib_link_width_oper, &ib_proto_oper,
+				      1);
 	if (ret)
 		return ret;
 
+4 -19
drivers/net/ethernet/mellanox/mlx5/core/port.c
···
 				    sizeof(out), MLX5_REG_MLCR, 0, 1);
 }
 
-int mlx5_query_port_link_width_oper(struct mlx5_core_dev *dev,
-				    u8 *link_width_oper, u8 local_port)
-{
-	u32 out[MLX5_ST_SZ_DW(ptys_reg)];
-	int err;
-
-	err = mlx5_query_port_ptys(dev, out, sizeof(out), MLX5_PTYS_IB, local_port);
-	if (err)
-		return err;
-
-	*link_width_oper = MLX5_GET(ptys_reg, out, ib_link_width_oper);
-
-	return 0;
-}
-EXPORT_SYMBOL_GPL(mlx5_query_port_link_width_oper);
-
-int mlx5_query_port_ib_proto_oper(struct mlx5_core_dev *dev,
-				  u8 *proto_oper, u8 local_port)
+int mlx5_query_ib_port_oper(struct mlx5_core_dev *dev, u16 *link_width_oper,
+			    u16 *proto_oper, u8 local_port)
 {
 	u32 out[MLX5_ST_SZ_DW(ptys_reg)];
 	int err;
···
 	if (err)
 		return err;
 
+	*link_width_oper = MLX5_GET(ptys_reg, out, ib_link_width_oper);
 	*proto_oper = MLX5_GET(ptys_reg, out, ib_proto_oper);
 
 	return 0;
 }
-EXPORT_SYMBOL(mlx5_query_port_ib_proto_oper);
+EXPORT_SYMBOL(mlx5_query_ib_port_oper);
 
 /* This function should be used after setting a port register only */
 void mlx5_toggle_port_link(struct mlx5_core_dev *dev)
+4 -11
drivers/net/ethernet/qlogic/qed/qed_rdma.c
···
 	dev->max_mw = 0;
 	dev->max_mr_mw_fmr_pbl = (PAGE_SIZE / 8) * (PAGE_SIZE / 8);
 	dev->max_mr_mw_fmr_size = dev->max_mr_mw_fmr_pbl * PAGE_SIZE;
-	dev->max_pkey = QED_RDMA_MAX_P_KEY;
+	if (QED_IS_ROCE_PERSONALITY(p_hwfn))
+		dev->max_pkey = QED_RDMA_MAX_P_KEY;
 
 	dev->max_srq = p_hwfn->p_rdma_info->num_srqs;
 	dev->max_srq_wr = QED_RDMA_MAX_SRQ_WQE_ELEM;
···
 		  params->pbl_two_level);
 
 	SET_FIELD(flags, RDMA_REGISTER_TID_RAMROD_DATA_ZERO_BASED,
-		  params->zbva);
+		  false);
 
 	SET_FIELD(flags, RDMA_REGISTER_TID_RAMROD_DATA_PHY_MR, params->phy_mr);
 
···
 	p_ramrod->pd = cpu_to_le16(params->pd);
 	p_ramrod->length_hi = (u8)(params->length >> 32);
 	p_ramrod->length_lo = DMA_LO_LE(params->length);
-	if (params->zbva) {
-		/* Lower 32 bits of the registered MR address.
-		 * In case of zero based MR, will hold FBO
-		 */
-		p_ramrod->va.hi = 0;
-		p_ramrod->va.lo = cpu_to_le32(params->fbo);
-	} else {
-		DMA_REGPAIR_LE(p_ramrod->va, params->vaddr);
-	}
+	DMA_REGPAIR_LE(p_ramrod->va, params->vaddr);
 	DMA_REGPAIR_LE(p_ramrod->pbl_base, params->pbl_ptr);
 
 	/* DIF */
+3 -1
drivers/net/ethernet/qlogic/qede/qede_ethtool.c
···
 	args.u.mtu = new_mtu;
 	args.func = &qede_update_mtu;
 	qede_reload(edev, &args, false);
-
+#if IS_ENABLED(CONFIG_QED_RDMA)
+	qede_rdma_event_change_mtu(edev);
+#endif
 	edev->ops->common->update_mtu(edev->cdev, new_mtu);
 
 	return 0;
+17
drivers/net/ethernet/qlogic/qede/qede_rdma.c
···
 		qedr_drv->notify(edev->rdma_info.qedr_dev, QEDE_CHANGE_ADDR);
 }
 
+static void qede_rdma_change_mtu(struct qede_dev *edev)
+{
+	if (qede_rdma_supported(edev)) {
+		if (qedr_drv && edev->rdma_info.qedr_dev && qedr_drv->notify)
+			qedr_drv->notify(edev->rdma_info.qedr_dev,
+					 QEDE_CHANGE_MTU);
+	}
+}
+
 static struct qede_rdma_event_work *
 qede_rdma_get_free_event_node(struct qede_dev *edev)
 {
···
 	case QEDE_CHANGE_ADDR:
 		qede_rdma_changeaddr(edev);
 		break;
+	case QEDE_CHANGE_MTU:
+		qede_rdma_change_mtu(edev);
+		break;
 	default:
 		DP_NOTICE(edev, "Invalid rdma event %d", event);
 	}
···
 void qede_rdma_event_changeaddr(struct qede_dev *edev)
 {
 	qede_rdma_add_event(edev, QEDE_CHANGE_ADDR);
+}
+
+void qede_rdma_event_change_mtu(struct qede_dev *edev)
+{
+	qede_rdma_add_event(edev, QEDE_CHANGE_MTU);
 }
+4 -2
include/linux/mlx5/mlx5_ifc.h
···
 	u8         reserved_at_1a[0x2];
 	u8         ipsec_encrypt[0x1];
 	u8         ipsec_decrypt[0x1];
-	u8         reserved_at_1e[0x2];
+	u8         sw_owner_v2[0x1];
+	u8         reserved_at_1f[0x1];
 
 	u8         termination_table_raw_traffic[0x1];
 	u8         reserved_at_21[0x1];
···
 
 	u8         log_bf_reg_size[0x5];
 
-	u8         reserved_at_270[0x8];
+	u8         reserved_at_270[0x6];
+	u8         lag_dct[0x2];
 	u8         lag_tx_port_affinity[0x1];
 	u8         reserved_at_279[0x2];
 	u8         lag_master[0x1];
+11 -4
include/linux/mlx5/port.h
···
 	MLX5E_CONNECTOR_TYPE_NUMBER,
 };
 
+enum mlx5_ptys_width {
+	MLX5_PTYS_WIDTH_1X	= 1 << 0,
+	MLX5_PTYS_WIDTH_2X	= 1 << 1,
+	MLX5_PTYS_WIDTH_4X	= 1 << 2,
+	MLX5_PTYS_WIDTH_8X	= 1 << 3,
+	MLX5_PTYS_WIDTH_12X	= 1 << 4,
+};
+
 #define MLX5E_PROT_MASK(link_mode) (1 << link_mode)
 #define MLX5_GET_ETH_PROTO(reg, out, ext, field)	\
 	(ext ? MLX5_GET(reg, out, ext_##field) :	\
···
 int mlx5_set_port_caps(struct mlx5_core_dev *dev, u8 port_num, u32 caps);
 int mlx5_query_port_ptys(struct mlx5_core_dev *dev, u32 *ptys,
 			 int ptys_size, int proto_mask, u8 local_port);
-int mlx5_query_port_link_width_oper(struct mlx5_core_dev *dev,
-				    u8 *link_width_oper, u8 local_port);
-int mlx5_query_port_ib_proto_oper(struct mlx5_core_dev *dev,
-				  u8 *proto_oper, u8 local_port);
+
+int mlx5_query_ib_port_oper(struct mlx5_core_dev *dev, u16 *link_width_oper,
+			    u16 *proto_oper, u8 local_port);
 void mlx5_toggle_port_link(struct mlx5_core_dev *dev);
 int mlx5_set_port_admin_status(struct mlx5_core_dev *dev,
 			       enum mlx5_port_status status);
+1
include/linux/overflow.h
···
 #define __LINUX_OVERFLOW_H
 
 #include <linux/compiler.h>
+#include <linux/limits.h>
 
 /*
  * In the fallback code below, we need to compute the minimum and
-2
include/linux/qed/qed_rdma_if.h
···
 	bool pbl_two_level;
 	u8 pbl_page_size_log;
 	u8 page_size_log;
-	u32 fbo;
 	u64 length;
 	u64 vaddr;
-	bool zbva;
 	bool phy_mr;
 	bool dma_mr;
 
+3 -1
include/linux/qed/qede_rdma.h
···
 	QEDE_UP,
 	QEDE_DOWN,
 	QEDE_CHANGE_ADDR,
-	QEDE_CLOSE
+	QEDE_CLOSE,
+	QEDE_CHANGE_MTU,
 };
 
 struct qede_rdma_event_work {
···
 void qede_rdma_dev_event_close(struct qede_dev *dev);
 void qede_rdma_dev_remove(struct qede_dev *dev, bool recovery);
 void qede_rdma_event_changeaddr(struct qede_dev *edr);
+void qede_rdma_event_change_mtu(struct qede_dev *edev);
 
 #else
 static inline int qede_rdma_dev_add(struct qede_dev *dev,
+22 -16
include/linux/scatterlist.h
···
 #define for_each_sgtable_dma_sg(sgt, sg, i)	\
 	for_each_sg((sgt)->sgl, sg, (sgt)->nents, i)
 
+static inline void __sg_chain(struct scatterlist *chain_sg,
+			      struct scatterlist *sgl)
+{
+	/*
+	 * offset and length are unused for chain entry. Clear them.
+	 */
+	chain_sg->offset = 0;
+	chain_sg->length = 0;
+
+	/*
+	 * Set lowest bit to indicate a link pointer, and make sure to clear
+	 * the termination bit if it happens to be set.
+	 */
+	chain_sg->page_link = ((unsigned long) sgl | SG_CHAIN) & ~SG_END;
+}
+
 /**
  * sg_chain - Chain two sglists together
  * @prv:	First scatterlist
···
 static inline void sg_chain(struct scatterlist *prv, unsigned int prv_nents,
 			    struct scatterlist *sgl)
 {
-	/*
-	 * offset and length are unused for chain entry. Clear them.
-	 */
-	prv[prv_nents - 1].offset = 0;
-	prv[prv_nents - 1].length = 0;
-
-	/*
-	 * Set lowest bit to indicate a link pointer, and make sure to clear
-	 * the termination bit if it happens to be set.
-	 */
-	prv[prv_nents - 1].page_link = ((unsigned long) sgl | SG_CHAIN)
-					& ~SG_END;
+	__sg_chain(&prv[prv_nents - 1], sgl);
 }
 
 /**
···
 int __sg_alloc_table(struct sg_table *, unsigned int, unsigned int,
 		     struct scatterlist *, unsigned int, gfp_t, sg_alloc_fn *);
 int sg_alloc_table(struct sg_table *, unsigned int, gfp_t);
-int __sg_alloc_table_from_pages(struct sg_table *sgt, struct page **pages,
-				unsigned int n_pages, unsigned int offset,
-				unsigned long size, unsigned int max_segment,
-				gfp_t gfp_mask);
+struct scatterlist *__sg_alloc_table_from_pages(struct sg_table *sgt,
+		struct page **pages, unsigned int n_pages, unsigned int offset,
+		unsigned long size, unsigned int max_segment,
+		struct scatterlist *prv, unsigned int left_pages,
+		gfp_t gfp_mask);
 int sg_alloc_table_from_pages(struct sg_table *sgt, struct page **pages,
 			      unsigned int n_pages, unsigned int offset,
 			      unsigned long size, gfp_t gfp_mask);
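Note on __sg_chain() above: the chain entry stores the next list's address in page_link and uses the two low bits as flags, which works because entries are at least 4-byte aligned. The sketch below replays that bit handling in userspace C. struct sg_entry and the helpers are hypothetical, simplified stand-ins for struct scatterlist and its accessors, and uintptr_t replaces the kernel's unsigned long for portability; only the SG_CHAIN and SG_END flag values are taken from the kernel header.

```c
#include <assert.h>
#include <stdint.h>

#define SG_CHAIN ((uintptr_t)0x01) /* entry is a link to another list */
#define SG_END   ((uintptr_t)0x02) /* entry terminates the list */

/* Hypothetical, simplified stand-in for struct scatterlist. */
struct sg_entry {
	uintptr_t page_link;
	unsigned int offset;
	unsigned int length;
};

/* Same steps as __sg_chain(): clear the unused fields, set the link bit, drop the end bit. */
static void chain_entry(struct sg_entry *chain_sg, struct sg_entry *next)
{
	/* The low two bits must be free for the flags. */
	assert(((uintptr_t)next & (SG_CHAIN | SG_END)) == 0);

	chain_sg->offset = 0;
	chain_sg->length = 0;
	chain_sg->page_link = ((uintptr_t)next | SG_CHAIN) & ~SG_END;
}

static int is_chain(const struct sg_entry *sg)
{
	return (sg->page_link & SG_CHAIN) != 0;
}

static int is_last(const struct sg_entry *sg)
{
	return (sg->page_link & SG_END) != 0;
}

/* Recover the pointer to the chained list by masking off the flag bits. */
static struct sg_entry *chain_ptr(const struct sg_entry *sg)
{
	return (struct sg_entry *)(sg->page_link & ~(SG_CHAIN | SG_END));
}
```

Splitting the operation out this way lets a caller chain from any entry it already holds a pointer to, rather than only from the last element of an array.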
+3
include/rdma/ib_cache.h
···
 				       u8 port_num, int index);
 void rdma_put_gid_attr(const struct ib_gid_attr *attr);
 void rdma_hold_gid_attr(const struct ib_gid_attr *attr);
+ssize_t rdma_query_gid_table(struct ib_device *device,
+			     struct ib_uverbs_gid_entry *entries,
+			     size_t max_entries);
 
 #endif /* _IB_CACHE_H */
-3
include/rdma/ib_cm.h
···
 #include <rdma/ib_sa.h>
 #include <rdma/rdma_cm.h>
 
-/* ib_cm and ib_user_cm modules share /sys/class/infiniband_cm */
-extern struct class cm_class;
-
 enum ib_cm_state {
 	IB_CM_IDLE,
 	IB_CM_LISTEN,
+37 -9
include/rdma/ib_umem.h
··· 17 17 struct ib_umem { 18 18 struct ib_device *ibdev; 19 19 struct mm_struct *owning_mm; 20 + u64 iova; 20 21 size_t length; 21 22 unsigned long address; 22 23 u32 writable : 1; ··· 34 33 return umem->address & ~PAGE_MASK; 35 34 } 36 35 36 + static inline size_t ib_umem_num_dma_blocks(struct ib_umem *umem, 37 + unsigned long pgsz) 38 + { 39 + return (size_t)((ALIGN(umem->iova + umem->length, pgsz) - 40 + ALIGN_DOWN(umem->iova, pgsz))) / 41 + pgsz; 42 + } 43 + 37 44 static inline size_t ib_umem_num_pages(struct ib_umem *umem) 38 45 { 39 - return (ALIGN(umem->address + umem->length, PAGE_SIZE) - 40 - ALIGN_DOWN(umem->address, PAGE_SIZE)) >> 41 - PAGE_SHIFT; 46 + return ib_umem_num_dma_blocks(umem, PAGE_SIZE); 42 47 } 48 + 49 + static inline void __rdma_umem_block_iter_start(struct ib_block_iter *biter, 50 + struct ib_umem *umem, 51 + unsigned long pgsz) 52 + { 53 + __rdma_block_iter_start(biter, umem->sg_head.sgl, umem->nmap, pgsz); 54 + } 55 + 56 + /** 57 + * rdma_umem_for_each_dma_block - iterate over contiguous DMA blocks of the umem 58 + * @umem: umem to iterate over 59 + * @pgsz: Page size to split the list into 60 + * 61 + * pgsz must be <= PAGE_SIZE or computed by ib_umem_find_best_pgsz(). The 62 + * returned DMA blocks will be aligned to pgsz and span the range: 63 + * ALIGN_DOWN(umem->address, pgsz) to ALIGN(umem->address + umem->length, pgsz) 64 + * 65 + * Performs exactly ib_umem_num_dma_blocks() iterations. 
66 + */ 67 + #define rdma_umem_for_each_dma_block(umem, biter, pgsz) \ 68 + for (__rdma_umem_block_iter_start(biter, umem, pgsz); \ 69 + __rdma_block_iter_next(biter);) 43 70 44 71 #ifdef CONFIG_INFINIBAND_USER_MEM 45 72 46 73 struct ib_umem *ib_umem_get(struct ib_device *device, unsigned long addr, 47 74 size_t size, int access); 48 75 void ib_umem_release(struct ib_umem *umem); 49 - int ib_umem_page_count(struct ib_umem *umem); 50 76 int ib_umem_copy_from(void *dst, struct ib_umem *umem, size_t offset, 51 77 size_t length); 52 78 unsigned long ib_umem_find_best_pgsz(struct ib_umem *umem, ··· 91 63 return ERR_PTR(-EINVAL); 92 64 } 93 65 static inline void ib_umem_release(struct ib_umem *umem) { } 94 - static inline int ib_umem_page_count(struct ib_umem *umem) { return 0; } 95 66 static inline int ib_umem_copy_from(void *dst, struct ib_umem *umem, size_t offset, 96 67 size_t length) { 97 68 return -EINVAL; 98 69 } 99 - static inline int ib_umem_find_best_pgsz(struct ib_umem *umem, 100 - unsigned long pgsz_bitmap, 101 - unsigned long virt) { 102 - return -EINVAL; 70 + static inline unsigned long ib_umem_find_best_pgsz(struct ib_umem *umem, 71 + unsigned long pgsz_bitmap, 72 + unsigned long virt) 73 + { 74 + return 0; 103 75 } 104 76 105 77 #endif /* CONFIG_INFINIBAND_USER_MEM */
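Review note: the new ib_umem_num_dma_blocks() helper counts how many pgsz-sized blocks cover the range from ALIGN_DOWN(iova, pgsz) to ALIGN(iova + length, pgsz). The arithmetic can be checked outside the kernel with a minimal user-space sketch. The names align_down_u64(), align_up_u64() and num_dma_blocks() are hypothetical stand-ins, not kernel symbols, and the sketch assumes pgsz is a power of two.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical user-space stand-ins for the kernel's ALIGN_DOWN() and
 * ALIGN(); both assume pgsz is a power of two.
 */
static inline uint64_t align_down_u64(uint64_t x, uint64_t pgsz)
{
	return x & ~(pgsz - 1);
}

static inline uint64_t align_up_u64(uint64_t x, uint64_t pgsz)
{
	return (x + pgsz - 1) & ~(pgsz - 1);
}

/* Mirrors the arithmetic shown in ib_umem_num_dma_blocks(). */
static inline size_t num_dma_blocks(uint64_t iova, uint64_t length,
				    uint64_t pgsz)
{
	/* Reject zero and non-power-of-two block sizes. */
	assert(pgsz != 0 && (pgsz & (pgsz - 1)) == 0);

	return (size_t)((align_up_u64(iova + length, pgsz) -
			 align_down_u64(iova, pgsz)) / pgsz);
}
```

For example, a 0x1000-byte region starting at iova 0x1800 straddles two 4 KiB blocks, but fits in a single 64 KiB block.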
+8 -13
include/rdma/ib_umem_odp.h
··· 14 14 struct mmu_interval_notifier notifier; 15 15 struct pid *tgid; 16 16 17 + /* An array of the pfns included in the on-demand paging umem. */ 18 + unsigned long *pfn_list; 19 + 17 20 /* 18 - * An array of the pages included in the on-demand paging umem. 19 - * Indices of pages that are currently not mapped into the device will 20 - * contain NULL. 21 - */ 22 - struct page **page_list; 23 - /* 24 - * An array of the same size as page_list, with DMA addresses mapped 25 - * for pages the pages in page_list. The lower two bits designate 26 - * access permissions. See ODP_READ_ALLOWED_BIT and 27 - * ODP_WRITE_ALLOWED_BIT. 21 + * An array with DMA addresses mapped for pfns in pfn_list. 22 + * The lower two bits designate access permissions. 23 + * See ODP_READ_ALLOWED_BIT and ODP_WRITE_ALLOWED_BIT. 28 24 */ 29 25 dma_addr_t *dma_list; 30 26 /* ··· 93 97 const struct mmu_interval_notifier_ops *ops); 94 98 void ib_umem_odp_release(struct ib_umem_odp *umem_odp); 95 99 96 - int ib_umem_odp_map_dma_pages(struct ib_umem_odp *umem_odp, u64 start_offset, 97 - u64 bcnt, u64 access_mask, 98 - unsigned long current_seq); 100 + int ib_umem_odp_map_dma_and_lock(struct ib_umem_odp *umem_odp, u64 start_offset, 101 + u64 bcnt, u64 access_mask, bool fault); 99 102 100 103 void ib_umem_odp_unmap_dma_pages(struct ib_umem_odp *umem_odp, u64 start_offset, 101 104 u64 bound);
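Review note: the dma_list comment above says the lower two bits of each entry carry access permissions. A minimal sketch of that kind of low-bit tagging follows. The SKETCH_* macros and sketch_* helpers are illustrative names chosen here, not the kernel's definitions, and the sketch assumes addresses are at least 4-byte aligned so the two low bits are free.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative names; the kernel header defines its own ODP_* bits. */
#define SKETCH_READ_ALLOWED_BIT  (1ULL << 0)
#define SKETCH_WRITE_ALLOWED_BIT (1ULL << 1)
#define SKETCH_DMA_ADDR_MASK \
	(~(SKETCH_READ_ALLOWED_BIT | SKETCH_WRITE_ALLOWED_BIT))

/* Combine an aligned DMA address with its permission bits. */
static inline uint64_t sketch_pack(uint64_t dma_addr, bool readable,
				   bool writable)
{
	/* The address must leave the two low bits free. */
	assert((dma_addr & ~SKETCH_DMA_ADDR_MASK) == 0);

	return dma_addr |
	       (readable ? SKETCH_READ_ALLOWED_BIT : 0) |
	       (writable ? SKETCH_WRITE_ALLOWED_BIT : 0);
}

/* Strip the permission bits to recover the address. */
static inline uint64_t sketch_addr(uint64_t entry)
{
	return entry & SKETCH_DMA_ADDR_MASK;
}

static inline bool sketch_can_write(uint64_t entry)
{
	return (entry & SKETCH_WRITE_ALLOWED_BIT) != 0;
}
```

Packing the bits into the address keeps the permission state in the same array slot as the mapping, so no parallel flags array is needed.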
+63 -149
include/rdma/ib_verbs.h
··· 138 138 extern union ib_gid zgid; 139 139 140 140 enum ib_gid_type { 141 - /* If link layer is Ethernet, this is RoCE V1 */ 142 - IB_GID_TYPE_IB = 0, 143 - IB_GID_TYPE_ROCE = 0, 144 - IB_GID_TYPE_ROCE_UDP_ENCAP = 1, 141 + IB_GID_TYPE_IB = IB_UVERBS_GID_TYPE_IB, 142 + IB_GID_TYPE_ROCE = IB_UVERBS_GID_TYPE_ROCE_V1, 143 + IB_GID_TYPE_ROCE_UDP_ENCAP = IB_UVERBS_GID_TYPE_ROCE_V2, 145 144 IB_GID_TYPE_SIZE 146 145 }; 147 146 ··· 179 180 180 181 enum rdma_network_type { 181 182 RDMA_NETWORK_IB, 182 - RDMA_NETWORK_ROCE_V1 = RDMA_NETWORK_IB, 183 + RDMA_NETWORK_ROCE_V1, 183 184 RDMA_NETWORK_IPV4, 184 185 RDMA_NETWORK_IPV6 185 186 }; ··· 189 190 if (network_type == RDMA_NETWORK_IPV4 || 190 191 network_type == RDMA_NETWORK_IPV6) 191 192 return IB_GID_TYPE_ROCE_UDP_ENCAP; 192 - 193 - /* IB_GID_TYPE_IB same as RDMA_NETWORK_ROCE_V1 */ 194 - return IB_GID_TYPE_IB; 193 + else if (network_type == RDMA_NETWORK_ROCE_V1) 194 + return IB_GID_TYPE_ROCE; 195 + else 196 + return IB_GID_TYPE_IB; 195 197 } 196 198 197 199 static inline enum rdma_network_type ··· 200 200 { 201 201 if (attr->gid_type == IB_GID_TYPE_IB) 202 202 return RDMA_NETWORK_IB; 203 + 204 + if (attr->gid_type == IB_GID_TYPE_ROCE) 205 + return RDMA_NETWORK_ROCE_V1; 203 206 204 207 if (ipv6_addr_v4mapped((struct in6_addr *)&attr->gid)) 205 208 return RDMA_NETWORK_IPV4; ··· 538 535 IB_SPEED_FDR10 = 8, 539 536 IB_SPEED_FDR = 16, 540 537 IB_SPEED_EDR = 32, 541 - IB_SPEED_HDR = 64 538 + IB_SPEED_HDR = 64, 539 + IB_SPEED_NDR = 128, 542 540 }; 543 541 544 542 /** ··· 673 669 u8 subnet_timeout; 674 670 u8 init_type_reply; 675 671 u8 active_width; 676 - u8 active_speed; 672 + u16 active_speed; 677 673 u8 phys_state; 678 674 u16 port_cap_flags2; 679 675 }; ··· 956 952 const char *__attribute_const__ ib_wc_status_msg(enum ib_wc_status status); 957 953 958 954 enum ib_wc_opcode { 959 - IB_WC_SEND, 960 - IB_WC_RDMA_WRITE, 961 - IB_WC_RDMA_READ, 962 - IB_WC_COMP_SWAP, 963 - IB_WC_FETCH_ADD, 964 - IB_WC_LSO, 965 - IB_WC_LOCAL_INV, 955 
+ IB_WC_SEND = IB_UVERBS_WC_SEND, 956 + IB_WC_RDMA_WRITE = IB_UVERBS_WC_RDMA_WRITE, 957 + IB_WC_RDMA_READ = IB_UVERBS_WC_RDMA_READ, 958 + IB_WC_COMP_SWAP = IB_UVERBS_WC_COMP_SWAP, 959 + IB_WC_FETCH_ADD = IB_UVERBS_WC_FETCH_ADD, 960 + IB_WC_BIND_MW = IB_UVERBS_WC_BIND_MW, 961 + IB_WC_LOCAL_INV = IB_UVERBS_WC_LOCAL_INV, 962 + IB_WC_LSO = IB_UVERBS_WC_TSO, 966 963 IB_WC_REG_MR, 967 964 IB_WC_MASKED_COMP_SWAP, 968 965 IB_WC_MASKED_FETCH_ADD, ··· 1296 1291 IB_WR_RDMA_READ = IB_UVERBS_WR_RDMA_READ, 1297 1292 IB_WR_ATOMIC_CMP_AND_SWP = IB_UVERBS_WR_ATOMIC_CMP_AND_SWP, 1298 1293 IB_WR_ATOMIC_FETCH_AND_ADD = IB_UVERBS_WR_ATOMIC_FETCH_AND_ADD, 1294 + IB_WR_BIND_MW = IB_UVERBS_WR_BIND_MW, 1299 1295 IB_WR_LSO = IB_UVERBS_WR_TSO, 1300 1296 IB_WR_SEND_WITH_INV = IB_UVERBS_WR_SEND_WITH_INV, 1301 1297 IB_WR_RDMA_READ_WITH_INV = IB_UVERBS_WR_RDMA_READ_WITH_INV, ··· 1469 1463 RDMA_REMOVE_DRIVER_REMOVE, 1470 1464 /* uobj is being cleaned-up before being committed */ 1471 1465 RDMA_REMOVE_ABORT, 1472 - /* 1473 - * uobj has been fully created, with the uobj->object set, but is being 1474 - * cleaned up before being comitted 1475 - */ 1476 - RDMA_REMOVE_ABORT_HWOBJ, 1477 1466 }; 1478 1467 1479 1468 struct ib_rdmacg_object { ··· 1480 1479 struct ib_ucontext { 1481 1480 struct ib_device *device; 1482 1481 struct ib_uverbs_file *ufile; 1483 - /* 1484 - * 'closing' can be read by the driver only during a destroy callback, 1485 - * it is set when we are closing the file descriptor and indicates 1486 - * that mm_sem may be locked. 1487 - */ 1488 - bool closing; 1489 1482 1490 1483 bool cleanup_retryable; 1491 1484 ··· 1857 1862 }; 1858 1863 #define IB_FLOW_SPEC_LAYER_MASK 0xF0 1859 1864 #define IB_FLOW_SPEC_SUPPORT_LAYERS 10 1860 - 1861 - /* Flow steering rule priority is set according to it's domain. 1862 - * Lower domain value means higher priority. 
1863 - */ 1864 - enum ib_flow_domain { 1865 - IB_FLOW_DOMAIN_USER, 1866 - IB_FLOW_DOMAIN_ETHTOOL, 1867 - IB_FLOW_DOMAIN_RFS, 1868 - IB_FLOW_DOMAIN_NIC, 1869 - IB_FLOW_DOMAIN_NUM /* Must be last */ 1870 - }; 1871 1865 1872 1866 enum ib_flow_flags { 1873 1867 IB_FLOW_ATTR_FLAGS_DONT_TRAP = 1UL << 1, /* Continue match, no steal */ ··· 2398 2414 void (*mmap_free)(struct rdma_user_mmap_entry *entry); 2399 2415 void (*disassociate_ucontext)(struct ib_ucontext *ibcontext); 2400 2416 int (*alloc_pd)(struct ib_pd *pd, struct ib_udata *udata); 2401 - void (*dealloc_pd)(struct ib_pd *pd, struct ib_udata *udata); 2417 + int (*dealloc_pd)(struct ib_pd *pd, struct ib_udata *udata); 2402 2418 int (*create_ah)(struct ib_ah *ah, struct rdma_ah_init_attr *attr, 2403 2419 struct ib_udata *udata); 2404 2420 int (*modify_ah)(struct ib_ah *ah, struct rdma_ah_attr *ah_attr); 2405 2421 int (*query_ah)(struct ib_ah *ah, struct rdma_ah_attr *ah_attr); 2406 - void (*destroy_ah)(struct ib_ah *ah, u32 flags); 2422 + int (*destroy_ah)(struct ib_ah *ah, u32 flags); 2407 2423 int (*create_srq)(struct ib_srq *srq, 2408 2424 struct ib_srq_init_attr *srq_init_attr, 2409 2425 struct ib_udata *udata); ··· 2411 2427 enum ib_srq_attr_mask srq_attr_mask, 2412 2428 struct ib_udata *udata); 2413 2429 int (*query_srq)(struct ib_srq *srq, struct ib_srq_attr *srq_attr); 2414 - void (*destroy_srq)(struct ib_srq *srq, struct ib_udata *udata); 2430 + int (*destroy_srq)(struct ib_srq *srq, struct ib_udata *udata); 2415 2431 struct ib_qp *(*create_qp)(struct ib_pd *pd, 2416 2432 struct ib_qp_init_attr *qp_init_attr, 2417 2433 struct ib_udata *udata); ··· 2423 2439 int (*create_cq)(struct ib_cq *cq, const struct ib_cq_init_attr *attr, 2424 2440 struct ib_udata *udata); 2425 2441 int (*modify_cq)(struct ib_cq *cq, u16 cq_count, u16 cq_period); 2426 - void (*destroy_cq)(struct ib_cq *cq, struct ib_udata *udata); 2442 + int (*destroy_cq)(struct ib_cq *cq, struct ib_udata *udata); 2427 2443 int (*resize_cq)(struct 
ib_cq *cq, int cqe, struct ib_udata *udata); 2428 2444 struct ib_mr *(*get_dma_mr)(struct ib_pd *pd, int mr_access_flags); 2429 2445 struct ib_mr *(*reg_user_mr)(struct ib_pd *pd, u64 start, u64 length, ··· 2446 2462 unsigned int *sg_offset); 2447 2463 int (*check_mr_status)(struct ib_mr *mr, u32 check_mask, 2448 2464 struct ib_mr_status *mr_status); 2449 - struct ib_mw *(*alloc_mw)(struct ib_pd *pd, enum ib_mw_type type, 2450 - struct ib_udata *udata); 2465 + int (*alloc_mw)(struct ib_mw *mw, struct ib_udata *udata); 2451 2466 int (*dealloc_mw)(struct ib_mw *mw); 2452 2467 int (*attach_mcast)(struct ib_qp *qp, union ib_gid *gid, u16 lid); 2453 2468 int (*detach_mcast)(struct ib_qp *qp, union ib_gid *gid, u16 lid); 2454 2469 int (*alloc_xrcd)(struct ib_xrcd *xrcd, struct ib_udata *udata); 2455 - void (*dealloc_xrcd)(struct ib_xrcd *xrcd, struct ib_udata *udata); 2470 + int (*dealloc_xrcd)(struct ib_xrcd *xrcd, struct ib_udata *udata); 2456 2471 struct ib_flow *(*create_flow)(struct ib_qp *qp, 2457 2472 struct ib_flow_attr *flow_attr, 2458 - int domain, struct ib_udata *udata); 2473 + struct ib_udata *udata); 2459 2474 int (*destroy_flow)(struct ib_flow *flow_id); 2460 2475 struct ib_flow_action *(*create_flow_action_esp)( 2461 2476 struct ib_device *device, ··· 2479 2496 struct ib_wq *(*create_wq)(struct ib_pd *pd, 2480 2497 struct ib_wq_init_attr *init_attr, 2481 2498 struct ib_udata *udata); 2482 - void (*destroy_wq)(struct ib_wq *wq, struct ib_udata *udata); 2499 + int (*destroy_wq)(struct ib_wq *wq, struct ib_udata *udata); 2483 2500 int (*modify_wq)(struct ib_wq *wq, struct ib_wq_attr *attr, 2484 2501 u32 wq_attr_mask, struct ib_udata *udata); 2485 - struct ib_rwq_ind_table *(*create_rwq_ind_table)( 2486 - struct ib_device *device, 2487 - struct ib_rwq_ind_table_init_attr *init_attr, 2488 - struct ib_udata *udata); 2502 + int (*create_rwq_ind_table)(struct ib_rwq_ind_table *ib_rwq_ind_table, 2503 + struct ib_rwq_ind_table_init_attr *init_attr, 2504 + struct 
ib_udata *udata); 2489 2505 int (*destroy_rwq_ind_table)(struct ib_rwq_ind_table *wq_ind_table); 2490 2506 struct ib_dm *(*alloc_dm)(struct ib_device *device, 2491 2507 struct ib_ucontext *context, ··· 2496 2514 struct uverbs_attr_bundle *attrs); 2497 2515 int (*create_counters)(struct ib_counters *counters, 2498 2516 struct uverbs_attr_bundle *attrs); 2499 - void (*destroy_counters)(struct ib_counters *counters); 2517 + int (*destroy_counters)(struct ib_counters *counters); 2500 2518 int (*read_counters)(struct ib_counters *counters, 2501 2519 struct ib_counters_read_attr *counters_read_attr, 2502 2520 struct uverbs_attr_bundle *attrs); ··· 2606 2624 DECLARE_RDMA_OBJ_SIZE(ib_ah); 2607 2625 DECLARE_RDMA_OBJ_SIZE(ib_counters); 2608 2626 DECLARE_RDMA_OBJ_SIZE(ib_cq); 2627 + DECLARE_RDMA_OBJ_SIZE(ib_mw); 2609 2628 DECLARE_RDMA_OBJ_SIZE(ib_pd); 2629 + DECLARE_RDMA_OBJ_SIZE(ib_rwq_ind_table); 2610 2630 DECLARE_RDMA_OBJ_SIZE(ib_srq); 2611 2631 DECLARE_RDMA_OBJ_SIZE(ib_ucontext); 2612 2632 DECLARE_RDMA_OBJ_SIZE(ib_xrcd); ··· 2782 2798 2783 2799 void ib_get_device_fw_str(struct ib_device *device, char *str); 2784 2800 2785 - int ib_register_device(struct ib_device *device, const char *name); 2801 + int ib_register_device(struct ib_device *device, const char *name, 2802 + struct device *dma_device); 2786 2803 void ib_unregister_device(struct ib_device *device); 2787 2804 void ib_unregister_driver(enum rdma_driver_id driver_id); 2788 2805 void ib_unregister_device_and_put(struct ib_device *device); ··· 3337 3352 } 3338 3353 3339 3354 /** 3340 - * rdma_find_pg_bit - Find page bit given address and HW supported page sizes 3341 - * 3342 - * @addr: address 3343 - * @pgsz_bitmap: bitmap of HW supported page sizes 3344 - */ 3345 - static inline unsigned int rdma_find_pg_bit(unsigned long addr, 3346 - unsigned long pgsz_bitmap) 3347 - { 3348 - unsigned long align; 3349 - unsigned long pgsz; 3350 - 3351 - align = addr & -addr; 3352 - 3353 - /* Find page bit such that addr is aligned 
to the highest supported 3354 - * HW page size 3355 - */ 3356 - pgsz = pgsz_bitmap & ~(-align << 1); 3357 - if (!pgsz) 3358 - return __ffs(pgsz_bitmap); 3359 - 3360 - return __fls(pgsz); 3361 - } 3362 - 3363 - /** 3364 3355 * rdma_core_cap_opa_port - Return whether the RDMA Port is OPA or not. 3365 3356 * @device: Device 3366 3357 * @port_num: 1 based Port number ··· 3433 3472 #define ib_alloc_pd(device, flags) \ 3434 3473 __ib_alloc_pd((device), (flags), KBUILD_MODNAME) 3435 3474 3436 - /** 3437 - * ib_dealloc_pd_user - Deallocate kernel/user PD 3438 - * @pd: The protection domain 3439 - * @udata: Valid user data or NULL for kernel objects 3440 - */ 3441 - void ib_dealloc_pd_user(struct ib_pd *pd, struct ib_udata *udata); 3475 + int ib_dealloc_pd_user(struct ib_pd *pd, struct ib_udata *udata); 3442 3476 3443 3477 /** 3444 3478 * ib_dealloc_pd - Deallocate kernel PD ··· 3443 3487 */ 3444 3488 static inline void ib_dealloc_pd(struct ib_pd *pd) 3445 3489 { 3446 - ib_dealloc_pd_user(pd, NULL); 3490 + int ret = ib_dealloc_pd_user(pd, NULL); 3491 + 3492 + WARN_ONCE(ret, "Destroy of kernel PD shouldn't fail"); 3447 3493 } 3448 3494 3449 3495 enum rdma_create_ah_flags { ··· 3573 3615 * 3574 3616 * NOTE: for user ah use rdma_destroy_ah_user with valid udata! 3575 3617 */ 3576 - static inline int rdma_destroy_ah(struct ib_ah *ah, u32 flags) 3618 + static inline void rdma_destroy_ah(struct ib_ah *ah, u32 flags) 3577 3619 { 3578 - return rdma_destroy_ah_user(ah, flags, NULL); 3620 + int ret = rdma_destroy_ah_user(ah, flags, NULL); 3621 + 3622 + WARN_ONCE(ret, "Destroy of kernel AH shouldn't fail"); 3579 3623 } 3580 3624 3581 3625 struct ib_srq *ib_create_srq_user(struct ib_pd *pd, ··· 3631 3671 * 3632 3672 * NOTE: for user srq use ib_destroy_srq_user with valid udata! 
3633 3673 */ 3634 - static inline int ib_destroy_srq(struct ib_srq *srq) 3674 + static inline void ib_destroy_srq(struct ib_srq *srq) 3635 3675 { 3636 - return ib_destroy_srq_user(srq, NULL); 3676 + int ret = ib_destroy_srq_user(srq, NULL); 3677 + 3678 + WARN_ONCE(ret, "Destroy of kernel SRQ shouldn't fail"); 3637 3679 } 3638 3680 3639 3681 /** ··· 3779 3817 return qp->device->ops.post_recv(qp, recv_wr, bad_recv_wr ? : &dummy); 3780 3818 } 3781 3819 3782 - struct ib_cq *__ib_alloc_cq_user(struct ib_device *dev, void *private, 3783 - int nr_cqe, int comp_vector, 3784 - enum ib_poll_context poll_ctx, 3785 - const char *caller, struct ib_udata *udata); 3786 - 3787 - /** 3788 - * ib_alloc_cq_user: Allocate kernel/user CQ 3789 - * @dev: The IB device 3790 - * @private: Private data attached to the CQE 3791 - * @nr_cqe: Number of CQEs in the CQ 3792 - * @comp_vector: Completion vector used for the IRQs 3793 - * @poll_ctx: Context used for polling the CQ 3794 - * @udata: Valid user data or NULL for kernel objects 3795 - */ 3796 - static inline struct ib_cq *ib_alloc_cq_user(struct ib_device *dev, 3797 - void *private, int nr_cqe, 3798 - int comp_vector, 3799 - enum ib_poll_context poll_ctx, 3800 - struct ib_udata *udata) 3801 - { 3802 - return __ib_alloc_cq_user(dev, private, nr_cqe, comp_vector, poll_ctx, 3803 - KBUILD_MODNAME, udata); 3804 - } 3805 - 3806 - /** 3807 - * ib_alloc_cq: Allocate kernel CQ 3808 - * @dev: The IB device 3809 - * @private: Private data attached to the CQE 3810 - * @nr_cqe: Number of CQEs in the CQ 3811 - * @comp_vector: Completion vector used for the IRQs 3812 - * @poll_ctx: Context used for polling the CQ 3813 - * 3814 - * NOTE: for user cq use ib_alloc_cq_user with valid udata! 
3815 - */ 3820 + struct ib_cq *__ib_alloc_cq(struct ib_device *dev, void *private, int nr_cqe, 3821 + int comp_vector, enum ib_poll_context poll_ctx, 3822 + const char *caller); 3816 3823 static inline struct ib_cq *ib_alloc_cq(struct ib_device *dev, void *private, 3817 3824 int nr_cqe, int comp_vector, 3818 3825 enum ib_poll_context poll_ctx) 3819 3826 { 3820 - return ib_alloc_cq_user(dev, private, nr_cqe, comp_vector, poll_ctx, 3821 - NULL); 3827 + return __ib_alloc_cq(dev, private, nr_cqe, comp_vector, poll_ctx, 3828 + KBUILD_MODNAME); 3822 3829 } 3823 3830 3824 3831 struct ib_cq *__ib_alloc_cq_any(struct ib_device *dev, void *private, ··· 3809 3878 KBUILD_MODNAME); 3810 3879 } 3811 3880 3812 - /** 3813 - * ib_free_cq_user - Free kernel/user CQ 3814 - * @cq: The CQ to free 3815 - * @udata: Valid user data or NULL for kernel objects 3816 - * 3817 - * NOTE: This function shouldn't be called on shared CQs. 3818 - */ 3819 - void ib_free_cq_user(struct ib_cq *cq, struct ib_udata *udata); 3820 - 3821 - /** 3822 - * ib_free_cq - Free kernel CQ 3823 - * @cq: The CQ to free 3824 - * 3825 - * NOTE: for user cq use ib_free_cq_user with valid udata! 
3826 - */ 3827 - static inline void ib_free_cq(struct ib_cq *cq) 3828 - { 3829 - ib_free_cq_user(cq, NULL); 3830 - } 3831 - 3881 + void ib_free_cq(struct ib_cq *cq); 3832 3882 int ib_process_cq_direct(struct ib_cq *cq, int budget); 3833 3883 3834 3884 /** ··· 3867 3955 */ 3868 3956 static inline void ib_destroy_cq(struct ib_cq *cq) 3869 3957 { 3870 - ib_destroy_cq_user(cq, NULL); 3958 + int ret = ib_destroy_cq_user(cq, NULL); 3959 + 3960 + WARN_ONCE(ret, "Destroy of kernel CQ shouldn't fail"); 3871 3961 } 3872 3962 3873 3963 /** ··· 4293 4379 4294 4380 struct ib_wq *ib_create_wq(struct ib_pd *pd, 4295 4381 struct ib_wq_init_attr *init_attr); 4296 - int ib_destroy_wq(struct ib_wq *wq, struct ib_udata *udata); 4382 + int ib_destroy_wq_user(struct ib_wq *wq, struct ib_udata *udata); 4297 4383 int ib_modify_wq(struct ib_wq *wq, struct ib_wq_attr *attr, 4298 4384 u32 wq_attr_mask); 4299 - int ib_destroy_rwq_ind_table(struct ib_rwq_ind_table *wq_ind_table); 4300 4385 4301 4386 int ib_map_mr_sg(struct ib_mr *mr, struct scatterlist *sg, int sg_nents, 4302 4387 unsigned int *sg_offset, unsigned int page_size); ··· 4323 4410 void ib_drain_sq(struct ib_qp *qp); 4324 4411 void ib_drain_qp(struct ib_qp *qp); 4325 4412 4326 - int ib_get_eth_speed(struct ib_device *dev, u8 port_num, u8 *speed, u8 *width); 4413 + int ib_get_eth_speed(struct ib_device *dev, u8 port_num, u16 *speed, u8 *width); 4327 4414 4328 4415 static inline u8 *rdma_ah_retrieve_dmac(struct rdma_ah_attr *attr) 4329 4416 { ··· 4630 4717 const struct net *net); 4631 4718 4632 4719 #define IB_ROCE_UDP_ENCAP_VALID_PORT_MIN (0xC000) 4720 + #define IB_ROCE_UDP_ENCAP_VALID_PORT_MAX (0xFFFF) 4633 4721 #define IB_GRH_FLOWLABEL_MASK (0x000FFFFF) 4634 4722 4635 4723 /**
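Review note: the ib_verbs.h hunks above change the driver destroy callbacks from void to int, while the kernel-only inline wrappers (ib_dealloc_pd(), rdma_destroy_ah(), ib_destroy_srq(), ib_destroy_cq()) stay or become void and warn once if the underlying call fails. The pattern can be sketched in user space roughly as below. Every sketch_* name is hypothetical, and sketch_warn_once() is a simplified stand-in for WARN_ONCE() that uses one flag for the whole program rather than one per call site.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

struct sketch_obj;

/* Driver ops: destroy may now report failure. */
struct sketch_ops {
	int (*destroy)(struct sketch_obj *obj, void *udata);
};

struct sketch_obj {
	const struct sketch_ops *ops;
	int busy;	/* pretend the HW still references the object */
};

static int sketch_warn_count;

/* Simplified stand-in for WARN_ONCE(): warn at most once overall. */
static void sketch_warn_once(int cond, const char *msg)
{
	static int warned;

	if (cond && !warned) {
		warned = 1;
		sketch_warn_count++;
		fprintf(stderr, "WARNING: %s\n", msg);
	}
}

/* User-object path: the error code is propagated to the caller. */
static int sketch_destroy_user(struct sketch_obj *obj, void *udata)
{
	assert(obj && obj->ops && obj->ops->destroy);
	return obj->ops->destroy(obj, udata);
}

/* Kernel-object path: callers cannot handle failure, so only warn. */
static void sketch_destroy(struct sketch_obj *obj)
{
	int ret = sketch_destroy_user(obj, NULL);

	sketch_warn_once(ret != 0, "Destroy of kernel object shouldn't fail");
}

/* Example driver callback that refuses to destroy a busy object. */
static int sketch_hw_destroy(struct sketch_obj *obj, void *udata)
{
	(void)udata;
	return obj->busy ? -EBUSY : 0;
}
```

The split keeps the failure visible where it can be acted on (the user-object path) and turns it into a one-time diagnostic where it cannot.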
+16 -30
include/rdma/rdma_cm.h
··· 110 110 u8 port_num; 111 111 }; 112 112 113 - struct rdma_cm_id *__rdma_create_id(struct net *net, 114 - rdma_cm_event_handler event_handler, 115 - void *context, enum rdma_ucm_port_space ps, 116 - enum ib_qp_type qp_type, 117 - const char *caller); 113 + struct rdma_cm_id * 114 + __rdma_create_kernel_id(struct net *net, rdma_cm_event_handler event_handler, 115 + void *context, enum rdma_ucm_port_space ps, 116 + enum ib_qp_type qp_type, const char *caller); 117 + struct rdma_cm_id *rdma_create_user_id(rdma_cm_event_handler event_handler, 118 + void *context, 119 + enum rdma_ucm_port_space ps, 120 + enum ib_qp_type qp_type); 118 121 119 122 /** 120 123 * rdma_create_id - Create an RDMA identifier. ··· 135 132 * The event handler callback serializes on the id's mutex and is 136 133 * allowed to sleep. 137 134 */ 138 - #define rdma_create_id(net, event_handler, context, ps, qp_type) \ 139 - __rdma_create_id((net), (event_handler), (context), (ps), (qp_type), \ 140 - KBUILD_MODNAME) 135 + #define rdma_create_id(net, event_handler, context, ps, qp_type) \ 136 + __rdma_create_kernel_id(net, event_handler, context, ps, qp_type, \ 137 + KBUILD_MODNAME) 141 138 142 139 /** 143 140 * rdma_destroy_id - Destroys an RDMA identifier. ··· 253 250 */ 254 251 int rdma_listen(struct rdma_cm_id *id, int backlog); 255 252 256 - int __rdma_accept(struct rdma_cm_id *id, struct rdma_conn_param *conn_param, 257 - const char *caller); 253 + int rdma_accept(struct rdma_cm_id *id, struct rdma_conn_param *conn_param); 258 254 259 - int __rdma_accept_ece(struct rdma_cm_id *id, struct rdma_conn_param *conn_param, 260 - const char *caller, struct rdma_ucm_ece *ece); 261 - 262 - /** 263 - * rdma_accept - Called to accept a connection request or response. 264 - * @id: Connection identifier associated with the request. 265 - * @conn_param: Information needed to establish the connection. This must be 266 - * provided if accepting a connection request. 
If accepting a connection 267 - * response, this parameter must be NULL. 268 - * 269 - * Typically, this routine is only called by the listener to accept a connection 270 - * request. It must also be called on the active side of a connection if the 271 - * user is performing their own QP transitions. 272 - * 273 - * In the case of error, a reject message is sent to the remote side and the 274 - * state of the qp associated with the id is modified to error, such that any 275 - * previously posted receive buffers would be flushed. 276 - */ 277 - #define rdma_accept(id, conn_param) \ 278 - __rdma_accept((id), (conn_param), KBUILD_MODNAME) 255 + void rdma_lock_handler(struct rdma_cm_id *id); 256 + void rdma_unlock_handler(struct rdma_cm_id *id); 257 + int rdma_accept_ece(struct rdma_cm_id *id, struct rdma_conn_param *conn_param, 258 + struct rdma_ucm_ece *ece); 279 259 280 260 /** 281 261 * rdma_notify - Notifies the RDMA CM of an asynchronous event that has
+1 -20
include/rdma/restrack.h
··· 106 106 107 107 int rdma_restrack_count(struct ib_device *dev, 108 108 enum rdma_restrack_type type); 109 - 110 - void rdma_restrack_kadd(struct rdma_restrack_entry *res); 111 - void rdma_restrack_uadd(struct rdma_restrack_entry *res); 112 - 113 - /** 114 - * rdma_restrack_del() - delete object from the reource tracking database 115 - * @res: resource entry 116 - * @type: actual type of object to operate 117 - */ 118 - void rdma_restrack_del(struct rdma_restrack_entry *res); 119 - 120 109 /** 121 110 * rdma_is_kernel_res() - check the owner of resource 122 111 * @res: resource entry 123 112 */ 124 - static inline bool rdma_is_kernel_res(struct rdma_restrack_entry *res) 113 + static inline bool rdma_is_kernel_res(const struct rdma_restrack_entry *res) 125 114 { 126 115 return !res->user; 127 116 } ··· 126 137 * @res: resource entry 127 138 */ 128 139 int rdma_restrack_put(struct rdma_restrack_entry *res); 129 - 130 - /** 131 - * rdma_restrack_set_task() - set the task for this resource 132 - * @res: resource entry 133 - * @caller: kernel name, the current task will be used if the caller is NULL. 134 - */ 135 - void rdma_restrack_set_task(struct rdma_restrack_entry *res, 136 - const char *caller); 137 140 138 141 /* 139 142 * Helper functions for rdma drivers when filling out
+40 -1
include/trace/events/rdma.h
··· 6 6 /* 7 7 * enum ib_event_type, from include/rdma/ib_verbs.h 8 8 */ 9 - 10 9 #define IB_EVENT_LIST \ 11 10 ib_event(CQ_ERR) \ 12 11 ib_event(QP_FATAL) \ ··· 88 89 89 90 #define rdma_show_wc_status(x) \ 90 91 __print_symbolic(x, IB_WC_STATUS_LIST) 92 + 93 + /* 94 + * enum ib_cm_event_type, from include/rdma/ib_cm.h 95 + */ 96 + #define IB_CM_EVENT_LIST \ 97 + ib_cm_event(REQ_ERROR) \ 98 + ib_cm_event(REQ_RECEIVED) \ 99 + ib_cm_event(REP_ERROR) \ 100 + ib_cm_event(REP_RECEIVED) \ 101 + ib_cm_event(RTU_RECEIVED) \ 102 + ib_cm_event(USER_ESTABLISHED) \ 103 + ib_cm_event(DREQ_ERROR) \ 104 + ib_cm_event(DREQ_RECEIVED) \ 105 + ib_cm_event(DREP_RECEIVED) \ 106 + ib_cm_event(TIMEWAIT_EXIT) \ 107 + ib_cm_event(MRA_RECEIVED) \ 108 + ib_cm_event(REJ_RECEIVED) \ 109 + ib_cm_event(LAP_ERROR) \ 110 + ib_cm_event(LAP_RECEIVED) \ 111 + ib_cm_event(APR_RECEIVED) \ 112 + ib_cm_event(SIDR_REQ_ERROR) \ 113 + ib_cm_event(SIDR_REQ_RECEIVED) \ 114 + ib_cm_event_end(SIDR_REP_RECEIVED) 115 + 116 + #undef ib_cm_event 117 + #undef ib_cm_event_end 118 + 119 + #define ib_cm_event(x) TRACE_DEFINE_ENUM(IB_CM_##x); 120 + #define ib_cm_event_end(x) TRACE_DEFINE_ENUM(IB_CM_##x); 121 + 122 + IB_CM_EVENT_LIST 123 + 124 + #undef ib_cm_event 125 + #undef ib_cm_event_end 126 + 127 + #define ib_cm_event(x) { IB_CM_##x, #x }, 128 + #define ib_cm_event_end(x) { IB_CM_##x, #x } 129 + 130 + #define rdma_show_ib_cm_event(x) \ 131 + __print_symbolic(x, IB_CM_EVENT_LIST) 91 132 92 133 /* 93 134 * enum rdma_cm_event_type, from include/rdma/rdma_cm.h
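Review note: IB_CM_EVENT_LIST above is a list macro that is expanded twice with different definitions of ib_cm_event()/ib_cm_event_end(): once to emit TRACE_DEFINE_ENUM() entries and once to build the { value, name } table handed to __print_symbolic(). A minimal user-space sketch of the second expansion follows. The sketch_* and SKETCH_* names are hypothetical, the event list is shortened to three entries, and returning "UNKNOWN" for an unlisted value is a simplification made for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum sketch_event {
	SKETCH_REQ_ERROR,
	SKETCH_REQ_RECEIVED,
	SKETCH_REP_ERROR,
};

/* One list, written once; the last entry uses the _end variant. */
#define SKETCH_EVENT_LIST \
	sketch_event(REQ_ERROR) \
	sketch_event(REQ_RECEIVED) \
	sketch_event_end(REP_ERROR)

struct sketch_sym {
	int value;
	const char *name;
};

/* Expand the list into { value, "name" } initializers. */
#define sketch_event(x)     { SKETCH_##x, #x },
#define sketch_event_end(x) { SKETCH_##x, #x }

static const struct sketch_sym sketch_event_names[] = { SKETCH_EVENT_LIST };

#undef sketch_event
#undef sketch_event_end

/* Look up the symbolic name for an event value. */
static const char *sketch_show_event(int value)
{
	size_t i;
	size_t n = sizeof(sketch_event_names) / sizeof(sketch_event_names[0]);

	for (i = 0; i < n; i++)
		if (sketch_event_names[i].value == value)
			return sketch_event_names[i].name;
	return "UNKNOWN";
}
```

Because the list is written only once, adding an event means touching a single line, and the name table cannot drift out of sync with the list.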
+1
include/trace/events/rpcrdma.h
··· 13 13 #include <linux/scatterlist.h> 14 14 #include <linux/sunrpc/rpc_rdma_cid.h> 15 15 #include <linux/tracepoint.h> 16 + #include <rdma/ib_cm.h> 16 17 #include <trace/events/rdma.h> 17 18 18 19 /**
+1
include/uapi/rdma/efa-abi.h
··· 105 105 106 106 enum { 107 107 EFA_QUERY_DEVICE_CAPS_RDMA_READ = 1 << 0, 108 + EFA_QUERY_DEVICE_CAPS_RNR_RETRY = 1 << 1, 108 109 }; 109 110 110 111 struct efa_ibv_ex_query_device_resp {
+3 -1
include/uapi/rdma/hns-abi.h
··· 39 39 struct hns_roce_ib_create_cq { 40 40 __aligned_u64 buf_addr; 41 41 __aligned_u64 db_addr; 42 + __u32 cqe_size; 43 + __u32 reserved; 42 44 }; 43 45 44 46 struct hns_roce_ib_create_cq_resp { ··· 75 73 76 74 struct hns_roce_ib_alloc_ucontext_resp { 77 75 __u32 qp_tab_size; 78 - __u32 reserved; 76 + __u32 cqe_size; 79 77 }; 80 78 81 79 struct hns_roce_ib_alloc_pd_resp {
+16
include/uapi/rdma/ib_user_ioctl_cmds.h
··· 70 70 UVERBS_METHOD_QUERY_PORT, 71 71 UVERBS_METHOD_GET_CONTEXT, 72 72 UVERBS_METHOD_QUERY_CONTEXT, 73 + UVERBS_METHOD_QUERY_GID_TABLE, 74 + UVERBS_METHOD_QUERY_GID_ENTRY, 73 75 }; 74 76 75 77 enum uverbs_attrs_invoke_write_cmd_attr_ids { ··· 352 350 353 351 enum uverbs_attrs_async_event_create { 354 352 UVERBS_ATTR_ASYNC_EVENT_ALLOC_FD_HANDLE, 353 + }; 354 + 355 + enum uverbs_attrs_query_gid_table_cmd_attr_ids { 356 + UVERBS_ATTR_QUERY_GID_TABLE_ENTRY_SIZE, 357 + UVERBS_ATTR_QUERY_GID_TABLE_FLAGS, 358 + UVERBS_ATTR_QUERY_GID_TABLE_RESP_ENTRIES, 359 + UVERBS_ATTR_QUERY_GID_TABLE_RESP_NUM_ENTRIES, 360 + }; 361 + 362 + enum uverbs_attrs_query_gid_entry_cmd_attr_ids { 363 + UVERBS_ATTR_QUERY_GID_ENTRY_PORT, 364 + UVERBS_ATTR_QUERY_GID_ENTRY_GID_INDEX, 365 + UVERBS_ATTR_QUERY_GID_ENTRY_FLAGS, 366 + UVERBS_ATTR_QUERY_GID_ENTRY_RESP_ENTRY, 355 367 }; 356 368 357 369 #endif
+15
include/uapi/rdma/ib_user_ioctl_verbs.h
··· 208 208 enum ib_uverbs_advise_mr_advice { 209 209 IB_UVERBS_ADVISE_MR_ADVICE_PREFETCH, 210 210 IB_UVERBS_ADVISE_MR_ADVICE_PREFETCH_WRITE, 211 + IB_UVERBS_ADVISE_MR_ADVICE_PREFETCH_NO_FAULT, 211 212 }; 212 213 213 214 enum ib_uverbs_advise_mr_flag { ··· 249 248 RDMA_DRIVER_QIB, 250 249 RDMA_DRIVER_EFA, 251 250 RDMA_DRIVER_SIW, 251 + }; 252 + 253 + enum ib_uverbs_gid_type { 254 + IB_UVERBS_GID_TYPE_IB, 255 + IB_UVERBS_GID_TYPE_ROCE_V1, 256 + IB_UVERBS_GID_TYPE_ROCE_V2, 257 + }; 258 + 259 + struct ib_uverbs_gid_entry { 260 + __aligned_u64 gid[2]; 261 + __u32 gid_index; 262 + __u32 port_num; 263 + __u32 gid_type; 264 + __u32 netdev_ifindex; /* It is 0 if there is no netdev associated with it */ 252 265 }; 253 266 254 267 #endif
+11
include/uapi/rdma/ib_user_verbs.h
··· 457 457 __u32 ne; 458 458 }; 459 459 460 + enum ib_uverbs_wc_opcode { 461 + IB_UVERBS_WC_SEND = 0, 462 + IB_UVERBS_WC_RDMA_WRITE = 1, 463 + IB_UVERBS_WC_RDMA_READ = 2, 464 + IB_UVERBS_WC_COMP_SWAP = 3, 465 + IB_UVERBS_WC_FETCH_ADD = 4, 466 + IB_UVERBS_WC_BIND_MW = 5, 467 + IB_UVERBS_WC_LOCAL_INV = 6, 468 + IB_UVERBS_WC_TSO = 7, 469 + }; 470 + 460 471 struct ib_uverbs_wc { 461 472 __aligned_u64 wr_id; 462 473 __u32 status;
+9 -3
include/uapi/rdma/rdma_user_rxe.h
··· 39 39 #include <linux/in.h> 40 40 #include <linux/in6.h> 41 41 42 + enum { 43 + RXE_NETWORK_TYPE_IPV4 = 1, 44 + RXE_NETWORK_TYPE_IPV6 = 2, 45 + }; 46 + 42 47 union rxe_gid { 43 48 __u8 raw[16]; 44 49 struct { ··· 62 57 63 58 struct rxe_av { 64 59 __u8 port_num; 60 + /* From RXE_NETWORK_TYPE_* */ 65 61 __u8 network_type; 66 62 __u8 dmac[6]; 67 63 struct rxe_global_route grh; ··· 105 99 struct ib_mr *mr; 106 100 __aligned_u64 reserved; 107 101 }; 108 - __u32 key; 109 - __u32 access; 102 + __u32 key; 103 + __u32 access; 110 104 } reg; 111 105 } wr; 112 106 }; ··· 118 112 }; 119 113 120 114 struct mminfo { 121 - __aligned_u64 offset; 115 + __aligned_u64 offset; 122 116 __u32 size; 123 117 __u32 pad; 124 118 };
+107 -26
lib/scatterlist.c
···
 }
 EXPORT_SYMBOL(sg_alloc_table);
 
+static struct scatterlist *get_next_sg(struct sg_table *table,
+				       struct scatterlist *cur,
+				       unsigned long needed_sges,
+				       gfp_t gfp_mask)
+{
+	struct scatterlist *new_sg, *next_sg;
+	unsigned int alloc_size;
+
+	if (cur) {
+		next_sg = sg_next(cur);
+		/* Check if last entry should be keeped for chainning */
+		if (!sg_is_last(next_sg) || needed_sges == 1)
+			return next_sg;
+	}
+
+	alloc_size = min_t(unsigned long, needed_sges, SG_MAX_SINGLE_ALLOC);
+	new_sg = sg_kmalloc(alloc_size, gfp_mask);
+	if (!new_sg)
+		return ERR_PTR(-ENOMEM);
+	sg_init_table(new_sg, alloc_size);
+	if (cur) {
+		__sg_chain(next_sg, new_sg);
+		table->orig_nents += alloc_size - 1;
+	} else {
+		table->sgl = new_sg;
+		table->orig_nents = alloc_size;
+		table->nents = 0;
+	}
+	return new_sg;
+}
+
 /**
  * __sg_alloc_table_from_pages - Allocate and initialize an sg table from
  * an array of pages
···
  * @n_pages: Number of pages in the pages array
  * @offset: Offset from start of the first page to the start of a buffer
  * @size: Number of valid bytes in the buffer (after offset)
- * @max_segment: Maximum size of a scatterlist node in bytes (page aligned)
+ * @max_segment: Maximum size of a scatterlist element in bytes
+ * @prv: Last populated sge in sgt
+ * @left_pages: Left pages caller have to set after this call
  * @gfp_mask: GFP allocation mask
  *
- * Description:
- * Allocate and initialize an sg table from a list of pages. Contiguous
- * ranges of the pages are squashed into a single scatterlist node up to the
- * maximum size specified in @max_segment. An user may provide an offset at a
- * start and a size of valid data in a buffer specified by the page array.
- * The returned sg table is released by sg_free_table.
+ * Description:
+ * If @prv is NULL, allocate and initialize an sg table from a list of pages,
+ * else reuse the scatterlist passed in at @prv.
+ * Contiguous ranges of the pages are squashed into a single scatterlist
+ * entry up to the maximum size specified in @max_segment. A user may
+ * provide an offset at a start and a size of valid data in a buffer
+ * specified by the page array.
  *
  * Returns:
- * 0 on success, negative error on failure
+ * Last SGE in sgt on success, PTR_ERR on otherwise.
+ * The allocation in @sgt must be released by sg_free_table.
+ *
+ * Notes:
+ * If this function returns non-0 (eg failure), the caller must call
+ * sg_free_table() to cleanup any leftover allocations.
  */
-int __sg_alloc_table_from_pages(struct sg_table *sgt, struct page **pages,
-				unsigned int n_pages, unsigned int offset,
-				unsigned long size, unsigned int max_segment,
-				gfp_t gfp_mask)
+struct scatterlist *__sg_alloc_table_from_pages(struct sg_table *sgt,
+		struct page **pages, unsigned int n_pages, unsigned int offset,
+		unsigned long size, unsigned int max_segment,
+		struct scatterlist *prv, unsigned int left_pages,
+		gfp_t gfp_mask)
 {
-	unsigned int chunks, cur_page, seg_len, i;
-	int ret;
-	struct scatterlist *s;
+	unsigned int chunks, cur_page, seg_len, i, prv_len = 0;
+	unsigned int added_nents = 0;
+	struct scatterlist *s = prv;
 
-	if (WARN_ON(!max_segment || offset_in_page(max_segment)))
-		return -EINVAL;
+	/*
+	 * The algorithm below requires max_segment to be aligned to PAGE_SIZE
+	 * otherwise it can overshoot.
+	 */
+	max_segment = ALIGN_DOWN(max_segment, PAGE_SIZE);
+	if (WARN_ON(max_segment < PAGE_SIZE))
+		return ERR_PTR(-EINVAL);
+
+	if (IS_ENABLED(CONFIG_ARCH_NO_SG_CHAIN) && prv)
+		return ERR_PTR(-EOPNOTSUPP);
+
+	if (prv) {
+		unsigned long paddr = (page_to_pfn(sg_page(prv)) * PAGE_SIZE +
+				       prv->offset + prv->length) /
+				      PAGE_SIZE;
+
+		if (WARN_ON(offset))
+			return ERR_PTR(-EINVAL);
+
+		/* Merge contiguous pages into the last SG */
+		prv_len = prv->length;
+		while (n_pages && page_to_pfn(pages[0]) == paddr) {
+			if (prv->length + PAGE_SIZE > max_segment)
+				break;
+			prv->length += PAGE_SIZE;
+			paddr++;
+			pages++;
+			n_pages--;
+		}
+		if (!n_pages)
+			goto out;
+	}
 
 	/* compute number of contiguous chunks */
 	chunks = 1;
···
 		}
 	}
 
-	ret = sg_alloc_table(sgt, chunks, gfp_mask);
-	if (unlikely(ret))
-		return ret;
-
 	/* merging chunks and putting them into the scatterlist */
 	cur_page = 0;
-	for_each_sg(sgt->sgl, s, sgt->orig_nents, i) {
+	for (i = 0; i < chunks; i++) {
 		unsigned int j, chunk_size;
 
 		/* look for the end of the current chunk */
···
 				break;
 		}
 
+		/* Pass how many chunks might be left */
+		s = get_next_sg(sgt, s, chunks - i + left_pages, gfp_mask);
+		if (IS_ERR(s)) {
+			/*
+			 * Adjust entry length to be as before function was
+			 * called.
+			 */
+			if (prv)
+				prv->length = prv_len;
+			return s;
+		}
 		chunk_size = ((j - cur_page) << PAGE_SHIFT) - offset;
 		sg_set_page(s, pages[cur_page],
 			    min_t(unsigned long, size, chunk_size), offset);
+		added_nents++;
 		size -= chunk_size;
 		offset = 0;
 		cur_page = j;
 	}
-
-	return 0;
+	sgt->nents += added_nents;
+out:
+	if (!left_pages)
+		sg_mark_end(s);
+	return s;
 }
 EXPORT_SYMBOL(__sg_alloc_table_from_pages);
 
···
 			      unsigned int n_pages, unsigned int offset,
 			      unsigned long size, gfp_t gfp_mask)
 {
-	return __sg_alloc_table_from_pages(sgt, pages, n_pages, offset, size,
-					   SCATTERLIST_MAX_SEGMENT, gfp_mask);
+	return PTR_ERR_OR_ZERO(__sg_alloc_table_from_pages(sgt, pages, n_pages,
+			offset, size, UINT_MAX, NULL, 0, gfp_mask));
 }
 EXPORT_SYMBOL(sg_alloc_table_from_pages);
 
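The lib/scatterlist.c change above rounds @max_segment down to a PAGE_SIZE multiple and then counts contiguous chunks. The counting loop itself falls in an elided part of the diff, so the following is only a userspace model of that step, not the kernel code. It assumes a 4 KiB page, takes plain PFN values instead of struct page pointers, and uses hypothetical `sketch_*` names throughout.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 4 KiB pages; the kernel uses the arch-defined PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 4096u

/* Round max_segment down to a page multiple, modelling ALIGN_DOWN(). */
static unsigned int sketch_align_down(unsigned int max_segment)
{
	return max_segment - (max_segment % SKETCH_PAGE_SIZE);
}

/*
 * Count how many scatterlist entries a PFN array would need: a new chunk
 * starts whenever the next PFN is not contiguous with the previous one,
 * or the current chunk has reached the (page-aligned) max_segment.
 */
static unsigned int sketch_count_chunks(const unsigned long *pfns,
					size_t n_pages,
					unsigned int max_segment)
{
	unsigned int chunks = 1, seg_len = 0;
	size_t i;

	max_segment = sketch_align_down(max_segment);
	/* Mirrors the patch's rejection of max_segment < PAGE_SIZE. */
	assert(n_pages > 0 && max_segment >= SKETCH_PAGE_SIZE);

	for (i = 1; i < n_pages; i++) {
		seg_len += SKETCH_PAGE_SIZE;
		if (seg_len >= max_segment || pfns[i] != pfns[i - 1] + 1) {
			chunks++;
			seg_len = 0;
		}
	}
	return chunks;
}
```

For PFNs 0, 1, 2, 4, 5 with a two-page limit, the model splits the input into [0,1], [2] and [4,5]; with a UINT_MAX limit only the gap between 2 and 4 forces a split.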
+2 -1
tools/testing/scatterlist/Makefile
···
 main: $(OFILES)
 
 clean:
-	$(RM) $(TARGETS) $(OFILES) scatterlist.c linux/scatterlist.h linux/highmem.h linux/kmemleak.h asm/io.h
+	$(RM) $(TARGETS) $(OFILES) scatterlist.c linux/scatterlist.h linux/highmem.h linux/kmemleak.h linux/slab.h asm/io.h
 	@rmdir asm
 
 scatterlist.c: ../../../lib/scatterlist.c
···
 	@touch asm/io.h
 	@touch linux/highmem.h
 	@touch linux/kmemleak.h
+	@touch linux/slab.h
 	@cp $< linux/scatterlist.h
+35
tools/testing/scatterlist/linux/mm.h
···
 	return malloc(size);
 }
 
+static inline void *
+kmalloc_array(unsigned int n, unsigned int size, unsigned int flags)
+{
+	return malloc(n * size);
+}
+
 #define kfree(x) free(x)
 
 #define kmemleak_alloc(a, b, c, d)
···
 
 #define PageSlab(p) (0)
 #define flush_kernel_dcache_page(p)
+
+#define MAX_ERRNO 4095
+
+#define IS_ERR_VALUE(x) unlikely((unsigned long)(void *)(x) >= (unsigned long)-MAX_ERRNO)
+
+static inline void * __must_check ERR_PTR(long error)
+{
+	return (void *) error;
+}
+
+static inline long __must_check PTR_ERR(__force const void *ptr)
+{
+	return (long) ptr;
+}
+
+static inline bool __must_check IS_ERR(__force const void *ptr)
+{
+	return IS_ERR_VALUE((unsigned long)ptr);
+}
+
+static inline int __must_check PTR_ERR_OR_ZERO(__force const void *ptr)
+{
+	if (IS_ERR(ptr))
+		return PTR_ERR(ptr);
+	else
+		return 0;
+}
+
+#define IS_ENABLED(x) (0)
 
 #endif
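The shims added to the test harness above encode a negative errno in the top 4095 values of the pointer range, which is how the reworked __sg_alloc_table_from_pages() can return either a scatterlist pointer or an error. A minimal standalone sketch of that round trip follows. The `sketch_*` names and the `sketch_get_entry()` caller are hypothetical, and the annotations (`__must_check`, `__force`, `unlikely`) are dropped for portability.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
static inline void *sketch_err_ptr(long error)
{
	return (void *)error;
}

/* A pointer is an error if it lies in the last MAX_ERRNO addresses. */
static inline bool sketch_is_err(const void *ptr)
{
	return (unsigned long)ptr >= (unsigned long)-SKETCH_MAX_ERRNO;
}

/* Collapse a pointer-or-error into the classic int return convention. */
static inline int sketch_ptr_err_or_zero(const void *ptr)
{
	return sketch_is_err(ptr) ? (int)(long)ptr : 0;
}

/* Hypothetical producer: returns an entry, or an encoded -ENOMEM. */
static inline int *sketch_get_entry(int *storage, bool fail)
{
	if (fail)
		return sketch_err_ptr(-ENOMEM);
	return storage;
}
```

A caller that only needs success or failure can wrap the call in `sketch_ptr_err_or_zero()`, which is the same shape sg_alloc_table_from_pages() takes with PTR_ERR_OR_ZERO() in this commit.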
+38 -15
tools/testing/scatterlist/main.c
···
 
 #define MAX_PAGES (64)
 
+struct test {
+	int alloc_ret;
+	unsigned num_pages;
+	unsigned *pfn;
+	unsigned size;
+	unsigned int max_seg;
+	unsigned int expected_segments;
+};
+
 static void set_pages(struct page **pages, const unsigned *array, unsigned num)
 {
 	unsigned int i;
···
 
 #define pfn(...) (unsigned []){ __VA_ARGS__ }
 
+static void fail(struct test *test, struct sg_table *st, const char *cond)
+{
+	unsigned int i;
+
+	fprintf(stderr, "Failed on '%s'!\n\n", cond);
+
+	printf("size = %u, max segment = %u, expected nents = %u\nst->nents = %u, st->orig_nents= %u\n",
+	       test->size, test->max_seg, test->expected_segments, st->nents,
+	       st->orig_nents);
+
+	printf("%u input PFNs:", test->num_pages);
+	for (i = 0; i < test->num_pages; i++)
+		printf(" %x", test->pfn[i]);
+	printf("\n");
+
+	exit(1);
+}
+
+#define VALIDATE(cond, st, test) \
+	if (!(cond)) \
+		fail((test), (st), #cond);
+
 int main(void)
 {
 	const unsigned int sgmax = SCATTERLIST_MAX_SEGMENT;
-	struct test {
-		int alloc_ret;
-		unsigned num_pages;
-		unsigned *pfn;
-		unsigned size;
-		unsigned int max_seg;
-		unsigned int expected_segments;
-	} *test, tests[] = {
+	struct test *test, tests[] = {
 		{ -EINVAL, 1, pfn(0), PAGE_SIZE, PAGE_SIZE + 1, 1 },
 		{ -EINVAL, 1, pfn(0), PAGE_SIZE, 0, 1 },
 		{ -EINVAL, 1, pfn(0), PAGE_SIZE, sgmax + 1, 1 },
···
 	for (i = 0, test = tests; test->expected_segments; test++, i++) {
 		struct page *pages[MAX_PAGES];
 		struct sg_table st;
-		int ret;
+		struct scatterlist *sg;
 
 		set_pages(pages, test->pfn, test->num_pages);
 
-		ret = __sg_alloc_table_from_pages(&st, pages, test->num_pages,
-						  0, test->size, test->max_seg,
-						  GFP_KERNEL);
-		assert(ret == test->alloc_ret);
+		sg = __sg_alloc_table_from_pages(&st, pages, test->num_pages, 0,
+				test->size, test->max_seg, NULL, 0, GFP_KERNEL);
+		assert(PTR_ERR_OR_ZERO(sg) == test->alloc_ret);
 
 		if (test->alloc_ret)
 			continue;
 
-		assert(st.nents == test->expected_segments);
-		assert(st.orig_nents == test->expected_segments);
+		VALIDATE(st.nents == test->expected_segments, &st, test);
+		VALIDATE(st.orig_nents == test->expected_segments, &st, test);
 
 		sg_free_table(&st);
 	}