Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge branch 'devlink-mlx5-add-new-parameters-for-link-management-and-sriov-eswitch-configurations'

Saeed Mahameed says:

====================
devlink, mlx5: Add new parameters for link management and SRIOV/eSwitch configurations [part]

This patch series introduces several devlink parameters that improve
device configuration, link management, and SRIOV/eSwitch setup by
exposing NV config boot-time parameters.

Implement the following parameters:

a) total_vfs Parameter:
-----------------------

Adds support for managing the number of VFs (total_vfs) and enabling
SR-IOV (enable_sriov for mlx5) through devlink. This gives users control
over virtualization features directly from standard kernel interfaces,
without relying on external tools. total_vfs is critical for
environments that need a flexible number of VFs.

b) CQE Compression Type:
------------------------

Introduces a new devlink parameter, cqe_compress_type, to configure the
rate of CQE compression based on PCIe bus conditions. This setting
provides a balance between compression efficiency and overall NIC
performance under different traffic loads.
====================

Link: https://patch.msgid.link/20250907012953.301746-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
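
The cqe_compress_type parameter described in the cover letter takes a small fixed set of strings (``balanced`` and ``aggressive``, per the documentation added in this merge). As a minimal userspace sketch of how such a string parameter can be mapped to a numeric setting and back, assuming the value is simply the index of the name (0 for balanced, 1 for aggressive); the helper names here are hypothetical and are not part of the driver:

```c
#include <stddef.h>
#include <string.h>

/* Accepted names. Assumption for this sketch: the array index is the
 * numeric value (0 = balanced, 1 = aggressive).
 */
static const char *const cqe_compress_names[] = { "balanced", "aggressive" };

#define CQE_COMPRESS_COUNT \
	(sizeof(cqe_compress_names) / sizeof(cqe_compress_names[0]))

/* Hypothetical helper: numeric value for a name, or -1 if not accepted. */
static int cqe_compress_name_to_value(const char *name)
{
	size_t i;

	if (!name)
		return -1;

	for (i = 0; i < CQE_COMPRESS_COUNT; i++) {
		if (!strcmp(name, cqe_compress_names[i]))
			return (int)i;
	}
	return -1;
}

/* Hypothetical helper: name for a numeric value, or NULL if out of range. */
static const char *cqe_compress_value_to_name(unsigned int value)
{
	if (value >= CQE_COMPRESS_COUNT)
		return NULL;
	return cqe_compress_names[value];
}
```

Keeping the names in one table means the "string to value" and "value to string" directions cannot drift apart when a new mode is added.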

+658 -4
+43 -3
Documentation/networking/devlink/mlx5.rst
···
    * - Name
      - Mode
      - Validation
+     - Notes
    * - ``enable_roce``
      - driverinit
-     - Type: Boolean
-
-       If the device supports RoCE disablement, RoCE enablement state controls
+     - Boolean
+     - If the device supports RoCE disablement, RoCE enablement state controls
        device support for RoCE capability. Otherwise, the control occurs in the
        driver stack. When RoCE is disabled at the driver level, only raw
        ethernet QPs are supported.
    * - ``io_eq_size``
      - driverinit
      - The range is between 64 and 4096.
+     -
    * - ``event_eq_size``
      - driverinit
      - The range is between 64 and 4096.
+     -
    * - ``max_macs``
      - driverinit
      - The range is between 1 and 2^31. Only power of 2 values are supported.
+     -
+   * - ``enable_sriov``
+     - permanent
+     - Boolean
+     - Applies to each physical function (PF) independently, if the device
+       supports it. Otherwise, it applies symmetrically to all PFs.
+   * - ``total_vfs``
+     - permanent
+     - The range is between 1 and a device-specific max.
+     - Applies to each physical function (PF) independently, if the device
+       supports it. Otherwise, it applies symmetrically to all PFs.
+
+Note: permanent parameters such as ``enable_sriov`` and ``total_vfs`` require FW reset to take effect
+
+.. code-block:: bash
+
+   # setup parameters
+   devlink dev param set pci/0000:01:00.0 name enable_sriov value true cmode permanent
+   devlink dev param set pci/0000:01:00.0 name total_vfs value 8 cmode permanent
+
+   # Fw reset
+   devlink dev reload pci/0000:01:00.0 action fw_activate
+
+   # for PCI related config such as sriov PCI reset/rescan is required:
+   echo 1 >/sys/bus/pci/devices/0000:01:00.0/remove
+   echo 1 >/sys/bus/pci/rescan
+   grep ^ /sys/bus/pci/devices/0000:01:00.0/sriov_*
+
 
 The ``mlx5`` driver also implements the following driver-specific
 parameters.
···
      - u32
      - driverinit
      - Control the size (in packets) of the hairpin queues.
+
+   * - ``cqe_compress_type``
+     - string
+     - permanent
+     - Configure which mechanism/algorithm should be used by the NIC that will
+       affect the rate (aggressiveness) of compressed CQEs depending on PCIe bus
+       conditions and other internal NIC factors. This mode affects all queues
+       that enable compression.
+       * ``balanced`` : Merges fewer CQEs, resulting in a moderate compression ratio but maintaining a balance between bandwidth savings and performance
+       * ``aggressive`` : Merges more CQEs into a single entry, achieving a higher compression rate and maximizing performance, particularly under high traffic loads
 
 The ``mlx5`` driver supports reloading via ``DEVLINK_CMD_RELOAD``
 
+1 -1
drivers/net/ethernet/mellanox/mlx5/core/Makefile
···
 		fs_counters.o fs_ft_pool.o rl.o lag/debugfs.o lag/lag.o dev.o events.o wq.o lib/gid.o \
 		lib/devcom.o lib/pci_vsc.o lib/dm.o lib/fs_ttc.o diag/fs_tracepoint.o \
 		diag/fw_tracer.o diag/crdump.o devlink.o diag/rsc_dump.o diag/reporter_vnic.o \
-		fw_reset.o qos.o lib/tout.o lib/aso.o wc.o fs_pool.o
+		fw_reset.o qos.o lib/tout.o lib/aso.o wc.o fs_pool.o lib/nv_param.o
 
 #
 # Netdev basic
+8
drivers/net/ethernet/mellanox/mlx5/core/devlink.c
···
 #include "esw/qos.h"
 #include "sf/dev/dev.h"
 #include "sf/sf.h"
+#include "lib/nv_param.h"
 
 static int mlx5_devlink_flash_update(struct devlink *devlink,
 				     struct devlink_flash_update_params *params,
···
 	if (err)
 		goto max_uc_list_err;
 
+	err = mlx5_nv_param_register_dl_params(devlink);
+	if (err)
+		goto nv_param_err;
+
 	return 0;
 
+nv_param_err:
+	mlx5_devlink_max_uc_list_params_unregister(devlink);
 max_uc_list_err:
 	mlx5_devlink_auxdev_params_unregister(devlink);
 auxdev_reg_err:
···
 
 void mlx5_devlink_params_unregister(struct devlink *devlink)
 {
+	mlx5_nv_param_unregister_dl_params(devlink);
 	mlx5_devlink_max_uc_list_params_unregister(devlink);
 	mlx5_devlink_auxdev_params_unregister(devlink);
 	devl_params_unregister(devlink, mlx5_devlink_params,
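
The devlink.c hunks above add one more registration step and a matching error label, so that a failure in the new step undoes the earlier steps in reverse order. As a minimal userspace sketch of that goto-based unwinding pattern (every name below is a hypothetical stand-in, not an mlx5 function), with a small trace buffer so the call order can be observed:

```c
#include <stddef.h>

/* All names in this sketch are hypothetical stand-ins. */
static int fail_at;		/* step number that should fail; 0 = none */
static char trace[32];		/* records the order of successful calls */
static size_t trace_len;

static void trace_reset(void)
{
	trace_len = 0;
	trace[0] = '\0';
}

static void trace_note(char tag)
{
	if (trace_len < sizeof(trace) - 1) {
		trace[trace_len++] = tag;
		trace[trace_len] = '\0';
	}
}

/* Pretend to register step 'step'; record 'tag' on success. */
static int step_register(int step, char tag)
{
	if (fail_at == step)
		return -1;
	trace_note(tag);
	return 0;
}

static void step_unregister(char tag)
{
	trace_note(tag);
}

/* Register A, B, C in order; on failure undo what succeeded, newest first. */
static int params_register(void)
{
	int err;

	err = step_register(1, 'A');
	if (err)
		return err;

	err = step_register(2, 'B');
	if (err)
		goto b_err;

	err = step_register(3, 'C');
	if (err)
		goto c_err;

	return 0;

c_err:
	step_unregister('b');	/* undo B because C failed */
b_err:
	step_unregister('a');	/* undo A because B (or C) failed */
	return err;
}
```

Each label is named after the step that failed and undoes the step before it, then falls through to the older labels. That is why the new label in the hunk unregisters the previously registered group rather than the new one.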
+1
drivers/net/ethernet/mellanox/mlx5/core/devlink.h
···
 	MLX5_DEVLINK_PARAM_ID_ESW_MULTIPORT,
 	MLX5_DEVLINK_PARAM_ID_HAIRPIN_NUM_QUEUES,
 	MLX5_DEVLINK_PARAM_ID_HAIRPIN_QUEUE_SIZE,
+	MLX5_DEVLINK_PARAM_ID_CQE_COMPRESSION_TYPE
 };
 
 struct mlx5_trap_ctx {
+576
drivers/net/ethernet/mellanox/mlx5/core/lib/nv_param.c
···
+// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
+/* Copyright (c) 2024, NVIDIA CORPORATION & AFFILIATES. All rights reserved. */
+
+#include "nv_param.h"
+#include "mlx5_core.h"
+
+enum {
+	MLX5_CLASS_0_CTRL_ID_NV_GLOBAL_PCI_CONF = 0x80,
+	MLX5_CLASS_0_CTRL_ID_NV_GLOBAL_PCI_CAP = 0x81,
+	MLX5_CLASS_0_CTRL_ID_NV_SW_OFFLOAD_CONFIG = 0x10a,
+
+	MLX5_CLASS_3_CTRL_ID_NV_PF_PCI_CONF = 0x80,
+};
+
+struct mlx5_ifc_configuration_item_type_class_global_bits {
+	u8 type_class[0x8];
+	u8 parameter_index[0x18];
+};
+
+struct mlx5_ifc_configuration_item_type_class_per_host_pf_bits {
+	u8 type_class[0x8];
+	u8 pf_index[0x6];
+	u8 pci_bus_index[0x8];
+	u8 parameter_index[0xa];
+};
+
+union mlx5_ifc_config_item_type_auto_bits {
+	struct mlx5_ifc_configuration_item_type_class_global_bits
+				configuration_item_type_class_global;
+	struct mlx5_ifc_configuration_item_type_class_per_host_pf_bits
+				configuration_item_type_class_per_host_pf;
+	u8 reserved_at_0[0x20];
+};
+
+struct mlx5_ifc_config_item_bits {
+	u8 valid[0x2];
+	u8 priority[0x2];
+	u8 header_type[0x2];
+	u8 ovr_en[0x1];
+	u8 rd_en[0x1];
+	u8 access_mode[0x2];
+	u8 reserved_at_a[0x1];
+	u8 writer_id[0x5];
+	u8 version[0x4];
+	u8 reserved_at_14[0x2];
+	u8 host_id_valid[0x1];
+	u8 length[0x9];
+
+	union mlx5_ifc_config_item_type_auto_bits type;
+
+	u8 reserved_at_40[0x10];
+	u8 crc16[0x10];
+};
+
+struct mlx5_ifc_mnvda_reg_bits {
+	struct mlx5_ifc_config_item_bits configuration_item_header;
+
+	u8 configuration_item_data[64][0x20];
+};
+
+struct mlx5_ifc_nv_global_pci_conf_bits {
+	u8 sriov_valid[0x1];
+	u8 reserved_at_1[0x10];
+	u8 per_pf_total_vf[0x1];
+	u8 reserved_at_12[0xe];
+
+	u8 sriov_en[0x1];
+	u8 reserved_at_21[0xf];
+	u8 total_vfs[0x10];
+
+	u8 reserved_at_40[0x20];
+};
+
+struct mlx5_ifc_nv_global_pci_cap_bits {
+	u8 max_vfs_per_pf_valid[0x1];
+	u8 reserved_at_1[0x13];
+	u8 per_pf_total_vf_supported[0x1];
+	u8 reserved_at_15[0xb];
+
+	u8 sriov_support[0x1];
+	u8 reserved_at_21[0xf];
+	u8 max_vfs_per_pf[0x10];
+
+	u8 reserved_at_40[0x60];
+};
+
+struct mlx5_ifc_nv_pf_pci_conf_bits {
+	u8 reserved_at_0[0x9];
+	u8 pf_total_vf_en[0x1];
+	u8 reserved_at_a[0x16];
+
+	u8 reserved_at_20[0x20];
+
+	u8 reserved_at_40[0x10];
+	u8 total_vf[0x10];
+
+	u8 reserved_at_60[0x20];
+};
+
+struct mlx5_ifc_nv_sw_offload_conf_bits {
+	u8 ip_over_vxlan_port[0x10];
+	u8 tunnel_ecn_copy_offload_disable[0x1];
+	u8 pci_atomic_mode[0x3];
+	u8 sr_enable[0x1];
+	u8 ptp_cyc2realtime[0x1];
+	u8 vector_calc_disable[0x1];
+	u8 uctx_en[0x1];
+	u8 prio_tag_required_en[0x1];
+	u8 esw_fdb_ipv4_ttl_modify_enable[0x1];
+	u8 mkey_by_name[0x1];
+	u8 ip_over_vxlan_en[0x1];
+	u8 one_qp_per_recovery[0x1];
+	u8 cqe_compression[0x3];
+	u8 tunnel_udp_entropy_proto_disable[0x1];
+	u8 reserved_at_21[0x1];
+	u8 ar_enable[0x1];
+	u8 log_max_outstanding_wqe[0x5];
+	u8 vf_migration[0x2];
+	u8 log_tx_psn_win[0x6];
+	u8 lro_log_timeout3[0x4];
+	u8 lro_log_timeout2[0x4];
+	u8 lro_log_timeout1[0x4];
+	u8 lro_log_timeout0[0x4];
+};
+
+#define MNVDA_HDR_SZ \
+	(MLX5_ST_SZ_BYTES(mnvda_reg) - \
+	 MLX5_BYTE_OFF(mnvda_reg, configuration_item_data))
+
+#define MLX5_SET_CFG_ITEM_TYPE(_cls_name, _mnvda_ptr, _field, _val) \
+	MLX5_SET(mnvda_reg, _mnvda_ptr, \
+		 configuration_item_header.type.configuration_item_type_class_##_cls_name._field, \
+		 _val)
+
+#define MLX5_SET_CFG_HDR_LEN(_mnvda_ptr, _cls_name) \
+	MLX5_SET(mnvda_reg, _mnvda_ptr, configuration_item_header.length, \
+		 MLX5_ST_SZ_BYTES(_cls_name))
+
+#define MLX5_GET_CFG_HDR_LEN(_mnvda_ptr) \
+	MLX5_GET(mnvda_reg, _mnvda_ptr, configuration_item_header.length)
+
+static int mlx5_nv_param_read(struct mlx5_core_dev *dev, void *mnvda,
+			      size_t len)
+{
+	u32 param_idx, type_class;
+	u32 header_len;
+	void *cls_ptr;
+	int err;
+
+	if (WARN_ON(len > MLX5_ST_SZ_BYTES(mnvda_reg)) || len < MNVDA_HDR_SZ)
+		return -EINVAL; /* A caller bug */
+
+	err = mlx5_core_access_reg(dev, mnvda, len, mnvda, len, MLX5_REG_MNVDA,
+				   0, 0);
+	if (!err)
+		return 0;
+
+	cls_ptr = MLX5_ADDR_OF(mnvda_reg, mnvda,
+			       configuration_item_header.type.configuration_item_type_class_global);
+
+	type_class = MLX5_GET(configuration_item_type_class_global, cls_ptr,
+			      type_class);
+	param_idx = MLX5_GET(configuration_item_type_class_global, cls_ptr,
+			     parameter_index);
+	header_len = MLX5_GET_CFG_HDR_LEN(mnvda);
+
+	mlx5_core_warn(dev, "Failed to read mnvda reg: type_class 0x%x, param_idx 0x%x, header_len %u, err %d\n",
+		       type_class, param_idx, header_len, err);
+
+	return -EOPNOTSUPP;
+}
+
+static int mlx5_nv_param_write(struct mlx5_core_dev *dev, void *mnvda,
+			       size_t len)
+{
+	if (WARN_ON(len > MLX5_ST_SZ_BYTES(mnvda_reg)) || len < MNVDA_HDR_SZ)
+		return -EINVAL;
+
+	if (WARN_ON(MLX5_GET_CFG_HDR_LEN(mnvda) == 0))
+		return -EINVAL;
+
+	return mlx5_core_access_reg(dev, mnvda, len, mnvda, len, MLX5_REG_MNVDA,
+				    0, 1);
+}
+
+static int
+mlx5_nv_param_read_sw_offload_conf(struct mlx5_core_dev *dev, void *mnvda,
+				   size_t len)
+{
+	MLX5_SET_CFG_ITEM_TYPE(global, mnvda, type_class, 0);
+	MLX5_SET_CFG_ITEM_TYPE(global, mnvda, parameter_index,
+			       MLX5_CLASS_0_CTRL_ID_NV_SW_OFFLOAD_CONFIG);
+	MLX5_SET_CFG_HDR_LEN(mnvda, nv_sw_offload_conf);
+
+	return mlx5_nv_param_read(dev, mnvda, len);
+}
+
+static const char *const
+	cqe_compress_str[] = { "balanced", "aggressive" };
+
+static int
+mlx5_nv_param_devlink_cqe_compress_get(struct devlink *devlink, u32 id,
+				       struct devlink_param_gset_ctx *ctx)
+{
+	struct mlx5_core_dev *dev = devlink_priv(devlink);
+	u32 mnvda[MLX5_ST_SZ_DW(mnvda_reg)] = {};
+	u8 value = U8_MAX;
+	void *data;
+	int err;
+
+	err = mlx5_nv_param_read_sw_offload_conf(dev, mnvda, sizeof(mnvda));
+	if (err)
+		return err;
+
+	data = MLX5_ADDR_OF(mnvda_reg, mnvda, configuration_item_data);
+	value = MLX5_GET(nv_sw_offload_conf, data, cqe_compression);
+
+	if (value >= ARRAY_SIZE(cqe_compress_str))
+		return -EOPNOTSUPP;
+
+	strscpy(ctx->val.vstr, cqe_compress_str[value], sizeof(ctx->val.vstr));
+	return 0;
+}
+
+static int
+mlx5_nv_param_devlink_cqe_compress_validate(struct devlink *devlink, u32 id,
+					    union devlink_param_value val,
+					    struct netlink_ext_ack *extack)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(cqe_compress_str); i++) {
+		if (!strcmp(val.vstr, cqe_compress_str[i]))
+			return 0;
+	}
+
+	NL_SET_ERR_MSG_MOD(extack,
+			   "Invalid value, supported values are balanced/aggressive");
+	return -EOPNOTSUPP;
+}
+
+static int
+mlx5_nv_param_devlink_cqe_compress_set(struct devlink *devlink, u32 id,
+				       struct devlink_param_gset_ctx *ctx,
+				       struct netlink_ext_ack *extack)
+{
+	struct mlx5_core_dev *dev = devlink_priv(devlink);
+	u32 mnvda[MLX5_ST_SZ_DW(mnvda_reg)] = {};
+	int err = 0;
+	void *data;
+	u8 value;
+
+	if (!strcmp(ctx->val.vstr, "aggressive"))
+		value = 1;
+	else /* balanced: can't be anything else already validated above */
+		value = 0;
+
+	err = mlx5_nv_param_read_sw_offload_conf(dev, mnvda, sizeof(mnvda));
+	if (err) {
+		NL_SET_ERR_MSG_MOD(extack,
+				   "Failed to read sw_offload_conf mnvda reg");
+		return err;
+	}
+
+	data = MLX5_ADDR_OF(mnvda_reg, mnvda, configuration_item_data);
+	MLX5_SET(nv_sw_offload_conf, data, cqe_compression, value);
+
+	return mlx5_nv_param_write(dev, mnvda, sizeof(mnvda));
+}
+
+static int mlx5_nv_param_read_global_pci_conf(struct mlx5_core_dev *dev,
+					      void *mnvda, size_t len)
+{
+	MLX5_SET_CFG_ITEM_TYPE(global, mnvda, type_class, 0);
+	MLX5_SET_CFG_ITEM_TYPE(global, mnvda, parameter_index,
+			       MLX5_CLASS_0_CTRL_ID_NV_GLOBAL_PCI_CONF);
+	MLX5_SET_CFG_HDR_LEN(mnvda, nv_global_pci_conf);
+
+	return mlx5_nv_param_read(dev, mnvda, len);
+}
+
+static int mlx5_nv_param_read_global_pci_cap(struct mlx5_core_dev *dev,
+					     void *mnvda, size_t len)
+{
+	MLX5_SET_CFG_ITEM_TYPE(global, mnvda, type_class, 0);
+	MLX5_SET_CFG_ITEM_TYPE(global, mnvda, parameter_index,
+			       MLX5_CLASS_0_CTRL_ID_NV_GLOBAL_PCI_CAP);
+	MLX5_SET_CFG_HDR_LEN(mnvda, nv_global_pci_cap);
+
+	return mlx5_nv_param_read(dev, mnvda, len);
+}
+
+static int mlx5_nv_param_read_per_host_pf_conf(struct mlx5_core_dev *dev,
+					       void *mnvda, size_t len)
+{
+	MLX5_SET_CFG_ITEM_TYPE(per_host_pf, mnvda, type_class, 3);
+	MLX5_SET_CFG_ITEM_TYPE(per_host_pf, mnvda, parameter_index,
+			       MLX5_CLASS_3_CTRL_ID_NV_PF_PCI_CONF);
+	MLX5_SET_CFG_HDR_LEN(mnvda, nv_pf_pci_conf);
+
+	return mlx5_nv_param_read(dev, mnvda, len);
+}
+
+static int mlx5_devlink_enable_sriov_get(struct devlink *devlink, u32 id,
+					 struct devlink_param_gset_ctx *ctx)
+{
+	struct mlx5_core_dev *dev = devlink_priv(devlink);
+	u32 mnvda[MLX5_ST_SZ_DW(mnvda_reg)] = {};
+	bool sriov_en = false;
+	void *data;
+	int err;
+
+	err = mlx5_nv_param_read_global_pci_cap(dev, mnvda, sizeof(mnvda));
+	if (err)
+		return err;
+
+	data = MLX5_ADDR_OF(mnvda_reg, mnvda, configuration_item_data);
+	if (!MLX5_GET(nv_global_pci_cap, data, sriov_support)) {
+		ctx->val.vbool = false;
+		return 0;
+	}
+
+	memset(mnvda, 0, sizeof(mnvda));
+	err = mlx5_nv_param_read_global_pci_conf(dev, mnvda, sizeof(mnvda));
+	if (err)
+		return err;
+
+	data = MLX5_ADDR_OF(mnvda_reg, mnvda, configuration_item_data);
+	sriov_en = MLX5_GET(nv_global_pci_conf, data, sriov_en);
+	if (!MLX5_GET(nv_global_pci_conf, data, per_pf_total_vf)) {
+		ctx->val.vbool = sriov_en;
+		return 0;
+	}
+
+	/* SRIOV is per PF */
+	memset(mnvda, 0, sizeof(mnvda));
+	err = mlx5_nv_param_read_per_host_pf_conf(dev, mnvda, sizeof(mnvda));
+	if (err)
+		return err;
+
+	data = MLX5_ADDR_OF(mnvda_reg, mnvda, configuration_item_data);
+	ctx->val.vbool = sriov_en &&
+			 MLX5_GET(nv_pf_pci_conf, data, pf_total_vf_en);
+	return 0;
+}
+
+static int mlx5_devlink_enable_sriov_set(struct devlink *devlink, u32 id,
+					 struct devlink_param_gset_ctx *ctx,
+					 struct netlink_ext_ack *extack)
+{
+	struct mlx5_core_dev *dev = devlink_priv(devlink);
+	u32 mnvda[MLX5_ST_SZ_DW(mnvda_reg)] = {};
+	bool per_pf_support;
+	void *cap, *data;
+	int err;
+
+	err = mlx5_nv_param_read_global_pci_cap(dev, mnvda, sizeof(mnvda));
+	if (err) {
+		NL_SET_ERR_MSG_MOD(extack,
+				   "Failed to read global PCI capability");
+		return err;
+	}
+
+	cap = MLX5_ADDR_OF(mnvda_reg, mnvda, configuration_item_data);
+	per_pf_support = MLX5_GET(nv_global_pci_cap, cap,
+				  per_pf_total_vf_supported);
+
+	if (!MLX5_GET(nv_global_pci_cap, cap, sriov_support)) {
+		NL_SET_ERR_MSG_MOD(extack,
+				   "SRIOV is not supported on this device");
+		return -EOPNOTSUPP;
+	}
+
+	if (!per_pf_support) {
+		/* We don't allow global SRIOV setting on per PF devlink */
+		NL_SET_ERR_MSG_MOD(extack,
+				   "SRIOV is not per PF on this device");
+		return -EOPNOTSUPP;
+	}
+
+	memset(mnvda, 0, sizeof(mnvda));
+	err = mlx5_nv_param_read_global_pci_conf(dev, mnvda, sizeof(mnvda));
+	if (err) {
+		NL_SET_ERR_MSG_MOD(extack,
+				   "Unable to read global PCI configuration");
+		return err;
+	}
+
+	data = MLX5_ADDR_OF(mnvda_reg, mnvda, configuration_item_data);
+
+	/* setup per PF sriov mode */
+	MLX5_SET(nv_global_pci_conf, data, sriov_valid, 1);
+	MLX5_SET(nv_global_pci_conf, data, sriov_en, 1);
+	MLX5_SET(nv_global_pci_conf, data, per_pf_total_vf, 1);
+
+	err = mlx5_nv_param_write(dev, mnvda, sizeof(mnvda));
+	if (err) {
+		NL_SET_ERR_MSG_MOD(extack,
+				   "Unable to write global PCI configuration");
+		return err;
+	}
+
+	/* enable/disable sriov on this PF */
+	memset(mnvda, 0, sizeof(mnvda));
+	err = mlx5_nv_param_read_per_host_pf_conf(dev, mnvda, sizeof(mnvda));
+	if (err) {
+		NL_SET_ERR_MSG_MOD(extack,
+				   "Unable to read per host PF configuration");
+		return err;
+	}
+	MLX5_SET(nv_pf_pci_conf, data, pf_total_vf_en, ctx->val.vbool);
+	return mlx5_nv_param_write(dev, mnvda, sizeof(mnvda));
+}
+
+static int mlx5_devlink_total_vfs_get(struct devlink *devlink, u32 id,
+				      struct devlink_param_gset_ctx *ctx)
+{
+	struct mlx5_core_dev *dev = devlink_priv(devlink);
+	u32 mnvda[MLX5_ST_SZ_DW(mnvda_reg)] = {};
+	void *data;
+	int err;
+
+	data = MLX5_ADDR_OF(mnvda_reg, mnvda, configuration_item_data);
+
+	err = mlx5_nv_param_read_global_pci_cap(dev, mnvda, sizeof(mnvda));
+	if (err)
+		return err;
+
+	if (!MLX5_GET(nv_global_pci_cap, data, sriov_support)) {
+		ctx->val.vu32 = 0;
+		return 0;
+	}
+
+	memset(mnvda, 0, sizeof(mnvda));
+	err = mlx5_nv_param_read_global_pci_conf(dev, mnvda, sizeof(mnvda));
+	if (err)
+		return err;
+
+	if (!MLX5_GET(nv_global_pci_conf, data, per_pf_total_vf)) {
+		ctx->val.vu32 = MLX5_GET(nv_global_pci_conf, data, total_vfs);
+		return 0;
+	}
+
+	/* SRIOV is per PF */
+	memset(mnvda, 0, sizeof(mnvda));
+	err = mlx5_nv_param_read_per_host_pf_conf(dev, mnvda, sizeof(mnvda));
+	if (err)
+		return err;
+
+	ctx->val.vu32 = MLX5_GET(nv_pf_pci_conf, data, total_vf);
+
+	return 0;
+}
+
+static int mlx5_devlink_total_vfs_set(struct devlink *devlink, u32 id,
+				      struct devlink_param_gset_ctx *ctx,
+				      struct netlink_ext_ack *extack)
+{
+	struct mlx5_core_dev *dev = devlink_priv(devlink);
+	u32 mnvda[MLX5_ST_SZ_DW(mnvda_reg)];
+	bool per_pf_support;
+	void *data;
+	int err;
+
+	err = mlx5_nv_param_read_global_pci_cap(dev, mnvda, sizeof(mnvda));
+	if (err) {
+		NL_SET_ERR_MSG_MOD(extack, "Failed to read global pci cap");
+		return err;
+	}
+
+	data = MLX5_ADDR_OF(mnvda_reg, mnvda, configuration_item_data);
+	if (!MLX5_GET(nv_global_pci_cap, data, sriov_support)) {
+		NL_SET_ERR_MSG_MOD(extack, "Not configurable on this device");
+		return -EOPNOTSUPP;
+	}
+
+	per_pf_support = MLX5_GET(nv_global_pci_cap, data,
+				  per_pf_total_vf_supported);
+	if (!per_pf_support) {
+		/* We don't allow global SRIOV setting on per PF devlink */
+		NL_SET_ERR_MSG_MOD(extack,
+				   "SRIOV is not per PF on this device");
+		return -EOPNOTSUPP;
+	}
+
+	memset(mnvda, 0, sizeof(mnvda));
+	err = mlx5_nv_param_read_global_pci_conf(dev, mnvda, sizeof(mnvda));
+	if (err)
+		return err;
+
+	MLX5_SET(nv_global_pci_conf, data, sriov_valid, 1);
+	MLX5_SET(nv_global_pci_conf, data, per_pf_total_vf, per_pf_support);
+
+	if (!per_pf_support) {
+		MLX5_SET(nv_global_pci_conf, data, total_vfs, ctx->val.vu32);
+		return mlx5_nv_param_write(dev, mnvda, sizeof(mnvda));
+	}
+
+	/* SRIOV is per PF */
+	err = mlx5_nv_param_write(dev, mnvda, sizeof(mnvda));
+	if (err)
+		return err;
+
+	memset(mnvda, 0, sizeof(mnvda));
+	err = mlx5_nv_param_read_per_host_pf_conf(dev, mnvda, sizeof(mnvda));
+	if (err)
+		return err;
+
+	data = MLX5_ADDR_OF(mnvda_reg, mnvda, configuration_item_data);
+	MLX5_SET(nv_pf_pci_conf, data, total_vf, ctx->val.vu32);
+	return mlx5_nv_param_write(dev, mnvda, sizeof(mnvda));
+}
+
+static int mlx5_devlink_total_vfs_validate(struct devlink *devlink, u32 id,
+					   union devlink_param_value val,
+					   struct netlink_ext_ack *extack)
+{
+	struct mlx5_core_dev *dev = devlink_priv(devlink);
+	u32 cap[MLX5_ST_SZ_DW(mnvda_reg)];
+	void *data;
+	u16 max;
+	int err;
+
+	data = MLX5_ADDR_OF(mnvda_reg, cap, configuration_item_data);
+
+	err = mlx5_nv_param_read_global_pci_cap(dev, cap, sizeof(cap));
+	if (err)
+		return err;
+
+	if (!MLX5_GET(nv_global_pci_cap, data, max_vfs_per_pf_valid))
+		return 0; /* optimistic, but set might fail later */
+
+	max = MLX5_GET(nv_global_pci_cap, data, max_vfs_per_pf);
+	if (val.vu16 > max) {
+		NL_SET_ERR_MSG_FMT_MOD(extack,
+				       "Max allowed by device is %u", max);
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static const struct devlink_param mlx5_nv_param_devlink_params[] = {
+	DEVLINK_PARAM_GENERIC(ENABLE_SRIOV, BIT(DEVLINK_PARAM_CMODE_PERMANENT),
+			      mlx5_devlink_enable_sriov_get,
+			      mlx5_devlink_enable_sriov_set, NULL),
+	DEVLINK_PARAM_GENERIC(TOTAL_VFS, BIT(DEVLINK_PARAM_CMODE_PERMANENT),
+			      mlx5_devlink_total_vfs_get,
+			      mlx5_devlink_total_vfs_set,
+			      mlx5_devlink_total_vfs_validate),
+	DEVLINK_PARAM_DRIVER(MLX5_DEVLINK_PARAM_ID_CQE_COMPRESSION_TYPE,
+			     "cqe_compress_type", DEVLINK_PARAM_TYPE_STRING,
+			     BIT(DEVLINK_PARAM_CMODE_PERMANENT),
+			     mlx5_nv_param_devlink_cqe_compress_get,
+			     mlx5_nv_param_devlink_cqe_compress_set,
+			     mlx5_nv_param_devlink_cqe_compress_validate),
+};
+
+int mlx5_nv_param_register_dl_params(struct devlink *devlink)
+{
+	if (!mlx5_core_is_pf(devlink_priv(devlink)))
+		return 0;
+
+	return devl_params_register(devlink, mlx5_nv_param_devlink_params,
+				    ARRAY_SIZE(mlx5_nv_param_devlink_params));
+}
+
+void mlx5_nv_param_unregister_dl_params(struct devlink *devlink)
+{
+	if (!mlx5_core_is_pf(devlink_priv(devlink)))
+		return;
+
+	devl_params_unregister(devlink, mlx5_nv_param_devlink_params,
+			       ARRAY_SIZE(mlx5_nv_param_devlink_params));
+}
+
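
In this merge, total_vfs is declared as a u32 generic parameter, while the validate callback in nv_param.c compares val.vu16 against the device maximum. Whether that difference matters in practice depends on the union layout, the host byte order, and the values userspace sends, so the following is only a minimal userspace sketch, with hypothetical helper names, of how a range check on a narrowed value can differ from one on the full 32-bit request:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: range check on the full 32-bit request. */
static bool total_vfs_in_range(uint32_t requested, uint16_t device_max)
{
	return requested <= device_max;
}

/* Hypothetical helper: the same check after narrowing to 16 bits first.
 * The cast keeps only the low 16 bits, so large requests can wrap to a
 * small number and pass.
 */
static bool total_vfs_in_range_narrowed(uint32_t requested,
					uint16_t device_max)
{
	uint16_t low = (uint16_t)requested;

	return low <= device_max;
}
```

For example, with a device maximum of 127, a request of 65544 (65536 + 8) is rejected by the full-width check but accepted by the narrowed one, because its low 16 bits are 8.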
+14
drivers/net/ethernet/mellanox/mlx5/core/lib/nv_param.h
···
+/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */
+/* Copyright (c) 2024, NVIDIA CORPORATION & AFFILIATES. All rights reserved. */
+
+#ifndef __MLX5_NV_PARAM_H
+#define __MLX5_NV_PARAM_H
+
+#include <linux/mlx5/driver.h>
+#include "devlink.h"
+
+int mlx5_nv_param_register_dl_params(struct devlink *devlink);
+void mlx5_nv_param_unregister_dl_params(struct devlink *devlink);
+
+#endif
+
+1
include/linux/mlx5/driver.h
···
 	MLX5_REG_MTCAP = 0x9009,
 	MLX5_REG_MTMP = 0x900A,
 	MLX5_REG_MCIA = 0x9014,
+	MLX5_REG_MNVDA = 0x9024,
 	MLX5_REG_MFRL = 0x9028,
 	MLX5_REG_MLCR = 0x902b,
 	MLX5_REG_MRTC = 0x902d,
+4
include/net/devlink.h
···
 	DEVLINK_PARAM_GENERIC_ID_EVENT_EQ_SIZE,
 	DEVLINK_PARAM_GENERIC_ID_ENABLE_PHC,
 	DEVLINK_PARAM_GENERIC_ID_CLOCK_ID,
+	DEVLINK_PARAM_GENERIC_ID_TOTAL_VFS,
 
 	/* add new param generic ids above here*/
 	__DEVLINK_PARAM_GENERIC_ID_MAX,
···
 
 #define DEVLINK_PARAM_GENERIC_CLOCK_ID_NAME "clock_id"
 #define DEVLINK_PARAM_GENERIC_CLOCK_ID_TYPE DEVLINK_PARAM_TYPE_U64
+
+#define DEVLINK_PARAM_GENERIC_TOTAL_VFS_NAME "total_vfs"
+#define DEVLINK_PARAM_GENERIC_TOTAL_VFS_TYPE DEVLINK_PARAM_TYPE_U32
 
 #define DEVLINK_PARAM_GENERIC(_id, _cmodes, _get, _set, _validate) \
 { \
+5
net/devlink/param.c
···
 		.name = DEVLINK_PARAM_GENERIC_CLOCK_ID_NAME,
 		.type = DEVLINK_PARAM_GENERIC_CLOCK_ID_TYPE,
 	},
+	{
+		.id = DEVLINK_PARAM_GENERIC_ID_TOTAL_VFS,
+		.name = DEVLINK_PARAM_GENERIC_TOTAL_VFS_NAME,
+		.type = DEVLINK_PARAM_GENERIC_TOTAL_VFS_TYPE,
+	},
 };
 
 static int devlink_param_generic_verify(const struct devlink_param *param)
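
The param.c hunk adds a table entry that ties the new generic ID to its name and type, which lets the core check a driver's declaration against one shared definition. As a minimal userspace sketch of that idea, using hypothetical types and names throughout (the kernel's actual devlink_param_generic_verify may differ in what it checks and how it reports mismatches):

```c
#include <string.h>

/* Hypothetical, simplified stand-ins for the devlink types. */
enum sketch_param_type { SKETCH_TYPE_U32, SKETCH_TYPE_U64 };

enum sketch_generic_id {
	SKETCH_ID_CLOCK_ID,
	SKETCH_ID_TOTAL_VFS,
	SKETCH_ID_COUNT,	/* keep last */
};

struct sketch_param {
	unsigned int id;
	const char *name;
	enum sketch_param_type type;
};

/* Shared table, indexed by ID, in the spirit of the generic table above. */
static const struct sketch_param sketch_generic[SKETCH_ID_COUNT] = {
	[SKETCH_ID_CLOCK_ID] = {
		.id = SKETCH_ID_CLOCK_ID,
		.name = "clock_id",
		.type = SKETCH_TYPE_U64,
	},
	[SKETCH_ID_TOTAL_VFS] = {
		.id = SKETCH_ID_TOTAL_VFS,
		.name = "total_vfs",
		.type = SKETCH_TYPE_U32,
	},
};

/* Return 0 if a driver's entry agrees with the shared table, -1 if not. */
static int sketch_generic_verify(const struct sketch_param *param)
{
	const struct sketch_param *ref;

	if (!param || !param->name || param->id >= SKETCH_ID_COUNT)
		return -1;

	ref = &sketch_generic[param->id];
	if (strcmp(param->name, ref->name))
		return -1;
	if (param->type != ref->type)
		return -1;

	return 0;
}
```

Indexing the table by ID with designated initializers keeps each entry next to its enum value, so adding an ID without a table entry shows up as an empty slot rather than a shifted one.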