Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd

Pull iommufd updates from Jason Gunthorpe:
"This broadly brings the assigned HW command queue support to iommufd.
This feature is used to improve SVA performance in VMs by avoiding
paravirtualization traps during SVA invalidations.

Along the way I think some of the core logic is in a much better state
to support future driver backed features.

Summary:

- IOMMU HW now has features to directly assign HW command queues to a
guest VM. In this mode the command queue operates on a limited set
of invalidation commands that are suitable for improving guest
invalidation performance and easy for the HW to virtualize.

This brings the generic infrastructure to allow IOMMU drivers to
expose such command queues through the iommufd uAPI, mmap the
doorbell pages, and get the guest physical range for the command
queue ring itself.

- An implementation for the NVIDIA SMMUv3 extension "cmdqv" is built
on the new iommufd command queue features. It works with the
existing SMMU driver support for cmdqv in guest VMs.

- Many precursor cleanups and improvements to support the above
cleanly, changes to the general ioctl and object helpers, driver
support for VDEVICE, and mmap pgoff cookie infrastructure.

- Sequence VDEVICE destruction to always happen before VFIO device
destruction. When using the above type features, and also in future
confidential compute, the internal virtual device representation
becomes linked to HW or CC TSM configuration and objects. If a VFIO
device is removed from iommufd those HW objects should also be
cleaned up to prevent a sort of UAF. This became important now that
we have HW backing the VDEVICE.

- Fix one syzkaller found error related to math overflows during iova
allocation"

* tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd: (57 commits)
iommu/arm-smmu-v3: Replace vsmmu_size/type with get_viommu_size
iommu/arm-smmu-v3: Do not bother impl_ops if IOMMU_VIOMMU_TYPE_ARM_SMMUV3
iommufd: Rename some shortterm-related identifiers
iommufd/selftest: Add coverage for vdevice tombstone
iommufd/selftest: Explicitly skip tests for inapplicable variant
iommufd/vdevice: Remove struct device reference from struct vdevice
iommufd: Destroy vdevice on idevice destroy
iommufd: Add a pre_destroy() op for objects
iommufd: Add iommufd_object_tombstone_user() helper
iommufd/viommu: Roll back to use iommufd_object_alloc() for vdevice
iommufd/selftest: Test reserved regions near ULONG_MAX
iommufd: Prevent ALIGN() overflow
iommu/tegra241-cmdqv: import IOMMUFD module namespace
iommufd: Do not allow _iommufd_object_alloc_ucmd if abort op is set
iommu/tegra241-cmdqv: Add IOMMU_VEVENTQ_TYPE_TEGRA241_CMDQV support
iommu/tegra241-cmdqv: Add user-space use support
iommu/tegra241-cmdqv: Do not statically map LVCMDQs
iommu/tegra241-cmdqv: Simplify deinit flow in tegra241_cmdqv_remove_vintf()
iommu/tegra241-cmdqv: Use request_threaded_irq
iommu/arm-smmu-v3-iommufd: Add hw_info to impl_ops
...
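
The "Prevent ALIGN() overflow" entry concerns a general hazard: rounding a value up to a power-of-two boundary adds `a - 1` first, and that addition can wrap when the value sits near `ULONG_MAX`. The sketch below illustrates the hazard in user space; it is not the kernel fix itself. `align_up()` mirrors the arithmetic of the kernel's `ALIGN()` macro, while `align_up_checked()` is a hypothetical helper that assumes the GCC/Clang builtin `__builtin_add_overflow()`.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Same arithmetic as the kernel's ALIGN() for a power-of-two alignment. */
static unsigned long align_up(unsigned long x, unsigned long a)
{
	return (x + a - 1) & ~(a - 1);
}

/*
 * Hypothetical checked variant (not a kernel API): reports failure instead
 * of silently wrapping when x + (a - 1) does not fit in unsigned long.
 */
static bool align_up_checked(unsigned long x, unsigned long a,
			     unsigned long *out)
{
	unsigned long sum;

	assert(a && !(a & (a - 1)));	/* alignment must be a power of two */

	if (__builtin_add_overflow(x, a - 1, &sum))
		return false;
	*out = sum & ~(a - 1);
	return true;
}
```

With a 4096-byte alignment, `align_up(ULONG_MAX - 5, 4096)` wraps around and yields 0, which a caller could mistake for a valid low address; the checked variant rejects the same input.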

+2428 -498
+12
Documentation/userspace-api/iommufd.rst
···
   used to allocate a vEVENTQ. Each vIOMMU can support multiple types of vEVENTS,
   but is confined to one vEVENTQ per vEVENTQ type.

+- IOMMUFD_OBJ_HW_QUEUE, representing a hardware accelerated queue, as a subset
+  of IOMMU's virtualization features, for the IOMMU HW to directly read or write
+  the virtual queue memory owned by a guest OS. This HW-acceleration feature can
+  allow VM to work with the IOMMU HW directly without a VM Exit, so as to reduce
+  overhead from the hypercalls. Along with the HW QUEUE object, iommufd provides
+  user space an mmap interface for VMM to mmap a physical MMIO region from the
+  host physical address space to the guest physical address space, allowing the
+  guest OS to directly control the allocated HW QUEUE. Thus, when allocating a
+  HW QUEUE, the VMM must request a pair of mmap info (offset/length) and pass in
+  exactly to an mmap syscall via its offset and length arguments.
+
 All user-visible objects are destroyed via the IOMMU_DESTROY uAPI.

 The diagrams below show relationships between user-visible objects and kernel
···
 - iommufd_viommu for IOMMUFD_OBJ_VIOMMU.
 - iommufd_vdevice for IOMMUFD_OBJ_VDEVICE.
 - iommufd_veventq for IOMMUFD_OBJ_VEVENTQ.
+- iommufd_hw_queue for IOMMUFD_OBJ_HW_QUEUE.

 Several terminologies when looking at these datastructures:

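
The IOMMUFD_OBJ_HW_QUEUE text in this hunk says the VMM must pass the reported offset and length unchanged to mmap(). A minimal user-space sketch of that step follows. `struct mmap_info` and `map_reported_region()` are hypothetical names used only for illustration, and `fd` is assumed to be the iommufd file descriptor; only the POSIX `mmap()` call itself is a real API here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>

/* Hypothetical holder for the (offset, length) pair reported by the kernel. */
struct mmap_info {
	uint64_t offset;
	uint64_t length;
};

/*
 * Map the region exactly as reported: neither value is rounded, split, or
 * adjusted before it reaches mmap(). Returns MAP_FAILED on error.
 */
static void *map_reported_region(int fd, const struct mmap_info *info)
{
	return mmap(NULL, (size_t)info->length, PROT_READ | PROT_WRITE,
		    MAP_SHARED, fd, (off_t)info->offset);
}
```

A caller would check the result against `MAP_FAILED` and later release the mapping with `munmap()` using the same length.
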
+44 -26
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c
···

 #include "arm-smmu-v3.h"

-void *arm_smmu_hw_info(struct device *dev, u32 *length, u32 *type)
+void *arm_smmu_hw_info(struct device *dev, u32 *length,
+		       enum iommu_hw_info_type *type)
 {
 	struct arm_smmu_master *master = dev_iommu_priv_get(dev);
+	const struct arm_smmu_impl_ops *impl_ops = master->smmu->impl_ops;
 	struct iommu_hw_info_arm_smmuv3 *info;
 	u32 __iomem *base_idr;
 	unsigned int i;
+
+	if (*type != IOMMU_HW_INFO_TYPE_DEFAULT &&
+	    *type != IOMMU_HW_INFO_TYPE_ARM_SMMUV3) {
+		if (!impl_ops || !impl_ops->hw_info)
+			return ERR_PTR(-EOPNOTSUPP);
+		return impl_ops->hw_info(master->smmu, length, type);
+	}

 	info = kzalloc(sizeof(*info), GFP_KERNEL);
 	if (!info)
···
 	return 0;
 }

-static struct iommu_domain *
+struct iommu_domain *
 arm_vsmmu_alloc_domain_nested(struct iommufd_viommu *viommu, u32 flags,
 			      const struct iommu_user_data *user_data)
 {
···
 	return 0;
 }

-static int arm_vsmmu_cache_invalidate(struct iommufd_viommu *viommu,
-				      struct iommu_user_data_array *array)
+int arm_vsmmu_cache_invalidate(struct iommufd_viommu *viommu,
+			       struct iommu_user_data_array *array)
 {
 	struct arm_vsmmu *vsmmu = container_of(viommu, struct arm_vsmmu, core);
 	struct arm_smmu_device *smmu = vsmmu->smmu;
···
 	.cache_invalidate = arm_vsmmu_cache_invalidate,
 };

-struct iommufd_viommu *arm_vsmmu_alloc(struct device *dev,
-				       struct iommu_domain *parent,
-				       struct iommufd_ctx *ictx,
-				       unsigned int viommu_type)
+size_t arm_smmu_get_viommu_size(struct device *dev,
+				enum iommu_viommu_type viommu_type)
 {
-	struct arm_smmu_device *smmu =
-		iommu_get_iommu_dev(dev, struct arm_smmu_device, iommu);
 	struct arm_smmu_master *master = dev_iommu_priv_get(dev);
-	struct arm_smmu_domain *s2_parent = to_smmu_domain(parent);
-	struct arm_vsmmu *vsmmu;
-
-	if (viommu_type != IOMMU_VIOMMU_TYPE_ARM_SMMUV3)
-		return ERR_PTR(-EOPNOTSUPP);
+	struct arm_smmu_device *smmu = master->smmu;

 	if (!(smmu->features & ARM_SMMU_FEAT_NESTING))
-		return ERR_PTR(-EOPNOTSUPP);
-
-	if (s2_parent->smmu != master->smmu)
-		return ERR_PTR(-EINVAL);
+		return 0;

 	/*
 	 * FORCE_SYNC is not set with FEAT_NESTING. Some study of the exact HW
···
 	 * any change to remove this.
 	 */
 	if (WARN_ON(smmu->options & ARM_SMMU_OPT_CMDQ_FORCE_SYNC))
-		return ERR_PTR(-EOPNOTSUPP);
+		return 0;

 	/*
 	 * Must support some way to prevent the VM from bypassing the cache
···
 	 */
 	if (!arm_smmu_master_canwbs(master) &&
 	    !(smmu->features & ARM_SMMU_FEAT_S2FWB))
-		return ERR_PTR(-EOPNOTSUPP);
+		return 0;

-	vsmmu = iommufd_viommu_alloc(ictx, struct arm_vsmmu, core,
-				     &arm_vsmmu_ops);
-	if (IS_ERR(vsmmu))
-		return ERR_CAST(vsmmu);
+	if (viommu_type == IOMMU_VIOMMU_TYPE_ARM_SMMUV3)
+		return VIOMMU_STRUCT_SIZE(struct arm_vsmmu, core);
+
+	if (!smmu->impl_ops || !smmu->impl_ops->get_viommu_size)
+		return 0;
+	return smmu->impl_ops->get_viommu_size(viommu_type);
+}
+
+int arm_vsmmu_init(struct iommufd_viommu *viommu,
+		   struct iommu_domain *parent_domain,
+		   const struct iommu_user_data *user_data)
+{
+	struct arm_vsmmu *vsmmu = container_of(viommu, struct arm_vsmmu, core);
+	struct arm_smmu_device *smmu =
+		container_of(viommu->iommu_dev, struct arm_smmu_device, iommu);
+	struct arm_smmu_domain *s2_parent = to_smmu_domain(parent_domain);
+
+	if (s2_parent->smmu != smmu)
+		return -EINVAL;

 	vsmmu->smmu = smmu;
 	vsmmu->s2_parent = s2_parent;
 	/* FIXME Move VMID allocation from the S2 domain allocation to here */
 	vsmmu->vmid = s2_parent->s2_cfg.vmid;

-	return &vsmmu->core;
+	if (viommu->type == IOMMU_VIOMMU_TYPE_ARM_SMMUV3) {
+		viommu->ops = &arm_vsmmu_ops;
+		return 0;
+	}
+
+	return smmu->impl_ops->vsmmu_init(vsmmu, user_data);
 }

 int arm_vmaster_report_event(struct arm_smmu_vmaster *vmaster, u64 *evt)
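
This file replaces the single arm_vsmmu_alloc() callback with a size query (arm_smmu_get_viommu_size) followed by an init step (arm_vsmmu_init), so the core owns the allocation and the driver only fills in an already-allocated object. The user-space sketch below illustrates that two-step pattern under simplified assumptions. Every name in it (`core_viommu`, `viommu_ops`, `demo_vsmmu`, and so on) is a hypothetical stand-in, not the kernel's types or signatures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-in for the core-owned object. */
struct core_viommu {
	unsigned int type;
};

/* Hypothetical stand-in for the driver callbacks. */
struct viommu_ops {
	size_t (*get_viommu_size)(unsigned int type);
	int (*viommu_init)(struct core_viommu *core);
};

/* Core side: ask the driver for a size, allocate, then let it initialize. */
static struct core_viommu *core_viommu_alloc(const struct viommu_ops *ops,
					     unsigned int type)
{
	size_t size = ops->get_viommu_size(type);
	struct core_viommu *core;

	if (size < sizeof(*core))	/* 0 means "type not supported" */
		return NULL;

	core = calloc(1, size);
	if (!core)
		return NULL;
	core->type = type;

	if (ops->viommu_init(core)) {
		free(core);
		return NULL;
	}
	return core;
}

/* Driver side: embeds the core object as its first member. */
#define DEMO_VIOMMU_TYPE 1u

struct demo_vsmmu {
	struct core_viommu core;
	int vmid;
};
_Static_assert(offsetof(struct demo_vsmmu, core) == 0, "core must be first");

static size_t demo_get_viommu_size(unsigned int type)
{
	return type == DEMO_VIOMMU_TYPE ? sizeof(struct demo_vsmmu) : 0;
}

static int demo_viommu_init(struct core_viommu *core)
{
	struct demo_vsmmu *vsmmu = container_of(core, struct demo_vsmmu, core);

	vsmmu->vmid = 42;	/* arbitrary demo value */
	return 0;
}

static const struct viommu_ops demo_ops = {
	.get_viommu_size = demo_get_viommu_size,
	.viommu_init = demo_viommu_init,
};
```

One consequence of this split, visible in the diff, is that "unsupported" is signalled by a size of 0 rather than by an ERR_PTR, which leaves error-code selection to the caller.
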
+16 -1
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
···
 	.get_resv_regions = arm_smmu_get_resv_regions,
 	.page_response = arm_smmu_page_response,
 	.def_domain_type = arm_smmu_def_domain_type,
-	.viommu_alloc = arm_vsmmu_alloc,
+	.get_viommu_size = arm_smmu_get_viommu_size,
+	.viommu_init = arm_vsmmu_init,
 	.user_pasid_table = 1,
 	.owner = THIS_MODULE,
 	.default_domain_ops = &(const struct iommu_domain_ops) {
···
 static struct arm_smmu_device *arm_smmu_impl_probe(struct arm_smmu_device *smmu)
 {
 	struct arm_smmu_device *new_smmu = ERR_PTR(-ENODEV);
+	const struct arm_smmu_impl_ops *ops;
 	int ret;

 	if (smmu->impl_dev && (smmu->options & ARM_SMMU_OPT_TEGRA241_CMDQV))
···
 	if (IS_ERR(new_smmu))
 		return new_smmu;

+	ops = new_smmu->impl_ops;
+	if (ops) {
+		/* get_viommu_size and vsmmu_init ops must be paired */
+		if (WARN_ON(!ops->get_viommu_size != !ops->vsmmu_init)) {
+			ret = -EINVAL;
+			goto err_remove;
+		}
+	}
+
 	ret = devm_add_action_or_reset(new_smmu->dev, arm_smmu_impl_remove,
 				       new_smmu);
 	if (ret)
 		return ERR_PTR(ret);
 	return new_smmu;
+
+err_remove:
+	arm_smmu_impl_remove(new_smmu);
+	return ERR_PTR(ret);
 }

 static int arm_smmu_device_probe(struct platform_device *pdev)
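
The check added to arm_smmu_impl_probe() uses `!a != !b` to require that two optional callbacks are either both set or both NULL. A small stand-alone sketch of that idiom follows. `struct demo_impl_ops` and the `demo_*` functions are hypothetical and only mimic the shape of the real ops; the helper returns the positive form ("paired") rather than the kernel's "mismatch" form.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical ops table with two callbacks that only make sense together. */
struct demo_impl_ops {
	size_t (*get_viommu_size)(unsigned int type);
	int (*vsmmu_init)(void *vsmmu);
};

/*
 * Logical negation turns each pointer into 0 or 1, so comparing the two
 * results tells whether both are set or both are NULL.
 */
static bool demo_ops_paired(const struct demo_impl_ops *ops)
{
	return !ops->get_viommu_size == !ops->vsmmu_init;
}

/* Dummy callbacks so the table can be populated. */
static size_t demo_size(unsigned int type)
{
	(void)type;
	return 0;
}

static int demo_init(void *vsmmu)
{
	(void)vsmmu;
	return 0;
}
```
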
+27 -6
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
···
 #include <linux/sizes.h>

 struct arm_smmu_device;
+struct arm_vsmmu;

 /* MMIO registers */
 #define ARM_SMMU_IDR0 0x0
···
 	int (*init_structures)(struct arm_smmu_device *smmu);
 	struct arm_smmu_cmdq *(*get_secondary_cmdq)(
 		struct arm_smmu_device *smmu, struct arm_smmu_cmdq_ent *ent);
+	/*
+	 * An implementation should define its own type other than the default
+	 * IOMMU_HW_INFO_TYPE_ARM_SMMUV3. And it must validate the input @type
+	 * to return its own structure.
+	 */
+	void *(*hw_info)(struct arm_smmu_device *smmu, u32 *length,
+			 enum iommu_hw_info_type *type);
+	size_t (*get_viommu_size)(enum iommu_viommu_type viommu_type);
+	int (*vsmmu_init)(struct arm_vsmmu *vsmmu,
+			  const struct iommu_user_data *user_data);
 };

 /* An SMMUv3 instance */
···
 };

 #if IS_ENABLED(CONFIG_ARM_SMMU_V3_IOMMUFD)
-void *arm_smmu_hw_info(struct device *dev, u32 *length, u32 *type);
-struct iommufd_viommu *arm_vsmmu_alloc(struct device *dev,
-				       struct iommu_domain *parent,
-				       struct iommufd_ctx *ictx,
-				       unsigned int viommu_type);
+void *arm_smmu_hw_info(struct device *dev, u32 *length,
+		       enum iommu_hw_info_type *type);
+size_t arm_smmu_get_viommu_size(struct device *dev,
+				enum iommu_viommu_type viommu_type);
+int arm_vsmmu_init(struct iommufd_viommu *viommu,
+		   struct iommu_domain *parent_domain,
+		   const struct iommu_user_data *user_data);
 int arm_smmu_attach_prepare_vmaster(struct arm_smmu_attach_state *state,
 				    struct arm_smmu_nested_domain *nested_domain);
 void arm_smmu_attach_commit_vmaster(struct arm_smmu_attach_state *state);
 void arm_smmu_master_clear_vmaster(struct arm_smmu_master *master);
 int arm_vmaster_report_event(struct arm_smmu_vmaster *vmaster, u64 *evt);
+struct iommu_domain *
+arm_vsmmu_alloc_domain_nested(struct iommufd_viommu *viommu, u32 flags,
+			      const struct iommu_user_data *user_data);
+int arm_vsmmu_cache_invalidate(struct iommufd_viommu *viommu,
+			       struct iommu_user_data_array *array);
 #else
+#define arm_smmu_get_viommu_size NULL
 #define arm_smmu_hw_info NULL
-#define arm_vsmmu_alloc NULL
+#define arm_vsmmu_init NULL
+#define arm_vsmmu_alloc_domain_nested NULL
+#define arm_vsmmu_cache_invalidate NULL

 static inline int
 arm_smmu_attach_prepare_vmaster(struct arm_smmu_attach_state *state,
+474 -19
drivers/iommu/arm/arm-smmu-v3/tegra241-cmdqv.c
···
 #include <linux/dma-mapping.h>
 #include <linux/interrupt.h>
 #include <linux/iommu.h>
+#include <linux/iommufd.h>
 #include <linux/iopoll.h>
+#include <uapi/linux/iommufd.h>

 #include <acpi/acpixf.h>

···
 #define CMDQV_EN BIT(0)

 #define TEGRA241_CMDQV_PARAM 0x0004
+#define CMDQV_NUM_SID_PER_VM_LOG2 GENMASK(15, 12)
 #define CMDQV_NUM_VINTF_LOG2 GENMASK(11, 8)
 #define CMDQV_NUM_VCMDQ_LOG2 GENMASK(7, 4)
+#define CMDQV_VER GENMASK(3, 0)

 #define TEGRA241_CMDQV_STATUS 0x0008
 #define CMDQV_ENABLED BIT(0)
···
 #define TEGRA241_VINTF_STATUS 0x0004
 #define VINTF_STATUS GENMASK(3, 1)
 #define VINTF_ENABLED BIT(0)
+
+#define TEGRA241_VINTF_SID_MATCH(s) (0x0040 + 0x4*(s))
+#define TEGRA241_VINTF_SID_REPLACE(s) (0x0080 + 0x4*(s))

 #define TEGRA241_VINTF_LVCMDQ_ERR_MAP_64(m) \
 	(0x00C0 + 0x8*(m))
···

 /**
  * struct tegra241_vcmdq - Virtual Command Queue
+ * @core: Embedded iommufd_hw_queue structure
  * @idx: Global index in the CMDQV
  * @lidx: Local index in the VINTF
  * @enabled: Enable status
  * @cmdqv: Parent CMDQV pointer
  * @vintf: Parent VINTF pointer
+ * @prev: Previous LVCMDQ to depend on
  * @cmdq: Command Queue struct
  * @page0: MMIO Page0 base address
  * @page1: MMIO Page1 base address
  */
 struct tegra241_vcmdq {
+	struct iommufd_hw_queue core;
+
 	u16 idx;
 	u16 lidx;

···

 	struct tegra241_cmdqv *cmdqv;
 	struct tegra241_vintf *vintf;
+	struct tegra241_vcmdq *prev;
 	struct arm_smmu_cmdq cmdq;

 	void __iomem *page0;
 	void __iomem *page1;
 };
+#define hw_queue_to_vcmdq(v) container_of(v, struct tegra241_vcmdq, core)

 /**
  * struct tegra241_vintf - Virtual Interface
+ * @vsmmu: Embedded arm_vsmmu structure
  * @idx: Global index in the CMDQV
  * @enabled: Enable status
  * @hyp_own: Owned by hypervisor (in-kernel)
  * @cmdqv: Parent CMDQV pointer
  * @lvcmdqs: List of logical VCMDQ pointers
+ * @lvcmdq_mutex: Lock to serialize user-allocated lvcmdqs
  * @base: MMIO base address
+ * @mmap_offset: Offset argument for mmap() syscall
+ * @sids: Stream ID mapping resources
  */
 struct tegra241_vintf {
+	struct arm_vsmmu vsmmu;
+
 	u16 idx;

 	bool enabled;
···

 	struct tegra241_cmdqv *cmdqv;
 	struct tegra241_vcmdq **lvcmdqs;
+	struct mutex lvcmdq_mutex; /* user space race */

 	void __iomem *base;
+	unsigned long mmap_offset;
+
+	struct ida sids;
 };
+#define viommu_to_vintf(v) container_of(v, struct tegra241_vintf, vsmmu.core)
+
+/**
+ * struct tegra241_vintf_sid - Virtual Interface Stream ID Mapping
+ * @core: Embedded iommufd_vdevice structure, holding virtual Stream ID
+ * @vintf: Parent VINTF pointer
+ * @sid: Physical Stream ID
+ * @idx: Mapping index in the VINTF
+ */
+struct tegra241_vintf_sid {
+	struct iommufd_vdevice core;
+	struct tegra241_vintf *vintf;
+	u32 sid;
+	u8 idx;
+};
+#define vdev_to_vsid(v) container_of(v, struct tegra241_vintf_sid, core)

 /**
  * struct tegra241_cmdqv - CMDQ-V for SMMUv3
  * @smmu: SMMUv3 device
  * @dev: CMDQV device
  * @base: MMIO base address
+ * @base_phys: MMIO physical base address, for mmap
  * @irq: IRQ number
  * @num_vintfs: Total number of VINTFs
  * @num_vcmdqs: Total number of VCMDQs
  * @num_lvcmdqs_per_vintf: Number of logical VCMDQs per VINTF
+ * @num_sids_per_vintf: Total number of SID mappings per VINTF
  * @vintf_ids: VINTF id allocator
  * @vintfs: List of VINTFs
  */
···
 	struct device *dev;

 	void __iomem *base;
+	phys_addr_t base_phys;
 	int irq;

 	/* CMDQV Hardware Params */
 	u16 num_vintfs;
 	u16 num_vcmdqs;
 	u16 num_lvcmdqs_per_vintf;
+	u16 num_sids_per_vintf;

 	struct ida vintf_ids;

···

 /* ISR Functions */

+static void tegra241_vintf_user_handle_error(struct tegra241_vintf *vintf)
+{
+	struct iommufd_viommu *viommu = &vintf->vsmmu.core;
+	struct iommu_vevent_tegra241_cmdqv vevent_data;
+	int i;
+
+	for (i = 0; i < LVCMDQ_ERR_MAP_NUM_64; i++)
+		vevent_data.lvcmdq_err_map[i] =
+			readq_relaxed(REG_VINTF(vintf, LVCMDQ_ERR_MAP_64(i)));
+
+	iommufd_viommu_report_event(viommu, IOMMU_VEVENTQ_TYPE_TEGRA241_CMDQV,
+				    &vevent_data, sizeof(vevent_data));
+}
+
 static void tegra241_vintf0_handle_error(struct tegra241_vintf *vintf)
 {
 	int i;
···
 	if (vintf_map & BIT_ULL(0)) {
 		tegra241_vintf0_handle_error(cmdqv->vintfs[0]);
 		vintf_map &= ~BIT_ULL(0);
+	}
+
+	/* Handle other user VINTFs and their LVCMDQs */
+	while (vintf_map) {
+		unsigned long idx = __ffs64(vintf_map);
+
+		tegra241_vintf_user_handle_error(cmdqv->vintfs[idx]);
+		vintf_map &= ~BIT_ULL(idx);
 	}

 	return IRQ_HANDLED;
···

 /* HW Reset Functions */

+/*
+ * When a guest-owned VCMDQ is disabled, if the guest did not enqueue a CMD_SYNC
+ * following an ATC_INV command at the end of the guest queue while this ATC_INV
+ * is timed out, the TIMEOUT will not be reported until this VCMDQ gets assigned
+ * to the next VM, which will be a false alarm potentially causing some unwanted
+ * behavior in the new VM. Thus, a guest-owned VCMDQ must flush the TIMEOUT when
+ * it gets disabled. This can be done by just issuing a CMD_SYNC to SMMU CMDQ.
+ */
+static void tegra241_vcmdq_hw_flush_timeout(struct tegra241_vcmdq *vcmdq)
+{
+	struct arm_smmu_device *smmu = &vcmdq->cmdqv->smmu;
+	u64 cmd_sync[CMDQ_ENT_DWORDS] = {};
+
+	cmd_sync[0] = FIELD_PREP(CMDQ_0_OP, CMDQ_OP_CMD_SYNC) |
+		      FIELD_PREP(CMDQ_SYNC_0_CS, CMDQ_SYNC_0_CS_NONE);
+
+	/*
+	 * It does not hurt to insert another CMD_SYNC, taking advantage of the
+	 * arm_smmu_cmdq_issue_cmdlist() that waits for the CMD_SYNC completion.
+	 */
+	arm_smmu_cmdq_issue_cmdlist(smmu, &smmu->cmdq, cmd_sync, 1, true);
+}
+
+/* This function is for LVCMDQ, so @vcmdq must not be unmapped yet */
 static void tegra241_vcmdq_hw_deinit(struct tegra241_vcmdq *vcmdq)
 {
 	char header[64], *h = lvcmdq_error_header(vcmdq, header, 64);
···
 			readl_relaxed(REG_VCMDQ_PAGE0(vcmdq, GERROR)),
 			readl_relaxed(REG_VCMDQ_PAGE0(vcmdq, CONS)));
 	}
+	tegra241_vcmdq_hw_flush_timeout(vcmdq);
+
 	writel_relaxed(0, REG_VCMDQ_PAGE0(vcmdq, PROD));
 	writel_relaxed(0, REG_VCMDQ_PAGE0(vcmdq, CONS));
 	writeq_relaxed(0, REG_VCMDQ_PAGE1(vcmdq, BASE));
···
 	dev_dbg(vcmdq->cmdqv->dev, "%sdeinited\n", h);
 }

+/* This function is for LVCMDQ, so @vcmdq must be mapped prior */
 static int tegra241_vcmdq_hw_init(struct tegra241_vcmdq *vcmdq)
 {
 	char header[64], *h = lvcmdq_error_header(vcmdq, header, 64);
···
 	return 0;
 }

+/* Unmap a global VCMDQ from the pre-assigned LVCMDQ */
+static void tegra241_vcmdq_unmap_lvcmdq(struct tegra241_vcmdq *vcmdq)
+{
+	u32 regval = readl(REG_CMDQV(vcmdq->cmdqv, CMDQ_ALLOC(vcmdq->idx)));
+	char header[64], *h = lvcmdq_error_header(vcmdq, header, 64);
+
+	writel(regval & ~CMDQV_CMDQ_ALLOCATED,
+	       REG_CMDQV(vcmdq->cmdqv, CMDQ_ALLOC(vcmdq->idx)));
+	dev_dbg(vcmdq->cmdqv->dev, "%sunmapped\n", h);
+}
+
 static void tegra241_vintf_hw_deinit(struct tegra241_vintf *vintf)
 {
-	u16 lidx;
+	u16 lidx = vintf->cmdqv->num_lvcmdqs_per_vintf;
+	int sidx;

-	for (lidx = 0; lidx < vintf->cmdqv->num_lvcmdqs_per_vintf; lidx++)
-		if (vintf->lvcmdqs && vintf->lvcmdqs[lidx])
+	/* HW requires to unmap LVCMDQs in descending order */
+	while (lidx--) {
+		if (vintf->lvcmdqs && vintf->lvcmdqs[lidx]) {
 			tegra241_vcmdq_hw_deinit(vintf->lvcmdqs[lidx]);
+			tegra241_vcmdq_unmap_lvcmdq(vintf->lvcmdqs[lidx]);
+		}
+	}
 	vintf_write_config(vintf, 0);
+	for (sidx = 0; sidx < vintf->cmdqv->num_sids_per_vintf; sidx++) {
+		writel(0, REG_VINTF(vintf, SID_MATCH(sidx)));
+		writel(0, REG_VINTF(vintf, SID_REPLACE(sidx)));
+	}
+}
+
+/* Map a global VCMDQ to the pre-assigned LVCMDQ */
+static void tegra241_vcmdq_map_lvcmdq(struct tegra241_vcmdq *vcmdq)
+{
+	u32 regval = readl(REG_CMDQV(vcmdq->cmdqv, CMDQ_ALLOC(vcmdq->idx)));
+	char header[64], *h = lvcmdq_error_header(vcmdq, header, 64);
+
+	writel(regval | CMDQV_CMDQ_ALLOCATED,
+	       REG_CMDQV(vcmdq->cmdqv, CMDQ_ALLOC(vcmdq->idx)));
+	dev_dbg(vcmdq->cmdqv->dev, "%smapped\n", h);
 }

 static int tegra241_vintf_hw_init(struct tegra241_vintf *vintf, bool hyp_own)
···
 	 * whether enabling it here or not, as !HYP_OWN cmdq HWs only support a
 	 * restricted set of supported commands.
 	 */
-	regval = FIELD_PREP(VINTF_HYP_OWN, hyp_own);
+	regval = FIELD_PREP(VINTF_HYP_OWN, hyp_own) |
+		 FIELD_PREP(VINTF_VMID, vintf->vsmmu.vmid);
 	writel(regval, REG_VINTF(vintf, CONFIG));

 	ret = vintf_write_config(vintf, regval | VINTF_EN);
···
 	 */
 	vintf->hyp_own = !!(VINTF_HYP_OWN & readl(REG_VINTF(vintf, CONFIG)));

+	/* HW requires to map LVCMDQs in ascending order */
 	for (lidx = 0; lidx < vintf->cmdqv->num_lvcmdqs_per_vintf; lidx++) {
 		if (vintf->lvcmdqs && vintf->lvcmdqs[lidx]) {
+			tegra241_vcmdq_map_lvcmdq(vintf->lvcmdqs[lidx]);
 			ret = tegra241_vcmdq_hw_init(vintf->lvcmdqs[lidx]);
 			if (ret) {
 				tegra241_vintf_hw_deinit(vintf);
···
 		for (lidx = 0; lidx < cmdqv->num_lvcmdqs_per_vintf; lidx++) {
 			regval = FIELD_PREP(CMDQV_CMDQ_ALLOC_VINTF, idx);
 			regval |= FIELD_PREP(CMDQV_CMDQ_ALLOC_LVCMDQ, lidx);
-			regval |= CMDQV_CMDQ_ALLOCATED;
 			writel_relaxed(regval,
 				       REG_CMDQV(cmdqv, CMDQ_ALLOC(qidx++)));
 		}
···

 	dev_dbg(vintf->cmdqv->dev,
 		"%sdeallocated\n", lvcmdq_error_header(vcmdq, header, 64));
-	kfree(vcmdq);
+	/* Guest-owned VCMDQ is free-ed with hw_queue by iommufd core */
+	if (vcmdq->vintf->hyp_own)
+		kfree(vcmdq);
 }

 static struct tegra241_vcmdq *
···

 /* Remove Helpers */

-static void tegra241_vintf_remove_lvcmdq(struct tegra241_vintf *vintf, u16 lidx)
-{
-	tegra241_vcmdq_hw_deinit(vintf->lvcmdqs[lidx]);
-	tegra241_vintf_free_lvcmdq(vintf, lidx);
-}
-
 static void tegra241_cmdqv_remove_vintf(struct tegra241_cmdqv *cmdqv, u16 idx)
 {
 	struct tegra241_vintf *vintf = cmdqv->vintfs[idx];
 	u16 lidx;

+	tegra241_vintf_hw_deinit(vintf);
+
 	/* Remove LVCMDQ resources */
 	for (lidx = 0; lidx < vintf->cmdqv->num_lvcmdqs_per_vintf; lidx++)
 		if (vintf->lvcmdqs[lidx])
-			tegra241_vintf_remove_lvcmdq(vintf, lidx);
-
-	/* Remove VINTF resources */
-	tegra241_vintf_hw_deinit(vintf);
+			tegra241_vintf_free_lvcmdq(vintf, lidx);

 	dev_dbg(cmdqv->dev, "VINTF%u: deallocated\n", vintf->idx);
 	tegra241_cmdqv_deinit_vintf(cmdqv, idx);
-	kfree(vintf);
+	if (!vintf->hyp_own) {
+		mutex_destroy(&vintf->lvcmdq_mutex);
+		ida_destroy(&vintf->sids);
+		/* Guest-owned VINTF is free-ed with viommu by iommufd core */
+	} else {
+		kfree(vintf);
+	}
 }

 static void tegra241_cmdqv_remove(struct arm_smmu_device *smmu)
···
 	put_device(cmdqv->dev); /* smmu->impl_dev */
 }

+static int
+tegra241_cmdqv_init_vintf_user(struct arm_vsmmu *vsmmu,
+			       const struct iommu_user_data *user_data);
+
+static void *tegra241_cmdqv_hw_info(struct arm_smmu_device *smmu, u32 *length,
+				    enum iommu_hw_info_type *type)
+{
+	struct tegra241_cmdqv *cmdqv =
+		container_of(smmu, struct tegra241_cmdqv, smmu);
+	struct iommu_hw_info_tegra241_cmdqv *info;
+	u32 regval;
+
+	if (*type != IOMMU_HW_INFO_TYPE_TEGRA241_CMDQV)
+		return ERR_PTR(-EOPNOTSUPP);
+
+	info = kzalloc(sizeof(*info), GFP_KERNEL);
+	if (!info)
+		return ERR_PTR(-ENOMEM);
+
+	regval = readl_relaxed(REG_CMDQV(cmdqv, PARAM));
+	info->log2vcmdqs = ilog2(cmdqv->num_lvcmdqs_per_vintf);
+	info->log2vsids = ilog2(cmdqv->num_sids_per_vintf);
+	info->version = FIELD_GET(CMDQV_VER, regval);
+
+	*length = sizeof(*info);
+	*type = IOMMU_HW_INFO_TYPE_TEGRA241_CMDQV;
+	return info;
+}
+
+static size_t tegra241_cmdqv_get_vintf_size(enum iommu_viommu_type viommu_type)
+{
+	if (viommu_type != IOMMU_VIOMMU_TYPE_TEGRA241_CMDQV)
+		return 0;
+	return VIOMMU_STRUCT_SIZE(struct tegra241_vintf, vsmmu.core);
+}
+
 static struct arm_smmu_impl_ops tegra241_cmdqv_impl_ops = {
+	/* For in-kernel use */
 	.get_secondary_cmdq = tegra241_cmdqv_get_cmdq,
 	.device_reset = tegra241_cmdqv_hw_reset,
 	.device_remove = tegra241_cmdqv_remove,
+	/* For user-space use */
+	.hw_info = tegra241_cmdqv_hw_info,
+	.get_viommu_size = tegra241_cmdqv_get_vintf_size,
+	.vsmmu_init = tegra241_cmdqv_init_vintf_user,
 };

 /* Probe Functions */
···
 	cmdqv->irq = irq;
 	cmdqv->base = base;
 	cmdqv->dev = smmu->impl_dev;
+	cmdqv->base_phys = res->start;

 	if (cmdqv->irq > 0) {
-		ret = request_irq(irq, tegra241_cmdqv_isr, 0, "tegra241-cmdqv",
-				  cmdqv);
+		ret = request_threaded_irq(irq, NULL, tegra241_cmdqv_isr,
+					   IRQF_ONESHOT, "tegra241-cmdqv",
+					   cmdqv);
 		if (ret) {
 			dev_err(cmdqv->dev, "failed to request irq (%d): %d\n",
 				cmdqv->irq, ret);
···
 	cmdqv->num_vintfs = 1 << FIELD_GET(CMDQV_NUM_VINTF_LOG2, regval);
 	cmdqv->num_vcmdqs = 1 << FIELD_GET(CMDQV_NUM_VCMDQ_LOG2, regval);
 	cmdqv->num_lvcmdqs_per_vintf = cmdqv->num_vcmdqs / cmdqv->num_vintfs;
+	cmdqv->num_sids_per_vintf =
+		1 << FIELD_GET(CMDQV_NUM_SID_PER_VM_LOG2, regval);

 	cmdqv->vintfs =
 		kcalloc(cmdqv->num_vintfs, sizeof(*cmdqv->vintfs), GFP_KERNEL);
···
 	put_device(smmu->impl_dev);
 	return ERR_PTR(-ENODEV);
 }
+
+/* User space VINTF and VCMDQ Functions */
+
+static size_t tegra241_vintf_get_vcmdq_size(struct iommufd_viommu *viommu,
+					    enum iommu_hw_queue_type queue_type)
+{
+	if (queue_type != IOMMU_HW_QUEUE_TYPE_TEGRA241_CMDQV)
+		return 0;
+	return HW_QUEUE_STRUCT_SIZE(struct tegra241_vcmdq, core);
+}
+
+static int tegra241_vcmdq_hw_init_user(struct tegra241_vcmdq *vcmdq)
+{
+	char header[64];
+
+	/* Configure the vcmdq only; User space does the enabling */
+	writeq_relaxed(vcmdq->cmdq.q.q_base, REG_VCMDQ_PAGE1(vcmdq, BASE));
+
+	dev_dbg(vcmdq->cmdqv->dev, "%sinited at host PA 0x%llx size 0x%lx\n",
+		lvcmdq_error_header(vcmdq, header, 64),
+		vcmdq->cmdq.q.q_base & VCMDQ_ADDR,
+		1UL << (vcmdq->cmdq.q.q_base & VCMDQ_LOG2SIZE));
+	return 0;
+}
+
+static void
+tegra241_vintf_destroy_lvcmdq_user(struct iommufd_hw_queue *hw_queue)
+{
+	struct tegra241_vcmdq *vcmdq = hw_queue_to_vcmdq(hw_queue);
+
+	mutex_lock(&vcmdq->vintf->lvcmdq_mutex);
+	tegra241_vcmdq_hw_deinit(vcmdq);
+	tegra241_vcmdq_unmap_lvcmdq(vcmdq);
+	tegra241_vintf_free_lvcmdq(vcmdq->vintf, vcmdq->lidx);
+	if (vcmdq->prev)
+		iommufd_hw_queue_undepend(vcmdq, vcmdq->prev, core);
+	mutex_unlock(&vcmdq->vintf->lvcmdq_mutex);
+}
+
+static int tegra241_vintf_alloc_lvcmdq_user(struct iommufd_hw_queue *hw_queue,
+					    u32 lidx, phys_addr_t base_addr_pa)
+{
+	struct tegra241_vintf *vintf = viommu_to_vintf(hw_queue->viommu);
+	struct tegra241_vcmdq *vcmdq = hw_queue_to_vcmdq(hw_queue);
+	struct tegra241_cmdqv *cmdqv = vintf->cmdqv;
+	struct arm_smmu_device *smmu = &cmdqv->smmu;
+	struct tegra241_vcmdq *prev = NULL;
+	u32 log2size, max_n_shift;
+	char header[64];
+	int ret;
+
+	if (hw_queue->type != IOMMU_HW_QUEUE_TYPE_TEGRA241_CMDQV)
+		return -EOPNOTSUPP;
+	if (lidx >= cmdqv->num_lvcmdqs_per_vintf)
+		return -EINVAL;
+
+	mutex_lock(&vintf->lvcmdq_mutex);
+
+	if (vintf->lvcmdqs[lidx]) {
+		ret = -EEXIST;
+		goto unlock;
+	}
+
+	/*
+	 * HW requires to map LVCMDQs in ascending order, so reject if the
+	 * previous lvcmdqs is not allocated yet.
+	 */
+	if (lidx) {
+		prev = vintf->lvcmdqs[lidx - 1];
+		if (!prev) {
+			ret = -EIO;
+			goto unlock;
+		}
+	}
+
+	/*
+	 * hw_queue->length must be a power of 2, in range of
+	 *   [ 32, 2 ^ (idr[1].CMDQS + CMDQ_ENT_SZ_SHIFT) ]
+	 */
+	max_n_shift = FIELD_GET(IDR1_CMDQS,
+				readl_relaxed(smmu->base + ARM_SMMU_IDR1));
+	if (!is_power_of_2(hw_queue->length) || hw_queue->length < 32 ||
+	    hw_queue->length > (1 << (max_n_shift + CMDQ_ENT_SZ_SHIFT))) {
+		ret = -EINVAL;
+		goto unlock;
+	}
+	log2size = ilog2(hw_queue->length) - CMDQ_ENT_SZ_SHIFT;
+
+	/* base_addr_pa must be aligned to hw_queue->length */
+	if (base_addr_pa & ~VCMDQ_ADDR ||
+	    base_addr_pa & (hw_queue->length - 1)) {
+		ret = -EINVAL;
+		goto unlock;
+	}
+
+	/*
+	 * HW requires to unmap LVCMDQs in descending order, so destroy() must
+	 * follow this rule. Set a dependency on its previous LVCMDQ so iommufd
+	 * core will help enforce it.
+	 */
+	if (prev) {
+		ret = iommufd_hw_queue_depend(vcmdq, prev, core);
+		if (ret)
+			goto unlock;
+	}
+	vcmdq->prev = prev;
+
+	ret = tegra241_vintf_init_lvcmdq(vintf, lidx, vcmdq);
+	if (ret)
+		goto undepend_vcmdq;
+
+	dev_dbg(cmdqv->dev, "%sallocated\n",
+		lvcmdq_error_header(vcmdq, header, 64));
+
+	tegra241_vcmdq_map_lvcmdq(vcmdq);
+
+	vcmdq->cmdq.q.q_base = base_addr_pa & VCMDQ_ADDR;
+	vcmdq->cmdq.q.q_base |= log2size;
+
+	ret = tegra241_vcmdq_hw_init_user(vcmdq);
+	if (ret)
+		goto unmap_lvcmdq;
+
+	hw_queue->destroy = &tegra241_vintf_destroy_lvcmdq_user;
+	mutex_unlock(&vintf->lvcmdq_mutex);
+	return 0;
+
+unmap_lvcmdq:
+	tegra241_vcmdq_unmap_lvcmdq(vcmdq);
+	tegra241_vintf_deinit_lvcmdq(vintf, lidx);
+undepend_vcmdq:
+	if (vcmdq->prev)
+		iommufd_hw_queue_undepend(vcmdq, vcmdq->prev, core);
+unlock:
+	mutex_unlock(&vintf->lvcmdq_mutex);
+	return ret;
+}
+
+static void tegra241_cmdqv_destroy_vintf_user(struct iommufd_viommu *viommu)
+{
+	struct tegra241_vintf *vintf = viommu_to_vintf(viommu);
+
+	if (vintf->mmap_offset)
+		iommufd_viommu_destroy_mmap(&vintf->vsmmu.core,
+					    vintf->mmap_offset);
+	tegra241_cmdqv_remove_vintf(vintf->cmdqv, vintf->idx);
+}
+
+static void tegra241_vintf_destroy_vsid(struct iommufd_vdevice *vdev)
+{
+	struct tegra241_vintf_sid *vsid = vdev_to_vsid(vdev);
+	struct tegra241_vintf *vintf = vsid->vintf;
+
+	writel(0, REG_VINTF(vintf, SID_MATCH(vsid->idx)));
+	writel(0, REG_VINTF(vintf, SID_REPLACE(vsid->idx)));
+	ida_free(&vintf->sids, vsid->idx);
+	dev_dbg(vintf->cmdqv->dev,
+		"VINTF%u: deallocated SID_REPLACE%d for pSID=%x\n", vintf->idx,
+		vsid->idx, vsid->sid);
+}
+
+static int tegra241_vintf_init_vsid(struct iommufd_vdevice *vdev)
+{
+	struct device *dev = iommufd_vdevice_to_device(vdev);
+	struct arm_smmu_master *master = dev_iommu_priv_get(dev);
+	struct tegra241_vintf *vintf = viommu_to_vintf(vdev->viommu);
+	struct tegra241_vintf_sid *vsid = vdev_to_vsid(vdev);
+	struct arm_smmu_stream *stream = &master->streams[0];
+	u64 virt_sid = vdev->virt_id;
+	int sidx;
+
+	if (virt_sid > UINT_MAX)
+		return -EINVAL;
+
+	WARN_ON_ONCE(master->num_streams != 1);
+
+	/* Find an empty pair of SID_REPLACE and SID_MATCH */
+	sidx = ida_alloc_max(&vintf->sids, vintf->cmdqv->num_sids_per_vintf - 1,
+			     GFP_KERNEL);
+	if (sidx < 0)
+		return sidx;
+
+	writel(stream->id, REG_VINTF(vintf, SID_REPLACE(sidx)));
+	writel(virt_sid << 1 | 0x1, REG_VINTF(vintf, SID_MATCH(sidx)));
+	dev_dbg(vintf->cmdqv->dev,
+		"VINTF%u: allocated SID_REPLACE%d for pSID=%x, vSID=%x\n",
+		vintf->idx, sidx, stream->id, (u32)virt_sid);
+
+	vsid->idx = sidx;
+	vsid->vintf = vintf;
+	vsid->sid = stream->id;
+
+	vdev->destroy = &tegra241_vintf_destroy_vsid;
+	return 0;
+}
+
+static struct iommufd_viommu_ops tegra241_cmdqv_viommu_ops = {
+	.destroy = tegra241_cmdqv_destroy_vintf_user,
+	.alloc_domain_nested = arm_vsmmu_alloc_domain_nested,
+	/* Non-accelerated commands will be still handled by the kernel */
+	.cache_invalidate = arm_vsmmu_cache_invalidate,
+	.vdevice_size = VDEVICE_STRUCT_SIZE(struct tegra241_vintf_sid, core),
+	.vdevice_init = tegra241_vintf_init_vsid,
+	.get_hw_queue_size = tegra241_vintf_get_vcmdq_size,
+	.hw_queue_init_phys = tegra241_vintf_alloc_lvcmdq_user,
+};
+
+static int
+tegra241_cmdqv_init_vintf_user(struct arm_vsmmu *vsmmu,
+			       const struct iommu_user_data *user_data)
+{
+	struct tegra241_cmdqv *cmdqv =
+		container_of(vsmmu->smmu, struct tegra241_cmdqv, smmu);
+	struct tegra241_vintf *vintf = viommu_to_vintf(&vsmmu->core);
+	struct iommu_viommu_tegra241_cmdqv data;
+	phys_addr_t page0_base;
+	int ret;
+
+	/*
+	 * Unsupported type should be rejected by tegra241_cmdqv_get_vintf_size.
+	 * Seeing one here indicates a kernel bug or some data corruption.
+	 */
+	if (WARN_ON(vsmmu->core.type != IOMMU_VIOMMU_TYPE_TEGRA241_CMDQV))
+		return -EOPNOTSUPP;
+
+	if (!user_data)
+		return -EINVAL;
+
+	ret = iommu_copy_struct_from_user(&data, user_data,
+					  IOMMU_VIOMMU_TYPE_TEGRA241_CMDQV,
+					  out_vintf_mmap_length);
+	if (ret)
+		return ret;
+
+	ret = tegra241_cmdqv_init_vintf(cmdqv, cmdqv->num_vintfs - 1, vintf);
+	if (ret < 0) {
+		dev_err(cmdqv->dev, "no more available vintf\n");
+		return ret;
+	}
+
+	/*
+	 * Initialize the user-owned VINTF without a LVCMDQ, as it cannot pre-
+	 * allocate a LVCMDQ until user space wants one, for security reasons.
+	 * It is different than the kernel-owned VINTF0, which had pre-assigned
+	 * and pre-allocated global VCMDQs that would be mapped to the LVCMDQs
+	 * by the tegra241_vintf_hw_init() call.
1139 + */ 1140 + ret = tegra241_vintf_hw_init(vintf, false); 1141 + if (ret) 1142 + goto deinit_vintf; 1143 + 1144 + page0_base = cmdqv->base_phys + TEGRA241_VINTFi_PAGE0(vintf->idx); 1145 + ret = iommufd_viommu_alloc_mmap(&vintf->vsmmu.core, page0_base, SZ_64K, 1146 + &vintf->mmap_offset); 1147 + if (ret) 1148 + goto hw_deinit_vintf; 1149 + 1150 + data.out_vintf_mmap_length = SZ_64K; 1151 + data.out_vintf_mmap_offset = vintf->mmap_offset; 1152 + ret = iommu_copy_struct_to_user(user_data, &data, 1153 + IOMMU_VIOMMU_TYPE_TEGRA241_CMDQV, 1154 + out_vintf_mmap_length); 1155 + if (ret) 1156 + goto free_mmap; 1157 + 1158 + ida_init(&vintf->sids); 1159 + mutex_init(&vintf->lvcmdq_mutex); 1160 + 1161 + dev_dbg(cmdqv->dev, "VINTF%u: allocated with vmid (%d)\n", vintf->idx, 1162 + vintf->vsmmu.vmid); 1163 + 1164 + vsmmu->core.ops = &tegra241_cmdqv_viommu_ops; 1165 + return 0; 1166 + 1167 + free_mmap: 1168 + iommufd_viommu_destroy_mmap(&vintf->vsmmu.core, vintf->mmap_offset); 1169 + hw_deinit_vintf: 1170 + tegra241_vintf_hw_deinit(vintf); 1171 + deinit_vintf: 1172 + tegra241_cmdqv_deinit_vintf(cmdqv, vintf->idx); 1173 + return ret; 1174 + } 1175 + 1176 + MODULE_IMPORT_NS("IOMMUFD");
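The queue validation in the LVCMDQ allocation path above (power-of-2 length, lower bound of 32 bytes, upper bound derived from IDR1.CMDQS, natural alignment of the base address) can be modeled in user space. The following is a minimal sketch, not the driver code: `sketch_queue_ok`, `sketch_is_power_of_2` and `SKETCH_ENT_SZ_SHIFT` are hypothetical names, the entry-size shift of 4 is an assumed value, and the `VCMDQ_ADDR` mask check is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed for illustration: 16-byte command entries, i.e. a shift of 4. */
#define SKETCH_ENT_SZ_SHIFT 4

static bool sketch_is_power_of_2(uint64_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/*
 * Hypothetical model of the checks: @length is the queue size in bytes,
 * @base_pa its physical base, @max_n_shift stands in for IDR1.CMDQS.
 */
static bool sketch_queue_ok(uint64_t base_pa, uint64_t length,
			    unsigned int max_n_shift)
{
	/* Keep the shift below well defined. */
	assert(max_n_shift + SKETCH_ENT_SZ_SHIFT < 64);

	if (!sketch_is_power_of_2(length) || length < 32 ||
	    length > (UINT64_C(1) << (max_n_shift + SKETCH_ENT_SZ_SHIFT)))
		return false;

	/* The base must be naturally aligned to the queue size. */
	return (base_pa & (length - 1)) == 0;
}
```

Because the length is already known to be a power of 2 when the alignment test runs, masking with `length - 1` is enough to detect a misaligned base.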
+6 -1
drivers/iommu/intel/iommu.c
···
 	return ret;
 }
 
-static void *intel_iommu_hw_info(struct device *dev, u32 *length, u32 *type)
+static void *intel_iommu_hw_info(struct device *dev, u32 *length,
+				 enum iommu_hw_info_type *type)
 {
 	struct device_domain_info *info = dev_iommu_priv_get(dev);
 	struct intel_iommu *iommu = info->iommu;
 	struct iommu_hw_info_vtd *vtd;
+
+	if (*type != IOMMU_HW_INFO_TYPE_DEFAULT &&
+	    *type != IOMMU_HW_INFO_TYPE_INTEL_VTD)
+		return ERR_PTR(-EOPNOTSUPP);
 
 	vtd = kzalloc(sizeof(*vtd), GFP_KERNEL);
 	if (!vtd)
+123 -20
drivers/iommu/iommufd/device.c
···
 	}
 }
 
+static void iommufd_device_remove_vdev(struct iommufd_device *idev)
+{
+	struct iommufd_vdevice *vdev;
+
+	mutex_lock(&idev->igroup->lock);
+	/* prevent new references from vdev */
+	idev->destroying = true;
+	/* vdev has been completely destroyed by userspace */
+	if (!idev->vdev)
+		goto out_unlock;
+
+	vdev = iommufd_get_vdevice(idev->ictx, idev->vdev->obj.id);
+	/*
+	 * An ongoing vdev destroy ioctl has removed the vdev from the object
+	 * xarray, but has not finished iommufd_vdevice_destroy() yet as it
+	 * needs the same mutex. We exit the locking then wait on wait_cnt
+	 * reference for the vdev destruction.
+	 */
+	if (IS_ERR(vdev))
+		goto out_unlock;
+
+	/* Should never happen */
+	if (WARN_ON(vdev != idev->vdev)) {
+		iommufd_put_object(idev->ictx, &vdev->obj);
+		goto out_unlock;
+	}
+
+	/*
+	 * vdev is still alive. Hold a users refcount to prevent racing with
+	 * userspace destruction, then use iommufd_object_tombstone_user() to
+	 * destroy it and leave a tombstone.
+	 */
+	refcount_inc(&vdev->obj.users);
+	iommufd_put_object(idev->ictx, &vdev->obj);
+	mutex_unlock(&idev->igroup->lock);
+	iommufd_object_tombstone_user(idev->ictx, &vdev->obj);
+	return;
+
+out_unlock:
+	mutex_unlock(&idev->igroup->lock);
+}
+
+void iommufd_device_pre_destroy(struct iommufd_object *obj)
+{
+	struct iommufd_device *idev =
+		container_of(obj, struct iommufd_device, obj);
+
+	/* Release the wait_cnt reference on this */
+	iommufd_device_remove_vdev(idev);
+}
+
 void iommufd_device_destroy(struct iommufd_object *obj)
 {
 	struct iommufd_device *idev =
···
 
 	lockdep_assert_held(&idev->igroup->lock);
 
-	handle =
-		iommu_attach_handle_get(idev->igroup->group, pasid, 0);
+	handle = iommu_attach_handle_get(idev->igroup->group, pasid, 0);
 	if (IS_ERR(handle))
 		return NULL;
 	return to_iommufd_handle(handle);
···
 	}
 
 	if (cur_ioas) {
-		if (access->ops->unmap) {
+		if (!iommufd_access_is_internal(access) && access->ops->unmap) {
 			mutex_unlock(&access->ioas_lock);
 			access->ops->unmap(access->data, 0, ULONG_MAX);
 			mutex_lock(&access->ioas_lock);
···
 	if (access->ioas)
 		WARN_ON(iommufd_access_change_ioas(access, NULL));
 	mutex_unlock(&access->ioas_lock);
-	iommufd_ctx_put(access->ictx);
+	if (!iommufd_access_is_internal(access))
+		iommufd_ctx_put(access->ictx);
+}
+
+static struct iommufd_access *__iommufd_access_create(struct iommufd_ctx *ictx)
+{
+	struct iommufd_access *access;
+
+	/*
+	 * There is no uAPI for the access object, but to keep things symmetric
+	 * use the object infrastructure anyhow.
+	 */
+	access = iommufd_object_alloc(ictx, access, IOMMUFD_OBJ_ACCESS);
+	if (IS_ERR(access))
+		return access;
+
+	/* The calling driver is a user until iommufd_access_destroy() */
+	refcount_inc(&access->obj.users);
+	mutex_init(&access->ioas_lock);
+	return access;
+}
+
+struct iommufd_access *iommufd_access_create_internal(struct iommufd_ctx *ictx)
+{
+	struct iommufd_access *access;
+
+	access = __iommufd_access_create(ictx);
+	if (IS_ERR(access))
+		return access;
+	access->iova_alignment = PAGE_SIZE;
+
+	iommufd_object_finalize(ictx, &access->obj);
+	return access;
 }
 
 /**
···
 {
 	struct iommufd_access *access;
 
-	/*
-	 * There is no uAPI for the access object, but to keep things symmetric
-	 * use the object infrastructure anyhow.
-	 */
-	access = iommufd_object_alloc(ictx, access, IOMMUFD_OBJ_ACCESS);
+	access = __iommufd_access_create(ictx);
 	if (IS_ERR(access))
 		return access;
 
···
 	else
 		access->iova_alignment = 1;
 
-	/* The calling driver is a user until iommufd_access_destroy() */
-	refcount_inc(&access->obj.users);
 	access->ictx = ictx;
 	iommufd_ctx_get(ictx);
 	iommufd_object_finalize(ictx, &access->obj);
 	*id = access->obj.id;
-	mutex_init(&access->ioas_lock);
 	return access;
 }
 EXPORT_SYMBOL_NS_GPL(iommufd_access_create, "IOMMUFD");
···
 }
 EXPORT_SYMBOL_NS_GPL(iommufd_access_attach, "IOMMUFD");
 
+int iommufd_access_attach_internal(struct iommufd_access *access,
+				   struct iommufd_ioas *ioas)
+{
+	int rc;
+
+	mutex_lock(&access->ioas_lock);
+	if (WARN_ON(access->ioas)) {
+		mutex_unlock(&access->ioas_lock);
+		return -EINVAL;
+	}
+
+	rc = iommufd_access_change_ioas(access, ioas);
+	mutex_unlock(&access->ioas_lock);
+	return rc;
+}
+
 int iommufd_access_replace(struct iommufd_access *access, u32 ioas_id)
 {
 	int rc;
···
 
 	xa_lock(&ioas->iopt.access_list);
 	xa_for_each(&ioas->iopt.access_list, index, access) {
-		if (!iommufd_lock_obj(&access->obj))
+		if (!iommufd_lock_obj(&access->obj) ||
+		    iommufd_access_is_internal(access))
 			continue;
 		xa_unlock(&ioas->iopt.access_list);
 
···
 void iommufd_access_unpin_pages(struct iommufd_access *access,
 				unsigned long iova, unsigned long length)
 {
+	bool internal = iommufd_access_is_internal(access);
 	struct iopt_area_contig_iter iter;
 	struct io_pagetable *iopt;
 	unsigned long last_iova;
···
 			area, iopt_area_iova_to_index(area, iter.cur_iova),
 			iopt_area_iova_to_index(
 				area,
-				min(last_iova, iopt_area_last_iova(area))));
+				min(last_iova, iopt_area_last_iova(area))),
+			internal);
 	WARN_ON(!iopt_area_contig_done(&iter));
 	up_read(&iopt->iova_rwsem);
 	mutex_unlock(&access->ioas_lock);
···
 			     unsigned long length, struct page **out_pages,
 			     unsigned int flags)
 {
+	bool internal = iommufd_access_is_internal(access);
 	struct iopt_area_contig_iter iter;
 	struct io_pagetable *iopt;
 	unsigned long last_iova;
···
 
 	/* Driver's ops don't support pin_pages */
 	if (IS_ENABLED(CONFIG_IOMMUFD_TEST) &&
-	    WARN_ON(access->iova_alignment != PAGE_SIZE || !access->ops->unmap))
+	    WARN_ON(access->iova_alignment != PAGE_SIZE ||
+		    (!internal && !access->ops->unmap)))
 		return -EINVAL;
 
 	if (!length)
···
 		}
 
 		rc = iopt_area_add_access(area, index, last_index, out_pages,
-					  flags);
+					  flags, internal);
 		if (rc)
 			goto err_remove;
 		out_pages += last_index - index + 1;
···
 				iopt_area_iova_to_index(area, iter.cur_iova),
 				iopt_area_iova_to_index(
 					area, min(last_iova,
-						  iopt_area_last_iova(area))));
+						  iopt_area_last_iova(area))),
+				internal);
 	}
 	up_read(&iopt->iova_rwsem);
 	mutex_unlock(&access->ioas_lock);
···
 
 int iommufd_get_hw_info(struct iommufd_ucmd *ucmd)
 {
+	const u32 SUPPORTED_FLAGS = IOMMU_HW_INFO_FLAG_INPUT_TYPE;
 	struct iommu_hw_info *cmd = ucmd->cmd;
 	void __user *user_ptr = u64_to_user_ptr(cmd->data_uptr);
 	const struct iommu_ops *ops;
···
 	void *data;
 	int rc;
 
-	if (cmd->flags || cmd->__reserved[0] || cmd->__reserved[1] ||
-	    cmd->__reserved[2])
+	if (cmd->flags & ~SUPPORTED_FLAGS)
 		return -EOPNOTSUPP;
+	if (cmd->__reserved[0] || cmd->__reserved[1] || cmd->__reserved[2])
+		return -EOPNOTSUPP;
+
+	/* Clear the type field since drivers don't support a random input */
+	if (!(cmd->flags & IOMMU_HW_INFO_FLAG_INPUT_TYPE))
+		cmd->in_data_type = IOMMU_HW_INFO_TYPE_DEFAULT;
 
 	idev = iommufd_get_device(ucmd, cmd->dev_id);
 	if (IS_ERR(idev))
···
 		 */
 		if (WARN_ON_ONCE(cmd->out_data_type ==
 				 IOMMU_HW_INFO_TYPE_NONE)) {
-			rc = -ENODEV;
+			rc = -EOPNOTSUPP;
 			goto out_free;
 		}
 	} else {
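The reworked argument checking in `iommufd_get_hw_info()` rejects unknown flag bits, rejects non-zero reserved fields, and resets the input type when the caller did not opt in to passing one. A rough user-space sketch of that pattern follows. The struct layout, the flag value `1u << 0` and every `sketch_`/`SKETCH_` name are assumptions made for illustration and are not taken from the uAPI header.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values, for illustration only. */
#define SKETCH_FLAG_INPUT_TYPE	(1u << 0)
#define SKETCH_TYPE_DEFAULT	0u

/* Hypothetical stand-in for the ioctl argument structure. */
struct sketch_hw_info {
	uint32_t flags;
	uint32_t in_data_type;
	uint8_t reserved[3];
};

static int sketch_validate(struct sketch_hw_info *cmd)
{
	const uint32_t supported = SKETCH_FLAG_INPUT_TYPE;

	assert(cmd != NULL);

	/* Any bit outside the supported mask is an unknown feature. */
	if (cmd->flags & ~supported)
		return -EOPNOTSUPP;
	if (cmd->reserved[0] || cmd->reserved[1] || cmd->reserved[2])
		return -EOPNOTSUPP;

	/* Ignore whatever the caller left in the type field without opt-in. */
	if (!(cmd->flags & SKETCH_FLAG_INPUT_TYPE))
		cmd->in_data_type = SKETCH_TYPE_DEFAULT;
	return 0;
}
```

Masking with `~supported` instead of testing `flags != 0` lets later flags be added by widening a single constant.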
+83 -30
drivers/iommu/iommufd/driver.c
···
  */
 #include "iommufd_private.h"
 
-struct iommufd_object *_iommufd_object_alloc(struct iommufd_ctx *ictx,
-					     size_t size,
-					     enum iommufd_object_type type)
+/* Driver should use a per-structure helper in include/linux/iommufd.h */
+int _iommufd_object_depend(struct iommufd_object *obj_dependent,
+			   struct iommufd_object *obj_depended)
 {
-	struct iommufd_object *obj;
+	/* Reject self dependency that dead locks */
+	if (obj_dependent == obj_depended)
+		return -EINVAL;
+	/* Only support dependency between two objects of the same type */
+	if (obj_dependent->type != obj_depended->type)
+		return -EINVAL;
+
+	refcount_inc(&obj_depended->users);
+	return 0;
+}
+EXPORT_SYMBOL_NS_GPL(_iommufd_object_depend, "IOMMUFD");
+
+/* Driver should use a per-structure helper in include/linux/iommufd.h */
+void _iommufd_object_undepend(struct iommufd_object *obj_dependent,
+			      struct iommufd_object *obj_depended)
+{
+	if (WARN_ON_ONCE(obj_dependent == obj_depended ||
+			 obj_dependent->type != obj_depended->type))
+		return;
+
+	refcount_dec(&obj_depended->users);
+}
+EXPORT_SYMBOL_NS_GPL(_iommufd_object_undepend, "IOMMUFD");
+
+/*
+ * Allocate an @offset to return to user space to use for an mmap() syscall
+ *
+ * Driver should use a per-structure helper in include/linux/iommufd.h
+ */
+int _iommufd_alloc_mmap(struct iommufd_ctx *ictx, struct iommufd_object *owner,
+			phys_addr_t mmio_addr, size_t length,
+			unsigned long *offset)
+{
+	struct iommufd_mmap *immap;
+	unsigned long startp;
 	int rc;
 
-	obj = kzalloc(size, GFP_KERNEL_ACCOUNT);
-	if (!obj)
-		return ERR_PTR(-ENOMEM);
-	obj->type = type;
-	/* Starts out bias'd by 1 until it is removed from the xarray */
-	refcount_set(&obj->shortterm_users, 1);
-	refcount_set(&obj->users, 1);
+	if (!PAGE_ALIGNED(mmio_addr))
+		return -EINVAL;
+	if (!length || !PAGE_ALIGNED(length))
+		return -EINVAL;
 
-	/*
-	 * Reserve an ID in the xarray but do not publish the pointer yet since
-	 * the caller hasn't initialized it yet. Once the pointer is published
-	 * in the xarray and visible to other threads we can't reliably destroy
-	 * it anymore, so the caller must complete all errorable operations
-	 * before calling iommufd_object_finalize().
-	 */
-	rc = xa_alloc(&ictx->objects, &obj->id, XA_ZERO_ENTRY, xa_limit_31b,
-		      GFP_KERNEL_ACCOUNT);
-	if (rc)
-		goto out_free;
-	return obj;
-out_free:
-	kfree(obj);
-	return ERR_PTR(rc);
+	immap = kzalloc(sizeof(*immap), GFP_KERNEL);
+	if (!immap)
+		return -ENOMEM;
+	immap->owner = owner;
+	immap->length = length;
+	immap->mmio_addr = mmio_addr;
+
+	/* Skip the first page to ease caller identifying the returned offset */
+	rc = mtree_alloc_range(&ictx->mt_mmap, &startp, immap, immap->length,
+			       PAGE_SIZE, ULONG_MAX, GFP_KERNEL);
+	if (rc < 0) {
+		kfree(immap);
+		return rc;
+	}
+
+	/* mmap() syscall will right-shift the offset in vma->vm_pgoff too */
+	immap->vm_pgoff = startp >> PAGE_SHIFT;
+	*offset = startp;
+	return 0;
 }
-EXPORT_SYMBOL_NS_GPL(_iommufd_object_alloc, "IOMMUFD");
+EXPORT_SYMBOL_NS_GPL(_iommufd_alloc_mmap, "IOMMUFD");
+
+/* Driver should use a per-structure helper in include/linux/iommufd.h */
+void _iommufd_destroy_mmap(struct iommufd_ctx *ictx,
+			   struct iommufd_object *owner, unsigned long offset)
+{
+	struct iommufd_mmap *immap;
+
+	immap = mtree_erase(&ictx->mt_mmap, offset);
+	WARN_ON_ONCE(!immap || immap->owner != owner);
+	kfree(immap);
+}
+EXPORT_SYMBOL_NS_GPL(_iommufd_destroy_mmap, "IOMMUFD");
+
+struct device *iommufd_vdevice_to_device(struct iommufd_vdevice *vdev)
+{
+	return vdev->idev->dev;
+}
+EXPORT_SYMBOL_NS_GPL(iommufd_vdevice_to_device, "IOMMUFD");
 
 /* Caller should xa_lock(&viommu->vdevs) to protect the return value */
 struct device *iommufd_viommu_find_dev(struct iommufd_viommu *viommu,
···
 	lockdep_assert_held(&viommu->vdevs.xa_lock);
 
 	vdev = xa_load(&viommu->vdevs, vdev_id);
-	return vdev ? vdev->dev : NULL;
+	return vdev ? iommufd_vdevice_to_device(vdev) : NULL;
 }
 EXPORT_SYMBOL_NS_GPL(iommufd_viommu_find_dev, "IOMMUFD");
 
···
 
 	xa_lock(&viommu->vdevs);
 	xa_for_each(&viommu->vdevs, index, vdev) {
-		if (vdev->dev == dev) {
-			*vdev_id = vdev->virt_id;
+		if (iommufd_vdevice_to_device(vdev) == dev) {
+			*vdev_id = vdev->virt_id;
 			rc = 0;
 			break;
 		}
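The depend/undepend helpers above enforce destruction order by taking a `users` reference on the object that must outlive the other. The toy model below illustrates that idea under stated assumptions: `struct sketch_obj` and the `sketch_*` functions are hypothetical, a plain counter replaces `refcount_t`, and a count of exactly 1 is treated as "idle and destroyable", which is an assumption of this model and not a statement about the core's removal logic.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for an iommufd object. */
struct sketch_obj {
	int type;
	unsigned int users;	/* starts at 1, as in the allocator */
};

/* @dependent must be destroyed before @depended may go away. */
static int sketch_depend(struct sketch_obj *dependent,
			 struct sketch_obj *depended)
{
	if (dependent == depended)
		return -EINVAL;	/* a self dependency could never be released */
	if (dependent->type != depended->type)
		return -EINVAL;
	depended->users++;
	return 0;
}

static void sketch_undepend(struct sketch_obj *dependent,
			    struct sketch_obj *depended)
{
	assert(dependent != depended && dependent->type == depended->type);
	assert(depended->users > 1);
	depended->users--;
}

/* Model assumption: only the baseline reference may remain at destroy time. */
static bool sketch_can_destroy(const struct sketch_obj *obj)
{
	return obj->users == 1;
}
```

With two queues where the second depends on the first, the first stays pinned until the second drops its dependency, which matches the descending-order teardown that the LVCMDQ comment asks for.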
+4 -10
drivers/iommu/iommufd/eventq.c
···
 	if (cmd->flags)
 		return -EOPNOTSUPP;
 
-	fault = __iommufd_object_alloc(ucmd->ictx, fault, IOMMUFD_OBJ_FAULT,
-				       common.obj);
+	fault = __iommufd_object_alloc_ucmd(ucmd, fault, IOMMUFD_OBJ_FAULT,
+					    common.obj);
 	if (IS_ERR(fault))
 		return PTR_ERR(fault);
 
···
 
 	fdno = iommufd_eventq_init(&fault->common, "[iommufd-pgfault]",
 				   ucmd->ictx, &iommufd_fault_fops);
-	if (fdno < 0) {
-		rc = fdno;
-		goto out_abort;
-	}
+	if (fdno < 0)
+		return fdno;
 
 	cmd->out_fault_id = fault->common.obj.id;
 	cmd->out_fault_fd = fdno;
···
 	rc = iommufd_ucmd_respond(ucmd, sizeof(*cmd));
 	if (rc)
 		goto out_put_fdno;
-	iommufd_object_finalize(ucmd->ictx, &fault->common.obj);
 
 	fd_install(fdno, fault->common.filep);
 
···
 out_put_fdno:
 	put_unused_fd(fdno);
 	fput(fault->common.filep);
-out_abort:
-	iommufd_object_abort_and_destroy(ucmd->ictx, &fault->common.obj);
-
 	return rc;
 }
 
+4 -6
drivers/iommu/iommufd/hw_pagetable.c
···
 	hwpt->domain->cookie_type = IOMMU_COOKIE_IOMMUFD;
 
 	if (WARN_ON_ONCE(hwpt->domain->type != IOMMU_DOMAIN_NESTED)) {
-		rc = -EINVAL;
+		rc = -EOPNOTSUPP;
 		goto out_abort;
 	}
 	return hwpt_nested;
···
 	refcount_inc(&viommu->obj.users);
 	hwpt_nested->parent = viommu->hwpt;
 
-	hwpt->domain =
-		viommu->ops->alloc_domain_nested(viommu,
-				flags & ~IOMMU_HWPT_FAULT_ID_VALID,
-				user_data);
+	hwpt->domain = viommu->ops->alloc_domain_nested(
+		viommu, flags & ~IOMMU_HWPT_FAULT_ID_VALID, user_data);
 	if (IS_ERR(hwpt->domain)) {
 		rc = PTR_ERR(hwpt->domain);
 		hwpt->domain = NULL;
···
 	hwpt->domain->cookie_type = IOMMU_COOKIE_IOMMUFD;
 
 	if (WARN_ON_ONCE(hwpt->domain->type != IOMMU_DOMAIN_NESTED)) {
-		rc = -EINVAL;
+		rc = -EOPNOTSUPP;
 		goto out_abort;
 	}
 	return hwpt_nested;
+37 -20
drivers/iommu/iommufd/io_pagetable.c
···
 	return iter->area;
 }
 
+static bool __alloc_iova_check_range(unsigned long *start, unsigned long last,
+				     unsigned long length,
+				     unsigned long iova_alignment,
+				     unsigned long page_offset)
+{
+	unsigned long aligned_start;
+
+	/* ALIGN_UP() */
+	if (check_add_overflow(*start, iova_alignment - 1, &aligned_start))
+		return false;
+	aligned_start &= ~(iova_alignment - 1);
+	aligned_start |= page_offset;
+
+	if (aligned_start >= last || last - aligned_start < length - 1)
+		return false;
+	*start = aligned_start;
+	return true;
+}
+
 static bool __alloc_iova_check_hole(struct interval_tree_double_span_iter *span,
 				    unsigned long length,
 				    unsigned long iova_alignment,
 				    unsigned long page_offset)
 {
-	if (span->is_used || span->last_hole - span->start_hole < length - 1)
+	if (span->is_used)
 		return false;
-
-	span->start_hole = ALIGN(span->start_hole, iova_alignment) |
-			   page_offset;
-	if (span->start_hole > span->last_hole ||
-	    span->last_hole - span->start_hole < length - 1)
-		return false;
-	return true;
+	return __alloc_iova_check_range(&span->start_hole, span->last_hole,
+					length, iova_alignment, page_offset);
 }
 
 static bool __alloc_iova_check_used(struct interval_tree_span_iter *span,
···
 				    unsigned long iova_alignment,
 				    unsigned long page_offset)
 {
-	if (span->is_hole || span->last_used - span->start_used < length - 1)
+	if (span->is_hole)
 		return false;
-
-	span->start_used = ALIGN(span->start_used, iova_alignment) |
-			   page_offset;
-	if (span->start_used > span->last_used ||
-	    span->last_used - span->start_used < length - 1)
-		return false;
-	return true;
+	return __alloc_iova_check_range(&span->start_used, span->last_used,
+					length, iova_alignment, page_offset);
 }
 
 /*
···
 			goto out_unlock_iova;
 		}
 
+		/* The area is locked by an object that has not been destroyed */
+		if (area->num_locks) {
+			rc = -EBUSY;
+			goto out_unlock_iova;
+		}
+
 		if (area_first < start || area_last > last) {
 			rc = -ENOENT;
 			goto out_unlock_iova;
···
 			iommufd_access_notify_unmap(iopt, area_first, length);
 			/* Something is not responding to unmap requests. */
 			tries++;
-			if (WARN_ON(tries > 100))
-				return -EDEADLOCK;
+			if (WARN_ON(tries > 100)) {
+				rc = -EDEADLOCK;
+				goto out_unmapped;
+			}
 			goto again;
 		}
 
···
 out_unlock_iova:
 	up_write(&iopt->iova_rwsem);
 	up_read(&iopt->domains_rwsem);
+out_unmapped:
 	if (unmapped)
 		*unmapped = unmapped_bytes;
 	return rc;
···
 }
 
 void iopt_remove_access(struct io_pagetable *iopt,
-			struct iommufd_access *access,
-			u32 iopt_access_list_id)
+			struct iommufd_access *access, u32 iopt_access_list_id)
 {
 	down_write(&iopt->domains_rwsem);
 	down_write(&iopt->iova_rwsem);
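The syzkaller fix in this file replaces a plain `ALIGN()` with an align-up that detects wraparound before comparing against the end of the span. A stand-alone sketch of the same arithmetic is shown below. It uses the GCC/Clang builtin `__builtin_add_overflow` in place of the kernel's `check_add_overflow()`, and the function name `sketch_check_range` is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Round *start up to @alignment (a power of 2), OR in @page_offset, and
 * report whether @length bytes still fit before @last (inclusive).
 * *start is only updated on success.
 */
static bool sketch_check_range(unsigned long *start, unsigned long last,
			       unsigned long length, unsigned long alignment,
			       unsigned long page_offset)
{
	unsigned long aligned;

	assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
	assert(length != 0);

	/* Align up, bailing out if the addition wraps past the type's maximum. */
	if (__builtin_add_overflow(*start, alignment - 1, &aligned))
		return false;
	aligned &= ~(alignment - 1);
	aligned |= page_offset;

	if (aligned >= last || last - aligned < length - 1)
		return false;
	*start = aligned;
	return true;
}
```

Without the overflow check, a start near the top of the address space would wrap to a small value and could be accepted as a valid allocation.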
+3 -2
drivers/iommu/iommufd/io_pagetable.h
···
 	int iommu_prot;
 	bool prevent_access : 1;
 	unsigned int num_accesses;
+	unsigned int num_locks;
 };
 
 struct iopt_allowed {
···
 
 int iopt_area_add_access(struct iopt_area *area, unsigned long start,
 			 unsigned long last, struct page **out_pages,
-			 unsigned int flags);
+			 unsigned int flags, bool lock_area);
 void iopt_area_remove_access(struct iopt_area *area, unsigned long start,
-			     unsigned long last);
+			     unsigned long last, bool unlock_area);
 int iopt_pages_rw_access(struct iopt_pages *pages, unsigned long start_byte,
 			 void *data, unsigned long length, unsigned int flags);
 
+114 -21
drivers/iommu/iommufd/iommufd_private.h
··· 7 7 #include <linux/iommu.h> 8 8 #include <linux/iommufd.h> 9 9 #include <linux/iova_bitmap.h> 10 + #include <linux/maple_tree.h> 10 11 #include <linux/rwsem.h> 11 12 #include <linux/uaccess.h> 12 13 #include <linux/xarray.h> ··· 45 44 struct xarray groups; 46 45 wait_queue_head_t destroy_wait; 47 46 struct rw_semaphore ioas_creation_lock; 47 + struct maple_tree mt_mmap; 48 48 49 49 struct mutex sw_msi_lock; 50 50 struct list_head sw_msi_list; ··· 55 53 /* Compatibility with VFIO no iommu */ 56 54 u8 no_iommu_mode; 57 55 struct iommufd_ioas *vfio_ioas; 56 + }; 57 + 58 + /* Entry for iommufd_ctx::mt_mmap */ 59 + struct iommufd_mmap { 60 + struct iommufd_object *owner; 61 + 62 + /* Page-shifted start position in mt_mmap to validate vma->vm_pgoff */ 63 + unsigned long vm_pgoff; 64 + 65 + /* Physical range for io_remap_pfn_range() */ 66 + phys_addr_t mmio_addr; 67 + size_t length; 58 68 }; 59 69 60 70 /* ··· 149 135 void __user *ubuffer; 150 136 u32 user_size; 151 137 void *cmd; 138 + struct iommufd_object *new_obj; 152 139 }; 153 140 154 141 int iommufd_vfio_ioctl(struct iommufd_ctx *ictx, unsigned int cmd, ··· 169 154 { 170 155 if (!refcount_inc_not_zero(&obj->users)) 171 156 return false; 172 - if (!refcount_inc_not_zero(&obj->shortterm_users)) { 157 + if (!refcount_inc_not_zero(&obj->wait_cnt)) { 173 158 /* 174 159 * If the caller doesn't already have a ref on obj this must be 175 160 * called under the xa_lock. Otherwise the caller is holding a ··· 187 172 struct iommufd_object *obj) 188 173 { 189 174 /* 190 - * Users first, then shortterm so that REMOVE_WAIT_SHORTTERM never sees 191 - * a spurious !0 users with a 0 shortterm_users. 175 + * Users first, then wait_cnt so that REMOVE_WAIT never sees a spurious 176 + * !0 users with a 0 wait_cnt. 
192 177 */ 193 178 refcount_dec(&obj->users); 194 - if (refcount_dec_and_test(&obj->shortterm_users)) 179 + if (refcount_dec_and_test(&obj->wait_cnt)) 195 180 wake_up_interruptible_all(&ictx->destroy_wait); 196 181 } 197 182 ··· 202 187 struct iommufd_object *obj); 203 188 204 189 enum { 205 - REMOVE_WAIT_SHORTTERM = 1, 190 + REMOVE_WAIT = BIT(0), 191 + REMOVE_OBJ_TOMBSTONE = BIT(1), 206 192 }; 207 193 int iommufd_object_remove(struct iommufd_ctx *ictx, 208 194 struct iommufd_object *to_destroy, u32 id, ··· 211 195 212 196 /* 213 197 * The caller holds a users refcount and wants to destroy the object. At this 214 - * point the caller has no shortterm_users reference and at least the xarray 215 - * will be holding one. 198 + * point the caller has no wait_cnt reference and at least the xarray will be 199 + * holding one. 216 200 */ 217 201 static inline void iommufd_object_destroy_user(struct iommufd_ctx *ictx, 218 202 struct iommufd_object *obj) 219 203 { 220 204 int ret; 221 205 222 - ret = iommufd_object_remove(ictx, obj, obj->id, REMOVE_WAIT_SHORTTERM); 206 + ret = iommufd_object_remove(ictx, obj, obj->id, REMOVE_WAIT); 207 + 208 + /* 209 + * If there is a bug and we couldn't destroy the object then we did put 210 + * back the caller's users refcount and will eventually try to free it 211 + * again during close. 212 + */ 213 + WARN_ON(ret); 214 + } 215 + 216 + /* 217 + * Similar to iommufd_object_destroy_user(), except that the object ID is left 218 + * reserved/tombstoned. 
219 + */ 220 + static inline void iommufd_object_tombstone_user(struct iommufd_ctx *ictx, 221 + struct iommufd_object *obj) 222 + { 223 + int ret; 224 + 225 + ret = iommufd_object_remove(ictx, obj, obj->id, 226 + REMOVE_WAIT | REMOVE_OBJ_TOMBSTONE); 223 227 224 228 /* 225 229 * If there is a bug and we couldn't destroy the object then we did put ··· 266 230 iommufd_object_remove(ictx, obj, obj->id, 0); 267 231 } 268 232 233 + /* 234 + * Callers of these normal object allocators must call iommufd_object_finalize() 235 + * to finalize the object, or call iommufd_object_abort_and_destroy() to revert 236 + * the allocation. 237 + */ 238 + struct iommufd_object *_iommufd_object_alloc(struct iommufd_ctx *ictx, 239 + size_t size, 240 + enum iommufd_object_type type); 241 + 269 242 #define __iommufd_object_alloc(ictx, ptr, type, obj) \ 270 243 container_of(_iommufd_object_alloc( \ 271 244 ictx, \ ··· 286 241 287 242 #define iommufd_object_alloc(ictx, ptr, type) \ 288 243 __iommufd_object_alloc(ictx, ptr, type, obj) 244 + 245 + /* 246 + * Callers of these _ucmd allocators should not call iommufd_object_finalize() 247 + * or iommufd_object_abort_and_destroy(), as the core automatically does that. 
248 + */ 249 + struct iommufd_object * 250 + _iommufd_object_alloc_ucmd(struct iommufd_ucmd *ucmd, size_t size, 251 + enum iommufd_object_type type); 252 + 253 + #define __iommufd_object_alloc_ucmd(ucmd, ptr, type, obj) \ 254 + container_of(_iommufd_object_alloc_ucmd( \ 255 + ucmd, \ 256 + sizeof(*(ptr)) + BUILD_BUG_ON_ZERO( \ 257 + offsetof(typeof(*(ptr)), \ 258 + obj) != 0), \ 259 + type), \ 260 + typeof(*(ptr)), obj) 261 + 262 + #define iommufd_object_alloc_ucmd(ucmd, ptr, type) \ 263 + __iommufd_object_alloc_ucmd(ucmd, ptr, type, obj) 289 264 290 265 /* 291 266 * The IO Address Space (IOAS) pagetable is a virtual page table backed by the ··· 331 266 static inline struct iommufd_ioas *iommufd_get_ioas(struct iommufd_ctx *ictx, 332 267 u32 id) 333 268 { 334 - return container_of(iommufd_get_object(ictx, id, 335 - IOMMUFD_OBJ_IOAS), 269 + return container_of(iommufd_get_object(ictx, id, IOMMUFD_OBJ_IOAS), 336 270 struct iommufd_ioas, obj); 337 271 } 338 272 ··· 489 425 /* always the physical device */ 490 426 struct device *dev; 491 427 bool enforce_cache_coherency; 428 + struct iommufd_vdevice *vdev; 429 + bool destroying; 492 430 }; 493 431 494 432 static inline struct iommufd_device * ··· 501 435 struct iommufd_device, obj); 502 436 } 503 437 438 + void iommufd_device_pre_destroy(struct iommufd_object *obj); 504 439 void iommufd_device_destroy(struct iommufd_object *obj); 505 440 int iommufd_get_hw_info(struct iommufd_ucmd *ucmd); 506 441 ··· 519 452 520 453 int iopt_add_access(struct io_pagetable *iopt, struct iommufd_access *access); 521 454 void iopt_remove_access(struct io_pagetable *iopt, 522 - struct iommufd_access *access, 523 - u32 iopt_access_list_id); 455 + struct iommufd_access *access, u32 iopt_access_list_id); 524 456 void iommufd_access_destroy_object(struct iommufd_object *obj); 457 + 458 + /* iommufd_access for internal use */ 459 + static inline bool iommufd_access_is_internal(struct iommufd_access *access) 460 + { 461 + return !access->ictx; 
462 + } 463 + 464 + struct iommufd_access *iommufd_access_create_internal(struct iommufd_ctx *ictx); 465 + 466 + static inline void 467 + iommufd_access_destroy_internal(struct iommufd_ctx *ictx, 468 + struct iommufd_access *access) 469 + { 470 + iommufd_object_destroy_user(ictx, &access->obj); 471 + } 472 + 473 + int iommufd_access_attach_internal(struct iommufd_access *access, 474 + struct iommufd_ioas *ioas); 475 + 476 + static inline void iommufd_access_detach_internal(struct iommufd_access *access) 477 + { 478 + iommufd_access_detach(access); 479 + } 525 480 526 481 struct iommufd_eventq { 527 482 struct iommufd_object obj; ··· 617 528 struct list_head node; /* for iommufd_viommu::veventqs */ 618 529 struct iommufd_vevent lost_events_header; 619 530 620 - unsigned int type; 531 + enum iommu_veventq_type type; 621 532 unsigned int depth; 622 533 623 534 /* Use common.lock for protection */ ··· 672 583 } 673 584 674 585 static inline struct iommufd_veventq * 675 - iommufd_viommu_find_veventq(struct iommufd_viommu *viommu, u32 type) 586 + iommufd_viommu_find_veventq(struct iommufd_viommu *viommu, 587 + enum iommu_veventq_type type) 676 588 { 677 589 struct iommufd_veventq *veventq, *next; 678 590 ··· 690 600 void iommufd_viommu_destroy(struct iommufd_object *obj); 691 601 int iommufd_vdevice_alloc_ioctl(struct iommufd_ucmd *ucmd); 692 602 void iommufd_vdevice_destroy(struct iommufd_object *obj); 603 + void iommufd_vdevice_abort(struct iommufd_object *obj); 604 + int iommufd_hw_queue_alloc_ioctl(struct iommufd_ucmd *ucmd); 605 + void iommufd_hw_queue_destroy(struct iommufd_object *obj); 693 606 694 - struct iommufd_vdevice { 695 - struct iommufd_object obj; 696 - struct iommufd_ctx *ictx; 697 - struct iommufd_viommu *viommu; 698 - struct device *dev; 699 - u64 id; /* per-vIOMMU virtual ID */ 700 - }; 607 + static inline struct iommufd_vdevice * 608 + iommufd_get_vdevice(struct iommufd_ctx *ictx, u32 id) 609 + { 610 + return container_of(iommufd_get_object(ictx, 
id, 611 + IOMMUFD_OBJ_VDEVICE), 612 + struct iommufd_vdevice, obj); 613 + } 701 614 702 615 #ifdef CONFIG_IOMMUFD_TEST 703 616 int iommufd_test(struct iommufd_ucmd *ucmd);
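Reviewer note (illustrative, not part of the patch): the `__iommufd_object_alloc_ucmd()` macro above sizes the allocation from the outer type and fails the build if the embedded `obj` member is not at offset 0. Below is a minimal userspace sketch of that idea. All `demo_*` and `DEMO_*` names are hypothetical stand-ins, `calloc()` replaces the kernel allocator, and the compile-time check is an array-size trick rather than the kernel's `BUILD_BUG_ON_ZERO()` definition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-ins for the kernel helpers (assumptions, not kernel code). */
#define DEMO_BUILD_BUG_ON_ZERO(e) (sizeof(char[1 - 2 * !!(e)]) - 1)
#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical core object header, standing in for struct iommufd_object. */
struct demo_object {
	unsigned int type;
	unsigned int id;
};

/* Hypothetical driver structure; the core header must be the first member. */
struct demo_queue {
	struct demo_object obj;
	unsigned long base_addr;
};

/* Allocate @size zeroed bytes and fill in the embedded header. */
static struct demo_object *demo_object_alloc(size_t size, unsigned int type)
{
	struct demo_object *obj = calloc(1, size);

	if (obj)
		obj->type = type;
	return obj;
}

/*
 * Size the allocation from the outer type and refuse to compile when the
 * "obj" member is not at offset 0, so the header pointer and the outer
 * pointer always refer to the same allocation start.
 */
#define demo_alloc(ptr, type_id)                                          \
	demo_container_of(                                                \
		demo_object_alloc(                                        \
			sizeof(*(ptr)) +                                  \
				DEMO_BUILD_BUG_ON_ZERO(                   \
					offsetof(__typeof__(*(ptr)),      \
						 obj) != 0),              \
			type_id),                                         \
		__typeof__(*(ptr)), obj)
```

Moving `obj` away from the first position in `struct demo_queue` would turn the array size negative and stop the build, which is the property the kernel macro relies on.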
+20
drivers/iommu/iommufd/iommufd_test.h
@@ -227,6 +227,23 @@
 
 #define IOMMU_VIOMMU_TYPE_SELFTEST 0xdeadbeef
 
+/**
+ * struct iommu_viommu_selftest - vIOMMU data for Mock driver
+ * (IOMMU_VIOMMU_TYPE_SELFTEST)
+ * @in_data: Input random data from user space
+ * @out_data: Output data (matching @in_data) to user space
+ * @out_mmap_offset: The offset argument for mmap syscall
+ * @out_mmap_length: The length argument for mmap syscall
+ *
+ * Simply set @out_data=@in_data for a loopback test
+ */
+struct iommu_viommu_selftest {
+	__u32 in_data;
+	__u32 out_data;
+	__aligned_u64 out_mmap_offset;
+	__aligned_u64 out_mmap_length;
+};
+
 /* Should not be equal to any defined value in enum iommu_viommu_invalidate_data_type */
 #define IOMMU_VIOMMU_INVALIDATE_DATA_SELFTEST 0xdeadbeef
 #define IOMMU_VIOMMU_INVALIDATE_DATA_SELFTEST_INVALID 0xdadbeef
@@ -268,5 +251,8 @@
 struct iommu_viommu_event_selftest {
 	__u32 virt_id;
 };
+
+#define IOMMU_HW_QUEUE_TYPE_SELFTEST 0xdeadbeef
+#define IOMMU_TEST_HW_QUEUE_MAX 2
 
 #endif
-1
drivers/iommu/iommufd/iova_bitmap.c
@@ -407,7 +407,6 @@
 
 update_indexes:
 	if (unlikely(!iova_bitmap_mapped_range(mapped, iova, length))) {
-
 		/*
 		 * The attempt to advance the base index to @iova
 		 * may fail if it's out of bounds, or pinning the pages
+184 -22
drivers/iommu/iommufd/main.c
··· 23 23 #include "iommufd_test.h" 24 24 25 25 struct iommufd_object_ops { 26 + void (*pre_destroy)(struct iommufd_object *obj); 26 27 void (*destroy)(struct iommufd_object *obj); 27 28 void (*abort)(struct iommufd_object *obj); 28 29 }; 29 30 static const struct iommufd_object_ops iommufd_object_ops[]; 30 31 static struct miscdevice vfio_misc_dev; 32 + 33 + struct iommufd_object *_iommufd_object_alloc(struct iommufd_ctx *ictx, 34 + size_t size, 35 + enum iommufd_object_type type) 36 + { 37 + struct iommufd_object *obj; 38 + int rc; 39 + 40 + obj = kzalloc(size, GFP_KERNEL_ACCOUNT); 41 + if (!obj) 42 + return ERR_PTR(-ENOMEM); 43 + obj->type = type; 44 + /* Starts out bias'd by 1 until it is removed from the xarray */ 45 + refcount_set(&obj->wait_cnt, 1); 46 + refcount_set(&obj->users, 1); 47 + 48 + /* 49 + * Reserve an ID in the xarray but do not publish the pointer yet since 50 + * the caller hasn't initialized it yet. Once the pointer is published 51 + * in the xarray and visible to other threads we can't reliably destroy 52 + * it anymore, so the caller must complete all errorable operations 53 + * before calling iommufd_object_finalize(). 54 + */ 55 + rc = xa_alloc(&ictx->objects, &obj->id, XA_ZERO_ENTRY, xa_limit_31b, 56 + GFP_KERNEL_ACCOUNT); 57 + if (rc) 58 + goto out_free; 59 + return obj; 60 + out_free: 61 + kfree(obj); 62 + return ERR_PTR(rc); 63 + } 64 + 65 + struct iommufd_object *_iommufd_object_alloc_ucmd(struct iommufd_ucmd *ucmd, 66 + size_t size, 67 + enum iommufd_object_type type) 68 + { 69 + struct iommufd_object *new_obj; 70 + 71 + /* Something is coded wrong if this is hit */ 72 + if (WARN_ON(ucmd->new_obj)) 73 + return ERR_PTR(-EBUSY); 74 + 75 + /* 76 + * An abort op means that its caller needs to invoke it within a lock in 77 + * the caller. So it doesn't work with _iommufd_object_alloc_ucmd() that 78 + * will invoke the abort op in iommufd_object_abort_and_destroy(), which 79 + * must be outside the caller's lock. 
80 + */ 81 + if (WARN_ON(iommufd_object_ops[type].abort)) 82 + return ERR_PTR(-EOPNOTSUPP); 83 + 84 + new_obj = _iommufd_object_alloc(ucmd->ictx, size, type); 85 + if (IS_ERR(new_obj)) 86 + return new_obj; 87 + 88 + ucmd->new_obj = new_obj; 89 + return new_obj; 90 + } 31 91 32 92 /* 33 93 * Allow concurrent access to the object. ··· 155 95 return obj; 156 96 } 157 97 158 - static int iommufd_object_dec_wait_shortterm(struct iommufd_ctx *ictx, 159 - struct iommufd_object *to_destroy) 98 + static int iommufd_object_dec_wait(struct iommufd_ctx *ictx, 99 + struct iommufd_object *to_destroy) 160 100 { 161 - if (refcount_dec_and_test(&to_destroy->shortterm_users)) 101 + if (refcount_dec_and_test(&to_destroy->wait_cnt)) 162 102 return 0; 163 103 104 + if (iommufd_object_ops[to_destroy->type].pre_destroy) 105 + iommufd_object_ops[to_destroy->type].pre_destroy(to_destroy); 106 + 164 107 if (wait_event_timeout(ictx->destroy_wait, 165 - refcount_read(&to_destroy->shortterm_users) == 166 - 0, 167 - msecs_to_jiffies(60000))) 108 + refcount_read(&to_destroy->wait_cnt) == 0, 109 + msecs_to_jiffies(60000))) 168 110 return 0; 169 111 170 112 pr_crit("Time out waiting for iommufd object to become free\n"); 171 - refcount_inc(&to_destroy->shortterm_users); 113 + refcount_inc(&to_destroy->wait_cnt); 172 114 return -EBUSY; 173 115 } 174 116 ··· 184 122 { 185 123 struct iommufd_object *obj; 186 124 XA_STATE(xas, &ictx->objects, id); 187 - bool zerod_shortterm = false; 125 + bool zerod_wait_cnt = false; 188 126 int ret; 189 127 190 128 /* 191 - * The purpose of the shortterm_users is to ensure deterministic 192 - * destruction of objects used by external drivers and destroyed by this 193 - * function. Any temporary increment of the refcount must increment 194 - * shortterm_users, such as during ioctl execution. 129 + * The purpose of the wait_cnt is to ensure deterministic destruction 130 + * of objects used by external drivers and destroyed by this function. 
131 + * Incrementing this wait_cnt should either be short lived, such as 132 + * during ioctl execution, or be revoked and blocked during 133 + * pre_destroy(), such as vdev holding the idev's refcount. 195 134 */ 196 - if (flags & REMOVE_WAIT_SHORTTERM) { 197 - ret = iommufd_object_dec_wait_shortterm(ictx, to_destroy); 135 + if (flags & REMOVE_WAIT) { 136 + ret = iommufd_object_dec_wait(ictx, to_destroy); 198 137 if (ret) { 199 138 /* 200 139 * We have a bug. Put back the callers reference and ··· 204 141 refcount_dec(&to_destroy->users); 205 142 return ret; 206 143 } 207 - zerod_shortterm = true; 144 + zerod_wait_cnt = true; 208 145 } 209 146 210 147 xa_lock(&ictx->objects); ··· 230 167 goto err_xa; 231 168 } 232 169 233 - xas_store(&xas, NULL); 170 + xas_store(&xas, (flags & REMOVE_OBJ_TOMBSTONE) ? XA_ZERO_ENTRY : NULL); 234 171 if (ictx->vfio_ioas == container_of(obj, struct iommufd_ioas, obj)) 235 172 ictx->vfio_ioas = NULL; 236 173 xa_unlock(&ictx->objects); 237 174 238 175 /* 239 - * Since users is zero any positive users_shortterm must be racing 176 + * Since users is zero any positive wait_cnt must be racing 240 177 * iommufd_put_object(), or we have a bug. 
241 178 */ 242 - if (!zerod_shortterm) { 243 - ret = iommufd_object_dec_wait_shortterm(ictx, obj); 179 + if (!zerod_wait_cnt) { 180 + ret = iommufd_object_dec_wait(ictx, obj); 244 181 if (WARN_ON(ret)) 245 182 return ret; 246 183 } ··· 250 187 return 0; 251 188 252 189 err_xa: 253 - if (zerod_shortterm) { 190 + if (zerod_wait_cnt) { 254 191 /* Restore the xarray owned reference */ 255 - refcount_set(&obj->shortterm_users, 1); 192 + refcount_set(&obj->wait_cnt, 1); 256 193 } 257 194 xa_unlock(&ictx->objects); 258 195 ··· 289 226 xa_init_flags(&ictx->objects, XA_FLAGS_ALLOC1 | XA_FLAGS_ACCOUNT); 290 227 xa_init(&ictx->groups); 291 228 ictx->file = filp; 229 + mt_init_flags(&ictx->mt_mmap, MT_FLAGS_ALLOC_RANGE); 292 230 init_waitqueue_head(&ictx->destroy_wait); 293 231 mutex_init(&ictx->sw_msi_lock); 294 232 INIT_LIST_HEAD(&ictx->sw_msi_list); ··· 316 252 while (!xa_empty(&ictx->objects)) { 317 253 unsigned int destroyed = 0; 318 254 unsigned long index; 255 + bool empty = true; 319 256 257 + /* 258 + * We can't use xa_empty() to end the loop as the tombstones 259 + * are stored as XA_ZERO_ENTRY in the xarray. However 260 + * xa_for_each() automatically converts them to NULL and skips 261 + * them causing xa_empty() to be kept false. Thus once 262 + * xa_for_each() finds no further !NULL entries the loop is 263 + * done. 
264 + */ 320 265 xa_for_each(&ictx->objects, index, obj) { 266 + empty = false; 321 267 if (!refcount_dec_if_one(&obj->users)) 322 268 continue; 269 + 323 270 destroyed++; 324 271 xa_erase(&ictx->objects, index); 325 272 iommufd_object_ops[obj->type].destroy(obj); 326 273 kfree(obj); 327 274 } 275 + 276 + if (empty) 277 + break; 278 + 328 279 /* Bug related to users refcount */ 329 280 if (WARN_ON(!destroyed)) 330 281 break; 331 282 } 283 + 284 + /* 285 + * There may be some tombstones left over from 286 + * iommufd_object_tombstone_user() 287 + */ 288 + xa_destroy(&ictx->objects); 289 + 332 290 WARN_ON(!xa_empty(&ictx->groups)); 333 291 334 292 mutex_destroy(&ictx->sw_msi_lock); ··· 391 305 struct iommu_destroy destroy; 392 306 struct iommu_fault_alloc fault; 393 307 struct iommu_hw_info info; 308 + struct iommu_hw_queue_alloc hw_queue; 394 309 struct iommu_hwpt_alloc hwpt; 395 310 struct iommu_hwpt_get_dirty_bitmap get_dirty_bitmap; 396 311 struct iommu_hwpt_invalidate cache; ··· 434 347 struct iommu_fault_alloc, out_fault_fd), 435 348 IOCTL_OP(IOMMU_GET_HW_INFO, iommufd_get_hw_info, struct iommu_hw_info, 436 349 __reserved), 350 + IOCTL_OP(IOMMU_HW_QUEUE_ALLOC, iommufd_hw_queue_alloc_ioctl, 351 + struct iommu_hw_queue_alloc, length), 437 352 IOCTL_OP(IOMMU_HWPT_ALLOC, iommufd_hwpt_alloc, struct iommu_hwpt_alloc, 438 353 __reserved), 439 354 IOCTL_OP(IOMMU_HWPT_GET_DIRTY_BITMAP, iommufd_hwpt_get_dirty_bitmap, ··· 506 417 if (ret) 507 418 return ret; 508 419 ret = op->execute(&ucmd); 420 + 421 + if (ucmd.new_obj) { 422 + if (ret) 423 + iommufd_object_abort_and_destroy(ictx, ucmd.new_obj); 424 + else 425 + iommufd_object_finalize(ictx, ucmd.new_obj); 426 + } 509 427 return ret; 428 + } 429 + 430 + static void iommufd_fops_vma_open(struct vm_area_struct *vma) 431 + { 432 + struct iommufd_mmap *immap = vma->vm_private_data; 433 + 434 + refcount_inc(&immap->owner->users); 435 + } 436 + 437 + static void iommufd_fops_vma_close(struct vm_area_struct *vma) 438 + { 439 + 
struct iommufd_mmap *immap = vma->vm_private_data; 440 + 441 + refcount_dec(&immap->owner->users); 442 + } 443 + 444 + static const struct vm_operations_struct iommufd_vma_ops = { 445 + .open = iommufd_fops_vma_open, 446 + .close = iommufd_fops_vma_close, 447 + }; 448 + 449 + /* The vm_pgoff must be pre-allocated from mt_mmap, and given to user space */ 450 + static int iommufd_fops_mmap(struct file *filp, struct vm_area_struct *vma) 451 + { 452 + struct iommufd_ctx *ictx = filp->private_data; 453 + size_t length = vma->vm_end - vma->vm_start; 454 + struct iommufd_mmap *immap; 455 + int rc; 456 + 457 + if (!PAGE_ALIGNED(length)) 458 + return -EINVAL; 459 + if (!(vma->vm_flags & VM_SHARED)) 460 + return -EINVAL; 461 + if (vma->vm_flags & VM_EXEC) 462 + return -EPERM; 463 + 464 + /* vma->vm_pgoff carries a page-shifted start position to an immap */ 465 + immap = mtree_load(&ictx->mt_mmap, vma->vm_pgoff << PAGE_SHIFT); 466 + if (!immap) 467 + return -ENXIO; 468 + /* 469 + * mtree_load() returns the immap for any contained mmio_addr, so only 470 + * allow the exact immap thing to be mapped 471 + */ 472 + if (vma->vm_pgoff != immap->vm_pgoff || length != immap->length) 473 + return -ENXIO; 474 + 475 + vma->vm_pgoff = 0; 476 + vma->vm_private_data = immap; 477 + vma->vm_ops = &iommufd_vma_ops; 478 + vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot); 479 + 480 + rc = io_remap_pfn_range(vma, vma->vm_start, 481 + immap->mmio_addr >> PAGE_SHIFT, length, 482 + vma->vm_page_prot); 483 + if (rc) 484 + return rc; 485 + 486 + /* vm_ops.open won't be called for mmap itself. 
*/ 487 + refcount_inc(&immap->owner->users); 488 + return rc; 510 489 } 511 490 512 491 static const struct file_operations iommufd_fops = { ··· 582 425 .open = iommufd_fops_open, 583 426 .release = iommufd_fops_release, 584 427 .unlocked_ioctl = iommufd_fops_ioctl, 428 + .mmap = iommufd_fops_mmap, 585 429 }; 586 430 587 431 /** ··· 656 498 .destroy = iommufd_access_destroy_object, 657 499 }, 658 500 [IOMMUFD_OBJ_DEVICE] = { 501 + .pre_destroy = iommufd_device_pre_destroy, 659 502 .destroy = iommufd_device_destroy, 660 503 }, 661 504 [IOMMUFD_OBJ_FAULT] = { 662 505 .destroy = iommufd_fault_destroy, 506 + }, 507 + [IOMMUFD_OBJ_HW_QUEUE] = { 508 + .destroy = iommufd_hw_queue_destroy, 663 509 }, 664 510 [IOMMUFD_OBJ_HWPT_PAGING] = { 665 511 .destroy = iommufd_hwpt_paging_destroy, ··· 678 516 }, 679 517 [IOMMUFD_OBJ_VDEVICE] = { 680 518 .destroy = iommufd_vdevice_destroy, 519 + .abort = iommufd_vdevice_abort, 681 520 }, 682 521 [IOMMUFD_OBJ_VEVENTQ] = { 683 522 .destroy = iommufd_veventq_destroy, ··· 701 538 .nodename = "iommu", 702 539 .mode = 0660, 703 540 }; 704 - 705 541 706 542 static struct miscdevice vfio_misc_dev = { 707 543 .minor = VFIO_MINOR,
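Reviewer note (illustrative, not part of the patch): with `ucmd->new_obj`, the ioctl dispatcher rather than each handler decides whether a freshly allocated object is finalized or aborted, based on the handler's return code. The sketch below models only that control flow in userspace. Every `demo_*` name is hypothetical, `free()` stands in for `iommufd_object_abort_and_destroy()`, and a state flag stands in for publishing the pointer in the xarray.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

enum demo_state { DEMO_RESERVED, DEMO_PUBLISHED };

struct demo_obj {
	enum demo_state state;
};

struct demo_ucmd {
	struct demo_obj *new_obj; /* set by the alloc helper, consumed by dispatch */
	int aborted;              /* bookkeeping for this demo only */
};

/* Reserve an object and record it in the command (one per command). */
static struct demo_obj *demo_alloc_ucmd(struct demo_ucmd *ucmd)
{
	struct demo_obj *obj;

	if (ucmd->new_obj) /* a second tracked allocation is a coding error */
		return NULL;
	obj = calloc(1, sizeof(*obj));
	if (!obj)
		return NULL;
	obj->state = DEMO_RESERVED;
	ucmd->new_obj = obj;
	return obj;
}

typedef int (*demo_handler)(struct demo_ucmd *ucmd);

/* Run the handler, then publish on success or tear down on failure. */
static int demo_dispatch(struct demo_ucmd *ucmd, demo_handler execute)
{
	int ret = execute(ucmd);

	if (ucmd->new_obj) {
		if (ret) {
			free(ucmd->new_obj);
			ucmd->new_obj = NULL;
			ucmd->aborted = 1;
		} else {
			ucmd->new_obj->state = DEMO_PUBLISHED;
		}
	}
	return ret;
}

/* Handler that allocates and succeeds. */
static int demo_handler_ok(struct demo_ucmd *ucmd)
{
	return demo_alloc_ucmd(ucmd) ? 0 : -1;
}

/* Handler that allocates and then fails in a later step. */
static int demo_handler_fail(struct demo_ucmd *ucmd)
{
	if (!demo_alloc_ucmd(ucmd))
		return -1;
	return -2;
}
```

A handler written this way needs no `out_abort` label of its own, which matches the simplification visible in `iommufd_viommu_alloc_ioctl()` in this series.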
+15 -6
drivers/iommu/iommufd/pages.c
··· 1287 1287 } 1288 1288 1289 1289 static struct iopt_pages *iopt_alloc_pages(unsigned long start_byte, 1290 - unsigned long length, 1291 - bool writable) 1290 + unsigned long length, bool writable) 1292 1291 { 1293 1292 struct iopt_pages *pages; 1294 1293 ··· 1327 1328 struct iopt_pages *pages; 1328 1329 unsigned long end; 1329 1330 void __user *uptr_down = 1330 - (void __user *) ALIGN_DOWN((uintptr_t)uptr, PAGE_SIZE); 1331 + (void __user *)ALIGN_DOWN((uintptr_t)uptr, PAGE_SIZE); 1331 1332 1332 1333 if (check_add_overflow((unsigned long)uptr, length, &end)) 1333 1334 return ERR_PTR(-EOVERFLOW); ··· 2103 2104 * @last_index: Inclusive last page index 2104 2105 * @out_pages: Output list of struct page's representing the PFNs 2105 2106 * @flags: IOMMUFD_ACCESS_RW_* flags 2107 + * @lock_area: Fail userspace munmap on this area 2106 2108 * 2107 2109 * Record that an in-kernel access will be accessing the pages, ensure they are 2108 2110 * pinned, and return the PFNs as a simple list of 'struct page *'. 
··· 2111 2111 * This should be undone through a matching call to iopt_area_remove_access() 2112 2112 */ 2113 2113 int iopt_area_add_access(struct iopt_area *area, unsigned long start_index, 2114 - unsigned long last_index, struct page **out_pages, 2115 - unsigned int flags) 2114 + unsigned long last_index, struct page **out_pages, 2115 + unsigned int flags, bool lock_area) 2116 2116 { 2117 2117 struct iopt_pages *pages = area->pages; 2118 2118 struct iopt_pages_access *access; ··· 2125 2125 access = iopt_pages_get_exact_access(pages, start_index, last_index); 2126 2126 if (access) { 2127 2127 area->num_accesses++; 2128 + if (lock_area) 2129 + area->num_locks++; 2128 2130 access->users++; 2129 2131 iopt_pages_fill_from_xarray(pages, start_index, last_index, 2130 2132 out_pages); ··· 2148 2146 access->node.last = last_index; 2149 2147 access->users = 1; 2150 2148 area->num_accesses++; 2149 + if (lock_area) 2150 + area->num_locks++; 2151 2151 interval_tree_insert(&access->node, &pages->access_itree); 2152 2152 mutex_unlock(&pages->mutex); 2153 2153 return 0; ··· 2166 2162 * @area: The source of PFNs 2167 2163 * @start_index: First page index 2168 2164 * @last_index: Inclusive last page index 2165 + * @unlock_area: Must match the matching iopt_area_add_access()'s lock_area 2169 2166 * 2170 2167 * Undo iopt_area_add_access() and unpin the pages if necessary. The caller 2171 2168 * must stop using the PFNs before calling this. 
2172 2169 */ 2173 2170 void iopt_area_remove_access(struct iopt_area *area, unsigned long start_index, 2174 - unsigned long last_index) 2171 + unsigned long last_index, bool unlock_area) 2175 2172 { 2176 2173 struct iopt_pages *pages = area->pages; 2177 2174 struct iopt_pages_access *access; ··· 2183 2178 goto out_unlock; 2184 2179 2185 2180 WARN_ON(area->num_accesses == 0 || access->users == 0); 2181 + if (unlock_area) { 2182 + WARN_ON(area->num_locks == 0); 2183 + area->num_locks--; 2184 + } 2186 2185 area->num_accesses--; 2187 2186 access->users--; 2188 2187 if (access->users)
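Reviewer note (illustrative, not part of the patch): the new `lock_area` / `unlock_area` arguments keep a second counter, `num_locks`, alongside `num_accesses`. A userspace sketch of the bookkeeping follows. The `demo_*` names are hypothetical, unbalanced calls return `-EINVAL` here where the kernel code uses `WARN_ON()`, and the unmap check is an assumption inferred from the kdoc text "Fail userspace munmap on this area" since the unmap path is not shown in this hunk.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

struct demo_area {
	unsigned int num_accesses; /* all in-kernel accesses */
	unsigned int num_locks;    /* accesses that also pin the mapping */
};

/* Record an access; a locking access bumps both counters. */
static void demo_area_add_access(struct demo_area *area, bool lock_area)
{
	area->num_accesses++;
	if (lock_area)
		area->num_locks++;
}

/*
 * Undo demo_area_add_access(). @unlock_area must match the lock_area value
 * used when the access was added. Returns -EINVAL on an unbalanced call and
 * leaves the counters untouched in that case.
 */
static int demo_area_remove_access(struct demo_area *area, bool unlock_area)
{
	if (area->num_accesses == 0)
		return -EINVAL;
	if (unlock_area) {
		if (area->num_locks == 0)
			return -EINVAL;
		area->num_locks--;
	}
	area->num_accesses--;
	return 0;
}

/* Assumed policy: unmapping is refused while any lock is outstanding. */
static int demo_area_unmap(const struct demo_area *area)
{
	return area->num_locks ? -EBUSY : 0;
}
```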
+175 -32
drivers/iommu/iommufd/selftest.c
··· 138 138 struct mock_iommu_domain_nested { 139 139 struct iommu_domain domain; 140 140 struct mock_viommu *mock_viommu; 141 - struct mock_iommu_domain *parent; 142 141 u32 iotlb[MOCK_NESTED_DOMAIN_IOTLB_NUM]; 143 142 }; 144 143 ··· 150 151 struct mock_viommu { 151 152 struct iommufd_viommu core; 152 153 struct mock_iommu_domain *s2_parent; 154 + struct mock_hw_queue *hw_queue[IOMMU_TEST_HW_QUEUE_MAX]; 155 + struct mutex queue_mutex; 156 + 157 + unsigned long mmap_offset; 158 + u32 *page; /* Mmap page to test u32 type of in_data */ 153 159 }; 154 160 155 161 static inline struct mock_viommu *to_mock_viommu(struct iommufd_viommu *viommu) 156 162 { 157 163 return container_of(viommu, struct mock_viommu, core); 164 + } 165 + 166 + struct mock_hw_queue { 167 + struct iommufd_hw_queue core; 168 + struct mock_viommu *mock_viommu; 169 + struct mock_hw_queue *prev; 170 + u16 index; 171 + }; 172 + 173 + static inline struct mock_hw_queue * 174 + to_mock_hw_queue(struct iommufd_hw_queue *hw_queue) 175 + { 176 + return container_of(hw_queue, struct mock_hw_queue, core); 158 177 } 159 178 160 179 enum selftest_obj_type { ··· 305 288 .ops = &mock_blocking_ops, 306 289 }; 307 290 308 - static void *mock_domain_hw_info(struct device *dev, u32 *length, u32 *type) 291 + static void *mock_domain_hw_info(struct device *dev, u32 *length, 292 + enum iommu_hw_info_type *type) 309 293 { 310 294 struct iommu_test_hw_info *info; 295 + 296 + if (*type != IOMMU_HW_INFO_TYPE_DEFAULT && 297 + *type != IOMMU_HW_INFO_TYPE_SELFTEST) 298 + return ERR_PTR(-EOPNOTSUPP); 311 299 312 300 info = kzalloc(sizeof(*info), GFP_KERNEL); 313 301 if (!info) ··· 456 434 mock_nested = __mock_domain_alloc_nested(user_data); 457 435 if (IS_ERR(mock_nested)) 458 436 return ERR_CAST(mock_nested); 459 - mock_nested->parent = mock_parent; 460 437 return &mock_nested->domain; 461 438 } 462 439 ··· 692 671 { 693 672 struct mock_iommu_device *mock_iommu = container_of( 694 673 viommu->iommu_dev, struct 
mock_iommu_device, iommu_dev); 674 + struct mock_viommu *mock_viommu = to_mock_viommu(viommu); 695 675 696 676 if (refcount_dec_and_test(&mock_iommu->users)) 697 677 complete(&mock_iommu->complete); 678 + if (mock_viommu->mmap_offset) 679 + iommufd_viommu_destroy_mmap(&mock_viommu->core, 680 + mock_viommu->mmap_offset); 681 + free_page((unsigned long)mock_viommu->page); 682 + mutex_destroy(&mock_viommu->queue_mutex); 698 683 699 684 /* iommufd core frees mock_viommu and viommu */ 700 685 } ··· 719 692 if (IS_ERR(mock_nested)) 720 693 return ERR_CAST(mock_nested); 721 694 mock_nested->mock_viommu = mock_viommu; 722 - mock_nested->parent = mock_viommu->s2_parent; 723 695 return &mock_nested->domain; 724 696 } 725 697 ··· 792 766 return rc; 793 767 } 794 768 769 + static size_t mock_viommu_get_hw_queue_size(struct iommufd_viommu *viommu, 770 + enum iommu_hw_queue_type queue_type) 771 + { 772 + if (queue_type != IOMMU_HW_QUEUE_TYPE_SELFTEST) 773 + return 0; 774 + return HW_QUEUE_STRUCT_SIZE(struct mock_hw_queue, core); 775 + } 776 + 777 + static void mock_hw_queue_destroy(struct iommufd_hw_queue *hw_queue) 778 + { 779 + struct mock_hw_queue *mock_hw_queue = to_mock_hw_queue(hw_queue); 780 + struct mock_viommu *mock_viommu = mock_hw_queue->mock_viommu; 781 + 782 + mutex_lock(&mock_viommu->queue_mutex); 783 + mock_viommu->hw_queue[mock_hw_queue->index] = NULL; 784 + if (mock_hw_queue->prev) 785 + iommufd_hw_queue_undepend(mock_hw_queue, mock_hw_queue->prev, 786 + core); 787 + mutex_unlock(&mock_viommu->queue_mutex); 788 + } 789 + 790 + /* Test iommufd_hw_queue_depend/undepend() */ 791 + static int mock_hw_queue_init_phys(struct iommufd_hw_queue *hw_queue, u32 index, 792 + phys_addr_t base_addr_pa) 793 + { 794 + struct mock_viommu *mock_viommu = to_mock_viommu(hw_queue->viommu); 795 + struct mock_hw_queue *mock_hw_queue = to_mock_hw_queue(hw_queue); 796 + struct mock_hw_queue *prev = NULL; 797 + int rc = 0; 798 + 799 + if (index >= IOMMU_TEST_HW_QUEUE_MAX) 800 + return 
-EINVAL; 801 + 802 + mutex_lock(&mock_viommu->queue_mutex); 803 + 804 + if (mock_viommu->hw_queue[index]) { 805 + rc = -EEXIST; 806 + goto unlock; 807 + } 808 + 809 + if (index) { 810 + prev = mock_viommu->hw_queue[index - 1]; 811 + if (!prev) { 812 + rc = -EIO; 813 + goto unlock; 814 + } 815 + } 816 + 817 + /* 818 + * Test to catch a kernel bug if the core converted the physical address 819 + * incorrectly. Let mock_domain_iova_to_phys() WARN_ON if it fails. 820 + */ 821 + if (base_addr_pa != iommu_iova_to_phys(&mock_viommu->s2_parent->domain, 822 + hw_queue->base_addr)) { 823 + rc = -EFAULT; 824 + goto unlock; 825 + } 826 + 827 + if (prev) { 828 + rc = iommufd_hw_queue_depend(mock_hw_queue, prev, core); 829 + if (rc) 830 + goto unlock; 831 + } 832 + 833 + mock_hw_queue->prev = prev; 834 + mock_hw_queue->mock_viommu = mock_viommu; 835 + mock_viommu->hw_queue[index] = mock_hw_queue; 836 + 837 + hw_queue->destroy = &mock_hw_queue_destroy; 838 + unlock: 839 + mutex_unlock(&mock_viommu->queue_mutex); 840 + return rc; 841 + } 842 + 795 843 static struct iommufd_viommu_ops mock_viommu_ops = { 796 844 .destroy = mock_viommu_destroy, 797 845 .alloc_domain_nested = mock_viommu_alloc_domain_nested, 798 846 .cache_invalidate = mock_viommu_cache_invalidate, 847 + .get_hw_queue_size = mock_viommu_get_hw_queue_size, 848 + .hw_queue_init_phys = mock_hw_queue_init_phys, 799 849 }; 800 850 801 - static struct iommufd_viommu *mock_viommu_alloc(struct device *dev, 802 - struct iommu_domain *domain, 803 - struct iommufd_ctx *ictx, 804 - unsigned int viommu_type) 851 + static size_t mock_get_viommu_size(struct device *dev, 852 + enum iommu_viommu_type viommu_type) 805 853 { 806 - struct mock_iommu_device *mock_iommu = 807 - iommu_get_iommu_dev(dev, struct mock_iommu_device, iommu_dev); 808 - struct mock_viommu *mock_viommu; 809 - 810 854 if (viommu_type != IOMMU_VIOMMU_TYPE_SELFTEST) 811 - return ERR_PTR(-EOPNOTSUPP); 855 + return 0; 856 + return VIOMMU_STRUCT_SIZE(struct mock_viommu, 
core); 857 + } 812 858 813 - mock_viommu = iommufd_viommu_alloc(ictx, struct mock_viommu, core, 814 - &mock_viommu_ops); 815 - if (IS_ERR(mock_viommu)) 816 - return ERR_CAST(mock_viommu); 859 + static int mock_viommu_init(struct iommufd_viommu *viommu, 860 + struct iommu_domain *parent_domain, 861 + const struct iommu_user_data *user_data) 862 + { 863 + struct mock_iommu_device *mock_iommu = container_of( 864 + viommu->iommu_dev, struct mock_iommu_device, iommu_dev); 865 + struct mock_viommu *mock_viommu = to_mock_viommu(viommu); 866 + struct iommu_viommu_selftest data; 867 + int rc; 868 + 869 + if (user_data) { 870 + rc = iommu_copy_struct_from_user( 871 + &data, user_data, IOMMU_VIOMMU_TYPE_SELFTEST, out_data); 872 + if (rc) 873 + return rc; 874 + 875 + /* Allocate two pages */ 876 + mock_viommu->page = 877 + (u32 *)__get_free_pages(GFP_KERNEL | __GFP_ZERO, 1); 878 + if (!mock_viommu->page) 879 + return -ENOMEM; 880 + 881 + rc = iommufd_viommu_alloc_mmap(&mock_viommu->core, 882 + __pa(mock_viommu->page), 883 + PAGE_SIZE * 2, 884 + &mock_viommu->mmap_offset); 885 + if (rc) 886 + goto err_free_page; 887 + 888 + /* For loopback tests on both the page and out_data */ 889 + *mock_viommu->page = data.in_data; 890 + data.out_data = data.in_data; 891 + data.out_mmap_length = PAGE_SIZE * 2; 892 + data.out_mmap_offset = mock_viommu->mmap_offset; 893 + rc = iommu_copy_struct_to_user( 894 + user_data, &data, IOMMU_VIOMMU_TYPE_SELFTEST, out_data); 895 + if (rc) 896 + goto err_destroy_mmap; 897 + } 817 898 818 899 refcount_inc(&mock_iommu->users); 819 - return &mock_viommu->core; 900 + mutex_init(&mock_viommu->queue_mutex); 901 + mock_viommu->s2_parent = to_mock_domain(parent_domain); 902 + 903 + viommu->ops = &mock_viommu_ops; 904 + return 0; 905 + 906 + err_destroy_mmap: 907 + iommufd_viommu_destroy_mmap(&mock_viommu->core, 908 + mock_viommu->mmap_offset); 909 + err_free_page: 910 + free_page((unsigned long)mock_viommu->page); 911 + return rc; 820 912 } 821 913 822 914 
static const struct iommu_ops mock_ops = { ··· 953 809 .probe_device = mock_probe_device, 954 810 .page_response = mock_domain_page_response, 955 811 .user_pasid_table = true, 956 - .viommu_alloc = mock_viommu_alloc, 812 + .get_viommu_size = mock_get_viommu_size, 813 + .viommu_init = mock_viommu_init, 957 814 .default_domain_ops = 958 815 &(struct iommu_domain_ops){ 959 816 .free = mock_domain_free, ··· 1360 1215 return 0; 1361 1216 } 1362 1217 1363 - static int iommufd_test_md_check_iotlb(struct iommufd_ucmd *ucmd, 1364 - u32 mockpt_id, unsigned int iotlb_id, 1365 - u32 iotlb) 1218 + static int iommufd_test_md_check_iotlb(struct iommufd_ucmd *ucmd, u32 mockpt_id, 1219 + unsigned int iotlb_id, u32 iotlb) 1366 1220 { 1367 1221 struct mock_iommu_domain_nested *mock_nested; 1368 1222 struct iommufd_hw_pagetable *hwpt; ··· 1634 1490 int rc; 1635 1491 1636 1492 /* Prevent syzkaller from triggering a WARN_ON in kvzalloc() */ 1637 - if (length > 16*1024*1024) 1493 + if (length > 16 * 1024 * 1024) 1638 1494 return -ENOMEM; 1639 1495 1640 1496 if (flags & ~(MOCK_FLAGS_ACCESS_WRITE | MOCK_FLAGS_ACCESS_SYZ)) ··· 1651 1507 1652 1508 if (flags & MOCK_FLAGS_ACCESS_SYZ) 1653 1509 iova = iommufd_test_syz_conv_iova(staccess->access, 1654 - &cmd->access_pages.iova); 1510 + &cmd->access_pages.iova); 1655 1511 1656 1512 npages = (ALIGN(iova + length, PAGE_SIZE) - 1657 1513 ALIGN_DOWN(iova, PAGE_SIZE)) / ··· 1727 1583 int rc; 1728 1584 1729 1585 /* Prevent syzkaller from triggering a WARN_ON in kvzalloc() */ 1730 - if (length > 16*1024*1024) 1586 + if (length > 16 * 1024 * 1024) 1731 1587 return -ENOMEM; 1732 1588 1733 1589 if (flags & ~(MOCK_ACCESS_RW_WRITE | MOCK_ACCESS_RW_SLOW_PATH | ··· 1753 1609 1754 1610 if (flags & MOCK_FLAGS_ACCESS_SYZ) 1755 1611 iova = iommufd_test_syz_conv_iova(staccess->access, 1756 - &cmd->access_rw.iova); 1612 + &cmd->access_rw.iova); 1757 1613 1758 1614 rc = iommufd_access_rw(staccess->access, iova, tmp, length, flags); 1759 1615 if (rc) ··· 1808 1664 
goto out_put; 1809 1665 } 1810 1666 1811 - if (copy_from_user(tmp, uptr,DIV_ROUND_UP(max, BITS_PER_BYTE))) { 1667 + if (copy_from_user(tmp, uptr, DIV_ROUND_UP(max, BITS_PER_BYTE))) { 1812 1668 rc = -EFAULT; 1813 1669 goto out_free; 1814 1670 } ··· 1844 1700 static int iommufd_test_trigger_iopf(struct iommufd_ucmd *ucmd, 1845 1701 struct iommu_test_cmd *cmd) 1846 1702 { 1847 - struct iopf_fault event = { }; 1703 + struct iopf_fault event = {}; 1848 1704 struct iommufd_device *idev; 1849 1705 1850 1706 idev = iommufd_get_device(ucmd, cmd->trigger_iopf.dev_id); ··· 1975 1831 1976 1832 rc = iommufd_ucmd_respond(ucmd, sizeof(*cmd)); 1977 1833 if (rc) 1978 - iommufd_device_detach(sobj->idev.idev, 1979 - cmd->pasid_attach.pasid); 1834 + iommufd_device_detach(sobj->idev.idev, cmd->pasid_attach.pasid); 1980 1835 1981 1836 out_sobj: 1982 1837 iommufd_put_object(ucmd->ictx, &sobj->obj); ··· 2146 2003 goto err_bus; 2147 2004 2148 2005 rc = iommu_device_register_bus(&mock_iommu.iommu_dev, &mock_ops, 2149 - &iommufd_mock_bus_type.bus, 2150 - &iommufd_mock_bus_type.nb); 2006 + &iommufd_mock_bus_type.bus, 2007 + &iommufd_mock_bus_type.nb); 2151 2008 if (rc) 2152 2009 goto err_sysfs; 2153 2010
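Reviewer note (illustrative, not part of the patch): `mock_hw_queue_init_phys()` above checks the index bound, rejects an occupied slot, and requires queue N-1 to exist before queue N. The userspace sketch below reproduces that ordering rule. The `demo_*` names are hypothetical, and a plain `dependents` counter stands in for `iommufd_hw_queue_depend()` / `iommufd_hw_queue_undepend()`, on the assumption that the real helpers keep the earlier queue alive while a later one depends on it.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define DEMO_HW_QUEUE_MAX 2

struct demo_hw_queue {
	struct demo_hw_queue *prev; /* queue this one depends on, if any */
	unsigned int index;
	unsigned int dependents;    /* number of queues depending on this one */
};

struct demo_viommu {
	struct demo_hw_queue *slots[DEMO_HW_QUEUE_MAX];
};

/* Install @queue at @index; queue N requires queue N-1 to exist already. */
static int demo_hw_queue_init(struct demo_viommu *viommu,
			      struct demo_hw_queue *queue, unsigned int index)
{
	struct demo_hw_queue *prev = NULL;

	if (index >= DEMO_HW_QUEUE_MAX)
		return -EINVAL;
	if (viommu->slots[index])
		return -EEXIST;
	if (index) {
		prev = viommu->slots[index - 1];
		if (!prev)
			return -EIO;
		prev->dependents++;
	}

	queue->prev = prev;
	queue->index = index;
	queue->dependents = 0;
	viommu->slots[index] = queue;
	return 0;
}

/* Remove @queue; refused while a later queue still depends on it. */
static int demo_hw_queue_destroy(struct demo_viommu *viommu,
				 struct demo_hw_queue *queue)
{
	if (queue->dependents)
		return -EBUSY;
	viommu->slots[queue->index] = NULL;
	if (queue->prev)
		queue->prev->dependents--;
	return 0;
}
```

Destruction therefore has to run in the reverse order of creation, which is the behaviour the selftest is meant to exercise.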
+292 -21
drivers/iommu/iommufd/viommu.c
··· 17 17 int iommufd_viommu_alloc_ioctl(struct iommufd_ucmd *ucmd) 18 18 { 19 19 struct iommu_viommu_alloc *cmd = ucmd->cmd; 20 + const struct iommu_user_data user_data = { 21 + .type = cmd->type, 22 + .uptr = u64_to_user_ptr(cmd->data_uptr), 23 + .len = cmd->data_len, 24 + }; 20 25 struct iommufd_hwpt_paging *hwpt_paging; 21 26 struct iommufd_viommu *viommu; 22 27 struct iommufd_device *idev; 23 28 const struct iommu_ops *ops; 29 + size_t viommu_size; 24 30 int rc; 25 31 26 32 if (cmd->flags || cmd->type == IOMMU_VIOMMU_TYPE_DEFAULT) ··· 37 31 return PTR_ERR(idev); 38 32 39 33 ops = dev_iommu_ops(idev->dev); 40 - if (!ops->viommu_alloc) { 34 + if (!ops->get_viommu_size || !ops->viommu_init) { 35 + rc = -EOPNOTSUPP; 36 + goto out_put_idev; 37 + } 38 + 39 + viommu_size = ops->get_viommu_size(idev->dev, cmd->type); 40 + if (!viommu_size) { 41 + rc = -EOPNOTSUPP; 42 + goto out_put_idev; 43 + } 44 + 45 + /* 46 + * It is a driver bug for providing a viommu_size smaller than the core 47 + * vIOMMU structure size 48 + */ 49 + if (WARN_ON_ONCE(viommu_size < sizeof(*viommu))) { 41 50 rc = -EOPNOTSUPP; 42 51 goto out_put_idev; 43 52 } ··· 68 47 goto out_put_hwpt; 69 48 } 70 49 71 - viommu = ops->viommu_alloc(idev->dev, hwpt_paging->common.domain, 72 - ucmd->ictx, cmd->type); 50 + viommu = (struct iommufd_viommu *)_iommufd_object_alloc_ucmd( 51 + ucmd, viommu_size, IOMMUFD_OBJ_VIOMMU); 73 52 if (IS_ERR(viommu)) { 74 53 rc = PTR_ERR(viommu); 75 54 goto out_put_hwpt; ··· 89 68 */ 90 69 viommu->iommu_dev = __iommu_get_iommu_dev(idev->dev); 91 70 71 + rc = ops->viommu_init(viommu, hwpt_paging->common.domain, 72 + user_data.len ? 
&user_data : NULL);
+	if (rc)
+		goto out_put_hwpt;
+
+	/* It is a driver bug that viommu->ops isn't filled */
+	if (WARN_ON_ONCE(!viommu->ops)) {
+		rc = -EOPNOTSUPP;
+		goto out_put_hwpt;
+	}
+
 	cmd->out_viommu_id = viommu->obj.id;
 	rc = iommufd_ucmd_respond(ucmd, sizeof(*cmd));
-	if (rc)
-		goto out_abort;
-	iommufd_object_finalize(ucmd->ictx, &viommu->obj);
-	goto out_put_hwpt;
 
-out_abort:
-	iommufd_object_abort_and_destroy(ucmd->ictx, &viommu->obj);
 out_put_hwpt:
 	iommufd_put_object(ucmd->ictx, &hwpt_paging->common.obj);
 out_put_idev:
···
 	return rc;
 }
 
-void iommufd_vdevice_destroy(struct iommufd_object *obj)
+void iommufd_vdevice_abort(struct iommufd_object *obj)
 {
 	struct iommufd_vdevice *vdev =
 		container_of(obj, struct iommufd_vdevice, obj);
 	struct iommufd_viommu *viommu = vdev->viommu;
+	struct iommufd_device *idev = vdev->idev;
 
+	lockdep_assert_held(&idev->igroup->lock);
+
+	if (vdev->destroy)
+		vdev->destroy(vdev);
 	/* xa_cmpxchg is okay to fail if alloc failed xa_cmpxchg previously */
-	xa_cmpxchg(&viommu->vdevs, vdev->id, vdev, NULL, GFP_KERNEL);
+	xa_cmpxchg(&viommu->vdevs, vdev->virt_id, vdev, NULL, GFP_KERNEL);
 	refcount_dec(&viommu->obj.users);
-	put_device(vdev->dev);
+	idev->vdev = NULL;
+}
+
+void iommufd_vdevice_destroy(struct iommufd_object *obj)
+{
+	struct iommufd_vdevice *vdev =
+		container_of(obj, struct iommufd_vdevice, obj);
+	struct iommufd_device *idev = vdev->idev;
+	struct iommufd_ctx *ictx = idev->ictx;
+
+	mutex_lock(&idev->igroup->lock);
+	iommufd_vdevice_abort(obj);
+	mutex_unlock(&idev->igroup->lock);
+	iommufd_put_object(ictx, &idev->obj);
 }
 
 int iommufd_vdevice_alloc_ioctl(struct iommufd_ucmd *ucmd)
 {
 	struct iommu_vdevice_alloc *cmd = ucmd->cmd;
 	struct iommufd_vdevice *vdev, *curr;
+	size_t vdev_size = sizeof(*vdev);
 	struct iommufd_viommu *viommu;
 	struct iommufd_device *idev;
 	u64 virt_id = cmd->virt_id;
···
 		goto out_put_idev;
 	}
 
-	vdev = iommufd_object_alloc(ucmd->ictx, vdev, IOMMUFD_OBJ_VDEVICE);
-	if (IS_ERR(vdev)) {
-		rc = PTR_ERR(vdev);
-		goto out_put_idev;
+	mutex_lock(&idev->igroup->lock);
+	if (idev->destroying) {
+		rc = -ENOENT;
+		goto out_unlock_igroup;
 	}
 
-	vdev->id = virt_id;
-	vdev->dev = idev->dev;
-	get_device(idev->dev);
+	if (idev->vdev) {
+		rc = -EEXIST;
+		goto out_unlock_igroup;
+	}
+
+	if (viommu->ops && viommu->ops->vdevice_size) {
+		/*
+		 * It is a driver bug for:
+		 * - ops->vdevice_size smaller than the core structure size
+		 * - not implementing a pairing ops->vdevice_init op
+		 */
+		if (WARN_ON_ONCE(viommu->ops->vdevice_size < vdev_size ||
+				 !viommu->ops->vdevice_init)) {
+			rc = -EOPNOTSUPP;
+			goto out_put_idev;
+		}
+		vdev_size = viommu->ops->vdevice_size;
+	}
+
+	vdev = (struct iommufd_vdevice *)_iommufd_object_alloc(
+		ucmd->ictx, vdev_size, IOMMUFD_OBJ_VDEVICE);
+	if (IS_ERR(vdev)) {
+		rc = PTR_ERR(vdev);
+		goto out_unlock_igroup;
+	}
+
+	vdev->virt_id = virt_id;
 	vdev->viommu = viommu;
 	refcount_inc(&viommu->obj.users);
+	/*
+	 * A wait_cnt reference is held on the idev so long as we have the
+	 * pointer. iommufd_device_pre_destroy() will revoke it before the
+	 * idev real destruction.
+	 */
+	vdev->idev = idev;
+
+	/*
+	 * iommufd_device_destroy() delays until idev->vdev is NULL before
+	 * freeing the idev, which only happens once the vdev is finished
+	 * destruction.
+	 */
+	idev->vdev = vdev;
 
 	curr = xa_cmpxchg(&viommu->vdevs, virt_id, NULL, vdev, GFP_KERNEL);
 	if (curr) {
···
 		goto out_abort;
 	}
 
+	if (viommu->ops && viommu->ops->vdevice_init) {
+		rc = viommu->ops->vdevice_init(vdev);
+		if (rc)
+			goto out_abort;
+	}
+
 	cmd->out_vdevice_id = vdev->obj.id;
 	rc = iommufd_ucmd_respond(ucmd, sizeof(*cmd));
 	if (rc)
 		goto out_abort;
 	iommufd_object_finalize(ucmd->ictx, &vdev->obj);
-	goto out_put_idev;
+	goto out_unlock_igroup;
 
 out_abort:
 	iommufd_object_abort_and_destroy(ucmd->ictx, &vdev->obj);
+out_unlock_igroup:
+	mutex_unlock(&idev->igroup->lock);
 out_put_idev:
-	iommufd_put_object(ucmd->ictx, &idev->obj);
+	if (rc)
+		iommufd_put_object(ucmd->ictx, &idev->obj);
+out_put_viommu:
+	iommufd_put_object(ucmd->ictx, &viommu->obj);
+	return rc;
+}
+
+static void iommufd_hw_queue_destroy_access(struct iommufd_ctx *ictx,
+					    struct iommufd_access *access,
+					    u64 base_iova, size_t length)
+{
+	u64 aligned_iova = PAGE_ALIGN_DOWN(base_iova);
+	u64 offset = base_iova - aligned_iova;
+
+	iommufd_access_unpin_pages(access, aligned_iova,
+				   PAGE_ALIGN(length + offset));
+	iommufd_access_detach_internal(access);
+	iommufd_access_destroy_internal(ictx, access);
+}
+
+void iommufd_hw_queue_destroy(struct iommufd_object *obj)
+{
+	struct iommufd_hw_queue *hw_queue =
+		container_of(obj, struct iommufd_hw_queue, obj);
+
+	if (hw_queue->destroy)
+		hw_queue->destroy(hw_queue);
+	if (hw_queue->access)
+		iommufd_hw_queue_destroy_access(hw_queue->viommu->ictx,
+						hw_queue->access,
+						hw_queue->base_addr,
+						hw_queue->length);
+	if (hw_queue->viommu)
+		refcount_dec(&hw_queue->viommu->obj.users);
+}
+
+/*
+ * When the HW accesses the guest queue via physical addresses, the underlying
+ * physical pages of the guest queue must be contiguous. Also, for the security
+ * concern that IOMMUFD_CMD_IOAS_UNMAP could potentially remove the mappings of
+ * the guest queue from the nesting parent iopt while the HW is still accessing
+ * the guest queue memory physically, such a HW queue must require an access to
+ * pin the underlying pages and prevent that from happening.
+ */
+static struct iommufd_access *
+iommufd_hw_queue_alloc_phys(struct iommu_hw_queue_alloc *cmd,
+			    struct iommufd_viommu *viommu, phys_addr_t *base_pa)
+{
+	u64 aligned_iova = PAGE_ALIGN_DOWN(cmd->nesting_parent_iova);
+	u64 offset = cmd->nesting_parent_iova - aligned_iova;
+	struct iommufd_access *access;
+	struct page **pages;
+	size_t max_npages;
+	size_t length;
+	size_t i;
+	int rc;
+
+	/* max_npages = DIV_ROUND_UP(offset + cmd->length, PAGE_SIZE) */
+	if (check_add_overflow(offset, cmd->length, &length))
+		return ERR_PTR(-ERANGE);
+	if (check_add_overflow(length, PAGE_SIZE - 1, &length))
+		return ERR_PTR(-ERANGE);
+	max_npages = length / PAGE_SIZE;
+	/* length needs to be page aligned too */
+	length = max_npages * PAGE_SIZE;
+
+	/*
+	 * Use kvcalloc() to avoid memory fragmentation for a large page array.
+	 * Set __GFP_NOWARN to avoid syzkaller blowups
+	 */
+	pages = kvcalloc(max_npages, sizeof(*pages), GFP_KERNEL | __GFP_NOWARN);
+	if (!pages)
+		return ERR_PTR(-ENOMEM);
+
+	access = iommufd_access_create_internal(viommu->ictx);
+	if (IS_ERR(access)) {
+		rc = PTR_ERR(access);
+		goto out_free;
+	}
+
+	rc = iommufd_access_attach_internal(access, viommu->hwpt->ioas);
+	if (rc)
+		goto out_destroy;
+
+	rc = iommufd_access_pin_pages(access, aligned_iova, length, pages, 0);
+	if (rc)
+		goto out_detach;
+
+	/* Validate if the underlying physical pages are contiguous */
+	for (i = 1; i < max_npages; i++) {
+		if (page_to_pfn(pages[i]) == page_to_pfn(pages[i - 1]) + 1)
+			continue;
+		rc = -EFAULT;
+		goto out_unpin;
+	}
+
+	*base_pa = (page_to_pfn(pages[0]) << PAGE_SHIFT) + offset;
+	kfree(pages);
+	return access;
+
+out_unpin:
+	iommufd_access_unpin_pages(access, aligned_iova, length);
+out_detach:
+	iommufd_access_detach_internal(access);
+out_destroy:
+	iommufd_access_destroy_internal(viommu->ictx, access);
+out_free:
+	kfree(pages);
+	return ERR_PTR(rc);
+}
+
+int iommufd_hw_queue_alloc_ioctl(struct iommufd_ucmd *ucmd)
+{
+	struct iommu_hw_queue_alloc *cmd = ucmd->cmd;
+	struct iommufd_hw_queue *hw_queue;
+	struct iommufd_viommu *viommu;
+	struct iommufd_access *access;
+	size_t hw_queue_size;
+	phys_addr_t base_pa;
+	u64 last;
+	int rc;
+
+	if (cmd->flags || cmd->type == IOMMU_HW_QUEUE_TYPE_DEFAULT)
+		return -EOPNOTSUPP;
+	if (!cmd->length)
+		return -EINVAL;
+	if (check_add_overflow(cmd->nesting_parent_iova, cmd->length - 1,
+			       &last))
+		return -EOVERFLOW;
+
+	viommu = iommufd_get_viommu(ucmd, cmd->viommu_id);
+	if (IS_ERR(viommu))
+		return PTR_ERR(viommu);
+
+	if (!viommu->ops || !viommu->ops->get_hw_queue_size ||
+	    !viommu->ops->hw_queue_init_phys) {
+		rc = -EOPNOTSUPP;
+		goto out_put_viommu;
+	}
+
+	hw_queue_size = viommu->ops->get_hw_queue_size(viommu, cmd->type);
+	if (!hw_queue_size) {
+		rc = -EOPNOTSUPP;
+		goto out_put_viommu;
+	}
+
+	/*
+	 * It is a driver bug for providing a hw_queue_size smaller than the
+	 * core HW queue structure size
+	 */
+	if (WARN_ON_ONCE(hw_queue_size < sizeof(*hw_queue))) {
+		rc = -EOPNOTSUPP;
+		goto out_put_viommu;
+	}
+
+	hw_queue = (struct iommufd_hw_queue *)_iommufd_object_alloc_ucmd(
+		ucmd, hw_queue_size, IOMMUFD_OBJ_HW_QUEUE);
+	if (IS_ERR(hw_queue)) {
+		rc = PTR_ERR(hw_queue);
+		goto out_put_viommu;
+	}
+
+	access = iommufd_hw_queue_alloc_phys(cmd, viommu, &base_pa);
+	if (IS_ERR(access)) {
+		rc = PTR_ERR(access);
+		goto out_put_viommu;
+	}
+
+	hw_queue->viommu = viommu;
+	refcount_inc(&viommu->obj.users);
+	hw_queue->access = access;
+	hw_queue->type = cmd->type;
+	hw_queue->length = cmd->length;
+	hw_queue->base_addr = cmd->nesting_parent_iova;
+
+	rc = viommu->ops->hw_queue_init_phys(hw_queue, cmd->index, base_pa);
+	if (rc)
+		goto out_put_viommu;
+
+	cmd->out_hw_queue_id = hw_queue->obj.id;
+	rc = iommufd_ucmd_respond(ucmd, sizeof(*cmd));
+
 out_put_viommu:
 	iommufd_put_object(ucmd->ictx, &viommu->obj);
 	return rc;
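Reviewer note: the page-span arithmetic and the contiguity check in iommufd_hw_queue_alloc_phys() can be exercised outside the kernel. The following is a minimal user-space sketch, not the kernel code: it assumes 4 KiB pages, uses the GCC/Clang builtin __builtin_add_overflow() in place of check_add_overflow(), and every sketch_* name is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u /* assumption: 4 KiB pages */

/*
 * Compute the page-aligned start and the number of pages that cover
 * [iova, iova + length), refusing inputs whose sum would wrap.
 * Hypothetical helper; it only mirrors the arithmetic in the hunk above.
 */
static bool sketch_queue_npages(uint64_t iova, uint64_t length,
				uint64_t *aligned_iova, uint64_t *npages)
{
	uint64_t offset = iova & (SKETCH_PAGE_SIZE - 1);
	uint64_t span;

	/* offset + length must not overflow */
	if (__builtin_add_overflow(offset, length, &span))
		return false;
	/* rounding up to a page boundary must not overflow either */
	if (__builtin_add_overflow(span, (uint64_t)SKETCH_PAGE_SIZE - 1, &span))
		return false;

	*aligned_iova = iova - offset;
	*npages = span / SKETCH_PAGE_SIZE;
	return true;
}

/* Return true when every PFN follows its predecessor by exactly one. */
static bool sketch_pfns_contiguous(const uint64_t *pfns, size_t n)
{
	for (size_t i = 1; i < n; i++)
		if (pfns[i] != pfns[i - 1] + 1)
			return false;
	return true;
}
```

A queue that starts 16 bytes before a page boundary and is 32 bytes long spans two pages, which is why the kernel code pins from the aligned IOVA rather than from the raw base address.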
+60 -14
include/linux/iommu.h
···
 #include <linux/err.h>
 #include <linux/of.h>
 #include <linux/iova_bitmap.h>
+#include <uapi/linux/iommufd.h>
 
 #define IOMMU_READ	(1 << 0)
 #define IOMMU_WRITE	(1 << 1)
···
 }
 
 /**
+ * __iommu_copy_struct_to_user - Report iommu driver specific user space data
+ * @dst_data: Pointer to a struct iommu_user_data for user space data location
+ * @src_data: Pointer to an iommu driver specific user data that is defined in
+ *            include/uapi/linux/iommufd.h
+ * @data_type: The data type of the @src_data. Must match with @dst_data.type
+ * @data_len: Length of current user data structure, i.e. sizeof(struct _src)
+ * @min_len: Initial length of user data structure for backward compatibility.
+ *           This should be offsetofend using the last member in the user data
+ *           struct that was initially added to include/uapi/linux/iommufd.h
+ */
+static inline int
+__iommu_copy_struct_to_user(const struct iommu_user_data *dst_data,
+			    void *src_data, unsigned int data_type,
+			    size_t data_len, size_t min_len)
+{
+	if (WARN_ON(!dst_data || !src_data))
+		return -EINVAL;
+	if (dst_data->type != data_type)
+		return -EINVAL;
+	if (dst_data->len < min_len || data_len < dst_data->len)
+		return -EINVAL;
+	return copy_struct_to_user(dst_data->uptr, dst_data->len, src_data,
+				   data_len, NULL);
+}
+
+/**
+ * iommu_copy_struct_to_user - Report iommu driver specific user space data
+ * @user_data: Pointer to a struct iommu_user_data for user space data location
+ * @ksrc: Pointer to an iommu driver specific user data that is defined in
+ *        include/uapi/linux/iommufd.h
+ * @data_type: The data type of the @ksrc. Must match with @user_data->type
+ * @min_last: The last member of the data structure @ksrc points in the initial
+ *            version.
+ * Return 0 for success, otherwise -error.
+ */
+#define iommu_copy_struct_to_user(user_data, ksrc, data_type, min_last)	\
+	__iommu_copy_struct_to_user(user_data, ksrc, data_type, sizeof(*ksrc),	\
+				    offsetofend(typeof(*ksrc), min_last))
+
+/**
  * struct iommu_ops - iommu ops and capabilities
  * @capable: check capability
  * @hw_info: report iommu hardware information. The data buffer returned by this
  *           op is allocated in the iommu driver and freed by the caller after
- *           use. The information type is one of enum iommu_hw_info_type defined
- *           in include/uapi/linux/iommufd.h.
+ *           use. @type can input a requested type and output a supported type.
+ *           Driver should reject an unsupported data @type input
  * @domain_alloc: Do not use in new drivers
  * @domain_alloc_identity: allocate an IDENTITY domain. Drivers should prefer to
  *                         use identity_domain instead. This should only be used
···
  *		- IOMMU_DOMAIN_DMA: must use a dma domain
  *		- 0: use the default setting
  * @default_domain_ops: the default ops for domains
- * @viommu_alloc: Allocate an iommufd_viommu on a physical IOMMU instance behind
- *                the @dev, as the set of virtualization resources shared/passed
- *                to user space IOMMU instance. And associate it with a nesting
- *                @parent_domain. The @viommu_type must be defined in the header
- *                include/uapi/linux/iommufd.h
- *                It is required to call iommufd_viommu_alloc() helper for
- *                a bundled allocation of the core and the driver structures,
- *                using the given @ictx pointer.
+ * @get_viommu_size: Get the size of a driver-level vIOMMU structure for a given
+ *                   @dev corresponding to @viommu_type. Driver should return 0
+ *                   if vIOMMU isn't supported accordingly. It is required for
+ *                   driver to use the VIOMMU_STRUCT_SIZE macro to sanitize the
+ *                   driver-level vIOMMU structure related to the core one
+ * @viommu_init: Init the driver-level struct of an iommufd_viommu on a physical
+ *               IOMMU instance @viommu->iommu_dev, as the set of virtualization
+ *               resources shared/passed to user space IOMMU instance. Associate
+ *               it with a nesting @parent_domain. It is required for driver to
+ *               set @viommu->ops pointing to its own viommu_ops
  * @owner: Driver module providing these ops
  * @identity_domain: An always available, always attachable identity
  *                   translation.
···
  */
 struct iommu_ops {
 	bool (*capable)(struct device *dev, enum iommu_cap);
-	void *(*hw_info)(struct device *dev, u32 *length, u32 *type);
+	void *(*hw_info)(struct device *dev, u32 *length,
+			 enum iommu_hw_info_type *type);
 
 	/* Domain allocation and freeing by the iommu driver */
 #if IS_ENABLED(CONFIG_FSL_PAMU)
···
 
 	int (*def_domain_type)(struct device *dev);
 
-	struct iommufd_viommu *(*viommu_alloc)(
-		struct device *dev, struct iommu_domain *parent_domain,
-		struct iommufd_ctx *ictx, unsigned int viommu_type);
+	size_t (*get_viommu_size)(struct device *dev,
+				  enum iommu_viommu_type viommu_type);
+	int (*viommu_init)(struct iommufd_viommu *viommu,
+			   struct iommu_domain *parent_domain,
+			   const struct iommu_user_data *user_data);
 
 	const struct iommu_domain_ops *default_domain_ops;
 	struct module *owner;
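Reviewer note: the length validation in __iommu_copy_struct_to_user() accepts a user buffer only if it covers the first uAPI version of the structure and is no larger than the kernel's current structure. Below is a minimal user-space sketch of that window, assuming a made-up two-version structure; struct sketch_hw_info, sketch_offsetofend() and sketch_check_user_len() are hypothetical names, not kernel API.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* User-space stand-in for the kernel's offsetofend() helper. */
#define sketch_offsetofend(type, member) \
	(offsetof(type, member) + sizeof(((type *)0)->member))

/* Hypothetical uAPI structure that grew by one field after its first release. */
struct sketch_hw_info {
	uint32_t flags;
	uint32_t version;   /* last member of the first uAPI version */
	uint64_t extension; /* member added by a later version */
};

/*
 * Accept user_len only inside [min_len, data_len]:
 * - shorter than min_len: the buffer cannot hold even the first version
 * - longer than data_len: the kernel has no data for the extra bytes
 */
static int sketch_check_user_len(size_t user_len, size_t data_len,
				 size_t min_len)
{
	if (user_len < min_len || data_len < user_len)
		return -EINVAL;
	return 0;
}
```

With this layout the accepted window is 8 to 16 bytes, so an old user space that only knows flags and version still gets a valid, truncated copy.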
+174 -22
include/linux/iommufd.h
···
 	IOMMUFD_OBJ_VIOMMU,
 	IOMMUFD_OBJ_VDEVICE,
 	IOMMUFD_OBJ_VEVENTQ,
+	IOMMUFD_OBJ_HW_QUEUE,
 #ifdef CONFIG_IOMMUFD_TEST
 	IOMMUFD_OBJ_SELFTEST,
 #endif
···
 
 /* Base struct for all objects with a userspace ID handle. */
 struct iommufd_object {
-	refcount_t shortterm_users;
+	/*
+	 * Destroy will sleep and wait for wait_cnt to go to zero. This allows
+	 * concurrent users of the ID to reliably avoid causing a spurious
+	 * destroy failure. Incrementing this count should either be short
+	 * lived or be revoked and blocked during pre_destroy().
+	 */
+	refcount_t wait_cnt;
 	refcount_t users;
 	enum iommufd_object_type type;
 	unsigned int id;
···
 	struct list_head veventqs;
 	struct rw_semaphore veventqs_rwsem;
 
-	unsigned int type;
+	enum iommu_viommu_type type;
+};
+
+struct iommufd_vdevice {
+	struct iommufd_object obj;
+	struct iommufd_viommu *viommu;
+	struct iommufd_device *idev;
+
+	/*
+	 * Virtual device ID per vIOMMU, e.g. vSID of ARM SMMUv3, vDeviceID of
+	 * AMD IOMMU, and vRID of Intel VT-d
+	 */
+	u64 virt_id;
+
+	/* Clean up all driver-specific parts of an iommufd_vdevice */
+	void (*destroy)(struct iommufd_vdevice *vdev);
+};
+
+struct iommufd_hw_queue {
+	struct iommufd_object obj;
+	struct iommufd_viommu *viommu;
+	struct iommufd_access *access;
+
+	u64 base_addr; /* in guest physical address space */
+	size_t length;
+
+	enum iommu_hw_queue_type type;
+
+	/* Clean up all driver-specific parts of an iommufd_hw_queue */
+	void (*destroy)(struct iommufd_hw_queue *hw_queue);
 };
 
 /**
···
  *                    array->entry_num to report the number of handled requests.
  *                    The data structure of the array entry must be defined in
  *                    include/uapi/linux/iommufd.h
+ * @vdevice_size: Size of the driver-defined vDEVICE structure per this vIOMMU
+ * @vdevice_init: Initialize the driver-level structure of a vDEVICE object, or
+ *                related HW procedure. @vdev is already initialized by iommufd
+ *                core: vdev->dev and vdev->viommu pointers; vdev->id carries a
+ *                per-vIOMMU virtual ID (refer to struct iommu_vdevice_alloc in
+ *                include/uapi/linux/iommufd.h)
+ *                If driver has a deinit function to revert what vdevice_init op
+ *                does, it should set it to the @vdev->destroy function pointer
+ * @get_hw_queue_size: Get the size of a driver-defined HW queue structure for a
+ *                     given @viommu corresponding to @queue_type. Driver should
+ *                     return 0 if HW queue aren't supported accordingly. It is
+ *                     required for driver to use the HW_QUEUE_STRUCT_SIZE macro
+ *                     to sanitize the driver-level HW queue structure related
+ *                     to the core one
+ * @hw_queue_init_phys: Initialize the driver-level structure of a HW queue that
+ *                      is initialized with its core-level structure that holds
+ *                      all the info about a guest queue memory.
+ *                      Driver providing this op indicates that HW accesses the
+ *                      guest queue memory via physical addresses.
+ *                      @index carries the logical HW QUEUE ID per vIOMMU in a
+ *                      guest VM, for a multi-queue model. @base_addr_pa carries
+ *                      the physical location of the guest queue
+ *                      If driver has a deinit function to revert what this op
+ *                      does, it should set it to the @hw_queue->destroy pointer
  */
 struct iommufd_viommu_ops {
 	void (*destroy)(struct iommufd_viommu *viommu);
···
 		const struct iommu_user_data *user_data);
 	int (*cache_invalidate)(struct iommufd_viommu *viommu,
 				struct iommu_user_data_array *array);
+	const size_t vdevice_size;
+	int (*vdevice_init)(struct iommufd_vdevice *vdev);
+	size_t (*get_hw_queue_size)(struct iommufd_viommu *viommu,
+				    enum iommu_hw_queue_type queue_type);
+	/* AMD's HW will add hw_queue_init simply using @hw_queue->base_addr */
+	int (*hw_queue_init_phys)(struct iommufd_hw_queue *hw_queue, u32 index,
+				  phys_addr_t base_addr_pa);
 };
 
 #if IS_ENABLED(CONFIG_IOMMUFD)
···
 {
 }
 
-static inline int iommufd_access_rw(struct iommufd_access *access, unsigned long iova,
-		      void *data, size_t len, unsigned int flags)
+static inline int iommufd_access_rw(struct iommufd_access *access,
+				    unsigned long iova, void *data, size_t len,
+				    unsigned int flags)
 {
 	return -EOPNOTSUPP;
 }
···
 #endif /* CONFIG_IOMMUFD */
 
 #if IS_ENABLED(CONFIG_IOMMUFD_DRIVER_CORE)
-struct iommufd_object *_iommufd_object_alloc(struct iommufd_ctx *ictx,
-					     size_t size,
-					     enum iommufd_object_type type);
+int _iommufd_object_depend(struct iommufd_object *obj_dependent,
+			   struct iommufd_object *obj_depended);
+void _iommufd_object_undepend(struct iommufd_object *obj_dependent,
+			      struct iommufd_object *obj_depended);
+int _iommufd_alloc_mmap(struct iommufd_ctx *ictx, struct iommufd_object *owner,
+			phys_addr_t mmio_addr, size_t length,
+			unsigned long *offset);
+void _iommufd_destroy_mmap(struct iommufd_ctx *ictx,
+			   struct iommufd_object *owner, unsigned long offset);
+struct device *iommufd_vdevice_to_device(struct iommufd_vdevice *vdev);
 struct device *iommufd_viommu_find_dev(struct iommufd_viommu *viommu,
 				       unsigned long vdev_id);
 int iommufd_viommu_get_vdev_id(struct iommufd_viommu *viommu,
···
 				enum iommu_veventq_type type, void *event_data,
 				size_t data_len);
 #else /* !CONFIG_IOMMUFD_DRIVER_CORE */
-static inline struct iommufd_object *
-_iommufd_object_alloc(struct iommufd_ctx *ictx, size_t size,
-		      enum iommufd_object_type type)
+static inline int _iommufd_object_depend(struct iommufd_object *obj_dependent,
+					 struct iommufd_object *obj_depended)
 {
-	return ERR_PTR(-EOPNOTSUPP);
+	return -EOPNOTSUPP;
+}
+
+static inline void
+_iommufd_object_undepend(struct iommufd_object *obj_dependent,
+			 struct iommufd_object *obj_depended)
+{
+}
+
+static inline int _iommufd_alloc_mmap(struct iommufd_ctx *ictx,
+				      struct iommufd_object *owner,
+				      phys_addr_t mmio_addr, size_t length,
+				      unsigned long *offset)
+{
+	return -EOPNOTSUPP;
+}
+
+static inline void _iommufd_destroy_mmap(struct iommufd_ctx *ictx,
+					 struct iommufd_object *owner,
+					 unsigned long offset)
+{
+}
+
+static inline struct device *
+iommufd_vdevice_to_device(struct iommufd_vdevice *vdev)
+{
+	return NULL;
 }
 
 static inline struct device *
···
 }
 #endif /* CONFIG_IOMMUFD_DRIVER_CORE */
 
+#define VIOMMU_STRUCT_SIZE(drv_struct, member)	\
+	(sizeof(drv_struct) +	\
+	 BUILD_BUG_ON_ZERO(offsetof(drv_struct, member)) +	\
+	 BUILD_BUG_ON_ZERO(!__same_type(struct iommufd_viommu,	\
+					((drv_struct *)NULL)->member)))
+
+#define VDEVICE_STRUCT_SIZE(drv_struct, member)	\
+	(sizeof(drv_struct) +	\
+	 BUILD_BUG_ON_ZERO(offsetof(drv_struct, member)) +	\
+	 BUILD_BUG_ON_ZERO(!__same_type(struct iommufd_vdevice,	\
+					((drv_struct *)NULL)->member)))
+
+#define HW_QUEUE_STRUCT_SIZE(drv_struct, member)	\
+	(sizeof(drv_struct) +	\
+	 BUILD_BUG_ON_ZERO(offsetof(drv_struct, member)) +	\
+	 BUILD_BUG_ON_ZERO(!__same_type(struct iommufd_hw_queue,	\
+					((drv_struct *)NULL)->member)))
+
 /*
- * Helpers for IOMMU driver to allocate driver structures that will be freed by
- * the iommufd core. The free op will be called prior to freeing the memory.
+ * Helpers for IOMMU driver to build/destroy a dependency between two sibling
+ * structures created by one of the allocators above
  */
-#define iommufd_viommu_alloc(ictx, drv_struct, member, viommu_ops)	\
+#define iommufd_hw_queue_depend(dependent, depended, member)	\
 	({	\
-		drv_struct *ret;	\
+		int ret = -EINVAL;	\
 		\
-		static_assert(__same_type(struct iommufd_viommu,	\
-					  ((drv_struct *)NULL)->member));	\
-		static_assert(offsetof(drv_struct, member.obj) == 0);	\
-		ret = (drv_struct *)_iommufd_object_alloc(	\
-			ictx, sizeof(drv_struct), IOMMUFD_OBJ_VIOMMU);	\
-		if (!IS_ERR(ret))	\
-			ret->member.ops = viommu_ops;	\
+		static_assert(__same_type(struct iommufd_hw_queue,	\
+					  dependent->member));	\
+		static_assert(__same_type(typeof(*dependent), *depended));	\
+		if (!WARN_ON_ONCE(dependent->member.viommu !=	\
+				  depended->member.viommu))	\
+			ret = _iommufd_object_depend(&dependent->member.obj,	\
+						     &depended->member.obj);	\
 		ret;	\
 	})
+
+#define iommufd_hw_queue_undepend(dependent, depended, member)	\
+	({	\
+		static_assert(__same_type(struct iommufd_hw_queue,	\
+					  dependent->member));	\
+		static_assert(__same_type(typeof(*dependent), *depended));	\
+		WARN_ON_ONCE(dependent->member.viommu !=	\
+			     depended->member.viommu);	\
+		_iommufd_object_undepend(&dependent->member.obj,	\
+					 &depended->member.obj);	\
+	})
+
+/*
+ * Helpers for IOMMU driver to alloc/destroy an mmapable area for a structure.
+ *
+ * To support an mmappable MMIO region, kernel driver must first register it to
+ * iommufd core to allocate an @offset, during a driver-structure initialization
+ * (e.g. viommu_init op). Then, it should report to user space this @offset and
+ * the @length of the MMIO region for mmap syscall.
+ */
+static inline int iommufd_viommu_alloc_mmap(struct iommufd_viommu *viommu,
+					    phys_addr_t mmio_addr,
+					    size_t length,
+					    unsigned long *offset)
+{
+	return _iommufd_alloc_mmap(viommu->ictx, &viommu->obj, mmio_addr,
+				   length, offset);
+}
+
+static inline void iommufd_viommu_destroy_mmap(struct iommufd_viommu *viommu,
+					       unsigned long offset)
+{
+	_iommufd_destroy_mmap(viommu->ictx, &viommu->obj, offset);
+}
 #endif
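Reviewer note: the *_STRUCT_SIZE macros let the core allocate a driver structure by size alone, because they refuse to compile unless the core structure is the first member and has the expected type. A minimal user-space sketch of the same idea follows. It assumes C11 _Static_assert and the GCC/Clang extensions __typeof__ and __builtin_types_compatible_p in place of the kernel's BUILD_BUG_ON_ZERO() and __same_type(); all sketch_* names are hypothetical.

```c
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for a core object such as struct iommufd_viommu. */
struct sketch_core_viommu {
	unsigned int id;
};

/* Stand-in for a driver structure that embeds the core one first. */
struct sketch_drv_viommu {
	struct sketch_core_viommu core; /* must stay the first member */
	int drv_private;
};

/*
 * Evaluates to sizeof(drv_struct); compilation fails if @member is not at
 * offset 0 or is not a struct sketch_core_viommu.
 */
#define SKETCH_STRUCT_SIZE(drv_struct, member)				\
	(sizeof(drv_struct) +						\
	 0 * sizeof(struct {						\
		 _Static_assert(offsetof(drv_struct, member) == 0,	\
				"core member must come first");	\
		 _Static_assert(__builtin_types_compatible_p(		\
				struct sketch_core_viommu,		\
				__typeof__(((drv_struct *)0)->member)),	\
				"member must be the core type");	\
		 char unused;						\
	 }))

/* Core-side allocation: only a size is known, as with get_viommu_size. */
static struct sketch_core_viommu *sketch_core_alloc(size_t size)
{
	/* A size below the core structure would be a driver bug. */
	if (size < sizeof(struct sketch_core_viommu))
		return NULL;
	return calloc(1, size);
}
```

Because the core member sits at offset 0, the pointer returned by sketch_core_alloc() can be cast to the driver type, which is what the split get_viommu_size / viommu_init ops rely on.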
+151 -3
include/uapi/linux/iommufd.h
··· 56 56 IOMMUFD_CMD_VDEVICE_ALLOC = 0x91, 57 57 IOMMUFD_CMD_IOAS_CHANGE_PROCESS = 0x92, 58 58 IOMMUFD_CMD_VEVENTQ_ALLOC = 0x93, 59 + IOMMUFD_CMD_HW_QUEUE_ALLOC = 0x94, 59 60 }; 60 61 61 62 /** ··· 592 591 }; 593 592 594 593 /** 594 + * struct iommu_hw_info_tegra241_cmdqv - NVIDIA Tegra241 CMDQV Hardware 595 + * Information (IOMMU_HW_INFO_TYPE_TEGRA241_CMDQV) 596 + * 597 + * @flags: Must be 0 598 + * @version: Version number for the CMDQ-V HW for PARAM bits[03:00] 599 + * @log2vcmdqs: Log2 of the total number of VCMDQs for PARAM bits[07:04] 600 + * @log2vsids: Log2 of the total number of SID replacements for PARAM bits[15:12] 601 + * @__reserved: Must be 0 602 + * 603 + * VMM can use these fields directly in its emulated global PARAM register. Note 604 + * that only one Virtual Interface (VINTF) should be exposed to a VM, i.e. PARAM 605 + * bits[11:08] should be set to 0 for log2 of the total number of VINTFs. 606 + */ 607 + struct iommu_hw_info_tegra241_cmdqv { 608 + __u32 flags; 609 + __u8 version; 610 + __u8 log2vcmdqs; 611 + __u8 log2vsids; 612 + __u8 __reserved; 613 + }; 614 + 615 + /** 595 616 * enum iommu_hw_info_type - IOMMU Hardware Info Types 596 - * @IOMMU_HW_INFO_TYPE_NONE: Used by the drivers that do not report hardware 617 + * @IOMMU_HW_INFO_TYPE_NONE: Output by the drivers that do not report hardware 597 618 * info 619 + * @IOMMU_HW_INFO_TYPE_DEFAULT: Input to request for a default type 598 620 * @IOMMU_HW_INFO_TYPE_INTEL_VTD: Intel VT-d iommu info type 599 621 * @IOMMU_HW_INFO_TYPE_ARM_SMMUV3: ARM SMMUv3 iommu info type 622 + * @IOMMU_HW_INFO_TYPE_TEGRA241_CMDQV: NVIDIA Tegra241 CMDQV (extension for ARM 623 + * SMMUv3) info type 600 624 */ 601 625 enum iommu_hw_info_type { 602 626 IOMMU_HW_INFO_TYPE_NONE = 0, 627 + IOMMU_HW_INFO_TYPE_DEFAULT = 0, 603 628 IOMMU_HW_INFO_TYPE_INTEL_VTD = 1, 604 629 IOMMU_HW_INFO_TYPE_ARM_SMMUV3 = 2, 630 + IOMMU_HW_INFO_TYPE_TEGRA241_CMDQV = 3, 605 631 }; 606 632 607 633 /** ··· 654 626 }; 655 627 656 628 /** 629 + * 
enum iommufd_hw_info_flags - Flags for iommu_hw_info 630 + * @IOMMU_HW_INFO_FLAG_INPUT_TYPE: If set, @in_data_type carries an input type 631 + * for user space to request for a specific info 632 + */ 633 + enum iommufd_hw_info_flags { 634 + IOMMU_HW_INFO_FLAG_INPUT_TYPE = 1 << 0, 635 + }; 636 + 637 + /** 657 638 * struct iommu_hw_info - ioctl(IOMMU_GET_HW_INFO) 658 639 * @size: sizeof(struct iommu_hw_info) 659 640 * @flags: Must be 0 ··· 671 634 * data that kernel supports 672 635 * @data_uptr: User pointer to a user-space buffer used by the kernel to fill 673 636 * the iommu type specific hardware information data 637 + * @in_data_type: This shares the same field with @out_data_type, making it be 638 + * a bidirectional field. When IOMMU_HW_INFO_FLAG_INPUT_TYPE is 639 + * set, an input type carried via this @in_data_type field will 640 + * be valid, requesting for the info data to the given type. If 641 + * IOMMU_HW_INFO_FLAG_INPUT_TYPE is unset, any input value will 642 + * be seen as IOMMU_HW_INFO_TYPE_DEFAULT 674 643 * @out_data_type: Output the iommu hardware info type as defined in the enum 675 644 * iommu_hw_info_type. 
676 645 * @out_capabilities: Output the generic iommu capability info type as defined ··· 706 663 __u32 dev_id; 707 664 __u32 data_len; 708 665 __aligned_u64 data_uptr; 709 - __u32 out_data_type; 666 + union { 667 + __u32 in_data_type; 668 + __u32 out_data_type; 669 + }; 710 670 __u8 out_max_pasid_log2; 711 671 __u8 __reserved[3]; 712 672 __aligned_u64 out_capabilities; ··· 997 951 * enum iommu_viommu_type - Virtual IOMMU Type 998 952 * @IOMMU_VIOMMU_TYPE_DEFAULT: Reserved for future use 999 953 * @IOMMU_VIOMMU_TYPE_ARM_SMMUV3: ARM SMMUv3 driver specific type 954 + * @IOMMU_VIOMMU_TYPE_TEGRA241_CMDQV: NVIDIA Tegra241 CMDQV (extension for ARM 955 + * SMMUv3) enabled ARM SMMUv3 type 1000 956 */ 1001 957 enum iommu_viommu_type { 1002 958 IOMMU_VIOMMU_TYPE_DEFAULT = 0, 1003 959 IOMMU_VIOMMU_TYPE_ARM_SMMUV3 = 1, 960 + IOMMU_VIOMMU_TYPE_TEGRA241_CMDQV = 2, 961 + }; 962 + 963 + /** 964 + * struct iommu_viommu_tegra241_cmdqv - NVIDIA Tegra241 CMDQV Virtual Interface 965 + * (IOMMU_VIOMMU_TYPE_TEGRA241_CMDQV) 966 + * @out_vintf_mmap_offset: mmap offset argument for VINTF's page0 967 + * @out_vintf_mmap_length: mmap length argument for VINTF's page0 968 + * 969 + * Both @out_vintf_mmap_offset and @out_vintf_mmap_length are reported by kernel 970 + * for user space to mmap the VINTF page0 from the host physical address space 971 + * to the guest physical address space so that a guest kernel can directly R/W 972 + * access to the VINTF page0 in order to control its virtual command queues. 
973 + */ 974 + struct iommu_viommu_tegra241_cmdqv { 975 + __aligned_u64 out_vintf_mmap_offset; 976 + __aligned_u64 out_vintf_mmap_length; 1004 977 }; 1005 978 1006 979 /** ··· 1030 965 * @dev_id: The device's physical IOMMU will be used to back the virtual IOMMU 1031 966 * @hwpt_id: ID of a nesting parent HWPT to associate to 1032 967 * @out_viommu_id: Output virtual IOMMU ID for the allocated object 968 + * @data_len: Length of the type specific data 969 + * @__reserved: Must be 0 970 + * @data_uptr: User pointer to a driver-specific virtual IOMMU data 1033 971 * 1034 972 * Allocate a virtual IOMMU object, representing the underlying physical IOMMU's 1035 973 * virtualization support that is a security-isolated slice of the real IOMMU HW ··· 1053 985 __u32 dev_id; 1054 986 __u32 hwpt_id; 1055 987 __u32 out_viommu_id; 988 + __u32 data_len; 989 + __u32 __reserved; 990 + __aligned_u64 data_uptr; 1056 991 }; 1057 992 #define IOMMU_VIOMMU_ALLOC _IO(IOMMUFD_TYPE, IOMMUFD_CMD_VIOMMU_ALLOC) 1058 993 ··· 1066 995 * @dev_id: The physical device to allocate a virtual instance on the vIOMMU 1067 996 * @out_vdevice_id: Object handle for the vDevice. Pass to IOMMU_DESTORY 1068 997 * @virt_id: Virtual device ID per vIOMMU, e.g. vSID of ARM SMMUv3, vDeviceID 1069 - * of AMD IOMMU, and vRID of a nested Intel VT-d to a Context Table 998 + * of AMD IOMMU, and vRID of Intel VT-d 1070 999 * 1071 1000 * Allocate a virtual device instance (for a physical device) against a vIOMMU. 1072 1001 * This instance holds the device's information (related to its vIOMMU) in a VM. 1002 + * User should use IOMMU_DESTROY to destroy the virtual device before 1003 + * destroying the physical device (by closing vfio_cdev fd). Otherwise the 1004 + * virtual device would be forcibly destroyed on physical device destruction, 1005 + * its vdevice_id would be permanently leaked (unremovable & unreusable) until 1006 + * iommu fd closed. 
  */
 struct iommu_vdevice_alloc {
 	__u32 size;
···
  * enum iommu_veventq_type - Virtual Event Queue Type
  * @IOMMU_VEVENTQ_TYPE_DEFAULT: Reserved for future use
  * @IOMMU_VEVENTQ_TYPE_ARM_SMMUV3: ARM SMMUv3 Virtual Event Queue
+ * @IOMMU_VEVENTQ_TYPE_TEGRA241_CMDQV: NVIDIA Tegra241 CMDQV Extension IRQ
  */
 enum iommu_veventq_type {
 	IOMMU_VEVENTQ_TYPE_DEFAULT = 0,
 	IOMMU_VEVENTQ_TYPE_ARM_SMMUV3 = 1,
+	IOMMU_VEVENTQ_TYPE_TEGRA241_CMDQV = 2,
 };
 
 /**
···
  */
 struct iommu_vevent_arm_smmuv3 {
 	__aligned_le64 evt[4];
+};
+
+/**
+ * struct iommu_vevent_tegra241_cmdqv - Tegra241 CMDQV IRQ
+ *                                      (IOMMU_VEVENTQ_TYPE_TEGRA241_CMDQV)
+ * @lvcmdq_err_map: 128-bit logical vcmdq error map, little-endian.
+ *                  (Refer to register LVCMDQ_ERR_MAPs per VINTF )
+ *
+ * The 128-bit register value from HW exclusively reflect the error bits for a
+ * Virtual Interface represented by a vIOMMU object. Read and report directly.
+ */
+struct iommu_vevent_tegra241_cmdqv {
+	__aligned_le64 lvcmdq_err_map[2];
 };
 
 /**
···
 	__u32 __reserved;
 };
 #define IOMMU_VEVENTQ_ALLOC _IO(IOMMUFD_TYPE, IOMMUFD_CMD_VEVENTQ_ALLOC)
+
+/**
+ * enum iommu_hw_queue_type - HW Queue Type
+ * @IOMMU_HW_QUEUE_TYPE_DEFAULT: Reserved for future use
+ * @IOMMU_HW_QUEUE_TYPE_TEGRA241_CMDQV: NVIDIA Tegra241 CMDQV (extension for ARM
+ *                                      SMMUv3) Virtual Command Queue (VCMDQ)
+ */
+enum iommu_hw_queue_type {
+	IOMMU_HW_QUEUE_TYPE_DEFAULT = 0,
+	/*
+	 * TEGRA241_CMDQV requirements (otherwise, allocation will fail)
+	 * - alloc starts from the lowest @index=0 in ascending order
+	 * - destroy starts from the last allocated @index in descending order
+	 * - @base_addr must be aligned to @length in bytes and mapped in IOAS
+	 * - @length must be a power of 2, with a minimum 32 bytes and a maximum
+	 *   2 ^ idr[1].CMDQS * 16 bytes (use GET_HW_INFO call to read idr[1]
+	 *   from struct iommu_hw_info_arm_smmuv3)
+	 * - suggest to back the queue memory with contiguous physical pages or
+	 *   a single huge page with alignment of the queue size, and limit the
+	 *   emulated vSMMU's IDR1.CMDQS to log2(huge page size / 16 bytes)
+	 */
+	IOMMU_HW_QUEUE_TYPE_TEGRA241_CMDQV = 1,
+};
+
+/**
+ * struct iommu_hw_queue_alloc - ioctl(IOMMU_HW_QUEUE_ALLOC)
+ * @size: sizeof(struct iommu_hw_queue_alloc)
+ * @flags: Must be 0
+ * @viommu_id: Virtual IOMMU ID to associate the HW queue with
+ * @type: One of enum iommu_hw_queue_type
+ * @index: The logical index to the HW queue per virtual IOMMU for a multi-queue
+ *         model
+ * @out_hw_queue_id: The ID of the new HW queue
+ * @nesting_parent_iova: Base address of the queue memory in the guest physical
+ *                       address space
+ * @length: Length of the queue memory
+ *
+ * Allocate a HW queue object for a vIOMMU-specific HW-accelerated queue, which
+ * allows HW to access a guest queue memory described using @nesting_parent_iova
+ * and @length.
+ *
+ * A vIOMMU can allocate multiple queues, but it must use a different @index per
+ * type to separate each allocation, e.g::
+ *
+ *     Type1 HW queue0, Type1 HW queue1, Type2 HW queue0, ...
+ */
+struct iommu_hw_queue_alloc {
+	__u32 size;
+	__u32 flags;
+	__u32 viommu_id;
+	__u32 type;
+	__u32 index;
+	__u32 out_hw_queue_id;
+	__aligned_u64 nesting_parent_iova;
+	__aligned_u64 length;
+};
+#define IOMMU_HW_QUEUE_ALLOC _IO(IOMMUFD_TYPE, IOMMUFD_CMD_HW_QUEUE_ALLOC)
 #endif
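The TEGRA241_CMDQV comment in the header hunk above lists size and alignment rules for the queue memory. As a rough illustration only, a VMM could pre-check them before issuing the ioctl. The helper below is hypothetical (its name, the `cmdqs_log2` parameter, and the two constants are assumptions made for this sketch, not part of the uAPI); it only restates the documented rules: power-of-two length, at least 32 bytes, at most 2 ^ CMDQS * 16 bytes, and a base address aligned to the length.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical constants for this sketch; values come from the uAPI comment. */
#define VCMDQ_ENTRY_BYTES 16u /* "2 ^ idr[1].CMDQS * 16 bytes" */
#define VCMDQ_MIN_BYTES 32u   /* documented minimum length */

static bool is_pow2_u64(uint64_t v)
{
	return v && !(v & (v - 1));
}

/*
 * Hypothetical pre-check, not kernel code. cmdqs_log2 stands in for the
 * CMDQS field of idr[1]; extracting it from the register is left to the
 * caller.
 */
static bool vcmdq_range_ok(uint64_t base, uint64_t length,
			   unsigned int cmdqs_log2)
{
	uint64_t max_len;

	/* (1 << cmdqs_log2) * 16 must still fit in 64 bits */
	if (cmdqs_log2 > 59)
		return false;
	max_len = (UINT64_C(1) << cmdqs_log2) * VCMDQ_ENTRY_BYTES;

	if (!is_pow2_u64(length))
		return false;
	if (length < VCMDQ_MIN_BYTES || length > max_len)
		return false;
	/* base must be aligned to the (power-of-two) length */
	return (base & (length - 1)) == 0;
}
```

The check does not cover the remaining documented requirement that the range be mapped in the IOAS; only the kernel can verify that.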
+331 -190
tools/testing/selftests/iommu/iommufd.c
···
 		uint8_t max_pasid = 0;
 
 		/* Provide a zero-size user_buffer */
-		test_cmd_get_hw_info(self->device_id, NULL, 0);
+		test_cmd_get_hw_info(self->device_id,
+				     IOMMU_HW_INFO_TYPE_DEFAULT, NULL, 0);
 		/* Provide a user_buffer with exact size */
-		test_cmd_get_hw_info(self->device_id, &buffer_exact, sizeof(buffer_exact));
+		test_cmd_get_hw_info(self->device_id,
+				     IOMMU_HW_INFO_TYPE_DEFAULT, &buffer_exact,
+				     sizeof(buffer_exact));
+
+		/* Request for a wrong data_type, and a correct one */
+		test_err_get_hw_info(EOPNOTSUPP, self->device_id,
+				     IOMMU_HW_INFO_TYPE_SELFTEST + 1,
+				     &buffer_exact, sizeof(buffer_exact));
+		test_cmd_get_hw_info(self->device_id,
+				     IOMMU_HW_INFO_TYPE_SELFTEST, &buffer_exact,
+				     sizeof(buffer_exact));
 		/*
 		 * Provide a user_buffer with size larger than the exact size to check if
 		 * kernel zero the trailing bytes.
 		 */
-		test_cmd_get_hw_info(self->device_id, &buffer_larger, sizeof(buffer_larger));
+		test_cmd_get_hw_info(self->device_id,
+				     IOMMU_HW_INFO_TYPE_DEFAULT, &buffer_larger,
+				     sizeof(buffer_larger));
 		/*
 		 * Provide a user_buffer with size smaller than the exact size to check if
 		 * the fields within the size range still gets updated.
 		 */
-		test_cmd_get_hw_info(self->device_id, &buffer_smaller, sizeof(buffer_smaller));
+		test_cmd_get_hw_info(self->device_id,
+				     IOMMU_HW_INFO_TYPE_DEFAULT,
+				     &buffer_smaller, sizeof(buffer_smaller));
 		test_cmd_get_hw_info_pasid(self->device_id, &max_pasid);
 		ASSERT_EQ(0, max_pasid);
 		if (variant->pasid_capable) {
···
 		}
 	} else {
 		test_err_get_hw_info(ENOENT, self->device_id,
-				     &buffer_exact, sizeof(buffer_exact));
+				     IOMMU_HW_INFO_TYPE_DEFAULT, &buffer_exact,
+				     sizeof(buffer_exact));
 		test_err_get_hw_info(ENOENT, self->device_id,
-				     &buffer_larger, sizeof(buffer_larger));
+				     IOMMU_HW_INFO_TYPE_DEFAULT, &buffer_larger,
+				     sizeof(buffer_larger));
 	}
 }
 
···
 	}
 	for (i = 0; i != 10; i++)
 		test_ioctl_ioas_unmap(iovas[i], PAGE_SIZE * (i + 1));
+}
+
+/* https://lore.kernel.org/r/685af644.a00a0220.2e5631.0094.GAE@google.com */
+TEST_F(iommufd_ioas, reserved_overflow)
+{
+	struct iommu_test_cmd test_cmd = {
+		.size = sizeof(test_cmd),
+		.op = IOMMU_TEST_OP_ADD_RESERVED,
+		.id = self->ioas_id,
+		.add_reserved.start = 6,
+	};
+	unsigned int map_len;
+	__u64 iova;
+
+	if (PAGE_SIZE == 4096) {
+		test_cmd.add_reserved.length = 0xffffffffffff8001;
+		map_len = 0x5000;
+	} else {
+		test_cmd.add_reserved.length =
+			0xffffffffffffffff - MOCK_PAGE_SIZE * 16;
+		map_len = MOCK_PAGE_SIZE * 10;
+	}
+
+	ASSERT_EQ(0,
+		  ioctl(self->fd, _IOMMU_TEST_CMD(IOMMU_TEST_OP_ADD_RESERVED),
+			&test_cmd));
+	test_err_ioctl_ioas_map(ENOSPC, buffer, map_len, &iova);
 }
 
 TEST_F(iommufd_ioas, area_allowed)
···
 
 	test_cmd_hwpt_alloc(self->idev_id, self->ioas_id, 0, &hwpt_id);
 	test_cmd_mock_domain(hwpt_id, &stddev_id, NULL, NULL);
-	test_cmd_get_hw_capabilities(self->idev_id, caps,
-				     IOMMU_HW_CAP_DIRTY_TRACKING);
+	test_cmd_get_hw_capabilities(self->idev_id, caps);
 	ASSERT_EQ(IOMMU_HW_CAP_DIRTY_TRACKING,
 		  caps & IOMMU_HW_CAP_DIRTY_TRACKING);
 
···
 
 		/* Allocate a vIOMMU taking refcount of the parent hwpt */
 		test_cmd_viommu_alloc(self->device_id, self->hwpt_id,
-				      IOMMU_VIOMMU_TYPE_SELFTEST,
+				      IOMMU_VIOMMU_TYPE_SELFTEST, NULL, 0,
 				      &self->viommu_id);
 
 		/* Allocate a regular nested hwpt */
···
 	if (self->device_id) {
 		/* Negative test -- invalid hwpt (hwpt_id=0) */
 		test_err_viommu_alloc(ENOENT, device_id, 0,
-				      IOMMU_VIOMMU_TYPE_SELFTEST, NULL);
+				      IOMMU_VIOMMU_TYPE_SELFTEST, NULL, 0,
+				      NULL);
 
 		/* Negative test -- not a nesting parent hwpt */
 		test_cmd_hwpt_alloc(device_id, ioas_id, 0, &hwpt_id);
 		test_err_viommu_alloc(EINVAL, device_id, hwpt_id,
-				      IOMMU_VIOMMU_TYPE_SELFTEST, NULL);
+				      IOMMU_VIOMMU_TYPE_SELFTEST, NULL, 0,
+				      NULL);
 		test_ioctl_destroy(hwpt_id);
 
 		/* Negative test -- unsupported viommu type */
 		test_err_viommu_alloc(EOPNOTSUPP, device_id, self->hwpt_id,
-				      0xdead, NULL);
+				      0xdead, NULL, 0, NULL);
 		EXPECT_ERRNO(EBUSY,
 			     _test_ioctl_destroy(self->fd, self->hwpt_id));
 		EXPECT_ERRNO(EBUSY,
 			     _test_ioctl_destroy(self->fd, self->viommu_id));
 	} else {
 		test_err_viommu_alloc(ENOENT, self->device_id, self->hwpt_id,
-				      IOMMU_VIOMMU_TYPE_SELFTEST, NULL);
+				      IOMMU_VIOMMU_TYPE_SELFTEST, NULL, 0,
+				      NULL);
 	}
 }
 
···
 	uint32_t fault_fd;
 	uint32_t vdev_id;
 
-	if (self->device_id) {
-		test_ioctl_fault_alloc(&fault_id, &fault_fd);
-		test_err_hwpt_alloc_iopf(
-			ENOENT, dev_id, viommu_id, UINT32_MAX,
-			IOMMU_HWPT_FAULT_ID_VALID, &iopf_hwpt_id,
-			IOMMU_HWPT_DATA_SELFTEST, &data, sizeof(data));
-		test_err_hwpt_alloc_iopf(
-			EOPNOTSUPP, dev_id, viommu_id, fault_id,
-			IOMMU_HWPT_FAULT_ID_VALID | (1 << 31), &iopf_hwpt_id,
-			IOMMU_HWPT_DATA_SELFTEST, &data, sizeof(data));
-		test_cmd_hwpt_alloc_iopf(
-			dev_id, viommu_id, fault_id, IOMMU_HWPT_FAULT_ID_VALID,
-			&iopf_hwpt_id, IOMMU_HWPT_DATA_SELFTEST, &data,
-			sizeof(data));
+	if (!dev_id)
+		SKIP(return, "Skipping test for variant no_viommu");
 
-		/* Must allocate vdevice before attaching to a nested hwpt */
-		test_err_mock_domain_replace(ENOENT, self->stdev_id,
-					     iopf_hwpt_id);
-		test_cmd_vdevice_alloc(viommu_id, dev_id, 0x99, &vdev_id);
-		test_cmd_mock_domain_replace(self->stdev_id, iopf_hwpt_id);
-		EXPECT_ERRNO(EBUSY,
-			     _test_ioctl_destroy(self->fd, iopf_hwpt_id));
-		test_cmd_trigger_iopf(dev_id, fault_fd);
+	test_ioctl_fault_alloc(&fault_id, &fault_fd);
+	test_err_hwpt_alloc_iopf(ENOENT, dev_id, viommu_id, UINT32_MAX,
+				 IOMMU_HWPT_FAULT_ID_VALID, &iopf_hwpt_id,
+				 IOMMU_HWPT_DATA_SELFTEST, &data, sizeof(data));
+	test_err_hwpt_alloc_iopf(EOPNOTSUPP, dev_id, viommu_id, fault_id,
+				 IOMMU_HWPT_FAULT_ID_VALID | (1 << 31),
+				 &iopf_hwpt_id, IOMMU_HWPT_DATA_SELFTEST, &data,
+				 sizeof(data));
+	test_cmd_hwpt_alloc_iopf(dev_id, viommu_id, fault_id,
+				 IOMMU_HWPT_FAULT_ID_VALID, &iopf_hwpt_id,
+				 IOMMU_HWPT_DATA_SELFTEST, &data, sizeof(data));
 
-		test_cmd_mock_domain_replace(self->stdev_id, self->ioas_id);
-		test_ioctl_destroy(iopf_hwpt_id);
-		close(fault_fd);
-		test_ioctl_destroy(fault_id);
-	}
+	/* Must allocate vdevice before attaching to a nested hwpt */
+	test_err_mock_domain_replace(ENOENT, self->stdev_id, iopf_hwpt_id);
+	test_cmd_vdevice_alloc(viommu_id, dev_id, 0x99, &vdev_id);
+	test_cmd_mock_domain_replace(self->stdev_id, iopf_hwpt_id);
+	EXPECT_ERRNO(EBUSY, _test_ioctl_destroy(self->fd, iopf_hwpt_id));
+	test_cmd_trigger_iopf(dev_id, fault_fd);
+
+	test_cmd_mock_domain_replace(self->stdev_id, self->ioas_id);
+	test_ioctl_destroy(iopf_hwpt_id);
+	close(fault_fd);
+	test_ioctl_destroy(fault_id);
+}
+
+TEST_F(iommufd_viommu, viommu_alloc_with_data)
+{
+	struct iommu_viommu_selftest data = {
+		.in_data = 0xbeef,
+	};
+	uint32_t *test;
+
+	if (!self->device_id)
+		SKIP(return, "Skipping test for variant no_viommu");
+
+	test_cmd_viommu_alloc(self->device_id, self->hwpt_id,
+			      IOMMU_VIOMMU_TYPE_SELFTEST, &data, sizeof(data),
+			      &self->viommu_id);
+	ASSERT_EQ(data.out_data, data.in_data);
+
+	/* Negative mmap tests -- offset and length cannot be changed */
+	test_err_mmap(ENXIO, data.out_mmap_length,
+		      data.out_mmap_offset + PAGE_SIZE);
+	test_err_mmap(ENXIO, data.out_mmap_length,
+		      data.out_mmap_offset + PAGE_SIZE * 2);
+	test_err_mmap(ENXIO, data.out_mmap_length / 2, data.out_mmap_offset);
+	test_err_mmap(ENXIO, data.out_mmap_length * 2, data.out_mmap_offset);
+
+	/* Now do a correct mmap for a loopback test */
+	test = mmap(NULL, data.out_mmap_length, PROT_READ | PROT_WRITE,
+		    MAP_SHARED, self->fd, data.out_mmap_offset);
+	ASSERT_NE(MAP_FAILED, test);
+	ASSERT_EQ(data.in_data, *test);
+
+	/* The owner of the mmap region should be blocked */
+	EXPECT_ERRNO(EBUSY, _test_ioctl_destroy(self->fd, self->viommu_id));
+	munmap(test, data.out_mmap_length);
 }
 
 TEST_F(iommufd_viommu, vdevice_alloc)
···
 	uint32_t vdev_id = 0;
 	uint32_t num_inv;
 
-	if (dev_id) {
-		test_cmd_vdevice_alloc(viommu_id, dev_id, 0x99, &vdev_id);
+	if (!dev_id)
+		SKIP(return, "Skipping test for variant no_viommu");
 
-		test_cmd_dev_check_cache_all(dev_id,
-					     IOMMU_TEST_DEV_CACHE_DEFAULT);
+	test_cmd_vdevice_alloc(viommu_id, dev_id, 0x99, &vdev_id);
 
-		/* Check data_type by passing zero-length array */
-		num_inv = 0;
-		test_cmd_viommu_invalidate(viommu_id, inv_reqs,
-					   sizeof(*inv_reqs), &num_inv);
-		assert(!num_inv);
+	test_cmd_dev_check_cache_all(dev_id, IOMMU_TEST_DEV_CACHE_DEFAULT);
 
-		/* Negative test: Invalid data_type */
-		num_inv = 1;
-		test_err_viommu_invalidate(EINVAL, viommu_id, inv_reqs,
-					   IOMMU_VIOMMU_INVALIDATE_DATA_SELFTEST_INVALID,
-					   sizeof(*inv_reqs), &num_inv);
-		assert(!num_inv);
+	/* Check data_type by passing zero-length array */
+	num_inv = 0;
+	test_cmd_viommu_invalidate(viommu_id, inv_reqs, sizeof(*inv_reqs),
+				   &num_inv);
+	assert(!num_inv);
 
-		/* Negative test: structure size sanity */
-		num_inv = 1;
-		test_err_viommu_invalidate(EINVAL, viommu_id, inv_reqs,
-					   IOMMU_VIOMMU_INVALIDATE_DATA_SELFTEST,
-					   sizeof(*inv_reqs) + 1, &num_inv);
-		assert(!num_inv);
+	/* Negative test: Invalid data_type */
+	num_inv = 1;
+	test_err_viommu_invalidate(EINVAL, viommu_id, inv_reqs,
+				   IOMMU_VIOMMU_INVALIDATE_DATA_SELFTEST_INVALID,
+				   sizeof(*inv_reqs), &num_inv);
+	assert(!num_inv);
 
-		num_inv = 1;
-		test_err_viommu_invalidate(EINVAL, viommu_id, inv_reqs,
-					   IOMMU_VIOMMU_INVALIDATE_DATA_SELFTEST,
-					   1, &num_inv);
-		assert(!num_inv);
+	/* Negative test: structure size sanity */
+	num_inv = 1;
+	test_err_viommu_invalidate(EINVAL, viommu_id, inv_reqs,
+				   IOMMU_VIOMMU_INVALIDATE_DATA_SELFTEST,
+				   sizeof(*inv_reqs) + 1, &num_inv);
+	assert(!num_inv);
 
-		/* Negative test: invalid flag is passed */
-		num_inv = 1;
-		inv_reqs[0].flags = 0xffffffff;
-		inv_reqs[0].vdev_id = 0x99;
-		test_err_viommu_invalidate(EOPNOTSUPP, viommu_id, inv_reqs,
-					   IOMMU_VIOMMU_INVALIDATE_DATA_SELFTEST,
-					   sizeof(*inv_reqs), &num_inv);
-		assert(!num_inv);
+	num_inv = 1;
+	test_err_viommu_invalidate(EINVAL, viommu_id, inv_reqs,
+				   IOMMU_VIOMMU_INVALIDATE_DATA_SELFTEST, 1,
+				   &num_inv);
+	assert(!num_inv);
 
-		/* Negative test: invalid data_uptr when array is not empty */
-		num_inv = 1;
-		inv_reqs[0].flags = 0;
-		inv_reqs[0].vdev_id = 0x99;
-		test_err_viommu_invalidate(EINVAL, viommu_id, NULL,
-					   IOMMU_VIOMMU_INVALIDATE_DATA_SELFTEST,
-					   sizeof(*inv_reqs), &num_inv);
-		assert(!num_inv);
+	/* Negative test: invalid flag is passed */
+	num_inv = 1;
+	inv_reqs[0].flags = 0xffffffff;
+	inv_reqs[0].vdev_id = 0x99;
+	test_err_viommu_invalidate(EOPNOTSUPP, viommu_id, inv_reqs,
+				   IOMMU_VIOMMU_INVALIDATE_DATA_SELFTEST,
+				   sizeof(*inv_reqs), &num_inv);
+	assert(!num_inv);
 
-		/* Negative test: invalid entry_len when array is not empty */
-		num_inv = 1;
-		inv_reqs[0].flags = 0;
-		inv_reqs[0].vdev_id = 0x99;
-		test_err_viommu_invalidate(EINVAL, viommu_id, inv_reqs,
-					   IOMMU_VIOMMU_INVALIDATE_DATA_SELFTEST,
-					   0, &num_inv);
-		assert(!num_inv);
+	/* Negative test: invalid data_uptr when array is not empty */
+	num_inv = 1;
+	inv_reqs[0].flags = 0;
+	inv_reqs[0].vdev_id = 0x99;
+	test_err_viommu_invalidate(EINVAL, viommu_id, NULL,
+				   IOMMU_VIOMMU_INVALIDATE_DATA_SELFTEST,
+				   sizeof(*inv_reqs), &num_inv);
+	assert(!num_inv);
 
-		/* Negative test: invalid cache_id */
-		num_inv = 1;
-		inv_reqs[0].flags = 0;
-		inv_reqs[0].vdev_id = 0x99;
-		inv_reqs[0].cache_id = MOCK_DEV_CACHE_ID_MAX + 1;
-		test_err_viommu_invalidate(EINVAL, viommu_id, inv_reqs,
-					   IOMMU_VIOMMU_INVALIDATE_DATA_SELFTEST,
-					   sizeof(*inv_reqs), &num_inv);
-		assert(!num_inv);
+	/* Negative test: invalid entry_len when array is not empty */
+	num_inv = 1;
+	inv_reqs[0].flags = 0;
+	inv_reqs[0].vdev_id = 0x99;
+	test_err_viommu_invalidate(EINVAL, viommu_id, inv_reqs,
+				   IOMMU_VIOMMU_INVALIDATE_DATA_SELFTEST, 0,
+				   &num_inv);
+	assert(!num_inv);
 
-		/* Negative test: invalid vdev_id */
-		num_inv = 1;
-		inv_reqs[0].flags = 0;
-		inv_reqs[0].vdev_id = 0x9;
-		inv_reqs[0].cache_id = 0;
-		test_err_viommu_invalidate(EINVAL, viommu_id, inv_reqs,
-					   IOMMU_VIOMMU_INVALIDATE_DATA_SELFTEST,
-					   sizeof(*inv_reqs), &num_inv);
-		assert(!num_inv);
+	/* Negative test: invalid cache_id */
+	num_inv = 1;
+	inv_reqs[0].flags = 0;
+	inv_reqs[0].vdev_id = 0x99;
+	inv_reqs[0].cache_id = MOCK_DEV_CACHE_ID_MAX + 1;
+	test_err_viommu_invalidate(EINVAL, viommu_id, inv_reqs,
+				   IOMMU_VIOMMU_INVALIDATE_DATA_SELFTEST,
+				   sizeof(*inv_reqs), &num_inv);
+	assert(!num_inv);
 
-		/*
-		 * Invalidate the 1st cache entry but fail the 2nd request
-		 * due to invalid flags configuration in the 2nd request.
-		 */
-		num_inv = 2;
-		inv_reqs[0].flags = 0;
-		inv_reqs[0].vdev_id = 0x99;
-		inv_reqs[0].cache_id = 0;
-		inv_reqs[1].flags = 0xffffffff;
-		inv_reqs[1].vdev_id = 0x99;
-		inv_reqs[1].cache_id = 1;
-		test_err_viommu_invalidate(EOPNOTSUPP, viommu_id, inv_reqs,
-					   IOMMU_VIOMMU_INVALIDATE_DATA_SELFTEST,
-					   sizeof(*inv_reqs), &num_inv);
-		assert(num_inv == 1);
-		test_cmd_dev_check_cache(dev_id, 0, 0);
-		test_cmd_dev_check_cache(dev_id, 1,
-					 IOMMU_TEST_DEV_CACHE_DEFAULT);
-		test_cmd_dev_check_cache(dev_id, 2,
-					 IOMMU_TEST_DEV_CACHE_DEFAULT);
-		test_cmd_dev_check_cache(dev_id, 3,
-					 IOMMU_TEST_DEV_CACHE_DEFAULT);
+	/* Negative test: invalid vdev_id */
+	num_inv = 1;
+	inv_reqs[0].flags = 0;
+	inv_reqs[0].vdev_id = 0x9;
+	inv_reqs[0].cache_id = 0;
+	test_err_viommu_invalidate(EINVAL, viommu_id, inv_reqs,
+				   IOMMU_VIOMMU_INVALIDATE_DATA_SELFTEST,
+				   sizeof(*inv_reqs), &num_inv);
+	assert(!num_inv);
 
-		/*
-		 * Invalidate the 1st cache entry but fail the 2nd request
-		 * due to invalid cache_id configuration in the 2nd request.
-		 */
-		num_inv = 2;
-		inv_reqs[0].flags = 0;
-		inv_reqs[0].vdev_id = 0x99;
-		inv_reqs[0].cache_id = 0;
-		inv_reqs[1].flags = 0;
-		inv_reqs[1].vdev_id = 0x99;
-		inv_reqs[1].cache_id = MOCK_DEV_CACHE_ID_MAX + 1;
-		test_err_viommu_invalidate(EINVAL, viommu_id, inv_reqs,
-					   IOMMU_VIOMMU_INVALIDATE_DATA_SELFTEST,
-					   sizeof(*inv_reqs), &num_inv);
-		assert(num_inv == 1);
-		test_cmd_dev_check_cache(dev_id, 0, 0);
-		test_cmd_dev_check_cache(dev_id, 1,
-					 IOMMU_TEST_DEV_CACHE_DEFAULT);
-		test_cmd_dev_check_cache(dev_id, 2,
-					 IOMMU_TEST_DEV_CACHE_DEFAULT);
-		test_cmd_dev_check_cache(dev_id, 3,
-					 IOMMU_TEST_DEV_CACHE_DEFAULT);
+	/*
+	 * Invalidate the 1st cache entry but fail the 2nd request
+	 * due to invalid flags configuration in the 2nd request.
+	 */
+	num_inv = 2;
+	inv_reqs[0].flags = 0;
+	inv_reqs[0].vdev_id = 0x99;
+	inv_reqs[0].cache_id = 0;
+	inv_reqs[1].flags = 0xffffffff;
+	inv_reqs[1].vdev_id = 0x99;
+	inv_reqs[1].cache_id = 1;
+	test_err_viommu_invalidate(EOPNOTSUPP, viommu_id, inv_reqs,
+				   IOMMU_VIOMMU_INVALIDATE_DATA_SELFTEST,
+				   sizeof(*inv_reqs), &num_inv);
+	assert(num_inv == 1);
+	test_cmd_dev_check_cache(dev_id, 0, 0);
+	test_cmd_dev_check_cache(dev_id, 1, IOMMU_TEST_DEV_CACHE_DEFAULT);
+	test_cmd_dev_check_cache(dev_id, 2, IOMMU_TEST_DEV_CACHE_DEFAULT);
+	test_cmd_dev_check_cache(dev_id, 3, IOMMU_TEST_DEV_CACHE_DEFAULT);
 
-		/* Invalidate the 2nd cache entry and verify */
-		num_inv = 1;
-		inv_reqs[0].flags = 0;
-		inv_reqs[0].vdev_id = 0x99;
-		inv_reqs[0].cache_id = 1;
-		test_cmd_viommu_invalidate(viommu_id, inv_reqs,
-					   sizeof(*inv_reqs), &num_inv);
-		assert(num_inv == 1);
-		test_cmd_dev_check_cache(dev_id, 0, 0);
-		test_cmd_dev_check_cache(dev_id, 1, 0);
-		test_cmd_dev_check_cache(dev_id, 2,
-					 IOMMU_TEST_DEV_CACHE_DEFAULT);
-		test_cmd_dev_check_cache(dev_id, 3,
-					 IOMMU_TEST_DEV_CACHE_DEFAULT);
+	/*
+	 * Invalidate the 1st cache entry but fail the 2nd request
+	 * due to invalid cache_id configuration in the 2nd request.
+	 */
+	num_inv = 2;
+	inv_reqs[0].flags = 0;
+	inv_reqs[0].vdev_id = 0x99;
+	inv_reqs[0].cache_id = 0;
+	inv_reqs[1].flags = 0;
+	inv_reqs[1].vdev_id = 0x99;
+	inv_reqs[1].cache_id = MOCK_DEV_CACHE_ID_MAX + 1;
+	test_err_viommu_invalidate(EINVAL, viommu_id, inv_reqs,
+				   IOMMU_VIOMMU_INVALIDATE_DATA_SELFTEST,
+				   sizeof(*inv_reqs), &num_inv);
+	assert(num_inv == 1);
+	test_cmd_dev_check_cache(dev_id, 0, 0);
+	test_cmd_dev_check_cache(dev_id, 1, IOMMU_TEST_DEV_CACHE_DEFAULT);
+	test_cmd_dev_check_cache(dev_id, 2, IOMMU_TEST_DEV_CACHE_DEFAULT);
+	test_cmd_dev_check_cache(dev_id, 3, IOMMU_TEST_DEV_CACHE_DEFAULT);
 
-		/* Invalidate the 3rd and 4th cache entries and verify */
-		num_inv = 2;
-		inv_reqs[0].flags = 0;
-		inv_reqs[0].vdev_id = 0x99;
-		inv_reqs[0].cache_id = 2;
-		inv_reqs[1].flags = 0;
-		inv_reqs[1].vdev_id = 0x99;
-		inv_reqs[1].cache_id = 3;
-		test_cmd_viommu_invalidate(viommu_id, inv_reqs,
-					   sizeof(*inv_reqs), &num_inv);
-		assert(num_inv == 2);
-		test_cmd_dev_check_cache_all(dev_id, 0);
+	/* Invalidate the 2nd cache entry and verify */
+	num_inv = 1;
+	inv_reqs[0].flags = 0;
+	inv_reqs[0].vdev_id = 0x99;
+	inv_reqs[0].cache_id = 1;
+	test_cmd_viommu_invalidate(viommu_id, inv_reqs, sizeof(*inv_reqs),
+				   &num_inv);
+	assert(num_inv == 1);
+	test_cmd_dev_check_cache(dev_id, 0, 0);
+	test_cmd_dev_check_cache(dev_id, 1, 0);
+	test_cmd_dev_check_cache(dev_id, 2, IOMMU_TEST_DEV_CACHE_DEFAULT);
+	test_cmd_dev_check_cache(dev_id, 3, IOMMU_TEST_DEV_CACHE_DEFAULT);
 
-		/* Invalidate all cache entries for nested_dev_id[1] and verify */
-		num_inv = 1;
-		inv_reqs[0].vdev_id = 0x99;
-		inv_reqs[0].flags = IOMMU_TEST_INVALIDATE_FLAG_ALL;
-		test_cmd_viommu_invalidate(viommu_id, inv_reqs,
-					   sizeof(*inv_reqs), &num_inv);
-		assert(num_inv == 1);
-		test_cmd_dev_check_cache_all(dev_id, 0);
-		test_ioctl_destroy(vdev_id);
-	}
+	/* Invalidate the 3rd and 4th cache entries and verify */
+	num_inv = 2;
+	inv_reqs[0].flags = 0;
+	inv_reqs[0].vdev_id = 0x99;
+	inv_reqs[0].cache_id = 2;
+	inv_reqs[1].flags = 0;
+	inv_reqs[1].vdev_id = 0x99;
+	inv_reqs[1].cache_id = 3;
+	test_cmd_viommu_invalidate(viommu_id, inv_reqs, sizeof(*inv_reqs),
+				   &num_inv);
+	assert(num_inv == 2);
+	test_cmd_dev_check_cache_all(dev_id, 0);
+
+	/* Invalidate all cache entries for nested_dev_id[1] and verify */
+	num_inv = 1;
+	inv_reqs[0].vdev_id = 0x99;
+	inv_reqs[0].flags = IOMMU_TEST_INVALIDATE_FLAG_ALL;
+	test_cmd_viommu_invalidate(viommu_id, inv_reqs, sizeof(*inv_reqs),
+				   &num_inv);
+	assert(num_inv == 1);
+	test_cmd_dev_check_cache_all(dev_id, 0);
+	test_ioctl_destroy(vdev_id);
+}
+
+TEST_F(iommufd_viommu, hw_queue)
+{
+	__u64 iova = MOCK_APERTURE_START, iova2;
+	uint32_t viommu_id = self->viommu_id;
+	uint32_t hw_queue_id[2];
+
+	if (!viommu_id)
+		SKIP(return, "Skipping test for variant no_viommu");
+
+	/* Fail IOMMU_HW_QUEUE_TYPE_DEFAULT */
+	test_err_hw_queue_alloc(EOPNOTSUPP, viommu_id,
+				IOMMU_HW_QUEUE_TYPE_DEFAULT, 0, iova, PAGE_SIZE,
+				&hw_queue_id[0]);
+	/* Fail queue addr and length */
+	test_err_hw_queue_alloc(EINVAL, viommu_id, IOMMU_HW_QUEUE_TYPE_SELFTEST,
+				0, iova, 0, &hw_queue_id[0]);
+	test_err_hw_queue_alloc(EOVERFLOW, viommu_id,
+				IOMMU_HW_QUEUE_TYPE_SELFTEST, 0, ~(uint64_t)0,
+				PAGE_SIZE, &hw_queue_id[0]);
+	/* Fail missing iova */
+	test_err_hw_queue_alloc(ENOENT, viommu_id, IOMMU_HW_QUEUE_TYPE_SELFTEST,
+				0, iova, PAGE_SIZE, &hw_queue_id[0]);
+
+	/* Map iova */
+	test_ioctl_ioas_map(buffer, PAGE_SIZE, &iova);
+	test_ioctl_ioas_map(buffer + PAGE_SIZE, PAGE_SIZE, &iova2);
+
+	/* Fail index=1 and =MAX; must start from index=0 */
+	test_err_hw_queue_alloc(EIO, viommu_id, IOMMU_HW_QUEUE_TYPE_SELFTEST, 1,
+				iova, PAGE_SIZE, &hw_queue_id[0]);
+	test_err_hw_queue_alloc(EINVAL, viommu_id, IOMMU_HW_QUEUE_TYPE_SELFTEST,
+				IOMMU_TEST_HW_QUEUE_MAX, iova, PAGE_SIZE,
+				&hw_queue_id[0]);
+
+	/* Allocate index=0, declare ownership of the iova */
+	test_cmd_hw_queue_alloc(viommu_id, IOMMU_HW_QUEUE_TYPE_SELFTEST, 0,
+				iova, PAGE_SIZE, &hw_queue_id[0]);
+	/* Fail duplicated index */
+	test_err_hw_queue_alloc(EEXIST, viommu_id, IOMMU_HW_QUEUE_TYPE_SELFTEST,
+				0, iova, PAGE_SIZE, &hw_queue_id[0]);
+	/* Fail unmap, due to iova ownership */
+	test_err_ioctl_ioas_unmap(EBUSY, iova, PAGE_SIZE);
+	/* The 2nd page is not pinned, so it can be unmmap */
+	test_ioctl_ioas_unmap(iova2, PAGE_SIZE);
+
+	/* Allocate index=1, with an unaligned case */
+	test_cmd_hw_queue_alloc(viommu_id, IOMMU_HW_QUEUE_TYPE_SELFTEST, 1,
+				iova + PAGE_SIZE / 2, PAGE_SIZE / 2,
+				&hw_queue_id[1]);
+	/* Fail to destroy, due to dependency */
+	EXPECT_ERRNO(EBUSY, _test_ioctl_destroy(self->fd, hw_queue_id[0]));
+
+	/* Destroy in descending order */
+	test_ioctl_destroy(hw_queue_id[1]);
+	test_ioctl_destroy(hw_queue_id[0]);
+	/* Now it can unmap the first page */
+	test_ioctl_ioas_unmap(iova, PAGE_SIZE);
+}
+
+TEST_F(iommufd_viommu, vdevice_tombstone)
+{
+	uint32_t viommu_id = self->viommu_id;
+	uint32_t dev_id = self->device_id;
+	uint32_t vdev_id = 0;
+
+	if (!dev_id)
+		SKIP(return, "Skipping test for variant no_viommu");
+
+	test_cmd_vdevice_alloc(viommu_id, dev_id, 0x99, &vdev_id);
+	test_ioctl_destroy(self->stdev_id);
+	EXPECT_ERRNO(ENOENT, _test_ioctl_destroy(self->fd, vdev_id));
 }
 
 FIXTURE(iommufd_device_pasid)
···
 
 	/* Allocate a regular nested hwpt based on viommu */
 	test_cmd_viommu_alloc(self->device_id, parent_hwpt_id,
-			      IOMMU_VIOMMU_TYPE_SELFTEST,
-			      &viommu_id);
+			      IOMMU_VIOMMU_TYPE_SELFTEST, NULL, 0, &viommu_id);
 	test_cmd_hwpt_alloc_nested(self->device_id, viommu_id,
 				   IOMMU_HWPT_ALLOC_PASID,
 				   &nested_hwpt_id[2],
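The hw_queue selftest above exercises an ordering rule: indexes are allocated upward from 0 and destroyed downward from the last one. The sketch below models that rule in plain userspace C, purely as an illustration. It is not the kernel's or the mock driver's code. `struct queue_tracker`, `tracker_alloc()`, `tracker_destroy()` and `MAX_QUEUES` are hypothetical names. The EIO, EINVAL, EEXIST and EBUSY results mirror what the selftest expects, while ENOENT for destroying an index that was never allocated is an assumption of this sketch.

```c
#include <errno.h>

/* Illustrative bound only; plays the role of IOMMU_TEST_HW_QUEUE_MAX. */
#define MAX_QUEUES 2u

/* Hypothetical tracker: indexes [0, num_allocated) are currently live. */
struct queue_tracker {
	unsigned int num_allocated;
};

static int tracker_alloc(struct queue_tracker *t, unsigned int index)
{
	if (index >= MAX_QUEUES)
		return -EINVAL; /* out of range, as with index=MAX in the test */
	if (index < t->num_allocated)
		return -EEXIST; /* duplicated index */
	if (index > t->num_allocated)
		return -EIO; /* skipped a lower index, e.g. index=1 first */
	t->num_allocated++;
	return 0;
}

static int tracker_destroy(struct queue_tracker *t, unsigned int index)
{
	if (index >= t->num_allocated)
		return -ENOENT; /* assumption: nothing allocated at this index */
	if (index != t->num_allocated - 1)
		return -EBUSY; /* a higher index still depends on this one */
	t->num_allocated--;
	return 0;
}
```

A VMM-side model like this would reject out-of-order requests before they reach the ioctl; the kernel remains the authority on the actual error codes.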
+11 -4
tools/testing/selftests/iommu/iommufd_fail_nth.c
···
 	uint32_t idev_id;
 	uint32_t hwpt_id;
 	uint32_t viommu_id;
+	uint32_t hw_queue_id;
 	uint32_t vdev_id;
 	__u64 iova;
 
···
 				  &self->stdev_id, NULL, &idev_id))
 		return -1;
 
-	if (_test_cmd_get_hw_info(self->fd, idev_id, &info,
-				  sizeof(info), NULL, NULL))
+	if (_test_cmd_get_hw_info(self->fd, idev_id, IOMMU_HW_INFO_TYPE_DEFAULT,
+				  &info, sizeof(info), NULL, NULL))
 		return -1;
 
 	if (_test_cmd_hwpt_alloc(self->fd, idev_id, ioas_id, 0,
···
 				 IOMMU_HWPT_DATA_NONE, 0, 0))
 		return -1;
 
-	if (_test_cmd_viommu_alloc(self->fd, idev_id, hwpt_id,
-				   IOMMU_VIOMMU_TYPE_SELFTEST, 0, &viommu_id))
+	if (_test_cmd_viommu_alloc(self->fd, idev_id, hwpt_id, 0,
+				   IOMMU_VIOMMU_TYPE_SELFTEST, NULL, 0,
+				   &viommu_id))
 		return -1;
 
 	if (_test_cmd_vdevice_alloc(self->fd, viommu_id, idev_id, 0, &vdev_id))
+		return -1;
+
+	if (_test_cmd_hw_queue_alloc(self->fd, viommu_id,
+				     IOMMU_HW_QUEUE_TYPE_SELFTEST, 0, iova,
+				     PAGE_SIZE, &hw_queue_id))
 		return -1;
 
 	if (_test_ioctl_fault_alloc(self->fd, &fault_id, &fault_fd))
+68 -21
tools/testing/selftests/iommu/iommufd_utils.h
···
 #define offsetofend(TYPE, MEMBER) \
 	(offsetof(TYPE, MEMBER) + sizeof_field(TYPE, MEMBER))
 
+#define test_err_mmap(_errno, length, offset) \
+	EXPECT_ERRNO(_errno, (long)mmap(NULL, length, PROT_READ | PROT_WRITE, \
+					MAP_SHARED, self->fd, offset))
+
 static inline void *memfd_mmap(size_t length, int prot, int flags, int *mfd_p)
 {
 	int mfd_flags = (flags & MAP_HUGETLB) ? MFD_HUGETLB : 0;
···
 #endif
 
 /* @data can be NULL */
-static int _test_cmd_get_hw_info(int fd, __u32 device_id, void *data,
-				 size_t data_len, uint32_t *capabilities,
-				 uint8_t *max_pasid)
+static int _test_cmd_get_hw_info(int fd, __u32 device_id, __u32 data_type,
+				 void *data, size_t data_len,
+				 uint32_t *capabilities, uint8_t *max_pasid)
 {
 	struct iommu_test_hw_info *info = (struct iommu_test_hw_info *)data;
 	struct iommu_hw_info cmd = {
 		.size = sizeof(cmd),
 		.dev_id = device_id,
 		.data_len = data_len,
+		.in_data_type = data_type,
 		.data_uptr = (uint64_t)data,
 		.out_capabilities = 0,
 	};
 	int ret;
+
+	if (data_type != IOMMU_HW_INFO_TYPE_DEFAULT)
+		cmd.flags |= IOMMU_HW_INFO_FLAG_INPUT_TYPE;
 
 	ret = ioctl(fd, IOMMU_GET_HW_INFO, &cmd);
 	if (ret)
···
 	return 0;
 }
 
-#define test_cmd_get_hw_info(device_id, data, data_len) \
-	ASSERT_EQ(0, _test_cmd_get_hw_info(self->fd, device_id, data, \
-					   data_len, NULL, NULL))
+#define test_cmd_get_hw_info(device_id, data_type, data, data_len) \
+	ASSERT_EQ(0, _test_cmd_get_hw_info(self->fd, device_id, data_type, \
+					   data, data_len, NULL, NULL))
 
-#define test_err_get_hw_info(_errno, device_id, data, data_len) \
-	EXPECT_ERRNO(_errno, _test_cmd_get_hw_info(self->fd, device_id, data, \
-						   data_len, NULL, NULL))
+#define test_err_get_hw_info(_errno, device_id, data_type, data, data_len) \
+	EXPECT_ERRNO(_errno, \
+		     _test_cmd_get_hw_info(self->fd, device_id, data_type, \
+					   data, data_len, NULL, NULL))
 
-#define test_cmd_get_hw_capabilities(device_id, caps, mask) \
-	ASSERT_EQ(0, _test_cmd_get_hw_info(self->fd, device_id, NULL, \
+#define test_cmd_get_hw_capabilities(device_id, caps) \
+	ASSERT_EQ(0, _test_cmd_get_hw_info(self->fd, device_id, \
+					   IOMMU_HW_INFO_TYPE_DEFAULT, NULL, \
 					   0, &caps, NULL))
 
-#define test_cmd_get_hw_info_pasid(device_id, max_pasid) \
-	ASSERT_EQ(0, _test_cmd_get_hw_info(self->fd, device_id, NULL, \
+#define test_cmd_get_hw_info_pasid(device_id, max_pasid) \
+	ASSERT_EQ(0, _test_cmd_get_hw_info(self->fd, device_id, \
+					   IOMMU_HW_INFO_TYPE_DEFAULT, NULL, \
 					   0, NULL, max_pasid))
 
 static int _test_ioctl_fault_alloc(int fd, __u32 *fault_id, __u32 *fault_fd)
···
 						pasid, fault_fd))
 
 static int _test_cmd_viommu_alloc(int fd, __u32 device_id, __u32 hwpt_id,
-				  __u32 type, __u32 flags, __u32 *viommu_id)
+				  __u32 flags, __u32 type, void *data,
+				  __u32 data_len, __u32 *viommu_id)
 {
 	struct iommu_viommu_alloc cmd = {
 		.size = sizeof(cmd),
···
 		.type = type,
 		.dev_id = device_id,
 		.hwpt_id = hwpt_id,
+		.data_uptr = (uint64_t)data,
+		.data_len = data_len,
 	};
 	int ret;
 
···
 	return 0;
 }
 
-#define test_cmd_viommu_alloc(device_id, hwpt_id, type, viommu_id) \
-	ASSERT_EQ(0, _test_cmd_viommu_alloc(self->fd, device_id, hwpt_id, \
-					    type, 0, viommu_id))
-#define test_err_viommu_alloc(_errno, device_id, hwpt_id, type, viommu_id) \
-	EXPECT_ERRNO(_errno, \
-		     _test_cmd_viommu_alloc(self->fd, device_id, hwpt_id, \
-					    type, 0, viommu_id))
+#define test_cmd_viommu_alloc(device_id, hwpt_id, type, data, data_len, \
+			      viommu_id) \
+	ASSERT_EQ(0, _test_cmd_viommu_alloc(self->fd, device_id, hwpt_id, 0, \
+					    type, data, data_len, viommu_id))
+#define test_err_viommu_alloc(_errno, device_id, hwpt_id, type, data, \
+			      data_len, viommu_id) \
+	EXPECT_ERRNO(_errno, \
+		     _test_cmd_viommu_alloc(self->fd, device_id, hwpt_id, 0, \
+					    type, data, data_len, viommu_id))
 
 static int _test_cmd_vdevice_alloc(int fd, __u32 viommu_id, __u32 idev_id,
 				   __u64 virt_id, __u32 *vdev_id)
···
 	EXPECT_ERRNO(_errno, \
 		     _test_cmd_vdevice_alloc(self->fd, viommu_id, idev_id, \
 					     virt_id, vdev_id))
+
+static int _test_cmd_hw_queue_alloc(int fd, __u32 viommu_id, __u32 type,
+				    __u32 idx, __u64 base_addr, __u64 length,
+				    __u32 *hw_queue_id)
+{
+	struct iommu_hw_queue_alloc cmd = {
+		.size = sizeof(cmd),
+		.viommu_id = viommu_id,
+		.type = type,
+		.index = idx,
+		.nesting_parent_iova = base_addr,
+		.length = length,
+	};
+	int ret;
+
+	ret = ioctl(fd, IOMMU_HW_QUEUE_ALLOC, &cmd);
+	if (ret)
+		return ret;
+	if (hw_queue_id)
+		*hw_queue_id = cmd.out_hw_queue_id;
+	return 0;
+}
+
+#define test_cmd_hw_queue_alloc(viommu_id, type, idx, base_addr, len, out_qid) \
+	ASSERT_EQ(0, _test_cmd_hw_queue_alloc(self->fd, viommu_id, type, idx, \
+					      base_addr, len, out_qid))
+#define test_err_hw_queue_alloc(_errno, viommu_id, type, idx, base_addr, len, \
+				out_qid) \
+	EXPECT_ERRNO(_errno, \
+		     _test_cmd_hw_queue_alloc(self->fd, viommu_id, type, idx, \
+					      base_addr, len, out_qid))
 
 static int _test_cmd_veventq_alloc(int fd, __u32 viommu_id, __u32 type,
 				   __u32 *veventq_id, __u32 *veventq_fd)
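The `_test_cmd_hw_queue_alloc()` helper above zero-initializes the ioctl argument, sets `size`, and leaves `flags` and `out_hw_queue_id` at 0. The sketch below reproduces that preparation step without the kernel header, as an illustration only. `struct hw_queue_alloc_args` is a hypothetical local mirror of `struct iommu_hw_queue_alloc` (real code should include `<linux/iommufd.h>` instead), and `hw_queue_alloc_args_init()` is a made-up helper name. The static assertions encode the layout implied by the field list in the uAPI diff: six 32-bit fields followed by two 8-byte-aligned 64-bit fields.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical mirror for illustration; not the real uAPI definition. */
struct hw_queue_alloc_args {
	uint32_t size;
	uint32_t flags;
	uint32_t viommu_id;
	uint32_t type;
	uint32_t index;
	uint32_t out_hw_queue_id;
	uint64_t nesting_parent_iova __attribute__((aligned(8)));
	uint64_t length __attribute__((aligned(8)));
};

/* Layout implied by the field list: 6 * 4 bytes, then 2 * 8 bytes. */
_Static_assert(sizeof(struct hw_queue_alloc_args) == 40, "unexpected size");
_Static_assert(offsetof(struct hw_queue_alloc_args, nesting_parent_iova) == 24,
	       "unexpected offset");

/* Fill the argument the same way the selftest helper does. */
static void hw_queue_alloc_args_init(struct hw_queue_alloc_args *cmd,
				     uint32_t viommu_id, uint32_t type,
				     uint32_t index, uint64_t iova,
				     uint64_t length)
{
	memset(cmd, 0, sizeof(*cmd)); /* flags must be 0; output field cleared */
	cmd->size = sizeof(*cmd);
	cmd->viommu_id = viommu_id;
	cmd->type = type;
	cmd->index = index;
	cmd->nesting_parent_iova = iova;
	cmd->length = length;
}
```

After a successful `ioctl(fd, IOMMU_HW_QUEUE_ALLOC, &cmd)` on the real structure, the new object ID would be read back from `out_hw_queue_id`, as the helper in the diff does.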