Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git


Merge tag 'cxl-for-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl

Pull compute express link updates from Dan Williams:
"DOE support is promoted from drivers/cxl/ to drivers/pci/ with Bjorn's
blessing, and the CXL core continues to mature its media management
capabilities with support for listing and injecting media errors. Some
late fixes that missed v6.3-final are also included:

- Refactor the DOE infrastructure (Data Object Exchange
PCI-config-cycle mailbox) to be a facility of the PCI core rather
than the CXL core.

This is foundational for upcoming support for PCI
device-attestation and PCIe / CXL link encryption.

- Add support for retrieving and injecting poison for CXL memory
expanders.

This enabling uses trace-events to convey CXL media error records
to user tooling. It includes translation of device-local addresses
(DPA) to system physical addresses (SPA) and their corresponding
CXL region.

- Fixes for decoder enumeration that missed v6.3-final

- Miscellaneous fixups"

* tag 'cxl-for-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (38 commits)
cxl/test: Add mock test for set_timestamp
cxl/mbox: Update CMD_RC_TABLE
tools/testing/cxl: Require CONFIG_DEBUG_FS
tools/testing/cxl: Add a sysfs attr to test poison inject limits
tools/testing/cxl: Use injected poison for get poison list
tools/testing/cxl: Mock the Clear Poison mailbox command
tools/testing/cxl: Mock the Inject Poison mailbox command
cxl/mem: Add debugfs attributes for poison inject and clear
cxl/memdev: Trace inject and clear poison as cxl_poison events
cxl/memdev: Warn of poison inject or clear to a mapped region
cxl/memdev: Add support for the Clear Poison mailbox command
cxl/memdev: Add support for the Inject Poison mailbox command
tools/testing/cxl: Mock support for Get Poison List
cxl/trace: Add an HPA to cxl_poison trace events
cxl/region: Provide region info to the cxl_poison trace event
cxl/memdev: Add trigger_poison_list sysfs attribute
cxl/trace: Add TRACE support for CXL media-error records
cxl/mbox: Add GET_POISON_LIST mailbox command
cxl/mbox: Initialize the poison state
cxl/mbox: Restrict poison cmds to debugfs cxl_raw_allow_all
...

+1584 -314
-1
.clang-format
···
   - 'of_property_for_each_u32'
   - 'pci_bus_for_each_resource'
   - 'pci_dev_for_each_resource'
-  - 'pci_doe_for_each_off'
   - 'pcl_for_each_chunk'
   - 'pcl_for_each_segment'
   - 'pcm_for_each_format'
+35
Documentation/ABI/testing/debugfs-cxl
···
+What:		/sys/kernel/debug/cxl/memX/inject_poison
+Date:		April, 2023
+KernelVersion:	v6.4
+Contact:	linux-cxl@vger.kernel.org
+Description:
+		(WO) When a Device Physical Address (DPA) is written to this
+		attribute, the memdev driver sends an inject poison command to
+		the device for the specified address. The DPA must be 64-byte
+		aligned and the length of the injected poison is 64-bytes. If
+		successful, the device returns poison when the address is
+		accessed through the CXL.mem bus. Injecting poison adds the
+		address to the device's Poison List and the error source is set
+		to Injected. In addition, the device adds a poison creation
+		event to its internal Informational Event log, updates the
+		Event Status register, and if configured, interrupts the host.
+		It is not an error to inject poison into an address that
+		already has poison present and no error is returned. The
+		inject_poison attribute is only visible for devices supporting
+		the capability.
+
+
+What:		/sys/kernel/debug/memX/clear_poison
+Date:		April, 2023
+KernelVersion:	v6.4
+Contact:	linux-cxl@vger.kernel.org
+Description:
+		(WO) When a Device Physical Address (DPA) is written to this
+		attribute, the memdev driver sends a clear poison command to
+		the device for the specified address. Clearing poison removes
+		the address from the device's Poison List and writes 0 (zero)
+		for 64 bytes starting at address. It is not an error to clear
+		poison from an address that does not have poison set. If the
+		device cannot clear poison from the address, -ENXIO is returned.
+		The clear_poison attribute is only visible for devices
+		supporting the capability.
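As a rough illustration of how user tooling might prepare a write to the inject_poison attribute documented above, the following C sketch builds the debugfs path and the string to write. The helper name `cxl_format_poison_request`, the `0x`-prefixed hex formatting, and the client-side alignment pre-check are assumptions made for illustration and are not part of this merge. Actually writing the value would additionally require root, a mounted debugfs, and a device that advertises the capability.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper (not a kernel or library API): build the debugfs
 * path for memdev `id` and format `dpa` as the value to write.
 * Returns 0 on success, -1 if the DPA is not 64-byte aligned or if
 * either output buffer is too small.
 */
static int cxl_format_poison_request(unsigned int id, uint64_t dpa,
				     char *path, size_t path_len,
				     char *value, size_t value_len)
{
	int n;

	/* The ABI text requires a 64-byte aligned DPA */
	if (dpa % 64 != 0)
		return -1;

	n = snprintf(path, path_len,
		     "/sys/kernel/debug/cxl/mem%u/inject_poison", id);
	if (n < 0 || (size_t)n >= path_len)
		return -1;

	/* Assumption: the attribute accepts a 0x-prefixed hex string */
	n = snprintf(value, value_len, "0x%" PRIx64, dpa);
	if (n < 0 || (size_t)n >= value_len)
		return -1;

	return 0;
}
```

Rejecting a misaligned address before touching debugfs mirrors the check the driver performs itself, so a caller gets the failure without a round trip to the kernel.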
+14
Documentation/ABI/testing/sysfs-bus-cxl
···
 		1), and checks that the hardware accepts the commit request.
 		Reading this value indicates whether the region is committed or
 		not.
+
+
+What:		/sys/bus/cxl/devices/memX/trigger_poison_list
+Date:		April, 2023
+KernelVersion:	v6.4
+Contact:	linux-cxl@vger.kernel.org
+Description:
+		(WO) When a boolean 'true' is written to this attribute the
+		memdev driver retrieves the poison list from the device. The
+		list consists of addresses that are poisoned, or would result
+		in poison if accessed, and the source of the poison. This
+		attribute is only visible for devices supporting the
+		capability. The retrieved errors are logged as kernel
+		events when cxl_poison event tracing is enabled.
+11
drivers/cxl/core/core.h
···
 #define CXL_DAX_REGION_TYPE(x) (&cxl_dax_region_type)
 int cxl_region_init(void);
 void cxl_region_exit(void);
+int cxl_get_poison_by_endpoint(struct cxl_port *port);
 #else
+static inline int cxl_get_poison_by_endpoint(struct cxl_port *port)
+{
+	return 0;
+}
 static inline void cxl_decoder_kill_region(struct cxl_endpoint_decoder *cxled)
 {
 }
···
 int cxl_memdev_init(void);
 void cxl_memdev_exit(void);
 void cxl_mbox_init(void);
+
+enum cxl_poison_trace_type {
+	CXL_POISON_TRACE_LIST,
+	CXL_POISON_TRACE_INJECT,
+	CXL_POISON_TRACE_CLEAR,
+};
 
 #endif /* __CXL_CORE_H__ */
+40 -12
drivers/cxl/core/hdm.c
···
 // SPDX-License-Identifier: GPL-2.0-only
 /* Copyright(c) 2022 Intel Corporation. All rights reserved. */
-#include <linux/io-64-nonatomic-hi-lo.h>
 #include <linux/seq_file.h>
 #include <linux/device.h>
 #include <linux/delay.h>
···
 
 	cxl_probe_component_regs(&port->dev, crb, &map.component_map);
 	if (!map.component_map.hdm_decoder.valid) {
-		dev_err(&port->dev, "HDM decoder registers invalid\n");
-		return -ENXIO;
+		dev_dbg(&port->dev, "HDM decoder registers not implemented\n");
+		/* unique error code to indicate no HDM decoder capability */
+		return -ENODEV;
 	}
 
 	return cxl_map_component_regs(&port->dev, regs, &map,
···
 	 */
 	for (i = 0; i < cxlhdm->decoder_count; i++) {
 		ctrl = readl(hdm + CXL_HDM_DECODER0_CTRL_OFFSET(i));
+		dev_dbg(&info->port->dev,
+			"decoder%d.%d: committed: %ld base: %#x_%.8x size: %#x_%.8x\n",
+			info->port->id, i,
+			FIELD_GET(CXL_HDM_DECODER0_CTRL_COMMITTED, ctrl),
+			readl(hdm + CXL_HDM_DECODER0_BASE_HIGH_OFFSET(i)),
+			readl(hdm + CXL_HDM_DECODER0_BASE_LOW_OFFSET(i)),
+			readl(hdm + CXL_HDM_DECODER0_SIZE_HIGH_OFFSET(i)),
+			readl(hdm + CXL_HDM_DECODER0_SIZE_LOW_OFFSET(i)));
 		if (FIELD_GET(CXL_HDM_DECODER0_CTRL_COMMITTED, ctrl))
 			return false;
 	}
···
 
 	lockdep_assert_held_write(&cxl_dpa_rwsem);
 
-	if (!len)
-		goto success;
+	if (!len) {
+		dev_warn(dev, "decoder%d.%d: empty reservation attempted\n",
+			 port->id, cxled->cxld.id);
+		return -EINVAL;
+	}
 
 	if (cxled->dpa_res) {
 		dev_dbg(dev, "decoder%d.%d: existing allocation %pr assigned\n",
···
 		cxled->mode = CXL_DECODER_MIXED;
 	}
 
-success:
 	port->hdm_end++;
 	get_device(&cxled->cxld.dev);
 	return 0;
···
 			    int *target_map, void __iomem *hdm, int which,
 			    u64 *dpa_base, struct cxl_endpoint_dvsec_info *info)
 {
+	u64 size, base, skip, dpa_size, lo, hi;
 	struct cxl_endpoint_decoder *cxled;
-	u64 size, base, skip, dpa_size;
 	bool committed;
 	u32 remainder;
 	int i, rc;
···
 					      which, info);
 
 	ctrl = readl(hdm + CXL_HDM_DECODER0_CTRL_OFFSET(which));
-	base = ioread64_hi_lo(hdm + CXL_HDM_DECODER0_BASE_LOW_OFFSET(which));
-	size = ioread64_hi_lo(hdm + CXL_HDM_DECODER0_SIZE_LOW_OFFSET(which));
+	lo = readl(hdm + CXL_HDM_DECODER0_BASE_LOW_OFFSET(which));
+	hi = readl(hdm + CXL_HDM_DECODER0_BASE_HIGH_OFFSET(which));
+	base = (hi << 32) + lo;
+	lo = readl(hdm + CXL_HDM_DECODER0_SIZE_LOW_OFFSET(which));
+	hi = readl(hdm + CXL_HDM_DECODER0_SIZE_HIGH_OFFSET(which));
+	size = (hi << 32) + lo;
 	committed = !!(ctrl & CXL_HDM_DECODER0_CTRL_COMMITTED);
 	cxld->commit = cxl_decoder_commit;
 	cxld->reset = cxl_decoder_reset;
···
 				 port->id, cxld->id);
 			return -ENXIO;
 		}
+
+		if (size == 0) {
+			dev_warn(&port->dev,
+				 "decoder%d.%d: Committed with zero size\n",
+				 port->id, cxld->id);
+			return -ENXIO;
+		}
 		port->commit_end = cxld->id;
 	} else {
 		/* unless / until type-2 drivers arrive, assume type-3 */
···
 	if (rc)
 		return rc;
 
+	dev_dbg(&port->dev, "decoder%d.%d: range: %#llx-%#llx iw: %d ig: %d\n",
+		port->id, cxld->id, cxld->hpa_range.start, cxld->hpa_range.end,
+		cxld->interleave_ways, cxld->interleave_granularity);
+
 	if (!info) {
-		target_list.value =
-			ioread64_hi_lo(hdm + CXL_HDM_DECODER0_TL_LOW(which));
+		lo = readl(hdm + CXL_HDM_DECODER0_TL_LOW(which));
+		hi = readl(hdm + CXL_HDM_DECODER0_TL_HIGH(which));
+		target_list.value = (hi << 32) + lo;
 		for (i = 0; i < cxld->interleave_ways; i++)
 			target_map[i] = target_list.target_id[i];
 
···
 			port->id, cxld->id, size, cxld->interleave_ways);
 		return -ENXIO;
 	}
-	skip = ioread64_hi_lo(hdm + CXL_HDM_DECODER0_SKIP_LOW(which));
+	lo = readl(hdm + CXL_HDM_DECODER0_SKIP_LOW(which));
+	hi = readl(hdm + CXL_HDM_DECODER0_SKIP_HIGH(which));
+	skip = (hi << 32) + lo;
 	cxled = to_cxl_endpoint_decoder(&cxld->dev);
 	rc = devm_cxl_dpa_reserve(cxled, *dpa_base + skip, dpa_size, skip);
 	if (rc) {
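Several hunks in this file replace a single ioread64_hi_lo() call with two 32-bit readl() calls joined as `(hi << 32) + lo`, where `hi` and `lo` are declared `u64`. The standalone sketch below shows the same composition in portable C. The function name `combine_hi_lo` is hypothetical and the register reads are replaced by plain arguments.

```c
#include <stdint.h>

/*
 * Combine two 32-bit register halves into one 64-bit value.
 * `hi` is widened to 64 bits before the shift: shifting a 32-bit
 * operand by 32 is undefined behavior in C, which is why the driver
 * keeps `hi` and `lo` in 64-bit variables.
 */
static uint64_t combine_hi_lo(uint32_t hi, uint32_t lo)
{
	return ((uint64_t)hi << 32) + lo;
}
```

Because the low half never exceeds 32 bits, addition and bitwise OR give the same result here.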
+143 -8
drivers/cxl/core/mbox.c
···
 // SPDX-License-Identifier: GPL-2.0-only
 /* Copyright(c) 2020 Intel Corporation. All rights reserved. */
-#include <linux/io-64-nonatomic-lo-hi.h>
 #include <linux/security.h>
 #include <linux/debugfs.h>
 #include <linux/ktime.h>
 #include <linux/mutex.h>
+#include <asm/unaligned.h>
+#include <cxlpci.h>
 #include <cxlmem.h>
 #include <cxl.h>
 
···
 	CXL_CMD(SET_ALERT_CONFIG, 0xc, 0, 0),
 	CXL_CMD(GET_SHUTDOWN_STATE, 0, 0x1, 0),
 	CXL_CMD(SET_SHUTDOWN_STATE, 0x1, 0, 0),
-	CXL_CMD(GET_POISON, 0x10, CXL_VARIABLE_PAYLOAD, 0),
-	CXL_CMD(INJECT_POISON, 0x8, 0, 0),
-	CXL_CMD(CLEAR_POISON, 0x48, 0, 0),
 	CXL_CMD(GET_SCAN_MEDIA_CAPS, 0x10, 0x4, 0),
-	CXL_CMD(SCAN_MEDIA, 0x11, 0, 0),
-	CXL_CMD(GET_SCAN_MEDIA, 0, CXL_VARIABLE_PAYLOAD, 0),
 };
 
 /*
···
  *
  * CXL_MBOX_OP_[GET_]SCAN_MEDIA: The kernel provides a native error list that
  * is kept up to date with patrol notifications and error management.
+ *
+ * CXL_MBOX_OP_[GET_,INJECT_,CLEAR_]POISON: These commands require kernel
+ * driver orchestration for safety.
  */
 static u16 cxl_disabled_raw_commands[] = {
 	CXL_MBOX_OP_ACTIVATE_FW,
···
 	CXL_MBOX_OP_SET_SHUTDOWN_STATE,
 	CXL_MBOX_OP_SCAN_MEDIA,
 	CXL_MBOX_OP_GET_SCAN_MEDIA,
+	CXL_MBOX_OP_GET_POISON,
+	CXL_MBOX_OP_INJECT_POISON,
+	CXL_MBOX_OP_CLEAR_POISON,
 };
 
 /*
···
 		if (security_command_sets[i] == (opcode >> 8))
 			return true;
 	return false;
+}
+
+static bool cxl_is_poison_command(u16 opcode)
+{
+#define CXL_MBOX_OP_POISON_CMDS 0x43
+
+	if ((opcode >> 8) == CXL_MBOX_OP_POISON_CMDS)
+		return true;
+
+	return false;
+}
+
+static void cxl_set_poison_cmd_enabled(struct cxl_poison_state *poison,
+				       u16 opcode)
+{
+	switch (opcode) {
+	case CXL_MBOX_OP_GET_POISON:
+		set_bit(CXL_POISON_ENABLED_LIST, poison->enabled_cmds);
+		break;
+	case CXL_MBOX_OP_INJECT_POISON:
+		set_bit(CXL_POISON_ENABLED_INJECT, poison->enabled_cmds);
+		break;
+	case CXL_MBOX_OP_CLEAR_POISON:
+		set_bit(CXL_POISON_ENABLED_CLEAR, poison->enabled_cmds);
+		break;
+	case CXL_MBOX_OP_GET_SCAN_MEDIA_CAPS:
+		set_bit(CXL_POISON_ENABLED_SCAN_CAPS, poison->enabled_cmds);
+		break;
+	case CXL_MBOX_OP_SCAN_MEDIA:
+		set_bit(CXL_POISON_ENABLED_SCAN_MEDIA, poison->enabled_cmds);
+		break;
+	case CXL_MBOX_OP_GET_SCAN_MEDIA:
+		set_bit(CXL_POISON_ENABLED_SCAN_RESULTS, poison->enabled_cmds);
+		break;
+	default:
+		break;
+	}
 }
 
 static struct cxl_mem_command *cxl_mem_find_command(u16 opcode)
···
 		u16 opcode = le16_to_cpu(cel_entry[i].opcode);
 		struct cxl_mem_command *cmd = cxl_mem_find_command(opcode);
 
-		if (!cmd) {
+		if (!cmd && !cxl_is_poison_command(opcode)) {
 			dev_dbg(cxlds->dev,
 				"Opcode 0x%04x unsupported by driver\n", opcode);
 			continue;
 		}
 
-		set_bit(cmd->info.id, cxlds->enabled_cmds);
+		if (cmd)
+			set_bit(cmd->info.id, cxlds->enabled_cmds);
+
+		if (cxl_is_poison_command(opcode))
+			cxl_set_poison_cmd_enabled(&cxlds->poison, opcode);
+
 		dev_dbg(cxlds->dev, "Opcode 0x%04x enabled\n", opcode);
 	}
 }
···
 	/* See CXL 2.0 Table 175 Identify Memory Device Output Payload */
 	struct cxl_mbox_identify id;
 	struct cxl_mbox_cmd mbox_cmd;
+	u32 val;
 	int rc;
 
 	mbox_cmd = (struct cxl_mbox_cmd) {
···
 
 	cxlds->lsa_size = le32_to_cpu(id.lsa_size);
 	memcpy(cxlds->firmware_version, id.fw_revision, sizeof(id.fw_revision));
+
+	if (test_bit(CXL_POISON_ENABLED_LIST, cxlds->poison.enabled_cmds)) {
+		val = get_unaligned_le24(id.poison_list_max_mer);
+		cxlds->poison.max_errors = min_t(u32, val, CXL_POISON_LIST_MAX);
+	}
 
 	return 0;
 }
···
 	return 0;
 }
 EXPORT_SYMBOL_NS_GPL(cxl_set_timestamp, CXL);
+
+int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
+		       struct cxl_region *cxlr)
+{
+	struct cxl_dev_state *cxlds = cxlmd->cxlds;
+	struct cxl_mbox_poison_out *po;
+	struct cxl_mbox_poison_in pi;
+	struct cxl_mbox_cmd mbox_cmd;
+	int nr_records = 0;
+	int rc;
+
+	rc = mutex_lock_interruptible(&cxlds->poison.lock);
+	if (rc)
+		return rc;
+
+	po = cxlds->poison.list_out;
+	pi.offset = cpu_to_le64(offset);
+	pi.length = cpu_to_le64(len / CXL_POISON_LEN_MULT);
+
+	mbox_cmd = (struct cxl_mbox_cmd) {
+		.opcode = CXL_MBOX_OP_GET_POISON,
+		.size_in = sizeof(pi),
+		.payload_in = &pi,
+		.size_out = cxlds->payload_size,
+		.payload_out = po,
+		.min_out = struct_size(po, record, 0),
+	};
+
+	do {
+		rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
+		if (rc)
+			break;
+
+		for (int i = 0; i < le16_to_cpu(po->count); i++)
+			trace_cxl_poison(cxlmd, cxlr, &po->record[i],
+					 po->flags, po->overflow_ts,
+					 CXL_POISON_TRACE_LIST);
+
+		/* Protect against an uncleared _FLAG_MORE */
+		nr_records = nr_records + le16_to_cpu(po->count);
+		if (nr_records >= cxlds->poison.max_errors) {
+			dev_dbg(&cxlmd->dev, "Max Error Records reached: %d\n",
+				nr_records);
+			break;
+		}
+	} while (po->flags & CXL_POISON_FLAG_MORE);
+
+	mutex_unlock(&cxlds->poison.lock);
+	return rc;
+}
+EXPORT_SYMBOL_NS_GPL(cxl_mem_get_poison, CXL);
+
+static void free_poison_buf(void *buf)
+{
+	kvfree(buf);
+}
+
+/* Get Poison List output buffer is protected by cxlds->poison.lock */
+static int cxl_poison_alloc_buf(struct cxl_dev_state *cxlds)
+{
+	cxlds->poison.list_out = kvmalloc(cxlds->payload_size, GFP_KERNEL);
+	if (!cxlds->poison.list_out)
+		return -ENOMEM;
+
+	return devm_add_action_or_reset(cxlds->dev, free_poison_buf,
+					cxlds->poison.list_out);
+}
+
+int cxl_poison_state_init(struct cxl_dev_state *cxlds)
+{
+	int rc;
+
+	if (!test_bit(CXL_POISON_ENABLED_LIST, cxlds->poison.enabled_cmds))
+		return 0;
+
+	rc = cxl_poison_alloc_buf(cxlds);
+	if (rc) {
+		clear_bit(CXL_POISON_ENABLED_LIST, cxlds->poison.enabled_cmds);
+		return rc;
+	}
+
+	mutex_init(&cxlds->poison.lock);
+	return 0;
+}
+EXPORT_SYMBOL_NS_GPL(cxl_poison_state_init, CXL);
 
 struct cxl_dev_state *cxl_dev_state_create(struct device *dev)
 {
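The do/while loop in cxl_mem_get_poison() keeps issuing Get Poison List while the device sets the "more" flag, and bails out once the running record count reaches `max_errors`, so a device that never clears the flag cannot stall the loop. The userspace sketch below models only that guard. The `mock_device` and `mock_poison_out` types and both function names are hypothetical stand-ins for the mailbox, not kernel interfaces.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for one Get Poison List response */
struct mock_poison_out {
	uint16_t count;	/* records returned by this call */
	bool more;	/* device claims more records follow */
};

/* Hypothetical device model used only for this illustration */
struct mock_device {
	uint16_t per_call;	/* max records returned per call */
	uint32_t remaining;	/* records left (ignored when stuck) */
	bool stuck_more;	/* misbehaving: never clears "more" */
};

static struct mock_poison_out mock_get_poison(struct mock_device *dev)
{
	struct mock_poison_out out;

	if (dev->stuck_more) {
		out.count = dev->per_call;
		out.more = true;
		return out;
	}

	out.count = dev->remaining < dev->per_call ?
		    (uint16_t)dev->remaining : dev->per_call;
	dev->remaining -= out.count;
	out.more = dev->remaining > 0;
	return out;
}

/* Returns the number of records consumed before the loop ended */
static uint32_t drain_poison_list(struct mock_device *dev, uint32_t max_errors)
{
	struct mock_poison_out out;
	uint32_t nr_records = 0;

	do {
		out = mock_get_poison(dev);
		nr_records += out.count;
		/* Protect against a "more" flag that is never cleared */
		if (nr_records >= max_errors)
			break;
	} while (out.more);

	return nr_records;
}
```

As in the driver, the bound is checked after the records of the current response have been consumed, so the final count may exceed `max_errors` by less than one response.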
+227
drivers/cxl/core/memdev.c
···
 #include <linux/idr.h>
 #include <linux/pci.h>
 #include <cxlmem.h>
+#include "trace.h"
 #include "core.h"
 
 static DECLARE_RWSEM(cxl_memdev_rwsem);
···
 	return sprintf(buf, "%d\n", dev_to_node(dev));
 }
 static DEVICE_ATTR_RO(numa_node);
+
+static int cxl_get_poison_by_memdev(struct cxl_memdev *cxlmd)
+{
+	struct cxl_dev_state *cxlds = cxlmd->cxlds;
+	u64 offset, length;
+	int rc = 0;
+
+	/* CXL 3.0 Spec 8.2.9.8.4.1 Separate pmem and ram poison requests */
+	if (resource_size(&cxlds->pmem_res)) {
+		offset = cxlds->pmem_res.start;
+		length = resource_size(&cxlds->pmem_res);
+		rc = cxl_mem_get_poison(cxlmd, offset, length, NULL);
+		if (rc)
+			return rc;
+	}
+	if (resource_size(&cxlds->ram_res)) {
+		offset = cxlds->ram_res.start;
+		length = resource_size(&cxlds->ram_res);
+		rc = cxl_mem_get_poison(cxlmd, offset, length, NULL);
+		/*
+		 * Invalid Physical Address is not an error for
+		 * volatile addresses. Device support is optional.
+		 */
+		if (rc == -EFAULT)
+			rc = 0;
+	}
+	return rc;
+}
+
+int cxl_trigger_poison_list(struct cxl_memdev *cxlmd)
+{
+	struct cxl_port *port;
+	int rc;
+
+	port = dev_get_drvdata(&cxlmd->dev);
+	if (!port || !is_cxl_endpoint(port))
+		return -EINVAL;
+
+	rc = down_read_interruptible(&cxl_dpa_rwsem);
+	if (rc)
+		return rc;
+
+	if (port->commit_end == -1) {
+		/* No regions mapped to this memdev */
+		rc = cxl_get_poison_by_memdev(cxlmd);
+	} else {
+		/* Regions mapped, collect poison by endpoint */
+		rc = cxl_get_poison_by_endpoint(port);
+	}
+	up_read(&cxl_dpa_rwsem);
+
+	return rc;
+}
+EXPORT_SYMBOL_NS_GPL(cxl_trigger_poison_list, CXL);
+
+struct cxl_dpa_to_region_context {
+	struct cxl_region *cxlr;
+	u64 dpa;
+};
+
+static int __cxl_dpa_to_region(struct device *dev, void *arg)
+{
+	struct cxl_dpa_to_region_context *ctx = arg;
+	struct cxl_endpoint_decoder *cxled;
+	u64 dpa = ctx->dpa;
+
+	if (!is_endpoint_decoder(dev))
+		return 0;
+
+	cxled = to_cxl_endpoint_decoder(dev);
+	if (!cxled->dpa_res || !resource_size(cxled->dpa_res))
+		return 0;
+
+	if (dpa > cxled->dpa_res->end || dpa < cxled->dpa_res->start)
+		return 0;
+
+	dev_dbg(dev, "dpa:0x%llx mapped in region:%s\n", dpa,
+		dev_name(&cxled->cxld.region->dev));
+
+	ctx->cxlr = cxled->cxld.region;
+
+	return 1;
+}
+
+static struct cxl_region *cxl_dpa_to_region(struct cxl_memdev *cxlmd, u64 dpa)
+{
+	struct cxl_dpa_to_region_context ctx;
+	struct cxl_port *port;
+
+	ctx = (struct cxl_dpa_to_region_context) {
+		.dpa = dpa,
+	};
+	port = dev_get_drvdata(&cxlmd->dev);
+	if (port && is_cxl_endpoint(port) && port->commit_end != -1)
+		device_for_each_child(&port->dev, &ctx, __cxl_dpa_to_region);
+
+	return ctx.cxlr;
+}
+
+static int cxl_validate_poison_dpa(struct cxl_memdev *cxlmd, u64 dpa)
+{
+	struct cxl_dev_state *cxlds = cxlmd->cxlds;
+
+	if (!IS_ENABLED(CONFIG_DEBUG_FS))
+		return 0;
+
+	if (!resource_size(&cxlds->dpa_res)) {
+		dev_dbg(cxlds->dev, "device has no dpa resource\n");
+		return -EINVAL;
+	}
+	if (dpa < cxlds->dpa_res.start || dpa > cxlds->dpa_res.end) {
+		dev_dbg(cxlds->dev, "dpa:0x%llx not in resource:%pR\n",
+			dpa, &cxlds->dpa_res);
+		return -EINVAL;
+	}
+	if (!IS_ALIGNED(dpa, 64)) {
+		dev_dbg(cxlds->dev, "dpa:0x%llx is not 64-byte aligned\n", dpa);
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+int cxl_inject_poison(struct cxl_memdev *cxlmd, u64 dpa)
+{
+	struct cxl_dev_state *cxlds = cxlmd->cxlds;
+	struct cxl_mbox_inject_poison inject;
+	struct cxl_poison_record record;
+	struct cxl_mbox_cmd mbox_cmd;
+	struct cxl_region *cxlr;
+	int rc;
+
+	if (!IS_ENABLED(CONFIG_DEBUG_FS))
+		return 0;
+
+	rc = down_read_interruptible(&cxl_dpa_rwsem);
+	if (rc)
+		return rc;
+
+	rc = cxl_validate_poison_dpa(cxlmd, dpa);
+	if (rc)
+		goto out;
+
+	inject.address = cpu_to_le64(dpa);
+	mbox_cmd = (struct cxl_mbox_cmd) {
+		.opcode = CXL_MBOX_OP_INJECT_POISON,
+		.size_in = sizeof(inject),
+		.payload_in = &inject,
+	};
+	rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
+	if (rc)
+		goto out;
+
+	cxlr = cxl_dpa_to_region(cxlmd, dpa);
+	if (cxlr)
+		dev_warn_once(cxlds->dev,
+			      "poison inject dpa:%#llx region: %s\n", dpa,
+			      dev_name(&cxlr->dev));
+
+	record = (struct cxl_poison_record) {
+		.address = cpu_to_le64(dpa),
+		.length = cpu_to_le32(1),
+	};
+	trace_cxl_poison(cxlmd, cxlr, &record, 0, 0, CXL_POISON_TRACE_INJECT);
+out:
+	up_read(&cxl_dpa_rwsem);
+
+	return rc;
+}
+EXPORT_SYMBOL_NS_GPL(cxl_inject_poison, CXL);
+
+int cxl_clear_poison(struct cxl_memdev *cxlmd, u64 dpa)
+{
+	struct cxl_dev_state *cxlds = cxlmd->cxlds;
+	struct cxl_mbox_clear_poison clear;
+	struct cxl_poison_record record;
+	struct cxl_mbox_cmd mbox_cmd;
+	struct cxl_region *cxlr;
+	int rc;
+
+	if (!IS_ENABLED(CONFIG_DEBUG_FS))
+		return 0;
+
+	rc = down_read_interruptible(&cxl_dpa_rwsem);
+	if (rc)
+		return rc;
+
+	rc = cxl_validate_poison_dpa(cxlmd, dpa);
+	if (rc)
+		goto out;
+
+	/*
+	 * In CXL 3.0 Spec 8.2.9.8.4.3, the Clear Poison mailbox command
+	 * is defined to accept 64 bytes of write-data, along with the
+	 * address to clear. This driver uses zeroes as write-data.
+	 */
+	clear = (struct cxl_mbox_clear_poison) {
+		.address = cpu_to_le64(dpa)
+	};
+
+	mbox_cmd = (struct cxl_mbox_cmd) {
+		.opcode = CXL_MBOX_OP_CLEAR_POISON,
+		.size_in = sizeof(clear),
+		.payload_in = &clear,
+	};
+
+	rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
+	if (rc)
+		goto out;
+
+	cxlr = cxl_dpa_to_region(cxlmd, dpa);
+	if (cxlr)
+		dev_warn_once(cxlds->dev, "poison clear dpa:%#llx region: %s\n",
+			      dpa, dev_name(&cxlr->dev));
+
+	record = (struct cxl_poison_record) {
+		.address = cpu_to_le64(dpa),
+		.length = cpu_to_le32(1),
+	};
+	trace_cxl_poison(cxlmd, cxlr, &record, 0, 0, CXL_POISON_TRACE_CLEAR);
+out:
+	up_read(&cxl_dpa_rwsem);
+
+	return rc;
+}
+EXPORT_SYMBOL_NS_GPL(cxl_clear_poison, CXL);
 
 static struct attribute *cxl_memdev_attributes[] = {
 	&dev_attr_serial.attr,
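The checks in cxl_validate_poison_dpa() reduce to two conditions: the DPA must fall inside the device's DPA resource, and it must be 64-byte aligned. A minimal userspace sketch of those two conditions follows. The function name `validate_poison_dpa` and the plain start/end arguments are illustrative assumptions, and the driver's additional "device has no dpa resource" check is left out.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Return 0 if `dpa` may be used for poison inject or clear, -EINVAL
 * otherwise. `res_start` and `res_end` are both inclusive, matching
 * how the kernel's struct resource stores a range.
 */
static int validate_poison_dpa(uint64_t dpa, uint64_t res_start,
			       uint64_t res_end)
{
	/* The address must lie within the device's DPA range */
	if (dpa < res_start || dpa > res_end)
		return -EINVAL;

	/* Poison is injected and cleared in 64-byte units */
	if (dpa % 64 != 0)
		return -EINVAL;

	return 0;
}
```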
+51 -86
drivers/cxl/core/pci.c
···
 #define CXL_DOE_TABLE_ACCESS_LAST_ENTRY 0xffff
 #define CXL_DOE_PROTOCOL_TABLE_ACCESS 2
 
-static struct pci_doe_mb *find_cdat_doe(struct device *uport)
-{
-	struct cxl_memdev *cxlmd;
-	struct cxl_dev_state *cxlds;
-	unsigned long index;
-	void *entry;
-
-	cxlmd = to_cxl_memdev(uport);
-	cxlds = cxlmd->cxlds;
-
-	xa_for_each(&cxlds->doe_mbs, index, entry) {
-		struct pci_doe_mb *cur = entry;
-
-		if (pci_doe_supports_prot(cur, PCI_DVSEC_VENDOR_ID_CXL,
-					  CXL_DOE_PROTOCOL_TABLE_ACCESS))
-			return cur;
-	}
-
-	return NULL;
-}
-
 #define CDAT_DOE_REQ(entry_handle) cpu_to_le32 \
 	(FIELD_PREP(CXL_DOE_TABLE_ACCESS_REQ_CODE, \
 		    CXL_DOE_TABLE_ACCESS_REQ_CODE_READ) | \
···
 		    CXL_DOE_TABLE_ACCESS_TABLE_TYPE_CDATA) | \
 	 FIELD_PREP(CXL_DOE_TABLE_ACCESS_ENTRY_HANDLE, (entry_handle)))
 
-static void cxl_doe_task_complete(struct pci_doe_task *task)
-{
-	complete(task->private);
-}
-
-struct cdat_doe_task {
-	__le32 request_pl;
-	__le32 response_pl[32];
-	struct completion c;
-	struct pci_doe_task task;
-};
-
-#define DECLARE_CDAT_DOE_TASK(req, cdt) \
-struct cdat_doe_task cdt = { \
-	.c = COMPLETION_INITIALIZER_ONSTACK(cdt.c), \
-	.request_pl = req, \
-	.task = { \
-		.prot.vid = PCI_DVSEC_VENDOR_ID_CXL, \
-		.prot.type = CXL_DOE_PROTOCOL_TABLE_ACCESS, \
-		.request_pl = &cdt.request_pl, \
-		.request_pl_sz = sizeof(cdt.request_pl), \
-		.response_pl = cdt.response_pl, \
-		.response_pl_sz = sizeof(cdt.response_pl), \
-		.complete = cxl_doe_task_complete, \
-		.private = &cdt.c, \
-	} \
-}
-
 static int cxl_cdat_get_length(struct device *dev,
 			       struct pci_doe_mb *cdat_doe,
 			       size_t *length)
 {
-	DECLARE_CDAT_DOE_TASK(CDAT_DOE_REQ(0), t);
+	__le32 request = CDAT_DOE_REQ(0);
+	__le32 response[2];
 	int rc;
 
-	rc = pci_doe_submit_task(cdat_doe, &t.task);
+	rc = pci_doe(cdat_doe, PCI_DVSEC_VENDOR_ID_CXL,
+		     CXL_DOE_PROTOCOL_TABLE_ACCESS,
+		     &request, sizeof(request),
+		     &response, sizeof(response));
 	if (rc < 0) {
-		dev_err(dev, "DOE submit failed: %d", rc);
+		dev_err(dev, "DOE failed: %d", rc);
 		return rc;
 	}
-	wait_for_completion(&t.c);
-	if (t.task.rv < 2 * sizeof(__le32))
+	if (rc < sizeof(response))
 		return -EIO;
 
-	*length = le32_to_cpu(t.response_pl[1]);
+	*length = le32_to_cpu(response[1]);
 	dev_dbg(dev, "CDAT length %zu\n", *length);
 
 	return 0;
···
 
 static int cxl_cdat_read_table(struct device *dev,
 			       struct pci_doe_mb *cdat_doe,
-			       struct cxl_cdat *cdat)
+			       void *cdat_table, size_t *cdat_length)
 {
-	size_t length = cdat->length;
-	__le32 *data = cdat->table;
+	size_t length = *cdat_length + sizeof(__le32);
+	__le32 *data = cdat_table;
 	int entry_handle = 0;
+	__le32 saved_dw = 0;
 
 	do {
-		DECLARE_CDAT_DOE_TASK(CDAT_DOE_REQ(entry_handle), t);
+		__le32 request = CDAT_DOE_REQ(entry_handle);
 		struct cdat_entry_header *entry;
 		size_t entry_dw;
 		int rc;
 
-		rc = pci_doe_submit_task(cdat_doe, &t.task);
+		rc = pci_doe(cdat_doe, PCI_DVSEC_VENDOR_ID_CXL,
+			     CXL_DOE_PROTOCOL_TABLE_ACCESS,
+			     &request, sizeof(request),
+			     data, length);
 		if (rc < 0) {
-			dev_err(dev, "DOE submit failed: %d", rc);
+			dev_err(dev, "DOE failed: %d", rc);
 			return rc;
 		}
-		wait_for_completion(&t.c);
 
 		/* 1 DW Table Access Response Header + CDAT entry */
-		entry = (struct cdat_entry_header *)(t.response_pl + 1);
+		entry = (struct cdat_entry_header *)(data + 1);
 		if ((entry_handle == 0 &&
-		     t.task.rv != sizeof(__le32) + sizeof(struct cdat_header)) ||
+		     rc != sizeof(__le32) + sizeof(struct cdat_header)) ||
 		    (entry_handle > 0 &&
-		     (t.task.rv < sizeof(__le32) + sizeof(*entry) ||
-		      t.task.rv != sizeof(__le32) + le16_to_cpu(entry->length))))
+		     (rc < sizeof(__le32) + sizeof(*entry) ||
+		      rc != sizeof(__le32) + le16_to_cpu(entry->length))))
 			return -EIO;
 
 		/* Get the CXL table access header entry handle */
 		entry_handle = FIELD_GET(CXL_DOE_TABLE_ACCESS_ENTRY_HANDLE,
-					 le32_to_cpu(t.response_pl[0]));
-		entry_dw = t.task.rv / sizeof(__le32);
+					 le32_to_cpu(data[0]));
+		entry_dw = rc / sizeof(__le32);
 		/* Skip Header */
 		entry_dw -= 1;
-		entry_dw = min(length / sizeof(__le32), entry_dw);
-		/* Prevent length < 1 DW from causing a buffer overflow */
-		if (entry_dw) {
-			memcpy(data, entry, entry_dw * sizeof(__le32));
-			length -= entry_dw * sizeof(__le32);
-			data += entry_dw;
-		}
+		/*
+		 * Table Access Response Header overwrote the last DW of
+		 * previous entry, so restore that DW
+		 */
+		*data = saved_dw;
+		length -= entry_dw * sizeof(__le32);
+		data += entry_dw;
+		saved_dw = *data;
 	} while (entry_handle != CXL_DOE_TABLE_ACCESS_LAST_ENTRY);
 
 	/* Length in CDAT header may exceed concatenation of CDAT entries */
-	cdat->length -= length;
+	*cdat_length -= length - sizeof(__le32);
 
 	return 0;
 }
···
  */
 void read_cdat_data(struct cxl_port *port)
 {
-	struct pci_doe_mb *cdat_doe;
+	struct cxl_memdev *cxlmd = to_cxl_memdev(port->uport);
+	struct device *host = cxlmd->dev.parent;
 	struct device *dev = &port->dev;
-	struct device *uport = port->uport;
+	struct pci_doe_mb *cdat_doe;
 	size_t cdat_length;
+	void *cdat_table;
 	int rc;
 
-	cdat_doe = find_cdat_doe(uport);
+	if (!dev_is_pci(host))
+		return;
+	cdat_doe = pci_find_doe_mailbox(to_pci_dev(host),
+					PCI_DVSEC_VENDOR_ID_CXL,
+					CXL_DOE_PROTOCOL_TABLE_ACCESS);
 	if (!cdat_doe) {
 		dev_dbg(dev, "No CDAT mailbox\n");
 		return;
···
 		return;
 	}
 
-	port->cdat.table = devm_kzalloc(dev, cdat_length, GFP_KERNEL);
-	if (!port->cdat.table)
+	cdat_table = devm_kzalloc(dev, cdat_length + sizeof(__le32),
+				  GFP_KERNEL);
+	if (!cdat_table)
 		return;
 
-	port->cdat.length = cdat_length;
-	rc = cxl_cdat_read_table(dev, cdat_doe, &port->cdat);
+	rc = cxl_cdat_read_table(dev, cdat_doe, cdat_table, &cdat_length);
 	if (rc) {
 		/* Don't leave table data allocated on error */
-		devm_kfree(dev, port->cdat.table);
-		port->cdat.table = NULL;
-		port->cdat.length = 0;
+		devm_kfree(dev, cdat_table);
 		dev_err(dev, "CDAT data read error\n");
 	}
+
+	port->cdat.table = cdat_table + sizeof(__le32);
+	port->cdat.length = cdat_length;
 }
 EXPORT_SYMBOL_NS_GPL(read_cdat_data, CXL);
 
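The reworked cxl_cdat_read_table() drops the bounce buffer and memcpy(): each response is read directly into the destination, so its one-DW header lands on top of the last DW of the previously stored entry, and that DW is put back from `saved_dw` afterwards. The buffer therefore carries one spare leading DW, and the usable table starts one DW in. The sketch below reproduces that bookkeeping against a mock mailbox. `struct mock_entry`, `mock_doe_read()` and `cdat_read_table()` are hypothetical names, and the length and error checks of the real function are omitted.

```c
#include <stdint.h>
#include <string.h>

#define LAST_ENTRY 0xffffu	/* handle value that ends the walk */

/* Hypothetical mock of one table entry held by the device */
struct mock_entry {
	const uint32_t *payload;
	size_t ndw;
};

/*
 * Hypothetical mock mailbox read: writes a 1-DW header (the next entry
 * handle) followed by the entry payload, and returns the DWs written.
 */
static size_t mock_doe_read(const struct mock_entry *tbl, size_t n,
			    uint32_t handle, uint32_t *dst)
{
	uint32_t next = (handle + 1 < n) ? handle + 1 : LAST_ENTRY;

	dst[0] = next;
	memcpy(dst + 1, tbl[handle].payload,
	       tbl[handle].ndw * sizeof(uint32_t));
	return 1 + tbl[handle].ndw;
}

/*
 * Read every entry back to back into `buf`, which must hold one DW
 * more than the total payload. Payload ends up at buf[1..total];
 * returns the total payload size in DWs.
 */
static size_t cdat_read_table(const struct mock_entry *tbl, size_t n,
			      uint32_t *buf)
{
	uint32_t *data = buf;
	uint32_t saved_dw = 0;
	uint32_t handle = 0;
	size_t total = 0;

	do {
		size_t entry_dw = mock_doe_read(tbl, n, handle, data) - 1;

		handle = data[0];
		/* The header overwrote the previous entry's last DW */
		*data = saved_dw;
		data += entry_dw;
		/* Remember the DW the next header is about to clobber */
		saved_dw = *data;
		total += entry_dw;
	} while (handle != LAST_ENTRY);

	return total;
}
```

The cost of avoiding the copy is the single spare DW at the front of the buffer, which is why the driver allocates `cdat_length + sizeof(__le32)` bytes and publishes the table pointer offset by one DW.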
-1
drivers/cxl/core/port.c
···
 // SPDX-License-Identifier: GPL-2.0-only
 /* Copyright(c) 2020 Intel Corporation. All rights reserved. */
-#include <linux/io-64-nonatomic-lo-hi.h>
 #include <linux/memregion.h>
 #include <linux/workqueue.h>
 #include <linux/debugfs.h>
+124
drivers/cxl/core/region.c
··· 2238 2238 } 2239 2239 EXPORT_SYMBOL_NS_GPL(to_cxl_pmem_region, CXL); 2240 2240 2241 + struct cxl_poison_context { 2242 + struct cxl_port *port; 2243 + enum cxl_decoder_mode mode; 2244 + u64 offset; 2245 + }; 2246 + 2247 + static int cxl_get_poison_unmapped(struct cxl_memdev *cxlmd, 2248 + struct cxl_poison_context *ctx) 2249 + { 2250 + struct cxl_dev_state *cxlds = cxlmd->cxlds; 2251 + u64 offset, length; 2252 + int rc = 0; 2253 + 2254 + /* 2255 + * Collect poison for the remaining unmapped resources 2256 + * after poison is collected by committed endpoints. 2257 + * 2258 + * Knowing that PMEM must always follow RAM, get poison 2259 + * for unmapped resources based on the last decoder's mode: 2260 + * ram: scan remains of ram range, then any pmem range 2261 + * pmem: scan remains of pmem range 2262 + */ 2263 + 2264 + if (ctx->mode == CXL_DECODER_RAM) { 2265 + offset = ctx->offset; 2266 + length = resource_size(&cxlds->ram_res) - offset; 2267 + rc = cxl_mem_get_poison(cxlmd, offset, length, NULL); 2268 + if (rc == -EFAULT) 2269 + rc = 0; 2270 + if (rc) 2271 + return rc; 2272 + } 2273 + if (ctx->mode == CXL_DECODER_PMEM) { 2274 + offset = ctx->offset; 2275 + length = resource_size(&cxlds->dpa_res) - offset; 2276 + if (!length) 2277 + return 0; 2278 + } else if (resource_size(&cxlds->pmem_res)) { 2279 + offset = cxlds->pmem_res.start; 2280 + length = resource_size(&cxlds->pmem_res); 2281 + } else { 2282 + return 0; 2283 + } 2284 + 2285 + return cxl_mem_get_poison(cxlmd, offset, length, NULL); 2286 + } 2287 + 2288 + static int poison_by_decoder(struct device *dev, void *arg) 2289 + { 2290 + struct cxl_poison_context *ctx = arg; 2291 + struct cxl_endpoint_decoder *cxled; 2292 + struct cxl_memdev *cxlmd; 2293 + u64 offset, length; 2294 + int rc = 0; 2295 + 2296 + if (!is_endpoint_decoder(dev)) 2297 + return rc; 2298 + 2299 + cxled = to_cxl_endpoint_decoder(dev); 2300 + if (!cxled->dpa_res || !resource_size(cxled->dpa_res)) 2301 + return rc; 2302 + 2303 + /* 2304 + * 
Regions are only created with single mode decoders: pmem or ram. 2305 + * Linux does not support mixed mode decoders. This means that 2306 + * reading poison per endpoint decoder adheres to the requirement 2307 + * that poison reads of pmem and ram must be separated. 2308 + * CXL 3.0 Spec 8.2.9.8.4.1 2309 + */ 2310 + if (cxled->mode == CXL_DECODER_MIXED) { 2311 + dev_dbg(dev, "poison list read unsupported in mixed mode\n"); 2312 + return rc; 2313 + } 2314 + 2315 + cxlmd = cxled_to_memdev(cxled); 2316 + if (cxled->skip) { 2317 + offset = cxled->dpa_res->start - cxled->skip; 2318 + length = cxled->skip; 2319 + rc = cxl_mem_get_poison(cxlmd, offset, length, NULL); 2320 + if (rc == -EFAULT && cxled->mode == CXL_DECODER_RAM) 2321 + rc = 0; 2322 + if (rc) 2323 + return rc; 2324 + } 2325 + 2326 + offset = cxled->dpa_res->start; 2327 + length = cxled->dpa_res->end - offset + 1; 2328 + rc = cxl_mem_get_poison(cxlmd, offset, length, cxled->cxld.region); 2329 + if (rc == -EFAULT && cxled->mode == CXL_DECODER_RAM) 2330 + rc = 0; 2331 + if (rc) 2332 + return rc; 2333 + 2334 + /* Iterate until commit_end is reached */ 2335 + if (cxled->cxld.id == ctx->port->commit_end) { 2336 + ctx->offset = cxled->dpa_res->end + 1; 2337 + ctx->mode = cxled->mode; 2338 + return 1; 2339 + } 2340 + 2341 + return 0; 2342 + } 2343 + 2344 + int cxl_get_poison_by_endpoint(struct cxl_port *port) 2345 + { 2346 + struct cxl_poison_context ctx; 2347 + int rc = 0; 2348 + 2349 + rc = down_read_interruptible(&cxl_region_rwsem); 2350 + if (rc) 2351 + return rc; 2352 + 2353 + ctx = (struct cxl_poison_context) { 2354 + .port = port 2355 + }; 2356 + 2357 + rc = device_for_each_child(&port->dev, &ctx, poison_by_decoder); 2358 + if (rc == 1) 2359 + rc = cxl_get_poison_unmapped(to_cxl_memdev(port->uport), &ctx); 2360 + 2361 + up_read(&cxl_region_rwsem); 2362 + return rc; 2363 + } 2364 + 2241 2365 static struct lock_class_key cxl_pmem_region_key; 2242 2366 2243 2367 static struct cxl_pmem_region 
*cxl_pmem_region_alloc(struct cxl_region *cxlr)
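The comment in cxl_get_poison_unmapped() describes which DPA ranges are still scanned once the last committed decoder has been visited. What follows is a minimal user-space sketch of that range selection, not the kernel code itself. It assumes plain integers in place of struct resource and a ram range that starts at DPA 0; the names last_mode, scan_plan and plan_unmapped_scan are hypothetical and do not appear in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Mode of the last committed endpoint decoder (stand-in for enum cxl_decoder_mode). */
enum last_mode { LAST_RAM, LAST_PMEM };

/* Hypothetical result type: which unmapped DPA ranges remain to be scanned. */
struct scan_plan {
	int scan_ram;
	uint64_t ram_offset, ram_length;
	int scan_pmem;
	uint64_t pmem_offset, pmem_length;
};

struct scan_plan plan_unmapped_scan(enum last_mode mode, uint64_t next_offset,
				    uint64_t ram_size, uint64_t pmem_start,
				    uint64_t pmem_size, uint64_t dpa_size)
{
	struct scan_plan plan = { 0 };

	assert(next_offset <= dpa_size);

	if (mode == LAST_RAM) {
		/* Remainder of the ram range after the last decoder. */
		plan.scan_ram = 1;
		plan.ram_offset = next_offset;
		plan.ram_length = ram_size - next_offset;
	}

	if (mode == LAST_PMEM) {
		/* Remainder of the DPA space after the last decoder. */
		plan.pmem_offset = next_offset;
		plan.pmem_length = dpa_size - next_offset;
		plan.scan_pmem = plan.pmem_length != 0;
	} else if (pmem_size) {
		/* Last decoder was ram, so the whole pmem range is unmapped. */
		plan.scan_pmem = 1;
		plan.pmem_offset = pmem_start;
		plan.pmem_length = pmem_size;
	}

	return plan;
}
```

As in the diff, the ram remainder is not checked for zero length, while the pmem remainder is skipped when nothing is left.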
+94
drivers/cxl/core/trace.c
··· 1 1 // SPDX-License-Identifier: GPL-2.0-only 2 2 /* Copyright(c) 2022 Intel Corporation. All rights reserved. */ 3 3 4 + #include <cxl.h> 5 + #include "core.h" 6 + 4 7 #define CREATE_TRACE_POINTS 5 8 #include "trace.h" 9 + 10 + static bool cxl_is_hpa_in_range(u64 hpa, struct cxl_region *cxlr, int pos) 11 + { 12 + struct cxl_region_params *p = &cxlr->params; 13 + int gran = p->interleave_granularity; 14 + int ways = p->interleave_ways; 15 + u64 offset; 16 + 17 + /* Is the hpa within this region at all */ 18 + if (hpa < p->res->start || hpa > p->res->end) { 19 + dev_dbg(&cxlr->dev, 20 + "Addr trans fail: hpa 0x%llx not in region\n", hpa); 21 + return false; 22 + } 23 + 24 + /* Is the hpa in an expected chunk for its pos(-ition) */ 25 + offset = hpa - p->res->start; 26 + offset = do_div(offset, gran * ways); 27 + if ((offset >= pos * gran) && (offset < (pos + 1) * gran)) 28 + return true; 29 + 30 + dev_dbg(&cxlr->dev, 31 + "Addr trans fail: hpa 0x%llx not in expected chunk\n", hpa); 32 + 33 + return false; 34 + } 35 + 36 + static u64 cxl_dpa_to_hpa(u64 dpa, struct cxl_region *cxlr, 37 + struct cxl_endpoint_decoder *cxled) 38 + { 39 + u64 dpa_offset, hpa_offset, bits_upper, mask_upper, hpa; 40 + struct cxl_region_params *p = &cxlr->params; 41 + int pos = cxled->pos; 42 + u16 eig = 0; 43 + u8 eiw = 0; 44 + 45 + ways_to_eiw(p->interleave_ways, &eiw); 46 + granularity_to_eig(p->interleave_granularity, &eig); 47 + 48 + /* 49 + * The device position in the region interleave set was removed 50 + * from the offset at HPA->DPA translation. To reconstruct the 51 + * HPA, place the 'pos' in the offset. 
52 + * 53 + * The placement of 'pos' in the HPA is determined by interleave 54 + * ways and granularity and is defined in the CXL Spec 3.0 Section 55 + * 8.2.4.19.13 Implementation Note: Device Decode Logic 56 + */ 57 + 58 + /* Remove the dpa base */ 59 + dpa_offset = dpa - cxl_dpa_resource_start(cxled); 60 + 61 + mask_upper = GENMASK_ULL(51, eig + 8); 62 + 63 + if (eiw < 8) { 64 + hpa_offset = (dpa_offset & mask_upper) << eiw; 65 + hpa_offset |= pos << (eig + 8); 66 + } else { 67 + bits_upper = (dpa_offset & mask_upper) >> (eig + 8); 68 + bits_upper = bits_upper * 3; 69 + hpa_offset = ((bits_upper << (eiw - 8)) + pos) << (eig + 8); 70 + } 71 + 72 + /* The lower bits remain unchanged */ 73 + hpa_offset |= dpa_offset & GENMASK_ULL(eig + 7, 0); 74 + 75 + /* Apply the hpa_offset to the region base address */ 76 + hpa = hpa_offset + p->res->start; 77 + 78 + if (!cxl_is_hpa_in_range(hpa, cxlr, cxled->pos)) 79 + return ULLONG_MAX; 80 + 81 + return hpa; 82 + } 83 + 84 + u64 cxl_trace_hpa(struct cxl_region *cxlr, struct cxl_memdev *cxlmd, 85 + u64 dpa) 86 + { 87 + struct cxl_region_params *p = &cxlr->params; 88 + struct cxl_endpoint_decoder *cxled = NULL; 89 + 90 + for (int i = 0; i < p->nr_targets; i++) { 91 + cxled = p->targets[i]; 92 + if (cxlmd == cxled_to_memdev(cxled)) 93 + break; 94 + } 95 + if (!cxled || cxlmd != cxled_to_memdev(cxled)) 96 + return ULLONG_MAX; 97 + 98 + return cxl_dpa_to_hpa(dpa, cxlr, cxled); 99 + }
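The bit manipulation in cxl_dpa_to_hpa() can be tried outside the kernel. The sketch below mirrors the arithmetic shown in the diff on offsets only, leaving out the DPA base and the region base. It assumes the usual CXL encodings, which are not spelled out in this diff: granularity = 256 << eig, eiw = log2(ways) for power-of-two interleave, and eiw = 8, 9, 10 for 3-, 6- and 12-way interleave. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Bits hi..lo set, mirroring the kernel's GENMASK_ULL(hi, lo). */
static uint64_t genmask_ull(unsigned int hi, unsigned int lo)
{
	assert(hi < 64 && lo <= hi);
	return (~0ULL >> (63 - hi)) & (~0ULL << lo);
}

/* Re-insert the interleave position 'pos' into a device-local offset. */
uint64_t dpa_to_hpa_offset(uint64_t dpa_offset, unsigned int eig,
			   unsigned int eiw, unsigned int pos)
{
	uint64_t mask_upper = genmask_ull(51, eig + 8);
	uint64_t hpa_offset;

	if (eiw < 8) {
		/* Power-of-two ways: shift the upper bits up, drop pos in the gap. */
		hpa_offset = (dpa_offset & mask_upper) << eiw;
		hpa_offset |= (uint64_t)pos << (eig + 8);
	} else {
		/* 3, 6 or 12 ways: multiply the upper bits by 3 before adding pos. */
		uint64_t bits_upper = (dpa_offset & mask_upper) >> (eig + 8);

		bits_upper *= 3;
		hpa_offset = ((bits_upper << (eiw - 8)) + pos) << (eig + 8);
	}

	/* The bits below the granularity boundary are unchanged. */
	hpa_offset |= dpa_offset & genmask_ull(eig + 7, 0);
	return hpa_offset;
}

/* Same check as cxl_is_hpa_in_range(): is the offset in the chunk owned by pos? */
int hpa_offset_in_chunk(uint64_t hpa_offset, unsigned int gran,
			unsigned int ways, unsigned int pos)
{
	uint64_t rem = hpa_offset % ((uint64_t)gran * ways);

	return rem >= (uint64_t)pos * gran && rem < (uint64_t)(pos + 1) * gran;
}
```

For example, with 2-way interleave at 256-byte granularity (eig = 0, eiw = 1), the second chunk on the device at position 1 (offset 0x100) is the fourth chunk of the region (offset 0x300).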
+103
drivers/cxl/core/trace.h
··· 7 7 #define _CXL_EVENTS_H 8 8 9 9 #include <linux/tracepoint.h> 10 + #include <linux/pci.h> 10 11 #include <asm-generic/unaligned.h> 11 12 12 13 #include <cxl.h> 13 14 #include <cxlmem.h> 15 + #include "core.h" 14 16 15 17 #define CXL_RAS_UC_CACHE_DATA_PARITY BIT(0) 16 18 #define CXL_RAS_UC_CACHE_ADDR_PARITY BIT(1) ··· 599 597 __entry->life_used, __entry->device_temp, 600 598 __entry->dirty_shutdown_cnt, __entry->cor_vol_err_cnt, 601 599 __entry->cor_per_err_cnt 600 + ) 601 + ); 602 + 603 + #define show_poison_trace_type(type) \ 604 + __print_symbolic(type, \ 605 + { CXL_POISON_TRACE_LIST, "List" }, \ 606 + { CXL_POISON_TRACE_INJECT, "Inject" }, \ 607 + { CXL_POISON_TRACE_CLEAR, "Clear" }) 608 + 609 + #define __show_poison_source(source) \ 610 + __print_symbolic(source, \ 611 + { CXL_POISON_SOURCE_UNKNOWN, "Unknown" }, \ 612 + { CXL_POISON_SOURCE_EXTERNAL, "External" }, \ 613 + { CXL_POISON_SOURCE_INTERNAL, "Internal" }, \ 614 + { CXL_POISON_SOURCE_INJECTED, "Injected" }, \ 615 + { CXL_POISON_SOURCE_VENDOR, "Vendor" }) 616 + 617 + #define show_poison_source(source) \ 618 + (((source > CXL_POISON_SOURCE_INJECTED) && \ 619 + (source != CXL_POISON_SOURCE_VENDOR)) ? "Reserved" \ 620 + : __show_poison_source(source)) 621 + 622 + #define show_poison_flags(flags) \ 623 + __print_flags(flags, "|", \ 624 + { CXL_POISON_FLAG_MORE, "More" }, \ 625 + { CXL_POISON_FLAG_OVERFLOW, "Overflow" }, \ 626 + { CXL_POISON_FLAG_SCANNING, "Scanning" }) 627 + 628 + #define __cxl_poison_addr(record) \ 629 + (le64_to_cpu(record->address)) 630 + #define cxl_poison_record_dpa(record) \ 631 + (__cxl_poison_addr(record) & CXL_POISON_START_MASK) 632 + #define cxl_poison_record_source(record) \ 633 + (__cxl_poison_addr(record) & CXL_POISON_SOURCE_MASK) 634 + #define cxl_poison_record_dpa_length(record) \ 635 + (le32_to_cpu(record->length) * CXL_POISON_LEN_MULT) 636 + #define cxl_poison_overflow(flags, time) \ 637 + (flags & CXL_POISON_FLAG_OVERFLOW ? 
le64_to_cpu(time) : 0) 638 + 639 + u64 cxl_trace_hpa(struct cxl_region *cxlr, struct cxl_memdev *memdev, u64 dpa); 640 + 641 + TRACE_EVENT(cxl_poison, 642 + 643 + TP_PROTO(struct cxl_memdev *cxlmd, struct cxl_region *region, 644 + const struct cxl_poison_record *record, u8 flags, 645 + __le64 overflow_ts, enum cxl_poison_trace_type trace_type), 646 + 647 + TP_ARGS(cxlmd, region, record, flags, overflow_ts, trace_type), 648 + 649 + TP_STRUCT__entry( 650 + __string(memdev, dev_name(&cxlmd->dev)) 651 + __string(host, dev_name(cxlmd->dev.parent)) 652 + __field(u64, serial) 653 + __field(u8, trace_type) 654 + __string(region, region) 655 + __field(u64, overflow_ts) 656 + __field(u64, hpa) 657 + __field(u64, dpa) 658 + __field(u32, dpa_length) 659 + __array(char, uuid, 16) 660 + __field(u8, source) 661 + __field(u8, flags) 662 + ), 663 + 664 + TP_fast_assign( 665 + __assign_str(memdev, dev_name(&cxlmd->dev)); 666 + __assign_str(host, dev_name(cxlmd->dev.parent)); 667 + __entry->serial = cxlmd->cxlds->serial; 668 + __entry->overflow_ts = cxl_poison_overflow(flags, overflow_ts); 669 + __entry->dpa = cxl_poison_record_dpa(record); 670 + __entry->dpa_length = cxl_poison_record_dpa_length(record); 671 + __entry->source = cxl_poison_record_source(record); 672 + __entry->trace_type = trace_type; 673 + __entry->flags = flags; 674 + if (region) { 675 + __assign_str(region, dev_name(&region->dev)); 676 + memcpy(__entry->uuid, &region->params.uuid, 16); 677 + __entry->hpa = cxl_trace_hpa(region, cxlmd, 678 + __entry->dpa); 679 + } else { 680 + __assign_str(region, ""); 681 + memset(__entry->uuid, 0, 16); 682 + __entry->hpa = ULLONG_MAX; 683 + } 684 + ), 685 + 686 + TP_printk("memdev=%s host=%s serial=%lld trace_type=%s region=%s " \ 687 + "region_uuid=%pU hpa=0x%llx dpa=0x%llx dpa_length=0x%x " \ 688 + "source=%s flags=%s overflow_time=%llu", 689 + __get_str(memdev), 690 + __get_str(host), 691 + __entry->serial, 692 + show_poison_trace_type(__entry->trace_type), 693 + 
__get_str(region), 694 + __entry->uuid, 695 + __entry->hpa, 696 + __entry->dpa, 697 + __entry->dpa_length, 698 + show_poison_source(__entry->source), 699 + show_poison_flags(__entry->flags), 700 + __entry->overflow_ts 602 701 ) 603 702 ); 604 703
+105 -6
drivers/cxl/cxlmem.h
··· 127 127 }; 128 128 129 129 /* 130 - * Per CXL 2.0 Section 8.2.8.4.5.1 130 + * Per CXL 3.0 Section 8.2.8.4.5.1 131 131 */ 132 132 #define CMD_CMD_RC_TABLE \ 133 133 C(SUCCESS, 0, NULL), \ ··· 145 145 C(FWROLLBACK, -ENXIO, "rolled back to the previous active FW"), \ 146 146 C(FWRESET, -ENXIO, "FW failed to activate, needs cold reset"), \ 147 147 C(HANDLE, -ENXIO, "one or more Event Record Handles were invalid"), \ 148 - C(PADDR, -ENXIO, "physical address specified is invalid"), \ 148 + C(PADDR, -EFAULT, "physical address specified is invalid"), \ 149 149 C(POISONLMT, -ENXIO, "poison injection limit has been reached"), \ 150 150 C(MEDIAFAILURE, -ENXIO, "permanent issue with the media"), \ 151 151 C(ABORT, -ENXIO, "background cmd was aborted by device"), \ 152 152 C(SECURITY, -ENXIO, "not valid in the current security state"), \ 153 153 C(PASSPHRASE, -ENXIO, "phrase doesn't match current set passphrase"), \ 154 154 C(MBUNSUPPORTED, -ENXIO, "unsupported on the mailbox it was issued on"),\ 155 - C(PAYLOADLEN, -ENXIO, "invalid payload length") 155 + C(PAYLOADLEN, -ENXIO, "invalid payload length"), \ 156 + C(LOG, -ENXIO, "invalid or unsupported log page"), \ 157 + C(INTERRUPTED, -ENXIO, "asynchronous event occured"), \ 158 + C(FEATUREVERSION, -ENXIO, "unsupported feature version"), \ 159 + C(FEATURESELVALUE, -ENXIO, "unsupported feature selection value"), \ 160 + C(FEATURETRANSFERIP, -ENXIO, "feature transfer in progress"), \ 161 + C(FEATURETRANSFEROOO, -ENXIO, "feature transfer out of order"), \ 162 + C(RESOURCEEXHAUSTED, -ENXIO, "resources are exhausted"), \ 163 + C(EXTLIST, -ENXIO, "invalid Extent List"), \ 156 164 157 165 #undef C 158 166 #define C(a, b, c) CXL_MBOX_CMD_RC_##a ··· 223 215 struct mutex log_lock; 224 216 }; 225 217 218 + /* Device enabled poison commands */ 219 + enum poison_cmd_enabled_bits { 220 + CXL_POISON_ENABLED_LIST, 221 + CXL_POISON_ENABLED_INJECT, 222 + CXL_POISON_ENABLED_CLEAR, 223 + CXL_POISON_ENABLED_SCAN_CAPS, 224 + 
CXL_POISON_ENABLED_SCAN_MEDIA, 225 + CXL_POISON_ENABLED_SCAN_RESULTS, 226 + CXL_POISON_ENABLED_MAX 227 + }; 228 + 229 + /** 230 + * struct cxl_poison_state - Driver poison state info 231 + * 232 + * @max_errors: Maximum media error records held in device cache 233 + * @enabled_cmds: All poison commands enabled in the CEL 234 + * @list_out: The poison list payload returned by device 235 + * @lock: Protect reads of the poison list 236 + * 237 + * Reads of the poison list are synchronized to ensure that a reader 238 + * does not get an incomplete list because their request overlapped 239 + * (was interrupted or preceded by) another read request of the same 240 + * DPA range. CXL Spec 3.0 Section 8.2.9.8.4.1 241 + */ 242 + struct cxl_poison_state { 243 + u32 max_errors; 244 + DECLARE_BITMAP(enabled_cmds, CXL_POISON_ENABLED_MAX); 245 + struct cxl_mbox_poison_out *list_out; 246 + struct mutex lock; /* Protect reads of poison list */ 247 + }; 248 + 226 249 /** 227 250 * struct cxl_dev_state - The driver device state 228 251 * ··· 288 249 * @component_reg_phys: register base of component registers 289 250 * @info: Cached DVSEC information about the device. 
290 251 * @serial: PCIe Device Serial Number 291 - * @doe_mbs: PCI DOE mailbox array 292 252 * @event: event log driver state 253 + * @poison: poison driver state info 293 254 * @mbox_send: @dev specific transport for transmitting mailbox commands 294 255 * 295 256 * See section 8.2.9.5.2 Capacity Configuration and Label Storage for ··· 326 287 resource_size_t component_reg_phys; 327 288 u64 serial; 328 289 329 - struct xarray doe_mbs; 330 - 331 290 struct cxl_event_state event; 291 + struct cxl_poison_state poison; 332 292 333 293 int (*mbox_send)(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd); 334 294 }; ··· 576 538 577 539 } __packed; 578 540 541 + /* Get Poison List CXL 3.0 Spec 8.2.9.8.4.1 */ 542 + struct cxl_mbox_poison_in { 543 + __le64 offset; 544 + __le64 length; 545 + } __packed; 546 + 547 + struct cxl_mbox_poison_out { 548 + u8 flags; 549 + u8 rsvd1; 550 + __le64 overflow_ts; 551 + __le16 count; 552 + u8 rsvd2[20]; 553 + struct cxl_poison_record { 554 + __le64 address; 555 + __le32 length; 556 + __le32 rsvd; 557 + } __packed record[]; 558 + } __packed; 559 + 560 + /* 561 + * Get Poison List address field encodes the starting 562 + * address of poison, and the source of the poison. 
563 + */ 564 + #define CXL_POISON_START_MASK GENMASK_ULL(63, 6) 565 + #define CXL_POISON_SOURCE_MASK GENMASK(2, 0) 566 + 567 + /* Get Poison List record length is in units of 64 bytes */ 568 + #define CXL_POISON_LEN_MULT 64 569 + 570 + /* Kernel defined maximum for a list of poison errors */ 571 + #define CXL_POISON_LIST_MAX 1024 572 + 573 + /* Get Poison List: Payload out flags */ 574 + #define CXL_POISON_FLAG_MORE BIT(0) 575 + #define CXL_POISON_FLAG_OVERFLOW BIT(1) 576 + #define CXL_POISON_FLAG_SCANNING BIT(2) 577 + 578 + /* Get Poison List: Poison Source */ 579 + #define CXL_POISON_SOURCE_UNKNOWN 0 580 + #define CXL_POISON_SOURCE_EXTERNAL 1 581 + #define CXL_POISON_SOURCE_INTERNAL 2 582 + #define CXL_POISON_SOURCE_INJECTED 3 583 + #define CXL_POISON_SOURCE_VENDOR 7 584 + 585 + /* Inject & Clear Poison CXL 3.0 Spec 8.2.9.8.4.2/3 */ 586 + struct cxl_mbox_inject_poison { 587 + __le64 address; 588 + }; 589 + 590 + /* Clear Poison CXL 3.0 Spec 8.2.9.8.4.3 */ 591 + struct cxl_mbox_clear_poison { 592 + __le64 address; 593 + u8 write_data[CXL_POISON_LEN_MULT]; 594 + } __packed; 595 + 579 596 /** 580 597 * struct cxl_mem_command - Driver representation of a memory device command 581 598 * @info: Command information as it exists for the UAPI ··· 701 608 void clear_exclusive_cxl_commands(struct cxl_dev_state *cxlds, unsigned long *cmds); 702 609 void cxl_mem_get_event_records(struct cxl_dev_state *cxlds, u32 status); 703 610 int cxl_set_timestamp(struct cxl_dev_state *cxlds); 611 + int cxl_poison_state_init(struct cxl_dev_state *cxlds); 612 + int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len, 613 + struct cxl_region *cxlr); 614 + int cxl_trigger_poison_list(struct cxl_memdev *cxlmd); 615 + int cxl_inject_poison(struct cxl_memdev *cxlmd, u64 dpa); 616 + int cxl_clear_poison(struct cxl_memdev *cxlmd, u64 dpa); 704 617 705 618 #ifdef CONFIG_CXL_SUSPEND 706 619 void cxl_mem_active_inc(void);
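The masks added to cxlmem.h split the Get Poison List address field into a 64-byte-aligned DPA and a 3-bit source, and the length field counts 64-byte units. A hedged user-space sketch of that decoding follows. It assumes the fields were already converted from little-endian to host order, and the names poison_info, decode_poison_record and poison_source_name are hypothetical. The length is widened to 64 bits here, which differs from the u32 dpa_length field of the trace event.

```c
#include <assert.h>
#include <stdint.h>

#define POISON_START_MASK  0xFFFFFFFFFFFFFFC0ULL /* GENMASK_ULL(63, 6) */
#define POISON_SOURCE_MASK 0x7ULL                /* GENMASK(2, 0) */
#define POISON_LEN_MULT    64ULL                 /* length is in 64-byte units */

struct poison_info {
	uint64_t dpa;
	uint64_t length_bytes;
	unsigned int source;
};

/* Split one poison record (host byte order assumed) into its parts. */
struct poison_info decode_poison_record(uint64_t address, uint32_t length)
{
	struct poison_info info;

	info.dpa = address & POISON_START_MASK;
	info.source = (unsigned int)(address & POISON_SOURCE_MASK);
	info.length_bytes = (uint64_t)length * POISON_LEN_MULT;

	assert((info.dpa & 63) == 0); /* start address is 64-byte aligned */
	return info;
}

/* Same mapping as show_poison_source(): unlisted values print as "Reserved". */
const char *poison_source_name(unsigned int source)
{
	switch (source) {
	case 0: return "Unknown";
	case 1: return "External";
	case 2: return "Internal";
	case 3: return "Injected";
	case 7: return "Vendor";
	default: return "Reserved";
	}
}
```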
+71
drivers/cxl/mem.c
···
94 94 	return 0;
95 95 }
96 96 
97 + static int cxl_debugfs_poison_inject(void *data, u64 dpa)
98 + {
99 + 	struct cxl_memdev *cxlmd = data;
100 + 
101 + 	return cxl_inject_poison(cxlmd, dpa);
102 + }
103 + 
104 + DEFINE_DEBUGFS_ATTRIBUTE(cxl_poison_inject_fops, NULL,
105 + 			 cxl_debugfs_poison_inject, "%llx\n");
106 + 
107 + static int cxl_debugfs_poison_clear(void *data, u64 dpa)
108 + {
109 + 	struct cxl_memdev *cxlmd = data;
110 + 
111 + 	return cxl_clear_poison(cxlmd, dpa);
112 + }
113 + 
114 + DEFINE_DEBUGFS_ATTRIBUTE(cxl_poison_clear_fops, NULL,
115 + 			 cxl_debugfs_poison_clear, "%llx\n");
116 + 
97 117 static int cxl_mem_probe(struct device *dev)
98 118 {
99 119 	struct cxl_memdev *cxlmd = to_cxl_memdev(dev);
···
137 117 
138 118 	dentry = cxl_debugfs_create_dir(dev_name(dev));
139 119 	debugfs_create_devm_seqfile(dev, "dpamem", dentry, cxl_mem_dpa_show);
120 + 
121 + 	if (test_bit(CXL_POISON_ENABLED_INJECT, cxlds->poison.enabled_cmds))
122 + 		debugfs_create_file("inject_poison", 0200, dentry, cxlmd,
123 + 				    &cxl_poison_inject_fops);
124 + 	if (test_bit(CXL_POISON_ENABLED_CLEAR, cxlds->poison.enabled_cmds))
125 + 		debugfs_create_file("clear_poison", 0200, dentry, cxlmd,
126 + 				    &cxl_poison_clear_fops);
127 + 
140 128 	rc = devm_add_action_or_reset(dev, remove_debugfs, dentry);
141 129 	if (rc)
142 130 		return rc;
···
204 176 	return devm_add_action_or_reset(dev, enable_suspend, NULL);
205 177 }
206 178 
179 + static ssize_t trigger_poison_list_store(struct device *dev,
180 + 					 struct device_attribute *attr,
181 + 					 const char *buf, size_t len)
182 + {
183 + 	bool trigger;
184 + 	int rc;
185 + 
186 + 	if (kstrtobool(buf, &trigger) || !trigger)
187 + 		return -EINVAL;
188 + 
189 + 	rc = cxl_trigger_poison_list(to_cxl_memdev(dev));
190 + 
191 + 	return rc ? rc : len;
192 + }
193 + static DEVICE_ATTR_WO(trigger_poison_list);
194 + 
195 + static umode_t cxl_mem_visible(struct kobject *kobj, struct attribute *a, int n)
196 + {
197 + 	if (a == &dev_attr_trigger_poison_list.attr) {
198 + 		struct device *dev = kobj_to_dev(kobj);
199 + 
200 + 		if (!test_bit(CXL_POISON_ENABLED_LIST,
201 + 			      to_cxl_memdev(dev)->cxlds->poison.enabled_cmds))
202 + 			return 0;
203 + 	}
204 + 	return a->mode;
205 + }
206 + 
207 + static struct attribute *cxl_mem_attrs[] = {
208 + 	&dev_attr_trigger_poison_list.attr,
209 + 	NULL
210 + };
211 + 
212 + static struct attribute_group cxl_mem_group = {
213 + 	.attrs = cxl_mem_attrs,
214 + 	.is_visible = cxl_mem_visible,
215 + };
216 + 
217 + __ATTRIBUTE_GROUPS(cxl_mem);
218 + 
207 219 static struct cxl_driver cxl_mem_driver = {
208 220 	.name = "cxl_mem",
209 221 	.probe = cxl_mem_probe,
210 222 	.id = CXL_DEVICE_MEMORY_EXPANDER,
223 + 	.drv = {
224 + 		.dev_groups = cxl_mem_groups,
225 + 	},
211 226 };
212 227 
213 228 module_cxl_driver(cxl_mem_driver);
+4 -49
drivers/cxl/pci.c
···
8 8 #include <linux/mutex.h>
9 9 #include <linux/list.h>
10 10 #include <linux/pci.h>
11 - #include <linux/pci-doe.h>
12 11 #include <linux/aer.h>
13 12 #include <linux/io.h>
14 13 #include "cxlmem.h"
···
354 355 	cxl_unmap_regblock(pdev, map);
355 356 
356 357 	return rc;
357 - }
358 - 
359 - static void cxl_pci_destroy_doe(void *mbs)
360 - {
361 - 	xa_destroy(mbs);
362 - }
363 - 
364 - static void devm_cxl_pci_create_doe(struct cxl_dev_state *cxlds)
365 - {
366 - 	struct device *dev = cxlds->dev;
367 - 	struct pci_dev *pdev = to_pci_dev(dev);
368 - 	u16 off = 0;
369 - 
370 - 	xa_init(&cxlds->doe_mbs);
371 - 	if (devm_add_action(&pdev->dev, cxl_pci_destroy_doe, &cxlds->doe_mbs)) {
372 - 		dev_err(dev, "Failed to create XArray for DOE's\n");
373 - 		return;
374 - 	}
375 - 
376 - 	/*
377 - 	 * Mailbox creation is best effort. Higher layers must determine if
378 - 	 * the lack of a mailbox for their protocol is a device failure or not.
379 - 	 */
380 - 	pci_doe_for_each_off(pdev, off) {
381 - 		struct pci_doe_mb *doe_mb;
382 - 
383 - 		doe_mb = pcim_doe_create_mb(pdev, off);
384 - 		if (IS_ERR(doe_mb)) {
385 - 			dev_err(dev, "Failed to create MB object for MB @ %x\n",
386 - 				off);
387 - 			continue;
388 - 		}
389 - 
390 - 		if (!pci_request_config_region_exclusive(pdev, off,
391 - 							 PCI_DOE_CAP_SIZEOF,
392 - 							 dev_name(dev)))
393 - 			pci_err(pdev, "Failed to exclude DOE registers\n");
394 - 
395 - 		if (xa_insert(&cxlds->doe_mbs, off, doe_mb, GFP_KERNEL)) {
396 - 			dev_err(dev, "xa_insert failed to insert MB @ %x\n",
397 - 				off);
398 - 			continue;
399 - 		}
400 - 
401 - 		dev_dbg(dev, "Created DOE mailbox @%x\n", off);
402 - 	}
403 358 }
404 359 
405 360 /*
···
703 750 
704 751 	cxlds->component_reg_phys = map.resource;
705 752 
706 - 	devm_cxl_pci_create_doe(cxlds);
707 - 
708 753 	rc = cxl_map_component_regs(&pdev->dev, &cxlds->regs.component,
709 754 				    &map, BIT(CXL_CM_CAP_CAP_ID_RAS));
710 755 	if (rc)
···
717 766 		return rc;
718 767 
719 768 	rc = cxl_set_timestamp(cxlds);
769 + 	if (rc)
770 + 		return rc;
771 + 
772 + 	rc = cxl_poison_state_init(cxlds);
720 773 	if (rc)
721 774 		return rc;
722 775 
+14 -6
drivers/cxl/port.c
···
66 66 	if (rc < 0)
67 67 		return rc;
68 68 
69 - 	if (rc == 1)
70 - 		return devm_cxl_add_passthrough_decoder(port);
71 - 
72 69 	cxlhdm = devm_cxl_setup_hdm(port, NULL);
73 - 	if (IS_ERR(cxlhdm))
74 - 		return PTR_ERR(cxlhdm);
70 + 	if (!IS_ERR(cxlhdm))
71 + 		return devm_cxl_enumerate_decoders(cxlhdm, NULL);
75 72 
76 - 	return devm_cxl_enumerate_decoders(cxlhdm, NULL);
73 + 	if (PTR_ERR(cxlhdm) != -ENODEV) {
74 + 		dev_err(&port->dev, "Failed to map HDM decoder capability\n");
75 + 		return PTR_ERR(cxlhdm);
76 + 	}
77 + 
78 + 	if (rc == 1) {
79 + 		dev_dbg(&port->dev, "Fallback to passthrough decoder\n");
80 + 		return devm_cxl_add_passthrough_decoder(port);
81 + 	}
82 + 
83 + 	dev_err(&port->dev, "HDM decoder capability not found\n");
84 + 	return -ENXIO;
77 85 }
78 86 
79 87 static int cxl_endpoint_port_probe(struct cxl_port *port)
+249 -79
drivers/pci/doe.c
··· 20 20 #include <linux/pci-doe.h> 21 21 #include <linux/workqueue.h> 22 22 23 + #include "pci.h" 24 + 23 25 #define PCI_DOE_PROTOCOL_DISCOVERY 0 24 26 25 27 /* Timeout of 1 second from 6.30.2 Operation, PCI Spec r6.0 */ ··· 39 37 * 40 38 * This state is used to manage a single DOE mailbox capability. All fields 41 39 * should be considered opaque to the consumers and the structure passed into 42 - * the helpers below after being created by devm_pci_doe_create() 40 + * the helpers below after being created by pci_doe_create_mb(). 43 41 * 44 42 * @pdev: PCI device this mailbox belongs to 45 43 * @cap_offset: Capability offset ··· 56 54 wait_queue_head_t wq; 57 55 struct workqueue_struct *work_queue; 58 56 unsigned long flags; 57 + }; 58 + 59 + struct pci_doe_protocol { 60 + u16 vid; 61 + u8 type; 62 + }; 63 + 64 + /** 65 + * struct pci_doe_task - represents a single query/response 66 + * 67 + * @prot: DOE Protocol 68 + * @request_pl: The request payload 69 + * @request_pl_sz: Size of the request payload (bytes) 70 + * @response_pl: The response payload 71 + * @response_pl_sz: Size of the response payload (bytes) 72 + * @rv: Return value. 
Length of received response or error (bytes) 73 + * @complete: Called when task is complete 74 + * @private: Private data for the consumer 75 + * @work: Used internally by the mailbox 76 + * @doe_mb: Used internally by the mailbox 77 + */ 78 + struct pci_doe_task { 79 + struct pci_doe_protocol prot; 80 + const __le32 *request_pl; 81 + size_t request_pl_sz; 82 + __le32 *response_pl; 83 + size_t response_pl_sz; 84 + int rv; 85 + void (*complete)(struct pci_doe_task *task); 86 + void *private; 87 + 88 + /* initialized by pci_doe_submit_task() */ 89 + struct work_struct work; 90 + struct pci_doe_mb *doe_mb; 59 91 }; 60 92 61 93 static int pci_doe_wait(struct pci_doe_mb *doe_mb, unsigned long timeout) ··· 146 110 { 147 111 struct pci_dev *pdev = doe_mb->pdev; 148 112 int offset = doe_mb->cap_offset; 149 - size_t length; 113 + size_t length, remainder; 150 114 u32 val; 151 115 int i; 152 116 ··· 164 128 return -EIO; 165 129 166 130 /* Length is 2 DW of header + length of payload in DW */ 167 - length = 2 + task->request_pl_sz / sizeof(__le32); 131 + length = 2 + DIV_ROUND_UP(task->request_pl_sz, sizeof(__le32)); 168 132 if (length > PCI_DOE_MAX_LENGTH) 169 133 return -EIO; 170 134 if (length == PCI_DOE_MAX_LENGTH) ··· 177 141 pci_write_config_dword(pdev, offset + PCI_DOE_WRITE, 178 142 FIELD_PREP(PCI_DOE_DATA_OBJECT_HEADER_2_LENGTH, 179 143 length)); 144 + 145 + /* Write payload */ 180 146 for (i = 0; i < task->request_pl_sz / sizeof(__le32); i++) 181 147 pci_write_config_dword(pdev, offset + PCI_DOE_WRITE, 182 148 le32_to_cpu(task->request_pl[i])); 149 + 150 + /* Write last payload dword */ 151 + remainder = task->request_pl_sz % sizeof(__le32); 152 + if (remainder) { 153 + val = 0; 154 + memcpy(&val, &task->request_pl[i], remainder); 155 + le32_to_cpus(&val); 156 + pci_write_config_dword(pdev, offset + PCI_DOE_WRITE, val); 157 + } 183 158 184 159 pci_doe_write_ctrl(doe_mb, PCI_DOE_CTRL_GO); 185 160 ··· 211 164 212 165 static int pci_doe_recv_resp(struct pci_doe_mb 
*doe_mb, struct pci_doe_task *task) 213 166 { 167 + size_t length, payload_length, remainder, received; 214 168 struct pci_dev *pdev = doe_mb->pdev; 215 169 int offset = doe_mb->cap_offset; 216 - size_t length, payload_length; 170 + int i = 0; 217 171 u32 val; 218 - int i; 219 172 220 173 /* Read the first dword to get the protocol */ 221 174 pci_read_config_dword(pdev, offset + PCI_DOE_READ, &val); ··· 242 195 243 196 /* First 2 dwords have already been read */ 244 197 length -= 2; 245 - payload_length = min(length, task->response_pl_sz / sizeof(__le32)); 246 - /* Read the rest of the response payload */ 247 - for (i = 0; i < payload_length; i++) { 198 + received = task->response_pl_sz; 199 + payload_length = DIV_ROUND_UP(task->response_pl_sz, sizeof(__le32)); 200 + remainder = task->response_pl_sz % sizeof(__le32); 201 + 202 + /* remainder signifies number of data bytes in last payload dword */ 203 + if (!remainder) 204 + remainder = sizeof(__le32); 205 + 206 + if (length < payload_length) { 207 + received = length * sizeof(__le32); 208 + payload_length = length; 209 + remainder = sizeof(__le32); 210 + } 211 + 212 + if (payload_length) { 213 + /* Read all payload dwords except the last */ 214 + for (; i < payload_length - 1; i++) { 215 + pci_read_config_dword(pdev, offset + PCI_DOE_READ, 216 + &val); 217 + task->response_pl[i] = cpu_to_le32(val); 218 + pci_write_config_dword(pdev, offset + PCI_DOE_READ, 0); 219 + } 220 + 221 + /* Read last payload dword */ 248 222 pci_read_config_dword(pdev, offset + PCI_DOE_READ, &val); 249 - task->response_pl[i] = cpu_to_le32(val); 223 + cpu_to_le32s(&val); 224 + memcpy(&task->response_pl[i], &val, remainder); 250 225 /* Prior to the last ack, ensure Data Object Ready */ 251 - if (i == (payload_length - 1) && !pci_doe_data_obj_ready(doe_mb)) 226 + if (!pci_doe_data_obj_ready(doe_mb)) 252 227 return -EIO; 253 228 pci_write_config_dword(pdev, offset + PCI_DOE_READ, 0); 229 + i++; 254 230 } 255 231 256 232 /* Flush excess length 
*/ ··· 287 217 if (FIELD_GET(PCI_DOE_STATUS_ERROR, val)) 288 218 return -EIO; 289 219 290 - return min(length, task->response_pl_sz / sizeof(__le32)) * sizeof(__le32); 220 + return received; 291 221 } 292 222 293 223 static void signal_task_complete(struct pci_doe_task *task, int rv) ··· 391 321 __le32 request_pl_le = cpu_to_le32(request_pl); 392 322 __le32 response_pl_le; 393 323 u32 response_pl; 394 - DECLARE_COMPLETION_ONSTACK(c); 395 - struct pci_doe_task task = { 396 - .prot.vid = PCI_VENDOR_ID_PCI_SIG, 397 - .prot.type = PCI_DOE_PROTOCOL_DISCOVERY, 398 - .request_pl = &request_pl_le, 399 - .request_pl_sz = sizeof(request_pl), 400 - .response_pl = &response_pl_le, 401 - .response_pl_sz = sizeof(response_pl), 402 - .complete = pci_doe_task_complete, 403 - .private = &c, 404 - }; 405 324 int rc; 406 325 407 - rc = pci_doe_submit_task(doe_mb, &task); 326 + rc = pci_doe(doe_mb, PCI_VENDOR_ID_PCI_SIG, PCI_DOE_PROTOCOL_DISCOVERY, 327 + &request_pl_le, sizeof(request_pl_le), 328 + &response_pl_le, sizeof(response_pl_le)); 408 329 if (rc < 0) 409 330 return rc; 410 331 411 - wait_for_completion(&c); 412 - 413 - if (task.rv != sizeof(response_pl)) 332 + if (rc != sizeof(response_pl_le)) 414 333 return -EIO; 415 334 416 335 response_pl = le32_to_cpu(response_pl_le); ··· 444 385 return 0; 445 386 } 446 387 447 - static void pci_doe_xa_destroy(void *mb) 388 + static void pci_doe_cancel_tasks(struct pci_doe_mb *doe_mb) 448 389 { 449 - struct pci_doe_mb *doe_mb = mb; 450 - 451 - xa_destroy(&doe_mb->prots); 452 - } 453 - 454 - static void pci_doe_destroy_workqueue(void *mb) 455 - { 456 - struct pci_doe_mb *doe_mb = mb; 457 - 458 - destroy_workqueue(doe_mb->work_queue); 459 - } 460 - 461 - static void pci_doe_flush_mb(void *mb) 462 - { 463 - struct pci_doe_mb *doe_mb = mb; 464 - 465 390 /* Stop all pending work items from starting */ 466 391 set_bit(PCI_DOE_FLAG_DEAD, &doe_mb->flags); 467 392 468 393 /* Cancel an in progress work item, if necessary */ 469 394 
set_bit(PCI_DOE_FLAG_CANCEL, &doe_mb->flags); 470 395 wake_up(&doe_mb->wq); 471 - 472 - /* Flush all work items */ 473 - flush_workqueue(doe_mb->work_queue); 474 396 } 475 397 476 398 /** 477 - * pcim_doe_create_mb() - Create a DOE mailbox object 399 + * pci_doe_create_mb() - Create a DOE mailbox object 478 400 * 479 401 * @pdev: PCI device to create the DOE mailbox for 480 402 * @cap_offset: Offset of the DOE mailbox ··· 466 426 * RETURNS: created mailbox object on success 467 427 * ERR_PTR(-errno) on failure 468 428 */ 469 - struct pci_doe_mb *pcim_doe_create_mb(struct pci_dev *pdev, u16 cap_offset) 429 + static struct pci_doe_mb *pci_doe_create_mb(struct pci_dev *pdev, 430 + u16 cap_offset) 470 431 { 471 432 struct pci_doe_mb *doe_mb; 472 - struct device *dev = &pdev->dev; 473 433 int rc; 474 434 475 - doe_mb = devm_kzalloc(dev, sizeof(*doe_mb), GFP_KERNEL); 435 + doe_mb = kzalloc(sizeof(*doe_mb), GFP_KERNEL); 476 436 if (!doe_mb) 477 437 return ERR_PTR(-ENOMEM); 478 438 479 439 doe_mb->pdev = pdev; 480 440 doe_mb->cap_offset = cap_offset; 481 441 init_waitqueue_head(&doe_mb->wq); 482 - 483 442 xa_init(&doe_mb->prots); 484 - rc = devm_add_action(dev, pci_doe_xa_destroy, doe_mb); 485 - if (rc) 486 - return ERR_PTR(rc); 487 443 488 444 doe_mb->work_queue = alloc_ordered_workqueue("%s %s DOE [%x]", 0, 489 - dev_driver_string(&pdev->dev), 445 + dev_bus_name(&pdev->dev), 490 446 pci_name(pdev), 491 447 doe_mb->cap_offset); 492 448 if (!doe_mb->work_queue) { 493 449 pci_err(pdev, "[%x] failed to allocate work queue\n", 494 450 doe_mb->cap_offset); 495 - return ERR_PTR(-ENOMEM); 451 + rc = -ENOMEM; 452 + goto err_free; 496 453 } 497 - rc = devm_add_action_or_reset(dev, pci_doe_destroy_workqueue, doe_mb); 498 - if (rc) 499 - return ERR_PTR(rc); 500 454 501 455 /* Reset the mailbox by issuing an abort */ 502 456 rc = pci_doe_abort(doe_mb); 503 457 if (rc) { 504 458 pci_err(pdev, "[%x] failed to reset mailbox with abort command : %d\n", 505 459 doe_mb->cap_offset, rc); 
506 - return ERR_PTR(rc); 460 + goto err_destroy_wq; 507 461 } 508 462 509 463 /* 510 464 * The state machine and the mailbox should be in sync now; 511 - * Set up mailbox flush prior to using the mailbox to query protocols. 465 + * Use the mailbox to query protocols. 512 466 */ 513 - rc = devm_add_action_or_reset(dev, pci_doe_flush_mb, doe_mb); 514 - if (rc) 515 - return ERR_PTR(rc); 516 - 517 467 rc = pci_doe_cache_protocols(doe_mb); 518 468 if (rc) { 519 469 pci_err(pdev, "[%x] failed to cache protocols : %d\n", 520 470 doe_mb->cap_offset, rc); 521 - return ERR_PTR(rc); 471 + goto err_cancel; 522 472 } 523 473 524 474 return doe_mb; 475 + 476 + err_cancel: 477 + pci_doe_cancel_tasks(doe_mb); 478 + xa_destroy(&doe_mb->prots); 479 + err_destroy_wq: 480 + destroy_workqueue(doe_mb->work_queue); 481 + err_free: 482 + kfree(doe_mb); 483 + return ERR_PTR(rc); 525 484 } 526 - EXPORT_SYMBOL_GPL(pcim_doe_create_mb); 485 + 486 + /** 487 + * pci_doe_destroy_mb() - Destroy a DOE mailbox object 488 + * 489 + * @doe_mb: DOE mailbox 490 + * 491 + * Destroy all internal data structures created for the DOE mailbox. 
+ */
+static void pci_doe_destroy_mb(struct pci_doe_mb *doe_mb)
+{
+	pci_doe_cancel_tasks(doe_mb);
+	xa_destroy(&doe_mb->prots);
+	destroy_workqueue(doe_mb->work_queue);
+	kfree(doe_mb);
+}

 /**
  * pci_doe_supports_prot() - Return if the DOE instance supports the given
···
  *
  * RETURNS: True if the DOE mailbox supports the protocol specified
  */
-bool pci_doe_supports_prot(struct pci_doe_mb *doe_mb, u16 vid, u8 type)
+static bool pci_doe_supports_prot(struct pci_doe_mb *doe_mb, u16 vid, u8 type)
 {
 	unsigned long index;
 	void *entry;
···

 	return false;
 }
-EXPORT_SYMBOL_GPL(pci_doe_supports_prot);

 /**
  * pci_doe_submit_task() - Submit a task to be processed by the state machine
···
  *
  * RETURNS: 0 when task has been successfully queued, -ERRNO on error
  */
-int pci_doe_submit_task(struct pci_doe_mb *doe_mb, struct pci_doe_task *task)
+static int pci_doe_submit_task(struct pci_doe_mb *doe_mb,
+			       struct pci_doe_task *task)
 {
 	if (!pci_doe_supports_prot(doe_mb, task->prot.vid, task->prot.type))
-		return -EINVAL;
-
-	/*
-	 * DOE requests must be a whole number of DW and the response needs to
-	 * be big enough for at least 1 DW
-	 */
-	if (task->request_pl_sz % sizeof(__le32) ||
-	    task->response_pl_sz < sizeof(__le32))
 		return -EINVAL;

 	if (test_bit(PCI_DOE_FLAG_DEAD, &doe_mb->flags))
···
 	queue_work(doe_mb->work_queue, &task->work);
 	return 0;
 }
-EXPORT_SYMBOL_GPL(pci_doe_submit_task);
+
+/**
+ * pci_doe() - Perform Data Object Exchange
+ *
+ * @doe_mb: DOE Mailbox
+ * @vendor: Vendor ID
+ * @type: Data Object Type
+ * @request: Request payload
+ * @request_sz: Size of request payload (bytes)
+ * @response: Response payload
+ * @response_sz: Size of response payload (bytes)
+ *
+ * Submit @request to @doe_mb and store the @response.
+ * The DOE exchange is performed synchronously and may therefore sleep.
+ *
+ * Payloads are treated as opaque byte streams which are transmitted verbatim,
+ * without byte-swapping. If payloads contain little-endian register values,
+ * the caller is responsible for conversion with cpu_to_le32() / le32_to_cpu().
+ *
+ * For convenience, arbitrary payload sizes are allowed even though PCIe r6.0
+ * sec 6.30.1 specifies the Data Object Header 2 "Length" in dwords. The last
+ * (partial) dword is copied with byte granularity and padded with zeroes if
+ * necessary. Callers are thus relieved of using dword-sized bounce buffers.
+ *
+ * RETURNS: Length of received response or negative errno.
+ * Received data in excess of @response_sz is discarded.
+ * The length may be smaller than @response_sz and the caller
+ * is responsible for checking that.
+ */
+int pci_doe(struct pci_doe_mb *doe_mb, u16 vendor, u8 type,
+	    const void *request, size_t request_sz,
+	    void *response, size_t response_sz)
+{
+	DECLARE_COMPLETION_ONSTACK(c);
+	struct pci_doe_task task = {
+		.prot.vid = vendor,
+		.prot.type = type,
+		.request_pl = request,
+		.request_pl_sz = request_sz,
+		.response_pl = response,
+		.response_pl_sz = response_sz,
+		.complete = pci_doe_task_complete,
+		.private = &c,
+	};
+	int rc;
+
+	rc = pci_doe_submit_task(doe_mb, &task);
+	if (rc)
+		return rc;
+
+	wait_for_completion(&c);
+
+	return task.rv;
+}
+EXPORT_SYMBOL_GPL(pci_doe);
+
+/**
+ * pci_find_doe_mailbox() - Find Data Object Exchange mailbox
+ *
+ * @pdev: PCI device
+ * @vendor: Vendor ID
+ * @type: Data Object Type
+ *
+ * Find first DOE mailbox of a PCI device which supports the given protocol.
+ *
+ * RETURNS: Pointer to the DOE mailbox or NULL if none was found.
+ */
+struct pci_doe_mb *pci_find_doe_mailbox(struct pci_dev *pdev, u16 vendor,
+					u8 type)
+{
+	struct pci_doe_mb *doe_mb;
+	unsigned long index;
+
+	xa_for_each(&pdev->doe_mbs, index, doe_mb)
+		if (pci_doe_supports_prot(doe_mb, vendor, type))
+			return doe_mb;
+
+	return NULL;
+}
+EXPORT_SYMBOL_GPL(pci_find_doe_mailbox);
+
+void pci_doe_init(struct pci_dev *pdev)
+{
+	struct pci_doe_mb *doe_mb;
+	u16 offset = 0;
+	int rc;
+
+	xa_init(&pdev->doe_mbs);
+
+	while ((offset = pci_find_next_ext_capability(pdev, offset,
+						      PCI_EXT_CAP_ID_DOE))) {
+		doe_mb = pci_doe_create_mb(pdev, offset);
+		if (IS_ERR(doe_mb)) {
+			pci_err(pdev, "[%x] failed to create mailbox: %ld\n",
+				offset, PTR_ERR(doe_mb));
+			continue;
+		}
+
+		rc = xa_insert(&pdev->doe_mbs, offset, doe_mb, GFP_KERNEL);
+		if (rc) {
+			pci_err(pdev, "[%x] failed to insert mailbox: %d\n",
+				offset, rc);
+			pci_doe_destroy_mb(doe_mb);
+		}
+	}
+}
+
+void pci_doe_destroy(struct pci_dev *pdev)
+{
+	struct pci_doe_mb *doe_mb;
+	unsigned long index;
+
+	xa_for_each(&pdev->doe_mbs, index, doe_mb)
+		pci_doe_destroy_mb(doe_mb);
+
+	xa_destroy(&pdev->doe_mbs);
+}
+
+void pci_doe_disconnected(struct pci_dev *pdev)
+{
+	struct pci_doe_mb *doe_mb;
+	unsigned long index;
+
+	xa_for_each(&pdev->doe_mbs, index, doe_mb)
+		pci_doe_cancel_tasks(doe_mb);
+}
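The pci_doe() kerneldoc above states that payloads are opaque byte streams of arbitrary size and that the last partial dword is zero-padded. The following userspace sketch only illustrates that packing rule; it is not the kernel implementation. The names doe_payload_dwords() and doe_pack_payload() are hypothetical and do not exist in drivers/pci/doe.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: dwords needed to carry `bytes` payload bytes. */
static size_t doe_payload_dwords(size_t bytes)
{
	return (bytes + sizeof(uint32_t) - 1) / sizeof(uint32_t);
}

/*
 * Hypothetical helper: copy `len` payload bytes into the dword buffer `dw`
 * (capacity `dw_cap` dwords) without byte-swapping, zero-filling the unused
 * tail of the last dword. Returns the number of dwords used.
 */
static size_t doe_pack_payload(uint32_t *dw, size_t dw_cap,
			       const void *payload, size_t len)
{
	size_t n = doe_payload_dwords(len);

	assert(n <= dw_cap);
	memset(dw, 0, n * sizeof(uint32_t));	/* zero padding first */
	memcpy(dw, payload, len);		/* bytes copied verbatim */
	return n;
}
```

With a 5-byte request, doe_pack_payload() uses two dwords and leaves the three trailing bytes of the second dword at zero, which is the situation the kerneldoc describes as relieving callers of dword-sized bounce buffers.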
+11
drivers/pci/pci.h
···
 	bool drivers_autoprobe;		/* Auto probing of VFs by driver */
 };

+#ifdef CONFIG_PCI_DOE
+void pci_doe_init(struct pci_dev *pdev);
+void pci_doe_destroy(struct pci_dev *pdev);
+void pci_doe_disconnected(struct pci_dev *pdev);
+#else
+static inline void pci_doe_init(struct pci_dev *pdev) { }
+static inline void pci_doe_destroy(struct pci_dev *pdev) { }
+static inline void pci_doe_disconnected(struct pci_dev *pdev) { }
+#endif
+
 /**
  * pci_dev_set_io_state - Set the new error state if possible.
  *
···
 static inline int pci_dev_set_disconnected(struct pci_dev *dev, void *unused)
 {
 	pci_dev_set_io_state(dev, pci_channel_io_perm_failure);
+	pci_doe_disconnected(dev);

 	return 0;
 }
+1
drivers/pci/probe.c
···
 	pci_aer_init(dev);		/* Advanced Error Reporting */
 	pci_dpc_init(dev);		/* Downstream Port Containment */
 	pci_rcec_init(dev);		/* Root Complex Event Collector */
+	pci_doe_init(dev);		/* Data Object Exchange */

 	pcie_report_downtraining(dev);
 	pci_init_reset_methods(dev);
+1
drivers/pci/remove.c
···
 	list_del(&dev->bus_list);
 	up_write(&pci_bus_sem);

+	pci_doe_destroy(dev);
 	pcie_aspm_exit_link_state(dev);
 	pci_bridge_d3_update(dev);
 	pci_free_resources(dev);
+5 -61
include/linux/pci-doe.h
···
 #ifndef LINUX_PCI_DOE_H
 #define LINUX_PCI_DOE_H

-struct pci_doe_protocol {
-	u16 vid;
-	u8 type;
-};
-
 struct pci_doe_mb;

-/**
- * struct pci_doe_task - represents a single query/response
- *
- * @prot: DOE Protocol
- * @request_pl: The request payload
- * @request_pl_sz: Size of the request payload (bytes)
- * @response_pl: The response payload
- * @response_pl_sz: Size of the response payload (bytes)
- * @rv: Return value. Length of received response or error (bytes)
- * @complete: Called when task is complete
- * @private: Private data for the consumer
- * @work: Used internally by the mailbox
- * @doe_mb: Used internally by the mailbox
- *
- * Payloads are treated as opaque byte streams which are transmitted verbatim,
- * without byte-swapping. If payloads contain little-endian register values,
- * the caller is responsible for conversion with cpu_to_le32() / le32_to_cpu().
- *
- * The payload sizes and rv are specified in bytes with the following
- * restrictions concerning the protocol.
- *
- *   1) The request_pl_sz must be a multiple of double words (4 bytes)
- *   2) The response_pl_sz must be >= a single double word (4 bytes)
- *   3) rv is returned as bytes but it will be a multiple of double words
- *
- * NOTE there is no need for the caller to initialize work or doe_mb.
- */
-struct pci_doe_task {
-	struct pci_doe_protocol prot;
-	__le32 *request_pl;
-	size_t request_pl_sz;
-	__le32 *response_pl;
-	size_t response_pl_sz;
-	int rv;
-	void (*complete)(struct pci_doe_task *task);
-	void *private;
+struct pci_doe_mb *pci_find_doe_mailbox(struct pci_dev *pdev, u16 vendor,
+					u8 type);

-	/* No need for the user to initialize these fields */
-	struct work_struct work;
-	struct pci_doe_mb *doe_mb;
-};
-
-/**
- * pci_doe_for_each_off - Iterate each DOE capability
- * @pdev: struct pci_dev to iterate
- * @off: u16 of config space offset of each mailbox capability found
- */
-#define pci_doe_for_each_off(pdev, off) \
-	for (off = pci_find_next_ext_capability(pdev, off, \
-					PCI_EXT_CAP_ID_DOE); \
-		off > 0; \
-		off = pci_find_next_ext_capability(pdev, off, \
-					PCI_EXT_CAP_ID_DOE))
-
-struct pci_doe_mb *pcim_doe_create_mb(struct pci_dev *pdev, u16 cap_offset);
-bool pci_doe_supports_prot(struct pci_doe_mb *doe_mb, u16 vid, u8 type);
-int pci_doe_submit_task(struct pci_doe_mb *doe_mb, struct pci_doe_task *task);
+int pci_doe(struct pci_doe_mb *doe_mb, u16 vendor, u8 type,
+	    const void *request, size_t request_sz,
+	    void *response, size_t response_sz);

 #endif
+3
include/linux/pci.h
···
 #ifdef CONFIG_PCI_P2PDMA
 	struct pci_p2pdma __rcu *p2pdma;
 #endif
+#ifdef CONFIG_PCI_DOE
+	struct xarray	doe_mbs;	/* Data Object Exchange mailboxes */
+#endif
 	u16		acs_cap;	/* ACS Capability offset */
 	phys_addr_t	rom;		/* Physical address if not from BAR */
 	size_t		romlen;		/* Length if not from BAR */
+30 -5
include/uapi/linux/cxl_mem.h
···
 	___C(SET_ALERT_CONFIG, "Set Alert Configuration"),                \
 	___C(GET_SHUTDOWN_STATE, "Get Shutdown State"),                   \
 	___C(SET_SHUTDOWN_STATE, "Set Shutdown State"),                   \
-	___C(GET_POISON, "Get Poison List"),                              \
-	___C(INJECT_POISON, "Inject Poison"),                             \
-	___C(CLEAR_POISON, "Clear Poison"),                               \
+	___DEPRECATED(GET_POISON, "Get Poison List"),                     \
+	___DEPRECATED(INJECT_POISON, "Inject Poison"),                    \
+	___DEPRECATED(CLEAR_POISON, "Clear Poison"),                      \
 	___C(GET_SCAN_MEDIA_CAPS, "Get Scan Media Capabilities"),         \
-	___C(SCAN_MEDIA, "Scan Media"),                                   \
-	___C(GET_SCAN_MEDIA, "Get Scan Media Results"),                   \
+	___DEPRECATED(SCAN_MEDIA, "Scan Media"),                          \
+	___DEPRECATED(GET_SCAN_MEDIA, "Get Scan Media Results"),          \
 	___C(MAX, "invalid / last command")

 #define ___C(a, b) CXL_MEM_COMMAND_ID_##a
+#define ___DEPRECATED(a, b) CXL_MEM_DEPRECATED_ID_##a
 enum { CXL_CMDS };

 #undef ___C
+#undef ___DEPRECATED
 #define ___C(a, b) { b }
+#define ___DEPRECATED(a, b) { "Deprecated " b }
 static const struct {
 	const char *name;
 } cxl_command_names[] __attribute__((__unused__)) = { CXL_CMDS };
···
  */

 #undef ___C
+#undef ___DEPRECATED
+#define ___C(a, b) (0)
+#define ___DEPRECATED(a, b) (1)
+
+static const __u8 cxl_deprecated_commands[]
+	__attribute__((__unused__)) = { CXL_CMDS };
+
+/*
+ * Here's how this actually breaks out:
+ * cxl_deprecated_commands[] = {
+ *	[CXL_MEM_COMMAND_ID_INVALID] = 0,
+ *	[CXL_MEM_COMMAND_ID_IDENTIFY] = 0,
+ *	...
+ *	[CXL_MEM_DEPRECATED_ID_GET_POISON] = 1,
+ *	[CXL_MEM_DEPRECATED_ID_INJECT_POISON] = 1,
+ *	[CXL_MEM_DEPRECATED_ID_CLEAR_POISON] = 1,
+ *	...
+ * };
+ */
+
+#undef ___C
+#undef ___DEPRECATED

 /**
  * struct cxl_command_info - Command information returned from a query.
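The hunk above expands the single CXL_CMDS list three times: once into an enum of IDs, once into a name table, and once into a parallel 0/1 "deprecated" table. A minimal standalone sketch of that X-macro pattern follows. The list DEMO_CMDS, the macro names X_C / X_DEPRECATED, and the helper functions are hypothetical and chosen for illustration (the sketch avoids the header's leading-underscore macro names, which are reserved identifiers in ordinary user code).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical command list; only its shape mirrors CXL_CMDS. */
#define DEMO_CMDS                                       \
	X_C(INVALID, "Invalid Command"),                \
	X_C(IDENTIFY, "Identify Command"),              \
	X_DEPRECATED(GET_POISON, "Get Poison List"),    \
	X_C(MAX, "invalid / last command")

/* Expansion 1: command IDs. Deprecated entries keep their enum slot. */
#define X_C(a, b) DEMO_COMMAND_ID_##a
#define X_DEPRECATED(a, b) DEMO_DEPRECATED_ID_##a
enum { DEMO_CMDS };
#undef X_C
#undef X_DEPRECATED

/* Expansion 2: names, with a "Deprecated " prefix where applicable. */
#define X_C(a, b) { b }
#define X_DEPRECATED(a, b) { "Deprecated " b }
static const struct {
	const char *name;
} demo_command_names[] = { DEMO_CMDS };
#undef X_C
#undef X_DEPRECATED

/* Expansion 3: parallel table of 0/1 deprecation flags. */
#define X_C(a, b) (0)
#define X_DEPRECATED(a, b) (1)
static const unsigned char demo_deprecated_commands[] = { DEMO_CMDS };
#undef X_C
#undef X_DEPRECATED

/* Returns 1 if the ID is marked deprecated, 0 otherwise (or out of range). */
static int demo_command_is_deprecated(unsigned int id)
{
	if (id >= sizeof(demo_deprecated_commands))
		return 0;
	return demo_deprecated_commands[id];
}

/* Returns the ID whose name matches exactly, or -1 if there is none. */
static int demo_command_find(const char *name)
{
	size_t n = sizeof(demo_command_names) / sizeof(demo_command_names[0]);

	assert(name != NULL);
	for (size_t i = 0; i < n; i++)
		if (strcmp(demo_command_names[i].name, name) == 0)
			return (int)i;
	return -1;
}
```

Because all three tables come from one list, the numeric value of a retired command stays stable while its symbol changes from a COMMAND_ID to a DEPRECATED_ID name, so code that still refers to the old symbol no longer compiles.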
+1
tools/testing/cxl/config_check.c
···
 	BUILD_BUG_ON(!IS_MODULE(CONFIG_CXL_PMEM));
 	BUILD_BUG_ON(!IS_ENABLED(CONFIG_CXL_REGION_INVALIDATION_TEST));
 	BUILD_BUG_ON(!IS_ENABLED(CONFIG_NVDIMM_SECURITY_TEST));
+	BUILD_BUG_ON(!IS_ENABLED(CONFIG_DEBUG_FS));
 }
+247
tools/testing/cxl/test/mem.c
···
 #include <linux/delay.h>
 #include <linux/sizes.h>
 #include <linux/bits.h>
+#include <asm/unaligned.h>
 #include <cxlmem.h>

 #include "trace.h"
···
 #define LSA_SIZE SZ_128K
 #define DEV_SIZE SZ_2G
 #define EFFECT(x) (1U << x)
+
+#define MOCK_INJECT_DEV_MAX 8
+#define MOCK_INJECT_TEST_MAX 128
+
+static unsigned int poison_inject_dev_max = MOCK_INJECT_DEV_MAX;

 static struct cxl_cel_entry mock_cel[] = {
 	{
···
 	},
 	{
 		.opcode = cpu_to_le16(CXL_MBOX_OP_GET_HEALTH_INFO),
+		.effect = cpu_to_le16(0),
+	},
+	{
+		.opcode = cpu_to_le16(CXL_MBOX_OP_GET_POISON),
+		.effect = cpu_to_le16(0),
+	},
+	{
+		.opcode = cpu_to_le16(CXL_MBOX_OP_INJECT_POISON),
+		.effect = cpu_to_le16(0),
+	},
+	{
+		.opcode = cpu_to_le16(CXL_MBOX_OP_CLEAR_POISON),
 		.effect = cpu_to_le16(0),
 	},
 };
···
 	int master_limit;
 	struct mock_event_store mes;
 	u8 event_buf[SZ_4K];
+	u64 timestamp;
 };

 static struct mock_event_log *event_find_log(struct device *dev, int log_type)
···
 	}
 };

+static int mock_set_timestamp(struct cxl_dev_state *cxlds,
+			      struct cxl_mbox_cmd *cmd)
+{
+	struct cxl_mockmem_data *mdata = dev_get_drvdata(cxlds->dev);
+	struct cxl_mbox_set_timestamp_in *ts = cmd->payload_in;
+
+	if (cmd->size_in != sizeof(*ts))
+		return -EINVAL;
+
+	if (cmd->size_out != 0)
+		return -EINVAL;
+
+	mdata->timestamp = le64_to_cpu(ts->timestamp);
+	return 0;
+}
+
 static void cxl_mock_add_event_logs(struct mock_event_store *mes)
 {
 	put_unaligned_le16(CXL_GMER_VALID_CHANNEL | CXL_GMER_VALID_RANK,
···
 			cpu_to_le64(SZ_256M / CXL_CAPACITY_MULTIPLIER),
 		.total_capacity =
 			cpu_to_le64(DEV_SIZE / CXL_CAPACITY_MULTIPLIER),
+		.inject_poison_limit = cpu_to_le16(MOCK_INJECT_TEST_MAX),
 	};
+
+	put_unaligned_le24(CXL_POISON_LIST_MAX, id.poison_list_max_mer);

 	if (cmd->size_out < sizeof(id))
 		return -EINVAL;
···
 	return 0;
 }

+static struct mock_poison {
+	struct cxl_dev_state *cxlds;
+	u64 dpa;
+} mock_poison_list[MOCK_INJECT_TEST_MAX];
+
+static struct cxl_mbox_poison_out *
+cxl_get_injected_po(struct cxl_dev_state *cxlds, u64 offset, u64 length)
+{
+	struct cxl_mbox_poison_out *po;
+	int nr_records = 0;
+	u64 dpa;
+
+	po = kzalloc(struct_size(po, record, poison_inject_dev_max), GFP_KERNEL);
+	if (!po)
+		return NULL;
+
+	for (int i = 0; i < MOCK_INJECT_TEST_MAX; i++) {
+		if (mock_poison_list[i].cxlds != cxlds)
+			continue;
+		if (mock_poison_list[i].dpa < offset ||
+		    mock_poison_list[i].dpa > offset + length - 1)
+			continue;
+
+		dpa = mock_poison_list[i].dpa + CXL_POISON_SOURCE_INJECTED;
+		po->record[nr_records].address = cpu_to_le64(dpa);
+		po->record[nr_records].length = cpu_to_le32(1);
+		nr_records++;
+		if (nr_records == poison_inject_dev_max)
+			break;
+	}
+
+	/* Always return count, even when zero */
+	po->count = cpu_to_le16(nr_records);
+
+	return po;
+}
+
+static int mock_get_poison(struct cxl_dev_state *cxlds,
+			   struct cxl_mbox_cmd *cmd)
+{
+	struct cxl_mbox_poison_in *pi = cmd->payload_in;
+	struct cxl_mbox_poison_out *po;
+	u64 offset = le64_to_cpu(pi->offset);
+	u64 length = le64_to_cpu(pi->length);
+	int nr_records;
+
+	po = cxl_get_injected_po(cxlds, offset, length);
+	if (!po)
+		return -ENOMEM;
+	nr_records = le16_to_cpu(po->count);
+	memcpy(cmd->payload_out, po, struct_size(po, record, nr_records));
+	cmd->size_out = struct_size(po, record, nr_records);
+	kfree(po);
+
+	return 0;
+}
+
+static bool mock_poison_dev_max_injected(struct cxl_dev_state *cxlds)
+{
+	int count = 0;
+
+	for (int i = 0; i < MOCK_INJECT_TEST_MAX; i++) {
+		if (mock_poison_list[i].cxlds == cxlds)
+			count++;
+	}
+	return (count >= poison_inject_dev_max);
+}
+
+static bool mock_poison_add(struct cxl_dev_state *cxlds, u64 dpa)
+{
+	if (mock_poison_dev_max_injected(cxlds)) {
+		dev_dbg(cxlds->dev,
+			"Device poison injection limit has been reached: %d\n",
+			MOCK_INJECT_DEV_MAX);
+		return false;
+	}
+
+	for (int i = 0; i < MOCK_INJECT_TEST_MAX; i++) {
+		if (!mock_poison_list[i].cxlds) {
+			mock_poison_list[i].cxlds = cxlds;
+			mock_poison_list[i].dpa = dpa;
+			return true;
+		}
+	}
+	dev_dbg(cxlds->dev,
+		"Mock test poison injection limit has been reached: %d\n",
+		MOCK_INJECT_TEST_MAX);
+
+	return false;
+}
+
+static bool mock_poison_found(struct cxl_dev_state *cxlds, u64 dpa)
+{
+	for (int i = 0; i < MOCK_INJECT_TEST_MAX; i++) {
+		if (mock_poison_list[i].cxlds == cxlds &&
+		    mock_poison_list[i].dpa == dpa)
+			return true;
+	}
+	return false;
+}
+
+static int mock_inject_poison(struct cxl_dev_state *cxlds,
+			      struct cxl_mbox_cmd *cmd)
+{
+	struct cxl_mbox_inject_poison *pi = cmd->payload_in;
+	u64 dpa = le64_to_cpu(pi->address);
+
+	if (mock_poison_found(cxlds, dpa)) {
+		/* Not an error to inject poison if already poisoned */
+		dev_dbg(cxlds->dev, "DPA: 0x%llx already poisoned\n", dpa);
+		return 0;
+	}
+	if (!mock_poison_add(cxlds, dpa))
+		return -ENXIO;
+
+	return 0;
+}
+
+static bool mock_poison_del(struct cxl_dev_state *cxlds, u64 dpa)
+{
+	for (int i = 0; i < MOCK_INJECT_TEST_MAX; i++) {
+		if (mock_poison_list[i].cxlds == cxlds &&
+		    mock_poison_list[i].dpa == dpa) {
+			mock_poison_list[i].cxlds = NULL;
+			return true;
+		}
+	}
+	return false;
+}
+
+static int mock_clear_poison(struct cxl_dev_state *cxlds,
+			     struct cxl_mbox_cmd *cmd)
+{
+	struct cxl_mbox_clear_poison *pi = cmd->payload_in;
+	u64 dpa = le64_to_cpu(pi->address);
+
+	/*
+	 * A real CXL device will write pi->write_data to the address
+	 * being cleared. In this mock, just delete this address from
+	 * the mock poison list.
+	 */
+	if (!mock_poison_del(cxlds, dpa))
+		dev_dbg(cxlds->dev, "DPA: 0x%llx not in poison list\n", dpa);
+
+	return 0;
+}
+
+static bool mock_poison_list_empty(void)
+{
+	for (int i = 0; i < MOCK_INJECT_TEST_MAX; i++) {
+		if (mock_poison_list[i].cxlds)
+			return false;
+	}
+	return true;
+}
+
+static ssize_t poison_inject_max_show(struct device_driver *drv, char *buf)
+{
+	return sysfs_emit(buf, "%u\n", poison_inject_dev_max);
+}
+
+static ssize_t poison_inject_max_store(struct device_driver *drv,
+				       const char *buf, size_t len)
+{
+	int val;
+
+	if (kstrtoint(buf, 0, &val) < 0)
+		return -EINVAL;
+
+	if (!mock_poison_list_empty())
+		return -EBUSY;
+
+	if (val <= MOCK_INJECT_TEST_MAX)
+		poison_inject_dev_max = val;
+	else
+		return -EINVAL;
+
+	return len;
+}
+
+static DRIVER_ATTR_RW(poison_inject_max);
+
+static struct attribute *cxl_mock_mem_core_attrs[] = {
+	&driver_attr_poison_inject_max.attr,
+	NULL
+};
+ATTRIBUTE_GROUPS(cxl_mock_mem_core);
+
 static int cxl_mock_mbox_send(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd)
 {
 	struct device *dev = cxlds->dev;
 	int rc = -EIO;

 	switch (cmd->opcode) {
+	case CXL_MBOX_OP_SET_TIMESTAMP:
+		rc = mock_set_timestamp(cxlds, cmd);
+		break;
 	case CXL_MBOX_OP_GET_SUPPORTED_LOGS:
 		rc = mock_gsl(cmd);
 		break;
···
 		break;
 	case CXL_MBOX_OP_PASSPHRASE_SECURE_ERASE:
 		rc = mock_passphrase_secure_erase(cxlds, cmd);
+		break;
+	case CXL_MBOX_OP_GET_POISON:
+		rc = mock_get_poison(cxlds, cmd);
+		break;
+	case CXL_MBOX_OP_INJECT_POISON:
+		rc = mock_inject_poison(cxlds, cmd);
+		break;
+	case CXL_MBOX_OP_CLEAR_POISON:
+		rc = mock_clear_poison(cxlds, cmd);
 		break;
 	default:
 		break;
···
 	}

 	rc = cxl_enumerate_cmds(cxlds);
+	if (rc)
+		return rc;
+
+	rc = cxl_poison_state_init(cxlds);
+	if (rc)
+		return rc;
+
+	rc = cxl_set_timestamp(cxlds);
 	if (rc)
 		return rc;

···
 	.driver = {
 		.name = KBUILD_MODNAME,
 		.dev_groups = cxl_mock_mem_groups,
+		.groups = cxl_mock_mem_core_groups,
 	},
 };

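The mock poison list above is a fixed-size table shared by all mock devices, with a per-device cap on entries, idempotent injection, and an inclusive range query over [offset, offset + length - 1]. The userspace sketch below restates that bookkeeping under simplifying assumptions: devices are identified by an opaque pointer, the table holds 8 slots and the per-device cap is 2, and all demo_* names are hypothetical. It is not the test module's code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_TABLE_MAX 8	/* plays the role of MOCK_INJECT_TEST_MAX */

struct demo_poison {
	const void *dev;	/* NULL marks a free slot */
	uint64_t dpa;
};

static struct demo_poison demo_list[DEMO_TABLE_MAX];
static unsigned int demo_dev_max = 2;	/* role of poison_inject_dev_max */

/* Number of entries currently owned by `dev`. */
static unsigned int demo_count(const void *dev)
{
	unsigned int count = 0;

	for (size_t i = 0; i < DEMO_TABLE_MAX; i++)
		if (demo_list[i].dev == dev)
			count++;
	return count;
}

static bool demo_found(const void *dev, uint64_t dpa)
{
	for (size_t i = 0; i < DEMO_TABLE_MAX; i++)
		if (demo_list[i].dev == dev && demo_list[i].dpa == dpa)
			return true;
	return false;
}

/* Returns 0 on success (including "already poisoned"), -1 if a limit is hit. */
static int demo_inject(const void *dev, uint64_t dpa)
{
	if (demo_found(dev, dpa))
		return 0;
	if (demo_count(dev) >= demo_dev_max)
		return -1;
	for (size_t i = 0; i < DEMO_TABLE_MAX; i++) {
		if (!demo_list[i].dev) {
			demo_list[i].dev = dev;
			demo_list[i].dpa = dpa;
			return 0;
		}
	}
	return -1;	/* whole table is full */
}

/* Clearing an address that is not listed is not an error. */
static void demo_clear(const void *dev, uint64_t dpa)
{
	for (size_t i = 0; i < DEMO_TABLE_MAX; i++)
		if (demo_list[i].dev == dev && demo_list[i].dpa == dpa)
			demo_list[i].dev = NULL;
}

/* Copy up to `cap` addresses of `dev` within [offset, offset + length - 1]. */
static unsigned int demo_query(const void *dev, uint64_t offset,
			       uint64_t length, uint64_t *out, unsigned int cap)
{
	unsigned int n = 0;

	if (length == 0)
		return 0;
	for (size_t i = 0; i < DEMO_TABLE_MAX && n < cap; i++) {
		if (demo_list[i].dev != dev)
			continue;
		if (demo_list[i].dpa < offset ||
		    demo_list[i].dpa > offset + length - 1)
			continue;
		out[n++] = demo_list[i].dpa;
	}
	return n;
}
```

Keeping one shared table with an owner field lets a test exercise both limits independently: the per-device cap via demo_dev_max and the overall capacity via DEMO_TABLE_MAX.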