Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge tag 'x86-platform-2025-07-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 platform updates from Ingo Molnar:
"This adds support for the AMD hardware feedback interface (HFI), by
Perry Yuan"

* tag 'x86-platform-2025-07-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/itmt: Add debugfs file to show core priorities
platform/x86/amd: hfi: Add debugfs support
platform/x86/amd: hfi: Set ITMT priority from ranking data
cpufreq/amd-pstate: Disable preferred cores on designs with workload classification
x86/process: Clear hardware feedback history for AMD processors
platform/x86: hfi: Add power management callback
platform/x86: hfi: Add online and offline callback support
platform/x86: hfi: Init per-cpu scores for each class
platform/x86: hfi: Parse CPU core ranking data from shared memory
platform/x86: hfi: Introduce AMD Hardware Feedback Interface Driver
x86/msr-index: Add AMD workload classification MSRs
MAINTAINERS: Add maintainer entry for AMD Hardware Feedback Driver
Documentation/x86: Add AMD Hardware Feedback Interface documentation

+760
+133
Documentation/arch/x86/amd-hfi.rst
···
+.. SPDX-License-Identifier: GPL-2.0
+
+======================================================================
+Hardware Feedback Interface For Hetero Core Scheduling On AMD Platform
+======================================================================
+
+:Copyright: 2025 Advanced Micro Devices, Inc. All Rights Reserved.
+
+:Author: Perry Yuan <perry.yuan@amd.com>
+:Author: Mario Limonciello <mario.limonciello@amd.com>
+
+Overview
+--------
+
+AMD Heterogeneous Core implementations are comprised of more than one
+architectural class and CPUs are comprised of cores of various efficiency and
+power capabilities: performance-oriented *classic cores* and power-efficient
+*dense cores*. As such, power management strategies must be designed to
+accommodate the complexities introduced by incorporating different core types.
+Heterogeneous systems can also extend to more than two architectural classes
+as well. The purpose of the scheduling feedback mechanism is to provide
+information to the operating system scheduler in real time such that the
+scheduler can direct threads to the optimal core.
+
+The goal of AMD's heterogeneous architecture is to attain power benefit by
+sending background threads to the dense cores while sending high priority
+threads to the classic cores. From a performance perspective, sending
+background threads to dense cores can free up power headroom and allow the
+classic cores to optimally service demanding threads. Furthermore, the area
+optimized nature of the dense cores allows for an increasing number of
+physical cores. This improved core density will have positive multithreaded
+performance impact.
+
+AMD Heterogeneous Core Driver
+-----------------------------
+
+The ``amd_hfi`` driver delivers the operating system a performance and energy
+efficiency capability data for each CPU in the system. The scheduler can use
+the ranking data from the HFI driver to make task placement decisions.
+
+Thread Classification and Ranking Table Interaction
+----------------------------------------------------
+
+The thread classification is used to select into a ranking table that
+describes an efficiency and performance ranking for each classification.
+
+Threads are classified during runtime into enumerated classes. The classes
+represent thread performance/power characteristics that may benefit from
+special scheduling behaviors. The below table depicts an example of thread
+classification and a preference where a given thread should be scheduled
+based on its thread class. The real time thread classification is consumed
+by the operating system and is used to inform the scheduler of where the
+thread should be placed.
+
+Thread Classification Example Table
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
++----------+----------------+-------------------------------+---------------------+---------+
+| class ID | Classification | Preferred scheduling behavior | Preemption priority | Counter |
++----------+----------------+-------------------------------+---------------------+---------+
+| 0        | Default        | Performant                    | Highest             |         |
++----------+----------------+-------------------------------+---------------------+---------+
+| 1        | Non-scalable   | Efficient                     | Lowest              | PMCx1A1 |
++----------+----------------+-------------------------------+---------------------+---------+
+| 2        | I/O bound      | Efficient                     | Lowest              | PMCx044 |
++----------+----------------+-------------------------------+---------------------+---------+
+
+Thread classification is performed by the hardware each time that the thread is switched out.
+Threads that don't meet any hardware specified criteria are classified as "default".
+
+AMD Hardware Feedback Interface
+--------------------------------
+
+The Hardware Feedback Interface provides to the operating system information
+about the performance and energy efficiency of each CPU in the system. Each
+capability is given as a unit-less quantity in the range [0-255]. A higher
+performance value indicates higher performance capability, and a higher
+efficiency value indicates more efficiency. Energy efficiency and performance
+are reported in separate capabilities in the shared memory based ranking table.
+
+These capabilities may change at runtime as a result of changes in the
+operating conditions of the system or the action of external factors.
+Power Management firmware is responsible for detecting events that require
+a reordering of the performance and efficiency ranking. Table updates happen
+relatively infrequently and occur on the time scale of seconds or more.
+
+The following events trigger a table update:
+    * Thermal Stress Events
+    * Silent Compute
+    * Extreme Low Battery Scenarios
+
+The kernel or a userspace policy daemon can use these capabilities to modify
+task placement decisions. For instance, if either the performance or energy
+capabilities of a given logical processor becomes zero, it is an indication
+that the hardware recommends to the operating system to not schedule any tasks
+on that processor for performance or energy efficiency reasons, respectively.
+
+Implementation details for Linux
+--------------------------------
+
+The implementation of threads scheduling consists of the following steps:
+
+1. A thread is spawned and scheduled to the ideal core using the default
+   heterogeneous scheduling policy.
+2. The processor profiles thread execution and assigns an enumerated
+   classification ID.
+   This classification is communicated to the OS via logical processor
+   scope MSR.
+3. During the thread context switch out the operating system consumes the
+   workload (WL) classification which resides in a logical processor scope MSR.
+4. The OS triggers the hardware to clear its history by writing to an MSR,
+   after consuming the WL classification and before switching in the new thread.
+5. If due to the classification, ranking table, and processor availability,
+   the thread is not on its ideal processor, the OS will then consider
+   scheduling the thread on its ideal processor (if available).
+
+Ranking Table
+-------------
+The ranking table is a shared memory region that is used to communicate the
+performance and energy efficiency capabilities of each CPU in the system.
+
+The ranking table design includes rankings for each APIC ID in the system and
+rankings both for performance and efficiency for each workload classification.
+
+.. kernel-doc:: drivers/platform/x86/amd/hfi/hfi.c
+   :doc: amd_shmem_info
+
+Ranking Table update
+---------------------------
+The power management firmware issues an platform interrupt after updating the
+ranking table and is ready for the operating system to consume it. CPUs receive
+such interrupt and read new ranking table from shared memory which PCCT table
+has provided, then ``amd_hfi`` driver parses the new table to provide new
+consume data for scheduling decisions.
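The documentation above says that each CPU carries a performance and an efficiency score in the range 0-255 per workload class, and that a score of zero is a hint not to schedule on that CPU. As a rough illustration only, the userspace sketch below picks a CPU for a given class from such scores. The names `ex_cpu_rank`, `ex_pick_cpu` and `EX_NR_CLASSES` are hypothetical and do not appear in the patch; the kernel scheduler does not work this way internally, it consumes the rankings through ITMT priorities.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: three classes, matching the example table in the doc. */
#define EX_NR_CLASSES 3

/* Hypothetical per-CPU record: one perf/eff score per workload class. */
struct ex_cpu_rank {
	uint8_t perf[EX_NR_CLASSES];	/* 0-255, higher is more performant */
	uint8_t eff[EX_NR_CLASSES];	/* 0-255, higher is more efficient */
};

/*
 * Return the index of the CPU with the highest score for class_id, using
 * efficiency scores when prefer_eff is non-zero and performance scores
 * otherwise. CPUs with a zero score are skipped. Returns -1 when the class
 * is out of range or no CPU has a non-zero score.
 */
int ex_pick_cpu(const struct ex_cpu_rank *ranks, size_t n_cpus,
		unsigned int class_id, int prefer_eff)
{
	unsigned int best_score = 0;
	int best = -1;

	if (class_id >= EX_NR_CLASSES)
		return -1;

	for (size_t i = 0; i < n_cpus; i++) {
		unsigned int score = prefer_eff ? ranks[i].eff[class_id]
						: ranks[i].perf[class_id];

		/* Zero means "hardware recommends not scheduling here". */
		if (score == 0)
			continue;

		if (score > best_score) {
			best_score = score;
			best = (int)i;
		}
	}

	return best;
}
```

With the example table in mind, a class 0 ("Default") thread would be looked up with `prefer_eff = 0`, while class 1 or 2 threads would be looked up with `prefer_eff = 1`.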
+1
Documentation/arch/x86/index.rst
···
    amd-debugging
    amd-memory-encryption
    amd_hsmp
+   amd-hfi
    tdx
    pti
    mds
+9
MAINTAINERS
···
 F:	arch/x86/include/uapi/asm/amd_hsmp.h
 F:	drivers/platform/x86/amd/hsmp/
 
+AMD HETERO CORE HARDWARE FEEDBACK DRIVER
+M:	Mario Limonciello <mario.limonciello@amd.com>
+R:	Perry Yuan <perry.yuan@amd.com>
+L:	platform-driver-x86@vger.kernel.org
+S:	Supported
+B:	https://gitlab.freedesktop.org/drm/amd/-/issues
+F:	Documentation/arch/x86/amd-hfi.rst
+F:	drivers/platform/x86/amd/hfi/
+
 AMD IOMMU (AMD-VI)
 M:	Joerg Roedel <joro@8bytes.org>
 R:	Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
+5
arch/x86/include/asm/msr-index.h
···
 #define MSR_AMD64_PERF_CNTR_GLOBAL_CTL	0xc0000301
 #define MSR_AMD64_PERF_CNTR_GLOBAL_STATUS_CLR	0xc0000302
 
+/* AMD Hardware Feedback Support MSRs */
+#define MSR_AMD_WORKLOAD_CLASS_CONFIG	0xc0000500
+#define MSR_AMD_WORKLOAD_CLASS_ID	0xc0000501
+#define MSR_AMD_WORKLOAD_HRST	0xc0000502
+
 /* AMD Last Branch Record MSRs */
 #define MSR_AMD64_LBR_SELECT	0xc000010e
 
+23
arch/x86/kernel/itmt.c
···
 	return result;
 }
 
+static int sched_core_priority_show(struct seq_file *s, void *unused)
+{
+	int cpu;
+
+	seq_puts(s, "CPU #\tPriority\n");
+	for_each_possible_cpu(cpu)
+		seq_printf(s, "%d\t%d\n", cpu, arch_asym_cpu_priority(cpu));
+
+	return 0;
+}
+DEFINE_SHOW_ATTRIBUTE(sched_core_priority);
+
 static const struct file_operations dfs_sched_itmt_fops = {
 	.read = debugfs_read_file_bool,
 	.write = sched_itmt_enabled_write,
···
 };
 
 static struct dentry *dfs_sched_itmt;
+static struct dentry *dfs_sched_core_prio;
 
 /**
  * sched_set_itmt_support() - Indicate platform supports ITMT
···
 		return -ENOMEM;
 	}
 
+	dfs_sched_core_prio = debugfs_create_file("sched_core_priority", 0644,
+						  arch_debugfs_dir, NULL,
+						  &sched_core_priority_fops);
+	if (IS_ERR_OR_NULL(dfs_sched_core_prio)) {
+		dfs_sched_core_prio = NULL;
+		return -ENOMEM;
+	}
+
 	sched_itmt_capable = true;
 
 	sysctl_sched_itmt_enabled = 1;
···
 
 	debugfs_remove(dfs_sched_itmt);
 	dfs_sched_itmt = NULL;
+	debugfs_remove(dfs_sched_core_prio);
+	dfs_sched_core_prio = NULL;
 
 	if (sysctl_sched_itmt_enabled) {
 		/* disable sched_itmt if we are no longer ITMT capable */
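The `sched_core_priority` debugfs file added in this hunk prints a `CPU #\tPriority` header followed by one `<cpu>\t<priority>` line per possible CPU. A small userspace reader for that format might look like the sketch below. The helper names are hypothetical, and the file location (most likely `sched_core_priority` under the x86 debugfs directory, e.g. `/sys/kernel/debug/x86/`) is an assumption about where `arch_debugfs_dir` is mounted rather than something this patch states.

```c
#include <stdio.h>

/*
 * Parse one data line of the form "<cpu>\t<priority>".
 * Returns 0 on success and -1 for anything else, including the
 * "CPU #\tPriority" header line.
 */
int ex_parse_prio_line(const char *line, int *cpu, int *prio)
{
	return sscanf(line, "%d\t%d", cpu, prio) == 2 ? 0 : -1;
}

/*
 * Read the whole stream and return the CPU with the highest priority,
 * or -1 if no data line could be parsed.
 */
int ex_highest_prio_cpu(FILE *f)
{
	int cpu, prio, best_cpu = -1, best_prio = 0;
	char buf[64];

	while (fgets(buf, sizeof(buf), f)) {
		if (ex_parse_prio_line(buf, &cpu, &prio))
			continue;	/* header or malformed line */

		if (best_cpu < 0 || prio > best_prio) {
			best_cpu = cpu;
			best_prio = prio;
		}
	}

	return best_cpu;
}
```

A caller would `fopen()` the debugfs file (root privileges are normally required) and pass the stream to `ex_highest_prio_cpu()`.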
+4
arch/x86/kernel/process_64.c
···
 	/* Load the Intel cache allocation PQR MSR. */
 	resctrl_arch_sched_in(next_p);
 
+	/* Reset hw history on AMD CPUs */
+	if (cpu_feature_enabled(X86_FEATURE_AMD_WORKLOAD_CLASS))
+		wrmsrl(MSR_AMD_WORKLOAD_HRST, 0x1);
+
 	return prev_p;
 }
 
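The hunk above only performs the history reset. The documentation in this series describes the fuller ordering: the OS consumes the workload classification on switch-out and then clears the hardware history before the next thread is switched in. The sketch below models that ordering in plain userspace C with fake MSR accessors. Everything prefixed `ex_` is hypothetical, the class ID value is treated as opaque, and the idea that a reset returns the class to 0 is purely a modelling assumption, not documented hardware behavior.

```c
#include <stdint.h>

/* MSR numbers copied from the msr-index.h hunk in this series. */
#define EX_MSR_WORKLOAD_CLASS_ID	0xc0000501u
#define EX_MSR_WORKLOAD_HRST		0xc0000502u

/* Hypothetical stand-in for one logical processor's classification state. */
struct ex_fake_cpu {
	uint64_t class_id;		/* what the class ID MSR would return */
	unsigned int history_resets;	/* number of history resets observed */
};

/* Fake rdmsr: only the class ID MSR is modelled. */
static uint64_t ex_rdmsr(const struct ex_fake_cpu *c, uint32_t msr)
{
	return msr == EX_MSR_WORKLOAD_CLASS_ID ? c->class_id : 0;
}

/* Fake wrmsr: writing bit 0 of HRST clears the modelled history. */
static void ex_wrmsr(struct ex_fake_cpu *c, uint32_t msr, uint64_t val)
{
	if (msr == EX_MSR_WORKLOAD_HRST && (val & 0x1)) {
		c->class_id = 0;	/* modelling assumption only */
		c->history_resets++;
	}
}

/*
 * Steps 3 and 4 from the documentation: read the classification of the
 * outgoing thread, then reset the history before the next thread runs.
 * Returns the classification that was consumed.
 */
uint64_t ex_switch_out(struct ex_fake_cpu *c)
{
	uint64_t wl_class = ex_rdmsr(c, EX_MSR_WORKLOAD_CLASS_ID);

	ex_wrmsr(c, EX_MSR_WORKLOAD_HRST, 0x1);

	return wl_class;
}
```

The point of the model is only the ordering: reading after the reset would lose the classification of the outgoing thread.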
+7
drivers/cpufreq/amd-pstate.c
···
 	if (!amd_pstate_prefcore)
 		return;
 
+	/* should use amd-hfi instead */
+	if (cpu_feature_enabled(X86_FEATURE_AMD_WORKLOAD_CLASS) &&
+	    IS_ENABLED(CONFIG_AMD_HFI)) {
+		amd_pstate_prefcore = false;
+		return;
+	}
+
 	cpudata->hw_prefcore = true;
 
 	/* Priorities must be initialized before ITMT support can be toggled on. */
+1
drivers/platform/x86/amd/Kconfig
···
 source "drivers/platform/x86/amd/hsmp/Kconfig"
 source "drivers/platform/x86/amd/pmf/Kconfig"
 source "drivers/platform/x86/amd/pmc/Kconfig"
+source "drivers/platform/x86/amd/hfi/Kconfig"
 
 config AMD_3D_VCACHE
 	tristate "AMD 3D V-Cache Performance Optimizer Driver"
+1
drivers/platform/x86/amd/Makefile
···
 obj-$(CONFIG_AMD_PMF) += pmf/
 obj-$(CONFIG_AMD_WBRF) += wbrf.o
 obj-$(CONFIG_AMD_ISP_PLATFORM) += amd_isp4.o
+obj-$(CONFIG_AMD_HFI) += hfi/
+18
drivers/platform/x86/amd/hfi/Kconfig
···
+# SPDX-License-Identifier: GPL-2.0-only
+#
+# AMD Hardware Feedback Interface Driver
+#
+
+config AMD_HFI
+	bool "AMD Hetero Core Hardware Feedback Driver"
+	depends on ACPI
+	depends on CPU_SUP_AMD
+	depends on SCHED_MC_PRIO
+	help
+	  Select this option to enable the AMD Heterogeneous Core Hardware
+	  Feedback Interface. If selected, hardware provides runtime thread
+	  classification guidance to the operating system on the performance and
+	  energy efficiency capabilities of each heterogeneous CPU core. These
+	  capabilities may vary due to the inherent differences in the core types
+	  and can also change as a result of variations in the operating
+	  conditions of the system such as power and thermal limits.
+7
drivers/platform/x86/amd/hfi/Makefile
···
+# SPDX-License-Identifier: GPL-2.0
+#
+# AMD Hardware Feedback Interface Driver
+#
+
+obj-$(CONFIG_AMD_HFI) += amd_hfi.o
+amd_hfi-objs := hfi.o
+551
drivers/platform/x86/amd/hfi/hfi.c
···
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * AMD Hardware Feedback Interface Driver
+ *
+ * Copyright (C) 2025 Advanced Micro Devices, Inc. All Rights Reserved.
+ *
+ * Authors: Perry Yuan <Perry.Yuan@amd.com>
+ *          Mario Limonciello <mario.limonciello@amd.com>
+ */
+
+#define pr_fmt(fmt) "amd-hfi: " fmt
+
+#include <linux/acpi.h>
+#include <linux/cpu.h>
+#include <linux/cpumask.h>
+#include <linux/debugfs.h>
+#include <linux/gfp.h>
+#include <linux/init.h>
+#include <linux/io.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/mailbox_client.h>
+#include <linux/mutex.h>
+#include <linux/percpu-defs.h>
+#include <linux/platform_device.h>
+#include <linux/smp.h>
+#include <linux/topology.h>
+#include <linux/workqueue.h>
+
+#include <asm/cpu_device_id.h>
+
+#include <acpi/pcc.h>
+#include <acpi/cppc_acpi.h>
+
+#define AMD_HFI_DRIVER "amd_hfi"
+#define AMD_HFI_MAILBOX_COUNT 1
+#define AMD_HETERO_RANKING_TABLE_VER 2
+
+#define AMD_HETERO_CPUID_27 0x80000027
+
+static struct platform_device *device;
+
+/**
+ * struct amd_shmem_info - Shared memory table for AMD HFI
+ *
+ * @header: The PCCT table header including signature, length flags and command.
+ * @version_number: Version number of the table
+ * @n_logical_processors: Number of logical processors
+ * @n_capabilities: Number of ranking dimensions (performance, efficiency, etc)
+ * @table_update_context: Command being sent over the subspace
+ * @n_bitmaps: Number of 32-bit bitmaps to enumerate all the APIC IDs
+ *             This is based on the maximum APIC ID enumerated in the system
+ * @reserved: 24 bit spare
+ * @table_data: Bit Map(s) of enabled logical processors
+ *              Followed by the ranking data for each logical processor
+ */
+struct amd_shmem_info {
+	struct acpi_pcct_ext_pcc_shared_memory header;
+	u32 version_number :8,
+	    n_logical_processors :8,
+	    n_capabilities :8,
+	    table_update_context :8;
+	u32 n_bitmaps :8,
+	    reserved :24;
+	u32 table_data[];
+};
+
+struct amd_hfi_data {
+	const char *name;
+	struct device *dev;
+
+	/* PCCT table related */
+	struct pcc_mbox_chan *pcc_chan;
+	void __iomem *pcc_comm_addr;
+	struct acpi_subtable_header *pcct_entry;
+	struct amd_shmem_info *shmem;
+
+	struct dentry *dbgfs_dir;
+};
+
+/**
+ * struct amd_hfi_classes - HFI class capabilities per CPU
+ * @perf: Performance capability
+ * @eff: Power efficiency capability
+ *
+ * Capabilities of a logical processor in the ranking table. These capabilities
+ * are unitless and specific to each HFI class.
+ */
+struct amd_hfi_classes {
+	u32 perf;
+	u32 eff;
+};
+
+/**
+ * struct amd_hfi_cpuinfo - HFI workload class info per CPU
+ * @cpu: CPU index
+ * @apic_id: APIC id of the current CPU
+ * @cpus: mask of CPUs associated with amd_hfi_cpuinfo
+ * @class_index: workload class ID index
+ * @nr_class: max number of workload class supported
+ * @ipcc_scores: ipcc scores for each class
+ * @amd_hfi_classes: current CPU workload class ranking data
+ *
+ * Parameters of a logical processor linked with hardware feedback class.
+ */
+struct amd_hfi_cpuinfo {
+	int cpu;
+	u32 apic_id;
+	cpumask_var_t cpus;
+	s16 class_index;
+	u8 nr_class;
+	int *ipcc_scores;
+	struct amd_hfi_classes *amd_hfi_classes;
+};
+
+static DEFINE_PER_CPU(struct amd_hfi_cpuinfo, amd_hfi_cpuinfo) = {.class_index = -1};
+
+static DEFINE_MUTEX(hfi_cpuinfo_lock);
+
+static void amd_hfi_sched_itmt_work(struct work_struct *work)
+{
+	sched_set_itmt_support();
+}
+static DECLARE_WORK(sched_amd_hfi_itmt_work, amd_hfi_sched_itmt_work);
+
+static int find_cpu_index_by_apicid(unsigned int target_apicid)
+{
+	int cpu_index;
+
+	for_each_possible_cpu(cpu_index) {
+		struct cpuinfo_x86 *info = &cpu_data(cpu_index);
+
+		if (info->topo.apicid == target_apicid) {
+			pr_debug("match APIC id %u for CPU index: %d\n",
+				 info->topo.apicid, cpu_index);
+			return cpu_index;
+		}
+	}
+
+	return -ENODEV;
+}
+
+static int amd_hfi_fill_metadata(struct amd_hfi_data *amd_hfi_data)
+{
+	struct acpi_pcct_ext_pcc_slave *pcct_ext =
+		(struct acpi_pcct_ext_pcc_slave *)amd_hfi_data->pcct_entry;
+	void __iomem *pcc_comm_addr;
+	u32 apic_start = 0;
+
+	pcc_comm_addr = acpi_os_ioremap(amd_hfi_data->pcc_chan->shmem_base_addr,
+					amd_hfi_data->pcc_chan->shmem_size);
+	if (!pcc_comm_addr) {
+		dev_err(amd_hfi_data->dev, "failed to ioremap PCC common region mem\n");
+		return -ENOMEM;
+	}
+
+	memcpy_fromio(amd_hfi_data->shmem, pcc_comm_addr, pcct_ext->length);
+	iounmap(pcc_comm_addr);
+
+	if (amd_hfi_data->shmem->header.signature != PCC_SIGNATURE) {
+		dev_err(amd_hfi_data->dev, "invalid signature in shared memory\n");
+		return -EINVAL;
+	}
+	if (amd_hfi_data->shmem->version_number != AMD_HETERO_RANKING_TABLE_VER) {
+		dev_err(amd_hfi_data->dev, "invalid version %d\n",
+			amd_hfi_data->shmem->version_number);
+		return -EINVAL;
+	}
+
+	for (unsigned int i = 0; i < amd_hfi_data->shmem->n_bitmaps; i++) {
+		u32 bitmap = amd_hfi_data->shmem->table_data[i];
+
+		for (unsigned int j = 0; j < BITS_PER_TYPE(u32); j++) {
+			u32 apic_id = i * BITS_PER_TYPE(u32) + j;
+			struct amd_hfi_cpuinfo *info;
+			int cpu_index, apic_index;
+
+			if (!(bitmap & BIT(j)))
+				continue;
+
+			cpu_index = find_cpu_index_by_apicid(apic_id);
+			if (cpu_index < 0) {
+				dev_warn(amd_hfi_data->dev, "APIC ID %u not found\n", apic_id);
+				continue;
+			}
+
+			info = per_cpu_ptr(&amd_hfi_cpuinfo, cpu_index);
+			info->apic_id = apic_id;
+
+			/* Fill the ranking data for each logical processor */
+			info = per_cpu_ptr(&amd_hfi_cpuinfo, cpu_index);
+			apic_index = apic_start * info->nr_class * 2;
+			for (unsigned int k = 0; k < info->nr_class; k++) {
+				u32 *table = amd_hfi_data->shmem->table_data +
+					     amd_hfi_data->shmem->n_bitmaps +
+					     i * info->nr_class;
+
+				info->amd_hfi_classes[k].eff = table[apic_index + 2 * k];
+				info->amd_hfi_classes[k].perf = table[apic_index + 2 * k + 1];
+			}
+			apic_start++;
+		}
+	}
+
+	return 0;
+}
+
+static int amd_hfi_alloc_class_data(struct platform_device *pdev)
+{
+	struct amd_hfi_cpuinfo *hfi_cpuinfo;
+	struct device *dev = &pdev->dev;
+	u32 nr_class_id;
+	int idx;
+
+	nr_class_id = cpuid_eax(AMD_HETERO_CPUID_27);
+	if (nr_class_id > 255) {
+		dev_err(dev, "number of supported classes too large: %d\n",
+			nr_class_id);
+		return -EINVAL;
+	}
+
+	for_each_possible_cpu(idx) {
+		struct amd_hfi_classes *classes;
+		int *ipcc_scores;
+
+		classes = devm_kcalloc(dev,
+				       nr_class_id,
+				       sizeof(struct amd_hfi_classes),
+				       GFP_KERNEL);
+		if (!classes)
+			return -ENOMEM;
+		ipcc_scores = devm_kcalloc(dev, nr_class_id, sizeof(int), GFP_KERNEL);
+		if (!ipcc_scores)
+			return -ENOMEM;
+		hfi_cpuinfo = per_cpu_ptr(&amd_hfi_cpuinfo, idx);
+		hfi_cpuinfo->amd_hfi_classes = classes;
+		hfi_cpuinfo->ipcc_scores = ipcc_scores;
+		hfi_cpuinfo->nr_class = nr_class_id;
+	}
+
+	return 0;
+}
+
+static void amd_hfi_remove(struct platform_device *pdev)
+{
+	struct amd_hfi_data *dev = platform_get_drvdata(pdev);
+
+	debugfs_remove_recursive(dev->dbgfs_dir);
+}
+
+static int amd_set_hfi_ipcc_score(struct amd_hfi_cpuinfo *hfi_cpuinfo, int cpu)
+{
+	for (int i = 0; i < hfi_cpuinfo->nr_class; i++)
+		WRITE_ONCE(hfi_cpuinfo->ipcc_scores[i],
+			   hfi_cpuinfo->amd_hfi_classes[i].perf);
+
+	sched_set_itmt_core_prio(hfi_cpuinfo->ipcc_scores[0], cpu);
+
+	return 0;
+}
+
+static int amd_hfi_set_state(unsigned int cpu, bool state)
+{
+	int ret;
+
+	ret = wrmsrq_on_cpu(cpu, MSR_AMD_WORKLOAD_CLASS_CONFIG, state ? 1 : 0);
+	if (ret)
+		return ret;
+
+	return wrmsrq_on_cpu(cpu, MSR_AMD_WORKLOAD_HRST, 0x1);
+}
+
+/**
+ * amd_hfi_online() - Enable workload classification on @cpu
+ * @cpu: CPU in which the workload classification will be enabled
+ *
+ * Return: 0 on success, negative error code on failure.
+ */
+static int amd_hfi_online(unsigned int cpu)
+{
+	struct amd_hfi_cpuinfo *hfi_info = per_cpu_ptr(&amd_hfi_cpuinfo, cpu);
+	struct amd_hfi_classes *hfi_classes;
+	int ret;
+
+	if (WARN_ON_ONCE(!hfi_info))
+		return -EINVAL;
+
+	/*
+	 * Check if @cpu as an associated, initialized and ranking data must
+	 * be filled.
+	 */
+	hfi_classes = hfi_info->amd_hfi_classes;
+	if (!hfi_classes)
+		return -EINVAL;
+
+	guard(mutex)(&hfi_cpuinfo_lock);
+
+	if (!zalloc_cpumask_var(&hfi_info->cpus, GFP_KERNEL))
+		return -ENOMEM;
+
+	cpumask_set_cpu(cpu, hfi_info->cpus);
+
+	ret = amd_hfi_set_state(cpu, true);
+	if (ret)
+		pr_err("WCT enable failed for CPU %u\n", cpu);
+
+	return ret;
+}
+
+/**
+ * amd_hfi_offline() - Disable workload classification on @cpu
+ * @cpu: CPU in which the workload classification will be disabled
+ *
+ * Remove @cpu from those covered by its HFI instance.
+ *
+ * Return: 0 on success, negative error code on failure
+ */
+static int amd_hfi_offline(unsigned int cpu)
+{
+	struct amd_hfi_cpuinfo *hfi_info = &per_cpu(amd_hfi_cpuinfo, cpu);
+	int ret;
+
+	if (WARN_ON_ONCE(!hfi_info))
+		return -EINVAL;
+
+	guard(mutex)(&hfi_cpuinfo_lock);
+
+	ret = amd_hfi_set_state(cpu, false);
+	if (ret)
+		pr_err("WCT disable failed for CPU %u\n", cpu);
+
+	free_cpumask_var(hfi_info->cpus);
+
+	return ret;
+}
+
+static int update_hfi_ipcc_scores(void)
+{
+	int cpu;
+	int ret;
+
+	for_each_possible_cpu(cpu) {
+		struct amd_hfi_cpuinfo *hfi_cpuinfo = per_cpu_ptr(&amd_hfi_cpuinfo, cpu);
+
+		ret = amd_set_hfi_ipcc_score(hfi_cpuinfo, cpu);
+		if (ret)
+			return ret;
+	}
+
+	return 0;
+}
+
+static int amd_hfi_metadata_parser(struct platform_device *pdev,
+				   struct amd_hfi_data *amd_hfi_data)
+{
+	struct acpi_pcct_ext_pcc_slave *pcct_ext;
+	struct acpi_subtable_header *pcct_entry;
+	struct mbox_chan *pcc_mbox_channels;
+	struct acpi_table_header *pcct_tbl;
+	struct pcc_mbox_chan *pcc_chan;
+	acpi_status status;
+	int ret;
+
+	pcc_mbox_channels = devm_kcalloc(&pdev->dev, AMD_HFI_MAILBOX_COUNT,
+					 sizeof(*pcc_mbox_channels), GFP_KERNEL);
+	if (!pcc_mbox_channels)
+		return -ENOMEM;
+
+	pcc_chan = devm_kcalloc(&pdev->dev, AMD_HFI_MAILBOX_COUNT,
+				sizeof(*pcc_chan), GFP_KERNEL);
+	if (!pcc_chan)
+		return -ENOMEM;
+
+	status = acpi_get_table(ACPI_SIG_PCCT, 0, &pcct_tbl);
+	if (ACPI_FAILURE(status) || !pcct_tbl)
+		return -ENODEV;
+
+	/* get pointer to the first PCC subspace entry */
+	pcct_entry = (struct acpi_subtable_header *) (
+		(unsigned long)pcct_tbl + sizeof(struct acpi_table_pcct));
+
+	pcc_chan->mchan = &pcc_mbox_channels[0];
+
+	amd_hfi_data->pcc_chan = pcc_chan;
+	amd_hfi_data->pcct_entry = pcct_entry;
+	pcct_ext = (struct acpi_pcct_ext_pcc_slave *)pcct_entry;
+
+	if (pcct_ext->length <= 0)
+		return -EINVAL;
+
+	amd_hfi_data->shmem = devm_kzalloc(amd_hfi_data->dev, pcct_ext->length, GFP_KERNEL);
+	if (!amd_hfi_data->shmem)
+		return -ENOMEM;
+
+	pcc_chan->shmem_base_addr = pcct_ext->base_address;
+	pcc_chan->shmem_size = pcct_ext->length;
+
+	/* parse the shared memory info from the PCCT table */
+	ret = amd_hfi_fill_metadata(amd_hfi_data);
+
+	acpi_put_table(pcct_tbl);
+
+	return ret;
+}
+
+static int class_capabilities_show(struct seq_file *s, void *unused)
+{
+	u32 cpu, idx;
+
+	seq_puts(s, "CPU #\tWLC\tPerf\tEff\n");
+	for_each_possible_cpu(cpu) {
+		struct amd_hfi_cpuinfo *hfi_cpuinfo = per_cpu_ptr(&amd_hfi_cpuinfo, cpu);
+
+		seq_printf(s, "%d", cpu);
+		for (idx = 0; idx < hfi_cpuinfo->nr_class; idx++) {
+			seq_printf(s, "\t%u\t%u\t%u\n", idx,
+				   hfi_cpuinfo->amd_hfi_classes[idx].perf,
+				   hfi_cpuinfo->amd_hfi_classes[idx].eff);
+		}
+	}
+
+	return 0;
+}
+DEFINE_SHOW_ATTRIBUTE(class_capabilities);
+
+static int amd_hfi_pm_resume(struct device *dev)
+{
+	int ret, cpu;
+
+	for_each_online_cpu(cpu) {
+		ret = amd_hfi_set_state(cpu, true);
+		if (ret < 0) {
+			dev_err(dev, "failed to enable workload class config: %d\n", ret);
+			return ret;
+		}
+	}
+
+	return 0;
+}
+
+static int amd_hfi_pm_suspend(struct device *dev)
+{
+	int ret, cpu;
+
+	for_each_online_cpu(cpu) {
+		ret = amd_hfi_set_state(cpu, false);
+		if (ret < 0) {
+			dev_err(dev, "failed to disable workload class config: %d\n", ret);
+			return ret;
+		}
+	}
+
+	return 0;
+}
+
+static DEFINE_SIMPLE_DEV_PM_OPS(amd_hfi_pm_ops, amd_hfi_pm_suspend, amd_hfi_pm_resume);
+
+static const struct acpi_device_id amd_hfi_platform_match[] = {
+	{"AMDI0104", 0},
+	{ }
+};
+MODULE_DEVICE_TABLE(acpi, amd_hfi_platform_match);
+
+static int amd_hfi_probe(struct platform_device *pdev)
+{
+	struct amd_hfi_data *amd_hfi_data;
+	int ret;
+
+	if (!acpi_match_device(amd_hfi_platform_match, &pdev->dev))
+		return -ENODEV;
+
+	amd_hfi_data = devm_kzalloc(&pdev->dev, sizeof(*amd_hfi_data), GFP_KERNEL);
+	if (!amd_hfi_data)
+		return -ENOMEM;
+
+	amd_hfi_data->dev = &pdev->dev;
+	platform_set_drvdata(pdev, amd_hfi_data);
+
+	ret = amd_hfi_alloc_class_data(pdev);
+	if (ret)
+		return ret;
+
+	ret = amd_hfi_metadata_parser(pdev, amd_hfi_data);
+	if (ret)
+		return ret;
+
+	ret = update_hfi_ipcc_scores();
+	if (ret)
+		return ret;
+
+	/*
+	 * Tasks will already be running at the time this happens. This is
+	 * OK because rankings will be adjusted by the callbacks.
+	 */
+	ret = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "x86/amd_hfi:online",
+				amd_hfi_online, amd_hfi_offline);
+	if (ret < 0)
+		return ret;
+
+	schedule_work(&sched_amd_hfi_itmt_work);
+
+	amd_hfi_data->dbgfs_dir = debugfs_create_dir("amd_hfi", arch_debugfs_dir);
+	debugfs_create_file("class_capabilities", 0644, amd_hfi_data->dbgfs_dir, pdev,
+			    &class_capabilities_fops);
+
+	return 0;
+}
+
+static struct platform_driver amd_hfi_driver = {
+	.driver = {
+		.name = AMD_HFI_DRIVER,
+		.owner = THIS_MODULE,
+		.pm = &amd_hfi_pm_ops,
+		.acpi_match_table = ACPI_PTR(amd_hfi_platform_match),
+	},
+	.probe = amd_hfi_probe,
+	.remove = amd_hfi_remove,
+};
+
+static int __init amd_hfi_init(void)
+{
+	int ret;
+
+	if (acpi_disabled ||
+	    !cpu_feature_enabled(X86_FEATURE_AMD_HTR_CORES) ||
+	    !cpu_feature_enabled(X86_FEATURE_AMD_WORKLOAD_CLASS))
+		return -ENODEV;
+
+	device = platform_device_register_simple(AMD_HFI_DRIVER, -1, NULL, 0);
+	if (IS_ERR(device)) {
+		pr_err("unable to register HFI platform device\n");
+		return PTR_ERR(device);
+	}
+
+	ret = platform_driver_register(&amd_hfi_driver);
+	if (ret)
+		pr_err("failed to register HFI driver\n");
+
+	return ret;
+}
+
+static __exit void amd_hfi_exit(void)
+{
+	platform_driver_unregister(&amd_hfi_driver);
+	platform_device_unregister(device);
+}
+module_init(amd_hfi_init);
+module_exit(amd_hfi_exit);
+
+MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("AMD Hardware Feedback Interface Driver");
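The kernel-doc for `struct amd_shmem_info` describes `table_data` as the enable bitmaps followed by the ranking data for each logical processor, and `amd_hfi_fill_metadata()` reads an efficiency value followed by a performance value for each class. The standalone sketch below decodes a buffer under a simplified reading of that layout: rankings are assumed to be packed back to back, in ascending order of enabled APIC ID. It is not the driver's exact index arithmetic (in particular it has no counterpart to the `i * info->nr_class` term), and `ex_rank` and `ex_decode_table` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoded ranking for one (processor, class) pair. */
struct ex_rank {
	uint32_t eff;
	uint32_t perf;
};

/*
 * Decode table_data laid out as:
 *   [n_bitmaps x 32-bit enable masks]
 *   [for each enabled APIC ID, ascending: nr_class x (eff, perf)]
 *
 * out must have room for max_out * nr_class entries and apic_ids for
 * max_out entries. Returns the number of processors decoded.
 */
size_t ex_decode_table(const uint32_t *table_data, unsigned int n_bitmaps,
		       unsigned int nr_class, struct ex_rank *out,
		       uint32_t *apic_ids, size_t max_out)
{
	const uint32_t *rank = table_data + n_bitmaps;
	size_t n = 0;

	for (unsigned int i = 0; i < n_bitmaps; i++) {
		for (unsigned int j = 0; j < 32; j++) {
			if (!(table_data[i] & (UINT32_C(1) << j)))
				continue;	/* APIC ID not enabled */

			if (n == max_out)
				return n;	/* caller's buffers are full */

			apic_ids[n] = i * 32 + j;
			for (unsigned int k = 0; k < nr_class; k++) {
				size_t slot = n * nr_class + k;

				out[slot].eff = rank[slot * 2];
				out[slot].perf = rank[slot * 2 + 1];
			}
			n++;
		}
	}

	return n;
}
```

For a table with one bitmap of `0x5` and two classes, the function reports APIC IDs 0 and 2 and reads four (eff, perf) pairs in order.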