Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Thomas Gleixner:
"This lot contains:

- Some fixups for the fallout of the topology consolidation which
unearthed AMD/Intel inconsistencies
- Documentation for the x86 topology management
- Support for AMD advanced power management bits
- Two simple cleanups removing duplicated code"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/cpu: Add advanced power management bits
x86/thread_info: Merge two !__ASSEMBLY__ sections
x86/cpufreq: Remove duplicated TDP MSR macro definitions
x86/Documentation: Start documenting x86 topology
x86/cpu: Get rid of compute_unit_id
perf/x86/amd: Cleanup Fam10h NB event constraints
x86/topology: Fix AMD core count

+246 -30
+208
Documentation/x86/topology.txt
···
+x86 Topology
+============
+
+This documents and clarifies the main aspects of x86 topology modelling and
+representation in the kernel. Update/change when doing changes to the
+respective code.
+
+The architecture-agnostic topology definitions are in
+Documentation/cputopology.txt. This file holds x86-specific
+differences/specialities which must not necessarily apply to the generic
+definitions. Thus, the way to read up on Linux topology on x86 is to start
+with the generic one and look at this one in parallel for the x86 specifics.
+
+Needless to say, code should use the generic functions - this file is *only*
+here to *document* the inner workings of x86 topology.
+
+Started by Thomas Gleixner <tglx@linutronix.de> and Borislav Petkov <bp@alien8.de>.
+
+The main aim of the topology facilities is to present adequate interfaces to
+code which needs to know/query/use the structure of the running system wrt
+threads, cores, packages, etc.
+
+The kernel does not care about the concept of physical sockets because a
+socket has no relevance to software. It's an electromechanical component. In
+the past a socket always contained a single package (see below), but with the
+advent of Multi Chip Modules (MCM) a socket can hold more than one package. So
+there might be still references to sockets in the code, but they are of
+historical nature and should be cleaned up.
+
+The topology of a system is described in the units of:
+
+    - packages
+    - cores
+    - threads
+
+* Package:
+
+  Packages contain a number of cores plus shared resources, e.g. DRAM
+  controller, shared caches etc.
+
+  AMD nomenclature for package is 'Node'.
+
+  Package-related topology information in the kernel:
+
+    - cpuinfo_x86.x86_max_cores:
+
+      The number of cores in a package. This information is retrieved via CPUID.
+
+    - cpuinfo_x86.phys_proc_id:
+
+      The physical ID of the package. This information is retrieved via CPUID
+      and deduced from the APIC IDs of the cores in the package.
+
+    - cpuinfo_x86.logical_id:
+
+      The logical ID of the package. As we do not trust BIOSes to enumerate the
+      packages in a consistent way, we introduced the concept of logical package
+      ID so we can sanely calculate the number of maximum possible packages in
+      the system and have the packages enumerated linearly.
+
+    - topology_max_packages():
+
+      The maximum possible number of packages in the system. Helpful for per
+      package facilities to preallocate per package information.
+
+
+* Cores:
+
+  A core consists of 1 or more threads. It does not matter whether the threads
+  are SMT- or CMT-type threads.
+
+  AMDs nomenclature for a CMT core is "Compute Unit". The kernel always uses
+  "core".
+
+  Core-related topology information in the kernel:
+
+    - smp_num_siblings:
+
+      The number of threads in a core. The number of threads in a package can be
+      calculated by:
+
+        threads_per_package = cpuinfo_x86.x86_max_cores * smp_num_siblings
+
+
+* Threads:
+
+  A thread is a single scheduling unit. It's the equivalent to a logical Linux
+  CPU.
+
+  AMDs nomenclature for CMT threads is "Compute Unit Core". The kernel always
+  uses "thread".
+
+  Thread-related topology information in the kernel:
+
+    - topology_core_cpumask():
+
+      The cpumask contains all online threads in the package to which a thread
+      belongs.
+
+      The number of online threads is also printed in /proc/cpuinfo "siblings."
+
+    - topology_sibling_mask():
+
+      The cpumask contains all online threads in the core to which a thread
+      belongs.
+
+    - topology_logical_package_id():
+
+      The logical package ID to which a thread belongs.
+
+    - topology_physical_package_id():
+
+      The physical package ID to which a thread belongs.
+
+    - topology_core_id();
+
+      The ID of the core to which a thread belongs. It is also printed in /proc/cpuinfo
+      "core_id."
+
+
+
+System topology examples
+
+Note:
+
+The alternative Linux CPU enumeration depends on how the BIOS enumerates the
+threads. Many BIOSes enumerate all threads 0 first and then all threads 1.
+That has the "advantage" that the logical Linux CPU numbers of threads 0 stay
+the same whether threads are enabled or not. That's merely an implementation
+detail and has no practical impact.
+
+1) Single Package, Single Core
+
+        [package 0] -> [core 0] -> [thread 0] -> Linux CPU 0
+
+2) Single Package, Dual Core
+
+    a) One thread per core
+
+        [package 0] -> [core 0] -> [thread 0] -> Linux CPU 0
+                    -> [core 1] -> [thread 0] -> Linux CPU 1
+
+    b) Two threads per core
+
+        [package 0] -> [core 0] -> [thread 0] -> Linux CPU 0
+                                -> [thread 1] -> Linux CPU 1
+                    -> [core 1] -> [thread 0] -> Linux CPU 2
+                                -> [thread 1] -> Linux CPU 3
+
+       Alternative enumeration:
+
+        [package 0] -> [core 0] -> [thread 0] -> Linux CPU 0
+                                -> [thread 1] -> Linux CPU 2
+                    -> [core 1] -> [thread 0] -> Linux CPU 1
+                                -> [thread 1] -> Linux CPU 3
+
+       AMD nomenclature for CMT systems:
+
+        [node 0] -> [Compute Unit 0] -> [Compute Unit Core 0] -> Linux CPU 0
+                                     -> [Compute Unit Core 1] -> Linux CPU 1
+                 -> [Compute Unit 1] -> [Compute Unit Core 0] -> Linux CPU 2
+                                     -> [Compute Unit Core 1] -> Linux CPU 3
+
+4) Dual Package, Dual Core
+
+    a) One thread per core
+
+        [package 0] -> [core 0] -> [thread 0] -> Linux CPU 0
+                    -> [core 1] -> [thread 0] -> Linux CPU 1
+
+        [package 1] -> [core 0] -> [thread 0] -> Linux CPU 2
+                    -> [core 1] -> [thread 0] -> Linux CPU 3
+
+    b) Two threads per core
+
+        [package 0] -> [core 0] -> [thread 0] -> Linux CPU 0
+                                -> [thread 1] -> Linux CPU 1
+                    -> [core 1] -> [thread 0] -> Linux CPU 2
+                                -> [thread 1] -> Linux CPU 3
+
+        [package 1] -> [core 0] -> [thread 0] -> Linux CPU 4
+                                -> [thread 1] -> Linux CPU 5
+                    -> [core 1] -> [thread 0] -> Linux CPU 6
+                                -> [thread 1] -> Linux CPU 7
+
+       Alternative enumeration:
+
+        [package 0] -> [core 0] -> [thread 0] -> Linux CPU 0
+                                -> [thread 1] -> Linux CPU 4
+                    -> [core 1] -> [thread 0] -> Linux CPU 1
+                                -> [thread 1] -> Linux CPU 5
+
+        [package 1] -> [core 0] -> [thread 0] -> Linux CPU 2
+                                -> [thread 1] -> Linux CPU 6
+                    -> [core 1] -> [thread 0] -> Linux CPU 3
+                                -> [thread 1] -> Linux CPU 7
+
+       AMD nomenclature for CMT systems:
+
+        [node 0] -> [Compute Unit 0] -> [Compute Unit Core 0] -> Linux CPU 0
+                                     -> [Compute Unit Core 1] -> Linux CPU 1
+                 -> [Compute Unit 1] -> [Compute Unit Core 0] -> Linux CPU 2
+                                     -> [Compute Unit Core 1] -> Linux CPU 3
+
+        [node 1] -> [Compute Unit 0] -> [Compute Unit Core 0] -> Linux CPU 4
+                                     -> [Compute Unit Core 1] -> Linux CPU 5
+                 -> [Compute Unit 1] -> [Compute Unit Core 0] -> Linux CPU 6
+                                     -> [Compute Unit Core 1] -> Linux CPU 7
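The two enumeration orders in the examples above can be expressed as simple index arithmetic. The following is a minimal userspace sketch, assuming a fully symmetric system; `struct topo_shape` and the function names are hypothetical and are not kernel interfaces.

```c
#include <assert.h>

/* Hypothetical description of a symmetric system; not a kernel structure. */
struct topo_shape {
	unsigned int packages;		/* packages in the system */
	unsigned int cores_per_package;	/* plays the role of x86_max_cores */
	unsigned int threads_per_core;	/* plays the role of smp_num_siblings */
};

/* threads_per_package = x86_max_cores * smp_num_siblings */
static unsigned int threads_per_package(const struct topo_shape *s)
{
	return s->cores_per_package * s->threads_per_core;
}

/* First layout in the examples: threads of one core are numbered consecutively. */
static unsigned int cpu_linear(const struct topo_shape *s, unsigned int pkg,
			       unsigned int core, unsigned int thread)
{
	assert(pkg < s->packages && core < s->cores_per_package &&
	       thread < s->threads_per_core);
	return (pkg * s->cores_per_package + core) * s->threads_per_core + thread;
}

/* "Alternative enumeration": all threads 0 first, then all threads 1. */
static unsigned int cpu_threads_first(const struct topo_shape *s, unsigned int pkg,
				      unsigned int core, unsigned int thread)
{
	assert(pkg < s->packages && core < s->cores_per_package &&
	       thread < s->threads_per_core);
	return thread * (s->packages * s->cores_per_package) +
	       pkg * s->cores_per_package + core;
}
```

For the dual package, dual core, two threads per core case, `cpu_linear()` maps package 1, core 0, thread 0 to Linux CPU 4, while `cpu_threads_first()` maps the same thread to Linux CPU 2, matching the diagrams.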
+18 -3
arch/x86/events/amd/core.c
···

	WARN_ON_ONCE(cpuc->amd_nb);

-	if (boot_cpu_data.x86_max_cores < 2)
+	if (!x86_pmu.amd_nb_constraints)
		return NOTIFY_OK;

	cpuc->amd_nb = amd_alloc_nb(cpu);
···

	cpuc->perf_ctr_virt_mask = AMD64_EVENTSEL_HOSTONLY;

-	if (boot_cpu_data.x86_max_cores < 2)
+	if (!x86_pmu.amd_nb_constraints)
		return;

	nb_id = amd_get_nb_id(cpu);
···
{
	struct cpu_hw_events *cpuhw;

-	if (boot_cpu_data.x86_max_cores < 2)
+	if (!x86_pmu.amd_nb_constraints)
		return;

	cpuhw = &per_cpu(cpu_hw_events, cpu);
···
	.cpu_prepare		= amd_pmu_cpu_prepare,
	.cpu_starting		= amd_pmu_cpu_starting,
	.cpu_dead		= amd_pmu_cpu_dead,
+
+	.amd_nb_constraints	= 1,
};

static int __init amd_core_pmu_init(void)
···
	x86_pmu.eventsel	= MSR_F15H_PERF_CTL;
	x86_pmu.perfctr		= MSR_F15H_PERF_CTR;
	x86_pmu.num_counters	= AMD64_NUM_COUNTERS_CORE;
+	/*
+	 * AMD Core perfctr has separate MSRs for the NB events, see
+	 * the amd/uncore.c driver.
+	 */
+	x86_pmu.amd_nb_constraints = 0;

	pr_cont("core perfctr, ");
	return 0;
···
	ret = amd_core_pmu_init();
	if (ret)
		return ret;
+
+	if (num_possible_cpus() == 1) {
+		/*
+		 * No point in allocating data structures to serialize
+		 * against other CPUs, when there is only the one CPU.
+		 */
+		x86_pmu.amd_nb_constraints = 0;
+	}

	/* Events are common for all AMDs */
	memcpy(hw_cache_event_ids, amd_hw_cache_event_ids,
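The net effect of these hunks on the flag can be modelled in a few lines. This is a userspace sketch only: the function name and its parameters are hypothetical, and treating "core perfctr support" as a boolean input is an assumption based on the `pr_cont("core perfctr, ")` context, since the feature check itself is not visible in the diff.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical model of how amd_nb_constraints ends up set; the kernel
 * keeps it as a one-bit field in struct x86_pmu.
 */
static bool amd_nb_constraints_needed(bool has_core_perfctr,
				      unsigned int possible_cpus)
{
	bool needed = true;	/* static initializer: .amd_nb_constraints = 1 */

	assert(possible_cpus >= 1);

	/* Core perfctr parts have separate MSRs for the NB events. */
	if (has_core_perfctr)
		needed = false;

	/* With a single CPU there is nothing to serialize against. */
	if (possible_cpus == 1)
		needed = false;

	return needed;
}
```

The three CPU hotplug callbacks then test this one flag instead of `boot_cpu_data.x86_max_cores < 2`, so the decision no longer depends on how cores are counted.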
+5
arch/x86/events/perf_event.h
···
	atomic_t	lbr_exclusive[x86_lbr_exclusive_max];

	/*
+	 * AMD bits
+	 */
+	unsigned int	amd_nb_constraints : 1;
+
+	/*
	 * Extra registers for events
	 */
	struct extra_reg *extra_regs;
+1 -7
arch/x86/include/asm/msr-index.h
···
#define MSR_PP1_ENERGY_STATUS	0x00000641
#define MSR_PP1_POLICY	0x00000642

+/* Config TDP MSRs */
#define MSR_CONFIG_TDP_NOMINAL	0x00000648
#define MSR_CONFIG_TDP_LEVEL_1	0x00000649
#define MSR_CONFIG_TDP_LEVEL_2	0x0000064A
···
#define MSR_CORE_PERF_LIMIT_REASONS	0x00000690
#define MSR_GFX_PERF_LIMIT_REASONS	0x000006B0
#define MSR_RING_PERF_LIMIT_REASONS	0x000006B1
-
-/* Config TDP MSRs */
-#define MSR_CONFIG_TDP_NOMINAL	0x00000648
-#define MSR_CONFIG_TDP_LEVEL1	0x00000649
-#define MSR_CONFIG_TDP_LEVEL2	0x0000064A
-#define MSR_CONFIG_TDP_CONTROL	0x0000064B
-#define MSR_TURBO_ACTIVATION_RATIO	0x0000064C

/* Hardware P state interface */
#define MSR_PPERF	0x0000064e
-2
arch/x86/include/asm/processor.h
···
	u16			logical_proc_id;
	/* Core id: */
	u16			cpu_core_id;
-	/* Compute unit id */
-	u8			compute_unit_id;
	/* Index into per_cpu list: */
	u16			cpu_index;
	u32			microcode;
+1
arch/x86/include/asm/smp.h
···
	wbinvd();
	return 0;
}
+#define smp_num_siblings	1
#endif /* CONFIG_SMP */

extern unsigned disabled_cpus;
+2 -4
arch/x86/include/asm/thread_info.h
···
 */
#define force_iret() set_thread_flag(TIF_NOTIFY_RESUME)

-#endif	/* !__ASSEMBLY__ */
-
-#ifndef __ASSEMBLY__
extern void arch_task_cache_init(void);
extern int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src);
extern void arch_release_task_struct(struct task_struct *tsk);
-#endif
+#endif	/* !__ASSEMBLY__ */
+
#endif /* _ASM_X86_THREAD_INFO_H */
+2 -4
arch/x86/kernel/amd_nb.c
···
{
	struct pci_dev *link = node_to_amd_nb(amd_get_nb_id(cpu))->link;
	unsigned int mask;
-	int cuid;

	if (!amd_nb_has_feature(AMD_NB_L3_PARTITIONING))
		return 0;

	pci_read_config_dword(link, 0x1d4, &mask);

-	cuid = cpu_data(cpu).compute_unit_id;
-	return (mask >> (4 * cuid)) & 0xf;
+	return (mask >> (4 * cpu_data(cpu).cpu_core_id)) & 0xf;
}

int amd_set_subcaches(int cpu, unsigned long mask)
···
		pci_write_config_dword(nb->misc, 0x1b8, reg & ~0x180000);
	}

-	cuid = cpu_data(cpu).compute_unit_id;
+	cuid = cpu_data(cpu).cpu_core_id;
	mask <<= 4 * cuid;
	mask |= (0xf ^ (1 << cuid)) << 26;

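The read path in the first hunk selects one 4-bit field per core out of a 32-bit register value. A minimal sketch of just that arithmetic follows; `subcache_nibble()` is a hypothetical helper and the register value is passed in directly instead of being read from PCI config space.

```c
#include <assert.h>
#include <stdint.h>

/* Return the 4-bit subcache field that belongs to core_id. */
static unsigned int subcache_nibble(uint32_t mask, unsigned int core_id)
{
	assert(core_id < 8);	/* a 32-bit value holds at most 8 nibbles */
	return (mask >> (4 * core_id)) & 0xf;
}
```

After this change the nibble index is `cpu_core_id`, which now carries the value that `compute_unit_id` used to hold on these parts.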
+4 -8
arch/x86/kernel/cpu/amd.c
···
#ifdef CONFIG_SMP
static void amd_get_topology(struct cpuinfo_x86 *c)
{
-	u32 cores_per_cu = 1;
	u8 node_id;
	int cpu = smp_processor_id();

···

		/* get compute unit information */
		smp_num_siblings = ((ebx >> 8) & 3) + 1;
-		c->compute_unit_id = ebx & 0xff;
-		cores_per_cu += ((ebx >> 8) & 3);
+		c->x86_max_cores /= smp_num_siblings;
+		c->cpu_core_id = ebx & 0xff;
	} else if (cpu_has(c, X86_FEATURE_NODEID_MSR)) {
		u64 value;

···

	/* fixup multi-node processor information */
	if (nodes_per_socket > 1) {
-		u32 cores_per_node;
		u32 cus_per_node;

		set_cpu_cap(c, X86_FEATURE_AMD_DCM);
-		cores_per_node = c->x86_max_cores / nodes_per_socket;
-		cus_per_node = cores_per_node / cores_per_cu;
+		cus_per_node = c->x86_max_cores / nodes_per_socket;

		/* store NodeID, use llc_shared_map to store sibling info */
		per_cpu(cpu_llc_id, cpu) = node_id;

		/* core id has to be in the [0 .. cores_per_node - 1] range */
-		c->cpu_core_id %= cores_per_node;
-		c->compute_unit_id %= cus_per_node;
+		c->cpu_core_id %= cus_per_node;
	}
}
#endif
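The new field decoding and the multi-node fixup can be followed with plain integers. The sketch below mirrors the arithmetic in the hunks above; `struct cu_info`, `decode_cu()` and `core_id_in_node()` are hypothetical names, and `max_cores_before` stands for the value of `x86_max_cores` before the division. The hunk does not show where `ebx` is read from, so it is simply taken as an input here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the values the hunk computes. */
struct cu_info {
	unsigned int threads_per_core;	/* smp_num_siblings */
	unsigned int core_id;		/* cpu_core_id */
	unsigned int cores;		/* x86_max_cores after the division */
};

static struct cu_info decode_cu(uint32_t ebx, unsigned int max_cores_before)
{
	struct cu_info info;

	info.threads_per_core = ((ebx >> 8) & 3) + 1;
	info.core_id = ebx & 0xff;
	info.cores = max_cores_before / info.threads_per_core;
	return info;
}

/* Multi-node fixup: fold the core ID into the [0, cus_per_node) range. */
static unsigned int core_id_in_node(unsigned int core_id, unsigned int cores,
				    unsigned int nodes_per_socket)
{
	unsigned int cus_per_node;

	assert(nodes_per_socket > 0 && cores >= nodes_per_socket);
	cus_per_node = cores / nodes_per_socket;
	return core_id % cus_per_node;
}
```

With `ebx = 0x103` and 16 as the previous `x86_max_cores`, this yields two threads per core, core ID 3 and eight cores, which is the "cores, not threads" meaning that the topology documentation in this merge describes for `x86_max_cores`.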
+2
arch/x86/kernel/cpu/powerflags.c
···
	"",	/* tsc invariant mapped to constant_tsc */
	"cpb",  /* core performance boost */
	"eff_freq_ro", /* Readonly aperf/mperf */
+	"proc_feedback", /* processor feedback interface */
+	"acc_power", /* accumulated power mechanism */
};
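The table is positional: each string names one bit of a power management feature word. The sketch below covers only the entries visible in this hunk. The bit indices 8 to 12 are an assumption inferred from the entries' positions in the array, and `power_flag_tail` and `power_flag_name()` are hypothetical names, not kernel symbols.

```c
#include <assert.h>
#include <stddef.h>

/* Only the entries shown in the hunk; indices 8..12 are assumed positions. */
static const char *const power_flag_tail[] = {
	[8]  = "",		/* tsc invariant, reported elsewhere */
	[9]  = "cpb",
	[10] = "eff_freq_ro",
	[11] = "proc_feedback",	/* added by this commit */
	[12] = "acc_power",	/* added by this commit */
};

/* Return the flag name for a bit, or NULL if this sketch has no name for it. */
static const char *power_flag_name(unsigned int bit)
{
	size_t n = sizeof(power_flag_tail) / sizeof(power_flag_tail[0]);

	assert(bit < 32);	/* one flag per bit of a 32-bit word */
	if (bit >= n || !power_flag_tail[bit] || power_flag_tail[bit][0] == '\0')
		return NULL;
	return power_flag_tail[bit];
}
```

A bit without a string in the table has no name to print, which is why appending the two strings is enough to give the new bits readable names.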
+1 -1
arch/x86/kernel/smpboot.c
···

		if (c->phys_proc_id == o->phys_proc_id &&
		    per_cpu(cpu_llc_id, cpu1) == per_cpu(cpu_llc_id, cpu2) &&
-		    c->compute_unit_id == o->compute_unit_id)
+		    c->cpu_core_id == o->cpu_core_id)
			return topology_sane(c, o, "smt");

	} else if (c->phys_proc_id == o->phys_proc_id &&
+2 -1
arch/x86/ras/mce_amd_inj.c
···
#include <linux/pci.h>

#include <asm/mce.h>
+#include <asm/smp.h>
#include <asm/amd_nb.h>
#include <asm/irq_vectors.h>

···
	struct cpuinfo_x86 *c = &boot_cpu_data;
	u32 cores_per_node;

-	cores_per_node = c->x86_max_cores / amd_get_nodes_per_socket();
+	cores_per_node = (c->x86_max_cores * smp_num_siblings) / amd_get_nodes_per_socket();

	return cores_per_node * node_id;
}
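Because `x86_max_cores` now counts cores instead of threads, the node base computation has to multiply by `smp_num_siblings` to keep producing a logical CPU number. A small sketch of the corrected arithmetic, with a hypothetical function name and plain integer inputs in place of the kernel accessors:

```c
#include <assert.h>

/* First logical CPU of a node, following the updated formula in the hunk. */
static unsigned int node_base_cpu(unsigned int max_cores, unsigned int siblings,
				  unsigned int nodes_per_socket,
				  unsigned int node_id)
{
	unsigned int cores_per_node;

	assert(nodes_per_socket > 0);
	cores_per_node = (max_cores * siblings) / nodes_per_socket;
	return cores_per_node * node_id;
}
```

For a package with 8 cores, 2 threads per core and 2 nodes per socket, node 1 starts at logical CPU 8. Leaving out the sibling factor (equivalent to passing `siblings = 1`) would give 4, which points into node 0.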