Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Merge tag 'perf-core-2026-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull performance event updates from Ingo Molnar:
"x86 PMU driver updates:

- Add support for the core PMU for Intel Diamond Rapids (DMR) CPUs
(Dapeng Mi)

Compared to previous iterations of the Intel PMU code, there have been
a lot of changes, which center around three main areas:

- Introduce the Off-Module Response (OMR) facility to replace the
Off-Core Response (OCR) facility

- New PEBS data source encoding layout

- Support the new "RDPMC user disable" feature

- Likewise, a large series adds uncore PMU support for Intel Diamond
Rapids (DMR) CPUs (Zide Chen)

This centers around these four main areas:

- DMR may have two Integrated I/O and Memory Hub (IMH) dies,
separate from the compute tile (CBB) dies. Each CBB and each IMH
die has its own discovery domain.

- Unlike prior CPUs that retrieve the global discovery table
portal exclusively via PCI or MSR, DMR uses PCI for IMH PMON
discovery and MSR for CBB PMON discovery.

- DMR introduces several new PMON types: SCA, HAMVF, D2D_ULA, UBR,
PCIE4, CRS, CPC, ITC, OTC, CMS, and PCIE6.

- IIO free-running counters in DMR are MMIO-based, unlike SPR.

- Also add missing PMON units for Intel Panther Lake, and support
Nova Lake (NVL), which largely maps to Panther Lake.
(Zide Chen)

- KVM integration: Add support for mediated vPMUs (by Kan Liang and
Sean Christopherson, with fixes and cleanups by Peter Zijlstra,
Sandipan Das and Mingwei Zhang)

- Add Intel cstate driver support for Wildcat Lake (WCL) CPUs,
which are a low-power variant of Panther Lake (Zide Chen)

- Add core, cstate and MSR PMU support for the Airmont NP Intel CPU
(aka MaxLinear Lightning Mountain), which maps to the existing
Airmont code (Martin Schiller)

Performance enhancements:

- Speed up kexec shutdown by avoiding unnecessary cross CPU calls
(Jan H. Schönherr)

- Fix slow perf_event_task_exit() with LBR callstacks (Namhyung Kim)

User-space stack unwinding support:

- Various cleanups and refactorings in preparation to generalize the
unwinding code for other architectures (Jens Remus)

Uprobes updates:

- Transition from kmap_atomic to kmap_local_page (Keke Ming)

- Fix incorrect lockdep condition in filter_chain() (Breno Leitao)

- Fix XOL allocation failure for 32-bit tasks (Oleg Nesterov)

Misc fixes and cleanups:

- s390: Remove kvm_types.h from Kbuild (Randy Dunlap)

- x86/intel/uncore: Convert comma to semicolon (Chen Ni)

- x86/uncore: Clean up const mismatch (Greg Kroah-Hartman)

- x86/ibs: Fix typo in dc_l2tlb_miss comment (Xiang-Bin Shi)"

* tag 'perf-core-2026-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (58 commits)
s390: remove kvm_types.h from Kbuild
uprobes: Fix incorrect lockdep condition in filter_chain()
x86/ibs: Fix typo in dc_l2tlb_miss comment
x86/uprobes: Fix XOL allocation failure for 32-bit tasks
perf/x86/intel/uncore: Convert comma to semicolon
perf/x86/intel: Add support for rdpmc user disable feature
perf/x86: Use macros to replace magic numbers in attr_rdpmc
perf/x86/intel: Add core PMU support for Novalake
perf/x86/intel: Add support for PEBS memory auxiliary info field in NVL
perf/x86/intel: Add core PMU support for DMR
perf/x86/intel: Add support for PEBS memory auxiliary info field in DMR
perf/x86/intel: Support the 4 new OMR MSRs introduced in DMR and NVL
perf/core: Fix slow perf_event_task_exit() with LBR callstacks
perf/core: Speed up kexec shutdown by avoiding unnecessary cross CPU calls
uprobes: use kmap_local_page() for temporary page mappings
arm/uprobes: use kmap_local_page() in arch_uprobe_copy_ixol()
mips/uprobes: use kmap_local_page() in arch_uprobe_copy_ixol()
arm64/uprobes: use kmap_local_page() in arch_uprobe_copy_ixol()
riscv/uprobes: use kmap_local_page() in arch_uprobe_copy_ixol()
perf/x86/intel/uncore: Add Nova Lake support
...

+2155 -523
+44
Documentation/ABI/testing/sysfs-bus-event_source-devices-rdpmc
···
+What:           /sys/bus/event_source/devices/cpu.../rdpmc
+Date:           November 2011
+KernelVersion:  3.10
+Contact:        Linux kernel mailing list linux-kernel@vger.kernel.org
+Description:    The /sys/bus/event_source/devices/cpu.../rdpmc attribute
+                is used to show/manage if rdpmc instruction can be
+                executed in user space. This attribute supports 3 numbers.
+                - rdpmc = 0
+                user space rdpmc is globally disabled for all PMU
+                counters.
+                - rdpmc = 1
+                user space rdpmc is globally enabled only in event mmap
+                ioctl called time window. If the mmap region is unmapped,
+                user space rdpmc is disabled again.
+                - rdpmc = 2
+                user space rdpmc is globally enabled for all PMU
+                counters.
+
+                In the Intel platforms supporting counter level's user
+                space rdpmc disable feature (CPUID.23H.EBX[2] = 1), the
+                meaning of 3 numbers is extended to
+                - rdpmc = 0
+                global user space rdpmc and counter level's user space
+                rdpmc of all counters are both disabled.
+                - rdpmc = 1
+                No changes on behavior of global user space rdpmc.
+                counter level's rdpmc of system-wide events is disabled
+                but counter level's rdpmc of non-system-wide events is
+                enabled.
+                - rdpmc = 2
+                global user space rdpmc and counter level's user space
+                rdpmc of all counters are both enabled unconditionally.
+
+                The default value of rdpmc is 1.
+
+                Please notice:
+                - global user space rdpmc's behavior would change
+                immediately along with the rdpmc value's change,
+                but the behavior of counter level's user space rdpmc
+                won't take effect immediately until the event is
+                reactivated or recreated.
+                - The rdpmc attribute is global, even for x86 hybrid
+                platforms. For example, changing cpu_core/rdpmc will
+                also change cpu_atom/rdpmc.
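The extended meaning of the three rdpmc values can be summarized as a small decision function. The sketch below is a user-space model only, not kernel code: the enum and the helper name `counter_user_rdpmc_allowed` are hypothetical, and it covers just the counter-level behavior described for platforms with CPUID.23H.EBX[2] = 1.

```c
#include <stdbool.h>

/* Values accepted by the rdpmc sysfs attribute (names are hypothetical). */
enum user_rdpmc_mode {
	USER_RDPMC_NEVER = 0,
	USER_RDPMC_CONDITIONAL = 1,	/* the documented default */
	USER_RDPMC_ALWAYS = 2,
};

/*
 * Hypothetical helper: returns true when counter-level user space rdpmc
 * would be enabled for an event, given the attribute value and whether
 * the event is system-wide.
 */
static bool counter_user_rdpmc_allowed(enum user_rdpmc_mode mode,
				       bool system_wide)
{
	if (mode == USER_RDPMC_ALWAYS)
		return true;		/* enabled unconditionally */
	if (mode == USER_RDPMC_CONDITIONAL)
		return !system_wide;	/* only non-system-wide events */
	return false;			/* rdpmc = 0: disabled for all counters */
}
```

As the documentation notes, a change to the attribute only reaches the counter-level setting once an event is reactivated or recreated, which this model does not capture.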
+2 -2
arch/arm/probes/uprobes/core.c
···
 void arch_uprobe_copy_ixol(struct page *page, unsigned long vaddr,
 			   void *src, unsigned long len)
 {
-	void *xol_page_kaddr = kmap_atomic(page);
+	void *xol_page_kaddr = kmap_local_page(page);
 	void *dst = xol_page_kaddr + (vaddr & ~PAGE_MASK);

 	preempt_disable();
···

 	preempt_enable();

-	kunmap_atomic(xol_page_kaddr);
+	kunmap_local(xol_page_kaddr);
 }

+2 -2
arch/arm64/kernel/probes/uprobes.c
···
 void arch_uprobe_copy_ixol(struct page *page, unsigned long vaddr,
 		void *src, unsigned long len)
 {
-	void *xol_page_kaddr = kmap_atomic(page);
+	void *xol_page_kaddr = kmap_local_page(page);
 	void *dst = xol_page_kaddr + (vaddr & ~PAGE_MASK);

 	/*
···
 	sync_icache_aliases((unsigned long)dst, (unsigned long)dst + len);

 done:
-	kunmap_atomic(xol_page_kaddr);
+	kunmap_local(xol_page_kaddr);
 }

 unsigned long uprobe_get_swbp_addr(struct pt_regs *regs)
+2 -2
arch/mips/kernel/uprobes.c
···
 	unsigned long kaddr, kstart;

 	/* Initialize the slot */
-	kaddr = (unsigned long)kmap_atomic(page);
+	kaddr = (unsigned long)kmap_local_page(page);
 	kstart = kaddr + (vaddr & ~PAGE_MASK);
 	memcpy((void *)kstart, src, len);
 	flush_icache_range(kstart, kstart + len);
-	kunmap_atomic((void *)kaddr);
+	kunmap_local((void *)kaddr);
 }

 /**
+2 -2
arch/riscv/kernel/probes/uprobes.c
···
 			   void *src, unsigned long len)
 {
 	/* Initialize the slot */
-	void *kaddr = kmap_atomic(page);
+	void *kaddr = kmap_local_page(page);
 	void *dst = kaddr + (vaddr & ~PAGE_MASK);
 	unsigned long start = (unsigned long)dst;

···
 	}

 	flush_icache_range(start, start + len);
-	kunmap_atomic(kaddr);
+	kunmap_local(kaddr);
 }
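All four arch_uprobe_copy_ixol() variants above compute the destination inside the mapped XOL page as `kaddr + (vaddr & ~PAGE_MASK)`. The following user-space sketch illustrates only that offset arithmetic. It assumes 4 KiB pages; the `SKETCH_*` macros and the helper name are hypothetical stand-ins for the kernel's `PAGE_SIZE` and `PAGE_MASK`.

```c
#include <stdint.h>

/* Assumption: 4 KiB pages. The kernel takes these from PAGE_SHIFT. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  (UINT64_C(1) << SKETCH_PAGE_SHIFT)
/* Like the kernel's PAGE_MASK, this keeps the page-number bits. */
#define SKETCH_PAGE_MASK  (~(SKETCH_PAGE_SIZE - 1))

/*
 * Hypothetical helper: byte offset of a user virtual address within its
 * page, i.e. the part that is added to the mapped kernel address.
 */
static uint64_t offset_in_page_sketch(uint64_t vaddr)
{
	return vaddr & ~SKETCH_PAGE_MASK;
}
```

Because the offset is always smaller than the page size, adding it to the address returned by the page mapping stays within the single mapped page, provided the copied length does not cross the page end.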
-1
arch/s390/include/asm/Kbuild
···
 generated-y += unistd_nr.h

 generic-y += asm-offsets.h
-generic-y += kvm_types.h
 generic-y += mcs_spinlock.h
 generic-y += mmzone.h
+1
arch/x86/entry/entry_fred.c
···

 	SYSVEC(IRQ_WORK_VECTOR, irq_work),

+	SYSVEC(PERF_GUEST_MEDIATED_PMI_VECTOR, perf_guest_mediated_pmi_handler),
 	SYSVEC(POSTED_INTR_VECTOR, kvm_posted_intr_ipi),
 	SYSVEC(POSTED_INTR_WAKEUP_VECTOR, kvm_posted_intr_wakeup_ipi),
 	SYSVEC(POSTED_INTR_NESTED_VECTOR, kvm_posted_intr_nested_ipi),
+2
arch/x86/events/amd/core.c
···

 	amd_pmu_global_cntr_mask = x86_pmu.cntr_mask64;

+	x86_get_pmu(smp_processor_id())->capabilities |= PERF_PMU_CAP_MEDIATED_VPMU;
+
 	/* Update PMC handling functions */
 	x86_pmu.enable_all = amd_pmu_v2_enable_all;
 	x86_pmu.disable_all = amd_pmu_v2_disable_all;
+61 -5
arch/x86/events/core.c
···
 #include <linux/device.h>
 #include <linux/nospec.h>
 #include <linux/static_call.h>
+#include <linux/kvm_types.h>

 #include <asm/apic.h>
 #include <asm/stacktrace.h>
···
 	.enabled = 1,
 	.pmu = &pmu,
 };
+
+static DEFINE_PER_CPU(bool, guest_lvtpc_loaded);

 DEFINE_STATIC_KEY_FALSE(rdpmc_never_available_key);
 DEFINE_STATIC_KEY_FALSE(rdpmc_always_available_key);
···
 	apic_write(APIC_LVTPC, APIC_DM_NMI);
 }

+#ifdef CONFIG_PERF_GUEST_MEDIATED_PMU
+void perf_load_guest_lvtpc(u32 guest_lvtpc)
+{
+	u32 masked = guest_lvtpc & APIC_LVT_MASKED;
+
+	apic_write(APIC_LVTPC,
+		   APIC_DM_FIXED | PERF_GUEST_MEDIATED_PMI_VECTOR | masked);
+	this_cpu_write(guest_lvtpc_loaded, true);
+}
+EXPORT_SYMBOL_FOR_KVM(perf_load_guest_lvtpc);
+
+void perf_put_guest_lvtpc(void)
+{
+	this_cpu_write(guest_lvtpc_loaded, false);
+	apic_write(APIC_LVTPC, APIC_DM_NMI);
+}
+EXPORT_SYMBOL_FOR_KVM(perf_put_guest_lvtpc);
+#endif /* CONFIG_PERF_GUEST_MEDIATED_PMU */
+
 static int
 perf_event_nmi_handler(unsigned int cmd, struct pt_regs *regs)
 {
 	u64 start_clock;
 	u64 finish_clock;
 	int ret;
+
+	/*
+	 * Ignore all NMIs when the CPU's LVTPC is configured to route PMIs to
+	 * PERF_GUEST_MEDIATED_PMI_VECTOR, i.e. when an NMI time can't be due
+	 * to a PMI. Attempting to handle a PMI while the guest's context is
+	 * loaded will generate false positives and clobber guest state. Note,
+	 * the LVTPC is switched to/from the dedicated mediated PMI IRQ vector
+	 * while host events are quiesced.
+	 */
+	if (this_cpu_read(guest_lvtpc_loaded))
+		return NMI_DONE;

 	/*
 	 * All PMUs/events that share this PMI handler should make sure to
···

 	pr_cont("%s PMU driver.\n", x86_pmu.name);

-	x86_pmu.attr_rdpmc = 1; /* enable userspace RDPMC usage by default */
+	/* enable userspace RDPMC usage by default */
+	x86_pmu.attr_rdpmc = X86_USER_RDPMC_CONDITIONAL_ENABLE;

 	for (quirk = x86_pmu.quirks; quirk; quirk = quirk->next)
 		quirk->func();
···
 	return snprintf(buf, 40, "%d\n", x86_pmu.attr_rdpmc);
 }

+/*
+ * Behaviors of rdpmc value:
+ * - rdpmc = 0
+ *    global user space rdpmc and counter level's user space rdpmc of all
+ *    counters are both disabled.
+ * - rdpmc = 1
+ *    global user space rdpmc is enabled in mmap enabled time window and
+ *    counter level's user space rdpmc is enabled for only non system-wide
+ *    events. Counter level's user space rdpmc of system-wide events is
+ *    still disabled by default. This won't introduce counter data leak for
+ *    non system-wide events since their count data would be cleared when
+ *    context switches.
+ * - rdpmc = 2
+ *    global user space rdpmc and counter level's user space rdpmc of all
+ *    counters are enabled unconditionally.
+ *
+ * Suppose the rdpmc value won't be changed frequently, don't dynamically
+ * reschedule events to make the new rpdmc value take effect on active perf
+ * events immediately, the new rdpmc value would only impact the new
+ * activated perf events. This makes code simpler and cleaner.
+ */
 static ssize_t set_attr_rdpmc(struct device *cdev,
 			      struct device_attribute *attr,
 			      const char *buf, size_t count)
···
 	 */
 	if (val == 0)
 		static_branch_inc(&rdpmc_never_available_key);
-	else if (x86_pmu.attr_rdpmc == 0)
+	else if (x86_pmu.attr_rdpmc == X86_USER_RDPMC_NEVER_ENABLE)
 		static_branch_dec(&rdpmc_never_available_key);

 	if (val == 2)
 		static_branch_inc(&rdpmc_always_available_key);
-	else if (x86_pmu.attr_rdpmc == 2)
+	else if (x86_pmu.attr_rdpmc == X86_USER_RDPMC_ALWAYS_ENABLE)
 		static_branch_dec(&rdpmc_always_available_key);

 	on_each_cpu(cr4_update_pce, NULL, 1);
···
 	cap->version = x86_pmu.version;
 	cap->num_counters_gp = x86_pmu_num_counters(NULL);
 	cap->num_counters_fixed = x86_pmu_num_counters_fixed(NULL);
-	cap->bit_width_gp = x86_pmu.cntval_bits;
-	cap->bit_width_fixed = x86_pmu.cntval_bits;
+	cap->bit_width_gp = cap->num_counters_gp ? x86_pmu.cntval_bits : 0;
+	cap->bit_width_fixed = cap->num_counters_fixed ? x86_pmu.cntval_bits : 0;
 	cap->events_mask = (unsigned int)x86_pmu.events_maskl;
 	cap->events_mask_len = x86_pmu.events_mask_len;
 	cap->pebs_ept = x86_pmu.pebs_ept;
+	cap->mediated = !!(pmu.capabilities & PERF_PMU_CAP_MEDIATED_VPMU);
 }
 EXPORT_SYMBOL_FOR_KVM(perf_get_x86_pmu_capability);

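The value that perf_load_guest_lvtpc() writes to the LVTPC register combines a fixed delivery mode, the dedicated mediated PMI vector, and the guest's mask bit. The sketch below models that composition in user space. The `SKETCH_*` constants are assumptions that mirror the local APIC LVT layout as commonly defined (fixed delivery mode 0, NMI delivery mode 0x400, mask bit 16), and the vector is passed as a parameter because the actual value of `PERF_GUEST_MEDIATED_PMI_VECTOR` is not shown in this diff.

```c
#include <stdint.h>

/* Assumed local APIC LVT encodings, for illustration only. */
#define SKETCH_APIC_DM_FIXED	0x00000u
#define SKETCH_APIC_DM_NMI	0x00400u
#define SKETCH_APIC_LVT_MASKED	(1u << 16)

/*
 * Hypothetical helper: LVTPC value while a guest's mediated PMU context
 * is loaded. Only the guest's mask bit is carried over; delivery mode and
 * vector are chosen by the host.
 */
static uint32_t guest_lvtpc_value(uint32_t guest_lvtpc, uint8_t vector)
{
	uint32_t masked = guest_lvtpc & SKETCH_APIC_LVT_MASKED;

	return SKETCH_APIC_DM_FIXED | vector | masked;
}
```

When the guest context is put, the diff restores `APIC_DM_NMI`, so host PMIs are again delivered as NMIs and perf_event_nmi_handler() stops ignoring them.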
+352 -18
arch/x86/events/intel/core.c
··· 232 232 EVENT_CONSTRAINT_END 233 233 }; 234 234 235 + static struct event_constraint intel_arw_event_constraints[] __read_mostly = { 236 + FIXED_EVENT_CONSTRAINT(0x00c0, 0), /* INST_RETIRED.ANY */ 237 + FIXED_EVENT_CONSTRAINT(0x003c, 1), /* CPU_CLK_UNHALTED.CORE */ 238 + FIXED_EVENT_CONSTRAINT(0x0300, 2), /* pseudo CPU_CLK_UNHALTED.REF */ 239 + FIXED_EVENT_CONSTRAINT(0x013c, 2), /* CPU_CLK_UNHALTED.REF_TSC_P */ 240 + FIXED_EVENT_CONSTRAINT(0x0073, 4), /* TOPDOWN_BAD_SPECULATION.ALL */ 241 + FIXED_EVENT_CONSTRAINT(0x019c, 5), /* TOPDOWN_FE_BOUND.ALL */ 242 + FIXED_EVENT_CONSTRAINT(0x02c2, 6), /* TOPDOWN_RETIRING.ALL */ 243 + INTEL_UEVENT_CONSTRAINT(0x01b7, 0x1), 244 + INTEL_UEVENT_CONSTRAINT(0x02b7, 0x2), 245 + INTEL_UEVENT_CONSTRAINT(0x04b7, 0x4), 246 + INTEL_UEVENT_CONSTRAINT(0x08b7, 0x8), 247 + INTEL_UEVENT_CONSTRAINT(0x01d4, 0x1), 248 + INTEL_UEVENT_CONSTRAINT(0x02d4, 0x2), 249 + INTEL_UEVENT_CONSTRAINT(0x04d4, 0x4), 250 + INTEL_UEVENT_CONSTRAINT(0x08d4, 0x8), 251 + INTEL_UEVENT_CONSTRAINT(0x0175, 0x1), 252 + INTEL_UEVENT_CONSTRAINT(0x0275, 0x2), 253 + INTEL_UEVENT_CONSTRAINT(0x21d3, 0x1), 254 + INTEL_UEVENT_CONSTRAINT(0x22d3, 0x1), 255 + EVENT_CONSTRAINT_END 256 + }; 257 + 235 258 static struct event_constraint intel_skl_event_constraints[] = { 236 259 FIXED_EVENT_CONSTRAINT(0x00c0, 0), /* INST_RETIRED.ANY */ 237 260 FIXED_EVENT_CONSTRAINT(0x003c, 1), /* CPU_CLK_UNHALTED.CORE */ ··· 450 427 static struct extra_reg intel_lnc_extra_regs[] __read_mostly = { 451 428 INTEL_UEVENT_EXTRA_REG(0x012a, MSR_OFFCORE_RSP_0, 0xfffffffffffull, RSP_0), 452 429 INTEL_UEVENT_EXTRA_REG(0x012b, MSR_OFFCORE_RSP_1, 0xfffffffffffull, RSP_1), 430 + INTEL_UEVENT_PEBS_LDLAT_EXTRA_REG(0x01cd), 431 + INTEL_UEVENT_EXTRA_REG(0x02c6, MSR_PEBS_FRONTEND, 0x9, FE), 432 + INTEL_UEVENT_EXTRA_REG(0x03c6, MSR_PEBS_FRONTEND, 0x7fff1f, FE), 433 + INTEL_UEVENT_EXTRA_REG(0x40ad, MSR_PEBS_FRONTEND, 0xf, FE), 434 + INTEL_UEVENT_EXTRA_REG(0x04c2, MSR_PEBS_FRONTEND, 0x8, FE), 435 + EVENT_EXTRA_END 436 
+ }; 437 + 438 + static struct event_constraint intel_pnc_event_constraints[] = { 439 + FIXED_EVENT_CONSTRAINT(0x00c0, 0), /* INST_RETIRED.ANY */ 440 + FIXED_EVENT_CONSTRAINT(0x0100, 0), /* INST_RETIRED.PREC_DIST */ 441 + FIXED_EVENT_CONSTRAINT(0x003c, 1), /* CPU_CLK_UNHALTED.CORE */ 442 + FIXED_EVENT_CONSTRAINT(0x0300, 2), /* CPU_CLK_UNHALTED.REF */ 443 + FIXED_EVENT_CONSTRAINT(0x013c, 2), /* CPU_CLK_UNHALTED.REF_TSC_P */ 444 + FIXED_EVENT_CONSTRAINT(0x0400, 3), /* SLOTS */ 445 + METRIC_EVENT_CONSTRAINT(INTEL_TD_METRIC_RETIRING, 0), 446 + METRIC_EVENT_CONSTRAINT(INTEL_TD_METRIC_BAD_SPEC, 1), 447 + METRIC_EVENT_CONSTRAINT(INTEL_TD_METRIC_FE_BOUND, 2), 448 + METRIC_EVENT_CONSTRAINT(INTEL_TD_METRIC_BE_BOUND, 3), 449 + METRIC_EVENT_CONSTRAINT(INTEL_TD_METRIC_HEAVY_OPS, 4), 450 + METRIC_EVENT_CONSTRAINT(INTEL_TD_METRIC_BR_MISPREDICT, 5), 451 + METRIC_EVENT_CONSTRAINT(INTEL_TD_METRIC_FETCH_LAT, 6), 452 + METRIC_EVENT_CONSTRAINT(INTEL_TD_METRIC_MEM_BOUND, 7), 453 + 454 + INTEL_EVENT_CONSTRAINT(0x20, 0xf), 455 + INTEL_EVENT_CONSTRAINT(0x79, 0xf), 456 + 457 + INTEL_UEVENT_CONSTRAINT(0x0275, 0xf), 458 + INTEL_UEVENT_CONSTRAINT(0x0176, 0xf), 459 + INTEL_UEVENT_CONSTRAINT(0x04a4, 0x1), 460 + INTEL_UEVENT_CONSTRAINT(0x08a4, 0x1), 461 + INTEL_UEVENT_CONSTRAINT(0x01cd, 0xfc), 462 + INTEL_UEVENT_CONSTRAINT(0x02cd, 0x3), 463 + 464 + INTEL_EVENT_CONSTRAINT(0xd0, 0xf), 465 + INTEL_EVENT_CONSTRAINT(0xd1, 0xf), 466 + INTEL_EVENT_CONSTRAINT(0xd4, 0xf), 467 + INTEL_EVENT_CONSTRAINT(0xd6, 0xf), 468 + INTEL_EVENT_CONSTRAINT(0xdf, 0xf), 469 + INTEL_EVENT_CONSTRAINT(0xce, 0x1), 470 + 471 + INTEL_UEVENT_CONSTRAINT(0x01b1, 0x8), 472 + INTEL_UEVENT_CONSTRAINT(0x0847, 0xf), 473 + INTEL_UEVENT_CONSTRAINT(0x0446, 0xf), 474 + INTEL_UEVENT_CONSTRAINT(0x0846, 0xf), 475 + INTEL_UEVENT_CONSTRAINT(0x0148, 0xf), 476 + 477 + EVENT_CONSTRAINT_END 478 + }; 479 + 480 + static struct extra_reg intel_pnc_extra_regs[] __read_mostly = { 481 + /* must define OMR_X first, see intel_alt_er() */ 482 + 
INTEL_UEVENT_EXTRA_REG(0x012a, MSR_OMR_0, 0x40ffffff0000ffffull, OMR_0), 483 + INTEL_UEVENT_EXTRA_REG(0x022a, MSR_OMR_1, 0x40ffffff0000ffffull, OMR_1), 484 + INTEL_UEVENT_EXTRA_REG(0x042a, MSR_OMR_2, 0x40ffffff0000ffffull, OMR_2), 485 + INTEL_UEVENT_EXTRA_REG(0x082a, MSR_OMR_3, 0x40ffffff0000ffffull, OMR_3), 453 486 INTEL_UEVENT_PEBS_LDLAT_EXTRA_REG(0x01cd), 454 487 INTEL_UEVENT_EXTRA_REG(0x02c6, MSR_PEBS_FRONTEND, 0x9, FE), 455 488 INTEL_UEVENT_EXTRA_REG(0x03c6, MSR_PEBS_FRONTEND, 0x7fff1f, FE), ··· 725 646 [ C(OP_READ) ] = { 726 647 [ C(RESULT_ACCESS) ] = 0x10c000001, 727 648 [ C(RESULT_MISS) ] = 0x3fb3000001, 649 + }, 650 + }, 651 + }; 652 + 653 + static __initconst const u64 pnc_hw_cache_event_ids 654 + [PERF_COUNT_HW_CACHE_MAX] 655 + [PERF_COUNT_HW_CACHE_OP_MAX] 656 + [PERF_COUNT_HW_CACHE_RESULT_MAX] = 657 + { 658 + [ C(L1D ) ] = { 659 + [ C(OP_READ) ] = { 660 + [ C(RESULT_ACCESS) ] = 0x81d0, 661 + [ C(RESULT_MISS) ] = 0xe124, 662 + }, 663 + [ C(OP_WRITE) ] = { 664 + [ C(RESULT_ACCESS) ] = 0x82d0, 665 + }, 666 + }, 667 + [ C(L1I ) ] = { 668 + [ C(OP_READ) ] = { 669 + [ C(RESULT_MISS) ] = 0xe424, 670 + }, 671 + [ C(OP_WRITE) ] = { 672 + [ C(RESULT_ACCESS) ] = -1, 673 + [ C(RESULT_MISS) ] = -1, 674 + }, 675 + }, 676 + [ C(LL ) ] = { 677 + [ C(OP_READ) ] = { 678 + [ C(RESULT_ACCESS) ] = 0x12a, 679 + [ C(RESULT_MISS) ] = 0x12a, 680 + }, 681 + [ C(OP_WRITE) ] = { 682 + [ C(RESULT_ACCESS) ] = 0x12a, 683 + [ C(RESULT_MISS) ] = 0x12a, 684 + }, 685 + }, 686 + [ C(DTLB) ] = { 687 + [ C(OP_READ) ] = { 688 + [ C(RESULT_ACCESS) ] = 0x81d0, 689 + [ C(RESULT_MISS) ] = 0xe12, 690 + }, 691 + [ C(OP_WRITE) ] = { 692 + [ C(RESULT_ACCESS) ] = 0x82d0, 693 + [ C(RESULT_MISS) ] = 0xe13, 694 + }, 695 + }, 696 + [ C(ITLB) ] = { 697 + [ C(OP_READ) ] = { 698 + [ C(RESULT_ACCESS) ] = -1, 699 + [ C(RESULT_MISS) ] = 0xe11, 700 + }, 701 + [ C(OP_WRITE) ] = { 702 + [ C(RESULT_ACCESS) ] = -1, 703 + [ C(RESULT_MISS) ] = -1, 704 + }, 705 + [ C(OP_PREFETCH) ] = { 706 + [ C(RESULT_ACCESS) ] = -1, 
707 + [ C(RESULT_MISS) ] = -1, 708 + }, 709 + }, 710 + [ C(BPU ) ] = { 711 + [ C(OP_READ) ] = { 712 + [ C(RESULT_ACCESS) ] = 0x4c4, 713 + [ C(RESULT_MISS) ] = 0x4c5, 714 + }, 715 + [ C(OP_WRITE) ] = { 716 + [ C(RESULT_ACCESS) ] = -1, 717 + [ C(RESULT_MISS) ] = -1, 718 + }, 719 + [ C(OP_PREFETCH) ] = { 720 + [ C(RESULT_ACCESS) ] = -1, 721 + [ C(RESULT_MISS) ] = -1, 722 + }, 723 + }, 724 + [ C(NODE) ] = { 725 + [ C(OP_READ) ] = { 726 + [ C(RESULT_ACCESS) ] = -1, 727 + [ C(RESULT_MISS) ] = -1, 728 + }, 729 + }, 730 + }; 731 + 732 + static __initconst const u64 pnc_hw_cache_extra_regs 733 + [PERF_COUNT_HW_CACHE_MAX] 734 + [PERF_COUNT_HW_CACHE_OP_MAX] 735 + [PERF_COUNT_HW_CACHE_RESULT_MAX] = 736 + { 737 + [ C(LL ) ] = { 738 + [ C(OP_READ) ] = { 739 + [ C(RESULT_ACCESS) ] = 0x4000000000000001, 740 + [ C(RESULT_MISS) ] = 0xFFFFF000000001, 741 + }, 742 + [ C(OP_WRITE) ] = { 743 + [ C(RESULT_ACCESS) ] = 0x4000000000000002, 744 + [ C(RESULT_MISS) ] = 0xFFFFF000000002, 728 745 }, 729 746 }, 730 747 }; ··· 2342 2167 }, 2343 2168 }; 2344 2169 2170 + static __initconst const u64 arw_hw_cache_extra_regs 2171 + [PERF_COUNT_HW_CACHE_MAX] 2172 + [PERF_COUNT_HW_CACHE_OP_MAX] 2173 + [PERF_COUNT_HW_CACHE_RESULT_MAX] = { 2174 + [C(LL)] = { 2175 + [C(OP_READ)] = { 2176 + [C(RESULT_ACCESS)] = 0x4000000000000001, 2177 + [C(RESULT_MISS)] = 0xFFFFF000000001, 2178 + }, 2179 + [C(OP_WRITE)] = { 2180 + [C(RESULT_ACCESS)] = 0x4000000000000002, 2181 + [C(RESULT_MISS)] = 0xFFFFF000000002, 2182 + }, 2183 + [C(OP_PREFETCH)] = { 2184 + [C(RESULT_ACCESS)] = 0x0, 2185 + [C(RESULT_MISS)] = 0x0, 2186 + }, 2187 + }, 2188 + }; 2189 + 2345 2190 EVENT_ATTR_STR(topdown-fe-bound, td_fe_bound_tnt, "event=0x71,umask=0x0"); 2346 2191 EVENT_ATTR_STR(topdown-retiring, td_retiring_tnt, "event=0xc2,umask=0x0"); 2347 2192 EVENT_ATTR_STR(topdown-bad-spec, td_bad_spec_tnt, "event=0x73,umask=0x6"); ··· 2414 2219 /* must define OFFCORE_RSP_X first, see intel_fixup_er() */ 2415 2220 INTEL_UEVENT_EXTRA_REG(0x01b7, 
MSR_OFFCORE_RSP_0, 0x800ff3ffffffffffull, RSP_0), 2416 2221 INTEL_UEVENT_EXTRA_REG(0x02b7, MSR_OFFCORE_RSP_1, 0xff3ffffffffffull, RSP_1), 2222 + INTEL_UEVENT_PEBS_LDLAT_EXTRA_REG(0x5d0), 2223 + INTEL_UEVENT_EXTRA_REG(0x0127, MSR_SNOOP_RSP_0, 0xffffffffffffffffull, SNOOP_0), 2224 + INTEL_UEVENT_EXTRA_REG(0x0227, MSR_SNOOP_RSP_1, 0xffffffffffffffffull, SNOOP_1), 2225 + EVENT_EXTRA_END 2226 + }; 2227 + 2228 + static struct extra_reg intel_arw_extra_regs[] __read_mostly = { 2229 + /* must define OMR_X first, see intel_alt_er() */ 2230 + INTEL_UEVENT_EXTRA_REG(0x01b7, MSR_OMR_0, 0xc0ffffffffffffffull, OMR_0), 2231 + INTEL_UEVENT_EXTRA_REG(0x02b7, MSR_OMR_1, 0xc0ffffffffffffffull, OMR_1), 2232 + INTEL_UEVENT_EXTRA_REG(0x04b7, MSR_OMR_2, 0xc0ffffffffffffffull, OMR_2), 2233 + INTEL_UEVENT_EXTRA_REG(0x08b7, MSR_OMR_3, 0xc0ffffffffffffffull, OMR_3), 2234 + INTEL_UEVENT_EXTRA_REG(0x01d4, MSR_OMR_0, 0xc0ffffffffffffffull, OMR_0), 2235 + INTEL_UEVENT_EXTRA_REG(0x02d4, MSR_OMR_1, 0xc0ffffffffffffffull, OMR_1), 2236 + INTEL_UEVENT_EXTRA_REG(0x04d4, MSR_OMR_2, 0xc0ffffffffffffffull, OMR_2), 2237 + INTEL_UEVENT_EXTRA_REG(0x08d4, MSR_OMR_3, 0xc0ffffffffffffffull, OMR_3), 2417 2238 INTEL_UEVENT_PEBS_LDLAT_EXTRA_REG(0x5d0), 2418 2239 INTEL_UEVENT_EXTRA_REG(0x0127, MSR_SNOOP_RSP_0, 0xffffffffffffffffull, SNOOP_0), 2419 2240 INTEL_UEVENT_EXTRA_REG(0x0227, MSR_SNOOP_RSP_1, 0xffffffffffffffffull, SNOOP_1), ··· 3128 2917 bits |= INTEL_FIXED_0_USER; 3129 2918 if (hwc->config & ARCH_PERFMON_EVENTSEL_OS) 3130 2919 bits |= INTEL_FIXED_0_KERNEL; 2920 + if (hwc->config & ARCH_PERFMON_EVENTSEL_RDPMC_USER_DISABLE) 2921 + bits |= INTEL_FIXED_0_RDPMC_USER_DISABLE; 3131 2922 3132 2923 /* 3133 2924 * ANY bit is supported in v3 and up ··· 3265 3052 __intel_pmu_update_event_ext(hwc->idx, ext); 3266 3053 } 3267 3054 3055 + static void intel_pmu_update_rdpmc_user_disable(struct perf_event *event) 3056 + { 3057 + if (!x86_pmu_has_rdpmc_user_disable(event->pmu)) 3058 + return; 3059 + 3060 + /* 3061 + * 
Counter scope's user-space rdpmc is disabled by default 3062 + * except two cases. 3063 + * a. rdpmc = 2 (user space rdpmc enabled unconditionally) 3064 + * b. rdpmc = 1 and the event is not a system-wide event. 3065 + * The count of non-system-wide events would be cleared when 3066 + * context switches, so no count data is leaked. 3067 + */ 3068 + if (x86_pmu.attr_rdpmc == X86_USER_RDPMC_ALWAYS_ENABLE || 3069 + (x86_pmu.attr_rdpmc == X86_USER_RDPMC_CONDITIONAL_ENABLE && 3070 + event->ctx->task)) 3071 + event->hw.config &= ~ARCH_PERFMON_EVENTSEL_RDPMC_USER_DISABLE; 3072 + else 3073 + event->hw.config |= ARCH_PERFMON_EVENTSEL_RDPMC_USER_DISABLE; 3074 + } 3075 + 3268 3076 DEFINE_STATIC_CALL_NULL(intel_pmu_enable_event_ext, intel_pmu_enable_event_ext); 3269 3077 3270 3078 static void intel_pmu_enable_event(struct perf_event *event) ··· 3293 3059 u64 enable_mask = ARCH_PERFMON_EVENTSEL_ENABLE; 3294 3060 struct hw_perf_event *hwc = &event->hw; 3295 3061 int idx = hwc->idx; 3062 + 3063 + intel_pmu_update_rdpmc_user_disable(event); 3296 3064 3297 3065 if (unlikely(event->attr.precise_ip)) 3298 3066 static_call(x86_pmu_pebs_enable)(event); ··· 3768 3532 struct extra_reg *extra_regs = hybrid(cpuc->pmu, extra_regs); 3769 3533 int alt_idx = idx; 3770 3534 3771 - if (!(x86_pmu.flags & PMU_FL_HAS_RSP_1)) 3772 - return idx; 3535 + switch (idx) { 3536 + case EXTRA_REG_RSP_0 ... EXTRA_REG_RSP_1: 3537 + if (!(x86_pmu.flags & PMU_FL_HAS_RSP_1)) 3538 + return idx; 3539 + if (++alt_idx > EXTRA_REG_RSP_1) 3540 + alt_idx = EXTRA_REG_RSP_0; 3541 + if (config & ~extra_regs[alt_idx].valid_mask) 3542 + return idx; 3543 + break; 3773 3544 3774 - if (idx == EXTRA_REG_RSP_0) 3775 - alt_idx = EXTRA_REG_RSP_1; 3545 + case EXTRA_REG_OMR_0 ... 
EXTRA_REG_OMR_3: 3546 + if (!(x86_pmu.flags & PMU_FL_HAS_OMR)) 3547 + return idx; 3548 + if (++alt_idx > EXTRA_REG_OMR_3) 3549 + alt_idx = EXTRA_REG_OMR_0; 3550 + /* 3551 + * Subtracting EXTRA_REG_OMR_0 ensures to get correct 3552 + * OMR extra_reg entries which start from 0. 3553 + */ 3554 + if (config & ~extra_regs[alt_idx - EXTRA_REG_OMR_0].valid_mask) 3555 + return idx; 3556 + break; 3776 3557 3777 - if (idx == EXTRA_REG_RSP_1) 3778 - alt_idx = EXTRA_REG_RSP_0; 3779 - 3780 - if (config & ~extra_regs[alt_idx].valid_mask) 3781 - return idx; 3558 + default: 3559 + break; 3560 + } 3782 3561 3783 3562 return alt_idx; 3784 3563 } ··· 3801 3550 static void intel_fixup_er(struct perf_event *event, int idx) 3802 3551 { 3803 3552 struct extra_reg *extra_regs = hybrid(event->pmu, extra_regs); 3804 - event->hw.extra_reg.idx = idx; 3553 + int er_idx; 3805 3554 3806 - if (idx == EXTRA_REG_RSP_0) { 3555 + event->hw.extra_reg.idx = idx; 3556 + switch (idx) { 3557 + case EXTRA_REG_RSP_0 ... EXTRA_REG_RSP_1: 3558 + er_idx = idx - EXTRA_REG_RSP_0; 3807 3559 event->hw.config &= ~INTEL_ARCH_EVENT_MASK; 3808 - event->hw.config |= extra_regs[EXTRA_REG_RSP_0].event; 3809 - event->hw.extra_reg.reg = MSR_OFFCORE_RSP_0; 3810 - } else if (idx == EXTRA_REG_RSP_1) { 3811 - event->hw.config &= ~INTEL_ARCH_EVENT_MASK; 3812 - event->hw.config |= extra_regs[EXTRA_REG_RSP_1].event; 3813 - event->hw.extra_reg.reg = MSR_OFFCORE_RSP_1; 3560 + event->hw.config |= extra_regs[er_idx].event; 3561 + event->hw.extra_reg.reg = MSR_OFFCORE_RSP_0 + er_idx; 3562 + break; 3563 + 3564 + case EXTRA_REG_OMR_0 ... 
EXTRA_REG_OMR_3: 3565 + er_idx = idx - EXTRA_REG_OMR_0; 3566 + event->hw.config &= ~ARCH_PERFMON_EVENTSEL_UMASK; 3567 + event->hw.config |= 1ULL << (8 + er_idx); 3568 + event->hw.extra_reg.reg = MSR_OMR_0 + er_idx; 3569 + break; 3570 + 3571 + default: 3572 + pr_warn("The extra reg idx %d is not supported.\n", idx); 3814 3573 } 3815 3574 } 3816 3575 ··· 5894 5633 hybrid(pmu, config_mask) |= ARCH_PERFMON_EVENTSEL_UMASK2; 5895 5634 if (ebx_0.split.eq) 5896 5635 hybrid(pmu, config_mask) |= ARCH_PERFMON_EVENTSEL_EQ; 5636 + if (ebx_0.split.rdpmc_user_disable) 5637 + hybrid(pmu, config_mask) |= ARCH_PERFMON_EVENTSEL_RDPMC_USER_DISABLE; 5897 5638 5898 5639 if (eax_0.split.cntr_subleaf) { 5899 5640 cpuid_count(ARCH_PERFMON_EXT_LEAF, ARCH_PERFMON_NUM_COUNTER_LEAF, ··· 5957 5694 pmu->intel_ctrl |= GLOBAL_CTRL_EN_PERF_METRICS; 5958 5695 else 5959 5696 pmu->intel_ctrl &= ~GLOBAL_CTRL_EN_PERF_METRICS; 5697 + 5698 + pmu->pmu.capabilities |= PERF_PMU_CAP_MEDIATED_VPMU; 5960 5699 5961 5700 intel_pmu_check_event_constraints_all(&pmu->pmu); 5962 5701 ··· 7474 7209 hybrid(pmu, extra_regs) = intel_lnc_extra_regs; 7475 7210 } 7476 7211 7212 + static __always_inline void intel_pmu_init_pnc(struct pmu *pmu) 7213 + { 7214 + intel_pmu_init_glc(pmu); 7215 + x86_pmu.flags &= ~PMU_FL_HAS_RSP_1; 7216 + x86_pmu.flags |= PMU_FL_HAS_OMR; 7217 + memcpy(hybrid_var(pmu, hw_cache_event_ids), 7218 + pnc_hw_cache_event_ids, sizeof(hw_cache_event_ids)); 7219 + memcpy(hybrid_var(pmu, hw_cache_extra_regs), 7220 + pnc_hw_cache_extra_regs, sizeof(hw_cache_extra_regs)); 7221 + hybrid(pmu, event_constraints) = intel_pnc_event_constraints; 7222 + hybrid(pmu, pebs_constraints) = intel_pnc_pebs_event_constraints; 7223 + hybrid(pmu, extra_regs) = intel_pnc_extra_regs; 7224 + } 7225 + 7477 7226 static __always_inline void intel_pmu_init_skt(struct pmu *pmu) 7478 7227 { 7479 7228 intel_pmu_init_grt(pmu); 7480 7229 hybrid(pmu, event_constraints) = intel_skt_event_constraints; 7481 7230 hybrid(pmu, extra_regs) = 
intel_cmt_extra_regs; 7231 + static_call_update(intel_pmu_enable_acr_event, intel_pmu_enable_acr); 7232 + } 7233 + 7234 + static __always_inline void intel_pmu_init_arw(struct pmu *pmu) 7235 + { 7236 + intel_pmu_init_grt(pmu); 7237 + x86_pmu.flags &= ~PMU_FL_HAS_RSP_1; 7238 + x86_pmu.flags |= PMU_FL_HAS_OMR; 7239 + memcpy(hybrid_var(pmu, hw_cache_extra_regs), 7240 + arw_hw_cache_extra_regs, sizeof(hw_cache_extra_regs)); 7241 + hybrid(pmu, event_constraints) = intel_arw_event_constraints; 7242 + hybrid(pmu, pebs_constraints) = intel_arw_pebs_event_constraints; 7243 + hybrid(pmu, extra_regs) = intel_arw_extra_regs; 7482 7244 static_call_update(intel_pmu_enable_acr_event, intel_pmu_enable_acr); 7483 7245 } 7484 7246 ··· 7606 7314 pr_cont(" AnyThread deprecated, "); 7607 7315 } 7608 7316 7317 + /* The perf side of core PMU is ready to support the mediated vPMU. */ 7318 + x86_get_pmu(smp_processor_id())->capabilities |= PERF_PMU_CAP_MEDIATED_VPMU; 7319 + 7609 7320 /* 7610 7321 * Many features on and after V6 require dynamic constraint, 7611 7322 * e.g., Arch PEBS, ACR. 
··· 7700 7405 case INTEL_ATOM_SILVERMONT_D: 7701 7406 case INTEL_ATOM_SILVERMONT_MID: 7702 7407 case INTEL_ATOM_AIRMONT: 7408 + case INTEL_ATOM_AIRMONT_NP: 7703 7409 case INTEL_ATOM_SILVERMONT_MID2: 7704 7410 memcpy(hw_cache_event_ids, slm_hw_cache_event_ids, 7705 7411 sizeof(hw_cache_event_ids)); ··· 8162 7866 x86_pmu.extra_regs = intel_rwc_extra_regs; 8163 7867 pr_cont("Granite Rapids events, "); 8164 7868 name = "granite_rapids"; 7869 + goto glc_common; 7870 + 7871 + case INTEL_DIAMONDRAPIDS_X: 7872 + intel_pmu_init_pnc(NULL); 7873 + x86_pmu.pebs_latency_data = pnc_latency_data; 7874 + 7875 + pr_cont("Panthercove events, "); 7876 + name = "panthercove"; 7877 + goto glc_base; 8165 7878 8166 7879 glc_common: 8167 7880 intel_pmu_init_glc(NULL); 7881 + intel_pmu_pebs_data_source_skl(true); 7882 + 7883 + glc_base: 8168 7884 x86_pmu.pebs_ept = 1; 8169 7885 x86_pmu.hw_config = hsw_hw_config; 8170 7886 x86_pmu.get_event_constraints = glc_get_event_constraints; ··· 8186 7878 mem_attr = glc_events_attrs; 8187 7879 td_attr = glc_td_events_attrs; 8188 7880 tsx_attr = glc_tsx_events_attrs; 8189 - intel_pmu_pebs_data_source_skl(true); 8190 7881 break; 8191 7882 8192 7883 case INTEL_ALDERLAKE: ··· 8347 8040 intel_pmu_pebs_data_source_arl_h(); 8348 8041 pr_cont("ArrowLake-H Hybrid events, "); 8349 8042 name = "arrowlake_h_hybrid"; 8043 + break; 8044 + 8045 + case INTEL_NOVALAKE: 8046 + case INTEL_NOVALAKE_L: 8047 + pr_cont("Novalake Hybrid events, "); 8048 + name = "novalake_hybrid"; 8049 + intel_pmu_init_hybrid(hybrid_big_small); 8050 + 8051 + x86_pmu.pebs_latency_data = nvl_latency_data; 8052 + x86_pmu.get_event_constraints = mtl_get_event_constraints; 8053 + x86_pmu.hw_config = adl_hw_config; 8054 + 8055 + td_attr = lnl_hybrid_events_attrs; 8056 + mem_attr = mtl_hybrid_mem_attrs; 8057 + tsx_attr = adl_hybrid_tsx_attrs; 8058 + extra_attr = boot_cpu_has(X86_FEATURE_RTM) ? 
8059 + mtl_hybrid_extra_attr_rtm : mtl_hybrid_extra_attr; 8060 + 8061 + /* Initialize big core specific PerfMon capabilities.*/ 8062 + pmu = &x86_pmu.hybrid_pmu[X86_HYBRID_PMU_CORE_IDX]; 8063 + intel_pmu_init_pnc(&pmu->pmu); 8064 + 8065 + /* Initialize Atom core specific PerfMon capabilities.*/ 8066 + pmu = &x86_pmu.hybrid_pmu[X86_HYBRID_PMU_ATOM_IDX]; 8067 + intel_pmu_init_arw(&pmu->pmu); 8068 + 8069 + intel_pmu_pebs_data_source_lnl(); 8350 8070 break; 8351 8071 8352 8072 default:
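The `glc_common` / `glc_base` split in the hunk above gives the switch two entry points into one shared tail: the Granite Rapids case jumps to `glc_common`, while the Diamond Rapids case jumps to `glc_base` and therefore skips `intel_pmu_init_glc()` and `intel_pmu_pebs_data_source_skl()`. A minimal user-space sketch of that control flow follows. The `enum model`, `struct init_trace`, and `init_model()` names are hypothetical stand-ins and not kernel code.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-ins for the real CPU model IDs. */
enum model { MODEL_GLC_LIKE, MODEL_PNC_LIKE };

/* Hypothetical record of which init steps ran. */
struct init_trace {
	bool glc_init;        /* stands in for intel_pmu_init_glc() */
	bool skl_data_source; /* stands in for intel_pmu_pebs_data_source_skl() */
	bool pnc_init;        /* stands in for intel_pmu_init_pnc() */
	bool shared_tail;     /* settings that both paths need */
	const char *name;
};

static void init_model(enum model m, struct init_trace *t)
{
	memset(t, 0, sizeof(*t));

	switch (m) {
	case MODEL_GLC_LIKE:
		t->name = "glc_like";
		goto glc_common;

	case MODEL_PNC_LIKE:
		t->pnc_init = true;
		t->name = "pnc_like";
		goto glc_base;	/* skip the GLC-only steps */

	glc_common:
		t->glc_init = true;
		t->skl_data_source = true;
		/* execution continues into the shared tail */

	glc_base:
		t->shared_tail = true;
		break;
	}
}
```

Moving a step from the shared tail up under the first label, as the diff does with the SKL data source call, is what makes it specific to the models that enter through `glc_common`.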
+26 -7
arch/x86/events/intel/cstate.c
···
  *	MSR_CORE_C1_RES: CORE C1 Residency Counter
  *			 perf code: 0x00
  *			 Available model: SLM,AMT,GLM,CNL,ICX,TNT,ADL,RPL
- *					  MTL,SRF,GRR,ARL,LNL,PTL
+ *					  MTL,SRF,GRR,ARL,LNL,PTL,WCL,NVL
  *			 Scope: Core (each processor core has a MSR)
  *	MSR_CORE_C3_RESIDENCY: CORE C3 Residency Counter
  *			       perf code: 0x01
···
  *			       Available model: SLM,AMT,NHM,WSM,SNB,IVB,HSW,BDW,
  *						SKL,KNL,GLM,CNL,KBL,CML,ICL,ICX,
  *						TGL,TNT,RKL,ADL,RPL,SPR,MTL,SRF,
- *						GRR,ARL,LNL,PTL
+ *						GRR,ARL,LNL,PTL,WCL,NVL
  *			       Scope: Core
  *	MSR_CORE_C7_RESIDENCY: CORE C7 Residency Counter
  *			       perf code: 0x03
  *			       Available model: SNB,IVB,HSW,BDW,SKL,CNL,KBL,CML,
  *						ICL,TGL,RKL,ADL,RPL,MTL,ARL,LNL,
- *						PTL
+ *						PTL,WCL,NVL
  *			       Scope: Core
  *	MSR_PKG_C2_RESIDENCY:  Package C2 Residency Counter.
  *			       perf code: 0x00
  *			       Available model: SNB,IVB,HSW,BDW,SKL,KNL,GLM,CNL,
  *						KBL,CML,ICL,ICX,TGL,TNT,RKL,ADL,
- *						RPL,SPR,MTL,ARL,LNL,SRF,PTL
+ *						RPL,SPR,MTL,ARL,LNL,SRF,PTL,WCL,
+ *						NVL
  *			       Scope: Package (physical package)
  *	MSR_PKG_C3_RESIDENCY:  Package C3 Residency Counter.
  *			       perf code: 0x01
···
  *			       Available model: SLM,AMT,NHM,WSM,SNB,IVB,HSW,BDW,
  *						SKL,KNL,GLM,CNL,KBL,CML,ICL,ICX,
  *						TGL,TNT,RKL,ADL,RPL,SPR,MTL,SRF,
- *						ARL,LNL,PTL
+ *						ARL,LNL,PTL,WCL,NVL
  *			       Scope: Package (physical package)
  *	MSR_PKG_C7_RESIDENCY:  Package C7 Residency Counter.
  *			       perf code: 0x03
···
  *	MSR_PKG_C10_RESIDENCY: Package C10 Residency Counter.
  *			       perf code: 0x06
  *			       Available model: HSW ULT,KBL,GLM,CNL,CML,ICL,TGL,
- *						TNT,RKL,ADL,RPL,MTL,ARL,LNL,PTL
+ *						TNT,RKL,ADL,RPL,MTL,ARL,LNL,PTL,
+ *						WCL,NVL
  *			       Scope: Package (physical package)
  *	MSR_MODULE_C6_RES_MS:  Module C6 Residency Counter.
  *			       perf code: 0x00
- *			       Available model: SRF,GRR
+ *			       Available model: SRF,GRR,NVL
  *			       Scope: A cluster of cores shared L2 cache
  *
  */
···
 				  BIT(PERF_CSTATE_PKG_C10_RES),
 };

+static const struct cstate_model nvl_cstates __initconst = {
+	.core_events		= BIT(PERF_CSTATE_CORE_C1_RES) |
+				  BIT(PERF_CSTATE_CORE_C6_RES) |
+				  BIT(PERF_CSTATE_CORE_C7_RES),
+
+	.module_events		= BIT(PERF_CSTATE_MODULE_C6_RES),
+
+	.pkg_events		= BIT(PERF_CSTATE_PKG_C2_RES) |
+				  BIT(PERF_CSTATE_PKG_C6_RES) |
+				  BIT(PERF_CSTATE_PKG_C10_RES),
+};
+
 static const struct cstate_model slm_cstates __initconst = {
 	.core_events		= BIT(PERF_CSTATE_CORE_C1_RES) |
 				  BIT(PERF_CSTATE_CORE_C6_RES),
···
 	X86_MATCH_VFM(INTEL_ATOM_SILVERMONT,	&slm_cstates),
 	X86_MATCH_VFM(INTEL_ATOM_SILVERMONT_D,	&slm_cstates),
 	X86_MATCH_VFM(INTEL_ATOM_AIRMONT,	&slm_cstates),
+	X86_MATCH_VFM(INTEL_ATOM_AIRMONT_NP,	&slm_cstates),

 	X86_MATCH_VFM(INTEL_BROADWELL,		&snb_cstates),
 	X86_MATCH_VFM(INTEL_BROADWELL_D,	&snb_cstates),
···
 	X86_MATCH_VFM(INTEL_EMERALDRAPIDS_X,	&icx_cstates),
 	X86_MATCH_VFM(INTEL_GRANITERAPIDS_X,	&icx_cstates),
 	X86_MATCH_VFM(INTEL_GRANITERAPIDS_D,	&icx_cstates),
+	X86_MATCH_VFM(INTEL_DIAMONDRAPIDS_X,	&srf_cstates),

 	X86_MATCH_VFM(INTEL_TIGERLAKE_L,	&icl_cstates),
 	X86_MATCH_VFM(INTEL_TIGERLAKE,		&icl_cstates),
···
 	X86_MATCH_VFM(INTEL_ARROWLAKE_U,	&adl_cstates),
 	X86_MATCH_VFM(INTEL_LUNARLAKE_M,	&lnl_cstates),
 	X86_MATCH_VFM(INTEL_PANTHERLAKE_L,	&lnl_cstates),
+	X86_MATCH_VFM(INTEL_WILDCATLAKE_L,	&lnl_cstates),
+	X86_MATCH_VFM(INTEL_NOVALAKE,		&nvl_cstates),
+	X86_MATCH_VFM(INTEL_NOVALAKE_L,	&nvl_cstates),
 	{ },
 };
 MODULE_DEVICE_TABLE(x86cpu, intel_cstates_match);
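The new `nvl_cstates` entry describes which residency counters a model exposes as a bitmask built with `BIT()`. The sketch below shows, in plain user-space C, how such a mask can be tested for a single event. The enum values, `struct cstate_model_sketch`, and `model_has_core_event()` are hypothetical; the kernel's real index values are not shown in this diff.

```c
#include <stdbool.h>
#include <stdint.h>

#define BIT(n) (1ULL << (n))

/* Hypothetical event indices; the kernel's enum values are not shown in the diff. */
enum { CORE_C1_RES, CORE_C3_RES, CORE_C6_RES, CORE_C7_RES };

/* Hypothetical, reduced version of a per-model description. */
struct cstate_model_sketch {
	uint64_t core_events;
};

/* Same core-event selection as nvl_cstates: C1, C6 and C7, but no C3. */
static const struct cstate_model_sketch nvl_like = {
	.core_events = BIT(CORE_C1_RES) | BIT(CORE_C6_RES) | BIT(CORE_C7_RES),
};

/* True when the model advertises the core event with index idx. */
static bool model_has_core_event(const struct cstate_model_sketch *m, int idx)
{
	return (m->core_events & BIT(idx)) != 0;
}
```

With this layout, adding a model is a matter of declaring one more constant mask and pointing a match-table entry at it, which is the shape of the Nova Lake change.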
+261
arch/x86/events/intel/ds.c
··· 34 34 35 35 */ 36 36 37 + union omr_encoding { 38 + struct { 39 + u8 omr_source : 4; 40 + u8 omr_remote : 1; 41 + u8 omr_hitm : 1; 42 + u8 omr_snoop : 1; 43 + u8 omr_promoted : 1; 44 + }; 45 + u8 omr_full; 46 + }; 47 + 37 48 union intel_x86_pebs_dse { 38 49 u64 val; 39 50 struct { ··· 83 72 unsigned int lnc_data_blk:1; 84 73 unsigned int lnc_addr_blk:1; 85 74 unsigned int ld_reserved6:18; 75 + }; 76 + struct { 77 + unsigned int pnc_dse: 8; 78 + unsigned int pnc_l2_miss:1; 79 + unsigned int pnc_stlb_clean_hit:1; 80 + unsigned int pnc_stlb_any_hit:1; 81 + unsigned int pnc_stlb_miss:1; 82 + unsigned int pnc_locked:1; 83 + unsigned int pnc_data_blk:1; 84 + unsigned int pnc_addr_blk:1; 85 + unsigned int pnc_fb_full:1; 86 + unsigned int ld_reserved8:16; 87 + }; 88 + struct { 89 + unsigned int arw_dse:8; 90 + unsigned int arw_l2_miss:1; 91 + unsigned int arw_xq_promotion:1; 92 + unsigned int arw_reissue:1; 93 + unsigned int arw_stlb_miss:1; 94 + unsigned int arw_locked:1; 95 + unsigned int arw_data_blk:1; 96 + unsigned int arw_addr_blk:1; 97 + unsigned int arw_fb_full:1; 98 + unsigned int ld_reserved9:16; 86 99 }; 87 100 }; 88 101 ··· 263 228 __intel_pmu_pebs_data_source_cmt(data_source); 264 229 } 265 230 231 + /* Version for Panthercove and later */ 232 + 233 + /* L2 hit */ 234 + #define PNC_PEBS_DATA_SOURCE_MAX 16 235 + static u64 pnc_pebs_l2_hit_data_source[PNC_PEBS_DATA_SOURCE_MAX] = { 236 + P(OP, LOAD) | P(LVL, NA) | LEVEL(NA) | P(SNOOP, NA), /* 0x00: non-cache access */ 237 + OP_LH | LEVEL(L0) | P(SNOOP, NONE), /* 0x01: L0 hit */ 238 + OP_LH | P(LVL, L1) | LEVEL(L1) | P(SNOOP, NONE), /* 0x02: L1 hit */ 239 + OP_LH | P(LVL, LFB) | LEVEL(LFB) | P(SNOOP, NONE), /* 0x03: L1 Miss Handling Buffer hit */ 240 + OP_LH | P(LVL, L2) | LEVEL(L2) | P(SNOOP, NONE), /* 0x04: L2 Hit Clean */ 241 + 0, /* 0x05: Reserved */ 242 + 0, /* 0x06: Reserved */ 243 + OP_LH | P(LVL, L2) | LEVEL(L2) | P(SNOOP, HIT), /* 0x07: L2 Hit Snoop HIT */ 244 + OP_LH | P(LVL, L2) | LEVEL(L2) | 
P(SNOOP, HITM), /* 0x08: L2 Hit Snoop Hit Modified */ 245 + OP_LH | P(LVL, L2) | LEVEL(L2) | P(SNOOP, MISS), /* 0x09: Prefetch Promotion */ 246 + OP_LH | P(LVL, L2) | LEVEL(L2) | P(SNOOP, MISS), /* 0x0a: Cross Core Prefetch Promotion */ 247 + 0, /* 0x0b: Reserved */ 248 + 0, /* 0x0c: Reserved */ 249 + 0, /* 0x0d: Reserved */ 250 + 0, /* 0x0e: Reserved */ 251 + OP_LH | P(LVL, UNC) | LEVEL(NA) | P(SNOOP, NONE), /* 0x0f: uncached */ 252 + }; 253 + 254 + /* Version for Arctic Wolf and later */ 255 + 256 + /* L2 hit */ 257 + #define ARW_PEBS_DATA_SOURCE_MAX 16 258 + static u64 arw_pebs_l2_hit_data_source[ARW_PEBS_DATA_SOURCE_MAX] = { 259 + P(OP, LOAD) | P(LVL, NA) | LEVEL(NA) | P(SNOOP, NA), /* 0x00: non-cache access */ 260 + OP_LH | P(LVL, L1) | LEVEL(L1) | P(SNOOP, NONE), /* 0x01: L1 hit */ 261 + OP_LH | P(LVL, LFB) | LEVEL(LFB) | P(SNOOP, NONE), /* 0x02: WCB Hit */ 262 + OP_LH | P(LVL, L2) | LEVEL(L2) | P(SNOOP, NONE), /* 0x03: L2 Hit Clean */ 263 + OP_LH | P(LVL, L2) | LEVEL(L2) | P(SNOOP, HIT), /* 0x04: L2 Hit Snoop HIT */ 264 + OP_LH | P(LVL, L2) | LEVEL(L2) | P(SNOOP, HITM), /* 0x05: L2 Hit Snoop Hit Modified */ 265 + OP_LH | P(LVL, UNC) | LEVEL(NA) | P(SNOOP, NONE), /* 0x06: uncached */ 266 + 0, /* 0x07: Reserved */ 267 + 0, /* 0x08: Reserved */ 268 + 0, /* 0x09: Reserved */ 269 + 0, /* 0x0a: Reserved */ 270 + 0, /* 0x0b: Reserved */ 271 + 0, /* 0x0c: Reserved */ 272 + 0, /* 0x0d: Reserved */ 273 + 0, /* 0x0e: Reserved */ 274 + 0, /* 0x0f: Reserved */ 275 + }; 276 + 277 + /* L2 miss */ 278 + #define OMR_DATA_SOURCE_MAX 16 279 + static u64 omr_data_source[OMR_DATA_SOURCE_MAX] = { 280 + P(OP, LOAD) | P(LVL, NA) | LEVEL(NA) | P(SNOOP, NA), /* 0x00: invalid */ 281 + 0, /* 0x01: Reserved */ 282 + OP_LH | P(LVL, L3) | LEVEL(L3) | P(REGION, L_SHARE), /* 0x02: local CA shared cache */ 283 + OP_LH | P(LVL, L3) | LEVEL(L3) | P(REGION, L_NON_SHARE),/* 0x03: local CA non-shared cache */ 284 + OP_LH | P(LVL, L3) | LEVEL(L3) | P(REGION, O_IO), /* 0x04: other CA IO agent */ 
285 + OP_LH | P(LVL, L3) | LEVEL(L3) | P(REGION, O_SHARE), /* 0x05: other CA shared cache */ 286 + OP_LH | P(LVL, L3) | LEVEL(L3) | P(REGION, O_NON_SHARE),/* 0x06: other CA non-shared cache */ 287 + OP_LH | LEVEL(RAM) | P(REGION, MMIO), /* 0x07: MMIO */ 288 + OP_LH | LEVEL(RAM) | P(REGION, MEM0), /* 0x08: Memory region 0 */ 289 + OP_LH | LEVEL(RAM) | P(REGION, MEM1), /* 0x09: Memory region 1 */ 290 + OP_LH | LEVEL(RAM) | P(REGION, MEM2), /* 0x0a: Memory region 2 */ 291 + OP_LH | LEVEL(RAM) | P(REGION, MEM3), /* 0x0b: Memory region 3 */ 292 + OP_LH | LEVEL(RAM) | P(REGION, MEM4), /* 0x0c: Memory region 4 */ 293 + OP_LH | LEVEL(RAM) | P(REGION, MEM5), /* 0x0d: Memory region 5 */ 294 + OP_LH | LEVEL(RAM) | P(REGION, MEM6), /* 0x0e: Memory region 6 */ 295 + OP_LH | LEVEL(RAM) | P(REGION, MEM7), /* 0x0f: Memory region 7 */ 296 + }; 297 + 298 + static u64 parse_omr_data_source(u8 dse) 299 + { 300 + union omr_encoding omr; 301 + u64 val = 0; 302 + 303 + omr.omr_full = dse; 304 + val = omr_data_source[omr.omr_source]; 305 + if (omr.omr_source > 0x1 && omr.omr_source < 0x7) 306 + val |= omr.omr_remote ? P(LVL, REM_CCE1) : 0; 307 + else if (omr.omr_source > 0x7) 308 + val |= omr.omr_remote ? P(LVL, REM_RAM1) : P(LVL, LOC_RAM); 309 + 310 + if (omr.omr_remote) 311 + val |= REM; 312 + 313 + val |= omr.omr_hitm ? P(SNOOP, HITM) : P(SNOOP, HIT); 314 + 315 + if (omr.omr_source == 0x2) { 316 + u8 snoop = omr.omr_snoop | omr.omr_promoted; 317 + 318 + if (snoop == 0x0) 319 + val |= P(SNOOP, NA); 320 + else if (snoop == 0x1) 321 + val |= P(SNOOP, MISS); 322 + else if (snoop == 0x2) 323 + val |= P(SNOOP, HIT); 324 + else if (snoop == 0x3) 325 + val |= P(SNOOP, NONE); 326 + } else if (omr.omr_source > 0x2 && omr.omr_source < 0x7) { 327 + val |= omr.omr_snoop ? 
P(SNOOPX, FWD) : 0; 328 + } 329 + 330 + return val; 331 + } 332 + 266 333 static u64 precise_store_data(u64 status) 267 334 { 268 335 union intel_x86_pebs_dse dse; ··· 493 356 dse.mtl_fwd_blk); 494 357 } 495 358 359 + static u64 arw_latency_data(struct perf_event *event, u64 status) 360 + { 361 + union intel_x86_pebs_dse dse; 362 + union perf_mem_data_src src; 363 + u64 val; 364 + 365 + dse.val = status; 366 + 367 + if (!dse.arw_l2_miss) 368 + val = arw_pebs_l2_hit_data_source[dse.arw_dse & 0xf]; 369 + else 370 + val = parse_omr_data_source(dse.arw_dse); 371 + 372 + if (!val) 373 + val = P(OP, LOAD) | LEVEL(NA) | P(SNOOP, NA); 374 + 375 + if (dse.arw_stlb_miss) 376 + val |= P(TLB, MISS) | P(TLB, L2); 377 + else 378 + val |= P(TLB, HIT) | P(TLB, L1) | P(TLB, L2); 379 + 380 + if (dse.arw_locked) 381 + val |= P(LOCK, LOCKED); 382 + 383 + if (dse.arw_data_blk) 384 + val |= P(BLK, DATA); 385 + if (dse.arw_addr_blk) 386 + val |= P(BLK, ADDR); 387 + if (!dse.arw_data_blk && !dse.arw_addr_blk) 388 + val |= P(BLK, NA); 389 + 390 + src.val = val; 391 + if (event->hw.flags & PERF_X86_EVENT_PEBS_ST_HSW) 392 + src.mem_op = P(OP, STORE); 393 + 394 + return src.val; 395 + } 396 + 496 397 static u64 lnc_latency_data(struct perf_event *event, u64 status) 497 398 { 498 399 union intel_x86_pebs_dse dse; ··· 584 409 return cmt_latency_data(event, status); 585 410 586 411 return lnl_latency_data(event, status); 412 + } 413 + 414 + u64 pnc_latency_data(struct perf_event *event, u64 status) 415 + { 416 + union intel_x86_pebs_dse dse; 417 + union perf_mem_data_src src; 418 + u64 val; 419 + 420 + dse.val = status; 421 + 422 + if (!dse.pnc_l2_miss) 423 + val = pnc_pebs_l2_hit_data_source[dse.pnc_dse & 0xf]; 424 + else 425 + val = parse_omr_data_source(dse.pnc_dse); 426 + 427 + if (!val) 428 + val = P(OP, LOAD) | LEVEL(NA) | P(SNOOP, NA); 429 + 430 + if (dse.pnc_stlb_miss) 431 + val |= P(TLB, MISS) | P(TLB, L2); 432 + else 433 + val |= P(TLB, HIT) | P(TLB, L1) | P(TLB, L2); 434 + 435 + if 
(dse.pnc_locked) 436 + val |= P(LOCK, LOCKED); 437 + 438 + if (dse.pnc_data_blk) 439 + val |= P(BLK, DATA); 440 + if (dse.pnc_addr_blk) 441 + val |= P(BLK, ADDR); 442 + if (!dse.pnc_data_blk && !dse.pnc_addr_blk) 443 + val |= P(BLK, NA); 444 + 445 + src.val = val; 446 + if (event->hw.flags & PERF_X86_EVENT_PEBS_ST_HSW) 447 + src.mem_op = P(OP, STORE); 448 + 449 + return src.val; 450 + } 451 + 452 + u64 nvl_latency_data(struct perf_event *event, u64 status) 453 + { 454 + struct x86_hybrid_pmu *pmu = hybrid_pmu(event->pmu); 455 + 456 + if (pmu->pmu_type == hybrid_small) 457 + return arw_latency_data(event, status); 458 + 459 + return pnc_latency_data(event, status); 587 460 } 588 461 589 462 static u64 load_latency_data(struct perf_event *event, u64 status) ··· 1293 1070 EVENT_CONSTRAINT_END 1294 1071 }; 1295 1072 1073 + struct event_constraint intel_arw_pebs_event_constraints[] = { 1074 + /* Allow all events as PEBS with no flags */ 1075 + INTEL_HYBRID_LAT_CONSTRAINT(0x5d0, 0xff), 1076 + INTEL_HYBRID_LAT_CONSTRAINT(0x6d0, 0xff), 1077 + INTEL_FLAGS_UEVENT_CONSTRAINT(0x01d4, 0x1), 1078 + INTEL_FLAGS_UEVENT_CONSTRAINT(0x02d4, 0x2), 1079 + INTEL_FLAGS_UEVENT_CONSTRAINT(0x04d4, 0x4), 1080 + INTEL_FLAGS_UEVENT_CONSTRAINT(0x08d4, 0x8), 1081 + EVENT_CONSTRAINT_END 1082 + }; 1083 + 1296 1084 struct event_constraint intel_nehalem_pebs_event_constraints[] = { 1297 1085 INTEL_PLD_CONSTRAINT(0x100b, 0xf), /* MEM_INST_RETIRED.* */ 1298 1086 INTEL_FLAGS_EVENT_CONSTRAINT(0x0f, 0xf), /* MEM_UNCORE_RETIRED.* */ ··· 1510 1276 INTEL_FLAGS_EVENT_CONSTRAINT_DATALA_LD_RANGE(0xd1, 0xd4, 0xf), 1511 1277 1512 1278 INTEL_FLAGS_EVENT_CONSTRAINT(0xd0, 0xf), 1279 + 1280 + /* 1281 + * Everything else is handled by PMU_FL_PEBS_ALL, because we 1282 + * need the full constraints from the main table. 
1283 + */ 1284 + 1285 + EVENT_CONSTRAINT_END 1286 + }; 1287 + 1288 + struct event_constraint intel_pnc_pebs_event_constraints[] = { 1289 + INTEL_FLAGS_UEVENT_CONSTRAINT(0x100, 0x100000000ULL), /* INST_RETIRED.PREC_DIST */ 1290 + INTEL_FLAGS_UEVENT_CONSTRAINT(0x0400, 0x800000000ULL), 1291 + 1292 + INTEL_HYBRID_LDLAT_CONSTRAINT(0x1cd, 0xfc), 1293 + INTEL_HYBRID_STLAT_CONSTRAINT(0x2cd, 0x3), 1294 + INTEL_FLAGS_UEVENT_CONSTRAINT_DATALA_LD(0x11d0, 0xf), /* MEM_INST_RETIRED.STLB_MISS_LOADS */ 1295 + INTEL_FLAGS_UEVENT_CONSTRAINT_DATALA_ST(0x12d0, 0xf), /* MEM_INST_RETIRED.STLB_MISS_STORES */ 1296 + INTEL_FLAGS_UEVENT_CONSTRAINT_DATALA_LD(0x21d0, 0xf), /* MEM_INST_RETIRED.LOCK_LOADS */ 1297 + INTEL_FLAGS_UEVENT_CONSTRAINT_DATALA_LD(0x41d0, 0xf), /* MEM_INST_RETIRED.SPLIT_LOADS */ 1298 + INTEL_FLAGS_UEVENT_CONSTRAINT_DATALA_ST(0x42d0, 0xf), /* MEM_INST_RETIRED.SPLIT_STORES */ 1299 + INTEL_FLAGS_UEVENT_CONSTRAINT_DATALA_LD(0x81d0, 0xf), /* MEM_INST_RETIRED.ALL_LOADS */ 1300 + INTEL_FLAGS_UEVENT_CONSTRAINT_DATALA_ST(0x82d0, 0xf), /* MEM_INST_RETIRED.ALL_STORES */ 1301 + 1302 + INTEL_FLAGS_EVENT_CONSTRAINT_DATALA_LD_RANGE(0xd1, 0xd4, 0xf), 1303 + 1304 + INTEL_FLAGS_EVENT_CONSTRAINT(0xd0, 0xf), 1305 + INTEL_FLAGS_EVENT_CONSTRAINT(0xd6, 0xf), 1513 1306 1514 1307 /* 1515 1308 * Everything else is handled by PMU_FL_PEBS_ALL, because we
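`union omr_encoding` in the ds.c hunks above packs five fields into one byte. Bit-field placement is implementation-defined in C, so the sketch below decodes the same byte with explicit shifts and masks, assuming the first declared field occupies the lowest bits (the usual layout for GCC on x86). `struct omr_fields`, `omr_decode()`, and `omr_snoop_code()` are hypothetical names used only for illustration.

```c
#include <stdint.h>

/* Hypothetical unpacked view of the OMR byte. */
struct omr_fields {
	uint8_t source;   /* bits 3:0 */
	uint8_t remote;   /* bit 4 */
	uint8_t hitm;     /* bit 5 */
	uint8_t snoop;    /* bit 6 */
	uint8_t promoted; /* bit 7 */
};

/* Split the byte into its fields, lowest bits first (assumed layout). */
static struct omr_fields omr_decode(uint8_t dse)
{
	struct omr_fields f = {
		.source   = dse & 0xf,
		.remote   = (dse >> 4) & 1,
		.hitm     = (dse >> 5) & 1,
		.snoop    = (dse >> 6) & 1,
		.promoted = (dse >> 7) & 1,
	};
	return f;
}

/*
 * Build a two-bit code from two one-bit flags.
 * Treating snoop as the high bit is an assumption of this sketch.
 */
static uint8_t omr_snoop_code(const struct omr_fields *f)
{
	return (uint8_t)((f->snoop << 1) | f->promoted);
}
```

A plain bitwise OR of two one-bit flags can only yield 0 or 1, so the sketch shifts one flag to make all four values 0x0 to 0x3 reachable. For example, `omr_decode(0xD2)` gives source 2 with the remote, snoop, and promoted flags set.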
+1 -1
arch/x86/events/intel/p6.c
···
 		 */
 		pr_warn("Userspace RDPMC support disabled due to a CPU erratum\n");
 		x86_pmu.attr_rdpmc_broken = 1;
-		x86_pmu.attr_rdpmc = 0;
+		x86_pmu.attr_rdpmc = X86_USER_RDPMC_NEVER_ENABLE;
 	}
 }
+90 -49
arch/x86/events/intel/uncore.c
··· 436 436 437 437 if (type->constraints) { 438 438 for_each_event_constraint(c, type->constraints) { 439 - if ((event->hw.config & c->cmask) == c->code) 439 + if (constraint_match(c, event->hw.config)) 440 440 return c; 441 441 } 442 442 } ··· 1697 1697 return ret; 1698 1698 } 1699 1699 1700 - struct intel_uncore_init_fun { 1701 - void (*cpu_init)(void); 1702 - int (*pci_init)(void); 1703 - void (*mmio_init)(void); 1704 - /* Discovery table is required */ 1705 - bool use_discovery; 1706 - /* The units in the discovery table should be ignored. */ 1707 - int *uncore_units_ignore; 1708 - }; 1700 + static int uncore_mmio_global_init(u64 ctl) 1701 + { 1702 + void __iomem *io_addr; 1709 1703 1710 - static const struct intel_uncore_init_fun nhm_uncore_init __initconst = { 1704 + io_addr = ioremap(ctl, sizeof(ctl)); 1705 + if (!io_addr) 1706 + return -ENOMEM; 1707 + 1708 + /* Clear freeze bit (0) to enable all counters. */ 1709 + writel(0, io_addr); 1710 + 1711 + iounmap(io_addr); 1712 + return 0; 1713 + } 1714 + 1715 + static const struct uncore_plat_init nhm_uncore_init __initconst = { 1711 1716 .cpu_init = nhm_uncore_cpu_init, 1712 1717 }; 1713 1718 1714 - static const struct intel_uncore_init_fun snb_uncore_init __initconst = { 1719 + static const struct uncore_plat_init snb_uncore_init __initconst = { 1715 1720 .cpu_init = snb_uncore_cpu_init, 1716 1721 .pci_init = snb_uncore_pci_init, 1717 1722 }; 1718 1723 1719 - static const struct intel_uncore_init_fun ivb_uncore_init __initconst = { 1724 + static const struct uncore_plat_init ivb_uncore_init __initconst = { 1720 1725 .cpu_init = snb_uncore_cpu_init, 1721 1726 .pci_init = ivb_uncore_pci_init, 1722 1727 }; 1723 1728 1724 - static const struct intel_uncore_init_fun hsw_uncore_init __initconst = { 1729 + static const struct uncore_plat_init hsw_uncore_init __initconst = { 1725 1730 .cpu_init = snb_uncore_cpu_init, 1726 1731 .pci_init = hsw_uncore_pci_init, 1727 1732 }; 1728 1733 1729 - static const struct 
intel_uncore_init_fun bdw_uncore_init __initconst = { 1734 + static const struct uncore_plat_init bdw_uncore_init __initconst = { 1730 1735 .cpu_init = snb_uncore_cpu_init, 1731 1736 .pci_init = bdw_uncore_pci_init, 1732 1737 }; 1733 1738 1734 - static const struct intel_uncore_init_fun snbep_uncore_init __initconst = { 1739 + static const struct uncore_plat_init snbep_uncore_init __initconst = { 1735 1740 .cpu_init = snbep_uncore_cpu_init, 1736 1741 .pci_init = snbep_uncore_pci_init, 1737 1742 }; 1738 1743 1739 - static const struct intel_uncore_init_fun nhmex_uncore_init __initconst = { 1744 + static const struct uncore_plat_init nhmex_uncore_init __initconst = { 1740 1745 .cpu_init = nhmex_uncore_cpu_init, 1741 1746 }; 1742 1747 1743 - static const struct intel_uncore_init_fun ivbep_uncore_init __initconst = { 1748 + static const struct uncore_plat_init ivbep_uncore_init __initconst = { 1744 1749 .cpu_init = ivbep_uncore_cpu_init, 1745 1750 .pci_init = ivbep_uncore_pci_init, 1746 1751 }; 1747 1752 1748 - static const struct intel_uncore_init_fun hswep_uncore_init __initconst = { 1753 + static const struct uncore_plat_init hswep_uncore_init __initconst = { 1749 1754 .cpu_init = hswep_uncore_cpu_init, 1750 1755 .pci_init = hswep_uncore_pci_init, 1751 1756 }; 1752 1757 1753 - static const struct intel_uncore_init_fun bdx_uncore_init __initconst = { 1758 + static const struct uncore_plat_init bdx_uncore_init __initconst = { 1754 1759 .cpu_init = bdx_uncore_cpu_init, 1755 1760 .pci_init = bdx_uncore_pci_init, 1756 1761 }; 1757 1762 1758 - static const struct intel_uncore_init_fun knl_uncore_init __initconst = { 1763 + static const struct uncore_plat_init knl_uncore_init __initconst = { 1759 1764 .cpu_init = knl_uncore_cpu_init, 1760 1765 .pci_init = knl_uncore_pci_init, 1761 1766 }; 1762 1767 1763 - static const struct intel_uncore_init_fun skl_uncore_init __initconst = { 1768 + static const struct uncore_plat_init skl_uncore_init __initconst = { 1764 1769 .cpu_init 
= skl_uncore_cpu_init, 1765 1770 .pci_init = skl_uncore_pci_init, 1766 1771 }; 1767 1772 1768 - static const struct intel_uncore_init_fun skx_uncore_init __initconst = { 1773 + static const struct uncore_plat_init skx_uncore_init __initconst = { 1769 1774 .cpu_init = skx_uncore_cpu_init, 1770 1775 .pci_init = skx_uncore_pci_init, 1771 1776 }; 1772 1777 1773 - static const struct intel_uncore_init_fun icl_uncore_init __initconst = { 1778 + static const struct uncore_plat_init icl_uncore_init __initconst = { 1774 1779 .cpu_init = icl_uncore_cpu_init, 1775 1780 .pci_init = skl_uncore_pci_init, 1776 1781 }; 1777 1782 1778 - static const struct intel_uncore_init_fun tgl_uncore_init __initconst = { 1783 + static const struct uncore_plat_init tgl_uncore_init __initconst = { 1779 1784 .cpu_init = tgl_uncore_cpu_init, 1780 1785 .mmio_init = tgl_uncore_mmio_init, 1781 1786 }; 1782 1787 1783 - static const struct intel_uncore_init_fun tgl_l_uncore_init __initconst = { 1788 + static const struct uncore_plat_init tgl_l_uncore_init __initconst = { 1784 1789 .cpu_init = tgl_uncore_cpu_init, 1785 1790 .mmio_init = tgl_l_uncore_mmio_init, 1786 1791 }; 1787 1792 1788 - static const struct intel_uncore_init_fun rkl_uncore_init __initconst = { 1793 + static const struct uncore_plat_init rkl_uncore_init __initconst = { 1789 1794 .cpu_init = tgl_uncore_cpu_init, 1790 1795 .pci_init = skl_uncore_pci_init, 1791 1796 }; 1792 1797 1793 - static const struct intel_uncore_init_fun adl_uncore_init __initconst = { 1798 + static const struct uncore_plat_init adl_uncore_init __initconst = { 1794 1799 .cpu_init = adl_uncore_cpu_init, 1795 1800 .mmio_init = adl_uncore_mmio_init, 1796 1801 }; 1797 1802 1798 - static const struct intel_uncore_init_fun mtl_uncore_init __initconst = { 1803 + static const struct uncore_plat_init mtl_uncore_init __initconst = { 1799 1804 .cpu_init = mtl_uncore_cpu_init, 1800 1805 .mmio_init = adl_uncore_mmio_init, 1801 1806 }; 1802 1807 1803 - static const struct 
intel_uncore_init_fun lnl_uncore_init __initconst = { 1808 + static const struct uncore_plat_init lnl_uncore_init __initconst = { 1804 1809 .cpu_init = lnl_uncore_cpu_init, 1805 1810 .mmio_init = lnl_uncore_mmio_init, 1806 1811 }; 1807 1812 1808 - static const struct intel_uncore_init_fun ptl_uncore_init __initconst = { 1813 + static const struct uncore_plat_init ptl_uncore_init __initconst = { 1809 1814 .cpu_init = ptl_uncore_cpu_init, 1810 1815 .mmio_init = ptl_uncore_mmio_init, 1811 - .use_discovery = true, 1816 + .domain[0].discovery_base = UNCORE_DISCOVERY_MSR, 1817 + .domain[0].global_init = uncore_mmio_global_init, 1812 1818 }; 1813 1819 1814 - static const struct intel_uncore_init_fun icx_uncore_init __initconst = { 1820 + static const struct uncore_plat_init nvl_uncore_init __initconst = { 1821 + .cpu_init = nvl_uncore_cpu_init, 1822 + .mmio_init = ptl_uncore_mmio_init, 1823 + .domain[0].discovery_base = PACKAGE_UNCORE_DISCOVERY_MSR, 1824 + .domain[0].global_init = uncore_mmio_global_init, 1825 + }; 1826 + 1827 + static const struct uncore_plat_init icx_uncore_init __initconst = { 1815 1828 .cpu_init = icx_uncore_cpu_init, 1816 1829 .pci_init = icx_uncore_pci_init, 1817 1830 .mmio_init = icx_uncore_mmio_init, 1818 1831 }; 1819 1832 1820 - static const struct intel_uncore_init_fun snr_uncore_init __initconst = { 1833 + static const struct uncore_plat_init snr_uncore_init __initconst = { 1821 1834 .cpu_init = snr_uncore_cpu_init, 1822 1835 .pci_init = snr_uncore_pci_init, 1823 1836 .mmio_init = snr_uncore_mmio_init, 1824 1837 }; 1825 1838 1826 - static const struct intel_uncore_init_fun spr_uncore_init __initconst = { 1839 + static const struct uncore_plat_init spr_uncore_init __initconst = { 1827 1840 .cpu_init = spr_uncore_cpu_init, 1828 1841 .pci_init = spr_uncore_pci_init, 1829 1842 .mmio_init = spr_uncore_mmio_init, 1830 - .use_discovery = true, 1831 - .uncore_units_ignore = spr_uncore_units_ignore, 1843 + .domain[0].base_is_pci = true, 1844 + 
.domain[0].discovery_base = UNCORE_DISCOVERY_TABLE_DEVICE, 1845 + .domain[0].units_ignore = spr_uncore_units_ignore, 1832 1846 }; 1833 1847 1834 - static const struct intel_uncore_init_fun gnr_uncore_init __initconst = { 1848 + static const struct uncore_plat_init gnr_uncore_init __initconst = { 1835 1849 .cpu_init = gnr_uncore_cpu_init, 1836 1850 .pci_init = gnr_uncore_pci_init, 1837 1851 .mmio_init = gnr_uncore_mmio_init, 1838 - .use_discovery = true, 1839 - .uncore_units_ignore = gnr_uncore_units_ignore, 1852 + .domain[0].base_is_pci = true, 1853 + .domain[0].discovery_base = UNCORE_DISCOVERY_TABLE_DEVICE, 1854 + .domain[0].units_ignore = gnr_uncore_units_ignore, 1840 1855 }; 1841 1856 1842 - static const struct intel_uncore_init_fun generic_uncore_init __initconst = { 1857 + static const struct uncore_plat_init dmr_uncore_init __initconst = { 1858 + .pci_init = dmr_uncore_pci_init, 1859 + .mmio_init = dmr_uncore_mmio_init, 1860 + .domain[0].base_is_pci = true, 1861 + .domain[0].discovery_base = DMR_UNCORE_DISCOVERY_TABLE_DEVICE, 1862 + .domain[0].units_ignore = dmr_uncore_imh_units_ignore, 1863 + .domain[1].discovery_base = CBB_UNCORE_DISCOVERY_MSR, 1864 + .domain[1].units_ignore = dmr_uncore_cbb_units_ignore, 1865 + .domain[1].global_init = uncore_mmio_global_init, 1866 + }; 1867 + 1868 + static const struct uncore_plat_init generic_uncore_init __initconst = { 1843 1869 .cpu_init = intel_uncore_generic_uncore_cpu_init, 1844 1870 .pci_init = intel_uncore_generic_uncore_pci_init, 1845 1871 .mmio_init = intel_uncore_generic_uncore_mmio_init, 1872 + .domain[0].base_is_pci = true, 1873 + .domain[0].discovery_base = PCI_ANY_ID, 1874 + .domain[1].discovery_base = UNCORE_DISCOVERY_MSR, 1846 1875 }; 1847 1876 1848 1877 static const struct x86_cpu_id intel_uncore_match[] __initconst = { ··· 1923 1894 X86_MATCH_VFM(INTEL_LUNARLAKE_M, &lnl_uncore_init), 1924 1895 X86_MATCH_VFM(INTEL_PANTHERLAKE_L, &ptl_uncore_init), 1925 1896 X86_MATCH_VFM(INTEL_WILDCATLAKE_L, 
&ptl_uncore_init), 1897 + X86_MATCH_VFM(INTEL_NOVALAKE, &nvl_uncore_init), 1898 + X86_MATCH_VFM(INTEL_NOVALAKE_L, &nvl_uncore_init), 1926 1899 X86_MATCH_VFM(INTEL_SAPPHIRERAPIDS_X, &spr_uncore_init), 1927 1900 X86_MATCH_VFM(INTEL_EMERALDRAPIDS_X, &spr_uncore_init), 1928 1901 X86_MATCH_VFM(INTEL_GRANITERAPIDS_X, &gnr_uncore_init), ··· 1934 1903 X86_MATCH_VFM(INTEL_ATOM_CRESTMONT_X, &gnr_uncore_init), 1935 1904 X86_MATCH_VFM(INTEL_ATOM_CRESTMONT, &gnr_uncore_init), 1936 1905 X86_MATCH_VFM(INTEL_ATOM_DARKMONT_X, &gnr_uncore_init), 1906 + X86_MATCH_VFM(INTEL_DIAMONDRAPIDS_X, &dmr_uncore_init), 1937 1907 {}, 1938 1908 }; 1939 1909 MODULE_DEVICE_TABLE(x86cpu, intel_uncore_match); 1940 1910 1911 + static bool uncore_use_discovery(struct uncore_plat_init *config) 1912 + { 1913 + for (int i = 0; i < UNCORE_DISCOVERY_DOMAINS; i++) { 1914 + if (config->domain[i].discovery_base) 1915 + return true; 1916 + } 1917 + 1918 + return false; 1919 + } 1920 + 1941 1921 static int __init intel_uncore_init(void) 1942 1922 { 1943 1923 const struct x86_cpu_id *id; 1944 - struct intel_uncore_init_fun *uncore_init; 1924 + struct uncore_plat_init *uncore_init; 1945 1925 int pret = 0, cret = 0, mret = 0, ret; 1946 1926 1947 1927 if (boot_cpu_has(X86_FEATURE_HYPERVISOR)) ··· 1963 1921 1964 1922 id = x86_match_cpu(intel_uncore_match); 1965 1923 if (!id) { 1966 - if (!uncore_no_discover && intel_uncore_has_discovery_tables(NULL)) 1967 - uncore_init = (struct intel_uncore_init_fun *)&generic_uncore_init; 1968 - else 1924 + uncore_init = (struct uncore_plat_init *)&generic_uncore_init; 1925 + if (uncore_no_discover || !uncore_discovery(uncore_init)) 1969 1926 return -ENODEV; 1970 1927 } else { 1971 - uncore_init = (struct intel_uncore_init_fun *)id->driver_data; 1972 - if (uncore_no_discover && uncore_init->use_discovery) 1928 + uncore_init = (struct uncore_plat_init *)id->driver_data; 1929 + if (uncore_no_discover && uncore_use_discovery(uncore_init)) 1973 1930 return -ENODEV; 1974 - if 
(uncore_init->use_discovery && 1975 - !intel_uncore_has_discovery_tables(uncore_init->uncore_units_ignore)) 1931 + if (uncore_use_discovery(uncore_init) && 1932 + !uncore_discovery(uncore_init)) 1976 1933 return -ENODEV; 1977 1934 } 1978 1935
+26
arch/x86/events/intel/uncore.h
···
 #define UNCORE_EXTRA_PCI_DEV_MAX	4

 #define UNCORE_EVENT_CONSTRAINT(c, n) EVENT_CONSTRAINT(c, n, 0xff)
+#define UNCORE_EVENT_CONSTRAINT_RANGE(c, e, n)			\
+	EVENT_CONSTRAINT_RANGE(c, e, n, 0xff)

 #define UNCORE_IGNORE_END		-1

···
 struct uncore_event_desc;
 struct freerunning_counters;
 struct intel_uncore_topology;
+
+struct uncore_discovery_domain {
+	/* MSR address or PCI device used as the discovery base */
+	u32	discovery_base;
+	bool	base_is_pci;
+	int	(*global_init)(u64 ctl);
+
+	/* The units in the discovery table should be ignored. */
+	int	*units_ignore;
+};
+
+#define UNCORE_DISCOVERY_DOMAINS	2
+struct uncore_plat_init {
+	void	(*cpu_init)(void);
+	int	(*pci_init)(void);
+	void	(*mmio_init)(void);
+
+	struct uncore_discovery_domain domain[UNCORE_DISCOVERY_DOMAINS];
+};

 struct intel_uncore_type {
 	const char *name;
···
 extern struct event_constraint uncore_constraint_empty;
 extern int spr_uncore_units_ignore[];
 extern int gnr_uncore_units_ignore[];
+extern int dmr_uncore_imh_units_ignore[];
+extern int dmr_uncore_cbb_units_ignore[];

 /* uncore_snb.c */
 int snb_uncore_pci_init(void);
···
 void lnl_uncore_cpu_init(void);
 void mtl_uncore_cpu_init(void);
 void ptl_uncore_cpu_init(void);
+void nvl_uncore_cpu_init(void);
 void tgl_uncore_mmio_init(void);
 void tgl_l_uncore_mmio_init(void);
 void adl_uncore_mmio_init(void);
···
 int gnr_uncore_pci_init(void);
 void gnr_uncore_cpu_init(void);
 void gnr_uncore_mmio_init(void);
+int dmr_uncore_pci_init(void);
+void dmr_uncore_mmio_init(void);

 /* uncore_nhmex.c */
 void nhmex_uncore_cpu_init(void);
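`struct uncore_plat_init` replaces the old single `use_discovery` flag with an array of discovery domains, each carrying its own base and a PCI-or-MSR selector. A platform such as `dmr_uncore_init` fills both slots, one PCI and one MSR. The sketch below models, in user-space C, one way a caller might walk that array. All names ending in `_sketch`, the `probe_fn` type, and the two stub probes are hypothetical and only illustrate the dispatch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DOMAINS 2

/* Hypothetical, reduced version of a discovery domain. */
struct domain_sketch {
	uint32_t discovery_base; /* 0 means the slot is unused */
	bool base_is_pci;
};

struct plat_sketch {
	struct domain_sketch domain[DOMAINS];
};

/* Hypothetical probe callback: returns true if a table was parsed. */
typedef bool (*probe_fn)(const struct domain_sketch *d);

/* Stub probes for illustration only. */
static bool probe_ok(const struct domain_sketch *d)   { (void)d; return true; }
static bool probe_fail(const struct domain_sketch *d) { (void)d; return false; }

/* A platform relies on discovery if any domain slot has a base set. */
static bool plat_uses_discovery(const struct plat_sketch *p)
{
	for (size_t i = 0; i < DOMAINS; i++)
		if (p->domain[i].discovery_base)
			return true;
	return false;
}

/* Probe every configured domain; a single success counts as found. */
static bool plat_discover(const struct plat_sketch *p, probe_fn pci, probe_fn msr)
{
	bool found = false;

	for (size_t i = 0; i < DOMAINS; i++) {
		const struct domain_sketch *d = &p->domain[i];

		if (!d->discovery_base)
			continue;
		if (d->base_is_pci ? pci(d) : msr(d))
			found = true;
	}
	return found;
}
```

Every configured domain is probed even after an earlier one succeeds, so units from both the PCI-rooted and the MSR-rooted tables can be collected in one pass.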
+40 -44
arch/x86/events/intel/uncore_discovery.c
···
 static struct rb_root discovery_tables = RB_ROOT;
 static int num_discovered_types[UNCORE_ACCESS_MAX];

-static bool has_generic_discovery_table(void)
-{
-	struct pci_dev *dev;
-	int dvsec;
-
-	dev = pci_get_device(PCI_VENDOR_ID_INTEL, UNCORE_DISCOVERY_TABLE_DEVICE, NULL);
-	if (!dev)
-		return false;
-
-	/* A discovery table device has the unique capability ID. */
-	dvsec = pci_find_next_ext_capability(dev, 0, UNCORE_EXT_CAP_ID_DISCOVERY);
-	pci_dev_put(dev);
-	if (dvsec)
-		return true;
-
-	return false;
-}
-
 static int logical_die_id;

 static int get_device_die_id(struct pci_dev *dev)
···

 static inline int __type_cmp(const void *key, const struct rb_node *b)
 {
-	struct intel_uncore_discovery_type *type_b = __node_2_type(b);
+	const struct intel_uncore_discovery_type *type_b = __node_2_type(b);
 	const u16 *type_id = key;

 	if (type_b->type > *type_id)
···

 static inline int pmu_idx_cmp(const void *key, const struct rb_node *b)
 {
-	struct intel_uncore_discovery_unit *unit;
+	const struct intel_uncore_discovery_unit *unit;
 	const unsigned int *id = key;

 	unit = rb_entry(b, struct intel_uncore_discovery_unit, node);
···

 static inline bool unit_less(struct rb_node *a, const struct rb_node *b)
 {
-	struct intel_uncore_discovery_unit *a_node, *b_node;
+	const struct intel_uncore_discovery_unit *a_node, *b_node;

 	a_node = rb_entry(a, struct intel_uncore_discovery_unit, node);
 	b_node = rb_entry(b, struct intel_uncore_discovery_unit, node);
···
 }

 static bool
-uncore_ignore_unit(struct uncore_unit_discovery *unit, int *ignore)
+uncore_ignore_unit(struct uncore_unit_discovery *unit,
+		   struct uncore_discovery_domain *domain)
 {
 	int i;

-	if (!ignore)
+	if (!domain || !domain->units_ignore)
 		return false;

-	for (i = 0; ignore[i] != UNCORE_IGNORE_END ; i++) {
-		if (unit->box_type == ignore[i])
+	for (i = 0; domain->units_ignore[i] != UNCORE_IGNORE_END ; i++) {
+		if (unit->box_type == domain->units_ignore[i])
 			return true;
 	}

 	return false;
 }

-static int __parse_discovery_table(resource_size_t addr, int die,
-				   bool *parsed, int *ignore)
+static int __parse_discovery_table(struct uncore_discovery_domain *domain,
+				   resource_size_t addr, int die, bool *parsed)
 {
 	struct uncore_global_discovery global;
 	struct uncore_unit_discovery unit;
···
 	if (!io_addr)
 		return -ENOMEM;

+	if (domain->global_init && domain->global_init(global.ctl))
+		return -ENODEV;
+
 	/* Parsing Unit Discovery State */
 	for (i = 0; i < global.max_units; i++) {
 		memcpy_fromio(&unit, io_addr + (i + 1) * (global.stride * 8),
···
 		if (unit.access_type >= UNCORE_ACCESS_MAX)
 			continue;

-		if (uncore_ignore_unit(&unit, ignore))
+		if (uncore_ignore_unit(&unit, domain))
 			continue;

 		uncore_insert_box_info(&unit, die);
···
 	return 0;
 }

-static int parse_discovery_table(struct pci_dev *dev, int die,
-				 u32 bar_offset, bool *parsed,
-				 int *ignore)
+static int parse_discovery_table(struct uncore_discovery_domain *domain,
+				 struct pci_dev *dev, int die,
+				 u32 bar_offset, bool *parsed)
 {
 	resource_size_t addr;
 	u32 val;
···
 	}
 #endif

-	return __parse_discovery_table(addr, die, parsed, ignore);
+	return __parse_discovery_table(domain, addr, die, parsed);
 }

-static bool intel_uncore_has_discovery_tables_pci(int *ignore)
+static bool uncore_discovery_pci(struct uncore_discovery_domain *domain)
 {
 	u32 device, val, entry_id, bar_offset;
 	int die, dvsec = 0, ret = true;
 	struct pci_dev *dev = NULL;
 	bool parsed = false;

-	if (has_generic_discovery_table())
-		device = UNCORE_DISCOVERY_TABLE_DEVICE;
-	else
-		device = PCI_ANY_ID;
+	device = domain->discovery_base;

 	/*
 	 * Start a new search and iterates through the list of
···
 			if (die < 0)
 				continue;

-			parse_discovery_table(dev, die, bar_offset, &parsed, ignore);
+			parse_discovery_table(domain, dev, die, bar_offset, &parsed);
 		}
 	}

···
 	return ret;
 }

-static bool intel_uncore_has_discovery_tables_msr(int *ignore)
+static bool uncore_discovery_msr(struct uncore_discovery_domain *domain)
 {
 	unsigned long *die_mask;
 	bool parsed = false;
···
 		if (__test_and_set_bit(die, die_mask))
 			continue;

-		if (rdmsrq_safe_on_cpu(cpu, UNCORE_DISCOVERY_MSR, &base))
+		if (rdmsrq_safe_on_cpu(cpu, domain->discovery_base, &base))
 			continue;

 		if (!base)
 			continue;

-		__parse_discovery_table(base, die, &parsed, ignore);
+		__parse_discovery_table(domain, base, die, &parsed);
 	}

 	cpus_read_unlock();
···
 	return parsed;
 }

-bool intel_uncore_has_discovery_tables(int *ignore)
+bool uncore_discovery(struct uncore_plat_init *init)
 {
-	return intel_uncore_has_discovery_tables_msr(ignore) ||
-	       intel_uncore_has_discovery_tables_pci(ignore);
+	struct uncore_discovery_domain *domain;
+	bool ret = false;
+	int i;
+
+	for (i = 0; i < UNCORE_DISCOVERY_DOMAINS; i++) {
+		domain = &init->domain[i];
+		if (domain->discovery_base) {
+			if (!domain->base_is_pci)
+				ret |= uncore_discovery_msr(domain);
+			else
+				ret |= uncore_discovery_pci(domain);
+		}
+	}
+
+	return ret;
 }

 void intel_uncore_clear_discovery_tables(void)
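The reworked uncore_discovery() in the diff above replaces the old "try MSR, else try PCI" logic with a loop over per-platform discovery domains, each carrying its own base (an MSR address or a PCI device ID) and its own ignore list. The following is a minimal userspace sketch of that dispatch, not kernel code. The struct layouts are simplified, the values chosen for UNCORE_DISCOVERY_DOMAINS and UNCORE_IGNORE_END are assumptions, the probe functions are hypothetical stubs that only count calls, and the ordering of the two sample domains is illustrative. The base values and ignore-list entries are the ones that appear in this commit.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed values for illustration; the real ones come from kernel headers. */
#define UNCORE_DISCOVERY_DOMAINS	2
#define UNCORE_IGNORE_END		(-1)

/* Simplified stand-in for struct uncore_discovery_domain. */
struct discovery_domain {
	unsigned int discovery_base;	/* MSR address or PCI device ID; 0 means unused */
	bool base_is_pci;		/* selects the PCI or the MSR probe path */
	const int *units_ignore;	/* optional list, ends with UNCORE_IGNORE_END */
};

/* Simplified stand-in for struct uncore_plat_init. */
struct plat_init {
	struct discovery_domain domain[UNCORE_DISCOVERY_DOMAINS];
};

/* Hypothetical stubs for the MSR and PCI probes; they only count calls. */
static int msr_probes, pci_probes;

static bool probe_msr(const struct discovery_domain *d)
{
	(void)d;
	msr_probes++;
	return true;
}

static bool probe_pci(const struct discovery_domain *d)
{
	(void)d;
	pci_probes++;
	return true;
}

/* Same shape as uncore_ignore_unit(): scan a sentinel-terminated list. */
static bool ignore_unit(int box_type, const struct discovery_domain *d)
{
	if (!d || !d->units_ignore)
		return false;

	for (int i = 0; d->units_ignore[i] != UNCORE_IGNORE_END; i++) {
		if (box_type == d->units_ignore[i])
			return true;
	}
	return false;
}

/* Same shape as uncore_discovery(): probe every populated domain. */
static bool discover(const struct plat_init *init)
{
	bool ret = false;

	for (int i = 0; i < UNCORE_DISCOVERY_DOMAINS; i++) {
		const struct discovery_domain *d = &init->domain[i];

		if (!d->discovery_base)
			continue;
		if (d->base_is_pci)
			ret |= probe_pci(d);
		else
			ret |= probe_msr(d);
	}
	return ret;
}

/* Ignore lists with the entries this commit adds for DMR. */
static const int imh_ignore[] = { 0x13 /* MSE */, UNCORE_IGNORE_END };
static const int cbb_ignore[] = { 0x25 /* SB2UCIE */, UNCORE_IGNORE_END };

/* Illustrative DMR-like setup: one PCI (IMH) and one MSR (CBB) domain. */
static const struct plat_init sample_init = {
	.domain = {
		{ 0x09a1, true,  imh_ignore },	/* DMR_UNCORE_DISCOVERY_TABLE_DEVICE */
		{ 0x710,  false, cbb_ignore },	/* CBB_UNCORE_DISCOVERY_MSR */
	},
};
```

Calling discover(&sample_init) runs each stub once, and ignore_unit(0x13, &sample_init.domain[0]) reports that the MSE unit is skipped only in the IMH domain. Because every domain is probed and the results are OR-ed together, a platform can now mix MSR-based and PCI-based discovery, which the removed `msr || pci` short-circuit could not do once the MSR path had succeeded.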
+7 -1
arch/x86/events/intel/uncore_discovery.h
···

 /* Store the full address of the global discovery table */
 #define UNCORE_DISCOVERY_MSR 0x201e
+/* Base address of uncore perfmon discovery table for CBB domain */
+#define CBB_UNCORE_DISCOVERY_MSR 0x710
+/* Base address of uncore perfmon discovery table for the package */
+#define PACKAGE_UNCORE_DISCOVERY_MSR 0x711

 /* Generic device ID of a discovery table device */
 #define UNCORE_DISCOVERY_TABLE_DEVICE 0x09a7
+/* Device ID used on DMR */
+#define DMR_UNCORE_DISCOVERY_TABLE_DEVICE 0x09a1
 /* Capability ID for a discovery table device */
 #define UNCORE_EXT_CAP_ID_DISCOVERY 0x23
 /* First DVSEC offset */
···
 	u16 num_units; /* number of units */
 };

-bool intel_uncore_has_discovery_tables(int *ignore);
+bool uncore_discovery(struct uncore_plat_init *init);
 void intel_uncore_clear_discovery_tables(void);
 void intel_uncore_generic_uncore_cpu_init(void);
 int intel_uncore_generic_uncore_pci_init(void);
+85
arch/x86/events/intel/uncore_snb.c
···
 #define MTL_UNC_HBO_CTR 0x2048
 #define MTL_UNC_HBO_CTRL 0x2042

+/* PTL Low Power Bridge register */
+#define PTL_UNC_IA_CORE_BRIDGE_PER_CTR0 0x2028
+#define PTL_UNC_IA_CORE_BRIDGE_PERFEVTSEL0 0x2022
+
+/* PTL Santa register */
+#define PTL_UNC_SANTA_CTR0 0x2418
+#define PTL_UNC_SANTA_CTRL0 0x2412
+
+/* PTL cNCU register */
+#define PTL_UNC_CNCU_MSR_OFFSET 0x140
+
+/* NVL cNCU register */
+#define NVL_UNC_CNCU_BOX_CTL 0x202e
+#define NVL_UNC_CNCU_FIXED_CTR 0x2028
+#define NVL_UNC_CNCU_FIXED_CTRL 0x2022
+
+/* NVL SANTA register */
+#define NVL_UNC_SANTA_CTR0 0x2048
+#define NVL_UNC_SANTA_CTRL0 0x2042
+
+/* NVL CBOX register */
+#define NVL_UNC_CBOX_PER_CTR0 0x2108
+#define NVL_UNC_CBOX_PERFEVTSEL0 0x2102
+
 DEFINE_UNCORE_FORMAT_ATTR(event, event, "config:0-7");
 DEFINE_UNCORE_FORMAT_ATTR(umask, umask, "config:8-15");
 DEFINE_UNCORE_FORMAT_ATTR(chmask, chmask, "config:8-11");
···
 			ptl_uncores);
 }

+static struct intel_uncore_type ptl_uncore_ia_core_bridge = {
+	.name = "ia_core_bridge",
+	.num_counters = 2,
+	.num_boxes = 1,
+	.perf_ctr_bits = 48,
+	.perf_ctr = PTL_UNC_IA_CORE_BRIDGE_PER_CTR0,
+	.event_ctl = PTL_UNC_IA_CORE_BRIDGE_PERFEVTSEL0,
+	.event_mask = ADL_UNC_RAW_EVENT_MASK,
+	.ops = &icl_uncore_msr_ops,
+	.format_group = &adl_uncore_format_group,
+};
+
+static struct intel_uncore_type ptl_uncore_santa = {
+	.name = "santa",
+	.num_counters = 2,
+	.num_boxes = 2,
+	.perf_ctr_bits = 48,
+	.perf_ctr = PTL_UNC_SANTA_CTR0,
+	.event_ctl = PTL_UNC_SANTA_CTRL0,
+	.event_mask = ADL_UNC_RAW_EVENT_MASK,
+	.msr_offset = SNB_UNC_CBO_MSR_OFFSET,
+	.ops = &icl_uncore_msr_ops,
+	.format_group = &adl_uncore_format_group,
+};
+
 static struct intel_uncore_type *ptl_msr_uncores[] = {
 	&mtl_uncore_cbox,
+	&ptl_uncore_ia_core_bridge,
+	&ptl_uncore_santa,
+	&mtl_uncore_cncu,
 	NULL
 };

···
 {
 	mtl_uncore_cbox.num_boxes = 6;
 	mtl_uncore_cbox.ops = &lnl_uncore_msr_ops;
+
+	mtl_uncore_cncu.num_counters = 2;
+	mtl_uncore_cncu.num_boxes = 2;
+	mtl_uncore_cncu.msr_offset = PTL_UNC_CNCU_MSR_OFFSET;
+	mtl_uncore_cncu.single_fixed = 0;
+
 	uncore_msr_uncores = ptl_msr_uncores;
 }

 /* end of Panther Lake uncore support */
+
+/* Nova Lake uncore support */
+
+static struct intel_uncore_type *nvl_msr_uncores[] = {
+	&mtl_uncore_cbox,
+	&ptl_uncore_santa,
+	&mtl_uncore_cncu,
+	NULL
+};
+
+void nvl_uncore_cpu_init(void)
+{
+	mtl_uncore_cbox.num_boxes = 12;
+	mtl_uncore_cbox.perf_ctr = NVL_UNC_CBOX_PER_CTR0;
+	mtl_uncore_cbox.event_ctl = NVL_UNC_CBOX_PERFEVTSEL0;
+
+	ptl_uncore_santa.perf_ctr = NVL_UNC_SANTA_CTR0;
+	ptl_uncore_santa.event_ctl = NVL_UNC_SANTA_CTRL0;
+
+	mtl_uncore_cncu.box_ctl = NVL_UNC_CNCU_BOX_CTL;
+	mtl_uncore_cncu.fixed_ctr = NVL_UNC_CNCU_FIXED_CTR;
+	mtl_uncore_cncu.fixed_ctl = NVL_UNC_CNCU_FIXED_CTRL;
+
+	uncore_msr_uncores = nvl_msr_uncores;
+}
+
+/* end of Nova Lake uncore support */
+480 -210
arch/x86/events/intel/uncore_snbep.c
···

 #define SPR_C0_MSR_PMON_BOX_FILTER0 0x200e

+/* DMR */
+#define DMR_IMH1_HIOP_MMIO_BASE 0x1ffff6ae7000
+#define DMR_HIOP_MMIO_SIZE 0x8000
+#define DMR_CXLCM_EVENT_MASK_EXT 0xf
+#define DMR_HAMVF_EVENT_MASK_EXT 0xffffffff
+#define DMR_PCIE4_EVENT_MASK_EXT 0xffffff
+
+#define UNCORE_DMR_ITC 0x30
+
+#define DMR_IMC_PMON_FIXED_CTR 0x18
+#define DMR_IMC_PMON_FIXED_CTL 0x10
+
 DEFINE_UNCORE_FORMAT_ATTR(event, event, "config:0-7");
 DEFINE_UNCORE_FORMAT_ATTR(event2, event, "config:0-6");
 DEFINE_UNCORE_FORMAT_ATTR(event_ext, event, "config:0-7,21");
···
 DEFINE_UNCORE_FORMAT_ATTR(tid_en, tid_en, "config:19");
 DEFINE_UNCORE_FORMAT_ATTR(tid_en2, tid_en, "config:16");
 DEFINE_UNCORE_FORMAT_ATTR(inv, inv, "config:23");
+DEFINE_UNCORE_FORMAT_ATTR(inv2, inv, "config:21");
+DEFINE_UNCORE_FORMAT_ATTR(thresh_ext, thresh_ext, "config:32-35");
+DEFINE_UNCORE_FORMAT_ATTR(thresh10, thresh, "config:23-32");
+DEFINE_UNCORE_FORMAT_ATTR(thresh9_2, thresh, "config:23-31");
 DEFINE_UNCORE_FORMAT_ATTR(thresh9, thresh, "config:24-35");
 DEFINE_UNCORE_FORMAT_ATTR(thresh8, thresh, "config:24-31");
 DEFINE_UNCORE_FORMAT_ATTR(thresh6, thresh, "config:24-29");
···
 DEFINE_UNCORE_FORMAT_ATTR(occ_invert, occ_invert, "config:30");
 DEFINE_UNCORE_FORMAT_ATTR(occ_edge, occ_edge, "config:14-51");
 DEFINE_UNCORE_FORMAT_ATTR(occ_edge_det, occ_edge_det, "config:31");
+DEFINE_UNCORE_FORMAT_ATTR(port_en, port_en, "config:32-35");
+DEFINE_UNCORE_FORMAT_ATTR(rs3_sel, rs3_sel, "config:36");
+DEFINE_UNCORE_FORMAT_ATTR(rx_sel, rx_sel, "config:37");
+DEFINE_UNCORE_FORMAT_ATTR(tx_sel, tx_sel, "config:38");
+DEFINE_UNCORE_FORMAT_ATTR(iep_sel, iep_sel, "config:39");
+DEFINE_UNCORE_FORMAT_ATTR(vc_sel, vc_sel, "config:40-47");
+DEFINE_UNCORE_FORMAT_ATTR(port_sel, port_sel, "config:48-55");
 DEFINE_UNCORE_FORMAT_ATTR(ch_mask, ch_mask, "config:36-43");
 DEFINE_UNCORE_FORMAT_ATTR(ch_mask2, ch_mask, "config:36-47");
 DEFINE_UNCORE_FORMAT_ATTR(fc_mask, fc_mask, "config:44-46");
···
 static struct event_constraint snbep_uncore_cbox_constraints[] = {
 	UNCORE_EVENT_CONSTRAINT(0x01, 0x1),
 	UNCORE_EVENT_CONSTRAINT(0x02, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x04, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x05, 0x3),
+	UNCORE_EVENT_CONSTRAINT_RANGE(0x04, 0x5, 0x3),
 	UNCORE_EVENT_CONSTRAINT(0x07, 0x3),
 	UNCORE_EVENT_CONSTRAINT(0x09, 0x3),
 	UNCORE_EVENT_CONSTRAINT(0x11, 0x1),
-	UNCORE_EVENT_CONSTRAINT(0x12, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x13, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x1b, 0xc),
-	UNCORE_EVENT_CONSTRAINT(0x1c, 0xc),
-	UNCORE_EVENT_CONSTRAINT(0x1d, 0xc),
-	UNCORE_EVENT_CONSTRAINT(0x1e, 0xc),
+	UNCORE_EVENT_CONSTRAINT_RANGE(0x12, 0x13, 0x3),
+	UNCORE_EVENT_CONSTRAINT_RANGE(0x1b, 0x1e, 0xc),
 	UNCORE_EVENT_CONSTRAINT(0x1f, 0xe),
 	UNCORE_EVENT_CONSTRAINT(0x21, 0x3),
 	UNCORE_EVENT_CONSTRAINT(0x23, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x31, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x32, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x33, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x34, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x35, 0x3),
+	UNCORE_EVENT_CONSTRAINT_RANGE(0x31, 0x35, 0x3),
 	UNCORE_EVENT_CONSTRAINT(0x36, 0x1),
-	UNCORE_EVENT_CONSTRAINT(0x37, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x38, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x39, 0x3),
+	UNCORE_EVENT_CONSTRAINT_RANGE(0x37, 0x39, 0x3),
 	UNCORE_EVENT_CONSTRAINT(0x3b, 0x1),
 	EVENT_CONSTRAINT_END
 };

 static struct event_constraint snbep_uncore_r2pcie_constraints[] = {
-	UNCORE_EVENT_CONSTRAINT(0x10, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x11, 0x3),
+	UNCORE_EVENT_CONSTRAINT_RANGE(0x10, 0x11, 0x3),
 	UNCORE_EVENT_CONSTRAINT(0x12, 0x1),
 	UNCORE_EVENT_CONSTRAINT(0x23, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x24, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x25, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x26, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x32, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x33, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x34, 0x3),
+	UNCORE_EVENT_CONSTRAINT_RANGE(0x24, 0x26, 0x3),
+	UNCORE_EVENT_CONSTRAINT_RANGE(0x32, 0x34, 0x3),
 	EVENT_CONSTRAINT_END
 };

 static struct event_constraint snbep_uncore_r3qpi_constraints[] = {
-	UNCORE_EVENT_CONSTRAINT(0x10, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x11, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x12, 0x3),
+	UNCORE_EVENT_CONSTRAINT_RANGE(0x10, 0x12, 0x3),
 	UNCORE_EVENT_CONSTRAINT(0x13, 0x1),
-	UNCORE_EVENT_CONSTRAINT(0x20, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x21, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x22, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x23, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x24, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x25, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x26, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x28, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x29, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x2a, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x2b, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x2c, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x2d, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x2e, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x2f, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x30, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x31, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x32, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x33, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x34, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x36, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x37, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x38, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x39, 0x3),
+	UNCORE_EVENT_CONSTRAINT_RANGE(0x20, 0x26, 0x3),
+	UNCORE_EVENT_CONSTRAINT_RANGE(0x28, 0x34, 0x3),
+	UNCORE_EVENT_CONSTRAINT_RANGE(0x36, 0x39, 0x3),
 	EVENT_CONSTRAINT_END
 };

···
 };

 static struct event_constraint hswep_uncore_r2pcie_constraints[] = {
-	UNCORE_EVENT_CONSTRAINT(0x10, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x11, 0x3),
+	UNCORE_EVENT_CONSTRAINT_RANGE(0x10, 0x11, 0x3),
 	UNCORE_EVENT_CONSTRAINT(0x13, 0x1),
-	UNCORE_EVENT_CONSTRAINT(0x23, 0x1),
-	UNCORE_EVENT_CONSTRAINT(0x24, 0x1),
-	UNCORE_EVENT_CONSTRAINT(0x25, 0x1),
+	UNCORE_EVENT_CONSTRAINT_RANGE(0x23, 0x25, 0x1),
 	UNCORE_EVENT_CONSTRAINT(0x26, 0x3),
 	UNCORE_EVENT_CONSTRAINT(0x27, 0x1),
-	UNCORE_EVENT_CONSTRAINT(0x28, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x29, 0x3),
+	UNCORE_EVENT_CONSTRAINT_RANGE(0x28, 0x29, 0x3),
 	UNCORE_EVENT_CONSTRAINT(0x2a, 0x1),
-	UNCORE_EVENT_CONSTRAINT(0x2b, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x2c, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x2d, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x32, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x33, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x34, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x35, 0x3),
+	UNCORE_EVENT_CONSTRAINT_RANGE(0x2b, 0x2d, 0x3),
+	UNCORE_EVENT_CONSTRAINT_RANGE(0x32, 0x35, 0x3),
 	EVENT_CONSTRAINT_END
 };

···

 static struct event_constraint hswep_uncore_r3qpi_constraints[] = {
 	UNCORE_EVENT_CONSTRAINT(0x01, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x07, 0x7),
-	UNCORE_EVENT_CONSTRAINT(0x08, 0x7),
-	UNCORE_EVENT_CONSTRAINT(0x09, 0x7),
-	UNCORE_EVENT_CONSTRAINT(0x0a, 0x7),
+	UNCORE_EVENT_CONSTRAINT_RANGE(0x7, 0x0a, 0x7),
 	UNCORE_EVENT_CONSTRAINT(0x0e, 0x7),
-	UNCORE_EVENT_CONSTRAINT(0x10, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x11, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x12, 0x3),
+	UNCORE_EVENT_CONSTRAINT_RANGE(0x10, 0x12, 0x3),
 	UNCORE_EVENT_CONSTRAINT(0x13, 0x1),
-	UNCORE_EVENT_CONSTRAINT(0x14, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x15, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x1f, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x20, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x21, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x22, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x23, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x25, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x26, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x28, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x29, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x2c, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x2d, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x2e, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x2f, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x31, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x32, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x33, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x34, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x36, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x37, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x38, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x39, 0x3),
+	UNCORE_EVENT_CONSTRAINT_RANGE(0x14, 0x15, 0x3),
+	UNCORE_EVENT_CONSTRAINT_RANGE(0x1f, 0x23, 0x3),
+	UNCORE_EVENT_CONSTRAINT_RANGE(0x25, 0x26, 0x3),
+	UNCORE_EVENT_CONSTRAINT_RANGE(0x28, 0x29, 0x3),
+	UNCORE_EVENT_CONSTRAINT_RANGE(0x2c, 0x2f, 0x3),
+	UNCORE_EVENT_CONSTRAINT_RANGE(0x31, 0x34, 0x3),
+	UNCORE_EVENT_CONSTRAINT_RANGE(0x36, 0x39, 0x3),
 	EVENT_CONSTRAINT_END
 };

···
 	UNCORE_EVENT_CONSTRAINT(0x25, 0x1),
 	UNCORE_EVENT_CONSTRAINT(0x26, 0x3),
 	UNCORE_EVENT_CONSTRAINT(0x28, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x2c, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x2d, 0x3),
+	UNCORE_EVENT_CONSTRAINT_RANGE(0x2c, 0x2d, 0x3),
 	EVENT_CONSTRAINT_END
 };

···

 static struct event_constraint bdx_uncore_r3qpi_constraints[] = {
 	UNCORE_EVENT_CONSTRAINT(0x01, 0x7),
-	UNCORE_EVENT_CONSTRAINT(0x07, 0x7),
-	UNCORE_EVENT_CONSTRAINT(0x08, 0x7),
-	UNCORE_EVENT_CONSTRAINT(0x09, 0x7),
-	UNCORE_EVENT_CONSTRAINT(0x0a, 0x7),
+	UNCORE_EVENT_CONSTRAINT_RANGE(0x07, 0x0a, 0x7),
 	UNCORE_EVENT_CONSTRAINT(0x0e, 0x7),
-	UNCORE_EVENT_CONSTRAINT(0x10, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x11, 0x3),
+	UNCORE_EVENT_CONSTRAINT_RANGE(0x10, 0x11, 0x3),
 	UNCORE_EVENT_CONSTRAINT(0x13, 0x1),
-	UNCORE_EVENT_CONSTRAINT(0x14, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x15, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x1f, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x20, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x21, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x22, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x23, 0x3),
+	UNCORE_EVENT_CONSTRAINT_RANGE(0x14, 0x15, 0x3),
+	UNCORE_EVENT_CONSTRAINT_RANGE(0x1f, 0x23, 0x3),
 	UNCORE_EVENT_CONSTRAINT(0x25, 0x3),
 	UNCORE_EVENT_CONSTRAINT(0x26, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x28, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x29, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x2c, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x2d, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x2e, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x2f, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x33, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x34, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x36, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x37, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x38, 0x3),
-	UNCORE_EVENT_CONSTRAINT(0x39, 0x3),
+	UNCORE_EVENT_CONSTRAINT_RANGE(0x28, 0x29, 0x3),
+	UNCORE_EVENT_CONSTRAINT_RANGE(0x2c, 0x2f, 0x3),
+	UNCORE_EVENT_CONSTRAINT_RANGE(0x33, 0x34, 0x3),
+	UNCORE_EVENT_CONSTRAINT_RANGE(0x36, 0x39, 0x3),
 	EVENT_CONSTRAINT_END
 };

···
 	UNCORE_EVENT_CONSTRAINT(0x95, 0xc),
 	UNCORE_EVENT_CONSTRAINT(0xc0, 0xc),
 	UNCORE_EVENT_CONSTRAINT(0xc5, 0xc),
-	UNCORE_EVENT_CONSTRAINT(0xd4, 0xc),
-	UNCORE_EVENT_CONSTRAINT(0xd5, 0xc),
+	UNCORE_EVENT_CONSTRAINT_RANGE(0xd4, 0xd5, 0xc),
 	EVENT_CONSTRAINT_END
 };

···
 	[SKX_IIO_MSR_UTIL] = { 0xb08, 0x1, 0x10, 8, 36 },
 };

+#define INTEL_UNCORE_FR_EVENT_DESC(name, umask, scl) \
+	INTEL_UNCORE_EVENT_DESC(name, \
+		"event=0xff,umask=" __stringify(umask)),\
+	INTEL_UNCORE_EVENT_DESC(name.scale, __stringify(scl)), \
+	INTEL_UNCORE_EVENT_DESC(name.unit, "MiB")
+
 static struct uncore_event_desc skx_uncore_iio_freerunning_events[] = {
 	/* Free-Running IO CLOCKS Counter */
 	INTEL_UNCORE_EVENT_DESC(ioclk, "event=0xff,umask=0x10"),
 	/* Free-Running IIO BANDWIDTH Counters */
-	INTEL_UNCORE_EVENT_DESC(bw_in_port0, "event=0xff,umask=0x20"),
-	INTEL_UNCORE_EVENT_DESC(bw_in_port0.scale, "3.814697266e-6"),
-	INTEL_UNCORE_EVENT_DESC(bw_in_port0.unit, "MiB"),
-	INTEL_UNCORE_EVENT_DESC(bw_in_port1, "event=0xff,umask=0x21"),
-	INTEL_UNCORE_EVENT_DESC(bw_in_port1.scale, "3.814697266e-6"),
-	INTEL_UNCORE_EVENT_DESC(bw_in_port1.unit, "MiB"),
-	INTEL_UNCORE_EVENT_DESC(bw_in_port2, "event=0xff,umask=0x22"),
-	INTEL_UNCORE_EVENT_DESC(bw_in_port2.scale, "3.814697266e-6"),
-	INTEL_UNCORE_EVENT_DESC(bw_in_port2.unit, "MiB"),
-	INTEL_UNCORE_EVENT_DESC(bw_in_port3, "event=0xff,umask=0x23"),
-	INTEL_UNCORE_EVENT_DESC(bw_in_port3.scale, "3.814697266e-6"),
-	INTEL_UNCORE_EVENT_DESC(bw_in_port3.unit, "MiB"),
-	INTEL_UNCORE_EVENT_DESC(bw_out_port0, "event=0xff,umask=0x24"),
-	INTEL_UNCORE_EVENT_DESC(bw_out_port0.scale, "3.814697266e-6"),
-	INTEL_UNCORE_EVENT_DESC(bw_out_port0.unit, "MiB"),
-	INTEL_UNCORE_EVENT_DESC(bw_out_port1, "event=0xff,umask=0x25"),
-	INTEL_UNCORE_EVENT_DESC(bw_out_port1.scale, "3.814697266e-6"),
-	INTEL_UNCORE_EVENT_DESC(bw_out_port1.unit, "MiB"),
-	INTEL_UNCORE_EVENT_DESC(bw_out_port2, "event=0xff,umask=0x26"),
-	INTEL_UNCORE_EVENT_DESC(bw_out_port2.scale, "3.814697266e-6"),
-	INTEL_UNCORE_EVENT_DESC(bw_out_port2.unit, "MiB"),
-	INTEL_UNCORE_EVENT_DESC(bw_out_port3, "event=0xff,umask=0x27"),
-	INTEL_UNCORE_EVENT_DESC(bw_out_port3.scale, "3.814697266e-6"),
-	INTEL_UNCORE_EVENT_DESC(bw_out_port3.unit, "MiB"),
+	INTEL_UNCORE_FR_EVENT_DESC(bw_in_port0, 0x20, 3.814697266e-6),
+	INTEL_UNCORE_FR_EVENT_DESC(bw_in_port1, 0x21, 3.814697266e-6),
+	INTEL_UNCORE_FR_EVENT_DESC(bw_in_port2, 0x22, 3.814697266e-6),
+	INTEL_UNCORE_FR_EVENT_DESC(bw_in_port3, 0x23, 3.814697266e-6),
+	INTEL_UNCORE_FR_EVENT_DESC(bw_out_port0, 0x24, 3.814697266e-6),
+	INTEL_UNCORE_FR_EVENT_DESC(bw_out_port1, 0x25, 3.814697266e-6),
+	INTEL_UNCORE_FR_EVENT_DESC(bw_out_port2, 0x26, 3.814697266e-6),
+	INTEL_UNCORE_FR_EVENT_DESC(bw_out_port3, 0x27, 3.814697266e-6),
 	/* Free-running IIO UTILIZATION Counters */
 	INTEL_UNCORE_EVENT_DESC(util_in_port0, "event=0xff,umask=0x30"),
 	INTEL_UNCORE_EVENT_DESC(util_out_port0, "event=0xff,umask=0x31"),
···
 };

 static struct event_constraint skx_uncore_m3upi_constraints[] = {
-	UNCORE_EVENT_CONSTRAINT(0x1d, 0x1),
-	UNCORE_EVENT_CONSTRAINT(0x1e, 0x1),
+	UNCORE_EVENT_CONSTRAINT_RANGE(0x1d, 0x1e, 0x1),
 	UNCORE_EVENT_CONSTRAINT(0x40, 0x7),
-	UNCORE_EVENT_CONSTRAINT(0x4e, 0x7),
-	UNCORE_EVENT_CONSTRAINT(0x4f, 0x7),
-	UNCORE_EVENT_CONSTRAINT(0x50, 0x7),
-	UNCORE_EVENT_CONSTRAINT(0x51, 0x7),
-	UNCORE_EVENT_CONSTRAINT(0x52, 0x7),
+	UNCORE_EVENT_CONSTRAINT_RANGE(0x4e, 0x52, 0x7),
 	EVENT_CONSTRAINT_END
 };

···
 	/* Free-Running IIO CLOCKS Counter */
 	INTEL_UNCORE_EVENT_DESC(ioclk, "event=0xff,umask=0x10"),
 	/* Free-Running IIO BANDWIDTH IN Counters */
-	INTEL_UNCORE_EVENT_DESC(bw_in_port0, "event=0xff,umask=0x20"),
-	INTEL_UNCORE_EVENT_DESC(bw_in_port0.scale, "3.0517578125e-5"),
-	INTEL_UNCORE_EVENT_DESC(bw_in_port0.unit, "MiB"),
-	INTEL_UNCORE_EVENT_DESC(bw_in_port1, "event=0xff,umask=0x21"),
-	INTEL_UNCORE_EVENT_DESC(bw_in_port1.scale, "3.0517578125e-5"),
-	INTEL_UNCORE_EVENT_DESC(bw_in_port1.unit, "MiB"),
-	INTEL_UNCORE_EVENT_DESC(bw_in_port2, "event=0xff,umask=0x22"),
-	INTEL_UNCORE_EVENT_DESC(bw_in_port2.scale, "3.0517578125e-5"),
-	INTEL_UNCORE_EVENT_DESC(bw_in_port2.unit, "MiB"),
-	INTEL_UNCORE_EVENT_DESC(bw_in_port3, "event=0xff,umask=0x23"),
-	INTEL_UNCORE_EVENT_DESC(bw_in_port3.scale, "3.0517578125e-5"),
-	INTEL_UNCORE_EVENT_DESC(bw_in_port3.unit, "MiB"),
-	INTEL_UNCORE_EVENT_DESC(bw_in_port4, "event=0xff,umask=0x24"),
-	INTEL_UNCORE_EVENT_DESC(bw_in_port4.scale, "3.0517578125e-5"),
-	INTEL_UNCORE_EVENT_DESC(bw_in_port4.unit, "MiB"),
-	INTEL_UNCORE_EVENT_DESC(bw_in_port5, "event=0xff,umask=0x25"),
-	INTEL_UNCORE_EVENT_DESC(bw_in_port5.scale, "3.0517578125e-5"),
-	INTEL_UNCORE_EVENT_DESC(bw_in_port5.unit, "MiB"),
-	INTEL_UNCORE_EVENT_DESC(bw_in_port6, "event=0xff,umask=0x26"),
-	INTEL_UNCORE_EVENT_DESC(bw_in_port6.scale, "3.0517578125e-5"),
-	INTEL_UNCORE_EVENT_DESC(bw_in_port6.unit, "MiB"),
-	INTEL_UNCORE_EVENT_DESC(bw_in_port7, "event=0xff,umask=0x27"),
-	INTEL_UNCORE_EVENT_DESC(bw_in_port7.scale, "3.0517578125e-5"),
-	INTEL_UNCORE_EVENT_DESC(bw_in_port7.unit, "MiB"),
+	INTEL_UNCORE_FR_EVENT_DESC(bw_in_port0, 0x20, 3.0517578125e-5),
+	INTEL_UNCORE_FR_EVENT_DESC(bw_in_port1, 0x21, 3.0517578125e-5),
+	INTEL_UNCORE_FR_EVENT_DESC(bw_in_port2, 0x22, 3.0517578125e-5),
+	INTEL_UNCORE_FR_EVENT_DESC(bw_in_port3, 0x23, 3.0517578125e-5),
+	INTEL_UNCORE_FR_EVENT_DESC(bw_in_port4, 0x24, 3.0517578125e-5),
+	INTEL_UNCORE_FR_EVENT_DESC(bw_in_port5, 0x25, 3.0517578125e-5),
+	INTEL_UNCORE_FR_EVENT_DESC(bw_in_port6, 0x26, 3.0517578125e-5),
+	INTEL_UNCORE_FR_EVENT_DESC(bw_in_port7, 0x27, 3.0517578125e-5),
 	{ /* end: all zeroes */ },
 };

···
 static struct uncore_event_desc snr_uncore_imc_freerunning_events[] = {
 	INTEL_UNCORE_EVENT_DESC(dclk, "event=0xff,umask=0x10"),

-	INTEL_UNCORE_EVENT_DESC(read, "event=0xff,umask=0x20"),
-	INTEL_UNCORE_EVENT_DESC(read.scale, "6.103515625e-5"),
-	INTEL_UNCORE_EVENT_DESC(read.unit, "MiB"),
-	INTEL_UNCORE_EVENT_DESC(write, "event=0xff,umask=0x21"),
-	INTEL_UNCORE_EVENT_DESC(write.scale, "6.103515625e-5"),
-	INTEL_UNCORE_EVENT_DESC(write.unit, "MiB"),
+	INTEL_UNCORE_FR_EVENT_DESC(read, 0x20, 6.103515625e-5),
+	INTEL_UNCORE_FR_EVENT_DESC(write, 0x21, 6.103515625e-5),
 	{ /* end: all zeroes */ },
 };

···
 };

 static struct event_constraint icx_uncore_m3upi_constraints[] = {
-	UNCORE_EVENT_CONSTRAINT(0x1c, 0x1),
-	UNCORE_EVENT_CONSTRAINT(0x1d, 0x1),
-	UNCORE_EVENT_CONSTRAINT(0x1e, 0x1),
-	UNCORE_EVENT_CONSTRAINT(0x1f, 0x1),
+	UNCORE_EVENT_CONSTRAINT_RANGE(0x1c, 0x1f, 0x1),
 	UNCORE_EVENT_CONSTRAINT(0x40, 0x7),
-	UNCORE_EVENT_CONSTRAINT(0x4e, 0x7),
-	UNCORE_EVENT_CONSTRAINT(0x4f, 0x7),
-	UNCORE_EVENT_CONSTRAINT(0x50, 0x7),
+	UNCORE_EVENT_CONSTRAINT_RANGE(0x4e, 0x50, 0x7),
 	EVENT_CONSTRAINT_END
 };

···
 static struct uncore_event_desc icx_uncore_imc_freerunning_events[] = {
 	INTEL_UNCORE_EVENT_DESC(dclk, "event=0xff,umask=0x10"),

-	INTEL_UNCORE_EVENT_DESC(read, "event=0xff,umask=0x20"),
-	INTEL_UNCORE_EVENT_DESC(read.scale, "6.103515625e-5"),
-	INTEL_UNCORE_EVENT_DESC(read.unit, "MiB"),
-	INTEL_UNCORE_EVENT_DESC(write, "event=0xff,umask=0x21"),
-	INTEL_UNCORE_EVENT_DESC(write.scale, "6.103515625e-5"),
-	INTEL_UNCORE_EVENT_DESC(write.unit, "MiB"),
-
-	INTEL_UNCORE_EVENT_DESC(ddrt_read, "event=0xff,umask=0x30"),
-	INTEL_UNCORE_EVENT_DESC(ddrt_read.scale, "6.103515625e-5"),
-	INTEL_UNCORE_EVENT_DESC(ddrt_read.unit, "MiB"),
-	INTEL_UNCORE_EVENT_DESC(ddrt_write, "event=0xff,umask=0x31"),
-	INTEL_UNCORE_EVENT_DESC(ddrt_write.scale, "6.103515625e-5"),
-	INTEL_UNCORE_EVENT_DESC(ddrt_write.unit, "MiB"),
+	INTEL_UNCORE_FR_EVENT_DESC(read, 0x20, 6.103515625e-5),
+	INTEL_UNCORE_FR_EVENT_DESC(write, 0x21, 6.103515625e-5),
+	INTEL_UNCORE_FR_EVENT_DESC(ddrt_read, 0x30, 6.103515625e-5),
+	INTEL_UNCORE_FR_EVENT_DESC(ddrt_write, 0x31, 6.103515625e-5),
 	{ /* end: all zeroes */ },
 };

···
 static struct event_constraint spr_uncore_cxlcm_constraints[] = {
 	UNCORE_EVENT_CONSTRAINT(0x02, 0x0f),
 	UNCORE_EVENT_CONSTRAINT(0x05, 0x0f),
-	UNCORE_EVENT_CONSTRAINT(0x40, 0xf0),
-	UNCORE_EVENT_CONSTRAINT(0x41, 0xf0),
-	UNCORE_EVENT_CONSTRAINT(0x42, 0xf0),
-	UNCORE_EVENT_CONSTRAINT(0x43, 0xf0),
+	UNCORE_EVENT_CONSTRAINT_RANGE(0x40, 0x43, 0xf0),
 	UNCORE_EVENT_CONSTRAINT(0x4b, 0xf0),
 	UNCORE_EVENT_CONSTRAINT(0x52, 0xf0),
 	EVENT_CONSTRAINT_END
···
 	for (node = rb_first(type->boxes); node; node = rb_next(node)) {
 		unit = rb_entry(node, struct intel_uncore_discovery_unit, node);

-		if (unit->id > max)
+		/*
+		 * on DMR IMH2, the unit id starts from 0x8000,
+		 * and we don't need to count it.
+		 */
+		if ((unit->id > max) && (unit->id < 0x8000))
 			max = unit->id;
 	}
 	return max + 1;
···
 }

 /* end of GNR uncore support */
+
+/* DMR uncore support */
+#define UNCORE_DMR_NUM_UNCORE_TYPES 52
+
+static struct attribute *dmr_imc_uncore_formats_attr[] = {
+	&format_attr_event.attr,
+	&format_attr_umask.attr,
+	&format_attr_edge.attr,
+	&format_attr_inv.attr,
+	&format_attr_thresh10.attr,
+	NULL,
+};
+
+static const struct attribute_group dmr_imc_uncore_format_group = {
+	.name = "format",
+	.attrs = dmr_imc_uncore_formats_attr,
+};
+
+static struct intel_uncore_type dmr_uncore_imc = {
+	.name = "imc",
+	.fixed_ctr_bits = 48,
+	.fixed_ctr = DMR_IMC_PMON_FIXED_CTR,
+	.fixed_ctl = DMR_IMC_PMON_FIXED_CTL,
+	.ops = &spr_uncore_mmio_ops,
+	.format_group = &dmr_imc_uncore_format_group,
+	.attr_update = uncore_alias_groups,
+};
+
+static struct attribute *dmr_sca_uncore_formats_attr[] = {
+	&format_attr_event.attr,
+	&format_attr_umask_ext5.attr,
+	&format_attr_edge.attr,
+	&format_attr_inv.attr,
+	&format_attr_thresh8.attr,
+	NULL,
+};
+
+static const struct attribute_group dmr_sca_uncore_format_group = {
+	.name = "format",
+	.attrs = dmr_sca_uncore_formats_attr,
+};
+
+static struct intel_uncore_type dmr_uncore_sca = {
+	.name = "sca",
+	.event_mask_ext = DMR_HAMVF_EVENT_MASK_EXT,
+	.format_group = &dmr_sca_uncore_format_group,
+	.attr_update = uncore_alias_groups,
+};
+
+static struct attribute *dmr_cxlcm_uncore_formats_attr[] = {
+	&format_attr_event.attr,
+	&format_attr_umask.attr,
+	&format_attr_edge.attr,
+	&format_attr_inv2.attr,
+	&format_attr_thresh9_2.attr,
+	&format_attr_port_en.attr,
+	NULL,
+};
+
+static const struct attribute_group dmr_cxlcm_uncore_format_group = {
+	.name = "format",
+	.attrs = dmr_cxlcm_uncore_formats_attr,
+};
+
+static struct event_constraint dmr_uncore_cxlcm_constraints[] = {
+	UNCORE_EVENT_CONSTRAINT_RANGE(0x1, 0x24, 0x0f),
+	UNCORE_EVENT_CONSTRAINT_RANGE(0x41, 0x41, 0xf0),
+	UNCORE_EVENT_CONSTRAINT_RANGE(0x50, 0x5e, 0xf0),
+	UNCORE_EVENT_CONSTRAINT_RANGE(0x60, 0x61, 0xf0),
+	EVENT_CONSTRAINT_END
+};
+
+static struct intel_uncore_type dmr_uncore_cxlcm = {
+	.name = "cxlcm",
+	.event_mask = GENERIC_PMON_RAW_EVENT_MASK,
+	.event_mask_ext = DMR_CXLCM_EVENT_MASK_EXT,
+	.constraints = dmr_uncore_cxlcm_constraints,
+	.format_group = &dmr_cxlcm_uncore_format_group,
+	.attr_update = uncore_alias_groups,
+};
+
+static struct intel_uncore_type dmr_uncore_hamvf = {
+	.name = "hamvf",
+	.event_mask_ext = DMR_HAMVF_EVENT_MASK_EXT,
+	.format_group = &dmr_sca_uncore_format_group,
+	.attr_update = uncore_alias_groups,
+};
+
+static struct event_constraint dmr_uncore_cbo_constraints[] = {
+	UNCORE_EVENT_CONSTRAINT(0x11, 0x1),
+	UNCORE_EVENT_CONSTRAINT_RANGE(0x19, 0x1a, 0x1),
+	UNCORE_EVENT_CONSTRAINT(0x1f, 0x1),
+	UNCORE_EVENT_CONSTRAINT(0x21, 0x1),
+	UNCORE_EVENT_CONSTRAINT(0x25, 0x1),
+	UNCORE_EVENT_CONSTRAINT(0x36, 0x1),
+	EVENT_CONSTRAINT_END
+};
+
+static struct intel_uncore_type dmr_uncore_cbo = {
+	.name = "cbo",
+	.event_mask_ext = DMR_HAMVF_EVENT_MASK_EXT,
+	.constraints = dmr_uncore_cbo_constraints,
+	.format_group = &dmr_sca_uncore_format_group,
+	.attr_update = uncore_alias_groups,
+};
+
+static struct intel_uncore_type dmr_uncore_santa = {
+	.name = "santa",
+	.attr_update = uncore_alias_groups,
+};
+
+static struct intel_uncore_type dmr_uncore_cncu = {
+	.name = "cncu",
+	.attr_update = uncore_alias_groups,
+};
+
+static struct intel_uncore_type dmr_uncore_sncu = {
+	.name = "sncu",
+	.attr_update = uncore_alias_groups,
+};
+
+static struct intel_uncore_type dmr_uncore_ula = {
+	.name = "ula",
+	.event_mask_ext = DMR_HAMVF_EVENT_MASK_EXT,
+	.format_group = &dmr_sca_uncore_format_group,
+	.attr_update = uncore_alias_groups,
+};
+
+static struct intel_uncore_type dmr_uncore_dda = {
+	.name = "dda",
+	.event_mask_ext = DMR_HAMVF_EVENT_MASK_EXT,
+	.format_group = &dmr_sca_uncore_format_group,
+	.attr_update = uncore_alias_groups,
+};
+
+static struct event_constraint dmr_uncore_sbo_constraints[] = {
+	UNCORE_EVENT_CONSTRAINT(0x1f, 0x01),
+	UNCORE_EVENT_CONSTRAINT(0x25, 0x01),
+	EVENT_CONSTRAINT_END
+};
+
+static struct intel_uncore_type dmr_uncore_sbo = {
+	.name = "sbo",
+	.event_mask_ext = DMR_HAMVF_EVENT_MASK_EXT,
+	.constraints = dmr_uncore_sbo_constraints,
+	.format_group = &dmr_sca_uncore_format_group,
+	.attr_update = uncore_alias_groups,
+};
+
+static struct intel_uncore_type dmr_uncore_ubr = {
+	.name = "ubr",
+	.event_mask_ext = DMR_HAMVF_EVENT_MASK_EXT,
+	.format_group = &dmr_sca_uncore_format_group,
+	.attr_update = uncore_alias_groups,
+};
+
+static struct attribute *dmr_pcie4_uncore_formats_attr[] = {
+	&format_attr_event.attr,
+	&format_attr_umask.attr,
+	&format_attr_edge.attr,
+	&format_attr_inv.attr,
+	&format_attr_thresh8.attr,
+	&format_attr_thresh_ext.attr,
+	&format_attr_rs3_sel.attr,
+	&format_attr_rx_sel.attr,
+	&format_attr_tx_sel.attr,
+	&format_attr_iep_sel.attr,
+	&format_attr_vc_sel.attr,
+	&format_attr_port_sel.attr,
+	NULL,
+};
+
+static const struct attribute_group dmr_pcie4_uncore_format_group = {
+	.name = "format",
+	.attrs = dmr_pcie4_uncore_formats_attr,
+};
+
+static struct intel_uncore_type dmr_uncore_pcie4 = {
+	.name = "pcie4",
+	.event_mask_ext = DMR_PCIE4_EVENT_MASK_EXT,
+	.format_group = &dmr_pcie4_uncore_format_group,
+	.attr_update = uncore_alias_groups,
+};
+
+static struct intel_uncore_type dmr_uncore_crs = {
+	.name = "crs",
+	.attr_update = uncore_alias_groups,
+};
+
+static struct intel_uncore_type dmr_uncore_cpc = {
+	.name = "cpc",
+	.event_mask_ext = DMR_HAMVF_EVENT_MASK_EXT,
+	.format_group = &dmr_sca_uncore_format_group,
+	.attr_update = uncore_alias_groups,
+};
+
+static struct intel_uncore_type dmr_uncore_itc = {
+	.name = "itc",
+	.event_mask_ext = DMR_HAMVF_EVENT_MASK_EXT,
+	.format_group = &dmr_sca_uncore_format_group,
+	.attr_update = uncore_alias_groups,
+};
+
+static struct intel_uncore_type dmr_uncore_otc = {
+	.name = "otc",
+	.event_mask_ext = DMR_HAMVF_EVENT_MASK_EXT,
+	.format_group = &dmr_sca_uncore_format_group,
+	.attr_update = uncore_alias_groups,
+};
+
+static struct intel_uncore_type dmr_uncore_cms = {
+	.name = "cms",
+	.attr_update = uncore_alias_groups,
+};
+
+static struct intel_uncore_type dmr_uncore_pcie6 = {
+	.name = "pcie6",
+	.event_mask_ext = DMR_PCIE4_EVENT_MASK_EXT,
+	.format_group = &dmr_pcie4_uncore_format_group,
+	.attr_update = uncore_alias_groups,
+};
+
+static struct intel_uncore_type *dmr_uncores[UNCORE_DMR_NUM_UNCORE_TYPES] = {
+	NULL, NULL, NULL, NULL,
+	&spr_uncore_pcu,
+	&gnr_uncore_ubox,
+	&dmr_uncore_imc,
+	NULL,
+	NULL, NULL, NULL, NULL,
+	NULL, NULL, NULL, NULL,
+	NULL, NULL, NULL, NULL,
+	NULL, NULL, NULL,
+	&dmr_uncore_sca,
+	&dmr_uncore_cxlcm,
+	NULL, NULL, NULL,
+	NULL, NULL,
+	&dmr_uncore_hamvf,
+	&dmr_uncore_cbo,
+	&dmr_uncore_santa,
+	&dmr_uncore_cncu,
+	&dmr_uncore_sncu,
+	&dmr_uncore_ula,
+	&dmr_uncore_dda,
+	NULL,
+	&dmr_uncore_sbo,
+	NULL,
+	NULL, NULL, NULL,
+	&dmr_uncore_ubr,
+	NULL,
+	&dmr_uncore_pcie4,
+	&dmr_uncore_crs,
+	&dmr_uncore_cpc,
+	&dmr_uncore_itc,
+	&dmr_uncore_otc,
+	&dmr_uncore_cms,
+	&dmr_uncore_pcie6,
+};
+
+int dmr_uncore_imh_units_ignore[] = {
+	0x13, /* MSE */
+	UNCORE_IGNORE_END
+};
+
+int dmr_uncore_cbb_units_ignore[] = {
+	0x25, /* SB2UCIE */
+	UNCORE_IGNORE_END
+};
+
+static unsigned int dmr_iio_freerunning_box_offsets[] = {
+	0x0, 0x8000, 0x18000, 0x20000
+};
+
+static void dmr_uncore_freerunning_init_box(struct intel_uncore_box *box)
+{
+	struct intel_uncore_type *type = box->pmu->type;
+	u64 mmio_base;
+
+	if (box->pmu->pmu_idx >= type->num_boxes)
+		return;
+
+	mmio_base = DMR_IMH1_HIOP_MMIO_BASE;
+	mmio_base += dmr_iio_freerunning_box_offsets[box->pmu->pmu_idx];
+
+	box->io_addr = ioremap(mmio_base, type->mmio_map_size);
+	if (!box->io_addr)
+		pr_warn("perf uncore: Failed to ioremap for %s.\n", type->name);
+}
+
+static struct intel_uncore_ops dmr_uncore_freerunning_ops = {
+	.init_box = dmr_uncore_freerunning_init_box,
+	.exit_box = uncore_mmio_exit_box,
+	.read_counter = uncore_mmio_read_counter,
+	.hw_config = uncore_freerunning_hw_config,
+};
+
+enum perf_uncore_dmr_iio_freerunning_type_id {
+	DMR_ITC_INB_DATA_BW,
+	DMR_ITC_BW_IN,
+	DMR_OTC_BW_OUT,
+	DMR_OTC_CLOCK_TICKS,
+
+	DMR_IIO_FREERUNNING_TYPE_MAX,
+};
+
+static struct freerunning_counters dmr_iio_freerunning[] = {
7017 +
[DMR_ITC_INB_DATA_BW] = { 0x4d40, 0x8, 0, 8, 48}, 7018 + [DMR_ITC_BW_IN] = { 0x6b00, 0x8, 0, 8, 48}, 7019 + [DMR_OTC_BW_OUT] = { 0x6b60, 0x8, 0, 8, 48}, 7020 + [DMR_OTC_CLOCK_TICKS] = { 0x6bb0, 0x8, 0, 1, 48}, 7021 + }; 7022 + 7023 + static struct uncore_event_desc dmr_uncore_iio_freerunning_events[] = { 7024 + /* ITC Free Running Data BW counter for inbound traffic */ 7025 + INTEL_UNCORE_FR_EVENT_DESC(inb_data_port0, 0x10, "3.814697266e-6"), 7026 + INTEL_UNCORE_FR_EVENT_DESC(inb_data_port1, 0x11, "3.814697266e-6"), 7027 + INTEL_UNCORE_FR_EVENT_DESC(inb_data_port2, 0x12, "3.814697266e-6"), 7028 + INTEL_UNCORE_FR_EVENT_DESC(inb_data_port3, 0x13, "3.814697266e-6"), 7029 + INTEL_UNCORE_FR_EVENT_DESC(inb_data_port4, 0x14, "3.814697266e-6"), 7030 + INTEL_UNCORE_FR_EVENT_DESC(inb_data_port5, 0x15, "3.814697266e-6"), 7031 + INTEL_UNCORE_FR_EVENT_DESC(inb_data_port6, 0x16, "3.814697266e-6"), 7032 + INTEL_UNCORE_FR_EVENT_DESC(inb_data_port7, 0x17, "3.814697266e-6"), 7033 + 7034 + /* ITC Free Running BW IN counters */ 7035 + INTEL_UNCORE_FR_EVENT_DESC(bw_in_port0, 0x20, "3.814697266e-6"), 7036 + INTEL_UNCORE_FR_EVENT_DESC(bw_in_port1, 0x21, "3.814697266e-6"), 7037 + INTEL_UNCORE_FR_EVENT_DESC(bw_in_port2, 0x22, "3.814697266e-6"), 7038 + INTEL_UNCORE_FR_EVENT_DESC(bw_in_port3, 0x23, "3.814697266e-6"), 7039 + INTEL_UNCORE_FR_EVENT_DESC(bw_in_port4, 0x24, "3.814697266e-6"), 7040 + INTEL_UNCORE_FR_EVENT_DESC(bw_in_port5, 0x25, "3.814697266e-6"), 7041 + INTEL_UNCORE_FR_EVENT_DESC(bw_in_port6, 0x26, "3.814697266e-6"), 7042 + INTEL_UNCORE_FR_EVENT_DESC(bw_in_port7, 0x27, "3.814697266e-6"), 7043 + 7044 + /* ITC Free Running BW OUT counters */ 7045 + INTEL_UNCORE_FR_EVENT_DESC(bw_out_port0, 0x30, "3.814697266e-6"), 7046 + INTEL_UNCORE_FR_EVENT_DESC(bw_out_port1, 0x31, "3.814697266e-6"), 7047 + INTEL_UNCORE_FR_EVENT_DESC(bw_out_port2, 0x32, "3.814697266e-6"), 7048 + INTEL_UNCORE_FR_EVENT_DESC(bw_out_port3, 0x33, "3.814697266e-6"), 7049 + INTEL_UNCORE_FR_EVENT_DESC(bw_out_port4, 0x34, 
"3.814697266e-6"), 7050 + INTEL_UNCORE_FR_EVENT_DESC(bw_out_port5, 0x35, "3.814697266e-6"), 7051 + INTEL_UNCORE_FR_EVENT_DESC(bw_out_port6, 0x36, "3.814697266e-6"), 7052 + INTEL_UNCORE_FR_EVENT_DESC(bw_out_port7, 0x37, "3.814697266e-6"), 7053 + 7054 + /* Free Running Clock Counter */ 7055 + INTEL_UNCORE_EVENT_DESC(clockticks, "event=0xff,umask=0x40"), 7056 + { /* end: all zeroes */ }, 7057 + }; 7058 + 7059 + static struct intel_uncore_type dmr_uncore_iio_free_running = { 7060 + .name = "iio_free_running", 7061 + .num_counters = 25, 7062 + .mmio_map_size = DMR_HIOP_MMIO_SIZE, 7063 + .num_freerunning_types = DMR_IIO_FREERUNNING_TYPE_MAX, 7064 + .freerunning = dmr_iio_freerunning, 7065 + .ops = &dmr_uncore_freerunning_ops, 7066 + .event_descs = dmr_uncore_iio_freerunning_events, 7067 + .format_group = &skx_uncore_iio_freerunning_format_group, 7068 + }; 7069 + 7070 + #define UNCORE_DMR_MMIO_EXTRA_UNCORES 1 7071 + static struct intel_uncore_type *dmr_mmio_uncores[UNCORE_DMR_MMIO_EXTRA_UNCORES] = { 7072 + &dmr_uncore_iio_free_running, 7073 + }; 7074 + 7075 + int dmr_uncore_pci_init(void) 7076 + { 7077 + uncore_pci_uncores = uncore_get_uncores(UNCORE_ACCESS_PCI, 0, NULL, 7078 + UNCORE_DMR_NUM_UNCORE_TYPES, 7079 + dmr_uncores); 7080 + return 0; 7081 + } 7082 + 7083 + void dmr_uncore_mmio_init(void) 7084 + { 7085 + uncore_mmio_uncores = uncore_get_uncores(UNCORE_ACCESS_MMIO, 7086 + UNCORE_DMR_MMIO_EXTRA_UNCORES, 7087 + dmr_mmio_uncores, 7088 + UNCORE_DMR_NUM_UNCORE_TYPES, 7089 + dmr_uncores); 7090 + 7091 + dmr_uncore_iio_free_running.num_boxes = 7092 + uncore_type_max_boxes(uncore_mmio_uncores, UNCORE_DMR_ITC); 7093 + } 7094 + /* end of DMR uncore support */
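The free-running tables above can be read with the following user-space sketch; it is not kernel code. It assumes the five initializer values in `dmr_iio_freerunning[]` are, in order, counter base, per-counter stride, box offset, counter count, and counter width in bits, and that the scale string `3.814697266e-6` is 4/2^20, so each count stands for 4 bytes and the scaled value is in MiB. The names `fr_counters`, `fr_counter_offset`, `fr_delta`, and `fr_to_mib` are hypothetical.

```c
#include <stdint.h>

/* Assumed field order, mirroring the initializers in the diff. */
struct fr_counters {
	uint32_t base;    /* MMIO offset of counter 0 */
	uint32_t stride;  /* distance between consecutive counters */
	uint32_t box_off; /* per-box offset (0 in the DMR table, unused here) */
	uint32_t count;   /* number of counters of this type */
	uint32_t bits;    /* counter width in bits */
};

/* Offset of counter idx inside the mapped region of one box. */
static uint32_t fr_counter_offset(const struct fr_counters *c, uint32_t idx)
{
	return c->base + c->stride * idx;
}

/* Difference between two raw reads; tolerates one wrap of a bits-wide counter. */
static uint64_t fr_delta(uint64_t prev, uint64_t now, uint32_t bits)
{
	uint64_t mask = (bits >= 64) ? ~0ULL : ((1ULL << bits) - 1);

	return (now - prev) & mask;
}

/* Apply the scale string from the event descriptors (assumed to yield MiB). */
static double fr_to_mib(uint64_t delta)
{
	return (double)delta * 3.814697266e-6;
}
```

With the `DMR_ITC_BW_IN` row (`0x6b00, 0x8, 0, 8, 48`), port 3 would sit at offset `0x6b18`, and 262144 counts scale to roughly 1 MiB.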
+1
arch/x86/events/msr.c
@@ -78,6 +78,7 @@
 	case INTEL_ATOM_SILVERMONT:
 	case INTEL_ATOM_SILVERMONT_D:
 	case INTEL_ATOM_AIRMONT:
+	case INTEL_ATOM_AIRMONT_NP:
 
 	case INTEL_ATOM_GOLDMONT:
 	case INTEL_ATOM_GOLDMONT_D:
+26
arch/x86/events/perf_event.h
@@ -45,6 +45,10 @@
 	EXTRA_REG_FE = 4, /* fe_* */
 	EXTRA_REG_SNOOP_0 = 5, /* snoop response 0 */
 	EXTRA_REG_SNOOP_1 = 6, /* snoop response 1 */
+	EXTRA_REG_OMR_0 = 7, /* OMR 0 */
+	EXTRA_REG_OMR_1 = 8, /* OMR 1 */
+	EXTRA_REG_OMR_2 = 9, /* OMR 2 */
+	EXTRA_REG_OMR_3 = 10, /* OMR 3 */
 
 	EXTRA_REG_MAX /* number of entries needed */
 };
@@ -182,6 +186,13 @@
 	(1ULL << PERF_REG_X86_R13) | \
 	(1ULL << PERF_REG_X86_R14) | \
 	(1ULL << PERF_REG_X86_R15))
+
+/* user space rdpmc control values */
+enum {
+	X86_USER_RDPMC_NEVER_ENABLE = 0,
+	X86_USER_RDPMC_CONDITIONAL_ENABLE = 1,
+	X86_USER_RDPMC_ALWAYS_ENABLE = 2,
+};
 
 /*
  * Per register state.
@@ -1099,6 +1110,7 @@
 #define PMU_FL_RETIRE_LATENCY 0x200 /* Support Retire Latency in PEBS */
 #define PMU_FL_BR_CNTR 0x400 /* Support branch counter logging */
 #define PMU_FL_DYN_CONSTRAINT 0x800 /* Needs dynamic constraint */
+#define PMU_FL_HAS_OMR 0x1000 /* has 4 equivalent OMR regs */
 
 #define EVENT_VAR(_id) event_attr_##_id
 #define EVENT_PTR(_id) &event_attr_##_id.attr.attr
@@ -1319,6 +1331,12 @@
 static inline u64 x86_pmu_get_event_config(struct perf_event *event)
 {
 	return event->attr.config & hybrid(event->pmu, config_mask);
+}
+
+static inline bool x86_pmu_has_rdpmc_user_disable(struct pmu *pmu)
+{
+	return !!(hybrid(pmu, config_mask) &
+		  ARCH_PERFMON_EVENTSEL_RDPMC_USER_DISABLE);
 }
 
 extern struct event_constraint emptyconstraint;
@@ -1668,6 +1686,10 @@
 
 u64 arl_h_latency_data(struct perf_event *event, u64 status);
 
+u64 pnc_latency_data(struct perf_event *event, u64 status);
+
+u64 nvl_latency_data(struct perf_event *event, u64 status);
+
 extern struct event_constraint intel_core2_pebs_event_constraints[];
 
 extern struct event_constraint intel_atom_pebs_event_constraints[];
@@ -1679,6 +1701,8 @@
 extern struct event_constraint intel_glp_pebs_event_constraints[];
 
 extern struct event_constraint intel_grt_pebs_event_constraints[];
+
+extern struct event_constraint intel_arw_pebs_event_constraints[];
 
 extern struct event_constraint intel_nehalem_pebs_event_constraints[];
 
@@ -1699,6 +1723,8 @@
 extern struct event_constraint intel_glc_pebs_event_constraints[];
 
 extern struct event_constraint intel_lnc_pebs_event_constraints[];
+
+extern struct event_constraint intel_pnc_pebs_event_constraints[];
 
 struct event_constraint *intel_pebs_constraints(struct perf_event *event);
 
+1 -1
arch/x86/include/asm/amd/ibs.h
@@ -110,7 +110,7 @@
 	__u64 ld_op:1, /* 0: load op */
 		st_op:1, /* 1: store op */
 		dc_l1tlb_miss:1, /* 2: data cache L1TLB miss */
-		dc_l2tlb_miss:1, /* 3: data cache L2TLB hit in 2M page */
+		dc_l2tlb_miss:1, /* 3: data cache L2TLB miss in 2M page */
 		dc_l1tlb_hit_2m:1, /* 4: data cache L1TLB hit in 2M page */
 		dc_l1tlb_hit_1g:1, /* 5: data cache L1TLB hit in 1G page */
 		dc_l2tlb_hit_2m:1, /* 6: data cache L2TLB hit in 2M page */
+3
arch/x86/include/asm/hardirq.h
@@ -19,6 +19,9 @@
 	unsigned int kvm_posted_intr_wakeup_ipis;
 	unsigned int kvm_posted_intr_nested_ipis;
 #endif
+#ifdef CONFIG_GUEST_PERF_EVENTS
+	unsigned int perf_guest_mediated_pmis;
+#endif
 	unsigned int x86_platform_ipis; /* arch dependent */
 	unsigned int apic_perf_irqs;
 	unsigned int apic_irq_work_irqs;
+6
arch/x86/include/asm/idtentry.h
@@ -746,6 +746,12 @@
 # define fred_sysvec_kvm_posted_intr_nested_ipi NULL
 #endif
 
+# ifdef CONFIG_GUEST_PERF_EVENTS
+DECLARE_IDTENTRY_SYSVEC(PERF_GUEST_MEDIATED_PMI_VECTOR, sysvec_perf_guest_mediated_pmi_handler);
+#else
+# define fred_sysvec_perf_guest_mediated_pmi_handler NULL
+#endif
+
 # ifdef CONFIG_X86_POSTED_MSI
 DECLARE_IDTENTRY_SYSVEC(POSTED_MSI_NOTIFICATION_VECTOR, sysvec_posted_msi_notification);
 #else
+3 -1
arch/x86/include/asm/irq_vectors.h
@@ -77,7 +77,9 @@
  */
 #define IRQ_WORK_VECTOR 0xf6
 
-/* 0xf5 - unused, was UV_BAU_MESSAGE */
+/* IRQ vector for PMIs when running a guest with a mediated PMU. */
+#define PERF_GUEST_MEDIATED_PMI_VECTOR 0xf5
+
 #define DEFERRED_ERROR_VECTOR 0xf4
 
 /* Vector on which hypervisor callbacks will be delivered */
+5
arch/x86/include/asm/msr-index.h
@@ -263,6 +263,11 @@
 #define MSR_SNOOP_RSP_0 0x00001328
 #define MSR_SNOOP_RSP_1 0x00001329
 
+#define MSR_OMR_0 0x000003e0
+#define MSR_OMR_1 0x000003e1
+#define MSR_OMR_2 0x000003e2
+#define MSR_OMR_3 0x000003e3
+
 #define MSR_LBR_SELECT 0x000001c8
 #define MSR_LBR_TOS 0x000001c9
 
+12 -2
arch/x86/include/asm/perf_event.h
@@ -33,6 +33,7 @@
 #define ARCH_PERFMON_EVENTSEL_CMASK 0xFF000000ULL
 #define ARCH_PERFMON_EVENTSEL_BR_CNTR (1ULL << 35)
 #define ARCH_PERFMON_EVENTSEL_EQ (1ULL << 36)
+#define ARCH_PERFMON_EVENTSEL_RDPMC_USER_DISABLE (1ULL << 37)
 #define ARCH_PERFMON_EVENTSEL_UMASK2 (0xFFULL << 40)
 
 #define INTEL_FIXED_BITS_STRIDE 4
@@ -40,6 +41,7 @@
 #define INTEL_FIXED_0_USER (1ULL << 1)
 #define INTEL_FIXED_0_ANYTHREAD (1ULL << 2)
 #define INTEL_FIXED_0_ENABLE_PMI (1ULL << 3)
+#define INTEL_FIXED_0_RDPMC_USER_DISABLE (1ULL << 33)
 #define INTEL_FIXED_3_METRICS_CLEAR (1ULL << 2)
 
 #define HSW_IN_TX (1ULL << 32)
@@ -50,7 +52,7 @@
 #define INTEL_FIXED_BITS_MASK \
 	(INTEL_FIXED_0_KERNEL | INTEL_FIXED_0_USER | \
 	 INTEL_FIXED_0_ANYTHREAD | INTEL_FIXED_0_ENABLE_PMI | \
-	 ICL_FIXED_0_ADAPTIVE)
+	 ICL_FIXED_0_ADAPTIVE | INTEL_FIXED_0_RDPMC_USER_DISABLE)
 
 #define intel_fixed_bits_by_idx(_idx, _bits) \
 	((_bits) << ((_idx) * INTEL_FIXED_BITS_STRIDE))
@@ -226,7 +228,9 @@
 		unsigned int umask2:1;
 		/* EQ-bit Supported */
 		unsigned int eq:1;
-		unsigned int reserved:30;
+		/* rdpmc user disable Supported */
+		unsigned int rdpmc_user_disable:1;
+		unsigned int reserved:29;
 	} split;
 	unsigned int full;
 };
@@ -301,6 +305,7 @@
 	unsigned int events_mask;
 	int events_mask_len;
 	unsigned int pebs_ept :1;
+	unsigned int mediated :1;
 };
 
 /*
@@ -757,6 +762,11 @@
 
 static inline void perf_events_lapic_init(void) { }
 static inline void perf_check_microcode(void) { }
+#endif
+
+#ifdef CONFIG_PERF_GUEST_MEDIATED_PMU
+extern void perf_load_guest_lvtpc(u32 guest_lvtpc);
+extern void perf_put_guest_lvtpc(void);
 #endif
 
 #if defined(CONFIG_PERF_EVENTS) && defined(CONFIG_CPU_SUP_INTEL)
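The bit arithmetic behind the new "RDPMC user disable" definitions can be checked with a small stand-alone sketch. The constants below are local copies of the values shown in this diff (stride 4, fixed-counter bit 33, event-select bit 37), renamed so they are not mistaken for the kernel headers; `fixed_bits_by_idx` and `eventsel_has_rdpmc_user_disable` are hypothetical helpers that only mirror `intel_fixed_bits_by_idx()` and the mask test in `x86_pmu_has_rdpmc_user_disable()`.

```c
#include <stdint.h>

/* Local copies of values from the diff; not the kernel headers themselves. */
#define FIXED_BITS_STRIDE           4
#define FIXED_0_ENABLE_PMI          (1ULL << 3)
#define FIXED_0_RDPMC_USER_DISABLE  (1ULL << 33)
#define EVENTSEL_RDPMC_USER_DISABLE (1ULL << 37)

/* Shift fixed counter 0's control bits into the slot of counter idx. */
static uint64_t fixed_bits_by_idx(unsigned int idx, uint64_t bits)
{
	return bits << (idx * FIXED_BITS_STRIDE);
}

/* Non-zero when a config mask advertises the RDPMC user disable bit. */
static int eventsel_has_rdpmc_user_disable(uint64_t config_mask)
{
	return !!(config_mask & EVENTSEL_RDPMC_USER_DISABLE);
}
```

Under these definitions the user-disable bit for fixed counter 2 lands at bit 41 (33 + 2 * 4), following the same per-counter stride as the low control bits.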
+14 -13
arch/x86/include/asm/unwind_user.h
@@ -2,10 +2,22 @@
 #ifndef _ASM_X86_UNWIND_USER_H
 #define _ASM_X86_UNWIND_USER_H
 
-#ifdef CONFIG_HAVE_UNWIND_USER_FP
+#ifdef CONFIG_UNWIND_USER
 
 #include <asm/ptrace.h>
 #include <asm/uprobes.h>
+
+static inline int unwind_user_word_size(struct pt_regs *regs)
+{
+	/* We can't unwind VM86 stacks */
+	if (regs->flags & X86_VM_MASK)
+		return 0;
+	return user_64bit_mode(regs) ? 8 : 4;
+}
+
+#endif /* CONFIG_UNWIND_USER */
+
+#ifdef CONFIG_HAVE_UNWIND_USER_FP
 
 #define ARCH_INIT_USER_FP_FRAME(ws) \
 	.cfa_off = 2*(ws), \
@@ -19,22 +31,11 @@
 	.fp_off = 0, \
 	.use_fp = false,
 
-static inline int unwind_user_word_size(struct pt_regs *regs)
-{
-	/* We can't unwind VM86 stacks */
-	if (regs->flags & X86_VM_MASK)
-		return 0;
-#ifdef CONFIG_X86_64
-	if (!user_64bit_mode(regs))
-		return sizeof(int);
-#endif
-	return sizeof(long);
-}
-
 static inline bool unwind_user_at_function_start(struct pt_regs *regs)
 {
 	return is_uprobe_at_func_entry(regs);
 }
+#define unwind_user_at_function_start unwind_user_at_function_start
 
 #endif /* CONFIG_HAVE_UNWIND_USER_FP */
 
+3
arch/x86/kernel/idt.c
@@ -158,6 +158,9 @@
 	INTG(POSTED_INTR_WAKEUP_VECTOR, asm_sysvec_kvm_posted_intr_wakeup_ipi),
 	INTG(POSTED_INTR_NESTED_VECTOR, asm_sysvec_kvm_posted_intr_nested_ipi),
 # endif
+#ifdef CONFIG_GUEST_PERF_EVENTS
+	INTG(PERF_GUEST_MEDIATED_PMI_VECTOR, asm_sysvec_perf_guest_mediated_pmi_handler),
+#endif
 # ifdef CONFIG_IRQ_WORK
 	INTG(IRQ_WORK_VECTOR, asm_sysvec_irq_work),
 # endif
+19
arch/x86/kernel/irq.c
@@ -192,6 +192,13 @@
 			   irq_stats(j)->kvm_posted_intr_wakeup_ipis);
 	seq_puts(p, " Posted-interrupt wakeup event\n");
 #endif
+#ifdef CONFIG_GUEST_PERF_EVENTS
+	seq_printf(p, "%*s: ", prec, "VPMI");
+	for_each_online_cpu(j)
+		seq_printf(p, "%10u ",
+			   irq_stats(j)->perf_guest_mediated_pmis);
+	seq_puts(p, " Perf Guest Mediated PMI\n");
+#endif
 #ifdef CONFIG_X86_POSTED_MSI
 	seq_printf(p, "%*s: ", prec, "PMN");
 	for_each_online_cpu(j)
@@ -346,6 +353,18 @@
 	x86_platform_ipi_callback();
 	trace_x86_platform_ipi_exit(X86_PLATFORM_IPI_VECTOR);
 	set_irq_regs(old_regs);
+}
+#endif
+
+#ifdef CONFIG_GUEST_PERF_EVENTS
+/*
+ * Handler for PERF_GUEST_MEDIATED_PMI_VECTOR.
+ */
+DEFINE_IDTENTRY_SYSVEC(sysvec_perf_guest_mediated_pmi_handler)
+{
+	apic_eoi();
+	inc_irq_stat(perf_guest_mediated_pmis);
+	perf_guest_handle_mediated_pmi();
 }
 #endif
 
+24
arch/x86/kernel/uprobes.c
@@ -1823,3 +1823,27 @@
 
 	return false;
 }
+
+#ifdef CONFIG_IA32_EMULATION
+unsigned long arch_uprobe_get_xol_area(void)
+{
+	struct thread_info *ti = current_thread_info();
+	unsigned long vaddr;
+
+	/*
+	 * HACK: we are not in a syscall, but x86 get_unmapped_area() paths
+	 * ignore TIF_ADDR32 and rely on in_32bit_syscall() to calculate
+	 * vm_unmapped_area_info.high_limit.
+	 *
+	 * The #ifdef above doesn't cover the CONFIG_X86_X32_ABI=y case,
+	 * but in this case in_32bit_syscall() -> in_x32_syscall() always
+	 * (falsely) returns true because ->orig_ax == -1.
+	 */
+	if (test_thread_flag(TIF_ADDR32))
+		ti->status |= TS_COMPAT;
+	vaddr = get_unmapped_area(NULL, TASK_SIZE - PAGE_SIZE, PAGE_SIZE, 0, 0);
+	ti->status &= ~TS_COMPAT;
+
+	return vaddr;
+}
+#endif
+1
arch/x86/kvm/Kconfig
@@ -37,6 +37,7 @@
 	select SCHED_INFO
 	select PERF_EVENTS
 	select GUEST_PERF_EVENTS
+	select PERF_GUEST_MEDIATED_PMU
 	select HAVE_KVM_MSI
 	select HAVE_KVM_CPU_RELAX_INTERCEPT
 	select HAVE_KVM_NO_POLL
+1
include/asm-generic/Kbuild
@@ -32,6 +32,7 @@
 mandatory-y += kdebug.h
 mandatory-y += kmap_size.h
 mandatory-y += kprobes.h
+mandatory-y += kvm_types.h
 mandatory-y += linkage.h
 mandatory-y += local.h
 mandatory-y += local64.h
+29 -6
include/linux/perf_event.h
@@ -305,6 +305,7 @@
 #define PERF_PMU_CAP_EXTENDED_HW_TYPE 0x0100
 #define PERF_PMU_CAP_AUX_PAUSE 0x0200
 #define PERF_PMU_CAP_AUX_PREFER_LARGE 0x0400
+#define PERF_PMU_CAP_MEDIATED_VPMU 0x0800
 
 /**
  * pmu::scope
@@ -998,6 +999,11 @@
 	u64 index;
 };
 
+struct perf_time_ctx {
+	u64 time;
+	u64 stamp;
+	u64 offset;
+};
 
 /**
  * struct perf_event_context - event context structure
@@ -1036,9 +1042,12 @@
 	/*
 	 * Context clock, runs when context enabled.
 	 */
-	u64 time;
-	u64 timestamp;
-	u64 timeoffset;
+	struct perf_time_ctx time;
+
+	/*
+	 * Context clock, runs when in the guest mode.
+	 */
+	struct perf_time_ctx timeguest;
 
 	/*
 	 * These fields let us detect when two contexts have both
@@ -1171,9 +1180,8 @@
  * This is a per-cpu dynamically allocated data structure.
  */
 struct perf_cgroup_info {
-	u64 time;
-	u64 timestamp;
-	u64 timeoffset;
+	struct perf_time_ctx time;
+	struct perf_time_ctx timeguest;
 	int active;
 };
 
@@ -1669,6 +1677,8 @@
 	unsigned int (*state)(void);
 	unsigned long (*get_ip)(void);
 	unsigned int (*handle_intel_pt_intr)(void);
+
+	void (*handle_mediated_pmi)(void);
 };
 
 #ifdef CONFIG_GUEST_PERF_EVENTS
@@ -1678,6 +1688,7 @@
 DECLARE_STATIC_CALL(__perf_guest_state, *perf_guest_cbs->state);
 DECLARE_STATIC_CALL(__perf_guest_get_ip, *perf_guest_cbs->get_ip);
 DECLARE_STATIC_CALL(__perf_guest_handle_intel_pt_intr, *perf_guest_cbs->handle_intel_pt_intr);
+DECLARE_STATIC_CALL(__perf_guest_handle_mediated_pmi, *perf_guest_cbs->handle_mediated_pmi);
 
 static inline unsigned int perf_guest_state(void)
 {
@@ -1692,6 +1703,11 @@
 static inline unsigned int perf_guest_handle_intel_pt_intr(void)
 {
 	return static_call(__perf_guest_handle_intel_pt_intr)();
+}
+
+static inline void perf_guest_handle_mediated_pmi(void)
+{
+	static_call(__perf_guest_handle_mediated_pmi)();
 }
 
 extern void perf_register_guest_info_callbacks(struct perf_guest_info_callbacks *cbs);
@@ -1913,6 +1929,13 @@
 extern int perf_event_account_interrupt(struct perf_event *event);
 extern int perf_event_period(struct perf_event *event, u64 value);
 extern u64 perf_event_pause(struct perf_event *event, bool reset);
+
+#ifdef CONFIG_PERF_GUEST_MEDIATED_PMU
+int perf_create_mediated_pmu(void);
+void perf_release_mediated_pmu(void);
+void perf_load_guest_context(void);
+void perf_put_guest_context(void);
+#endif
 
 #else /* !CONFIG_PERF_EVENTS: */
 
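The `struct perf_time_ctx` triple introduced here (`time`, `stamp`, `offset`) is maintained by `update_perf_time_ctx()` in the kernel/events/core.c part of this diff. The following user-space model is a sketch of that arithmetic only, with no locking, `READ_ONCE`/`WRITE_ONCE`, or per-CPU state; `tc_update`, `tc_now`, and `tc_now_host_only` are hypothetical names, and unsigned wrap-around is relied on deliberately, as in the kernel comment about `time - timestamp`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Model of struct perf_time_ctx from the diff. */
struct time_ctx {
	uint64_t time;   /* accumulated running time */
	uint64_t stamp;  /* clock value at the last update */
	uint64_t offset; /* time - stamp, modulo 2^64 */
};

/* Advance (or just restamp) the clock, then refresh the cached offset. */
static void tc_update(struct time_ctx *t, uint64_t now, bool adv)
{
	if (adv)
		t->time += now - t->stamp;
	t->stamp = now;
	t->offset = t->time - t->stamp;
}

/* Lock-free style read: time at 'now' is now + offset. */
static uint64_t tc_now(const struct time_ctx *t, uint64_t now)
{
	return now + t->offset;
}

/*
 * While a guest is loaded, host-only (exclude_guest) time is
 * (now + total.offset) - (now + guest.offset), so 'now' cancels out.
 */
static uint64_t tc_now_host_only(const struct time_ctx *total,
				 const struct time_ctx *guest)
{
	return total->offset - guest->offset;
}
```

For example, if the context starts at clock 100 and a guest is entered at 150, then at clock 170 the total time reads 70 while the host-only time stays frozen at 50 until the guest is put.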
+16 -2
include/linux/unwind_user.h
@@ -5,8 +5,22 @@
 #include <linux/unwind_user_types.h>
 #include <asm/unwind_user.h>
 
-#ifndef ARCH_INIT_USER_FP_FRAME
-#define ARCH_INIT_USER_FP_FRAME
+#ifndef CONFIG_HAVE_UNWIND_USER_FP
+
+#define ARCH_INIT_USER_FP_FRAME(ws)
+
+#endif
+
+#ifndef ARCH_INIT_USER_FP_ENTRY_FRAME
+#define ARCH_INIT_USER_FP_ENTRY_FRAME(ws)
+#endif
+
+#ifndef unwind_user_at_function_start
+static inline bool unwind_user_at_function_start(struct pt_regs *regs)
+{
+	return false;
+}
+#define unwind_user_at_function_start unwind_user_at_function_start
 #endif
 
 int unwind_user(struct unwind_stacktrace *trace, unsigned int max_entries);
+1
include/linux/uprobes.h
@@ -242,6 +242,7 @@
 extern void arch_uprobe_init_state(struct mm_struct *mm);
 extern void handle_syscall_uprobe(struct pt_regs *regs, unsigned long bp_vaddr);
 extern void arch_uprobe_optimize(struct arch_uprobe *auprobe, unsigned long vaddr);
+extern unsigned long arch_uprobe_get_xol_area(void);
 #else /* !CONFIG_UPROBES */
 struct uprobes_state {
 };
+24 -3
include/uapi/linux/perf_event.h
@@ -1330,14 +1330,16 @@
 		mem_snoopx : 2, /* Snoop mode, ext */
 		mem_blk : 3, /* Access blocked */
 		mem_hops : 3, /* Hop level */
-		mem_rsvd : 18;
+		mem_region : 5, /* cache/memory regions */
+		mem_rsvd : 13;
 	};
 };
 #elif defined(__BIG_ENDIAN_BITFIELD)
 union perf_mem_data_src {
 	__u64 val;
 	struct {
-		__u64 mem_rsvd : 18,
+		__u64 mem_rsvd : 13,
+		mem_region : 5, /* cache/memory regions */
 		mem_hops : 3, /* Hop level */
 		mem_blk : 3, /* Access blocked */
 		mem_snoopx : 2, /* Snoop mode, ext */
@@ -1394,7 +1396,7 @@
 #define PERF_MEM_LVLNUM_L4 0x0004 /* L4 */
 #define PERF_MEM_LVLNUM_L2_MHB 0x0005 /* L2 Miss Handling Buffer */
 #define PERF_MEM_LVLNUM_MSC 0x0006 /* Memory-side Cache */
-/* 0x007 available */
+#define PERF_MEM_LVLNUM_L0 0x0007 /* L0 */
 #define PERF_MEM_LVLNUM_UNC 0x0008 /* Uncached */
 #define PERF_MEM_LVLNUM_CXL 0x0009 /* CXL */
 #define PERF_MEM_LVLNUM_IO 0x000a /* I/O */
@@ -1446,6 +1448,25 @@
 #define PERF_MEM_HOPS_3 0x0004 /* Remote board */
 /* 5-7 available */
 #define PERF_MEM_HOPS_SHIFT 43
+
+/* Cache/Memory region */
+#define PERF_MEM_REGION_NA 0x0 /* Invalid */
+#define PERF_MEM_REGION_RSVD 0x01 /* Reserved */
+#define PERF_MEM_REGION_L_SHARE 0x02 /* Local CA shared cache */
+#define PERF_MEM_REGION_L_NON_SHARE 0x03 /* Local CA non-shared cache */
+#define PERF_MEM_REGION_O_IO 0x04 /* Other CA IO agent */
+#define PERF_MEM_REGION_O_SHARE 0x05 /* Other CA shared cache */
+#define PERF_MEM_REGION_O_NON_SHARE 0x06 /* Other CA non-shared cache */
+#define PERF_MEM_REGION_MMIO 0x07 /* MMIO */
+#define PERF_MEM_REGION_MEM0 0x08 /* Memory region 0 */
+#define PERF_MEM_REGION_MEM1 0x09 /* Memory region 1 */
+#define PERF_MEM_REGION_MEM2 0x0a /* Memory region 2 */
+#define PERF_MEM_REGION_MEM3 0x0b /* Memory region 3 */
+#define PERF_MEM_REGION_MEM4 0x0c /* Memory region 4 */
+#define PERF_MEM_REGION_MEM5 0x0d /* Memory region 5 */
+#define PERF_MEM_REGION_MEM6 0x0e /* Memory region 6 */
+#define PERF_MEM_REGION_MEM7 0x0f /* Memory region 7 */
+#define PERF_MEM_REGION_SHIFT 46
 
 #define PERF_MEM_S(a, s) \
 	(((__u64)PERF_MEM_##a##_##s) << PERF_MEM_##a##_SHIFT)
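A consumer of `perf_mem_data_src` samples could pull the new 5-bit `mem_region` field out of the raw 64-bit value as sketched below. This is an illustrative decoder, not perf tool code: the shift (46) and width (5) come from the diff, the strings paraphrase its comments, and `mem_region_of` and `mem_region_name` are hypothetical names. Working on the raw `val` with shifts avoids depending on bitfield layout or endianness.

```c
#include <stdint.h>

#define MEM_REGION_SHIFT 46 /* PERF_MEM_REGION_SHIFT in the diff */
#define MEM_REGION_BITS  5  /* width of mem_region in the diff */

/* Extract the region code from a raw perf_mem_data_src value. */
static unsigned int mem_region_of(uint64_t data_src)
{
	return (unsigned int)((data_src >> MEM_REGION_SHIFT) &
			      ((1u << MEM_REGION_BITS) - 1));
}

/* Map a region code to a short description; codes above 0x0f are unknown. */
static const char *mem_region_name(unsigned int region)
{
	static const char *const names[] = {
		"N/A", "reserved",
		"local CA shared cache", "local CA non-shared cache",
		"other CA IO agent", "other CA shared cache",
		"other CA non-shared cache", "MMIO",
		"memory region 0", "memory region 1",
		"memory region 2", "memory region 3",
		"memory region 4", "memory region 5",
		"memory region 6", "memory region 7",
	};

	if (region >= sizeof(names) / sizeof(names[0]))
		return "unknown";
	return names[region];
}
```

The field is five bits wide but only codes 0x00 through 0x0f are defined in this diff, so the decoder reports anything larger as unknown rather than guessing.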
+4
init/Kconfig
@@ -2072,6 +2072,10 @@
 	bool
 	depends on HAVE_PERF_EVENTS
 
+config PERF_GUEST_MEDIATED_PMU
+	bool
+	depends on GUEST_PERF_EVENTS
+
 config PERF_USE_VMALLOC
 	bool
 	help
+423 -123
kernel/events/core.c
··· 57 57 #include <linux/task_work.h> 58 58 #include <linux/percpu-rwsem.h> 59 59 #include <linux/unwind_deferred.h> 60 + #include <linux/kvm_types.h> 60 61 61 62 #include "internal.h" 62 63 ··· 167 166 EVENT_CPU = 0x10, 168 167 EVENT_CGROUP = 0x20, 169 168 169 + /* 170 + * EVENT_GUEST is set when scheduling in/out events between the host 171 + * and a guest with a mediated vPMU. Among other things, EVENT_GUEST 172 + * is used: 173 + * 174 + * - In for_each_epc() to skip PMUs that don't support events in a 175 + * MEDIATED_VPMU guest, i.e. don't need to be context switched. 176 + * - To indicate the start/end point of the events in a guest. Guest 177 + * running time is deducted for host-only (exclude_guest) events. 178 + */ 179 + EVENT_GUEST = 0x40, 180 + EVENT_FLAGS = EVENT_CGROUP | EVENT_GUEST, 170 181 /* compound helpers */ 171 182 EVENT_ALL = EVENT_FLEXIBLE | EVENT_PINNED, 172 183 EVENT_TIME_FROZEN = EVENT_TIME | EVENT_FROZEN, ··· 470 457 static cpumask_var_t perf_online_pkg_mask; 471 458 static cpumask_var_t perf_online_sys_mask; 472 459 static struct kmem_cache *perf_event_cache; 460 + 461 + #ifdef CONFIG_PERF_GUEST_MEDIATED_PMU 462 + static DEFINE_PER_CPU(bool, guest_ctx_loaded); 463 + 464 + static __always_inline bool is_guest_mediated_pmu_loaded(void) 465 + { 466 + return __this_cpu_read(guest_ctx_loaded); 467 + } 468 + #else 469 + static __always_inline bool is_guest_mediated_pmu_loaded(void) 470 + { 471 + return false; 472 + } 473 + #endif 473 474 474 475 /* 475 476 * perf event paranoia level: ··· 806 779 ___p; \ 807 780 }) 808 781 809 - #define for_each_epc(_epc, _ctx, _pmu, _cgroup) \ 782 + static bool perf_skip_pmu_ctx(struct perf_event_pmu_context *pmu_ctx, 783 + enum event_type_t event_type) 784 + { 785 + if ((event_type & EVENT_CGROUP) && !pmu_ctx->nr_cgroups) 786 + return true; 787 + if ((event_type & EVENT_GUEST) && 788 + !(pmu_ctx->pmu->capabilities & PERF_PMU_CAP_MEDIATED_VPMU)) 789 + return true; 790 + return false; 791 + } 792 + 793 + 
#define for_each_epc(_epc, _ctx, _pmu, _event_type) \ 810 794 list_for_each_entry(_epc, &((_ctx)->pmu_ctx_list), pmu_ctx_entry) \ 811 - if (_cgroup && !_epc->nr_cgroups) \ 795 + if (perf_skip_pmu_ctx(_epc, _event_type)) \ 812 796 continue; \ 813 797 else if (_pmu && _epc->pmu != _pmu) \ 814 798 continue; \ 815 799 else 816 800 817 - static void perf_ctx_disable(struct perf_event_context *ctx, bool cgroup) 801 + static void perf_ctx_disable(struct perf_event_context *ctx, 802 + enum event_type_t event_type) 818 803 { 819 804 struct perf_event_pmu_context *pmu_ctx; 820 805 821 - for_each_epc(pmu_ctx, ctx, NULL, cgroup) 806 + for_each_epc(pmu_ctx, ctx, NULL, event_type) 822 807 perf_pmu_disable(pmu_ctx->pmu); 823 808 } 824 809 825 - static void perf_ctx_enable(struct perf_event_context *ctx, bool cgroup) 810 + static void perf_ctx_enable(struct perf_event_context *ctx, 811 + enum event_type_t event_type) 826 812 { 827 813 struct perf_event_pmu_context *pmu_ctx; 828 814 829 - for_each_epc(pmu_ctx, ctx, NULL, cgroup) 815 + for_each_epc(pmu_ctx, ctx, NULL, event_type) 830 816 perf_pmu_enable(pmu_ctx->pmu); 831 817 } 832 818 833 819 static void ctx_sched_out(struct perf_event_context *ctx, struct pmu *pmu, enum event_type_t event_type); 834 820 static void ctx_sched_in(struct perf_event_context *ctx, struct pmu *pmu, enum event_type_t event_type); 821 + 822 + static inline void update_perf_time_ctx(struct perf_time_ctx *time, u64 now, bool adv) 823 + { 824 + if (adv) 825 + time->time += now - time->stamp; 826 + time->stamp = now; 827 + 828 + /* 829 + * The above: time' = time + (now - timestamp), can be re-arranged 830 + * into: time` = now + (time - timestamp), which gives a single value 831 + * offset to compute future time without locks on. 832 + * 833 + * See perf_event_time_now(), which can be used from NMI context where 834 + * it's (obviously) not possible to acquire ctx->lock in order to read 835 + * both the above values in a consistent manner. 
836 + */ 837 + WRITE_ONCE(time->offset, time->time - time->stamp); 838 + } 839 + 840 + static_assert(offsetof(struct perf_event_context, timeguest) - 841 + offsetof(struct perf_event_context, time) == 842 + sizeof(struct perf_time_ctx)); 843 + 844 + #define T_TOTAL 0 845 + #define T_GUEST 1 846 + 847 + static inline u64 __perf_event_time_ctx(struct perf_event *event, 848 + struct perf_time_ctx *times) 849 + { 850 + u64 time = times[T_TOTAL].time; 851 + 852 + if (event->attr.exclude_guest) 853 + time -= times[T_GUEST].time; 854 + 855 + return time; 856 + } 857 + 858 + static inline u64 __perf_event_time_ctx_now(struct perf_event *event, 859 + struct perf_time_ctx *times, 860 + u64 now) 861 + { 862 + if (is_guest_mediated_pmu_loaded() && event->attr.exclude_guest) { 863 + /* 864 + * (now + times[total].offset) - (now + times[guest].offset) := 865 + * times[total].offset - times[guest].offset 866 + */ 867 + return READ_ONCE(times[T_TOTAL].offset) - READ_ONCE(times[T_GUEST].offset); 868 + } 869 + 870 + return now + READ_ONCE(times[T_TOTAL].offset); 871 + } 835 872 836 873 #ifdef CONFIG_CGROUP_PERF 837 874 ··· 933 842 return event->cgrp != NULL; 934 843 } 935 844 845 + static_assert(offsetof(struct perf_cgroup_info, timeguest) - 846 + offsetof(struct perf_cgroup_info, time) == 847 + sizeof(struct perf_time_ctx)); 848 + 936 849 static inline u64 perf_cgroup_event_time(struct perf_event *event) 937 850 { 938 851 struct perf_cgroup_info *t; 939 852 940 853 t = per_cpu_ptr(event->cgrp->info, event->cpu); 941 - return t->time; 854 + return __perf_event_time_ctx(event, &t->time); 942 855 } 943 856 944 857 static inline u64 perf_cgroup_event_time_now(struct perf_event *event, u64 now) ··· 951 856 952 857 t = per_cpu_ptr(event->cgrp->info, event->cpu); 953 858 if (!__load_acquire(&t->active)) 954 - return t->time; 955 - now += READ_ONCE(t->timeoffset); 956 - return now; 859 + return __perf_event_time_ctx(event, &t->time); 860 + 861 + return __perf_event_time_ctx_now(event, 
&t->time, now); 957 862 } 958 863 959 - static inline void __update_cgrp_time(struct perf_cgroup_info *info, u64 now, bool adv) 864 + static inline void __update_cgrp_guest_time(struct perf_cgroup_info *info, u64 now, bool adv) 960 865 { 961 - if (adv) 962 - info->time += now - info->timestamp; 963 - info->timestamp = now; 964 - /* 965 - * see update_context_time() 966 - */ 967 - WRITE_ONCE(info->timeoffset, info->time - info->timestamp); 866 + update_perf_time_ctx(&info->timeguest, now, adv); 867 + } 868 + 869 + static inline void update_cgrp_time(struct perf_cgroup_info *info, u64 now) 870 + { 871 + update_perf_time_ctx(&info->time, now, true); 872 + if (is_guest_mediated_pmu_loaded()) 873 + __update_cgrp_guest_time(info, now, true); 968 874 } 969 875 970 876 static inline void update_cgrp_time_from_cpuctx(struct perf_cpu_context *cpuctx, bool final) ··· 981 885 cgrp = container_of(css, struct perf_cgroup, css); 982 886 info = this_cpu_ptr(cgrp->info); 983 887 984 - __update_cgrp_time(info, now, true); 888 + update_cgrp_time(info, now); 985 889 if (final) 986 890 __store_release(&info->active, 0); 987 891 } ··· 1004 908 * Do not update time when cgroup is not active 1005 909 */ 1006 910 if (info->active) 1007 - __update_cgrp_time(info, perf_clock(), true); 911 + update_cgrp_time(info, perf_clock()); 1008 912 } 1009 913 1010 914 static inline void 1011 - perf_cgroup_set_timestamp(struct perf_cpu_context *cpuctx) 915 + perf_cgroup_set_timestamp(struct perf_cpu_context *cpuctx, bool guest) 1012 916 { 1013 917 struct perf_event_context *ctx = &cpuctx->ctx; 1014 918 struct perf_cgroup *cgrp = cpuctx->cgrp; ··· 1028 932 for (css = &cgrp->css; css; css = css->parent) { 1029 933 cgrp = container_of(css, struct perf_cgroup, css); 1030 934 info = this_cpu_ptr(cgrp->info); 1031 - __update_cgrp_time(info, ctx->timestamp, false); 1032 - __store_release(&info->active, 1); 935 + if (guest) { 936 + __update_cgrp_guest_time(info, ctx->time.stamp, false); 937 + } else { 938 + 
update_perf_time_ctx(&info->time, ctx->time.stamp, false); 939 + __store_release(&info->active, 1); 940 + } 1033 941 } 1034 942 } 1035 943 ··· 1064 964 return; 1065 965 1066 966 WARN_ON_ONCE(cpuctx->ctx.nr_cgroups == 0); 1067 - 1068 - perf_ctx_disable(&cpuctx->ctx, true); 967 + perf_ctx_disable(&cpuctx->ctx, EVENT_CGROUP); 1069 968 1070 969 ctx_sched_out(&cpuctx->ctx, NULL, EVENT_ALL|EVENT_CGROUP); 1071 970 /* ··· 1080 981 */ 1081 982 ctx_sched_in(&cpuctx->ctx, NULL, EVENT_ALL|EVENT_CGROUP); 1082 983 1083 - perf_ctx_enable(&cpuctx->ctx, true); 984 + perf_ctx_enable(&cpuctx->ctx, EVENT_CGROUP); 1084 985 } 1085 986 1086 987 static int perf_cgroup_ensure_storage(struct perf_event *event, ··· 1237 1138 } 1238 1139 1239 1140 static inline void 1240 - perf_cgroup_set_timestamp(struct perf_cpu_context *cpuctx) 1141 + perf_cgroup_set_timestamp(struct perf_cpu_context *cpuctx, bool guest) 1241 1142 { 1242 1143 } 1243 1144 ··· 1649 1550 */ 1650 1551 static void __update_context_time(struct perf_event_context *ctx, bool adv) 1651 1552 { 1652 - u64 now = perf_clock(); 1653 - 1654 1553 lockdep_assert_held(&ctx->lock); 1655 1554 1656 - if (adv) 1657 - ctx->time += now - ctx->timestamp; 1658 - ctx->timestamp = now; 1555 + update_perf_time_ctx(&ctx->time, perf_clock(), adv); 1556 + } 1659 1557 1660 - /* 1661 - * The above: time' = time + (now - timestamp), can be re-arranged 1662 - * into: time` = now + (time - timestamp), which gives a single value 1663 - * offset to compute future time without locks on. 1664 - * 1665 - * See perf_event_time_now(), which can be used from NMI context where 1666 - * it's (obviously) not possible to acquire ctx->lock in order to read 1667 - * both the above values in a consistent manner. 
1668 - */ 1669 - WRITE_ONCE(ctx->timeoffset, ctx->time - ctx->timestamp); 1558 + static void __update_context_guest_time(struct perf_event_context *ctx, bool adv) 1559 + { 1560 + lockdep_assert_held(&ctx->lock); 1561 + 1562 + /* must be called after __update_context_time(); */ 1563 + update_perf_time_ctx(&ctx->timeguest, ctx->time.stamp, adv); 1670 1564 } 1671 1565 1672 1566 static void update_context_time(struct perf_event_context *ctx) 1673 1567 { 1674 1568 __update_context_time(ctx, true); 1569 + if (is_guest_mediated_pmu_loaded()) 1570 + __update_context_guest_time(ctx, true); 1675 1571 } 1676 1572 1677 1573 static u64 perf_event_time(struct perf_event *event) ··· 1679 1585 if (is_cgroup_event(event)) 1680 1586 return perf_cgroup_event_time(event); 1681 1587 1682 - return ctx->time; 1588 + return __perf_event_time_ctx(event, &ctx->time); 1683 1589 } 1684 1590 1685 1591 static u64 perf_event_time_now(struct perf_event *event, u64 now) ··· 1693 1599 return perf_cgroup_event_time_now(event, now); 1694 1600 1695 1601 if (!(__load_acquire(&ctx->is_active) & EVENT_TIME)) 1696 - return ctx->time; 1602 + return __perf_event_time_ctx(event, &ctx->time); 1697 1603 1698 - now += READ_ONCE(ctx->timeoffset); 1699 - return now; 1604 + return __perf_event_time_ctx_now(event, &ctx->time, now); 1700 1605 } 1701 1606 1702 1607 static enum event_type_t get_event_type(struct perf_event *event) ··· 2515 2422 } 2516 2423 2517 2424 static inline void 2518 - __ctx_time_update(struct perf_cpu_context *cpuctx, struct perf_event_context *ctx, bool final) 2425 + __ctx_time_update(struct perf_cpu_context *cpuctx, struct perf_event_context *ctx, 2426 + bool final, enum event_type_t event_type) 2519 2427 { 2520 2428 if (ctx->is_active & EVENT_TIME) { 2521 2429 if (ctx->is_active & EVENT_FROZEN) 2522 2430 return; 2431 + 2523 2432 update_context_time(ctx); 2524 - update_cgrp_time_from_cpuctx(cpuctx, final); 2433 + /* vPMU should not stop time */ 2434 + update_cgrp_time_from_cpuctx(cpuctx, 
!(event_type & EVENT_GUEST) && final); 2525 2435 } 2526 2436 } 2527 2437 2528 2438 static inline void 2529 2439 ctx_time_update(struct perf_cpu_context *cpuctx, struct perf_event_context *ctx) 2530 2440 { 2531 - __ctx_time_update(cpuctx, ctx, false); 2441 + __ctx_time_update(cpuctx, ctx, false, 0); 2532 2442 } 2533 2443 2534 2444 /* ··· 2957 2861 2958 2862 static void perf_event_sched_in(struct perf_cpu_context *cpuctx, 2959 2863 struct perf_event_context *ctx, 2960 - struct pmu *pmu) 2864 + struct pmu *pmu, 2865 + enum event_type_t event_type) 2961 2866 { 2962 - ctx_sched_in(&cpuctx->ctx, pmu, EVENT_PINNED); 2867 + ctx_sched_in(&cpuctx->ctx, pmu, EVENT_PINNED | event_type); 2963 2868 if (ctx) 2964 - ctx_sched_in(ctx, pmu, EVENT_PINNED); 2965 - ctx_sched_in(&cpuctx->ctx, pmu, EVENT_FLEXIBLE); 2869 + ctx_sched_in(ctx, pmu, EVENT_PINNED | event_type); 2870 + ctx_sched_in(&cpuctx->ctx, pmu, EVENT_FLEXIBLE | event_type); 2966 2871 if (ctx) 2967 - ctx_sched_in(ctx, pmu, EVENT_FLEXIBLE); 2872 + ctx_sched_in(ctx, pmu, EVENT_FLEXIBLE | event_type); 2968 2873 } 2969 2874 2970 2875 /* ··· 2999 2902 3000 2903 event_type &= EVENT_ALL; 3001 2904 3002 - for_each_epc(epc, &cpuctx->ctx, pmu, false) 2905 + for_each_epc(epc, &cpuctx->ctx, pmu, 0) 3003 2906 perf_pmu_disable(epc->pmu); 3004 2907 3005 2908 if (task_ctx) { 3006 - for_each_epc(epc, task_ctx, pmu, false) 2909 + for_each_epc(epc, task_ctx, pmu, 0) 3007 2910 perf_pmu_disable(epc->pmu); 3008 2911 3009 2912 task_ctx_sched_out(task_ctx, pmu, event_type); ··· 3021 2924 else if (event_type & EVENT_PINNED) 3022 2925 ctx_sched_out(&cpuctx->ctx, pmu, EVENT_FLEXIBLE); 3023 2926 3024 - perf_event_sched_in(cpuctx, task_ctx, pmu); 2927 + perf_event_sched_in(cpuctx, task_ctx, pmu, 0); 3025 2928 3026 - for_each_epc(epc, &cpuctx->ctx, pmu, false) 2929 + for_each_epc(epc, &cpuctx->ctx, pmu, 0) 3027 2930 perf_pmu_enable(epc->pmu); 3028 2931 3029 2932 if (task_ctx) { 3030 - for_each_epc(epc, task_ctx, pmu, false) 2933 + for_each_epc(epc, 
task_ctx, pmu, 0) 3031 2934 perf_pmu_enable(epc->pmu); 3032 2935 } 3033 2936 } ··· 3576 3479 ctx_sched_out(struct perf_event_context *ctx, struct pmu *pmu, enum event_type_t event_type) 3577 3480 { 3578 3481 struct perf_cpu_context *cpuctx = this_cpu_ptr(&perf_cpu_context); 3482 + enum event_type_t active_type = event_type & ~EVENT_FLAGS; 3579 3483 struct perf_event_pmu_context *pmu_ctx; 3580 3484 int is_active = ctx->is_active; 3581 - bool cgroup = event_type & EVENT_CGROUP; 3582 3485 3583 - event_type &= ~EVENT_CGROUP; 3584 3486 3585 3487 lockdep_assert_held(&ctx->lock); 3586 3488 ··· 3603 3507 * 3604 3508 * would only update time for the pinned events. 3605 3509 */ 3606 - __ctx_time_update(cpuctx, ctx, ctx == &cpuctx->ctx); 3510 + __ctx_time_update(cpuctx, ctx, ctx == &cpuctx->ctx, event_type); 3607 3511 3608 3512 /* 3609 3513 * CPU-release for the below ->is_active store, 3610 3514 * see __load_acquire() in perf_event_time_now() 3611 3515 */ 3612 3516 barrier(); 3613 - ctx->is_active &= ~event_type; 3517 + ctx->is_active &= ~active_type; 3614 3518 3615 3519 if (!(ctx->is_active & EVENT_ALL)) { 3616 3520 /* ··· 3629 3533 cpuctx->task_ctx = NULL; 3630 3534 } 3631 3535 3632 - is_active ^= ctx->is_active; /* changed bits */ 3536 + if (event_type & EVENT_GUEST) { 3537 + /* 3538 + * Schedule out all exclude_guest events of PMU 3539 + * with PERF_PMU_CAP_MEDIATED_VPMU. 
3540 + */ 3541 + is_active = EVENT_ALL; 3542 + __update_context_guest_time(ctx, false); 3543 + perf_cgroup_set_timestamp(cpuctx, true); 3544 + barrier(); 3545 + } else { 3546 + is_active ^= ctx->is_active; /* changed bits */ 3547 + } 3633 3548 3634 - for_each_epc(pmu_ctx, ctx, pmu, cgroup) 3549 + for_each_epc(pmu_ctx, ctx, pmu, event_type) 3635 3550 __pmu_ctx_sched_out(pmu_ctx, is_active); 3636 3551 } 3637 3552 ··· 3798 3691 raw_spin_lock_nested(&next_ctx->lock, SINGLE_DEPTH_NESTING); 3799 3692 if (context_equiv(ctx, next_ctx)) { 3800 3693 3801 - perf_ctx_disable(ctx, false); 3694 + perf_ctx_disable(ctx, 0); 3802 3695 3803 3696 /* PMIs are disabled; ctx->nr_no_switch_fast is stable. */ 3804 3697 if (local_read(&ctx->nr_no_switch_fast) || ··· 3822 3715 3823 3716 perf_ctx_sched_task_cb(ctx, task, false); 3824 3717 3825 - perf_ctx_enable(ctx, false); 3718 + perf_ctx_enable(ctx, 0); 3826 3719 3827 3720 /* 3828 3721 * RCU_INIT_POINTER here is safe because we've not ··· 3846 3739 3847 3740 if (do_switch) { 3848 3741 raw_spin_lock(&ctx->lock); 3849 - perf_ctx_disable(ctx, false); 3742 + perf_ctx_disable(ctx, 0); 3850 3743 3851 3744 inside_switch: 3852 3745 perf_ctx_sched_task_cb(ctx, task, false); 3853 3746 task_ctx_sched_out(ctx, NULL, EVENT_ALL); 3854 3747 3855 - perf_ctx_enable(ctx, false); 3748 + perf_ctx_enable(ctx, 0); 3856 3749 raw_spin_unlock(&ctx->lock); 3857 3750 } 3858 3751 } ··· 4099 3992 event_update_userpage(event); 4100 3993 } 4101 3994 3995 + struct merge_sched_data { 3996 + int can_add_hw; 3997 + enum event_type_t event_type; 3998 + }; 3999 + 4102 4000 static int merge_sched_in(struct perf_event *event, void *data) 4103 4001 { 4104 4002 struct perf_event_context *ctx = event->ctx; 4105 - int *can_add_hw = data; 4003 + struct merge_sched_data *msd = data; 4106 4004 4107 4005 if (event->state <= PERF_EVENT_STATE_OFF) 4108 4006 return 0; ··· 4115 4003 if (!event_filter_match(event)) 4116 4004 return 0; 4117 4005 4118 - if (group_can_go_on(event, 
*can_add_hw)) { 4006 + /* 4007 + * Don't schedule in any host events from PMU with 4008 + * PERF_PMU_CAP_MEDIATED_VPMU, while a guest is running. 4009 + */ 4010 + if (is_guest_mediated_pmu_loaded() && 4011 + event->pmu_ctx->pmu->capabilities & PERF_PMU_CAP_MEDIATED_VPMU && 4012 + !(msd->event_type & EVENT_GUEST)) 4013 + return 0; 4014 + 4015 + if (group_can_go_on(event, msd->can_add_hw)) { 4119 4016 if (!group_sched_in(event, ctx)) 4120 4017 list_add_tail(&event->active_list, get_event_list(event)); 4121 4018 } 4122 4019 4123 4020 if (event->state == PERF_EVENT_STATE_INACTIVE) { 4124 - *can_add_hw = 0; 4021 + msd->can_add_hw = 0; 4125 4022 if (event->attr.pinned) { 4126 4023 perf_cgroup_event_disable(event, ctx); 4127 4024 perf_event_set_state(event, PERF_EVENT_STATE_ERROR); ··· 4153 4032 4154 4033 static void pmu_groups_sched_in(struct perf_event_context *ctx, 4155 4034 struct perf_event_groups *groups, 4156 - struct pmu *pmu) 4035 + struct pmu *pmu, 4036 + enum event_type_t event_type) 4157 4037 { 4158 - int can_add_hw = 1; 4038 + struct merge_sched_data msd = { 4039 + .can_add_hw = 1, 4040 + .event_type = event_type, 4041 + }; 4159 4042 visit_groups_merge(ctx, groups, smp_processor_id(), pmu, 4160 - merge_sched_in, &can_add_hw); 4043 + merge_sched_in, &msd); 4161 4044 } 4162 4045 4163 4046 static void __pmu_ctx_sched_in(struct perf_event_pmu_context *pmu_ctx, ··· 4170 4045 struct perf_event_context *ctx = pmu_ctx->ctx; 4171 4046 4172 4047 if (event_type & EVENT_PINNED) 4173 - pmu_groups_sched_in(ctx, &ctx->pinned_groups, pmu_ctx->pmu); 4048 + pmu_groups_sched_in(ctx, &ctx->pinned_groups, pmu_ctx->pmu, event_type); 4174 4049 if (event_type & EVENT_FLEXIBLE) 4175 - pmu_groups_sched_in(ctx, &ctx->flexible_groups, pmu_ctx->pmu); 4050 + pmu_groups_sched_in(ctx, &ctx->flexible_groups, pmu_ctx->pmu, event_type); 4176 4051 } 4177 4052 4178 4053 static void 4179 4054 ctx_sched_in(struct perf_event_context *ctx, struct pmu *pmu, enum event_type_t event_type) 4180 4055 { 
4181 4056 struct perf_cpu_context *cpuctx = this_cpu_ptr(&perf_cpu_context); 4057 + enum event_type_t active_type = event_type & ~EVENT_FLAGS; 4182 4058 struct perf_event_pmu_context *pmu_ctx; 4183 4059 int is_active = ctx->is_active; 4184 - bool cgroup = event_type & EVENT_CGROUP; 4185 - 4186 - event_type &= ~EVENT_CGROUP; 4187 4060 4188 4061 lockdep_assert_held(&ctx->lock); 4189 4062 ··· 4189 4066 return; 4190 4067 4191 4068 if (!(is_active & EVENT_TIME)) { 4069 + /* EVENT_TIME should be active while the guest runs */ 4070 + WARN_ON_ONCE(event_type & EVENT_GUEST); 4192 4071 /* start ctx time */ 4193 4072 __update_context_time(ctx, false); 4194 - perf_cgroup_set_timestamp(cpuctx); 4073 + perf_cgroup_set_timestamp(cpuctx, false); 4195 4074 /* 4196 4075 * CPU-release for the below ->is_active store, 4197 4076 * see __load_acquire() in perf_event_time_now() ··· 4201 4076 barrier(); 4202 4077 } 4203 4078 4204 - ctx->is_active |= (event_type | EVENT_TIME); 4079 + ctx->is_active |= active_type | EVENT_TIME; 4205 4080 if (ctx->task) { 4206 4081 if (!(is_active & EVENT_ALL)) 4207 4082 cpuctx->task_ctx = ctx; ··· 4209 4084 WARN_ON_ONCE(cpuctx->task_ctx != ctx); 4210 4085 } 4211 4086 4212 - is_active ^= ctx->is_active; /* changed bits */ 4087 + if (event_type & EVENT_GUEST) { 4088 + /* 4089 + * Schedule in the required exclude_guest events of PMU 4090 + * with PERF_PMU_CAP_MEDIATED_VPMU. 4091 + */ 4092 + is_active = event_type & EVENT_ALL; 4093 + 4094 + /* 4095 + * Update ctx time to set the new start time for 4096 + * the exclude_guest events. 4097 + */ 4098 + update_context_time(ctx); 4099 + update_cgrp_time_from_cpuctx(cpuctx, false); 4100 + barrier(); 4101 + } else { 4102 + is_active ^= ctx->is_active; /* changed bits */ 4103 + } 4213 4104 4214 4105 /* 4215 4106 * First go through the list and put on any pinned groups 4216 4107 * in order to give them the best chance of going on. 
4217 4108 */ 4218 4109 if (is_active & EVENT_PINNED) { 4219 - for_each_epc(pmu_ctx, ctx, pmu, cgroup) 4220 - __pmu_ctx_sched_in(pmu_ctx, EVENT_PINNED); 4110 + for_each_epc(pmu_ctx, ctx, pmu, event_type) 4111 + __pmu_ctx_sched_in(pmu_ctx, EVENT_PINNED | (event_type & EVENT_GUEST)); 4221 4112 } 4222 4113 4223 4114 /* Then walk through the lower prio flexible groups */ 4224 4115 if (is_active & EVENT_FLEXIBLE) { 4225 - for_each_epc(pmu_ctx, ctx, pmu, cgroup) 4226 - __pmu_ctx_sched_in(pmu_ctx, EVENT_FLEXIBLE); 4116 + for_each_epc(pmu_ctx, ctx, pmu, event_type) 4117 + __pmu_ctx_sched_in(pmu_ctx, EVENT_FLEXIBLE | (event_type & EVENT_GUEST)); 4227 4118 } 4228 4119 } 4229 4120 ··· 4255 4114 4256 4115 if (cpuctx->task_ctx == ctx) { 4257 4116 perf_ctx_lock(cpuctx, ctx); 4258 - perf_ctx_disable(ctx, false); 4117 + perf_ctx_disable(ctx, 0); 4259 4118 4260 4119 perf_ctx_sched_task_cb(ctx, task, true); 4261 4120 4262 - perf_ctx_enable(ctx, false); 4121 + perf_ctx_enable(ctx, 0); 4263 4122 perf_ctx_unlock(cpuctx, ctx); 4264 4123 goto rcu_unlock; 4265 4124 } ··· 4272 4131 if (!ctx->nr_events) 4273 4132 goto unlock; 4274 4133 4275 - perf_ctx_disable(ctx, false); 4134 + perf_ctx_disable(ctx, 0); 4276 4135 /* 4277 4136 * We want to keep the following priority order: 4278 4137 * cpu pinned (that don't need to move), task pinned, ··· 4282 4141 * events, no need to flip the cpuctx's events around. 
4283 4142 */ 4284 4143 if (!RB_EMPTY_ROOT(&ctx->pinned_groups.tree)) { 4285 - perf_ctx_disable(&cpuctx->ctx, false); 4144 + perf_ctx_disable(&cpuctx->ctx, 0); 4286 4145 ctx_sched_out(&cpuctx->ctx, NULL, EVENT_FLEXIBLE); 4287 4146 } 4288 4147 4289 - perf_event_sched_in(cpuctx, ctx, NULL); 4148 + perf_event_sched_in(cpuctx, ctx, NULL, 0); 4290 4149 4291 4150 perf_ctx_sched_task_cb(cpuctx->task_ctx, task, true); 4292 4151 4293 4152 if (!RB_EMPTY_ROOT(&ctx->pinned_groups.tree)) 4294 - perf_ctx_enable(&cpuctx->ctx, false); 4153 + perf_ctx_enable(&cpuctx->ctx, 0); 4295 4154 4296 - perf_ctx_enable(ctx, false); 4155 + perf_ctx_enable(ctx, 0); 4297 4156 4298 4157 unlock: 4299 4158 perf_ctx_unlock(cpuctx, ctx); ··· 5421 5280 return -ENOMEM; 5422 5281 5423 5282 for (;;) { 5424 - if (try_cmpxchg((struct perf_ctx_data **)&task->perf_ctx_data, &old, cd)) { 5283 + if (try_cmpxchg(&task->perf_ctx_data, &old, cd)) { 5425 5284 if (old) 5426 5285 perf_free_ctx_data_rcu(old); 5286 + /* 5287 + * Above try_cmpxchg() pairs with try_cmpxchg() from 5288 + * detach_task_ctx_data() such that 5289 + * if we race with perf_event_exit_task(), we must 5290 + * observe PF_EXITING. 
5291 + */ 5292 + if (task->flags & PF_EXITING) { 5293 + /* detach_task_ctx_data() may free it already */ 5294 + if (try_cmpxchg(&task->perf_ctx_data, &cd, NULL)) 5295 + perf_free_ctx_data_rcu(cd); 5296 + } 5427 5297 return 0; 5428 5298 } 5429 5299 ··· 5480 5328 /* Allocate everything */ 5481 5329 scoped_guard (rcu) { 5482 5330 for_each_process_thread(g, p) { 5331 + if (p->flags & PF_EXITING) 5332 + continue; 5483 5333 cd = rcu_dereference(p->perf_ctx_data); 5484 5334 if (cd && !cd->global) { 5485 5335 cd->global = 1; ··· 5748 5594 { 5749 5595 struct pmu *pmu = event->pmu; 5750 5596 5597 + security_perf_event_free(event); 5598 + 5751 5599 if (event->attach_state & PERF_ATTACH_CALLCHAIN) 5752 5600 put_callchain_buffers(); 5753 5601 ··· 5803 5647 call_rcu(&event->rcu_head, free_event_rcu); 5804 5648 } 5805 5649 5650 + static void mediated_pmu_unaccount_event(struct perf_event *event); 5651 + 5806 5652 DEFINE_FREE(__free_event, struct perf_event *, if (_T) __free_event(_T)) 5807 5653 5808 5654 /* vs perf_event_alloc() success */ ··· 5814 5656 irq_work_sync(&event->pending_disable_irq); 5815 5657 5816 5658 unaccount_event(event); 5817 - 5818 - security_perf_event_free(event); 5659 + mediated_pmu_unaccount_event(event); 5819 5660 5820 5661 if (event->rb) { 5821 5662 /* ··· 6337 6180 } 6338 6181 EXPORT_SYMBOL_GPL(perf_event_pause); 6339 6182 6183 + #ifdef CONFIG_PERF_GUEST_MEDIATED_PMU 6184 + static atomic_t nr_include_guest_events __read_mostly; 6185 + 6186 + static atomic_t nr_mediated_pmu_vms __read_mostly; 6187 + static DEFINE_MUTEX(perf_mediated_pmu_mutex); 6188 + 6189 + /* !exclude_guest event of PMU with PERF_PMU_CAP_MEDIATED_VPMU */ 6190 + static inline bool is_include_guest_event(struct perf_event *event) 6191 + { 6192 + if ((event->pmu->capabilities & PERF_PMU_CAP_MEDIATED_VPMU) && 6193 + !event->attr.exclude_guest) 6194 + return true; 6195 + 6196 + return false; 6197 + } 6198 + 6199 + static int mediated_pmu_account_event(struct perf_event *event) 6200 + { 6201 
+ if (!is_include_guest_event(event)) 6202 + return 0; 6203 + 6204 + if (atomic_inc_not_zero(&nr_include_guest_events)) 6205 + return 0; 6206 + 6207 + guard(mutex)(&perf_mediated_pmu_mutex); 6208 + if (atomic_read(&nr_mediated_pmu_vms)) 6209 + return -EOPNOTSUPP; 6210 + 6211 + atomic_inc(&nr_include_guest_events); 6212 + return 0; 6213 + } 6214 + 6215 + static void mediated_pmu_unaccount_event(struct perf_event *event) 6216 + { 6217 + if (!is_include_guest_event(event)) 6218 + return; 6219 + 6220 + if (WARN_ON_ONCE(!atomic_read(&nr_include_guest_events))) 6221 + return; 6222 + 6223 + atomic_dec(&nr_include_guest_events); 6224 + } 6225 + 6226 + /* 6227 + * Currently invoked at VM creation to 6228 + * - Check whether there are existing !exclude_guest events of PMU with 6229 + * PERF_PMU_CAP_MEDIATED_VPMU 6230 + * - Set nr_mediated_pmu_vms to prevent !exclude_guest event creation on 6231 + * PMUs with PERF_PMU_CAP_MEDIATED_VPMU 6232 + * 6233 + * No impact for the PMU without PERF_PMU_CAP_MEDIATED_VPMU. The perf 6234 + * still owns all the PMU resources. 6235 + */ 6236 + int perf_create_mediated_pmu(void) 6237 + { 6238 + if (atomic_inc_not_zero(&nr_mediated_pmu_vms)) 6239 + return 0; 6240 + 6241 + guard(mutex)(&perf_mediated_pmu_mutex); 6242 + if (atomic_read(&nr_include_guest_events)) 6243 + return -EBUSY; 6244 + 6245 + atomic_inc(&nr_mediated_pmu_vms); 6246 + return 0; 6247 + } 6248 + EXPORT_SYMBOL_FOR_KVM(perf_create_mediated_pmu); 6249 + 6250 + void perf_release_mediated_pmu(void) 6251 + { 6252 + if (WARN_ON_ONCE(!atomic_read(&nr_mediated_pmu_vms))) 6253 + return; 6254 + 6255 + atomic_dec(&nr_mediated_pmu_vms); 6256 + } 6257 + EXPORT_SYMBOL_FOR_KVM(perf_release_mediated_pmu); 6258 + 6259 + /* When loading a guest's mediated PMU, schedule out all exclude_guest events. 
*/ 6260 + void perf_load_guest_context(void) 6261 + { 6262 + struct perf_cpu_context *cpuctx = this_cpu_ptr(&perf_cpu_context); 6263 + 6264 + lockdep_assert_irqs_disabled(); 6265 + 6266 + guard(perf_ctx_lock)(cpuctx, cpuctx->task_ctx); 6267 + 6268 + if (WARN_ON_ONCE(__this_cpu_read(guest_ctx_loaded))) 6269 + return; 6270 + 6271 + perf_ctx_disable(&cpuctx->ctx, EVENT_GUEST); 6272 + ctx_sched_out(&cpuctx->ctx, NULL, EVENT_GUEST); 6273 + if (cpuctx->task_ctx) { 6274 + perf_ctx_disable(cpuctx->task_ctx, EVENT_GUEST); 6275 + task_ctx_sched_out(cpuctx->task_ctx, NULL, EVENT_GUEST); 6276 + } 6277 + 6278 + perf_ctx_enable(&cpuctx->ctx, EVENT_GUEST); 6279 + if (cpuctx->task_ctx) 6280 + perf_ctx_enable(cpuctx->task_ctx, EVENT_GUEST); 6281 + 6282 + __this_cpu_write(guest_ctx_loaded, true); 6283 + } 6284 + EXPORT_SYMBOL_GPL(perf_load_guest_context); 6285 + 6286 + void perf_put_guest_context(void) 6287 + { 6288 + struct perf_cpu_context *cpuctx = this_cpu_ptr(&perf_cpu_context); 6289 + 6290 + lockdep_assert_irqs_disabled(); 6291 + 6292 + guard(perf_ctx_lock)(cpuctx, cpuctx->task_ctx); 6293 + 6294 + if (WARN_ON_ONCE(!__this_cpu_read(guest_ctx_loaded))) 6295 + return; 6296 + 6297 + perf_ctx_disable(&cpuctx->ctx, EVENT_GUEST); 6298 + if (cpuctx->task_ctx) 6299 + perf_ctx_disable(cpuctx->task_ctx, EVENT_GUEST); 6300 + 6301 + perf_event_sched_in(cpuctx, cpuctx->task_ctx, NULL, EVENT_GUEST); 6302 + 6303 + if (cpuctx->task_ctx) 6304 + perf_ctx_enable(cpuctx->task_ctx, EVENT_GUEST); 6305 + perf_ctx_enable(&cpuctx->ctx, EVENT_GUEST); 6306 + 6307 + __this_cpu_write(guest_ctx_loaded, false); 6308 + } 6309 + EXPORT_SYMBOL_GPL(perf_put_guest_context); 6310 + #else 6311 + static int mediated_pmu_account_event(struct perf_event *event) { return 0; } 6312 + static void mediated_pmu_unaccount_event(struct perf_event *event) {} 6313 + #endif 6314 + 6340 6315 /* 6341 6316 * Holding the top-level event's child_mutex means that any 6342 6317 * descendant process that has inherited this event will 
block ··· 6837 6548 goto unlock; 6838 6549 6839 6550 /* 6840 - * compute total_time_enabled, total_time_running 6841 - * based on snapshot values taken when the event 6842 - * was last scheduled in. 6843 - * 6844 - * we cannot simply called update_context_time() 6845 - * because of locking issue as we can be called in 6846 - * NMI context 6847 - */ 6848 - calc_timer_values(event, &now, &enabled, &running); 6849 - 6850 - userpg = rb->user_page; 6851 - /* 6852 6551 * Disable preemption to guarantee consistent time stamps are stored to 6853 6552 * the user page. 6854 6553 */ 6855 6554 preempt_disable(); 6555 + 6556 + /* 6557 + * Compute total_time_enabled, total_time_running based on snapshot 6558 + * values taken when the event was last scheduled in. 6559 + * 6560 + * We cannot simply call update_context_time() because doing so would 6561 + * lead to deadlock when called from NMI context. 6562 + */ 6563 + calc_timer_values(event, &now, &enabled, &running); 6564 + 6565 + userpg = rb->user_page; 6566 + 6856 6567 ++userpg->lock; 6857 6568 barrier(); 6858 6569 userpg->index = perf_event_index(event); ··· 7672 7383 DEFINE_STATIC_CALL_RET0(__perf_guest_state, *perf_guest_cbs->state); 7673 7384 DEFINE_STATIC_CALL_RET0(__perf_guest_get_ip, *perf_guest_cbs->get_ip); 7674 7385 DEFINE_STATIC_CALL_RET0(__perf_guest_handle_intel_pt_intr, *perf_guest_cbs->handle_intel_pt_intr); 7386 + DEFINE_STATIC_CALL_RET0(__perf_guest_handle_mediated_pmi, *perf_guest_cbs->handle_mediated_pmi); 7675 7387 7676 7388 void perf_register_guest_info_callbacks(struct perf_guest_info_callbacks *cbs) 7677 7389 { ··· 7687 7397 if (cbs->handle_intel_pt_intr) 7688 7398 static_call_update(__perf_guest_handle_intel_pt_intr, 7689 7399 cbs->handle_intel_pt_intr); 7400 + 7401 + if (cbs->handle_mediated_pmi) 7402 + static_call_update(__perf_guest_handle_mediated_pmi, 7403 + cbs->handle_mediated_pmi); 7690 7404 } 7691 7405 EXPORT_SYMBOL_GPL(perf_register_guest_info_callbacks); 7692 7406 ··· 7702 7408 
rcu_assign_pointer(perf_guest_cbs, NULL); 7703 7409 static_call_update(__perf_guest_state, (void *)&__static_call_return0); 7704 7410 static_call_update(__perf_guest_get_ip, (void *)&__static_call_return0); 7705 - static_call_update(__perf_guest_handle_intel_pt_intr, 7706 - (void *)&__static_call_return0); 7411 + static_call_update(__perf_guest_handle_intel_pt_intr, (void *)&__static_call_return0); 7412 + static_call_update(__perf_guest_handle_mediated_pmi, (void *)&__static_call_return0); 7707 7413 synchronize_rcu(); 7708 7414 } 7709 7415 EXPORT_SYMBOL_GPL(perf_unregister_guest_info_callbacks); ··· 8163 7869 u64 read_format = event->attr.read_format; 8164 7870 8165 7871 /* 8166 - * compute total_time_enabled, total_time_running 8167 - * based on snapshot values taken when the event 8168 - * was last scheduled in. 7872 + * Compute total_time_enabled, total_time_running based on snapshot 7873 + * values taken when the event was last scheduled in. 8169 7874 * 8170 - * we cannot simply called update_context_time() 8171 - * because of locking issue as we are called in 8172 - * NMI context 7875 + * We cannot simply call update_context_time() because doing so would 7876 + * lead to deadlock when called from NMI context. 
8173 7877 */ 8174 7878 if (read_format & PERF_FORMAT_TOTAL_TIMES) 8175 7879 calc_timer_values(event, &now, &enabled, &running); ··· 12335 12043 static void task_clock_event_start(struct perf_event *event, int flags) 12336 12044 { 12337 12045 event->hw.state = 0; 12338 - local64_set(&event->hw.prev_count, event->ctx->time); 12046 + local64_set(&event->hw.prev_count, event->ctx->time.time); 12339 12047 perf_swevent_start_hrtimer(event); 12340 12048 } 12341 12049 ··· 12344 12052 event->hw.state = PERF_HES_STOPPED; 12345 12053 perf_swevent_cancel_hrtimer(event); 12346 12054 if (flags & PERF_EF_UPDATE) 12347 - task_clock_event_update(event, event->ctx->time); 12055 + task_clock_event_update(event, event->ctx->time.time); 12348 12056 } 12349 12057 12350 12058 static int task_clock_event_add(struct perf_event *event, int flags) ··· 12364 12072 static void task_clock_event_read(struct perf_event *event) 12365 12073 { 12366 12074 u64 now = perf_clock(); 12367 - u64 delta = now - event->ctx->timestamp; 12368 - u64 time = event->ctx->time + delta; 12075 + u64 delta = now - event->ctx->time.stamp; 12076 + u64 time = event->ctx->time.time + delta; 12369 12077 12370 12078 task_clock_event_update(event, time); 12371 12079 } ··· 13444 13152 } 13445 13153 13446 13154 err = security_perf_event_alloc(event); 13155 + if (err) 13156 + return ERR_PTR(err); 13157 + 13158 + err = mediated_pmu_account_event(event); 13447 13159 if (err) 13448 13160 return ERR_PTR(err); 13449 13161 ··· 14590 14294 14591 14295 /* 14592 14296 * Detach the perf_ctx_data for the system-wide event. 14297 + * 14298 + * Done without holding global_ctx_data_rwsem; typically 14299 + * attach_global_ctx_data() will skip over this task, but otherwise 14300 + * attach_task_ctx_data() will observe PF_EXITING. 
14593 14301 */ 14594 - guard(percpu_read)(&global_ctx_data_rwsem); 14595 14302 detach_task_ctx_data(task); 14596 14303 } 14597 14304 ··· 15097 14798 ctx = &cpuctx->ctx; 15098 14799 15099 14800 mutex_lock(&ctx->mutex); 15100 - smp_call_function_single(cpu, __perf_event_exit_context, ctx, 1); 14801 + if (ctx->nr_events) 14802 + smp_call_function_single(cpu, __perf_event_exit_context, ctx, 1); 15101 14803 cpuctx->online = 0; 15102 14804 mutex_unlock(&ctx->mutex); 15103 14805 mutex_unlock(&pmus_lock);
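The time accounting introduced in the kernel/events/core.c hunks above can be modelled in a few lines of userspace C. This is a minimal sketch, not the kernel code: the names perf_time_ctx, update_perf_time_ctx, T_TOTAL and T_GUEST mirror the diff, while event_time() and event_time_now() are illustrative stand-ins for __perf_event_time_ctx() and __perf_event_time_ctx_now(). The exclude_guest and guest_loaded parameters are assumptions that replace event->attr.exclude_guest and is_guest_mediated_pmu_loaded(), and READ_ONCE()/WRITE_ONCE() and ctx->lock are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define T_TOTAL 0
#define T_GUEST 1

struct perf_time_ctx {
	uint64_t time;   /* accumulated time */
	uint64_t stamp;  /* clock value at the last update */
	uint64_t offset; /* time - stamp, read by lockless users */
};

/* time' = time + (now - stamp), re-arranged as time' = now + (time - stamp) */
static inline void update_perf_time_ctx(struct perf_time_ctx *t, uint64_t now, bool adv)
{
	assert(now >= t->stamp); /* the clock is assumed monotonic */
	if (adv)
		t->time += now - t->stamp;
	t->stamp = now;
	t->offset = t->time - t->stamp;
}

/* Snapshot read: exclude_guest events do not count guest time. */
static inline uint64_t event_time(const struct perf_time_ctx *times, bool exclude_guest)
{
	uint64_t time = times[T_TOTAL].time;

	if (exclude_guest)
		time -= times[T_GUEST].time;
	return time;
}

/* Lockless read at clock value 'now', using only the precomputed offsets. */
static inline uint64_t event_time_now(const struct perf_time_ctx *times, uint64_t now,
				      bool exclude_guest, bool guest_loaded)
{
	/* (now + total.offset) - (now + guest.offset): 'now' cancels out */
	if (guest_loaded && exclude_guest)
		return times[T_TOTAL].offset - times[T_GUEST].offset;
	return now + times[T_TOTAL].offset;
}
```

With a context started at clock 100 and a guest loaded at clock 150, a read at clock 170 gives 50 for an exclude_guest event (its time is frozen while the guest runs) and 70 for any other event. The offsets are negative values stored in unsigned fields, so the arithmetic relies on unsigned wrap-around.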
+14 -10
kernel/events/uprobes.c
···

 void uprobe_copy_from_page(struct page *page, unsigned long vaddr, void *dst, int len)
 {
-	void *kaddr = kmap_atomic(page);
+	void *kaddr = kmap_local_page(page);
 	memcpy(dst, kaddr + (vaddr & ~PAGE_MASK), len);
-	kunmap_atomic(kaddr);
+	kunmap_local(kaddr);
 }

 static void copy_to_page(struct page *page, unsigned long vaddr, const void *src, int len)
 {
-	void *kaddr = kmap_atomic(page);
+	void *kaddr = kmap_local_page(page);
 	memcpy(kaddr + (vaddr & ~PAGE_MASK), src, len);
-	kunmap_atomic(kaddr);
+	kunmap_local(kaddr);
 }

 static int verify_opcode(struct page *page, unsigned long vaddr, uprobe_opcode_t *insn,
···
 		return ret == 0 ? -EBUSY : ret;
 	}

-	kaddr = kmap_atomic(page);
+	kaddr = kmap_local_page(page);
 	ptr = kaddr + (vaddr & ~PAGE_MASK);

 	if (unlikely(*ptr + d < 0)) {
···
 	*ptr += d;
 	ret = 0;
 out:
-	kunmap_atomic(kaddr);
+	kunmap_local(kaddr);
 	put_page(page);
 	return ret;
 }
···
 	bool ret = false;

 	down_read(&uprobe->consumer_rwsem);
-	list_for_each_entry_rcu(uc, &uprobe->consumers, cons_node, rcu_read_lock_trace_held()) {
+	list_for_each_entry(uc, &uprobe->consumers, cons_node) {
 		ret = consumer_filter(uc, mm);
 		if (ret)
 			break;
···
 	.mremap = xol_mremap,
 };

+unsigned long __weak arch_uprobe_get_xol_area(void)
+{
+	/* Try to map as high as possible, this is only a hint. */
+	return get_unmapped_area(NULL, TASK_SIZE - PAGE_SIZE, PAGE_SIZE, 0, 0);
+}
+
 /* Slot allocation for XOL */
 static int xol_add_vma(struct mm_struct *mm, struct xol_area *area)
 {
···
 	}

 	if (!area->vaddr) {
-		/* Try to map as high as possible, this is only a hint. */
-		area->vaddr = get_unmapped_area(NULL, TASK_SIZE - PAGE_SIZE,
-						PAGE_SIZE, 0, 0);
+		area->vaddr = arch_uprobe_get_xol_area();
 		if (IS_ERR_VALUE(area->vaddr)) {
 			ret = area->vaddr;
 			goto fail;
+4 -8
kernel/unwind/user.c
···
 {
 	unsigned long cfa, fp, ra;

+	/* Get the Canonical Frame Address (CFA) */
 	if (frame->use_fp) {
 		if (state->fp < state->sp)
 			return -EINVAL;
···
 	} else {
 		cfa = state->sp;
 	}
-
-	/* Get the Canonical Frame Address (CFA) */
 	cfa += frame->cfa_off;

-	/* stack going in wrong direction? */
+	/* Make sure that stack is not going in wrong direction */
 	if (cfa <= state->sp)
 		return -EINVAL;

···
 	if (cfa & (state->ws - 1))
 		return -EINVAL;

-	/* Find the Return Address (RA) */
+	/* Get the Return Address (RA) */
 	if (get_user_word(&ra, cfa, frame->ra_off, state->ws))
 		return -EINVAL;

+	/* Get the Frame Pointer (FP) */
 	if (frame->fp_off && get_user_word(&fp, cfa, frame->fp_off, state->ws))
 		return -EINVAL;

···

 static int unwind_user_next_fp(struct unwind_user_state *state)
 {
-#ifdef CONFIG_HAVE_UNWIND_USER_FP
 	struct pt_regs *regs = task_pt_regs(current);

 	if (state->topmost && unwind_user_at_function_start(regs)) {
···
 		ARCH_INIT_USER_FP_FRAME(state->ws)
 	};
 	return unwind_user_next_common(state, &fp_frame);
-#else
-	return -EINVAL;
-#endif
 }

 static int unwind_user_next(struct unwind_user_state *state)
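The CFA, RA and FP steps that the comments in kernel/unwind/user.c now label can be tried out against a simulated stack. This is a sketch under stated assumptions, not the kernel implementation: stack_mem, STACK_BASE, get_word() and unwind_next() are hypothetical names, the word size is fixed at 8 bytes, the `cfa = st->fp` assignment and the final state update fall in parts of the function that the hunks do not show and are inferred, and the frame layout used in the usage note (cfa_off = 16, ra_off = -8, fp_off = -16) is an assumed x86-64 style frame-pointer layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define STACK_BASE  0x1000UL /* hypothetical address of stack_mem[0] */
#define STACK_WORDS 16

static uint64_t stack_mem[STACK_WORDS]; /* simulated user stack */

struct frame { int32_t cfa_off, ra_off, fp_off; bool use_fp; };
struct state { uint64_t ip, sp, fp; unsigned int ws; };

/* Stand-in for get_user_word(): read one 8-byte word, or fail. */
static inline int get_word(uint64_t *out, uint64_t base, int32_t off)
{
	uint64_t addr = base + off;

	if (addr < STACK_BASE || addr >= STACK_BASE + sizeof(stack_mem) || (addr & 7))
		return -1;
	*out = stack_mem[(addr - STACK_BASE) / 8];
	return 0;
}

static inline int unwind_next(struct state *st, const struct frame *fr)
{
	uint64_t cfa, fp = 0, ra;

	assert(st->ws && !(st->ws & (st->ws - 1))); /* word size is a power of two */

	/* Get the Canonical Frame Address (CFA) */
	if (fr->use_fp) {
		if (st->fp < st->sp)
			return -1;
		cfa = st->fp; /* inferred: this line is outside the hunks */
	} else {
		cfa = st->sp;
	}
	cfa += fr->cfa_off;

	/* Make sure that stack is not going in wrong direction */
	if (cfa <= st->sp)
		return -1;

	/* The CFA must be word aligned */
	if (cfa & (st->ws - 1))
		return -1;

	/* Get the Return Address (RA) */
	if (get_word(&ra, cfa, fr->ra_off))
		return -1;

	/* Get the Frame Pointer (FP) */
	if (fr->fp_off && get_word(&fp, cfa, fr->fp_off))
		return -1;

	/* Assumed state update: move to the caller's frame. */
	st->ip = ra;
	st->sp = cfa;
	if (fr->fp_off)
		st->fp = fp;
	return 0;
}
```

With the saved frame pointer stored at 0x1020 and the return address at 0x1028, a state of sp = 0x1010, fp = 0x1020 unwinds to sp = 0x1030 and picks up both saved values. A state whose fp is below sp is rejected before any memory is read.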
+24 -3
tools/include/uapi/linux/perf_event.h
···
 			mem_snoopx  :  2, /* Snoop mode, ext */
 			mem_blk     :  3, /* Access blocked */
 			mem_hops    :  3, /* Hop level */
-			mem_rsvd    : 18;
+			mem_region  :  5, /* cache/memory regions */
+			mem_rsvd    : 13;
 	};
 };
 #elif defined(__BIG_ENDIAN_BITFIELD)
 union perf_mem_data_src {
 	__u64 val;
 	struct {
-		__u64	mem_rsvd    : 18,
+		__u64	mem_rsvd    : 13,
+			mem_region  :  5, /* cache/memory regions */
 			mem_hops    :  3, /* Hop level */
 			mem_blk     :  3, /* Access blocked */
 			mem_snoopx  :  2, /* Snoop mode, ext */
···
 #define PERF_MEM_LVLNUM_L4			0x0004 /* L4 */
 #define PERF_MEM_LVLNUM_L2_MHB			0x0005 /* L2 Miss Handling Buffer */
 #define PERF_MEM_LVLNUM_MSC			0x0006 /* Memory-side Cache */
-/* 0x007 available */
+#define PERF_MEM_LVLNUM_L0			0x0007 /* L0 */
 #define PERF_MEM_LVLNUM_UNC			0x0008 /* Uncached */
 #define PERF_MEM_LVLNUM_CXL			0x0009 /* CXL */
 #define PERF_MEM_LVLNUM_IO			0x000a /* I/O */
···
 #define PERF_MEM_HOPS_3				0x0004 /* Remote board */
 /* 5-7 available */
 #define PERF_MEM_HOPS_SHIFT			43
+
+/* Cache/Memory region */
+#define PERF_MEM_REGION_NA			0x0  /* Invalid */
+#define PERF_MEM_REGION_RSVD			0x01 /* Reserved */
+#define PERF_MEM_REGION_L_SHARE			0x02 /* Local CA shared cache */
+#define PERF_MEM_REGION_L_NON_SHARE		0x03 /* Local CA non-shared cache */
+#define PERF_MEM_REGION_O_IO			0x04 /* Other CA IO agent */
+#define PERF_MEM_REGION_O_SHARE			0x05 /* Other CA shared cache */
+#define PERF_MEM_REGION_O_NON_SHARE		0x06 /* Other CA non-shared cache */
+#define PERF_MEM_REGION_MMIO			0x07 /* MMIO */
+#define PERF_MEM_REGION_MEM0			0x08 /* Memory region 0 */
+#define PERF_MEM_REGION_MEM1			0x09 /* Memory region 1 */
+#define PERF_MEM_REGION_MEM2			0x0a /* Memory region 2 */
+#define PERF_MEM_REGION_MEM3			0x0b /* Memory region 3 */
+#define PERF_MEM_REGION_MEM4			0x0c /* Memory region 4 */
+#define PERF_MEM_REGION_MEM5			0x0d /* Memory region 5 */
+#define PERF_MEM_REGION_MEM6			0x0e /* Memory region 6 */
+#define PERF_MEM_REGION_MEM7			0x0f /* Memory region 7 */
+#define PERF_MEM_REGION_SHIFT			46

 #define PERF_MEM_S(a, s) \
 	(((__u64)PERF_MEM_##a##_##s) << PERF_MEM_##a##_SHIFT)
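The new mem_region field takes 5 of the formerly reserved bits, directly above mem_hops, so it starts at bit 46 (PERF_MEM_REGION_SHIFT). The sketch below shows how a value could be encoded with PERF_MEM_S() and read back. The constants are copied from the hunk above; uint64_t stands in for __u64, and MEM_REGION_MASK, mem_region_of() and set_mem_region() are hypothetical helpers that are not part of the header.

```c
#include <assert.h>
#include <stdint.h>

/* Constants copied from the hunk above. */
#define PERF_MEM_HOPS_3        0x0004
#define PERF_MEM_HOPS_SHIFT    43
#define PERF_MEM_REGION_MMIO   0x07
#define PERF_MEM_REGION_MEM0   0x08
#define PERF_MEM_REGION_SHIFT  46

#define PERF_MEM_S(a, s) \
	(((uint64_t)PERF_MEM_##a##_##s) << PERF_MEM_##a##_SHIFT)

/* Hypothetical helper: mem_region is 5 bits wide. */
#define MEM_REGION_MASK 0x1fULL

/* Extract the region from a raw perf_mem_data_src value. */
static inline uint64_t mem_region_of(uint64_t val)
{
	return (val >> PERF_MEM_REGION_SHIFT) & MEM_REGION_MASK;
}

/* Replace the region bits and leave every other field untouched. */
static inline uint64_t set_mem_region(uint64_t val, uint64_t region)
{
	assert(region <= MEM_REGION_MASK);
	val &= ~(MEM_REGION_MASK << PERF_MEM_REGION_SHIFT);
	return val | (region << PERF_MEM_REGION_SHIFT);
}
```

For example, PERF_MEM_S(HOPS, 3) | PERF_MEM_S(REGION, MEM0) yields a value whose hops field (bits 43 to 45) is 4 and whose region field (bits 46 to 50) is 8, since the two fields do not overlap.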
+2 -1
tools/perf/trace/beauty/arch/x86/include/asm/irq_vectors.h
···
  */
 #define IRQ_WORK_VECTOR			0xf6

-/* 0xf5 - unused, was UV_BAU_MESSAGE */
+#define PERF_GUEST_MEDIATED_PMI_VECTOR	0xf5
+
 #define DEFERRED_ERROR_VECTOR		0xf4

 /* Vector on which hypervisor callbacks will be delivered */
+8 -6
tools/perf/util/pmu.c
···
 {
 	const char *p, *suffix;
 	bool has_hex = false;
+	bool has_underscore = false;
 	size_t tok_len = strlen(tok);

 	/* Check start of pmu_name for equality. */
···
 	if (*p == 0)
 		return true;

-	if (*p == '_') {
-		++p;
-		++suffix;
-	}
-
-	/* Ensure we end in a number */
+	/* Ensure we end in a number or a mix of number and "_". */
 	while (1) {
+		if (!has_underscore && (*p == '_')) {
+			has_underscore = true;
+			++p;
+			++suffix;
+		}
+
 		if (!isxdigit(*p))
 			return false;
 		if (!has_hex)
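The effect of moving the underscore check inside the loop is easier to see in isolation. The sketch below is a simplified stand-in, not the function from tools/perf/util/pmu.c: suffix_is_numbered() is a hypothetical name, it takes only the part of the PMU name that follows the matched token, and it leaves out the has_hex handling because the rest of that logic is not visible in the hunks.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Accept an empty suffix, or hex digits with at most one '_' that may
 * appear before any digit group (simplified from the hunk above).
 */
static inline bool suffix_is_numbered(const char *p)
{
	bool has_underscore = false;

	assert(p != NULL);

	if (*p == '\0')
		return true; /* the name matched the token exactly */

	while (1) {
		/* Skip a single underscore, wherever it first appears. */
		if (!has_underscore && *p == '_') {
			has_underscore = true;
			++p;
		}

		if (!isxdigit((unsigned char)*p))
			return false;
		if (*(++p) == '\0')
			return true;
	}
}
```

Before the change only a leading underscore was skipped, so a suffix such as "0_1" was rejected at the underscore. In this sketch "_0" and "0_1" are both accepted, while "_0_1" (two underscores) and "0_" (no digit after the underscore) are rejected.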
+3
virt/kvm/kvm_main.c
···
 	.state = kvm_guest_state,
 	.get_ip = kvm_guest_get_ip,
 	.handle_intel_pt_intr = NULL,
+	.handle_mediated_pmi = NULL,
 };

 void kvm_register_perf_callbacks(unsigned int (*pt_intr_handler)(void))
 {
 	kvm_guest_cbs.handle_intel_pt_intr = pt_intr_handler;
+	kvm_guest_cbs.handle_mediated_pmi = NULL;
+
 	perf_register_guest_info_callbacks(&kvm_guest_cbs);
 }
 void kvm_unregister_perf_callbacks(void)