Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

perf/x86/intel: Add support for rdpmc user disable feature

Starting with Panther Cove, the rdpmc user disable feature is supported.
This feature allows the perf system to disable user space rdpmc reads at
the counter level.

Currently, when a global counter is active, any user with rdpmc rights
can read it, even if perf access permissions forbid it (e.g., disallow
reading ring 0 counters). The rdpmc user disable feature mitigates this
security concern.

Details:

- A new RDPMC_USR_DISABLE bit (bit 37) in each EVNTSELx MSR indicates
  that the GP counter cannot be read by RDPMC in ring 3.
- New RDPMC_USR_DISABLE bits in the IA32_FIXED_CTR_CTRL MSR (bits 33, 37,
  41, 45, etc.) for fixed counters 0, 1, 2, 3, etc.
- When the rdpmc instruction is executed for counter x, the following
  pseudo code shows which value is returned:
      value = (!CPL0 && RDPMC_USR_DISABLE[x] == 1) ? 0 : counter_value;
- RDPMC_USR_DISABLE is enumerated by CPUID.0x23.0.EBX[2].
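
The read rule and the enumeration bit listed above can be modeled in plain
C. This is only an illustrative sketch, not kernel code: the helper names
has_rdpmc_user_disable() and rdpmc_model() and both macros are made up for
this example, and the model ignores the global user space rdpmc control
that still applies on top of the per-counter bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit 37 of each EVNTSELx MSR, as stated in the commit message. */
#define EVNTSEL_RDPMC_USR_DISABLE	(1ULL << 37)
/* CPUID.0x23.0.EBX[2] enumerates the feature. */
#define CPUID23_EBX_RDPMC_USR_DISABLE	(1U << 2)

/*
 * Hypothetical helper: report whether an EBX value read from CPUID leaf
 * 0x23, subleaf 0 advertises rdpmc user disable.
 */
static inline bool has_rdpmc_user_disable(uint32_t ebx)
{
	return (ebx & CPUID23_EBX_RDPMC_USR_DISABLE) != 0;
}

/*
 * Hypothetical model of the value rdpmc returns for a GP counter:
 * outside ring 0, a counter whose RDPMC_USR_DISABLE bit is set reads
 * as 0; otherwise the real counter value is returned.
 */
static inline uint64_t rdpmc_model(unsigned int cpl, uint64_t evntsel,
				   uint64_t counter_value)
{
	bool usr_disable = (evntsel & EVNTSEL_RDPMC_USR_DISABLE) != 0;

	return (cpl != 0 && usr_disable) ? 0 : counter_value;
}

/* Small self-check of the three interesting cases. */
static inline void rdpmc_model_self_check(void)
{
	/* Ring 3 read of a protected counter yields 0. */
	assert(rdpmc_model(3, EVNTSEL_RDPMC_USR_DISABLE, 1234) == 0);
	/* Ring 0 always sees the real value. */
	assert(rdpmc_model(0, EVNTSEL_RDPMC_USR_DISABLE, 1234) == 1234);
	/* Ring 3 read of an unprotected counter sees the real value. */
	assert(rdpmc_model(3, 0, 1234) == 1234);
}
```

Calling rdpmc_model_self_check() from a test program exercises the three
cases of the pseudo code.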

This patch extends the current global user space rdpmc control logic via
the sysfs interface (/sys/devices/cpu/rdpmc) as follows:

- rdpmc = 0:
  Global user space rdpmc and counter-level user space rdpmc are both
  disabled for all counters.
- rdpmc = 1:
  Global user space rdpmc is enabled only while an event's mmap region is
  mapped, and counter-level user space rdpmc is enabled only for
  non-system-wide events. This does not leak counter data, because the
  count data of non-system-wide events is cleared on context switch.
- rdpmc = 2:
  Global user space rdpmc and counter-level user space rdpmc are both
  enabled unconditionally for all counters.
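
The counter-level half of this table reduces to a small decision function.
The sketch below restates it outside the kernel; the enum, its constant
names, and counter_user_rdpmc_disabled() are invented for illustration,
and task_bound stands in for "the event is not system-wide".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for the three sysfs rdpmc values. */
enum user_rdpmc_mode {
	USER_RDPMC_NEVER	= 0,
	USER_RDPMC_CONDITIONAL	= 1,
	USER_RDPMC_ALWAYS	= 2,
};

/*
 * Return true when RDPMC_USR_DISABLE should be set for a counter.
 * The bit is set by default and cleared in only two cases:
 *   a. rdpmc = 2
 *   b. rdpmc = 1 and the event is bound to a task (not system-wide)
 */
static inline bool counter_user_rdpmc_disabled(enum user_rdpmc_mode mode,
					       bool task_bound)
{
	if (mode == USER_RDPMC_ALWAYS)
		return false;
	if (mode == USER_RDPMC_CONDITIONAL && task_bound)
		return false;
	return true;
}

/* Walk the full table: three modes times two event scopes. */
static inline void rdpmc_policy_self_check(void)
{
	assert(counter_user_rdpmc_disabled(USER_RDPMC_NEVER, false));
	assert(counter_user_rdpmc_disabled(USER_RDPMC_NEVER, true));
	assert(counter_user_rdpmc_disabled(USER_RDPMC_CONDITIONAL, false));
	assert(!counter_user_rdpmc_disabled(USER_RDPMC_CONDITIONAL, true));
	assert(!counter_user_rdpmc_disabled(USER_RDPMC_ALWAYS, false));
	assert(!counter_user_rdpmc_disabled(USER_RDPMC_ALWAYS, true));
}
```

Defaulting to "disabled" means any value other than 1 or 2 fails closed,
which matches the intent of rdpmc = 0.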

A new rdpmc value only affects the counter-level control of perf events
activated after the change; already active perf events keep their current
setting until they are reactivated. Not rescheduling active events keeps
the code simple. The default value of rdpmc remains unchanged at 1.

For more details about rdpmc user disable, please refer to chapter 15
"RDPMC USER DISABLE" in the Intel ISE (Instruction Set Extensions)
documentation.

Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20260114011750.350569-8-dapeng1.mi@linux.intel.com

Authored by Dapeng Mi and committed by Peter Zijlstra
59af95e0 8c74e4e3

+104 -2
+44
Documentation/ABI/testing/sysfs-bus-event_source-devices-rdpmc
···
+What:		/sys/bus/event_source/devices/cpu.../rdpmc
+Date:		November 2011
+KernelVersion:	3.10
+Contact:	Linux kernel mailing list linux-kernel@vger.kernel.org
+Description:	The /sys/bus/event_source/devices/cpu.../rdpmc attribute
+		is used to show/manage if rdpmc instruction can be
+		executed in user space. This attribute supports 3 numbers.
+		- rdpmc = 0
+		user space rdpmc is globally disabled for all PMU
+		counters.
+		- rdpmc = 1
+		user space rdpmc is globally enabled only in event mmap
+		ioctl called time window. If the mmap region is unmapped,
+		user space rdpmc is disabled again.
+		- rdpmc = 2
+		user space rdpmc is globally enabled for all PMU
+		counters.
+
+		In the Intel platforms supporting counter level's user
+		space rdpmc disable feature (CPUID.23H.EBX[2] = 1), the
+		meaning of 3 numbers is extended to
+		- rdpmc = 0
+		global user space rdpmc and counter level's user space
+		rdpmc of all counters are both disabled.
+		- rdpmc = 1
+		No changes on behavior of global user space rdpmc.
+		counter level's rdpmc of system-wide events is disabled
+		but counter level's rdpmc of non-system-wide events is
+		enabled.
+		- rdpmc = 2
+		global user space rdpmc and counter level's user space
+		rdpmc of all counters are both enabled unconditionally.
+
+		The default value of rdpmc is 1.
+
+		Please notice:
+		- global user space rdpmc's behavior would change
+		immediately along with the rdpmc value's change,
+		but the behavior of counter level's user space rdpmc
+		won't take effect immediately until the event is
+		reactivated or recreated.
+		- The rdpmc attribute is global, even for x86 hybrid
+		platforms. For example, changing cpu_core/rdpmc will
+		also change cpu_atom/rdpmc.
+21
arch/x86/events/core.c
···
 	return snprintf(buf, 40, "%d\n", x86_pmu.attr_rdpmc);
 }
 
+/*
+ * Behaviors of rdpmc value:
+ * - rdpmc = 0
+ *    global user space rdpmc and counter level's user space rdpmc of all
+ *    counters are both disabled.
+ * - rdpmc = 1
+ *    global user space rdpmc is enabled in mmap enabled time window and
+ *    counter level's user space rdpmc is enabled for only non system-wide
+ *    events. Counter level's user space rdpmc of system-wide events is
+ *    still disabled by default. This won't introduce counter data leak for
+ *    non system-wide events since their count data would be cleared when
+ *    context switches.
+ * - rdpmc = 2
+ *    global user space rdpmc and counter level's user space rdpmc of all
+ *    counters are enabled unconditionally.
+ *
+ * Suppose the rdpmc value won't be changed frequently, don't dynamically
+ * reschedule events to make the new rpdmc value take effect on active perf
+ * events immediately, the new rdpmc value would only impact the new
+ * activated perf events. This makes code simpler and cleaner.
+ */
 static ssize_t set_attr_rdpmc(struct device *cdev,
 			      struct device_attribute *attr,
 			      const char *buf, size_t count)
+27
arch/x86/events/intel/core.c
···
 		bits |= INTEL_FIXED_0_USER;
 	if (hwc->config & ARCH_PERFMON_EVENTSEL_OS)
 		bits |= INTEL_FIXED_0_KERNEL;
+	if (hwc->config & ARCH_PERFMON_EVENTSEL_RDPMC_USER_DISABLE)
+		bits |= INTEL_FIXED_0_RDPMC_USER_DISABLE;
 
 	/*
 	 * ANY bit is supported in v3 and up
···
 		__intel_pmu_update_event_ext(hwc->idx, ext);
 }
 
+static void intel_pmu_update_rdpmc_user_disable(struct perf_event *event)
+{
+	if (!x86_pmu_has_rdpmc_user_disable(event->pmu))
+		return;
+
+	/*
+	 * Counter scope's user-space rdpmc is disabled by default
+	 * except two cases.
+	 * a. rdpmc = 2 (user space rdpmc enabled unconditionally)
+	 * b. rdpmc = 1 and the event is not a system-wide event.
+	 *    The count of non-system-wide events would be cleared when
+	 *    context switches, so no count data is leaked.
+	 */
+	if (x86_pmu.attr_rdpmc == X86_USER_RDPMC_ALWAYS_ENABLE ||
+	    (x86_pmu.attr_rdpmc == X86_USER_RDPMC_CONDITIONAL_ENABLE &&
+	     event->ctx->task))
+		event->hw.config &= ~ARCH_PERFMON_EVENTSEL_RDPMC_USER_DISABLE;
+	else
+		event->hw.config |= ARCH_PERFMON_EVENTSEL_RDPMC_USER_DISABLE;
+}
+
 DEFINE_STATIC_CALL_NULL(intel_pmu_enable_event_ext, intel_pmu_enable_event_ext);
 
 static void intel_pmu_enable_event(struct perf_event *event)
···
 	u64 enable_mask = ARCH_PERFMON_EVENTSEL_ENABLE;
 	struct hw_perf_event *hwc = &event->hw;
 	int idx = hwc->idx;
+
+	intel_pmu_update_rdpmc_user_disable(event);
 
 	if (unlikely(event->attr.precise_ip))
 		static_call(x86_pmu_pebs_enable)(event);
···
 		hybrid(pmu, config_mask) |= ARCH_PERFMON_EVENTSEL_UMASK2;
 	if (ebx_0.split.eq)
 		hybrid(pmu, config_mask) |= ARCH_PERFMON_EVENTSEL_EQ;
+	if (ebx_0.split.rdpmc_user_disable)
+		hybrid(pmu, config_mask) |= ARCH_PERFMON_EVENTSEL_RDPMC_USER_DISABLE;
 
 	if (eax_0.split.cntr_subleaf) {
 		cpuid_count(ARCH_PERFMON_EXT_LEAF, ARCH_PERFMON_NUM_COUNTER_LEAF,
+6
arch/x86/events/perf_event.h
···
 	return event->attr.config & hybrid(event->pmu, config_mask);
 }
 
+static inline bool x86_pmu_has_rdpmc_user_disable(struct pmu *pmu)
+{
+	return !!(hybrid(pmu, config_mask) &
+		  ARCH_PERFMON_EVENTSEL_RDPMC_USER_DISABLE);
+}
+
 extern struct event_constraint emptyconstraint;
 
 extern struct event_constraint unconstrained;
+6 -2
arch/x86/include/asm/perf_event.h
···
 #define ARCH_PERFMON_EVENTSEL_CMASK 0xFF000000ULL
 #define ARCH_PERFMON_EVENTSEL_BR_CNTR (1ULL << 35)
 #define ARCH_PERFMON_EVENTSEL_EQ (1ULL << 36)
+#define ARCH_PERFMON_EVENTSEL_RDPMC_USER_DISABLE (1ULL << 37)
 #define ARCH_PERFMON_EVENTSEL_UMASK2 (0xFFULL << 40)
 
 #define INTEL_FIXED_BITS_STRIDE 4
···
 #define INTEL_FIXED_0_USER (1ULL << 1)
 #define INTEL_FIXED_0_ANYTHREAD (1ULL << 2)
 #define INTEL_FIXED_0_ENABLE_PMI (1ULL << 3)
+#define INTEL_FIXED_0_RDPMC_USER_DISABLE (1ULL << 33)
 #define INTEL_FIXED_3_METRICS_CLEAR (1ULL << 2)
 
 #define HSW_IN_TX (1ULL << 32)
···
 #define INTEL_FIXED_BITS_MASK \
 	(INTEL_FIXED_0_KERNEL | INTEL_FIXED_0_USER | \
 	 INTEL_FIXED_0_ANYTHREAD | INTEL_FIXED_0_ENABLE_PMI | \
-	 ICL_FIXED_0_ADAPTIVE)
+	 ICL_FIXED_0_ADAPTIVE | INTEL_FIXED_0_RDPMC_USER_DISABLE)
 
 #define intel_fixed_bits_by_idx(_idx, _bits) \
 	((_bits) << ((_idx) * INTEL_FIXED_BITS_STRIDE))
···
 		unsigned int umask2:1;
 		/* EQ-bit Supported */
 		unsigned int eq:1;
-		unsigned int reserved:30;
+		/* rdpmc user disable Supported */
+		unsigned int rdpmc_user_disable:1;
+		unsigned int reserved:29;
 	} split;
 	unsigned int full;
 };
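
The fixed-counter bit positions quoted in the commit message (33, 37, 41,
45, ...) follow from shifting INTEL_FIXED_0_RDPMC_USER_DISABLE by the
per-counter stride. The sketch below copies the three relevant macros from
the diff so it compiles on its own; fixed_rdpmc_usr_disable_mask() is a
made-up helper for illustration and does not exist in the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Copied from arch/x86/include/asm/perf_event.h as shown in the diff. */
#define INTEL_FIXED_BITS_STRIDE			4
#define INTEL_FIXED_0_RDPMC_USER_DISABLE	(1ULL << 33)
#define intel_fixed_bits_by_idx(_idx, _bits) \
	((_bits) << ((_idx) * INTEL_FIXED_BITS_STRIDE))

/*
 * Hypothetical helper: IA32_FIXED_CTR_CTRL mask of the RDPMC_USR_DISABLE
 * bit for fixed counter idx. Only meaningful while 33 + 4 * idx < 64,
 * that is, for idx 0..7.
 */
static inline uint64_t fixed_rdpmc_usr_disable_mask(unsigned int idx)
{
	return intel_fixed_bits_by_idx(idx, INTEL_FIXED_0_RDPMC_USER_DISABLE);
}

/* Check the first four counters against the documented bit numbers. */
static inline void fixed_bits_self_check(void)
{
	assert(fixed_rdpmc_usr_disable_mask(0) == (1ULL << 33));
	assert(fixed_rdpmc_usr_disable_mask(1) == (1ULL << 37));
	assert(fixed_rdpmc_usr_disable_mask(2) == (1ULL << 41));
	assert(fixed_rdpmc_usr_disable_mask(3) == (1ULL << 45));
}
```

Because the new bit is also part of INTEL_FIXED_BITS_MASK, the same shift
applies when the kernel clears a fixed counter's control field.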