Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge tag 'turbostat-v2025.12.02' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux

Pull turbostat updates from Len Brown:

- Add LLC statistics columns:
    LLCkRPS = Last Level Cache Thousands of References Per Second
    LLC%hit = Last Level Cache Hit %

- Recognize Wildcat Lake and Nova Lake platforms

- Add MSR check for Android

- Add APERF check for VMWARE

- Add RAPL check for AWS

- Minor fixes to turbostat (and x86_energy_perf_policy)

* tag 'turbostat-v2025.12.02' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux: (21 commits)
tools/power turbostat: version 2025.12.02
tools/power turbostat: Print wide names only for RAW 64-bit columns
tools/power turbostat: Print percentages in 8-columns
tools/power turbostat: Print "nan" for out of range percentages
tools/power turbostat: Validate APERF access for VMWARE
tools/power turbostat: Enhance perf probe
tools/power turbostat: Validate RAPL MSRs for AWS Nitro Hypervisor
tools/power x86_energy_perf_policy: Fix potential NULL pointer dereference
tools/power x86_energy_perf_policy: Fix format string in error message
tools/power x86_energy_perf_policy: Simplify Android MSR probe
tools/power x86_energy_perf_policy: Add Android MSR device support
tools/power turbostat: Add run-time MSR driver probe
tools/power turbostat: Set per_cpu_msr_sum to NULL after free
tools/power turbostat: Add LLC stats
tools/power turbostat: Remove dead code
tools/power turbostat: Refactor floating point printout code
tools/power turbostat.8: Update example
tools/power turbostat: Refactor added-counter value printing code
tools/power turbostat: Refactor added column header printing
tools/power turbostat: Add Wildcat Lake and Nova Lake support
...

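The Android MSR entries in this pull come down to choosing between two device paths at run time. The turbostat.c change in this merge replaces a compile-time `#if defined(ANDROID)` with the `use_android_msr_path` flag. A minimal sketch of that selection follows; `format_msr_path()` is a hypothetical helper written for illustration and is not a function in turbostat, and it uses `snprintf()` where the diff uses `sprintf()`.

```c
#include <stdio.h>

/*
 * Hypothetical helper (not part of turbostat): build the per-CPU MSR
 * device path. The two format strings are the ones used in the diff;
 * the flag is passed as a parameter here instead of being a global.
 */
static int format_msr_path(char *buf, size_t len, int cpu, int use_android_msr_path)
{
	/* Android exposes /dev/msrN; other systems expose /dev/cpu/N/msr */
	return snprintf(buf, len,
			use_android_msr_path ? "/dev/msr%d" : "/dev/cpu/%d/msr",
			cpu);
}
```

Choosing the path at run time lets one binary serve both layouts. How the flag gets set (the "run-time MSR driver probe" commit) is not shown in this excerpt, so it is left out of the sketch.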
+660 -619
tools/power/x86/turbostat/turbostat.8 (+15 -12)
···
 .PP
 \fB--show column\fP show only the specified built-in columns. May be invoked multiple times, or with a comma-separated list of column names.
 .PP
-\fB--show CATEGORY --hide CATEGORY\fP Show and hide also accept a single CATEGORY of columns: "all", "topology", "idle", "frequency", "power", "cpuidle", "hwidle", "swidle", "other". "idle" (enabled by default), includes "hwidle" and "pct_idle". "cpuidle" (default disabled) includes cpuidle software invocation counters. "swidle" includes "cpuidle" plus "pct_idle". "hwidle" includes only hardware based idle residency counters. Older versions of turbostat used the term "sysfs" for what is now "swidle".
+\fB--show CATEGORY --hide CATEGORY\fP Show and hide also accept a comma-separated-list of CATEGORIES of columns: "all", "topology", "idle", "frequency", "power", "cpuidle", "hwidle", "swidle", "cache", "llc", "other". "idle" (enabled by default), includes "hwidle" and "pct_idle". "cpuidle" (default disabled) includes cpuidle software invocation counters. "swidle" includes "cpuidle" plus "pct_idle". "hwidle" includes only hardware based idle residency counters. Older versions of turbostat used the term "sysfs" for what is now "swidle".
 .PP
 \fB--Dump\fP displays the raw counter values.
 .PP
···
 \fBIRQ\fP The number of interrupts serviced by that CPU during the measurement interval. The system total line is the sum of interrupts serviced across all CPUs. turbostat parses /proc/interrupts to generate this summary.
 .PP
 \fBSMI\fP The number of System Management Interrupts serviced CPU during the measurement interval. While this counter is actually per-CPU, SMI are triggered on all processors, so the number should be the same for all CPUs.
+.PP
+\fBLLCkRPS\fP Last Level Cache Thousands of References Per Second. For CPUs with an L3 LLC, this is the number of references that CPU made to the L3 (and the number of misses that CPU made to it's L2). For CPUs with an L2 LLC, this is the number of references to the L2 (and the number of misses to the CPU's L1). The system summary row shows the sum for all CPUs. In both cases, the value displayed is the actual value divided by 1000 in the interest of usually fitting into 8 columns.
+.PP
+\fBLLC%hit\fP Last Level Cache Hit Rate %. Hit Rate Percent = 100.0 * (References - Misses)/References. The system summary row shows the weighted average for all CPUs (100.0 * (Sum_References - Sum_Misses)/Sum_References).
 .PP
 \fBC1, C2, C3...\fP The number times Linux requested the C1, C2, C3 idle state during the measurement interval. The system summary line shows the sum for all CPUs. These are C-state names as exported in /sys/devices/system/cpu/cpu*/cpuidle/state*/name. While their names are generic, their attributes are processor specific. They the system description section of output shows what MWAIT sub-states they are mapped to on each system. These counters are in the "cpuidle" group, which is disabled, by default.
 .PP
···
 .fi
 
 .SH ADD PERF COUNTER EXAMPLE #2 (using virtual cpu device)
-Here we run on hybrid, Raptor Lake platform.
-We limit turbostat to show output for just cpu0 (pcore) and cpu12 (ecore).
+Here we run on hybrid, Meteor Lake platform.
+We limit turbostat to show output for just cpu0 (pcore) and cpu4 (ecore).
 We add a counter showing number of L3 cache misses, using virtual "cpu" device,
 labeling it with the column header, "VCMISS".
 We add a counter showing number of L3 cache misses, using virtual "cpu_core" device,
-labeling it with the column header, "PCMISS". This will fail on ecore cpu12.
+labeling it with the column header, "PCMISS". This will fail on ecore cpu4.
 We add a counter showing number of L3 cache misses, using virtual "cpu_atom" device,
 labeling it with the column header, "ECMISS". This will fail on pcore cpu0.
 We display it only once, after the conclusion of 0.1 second sleep.
 .nf
-sudo ./turbostat --quiet --cpu 0,12 --show CPU --add perf/cpu/cache-misses,cpu,delta,raw,VCMISS --add perf/cpu_core/cache-misses,cpu,delta,raw,PCMISS --add perf/cpu_atom/cache-misses,cpu,delta,raw,ECMISS sleep .1
+sudo ./turbostat --quiet --cpu 0,4 --show CPU --add perf/cpu/cache-misses,cpu,delta,VCMISS --add perf/cpu_core/cache-misses,cpu,delta,PCMISS --add perf/cpu_atom/cache-misses,cpu,delta,ECMISS sleep 5
 turbostat: added_perf_counters_init_: perf/cpu_atom/cache-misses: failed to open counter on cpu0
-turbostat: added_perf_counters_init_: perf/cpu_core/cache-misses: failed to open counter on cpu12
-0.104630 sec
-CPU ECMISS PCMISS VCMISS
-- 0x0000000000000000 0x0000000000000000 0x0000000000000000
-0 0x0000000000000000 0x0000000000007951 0x0000000000007796
-12 0x000000000001137a 0x0000000000000000 0x0000000000011392
-
+turbostat: added_perf_counters_init_: perf/cpu_core/cache-misses: failed to open counter on cpu4
+5.001207 sec
+CPU ECMISS PCMISS VCMISS
+- 41586506 46291219 87877749
+4 83173012 0 83173040
+0 0 92582439 92582458
 .fi
 
 .SH ADD PMT COUNTER EXAMPLE
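The LLCkRPS and LLC%hit definitions in the turbostat.8 text above can be sketched in a few lines of C. The `struct llc_stats` fields match the struct added to turbostat.c in this merge. The three helper functions, and the choice to return 0 when there are no references, are assumptions made for illustration and are not turbostat's actual code.

```c
#include <stdio.h>

/* Same two fields as the llc_stats struct added in this merge */
struct llc_stats {
	unsigned long long references;
	unsigned long long misses;
};

/* Hypothetical helper: LLC%hit = 100.0 * (References - Misses) / References */
static double llc_hit_pct(const struct llc_stats *s)
{
	if (s->references == 0)
		return 0.0;	/* assumption: avoid dividing by zero */
	return 100.0 * ((double)s->references - (double)s->misses) / (double)s->references;
}

/* Hypothetical helper: LLCkRPS = references per second, divided by 1000 */
static double llc_krps(const struct llc_stats *s, double interval_sec)
{
	return (double)s->references / interval_sec / 1000.0;
}

/*
 * Hypothetical helper for the summary row: sum references and misses
 * over all CPUs first, then apply the same formula. This gives the
 * weighted average described in the man page text.
 */
static double llc_summary_hit_pct(const struct llc_stats *cpus, int n)
{
	struct llc_stats sum = { 0, 0 };
	int i;

	for (i = 0; i < n; i++) {
		sum.references += cpus[i].references;
		sum.misses += cpus[i].misses;
	}
	return llc_hit_pct(&sum);
}
```

For two CPUs with 1000 references and 100 misses, and 3000 references and 1500 misses, the per-CPU hit rates are 90% and 50%. The summary is 60%, not the unweighted mean of 70%, because the busier CPU contributes more references.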
tools/power/x86/turbostat/turbostat.c (+608 -597)
···
 #define FLAGS_SHOW (1 << 1)
 #define SYSFS_PERCPU (1 << 1)
 };
+static int use_android_msr_path;
 
 struct msr_counter bic[] = {
 	{ 0x0, "usec", NULL, 0, 0, 0, NULL, 0 },
···
 	{ 0x0, "NMI", NULL, 0, 0, 0, NULL, 0 },
 	{ 0x0, "CPU%c1e", NULL, 0, 0, 0, NULL, 0 },
 	{ 0x0, "pct_idle", NULL, 0, 0, 0, NULL, 0 },
+	{ 0x0, "LLCkRPS", NULL, 0, 0, 0, NULL, 0 },
+	{ 0x0, "LLC%hit", NULL, 0, 0, 0, NULL, 0 },
 };
 
 /* n.b. bic_names must match the order in bic[], above */
···
 	BIC_NMI,
 	BIC_CPU_c1e,
 	BIC_pct_idle,
+	BIC_LLC_RPS,
+	BIC_LLC_HIT,
 	MAX_BIC
 };
 
···
 static cpu_set_t bic_group_hw_idle;
 static cpu_set_t bic_group_sw_idle;
 static cpu_set_t bic_group_idle;
+static cpu_set_t bic_group_cache;
 static cpu_set_t bic_group_other;
 static cpu_set_t bic_group_disabled_by_default;
 static cpu_set_t bic_enabled;
···
 	SET_BIC(BIC_pct_idle, &bic_group_sw_idle);
 
 	BIC_INIT(&bic_group_idle);
+
 	CPU_OR(&bic_group_idle, &bic_group_idle, &bic_group_hw_idle);
 	SET_BIC(BIC_pct_idle, &bic_group_idle);
+
+	BIC_INIT(&bic_group_cache);
+	SET_BIC(BIC_LLC_RPS, &bic_group_cache);
+	SET_BIC(BIC_LLC_HIT, &bic_group_cache);
 
 	BIC_INIT(&bic_group_other);
 	SET_BIC(BIC_IRQ, &bic_group_other);
···
 #define PCL_10 14 /* PC10 */
 #define PCLUNL 15 /* Unlimited */
 
-struct amperf_group_fd;
-
 char *proc_stat = "/proc/stat";
 FILE *outf;
 int *fd_percpu;
 int *fd_instr_count_percpu;
+int *fd_llc_percpu;
 struct timeval interval_tv = { 5, 0 };
 struct timespec interval_ts = { 5, 0 };
 
···
 unsigned int shown;
 unsigned int sums_need_wide_columns;
 unsigned int rapl_joules;
+unsigned int valid_rapl_msrs;
 unsigned int summary_only;
 unsigned int list_header_only;
 unsigned int dump_only;
 unsigned int force_load;
-unsigned int has_aperf;
+unsigned int cpuid_has_aperf_mperf;
 unsigned int has_aperf_access;
 unsigned int has_epb;
 unsigned int has_turbo;
···
 
 int get_msr(int cpu, off_t offset, unsigned long long *msr);
 int add_counter(unsigned int msr_num, char *path, char *name,
-		unsigned int width, enum counter_scope scope,
-		enum counter_type type, enum counter_format format, int flags, int package_num);
+		unsigned int width, enum counter_scope scope, enum counter_type type, enum counter_format format, int flags, int package_num);
 
 /* Model specific support Start */
 
···
 	bool has_cst_prewake_bit; /* Cstate prewake bit in MSR_IA32_POWER_CTL */
 	int trl_msrs; /* MSR_TURBO_RATIO_LIMIT/LIMIT1/LIMIT2/SECONDARY, Atom TRL MSRs */
 	int plr_msrs; /* MSR_CORE/GFX/RING_PERF_LIMIT_REASONS */
-	int rapl_msrs; /* RAPL PKG/DRAM/CORE/GFX MSRs, AMD RAPL MSRs */
+	int plat_rapl_msrs; /* RAPL PKG/DRAM/CORE/GFX MSRs, AMD RAPL MSRs */
 	bool has_per_core_rapl; /* Indicates cores energy collection is per-core, not per-package. AMD specific for now */
 	bool has_rapl_divisor; /* Divisor for Energy unit raw value from MSR_RAPL_POWER_UNIT */
 	bool has_fixed_rapl_unit; /* Fixed Energy Unit used for DRAM RAPL Domain */
···
 	.cst_limit = CST_LIMIT_SNB,
 	.has_irtl_msrs = 1,
 	.trl_msrs = TRL_BASE,
-	.rapl_msrs = RAPL_PKG | RAPL_CORE_ALL | RAPL_GFX | RAPL_PKG_POWER_INFO,
+	.plat_rapl_msrs = RAPL_PKG | RAPL_CORE_ALL | RAPL_GFX | RAPL_PKG_POWER_INFO,
 };
 
 static const struct platform_features snx_features = {
···
 	.cst_limit = CST_LIMIT_SNB,
 	.has_irtl_msrs = 1,
 	.trl_msrs = TRL_BASE,
-	.rapl_msrs = RAPL_PKG_ALL | RAPL_CORE_ALL | RAPL_DRAM_ALL,
+	.plat_rapl_msrs = RAPL_PKG_ALL | RAPL_CORE_ALL | RAPL_DRAM_ALL,
 };
 
 static const struct platform_features ivb_features = {
···
 	.cst_limit = CST_LIMIT_SNB,
 	.has_irtl_msrs = 1,
 	.trl_msrs = TRL_BASE,
-	.rapl_msrs = RAPL_PKG | RAPL_CORE_ALL | RAPL_GFX | RAPL_PKG_POWER_INFO,
+	.plat_rapl_msrs = RAPL_PKG | RAPL_CORE_ALL | RAPL_GFX | RAPL_PKG_POWER_INFO,
 };
 
 static const struct platform_features ivx_features = {
···
 	.cst_limit = CST_LIMIT_SNB,
 	.has_irtl_msrs = 1,
 	.trl_msrs = TRL_BASE | TRL_LIMIT1,
-	.rapl_msrs = RAPL_PKG_ALL | RAPL_CORE_ALL | RAPL_DRAM_ALL,
+	.plat_rapl_msrs = RAPL_PKG_ALL | RAPL_CORE_ALL | RAPL_DRAM_ALL,
 };
 
 static const struct platform_features hsw_features = {
···
 	.has_irtl_msrs = 1,
 	.trl_msrs = TRL_BASE,
 	.plr_msrs = PLR_CORE | PLR_GFX | PLR_RING,
-	.rapl_msrs = RAPL_PKG | RAPL_CORE_ALL | RAPL_GFX | RAPL_PKG_POWER_INFO,
+	.plat_rapl_msrs = RAPL_PKG | RAPL_CORE_ALL | RAPL_GFX | RAPL_PKG_POWER_INFO,
 };
 
 static const struct platform_features hsx_features = {
···
 	.has_irtl_msrs = 1,
 	.trl_msrs = TRL_BASE | TRL_LIMIT1 | TRL_LIMIT2,
 	.plr_msrs = PLR_CORE | PLR_RING,
-	.rapl_msrs = RAPL_PKG_ALL | RAPL_DRAM_ALL,
+	.plat_rapl_msrs = RAPL_PKG_ALL | RAPL_DRAM_ALL,
 	.has_fixed_rapl_unit = 1,
 };
 
···
 	.has_irtl_msrs = 1,
 	.trl_msrs = TRL_BASE,
 	.plr_msrs = PLR_CORE | PLR_GFX | PLR_RING,
-	.rapl_msrs = RAPL_PKG | RAPL_CORE_ALL | RAPL_GFX | RAPL_PKG_POWER_INFO,
+	.plat_rapl_msrs = RAPL_PKG | RAPL_CORE_ALL | RAPL_GFX | RAPL_PKG_POWER_INFO,
 };
 
 static const struct platform_features hswg_features = {
···
 	.has_irtl_msrs = 1,
 	.trl_msrs = TRL_BASE,
 	.plr_msrs = PLR_CORE | PLR_GFX | PLR_RING,
-	.rapl_msrs = RAPL_PKG | RAPL_CORE_ALL | RAPL_GFX | RAPL_PKG_POWER_INFO,
+	.plat_rapl_msrs = RAPL_PKG | RAPL_CORE_ALL | RAPL_GFX | RAPL_PKG_POWER_INFO,
 };
 
 static const struct platform_features bdw_features = {
···
 	.cst_limit = CST_LIMIT_HSW,
 	.has_irtl_msrs = 1,
 	.trl_msrs = TRL_BASE,
-	.rapl_msrs = RAPL_PKG | RAPL_CORE_ALL | RAPL_GFX | RAPL_PKG_POWER_INFO,
+	.plat_rapl_msrs = RAPL_PKG | RAPL_CORE_ALL | RAPL_GFX | RAPL_PKG_POWER_INFO,
 };
 
 static const struct platform_features bdwg_features = {
···
 	.cst_limit = CST_LIMIT_HSW,
 	.has_irtl_msrs = 1,
 	.trl_msrs = TRL_BASE,
-	.rapl_msrs = RAPL_PKG | RAPL_CORE_ALL | RAPL_GFX | RAPL_PKG_POWER_INFO,
+	.plat_rapl_msrs = RAPL_PKG | RAPL_CORE_ALL | RAPL_GFX | RAPL_PKG_POWER_INFO,
 };
 
 static const struct platform_features bdx_features = {
···
 	.has_irtl_msrs = 1,
 	.has_cst_auto_convension = 1,
 	.trl_msrs = TRL_BASE,
-	.rapl_msrs = RAPL_PKG_ALL | RAPL_DRAM_ALL,
+	.plat_rapl_msrs = RAPL_PKG_ALL | RAPL_DRAM_ALL,
 	.has_fixed_rapl_unit = 1,
 };
 
···
 	.has_ext_cst_msrs = 1,
 	.trl_msrs = TRL_BASE,
 	.tcc_offset_bits = 6,
-	.rapl_msrs = RAPL_PKG_ALL | RAPL_CORE_ALL | RAPL_DRAM | RAPL_DRAM_PERF_STATUS | RAPL_GFX | RAPL_PSYS,
+	.plat_rapl_msrs = RAPL_PKG_ALL | RAPL_CORE_ALL | RAPL_DRAM | RAPL_DRAM_PERF_STATUS | RAPL_GFX | RAPL_PSYS,
 	.enable_tsc_tweak = 1,
 };
 
···
 	.has_ext_cst_msrs = 1,
 	.trl_msrs = TRL_BASE,
 	.tcc_offset_bits = 6,
-	.rapl_msrs = RAPL_PKG_ALL | RAPL_CORE_ALL | RAPL_DRAM | RAPL_DRAM_PERF_STATUS | RAPL_GFX | RAPL_PSYS,
+	.plat_rapl_msrs = RAPL_PKG_ALL | RAPL_CORE_ALL | RAPL_DRAM | RAPL_DRAM_PERF_STATUS | RAPL_GFX | RAPL_PSYS,
 	.enable_tsc_tweak = 1,
 };
 
···
 	.has_ext_cst_msrs = cnl_features.has_ext_cst_msrs,
 	.trl_msrs = cnl_features.trl_msrs,
 	.tcc_offset_bits = cnl_features.tcc_offset_bits,
-	.rapl_msrs = cnl_features.rapl_msrs,
+	.plat_rapl_msrs = cnl_features.plat_rapl_msrs,
 	.enable_tsc_tweak = cnl_features.enable_tsc_tweak,
 };
 
···
 	.has_ext_cst_msrs = adl_features.has_ext_cst_msrs,
 	.trl_msrs = adl_features.trl_msrs,
 	.tcc_offset_bits = adl_features.tcc_offset_bits,
-	.rapl_msrs = adl_features.rapl_msrs,
+	.plat_rapl_msrs = adl_features.plat_rapl_msrs,
 	.enable_tsc_tweak = adl_features.enable_tsc_tweak,
 };
 
···
 	.has_irtl_msrs = 1,
 	.has_cst_auto_convension = 1,
 	.trl_msrs = TRL_BASE | TRL_CORECOUNT,
-	.rapl_msrs = RAPL_PKG_ALL | RAPL_DRAM_ALL,
+	.plat_rapl_msrs = RAPL_PKG_ALL | RAPL_DRAM_ALL,
 	.has_fixed_rapl_unit = 1,
 };
 
···
 	.has_irtl_msrs = 1,
 	.has_cst_prewake_bit = 1,
 	.trl_msrs = TRL_BASE | TRL_CORECOUNT,
-	.rapl_msrs = RAPL_PKG_ALL | RAPL_DRAM_ALL | RAPL_PSYS,
+	.plat_rapl_msrs = RAPL_PKG_ALL | RAPL_DRAM_ALL | RAPL_PSYS,
 	.has_fixed_rapl_unit = 1,
 };
 
···
 	.has_cst_prewake_bit = 1,
 	.has_fixed_rapl_psys_unit = 1,
 	.trl_msrs = TRL_BASE | TRL_CORECOUNT,
-	.rapl_msrs = RAPL_PKG_ALL | RAPL_DRAM_ALL | RAPL_PSYS,
+	.plat_rapl_msrs = RAPL_PKG_ALL | RAPL_DRAM_ALL | RAPL_PSYS,
 };
 
 static const struct platform_features dmr_features = {
···
 	.has_fixed_rapl_psys_unit = spr_features.has_fixed_rapl_psys_unit,
 	.trl_msrs = spr_features.trl_msrs,
 	.has_msr_module_c6_res_ms = 1, /* DMR has Dual-Core-Module and MC6 MSR */
-	.rapl_msrs = 0, /* DMR does not have RAPL MSRs */
+	.plat_rapl_msrs = 0, /* DMR does not have RAPL MSRs */
 	.plr_msrs = 0, /* DMR does not have PLR MSRs */
 	.has_irtl_msrs = 0, /* DMR does not have IRTL MSRs */
 	.has_config_tdp = 0, /* DMR does not have CTDP MSRs */
···
 	.has_irtl_msrs = 1,
 	.has_cst_prewake_bit = 1,
 	.trl_msrs = TRL_BASE | TRL_CORECOUNT,
-	.rapl_msrs = RAPL_PKG_ALL | RAPL_DRAM_ALL | RAPL_PSYS,
+	.plat_rapl_msrs = RAPL_PKG_ALL | RAPL_DRAM_ALL | RAPL_PSYS,
 };
 
 static const struct platform_features grr_features = {
···
 	.has_irtl_msrs = 1,
 	.has_cst_prewake_bit = 1,
 	.trl_msrs = TRL_BASE | TRL_CORECOUNT,
-	.rapl_msrs = RAPL_PKG_ALL | RAPL_DRAM_ALL | RAPL_PSYS,
+	.plat_rapl_msrs = RAPL_PKG_ALL | RAPL_DRAM_ALL | RAPL_PSYS,
 };
 
 static const struct platform_features slv_features = {
···
 	.has_msr_c6_demotion_policy_config = 1,
 	.has_msr_atom_pkg_c6_residency = 1,
 	.trl_msrs = TRL_ATOM,
-	.rapl_msrs = RAPL_PKG | RAPL_CORE,
+	.plat_rapl_msrs = RAPL_PKG | RAPL_CORE,
 	.has_rapl_divisor = 1,
 	.rapl_quirk_tdp = 30,
 };
···
 	.cst_limit = CST_LIMIT_SLV,
 	.has_msr_atom_pkg_c6_residency = 1,
 	.trl_msrs = TRL_BASE,
-	.rapl_msrs = RAPL_PKG | RAPL_CORE,
+	.plat_rapl_msrs = RAPL_PKG | RAPL_CORE,
 	.rapl_quirk_tdp = 30,
 };
 
···
 	.cst_limit = CST_LIMIT_GMT,
 	.has_irtl_msrs = 1,
 	.trl_msrs = TRL_BASE | TRL_CORECOUNT,
-	.rapl_msrs = RAPL_PKG | RAPL_PKG_POWER_INFO,
+	.plat_rapl_msrs = RAPL_PKG | RAPL_PKG_POWER_INFO,
 };
 
 static const struct platform_features gmtd_features = {
···
 	.has_irtl_msrs = 1,
 	.has_msr_core_c1_res = 1,
 	.trl_msrs = TRL_BASE | TRL_CORECOUNT,
-	.rapl_msrs = RAPL_PKG_ALL | RAPL_DRAM_ALL | RAPL_CORE_ENERGY_STATUS,
+	.plat_rapl_msrs = RAPL_PKG_ALL | RAPL_DRAM_ALL | RAPL_CORE_ENERGY_STATUS,
 };
 
 static const struct platform_features gmtp_features = {
···
 	.cst_limit = CST_LIMIT_GMT,
 	.has_irtl_msrs = 1,
 	.trl_msrs = TRL_BASE,
-	.rapl_msrs = RAPL_PKG | RAPL_PKG_POWER_INFO,
+	.plat_rapl_msrs = RAPL_PKG | RAPL_PKG_POWER_INFO,
 };
 
 static const struct platform_features tmt_features = {
···
 	.cst_limit = CST_LIMIT_GMT,
 	.has_irtl_msrs = 1,
 	.trl_msrs = TRL_BASE,
-	.rapl_msrs = RAPL_PKG_ALL | RAPL_CORE_ALL | RAPL_DRAM | RAPL_DRAM_PERF_STATUS | RAPL_GFX,
+	.plat_rapl_msrs = RAPL_PKG_ALL | RAPL_CORE_ALL | RAPL_DRAM | RAPL_DRAM_PERF_STATUS | RAPL_GFX,
 	.enable_tsc_tweak = 1,
 };
 
···
 	.cst_limit = CST_LIMIT_GMT,
 	.has_irtl_msrs = 1,
 	.trl_msrs = TRL_BASE | TRL_CORECOUNT,
-	.rapl_msrs = RAPL_PKG_ALL,
+	.plat_rapl_msrs = RAPL_PKG_ALL,
 };
 
 static const struct platform_features knl_features = {
···
 	.cst_limit = CST_LIMIT_KNL,
 	.has_msr_knl_core_c6_residency = 1,
 	.trl_msrs = TRL_KNL,
-	.rapl_msrs = RAPL_PKG_ALL | RAPL_DRAM_ALL,
+	.plat_rapl_msrs = RAPL_PKG_ALL | RAPL_DRAM_ALL,
 	.has_fixed_rapl_unit = 1,
 	.need_perf_multiplier = 1,
 };
···
 };
 
 static const struct platform_features amd_features_with_rapl = {
-	.rapl_msrs = RAPL_AMD_F17H,
+	.plat_rapl_msrs = RAPL_AMD_F17H,
 	.has_per_core_rapl = 1,
 	.rapl_quirk_tdp = 280, /* This is the max stock TDP of HEDT/Server Fam17h+ chips */
 };
···
 	{ INTEL_ARROWLAKE, &adl_features },
 	{ INTEL_LUNARLAKE_M, &lnl_features },
 	{ INTEL_PANTHERLAKE_L, &lnl_features },
+	{ INTEL_NOVALAKE, &lnl_features },
+	{ INTEL_NOVALAKE_L, &lnl_features },
+	{ INTEL_WILDCATLAKE_L, &lnl_features },
 	{ INTEL_ATOM_SILVERMONT, &slv_features },
 	{ INTEL_ATOM_SILVERMONT_D, &slvd_features },
 	{ INTEL_ATOM_AIRMONT, &amt_features },
···
 
 #define CPU_SUBSET_MAXCPUS 8192 /* need to use before probe... */
 cpu_set_t *cpu_present_set, *cpu_possible_set, *cpu_effective_set, *cpu_allowed_set, *cpu_affinity_set, *cpu_subset;
-size_t cpu_present_setsize, cpu_possible_setsize, cpu_effective_setsize, cpu_allowed_setsize, cpu_affinity_setsize,
-    cpu_subset_size;
+size_t cpu_present_setsize, cpu_possible_setsize, cpu_effective_setsize, cpu_allowed_setsize, cpu_affinity_setsize, cpu_subset_size;
 #define MAX_ADDED_THREAD_COUNTERS 24
 #define MAX_ADDED_CORE_COUNTERS 8
 #define MAX_ADDED_PACKAGE_COUNTERS 16
···
 	pmt_counter_resize_(pcounter, new_size);
 }
 
+struct llc_stats {
+	unsigned long long references;
+	unsigned long long misses;
+};
 struct thread_data {
 	struct timeval tv_begin;
 	struct timeval tv_end;
···
 	unsigned long long irq_count;
 	unsigned long long nmi_count;
 	unsigned int smi_count;
+	struct llc_stats llc;
 	unsigned int cpu_id;
 	unsigned int apic_id;
 	unsigned int x2apic_id;
···
 
 	switch (idx) {
 	case IDX_PKG_ENERGY:
-		if (platform->rapl_msrs & RAPL_AMD_F17H)
+		if (valid_rapl_msrs & RAPL_AMD_F17H)
 			offset = MSR_PKG_ENERGY_STAT;
 		else
 			offset = MSR_PKG_ENERGY_STATUS;
···
 {
 	switch (idx) {
 	case IDX_PKG_ENERGY:
-		return platform->rapl_msrs & (RAPL_PKG | RAPL_AMD_F17H);
+		return valid_rapl_msrs & (RAPL_PKG | RAPL_AMD_F17H);
 	case IDX_DRAM_ENERGY:
-		return platform->rapl_msrs & RAPL_DRAM;
+		return valid_rapl_msrs & RAPL_DRAM;
 	case IDX_PP0_ENERGY:
-		return platform->rapl_msrs & RAPL_CORE_ENERGY_STATUS;
+		return valid_rapl_msrs & RAPL_CORE_ENERGY_STATUS;
 	case IDX_PP1_ENERGY:
-		return platform->rapl_msrs & RAPL_GFX;
+		return valid_rapl_msrs & RAPL_GFX;
 	case IDX_PKG_PERF:
-		return platform->rapl_msrs & RAPL_PKG_PERF_STATUS;
+		return valid_rapl_msrs & RAPL_PKG_PERF_STATUS;
 	case IDX_DRAM_PERF:
-		return platform->rapl_msrs & RAPL_DRAM_PERF_STATUS;
+		return valid_rapl_msrs & RAPL_DRAM_PERF_STATUS;
 	case IDX_PSYS_ENERGY:
-		return platform->rapl_msrs & RAPL_PSYS;
+		return valid_rapl_msrs & RAPL_PSYS;
 	default:
 		return 0;
 	}
···
 	return retval;
 }
 
-int is_cpu_first_thread_in_core(PER_THREAD_PARAMS)
+int is_cpu_first_thread_in_core(struct thread_data *t, struct core_data *c)
 {
-	UNUSED(p);
-
 	return ((int)t->cpu_id == c->base_cpu || c->base_cpu < 0);
 }
 
-int is_cpu_first_core_in_package(PER_THREAD_PARAMS)
+int is_cpu_first_core_in_package(struct thread_data *t, struct pkg_data *p)
 {
-	UNUSED(c);
-
 	return ((int)t->cpu_id == p->base_cpu || p->base_cpu < 0);
 }
 
-int is_cpu_first_thread_in_package(PER_THREAD_PARAMS)
+int is_cpu_first_thread_in_package(struct thread_data *t, struct core_data *c, struct pkg_data *p)
 {
-	return is_cpu_first_thread_in_core(t, c, p) && is_cpu_first_core_in_package(t, c, p);
+	return is_cpu_first_thread_in_core(t, c) && is_cpu_first_core_in_package(t, p);
 }
 
 int cpu_migrate(int cpu)
···
 
 	if (fd)
 		return fd;
-#if defined(ANDROID)
-	sprintf(pathname, "/dev/msr%d", cpu);
-#else
-	sprintf(pathname, "/dev/cpu/%d/msr", cpu);
-#endif
+	sprintf(pathname, use_android_msr_path ? "/dev/msr%d" : "/dev/cpu/%d/msr", cpu);
 	fd = open(pathname, O_RDONLY);
 	if (fd < 0)
-#if defined(ANDROID)
-		err(-1, "%s open failed, try chown or chmod +r /dev/msr*, "
-		    "or run with --no-msr, or run as root", pathname);
-#else
-		err(-1, "%s open failed, try chown or chmod +r /dev/cpu/*/msr, "
-		    "or run with --no-msr, or run as root", pathname);
-#endif
+		err(-1, "%s open failed, try chown or chmod +r %s, "
+		    "or run with --no-msr, or run as root", pathname, use_android_msr_path ? "/dev/msr*" : "/dev/cpu/*/msr");
 	fd_percpu[cpu] = fd;
 
 	return fd;
···
 	CLR_BIC(BIC_PkgTmp, &bic_enabled);
 
 	free_sys_msr_counters();
+}
+
+static void bic_disable_perf_access(void)
+{
+	CLR_BIC(BIC_IPC, &bic_enabled);
+	CLR_BIC(BIC_LLC_RPS, &bic_enabled);
+	CLR_BIC(BIC_LLC_HIT, &bic_enabled);
 }
 
 static long perf_event_open(struct perf_event_attr *hw_event, pid_t pid, int cpu, int group_fd, unsigned long flags)
···
 {
 	int ret;
 
-	if (!(platform->rapl_msrs & cai->feature_mask))
+	if (!(valid_rapl_msrs & cai->feature_mask))
 		return -1;
 
 	ret = add_msr_counter(cpu, cai->msr);
···
 			} else if (!strcmp(name_list, "idle")) {
 				CPU_OR(ret_set, ret_set, &bic_group_idle);
 				break;
+			} else if (!strcmp(name_list, "cache")) {
+				CPU_OR(ret_set, ret_set, &bic_group_cache);
+				break;
+			} else if (!strcmp(name_list, "llc")) {
+				CPU_OR(ret_set, ret_set, &bic_group_cache);
+				break;
 			} else if (!strcmp(name_list, "swidle")) {
 				CPU_OR(ret_set, ret_set, &bic_group_sw_idle);
 				break;
···
 			if (mode == SHOW_LIST) {
 				deferred_add_names[deferred_add_index++] = name_list;
 				if (deferred_add_index >= MAX_DEFERRED) {
-					fprintf(stderr, "More than max %d un-recognized --add options '%s'\n",
-						MAX_DEFERRED, name_list);
+					fprintf(stderr, "More than max %d un-recognized --add options '%s'\n", MAX_DEFERRED, name_list);
 					help();
 					exit(1);
 				}
···
 				if (debug)
 					fprintf(stderr, "deferred \"%s\"\n", name_list);
 				if (deferred_skip_index >= MAX_DEFERRED) {
-					fprintf(stderr, "More than max %d un-recognized --skip options '%s'\n",
-						MAX_DEFERRED, name_list);
+					fprintf(stderr, "More than max %d un-recognized --skip options '%s'\n", MAX_DEFERRED, name_list);
 					help();
 					exit(1);
 				}
···
 		name_list++;
 
 	}
+}
+
+/*
+ * print_name()
+ * Print column header name for raw 64-bit counter in 16 columns (at least 8-char plus a tab)
+ * Otherwise, allow the name + tab to fit within 8-coumn tab-stop.
+ * In both cases, left justififed, just like other turbostat columns,
+ * to allow the column values to consume the tab.
+ *
+ * Yes, 32-bit counters can overflow 8-columns, and
+ * 64-bit counters can overflow 16-columns, but that is uncommon.
+ */
+static inline int print_name(int width, int *printed, char *delim, char *name, enum counter_type type, enum counter_format format)
+{
+	UNUSED(type);
+
+	if (format == FORMAT_RAW && width >= 64)
+		return (sprintf(outp, "%s%-8s", (*printed++ ? delim : ""), name));
+	else
+		return (sprintf(outp, "%s%s", (*printed++ ? delim : ""), name));
+}
+
+static inline int print_hex_value(int width, int *printed, char *delim, unsigned long long value)
+{
+	if (width <= 32)
+		return (sprintf(outp, "%s%08x", (*printed++ ? delim : ""), (unsigned int)value));
+	else
+		return (sprintf(outp, "%s%016llx", (*printed++ ? delim : ""), value));
+}
+
+static inline int print_decimal_value(int width, int *printed, char *delim, unsigned long long value)
+{
+	if (width <= 32)
+		return (sprintf(outp, "%s%d", (*printed++ ? delim : ""), (unsigned int)value));
+	else
+		return (sprintf(outp, "%s%-8lld", (*printed++ ? delim : ""), value));
+}
+
+static inline int print_float_value(int *printed, char *delim, double value)
+{
+	return (sprintf(outp, "%s%0.2f", (*printed++ ? delim : ""), value));
 }
 
 void print_header(char *delim)
···
 	if (DO_BIC(BIC_SMI))
 		outp += sprintf(outp, "%sSMI", (printed++ ? delim : ""));
 
-	for (mp = sys.tp; mp; mp = mp->next) {
+	if (DO_BIC(BIC_LLC_RPS))
+		outp += sprintf(outp, "%sLLCkRPS", (printed++ ? delim : ""));
 
-		if (mp->format == FORMAT_RAW || mp->format == FORMAT_AVERAGE) {
-			if (mp->width == 64)
-				outp += sprintf(outp, "%s%18.18s", (printed++ ? delim : ""), mp->name);
-			else
-				outp += sprintf(outp, "%s%10.10s", (printed++ ? delim : ""), mp->name);
-		} else {
-			if ((mp->type == COUNTER_ITEMS) && sums_need_wide_columns)
-				outp += sprintf(outp, "%s%8s", (printed++ ? delim : ""), mp->name);
-			else
-				outp += sprintf(outp, "%s%s", (printed++ ? delim : ""), mp->name);
-		}
-	}
+	if (DO_BIC(BIC_LLC_HIT))
+		outp += sprintf(outp, "%sLLC%%hit", (printed++ ? delim : ""));
 
-	for (pp = sys.perf_tp; pp; pp = pp->next) {
+	for (mp = sys.tp; mp; mp = mp->next)
+		outp += print_name(mp->width, &printed, delim, mp->name, mp->type, mp->format);
 
-		if (pp->format == FORMAT_RAW) {
-			if (pp->width == 64)
-				outp += sprintf(outp, "%s%18.18s", (printed++ ? delim : ""), pp->name);
-			else
-				outp += sprintf(outp, "%s%10.10s", (printed++ ? delim : ""), pp->name);
-		} else {
-			if ((pp->type == COUNTER_ITEMS) && sums_need_wide_columns)
-				outp += sprintf(outp, "%s%8s", (printed++ ? delim : ""), pp->name);
-			else
-				outp += sprintf(outp, "%s%s", (printed++ ? delim : ""), pp->name);
-		}
-	}
+	for (pp = sys.perf_tp; pp; pp = pp->next)
+		outp += print_name(pp->width, &printed, delim, pp->name, pp->type, pp->format);
 
 	ppmt = sys.pmt_tp;
 	while (ppmt) {
 		switch (ppmt->type) {
 		case PMT_TYPE_RAW:
-			if (pmt_counter_get_width(ppmt) <= 32)
-				outp += sprintf(outp, "%s%10.10s", (printed++ ? delim : ""), ppmt->name);
-			else
-				outp += sprintf(outp, "%s%18.18s", (printed++ ? delim : ""), ppmt->name);
-
+			outp += print_name(pmt_counter_get_width(ppmt), &printed, delim, ppmt->name, COUNTER_ITEMS, ppmt->format);
 			break;
 
 		case PMT_TYPE_XTAL_TIME:
 		case PMT_TYPE_TCORE_CLOCK:
-			outp += sprintf(outp, "%s%s", (printed++ ? delim : ""), ppmt->name);
+			outp += print_name(32, &printed, delim, ppmt->name, COUNTER_ITEMS, ppmt->format);
 			break;
 		}
 
···
 	if (DO_BIC(BIC_CORE_THROT_CNT))
 		outp += sprintf(outp, "%sCoreThr", (printed++ ? delim : ""));
 
-	if (platform->rapl_msrs && !rapl_joules) {
+	if (valid_rapl_msrs && !rapl_joules) {
 		if (DO_BIC(BIC_CorWatt) && platform->has_per_core_rapl)
 			outp += sprintf(outp, "%sCorWatt", (printed++ ? delim : ""));
-	} else if (platform->rapl_msrs && rapl_joules) {
+	} else if (valid_rapl_msrs && rapl_joules) {
 		if (DO_BIC(BIC_Cor_J) && platform->has_per_core_rapl)
 			outp += sprintf(outp, "%sCor_J", (printed++ ? delim : ""));
 	}
 
-	for (mp = sys.cp; mp; mp = mp->next) {
-		if (mp->format == FORMAT_RAW || mp->format == FORMAT_AVERAGE) {
-			if (mp->width == 64)
-				outp += sprintf(outp, "%s%18.18s", delim, mp->name);
-			else
-				outp += sprintf(outp, "%s%10.10s", delim, mp->name);
-		} else {
-			if ((mp->type == COUNTER_ITEMS) && sums_need_wide_columns)
-				outp += sprintf(outp, "%s%8s", delim, mp->name);
-			else
-				outp += sprintf(outp, "%s%s", delim, mp->name);
-		}
-	}
+	for (mp = sys.cp; mp; mp = mp->next)
+		outp += print_name(mp->width, &printed, delim, mp->name, mp->type, mp->format);
 
-	for (pp = sys.perf_cp; pp; pp = pp->next) {
-
-		if (pp->format == FORMAT_RAW) {
-			if (pp->width == 64)
-				outp += sprintf(outp, "%s%18.18s", (printed++ ? delim : ""), pp->name);
-			else
-				outp += sprintf(outp, "%s%10.10s", (printed++ ? delim : ""), pp->name);
-		} else {
-			if ((pp->type == COUNTER_ITEMS) && sums_need_wide_columns)
-				outp += sprintf(outp, "%s%8s", (printed++ ? delim : ""), pp->name);
-			else
-				outp += sprintf(outp, "%s%s", (printed++ ? delim : ""), pp->name);
-		}
-	}
+	for (pp = sys.perf_cp; pp; pp = pp->next)
+		outp += print_name(pp->width, &printed, delim, pp->name, pp->type, pp->format);
 
 	ppmt = sys.pmt_cp;
 	while (ppmt) {
 		switch (ppmt->type) {
 		case PMT_TYPE_RAW:
-			if (pmt_counter_get_width(ppmt) <= 32)
-				outp += sprintf(outp, "%s%10.10s", (printed++ ? delim : ""), ppmt->name);
-			else
-				outp += sprintf(outp, "%s%18.18s", (printed++ ? delim : ""), ppmt->name);
+			outp += print_name(pmt_counter_get_width(ppmt), &printed, delim, ppmt->name, COUNTER_ITEMS, ppmt->format);
 
 			break;
 
 		case PMT_TYPE_XTAL_TIME:
 		case PMT_TYPE_TCORE_CLOCK:
-			outp += sprintf(outp, "%s%s", (printed++ ? delim : ""), ppmt->name);
+			outp += print_name(32, &printed, delim, ppmt->name, COUNTER_ITEMS, ppmt->format);
 			break;
 		}
 
 		ppmt = ppmt->next;
 	}
-
 	if (DO_BIC(BIC_PkgTmp))
 		outp += sprintf(outp, "%sPkgTmp", (printed++ ? delim : ""));
 
···
 	if (DO_BIC(BIC_UNCORE_MHZ))
 		outp += sprintf(outp, "%sUncMHz", (printed++ ? delim : ""));
 
-	for (mp = sys.pp; mp; mp = mp->next) {
-		if (mp->format == FORMAT_RAW || mp->format == FORMAT_AVERAGE) {
-			if (mp->width == 64)
-				outp += sprintf(outp, "%s%18.18s", delim, mp->name);
-			else if (mp->width == 32)
-				outp += sprintf(outp, "%s%10.10s", delim, mp->name);
-			else
-				outp += sprintf(outp, "%s%7.7s", delim, mp->name);
-		} else {
-			if ((mp->type == COUNTER_ITEMS) && sums_need_wide_columns)
-				outp += sprintf(outp, "%s%8s", delim, mp->name);
-			else
-				outp += sprintf(outp, "%s%7.7s", delim, mp->name);
-		}
-	}
+	for (mp = sys.pp; mp; mp = mp->next)
+		outp += print_name(mp->width, &printed, delim, mp->name, mp->type, mp->format);
 
-	for (pp = sys.perf_pp; pp; pp = pp->next) {
-
-		if (pp->format == FORMAT_RAW) {
-			if (pp->width == 64)
-				outp += sprintf(outp, "%s%18.18s", (printed++ ? delim : ""), pp->name);
-			else
-				outp += sprintf(outp, "%s%10.10s", (printed++ ? delim : ""), pp->name);
-		} else {
-			if ((pp->type == COUNTER_ITEMS) && sums_need_wide_columns)
-				outp += sprintf(outp, "%s%8s", (printed++ ? delim : ""), pp->name);
-			else
-				outp += sprintf(outp, "%s%s", (printed++ ?
delim : ""), pp->name); 3001 - } 3002 - } 2969 + for (pp = sys.perf_pp; pp; pp = pp->next) 2970 + outp += print_name(pp->width, &printed, delim, pp->name, pp->type, pp->format); 3003 2971 3004 2972 ppmt = sys.pmt_pp; 3005 2973 while (ppmt) { 3006 2974 switch (ppmt->type) { 3007 2975 case PMT_TYPE_RAW: 3008 - if (pmt_counter_get_width(ppmt) <= 32) 3009 - outp += sprintf(outp, "%s%10.10s", (printed++ ? delim : ""), ppmt->name); 3010 - else 3011 - outp += sprintf(outp, "%s%18.18s", (printed++ ? delim : ""), ppmt->name); 3012 - 2976 + outp += print_name(pmt_counter_get_width(ppmt), &printed, delim, ppmt->name, COUNTER_ITEMS, ppmt->format); 3013 2977 break; 3014 2978 3015 2979 case PMT_TYPE_XTAL_TIME: 3016 2980 case PMT_TYPE_TCORE_CLOCK: 3017 - outp += sprintf(outp, "%s%s", (printed++ ? delim : ""), ppmt->name); 2981 + outp += print_name(32, &printed, delim, ppmt->name, COUNTER_ITEMS, ppmt->format); 3018 2982 break; 3019 2983 } 3020 2984 ··· 2998 3020 outp += sprintf(outp, "%sSys_J", (printed++ ? delim : "")); 2999 3021 3000 3022 outp += sprintf(outp, "\n"); 3023 + } 3024 + 3025 + /* 3026 + * pct() 3027 + * 3028 + * If absolute value is < 1.1, return percentage 3029 + * otherwise, return nan 3030 + * 3031 + * return value is appropriate for printing percentages with %f 3032 + * while flagging some obvious erroneous values. 
3033 + */
3034 + double pct(double d)
3035 + {
3036 +
3037 + double abs = fabs(d);
3038 +
3039 + if (abs < 1.10)
3040 + return (100.0 * d);
3041 + return nan("");
3001 3042 }
3002 3043
3003 3044 int dump_counters(PER_THREAD_PARAMS)
···
3044 3047 if (DO_BIC(BIC_SMI))
3045 3048 outp += sprintf(outp, "SMI: %d\n", t->smi_count);
3046 3049
3050 + outp += sprintf(outp, "LLC refs: %lld", t->llc.references);
3051 + outp += sprintf(outp, "LLC miss: %lld", t->llc.misses);
3052 + outp += sprintf(outp, "LLC Hit%%: %.2f", pct((t->llc.references - t->llc.misses) / t->llc.references));
3053 +
3047 3054 for (i = 0, mp = sys.tp; mp; i++, mp = mp->next) {
3048 - outp +=
3049 - sprintf(outp, "tADDED [%d] %8s msr0x%x: %08llX %s\n", i, mp->name, mp->msr_num,
3050 - t->counter[i], mp->sp->path);
3055 + outp += sprintf(outp, "tADDED [%d] %8s msr0x%x: %08llX %s\n", i, mp->name, mp->msr_num, t->counter[i], mp->sp->path);
3051 3056 }
3052 3057 }
3053 3058
3054 - if (c && is_cpu_first_thread_in_core(t, c, p)) {
3059 + if (c && is_cpu_first_thread_in_core(t, c)) {
3055 3060 outp += sprintf(outp, "core: %d\n", c->core_id);
3056 3061 outp += sprintf(outp, "c3: %016llX\n", c->c3);
3057 3062 outp += sprintf(outp, "c6: %016llX\n", c->c6);
···
3068 3069 outp += sprintf(outp, "Joules: %0llX (scale: %lf)\n", energy_value, energy_scale);
3069 3070
3070 3071 for (i = 0, mp = sys.cp; mp; i++, mp = mp->next) {
3071 - outp +=
3072 - sprintf(outp, "cADDED [%d] %8s msr0x%x: %08llX %s\n", i, mp->name, mp->msr_num,
3073 - c->counter[i], mp->sp->path);
3072 + outp += sprintf(outp, "cADDED [%d] %8s msr0x%x: %08llX %s\n", i, mp->name, mp->msr_num, c->counter[i], mp->sp->path);
3074 3073 }
3075 3074 outp += sprintf(outp, "mc6_us: %016llX\n", c->mc6_us);
3076 3075 }
3077 3076
3078 - if (p && is_cpu_first_core_in_package(t, c, p)) {
3077 + if (p && is_cpu_first_core_in_package(t, p)) {
3079 3078 outp += sprintf(outp, "package: %d\n", p->package_id);
3080 3079
3081 3080 outp += sprintf(outp, "Weighted cores:
%016llX\n", p->pkg_wtd_core_c0); ··· 3103 3106 outp += sprintf(outp, "PTM: %dC\n", p->pkg_temp_c); 3104 3107 3105 3108 for (i = 0, mp = sys.pp; mp; i++, mp = mp->next) { 3106 - outp += 3107 - sprintf(outp, "pADDED [%d] %8s msr0x%x: %08llX %s\n", i, mp->name, mp->msr_num, 3108 - p->counter[i], mp->sp->path); 3109 + outp += sprintf(outp, "pADDED [%d] %8s msr0x%x: %08llX %s\n", i, mp->name, mp->msr_num, p->counter[i], mp->sp->path); 3109 3110 } 3110 3111 } 3111 3112 ··· 3129 3134 return scaled; 3130 3135 } 3131 3136 3137 + void get_perf_llc_stats(int cpu, struct llc_stats *llc) 3138 + { 3139 + struct read_format { 3140 + unsigned long long num_read; 3141 + struct llc_stats llc; 3142 + } r; 3143 + const ssize_t expected_read_size = sizeof(r); 3144 + ssize_t actual_read_size; 3145 + 3146 + actual_read_size = read(fd_llc_percpu[cpu], &r, expected_read_size); 3147 + 3148 + if (actual_read_size == -1) 3149 + err(-1, "%s(cpu%d,) %d,,%ld\n", __func__, cpu, fd_llc_percpu[cpu], expected_read_size); 3150 + 3151 + llc->references = r.llc.references; 3152 + llc->misses = r.llc.misses; 3153 + if (actual_read_size != expected_read_size) 3154 + warn("%s: failed to read perf_data (req %zu act %zu)", __func__, expected_read_size, actual_read_size); 3155 + } 3156 + 3132 3157 /* 3133 3158 * column formatting convention & formats 3134 3159 */ ··· 3158 3143 3159 3144 struct platform_counters *pplat_cnt = NULL; 3160 3145 double interval_float, tsc; 3161 - char *fmt8; 3146 + char *fmt8 = "%s%.2f"; 3147 + 3162 3148 int i; 3163 3149 struct msr_counter *mp; 3164 3150 struct perf_counter_info *pp; ··· 3173 3157 } 3174 3158 3175 3159 /* if showing only 1st thread in core and this isn't one, bail out */ 3176 - if (show_core_only && !is_cpu_first_thread_in_core(t, c, p)) 3160 + if (show_core_only && !is_cpu_first_thread_in_core(t, c)) 3177 3161 return 0; 3178 3162 3179 3163 /* if showing only 1st thread in pkg and this isn't one, bail out */ 3180 - if (show_pkg_only && 
!is_cpu_first_core_in_package(t, c, p))
3164 + if (show_pkg_only && !is_cpu_first_core_in_package(t, p))
3181 3165 return 0;
3182 3166
3183 3167 /*if not summary line and --cpu is used */
···
3239 3223 }
3240 3224 if (DO_BIC(BIC_Node)) {
3241 3225 if (t)
3242 - outp += sprintf(outp, "%s%d",
3243 - (printed++ ? delim : ""), cpus[t->cpu_id].physical_node_id);
3226 + outp += sprintf(outp, "%s%d", (printed++ ? delim : ""), cpus[t->cpu_id].physical_node_id);
3244 3227 else
3245 3228 outp += sprintf(outp, "%s-", (printed++ ? delim : ""));
3246 3229 }
···
3261 3246 outp += sprintf(outp, "%s%.0f", (printed++ ? delim : ""), 1.0 / units * t->aperf / interval_float);
3262 3247
3263 3248 if (DO_BIC(BIC_Busy))
3264 - outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), 100.0 * t->mperf / tsc);
3249 + outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), pct(t->mperf / tsc));
3265 3250
3266 3251 if (DO_BIC(BIC_Bzy_MHz)) {
3267 3252 if (has_base_hz)
3268 - outp +=
3269 - sprintf(outp, "%s%.0f", (printed++ ? delim : ""), base_hz / units * t->aperf / t->mperf);
3253 + outp += sprintf(outp, "%s%.0f", (printed++ ? delim : ""), base_hz / units * t->aperf / t->mperf);
3270 3254 else
3271 - outp += sprintf(outp, "%s%.0f", (printed++ ? delim : ""),
3272 - tsc / units * t->aperf / t->mperf / interval_float);
3255 + outp += sprintf(outp, "%s%.0f", (printed++ ? delim : ""), tsc / units * t->aperf / t->mperf / interval_float);
3273 3256 }
3274 3257
3275 3258 if (DO_BIC(BIC_TSC_MHz))
···
3296 3283 if (DO_BIC(BIC_SMI))
3297 3284 outp += sprintf(outp, "%s%d", (printed++ ? delim : ""), t->smi_count);
3298 3285
3299 - /* Added counters */
3286 + /* LLC Stats */
3287 + if (DO_BIC(BIC_LLC_RPS) || DO_BIC(BIC_LLC_HIT)) {
3288 + if (DO_BIC(BIC_LLC_RPS))
3289 + outp += sprintf(outp, "%s%.0f", (printed++ ? delim : ""), t->llc.references / interval_float / 1000);
3290 +
3291 + if (DO_BIC(BIC_LLC_HIT))
3292 + outp += sprintf(outp, fmt8, (printed++ ?
delim : ""), pct((t->llc.references - t->llc.misses) / t->llc.references));
3293 + }
3294 +
3295 + /* Added Thread Counters */
3300 3296 for (i = 0, mp = sys.tp; mp; i++, mp = mp->next) {
3301 - if (mp->format == FORMAT_RAW || mp->format == FORMAT_AVERAGE) {
3302 - if (mp->width == 32)
3303 - outp +=
3304 - sprintf(outp, "%s0x%08x", (printed++ ? delim : ""), (unsigned int)t->counter[i]);
3305 - else
3306 - outp += sprintf(outp, "%s0x%016llx", (printed++ ? delim : ""), t->counter[i]);
3307 - } else if (mp->format == FORMAT_DELTA) {
3308 - if ((mp->type == COUNTER_ITEMS) && sums_need_wide_columns)
3309 - outp += sprintf(outp, "%s%8lld", (printed++ ? delim : ""), t->counter[i]);
3310 - else
3311 - outp += sprintf(outp, "%s%lld", (printed++ ? delim : ""), t->counter[i]);
3312 - } else if (mp->format == FORMAT_PERCENT) {
3297 + if (mp->format == FORMAT_RAW)
3298 + outp += print_hex_value(mp->width, &printed, delim, t->counter[i]);
3299 + else if (mp->format == FORMAT_DELTA || mp->format == FORMAT_AVERAGE)
3300 + outp += print_decimal_value(mp->width, &printed, delim, t->counter[i]);
3301 + else if (mp->format == FORMAT_PERCENT) {
3313 3302 if (mp->type == COUNTER_USEC)
3314 - outp +=
3315 - sprintf(outp, "%s%.2f", (printed++ ? delim : ""),
3316 - t->counter[i] / interval_float / 10000);
3303 + outp += print_float_value(&printed, delim, t->counter[i] / interval_float / 10000);
3317 3304 else
3318 - outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), 100.0 * t->counter[i] / tsc);
3305 + outp += print_float_value(&printed, delim, pct(t->counter[i] / tsc));
3319 3306 }
3320 3307 }
3321 3308
3322 - /* Added perf counters */
3309 + /* Added perf Thread Counters */
3323 3310 for (i = 0, pp = sys.perf_tp; pp; ++i, pp = pp->next) {
3324 - if (pp->format == FORMAT_RAW) {
3325 - if (pp->width == 32)
3326 - outp +=
3327 - sprintf(outp, "%s0x%08x", (printed++ ?
delim : ""), 3328 - (unsigned int)t->perf_counter[i]); 3329 - else 3330 - outp += sprintf(outp, "%s0x%016llx", (printed++ ? delim : ""), t->perf_counter[i]); 3331 - } else if (pp->format == FORMAT_DELTA) { 3332 - if ((pp->type == COUNTER_ITEMS) && sums_need_wide_columns) 3333 - outp += sprintf(outp, "%s%8lld", (printed++ ? delim : ""), t->perf_counter[i]); 3334 - else 3335 - outp += sprintf(outp, "%s%lld", (printed++ ? delim : ""), t->perf_counter[i]); 3336 - } else if (pp->format == FORMAT_PERCENT) { 3311 + if (pp->format == FORMAT_RAW) 3312 + outp += print_hex_value(pp->width, &printed, delim, t->perf_counter[i]); 3313 + else if (pp->format == FORMAT_DELTA || mp->format == FORMAT_AVERAGE) 3314 + outp += print_decimal_value(pp->width, &printed, delim, t->perf_counter[i]); 3315 + else if (pp->format == FORMAT_PERCENT) { 3337 3316 if (pp->type == COUNTER_USEC) 3338 - outp += 3339 - sprintf(outp, "%s%.2f", (printed++ ? delim : ""), 3340 - t->perf_counter[i] / interval_float / 10000); 3317 + outp += print_float_value(&printed, delim, t->perf_counter[i] / interval_float / 10000); 3341 3318 else 3342 - outp += 3343 - sprintf(outp, "%s%.2f", (printed++ ? delim : ""), 100.0 * t->perf_counter[i] / tsc); 3319 + outp += print_float_value(&printed, delim, pct(t->perf_counter[i] / tsc)); 3344 3320 } 3345 3321 } 3346 3322 3323 + /* Added PMT Thread Counters */ 3347 3324 for (i = 0, ppmt = sys.pmt_tp; ppmt; i++, ppmt = ppmt->next) { 3348 3325 const unsigned long value_raw = t->pmt_counter[i]; 3349 3326 double value_converted; 3350 3327 switch (ppmt->type) { 3351 3328 case PMT_TYPE_RAW: 3352 - if (pmt_counter_get_width(ppmt) <= 32) 3353 - outp += sprintf(outp, "%s0x%08x", (printed++ ? delim : ""), 3354 - (unsigned int)t->pmt_counter[i]); 3355 - else 3356 - outp += sprintf(outp, "%s0x%016llx", (printed++ ? 
delim : ""), t->pmt_counter[i]); 3357 - 3329 + outp += print_hex_value(pmt_counter_get_width(ppmt), &printed, delim, t->pmt_counter[i]); 3358 3330 break; 3359 3331 3360 3332 case PMT_TYPE_XTAL_TIME: 3361 - value_converted = 100.0 * value_raw / crystal_hz / interval_float; 3333 + value_converted = pct(value_raw / crystal_hz / interval_float); 3362 3334 outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), value_converted); 3363 3335 break; 3364 3336 3365 3337 case PMT_TYPE_TCORE_CLOCK: 3366 - value_converted = 100.0 * value_raw / tcore_clock_freq_hz / interval_float; 3338 + value_converted = pct(value_raw / tcore_clock_freq_hz / interval_float); 3367 3339 outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), value_converted); 3368 3340 } 3369 3341 } 3370 3342 3371 3343 /* C1 */ 3372 3344 if (DO_BIC(BIC_CPU_c1)) 3373 - outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), 100.0 * t->c1 / tsc); 3345 + outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), pct(t->c1 / tsc)); 3374 3346 3375 3347 /* print per-core data only for 1st thread in core */ 3376 - if (!is_cpu_first_thread_in_core(t, c, p)) 3348 + if (!is_cpu_first_thread_in_core(t, c)) 3377 3349 goto done; 3378 3350 3379 3351 if (DO_BIC(BIC_CPU_c3)) 3380 - outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), 100.0 * c->c3 / tsc); 3352 + outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), pct(c->c3 / tsc)); 3381 3353 if (DO_BIC(BIC_CPU_c6)) 3382 - outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), 100.0 * c->c6 / tsc); 3354 + outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), pct(c->c6 / tsc)); 3383 3355 if (DO_BIC(BIC_CPU_c7)) 3384 - outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), 100.0 * c->c7 / tsc); 3356 + outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), pct(c->c7 / tsc)); 3385 3357 3386 3358 /* Mod%c6 */ 3387 3359 if (DO_BIC(BIC_Mod_c6)) 3388 - outp += sprintf(outp, "%s%.2f", (printed++ ? 
delim : ""), 100.0 * c->mc6_us / tsc); 3360 + outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), pct(c->mc6_us / tsc)); 3389 3361 3390 3362 if (DO_BIC(BIC_CoreTmp)) 3391 3363 outp += sprintf(outp, "%s%d", (printed++ ? delim : ""), c->core_temp_c); ··· 3379 3381 if (DO_BIC(BIC_CORE_THROT_CNT)) 3380 3382 outp += sprintf(outp, "%s%lld", (printed++ ? delim : ""), c->core_throt_cnt); 3381 3383 3384 + /* Added Core Counters */ 3382 3385 for (i = 0, mp = sys.cp; mp; i++, mp = mp->next) { 3383 - if (mp->format == FORMAT_RAW || mp->format == FORMAT_AVERAGE) { 3384 - if (mp->width == 32) 3385 - outp += 3386 - sprintf(outp, "%s0x%08x", (printed++ ? delim : ""), (unsigned int)c->counter[i]); 3387 - else 3388 - outp += sprintf(outp, "%s0x%016llx", (printed++ ? delim : ""), c->counter[i]); 3389 - } else if (mp->format == FORMAT_DELTA) { 3390 - if ((mp->type == COUNTER_ITEMS) && sums_need_wide_columns) 3391 - outp += sprintf(outp, "%s%8lld", (printed++ ? delim : ""), c->counter[i]); 3392 - else 3393 - outp += sprintf(outp, "%s%lld", (printed++ ? delim : ""), c->counter[i]); 3394 - } else if (mp->format == FORMAT_PERCENT) { 3395 - outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), 100.0 * c->counter[i] / tsc); 3396 - } 3386 + if (mp->format == FORMAT_RAW) 3387 + outp += print_hex_value(mp->width, &printed, delim, c->counter[i]); 3388 + else if (mp->format == FORMAT_DELTA || mp->format == FORMAT_AVERAGE) 3389 + outp += print_decimal_value(mp->width, &printed, delim, c->counter[i]); 3390 + else if (mp->format == FORMAT_PERCENT) 3391 + outp += print_float_value(&printed, delim, pct(c->counter[i] / tsc)); 3397 3392 } 3398 3393 3394 + /* Added perf Core counters */ 3399 3395 for (i = 0, pp = sys.perf_cp; pp; i++, pp = pp->next) { 3400 - if (pp->format == FORMAT_RAW) { 3401 - if (pp->width == 32) 3402 - outp += 3403 - sprintf(outp, "%s0x%08x", (printed++ ? 
delim : ""), 3404 - (unsigned int)c->perf_counter[i]); 3405 - else 3406 - outp += sprintf(outp, "%s0x%016llx", (printed++ ? delim : ""), c->perf_counter[i]); 3407 - } else if (pp->format == FORMAT_DELTA) { 3408 - if ((pp->type == COUNTER_ITEMS) && sums_need_wide_columns) 3409 - outp += sprintf(outp, "%s%8lld", (printed++ ? delim : ""), c->perf_counter[i]); 3410 - else 3411 - outp += sprintf(outp, "%s%lld", (printed++ ? delim : ""), c->perf_counter[i]); 3412 - } else if (pp->format == FORMAT_PERCENT) { 3413 - outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), 100.0 * c->perf_counter[i] / tsc); 3414 - } 3396 + if (pp->format == FORMAT_RAW) 3397 + outp += print_hex_value(pp->width, &printed, delim, c->perf_counter[i]); 3398 + else if (pp->format == FORMAT_DELTA || mp->format == FORMAT_AVERAGE) 3399 + outp += print_decimal_value(pp->width, &printed, delim, c->perf_counter[i]); 3400 + else if (pp->format == FORMAT_PERCENT) 3401 + outp += print_float_value(&printed, delim, pct(c->perf_counter[i] / tsc)); 3415 3402 } 3416 3403 3404 + /* Added PMT Core counters */ 3417 3405 for (i = 0, ppmt = sys.pmt_cp; ppmt; i++, ppmt = ppmt->next) { 3418 3406 const unsigned long value_raw = c->pmt_counter[i]; 3419 3407 double value_converted; 3420 3408 switch (ppmt->type) { 3421 3409 case PMT_TYPE_RAW: 3422 - if (pmt_counter_get_width(ppmt) <= 32) 3423 - outp += sprintf(outp, "%s0x%08x", (printed++ ? delim : ""), 3424 - (unsigned int)c->pmt_counter[i]); 3425 - else 3426 - outp += sprintf(outp, "%s0x%016llx", (printed++ ? delim : ""), c->pmt_counter[i]); 3427 - 3410 + outp += print_hex_value(pmt_counter_get_width(ppmt), &printed, delim, c->pmt_counter[i]); 3428 3411 break; 3429 3412 3430 3413 case PMT_TYPE_XTAL_TIME: 3431 - value_converted = 100.0 * value_raw / crystal_hz / interval_float; 3432 - outp += sprintf(outp, "%s%.2f", (printed++ ? 
delim : ""), value_converted); 3414 + value_converted = pct(value_raw / crystal_hz / interval_float); 3415 + outp += print_float_value(&printed, delim, value_converted); 3433 3416 break; 3434 3417 3435 3418 case PMT_TYPE_TCORE_CLOCK: 3436 - value_converted = 100.0 * value_raw / tcore_clock_freq_hz / interval_float; 3437 - outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), value_converted); 3419 + value_converted = pct(value_raw / tcore_clock_freq_hz / interval_float); 3420 + outp += print_float_value(&printed, delim, value_converted); 3438 3421 } 3439 3422 } 3440 3423 3441 - fmt8 = "%s%.2f"; 3442 - 3443 3424 if (DO_BIC(BIC_CorWatt) && platform->has_per_core_rapl) 3444 - outp += 3445 - sprintf(outp, fmt8, (printed++ ? delim : ""), 3446 - rapl_counter_get_value(&c->core_energy, RAPL_UNIT_WATTS, interval_float)); 3425 + outp += sprintf(outp, fmt8, (printed++ ? delim : ""), rapl_counter_get_value(&c->core_energy, RAPL_UNIT_WATTS, interval_float)); 3447 3426 if (DO_BIC(BIC_Cor_J) && platform->has_per_core_rapl) 3448 - outp += sprintf(outp, fmt8, (printed++ ? delim : ""), 3449 - rapl_counter_get_value(&c->core_energy, RAPL_UNIT_JOULES, interval_float)); 3427 + outp += sprintf(outp, fmt8, (printed++ ? delim : ""), rapl_counter_get_value(&c->core_energy, RAPL_UNIT_JOULES, interval_float)); 3450 3428 3451 3429 /* print per-package data only for 1st core in package */ 3452 - if (!is_cpu_first_core_in_package(t, c, p)) 3430 + if (!is_cpu_first_core_in_package(t, p)) 3453 3431 goto done; 3454 3432 3455 3433 /* PkgTmp */ ··· 3437 3463 if (p->gfx_rc6_ms == -1) { /* detect GFX counter reset */ 3438 3464 outp += sprintf(outp, "%s**.**", (printed++ ? delim : "")); 3439 3465 } else { 3440 - outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), 3441 - p->gfx_rc6_ms / 10.0 / interval_float); 3466 + outp += sprintf(outp, "%s%.2f", (printed++ ? 
delim : ""), p->gfx_rc6_ms / 10.0 / interval_float); 3442 3467 } 3443 3468 } 3444 3469 ··· 3454 3481 if (p->sam_mc6_ms == -1) { /* detect GFX counter reset */ 3455 3482 outp += sprintf(outp, "%s**.**", (printed++ ? delim : "")); 3456 3483 } else { 3457 - outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), 3458 - p->sam_mc6_ms / 10.0 / interval_float); 3484 + outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), p->sam_mc6_ms / 10.0 / interval_float); 3459 3485 } 3460 3486 } 3461 3487 ··· 3468 3496 3469 3497 /* Totl%C0, Any%C0 GFX%C0 CPUGFX% */ 3470 3498 if (DO_BIC(BIC_Totl_c0)) 3471 - outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), 100.0 * p->pkg_wtd_core_c0 / tsc); 3499 + outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), 100 * p->pkg_wtd_core_c0 / tsc); /* can exceed 100% */ 3472 3500 if (DO_BIC(BIC_Any_c0)) 3473 - outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), 100.0 * p->pkg_any_core_c0 / tsc); 3501 + outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), pct(p->pkg_any_core_c0 / tsc)); 3474 3502 if (DO_BIC(BIC_GFX_c0)) 3475 - outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), 100.0 * p->pkg_any_gfxe_c0 / tsc); 3503 + outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), pct(p->pkg_any_gfxe_c0 / tsc)); 3476 3504 if (DO_BIC(BIC_CPUGFX)) 3477 - outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), 100.0 * p->pkg_both_core_gfxe_c0 / tsc); 3505 + outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), pct(p->pkg_both_core_gfxe_c0 / tsc)); 3478 3506 3479 3507 if (DO_BIC(BIC_Pkgpc2)) 3480 - outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), 100.0 * p->pc2 / tsc); 3508 + outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), pct(p->pc2 / tsc)); 3481 3509 if (DO_BIC(BIC_Pkgpc3)) 3482 - outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), 100.0 * p->pc3 / tsc); 3510 + outp += sprintf(outp, "%s%.2f", (printed++ ? 
delim : ""), pct(p->pc3 / tsc)); 3483 3511 if (DO_BIC(BIC_Pkgpc6)) 3484 - outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), 100.0 * p->pc6 / tsc); 3512 + outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), pct(p->pc6 / tsc)); 3485 3513 if (DO_BIC(BIC_Pkgpc7)) 3486 - outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), 100.0 * p->pc7 / tsc); 3514 + outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), pct(p->pc7 / tsc)); 3487 3515 if (DO_BIC(BIC_Pkgpc8)) 3488 - outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), 100.0 * p->pc8 / tsc); 3516 + outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), pct(p->pc8 / tsc)); 3489 3517 if (DO_BIC(BIC_Pkgpc9)) 3490 - outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), 100.0 * p->pc9 / tsc); 3518 + outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), pct(p->pc9 / tsc)); 3491 3519 if (DO_BIC(BIC_Pkgpc10)) 3492 - outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), 100.0 * p->pc10 / tsc); 3520 + outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), pct(p->pc10 / tsc)); 3493 3521 3494 3522 if (DO_BIC(BIC_Diec6)) 3495 - outp += 3496 - sprintf(outp, "%s%.2f", (printed++ ? delim : ""), 100.0 * p->die_c6 / crystal_hz / interval_float); 3523 + outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), pct(p->die_c6 / crystal_hz / interval_float)); 3497 3524 3498 3525 if (DO_BIC(BIC_CPU_LPI)) { 3499 3526 if (p->cpu_lpi >= 0) 3500 - outp += 3501 - sprintf(outp, "%s%.2f", (printed++ ? delim : ""), 3502 - 100.0 * p->cpu_lpi / 1000000.0 / interval_float); 3527 + outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), pct(p->cpu_lpi / 1000000.0 / interval_float)); 3503 3528 else 3504 3529 outp += sprintf(outp, "%s(neg)", (printed++ ? delim : "")); 3505 3530 } 3506 3531 if (DO_BIC(BIC_SYS_LPI)) { 3507 3532 if (p->sys_lpi >= 0) 3508 - outp += 3509 - sprintf(outp, "%s%.2f", (printed++ ? 
delim : ""), 3510 - 100.0 * p->sys_lpi / 1000000.0 / interval_float); 3533 + outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), pct(p->sys_lpi / 1000000.0 / interval_float)); 3511 3534 else 3512 3535 outp += sprintf(outp, "%s(neg)", (printed++ ? delim : "")); 3513 3536 } 3514 3537 3515 3538 if (DO_BIC(BIC_PkgWatt)) 3516 - outp += 3517 - sprintf(outp, fmt8, (printed++ ? delim : ""), 3518 - rapl_counter_get_value(&p->energy_pkg, RAPL_UNIT_WATTS, interval_float)); 3539 + outp += sprintf(outp, fmt8, (printed++ ? delim : ""), rapl_counter_get_value(&p->energy_pkg, RAPL_UNIT_WATTS, interval_float)); 3519 3540 if (DO_BIC(BIC_CorWatt) && !platform->has_per_core_rapl) 3520 - outp += 3521 - sprintf(outp, fmt8, (printed++ ? delim : ""), 3522 - rapl_counter_get_value(&p->energy_cores, RAPL_UNIT_WATTS, interval_float)); 3541 + outp += sprintf(outp, fmt8, (printed++ ? delim : ""), rapl_counter_get_value(&p->energy_cores, RAPL_UNIT_WATTS, interval_float)); 3523 3542 if (DO_BIC(BIC_GFXWatt)) 3524 - outp += 3525 - sprintf(outp, fmt8, (printed++ ? delim : ""), 3526 - rapl_counter_get_value(&p->energy_gfx, RAPL_UNIT_WATTS, interval_float)); 3543 + outp += sprintf(outp, fmt8, (printed++ ? delim : ""), rapl_counter_get_value(&p->energy_gfx, RAPL_UNIT_WATTS, interval_float)); 3527 3544 if (DO_BIC(BIC_RAMWatt)) 3528 - outp += 3529 - sprintf(outp, fmt8, (printed++ ? delim : ""), 3530 - rapl_counter_get_value(&p->energy_dram, RAPL_UNIT_WATTS, interval_float)); 3545 + outp += sprintf(outp, fmt8, (printed++ ? delim : ""), rapl_counter_get_value(&p->energy_dram, RAPL_UNIT_WATTS, interval_float)); 3531 3546 if (DO_BIC(BIC_Pkg_J)) 3532 - outp += sprintf(outp, fmt8, (printed++ ? delim : ""), 3533 - rapl_counter_get_value(&p->energy_pkg, RAPL_UNIT_JOULES, interval_float)); 3547 + outp += sprintf(outp, fmt8, (printed++ ? 
delim : ""), rapl_counter_get_value(&p->energy_pkg, RAPL_UNIT_JOULES, interval_float)); 3534 3548 if (DO_BIC(BIC_Cor_J) && !platform->has_per_core_rapl) 3535 - outp += sprintf(outp, fmt8, (printed++ ? delim : ""), 3536 - rapl_counter_get_value(&p->energy_cores, RAPL_UNIT_JOULES, interval_float)); 3549 + outp += sprintf(outp, fmt8, (printed++ ? delim : ""), rapl_counter_get_value(&p->energy_cores, RAPL_UNIT_JOULES, interval_float)); 3537 3550 if (DO_BIC(BIC_GFX_J)) 3538 - outp += sprintf(outp, fmt8, (printed++ ? delim : ""), 3539 - rapl_counter_get_value(&p->energy_gfx, RAPL_UNIT_JOULES, interval_float)); 3551 + outp += sprintf(outp, fmt8, (printed++ ? delim : ""), rapl_counter_get_value(&p->energy_gfx, RAPL_UNIT_JOULES, interval_float)); 3540 3552 if (DO_BIC(BIC_RAM_J)) 3541 - outp += sprintf(outp, fmt8, (printed++ ? delim : ""), 3542 - rapl_counter_get_value(&p->energy_dram, RAPL_UNIT_JOULES, interval_float)); 3553 + outp += sprintf(outp, fmt8, (printed++ ? delim : ""), rapl_counter_get_value(&p->energy_dram, RAPL_UNIT_JOULES, interval_float)); 3543 3554 if (DO_BIC(BIC_PKG__)) 3544 3555 outp += 3545 - sprintf(outp, fmt8, (printed++ ? delim : ""), 3546 - rapl_counter_get_value(&p->rapl_pkg_perf_status, RAPL_UNIT_WATTS, interval_float)); 3556 + sprintf(outp, fmt8, (printed++ ? delim : ""), rapl_counter_get_value(&p->rapl_pkg_perf_status, RAPL_UNIT_WATTS, interval_float)); 3547 3557 if (DO_BIC(BIC_RAM__)) 3548 3558 outp += 3549 - sprintf(outp, fmt8, (printed++ ? delim : ""), 3550 - rapl_counter_get_value(&p->rapl_dram_perf_status, RAPL_UNIT_WATTS, interval_float)); 3559 + sprintf(outp, fmt8, (printed++ ? delim : ""), rapl_counter_get_value(&p->rapl_dram_perf_status, RAPL_UNIT_WATTS, interval_float)); 3551 3560 /* UncMHz */ 3552 3561 if (DO_BIC(BIC_UNCORE_MHZ)) 3553 3562 outp += sprintf(outp, "%s%d", (printed++ ? 
delim : ""), p->uncore_mhz);

+	/* Added Package Counters */
 	for (i = 0, mp = sys.pp; mp; i++, mp = mp->next) {
-		if (mp->format == FORMAT_RAW || mp->format == FORMAT_AVERAGE) {
-			if (mp->width == 32)
-				outp +=
-				    sprintf(outp, "%s0x%08x", (printed++ ? delim : ""), (unsigned int)p->counter[i]);
-			else
-				outp += sprintf(outp, "%s0x%016llx", (printed++ ? delim : ""), p->counter[i]);
-		} else if (mp->format == FORMAT_DELTA) {
-			if ((mp->type == COUNTER_ITEMS) && sums_need_wide_columns)
-				outp += sprintf(outp, "%s%8lld", (printed++ ? delim : ""), p->counter[i]);
-			else
-				outp += sprintf(outp, "%s%lld", (printed++ ? delim : ""), p->counter[i]);
-		} else if (mp->format == FORMAT_PERCENT) {
-			outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), 100.0 * p->counter[i] / tsc);
-		} else if (mp->type == COUNTER_K2M)
+		if (mp->format == FORMAT_RAW)
+			outp += print_hex_value(mp->width, &printed, delim, p->counter[i]);
+		else if (mp->type == COUNTER_K2M)
 			outp += sprintf(outp, "%s%d", (printed++ ? delim : ""), (unsigned int)p->counter[i] / 1000);
+		else if (mp->format == FORMAT_DELTA || mp->format == FORMAT_AVERAGE)
+			outp += print_decimal_value(mp->width, &printed, delim, p->counter[i]);
+		else if (mp->format == FORMAT_PERCENT)
+			outp += print_float_value(&printed, delim, pct(p->counter[i] / tsc));
 	}

+	/* Added perf Package Counters */
 	for (i = 0, pp = sys.perf_pp; pp; i++, pp = pp->next) {
-		if (pp->format == FORMAT_RAW) {
-			if (pp->width == 32)
-				outp +=
-				    sprintf(outp, "%s0x%08x", (printed++ ? delim : ""),
-					    (unsigned int)p->perf_counter[i]);
-			else
-				outp += sprintf(outp, "%s0x%016llx", (printed++ ? delim : ""), p->perf_counter[i]);
-		} else if (pp->format == FORMAT_DELTA) {
-			if ((pp->type == COUNTER_ITEMS) && sums_need_wide_columns)
-				outp += sprintf(outp, "%s%8lld", (printed++ ? delim : ""), p->perf_counter[i]);
-			else
-				outp += sprintf(outp, "%s%lld", (printed++ ? delim : ""), p->perf_counter[i]);
-		} else if (pp->format == FORMAT_PERCENT) {
-			outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), 100.0 * p->perf_counter[i] / tsc);
-		} else if (pp->type == COUNTER_K2M) {
-			outp +=
-			    sprintf(outp, "%s%d", (printed++ ? delim : ""), (unsigned int)p->perf_counter[i] / 1000);
-		}
+		if (pp->format == FORMAT_RAW)
+			outp += print_hex_value(pp->width, &printed, delim, p->perf_counter[i]);
+		else if (pp->type == COUNTER_K2M)
+			outp += sprintf(outp, "%s%d", (printed++ ? delim : ""), (unsigned int)p->perf_counter[i] / 1000);
+		else if (pp->format == FORMAT_DELTA || mp->format == FORMAT_AVERAGE)
+			outp += print_decimal_value(pp->width, &printed, delim, p->perf_counter[i]);
+		else if (pp->format == FORMAT_PERCENT)
+			outp += print_float_value(&printed, delim, pct(p->perf_counter[i] / tsc));
 	}

+	/* Added PMT Package Counters */
 	for (i = 0, ppmt = sys.pmt_pp; ppmt; i++, ppmt = ppmt->next) {
 		const unsigned long value_raw = p->pmt_counter[i];
 		double value_converted;
 		switch (ppmt->type) {
 		case PMT_TYPE_RAW:
-			if (pmt_counter_get_width(ppmt) <= 32)
-				outp += sprintf(outp, "%s0x%08x", (printed++ ? delim : ""),
-						(unsigned int)p->pmt_counter[i]);
-			else
-				outp += sprintf(outp, "%s0x%016llx", (printed++ ? delim : ""), p->pmt_counter[i]);
-
+			outp += print_hex_value(pmt_counter_get_width(ppmt), &printed, delim, p->pmt_counter[i]);
 			break;

 		case PMT_TYPE_XTAL_TIME:
-			value_converted = 100.0 * value_raw / crystal_hz / interval_float;
-			outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), value_converted);
+			value_converted = pct(value_raw / crystal_hz / interval_float);
+			outp += print_float_value(&printed, delim, value_converted);
 			break;

 		case PMT_TYPE_TCORE_CLOCK:
-			value_converted = 100.0 * value_raw / tcore_clock_freq_hz / interval_float;
-			outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), value_converted);
+			value_converted = pct(value_raw / tcore_clock_freq_hz / interval_float);
+			outp += print_float_value(&printed, delim, value_converted);
 		}
 	}

··· 3688 3754
 	old->energy_gfx.raw_value = new->energy_gfx.raw_value - old->energy_gfx.raw_value;
 	old->energy_dram.raw_value = new->energy_dram.raw_value - old->energy_dram.raw_value;
 	old->rapl_pkg_perf_status.raw_value = new->rapl_pkg_perf_status.raw_value - old->rapl_pkg_perf_status.raw_value;
-	old->rapl_dram_perf_status.raw_value =
-	    new->rapl_dram_perf_status.raw_value - old->rapl_dram_perf_status.raw_value;
+	old->rapl_dram_perf_status.raw_value = new->rapl_dram_perf_status.raw_value - old->rapl_dram_perf_status.raw_value;

 	for (i = 0, mp = sys.pp; mp; i++, mp = mp->next) {
-		if (mp->format == FORMAT_RAW || mp->format == FORMAT_AVERAGE)
+		if (mp->format == FORMAT_RAW)
 			old->counter[i] = new->counter[i];
 		else if (mp->format == FORMAT_AVERAGE)
 			old->counter[i] = new->counter[i];
··· 3795 3862
 	/* check for TSC < 1 Mcycles over interval */
 	if (old->tsc < (1000 * 1000))
 		errx(-3, "Insanely slow TSC rate, TSC stops in idle?\n"
-		     "You can disable all c-states by booting with \"idle=poll\"\n"
-		     "or just the deep ones with \"processor.max_cstate=1\"");
+		     "You can disable all c-states by booting with \"idle=poll\"\n" "or just the deep ones with \"processor.max_cstate=1\"");

 	old->c1 = new->c1 - old->c1;

··· 3824 3892
 			old->c1 = 0;
 		else {
 			/* normal case, derive c1 */
-			old->c1 = (old->tsc * tsc_tweak) - old->mperf - core_delta->c3
-			    - core_delta->c6 - core_delta->c7;
+			old->c1 = (old->tsc * tsc_tweak) - old->mperf - core_delta->c3 - core_delta->c6 - core_delta->c7;
 		}
 	}

··· 3845 3914

 	if (DO_BIC(BIC_SMI))
 		old->smi_count = new->smi_count - old->smi_count;
+
+	if (DO_BIC(BIC_LLC_RPS))
+		old->llc.references = new->llc.references - old->llc.references;
+
+	if (DO_BIC(BIC_LLC_HIT))
+		old->llc.misses = new->llc.misses - old->llc.misses;

 	for (i = 0, mp = sys.tp; mp; i++, mp = mp->next) {
 		if (mp->format == FORMAT_RAW || mp->format == FORMAT_AVERAGE)
··· 3876 3939
 	return 0;
 }

-int delta_cpu(struct thread_data *t, struct core_data *c,
-	      struct pkg_data *p, struct thread_data *t2, struct core_data *c2, struct pkg_data *p2)
+int delta_cpu(struct thread_data *t, struct core_data *c, struct pkg_data *p, struct thread_data *t2, struct core_data *c2, struct pkg_data *p2)
 {
 	int retval = 0;

 	/* calculate core delta only for 1st thread in core */
-	if (is_cpu_first_thread_in_core(t, c, p))
+	if (is_cpu_first_thread_in_core(t, c))
 		delta_core(c, c2);

 	/* always calculate thread delta */
 	retval = delta_thread(t, t2, c2);	/* c2 is core delta */

 	/* calculate package delta only for 1st core in package */
-	if (is_cpu_first_core_in_package(t, c, p))
+	if (is_cpu_first_core_in_package(t, p))
 		retval |= delta_package(p, p2);

 	return retval;
··· 3929 3993
 	t->nmi_count = 0;
 	t->smi_count = 0;

+	t->llc.references = 0;
+	t->llc.misses = 0;
+
 	c->c3 = 0;
 	c->c6 = 0;
 	c->c7 = 0;
··· 3939 4000
 	c->core_temp_c = 0;
 	rapl_counter_clear(&c->core_energy);
 	c->core_throt_cnt = 0;
+
+	t->llc.references = 0;
+	t->llc.misses = 0;

 	p->pkg_wtd_core_c0 = 0;
 	p->pkg_any_core_c0 = 0;
··· 4040 4098
 	average.threads.nmi_count += t->nmi_count;
 	average.threads.smi_count += t->smi_count;

+	average.threads.llc.references += t->llc.references;
+	average.threads.llc.misses += t->llc.misses;
+
 	for (i = 0, mp = sys.tp; mp; i++, mp = mp->next) {
 		if (mp->format == FORMAT_RAW)
 			continue;
··· 4060 4115
 	}

 	/* sum per-core values only for 1st thread in core */
-	if (!is_cpu_first_thread_in_core(t, c, p))
+	if (!is_cpu_first_thread_in_core(t, c))
 		return 0;

 	average.cores.c3 += c->c3;
··· 4090 4145
 	}

 	/* sum per-pkg values only for 1st core in pkg */
-	if (!is_cpu_first_core_in_package(t, c, p))
+	if (!is_cpu_first_core_in_package(t, p))
 		return 0;

 	if (DO_BIC(BIC_Totl_c0))
··· 4356 4411
 	 */
 	for (die = 0; die <= topo.max_die_id; ++die) {

-		sprintf(path, "/sys/devices/system/cpu/intel_uncore_frequency/package_%02d_die_%02d/current_freq_khz",
-			package, die);
+		sprintf(path, "/sys/devices/system/cpu/intel_uncore_frequency/package_%02d_die_%02d/current_freq_khz", package, die);

 		if (access(path, R_OK) == 0)
 			return (snapshot_sysfs_counter(path) / 1000);
··· 4466 4522

 	return 0;
 }
-
-struct amperf_group_fd {
-	int aperf;		/* Also the group descriptor */
-	int mperf;
-};

 static int read_perf_counter_info(const char *const path, const char *const parse_format, void *value_ptr)
 {
··· 4666 4727
 		const ssize_t actual_read_size = read(rci->fd_perf, &perf_data[0], sizeof(perf_data));

 		if (actual_read_size != expected_read_size)
-			err(-1, "%s: failed to read perf_data (%zu %zu)", __func__, expected_read_size,
-			    actual_read_size);
+			err(-1, "%s: failed to read perf_data (%zu %zu)", __func__, expected_read_size, actual_read_size);
 	}

 	for (unsigned int i = 0, pi = 1; i < NUM_RAPL_COUNTERS; ++i) {
··· 4904 4966
 		const ssize_t actual_read_size = read(mci->fd_perf, &perf_data[0], sizeof(perf_data));

 		if (actual_read_size != expected_read_size)
-			err(-1, "%s: failed to read perf_data (%zu %zu)", __func__, expected_read_size,
-			    actual_read_size);
+			err(-1, "%s: failed to read perf_data (%zu %zu)", __func__, expected_read_size, actual_read_size);
 	}

 	for (unsigned int i = 0, pi = 1; i < NUM_MSR_COUNTERS; ++i) {
··· 5058 5121

 	get_smi_aperf_mperf(cpu, t);

+	if (DO_BIC(BIC_LLC_RPS) || DO_BIC(BIC_LLC_HIT))
+		get_perf_llc_stats(cpu, &t->llc);
+
 	if (DO_BIC(BIC_IPC))
 		if (read(get_instr_count_fd(cpu), &t->instr_count, sizeof(long long)) != sizeof(long long))
 			return -4;
··· 5084 5144
 		t->pmt_counter[i] = pmt_read_counter(pp, t->cpu_id);

 	/* collect core counters only for 1st thread in core */
-	if (!is_cpu_first_thread_in_core(t, c, p))
+	if (!is_cpu_first_thread_in_core(t, c))
 		goto done;

 	if (platform->has_per_core_rapl) {
··· 5128 5188
 		c->pmt_counter[i] = pmt_read_counter(pp, c->core_id);

 	/* collect package counters only for 1st core in package */
-	if (!is_cpu_first_core_in_package(t, c, p))
+	if (!is_cpu_first_core_in_package(t, p))
 		goto done;

 	if (DO_BIC(BIC_Totl_c0)) {
··· 5217 5277
 	"pc3", "pc4", "pc6", "pc6n", "pc6r", "pc7", "pc7s", "pc8", "pc9", "pc10", "unlimited"
 };

-int nhm_pkg_cstate_limits[16] =
-    { PCL__0, PCL__1, PCL__3, PCL__6, PCL__7, PCLRSV, PCLRSV, PCLUNL, PCLRSV, PCLRSV, PCLRSV, PCLRSV, PCLRSV, PCLRSV,
+int nhm_pkg_cstate_limits[16] = { PCL__0, PCL__1, PCL__3, PCL__6, PCL__7, PCLRSV, PCLRSV, PCLUNL, PCLRSV, PCLRSV, PCLRSV, PCLRSV, PCLRSV, PCLRSV,
 	PCLRSV, PCLRSV
 };

-int snb_pkg_cstate_limits[16] =
-    { PCL__0, PCL__2, PCL_6N, PCL_6R, PCL__7, PCL_7S, PCLRSV, PCLUNL, PCLRSV, PCLRSV, PCLRSV, PCLRSV, PCLRSV, PCLRSV,
+int snb_pkg_cstate_limits[16] = { PCL__0, PCL__2, PCL_6N, PCL_6R, PCL__7, PCL_7S, PCLRSV, PCLUNL, PCLRSV, PCLRSV, PCLRSV, PCLRSV, PCLRSV, PCLRSV,
 	PCLRSV, PCLRSV
 };

-int hsw_pkg_cstate_limits[16] =
-    { PCL__0, PCL__2, PCL__3, PCL__6, PCL__7, PCL_7S, PCL__8, PCL__9, PCLUNL, PCLRSV, PCLRSV, PCLRSV, PCLRSV, PCLRSV,
+int hsw_pkg_cstate_limits[16] = { PCL__0, PCL__2, PCL__3, PCL__6, PCL__7, PCL_7S, PCL__8, PCL__9, PCLUNL, PCLRSV, PCLRSV, PCLRSV, PCLRSV, PCLRSV,
 	PCLRSV, PCLRSV
 };

-int slv_pkg_cstate_limits[16] =
-    { PCL__0, PCL__1, PCLRSV, PCLRSV, PCL__4, PCLRSV, PCL__6, PCL__7, PCLRSV, PCLRSV, PCLRSV, PCLRSV, PCLRSV, PCLRSV,
+int slv_pkg_cstate_limits[16] = { PCL__0, PCL__1, PCLRSV, PCLRSV, PCL__4, PCLRSV, PCL__6, PCL__7, PCLRSV, PCLRSV, PCLRSV, PCLRSV, PCLRSV, PCLRSV,
 	PCL__6, PCL__7
 };

-int amt_pkg_cstate_limits[16] =
-    { PCLUNL, PCL__1, PCL__2, PCLRSV, PCLRSV, PCLRSV, PCL__6, PCL__7, PCLRSV, PCLRSV, PCLRSV, PCLRSV, PCLRSV, PCLRSV,
+int amt_pkg_cstate_limits[16] = { PCLUNL, PCL__1, PCL__2, PCLRSV, PCLRSV, PCLRSV, PCL__6, PCL__7, PCLRSV, PCLRSV, PCLRSV, PCLRSV, PCLRSV, PCLRSV,
 	PCLRSV, PCLRSV
 };

-int phi_pkg_cstate_limits[16] =
-    { PCL__0, PCL__2, PCL_6N, PCL_6R, PCLRSV, PCLRSV, PCLRSV, PCLUNL, PCLRSV, PCLRSV, PCLRSV, PCLRSV, PCLRSV, PCLRSV,
+int phi_pkg_cstate_limits[16] = { PCL__0, PCL__2, PCL_6N, PCL_6R, PCLRSV, PCLRSV, PCLRSV, PCLUNL, PCLRSV, PCLRSV, PCLRSV, PCLRSV, PCLRSV, PCLRSV,
 	PCLRSV, PCLRSV
 };

-int glm_pkg_cstate_limits[16] =
-    { PCLUNL, PCL__1, PCL__3, PCL__6, PCL__7, PCL_7S, PCL__8, PCL__9, PCL_10, PCLRSV, PCLRSV, PCLRSV, PCLRSV, PCLRSV,
+int glm_pkg_cstate_limits[16] = { PCLUNL, PCL__1, PCL__3, PCL__6, PCL__7, PCL_7S, PCL__8, PCL__9, PCL_10, PCLRSV, PCLRSV, PCLRSV, PCLRSV, PCLRSV,
 	PCLRSV, PCLRSV
 };

-int skx_pkg_cstate_limits[16] =
-    { PCL__0, PCL__2, PCL_6N, PCL_6R, PCLRSV, PCLRSV, PCLRSV, PCLUNL, PCLRSV, PCLRSV, PCLRSV, PCLRSV, PCLRSV, PCLRSV,
+int skx_pkg_cstate_limits[16] = { PCL__0, PCL__2, PCL_6N, PCL_6R, PCLRSV, PCLRSV, PCLRSV, PCLUNL, PCLRSV, PCLRSV, PCLRSV, PCLRSV, PCLRSV, PCLRSV,
 	PCLRSV, PCLRSV
 };

-int icx_pkg_cstate_limits[16] =
-    { PCL__0, PCL__2, PCL__6, PCL__6, PCLRSV, PCLRSV, PCLRSV, PCLUNL, PCLRSV, PCLRSV, PCLRSV, PCLRSV, PCLRSV, PCLRSV,
+int icx_pkg_cstate_limits[16] = { PCL__0, PCL__2, PCL__6, PCL__6, PCLRSV, PCLRSV, PCLRSV, PCLUNL, PCLRSV, PCLRSV, PCLRSV, PCLRSV, PCLRSV, PCLRSV,
 	PCLRSV, PCLRSV
 };

··· 5324 5393
 		return;

 	get_msr(base_cpu, MSR_IA32_POWER_CTL, &msr);
-	fprintf(outf, "cpu%d: MSR_IA32_POWER_CTL: 0x%08llx (C1E auto-promotion: %sabled)\n",
-		base_cpu, msr, msr & 0x2 ? "EN" : "DIS");
+	fprintf(outf, "cpu%d: MSR_IA32_POWER_CTL: 0x%08llx (C1E auto-promotion: %sabled)\n", base_cpu, msr, msr & 0x2 ? "EN" : "DIS");

 	/* C-state Pre-wake Disable (CSTATE_PREWAKE_DISABLE) */
 	if (platform->has_cst_prewake_bit)
··· 5417 5487
 		ratio = (msr >> shift) & 0xFF;
 		group_size = (core_counts >> shift) & 0xFF;
 		if (ratio)
-			fprintf(outf, "%d * %.1f = %.1f MHz max turbo %d active cores\n",
-				ratio, bclk, ratio * bclk, group_size);
+			fprintf(outf, "%d * %.1f = %.1f MHz max turbo %d active cores\n", ratio, bclk, ratio * bclk, group_size);
 	}

 	return;
··· 5515 5586

 	for (i = buckets_no - 1; i >= 0; i--)
 		if (i > 0 ? ratio[i] != ratio[i - 1] : 1)
-			fprintf(outf,
-				"%d * %.1f = %.1f MHz max turbo %d active cores\n",
-				ratio[i], bclk, ratio[i] * bclk, cores[i]);
+			fprintf(outf, "%d * %.1f = %.1f MHz max turbo %d active cores\n", ratio[i], bclk, ratio[i] * bclk, cores[i]);
 }

 static void dump_cst_cfg(void)
··· 5600 5673
 	if (platform->supported_cstates & PC3) {
 		get_msr(base_cpu, MSR_PKGC3_IRTL, &msr);
 		fprintf(outf, "cpu%d: MSR_PKGC3_IRTL: 0x%08llx (", base_cpu, msr);
-		fprintf(outf, "%svalid, %lld ns)\n", msr & (1 << 15) ? "" : "NOT",
-			(msr & 0x3FF) * irtl_time_units[(msr >> 10) & 0x3]);
+		fprintf(outf, "%svalid, %lld ns)\n", msr & (1 << 15) ? "" : "NOT", (msr & 0x3FF) * irtl_time_units[(msr >> 10) & 0x3]);
 	}

 	if (platform->supported_cstates & PC6) {
 		get_msr(base_cpu, MSR_PKGC6_IRTL, &msr);
 		fprintf(outf, "cpu%d: MSR_PKGC6_IRTL: 0x%08llx (", base_cpu, msr);
-		fprintf(outf, "%svalid, %lld ns)\n", msr & (1 << 15) ? "" : "NOT",
-			(msr & 0x3FF) * irtl_time_units[(msr >> 10) & 0x3]);
+		fprintf(outf, "%svalid, %lld ns)\n", msr & (1 << 15) ? "" : "NOT", (msr & 0x3FF) * irtl_time_units[(msr >> 10) & 0x3]);
 	}

 	if (platform->supported_cstates & PC7) {
 		get_msr(base_cpu, MSR_PKGC7_IRTL, &msr);
 		fprintf(outf, "cpu%d: MSR_PKGC7_IRTL: 0x%08llx (", base_cpu, msr);
-		fprintf(outf, "%svalid, %lld ns)\n", msr & (1 << 15) ? "" : "NOT",
-			(msr & 0x3FF) * irtl_time_units[(msr >> 10) & 0x3]);
+		fprintf(outf, "%svalid, %lld ns)\n", msr & (1 << 15) ? "" : "NOT", (msr & 0x3FF) * irtl_time_units[(msr >> 10) & 0x3]);
 	}

 	if (platform->supported_cstates & PC8) {
 		get_msr(base_cpu, MSR_PKGC8_IRTL, &msr);
 		fprintf(outf, "cpu%d: MSR_PKGC8_IRTL: 0x%08llx (", base_cpu, msr);
-		fprintf(outf, "%svalid, %lld ns)\n", msr & (1 << 15) ? "" : "NOT",
-			(msr & 0x3FF) * irtl_time_units[(msr >> 10) & 0x3]);
+		fprintf(outf, "%svalid, %lld ns)\n", msr & (1 << 15) ? "" : "NOT", (msr & 0x3FF) * irtl_time_units[(msr >> 10) & 0x3]);
 	}

 	if (platform->supported_cstates & PC9) {
 		get_msr(base_cpu, MSR_PKGC9_IRTL, &msr);
 		fprintf(outf, "cpu%d: MSR_PKGC9_IRTL: 0x%08llx (", base_cpu, msr);
-		fprintf(outf, "%svalid, %lld ns)\n", msr & (1 << 15) ? "" : "NOT",
-			(msr & 0x3FF) * irtl_time_units[(msr >> 10) & 0x3]);
+		fprintf(outf, "%svalid, %lld ns)\n", msr & (1 << 15) ? "" : "NOT", (msr & 0x3FF) * irtl_time_units[(msr >> 10) & 0x3]);
 	}

 	if (platform->supported_cstates & PC10) {
 		get_msr(base_cpu, MSR_PKGC10_IRTL, &msr);
 		fprintf(outf, "cpu%d: MSR_PKGC10_IRTL: 0x%08llx (", base_cpu, msr);
-		fprintf(outf, "%svalid, %lld ns)\n", msr & (1 << 15) ? "" : "NOT",
-			(msr & 0x3FF) * irtl_time_units[(msr >> 10) & 0x3]);
+		fprintf(outf, "%svalid, %lld ns)\n", msr & (1 << 15) ? "" : "NOT", (msr & 0x3FF) * irtl_time_units[(msr >> 10) & 0x3]);
 	}
 }

··· 5662 5741

 	free(fd_instr_count_percpu);
 	fd_instr_count_percpu = NULL;
+}
+
+void free_fd_llc_percpu(void)
+{
+	if (!fd_llc_percpu)
+		return;
+
+	for (int i = 0; i < topo.max_cpu_num + 1; ++i) {
+		if (fd_llc_percpu[i] != 0)
+			close(fd_llc_percpu[i]);
+	}
+
+	free(fd_llc_percpu);
+	fd_llc_percpu = NULL;
 }

 void free_fd_cstate(void)
··· 5802 5867

 	free_fd_percpu();
 	free_fd_instr_count_percpu();
+	free_fd_llc_percpu();
 	free_fd_msr();
 	free_fd_rapl_percpu();
 	free_fd_cstate();
··· 6149 6213
 void msr_perf_init(void);
 void rapl_perf_init(void);
 void cstate_perf_init(void);
+void perf_llc_init(void);
 void added_perf_counters_init(void);
 void pmt_init(void);

··· 6161 6224
 	msr_perf_init();
 	rapl_perf_init();
 	cstate_perf_init();
+	perf_llc_init();
 	added_perf_counters_init();
 	pmt_init();
-	fprintf(outf, "turbostat: re-initialized with num_cpus %d, allowed_cpus %d\n", topo.num_cpus,
-		topo.allowed_cpus);
+	fprintf(outf, "turbostat: re-initialized with num_cpus %d, allowed_cpus %d\n", topo.num_cpus, topo.allowed_cpus);
 }

 void set_max_cpu_num(void)
··· 6610 6673
 	timer_delete(timerid);
 release_msr:
 	free(per_cpu_msr_sum);
+	per_cpu_msr_sum = NULL;
 }

 /*
··· 6735 6797
 	}
 }

-void check_dev_msr()
+int probe_dev_msr(void)
 {
 	struct stat sb;
 	char pathname[32];

-	if (no_msr)
-		return;
-#if defined(ANDROID)
 	sprintf(pathname, "/dev/msr%d", base_cpu);
-#else
+	return !stat(pathname, &sb);
+}
+
+int probe_dev_cpu_msr(void)
+{
+	struct stat sb;
+	char pathname[32];
+
 	sprintf(pathname, "/dev/cpu/%d/msr", base_cpu);
-#endif
-	if (stat(pathname, &sb))
-		if (system("/sbin/modprobe msr > /dev/null 2>&1"))
-			no_msr = 1;
+	return !stat(pathname, &sb);
+}
+
+int probe_msr_driver(void)
+{
+	if (probe_dev_msr()) {
+		use_android_msr_path = 1;
+		return 1;
+	}
+	return probe_dev_cpu_msr();
+}
+
+void check_msr_driver(void)
+{
+	if (probe_msr_driver())
+		return;
+
+	if (system("/sbin/modprobe msr > /dev/null 2>&1"))
+		no_msr = 1;
+
+	if (!probe_msr_driver())
+		no_msr = 1;
 }

 /*
··· 6826 6866
 	failed += check_for_cap_sys_rawio();

 	/* test file permissions */
-#if defined(ANDROID)
-	sprintf(pathname, "/dev/msr%d", base_cpu);
-#else
-	sprintf(pathname, "/dev/cpu/%d/msr", base_cpu);
-#endif
+	sprintf(pathname, use_android_msr_path ? "/dev/msr%d" : "/dev/cpu/%d/msr", base_cpu);
 	if (euidaccess(pathname, R_OK)) {
 		failed++;
 	}
··· 6955 6999
 			int k, l;
 			char path_base[128];

-			sprintf(path_base, "/sys/devices/system/cpu/intel_uncore_frequency/package_%02d_die_%02d", i,
-				j);
+			sprintf(path_base, "/sys/devices/system/cpu/intel_uncore_frequency/package_%02d_die_%02d", i, j);

 			sprintf(path, "%s/current_freq_khz", path_base);
 			if (access(path, R_OK))
··· 7038 7083
 		 */
 		if BIC_IS_ENABLED
 			(BIC_UNCORE_MHZ)
-			    add_counter(0, path, name_buf, 0, SCOPE_PACKAGE, COUNTER_K2M, FORMAT_AVERAGE, 0,
-					package_id);
+			    add_counter(0, path, name_buf, 0, SCOPE_PACKAGE, COUNTER_K2M, FORMAT_AVERAGE, 0, package_id);

 		if (quiet)
 			continue;
··· 7047 7093
 		k = read_sysfs_int(path);
 		sprintf(path, "%s/max_freq_khz", path_base);
 		l = read_sysfs_int(path);
-		fprintf(outf, "Uncore Frequency package%d domain%d cluster%d: %d - %d MHz ", package_id, domain_id,
-			cluster_id, k / 1000, l / 1000);
+		fprintf(outf, "Uncore Frequency package%d domain%d cluster%d: %d - %d MHz ", package_id, domain_id, cluster_id, k / 1000, l / 1000);

 		sprintf(path, "%s/initial_min_freq_khz", path_base);
 		k = read_sysfs_int(path);
··· 7109 7156
 		else
 			goto next;

-		set_graphics_fp("/sys/class/drm/card0/device/tile0/gt0/gtidle/idle_residency_ms",
-				gt0_is_gt ? GFX_rc6 : SAM_mc6);
+		set_graphics_fp("/sys/class/drm/card0/device/tile0/gt0/gtidle/idle_residency_ms", gt0_is_gt ? GFX_rc6 : SAM_mc6);

 		set_graphics_fp("/sys/class/drm/card0/device/tile0/gt0/freq0/cur_freq", gt0_is_gt ? GFX_MHz : SAM_MHz);

-		set_graphics_fp("/sys/class/drm/card0/device/tile0/gt0/freq0/act_freq",
-				gt0_is_gt ? GFX_ACTMHz : SAM_ACTMHz);
+		set_graphics_fp("/sys/class/drm/card0/device/tile0/gt0/freq0/act_freq", gt0_is_gt ? GFX_ACTMHz : SAM_ACTMHz);

-		set_graphics_fp("/sys/class/drm/card0/device/tile0/gt1/gtidle/idle_residency_ms",
-				gt0_is_gt ? SAM_mc6 : GFX_rc6);
+		set_graphics_fp("/sys/class/drm/card0/device/tile0/gt1/gtidle/idle_residency_ms", gt0_is_gt ? SAM_mc6 : GFX_rc6);

 		set_graphics_fp("/sys/class/drm/card0/device/tile0/gt1/freq0/cur_freq", gt0_is_gt ? SAM_MHz : GFX_MHz);

-		set_graphics_fp("/sys/class/drm/card0/device/tile0/gt1/freq0/act_freq",
-				gt0_is_gt ? SAM_ACTMHz : GFX_ACTMHz);
+		set_graphics_fp("/sys/class/drm/card0/device/tile0/gt1/freq0/act_freq", gt0_is_gt ? SAM_ACTMHz : GFX_ACTMHz);

 		goto end;
 	}
··· 7374 7425
 		"(high %d guar %d eff %d low %d)\n",
 		cpu, msr,
 		(unsigned int)HWP_HIGHEST_PERF(msr),
-		(unsigned int)HWP_GUARANTEED_PERF(msr),
-		(unsigned int)HWP_MOSTEFFICIENT_PERF(msr), (unsigned int)HWP_LOWEST_PERF(msr));
+		(unsigned int)HWP_GUARANTEED_PERF(msr), (unsigned int)HWP_MOSTEFFICIENT_PERF(msr), (unsigned int)HWP_LOWEST_PERF(msr));

 	if (get_msr(cpu, MSR_HWP_REQUEST, &msr))
 		return 0;
··· 7385 7437
 		(unsigned int)(((msr) >> 0) & 0xff),
 		(unsigned int)(((msr) >> 8) & 0xff),
 		(unsigned int)(((msr) >> 16) & 0xff),
-		(unsigned int)(((msr) >> 24) & 0xff),
-		(unsigned int)(((msr) >> 32) & 0xff3), (unsigned int)(((msr) >> 42) & 0x1));
+		(unsigned int)(((msr) >> 24) & 0xff), (unsigned int)(((msr) >> 32) & 0xff3), (unsigned int)(((msr) >> 42) & 0x1));

 	if (has_hwp_pkg) {
 		if (get_msr(cpu, MSR_HWP_REQUEST_PKG, &msr))
··· 7396 7449
 			cpu, msr,
 			(unsigned int)(((msr) >> 0) & 0xff),
 			(unsigned int)(((msr) >> 8) & 0xff),
-			(unsigned int)(((msr) >> 16) & 0xff),
-			(unsigned int)(((msr) >> 24) & 0xff), (unsigned int)(((msr) >> 32) & 0xff3));
+			(unsigned int)(((msr) >> 16) & 0xff), (unsigned int)(((msr) >> 24) & 0xff), (unsigned int)(((msr) >> 32) & 0xff3));
 	}
 	if (has_hwp_notify) {
 		if (get_msr(cpu, MSR_HWP_INTERRUPT, &msr))
 			return 0;

 		fprintf(outf, "cpu%d: MSR_HWP_INTERRUPT: 0x%08llx "
-			"(%s_Guaranteed_Perf_Change, %s_Excursion_Min)\n",
-			cpu, msr, ((msr) & 0x1) ? "EN" : "Dis", ((msr) & 0x2) ? "EN" : "Dis");
+			"(%s_Guaranteed_Perf_Change, %s_Excursion_Min)\n", cpu, msr, ((msr) & 0x1) ? "EN" : "Dis", ((msr) & 0x2) ? "EN" : "Dis");
 	}
 	if (get_msr(cpu, MSR_HWP_STATUS, &msr))
 		return 0;

 	fprintf(outf, "cpu%d: MSR_HWP_STATUS: 0x%08llx "
-		"(%sGuaranteed_Perf_Change, %sExcursion_Min)\n",
-		cpu, msr, ((msr) & 0x1) ? "" : "No-", ((msr) & 0x4) ? "" : "No-");
+		"(%sGuaranteed_Perf_Change, %sExcursion_Min)\n", cpu, msr, ((msr) & 0x1) ? "" : "No-", ((msr) & 0x4) ? "" : "No-");

 	return 0;
 }
··· 7454 7510
 			(msr & 1 << 6) ? "VR-Therm, " : "",
 			(msr & 1 << 5) ? "Auto-HWP, " : "",
 			(msr & 1 << 4) ? "Graphics, " : "",
-			(msr & 1 << 2) ? "bit2, " : "",
-			(msr & 1 << 1) ? "ThermStatus, " : "", (msr & 1 << 0) ? "PROCHOT, " : "");
+			(msr & 1 << 2) ? "bit2, " : "", (msr & 1 << 1) ? "ThermStatus, " : "", (msr & 1 << 0) ? "PROCHOT, " : "");
 		fprintf(outf, " (Logged: %s%s%s%s%s%s%s%s%s%s%s%s%s%s)\n",
 			(msr & 1 << 31) ? "bit31, " : "",
 			(msr & 1 << 30) ? "bit30, " : "",
··· 7467 7524
 			(msr & 1 << 22) ? "VR-Therm, " : "",
 			(msr & 1 << 21) ? "Auto-HWP, " : "",
 			(msr & 1 << 20) ? "Graphics, " : "",
-			(msr & 1 << 18) ? "bit18, " : "",
-			(msr & 1 << 17) ? "ThermStatus, " : "", (msr & 1 << 16) ? "PROCHOT, " : "");
+			(msr & 1 << 18) ? "bit18, " : "", (msr & 1 << 17) ? "ThermStatus, " : "", (msr & 1 << 16) ? "PROCHOT, " : "");

 	}
 	if (platform->plr_msrs & PLR_GFX) {
··· 7479 7537
 			(msr & 1 << 4) ? "Graphics, " : "",
 			(msr & 1 << 6) ? "VR-Therm, " : "",
 			(msr & 1 << 8) ? "Amps, " : "",
-			(msr & 1 << 9) ? "GFXPwr, " : "",
-			(msr & 1 << 10) ? "PkgPwrL1, " : "", (msr & 1 << 11) ? "PkgPwrL2, " : "");
+			(msr & 1 << 9) ? "GFXPwr, " : "", (msr & 1 << 10) ? "PkgPwrL1, " : "", (msr & 1 << 11) ? "PkgPwrL2, " : "");
 		fprintf(outf, " (Logged: %s%s%s%s%s%s%s%s)\n",
 			(msr & 1 << 16) ? "PROCHOT, " : "",
 			(msr & 1 << 17) ? "ThermStatus, " : "",
 			(msr & 1 << 20) ? "Graphics, " : "",
 			(msr & 1 << 22) ? "VR-Therm, " : "",
 			(msr & 1 << 24) ? "Amps, " : "",
-			(msr & 1 << 25) ? "GFXPwr, " : "",
-			(msr & 1 << 26) ? "PkgPwrL1, " : "", (msr & 1 << 27) ? "PkgPwrL2, " : "");
+			(msr & 1 << 25) ? "GFXPwr, " : "", (msr & 1 << 26) ? "PkgPwrL1, " : "", (msr & 1 << 27) ? "PkgPwrL2, " : "");
 	}
 	if (platform->plr_msrs & PLR_RING) {
 		get_msr(cpu, MSR_RING_PERF_LIMIT_REASONS, &msr);
··· 7495 7555
 			(msr & 1 << 0) ? "PROCHOT, " : "",
 			(msr & 1 << 1) ? "ThermStatus, " : "",
 			(msr & 1 << 6) ? "VR-Therm, " : "",
-			(msr & 1 << 8) ? "Amps, " : "",
-			(msr & 1 << 10) ? "PkgPwrL1, " : "", (msr & 1 << 11) ? "PkgPwrL2, " : "");
+			(msr & 1 << 8) ? "Amps, " : "", (msr & 1 << 10) ? "PkgPwrL1, " : "", (msr & 1 << 11) ? "PkgPwrL2, " : "");
 		fprintf(outf, " (Logged: %s%s%s%s%s%s)\n",
 			(msr & 1 << 16) ? "PROCHOT, " : "",
 			(msr & 1 << 17) ? "ThermStatus, " : "",
 			(msr & 1 << 22) ? "VR-Therm, " : "",
-			(msr & 1 << 24) ? "Amps, " : "",
-			(msr & 1 << 26) ? "PkgPwrL1, " : "", (msr & 1 << 27) ? "PkgPwrL2, " : "");
+			(msr & 1 << 24) ? "Amps, " : "", (msr & 1 << 26) ? "PkgPwrL1, " : "", (msr & 1 << 27) ? "PkgPwrL2, " : "");
 	}
 	return 0;
 }
··· 7520 7582
 {
 	unsigned long long msr;

-	if (platform->rapl_msrs & RAPL_PKG_POWER_INFO)
+	if (valid_rapl_msrs & RAPL_PKG_POWER_INFO)
 		if (!get_msr(base_cpu, MSR_PKG_POWER_INFO, &msr))
 			return ((msr >> 0) & RAPL_POWER_GRANULARITY) * rapl_power_units;
 	return get_quirk_tdp();
··· 7551 7613
 		CLR_BIC(BIC_GFX_J, &bic_enabled);
 	}

-	if (!platform->rapl_msrs || no_msr)
+	if (!valid_rapl_msrs || no_msr)
 		return;

-	if (!(platform->rapl_msrs & RAPL_PKG_PERF_STATUS))
+	if (!(valid_rapl_msrs & RAPL_PKG_PERF_STATUS))
 		CLR_BIC(BIC_PKG__, &bic_enabled);
-	if (!(platform->rapl_msrs & RAPL_DRAM_PERF_STATUS))
+	if (!(valid_rapl_msrs & RAPL_DRAM_PERF_STATUS))
 		CLR_BIC(BIC_RAM__, &bic_enabled);

 	/* units on package 0, verify later other packages match */
··· 7605 7667
 		CLR_BIC(BIC_Cor_J, &bic_enabled);
 	}

-	if (!platform->rapl_msrs || no_msr)
+	if (!valid_rapl_msrs || no_msr)
 		return;

 	if (get_msr(base_cpu, MSR_RAPL_PWR_UNIT, &msr))
··· 7628 7690
 		cpu, label,
 		((msr >> 15) & 1) ? "EN" : "DIS",
 		((msr >> 0) & 0x7FFF) * rapl_power_units,
-		(1.0 + (((msr >> 22) & 0x3) / 4.0)) * (1 << ((msr >> 17) & 0x1F)) * rapl_time_units,
-		(((msr >> 16) & 1) ? "EN" : "DIS"));
+		(1.0 + (((msr >> 22) & 0x3) / 4.0)) * (1 << ((msr >> 17) & 0x1F)) * rapl_time_units, (((msr >> 16) & 1) ? "EN" : "DIS"));

 	return;
 }
··· 7794 7857
 	UNUSED(c);
 	UNUSED(p);

-	if (!platform->rapl_msrs)
+	if (!valid_rapl_msrs)
 		return 0;

 	/* RAPL counters are per package, so print only for 1st thread/package */
··· 7807 7870
 		return -1;
 	}

-	if (platform->rapl_msrs & RAPL_AMD_F17H) {
+	if (valid_rapl_msrs & RAPL_AMD_F17H) {
 		msr_name = "MSR_RAPL_PWR_UNIT";
 		if (get_msr(cpu, MSR_RAPL_PWR_UNIT, &msr))
 			return -1;
··· 7820 7883
 	fprintf(outf, "cpu%d: %s: 0x%08llx (%f Watts, %f Joules, %f sec.)\n", cpu, msr_name, msr,
 		rapl_power_units, rapl_energy_units, rapl_time_units);

-	if (platform->rapl_msrs & RAPL_PKG_POWER_INFO) {
+	if (valid_rapl_msrs & RAPL_PKG_POWER_INFO) {

 		if (get_msr(cpu, MSR_PKG_POWER_INFO, &msr))
 			return -5;
··· 7829 7892
 			cpu, msr,
 			((msr >> 0) & RAPL_POWER_GRANULARITY) * rapl_power_units,
 			((msr >> 16) & RAPL_POWER_GRANULARITY) * rapl_power_units,
-			((msr >> 32) & RAPL_POWER_GRANULARITY) * rapl_power_units,
-			((msr >> 48) & RAPL_TIME_GRANULARITY) * rapl_time_units);
+			((msr >> 32) & RAPL_POWER_GRANULARITY) * rapl_power_units, ((msr >> 48) & RAPL_TIME_GRANULARITY) * rapl_time_units);

 	}
-	if (platform->rapl_msrs & RAPL_PKG) {
+	if (valid_rapl_msrs & RAPL_PKG) {

 		if (get_msr(cpu, MSR_PKG_POWER_LIMIT, &msr))
 			return -9;

-		fprintf(outf, "cpu%d: MSR_PKG_POWER_LIMIT: 0x%08llx (%slocked)\n",
-			cpu, msr, (msr >> 63) & 1 ? "" : "UN");
+		fprintf(outf, "cpu%d: MSR_PKG_POWER_LIMIT: 0x%08llx (%slocked)\n", cpu, msr, (msr >> 63) & 1 ? "" : "UN");

 		print_power_limit_msr(cpu, msr, "PKG Limit #1");
 		fprintf(outf, "cpu%d: PKG Limit #2: %sabled (%0.3f Watts, %f* sec, clamp %sabled)\n",
 			cpu,
 			((msr >> 47) & 1) ? "EN" : "DIS",
 			((msr >> 32) & 0x7FFF) * rapl_power_units,
-			(1.0 + (((msr >> 54) & 0x3) / 4.0)) * (1 << ((msr >> 49) & 0x1F)) * rapl_time_units,
-			((msr >> 48) & 1) ? "EN" : "DIS");
+			(1.0 + (((msr >> 54) & 0x3) / 4.0)) * (1 << ((msr >> 49) & 0x1F)) * rapl_time_units, ((msr >> 48) & 1) ? "EN" : "DIS");

 		if (get_msr(cpu, MSR_VR_CURRENT_CONFIG, &msr))
 			return -9;
··· 7854 7920
 			cpu, ((msr >> 0) & 0x1FFF) * rapl_power_units, (msr >> 31) & 1 ? "" : "UN");
 	}

-	if (platform->rapl_msrs & RAPL_DRAM_POWER_INFO) {
+	if (valid_rapl_msrs & RAPL_DRAM_POWER_INFO) {
 		if (get_msr(cpu, MSR_DRAM_POWER_INFO, &msr))
 			return -6;

··· 7862 7928
 			cpu, msr,
 			((msr >> 0) & RAPL_POWER_GRANULARITY) * rapl_power_units,
 			((msr >> 16) & RAPL_POWER_GRANULARITY) * rapl_power_units,
-			((msr >> 32) & RAPL_POWER_GRANULARITY) * rapl_power_units,
-			((msr >> 48) & RAPL_TIME_GRANULARITY) * rapl_time_units);
+			((msr >> 32) & RAPL_POWER_GRANULARITY) * rapl_power_units, ((msr >> 48) & RAPL_TIME_GRANULARITY) * rapl_time_units);
 	}
-	if (platform->rapl_msrs & RAPL_DRAM) {
+	if (valid_rapl_msrs & RAPL_DRAM) {
 		if (get_msr(cpu, MSR_DRAM_POWER_LIMIT, &msr))
 			return -9;
-		fprintf(outf, "cpu%d: MSR_DRAM_POWER_LIMIT: 0x%08llx (%slocked)\n",
-			cpu, msr, (msr >> 31) & 1 ? "" : "UN");
+		fprintf(outf, "cpu%d: MSR_DRAM_POWER_LIMIT: 0x%08llx (%slocked)\n", cpu, msr, (msr >> 31) & 1 ? "" : "UN");

 		print_power_limit_msr(cpu, msr, "DRAM Limit");
 	}
-	if (platform->rapl_msrs & RAPL_CORE_POLICY) {
+	if (valid_rapl_msrs & RAPL_CORE_POLICY) {
 		if (get_msr(cpu, MSR_PP0_POLICY, &msr))
 			return -7;

 		fprintf(outf, "cpu%d: MSR_PP0_POLICY: %lld\n", cpu, msr & 0xF);
 	}
-	if (platform->rapl_msrs & RAPL_CORE_POWER_LIMIT) {
+	if (valid_rapl_msrs & RAPL_CORE_POWER_LIMIT) {
 		if (get_msr(cpu, MSR_PP0_POWER_LIMIT, &msr))
 			return -9;
-		fprintf(outf, "cpu%d: MSR_PP0_POWER_LIMIT: 0x%08llx (%slocked)\n",
-			cpu, msr, (msr >> 31) & 1 ? "" : "UN");
+		fprintf(outf, "cpu%d: MSR_PP0_POWER_LIMIT: 0x%08llx (%slocked)\n", cpu, msr, (msr >> 31) & 1 ? "" : "UN");
 		print_power_limit_msr(cpu, msr, "Cores Limit");
 	}
-	if (platform->rapl_msrs & RAPL_GFX) {
+	if (valid_rapl_msrs & RAPL_GFX) {
 		if (get_msr(cpu, MSR_PP1_POLICY, &msr))
 			return -8;

··· 7891 7960

 		if (get_msr(cpu, MSR_PP1_POWER_LIMIT, &msr))
 			return -9;
-		fprintf(outf, "cpu%d: MSR_PP1_POWER_LIMIT: 0x%08llx (%slocked)\n",
-			cpu, msr, (msr >> 31) & 1 ? "" : "UN");
+		fprintf(outf, "cpu%d: MSR_PP1_POWER_LIMIT: 0x%08llx (%slocked)\n", cpu, msr, (msr >> 31) & 1 ? "" : "UN");
 		print_power_limit_msr(cpu, msr, "GFX Limit");
 	}
 	return 0;
+}
+
+/*
+ * probe_rapl_msrs
+ *
+ * initialize global valid_rapl_msrs to platform->plat_rapl_msrs
+ * only if PKG_ENERGY counter is enumerated and reads non-zero
+ */
+void probe_rapl_msrs(void)
+{
+	int ret;
+	off_t offset;
+	unsigned long long msr_value;
+
+	if (no_msr)
+		return;
+
+	if ((platform->plat_rapl_msrs & (RAPL_PKG | RAPL_AMD_F17H)) == 0)
+		return;
+
+	offset = idx_to_offset(IDX_PKG_ENERGY);
+	if (offset < 0)
+		return;
+
+	ret = get_msr(base_cpu, offset, &msr_value);
+	if (ret) {
+		if (debug)
+			fprintf(outf, "Can not read RAPL_PKG_ENERGY MSR(0x%llx)\n", (unsigned long long)offset);
+		return;
+	}
+	if (msr_value == 0) {
+		if (debug)
+			fprintf(outf, "RAPL_PKG_ENERGY MSR(0x%llx) == ZERO: disabling all RAPL MSRs\n", (unsigned long long)offset);
+		return;
+	}
+
+	valid_rapl_msrs = platform->plat_rapl_msrs;	/* success */
 }

 /*
··· 7941 7974
  */
 void probe_rapl(void)
 {
+	probe_rapl_msrs();
+
 	if (genuine_intel)
 		rapl_probe_intel();
 	if (authentic_amd || hygon_genuine)
··· 7953 7984

 	print_rapl_sysfs();

-	if (!platform->rapl_msrs || no_msr)
+	if (!valid_rapl_msrs || no_msr)
 		return;

 	for_all_cpus(print_rapl, ODD_COUNTERS);
··· 8057 8088
 	cpu = t->cpu_id;

 	/* DTS is per-core, no need to print for each thread */
-	if (!is_cpu_first_thread_in_core(t, c, p))
+	if (!is_cpu_first_thread_in_core(t, c))
 		return 0;

 	if (cpu_migrate(cpu)) {
··· 8065 8096
 		return -1;
 	}

-	if (do_ptm && is_cpu_first_core_in_package(t, c, p)) {
+	if (do_ptm && is_cpu_first_core_in_package(t, p)) {
 		if (get_msr(cpu, MSR_IA32_PACKAGE_THERM_STATUS, &msr))
 			return 0;

··· 8077 8108

 		dts = (msr >> 16) & 0x7F;
 		dts2 = (msr >> 8) & 0x7F;
-		fprintf(outf, "cpu%d: MSR_IA32_PACKAGE_THERM_INTERRUPT: 0x%08llx (%d C, %d C)\n",
-			cpu, msr, tj_max - dts, tj_max - dts2);
+		fprintf(outf, "cpu%d: MSR_IA32_PACKAGE_THERM_INTERRUPT: 0x%08llx (%d C, %d C)\n", cpu, msr, tj_max - dts, tj_max - dts2);
 	}

 	if (do_dts && debug) {
··· 8088 8120

 		dts = (msr >> 16) & 0x7F;
 		resolution = (msr >> 27) & 0xF;
-		fprintf(outf, "cpu%d: MSR_IA32_THERM_STATUS: 0x%08llx (%d C +/- %d)\n",
-			cpu, msr, tj_max - dts, resolution);
+		fprintf(outf, "cpu%d: MSR_IA32_THERM_STATUS: 0x%08llx (%d C +/- %d)\n", cpu, msr, tj_max - dts, resolution);

 		if (get_msr(cpu, MSR_IA32_THERM_INTERRUPT, &msr))
 			return 0;

 		dts = (msr >> 16) & 0x7F;
 		dts2 = (msr >> 8) & 0x7F;
-		fprintf(outf, "cpu%d: MSR_IA32_THERM_INTERRUPT: 0x%08llx (%d C, %d C)\n",
-			cpu, msr, tj_max - dts, tj_max - dts2);
+		fprintf(outf, "cpu%d: MSR_IA32_THERM_INTERRUPT: 0x%08llx (%d C, %d C)\n", cpu, msr, tj_max - dts, tj_max - dts2);
 	}

 	return 0;
··· 8169 8203
 		msr & MSR_IA32_MISC_ENABLE_TM1 ? "" : "No-",
 		msr & MSR_IA32_MISC_ENABLE_ENHANCED_SPEEDSTEP ? "" : "No-",
 		msr & MSR_IA32_MISC_ENABLE_MWAIT ? "" : "No-",
-		msr & MSR_IA32_MISC_ENABLE_PREFETCH_DISABLE ? "No-" : "",
-		msr & MSR_IA32_MISC_ENABLE_TURBO_DISABLE ? "No-" : "");
+		msr & MSR_IA32_MISC_ENABLE_PREFETCH_DISABLE ? "No-" : "", msr & MSR_IA32_MISC_ENABLE_TURBO_DISABLE ?
"No-" : ""); 8174 8207 } 8175 8208 8176 8209 void decode_misc_feature_control(void) ··· 8208 8243 8209 8244 if (!get_msr(base_cpu, MSR_MISC_PWR_MGMT, &msr)) 8210 8245 fprintf(outf, "cpu%d: MSR_MISC_PWR_MGMT: 0x%08llx (%sable-EIST_Coordination %sable-EPB %sable-OOB)\n", 8211 - base_cpu, msr, 8212 - msr & (1 << 0) ? "DIS" : "EN", msr & (1 << 1) ? "EN" : "DIS", msr & (1 << 8) ? "EN" : "DIS"); 8246 + base_cpu, msr, msr & (1 << 0) ? "DIS" : "EN", msr & (1 << 1) ? "EN" : "DIS", msr & (1 << 8) ? "EN" : "DIS"); 8213 8247 } 8214 8248 8215 8249 /* ··· 8261 8297 close(fd); 8262 8298 } 8263 8299 8264 - static int has_instr_count_access(void) 8300 + static int has_perf_instr_count_access(void) 8265 8301 { 8266 8302 int fd; 8267 - int has_access; 8268 8303 8269 8304 if (no_perf) 8270 8305 return 0; 8271 8306 8272 8307 fd = open_perf_counter(base_cpu, PERF_TYPE_HARDWARE, PERF_COUNT_HW_INSTRUCTIONS, -1, 0); 8273 - has_access = fd != -1; 8274 - 8275 8308 if (fd != -1) 8276 8309 close(fd); 8277 8310 8278 - if (!has_access) 8311 + if (fd == -1) 8279 8312 warnx("Failed to access %s. 
Some of the counters may not be available\n" 8280 - "\tRun as root to enable them or use %s to disable the access explicitly", 8281 - "instructions retired perf counter", "--no-perf"); 8313 + "\tRun as root to enable them or use %s to disable the access explicitly", "perf instructions retired counter", 8314 + "'--hide IPC' or '--no-perf'"); 8282 8315 8283 - return has_access; 8316 + return (fd != -1); 8284 8317 } 8285 8318 8286 - int add_rapl_perf_counter(int cpu, struct rapl_counter_info_t *rci, const struct rapl_counter_arch_info *cai, 8287 - double *scale_, enum rapl_unit *unit_) 8319 + int add_rapl_perf_counter(int cpu, struct rapl_counter_info_t *rci, const struct rapl_counter_arch_info *cai, double *scale_, enum rapl_unit *unit_) 8288 8320 { 8289 8321 int ret = -1; 8290 8322 ··· 8330 8370 if (access("/proc/sys/kernel/perf_event_paranoid", F_OK)) 8331 8371 return; 8332 8372 8333 - if (BIC_IS_ENABLED(BIC_IPC) && has_aperf) { 8373 + if (BIC_IS_ENABLED(BIC_IPC) && cpuid_has_aperf_mperf) { 8334 8374 fd_instr_count_percpu = calloc(topo.max_cpu_num + 1, sizeof(int)); 8335 8375 if (fd_instr_count_percpu == NULL) 8336 8376 err(-1, "calloc fd_instr_count_percpu"); 8377 + } 8378 + if (BIC_IS_ENABLED(BIC_LLC_RPS)) { 8379 + fd_llc_percpu = calloc(topo.max_cpu_num + 1, sizeof(int)); 8380 + if (fd_llc_percpu == NULL) 8381 + err(-1, "calloc fd_llc_percpu"); 8337 8382 } 8338 8383 } 8339 8384 ··· 8450 8485 /* Assumes msr_counter_info is populated */ 8451 8486 static int has_amperf_access(void) 8452 8487 { 8453 - return msr_counter_arch_infos[MSR_ARCH_INFO_APERF_INDEX].present && 8488 + return cpuid_has_aperf_mperf && msr_counter_arch_infos[MSR_ARCH_INFO_APERF_INDEX].present && 8454 8489 msr_counter_arch_infos[MSR_ARCH_INFO_MPERF_INDEX].present; 8455 8490 } 8456 8491 ··· 8673 8708 cci->source[cai->rci_index] = COUNTER_SOURCE_PERF; 8674 8709 8675 8710 /* User MSR for this counter */ 8676 - } else if (pkg_cstate_limit >= cai->pkg_cstate_limit 8677 - && add_msr_counter(cpu, 
cai->msr) >= 0) { 8711 + } else if (pkg_cstate_limit >= cai->pkg_cstate_limit && add_msr_counter(cpu, cai->msr) >= 0) { 8678 8712 cci->source[cai->rci_index] = COUNTER_SOURCE_MSR; 8679 8713 cci->msr[cai->rci_index] = cai->msr; 8680 8714 } ··· 8791 8827 hygon_genuine = 1; 8792 8828 8793 8829 if (!quiet) 8794 - fprintf(outf, "CPUID(0): %.4s%.4s%.4s 0x%x CPUID levels\n", 8795 - (char *)&ebx, (char *)&edx, (char *)&ecx, max_level); 8830 + fprintf(outf, "CPUID(0): %.4s%.4s%.4s 0x%x CPUID levels\n", (char *)&ebx, (char *)&edx, (char *)&ecx, max_level); 8796 8831 8797 8832 __cpuid(1, fms, ebx, ecx, edx); 8798 8833 family = (fms >> 8) & 0xf; ··· 8820 8857 __cpuid(0x80000000, max_extended_level, ebx, ecx, edx); 8821 8858 8822 8859 if (!quiet) { 8823 - fprintf(outf, "CPUID(1): family:model:stepping 0x%x:%x:%x (%d:%d:%d)", 8824 - family, model, stepping, family, model, stepping); 8860 + fprintf(outf, "CPUID(1): family:model:stepping 0x%x:%x:%x (%d:%d:%d)", family, model, stepping, family, model, stepping); 8825 8861 if (ucode_patch_valid) 8826 8862 fprintf(outf, " microcode 0x%x", (unsigned int)((ucode_patch >> 32) & 0xFFFFFFFF)); 8827 8863 fputc('\n', outf); ··· 8834 8872 ecx_flags & (1 << 8) ? "TM2" : "-", 8835 8873 edx_flags & (1 << 4) ? "TSC" : "-", 8836 8874 edx_flags & (1 << 5) ? "MSR" : "-", 8837 - edx_flags & (1 << 22) ? "ACPI-TM" : "-", 8838 - edx_flags & (1 << 28) ? "HT" : "-", edx_flags & (1 << 29) ? "TM" : "-"); 8875 + edx_flags & (1 << 22) ? "ACPI-TM" : "-", edx_flags & (1 << 28) ? "HT" : "-", edx_flags & (1 << 29) ? 
"TM" : "-"); 8839 8876 } 8840 8877 8841 8878 probe_platform_features(family, model); ··· 8858 8897 */ 8859 8898 8860 8899 __cpuid(0x6, eax, ebx, ecx, edx); 8861 - has_aperf = ecx & (1 << 0); 8900 + cpuid_has_aperf_mperf = ecx & (1 << 0); 8862 8901 do_dts = eax & (1 << 0); 8863 8902 if (do_dts) 8864 8903 BIC_PRESENT(BIC_CoreTmp); ··· 8876 8915 if (!quiet) 8877 8916 fprintf(outf, "CPUID(6): %sAPERF, %sTURBO, %sDTS, %sPTM, %sHWP, " 8878 8917 "%sHWPnotify, %sHWPwindow, %sHWPepp, %sHWPpkg, %sEPB\n", 8879 - has_aperf ? "" : "No-", 8918 + cpuid_has_aperf_mperf ? "" : "No-", 8880 8919 has_turbo ? "" : "No-", 8881 8920 do_dts ? "" : "No-", 8882 8921 do_ptm ? "" : "No-", 8883 8922 has_hwp ? "" : "No-", 8884 8923 has_hwp_notify ? "" : "No-", 8885 - has_hwp_activity_window ? "" : "No-", 8886 - has_hwp_epp ? "" : "No-", has_hwp_pkg ? "" : "No-", has_epb ? "" : "No-"); 8924 + has_hwp_activity_window ? "" : "No-", has_hwp_epp ? "" : "No-", has_hwp_pkg ? "" : "No-", has_epb ? "" : "No-"); 8887 8925 8888 8926 if (!quiet) 8889 8927 decode_misc_enable_msr(); ··· 8916 8956 8917 8957 if (ebx_tsc != 0) { 8918 8958 if (!quiet && (ebx != 0)) 8919 - fprintf(outf, "CPUID(0x15): eax_crystal: %d ebx_tsc: %d ecx_crystal_hz: %d\n", 8920 - eax_crystal, ebx_tsc, crystal_hz); 8959 + fprintf(outf, "CPUID(0x15): eax_crystal: %d ebx_tsc: %d ecx_crystal_hz: %d\n", eax_crystal, ebx_tsc, crystal_hz); 8921 8960 8922 8961 if (crystal_hz == 0) 8923 8962 crystal_hz = platform->crystal_freq; ··· 8948 8989 tsc_tweak = base_hz / tsc_hz; 8949 8990 8950 8991 if (!quiet) 8951 - fprintf(outf, "CPUID(0x16): base_mhz: %d max_mhz: %d bus_mhz: %d\n", 8952 - base_mhz, max_mhz, bus_mhz); 8992 + fprintf(outf, "CPUID(0x16): base_mhz: %d max_mhz: %d bus_mhz: %d\n", base_mhz, max_mhz, bus_mhz); 8953 8993 } 8954 8994 8955 - if (has_aperf) 8995 + if (cpuid_has_aperf_mperf) 8956 8996 aperf_mperf_multiplier = platform->need_perf_multiplier ? 
1024 : 1; 8957 8997 8958 8998 BIC_PRESENT(BIC_IRQ); ··· 9001 9043 9002 9044 if (!quiet) 9003 9045 decode_misc_feature_control(); 9046 + } 9047 + 9048 + /* perf_llc_probe 9049 + * 9050 + * return 1 on success, else 0 9051 + */ 9052 + int has_perf_llc_access(void) 9053 + { 9054 + int fd; 9055 + 9056 + if (no_perf) 9057 + return 0; 9058 + 9059 + fd = open_perf_counter(base_cpu, PERF_TYPE_HARDWARE, PERF_COUNT_HW_CACHE_REFERENCES, -1, PERF_FORMAT_GROUP); 9060 + if (fd != -1) 9061 + close(fd); 9062 + 9063 + if (fd == -1) 9064 + warnx("Failed to access %s. Some of the counters may not be available\n" 9065 + "\tRun as root to enable them or use %s to disable the access explicitly", "perf LLC counters", "'--hide LLC' or '--no-perf'"); 9066 + 9067 + return (fd != -1); 9068 + } 9069 + 9070 + void perf_llc_init(void) 9071 + { 9072 + int cpu; 9073 + int retval; 9074 + 9075 + if (no_perf) 9076 + return; 9077 + if (!(BIC_IS_ENABLED(BIC_LLC_RPS) && BIC_IS_ENABLED(BIC_LLC_HIT))) 9078 + return; 9079 + 9080 + for (cpu = 0; cpu <= topo.max_cpu_num; ++cpu) { 9081 + 9082 + if (cpu_is_not_allowed(cpu)) 9083 + continue; 9084 + 9085 + assert(fd_llc_percpu != 0); 9086 + fd_llc_percpu[cpu] = open_perf_counter(cpu, PERF_TYPE_HARDWARE, PERF_COUNT_HW_CACHE_REFERENCES, -1, PERF_FORMAT_GROUP); 9087 + if (fd_llc_percpu[cpu] == -1) { 9088 + warnx("%s: perf REFS: failed to open counter on cpu%d", __func__, cpu); 9089 + free_fd_llc_percpu(); 9090 + return; 9091 + } 9092 + assert(fd_llc_percpu != 0); 9093 + retval = open_perf_counter(cpu, PERF_TYPE_HARDWARE, PERF_COUNT_HW_CACHE_MISSES, fd_llc_percpu[cpu], PERF_FORMAT_GROUP); 9094 + if (retval == -1) { 9095 + warnx("%s: perf MISS: failed to open counter on cpu%d", __func__, cpu); 9096 + free_fd_llc_percpu(); 9097 + return; 9098 + } 9099 + } 9100 + BIC_PRESENT(BIC_LLC_RPS); 9101 + BIC_PRESENT(BIC_LLC_HIT); 9004 9102 } 9005 9103 9006 9104 /* ··· 9365 9351 9366 9352 t->cpu_id = cpu_id; 9367 9353 if (!cpu_is_not_allowed(cpu_id)) { 9354 + 9368 9355 if 
(c->base_cpu < 0) 9369 9356 c->base_cpu = t->cpu_id; 9370 9357 if (pkg_base[pkg_id].base_cpu < 0) ··· 9471 9456 9472 9457 void check_msr_access(void) 9473 9458 { 9474 - check_dev_msr(); 9459 + check_msr_driver(); 9475 9460 check_msr_permission(); 9476 9461 9477 9462 if (no_msr) ··· 9480 9465 9481 9466 void check_perf_access(void) 9482 9467 { 9483 - if (no_perf || !BIC_IS_ENABLED(BIC_IPC) || !has_instr_count_access()) 9484 - CLR_BIC(BIC_IPC, &bic_enabled); 9468 + if (BIC_IS_ENABLED(BIC_IPC)) 9469 + if (!has_perf_instr_count_access()) 9470 + no_perf = 1; 9471 + 9472 + if (BIC_IS_ENABLED(BIC_LLC_RPS) || BIC_IS_ENABLED(BIC_LLC_HIT)) 9473 + if (!has_perf_llc_access()) 9474 + no_perf = 1; 9475 + 9476 + if (no_perf) 9477 + bic_disable_perf_access(); 9485 9478 } 9486 9479 9487 9480 bool perf_has_hybrid_devices(void) ··· 9612 9589 9613 9590 perf_config = read_perf_config(perf_device, pinfo->event); 9614 9591 if (perf_config == (unsigned int)-1) { 9615 - warnx("%s: perf/%s/%s: failed to read %s", 9616 - __func__, perf_device, pinfo->event, "config"); 9592 + warnx("%s: perf/%s/%s: failed to read %s", __func__, perf_device, pinfo->event, "config"); 9617 9593 continue; 9618 9594 } 9619 9595 ··· 9623 9601 9624 9602 fd_perf = open_perf_counter(cpu, perf_type, perf_config, -1, 0); 9625 9603 if (fd_perf == -1) { 9626 - warnx("%s: perf/%s/%s: failed to open counter on cpu%d", 9627 - __func__, perf_device, pinfo->event, cpu); 9604 + warnx("%s: perf/%s/%s: failed to open counter on cpu%d", __func__, perf_device, pinfo->event, cpu); 9628 9605 continue; 9629 9606 } 9630 9607 ··· 9632 9611 pinfo->scale = perf_scale; 9633 9612 9634 9613 if (debug) 9635 - fprintf(stderr, "Add perf/%s/%s cpu%d: %d\n", 9636 - perf_device, pinfo->event, cpu, pinfo->fd_perf_per_domain[next_domain]); 9614 + fprintf(stderr, "Add perf/%s/%s cpu%d: %d\n", perf_device, pinfo->event, cpu, pinfo->fd_perf_per_domain[next_domain]); 9637 9615 } 9638 9616 9639 9617 pinfo = pinfo->next; ··· 9946 9926 } 9947 9927 9948 9928 
if (conflict) { 9949 - fprintf(stderr, "%s: conflicting parameters for the PMT counter with the same name %s\n", 9950 - __func__, name); 9929 + fprintf(stderr, "%s: conflicting parameters for the PMT counter with the same name %s\n", __func__, name); 9951 9930 exit(1); 9952 9931 } 9953 9932 ··· 9989 9970 * CWF with newer firmware might require a PMT_TYPE_XTAL_TIME intead of PMT_TYPE_TCORE_CLOCK. 9990 9971 */ 9991 9972 pmt_add_counter(PMT_CWF_MC1E_GUID, seq, "CPU%c1e", PMT_TYPE_TCORE_CLOCK, 9992 - PMT_COUNTER_CWF_MC1E_LSB, PMT_COUNTER_CWF_MC1E_MSB, offset, SCOPE_CPU, 9993 - FORMAT_DELTA, cpu_num, PMT_OPEN_TRY); 9973 + PMT_COUNTER_CWF_MC1E_LSB, PMT_COUNTER_CWF_MC1E_MSB, offset, SCOPE_CPU, FORMAT_DELTA, cpu_num, PMT_OPEN_TRY); 9994 9974 9995 9975 /* 9996 9976 * Rather complex logic for each time we go to the next loop iteration, ··· 10039 10021 linux_perf_init(); 10040 10022 rapl_perf_init(); 10041 10023 cstate_perf_init(); 10024 + perf_llc_init(); 10042 10025 added_perf_counters_init(); 10043 10026 pmt_init(); 10044 10027 ··· 10145 10126 10146 10127 void print_version() 10147 10128 { 10148 - fprintf(outf, "turbostat version 2025.09.09 - Len Brown <lenb@kernel.org>\n"); 10129 + fprintf(outf, "turbostat version 2025.12.02 - Len Brown <lenb@kernel.org>\n"); 10149 10130 } 10150 10131 10151 10132 #define COMMAND_LINE_SIZE 2048 ··· 10185 10166 } 10186 10167 10187 10168 int add_counter(unsigned int msr_num, char *path, char *name, 10188 - unsigned int width, enum counter_scope scope, 10189 - enum counter_type type, enum counter_format format, int flags, int id) 10169 + unsigned int width, enum counter_scope scope, enum counter_type type, enum counter_format format, int flags, int id) 10190 10170 { 10191 10171 struct msr_counter *msrp; 10192 10172 ··· 10294 10276 struct perf_counter_info *make_perf_counter_info(const char *perf_device, 10295 10277 const char *perf_event, 10296 10278 const char *name, 10297 - unsigned int width, 10298 - enum counter_scope scope, 10299 - enum 
counter_type type, enum counter_format format) 10279 + unsigned int width, enum counter_scope scope, enum counter_type type, enum counter_format format) 10300 10280 { 10301 10281 struct perf_counter_info *pinfo; 10302 10282 ··· 10369 10353 10370 10354 // FIXME: we might not have debug here yet 10371 10355 if (debug) 10372 - fprintf(stderr, "%s: %s/%s, name: %s, scope%d\n", 10373 - __func__, pinfo->device, pinfo->event, pinfo->name, pinfo->scope); 10356 + fprintf(stderr, "%s: %s/%s, name: %s, scope%d\n", __func__, pinfo->device, pinfo->event, pinfo->name, pinfo->scope); 10374 10357 10375 10358 return 0; 10376 10359 } ··· 10538 10523 10539 10524 pmt_diriter_init(&pmt_iter); 10540 10525 10541 - for (dirname = pmt_diriter_begin(&pmt_iter, SYSFS_TELEM_PATH); dirname != NULL; 10542 - dirname = pmt_diriter_next(&pmt_iter)) { 10526 + for (dirname = pmt_diriter_begin(&pmt_iter, SYSFS_TELEM_PATH); dirname != NULL; dirname = pmt_diriter_next(&pmt_iter)) { 10543 10527 10544 10528 fd_telem_dir = openat(dirfd(pmt_iter.dir), dirname->d_name, O_RDONLY | O_DIRECTORY); 10545 10529 if (fd_telem_dir == -1) ··· 10550 10536 } 10551 10537 10552 10538 if (fstat(fd_telem_dir, &stat) == -1) { 10553 - fprintf(stderr, "%s: Failed to stat %s directory: %s", __func__, 10554 - dirname->d_name, strerror(errno)); 10539 + fprintf(stderr, "%s: Failed to stat %s directory: %s", __func__, dirname->d_name, strerror(errno)); 10555 10540 continue; 10556 10541 } 10557 10542 ··· 10646 10633 } 10647 10634 10648 10635 if (!has_scope) { 10649 - printf("%s: invalid value for scope. Expected cpu%%u, core%%u or package%%u.\n", 10650 - __func__); 10636 + printf("%s: invalid value for scope. Expected cpu%%u, core%%u or package%%u.\n", __func__); 10651 10637 exit(1); 10652 10638 } 10653 10639 ··· 10722 10710 } 10723 10711 10724 10712 if (!has_format) { 10725 - fprintf(stderr, "%s: Invalid format %s. Expected raw, average or delta\n", 10726 - __func__, format_name); 10713 + fprintf(stderr, "%s: Invalid format %s. 
Expected raw, average or delta\n", __func__, format_name); 10727 10714 exit(1); 10728 10715 } 10729 10716 } ··· 10889 10878 if (is_deferred_skip(name_buf)) 10890 10879 continue; 10891 10880 10892 - add_counter(0, path, name_buf, 64, SCOPE_CPU, COUNTER_USEC, FORMAT_PERCENT, SYSFS_PERCPU, 0); 10881 + add_counter(0, path, name_buf, 32, SCOPE_CPU, COUNTER_USEC, FORMAT_PERCENT, SYSFS_PERCPU, 0); 10893 10882 10894 10883 if (state > max_state) 10895 10884 max_state = state;
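Note on the LLC columns: the diff above opens `PERF_COUNT_HW_CACHE_REFERENCES` and `PERF_COUNT_HW_CACHE_MISSES` as one perf group per CPU, but the arithmetic that turns them into `LLCkRPS` and `LLC%hit` is not visible in these hunks. The following is a minimal sketch of how the two columns could be derived from one interval's counter deltas. `struct llc_delta`, `llc_krps()` and `llc_pct_hit()` are hypothetical names, and the hit formula (references minus misses, over references) is an assumption, not code taken from turbostat.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical container for one measurement interval's perf counter deltas. */
struct llc_delta {
	unsigned long long references;	/* delta of PERF_COUNT_HW_CACHE_REFERENCES */
	unsigned long long misses;	/* delta of PERF_COUNT_HW_CACHE_MISSES */
};

/* Thousands of LLC references per second over the interval (assumed meaning of LLCkRPS). */
static double llc_krps(const struct llc_delta *d, double interval_sec)
{
	assert(interval_sec > 0.0);
	return (double)d->references / interval_sec / 1000.0;
}

/*
 * LLC hit percentage (assumed formula).
 * Returns NAN when there were no references, or when misses exceed
 * references, so that a caller can print "nan" instead of a bogus value.
 */
static double llc_pct_hit(const struct llc_delta *d)
{
	if (d->references == 0 || d->misses > d->references)
		return NAN;
	return 100.0 * (double)(d->references - d->misses) / (double)d->references;
}
```

Returning NAN for the degenerate cases lines up with the series' "Print "nan" for out of range percentages" commit, though whether turbostat applies exactly these conditions is not shown here.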
+37 -10
tools/power/x86/x86_energy_perf_policy/x86_energy_perf_policy.c
···
95 95 #define PATH_TO_CPU "/sys/devices/system/cpu/"
96 96 #define SYSFS_PATH_MAX 255
97 97
98 + static int use_android_msr_path;
99 +
98 100 /*
99 101  * maintain compatibility with original implementation, but don't document it:
100 102  */
···
372 370 	for (cpu = 0; cpu <= max_cpu_num; ++cpu) {
373 371 		if (CPU_ISSET_S(cpu, cpu_setsize, cpu_selected_set))
374 372 			if (!CPU_ISSET_S(cpu, cpu_setsize, cpu_present_set))
375 - 				errx(1, "Requested cpu% is not present", cpu);
373 + 				errx(1, "Requested cpu%d is not present", cpu);
376 374 	}
377 375 }
378 376
···
520 518
521 519 void print_version(void)
522 520 {
523 - 	printf("x86_energy_perf_policy 2025.9.19 Len Brown <lenb@kernel.org>\n");
521 + 	printf("x86_energy_perf_policy 2025.11.22 Len Brown <lenb@kernel.org>\n");
524 522 }
525 523
526 524 void cmdline(int argc, char **argv)
···
662 660 	}
663 661
664 662 	flags = strstr(buffer, "flags");
663 + 	if (!flags) {
664 + 		fclose(cpuinfo);
665 + 		free(buffer);
666 + 		err(1, "Failed to find 'flags' in /proc/cpuinfo");
667 + 	}
665 668 	rewind(cpuinfo);
666 669 	fseek(cpuinfo, flags - buffer, SEEK_SET);
667 670 	if (!fgets(buffer, 4096, cpuinfo)) {
···
691 684 	char pathname[32];
692 685 	int fd;
693 686
694 - 	sprintf(pathname, "/dev/cpu/%d/msr", cpu);
687 + 	sprintf(pathname, use_android_msr_path ? "/dev/msr%d" : "/dev/cpu/%d/msr", cpu);
695 688 	fd = open(pathname, O_RDONLY);
696 689 	if (fd < 0)
697 - 		err(-1, "%s open failed, try chown or chmod +r /dev/cpu/*/msr, or run as root", pathname);
690 + 		err(-1, "%s open failed, try chown or chmod +r %s, or run as root",
691 + 			pathname, use_android_msr_path ? "/dev/msr*" : "/dev/cpu/*/msr");
692 +
698 693
699 694 	retval = pread(fd, msr, sizeof(*msr), offset);
700 695 	if (retval != sizeof(*msr)) {
···
717 708 	int retval;
718 709 	int fd;
719 710
720 - 	sprintf(pathname, "/dev/cpu/%d/msr", cpu);
711 + 	sprintf(pathname, use_android_msr_path ? "/dev/msr%d" : "/dev/cpu/%d/msr", cpu);
721 712 	fd = open(pathname, O_RDWR);
722 713 	if (fd < 0)
723 - 		err(-1, "%s open failed, try chown or chmod +r /dev/cpu/*/msr, or run as root", pathname);
714 + 		err(-1, "%s open failed, try chown or chmod +r %s, or run as root",
715 + 			pathname, use_android_msr_path ? "/dev/msr*" : "/dev/cpu/*/msr");
724 716
725 717 	retval = pwrite(fd, &new_msr, sizeof(new_msr), offset);
726 718 	if (retval != sizeof(new_msr))
···
1431 1421 		err(-ENODEV, "No valid cpus found");
1432 1422 }
1433 1423
1424 + static void probe_android_msr_path(void)
1425 + {
1426 + 	struct stat sb;
1427 + 	char test_path[32];
1428 +
1429 + 	sprintf(test_path, "/dev/msr%d", base_cpu);
1430 + 	if (stat(test_path, &sb) == 0)
1431 + 		use_android_msr_path = 1;
1432 + }
1434 1433
1435 1434 void probe_dev_msr(void)
1436 1435 {
1437 1436 	struct stat sb;
1438 1437 	char pathname[32];
1439 1438
1440 - 	sprintf(pathname, "/dev/cpu/%d/msr", base_cpu);
1441 - 	if (stat(pathname, &sb))
1442 - 		if (system("/sbin/modprobe msr > /dev/null 2>&1"))
1443 - 			err(-5, "no /dev/cpu/0/msr, Try \"# modprobe msr\" ");
1439 + 	probe_android_msr_path();
1440 +
1441 + 	sprintf(pathname, use_android_msr_path ? "/dev/msr%d" : "/dev/cpu/%d/msr", base_cpu);
1442 + 	if (stat(pathname, &sb)) {
1443 + 		if (system("/sbin/modprobe msr > /dev/null 2>&1")) {
1444 + 			if (use_android_msr_path)
1445 + 				err(-5, "no /dev/msr0, Try \"# modprobe msr\" ");
1446 + 			else
1447 + 				err(-5, "no /dev/cpu/0/msr, Try \"# modprobe msr\" ");
1448 + 		}
1449 + 	}
1444 1450 }
1445 1451
1446 1452 static void get_cpuid_or_exit(unsigned int leaf,
···
1573 1547 int main(int argc, char **argv)
1574 1548 {
1575 1549 	set_base_cpu();
1550 +
1576 1551 	probe_dev_msr();
1577 1552 	init_data_structures();
1578 1553
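Note on the MSR path selection: the patch repeats the `use_android_msr_path ? "/dev/msr%d" : "/dev/cpu/%d/msr"` choice in three places and formats it with `sprintf` into a 32-byte buffer. One possible way to centralize that choice with a bounds check is sketched below. `format_msr_path()` is a hypothetical helper, not part of the patch, and its `android_layout` parameter stands in for the patch's `use_android_msr_path` flag.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not in the patch): build the MSR device node path
 * for one CPU.
 *
 * android_layout != 0 selects "/dev/msr<N>", otherwise "/dev/cpu/<N>/msr",
 * mirroring the two layouts the patch distinguishes.
 *
 * Returns 0 on success, -1 if the path does not fit in buf.
 */
static int format_msr_path(char *buf, size_t len, int cpu, int android_layout)
{
	int n;

	assert(buf != NULL && len > 0);

	if (android_layout)
		n = snprintf(buf, len, "/dev/msr%d", cpu);
	else
		n = snprintf(buf, len, "/dev/cpu/%d/msr", cpu);

	/* snprintf reports the length it wanted; treat truncation as failure */
	if (n < 0 || (size_t)n >= len)
		return -1;
	return 0;
}
```

A caller would pass `sizeof(pathname)` as `len` and handle a nonzero return before calling `open()`. With the 32-byte buffers in the patch, both layouts fit for any realistic CPU number, so this is a defensive refactoring idea and does not indicate a bug in the patch.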