Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Will Deacon:
"There's good stuff across the board, including some nice mm
improvements for CPUs with the 'noabort' BBML2 feature and a clever
patch to allow ptdump to play nicely with block mappings in the
vmalloc area.

Confidential computing:

- Add support for accepting secrets from firmware (e.g. ACPI CCEL)
and mapping them with appropriate attributes.

CPU features:

- Advertise atomic floating-point instructions to userspace

- Extend Spectre workarounds to cover additional Arm CPU variants

- Extend list of CPUs that support break-before-make level 2 and
guarantee not to generate TLB conflict aborts for changes of
mapping granularity (BBML2_NOABORT)

- Add GCS support to our uprobes implementation.

Documentation:

- Remove bogus SME documentation concerning register state when
entering/exiting streaming mode.

Entry code:

- Switch over to the generic IRQ entry code (GENERIC_IRQ_ENTRY)

- Micro-optimise syscall entry path with a compiler branch hint.

Memory management:

- Enable huge mappings in vmalloc space even when kernel page-table
dumping is enabled

- Tidy up the types used in our early MMU setup code

- Rework rodata= for closer parity with the behaviour on x86

- For CPUs implementing BBML2_NOABORT, utilise block mappings in the
linear map even when rodata= applies to virtual aliases

- Don't re-allocate the virtual region between '_text' and '_stext',
as doing so confused tools parsing /proc/vmcore.

Miscellaneous:

- Clean-up Kconfig menuconfig text for architecture features

- Avoid redundant bitmap_empty() during determination of supported
SME vector lengths

- Re-enable warnings when building the 32-bit vDSO object

- Avoid breaking our eggs at the wrong end.

Perf and PMUs:

- Support for v3 of the Hisilicon L3C PMU

- Support for Hisilicon's MN and NoC PMUs

- Support for Fujitsu's Uncore PMU

- Support for SPE's extended event filtering feature

- Preparatory work to enable data source filtering in SPE

- Support for multiple lanes in the DWC PCIe PMU

- Support for i.MX94 in the IMX DDR PMU driver

- MAINTAINERS update (Thank you, Yicong)

- Minor driver fixes (PERF_IDX2OFF() overflow, CMN register offsets).

Selftests:

- Add basic LSFE check to the existing hwcaps test

- Support nolibc in GCS tests

- Extend SVE ptrace test to pass unsupported regsets and invalid
vector lengths

- Minor cleanups (typos, cosmetic changes).

System registers:

- Fix ID_PFR1_EL1 definition

- Fix incorrect signedness of some fields in ID_AA64MMFR4_EL1

- Sync TCR_EL1 definition with the latest Arm ARM (L.b)

- Be stricter about the input fed into our AWK sysreg generator
script

- Typo fixes and removal of redundant definitions.

ACPI, EFI and PSCI:

- Decouple Arm's "Software Delegated Exception Interface" (SDEI)
support from the ACPI GHES code so that it can be used by platforms
booted with device-tree

- Remove unnecessary per-CPU tracking of the FPSIMD state across EFI
runtime calls

- Fix a node refcount imbalance in the PSCI device-tree code.

CPU Features:

- Ensure register sanitisation is applied to fields in ID_AA64MMFR4

- Expose AIDR_EL1 to userspace via sysfs, primarily so that KVM
guests can reliably query the underlying CPU types from the VMM

- Re-enabling of SME support (CONFIG_ARM64_SME) as a result of fixes
to our context-switching, signal handling and ptrace code"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (93 commits)
arm64: cpufeature: Remove duplicate asm/mmu.h header
arm64: Kconfig: Make CPU_BIG_ENDIAN depend on BROKEN
perf/dwc_pcie: Fix use of uninitialized variable
arm/syscalls: mark syscall invocation as likely in invoke_syscall
Documentation: hisi-pmu: Add introduction to HiSilicon V3 PMU
Documentation: hisi-pmu: Fix of minor format error
drivers/perf: hisi: Add support for L3C PMU v3
drivers/perf: hisi: Refactor the event configuration of L3C PMU
drivers/perf: hisi: Extend the field of tt_core
drivers/perf: hisi: Extract the event filter check of L3C PMU
drivers/perf: hisi: Simplify the probe process of each L3C PMU version
drivers/perf: hisi: Export hisi_uncore_pmu_isr()
drivers/perf: hisi: Relax the event ID check in the framework
perf: Fujitsu: Add the Uncore PMU driver
arm64: map [_text, _stext) virtual address range non-executable+read-only
arm64/sysreg: Update TCR_EL1 register
arm64: Enable vmalloc-huge with ptdump
arm64: cpufeature: add Neoverse-V3AE to BBML2 allow list
arm64: errata: Apply workarounds for Neoverse-V3AE
arm64: cputype: Add Neoverse-V3AE definitions
...

+3781 -767
+3 -2
Documentation/admin-guide/kernel-parameters.txt
···
         rodata=         [KNL,EARLY]
                 on      Mark read-only kernel memory as read-only (default).
                 off     Leave read-only kernel memory writable for debugging.
-                full    Mark read-only kernel memory and aliases as read-only
-                        [arm64]
+                noalias Mark read-only kernel memory as read-only but retain
+                        writable aliases in the direct map for regions outside
+                        of the kernel image. [arm64]

         rockchip.usb_uart
                         [EARLY]
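The three values documented in this hunk can be modelled as a small string parser. The following is a minimal user-space sketch, not the kernel's own parsing code: the names `enum rodata_mode` and `parse_rodata` are hypothetical, and rejecting every other spelling (including the removed `full`) is an assumption of the sketch rather than something the diff shows.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical modes mirroring the documented rodata= values. */
enum rodata_mode {
    RODATA_OFF,     /* rodata=off: leave read-only kernel memory writable */
    RODATA_ON,      /* rodata=on: the documented default */
    RODATA_NOALIAS, /* rodata=noalias: arm64 only, direct-map aliases stay writable */
};

/* Store the parsed mode and return true if arg is a documented value. */
static inline bool parse_rodata(const char *arg, enum rodata_mode *mode)
{
    if (!arg || !mode)
        return false;

    if (strcmp(arg, "on") == 0)
        *mode = RODATA_ON;
    else if (strcmp(arg, "off") == 0)
        *mode = RODATA_OFF;
    else if (strcmp(arg, "noalias") == 0)
        *mode = RODATA_NOALIAS;
    else
        return false; /* assumption: anything undocumented is rejected */

    return true;
}
```

A caller would pass the text that follows `rodata=` on the command line and fall back to the default when the function returns false.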
+2 -2
Documentation/admin-guide/perf/dwc_pcie_pmu.rst
···
 - one 64-bit counter for Time Based Analysis (RX/TX data throughput and
   time spent in each low-power LTSSM state) and
-- one 32-bit counter for Event Counting (error and non-error events for
-  a specified lane)
+- one 32-bit counter per event for Event Counting (error and non-error
+  events for a specified lane)

 Note: There is no interrupt for counter overflow.
+110
Documentation/admin-guide/perf/fujitsu_uncore_pmu.rst
···
+.. SPDX-License-Identifier: GPL-2.0-only
+
+================================================
+Fujitsu Uncore Performance Monitoring Unit (PMU)
+================================================
+
+This driver supports the Uncore MAC PMUs and the Uncore PCI PMUs found
+in Fujitsu chips.
+Each MAC PMU on these chips is exposed as a uncore perf PMU with device name
+mac_iod<iod>_mac<mac>_ch<ch>.
+And each PCI PMU on these chips is exposed as a uncore perf PMU with device name
+pci_iod<iod>_pci<pci>.
+
+The driver provides a description of its available events and configuration
+options in sysfs, see /sys/bus/event_sources/devices/mac_iod<iod>_mac<mac>_ch<ch>/
+and /sys/bus/event_sources/devices/pci_iod<iod>_pci<pci>/.
+This driver exports:
+- formats, used by perf user space and other tools to configure events
+- events, used by perf user space and other tools to create events
+  symbolically, e.g.:
+    perf stat -a -e mac_iod0_mac0_ch0/event=0x21/ ls
+    perf stat -a -e pci_iod0_pci0/event=0x24/ ls
+- cpumask, used by perf user space and other tools to know on which CPUs
+  to open the events
+
+This driver supports the following events for MAC:
+- cycles
+  This event counts MAC cycles at MAC frequency.
+- read-count
+  This event counts the number of read requests to MAC.
+- read-count-request
+  This event counts the number of read requests including retry to MAC.
+- read-count-return
+  This event counts the number of responses to read requests to MAC.
+- read-count-request-pftgt
+  This event counts the number of read requests including retry with PFTGT
+  flag.
+- read-count-request-normal
+  This event counts the number of read requests including retry without PFTGT
+  flag.
+- read-count-return-pftgt-hit
+  This event counts the number of responses to read requests which hit the
+  PFTGT buffer.
+- read-count-return-pftgt-miss
+  This event counts the number of responses to read requests which miss the
+  PFTGT buffer.
+- read-wait
+  This event counts outstanding read requests issued by DDR memory controller
+  per cycle.
+- write-count
+  This event counts the number of write requests to MAC (including zero write,
+  full write, partial write, write cancel).
+- write-count-write
+  This event counts the number of full write requests to MAC (not including
+  zero write).
+- write-count-pwrite
+  This event counts the number of partial write requests to MAC.
+- memory-read-count
+  This event counts the number of read requests from MAC to memory.
+- memory-write-count
+  This event counts the number of full write requests from MAC to memory.
+- memory-pwrite-count
+  This event counts the number of partial write requests from MAC to memory.
+- ea-mac
+  This event counts energy consumption of MAC.
+- ea-memory
+  This event counts energy consumption of memory.
+- ea-memory-mac-write
+  This event counts the number of write requests from MAC to memory.
+- ea-ha
+  This event counts energy consumption of HA.
+
+  'ea' is the abbreviation for 'Energy Analyzer'.
+
+Examples for use with perf::
+
+  perf stat -e mac_iod0_mac0_ch0/ea-mac/ ls
+
+And, this driver supports the following events for PCI:
+- pci-port0-cycles
+  This event counts PCI cycles at PCI frequency in port0.
+- pci-port0-read-count
+  This event counts read transactions for data transfer in port0.
+- pci-port0-read-count-bus
+  This event counts read transactions for bus usage in port0.
+- pci-port0-write-count
+  This event counts write transactions for data transfer in port0.
+- pci-port0-write-count-bus
+  This event counts write transactions for bus usage in port0.
+- pci-port1-cycles
+  This event counts PCI cycles at PCI frequency in port1.
+- pci-port1-read-count
+  This event counts read transactions for data transfer in port1.
+- pci-port1-read-count-bus
+  This event counts read transactions for bus usage in port1.
+- pci-port1-write-count
+  This event counts write transactions for data transfer in port1.
+- pci-port1-write-count-bus
+  This event counts write transactions for bus usage in port1.
+- ea-pci
+  This event counts energy consumption of PCI.
+
+  'ea' is the abbreviation for 'Energy Analyzer'.
+
+Examples for use with perf::
+
+  perf stat -e pci_iod0_pci0/ea-pci/ ls
+
+Given that these are uncore PMUs the driver does not support sampling, therefore
+"perf record" will not work. Per-task perf sessions are not supported.
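The device-name patterns described in this new document, mac_iod<iod>_mac<mac>_ch<ch> and pci_iod<iod>_pci<pci>, can be illustrated with two small formatting helpers. This is a user-space sketch only; the function names are hypothetical and nothing here is taken from the driver itself.

```c
#include <stdio.h>

/* Build "mac_iod<iod>_mac<mac>_ch<ch>"; returns what snprintf() returns. */
static inline int fujitsu_mac_pmu_name(char *buf, size_t len, unsigned int iod,
                                       unsigned int mac, unsigned int ch)
{
    return snprintf(buf, len, "mac_iod%u_mac%u_ch%u", iod, mac, ch);
}

/* Build "pci_iod<iod>_pci<pci>"; returns what snprintf() returns. */
static inline int fujitsu_pci_pmu_name(char *buf, size_t len, unsigned int iod,
                                       unsigned int pci)
{
    return snprintf(buf, len, "pci_iod%u_pci%u", iod, pci);
}
```

With iod, mac and ch all zero the first helper yields the name used in the document's own example, `mac_iod0_mac0_ch0`.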
+47 -2
Documentation/admin-guide/perf/hisi-pmu.rst
···
 Each device PMU has separate registers for event counting, control and
 interrupt, and the PMU driver shall register perf PMU drivers like L3C,
 HHA and DDRC etc. The available events and configuration options shall
-be described in the sysfs, see:
+be described in the sysfs, see::

-/sys/bus/event_source/devices/hisi_sccl{X}_<l3c{Y}/hha{Y}/ddrc{Y}>.
+  /sys/bus/event_source/devices/hisi_sccl{X}_<l3c{Y}/hha{Y}/ddrc{Y}>
+
 The "perf list" command shall list the available events from sysfs.

 Each L3C, HHA and DDRC is registered as a separate PMU with perf. The PMU
···
    - 2'b10: count the events which sent to the uring (non-MATA) channel;
    - 2'b00: default value, count the events which sent to the both uring and
      uring_ext channel;
+
+6. ch: NoC PMU supports filtering the event counts of certain transaction
+   channel with this option. The current supported channels are as follows:
+
+   - 3'b010: Request channel
+   - 3'b100: Snoop channel
+   - 3'b110: Response channel
+   - 3'b111: Data channel
+
+7. tt_en: NoC PMU supports counting only transactions that have tracetag set
+   if this option is set. See the 2nd list for more information about tracetag.
+
+For HiSilicon uncore PMU v3 whose identifier is 0x40, some uncore PMUs are
+further divided into parts for finer granularity of tracing, each part has its
+own dedicated PMU, and all such PMUs together cover the monitoring job of events
+on particular uncore device. Such PMUs are described in sysfs with name format
+slightly changed::
+
+  /sys/bus/event_source/devices/hisi_sccl{X}_<l3c{Y}_{Z}/ddrc{Y}_{Z}/noc{Y}_{Z}>
+
+Z is the sub-id, indicating different PMUs for part of hardware device.
+
+Usage of most PMUs with different sub-ids are identical. Specially, L3C PMU
+provides ``ext`` option to allow exploration of even finer granual statistics
+of L3C PMU. L3C PMU driver uses that as hint of termination when delivering
+perf command to hardware:
+
+- ext=0: Default, could be used with event names.
+- ext=1 and ext=2: Must be used with event codes, event names are not supported.
+
+An example of perf command could be::
+
+  $# perf stat -a -e hisi_sccl0_l3c1_0/rd_spipe/ sleep 5
+
+or::
+
+  $# perf stat -a -e hisi_sccl0_l3c1_0/event=0x1,ext=1/ sleep 5
+
+As above, ``hisi_sccl0_l3c1_0`` locates PMU of Super CPU CLuster 0, L3 cache 1
+pipe0.
+
+First command locates the first part of L3C since ``ext=0`` is implied by
+default. Second command issues the counting on another part of L3C with the
+event ``0x1``.

 Users could configure IDs to count data come from specific CCL/ICL, by setting
 srcid_cmd & srcid_msk, and data desitined for specific CCL/ICL by setting
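The 3-bit `ch` encodings listed for the NoC PMU map directly onto a lookup. A minimal sketch, assuming a hypothetical helper name and treating every encoding not listed in the documentation as unsupported:

```c
#include <stddef.h>

/* Map the documented NoC "ch" filter value to its channel name. */
static inline const char *hisi_noc_channel_name(unsigned int ch)
{
    switch (ch & 0x7) {
    case 0x2: /* 3'b010 */
        return "Request";
    case 0x4: /* 3'b100 */
        return "Snoop";
    case 0x6: /* 3'b110 */
        return "Response";
    case 0x7: /* 3'b111 */
        return "Data";
    default:  /* not listed in the documentation */
        return NULL;
    }
}
```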
+1
Documentation/admin-guide/perf/index.rst
···
    cxl
    ampere_cspmu
    mrvl-pem-pmu
+   fujitsu_uncore_pmu
+11
Documentation/arch/arm64/booting.rst
···
   - HDFGWTR2_EL2.nPMICFILTR_EL0 (bit 3) must be initialised to 0b1.
   - HDFGWTR2_EL2.nPMUACR_EL1 (bit 4) must be initialised to 0b1.

+  For CPUs with SPE data source filtering (FEAT_SPE_FDS):
+
+  - If EL3 is present:
+
+    - MDCR_EL3.EnPMS3 (bit 42) must be initialised to 0b1.
+
+  - If the kernel is entered at EL1 and EL2 is present:
+
+    - HDFGRTR2_EL2.nPMSDSFR_EL1 (bit 19) must be initialised to 0b1.
+    - HDFGWTR2_EL2.nPMSDSFR_EL1 (bit 19) must be initialised to 0b1.
+
   For CPUs with Memory Copy and Memory Set instructions (FEAT_MOPS):

   - If the kernel is entered at EL1 and EL2 is present:
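The FEAT_SPE_FDS boot requirements added here are single-bit checks on register values. The sketch below works on plain 64-bit integers so that it runs in user space; the macro and function names are hypothetical, and only the bit positions (42 and 19) come from the text above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions quoted in the booting requirements; names are illustrative. */
#define SKETCH_MDCR_EL3_EnPMS3         (UINT64_C(1) << 42)
#define SKETCH_HDFGxTR2_EL2_nPMSDSFR   (UINT64_C(1) << 19)

/* EL3 present: MDCR_EL3.EnPMS3 must be 0b1. */
static inline bool spe_fds_el3_ok(uint64_t mdcr_el3)
{
    return (mdcr_el3 & SKETCH_MDCR_EL3_EnPMS3) != 0;
}

/* Entered at EL1 with EL2 present: both the read and write trap controls must be 0b1. */
static inline bool spe_fds_el2_ok(uint64_t hdfgrtr2_el2, uint64_t hdfgwtr2_el2)
{
    return (hdfgrtr2_el2 & SKETCH_HDFGxTR2_EL2_nPMSDSFR) != 0 &&
           (hdfgwtr2_el2 & SKETCH_HDFGxTR2_EL2_nPMSDSFR) != 0;
}
```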
+4
Documentation/arch/arm64/elf_hwcaps.rst
···
 HWCAP3_MTE_STORE_ONLY
     Functionality implied by ID_AA64PFR2_EL1.MTESTOREONLY == 0b0001.

+HWCAP3_LSFE
+    Functionality implied by ID_AA64ISAR3_EL1.LSFE == 0b0001
+
+
 4. Unused AT_HWCAP bits
 -----------------------
+2
Documentation/arch/arm64/silicon-errata.rst
···
 +----------------+-----------------+-----------------+-----------------------------+
 | ARM            | Neoverse-V3     | #3312417        | ARM64_ERRATUM_3194386       |
 +----------------+-----------------+-----------------+-----------------------------+
+| ARM            | Neoverse-V3AE   | #3312417        | ARM64_ERRATUM_3194386       |
++----------------+-----------------+-----------------+-----------------------------+
 | ARM            | MMU-500         | #841119,826419  | ARM_SMMU_MMU_500_CPRE_ERRATA|
 |                |                 | #562869,1047329 |                             |
 +----------------+-----------------+-----------------+-----------------------------+
+2 -12
Documentation/arch/arm64/sme.rst
···
    mode SVE vector.


-3. Sharing of streaming and non-streaming mode SVE state
----------------------------------------------------------
-
-It is implementation defined which if any parts of the SVE state are shared
-between streaming and non-streaming modes. When switching between modes
-via software interfaces such as ptrace if no register content is provided as
-part of switching no state will be assumed to be shared and everything will
-be zeroed.
-
-
-4. System call behaviour
+3. System call behaviour
 -------------------------

 * On syscall PSTATE.ZA is preserved, if PSTATE.ZA==1 then the contents of the
···
   exceptions for execve() described in section 6.


-5. Signal handling
+4. Signal handling
 -------------------

 * Signal handlers are invoked with PSTATE.SM=0, PSTATE.ZA=0, and TPIDR2_EL0=0.
+1
Documentation/devicetree/bindings/perf/fsl-imx-ddr.yaml
···
       - items:
           - enum:
               - fsl,imx91-ddr-pmu
+              - fsl,imx94-ddr-pmu
               - fsl,imx95-ddr-pmu
           - const: fsl,imx93-ddr-pmu
+3 -1
MAINTAINERS
···
 FREESCALE IMX DDR PMU DRIVER
 M: Frank Li <Frank.li@nxp.com>
+M: Xu Yang <xu.yang_2@nxp.com>
 L: linux-arm-kernel@lists.infradead.org (moderated for non-subscribers)
 S: Maintained
 F: Documentation/admin-guide/perf/imx-ddr.rst
 F: Documentation/devicetree/bindings/perf/fsl-imx-ddr.yaml
 F: drivers/perf/fsl_imx8_ddr_perf.c
+F: drivers/perf/fsl_imx9_ddr_perf.c
+F: tools/perf/pmu-events/arch/arm64/freescale/

 FREESCALE IMX I2C DRIVER
 M: Oleksij Rempel <o.rempel@pengutronix.de>
···
 F: drivers/net/ethernet/hisilicon/

 HISILICON PMU DRIVER
-M: Yicong Yang <yangyicong@hisilicon.com>
 M: Jonathan Cameron <jonathan.cameron@huawei.com>
 S: Supported
 W: http://www.hisilicon.com
+5 -18
arch/arm64/Kconfig
···
         select GENERIC_EARLY_IOREMAP
         select GENERIC_IDLE_POLL_SETUP
         select GENERIC_IOREMAP
+        select GENERIC_IRQ_ENTRY
         select GENERIC_IRQ_IPI
         select GENERIC_IRQ_KEXEC_CLEAR_VM_FORWARD
         select GENERIC_IRQ_PROBE
···
           * ARM Neoverse-V1 erratum 3324341
           * ARM Neoverse V2 erratum 3324336
           * ARM Neoverse-V3 erratum 3312417
+          * ARM Neoverse-V3AE erratum 3312417

           On affected cores "MSR SSBS, #0" instructions may not affect
           subsequent speculative instructions, which may permit unexepected
···
 config CPU_BIG_ENDIAN
         bool "Build big-endian kernel"
         # https://github.com/llvm/llvm-project/commit/1379b150991f70a5782e9a143c2ba5308da1161c
-        depends on AS_IS_GNU || AS_VERSION >= 150000
+        depends on (AS_IS_GNU || AS_VERSION >= 150000) && BROKEN
         help
           Say Y if you plan on running a kernel with a big-endian userspace.
···
           make use of branch history to influence future speculation.
           When taking an exception from user-space, a sequence of branches
           or a firmware call overwrites the branch history.
-
-config RODATA_FULL_DEFAULT_ENABLED
-        bool "Apply r/o permissions of VM areas also to their linear aliases"
-        default y
-        help
-          Apply read-only attributes of VM areas to the linear alias of
-          the backing pages as well. This prevents code or read-only data
-          from being modified (inadvertently or intentionally) via another
-          mapping of the same memory page. This additional enhancement can
-          be turned off at runtime by passing rodata=[off|on] (and turned on
-          with rodata=full if this option is set to 'n')
-
-          This requires the linear region to be mapped down to pages,
-          which may adversely affect performance in some cases.

 config ARM64_SW_TTBR0_PAN
         bool "Emulate Privileged Access Never using TTBR0_EL1 switching"
···
 endmenu # "ARMv8.9 architectural features"

-menu "v9.4 architectural features"
+menu "ARMv9.4 architectural features"

 config ARM64_GCS
         bool "Enable support for Guarded Control Stack (GCS)"
         default y
         select ARCH_HAS_USER_SHADOW_STACK
         select ARCH_USES_HIGH_VMA_FLAGS
-        depends on !UPROBES
         help
           Guarded Control Stack (GCS) provides support for a separate
           stack with restricted access which contains only return
···
           The feature is detected at runtime, and will remain disabled
           if the system does not implement the feature.

-endmenu # "v9.4 architectural features"
+endmenu # "ARMv9.4 architectural features"

 config ARM64_SVE
         bool "ARM Scalable Vector Extension support"
+2
arch/arm64/include/asm/cpufeature.h
···
         return cpus_have_final_cap(ARM64_HAS_PMUV3);
 }

+bool cpu_supports_bbml2_noabort(void);
+
 static inline bool system_supports_bbml2_noabort(void)
 {
         return alternative_has_cap_unlikely(ARM64_HAS_BBML2_NOABORT);
+6 -2
arch/arm64/include/asm/cputype.h
···
 #define ARM_CPU_PART_CORTEX_A78AE       0xD42
 #define ARM_CPU_PART_CORTEX_X1          0xD44
 #define ARM_CPU_PART_CORTEX_A510        0xD46
-#define ARM_CPU_PART_CORTEX_X1C         0xD4C
 #define ARM_CPU_PART_CORTEX_A520        0xD80
 #define ARM_CPU_PART_CORTEX_A710        0xD47
 #define ARM_CPU_PART_CORTEX_A715        0xD4D
···
 #define ARM_CPU_PART_NEOVERSE_V2        0xD4F
 #define ARM_CPU_PART_CORTEX_A720        0xD81
 #define ARM_CPU_PART_CORTEX_X4          0xD82
+#define ARM_CPU_PART_NEOVERSE_V3AE      0xD83
 #define ARM_CPU_PART_NEOVERSE_V3        0xD84
 #define ARM_CPU_PART_CORTEX_X925        0xD85
 #define ARM_CPU_PART_CORTEX_A725        0xD87
+#define ARM_CPU_PART_CORTEX_A720AE      0xD89
 #define ARM_CPU_PART_NEOVERSE_N3        0xD8E

 #define APM_CPU_PART_XGENE              0x000
···
 #define NVIDIA_CPU_PART_DENVER          0x003
 #define NVIDIA_CPU_PART_CARMEL          0x004
+#define NVIDIA_CPU_PART_OLYMPUS         0x010

 #define FUJITSU_CPU_PART_A64FX          0x001
···
 #define MIDR_CORTEX_A78AE MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A78AE)
 #define MIDR_CORTEX_X1 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X1)
 #define MIDR_CORTEX_A510 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A510)
-#define MIDR_CORTEX_X1C MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X1C)
 #define MIDR_CORTEX_A520 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A520)
 #define MIDR_CORTEX_A710 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A710)
 #define MIDR_CORTEX_A715 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A715)
···
 #define MIDR_NEOVERSE_V2 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_V2)
 #define MIDR_CORTEX_A720 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A720)
 #define MIDR_CORTEX_X4 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X4)
+#define MIDR_NEOVERSE_V3AE MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_V3AE)
 #define MIDR_NEOVERSE_V3 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_V3)
 #define MIDR_CORTEX_X925 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_X925)
 #define MIDR_CORTEX_A725 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A725)
+#define MIDR_CORTEX_A720AE MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A720AE)
 #define MIDR_NEOVERSE_N3 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_N3)
 #define MIDR_THUNDERX MIDR_CPU_MODEL(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX)
 #define MIDR_THUNDERX_81XX MIDR_CPU_MODEL(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX_81XX)
···
 #define MIDR_NVIDIA_DENVER MIDR_CPU_MODEL(ARM_CPU_IMP_NVIDIA, NVIDIA_CPU_PART_DENVER)
 #define MIDR_NVIDIA_CARMEL MIDR_CPU_MODEL(ARM_CPU_IMP_NVIDIA, NVIDIA_CPU_PART_CARMEL)
+#define MIDR_NVIDIA_OLYMPUS MIDR_CPU_MODEL(ARM_CPU_IMP_NVIDIA, NVIDIA_CPU_PART_OLYMPUS)
 #define MIDR_FUJITSU_A64FX MIDR_CPU_MODEL(ARM_CPU_IMP_FUJITSU, FUJITSU_CPU_PART_A64FX)
 #define MIDR_HISI_TSV110 MIDR_CPU_MODEL(ARM_CPU_IMP_HISI, HISI_CPU_PART_TSV110)
 #define MIDR_HISI_HIP09 MIDR_CPU_MODEL(ARM_CPU_IMP_HISI, HISI_CPU_PART_HIP09)
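To show what the new `MIDR_*` model constants encode, here is a user-space sketch of how an implementer code and part number combine into a MIDR model value. The field layout (implementer in bits 31:24, architecture in 19:16 set to 0xf, part number in 15:4) and the implementer codes 0x41 and 0x4E are stated from memory of the architecture and the kernel header rather than from this diff, so treat them as assumptions; the helper names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_IMP_ARM     0x41u /* assumed implementer code for Arm */
#define SKETCH_IMP_NVIDIA  0x4Eu /* assumed implementer code for NVIDIA */

/* Implementer, architecture and part-number fields; variant and revision excluded. */
#define SKETCH_MODEL_MASK  0xff0ffff0u

/* Compose a model value: implementer | architecture (0xf) | part number. */
static inline uint32_t midr_cpu_model(uint32_t imp, uint32_t partnum)
{
    return (imp << 24) | (0xfu << 16) | (partnum << 4);
}

/* True if a raw MIDR value identifies the given model, ignoring variant/revision. */
static inline bool midr_is_model(uint32_t midr, uint32_t model)
{
    return (midr & SKETCH_MODEL_MASK) == model;
}
```

Under these assumptions, the part numbers added in the hunks above (0xD83 for Neoverse-V3AE, 0x010 for NVIDIA Olympus) produce one model value each, and a match ignores the variant and revision nibbles.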
+1 -1
arch/arm64/include/asm/daifflags.h
···
 {
         unsigned long flags = regs->pstate & DAIF_MASK;

-        if (interrupts_enabled(regs))
+        if (!regs_irqs_disabled(regs))
                 trace_hardirqs_on();

         if (system_uses_irq_prio_masking())
+22 -6
arch/arm64/include/asm/el2_setup.h
···
         msr     cntvoff_el2, xzr                // Clear virtual offset
 .endm

+/* Branch to skip_label if SPE version is less than given version */
+.macro __spe_vers_imp skip_label, version, tmp
+        mrs     \tmp, id_aa64dfr0_el1
+        ubfx    \tmp, \tmp, #ID_AA64DFR0_EL1_PMSVer_SHIFT, #4
+        cmp     \tmp, \version
+        b.lt    \skip_label
+.endm
+
 .macro __init_el2_debug
         mrs     x1, id_aa64dfr0_el1
         ubfx    x0, x1, #ID_AA64DFR0_EL1_PMUVer_SHIFT, #4
···
         csel    x2, xzr, x0, eq                 // all PMU counters from EL1

         /* Statistical profiling */
-        ubfx    x0, x1, #ID_AA64DFR0_EL1_PMSVer_SHIFT, #4
-        cbz     x0, .Lskip_spe_\@               // Skip if SPE not present
+        __spe_vers_imp .Lskip_spe_\@, ID_AA64DFR0_EL1_PMSVer_IMP, x0 // Skip if SPE not present

         mrs_s   x0, SYS_PMBIDR_EL1              // If SPE available at EL2,
         and     x0, x0, #(1 << PMBIDR_EL1_P_SHIFT)
···
         mov     x0, xzr
         mov     x2, xzr
-        mrs     x1, id_aa64dfr0_el1
-        ubfx    x1, x1, #ID_AA64DFR0_EL1_PMSVer_SHIFT, #4
-        cmp     x1, #3
-        b.lt    .Lskip_spe_fgt_\@
+        /* If SPEv1p2 is implemented, */
+        __spe_vers_imp .Lskip_spe_fgt_\@, #ID_AA64DFR0_EL1_PMSVer_V1P2, x1
         /* Disable PMSNEVFR_EL1 read and write traps */
         orr     x0, x0, #HDFGRTR_EL2_nPMSNEVFR_EL1_MASK
         orr     x2, x2, #HDFGWTR_EL2_nPMSNEVFR_EL1_MASK
···
         orr     x0, x0, #HDFGRTR2_EL2_nPMICFILTR_EL0
         orr     x0, x0, #HDFGRTR2_EL2_nPMUACR_EL1
 .Lskip_pmuv3p9_\@:
+        /* If SPE is implemented, */
+        __spe_vers_imp .Lskip_spefds_\@, ID_AA64DFR0_EL1_PMSVer_IMP, x1
+        /* we can read PMSIDR and */
+        mrs_s   x1, SYS_PMSIDR_EL1
+        and     x1, x1, #PMSIDR_EL1_FDS
+        /* if FEAT_SPE_FDS is implemented, */
+        cbz     x1, .Lskip_spefds_\@
+        /* disable traps of PMSDSFR to EL2. */
+        orr     x0, x0, #HDFGRTR2_EL2_nPMSDSFR_EL1
+
+.Lskip_spefds_\@:
         msr_s   SYS_HDFGRTR2_EL2, x0
         msr_s   SYS_HDFGWTR2_EL2, x0
         msr_s   SYS_HFGRTR2_EL2, xzr
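The `__spe_vers_imp` macro extracts a 4-bit field from ID_AA64DFR0_EL1 and skips ahead when the SPE version is below a threshold. The same comparison can be sketched in C on a plain integer. The field position (bit 32) and the encoding 1 for "implemented" are assumptions made for illustration; the value 3 for SPEv1p2 matches the literal `#3` that the diff replaces. All names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PMSVER_SHIFT 32 /* assumed position of ID_AA64DFR0_EL1.PMSVer */
#define SKETCH_PMSVER_IMP   1u /* assumed encoding: SPE implemented */
#define SKETCH_PMSVER_V1P2  3u /* the old code compared against literal 3 */

/* Equivalent of "ubfx tmp, tmp, #shift, #4": extract the 4-bit version field. */
static inline unsigned int spe_version(uint64_t id_aa64dfr0)
{
    return (unsigned int)((id_aa64dfr0 >> SKETCH_PMSVER_SHIFT) & 0xf);
}

/* The macro branches to skip_label when this returns false. */
static inline bool spe_at_least(uint64_t id_aa64dfr0, unsigned int version)
{
    return spe_version(id_aa64dfr0) >= version;
}
```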
+57
arch/arm64/include/asm/entry-common.h
···
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef _ASM_ARM64_ENTRY_COMMON_H
+#define _ASM_ARM64_ENTRY_COMMON_H
+
+#include <linux/thread_info.h>
+
+#include <asm/cpufeature.h>
+#include <asm/daifflags.h>
+#include <asm/fpsimd.h>
+#include <asm/mte.h>
+#include <asm/stacktrace.h>
+
+#define ARCH_EXIT_TO_USER_MODE_WORK (_TIF_MTE_ASYNC_FAULT | _TIF_FOREIGN_FPSTATE)
+
+static __always_inline void arch_exit_to_user_mode_work(struct pt_regs *regs,
+                                                        unsigned long ti_work)
+{
+        if (ti_work & _TIF_MTE_ASYNC_FAULT) {
+                clear_thread_flag(TIF_MTE_ASYNC_FAULT);
+                send_sig_fault(SIGSEGV, SEGV_MTEAERR, (void __user *)NULL, current);
+        }
+
+        if (ti_work & _TIF_FOREIGN_FPSTATE)
+                fpsimd_restore_current_state();
+}
+
+#define arch_exit_to_user_mode_work arch_exit_to_user_mode_work
+
+static inline bool arch_irqentry_exit_need_resched(void)
+{
+        /*
+         * DAIF.DA are cleared at the start of IRQ/FIQ handling, and when GIC
+         * priority masking is used the GIC irqchip driver will clear DAIF.IF
+         * using gic_arch_enable_irqs() for normal IRQs. If anything is set in
+         * DAIF we must have handled an NMI, so skip preemption.
+         */
+        if (system_uses_irq_prio_masking() && read_sysreg(daif))
+                return false;
+
+        /*
+         * Preempting a task from an IRQ means we leave copies of PSTATE
+         * on the stack. cpufeature's enable calls may modify PSTATE, but
+         * resuming one of these preempted tasks would undo those changes.
+         *
+         * Only allow a task to be preempted once cpufeatures have been
+         * enabled.
+         */
+        if (!system_capabilities_finalized())
+                return false;
+
+        return true;
+}
+
+#define arch_irqentry_exit_need_resched arch_irqentry_exit_need_resched
+
+#endif /* _ASM_ARM64_ENTRY_COMMON_H */
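The decision made by `arch_irqentry_exit_need_resched()` depends on three inputs only. The sketch below replaces the kernel's system queries with fields of a hypothetical `struct cpu_state` so that the logic can be exercised in user space; it models the two early returns described in the comments and nothing else.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the system state the kernel helper reads. */
struct cpu_state {
    bool     irq_prio_masking; /* models system_uses_irq_prio_masking() */
    uint64_t daif;             /* models read_sysreg(daif) */
    bool     caps_finalized;   /* models system_capabilities_finalized() */
};

static inline bool irqentry_exit_need_resched(const struct cpu_state *s)
{
    /* Anything left set in DAIF under priority masking means an NMI was handled. */
    if (s->irq_prio_masking && s->daif != 0)
        return false;

    /* No preemption until cpufeature enable calls have finished. */
    if (!s->caps_finalized)
        return false;

    return true;
}
```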
-1
arch/arm64/include/asm/exception.h
···
 void do_el0_mops(struct pt_regs *regs, unsigned long esr);
 void do_el1_mops(struct pt_regs *regs, unsigned long esr);
 void do_serror(struct pt_regs *regs, unsigned long esr);
-void do_signal(struct pt_regs *regs);

 void __noreturn panic_bad_stack(struct pt_regs *regs, unsigned long esr, unsigned long far);
 #endif /* __ASM_EXCEPTION_H */
+90 -1
arch/arm64/include/asm/gcs.h
···
         register u64 *_addr __asm__ ("x0") = addr;
         register long _val __asm__ ("x1") = val;

-        /* GCSSTTR x1, x0 */
+        /* GCSSTTR x1, [x0] */
         asm volatile(
                 ".inst 0xd91f1c01\n"
                 :
···
         return 0;
 }

+static inline int gcssttr(unsigned long __user *addr, unsigned long val)
+{
+        register unsigned long __user *_addr __asm__ ("x0") = addr;
+        register unsigned long _val __asm__ ("x1") = val;
+        int err = 0;
+
+        /* GCSSTTR x1, [x0] */
+        asm volatile(
+                "1: .inst 0xd91f1c01\n"
+                "2: \n"
+                _ASM_EXTABLE_UACCESS_ERR(1b, 2b, %w0)
+                : "+r" (err)
+                : "rZ" (_val), "r" (_addr)
+                : "memory");
+
+        return err;
+}
+
+static inline void put_user_gcs(unsigned long val, unsigned long __user *addr,
+                                int *err)
+{
+        int ret;
+
+        if (!access_ok((char __user *)addr, sizeof(u64))) {
+                *err = -EFAULT;
+                return;
+        }
+
+        uaccess_ttbr0_enable();
+        ret = gcssttr(addr, val);
+        if (ret != 0)
+                *err = ret;
+        uaccess_ttbr0_disable();
+}
+
+static inline void push_user_gcs(unsigned long val, int *err)
+{
+        u64 gcspr = read_sysreg_s(SYS_GCSPR_EL0);
+
+        gcspr -= sizeof(u64);
+        put_user_gcs(val, (unsigned long __user *)gcspr, err);
+        if (!*err)
+                write_sysreg_s(gcspr, SYS_GCSPR_EL0);
+}
+
+/*
+ * Unlike put/push_user_gcs() above, get/pop_user_gsc() doesn't
+ * validate the GCS permission is set on the page being read. This
+ * differs from how the hardware works when it consumes data stored at
+ * GCSPR. Callers should ensure this is acceptable.
+ */
+static inline u64 get_user_gcs(unsigned long __user *addr, int *err)
+{
+        unsigned long ret;
+        u64 load = 0;
+
+        /* Ensure previous GCS operation are visible before we read the page */
+        gcsb_dsync();
+        ret = copy_from_user(&load, addr, sizeof(load));
+        if (ret != 0)
+                *err = ret;
+        return load;
+}
+
+static inline u64 pop_user_gcs(int *err)
+{
+        u64 gcspr = read_sysreg_s(SYS_GCSPR_EL0);
+        u64 read_val;
+
+        read_val = get_user_gcs((__force unsigned long __user *)gcspr, err);
+        if (!*err)
+                write_sysreg_s(gcspr + sizeof(u64), SYS_GCSPR_EL0);
+
+        return read_val;
+}
+
 #else

 static inline bool task_gcs_el0_enabled(struct task_struct *task)
···
 static inline void gcs_set_el0_mode(struct task_struct *task) { }
 static inline void gcs_free(struct task_struct *task) { }
 static inline void gcs_preserve_current_state(void) { }
+static inline void put_user_gcs(unsigned long val, unsigned long __user *addr,
+                                int *err) { }
+static inline void push_user_gcs(unsigned long val, int *err) { }
+
 static inline unsigned long gcs_alloc_thread_stack(struct task_struct *tsk,
                                                    const struct kernel_clone_args *args)
 {
···
 }
 static inline int gcs_check_locked(struct task_struct *task,
                                    unsigned long new_val)
+{
+        return 0;
+}
+static inline u64 get_user_gcs(unsigned long __user *addr, int *err)
+{
+        *err = -EFAULT;
+        return 0;
+}
+static inline u64 pop_user_gcs(int *err)
 {
         return 0;
 }
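The push and pop helpers share one pattern: compute the new stack pointer, attempt the access, and commit the pointer only if `*err` is still zero. Note that they never clear `*err`, so the caller must initialise it. The following user-space model reproduces that pattern on a small array; the type `struct gcs_model`, the slot count and the function names are hypothetical, and an out-of-range access stands in for a faulting user access.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define GCS_SLOTS 8

/* Toy descending stack: top == GCS_SLOTS means empty, top == 0 means full. */
struct gcs_model {
    uint64_t slots[GCS_SLOTS];
    size_t   top;
};

static inline void model_push(struct gcs_model *g, uint64_t val, int *err)
{
    if (g->top == 0)
        *err = -EFAULT;              /* models a faulting store */
    else
        g->slots[g->top - 1] = val;  /* store below the current top */

    if (!*err)
        g->top--;                    /* commit the pointer only on success */
}

static inline uint64_t model_pop(struct gcs_model *g, int *err)
{
    uint64_t val = 0;

    if (g->top == GCS_SLOTS)
        *err = -EFAULT;              /* models a faulting load */
    else
        val = g->slots[g->top];

    if (!*err)
        g->top++;                    /* commit the pointer only on success */

    return val;
}
```

As in the kernel helpers, a failed operation leaves the pointer where it was, so a caller can report the error without having corrupted the stack position.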
+1
arch/arm64/include/asm/hwcap.h
@@ -178,6 +178,7 @@
 #define __khwcap3_feature(x)		(const_ilog2(HWCAP3_ ## x) + 128)
 #define KERNEL_HWCAP_MTE_FAR		__khwcap3_feature(MTE_FAR)
 #define KERNEL_HWCAP_MTE_STORE_ONLY	__khwcap3_feature(MTE_STORE_ONLY)
+#define KERNEL_HWCAP_LSFE		__khwcap3_feature(LSFE)
 
 /*
  * This yields a mask that user programs can use to figure out what
+5 -1
arch/arm64/include/asm/io.h
@@ -274,6 +274,10 @@
 #define ioremap_np(addr, size)	\
 	ioremap_prot((addr), (size), __pgprot(PROT_DEVICE_nGnRnE))
 
+
+#define ioremap_encrypted(addr, size)	\
+	ioremap_prot((addr), (size), PAGE_KERNEL)
+
 /*
  * io{read,write}{16,32,64}be() macros
  */
@@ -311,7 +315,7 @@
 static inline bool arm64_is_protected_mmio(phys_addr_t phys_addr, size_t size)
 {
 	if (unlikely(is_realm_world()))
-		return __arm64_is_protected_mmio(phys_addr, size);
+		return arm64_rsi_is_protected(phys_addr, size);
 	return false;
 }
 
+3
arch/arm64/include/asm/mmu.h
@@ -78,6 +78,9 @@
 			       pgprot_t prot, bool page_mappings_only);
 extern void *fixmap_remap_fdt(phys_addr_t dt_phys, int *size, pgprot_t prot);
 extern void mark_linear_text_alias_ro(void);
+extern int split_kernel_leaf_mapping(unsigned long start, unsigned long end);
+extern void init_idmap_kpti_bbml2_flag(void);
+extern void linear_map_maybe_split_to_ptes(void);
 
 /*
  * This check is triggered during the early boot before the cpufeature
+5
arch/arm64/include/asm/pgtable.h
@@ -371,6 +371,11 @@
 	return __pmd(pmd_val(pmd) | PMD_SECT_CONT);
 }
 
+static inline pmd_t pmd_mknoncont(pmd_t pmd)
+{
+	return __pmd(pmd_val(pmd) & ~PMD_SECT_CONT);
+}
+
 #ifdef CONFIG_HAVE_ARCH_USERFAULTFD_WP
 static inline int pte_uffd_wp(pte_t pte)
 {
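The added `pmd_mknoncont()` is the inverse of the set-the-hint helper visible in the context lines: one ORs the contiguous bit in, the other masks it out. A minimal userspace sketch of that pair is shown here. It assumes the contiguous hint is descriptor bit 52, and `model_pmd_t`, `model_mkcont()` and `model_mknoncont()` are hypothetical stand-ins rather than kernel definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for this sketch: the contiguous hint lives in descriptor bit 52. */
#define MODEL_SECT_CONT	(UINT64_C(1) << 52)

/* Hypothetical stand-in for pmd_t: a wrapped 64-bit descriptor value. */
typedef struct { uint64_t val; } model_pmd_t;

/* Set the contiguous hint, leaving every other bit unchanged. */
static model_pmd_t model_mkcont(model_pmd_t pmd)
{
	return (model_pmd_t){ pmd.val | MODEL_SECT_CONT };
}

/* Clear the contiguous hint, leaving every other bit unchanged. */
static model_pmd_t model_mknoncont(model_pmd_t pmd)
{
	return (model_pmd_t){ pmd.val & ~MODEL_SECT_CONT };
}
```

Clearing a bit that is already clear is a no-op, so in this model the clear helper can be applied to an entry without first checking whether the hint is set.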
-2
arch/arm64/include/asm/preempt.h
@@ -2,7 +2,6 @@
 #ifndef __ASM_PREEMPT_H
 #define __ASM_PREEMPT_H
 
-#include <linux/jump_label.h>
 #include <linux/thread_info.h>
 
 #define PREEMPT_NEED_RESCHED	BIT(32)
@@ -87,7 +86,6 @@
 
 #ifdef CONFIG_PREEMPT_DYNAMIC
 
-DECLARE_STATIC_KEY_TRUE(sk_dynamic_irqentry_exit_cond_resched);
 void dynamic_preempt_schedule(void);
 #define __preempt_schedule()		dynamic_preempt_schedule()
 void dynamic_preempt_schedule_notrace(void);
+2
arch/arm64/include/asm/ptdump.h
@@ -7,6 +7,8 @@
 
 #include <linux/ptdump.h>
 
+DECLARE_STATIC_KEY_FALSE(arm64_ptdump_lock_key);
+
 #ifdef CONFIG_PTDUMP
 
 #include <linux/mm_types.h>
+5 -8
arch/arm64/include/asm/ptrace.h
@@ -169,10 +169,6 @@
 
 	u64 sdei_ttbr1;
 	struct frame_record_meta stackframe;
-
-	/* Only valid for some EL1 exceptions. */
-	u64 lockdep_hardirqs;
-	u64 exit_rcu;
 };
 
 /* For correct stack alignment, pt_regs has to be a multiple of 16 bytes. */
@@ -214,11 +210,12 @@
 		(regs)->pmr == GIC_PRIO_IRQON :	\
 		true)
 
-#define interrupts_enabled(regs)	\
-	(!((regs)->pstate & PSR_I_BIT) && irqs_priority_unmasked(regs))
+static __always_inline bool regs_irqs_disabled(const struct pt_regs *regs)
+{
+	return (regs->pstate & PSR_I_BIT) || !irqs_priority_unmasked(regs);
+}
 
-#define fast_interrupts_enabled(regs) \
-	(!((regs)->pstate & PSR_F_BIT))
+#define interrupts_enabled(regs)	(!regs_irqs_disabled(regs))
 
 static inline unsigned long user_stack_pointer(struct pt_regs *regs)
 {
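The second hunk replaces the `interrupts_enabled()` macro body with the negation of a new `regs_irqs_disabled()` helper; by De Morgan's law the two forms should be equivalent. A small userspace sketch that checks this over all input combinations is given below. It assumes `0x80` for the PSTATE.I mask and models `irqs_priority_unmasked(regs)` as a plain boolean field; `struct model_regs` and the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_PSR_I_BIT	UINT64_C(0x80)	/* assumed value of the PSTATE.I mask */

/* Hypothetical stand-in for struct pt_regs with only the inputs used here. */
struct model_regs {
	uint64_t pstate;
	bool prio_unmasked;	/* stands in for irqs_priority_unmasked(regs) */
};

/* Old form: body of the removed interrupts_enabled() macro. */
static bool old_interrupts_enabled(const struct model_regs *regs)
{
	return !(regs->pstate & MODEL_PSR_I_BIT) && regs->prio_unmasked;
}

/* New form: body of the added regs_irqs_disabled() helper. */
static bool new_regs_irqs_disabled(const struct model_regs *regs)
{
	return (regs->pstate & MODEL_PSR_I_BIT) || !regs->prio_unmasked;
}

/* Returns true when old == !new for all four input combinations. */
static bool forms_agree(void)
{
	for (int i = 0; i < 4; i++) {
		struct model_regs r = {
			.pstate = (i & 1) ? MODEL_PSR_I_BIT : 0,
			.prio_unmasked = (i & 2) != 0,
		};

		if (old_interrupts_enabled(&r) != !new_regs_irqs_disabled(&r))
			return false;
	}
	return true;
}
```

Under these assumptions, callers converted from `!interrupts_enabled(regs)` to `regs_irqs_disabled(regs)` elsewhere in this merge keep the same truth value.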
+1 -1
arch/arm64/include/asm/rsi.h
@@ -16,7 +16,7 @@
 
 void __init arm64_rsi_init(void);
 
-bool __arm64_is_protected_mmio(phys_addr_t base, size_t size);
+bool arm64_rsi_is_protected(phys_addr_t base, size_t size);
 
 static inline bool is_realm_world(void)
 {
+2 -2
arch/arm64/include/asm/setup.h
@@ -21,7 +21,7 @@
 	if (!arg)
 		return false;
 
-	if (!strcmp(arg, "full")) {
+	if (!strcmp(arg, "on")) {
 		rodata_enabled = rodata_full = true;
 		return true;
 	}
@@ -31,7 +31,7 @@
 		return true;
 	}
 
-	if (!strcmp(arg, "on")) {
+	if (!strcmp(arg, "noalias")) {
 		rodata_enabled = true;
 		rodata_full = false;
 		return true;
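The two hunks above rename the accepted `rodata=` strings: `on` now selects what `full` used to, and `noalias` selects what `on` used to. A minimal userspace sketch of the resulting mapping is shown here. The `off` branch is not visible in the diff context and is an assumption, as is the final `return false`; `struct model_rodata` and `model_parse_rodata()` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical container for the two flags the parser sets. */
struct model_rodata {
	bool enabled;
	bool full;
};

/* Returns true if 'arg' was recognised; 'out' is only written in that case. */
static bool model_parse_rodata(const char *arg, struct model_rodata *out)
{
	if (!arg)
		return false;

	/* After the change: "on" enables both flags. */
	if (!strcmp(arg, "on")) {
		out->enabled = out->full = true;
		return true;
	}

	/* Assumption: this branch is elided in the hunk context above. */
	if (!strcmp(arg, "off")) {
		out->enabled = out->full = false;
		return true;
	}

	/* After the change: "noalias" enables rodata without the 'full' flag. */
	if (!strcmp(arg, "noalias")) {
		out->enabled = true;
		out->full = false;
		return true;
	}

	/* Assumption: unrecognised strings are rejected. */
	return false;
}
```

In this sketch the old spelling `full` is rejected because no branch matches it; whether the real parser treats it the same way is not shown in the hunks.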
-11
arch/arm64/include/asm/sysreg.h
@@ -281,8 +281,6 @@
 #define SYS_RGSR_EL1			sys_reg(3, 0, 1, 0, 5)
 #define SYS_GCR_EL1			sys_reg(3, 0, 1, 0, 6)
 
-#define SYS_TCR_EL1			sys_reg(3, 0, 2, 0, 2)
-
 #define SYS_APIAKEYLO_EL1		sys_reg(3, 0, 2, 1, 0)
 #define SYS_APIAKEYHI_EL1		sys_reg(3, 0, 2, 1, 1)
 #define SYS_APIBKEYLO_EL1		sys_reg(3, 0, 2, 1, 2)
@@ -343,15 +341,6 @@
 #define SYS_PAR_EL1_PA			GENMASK_ULL(51, 12)
 #define SYS_PAR_EL1_ATTR		GENMASK_ULL(63, 56)
 #define SYS_PAR_EL1_F0_RES0		(GENMASK_ULL(6, 1) | GENMASK_ULL(55, 52))
-
-/*** Statistical Profiling Extension ***/
-#define PMSEVFR_EL1_RES0_IMP	\
-	(GENMASK_ULL(47, 32) | GENMASK_ULL(23, 16) | GENMASK_ULL(11, 8) |\
-	 BIT_ULL(6) | BIT_ULL(4) | BIT_ULL(2) | BIT_ULL(0))
-#define PMSEVFR_EL1_RES0_V1P1	\
-	(PMSEVFR_EL1_RES0_IMP & ~(BIT_ULL(18) | BIT_ULL(17) | BIT_ULL(11)))
-#define PMSEVFR_EL1_RES0_V1P2	\
-	(PMSEVFR_EL1_RES0_V1P1 & ~BIT_ULL(6))
 
 /* Buffer error reporting */
 #define PMBSR_EL1_FAULT_FSC_SHIFT	PMBSR_EL1_MSS_SHIFT
-40
arch/arm64/include/asm/uaccess.h
@@ -502,44 +502,4 @@
 
 #endif /* CONFIG_ARCH_HAS_SUBPAGE_FAULTS */
 
-#ifdef CONFIG_ARM64_GCS
-
-static inline int gcssttr(unsigned long __user *addr, unsigned long val)
-{
-	register unsigned long __user *_addr __asm__ ("x0") = addr;
-	register unsigned long _val __asm__ ("x1") = val;
-	int err = 0;
-
-	/* GCSSTTR x1, x0 */
-	asm volatile(
-		"1: .inst 0xd91f1c01\n"
-		"2: \n"
-		_ASM_EXTABLE_UACCESS_ERR(1b, 2b, %w0)
-		: "+r" (err)
-		: "rZ" (_val), "r" (_addr)
-		: "memory");
-
-	return err;
-}
-
-static inline void put_user_gcs(unsigned long val, unsigned long __user *addr,
-				int *err)
-{
-	int ret;
-
-	if (!access_ok((char __user *)addr, sizeof(u64))) {
-		*err = -EFAULT;
-		return;
-	}
-
-	uaccess_ttbr0_enable();
-	ret = gcssttr(addr, val);
-	if (ret != 0)
-		*err = ret;
-	uaccess_ttbr0_disable();
-}
-
-
-#endif /* CONFIG_ARM64_GCS */
-
 #endif /* __ASM_UACCESS_H */
+2 -7
arch/arm64/include/asm/vmalloc.h
@@ -9,18 +9,13 @@
 #define arch_vmap_pud_supported arch_vmap_pud_supported
 static inline bool arch_vmap_pud_supported(pgprot_t prot)
 {
-	/*
-	 * SW table walks can't handle removal of intermediate entries.
-	 */
-	return pud_sect_supported() &&
-	       !IS_ENABLED(CONFIG_PTDUMP_DEBUGFS);
+	return pud_sect_supported();
 }
 
 #define arch_vmap_pmd_supported arch_vmap_pmd_supported
 static inline bool arch_vmap_pmd_supported(pgprot_t prot)
 {
-	/* See arch_vmap_pud_supported() */
-	return !IS_ENABLED(CONFIG_PTDUMP_DEBUGFS);
+	return true;
 }
 
 #define arch_vmap_pte_range_map_size arch_vmap_pte_range_map_size
+1 -1
arch/arm64/include/asm/xen/events.h
@@ -14,7 +14,7 @@
 
 static inline int xen_irqs_disabled(struct pt_regs *regs)
 {
-	return !interrupts_enabled(regs);
+	return regs_irqs_disabled(regs);
 }
 
 #define xchg_xen_ulong(ptr, val) xchg((ptr), (val))
+1
arch/arm64/include/uapi/asm/hwcap.h
@@ -145,5 +145,6 @@
  */
 #define HWCAP3_MTE_FAR		(1UL << 0)
 #define HWCAP3_MTE_STORE_ONLY	(1UL << 1)
+#define HWCAP3_LSFE		(1UL << 2)
 
 #endif /* _UAPI__ASM_HWCAP_H */
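The new `HWCAP3_LSFE` bit is what userspace would test to learn whether the atomic floating-point instructions are advertised. A hedged sketch of such a check follows. It assumes the bit is delivered through the `AT_HWCAP3` auxiliary vector entry, that `AT_HWCAP3` has the value 29 when the libc headers do not define it, and that `getauxval()` from `<sys/auxv.h>` is available on Linux; `hwcap3_has_lsfe()` and `cpu_has_lsfe()` are hypothetical helper names.

```c
#include <assert.h>
#include <stdbool.h>

#ifdef __linux__
#include <sys/auxv.h>	/* getauxval() */
#endif

#ifndef AT_HWCAP3
#define AT_HWCAP3	29	/* assumed auxv tag; older libc headers may lack it */
#endif

#ifndef HWCAP3_LSFE
#define HWCAP3_LSFE	(1UL << 2)	/* value taken from the hunk above */
#endif

/* Pure helper: test the LSFE bit in an already-fetched HWCAP3 word. */
static bool hwcap3_has_lsfe(unsigned long hwcap3)
{
	return (hwcap3 & HWCAP3_LSFE) != 0;
}

/* Query the running system; reports false where the auxv entry is absent. */
static bool cpu_has_lsfe(void)
{
#ifdef __linux__
	return hwcap3_has_lsfe(getauxval(AT_HWCAP3));
#else
	return false;
#endif
}
```

Keeping the bit test separate from the `getauxval()` call makes the check easy to exercise on machines that lack the feature or are not arm64 at all.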
+11 -1
arch/arm64/kernel/acpi.c
@@ -357,6 +357,16 @@
 			 * as long as we take care not to create a writable
 			 * mapping for executable code.
 			 */
+			fallthrough;
+
+		case EFI_ACPI_MEMORY_NVS:
+			/*
+			 * ACPI NVS marks an area reserved for use by the
+			 * firmware, even after exiting the boot service.
+			 * This may be used by the firmware for sharing dynamic
+			 * tables/data (e.g., ACPI CCEL) with the OS. Map it
+			 * as read-only.
+			 */
 			prot = PAGE_KERNEL_RO;
 			break;
 
@@ -407,7 +417,7 @@
 	return_to_irqs_enabled = !irqs_disabled_flags(arch_local_save_flags());
 
 	if (regs)
-		return_to_irqs_enabled = interrupts_enabled(regs);
+		return_to_irqs_enabled = !regs_irqs_disabled(regs);
 
 	/*
 	 * SEA can interrupt SError, mask it and describe this as an NMI so
+2
arch/arm64/kernel/cpu_errata.c
@@ -531,6 +531,7 @@
 	MIDR_ALL_VERSIONS(MIDR_CORTEX_A710),
 	MIDR_ALL_VERSIONS(MIDR_CORTEX_A715),
 	MIDR_ALL_VERSIONS(MIDR_CORTEX_A720),
+	MIDR_ALL_VERSIONS(MIDR_CORTEX_A720AE),
 	MIDR_ALL_VERSIONS(MIDR_CORTEX_A725),
 	MIDR_ALL_VERSIONS(MIDR_CORTEX_X1),
 	MIDR_ALL_VERSIONS(MIDR_CORTEX_X1C),
@@ -545,6 +546,7 @@
 	MIDR_ALL_VERSIONS(MIDR_NEOVERSE_V1),
 	MIDR_ALL_VERSIONS(MIDR_NEOVERSE_V2),
 	MIDR_ALL_VERSIONS(MIDR_NEOVERSE_V3),
+	MIDR_ALL_VERSIONS(MIDR_NEOVERSE_V3AE),
 	{}
 };
 #endif
+14 -1
arch/arm64/kernel/cpufeature.c
@@ -279,6 +279,7 @@
 
 static const struct arm64_ftr_bits ftr_id_aa64isar3[] = {
 	ARM64_FTR_BITS(FTR_VISIBLE, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64ISAR3_EL1_FPRCVT_SHIFT, 4, 0),
+	ARM64_FTR_BITS(FTR_VISIBLE, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64ISAR3_EL1_LSFE_SHIFT, 4, 0),
 	ARM64_FTR_BITS(FTR_VISIBLE, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64ISAR3_EL1_FAMINMAX_SHIFT, 4, 0),
 	ARM64_FTR_END,
 };
@@ -2028,6 +2029,7 @@
 	if (arm64_use_ng_mappings)
 		return;
 
+	init_idmap_kpti_bbml2_flag();
 	stop_machine(__kpti_install_ng_mappings, NULL, cpu_online_mask);
 }
 
@@ -2218,7 +2220,7 @@
 	return arm64_test_sw_feature_override(ARM64_SW_FEATURE_OVERRIDE_HVHE);
 }
 
-static bool has_bbml2_noabort(const struct arm64_cpu_capabilities *caps, int scope)
+bool cpu_supports_bbml2_noabort(void)
 {
 	/*
 	 * We want to allow usage of BBML2 in as wide a range of kernel contexts
@@ -2235,6 +2237,10 @@
 	static const struct midr_range supports_bbml2_noabort_list[] = {
 		MIDR_REV_RANGE(MIDR_CORTEX_X4, 0, 3, 0xf),
 		MIDR_REV_RANGE(MIDR_NEOVERSE_V3, 0, 2, 0xf),
+		MIDR_REV_RANGE(MIDR_NEOVERSE_V3AE, 0, 2, 0xf),
+		MIDR_ALL_VERSIONS(MIDR_NVIDIA_OLYMPUS),
+		MIDR_ALL_VERSIONS(MIDR_AMPERE1),
+		MIDR_ALL_VERSIONS(MIDR_AMPERE1A),
 		{}
 	};
 
@@ -2248,6 +2254,11 @@
 	 */
 
 	return true;
+}
+
+static bool has_bbml2_noabort(const struct arm64_cpu_capabilities *caps, int scope)
+{
+	return cpu_supports_bbml2_noabort();
 }
 
 #ifdef CONFIG_ARM64_PAN
@@ -3277,6 +3288,7 @@
 	HWCAP_CAP(ID_AA64ISAR1_EL1, I8MM, IMP, CAP_HWCAP, KERNEL_HWCAP_I8MM),
 	HWCAP_CAP(ID_AA64ISAR2_EL1, LUT, IMP, CAP_HWCAP, KERNEL_HWCAP_LUT),
 	HWCAP_CAP(ID_AA64ISAR3_EL1, FAMINMAX, IMP, CAP_HWCAP, KERNEL_HWCAP_FAMINMAX),
+	HWCAP_CAP(ID_AA64ISAR3_EL1, LSFE, IMP, CAP_HWCAP, KERNEL_HWCAP_LSFE),
 	HWCAP_CAP(ID_AA64MMFR2_EL1, AT, IMP, CAP_HWCAP, KERNEL_HWCAP_USCAT),
 #ifdef CONFIG_ARM64_SVE
 	HWCAP_CAP(ID_AA64PFR0_EL1, SVE, IMP, CAP_HWCAP, KERNEL_HWCAP_SVE),
@@ -3948,6 +3960,7 @@
 {
 	setup_system_capabilities();
 
+	linear_map_maybe_split_to_ptes();
 	kpti_install_ng_mappings();
 
 	sve_setup();
+1
arch/arm64/kernel/cpuinfo.c
@@ -162,6 +162,7 @@
 	[KERNEL_HWCAP_SME_SMOP4]	= "smesmop4",
 	[KERNEL_HWCAP_MTE_FAR]		= "mtefar",
 	[KERNEL_HWCAP_MTE_STORE_ONLY]	= "mtestoreonly",
+	[KERNEL_HWCAP_LSFE]		= "lsfe",
 };
 
 #ifdef CONFIG_COMPAT
+1 -1
arch/arm64/kernel/debug-monitors.c
@@ -167,7 +167,7 @@
 	if (WARN_ON(!user_mode(regs)))
 		return;
 
-	if (interrupts_enabled(regs))
+	if (!regs_irqs_disabled(regs))
 		local_irq_enable();
 
 	arm64_force_sig_fault(SIGTRAP, si_code, instruction_pointer(regs),
+145 -278
arch/arm64/kernel/entry-common.c
··· 6 6 */ 7 7 8 8 #include <linux/context_tracking.h> 9 + #include <linux/irq-entry-common.h> 9 10 #include <linux/kasan.h> 10 11 #include <linux/linkage.h> 11 12 #include <linux/livepatch.h> ··· 38 37 * This is intended to match the logic in irqentry_enter(), handling the kernel 39 38 * mode transitions only. 40 39 */ 41 - static __always_inline void __enter_from_kernel_mode(struct pt_regs *regs) 40 + static __always_inline irqentry_state_t __enter_from_kernel_mode(struct pt_regs *regs) 42 41 { 43 - regs->exit_rcu = false; 44 - 45 - if (!IS_ENABLED(CONFIG_TINY_RCU) && is_idle_task(current)) { 46 - lockdep_hardirqs_off(CALLER_ADDR0); 47 - ct_irq_enter(); 48 - trace_hardirqs_off_finish(); 49 - 50 - regs->exit_rcu = true; 51 - return; 52 - } 53 - 54 - lockdep_hardirqs_off(CALLER_ADDR0); 55 - rcu_irq_enter_check_tick(); 56 - trace_hardirqs_off_finish(); 42 + return irqentry_enter(regs); 57 43 } 58 44 59 - static void noinstr enter_from_kernel_mode(struct pt_regs *regs) 45 + static noinstr irqentry_state_t enter_from_kernel_mode(struct pt_regs *regs) 60 46 { 61 - __enter_from_kernel_mode(regs); 47 + irqentry_state_t state; 48 + 49 + state = __enter_from_kernel_mode(regs); 62 50 mte_check_tfsr_entry(); 63 51 mte_disable_tco_entry(current); 52 + 53 + return state; 64 54 } 65 55 66 56 /* ··· 62 70 * This is intended to match the logic in irqentry_exit(), handling the kernel 63 71 * mode transitions only, and with preemption handled elsewhere. 
64 72 */ 65 - static __always_inline void __exit_to_kernel_mode(struct pt_regs *regs) 73 + static __always_inline void __exit_to_kernel_mode(struct pt_regs *regs, 74 + irqentry_state_t state) 66 75 { 67 - lockdep_assert_irqs_disabled(); 68 - 69 - if (interrupts_enabled(regs)) { 70 - if (regs->exit_rcu) { 71 - trace_hardirqs_on_prepare(); 72 - lockdep_hardirqs_on_prepare(); 73 - ct_irq_exit(); 74 - lockdep_hardirqs_on(CALLER_ADDR0); 75 - return; 76 - } 77 - 78 - trace_hardirqs_on(); 79 - } else { 80 - if (regs->exit_rcu) 81 - ct_irq_exit(); 82 - } 76 + irqentry_exit(regs, state); 83 77 } 84 78 85 - static void noinstr exit_to_kernel_mode(struct pt_regs *regs) 79 + static void noinstr exit_to_kernel_mode(struct pt_regs *regs, 80 + irqentry_state_t state) 86 81 { 87 82 mte_check_tfsr_exit(); 88 - __exit_to_kernel_mode(regs); 83 + __exit_to_kernel_mode(regs, state); 89 84 } 90 85 91 86 /* ··· 80 101 * Before this function is called it is not safe to call regular kernel code, 81 102 * instrumentable code, or any code which may trigger an exception. 82 103 */ 83 - static __always_inline void __enter_from_user_mode(void) 104 + static __always_inline void __enter_from_user_mode(struct pt_regs *regs) 84 105 { 85 - lockdep_hardirqs_off(CALLER_ADDR0); 86 - CT_WARN_ON(ct_state() != CT_STATE_USER); 87 - user_exit_irqoff(); 88 - trace_hardirqs_off_finish(); 106 + enter_from_user_mode(regs); 89 107 mte_disable_tco_entry(current); 90 108 } 91 109 92 - static __always_inline void enter_from_user_mode(struct pt_regs *regs) 110 + static __always_inline void arm64_enter_from_user_mode(struct pt_regs *regs) 93 111 { 94 - __enter_from_user_mode(); 112 + __enter_from_user_mode(regs); 95 113 } 96 114 97 115 /* ··· 96 120 * After this function returns it is not safe to call regular kernel code, 97 121 * instrumentable code, or any code which may trigger an exception. 
98 122 */ 99 - static __always_inline void __exit_to_user_mode(void) 123 + 124 + static __always_inline void arm64_exit_to_user_mode(struct pt_regs *regs) 100 125 { 101 - trace_hardirqs_on_prepare(); 102 - lockdep_hardirqs_on_prepare(); 103 - user_enter_irqoff(); 104 - lockdep_hardirqs_on(CALLER_ADDR0); 105 - } 106 - 107 - static void do_notify_resume(struct pt_regs *regs, unsigned long thread_flags) 108 - { 109 - do { 110 - local_irq_enable(); 111 - 112 - if (thread_flags & (_TIF_NEED_RESCHED | _TIF_NEED_RESCHED_LAZY)) 113 - schedule(); 114 - 115 - if (thread_flags & _TIF_UPROBE) 116 - uprobe_notify_resume(regs); 117 - 118 - if (thread_flags & _TIF_MTE_ASYNC_FAULT) { 119 - clear_thread_flag(TIF_MTE_ASYNC_FAULT); 120 - send_sig_fault(SIGSEGV, SEGV_MTEAERR, 121 - (void __user *)NULL, current); 122 - } 123 - 124 - if (thread_flags & _TIF_PATCH_PENDING) 125 - klp_update_patch_state(current); 126 - 127 - if (thread_flags & (_TIF_SIGPENDING | _TIF_NOTIFY_SIGNAL)) 128 - do_signal(regs); 129 - 130 - if (thread_flags & _TIF_NOTIFY_RESUME) 131 - resume_user_mode_work(regs); 132 - 133 - if (thread_flags & _TIF_FOREIGN_FPSTATE) 134 - fpsimd_restore_current_state(); 135 - 136 - local_irq_disable(); 137 - thread_flags = read_thread_flags(); 138 - } while (thread_flags & _TIF_WORK_MASK); 139 - } 140 - 141 - static __always_inline void exit_to_user_mode_prepare(struct pt_regs *regs) 142 - { 143 - unsigned long flags; 144 - 145 126 local_irq_disable(); 146 - 147 - flags = read_thread_flags(); 148 - if (unlikely(flags & _TIF_WORK_MASK)) 149 - do_notify_resume(regs, flags); 150 - 151 - local_daif_mask(); 152 - 153 - lockdep_sys_exit(); 154 - } 155 - 156 - static __always_inline void exit_to_user_mode(struct pt_regs *regs) 157 - { 158 127 exit_to_user_mode_prepare(regs); 128 + local_daif_mask(); 159 129 mte_check_tfsr_exit(); 160 - __exit_to_user_mode(); 130 + exit_to_user_mode(); 161 131 } 162 132 163 133 asmlinkage void noinstr asm_exit_to_user_mode(struct pt_regs *regs) 164 134 { 
165 - exit_to_user_mode(regs); 166 - } 167 - 168 - /* 169 - * Handle IRQ/context state management when entering an NMI from user/kernel 170 - * mode. Before this function is called it is not safe to call regular kernel 171 - * code, instrumentable code, or any code which may trigger an exception. 172 - */ 173 - static void noinstr arm64_enter_nmi(struct pt_regs *regs) 174 - { 175 - regs->lockdep_hardirqs = lockdep_hardirqs_enabled(); 176 - 177 - __nmi_enter(); 178 - lockdep_hardirqs_off(CALLER_ADDR0); 179 - lockdep_hardirq_enter(); 180 - ct_nmi_enter(); 181 - 182 - trace_hardirqs_off_finish(); 183 - ftrace_nmi_enter(); 184 - } 185 - 186 - /* 187 - * Handle IRQ/context state management when exiting an NMI from user/kernel 188 - * mode. After this function returns it is not safe to call regular kernel 189 - * code, instrumentable code, or any code which may trigger an exception. 190 - */ 191 - static void noinstr arm64_exit_nmi(struct pt_regs *regs) 192 - { 193 - bool restore = regs->lockdep_hardirqs; 194 - 195 - ftrace_nmi_exit(); 196 - if (restore) { 197 - trace_hardirqs_on_prepare(); 198 - lockdep_hardirqs_on_prepare(); 199 - } 200 - 201 - ct_nmi_exit(); 202 - lockdep_hardirq_exit(); 203 - if (restore) 204 - lockdep_hardirqs_on(CALLER_ADDR0); 205 - __nmi_exit(); 135 + arm64_exit_to_user_mode(regs); 206 136 } 207 137 208 138 /* ··· 116 234 * kernel mode. Before this function is called it is not safe to call regular 117 235 * kernel code, instrumentable code, or any code which may trigger an exception. 
118 236 */ 119 - static void noinstr arm64_enter_el1_dbg(struct pt_regs *regs) 237 + static noinstr irqentry_state_t arm64_enter_el1_dbg(struct pt_regs *regs) 120 238 { 121 - regs->lockdep_hardirqs = lockdep_hardirqs_enabled(); 239 + irqentry_state_t state; 240 + 241 + state.lockdep = lockdep_hardirqs_enabled(); 122 242 123 243 lockdep_hardirqs_off(CALLER_ADDR0); 124 244 ct_nmi_enter(); 125 245 126 246 trace_hardirqs_off_finish(); 247 + 248 + return state; 127 249 } 128 250 129 251 /* ··· 135 249 * kernel mode. After this function returns it is not safe to call regular 136 250 * kernel code, instrumentable code, or any code which may trigger an exception. 137 251 */ 138 - static void noinstr arm64_exit_el1_dbg(struct pt_regs *regs) 252 + static void noinstr arm64_exit_el1_dbg(struct pt_regs *regs, 253 + irqentry_state_t state) 139 254 { 140 - bool restore = regs->lockdep_hardirqs; 141 - 142 - if (restore) { 255 + if (state.lockdep) { 143 256 trace_hardirqs_on_prepare(); 144 257 lockdep_hardirqs_on_prepare(); 145 258 } 146 259 147 260 ct_nmi_exit(); 148 - if (restore) 261 + if (state.lockdep) 149 262 lockdep_hardirqs_on(CALLER_ADDR0); 150 - } 151 - 152 - #ifdef CONFIG_PREEMPT_DYNAMIC 153 - DEFINE_STATIC_KEY_TRUE(sk_dynamic_irqentry_exit_cond_resched); 154 - #define need_irq_preemption() \ 155 - (static_branch_unlikely(&sk_dynamic_irqentry_exit_cond_resched)) 156 - #else 157 - #define need_irq_preemption() (IS_ENABLED(CONFIG_PREEMPTION)) 158 - #endif 159 - 160 - static void __sched arm64_preempt_schedule_irq(void) 161 - { 162 - if (!need_irq_preemption()) 163 - return; 164 - 165 - /* 166 - * Note: thread_info::preempt_count includes both thread_info::count 167 - * and thread_info::need_resched, and is not equivalent to 168 - * preempt_count(). 
169 - */ 170 - if (READ_ONCE(current_thread_info()->preempt_count) != 0) 171 - return; 172 - 173 - /* 174 - * DAIF.DA are cleared at the start of IRQ/FIQ handling, and when GIC 175 - * priority masking is used the GIC irqchip driver will clear DAIF.IF 176 - * using gic_arch_enable_irqs() for normal IRQs. If anything is set in 177 - * DAIF we must have handled an NMI, so skip preemption. 178 - */ 179 - if (system_uses_irq_prio_masking() && read_sysreg(daif)) 180 - return; 181 - 182 - /* 183 - * Preempting a task from an IRQ means we leave copies of PSTATE 184 - * on the stack. cpufeature's enable calls may modify PSTATE, but 185 - * resuming one of these preempted tasks would undo those changes. 186 - * 187 - * Only allow a task to be preempted once cpufeatures have been 188 - * enabled. 189 - */ 190 - if (system_capabilities_finalized()) 191 - preempt_schedule_irq(); 192 263 } 193 264 194 265 static void do_interrupt_handler(struct pt_regs *regs, ··· 167 324 static void noinstr __panic_unhandled(struct pt_regs *regs, const char *vector, 168 325 unsigned long esr) 169 326 { 170 - arm64_enter_nmi(regs); 327 + irqentry_nmi_enter(regs); 171 328 172 329 console_verbose(); 173 330 ··· 318 475 static void noinstr el1_abort(struct pt_regs *regs, unsigned long esr) 319 476 { 320 477 unsigned long far = read_sysreg(far_el1); 478 + irqentry_state_t state; 321 479 322 - enter_from_kernel_mode(regs); 480 + state = enter_from_kernel_mode(regs); 323 481 local_daif_inherit(regs); 324 482 do_mem_abort(far, esr, regs); 325 483 local_daif_mask(); 326 - exit_to_kernel_mode(regs); 484 + exit_to_kernel_mode(regs, state); 327 485 } 328 486 329 487 static void noinstr el1_pc(struct pt_regs *regs, unsigned long esr) 330 488 { 331 489 unsigned long far = read_sysreg(far_el1); 490 + irqentry_state_t state; 332 491 333 - enter_from_kernel_mode(regs); 492 + state = enter_from_kernel_mode(regs); 334 493 local_daif_inherit(regs); 335 494 do_sp_pc_abort(far, esr, regs); 336 495 local_daif_mask(); 
337 - exit_to_kernel_mode(regs); 496 + exit_to_kernel_mode(regs, state); 338 497 } 339 498 340 499 static void noinstr el1_undef(struct pt_regs *regs, unsigned long esr) 341 500 { 342 - enter_from_kernel_mode(regs); 501 + irqentry_state_t state; 502 + 503 + state = enter_from_kernel_mode(regs); 343 504 local_daif_inherit(regs); 344 505 do_el1_undef(regs, esr); 345 506 local_daif_mask(); 346 - exit_to_kernel_mode(regs); 507 + exit_to_kernel_mode(regs, state); 347 508 } 348 509 349 510 static void noinstr el1_bti(struct pt_regs *regs, unsigned long esr) 350 511 { 351 - enter_from_kernel_mode(regs); 512 + irqentry_state_t state; 513 + 514 + state = enter_from_kernel_mode(regs); 352 515 local_daif_inherit(regs); 353 516 do_el1_bti(regs, esr); 354 517 local_daif_mask(); 355 - exit_to_kernel_mode(regs); 518 + exit_to_kernel_mode(regs, state); 356 519 } 357 520 358 521 static void noinstr el1_gcs(struct pt_regs *regs, unsigned long esr) 359 522 { 360 - enter_from_kernel_mode(regs); 523 + irqentry_state_t state; 524 + 525 + state = enter_from_kernel_mode(regs); 361 526 local_daif_inherit(regs); 362 527 do_el1_gcs(regs, esr); 363 528 local_daif_mask(); 364 - exit_to_kernel_mode(regs); 529 + exit_to_kernel_mode(regs, state); 365 530 } 366 531 367 532 static void noinstr el1_mops(struct pt_regs *regs, unsigned long esr) 368 533 { 369 - enter_from_kernel_mode(regs); 534 + irqentry_state_t state; 535 + 536 + state = enter_from_kernel_mode(regs); 370 537 local_daif_inherit(regs); 371 538 do_el1_mops(regs, esr); 372 539 local_daif_mask(); 373 - exit_to_kernel_mode(regs); 540 + exit_to_kernel_mode(regs, state); 374 541 } 375 542 376 543 static void noinstr el1_breakpt(struct pt_regs *regs, unsigned long esr) 377 544 { 378 - arm64_enter_el1_dbg(regs); 545 + irqentry_state_t state; 546 + 547 + state = arm64_enter_el1_dbg(regs); 379 548 debug_exception_enter(regs); 380 549 do_breakpoint(esr, regs); 381 550 debug_exception_exit(regs); 382 - arm64_exit_el1_dbg(regs); 551 + 
arm64_exit_el1_dbg(regs, state); 383 552 } 384 553 385 554 static void noinstr el1_softstp(struct pt_regs *regs, unsigned long esr) 386 555 { 387 - arm64_enter_el1_dbg(regs); 556 + irqentry_state_t state; 557 + 558 + state = arm64_enter_el1_dbg(regs); 388 559 if (!cortex_a76_erratum_1463225_debug_handler(regs)) { 389 560 debug_exception_enter(regs); 390 561 /* ··· 411 554 do_el1_softstep(esr, regs); 412 555 debug_exception_exit(regs); 413 556 } 414 - arm64_exit_el1_dbg(regs); 557 + arm64_exit_el1_dbg(regs, state); 415 558 } 416 559 417 560 static void noinstr el1_watchpt(struct pt_regs *regs, unsigned long esr) 418 561 { 419 562 /* Watchpoints are the only debug exception to write FAR_EL1 */ 420 563 unsigned long far = read_sysreg(far_el1); 564 + irqentry_state_t state; 421 565 422 - arm64_enter_el1_dbg(regs); 566 + state = arm64_enter_el1_dbg(regs); 423 567 debug_exception_enter(regs); 424 568 do_watchpoint(far, esr, regs); 425 569 debug_exception_exit(regs); 426 - arm64_exit_el1_dbg(regs); 570 + arm64_exit_el1_dbg(regs, state); 427 571 } 428 572 429 573 static void noinstr el1_brk64(struct pt_regs *regs, unsigned long esr) 430 574 { 431 - arm64_enter_el1_dbg(regs); 575 + irqentry_state_t state; 576 + 577 + state = arm64_enter_el1_dbg(regs); 432 578 debug_exception_enter(regs); 433 579 do_el1_brk64(esr, regs); 434 580 debug_exception_exit(regs); 435 - arm64_exit_el1_dbg(regs); 581 + arm64_exit_el1_dbg(regs, state); 436 582 } 437 583 438 584 static void noinstr el1_fpac(struct pt_regs *regs, unsigned long esr) 439 585 { 440 - enter_from_kernel_mode(regs); 586 + irqentry_state_t state; 587 + 588 + state = enter_from_kernel_mode(regs); 441 589 local_daif_inherit(regs); 442 590 do_el1_fpac(regs, esr); 443 591 local_daif_mask(); 444 - exit_to_kernel_mode(regs); 592 + exit_to_kernel_mode(regs, state); 445 593 } 446 594 447 595 asmlinkage void noinstr el1h_64_sync_handler(struct pt_regs *regs) ··· 501 639 static __always_inline void __el1_pnmi(struct pt_regs *regs, 502 
640 void (*handler)(struct pt_regs *)) 503 641 { 504 - arm64_enter_nmi(regs); 642 + irqentry_state_t state; 643 + 644 + state = irqentry_nmi_enter(regs); 505 645 do_interrupt_handler(regs, handler); 506 - arm64_exit_nmi(regs); 646 + irqentry_nmi_exit(regs, state); 507 647 } 508 648 509 649 static __always_inline void __el1_irq(struct pt_regs *regs, 510 650 void (*handler)(struct pt_regs *)) 511 651 { 512 - enter_from_kernel_mode(regs); 652 + irqentry_state_t state; 653 + 654 + state = enter_from_kernel_mode(regs); 513 655 514 656 irq_enter_rcu(); 515 657 do_interrupt_handler(regs, handler); 516 658 irq_exit_rcu(); 517 659 518 - arm64_preempt_schedule_irq(); 519 - 520 - exit_to_kernel_mode(regs); 660 + exit_to_kernel_mode(regs, state); 521 661 } 522 662 static void noinstr el1_interrupt(struct pt_regs *regs, 523 663 void (*handler)(struct pt_regs *)) 524 664 { 525 665 write_sysreg(DAIF_PROCCTX_NOIRQ, daif); 526 666 527 - if (IS_ENABLED(CONFIG_ARM64_PSEUDO_NMI) && !interrupts_enabled(regs)) 667 + if (IS_ENABLED(CONFIG_ARM64_PSEUDO_NMI) && regs_irqs_disabled(regs)) 528 668 __el1_pnmi(regs, handler); 529 669 else 530 670 __el1_irq(regs, handler); ··· 545 681 asmlinkage void noinstr el1h_64_error_handler(struct pt_regs *regs) 546 682 { 547 683 unsigned long esr = read_sysreg(esr_el1); 684 + irqentry_state_t state; 548 685 549 686 local_daif_restore(DAIF_ERRCTX); 550 - arm64_enter_nmi(regs); 687 + state = irqentry_nmi_enter(regs); 551 688 do_serror(regs, esr); 552 - arm64_exit_nmi(regs); 689 + irqentry_nmi_exit(regs, state); 553 690 } 554 691 555 692 static void noinstr el0_da(struct pt_regs *regs, unsigned long esr) 556 693 { 557 694 unsigned long far = read_sysreg(far_el1); 558 695 559 - enter_from_user_mode(regs); 696 + arm64_enter_from_user_mode(regs); 560 697 local_daif_restore(DAIF_PROCCTX); 561 698 do_mem_abort(far, esr, regs); 562 - exit_to_user_mode(regs); 699 + arm64_exit_to_user_mode(regs); 563 700 } 564 701 565 702 static void noinstr el0_ia(struct pt_regs 
*regs, unsigned long esr) ··· 575 710 if (!is_ttbr0_addr(far)) 576 711 arm64_apply_bp_hardening(); 577 712 578 - enter_from_user_mode(regs); 713 + arm64_enter_from_user_mode(regs); 579 714 local_daif_restore(DAIF_PROCCTX); 580 715 do_mem_abort(far, esr, regs); 581 - exit_to_user_mode(regs); 716 + arm64_exit_to_user_mode(regs); 582 717 } 583 718 584 719 static void noinstr el0_fpsimd_acc(struct pt_regs *regs, unsigned long esr) 585 720 { 586 - enter_from_user_mode(regs); 721 + arm64_enter_from_user_mode(regs); 587 722 local_daif_restore(DAIF_PROCCTX); 588 723 do_fpsimd_acc(esr, regs); 589 - exit_to_user_mode(regs); 724 + arm64_exit_to_user_mode(regs); 590 725 } 591 726 592 727 static void noinstr el0_sve_acc(struct pt_regs *regs, unsigned long esr) 593 728 { 594 - enter_from_user_mode(regs); 729 + arm64_enter_from_user_mode(regs); 595 730 local_daif_restore(DAIF_PROCCTX); 596 731 do_sve_acc(esr, regs); 597 - exit_to_user_mode(regs); 732 + arm64_exit_to_user_mode(regs); 598 733 } 599 734 600 735 static void noinstr el0_sme_acc(struct pt_regs *regs, unsigned long esr) 601 736 { 602 - enter_from_user_mode(regs); 737 + arm64_enter_from_user_mode(regs); 603 738 local_daif_restore(DAIF_PROCCTX); 604 739 do_sme_acc(esr, regs); 605 - exit_to_user_mode(regs); 740 + arm64_exit_to_user_mode(regs); 606 741 } 607 742 608 743 static void noinstr el0_fpsimd_exc(struct pt_regs *regs, unsigned long esr) 609 744 { 610 - enter_from_user_mode(regs); 745 + arm64_enter_from_user_mode(regs); 611 746 local_daif_restore(DAIF_PROCCTX); 612 747 do_fpsimd_exc(esr, regs); 613 - exit_to_user_mode(regs); 748 + arm64_exit_to_user_mode(regs); 614 749 } 615 750 616 751 static void noinstr el0_sys(struct pt_regs *regs, unsigned long esr) 617 752 { 618 - enter_from_user_mode(regs); 753 + arm64_enter_from_user_mode(regs); 619 754 local_daif_restore(DAIF_PROCCTX); 620 755 do_el0_sys(esr, regs); 621 - exit_to_user_mode(regs); 756 + arm64_exit_to_user_mode(regs); 622 757 } 623 758 624 759 static void 
noinstr el0_pc(struct pt_regs *regs, unsigned long esr)
···
 	if (!is_ttbr0_addr(instruction_pointer(regs)))
 		arm64_apply_bp_hardening();
 
-	enter_from_user_mode(regs);
+	arm64_enter_from_user_mode(regs);
 	local_daif_restore(DAIF_PROCCTX);
 	do_sp_pc_abort(far, esr, regs);
-	exit_to_user_mode(regs);
+	arm64_exit_to_user_mode(regs);
 }
 
 static void noinstr el0_sp(struct pt_regs *regs, unsigned long esr)
 {
-	enter_from_user_mode(regs);
+	arm64_enter_from_user_mode(regs);
 	local_daif_restore(DAIF_PROCCTX);
 	do_sp_pc_abort(regs->sp, esr, regs);
-	exit_to_user_mode(regs);
+	arm64_exit_to_user_mode(regs);
 }
 
 static void noinstr el0_undef(struct pt_regs *regs, unsigned long esr)
 {
-	enter_from_user_mode(regs);
+	arm64_enter_from_user_mode(regs);
 	local_daif_restore(DAIF_PROCCTX);
 	do_el0_undef(regs, esr);
-	exit_to_user_mode(regs);
+	arm64_exit_to_user_mode(regs);
 }
 
 static void noinstr el0_bti(struct pt_regs *regs)
 {
-	enter_from_user_mode(regs);
+	arm64_enter_from_user_mode(regs);
 	local_daif_restore(DAIF_PROCCTX);
 	do_el0_bti(regs);
-	exit_to_user_mode(regs);
+	arm64_exit_to_user_mode(regs);
 }
 
 static void noinstr el0_mops(struct pt_regs *regs, unsigned long esr)
 {
-	enter_from_user_mode(regs);
+	arm64_enter_from_user_mode(regs);
 	local_daif_restore(DAIF_PROCCTX);
 	do_el0_mops(regs, esr);
-	exit_to_user_mode(regs);
+	arm64_exit_to_user_mode(regs);
 }
 
 static void noinstr el0_gcs(struct pt_regs *regs, unsigned long esr)
 {
-	enter_from_user_mode(regs);
+	arm64_enter_from_user_mode(regs);
 	local_daif_restore(DAIF_PROCCTX);
 	do_el0_gcs(regs, esr);
-	exit_to_user_mode(regs);
+	arm64_exit_to_user_mode(regs);
 }
 
 static void noinstr el0_inv(struct pt_regs *regs, unsigned long esr)
 {
-	enter_from_user_mode(regs);
+	arm64_enter_from_user_mode(regs);
 	local_daif_restore(DAIF_PROCCTX);
 	bad_el0_sync(regs, 0, esr);
-	exit_to_user_mode(regs);
+	arm64_exit_to_user_mode(regs);
 }
 
 static void noinstr el0_breakpt(struct pt_regs *regs, unsigned long esr)
···
 	if (!is_ttbr0_addr(regs->pc))
 		arm64_apply_bp_hardening();
 
-	enter_from_user_mode(regs);
+	arm64_enter_from_user_mode(regs);
 	debug_exception_enter(regs);
 	do_breakpoint(esr, regs);
 	debug_exception_exit(regs);
 	local_daif_restore(DAIF_PROCCTX);
-	exit_to_user_mode(regs);
+	arm64_exit_to_user_mode(regs);
 }
 
 static void noinstr el0_softstp(struct pt_regs *regs, unsigned long esr)
···
 	if (!is_ttbr0_addr(regs->pc))
 		arm64_apply_bp_hardening();
 
-	enter_from_user_mode(regs);
+	arm64_enter_from_user_mode(regs);
 	/*
 	 * After handling a breakpoint, we suspend the breakpoint
 	 * and use single-step to move to the next instruction.
···
 		local_daif_restore(DAIF_PROCCTX);
 		do_el0_softstep(esr, regs);
 	}
-	exit_to_user_mode(regs);
+	arm64_exit_to_user_mode(regs);
 }
 
 static void noinstr el0_watchpt(struct pt_regs *regs, unsigned long esr)
···
 	/* Watchpoints are the only debug exception to write FAR_EL1 */
 	unsigned long far = read_sysreg(far_el1);
 
-	enter_from_user_mode(regs);
+	arm64_enter_from_user_mode(regs);
 	debug_exception_enter(regs);
 	do_watchpoint(far, esr, regs);
 	debug_exception_exit(regs);
 	local_daif_restore(DAIF_PROCCTX);
-	exit_to_user_mode(regs);
+	arm64_exit_to_user_mode(regs);
 }
 
 static void noinstr el0_brk64(struct pt_regs *regs, unsigned long esr)
 {
-	enter_from_user_mode(regs);
+	arm64_enter_from_user_mode(regs);
 	local_daif_restore(DAIF_PROCCTX);
 	do_el0_brk64(esr, regs);
-	exit_to_user_mode(regs);
+	arm64_exit_to_user_mode(regs);
 }
 
 static void noinstr el0_svc(struct pt_regs *regs)
 {
-	enter_from_user_mode(regs);
+	arm64_enter_from_user_mode(regs);
 	cortex_a76_erratum_1463225_svc_handler();
 	fpsimd_syscall_enter();
 	local_daif_restore(DAIF_PROCCTX);
 	do_el0_svc(regs);
-	exit_to_user_mode(regs);
+	arm64_exit_to_user_mode(regs);
 	fpsimd_syscall_exit();
 }
 
 static void noinstr el0_fpac(struct pt_regs *regs, unsigned long esr)
 {
-	enter_from_user_mode(regs);
+	arm64_enter_from_user_mode(regs);
 	local_daif_restore(DAIF_PROCCTX);
 	do_el0_fpac(regs, esr);
-	exit_to_user_mode(regs);
+	arm64_exit_to_user_mode(regs);
 }
 
 asmlinkage void noinstr el0t_64_sync_handler(struct pt_regs *regs)
···
 static void noinstr el0_interrupt(struct pt_regs *regs,
 				  void (*handler)(struct pt_regs *))
 {
-	enter_from_user_mode(regs);
+	arm64_enter_from_user_mode(regs);
 
 	write_sysreg(DAIF_PROCCTX_NOIRQ, daif);
 
···
 	do_interrupt_handler(regs, handler);
 	irq_exit_rcu();
 
-	exit_to_user_mode(regs);
+	arm64_exit_to_user_mode(regs);
 }
 
 static void noinstr __el0_irq_handler_common(struct pt_regs *regs)
···
 static void noinstr __el0_error_handler_common(struct pt_regs *regs)
 {
 	unsigned long esr = read_sysreg(esr_el1);
+	irqentry_state_t state;
 
-	enter_from_user_mode(regs);
+	arm64_enter_from_user_mode(regs);
 	local_daif_restore(DAIF_ERRCTX);
-	arm64_enter_nmi(regs);
+	state = irqentry_nmi_enter(regs);
 	do_serror(regs, esr);
-	arm64_exit_nmi(regs);
+	irqentry_nmi_exit(regs, state);
 	local_daif_restore(DAIF_PROCCTX);
-	exit_to_user_mode(regs);
+	arm64_exit_to_user_mode(regs);
 }
 
 asmlinkage void noinstr el0t_64_error_handler(struct pt_regs *regs)
···
 #ifdef CONFIG_COMPAT
 static void noinstr el0_cp15(struct pt_regs *regs, unsigned long esr)
 {
-	enter_from_user_mode(regs);
+	arm64_enter_from_user_mode(regs);
 	local_daif_restore(DAIF_PROCCTX);
 	do_el0_cp15(esr, regs);
-	exit_to_user_mode(regs);
+	arm64_exit_to_user_mode(regs);
 }
 
 static void noinstr el0_svc_compat(struct pt_regs *regs)
 {
-	enter_from_user_mode(regs);
+	arm64_enter_from_user_mode(regs);
 	cortex_a76_erratum_1463225_svc_handler();
 	local_daif_restore(DAIF_PROCCTX);
 	do_el0_svc_compat(regs);
-	exit_to_user_mode(regs);
+	arm64_exit_to_user_mode(regs);
 }
 
 static void noinstr el0_bkpt32(struct pt_regs *regs, unsigned long esr)
 {
-	enter_from_user_mode(regs);
+	arm64_enter_from_user_mode(regs);
 	local_daif_restore(DAIF_PROCCTX);
 	do_bkpt32(esr, regs);
-	exit_to_user_mode(regs);
+	arm64_exit_to_user_mode(regs);
 }
 
 asmlinkage void noinstr el0t_32_sync_handler(struct pt_regs *regs)
···
 	unsigned long esr = read_sysreg(esr_el1);
 	unsigned long far = read_sysreg(far_el1);
 
-	arm64_enter_nmi(regs);
+	irqentry_nmi_enter(regs);
 	panic_bad_stack(regs, esr, far);
 }
 
···
 asmlinkage noinstr unsigned long
 __sdei_handler(struct pt_regs *regs, struct sdei_registered_event *arg)
 {
+	irqentry_state_t state;
 	unsigned long ret;
 
 	/*
···
 	else if (cpu_has_pan())
 		set_pstate_pan(0);
 
-	arm64_enter_nmi(regs);
+	state = irqentry_nmi_enter(regs);
 	ret = do_sdei_event(regs, arg);
-	arm64_exit_nmi(regs);
+	irqentry_nmi_exit(regs, state);
 
 	return ret;
 }
+3 -2
arch/arm64/kernel/fpsimd.c
···
 	if (!system_supports_sme())
 		return;
 
+	min_bit = find_last_bit(info->vq_map, SVE_VQ_MAX);
+
 	/*
 	 * SME doesn't require any particular vector length be
 	 * supported but it does require at least one. We should have
···
 	 * let's double check here. The bitmap is SVE_VQ_MAP sized for
 	 * sharing with SVE.
 	 */
-	WARN_ON(bitmap_empty(info->vq_map, SVE_VQ_MAX));
+	WARN_ON(min_bit >= SVE_VQ_MAX);
 
-	min_bit = find_last_bit(info->vq_map, SVE_VQ_MAX);
 	info->min_vl = sve_vl_from_vq(__bit_to_vq(min_bit));
 
 	max_bit = find_first_bit(info->vq_map, SVE_VQ_MAX);
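The fpsimd.c change above relies on find_last_bit() returning its size argument when no bit is set, so a separate bitmap_empty() pass adds nothing. The following is a minimal userspace sketch of that contract, not the kernel implementation: `last_set_bit` and `bitmap_is_empty` are hypothetical names chosen for illustration, and the single 64-bit word is an assumption made to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the kernel's find_last_bit(): returns the index
 * of the highest set bit below 'size', or 'size' itself if no bit is set.
 * Assumption: the bitmap fits in one 64-bit word.
 */
static size_t last_set_bit(uint64_t map, size_t size)
{
	assert(size <= 64);
	for (size_t i = size; i > 0; i--) {
		if (map & (UINT64_C(1) << (i - 1)))
			return i - 1;
	}
	return size;	/* empty bitmap: the sentinel a ">= size" check tests for */
}

/* One scan answers both "which bit?" and "is the bitmap empty?". */
static int bitmap_is_empty(uint64_t map, size_t size)
{
	return last_set_bit(map, size) >= size;
}
```

With this contract, testing the returned index against the size is equivalent to the removed emptiness check, and the bitmap is walked only once.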
+27 -20
arch/arm64/kernel/pi/map_kernel.c
···
 
 extern const u8 __eh_frame_start[], __eh_frame_end[];
 
-extern void idmap_cpu_replace_ttbr1(void *pgdir);
+extern void idmap_cpu_replace_ttbr1(phys_addr_t pgdir);
 
-static void __init map_segment(pgd_t *pg_dir, u64 *pgd, u64 va_offset,
+static void __init map_segment(pgd_t *pg_dir, phys_addr_t *pgd, u64 va_offset,
 			       void *start, void *end, pgprot_t prot,
 			       bool may_use_cont, int root_level)
 {
···
 {
 	bool enable_scs = IS_ENABLED(CONFIG_UNWIND_PATCH_PAC_INTO_SCS);
 	bool twopass = IS_ENABLED(CONFIG_RELOCATABLE);
-	u64 pgdp = (u64)init_pg_dir + PAGE_SIZE;
+	phys_addr_t pgdp = (phys_addr_t)init_pg_dir + PAGE_SIZE;
 	pgprot_t text_prot = PAGE_KERNEL_ROX;
 	pgprot_t data_prot = PAGE_KERNEL;
 	pgprot_t prot;
···
 	twopass |= enable_scs;
 	prot = twopass ? data_prot : text_prot;
 
+	/*
+	 * [_stext, _text) isn't executed after boot and contains some
+	 * non-executable, unpredictable data, so map it non-executable.
+	 */
+	map_segment(init_pg_dir, &pgdp, va_offset, _text, _stext, data_prot,
+		    false, root_level);
 	map_segment(init_pg_dir, &pgdp, va_offset, _stext, _etext, prot,
 		    !twopass, root_level);
 	map_segment(init_pg_dir, &pgdp, va_offset, __start_rodata,
···
 		    true, root_level);
 	dsb(ishst);
 
-	idmap_cpu_replace_ttbr1(init_pg_dir);
+	idmap_cpu_replace_ttbr1((phys_addr_t)init_pg_dir);
 
 	if (twopass) {
 		if (IS_ENABLED(CONFIG_RELOCATABLE))
···
 	/* Copy the root page table to its final location */
 	memcpy((void *)swapper_pg_dir + va_offset, init_pg_dir, PAGE_SIZE);
 	dsb(ishst);
-	idmap_cpu_replace_ttbr1(swapper_pg_dir);
+	idmap_cpu_replace_ttbr1((phys_addr_t)swapper_pg_dir);
 }
 
-static void noinline __section(".idmap.text") set_ttbr0_for_lpa2(u64 ttbr)
+static void noinline __section(".idmap.text") set_ttbr0_for_lpa2(phys_addr_t ttbr)
 {
 	u64 sctlr = read_sysreg(sctlr_el1);
 	u64 tcr = read_sysreg(tcr_el1) | TCR_DS;
···
 	 */
 	create_init_idmap(init_pg_dir, mask);
 	dsb(ishst);
-	set_ttbr0_for_lpa2((u64)init_pg_dir);
+	set_ttbr0_for_lpa2((phys_addr_t)init_pg_dir);
 
 	/*
 	 * Recreate the initial ID map with the same granularity as before.
 	 * Don't bother with the FDT, we no longer need it after this.
 	 */
 	memset(init_idmap_pg_dir, 0,
-	       (u64)init_idmap_pg_end - (u64)init_idmap_pg_dir);
+	       (char *)init_idmap_pg_end - (char *)init_idmap_pg_dir);
 
 	create_init_idmap(init_idmap_pg_dir, mask);
 	dsb(ishst);
 
 	/* switch back to the updated initial ID map */
-	set_ttbr0_for_lpa2((u64)init_idmap_pg_dir);
+	set_ttbr0_for_lpa2((phys_addr_t)init_idmap_pg_dir);
 
 	/* wipe the temporary ID map from memory */
-	memset(init_pg_dir, 0, (u64)init_pg_end - (u64)init_pg_dir);
+	memset(init_pg_dir, 0, (char *)init_pg_end - (char *)init_pg_dir);
 }
 
-static void __init map_fdt(u64 fdt)
+static void *__init map_fdt(phys_addr_t fdt)
 {
 	static u8 ptes[INIT_IDMAP_FDT_SIZE] __initdata __aligned(PAGE_SIZE);
-	u64 efdt = fdt + MAX_FDT_SIZE;
-	u64 ptep = (u64)ptes;
+	phys_addr_t efdt = fdt + MAX_FDT_SIZE;
+	phys_addr_t ptep = (phys_addr_t)ptes; /* We're idmapped when called */
 
 	/*
 	 * Map up to MAX_FDT_SIZE bytes, but avoid overlap with
···
 		  fdt, PAGE_KERNEL, IDMAP_ROOT_LEVEL,
 		  (pte_t *)init_idmap_pg_dir, false, 0);
 	dsb(ishst);
+
+	return (void *)fdt;
 }
 
 /*
···
 	return true;
 }
 
-asmlinkage void __init early_map_kernel(u64 boot_status, void *fdt)
+asmlinkage void __init early_map_kernel(u64 boot_status, phys_addr_t fdt)
 {
 	static char const chosen_str[] __initconst = "/chosen";
 	u64 va_base, pa_base = (u64)&_text;
···
 	int root_level = 4 - CONFIG_PGTABLE_LEVELS;
 	int va_bits = VA_BITS;
 	int chosen;
-
-	map_fdt((u64)fdt);
+	void *fdt_mapped = map_fdt(fdt);
 
 	/* Clear BSS and the initial page tables */
-	memset(__bss_start, 0, (u64)init_pg_end - (u64)__bss_start);
+	memset(__bss_start, 0, (char *)init_pg_end - (char *)__bss_start);
 
 	/* Parse the command line for CPU feature overrides */
-	chosen = fdt_path_offset(fdt, chosen_str);
-	init_feature_override(boot_status, fdt, chosen);
+	chosen = fdt_path_offset(fdt_mapped, chosen_str);
+	init_feature_override(boot_status, fdt_mapped, chosen);
 
 	if (IS_ENABLED(CONFIG_ARM64_64K_PAGES) && !cpu_has_lva()) {
 		va_bits = VA_BITS_MIN;
···
 	 * fill in the high bits from the seed.
 	 */
 	if (IS_ENABLED(CONFIG_RANDOMIZE_BASE)) {
-		u64 kaslr_seed = kaslr_early_init(fdt, chosen);
+		u64 kaslr_seed = kaslr_early_init(fdt_mapped, chosen);
 
 		if (kaslr_seed && kaslr_requires_kpti())
 			arm64_use_ng_mappings = ng_mappings_allowed();
+12 -8
arch/arm64/kernel/pi/map_range.c
···
  * @va_offset: Offset between a physical page and its current mapping
  *             in the VA space
  */
-void __init map_range(u64 *pte, u64 start, u64 end, u64 pa, pgprot_t prot,
-		      int level, pte_t *tbl, bool may_use_cont, u64 va_offset)
+void __init map_range(phys_addr_t *pte, u64 start, u64 end, phys_addr_t pa,
+		      pgprot_t prot, int level, pte_t *tbl, bool may_use_cont,
+		      u64 va_offset)
 {
 	u64 cmask = (level == 3) ? CONT_PTE_SIZE - 1 : U64_MAX;
 	ptdesc_t protval = pgprot_val(prot) & ~PTE_TYPE_MASK;
···
 	}
 }
 
-asmlinkage u64 __init create_init_idmap(pgd_t *pg_dir, ptdesc_t clrmask)
+asmlinkage phys_addr_t __init create_init_idmap(pgd_t *pg_dir, ptdesc_t clrmask)
 {
-	u64 ptep = (u64)pg_dir + PAGE_SIZE;
+	phys_addr_t ptep = (phys_addr_t)pg_dir + PAGE_SIZE; /* MMU is off */
 	pgprot_t text_prot = PAGE_KERNEL_ROX;
 	pgprot_t data_prot = PAGE_KERNEL;
 
 	pgprot_val(text_prot) &= ~clrmask;
 	pgprot_val(data_prot) &= ~clrmask;
 
-	map_range(&ptep, (u64)_stext, (u64)__initdata_begin, (u64)_stext,
-		  text_prot, IDMAP_ROOT_LEVEL, (pte_t *)pg_dir, false, 0);
-	map_range(&ptep, (u64)__initdata_begin, (u64)_end, (u64)__initdata_begin,
-		  data_prot, IDMAP_ROOT_LEVEL, (pte_t *)pg_dir, false, 0);
+	/* MMU is off; pointer casts to phys_addr_t are safe */
+	map_range(&ptep, (u64)_stext, (u64)__initdata_begin,
+		  (phys_addr_t)_stext, text_prot, IDMAP_ROOT_LEVEL,
+		  (pte_t *)pg_dir, false, 0);
+	map_range(&ptep, (u64)__initdata_begin, (u64)_end,
+		  (phys_addr_t)__initdata_begin, data_prot, IDMAP_ROOT_LEVEL,
+		  (pte_t *)pg_dir, false, 0);
 
 	return ptep;
 }
+5 -4
arch/arm64/kernel/pi/pi.h
···
 void relocate_kernel(u64 offset);
 int scs_patch(const u8 eh_frame[], int size);
 
-void map_range(u64 *pgd, u64 start, u64 end, u64 pa, pgprot_t prot,
-	       int level, pte_t *tbl, bool may_use_cont, u64 va_offset);
+void map_range(phys_addr_t *pte, u64 start, u64 end, phys_addr_t pa,
+	       pgprot_t prot, int level, pte_t *tbl, bool may_use_cont,
+	       u64 va_offset);
 
-asmlinkage void early_map_kernel(u64 boot_status, void *fdt);
+asmlinkage void early_map_kernel(u64 boot_status, phys_addr_t fdt);
 
-asmlinkage u64 create_init_idmap(pgd_t *pgd, ptdesc_t clrmask);
+asmlinkage phys_addr_t create_init_idmap(pgd_t *pgd, ptdesc_t clrmask);
+4 -3
arch/arm64/kernel/probes/decode-insn.c
···
 	    aarch64_insn_is_bl(insn)) {
 		api->handler = simulate_b_bl;
 	} else if (aarch64_insn_is_br(insn) ||
-	    aarch64_insn_is_blr(insn) ||
-	    aarch64_insn_is_ret(insn)) {
-		api->handler = simulate_br_blr_ret;
+	    aarch64_insn_is_blr(insn)) {
+		api->handler = simulate_br_blr;
+	} else if (aarch64_insn_is_ret(insn)) {
+		api->handler = simulate_ret;
 	} else {
 		/*
 		 * Instruction cannot be stepped out-of-line and we don't
+42 -8
arch/arm64/kernel/probes/simulate-insn.c
···
 #include <asm/traps.h>
 
 #include "simulate-insn.h"
+#include "asm/gcs.h"
 
 #define bbl_displacement(insn) \
 	sign_extend32(((insn) & 0x3ffffff) << 2, 27)
···
 static inline u32 get_w_reg(struct pt_regs *regs, int reg)
 {
 	return lower_32_bits(pt_regs_read_reg(regs, reg));
+}
+
+static inline int update_lr(struct pt_regs *regs, long addr)
+{
+	int err = 0;
+
+	if (user_mode(regs) && task_gcs_el0_enabled(current)) {
+		push_user_gcs(addr, &err);
+		if (err) {
+			force_sig(SIGSEGV);
+			return err;
+		}
+	}
+	procedure_link_pointer_set(regs, addr);
+	return err;
 }
 
 static bool __kprobes check_cbz(u32 opcode, struct pt_regs *regs)
···
 {
 	int disp = bbl_displacement(opcode);
 
-	/* Link register is x30 */
 	if (opcode & (1 << 31))
-		set_x_reg(regs, 30, addr + 4);
+		if (update_lr(regs, addr + 4))
+			return;
 
 	instruction_pointer_set(regs, addr + disp);
 }
···
 }
 
 void __kprobes
-simulate_br_blr_ret(u32 opcode, long addr, struct pt_regs *regs)
+simulate_br_blr(u32 opcode, long addr, struct pt_regs *regs)
 {
 	int xn = (opcode >> 5) & 0x1f;
+	u64 b_target = get_x_reg(regs, xn);
 
-	/* update pc first in case we're doing a "blr lr" */
-	instruction_pointer_set(regs, get_x_reg(regs, xn));
-
-	/* Link register is x30 */
 	if (((opcode >> 21) & 0x3) == 1)
-		set_x_reg(regs, 30, addr + 4);
+		if (update_lr(regs, addr + 4))
+			return;
+
+	instruction_pointer_set(regs, b_target);
+}
+
+void __kprobes
+simulate_ret(u32 opcode, long addr, struct pt_regs *regs)
+{
+	u64 ret_addr;
+	int err = 0;
+	int xn = (opcode >> 5) & 0x1f;
+	u64 r_target = get_x_reg(regs, xn);
+
+	if (user_mode(regs) && task_gcs_el0_enabled(current)) {
+		ret_addr = pop_user_gcs(&err);
+		if (err || ret_addr != r_target) {
+			force_sig(SIGSEGV);
+			return;
+		}
+	}
+	instruction_pointer_set(regs, r_target);
 }
 
 void __kprobes
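The simulate_b_bl() path in the simulate-insn.c diff decodes the branch offset with bbl_displacement() and treats bit 31 of the opcode as the "link" flag. The sketch below reproduces that arithmetic in userspace under stated assumptions: `sign_extend32_sketch`, `bbl_displacement_sketch` and `is_bl` are illustrative names, not kernel symbols, and the code assumes an arithmetic right shift of negative values, which GCC and Clang provide.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Sketch of the idea behind sign_extend32(): treat bit 'index' of 'value'
 * as the sign bit and extend it to 32 bits. Relies on arithmetic right
 * shift of signed integers (true for GCC and Clang).
 */
static int32_t sign_extend32_sketch(uint32_t value, int index)
{
	assert(index >= 0 && index < 32);
	int shift = 31 - index;
	return (int32_t)(value << shift) >> shift;
}

/*
 * B and BL carry a 26-bit word offset (imm26). Shifting left by 2 turns it
 * into a 28-bit byte offset whose sign bit is bit 27.
 */
static int32_t bbl_displacement_sketch(uint32_t insn)
{
	return sign_extend32_sketch((insn & 0x3ffffff) << 2, 27);
}

/* Bit 31 is set for BL (branch with link) and clear for B. */
static int is_bl(uint32_t insn)
{
	return (insn >> 31) & 1;
}
```

For example, 0x14000001 decodes as a plain branch four bytes forward, while 0x97ffffff decodes as a BL four bytes backward; only the latter needs the link register (and, with GCS enabled, the shadow stack) updated.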
+2 -1
arch/arm64/kernel/probes/simulate-insn.h
···
 void simulate_adr_adrp(u32 opcode, long addr, struct pt_regs *regs);
 void simulate_b_bl(u32 opcode, long addr, struct pt_regs *regs);
 void simulate_b_cond(u32 opcode, long addr, struct pt_regs *regs);
-void simulate_br_blr_ret(u32 opcode, long addr, struct pt_regs *regs);
+void simulate_br_blr(u32 opcode, long addr, struct pt_regs *regs);
+void simulate_ret(u32 opcode, long addr, struct pt_regs *regs);
 void simulate_cbz_cbnz(u32 opcode, long addr, struct pt_regs *regs);
 void simulate_tbz_tbnz(u32 opcode, long addr, struct pt_regs *regs);
 void simulate_ldr_literal(u32 opcode, long addr, struct pt_regs *regs);
+33
arch/arm64/kernel/probes/uprobes.c
···
 #include <linux/ptrace.h>
 #include <linux/uprobes.h>
 #include <asm/cacheflush.h>
+#include <asm/gcs.h>
 
 #include "decode-insn.h"
 
···
 				  struct pt_regs *regs)
 {
 	unsigned long orig_ret_vaddr;
+	unsigned long gcs_ret_vaddr;
+	int err = 0;
+	u64 gcspr;
 
 	orig_ret_vaddr = procedure_link_pointer(regs);
+
+	if (task_gcs_el0_enabled(current)) {
+		gcspr = read_sysreg_s(SYS_GCSPR_EL0);
+		gcs_ret_vaddr = get_user_gcs((__force unsigned long __user *)gcspr, &err);
+		if (err) {
+			force_sig(SIGSEGV);
+			goto out;
+		}
+
+		/*
+		 * If the LR and GCS return addr don't match, then some kind of PAC
+		 * signing or control flow occurred since entering the probed function.
+		 * Likely because the user is attempting to retprobe on an instruction
+		 * that isn't a function boundary or inside a leaf function. Explicitly
+		 * abort this retprobe because it will generate a GCS exception.
+		 */
+		if (gcs_ret_vaddr != orig_ret_vaddr) {
+			orig_ret_vaddr = -1;
+			goto out;
+		}
+
+		put_user_gcs(trampoline_vaddr, (__force unsigned long __user *)gcspr, &err);
+		if (err) {
+			force_sig(SIGSEGV);
+			goto out;
+		}
+	}
+
 	/* Replace the return addr with trampoline addr */
 	procedure_link_pointer_set(regs, trampoline_vaddr);
 
+out:
 	return orig_ret_vaddr;
 }
 
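The retprobe change in uprobes.c above only hijacks the return address when the link register and the top Guarded Control Stack entry agree, and then rewrites both. A rough userspace model of that decision follows. It is a sketch under assumptions: `struct task_model`, `hijack_return` and `RETPROBE_ABORT` are hypothetical names, the GCS is modelled as a plain pointer to its top slot, and the user-access fault path (SIGSEGV) is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors the diff's use of -1 as the "abort this retprobe" value. */
#define RETPROBE_ABORT ((uint64_t)-1)

/* Hypothetical model of the state the retprobe code looks at. */
struct task_model {
	uint64_t lr;		/* link register (x30) */
	uint64_t *gcs_top;	/* top entry of the shadow stack */
	int gcs_enabled;	/* whether GCS is enabled for EL0 */
};

/*
 * Replace the return address with the trampoline. When GCS is enabled the
 * shadow-stack entry must match LR first, otherwise the later RET would
 * raise a GCS exception, so the probe is aborted instead.
 */
static uint64_t hijack_return(struct task_model *t, uint64_t trampoline)
{
	assert(t != NULL);
	uint64_t orig = t->lr;

	if (t->gcs_enabled) {
		if (*t->gcs_top != orig)
			return RETPROBE_ABORT;	/* LR and GCS disagree */
		*t->gcs_top = trampoline;	/* keep GCS in step with LR */
	}
	t->lr = trampoline;
	return orig;
}
```

Keeping the two copies of the return address in step is what lets the eventual RET into the trampoline pass the hardware check.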
+1
arch/arm64/kernel/proton-pack.c
···
 static const struct midr_range spectre_bhb_k38_list[] = {
 	MIDR_ALL_VERSIONS(MIDR_CORTEX_A715),
 	MIDR_ALL_VERSIONS(MIDR_CORTEX_A720),
+	MIDR_ALL_VERSIONS(MIDR_CORTEX_A720AE),
 	{},
 };
 static const struct midr_range spectre_bhb_k32_list[] = {
+22 -4
arch/arm64/kernel/rsi.c
···
 	}
 }
 
-bool __arm64_is_protected_mmio(phys_addr_t base, size_t size)
+/*
+ * Check if a given PA range is Trusted (e.g., Protected memory, a Trusted Device
+ * mapping, or an MMIO emulated in the Realm world).
+ *
+ * We can rely on the RIPAS value of the region to detect if a given region is
+ * protected.
+ *
+ * RIPAS_DEV - A trusted device memory or a trusted emulated MMIO (in the Realm
+ *             world
+ * RIPAS_RAM - Memory (RAM), protected by the RMM guarantees. (e.g., Firmware
+ *             reserved regions for data sharing).
+ *
+ * RIPAS_DESTROYED is a special case of one of the above, where the host did
+ * something without our permission and as such we can't do anything about it.
+ *
+ * The only case where something is emulated by the untrusted hypervisor or is
+ * backed by shared memory is indicated by RSI_RIPAS_EMPTY.
+ */
+bool arm64_rsi_is_protected(phys_addr_t base, size_t size)
 {
 	enum ripas ripas;
 	phys_addr_t end, top;
···
 			break;
 		if (WARN_ON(top <= base))
 			break;
-		if (ripas != RSI_RIPAS_DEV)
+		if (ripas == RSI_RIPAS_EMPTY)
 			break;
 		base = top;
 	}
 
 	return base >= end;
 }
-EXPORT_SYMBOL(__arm64_is_protected_mmio);
+EXPORT_SYMBOL(arm64_rsi_is_protected);
 
 static int realm_ioremap_hook(phys_addr_t phys, size_t size, pgprot_t *prot)
 {
-	if (__arm64_is_protected_mmio(phys, size))
+	if (arm64_rsi_is_protected(phys, size))
 		*prot = pgprot_encrypted(*prot);
 	else
 		*prot = pgprot_decrypted(*prot);
+1 -1
arch/arm64/kernel/sdei.c
···
 	 * If we interrupted the kernel with interrupts masked, we always go
 	 * back to wherever we came from.
 	 */
-	if (mode == kernel_mode && !interrupts_enabled(regs))
+	if (mode == kernel_mode && regs_irqs_disabled(regs))
 		return SDEI_EV_HANDLED;
 
 	/*
+2 -2
arch/arm64/kernel/setup.c
···
 	unsigned long i = 0;
 	size_t res_size;
 
-	kernel_code.start = __pa_symbol(_stext);
+	kernel_code.start = __pa_symbol(_text);
 	kernel_code.end = __pa_symbol(__init_begin - 1);
 	kernel_data.start = __pa_symbol(_sdata);
 	kernel_data.end = __pa_symbol(_end - 1);
···
 
 void __init __no_sanitize_address setup_arch(char **cmdline_p)
 {
-	setup_initial_init_mm(_stext, _etext, _edata, _end);
+	setup_initial_init_mm(_text, _etext, _edata, _end);
 
 	*cmdline_p = boot_command_line;
 
+2 -1
arch/arm64/kernel/signal.c
···
 #include <linux/cache.h>
 #include <linux/compat.h>
 #include <linux/errno.h>
+#include <linux/irq-entry-common.h>
 #include <linux/kernel.h>
 #include <linux/signal.h>
 #include <linux/freezer.h>
···
  * the kernel can handle, and then we build all the user-level signal handling
  * stack-frames in one go after that.
  */
-void do_signal(struct pt_regs *regs)
+void arch_do_signal_or_restart(struct pt_regs *regs)
 {
 	unsigned long continue_addr = 0, restart_addr = 0;
 	int retval = 0;
+1 -1
arch/arm64/kernel/syscall.c
···
 
 	add_random_kstack_offset();
 
-	if (scno < sc_nr) {
+	if (likely(scno < sc_nr)) {
 		syscall_fn_t syscall_fn;
 		syscall_fn = syscall_table[array_index_nospec(scno, sc_nr)];
 		ret = __invoke_syscall(regs, syscall_fn);
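The syscall.c change above wraps the bounds check in likely(), a hint that the in-range case is the hot path. The sketch below shows the same pattern in userspace, assuming a GCC or Clang compiler for `__builtin_expect`. The names `likely_sketch`, `syscall_fn_sketch`, `sys_double`, `sys_negate` and `dispatch` are invented for illustration, and the speculation barrier (array_index_nospec() in the real code) is intentionally omitted.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Same idea as the kernel's likely(): tell the compiler the condition is usually true. */
#define likely_sketch(x) __builtin_expect(!!(x), 1)

typedef long (*syscall_fn_sketch)(long arg);

/* Two toy handlers standing in for real syscall implementations. */
static long sys_double(long arg) { return arg * 2; }
static long sys_negate(long arg) { return -arg; }

static const syscall_fn_sketch table[] = { sys_double, sys_negate };

/*
 * Dispatch through the table when the number is in range (the expected,
 * hinted path); otherwise fall back to "not implemented".
 */
static long dispatch(unsigned int scno, long arg)
{
	const unsigned int nr = sizeof(table) / sizeof(table[0]);

	if (likely_sketch(scno < nr)) {
		assert(table[scno] != NULL);
		return table[scno](arg);
	}
	return -ENOSYS;	/* stand-in for the not-implemented path */
}
```

The hint does not change behaviour; it only lets the compiler lay out the in-range branch as the fall-through path.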
+1 -12
arch/arm64/kernel/vdso32/Makefile
···
 
 cc32-option = $(call try-run,\
         $(CC_COMPAT) $(1) -c -x c /dev/null -o "$$TMP",$(1),$(2))
-cc32-disable-warning = $(call try-run,\
-	$(CC_COMPAT) -W$(strip $(1)) -c -x c /dev/null -o "$$TMP",-Wno-$(strip $(1)))
 
 # We cannot use the global flags to compile the vDSO files, the main reason
 # being that the 32-bit compiler may be older than the main (64-bit) compiler
···
 # KBUILD_CFLAGS from top-level Makefile
 VDSO_CFLAGS += -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs \
                -fno-strict-aliasing -fno-common \
+               $(filter -Werror,$(KBUILD_CPPFLAGS)) \
                -Werror-implicit-function-declaration \
                -Wno-format-security \
                -std=gnu11
···
 VDSO_CFLAGS += $(call cc32-option,-Werror=strict-prototypes)
 VDSO_CFLAGS += -Werror=date-time
 VDSO_CFLAGS += $(call cc32-option,-Werror=incompatible-pointer-types)
-
-# The 32-bit compiler does not provide 128-bit integers, which are used in
-# some headers that are indirectly included from the vDSO code.
-# This hack makes the compiler happy and should trigger a warning/error if
-# variables of such type are referenced.
-VDSO_CFLAGS += -D__uint128_t='void*'
-# Silence some warnings coming from headers that operate on long's
-# (on GCC 4.8 or older, there is unfortunately no way to silence this warning)
-VDSO_CFLAGS += $(call cc32-disable-warning,shift-count-overflow)
-VDSO_CFLAGS += -Wno-int-to-pointer-cast
 
 # Compile as THUMB2 or ARM. Unwinding via frame-pointers in THUMB2 is
 # unreliable.
+4 -4
arch/arm64/mm/init.c
···
 	 */
 	if (memory_limit != PHYS_ADDR_MAX) {
 		memblock_mem_limit_remove_map(memory_limit);
-		memblock_add(__pa_symbol(_text), (u64)(_end - _text));
+		memblock_add(__pa_symbol(_text), (resource_size_t)(_end - _text));
 	}
 
 	if (IS_ENABLED(CONFIG_BLK_DEV_INITRD) && phys_initrd_size) {
···
 		 * initrd to become inaccessible via the linear mapping.
 		 * Otherwise, this is a no-op
 		 */
-		u64 base = phys_initrd_start & PAGE_MASK;
-		u64 size = PAGE_ALIGN(phys_initrd_start + phys_initrd_size) - base;
+		phys_addr_t base = phys_initrd_start & PAGE_MASK;
+		resource_size_t size = PAGE_ALIGN(phys_initrd_start + phys_initrd_size) - base;
 
 		/*
 		 * We can only add back the initrd memory if we don't end up
···
 	 * Register the kernel text, kernel data, initrd, and initial
 	 * pagetables with memblock.
 	 */
-	memblock_reserve(__pa_symbol(_stext), _end - _stext);
+	memblock_reserve(__pa_symbol(_text), _end - _text);
 	if (IS_ENABLED(CONFIG_BLK_DEV_INITRD) && phys_initrd_size) {
 		/* the generic initrd code expects virtual addresses */
 		initrd_start = __phys_to_virt(phys_initrd_start);
+465 -29
arch/arm64/mm/mmu.c
···
 #include <linux/kfence.h>
 #include <linux/pkeys.h>
 #include <linux/mm_inline.h>
+#include <linux/pagewalk.h>
+#include <linux/stop_machine.h>
 
 #include <asm/barrier.h>
 #include <asm/cputype.h>
···
 #define NO_BLOCK_MAPPINGS BIT(0)
 #define NO_CONT_MAPPINGS BIT(1)
 #define NO_EXEC_MAPPINGS BIT(2) /* assumes FEAT_HPDS is not used */
+
+DEFINE_STATIC_KEY_FALSE(arm64_ptdump_lock_key);
 
 u64 kimage_voffset __ro_after_init;
 EXPORT_SYMBOL(kimage_voffset);
···
 				      int flags);
 #endif
 
-static phys_addr_t __pgd_pgtable_alloc(struct mm_struct *mm,
+#define INVALID_PHYS_ADDR (-1ULL)
+
+static phys_addr_t __pgd_pgtable_alloc(struct mm_struct *mm, gfp_t gfp,
 				       enum pgtable_type pgtable_type)
 {
 	/* Page is zeroed by init_clear_pgtable() so don't duplicate effort. */
-	struct ptdesc *ptdesc = pagetable_alloc(GFP_PGTABLE_KERNEL & ~__GFP_ZERO, 0);
+	struct ptdesc *ptdesc = pagetable_alloc(gfp & ~__GFP_ZERO, 0);
 	phys_addr_t pa;
 
-	BUG_ON(!ptdesc);
+	if (!ptdesc)
+		return INVALID_PHYS_ADDR;
+
 	pa = page_to_phys(ptdesc_page(ptdesc));
 
 	switch (pgtable_type) {
···
 	return pa;
 }
 
+static phys_addr_t
+try_pgd_pgtable_alloc_init_mm(enum pgtable_type pgtable_type, gfp_t gfp)
+{
+	return __pgd_pgtable_alloc(&init_mm, gfp, pgtable_type);
+}
+
 static phys_addr_t __maybe_unused
 pgd_pgtable_alloc_init_mm(enum pgtable_type pgtable_type)
 {
-	return __pgd_pgtable_alloc(&init_mm, pgtable_type);
+	phys_addr_t pa;
+
+	pa = __pgd_pgtable_alloc(&init_mm, GFP_PGTABLE_KERNEL, pgtable_type);
+	BUG_ON(pa == INVALID_PHYS_ADDR);
+	return pa;
 }
 
 static phys_addr_t
 pgd_pgtable_alloc_special_mm(enum pgtable_type pgtable_type)
 {
-	return __pgd_pgtable_alloc(NULL, pgtable_type);
+	phys_addr_t pa;
+
+	pa = __pgd_pgtable_alloc(NULL, GFP_PGTABLE_KERNEL, pgtable_type);
+	BUG_ON(pa == INVALID_PHYS_ADDR);
+	return pa;
+}
+
+static void split_contpte(pte_t *ptep)
+{
+	int i;
+
+	ptep = PTR_ALIGN_DOWN(ptep, sizeof(*ptep) * CONT_PTES);
+	for (i = 0; i < CONT_PTES; i++, ptep++)
+		__set_pte(ptep, pte_mknoncont(__ptep_get(ptep)));
+}
+
+static int split_pmd(pmd_t *pmdp, pmd_t pmd, gfp_t gfp, bool to_cont)
+{
+	pmdval_t tableprot = PMD_TYPE_TABLE | PMD_TABLE_UXN | PMD_TABLE_AF;
+	unsigned long pfn = pmd_pfn(pmd);
+	pgprot_t prot = pmd_pgprot(pmd);
+	phys_addr_t pte_phys;
+	pte_t *ptep;
+	int i;
+
+	pte_phys = try_pgd_pgtable_alloc_init_mm(TABLE_PTE, gfp);
+	if (pte_phys == INVALID_PHYS_ADDR)
+		return -ENOMEM;
+	ptep = (pte_t *)phys_to_virt(pte_phys);
+
+	if (pgprot_val(prot) & PMD_SECT_PXN)
+		tableprot |= PMD_TABLE_PXN;
+
+	prot = __pgprot((pgprot_val(prot) & ~PTE_TYPE_MASK) | PTE_TYPE_PAGE);
+	prot = __pgprot(pgprot_val(prot) & ~PTE_CONT);
+	if (to_cont)
+		prot = __pgprot(pgprot_val(prot) | PTE_CONT);
+
+	for (i = 0; i < PTRS_PER_PTE; i++, ptep++, pfn++)
+		__set_pte(ptep, pfn_pte(pfn, prot));
+
+	/*
+	 * Ensure the pte entries are visible to the table walker by the time
+	 * the pmd entry that points to the ptes is visible.
+	 */
+	dsb(ishst);
+	__pmd_populate(pmdp, pte_phys, tableprot);
+
+	return 0;
+}
+
+static void split_contpmd(pmd_t *pmdp)
+{
+	int i;
+
+	pmdp = PTR_ALIGN_DOWN(pmdp, sizeof(*pmdp) * CONT_PMDS);
+	for (i = 0; i < CONT_PMDS; i++, pmdp++)
+		set_pmd(pmdp, pmd_mknoncont(pmdp_get(pmdp)));
+}
+
+static int split_pud(pud_t *pudp, pud_t pud, gfp_t gfp, bool to_cont)
+{
+	pudval_t tableprot = PUD_TYPE_TABLE | PUD_TABLE_UXN | PUD_TABLE_AF;
+	unsigned int step = PMD_SIZE >> PAGE_SHIFT;
+	unsigned long pfn = pud_pfn(pud);
+	pgprot_t prot = pud_pgprot(pud);
+	phys_addr_t pmd_phys;
+	pmd_t *pmdp;
+	int i;
+
+	pmd_phys = try_pgd_pgtable_alloc_init_mm(TABLE_PMD, gfp);
+	if (pmd_phys == INVALID_PHYS_ADDR)
+		return -ENOMEM;
+	pmdp = (pmd_t *)phys_to_virt(pmd_phys);
+
+	if (pgprot_val(prot) & PMD_SECT_PXN)
+		tableprot |= PUD_TABLE_PXN;
+
+	prot = __pgprot((pgprot_val(prot) & ~PMD_TYPE_MASK) | PMD_TYPE_SECT);
+	prot = __pgprot(pgprot_val(prot) & ~PTE_CONT);
+	if (to_cont)
+		prot = __pgprot(pgprot_val(prot) | PTE_CONT);
+
+	for (i = 0; i < PTRS_PER_PMD; i++, pmdp++, pfn += step)
+		set_pmd(pmdp, pfn_pmd(pfn, prot));
+
+	/*
+	 * Ensure the pmd entries are visible to the table walker by the time
+	 * the pud entry that points to the pmds is visible.
+	 */
+	dsb(ishst);
+	__pud_populate(pudp, pmd_phys, tableprot);
+
+	return 0;
+}
+
+static int split_kernel_leaf_mapping_locked(unsigned long addr)
+{
+	pgd_t *pgdp, pgd;
+	p4d_t *p4dp, p4d;
+	pud_t *pudp, pud;
+	pmd_t *pmdp, pmd;
+	pte_t *ptep, pte;
+	int ret = 0;
+
+	/*
+	 * PGD: If addr is PGD aligned then addr already describes a leaf
+	 * boundary. If not present then there is nothing to split.
+	 */
+	if (ALIGN_DOWN(addr, PGDIR_SIZE) == addr)
+		goto out;
+	pgdp = pgd_offset_k(addr);
+	pgd = pgdp_get(pgdp);
+	if (!pgd_present(pgd))
+		goto out;
+
+	/*
+	 * P4D: If addr is P4D aligned then addr already describes a leaf
+	 * boundary. If not present then there is nothing to split.
+	 */
+	if (ALIGN_DOWN(addr, P4D_SIZE) == addr)
+		goto out;
+	p4dp = p4d_offset(pgdp, addr);
+	p4d = p4dp_get(p4dp);
+	if (!p4d_present(p4d))
+		goto out;
+
+	/*
+	 * PUD: If addr is PUD aligned then addr already describes a leaf
+	 * boundary. If not present then there is nothing to split. Otherwise,
+	 * if we have a pud leaf, split to contpmd.
+	 */
+	if (ALIGN_DOWN(addr, PUD_SIZE) == addr)
+		goto out;
+	pudp = pud_offset(p4dp, addr);
+	pud = pudp_get(pudp);
+	if (!pud_present(pud))
+		goto out;
+	if (pud_leaf(pud)) {
+		ret = split_pud(pudp, pud, GFP_PGTABLE_KERNEL, true);
+		if (ret)
+			goto out;
+	}
+
+	/*
+	 * CONTPMD: If addr is CONTPMD aligned then addr already describes a
+	 * leaf boundary. If not present then there is nothing to split.
+	 * Otherwise, if we have a contpmd leaf, split to pmd.
+	 */
+	if (ALIGN_DOWN(addr, CONT_PMD_SIZE) == addr)
+		goto out;
+	pmdp = pmd_offset(pudp, addr);
+	pmd = pmdp_get(pmdp);
+	if (!pmd_present(pmd))
+		goto out;
+	if (pmd_leaf(pmd)) {
+		if (pmd_cont(pmd))
+			split_contpmd(pmdp);
+		/*
+		 * PMD: If addr is PMD aligned then addr already describes a
+		 * leaf boundary. Otherwise, split to contpte.
+		 */
+		if (ALIGN_DOWN(addr, PMD_SIZE) == addr)
+			goto out;
+		ret = split_pmd(pmdp, pmd, GFP_PGTABLE_KERNEL, true);
+		if (ret)
+			goto out;
+	}
+
+	/*
+	 * CONTPTE: If addr is CONTPTE aligned then addr already describes a
+	 * leaf boundary. If not present then there is nothing to split.
+	 * Otherwise, if we have a contpte leaf, split to pte.
+	 */
+	if (ALIGN_DOWN(addr, CONT_PTE_SIZE) == addr)
+		goto out;
+	ptep = pte_offset_kernel(pmdp, addr);
+	pte = __ptep_get(ptep);
+	if (!pte_present(pte))
+		goto out;
+	if (pte_cont(pte))
+		split_contpte(ptep);
+
+out:
+	return ret;
+}
+
+static DEFINE_MUTEX(pgtable_split_lock);
+
+int split_kernel_leaf_mapping(unsigned long start, unsigned long end)
+{
+	int ret;
+
+	/*
+	 * !BBML2_NOABORT systems should not be trying to change permissions on
+	 * anything that is not pte-mapped in the first place. Just return early
+	 * and let the permission change code raise a warning if not already
+	 * pte-mapped.
+	 */
+	if (!system_supports_bbml2_noabort())
+		return 0;
+
+	/*
+	 * Ensure start and end are at least page-aligned since this is the
+	 * finest granularity we can split to.
+	 */
+	if (start != PAGE_ALIGN(start) || end != PAGE_ALIGN(end))
+		return -EINVAL;
+
+	mutex_lock(&pgtable_split_lock);
+	arch_enter_lazy_mmu_mode();
+
+	/*
+	 * The split_kernel_leaf_mapping_locked() may sleep, it is not a
+	 * problem for ARM64 since ARM64's lazy MMU implementation allows
+	 * sleeping.
+	 *
+	 * Optimize for the common case of splitting out a single page from a
+	 * larger mapping. Here we can just split on the "least aligned" of
+	 * start and end and this will guarantee that there must also be a split
+	 * on the more aligned address since the both addresses must be in the
+	 * same contpte block and it must have been split to ptes.
+	 */
+	if (end - start == PAGE_SIZE) {
+		start = __ffs(start) < __ffs(end) ? start : end;
+		ret = split_kernel_leaf_mapping_locked(start);
+	} else {
+		ret = split_kernel_leaf_mapping_locked(start);
+		if (!ret)
+			ret = split_kernel_leaf_mapping_locked(end);
+	}
+
+	arch_leave_lazy_mmu_mode();
+	mutex_unlock(&pgtable_split_lock);
+	return ret;
+}
+
+static int __init split_to_ptes_pud_entry(pud_t *pudp, unsigned long addr,
+					  unsigned long next,
+					  struct mm_walk *walk)
+{
+	pud_t pud = pudp_get(pudp);
+	int ret = 0;
+
+	if (pud_leaf(pud))
+		ret = split_pud(pudp, pud, GFP_ATOMIC, false);
+
+	return ret;
+}
+
+static int __init split_to_ptes_pmd_entry(pmd_t *pmdp, unsigned long addr,
+					  unsigned long next,
+					  struct mm_walk *walk)
+{
+	pmd_t pmd = pmdp_get(pmdp);
+	int ret = 0;
+
+	if (pmd_leaf(pmd)) {
+		if (pmd_cont(pmd))
+			split_contpmd(pmdp);
+		ret = split_pmd(pmdp, pmd, GFP_ATOMIC, false);
+
+		/*
+		 * We have split the pmd directly to ptes so there is no need to
+		 * visit each pte to check if they are contpte.
+		 */
+		walk->action = ACTION_CONTINUE;
+	}
+
+	return ret;
+}
+
+static int __init split_to_ptes_pte_entry(pte_t *ptep, unsigned long addr,
+					  unsigned long next,
+					  struct mm_walk *walk)
+{
+	pte_t pte = __ptep_get(ptep);
+
+	if (pte_cont(pte))
+		split_contpte(ptep);
+
+	return 0;
+}
+
+static const struct mm_walk_ops split_to_ptes_ops __initconst = {
+	.pud_entry = split_to_ptes_pud_entry,
+	.pmd_entry = split_to_ptes_pmd_entry,
+	.pte_entry = split_to_ptes_pte_entry,
+};
+
+static bool linear_map_requires_bbml2 __initdata;
+
+u32 idmap_kpti_bbml2_flag;
+
+void __init init_idmap_kpti_bbml2_flag(void)
+{
+	WRITE_ONCE(idmap_kpti_bbml2_flag, 1);
+	/* Must be visible to other CPUs before stop_machine() is called.
*/ 822 + smp_mb(); 823 + } 824 + 825 + static int __init linear_map_split_to_ptes(void *__unused) 826 + { 827 + /* 828 + * Repainting the linear map must be done by CPU0 (the boot CPU) because 829 + * that's the only CPU that we know supports BBML2. The other CPUs will 830 + * be held in a waiting area with the idmap active. 831 + */ 832 + if (!smp_processor_id()) { 833 + unsigned long lstart = _PAGE_OFFSET(vabits_actual); 834 + unsigned long lend = PAGE_END; 835 + unsigned long kstart = (unsigned long)lm_alias(_stext); 836 + unsigned long kend = (unsigned long)lm_alias(__init_begin); 837 + int ret; 838 + 839 + /* 840 + * Wait for all secondary CPUs to be put into the waiting area. 841 + */ 842 + smp_cond_load_acquire(&idmap_kpti_bbml2_flag, VAL == num_online_cpus()); 843 + 844 + /* 845 + * Walk all of the linear map [lstart, lend), except the kernel 846 + * linear map alias [kstart, kend), and split all mappings to 847 + * PTE. The kernel alias remains static throughout runtime so 848 + * can continue to be safely mapped with large mappings. 849 + */ 850 + ret = walk_kernel_page_table_range_lockless(lstart, kstart, 851 + &split_to_ptes_ops, NULL, NULL); 852 + if (!ret) 853 + ret = walk_kernel_page_table_range_lockless(kend, lend, 854 + &split_to_ptes_ops, NULL, NULL); 855 + if (ret) 856 + panic("Failed to split linear map\n"); 857 + flush_tlb_kernel_range(lstart, lend); 858 + 859 + /* 860 + * Relies on dsb in flush_tlb_kernel_range() to avoid reordering 861 + * before any page table split operations. 862 + */ 863 + WRITE_ONCE(idmap_kpti_bbml2_flag, 0); 864 + } else { 865 + typedef void (wait_split_fn)(void); 866 + extern wait_split_fn wait_linear_map_split_to_ptes; 867 + wait_split_fn *wait_fn; 868 + 869 + wait_fn = (void *)__pa_symbol(wait_linear_map_split_to_ptes); 870 + 871 + /* 872 + * At least one secondary CPU doesn't support BBML2 so cannot 873 + * tolerate the size of the live mappings changing. 
So have the 874 + * secondary CPUs wait for the boot CPU to make the changes 875 + * with the idmap active and init_mm inactive. 876 + */ 877 + cpu_install_idmap(); 878 + wait_fn(); 879 + cpu_uninstall_idmap(); 880 + } 881 + 882 + return 0; 883 + } 884 + 885 + void __init linear_map_maybe_split_to_ptes(void) 886 + { 887 + if (linear_map_requires_bbml2 && !system_supports_bbml2_noabort()) { 888 + init_idmap_kpti_bbml2_flag(); 889 + stop_machine(linear_map_split_to_ptes, NULL, cpu_online_mask); 890 + } 523 891 } 524 892 525 893 /* ··· 958 574 /* 959 575 * Remove the write permissions from the linear alias of .text/.rodata 960 576 */ 961 - update_mapping_prot(__pa_symbol(_stext), (unsigned long)lm_alias(_stext), 962 - (unsigned long)__init_begin - (unsigned long)_stext, 577 + update_mapping_prot(__pa_symbol(_text), (unsigned long)lm_alias(_text), 578 + (unsigned long)__init_begin - (unsigned long)_text, 963 579 PAGE_KERNEL_RO); 964 580 } 965 581 ··· 1017 633 1018 634 #endif /* CONFIG_KFENCE */ 1019 635 636 + static inline bool force_pte_mapping(void) 637 + { 638 + bool bbml2 = system_capabilities_finalized() ? 
639 + system_supports_bbml2_noabort() : cpu_supports_bbml2_noabort(); 640 + 641 + return (!bbml2 && (rodata_full || arm64_kfence_can_set_direct_map() || 642 + is_realm_world())) || 643 + debug_pagealloc_enabled(); 644 + } 645 + 1020 646 static void __init map_mem(pgd_t *pgdp) 1021 647 { 1022 648 static const u64 direct_map_end = _PAGE_END(VA_BITS_MIN); 1023 - phys_addr_t kernel_start = __pa_symbol(_stext); 649 + phys_addr_t kernel_start = __pa_symbol(_text); 1024 650 phys_addr_t kernel_end = __pa_symbol(__init_begin); 1025 651 phys_addr_t start, end; 1026 652 phys_addr_t early_kfence_pool; ··· 1052 658 1053 659 early_kfence_pool = arm64_kfence_alloc_pool(); 1054 660 1055 - if (can_set_direct_map()) 661 + linear_map_requires_bbml2 = !force_pte_mapping() && can_set_direct_map(); 662 + 663 + if (force_pte_mapping()) 1056 664 flags |= NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS; 1057 665 1058 666 /* ··· 1079 683 } 1080 684 1081 685 /* 1082 - * Map the linear alias of the [_stext, __init_begin) interval 686 + * Map the linear alias of the [_text, __init_begin) interval 1083 687 * as non-executable now, and remove the write permission in 1084 688 * mark_linear_text_alias_ro() below (which will be called after 1085 689 * alternative patching has completed). This makes the contents ··· 1106 710 WRITE_ONCE(rodata_is_rw, false); 1107 711 update_mapping_prot(__pa_symbol(__start_rodata), (unsigned long)__start_rodata, 1108 712 section_size, PAGE_KERNEL_RO); 713 + /* mark the range between _text and _stext as read only. 
*/ 714 + update_mapping_prot(__pa_symbol(_text), (unsigned long)_text, 715 + (unsigned long)_stext - (unsigned long)_text, 716 + PAGE_KERNEL_RO); 1109 717 } 1110 718 1111 719 static void __init declare_vma(struct vm_struct *vma, ··· 1180 780 { 1181 781 static struct vm_struct vmlinux_seg[KERNEL_SEGMENT_COUNT]; 1182 782 1183 - declare_vma(&vmlinux_seg[0], _stext, _etext, VM_NO_GUARD); 783 + declare_vma(&vmlinux_seg[0], _text, _etext, VM_NO_GUARD); 1184 784 declare_vma(&vmlinux_seg[1], __start_rodata, __inittext_begin, VM_NO_GUARD); 1185 785 declare_vma(&vmlinux_seg[2], __inittext_begin, __inittext_end, VM_NO_GUARD); 1186 786 declare_vma(&vmlinux_seg[3], __initdata_begin, __initdata_end, VM_NO_GUARD); 1187 787 declare_vma(&vmlinux_seg[4], _data, _end, 0); 1188 788 } 1189 789 1190 - void __pi_map_range(u64 *pgd, u64 start, u64 end, u64 pa, pgprot_t prot, 1191 - int level, pte_t *tbl, bool may_use_cont, u64 va_offset); 790 + void __pi_map_range(phys_addr_t *pte, u64 start, u64 end, phys_addr_t pa, 791 + pgprot_t prot, int level, pte_t *tbl, bool may_use_cont, 792 + u64 va_offset); 1192 793 1193 794 static u8 idmap_ptes[IDMAP_LEVELS - 1][PAGE_SIZE] __aligned(PAGE_SIZE) __ro_after_init, 1194 - kpti_ptes[IDMAP_LEVELS - 1][PAGE_SIZE] __aligned(PAGE_SIZE) __ro_after_init; 795 + kpti_bbml2_ptes[IDMAP_LEVELS - 1][PAGE_SIZE] __aligned(PAGE_SIZE) __ro_after_init; 1195 796 1196 797 static void __init create_idmap(void) 1197 798 { 1198 - u64 start = __pa_symbol(__idmap_text_start); 1199 - u64 end = __pa_symbol(__idmap_text_end); 1200 - u64 ptep = __pa_symbol(idmap_ptes); 799 + phys_addr_t start = __pa_symbol(__idmap_text_start); 800 + phys_addr_t end = __pa_symbol(__idmap_text_end); 801 + phys_addr_t ptep = __pa_symbol(idmap_ptes); 1201 802 1202 803 __pi_map_range(&ptep, start, end, start, PAGE_KERNEL_ROX, 1203 804 IDMAP_ROOT_LEVEL, (pte_t *)idmap_pg_dir, false, 1204 805 __phys_to_virt(ptep) - ptep); 1205 806 1206 - if (IS_ENABLED(CONFIG_UNMAP_KERNEL_AT_EL0) && 
!arm64_use_ng_mappings) { 1207 - extern u32 __idmap_kpti_flag; 1208 - u64 pa = __pa_symbol(&__idmap_kpti_flag); 807 + if (linear_map_requires_bbml2 || 808 + (IS_ENABLED(CONFIG_UNMAP_KERNEL_AT_EL0) && !arm64_use_ng_mappings)) { 809 + phys_addr_t pa = __pa_symbol(&idmap_kpti_bbml2_flag); 1209 810 1210 811 /* 1211 812 * The KPTI G-to-nG conversion code needs a read-write mapping 1212 - * of its synchronization flag in the ID map. 813 + * of its synchronization flag in the ID map. This is also used 814 + * when splitting the linear map to ptes if a secondary CPU 815 + * doesn't support bbml2. 1213 816 */ 1214 - ptep = __pa_symbol(kpti_ptes); 817 + ptep = __pa_symbol(kpti_bbml2_ptes); 1215 818 __pi_map_range(&ptep, pa, pa + sizeof(u32), pa, PAGE_KERNEL, 1216 819 IDMAP_ROOT_LEVEL, (pte_t *)idmap_pg_dir, false, 1217 820 __phys_to_virt(ptep) - ptep); ··· 1664 1261 return 1; 1665 1262 } 1666 1263 1667 - int pmd_free_pte_page(pmd_t *pmdp, unsigned long addr) 1264 + static int __pmd_free_pte_page(pmd_t *pmdp, unsigned long addr, 1265 + bool acquire_mmap_lock) 1668 1266 { 1669 1267 pte_t *table; 1670 1268 pmd_t pmd; ··· 1677 1273 return 1; 1678 1274 } 1679 1275 1276 + /* See comment in pud_free_pmd_page for static key logic */ 1680 1277 table = pte_offset_kernel(pmdp, addr); 1681 1278 pmd_clear(pmdp); 1682 1279 __flush_tlb_kernel_pgtable(addr); 1280 + if (static_branch_unlikely(&arm64_ptdump_lock_key) && acquire_mmap_lock) { 1281 + mmap_read_lock(&init_mm); 1282 + mmap_read_unlock(&init_mm); 1283 + } 1284 + 1683 1285 pte_free_kernel(NULL, table); 1684 1286 return 1; 1287 + } 1288 + 1289 + int pmd_free_pte_page(pmd_t *pmdp, unsigned long addr) 1290 + { 1291 + /* If ptdump is walking the pagetables, acquire init_mm.mmap_lock */ 1292 + return __pmd_free_pte_page(pmdp, addr, /* acquire_mmap_lock = */ true); 1685 1293 } 1686 1294 1687 1295 int pud_free_pmd_page(pud_t *pudp, unsigned long addr) ··· 1711 1295 } 1712 1296 1713 1297 table = pmd_offset(pudp, addr); 1298 + 1299 + /* 1300 
+ * Our objective is to prevent ptdump from reading a PMD table which has 1301 + * been freed. In this race, if pud_free_pmd_page observes the key on 1302 + * (which got flipped by ptdump) then the mmap lock sequence here will, 1303 + * as a result of the mmap write lock/unlock sequence in ptdump, give 1304 + * us the correct synchronization. If not, this means that ptdump has 1305 + * yet not started walking the pagetables - the sequence of barriers 1306 + * issued by __flush_tlb_kernel_pgtable() guarantees that ptdump will 1307 + * observe an empty PUD. 1308 + */ 1309 + pud_clear(pudp); 1310 + __flush_tlb_kernel_pgtable(addr); 1311 + if (static_branch_unlikely(&arm64_ptdump_lock_key)) { 1312 + mmap_read_lock(&init_mm); 1313 + mmap_read_unlock(&init_mm); 1314 + } 1315 + 1714 1316 pmdp = table; 1715 1317 next = addr; 1716 1318 end = addr + PUD_SIZE; 1717 1319 do { 1718 1320 if (pmd_present(pmdp_get(pmdp))) 1719 - pmd_free_pte_page(pmdp, next); 1321 + /* 1322 + * PMD has been isolated, so ptdump won't see it. No 1323 + * need to acquire init_mm.mmap_lock. 
1324 + */ 1325 + __pmd_free_pte_page(pmdp, next, /* acquire_mmap_lock = */ false); 1720 1326 } while (pmdp++, next += PMD_SIZE, next != end); 1721 1327 1722 - pud_clear(pudp); 1723 - __flush_tlb_kernel_pgtable(addr); 1724 1328 pmd_free(NULL, table); 1725 1329 return 1; 1726 1330 } ··· 1760 1324 struct range arch_get_mappable_range(void) 1761 1325 { 1762 1326 struct range mhp_range; 1763 - u64 start_linear_pa = __pa(_PAGE_OFFSET(vabits_actual)); 1764 - u64 end_linear_pa = __pa(PAGE_END - 1); 1327 + phys_addr_t start_linear_pa = __pa(_PAGE_OFFSET(vabits_actual)); 1328 + phys_addr_t end_linear_pa = __pa(PAGE_END - 1); 1765 1329 1766 1330 if (IS_ENABLED(CONFIG_RANDOMIZE_BASE)) { 1767 1331 /* ··· 1796 1360 1797 1361 VM_BUG_ON(!mhp_range_allowed(start, size, true)); 1798 1362 1799 - if (can_set_direct_map()) 1363 + if (force_pte_mapping()) 1800 1364 flags |= NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS; 1801 1365 1802 1366 __create_pgd_mapping(swapper_pg_dir, start, __phys_to_virt(start),
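The single-page fast path in split_kernel_leaf_mapping() above splits only at the "least aligned" of start and end, which it picks by comparing __ffs() of the two addresses. A minimal user-space sketch of that selection follows. It assumes 4 KiB pages and a GCC/Clang compiler; pick_split_addr(), lowest_set_bit() and SKETCH_PAGE_SIZE are hypothetical names for illustration, not kernel symbols.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096ULL /* assumption: 4 KiB pages */

/*
 * Index of the lowest set bit. Stands in for the kernel's __ffs();
 * __builtin_ctzll() is a GCC/Clang builtin and is undefined for 0.
 */
static unsigned int lowest_set_bit(uint64_t x)
{
	assert(x != 0);
	return (unsigned int)__builtin_ctzll(x);
}

/*
 * For a range covering exactly one page, return whichever of start
 * and end has fewer trailing zero bits, i.e. the less aligned one.
 * Splitting there also forces a split at the more aligned address.
 */
static uint64_t pick_split_addr(uint64_t start, uint64_t end)
{
	assert(end - start == SKETCH_PAGE_SIZE);
	return lowest_set_bit(start) < lowest_set_bit(end) ? start : end;
}
```

For a page starting at 0x200000 (2 MiB aligned), the end address 0x201000 is only page-aligned, so the sketch picks the end. Two page-aligned addresses exactly one page apart can never be equally aligned, so the comparison is never a tie.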
+95 -34
arch/arm64/mm/pageattr.c
··· 8 8 #include <linux/mem_encrypt.h> 9 9 #include <linux/sched.h> 10 10 #include <linux/vmalloc.h> 11 + #include <linux/pagewalk.h> 11 12 12 13 #include <asm/cacheflush.h> 13 14 #include <asm/pgtable-prot.h> ··· 21 20 pgprot_t clear_mask; 22 21 }; 23 22 24 - bool rodata_full __ro_after_init = IS_ENABLED(CONFIG_RODATA_FULL_DEFAULT_ENABLED); 23 + static ptdesc_t set_pageattr_masks(ptdesc_t val, struct mm_walk *walk) 24 + { 25 + struct page_change_data *masks = walk->private; 26 + 27 + val &= ~(pgprot_val(masks->clear_mask)); 28 + val |= (pgprot_val(masks->set_mask)); 29 + 30 + return val; 31 + } 32 + 33 + static int pageattr_pud_entry(pud_t *pud, unsigned long addr, 34 + unsigned long next, struct mm_walk *walk) 35 + { 36 + pud_t val = pudp_get(pud); 37 + 38 + if (pud_sect(val)) { 39 + if (WARN_ON_ONCE((next - addr) != PUD_SIZE)) 40 + return -EINVAL; 41 + val = __pud(set_pageattr_masks(pud_val(val), walk)); 42 + set_pud(pud, val); 43 + walk->action = ACTION_CONTINUE; 44 + } 45 + 46 + return 0; 47 + } 48 + 49 + static int pageattr_pmd_entry(pmd_t *pmd, unsigned long addr, 50 + unsigned long next, struct mm_walk *walk) 51 + { 52 + pmd_t val = pmdp_get(pmd); 53 + 54 + if (pmd_sect(val)) { 55 + if (WARN_ON_ONCE((next - addr) != PMD_SIZE)) 56 + return -EINVAL; 57 + val = __pmd(set_pageattr_masks(pmd_val(val), walk)); 58 + set_pmd(pmd, val); 59 + walk->action = ACTION_CONTINUE; 60 + } 61 + 62 + return 0; 63 + } 64 + 65 + static int pageattr_pte_entry(pte_t *pte, unsigned long addr, 66 + unsigned long next, struct mm_walk *walk) 67 + { 68 + pte_t val = __ptep_get(pte); 69 + 70 + val = __pte(set_pageattr_masks(pte_val(val), walk)); 71 + __set_pte(pte, val); 72 + 73 + return 0; 74 + } 75 + 76 + static const struct mm_walk_ops pageattr_ops = { 77 + .pud_entry = pageattr_pud_entry, 78 + .pmd_entry = pageattr_pmd_entry, 79 + .pte_entry = pageattr_pte_entry, 80 + }; 81 + 82 + bool rodata_full __ro_after_init = true; 25 83 26 84 bool can_set_direct_map(void) 27 85 { ··· 97 37 
arm64_kfence_can_set_direct_map() || is_realm_world(); 98 38 } 99 39 100 - static int change_page_range(pte_t *ptep, unsigned long addr, void *data) 101 - { 102 - struct page_change_data *cdata = data; 103 - pte_t pte = __ptep_get(ptep); 104 - 105 - pte = clear_pte_bit(pte, cdata->clear_mask); 106 - pte = set_pte_bit(pte, cdata->set_mask); 107 - 108 - __set_pte(ptep, pte); 109 - return 0; 110 - } 111 - 112 - /* 113 - * This function assumes that the range is mapped with PAGE_SIZE pages. 114 - */ 115 - static int __change_memory_common(unsigned long start, unsigned long size, 116 - pgprot_t set_mask, pgprot_t clear_mask) 40 + static int update_range_prot(unsigned long start, unsigned long size, 41 + pgprot_t set_mask, pgprot_t clear_mask) 117 42 { 118 43 struct page_change_data data; 119 44 int ret; ··· 106 61 data.set_mask = set_mask; 107 62 data.clear_mask = clear_mask; 108 63 109 - ret = apply_to_page_range(&init_mm, start, size, change_page_range, 110 - &data); 64 + ret = split_kernel_leaf_mapping(start, start + size); 65 + if (WARN_ON_ONCE(ret)) 66 + return ret; 67 + 68 + arch_enter_lazy_mmu_mode(); 69 + 70 + /* 71 + * The caller must ensure that the range we are operating on does not 72 + * partially overlap a block mapping, or a cont mapping. Any such case 73 + * must be eliminated by splitting the mapping. 
74 + */ 75 + ret = walk_kernel_page_table_range_lockless(start, start + size, 76 + &pageattr_ops, NULL, &data); 77 + arch_leave_lazy_mmu_mode(); 78 + 79 + return ret; 80 + } 81 + 82 + static int __change_memory_common(unsigned long start, unsigned long size, 83 + pgprot_t set_mask, pgprot_t clear_mask) 84 + { 85 + int ret; 86 + 87 + ret = update_range_prot(start, size, set_mask, clear_mask); 111 88 112 89 /* 113 90 * If the memory is being made valid without changing any other bits ··· 241 174 242 175 int set_direct_map_invalid_noflush(struct page *page) 243 176 { 244 - struct page_change_data data = { 245 - .set_mask = __pgprot(0), 246 - .clear_mask = __pgprot(PTE_VALID), 247 - }; 177 + pgprot_t clear_mask = __pgprot(PTE_VALID); 178 + pgprot_t set_mask = __pgprot(0); 248 179 249 180 if (!can_set_direct_map()) 250 181 return 0; 251 182 252 - return apply_to_page_range(&init_mm, 253 - (unsigned long)page_address(page), 254 - PAGE_SIZE, change_page_range, &data); 183 + return update_range_prot((unsigned long)page_address(page), 184 + PAGE_SIZE, set_mask, clear_mask); 255 185 } 256 186 257 187 int set_direct_map_default_noflush(struct page *page) 258 188 { 259 - struct page_change_data data = { 260 - .set_mask = __pgprot(PTE_VALID | PTE_WRITE), 261 - .clear_mask = __pgprot(PTE_RDONLY), 262 - }; 189 + pgprot_t set_mask = __pgprot(PTE_VALID | PTE_WRITE); 190 + pgprot_t clear_mask = __pgprot(PTE_RDONLY); 263 191 264 192 if (!can_set_direct_map()) 265 193 return 0; 266 194 267 - return apply_to_page_range(&init_mm, 268 - (unsigned long)page_address(page), 269 - PAGE_SIZE, change_page_range, &data); 195 + return update_range_prot((unsigned long)page_address(page), 196 + PAGE_SIZE, set_mask, clear_mask); 270 197 } 271 198 272 199 static int __set_memory_enc_dec(unsigned long addr,
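The new set_pageattr_masks() helper in the pageattr.c hunks above applies the same clear-then-set update to a pud, pmd or pte value. A minimal sketch of that update on a plain 64-bit value follows; apply_masks() is a hypothetical name and the bit patterns in the note below are arbitrary, not real PTE bits.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Clear the bits in clear_mask, then set the bits in set_mask.
 * Mirrors the order used by set_pageattr_masks() in the diff.
 */
static uint64_t apply_masks(uint64_t val, uint64_t set_mask,
			    uint64_t clear_mask)
{
	/* Sketch-only sanity check; the kernel helper does not do this. */
	assert((set_mask & clear_mask) == 0);

	val &= ~clear_mask;
	val |= set_mask;
	return val;
}
```

With val = 0xB, set_mask = 0x4 and clear_mask = 0x1 the result is 0xE. Because clearing happens first, a bit present in both masks would end up set, which is why the sketch rejects overlapping masks.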
+20 -7
arch/arm64/mm/proc.S
··· 245 245 * 246 246 * Called exactly once from stop_machine context by each CPU found during boot. 247 247 */ 248 - .pushsection ".data", "aw", %progbits 249 - SYM_DATA(__idmap_kpti_flag, .long 1) 250 - .popsection 251 - 252 248 SYM_TYPED_FUNC_START(idmap_kpti_install_ng_mappings) 253 249 cpu .req w0 254 250 temp_pte .req x0 ··· 269 273 270 274 mov x5, x3 // preserve temp_pte arg 271 275 mrs swapper_ttb, ttbr1_el1 272 - adr_l flag_ptr, __idmap_kpti_flag 276 + adr_l flag_ptr, idmap_kpti_bbml2_flag 273 277 274 278 cbnz cpu, __idmap_kpti_secondary 275 279 ··· 412 416 __idmap_kpti_secondary: 413 417 /* Uninstall swapper before surgery begins */ 414 418 __idmap_cpu_set_reserved_ttbr1 x16, x17 419 + b scondary_cpu_wait 415 420 421 + .unreq swapper_ttb 422 + .unreq flag_ptr 423 + SYM_FUNC_END(idmap_kpti_install_ng_mappings) 424 + .popsection 425 + #endif 426 + 427 + .pushsection ".idmap.text", "a" 428 + SYM_TYPED_FUNC_START(wait_linear_map_split_to_ptes) 429 + /* Must be same registers as in idmap_kpti_install_ng_mappings */ 430 + swapper_ttb .req x3 431 + flag_ptr .req x4 432 + 433 + mrs swapper_ttb, ttbr1_el1 434 + adr_l flag_ptr, idmap_kpti_bbml2_flag 435 + __idmap_cpu_set_reserved_ttbr1 x16, x17 436 + 437 + scondary_cpu_wait: 416 438 /* Increment the flag to let the boot CPU we're ready */ 417 439 1: ldxr w16, [flag_ptr] 418 440 add w16, w16, #1 ··· 450 436 451 437 .unreq swapper_ttb 452 438 .unreq flag_ptr 453 - SYM_FUNC_END(idmap_kpti_install_ng_mappings) 439 + SYM_FUNC_END(wait_linear_map_split_to_ptes) 454 440 .popsection 455 - #endif 456 441 457 442 /* 458 443 * __cpu_setup
+9 -2
arch/arm64/mm/ptdump.c
···
 	note_page(pt_st, 0, -1, pte_val(pte_zero));
 }
 
+static void arm64_ptdump_walk_pgd(struct ptdump_state *st, struct mm_struct *mm)
+{
+	static_branch_inc(&arm64_ptdump_lock_key);
+	ptdump_walk_pgd(st, mm, NULL);
+	static_branch_dec(&arm64_ptdump_lock_key);
+}
+
 void ptdump_walk(struct seq_file *s, struct ptdump_info *info)
 {
 	unsigned long end = ~0UL;
···
 		}
 	};
 
-	ptdump_walk_pgd(&st.ptdump, info->mm, NULL);
+	arm64_ptdump_walk_pgd(&st.ptdump, info->mm);
 }
 
 static void __init ptdump_initialize(void)
···
 		}
 	};
 
-	ptdump_walk_pgd(&st.ptdump, &init_mm, NULL);
+	arm64_ptdump_walk_pgd(&st.ptdump, &init_mm);
 
 	if (st.wx_pages || st.uxn_pages) {
 		pr_warn("Checked W+X mappings: FAILED, %lu W+X pages found, %lu non-UXN pages found\n",
+20
arch/arm64/tools/gen-sysreg.awk
··· 122 122 res1 = "UL(0)" 123 123 unkn = "UL(0)" 124 124 125 + if (reg in defined_fields) 126 + fatal("Duplicate SysregFields definition for " reg) 127 + defined_fields[reg] = 1 128 + 125 129 next_bit = 63 126 130 127 131 next ··· 165 161 res0 = "UL(0)" 166 162 res1 = "UL(0)" 167 163 unkn = "UL(0)" 164 + 165 + if (reg in defined_regs) 166 + fatal("Duplicate Sysreg definition for " reg) 167 + defined_regs[reg] = 1 168 168 169 169 define("REG_" reg, "S" op0 "_" op1 "_C" crn "_C" crm "_" op2) 170 170 define("SYS_" reg, "sys_reg(" op0 ", " op1 ", " crn ", " crm ", " op2 ")") ··· 292 284 define_field(reg, field, msb, lsb) 293 285 define_field_sign(reg, field, "true") 294 286 287 + delete seen_enum_vals 288 + 295 289 next 296 290 } 297 291 ··· 307 297 define_field(reg, field, msb, lsb) 308 298 define_field_sign(reg, field, "false") 309 299 300 + delete seen_enum_vals 301 + 310 302 next 311 303 } 312 304 ··· 321 309 322 310 define_field(reg, field, msb, lsb) 323 311 312 + delete seen_enum_vals 313 + 324 314 next 325 315 } 326 316 ··· 334 320 lsb = null 335 321 print "" 336 322 323 + delete seen_enum_vals 324 + 337 325 block_pop() 338 326 next 339 327 } ··· 344 328 expect_fields(2) 345 329 val = $1 346 330 name = $2 331 + 332 + if (val in seen_enum_vals) 333 + fatal("Duplicate Enum value " val " for " name) 334 + seen_enum_vals[val] = 1 347 335 348 336 define(reg "_" field "_" name, "UL(" val ")") 349 337 next
+60 -23
arch/arm64/tools/sysreg
··· 31 31 # Mapping <name_EL1> 32 32 # EndSysreg 33 33 34 - # Where multiple system regsiters are not VHE aliases but share a 34 + # Where multiple system registers are not VHE aliases but share a 35 35 # common layout, a SysregFields block can be used to describe the 36 36 # shared layout: 37 37 ··· 54 54 # 55 55 # In general it is recommended that new enumeration items be named for the 56 56 # feature that introduces them (eg, FEAT_LS64_ACCDATA introduces enumeration 57 - # item ACCDATA) though it may be more taseful to do something else. 57 + # item ACCDATA) though it may be more tasteful to do something else. 58 58 59 59 Sysreg OSDTRRX_EL1 2 0 0 0 2 60 60 Res0 63:32 ··· 474 474 Enum 7:4 Security 475 475 0b0000 NI 476 476 0b0001 EL3 477 - 0b0001 NSACR_RFR 477 + 0b0010 NSACR_RFR 478 478 EndEnum 479 479 UnsignedEnum 3:0 ProgMod 480 480 0b0000 NI ··· 1693 1693 0b0000 NI 1694 1694 0b0001 IMP 1695 1695 EndEnum 1696 - UnsignedEnum 39:36 DoubleLock 1696 + SignedEnum 39:36 DoubleLock 1697 1697 0b0000 IMP 1698 1698 0b1111 NI 1699 1699 EndEnum ··· 2409 2409 0b0000 NI 2410 2410 0b0001 IMP 2411 2411 EndEnum 2412 - SignedEnum 7:4 EIESB 2412 + UnsignedEnum 7:4 EIESB 2413 2413 0b0000 NI 2414 2414 0b0001 ToEL3 2415 2415 0b0010 ToELx ··· 2528 2528 Res0 15:0 2529 2529 EndSysreg 2530 2530 2531 - Sysreg CPACR_EL12 3 5 1 0 2 2532 - Mapping CPACR_EL1 2533 - EndSysreg 2534 - 2535 2531 Sysreg CPACRALIAS_EL1 3 0 1 4 4 2536 2532 Mapping CPACR_EL1 2537 2533 EndSysreg ··· 2570 2574 2571 2575 Sysreg PFAR_EL12 3 5 6 0 5 2572 2576 Mapping PFAR_EL1 2573 - EndSysreg 2574 - 2575 - Sysreg RCWSMASK_EL1 3 0 13 0 3 2576 - Field 63:0 RCWSMASK 2577 2577 EndSysreg 2578 2578 2579 2579 Sysreg SCTLR2_EL1 3 0 1 0 3 ··· 2986 2994 EndSysreg 2987 2995 2988 2996 Sysreg PMSFCR_EL1 3 0 9 9 4 2989 - Res0 63:19 2997 + Res0 63:53 2998 + Field 52 SIMDm 2999 + Field 51 FPm 3000 + Field 50 STm 3001 + Field 49 LDm 3002 + Field 48 Bm 3003 + Res0 47:21 3004 + Field 20 SIMD 3005 + Field 19 FP 2990 3006 Field 18 ST 2991 
3007 Field 17 LD 2992 3008 Field 16 B 2993 - Res0 15:4 3009 + Res0 15:5 3010 + Field 4 FDS 2994 3011 Field 3 FnE 2995 3012 Field 2 FL 2996 3013 Field 1 FT ··· 4757 4756 Field 36 AS 4758 4757 Res0 35 4759 4758 Field 34:32 IPS 4760 - Field 31:30 TG1 4761 - Field 29:28 SH1 4762 - Field 27:26 ORGN1 4763 - Field 25:24 IRGN1 4759 + Enum 31:30 TG1 4760 + 0b01 16K 4761 + 0b10 4K 4762 + 0b11 64K 4763 + EndEnum 4764 + Enum 29:28 SH1 4765 + 0b00 NONE 4766 + 0b10 OUTER 4767 + 0b11 INNER 4768 + EndEnum 4769 + Enum 27:26 ORGN1 4770 + 0b00 NC 4771 + 0b01 WBWA 4772 + 0b10 WT 4773 + 0b11 WBnWA 4774 + EndEnum 4775 + Enum 25:24 IRGN1 4776 + 0b00 NC 4777 + 0b01 WBWA 4778 + 0b10 WT 4779 + 0b11 WBnWA 4780 + EndEnum 4764 4781 Field 23 EPD1 4765 4782 Field 22 A1 4766 4783 Field 21:16 T1SZ 4767 - Field 15:14 TG0 4768 - Field 13:12 SH0 4769 - Field 11:10 ORGN0 4770 - Field 9:8 IRGN0 4784 + Enum 15:14 TG0 4785 + 0b00 4K 4786 + 0b01 64K 4787 + 0b10 16K 4788 + EndEnum 4789 + Enum 13:12 SH0 4790 + 0b00 NONE 4791 + 0b10 OUTER 4792 + 0b11 INNER 4793 + EndEnum 4794 + Enum 11:10 ORGN0 4795 + 0b00 NC 4796 + 0b01 WBWA 4797 + 0b10 WT 4798 + 0b11 WBnWA 4799 + EndEnum 4800 + Enum 9:8 IRGN0 4801 + 0b00 NC 4802 + 0b01 WBWA 4803 + 0b10 WT 4804 + 0b11 WBnWA 4805 + EndEnum 4771 4806 Field 7 EPD0 4772 4807 Res0 6 4773 4808 Field 5:0 T0SZ
+2 -1
drivers/hwtracing/coresight/coresight-trbe.c
···
 #include "coresight-self-hosted-trace.h"
 #include "coresight-trbe.h"
 
-#define PERF_IDX2OFF(idx, buf)	((idx) % ((buf)->nr_pages << PAGE_SHIFT))
+#define PERF_IDX2OFF(idx, buf) \
+	((idx) % ((unsigned long)(buf)->nr_pages << PAGE_SHIFT))
 
 /*
  * A padding packet that will help the user space tools
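The PERF_IDX2OFF change above widens nr_pages before the shift. Assuming nr_pages is a plain int, as the added cast suggests, the old expression computed the buffer size in int arithmetic and would overflow once the buffer reached 2 GiB with 4 KiB pages. A minimal user-space sketch of the widened form follows; idx_to_off() and SKETCH_PAGE_SHIFT are hypothetical names, and uint64_t stands in for the kernel's 64-bit unsigned long.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/*
 * Convert a free-running index into an offset inside a buffer of
 * nr_pages pages. The cast happens before the shift, so the buffer
 * size is computed in 64 bits rather than in int.
 */
static uint64_t idx_to_off(uint64_t idx, int nr_pages)
{
	assert(nr_pages > 0);

	uint64_t size = (uint64_t)nr_pages << SKETCH_PAGE_SHIFT;

	return idx % size;
}
```

With nr_pages = 1 << 19 the buffer size is 2^31 bytes, which no longer fits in a signed 32-bit int; the widened computation still wraps an index of 2^31 + 1 to offset 1.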
+9
drivers/perf/Kconfig
···
 	  can give information about memory throughput and other related
 	  events.
 
+config FUJITSU_UNCORE_PMU
+	tristate "Fujitsu Uncore PMU"
+	depends on (ARM64 && ACPI) || (COMPILE_TEST && 64BIT)
+	help
+	 Provides support for the Uncore performance monitor unit (PMU)
+	 in Fujitsu processors.
+	 Adds the Uncore PMU into the perf events subsystem for
+	 monitoring Uncore events.
+
 config QCOM_L2_PMU
 	bool "Qualcomm Technologies L2-cache PMU"
 	depends on ARCH_QCOM && ARM64 && ACPI
+1
drivers/perf/Makefile
···
 obj-$(CONFIG_ARM_SMMU_V3_PMU) += arm_smmuv3_pmu.o
 obj-$(CONFIG_FSL_IMX8_DDR_PMU) += fsl_imx8_ddr_perf.o
 obj-$(CONFIG_FSL_IMX9_DDR_PMU) += fsl_imx9_ddr_perf.o
+obj-$(CONFIG_FUJITSU_UNCORE_PMU) += fujitsu_uncore_pmu.o
 obj-$(CONFIG_HISI_PMU) += hisilicon/
 obj-$(CONFIG_QCOM_L2_PMU) += qcom_l2_pmu.o
 obj-$(CONFIG_QCOM_L3_PMU) += qcom_l3_pmu.o
+1 -1
drivers/perf/arm-ccn.c
···

 static ktime_t arm_ccn_pmu_timer_period(void)
 {
-	return ns_to_ktime((u64)arm_ccn_pmu_poll_period_us * 1000);
+	return us_to_ktime((u64)arm_ccn_pmu_poll_period_us);
 }
 
 
+6 -3
drivers/perf/arm-cmn.c
···
 /* PMU registers occupy the 3rd 4KB page of each node's region */
 #define CMN_PMU_OFFSET			0x2000
 /* ...except when they don't :( */
-#define CMN_S3_DTM_OFFSET		0xa000
+#define CMN_S3_R1_DTM_OFFSET		0xa000
 #define CMN_S3_PMU_OFFSET		0xd900
 
 /* For most nodes, this is all there is */
···
 	REV_CMN700_R1P0,
 	REV_CMN700_R2P0,
 	REV_CMN700_R3P0,
+	REV_CMNS3_R0P0 = 0,
+	REV_CMNS3_R0P1,
+	REV_CMNS3_R1P0,
 	REV_CI700_R0P0 = 0,
 	REV_CI700_R1P0,
 	REV_CI700_R2P0,
···
 static int arm_cmn_pmu_offset(const struct arm_cmn *cmn, const struct arm_cmn_node *dn)
 {
 	if (cmn->part == PART_CMN_S3) {
-		if (dn->type == CMN_TYPE_XP)
-			return CMN_S3_DTM_OFFSET;
+		if (cmn->rev >= REV_CMNS3_R1P0 && dn->type == CMN_TYPE_XP)
+			return CMN_S3_R1_DTM_OFFSET;
 		return CMN_S3_PMU_OFFSET;
 	}
 	return CMN_PMU_OFFSET;
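The arm-cmn.c change above makes the 0xa000 DTM offset apply only to XP nodes on CMN S3 revision r1p0 or later; earlier S3 revisions now fall through to the common S3 PMU offset. A minimal standalone sketch of that decision follows. The offsets are copied from the diff, while the sketch_* enums and pmu_offset() are hypothetical stand-ins for the driver's own types.

```c
#include <assert.h>

/* Hypothetical stand-ins for the driver's part, revision and node types. */
enum sketch_part { SKETCH_PART_OTHER, SKETCH_PART_CMN_S3 };
enum sketch_rev  { SKETCH_S3_R0P0 = 0, SKETCH_S3_R0P1, SKETCH_S3_R1P0 };
enum sketch_node { SKETCH_NODE_XP, SKETCH_NODE_OTHER };

/* Offsets as they appear in the diff. */
#define SKETCH_PMU_OFFSET		0x2000
#define SKETCH_S3_R1_DTM_OFFSET		0xa000
#define SKETCH_S3_PMU_OFFSET		0xd900

static int pmu_offset(enum sketch_part part, int rev, enum sketch_node type)
{
	assert(rev >= 0);

	if (part == SKETCH_PART_CMN_S3) {
		/* Only r1p0 and later move the XP's DTM registers. */
		if (rev >= SKETCH_S3_R1P0 && type == SKETCH_NODE_XP)
			return SKETCH_S3_R1_DTM_OFFSET;
		return SKETCH_S3_PMU_OFFSET;
	}
	return SKETCH_PMU_OFFSET;
}
```

The case that changes behavior is an XP node on r0p0 or r0p1: before the patch it would have returned 0xa000, and in this sketch it returns 0xd900.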
+27 -2
drivers/perf/arm_pmuv3.c
···
 	return -EAGAIN;
 }
 
+static bool armv8pmu_can_use_pmccntr(struct pmu_hw_events *cpuc,
+				     struct perf_event *event)
+{
+	struct hw_perf_event *hwc = &event->hw;
+	unsigned long evtype = hwc->config_base & ARMV8_PMU_EVTYPE_EVENT;
+
+	if (evtype != ARMV8_PMUV3_PERFCTR_CPU_CYCLES)
+		return false;
+
+	/*
+	 * A CPU_CYCLES event with threshold counting cannot use PMCCNTR_EL0
+	 * since it lacks threshold support.
+	 */
+	if (armv8pmu_event_get_threshold(&event->attr))
+		return false;
+
+	/*
+	 * PMCCNTR_EL0 is not affected by BRBE controls like BRBCR_ELx.FZP.
+	 * So don't use it for branch events.
+	 */
+	if (has_branch_stack(event))
+		return false;
+
+	return true;
+}
+
 static int armv8pmu_get_event_idx(struct pmu_hw_events *cpuc,
 				  struct perf_event *event)
 {
···
 	unsigned long evtype = hwc->config_base & ARMV8_PMU_EVTYPE_EVENT;
 
 	/* Always prefer to place a cycle counter into the cycle counter. */
-	if ((evtype == ARMV8_PMUV3_PERFCTR_CPU_CYCLES) &&
-	    !armv8pmu_event_get_threshold(&event->attr) && !has_branch_stack(event)) {
+	if (armv8pmu_can_use_pmccntr(cpuc, event)) {
 		if (!test_and_set_bit(ARMV8_PMU_CYCLE_IDX, cpuc->used_mask))
 			return ARMV8_PMU_CYCLE_IDX;
 		else if (armv8pmu_event_is_64bit(event) &&
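The arm_pmuv3.c refactor above moves the three-part condition into armv8pmu_can_use_pmccntr() without changing what it accepts. A minimal sketch of the same predicate on plain values follows. can_use_cycle_counter() is a hypothetical name, and SKETCH_EVT_CPU_CYCLES assumes the CPU_CYCLES event number is 0x11; that value does not appear in the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: event number of CPU_CYCLES; not taken from the diff. */
#define SKETCH_EVT_CPU_CYCLES 0x11u

/*
 * The dedicated cycle counter is usable only for a CPU_CYCLES event
 * that requests neither threshold counting nor a branch stack.
 */
static bool can_use_cycle_counter(uint32_t evtype, uint32_t threshold,
				  bool branch_stack)
{
	/* Sketch-only check that evtype fits a 16-bit event field. */
	assert(evtype <= 0xffffu);

	if (evtype != SKETCH_EVT_CPU_CYCLES)
		return false;
	if (threshold != 0)
		return false;
	if (branch_stack)
		return false;
	return true;
}
```

Writing the checks as early returns, as the patch does, gives each exclusion its own comment slot, which the original one-line condition had no room for.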
+95 -19
drivers/perf/arm_spe_pmu.c
··· 86 86 #define SPE_PMU_FEAT_ERND (1UL << 5) 87 87 #define SPE_PMU_FEAT_INV_FILT_EVT (1UL << 6) 88 88 #define SPE_PMU_FEAT_DISCARD (1UL << 7) 89 + #define SPE_PMU_FEAT_EFT (1UL << 8) 89 90 #define SPE_PMU_FEAT_DEV_PROBED (1UL << 63) 90 91 u64 features; 91 92 93 + u64 pmsevfr_res0; 92 94 u16 max_record_sz; 93 95 u16 align; 94 96 struct perf_output_handle __percpu *handle; ··· 99 97 #define to_spe_pmu(p) (container_of(p, struct arm_spe_pmu, pmu)) 100 98 101 99 /* Convert a free-running index from perf into an SPE buffer offset */ 102 - #define PERF_IDX2OFF(idx, buf) ((idx) % ((buf)->nr_pages << PAGE_SHIFT)) 100 + #define PERF_IDX2OFF(idx, buf) \ 101 + ((idx) % ((unsigned long)(buf)->nr_pages << PAGE_SHIFT)) 103 102 104 103 /* Keep track of our dynamic hotplug state */ 105 104 static enum cpuhp_state arm_spe_pmu_online; ··· 118 115 SPE_PMU_CAP_FEAT_MAX, 119 116 SPE_PMU_CAP_CNT_SZ = SPE_PMU_CAP_FEAT_MAX, 120 117 SPE_PMU_CAP_MIN_IVAL, 118 + SPE_PMU_CAP_EVENT_FILTER, 121 119 }; 122 120 123 121 static int arm_spe_pmu_feat_caps[SPE_PMU_CAP_FEAT_MAX] = { ··· 126 122 [SPE_PMU_CAP_ERND] = SPE_PMU_FEAT_ERND, 127 123 }; 128 124 129 - static u32 arm_spe_pmu_cap_get(struct arm_spe_pmu *spe_pmu, int cap) 125 + static u64 arm_spe_pmu_cap_get(struct arm_spe_pmu *spe_pmu, int cap) 130 126 { 131 127 if (cap < SPE_PMU_CAP_FEAT_MAX) 132 128 return !!(spe_pmu->features & arm_spe_pmu_feat_caps[cap]); ··· 136 132 return spe_pmu->counter_sz; 137 133 case SPE_PMU_CAP_MIN_IVAL: 138 134 return spe_pmu->min_period; 135 + case SPE_PMU_CAP_EVENT_FILTER: 136 + return ~spe_pmu->pmsevfr_res0; 139 137 default: 140 138 WARN(1, "unknown cap %d\n", cap); 141 139 } ··· 154 148 container_of(attr, struct dev_ext_attribute, attr); 155 149 int cap = (long)ea->var; 156 150 157 - return sysfs_emit(buf, "%u\n", arm_spe_pmu_cap_get(spe_pmu, cap)); 151 + return sysfs_emit(buf, "%llu\n", arm_spe_pmu_cap_get(spe_pmu, cap)); 152 + } 153 + 154 + static ssize_t arm_spe_pmu_cap_show_hex(struct device *dev, 155 + 
struct device_attribute *attr, 156 + char *buf) 157 + { 158 + struct arm_spe_pmu *spe_pmu = dev_get_drvdata(dev); 159 + struct dev_ext_attribute *ea = 160 + container_of(attr, struct dev_ext_attribute, attr); 161 + int cap = (long)ea->var; 162 + 163 + return sysfs_emit(buf, "0x%llx\n", arm_spe_pmu_cap_get(spe_pmu, cap)); 158 164 } 159 165 160 166 #define SPE_EXT_ATTR_ENTRY(_name, _func, _var) \ ··· 176 158 177 159 #define SPE_CAP_EXT_ATTR_ENTRY(_name, _var) \ 178 160 SPE_EXT_ATTR_ENTRY(_name, arm_spe_pmu_cap_show, _var) 161 + #define SPE_CAP_EXT_ATTR_ENTRY_HEX(_name, _var) \ 162 + SPE_EXT_ATTR_ENTRY(_name, arm_spe_pmu_cap_show_hex, _var) 179 163 180 164 static struct attribute *arm_spe_pmu_cap_attr[] = { 181 165 SPE_CAP_EXT_ATTR_ENTRY(arch_inst, SPE_PMU_CAP_ARCH_INST), 182 166 SPE_CAP_EXT_ATTR_ENTRY(ernd, SPE_PMU_CAP_ERND), 183 167 SPE_CAP_EXT_ATTR_ENTRY(count_size, SPE_PMU_CAP_CNT_SZ), 184 168 SPE_CAP_EXT_ATTR_ENTRY(min_interval, SPE_PMU_CAP_MIN_IVAL), 169 + SPE_CAP_EXT_ATTR_ENTRY_HEX(event_filter, SPE_PMU_CAP_EVENT_FILTER), 185 170 NULL, 186 171 }; 187 172 ··· 218 197 #define ATTR_CFG_FLD_discard_CFG config /* PMBLIMITR_EL1.FM = DISCARD */ 219 198 #define ATTR_CFG_FLD_discard_LO 35 220 199 #define ATTR_CFG_FLD_discard_HI 35 200 + #define ATTR_CFG_FLD_branch_filter_mask_CFG config /* PMSFCR_EL1.Bm */ 201 + #define ATTR_CFG_FLD_branch_filter_mask_LO 36 202 + #define ATTR_CFG_FLD_branch_filter_mask_HI 36 203 + #define ATTR_CFG_FLD_load_filter_mask_CFG config /* PMSFCR_EL1.LDm */ 204 + #define ATTR_CFG_FLD_load_filter_mask_LO 37 205 + #define ATTR_CFG_FLD_load_filter_mask_HI 37 206 + #define ATTR_CFG_FLD_store_filter_mask_CFG config /* PMSFCR_EL1.STm */ 207 + #define ATTR_CFG_FLD_store_filter_mask_LO 38 208 + #define ATTR_CFG_FLD_store_filter_mask_HI 38 209 + #define ATTR_CFG_FLD_simd_filter_CFG config /* PMSFCR_EL1.SIMD */ 210 + #define ATTR_CFG_FLD_simd_filter_LO 39 211 + #define ATTR_CFG_FLD_simd_filter_HI 39 212 + #define ATTR_CFG_FLD_simd_filter_mask_CFG config 
/* PMSFCR_EL1.SIMDm */ 213 + #define ATTR_CFG_FLD_simd_filter_mask_LO 40 214 + #define ATTR_CFG_FLD_simd_filter_mask_HI 40 215 + #define ATTR_CFG_FLD_float_filter_CFG config /* PMSFCR_EL1.FP */ 216 + #define ATTR_CFG_FLD_float_filter_LO 41 217 + #define ATTR_CFG_FLD_float_filter_HI 41 218 + #define ATTR_CFG_FLD_float_filter_mask_CFG config /* PMSFCR_EL1.FPm */ 219 + #define ATTR_CFG_FLD_float_filter_mask_LO 42 220 + #define ATTR_CFG_FLD_float_filter_mask_HI 42 221 221 222 222 #define ATTR_CFG_FLD_event_filter_CFG config1 /* PMSEVFR_EL1 */ 223 223 #define ATTR_CFG_FLD_event_filter_LO 0 ··· 257 215 GEN_PMU_FORMAT_ATTR(pct_enable); 258 216 GEN_PMU_FORMAT_ATTR(jitter); 259 217 GEN_PMU_FORMAT_ATTR(branch_filter); 218 + GEN_PMU_FORMAT_ATTR(branch_filter_mask); 260 219 GEN_PMU_FORMAT_ATTR(load_filter); 220 + GEN_PMU_FORMAT_ATTR(load_filter_mask); 261 221 GEN_PMU_FORMAT_ATTR(store_filter); 222 + GEN_PMU_FORMAT_ATTR(store_filter_mask); 223 + GEN_PMU_FORMAT_ATTR(simd_filter); 224 + GEN_PMU_FORMAT_ATTR(simd_filter_mask); 225 + GEN_PMU_FORMAT_ATTR(float_filter); 226 + GEN_PMU_FORMAT_ATTR(float_filter_mask); 262 227 GEN_PMU_FORMAT_ATTR(event_filter); 263 228 GEN_PMU_FORMAT_ATTR(inv_event_filter); 264 229 GEN_PMU_FORMAT_ATTR(min_latency); ··· 277 228 &format_attr_pct_enable.attr, 278 229 &format_attr_jitter.attr, 279 230 &format_attr_branch_filter.attr, 231 + &format_attr_branch_filter_mask.attr, 280 232 &format_attr_load_filter.attr, 233 + &format_attr_load_filter_mask.attr, 281 234 &format_attr_store_filter.attr, 235 + &format_attr_store_filter_mask.attr, 236 + &format_attr_simd_filter.attr, 237 + &format_attr_simd_filter_mask.attr, 238 + &format_attr_float_filter.attr, 239 + &format_attr_float_filter_mask.attr, 282 240 &format_attr_event_filter.attr, 283 241 &format_attr_inv_event_filter.attr, 284 242 &format_attr_min_latency.attr, ··· 304 248 return 0; 305 249 306 250 if (attr == &format_attr_inv_event_filter.attr && !(spe_pmu->features & SPE_PMU_FEAT_INV_FILT_EVT)) 251 + 
return 0; 252 + 253 + if ((attr == &format_attr_branch_filter_mask.attr || 254 + attr == &format_attr_load_filter_mask.attr || 255 + attr == &format_attr_store_filter_mask.attr || 256 + attr == &format_attr_simd_filter.attr || 257 + attr == &format_attr_simd_filter_mask.attr || 258 + attr == &format_attr_float_filter.attr || 259 + attr == &format_attr_float_filter_mask.attr) && 260 + !(spe_pmu->features & SPE_PMU_FEAT_EFT)) 307 261 return 0; 308 262 309 263 return attr->mode; ··· 411 345 u64 reg = 0; 412 346 413 347 reg |= FIELD_PREP(PMSFCR_EL1_LD, ATTR_CFG_GET_FLD(attr, load_filter)); 348 + reg |= FIELD_PREP(PMSFCR_EL1_LDm, ATTR_CFG_GET_FLD(attr, load_filter_mask)); 414 349 reg |= FIELD_PREP(PMSFCR_EL1_ST, ATTR_CFG_GET_FLD(attr, store_filter)); 350 + reg |= FIELD_PREP(PMSFCR_EL1_STm, ATTR_CFG_GET_FLD(attr, store_filter_mask)); 415 351 reg |= FIELD_PREP(PMSFCR_EL1_B, ATTR_CFG_GET_FLD(attr, branch_filter)); 352 + reg |= FIELD_PREP(PMSFCR_EL1_Bm, ATTR_CFG_GET_FLD(attr, branch_filter_mask)); 353 + reg |= FIELD_PREP(PMSFCR_EL1_SIMD, ATTR_CFG_GET_FLD(attr, simd_filter)); 354 + reg |= FIELD_PREP(PMSFCR_EL1_SIMDm, ATTR_CFG_GET_FLD(attr, simd_filter_mask)); 355 + reg |= FIELD_PREP(PMSFCR_EL1_FP, ATTR_CFG_GET_FLD(attr, float_filter)); 356 + reg |= FIELD_PREP(PMSFCR_EL1_FPm, ATTR_CFG_GET_FLD(attr, float_filter_mask)); 416 357 417 358 if (reg) 418 359 reg |= PMSFCR_EL1_FT; ··· 770 697 return IRQ_HANDLED; 771 698 } 772 699 773 - static u64 arm_spe_pmsevfr_res0(u16 pmsver) 774 - { 775 - switch (pmsver) { 776 - case ID_AA64DFR0_EL1_PMSVer_IMP: 777 - return PMSEVFR_EL1_RES0_IMP; 778 - case ID_AA64DFR0_EL1_PMSVer_V1P1: 779 - return PMSEVFR_EL1_RES0_V1P1; 780 - case ID_AA64DFR0_EL1_PMSVer_V1P2: 781 - /* Return the highest version we support in default */ 782 - default: 783 - return PMSEVFR_EL1_RES0_V1P2; 784 - } 785 - } 786 - 787 700 /* Perf callbacks */ 788 701 static int arm_spe_pmu_event_init(struct perf_event *event) 789 702 { ··· 785 726 !cpumask_test_cpu(event->cpu, 
&spe_pmu->supported_cpus)) 786 727 return -ENOENT; 787 728 788 - if (arm_spe_event_to_pmsevfr(event) & arm_spe_pmsevfr_res0(spe_pmu->pmsver)) 729 + if (arm_spe_event_to_pmsevfr(event) & spe_pmu->pmsevfr_res0) 789 730 return -EOPNOTSUPP; 790 731 791 - if (arm_spe_event_to_pmsnevfr(event) & arm_spe_pmsevfr_res0(spe_pmu->pmsver)) 732 + if (arm_spe_event_to_pmsnevfr(event) & spe_pmu->pmsevfr_res0) 792 733 return -EOPNOTSUPP; 793 734 794 735 if (attr->exclude_idle) ··· 819 760 820 761 if ((FIELD_GET(PMSFCR_EL1_FL, reg)) && 821 762 !(spe_pmu->features & SPE_PMU_FEAT_FILT_LAT)) 763 + return -EOPNOTSUPP; 764 + 765 + if ((FIELD_GET(PMSFCR_EL1_LDm, reg) || 766 + FIELD_GET(PMSFCR_EL1_STm, reg) || 767 + FIELD_GET(PMSFCR_EL1_Bm, reg) || 768 + FIELD_GET(PMSFCR_EL1_SIMD, reg) || 769 + FIELD_GET(PMSFCR_EL1_SIMDm, reg) || 770 + FIELD_GET(PMSFCR_EL1_FP, reg) || 771 + FIELD_GET(PMSFCR_EL1_FPm, reg)) && 772 + !(spe_pmu->features & SPE_PMU_FEAT_EFT)) 822 773 return -EOPNOTSUPP; 823 774 824 775 if (ATTR_CFG_GET_FLD(&event->attr, discard) && ··· 1122 1053 if (spe_pmu->pmsver >= ID_AA64DFR0_EL1_PMSVer_V1P2) 1123 1054 spe_pmu->features |= SPE_PMU_FEAT_DISCARD; 1124 1055 1056 + if (FIELD_GET(PMSIDR_EL1_EFT, reg)) 1057 + spe_pmu->features |= SPE_PMU_FEAT_EFT; 1058 + 1125 1059 /* This field has a spaced out encoding, so just use a look-up */ 1126 1060 fld = FIELD_GET(PMSIDR_EL1_INTERVAL, reg); 1127 1061 switch (fld) { ··· 1178 1106 case PMSIDR_EL1_COUNTSIZE_16_BIT_SAT: 1179 1107 spe_pmu->counter_sz = 16; 1180 1108 } 1109 + 1110 + /* Write all 1s and then read back. Unsupported filter bits are RAZ/WI. */ 1111 + write_sysreg_s(U64_MAX, SYS_PMSEVFR_EL1); 1112 + spe_pmu->pmsevfr_res0 = ~read_sysreg_s(SYS_PMSEVFR_EL1); 1181 1113 1182 1114 dev_info(dev, 1183 1115 "probed SPEv1.%d for CPUs %*pbl [max_record_sz %u, align %u, features 0x%llx]\n",
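Note on the PMSEVFR_EL1 probing above: the driver no longer derives the reserved-bit mask from the SPE version. It writes all ones to the register, reads the value back, and inverts it, which works because unsupported filter bits are RAZ/WI. The sketch below models that idea in plain user-space C. It is an illustration only, not the driver's code: `struct fake_reg`, `fake_reg_write()`, `fake_reg_read()`, `probe_res0()` and `validate_filter()` are hypothetical names, and the "register" is an ordinary struct rather than a system register.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a register whose unsupported bits are RAZ/WI. */
struct fake_reg {
	uint64_t implemented;	/* bits the (simulated) hardware really stores */
	uint64_t value;		/* current contents */
};

/* Writes to unimplemented bits are silently ignored (write-ignored). */
static void fake_reg_write(struct fake_reg *r, uint64_t v)
{
	assert(r);
	r->value = v & r->implemented;
}

/* Unimplemented bits always read back as zero (read-as-zero). */
static uint64_t fake_reg_read(const struct fake_reg *r)
{
	assert(r);
	return r->value;
}

/* Write all ones, read back, invert: the result is the reserved-bit mask. */
static uint64_t probe_res0(struct fake_reg *r)
{
	fake_reg_write(r, UINT64_MAX);
	return ~fake_reg_read(r);
}

/* Return 0 if the request only touches implemented bits, -1 otherwise. */
static int validate_filter(uint64_t requested, uint64_t res0)
{
	return (requested & res0) ? -1 : 0;
}
```

In the diff, the same mask is used twice: event_init rejects any event_filter or inv_event_filter value that overlaps it, and the new event_filter capability reports its complement to userspace in hex.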
+130 -31
drivers/perf/dwc_pcie_pmu.c
··· 39 39 #define DWC_PCIE_EVENT_CLEAR GENMASK(1, 0) 40 40 #define DWC_PCIE_EVENT_PER_CLEAR 0x1 41 41 42 + /* Event Selection Field has two subfields */ 43 + #define DWC_PCIE_CNT_EVENT_SEL_GROUP GENMASK(11, 8) 44 + #define DWC_PCIE_CNT_EVENT_SEL_EVID GENMASK(7, 0) 45 + 42 46 #define DWC_PCIE_EVENT_CNT_DATA 0xC 43 47 44 48 #define DWC_PCIE_TIME_BASED_ANAL_CTL 0x10 ··· 77 73 DWC_PCIE_EVENT_TYPE_MAX, 78 74 }; 79 75 76 + #define DWC_PCIE_LANE_GROUP_6 6 77 + #define DWC_PCIE_LANE_GROUP_7 7 78 + #define DWC_PCIE_LANE_MAX_EVENTS_PER_GROUP 256 79 + 80 80 #define DWC_PCIE_LANE_EVENT_MAX_PERIOD GENMASK_ULL(31, 0) 81 81 #define DWC_PCIE_MAX_PERIOD GENMASK_ULL(63, 0) 82 82 ··· 90 82 u16 ras_des_offset; 91 83 u32 nr_lanes; 92 84 85 + /* Groups #6 and #7 */ 86 + DECLARE_BITMAP(lane_events, 2 * DWC_PCIE_LANE_MAX_EVENTS_PER_GROUP); 87 + struct perf_event *time_based_event; 88 + 93 89 struct hlist_node cpuhp_node; 94 - struct perf_event *event[DWC_PCIE_EVENT_TYPE_MAX]; 95 90 int on_cpu; 96 91 }; 97 92 ··· 257 246 }; 258 247 259 248 static void dwc_pcie_pmu_lane_event_enable(struct dwc_pcie_pmu *pcie_pmu, 249 + struct perf_event *event, 260 250 bool enable) 261 251 { 262 252 struct pci_dev *pdev = pcie_pmu->pdev; 263 253 u16 ras_des_offset = pcie_pmu->ras_des_offset; 254 + int event_id = DWC_PCIE_EVENT_ID(event); 255 + int lane = DWC_PCIE_EVENT_LANE(event); 256 + u32 ctrl; 257 + 258 + ctrl = FIELD_PREP(DWC_PCIE_CNT_EVENT_SEL, event_id) | 259 + FIELD_PREP(DWC_PCIE_CNT_LANE_SEL, lane) | 260 + FIELD_PREP(DWC_PCIE_EVENT_CLEAR, DWC_PCIE_EVENT_PER_CLEAR); 264 261 265 262 if (enable) 266 - pci_clear_and_set_config_dword(pdev, 267 - ras_des_offset + DWC_PCIE_EVENT_CNT_CTL, 268 - DWC_PCIE_CNT_ENABLE, DWC_PCIE_PER_EVENT_ON); 263 + ctrl |= FIELD_PREP(DWC_PCIE_CNT_ENABLE, DWC_PCIE_PER_EVENT_ON); 269 264 else 270 - pci_clear_and_set_config_dword(pdev, 271 - ras_des_offset + DWC_PCIE_EVENT_CNT_CTL, 272 - DWC_PCIE_CNT_ENABLE, DWC_PCIE_PER_EVENT_OFF); 265 + ctrl |= FIELD_PREP(DWC_PCIE_CNT_ENABLE, 
DWC_PCIE_PER_EVENT_OFF); 266 + 267 + pci_write_config_dword(pdev, ras_des_offset + DWC_PCIE_EVENT_CNT_CTL, 268 + ctrl); 273 269 } 274 270 275 271 static void dwc_pcie_pmu_time_based_event_enable(struct dwc_pcie_pmu *pcie_pmu, ··· 294 276 { 295 277 struct dwc_pcie_pmu *pcie_pmu = to_dwc_pcie_pmu(event->pmu); 296 278 struct pci_dev *pdev = pcie_pmu->pdev; 279 + int event_id = DWC_PCIE_EVENT_ID(event); 280 + int lane = DWC_PCIE_EVENT_LANE(event); 297 281 u16 ras_des_offset = pcie_pmu->ras_des_offset; 298 - u32 val; 282 + u32 val, ctrl; 299 283 284 + ctrl = FIELD_PREP(DWC_PCIE_CNT_EVENT_SEL, event_id) | 285 + FIELD_PREP(DWC_PCIE_CNT_LANE_SEL, lane) | 286 + FIELD_PREP(DWC_PCIE_CNT_ENABLE, DWC_PCIE_PER_EVENT_ON); 287 + pci_write_config_dword(pdev, ras_des_offset + DWC_PCIE_EVENT_CNT_CTL, 288 + ctrl); 300 289 pci_read_config_dword(pdev, ras_des_offset + DWC_PCIE_EVENT_CNT_DATA, &val); 290 + 291 + ctrl |= FIELD_PREP(DWC_PCIE_EVENT_CLEAR, DWC_PCIE_EVENT_PER_CLEAR); 292 + pci_write_config_dword(pdev, ras_des_offset + DWC_PCIE_EVENT_CNT_CTL, 293 + ctrl); 301 294 302 295 return val; 303 296 } ··· 358 329 { 359 330 struct hw_perf_event *hwc = &event->hw; 360 331 enum dwc_pcie_event_type type = DWC_PCIE_EVENT_TYPE(event); 361 - u64 delta, prev, now = 0; 332 + u64 delta, prev, now; 333 + 334 + if (type == DWC_PCIE_LANE_EVENT) { 335 + now = dwc_pcie_pmu_read_lane_event_counter(event) & 336 + DWC_PCIE_LANE_EVENT_MAX_PERIOD; 337 + local64_add(now, &event->count); 338 + return; 339 + } 362 340 363 341 do { 364 342 prev = local64_read(&hwc->prev_count); 365 - 366 - if (type == DWC_PCIE_LANE_EVENT) 367 - now = dwc_pcie_pmu_read_lane_event_counter(event); 368 - else if (type == DWC_PCIE_TIME_BASE_EVENT) 369 - now = dwc_pcie_pmu_read_time_based_counter(event); 343 + now = dwc_pcie_pmu_read_time_based_counter(event); 370 344 371 345 } while (local64_cmpxchg(&hwc->prev_count, prev, now) != prev); 372 346 373 347 delta = (now - prev) & DWC_PCIE_MAX_PERIOD; 374 - /* 32-bit counter for Lane 
Event Counting */ 375 - if (type == DWC_PCIE_LANE_EVENT) 376 - delta &= DWC_PCIE_LANE_EVENT_MAX_PERIOD; 377 - 378 348 local64_add(delta, &event->count); 349 + } 350 + 351 + static int dwc_pcie_pmu_validate_add_lane_event(struct perf_event *event, 352 + unsigned long val_lane_events[]) 353 + { 354 + int event_id, event_nr, group; 355 + 356 + event_id = DWC_PCIE_EVENT_ID(event); 357 + event_nr = FIELD_GET(DWC_PCIE_CNT_EVENT_SEL_EVID, event_id); 358 + group = FIELD_GET(DWC_PCIE_CNT_EVENT_SEL_GROUP, event_id); 359 + 360 + if (group != DWC_PCIE_LANE_GROUP_6 && group != DWC_PCIE_LANE_GROUP_7) 361 + return -EINVAL; 362 + 363 + group -= DWC_PCIE_LANE_GROUP_6; 364 + 365 + if (test_and_set_bit(group * DWC_PCIE_LANE_MAX_EVENTS_PER_GROUP + event_nr, 366 + val_lane_events)) 367 + return -EINVAL; 368 + 369 + return 0; 370 + } 371 + 372 + static int dwc_pcie_pmu_validate_group(struct perf_event *event) 373 + { 374 + struct perf_event *sibling, *leader = event->group_leader; 375 + DECLARE_BITMAP(val_lane_events, 2 * DWC_PCIE_LANE_MAX_EVENTS_PER_GROUP); 376 + bool time_event = false; 377 + int type; 378 + 379 + type = DWC_PCIE_EVENT_TYPE(leader); 380 + if (type == DWC_PCIE_TIME_BASE_EVENT) 381 + time_event = true; 382 + else 383 + if (dwc_pcie_pmu_validate_add_lane_event(leader, val_lane_events)) 384 + return -ENOSPC; 385 + 386 + for_each_sibling_event(sibling, leader) { 387 + type = DWC_PCIE_EVENT_TYPE(sibling); 388 + if (type == DWC_PCIE_TIME_BASE_EVENT) { 389 + if (time_event) 390 + return -ENOSPC; 391 + 392 + time_event = true; 393 + continue; 394 + } 395 + 396 + if (dwc_pcie_pmu_validate_add_lane_event(sibling, val_lane_events)) 397 + return -ENOSPC; 398 + } 399 + 400 + return 0; 379 401 } 380 402 381 403 static int dwc_pcie_pmu_event_init(struct perf_event *event) ··· 447 367 if (event->cpu < 0 || event->attach_state & PERF_ATTACH_TASK) 448 368 return -EINVAL; 449 369 450 - if (event->group_leader != event && 451 - !is_software_event(event->group_leader)) 452 - return 
-EINVAL; 453 - 454 370 for_each_sibling_event(sibling, event->group_leader) { 455 371 if (sibling->pmu != event->pmu && !is_software_event(sibling)) 456 372 return -EINVAL; ··· 460 384 if (lane < 0 || lane >= pcie_pmu->nr_lanes) 461 385 return -EINVAL; 462 386 } 387 + 388 + if (dwc_pcie_pmu_validate_group(event)) 389 + return -ENOSPC; 463 390 464 391 event->cpu = pcie_pmu->on_cpu; 465 392 ··· 479 400 local64_set(&hwc->prev_count, 0); 480 401 481 402 if (type == DWC_PCIE_LANE_EVENT) 482 - dwc_pcie_pmu_lane_event_enable(pcie_pmu, true); 403 + dwc_pcie_pmu_lane_event_enable(pcie_pmu, event, true); 483 404 else if (type == DWC_PCIE_TIME_BASE_EVENT) 484 405 dwc_pcie_pmu_time_based_event_enable(pcie_pmu, true); 485 406 } ··· 493 414 if (event->hw.state & PERF_HES_STOPPED) 494 415 return; 495 416 417 + dwc_pcie_pmu_event_update(event); 418 + 496 419 if (type == DWC_PCIE_LANE_EVENT) 497 - dwc_pcie_pmu_lane_event_enable(pcie_pmu, false); 420 + dwc_pcie_pmu_lane_event_enable(pcie_pmu, event, false); 498 421 else if (type == DWC_PCIE_TIME_BASE_EVENT) 499 422 dwc_pcie_pmu_time_based_event_enable(pcie_pmu, false); 500 423 501 - dwc_pcie_pmu_event_update(event); 502 424 hwc->state |= PERF_HES_STOPPED | PERF_HES_UPTODATE; 503 425 } 504 426 ··· 514 434 u16 ras_des_offset = pcie_pmu->ras_des_offset; 515 435 u32 ctrl; 516 436 517 - /* one counter for each type and it is in use */ 518 - if (pcie_pmu->event[type]) 519 - return -ENOSPC; 520 - 521 - pcie_pmu->event[type] = event; 522 437 hwc->state = PERF_HES_STOPPED | PERF_HES_UPTODATE; 523 438 524 439 if (type == DWC_PCIE_LANE_EVENT) { 440 + int event_nr = FIELD_GET(DWC_PCIE_CNT_EVENT_SEL_EVID, event_id); 441 + int group = FIELD_GET(DWC_PCIE_CNT_EVENT_SEL_GROUP, event_id) - 442 + DWC_PCIE_LANE_GROUP_6; 443 + 444 + if (test_and_set_bit(group * DWC_PCIE_LANE_MAX_EVENTS_PER_GROUP + event_nr, 445 + pcie_pmu->lane_events)) 446 + return -ENOSPC; 447 + 525 448 /* EVENT_COUNTER_DATA_REG needs clear manually */ 526 449 ctrl = 
FIELD_PREP(DWC_PCIE_CNT_EVENT_SEL, event_id) | 527 450 FIELD_PREP(DWC_PCIE_CNT_LANE_SEL, lane) | ··· 533 450 pci_write_config_dword(pdev, ras_des_offset + DWC_PCIE_EVENT_CNT_CTL, 534 451 ctrl); 535 452 } else if (type == DWC_PCIE_TIME_BASE_EVENT) { 453 + if (pcie_pmu->time_based_event) 454 + return -ENOSPC; 455 + 456 + pcie_pmu->time_based_event = event; 457 + 536 458 /* 537 459 * TIME_BASED_ANAL_DATA_REG is a 64 bit register, we can safely 538 460 * use it with any manually controlled duration. And it is ··· 566 478 567 479 dwc_pcie_pmu_event_stop(event, flags | PERF_EF_UPDATE); 568 480 perf_event_update_userpage(event); 569 - pcie_pmu->event[type] = NULL; 481 + 482 + if (type == DWC_PCIE_TIME_BASE_EVENT) { 483 + pcie_pmu->time_based_event = NULL; 484 + } else { 485 + int event_id = DWC_PCIE_EVENT_ID(event); 486 + int event_nr = FIELD_GET(DWC_PCIE_CNT_EVENT_SEL_EVID, event_id); 487 + int group = FIELD_GET(DWC_PCIE_CNT_EVENT_SEL_GROUP, event_id) - 488 + DWC_PCIE_LANE_GROUP_6; 489 + 490 + clear_bit(group * DWC_PCIE_LANE_MAX_EVENTS_PER_GROUP + event_nr, 491 + pcie_pmu->lane_events); 492 + } 570 493 } 571 494 572 495 static void dwc_pcie_pmu_remove_cpuhp_instance(void *hotplug_node)
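Note on the lane-event bookkeeping above: the single per-type event slot is replaced by a bitmap with one bit per (group, event number) pair, covering groups #6 and #7 with 256 events each, so the bit index is `(group - 6) * 256 + event_nr`. The sketch below reproduces that indexing in user-space C. It is a simplified illustration: `test_and_set()` is a non-atomic stand-in for the kernel's `test_and_set_bit()`, and `claim_lane_event()` together with the `CLAIM_*` return codes are hypothetical names (the driver returns `-EINVAL` or `-ENOSPC` instead).

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define LANE_GROUP_6		6
#define LANE_GROUP_7		7
#define EVENTS_PER_GROUP	256
#define NBITS			(2 * EVENTS_PER_GROUP)
#define BITS_PER_WORD		(sizeof(unsigned long) * CHAR_BIT)
#define NWORDS			((NBITS + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* Hypothetical return codes used only in this sketch. */
enum { CLAIM_OK = 0, CLAIM_BAD_GROUP = -1, CLAIM_BUSY = -2 };

/* Non-atomic user-space stand-in for the kernel's test_and_set_bit(). */
static bool test_and_set(unsigned long *map, unsigned int bit)
{
	unsigned long mask = 1UL << (bit % BITS_PER_WORD);
	unsigned long *word = &map[bit / BITS_PER_WORD];
	bool was_set = (*word & mask) != 0;

	*word |= mask;
	return was_set;
}

/* event_id layout as in the diff: bits 11:8 = group, bits 7:0 = event number. */
static int claim_lane_event(unsigned long *map, unsigned int event_id)
{
	unsigned int group = (event_id >> 8) & 0xf;
	unsigned int nr = event_id & 0xff;

	assert(map);

	if (group != LANE_GROUP_6 && group != LANE_GROUP_7)
		return CLAIM_BAD_GROUP;

	/* A bit that is already set means the same event is claimed twice. */
	if (test_and_set(map, (group - LANE_GROUP_6) * EVENTS_PER_GROUP + nr))
		return CLAIM_BUSY;

	return CLAIM_OK;
}
```

The driver applies this check in two places: against a temporary bitmap in dwc_pcie_pmu_validate_group() at event_init time, and against the per-PMU lane_events bitmap when the event is added.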
+6
drivers/perf/fsl_imx9_ddr_perf.c
@@ -104,6 +104,11 @@
 	.filter_ver = DDR_PERF_AXI_FILTER_V1
 };
 
+static const struct imx_ddr_devtype_data imx94_devtype_data = {
+	.identifier = "imx94",
+	.filter_ver = DDR_PERF_AXI_FILTER_V2
+};
+
 static const struct imx_ddr_devtype_data imx95_devtype_data = {
 	.identifier = "imx95",
 	.filter_ver = DDR_PERF_AXI_FILTER_V2
@@ -122,6 +127,7 @@
 static const struct of_device_id imx_ddr_pmu_dt_ids[] = {
 	{ .compatible = "fsl,imx91-ddr-pmu", .data = &imx91_devtype_data },
 	{ .compatible = "fsl,imx93-ddr-pmu", .data = &imx93_devtype_data },
+	{ .compatible = "fsl,imx94-ddr-pmu", .data = &imx94_devtype_data },
 	{ .compatible = "fsl,imx95-ddr-pmu", .data = &imx95_devtype_data },
 	{ /* sentinel */ }
 };
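Note on the i.MX94 addition above: supporting a new SoC here only takes a per-device data struct and one row in the sentinel-terminated match table. The sketch below shows that lookup pattern in user-space C, assuming a plain string comparison. It is not the kernel's device-tree matching code: `struct match_entry`, `lookup_devtype()` and the `FILTER_V1`/`FILTER_V2` values are hypothetical, and only the three entries whose filter version is visible in the diff are included.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical values; the real DDR_PERF_AXI_FILTER_* constants are not shown. */
enum filter_ver { FILTER_V1 = 1, FILTER_V2 = 2 };

struct devtype_data {
	const char *identifier;
	enum filter_ver filter_ver;
};

struct match_entry {
	const char *compatible;		/* NULL marks the sentinel */
	const struct devtype_data *data;
};

static const struct devtype_data imx93_data = { "imx93", FILTER_V1 };
static const struct devtype_data imx94_data = { "imx94", FILTER_V2 };
static const struct devtype_data imx95_data = { "imx95", FILTER_V2 };

static const struct match_entry match_table[] = {
	{ "fsl,imx93-ddr-pmu", &imx93_data },
	{ "fsl,imx94-ddr-pmu", &imx94_data },
	{ "fsl,imx95-ddr-pmu", &imx95_data },
	{ NULL, NULL }	/* sentinel */
};

/* Walk the table until the sentinel; return NULL when nothing matches. */
static const struct devtype_data *lookup_devtype(const char *compatible)
{
	const struct match_entry *e;

	assert(compatible != NULL);

	for (e = match_table; e->compatible; e++)
		if (strcmp(e->compatible, compatible) == 0)
			return e->data;

	return NULL;
}
```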
+613
drivers/perf/fujitsu_uncore_pmu.c
··· 1 + // SPDX-License-Identifier: GPL-2.0-only 2 + /* 3 + * Driver for the Uncore PMUs in Fujitsu chips. 4 + * 5 + * See Documentation/admin-guide/perf/fujitsu_uncore_pmu.rst for more details. 6 + * 7 + * Copyright (c) 2025 Fujitsu. All rights reserved. 8 + */ 9 + 10 + #include <linux/acpi.h> 11 + #include <linux/bitfield.h> 12 + #include <linux/bitops.h> 13 + #include <linux/interrupt.h> 14 + #include <linux/io.h> 15 + #include <linux/list.h> 16 + #include <linux/mod_devicetable.h> 17 + #include <linux/module.h> 18 + #include <linux/perf_event.h> 19 + #include <linux/platform_device.h> 20 + 21 + /* Number of counters on each PMU */ 22 + #define MAC_NUM_COUNTERS 8 23 + #define PCI_NUM_COUNTERS 8 24 + /* Mask for the event type field within perf_event_attr.config and EVTYPE reg */ 25 + #define UNCORE_EVTYPE_MASK 0xFF 26 + 27 + /* Perfmon registers */ 28 + #define PM_EVCNTR(__cntr) (0x000 + (__cntr) * 8) 29 + #define PM_CNTCTL(__cntr) (0x100 + (__cntr) * 8) 30 + #define PM_CNTCTL_RESET 0 31 + #define PM_EVTYPE(__cntr) (0x200 + (__cntr) * 8) 32 + #define PM_EVTYPE_EVSEL(__val) FIELD_GET(UNCORE_EVTYPE_MASK, __val) 33 + #define PM_CR 0x400 34 + #define PM_CR_RESET BIT(1) 35 + #define PM_CR_ENABLE BIT(0) 36 + #define PM_CNTENSET 0x410 37 + #define PM_CNTENSET_IDX(__cntr) BIT(__cntr) 38 + #define PM_CNTENCLR 0x418 39 + #define PM_CNTENCLR_IDX(__cntr) BIT(__cntr) 40 + #define PM_CNTENCLR_RESET 0xFF 41 + #define PM_INTENSET 0x420 42 + #define PM_INTENSET_IDX(__cntr) BIT(__cntr) 43 + #define PM_INTENCLR 0x428 44 + #define PM_INTENCLR_IDX(__cntr) BIT(__cntr) 45 + #define PM_INTENCLR_RESET 0xFF 46 + #define PM_OVSR 0x440 47 + #define PM_OVSR_OVSRCLR_RESET 0xFF 48 + 49 + enum fujitsu_uncore_pmu { 50 + FUJITSU_UNCORE_PMU_MAC = 1, 51 + FUJITSU_UNCORE_PMU_PCI = 2, 52 + }; 53 + 54 + struct uncore_pmu { 55 + int num_counters; 56 + struct pmu pmu; 57 + struct hlist_node node; 58 + void __iomem *regs; 59 + struct perf_event **events; 60 + unsigned long *used_mask; 61 + int cpu; 62 + 
int irq; 63 + struct device *dev; 64 + }; 65 + 66 + #define to_uncore_pmu(p) (container_of(p, struct uncore_pmu, pmu)) 67 + 68 + static int uncore_pmu_cpuhp_state; 69 + 70 + static void fujitsu_uncore_counter_start(struct perf_event *event) 71 + { 72 + struct uncore_pmu *uncorepmu = to_uncore_pmu(event->pmu); 73 + int idx = event->hw.idx; 74 + 75 + /* Initialize the hardware counter and reset prev_count*/ 76 + local64_set(&event->hw.prev_count, 0); 77 + writeq_relaxed(0, uncorepmu->regs + PM_EVCNTR(idx)); 78 + 79 + /* Set the event type */ 80 + writeq_relaxed(PM_EVTYPE_EVSEL(event->attr.config), uncorepmu->regs + PM_EVTYPE(idx)); 81 + 82 + /* Enable interrupt generation by this counter */ 83 + writeq_relaxed(PM_INTENSET_IDX(idx), uncorepmu->regs + PM_INTENSET); 84 + 85 + /* Finally, enable the counter */ 86 + writeq_relaxed(PM_CNTCTL_RESET, uncorepmu->regs + PM_CNTCTL(idx)); 87 + writeq_relaxed(PM_CNTENSET_IDX(idx), uncorepmu->regs + PM_CNTENSET); 88 + } 89 + 90 + static void fujitsu_uncore_counter_stop(struct perf_event *event) 91 + { 92 + struct uncore_pmu *uncorepmu = to_uncore_pmu(event->pmu); 93 + int idx = event->hw.idx; 94 + 95 + /* Disable the counter */ 96 + writeq_relaxed(PM_CNTENCLR_IDX(idx), uncorepmu->regs + PM_CNTENCLR); 97 + 98 + /* Disable interrupt generation by this counter */ 99 + writeq_relaxed(PM_INTENCLR_IDX(idx), uncorepmu->regs + PM_INTENCLR); 100 + } 101 + 102 + static void fujitsu_uncore_counter_update(struct perf_event *event) 103 + { 104 + struct uncore_pmu *uncorepmu = to_uncore_pmu(event->pmu); 105 + int idx = event->hw.idx; 106 + u64 prev, new; 107 + 108 + do { 109 + prev = local64_read(&event->hw.prev_count); 110 + new = readq_relaxed(uncorepmu->regs + PM_EVCNTR(idx)); 111 + } while (local64_cmpxchg(&event->hw.prev_count, prev, new) != prev); 112 + 113 + local64_add(new - prev, &event->count); 114 + } 115 + 116 + static inline void fujitsu_uncore_init(struct uncore_pmu *uncorepmu) 117 + { 118 + int i; 119 + 120 + 
writeq_relaxed(PM_CR_RESET, uncorepmu->regs + PM_CR); 121 + 122 + writeq_relaxed(PM_CNTENCLR_RESET, uncorepmu->regs + PM_CNTENCLR); 123 + writeq_relaxed(PM_INTENCLR_RESET, uncorepmu->regs + PM_INTENCLR); 124 + writeq_relaxed(PM_OVSR_OVSRCLR_RESET, uncorepmu->regs + PM_OVSR); 125 + 126 + for (i = 0; i < uncorepmu->num_counters; ++i) { 127 + writeq_relaxed(PM_CNTCTL_RESET, uncorepmu->regs + PM_CNTCTL(i)); 128 + writeq_relaxed(PM_EVTYPE_EVSEL(0), uncorepmu->regs + PM_EVTYPE(i)); 129 + } 130 + writeq_relaxed(PM_CR_ENABLE, uncorepmu->regs + PM_CR); 131 + } 132 + 133 + static irqreturn_t fujitsu_uncore_handle_irq(int irq_num, void *data) 134 + { 135 + struct uncore_pmu *uncorepmu = data; 136 + /* Read the overflow status register */ 137 + long status = readq_relaxed(uncorepmu->regs + PM_OVSR); 138 + int idx; 139 + 140 + if (status == 0) 141 + return IRQ_NONE; 142 + 143 + /* Clear the bits we read on the overflow status register */ 144 + writeq_relaxed(status, uncorepmu->regs + PM_OVSR); 145 + 146 + for_each_set_bit(idx, &status, uncorepmu->num_counters) { 147 + struct perf_event *event; 148 + 149 + event = uncorepmu->events[idx]; 150 + if (!event) 151 + continue; 152 + 153 + fujitsu_uncore_counter_update(event); 154 + } 155 + 156 + return IRQ_HANDLED; 157 + } 158 + 159 + static void fujitsu_uncore_pmu_enable(struct pmu *pmu) 160 + { 161 + writeq_relaxed(PM_CR_ENABLE, to_uncore_pmu(pmu)->regs + PM_CR); 162 + } 163 + 164 + static void fujitsu_uncore_pmu_disable(struct pmu *pmu) 165 + { 166 + writeq_relaxed(0, to_uncore_pmu(pmu)->regs + PM_CR); 167 + } 168 + 169 + static bool fujitsu_uncore_validate_event_group(struct perf_event *event) 170 + { 171 + struct uncore_pmu *uncorepmu = to_uncore_pmu(event->pmu); 172 + struct perf_event *leader = event->group_leader; 173 + struct perf_event *sibling; 174 + int counters = 1; 175 + 176 + if (leader == event) 177 + return true; 178 + 179 + if (leader->pmu == event->pmu) 180 + counters++; 181 + 182 + for_each_sibling_event(sibling, 
leader) { 183 + if (sibling->pmu == event->pmu) 184 + counters++; 185 + } 186 + 187 + /* 188 + * If the group requires more counters than the HW has, it 189 + * cannot ever be scheduled. 190 + */ 191 + return counters <= uncorepmu->num_counters; 192 + } 193 + 194 + static int fujitsu_uncore_event_init(struct perf_event *event) 195 + { 196 + struct uncore_pmu *uncorepmu = to_uncore_pmu(event->pmu); 197 + struct hw_perf_event *hwc = &event->hw; 198 + 199 + /* Is the event for this PMU? */ 200 + if (event->attr.type != event->pmu->type) 201 + return -ENOENT; 202 + 203 + /* 204 + * Sampling not supported since these events are not 205 + * core-attributable. 206 + */ 207 + if (is_sampling_event(event)) 208 + return -EINVAL; 209 + 210 + /* 211 + * Task mode not available, we run the counters as socket counters, 212 + * not attributable to any CPU and therefore cannot attribute per-task. 213 + */ 214 + if (event->cpu < 0) 215 + return -EINVAL; 216 + 217 + /* Validate the group */ 218 + if (!fujitsu_uncore_validate_event_group(event)) 219 + return -EINVAL; 220 + 221 + hwc->idx = -1; 222 + 223 + event->cpu = uncorepmu->cpu; 224 + 225 + return 0; 226 + } 227 + 228 + static void fujitsu_uncore_event_start(struct perf_event *event, int flags) 229 + { 230 + struct hw_perf_event *hwc = &event->hw; 231 + 232 + hwc->state = 0; 233 + fujitsu_uncore_counter_start(event); 234 + } 235 + 236 + static void fujitsu_uncore_event_stop(struct perf_event *event, int flags) 237 + { 238 + struct hw_perf_event *hwc = &event->hw; 239 + 240 + if (hwc->state & PERF_HES_STOPPED) 241 + return; 242 + 243 + fujitsu_uncore_counter_stop(event); 244 + if (flags & PERF_EF_UPDATE) 245 + fujitsu_uncore_counter_update(event); 246 + hwc->state |= PERF_HES_STOPPED | PERF_HES_UPTODATE; 247 + } 248 + 249 + static int fujitsu_uncore_event_add(struct perf_event *event, int flags) 250 + { 251 + struct uncore_pmu *uncorepmu = to_uncore_pmu(event->pmu); 252 + struct hw_perf_event *hwc = &event->hw; 253 + int idx; 254 
+ 255 + /* Try to allocate a counter. */ 256 + idx = bitmap_find_free_region(uncorepmu->used_mask, uncorepmu->num_counters, 0); 257 + if (idx < 0) 258 + /* The counters are all in use. */ 259 + return -EAGAIN; 260 + 261 + hwc->idx = idx; 262 + hwc->state = PERF_HES_STOPPED | PERF_HES_UPTODATE; 263 + uncorepmu->events[idx] = event; 264 + 265 + if (flags & PERF_EF_START) 266 + fujitsu_uncore_event_start(event, 0); 267 + 268 + /* Propagate changes to the userspace mapping. */ 269 + perf_event_update_userpage(event); 270 + 271 + return 0; 272 + } 273 + 274 + static void fujitsu_uncore_event_del(struct perf_event *event, int flags) 275 + { 276 + struct uncore_pmu *uncorepmu = to_uncore_pmu(event->pmu); 277 + struct hw_perf_event *hwc = &event->hw; 278 + 279 + /* Stop and clean up */ 280 + fujitsu_uncore_event_stop(event, flags | PERF_EF_UPDATE); 281 + uncorepmu->events[hwc->idx] = NULL; 282 + bitmap_release_region(uncorepmu->used_mask, hwc->idx, 0); 283 + 284 + /* Propagate changes to the userspace mapping. 
*/ 285 + perf_event_update_userpage(event); 286 + } 287 + 288 + static void fujitsu_uncore_event_read(struct perf_event *event) 289 + { 290 + fujitsu_uncore_counter_update(event); 291 + } 292 + 293 + #define UNCORE_PMU_FORMAT_ATTR(_name, _config) \ 294 + (&((struct dev_ext_attribute[]) { \ 295 + { .attr = __ATTR(_name, 0444, device_show_string, NULL), \ 296 + .var = (void *)_config, } \ 297 + })[0].attr.attr) 298 + 299 + static struct attribute *fujitsu_uncore_pmu_formats[] = { 300 + UNCORE_PMU_FORMAT_ATTR(event, "config:0-7"), 301 + NULL 302 + }; 303 + 304 + static const struct attribute_group fujitsu_uncore_pmu_format_group = { 305 + .name = "format", 306 + .attrs = fujitsu_uncore_pmu_formats, 307 + }; 308 + 309 + static ssize_t fujitsu_uncore_pmu_event_show(struct device *dev, 310 + struct device_attribute *attr, char *page) 311 + { 312 + struct perf_pmu_events_attr *pmu_attr; 313 + 314 + pmu_attr = container_of(attr, struct perf_pmu_events_attr, attr); 315 + return sysfs_emit(page, "event=0x%02llx\n", pmu_attr->id); 316 + } 317 + 318 + #define MAC_EVENT_ATTR(_name, _id) \ 319 + PMU_EVENT_ATTR_ID(_name, fujitsu_uncore_pmu_event_show, _id) 320 + 321 + static struct attribute *fujitsu_uncore_mac_pmu_events[] = { 322 + MAC_EVENT_ATTR(cycles, 0x00), 323 + MAC_EVENT_ATTR(read-count, 0x10), 324 + MAC_EVENT_ATTR(read-count-request, 0x11), 325 + MAC_EVENT_ATTR(read-count-return, 0x12), 326 + MAC_EVENT_ATTR(read-count-request-pftgt, 0x13), 327 + MAC_EVENT_ATTR(read-count-request-normal, 0x14), 328 + MAC_EVENT_ATTR(read-count-return-pftgt-hit, 0x15), 329 + MAC_EVENT_ATTR(read-count-return-pftgt-miss, 0x16), 330 + MAC_EVENT_ATTR(read-wait, 0x17), 331 + MAC_EVENT_ATTR(write-count, 0x20), 332 + MAC_EVENT_ATTR(write-count-write, 0x21), 333 + MAC_EVENT_ATTR(write-count-pwrite, 0x22), 334 + MAC_EVENT_ATTR(memory-read-count, 0x40), 335 + MAC_EVENT_ATTR(memory-write-count, 0x50), 336 + MAC_EVENT_ATTR(memory-pwrite-count, 0x60), 337 + MAC_EVENT_ATTR(ea-mac, 0x80), 338 + 
MAC_EVENT_ATTR(ea-memory, 0x90),
+	MAC_EVENT_ATTR(ea-memory-mac-write, 0x92),
+	MAC_EVENT_ATTR(ea-ha, 0xa0),
+	NULL
+};
+
+#define PCI_EVENT_ATTR(_name, _id) \
+	PMU_EVENT_ATTR_ID(_name, fujitsu_uncore_pmu_event_show, _id)
+
+static struct attribute *fujitsu_uncore_pci_pmu_events[] = {
+	PCI_EVENT_ATTR(pci-port0-cycles, 0x00),
+	PCI_EVENT_ATTR(pci-port0-read-count, 0x10),
+	PCI_EVENT_ATTR(pci-port0-read-count-bus, 0x14),
+	PCI_EVENT_ATTR(pci-port0-write-count, 0x20),
+	PCI_EVENT_ATTR(pci-port0-write-count-bus, 0x24),
+	PCI_EVENT_ATTR(pci-port1-cycles, 0x40),
+	PCI_EVENT_ATTR(pci-port1-read-count, 0x50),
+	PCI_EVENT_ATTR(pci-port1-read-count-bus, 0x54),
+	PCI_EVENT_ATTR(pci-port1-write-count, 0x60),
+	PCI_EVENT_ATTR(pci-port1-write-count-bus, 0x64),
+	PCI_EVENT_ATTR(ea-pci, 0x80),
+	NULL
+};
+
+static const struct attribute_group fujitsu_uncore_mac_pmu_events_group = {
+	.name = "events",
+	.attrs = fujitsu_uncore_mac_pmu_events,
+};
+
+static const struct attribute_group fujitsu_uncore_pci_pmu_events_group = {
+	.name = "events",
+	.attrs = fujitsu_uncore_pci_pmu_events,
+};
+
+static ssize_t cpumask_show(struct device *dev,
+			    struct device_attribute *attr, char *buf)
+{
+	struct uncore_pmu *uncorepmu = to_uncore_pmu(dev_get_drvdata(dev));
+
+	return cpumap_print_to_pagebuf(true, buf, cpumask_of(uncorepmu->cpu));
+}
+static DEVICE_ATTR_RO(cpumask);
+
+static struct attribute *fujitsu_uncore_pmu_cpumask_attrs[] = {
+	&dev_attr_cpumask.attr,
+	NULL
+};
+
+static const struct attribute_group fujitsu_uncore_pmu_cpumask_attr_group = {
+	.attrs = fujitsu_uncore_pmu_cpumask_attrs,
+};
+
+static const struct attribute_group *fujitsu_uncore_mac_pmu_attr_grps[] = {
+	&fujitsu_uncore_pmu_format_group,
+	&fujitsu_uncore_mac_pmu_events_group,
+	&fujitsu_uncore_pmu_cpumask_attr_group,
+	NULL
+};
+
+static const struct attribute_group *fujitsu_uncore_pci_pmu_attr_grps[] = {
+	&fujitsu_uncore_pmu_format_group,
+	&fujitsu_uncore_pci_pmu_events_group,
+	&fujitsu_uncore_pmu_cpumask_attr_group,
+	NULL
+};
+
+static void fujitsu_uncore_pmu_migrate(struct uncore_pmu *uncorepmu, unsigned int cpu)
+{
+	perf_pmu_migrate_context(&uncorepmu->pmu, uncorepmu->cpu, cpu);
+	irq_set_affinity(uncorepmu->irq, cpumask_of(cpu));
+	uncorepmu->cpu = cpu;
+}
+
+static int fujitsu_uncore_pmu_online_cpu(unsigned int cpu, struct hlist_node *cpuhp_node)
+{
+	struct uncore_pmu *uncorepmu;
+	int node;
+
+	uncorepmu = hlist_entry_safe(cpuhp_node, struct uncore_pmu, node);
+	node = dev_to_node(uncorepmu->dev);
+	if (cpu_to_node(uncorepmu->cpu) != node && cpu_to_node(cpu) == node)
+		fujitsu_uncore_pmu_migrate(uncorepmu, cpu);
+
+	return 0;
+}
+
+static int fujitsu_uncore_pmu_offline_cpu(unsigned int cpu, struct hlist_node *cpuhp_node)
+{
+	struct uncore_pmu *uncorepmu;
+	unsigned int target;
+	int node;
+
+	uncorepmu = hlist_entry_safe(cpuhp_node, struct uncore_pmu, node);
+	if (cpu != uncorepmu->cpu)
+		return 0;
+
+	node = dev_to_node(uncorepmu->dev);
+	target = cpumask_any_and_but(cpumask_of_node(node), cpu_online_mask, cpu);
+	if (target >= nr_cpu_ids)
+		target = cpumask_any_but(cpu_online_mask, cpu);
+
+	if (target < nr_cpu_ids)
+		fujitsu_uncore_pmu_migrate(uncorepmu, target);
+
+	return 0;
+}
+
+static int fujitsu_uncore_pmu_probe(struct platform_device *pdev)
+{
+	struct device *dev = &pdev->dev;
+	unsigned long device_type = (unsigned long)device_get_match_data(dev);
+	const struct attribute_group **attr_groups;
+	struct uncore_pmu *uncorepmu;
+	struct resource *memrc;
+	size_t alloc_size;
+	char *name;
+	int ret;
+	int irq;
+	u64 uid;
+
+	ret = acpi_dev_uid_to_integer(ACPI_COMPANION(dev), &uid);
+	if (ret)
+		return dev_err_probe(dev, ret, "unable to read ACPI uid\n");
+
+	uncorepmu = devm_kzalloc(dev, sizeof(*uncorepmu), GFP_KERNEL);
+	if (!uncorepmu)
+		return -ENOMEM;
+	uncorepmu->dev = dev;
+	uncorepmu->cpu = cpumask_local_spread(0, dev_to_node(dev));
+	platform_set_drvdata(pdev, uncorepmu);
+
+	switch (device_type) {
+	case FUJITSU_UNCORE_PMU_MAC:
+		uncorepmu->num_counters = MAC_NUM_COUNTERS;
+		attr_groups = fujitsu_uncore_mac_pmu_attr_grps;
+		name = devm_kasprintf(dev, GFP_KERNEL, "mac_iod%llu_mac%llu_ch%llu",
+				      (uid >> 8) & 0xF, (uid >> 4) & 0xF, uid & 0xF);
+		break;
+	case FUJITSU_UNCORE_PMU_PCI:
+		uncorepmu->num_counters = PCI_NUM_COUNTERS;
+		attr_groups = fujitsu_uncore_pci_pmu_attr_grps;
+		name = devm_kasprintf(dev, GFP_KERNEL, "pci_iod%llu_pci%llu",
+				      (uid >> 4) & 0xF, uid & 0xF);
+		break;
+	default:
+		return dev_err_probe(dev, -EINVAL, "illegal device type: %lu\n", device_type);
+	}
+	if (!name)
+		return -ENOMEM;
+
+	uncorepmu->pmu = (struct pmu) {
+		.parent = dev,
+		.task_ctx_nr = perf_invalid_context,
+
+		.attr_groups = attr_groups,
+
+		.pmu_enable = fujitsu_uncore_pmu_enable,
+		.pmu_disable = fujitsu_uncore_pmu_disable,
+		.event_init = fujitsu_uncore_event_init,
+		.add = fujitsu_uncore_event_add,
+		.del = fujitsu_uncore_event_del,
+		.start = fujitsu_uncore_event_start,
+		.stop = fujitsu_uncore_event_stop,
+		.read = fujitsu_uncore_event_read,
+
+		.capabilities = PERF_PMU_CAP_NO_EXCLUDE | PERF_PMU_CAP_NO_INTERRUPT,
+	};
+
+	alloc_size = sizeof(uncorepmu->events[0]) * uncorepmu->num_counters;
+	uncorepmu->events = devm_kzalloc(dev, alloc_size, GFP_KERNEL);
+	if (!uncorepmu->events)
+		return -ENOMEM;
+
+	alloc_size = sizeof(uncorepmu->used_mask[0]) * BITS_TO_LONGS(uncorepmu->num_counters);
+	uncorepmu->used_mask = devm_kzalloc(dev, alloc_size, GFP_KERNEL);
+	if (!uncorepmu->used_mask)
+		return -ENOMEM;
+
+	uncorepmu->regs = devm_platform_get_and_ioremap_resource(pdev, 0, &memrc);
+	if (IS_ERR(uncorepmu->regs))
+		return PTR_ERR(uncorepmu->regs);
+
+	fujitsu_uncore_init(uncorepmu);
+
+	irq = platform_get_irq(pdev, 0);
+	if (irq < 0)
+		return irq;
+
+	ret = devm_request_irq(dev, irq, fujitsu_uncore_handle_irq,
+			       IRQF_NOBALANCING | IRQF_NO_THREAD,
+			       name, uncorepmu);
+	if (ret)
+		return dev_err_probe(dev, ret, "Failed to request IRQ:%d\n", irq);
+
+	ret = irq_set_affinity(irq, cpumask_of(uncorepmu->cpu));
+	if (ret)
+		return dev_err_probe(dev, ret, "Failed to set irq affinity:%d\n", irq);
+
+	uncorepmu->irq = irq;
+
+	/* Add this instance to the list used by the offline callback */
+	ret = cpuhp_state_add_instance(uncore_pmu_cpuhp_state, &uncorepmu->node);
+	if (ret)
+		return dev_err_probe(dev, ret, "Error registering hotplug");
+
+	ret = perf_pmu_register(&uncorepmu->pmu, name, -1);
+	if (ret < 0) {
+		cpuhp_state_remove_instance_nocalls(uncore_pmu_cpuhp_state, &uncorepmu->node);
+		return dev_err_probe(dev, ret, "Failed to register %s PMU\n", name);
+	}
+
+	dev_dbg(dev, "Registered %s, type: %d\n", name, uncorepmu->pmu.type);
+
+	return 0;
+}
+
+static void fujitsu_uncore_pmu_remove(struct platform_device *pdev)
+{
+	struct uncore_pmu *uncorepmu = platform_get_drvdata(pdev);
+
+	writeq_relaxed(0, uncorepmu->regs + PM_CR);
+
+	perf_pmu_unregister(&uncorepmu->pmu);
+	cpuhp_state_remove_instance_nocalls(uncore_pmu_cpuhp_state, &uncorepmu->node);
+}
+
+static const struct acpi_device_id fujitsu_uncore_pmu_acpi_match[] = {
+	{ "FUJI200C", FUJITSU_UNCORE_PMU_MAC },
+	{ "FUJI200D", FUJITSU_UNCORE_PMU_PCI },
+	{ }
+};
+MODULE_DEVICE_TABLE(acpi, fujitsu_uncore_pmu_acpi_match);
+
+static struct platform_driver fujitsu_uncore_pmu_driver = {
+	.driver = {
+		.name = "fujitsu-uncore-pmu",
+		.acpi_match_table = fujitsu_uncore_pmu_acpi_match,
+		.suppress_bind_attrs = true,
+	},
+	.probe = fujitsu_uncore_pmu_probe,
+	.remove = fujitsu_uncore_pmu_remove,
+};
+
+static int __init fujitsu_uncore_pmu_init(void)
+{
+	int ret;
+
+	/* Install a hook to update the reader CPU in case it goes offline */
+	ret = cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN,
+				      "perf/fujitsu/uncore:online",
+				      fujitsu_uncore_pmu_online_cpu,
+				      fujitsu_uncore_pmu_offline_cpu);
+	if (ret < 0)
+		return ret;
+
+	uncore_pmu_cpuhp_state = ret;
+
+	ret = platform_driver_register(&fujitsu_uncore_pmu_driver);
+	if (ret)
+		cpuhp_remove_multi_state(uncore_pmu_cpuhp_state);
+
+	return ret;
+}
+
+static void __exit fujitsu_uncore_pmu_exit(void)
+{
+	platform_driver_unregister(&fujitsu_uncore_pmu_driver);
+	cpuhp_remove_multi_state(uncore_pmu_cpuhp_state);
+}
+
+module_init(fujitsu_uncore_pmu_init);
+module_exit(fujitsu_uncore_pmu_exit);
+
+MODULE_AUTHOR("Koichi Okuno <fj2767dz@fujitsu.com>");
+MODULE_DESCRIPTION("Fujitsu Uncore PMU driver");
+MODULE_LICENSE("GPL");
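The probe function above builds each PMU's name from 4-bit fields of the ACPI UID. The following is a minimal userspace sketch of that bit slicing, assuming only the shifts and masks visible in the two `devm_kasprintf()` calls. The helpers `format_mac_pmu_name()` and `format_pci_pmu_name()` are hypothetical names for illustration and are not part of the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: formats a MAC PMU name the way the MAC case of the
 * probe switch does. UID bits [11:8] are the IOD, [7:4] the MAC and
 * [3:0] the channel.
 */
static int format_mac_pmu_name(char *buf, size_t len, uint64_t uid)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "mac_iod%llu_mac%llu_ch%llu",
			(unsigned long long)((uid >> 8) & 0xF),
			(unsigned long long)((uid >> 4) & 0xF),
			(unsigned long long)(uid & 0xF));
}

/*
 * Hypothetical helper: formats a PCI PMU name the way the PCI case does.
 * UID bits [7:4] are the IOD and [3:0] the PCI instance.
 */
static int format_pci_pmu_name(char *buf, size_t len, uint64_t uid)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "pci_iod%llu_pci%llu",
			(unsigned long long)((uid >> 4) & 0xF),
			(unsigned long long)(uid & 0xF));
}
```

Under these assumptions, a MAC device with UID 0x123 would be registered as `mac_iod1_mac2_ch3`, and a PCI device with UID 0x45 as `pci_iod4_pci5`.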
+2 -1
drivers/perf/hisilicon/Makefile
···
 # SPDX-License-Identifier: GPL-2.0-only
 obj-$(CONFIG_HISI_PMU) += hisi_uncore_pmu.o hisi_uncore_l3c_pmu.o \
 			  hisi_uncore_hha_pmu.o hisi_uncore_ddrc_pmu.o hisi_uncore_sllc_pmu.o \
-			  hisi_uncore_pa_pmu.o hisi_uncore_cpa_pmu.o hisi_uncore_uc_pmu.o
+			  hisi_uncore_pa_pmu.o hisi_uncore_cpa_pmu.o hisi_uncore_uc_pmu.o \
+			  hisi_uncore_noc_pmu.o hisi_uncore_mn_pmu.o

 obj-$(CONFIG_HISI_PCIE_PMU) += hisi_pcie_pmu.o
 obj-$(CONFIG_HNS3_PMU) += hns3_pmu.o
+438 -90
drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c
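As a reading aid for the diff of this file: the patch keeps one flat counter index space, where indices 0 to 7 belong to the normal L3C region and each extension region gets the next block of eight, while the hardware only ever sees the index modulo 8. The sketch below restates that arithmetic in userspace C, assuming the `L3C_HW_IDX`, `L3C_CNTR_EXT_L` and `L3C_CNTR_EXT_H` definitions and the status merge in `hisi_l3c_pmu_get_int_status()` shown in the diff. The function names, including `l3c_region_of()`, are hypothetical and do not exist in the driver.

```c
#include <assert.h>
#include <stdint.h>

#define L3C_NR_COUNTERS 8

/* Same arithmetic as L3C_HW_IDX(): per-region hardware counter index. */
static int l3c_hw_idx(int cntr_idx)
{
	return cntr_idx % L3C_NR_COUNTERS;
}

/* Same arithmetic as L3C_CNTR_EXT_L()/L3C_CNTR_EXT_H(): [low, high) bit range. */
static int l3c_cntr_ext_lo(int ext)
{
	return (ext + 1) * L3C_NR_COUNTERS;
}

static int l3c_cntr_ext_hi(int ext)
{
	return (ext + 2) * L3C_NR_COUNTERS;
}

/* Hypothetical helper: 0 for the normal region, 1..n for extension regions. */
static int l3c_region_of(int cntr_idx)
{
	return cntr_idx / L3C_NR_COUNTERS;
}

/*
 * Merge per-region interrupt status words into one flat bitmap, following
 * the shifts used in hisi_l3c_pmu_get_int_status(). Each status word is
 * assumed to use only its low L3C_NR_COUNTERS bits.
 */
static uint32_t l3c_merge_int_status(uint32_t status,
				     const uint32_t *ext_status, int ext_num)
{
	uint32_t status_ext = 0;
	int i;

	assert(ext_num >= 0 && (ext_num == 0 || ext_status != 0));
	for (i = 0; i < ext_num; i++)
		status_ext |= ext_status[i] << (L3C_NR_COUNTERS * i);

	return status | (status_ext << L3C_NR_COUNTERS);
}
```

Note that `hisi_l3c_pmu_get_event_idx()` subtracts one from the `ext` format field before using the range macros, so `ext=0` (a normal event) searches bits 0 to 7 and `ext=1` searches bits 8 to 15.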
···

 /* L3C has 8-counters */
 #define L3C_NR_COUNTERS		0x8
+#define L3C_MAX_EXT		2

 #define L3C_PERF_CTRL_EN	0x10000
 #define L3C_TRACETAG_EN		BIT(31)
···
 #define L3C_V1_NR_EVENTS	0x59
 #define L3C_V2_NR_EVENTS	0xFF

-HISI_PMU_EVENT_ATTR_EXTRACTOR(tt_core, config1, 7, 0);
+HISI_PMU_EVENT_ATTR_EXTRACTOR(ext, config, 17, 16);
 HISI_PMU_EVENT_ATTR_EXTRACTOR(tt_req, config1, 10, 8);
 HISI_PMU_EVENT_ATTR_EXTRACTOR(datasrc_cfg, config1, 15, 11);
 HISI_PMU_EVENT_ATTR_EXTRACTOR(datasrc_skt, config1, 16, 16);
+HISI_PMU_EVENT_ATTR_EXTRACTOR(tt_core, config2, 15, 0);
+
+struct hisi_l3c_pmu {
+	struct hisi_pmu l3c_pmu;
+
+	/* MMIO and IRQ resources for extension events */
+	void __iomem *ext_base[L3C_MAX_EXT];
+	int ext_irq[L3C_MAX_EXT];
+	int ext_num;
+};
+
+#define to_hisi_l3c_pmu(_l3c_pmu) \
+	container_of(_l3c_pmu, struct hisi_l3c_pmu, l3c_pmu)
+
+/*
+ * The hardware counter idx used in counter enable/disable,
+ * interrupt enable/disable and status check, etc.
+ */
+#define L3C_HW_IDX(_cntr_idx)	((_cntr_idx) % L3C_NR_COUNTERS)
+
+/* Range of ext counters in used mask. */
+#define L3C_CNTR_EXT_L(_ext)	(((_ext) + 1) * L3C_NR_COUNTERS)
+#define L3C_CNTR_EXT_H(_ext)	(((_ext) + 2) * L3C_NR_COUNTERS)
+
+struct hisi_l3c_pmu_ext {
+	bool support_ext;
+};
+
+static bool support_ext(struct hisi_l3c_pmu *pmu)
+{
+	struct hisi_l3c_pmu_ext *l3c_pmu_ext = pmu->l3c_pmu.dev_info->private;
+
+	return l3c_pmu_ext->support_ext;
+}
+
+static int hisi_l3c_pmu_get_event_idx(struct perf_event *event)
+{
+	struct hisi_pmu *l3c_pmu = to_hisi_pmu(event->pmu);
+	struct hisi_l3c_pmu *hisi_l3c_pmu = to_hisi_l3c_pmu(l3c_pmu);
+	unsigned long *used_mask = l3c_pmu->pmu_events.used_mask;
+	int ext = hisi_get_ext(event);
+	int idx;
+
+	/*
+	 * For an L3C PMU that supports extension events, we can monitor
+	 * maximum 2 * num_counters to 3 * num_counters events, depending on
+	 * the number of ext regions supported by hardware. Thus use bit
+	 * [0, num_counters - 1] for normal events and bit
+	 * [ext * num_counters, (ext + 1) * num_counters - 1] for extension
+	 * events. The idx allocation will keep unchanged for normal events and
+	 * we can also use the idx to distinguish whether it's an extension
+	 * event or not.
+	 *
+	 * Since normal events and extension events locates on the different
+	 * address space, save the base address to the event->hw.event_base.
+	 */
+	if (ext && !support_ext(hisi_l3c_pmu))
+		return -EOPNOTSUPP;
+
+	if (ext)
+		event->hw.event_base = (unsigned long)hisi_l3c_pmu->ext_base[ext - 1];
+	else
+		event->hw.event_base = (unsigned long)l3c_pmu->base;
+
+	ext -= 1;
+	idx = find_next_zero_bit(used_mask, L3C_CNTR_EXT_H(ext), L3C_CNTR_EXT_L(ext));
+
+	if (idx >= L3C_CNTR_EXT_H(ext))
+		return -EAGAIN;
+
+	set_bit(idx, used_mask);
+
+	return idx;
+}
+
+static u32 hisi_l3c_pmu_event_readl(struct hw_perf_event *hwc, u32 reg)
+{
+	return readl((void __iomem *)hwc->event_base + reg);
+}
+
+static void hisi_l3c_pmu_event_writel(struct hw_perf_event *hwc, u32 reg, u32 val)
+{
+	writel(val, (void __iomem *)hwc->event_base + reg);
+}
+
+static u64 hisi_l3c_pmu_event_readq(struct hw_perf_event *hwc, u32 reg)
+{
+	return readq((void __iomem *)hwc->event_base + reg);
+}
+
+static void hisi_l3c_pmu_event_writeq(struct hw_perf_event *hwc, u32 reg, u64 val)
+{
+	writeq(val, (void __iomem *)hwc->event_base + reg);
+}

 static void hisi_l3c_pmu_config_req_tracetag(struct perf_event *event)
 {
-	struct hisi_pmu *l3c_pmu = to_hisi_pmu(event->pmu);
+	struct hw_perf_event *hwc = &event->hw;
 	u32 tt_req = hisi_get_tt_req(event);

 	if (tt_req) {
 		u32 val;

 		/* Set request-type for tracetag */
-		val = readl(l3c_pmu->base + L3C_TRACETAG_CTRL);
+		val = hisi_l3c_pmu_event_readl(hwc, L3C_TRACETAG_CTRL);
 		val |= tt_req << L3C_TRACETAG_REQ_SHIFT;
 		val |= L3C_TRACETAG_REQ_EN;
-		writel(val, l3c_pmu->base + L3C_TRACETAG_CTRL);
+		hisi_l3c_pmu_event_writel(hwc, L3C_TRACETAG_CTRL, val);

 		/* Enable request-tracetag statistics */
-		val = readl(l3c_pmu->base + L3C_PERF_CTRL);
+		val = hisi_l3c_pmu_event_readl(hwc, L3C_PERF_CTRL);
 		val |= L3C_TRACETAG_EN;
-		writel(val, l3c_pmu->base + L3C_PERF_CTRL);
+		hisi_l3c_pmu_event_writel(hwc, L3C_PERF_CTRL, val);
 	}
 }

 static void hisi_l3c_pmu_clear_req_tracetag(struct perf_event *event)
 {
-	struct hisi_pmu *l3c_pmu = to_hisi_pmu(event->pmu);
+	struct hw_perf_event *hwc = &event->hw;
 	u32 tt_req = hisi_get_tt_req(event);

 	if (tt_req) {
 		u32 val;

 		/* Clear request-type */
-		val = readl(l3c_pmu->base + L3C_TRACETAG_CTRL);
+		val = hisi_l3c_pmu_event_readl(hwc, L3C_TRACETAG_CTRL);
 		val &= ~(tt_req << L3C_TRACETAG_REQ_SHIFT);
 		val &= ~L3C_TRACETAG_REQ_EN;
-		writel(val, l3c_pmu->base + L3C_TRACETAG_CTRL);
+		hisi_l3c_pmu_event_writel(hwc, L3C_TRACETAG_CTRL, val);

 		/* Disable request-tracetag statistics */
-		val = readl(l3c_pmu->base + L3C_PERF_CTRL);
+		val = hisi_l3c_pmu_event_readl(hwc, L3C_PERF_CTRL);
 		val &= ~L3C_TRACETAG_EN;
-		writel(val, l3c_pmu->base + L3C_PERF_CTRL);
+		hisi_l3c_pmu_event_writel(hwc, L3C_PERF_CTRL, val);
 	}
 }

 static void hisi_l3c_pmu_write_ds(struct perf_event *event, u32 ds_cfg)
 {
-	struct hisi_pmu *l3c_pmu = to_hisi_pmu(event->pmu);
 	struct hw_perf_event *hwc = &event->hw;
 	u32 reg, reg_idx, shift, val;
-	int idx = hwc->idx;
+	int idx = L3C_HW_IDX(hwc->idx);

 	/*
 	 * Select the appropriate datasource register(L3C_DATSRC_TYPE0/1).
···
 	reg_idx = idx % 4;
 	shift = 8 * reg_idx;

-	val = readl(l3c_pmu->base + reg);
+	val = hisi_l3c_pmu_event_readl(hwc, reg);
 	val &= ~(L3C_DATSRC_MASK << shift);
 	val |= ds_cfg << shift;
-	writel(val, l3c_pmu->base + reg);
+	hisi_l3c_pmu_event_writel(hwc, reg, val);
 }

 static void hisi_l3c_pmu_config_ds(struct perf_event *event)
 {
-	struct hisi_pmu *l3c_pmu = to_hisi_pmu(event->pmu);
+	struct hw_perf_event *hwc = &event->hw;
 	u32 ds_cfg = hisi_get_datasrc_cfg(event);
 	u32 ds_skt = hisi_get_datasrc_skt(event);

···
 	if (ds_skt) {
 		u32 val;

-		val = readl(l3c_pmu->base + L3C_DATSRC_CTRL);
+		val = hisi_l3c_pmu_event_readl(hwc, L3C_DATSRC_CTRL);
 		val |= L3C_DATSRC_SKT_EN;
-		writel(val, l3c_pmu->base + L3C_DATSRC_CTRL);
+		hisi_l3c_pmu_event_writel(hwc, L3C_DATSRC_CTRL, val);
 	}
 }

 static void hisi_l3c_pmu_clear_ds(struct perf_event *event)
 {
-	struct hisi_pmu *l3c_pmu = to_hisi_pmu(event->pmu);
+	struct hw_perf_event *hwc = &event->hw;
 	u32 ds_cfg = hisi_get_datasrc_cfg(event);
 	u32 ds_skt = hisi_get_datasrc_skt(event);

···
 	if (ds_skt) {
 		u32 val;

-		val = readl(l3c_pmu->base + L3C_DATSRC_CTRL);
+		val = hisi_l3c_pmu_event_readl(hwc, L3C_DATSRC_CTRL);
 		val &= ~L3C_DATSRC_SKT_EN;
-		writel(val, l3c_pmu->base + L3C_DATSRC_CTRL);
+		hisi_l3c_pmu_event_writel(hwc, L3C_DATSRC_CTRL, val);
 	}
 }

 static void hisi_l3c_pmu_config_core_tracetag(struct perf_event *event)
 {
-	struct hisi_pmu *l3c_pmu = to_hisi_pmu(event->pmu);
+	struct hw_perf_event *hwc = &event->hw;
 	u32 core = hisi_get_tt_core(event);

 	if (core) {
 		u32 val;

 		/* Config and enable core information */
-		writel(core, l3c_pmu->base + L3C_CORE_CTRL);
-		val = readl(l3c_pmu->base + L3C_PERF_CTRL);
+		hisi_l3c_pmu_event_writel(hwc, L3C_CORE_CTRL, core);
+		val = hisi_l3c_pmu_event_readl(hwc, L3C_PERF_CTRL);
 		val |= L3C_CORE_EN;
-		writel(val, l3c_pmu->base + L3C_PERF_CTRL);
+		hisi_l3c_pmu_event_writel(hwc, L3C_PERF_CTRL, val);

 		/* Enable core-tracetag statistics */
-		val = readl(l3c_pmu->base + L3C_TRACETAG_CTRL);
+		val = hisi_l3c_pmu_event_readl(hwc, L3C_TRACETAG_CTRL);
 		val |= L3C_TRACETAG_CORE_EN;
-		writel(val, l3c_pmu->base + L3C_TRACETAG_CTRL);
+		hisi_l3c_pmu_event_writel(hwc, L3C_TRACETAG_CTRL, val);
 	}
 }

 static void hisi_l3c_pmu_clear_core_tracetag(struct perf_event *event)
 {
-	struct hisi_pmu *l3c_pmu = to_hisi_pmu(event->pmu);
+	struct hw_perf_event *hwc = &event->hw;
 	u32 core = hisi_get_tt_core(event);

 	if (core) {
 		u32 val;

 		/* Clear core information */
-		writel(L3C_COER_NONE, l3c_pmu->base + L3C_CORE_CTRL);
-		val = readl(l3c_pmu->base + L3C_PERF_CTRL);
+		hisi_l3c_pmu_event_writel(hwc, L3C_CORE_CTRL, L3C_COER_NONE);
+		val = hisi_l3c_pmu_event_readl(hwc, L3C_PERF_CTRL);
 		val &= ~L3C_CORE_EN;
-		writel(val, l3c_pmu->base + L3C_PERF_CTRL);
+		hisi_l3c_pmu_event_writel(hwc, L3C_PERF_CTRL, val);

 		/* Disable core-tracetag statistics */
-		val = readl(l3c_pmu->base + L3C_TRACETAG_CTRL);
+		val = hisi_l3c_pmu_event_readl(hwc, L3C_TRACETAG_CTRL);
 		val &= ~L3C_TRACETAG_CORE_EN;
-		writel(val, l3c_pmu->base + L3C_TRACETAG_CTRL);
+		hisi_l3c_pmu_event_writel(hwc, L3C_TRACETAG_CTRL, val);
 	}
+}
+
+static bool hisi_l3c_pmu_have_filter(struct perf_event *event)
+{
+	return hisi_get_tt_req(event) || hisi_get_tt_core(event) ||
+	       hisi_get_datasrc_cfg(event) || hisi_get_datasrc_skt(event);
 }

 static void hisi_l3c_pmu_enable_filter(struct perf_event *event)
 {
-	if (event->attr.config1 != 0x0) {
+	if (hisi_l3c_pmu_have_filter(event)) {
 		hisi_l3c_pmu_config_req_tracetag(event);
 		hisi_l3c_pmu_config_core_tracetag(event);
 		hisi_l3c_pmu_config_ds(event);
···

 static void hisi_l3c_pmu_disable_filter(struct perf_event *event)
 {
-	if (event->attr.config1 != 0x0) {
+	if (hisi_l3c_pmu_have_filter(event)) {
 		hisi_l3c_pmu_clear_ds(event);
 		hisi_l3c_pmu_clear_core_tracetag(event);
 		hisi_l3c_pmu_clear_req_tracetag(event);
 	}
+}
+
+static int hisi_l3c_pmu_check_filter(struct perf_event *event)
+{
+	struct hisi_pmu *l3c_pmu = to_hisi_pmu(event->pmu);
+	struct hisi_l3c_pmu *hisi_l3c_pmu = to_hisi_l3c_pmu(l3c_pmu);
+	int ext = hisi_get_ext(event);
+
+	if (ext < 0 || ext > hisi_l3c_pmu->ext_num)
+		return -EINVAL;
+
+	return 0;
 }

 /*
···
  */
 static u32 hisi_l3c_pmu_get_counter_offset(int cntr_idx)
 {
-	return (L3C_CNTR0_LOWER + (cntr_idx * 8));
+	return L3C_CNTR0_LOWER + L3C_HW_IDX(cntr_idx) * 8;
 }

 static u64 hisi_l3c_pmu_read_counter(struct hisi_pmu *l3c_pmu,
 				     struct hw_perf_event *hwc)
 {
-	return readq(l3c_pmu->base + hisi_l3c_pmu_get_counter_offset(hwc->idx));
+	return hisi_l3c_pmu_event_readq(hwc, hisi_l3c_pmu_get_counter_offset(hwc->idx));
 }

 static void hisi_l3c_pmu_write_counter(struct hisi_pmu *l3c_pmu,
 				       struct hw_perf_event *hwc, u64 val)
 {
-	writeq(val, l3c_pmu->base + hisi_l3c_pmu_get_counter_offset(hwc->idx));
+	hisi_l3c_pmu_event_writeq(hwc, hisi_l3c_pmu_get_counter_offset(hwc->idx), val);
 }

 static void hisi_l3c_pmu_write_evtype(struct hisi_pmu *l3c_pmu, int idx,
 				      u32 type)
 {
+	struct hw_perf_event *hwc = &l3c_pmu->pmu_events.hw_events[idx]->hw;
 	u32 reg, reg_idx, shift, val;
+
+	idx = L3C_HW_IDX(idx);

 	/*
 	 * Select the appropriate event select register(L3C_EVENT_TYPE0/1).
···
 	shift = 8 * reg_idx;

 	/* Write event code to L3C_EVENT_TYPEx Register */
-	val = readl(l3c_pmu->base + reg);
+	val = hisi_l3c_pmu_event_readl(hwc, reg);
 	val &= ~(L3C_EVTYPE_NONE << shift);
-	val |= (type << shift);
-	writel(val, l3c_pmu->base + reg);
+	val |= type << shift;
+	hisi_l3c_pmu_event_writel(hwc, reg, val);
 }

 static void hisi_l3c_pmu_start_counters(struct hisi_pmu *l3c_pmu)
 {
+	struct hisi_l3c_pmu *hisi_l3c_pmu = to_hisi_l3c_pmu(l3c_pmu);
+	unsigned long *used_mask = l3c_pmu->pmu_events.used_mask;
+	unsigned long used_cntr = find_first_bit(used_mask, l3c_pmu->num_counters);
 	u32 val;
+	int i;

 	/*
-	 * Set perf_enable bit in L3C_PERF_CTRL register to start counting
-	 * for all enabled counters.
+	 * Check if any counter belongs to the normal range (instead of ext
+	 * range). If so, enable it.
 	 */
-	val = readl(l3c_pmu->base + L3C_PERF_CTRL);
-	val |= L3C_PERF_CTRL_EN;
-	writel(val, l3c_pmu->base + L3C_PERF_CTRL);
+	if (used_cntr < L3C_NR_COUNTERS) {
+		val = readl(l3c_pmu->base + L3C_PERF_CTRL);
+		val |= L3C_PERF_CTRL_EN;
+		writel(val, l3c_pmu->base + L3C_PERF_CTRL);
+	}
+
+	/* If not, do enable it on ext ranges. */
+	for (i = 0; i < hisi_l3c_pmu->ext_num; i++) {
+		/* Find used counter in this ext range, skip the range if not. */
+		used_cntr = find_next_bit(used_mask, L3C_CNTR_EXT_H(i), L3C_CNTR_EXT_L(i));
+		if (used_cntr >= L3C_CNTR_EXT_H(i))
+			continue;
+
+		val = readl(hisi_l3c_pmu->ext_base[i] + L3C_PERF_CTRL);
+		val |= L3C_PERF_CTRL_EN;
+		writel(val, hisi_l3c_pmu->ext_base[i] + L3C_PERF_CTRL);
+	}
 }

 static void hisi_l3c_pmu_stop_counters(struct hisi_pmu *l3c_pmu)
 {
+	struct hisi_l3c_pmu *hisi_l3c_pmu = to_hisi_l3c_pmu(l3c_pmu);
+	unsigned long *used_mask = l3c_pmu->pmu_events.used_mask;
+	unsigned long used_cntr = find_first_bit(used_mask, l3c_pmu->num_counters);
 	u32 val;
+	int i;

 	/*
-	 * Clear perf_enable bit in L3C_PERF_CTRL register to stop counting
-	 * for all enabled counters.
+	 * Check if any counter belongs to the normal range (instead of ext
+	 * range). If so, stop it.
 	 */
-	val = readl(l3c_pmu->base + L3C_PERF_CTRL);
-	val &= ~(L3C_PERF_CTRL_EN);
-	writel(val, l3c_pmu->base + L3C_PERF_CTRL);
+	if (used_cntr < L3C_NR_COUNTERS) {
+		val = readl(l3c_pmu->base + L3C_PERF_CTRL);
+		val &= ~L3C_PERF_CTRL_EN;
+		writel(val, l3c_pmu->base + L3C_PERF_CTRL);
+	}
+
+	/* If not, do stop it on ext ranges. */
+	for (i = 0; i < hisi_l3c_pmu->ext_num; i++) {
+		/* Find used counter in this ext range, skip the range if not. */
+		used_cntr = find_next_bit(used_mask, L3C_CNTR_EXT_H(i), L3C_CNTR_EXT_L(i));
+		if (used_cntr >= L3C_CNTR_EXT_H(i))
+			continue;
+
+		val = readl(hisi_l3c_pmu->ext_base[i] + L3C_PERF_CTRL);
+		val &= ~L3C_PERF_CTRL_EN;
+		writel(val, hisi_l3c_pmu->ext_base[i] + L3C_PERF_CTRL);
+	}
 }

 static void hisi_l3c_pmu_enable_counter(struct hisi_pmu *l3c_pmu,
···
 	u32 val;

 	/* Enable counter index in L3C_EVENT_CTRL register */
-	val = readl(l3c_pmu->base + L3C_EVENT_CTRL);
-	val |= (1 << hwc->idx);
-	writel(val, l3c_pmu->base + L3C_EVENT_CTRL);
+	val = hisi_l3c_pmu_event_readl(hwc, L3C_EVENT_CTRL);
+	val |= 1 << L3C_HW_IDX(hwc->idx);
+	hisi_l3c_pmu_event_writel(hwc, L3C_EVENT_CTRL, val);
 }

 static void hisi_l3c_pmu_disable_counter(struct hisi_pmu *l3c_pmu,
···
 	u32 val;

 	/* Clear counter index in L3C_EVENT_CTRL register */
-	val = readl(l3c_pmu->base + L3C_EVENT_CTRL);
-	val &= ~(1 << hwc->idx);
-	writel(val, l3c_pmu->base + L3C_EVENT_CTRL);
+	val = hisi_l3c_pmu_event_readl(hwc, L3C_EVENT_CTRL);
+	val &= ~(1 << L3C_HW_IDX(hwc->idx));
+	hisi_l3c_pmu_event_writel(hwc, L3C_EVENT_CTRL, val);
 }

 static void hisi_l3c_pmu_enable_counter_int(struct hisi_pmu *l3c_pmu,
···
 {
 	u32 val;

-	val = readl(l3c_pmu->base + L3C_INT_MASK);
+	val = hisi_l3c_pmu_event_readl(hwc, L3C_INT_MASK);
 	/* Write 0 to enable interrupt */
-	val &= ~(1 << hwc->idx);
-	writel(val, l3c_pmu->base + L3C_INT_MASK);
+	val &= ~(1 << L3C_HW_IDX(hwc->idx));
+	hisi_l3c_pmu_event_writel(hwc, L3C_INT_MASK, val);
 }

 static void hisi_l3c_pmu_disable_counter_int(struct hisi_pmu *l3c_pmu,
···
 {
 	u32 val;

-	val = readl(l3c_pmu->base + L3C_INT_MASK);
+	val = hisi_l3c_pmu_event_readl(hwc, L3C_INT_MASK);
 	/* Write 1 to mask interrupt */
-	val |= (1 << hwc->idx);
-	writel(val, l3c_pmu->base + L3C_INT_MASK);
+	val |= 1 << L3C_HW_IDX(hwc->idx);
+	hisi_l3c_pmu_event_writel(hwc, L3C_INT_MASK, val);
 }

 static u32 hisi_l3c_pmu_get_int_status(struct hisi_pmu *l3c_pmu)
 {
-	return readl(l3c_pmu->base + L3C_INT_STATUS);
+	struct hisi_l3c_pmu *hisi_l3c_pmu = to_hisi_l3c_pmu(l3c_pmu);
+	u32 ext_int, status, status_ext = 0;
+	int i;
+
+	status = readl(l3c_pmu->base + L3C_INT_STATUS);
+
+	if (!support_ext(hisi_l3c_pmu))
+		return status;
+
+	for (i = 0; i < hisi_l3c_pmu->ext_num; i++) {
+		ext_int = readl(hisi_l3c_pmu->ext_base[i] + L3C_INT_STATUS);
+		status_ext |= ext_int << (L3C_NR_COUNTERS * i);
+	}
+
+	return status | (status_ext << L3C_NR_COUNTERS);
 }

 static void hisi_l3c_pmu_clear_int_status(struct hisi_pmu *l3c_pmu, int idx)
 {
-	writel(1 << idx, l3c_pmu->base + L3C_INT_CLEAR);
-}
+	struct hw_perf_event *hwc = &l3c_pmu->pmu_events.hw_events[idx]->hw;

-static const struct acpi_device_id hisi_l3c_pmu_acpi_match[] = {
-	{ "HISI0213", },
-	{ "HISI0214", },
-	{}
-};
-MODULE_DEVICE_TABLE(acpi, hisi_l3c_pmu_acpi_match);
+	hisi_l3c_pmu_event_writel(hwc, L3C_INT_CLEAR, 1 << L3C_HW_IDX(idx));
+}

 static int hisi_l3c_pmu_init_data(struct platform_device *pdev,
 				  struct hisi_pmu *l3c_pmu)
···
 		return -EINVAL;
 	}

+	l3c_pmu->dev_info = device_get_match_data(&pdev->dev);
+	if (!l3c_pmu->dev_info)
+		return -ENODEV;
+
 	l3c_pmu->base = devm_platform_ioremap_resource(pdev, 0);
 	if (IS_ERR(l3c_pmu->base)) {
 		dev_err(&pdev->dev, "ioremap failed for l3c_pmu resource\n");
···
 	}

 	l3c_pmu->identifier = readl(l3c_pmu->base + L3C_VERSION);
+
+	return 0;
+}
+
+static int hisi_l3c_pmu_init_ext(struct hisi_pmu *l3c_pmu, struct platform_device *pdev)
+{
+	struct hisi_l3c_pmu *hisi_l3c_pmu = to_hisi_l3c_pmu(l3c_pmu);
+	int ret, irq, ext_num, i;
+	char *irqname;
+
+	/* HiSilicon L3C PMU supporting ext should have more than 1 irq resources. */
+	ext_num = platform_irq_count(pdev);
+	if (ext_num < L3C_MAX_EXT)
+		return -ENODEV;
+
+	/*
+	 * The number of ext supported equals the number of irq - 1, since one
+	 * of the irqs belongs to the normal part of PMU.
+	 */
+	hisi_l3c_pmu->ext_num = ext_num - 1;
+
+	for (i = 0; i < hisi_l3c_pmu->ext_num; i++) {
+		hisi_l3c_pmu->ext_base[i] = devm_platform_ioremap_resource(pdev, i + 1);
+		if (IS_ERR(hisi_l3c_pmu->ext_base[i]))
+			return PTR_ERR(hisi_l3c_pmu->ext_base[i]);
+
+		irq = platform_get_irq(pdev, i + 1);
+		if (irq < 0)
+			return irq;
+
+		irqname = devm_kasprintf(&pdev->dev, GFP_KERNEL, "%s ext%d",
+					 dev_name(&pdev->dev), i + 1);
+		if (!irqname)
+			return -ENOMEM;
+
+		ret = devm_request_irq(&pdev->dev, irq, hisi_uncore_pmu_isr,
+				       IRQF_NOBALANCING | IRQF_NO_THREAD,
+				       irqname, l3c_pmu);
+		if (ret < 0)
+			return dev_err_probe(&pdev->dev, ret,
+					     "Fail to request EXT IRQ: %d.\n", irq);
+
+		hisi_l3c_pmu->ext_irq[i] = irq;
+	}

 	return 0;
 }
···

 static struct attribute *hisi_l3c_pmu_v2_format_attr[] = {
 	HISI_PMU_FORMAT_ATTR(event, "config:0-7"),
-	HISI_PMU_FORMAT_ATTR(tt_core, "config1:0-7"),
+	HISI_PMU_FORMAT_ATTR(tt_core, "config2:0-15"),
 	HISI_PMU_FORMAT_ATTR(tt_req, "config1:8-10"),
 	HISI_PMU_FORMAT_ATTR(datasrc_cfg, "config1:11-15"),
 	HISI_PMU_FORMAT_ATTR(datasrc_skt, "config1:16"),
···
 static const struct attribute_group hisi_l3c_pmu_v2_format_group = {
 	.name = "format",
 	.attrs = hisi_l3c_pmu_v2_format_attr,
+};
+
+static struct attribute *hisi_l3c_pmu_v3_format_attr[] = {
+	HISI_PMU_FORMAT_ATTR(event, "config:0-7"),
+	HISI_PMU_FORMAT_ATTR(ext, "config:16-17"),
+	HISI_PMU_FORMAT_ATTR(tt_req, "config1:8-10"),
+	HISI_PMU_FORMAT_ATTR(tt_core, "config2:0-15"),
+	NULL
+};
+
+static const struct attribute_group hisi_l3c_pmu_v3_format_group = {
+	.name = "format",
+	.attrs = hisi_l3c_pmu_v3_format_attr,
 };

 static struct attribute *hisi_l3c_pmu_v1_events_attr[] = {
···
 	.attrs = hisi_l3c_pmu_v2_events_attr,
 };

+static struct attribute *hisi_l3c_pmu_v3_events_attr[] = {
+	HISI_PMU_EVENT_ATTR(rd_spipe, 0x18),
+	HISI_PMU_EVENT_ATTR(rd_hit_spipe, 0x19),
+	HISI_PMU_EVENT_ATTR(wr_spipe, 0x1a),
+	HISI_PMU_EVENT_ATTR(wr_hit_spipe, 0x1b),
+	HISI_PMU_EVENT_ATTR(io_rd_spipe, 0x1c),
+	HISI_PMU_EVENT_ATTR(io_rd_hit_spipe, 0x1d),
+	HISI_PMU_EVENT_ATTR(io_wr_spipe, 0x1e),
+	HISI_PMU_EVENT_ATTR(io_wr_hit_spipe, 0x1f),
+	HISI_PMU_EVENT_ATTR(cycles, 0x7f),
+	HISI_PMU_EVENT_ATTR(l3c_ref, 0xbc),
+	HISI_PMU_EVENT_ATTR(l3c2ring, 0xbd),
+	NULL
+};
+
+static const struct attribute_group hisi_l3c_pmu_v3_events_group = {
+	.name = "events",
+	.attrs = hisi_l3c_pmu_v3_events_attr,
+};
+
 static const struct attribute_group *hisi_l3c_pmu_v1_attr_groups[] = {
 	&hisi_l3c_pmu_v1_format_group,
 	&hisi_l3c_pmu_v1_events_group,
···
 	NULL
 };

+static const struct attribute_group *hisi_l3c_pmu_v3_attr_groups[] = {
+	&hisi_l3c_pmu_v3_format_group,
+	&hisi_l3c_pmu_v3_events_group,
+	&hisi_pmu_cpumask_attr_group,
+	&hisi_pmu_identifier_group,
+	NULL
+};
+
+static struct hisi_l3c_pmu_ext hisi_l3c_pmu_support_ext = {
+	.support_ext = true,
+};
+
+static struct hisi_l3c_pmu_ext hisi_l3c_pmu_not_support_ext = {
+	.support_ext = false,
+};
+
+static const struct hisi_pmu_dev_info hisi_l3c_pmu_v1 = {
+	.attr_groups = hisi_l3c_pmu_v1_attr_groups,
+	.counter_bits = 48,
+	.check_event = L3C_V1_NR_EVENTS,
+	.private = &hisi_l3c_pmu_not_support_ext,
+};
+
+static const struct hisi_pmu_dev_info hisi_l3c_pmu_v2 = {
+	.attr_groups = hisi_l3c_pmu_v2_attr_groups,
+	.counter_bits = 64,
+	.check_event = L3C_V2_NR_EVENTS,
+	.private = &hisi_l3c_pmu_not_support_ext,
+};
+
+static const struct hisi_pmu_dev_info hisi_l3c_pmu_v3 = {
+	.attr_groups = hisi_l3c_pmu_v3_attr_groups,
+	.counter_bits = 64,
+	.check_event = L3C_V2_NR_EVENTS,
+	.private = &hisi_l3c_pmu_support_ext,
+};
+
 static const struct hisi_uncore_ops hisi_uncore_l3c_ops = {
 	.write_evtype = hisi_l3c_pmu_write_evtype,
-	.get_event_idx = hisi_uncore_pmu_get_event_idx,
+	.get_event_idx = hisi_l3c_pmu_get_event_idx,
 	.start_counters = hisi_l3c_pmu_start_counters,
 	.stop_counters = hisi_l3c_pmu_stop_counters,
 	.enable_counter = hisi_l3c_pmu_enable_counter,
···
 	.clear_int_status = hisi_l3c_pmu_clear_int_status,
 	.enable_filter = hisi_l3c_pmu_enable_filter,
 	.disable_filter = hisi_l3c_pmu_disable_filter,
+	.check_filter = hisi_l3c_pmu_check_filter,
 };

 static int hisi_l3c_pmu_dev_probe(struct platform_device *pdev,
 				  struct hisi_pmu *l3c_pmu)
 {
+	struct hisi_l3c_pmu *hisi_l3c_pmu = to_hisi_l3c_pmu(l3c_pmu);
+	struct hisi_l3c_pmu_ext *l3c_pmu_dev_ext;
 	int ret;

 	ret = hisi_l3c_pmu_init_data(pdev, l3c_pmu);
···
 	if (ret)
 		return ret;

-	if (l3c_pmu->identifier >= HISI_PMU_V2) {
-		l3c_pmu->counter_bits = 64;
-		l3c_pmu->check_event = L3C_V2_NR_EVENTS;
-		l3c_pmu->pmu_events.attr_groups = hisi_l3c_pmu_v2_attr_groups;
-	} else {
-		l3c_pmu->counter_bits = 48;
-		l3c_pmu->check_event = L3C_V1_NR_EVENTS;
-		l3c_pmu->pmu_events.attr_groups = hisi_l3c_pmu_v1_attr_groups;
-	}
-
+	l3c_pmu->pmu_events.attr_groups = l3c_pmu->dev_info->attr_groups;
+	l3c_pmu->counter_bits = l3c_pmu->dev_info->counter_bits;
+	l3c_pmu->check_event = l3c_pmu->dev_info->check_event;
 	l3c_pmu->num_counters = L3C_NR_COUNTERS;
 	l3c_pmu->ops = &hisi_uncore_l3c_ops;
 	l3c_pmu->dev = &pdev->dev;
 	l3c_pmu->on_cpu = -1;
+
+	l3c_pmu_dev_ext = l3c_pmu->dev_info->private;
+	if (l3c_pmu_dev_ext->support_ext) {
+		ret = hisi_l3c_pmu_init_ext(l3c_pmu, pdev);
+		if (ret)
+			return ret;
+		/*
+		 * The extension events have their own counters with the
+		 * same number of the normal events counters. So we can
+		 * have at maximum num_counters * ext events monitored.
+		 */
+		l3c_pmu->num_counters += hisi_l3c_pmu->ext_num * L3C_NR_COUNTERS;
+	}

 	return 0;
 }

 static int hisi_l3c_pmu_probe(struct platform_device *pdev)
 {
+	struct hisi_l3c_pmu *hisi_l3c_pmu;
 	struct hisi_pmu *l3c_pmu;
 	char *name;
 	int ret;

-	l3c_pmu = devm_kzalloc(&pdev->dev, sizeof(*l3c_pmu), GFP_KERNEL);
-	if (!l3c_pmu)
+	hisi_l3c_pmu = devm_kzalloc(&pdev->dev, sizeof(*hisi_l3c_pmu), GFP_KERNEL);
+	if (!hisi_l3c_pmu)
 		return -ENOMEM;

+	l3c_pmu = &hisi_l3c_pmu->l3c_pmu;
 	platform_set_drvdata(pdev, l3c_pmu);

 	ret = hisi_l3c_pmu_dev_probe(pdev, l3c_pmu);
 	if (ret)
 		return ret;

-	name = devm_kasprintf(&pdev->dev, GFP_KERNEL, "hisi_sccl%d_l3c%d",
-			      l3c_pmu->topo.sccl_id, l3c_pmu->topo.ccl_id);
+	if (l3c_pmu->topo.sub_id >= 0)
+		name = devm_kasprintf(&pdev->dev, GFP_KERNEL, "hisi_sccl%d_l3c%d_%d",
+				      l3c_pmu->topo.sccl_id, l3c_pmu->topo.ccl_id,
+				      l3c_pmu->topo.sub_id);
+	else
+		name = devm_kasprintf(&pdev->dev, GFP_KERNEL, "hisi_sccl%d_l3c%d",
+				      l3c_pmu->topo.sccl_id, l3c_pmu->topo.ccl_id);
 	if (!name)
 		return -ENOMEM;

···
 					    &l3c_pmu->node);
 }

+static const struct acpi_device_id hisi_l3c_pmu_acpi_match[] = {
+	{ "HISI0213", (kernel_ulong_t)&hisi_l3c_pmu_v1 },
+	{ "HISI0214", (kernel_ulong_t)&hisi_l3c_pmu_v2 },
+	{ "HISI0215", (kernel_ulong_t)&hisi_l3c_pmu_v3 },
+	{}
+};
+MODULE_DEVICE_TABLE(acpi, hisi_l3c_pmu_acpi_match);
+
 static struct platform_driver hisi_l3c_pmu_driver = {
 	.driver = {
 		.name = "hisi_l3c_pmu",
···
 	.remove = hisi_l3c_pmu_remove,
 };

+static int hisi_l3c_pmu_online_cpu(unsigned int cpu, struct hlist_node *node)
+{
+	struct hisi_pmu *l3c_pmu = hlist_entry_safe(node, struct hisi_pmu, node);
+	struct hisi_l3c_pmu *hisi_l3c_pmu = to_hisi_l3c_pmu(l3c_pmu);
+	int ret, i;
+
+	ret = hisi_uncore_pmu_online_cpu(cpu, node);
+	if (ret)
+		return ret;
+
+	/* Avoid L3C pmu not supporting ext from ext irq migrating. */
+	if (!support_ext(hisi_l3c_pmu))
+		return 0;
+
+	for (i = 0; i < hisi_l3c_pmu->ext_num; i++)
+		WARN_ON(irq_set_affinity(hisi_l3c_pmu->ext_irq[i],
+					 cpumask_of(l3c_pmu->on_cpu)));
+
+	return 0;
+}
+
+static int hisi_l3c_pmu_offline_cpu(unsigned int cpu, struct hlist_node *node)
+{
+	struct hisi_pmu *l3c_pmu = hlist_entry_safe(node, struct hisi_pmu, node);
+	struct hisi_l3c_pmu *hisi_l3c_pmu = to_hisi_l3c_pmu(l3c_pmu);
+	int ret, i;
+
+	ret = hisi_uncore_pmu_offline_cpu(cpu, node);
+	if (ret)
+		return ret;
+
+	/* If failed to find any available CPU, skip irq migration. */
+	if (l3c_pmu->on_cpu < 0)
+		return 0;
+
+	/* Avoid L3C pmu not supporting ext from ext irq migrating. */
+	if (!support_ext(hisi_l3c_pmu))
+		return 0;
+
+	for (i = 0; i < hisi_l3c_pmu->ext_num; i++)
+		WARN_ON(irq_set_affinity(hisi_l3c_pmu->ext_irq[i],
+					 cpumask_of(l3c_pmu->on_cpu)));
+
+	return 0;
+}
+
 static int __init hisi_l3c_pmu_module_init(void)
 {
 	int ret;

 	ret = cpuhp_setup_state_multi(CPUHP_AP_PERF_ARM_HISI_L3_ONLINE,
 				      "AP_PERF_ARM_HISI_L3_ONLINE",
-				      hisi_uncore_pmu_online_cpu,
-				      hisi_uncore_pmu_offline_cpu);
+				      hisi_l3c_pmu_online_cpu,
+				      hisi_l3c_pmu_offline_cpu);
 	if (ret) {
 		pr_err("L3C PMU: Error setup hotplug, ret = %d\n", ret);
 		return ret;
+411
drivers/perf/hisilicon/hisi_uncore_mn_pmu.c
···
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * HiSilicon SoC MN uncore Hardware event counters support
+ *
+ * Copyright (c) 2025 HiSilicon Technologies Co., Ltd.
+ */
+#include <linux/cpuhotplug.h>
+#include <linux/interrupt.h>
+#include <linux/iopoll.h>
+#include <linux/irq.h>
+#include <linux/list.h>
+#include <linux/mod_devicetable.h>
+#include <linux/property.h>
+
+#include "hisi_uncore_pmu.h"
+
+/* Dynamic CPU hotplug state used by MN PMU */
+static enum cpuhp_state hisi_mn_pmu_online;
+
+/* MN register definition */
+#define HISI_MN_DYNAMIC_CTRL_REG	0x400
+#define HISI_MN_DYNAMIC_CTRL_EN		BIT(0)
+#define HISI_MN_PERF_CTRL_REG		0x408
+#define HISI_MN_PERF_CTRL_EN		BIT(6)
+#define HISI_MN_INT_MASK_REG		0x800
+#define HISI_MN_INT_STATUS_REG		0x808
+#define HISI_MN_INT_CLEAR_REG		0x80C
+#define HISI_MN_EVENT_CTRL_REG		0x1C00
+#define HISI_MN_VERSION_REG		0x1C04
+#define HISI_MN_EVTYPE0_REG		0x1d00
+#define HISI_MN_EVTYPE_MASK		GENMASK(7, 0)
+#define HISI_MN_CNTR0_REG		0x1e00
+#define HISI_MN_EVTYPE_REGn(evtype0, n)	((evtype0) + (n) * 4)
+#define HISI_MN_CNTR_REGn(cntr0, n)	((cntr0) + (n) * 8)
+
+#define HISI_MN_NR_COUNTERS		4
+#define HISI_MN_TIMEOUT_US		500U
+
+struct hisi_mn_pmu_regs {
+	u32 version;
+	u32 dyn_ctrl;
+	u32 perf_ctrl;
+	u32 int_mask;
+	u32 int_clear;
+	u32 int_status;
+	u32 event_ctrl;
+	u32 event_type0;
+	u32 event_cntr0;
+};
+
+/*
+ * Each event request takes a certain amount of time to complete. If
+ * we counting the latency related event, we need to wait for the all
+ * requests complete. Otherwise, the value of counter is slightly larger.
+ */
+static void hisi_mn_pmu_counter_flush(struct hisi_pmu *mn_pmu)
+{
+	struct hisi_mn_pmu_regs *reg_info = mn_pmu->dev_info->private;
+	int ret;
+	u32 val;
+
+	val = readl(mn_pmu->base + reg_info->dyn_ctrl);
+	val |= HISI_MN_DYNAMIC_CTRL_EN;
+	writel(val, mn_pmu->base + reg_info->dyn_ctrl);
+
+	ret = readl_poll_timeout_atomic(mn_pmu->base + reg_info->dyn_ctrl,
+					val, !(val & HISI_MN_DYNAMIC_CTRL_EN),
+					1, HISI_MN_TIMEOUT_US);
+	if (ret)
+		dev_warn(mn_pmu->dev, "Counter flush timeout\n");
+}
+
+static u64 hisi_mn_pmu_read_counter(struct hisi_pmu *mn_pmu,
+				    struct hw_perf_event *hwc)
+{
+	struct hisi_mn_pmu_regs *reg_info = mn_pmu->dev_info->private;
+
+	return readq(mn_pmu->base + HISI_MN_CNTR_REGn(reg_info->event_cntr0, hwc->idx));
+}
+
+static void hisi_mn_pmu_write_counter(struct hisi_pmu *mn_pmu,
+				      struct hw_perf_event *hwc, u64 val)
+{
+	struct hisi_mn_pmu_regs *reg_info = mn_pmu->dev_info->private;
+
+	writeq(val, mn_pmu->base + HISI_MN_CNTR_REGn(reg_info->event_cntr0, hwc->idx));
+}
+
+static void hisi_mn_pmu_write_evtype(struct hisi_pmu *mn_pmu, int idx, u32 type)
+{
+	struct hisi_mn_pmu_regs *reg_info = mn_pmu->dev_info->private;
+	u32 val;
+
+	/*
+	 * Select the appropriate event select register.
+	 * There are 2 32-bit event select registers for the
+	 * 8 hardware counters, each event code is 8-bit wide.
+	 */
+	val = readl(mn_pmu->base + HISI_MN_EVTYPE_REGn(reg_info->event_type0, idx / 4));
+	val &= ~(HISI_MN_EVTYPE_MASK << HISI_PMU_EVTYPE_SHIFT(idx));
+	val |= (type << HISI_PMU_EVTYPE_SHIFT(idx));
+	writel(val, mn_pmu->base + HISI_MN_EVTYPE_REGn(reg_info->event_type0, idx / 4));
+}
+
+static void hisi_mn_pmu_start_counters(struct hisi_pmu *mn_pmu)
+{
+	struct hisi_mn_pmu_regs *reg_info = mn_pmu->dev_info->private;
+	u32 val;
+
+	val = readl(mn_pmu->base + reg_info->perf_ctrl);
+	val |= HISI_MN_PERF_CTRL_EN;
+	writel(val, mn_pmu->base + reg_info->perf_ctrl);
+}
+
+static void hisi_mn_pmu_stop_counters(struct hisi_pmu *mn_pmu)
+{
+	struct hisi_mn_pmu_regs *reg_info = mn_pmu->dev_info->private;
+	u32 val;
+
+	val = readl(mn_pmu->base + reg_info->perf_ctrl);
+	val &= ~HISI_MN_PERF_CTRL_EN;
+	writel(val, mn_pmu->base + reg_info->perf_ctrl);
+
+	hisi_mn_pmu_counter_flush(mn_pmu);
+}
+
+static void hisi_mn_pmu_enable_counter(struct hisi_pmu *mn_pmu,
+				       struct hw_perf_event *hwc)
+{
+	struct hisi_mn_pmu_regs *reg_info = mn_pmu->dev_info->private;
+	u32 val;
+
+	val = readl(mn_pmu->base + reg_info->event_ctrl);
+	val |= BIT(hwc->idx);
+	writel(val, mn_pmu->base + reg_info->event_ctrl);
+}
+
+static void hisi_mn_pmu_disable_counter(struct hisi_pmu *mn_pmu,
+					struct hw_perf_event *hwc)
+{
+	struct hisi_mn_pmu_regs *reg_info = mn_pmu->dev_info->private;
+	u32 val;
+
+	val = readl(mn_pmu->base + reg_info->event_ctrl);
+	val &= ~BIT(hwc->idx);
+	writel(val, mn_pmu->base + reg_info->event_ctrl);
+}
+
+static void hisi_mn_pmu_enable_counter_int(struct hisi_pmu *mn_pmu,
+					   struct hw_perf_event *hwc)
+{
+	struct hisi_mn_pmu_regs *reg_info = mn_pmu->dev_info->private;
+	u32 val;
+
+	val = readl(mn_pmu->base + reg_info->int_mask);
+	val &= ~BIT(hwc->idx);
+	writel(val, mn_pmu->base + reg_info->int_mask);
+}
+
+static void hisi_mn_pmu_disable_counter_int(struct hisi_pmu *mn_pmu,
+					    struct hw_perf_event *hwc)
+{
+	struct hisi_mn_pmu_regs *reg_info = mn_pmu->dev_info->private;
+	u32 val;
+
+	val = readl(mn_pmu->base + reg_info->int_mask);
+	val |= BIT(hwc->idx);
+	writel(val, mn_pmu->base + reg_info->int_mask);
+}
+
+static u32 hisi_mn_pmu_get_int_status(struct hisi_pmu *mn_pmu)
+{
+	struct hisi_mn_pmu_regs *reg_info = mn_pmu->dev_info->private;
+
+	return readl(mn_pmu->base + reg_info->int_status);
+}
+
+static void hisi_mn_pmu_clear_int_status(struct hisi_pmu *mn_pmu, int idx)
+{
+	struct hisi_mn_pmu_regs *reg_info = mn_pmu->dev_info->private;
+
+	writel(BIT(idx), mn_pmu->base + reg_info->int_clear);
+}
+
+static struct attribute *hisi_mn_pmu_format_attr[] = {
+	HISI_PMU_FORMAT_ATTR(event, "config:0-7"),
+	NULL
+};
+
+static const struct attribute_group hisi_mn_pmu_format_group = {
+	.name = "format",
+	.attrs = hisi_mn_pmu_format_attr,
+};
+
+static struct attribute *hisi_mn_pmu_events_attr[] = {
+	HISI_PMU_EVENT_ATTR(req_eobarrier_num,		0x00),
+	HISI_PMU_EVENT_ATTR(req_ecbarrier_num,		0x01),
+	HISI_PMU_EVENT_ATTR(req_dvmop_num,		0x02),
+	HISI_PMU_EVENT_ATTR(req_dvmsync_num,		0x03),
+	HISI_PMU_EVENT_ATTR(req_retry_num,		0x04),
+	HISI_PMU_EVENT_ATTR(req_writenosnp_num,		0x05),
+	HISI_PMU_EVENT_ATTR(req_readnosnp_num,		0x06),
+	HISI_PMU_EVENT_ATTR(snp_dvm_num,		0x07),
+	HISI_PMU_EVENT_ATTR(snp_dvmsync_num,		0x08),
+	HISI_PMU_EVENT_ATTR(l3t_req_dvm_num,		0x09),
+	HISI_PMU_EVENT_ATTR(l3t_req_dvmsync_num,	0x0A),
+	HISI_PMU_EVENT_ATTR(mn_req_dvm_num,		0x0B),
+	HISI_PMU_EVENT_ATTR(mn_req_dvmsync_num,		0x0C),
+	HISI_PMU_EVENT_ATTR(pa_req_dvm_num,		0x0D),
+	HISI_PMU_EVENT_ATTR(pa_req_dvmsync_num,		0x0E),
+	HISI_PMU_EVENT_ATTR(snp_dvm_latency,		0x80),
+	HISI_PMU_EVENT_ATTR(snp_dvmsync_latency,	0x81),
+	HISI_PMU_EVENT_ATTR(l3t_req_dvm_latency,	0x82),
+	HISI_PMU_EVENT_ATTR(l3t_req_dvmsync_latency,	0x83),
+	HISI_PMU_EVENT_ATTR(mn_req_dvm_latency,		0x84),
+	HISI_PMU_EVENT_ATTR(mn_req_dvmsync_latency,	0x85),
+	HISI_PMU_EVENT_ATTR(pa_req_dvm_latency,		0x86),
+	HISI_PMU_EVENT_ATTR(pa_req_dvmsync_latency,	0x87),
+	NULL
+};
+
+static const struct attribute_group hisi_mn_pmu_events_group = {
+	.name = "events",
+	.attrs = hisi_mn_pmu_events_attr,
+};
+
+static const struct attribute_group *hisi_mn_pmu_attr_groups[] = {
+	&hisi_mn_pmu_format_group,
+	&hisi_mn_pmu_events_group,
+	&hisi_pmu_cpumask_attr_group,
+	&hisi_pmu_identifier_group,
+	NULL
+};
+
+static const struct hisi_uncore_ops hisi_uncore_mn_ops = {
+	.write_evtype		= hisi_mn_pmu_write_evtype,
+	.get_event_idx		= hisi_uncore_pmu_get_event_idx,
+	.start_counters		= hisi_mn_pmu_start_counters,
+	.stop_counters		= hisi_mn_pmu_stop_counters,
+	.enable_counter		= hisi_mn_pmu_enable_counter,
+	.disable_counter	= hisi_mn_pmu_disable_counter,
+	.enable_counter_int	= hisi_mn_pmu_enable_counter_int,
+	.disable_counter_int	= hisi_mn_pmu_disable_counter_int,
+	.write_counter		= hisi_mn_pmu_write_counter,
+	.read_counter		= hisi_mn_pmu_read_counter,
+	.get_int_status		= hisi_mn_pmu_get_int_status,
+	.clear_int_status	= hisi_mn_pmu_clear_int_status,
+};
+
+static int hisi_mn_pmu_dev_init(struct platform_device *pdev,
+				struct hisi_pmu *mn_pmu)
+{
+	struct hisi_mn_pmu_regs *reg_info;
+	int ret;
+
+	hisi_uncore_pmu_init_topology(mn_pmu, &pdev->dev);
+
+	if (mn_pmu->topo.scl_id < 0)
+		return dev_err_probe(&pdev->dev, -EINVAL,
+				     "Failed to read MN scl id\n");
+
+	if (mn_pmu->topo.index_id < 0)
+		return dev_err_probe(&pdev->dev, -EINVAL,
+				     "Failed to read MN index id\n");
+
+	mn_pmu->base = devm_platform_ioremap_resource(pdev, 0);
+	if (IS_ERR(mn_pmu->base))
+		return dev_err_probe(&pdev->dev, PTR_ERR(mn_pmu->base),
+				     "Failed to ioremap resource\n");
+
+	ret = hisi_uncore_pmu_init_irq(mn_pmu, pdev);
+	if (ret)
+		return ret;
+
+	mn_pmu->dev_info = device_get_match_data(&pdev->dev);
+	if (!mn_pmu->dev_info)
+		return -ENODEV;
+
+	mn_pmu->pmu_events.attr_groups = mn_pmu->dev_info->attr_groups;
+	mn_pmu->counter_bits = mn_pmu->dev_info->counter_bits;
+	mn_pmu->check_event = mn_pmu->dev_info->check_event;
+	mn_pmu->num_counters = HISI_MN_NR_COUNTERS;
+	mn_pmu->ops = &hisi_uncore_mn_ops;
+	mn_pmu->dev = &pdev->dev;
+	mn_pmu->on_cpu = -1;
+
+	reg_info = mn_pmu->dev_info->private;
+	mn_pmu->identifier = readl(mn_pmu->base + reg_info->version);
+
+	return 0;
+}
+
+static void hisi_mn_pmu_remove_cpuhp(void *hotplug_node)
+{
+	cpuhp_state_remove_instance_nocalls(hisi_mn_pmu_online, hotplug_node);
+}
+
+static void hisi_mn_pmu_unregister(void *pmu)
+{
+	perf_pmu_unregister(pmu);
+}
+
+static int hisi_mn_pmu_probe(struct platform_device *pdev)
+{
+	struct hisi_pmu *mn_pmu;
+	char *name;
+	int ret;
+
+	mn_pmu = devm_kzalloc(&pdev->dev, sizeof(*mn_pmu), GFP_KERNEL);
+	if (!mn_pmu)
+		return -ENOMEM;
+
+	platform_set_drvdata(pdev, mn_pmu);
+
+	ret = hisi_mn_pmu_dev_init(pdev, mn_pmu);
+	if (ret)
+		return ret;
+
+	name = devm_kasprintf(&pdev->dev, GFP_KERNEL, "hisi_scl%d_mn%d",
+			      mn_pmu->topo.scl_id, mn_pmu->topo.index_id);
+	if (!name)
+		return -ENOMEM;
+
+	ret = cpuhp_state_add_instance(hisi_mn_pmu_online, &mn_pmu->node);
+	if (ret)
+		return dev_err_probe(&pdev->dev, ret, "Failed to register cpu hotplug\n");
+
+	ret = devm_add_action_or_reset(&pdev->dev, hisi_mn_pmu_remove_cpuhp, &mn_pmu->node);
+	if (ret)
+		return ret;
+
+	hisi_pmu_init(mn_pmu, THIS_MODULE);
+
+	ret = perf_pmu_register(&mn_pmu->pmu, name, -1);
+	if (ret)
+		return dev_err_probe(mn_pmu->dev, ret, "Failed to register MN PMU\n");
+
+	return devm_add_action_or_reset(&pdev->dev, hisi_mn_pmu_unregister, &mn_pmu->pmu);
+}
+
+static struct hisi_mn_pmu_regs hisi_mn_v1_pmu_regs = {
+	.version = HISI_MN_VERSION_REG,
+	.dyn_ctrl = HISI_MN_DYNAMIC_CTRL_REG,
+	.perf_ctrl = HISI_MN_PERF_CTRL_REG,
+	.int_mask = HISI_MN_INT_MASK_REG,
+	.int_clear = HISI_MN_INT_CLEAR_REG,
+	.int_status = HISI_MN_INT_STATUS_REG,
+	.event_ctrl = HISI_MN_EVENT_CTRL_REG,
+	.event_type0 = HISI_MN_EVTYPE0_REG,
+	.event_cntr0 = HISI_MN_CNTR0_REG,
+};
+
+static const struct hisi_pmu_dev_info hisi_mn_v1 = {
+	.attr_groups = hisi_mn_pmu_attr_groups,
+	.counter_bits = 48,
+	.check_event = HISI_MN_EVTYPE_MASK,
+	.private = &hisi_mn_v1_pmu_regs,
+};
+
+static const struct acpi_device_id hisi_mn_pmu_acpi_match[] = {
+	{ "HISI0222", (kernel_ulong_t) &hisi_mn_v1 },
+	{ }
+};
+MODULE_DEVICE_TABLE(acpi, hisi_mn_pmu_acpi_match);
+
+static struct platform_driver hisi_mn_pmu_driver = {
+	.driver = {
+		.name = "hisi_mn_pmu",
+		.acpi_match_table = hisi_mn_pmu_acpi_match,
+		/*
+		 * We have not worked out a safe bind/unbind process,
+		 * Forcefully unbinding during sampling will lead to a
+		 * kernel panic, so this is not supported yet.
+		 */
+		.suppress_bind_attrs = true,
+	},
+	.probe = hisi_mn_pmu_probe,
+};
+
+static int __init hisi_mn_pmu_module_init(void)
+{
+	int ret;
+
+	ret = cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN, "perf/hisi/mn:online",
+				      hisi_uncore_pmu_online_cpu,
+				      hisi_uncore_pmu_offline_cpu);
+	if (ret < 0) {
+		pr_err("hisi_mn_pmu: Failed to setup MN PMU hotplug: %d\n", ret);
+		return ret;
+	}
+	hisi_mn_pmu_online = ret;
+
+	ret = platform_driver_register(&hisi_mn_pmu_driver);
+	if (ret)
+		cpuhp_remove_multi_state(hisi_mn_pmu_online);
+
+	return ret;
+}
+module_init(hisi_mn_pmu_module_init);
+
+static void __exit hisi_mn_pmu_module_exit(void)
+{
+	platform_driver_unregister(&hisi_mn_pmu_driver);
+	cpuhp_remove_multi_state(hisi_mn_pmu_online);
+}
+module_exit(hisi_mn_pmu_module_exit);
+
+MODULE_IMPORT_NS("HISI_PMU");
+MODULE_DESCRIPTION("HiSilicon SoC MN uncore PMU driver");
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Junhao He <hejunhao3@huawei.com>");
+443
drivers/perf/hisilicon/hisi_uncore_noc_pmu.c
···
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Driver for HiSilicon Uncore NoC (Network on Chip) PMU device
+ *
+ * Copyright (c) 2025 HiSilicon Technologies Co., Ltd.
+ * Author: Yicong Yang <yangyicong@hisilicon.com>
+ */
+#include <linux/bitops.h>
+#include <linux/cpuhotplug.h>
+#include <linux/device.h>
+#include <linux/io.h>
+#include <linux/mod_devicetable.h>
+#include <linux/module.h>
+#include <linux/platform_device.h>
+#include <linux/property.h>
+#include <linux/sysfs.h>
+
+#include "hisi_uncore_pmu.h"
+
+#define NOC_PMU_VERSION			0x1e00
+#define NOC_PMU_GLOBAL_CTRL		0x1e04
+#define NOC_PMU_GLOBAL_CTRL_PMU_EN	BIT(0)
+#define NOC_PMU_GLOBAL_CTRL_TT_EN	BIT(1)
+#define NOC_PMU_CNT_INFO		0x1e08
+#define NOC_PMU_CNT_INFO_OVERFLOW(n)	BIT(n)
+#define NOC_PMU_EVENT_CTRL0		0x1e20
+#define NOC_PMU_EVENT_CTRL_TYPE		GENMASK(4, 0)
+/*
+ * Note channel of 0x0 will reset the counter value, so don't do it before
+ * we read out the counter.
+ */
+#define NOC_PMU_EVENT_CTRL_CHANNEL	GENMASK(10, 8)
+#define NOC_PMU_EVENT_CTRL_EN		BIT(11)
+#define NOC_PMU_EVENT_COUNTER0		0x1e80
+
+#define NOC_PMU_NR_COUNTERS		4
+#define NOC_PMU_CH_DEFAULT		0x7
+
+#define NOC_PMU_EVENT_CTRLn(ctrl0, n)	((ctrl0) + 4 * (n))
+#define NOC_PMU_EVENT_CNTRn(cntr0, n)	((cntr0) + 8 * (n))
+
+HISI_PMU_EVENT_ATTR_EXTRACTOR(ch, config1, 2, 0);
+HISI_PMU_EVENT_ATTR_EXTRACTOR(tt_en, config1, 3, 3);
+
+/* Dynamic CPU hotplug state used by this PMU driver */
+static enum cpuhp_state hisi_noc_pmu_cpuhp_state;
+
+struct hisi_noc_pmu_regs {
+	u32 version;
+	u32 pmu_ctrl;
+	u32 event_ctrl0;
+	u32 event_cntr0;
+	u32 overflow_status;
+};
+
+/*
+ * Tracetag filtering is not per event and all the events should keep
+ * the consistence. Return true if the new comer doesn't match the
+ * tracetag filtering configuration of the current scheduled events.
+ */
+static bool hisi_noc_pmu_check_global_filter(struct perf_event *curr,
+					     struct perf_event *new)
+{
+	return hisi_get_tt_en(curr) == hisi_get_tt_en(new);
+}
+
+static void hisi_noc_pmu_write_evtype(struct hisi_pmu *noc_pmu, int idx, u32 type)
+{
+	struct hisi_noc_pmu_regs *reg_info = noc_pmu->dev_info->private;
+	u32 reg;
+
+	reg = readl(noc_pmu->base + NOC_PMU_EVENT_CTRLn(reg_info->event_ctrl0, idx));
+	reg &= ~NOC_PMU_EVENT_CTRL_TYPE;
+	reg |= FIELD_PREP(NOC_PMU_EVENT_CTRL_TYPE, type);
+	writel(reg, noc_pmu->base + NOC_PMU_EVENT_CTRLn(reg_info->event_ctrl0, idx));
+}
+
+static int hisi_noc_pmu_get_event_idx(struct perf_event *event)
+{
+	struct hisi_pmu *noc_pmu = to_hisi_pmu(event->pmu);
+	struct hisi_pmu_hwevents *pmu_events = &noc_pmu->pmu_events;
+	int cur_idx;
+
+	cur_idx = find_first_bit(pmu_events->used_mask, noc_pmu->num_counters);
+	if (cur_idx != noc_pmu->num_counters &&
+	    !hisi_noc_pmu_check_global_filter(pmu_events->hw_events[cur_idx], event))
+		return -EAGAIN;
+
+	return hisi_uncore_pmu_get_event_idx(event);
+}
+
+static u64 hisi_noc_pmu_read_counter(struct hisi_pmu *noc_pmu,
+				     struct hw_perf_event *hwc)
+{
+	struct hisi_noc_pmu_regs *reg_info = noc_pmu->dev_info->private;
+
+	return readq(noc_pmu->base + NOC_PMU_EVENT_CNTRn(reg_info->event_cntr0, hwc->idx));
+}
+
+static void hisi_noc_pmu_write_counter(struct hisi_pmu *noc_pmu,
+				       struct hw_perf_event *hwc, u64 val)
+{
+	struct hisi_noc_pmu_regs *reg_info = noc_pmu->dev_info->private;
+
+	writeq(val, noc_pmu->base + NOC_PMU_EVENT_CNTRn(reg_info->event_cntr0, hwc->idx));
+}
+
+static void hisi_noc_pmu_enable_counter(struct hisi_pmu *noc_pmu,
+					struct hw_perf_event *hwc)
+{
+	struct hisi_noc_pmu_regs *reg_info = noc_pmu->dev_info->private;
+	u32 reg;
+
+	reg = readl(noc_pmu->base + NOC_PMU_EVENT_CTRLn(reg_info->event_ctrl0, hwc->idx));
+	reg |= NOC_PMU_EVENT_CTRL_EN;
+	writel(reg, noc_pmu->base + NOC_PMU_EVENT_CTRLn(reg_info->event_ctrl0, hwc->idx));
+}
+
+static void hisi_noc_pmu_disable_counter(struct hisi_pmu *noc_pmu,
+					 struct hw_perf_event *hwc)
+{
+	struct hisi_noc_pmu_regs *reg_info = noc_pmu->dev_info->private;
+	u32 reg;
+
+	reg = readl(noc_pmu->base + NOC_PMU_EVENT_CTRLn(reg_info->event_ctrl0, hwc->idx));
+	reg &= ~NOC_PMU_EVENT_CTRL_EN;
+	writel(reg, noc_pmu->base + NOC_PMU_EVENT_CTRLn(reg_info->event_ctrl0, hwc->idx));
+}
+
+static void hisi_noc_pmu_enable_counter_int(struct hisi_pmu *noc_pmu,
+					    struct hw_perf_event *hwc)
+{
+	/* We don't support interrupt, so a stub here. */
+}
+
+static void hisi_noc_pmu_disable_counter_int(struct hisi_pmu *noc_pmu,
+					     struct hw_perf_event *hwc)
+{
+}
+
+static void hisi_noc_pmu_start_counters(struct hisi_pmu *noc_pmu)
+{
+	struct hisi_noc_pmu_regs *reg_info = noc_pmu->dev_info->private;
+	u32 reg;
+
+	reg = readl(noc_pmu->base + reg_info->pmu_ctrl);
+	reg |= NOC_PMU_GLOBAL_CTRL_PMU_EN;
+	writel(reg, noc_pmu->base + reg_info->pmu_ctrl);
+}
+
+static void hisi_noc_pmu_stop_counters(struct hisi_pmu *noc_pmu)
+{
+	struct hisi_noc_pmu_regs *reg_info = noc_pmu->dev_info->private;
+	u32 reg;
+
+	reg = readl(noc_pmu->base + reg_info->pmu_ctrl);
+	reg &= ~NOC_PMU_GLOBAL_CTRL_PMU_EN;
+	writel(reg, noc_pmu->base + reg_info->pmu_ctrl);
+}
+
+static u32 hisi_noc_pmu_get_int_status(struct hisi_pmu *noc_pmu)
+{
+	struct hisi_noc_pmu_regs *reg_info = noc_pmu->dev_info->private;
+
+	return readl(noc_pmu->base + reg_info->overflow_status);
+}
+
+static void hisi_noc_pmu_clear_int_status(struct hisi_pmu *noc_pmu, int idx)
+{
+	struct hisi_noc_pmu_regs *reg_info = noc_pmu->dev_info->private;
+	u32 reg;
+
+	reg = readl(noc_pmu->base + reg_info->overflow_status);
+	reg &= ~NOC_PMU_CNT_INFO_OVERFLOW(idx);
+	writel(reg, noc_pmu->base + reg_info->overflow_status);
+}
+
+static void hisi_noc_pmu_enable_filter(struct perf_event *event)
+{
+	struct hisi_pmu *noc_pmu = to_hisi_pmu(event->pmu);
+	struct hisi_noc_pmu_regs *reg_info = noc_pmu->dev_info->private;
+	struct hw_perf_event *hwc = &event->hw;
+	u32 tt_en = hisi_get_tt_en(event);
+	u32 ch = hisi_get_ch(event);
+	u32 reg;
+
+	if (!ch)
+		ch = NOC_PMU_CH_DEFAULT;
+
+	reg = readl(noc_pmu->base + NOC_PMU_EVENT_CTRLn(reg_info->event_ctrl0, hwc->idx));
+	reg &= ~NOC_PMU_EVENT_CTRL_CHANNEL;
+	reg |= FIELD_PREP(NOC_PMU_EVENT_CTRL_CHANNEL, ch);
+	writel(reg, noc_pmu->base + NOC_PMU_EVENT_CTRLn(reg_info->event_ctrl0, hwc->idx));
+
+	/*
+	 * Since tracetag filter applies to all the counters, don't touch it
+	 * if user doesn't specify it explicitly.
+	 */
+	if (tt_en) {
+		reg = readl(noc_pmu->base + reg_info->pmu_ctrl);
+		reg |= NOC_PMU_GLOBAL_CTRL_TT_EN;
+		writel(reg, noc_pmu->base + reg_info->pmu_ctrl);
+	}
+}
+
+static void hisi_noc_pmu_disable_filter(struct perf_event *event)
+{
+	struct hisi_pmu *noc_pmu = to_hisi_pmu(event->pmu);
+	struct hisi_noc_pmu_regs *reg_info = noc_pmu->dev_info->private;
+	u32 tt_en = hisi_get_tt_en(event);
+	u32 reg;
+
+	/*
+	 * If we're not the last counter, don't touch the global tracetag
+	 * configuration.
+	 */
+	if (bitmap_weight(noc_pmu->pmu_events.used_mask, noc_pmu->num_counters) > 1)
+		return;
+
+	if (tt_en) {
+		reg = readl(noc_pmu->base + reg_info->pmu_ctrl);
+		reg &= ~NOC_PMU_GLOBAL_CTRL_TT_EN;
+		writel(reg, noc_pmu->base + reg_info->pmu_ctrl);
+	}
+}
+
+static const struct hisi_uncore_ops hisi_uncore_noc_ops = {
+	.write_evtype		= hisi_noc_pmu_write_evtype,
+	.get_event_idx		= hisi_noc_pmu_get_event_idx,
+	.read_counter		= hisi_noc_pmu_read_counter,
+	.write_counter		= hisi_noc_pmu_write_counter,
+	.enable_counter		= hisi_noc_pmu_enable_counter,
+	.disable_counter	= hisi_noc_pmu_disable_counter,
+	.enable_counter_int	= hisi_noc_pmu_enable_counter_int,
+	.disable_counter_int	= hisi_noc_pmu_disable_counter_int,
+	.start_counters		= hisi_noc_pmu_start_counters,
+	.stop_counters		= hisi_noc_pmu_stop_counters,
+	.get_int_status		= hisi_noc_pmu_get_int_status,
+	.clear_int_status	= hisi_noc_pmu_clear_int_status,
+	.enable_filter		= hisi_noc_pmu_enable_filter,
+	.disable_filter		= hisi_noc_pmu_disable_filter,
+};
+
+static struct attribute *hisi_noc_pmu_format_attrs[] = {
+	HISI_PMU_FORMAT_ATTR(event, "config:0-7"),
+	HISI_PMU_FORMAT_ATTR(ch, "config1:0-2"),
+	HISI_PMU_FORMAT_ATTR(tt_en, "config1:3"),
+	NULL
+};
+
+static const struct attribute_group hisi_noc_pmu_format_group = {
+	.name = "format",
+	.attrs = hisi_noc_pmu_format_attrs,
+};
+
+static struct attribute *hisi_noc_pmu_events_attrs[] = {
+	HISI_PMU_EVENT_ATTR(cycles, 0x0e),
+	/* Flux on/off the ring */
+	HISI_PMU_EVENT_ATTR(ingress_flow_sum, 0x1a),
+	HISI_PMU_EVENT_ATTR(egress_flow_sum, 0x17),
+	/* Buffer full duration on/off the ring */
+	HISI_PMU_EVENT_ATTR(ingress_buf_full, 0x19),
+	HISI_PMU_EVENT_ATTR(egress_buf_full, 0x12),
+	/* Failure packets count on/off the ring */
+	HISI_PMU_EVENT_ATTR(cw_ingress_fail, 0x01),
+	HISI_PMU_EVENT_ATTR(cc_ingress_fail, 0x09),
+	HISI_PMU_EVENT_ATTR(cw_egress_fail, 0x03),
+	HISI_PMU_EVENT_ATTR(cc_egress_fail, 0x0b),
+	/* Flux of the ring */
+	HISI_PMU_EVENT_ATTR(cw_main_flow_sum, 0x05),
+	HISI_PMU_EVENT_ATTR(cc_main_flow_sum, 0x0d),
+	NULL
+};
+
+static const struct attribute_group hisi_noc_pmu_events_group = {
+	.name = "events",
+	.attrs = hisi_noc_pmu_events_attrs,
+};
+
+static const struct attribute_group *hisi_noc_pmu_attr_groups[] = {
+	&hisi_noc_pmu_format_group,
+	&hisi_noc_pmu_events_group,
+	&hisi_pmu_cpumask_attr_group,
+	&hisi_pmu_identifier_group,
+	NULL
+};
+
+static int hisi_noc_pmu_dev_init(struct platform_device *pdev, struct hisi_pmu *noc_pmu)
+{
+	struct hisi_noc_pmu_regs *reg_info;
+
+	hisi_uncore_pmu_init_topology(noc_pmu, &pdev->dev);
+
+	if (noc_pmu->topo.scl_id < 0)
+		return dev_err_probe(&pdev->dev, -EINVAL, "failed to get scl-id\n");
+
+	if (noc_pmu->topo.index_id < 0)
+		return dev_err_probe(&pdev->dev, -EINVAL, "failed to get idx-id\n");
+
+	if (noc_pmu->topo.sub_id < 0)
+		return dev_err_probe(&pdev->dev, -EINVAL, "failed to get sub-id\n");
+
+	noc_pmu->base = devm_platform_ioremap_resource(pdev, 0);
+	if (IS_ERR(noc_pmu->base))
+		return dev_err_probe(&pdev->dev, PTR_ERR(noc_pmu->base),
+				     "fail to remap io memory\n");
+
+	noc_pmu->dev_info = device_get_match_data(&pdev->dev);
+	if (!noc_pmu->dev_info)
+		return -ENODEV;
+
+	noc_pmu->pmu_events.attr_groups = noc_pmu->dev_info->attr_groups;
+	noc_pmu->counter_bits = noc_pmu->dev_info->counter_bits;
+	noc_pmu->check_event = noc_pmu->dev_info->check_event;
+	noc_pmu->num_counters = NOC_PMU_NR_COUNTERS;
+	noc_pmu->ops = &hisi_uncore_noc_ops;
+	noc_pmu->dev = &pdev->dev;
+	noc_pmu->on_cpu = -1;
+
+	reg_info = noc_pmu->dev_info->private;
+	noc_pmu->identifier = readl(noc_pmu->base + reg_info->version);
+
+	return 0;
+}
+
+static void hisi_noc_pmu_remove_cpuhp_instance(void *hotplug_node)
+{
+	cpuhp_state_remove_instance_nocalls(hisi_noc_pmu_cpuhp_state, hotplug_node);
+}
+
+static void hisi_noc_pmu_unregister_pmu(void *pmu)
+{
+	perf_pmu_unregister(pmu);
+}
+
+static int hisi_noc_pmu_probe(struct platform_device *pdev)
+{
+	struct device *dev = &pdev->dev;
+	struct hisi_pmu *noc_pmu;
+	char *name;
+	int ret;
+
+	noc_pmu = devm_kzalloc(dev, sizeof(*noc_pmu), GFP_KERNEL);
+	if (!noc_pmu)
+		return -ENOMEM;
+
+	/*
+	 * HiSilicon Uncore PMU framework needs to get common hisi_pmu device
+	 * from device's drvdata.
+	 */
+	platform_set_drvdata(pdev, noc_pmu);
+
+	ret = hisi_noc_pmu_dev_init(pdev, noc_pmu);
+	if (ret)
+		return ret;
+
+	ret = cpuhp_state_add_instance(hisi_noc_pmu_cpuhp_state, &noc_pmu->node);
+	if (ret)
+		return dev_err_probe(dev, ret, "Fail to register cpuhp instance\n");
+
+	ret = devm_add_action_or_reset(dev, hisi_noc_pmu_remove_cpuhp_instance,
+				       &noc_pmu->node);
+	if (ret)
+		return ret;
+
+	hisi_pmu_init(noc_pmu, THIS_MODULE);
+
+	name = devm_kasprintf(dev, GFP_KERNEL, "hisi_scl%d_noc%d_%d",
+			      noc_pmu->topo.scl_id, noc_pmu->topo.index_id,
+			      noc_pmu->topo.sub_id);
+	if (!name)
+		return -ENOMEM;
+
+	ret = perf_pmu_register(&noc_pmu->pmu, name, -1);
+	if (ret)
+		return dev_err_probe(dev, ret, "Fail to register PMU\n");
+
+	return devm_add_action_or_reset(dev, hisi_noc_pmu_unregister_pmu,
+					&noc_pmu->pmu);
+}
+
+static struct hisi_noc_pmu_regs hisi_noc_v1_pmu_regs = {
+	.version = NOC_PMU_VERSION,
+	.pmu_ctrl = NOC_PMU_GLOBAL_CTRL,
+	.event_ctrl0 = NOC_PMU_EVENT_CTRL0,
+	.event_cntr0 = NOC_PMU_EVENT_COUNTER0,
+	.overflow_status = NOC_PMU_CNT_INFO,
+};
+
+static const struct hisi_pmu_dev_info hisi_noc_v1 = {
+	.attr_groups = hisi_noc_pmu_attr_groups,
+	.counter_bits = 64,
+	.check_event = NOC_PMU_EVENT_CTRL_TYPE,
+	.private = &hisi_noc_v1_pmu_regs,
+};
+
+static const struct acpi_device_id hisi_noc_pmu_ids[] = {
+	{ "HISI04E0", (kernel_ulong_t) &hisi_noc_v1 },
+	{ }
+};
+MODULE_DEVICE_TABLE(acpi, hisi_noc_pmu_ids);
+
+static struct platform_driver hisi_noc_pmu_driver = {
+	.driver = {
+		.name = "hisi_noc_pmu",
+		.acpi_match_table = hisi_noc_pmu_ids,
+		.suppress_bind_attrs = true,
+	},
+	.probe = hisi_noc_pmu_probe,
+};
+
+static int __init hisi_noc_pmu_module_init(void)
+{
+	int ret;
+
+	ret = cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN, "perf/hisi/noc:online",
+				      hisi_uncore_pmu_online_cpu,
+				      hisi_uncore_pmu_offline_cpu);
+	if (ret < 0) {
+		pr_err("hisi_noc_pmu: Fail to setup cpuhp callbacks, ret = %d\n", ret);
+		return ret;
+	}
+	hisi_noc_pmu_cpuhp_state = ret;
+
+	ret = platform_driver_register(&hisi_noc_pmu_driver);
+	if (ret)
+		cpuhp_remove_multi_state(hisi_noc_pmu_cpuhp_state);
+
+	return ret;
+}
+module_init(hisi_noc_pmu_module_init);
+
+static void __exit hisi_noc_pmu_module_exit(void)
+{
+	platform_driver_unregister(&hisi_noc_pmu_driver);
+	cpuhp_remove_multi_state(hisi_noc_pmu_cpuhp_state);
+}
+module_exit(hisi_noc_pmu_module_exit);
+
+MODULE_IMPORT_NS("HISI_PMU");
+MODULE_DESCRIPTION("HiSilicon SoC Uncore NoC PMU driver");
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Yicong Yang <yangyicong@hisilicon.com>");
+3 -2
drivers/perf/hisilicon/hisi_uncore_pmu.c
···
 	clear_bit(idx, hisi_pmu->pmu_events.used_mask);
 }

-static irqreturn_t hisi_uncore_pmu_isr(int irq, void *data)
+irqreturn_t hisi_uncore_pmu_isr(int irq, void *data)
 {
 	struct hisi_pmu *hisi_pmu = data;
 	struct perf_event *event;
···

 	return IRQ_HANDLED;
 }
+EXPORT_SYMBOL_NS_GPL(hisi_uncore_pmu_isr, "HISI_PMU");

 int hisi_uncore_pmu_init_irq(struct hisi_pmu *hisi_pmu,
 			     struct platform_device *pdev)
···
 		return -EINVAL;

 	hisi_pmu = to_hisi_pmu(event->pmu);
-	if (event->attr.config > hisi_pmu->check_event)
+	if ((event->attr.config & HISI_EVENTID_MASK) > hisi_pmu->check_event)
 		return -EINVAL;

 	if (hisi_pmu->on_cpu == -1)
+4 -2
drivers/perf/hisilicon/hisi_uncore_pmu.h
···
 #define pr_fmt(fmt) "hisi_pmu: " fmt

 #define HISI_PMU_V2 0x30
-#define HISI_MAX_COUNTERS 0x10
+#define HISI_MAX_COUNTERS 0x18
 #define to_hisi_pmu(p) (container_of(p, struct hisi_pmu, pmu))

 #define HISI_PMU_ATTR(_name, _func, _config) \
···
 		return FIELD_GET(GENMASK_ULL(hi, lo), event->attr.config); \
 	}

-#define HISI_GET_EVENTID(ev) (ev->hw.config_base & 0xff)
+#define HISI_EVENTID_MASK GENMASK(7, 0)
+#define HISI_GET_EVENTID(ev) ((ev)->hw.config_base & HISI_EVENTID_MASK)

 #define HISI_PMU_EVTYPE_BITS 8
 #define HISI_PMU_EVTYPE_SHIFT(idx) ((idx) % 4 * HISI_PMU_EVTYPE_BITS)
···
 ssize_t hisi_uncore_pmu_identifier_attr_show(struct device *dev,
 					     struct device_attribute *attr,
 					     char *page);
+irqreturn_t hisi_uncore_pmu_isr(int irq, void *data);
 int hisi_uncore_pmu_init_irq(struct hisi_pmu *hisi_pmu,
 			     struct platform_device *pdev);
 void hisi_uncore_pmu_init_topology(struct hisi_pmu *hisi_pmu, struct device *dev);
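Note on the event check: with HISI_EVENTID_MASK in place, hisi_uncore_pmu.c compares only the low eight bits of attr.config against check_event, so it appears that bits above bit 7 no longer influence the range check. A minimal userspace sketch of that comparison follows; EVENTID_MASK and event_id_valid() are hypothetical stand-ins for illustration, not kernel symbols.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for HISI_EVENTID_MASK, i.e. GENMASK(7, 0). */
#define EVENTID_MASK 0xffULL

/*
 * Return true when the event ID held in the low 8 bits of config does not
 * exceed max_event. Bits above bit 7 are ignored by the comparison.
 */
static bool event_id_valid(uint64_t config, uint64_t max_event)
{
	return (config & EVENTID_MASK) <= max_event;
}
```

Under the old unmasked comparison, a config with any bit set above bit 7 would have exceeded an 8-bit check_event and been rejected.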
+1 -1
drivers/virt/coco/efi_secret/Kconfig
···
 # SPDX-License-Identifier: GPL-2.0-only
 config EFI_SECRET
 	tristate "EFI secret area securityfs support"
-	depends on EFI && X86_64
+	depends on EFI && (X86_64 || ARM64)
 	select EFI_COCO_SECRET
 	select SECURITYFS
 	help
+3
include/linux/pagewalk.h
···
 int walk_kernel_page_table_range(unsigned long start,
 		unsigned long end, const struct mm_walk_ops *ops,
 		pgd_t *pgd, void *private);
+int walk_kernel_page_table_range_lockless(unsigned long start,
+		unsigned long end, const struct mm_walk_ops *ops,
+		pgd_t *pgd, void *private);
 int walk_page_range_vma(struct vm_area_struct *vma, unsigned long start,
 			unsigned long end, const struct mm_walk_ops *ops,
 			void *private);
+15 -1
kernel/entry/common.c
···
 	return ret;
 }

+/**
+ * arch_irqentry_exit_need_resched - Architecture specific need resched function
+ *
+ * Invoked from raw_irqentry_exit_cond_resched() to check if resched is needed.
+ * Defaults return true.
+ *
+ * The main purpose is to permit arch to avoid preemption of a task from an IRQ.
+ */
+static inline bool arch_irqentry_exit_need_resched(void);
+
+#ifndef arch_irqentry_exit_need_resched
+static inline bool arch_irqentry_exit_need_resched(void) { return true; }
+#endif
+
 void raw_irqentry_exit_cond_resched(void)
 {
 	if (!preempt_count()) {
···
 		rcu_irq_exit_check_preempt();
 		if (IS_ENABLED(CONFIG_DEBUG_ENTRY))
 			WARN_ON_ONCE(!on_thread_stack());
-		if (need_resched() && arch_irqentry_exit_need_resched())
+		if (need_resched() && arch_irqentry_exit_need_resched())
 			preempt_schedule_irq();
 	}
 }
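Note on the hook: the #ifndef guard means the generic default is compiled only when the architecture has not defined a macro named arch_irqentry_exit_need_resched. The sketch below shows one way such an override can be wired up in plain C. It assumes the common idiom of defining a macro with the same name as the function; arch_need_resched_hook, demo_preempt_blocked and should_preempt() are hypothetical names, not kernel symbols.

```c
#include <stdbool.h>

/* Hypothetical arch-side state: non-zero means "do not preempt from IRQ". */
static int demo_preempt_blocked;

/*
 * Hypothetical "architecture" override. Deleting this function and the
 * #define below makes the generic default further down take effect.
 */
static inline bool arch_need_resched_hook(void)
{
	return !demo_preempt_blocked;
}
#define arch_need_resched_hook arch_need_resched_hook

/* Generic code: supply a default only when no override macro exists. */
#ifndef arch_need_resched_hook
static inline bool arch_need_resched_hook(void) { return true; }
#endif

/* Same shape as the updated condition: both checks must agree. */
static bool should_preempt(bool need_resched)
{
	return need_resched && arch_need_resched_hook();
}
```

Because the default returns true, architectures that do not provide an override keep the previous behaviour.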
+1 -1
kernel/events/uprobes.c
···

 static void uprobe_warn(struct task_struct *t, const char *msg)
 {
-	pr_warn("uprobe: %s:%d failed to %s\n", current->comm, current->pid, msg);
+	pr_warn("uprobe: %s:%d failed to %s\n", t->comm, t->pid, msg);
 }

 /*
+24 -12
mm/pagewalk.c
···
 int walk_kernel_page_table_range(unsigned long start, unsigned long end,
 		const struct mm_walk_ops *ops, pgd_t *pgd, void *private)
 {
-	struct mm_struct *mm = &init_mm;
+	/*
+	 * Kernel intermediate page tables are usually not freed, so the mmap
+	 * read lock is sufficient. But there are some exceptions.
+	 * E.g. memory hot-remove. In which case, the mmap lock is insufficient
+	 * to prevent the intermediate kernel pages tables belonging to the
+	 * specified address range from being freed. The caller should take
+	 * other actions to prevent this race.
+	 */
+	mmap_assert_locked(&init_mm);
+
+	return walk_kernel_page_table_range_lockless(start, end, ops, pgd,
+			private);
+}
+
+/*
+ * Use this function to walk the kernel page tables locklessly. It should be
+ * guaranteed that the caller has exclusive access over the range they are
+ * operating on - that there should be no concurrent access, for example,
+ * changing permissions for vmalloc objects.
+ */
+int walk_kernel_page_table_range_lockless(unsigned long start, unsigned long end,
+		const struct mm_walk_ops *ops, pgd_t *pgd, void *private)
+{
 	struct mm_walk walk = {
 		.ops = ops,
-		.mm = mm,
+		.mm = &init_mm,
 		.pgd = pgd,
 		.private = private,
 		.no_vma = true
···
 		return -EINVAL;
 	if (!check_ops_valid(ops))
 		return -EINVAL;
-
-	/*
-	 * Kernel intermediate page tables are usually not freed, so the mmap
-	 * read lock is sufficient. But there are some exceptions.
-	 * E.g. memory hot-remove. In which case, the mmap lock is insufficient
-	 * to prevent the intermediate kernel pages tables belonging to the
-	 * specified address range from being freed. The caller should take
-	 * other actions to prevent this race.
-	 */
-	mmap_assert_locked(mm);

 	return walk_pgd_range(start, end, &walk);
 }
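Note on the split: the locked entry point now only asserts the lock and delegates, while the lockless variant does the validation and the walk. A rough userspace sketch of that wrapper-over-worker shape follows. The flag demo_lock_held stands in for the mmap lock state, the -1 return stands in for -EINVAL, and the range check is illustrative only; all names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for "the mmap lock of init_mm is held". */
static bool demo_lock_held;

/*
 * Worker: performs validation and the walk itself, with no lock check.
 * The caller is responsible for exclusive access to the range.
 */
static int walk_range_lockless(unsigned long start, unsigned long end)
{
	if (start >= end)
		return -1;	/* stands in for -EINVAL */
	return 0;		/* the actual walk would happen here */
}

/* Wrapper: assert the locking precondition, then delegate to the worker. */
static int walk_range(unsigned long start, unsigned long end)
{
	assert(demo_lock_held);
	return walk_range_lockless(start, end);
}
```

Keeping the assertion in the wrapper lets existing callers keep their locking contract while new callers that already have exclusive access can skip it.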
+21 -1
tools/testing/selftests/arm64/abi/hwcap.c
···
 #include <asm/sigcontext.h>
 #include <asm/unistd.h>

+#include <linux/auxvec.h>
+
 #include "../../kselftest.h"

 #define TESTS_PER_HWCAP 3
···
 {
 	/* Not implemented, too complicated and unreliable anyway */
 }
-

 static void crc32_sigill(void)
 {
···
 		     : "+r" (memp), "+r" (val0), "+r" (val1)
 		     :
 		     : "cc", "memory");
+}
+
+static void lsfe_sigill(void)
+{
+	float __attribute__ ((aligned (16))) mem;
+	register float *memp asm ("x0") = &mem;
+
+	/* STFADD H0, [X0] */
+	asm volatile(".inst 0x7c20801f"
+		     : "+r" (memp)
+		     :
+		     : "memory");
 }

 static void lut_sigill(void)
···
 		.hwcap_bit = HWCAP2_LSE128,
 		.cpuinfo = "lse128",
 		.sigill_fn = lse128_sigill,
+	},
+	{
+		.name = "LSFE",
+		.at_hwcap = AT_HWCAP3,
+		.hwcap_bit = HWCAP3_LSFE,
+		.cpuinfo = "lsfe",
+		.sigill_fn = lsfe_sigill,
 	},
 	{
 		.name = "LUT",
+4 -4
tools/testing/selftests/arm64/abi/tpidr2.c
···
 	ret = open("/proc/sys/abi/sme_default_vector_length", O_RDONLY, 0);
 	if (ret >= 0) {
 		ksft_test_result(default_value(), "default_value\n");
-		ksft_test_result(write_read, "write_read\n");
-		ksft_test_result(write_sleep_read, "write_sleep_read\n");
-		ksft_test_result(write_fork_read, "write_fork_read\n");
-		ksft_test_result(write_clone_read, "write_clone_read\n");
+		ksft_test_result(write_read(), "write_read\n");
+		ksft_test_result(write_sleep_read(), "write_sleep_read\n");
+		ksft_test_result(write_fork_read(), "write_fork_read\n");
+		ksft_test_result(write_clone_read(), "write_clone_read\n");

 	} else {
 		ksft_print_msg("SME support not present\n");
-1
tools/testing/selftests/arm64/bti/assembler.h
···
 #define GNU_PROPERTY_AARCH64_FEATURE_1_BTI (1U << 0)
 #define GNU_PROPERTY_AARCH64_FEATURE_1_PAC (1U << 1)

-
 .macro startfn name:req
 	.globl \name
 \name:
-1
tools/testing/selftests/arm64/fp/fp-ptrace.c
···
 					      &test_config);
 		}
 	}
-
 }

 static void run_sme_tests(void)
+3 -3
tools/testing/selftests/arm64/fp/fp-stress.c
···

 		/*
 		 * Read from the startup pipe, there should be no data
-		 * and we should block until it is closed. We just
-		 * carry on on error since this isn't super critical.
+		 * and we should block until it is closed. We just
+		 * carry-on on error since this isn't super critical.
 		 */
 		ret = read(3, &i, sizeof(i));
 		if (ret < 0)
···

 	evs = calloc(tests, sizeof(*evs));
 	if (!evs)
-		ksft_exit_fail_msg("Failed to allocated %d epoll events\n",
+		ksft_exit_fail_msg("Failed to allocate %d epoll events\n",
 				   tests);

 	for (i = 0; i < cpus; i++) {
+2 -2
tools/testing/selftests/arm64/fp/kernel-test.c
···

 	ref = malloc(digest_len);
 	if (!ref) {
-		printf("Failed to allocated %d byte reference\n", digest_len);
+		printf("Failed to allocate %d byte reference\n", digest_len);
 		return false;
 	}

 	digest = malloc(digest_len);
 	if (!digest) {
-		printf("Failed to allocated %d byte digest\n", digest_len);
+		printf("Failed to allocate %d byte digest\n", digest_len);
 		return false;
 	}
+98 -6
tools/testing/selftests/arm64/fp/sve-ptrace.c
···
 };

 #define VL_TESTS (((TEST_VQ_MAX - SVE_VQ_MIN) + 1) * 4)
-#define FLAG_TESTS 2
+#define FLAG_TESTS 4
 #define FPSIMD_TESTS 2

 #define EXPECTED_TESTS ((VL_TESTS + FLAG_TESTS + FPSIMD_TESTS) * ARRAY_SIZE(vec_types))
···
 static int get_fpsimd(pid_t pid, struct user_fpsimd_state *fpsimd)
 {
 	struct iovec iov;
+	int ret;

 	iov.iov_base = fpsimd;
 	iov.iov_len = sizeof(*fpsimd);
-	return ptrace(PTRACE_GETREGSET, pid, NT_PRFPREG, &iov);
+	ret = ptrace(PTRACE_GETREGSET, pid, NT_PRFPREG, &iov);
+	if (ret == -1)
+		ksft_perror("ptrace(PTRACE_GETREGSET)");
+	return ret;
 }

 static int set_fpsimd(pid_t pid, struct user_fpsimd_state *fpsimd)
 {
 	struct iovec iov;
+	int ret;

 	iov.iov_base = fpsimd;
 	iov.iov_len = sizeof(*fpsimd);
-	return ptrace(PTRACE_SETREGSET, pid, NT_PRFPREG, &iov);
+	ret = ptrace(PTRACE_SETREGSET, pid, NT_PRFPREG, &iov);
+	if (ret == -1)
+		ksft_perror("ptrace(PTRACE_SETREGSET)");
+	return ret;
 }

 static struct user_sve_header *get_sve(pid_t pid, const struct vec_type *type,
···
 {
 	struct user_sve_header *sve;
 	void *p;
-	size_t sz = sizeof *sve;
+	size_t sz = sizeof(*sve);
 	struct iovec iov;
+	int ret;

 	while (1) {
 		if (*size < sz) {
···

 		iov.iov_base = *buf;
 		iov.iov_len = sz;
-		if (ptrace(PTRACE_GETREGSET, pid, type->regset, &iov))
+		ret = ptrace(PTRACE_GETREGSET, pid, type->regset, &iov);
+		if (ret) {
+			ksft_perror("ptrace(PTRACE_GETREGSET)");
 			goto error;
+		}

 		sve = *buf;
 		if (sve->size <= sz)
···
 		   const struct user_sve_header *sve)
 {
 	struct iovec iov;
+	int ret;

 	iov.iov_base = (void *)sve;
 	iov.iov_len = sve->size;
-	return ptrace(PTRACE_SETREGSET, pid, type->regset, &iov);
+	ret = ptrace(PTRACE_SETREGSET, pid, type->regset, &iov);
+	if (ret == -1)
+		ksft_perror("ptrace(PTRACE_SETREGSET)");
+	return ret;
+}
+
+/* A read operation fails */
+static void read_fails(pid_t child, const struct vec_type *type)
+{
+	struct user_sve_header *new_sve = NULL;
+	size_t new_sve_size = 0;
+	void *ret;
+
+	ret = get_sve(child, type, (void **)&new_sve, &new_sve_size);
+
+	ksft_test_result(ret == NULL, "%s unsupported read fails\n",
+			 type->name);
+
+	free(new_sve);
+}
+
+/* A write operation fails */
+static void write_fails(pid_t child, const struct vec_type *type)
+{
+	struct user_sve_header sve;
+	int ret;
+
+	/* Just the header, no data */
+	memset(&sve, 0, sizeof(sve));
+	sve.size = sizeof(sve);
+	sve.flags = SVE_PT_REGS_SVE;
+	sve.vl = SVE_VL_MIN;
+	ret = set_sve(child, type, &sve);
+
+	ksft_test_result(ret != 0, "%s unsupported write fails\n",
+			 type->name);
 }

 /* Validate setting and getting the inherit flag */
···
 		       vl, reg, *in, *out);
 		(*errors)++;
 	}
+}
+
+/* Set out of range VLs */
+static void ptrace_set_vl_ranges(pid_t child, const struct vec_type *type)
+{
+	struct user_sve_header sve;
+	int ret;
+
+	memset(&sve, 0, sizeof(sve));
+	sve.flags = SVE_PT_REGS_SVE;
+	sve.size = sizeof(sve);
+
+	ret = set_sve(child, type, &sve);
+	ksft_test_result(ret != 0, "%s Set invalid VL 0\n", type->name);
+
+	sve.vl = SVE_VL_MAX + SVE_VQ_BYTES;
+	ret = set_sve(child, type, &sve);
+	ksft_test_result(ret != 0, "%s Set invalid VL %d\n", type->name,
+			 SVE_VL_MAX + SVE_VQ_BYTES);
 }

 /* Access the FPSIMD registers via the SVE regset */
···
 	}

 	for (i = 0; i < ARRAY_SIZE(vec_types); i++) {
+		/*
+		 * If the vector type isn't supported reads and writes
+		 * should fail.
+		 */
+		if (!(getauxval(vec_types[i].hwcap_type) & vec_types[i].hwcap)) {
+			read_fails(child, &vec_types[i]);
+			write_fails(child, &vec_types[i]);
+		} else {
+			ksft_test_result_skip("%s unsupported read fails\n",
+					      vec_types[i].name);
+			ksft_test_result_skip("%s unsupported write fails\n",
+					      vec_types[i].name);
+		}
+
 		/* FPSIMD via SVE regset */
 		if (getauxval(vec_types[i].hwcap_type) & vec_types[i].hwcap) {
 			ptrace_sve_fpsimd(child, &vec_types[i]);
···
 					      vec_types[i].name);
 			ksft_test_result_skip("%s SVE_PT_VL_INHERIT cleared\n",
 					      vec_types[i].name);
+		}
+
+		/* Setting out of bounds VLs should fail */
+		if (getauxval(vec_types[i].hwcap_type) & vec_types[i].hwcap) {
+			ptrace_set_vl_ranges(child, &vec_types[i]);
+		} else {
+			ksft_test_result_skip("%s Set invalid VL 0\n",
+					      vec_types[i].name);
+			ksft_test_result_skip("%s Set invalid VL %d\n",
+					      vec_types[i].name,
+					      SVE_VL_MAX + SVE_VQ_BYTES);
 		}

 		/* Step through every possible VQ */
-1
tools/testing/selftests/arm64/fp/vec-syscfg.c
··· 690 690 asm volatile("msr S0_3_C4_C6_3, xzr"); 691 691 } 692 692 693 - 694 693 /* 695 694 * Verify we can change the SVE vector length while SME is active and 696 695 * continue to use SME afterwards.
-1
tools/testing/selftests/arm64/fp/zt-ptrace.c
··· 108 108 return ptrace(PTRACE_GETREGSET, pid, NT_ARM_ZT, &iov); 109 109 } 110 110 111 - 112 111 static int set_zt(pid_t pid, const char zt[ZT_SIG_REG_BYTES]) 113 112 { 114 113 struct iovec iov;
+3 -3
tools/testing/selftests/arm64/gcs/Makefile
···
 include ../../lib.mk

 $(OUTPUT)/basic-gcs: basic-gcs.c
-	$(CC) -g -fno-asynchronous-unwind-tables -fno-ident -s -Os -nostdlib \
-		-static -include ../../../../include/nolibc/nolibc.h \
+	$(CC) $(CFLAGS) -fno-asynchronous-unwind-tables -fno-ident -s -nostdlib -nostdinc \
+		-static -I../../../../include/nolibc -include ../../../../include/nolibc/nolibc.h \
 		-I../../../../../usr/include \
 		-std=gnu99 -I../.. -g \
-		-ffreestanding -Wall $^ -o $@ -lgcc
+		-ffreestanding $^ -o $@ -lgcc

 $(OUTPUT)/gcs-stress-thread: gcs-stress-thread.S
 	$(CC) -nostdlib $^ -o $@
+6 -6
tools/testing/selftests/arm64/gcs/basic-gcs.c
···

 #include <sys/mman.h>
 #include <asm/mman.h>
+#include <asm/hwcap.h>
 #include <linux/sched.h>

 #include "kselftest.h"
···

 	ksft_print_header();

-	/*
-	 * We don't have getauxval() with nolibc so treat a failure to
-	 * read GCS state as a lack of support and skip.
-	 */
+	if (!(getauxval(AT_HWCAP) & HWCAP_GCS))
+		ksft_exit_skip("SKIP GCS not supported\n");
+
 	ret = my_syscall5(__NR_prctl, PR_GET_SHADOW_STACK_STATUS,
 			  &gcs_mode, 0, 0, 0);
 	if (ret != 0)
-		ksft_exit_skip("Failed to read GCS state: %d\n", ret);
+		ksft_exit_fail_msg("Failed to read GCS state: %d\n", ret);

 	if (!(gcs_mode & PR_SHADOW_STACK_ENABLE)) {
 		gcs_mode = PR_SHADOW_STACK_ENABLE;
···
 	}

 	/* One last test: disable GCS, we can do this one time */
-	my_syscall5(__NR_prctl, PR_SET_SHADOW_STACK_STATUS, 0, 0, 0, 0);
+	ret = my_syscall5(__NR_prctl, PR_SET_SHADOW_STACK_STATUS, 0, 0, 0, 0);
 	if (ret != 0)
 		ksft_print_msg("Failed to disable GCS: %d\n", ret);
-1
tools/testing/selftests/arm64/gcs/gcs-locking.c
···
 	ASSERT_EQ(ret, 0);
 	ASSERT_EQ(mode, PR_SHADOW_STACK_ALL_MODES);

-
 	ret = my_syscall2(__NR_prctl, PR_SET_SHADOW_STACK_STATUS,
 			  variant->mode);
 	ASSERT_EQ(ret, 0);
+1 -1
tools/testing/selftests/arm64/gcs/gcs-stress.c
···

 	evs = calloc(tests, sizeof(*evs));
 	if (!evs)
-		ksft_exit_fail_msg("Failed to allocated %d epoll events\n",
+		ksft_exit_fail_msg("Failed to allocate %d epoll events\n",
 				   tests);

 	for (i = 0; i < gcs_threads; i++)
+6 -1
tools/testing/selftests/arm64/pauth/exec_target.c
···
 	unsigned long hwcaps;
 	size_t val;

-	fread(&val, sizeof(size_t), 1, stdin);
+	size_t size = fread(&val, sizeof(size_t), 1, stdin);
+
+	if (size != 1) {
+		fprintf(stderr, "Could not read input from stdin\n");
+		return EXIT_FAILURE;
+	}

 	/* don't try to execute illegal (unimplemented) instructions) caller
 	 * should have checked this and keep worker simple
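Note on the fread() change: fread() returns the number of complete items read, so a result other than 1 here means val was not fully populated. A small sketch of the same check factored into a helper follows; read_size() is a hypothetical name used only for illustration and is not part of exec_target.c.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Read exactly one size_t from the stream. Returns true only when a
 * complete item was read; on a short read *val must not be trusted.
 */
static bool read_size(FILE *in, size_t *val)
{
	return fread(val, sizeof(*val), 1, in) == 1;
}
```

A caller would typically report the failure on stderr and exit with a non-zero status, as the patched test does.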