Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Will Deacon:
"Highlights include a major rework of our kPTI page-table rewriting
code (which makes it both more maintainable and considerably faster in
the cases where it is required) as well as significant changes to our
early boot code to reduce the need for data cache maintenance and
greatly simplify the KASLR relocation dance.

Summary:

- Remove unused generic cpuidle support (replaced by PSCI version)

- Fix documentation describing the kernel virtual address space

- Handling of some new CPU errata in Arm implementations

- Rework of our exception table code in preparation for handling
machine checks (i.e. RAS errors) more gracefully

- Switch over to the generic implementation of ioremap()

- Fix lockdep tracking in NMI context

- Instrument our memory barrier macros for KCSAN

- Rework of the kPTI G->nG page-table repainting so that the MMU
remains enabled and the boot time is no longer slowed to a crawl
for systems which require the late remapping

- Enable support for direct swapping of 2MiB transparent huge-pages
on systems without MTE

- Fix handling of MTE tags when allocating new pages with HW KASAN

- Expose the SMIDR register to userspace via sysfs

- Continued rework of the stack unwinder, particularly improving the
behaviour under KASAN

- More repainting of our system register definitions to match the
architectural terminology

- Improvements to the layout of the vDSO objects

- Support for allocating additional bits of HWCAP2 and exposing
FEAT_EBF16 to userspace on CPUs that support it

- Considerable rework and optimisation of our early boot code to
reduce the need for cache maintenance and avoid jumping in and out
of the kernel when handling relocation under KASLR

- Support for disabling SVE and SME support on the kernel
command-line

- Support for the Hisilicon HNS3 PMU

- Miscellaneous cleanups, trivial updates and minor fixes"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (136 commits)
arm64: Delay initialisation of cpuinfo_arm64::reg_{zcr,smcr}
arm64: fix KASAN_INLINE
arm64/hwcap: Support FEAT_EBF16
arm64/cpufeature: Store elf_hwcaps as a bitmap rather than unsigned long
arm64/hwcap: Document allocation of upper bits of AT_HWCAP
arm64: enable THP_SWAP for arm64
arm64/mm: use GENMASK_ULL for TTBR_BADDR_MASK_52
arm64: errata: Remove AES hwcap for COMPAT tasks
arm64: numa: Don't check node against MAX_NUMNODES
drivers/perf: arm_spe: Fix consistency of SYS_PMSCR_EL1.CX
perf: RISC-V: Add of_node_put() when breaking out of for_each_of_cpu_node()
docs: perf: Include hns3-pmu.rst in toctree to fix 'htmldocs' WARNING
arm64: kasan: Revert "arm64: mte: reset the page tag in page->flags"
mm: kasan: Skip page unpoisoning only if __GFP_SKIP_KASAN_UNPOISON
mm: kasan: Skip unpoisoning of user pages
mm: kasan: Ensure the tags are visible before the tag in page->flags
drivers/perf: hisi: add driver for HNS3 PMU
drivers/perf: hisi: Add description for HNS3 PMU driver
drivers/perf: riscv_pmu_sbi: perf format
perf/arm-cci: Use the bitmap API to allocate bitmaps
...

+3972 -1628
+2 -1
Documentation/ABI/testing/sysfs-devices-system-cpu
···
 /sys/devices/system/cpu/cpuX/regs/identification/
 /sys/devices/system/cpu/cpuX/regs/identification/midr_el1
 /sys/devices/system/cpu/cpuX/regs/identification/revidr_el1
+/sys/devices/system/cpu/cpuX/regs/identification/smidr_el1
 Date: June 2016
 Contact: Linux ARM Kernel Mailing list <linux-arm-kernel@lists.infradead.org>
 Description: AArch64 CPU registers
 
 'identification' directory exposes the CPU ID registers for
-identifying model and revision of the CPU.
+identifying model and revision of the CPU and SMCU.
 
 What: /sys/devices/system/cpu/aarch32_el0
 Date: May 2021
+7 -1
Documentation/admin-guide/kernel-parameters.txt
···
 arm64.nomte [ARM64] Unconditionally disable Memory Tagging Extension
 support
 
+arm64.nosve [ARM64] Unconditionally disable Scalable Vector
+Extension support
+
+arm64.nosme [ARM64] Unconditionally disable Scalable Matrix
+Extension support
+
 ataflop= [HW,M68k]
 
 atarimouse= [HW,MOUSE] Atari Mouse
···
 improves system performance, but it may also
 expose users to several CPU vulnerabilities.
 Equivalent to: nopti [X86,PPC]
-kpti=0 [ARM64]
+if nokaslr then kpti=0 [ARM64]
 nospectre_v1 [X86,PPC]
 nobp=0 [S390]
 nospectre_v2 [X86,PPC,S390,ARM64]
+136
Documentation/admin-guide/perf/hns3-pmu.rst
··· 1 + ====================================== 2 + HNS3 Performance Monitoring Unit (PMU) 3 + ====================================== 4 + 5 + HNS3(HiSilicon network system 3) Performance Monitoring Unit (PMU) is an 6 + End Point device to collect performance statistics of HiSilicon SoC NIC. 7 + On Hip09, each SICL(Super I/O cluster) has one PMU device. 8 + 9 + HNS3 PMU supports collection of performance statistics such as bandwidth, 10 + latency, packet rate and interrupt rate. 11 + 12 + Each HNS3 PMU supports 8 hardware events. 13 + 14 + HNS3 PMU driver 15 + =============== 16 + 17 + The HNS3 PMU driver registers a perf PMU with the name of its sicl id.:: 18 + 19 + /sys/devices/hns3_pmu_sicl_<sicl_id> 20 + 21 + PMU driver provides description of available events, filter modes, format, 22 + identifier and cpumask in sysfs. 23 + 24 + The "events" directory describes the event code of all supported events 25 + shown in perf list. 26 + 27 + The "filtermode" directory describes the supported filter modes of each 28 + event. 29 + 30 + The "format" directory describes all formats of the config (events) and 31 + config1 (filter options) fields of the perf_event_attr structure. 32 + 33 + The "identifier" file shows version of PMU hardware device. 34 + 35 + The "bdf_min" and "bdf_max" files show the supported bdf range of each 36 + pmu device. 37 + 38 + The "hw_clk_freq" file shows the hardware clock frequency of each pmu 39 + device. 40 + 41 + Example usage of checking event code and subevent code:: 42 + 43 + $# cat /sys/devices/hns3_pmu_sicl_0/events/dly_tx_normal_to_mac_time 44 + config=0x00204 45 + $# cat /sys/devices/hns3_pmu_sicl_0/events/dly_tx_normal_to_mac_packet_num 46 + config=0x10204 47 + 48 + Each performance statistic has a pair of events to get two values to 49 + calculate real performance data in userspace. 50 + 51 + The bits 0~15 of config (here 0x0204) are the true hardware event code. 
If 52 + two events have same value of bits 0~15 of config, that means they are 53 + event pair. And the bit 16 of config indicates getting counter 0 or 54 + counter 1 of hardware event. 55 + 56 + After getting two values of event pair in usersapce, the formula of 57 + computation to calculate real performance data is::: 58 + 59 + counter 0 / counter 1 60 + 61 + Example usage of checking supported filter mode:: 62 + 63 + $# cat /sys/devices/hns3_pmu_sicl_0/filtermode/bw_ssu_rpu_byte_num 64 + filter mode supported: global/port/port-tc/func/func-queue/ 65 + 66 + Example usage of perf:: 67 + 68 + $# perf list 69 + hns3_pmu_sicl_0/bw_ssu_rpu_byte_num/ [kernel PMU event] 70 + hns3_pmu_sicl_0/bw_ssu_rpu_time/ [kernel PMU event] 71 + ------------------------------------------ 72 + 73 + $# perf stat -g -e hns3_pmu_sicl_0/bw_ssu_rpu_byte_num,global=1/ -e hns3_pmu_sicl_0/bw_ssu_rpu_time,global=1/ -I 1000 74 + or 75 + $# perf stat -g -e hns3_pmu_sicl_0/config=0x00002,global=1/ -e hns3_pmu_sicl_0/config=0x10002,global=1/ -I 1000 76 + 77 + 78 + Filter modes 79 + -------------- 80 + 81 + 1. global mode 82 + PMU collect performance statistics for all HNS3 PCIe functions of IO DIE. 83 + Set the "global" filter option to 1 will enable this mode. 84 + Example usage of perf:: 85 + 86 + $# perf stat -a -e hns3_pmu_sicl_0/config=0x1020F,global=1/ -I 1000 87 + 88 + 2. port mode 89 + PMU collect performance statistic of one whole physical port. The port id 90 + is same as mac id. The "tc" filter option must be set to 0xF in this mode, 91 + here tc stands for traffic class. 92 + 93 + Example usage of perf:: 94 + 95 + $# perf stat -a -e hns3_pmu_sicl_0/config=0x1020F,port=0,tc=0xF/ -I 1000 96 + 97 + 3. port-tc mode 98 + PMU collect performance statistic of one tc of physical port. The port id 99 + is same as mac id. The "tc" filter option must be set to 0 ~ 7 in this 100 + mode. 
101 + Example usage of perf:: 102 + 103 + $# perf stat -a -e hns3_pmu_sicl_0/config=0x1020F,port=0,tc=0/ -I 1000 104 + 105 + 4. func mode 106 + PMU collect performance statistic of one PF/VF. The function id is BDF of 107 + PF/VF, its conversion formula:: 108 + 109 + func = (bus << 8) + (device << 3) + (function) 110 + 111 + for example: 112 + BDF func 113 + 35:00.0 0x3500 114 + 35:00.1 0x3501 115 + 35:01.0 0x3508 116 + 117 + In this mode, the "queue" filter option must be set to 0xFFFF. 118 + Example usage of perf:: 119 + 120 + $# perf stat -a -e hns3_pmu_sicl_0/config=0x1020F,bdf=0x3500,queue=0xFFFF/ -I 1000 121 + 122 + 5. func-queue mode 123 + PMU collect performance statistic of one queue of PF/VF. The function id 124 + is BDF of PF/VF, the "queue" filter option must be set to the exact queue 125 + id of function. 126 + Example usage of perf:: 127 + 128 + $# perf stat -a -e hns3_pmu_sicl_0/config=0x1020F,bdf=0x3500,queue=0/ -I 1000 129 + 130 + 6. func-intr mode 131 + PMU collect performance statistic of one interrupt of PF/VF. The function 132 + id is BDF of PF/VF, the "intr" filter option must be set to the exact 133 + interrupt id of function. 134 + Example usage of perf:: 135 + 136 + $# perf stat -a -e hns3_pmu_sicl_0/config=0x00301,bdf=0x3500,intr=0/ -I 1000
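The HNS3 PMU text in the diff above gives two pieces of arithmetic: the layout of the perf `config` value (bits 0~15 hold the hardware event code, bit 16 selects counter 0 or counter 1) and the BDF-to-`func` conversion. A minimal C sketch of both follows. The helper names are hypothetical and are not part of the driver or of perf; only the bit layout and the formula are taken from the documentation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers; only the bit layout comes from hns3-pmu.rst. */

/* Bits 0-15 of config: the hardware event code shared by an event pair. */
static inline uint32_t hns3_event_code(uint32_t config)
{
	return config & 0xFFFFu;
}

/* Bit 16 of config: selects counter 0 or counter 1 of the hardware event. */
static inline unsigned int hns3_counter_index(uint32_t config)
{
	return (config >> 16) & 0x1u;
}

/* Two configs describe an event pair when bits 0-15 are equal. */
static inline int hns3_is_event_pair(uint32_t a, uint32_t b)
{
	return hns3_event_code(a) == hns3_event_code(b);
}

/* func = (bus << 8) + (device << 3) + function, as stated in the document. */
static inline uint32_t hns3_bdf_to_func(uint8_t bus, uint8_t device,
					uint8_t function)
{
	/* PCI device numbers use 5 bits and function numbers use 3 bits. */
	assert(device < 32 && function < 8);
	return ((uint32_t)bus << 8) + ((uint32_t)device << 3) + function;
}
```

With the values from the document, `0x00204` and `0x10204` share event code `0x0204` and differ only in the counter index, and BDF `35:01.0` maps to `0x3508`.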
+1
Documentation/admin-guide/perf/index.rst
···
 
 hisi-pmu
 hisi-pcie-pmu
+hns3-pmu
 imx-ddr
 qcom_l2_pmu
 qcom_l3_pmu
+4
Documentation/arm64/elf_hwcaps.rst
···
 
 Functionality implied by ID_AA64ISAR2_EL1.WFXT == 0b0010.
 
+HWCAP2_EBF16
+
+Functionality implied by ID_AA64ISAR1_EL1.BF16 == 0b0010.
+
 4. Unused AT_HWCAP bits
 -----------------------
 
+4 -6
Documentation/arm64/memory.rst
···
 0000000000000000 0000ffffffffffff 256TB user
 ffff000000000000 ffff7fffffffffff 128TB kernel logical memory map
 [ffff600000000000 ffff7fffffffffff] 32TB [kasan shadow region]
-ffff800000000000 ffff800007ffffff 128MB bpf jit region
-ffff800008000000 ffff80000fffffff 128MB modules
-ffff800010000000 fffffbffefffffff 124TB vmalloc
+ffff800000000000 ffff800007ffffff 128MB modules
+ffff800008000000 fffffbffefffffff 124TB vmalloc
 fffffbfff0000000 fffffbfffdffffff 224MB fixed mappings (top down)
 fffffbfffe000000 fffffbfffe7fffff 8MB [guard region]
 fffffbfffe800000 fffffbffff7fffff 16MB PCI I/O space
···
 0000000000000000 000fffffffffffff 4PB user
 fff0000000000000 ffff7fffffffffff ~4PB kernel logical memory map
 [fffd800000000000 ffff7fffffffffff] 512TB [kasan shadow region]
-ffff800000000000 ffff800007ffffff 128MB bpf jit region
-ffff800008000000 ffff80000fffffff 128MB modules
-ffff800010000000 fffffbffefffffff 124TB vmalloc
+ffff800000000000 ffff800007ffffff 128MB modules
+ffff800008000000 fffffbffefffffff 124TB vmalloc
 fffffbfff0000000 fffffbfffdffffff 224MB fixed mappings (top down)
 fffffbfffe000000 fffffbfffe7fffff 8MB [guard region]
 fffffbfffe800000 fffffbffff7fffff 16MB PCI I/O space
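The size column in the memory.rst table above can be cross-checked from the start and end addresses, since each row is an inclusive range. A small C sketch of that arithmetic follows; `va_region_size` is a hypothetical helper name used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Size in bytes of an inclusive [start, end] virtual address range. */
static inline uint64_t va_region_size(uint64_t start, uint64_t end)
{
	assert(end >= start);
	return end - start + 1;
}
```

For the new modules row, `ffff800000000000` to `ffff800007ffffff` yields exactly 128MB. The new vmalloc row yields `0x7bffe8000000` bytes, which is 384MB short of 124TB, so the table's figure for that row is a rounded one.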
+6
Documentation/arm64/silicon-errata.rst
···
 +----------------+-----------------+-----------------+-----------------------------+
 | ARM            | Cortex-A57      | #1319537        | ARM64_ERRATUM_1319367       |
 +----------------+-----------------+-----------------+-----------------------------+
+| ARM            | Cortex-A57      | #1742098        | ARM64_ERRATUM_1742098       |
++----------------+-----------------+-----------------+-----------------------------+
 | ARM            | Cortex-A72      | #853709         | N/A                         |
 +----------------+-----------------+-----------------+-----------------------------+
 | ARM            | Cortex-A72      | #1319367        | ARM64_ERRATUM_1319367       |
++----------------+-----------------+-----------------+-----------------------------+
+| ARM            | Cortex-A72      | #1655431        | ARM64_ERRATUM_1742098       |
 +----------------+-----------------+-----------------+-----------------------------+
 | ARM            | Cortex-A73      | #858921         | ARM64_ERRATUM_858921        |
 +----------------+-----------------+-----------------+-----------------------------+
···
 | ARM            | Cortex-A510     | #2051678        | ARM64_ERRATUM_2051678       |
 +----------------+-----------------+-----------------+-----------------------------+
 | ARM            | Cortex-A510     | #2077057        | ARM64_ERRATUM_2077057       |
++----------------+-----------------+-----------------+-----------------------------+
+| ARM            | Cortex-A510     | #2441009        | ARM64_ERRATUM_2441009       |
 +----------------+-----------------+-----------------+-----------------------------+
 | ARM            | Cortex-A710     | #2119858        | ARM64_ERRATUM_2119858       |
 +----------------+-----------------+-----------------+-----------------------------+
+1 -1
Documentation/features/vm/ioremap_prot/arch-support.txt
···
 | alpha: | TODO |
 | arc: | ok |
 | arm: | TODO |
-| arm64: | TODO |
+| arm64: | ok |
 | csky: | TODO |
 | hexagon: | TODO |
 | ia64: | TODO |
+6 -5
Documentation/memory-barriers.txt
···
 
 (*) dma_wmb();
 (*) dma_rmb();
+(*) dma_mb();
 
 These are for use with consistent memory to guarantee the ordering
 of writes or reads of shared memory accessible to both the CPU and a
···
 The dma_rmb() allows us guarantee the device has released ownership
 before we read the data from the descriptor, and the dma_wmb() allows
 us to guarantee the data is written to the descriptor before the device
-can see it now has ownership. Note that, when using writel(), a prior
-wmb() is not needed to guarantee that the cache coherent memory writes
-have completed before writing to the MMIO region. The cheaper
-writel_relaxed() does not provide this guarantee and must not be used
-here.
+can see it now has ownership. The dma_mb() implies both a dma_rmb() and
+a dma_wmb(). Note that, when using writel(), a prior wmb() is not needed
+to guarantee that the cache coherent memory writes have completed before
+writing to the MMIO region. The cheaper writel_relaxed() does not provide
+this guarantee and must not be used here.
 
 See the subsection "Kernel I/O barrier effects" for more information on
 relaxed I/O accessors and the Documentation/core-api/dma-api.rst file for
+6 -5
Documentation/virt/kvm/arm/hyp-abi.rst
···
 
 * ::
 
-x0 = HVC_VHE_RESTART (arm64 only)
+x0 = HVC_FINALISE_EL2 (arm64 only)
 
-Attempt to upgrade the kernel's exception level from EL1 to EL2 by enabling
-the VHE mode. This is conditioned by the CPU supporting VHE, the EL2 MMU
-being off, and VHE not being disabled by any other means (command line
-option, for example).
+Finish configuring EL2 depending on the command-line options,
+including an attempt to upgrade the kernel's exception level from
+EL1 to EL2 by enabling the VHE mode. This is conditioned by the CPU
+supporting VHE, the EL2 MMU being off, and VHE not being disabled by
+any other means (command line option, for example).
 
 Any other value of r0/x0 triggers a hypervisor-specific handling,
 which is not documented here.
+6
MAINTAINERS
···
 F: Documentation/admin-guide/perf/hisi-pmu.rst
 F: drivers/perf/hisilicon
 
+HISILICON HNS3 PMU DRIVER
+M: Guangbin Huang <huangguangbin2@huawei.com>
+S: Supported
+F: Documentation/admin-guide/perf/hns3-pmu.rst
+F: drivers/perf/hisilicon/hns3_pmu.c
+
 HISILICON QM AND ZIP Controller DRIVER
 M: Zhou Wang <wangzhou1@hisilicon.com>
 L: linux-crypto@vger.kernel.org
+3
arch/Kconfig
···
 config TRACE_IRQFLAGS_SUPPORT
 	bool
 
+config TRACE_IRQFLAGS_NMI_SUPPORT
+	bool
+
 #
 # An arch should select this if it provides all these things:
 #
+1 -3
arch/arm/include/asm/io.h
···
 extern void __iomem *__arm_ioremap_pfn(unsigned long, unsigned long, size_t, unsigned int);
 extern void __iomem *__arm_ioremap_exec(phys_addr_t, size_t, bool cached);
 void __arm_iomem_set_ro(void __iomem *ptr, size_t size);
-extern void __iounmap(volatile void __iomem *addr);
 
 extern void __iomem * (*arch_ioremap_caller)(phys_addr_t, size_t,
 	unsigned int, void *);
-extern void (*arch_iounmap)(volatile void __iomem *);
 
 /*
  * Bad read/write accesses...
···
 #define ioremap_wc ioremap_wc
 #define ioremap_wt ioremap_wc
 
-void iounmap(volatile void __iomem *iomem_cookie);
+void iounmap(volatile void __iomem *io_addr);
 #define iounmap iounmap
 
 void *arch_memremap_wb(phys_addr_t phys_addr, size_t size);
+1 -8
arch/arm/mm/ioremap.c
···
 	__builtin_return_address(0));
 }
 
-void __iounmap(volatile void __iomem *io_addr)
+void iounmap(volatile void __iomem *io_addr)
 {
 	void *addr = (void *)(PAGE_MASK & (unsigned long)io_addr);
 	struct static_vm *svm;
···
 #endif
 
 	vunmap(addr);
-}
-
-void (*arch_iounmap)(volatile void __iomem *) = __iounmap;
-
-void iounmap(volatile void __iomem *cookie)
-{
-	arch_iounmap(cookie);
 }
 EXPORT_SYMBOL(iounmap);
 
+1 -8
arch/arm/mm/nommu.c
···
 	return (void *)phys_addr;
 }
 
-void __iounmap(volatile void __iomem *addr)
-{
-}
-EXPORT_SYMBOL(__iounmap);
-
-void (*arch_iounmap)(volatile void __iomem *);
-
-void iounmap(volatile void __iomem *addr)
+void iounmap(volatile void __iomem *io_addr)
 {
 }
 EXPORT_SYMBOL(iounmap);
+37
arch/arm64/Kconfig
··· 101 101 select ARCH_WANT_HUGETLB_PAGE_OPTIMIZE_VMEMMAP 102 102 select ARCH_WANT_LD_ORPHAN_WARN 103 103 select ARCH_WANTS_NO_INSTR 104 + select ARCH_WANTS_THP_SWAP if ARM64_4K_PAGES 104 105 select ARCH_HAS_UBSAN_SANITIZE_ALL 105 106 select ARM_AMBA 106 107 select ARM_ARCH_TIMER ··· 127 126 select GENERIC_CPU_VULNERABILITIES 128 127 select GENERIC_EARLY_IOREMAP 129 128 select GENERIC_IDLE_POLL_SETUP 129 + select GENERIC_IOREMAP 130 130 select GENERIC_IRQ_IPI 131 131 select GENERIC_IRQ_PROBE 132 132 select GENERIC_IRQ_SHOW ··· 190 188 select HAVE_FUNCTION_GRAPH_TRACER 191 189 select HAVE_GCC_PLUGINS 192 190 select HAVE_HW_BREAKPOINT if PERF_EVENTS 191 + select HAVE_IOREMAP_PROT 193 192 select HAVE_IRQ_TIME_ACCOUNTING 194 193 select HAVE_KVM 195 194 select HAVE_NMI ··· 229 226 select THREAD_INFO_IN_TASK 230 227 select HAVE_ARCH_USERFAULTFD_MINOR if USERFAULTFD 231 228 select TRACE_IRQFLAGS_SUPPORT 229 + select TRACE_IRQFLAGS_NMI_SUPPORT 232 230 help 233 231 ARM 64-bit (AArch64) Linux support. 234 232 ··· 504 500 Please note that this does not necessarily enable the workaround, 505 501 as it depends on the alternative framework, which will only patch 506 502 the kernel if an affected CPU is detected. 503 + 504 + If unsure, say Y. 505 + 506 + config ARM64_ERRATUM_1742098 507 + bool "Cortex-A57/A72: 1742098: ELR recorded incorrectly on interrupt taken between cryptographic instructions in a sequence" 508 + depends on COMPAT 509 + default y 510 + help 511 + This option removes the AES hwcap for aarch32 user-space to 512 + workaround erratum 1742098 on Cortex-A57 and Cortex-A72. 513 + 514 + Affected parts may corrupt the AES state if an interrupt is 515 + taken between a pair of AES instructions. These instructions 516 + are only present if the cryptography extensions are present. 517 + All software should have a fallback implementation for CPUs 518 + that don't implement the cryptography extensions. 507 519 508 520 If unsure, say Y. 
509 521 ··· 838 818 839 819 Work around this in the driver by always making sure that there is a 840 820 page beyond the TRBLIMITR_EL1.LIMIT, within the space allowed for the TRBE. 821 + 822 + If unsure, say Y. 823 + 824 + config ARM64_ERRATUM_2441009 825 + bool "Cortex-A510: Completion of affected memory accesses might not be guaranteed by completion of a TLBI" 826 + default y 827 + select ARM64_WORKAROUND_REPEAT_TLBI 828 + help 829 + This option adds a workaround for ARM Cortex-A510 erratum #2441009. 830 + 831 + Under very rare circumstances, affected Cortex-A510 CPUs 832 + may not handle a race between a break-before-make sequence on one 833 + CPU, and another CPU accessing the same page. This could allow a 834 + store to a page that has been unmapped. 835 + 836 + Work around this by adding the affected CPUs to the list that needs 837 + TLB sequences to be done twice. 841 838 842 839 If unsure, say Y. 843 840
+4 -1
arch/arm64/boot/Makefile
···
 
 OBJCOPYFLAGS_Image :=-O binary -R .note -R .note.gnu.build-id -R .comment -S
 
-targets := Image Image.bz2 Image.gz Image.lz4 Image.lzma Image.lzo
+targets := Image Image.bz2 Image.gz Image.lz4 Image.lzma Image.lzo Image.zst
 
 $(obj)/Image: vmlinux FORCE
 	$(call if_changed,objcopy)
···
 
 $(obj)/Image.lzo: $(obj)/Image FORCE
 	$(call if_changed,lzo)
+
+$(obj)/Image.zst: $(obj)/Image FORCE
+	$(call if_changed,zstd)
+55 -24
arch/arm64/include/asm/asm-extable.h
··· 2 2 #ifndef __ASM_ASM_EXTABLE_H 3 3 #define __ASM_ASM_EXTABLE_H 4 4 5 + #include <linux/bits.h> 6 + #include <asm/gpr-num.h> 7 + 5 8 #define EX_TYPE_NONE 0 6 - #define EX_TYPE_FIXUP 1 7 - #define EX_TYPE_BPF 2 8 - #define EX_TYPE_UACCESS_ERR_ZERO 3 9 + #define EX_TYPE_BPF 1 10 + #define EX_TYPE_UACCESS_ERR_ZERO 2 11 + #define EX_TYPE_KACCESS_ERR_ZERO 3 9 12 #define EX_TYPE_LOAD_UNALIGNED_ZEROPAD 4 13 + 14 + /* Data fields for EX_TYPE_UACCESS_ERR_ZERO */ 15 + #define EX_DATA_REG_ERR_SHIFT 0 16 + #define EX_DATA_REG_ERR GENMASK(4, 0) 17 + #define EX_DATA_REG_ZERO_SHIFT 5 18 + #define EX_DATA_REG_ZERO GENMASK(9, 5) 19 + 20 + /* Data fields for EX_TYPE_LOAD_UNALIGNED_ZEROPAD */ 21 + #define EX_DATA_REG_DATA_SHIFT 0 22 + #define EX_DATA_REG_DATA GENMASK(4, 0) 23 + #define EX_DATA_REG_ADDR_SHIFT 5 24 + #define EX_DATA_REG_ADDR GENMASK(9, 5) 10 25 11 26 #ifdef __ASSEMBLY__ 12 27 ··· 34 19 .short (data); \ 35 20 .popsection; 36 21 22 + #define EX_DATA_REG(reg, gpr) \ 23 + (.L__gpr_num_##gpr << EX_DATA_REG_##reg##_SHIFT) 24 + 25 + #define _ASM_EXTABLE_UACCESS_ERR_ZERO(insn, fixup, err, zero) \ 26 + __ASM_EXTABLE_RAW(insn, fixup, \ 27 + EX_TYPE_UACCESS_ERR_ZERO, \ 28 + ( \ 29 + EX_DATA_REG(ERR, err) | \ 30 + EX_DATA_REG(ZERO, zero) \ 31 + )) 32 + 33 + #define _ASM_EXTABLE_UACCESS_ERR(insn, fixup, err) \ 34 + _ASM_EXTABLE_UACCESS_ERR_ZERO(insn, fixup, err, wzr) 35 + 36 + #define _ASM_EXTABLE_UACCESS(insn, fixup) \ 37 + _ASM_EXTABLE_UACCESS_ERR_ZERO(insn, fixup, wzr, wzr) 38 + 37 39 /* 38 - * Create an exception table entry for `insn`, which will branch to `fixup` 40 + * Create an exception table entry for uaccess `insn`, which will branch to `fixup` 39 41 * when an unhandled fault is taken. 
40 42 */ 41 - .macro _asm_extable, insn, fixup 42 - __ASM_EXTABLE_RAW(\insn, \fixup, EX_TYPE_FIXUP, 0) 43 + .macro _asm_extable_uaccess, insn, fixup 44 + _ASM_EXTABLE_UACCESS(\insn, \fixup) 43 45 .endm 44 46 45 47 /* 46 48 * Create an exception table entry for `insn` if `fixup` is provided. Otherwise 47 49 * do nothing. 48 50 */ 49 - .macro _cond_extable, insn, fixup 50 - .ifnc \fixup, 51 - _asm_extable \insn, \fixup 51 + .macro _cond_uaccess_extable, insn, fixup 52 + .ifnc \fixup, 53 + _asm_extable_uaccess \insn, \fixup 52 54 .endif 53 55 .endm 54 56 55 57 #else /* __ASSEMBLY__ */ 56 58 57 - #include <linux/bits.h> 58 59 #include <linux/stringify.h> 59 - 60 - #include <asm/gpr-num.h> 61 60 62 61 #define __ASM_EXTABLE_RAW(insn, fixup, type, data) \ 63 62 ".pushsection __ex_table, \"a\"\n" \ ··· 81 52 ".short (" type ")\n" \ 82 53 ".short (" data ")\n" \ 83 54 ".popsection\n" 84 - 85 - #define _ASM_EXTABLE(insn, fixup) \ 86 - __ASM_EXTABLE_RAW(#insn, #fixup, __stringify(EX_TYPE_FIXUP), "0") 87 - 88 - #define EX_DATA_REG_ERR_SHIFT 0 89 - #define EX_DATA_REG_ERR GENMASK(4, 0) 90 - #define EX_DATA_REG_ZERO_SHIFT 5 91 - #define EX_DATA_REG_ZERO GENMASK(9, 5) 92 55 93 56 #define EX_DATA_REG(reg, gpr) \ 94 57 "((.L__gpr_num_" #gpr ") << " __stringify(EX_DATA_REG_##reg##_SHIFT) ")" ··· 94 73 EX_DATA_REG(ZERO, zero) \ 95 74 ")") 96 75 76 + #define _ASM_EXTABLE_KACCESS_ERR_ZERO(insn, fixup, err, zero) \ 77 + __DEFINE_ASM_GPR_NUMS \ 78 + __ASM_EXTABLE_RAW(#insn, #fixup, \ 79 + __stringify(EX_TYPE_KACCESS_ERR_ZERO), \ 80 + "(" \ 81 + EX_DATA_REG(ERR, err) " | " \ 82 + EX_DATA_REG(ZERO, zero) \ 83 + ")") 84 + 97 85 #define _ASM_EXTABLE_UACCESS_ERR(insn, fixup, err) \ 98 86 _ASM_EXTABLE_UACCESS_ERR_ZERO(insn, fixup, err, wzr) 99 87 100 - #define EX_DATA_REG_DATA_SHIFT 0 101 - #define EX_DATA_REG_DATA GENMASK(4, 0) 102 - #define EX_DATA_REG_ADDR_SHIFT 5 103 - #define EX_DATA_REG_ADDR GENMASK(9, 5) 88 + #define _ASM_EXTABLE_UACCESS(insn, fixup) \ 89 + 
_ASM_EXTABLE_UACCESS_ERR_ZERO(insn, fixup, wzr, wzr) 90 + 91 + #define _ASM_EXTABLE_KACCESS_ERR(insn, fixup, err) \ 92 + _ASM_EXTABLE_KACCESS_ERR_ZERO(insn, fixup, err, wzr) 104 93 105 94 #define _ASM_EXTABLE_LOAD_UNALIGNED_ZEROPAD(insn, fixup, data, addr) \ 106 95 __DEFINE_ASM_GPR_NUMS \
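The `EX_DATA_REG_*` definitions in the diff above pack two register numbers into the 16-bit `data` field of an exception table entry: the error register in bits 4:0 and the zero register in bits 9:5. The following userspace C sketch illustrates that packing. The function names are hypothetical, and the encoding of `wzr` as register number 31 is an assumption, not something shown in this diff.

```c
#include <assert.h>
#include <stdint.h>

/* Field layout copied from the diff: ERR in bits 4:0, ZERO in bits 9:5. */
#define EX_DATA_REG_ERR_SHIFT	0
#define EX_DATA_REG_ZERO_SHIFT	5
#define EX_DATA_REG_MASK	0x1Fu	/* 5 bits cover register numbers 0..31 */

/* Assumption: the zero register (wzr) is encoded as number 31. */
#define GPR_NUM_ZR		31u

/* Pack the error and zero register numbers into the 16-bit data field. */
static inline uint16_t ex_data_pack(unsigned int err, unsigned int zero)
{
	assert(err <= 31 && zero <= 31);
	return (uint16_t)((err << EX_DATA_REG_ERR_SHIFT) |
			  (zero << EX_DATA_REG_ZERO_SHIFT));
}

/* Recover the error register number from a packed data field. */
static inline unsigned int ex_data_err_reg(uint16_t data)
{
	return (data >> EX_DATA_REG_ERR_SHIFT) & EX_DATA_REG_MASK;
}

/* Recover the zero register number from a packed data field. */
static inline unsigned int ex_data_zero_reg(uint16_t data)
{
	return (data >> EX_DATA_REG_ZERO_SHIFT) & EX_DATA_REG_MASK;
}
```

Under the stated assumption, an entry that names `wzr` for both registers, as `_ASM_EXTABLE_UACCESS` does, would carry `0x3FF` in its data field.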
+6 -6
arch/arm64/include/asm/asm-uaccess.h
···
 
 #define USER(l, x...) \
 9999: x; \
-	_asm_extable 9999b, l
+	_asm_extable_uaccess 9999b, l
 
 /*
  * Generate the assembly for LDTR/STTR with exception table entries.
···
 8889: ldtr \reg2, [\addr, #8];
 	add \addr, \addr, \post_inc;
 
-	_asm_extable 8888b,\l;
-	_asm_extable 8889b,\l;
+	_asm_extable_uaccess 8888b, \l;
+	_asm_extable_uaccess 8889b, \l;
 	.endm
 
 	.macro user_stp l, reg1, reg2, addr, post_inc
···
 8889: sttr \reg2, [\addr, #8];
 	add \addr, \addr, \post_inc;
 
-	_asm_extable 8888b,\l;
-	_asm_extable 8889b,\l;
+	_asm_extable_uaccess 8888b,\l;
+	_asm_extable_uaccess 8889b,\l;
 	.endm
 
 	.macro user_ldst l, inst, reg, addr, post_inc
 8888: \inst \reg, [\addr];
 	add \addr, \addr, \post_inc;
 
-	_asm_extable 8888b,\l;
+	_asm_extable_uaccess 8888b, \l;
 	.endm
 #endif
+2 -2
arch/arm64/include/asm/asm_pointer_auth.h
···
 
 	.macro __ptrauth_keys_init_cpu tsk, tmp1, tmp2, tmp3
 	mrs \tmp1, id_aa64isar1_el1
-	ubfx \tmp1, \tmp1, #ID_AA64ISAR1_APA_SHIFT, #8
+	ubfx \tmp1, \tmp1, #ID_AA64ISAR1_EL1_APA_SHIFT, #8
 	mrs_s \tmp2, SYS_ID_AA64ISAR2_EL1
-	ubfx \tmp2, \tmp2, #ID_AA64ISAR2_APA3_SHIFT, #4
+	ubfx \tmp2, \tmp2, #ID_AA64ISAR2_EL1_APA3_SHIFT, #4
 	orr \tmp1, \tmp1, \tmp2
 	cbz \tmp1, .Lno_addr_auth\@
 	mov_q \tmp1, (SCTLR_ELx_ENIA | SCTLR_ELx_ENIB | \
+29 -6
arch/arm64/include/asm/assembler.h
··· 360 360 .endm 361 361 362 362 /* 363 + * idmap_get_t0sz - get the T0SZ value needed to cover the ID map 364 + * 365 + * Calculate the maximum allowed value for TCR_EL1.T0SZ so that the 366 + * entire ID map region can be mapped. As T0SZ == (64 - #bits used), 367 + * this number conveniently equals the number of leading zeroes in 368 + * the physical address of _end. 369 + */ 370 + .macro idmap_get_t0sz, reg 371 + adrp \reg, _end 372 + orr \reg, \reg, #(1 << VA_BITS_MIN) - 1 373 + clz \reg, \reg 374 + .endm 375 + 376 + /* 363 377 * tcr_compute_pa_size - set TCR.(I)PS to the highest supported 364 378 * ID_AA64MMFR0_EL1.PARange value 365 379 * ··· 437 423 b.lo .Ldcache_op\@ 438 424 dsb \domain 439 425 440 - _cond_extable .Ldcache_op\@, \fixup 426 + _cond_uaccess_extable .Ldcache_op\@, \fixup 441 427 .endm 442 428 443 429 /* ··· 476 462 dsb ish 477 463 isb 478 464 479 - _cond_extable .Licache_op\@, \fixup 465 + _cond_uaccess_extable .Licache_op\@, \fixup 466 + .endm 467 + 468 + /* 469 + * load_ttbr1 - install @pgtbl as a TTBR1 page table 470 + * pgtbl preserved 471 + * tmp1/tmp2 clobbered, either may overlap with pgtbl 472 + */ 473 + .macro load_ttbr1, pgtbl, tmp1, tmp2 474 + phys_to_ttbr \tmp1, \pgtbl 475 + offset_ttbr1 \tmp1, \tmp2 476 + msr ttbr1_el1, \tmp1 477 + isb 480 478 .endm 481 479 482 480 /* ··· 504 478 isb 505 479 tlbi vmalle1 506 480 dsb nsh 507 - phys_to_ttbr \tmp, \page_table 508 - offset_ttbr1 \tmp, \tmp2 509 - msr ttbr1_el1, \tmp 510 - isb 481 + load_ttbr1 \page_table, \tmp, \tmp2 511 482 .endm 512 483 513 484 /*
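The comment on `idmap_get_t0sz` in the diff above explains that, because T0SZ == (64 - #bits used), the required value equals the number of leading zeroes in the physical address of `_end` once every bit below `VA_BITS_MIN` is set. A userspace C sketch of that computation follows. The function names are hypothetical, and `end_pa` and `va_bits_min` are plain parameters here, whereas the kernel macro takes them from the `_end` symbol and the `VA_BITS_MIN` constant.

```c
#include <assert.h>
#include <stdint.h>

/* Count leading zero bits of a 64-bit value (returns 64 for 0). */
static unsigned int clz64(uint64_t v)
{
	unsigned int n = 0;

	if (v == 0)
		return 64;
	while (!(v & (1ULL << 63))) {
		v <<= 1;
		n++;
	}
	return n;
}

/*
 * Mirrors the macro's three steps: take the address of _end, set every
 * bit below va_bits_min, then count the leading zeroes of the result.
 */
static unsigned int idmap_t0sz(uint64_t end_pa, unsigned int va_bits_min)
{
	assert(va_bits_min > 0 && va_bits_min < 64);
	return clz64(end_pa | ((1ULL << va_bits_min) - 1));
}
```

Setting the low bits places a ceiling on the result: an image that ends below 2^48 with a 48-bit minimum gives 16, never more, and an image ending at or above 2^49 gives a smaller value so that the ID map still covers it.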
+6 -6
arch/arm64/include/asm/barrier.h
···
 #define pmr_sync() do {} while (0)
 #endif
 
-#define mb() dsb(sy)
-#define rmb() dsb(ld)
-#define wmb() dsb(st)
+#define __mb() dsb(sy)
+#define __rmb() dsb(ld)
+#define __wmb() dsb(st)
 
-#define dma_mb() dmb(osh)
-#define dma_rmb() dmb(oshld)
-#define dma_wmb() dmb(oshst)
+#define __dma_mb() dmb(osh)
+#define __dma_rmb() dmb(oshld)
+#define __dma_wmb() dmb(oshst)
 
 #define io_stop_wc() dgh()
 
+13 -28
arch/arm64/include/asm/cache.h
··· 5 5 #ifndef __ASM_CACHE_H 6 6 #define __ASM_CACHE_H 7 7 8 - #include <asm/cputype.h> 9 - #include <asm/mte-def.h> 10 - 11 - #define CTR_L1IP_SHIFT 14 12 - #define CTR_L1IP_MASK 3 13 - #define CTR_DMINLINE_SHIFT 16 14 - #define CTR_IMINLINE_SHIFT 0 15 - #define CTR_IMINLINE_MASK 0xf 16 - #define CTR_ERG_SHIFT 20 17 - #define CTR_CWG_SHIFT 24 18 - #define CTR_CWG_MASK 15 19 - #define CTR_IDC_SHIFT 28 20 - #define CTR_DIC_SHIFT 29 21 - 22 - #define CTR_CACHE_MINLINE_MASK \ 23 - (0xf << CTR_DMINLINE_SHIFT | CTR_IMINLINE_MASK << CTR_IMINLINE_SHIFT) 24 - 25 - #define CTR_L1IP(ctr) (((ctr) >> CTR_L1IP_SHIFT) & CTR_L1IP_MASK) 26 - 27 - #define ICACHE_POLICY_VPIPT 0 28 - #define ICACHE_POLICY_RESERVED 1 29 - #define ICACHE_POLICY_VIPT 2 30 - #define ICACHE_POLICY_PIPT 3 31 - 32 8 #define L1_CACHE_SHIFT (6) 33 9 #define L1_CACHE_BYTES (1 << L1_CACHE_SHIFT) 34 - 35 10 36 11 #define CLIDR_LOUU_SHIFT 27 37 12 #define CLIDR_LOC_SHIFT 24 ··· 30 55 #include <linux/bitops.h> 31 56 #include <linux/kasan-enabled.h> 32 57 58 + #include <asm/cputype.h> 59 + #include <asm/mte-def.h> 60 + #include <asm/sysreg.h> 61 + 33 62 #ifdef CONFIG_KASAN_SW_TAGS 34 63 #define ARCH_SLAB_MINALIGN (1ULL << KASAN_SHADOW_SCALE_SHIFT) 35 64 #elif defined(CONFIG_KASAN_HW_TAGS) ··· 44 65 } 45 66 #define arch_slab_minalign() arch_slab_minalign() 46 67 #endif 68 + 69 + #define CTR_CACHE_MINLINE_MASK \ 70 + (0xf << CTR_EL0_DMINLINE_SHIFT | \ 71 + CTR_EL0_IMINLINE_MASK << CTR_EL0_IMINLINE_SHIFT) 72 + 73 + #define CTR_L1IP(ctr) SYS_FIELD_GET(CTR_EL0, L1Ip, ctr) 47 74 48 75 #define ICACHEF_ALIASING 0 49 76 #define ICACHEF_VPIPT 1 ··· 71 86 72 87 static inline u32 cache_type_cwg(void) 73 88 { 74 - return (read_cpuid_cachetype() >> CTR_CWG_SHIFT) & CTR_CWG_MASK; 89 + return (read_cpuid_cachetype() >> CTR_EL0_CWG_SHIFT) & CTR_EL0_CWG_MASK; 75 90 } 76 91 77 92 #define __read_mostly __section(".data..read_mostly") ··· 105 120 { 106 121 u32 ctr = read_cpuid_cachetype(); 107 122 108 - if (!(ctr & 
BIT(CTR_IDC_SHIFT))) { 123 + if (!(ctr & BIT(CTR_EL0_IDC_SHIFT))) { 109 124 u64 clidr = read_sysreg(clidr_el1); 110 125 111 126 if (CLIDR_LOC(clidr) == 0 || 112 127 (CLIDR_LOUIS(clidr) == 0 && CLIDR_LOUU(clidr) == 0)) 113 - ctr |= BIT(CTR_IDC_SHIFT); 128 + ctr |= BIT(CTR_EL0_IDC_SHIFT); 114 129 } 115 130 116 131 return ctr;
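The cache.h hunks above replace open-coded `CTR_*` shifts and masks with generated `CTR_EL0_*` names, but the operation is the same throughout: shift the register value right and mask off the field. The C sketch below reproduces that extraction in userspace, using the shift and mask values from the lines the diff removes. The helper names are hypothetical, and the sample register value used in the note afterwards is arbitrary rather than read from hardware.

```c
#include <assert.h>
#include <stdint.h>

/* Shifts and masks as listed in the lines removed from cache.h. */
#define CTR_IMINLINE_SHIFT	0
#define CTR_IMINLINE_MASK	0xfu
#define CTR_L1IP_SHIFT		14
#define CTR_L1IP_MASK		0x3u
#define CTR_CWG_SHIFT		24
#define CTR_CWG_MASK		0xfu
#define CTR_IDC_SHIFT		28

/* Generic "shift, then mask" field extraction. */
static inline uint32_t ctr_field(uint32_t ctr, unsigned int shift, uint32_t mask)
{
	assert(shift < 32);
	return (ctr >> shift) & mask;
}

/* L1 instruction cache policy field (bits 15:14). */
static inline uint32_t ctr_l1ip(uint32_t ctr)
{
	return ctr_field(ctr, CTR_L1IP_SHIFT, CTR_L1IP_MASK);
}

/* Cache writeback granule field (bits 27:24). */
static inline uint32_t ctr_cwg(uint32_t ctr)
{
	return ctr_field(ctr, CTR_CWG_SHIFT, CTR_CWG_MASK);
}

/* IDC is a single bit at position 28. */
static inline int ctr_has_idc(uint32_t ctr)
{
	return (ctr >> CTR_IDC_SHIFT) & 1u;
}
```

For the arbitrary value `0x8444C004`, the L1Ip field reads 3 (`ICACHE_POLICY_PIPT` in the removed definitions), CWG reads 4, and the IDC bit is clear.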
-7
arch/arm64/include/asm/cacheflush.h
···
 #define flush_icache_range flush_icache_range
 
 /*
- * Cache maintenance functions used by the DMA API. No to be used directly.
- */
-extern void __dma_map_area(const void *, size_t, int);
-extern void __dma_unmap_area(const void *, size_t, int);
-extern void __dma_flush_area(const void *, size_t);
-
-/*
  * Copy user data from/to a page which is mapped into a different
  * processes address space. Really, we want to allow our "user
  * space" model to handle this.
+1
arch/arm64/include/asm/cpu.h
···
46 46 	u64 reg_midr;
47 47 	u64 reg_revidr;
48 48 	u64 reg_gmid;
49 + 	u64 reg_smidr;
49 50
50 51 	u64 reg_id_aa64dfr0;
51 52 	u64 reg_id_aa64dfr1;
-9
arch/arm64/include/asm/cpu_ops.h
···
31 31  * @cpu_die: Makes a cpu leave the kernel. Must not fail. Called from the
32 32  *	cpu being killed.
33 33  * @cpu_kill: Ensures a cpu has left the kernel. Called from another cpu.
34 -  * @cpu_init_idle: Reads any data necessary to initialize CPU idle states for
35 -  *	a proposed logical id.
36 -  * @cpu_suspend: Suspends a cpu and saves the required context. May fail owing
37 -  *	to wrong parameters or error conditions. Called from the
38 -  *	CPU being suspended. Must be called with IRQs disabled.
39 34  */
40 35 struct cpu_operations {
41 36 	const char *name;
···
43 48 	int (*cpu_disable)(unsigned int cpu);
44 49 	void (*cpu_die)(unsigned int cpu);
45 50 	int (*cpu_kill)(unsigned int cpu);
46 - #endif
47 - #ifdef CONFIG_CPU_IDLE
48 - 	int (*cpu_init_idle)(unsigned int);
49 - 	int (*cpu_suspend)(unsigned long);
50 51 #endif
51 52 };
52 53
+5 -2
arch/arm64/include/asm/cpufeature.h
···
11 11 #include <asm/hwcap.h>
12 12 #include <asm/sysreg.h>
13 13
14 - #define MAX_CPU_FEATURES 64
14 + #define MAX_CPU_FEATURES 128
15 15 #define cpu_feature(x) KERNEL_HWCAP_ ## x
16 16
17 17 #ifndef __ASSEMBLY__
···
673 673 	isar2 = read_sanitised_ftr_reg(SYS_ID_AA64ISAR2_EL1);
674 674
675 675 	return cpuid_feature_extract_unsigned_field(isar2,
676 - 						    ID_AA64ISAR2_CLEARBHB_SHIFT);
676 + 						    ID_AA64ISAR2_EL1_BC_SHIFT);
677 677 }
678 678
679 679 const struct cpumask *system_32bit_el0_cpumask(void);
···
908 908 }
909 909
910 910 extern struct arm64_ftr_override id_aa64mmfr1_override;
911 + extern struct arm64_ftr_override id_aa64pfr0_override;
911 912 extern struct arm64_ftr_override id_aa64pfr1_override;
913 + extern struct arm64_ftr_override id_aa64zfr0_override;
914 + extern struct arm64_ftr_override id_aa64smfr0_override;
912 915 extern struct arm64_ftr_override id_aa64isar1_override;
913 916 extern struct arm64_ftr_override id_aa64isar2_override;
914 917
-15
arch/arm64/include/asm/cpuidle.h
···
4 4
5 5 #include <asm/proc-fns.h>
6 6
7 - #ifdef CONFIG_CPU_IDLE
8 - extern int arm_cpuidle_init(unsigned int cpu);
9 - extern int arm_cpuidle_suspend(int index);
10 - #else
11 - static inline int arm_cpuidle_init(unsigned int cpu)
12 - {
13 - 	return -EOPNOTSUPP;
14 - }
15 -
16 - static inline int arm_cpuidle_suspend(int index)
17 - {
18 - 	return -EOPNOTSUPP;
19 - }
20 - #endif
21 -
22 7 #ifdef CONFIG_ARM64_PSEUDO_NMI
23 8 #include <asm/arch_gicv3.h>
24 9
-60
arch/arm64/include/asm/el2_setup.h
···
129 129 	msr cptr_el2, x0 // Disable copro. traps to EL2
130 130 .endm
131 131
132 - /* SVE register access */
133 - .macro __init_el2_nvhe_sve
134 - 	mrs x1, id_aa64pfr0_el1
135 - 	ubfx x1, x1, #ID_AA64PFR0_SVE_SHIFT, #4
136 - 	cbz x1, .Lskip_sve_\@
137 -
138 - 	bic x0, x0, #CPTR_EL2_TZ // Also disable SVE traps
139 - 	msr cptr_el2, x0 // Disable copro. traps to EL2
140 - 	isb
141 - 	mov x1, #ZCR_ELx_LEN_MASK // SVE: Enable full vector
142 - 	msr_s SYS_ZCR_EL2, x1 // length for EL1.
143 - .Lskip_sve_\@:
144 - .endm
145 -
146 - /* SME register access and priority mapping */
147 - .macro __init_el2_nvhe_sme
148 - 	mrs x1, id_aa64pfr1_el1
149 - 	ubfx x1, x1, #ID_AA64PFR1_SME_SHIFT, #4
150 - 	cbz x1, .Lskip_sme_\@
151 -
152 - 	bic x0, x0, #CPTR_EL2_TSM // Also disable SME traps
153 - 	msr cptr_el2, x0 // Disable copro. traps to EL2
154 - 	isb
155 -
156 - 	mrs x1, sctlr_el2
157 - 	orr x1, x1, #SCTLR_ELx_ENTP2 // Disable TPIDR2 traps
158 - 	msr sctlr_el2, x1
159 - 	isb
160 -
161 - 	mov x1, #0 // SMCR controls
162 -
163 - 	mrs_s x2, SYS_ID_AA64SMFR0_EL1
164 - 	ubfx x2, x2, #ID_AA64SMFR0_FA64_SHIFT, #1 // Full FP in SM?
165 - 	cbz x2, .Lskip_sme_fa64_\@
166 -
167 - 	orr x1, x1, SMCR_ELx_FA64_MASK
168 - .Lskip_sme_fa64_\@:
169 -
170 - 	orr x1, x1, #SMCR_ELx_LEN_MASK // Enable full SME vector
171 - 	msr_s SYS_SMCR_EL2, x1 // length for EL1.
172 -
173 - 	mrs_s x1, SYS_SMIDR_EL1 // Priority mapping supported?
174 - 	ubfx x1, x1, #SMIDR_EL1_SMPS_SHIFT, #1
175 - 	cbz x1, .Lskip_sme_\@
176 -
177 - 	msr_s SYS_SMPRIMAP_EL2, xzr // Make all priorities equal
178 -
179 - 	mrs x1, id_aa64mmfr1_el1 // HCRX_EL2 present?
180 - 	ubfx x1, x1, #ID_AA64MMFR1_HCX_SHIFT, #4
181 - 	cbz x1, .Lskip_sme_\@
182 -
183 - 	mrs_s x1, SYS_HCRX_EL2
184 - 	orr x1, x1, #HCRX_EL2_SMPME_MASK // Enable priority mapping
185 - 	msr_s SYS_HCRX_EL2, x1
186 -
187 - .Lskip_sme_\@:
188 - .endm
189 -
190 132 /* Disable any fine grained traps */
191 133 .macro __init_el2_fgt
192 134 	mrs x1, id_aa64mmfr0_el1
···
192 250 	__init_el2_hstr
193 251 	__init_el2_nvhe_idregs
194 252 	__init_el2_nvhe_cptr
195 - 	__init_el2_nvhe_sve
196 - 	__init_el2_nvhe_sme
197 253 	__init_el2_fgt
198 254 	__init_el2_nvhe_prepare_eret
199 255 .endm
+3 -1
arch/arm64/include/asm/fixmap.h
···
62 62 #endif /* CONFIG_ACPI_APEI_GHES */
63 63
64 64 #ifdef CONFIG_UNMAP_KERNEL_AT_EL0
65 + #ifdef CONFIG_RELOCATABLE
66 + 	FIX_ENTRY_TRAMP_TEXT4, /* one extra slot for the data page */
67 + #endif
65 68 	FIX_ENTRY_TRAMP_TEXT3,
66 69 	FIX_ENTRY_TRAMP_TEXT2,
67 70 	FIX_ENTRY_TRAMP_TEXT1,
68 - 	FIX_ENTRY_TRAMP_DATA,
69 71 #define TRAMP_VALIAS (__fix_to_virt(FIX_ENTRY_TRAMP_TEXT1))
70 72 #endif /* CONFIG_UNMAP_KERNEL_AT_EL0 */
71 73 	__end_of_permanent_fixed_addresses,
+2 -1
arch/arm64/include/asm/hwcap.h
···
85 85 #define KERNEL_HWCAP_PACA __khwcap_feature(PACA)
86 86 #define KERNEL_HWCAP_PACG __khwcap_feature(PACG)
87 87
88 - #define __khwcap2_feature(x) (const_ilog2(HWCAP2_ ## x) + 32)
88 + #define __khwcap2_feature(x) (const_ilog2(HWCAP2_ ## x) + 64)
89 89 #define KERNEL_HWCAP_DCPODP __khwcap2_feature(DCPODP)
90 90 #define KERNEL_HWCAP_SVE2 __khwcap2_feature(SVE2)
91 91 #define KERNEL_HWCAP_SVEAES __khwcap2_feature(SVEAES)
···
118 118 #define KERNEL_HWCAP_SME_F32F32 __khwcap2_feature(SME_F32F32)
119 119 #define KERNEL_HWCAP_SME_FA64 __khwcap2_feature(SME_FA64)
120 120 #define KERNEL_HWCAP_WFXT __khwcap2_feature(WFXT)
121 + #define KERNEL_HWCAP_EBF16 __khwcap2_feature(EBF16)
121 122
122 123 /*
123 124  * This yields a mask that user programs can use to figure out what
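Editor's note on the hunk above: the offset used by `__khwcap2_feature()` moves from 32 to 64, and the cpufeature.h hunk raises `MAX_CPU_FEATURES` from 64 to 128. Read together, that looks like each of AT_HWCAP and AT_HWCAP2 getting its own 64-entry window in the kernel's internal feature numbering. Below is a minimal user-space sketch of that index arithmetic under that reading. `ilog2_u64()` and `khwcap2_index()` are hypothetical stand-ins written for this note, not kernel functions; only the bit positions are taken from the diff.

```c
#include <stdint.h>

/* Bit positions as listed in the uapi hwcap.h hunk of this diff. */
#define HWCAP2_WFXT  (UINT64_C(1) << 31)
#define HWCAP2_EBF16 (UINT64_C(1) << 32)

/* Hypothetical stand-in for const_ilog2(): index of the highest set bit. */
static unsigned int ilog2_u64(uint64_t v)
{
	unsigned int n = 0;

	while (v >>= 1)
		n++;
	return n;
}

/* Internal index of an AT_HWCAP2 flag with the new offset of 64. */
static unsigned int khwcap2_index(uint64_t hwcap2_flag)
{
	return ilog2_u64(hwcap2_flag) + 64;
}

/* Same computation with the old offset of 32, kept for comparison. */
static unsigned int khwcap2_index_old(uint64_t hwcap2_flag)
{
	return ilog2_u64(hwcap2_flag) + 32;
}
```

With the old offset, a flag at bit 32 such as `HWCAP2_EBF16` would have landed on index 64, one past the last valid index when `MAX_CPU_FEATURES` was 64; with the new offset it lands on index 96, inside the 128-entry range.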
+18 -6
arch/arm64/include/asm/io.h
···
163 163 /*
164 164  * I/O memory mapping functions.
165 165  */
166 - extern void __iomem *__ioremap(phys_addr_t phys_addr, size_t size, pgprot_t prot);
167 - extern void iounmap(volatile void __iomem *addr);
168 - extern void __iomem *ioremap_cache(phys_addr_t phys_addr, size_t size);
169 166
170 - #define ioremap(addr, size) __ioremap((addr), (size), __pgprot(PROT_DEVICE_nGnRE))
171 - #define ioremap_wc(addr, size) __ioremap((addr), (size), __pgprot(PROT_NORMAL_NC))
172 - #define ioremap_np(addr, size) __ioremap((addr), (size), __pgprot(PROT_DEVICE_nGnRnE))
167 + bool ioremap_allowed(phys_addr_t phys_addr, size_t size, unsigned long prot);
168 + #define ioremap_allowed ioremap_allowed
169 +
170 + #define _PAGE_IOREMAP PROT_DEVICE_nGnRE
171 +
172 + #define ioremap_wc(addr, size) \
173 + 	ioremap_prot((addr), (size), PROT_NORMAL_NC)
174 + #define ioremap_np(addr, size) \
175 + 	ioremap_prot((addr), (size), PROT_DEVICE_nGnRnE)
173 176
174 177 /*
175 178  * io{read,write}{16,32,64}be() macros
···
186 183 #define iowrite64be(v,p) ({ __iowmb(); __raw_writeq((__force __u64)cpu_to_be64(v), p); })
187 184
188 185 #include <asm-generic/io.h>
186 +
187 + #define ioremap_cache ioremap_cache
188 + static inline void __iomem *ioremap_cache(phys_addr_t addr, size_t size)
189 + {
190 + 	if (pfn_is_map_memory(__phys_to_pfn(addr)))
191 + 		return (void __iomem *)__phys_to_virt(addr);
192 +
193 + 	return ioremap_prot(addr, size, PROT_NORMAL);
194 + }
189 195
190 196 /*
191 197  * More restrictive address range checking than the default implementation
+13 -5
arch/arm64/include/asm/kernel-pgtable.h
···
8 8 #ifndef __ASM_KERNEL_PGTABLE_H
9 9 #define __ASM_KERNEL_PGTABLE_H
10 10
11 + #include <asm/boot.h>
11 12 #include <asm/pgtable-hwdef.h>
12 13 #include <asm/sparsemem.h>
13 14
···
36 35  */
37 36 #if ARM64_KERNEL_USES_PMD_MAPS
38 37 #define SWAPPER_PGTABLE_LEVELS (CONFIG_PGTABLE_LEVELS - 1)
39 - #define IDMAP_PGTABLE_LEVELS (ARM64_HW_PGTABLE_LEVELS(PHYS_MASK_SHIFT) - 1)
40 38 #else
41 39 #define SWAPPER_PGTABLE_LEVELS (CONFIG_PGTABLE_LEVELS)
42 - #define IDMAP_PGTABLE_LEVELS (ARM64_HW_PGTABLE_LEVELS(PHYS_MASK_SHIFT))
43 40 #endif
44 41
45 42
···
86 87 			+ EARLY_PUDS((vstart), (vend)) /* each PUD needs a next level page table */ \
87 88 			+ EARLY_PMDS((vstart), (vend))) /* each PMD needs a next level page table */
88 89 #define INIT_DIR_SIZE (PAGE_SIZE * EARLY_PAGES(KIMAGE_VADDR, _end))
89 - #define IDMAP_DIR_SIZE (IDMAP_PGTABLE_LEVELS * PAGE_SIZE)
90 +
91 + /* the initial ID map may need two extra pages if it needs to be extended */
92 + #if VA_BITS < 48
93 + #define INIT_IDMAP_DIR_SIZE ((INIT_IDMAP_DIR_PAGES + 2) * PAGE_SIZE)
94 + #else
95 + #define INIT_IDMAP_DIR_SIZE (INIT_IDMAP_DIR_PAGES * PAGE_SIZE)
96 + #endif
97 + #define INIT_IDMAP_DIR_PAGES EARLY_PAGES(KIMAGE_VADDR, _end + MAX_FDT_SIZE + SWAPPER_BLOCK_SIZE)
90 98
91 99 /* Initial memory map size */
92 100 #if ARM64_KERNEL_USES_PMD_MAPS
···
113 107 #define SWAPPER_PMD_FLAGS (PMD_TYPE_SECT | PMD_SECT_AF | PMD_SECT_S)
114 108
115 109 #if ARM64_KERNEL_USES_PMD_MAPS
116 - #define SWAPPER_MM_MMUFLAGS (PMD_ATTRINDX(MT_NORMAL) | SWAPPER_PMD_FLAGS)
110 + #define SWAPPER_RW_MMUFLAGS (PMD_ATTRINDX(MT_NORMAL) | SWAPPER_PMD_FLAGS)
111 + #define SWAPPER_RX_MMUFLAGS (SWAPPER_RW_MMUFLAGS | PMD_SECT_RDONLY)
117 112 #else
118 - #define SWAPPER_MM_MMUFLAGS (PTE_ATTRINDX(MT_NORMAL) | SWAPPER_PTE_FLAGS)
113 + #define SWAPPER_RW_MMUFLAGS (PTE_ATTRINDX(MT_NORMAL) | SWAPPER_PTE_FLAGS)
114 + #define SWAPPER_RX_MMUFLAGS (SWAPPER_RW_MMUFLAGS | PTE_RDONLY)
119 115 #endif
120 116
121 117 /*
+9
arch/arm64/include/asm/memory.h
···
174 174 #include <linux/types.h>
175 175 #include <asm/bug.h>
176 176
177 + #if VA_BITS > 48
177 178 extern u64 vabits_actual;
179 + #else
180 + #define vabits_actual ((u64)VA_BITS)
181 + #endif
178 182
179 183 extern s64 memstart_addr;
180 184 /* PHYS_OFFSET - the physical address of the start of memory. */
···
355 351 })
356 352
357 353 void dump_mem_limit(void);
354 +
355 + static inline bool defer_reserve_crashkernel(void)
356 + {
357 + 	return IS_ENABLED(CONFIG_ZONE_DMA) || IS_ENABLED(CONFIG_ZONE_DMA32);
358 + }
358 359 #endif /* !ASSEMBLY */
359 360
360 361 /*
+10 -6
arch/arm64/include/asm/mmu_context.h
···
60 60  * TCR_T0SZ(VA_BITS), unless system RAM is positioned very high in
61 61  * physical memory, in which case it will be smaller.
62 62  */
63 - extern u64 idmap_t0sz;
64 - extern u64 idmap_ptrs_per_pgd;
63 + extern int idmap_t0sz;
65 64
66 65 /*
67 66  * Ensure TCR.T0SZ is set to the provided value.
···
105 106 	cpu_switch_mm(mm->pgd, mm);
106 107 }
107 108
108 - static inline void cpu_install_idmap(void)
109 + static inline void __cpu_install_idmap(pgd_t *idmap)
109 110 {
110 111 	cpu_set_reserved_ttbr0();
111 112 	local_flush_tlb_all();
112 113 	cpu_set_idmap_tcr_t0sz();
113 114
114 - 	cpu_switch_mm(lm_alias(idmap_pg_dir), &init_mm);
115 + 	cpu_switch_mm(lm_alias(idmap), &init_mm);
116 + }
117 +
118 + static inline void cpu_install_idmap(void)
119 + {
120 + 	__cpu_install_idmap(idmap_pg_dir);
115 121 }
116 122
117 123 /*
···
147 143  * Atomically replaces the active TTBR1_EL1 PGD with a new VA-compatible PGD,
148 144  * avoiding the possibility of conflicting TLB entries being allocated.
149 145  */
150 - static inline void __nocfi cpu_replace_ttbr1(pgd_t *pgdp)
146 + static inline void __nocfi cpu_replace_ttbr1(pgd_t *pgdp, pgd_t *idmap)
151 147 {
152 148 	typedef void (ttbr_replace_func)(phys_addr_t);
153 149 	extern ttbr_replace_func idmap_cpu_replace_ttbr1;
···
170 166
171 167 	replace_phys = (void *)__pa_symbol(function_nocfi(idmap_cpu_replace_ttbr1));
172 168
173 - 	cpu_install_idmap();
169 + 	__cpu_install_idmap(idmap);
174 170 	replace_phys(ttbr1);
175 171 	cpu_uninstall_idmap();
176 172 }
+1 -2
arch/arm64/include/asm/pgtable-hwdef.h
···
281 281  */
282 282 #ifdef CONFIG_ARM64_PA_BITS_52
283 283 /*
284 -  * This should be GENMASK_ULL(47, 2).
285 284  * TTBR_ELx[1] is RES0 in this configuration.
286 285  */
287 - #define TTBR_BADDR_MASK_52 (((UL(1) << 46) - 1) << 2)
286 + #define TTBR_BADDR_MASK_52 GENMASK_ULL(47, 2)
288 287 #endif
289 288
290 289 #ifdef CONFIG_ARM64_VA_BITS_52
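Editor's note on the hunk above: the removed comment already said the open-coded mask "should be GENMASK_ULL(47, 2)", so the change is expected to be purely cosmetic. A small user-space check of that equivalence follows. `MY_GENMASK_ULL()` is a stand-in written for this note (bits `l` through `h` inclusive), assumed to match the semantics of the kernel macro; it is not the kernel's definition.

```c
#include <stdint.h>

/* Stand-in for GENMASK_ULL(h, l): bits l..h set, both ends inclusive. */
#define MY_GENMASK_ULL(h, l) \
	((~UINT64_C(0) << (l)) & (~UINT64_C(0) >> (63 - (h))))

/* The expression removed by the hunk: 46 one-bits shifted up by 2. */
static uint64_t ttbr_baddr_mask_52_old(void)
{
	return ((UINT64_C(1) << 46) - 1) << 2;
}

/* The replacement expression, using the stand-in macro. */
static uint64_t ttbr_baddr_mask_52_new(void)
{
	return MY_GENMASK_ULL(47, 2);
}
```

Both forms cover bits 47:2, which leaves bits 1:0 clear, in line with the surviving comment that TTBR_ELx[1] is RES0 in this configuration.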
+16
arch/arm64/include/asm/pgtable.h
···
45 45 	__flush_tlb_range(vma, addr, end, PUD_SIZE, false, 1)
46 46 #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
47 47
48 + static inline bool arch_thp_swp_supported(void)
49 + {
50 + 	return !system_supports_mte();
51 + }
52 + #define arch_thp_swp_supported arch_thp_swp_supported
53 +
48 54 /*
49 55  * Outside of a few very special situations (e.g. hibernation), we always
50 56  * use broadcast TLB invalidation instructions, therefore a spurious page
···
431 425 static inline pte_t pte_swp_clear_exclusive(pte_t pte)
432 426 {
433 427 	return clear_pte_bit(pte, __pgprot(PTE_SWP_EXCLUSIVE));
428 + }
429 +
430 + /*
431 +  * Select all bits except the pfn
432 +  */
433 + static inline pgprot_t pte_pgprot(pte_t pte)
434 + {
435 + 	unsigned long pfn = pte_pfn(pte);
436 +
437 + 	return __pgprot(pte_val(pfn_pte(pfn, __pgprot(0))) ^ pte_val(pte));
434 438 }
435 439
436 440 #ifdef CONFIG_NUMA_BALANCING
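Editor's note on `pte_pgprot()` in the hunk above: it rebuilds an entry that contains only the pfn and XORs it with the original, so the pfn bits cancel and the remaining attribute bits are left. The sketch below models that on a plain 64-bit integer. The layout (pfn in bits 47:12) and every `toy_` name are assumptions made for illustration; the real arm64 encoding, particularly with 52-bit physical addresses, is more involved.

```c
#include <stdint.h>

/* Toy layout, assumed for this example only: output address in bits 47:12. */
#define TOY_PAGE_SHIFT 12
#define TOY_ADDR_MASK  UINT64_C(0x0000FFFFFFFFF000)

/* Extract the page frame number from a toy entry. */
static uint64_t toy_pte_pfn(uint64_t pte)
{
	return (pte & TOY_ADDR_MASK) >> TOY_PAGE_SHIFT;
}

/* Build a toy entry from a pfn and a set of attribute bits. */
static uint64_t toy_pfn_pte(uint64_t pfn, uint64_t prot)
{
	return ((pfn << TOY_PAGE_SHIFT) & TOY_ADDR_MASK) | prot;
}

/* Same shape as pte_pgprot(): XOR away the pfn bits, keep everything else. */
static uint64_t toy_pte_pgprot(uint64_t pte)
{
	uint64_t pfn = toy_pte_pfn(pte);

	return toy_pfn_pte(pfn, 0) ^ pte;
}
```

The XOR form has the property that whatever bits the pfn encoding occupies are removed without the helper needing to name a mask for them, which is presumably why the hunk is written that way.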
+2 -1
arch/arm64/include/asm/processor.h
···
272 272
273 273 static inline void start_thread_common(struct pt_regs *regs, unsigned long pc)
274 274 {
275 + 	s32 previous_syscall = regs->syscallno;
275 276 	memset(regs, 0, sizeof(*regs));
276 - 	forget_syscall(regs);
277 + 	regs->syscallno = previous_syscall;
277 278 	regs->pc = pc;
278 279
279 280 	if (system_uses_irq_prio_masking())
+9 -119
arch/arm64/include/asm/sysreg.h
··· 192 192 193 193 #define SYS_ID_AA64PFR0_EL1 sys_reg(3, 0, 0, 4, 0) 194 194 #define SYS_ID_AA64PFR1_EL1 sys_reg(3, 0, 0, 4, 1) 195 - #define SYS_ID_AA64ZFR0_EL1 sys_reg(3, 0, 0, 4, 4) 196 - #define SYS_ID_AA64SMFR0_EL1 sys_reg(3, 0, 0, 4, 5) 197 195 198 196 #define SYS_ID_AA64DFR0_EL1 sys_reg(3, 0, 0, 5, 0) 199 197 #define SYS_ID_AA64DFR1_EL1 sys_reg(3, 0, 0, 5, 1) 200 198 201 199 #define SYS_ID_AA64AFR0_EL1 sys_reg(3, 0, 0, 5, 4) 202 200 #define SYS_ID_AA64AFR1_EL1 sys_reg(3, 0, 0, 5, 5) 203 - 204 - #define SYS_ID_AA64ISAR1_EL1 sys_reg(3, 0, 0, 6, 1) 205 - #define SYS_ID_AA64ISAR2_EL1 sys_reg(3, 0, 0, 6, 2) 206 201 207 202 #define SYS_ID_AA64MMFR0_EL1 sys_reg(3, 0, 0, 7, 0) 208 203 #define SYS_ID_AA64MMFR1_EL1 sys_reg(3, 0, 0, 7, 1) ··· 405 410 #define SYS_MAIR_EL1 sys_reg(3, 0, 10, 2, 0) 406 411 #define SYS_AMAIR_EL1 sys_reg(3, 0, 10, 3, 0) 407 412 408 - #define SYS_LORSA_EL1 sys_reg(3, 0, 10, 4, 0) 409 - #define SYS_LOREA_EL1 sys_reg(3, 0, 10, 4, 1) 410 - #define SYS_LORN_EL1 sys_reg(3, 0, 10, 4, 2) 411 - #define SYS_LORC_EL1 sys_reg(3, 0, 10, 4, 3) 412 - #define SYS_LORID_EL1 sys_reg(3, 0, 10, 4, 7) 413 - 414 413 #define SYS_VBAR_EL1 sys_reg(3, 0, 12, 0, 0) 415 414 #define SYS_DISR_EL1 sys_reg(3, 0, 12, 1, 1) 416 415 ··· 443 454 #define SYS_CNTKCTL_EL1 sys_reg(3, 0, 14, 1, 0) 444 455 445 456 #define SYS_CCSIDR_EL1 sys_reg(3, 1, 0, 0, 0) 446 - #define SYS_GMID_EL1 sys_reg(3, 1, 0, 0, 4) 447 457 #define SYS_AIDR_EL1 sys_reg(3, 1, 0, 0, 7) 448 458 449 459 #define SMIDR_EL1_IMPLEMENTER_SHIFT 24 450 460 #define SMIDR_EL1_SMPS_SHIFT 15 451 461 #define SMIDR_EL1_AFFINITY_SHIFT 0 452 - 453 - #define SYS_CTR_EL0 sys_reg(3, 3, 0, 0, 1) 454 - #define SYS_DCZID_EL0 sys_reg(3, 3, 0, 0, 7) 455 462 456 463 #define SYS_RNDR_EL0 sys_reg(3, 3, 2, 4, 0) 457 464 #define SYS_RNDRRS_EL0 sys_reg(3, 3, 2, 4, 1) ··· 689 704 /* Position the attr at the correct index */ 690 705 #define MAIR_ATTRIDX(attr, idx) ((attr) << ((idx) * 8)) 691 706 692 - /* id_aa64isar1 */ 693 - #define 
ID_AA64ISAR1_I8MM_SHIFT 52 694 - #define ID_AA64ISAR1_DGH_SHIFT 48 695 - #define ID_AA64ISAR1_BF16_SHIFT 44 696 - #define ID_AA64ISAR1_SPECRES_SHIFT 40 697 - #define ID_AA64ISAR1_SB_SHIFT 36 698 - #define ID_AA64ISAR1_FRINTTS_SHIFT 32 699 - #define ID_AA64ISAR1_GPI_SHIFT 28 700 - #define ID_AA64ISAR1_GPA_SHIFT 24 701 - #define ID_AA64ISAR1_LRCPC_SHIFT 20 702 - #define ID_AA64ISAR1_FCMA_SHIFT 16 703 - #define ID_AA64ISAR1_JSCVT_SHIFT 12 704 - #define ID_AA64ISAR1_API_SHIFT 8 705 - #define ID_AA64ISAR1_APA_SHIFT 4 706 - #define ID_AA64ISAR1_DPB_SHIFT 0 707 - 708 - #define ID_AA64ISAR1_APA_NI 0x0 709 - #define ID_AA64ISAR1_APA_ARCHITECTED 0x1 710 - #define ID_AA64ISAR1_APA_ARCH_EPAC 0x2 711 - #define ID_AA64ISAR1_APA_ARCH_EPAC2 0x3 712 - #define ID_AA64ISAR1_APA_ARCH_EPAC2_FPAC 0x4 713 - #define ID_AA64ISAR1_APA_ARCH_EPAC2_FPAC_CMB 0x5 714 - #define ID_AA64ISAR1_API_NI 0x0 715 - #define ID_AA64ISAR1_API_IMP_DEF 0x1 716 - #define ID_AA64ISAR1_API_IMP_DEF_EPAC 0x2 717 - #define ID_AA64ISAR1_API_IMP_DEF_EPAC2 0x3 718 - #define ID_AA64ISAR1_API_IMP_DEF_EPAC2_FPAC 0x4 719 - #define ID_AA64ISAR1_API_IMP_DEF_EPAC2_FPAC_CMB 0x5 720 - #define ID_AA64ISAR1_GPA_NI 0x0 721 - #define ID_AA64ISAR1_GPA_ARCHITECTED 0x1 722 - #define ID_AA64ISAR1_GPI_NI 0x0 723 - #define ID_AA64ISAR1_GPI_IMP_DEF 0x1 724 - 725 - /* id_aa64isar2 */ 726 - #define ID_AA64ISAR2_CLEARBHB_SHIFT 28 727 - #define ID_AA64ISAR2_APA3_SHIFT 12 728 - #define ID_AA64ISAR2_GPA3_SHIFT 8 729 - #define ID_AA64ISAR2_RPRES_SHIFT 4 730 - #define ID_AA64ISAR2_WFXT_SHIFT 0 731 - 732 - #define ID_AA64ISAR2_RPRES_8BIT 0x0 733 - #define ID_AA64ISAR2_RPRES_12BIT 0x1 734 - /* 735 - * Value 0x1 has been removed from the architecture, and is 736 - * reserved, but has not yet been removed from the ARM ARM 737 - * as of ARM DDI 0487G.b. 
738 - */ 739 - #define ID_AA64ISAR2_WFXT_NI 0x0 740 - #define ID_AA64ISAR2_WFXT_SUPPORTED 0x2 741 - 742 - #define ID_AA64ISAR2_APA3_NI 0x0 743 - #define ID_AA64ISAR2_APA3_ARCHITECTED 0x1 744 - #define ID_AA64ISAR2_APA3_ARCH_EPAC 0x2 745 - #define ID_AA64ISAR2_APA3_ARCH_EPAC2 0x3 746 - #define ID_AA64ISAR2_APA3_ARCH_EPAC2_FPAC 0x4 747 - #define ID_AA64ISAR2_APA3_ARCH_EPAC2_FPAC_CMB 0x5 748 - 749 - #define ID_AA64ISAR2_GPA3_NI 0x0 750 - #define ID_AA64ISAR2_GPA3_ARCHITECTED 0x1 751 - 752 707 /* id_aa64pfr0 */ 753 708 #define ID_AA64PFR0_CSV3_SHIFT 60 754 709 #define ID_AA64PFR0_CSV2_SHIFT 56 ··· 735 810 #define ID_AA64PFR1_MTE_EL0 0x1 736 811 #define ID_AA64PFR1_MTE 0x2 737 812 #define ID_AA64PFR1_MTE_ASYMM 0x3 738 - 739 - /* id_aa64zfr0 */ 740 - #define ID_AA64ZFR0_F64MM_SHIFT 56 741 - #define ID_AA64ZFR0_F32MM_SHIFT 52 742 - #define ID_AA64ZFR0_I8MM_SHIFT 44 743 - #define ID_AA64ZFR0_SM4_SHIFT 40 744 - #define ID_AA64ZFR0_SHA3_SHIFT 32 745 - #define ID_AA64ZFR0_BF16_SHIFT 20 746 - #define ID_AA64ZFR0_BITPERM_SHIFT 16 747 - #define ID_AA64ZFR0_AES_SHIFT 4 748 - #define ID_AA64ZFR0_SVEVER_SHIFT 0 749 - 750 - #define ID_AA64ZFR0_F64MM 0x1 751 - #define ID_AA64ZFR0_F32MM 0x1 752 - #define ID_AA64ZFR0_I8MM 0x1 753 - #define ID_AA64ZFR0_BF16 0x1 754 - #define ID_AA64ZFR0_SM4 0x1 755 - #define ID_AA64ZFR0_SHA3 0x1 756 - #define ID_AA64ZFR0_BITPERM 0x1 757 - #define ID_AA64ZFR0_AES 0x1 758 - #define ID_AA64ZFR0_AES_PMULL 0x2 759 - #define ID_AA64ZFR0_SVEVER_SVE2 0x1 760 - 761 - /* id_aa64smfr0 */ 762 - #define ID_AA64SMFR0_FA64_SHIFT 63 763 - #define ID_AA64SMFR0_I16I64_SHIFT 52 764 - #define ID_AA64SMFR0_F64F64_SHIFT 48 765 - #define ID_AA64SMFR0_I8I32_SHIFT 36 766 - #define ID_AA64SMFR0_F16F32_SHIFT 35 767 - #define ID_AA64SMFR0_B16F32_SHIFT 34 768 - #define ID_AA64SMFR0_F32F32_SHIFT 32 769 - 770 - #define ID_AA64SMFR0_FA64 0x1 771 - #define ID_AA64SMFR0_I16I64 0xf 772 - #define ID_AA64SMFR0_F64F64 0x1 773 - #define ID_AA64SMFR0_I8I32 0xf 774 - #define 
ID_AA64SMFR0_F16F32 0x1 775 - #define ID_AA64SMFR0_B16F32 0x1 776 - #define ID_AA64SMFR0_F32F32 0x1 777 813 778 814 /* id_aa64mmfr0 */ 779 815 #define ID_AA64MMFR0_ECV_SHIFT 60 ··· 788 902 789 903 /* id_aa64mmfr1 */ 790 904 #define ID_AA64MMFR1_ECBHB_SHIFT 60 905 + #define ID_AA64MMFR1_TIDCP1_SHIFT 52 791 906 #define ID_AA64MMFR1_HCX_SHIFT 40 792 907 #define ID_AA64MMFR1_AFP_SHIFT 44 793 908 #define ID_AA64MMFR1_ETS_SHIFT 36 ··· 804 917 805 918 #define ID_AA64MMFR1_VMIDBITS_8 0 806 919 #define ID_AA64MMFR1_VMIDBITS_16 2 920 + 921 + #define ID_AA64MMFR1_TIDCP1_NI 0 922 + #define ID_AA64MMFR1_TIDCP1_IMP 1 807 923 808 924 /* id_aa64mmfr2 */ 809 925 #define ID_AA64MMFR2_E0PD_SHIFT 60 ··· 974 1084 #define MVFR2_FPMISC_SHIFT 4 975 1085 #define MVFR2_SIMDMISC_SHIFT 0 976 1086 977 - #define DCZID_DZP_SHIFT 4 978 - #define DCZID_BS_SHIFT 0 979 - 980 1087 #define CPACR_EL1_FPEN_EL1EN (BIT(20)) /* enable EL1 access */ 981 1088 #define CPACR_EL1_FPEN_EL0EN (BIT(21)) /* enable EL0 access, if EL1EN set */ 982 1089 ··· 1008 1121 #define SYS_RGSR_EL1_SEED_MASK 0xffffUL 1009 1122 1010 1123 /* GMID_EL1 field definitions */ 1011 - #define SYS_GMID_EL1_BS_SHIFT 0 1012 - #define SYS_GMID_EL1_BS_SIZE 4 1124 + #define GMID_EL1_BS_SHIFT 0 1125 + #define GMID_EL1_BS_SIZE 4 1013 1126 1014 1127 /* TFSR{,E0}_EL1 bit definitions */ 1015 1128 #define SYS_TFSR_EL1_TF0_SHIFT 0 ··· 1210 1323 }) 1211 1324 1212 1325 #endif 1326 + 1327 + #define SYS_FIELD_GET(reg, field, val) \ 1328 + FIELD_GET(reg##_##field##_MASK, val) 1213 1329 1214 1330 #define SYS_FIELD_PREP(reg, field, val) \ 1215 1331 FIELD_PREP(reg##_##field##_MASK, val)
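Editor's note on `SYS_FIELD_GET()` at the end of the hunk above: it pastes the register and field names into a `<reg>_<field>_MASK` identifier and hands that to `FIELD_GET()`. The user-space sketch below shows the same token-pasting pattern. `MY_FIELD_GET()` is a simplified stand-in for the kernel's `FIELD_GET()` (it divides by the lowest set bit of the mask instead of computing a shift), and the two mask values are derived from the shift/mask pairs removed in the cache.h hunk (L1Ip at shift 14 with mask 3, CWG at shift 24 with mask 15), not copied from generated kernel headers.

```c
#include <stdint.h>

/* Field masks derived from the old CTR_* shift/mask definitions in this diff. */
#define CTR_EL0_L1Ip_MASK UINT64_C(0x000000000000C000) /* bits 15:14 */
#define CTR_EL0_CWG_MASK  UINT64_C(0x000000000F000000) /* bits 27:24 */

/* Simplified stand-in for FIELD_GET(): mask, then divide by the mask's lowest set bit. */
#define MY_FIELD_GET(mask, val) \
	(((val) & (mask)) / ((mask) & (~(mask) + 1)))

/* Same shape as SYS_FIELD_GET(): build the mask name by token pasting. */
#define MY_SYS_FIELD_GET(reg, field, val) \
	MY_FIELD_GET(reg##_##field##_MASK, val)

/* Extract the L1 instruction cache policy field from a CTR_EL0 value. */
static uint64_t ctr_l1ip(uint64_t ctr)
{
	return MY_SYS_FIELD_GET(CTR_EL0, L1Ip, ctr);
}

/* Extract the cache writeback granule field from a CTR_EL0 value. */
static uint64_t ctr_cwg(uint64_t ctr)
{
	return MY_SYS_FIELD_GET(CTR_EL0, CWG, ctr);
}
```

This is the pattern that lets the cache.h hunk replace the hand-written `CTR_L1IP(ctr)` shift-and-mask with `SYS_FIELD_GET(CTR_EL0, L1Ip, ctr)`.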
+47 -47
arch/arm64/include/asm/uaccess.h
··· 232 232 * The "__xxx_error" versions set the third argument to -EFAULT if an error 233 233 * occurs, and leave it unchanged on success. 234 234 */ 235 - #define __get_mem_asm(load, reg, x, addr, err) \ 235 + #define __get_mem_asm(load, reg, x, addr, err, type) \ 236 236 asm volatile( \ 237 237 "1: " load " " reg "1, [%2]\n" \ 238 238 "2:\n" \ 239 - _ASM_EXTABLE_UACCESS_ERR_ZERO(1b, 2b, %w0, %w1) \ 239 + _ASM_EXTABLE_##type##ACCESS_ERR_ZERO(1b, 2b, %w0, %w1) \ 240 240 : "+r" (err), "=&r" (x) \ 241 241 : "r" (addr)) 242 242 243 - #define __raw_get_mem(ldr, x, ptr, err) \ 244 - do { \ 245 - unsigned long __gu_val; \ 246 - switch (sizeof(*(ptr))) { \ 247 - case 1: \ 248 - __get_mem_asm(ldr "b", "%w", __gu_val, (ptr), (err)); \ 249 - break; \ 250 - case 2: \ 251 - __get_mem_asm(ldr "h", "%w", __gu_val, (ptr), (err)); \ 252 - break; \ 253 - case 4: \ 254 - __get_mem_asm(ldr, "%w", __gu_val, (ptr), (err)); \ 255 - break; \ 256 - case 8: \ 257 - __get_mem_asm(ldr, "%x", __gu_val, (ptr), (err)); \ 258 - break; \ 259 - default: \ 260 - BUILD_BUG(); \ 261 - } \ 262 - (x) = (__force __typeof__(*(ptr)))__gu_val; \ 243 + #define __raw_get_mem(ldr, x, ptr, err, type) \ 244 + do { \ 245 + unsigned long __gu_val; \ 246 + switch (sizeof(*(ptr))) { \ 247 + case 1: \ 248 + __get_mem_asm(ldr "b", "%w", __gu_val, (ptr), (err), type); \ 249 + break; \ 250 + case 2: \ 251 + __get_mem_asm(ldr "h", "%w", __gu_val, (ptr), (err), type); \ 252 + break; \ 253 + case 4: \ 254 + __get_mem_asm(ldr, "%w", __gu_val, (ptr), (err), type); \ 255 + break; \ 256 + case 8: \ 257 + __get_mem_asm(ldr, "%x", __gu_val, (ptr), (err), type); \ 258 + break; \ 259 + default: \ 260 + BUILD_BUG(); \ 261 + } \ 262 + (x) = (__force __typeof__(*(ptr)))__gu_val; \ 263 263 } while (0) 264 264 265 265 /* ··· 274 274 __chk_user_ptr(ptr); \ 275 275 \ 276 276 uaccess_ttbr0_enable(); \ 277 - __raw_get_mem("ldtr", __rgu_val, __rgu_ptr, err); \ 277 + __raw_get_mem("ldtr", __rgu_val, __rgu_ptr, err, U); \ 278 278 
uaccess_ttbr0_disable(); \ 279 279 \ 280 280 (x) = __rgu_val; \ ··· 314 314 \ 315 315 __uaccess_enable_tco_async(); \ 316 316 __raw_get_mem("ldr", *((type *)(__gkn_dst)), \ 317 - (__force type *)(__gkn_src), __gkn_err); \ 317 + (__force type *)(__gkn_src), __gkn_err, K); \ 318 318 __uaccess_disable_tco_async(); \ 319 319 \ 320 320 if (unlikely(__gkn_err)) \ 321 321 goto err_label; \ 322 322 } while (0) 323 323 324 - #define __put_mem_asm(store, reg, x, addr, err) \ 324 + #define __put_mem_asm(store, reg, x, addr, err, type) \ 325 325 asm volatile( \ 326 326 "1: " store " " reg "1, [%2]\n" \ 327 327 "2:\n" \ 328 - _ASM_EXTABLE_UACCESS_ERR(1b, 2b, %w0) \ 328 + _ASM_EXTABLE_##type##ACCESS_ERR(1b, 2b, %w0) \ 329 329 : "+r" (err) \ 330 330 : "r" (x), "r" (addr)) 331 331 332 - #define __raw_put_mem(str, x, ptr, err) \ 333 - do { \ 334 - __typeof__(*(ptr)) __pu_val = (x); \ 335 - switch (sizeof(*(ptr))) { \ 336 - case 1: \ 337 - __put_mem_asm(str "b", "%w", __pu_val, (ptr), (err)); \ 338 - break; \ 339 - case 2: \ 340 - __put_mem_asm(str "h", "%w", __pu_val, (ptr), (err)); \ 341 - break; \ 342 - case 4: \ 343 - __put_mem_asm(str, "%w", __pu_val, (ptr), (err)); \ 344 - break; \ 345 - case 8: \ 346 - __put_mem_asm(str, "%x", __pu_val, (ptr), (err)); \ 347 - break; \ 348 - default: \ 349 - BUILD_BUG(); \ 350 - } \ 332 + #define __raw_put_mem(str, x, ptr, err, type) \ 333 + do { \ 334 + __typeof__(*(ptr)) __pu_val = (x); \ 335 + switch (sizeof(*(ptr))) { \ 336 + case 1: \ 337 + __put_mem_asm(str "b", "%w", __pu_val, (ptr), (err), type); \ 338 + break; \ 339 + case 2: \ 340 + __put_mem_asm(str "h", "%w", __pu_val, (ptr), (err), type); \ 341 + break; \ 342 + case 4: \ 343 + __put_mem_asm(str, "%w", __pu_val, (ptr), (err), type); \ 344 + break; \ 345 + case 8: \ 346 + __put_mem_asm(str, "%x", __pu_val, (ptr), (err), type); \ 347 + break; \ 348 + default: \ 349 + BUILD_BUG(); \ 350 + } \ 351 351 } while (0) 352 352 353 353 /* ··· 362 362 __chk_user_ptr(__rpu_ptr); \ 363 363 \ 364 
364 uaccess_ttbr0_enable(); \ 365 - __raw_put_mem("sttr", __rpu_val, __rpu_ptr, err); \ 365 + __raw_put_mem("sttr", __rpu_val, __rpu_ptr, err, U); \ 366 366 uaccess_ttbr0_disable(); \ 367 367 } while (0) 368 368 ··· 400 400 \ 401 401 __uaccess_enable_tco_async(); \ 402 402 __raw_put_mem("str", *((type *)(__pkn_src)), \ 403 - (__force type *)(__pkn_dst), __pkn_err); \ 403 + (__force type *)(__pkn_dst), __pkn_err, K); \ 404 404 __uaccess_disable_tco_async(); \ 405 405 \ 406 406 if (unlikely(__pkn_err)) \
+9 -2
arch/arm64/include/asm/virt.h
···
36 36 #define HVC_RESET_VECTORS 2
37 37
38 38 /*
39 -  * HVC_VHE_RESTART - Upgrade the CPU from EL1 to EL2, if possible
39 +  * HVC_FINALISE_EL2 - Upgrade the CPU from EL1 to EL2, if possible
40 40  */
41 - #define HVC_VHE_RESTART 3
41 + #define HVC_FINALISE_EL2 3
42 42
43 43 /* Max number of HYP stub hypercalls */
44 44 #define HVC_STUB_HCALL_NR 4
···
48 48
49 49 #define BOOT_CPU_MODE_EL1 (0xe11)
50 50 #define BOOT_CPU_MODE_EL2 (0xe12)
51 +
52 + /*
53 +  * Flags returned together with the boot mode, but not preserved in
54 +  * __boot_cpu_mode. Used by the idreg override code to work out the
55 +  * boot state.
56 +  */
57 + #define BOOT_CPU_FLAG_E2H BIT_ULL(32)
51 58
52 59 #ifndef __ASSEMBLY__
53 60
+4
arch/arm64/include/uapi/asm/hwcap.h
···
19 19
20 20 /*
21 21  * HWCAP flags - for AT_HWCAP
22 +  *
23 +  * Bits 62 and 63 are reserved for use by libc.
24 +  * Bits 32-61 are unallocated for potential use by libc.
22 25  */
23 26 #define HWCAP_FP (1 << 0)
24 27 #define HWCAP_ASIMD (1 << 1)
···
91 88 #define HWCAP2_SME_F32F32 (1 << 29)
92 89 #define HWCAP2_SME_FA64 (1 << 30)
93 90 #define HWCAP2_WFXT (1UL << 31)
91 + #define HWCAP2_EBF16 (1UL << 32)
94 92
95 93 #endif /* _UAPI__ASM_HWCAP_H */
+6 -1
arch/arm64/kernel/Makefile
···
14 14 CFLAGS_REMOVE_syscall.o = -fstack-protector -fstack-protector-strong
15 15 CFLAGS_syscall.o += -fno-stack-protector
16 16
17 + # When KASAN is enabled, a stack trace is recorded for every alloc/free, which
18 + # can significantly impact performance. Avoid instrumenting the stack trace
19 + # collection code to minimize this impact.
20 + KASAN_SANITIZE_stacktrace.o := n
21 +
17 22 # It's not safe to invoke KCOV when portions of the kernel environment aren't
18 23 # available or are out-of-sync with HW state. Since `noinstr` doesn't always
19 24 # inhibit KCOV instrumentation, disable it for the entire compilation unit.
···
64 59 obj-$(CONFIG_ACPI_NUMA) += acpi_numa.o
65 60 obj-$(CONFIG_ARM64_ACPI_PARKING_PROTOCOL) += acpi_parking_protocol.o
66 61 obj-$(CONFIG_PARAVIRT) += paravirt.o
67 - obj-$(CONFIG_RANDOMIZE_BASE) += kaslr.o
62 + obj-$(CONFIG_RANDOMIZE_BASE) += kaslr.o pi/
68 63 obj-$(CONFIG_HIBERNATION) += hibernate.o hibernate-asm.o
69 64 obj-$(CONFIG_ELF_CORE) += elfcore.o
70 65 obj-$(CONFIG_KEXEC_CORE) += machine_kexec.o relocate_kernel.o \
+1 -1
arch/arm64/kernel/acpi.c
···
351 351 			prot = __acpi_get_writethrough_mem_attribute();
352 352 		}
353 353 	}
354 - 	return __ioremap(phys, size, prot);
354 + 	return ioremap_prot(phys, size, pgprot_val(prot));
355 355 }
356 356
357 357 /*
+1 -1
arch/arm64/kernel/acpi_numa.c
···
109 109 	pxm = pa->proximity_domain;
110 110 	node = acpi_map_pxm_to_node(pxm);
111 111
112 - 	if (node == NUMA_NO_NODE || node >= MAX_NUMNODES) {
112 + 	if (node == NUMA_NO_NODE) {
113 113 		pr_err("SRAT: Too many proximity domains %d\n", pxm);
114 114 		bad_srat();
115 115 		return;
+1 -1
arch/arm64/kernel/alternative.c
···
121 121
122 122 	ctr_el0 = read_sanitised_ftr_reg(SYS_CTR_EL0);
123 123 	d_size = 4 << cpuid_feature_extract_unsigned_field(ctr_el0,
124 - 							   CTR_DMINLINE_SHIFT);
124 + 							   CTR_EL0_DminLine_SHIFT);
125 125 	cur = start & ~(d_size - 1);
126 126 	do {
127 127 		/*
+5 -4
arch/arm64/kernel/armv8_deprecated.c
···
59 59 static LIST_HEAD(insn_emulation);
60 60 static int nr_insn_emulated __initdata;
61 61 static DEFINE_RAW_SPINLOCK(insn_emulation_lock);
62 + static DEFINE_MUTEX(insn_emulation_mutex);
62 63
63 64 static void register_emulation_hooks(struct insn_emulation_ops *ops)
64 65 {
···
208 207 				  loff_t *ppos)
209 208 {
210 209 	int ret = 0;
211 - 	struct insn_emulation *insn = (struct insn_emulation *) table->data;
210 + 	struct insn_emulation *insn = container_of(table->data, struct insn_emulation, current_mode);
212 211 	enum insn_emulation_mode prev_mode = insn->current_mode;
213 212
214 - 	table->data = &insn->current_mode;
213 + 	mutex_lock(&insn_emulation_mutex);
215 214 	ret = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
216 215
217 216 	if (ret || !write || prev_mode == insn->current_mode)
···
224 223 		update_insn_emulation_mode(insn, INSN_UNDEF);
225 224 	}
226 225 ret:
227 - 	table->data = insn;
226 + 	mutex_unlock(&insn_emulation_mutex);
228 227 	return ret;
229 228 }
230 229
···
248 247 		sysctl->maxlen = sizeof(int);
249 248
250 249 		sysctl->procname = insn->ops->name;
251 - 		sysctl->data = insn;
250 + 		sysctl->data = &insn->current_mode;
252 251 		sysctl->extra1 = &insn->min;
253 252 		sysctl->extra2 = &insn->max;
254 253 		sysctl->proc_handler = emulation_proc_handler;
+24 -2
arch/arm64/kernel/cpu_errata.c
··· 187 187 int scope) 188 188 { 189 189 u32 midr = read_cpuid_id(); 190 - bool has_dic = read_cpuid_cachetype() & BIT(CTR_DIC_SHIFT); 190 + bool has_dic = read_cpuid_cachetype() & BIT(CTR_EL0_DIC_SHIFT); 191 191 const struct midr_range range = MIDR_ALL_VERSIONS(MIDR_NEOVERSE_N1); 192 192 193 193 WARN_ON(scope != SCOPE_LOCAL_CPU || preemptible()); ··· 210 210 ERRATA_MIDR_RANGE(MIDR_CORTEX_A76, 0, 0, 3, 0), 211 211 /* Kryo4xx Gold (rcpe to rfpe) => (r0p0 to r3p0) */ 212 212 ERRATA_MIDR_RANGE(MIDR_QCOM_KRYO_4XX_GOLD, 0xc, 0xe, 0xf, 0xe), 213 + }, 214 + #endif 215 + #ifdef CONFIG_ARM64_ERRATUM_2441009 216 + { 217 + /* Cortex-A510 r0p0 -> r1p1. Fixed in r1p2 */ 218 + ERRATA_MIDR_RANGE(MIDR_CORTEX_A510, 0, 0, 1, 1), 213 219 }, 214 220 #endif 215 221 {}, ··· 401 395 }; 402 396 #endif /* CONFIG_ARM64_WORKAROUND_TRBE_WRITE_OUT_OF_RANGE */ 403 397 398 + #ifdef CONFIG_ARM64_ERRATUM_1742098 399 + static struct midr_range broken_aarch32_aes[] = { 400 + MIDR_RANGE(MIDR_CORTEX_A57, 0, 1, 0xf, 0xf), 401 + MIDR_ALL_VERSIONS(MIDR_CORTEX_A72), 402 + {}, 403 + }; 404 + #endif /* CONFIG_ARM64_WORKAROUND_TRBE_WRITE_OUT_OF_RANGE */ 405 + 404 406 const struct arm64_cpu_capabilities arm64_errata[] = { 405 407 #ifdef CONFIG_ARM64_WORKAROUND_CLEAN_CACHE 406 408 { ··· 494 480 #endif 495 481 #ifdef CONFIG_ARM64_WORKAROUND_REPEAT_TLBI 496 482 { 497 - .desc = "Qualcomm erratum 1009, or ARM erratum 1286807", 483 + .desc = "Qualcomm erratum 1009, or ARM erratum 1286807, 2441009", 498 484 .capability = ARM64_WORKAROUND_REPEAT_TLBI, 499 485 .type = ARM64_CPUCAP_LOCAL_CPU_ERRATUM, 500 486 .matches = cpucap_multi_entry_cap_matches, ··· 670 656 671 657 /* Cortex-A510 r0p0 - r0p1 */ 672 658 ERRATA_MIDR_REV_RANGE(MIDR_CORTEX_A510, 0, 0, 1) 659 + }, 660 + #endif 661 + #ifdef CONFIG_ARM64_ERRATUM_1742098 662 + { 663 + .desc = "ARM erratum 1742098", 664 + .capability = ARM64_WORKAROUND_1742098, 665 + CAP_MIDR_RANGE_LIST(broken_aarch32_aes), 666 + .type = ARM64_CPUCAP_LOCAL_CPU_ERRATUM, 673 667 }, 674 668 
#endif 675 669 {
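The errata entries in the cpu_errata.c hunks above select affected CPUs by MIDR variant/revision range, for example Cortex-A510 r0p0 to r1p1 for erratum 2441009. The following is a minimal standalone sketch of that kind of range check, assuming only the architectural MIDR_EL1 layout (variant in bits [23:20], revision in bits [3:0]). The names `var_rev`, `midr_in_range`, and the `SKETCH_*` masks are hypothetical and written for illustration; they are not the kernel's `ERRATA_MIDR_RANGE` machinery.

```c
#include <stdbool.h>
#include <stdint.h>

/* MIDR_EL1 field layout (architectural); macro names are illustrative only. */
#define SKETCH_MIDR_REVISION_MASK 0x0000000fu /* bits [3:0]   */
#define SKETCH_MIDR_VARIANT_SHIFT 20
#define SKETCH_MIDR_VARIANT_MASK  0x00f00000u /* bits [23:20] */
/* implementer [31:24] + architecture [19:16] + part number [15:4] */
#define SKETCH_MIDR_MODEL_MASK    0xff0ffff0u

/* Hypothetical helper: place variant and revision where they sit in MIDR. */
static uint32_t var_rev(uint32_t variant, uint32_t revision)
{
	return (variant << SKETCH_MIDR_VARIANT_SHIFT) | revision;
}

/*
 * Hypothetical helper: true if 'midr' names the same CPU model as 'model'
 * and its variant/revision lies in [rv_min, rv_max] inclusive. Because the
 * variant occupies higher bits than the revision, a plain integer compare
 * orders the pairs by variant first, then revision (r0p3 < r1p0 < r1p1).
 */
static bool midr_in_range(uint32_t midr, uint32_t model,
			  uint32_t rv_min, uint32_t rv_max)
{
	uint32_t rv = midr & (SKETCH_MIDR_VARIANT_MASK | SKETCH_MIDR_REVISION_MASK);

	if ((midr & SKETCH_MIDR_MODEL_MASK) != (model & SKETCH_MIDR_MODEL_MASK))
		return false;

	return rv >= rv_min && rv <= rv_max;
}
```

With a model value `m`, a call such as `midr_in_range(m | var_rev(1, 1), m, var_rev(0, 0), var_rev(1, 1))` accepts r1p1, while the same call with `var_rev(1, 2)` in the first argument rejects r1p2, which matches the "Fixed in r1p2" comment in the hunk.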
+238 -138
arch/arm64/kernel/cpufeature.c
··· 79 79 #include <asm/cpufeature.h> 80 80 #include <asm/cpu_ops.h> 81 81 #include <asm/fpsimd.h> 82 + #include <asm/hwcap.h> 82 83 #include <asm/insn.h> 83 84 #include <asm/kvm_host.h> 84 85 #include <asm/mmu_context.h> ··· 92 91 #include <asm/virt.h> 93 92 94 93 /* Kernel representation of AT_HWCAP and AT_HWCAP2 */ 95 - static unsigned long elf_hwcap __read_mostly; 94 + static DECLARE_BITMAP(elf_hwcap, MAX_CPU_FEATURES) __read_mostly; 96 95 97 96 #ifdef CONFIG_COMPAT 98 97 #define COMPAT_ELF_HWCAP_DEFAULT \ ··· 210 209 }; 211 210 212 211 static const struct arm64_ftr_bits ftr_id_aa64isar1[] = { 213 - ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR1_I8MM_SHIFT, 4, 0), 214 - ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR1_DGH_SHIFT, 4, 0), 215 - ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR1_BF16_SHIFT, 4, 0), 216 - ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR1_SPECRES_SHIFT, 4, 0), 217 - ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR1_SB_SHIFT, 4, 0), 218 - ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR1_FRINTTS_SHIFT, 4, 0), 212 + ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR1_EL1_I8MM_SHIFT, 4, 0), 213 + ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR1_EL1_DGH_SHIFT, 4, 0), 214 + ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR1_EL1_BF16_SHIFT, 4, 0), 215 + ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR1_EL1_SPECRES_SHIFT, 4, 0), 216 + ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR1_EL1_SB_SHIFT, 4, 0), 217 + ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR1_EL1_FRINTTS_SHIFT, 4, 0), 219 218 ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_PTR_AUTH), 220 - FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR1_GPI_SHIFT, 4, 0), 219 + FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR1_EL1_GPI_SHIFT, 4, 0), 221 220 
ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_PTR_AUTH), 222 - FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR1_GPA_SHIFT, 4, 0), 223 - ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR1_LRCPC_SHIFT, 4, 0), 224 - ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR1_FCMA_SHIFT, 4, 0), 225 - ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR1_JSCVT_SHIFT, 4, 0), 221 + FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR1_EL1_GPA_SHIFT, 4, 0), 222 + ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR1_EL1_LRCPC_SHIFT, 4, 0), 223 + ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR1_EL1_FCMA_SHIFT, 4, 0), 224 + ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR1_EL1_JSCVT_SHIFT, 4, 0), 226 225 ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_PTR_AUTH), 227 - FTR_STRICT, FTR_EXACT, ID_AA64ISAR1_API_SHIFT, 4, 0), 226 + FTR_STRICT, FTR_EXACT, ID_AA64ISAR1_EL1_API_SHIFT, 4, 0), 228 227 ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_PTR_AUTH), 229 - FTR_STRICT, FTR_EXACT, ID_AA64ISAR1_APA_SHIFT, 4, 0), 230 - ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR1_DPB_SHIFT, 4, 0), 228 + FTR_STRICT, FTR_EXACT, ID_AA64ISAR1_EL1_APA_SHIFT, 4, 0), 229 + ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR1_EL1_DPB_SHIFT, 4, 0), 231 230 ARM64_FTR_END, 232 231 }; 233 232 234 233 static const struct arm64_ftr_bits ftr_id_aa64isar2[] = { 235 - ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_HIGHER_SAFE, ID_AA64ISAR2_CLEARBHB_SHIFT, 4, 0), 234 + ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_HIGHER_SAFE, ID_AA64ISAR2_EL1_BC_SHIFT, 4, 0), 236 235 ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_PTR_AUTH), 237 - FTR_STRICT, FTR_EXACT, ID_AA64ISAR2_APA3_SHIFT, 4, 0), 236 + FTR_STRICT, FTR_EXACT, ID_AA64ISAR2_EL1_APA3_SHIFT, 4, 0), 238 237 ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_PTR_AUTH), 239 - FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR2_GPA3_SHIFT, 4, 
0), 240 - ARM64_FTR_BITS(FTR_VISIBLE, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64ISAR2_RPRES_SHIFT, 4, 0), 241 - ARM64_FTR_BITS(FTR_VISIBLE, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64ISAR2_WFXT_SHIFT, 4, 0), 238 + FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR2_EL1_GPA3_SHIFT, 4, 0), 239 + ARM64_FTR_BITS(FTR_VISIBLE, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64ISAR2_EL1_RPRES_SHIFT, 4, 0), 240 + ARM64_FTR_BITS(FTR_VISIBLE, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64ISAR2_EL1_WFxT_SHIFT, 4, 0), 242 241 ARM64_FTR_END, 243 242 }; 244 243 ··· 277 276 278 277 static const struct arm64_ftr_bits ftr_id_aa64zfr0[] = { 279 278 ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SVE), 280 - FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ZFR0_F64MM_SHIFT, 4, 0), 279 + FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ZFR0_EL1_F64MM_SHIFT, 4, 0), 281 280 ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SVE), 282 - FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ZFR0_F32MM_SHIFT, 4, 0), 281 + FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ZFR0_EL1_F32MM_SHIFT, 4, 0), 283 282 ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SVE), 284 - FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ZFR0_I8MM_SHIFT, 4, 0), 283 + FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ZFR0_EL1_I8MM_SHIFT, 4, 0), 285 284 ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SVE), 286 - FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ZFR0_SM4_SHIFT, 4, 0), 285 + FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ZFR0_EL1_SM4_SHIFT, 4, 0), 287 286 ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SVE), 288 - FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ZFR0_SHA3_SHIFT, 4, 0), 287 + FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ZFR0_EL1_SHA3_SHIFT, 4, 0), 289 288 ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SVE), 290 - FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ZFR0_BF16_SHIFT, 4, 0), 289 + FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ZFR0_EL1_BF16_SHIFT, 4, 0), 291 290 ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SVE), 292 - FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ZFR0_BITPERM_SHIFT, 4, 0), 291 + FTR_STRICT, FTR_LOWER_SAFE, 
ID_AA64ZFR0_EL1_BitPerm_SHIFT, 4, 0), 293 292 ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SVE), 294 - FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ZFR0_AES_SHIFT, 4, 0), 293 + FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ZFR0_EL1_AES_SHIFT, 4, 0), 295 294 ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SVE), 296 - FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ZFR0_SVEVER_SHIFT, 4, 0), 295 + FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ZFR0_EL1_SVEver_SHIFT, 4, 0), 297 296 ARM64_FTR_END, 298 297 }; 299 298 300 299 static const struct arm64_ftr_bits ftr_id_aa64smfr0[] = { 301 300 ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SME), 302 - FTR_STRICT, FTR_EXACT, ID_AA64SMFR0_FA64_SHIFT, 1, 0), 301 + FTR_STRICT, FTR_EXACT, ID_AA64SMFR0_EL1_FA64_SHIFT, 1, 0), 303 302 ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SME), 304 - FTR_STRICT, FTR_EXACT, ID_AA64SMFR0_I16I64_SHIFT, 4, 0), 303 + FTR_STRICT, FTR_EXACT, ID_AA64SMFR0_EL1_I16I64_SHIFT, 4, 0), 305 304 ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SME), 306 - FTR_STRICT, FTR_EXACT, ID_AA64SMFR0_F64F64_SHIFT, 1, 0), 305 + FTR_STRICT, FTR_EXACT, ID_AA64SMFR0_EL1_F64F64_SHIFT, 1, 0), 307 306 ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SME), 308 - FTR_STRICT, FTR_EXACT, ID_AA64SMFR0_I8I32_SHIFT, 4, 0), 307 + FTR_STRICT, FTR_EXACT, ID_AA64SMFR0_EL1_I8I32_SHIFT, 4, 0), 309 308 ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SME), 310 - FTR_STRICT, FTR_EXACT, ID_AA64SMFR0_F16F32_SHIFT, 1, 0), 309 + FTR_STRICT, FTR_EXACT, ID_AA64SMFR0_EL1_F16F32_SHIFT, 1, 0), 311 310 ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SME), 312 - FTR_STRICT, FTR_EXACT, ID_AA64SMFR0_B16F32_SHIFT, 1, 0), 311 + FTR_STRICT, FTR_EXACT, ID_AA64SMFR0_EL1_B16F32_SHIFT, 1, 0), 313 312 ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SME), 314 - FTR_STRICT, FTR_EXACT, ID_AA64SMFR0_F32F32_SHIFT, 1, 0), 313 + FTR_STRICT, FTR_EXACT, ID_AA64SMFR0_EL1_F32F32_SHIFT, 1, 0), 315 314 ARM64_FTR_END, 316 315 }; 317 316 ··· 362 361 }; 363 362 
364 363 static const struct arm64_ftr_bits ftr_id_aa64mmfr1[] = { 364 + ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64MMFR1_TIDCP1_SHIFT, 4, 0), 365 365 ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64MMFR1_AFP_SHIFT, 4, 0), 366 366 ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64MMFR1_ETS_SHIFT, 4, 0), 367 367 ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64MMFR1_TWED_SHIFT, 4, 0), ··· 398 396 399 397 static const struct arm64_ftr_bits ftr_ctr[] = { 400 398 ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_EXACT, 31, 1, 1), /* RES1 */ 401 - ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, CTR_DIC_SHIFT, 1, 1), 402 - ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, CTR_IDC_SHIFT, 1, 1), 403 - ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_HIGHER_OR_ZERO_SAFE, CTR_CWG_SHIFT, 4, 0), 404 - ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_HIGHER_OR_ZERO_SAFE, CTR_ERG_SHIFT, 4, 0), 405 - ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, CTR_DMINLINE_SHIFT, 4, 1), 399 + ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, CTR_EL0_DIC_SHIFT, 1, 1), 400 + ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, CTR_EL0_IDC_SHIFT, 1, 1), 401 + ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_HIGHER_OR_ZERO_SAFE, CTR_EL0_CWG_SHIFT, 4, 0), 402 + ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_HIGHER_OR_ZERO_SAFE, CTR_EL0_ERG_SHIFT, 4, 0), 403 + ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, CTR_EL0_DminLine_SHIFT, 4, 1), 406 404 /* 407 405 * Linux can handle differing I-cache policies. Userspace JITs will 408 406 * make use of *minLine. 409 407 * If we have differing I-cache policies, report it as the weakest - VIPT. 
410 408 */ 411 - ARM64_FTR_BITS(FTR_VISIBLE, FTR_NONSTRICT, FTR_EXACT, CTR_L1IP_SHIFT, 2, ICACHE_POLICY_VIPT), /* L1Ip */ 412 - ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, CTR_IMINLINE_SHIFT, 4, 0), 409 + ARM64_FTR_BITS(FTR_VISIBLE, FTR_NONSTRICT, FTR_EXACT, CTR_EL0_L1Ip_SHIFT, 2, CTR_EL0_L1Ip_VIPT), /* L1Ip */ 410 + ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, CTR_EL0_IminLine_SHIFT, 4, 0), 413 411 ARM64_FTR_END, 414 412 }; 415 413 ··· 455 453 }; 456 454 457 455 static const struct arm64_ftr_bits ftr_dczid[] = { 458 - ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_EXACT, DCZID_DZP_SHIFT, 1, 1), 459 - ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, DCZID_BS_SHIFT, 4, 0), 456 + ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_EXACT, DCZID_EL0_DZP_SHIFT, 1, 1), 457 + ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, DCZID_EL0_BS_SHIFT, 4, 0), 460 458 ARM64_FTR_END, 461 459 }; 462 460 463 461 static const struct arm64_ftr_bits ftr_gmid[] = { 464 - ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, SYS_GMID_EL1_BS_SHIFT, 4, 0), 462 + ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, GMID_EL1_BS_SHIFT, 4, 0), 465 463 ARM64_FTR_END, 466 464 }; 467 465 ··· 563 561 564 562 static const struct arm64_ftr_bits ftr_id_dfr0[] = { 565 563 /* [31:28] TraceFilt */ 566 - S_ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_DFR0_PERFMON_SHIFT, 4, 0xf), 564 + S_ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_EXACT, ID_DFR0_PERFMON_SHIFT, 4, 0), 567 565 ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_DFR0_MPROFDBG_SHIFT, 4, 0), 568 566 ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_DFR0_MMAPTRC_SHIFT, 4, 0), 569 567 ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_DFR0_COPTRC_SHIFT, 4, 0), ··· 633 631 __ARM64_FTR_REG_OVERRIDE(#id, id, table, &no_override) 634 632 635 633 struct arm64_ftr_override __ro_after_init id_aa64mmfr1_override; 634 + struct arm64_ftr_override __ro_after_init id_aa64pfr0_override; 636 635 
struct arm64_ftr_override __ro_after_init id_aa64pfr1_override; 636 + struct arm64_ftr_override __ro_after_init id_aa64zfr0_override; 637 + struct arm64_ftr_override __ro_after_init id_aa64smfr0_override; 637 638 struct arm64_ftr_override __ro_after_init id_aa64isar1_override; 638 639 struct arm64_ftr_override __ro_after_init id_aa64isar2_override; 639 640 ··· 673 668 ARM64_FTR_REG(SYS_ID_MMFR5_EL1, ftr_id_mmfr5), 674 669 675 670 /* Op1 = 0, CRn = 0, CRm = 4 */ 676 - ARM64_FTR_REG(SYS_ID_AA64PFR0_EL1, ftr_id_aa64pfr0), 671 + ARM64_FTR_REG_OVERRIDE(SYS_ID_AA64PFR0_EL1, ftr_id_aa64pfr0, 672 + &id_aa64pfr0_override), 677 673 ARM64_FTR_REG_OVERRIDE(SYS_ID_AA64PFR1_EL1, ftr_id_aa64pfr1, 678 674 &id_aa64pfr1_override), 679 - ARM64_FTR_REG(SYS_ID_AA64ZFR0_EL1, ftr_id_aa64zfr0), 680 - ARM64_FTR_REG(SYS_ID_AA64SMFR0_EL1, ftr_id_aa64smfr0), 675 + ARM64_FTR_REG_OVERRIDE(SYS_ID_AA64ZFR0_EL1, ftr_id_aa64zfr0, 676 + &id_aa64zfr0_override), 677 + ARM64_FTR_REG_OVERRIDE(SYS_ID_AA64SMFR0_EL1, ftr_id_aa64smfr0, 678 + &id_aa64smfr0_override), 681 679 682 680 /* Op1 = 0, CRn = 0, CRm = 5 */ 683 681 ARM64_FTR_REG(SYS_ID_AA64DFR0_EL1, ftr_id_aa64dfr0), ··· 1001 993 if (id_aa64pfr0_32bit_el0(info->reg_id_aa64pfr0)) 1002 994 init_32bit_cpu_features(&info->aarch32); 1003 995 1004 - if (id_aa64pfr0_sve(info->reg_id_aa64pfr0)) { 996 + if (IS_ENABLED(CONFIG_ARM64_SVE) && 997 + id_aa64pfr0_sve(read_sanitised_ftr_reg(SYS_ID_AA64PFR0_EL1))) { 998 + info->reg_zcr = read_zcr_features(); 1005 999 init_cpu_ftr_reg(SYS_ZCR_EL1, info->reg_zcr); 1006 1000 vec_init_vq_map(ARM64_VEC_SVE); 1007 1001 } 1008 1002 1009 - if (id_aa64pfr1_sme(info->reg_id_aa64pfr1)) { 1003 + if (IS_ENABLED(CONFIG_ARM64_SME) && 1004 + id_aa64pfr1_sme(read_sanitised_ftr_reg(SYS_ID_AA64PFR1_EL1))) { 1005 + info->reg_smcr = read_smcr_features(); 1006 + /* 1007 + * We mask out SMPS since even if the hardware 1008 + * supports priorities the kernel does not at present 1009 + * and we block access to them. 
1010 + */ 1011 + info->reg_smidr = read_cpuid(SMIDR_EL1) & ~SMIDR_EL1_SMPS; 1010 1012 init_cpu_ftr_reg(SYS_SMCR_EL1, info->reg_smcr); 1011 - if (IS_ENABLED(CONFIG_ARM64_SME)) 1012 - vec_init_vq_map(ARM64_VEC_SME); 1013 + vec_init_vq_map(ARM64_VEC_SME); 1013 1014 } 1014 1015 1015 1016 if (id_aa64pfr1_mte(info->reg_id_aa64pfr1)) ··· 1250 1233 taint |= check_update_ftr_reg(SYS_ID_AA64SMFR0_EL1, cpu, 1251 1234 info->reg_id_aa64smfr0, boot->reg_id_aa64smfr0); 1252 1235 1253 - if (id_aa64pfr0_sve(info->reg_id_aa64pfr0)) { 1236 + if (IS_ENABLED(CONFIG_ARM64_SVE) && 1237 + id_aa64pfr0_sve(read_sanitised_ftr_reg(SYS_ID_AA64PFR0_EL1))) { 1238 + info->reg_zcr = read_zcr_features(); 1254 1239 taint |= check_update_ftr_reg(SYS_ZCR_EL1, cpu, 1255 1240 info->reg_zcr, boot->reg_zcr); 1256 1241 1257 - /* Probe vector lengths, unless we already gave up on SVE */ 1258 - if (id_aa64pfr0_sve(read_sanitised_ftr_reg(SYS_ID_AA64PFR0_EL1)) && 1259 - !system_capabilities_finalized()) 1242 + /* Probe vector lengths */ 1243 + if (!system_capabilities_finalized()) 1260 1244 vec_update_vq_map(ARM64_VEC_SVE); 1261 1245 } 1262 1246 1263 - if (id_aa64pfr1_sme(info->reg_id_aa64pfr1)) { 1247 + if (IS_ENABLED(CONFIG_ARM64_SME) && 1248 + id_aa64pfr1_sme(read_sanitised_ftr_reg(SYS_ID_AA64PFR1_EL1))) { 1249 + info->reg_smcr = read_smcr_features(); 1250 + /* 1251 + * We mask out SMPS since even if the hardware 1252 + * supports priorities the kernel does not at present 1253 + * and we block access to them. 
1254 + */ 1255 + info->reg_smidr = read_cpuid(SMIDR_EL1) & ~SMIDR_EL1_SMPS; 1264 1256 taint |= check_update_ftr_reg(SYS_SMCR_EL1, cpu, 1265 1257 info->reg_smcr, boot->reg_smcr); 1266 1258 1267 - /* Probe vector lengths, unless we already gave up on SME */ 1268 - if (id_aa64pfr1_sme(read_sanitised_ftr_reg(SYS_ID_AA64PFR1_EL1)) && 1269 - !system_capabilities_finalized()) 1259 + /* Probe vector lengths */ 1260 + if (!system_capabilities_finalized()) 1270 1261 vec_update_vq_map(ARM64_VEC_SME); 1271 1262 } 1272 1263 ··· 1505 1480 else 1506 1481 ctr = read_cpuid_effective_cachetype(); 1507 1482 1508 - return ctr & BIT(CTR_IDC_SHIFT); 1483 + return ctr & BIT(CTR_EL0_IDC_SHIFT); 1509 1484 } 1510 1485 1511 1486 static void cpu_emulate_effective_ctr(const struct arm64_cpu_capabilities *__unused) ··· 1516 1491 * to the CTR_EL0 on this CPU and emulate it with the real/safe 1517 1492 * value. 1518 1493 */ 1519 - if (!(read_cpuid_cachetype() & BIT(CTR_IDC_SHIFT))) 1494 + if (!(read_cpuid_cachetype() & BIT(CTR_EL0_IDC_SHIFT))) 1520 1495 sysreg_clear_set(sctlr_el1, SCTLR_EL1_UCT, 0); 1521 1496 } 1522 1497 ··· 1530 1505 else 1531 1506 ctr = read_cpuid_cachetype(); 1532 1507 1533 - return ctr & BIT(CTR_DIC_SHIFT); 1508 + return ctr & BIT(CTR_EL0_DIC_SHIFT); 1534 1509 } 1535 1510 1536 1511 static bool __maybe_unused ··· 1670 1645 } 1671 1646 1672 1647 #ifdef CONFIG_UNMAP_KERNEL_AT_EL0 1648 + #define KPTI_NG_TEMP_VA (-(1UL << PMD_SHIFT)) 1649 + 1650 + extern 1651 + void create_kpti_ng_temp_pgd(pgd_t *pgdir, phys_addr_t phys, unsigned long virt, 1652 + phys_addr_t size, pgprot_t prot, 1653 + phys_addr_t (*pgtable_alloc)(int), int flags); 1654 + 1655 + static phys_addr_t kpti_ng_temp_alloc; 1656 + 1657 + static phys_addr_t kpti_ng_pgd_alloc(int shift) 1658 + { 1659 + kpti_ng_temp_alloc -= PAGE_SIZE; 1660 + return kpti_ng_temp_alloc; 1661 + } 1662 + 1673 1663 static void __nocfi 1674 1664 kpti_install_ng_mappings(const struct arm64_cpu_capabilities *__unused) 1675 1665 { 1676 - typedef 
void (kpti_remap_fn)(int, int, phys_addr_t); 1666 + typedef void (kpti_remap_fn)(int, int, phys_addr_t, unsigned long); 1677 1667 extern kpti_remap_fn idmap_kpti_install_ng_mappings; 1678 1668 kpti_remap_fn *remap_fn; 1679 1669 1680 1670 int cpu = smp_processor_id(); 1671 + int levels = CONFIG_PGTABLE_LEVELS; 1672 + int order = order_base_2(levels); 1673 + u64 kpti_ng_temp_pgd_pa = 0; 1674 + pgd_t *kpti_ng_temp_pgd; 1675 + u64 alloc = 0; 1681 1676 1682 1677 if (__this_cpu_read(this_cpu_vector) == vectors) { 1683 1678 const char *v = arm64_get_bp_hardening_vector(EL1_VECTOR_KPTI); ··· 1715 1670 1716 1671 remap_fn = (void *)__pa_symbol(function_nocfi(idmap_kpti_install_ng_mappings)); 1717 1672 1673 + if (!cpu) { 1674 + alloc = __get_free_pages(GFP_ATOMIC | __GFP_ZERO, order); 1675 + kpti_ng_temp_pgd = (pgd_t *)(alloc + (levels - 1) * PAGE_SIZE); 1676 + kpti_ng_temp_alloc = kpti_ng_temp_pgd_pa = __pa(kpti_ng_temp_pgd); 1677 + 1678 + // 1679 + // Create a minimal page table hierarchy that permits us to map 1680 + // the swapper page tables temporarily as we traverse them. 1681 + // 1682 + // The physical pages are laid out as follows: 1683 + // 1684 + // +--------+-/-------+-/------ +-\\--------+ 1685 + // : PTE[] : | PMD[] : | PUD[] : || PGD[] : 1686 + // +--------+-\-------+-\------ +-//--------+ 1687 + // ^ 1688 + // The first page is mapped into this hierarchy at a PMD_SHIFT 1689 + // aligned virtual address, so that we can manipulate the PTE 1690 + // level entries while the mapping is active. The first entry 1691 + // covers the PTE[] page itself, the remaining entries are free 1692 + // to be used as a ad-hoc fixmap. 
1693 + // 1694 + create_kpti_ng_temp_pgd(kpti_ng_temp_pgd, __pa(alloc), 1695 + KPTI_NG_TEMP_VA, PAGE_SIZE, PAGE_KERNEL, 1696 + kpti_ng_pgd_alloc, 0); 1697 + } 1698 + 1718 1699 cpu_install_idmap(); 1719 - remap_fn(cpu, num_online_cpus(), __pa_symbol(swapper_pg_dir)); 1700 + remap_fn(cpu, num_online_cpus(), kpti_ng_temp_pgd_pa, KPTI_NG_TEMP_VA); 1720 1701 cpu_uninstall_idmap(); 1721 1702 1722 - if (!cpu) 1703 + if (!cpu) { 1704 + free_pages(alloc, order); 1723 1705 arm64_use_ng_mappings = true; 1706 + } 1724 1707 } 1725 1708 #else 1726 1709 static void ··· 2044 1971 } 2045 1972 #endif /* CONFIG_ARM64_MTE */ 2046 1973 1974 + static void elf_hwcap_fixup(void) 1975 + { 1976 + #ifdef CONFIG_ARM64_ERRATUM_1742098 1977 + if (cpus_have_const_cap(ARM64_WORKAROUND_1742098)) 1978 + compat_elf_hwcap2 &= ~COMPAT_HWCAP2_AES; 1979 + #endif /* ARM64_ERRATUM_1742098 */ 1980 + } 1981 + 2047 1982 #ifdef CONFIG_KVM 2048 1983 static bool is_kvm_protected_mode(const struct arm64_cpu_capabilities *entry, int __unused) 2049 1984 { 2050 1985 return kvm_get_mode() == KVM_MODE_PROTECTED; 2051 1986 } 2052 1987 #endif /* CONFIG_KVM */ 1988 + 1989 + static void cpu_trap_el0_impdef(const struct arm64_cpu_capabilities *__unused) 1990 + { 1991 + sysreg_clear_set(sctlr_el1, 0, SCTLR_EL1_TIDCP); 1992 + } 2053 1993 2054 1994 /* Internal helper functions to match cpu capability type */ 2055 1995 static bool ··· 2218 2132 .type = ARM64_CPUCAP_SYSTEM_FEATURE, 2219 2133 .matches = has_cpuid_feature, 2220 2134 .sys_reg = SYS_ID_AA64ISAR1_EL1, 2221 - .field_pos = ID_AA64ISAR1_DPB_SHIFT, 2135 + .field_pos = ID_AA64ISAR1_EL1_DPB_SHIFT, 2222 2136 .field_width = 4, 2223 2137 .min_field_value = 1, 2224 2138 }, ··· 2229 2143 .matches = has_cpuid_feature, 2230 2144 .sys_reg = SYS_ID_AA64ISAR1_EL1, 2231 2145 .sign = FTR_UNSIGNED, 2232 - .field_pos = ID_AA64ISAR1_DPB_SHIFT, 2146 + .field_pos = ID_AA64ISAR1_EL1_DPB_SHIFT, 2233 2147 .field_width = 4, 2234 2148 .min_field_value = 2, 2235 2149 }, ··· 2389 2303 .type = 
ARM64_CPUCAP_SYSTEM_FEATURE, 2390 2304 .matches = has_cpuid_feature, 2391 2305 .sys_reg = SYS_ID_AA64ISAR1_EL1, 2392 - .field_pos = ID_AA64ISAR1_SB_SHIFT, 2306 + .field_pos = ID_AA64ISAR1_EL1_SB_SHIFT, 2393 2307 .field_width = 4, 2394 2308 .sign = FTR_UNSIGNED, 2395 2309 .min_field_value = 1, ··· 2401 2315 .type = ARM64_CPUCAP_BOOT_CPU_FEATURE, 2402 2316 .sys_reg = SYS_ID_AA64ISAR1_EL1, 2403 2317 .sign = FTR_UNSIGNED, 2404 - .field_pos = ID_AA64ISAR1_APA_SHIFT, 2318 + .field_pos = ID_AA64ISAR1_EL1_APA_SHIFT, 2405 2319 .field_width = 4, 2406 - .min_field_value = ID_AA64ISAR1_APA_ARCHITECTED, 2320 + .min_field_value = ID_AA64ISAR1_EL1_APA_PAuth, 2407 2321 .matches = has_address_auth_cpucap, 2408 2322 }, 2409 2323 { ··· 2412 2326 .type = ARM64_CPUCAP_BOOT_CPU_FEATURE, 2413 2327 .sys_reg = SYS_ID_AA64ISAR2_EL1, 2414 2328 .sign = FTR_UNSIGNED, 2415 - .field_pos = ID_AA64ISAR2_APA3_SHIFT, 2329 + .field_pos = ID_AA64ISAR2_EL1_APA3_SHIFT, 2416 2330 .field_width = 4, 2417 - .min_field_value = ID_AA64ISAR2_APA3_ARCHITECTED, 2331 + .min_field_value = ID_AA64ISAR2_EL1_APA3_PAuth, 2418 2332 .matches = has_address_auth_cpucap, 2419 2333 }, 2420 2334 { ··· 2423 2337 .type = ARM64_CPUCAP_BOOT_CPU_FEATURE, 2424 2338 .sys_reg = SYS_ID_AA64ISAR1_EL1, 2425 2339 .sign = FTR_UNSIGNED, 2426 - .field_pos = ID_AA64ISAR1_API_SHIFT, 2340 + .field_pos = ID_AA64ISAR1_EL1_API_SHIFT, 2427 2341 .field_width = 4, 2428 - .min_field_value = ID_AA64ISAR1_API_IMP_DEF, 2342 + .min_field_value = ID_AA64ISAR1_EL1_API_PAuth, 2429 2343 .matches = has_address_auth_cpucap, 2430 2344 }, 2431 2345 { ··· 2439 2353 .type = ARM64_CPUCAP_SYSTEM_FEATURE, 2440 2354 .sys_reg = SYS_ID_AA64ISAR1_EL1, 2441 2355 .sign = FTR_UNSIGNED, 2442 - .field_pos = ID_AA64ISAR1_GPA_SHIFT, 2356 + .field_pos = ID_AA64ISAR1_EL1_GPA_SHIFT, 2443 2357 .field_width = 4, 2444 - .min_field_value = ID_AA64ISAR1_GPA_ARCHITECTED, 2358 + .min_field_value = ID_AA64ISAR1_EL1_GPA_IMP, 2445 2359 .matches = has_cpuid_feature, 2446 2360 }, 2447 2361 { 
··· 2450 2364 .type = ARM64_CPUCAP_SYSTEM_FEATURE, 2451 2365 .sys_reg = SYS_ID_AA64ISAR2_EL1, 2452 2366 .sign = FTR_UNSIGNED, 2453 - .field_pos = ID_AA64ISAR2_GPA3_SHIFT, 2367 + .field_pos = ID_AA64ISAR2_EL1_GPA3_SHIFT, 2454 2368 .field_width = 4, 2455 - .min_field_value = ID_AA64ISAR2_GPA3_ARCHITECTED, 2369 + .min_field_value = ID_AA64ISAR2_EL1_GPA3_IMP, 2456 2370 .matches = has_cpuid_feature, 2457 2371 }, 2458 2372 { ··· 2461 2375 .type = ARM64_CPUCAP_SYSTEM_FEATURE, 2462 2376 .sys_reg = SYS_ID_AA64ISAR1_EL1, 2463 2377 .sign = FTR_UNSIGNED, 2464 - .field_pos = ID_AA64ISAR1_GPI_SHIFT, 2378 + .field_pos = ID_AA64ISAR1_EL1_GPI_SHIFT, 2465 2379 .field_width = 4, 2466 - .min_field_value = ID_AA64ISAR1_GPI_IMP_DEF, 2380 + .min_field_value = ID_AA64ISAR1_EL1_GPI_IMP, 2467 2381 .matches = has_cpuid_feature, 2468 2382 }, 2469 2383 { ··· 2564 2478 .type = ARM64_CPUCAP_SYSTEM_FEATURE, 2565 2479 .sys_reg = SYS_ID_AA64ISAR1_EL1, 2566 2480 .sign = FTR_UNSIGNED, 2567 - .field_pos = ID_AA64ISAR1_LRCPC_SHIFT, 2481 + .field_pos = ID_AA64ISAR1_EL1_LRCPC_SHIFT, 2568 2482 .field_width = 4, 2569 2483 .matches = has_cpuid_feature, 2570 2484 .min_field_value = 1, ··· 2589 2503 .capability = ARM64_SME_FA64, 2590 2504 .sys_reg = SYS_ID_AA64SMFR0_EL1, 2591 2505 .sign = FTR_UNSIGNED, 2592 - .field_pos = ID_AA64SMFR0_FA64_SHIFT, 2506 + .field_pos = ID_AA64SMFR0_EL1_FA64_SHIFT, 2593 2507 .field_width = 1, 2594 - .min_field_value = ID_AA64SMFR0_FA64, 2508 + .min_field_value = ID_AA64SMFR0_EL1_FA64_IMP, 2595 2509 .matches = has_cpuid_feature, 2596 2510 .cpu_enable = fa64_kernel_enable, 2597 2511 }, ··· 2602 2516 .type = ARM64_CPUCAP_SYSTEM_FEATURE, 2603 2517 .sys_reg = SYS_ID_AA64ISAR2_EL1, 2604 2518 .sign = FTR_UNSIGNED, 2605 - .field_pos = ID_AA64ISAR2_WFXT_SHIFT, 2519 + .field_pos = ID_AA64ISAR2_EL1_WFxT_SHIFT, 2606 2520 .field_width = 4, 2607 2521 .matches = has_cpuid_feature, 2608 - .min_field_value = ID_AA64ISAR2_WFXT_SUPPORTED, 2522 + .min_field_value = ID_AA64ISAR2_EL1_WFxT_IMP, 2523 + 
}, 2524 + { 2525 + .desc = "Trap EL0 IMPLEMENTATION DEFINED functionality", 2526 + .capability = ARM64_HAS_TIDCP1, 2527 + .type = ARM64_CPUCAP_SYSTEM_FEATURE, 2528 + .sys_reg = SYS_ID_AA64MMFR1_EL1, 2529 + .sign = FTR_UNSIGNED, 2530 + .field_pos = ID_AA64MMFR1_TIDCP1_SHIFT, 2531 + .field_width = 4, 2532 + .min_field_value = ID_AA64MMFR1_TIDCP1_IMP, 2533 + .matches = has_cpuid_feature, 2534 + .cpu_enable = cpu_trap_el0_impdef, 2609 2535 }, 2610 2536 {}, 2611 2537 }; ··· 2658 2560 #ifdef CONFIG_ARM64_PTR_AUTH 2659 2561 static const struct arm64_cpu_capabilities ptr_auth_hwcap_addr_matches[] = { 2660 2562 { 2661 - HWCAP_CPUID_MATCH(SYS_ID_AA64ISAR1_EL1, ID_AA64ISAR1_APA_SHIFT, 2563 + HWCAP_CPUID_MATCH(SYS_ID_AA64ISAR1_EL1, ID_AA64ISAR1_EL1_APA_SHIFT, 2662 2564 4, FTR_UNSIGNED, 2663 - ID_AA64ISAR1_APA_ARCHITECTED) 2565 + ID_AA64ISAR1_EL1_APA_PAuth) 2664 2566 }, 2665 2567 { 2666 - HWCAP_CPUID_MATCH(SYS_ID_AA64ISAR2_EL1, ID_AA64ISAR2_APA3_SHIFT, 2667 - 4, FTR_UNSIGNED, ID_AA64ISAR2_APA3_ARCHITECTED) 2568 + HWCAP_CPUID_MATCH(SYS_ID_AA64ISAR2_EL1, ID_AA64ISAR2_EL1_APA3_SHIFT, 2569 + 4, FTR_UNSIGNED, ID_AA64ISAR2_EL1_APA3_PAuth) 2668 2570 }, 2669 2571 { 2670 - HWCAP_CPUID_MATCH(SYS_ID_AA64ISAR1_EL1, ID_AA64ISAR1_API_SHIFT, 2671 - 4, FTR_UNSIGNED, ID_AA64ISAR1_API_IMP_DEF) 2572 + HWCAP_CPUID_MATCH(SYS_ID_AA64ISAR1_EL1, ID_AA64ISAR1_EL1_API_SHIFT, 2573 + 4, FTR_UNSIGNED, ID_AA64ISAR1_EL1_API_PAuth) 2672 2574 }, 2673 2575 {}, 2674 2576 }; 2675 2577 2676 2578 static const struct arm64_cpu_capabilities ptr_auth_hwcap_gen_matches[] = { 2677 2579 { 2678 - HWCAP_CPUID_MATCH(SYS_ID_AA64ISAR1_EL1, ID_AA64ISAR1_GPA_SHIFT, 2679 - 4, FTR_UNSIGNED, ID_AA64ISAR1_GPA_ARCHITECTED) 2580 + HWCAP_CPUID_MATCH(SYS_ID_AA64ISAR1_EL1, ID_AA64ISAR1_EL1_GPA_SHIFT, 2581 + 4, FTR_UNSIGNED, ID_AA64ISAR1_EL1_GPA_IMP) 2680 2582 }, 2681 2583 { 2682 - HWCAP_CPUID_MATCH(SYS_ID_AA64ISAR2_EL1, ID_AA64ISAR2_GPA3_SHIFT, 2683 - 4, FTR_UNSIGNED, ID_AA64ISAR2_GPA3_ARCHITECTED) 2584 + 
HWCAP_CPUID_MATCH(SYS_ID_AA64ISAR2_EL1, ID_AA64ISAR2_EL1_GPA3_SHIFT, 2585 + 4, FTR_UNSIGNED, ID_AA64ISAR2_EL1_GPA3_IMP) 2684 2586 }, 2685 2587 { 2686 - HWCAP_CPUID_MATCH(SYS_ID_AA64ISAR1_EL1, ID_AA64ISAR1_GPI_SHIFT, 2687 - 4, FTR_UNSIGNED, ID_AA64ISAR1_GPI_IMP_DEF) 2588 + HWCAP_CPUID_MATCH(SYS_ID_AA64ISAR1_EL1, ID_AA64ISAR1_EL1_GPI_SHIFT, 2589 + 4, FTR_UNSIGNED, ID_AA64ISAR1_EL1_GPI_IMP) 2688 2590 }, 2689 2591 {}, 2690 2592 }; ··· 2712 2614 HWCAP_CAP(SYS_ID_AA64PFR0_EL1, ID_AA64PFR0_ASIMD_SHIFT, 4, FTR_SIGNED, 0, CAP_HWCAP, KERNEL_HWCAP_ASIMD), 2713 2615 HWCAP_CAP(SYS_ID_AA64PFR0_EL1, ID_AA64PFR0_ASIMD_SHIFT, 4, FTR_SIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_ASIMDHP), 2714 2616 HWCAP_CAP(SYS_ID_AA64PFR0_EL1, ID_AA64PFR0_DIT_SHIFT, 4, FTR_SIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_DIT), 2715 - HWCAP_CAP(SYS_ID_AA64ISAR1_EL1, ID_AA64ISAR1_DPB_SHIFT, 4, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_DCPOP), 2716 - HWCAP_CAP(SYS_ID_AA64ISAR1_EL1, ID_AA64ISAR1_DPB_SHIFT, 4, FTR_UNSIGNED, 2, CAP_HWCAP, KERNEL_HWCAP_DCPODP), 2717 - HWCAP_CAP(SYS_ID_AA64ISAR1_EL1, ID_AA64ISAR1_JSCVT_SHIFT, 4, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_JSCVT), 2718 - HWCAP_CAP(SYS_ID_AA64ISAR1_EL1, ID_AA64ISAR1_FCMA_SHIFT, 4, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_FCMA), 2719 - HWCAP_CAP(SYS_ID_AA64ISAR1_EL1, ID_AA64ISAR1_LRCPC_SHIFT, 4, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_LRCPC), 2720 - HWCAP_CAP(SYS_ID_AA64ISAR1_EL1, ID_AA64ISAR1_LRCPC_SHIFT, 4, FTR_UNSIGNED, 2, CAP_HWCAP, KERNEL_HWCAP_ILRCPC), 2721 - HWCAP_CAP(SYS_ID_AA64ISAR1_EL1, ID_AA64ISAR1_FRINTTS_SHIFT, 4, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_FRINT), 2722 - HWCAP_CAP(SYS_ID_AA64ISAR1_EL1, ID_AA64ISAR1_SB_SHIFT, 4, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_SB), 2723 - HWCAP_CAP(SYS_ID_AA64ISAR1_EL1, ID_AA64ISAR1_BF16_SHIFT, 4, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_BF16), 2724 - HWCAP_CAP(SYS_ID_AA64ISAR1_EL1, ID_AA64ISAR1_DGH_SHIFT, 4, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_DGH), 2725 - HWCAP_CAP(SYS_ID_AA64ISAR1_EL1, 
ID_AA64ISAR1_I8MM_SHIFT, 4, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_I8MM), 2617 + HWCAP_CAP(SYS_ID_AA64ISAR1_EL1, ID_AA64ISAR1_EL1_DPB_SHIFT, 4, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_DCPOP), 2618 + HWCAP_CAP(SYS_ID_AA64ISAR1_EL1, ID_AA64ISAR1_EL1_DPB_SHIFT, 4, FTR_UNSIGNED, 2, CAP_HWCAP, KERNEL_HWCAP_DCPODP), 2619 + HWCAP_CAP(SYS_ID_AA64ISAR1_EL1, ID_AA64ISAR1_EL1_JSCVT_SHIFT, 4, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_JSCVT), 2620 + HWCAP_CAP(SYS_ID_AA64ISAR1_EL1, ID_AA64ISAR1_EL1_FCMA_SHIFT, 4, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_FCMA), 2621 + HWCAP_CAP(SYS_ID_AA64ISAR1_EL1, ID_AA64ISAR1_EL1_LRCPC_SHIFT, 4, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_LRCPC), 2622 + HWCAP_CAP(SYS_ID_AA64ISAR1_EL1, ID_AA64ISAR1_EL1_LRCPC_SHIFT, 4, FTR_UNSIGNED, 2, CAP_HWCAP, KERNEL_HWCAP_ILRCPC), 2623 + HWCAP_CAP(SYS_ID_AA64ISAR1_EL1, ID_AA64ISAR1_EL1_FRINTTS_SHIFT, 4, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_FRINT), 2624 + HWCAP_CAP(SYS_ID_AA64ISAR1_EL1, ID_AA64ISAR1_EL1_SB_SHIFT, 4, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_SB), 2625 + HWCAP_CAP(SYS_ID_AA64ISAR1_EL1, ID_AA64ISAR1_EL1_BF16_SHIFT, 4, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_BF16), 2626 + HWCAP_CAP(SYS_ID_AA64ISAR1_EL1, ID_AA64ISAR1_EL1_BF16_SHIFT, 4, FTR_UNSIGNED, 2, CAP_HWCAP, KERNEL_HWCAP_EBF16), 2627 + HWCAP_CAP(SYS_ID_AA64ISAR1_EL1, ID_AA64ISAR1_EL1_DGH_SHIFT, 4, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_DGH), 2628 + HWCAP_CAP(SYS_ID_AA64ISAR1_EL1, ID_AA64ISAR1_EL1_I8MM_SHIFT, 4, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_I8MM), 2726 2629 HWCAP_CAP(SYS_ID_AA64MMFR2_EL1, ID_AA64MMFR2_AT_SHIFT, 4, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_USCAT), 2727 2630 #ifdef CONFIG_ARM64_SVE 2728 2631 HWCAP_CAP(SYS_ID_AA64PFR0_EL1, ID_AA64PFR0_SVE_SHIFT, 4, FTR_UNSIGNED, ID_AA64PFR0_SVE, CAP_HWCAP, KERNEL_HWCAP_SVE), 2729 - HWCAP_CAP(SYS_ID_AA64ZFR0_EL1, ID_AA64ZFR0_SVEVER_SHIFT, 4, FTR_UNSIGNED, ID_AA64ZFR0_SVEVER_SVE2, CAP_HWCAP, KERNEL_HWCAP_SVE2), 2730 - HWCAP_CAP(SYS_ID_AA64ZFR0_EL1, ID_AA64ZFR0_AES_SHIFT, 4, 
FTR_UNSIGNED, ID_AA64ZFR0_AES, CAP_HWCAP, KERNEL_HWCAP_SVEAES), 2731 - HWCAP_CAP(SYS_ID_AA64ZFR0_EL1, ID_AA64ZFR0_AES_SHIFT, 4, FTR_UNSIGNED, ID_AA64ZFR0_AES_PMULL, CAP_HWCAP, KERNEL_HWCAP_SVEPMULL), 2732 - HWCAP_CAP(SYS_ID_AA64ZFR0_EL1, ID_AA64ZFR0_BITPERM_SHIFT, 4, FTR_UNSIGNED, ID_AA64ZFR0_BITPERM, CAP_HWCAP, KERNEL_HWCAP_SVEBITPERM), 2733 - HWCAP_CAP(SYS_ID_AA64ZFR0_EL1, ID_AA64ZFR0_BF16_SHIFT, 4, FTR_UNSIGNED, ID_AA64ZFR0_BF16, CAP_HWCAP, KERNEL_HWCAP_SVEBF16), 2734 - HWCAP_CAP(SYS_ID_AA64ZFR0_EL1, ID_AA64ZFR0_SHA3_SHIFT, 4, FTR_UNSIGNED, ID_AA64ZFR0_SHA3, CAP_HWCAP, KERNEL_HWCAP_SVESHA3), 2735 - HWCAP_CAP(SYS_ID_AA64ZFR0_EL1, ID_AA64ZFR0_SM4_SHIFT, 4, FTR_UNSIGNED, ID_AA64ZFR0_SM4, CAP_HWCAP, KERNEL_HWCAP_SVESM4), 2736 - HWCAP_CAP(SYS_ID_AA64ZFR0_EL1, ID_AA64ZFR0_I8MM_SHIFT, 4, FTR_UNSIGNED, ID_AA64ZFR0_I8MM, CAP_HWCAP, KERNEL_HWCAP_SVEI8MM), 2737 - HWCAP_CAP(SYS_ID_AA64ZFR0_EL1, ID_AA64ZFR0_F32MM_SHIFT, 4, FTR_UNSIGNED, ID_AA64ZFR0_F32MM, CAP_HWCAP, KERNEL_HWCAP_SVEF32MM), 2738 - HWCAP_CAP(SYS_ID_AA64ZFR0_EL1, ID_AA64ZFR0_F64MM_SHIFT, 4, FTR_UNSIGNED, ID_AA64ZFR0_F64MM, CAP_HWCAP, KERNEL_HWCAP_SVEF64MM), 2632 + HWCAP_CAP(SYS_ID_AA64ZFR0_EL1, ID_AA64ZFR0_EL1_SVEver_SHIFT, 4, FTR_UNSIGNED, ID_AA64ZFR0_EL1_SVEver_SVE2, CAP_HWCAP, KERNEL_HWCAP_SVE2), 2633 + HWCAP_CAP(SYS_ID_AA64ZFR0_EL1, ID_AA64ZFR0_EL1_AES_SHIFT, 4, FTR_UNSIGNED, ID_AA64ZFR0_EL1_AES_IMP, CAP_HWCAP, KERNEL_HWCAP_SVEAES), 2634 + HWCAP_CAP(SYS_ID_AA64ZFR0_EL1, ID_AA64ZFR0_EL1_AES_SHIFT, 4, FTR_UNSIGNED, ID_AA64ZFR0_EL1_AES_PMULL128, CAP_HWCAP, KERNEL_HWCAP_SVEPMULL), 2635 + HWCAP_CAP(SYS_ID_AA64ZFR0_EL1, ID_AA64ZFR0_EL1_BitPerm_SHIFT, 4, FTR_UNSIGNED, ID_AA64ZFR0_EL1_BitPerm_IMP, CAP_HWCAP, KERNEL_HWCAP_SVEBITPERM), 2636 + HWCAP_CAP(SYS_ID_AA64ZFR0_EL1, ID_AA64ZFR0_EL1_BF16_SHIFT, 4, FTR_UNSIGNED, ID_AA64ZFR0_EL1_BF16_IMP, CAP_HWCAP, KERNEL_HWCAP_SVEBF16), 2637 + HWCAP_CAP(SYS_ID_AA64ZFR0_EL1, ID_AA64ZFR0_EL1_SHA3_SHIFT, 4, FTR_UNSIGNED, ID_AA64ZFR0_EL1_SHA3_IMP, CAP_HWCAP, KERNEL_HWCAP_SVESHA3), 
2638 + HWCAP_CAP(SYS_ID_AA64ZFR0_EL1, ID_AA64ZFR0_EL1_SM4_SHIFT, 4, FTR_UNSIGNED, ID_AA64ZFR0_EL1_SM4_IMP, CAP_HWCAP, KERNEL_HWCAP_SVESM4), 2639 + HWCAP_CAP(SYS_ID_AA64ZFR0_EL1, ID_AA64ZFR0_EL1_I8MM_SHIFT, 4, FTR_UNSIGNED, ID_AA64ZFR0_EL1_I8MM_IMP, CAP_HWCAP, KERNEL_HWCAP_SVEI8MM), 2640 + HWCAP_CAP(SYS_ID_AA64ZFR0_EL1, ID_AA64ZFR0_EL1_F32MM_SHIFT, 4, FTR_UNSIGNED, ID_AA64ZFR0_EL1_F32MM_IMP, CAP_HWCAP, KERNEL_HWCAP_SVEF32MM), 2641 + HWCAP_CAP(SYS_ID_AA64ZFR0_EL1, ID_AA64ZFR0_EL1_F64MM_SHIFT, 4, FTR_UNSIGNED, ID_AA64ZFR0_EL1_F64MM_IMP, CAP_HWCAP, KERNEL_HWCAP_SVEF64MM), 2739 2642 #endif 2740 2643 HWCAP_CAP(SYS_ID_AA64PFR1_EL1, ID_AA64PFR1_SSBS_SHIFT, 4, FTR_UNSIGNED, ID_AA64PFR1_SSBS_PSTATE_INSNS, CAP_HWCAP, KERNEL_HWCAP_SSBS), 2741 2644 #ifdef CONFIG_ARM64_BTI ··· 2752 2653 #endif /* CONFIG_ARM64_MTE */ 2753 2654 HWCAP_CAP(SYS_ID_AA64MMFR0_EL1, ID_AA64MMFR0_ECV_SHIFT, 4, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_ECV), 2754 2655 HWCAP_CAP(SYS_ID_AA64MMFR1_EL1, ID_AA64MMFR1_AFP_SHIFT, 4, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_AFP), 2755 - HWCAP_CAP(SYS_ID_AA64ISAR2_EL1, ID_AA64ISAR2_RPRES_SHIFT, 4, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_RPRES), 2756 - HWCAP_CAP(SYS_ID_AA64ISAR2_EL1, ID_AA64ISAR2_WFXT_SHIFT, 4, FTR_UNSIGNED, ID_AA64ISAR2_WFXT_SUPPORTED, CAP_HWCAP, KERNEL_HWCAP_WFXT), 2656 + HWCAP_CAP(SYS_ID_AA64ISAR2_EL1, ID_AA64ISAR2_EL1_RPRES_SHIFT, 4, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_RPRES), 2657 + HWCAP_CAP(SYS_ID_AA64ISAR2_EL1, ID_AA64ISAR2_EL1_WFxT_SHIFT, 4, FTR_UNSIGNED, ID_AA64ISAR2_EL1_WFxT_IMP, CAP_HWCAP, KERNEL_HWCAP_WFXT), 2757 2658 #ifdef CONFIG_ARM64_SME 2758 2659 HWCAP_CAP(SYS_ID_AA64PFR1_EL1, ID_AA64PFR1_SME_SHIFT, 4, FTR_UNSIGNED, ID_AA64PFR1_SME, CAP_HWCAP, KERNEL_HWCAP_SME), 2759 - HWCAP_CAP(SYS_ID_AA64SMFR0_EL1, ID_AA64SMFR0_FA64_SHIFT, 1, FTR_UNSIGNED, ID_AA64SMFR0_FA64, CAP_HWCAP, KERNEL_HWCAP_SME_FA64), 2760 - HWCAP_CAP(SYS_ID_AA64SMFR0_EL1, ID_AA64SMFR0_I16I64_SHIFT, 4, FTR_UNSIGNED, ID_AA64SMFR0_I16I64, CAP_HWCAP, 
KERNEL_HWCAP_SME_I16I64), 2761 - HWCAP_CAP(SYS_ID_AA64SMFR0_EL1, ID_AA64SMFR0_F64F64_SHIFT, 1, FTR_UNSIGNED, ID_AA64SMFR0_F64F64, CAP_HWCAP, KERNEL_HWCAP_SME_F64F64), 2762 - HWCAP_CAP(SYS_ID_AA64SMFR0_EL1, ID_AA64SMFR0_I8I32_SHIFT, 4, FTR_UNSIGNED, ID_AA64SMFR0_I8I32, CAP_HWCAP, KERNEL_HWCAP_SME_I8I32), 2763 - HWCAP_CAP(SYS_ID_AA64SMFR0_EL1, ID_AA64SMFR0_F16F32_SHIFT, 1, FTR_UNSIGNED, ID_AA64SMFR0_F16F32, CAP_HWCAP, KERNEL_HWCAP_SME_F16F32), 2764 - HWCAP_CAP(SYS_ID_AA64SMFR0_EL1, ID_AA64SMFR0_B16F32_SHIFT, 1, FTR_UNSIGNED, ID_AA64SMFR0_B16F32, CAP_HWCAP, KERNEL_HWCAP_SME_B16F32), 2765 - HWCAP_CAP(SYS_ID_AA64SMFR0_EL1, ID_AA64SMFR0_F32F32_SHIFT, 1, FTR_UNSIGNED, ID_AA64SMFR0_F32F32, CAP_HWCAP, KERNEL_HWCAP_SME_F32F32), 2660 + HWCAP_CAP(SYS_ID_AA64SMFR0_EL1, ID_AA64SMFR0_EL1_FA64_SHIFT, 1, FTR_UNSIGNED, ID_AA64SMFR0_EL1_FA64_IMP, CAP_HWCAP, KERNEL_HWCAP_SME_FA64), 2661 + HWCAP_CAP(SYS_ID_AA64SMFR0_EL1, ID_AA64SMFR0_EL1_I16I64_SHIFT, 4, FTR_UNSIGNED, ID_AA64SMFR0_EL1_I16I64_IMP, CAP_HWCAP, KERNEL_HWCAP_SME_I16I64), 2662 + HWCAP_CAP(SYS_ID_AA64SMFR0_EL1, ID_AA64SMFR0_EL1_F64F64_SHIFT, 1, FTR_UNSIGNED, ID_AA64SMFR0_EL1_F64F64_IMP, CAP_HWCAP, KERNEL_HWCAP_SME_F64F64), 2663 + HWCAP_CAP(SYS_ID_AA64SMFR0_EL1, ID_AA64SMFR0_EL1_I8I32_SHIFT, 4, FTR_UNSIGNED, ID_AA64SMFR0_EL1_I8I32_IMP, CAP_HWCAP, KERNEL_HWCAP_SME_I8I32), 2664 + HWCAP_CAP(SYS_ID_AA64SMFR0_EL1, ID_AA64SMFR0_EL1_F16F32_SHIFT, 1, FTR_UNSIGNED, ID_AA64SMFR0_EL1_F16F32_IMP, CAP_HWCAP, KERNEL_HWCAP_SME_F16F32), 2665 + HWCAP_CAP(SYS_ID_AA64SMFR0_EL1, ID_AA64SMFR0_EL1_B16F32_SHIFT, 1, FTR_UNSIGNED, ID_AA64SMFR0_EL1_B16F32_IMP, CAP_HWCAP, KERNEL_HWCAP_SME_B16F32), 2666 + HWCAP_CAP(SYS_ID_AA64SMFR0_EL1, ID_AA64SMFR0_EL1_F32F32_SHIFT, 1, FTR_UNSIGNED, ID_AA64SMFR0_EL1_F32F32_IMP, CAP_HWCAP, KERNEL_HWCAP_SME_F32F32), 2766 2667 #endif /* CONFIG_ARM64_SME */ 2767 2668 {}, 2768 2669 }; ··· 3197 3098 3198 3099 void cpu_set_feature(unsigned int num) 3199 3100 { 3200 - WARN_ON(num >= MAX_CPU_FEATURES); 3201 - elf_hwcap |= 
BIT(num); 3101 + set_bit(num, elf_hwcap); 3202 3102 } 3203 3103 3204 3104 bool cpu_have_feature(unsigned int num) 3205 3105 { 3206 - WARN_ON(num >= MAX_CPU_FEATURES); 3207 - return elf_hwcap & BIT(num); 3106 + return test_bit(num, elf_hwcap); 3208 3107 } 3209 3108 EXPORT_SYMBOL_GPL(cpu_have_feature); 3210 3109 ··· 3213 3116 * note that for userspace compatibility we guarantee that bits 62 3214 3117 * and 63 will always be returned as 0. 3215 3118 */ 3216 - return lower_32_bits(elf_hwcap); 3119 + return elf_hwcap[0]; 3217 3120 } 3218 3121 3219 3122 unsigned long cpu_get_elf_hwcap2(void) 3220 3123 { 3221 - return upper_32_bits(elf_hwcap); 3124 + return elf_hwcap[1]; 3222 3125 } 3223 3126 3224 3127 static void __init setup_system_capabilities(void) ··· 3240 3143 setup_system_capabilities(); 3241 3144 setup_elf_hwcaps(arm64_elf_hwcaps); 3242 3145 3243 - if (system_supports_32bit_el0()) 3146 + if (system_supports_32bit_el0()) { 3244 3147 setup_elf_hwcaps(compat_elf_hwcaps); 3148 + elf_hwcap_fixup(); 3149 + } 3245 3150 3246 3151 if (system_uses_ttbr0_pan()) 3247 3152 pr_info("emulated: Privileged Access Never (PAN) using TTBR0_EL1 switching\n"); ··· 3296 3197 cpu_active_mask); 3297 3198 get_cpu_device(lucky_winner)->offline_disabled = true; 3298 3199 setup_elf_hwcaps(compat_elf_hwcaps); 3200 + elf_hwcap_fixup(); 3299 3201 pr_info("Asymmetric 32-bit EL0 support detected on CPU %u; CPU hot-unplug disabled on CPU %u\n", 3300 3202 cpu, lucky_winner); 3301 3203 return 0; ··· 3318 3218 3319 3219 static void __maybe_unused cpu_enable_cnp(struct arm64_cpu_capabilities const *cap) 3320 3220 { 3321 - cpu_replace_ttbr1(lm_alias(swapper_pg_dir)); 3221 + cpu_replace_ttbr1(lm_alias(swapper_pg_dir), idmap_pg_dir); 3322 3222 } 3323 3223 3324 3224 /*
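Note on the cpu_set_feature()/cpu_have_feature() hunks above: they replace mask arithmetic on a single integer with bitmap helpers, and the two getters now return word 0 and word 1 of that bitmap. A minimal userspace sketch of the idea follows. The names hwcap_words, hwcap_set, hwcap_test and MAX_FEATURES are hypothetical stand-ins, not the kernel's set_bit()/test_bit() API, and the bounds check is a choice made for this sketch only.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical capacity for this sketch; not taken from the kernel. */
#define MAX_FEATURES	128
#define BITS_PER_WORD	(sizeof(unsigned long) * CHAR_BIT)
#define NWORDS		((MAX_FEATURES + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* One bit per feature, spread across an array of words. */
static unsigned long hwcap_words[NWORDS];

/* Set feature bit 'num'; out-of-range numbers are ignored in this sketch. */
static void hwcap_set(unsigned int num)
{
	if (num >= MAX_FEATURES)
		return;
	hwcap_words[num / BITS_PER_WORD] |= 1UL << (num % BITS_PER_WORD);
}

/* Return true if feature bit 'num' is set. */
static bool hwcap_test(unsigned int num)
{
	if (num >= MAX_FEATURES)
		return false;
	return (hwcap_words[num / BITS_PER_WORD] >> (num % BITS_PER_WORD)) & 1UL;
}
```

With this layout, a feature number at or above BITS_PER_WORD lands in hwcap_words[1], which is why a whole second word can be returned on its own rather than carving the upper half out of a single value.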
-29
arch/arm64/kernel/cpuidle.c
···
 #include <linux/of_device.h>
 #include <linux/psci.h>
 
-#include <asm/cpuidle.h>
-#include <asm/cpu_ops.h>
-
-int arm_cpuidle_init(unsigned int cpu)
-{
-	const struct cpu_operations *ops = get_cpu_ops(cpu);
-	int ret = -EOPNOTSUPP;
-
-	if (ops && ops->cpu_suspend && ops->cpu_init_idle)
-		ret = ops->cpu_init_idle(cpu);
-
-	return ret;
-}
-
-/**
- * arm_cpuidle_suspend() - function to enter a low-power idle state
- * @index: argument to pass to CPU suspend operations
- *
- * Return: 0 on success, -EOPNOTSUPP if CPU suspend hook not initialized, CPU
- * operations back-end error code otherwise.
- */
-int arm_cpuidle_suspend(int index)
-{
-	int cpu = smp_processor_id();
-	const struct cpu_operations *ops = get_cpu_ops(cpu);
-
-	return ops->cpu_suspend(index);
-}
-
 #ifdef CONFIG_ACPI
 
 #include <acpi/processor.h>
+32 -19
arch/arm64/kernel/cpuinfo.c
···
 DEFINE_PER_CPU(struct cpuinfo_arm64, cpu_data);
 static struct cpuinfo_arm64 boot_cpu_data;
 
-static const char *icache_policy_str[] = {
-	[ICACHE_POLICY_VPIPT] = "VPIPT",
-	[ICACHE_POLICY_RESERVED] = "RESERVED/UNKNOWN",
-	[ICACHE_POLICY_VIPT] = "VIPT",
-	[ICACHE_POLICY_PIPT] = "PIPT",
-};
+static inline const char *icache_policy_str(int l1ip)
+{
+	switch (l1ip) {
+	case CTR_EL0_L1Ip_VPIPT:
+		return "VPIPT";
+	case CTR_EL0_L1Ip_VIPT:
+		return "VIPT";
+	case CTR_EL0_L1Ip_PIPT:
+		return "PIPT";
+	default:
+		return "RESERVED/UNKNOWN";
+	}
+}
 
 unsigned long __icache_flags;
 
···
 	[KERNEL_HWCAP_SME_F32F32] = "smef32f32",
 	[KERNEL_HWCAP_SME_FA64] = "smefa64",
 	[KERNEL_HWCAP_WFXT] = "wfxt",
+	[KERNEL_HWCAP_EBF16] = "ebf16",
 };
 
 #ifdef CONFIG_COMPAT
···
 
 CPUREGS_ATTR_RO(midr_el1, midr);
 CPUREGS_ATTR_RO(revidr_el1, revidr);
+CPUREGS_ATTR_RO(smidr_el1, smidr);
 
 static struct attribute *cpuregs_id_attrs[] = {
 	&cpuregs_attr_midr_el1.attr,
···
 
 static const struct attribute_group cpuregs_attr_group = {
 	.attrs = cpuregs_id_attrs,
+	.name = "identification"
+};
+
+static struct attribute *sme_cpuregs_id_attrs[] = {
+	&cpuregs_attr_smidr_el1.attr,
+	NULL
+};
+
+static const struct attribute_group sme_cpuregs_attr_group = {
+	.attrs = sme_cpuregs_id_attrs,
 	.name = "identification"
 };
 
···
 	rc = sysfs_create_group(&info->kobj, &cpuregs_attr_group);
 	if (rc)
 		kobject_del(&info->kobj);
+	if (system_supports_sme())
+		rc = sysfs_merge_group(&info->kobj, &sme_cpuregs_attr_group);
 out:
 	return rc;
 }
···
 	u32 l1ip = CTR_L1IP(info->reg_ctr);
 
 	switch (l1ip) {
-	case ICACHE_POLICY_PIPT:
+	case CTR_EL0_L1Ip_PIPT:
 		break;
-	case ICACHE_POLICY_VPIPT:
+	case CTR_EL0_L1Ip_VPIPT:
 		set_bit(ICACHEF_VPIPT, &__icache_flags);
 		break;
-	case ICACHE_POLICY_RESERVED:
-	case ICACHE_POLICY_VIPT:
+	case CTR_EL0_L1Ip_VIPT:
+	default:
 		/* Assume aliasing */
 		set_bit(ICACHEF_ALIASING, &__icache_flags);
 		break;
 	}
 
-	pr_info("Detected %s I-cache on CPU%d\n", icache_policy_str[l1ip], cpu);
+	pr_info("Detected %s I-cache on CPU%d\n", icache_policy_str(l1ip), cpu);
 }
 
 static void __cpuinfo_store_cpu_32bit(struct cpuinfo_32bit *info)
···
 
 	if (id_aa64pfr0_32bit_el0(info->reg_id_aa64pfr0))
 		__cpuinfo_store_cpu_32bit(&info->aarch32);
-
-	if (IS_ENABLED(CONFIG_ARM64_SVE) &&
-	    id_aa64pfr0_sve(info->reg_id_aa64pfr0))
-		info->reg_zcr = read_zcr_features();
-
-	if (IS_ENABLED(CONFIG_ARM64_SME) &&
-	    id_aa64pfr1_sme(info->reg_id_aa64pfr1))
-		info->reg_smcr = read_smcr_features();
 
 	cpuinfo_detect_icache_policy(info);
 }
+22 -31
arch/arm64/kernel/entry.S
···
 	 */
 	.endm
 
-	.macro tramp_data_page dst
-	adr_l \dst, .entry.tramp.text
-	sub \dst, \dst, PAGE_SIZE
-	.endm
-
-	.macro tramp_data_read_var dst, var
-#ifdef CONFIG_RANDOMIZE_BASE
-	tramp_data_page \dst
-	add \dst, \dst, #:lo12:__entry_tramp_data_\var
-	ldr \dst, [\dst]
+	.macro tramp_data_read_var dst, var
+#ifdef CONFIG_RELOCATABLE
+	ldr \dst, .L__tramp_data_\var
+	.ifndef .L__tramp_data_\var
+	.pushsection ".entry.tramp.rodata", "a", %progbits
+	.align 3
+.L__tramp_data_\var:
+	.quad \var
+	.popsection
+	.endif
 #else
-	ldr \dst, =\var
+	/*
+	 * As !RELOCATABLE implies !RANDOMIZE_BASE the address is always a
+	 * compile time constant (and hence not secret and not worth hiding).
+	 *
+	 * As statically allocated kernel code and data always live in the top
+	 * 47 bits of the address space we can sign-extend bit 47 and avoid an
+	 * instruction to load the upper 16 bits (which must be 0xFFFF).
+	 */
+	movz \dst, :abs_g2_s:\var
+	movk \dst, :abs_g1_nc:\var
+	movk \dst, :abs_g0_nc:\var
 #endif
 	.endm
 
···
 	msr vbar_el1, x30
 	isb
 	.else
-	ldr x30, =vectors
+	adr_l x30, vectors
 	.endif // \kpti == 1
 
 	.if \bhb == BHB_MITIGATION_FW
···
 SYM_CODE_START(tramp_exit_compat)
 	tramp_exit 32
 SYM_CODE_END(tramp_exit_compat)
-
-	.ltorg
 	.popsection // .entry.tramp.text
-#ifdef CONFIG_RANDOMIZE_BASE
-	.pushsection ".rodata", "a"
-	.align PAGE_SHIFT
-SYM_DATA_START(__entry_tramp_data_start)
-__entry_tramp_data_vectors:
-	.quad vectors
-#ifdef CONFIG_ARM_SDE_INTERFACE
-__entry_tramp_data___sdei_asm_handler:
-	.quad __sdei_asm_handler
-#endif /* CONFIG_ARM_SDE_INTERFACE */
-__entry_tramp_data_this_cpu_vector:
-	.quad this_cpu_vector
-SYM_DATA_END(__entry_tramp_data_start)
-	.popsection // .rodata
-#endif /* CONFIG_RANDOMIZE_BASE */
 #endif /* CONFIG_UNMAP_KERNEL_AT_EL0 */
 
 /*
···
  * This clobbers x4, __sdei_handler() will restore this from firmware's
  * copy.
  */
-.ltorg
 .pushsection ".entry.tramp.text", "ax"
 SYM_CODE_START(__sdei_asm_entry_trampoline)
 	mrs x4, ttbr1_el1
···
 1:	sdei_handler_exit exit_mode=x2
 SYM_CODE_END(__sdei_asm_exit_trampoline)
 NOKPROBE(__sdei_asm_exit_trampoline)
-.ltorg
 .popsection // .entry.tramp.text
 #endif /* CONFIG_UNMAP_KERNEL_AT_EL0 */
 
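Note on the !RELOCATABLE branch of tramp_data_read_var above: the comment says the upper 16 bits of the address need not be loaded because they follow from sign-extending bit 47. The arithmetic can be sketched in plain C as below. This models only the value computation, not the assembler's or linker's handling of the :abs_g2_s:, :abs_g1_nc: and :abs_g0_nc: operands; the helper names are hypothetical and the example address in the usage note is made up.

```c
#include <stdint.h>

/* Sign-extend bit 47 of a 48-bit value into bits 63:48. */
static uint64_t sign_extend48(uint64_t low48)
{
	low48 &= (1ULL << 48) - 1;		/* keep bits 47:0 only */
	return (low48 ^ (1ULL << 47)) - (1ULL << 47);
}

/*
 * Rebuild a 64-bit address from three 16-bit groups:
 * g2 = bits 47:32, g1 = bits 31:16, g0 = bits 15:0.
 * Bits 63:48 are not supplied; they come from the sign extension.
 */
static uint64_t addr_from_groups(uint16_t g2, uint16_t g1, uint16_t g0)
{
	uint64_t low48 = ((uint64_t)g2 << 32) | ((uint64_t)g1 << 16) | g0;

	return sign_extend48(low48);
}
```

For instance, addr_from_groups(0x8000, 0x1008, 0x0000) yields 0xFFFF800010080000: three 16-bit pieces are enough because bit 47 is set and the top 16 bits are therefore all ones.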
-1
arch/arm64/kernel/fpsimd.c
···
 
 	if (system_supports_sme()) {
 		u64 *svcr = last->svcr;
-		*svcr = read_sysreg_s(SYS_SVCR);
 
 		*svcr = read_sysreg_s(SYS_SVCR);
 
+229 -304
arch/arm64/kernel/head.S
··· 37 37 38 38 #include "efi-header.S" 39 39 40 - #define __PHYS_OFFSET KERNEL_START 41 - 42 40 #if (PAGE_OFFSET & 0x1fffff) != 0 43 41 #error PAGE_OFFSET must be at least 2MB aligned 44 42 #endif ··· 48 50 * The requirements are: 49 51 * MMU = off, D-cache = off, I-cache = on or off, 50 52 * x0 = physical address to the FDT blob. 51 - * 52 - * This code is mostly position independent so you call this at 53 - * __pa(PAGE_OFFSET). 54 53 * 55 54 * Note that the callee-saved registers are used for storing variables 56 55 * that are useful before the MMU is enabled. The allocations are described ··· 77 82 * primary lowlevel boot path: 78 83 * 79 84 * Register Scope Purpose 85 + * x20 primary_entry() .. __primary_switch() CPU boot mode 80 86 * x21 primary_entry() .. start_kernel() FDT pointer passed at boot in x0 87 + * x22 create_idmap() .. start_kernel() ID map VA of the DT blob 81 88 * x23 primary_entry() .. start_kernel() physical misalignment/KASLR offset 82 - * x28 __create_page_tables() callee preserved temp register 83 - * x19/x20 __primary_switch() callee preserved temp registers 84 - * x24 __primary_switch() .. relocate_kernel() current RELR displacement 89 + * x24 __primary_switch() linear map KASLR seed 90 + * x25 primary_entry() .. start_kernel() supported VA size 91 + * x28 create_idmap() callee preserved temp register 85 92 */ 86 93 SYM_CODE_START(primary_entry) 87 94 bl preserve_boot_args 88 95 bl init_kernel_el // w0=cpu_boot_mode 89 - adrp x23, __PHYS_OFFSET 90 - and x23, x23, MIN_KIMG_ALIGN - 1 // KASLR offset, defaults to 0 91 - bl set_cpu_boot_mode_flag 92 - bl __create_page_tables 96 + mov x20, x0 97 + bl create_idmap 98 + 93 99 /* 94 100 * The following calls CPU setup code, see arch/arm64/mm/proc.S for 95 101 * details. 96 102 * On return, the CPU will be ready for the MMU to be turned on and 97 103 * the TCR will have been set. 
98 104 */ 105 + #if VA_BITS > 48 106 + mrs_s x0, SYS_ID_AA64MMFR2_EL1 107 + tst x0, #0xf << ID_AA64MMFR2_LVA_SHIFT 108 + mov x0, #VA_BITS 109 + mov x25, #VA_BITS_MIN 110 + csel x25, x25, x0, eq 111 + mov x0, x25 112 + #endif 99 113 bl __cpu_setup // initialise processor 100 114 b __primary_switch 101 115 SYM_CODE_END(primary_entry) ··· 126 122 b dcache_inval_poc // tail call 127 123 SYM_CODE_END(preserve_boot_args) 128 124 129 - /* 130 - * Macro to create a table entry to the next page. 131 - * 132 - * tbl: page table address 133 - * virt: virtual address 134 - * shift: #imm page table shift 135 - * ptrs: #imm pointers per table page 136 - * 137 - * Preserves: virt 138 - * Corrupts: ptrs, tmp1, tmp2 139 - * Returns: tbl -> next level table page address 140 - */ 141 - .macro create_table_entry, tbl, virt, shift, ptrs, tmp1, tmp2 142 - add \tmp1, \tbl, #PAGE_SIZE 143 - phys_to_pte \tmp2, \tmp1 144 - orr \tmp2, \tmp2, #PMD_TYPE_TABLE // address of next table and entry type 145 - lsr \tmp1, \virt, #\shift 146 - sub \ptrs, \ptrs, #1 147 - and \tmp1, \tmp1, \ptrs // table index 148 - str \tmp2, [\tbl, \tmp1, lsl #3] 149 - add \tbl, \tbl, #PAGE_SIZE // next level table page 150 - .endm 125 + SYM_FUNC_START_LOCAL(clear_page_tables) 126 + /* 127 + * Clear the init page tables. 
128 + */ 129 + adrp x0, init_pg_dir 130 + adrp x1, init_pg_end 131 + sub x2, x1, x0 132 + mov x1, xzr 133 + b __pi_memset // tail call 134 + SYM_FUNC_END(clear_page_tables) 151 135 152 136 /* 153 137 * Macro to populate page table entries, these entries can be pointers to the next level ··· 171 179 * vstart: virtual address of start of range 172 180 * vend: virtual address of end of range - we map [vstart, vend] 173 181 * shift: shift used to transform virtual address into index 174 - * ptrs: number of entries in page table 182 + * order: #imm 2log(number of entries in page table) 175 183 * istart: index in table corresponding to vstart 176 184 * iend: index in table corresponding to vend 177 185 * count: On entry: how many extra entries were required in previous level, scales 178 186 * our end index. 179 187 * On exit: returns how many extra entries required for next page table level 180 188 * 181 - * Preserves: vstart, vend, shift, ptrs 189 + * Preserves: vstart, vend 182 190 * Returns: istart, iend, count 183 191 */ 184 - .macro compute_indices, vstart, vend, shift, ptrs, istart, iend, count 185 - lsr \iend, \vend, \shift 186 - mov \istart, \ptrs 187 - sub \istart, \istart, #1 188 - and \iend, \iend, \istart // iend = (vend >> shift) & (ptrs - 1) 189 - mov \istart, \ptrs 190 - mul \istart, \istart, \count 191 - add \iend, \iend, \istart // iend += count * ptrs 192 - // our entries span multiple tables 193 - 194 - lsr \istart, \vstart, \shift 195 - mov \count, \ptrs 196 - sub \count, \count, #1 197 - and \istart, \istart, \count 198 - 192 + .macro compute_indices, vstart, vend, shift, order, istart, iend, count 193 + ubfx \istart, \vstart, \shift, \order 194 + ubfx \iend, \vend, \shift, \order 195 + add \iend, \iend, \count, lsl \order 199 196 sub \count, \iend, \istart 200 197 .endm 201 198 ··· 199 218 * vend: virtual address of end of range - we map [vstart, vend - 1] 200 219 * flags: flags to use to map last level entries 201 220 * phys: physical address 
corresponding to vstart - physical memory is contiguous 202 - * pgds: the number of pgd entries 221 + * order: #imm 2log(number of entries in PGD table) 222 + * 223 + * If extra_shift is set, an extra level will be populated if the end address does 224 + * not fit in 'extra_shift' bits. This assumes vend is in the TTBR0 range. 203 225 * 204 226 * Temporaries: istart, iend, tmp, count, sv - these need to be different registers 205 227 * Preserves: vstart, flags 206 228 * Corrupts: tbl, rtbl, vend, istart, iend, tmp, count, sv 207 229 */ 208 - .macro map_memory, tbl, rtbl, vstart, vend, flags, phys, pgds, istart, iend, tmp, count, sv 230 + .macro map_memory, tbl, rtbl, vstart, vend, flags, phys, order, istart, iend, tmp, count, sv, extra_shift 209 231 sub \vend, \vend, #1 210 232 add \rtbl, \tbl, #PAGE_SIZE 211 - mov \sv, \rtbl 212 233 mov \count, #0 213 - compute_indices \vstart, \vend, #PGDIR_SHIFT, \pgds, \istart, \iend, \count 234 + 235 + .ifnb \extra_shift 236 + tst \vend, #~((1 << (\extra_shift)) - 1) 237 + b.eq .L_\@ 238 + compute_indices \vstart, \vend, #\extra_shift, #(PAGE_SHIFT - 3), \istart, \iend, \count 239 + mov \sv, \rtbl 214 240 populate_entries \tbl, \rtbl, \istart, \iend, #PMD_TYPE_TABLE, #PAGE_SIZE, \tmp 215 241 mov \tbl, \sv 242 + .endif 243 + .L_\@: 244 + compute_indices \vstart, \vend, #PGDIR_SHIFT, #\order, \istart, \iend, \count 216 245 mov \sv, \rtbl 246 + populate_entries \tbl, \rtbl, \istart, \iend, #PMD_TYPE_TABLE, #PAGE_SIZE, \tmp 247 + mov \tbl, \sv 217 248 218 249 #if SWAPPER_PGTABLE_LEVELS > 3 219 - compute_indices \vstart, \vend, #PUD_SHIFT, #PTRS_PER_PUD, \istart, \iend, \count 250 + compute_indices \vstart, \vend, #PUD_SHIFT, #(PAGE_SHIFT - 3), \istart, \iend, \count 251 + mov \sv, \rtbl 220 252 populate_entries \tbl, \rtbl, \istart, \iend, #PMD_TYPE_TABLE, #PAGE_SIZE, \tmp 221 253 mov \tbl, \sv 222 - mov \sv, \rtbl 223 254 #endif 224 255 225 256 #if SWAPPER_PGTABLE_LEVELS > 2 226 - compute_indices \vstart, \vend, 
#SWAPPER_TABLE_SHIFT, #PTRS_PER_PMD, \istart, \iend, \count 257 + compute_indices \vstart, \vend, #SWAPPER_TABLE_SHIFT, #(PAGE_SHIFT - 3), \istart, \iend, \count 258 + mov \sv, \rtbl 227 259 populate_entries \tbl, \rtbl, \istart, \iend, #PMD_TYPE_TABLE, #PAGE_SIZE, \tmp 228 260 mov \tbl, \sv 229 261 #endif 230 262 231 - compute_indices \vstart, \vend, #SWAPPER_BLOCK_SHIFT, #PTRS_PER_PTE, \istart, \iend, \count 232 - bic \count, \phys, #SWAPPER_BLOCK_SIZE - 1 233 - populate_entries \tbl, \count, \istart, \iend, \flags, #SWAPPER_BLOCK_SIZE, \tmp 263 + compute_indices \vstart, \vend, #SWAPPER_BLOCK_SHIFT, #(PAGE_SHIFT - 3), \istart, \iend, \count 264 + bic \rtbl, \phys, #SWAPPER_BLOCK_SIZE - 1 265 + populate_entries \tbl, \rtbl, \istart, \iend, \flags, #SWAPPER_BLOCK_SIZE, \tmp 234 266 .endm 235 267 236 268 /* 237 - * Setup the initial page tables. We only setup the barest amount which is 238 - * required to get the kernel running. The following sections are required: 239 - * - identity mapping to enable the MMU (low address, TTBR0) 240 - * - first few MB of the kernel linear mapping to jump to once the MMU has 241 - * been enabled 269 + * Remap a subregion created with the map_memory macro with modified attributes 270 + * or output address. The entire remapped region must have been covered in the 271 + * invocation of map_memory. 
272 + * 273 + * x0: last level table address (returned in first argument to map_memory) 274 + * x1: start VA of the existing mapping 275 + * x2: start VA of the region to update 276 + * x3: end VA of the region to update (exclusive) 277 + * x4: start PA associated with the region to update 278 + * x5: attributes to set on the updated region 279 + * x6: order of the last level mappings 242 280 */ 243 - SYM_FUNC_START_LOCAL(__create_page_tables) 281 + SYM_FUNC_START_LOCAL(remap_region) 282 + sub x3, x3, #1 // make end inclusive 283 + 284 + // Get the index offset for the start of the last level table 285 + lsr x1, x1, x6 286 + bfi x1, xzr, #0, #PAGE_SHIFT - 3 287 + 288 + // Derive the start and end indexes into the last level table 289 + // associated with the provided region 290 + lsr x2, x2, x6 291 + lsr x3, x3, x6 292 + sub x2, x2, x1 293 + sub x3, x3, x1 294 + 295 + mov x1, #1 296 + lsl x6, x1, x6 // block size at this level 297 + 298 + populate_entries x0, x4, x2, x3, x5, x6, x7 299 + ret 300 + SYM_FUNC_END(remap_region) 301 + 302 + SYM_FUNC_START_LOCAL(create_idmap) 244 303 mov x28, lr 245 - 246 304 /* 247 - * Invalidate the init page tables to avoid potential dirty cache lines 248 - * being evicted. Other page tables are allocated in rodata as part of 249 - * the kernel image, and thus are clean to the PoC per the boot 250 - * protocol. 251 - */ 252 - adrp x0, init_pg_dir 253 - adrp x1, init_pg_end 254 - bl dcache_inval_poc 255 - 256 - /* 257 - * Clear the init page tables. 258 - */ 259 - adrp x0, init_pg_dir 260 - adrp x1, init_pg_end 261 - sub x1, x1, x0 262 - 1: stp xzr, xzr, [x0], #16 263 - stp xzr, xzr, [x0], #16 264 - stp xzr, xzr, [x0], #16 265 - stp xzr, xzr, [x0], #16 266 - subs x1, x1, #64 267 - b.ne 1b 268 - 269 - mov x7, SWAPPER_MM_MMUFLAGS 270 - 271 - /* 272 - * Create the identity mapping. 
273 - */ 274 - adrp x0, idmap_pg_dir 275 - adrp x3, __idmap_text_start // __pa(__idmap_text_start) 276 - 277 - #ifdef CONFIG_ARM64_VA_BITS_52 278 - mrs_s x6, SYS_ID_AA64MMFR2_EL1 279 - and x6, x6, #(0xf << ID_AA64MMFR2_LVA_SHIFT) 280 - mov x5, #52 281 - cbnz x6, 1f 282 - #endif 283 - mov x5, #VA_BITS_MIN 284 - 1: 285 - adr_l x6, vabits_actual 286 - str x5, [x6] 287 - dmb sy 288 - dc ivac, x6 // Invalidate potentially stale cache line 289 - 290 - /* 291 - * VA_BITS may be too small to allow for an ID mapping to be created 292 - * that covers system RAM if that is located sufficiently high in the 293 - * physical address space. So for the ID map, use an extended virtual 294 - * range in that case, and configure an additional translation level 295 - * if needed. 305 + * The ID map carries a 1:1 mapping of the physical address range 306 + * covered by the loaded image, which could be anywhere in DRAM. This 307 + * means that the required size of the VA (== PA) space is decided at 308 + * boot time, and could be more than the configured size of the VA 309 + * space for ordinary kernel and user space mappings. 296 310 * 297 - * Calculate the maximum allowed value for TCR_EL1.T0SZ so that the 298 - * entire ID map region can be mapped. As T0SZ == (64 - #bits used), 299 - * this number conveniently equals the number of leading zeroes in 300 - * the physical address of __idmap_text_end. 311 + * There are three cases to consider here: 312 + * - 39 <= VA_BITS < 48, and the ID map needs up to 48 VA bits to cover 313 + * the placement of the image. In this case, we configure one extra 314 + * level of translation on the fly for the ID map only. (This case 315 + * also covers 42-bit VA/52-bit PA on 64k pages). 316 + * 317 + * - VA_BITS == 48, and the ID map needs more than 48 VA bits. This can 318 + * only happen when using 64k pages, in which case we need to extend 319 + * the root level table rather than add a level. 
Note that we can 320 + * treat this case as 'always extended' as long as we take care not 321 + * to program an unsupported T0SZ value into the TCR register. 322 + * 323 + * - Combinations that would require two additional levels of 324 + * translation are not supported, e.g., VA_BITS==36 on 16k pages, or 325 + * VA_BITS==39/4k pages with 5-level paging, where the input address 326 + * requires more than 47 or 48 bits, respectively. 301 327 */ 302 - adrp x5, __idmap_text_end 303 - clz x5, x5 304 - cmp x5, TCR_T0SZ(VA_BITS_MIN) // default T0SZ small enough? 305 - b.ge 1f // .. then skip VA range extension 306 - 307 - adr_l x6, idmap_t0sz 308 - str x5, [x6] 309 - dmb sy 310 - dc ivac, x6 // Invalidate potentially stale cache line 311 - 312 328 #if (VA_BITS < 48) 329 + #define IDMAP_PGD_ORDER (VA_BITS - PGDIR_SHIFT) 313 330 #define EXTRA_SHIFT (PGDIR_SHIFT + PAGE_SHIFT - 3) 314 - #define EXTRA_PTRS (1 << (PHYS_MASK_SHIFT - EXTRA_SHIFT)) 315 331 316 332 /* 317 333 * If VA_BITS < 48, we have to configure an additional table level. ··· 320 342 #if VA_BITS != EXTRA_SHIFT 321 343 #error "Mismatch between VA_BITS and page size/number of translation levels" 322 344 #endif 323 - 324 - mov x4, EXTRA_PTRS 325 - create_table_entry x0, x3, EXTRA_SHIFT, x4, x5, x6 326 345 #else 346 + #define IDMAP_PGD_ORDER (PHYS_MASK_SHIFT - PGDIR_SHIFT) 347 + #define EXTRA_SHIFT 327 348 /* 328 349 * If VA_BITS == 48, we don't have to configure an additional 329 350 * translation level, but the top-level table has more entries. 
330 351 */ 331 - mov x4, #1 << (PHYS_MASK_SHIFT - PGDIR_SHIFT) 332 - str_l x4, idmap_ptrs_per_pgd, x5 333 352 #endif 334 - 1: 335 - ldr_l x4, idmap_ptrs_per_pgd 336 - adr_l x6, __idmap_text_end // __pa(__idmap_text_end) 353 + adrp x0, init_idmap_pg_dir 354 + adrp x3, _text 355 + adrp x6, _end + MAX_FDT_SIZE + SWAPPER_BLOCK_SIZE 356 + mov x7, SWAPPER_RX_MMUFLAGS 337 357 338 - map_memory x0, x1, x3, x6, x7, x3, x4, x10, x11, x12, x13, x14 358 + map_memory x0, x1, x3, x6, x7, x3, IDMAP_PGD_ORDER, x10, x11, x12, x13, x14, EXTRA_SHIFT 339 359 340 - /* 341 - * Map the kernel image (starting with PHYS_OFFSET). 342 - */ 343 - adrp x0, init_pg_dir 344 - mov_q x5, KIMAGE_VADDR // compile time __va(_text) 345 - add x5, x5, x23 // add KASLR displacement 346 - mov x4, PTRS_PER_PGD 347 - adrp x6, _end // runtime __pa(_end) 348 - adrp x3, _text // runtime __pa(_text) 349 - sub x6, x6, x3 // _end - _text 350 - add x6, x6, x5 // runtime __va(_end) 360 + /* Remap the kernel page tables r/w in the ID map */ 361 + adrp x1, _text 362 + adrp x2, init_pg_dir 363 + adrp x3, init_pg_end 364 + bic x4, x2, #SWAPPER_BLOCK_SIZE - 1 365 + mov x5, SWAPPER_RW_MMUFLAGS 366 + mov x6, #SWAPPER_BLOCK_SHIFT 367 + bl remap_region 351 368 352 - map_memory x0, x1, x5, x6, x7, x3, x4, x10, x11, x12, x13, x14 369 + /* Remap the FDT after the kernel image */ 370 + adrp x1, _text 371 + adrp x22, _end + SWAPPER_BLOCK_SIZE 372 + bic x2, x22, #SWAPPER_BLOCK_SIZE - 1 373 + bfi x22, x21, #0, #SWAPPER_BLOCK_SHIFT // remapped FDT address 374 + add x3, x2, #MAX_FDT_SIZE + SWAPPER_BLOCK_SIZE 375 + bic x4, x21, #SWAPPER_BLOCK_SIZE - 1 376 + mov x5, SWAPPER_RW_MMUFLAGS 377 + mov x6, #SWAPPER_BLOCK_SHIFT 378 + bl remap_region 353 379 354 380 /* 355 381 * Since the page tables have been populated with non-cacheable ··· 362 380 */ 363 381 dmb sy 364 382 365 - adrp x0, idmap_pg_dir 366 - adrp x1, idmap_pg_end 383 + adrp x0, init_idmap_pg_dir 384 + adrp x1, init_idmap_pg_end 367 385 bl dcache_inval_poc 368 - 369 - adrp x0, 
init_pg_dir 370 - adrp x1, init_pg_end 371 - bl dcache_inval_poc 372 - 373 386 ret x28 374 - SYM_FUNC_END(__create_page_tables) 387 + SYM_FUNC_END(create_idmap) 388 + 389 + SYM_FUNC_START_LOCAL(create_kernel_mapping) 390 + adrp x0, init_pg_dir 391 + mov_q x5, KIMAGE_VADDR // compile time __va(_text) 392 + add x5, x5, x23 // add KASLR displacement 393 + adrp x6, _end // runtime __pa(_end) 394 + adrp x3, _text // runtime __pa(_text) 395 + sub x6, x6, x3 // _end - _text 396 + add x6, x6, x5 // runtime __va(_end) 397 + mov x7, SWAPPER_RW_MMUFLAGS 398 + 399 + map_memory x0, x1, x5, x6, x7, x3, (VA_BITS - PGDIR_SHIFT), x10, x11, x12, x13, x14 400 + 401 + dsb ishst // sync with page table walker 402 + ret 403 + SYM_FUNC_END(create_kernel_mapping) 375 404 376 405 /* 377 406 * Initialize CPU registers with task-specific and cpu-specific context. ··· 413 420 /* 414 421 * The following fragment of code is executed with the MMU enabled. 415 422 * 416 - * x0 = __PHYS_OFFSET 423 + * x0 = __pa(KERNEL_START) 417 424 */ 418 425 SYM_FUNC_START_LOCAL(__primary_switched) 419 426 adr_l x4, init_task ··· 432 439 sub x4, x4, x0 // the kernel virtual and 433 440 str_l x4, kimage_voffset, x5 // physical mappings 434 441 442 + mov x0, x20 443 + bl set_cpu_boot_mode_flag 444 + 435 445 // Clear BSS 436 446 adr_l x0, __bss_start 437 447 mov x1, xzr ··· 443 447 bl __pi_memset 444 448 dsb ishst // Make zero page visible to PTW 445 449 450 + #if VA_BITS > 48 451 + adr_l x8, vabits_actual // Set this early so KASAN early init 452 + str x25, [x8] // ... 
observes the correct value 453 + dc civac, x8 // Make visible to booting secondaries 454 + #endif 455 + 456 + #ifdef CONFIG_RANDOMIZE_BASE 457 + adrp x5, memstart_offset_seed // Save KASLR linear map seed 458 + strh w24, [x5, :lo12:memstart_offset_seed] 459 + #endif 446 460 #if defined(CONFIG_KASAN_GENERIC) || defined(CONFIG_KASAN_SW_TAGS) 447 461 bl kasan_early_init 448 462 #endif 449 463 mov x0, x21 // pass FDT address in x0 450 464 bl early_fdt_map // Try mapping the FDT early 465 + mov x0, x20 // pass the full boot status 451 466 bl init_feature_override // Parse cpu feature overrides 452 - #ifdef CONFIG_RANDOMIZE_BASE 453 - tst x23, ~(MIN_KIMG_ALIGN - 1) // already running randomized? 454 - b.ne 0f 455 - bl kaslr_early_init // parse FDT for KASLR options 456 - cbz x0, 0f // KASLR disabled? just proceed 457 - orr x23, x23, x0 // record KASLR offset 458 - ldp x29, x30, [sp], #16 // we must enable KASLR, return 459 - ret // to __primary_switch() 460 - 0: 461 - #endif 462 - bl switch_to_vhe // Prefer VHE if possible 467 + mov x0, x20 468 + bl finalise_el2 // Prefer VHE if possible 463 469 ldp x29, x30, [sp], #16 464 470 bl start_kernel 465 471 ASM_BUG() 466 472 SYM_FUNC_END(__primary_switched) 467 - 468 - .pushsection ".rodata", "a" 469 - SYM_DATA_START(kimage_vaddr) 470 - .quad _text 471 - SYM_DATA_END(kimage_vaddr) 472 - EXPORT_SYMBOL(kimage_vaddr) 473 - .popsection 474 473 475 474 /* 476 475 * end early head section, begin head code that is also used for ··· 481 490 * Since we cannot always rely on ERET synchronizing writes to sysregs (e.g. if 482 491 * SCTLR_ELx.EOS is clear), we place an ISB prior to ERET. 483 492 * 484 - * Returns either BOOT_CPU_MODE_EL1 or BOOT_CPU_MODE_EL2 in w0 if 485 - * booted in EL1 or EL2 respectively. 493 + * Returns either BOOT_CPU_MODE_EL1 or BOOT_CPU_MODE_EL2 in x0 if 494 + * booted in EL1 or EL2 respectively, with the top 32 bits containing 495 + * potential context flags. These flags are *not* stored in __boot_cpu_mode. 
486 496 */ 487 497 SYM_FUNC_START(init_kernel_el) 488 498 mrs x0, CurrentEL ··· 512 520 msr vbar_el2, x0 513 521 isb 514 522 523 + mov_q x1, INIT_SCTLR_EL1_MMU_OFF 524 + 515 525 /* 516 526 * Fruity CPUs seem to have HCR_EL2.E2H set to RES1, 517 527 * making it impossible to start in nVHE mode. Is that ··· 523 529 and x0, x0, #HCR_E2H 524 530 cbz x0, 1f 525 531 526 - /* Switching to VHE requires a sane SCTLR_EL1 as a start */ 527 - mov_q x0, INIT_SCTLR_EL1_MMU_OFF 528 - msr_s SYS_SCTLR_EL12, x0 529 - 530 - /* 531 - * Force an eret into a helper "function", and let it return 532 - * to our original caller... This makes sure that we have 533 - * initialised the basic PSTATE state. 534 - */ 535 - mov x0, #INIT_PSTATE_EL2 536 - msr spsr_el1, x0 537 - adr x0, __cpu_stick_to_vhe 538 - msr elr_el1, x0 539 - eret 532 + /* Set a sane SCTLR_EL1, the VHE way */ 533 + msr_s SYS_SCTLR_EL12, x1 534 + mov x2, #BOOT_CPU_FLAG_E2H 535 + b 2f 540 536 541 537 1: 542 - mov_q x0, INIT_SCTLR_EL1_MMU_OFF 543 - msr sctlr_el1, x0 544 - 538 + msr sctlr_el1, x1 539 + mov x2, xzr 540 + 2: 545 541 msr elr_el2, lr 546 542 mov w0, #BOOT_CPU_MODE_EL2 543 + orr x0, x0, x2 547 544 eret 548 - 549 - __cpu_stick_to_vhe: 550 - mov x0, #HVC_VHE_RESTART 551 - hvc #0 552 - mov x0, #BOOT_CPU_MODE_EL2 553 - ret 554 545 SYM_FUNC_END(init_kernel_el) 555 546 556 547 /* ··· 548 569 b.ne 1f 549 570 add x1, x1, #4 550 571 1: str w0, [x1] // Save CPU boot mode 551 - dmb sy 552 - dc ivac, x1 // Invalidate potentially stale cache line 553 572 ret 554 573 SYM_FUNC_END(set_cpu_boot_mode_flag) 555 - 556 - /* 557 - * These values are written with the MMU off, but read with the MMU on. 558 - * Writers will invalidate the corresponding address, discarding up to a 559 - * 'Cache Writeback Granule' (CWG) worth of data. The linker script ensures 560 - * sufficient alignment that the CWG doesn't overlap another section. 
561 - */ 562 - .pushsection ".mmuoff.data.write", "aw" 563 - /* 564 - * We need to find out the CPU boot mode long after boot, so we need to 565 - * store it in a writable variable. 566 - * 567 - * This is not in .bss, because we set it sufficiently early that the boot-time 568 - * zeroing of .bss would clobber it. 569 - */ 570 - SYM_DATA_START(__boot_cpu_mode) 571 - .long BOOT_CPU_MODE_EL2 572 - .long BOOT_CPU_MODE_EL1 573 - SYM_DATA_END(__boot_cpu_mode) 574 - /* 575 - * The booting CPU updates the failed status @__early_cpu_boot_status, 576 - * with MMU turned off. 577 - */ 578 - SYM_DATA_START(__early_cpu_boot_status) 579 - .quad 0 580 - SYM_DATA_END(__early_cpu_boot_status) 581 - 582 - .popsection 583 574 584 575 /* 585 576 * This provides a "holding pen" for platforms to hold all secondary ··· 557 608 */ 558 609 SYM_FUNC_START(secondary_holding_pen) 559 610 bl init_kernel_el // w0=cpu_boot_mode 560 - bl set_cpu_boot_mode_flag 561 - mrs x0, mpidr_el1 611 + mrs x2, mpidr_el1 562 612 mov_q x1, MPIDR_HWID_BITMASK 563 - and x0, x0, x1 613 + and x2, x2, x1 564 614 adr_l x3, secondary_holding_pen_release 565 615 pen: ldr x4, [x3] 566 - cmp x4, x0 616 + cmp x4, x2 567 617 b.eq secondary_startup 568 618 wfe 569 619 b pen ··· 574 626 */ 575 627 SYM_FUNC_START(secondary_entry) 576 628 bl init_kernel_el // w0=cpu_boot_mode 577 - bl set_cpu_boot_mode_flag 578 629 b secondary_startup 579 630 SYM_FUNC_END(secondary_entry) 580 631 ··· 581 634 /* 582 635 * Common entry point for secondary CPUs. 
583 636 */ 584 - bl switch_to_vhe 637 + mov x20, x0 // preserve boot mode 638 + bl finalise_el2 585 639 bl __cpu_secondary_check52bitva 640 + #if VA_BITS > 48 641 + ldr_l x0, vabits_actual 642 + #endif 586 643 bl __cpu_setup // initialise processor 587 644 adrp x1, swapper_pg_dir 645 + adrp x2, idmap_pg_dir 588 646 bl __enable_mmu 589 647 ldr x8, =__secondary_switched 590 648 br x8 591 649 SYM_FUNC_END(secondary_startup) 592 650 593 651 SYM_FUNC_START_LOCAL(__secondary_switched) 652 + mov x0, x20 653 + bl set_cpu_boot_mode_flag 654 + str_l xzr, __early_cpu_boot_status, x3 594 655 adr_l x5, vectors 595 656 msr vbar_el1, x5 596 657 isb ··· 646 691 * 647 692 * x0 = SCTLR_EL1 value for turning on the MMU. 648 693 * x1 = TTBR1_EL1 value 694 + * x2 = ID map root table address 649 695 * 650 696 * Returns to the caller via x30/lr. This requires the caller to be covered 651 697 * by the .idmap.text section. ··· 655 699 * If it isn't, park the CPU 656 700 */ 657 701 SYM_FUNC_START(__enable_mmu) 658 - mrs x2, ID_AA64MMFR0_EL1 659 - ubfx x2, x2, #ID_AA64MMFR0_TGRAN_SHIFT, 4 660 - cmp x2, #ID_AA64MMFR0_TGRAN_SUPPORTED_MIN 702 + mrs x3, ID_AA64MMFR0_EL1 703 + ubfx x3, x3, #ID_AA64MMFR0_TGRAN_SHIFT, 4 704 + cmp x3, #ID_AA64MMFR0_TGRAN_SUPPORTED_MIN 661 705 b.lt __no_granule_support 662 - cmp x2, #ID_AA64MMFR0_TGRAN_SUPPORTED_MAX 706 + cmp x3, #ID_AA64MMFR0_TGRAN_SUPPORTED_MAX 663 707 b.gt __no_granule_support 664 - update_early_cpu_boot_status 0, x2, x3 665 - adrp x2, idmap_pg_dir 666 - phys_to_ttbr x1, x1 667 708 phys_to_ttbr x2, x2 668 709 msr ttbr0_el1, x2 // load TTBR0 669 - offset_ttbr1 x1, x3 670 - msr ttbr1_el1, x1 // load TTBR1 671 - isb 710 + load_ttbr1 x1, x1, x3 672 711 673 712 set_sctlr_el1 x0 674 713 ··· 671 720 SYM_FUNC_END(__enable_mmu) 672 721 673 722 SYM_FUNC_START(__cpu_secondary_check52bitva) 674 - #ifdef CONFIG_ARM64_VA_BITS_52 723 + #if VA_BITS > 48 675 724 ldr_l x0, vabits_actual 676 725 cmp x0, #52 677 726 b.ne 2f ··· 706 755 * Iterate over each entry in 
the relocation table, and apply the 707 756 * relocations in place. 708 757 */ 709 - ldr w9, =__rela_offset // offset to reloc table 710 - ldr w10, =__rela_size // size of reloc table 711 - 758 + adr_l x9, __rela_start 759 + adr_l x10, __rela_end 712 760 mov_q x11, KIMAGE_VADDR // default virtual offset 713 761 add x11, x11, x23 // actual virtual offset 714 - add x9, x9, x11 // __va(.rela) 715 - add x10, x9, x10 // __va(.rela) + sizeof(.rela) 716 762 717 763 0: cmp x9, x10 718 764 b.hs 1f ··· 752 804 * entry in x9, the address being relocated by the current address or 753 805 * bitmap entry in x13 and the address being relocated by the current 754 806 * bit in x14. 755 - * 756 - * Because addends are stored in place in the binary, RELR relocations 757 - * cannot be applied idempotently. We use x24 to keep track of the 758 - * currently applied displacement so that we can correctly relocate if 759 - * __relocate_kernel is called twice with non-zero displacements (i.e. 760 - * if there is both a physical misalignment and a KASLR displacement). 
761 807 */ 762 - ldr w9, =__relr_offset // offset to reloc table 763 - ldr w10, =__relr_size // size of reloc table 764 - add x9, x9, x11 // __va(.relr) 765 - add x10, x9, x10 // __va(.relr) + sizeof(.relr) 766 - 767 - sub x15, x23, x24 // delta from previous offset 768 - cbz x15, 7f // nothing to do if unchanged 769 - mov x24, x23 // save new offset 808 + adr_l x9, __relr_start 809 + adr_l x10, __relr_end 770 810 771 811 2: cmp x9, x10 772 812 b.hs 7f ··· 762 826 tbnz x11, #0, 3f // branch to handle bitmaps 763 827 add x13, x11, x23 764 828 ldr x12, [x13] // relocate address entry 765 - add x12, x12, x15 829 + add x12, x12, x23 766 830 str x12, [x13], #8 // adjust to start of bitmap 767 831 b 2b 768 832 ··· 771 835 cbz x11, 6f 772 836 tbz x11, #0, 5f // skip bit if not set 773 837 ldr x12, [x14] // relocate bit 774 - add x12, x12, x15 838 + add x12, x12, x23 775 839 str x12, [x14] 776 840 777 841 5: add x14, x14, #8 // move to next bit's address ··· 792 856 #endif 793 857 794 858 SYM_FUNC_START_LOCAL(__primary_switch) 795 - #ifdef CONFIG_RANDOMIZE_BASE 796 - mov x19, x0 // preserve new SCTLR_EL1 value 797 - mrs x20, sctlr_el1 // preserve old SCTLR_EL1 value 798 - #endif 799 - 800 - adrp x1, init_pg_dir 859 + adrp x1, reserved_pg_dir 860 + adrp x2, init_idmap_pg_dir 801 861 bl __enable_mmu 802 862 #ifdef CONFIG_RELOCATABLE 803 - #ifdef CONFIG_RELR 804 - mov x24, #0 // no RELR displacement yet 805 - #endif 806 - bl __relocate_kernel 863 + adrp x23, KERNEL_START 864 + and x23, x23, MIN_KIMG_ALIGN - 1 807 865 #ifdef CONFIG_RANDOMIZE_BASE 808 - ldr x8, =__primary_switched 809 - adrp x0, __PHYS_OFFSET 810 - blr x8 866 + mov x0, x22 867 + adrp x1, init_pg_end 868 + mov sp, x1 869 + mov x29, xzr 870 + bl __pi_kaslr_early_init 871 + and x24, x0, #SZ_2M - 1 // capture memstart offset seed 872 + bic x0, x0, #SZ_2M - 1 873 + orr x23, x23, x0 // record kernel offset 874 + #endif 875 + #endif 876 + bl clear_page_tables 877 + bl create_kernel_mapping 811 878 812 - /* 813 - * If 
we return here, we have a KASLR displacement in x23 which we need 814 - * to take into account by discarding the current kernel mapping and 815 - * creating a new one. 816 - */ 817 - pre_disable_mmu_workaround 818 - msr sctlr_el1, x20 // disable the MMU 819 - isb 820 - bl __create_page_tables // recreate kernel mapping 821 - 822 - tlbi vmalle1 // Remove any stale TLB entries 823 - dsb nsh 824 - isb 825 - 826 - set_sctlr_el1 x19 // re-enable the MMU 827 - 879 + adrp x1, init_pg_dir 880 + load_ttbr1 x1, x1, x2 881 + #ifdef CONFIG_RELOCATABLE 828 882 bl __relocate_kernel 829 883 #endif 830 - #endif 831 884 ldr x8, =__primary_switched 832 - adrp x0, __PHYS_OFFSET 885 + adrp x0, KERNEL_START // __pa(KERNEL_START) 833 886 br x8 834 887 SYM_FUNC_END(__primary_switch)
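Note on the new init_kernel_el() return convention: per the comment in the diff above, the boot mode now travels in the low 32 bits of x0 and optional context flags in the top 32 bits, and only the low half is stored by set_cpu_boot_mode_flag (it uses a 32-bit `str w0`). The following is a minimal C sketch of that packing, not kernel code. The numeric values of `BOOT_CPU_MODE_EL1`, `BOOT_CPU_MODE_EL2` and `BOOT_CPU_FLAG_E2H` are assumptions that are believed to match asm/virt.h, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values; the diff only shows the symbolic names. */
#define BOOT_CPU_MODE_EL1	0xe11ULL
#define BOOT_CPU_MODE_EL2	0xe12ULL
#define BOOT_CPU_FLAG_E2H	(1ULL << 32)

/* Hypothetical helper: what a 32-bit "str w0" would keep of the status. */
static inline uint32_t boot_mode(uint64_t boot_status)
{
	return (uint32_t)boot_status;
}

/* Hypothetical helper: the context flags living in the top 32 bits. */
static inline uint64_t boot_flags(uint64_t boot_status)
{
	return boot_status & 0xffffffff00000000ULL;
}

/* Mirrors the EL2 tail of init_kernel_el(): "mov w0, mode; orr x0, x0, x2". */
static inline uint64_t make_el2_boot_status(bool e2h_set)
{
	return BOOT_CPU_MODE_EL2 | (e2h_set ? BOOT_CPU_FLAG_E2H : 0);
}

/*
 * Sketch of the check in mmfr1_vh_filter(): an override of vh=0 is
 * rejected only when the CPU booted at EL2 with the E2H flag set.
 */
static inline bool vh_override_allowed(uint64_t boot_status, uint64_t val)
{
	uint32_t mode = boot_mode(boot_status);

	assert(mode == BOOT_CPU_MODE_EL1 || mode == BOOT_CPU_MODE_EL2);
	return !(boot_status == (BOOT_CPU_FLAG_E2H | BOOT_CPU_MODE_EL2) &&
		 val == 0);
}
```

Keeping the flags out of the stored mode means existing readers of __boot_cpu_mode see the same values as before, while early code that receives the full 64-bit status (init_feature_override, finalise_el2) can still inspect the flags.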
-5
arch/arm64/kernel/hibernate.c
···
 		unsigned long pfn = xa_state.xa_index;
 		struct page *page = pfn_to_online_page(pfn);

-		/*
-		 * It is not required to invoke page_kasan_tag_reset(page)
-		 * at this point since the tags stored in page->flags are
-		 * already restored.
-		 */
 		mte_restore_page_tags(page_address(page), tags);

 		mte_free_tag_storage(tags);
+90 -27
arch/arm64/kernel/hyp-stub.S
···
 #include <asm/ptrace.h>
 #include <asm/virt.h>

+// Warning, hardcoded register allocation
+// This will clobber x1 and x2, and expect x1 to contain
+// the id register value as read from the HW
+.macro __check_override idreg, fld, width, pass, fail
+	ubfx x1, x1, #\fld, #\width
+	cbz x1, \fail
+
+	adr_l x1, \idreg\()_override
+	ldr x2, [x1, FTR_OVR_VAL_OFFSET]
+	ldr x1, [x1, FTR_OVR_MASK_OFFSET]
+	ubfx x2, x2, #\fld, #\width
+	ubfx x1, x1, #\fld, #\width
+	cmp x1, xzr
+	and x2, x2, x1
+	csinv x2, x2, xzr, ne
+	cbnz x2, \pass
+	b \fail
+.endm
+
+.macro check_override idreg, fld, pass, fail
+	mrs x1, \idreg\()_el1
+	__check_override \idreg \fld 4 \pass \fail
+.endm
+
 	.text
 	.pushsection .hyp.text, "ax"

···
 	msr vbar_el2, x1
 	b 9f

-1:	cmp x0, #HVC_VHE_RESTART
-	b.eq mutate_to_vhe
+1:	cmp x0, #HVC_FINALISE_EL2
+	b.eq __finalise_el2

 2:	cmp x0, #HVC_SOFT_RESTART
 	b.ne 3f
···
 	eret
 SYM_CODE_END(elx_sync)

-// nVHE? No way! Give me the real thing!
-SYM_CODE_START_LOCAL(mutate_to_vhe)
+SYM_CODE_START_LOCAL(__finalise_el2)
+	check_override id_aa64pfr0 ID_AA64PFR0_SVE_SHIFT .Linit_sve .Lskip_sve
+
+.Linit_sve:	/* SVE register access */
+	mrs x0, cptr_el2 // Disable SVE traps
+	bic x0, x0, #CPTR_EL2_TZ
+	msr cptr_el2, x0
+	isb
+	mov x1, #ZCR_ELx_LEN_MASK // SVE: Enable full vector
+	msr_s SYS_ZCR_EL2, x1 // length for EL1.
+
+.Lskip_sve:
+	check_override id_aa64pfr1 ID_AA64PFR1_SME_SHIFT .Linit_sme .Lskip_sme
+
+.Linit_sme:	/* SME register access and priority mapping */
+	mrs x0, cptr_el2 // Disable SME traps
+	bic x0, x0, #CPTR_EL2_TSM
+	msr cptr_el2, x0
+	isb
+
+	mrs x1, sctlr_el2
+	orr x1, x1, #SCTLR_ELx_ENTP2 // Disable TPIDR2 traps
+	msr sctlr_el2, x1
+	isb
+
+	mov x0, #0 // SMCR controls
+
+	// Full FP in SM?
+	mrs_s x1, SYS_ID_AA64SMFR0_EL1
+	__check_override id_aa64smfr0 ID_AA64SMFR0_EL1_FA64_SHIFT 1 .Linit_sme_fa64 .Lskip_sme_fa64
+
+.Linit_sme_fa64:
+	orr x0, x0, SMCR_ELx_FA64_MASK
+.Lskip_sme_fa64:
+
+	orr x0, x0, #SMCR_ELx_LEN_MASK // Enable full SME vector
+	msr_s SYS_SMCR_EL2, x0 // length for EL1.
+
+	mrs_s x1, SYS_SMIDR_EL1 // Priority mapping supported?
+	ubfx x1, x1, #SMIDR_EL1_SMPS_SHIFT, #1
+	cbz x1, .Lskip_sme
+
+	msr_s SYS_SMPRIMAP_EL2, xzr // Make all priorities equal
+
+	mrs x1, id_aa64mmfr1_el1 // HCRX_EL2 present?
+	ubfx x1, x1, #ID_AA64MMFR1_HCX_SHIFT, #4
+	cbz x1, .Lskip_sme
+
+	mrs_s x1, SYS_HCRX_EL2
+	orr x1, x1, #HCRX_EL2_SMPME_MASK // Enable priority mapping
+	msr_s SYS_HCRX_EL2, x1
+
+.Lskip_sme:
+
+	// nVHE? No way! Give me the real thing!
 	// Sanity check: MMU *must* be off
 	mrs x1, sctlr_el2
 	tbnz x1, #0, 1f

 	// Needs to be VHE capable, obviously
-	mrs x1, id_aa64mmfr1_el1
-	ubfx x1, x1, #ID_AA64MMFR1_VHE_SHIFT, #4
-	cbz x1, 1f
-
-	// Check whether VHE is disabled from the command line
-	adr_l x1, id_aa64mmfr1_override
-	ldr x2, [x1, FTR_OVR_VAL_OFFSET]
-	ldr x1, [x1, FTR_OVR_MASK_OFFSET]
-	ubfx x2, x2, #ID_AA64MMFR1_VHE_SHIFT, #4
-	ubfx x1, x1, #ID_AA64MMFR1_VHE_SHIFT, #4
-	cmp x1, xzr
-	and x2, x2, x1
-	csinv x2, x2, xzr, ne
-	cbnz x2, 2f
+	check_override id_aa64mmfr1 ID_AA64MMFR1_VHE_SHIFT 2f 1f

 1:	mov_q x0, HVC_STUB_ERR
 	eret
···
 	msr spsr_el1, x0

 	b enter_vhe
-SYM_CODE_END(mutate_to_vhe)
+SYM_CODE_END(__finalise_el2)

 	// At the point where we reach enter_vhe(), we run with
-	// the MMU off (which is enforced by mutate_to_vhe()).
+	// the MMU off (which is enforced by __finalise_el2()).
 	// We thus need to be in the idmap, or everything will
 	// explode when enabling the MMU.

···
 SYM_FUNC_END(__hyp_reset_vectors)

 /*
- * Entry point to switch to VHE if deemed capable
+ * Entry point to finalise EL2 and switch to VHE if deemed capable
+ *
+ * w0: boot mode, as returned by init_kernel_el()
  */
-SYM_FUNC_START(switch_to_vhe)
+SYM_FUNC_START(finalise_el2)
 	// Need to have booted at EL2
-	adr_l x1, __boot_cpu_mode
-	ldr w0, [x1]
 	cmp w0, #BOOT_CPU_MODE_EL2
 	b.ne 1f

···
 	cmp x0, #CurrentEL_EL1
 	b.ne 1f

-	// Turn the world upside down
-	mov x0, #HVC_VHE_RESTART
+	mov x0, #HVC_FINALISE_EL2
 	hvc #0
 1:
 	ret
-SYM_FUNC_END(switch_to_vhe)
+SYM_FUNC_END(finalise_el2)
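Note on `__check_override`: the macro branches to `\pass` only if the hardware ID register field is non-zero and the command-line override does not clear it. The C sketch below restates that decision under stated assumptions: `ubfx()` and `check_override()` are hypothetical helpers written for illustration, and the override value and mask are passed in directly instead of being loaded from the `*_override` structure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Extract an unsigned bitfield, in the spirit of the A64 UBFX instruction. */
static inline uint64_t ubfx(uint64_t reg, unsigned int shift, unsigned int width)
{
	assert(width >= 1 && width <= 64 && shift + width <= 64);
	uint64_t mask = (width == 64) ? ~0ULL : ((1ULL << width) - 1);

	return (reg >> shift) & mask;
}

/* Returns true where the assembly macro would branch to \pass. */
static inline bool check_override(uint64_t hw_reg, uint64_t ovr_val,
				  uint64_t ovr_mask, unsigned int shift,
				  unsigned int width)
{
	/* cbz x1, \fail: the hardware does not implement the feature */
	if (ubfx(hw_reg, shift, width) == 0)
		return false;

	uint64_t val = ubfx(ovr_val, shift, width);
	uint64_t mask = ubfx(ovr_mask, shift, width);

	/*
	 * cmp/and/csinv: with no override for this field (mask == 0) the
	 * result is all ones, so the feature passes; otherwise the masked
	 * override value decides.
	 */
	uint64_t res = mask ? (val & mask) : ~0ULL;

	return res != 0;	/* cbnz x2, \pass */
}
```

For example, a 4-bit field at shift 4 with hardware value 1 passes when no override is present, and fails once the override sets that field to 0 with a full mask.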
+78 -15
arch/arm64/kernel/idreg-override.c
···
 #define FTR_ALIAS_NAME_LEN 30
 #define FTR_ALIAS_OPTION_LEN 116

+static u64 __boot_status __initdata;
+
 struct ftr_set_desc {
 	char name[FTR_DESC_NAME_LEN];
 	struct arm64_ftr_override *override;
 	struct {
 		char name[FTR_DESC_FIELD_LEN];
 		u8 shift;
+		u8 width;
 		bool (*filter)(u64 val);
 	} fields[];
 };
+
+#define FIELD(n, s, f) { .name = n, .shift = s, .width = 4, .filter = f }

 static bool __init mmfr1_vh_filter(u64 val)
 {
···
 	 * the user was trying to force nVHE on us, proceed with
 	 * attitude adjustment.
 	 */
-	return !(is_kernel_in_hyp_mode() && val == 0);
+	return !(__boot_status == (BOOT_CPU_FLAG_E2H | BOOT_CPU_MODE_EL2) &&
+		 val == 0);
 }

 static const struct ftr_set_desc mmfr1 __initconst = {
 	.name = "id_aa64mmfr1",
 	.override = &id_aa64mmfr1_override,
 	.fields = {
-		{ "vh", ID_AA64MMFR1_VHE_SHIFT, mmfr1_vh_filter },
+		FIELD("vh", ID_AA64MMFR1_VHE_SHIFT, mmfr1_vh_filter),
 		{}
 	},
 };
+
+static bool __init pfr0_sve_filter(u64 val)
+{
+	/*
+	 * Disabling SVE also means disabling all the features that
+	 * are associated with it. The easiest way to do it is just to
+	 * override id_aa64zfr0_el1 to be 0.
+	 */
+	if (!val) {
+		id_aa64zfr0_override.val = 0;
+		id_aa64zfr0_override.mask = GENMASK(63, 0);
+	}
+
+	return true;
+}
+
+static const struct ftr_set_desc pfr0 __initconst = {
+	.name = "id_aa64pfr0",
+	.override = &id_aa64pfr0_override,
+	.fields = {
+		FIELD("sve", ID_AA64PFR0_SVE_SHIFT, pfr0_sve_filter),
+		{}
+	},
+};
+
+static bool __init pfr1_sme_filter(u64 val)
+{
+	/*
+	 * Similarly to SVE, disabling SME also means disabling all
+	 * the features that are associated with it. Just set
+	 * id_aa64smfr0_el1 to 0 and don't look back.
+	 */
+	if (!val) {
+		id_aa64smfr0_override.val = 0;
+		id_aa64smfr0_override.mask = GENMASK(63, 0);
+	}
+
+	return true;
+}

 static const struct ftr_set_desc pfr1 __initconst = {
 	.name = "id_aa64pfr1",
 	.override = &id_aa64pfr1_override,
 	.fields = {
-		{ "bt", ID_AA64PFR1_BT_SHIFT },
-		{ "mte", ID_AA64PFR1_MTE_SHIFT},
+		FIELD("bt", ID_AA64PFR1_BT_SHIFT, NULL ),
+		FIELD("mte", ID_AA64PFR1_MTE_SHIFT, NULL),
+		FIELD("sme", ID_AA64PFR1_SME_SHIFT, pfr1_sme_filter),
 		{}
 	},
 };
···
 	.name = "id_aa64isar1",
 	.override = &id_aa64isar1_override,
 	.fields = {
-		{ "gpi", ID_AA64ISAR1_GPI_SHIFT },
-		{ "gpa", ID_AA64ISAR1_GPA_SHIFT },
-		{ "api", ID_AA64ISAR1_API_SHIFT },
-		{ "apa", ID_AA64ISAR1_APA_SHIFT },
+		FIELD("gpi", ID_AA64ISAR1_EL1_GPI_SHIFT, NULL),
+		FIELD("gpa", ID_AA64ISAR1_EL1_GPA_SHIFT, NULL),
+		FIELD("api", ID_AA64ISAR1_EL1_API_SHIFT, NULL),
+		FIELD("apa", ID_AA64ISAR1_EL1_APA_SHIFT, NULL),
 		{}
 	},
 };
···
 	.name = "id_aa64isar2",
 	.override = &id_aa64isar2_override,
 	.fields = {
-		{ "gpa3", ID_AA64ISAR2_GPA3_SHIFT },
-		{ "apa3", ID_AA64ISAR2_APA3_SHIFT },
+		FIELD("gpa3", ID_AA64ISAR2_EL1_GPA3_SHIFT, NULL),
+		FIELD("apa3", ID_AA64ISAR2_EL1_APA3_SHIFT, NULL),
+		{}
+	},
+};
+
+static const struct ftr_set_desc smfr0 __initconst = {
+	.name = "id_aa64smfr0",
+	.override = &id_aa64smfr0_override,
+	.fields = {
+		/* FA64 is a one bit field... :-/ */
+		{ "fa64", ID_AA64SMFR0_EL1_FA64_SHIFT, 1, },
 		{}
 	},
 };
···
 	.override = &kaslr_feature_override,
 #endif
 	.fields = {
-		{ "disabled", 0 },
+		FIELD("disabled", 0, NULL),
 		{}
 	},
 };

 static const struct ftr_set_desc * const regs[] __initconst = {
 	&mmfr1,
+	&pfr0,
 	&pfr1,
 	&isar1,
 	&isar2,
+	&smfr0,
 	&kaslr,
 };

···
 } aliases[] __initconst = {
 	{ "kvm-arm.mode=nvhe", "id_aa64mmfr1.vh=0" },
 	{ "kvm-arm.mode=protected", "id_aa64mmfr1.vh=0" },
+	{ "arm64.nosve", "id_aa64pfr0.sve=0 id_aa64pfr1.sme=0" },
+	{ "arm64.nosme", "id_aa64pfr1.sme=0" },
 	{ "arm64.nobti", "id_aa64pfr1.bt=0" },
 	{ "arm64.nopauth",
 	  "id_aa64isar1.gpi=0 id_aa64isar1.gpa=0 "
···

 		for (f = 0; strlen(regs[i]->fields[f].name); f++) {
 			u64 shift = regs[i]->fields[f].shift;
-			u64 mask = 0xfUL << shift;
+			u64 width = regs[i]->fields[f].width ?: 4;
+			u64 mask = GENMASK_ULL(shift + width - 1, shift);
 			u64 v;

 			if (find_field(cmdline, regs[i], f, &v))
···

 			/*
 			 * If an override gets filtered out, advertise
-			 * it by setting the value to 0xf, but
+			 * it by setting the value to the all-ones while
 			 * clearing the mask... Yes, this is fragile.
 			 */
 			if (regs[i]->fields[f].filter &&
···
 }

 /* Keep checkers quiet */
-void init_feature_override(void);
+void init_feature_override(u64 boot_status);

-asmlinkage void __init init_feature_override(void)
+asmlinkage void __init init_feature_override(u64 boot_status)
 {
 	int i;

···
 			regs[i]->override->mask = 0;
 		}
 	}
+
+	__boot_status = boot_status;

 	parse_cmdline();

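Note on the per-field mask: the parser no longer hardcodes a 4-bit mask (`0xfUL << shift`) but derives it from a per-field width, falling back to 4 when the width is 0. This is what lets the one-bit `fa64` field share the same code path. A minimal stand-alone sketch of that computation follows; `genmask_ull()` is a local stand-in written for this example (not the kernel macro itself) and `field_mask()` is a hypothetical helper name.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in for GENMASK_ULL(h, l): bits l..h set, inclusive. */
static inline uint64_t genmask_ull(unsigned int h, unsigned int l)
{
	assert(l <= h && h < 64);
	return (~0ULL << l) & (~0ULL >> (63 - h));
}

/* Mask for an ID register field; width 0 means "use the default of 4". */
static inline uint64_t field_mask(unsigned int shift, unsigned int width)
{
	if (!width)
		width = 4;	/* same fallback as "width ?: 4" in the diff */

	return genmask_ull(shift + width - 1, shift);
}
```

With this, a default-width field at shift 4 yields `0xf0`, while a one-bit field at shift 63 yields only the top bit.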
+30 -27
arch/arm64/kernel/image-vars.h
···
 #error This file should only be included in vmlinux.lds.S
 #endif

-#ifdef CONFIG_EFI
-
-__efistub_kernel_size = _edata - _text;
-__efistub_primary_entry_offset = primary_entry - _text;
-
+PROVIDE(__efistub_kernel_size = _edata - _text);
+PROVIDE(__efistub_primary_entry_offset = primary_entry - _text);

 /*
  * The EFI stub has its own symbol namespace prefixed by __efistub_, to
···
  * linked at. The routines below are all implemented in assembler in a
  * position independent manner
  */
-__efistub_memcmp = __pi_memcmp;
-__efistub_memchr = __pi_memchr;
-__efistub_memcpy = __pi_memcpy;
-__efistub_memmove = __pi_memmove;
-__efistub_memset = __pi_memset;
-__efistub_strlen = __pi_strlen;
-__efistub_strnlen = __pi_strnlen;
-__efistub_strcmp = __pi_strcmp;
-__efistub_strncmp = __pi_strncmp;
-__efistub_strrchr = __pi_strrchr;
-__efistub_dcache_clean_poc = __pi_dcache_clean_poc;
+PROVIDE(__efistub_memcmp = __pi_memcmp);
+PROVIDE(__efistub_memchr = __pi_memchr);
+PROVIDE(__efistub_memcpy = __pi_memcpy);
+PROVIDE(__efistub_memmove = __pi_memmove);
+PROVIDE(__efistub_memset = __pi_memset);
+PROVIDE(__efistub_strlen = __pi_strlen);
+PROVIDE(__efistub_strnlen = __pi_strnlen);
+PROVIDE(__efistub_strcmp = __pi_strcmp);
+PROVIDE(__efistub_strncmp = __pi_strncmp);
+PROVIDE(__efistub_strrchr = __pi_strrchr);
+PROVIDE(__efistub_dcache_clean_poc = __pi_dcache_clean_poc);

-#if defined(CONFIG_KASAN_GENERIC) || defined(CONFIG_KASAN_SW_TAGS)
-__efistub___memcpy = __pi_memcpy;
-__efistub___memmove = __pi_memmove;
-__efistub___memset = __pi_memset;
-#endif
+PROVIDE(__efistub__text = _text);
+PROVIDE(__efistub__end = _end);
+PROVIDE(__efistub__edata = _edata);
+PROVIDE(__efistub_screen_info = screen_info);
+PROVIDE(__efistub__ctype = _ctype);

-__efistub__text = _text;
-__efistub__end = _end;
-__efistub__edata = _edata;
-__efistub_screen_info = screen_info;
-__efistub__ctype = _ctype;
+/*
+ * The __ prefixed memcpy/memset/memmove symbols are provided by KASAN, which
+ * instruments the conventional ones. Therefore, any references from the EFI
+ * stub or other position independent, low level C code should be redirected to
+ * the non-instrumented versions as well.
+ */
+PROVIDE(__efistub___memcpy = __pi_memcpy);
+PROVIDE(__efistub___memmove = __pi_memmove);
+PROVIDE(__efistub___memset = __pi_memset);

-#endif
+PROVIDE(__pi___memcpy = __pi_memcpy);
+PROVIDE(__pi___memmove = __pi_memmove);
+PROVIDE(__pi___memset = __pi_memset);

 #ifdef CONFIG_KVM

+18 -131
arch/arm64/kernel/kaslr.c
···
 #include <linux/pgtable.h>
 #include <linux/random.h>

-#include <asm/cacheflush.h>
 #include <asm/fixmap.h>
 #include <asm/kernel-pgtable.h>
 #include <asm/memory.h>
···
 #include <asm/sections.h>
 #include <asm/setup.h>

-enum kaslr_status {
-	KASLR_ENABLED,
-	KASLR_DISABLED_CMDLINE,
-	KASLR_DISABLED_NO_SEED,
-	KASLR_DISABLED_FDT_REMAP,
-};
-
-static enum kaslr_status __initdata kaslr_status;
 u64 __ro_after_init module_alloc_base;
 u16 __initdata memstart_offset_seed;

-static __init u64 get_kaslr_seed(void *fdt)
-{
-	int node, len;
-	fdt64_t *prop;
-	u64 ret;
-
-	node = fdt_path_offset(fdt, "/chosen");
-	if (node < 0)
-		return 0;
-
-	prop = fdt_getprop_w(fdt, node, "kaslr-seed", &len);
-	if (!prop || len != sizeof(u64))
-		return 0;
-
-	ret = fdt64_to_cpu(*prop);
-	*prop = 0;
-	return ret;
-}
-
 struct arm64_ftr_override kaslr_feature_override __initdata;

-/*
- * This routine will be executed with the kernel mapped at its default virtual
- * address, and if it returns successfully, the kernel will be remapped, and
- * start_kernel() will be executed from a randomized virtual offset. The
- * relocation will result in all absolute references (e.g., static variables
- * containing function pointers) to be reinitialized, and zero-initialized
- * .bss variables will be reset to 0.
- */
-u64 __init kaslr_early_init(void)
+static int __init kaslr_init(void)
 {
-	void *fdt;
-	u64 seed, offset, mask, module_range;
-	unsigned long raw;
+	u64 module_range;
+	u32 seed;

 	/*
 	 * Set a reasonable default for module_alloc_base in case
 	 * we end up running with module randomization disabled.
 	 */
 	module_alloc_base = (u64)_etext - MODULES_VSIZE;
-	dcache_clean_inval_poc((unsigned long)&module_alloc_base,
-			       (unsigned long)&module_alloc_base +
-					sizeof(module_alloc_base));

-	/*
-	 * Try to map the FDT early. If this fails, we simply bail,
-	 * and proceed with KASLR disabled. We will make another
-	 * attempt at mapping the FDT in setup_machine()
-	 */
-	fdt = get_early_fdt_ptr();
-	if (!fdt) {
-		kaslr_status = KASLR_DISABLED_FDT_REMAP;
-		return 0;
-	}
-
-	/*
-	 * Retrieve (and wipe) the seed from the FDT
-	 */
-	seed = get_kaslr_seed(fdt);
-
-	/*
-	 * Check if 'nokaslr' appears on the command line, and
-	 * return 0 if that is the case.
-	 */
 	if (kaslr_feature_override.val & kaslr_feature_override.mask & 0xf) {
-		kaslr_status = KASLR_DISABLED_CMDLINE;
+		pr_info("KASLR disabled on command line\n");
 		return 0;
 	}

-	/*
-	 * Mix in any entropy obtainable architecturally if enabled
-	 * and supported.
-	 */
-
-	if (arch_get_random_seed_long_early(&raw))
-		seed ^= raw;
-
-	if (!seed) {
-		kaslr_status = KASLR_DISABLED_NO_SEED;
+	if (!kaslr_offset()) {
+		pr_warn("KASLR disabled due to lack of seed\n");
 		return 0;
 	}

+	pr_info("KASLR enabled\n");
+
 	/*
-	 * OK, so we are proceeding with KASLR enabled. Calculate a suitable
-	 * kernel image offset from the seed. Let's place the kernel in the
-	 * middle half of the VMALLOC area (VA_BITS_MIN - 2), and stay clear of
-	 * the lower and upper quarters to avoid colliding with other
-	 * allocations.
-	 * Even if we could randomize at page granularity for 16k and 64k pages,
-	 * let's always round to 2 MB so we don't interfere with the ability to
-	 * map using contiguous PTEs
+	 * KASAN without KASAN_VMALLOC does not expect the module region to
+	 * intersect the vmalloc region, since shadow memory is allocated for
+	 * each module at load time, whereas the vmalloc region will already be
+	 * shadowed by KASAN zero pages.
 	 */
-	mask = ((1UL << (VA_BITS_MIN - 2)) - 1) & ~(SZ_2M - 1);
-	offset = BIT(VA_BITS_MIN - 3) + (seed & mask);
+	BUILD_BUG_ON((IS_ENABLED(CONFIG_KASAN_GENERIC) ||
+		      IS_ENABLED(CONFIG_KASAN_SW_TAGS)) &&
+		     !IS_ENABLED(CONFIG_KASAN_VMALLOC));

-	/* use the top 16 bits to randomize the linear region */
-	memstart_offset_seed = seed >> 48;
-
-	if (!IS_ENABLED(CONFIG_KASAN_VMALLOC) &&
-	    (IS_ENABLED(CONFIG_KASAN_GENERIC) ||
-	     IS_ENABLED(CONFIG_KASAN_SW_TAGS)))
-		/*
-		 * KASAN without KASAN_VMALLOC does not expect the module region
-		 * to intersect the vmalloc region, since shadow memory is
-		 * allocated for each module at load time, whereas the vmalloc
-		 * region is shadowed by KASAN zero pages. So keep modules
-		 * out of the vmalloc region if KASAN is enabled without
-		 * KASAN_VMALLOC, and put the kernel well within 4 GB of the
-		 * module region.
-		 */
-		return offset % SZ_2G;
+	seed = get_random_u32();

 	if (IS_ENABLED(CONFIG_RANDOMIZE_MODULE_REGION_FULL)) {
 		/*
···
 		 * resolved normally.)
 		 */
 		module_range = SZ_2G - (u64)(_end - _stext);
-		module_alloc_base = max((u64)_end + offset - SZ_2G,
-					(u64)MODULES_VADDR);
+		module_alloc_base = max((u64)_end - SZ_2G, (u64)MODULES_VADDR);
 	} else {
 		/*
 		 * Randomize the module region by setting module_alloc_base to
···
 		 * when ARM64_MODULE_PLTS is enabled.
 		 */
 		module_range = MODULES_VSIZE - (u64)(_etext - _stext);
-		module_alloc_base = (u64)_etext + offset - MODULES_VSIZE;
 	}

 	/* use the lower 21 bits to randomize the base of the module region */
 	module_alloc_base += (module_range * (seed & ((1 << 21) - 1))) >> 21;
 	module_alloc_base &= PAGE_MASK;

-	dcache_clean_inval_poc((unsigned long)&module_alloc_base,
-			       (unsigned long)&module_alloc_base +
-					sizeof(module_alloc_base));
-	dcache_clean_inval_poc((unsigned long)&memstart_offset_seed,
-			       (unsigned long)&memstart_offset_seed +
-					sizeof(memstart_offset_seed));
-
-	return offset;
-}
-
-static int __init kaslr_init(void)
-{
-	switch (kaslr_status) {
-	case KASLR_ENABLED:
-		pr_info("KASLR enabled\n");
-		break;
-	case KASLR_DISABLED_CMDLINE:
-		pr_info("KASLR disabled on command line\n");
-		break;
-	case KASLR_DISABLED_NO_SEED:
-		pr_warn("KASLR disabled due to lack of seed\n");
-		break;
-	case KASLR_DISABLED_FDT_REMAP:
-		pr_warn("KASLR disabled due to FDT remapping failure\n");
-		break;
-	}
-
 	return 0;
 }
-core_initcall(kaslr_init)
+subsys_initcall(kaslr_init)
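Note on the module base randomization kept at the end of kaslr_init(): the low 21 bits of the seed are treated as a fraction of `module_range` (a fixed-point multiply followed by a shift), and the result is rounded down to a page boundary. The sketch below restates that arithmetic outside the kernel. It assumes 4 KiB pages, and both `randomize_module_base()` and `PAGE_SIZE_ASSUMED` are hypothetical names introduced for this example.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE_ASSUMED	4096ULL	/* assumption: 4 KiB pages */

/*
 * Add a pseudo-random, page-aligned displacement in [0, module_range)
 * to base, using the low 21 bits of seed as a 21-bit fraction.
 */
static inline uint64_t randomize_module_base(uint64_t base,
					     uint64_t module_range,
					     uint32_t seed)
{
	/* keeps module_range * (2^21 - 1) below 2^64 */
	assert(module_range < (1ULL << 43));

	uint64_t frac = seed & ((1u << 21) - 1);

	base += (module_range * frac) >> 21;
	return base & ~(PAGE_SIZE_ASSUMED - 1);
}
```

A seed whose low 21 bits are zero leaves a page-aligned base unchanged, and the largest fraction still lands strictly below `base + module_range`.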
+1
arch/arm64/kernel/kuser32.S
···

 #include <asm/unistd.h>

+	.section .rodata
 	.align 5
 	.globl __kuser_helper_start
 __kuser_helper_start:
-9
arch/arm64/kernel/mte.c
···
 	if (!pte_is_tagged)
 		return;

-	page_kasan_tag_reset(page);
-	/*
-	 * We need smp_wmb() in between setting the flags and clearing the
-	 * tags because if another thread reads page->flags and builds a
-	 * tagged address out of it, there is an actual dependency to the
-	 * memory access, but on the current thread we do not guarantee that
-	 * the new page->flags are visible before the tags were updated.
-	 */
-	smp_wmb();
 	mte_clear_page_tags(page_address(page));
 }

+33
arch/arm64/kernel/pi/Makefile
···
+# SPDX-License-Identifier: GPL-2.0
+# Copyright 2022 Google LLC
+
+KBUILD_CFLAGS := $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) -fpie \
+		 -Os -DDISABLE_BRANCH_PROFILING $(DISABLE_STACKLEAK_PLUGIN) \
+		 $(call cc-option,-mbranch-protection=none) \
+		 -I$(srctree)/scripts/dtc/libfdt -fno-stack-protector \
+		 -include $(srctree)/include/linux/hidden.h \
+		 -D__DISABLE_EXPORTS -ffreestanding -D__NO_FORTIFY \
+		 $(call cc-option,-fno-addrsig)
+
+# remove SCS flags from all objects in this directory
+KBUILD_CFLAGS := $(filter-out $(CC_FLAGS_SCS), $(KBUILD_CFLAGS))
+# disable LTO
+KBUILD_CFLAGS := $(filter-out $(CC_FLAGS_LTO), $(KBUILD_CFLAGS))
+
+GCOV_PROFILE := n
+KASAN_SANITIZE := n
+KCSAN_SANITIZE := n
+UBSAN_SANITIZE := n
+KCOV_INSTRUMENT := n
+
+$(obj)/%.pi.o: OBJCOPYFLAGS := --prefix-symbols=__pi_ \
+			       --remove-section=.note.gnu.property \
+			       --prefix-alloc-sections=.init
+$(obj)/%.pi.o: $(obj)/%.o FORCE
+	$(call if_changed,objcopy)
+
+$(obj)/lib-%.o: $(srctree)/lib/%.c FORCE
+	$(call if_changed_rule,cc_o_c)
+
+obj-y := kaslr_early.pi.o lib-fdt.pi.o lib-fdt_ro.pi.o
+extra-y := $(patsubst %.pi.o,%.o,$(obj-y))
+112
arch/arm64/kernel/pi/kaslr_early.c
···
+// SPDX-License-Identifier: GPL-2.0-only
+// Copyright 2022 Google LLC
+// Author: Ard Biesheuvel <ardb@google.com>
+
+// NOTE: code in this file runs *very* early, and is not permitted to use
+// global variables or anything that relies on absolute addressing.
+
+#include <linux/libfdt.h>
+#include <linux/init.h>
+#include <linux/linkage.h>
+#include <linux/types.h>
+#include <linux/sizes.h>
+#include <linux/string.h>
+
+#include <asm/archrandom.h>
+#include <asm/memory.h>
+
+/* taken from lib/string.c */
+static char *__strstr(const char *s1, const char *s2)
+{
+	size_t l1, l2;
+
+	l2 = strlen(s2);
+	if (!l2)
+		return (char *)s1;
+	l1 = strlen(s1);
+	while (l1 >= l2) {
+		l1--;
+		if (!memcmp(s1, s2, l2))
+			return (char *)s1;
+		s1++;
+	}
+	return NULL;
+}
+static bool cmdline_contains_nokaslr(const u8 *cmdline)
+{
+	const u8 *str;
+
+	str = __strstr(cmdline, "nokaslr");
+	return str == cmdline || (str > cmdline && *(str - 1) == ' ');
+}
+
+static bool is_kaslr_disabled_cmdline(void *fdt)
+{
+	if (!IS_ENABLED(CONFIG_CMDLINE_FORCE)) {
+		int node;
+		const u8 *prop;
+
+		node = fdt_path_offset(fdt, "/chosen");
+		if (node < 0)
+			goto out;
+
+		prop = fdt_getprop(fdt, node, "bootargs", NULL);
+		if (!prop)
+			goto out;
+
+		if (cmdline_contains_nokaslr(prop))
+			return true;
+
+		if (IS_ENABLED(CONFIG_CMDLINE_EXTEND))
+			goto out;
+
+		return false;
+	}
+out:
+	return cmdline_contains_nokaslr(CONFIG_CMDLINE);
+}
+
+static u64 get_kaslr_seed(void *fdt)
+{
+	int node, len;
+	fdt64_t *prop;
+	u64 ret;
+
+	node = fdt_path_offset(fdt, "/chosen");
+	if (node < 0)
+		return 0;
+
+	prop = fdt_getprop_w(fdt, node, "kaslr-seed", &len);
+	if (!prop || len != sizeof(u64))
+		return 0;
+
+	ret = fdt64_to_cpu(*prop);
+	*prop = 0;
+	return ret;
+}
+
+asmlinkage u64 kaslr_early_init(void *fdt)
+{
+	u64 seed;
+
+	if (is_kaslr_disabled_cmdline(fdt))
+		return 0;
+
+	seed = get_kaslr_seed(fdt);
+	if (!seed) {
+#ifdef CONFIG_ARCH_RANDOM
+		 if (!__early_cpu_has_rndr() ||
+		     !__arm64_rndr((unsigned long *)&seed))
+#endif
+		return 0;
+	}
+
+	/*
+	 * OK, so we are proceeding with KASLR enabled. Calculate a suitable
+	 * kernel image offset from the seed. Let's place the kernel in the
+	 * middle half of the VMALLOC area (VA_BITS_MIN - 2), and stay clear of
+	 * the lower and upper quarters to avoid colliding with other
+	 * allocations.
+	 */
+	return BIT(VA_BITS_MIN - 3) + (seed & GENMASK(VA_BITS_MIN - 3, 0));
+}
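The offset arithmetic in the final return statement of kaslr_early_init() can be checked outside the kernel. What follows is a minimal user-space sketch, not the kernel code: kaslr_offset_sketch(), bit_u64(), low_mask_u64() and the va_bits_min parameter are hypothetical stand-ins for the in-tree BIT(), GENMASK() and VA_BITS_MIN, and the sketch assumes va_bits_min is between 4 and 64.

```c
#include <stdint.h>

/* Hypothetical helper: a value with only bit n set, like the kernel's BIT(n). */
static uint64_t bit_u64(unsigned int n)
{
	return UINT64_C(1) << n;
}

/* Hypothetical helper: bits high..0 set, like GENMASK(high, 0) for high < 63. */
static uint64_t low_mask_u64(unsigned int high)
{
	return (UINT64_C(1) << (high + 1)) - 1;
}

/*
 * Sketch of the return expression: a fixed base of 2^(va_bits_min - 3)
 * plus the low (va_bits_min - 2) bits of the seed.
 */
uint64_t kaslr_offset_sketch(uint64_t seed, unsigned int va_bits_min)
{
	return bit_u64(va_bits_min - 3) +
	       (seed & low_mask_u64(va_bits_min - 3));
}
```

With va_bits_min set to 48, every result lies between 2^45 and 2^45 + 2^46 - 1, so the randomised window is 2^46 bytes wide and starts 2^45 bytes above the base it is added to.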
+12 -8
arch/arm64/kernel/signal.c
···
 		vl = task_get_sme_vl(current);
 	} else {
+		if (!system_supports_sve())
+			return -EINVAL;
+
 		vl = task_get_sve_vl(current);
 	}

···

 #else /* ! CONFIG_ARM64_SVE */

-/* Turn any non-optimised out attempts to use these into a link error: */
+static int restore_sve_fpsimd_context(struct user_ctxs *user)
+{
+	WARN_ON_ONCE(1);
+	return -EINVAL;
+}
+
+/* Turn any non-optimised out attempts to use this into a link error: */
 extern int preserve_sve_context(void __user *ctx);
-extern int restore_sve_fpsimd_context(struct user_ctxs *user);

 #endif /* ! CONFIG_ARM64_SVE */

···
 		if (!user.fpsimd)
 			return -EINVAL;

-		if (user.sve) {
-			if (!system_supports_sve())
-				return -EINVAL;
-
+		if (user.sve)
 			err = restore_sve_fpsimd_context(&user);
-		} else {
+		else
 			err = restore_fpsimd_context(user.fpsimd);
-		}
 	}

 	if (err == 0 && system_supports_sme() && user.za)
+1
arch/arm64/kernel/sigreturn32.S
···

 #include <asm/unistd.h>

+	.section .rodata
 	.globl __aarch32_sigret_code_start
 __aarch32_sigret_code_start:
+2 -1
arch/arm64/kernel/sleep.S
···
 	.pushsection ".idmap.text", "awx"
 SYM_CODE_START(cpu_resume)
 	bl	init_kernel_el
-	bl	switch_to_vhe
+	bl	finalise_el2
 	bl	__cpu_setup
 	/* enable the MMU early - so we can access sleep_save_stash by va */
 	adrp	x1, swapper_pg_dir
+	adrp	x2, idmap_pg_dir
 	bl	__enable_mmu
 	ldr	x8, =_cpu_resume
 	br	x8
+74 -23
arch/arm64/kernel/stacktrace.c
···
  * @kr_cur:      When KRETPROBES is selected, holds the kretprobe instance
  *               associated with the most recently encountered replacement lr
  *               value.
+ *
+ * @task:        The task being unwound.
  */
 struct unwind_state {
 	unsigned long fp;
···
 #ifdef CONFIG_KRETPROBES
 	struct llist_node *kr_cur;
 #endif
+	struct task_struct *task;
 };

-static notrace void unwind_init(struct unwind_state *state, unsigned long fp,
-				unsigned long pc)
+static void unwind_init_common(struct unwind_state *state,
+			       struct task_struct *task)
 {
-	state->fp = fp;
-	state->pc = pc;
+	state->task = task;
 #ifdef CONFIG_KRETPROBES
 	state->kr_cur = NULL;
 #endif
···
 	state->prev_fp = 0;
 	state->prev_type = STACK_TYPE_UNKNOWN;
 }
-NOKPROBE_SYMBOL(unwind_init);
+
+/*
+ * Start an unwind from a pt_regs.
+ *
+ * The unwind will begin at the PC within the regs.
+ *
+ * The regs must be on a stack currently owned by the calling task.
+ */
+static inline void unwind_init_from_regs(struct unwind_state *state,
+					 struct pt_regs *regs)
+{
+	unwind_init_common(state, current);
+
+	state->fp = regs->regs[29];
+	state->pc = regs->pc;
+}
+
+/*
+ * Start an unwind from a caller.
+ *
+ * The unwind will begin at the caller of whichever function this is inlined
+ * into.
+ *
+ * The function which invokes this must be noinline.
+ */
+static __always_inline void unwind_init_from_caller(struct unwind_state *state)
+{
+	unwind_init_common(state, current);
+
+	state->fp = (unsigned long)__builtin_frame_address(1);
+	state->pc = (unsigned long)__builtin_return_address(0);
+}
+
+/*
+ * Start an unwind from a blocked task.
+ *
+ * The unwind will begin at the blocked tasks saved PC (i.e. the caller of
+ * cpu_switch_to()).
+ *
+ * The caller should ensure the task is blocked in cpu_switch_to() for the
+ * duration of the unwind, or the unwind will be bogus. It is never valid to
+ * call this for the current task.
+ */
+static inline void unwind_init_from_task(struct unwind_state *state,
+					 struct task_struct *task)
+{
+	unwind_init_common(state, task);
+
+	state->fp = thread_saved_fp(task);
+	state->pc = thread_saved_pc(task);
+}

 /*
  * Unwind from one frame record (A) to the next frame record (B).
···
  * records (e.g. a cycle), determined based on the location and fp value of A
  * and the location (but not the fp value) of B.
  */
-static int notrace unwind_next(struct task_struct *tsk,
-			       struct unwind_state *state)
+static int notrace unwind_next(struct unwind_state *state)
 {
+	struct task_struct *tsk = state->task;
 	unsigned long fp = state->fp;
 	struct stack_info info;

···
 		if (fp <= state->prev_fp)
 			return -EINVAL;
 	} else {
-		set_bit(state->prev_type, state->stacks_done);
+		__set_bit(state->prev_type, state->stacks_done);
 	}

 	/*
 	 * Record this frame record's values and location. The prev_fp and
 	 * prev_type are only meaningful to the next unwind_next() invocation.
 	 */
-	state->fp = READ_ONCE_NOCHECK(*(unsigned long *)(fp));
-	state->pc = READ_ONCE_NOCHECK(*(unsigned long *)(fp + 8));
+	state->fp = READ_ONCE(*(unsigned long *)(fp));
+	state->pc = READ_ONCE(*(unsigned long *)(fp + 8));
 	state->prev_fp = fp;
 	state->prev_type = info.type;

···
 }
 NOKPROBE_SYMBOL(unwind_next);

-static void notrace unwind(struct task_struct *tsk,
-			   struct unwind_state *state,
+static void notrace unwind(struct unwind_state *state,
 			   stack_trace_consume_fn consume_entry, void *cookie)
 {
 	while (1) {
···

 		if (!consume_entry(cookie, state->pc))
 			break;
-		ret = unwind_next(tsk, state);
+		ret = unwind_next(state);
 		if (ret < 0)
 			break;
 	}
···
 {
 	struct unwind_state state;

-	if (regs)
-		unwind_init(&state, regs->regs[29], regs->pc);
-	else if (task == current)
-		unwind_init(&state,
-				(unsigned long)__builtin_frame_address(1),
-				(unsigned long)__builtin_return_address(0));
-	else
-		unwind_init(&state, thread_saved_fp(task),
-				thread_saved_pc(task));
+	if (regs) {
+		if (task != current)
+			return;
+		unwind_init_from_regs(&state, regs);
+	} else if (task == current) {
+		unwind_init_from_caller(&state);
+	} else {
+		unwind_init_from_task(&state, task);
+	}

-	unwind(task, &state, consume_entry, cookie);
+	unwind(&state, consume_entry, cookie);
 }
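The frame-record walk that unwind_next() performs can be illustrated with a small user-space model. This is a simplified sketch under stated assumptions, not the kernel unwinder: every name ending in _sketch, plus demo_unwind_depth(), is hypothetical; there is a single stack, so the "fp must increase" check always applies; and a zero fp simply ends the walk, whereas the kernel identifies the final frame differently.

```c
#include <stddef.h>
#include <stdint.h>

/* An AArch64-style frame record: saved frame pointer, then saved link register. */
struct frame_record {
	uintptr_t fp;	/* address of the caller's frame record, 0 at the end */
	uintptr_t lr;	/* return address into the caller */
};

/* Hypothetical, reduced counterpart of struct unwind_state. */
struct unwind_state_sketch {
	uintptr_t fp;
	uintptr_t pc;
	uintptr_t prev_fp;
};

/* Step from the current frame record to its caller; negative on failure. */
static int unwind_next_sketch(struct unwind_state_sketch *state)
{
	uintptr_t fp = state->fp;
	const struct frame_record *rec = (const struct frame_record *)fp;

	if (!fp)			/* end of the chain */
		return -1;
	if (fp <= state->prev_fp)	/* records must move up the stack */
		return -1;

	state->fp = rec->fp;		/* mirrors the load from fp */
	state->pc = rec->lr;		/* mirrors the load from fp + 8 */
	state->prev_fp = fp;
	return 0;
}

/* Record each pc, then step, until the buffer is full or a step fails. */
static size_t unwind_sketch(uintptr_t fp, uintptr_t pc, uintptr_t *pcs, size_t max)
{
	struct unwind_state_sketch state = { .fp = fp, .pc = pc, .prev_fp = 0 };
	size_t n = 0;

	while (n < max) {
		pcs[n++] = state.pc;
		if (unwind_next_sketch(&state) < 0)
			break;
	}
	return n;
}

/* Build a three-record chain, optionally corrupt it into a cycle, and unwind it. */
size_t demo_unwind_depth(int make_cycle)
{
	struct frame_record recs[3];
	uintptr_t pcs[8];

	recs[0].fp = (uintptr_t)&recs[1];
	recs[0].lr = 0x2000;
	recs[1].fp = make_cycle ? (uintptr_t)&recs[0] : (uintptr_t)&recs[2];
	recs[1].lr = 0x3000;
	recs[2].fp = 0;
	recs[2].lr = 0x4000;

	return unwind_sketch((uintptr_t)&recs[0], 0x1000, pcs, 8);
}
```

In this model, demo_unwind_depth(0) visits the starting pc and the three saved return addresses. demo_unwind_depth(1) stops one entry earlier, because the corrupted record points back down the stack and is rejected by the same comparison that the diff shows as `fp <= state->prev_fp`.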
+1 -1
arch/arm64/kernel/suspend.c
···

 		/* Restore CnP bit in TTBR1_EL1 */
 		if (system_supports_cnp())
-			cpu_replace_ttbr1(lm_alias(swapper_pg_dir));
+			cpu_replace_ttbr1(lm_alias(swapper_pg_dir), idmap_pg_dir);

 		/*
 		 * PSTATE was not saved over suspend/resume, re-enable any detected
+3 -3
arch/arm64/kernel/traps.c
···

 	if (cpus_have_const_cap(ARM64_WORKAROUND_1542419)) {
 		/* Hide DIC so that we can trap the unnecessary maintenance...*/
-		val &= ~BIT(CTR_DIC_SHIFT);
+		val &= ~BIT(CTR_EL0_DIC_SHIFT);

 		/* ... and fake IminLine to reduce the number of traps. */
-		val &= ~CTR_IMINLINE_MASK;
-		val |= (PAGE_SHIFT - 2) & CTR_IMINLINE_MASK;
+		val &= ~CTR_EL0_IminLine_MASK;
+		val |= (PAGE_SHIFT - 2) & CTR_EL0_IminLine_MASK;
 	}

 	pt_regs_write_reg(regs, rt, val);
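The register fix-up in this hunk is plain bit manipulation and can be tried in user space. The sketch below is illustrative only: fake_ctr_el0() and the two _SKETCH macros are hypothetical names, and the field positions (IminLine in bits [3:0], DIC in bit 29) are taken from the Arm architecture's CTR_EL0 layout rather than from the kernel's generated headers.

```c
#include <stdint.h>

/* Assumed CTR_EL0 layout: IminLine is bits [3:0], DIC is bit 29. */
#define CTR_IMINLINE_MASK_SKETCH	UINT64_C(0xf)
#define CTR_DIC_SHIFT_SKETCH		29

/*
 * Clear DIC and report an instruction cache line as large as a page.
 * IminLine is log2 of the line size in 4-byte words, hence page_shift - 2.
 */
uint64_t fake_ctr_el0(uint64_t val, unsigned int page_shift)
{
	val &= ~(UINT64_C(1) << CTR_DIC_SHIFT_SKETCH);
	val &= ~CTR_IMINLINE_MASK_SKETCH;
	val |= (uint64_t)(page_shift - 2) & CTR_IMINLINE_MASK_SKETCH;
	return val;
}
```

With 4 KiB pages (page_shift of 12) the reported IminLine becomes 10, i.e. 4 << 10 = 4096 bytes per line, so a user-space loop that steps by the advertised line size issues one trapped maintenance instruction per page.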
+7 -1
arch/arm64/kernel/vdso/Makefile
···
 # routines, as x86 does (see 6f121e548f83 ("x86, vdso: Reimplement vdso.so
 # preparation in build-time C")).
 ldflags-y := -shared -soname=linux-vdso.so.1 --hash-style=sysv	\
-	     -Bsymbolic --build-id=sha1 -n $(btildflags-y) -T
+	     -Bsymbolic --build-id=sha1 -n $(btildflags-y)
+
+ifdef CONFIG_LD_ORPHAN_WARN
+  ldflags-y += --orphan-handling=warn
+endif
+
+ldflags-y += -T

 ccflags-y := -fno-common -fno-builtin -fno-stack-protector -ffixed-x18
 ccflags-y += -DDISABLE_BRANCH_PROFILING -DBUILD_VDSO
+15 -1
arch/arm64/kernel/vdso/vdso.lds.S
··· 11 11 #include <linux/const.h> 12 12 #include <asm/page.h> 13 13 #include <asm/vdso.h> 14 + #include <asm-generic/vmlinux.lds.h> 14 15 15 16 OUTPUT_FORMAT("elf64-littleaarch64", "elf64-bigaarch64", "elf64-littleaarch64") 16 17 OUTPUT_ARCH(aarch64) ··· 50 49 51 50 .dynamic : { *(.dynamic) } :text :dynamic 52 51 53 - .rodata : { *(.rodata*) } :text 52 + .rela.dyn : ALIGN(8) { *(.rela .rela*) } 53 + 54 + .rodata : { 55 + *(.rodata*) 56 + *(.got) 57 + *(.got.plt) 58 + *(.plt) 59 + *(.plt.*) 60 + *(.iplt) 61 + *(.igot .igot.plt) 62 + } :text 54 63 55 64 _end = .; 56 65 PROVIDE(end = .); 66 + 67 + DWARF_DEBUG 68 + ELF_DETAILS 57 69 58 70 /DISCARD/ : { 59 71 *(.data .data.* .gnu.linkonce.d.* .sdata*)
+1
arch/arm64/kernel/vdso32/Makefile
···
 VDSO_LDFLAGS += -Bsymbolic --no-undefined -soname=linux-vdso.so.1
 VDSO_LDFLAGS += -z max-page-size=4096 -z common-page-size=4096
 VDSO_LDFLAGS += -shared --hash-style=sysv --build-id=sha1
+VDSO_LDFLAGS += --orphan-handling=warn


 # Borrow vdsomunge.c from the arm vDSO
+23 -4
arch/arm64/kernel/vdso32/vdso.lds.S
··· 11 11 #include <linux/const.h> 12 12 #include <asm/page.h> 13 13 #include <asm/vdso.h> 14 + #include <asm-generic/vmlinux.lds.h> 14 15 15 16 OUTPUT_FORMAT("elf32-littlearm", "elf32-bigarm", "elf32-littlearm") 16 17 OUTPUT_ARCH(arm) ··· 36 35 37 36 .dynamic : { *(.dynamic) } :text :dynamic 38 37 39 - .rodata : { *(.rodata*) } :text 38 + .rodata : { 39 + *(.rodata*) 40 + *(.got) 41 + *(.got.plt) 42 + *(.plt) 43 + *(.rel.iplt) 44 + *(.iplt) 45 + *(.igot.plt) 46 + } :text 40 47 41 - .text : { *(.text*) } :text =0xe7f001f2 48 + .text : { 49 + *(.text*) 50 + *(.glue_7) 51 + *(.glue_7t) 52 + *(.vfp11_veneer) 53 + *(.v4_bx) 54 + } :text =0xe7f001f2 42 55 43 - .got : { *(.got) } 44 - .rel.plt : { *(.rel.plt) } 56 + .rel.dyn : { *(.rel*) } 57 + 58 + .ARM.exidx : { *(.ARM.exidx*) } 59 + DWARF_DEBUG 60 + ELF_DETAILS 61 + .ARM.attributes 0 : { *(.ARM.attributes) } 45 62 46 63 /DISCARD/ : { 47 64 *(.note.GNU-stack)
+11 -11
arch/arm64/kernel/vmlinux.lds.S
··· 115 115 __entry_tramp_text_start = .; \ 116 116 *(.entry.tramp.text) \ 117 117 . = ALIGN(PAGE_SIZE); \ 118 - __entry_tramp_text_end = .; 118 + __entry_tramp_text_end = .; \ 119 + *(.entry.tramp.rodata) 119 120 #else 120 121 #define TRAMP_TEXT 121 122 #endif ··· 199 198 } 200 199 201 200 idmap_pg_dir = .; 202 - . += IDMAP_DIR_SIZE; 203 - idmap_pg_end = .; 201 + . += PAGE_SIZE; 204 202 205 203 #ifdef CONFIG_UNMAP_KERNEL_AT_EL0 206 204 tramp_pg_dir = .; ··· 235 235 __inittext_end = .; 236 236 __initdata_begin = .; 237 237 238 + init_idmap_pg_dir = .; 239 + . += INIT_IDMAP_DIR_SIZE; 240 + init_idmap_pg_end = .; 241 + 238 242 .init.data : { 239 243 INIT_DATA 240 244 INIT_SETUP(16) ··· 257 253 HYPERVISOR_RELOC_SECTION 258 254 259 255 .rela.dyn : ALIGN(8) { 256 + __rela_start = .; 260 257 *(.rela .rela*) 258 + __rela_end = .; 261 259 } 262 260 263 - __rela_offset = ABSOLUTE(ADDR(.rela.dyn) - KIMAGE_VADDR); 264 - __rela_size = SIZEOF(.rela.dyn); 265 - 266 - #ifdef CONFIG_RELR 267 261 .relr.dyn : ALIGN(8) { 262 + __relr_start = .; 268 263 *(.relr.dyn) 264 + __relr_end = .; 269 265 } 270 - 271 - __relr_offset = ABSOLUTE(ADDR(.relr.dyn) - KIMAGE_VADDR); 272 - __relr_size = SIZEOF(.relr.dyn); 273 - #endif 274 266 275 267 . = ALIGN(SEGMENT_ALIGN); 276 268 __initdata_end = .;
+16 -16
arch/arm64/kvm/hyp/include/nvhe/fixed_config.h
··· 176 176 ) 177 177 178 178 #define PVM_ID_AA64ISAR1_ALLOW (\ 179 - ARM64_FEATURE_MASK(ID_AA64ISAR1_DPB) | \ 180 - ARM64_FEATURE_MASK(ID_AA64ISAR1_APA) | \ 181 - ARM64_FEATURE_MASK(ID_AA64ISAR1_API) | \ 182 - ARM64_FEATURE_MASK(ID_AA64ISAR1_JSCVT) | \ 183 - ARM64_FEATURE_MASK(ID_AA64ISAR1_FCMA) | \ 184 - ARM64_FEATURE_MASK(ID_AA64ISAR1_LRCPC) | \ 185 - ARM64_FEATURE_MASK(ID_AA64ISAR1_GPA) | \ 186 - ARM64_FEATURE_MASK(ID_AA64ISAR1_GPI) | \ 187 - ARM64_FEATURE_MASK(ID_AA64ISAR1_FRINTTS) | \ 188 - ARM64_FEATURE_MASK(ID_AA64ISAR1_SB) | \ 189 - ARM64_FEATURE_MASK(ID_AA64ISAR1_SPECRES) | \ 190 - ARM64_FEATURE_MASK(ID_AA64ISAR1_BF16) | \ 191 - ARM64_FEATURE_MASK(ID_AA64ISAR1_DGH) | \ 192 - ARM64_FEATURE_MASK(ID_AA64ISAR1_I8MM) \ 179 + ARM64_FEATURE_MASK(ID_AA64ISAR1_EL1_DPB) | \ 180 + ARM64_FEATURE_MASK(ID_AA64ISAR1_EL1_APA) | \ 181 + ARM64_FEATURE_MASK(ID_AA64ISAR1_EL1_API) | \ 182 + ARM64_FEATURE_MASK(ID_AA64ISAR1_EL1_JSCVT) | \ 183 + ARM64_FEATURE_MASK(ID_AA64ISAR1_EL1_FCMA) | \ 184 + ARM64_FEATURE_MASK(ID_AA64ISAR1_EL1_LRCPC) | \ 185 + ARM64_FEATURE_MASK(ID_AA64ISAR1_EL1_GPA) | \ 186 + ARM64_FEATURE_MASK(ID_AA64ISAR1_EL1_GPI) | \ 187 + ARM64_FEATURE_MASK(ID_AA64ISAR1_EL1_FRINTTS) | \ 188 + ARM64_FEATURE_MASK(ID_AA64ISAR1_EL1_SB) | \ 189 + ARM64_FEATURE_MASK(ID_AA64ISAR1_EL1_SPECRES) | \ 190 + ARM64_FEATURE_MASK(ID_AA64ISAR1_EL1_BF16) | \ 191 + ARM64_FEATURE_MASK(ID_AA64ISAR1_EL1_DGH) | \ 192 + ARM64_FEATURE_MASK(ID_AA64ISAR1_EL1_I8MM) \ 193 193 ) 194 194 195 195 #define PVM_ID_AA64ISAR2_ALLOW (\ 196 - ARM64_FEATURE_MASK(ID_AA64ISAR2_GPA3) | \ 197 - ARM64_FEATURE_MASK(ID_AA64ISAR2_APA3) \ 196 + ARM64_FEATURE_MASK(ID_AA64ISAR2_EL1_GPA3) | \ 197 + ARM64_FEATURE_MASK(ID_AA64ISAR2_EL1_APA3) \ 198 198 ) 199 199 200 200 u64 pvm_read_id_reg(const struct kvm_vcpu *vcpu, u32 id);
+6 -6
arch/arm64/kvm/hyp/nvhe/sys_regs.c
···
 	u64 allow_mask = PVM_ID_AA64ISAR1_ALLOW;

 	if (!vcpu_has_ptrauth(vcpu))
-		allow_mask &= ~(ARM64_FEATURE_MASK(ID_AA64ISAR1_APA) |
-				ARM64_FEATURE_MASK(ID_AA64ISAR1_API) |
-				ARM64_FEATURE_MASK(ID_AA64ISAR1_GPA) |
-				ARM64_FEATURE_MASK(ID_AA64ISAR1_GPI));
+		allow_mask &= ~(ARM64_FEATURE_MASK(ID_AA64ISAR1_EL1_APA) |
+				ARM64_FEATURE_MASK(ID_AA64ISAR1_EL1_API) |
+				ARM64_FEATURE_MASK(ID_AA64ISAR1_EL1_GPA) |
+				ARM64_FEATURE_MASK(ID_AA64ISAR1_EL1_GPI));

 	return id_aa64isar1_el1_sys_val & allow_mask;
 }
···
 	u64 allow_mask = PVM_ID_AA64ISAR2_ALLOW;

 	if (!vcpu_has_ptrauth(vcpu))
-		allow_mask &= ~(ARM64_FEATURE_MASK(ID_AA64ISAR2_APA3) |
-				ARM64_FEATURE_MASK(ID_AA64ISAR2_GPA3));
+		allow_mask &= ~(ARM64_FEATURE_MASK(ID_AA64ISAR2_EL1_APA3) |
+				ARM64_FEATURE_MASK(ID_AA64ISAR2_EL1_GPA3));

 	return id_aa64isar2_el1_sys_val & allow_mask;
 }
+7 -7
arch/arm64/kvm/sys_regs.c
···
 		break;
 	case SYS_ID_AA64ISAR1_EL1:
 		if (!vcpu_has_ptrauth(vcpu))
-			val &= ~(ARM64_FEATURE_MASK(ID_AA64ISAR1_APA) |
-				 ARM64_FEATURE_MASK(ID_AA64ISAR1_API) |
-				 ARM64_FEATURE_MASK(ID_AA64ISAR1_GPA) |
-				 ARM64_FEATURE_MASK(ID_AA64ISAR1_GPI));
+			val &= ~(ARM64_FEATURE_MASK(ID_AA64ISAR1_EL1_APA) |
+				 ARM64_FEATURE_MASK(ID_AA64ISAR1_EL1_API) |
+				 ARM64_FEATURE_MASK(ID_AA64ISAR1_EL1_GPA) |
+				 ARM64_FEATURE_MASK(ID_AA64ISAR1_EL1_GPI));
 		break;
 	case SYS_ID_AA64ISAR2_EL1:
 		if (!vcpu_has_ptrauth(vcpu))
-			val &= ~(ARM64_FEATURE_MASK(ID_AA64ISAR2_APA3) |
-				 ARM64_FEATURE_MASK(ID_AA64ISAR2_GPA3));
+			val &= ~(ARM64_FEATURE_MASK(ID_AA64ISAR2_EL1_APA3) |
+				 ARM64_FEATURE_MASK(ID_AA64ISAR2_EL1_GPA3));
 		if (!cpus_have_final_cap(ARM64_HAS_WFXT))
-			val &= ~ARM64_FEATURE_MASK(ID_AA64ISAR2_WFXT);
+			val &= ~ARM64_FEATURE_MASK(ID_AA64ISAR2_EL1_WFxT);
 		break;
 	case SYS_ID_AA64DFR0_EL1:
 		/* Limit debug to ARMv8.0 */
+1 -1
arch/arm64/lib/mte.S
···
  */
 	.macro	multitag_transfer_size, reg, tmp
 	mrs_s	\reg, SYS_GMID_EL1
-	ubfx	\reg, \reg, #SYS_GMID_EL1_BS_SHIFT, #SYS_GMID_EL1_BS_SIZE
+	ubfx	\reg, \reg, #GMID_EL1_BS_SHIFT, #GMID_EL1_BS_SIZE
 	mov	\tmp, #4
 	lsl	\reg, \tmp, \reg
 	.endm
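The multitag_transfer_size macro extracts the BS field and computes 4 << BS. A C rendering of the same arithmetic is sketched below for illustration. multitag_transfer_bytes() and GMID_BS_MASK_SKETCH are hypothetical names, and the sketch assumes that BS occupies bits [3:0] of GMID_EL1 and encodes log2 of the block size in 4-byte words.

```c
#include <stdint.h>

/* Assumed layout: GMID_EL1.BS is bits [3:0], log2 of the block size in words. */
#define GMID_BS_MASK_SKETCH	UINT64_C(0xf)

/* Equivalent of: ubfx reg, reg, #shift, #size ; mov tmp, #4 ; lsl reg, tmp, reg */
unsigned int multitag_transfer_bytes(uint64_t gmid_el1)
{
	unsigned int bs = (unsigned int)(gmid_el1 & GMID_BS_MASK_SKETCH);

	return 4u << bs;
}
```

Under that assumption a BS value of 2 corresponds to a 16-byte block and a value of 6 to a 256-byte block.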
-41
arch/arm64/mm/cache.S
···
 	ret
 SYM_FUNC_END(__pi_dcache_clean_pop)
 SYM_FUNC_ALIAS(dcache_clean_pop, __pi_dcache_clean_pop)
-
-/*
- *	__dma_flush_area(start, size)
- *
- *	clean & invalidate D / U line
- *
- *	- start   - virtual start address of region
- *	- size    - size in question
- */
-SYM_FUNC_START(__pi___dma_flush_area)
-	add	x1, x0, x1
-	dcache_by_line_op civac, sy, x0, x1, x2, x3
-	ret
-SYM_FUNC_END(__pi___dma_flush_area)
-SYM_FUNC_ALIAS(__dma_flush_area, __pi___dma_flush_area)
-
-/*
- *	__dma_map_area(start, size, dir)
- *	- start	- kernel virtual start address
- *	- size	- size of region
- *	- dir	- DMA direction
- */
-SYM_FUNC_START(__pi___dma_map_area)
-	add	x1, x0, x1
-	b	__pi_dcache_clean_poc
-SYM_FUNC_END(__pi___dma_map_area)
-SYM_FUNC_ALIAS(__dma_map_area, __pi___dma_map_area)
-
-/*
- *	__dma_unmap_area(start, size, dir)
- *	- start	- kernel virtual start address
- *	- size	- size of region
- *	- dir	- DMA direction
- */
-SYM_FUNC_START(__pi___dma_unmap_area)
-	add	x1, x0, x1
-	cmp	w2, #DMA_TO_DEVICE
-	b.ne	__pi_dcache_inval_poc
-	ret
-SYM_FUNC_END(__pi___dma_unmap_area)
-SYM_FUNC_ALIAS(__dma_unmap_area, __pi___dma_unmap_area)
-9
arch/arm64/mm/copypage.c
···

 	if (system_supports_mte() && test_bit(PG_mte_tagged, &from->flags)) {
 		set_bit(PG_mte_tagged, &to->flags);
-		page_kasan_tag_reset(to);
-		/*
-		 * We need smp_wmb() in between setting the flags and clearing the
-		 * tags because if another thread reads page->flags and builds a
-		 * tagged address out of it, there is an actual dependency to the
-		 * memory access, but on the current thread we do not guarantee that
-		 * the new page->flags are visible before the tags were updated.
-		 */
-		smp_wmb();
 		mte_copy_page_tags(kto, kfrom);
 	}
 }
+14 -5
arch/arm64/mm/dma-mapping.c
···
 #include <asm/xen/xen-ops.h>

 void arch_sync_dma_for_device(phys_addr_t paddr, size_t size,
-		enum dma_data_direction dir)
+			      enum dma_data_direction dir)
 {
-	__dma_map_area(phys_to_virt(paddr), size, dir);
+	unsigned long start = (unsigned long)phys_to_virt(paddr);
+
+	dcache_clean_poc(start, start + size);
 }

 void arch_sync_dma_for_cpu(phys_addr_t paddr, size_t size,
-		enum dma_data_direction dir)
+			   enum dma_data_direction dir)
 {
-	__dma_unmap_area(phys_to_virt(paddr), size, dir);
+	unsigned long start = (unsigned long)phys_to_virt(paddr);
+
+	if (dir == DMA_TO_DEVICE)
+		return;
+
+	dcache_inval_poc(start, start + size);
 }

 void arch_dma_prep_coherent(struct page *page, size_t size)
 {
-	__dma_flush_area(page_address(page), size);
+	unsigned long start = (unsigned long)page_address(page);
+
+	dcache_clean_inval_poc(start, start + size);
 }

 #ifdef CONFIG_IOMMU_DMA
+1 -9
arch/arm64/mm/extable.c
···
 	return ((unsigned long)&ex->fixup + ex->fixup);
 }

-static bool ex_handler_fixup(const struct exception_table_entry *ex,
-			     struct pt_regs *regs)
-{
-	regs->pc = get_ex_fixup(ex);
-	return true;
-}
-
 static bool ex_handler_uaccess_err_zero(const struct exception_table_entry *ex,
 					struct pt_regs *regs)
 {
···
 		return false;

 	switch (ex->type) {
-	case EX_TYPE_FIXUP:
-		return ex_handler_fixup(ex, regs);
 	case EX_TYPE_BPF:
 		return ex_handler_bpf(ex, regs);
 	case EX_TYPE_UACCESS_ERR_ZERO:
+	case EX_TYPE_KACCESS_ERR_ZERO:
 		return ex_handler_uaccess_err_zero(ex, regs);
 	case EX_TYPE_LOAD_UNALIGNED_ZEROPAD:
 		return ex_handler_load_unaligned_zeropad(ex, regs);
-1
arch/arm64/mm/fault.c
···
 void tag_clear_highpage(struct page *page)
 {
 	mte_zero_clear_page_tags(page_address(page));
-	page_kasan_tag_reset(page);
 	set_bit(PG_mte_tagged, &page->flags);
 }
-10
arch/arm64/mm/hugetlbpage.c
···
 #endif
 }

-/*
- * Select all bits except the pfn
- */
-static inline pgprot_t pte_pgprot(pte_t pte)
-{
-	unsigned long pfn = pte_pfn(pte);
-
-	return __pgprot(pte_val(pfn_pte(pfn, __pgprot(0))) ^ pte_val(pte));
-}
-
 static int find_num_contig(struct mm_struct *mm, unsigned long addr,
 			   pte_t *ptep, size_t *pgsize)
 {
+2 -2
arch/arm64/mm/init.c
···

 	early_init_fdt_scan_reserved_mem();

-	if (!IS_ENABLED(CONFIG_ZONE_DMA) && !IS_ENABLED(CONFIG_ZONE_DMA32))
+	if (!defer_reserve_crashkernel())
 		reserve_crashkernel();

 	high_memory = __va(memblock_end_of_DRAM() - 1) + 1;
···
 	 * request_standard_resources() depends on crashkernel's memory being
 	 * reserved, so do it here.
 	 */
-	if (IS_ENABLED(CONFIG_ZONE_DMA) || IS_ENABLED(CONFIG_ZONE_DMA32))
+	if (defer_reserve_crashkernel())
 		reserve_crashkernel();

 	memblock_dump_all();
+8 -82
arch/arm64/mm/ioremap.c
···
 // SPDX-License-Identifier: GPL-2.0-only
-/*
- * Based on arch/arm/mm/ioremap.c
- *
- * (C) Copyright 1995 1996 Linus Torvalds
- * Hacked for ARM by Phil Blundell <philb@gnu.org>
- * Hacked to allow all architectures to build, and various cleanups
- * by Russell King
- * Copyright (C) 2012 ARM Ltd.
- */

-#include <linux/export.h>
 #include <linux/mm.h>
-#include <linux/vmalloc.h>
 #include <linux/io.h>

-#include <asm/fixmap.h>
-#include <asm/tlbflush.h>
-
-static void __iomem *__ioremap_caller(phys_addr_t phys_addr, size_t size,
-				      pgprot_t prot, void *caller)
+bool ioremap_allowed(phys_addr_t phys_addr, size_t size, unsigned long prot)
 {
-	unsigned long last_addr;
-	unsigned long offset = phys_addr & ~PAGE_MASK;
-	int err;
-	unsigned long addr;
-	struct vm_struct *area;
+	unsigned long last_addr = phys_addr + size - 1;

-	/*
-	 * Page align the mapping address and size, taking account of any
-	 * offset.
-	 */
-	phys_addr &= PAGE_MASK;
-	size = PAGE_ALIGN(size + offset);
+	/* Don't allow outside PHYS_MASK */
+	if (last_addr & ~PHYS_MASK)
+		return false;

-	/*
-	 * Don't allow wraparound, zero size or outside PHYS_MASK.
-	 */
-	last_addr = phys_addr + size - 1;
-	if (!size || last_addr < phys_addr || (last_addr & ~PHYS_MASK))
-		return NULL;
-
-	/*
-	 * Don't allow RAM to be mapped.
-	 */
+	/* Don't allow RAM to be mapped. */
 	if (WARN_ON(pfn_is_map_memory(__phys_to_pfn(phys_addr))))
-		return NULL;
+		return false;

-	area = get_vm_area_caller(size, VM_IOREMAP, caller);
-	if (!area)
-		return NULL;
-	addr = (unsigned long)area->addr;
-	area->phys_addr = phys_addr;
-
-	err = ioremap_page_range(addr, addr + size, phys_addr, prot);
-	if (err) {
-		vunmap((void *)addr);
-		return NULL;
-	}
-
-	return (void __iomem *)(offset + addr);
+	return true;
 }
-
-void __iomem *__ioremap(phys_addr_t phys_addr, size_t size, pgprot_t prot)
-{
-	return __ioremap_caller(phys_addr, size, prot,
-				__builtin_return_address(0));
-}
-EXPORT_SYMBOL(__ioremap);
-
-void iounmap(volatile void __iomem *io_addr)
-{
-	unsigned long addr = (unsigned long)io_addr & PAGE_MASK;
-
-	/*
-	 * We could get an address outside vmalloc range in case
-	 * of ioremap_cache() reusing a RAM mapping.
-	 */
-	if (is_vmalloc_addr((void *)addr))
-		vunmap((void *)addr);
-}
-EXPORT_SYMBOL(iounmap);
-
-void __iomem *ioremap_cache(phys_addr_t phys_addr, size_t size)
-{
-	/* For normal memory we already have a cacheable mapping. */
-	if (pfn_is_map_memory(__phys_to_pfn(phys_addr)))
-		return (void __iomem *)__phys_to_virt(phys_addr);
-
-	return __ioremap_caller(phys_addr, size, __pgprot(PROT_NORMAL),
-				__builtin_return_address(0));
-}
-EXPORT_SYMBOL(ioremap_cache);

 /*
  * Must be called after early_fixmap_init
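The range test that remains in ioremap_allowed() can be exercised on its own. The following is a user-space sketch under stated assumptions, not the arm64 implementation: phys_range_allowed() and its phys_bits parameter are hypothetical, the "is this RAM" check is left out because it depends on memblock, and the zero-size and wrap-around checks from the removed code are kept here only so the sketch is self-contained (the assumption being that the generic ioremap code rejects those cases before the hook is consulted).

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true if [phys_addr, phys_addr + size) fits below 2^phys_bits.
 * phys_bits must be less than 64.
 */
bool phys_range_allowed(uint64_t phys_addr, uint64_t size, unsigned int phys_bits)
{
	uint64_t phys_mask = (UINT64_C(1) << phys_bits) - 1;
	uint64_t last_addr = phys_addr + size - 1;

	/* Assumed to be handled by the generic caller in the real code. */
	if (!size || last_addr < phys_addr)
		return false;

	/* Same test as the diff: no address bit above the physical mask. */
	if (last_addr & ~phys_mask)
		return false;

	return true;
}
```

For a 48-bit physical address space, a 4 KiB mapping that ends exactly at 2^48 - 1 is accepted, while one that extends a single page further is rejected.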
+2 -2
arch/arm64/mm/kasan_init.c
···
 	 */
 	memcpy(tmp_pg_dir, swapper_pg_dir, sizeof(tmp_pg_dir));
 	dsb(ishst);
-	cpu_replace_ttbr1(lm_alias(tmp_pg_dir));
+	cpu_replace_ttbr1(lm_alias(tmp_pg_dir), idmap_pg_dir);

 	clear_pgds(KASAN_SHADOW_START, KASAN_SHADOW_END);

···
 				PAGE_KERNEL_RO));

 	memset(kasan_early_shadow_page, KASAN_SHADOW_INIT, PAGE_SIZE);
-	cpu_replace_ttbr1(lm_alias(swapper_pg_dir));
+	cpu_replace_ttbr1(lm_alias(swapper_pg_dir), idmap_pg_dir);
 }

 static void __init kasan_init_depth(void)
+63 -15
arch/arm64/mm/mmu.c
··· 43 43 #define NO_CONT_MAPPINGS BIT(1) 44 44 #define NO_EXEC_MAPPINGS BIT(2) /* assumes FEAT_HPDS is not used */ 45 45 46 - u64 idmap_t0sz = TCR_T0SZ(VA_BITS_MIN); 47 - u64 idmap_ptrs_per_pgd = PTRS_PER_PGD; 46 + int idmap_t0sz __ro_after_init; 48 47 49 - u64 __section(".mmuoff.data.write") vabits_actual; 48 + #if VA_BITS > 48 49 + u64 vabits_actual __ro_after_init = VA_BITS_MIN; 50 50 EXPORT_SYMBOL(vabits_actual); 51 + #endif 52 + 53 + u64 kimage_vaddr __ro_after_init = (u64)&_text; 54 + EXPORT_SYMBOL(kimage_vaddr); 51 55 52 56 u64 kimage_voffset __ro_after_init; 53 57 EXPORT_SYMBOL(kimage_voffset); 58 + 59 + u32 __boot_cpu_mode[] = { BOOT_CPU_MODE_EL2, BOOT_CPU_MODE_EL1 }; 60 + 61 + /* 62 + * The booting CPU updates the failed status @__early_cpu_boot_status, 63 + * with MMU turned off. 64 + */ 65 + long __section(".mmuoff.data.write") __early_cpu_boot_status; 54 66 55 67 /* 56 68 * Empty_zero_page is a special page that is used for zero-initialized data ··· 400 388 } while (pgdp++, addr = next, addr != end); 401 389 } 402 390 391 + #ifdef CONFIG_UNMAP_KERNEL_AT_EL0 392 + extern __alias(__create_pgd_mapping) 393 + void create_kpti_ng_temp_pgd(pgd_t *pgdir, phys_addr_t phys, unsigned long virt, 394 + phys_addr_t size, pgprot_t prot, 395 + phys_addr_t (*pgtable_alloc)(int), int flags); 396 + #endif 397 + 403 398 static phys_addr_t __pgd_pgtable_alloc(int shift) 404 399 { 405 400 void *ptr = (void *)__get_free_page(GFP_PGTABLE_KERNEL); ··· 548 529 549 530 #ifdef CONFIG_KEXEC_CORE 550 531 if (crash_mem_map) { 551 - if (IS_ENABLED(CONFIG_ZONE_DMA) || 552 - IS_ENABLED(CONFIG_ZONE_DMA32)) 532 + if (defer_reserve_crashkernel()) 553 533 flags |= NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS; 554 534 else if (crashk_res.end) 555 535 memblock_mark_nomap(crashk_res.start, ··· 589 571 * through /sys/kernel/kexec_crash_size interface. 
590 572 */ 591 573 #ifdef CONFIG_KEXEC_CORE 592 - if (crash_mem_map && 593 - !IS_ENABLED(CONFIG_ZONE_DMA) && !IS_ENABLED(CONFIG_ZONE_DMA32)) { 574 + if (crash_mem_map && !defer_reserve_crashkernel()) { 594 575 if (crashk_res.end) { 595 576 __map_memblock(pgdp, crashk_res.start, 596 577 crashk_res.end + 1, ··· 682 665 __set_fixmap(FIX_ENTRY_TRAMP_TEXT1 - i, 683 666 pa_start + i * PAGE_SIZE, prot); 684 667 685 - if (IS_ENABLED(CONFIG_RANDOMIZE_BASE)) { 686 - extern char __entry_tramp_data_start[]; 687 - 688 - __set_fixmap(FIX_ENTRY_TRAMP_DATA, 689 - __pa_symbol(__entry_tramp_data_start), 690 - PAGE_KERNEL_RO); 691 - } 668 + if (IS_ENABLED(CONFIG_RELOCATABLE)) 669 + __set_fixmap(FIX_ENTRY_TRAMP_TEXT1 - i, 670 + pa_start + i * PAGE_SIZE, PAGE_KERNEL_RO); 692 671 693 672 return 0; 694 673 } ··· 775 762 kasan_copy_shadow(pgdp); 776 763 } 777 764 765 + static void __init create_idmap(void) 766 + { 767 + u64 start = __pa_symbol(__idmap_text_start); 768 + u64 size = __pa_symbol(__idmap_text_end) - start; 769 + pgd_t *pgd = idmap_pg_dir; 770 + u64 pgd_phys; 771 + 772 + /* check if we need an additional level of translation */ 773 + if (VA_BITS < 48 && idmap_t0sz < (64 - VA_BITS_MIN)) { 774 + pgd_phys = early_pgtable_alloc(PAGE_SHIFT); 775 + set_pgd(&idmap_pg_dir[start >> VA_BITS], 776 + __pgd(pgd_phys | P4D_TYPE_TABLE)); 777 + pgd = __va(pgd_phys); 778 + } 779 + __create_pgd_mapping(pgd, start, start, size, PAGE_KERNEL_ROX, 780 + early_pgtable_alloc, 0); 781 + 782 + if (IS_ENABLED(CONFIG_UNMAP_KERNEL_AT_EL0)) { 783 + extern u32 __idmap_kpti_flag; 784 + u64 pa = __pa_symbol(&__idmap_kpti_flag); 785 + 786 + /* 787 + * The KPTI G-to-nG conversion code needs a read-write mapping 788 + * of its synchronization flag in the ID map. 
789 + */ 790 + __create_pgd_mapping(pgd, pa, pa, sizeof(u32), PAGE_KERNEL, 791 + early_pgtable_alloc, 0); 792 + } 793 + } 794 + 778 795 void __init paging_init(void) 779 796 { 780 797 pgd_t *pgdp = pgd_set_fixmap(__pa_symbol(swapper_pg_dir)); 798 + extern pgd_t init_idmap_pg_dir[]; 799 + 800 + idmap_t0sz = 63UL - __fls(__pa_symbol(_end) | GENMASK(VA_BITS_MIN - 1, 0)); 781 801 782 802 map_kernel(pgdp); 783 803 map_mem(pgdp); 784 804 785 805 pgd_clear_fixmap(); 786 806 787 - cpu_replace_ttbr1(lm_alias(swapper_pg_dir)); 807 + cpu_replace_ttbr1(lm_alias(swapper_pg_dir), init_idmap_pg_dir); 788 808 init_mm.pgd = swapper_pg_dir; 789 809 790 810 memblock_phys_free(__pa_symbol(init_pg_dir), 791 811 __pa_symbol(init_pg_end) - __pa_symbol(init_pg_dir)); 792 812 793 813 memblock_allow_resize(); 814 + 815 + create_idmap(); 794 816 } 795 817 796 818 /*
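The idmap_t0sz expression added to paging_init() derives a T0SZ value from the highest physical address of the kernel image. A user-space sketch of that calculation follows. The names idmap_t0sz_sketch(), fls_index() and the va_bits_min parameter are hypothetical stand-ins for the in-tree expression, __fls() and VA_BITS_MIN, and the sketch assumes va_bits_min is less than 64.

```c
#include <stdint.h>

/* Index of the most significant set bit; x must be non-zero, as for __fls(). */
static unsigned int fls_index(uint64_t x)
{
	unsigned int idx = 0;

	while (x >>= 1)
		idx++;
	return idx;
}

/*
 * T0SZ is 64 minus the number of address bits the ID map must cover.
 * ORing in the low va_bits_min bits keeps the covered range from ever
 * being smaller than va_bits_min bits.
 */
unsigned int idmap_t0sz_sketch(uint64_t pa_end, unsigned int va_bits_min)
{
	uint64_t floor_mask = (UINT64_C(1) << va_bits_min) - 1;

	return 63 - fls_index(pa_end | floor_mask);
}
```

With va_bits_min of 48 and an image ending at 0x80000000 the result is 16, the ordinary 48-bit configuration. An image ending at or above 2^48 yields a smaller value, which is the situation that the `idmap_t0sz < (64 - VA_BITS_MIN)` test in create_idmap() looks for.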
-9
arch/arm64/mm/mteswap.c
···
 	if (!tags)
 		return false;

-	page_kasan_tag_reset(page);
-	/*
-	 * We need smp_wmb() in between setting the flags and clearing the
-	 * tags because if another thread reads page->flags and builds a
-	 * tagged address out of it, there is an actual dependency to the
-	 * memory access, but on the current thread we do not guarantee that
-	 * the new page->flags are visible before the tags were updated.
-	 */
-	smp_wmb();
 	mte_restore_page_tags(page_address(page), tags);

 	return true;
+95 -91
arch/arm64/mm/proc.S
··· 14 14 #include <asm/asm-offsets.h> 15 15 #include <asm/asm_pointer_auth.h> 16 16 #include <asm/hwcap.h> 17 + #include <asm/kernel-pgtable.h> 17 18 #include <asm/pgtable-hwdef.h> 18 19 #include <asm/cpufeature.h> 19 20 #include <asm/alternative.h> ··· 201 200 .popsection 202 201 203 202 #ifdef CONFIG_UNMAP_KERNEL_AT_EL0 203 + 204 + #define KPTI_NG_PTE_FLAGS (PTE_ATTRINDX(MT_NORMAL) | SWAPPER_PTE_FLAGS) 205 + 204 206 .pushsection ".idmap.text", "awx" 205 207 206 - .macro __idmap_kpti_get_pgtable_ent, type 207 - dc cvac, cur_\()\type\()p // Ensure any existing dirty 208 - dmb sy // lines are written back before 209 - ldr \type, [cur_\()\type\()p] // loading the entry 210 - tbz \type, #0, skip_\()\type // Skip invalid and 211 - tbnz \type, #11, skip_\()\type // non-global entries 208 + .macro kpti_mk_tbl_ng, type, num_entries 209 + add end_\type\()p, cur_\type\()p, #\num_entries * 8 210 + .Ldo_\type: 211 + ldr \type, [cur_\type\()p] // Load the entry 212 + tbz \type, #0, .Lnext_\type // Skip invalid and 213 + tbnz \type, #11, .Lnext_\type // non-global entries 214 + orr \type, \type, #PTE_NG // Same bit for blocks and pages 215 + str \type, [cur_\type\()p] // Update the entry 216 + .ifnc \type, pte 217 + tbnz \type, #1, .Lderef_\type 218 + .endif 219 + .Lnext_\type: 220 + add cur_\type\()p, cur_\type\()p, #8 221 + cmp cur_\type\()p, end_\type\()p 222 + b.ne .Ldo_\type 212 223 .endm 213 224 214 - .macro __idmap_kpti_put_pgtable_ent_ng, type 215 - orr \type, \type, #PTE_NG // Same bit for blocks and pages 216 - str \type, [cur_\()\type\()p] // Update the entry and ensure 217 - dmb sy // that it is visible to all 218 - dc civac, cur_\()\type\()p // CPUs. 225 + /* 226 + * Dereference the current table entry and map it into the temporary 227 + * fixmap slot associated with the current level. 
228 + */ 229 + .macro kpti_map_pgtbl, type, level 230 + str xzr, [temp_pte, #8 * (\level + 1)] // break before make 231 + dsb nshst 232 + add pte, temp_pte, #PAGE_SIZE * (\level + 1) 233 + lsr pte, pte, #12 234 + tlbi vaae1, pte 235 + dsb nsh 236 + isb 237 + 238 + phys_to_pte pte, cur_\type\()p 239 + add cur_\type\()p, temp_pte, #PAGE_SIZE * (\level + 1) 240 + orr pte, pte, pte_flags 241 + str pte, [temp_pte, #8 * (\level + 1)] 242 + dsb nshst 219 243 .endm 220 244 221 245 /* 222 - * void __kpti_install_ng_mappings(int cpu, int num_cpus, phys_addr_t swapper) 246 + * void __kpti_install_ng_mappings(int cpu, int num_secondaries, phys_addr_t temp_pgd, 247 + * unsigned long temp_pte_va) 223 248 * 224 249 * Called exactly once from stop_machine context by each CPU found during boot. 225 250 */ 226 - __idmap_kpti_flag: 227 - .long 1 251 + .pushsection ".data", "aw", %progbits 252 + SYM_DATA(__idmap_kpti_flag, .long 1) 253 + .popsection 254 + 228 255 SYM_FUNC_START(idmap_kpti_install_ng_mappings) 229 256 cpu .req w0 257 + temp_pte .req x0 230 258 num_cpus .req w1 231 - swapper_pa .req x2 259 + pte_flags .req x1 260 + temp_pgd_phys .req x2 232 261 swapper_ttb .req x3 233 262 flag_ptr .req x4 234 263 cur_pgdp .req x5 ··· 266 235 pgd .req x7 267 236 cur_pudp .req x8 268 237 end_pudp .req x9 269 - pud .req x10 270 238 cur_pmdp .req x11 271 239 end_pmdp .req x12 272 - pmd .req x13 273 240 cur_ptep .req x14 274 241 end_ptep .req x15 275 242 pte .req x16 243 + valid .req x17 276 244 245 + mov x5, x3 // preserve temp_pte arg 277 246 mrs swapper_ttb, ttbr1_el1 278 - restore_ttbr1 swapper_ttb 279 - adr flag_ptr, __idmap_kpti_flag 247 + adr_l flag_ptr, __idmap_kpti_flag 280 248 281 249 cbnz cpu, __idmap_kpti_secondary 282 250 ··· 286 256 eor w17, w17, num_cpus 287 257 cbnz w17, 1b 288 258 289 - /* We need to walk swapper, so turn off the MMU. 
*/ 290 - pre_disable_mmu_workaround 291 - mrs x17, sctlr_el1 292 - bic x17, x17, #SCTLR_ELx_M 293 - msr sctlr_el1, x17 259 + /* Switch to the temporary page tables on this CPU only */ 260 + __idmap_cpu_set_reserved_ttbr1 x8, x9 261 + offset_ttbr1 temp_pgd_phys, x8 262 + msr ttbr1_el1, temp_pgd_phys 294 263 isb 264 + 265 + mov temp_pte, x5 266 + mov pte_flags, #KPTI_NG_PTE_FLAGS 295 267 296 268 /* Everybody is enjoying the idmap, so we can rewrite swapper. */ 297 269 /* PGD */ 298 - mov cur_pgdp, swapper_pa 299 - add end_pgdp, cur_pgdp, #(PTRS_PER_PGD * 8) 300 - do_pgd: __idmap_kpti_get_pgtable_ent pgd 301 - tbnz pgd, #1, walk_puds 302 - next_pgd: 303 - __idmap_kpti_put_pgtable_ent_ng pgd 304 - skip_pgd: 305 - add cur_pgdp, cur_pgdp, #8 306 - cmp cur_pgdp, end_pgdp 307 - b.ne do_pgd 270 + adrp cur_pgdp, swapper_pg_dir 271 + kpti_map_pgtbl pgd, 0 272 + kpti_mk_tbl_ng pgd, PTRS_PER_PGD 308 273 309 - /* Publish the updated tables and nuke all the TLBs */ 310 - dsb sy 311 - tlbi vmalle1is 312 - dsb ish 274 + /* Ensure all the updated entries are visible to secondary CPUs */ 275 + dsb ishst 276 + 277 + /* We're done: fire up swapper_pg_dir again */ 278 + __idmap_cpu_set_reserved_ttbr1 x8, x9 279 + msr ttbr1_el1, swapper_ttb 313 280 isb 314 - 315 - /* We're done: fire up the MMU again */ 316 - mrs x17, sctlr_el1 317 - orr x17, x17, #SCTLR_ELx_M 318 - set_sctlr_el1 x17 319 281 320 282 /* Set the flag to zero to indicate that we're all done */ 321 283 str wzr, [flag_ptr] 322 284 ret 323 285 286 + .Lderef_pgd: 324 287 /* PUD */ 325 - walk_puds: 326 - .if CONFIG_PGTABLE_LEVELS > 3 288 + .if CONFIG_PGTABLE_LEVELS > 3 289 + pud .req x10 327 290 pte_to_phys cur_pudp, pgd 328 - add end_pudp, cur_pudp, #(PTRS_PER_PUD * 8) 329 - do_pud: __idmap_kpti_get_pgtable_ent pud 330 - tbnz pud, #1, walk_pmds 331 - next_pud: 332 - __idmap_kpti_put_pgtable_ent_ng pud 333 - skip_pud: 334 - add cur_pudp, cur_pudp, 8 335 - cmp cur_pudp, end_pudp 336 - b.ne do_pud 337 - b next_pgd 338 - .else /* 
CONFIG_PGTABLE_LEVELS <= 3 */ 339 - mov pud, pgd 340 - b walk_pmds 341 - next_pud: 342 - b next_pgd 291 + kpti_map_pgtbl pud, 1 292 + kpti_mk_tbl_ng pud, PTRS_PER_PUD 293 + b .Lnext_pgd 294 + .else /* CONFIG_PGTABLE_LEVELS <= 3 */ 295 + pud .req pgd 296 + .set .Lnext_pud, .Lnext_pgd 343 297 .endif 344 298 299 + .Lderef_pud: 345 300 /* PMD */ 346 - walk_pmds: 347 - .if CONFIG_PGTABLE_LEVELS > 2 301 + .if CONFIG_PGTABLE_LEVELS > 2 302 + pmd .req x13 348 303 pte_to_phys cur_pmdp, pud 349 - add end_pmdp, cur_pmdp, #(PTRS_PER_PMD * 8) 350 - do_pmd: __idmap_kpti_get_pgtable_ent pmd 351 - tbnz pmd, #1, walk_ptes 352 - next_pmd: 353 - __idmap_kpti_put_pgtable_ent_ng pmd 354 - skip_pmd: 355 - add cur_pmdp, cur_pmdp, #8 356 - cmp cur_pmdp, end_pmdp 357 - b.ne do_pmd 358 - b next_pud 359 - .else /* CONFIG_PGTABLE_LEVELS <= 2 */ 360 - mov pmd, pud 361 - b walk_ptes 362 - next_pmd: 363 - b next_pud 304 + kpti_map_pgtbl pmd, 2 305 + kpti_mk_tbl_ng pmd, PTRS_PER_PMD 306 + b .Lnext_pud 307 + .else /* CONFIG_PGTABLE_LEVELS <= 2 */ 308 + pmd .req pgd 309 + .set .Lnext_pmd, .Lnext_pgd 364 310 .endif 365 311 312 + .Lderef_pmd: 366 313 /* PTE */ 367 - walk_ptes: 368 314 pte_to_phys cur_ptep, pmd 369 - add end_ptep, cur_ptep, #(PTRS_PER_PTE * 8) 370 - do_pte: __idmap_kpti_get_pgtable_ent pte 371 - __idmap_kpti_put_pgtable_ent_ng pte 372 - skip_pte: 373 - add cur_ptep, cur_ptep, #8 374 - cmp cur_ptep, end_ptep 375 - b.ne do_pte 376 - b next_pmd 315 + kpti_map_pgtbl pte, 3 316 + kpti_mk_tbl_ng pte, PTRS_PER_PTE 317 + b .Lnext_pmd 377 318 378 319 .unreq cpu 320 + .unreq temp_pte 379 321 .unreq num_cpus 380 - .unreq swapper_pa 322 + .unreq pte_flags 323 + .unreq temp_pgd_phys 381 324 .unreq cur_pgdp 382 325 .unreq end_pgdp 383 326 .unreq pgd ··· 363 360 .unreq cur_ptep 364 361 .unreq end_ptep 365 362 .unreq pte 363 + .unreq valid 366 364 367 365 /* Secondary CPUs end up here */ 368 366 __idmap_kpti_secondary: ··· 383 379 cbnz w16, 1b 384 380 385 381 /* All done, act like nothing happened */ 
386 - offset_ttbr1 swapper_ttb, x16 387 382 msr ttbr1_el1, swapper_ttb 388 383 isb 389 384 ret ··· 398 395 * 399 396 * Initialise the processor for turning the MMU on. 400 397 * 398 + * Input: 399 + * x0 - actual number of VA bits (ignored unless VA_BITS > 48) 401 400 * Output: 402 401 * Return in x0 the value of the SCTLR_EL1 register. 403 402 */ ··· 469 464 tcr_clear_errata_bits tcr, x9, x5 470 465 471 466 #ifdef CONFIG_ARM64_VA_BITS_52 472 - ldr_l x9, vabits_actual 473 - sub x9, xzr, x9 467 + sub x9, xzr, x0 474 468 add x9, x9, #64 475 469 tcr_set_t1sz tcr, x9 476 470 #else 477 - ldr_l x9, idmap_t0sz 471 + idmap_get_t0sz x9 478 472 #endif 479 473 tcr_set_t0sz tcr, x9 480 474
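As a reading aid for the `kpti_mk_tbl_ng` macro in the hunk above, the per-entry logic can be sketched in plain C. This is a user-space illustration only, not kernel code: `mark_table_ng`, the `DESC_*` names and the flat `uint64_t` table are hypothetical. Only the bit positions (0 for valid, 1 for table, 11 for nG) are taken from the `tbz`/`tbnz`/`orr` instructions in the diff, and the sketch counts table entries instead of mapping and walking the next level.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bit positions mirrored from the tbz/tbnz tests in kpti_mk_tbl_ng. */
#define DESC_VALID (UINT64_C(1) << 0)   /* entry is valid */
#define DESC_TABLE (UINT64_C(1) << 1)   /* entry points to a next-level table */
#define DESC_NG    (UINT64_C(1) << 11)  /* entry is non-global */

/*
 * Hypothetical helper: set nG on every valid, still-global entry of one table.
 * Returns the number of entries rewritten; *to_descend counts the rewritten
 * entries that the assembly would follow into the next level.
 */
static size_t mark_table_ng(uint64_t *tbl, size_t n, int leaf_level,
                            size_t *to_descend)
{
    size_t updated = 0;

    assert(tbl != NULL && to_descend != NULL);
    *to_descend = 0;

    for (size_t i = 0; i < n; i++) {
        uint64_t desc = tbl[i];

        if (!(desc & DESC_VALID))   /* skip invalid entries */
            continue;
        if (desc & DESC_NG)         /* skip entries that are already nG */
            continue;

        tbl[i] = desc | DESC_NG;    /* same bit for blocks and pages */
        updated++;

        if (!leaf_level && (desc & DESC_TABLE))
            (*to_descend)++;        /* the assembly branches to .Lderef_* here */
    }
    return updated;
}
```

Because entries that already carry nG are skipped, running the helper a second time over the same table rewrites nothing.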
+2
arch/arm64/tools/cpucaps
···
 HAS_SB
 HAS_STAGE2_FWB
 HAS_SYSREG_GIC_CPUIF
+HAS_TIDCP1
 HAS_TLB_RANGE
 HAS_VIRT_HOST_EXTN
 HAS_WFXT
···
 WORKAROUND_1463225
 WORKAROUND_1508412
 WORKAROUND_1542419
+WORKAROUND_1742098
 WORKAROUND_1902691
 WORKAROUND_2038923
 WORKAROUND_2064142
+1 -1
arch/arm64/tools/gen-sysreg.awk
···
 
 # skip blank lines and comment lines
 /^$/ { next }
-/^#/ { next }
+/^[\t ]*#/ { next }
 
 /^SysregFields/ {
 	change_block("SysregFields", "None", "SysregFields")
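The one-line awk change above widens the comment rule from "`#` in column 0" to "`#` after optional leading tabs or spaces". The difference can be checked with a small C sketch built on POSIX `regcomp()`/`regexec()`; the helper names `line_matches`, `is_comment_old` and `is_comment_new` are hypothetical and exist only for this illustration.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true if `line` matches the POSIX extended regex `pattern`. */
static bool line_matches(const char *pattern, const char *line)
{
    regex_t re;
    bool hit;

    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return false;
    hit = (regexec(&re, line, 0, NULL, 0) == 0);
    regfree(&re);
    return hit;
}

/* Old rule: '#' must be the first character of the line. */
static bool is_comment_old(const char *line)
{
    return line_matches("^#", line);
}

/*
 * New rule: optional leading tabs/spaces, then '#'.
 * The C compiler turns \t into a literal tab inside the bracket expression.
 */
static bool is_comment_new(const char *line)
{
    return line_matches("^[\t ]*#", line);
}
```

An indented comment line is rejected by the old rule and accepted by the new one, which matters for comments nested inside a register definition block.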
+264
arch/arm64/tools/sysreg
··· 46 46 # feature that introduces them (eg, FEAT_LS64_ACCDATA introduces enumeration 47 47 # item ACCDATA) though it may be more taseful to do something else. 48 48 49 + Sysreg ID_AA64ZFR0_EL1 3 0 0 4 4 50 + Res0 63:60 51 + Enum 59:56 F64MM 52 + 0b0000 NI 53 + 0b0001 IMP 54 + EndEnum 55 + Enum 55:52 F32MM 56 + 0b0000 NI 57 + 0b0001 IMP 58 + EndEnum 59 + Res0 51:48 60 + Enum 47:44 I8MM 61 + 0b0000 NI 62 + 0b0001 IMP 63 + EndEnum 64 + Enum 43:40 SM4 65 + 0b0000 NI 66 + 0b0001 IMP 67 + EndEnum 68 + Res0 39:36 69 + Enum 35:32 SHA3 70 + 0b0000 NI 71 + 0b0001 IMP 72 + EndEnum 73 + Res0 31:24 74 + Enum 23:20 BF16 75 + 0b0000 NI 76 + 0b0001 IMP 77 + 0b0010 EBF16 78 + EndEnum 79 + Enum 19:16 BitPerm 80 + 0b0000 NI 81 + 0b0001 IMP 82 + EndEnum 83 + Res0 15:8 84 + Enum 7:4 AES 85 + 0b0000 NI 86 + 0b0001 IMP 87 + 0b0010 PMULL128 88 + EndEnum 89 + Enum 3:0 SVEver 90 + 0b0000 IMP 91 + 0b0001 SVE2 92 + EndEnum 93 + EndSysreg 94 + 95 + Sysreg ID_AA64SMFR0_EL1 3 0 0 4 5 96 + Enum 63 FA64 97 + 0b0 NI 98 + 0b1 IMP 99 + EndEnum 100 + Res0 62:60 101 + Field 59:56 SMEver 102 + Enum 55:52 I16I64 103 + 0b0000 NI 104 + 0b1111 IMP 105 + EndEnum 106 + Res0 51:49 107 + Enum 48 F64F64 108 + 0b0 NI 109 + 0b1 IMP 110 + EndEnum 111 + Res0 47:40 112 + Enum 39:36 I8I32 113 + 0b0000 NI 114 + 0b1111 IMP 115 + EndEnum 116 + Enum 35 F16F32 117 + 0b0 NI 118 + 0b1 IMP 119 + EndEnum 120 + Enum 34 B16F32 121 + 0b0 NI 122 + 0b1 IMP 123 + EndEnum 124 + Res0 33 125 + Enum 32 F32F32 126 + 0b0 NI 127 + 0b1 IMP 128 + EndEnum 129 + Res0 31:0 130 + EndSysreg 131 + 49 132 Sysreg ID_AA64ISAR0_EL1 3 0 0 6 0 50 133 Enum 63:60 RNDR 51 134 0b0000 NI ··· 195 112 0b0010 PMULL 196 113 EndEnum 197 114 Res0 3:0 115 + EndSysreg 116 + 117 + Sysreg ID_AA64ISAR1_EL1 3 0 0 6 1 118 + Enum 63:60 LS64 119 + 0b0000 NI 120 + 0b0001 LS64 121 + 0b0010 LS64_V 122 + 0b0011 LS64_ACCDATA 123 + EndEnum 124 + Enum 59:56 XS 125 + 0b0000 NI 126 + 0b0001 IMP 127 + EndEnum 128 + Enum 55:52 I8MM 129 + 0b0000 NI 130 + 0b0001 IMP 131 + EndEnum 132 
+ Enum 51:48 DGH 133 + 0b0000 NI 134 + 0b0001 IMP 135 + EndEnum 136 + Enum 47:44 BF16 137 + 0b0000 NI 138 + 0b0001 IMP 139 + 0b0010 EBF16 140 + EndEnum 141 + Enum 43:40 SPECRES 142 + 0b0000 NI 143 + 0b0001 IMP 144 + EndEnum 145 + Enum 39:36 SB 146 + 0b0000 NI 147 + 0b0001 IMP 148 + EndEnum 149 + Enum 35:32 FRINTTS 150 + 0b0000 NI 151 + 0b0001 IMP 152 + EndEnum 153 + Enum 31:28 GPI 154 + 0b0000 NI 155 + 0b0001 IMP 156 + EndEnum 157 + Enum 27:24 GPA 158 + 0b0000 NI 159 + 0b0001 IMP 160 + EndEnum 161 + Enum 23:20 LRCPC 162 + 0b0000 NI 163 + 0b0001 IMP 164 + 0b0010 LRCPC2 165 + EndEnum 166 + Enum 19:16 FCMA 167 + 0b0000 NI 168 + 0b0001 IMP 169 + EndEnum 170 + Enum 15:12 JSCVT 171 + 0b0000 NI 172 + 0b0001 IMP 173 + EndEnum 174 + Enum 11:8 API 175 + 0b0000 NI 176 + 0b0001 PAuth 177 + 0b0010 EPAC 178 + 0b0011 PAuth2 179 + 0b0100 FPAC 180 + 0b0101 FPACCOMBINE 181 + EndEnum 182 + Enum 7:4 APA 183 + 0b0000 NI 184 + 0b0001 PAuth 185 + 0b0010 EPAC 186 + 0b0011 PAuth2 187 + 0b0100 FPAC 188 + 0b0101 FPACCOMBINE 189 + EndEnum 190 + Enum 3:0 DPB 191 + 0b0000 NI 192 + 0b0001 IMP 193 + 0b0010 DPB2 194 + EndEnum 195 + EndSysreg 196 + 197 + Sysreg ID_AA64ISAR2_EL1 3 0 0 6 2 198 + Res0 63:28 199 + Enum 27:24 PAC_frac 200 + 0b0000 NI 201 + 0b0001 IMP 202 + EndEnum 203 + Enum 23:20 BC 204 + 0b0000 NI 205 + 0b0001 IMP 206 + EndEnum 207 + Enum 19:16 MOPS 208 + 0b0000 NI 209 + 0b0001 IMP 210 + EndEnum 211 + Enum 15:12 APA3 212 + 0b0000 NI 213 + 0b0001 PAuth 214 + 0b0010 EPAC 215 + 0b0011 PAuth2 216 + 0b0100 FPAC 217 + 0b0101 FPACCOMBINE 218 + EndEnum 219 + Enum 11:8 GPA3 220 + 0b0000 NI 221 + 0b0001 IMP 222 + EndEnum 223 + Enum 7:4 RPRES 224 + 0b0000 NI 225 + 0b0001 IMP 226 + EndEnum 227 + Enum 3:0 WFxT 228 + 0b0000 NI 229 + 0b0010 IMP 230 + EndEnum 198 231 EndSysreg 199 232 200 233 Sysreg SCTLR_EL1 3 0 1 0 0 ··· 456 257 Field 2:0 Ctype1 457 258 EndSysreg 458 259 260 + Sysreg GMID_EL1 3 1 0 0 4 261 + Res0 63:4 262 + Field 3:0 BS 263 + EndSysreg 264 + 459 265 Sysreg SMIDR_EL1 3 1 0 0 6 460 
266 Res0 63:32 461 267 Field 31:24 IMPLEMENTER ··· 475 271 Field 4 TnD 476 272 Field 3:1 Level 477 273 Field 0 InD 274 + EndSysreg 275 + 276 + Sysreg CTR_EL0 3 3 0 0 1 277 + Res0 63:38 278 + Field 37:32 TminLine 279 + Res1 31 280 + Res0 30 281 + Field 29 DIC 282 + Field 28 IDC 283 + Field 27:24 CWG 284 + Field 23:20 ERG 285 + Field 19:16 DminLine 286 + Enum 15:14 L1Ip 287 + 0b00 VPIPT 288 + # This is named as AIVIVT in the ARM but documented as reserved 289 + 0b01 RESERVED 290 + 0b10 VIPT 291 + 0b11 PIPT 292 + EndEnum 293 + Res0 13:4 294 + Field 3:0 IminLine 295 + EndSysreg 296 + 297 + Sysreg DCZID_EL0 3 3 0 0 7 298 + Res0 63:5 299 + Field 4 DZP 300 + Field 3:0 BS 478 301 EndSysreg 479 302 480 303 Sysreg SVCR 3 3 4 2 2 ··· 597 366 598 367 Sysreg TTBR1_EL1 3 0 2 0 1 599 368 Fields TTBRx_EL1 369 + EndSysreg 370 + 371 + Sysreg LORSA_EL1 3 0 10 4 0 372 + Res0 63:52 373 + Field 51:16 SA 374 + Res0 15:1 375 + Field 0 Valid 376 + EndSysreg 377 + 378 + Sysreg LOREA_EL1 3 0 10 4 1 379 + Res0 63:52 380 + Field 51:48 EA_51_48 381 + Field 47:16 EA_47_16 382 + Res0 15:0 383 + EndSysreg 384 + 385 + Sysreg LORN_EL1 3 0 10 4 2 386 + Res0 63:8 387 + Field 7:0 Num 388 + EndSysreg 389 + 390 + Sysreg LORC_EL1 3 0 10 4 3 391 + Res0 63:10 392 + Field 9:2 DS 393 + Res0 1 394 + Field 0 EN 395 + EndSysreg 396 + 397 + Sysreg LORID_EL1 3 0 10 4 7 398 + Res0 63:24 399 + Field 23:16 LD 400 + Res0 15:8 401 + Field 7:0 LR 600 402 EndSysreg
+1
arch/x86/Kconfig
···
 	select SYSCTL_EXCEPTION_TRACE
 	select THREAD_INFO_IN_TASK
 	select TRACE_IRQFLAGS_SUPPORT
+	select TRACE_IRQFLAGS_NMI_SUPPORT
 	select USER_STACKTRACE_SUPPORT
 	select VIRT_TO_BUS
 	select HAVE_ARCH_KCSAN if X86_64
-3
arch/x86/Kconfig.debug
···
 # SPDX-License-Identifier: GPL-2.0
 
-config TRACE_IRQFLAGS_NMI_SUPPORT
-	def_bool y
-
 config EARLY_PRINTK_USB
 	bool
 
+2 -1
drivers/cpuidle/Kconfig.arm
···
 # ARM CPU Idle drivers
 #
 config ARM_CPUIDLE
-	bool "Generic ARM/ARM64 CPU idle Driver"
+	bool "Generic ARM CPU idle Driver"
+	depends on ARM
 	select DT_IDLE_STATES
 	select CPU_IDLE_MULTIPLE_DRIVERS
 	help
+5 -6
drivers/perf/arm-cci.c
···
 
 	/*
 	 * To handle interrupt latency, we always reprogram the period
-	 * regardlesss of PERF_EF_RELOAD.
+	 * regardless of PERF_EF_RELOAD.
 	 */
 	if (pmu_flags & PERF_EF_RELOAD)
 		WARN_ON_ONCE(!(hwc->state & PERF_HES_UPTODATE));
···
 		 */
 		.used_mask = mask,
 	};
-	memset(mask, 0, BITS_TO_LONGS(cci_pmu->num_cntrs) * sizeof(unsigned long));
+	bitmap_zero(mask, cci_pmu->num_cntrs);
 
 	if (!validate_event(event->pmu, &fake_pmu, leader))
 		return -EINVAL;
···
 					     GFP_KERNEL);
 	if (!cci_pmu->hw_events.events)
 		return ERR_PTR(-ENOMEM);
-	cci_pmu->hw_events.used_mask = devm_kcalloc(dev,
-						BITS_TO_LONGS(CCI_PMU_MAX_HW_CNTRS(model)),
-						sizeof(*cci_pmu->hw_events.used_mask),
-						GFP_KERNEL);
+	cci_pmu->hw_events.used_mask = devm_bitmap_zalloc(dev,
+						CCI_PMU_MAX_HW_CNTRS(model),
+						GFP_KERNEL);
 	if (!cci_pmu->hw_events.used_mask)
 		return ERR_PTR(-ENOMEM);
 
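The arm-cci hunks above replace open-coded `BITS_TO_LONGS(n) * sizeof(unsigned long)` sizing with bitmap helpers that take a bit count directly. The arithmetic those helpers hide can be sketched in user space as follows; `bits_to_longs` and `bitmap_zero_sketch` are hypothetical stand-ins written for this note, not the kernel implementations.

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>

#define BITS_PER_LONG_SKETCH (sizeof(unsigned long) * CHAR_BIT)

/* Number of unsigned longs needed to hold nbits bits, rounded up. */
static size_t bits_to_longs(size_t nbits)
{
    return (nbits + BITS_PER_LONG_SKETCH - 1) / BITS_PER_LONG_SKETCH;
}

/* Stand-in for bitmap_zero(): clears every long that backs the bitmap. */
static void bitmap_zero_sketch(unsigned long *map, size_t nbits)
{
    memset(map, 0, bits_to_longs(nbits) * sizeof(unsigned long));
}
```

The caller passes only the number of bits, so the rounding to whole words cannot be forgotten or applied twice at a call site.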
+3 -3
drivers/perf/arm-ccn.c
···
 	ccn->dt.cmp_mask[CCN_IDX_MASK_OPCODE].h = ~(0x1f << 9);
 
 	/* Get a convenient /sys/event_source/devices/ name */
-	ccn->dt.id = ida_simple_get(&arm_ccn_pmu_ida, 0, 0, GFP_KERNEL);
+	ccn->dt.id = ida_alloc(&arm_ccn_pmu_ida, GFP_KERNEL);
 	if (ccn->dt.id == 0) {
 		name = "ccn";
 	} else {
···
 					    &ccn->dt.node);
 error_set_affinity:
 error_choose_name:
-	ida_simple_remove(&arm_ccn_pmu_ida, ccn->dt.id);
+	ida_free(&arm_ccn_pmu_ida, ccn->dt.id);
 	for (i = 0; i < ccn->num_xps; i++)
 		writel(0, ccn->xp[i].base + CCN_XP_DT_CONTROL);
 	writel(0, ccn->dt.base + CCN_DT_PMCR);
···
 		writel(0, ccn->xp[i].base + CCN_XP_DT_CONTROL);
 	writel(0, ccn->dt.base + CCN_DT_PMCR);
 	perf_pmu_unregister(&ccn->dt.pmu);
-	ida_simple_remove(&arm_ccn_pmu_ida, ccn->dt.id);
+	ida_free(&arm_ccn_pmu_ida, ccn->dt.id);
 }
 
 static int arm_ccn_for_each_valid_region(struct arm_ccn *ccn,
+20 -2
drivers/perf/arm_spe_pmu.c
···
 #include <asm/mmu.h>
 #include <asm/sysreg.h>
 
+/*
+ * Cache if the event is allowed to trace Context information.
+ * This allows us to perform the check, i.e, perfmon_capable(),
+ * in the context of the event owner, once, during the event_init().
+ */
+#define SPE_PMU_HW_FLAGS_CX BIT(0)
+
+static void set_spe_event_has_cx(struct perf_event *event)
+{
+	if (IS_ENABLED(CONFIG_PID_IN_CONTEXTIDR) && perfmon_capable())
+		event->hw.flags |= SPE_PMU_HW_FLAGS_CX;
+}
+
+static bool get_spe_event_has_cx(struct perf_event *event)
+{
+	return !!(event->hw.flags & SPE_PMU_HW_FLAGS_CX);
+}
+
 #define ARM_SPE_BUF_PAD_BYTE 0
 
 struct arm_spe_pmu_buf {
···
 	if (!attr->exclude_kernel)
 		reg |= BIT(SYS_PMSCR_EL1_E1SPE_SHIFT);
 
-	if (IS_ENABLED(CONFIG_PID_IN_CONTEXTIDR) && perfmon_capable())
+	if (get_spe_event_has_cx(event))
 		reg |= BIT(SYS_PMSCR_EL1_CX_SHIFT);
 
 	return reg;
···
 	    !(spe_pmu->features & SPE_PMU_FEAT_FILT_LAT))
 		return -EOPNOTSUPP;
 
+	set_spe_event_has_cx(event);
 	reg = arm_spe_event_to_pmscr(event);
 	if (!perfmon_capable() &&
 	    (reg & (BIT(SYS_PMSCR_EL1_PA_SHIFT) |
-		    BIT(SYS_PMSCR_EL1_CX_SHIFT) |
 		    BIT(SYS_PMSCR_EL1_PCT_SHIFT))))
 		return -EACCES;
 
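The comment added in the first arm_spe_pmu.c hunk describes a "check once in the owner's context, then reuse the cached answer" pattern. A minimal user-space sketch of that pattern follows. Everything in it is a hypothetical stand-in: `struct sketch_event`, `EVT_FLAG_CX` and the global `current_is_privileged` (which plays the role of `perfmon_capable()`); the `CONFIG_PID_IN_CONTEXTIDR` condition from the real code is left out for brevity.

```c
#include <stdbool.h>

#define EVT_FLAG_CX (1u << 0)   /* cached: owner may trace context info */

struct sketch_event {
    unsigned int flags;
};

/* Stand-in for perfmon_capable(): privilege of whoever is running right now. */
static bool current_is_privileged;

/* Called once, from the owner's context, when the event is created. */
static void event_init_cache_cx(struct sketch_event *ev)
{
    if (current_is_privileged)
        ev->flags |= EVT_FLAG_CX;
}

/* Called later, possibly from a different context: only reads the cache. */
static bool event_has_cx(const struct sketch_event *ev)
{
    return (ev->flags & EVT_FLAG_CX) != 0;
}
```

The design point is that the later reader never consults the live privilege state, so its answer does not depend on which task happens to be running when the register value is computed.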
+3 -3
drivers/perf/fsl_imx8_ddr_perf.c
···
 		.dev = dev,
 	};
 
-	pmu->id = ida_simple_get(&ddr_ida, 0, 0, GFP_KERNEL);
+	pmu->id = ida_alloc(&ddr_ida, GFP_KERNEL);
 	return pmu->id;
 }
 
···
 cpuhp_instance_err:
 	cpuhp_remove_multi_state(pmu->cpuhp_state);
 cpuhp_state_err:
-	ida_simple_remove(&ddr_ida, pmu->id);
+	ida_free(&ddr_ida, pmu->id);
 	dev_warn(&pdev->dev, "i.MX8 DDR Perf PMU failed (%d), disabled\n", ret);
 	return ret;
 }
···
 
 	perf_pmu_unregister(&pmu->pmu);
 
-	ida_simple_remove(&ddr_ida, pmu->id);
+	ida_free(&ddr_ida, pmu->id);
 	return 0;
 }
 
+10
drivers/perf/hisilicon/Kconfig
···
 	  RCiEP devices.
 	  Adds the PCIe PMU into perf events system for monitoring latency,
 	  bandwidth etc.
+
+config HNS3_PMU
+	tristate "HNS3 PERF PMU"
+	depends on ARM64 || COMPILE_TEST
+	depends on PCI
+	help
+	  Provide support for HNS3 performance monitoring unit (PMU) RCiEP
+	  devices.
+	  Adds the HNS3 PMU into perf events system for monitoring latency,
+	  bandwidth etc.
+1
drivers/perf/hisilicon/Makefile
···
 	hisi_uncore_pa_pmu.o hisi_uncore_cpa_pmu.o
 
 obj-$(CONFIG_HISI_PCIE_PMU) += hisi_pcie_pmu.o
+obj-$(CONFIG_HNS3_PMU) += hns3_pmu.o
+1 -15
drivers/perf/hisilicon/hisi_uncore_ddrc_pmu.c
···
 			      "hisi_sccl%u_ddrc%u", ddrc_pmu->sccl_id,
 			      ddrc_pmu->index_id);
 
-	ddrc_pmu->pmu = (struct pmu) {
-		.name = name,
-		.module = THIS_MODULE,
-		.task_ctx_nr = perf_invalid_context,
-		.event_init = hisi_uncore_pmu_event_init,
-		.pmu_enable = hisi_uncore_pmu_enable,
-		.pmu_disable = hisi_uncore_pmu_disable,
-		.add = hisi_uncore_pmu_add,
-		.del = hisi_uncore_pmu_del,
-		.start = hisi_uncore_pmu_start,
-		.stop = hisi_uncore_pmu_stop,
-		.read = hisi_uncore_pmu_read,
-		.attr_groups = ddrc_pmu->pmu_events.attr_groups,
-		.capabilities = PERF_PMU_CAP_NO_EXCLUDE,
-	};
+	hisi_pmu_init(&ddrc_pmu->pmu, name, ddrc_pmu->pmu_events.attr_groups, THIS_MODULE);
 
 	ret = perf_pmu_register(&ddrc_pmu->pmu, name, -1);
 	if (ret) {
+1 -15
drivers/perf/hisilicon/hisi_uncore_hha_pmu.c
···
 
 	name = devm_kasprintf(&pdev->dev, GFP_KERNEL, "hisi_sccl%u_hha%u",
 			      hha_pmu->sccl_id, hha_pmu->index_id);
-	hha_pmu->pmu = (struct pmu) {
-		.name = name,
-		.module = THIS_MODULE,
-		.task_ctx_nr = perf_invalid_context,
-		.event_init = hisi_uncore_pmu_event_init,
-		.pmu_enable = hisi_uncore_pmu_enable,
-		.pmu_disable = hisi_uncore_pmu_disable,
-		.add = hisi_uncore_pmu_add,
-		.del = hisi_uncore_pmu_del,
-		.start = hisi_uncore_pmu_start,
-		.stop = hisi_uncore_pmu_stop,
-		.read = hisi_uncore_pmu_read,
-		.attr_groups = hha_pmu->pmu_events.attr_groups,
-		.capabilities = PERF_PMU_CAP_NO_EXCLUDE,
-	};
+	hisi_pmu_init(&hha_pmu->pmu, name, hha_pmu->pmu_events.attr_groups, THIS_MODULE);
 
 	ret = perf_pmu_register(&hha_pmu->pmu, name, -1);
 	if (ret) {
+1 -15
drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c
···
 	 */
 	name = devm_kasprintf(&pdev->dev, GFP_KERNEL, "hisi_sccl%u_l3c%u",
 			      l3c_pmu->sccl_id, l3c_pmu->ccl_id);
-	l3c_pmu->pmu = (struct pmu) {
-		.name = name,
-		.module = THIS_MODULE,
-		.task_ctx_nr = perf_invalid_context,
-		.event_init = hisi_uncore_pmu_event_init,
-		.pmu_enable = hisi_uncore_pmu_enable,
-		.pmu_disable = hisi_uncore_pmu_disable,
-		.add = hisi_uncore_pmu_add,
-		.del = hisi_uncore_pmu_del,
-		.start = hisi_uncore_pmu_start,
-		.stop = hisi_uncore_pmu_stop,
-		.read = hisi_uncore_pmu_read,
-		.attr_groups = l3c_pmu->pmu_events.attr_groups,
-		.capabilities = PERF_PMU_CAP_NO_EXCLUDE,
-	};
+	hisi_pmu_init(&l3c_pmu->pmu, name, l3c_pmu->pmu_events.attr_groups, THIS_MODULE);
 
 	ret = perf_pmu_register(&l3c_pmu->pmu, name, -1);
 	if (ret) {
+1 -15
drivers/perf/hisilicon/hisi_uncore_pa_pmu.c
···
 		return ret;
 	}
 
-	pa_pmu->pmu = (struct pmu) {
-		.module = THIS_MODULE,
-		.task_ctx_nr = perf_invalid_context,
-		.event_init = hisi_uncore_pmu_event_init,
-		.pmu_enable = hisi_uncore_pmu_enable,
-		.pmu_disable = hisi_uncore_pmu_disable,
-		.add = hisi_uncore_pmu_add,
-		.del = hisi_uncore_pmu_del,
-		.start = hisi_uncore_pmu_start,
-		.stop = hisi_uncore_pmu_stop,
-		.read = hisi_uncore_pmu_read,
-		.attr_groups = pa_pmu->pmu_events.attr_groups,
-		.capabilities = PERF_PMU_CAP_NO_EXCLUDE,
-	};
-
+	hisi_pmu_init(&pa_pmu->pmu, name, pa_pmu->pmu_events.attr_groups, THIS_MODULE);
 	ret = perf_pmu_register(&pa_pmu->pmu, name, -1);
 	if (ret) {
 		dev_err(pa_pmu->dev, "PMU register failed, ret = %d\n", ret);
+18
drivers/perf/hisilicon/hisi_uncore_pmu.c
···
 }
 EXPORT_SYMBOL_GPL(hisi_uncore_pmu_offline_cpu);
 
+void hisi_pmu_init(struct pmu *pmu, const char *name,
+		const struct attribute_group **attr_groups, struct module *module)
+{
+	pmu->name = name;
+	pmu->module = module;
+	pmu->task_ctx_nr = perf_invalid_context;
+	pmu->event_init = hisi_uncore_pmu_event_init;
+	pmu->pmu_enable = hisi_uncore_pmu_enable;
+	pmu->pmu_disable = hisi_uncore_pmu_disable;
+	pmu->add = hisi_uncore_pmu_add;
+	pmu->del = hisi_uncore_pmu_del;
+	pmu->start = hisi_uncore_pmu_start;
+	pmu->stop = hisi_uncore_pmu_stop;
+	pmu->read = hisi_uncore_pmu_read;
+	pmu->attr_groups = attr_groups;
+}
+EXPORT_SYMBOL_GPL(hisi_pmu_init);
+
 MODULE_LICENSE("GPL v2");
+2
drivers/perf/hisilicon/hisi_uncore_pmu.h
···
 int hisi_uncore_pmu_init_irq(struct hisi_pmu *hisi_pmu,
 			     struct platform_device *pdev);
 
+void hisi_pmu_init(struct pmu *pmu, const char *name,
+		const struct attribute_group **attr_groups, struct module *module);
 #endif /* __HISI_UNCORE_PMU_H__ */
+1 -14
drivers/perf/hisilicon/hisi_uncore_sllc_pmu.c
···
 		return ret;
 	}
 
-	sllc_pmu->pmu = (struct pmu) {
-		.module = THIS_MODULE,
-		.task_ctx_nr = perf_invalid_context,
-		.event_init = hisi_uncore_pmu_event_init,
-		.pmu_enable = hisi_uncore_pmu_enable,
-		.pmu_disable = hisi_uncore_pmu_disable,
-		.add = hisi_uncore_pmu_add,
-		.del = hisi_uncore_pmu_del,
-		.start = hisi_uncore_pmu_start,
-		.stop = hisi_uncore_pmu_stop,
-		.read = hisi_uncore_pmu_read,
-		.attr_groups = sllc_pmu->pmu_events.attr_groups,
-		.capabilities = PERF_PMU_CAP_NO_EXCLUDE,
-	};
+	hisi_pmu_init(&sllc_pmu->pmu, name, sllc_pmu->pmu_events.attr_groups, THIS_MODULE);
 
 	ret = perf_pmu_register(&sllc_pmu->pmu, name, -1);
 	if (ret) {
+1671
drivers/perf/hisilicon/hns3_pmu.c
··· 1 + // SPDX-License-Identifier: GPL-2.0-only 2 + /* 3 + * This driver adds support for HNS3 PMU iEP device. Related perf events are 4 + * bandwidth, latency, packet rate, interrupt rate etc. 5 + * 6 + * Copyright (C) 2022 HiSilicon Limited 7 + */ 8 + #include <linux/bitfield.h> 9 + #include <linux/bitmap.h> 10 + #include <linux/bug.h> 11 + #include <linux/cpuhotplug.h> 12 + #include <linux/cpumask.h> 13 + #include <linux/delay.h> 14 + #include <linux/device.h> 15 + #include <linux/err.h> 16 + #include <linux/interrupt.h> 17 + #include <linux/iopoll.h> 18 + #include <linux/io-64-nonatomic-hi-lo.h> 19 + #include <linux/irq.h> 20 + #include <linux/kernel.h> 21 + #include <linux/list.h> 22 + #include <linux/module.h> 23 + #include <linux/pci.h> 24 + #include <linux/pci-epf.h> 25 + #include <linux/perf_event.h> 26 + #include <linux/smp.h> 27 + 28 + /* registers offset address */ 29 + #define HNS3_PMU_REG_GLOBAL_CTRL 0x0000 30 + #define HNS3_PMU_REG_CLOCK_FREQ 0x0020 31 + #define HNS3_PMU_REG_BDF 0x0fe0 32 + #define HNS3_PMU_REG_VERSION 0x0fe4 33 + #define HNS3_PMU_REG_DEVICE_ID 0x0fe8 34 + 35 + #define HNS3_PMU_REG_EVENT_OFFSET 0x1000 36 + #define HNS3_PMU_REG_EVENT_SIZE 0x1000 37 + #define HNS3_PMU_REG_EVENT_CTRL_LOW 0x00 38 + #define HNS3_PMU_REG_EVENT_CTRL_HIGH 0x04 39 + #define HNS3_PMU_REG_EVENT_INTR_STATUS 0x08 40 + #define HNS3_PMU_REG_EVENT_INTR_MASK 0x0c 41 + #define HNS3_PMU_REG_EVENT_COUNTER 0x10 42 + #define HNS3_PMU_REG_EVENT_EXT_COUNTER 0x18 43 + #define HNS3_PMU_REG_EVENT_QID_CTRL 0x28 44 + #define HNS3_PMU_REG_EVENT_QID_PARA 0x2c 45 + 46 + #define HNS3_PMU_FILTER_SUPPORT_GLOBAL BIT(0) 47 + #define HNS3_PMU_FILTER_SUPPORT_PORT BIT(1) 48 + #define HNS3_PMU_FILTER_SUPPORT_PORT_TC BIT(2) 49 + #define HNS3_PMU_FILTER_SUPPORT_FUNC BIT(3) 50 + #define HNS3_PMU_FILTER_SUPPORT_FUNC_QUEUE BIT(4) 51 + #define HNS3_PMU_FILTER_SUPPORT_FUNC_INTR BIT(5) 52 + 53 + #define HNS3_PMU_FILTER_ALL_TC 0xf 54 + #define HNS3_PMU_FILTER_ALL_QUEUE 0xffff 55 + 56 + #define 
HNS3_PMU_CTRL_SUBEVENT_S 4 57 + #define HNS3_PMU_CTRL_FILTER_MODE_S 24 58 + 59 + #define HNS3_PMU_GLOBAL_START BIT(0) 60 + 61 + #define HNS3_PMU_EVENT_STATUS_RESET BIT(11) 62 + #define HNS3_PMU_EVENT_EN BIT(12) 63 + #define HNS3_PMU_EVENT_OVERFLOW_RESTART BIT(15) 64 + 65 + #define HNS3_PMU_QID_PARA_FUNC_S 0 66 + #define HNS3_PMU_QID_PARA_QUEUE_S 16 67 + 68 + #define HNS3_PMU_QID_CTRL_REQ_ENABLE BIT(0) 69 + #define HNS3_PMU_QID_CTRL_DONE BIT(1) 70 + #define HNS3_PMU_QID_CTRL_MISS BIT(2) 71 + 72 + #define HNS3_PMU_INTR_MASK_OVERFLOW BIT(1) 73 + 74 + #define HNS3_PMU_MAX_HW_EVENTS 8 75 + 76 + /* 77 + * Each hardware event contains two registers (counter and ext_counter) for 78 + * bandwidth, packet rate, latency and interrupt rate. These two registers will 79 + * be triggered to run at the same when a hardware event is enabled. The meaning 80 + * of counter and ext_counter of different event type are different, their 81 + * meaning show as follow: 82 + * 83 + * +----------------+------------------+---------------+ 84 + * | event type | counter | ext_counter | 85 + * +----------------+------------------+---------------+ 86 + * | bandwidth | byte number | cycle number | 87 + * +----------------+------------------+---------------+ 88 + * | packet rate | packet number | cycle number | 89 + * +----------------+------------------+---------------+ 90 + * | latency | cycle number | packet number | 91 + * +----------------+------------------+---------------+ 92 + * | interrupt rate | interrupt number | cycle number | 93 + * +----------------+------------------+---------------+ 94 + * 95 + * The cycle number indicates increment of counter of hardware timer, the 96 + * frequency of hardware timer can be read from hw_clk_freq file. 97 + * 98 + * Performance of each hardware event is calculated by: counter / ext_counter. 
99 + * 100 + * Since processing of data is preferred to be done in userspace, we expose 101 + * ext_counter as a separate event for userspace and use bit 16 to indicate it. 102 + * For example, event 0x00001 and 0x10001 are actually one event for hardware 103 + * because bit 0-15 are same. If the bit 16 of one event is 0 means to read 104 + * counter register, otherwise means to read ext_counter register. 105 + */ 106 + /* bandwidth events */ 107 + #define HNS3_PMU_EVT_BW_SSU_EGU_BYTE_NUM 0x00001 108 + #define HNS3_PMU_EVT_BW_SSU_EGU_TIME 0x10001 109 + #define HNS3_PMU_EVT_BW_SSU_RPU_BYTE_NUM 0x00002 110 + #define HNS3_PMU_EVT_BW_SSU_RPU_TIME 0x10002 111 + #define HNS3_PMU_EVT_BW_SSU_ROCE_BYTE_NUM 0x00003 112 + #define HNS3_PMU_EVT_BW_SSU_ROCE_TIME 0x10003 113 + #define HNS3_PMU_EVT_BW_ROCE_SSU_BYTE_NUM 0x00004 114 + #define HNS3_PMU_EVT_BW_ROCE_SSU_TIME 0x10004 115 + #define HNS3_PMU_EVT_BW_TPU_SSU_BYTE_NUM 0x00005 116 + #define HNS3_PMU_EVT_BW_TPU_SSU_TIME 0x10005 117 + #define HNS3_PMU_EVT_BW_RPU_RCBRX_BYTE_NUM 0x00006 118 + #define HNS3_PMU_EVT_BW_RPU_RCBRX_TIME 0x10006 119 + #define HNS3_PMU_EVT_BW_RCBTX_TXSCH_BYTE_NUM 0x00008 120 + #define HNS3_PMU_EVT_BW_RCBTX_TXSCH_TIME 0x10008 121 + #define HNS3_PMU_EVT_BW_WR_FBD_BYTE_NUM 0x00009 122 + #define HNS3_PMU_EVT_BW_WR_FBD_TIME 0x10009 123 + #define HNS3_PMU_EVT_BW_WR_EBD_BYTE_NUM 0x0000a 124 + #define HNS3_PMU_EVT_BW_WR_EBD_TIME 0x1000a 125 + #define HNS3_PMU_EVT_BW_RD_FBD_BYTE_NUM 0x0000b 126 + #define HNS3_PMU_EVT_BW_RD_FBD_TIME 0x1000b 127 + #define HNS3_PMU_EVT_BW_RD_EBD_BYTE_NUM 0x0000c 128 + #define HNS3_PMU_EVT_BW_RD_EBD_TIME 0x1000c 129 + #define HNS3_PMU_EVT_BW_RD_PAY_M0_BYTE_NUM 0x0000d 130 + #define HNS3_PMU_EVT_BW_RD_PAY_M0_TIME 0x1000d 131 + #define HNS3_PMU_EVT_BW_RD_PAY_M1_BYTE_NUM 0x0000e 132 + #define HNS3_PMU_EVT_BW_RD_PAY_M1_TIME 0x1000e 133 + #define HNS3_PMU_EVT_BW_WR_PAY_M0_BYTE_NUM 0x0000f 134 + #define HNS3_PMU_EVT_BW_WR_PAY_M0_TIME 0x1000f 135 + #define HNS3_PMU_EVT_BW_WR_PAY_M1_BYTE_NUM 
0x00010 136 + #define HNS3_PMU_EVT_BW_WR_PAY_M1_TIME 0x10010 137 + 138 + /* packet rate events */ 139 + #define HNS3_PMU_EVT_PPS_IGU_SSU_PACKET_NUM 0x00100 140 + #define HNS3_PMU_EVT_PPS_IGU_SSU_TIME 0x10100 141 + #define HNS3_PMU_EVT_PPS_SSU_EGU_PACKET_NUM 0x00101 142 + #define HNS3_PMU_EVT_PPS_SSU_EGU_TIME 0x10101 143 + #define HNS3_PMU_EVT_PPS_SSU_RPU_PACKET_NUM 0x00102 144 + #define HNS3_PMU_EVT_PPS_SSU_RPU_TIME 0x10102 145 + #define HNS3_PMU_EVT_PPS_SSU_ROCE_PACKET_NUM 0x00103 146 + #define HNS3_PMU_EVT_PPS_SSU_ROCE_TIME 0x10103 147 + #define HNS3_PMU_EVT_PPS_ROCE_SSU_PACKET_NUM 0x00104 148 + #define HNS3_PMU_EVT_PPS_ROCE_SSU_TIME 0x10104 149 + #define HNS3_PMU_EVT_PPS_TPU_SSU_PACKET_NUM 0x00105 150 + #define HNS3_PMU_EVT_PPS_TPU_SSU_TIME 0x10105 151 + #define HNS3_PMU_EVT_PPS_RPU_RCBRX_PACKET_NUM 0x00106 152 + #define HNS3_PMU_EVT_PPS_RPU_RCBRX_TIME 0x10106 153 + #define HNS3_PMU_EVT_PPS_RCBTX_TPU_PACKET_NUM 0x00107 154 + #define HNS3_PMU_EVT_PPS_RCBTX_TPU_TIME 0x10107 155 + #define HNS3_PMU_EVT_PPS_RCBTX_TXSCH_PACKET_NUM 0x00108 156 + #define HNS3_PMU_EVT_PPS_RCBTX_TXSCH_TIME 0x10108 157 + #define HNS3_PMU_EVT_PPS_WR_FBD_PACKET_NUM 0x00109 158 + #define HNS3_PMU_EVT_PPS_WR_FBD_TIME 0x10109 159 + #define HNS3_PMU_EVT_PPS_WR_EBD_PACKET_NUM 0x0010a 160 + #define HNS3_PMU_EVT_PPS_WR_EBD_TIME 0x1010a 161 + #define HNS3_PMU_EVT_PPS_RD_FBD_PACKET_NUM 0x0010b 162 + #define HNS3_PMU_EVT_PPS_RD_FBD_TIME 0x1010b 163 + #define HNS3_PMU_EVT_PPS_RD_EBD_PACKET_NUM 0x0010c 164 + #define HNS3_PMU_EVT_PPS_RD_EBD_TIME 0x1010c 165 + #define HNS3_PMU_EVT_PPS_RD_PAY_M0_PACKET_NUM 0x0010d 166 + #define HNS3_PMU_EVT_PPS_RD_PAY_M0_TIME 0x1010d 167 + #define HNS3_PMU_EVT_PPS_RD_PAY_M1_PACKET_NUM 0x0010e 168 + #define HNS3_PMU_EVT_PPS_RD_PAY_M1_TIME 0x1010e 169 + #define HNS3_PMU_EVT_PPS_WR_PAY_M0_PACKET_NUM 0x0010f 170 + #define HNS3_PMU_EVT_PPS_WR_PAY_M0_TIME 0x1010f 171 + #define HNS3_PMU_EVT_PPS_WR_PAY_M1_PACKET_NUM 0x00110 172 + #define HNS3_PMU_EVT_PPS_WR_PAY_M1_TIME 0x10110 173 
+ #define HNS3_PMU_EVT_PPS_NICROH_TX_PRE_PACKET_NUM 0x00111 174 + #define HNS3_PMU_EVT_PPS_NICROH_TX_PRE_TIME 0x10111 175 + #define HNS3_PMU_EVT_PPS_NICROH_RX_PRE_PACKET_NUM 0x00112 176 + #define HNS3_PMU_EVT_PPS_NICROH_RX_PRE_TIME 0x10112 177 + 178 + /* latency events */ 179 + #define HNS3_PMU_EVT_DLY_TX_PUSH_TIME 0x00202 180 + #define HNS3_PMU_EVT_DLY_TX_PUSH_PACKET_NUM 0x10202 181 + #define HNS3_PMU_EVT_DLY_TX_TIME 0x00204 182 + #define HNS3_PMU_EVT_DLY_TX_PACKET_NUM 0x10204 183 + #define HNS3_PMU_EVT_DLY_SSU_TX_NIC_TIME 0x00206 184 + #define HNS3_PMU_EVT_DLY_SSU_TX_NIC_PACKET_NUM 0x10206 185 + #define HNS3_PMU_EVT_DLY_SSU_TX_ROCE_TIME 0x00207 186 + #define HNS3_PMU_EVT_DLY_SSU_TX_ROCE_PACKET_NUM 0x10207 187 + #define HNS3_PMU_EVT_DLY_SSU_RX_NIC_TIME 0x00208 188 + #define HNS3_PMU_EVT_DLY_SSU_RX_NIC_PACKET_NUM 0x10208 189 + #define HNS3_PMU_EVT_DLY_SSU_RX_ROCE_TIME 0x00209 190 + #define HNS3_PMU_EVT_DLY_SSU_RX_ROCE_PACKET_NUM 0x10209 191 + #define HNS3_PMU_EVT_DLY_RPU_TIME 0x0020e 192 + #define HNS3_PMU_EVT_DLY_RPU_PACKET_NUM 0x1020e 193 + #define HNS3_PMU_EVT_DLY_TPU_TIME 0x0020f 194 + #define HNS3_PMU_EVT_DLY_TPU_PACKET_NUM 0x1020f 195 + #define HNS3_PMU_EVT_DLY_RPE_TIME 0x00210 196 + #define HNS3_PMU_EVT_DLY_RPE_PACKET_NUM 0x10210 197 + #define HNS3_PMU_EVT_DLY_TPE_TIME 0x00211 198 + #define HNS3_PMU_EVT_DLY_TPE_PACKET_NUM 0x10211 199 + #define HNS3_PMU_EVT_DLY_TPE_PUSH_TIME 0x00212 200 + #define HNS3_PMU_EVT_DLY_TPE_PUSH_PACKET_NUM 0x10212 201 + #define HNS3_PMU_EVT_DLY_WR_FBD_TIME 0x00213 202 + #define HNS3_PMU_EVT_DLY_WR_FBD_PACKET_NUM 0x10213 203 + #define HNS3_PMU_EVT_DLY_WR_EBD_TIME 0x00214 204 + #define HNS3_PMU_EVT_DLY_WR_EBD_PACKET_NUM 0x10214 205 + #define HNS3_PMU_EVT_DLY_RD_FBD_TIME 0x00215 206 + #define HNS3_PMU_EVT_DLY_RD_FBD_PACKET_NUM 0x10215 207 + #define HNS3_PMU_EVT_DLY_RD_EBD_TIME 0x00216 208 + #define HNS3_PMU_EVT_DLY_RD_EBD_PACKET_NUM 0x10216 209 + #define HNS3_PMU_EVT_DLY_RD_PAY_M0_TIME 0x00217 210 + #define 
HNS3_PMU_EVT_DLY_RD_PAY_M0_PACKET_NUM 0x10217 211 + #define HNS3_PMU_EVT_DLY_RD_PAY_M1_TIME 0x00218 212 + #define HNS3_PMU_EVT_DLY_RD_PAY_M1_PACKET_NUM 0x10218 213 + #define HNS3_PMU_EVT_DLY_WR_PAY_M0_TIME 0x00219 214 + #define HNS3_PMU_EVT_DLY_WR_PAY_M0_PACKET_NUM 0x10219 215 + #define HNS3_PMU_EVT_DLY_WR_PAY_M1_TIME 0x0021a 216 + #define HNS3_PMU_EVT_DLY_WR_PAY_M1_PACKET_NUM 0x1021a 217 + #define HNS3_PMU_EVT_DLY_MSIX_WRITE_TIME 0x0021c 218 + #define HNS3_PMU_EVT_DLY_MSIX_WRITE_PACKET_NUM 0x1021c 219 + 220 + /* interrupt rate events */ 221 + #define HNS3_PMU_EVT_PPS_MSIX_NIC_INTR_NUM 0x00300 222 + #define HNS3_PMU_EVT_PPS_MSIX_NIC_TIME 0x10300 223 + 224 + /* filter mode supported by each bandwidth event */ 225 + #define HNS3_PMU_FILTER_BW_SSU_EGU 0x07 226 + #define HNS3_PMU_FILTER_BW_SSU_RPU 0x1f 227 + #define HNS3_PMU_FILTER_BW_SSU_ROCE 0x0f 228 + #define HNS3_PMU_FILTER_BW_ROCE_SSU 0x0f 229 + #define HNS3_PMU_FILTER_BW_TPU_SSU 0x1f 230 + #define HNS3_PMU_FILTER_BW_RPU_RCBRX 0x11 231 + #define HNS3_PMU_FILTER_BW_RCBTX_TXSCH 0x11 232 + #define HNS3_PMU_FILTER_BW_WR_FBD 0x1b 233 + #define HNS3_PMU_FILTER_BW_WR_EBD 0x11 234 + #define HNS3_PMU_FILTER_BW_RD_FBD 0x01 235 + #define HNS3_PMU_FILTER_BW_RD_EBD 0x1b 236 + #define HNS3_PMU_FILTER_BW_RD_PAY_M0 0x01 237 + #define HNS3_PMU_FILTER_BW_RD_PAY_M1 0x01 238 + #define HNS3_PMU_FILTER_BW_WR_PAY_M0 0x01 239 + #define HNS3_PMU_FILTER_BW_WR_PAY_M1 0x01 240 + 241 + /* filter mode supported by each packet rate event */ 242 + #define HNS3_PMU_FILTER_PPS_IGU_SSU 0x07 243 + #define HNS3_PMU_FILTER_PPS_SSU_EGU 0x07 244 + #define HNS3_PMU_FILTER_PPS_SSU_RPU 0x1f 245 + #define HNS3_PMU_FILTER_PPS_SSU_ROCE 0x0f 246 + #define HNS3_PMU_FILTER_PPS_ROCE_SSU 0x0f 247 + #define HNS3_PMU_FILTER_PPS_TPU_SSU 0x1f 248 + #define HNS3_PMU_FILTER_PPS_RPU_RCBRX 0x11 249 + #define HNS3_PMU_FILTER_PPS_RCBTX_TPU 0x1f 250 + #define HNS3_PMU_FILTER_PPS_RCBTX_TXSCH 0x11 251 + #define HNS3_PMU_FILTER_PPS_WR_FBD 0x1b 252 + #define 
HNS3_PMU_FILTER_PPS_WR_EBD 0x11 253 + #define HNS3_PMU_FILTER_PPS_RD_FBD 0x01 254 + #define HNS3_PMU_FILTER_PPS_RD_EBD 0x1b 255 + #define HNS3_PMU_FILTER_PPS_RD_PAY_M0 0x01 256 + #define HNS3_PMU_FILTER_PPS_RD_PAY_M1 0x01 257 + #define HNS3_PMU_FILTER_PPS_WR_PAY_M0 0x01 258 + #define HNS3_PMU_FILTER_PPS_WR_PAY_M1 0x01 259 + #define HNS3_PMU_FILTER_PPS_NICROH_TX_PRE 0x01 260 + #define HNS3_PMU_FILTER_PPS_NICROH_RX_PRE 0x01 261 + 262 + /* filter mode supported by each latency event */ 263 + #define HNS3_PMU_FILTER_DLY_TX_PUSH 0x01 264 + #define HNS3_PMU_FILTER_DLY_TX 0x01 265 + #define HNS3_PMU_FILTER_DLY_SSU_TX_NIC 0x07 266 + #define HNS3_PMU_FILTER_DLY_SSU_TX_ROCE 0x07 267 + #define HNS3_PMU_FILTER_DLY_SSU_RX_NIC 0x07 268 + #define HNS3_PMU_FILTER_DLY_SSU_RX_ROCE 0x07 269 + #define HNS3_PMU_FILTER_DLY_RPU 0x11 270 + #define HNS3_PMU_FILTER_DLY_TPU 0x1f 271 + #define HNS3_PMU_FILTER_DLY_RPE 0x01 272 + #define HNS3_PMU_FILTER_DLY_TPE 0x0b 273 + #define HNS3_PMU_FILTER_DLY_TPE_PUSH 0x1b 274 + #define HNS3_PMU_FILTER_DLY_WR_FBD 0x1b 275 + #define HNS3_PMU_FILTER_DLY_WR_EBD 0x11 276 + #define HNS3_PMU_FILTER_DLY_RD_FBD 0x01 277 + #define HNS3_PMU_FILTER_DLY_RD_EBD 0x1b 278 + #define HNS3_PMU_FILTER_DLY_RD_PAY_M0 0x01 279 + #define HNS3_PMU_FILTER_DLY_RD_PAY_M1 0x01 280 + #define HNS3_PMU_FILTER_DLY_WR_PAY_M0 0x01 281 + #define HNS3_PMU_FILTER_DLY_WR_PAY_M1 0x01 282 + #define HNS3_PMU_FILTER_DLY_MSIX_WRITE 0x01 283 + 284 + /* filter mode supported by each interrupt rate event */ 285 + #define HNS3_PMU_FILTER_INTR_MSIX_NIC 0x01 286 + 287 + enum hns3_pmu_hw_filter_mode { 288 + HNS3_PMU_HW_FILTER_GLOBAL, 289 + HNS3_PMU_HW_FILTER_PORT, 290 + HNS3_PMU_HW_FILTER_PORT_TC, 291 + HNS3_PMU_HW_FILTER_FUNC, 292 + HNS3_PMU_HW_FILTER_FUNC_QUEUE, 293 + HNS3_PMU_HW_FILTER_FUNC_INTR, 294 + }; 295 + 296 + struct hns3_pmu_event_attr { 297 + u32 event; 298 + u16 filter_support; 299 + }; 300 + 301 + struct hns3_pmu { 302 + struct perf_event *hw_events[HNS3_PMU_MAX_HW_EVENTS]; 303 + struct 
hlist_node node; 304 + struct pci_dev *pdev; 305 + struct pmu pmu; 306 + void __iomem *base; 307 + int irq; 308 + int on_cpu; 309 + u32 identifier; 310 + u32 hw_clk_freq; /* hardware clock frequency of PMU */ 311 + /* maximum and minimum bdf allowed by PMU */ 312 + u16 bdf_min; 313 + u16 bdf_max; 314 + }; 315 + 316 + #define to_hns3_pmu(p) (container_of((p), struct hns3_pmu, pmu)) 317 + 318 + #define GET_PCI_DEVFN(bdf) ((bdf) & 0xff) 319 + 320 + #define FILTER_CONDITION_PORT(port) ((1 << (port)) & 0xff) 321 + #define FILTER_CONDITION_PORT_TC(port, tc) (((port) << 3) | ((tc) & 0x07)) 322 + #define FILTER_CONDITION_FUNC_INTR(func, intr) (((intr) << 8) | (func)) 323 + 324 + #define HNS3_PMU_FILTER_ATTR(_name, _config, _start, _end) \ 325 + static inline u64 hns3_pmu_get_##_name(struct perf_event *event) \ 326 + { \ 327 + return FIELD_GET(GENMASK_ULL(_end, _start), \ 328 + event->attr._config); \ 329 + } 330 + 331 + HNS3_PMU_FILTER_ATTR(subevent, config, 0, 7); 332 + HNS3_PMU_FILTER_ATTR(event_type, config, 8, 15); 333 + HNS3_PMU_FILTER_ATTR(ext_counter_used, config, 16, 16); 334 + HNS3_PMU_FILTER_ATTR(port, config1, 0, 3); 335 + HNS3_PMU_FILTER_ATTR(tc, config1, 4, 7); 336 + HNS3_PMU_FILTER_ATTR(bdf, config1, 8, 23); 337 + HNS3_PMU_FILTER_ATTR(queue, config1, 24, 39); 338 + HNS3_PMU_FILTER_ATTR(intr, config1, 40, 51); 339 + HNS3_PMU_FILTER_ATTR(global, config1, 52, 52); 340 + 341 + #define HNS3_BW_EVT_BYTE_NUM(_name) (&(struct hns3_pmu_event_attr) {\ 342 + HNS3_PMU_EVT_BW_##_name##_BYTE_NUM, \ 343 + HNS3_PMU_FILTER_BW_##_name}) 344 + #define HNS3_BW_EVT_TIME(_name) (&(struct hns3_pmu_event_attr) {\ 345 + HNS3_PMU_EVT_BW_##_name##_TIME, \ 346 + HNS3_PMU_FILTER_BW_##_name}) 347 + #define HNS3_PPS_EVT_PACKET_NUM(_name) (&(struct hns3_pmu_event_attr) {\ 348 + HNS3_PMU_EVT_PPS_##_name##_PACKET_NUM, \ 349 + HNS3_PMU_FILTER_PPS_##_name}) 350 + #define HNS3_PPS_EVT_TIME(_name) (&(struct hns3_pmu_event_attr) {\ 351 + HNS3_PMU_EVT_PPS_##_name##_TIME, \ 352 + 
HNS3_PMU_FILTER_PPS_##_name}) 353 + #define HNS3_DLY_EVT_TIME(_name) (&(struct hns3_pmu_event_attr) {\ 354 + HNS3_PMU_EVT_DLY_##_name##_TIME, \ 355 + HNS3_PMU_FILTER_DLY_##_name}) 356 + #define HNS3_DLY_EVT_PACKET_NUM(_name) (&(struct hns3_pmu_event_attr) {\ 357 + HNS3_PMU_EVT_DLY_##_name##_PACKET_NUM, \ 358 + HNS3_PMU_FILTER_DLY_##_name}) 359 + #define HNS3_INTR_EVT_INTR_NUM(_name) (&(struct hns3_pmu_event_attr) {\ 360 + HNS3_PMU_EVT_PPS_##_name##_INTR_NUM, \ 361 + HNS3_PMU_FILTER_INTR_##_name}) 362 + #define HNS3_INTR_EVT_TIME(_name) (&(struct hns3_pmu_event_attr) {\ 363 + HNS3_PMU_EVT_PPS_##_name##_TIME, \ 364 + HNS3_PMU_FILTER_INTR_##_name}) 365 + 366 + static ssize_t hns3_pmu_format_show(struct device *dev, 367 + struct device_attribute *attr, char *buf) 368 + { 369 + struct dev_ext_attribute *eattr; 370 + 371 + eattr = container_of(attr, struct dev_ext_attribute, attr); 372 + 373 + return sysfs_emit(buf, "%s\n", (char *)eattr->var); 374 + } 375 + 376 + static ssize_t hns3_pmu_event_show(struct device *dev, 377 + struct device_attribute *attr, char *buf) 378 + { 379 + struct hns3_pmu_event_attr *event; 380 + struct dev_ext_attribute *eattr; 381 + 382 + eattr = container_of(attr, struct dev_ext_attribute, attr); 383 + event = eattr->var; 384 + 385 + return sysfs_emit(buf, "config=0x%x\n", event->event); 386 + } 387 + 388 + static ssize_t hns3_pmu_filter_mode_show(struct device *dev, 389 + struct device_attribute *attr, 390 + char *buf) 391 + { 392 + struct hns3_pmu_event_attr *event; 393 + struct dev_ext_attribute *eattr; 394 + int len; 395 + 396 + eattr = container_of(attr, struct dev_ext_attribute, attr); 397 + event = eattr->var; 398 + 399 + len = sysfs_emit_at(buf, 0, "filter mode supported: "); 400 + if (event->filter_support & HNS3_PMU_FILTER_SUPPORT_GLOBAL) 401 + len += sysfs_emit_at(buf, len, "global "); 402 + if (event->filter_support & HNS3_PMU_FILTER_SUPPORT_PORT) 403 + len += sysfs_emit_at(buf, len, "port "); 404 + if (event->filter_support & 
HNS3_PMU_FILTER_SUPPORT_PORT_TC) 405 + len += sysfs_emit_at(buf, len, "port-tc "); 406 + if (event->filter_support & HNS3_PMU_FILTER_SUPPORT_FUNC) 407 + len += sysfs_emit_at(buf, len, "func "); 408 + if (event->filter_support & HNS3_PMU_FILTER_SUPPORT_FUNC_QUEUE) 409 + len += sysfs_emit_at(buf, len, "func-queue "); 410 + if (event->filter_support & HNS3_PMU_FILTER_SUPPORT_FUNC_INTR) 411 + len += sysfs_emit_at(buf, len, "func-intr "); 412 + 413 + len += sysfs_emit_at(buf, len, "\n"); 414 + 415 + return len; 416 + } 417 + 418 + #define HNS3_PMU_ATTR(_name, _func, _config) \ 419 + (&((struct dev_ext_attribute[]) { \ 420 + { __ATTR(_name, 0444, _func, NULL), (void *)_config } \ 421 + })[0].attr.attr) 422 + 423 + #define HNS3_PMU_FORMAT_ATTR(_name, _format) \ 424 + HNS3_PMU_ATTR(_name, hns3_pmu_format_show, (void *)_format) 425 + #define HNS3_PMU_EVENT_ATTR(_name, _event) \ 426 + HNS3_PMU_ATTR(_name, hns3_pmu_event_show, (void *)_event) 427 + #define HNS3_PMU_FLT_MODE_ATTR(_name, _event) \ 428 + HNS3_PMU_ATTR(_name, hns3_pmu_filter_mode_show, (void *)_event) 429 + 430 + #define HNS3_PMU_BW_EVT_PAIR(_name, _macro) \ 431 + HNS3_PMU_EVENT_ATTR(_name##_byte_num, HNS3_BW_EVT_BYTE_NUM(_macro)), \ 432 + HNS3_PMU_EVENT_ATTR(_name##_time, HNS3_BW_EVT_TIME(_macro)) 433 + #define HNS3_PMU_PPS_EVT_PAIR(_name, _macro) \ 434 + HNS3_PMU_EVENT_ATTR(_name##_packet_num, HNS3_PPS_EVT_PACKET_NUM(_macro)), \ 435 + HNS3_PMU_EVENT_ATTR(_name##_time, HNS3_PPS_EVT_TIME(_macro)) 436 + #define HNS3_PMU_DLY_EVT_PAIR(_name, _macro) \ 437 + HNS3_PMU_EVENT_ATTR(_name##_time, HNS3_DLY_EVT_TIME(_macro)), \ 438 + HNS3_PMU_EVENT_ATTR(_name##_packet_num, HNS3_DLY_EVT_PACKET_NUM(_macro)) 439 + #define HNS3_PMU_INTR_EVT_PAIR(_name, _macro) \ 440 + HNS3_PMU_EVENT_ATTR(_name##_intr_num, HNS3_INTR_EVT_INTR_NUM(_macro)), \ 441 + HNS3_PMU_EVENT_ATTR(_name##_time, HNS3_INTR_EVT_TIME(_macro)) 442 + 443 + #define HNS3_PMU_BW_FLT_MODE_PAIR(_name, _macro) \ 444 + HNS3_PMU_FLT_MODE_ATTR(_name##_byte_num, 
HNS3_BW_EVT_BYTE_NUM(_macro)), \ 445 + HNS3_PMU_FLT_MODE_ATTR(_name##_time, HNS3_BW_EVT_TIME(_macro)) 446 + #define HNS3_PMU_PPS_FLT_MODE_PAIR(_name, _macro) \ 447 + HNS3_PMU_FLT_MODE_ATTR(_name##_packet_num, HNS3_PPS_EVT_PACKET_NUM(_macro)), \ 448 + HNS3_PMU_FLT_MODE_ATTR(_name##_time, HNS3_PPS_EVT_TIME(_macro)) 449 + #define HNS3_PMU_DLY_FLT_MODE_PAIR(_name, _macro) \ 450 + HNS3_PMU_FLT_MODE_ATTR(_name##_time, HNS3_DLY_EVT_TIME(_macro)), \ 451 + HNS3_PMU_FLT_MODE_ATTR(_name##_packet_num, HNS3_DLY_EVT_PACKET_NUM(_macro)) 452 + #define HNS3_PMU_INTR_FLT_MODE_PAIR(_name, _macro) \ 453 + HNS3_PMU_FLT_MODE_ATTR(_name##_intr_num, HNS3_INTR_EVT_INTR_NUM(_macro)), \ 454 + HNS3_PMU_FLT_MODE_ATTR(_name##_time, HNS3_INTR_EVT_TIME(_macro)) 455 + 456 + static u8 hns3_pmu_hw_filter_modes[] = { 457 + HNS3_PMU_HW_FILTER_GLOBAL, 458 + HNS3_PMU_HW_FILTER_PORT, 459 + HNS3_PMU_HW_FILTER_PORT_TC, 460 + HNS3_PMU_HW_FILTER_FUNC, 461 + HNS3_PMU_HW_FILTER_FUNC_QUEUE, 462 + HNS3_PMU_HW_FILTER_FUNC_INTR, 463 + }; 464 + 465 + #define HNS3_PMU_SET_HW_FILTER(_hwc, _mode) \ 466 + ((_hwc)->addr_filters = (void *)&hns3_pmu_hw_filter_modes[(_mode)]) 467 + 468 + static ssize_t identifier_show(struct device *dev, 469 + struct device_attribute *attr, char *buf) 470 + { 471 + struct hns3_pmu *hns3_pmu = to_hns3_pmu(dev_get_drvdata(dev)); 472 + 473 + return sysfs_emit(buf, "0x%x\n", hns3_pmu->identifier); 474 + } 475 + static DEVICE_ATTR_RO(identifier); 476 + 477 + static ssize_t cpumask_show(struct device *dev, struct device_attribute *attr, 478 + char *buf) 479 + { 480 + struct hns3_pmu *hns3_pmu = to_hns3_pmu(dev_get_drvdata(dev)); 481 + 482 + return sysfs_emit(buf, "%d\n", hns3_pmu->on_cpu); 483 + } 484 + static DEVICE_ATTR_RO(cpumask); 485 + 486 + static ssize_t bdf_min_show(struct device *dev, struct device_attribute *attr, 487 + char *buf) 488 + { 489 + struct hns3_pmu *hns3_pmu = to_hns3_pmu(dev_get_drvdata(dev)); 490 + u16 bdf = hns3_pmu->bdf_min; 491 + 492 + return sysfs_emit(buf, 
"%02x:%02x.%x\n", PCI_BUS_NUM(bdf), 493 + PCI_SLOT(bdf), PCI_FUNC(bdf)); 494 + } 495 + static DEVICE_ATTR_RO(bdf_min); 496 + 497 + static ssize_t bdf_max_show(struct device *dev, struct device_attribute *attr, 498 + char *buf) 499 + { 500 + struct hns3_pmu *hns3_pmu = to_hns3_pmu(dev_get_drvdata(dev)); 501 + u16 bdf = hns3_pmu->bdf_max; 502 + 503 + return sysfs_emit(buf, "%02x:%02x.%x\n", PCI_BUS_NUM(bdf), 504 + PCI_SLOT(bdf), PCI_FUNC(bdf)); 505 + } 506 + static DEVICE_ATTR_RO(bdf_max); 507 + 508 + static ssize_t hw_clk_freq_show(struct device *dev, 509 + struct device_attribute *attr, char *buf) 510 + { 511 + struct hns3_pmu *hns3_pmu = to_hns3_pmu(dev_get_drvdata(dev)); 512 + 513 + return sysfs_emit(buf, "%u\n", hns3_pmu->hw_clk_freq); 514 + } 515 + static DEVICE_ATTR_RO(hw_clk_freq); 516 + 517 + static struct attribute *hns3_pmu_events_attr[] = { 518 + /* bandwidth events */ 519 + HNS3_PMU_BW_EVT_PAIR(bw_ssu_egu, SSU_EGU), 520 + HNS3_PMU_BW_EVT_PAIR(bw_ssu_rpu, SSU_RPU), 521 + HNS3_PMU_BW_EVT_PAIR(bw_ssu_roce, SSU_ROCE), 522 + HNS3_PMU_BW_EVT_PAIR(bw_roce_ssu, ROCE_SSU), 523 + HNS3_PMU_BW_EVT_PAIR(bw_tpu_ssu, TPU_SSU), 524 + HNS3_PMU_BW_EVT_PAIR(bw_rpu_rcbrx, RPU_RCBRX), 525 + HNS3_PMU_BW_EVT_PAIR(bw_rcbtx_txsch, RCBTX_TXSCH), 526 + HNS3_PMU_BW_EVT_PAIR(bw_wr_fbd, WR_FBD), 527 + HNS3_PMU_BW_EVT_PAIR(bw_wr_ebd, WR_EBD), 528 + HNS3_PMU_BW_EVT_PAIR(bw_rd_fbd, RD_FBD), 529 + HNS3_PMU_BW_EVT_PAIR(bw_rd_ebd, RD_EBD), 530 + HNS3_PMU_BW_EVT_PAIR(bw_rd_pay_m0, RD_PAY_M0), 531 + HNS3_PMU_BW_EVT_PAIR(bw_rd_pay_m1, RD_PAY_M1), 532 + HNS3_PMU_BW_EVT_PAIR(bw_wr_pay_m0, WR_PAY_M0), 533 + HNS3_PMU_BW_EVT_PAIR(bw_wr_pay_m1, WR_PAY_M1), 534 + 535 + /* packet rate events */ 536 + HNS3_PMU_PPS_EVT_PAIR(pps_igu_ssu, IGU_SSU), 537 + HNS3_PMU_PPS_EVT_PAIR(pps_ssu_egu, SSU_EGU), 538 + HNS3_PMU_PPS_EVT_PAIR(pps_ssu_rpu, SSU_RPU), 539 + HNS3_PMU_PPS_EVT_PAIR(pps_ssu_roce, SSU_ROCE), 540 + HNS3_PMU_PPS_EVT_PAIR(pps_roce_ssu, ROCE_SSU), 541 + HNS3_PMU_PPS_EVT_PAIR(pps_tpu_ssu, TPU_SSU), 
542 + HNS3_PMU_PPS_EVT_PAIR(pps_rpu_rcbrx, RPU_RCBRX), 543 + HNS3_PMU_PPS_EVT_PAIR(pps_rcbtx_tpu, RCBTX_TPU), 544 + HNS3_PMU_PPS_EVT_PAIR(pps_rcbtx_txsch, RCBTX_TXSCH), 545 + HNS3_PMU_PPS_EVT_PAIR(pps_wr_fbd, WR_FBD), 546 + HNS3_PMU_PPS_EVT_PAIR(pps_wr_ebd, WR_EBD), 547 + HNS3_PMU_PPS_EVT_PAIR(pps_rd_fbd, RD_FBD), 548 + HNS3_PMU_PPS_EVT_PAIR(pps_rd_ebd, RD_EBD), 549 + HNS3_PMU_PPS_EVT_PAIR(pps_rd_pay_m0, RD_PAY_M0), 550 + HNS3_PMU_PPS_EVT_PAIR(pps_rd_pay_m1, RD_PAY_M1), 551 + HNS3_PMU_PPS_EVT_PAIR(pps_wr_pay_m0, WR_PAY_M0), 552 + HNS3_PMU_PPS_EVT_PAIR(pps_wr_pay_m1, WR_PAY_M1), 553 + HNS3_PMU_PPS_EVT_PAIR(pps_intr_nicroh_tx_pre, NICROH_TX_PRE), 554 + HNS3_PMU_PPS_EVT_PAIR(pps_intr_nicroh_rx_pre, NICROH_RX_PRE), 555 + 556 + /* latency events */ 557 + HNS3_PMU_DLY_EVT_PAIR(dly_tx_push_to_mac, TX_PUSH), 558 + HNS3_PMU_DLY_EVT_PAIR(dly_tx_normal_to_mac, TX), 559 + HNS3_PMU_DLY_EVT_PAIR(dly_ssu_tx_th_nic, SSU_TX_NIC), 560 + HNS3_PMU_DLY_EVT_PAIR(dly_ssu_tx_th_roce, SSU_TX_ROCE), 561 + HNS3_PMU_DLY_EVT_PAIR(dly_ssu_rx_th_nic, SSU_RX_NIC), 562 + HNS3_PMU_DLY_EVT_PAIR(dly_ssu_rx_th_roce, SSU_RX_ROCE), 563 + HNS3_PMU_DLY_EVT_PAIR(dly_rpu, RPU), 564 + HNS3_PMU_DLY_EVT_PAIR(dly_tpu, TPU), 565 + HNS3_PMU_DLY_EVT_PAIR(dly_rpe, RPE), 566 + HNS3_PMU_DLY_EVT_PAIR(dly_tpe_normal, TPE), 567 + HNS3_PMU_DLY_EVT_PAIR(dly_tpe_push, TPE_PUSH), 568 + HNS3_PMU_DLY_EVT_PAIR(dly_wr_fbd, WR_FBD), 569 + HNS3_PMU_DLY_EVT_PAIR(dly_wr_ebd, WR_EBD), 570 + HNS3_PMU_DLY_EVT_PAIR(dly_rd_fbd, RD_FBD), 571 + HNS3_PMU_DLY_EVT_PAIR(dly_rd_ebd, RD_EBD), 572 + HNS3_PMU_DLY_EVT_PAIR(dly_rd_pay_m0, RD_PAY_M0), 573 + HNS3_PMU_DLY_EVT_PAIR(dly_rd_pay_m1, RD_PAY_M1), 574 + HNS3_PMU_DLY_EVT_PAIR(dly_wr_pay_m0, WR_PAY_M0), 575 + HNS3_PMU_DLY_EVT_PAIR(dly_wr_pay_m1, WR_PAY_M1), 576 + HNS3_PMU_DLY_EVT_PAIR(dly_msix_write, MSIX_WRITE), 577 + 578 + /* interrupt rate events */ 579 + HNS3_PMU_INTR_EVT_PAIR(pps_intr_msix_nic, MSIX_NIC), 580 + 581 + NULL 582 + }; 583 + 584 + static struct attribute 
*hns3_pmu_filter_mode_attr[] = { 585 + /* bandwidth events */ 586 + HNS3_PMU_BW_FLT_MODE_PAIR(bw_ssu_egu, SSU_EGU), 587 + HNS3_PMU_BW_FLT_MODE_PAIR(bw_ssu_rpu, SSU_RPU), 588 + HNS3_PMU_BW_FLT_MODE_PAIR(bw_ssu_roce, SSU_ROCE), 589 + HNS3_PMU_BW_FLT_MODE_PAIR(bw_roce_ssu, ROCE_SSU), 590 + HNS3_PMU_BW_FLT_MODE_PAIR(bw_tpu_ssu, TPU_SSU), 591 + HNS3_PMU_BW_FLT_MODE_PAIR(bw_rpu_rcbrx, RPU_RCBRX), 592 + HNS3_PMU_BW_FLT_MODE_PAIR(bw_rcbtx_txsch, RCBTX_TXSCH), 593 + HNS3_PMU_BW_FLT_MODE_PAIR(bw_wr_fbd, WR_FBD), 594 + HNS3_PMU_BW_FLT_MODE_PAIR(bw_wr_ebd, WR_EBD), 595 + HNS3_PMU_BW_FLT_MODE_PAIR(bw_rd_fbd, RD_FBD), 596 + HNS3_PMU_BW_FLT_MODE_PAIR(bw_rd_ebd, RD_EBD), 597 + HNS3_PMU_BW_FLT_MODE_PAIR(bw_rd_pay_m0, RD_PAY_M0), 598 + HNS3_PMU_BW_FLT_MODE_PAIR(bw_rd_pay_m1, RD_PAY_M1), 599 + HNS3_PMU_BW_FLT_MODE_PAIR(bw_wr_pay_m0, WR_PAY_M0), 600 + HNS3_PMU_BW_FLT_MODE_PAIR(bw_wr_pay_m1, WR_PAY_M1), 601 + 602 + /* packet rate events */ 603 + HNS3_PMU_PPS_FLT_MODE_PAIR(pps_igu_ssu, IGU_SSU), 604 + HNS3_PMU_PPS_FLT_MODE_PAIR(pps_ssu_egu, SSU_EGU), 605 + HNS3_PMU_PPS_FLT_MODE_PAIR(pps_ssu_rpu, SSU_RPU), 606 + HNS3_PMU_PPS_FLT_MODE_PAIR(pps_ssu_roce, SSU_ROCE), 607 + HNS3_PMU_PPS_FLT_MODE_PAIR(pps_roce_ssu, ROCE_SSU), 608 + HNS3_PMU_PPS_FLT_MODE_PAIR(pps_tpu_ssu, TPU_SSU), 609 + HNS3_PMU_PPS_FLT_MODE_PAIR(pps_rpu_rcbrx, RPU_RCBRX), 610 + HNS3_PMU_PPS_FLT_MODE_PAIR(pps_rcbtx_tpu, RCBTX_TPU), 611 + HNS3_PMU_PPS_FLT_MODE_PAIR(pps_rcbtx_txsch, RCBTX_TXSCH), 612 + HNS3_PMU_PPS_FLT_MODE_PAIR(pps_wr_fbd, WR_FBD), 613 + HNS3_PMU_PPS_FLT_MODE_PAIR(pps_wr_ebd, WR_EBD), 614 + HNS3_PMU_PPS_FLT_MODE_PAIR(pps_rd_fbd, RD_FBD), 615 + HNS3_PMU_PPS_FLT_MODE_PAIR(pps_rd_ebd, RD_EBD), 616 + HNS3_PMU_PPS_FLT_MODE_PAIR(pps_rd_pay_m0, RD_PAY_M0), 617 + HNS3_PMU_PPS_FLT_MODE_PAIR(pps_rd_pay_m1, RD_PAY_M1), 618 + HNS3_PMU_PPS_FLT_MODE_PAIR(pps_wr_pay_m0, WR_PAY_M0), 619 + HNS3_PMU_PPS_FLT_MODE_PAIR(pps_wr_pay_m1, WR_PAY_M1), 620 + HNS3_PMU_PPS_FLT_MODE_PAIR(pps_intr_nicroh_tx_pre, NICROH_TX_PRE), 621 + 
HNS3_PMU_PPS_FLT_MODE_PAIR(pps_intr_nicroh_rx_pre, NICROH_RX_PRE), 622 + 623 + /* latency events */ 624 + HNS3_PMU_DLY_FLT_MODE_PAIR(dly_tx_push_to_mac, TX_PUSH), 625 + HNS3_PMU_DLY_FLT_MODE_PAIR(dly_tx_normal_to_mac, TX), 626 + HNS3_PMU_DLY_FLT_MODE_PAIR(dly_ssu_tx_th_nic, SSU_TX_NIC), 627 + HNS3_PMU_DLY_FLT_MODE_PAIR(dly_ssu_tx_th_roce, SSU_TX_ROCE), 628 + HNS3_PMU_DLY_FLT_MODE_PAIR(dly_ssu_rx_th_nic, SSU_RX_NIC), 629 + HNS3_PMU_DLY_FLT_MODE_PAIR(dly_ssu_rx_th_roce, SSU_RX_ROCE), 630 + HNS3_PMU_DLY_FLT_MODE_PAIR(dly_rpu, RPU), 631 + HNS3_PMU_DLY_FLT_MODE_PAIR(dly_tpu, TPU), 632 + HNS3_PMU_DLY_FLT_MODE_PAIR(dly_rpe, RPE), 633 + HNS3_PMU_DLY_FLT_MODE_PAIR(dly_tpe_normal, TPE), 634 + HNS3_PMU_DLY_FLT_MODE_PAIR(dly_tpe_push, TPE_PUSH), 635 + HNS3_PMU_DLY_FLT_MODE_PAIR(dly_wr_fbd, WR_FBD), 636 + HNS3_PMU_DLY_FLT_MODE_PAIR(dly_wr_ebd, WR_EBD), 637 + HNS3_PMU_DLY_FLT_MODE_PAIR(dly_rd_fbd, RD_FBD), 638 + HNS3_PMU_DLY_FLT_MODE_PAIR(dly_rd_ebd, RD_EBD), 639 + HNS3_PMU_DLY_FLT_MODE_PAIR(dly_rd_pay_m0, RD_PAY_M0), 640 + HNS3_PMU_DLY_FLT_MODE_PAIR(dly_rd_pay_m1, RD_PAY_M1), 641 + HNS3_PMU_DLY_FLT_MODE_PAIR(dly_wr_pay_m0, WR_PAY_M0), 642 + HNS3_PMU_DLY_FLT_MODE_PAIR(dly_wr_pay_m1, WR_PAY_M1), 643 + HNS3_PMU_DLY_FLT_MODE_PAIR(dly_msix_write, MSIX_WRITE), 644 + 645 + /* interrupt rate events */ 646 + HNS3_PMU_INTR_FLT_MODE_PAIR(pps_intr_msix_nic, MSIX_NIC), 647 + 648 + NULL 649 + }; 650 + 651 + static struct attribute_group hns3_pmu_events_group = { 652 + .name = "events", 653 + .attrs = hns3_pmu_events_attr, 654 + }; 655 + 656 + static struct attribute_group hns3_pmu_filter_mode_group = { 657 + .name = "filtermode", 658 + .attrs = hns3_pmu_filter_mode_attr, 659 + }; 660 + 661 + static struct attribute *hns3_pmu_format_attr[] = { 662 + HNS3_PMU_FORMAT_ATTR(subevent, "config:0-7"), 663 + HNS3_PMU_FORMAT_ATTR(event_type, "config:8-15"), 664 + HNS3_PMU_FORMAT_ATTR(ext_counter_used, "config:16"), 665 + HNS3_PMU_FORMAT_ATTR(port, "config1:0-3"), 666 + HNS3_PMU_FORMAT_ATTR(tc, 
"config1:4-7"), 667 + HNS3_PMU_FORMAT_ATTR(bdf, "config1:8-23"), 668 + HNS3_PMU_FORMAT_ATTR(queue, "config1:24-39"), 669 + HNS3_PMU_FORMAT_ATTR(intr, "config1:40-51"), 670 + HNS3_PMU_FORMAT_ATTR(global, "config1:52"), 671 + NULL 672 + }; 673 + 674 + static struct attribute_group hns3_pmu_format_group = { 675 + .name = "format", 676 + .attrs = hns3_pmu_format_attr, 677 + }; 678 + 679 + static struct attribute *hns3_pmu_cpumask_attrs[] = { 680 + &dev_attr_cpumask.attr, 681 + NULL 682 + }; 683 + 684 + static struct attribute_group hns3_pmu_cpumask_attr_group = { 685 + .attrs = hns3_pmu_cpumask_attrs, 686 + }; 687 + 688 + static struct attribute *hns3_pmu_identifier_attrs[] = { 689 + &dev_attr_identifier.attr, 690 + NULL 691 + }; 692 + 693 + static struct attribute_group hns3_pmu_identifier_attr_group = { 694 + .attrs = hns3_pmu_identifier_attrs, 695 + }; 696 + 697 + static struct attribute *hns3_pmu_bdf_range_attrs[] = { 698 + &dev_attr_bdf_min.attr, 699 + &dev_attr_bdf_max.attr, 700 + NULL 701 + }; 702 + 703 + static struct attribute_group hns3_pmu_bdf_range_attr_group = { 704 + .attrs = hns3_pmu_bdf_range_attrs, 705 + }; 706 + 707 + static struct attribute *hns3_pmu_hw_clk_freq_attrs[] = { 708 + &dev_attr_hw_clk_freq.attr, 709 + NULL 710 + }; 711 + 712 + static struct attribute_group hns3_pmu_hw_clk_freq_attr_group = { 713 + .attrs = hns3_pmu_hw_clk_freq_attrs, 714 + }; 715 + 716 + static const struct attribute_group *hns3_pmu_attr_groups[] = { 717 + &hns3_pmu_events_group, 718 + &hns3_pmu_filter_mode_group, 719 + &hns3_pmu_format_group, 720 + &hns3_pmu_cpumask_attr_group, 721 + &hns3_pmu_identifier_attr_group, 722 + &hns3_pmu_bdf_range_attr_group, 723 + &hns3_pmu_hw_clk_freq_attr_group, 724 + NULL 725 + }; 726 + 727 + static u32 hns3_pmu_get_event(struct perf_event *event) 728 + { 729 + return hns3_pmu_get_ext_counter_used(event) << 16 | 730 + hns3_pmu_get_event_type(event) << 8 | 731 + hns3_pmu_get_subevent(event); 732 + } 733 + 734 + static u32 
hns3_pmu_get_real_event(struct perf_event *event) 735 + { 736 + return hns3_pmu_get_event_type(event) << 8 | 737 + hns3_pmu_get_subevent(event); 738 + } 739 + 740 + static u32 hns3_pmu_get_offset(u32 offset, u32 idx) 741 + { 742 + return offset + HNS3_PMU_REG_EVENT_OFFSET + 743 + HNS3_PMU_REG_EVENT_SIZE * idx; 744 + } 745 + 746 + static u32 hns3_pmu_readl(struct hns3_pmu *hns3_pmu, u32 reg_offset, u32 idx) 747 + { 748 + u32 offset = hns3_pmu_get_offset(reg_offset, idx); 749 + 750 + return readl(hns3_pmu->base + offset); 751 + } 752 + 753 + static void hns3_pmu_writel(struct hns3_pmu *hns3_pmu, u32 reg_offset, u32 idx, 754 + u32 val) 755 + { 756 + u32 offset = hns3_pmu_get_offset(reg_offset, idx); 757 + 758 + writel(val, hns3_pmu->base + offset); 759 + } 760 + 761 + static u64 hns3_pmu_readq(struct hns3_pmu *hns3_pmu, u32 reg_offset, u32 idx) 762 + { 763 + u32 offset = hns3_pmu_get_offset(reg_offset, idx); 764 + 765 + return readq(hns3_pmu->base + offset); 766 + } 767 + 768 + static void hns3_pmu_writeq(struct hns3_pmu *hns3_pmu, u32 reg_offset, u32 idx, 769 + u64 val) 770 + { 771 + u32 offset = hns3_pmu_get_offset(reg_offset, idx); 772 + 773 + writeq(val, hns3_pmu->base + offset); 774 + } 775 + 776 + static bool hns3_pmu_cmp_event(struct perf_event *target, 777 + struct perf_event *event) 778 + { 779 + return hns3_pmu_get_real_event(target) == hns3_pmu_get_real_event(event); 780 + } 781 + 782 + static int hns3_pmu_find_related_event_idx(struct hns3_pmu *hns3_pmu, 783 + struct perf_event *event) 784 + { 785 + struct perf_event *sibling; 786 + int hw_event_used = 0; 787 + int idx; 788 + 789 + for (idx = 0; idx < HNS3_PMU_MAX_HW_EVENTS; idx++) { 790 + sibling = hns3_pmu->hw_events[idx]; 791 + if (!sibling) 792 + continue; 793 + 794 + hw_event_used++; 795 + 796 + if (!hns3_pmu_cmp_event(sibling, event)) 797 + continue; 798 + 799 + /* A related event is in use in the same group */ 800 + if (sibling->group_leader == event->group_leader) 801 + return idx; 802 + } 803 + 804 + /* No 
related event and all hardware events are used up */ 805 + if (hw_event_used >= HNS3_PMU_MAX_HW_EVENTS) 806 + return -EBUSY; 807 + 808 + /* No related event, but spare hardware events are available */ 809 + return -ENOENT; 810 + } 811 + 812 + static int hns3_pmu_get_event_idx(struct hns3_pmu *hns3_pmu) 813 + { 814 + int idx; 815 + 816 + for (idx = 0; idx < HNS3_PMU_MAX_HW_EVENTS; idx++) { 817 + if (!hns3_pmu->hw_events[idx]) 818 + return idx; 819 + } 820 + 821 + return -EBUSY; 822 + } 823 + 824 + static bool hns3_pmu_valid_bdf(struct hns3_pmu *hns3_pmu, u16 bdf) 825 + { 826 + struct pci_dev *pdev; 827 + 828 + if (bdf < hns3_pmu->bdf_min || bdf > hns3_pmu->bdf_max) { 829 + pci_err(hns3_pmu->pdev, "Invalid EP device: %#x!\n", bdf); 830 + return false; 831 + } 832 + 833 + pdev = pci_get_domain_bus_and_slot(pci_domain_nr(hns3_pmu->pdev->bus), 834 + PCI_BUS_NUM(bdf), 835 + GET_PCI_DEVFN(bdf)); 836 + if (!pdev) { 837 + pci_err(hns3_pmu->pdev, "Nonexistent EP device: %#x!\n", bdf); 838 + return false; 839 + } 840 + 841 + pci_dev_put(pdev); 842 + return true; 843 + } 844 + 845 + static void hns3_pmu_set_qid_para(struct hns3_pmu *hns3_pmu, u32 idx, u16 bdf, 846 + u16 queue) 847 + { 848 + u32 val; 849 + 850 + val = GET_PCI_DEVFN(bdf); 851 + val |= (u32)queue << HNS3_PMU_QID_PARA_QUEUE_S; 852 + hns3_pmu_writel(hns3_pmu, HNS3_PMU_REG_EVENT_QID_PARA, idx, val); 853 + } 854 + 855 + static bool hns3_pmu_qid_req_start(struct hns3_pmu *hns3_pmu, u32 idx) 856 + { 857 + bool queue_id_valid = false; 858 + u32 reg_qid_ctrl, val; 859 + int err; 860 + 861 + /* enable queue id request */ 862 + hns3_pmu_writel(hns3_pmu, HNS3_PMU_REG_EVENT_QID_CTRL, idx, 863 + HNS3_PMU_QID_CTRL_REQ_ENABLE); 864 + 865 + reg_qid_ctrl = hns3_pmu_get_offset(HNS3_PMU_REG_EVENT_QID_CTRL, idx); 866 + err = readl_poll_timeout(hns3_pmu->base + reg_qid_ctrl, val, 867 + val & HNS3_PMU_QID_CTRL_DONE, 1, 1000); 868 + if (err == -ETIMEDOUT) { 869 + pci_err(hns3_pmu->pdev, "QID request timeout!\n"); 870 + goto out; 
871 + } 872 + 873 + queue_id_valid = !(val & HNS3_PMU_QID_CTRL_MISS); 874 + 875 + out: 876 + /* disable qid request and clear status */ 877 + hns3_pmu_writel(hns3_pmu, HNS3_PMU_REG_EVENT_QID_CTRL, idx, 0); 878 + 879 + return queue_id_valid; 880 + } 881 + 882 + static bool hns3_pmu_valid_queue(struct hns3_pmu *hns3_pmu, u32 idx, u16 bdf, 883 + u16 queue) 884 + { 885 + hns3_pmu_set_qid_para(hns3_pmu, idx, bdf, queue); 886 + 887 + return hns3_pmu_qid_req_start(hns3_pmu, idx); 888 + } 889 + 890 + static struct hns3_pmu_event_attr *hns3_pmu_get_pmu_event(u32 event) 891 + { 892 + struct hns3_pmu_event_attr *pmu_event; 893 + struct dev_ext_attribute *eattr; 894 + struct device_attribute *dattr; 895 + struct attribute *attr; 896 + u32 i; 897 + 898 + for (i = 0; i < ARRAY_SIZE(hns3_pmu_events_attr) - 1; i++) { 899 + attr = hns3_pmu_events_attr[i]; 900 + dattr = container_of(attr, struct device_attribute, attr); 901 + eattr = container_of(dattr, struct dev_ext_attribute, attr); 902 + pmu_event = eattr->var; 903 + 904 + if (event == pmu_event->event) 905 + return pmu_event; 906 + } 907 + 908 + return NULL; 909 + } 910 + 911 + static int hns3_pmu_set_func_mode(struct perf_event *event, 912 + struct hns3_pmu *hns3_pmu) 913 + { 914 + struct hw_perf_event *hwc = &event->hw; 915 + u16 bdf = hns3_pmu_get_bdf(event); 916 + 917 + if (!hns3_pmu_valid_bdf(hns3_pmu, bdf)) 918 + return -ENOENT; 919 + 920 + HNS3_PMU_SET_HW_FILTER(hwc, HNS3_PMU_HW_FILTER_FUNC); 921 + 922 + return 0; 923 + } 924 + 925 + static int hns3_pmu_set_func_queue_mode(struct perf_event *event, 926 + struct hns3_pmu *hns3_pmu) 927 + { 928 + u16 queue_id = hns3_pmu_get_queue(event); 929 + struct hw_perf_event *hwc = &event->hw; 930 + u16 bdf = hns3_pmu_get_bdf(event); 931 + 932 + if (!hns3_pmu_valid_bdf(hns3_pmu, bdf)) 933 + return -ENOENT; 934 + 935 + if (!hns3_pmu_valid_queue(hns3_pmu, hwc->idx, bdf, queue_id)) { 936 + pci_err(hns3_pmu->pdev, "Invalid queue: %u\n", queue_id); 937 + return -ENOENT; 938 + } 939 + 940 
+ HNS3_PMU_SET_HW_FILTER(hwc, HNS3_PMU_HW_FILTER_FUNC_QUEUE); 941 + 942 + return 0; 943 + } 944 + 945 + static bool 946 + hns3_pmu_is_enabled_global_mode(struct perf_event *event, 947 + struct hns3_pmu_event_attr *pmu_event) 948 + { 949 + u8 global = hns3_pmu_get_global(event); 950 + 951 + if (!(pmu_event->filter_support & HNS3_PMU_FILTER_SUPPORT_GLOBAL)) 952 + return false; 953 + 954 + return global; 955 + } 956 + 957 + static bool hns3_pmu_is_enabled_func_mode(struct perf_event *event, 958 + struct hns3_pmu_event_attr *pmu_event) 959 + { 960 + u16 queue_id = hns3_pmu_get_queue(event); 961 + u16 bdf = hns3_pmu_get_bdf(event); 962 + 963 + if (!(pmu_event->filter_support & HNS3_PMU_FILTER_SUPPORT_FUNC)) 964 + return false; 965 + else if (queue_id != HNS3_PMU_FILTER_ALL_QUEUE) 966 + return false; 967 + 968 + return bdf; 969 + } 970 + 971 + static bool 972 + hns3_pmu_is_enabled_func_queue_mode(struct perf_event *event, 973 + struct hns3_pmu_event_attr *pmu_event) 974 + { 975 + u16 queue_id = hns3_pmu_get_queue(event); 976 + u16 bdf = hns3_pmu_get_bdf(event); 977 + 978 + if (!(pmu_event->filter_support & HNS3_PMU_FILTER_SUPPORT_FUNC_QUEUE)) 979 + return false; 980 + else if (queue_id == HNS3_PMU_FILTER_ALL_QUEUE) 981 + return false; 982 + 983 + return bdf; 984 + } 985 + 986 + static bool hns3_pmu_is_enabled_port_mode(struct perf_event *event, 987 + struct hns3_pmu_event_attr *pmu_event) 988 + { 989 + u8 tc_id = hns3_pmu_get_tc(event); 990 + 991 + if (!(pmu_event->filter_support & HNS3_PMU_FILTER_SUPPORT_PORT)) 992 + return false; 993 + 994 + return tc_id == HNS3_PMU_FILTER_ALL_TC; 995 + } 996 + 997 + static bool 998 + hns3_pmu_is_enabled_port_tc_mode(struct perf_event *event, 999 + struct hns3_pmu_event_attr *pmu_event) 1000 + { 1001 + u8 tc_id = hns3_pmu_get_tc(event); 1002 + 1003 + if (!(pmu_event->filter_support & HNS3_PMU_FILTER_SUPPORT_PORT_TC)) 1004 + return false; 1005 + 1006 + return tc_id != HNS3_PMU_FILTER_ALL_TC; 1007 + } 1008 + 1009 + static bool 1010 + 
hns3_pmu_is_enabled_func_intr_mode(struct perf_event *event,
+				   struct hns3_pmu *hns3_pmu,
+				   struct hns3_pmu_event_attr *pmu_event)
+{
+	u16 bdf = hns3_pmu_get_bdf(event);
+
+	if (!(pmu_event->filter_support & HNS3_PMU_FILTER_SUPPORT_FUNC_INTR))
+		return false;
+
+	return hns3_pmu_valid_bdf(hns3_pmu, bdf);
+}
+
+static int hns3_pmu_select_filter_mode(struct perf_event *event,
+				       struct hns3_pmu *hns3_pmu)
+{
+	u32 event_id = hns3_pmu_get_event(event);
+	struct hw_perf_event *hwc = &event->hw;
+	struct hns3_pmu_event_attr *pmu_event;
+
+	pmu_event = hns3_pmu_get_pmu_event(event_id);
+	if (!pmu_event) {
+		pci_err(hns3_pmu->pdev, "Invalid pmu event\n");
+		return -ENOENT;
+	}
+
+	if (hns3_pmu_is_enabled_global_mode(event, pmu_event)) {
+		HNS3_PMU_SET_HW_FILTER(hwc, HNS3_PMU_HW_FILTER_GLOBAL);
+		return 0;
+	}
+
+	if (hns3_pmu_is_enabled_func_mode(event, pmu_event))
+		return hns3_pmu_set_func_mode(event, hns3_pmu);
+
+	if (hns3_pmu_is_enabled_func_queue_mode(event, pmu_event))
+		return hns3_pmu_set_func_queue_mode(event, hns3_pmu);
+
+	if (hns3_pmu_is_enabled_port_mode(event, pmu_event)) {
+		HNS3_PMU_SET_HW_FILTER(hwc, HNS3_PMU_HW_FILTER_PORT);
+		return 0;
+	}
+
+	if (hns3_pmu_is_enabled_port_tc_mode(event, pmu_event)) {
+		HNS3_PMU_SET_HW_FILTER(hwc, HNS3_PMU_HW_FILTER_PORT_TC);
+		return 0;
+	}
+
+	if (hns3_pmu_is_enabled_func_intr_mode(event, hns3_pmu, pmu_event)) {
+		HNS3_PMU_SET_HW_FILTER(hwc, HNS3_PMU_HW_FILTER_FUNC_INTR);
+		return 0;
+	}
+
+	return -ENOENT;
+}
+
+static bool hns3_pmu_validate_event_group(struct perf_event *event)
+{
+	struct perf_event *sibling, *leader = event->group_leader;
+	struct perf_event *event_group[HNS3_PMU_MAX_HW_EVENTS];
+	int counters = 1;
+	int num;
+
+	event_group[0] = leader;
+	if (!is_software_event(leader)) {
+		if (leader->pmu != event->pmu)
+			return false;
+
+		if (leader != event && !hns3_pmu_cmp_event(leader, event))
+			event_group[counters++] = event;
+	}
+
+	for_each_sibling_event(sibling, event->group_leader) {
+		if (is_software_event(sibling))
+			continue;
+
+		if (sibling->pmu != event->pmu)
+			return false;
+
+		for (num = 0; num < counters; num++) {
+			if (hns3_pmu_cmp_event(event_group[num], sibling))
+				break;
+		}
+
+		if (num == counters)
+			event_group[counters++] = sibling;
+	}
+
+	return counters <= HNS3_PMU_MAX_HW_EVENTS;
+}
+
+static u32 hns3_pmu_get_filter_condition(struct perf_event *event)
+{
+	struct hw_perf_event *hwc = &event->hw;
+	u16 intr_id = hns3_pmu_get_intr(event);
+	u8 port_id = hns3_pmu_get_port(event);
+	u16 bdf = hns3_pmu_get_bdf(event);
+	u8 tc_id = hns3_pmu_get_tc(event);
+	u8 filter_mode;
+
+	filter_mode = *(u8 *)hwc->addr_filters;
+	switch (filter_mode) {
+	case HNS3_PMU_HW_FILTER_PORT:
+		return FILTER_CONDITION_PORT(port_id);
+	case HNS3_PMU_HW_FILTER_PORT_TC:
+		return FILTER_CONDITION_PORT_TC(port_id, tc_id);
+	case HNS3_PMU_HW_FILTER_FUNC:
+	case HNS3_PMU_HW_FILTER_FUNC_QUEUE:
+		return GET_PCI_DEVFN(bdf);
+	case HNS3_PMU_HW_FILTER_FUNC_INTR:
+		return FILTER_CONDITION_FUNC_INTR(GET_PCI_DEVFN(bdf), intr_id);
+	default:
+		break;
+	}
+
+	return 0;
+}
+
+static void hns3_pmu_config_filter(struct perf_event *event)
+{
+	struct hns3_pmu *hns3_pmu = to_hns3_pmu(event->pmu);
+	u8 event_type = hns3_pmu_get_event_type(event);
+	u8 subevent_id = hns3_pmu_get_subevent(event);
+	u16 queue_id = hns3_pmu_get_queue(event);
+	struct hw_perf_event *hwc = &event->hw;
+	u8 filter_mode = *(u8 *)hwc->addr_filters;
+	u16 bdf = hns3_pmu_get_bdf(event);
+	u32 idx = hwc->idx;
+	u32 val;
+
+	val = event_type;
+	val |= subevent_id << HNS3_PMU_CTRL_SUBEVENT_S;
+	val |= filter_mode << HNS3_PMU_CTRL_FILTER_MODE_S;
+	val |= HNS3_PMU_EVENT_OVERFLOW_RESTART;
+	hns3_pmu_writel(hns3_pmu, HNS3_PMU_REG_EVENT_CTRL_LOW, idx, val);
+
+	val = hns3_pmu_get_filter_condition(event);
+	hns3_pmu_writel(hns3_pmu, HNS3_PMU_REG_EVENT_CTRL_HIGH, idx, val);
+
+	if (filter_mode == HNS3_PMU_HW_FILTER_FUNC_QUEUE)
+		hns3_pmu_set_qid_para(hns3_pmu, idx, bdf, queue_id);
+}
+
+static void hns3_pmu_enable_counter(struct hns3_pmu *hns3_pmu,
+				    struct hw_perf_event *hwc)
+{
+	u32 idx = hwc->idx;
+	u32 val;
+
+	val = hns3_pmu_readl(hns3_pmu, HNS3_PMU_REG_EVENT_CTRL_LOW, idx);
+	val |= HNS3_PMU_EVENT_EN;
+	hns3_pmu_writel(hns3_pmu, HNS3_PMU_REG_EVENT_CTRL_LOW, idx, val);
+}
+
+static void hns3_pmu_disable_counter(struct hns3_pmu *hns3_pmu,
+				     struct hw_perf_event *hwc)
+{
+	u32 idx = hwc->idx;
+	u32 val;
+
+	val = hns3_pmu_readl(hns3_pmu, HNS3_PMU_REG_EVENT_CTRL_LOW, idx);
+	val &= ~HNS3_PMU_EVENT_EN;
+	hns3_pmu_writel(hns3_pmu, HNS3_PMU_REG_EVENT_CTRL_LOW, idx, val);
+}
+
+static void hns3_pmu_enable_intr(struct hns3_pmu *hns3_pmu,
+				 struct hw_perf_event *hwc)
+{
+	u32 idx = hwc->idx;
+	u32 val;
+
+	val = hns3_pmu_readl(hns3_pmu, HNS3_PMU_REG_EVENT_INTR_MASK, idx);
+	val &= ~HNS3_PMU_INTR_MASK_OVERFLOW;
+	hns3_pmu_writel(hns3_pmu, HNS3_PMU_REG_EVENT_INTR_MASK, idx, val);
+}
+
+static void hns3_pmu_disable_intr(struct hns3_pmu *hns3_pmu,
+				  struct hw_perf_event *hwc)
+{
+	u32 idx = hwc->idx;
+	u32 val;
+
+	val = hns3_pmu_readl(hns3_pmu, HNS3_PMU_REG_EVENT_INTR_MASK, idx);
+	val |= HNS3_PMU_INTR_MASK_OVERFLOW;
+	hns3_pmu_writel(hns3_pmu, HNS3_PMU_REG_EVENT_INTR_MASK, idx, val);
+}
+
+static void hns3_pmu_clear_intr_status(struct hns3_pmu *hns3_pmu, u32 idx)
+{
+	u32 val;
+
+	val = hns3_pmu_readl(hns3_pmu, HNS3_PMU_REG_EVENT_CTRL_LOW, idx);
+	val |= HNS3_PMU_EVENT_STATUS_RESET;
+	hns3_pmu_writel(hns3_pmu, HNS3_PMU_REG_EVENT_CTRL_LOW, idx, val);
+
+	val = hns3_pmu_readl(hns3_pmu, HNS3_PMU_REG_EVENT_CTRL_LOW, idx);
+	val &= ~HNS3_PMU_EVENT_STATUS_RESET;
+	hns3_pmu_writel(hns3_pmu, HNS3_PMU_REG_EVENT_CTRL_LOW, idx, val);
+}
+
+static u64 hns3_pmu_read_counter(struct perf_event *event)
+{
+	struct hns3_pmu *hns3_pmu = to_hns3_pmu(event->pmu);
+
+	return hns3_pmu_readq(hns3_pmu, event->hw.event_base, event->hw.idx);
+}
+
+static void hns3_pmu_write_counter(struct perf_event *event, u64 value)
+{
+	struct hns3_pmu *hns3_pmu = to_hns3_pmu(event->pmu);
+	u32 idx = event->hw.idx;
+
+	hns3_pmu_writeq(hns3_pmu, HNS3_PMU_REG_EVENT_COUNTER, idx, value);
+	hns3_pmu_writeq(hns3_pmu, HNS3_PMU_REG_EVENT_EXT_COUNTER, idx, value);
+}
+
+static void hns3_pmu_init_counter(struct perf_event *event)
+{
+	struct hw_perf_event *hwc = &event->hw;
+
+	local64_set(&hwc->prev_count, 0);
+	hns3_pmu_write_counter(event, 0);
+}
+
+static int hns3_pmu_event_init(struct perf_event *event)
+{
+	struct hns3_pmu *hns3_pmu = to_hns3_pmu(event->pmu);
+	struct hw_perf_event *hwc = &event->hw;
+	int idx;
+	int ret;
+
+	if (event->attr.type != event->pmu->type)
+		return -ENOENT;
+
+	/* Sampling is not supported */
+	if (is_sampling_event(event) || event->attach_state & PERF_ATTACH_TASK)
+		return -EOPNOTSUPP;
+
+	event->cpu = hns3_pmu->on_cpu;
+
+	idx = hns3_pmu_get_event_idx(hns3_pmu);
+	if (idx < 0) {
+		pci_err(hns3_pmu->pdev, "Up to %u events are supported!\n",
+			HNS3_PMU_MAX_HW_EVENTS);
+		return -EBUSY;
+	}
+
+	hwc->idx = idx;
+
+	ret = hns3_pmu_select_filter_mode(event, hns3_pmu);
+	if (ret) {
+		pci_err(hns3_pmu->pdev, "Invalid filter, ret = %d.\n", ret);
+		return ret;
+	}
+
+	if (!hns3_pmu_validate_event_group(event)) {
+		pci_err(hns3_pmu->pdev, "Invalid event group.\n");
+		return -EINVAL;
+	}
+
+	if (hns3_pmu_get_ext_counter_used(event))
+		hwc->event_base = HNS3_PMU_REG_EVENT_EXT_COUNTER;
+	else
+		hwc->event_base = HNS3_PMU_REG_EVENT_COUNTER;
+
+	return 0;
+}
+
+static void hns3_pmu_read(struct perf_event *event)
+{
+	struct hw_perf_event *hwc = &event->hw;
+	u64 new_cnt, prev_cnt, delta;
+
+	do {
+		prev_cnt = local64_read(&hwc->prev_count);
+		new_cnt = hns3_pmu_read_counter(event);
+	} while (local64_cmpxchg(&hwc->prev_count, prev_cnt, new_cnt) !=
+		 prev_cnt);
+
+	delta = new_cnt - prev_cnt;
+	local64_add(delta, &event->count);
+}
+
+static void hns3_pmu_start(struct perf_event *event, int flags)
+{
+	struct hns3_pmu *hns3_pmu = to_hns3_pmu(event->pmu);
+	struct hw_perf_event *hwc = &event->hw;
+
+	if (WARN_ON_ONCE(!(hwc->state & PERF_HES_STOPPED)))
+		return;
+
+	WARN_ON_ONCE(!(hwc->state & PERF_HES_UPTODATE));
+	hwc->state = 0;
+
+	hns3_pmu_config_filter(event);
+	hns3_pmu_init_counter(event);
+	hns3_pmu_enable_intr(hns3_pmu, hwc);
+	hns3_pmu_enable_counter(hns3_pmu, hwc);
+
+	perf_event_update_userpage(event);
+}
+
+static void hns3_pmu_stop(struct perf_event *event, int flags)
+{
+	struct hns3_pmu *hns3_pmu = to_hns3_pmu(event->pmu);
+	struct hw_perf_event *hwc = &event->hw;
+
+	hns3_pmu_disable_counter(hns3_pmu, hwc);
+	hns3_pmu_disable_intr(hns3_pmu, hwc);
+
+	WARN_ON_ONCE(hwc->state & PERF_HES_STOPPED);
+	hwc->state |= PERF_HES_STOPPED;
+
+	if (hwc->state & PERF_HES_UPTODATE)
+		return;
+
+	/* Read hardware counter and update the perf counter statistics */
+	hns3_pmu_read(event);
+	hwc->state |= PERF_HES_UPTODATE;
+}
+
+static int hns3_pmu_add(struct perf_event *event, int flags)
+{
+	struct hns3_pmu *hns3_pmu = to_hns3_pmu(event->pmu);
+	struct hw_perf_event *hwc = &event->hw;
+	int idx;
+
+	hwc->state = PERF_HES_STOPPED | PERF_HES_UPTODATE;
+
+	/* Check all working events to find a related event. */
+	idx = hns3_pmu_find_related_event_idx(hns3_pmu, event);
+	if (idx < 0 && idx != -ENOENT)
+		return idx;
+
+	/* Current event shares an enabled hardware event with related event */
+	if (idx >= 0 && idx < HNS3_PMU_MAX_HW_EVENTS) {
+		hwc->idx = idx;
+		goto start_count;
+	}
+
+	idx = hns3_pmu_get_event_idx(hns3_pmu);
+	if (idx < 0)
+		return idx;
+
+	hwc->idx = idx;
+	hns3_pmu->hw_events[idx] = event;
+
+start_count:
+	if (flags & PERF_EF_START)
+		hns3_pmu_start(event, PERF_EF_RELOAD);
+
+	return 0;
+}
+
+static void hns3_pmu_del(struct perf_event *event, int flags)
+{
+	struct hns3_pmu *hns3_pmu = to_hns3_pmu(event->pmu);
+	struct hw_perf_event *hwc = &event->hw;
+
+	hns3_pmu_stop(event, PERF_EF_UPDATE);
+	hns3_pmu->hw_events[hwc->idx] = NULL;
+	perf_event_update_userpage(event);
+}
+
+static void hns3_pmu_enable(struct pmu *pmu)
+{
+	struct hns3_pmu *hns3_pmu = to_hns3_pmu(pmu);
+	u32 val;
+
+	val = readl(hns3_pmu->base + HNS3_PMU_REG_GLOBAL_CTRL);
+	val |= HNS3_PMU_GLOBAL_START;
+	writel(val, hns3_pmu->base + HNS3_PMU_REG_GLOBAL_CTRL);
+}
+
+static void hns3_pmu_disable(struct pmu *pmu)
+{
+	struct hns3_pmu *hns3_pmu = to_hns3_pmu(pmu);
+	u32 val;
+
+	val = readl(hns3_pmu->base + HNS3_PMU_REG_GLOBAL_CTRL);
+	val &= ~HNS3_PMU_GLOBAL_START;
+	writel(val, hns3_pmu->base + HNS3_PMU_REG_GLOBAL_CTRL);
+}
+
+static int hns3_pmu_alloc_pmu(struct pci_dev *pdev, struct hns3_pmu *hns3_pmu)
+{
+	u16 device_id;
+	char *name;
+	u32 val;
+
+	hns3_pmu->base = pcim_iomap_table(pdev)[BAR_2];
+	if (!hns3_pmu->base) {
+		pci_err(pdev, "ioremap failed\n");
+		return -ENOMEM;
+	}
+
+	hns3_pmu->hw_clk_freq = readl(hns3_pmu->base + HNS3_PMU_REG_CLOCK_FREQ);
+
+	val = readl(hns3_pmu->base + HNS3_PMU_REG_BDF);
+	hns3_pmu->bdf_min = val & 0xffff;
+	hns3_pmu->bdf_max = val >> 16;
+
+	val = readl(hns3_pmu->base + HNS3_PMU_REG_DEVICE_ID);
+	device_id = val & 0xffff;
+	name = devm_kasprintf(&pdev->dev, GFP_KERNEL, "hns3_pmu_sicl_%u", device_id);
+	if (!name)
+		return -ENOMEM;
+
+	hns3_pmu->pdev = pdev;
+	hns3_pmu->on_cpu = -1;
+	hns3_pmu->identifier = readl(hns3_pmu->base + HNS3_PMU_REG_VERSION);
+	hns3_pmu->pmu = (struct pmu) {
+		.name		= name,
+		.module		= THIS_MODULE,
+		.event_init	= hns3_pmu_event_init,
+		.pmu_enable	= hns3_pmu_enable,
+		.pmu_disable	= hns3_pmu_disable,
+		.add		= hns3_pmu_add,
+		.del		= hns3_pmu_del,
+		.start		= hns3_pmu_start,
+		.stop		= hns3_pmu_stop,
+		.read		= hns3_pmu_read,
+		.task_ctx_nr	= perf_invalid_context,
+		.attr_groups	= hns3_pmu_attr_groups,
+		.capabilities	= PERF_PMU_CAP_NO_EXCLUDE,
+	};
+
+	return 0;
+}
+
+static irqreturn_t hns3_pmu_irq(int irq, void *data)
+{
+	struct hns3_pmu *hns3_pmu = data;
+	u32 intr_status, idx;
+
+	for (idx = 0; idx < HNS3_PMU_MAX_HW_EVENTS; idx++) {
+		intr_status = hns3_pmu_readl(hns3_pmu,
+					     HNS3_PMU_REG_EVENT_INTR_STATUS,
+					     idx);
+
+		/*
+		 * As each counter will restart from 0 when it is overflowed,
+		 * extra processing is no need, just clear interrupt status.
+		 */
+		if (intr_status)
+			hns3_pmu_clear_intr_status(hns3_pmu, idx);
+	}
+
+	return IRQ_HANDLED;
+}
+
+static int hns3_pmu_online_cpu(unsigned int cpu, struct hlist_node *node)
+{
+	struct hns3_pmu *hns3_pmu;
+
+	hns3_pmu = hlist_entry_safe(node, struct hns3_pmu, node);
+	if (!hns3_pmu)
+		return -ENODEV;
+
+	if (hns3_pmu->on_cpu == -1) {
+		hns3_pmu->on_cpu = cpu;
+		irq_set_affinity(hns3_pmu->irq, cpumask_of(cpu));
+	}
+
+	return 0;
+}
+
+static int hns3_pmu_offline_cpu(unsigned int cpu, struct hlist_node *node)
+{
+	struct hns3_pmu *hns3_pmu;
+	unsigned int target;
+
+	hns3_pmu = hlist_entry_safe(node, struct hns3_pmu, node);
+	if (!hns3_pmu)
+		return -ENODEV;
+
+	/* Nothing to do if this CPU doesn't own the PMU */
+	if (hns3_pmu->on_cpu != cpu)
+		return 0;
+
+	/* Choose a new CPU from all online cpus */
+	target = cpumask_any_but(cpu_online_mask, cpu);
+	if (target >= nr_cpu_ids)
+		return 0;
+
+	perf_pmu_migrate_context(&hns3_pmu->pmu, cpu, target);
+	hns3_pmu->on_cpu = target;
+	irq_set_affinity(hns3_pmu->irq, cpumask_of(target));
+
+	return 0;
+}
+
+static void hns3_pmu_free_irq(void *data)
+{
+	struct pci_dev *pdev = data;
+
+	pci_free_irq_vectors(pdev);
+}
+
+static int hns3_pmu_irq_register(struct pci_dev *pdev,
+				 struct hns3_pmu *hns3_pmu)
+{
+	int irq, ret;
+
+	ret = pci_alloc_irq_vectors(pdev, 1, 1, PCI_IRQ_MSI);
+	if (ret < 0) {
+		pci_err(pdev, "failed to enable MSI vectors, ret = %d.\n", ret);
+		return ret;
+	}
+
+	ret = devm_add_action(&pdev->dev, hns3_pmu_free_irq, pdev);
+	if (ret) {
+		pci_err(pdev, "failed to add free irq action, ret = %d.\n", ret);
+		return ret;
+	}
+
+	irq = pci_irq_vector(pdev, 0);
+	ret = devm_request_irq(&pdev->dev, irq, hns3_pmu_irq, 0,
+			       hns3_pmu->pmu.name, hns3_pmu);
+	if (ret) {
+		pci_err(pdev, "failed to register irq, ret = %d.\n", ret);
+		return ret;
+	}
+
+	hns3_pmu->irq = irq;
+
+	return 0;
+}
+
+static int hns3_pmu_init_pmu(struct pci_dev *pdev, struct hns3_pmu *hns3_pmu)
+{
+	int ret;
+
+	ret = hns3_pmu_alloc_pmu(pdev, hns3_pmu);
+	if (ret)
+		return ret;
+
+	ret = hns3_pmu_irq_register(pdev, hns3_pmu);
+	if (ret)
+		return ret;
+
+	ret = cpuhp_state_add_instance(CPUHP_AP_PERF_ARM_HNS3_PMU_ONLINE,
+				       &hns3_pmu->node);
+	if (ret) {
+		pci_err(pdev, "failed to register hotplug, ret = %d.\n", ret);
+		return ret;
+	}
+
+	ret = perf_pmu_register(&hns3_pmu->pmu, hns3_pmu->pmu.name, -1);
+	if (ret) {
+		pci_err(pdev, "failed to register perf PMU, ret = %d.\n", ret);
+		cpuhp_state_remove_instance(CPUHP_AP_PERF_ARM_HNS3_PMU_ONLINE,
+					    &hns3_pmu->node);
+	}
+
+	return ret;
+}
+
+static void hns3_pmu_uninit_pmu(struct pci_dev *pdev)
+{
+	struct hns3_pmu *hns3_pmu = pci_get_drvdata(pdev);
+
+	perf_pmu_unregister(&hns3_pmu->pmu);
+	cpuhp_state_remove_instance(CPUHP_AP_PERF_ARM_HNS3_PMU_ONLINE,
+				    &hns3_pmu->node);
+}
+
+static int hns3_pmu_init_dev(struct pci_dev *pdev)
+{
+	int ret;
+
+	ret = pcim_enable_device(pdev);
+	if (ret) {
+		pci_err(pdev, "failed to enable pci device, ret = %d.\n", ret);
+		return ret;
+	}
+
+	ret = pcim_iomap_regions(pdev, BIT(BAR_2), "hns3_pmu");
+	if (ret < 0) {
+		pci_err(pdev, "failed to request pci region, ret = %d.\n", ret);
+		return ret;
+	}
+
+	pci_set_master(pdev);
+
+	return 0;
+}
+
+static int hns3_pmu_probe(struct pci_dev *pdev, const struct pci_device_id *id)
+{
+	struct hns3_pmu *hns3_pmu;
+	int ret;
+
+	hns3_pmu = devm_kzalloc(&pdev->dev, sizeof(*hns3_pmu), GFP_KERNEL);
+	if (!hns3_pmu)
+		return -ENOMEM;
+
+	ret = hns3_pmu_init_dev(pdev);
+	if (ret)
+		return ret;
+
+	ret = hns3_pmu_init_pmu(pdev, hns3_pmu);
+	if (ret) {
+		pci_clear_master(pdev);
+		return ret;
+	}
+
+	pci_set_drvdata(pdev, hns3_pmu);
+
+	return ret;
+}
+
+static void hns3_pmu_remove(struct pci_dev *pdev)
+{
+	hns3_pmu_uninit_pmu(pdev);
+	pci_clear_master(pdev);
+	pci_set_drvdata(pdev, NULL);
+}
+
+static const struct pci_device_id hns3_pmu_ids[] = {
+	{ PCI_DEVICE(PCI_VENDOR_ID_HUAWEI, 0xa22b) },
+	{ 0, }
+};
+MODULE_DEVICE_TABLE(pci, hns3_pmu_ids);
+
+static struct pci_driver hns3_pmu_driver = {
+	.name = "hns3_pmu",
+	.id_table = hns3_pmu_ids,
+	.probe = hns3_pmu_probe,
+	.remove = hns3_pmu_remove,
+};
+
+static int __init hns3_pmu_module_init(void)
+{
+	int ret;
+
+	ret = cpuhp_setup_state_multi(CPUHP_AP_PERF_ARM_HNS3_PMU_ONLINE,
+				      "AP_PERF_ARM_HNS3_PMU_ONLINE",
+				      hns3_pmu_online_cpu,
+				      hns3_pmu_offline_cpu);
+	if (ret) {
+		pr_err("failed to setup HNS3 PMU hotplug, ret = %d.\n", ret);
+		return ret;
+	}
+
+	ret = pci_register_driver(&hns3_pmu_driver);
+	if (ret) {
+		pr_err("failed to register pci driver, ret = %d.\n", ret);
+		cpuhp_remove_multi_state(CPUHP_AP_PERF_ARM_HNS3_PMU_ONLINE);
+	}
+
+	return ret;
+}
+module_init(hns3_pmu_module_init);
+
+static void __exit hns3_pmu_module_exit(void)
+{
+	pci_unregister_driver(&hns3_pmu_driver);
+	cpuhp_remove_multi_state(CPUHP_AP_PERF_ARM_HNS3_PMU_ONLINE);
+}
+module_exit(hns3_pmu_module_exit);
+
+MODULE_DESCRIPTION("HNS3 PMU driver");
+MODULE_LICENSE("GPL v2");
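The `hns3_pmu_read()` loop in this file snapshots the hardware counter, publishes the snapshot with a compare-and-exchange, and only then adds the difference to the running total. A minimal user-space sketch of that pattern follows, assuming C11 atomics in place of the kernel's `local64_*` helpers; `struct sketch_event`, `sketch_fake_hw` and the other `sketch_*` names are hypothetical and do not appear in the driver.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the prev_count/count pair used by the driver. */
struct sketch_event {
	_Atomic uint64_t prev_count;	/* last raw hardware value published */
	_Atomic uint64_t count;		/* accumulated event total */
	uint64_t (*read_hw)(void);	/* hypothetical raw counter accessor */
};

/* Fake hardware register so the sketch can be exercised without a device. */
static uint64_t sketch_fake_hw;

static uint64_t sketch_read_fake_hw(void)
{
	return sketch_fake_hw;
}

/* Publish a fresh snapshot and add the delta; returns the delta. */
static uint64_t sketch_event_update(struct sketch_event *ev)
{
	uint64_t prev, now;

	assert(ev && ev->read_hw);

	do {
		prev = atomic_load(&ev->prev_count);
		now = ev->read_hw();
		/* Retry if prev_count changed between the load and the swap. */
	} while (!atomic_compare_exchange_strong(&ev->prev_count, &prev, now));

	/* Unsigned subtraction is modular, so a wrapped counter still works. */
	atomic_fetch_add(&ev->count, now - prev);
	return now - prev;
}
```

Setting `sketch_fake_hw` to 100 and then 250 between two calls yields deltas of 100 and 150, leaving `count` at 250.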
+3 -9
drivers/perf/marvell_cn10k_tad_pmu.c
···
 /* Marvell CN10K LLC-TAD perf driver
  *
  * Copyright (C) 2021 Marvell
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
  */

 #define pr_fmt(fmt) "tad_pmu: " fmt
···
 #include <linux/perf_event.h>
 #include <linux/platform_device.h>

-#define TAD_PFC_OFFSET		0x0
+#define TAD_PFC_OFFSET		0x800
 #define TAD_PFC(counter)	(TAD_PFC_OFFSET | (counter << 3))
-#define TAD_PRF_OFFSET		0x100
+#define TAD_PRF_OFFSET		0x900
 #define TAD_PRF(counter)	(TAD_PRF_OFFSET | (counter << 3))
 #define TAD_PRF_CNTSEL_MASK	0xFF
 #define TAD_MAX_COUNTERS	8
···
 	 * which sets TAD()_PRF()[CNTSEL] != 0
 	 */
 	for (i = 0; i < tad_pmu->region_cnt; i++) {
-		reg_val = readq_relaxed(tad_pmu->regions[i].base +
-					TAD_PRF(counter_idx));
-		reg_val |= (event_idx & 0xFF);
+		reg_val = event_idx & 0xFF;
 		writeq_relaxed(reg_val, tad_pmu->regions[i].base +
 			       TAD_PRF(counter_idx));
 	}
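The last hunk above stops OR-ing the new event selector into the value read back from `TAD_PRF` and writes the selector directly. The diff does not state the motivation; one plausible reading is that the old read-modify-write could leave bits of a previously selected event in place. The sketch below models both variants on plain integers. The `SKETCH_*` macros mirror the new offsets from the diff, and the two helper functions are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors the updated TAD_PRF_OFFSET/TAD_PRF() definitions in the diff. */
#define SKETCH_TAD_PRF_OFFSET	0x900u
#define SKETCH_TAD_PRF(counter)	(SKETCH_TAD_PRF_OFFSET | ((counter) << 3))
#define SKETCH_CNTSEL_MASK	0xFFu

/* Each counter owns one 8-byte register slot above the base offset. */
static_assert(SKETCH_TAD_PRF(2u) == 0x910u, "counter 2 sits at offset 0x910");

/* Old behaviour: OR the selector into whatever the register already held. */
static uint64_t sketch_select_rmw(uint64_t old_reg, uint32_t event_idx)
{
	return old_reg | (event_idx & SKETCH_CNTSEL_MASK);
}

/* New behaviour: the register value is exactly the masked selector. */
static uint64_t sketch_select_direct(uint32_t event_idx)
{
	return event_idx & SKETCH_CNTSEL_MASK;
}
```

With a stale selector of `0x0F` in the register and a new event `0x30`, the read-modify-write variant produces `0x3F`, which is neither event, while the direct write produces `0x30`.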
+2 -2
drivers/perf/riscv_pmu.c
···
 	return delta;
 }

-static void riscv_pmu_stop(struct perf_event *event, int flags)
+void riscv_pmu_stop(struct perf_event *event, int flags)
 {
 	struct hw_perf_event *hwc = &event->hw;
 	struct riscv_pmu *rvpmu = to_riscv_pmu(event->pmu);
···
 	return overflow;
 }

-static void riscv_pmu_start(struct perf_event *event, int flags)
+void riscv_pmu_start(struct perf_event *event, int flags)
 {
 	struct hw_perf_event *hwc = &event->hw;
 	struct riscv_pmu *rvpmu = to_riscv_pmu(event->pmu);
+101 -5
drivers/perf/riscv_pmu_sbi.c
···
 #include <linux/irqdomain.h>
 #include <linux/of_irq.h>
 #include <linux/of.h>
+#include <linux/cpu_pm.h>

 #include <asm/sbi.h>
 #include <asm/hwcap.h>
+
+PMU_FORMAT_ATTR(event, "config:0-47");
+PMU_FORMAT_ATTR(firmware, "config:63");
+
+static struct attribute *riscv_arch_formats_attr[] = {
+	&format_attr_event.attr,
+	&format_attr_firmware.attr,
+	NULL,
+};
+
+static struct attribute_group riscv_pmu_format_group = {
+	.name = "format",
+	.attrs = riscv_arch_formats_attr,
+};
+
+static const struct attribute_group *riscv_pmu_attr_groups[] = {
+	&riscv_pmu_format_group,
+	NULL,
+};

 union sbi_pmu_ctr_info {
 	unsigned long value;
···
 		child = of_get_compatible_child(cpu, "riscv,cpu-intc");
 		if (!child) {
 			pr_err("Failed to find INTC node\n");
+			of_node_put(cpu);
 			return -ENODEV;
 		}
 		domain = irq_find_host(child);
 		of_node_put(child);
-		if (domain)
+		if (domain) {
+			of_node_put(cpu);
 			break;
+		}
 	}
 	if (!domain) {
 		pr_err("Failed to find INTC IRQ root domain\n");
···
 	}

 	return 0;
+}
+
+#ifdef CONFIG_CPU_PM
+static int riscv_pm_pmu_notify(struct notifier_block *b, unsigned long cmd,
+				void *v)
+{
+	struct riscv_pmu *rvpmu = container_of(b, struct riscv_pmu, riscv_pm_nb);
+	struct cpu_hw_events *cpuc = this_cpu_ptr(rvpmu->hw_events);
+	int enabled = bitmap_weight(cpuc->used_hw_ctrs, RISCV_MAX_COUNTERS);
+	struct perf_event *event;
+	int idx;
+
+	if (!enabled)
+		return NOTIFY_OK;
+
+	for (idx = 0; idx < RISCV_MAX_COUNTERS; idx++) {
+		event = cpuc->events[idx];
+		if (!event)
+			continue;
+
+		switch (cmd) {
+		case CPU_PM_ENTER:
+			/*
+			 * Stop and update the counter
+			 */
+			riscv_pmu_stop(event, PERF_EF_UPDATE);
+			break;
+		case CPU_PM_EXIT:
+		case CPU_PM_ENTER_FAILED:
+			/*
+			 * Restore and enable the counter.
+			 *
+			 * Requires RCU read locking to be functional,
+			 * wrap the call within RCU_NONIDLE to make the
+			 * RCU subsystem aware this cpu is not idle from
+			 * an RCU perspective for the riscv_pmu_start() call
+			 * duration.
+			 */
+			RCU_NONIDLE(riscv_pmu_start(event, PERF_EF_RELOAD));
+			break;
+		default:
+			break;
+		}
+	}
+
+	return NOTIFY_OK;
+}
+
+static int riscv_pm_pmu_register(struct riscv_pmu *pmu)
+{
+	pmu->riscv_pm_nb.notifier_call = riscv_pm_pmu_notify;
+	return cpu_pm_register_notifier(&pmu->riscv_pm_nb);
+}
+
+static void riscv_pm_pmu_unregister(struct riscv_pmu *pmu)
+{
+	cpu_pm_unregister_notifier(&pmu->riscv_pm_nb);
+}
+#else
+static inline int riscv_pm_pmu_register(struct riscv_pmu *pmu) { return 0; }
+static inline void riscv_pm_pmu_unregister(struct riscv_pmu *pmu) { }
+#endif
+
+static void riscv_pmu_destroy(struct riscv_pmu *pmu)
+{
+	riscv_pm_pmu_unregister(pmu);
+	cpuhp_state_remove_instance(CPUHP_AP_PERF_RISCV_STARTING, &pmu->node);
 }

 static int pmu_sbi_device_probe(struct platform_device *pdev)
···
 		pmu->pmu.capabilities |= PERF_PMU_CAP_NO_INTERRUPT;
 		pmu->pmu.capabilities |= PERF_PMU_CAP_NO_EXCLUDE;
 	}
+	pmu->pmu.attr_groups = riscv_pmu_attr_groups;
 	pmu->num_counters = num_counters;
 	pmu->ctr_start = pmu_sbi_ctr_start;
 	pmu->ctr_stop = pmu_sbi_ctr_stop;
···
 	if (ret)
 		return ret;

+	ret = riscv_pm_pmu_register(pmu);
+	if (ret)
+		goto out_unregister;
+
 	ret = perf_pmu_register(&pmu->pmu, "cpu", PERF_TYPE_RAW);
-	if (ret) {
-		cpuhp_state_remove_instance(CPUHP_AP_PERF_RISCV_STARTING, &pmu->node);
-		return ret;
-	}
+	if (ret)
+		goto out_unregister;

 	return 0;
+
+out_unregister:
+	riscv_pmu_destroy(pmu);

 out_free:
 	kfree(pmu);
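The two `PMU_FORMAT_ATTR` lines added to this file describe how `perf_event_attr.config` is laid out for this PMU: bits 0-47 carry the event and bit 63 is a separate `firmware` field. The sketch below packs and unpacks a config value according to those bit ranges. The `sketch_*` helpers are hypothetical, and reading bit 63 as a "firmware event" flag is an inference from the attribute name, not something the diff spells out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit ranges taken from the format strings "config:0-47" and "config:63". */
#define SKETCH_EVENT_MASK	((UINT64_C(1) << 48) - 1)
#define SKETCH_FIRMWARE_BIT	(UINT64_C(1) << 63)

static_assert(SKETCH_EVENT_MASK == UINT64_C(0xFFFFFFFFFFFF),
	      "event field is 48 bits wide");

/* Build a config value from its two fields; excess event bits are dropped. */
static uint64_t sketch_make_config(uint64_t event, bool firmware)
{
	return (event & SKETCH_EVENT_MASK) | (firmware ? SKETCH_FIRMWARE_BIT : 0);
}

/* Extract the "event" field (bits 0-47). */
static uint64_t sketch_config_event(uint64_t config)
{
	return config & SKETCH_EVENT_MASK;
}

/* Extract the "firmware" field (bit 63). */
static bool sketch_config_firmware(uint64_t config)
{
	return (config & SKETCH_FIRMWARE_BIT) != 0;
}
```

Bits 48-62 are not covered by either format attribute, so the sketch leaves them clear when building a value and ignores them when decoding.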
+8
include/asm-generic/barrier.h
···
 #define wmb()	do { kcsan_wmb(); __wmb(); } while (0)
 #endif

+#ifdef __dma_mb
+#define dma_mb()	do { kcsan_mb(); __dma_mb(); } while (0)
+#endif
+
 #ifdef __dma_rmb
 #define dma_rmb()	do { kcsan_rmb(); __dma_rmb(); } while (0)
 #endif
···

 #ifndef wmb
 #define wmb()	mb()
+#endif
+
+#ifndef dma_mb
+#define dma_mb()	mb()
 #endif

 #ifndef dma_rmb
+28 -1
include/asm-generic/io.h
···
 #elif defined(CONFIG_GENERIC_IOREMAP)
 #include <linux/pgtable.h>

-void __iomem *ioremap_prot(phys_addr_t addr, size_t size, unsigned long prot);
+/*
+ * Arch code can implement the following two hooks when using GENERIC_IOREMAP
+ * ioremap_allowed() return a bool,
+ *   - true means continue to remap
+ *   - false means skip remap and return directly
+ * iounmap_allowed() return a bool,
+ *   - true means continue to vunmap
+ *   - false means skip vunmap and return directly
+ */
+#ifndef ioremap_allowed
+#define ioremap_allowed ioremap_allowed
+static inline bool ioremap_allowed(phys_addr_t phys_addr, size_t size,
+				   unsigned long prot)
+{
+	return true;
+}
+#endif
+
+#ifndef iounmap_allowed
+#define iounmap_allowed iounmap_allowed
+static inline bool iounmap_allowed(void *addr)
+{
+	return true;
+}
+#endif
+
+void __iomem *ioremap_prot(phys_addr_t phys_addr, size_t size,
+			   unsigned long prot);
 void iounmap(volatile void __iomem *addr);

 static inline void __iomem *ioremap(phys_addr_t addr, size_t size)
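The hooks above rely on a common kernel idiom: a function is paired with a macro of the same name, so a generic header can test `#ifndef` to see whether an architecture has already supplied its own version. A self-contained sketch of the idiom follows. All `sketch_*` names and the 1 MiB policy are made up for illustration; the real hooks take kernel types such as `phys_addr_t`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical "arch" part: defines its hook and announces it via a macro. */
#define sketch_remap_allowed sketch_remap_allowed
static inline bool sketch_remap_allowed(uint64_t phys, size_t size)
{
	(void)size;
	/* Made-up policy for the sketch: refuse anything below 1 MiB. */
	return phys >= 0x100000;
}

/* "Generic" part: supplies a permissive default only if nothing overrode it. */
#ifndef sketch_remap_allowed
#define sketch_remap_allowed sketch_remap_allowed
static inline bool sketch_remap_allowed(uint64_t phys, size_t size)
{
	(void)phys;
	(void)size;
	return true;
}
#endif

#ifndef sketch_unmap_allowed
#define sketch_unmap_allowed sketch_unmap_allowed
static inline bool sketch_unmap_allowed(void *addr)
{
	(void)addr;
	return true;
}
#endif

/* Usage check: the override wins for remap, the default is used for unmap. */
static void sketch_selfcheck(void)
{
	assert(!sketch_remap_allowed(0x1000, 4096));
	assert(sketch_remap_allowed(0x200000, 4096));
	assert(sketch_unmap_allowed(NULL));
}
```

Because the macro expands to the function's own name, call sites are unaffected; the macro exists only so the preprocessor can detect the override.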
+1
include/linux/cpuhotplug.h
···
 	CPUHP_AP_PERF_ARM_HISI_PA_ONLINE,
 	CPUHP_AP_PERF_ARM_HISI_SLLC_ONLINE,
 	CPUHP_AP_PERF_ARM_HISI_PCIE_PMU_ONLINE,
+	CPUHP_AP_PERF_ARM_HNS3_PMU_ONLINE,
 	CPUHP_AP_PERF_ARM_L2X0_ONLINE,
 	CPUHP_AP_PERF_ARM_QCOM_L2_ONLINE,
 	CPUHP_AP_PERF_ARM_QCOM_L3_ONLINE,
+1 -1
include/linux/gfp.h
···
 #define GFP_DMA32	__GFP_DMA32
 #define GFP_HIGHUSER	(GFP_USER | __GFP_HIGHMEM)
 #define GFP_HIGHUSER_MOVABLE	(GFP_HIGHUSER | __GFP_MOVABLE | \
-			 __GFP_SKIP_KASAN_POISON)
+			 __GFP_SKIP_KASAN_POISON | __GFP_SKIP_KASAN_UNPOISON)
 #define GFP_TRANSHUGE_LIGHT	((GFP_HIGHUSER_MOVABLE | __GFP_COMP | \
 			 __GFP_NOMEMALLOC | __GFP_NOWARN) & ~__GFP_RECLAIM)
 #define GFP_TRANSHUGE	(GFP_TRANSHUGE_LIGHT | __GFP_DIRECT_RECLAIM)
+12
include/linux/huge_mm.h
···
 	return split_huge_page_to_list(&folio->page, list);
 }

+/*
+ * archs that select ARCH_WANTS_THP_SWAP but don't support THP_SWP due to
+ * limitations in the implementation like arm64 MTE can override this to
+ * false
+ */
+#ifndef arch_thp_swp_supported
+static inline bool arch_thp_swp_supported(void)
+{
+	return true;
+}
+#endif
+
 #endif /* _LINUX_HUGE_MM_H */
+4
include/linux/perf/riscv_pmu.h
···

 	struct cpu_hw_events	__percpu *hw_events;
 	struct hlist_node	node;
+	struct notifier_block   riscv_pm_nb;
 };

 #define to_riscv_pmu(p) (container_of(p, struct riscv_pmu, pmu))
+
+void riscv_pmu_start(struct perf_event *event, int flags);
+void riscv_pmu_stop(struct perf_event *event, int flags);
 unsigned long riscv_pmu_ctr_read_csr(unsigned long csr);
 int riscv_pmu_event_set_period(struct perf_event *event);
 uint64_t riscv_pmu_ctr_get_width_mask(struct perf_event *event);
+19 -7
mm/ioremap.c
···
 #include <linux/io.h>
 #include <linux/export.h>

-void __iomem *ioremap_prot(phys_addr_t addr, size_t size, unsigned long prot)
+void __iomem *ioremap_prot(phys_addr_t phys_addr, size_t size,
+			   unsigned long prot)
 {
 	unsigned long offset, vaddr;
 	phys_addr_t last_addr;
 	struct vm_struct *area;

 	/* Disallow wrap-around or zero size */
-	last_addr = addr + size - 1;
-	if (!size || last_addr < addr)
+	last_addr = phys_addr + size - 1;
+	if (!size || last_addr < phys_addr)
 		return NULL;

 	/* Page-align mappings */
-	offset = addr & (~PAGE_MASK);
-	addr -= offset;
+	offset = phys_addr & (~PAGE_MASK);
+	phys_addr -= offset;
 	size = PAGE_ALIGN(size + offset);
+
+	if (!ioremap_allowed(phys_addr, size, prot))
+		return NULL;

 	area = get_vm_area_caller(size, VM_IOREMAP,
 			__builtin_return_address(0));
 	if (!area)
 		return NULL;
 	vaddr = (unsigned long)area->addr;
+	area->phys_addr = phys_addr;

-	if (ioremap_page_range(vaddr, vaddr + size, addr, __pgprot(prot))) {
+	if (ioremap_page_range(vaddr, vaddr + size, phys_addr,
+			       __pgprot(prot))) {
 		free_vm_area(area);
 		return NULL;
 	}
···

 void iounmap(volatile void __iomem *addr)
 {
-	vunmap((void *)((unsigned long)addr & PAGE_MASK));
+	void *vaddr = (void *)((unsigned long)addr & PAGE_MASK);
+
+	if (!iounmap_allowed(vaddr))
+		return;
+
+	if (is_vmalloc_addr(vaddr))
+		vunmap(vaddr);
 }
 EXPORT_SYMBOL(iounmap);
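Before calling the new `ioremap_allowed()` hook, `ioremap_prot()` rejects zero-sized and wrapping ranges and then widens the request to whole pages. The arithmetic can be tried out in isolation with the sketch below. It assumes 4 KiB pages and 64-bit physical addresses; `struct sketch_range` and `sketch_align_range()` are hypothetical names that do not exist in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE	UINT64_C(4096)	/* assumption: 4 KiB pages */
#define SKETCH_PAGE_MASK	(~(SKETCH_PAGE_SIZE - 1))

struct sketch_range {
	uint64_t phys;		/* page-aligned start address */
	uint64_t size;		/* length rounded up to whole pages */
	uint64_t offset;	/* offset of the request within its first page */
};

/* Returns false for a zero size or a range that wraps past the top. */
static bool sketch_align_range(uint64_t phys, uint64_t size,
			       struct sketch_range *out)
{
	uint64_t last = phys + size - 1;

	assert(out);

	/* Disallow wrap-around or zero size, as in the kernel code. */
	if (!size || last < phys)
		return false;

	/* Page-align: move the start down, grow the size to compensate. */
	out->offset = phys & ~SKETCH_PAGE_MASK;
	out->phys = phys - out->offset;
	out->size = (size + out->offset + SKETCH_PAGE_SIZE - 1) & SKETCH_PAGE_MASK;
	return true;
}
```

A 0x20-byte request at 0x1FF0 straddles a page boundary, so it becomes a two-page mapping starting at 0x1000 with an in-page offset of 0xFF0.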
+2 -1
mm/kasan/common.c
···
 		return;

 	tag = kasan_random_tag();
+	kasan_unpoison(set_tag(page_address(page), tag),
+		       PAGE_SIZE << order, init);
 	for (i = 0; i < (1 << order); i++)
 		page_kasan_tag_set(page + i, tag);
-	kasan_unpoison(page_address(page), PAGE_SIZE << order, init);
 }

 void __kasan_poison_pages(struct page *page, unsigned int order, bool init)
+10 -9
mm/page_alloc.c
···
 }
 #endif /* CONFIG_DEBUG_VM */

-static inline bool should_skip_kasan_unpoison(gfp_t flags, bool init_tags)
+static inline bool should_skip_kasan_unpoison(gfp_t flags)
 {
 	/* Don't skip if a software KASAN mode is enabled. */
 	if (IS_ENABLED(CONFIG_KASAN_GENERIC) ||
···
 		return true;

 	/*
-	 * With hardware tag-based KASAN enabled, skip if either:
-	 *
-	 * 1. Memory tags have already been cleared via tag_clear_highpage().
-	 * 2. Skipping has been requested via __GFP_SKIP_KASAN_UNPOISON.
+	 * With hardware tag-based KASAN enabled, skip if this has been
+	 * requested via __GFP_SKIP_KASAN_UNPOISON.
 	 */
-	return init_tags || (flags & __GFP_SKIP_KASAN_UNPOISON);
+	return flags & __GFP_SKIP_KASAN_UNPOISON;
 }

 static inline bool should_skip_init(gfp_t flags)
···
 	bool init = !want_init_on_free() && want_init_on_alloc(gfp_flags) &&
 			!should_skip_init(gfp_flags);
 	bool init_tags = init && (gfp_flags & __GFP_ZEROTAGS);
+	int i;

 	set_page_private(page, 0);
 	set_page_refcounted(page);
···
 	 * should be initialized as well).
 	 */
 	if (init_tags) {
-		int i;
-
 		/* Initialize both memory and tags. */
 		for (i = 0; i != 1 << order; ++i)
 			tag_clear_highpage(page + i);
···
 		/* Note that memory is already initialized by the loop above. */
 		init = false;
 	}
-	if (!should_skip_kasan_unpoison(gfp_flags, init_tags)) {
+	if (!should_skip_kasan_unpoison(gfp_flags)) {
 		/* Unpoison shadow memory or set memory tags. */
 		kasan_unpoison_pages(page, order, init);

 		/* Note that memory is already initialized by KASAN. */
 		if (kasan_has_integrated_init())
 			init = false;
+	} else {
+		/* Ensure page_address() dereferencing does not fault. */
+		for (i = 0; i != 1 << order; ++i)
+			page_kasan_tag_reset(page + i);
 	}
 	/* If memory is still not initialized, do it now. */
 	if (init)
+1 -1
mm/swap_slots.c
···
 	entry.val = 0;

 	if (folio_test_large(folio)) {
-		if (IS_ENABLED(CONFIG_THP_SWAP))
+		if (IS_ENABLED(CONFIG_THP_SWAP) && arch_thp_swp_supported())
 			get_swap_pages(1, &entry, folio_nr_pages(folio));
 		goto out;
 	}