Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Catalin Marinas:
"The major features are support for LPA2 (52-bit VA/PA with 4K and 16K
pages), the dpISA extension and Rust enabled on arm64. The changes are
mostly contained within the usual arch/arm64/, drivers/perf, the arm64
Documentation and kselftests. The exception is the Rust support which
touches some generic build files.

Summary:

- Reorganise the arm64 kernel VA space and add support for LPA2 (at
  stage 1, KVM stage 2 was merged earlier) - 52-bit VA/PA address
  range with 4KB and 16KB pages

- Enable Rust on arm64

- Support for the 2023 dpISA extensions (data processing ISA), host
  only

- arm64 perf updates:

   - StarFive's StarLink (integrates one or more CPU cores with a
     shared L3 memory system) PMU support

   - Enable HiSilicon Erratum 162700402 quirk for HIP09

   - Several updates for the HiSilicon PCIe PMU driver

   - Arm CoreSight PMU support

   - Convert all drivers under drivers/perf/ to use .remove_new()

- Miscellaneous:

   - Don't enable workarounds for "rare" errata by default

   - Clean up the DAIF flags handling for EL0 returns (in preparation
     for NMI support)

   - Kselftest update for ptrace()

   - Update some of the sysreg field definitions

   - Slight improvement in the code generation for inline asm I/O
     accessors to permit offset addressing

   - kretprobes: acquire regs via a BRK exception (previously done
     via a trampoline handler)

   - SVE/SME cleanups, comment updates

   - Allow CALL_OPS+CC_OPTIMIZE_FOR_SIZE with clang (previously
     disabled due to gcc silently ignoring -falign-functions=N)"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (134 commits)
  Revert "mm: add arch hook to validate mmap() prot flags"
  Revert "arm64: mm: add support for WXN memory translation attribute"
  Revert "ARM64: Dynamically allocate cpumasks and increase supported CPUs to 512"
  ARM64: Dynamically allocate cpumasks and increase supported CPUs to 512
  kselftest/arm64: Add 2023 DPISA hwcap test coverage
  kselftest/arm64: Add basic FPMR test
  kselftest/arm64: Handle FPMR context in generic signal frame parser
  arm64/hwcap: Define hwcaps for 2023 DPISA features
  arm64/ptrace: Expose FPMR via ptrace
  arm64/signal: Add FPMR signal handling
  arm64/fpsimd: Support FEAT_FPMR
  arm64/fpsimd: Enable host kernel access to FPMR
  arm64/cpufeature: Hook new identification registers up to cpufeature
  docs: perf: Fix build warning of hisi-pcie-pmu.rst
  perf: starfive: Only allow COMPILE_TEST for 64-bit architectures
  MAINTAINERS: Add entry for StarFive StarLink PMU
  docs: perf: Add description for StarFive's StarLink PMU
  dt-bindings: perf: starfive: Add JH8100 StarLink PMU
  perf: starfive: Add StarLink PMU support
  docs: perf: Update usage for target filter of hisi-pcie-pmu
  ...

+5513 -1583
+24 -8
Documentation/admin-guide/perf/hisi-pcie-pmu.rst
···
   hisi_pcie0_core0/rx_mwr_cnt/ [kernel PMU event]
   ------------------------------------------

-  $# perf stat -e hisi_pcie0_core0/rx_mwr_latency/
-  $# perf stat -e hisi_pcie0_core0/rx_mwr_cnt/
-  $# perf stat -g -e hisi_pcie0_core0/rx_mwr_latency/ -e hisi_pcie0_core0/rx_mwr_cnt/
+  $# perf stat -e hisi_pcie0_core0/rx_mwr_latency,port=0xffff/
+  $# perf stat -e hisi_pcie0_core0/rx_mwr_cnt,port=0xffff/
+
+The related events usually used to calculate the bandwidth, latency or others.
+They need to start and end counting at the same time, therefore related events
+are best used in the same event group to get the expected value. There are two
+ways to know if they are related events:
+
+a) By event name, such as the latency events "xxx_latency, xxx_cnt" or
+   bandwidth events "xxx_flux, xxx_time".
+b) By event type, such as "event=0xXXXX, event=0x1XXXX".
+
+Example usage of perf group::
+
+  $# perf stat -e "{hisi_pcie0_core0/rx_mwr_latency,port=0xffff/,hisi_pcie0_core0/rx_mwr_cnt,port=0xffff/}"

 The current driver does not support sampling. So "perf record" is unsupported.
 Also attach to a task is unsupported for PCIe PMU.
···

    PMU could only monitor the performance of traffic downstream target Root
    Ports or downstream target Endpoint. PCIe PMU driver support "port" and
-   "bdf" interfaces for users, and these two interfaces aren't supported at the
-   same time.
+   "bdf" interfaces for users.
+   Please notice that, one of these two interfaces must be set, and these two
+   interfaces aren't supported at the same time. If they are both set, only
+   "port" filter is valid.
+   If "port" filter not being set or is set explicitly to zero (default), the
+   "bdf" filter will be in effect, because "bdf=0" meaning 0000:000:00.0.

    - port

···

    Example usage of perf::

-     $# perf stat -e hisi_pcie0_core0/rx_mrd_flux,trig_len=0x4,trig_mode=1/ sleep 5
+     $# perf stat -e hisi_pcie0_core0/rx_mrd_flux,port=0xffff,trig_len=0x4,trig_mode=1/ sleep 5

 3. Threshold filter

···

    Example usage of perf::

-     $# perf stat -e hisi_pcie0_core0/rx_mrd_flux,thr_len=0x4,thr_mode=1/ sleep 5
+     $# perf stat -e hisi_pcie0_core0/rx_mrd_flux,port=0xffff,thr_len=0x4,thr_mode=1/ sleep 5

 4. TLP Length filter

···

    Example usage of perf::

-     $# perf stat -e hisi_pcie0_core0/rx_mrd_flux,len_mode=0x1/ sleep 5
+     $# perf stat -e hisi_pcie0_core0/rx_mrd_flux,port=0xffff,len_mode=0x1/ sleep 5
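The target-filter rule in the updated text above (one of "port" or "bdf" must be set, "port" wins when both are set, and a zero "port" leaves "bdf" in effect) can be modelled in a few lines. This is only a sketch of the documented rule, not the driver's code: `select_target_filter` and `enum target_filter` are hypothetical names introduced for illustration.

```c
#include <stdint.h>

/* Hypothetical names: this models only the rule stated in the doc text. */
enum target_filter { FILTER_PORT, FILTER_BDF };

/*
 * "port" takes effect whenever it is non-zero. If "port" is unset or
 * explicitly zero, "bdf" is in effect, and bdf=0 then means 0000:000:00.0.
 */
static enum target_filter select_target_filter(uint16_t port, uint16_t bdf)
{
	(void)bdf; /* bdf never overrides a non-zero port */
	return port ? FILTER_PORT : FILTER_BDF;
}
```

Under this reading, the `port=0xffff` added to every example command selects the port filter explicitly instead of silently falling back to `bdf=0`.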
+1
Documentation/admin-guide/perf/index.rst
···
    imx-ddr
    qcom_l2_pmu
    qcom_l3_pmu
+   starfive_starlink_pmu
    arm-ccn
    arm-cmn
    xgene-pmu
+49
Documentation/arch/arm64/elf_hwcaps.rst
···
 HWCAP2_LSE128
     Functionality implied by ID_AA64ISAR0_EL1.Atomic == 0b0011.

+HWCAP2_FPMR
+    Functionality implied by ID_AA64PFR2_EL1.FMR == 0b0001.
+
+HWCAP2_LUT
+    Functionality implied by ID_AA64ISAR2_EL1.LUT == 0b0001.
+
+HWCAP2_FAMINMAX
+    Functionality implied by ID_AA64ISAR3_EL1.FAMINMAX == 0b0001.
+
+HWCAP2_F8CVT
+    Functionality implied by ID_AA64FPFR0_EL1.F8CVT == 0b1.
+
+HWCAP2_F8FMA
+    Functionality implied by ID_AA64FPFR0_EL1.F8FMA == 0b1.
+
+HWCAP2_F8DP4
+    Functionality implied by ID_AA64FPFR0_EL1.F8DP4 == 0b1.
+
+HWCAP2_F8DP2
+    Functionality implied by ID_AA64FPFR0_EL1.F8DP2 == 0b1.
+
+HWCAP2_F8E4M3
+    Functionality implied by ID_AA64FPFR0_EL1.F8E4M3 == 0b1.
+
+HWCAP2_F8E5M2
+    Functionality implied by ID_AA64FPFR0_EL1.F8E5M2 == 0b1.
+
+HWCAP2_SME_LUTV2
+    Functionality implied by ID_AA64SMFR0_EL1.LUTv2 == 0b1.
+
+HWCAP2_SME_F8F16
+    Functionality implied by ID_AA64SMFR0_EL1.F8F16 == 0b1.
+
+HWCAP2_SME_F8F32
+    Functionality implied by ID_AA64SMFR0_EL1.F8F32 == 0b1.
+
+HWCAP2_SME_SF8FMA
+    Functionality implied by ID_AA64SMFR0_EL1.SF8FMA == 0b1.
+
+HWCAP2_SME_SF8DP4
+    Functionality implied by ID_AA64SMFR0_EL1.SF8DP4 == 0b1.
+
+HWCAP2_SME_SF8DP2
+    Functionality implied by ID_AA64SMFR0_EL1.SF8DP2 == 0b1.
+
+HWCAP2_SME_SF8DP4
+    Functionality implied by ID_AA64SMFR0_EL1.SF8DP4 == 0b1.
+
+
 4. Unused AT_HWCAP bits
 -----------------------

+3 -2
Documentation/arch/arm64/silicon-errata.rst
···
 For software workarounds that may adversely impact systems unaffected by
 the erratum in question, a Kconfig entry is added under "Kernel
 Features" -> "ARM errata workarounds via the alternatives framework".
-These are enabled by default and patched in at runtime when an affected
-CPU is detected. For less-intrusive workarounds, a Kconfig option is not
+With the exception of workarounds for errata deemed "rare" by Arm, these
+are enabled by default and patched in at runtime when an affected CPU is
+detected. For less-intrusive workarounds, a Kconfig option is not
 available and the code is structured (preferably with a comment) in such
 a way that the erratum will not be hit.

+5 -6
Documentation/arch/arm64/sme.rst
···
 2. Vector lengths
 ------------------

-SME defines a second vector length similar to the SVE vector length which is
+SME defines a second vector length similar to the SVE vector length which
 controls the size of the streaming mode SVE vectors and the ZA matrix array.
 The ZA matrix is square with each side having as many bytes as a streaming
 mode SVE vector.
···
   bits of Z0..Z31 except for Z0 bits [127:0] .. Z31 bits [127:0] to become
   unspecified, including both streaming and non-streaming SVE state.
   Calling PR_SME_SET_VL with vl equal to the thread's current vector
-  length, or calling PR_SME_SET_VL with the PR_SVE_SET_VL_ONEXEC flag,
+  length, or calling PR_SME_SET_VL with the PR_SME_SET_VL_ONEXEC flag,
   does not constitute a change to the vector length for this purpose.

 * Changing the vector length causes PSTATE.ZA and PSTATE.SM to be cleared.
   Calling PR_SME_SET_VL with vl equal to the thread's current vector
-  length, or calling PR_SME_SET_VL with the PR_SVE_SET_VL_ONEXEC flag,
+  length, or calling PR_SME_SET_VL with the PR_SME_SET_VL_ONEXEC flag,
   does not constitute a change to the vector length for this purpose.


···
 /proc/sys/abi/sme_default_vector_length

     Writing the text representation of an integer to this file sets the system
-    default vector length to the specified value, unless the value is greater
-    than the maximum vector length supported by the system in which case the
-    default vector length is set to that maximum.
+    default vector length to the specified value rounded to a supported value
+    using the same rules as for setting vector length via PR_SME_SET_VL.

     The result can be determined by reopening the file and reading its
     contents.
+2 -8
Documentation/arch/arm64/sve.rst
···
 * The SVE registers are not used to pass arguments to or receive results from
   any syscall.

-* In practice the affected registers/bits will be preserved or will be replaced
-  with zeros on return from a syscall, but userspace should not make
-  assumptions about this. The kernel behaviour may vary on a case-by-case
-  basis.
-
 * All other SVE state of a thread, including the currently configured vector
   length, the state of the PR_SVE_VL_INHERIT flag, and the deferred vector
   length (if any), is preserved across all syscalls, subject to the specific
···
 /proc/sys/abi/sve_default_vector_length

     Writing the text representation of an integer to this file sets the system
-    default vector length to the specified value, unless the value is greater
-    than the maximum vector length supported by the system in which case the
-    default vector length is set to that maximum.
+    default vector length to the specified value rounded to a supported value
+    using the same rules as for setting vector length via PR_SVE_SET_VL.

     The result can be determined by reopening the file and reading its
     contents.
+39
Documentation/devicetree/bindings/perf/arm,coresight-pmu.yaml
···
+# SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/perf/arm,coresight-pmu.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Arm Coresight Performance Monitoring Unit Architecture
+
+maintainers:
+  - Robin Murphy <robin.murphy@arm.com>
+
+properties:
+  compatible:
+    const: arm,coresight-pmu
+
+  reg:
+    items:
+      - description: Register page 0
+      - description: Register page 1, if the PMU implements the dual-page extension
+    minItems: 1
+
+  interrupts:
+    items:
+      - description: Overflow interrupt
+
+  cpus:
+    description: If the PMU is associated with a particular CPU or subset of CPUs,
+      array of phandles to the appropriate CPU node(s)
+
+  reg-io-width:
+    description: Granularity at which PMU register accesses are single-copy atomic
+    default: 4
+    enum: [4, 8]
+
+required:
+  - compatible
+  - reg
+
+additionalProperties: false
+1
Documentation/rust/arch-support.rst
···
 ============= ================ ==============================================
 Architecture  Level of support Constraints
 ============= ================ ==============================================
+``arm64``     Maintained       Little Endian only.
 ``loongarch`` Maintained       -
 ``um``        Maintained       ``x86_64`` only.
 ``x86``       Maintained       ``x86_64`` only.
+7
MAINTAINERS
···
 T:	git https://git.kernel.org/pub/scm/linux/kernel/git/conor/linux.git/
 F:	Documentation/devicetree/bindings/soc/starfive/

+STARFIVE STARLINK PMU DRIVER
+M:	Ji Sheng Teoh <jisheng.teoh@starfivetech.com>
+S:	Maintained
+F:	Documentation/admin-guide/perf/starfive_starlink_pmu.rst
+F:	Documentation/devicetree/bindings/perf/starfive,jh8100-starlink-pmu.yaml
+F:	drivers/perf/starfive_starlink_pmu.c
+
 STARFIVE TRNG DRIVER
 M:	Jia Jie Ho <jiajie.ho@starfivetech.com>
 S:	Supported
-1
Makefile
···

 KBUILD_CPPFLAGS := -D__KERNEL__
 KBUILD_RUSTFLAGS := $(rust_common_flags) \
-		    --target=$(objtree)/scripts/target.json \
 		    -Cpanic=abort -Cembed-bitcode=n -Clto=n \
 		    -Cforce-unwind-tables=n -Ccodegen-units=1 \
 		    -Csymbol-mangling-version=v0 \
+29 -26
arch/arm64/Kconfig
···
 	select HAVE_ARCH_HUGE_VMAP
 	select HAVE_ARCH_JUMP_LABEL
 	select HAVE_ARCH_JUMP_LABEL_RELATIVE
-	select HAVE_ARCH_KASAN if !(ARM64_16K_PAGES && ARM64_VA_BITS_48)
+	select HAVE_ARCH_KASAN
 	select HAVE_ARCH_KASAN_VMALLOC if HAVE_ARCH_KASAN
 	select HAVE_ARCH_KASAN_SW_TAGS if HAVE_ARCH_KASAN
 	select HAVE_ARCH_KASAN_HW_TAGS if (HAVE_ARCH_KASAN && ARM64_MTE)
···
 		if DYNAMIC_FTRACE_WITH_ARGS && DYNAMIC_FTRACE_WITH_CALL_OPS
 	select HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS \
 		if (DYNAMIC_FTRACE_WITH_ARGS && !CFI_CLANG && \
-		    !CC_OPTIMIZE_FOR_SIZE)
+		    (CC_IS_CLANG || !CC_OPTIMIZE_FOR_SIZE))
 	select FTRACE_MCOUNT_USE_PATCHABLE_FUNCTION_ENTRY \
 		if DYNAMIC_FTRACE_WITH_ARGS
 	select HAVE_SAMPLE_FTRACE_DIRECT
···
 	select HAVE_FUNCTION_ARG_ACCESS_API
 	select MMU_GATHER_RCU_TABLE_FREE
 	select HAVE_RSEQ
+	select HAVE_RUST if CPU_LITTLE_ENDIAN
 	select HAVE_STACKPROTECTOR
 	select HAVE_SYSCALL_TRACEPOINTS
 	select HAVE_KPROBES
···
 	default 3 if ARM64_64K_PAGES && (ARM64_VA_BITS_48 || ARM64_VA_BITS_52)
 	default 3 if ARM64_4K_PAGES && ARM64_VA_BITS_39
 	default 3 if ARM64_16K_PAGES && ARM64_VA_BITS_47
+	default 4 if ARM64_16K_PAGES && (ARM64_VA_BITS_48 || ARM64_VA_BITS_52)
 	default 4 if !ARM64_64K_PAGES && ARM64_VA_BITS_48
+	default 5 if ARM64_4K_PAGES && ARM64_VA_BITS_52

 config ARCH_SUPPORTS_UPROBES
 	def_bool y
···
 config KASAN_SHADOW_OFFSET
 	hex
 	depends on KASAN_GENERIC || KASAN_SW_TAGS
-	default 0xdfff800000000000 if (ARM64_VA_BITS_48 || ARM64_VA_BITS_52) && !KASAN_SW_TAGS
-	default 0xdfffc00000000000 if ARM64_VA_BITS_47 && !KASAN_SW_TAGS
+	default 0xdfff800000000000 if (ARM64_VA_BITS_48 || (ARM64_VA_BITS_52 && !ARM64_16K_PAGES)) && !KASAN_SW_TAGS
+	default 0xdfffc00000000000 if (ARM64_VA_BITS_47 || ARM64_VA_BITS_52) && ARM64_16K_PAGES && !KASAN_SW_TAGS
 	default 0xdffffe0000000000 if ARM64_VA_BITS_42 && !KASAN_SW_TAGS
 	default 0xdfffffc000000000 if ARM64_VA_BITS_39 && !KASAN_SW_TAGS
 	default 0xdffffff800000000 if ARM64_VA_BITS_36 && !KASAN_SW_TAGS
-	default 0xefff800000000000 if (ARM64_VA_BITS_48 || ARM64_VA_BITS_52) && KASAN_SW_TAGS
-	default 0xefffc00000000000 if ARM64_VA_BITS_47 && KASAN_SW_TAGS
+	default 0xefff800000000000 if (ARM64_VA_BITS_48 || (ARM64_VA_BITS_52 && !ARM64_16K_PAGES)) && KASAN_SW_TAGS
+	default 0xefffc00000000000 if (ARM64_VA_BITS_47 || ARM64_VA_BITS_52) && ARM64_16K_PAGES && KASAN_SW_TAGS
 	default 0xeffffe0000000000 if ARM64_VA_BITS_42 && KASAN_SW_TAGS
 	default 0xefffffc000000000 if ARM64_VA_BITS_39 && KASAN_SW_TAGS
 	default 0xeffffff800000000 if ARM64_VA_BITS_36 && KASAN_SW_TAGS
···
 	  If unsure, say Y.

 config ARM64_ERRATUM_834220
-	bool "Cortex-A57: 834220: Stage 2 translation fault might be incorrectly reported in presence of a Stage 1 fault"
+	bool "Cortex-A57: 834220: Stage 2 translation fault might be incorrectly reported in presence of a Stage 1 fault (rare)"
 	depends on KVM
-	default y
 	help
 	  This option adds an alternative code sequence to work around ARM
 	  erratum 834220 on Cortex-A57 parts up to r1p2.
···
 	  as it depends on the alternative framework, which will only patch
 	  the kernel if an affected CPU is detected.

-	  If unsure, say Y.
+	  If unsure, say N.

 config ARM64_ERRATUM_1742098
 	bool "Cortex-A57/A72: 1742098: ELR recorded incorrectly on interrupt taken between cryptographic instructions in a sequence"
···
 	bool

 config ARM64_ERRATUM_2441007
-	bool "Cortex-A55: Completion of affected memory accesses might not be guaranteed by completion of a TLBI"
-	default y
+	bool "Cortex-A55: Completion of affected memory accesses might not be guaranteed by completion of a TLBI (rare)"
 	select ARM64_WORKAROUND_REPEAT_TLBI
 	help
 	  This option adds a workaround for ARM Cortex-A55 erratum #2441007.
···
 	  Work around this by adding the affected CPUs to the list that needs
 	  TLB sequences to be done twice.

-	  If unsure, say Y.
+	  If unsure, say N.

 config ARM64_ERRATUM_1286807
-	bool "Cortex-A76: Modification of the translation table for a virtual address might lead to read-after-read ordering violation"
-	default y
+	bool "Cortex-A76: Modification of the translation table for a virtual address might lead to read-after-read ordering violation (rare)"
 	select ARM64_WORKAROUND_REPEAT_TLBI
 	help
 	  This option adds a workaround for ARM Cortex-A76 erratum 1286807.
···
 	  TLBI+DSB completes before a read using the translation being
 	  invalidated has been observed by other observers. The
 	  workaround repeats the TLBI+DSB operation.
+
+	  If unsure, say N.

 config ARM64_ERRATUM_1463225
 	bool "Cortex-A76: Software Step might prevent interrupt recognition"
···
 	  If unsure, say Y.

 config ARM64_ERRATUM_1542419
-	bool "Neoverse-N1: workaround mis-ordering of instruction fetches"
-	default y
+	bool "Neoverse-N1: workaround mis-ordering of instruction fetches (rare)"
 	help
 	  This option adds a workaround for ARM Neoverse-N1 erratum
 	  1542419.
···
 	  Workaround the issue by hiding the DIC feature from EL0. This
 	  forces user-space to perform cache maintenance.

-	  If unsure, say Y.
+	  If unsure, say N.

 config ARM64_ERRATUM_1508412
 	bool "Cortex-A77: 1508412: workaround deadlock on sequence of NC/Device load and store exclusive or PAR read"
···
 	  If unsure, say Y.

 config ARM64_ERRATUM_2441009
-	bool "Cortex-A510: Completion of affected memory accesses might not be guaranteed by completion of a TLBI"
-	default y
+	bool "Cortex-A510: Completion of affected memory accesses might not be guaranteed by completion of a TLBI (rare)"
 	select ARM64_WORKAROUND_REPEAT_TLBI
 	help
 	  This option adds a workaround for ARM Cortex-A510 erratum #2441009.
···
 	  Work around this by adding the affected CPUs to the list that needs
 	  TLB sequences to be done twice.

-	  If unsure, say Y.
+	  If unsure, say N.

 config ARM64_ERRATUM_2064142
 	bool "Cortex-A510: 2064142: workaround TRBE register writes while disabled"
···

 choice
 	prompt "Virtual address space size"
-	default ARM64_VA_BITS_39 if ARM64_4K_PAGES
-	default ARM64_VA_BITS_47 if ARM64_16K_PAGES
-	default ARM64_VA_BITS_42 if ARM64_64K_PAGES
+	default ARM64_VA_BITS_52
 	help
 	  Allows choosing one of multiple possible virtual address
 	  space sizes. The level of translation table is determined by
···

 config ARM64_VA_BITS_52
 	bool "52-bit"
-	depends on ARM64_64K_PAGES && (ARM64_PAN || !ARM64_SW_TTBR0_PAN)
+	depends on ARM64_PAN || !ARM64_SW_TTBR0_PAN
 	help
 	  Enable 52-bit virtual addressing for userspace when explicitly
 	  requested via a hint to mmap(). The kernel will also use 52-bit
···

 config ARM64_PA_BITS_48
 	bool "48-bit"
+	depends on ARM64_64K_PAGES || !ARM64_VA_BITS_52

 config ARM64_PA_BITS_52
-	bool "52-bit (ARMv8.2)"
-	depends on ARM64_64K_PAGES
+	bool "52-bit"
+	depends on ARM64_64K_PAGES || ARM64_VA_BITS_52
 	depends on ARM64_PAN || !ARM64_SW_TTBR0_PAN
 	help
 	  Enable support for a 52-bit physical address space, introduced as
···
 	int
 	default 48 if ARM64_PA_BITS_48
 	default 52 if ARM64_PA_BITS_52
+
+config ARM64_LPA2
+	def_bool y
+	depends on ARM64_PA_BITS_52 && !ARM64_64K_PAGES

 choice
 	prompt "Endianness"
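The new page-table-level defaults above (4 levels for 16K pages with 48- or 52-bit VAs, 5 levels for 4K pages with 52-bit VAs) follow from simple arithmetic: each level resolves `page_shift - 3` bits, because a table fills one page of 8-byte descriptors. The sketch below checks that arithmetic against the defaults visible in the hunk. `pgtable_levels` is a hypothetical helper written for this illustration, not a kernel function.

```c
/*
 * Hypothetical helper: number of translation levels needed to cover
 * va_bits of address space with pages of (1 << page_shift) bytes.
 * Each level resolves (page_shift - 3) bits; the division rounds up so
 * that a partially used top level still counts as a level.
 */
static int pgtable_levels(int page_shift, int va_bits)
{
	int bits_per_level = page_shift - 3;

	return (va_bits - page_shift + bits_per_level - 1) / bits_per_level;
}
```

For example, `pgtable_levels(12, 52)` covers the 4K/52-bit case and `pgtable_levels(14, 52)` the 16K/52-bit case; both agree with the `default 5` and `default 4` lines added by this hunk.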
+4
arch/arm64/Makefile
···
 KBUILD_CFLAGS	+= $(call cc-disable-warning, psabi)
 KBUILD_AFLAGS	+= $(compat_vdso)

+KBUILD_RUSTFLAGS += --target=aarch64-unknown-none -Ctarget-feature="-neon"
+
 KBUILD_CFLAGS	+= $(call cc-option,-mabi=lp64)
 KBUILD_AFLAGS	+= $(call cc-option,-mabi=lp64)

···

 ifeq ($(CONFIG_ARM64_BTI_KERNEL),y)
   KBUILD_CFLAGS += -mbranch-protection=pac-ret+bti
+  KBUILD_RUSTFLAGS += -Zbranch-protection=bti,pac-ret
 else ifeq ($(CONFIG_ARM64_PTR_AUTH_KERNEL),y)
+  KBUILD_RUSTFLAGS += -Zbranch-protection=pac-ret
   ifeq ($(CONFIG_CC_HAS_BRANCH_PROT_PAC_RET),y)
     KBUILD_CFLAGS += -mbranch-protection=pac-ret
   else
-1
arch/arm64/configs/defconfig
···
 CONFIG_ARCH_VISCONTI=y
 CONFIG_ARCH_XGENE=y
 CONFIG_ARCH_ZYNQMP=y
-CONFIG_ARM64_VA_BITS_48=y
 CONFIG_SCHED_MC=y
 CONFIG_SCHED_SMT=y
 CONFIG_NUMA=y
-2
arch/arm64/include/asm/archrandom.h
···
 	return (ftr >> ID_AA64ISAR0_EL1_RNDR_SHIFT) & 0xf;
 }

-u64 kaslr_early_init(void *fdt);
-
 #endif /* _ASM_ARCHRANDOM_H */
+19 -40
arch/arm64/include/asm/assembler.h
···
 	msr	daifset, #0xf
 	.endm

-	.macro enable_daif
-	msr	daifclr, #0xf
-	.endm
-
 /*
  * Save/restore interrupts.
  */
···
 	.endm

 /*
- * idmap_get_t0sz - get the T0SZ value needed to cover the ID map
- *
- * Calculate the maximum allowed value for TCR_EL1.T0SZ so that the
- * entire ID map region can be mapped. As T0SZ == (64 - #bits used),
- * this number conveniently equals the number of leading zeroes in
- * the physical address of _end.
- */
-	.macro	idmap_get_t0sz, reg
-	adrp	\reg, _end
-	orr	\reg, \reg, #(1 << VA_BITS_MIN) - 1
-	clz	\reg, \reg
-	.endm
-
-/*
  * tcr_compute_pa_size - set TCR.(I)PS to the highest supported
  * ID_AA64MMFR0_EL1.PARange value
  *
···
 	.endm

 /*
- * Offset ttbr1 to allow for 48-bit kernel VAs set with 52-bit PTRS_PER_PGD.
+ * If the kernel is built for 52-bit virtual addressing but the hardware only
+ * supports 48 bits, we cannot program the pgdir address into TTBR1 directly,
+ * but we have to add an offset so that the TTBR1 address corresponds with the
+ * pgdir entry that covers the lowest 48-bit addressable VA.
+ *
+ * Note that this trick is only used for LVA/64k pages - LPA2/4k pages uses an
+ * additional paging level, and on LPA2/16k pages, we would end up with a root
+ * level table with only 2 entries, which is suboptimal in terms of TLB
+ * utilization, so there we fall back to 47 bits of translation if LPA2 is not
+ * supported.
+ *
  * orr is used as it can cover the immediate value (and is idempotent).
- * In future this may be nop'ed out when dealing with 52-bit kernel VAs.
  * 	ttbr: Value of ttbr to set, modified.
  */
 	.macro	offset_ttbr1, ttbr, tmp
-#ifdef CONFIG_ARM64_VA_BITS_52
-	mrs_s	\tmp, SYS_ID_AA64MMFR2_EL1
-	and	\tmp, \tmp, #(0xf << ID_AA64MMFR2_EL1_VARange_SHIFT)
-	cbnz	\tmp, .Lskipoffs_\@
-	orr	\ttbr, \ttbr, #TTBR1_BADDR_4852_OFFSET
-.Lskipoffs_\@ :
+#if defined(CONFIG_ARM64_VA_BITS_52) && !defined(CONFIG_ARM64_LPA2)
+	mrs	\tmp, tcr_el1
+	and	\tmp, \tmp, #TCR_T1SZ_MASK
+	cmp	\tmp, #TCR_T1SZ(VA_BITS_MIN)
+	orr	\tmp, \ttbr, #TTBR1_BADDR_4852_OFFSET
+	csel	\ttbr, \tmp, \ttbr, eq
 #endif
 	.endm

···

 	.macro	phys_to_pte, pte, phys
 #ifdef CONFIG_ARM64_PA_BITS_52
-	/*
-	 * We assume \phys is 64K aligned and this is guaranteed by only
-	 * supporting this configuration with 64K pages.
-	 */
-	orr	\pte, \phys, \phys, lsr #36
-	and	\pte, \pte, #PTE_ADDR_MASK
+	orr	\pte, \phys, \phys, lsr #PTE_ADDR_HIGH_SHIFT
+	and	\pte, \pte, #PHYS_TO_PTE_ADDR_MASK
 #else
 	mov	\pte, \phys
-#endif
-	.endm
-
-	.macro	pte_to_phys, phys, pte
-	and	\phys, \pte, #PTE_ADDR_MASK
-#ifdef CONFIG_ARM64_PA_BITS_52
-	orr	\phys, \phys, \phys, lsl #PTE_ADDR_HIGH_SHIFT
-	and	\phys, \phys, GENMASK_ULL(PHYS_MASK_SHIFT - 1, PAGE_SHIFT)
 #endif
 	.endm

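The `phys_to_pte` macro above folds the top physical-address bits into otherwise unused low bits of the descriptor with one `orr` and one `and`. The C model below illustrates that trick, together with the inverse that the removed `pte_to_phys` macro performed. The constants are assumptions for the 64K-page, 52-bit PA case only (a shift of 36, as in the removed `lsr #36`, and a mask covering descriptor bits [47:12]); the values the kernel uses for other page sizes are not shown in this hunk.

```c
#include <stdint.h>

/* Assumed values for 64K pages with 52-bit PAs; not taken from this hunk. */
#define PAGE_SHIFT		16
#define PHYS_MASK_SHIFT		52
#define PTE_ADDR_HIGH_SHIFT	36
/* Descriptor bits [47:12]: PA[47:16] in place, PA[51:48] folded into [15:12]. */
#define PTE_ADDR_MASK		0x0000fffffffff000ULL

/* Model of the phys_to_pte macro: one orr with a shifted operand, one and. */
static uint64_t phys_to_pte(uint64_t phys)
{
	return (phys | (phys >> PTE_ADDR_HIGH_SHIFT)) & PTE_ADDR_MASK;
}

/* Model of the removed pte_to_phys macro (the inverse operation). */
static uint64_t pte_to_phys(uint64_t pte)
{
	uint64_t phys = pte & PTE_ADDR_MASK;

	phys |= phys << PTE_ADDR_HIGH_SHIFT;
	/* Keep PA bits [51:16], i.e. GENMASK_ULL(PHYS_MASK_SHIFT - 1, PAGE_SHIFT). */
	return phys & (((1ULL << PHYS_MASK_SHIFT) - 1) & ~((1ULL << PAGE_SHIFT) - 1));
}
```

The fold only works because `phys` is page aligned, so the bits that receive PA[51:48] are known to be zero beforehand. That is the assumption the deleted comment spelled out for 64K pages.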
+2
arch/arm64/include/asm/brk-imm.h
···
  * 0x004: for installing kprobes
  * 0x005: for installing uprobes
  * 0x006: for kprobe software single-step
+ * 0x007: for kretprobe return
  * Allowed values for kgdb are 0x400 - 0x7ff
  * 0x100: for triggering a fault on purpose (reserved)
  * 0x400: for dynamic BRK instruction
···
 #define KPROBES_BRK_IMM			0x004
 #define UPROBES_BRK_IMM			0x005
 #define KPROBES_BRK_SS_IMM		0x006
+#define KRETPROBES_BRK_IMM		0x007
 #define FAULT_BRK_IMM			0x100
 #define KGDB_DYN_DBG_BRK_IMM		0x400
 #define KGDB_COMPILED_DBG_BRK_IMM	0x401
+3
arch/arm64/include/asm/cpu.h
···
 	u64		reg_id_aa64isar0;
 	u64		reg_id_aa64isar1;
 	u64		reg_id_aa64isar2;
+	u64		reg_id_aa64isar3;
 	u64		reg_id_aa64mmfr0;
 	u64		reg_id_aa64mmfr1;
 	u64		reg_id_aa64mmfr2;
 	u64		reg_id_aa64mmfr3;
 	u64		reg_id_aa64pfr0;
 	u64		reg_id_aa64pfr1;
+	u64		reg_id_aa64pfr2;
 	u64		reg_id_aa64zfr0;
 	u64		reg_id_aa64smfr0;
+	u64		reg_id_aa64fpfr0;

 	struct cpuinfo_32bit	aarch32;
 };
+113
arch/arm64/include/asm/cpufeature.h
···

 #define ARM64_SW_FEATURE_OVERRIDE_NOKASLR	0
 #define ARM64_SW_FEATURE_OVERRIDE_HVHE		4
+#define ARM64_SW_FEATURE_OVERRIDE_RODATA_OFF	8

 #ifndef __ASSEMBLY__

···
 	return system_supports_sme();
 }

+static __always_inline bool system_supports_fpmr(void)
+{
+	return alternative_has_cap_unlikely(ARM64_HAS_FPMR);
+}
+
 static __always_inline bool system_supports_cnp(void)
 {
 	return alternative_has_cap_unlikely(ARM64_HAS_CNP);
···
 s64 arm64_ftr_safe_value(const struct arm64_ftr_bits *ftrp, s64 new, s64 cur);
 struct arm64_ftr_reg *get_arm64_ftr_reg(u32 sys_id);

+extern struct arm64_ftr_override id_aa64mmfr0_override;
 extern struct arm64_ftr_override id_aa64mmfr1_override;
+extern struct arm64_ftr_override id_aa64mmfr2_override;
 extern struct arm64_ftr_override id_aa64pfr0_override;
 extern struct arm64_ftr_override id_aa64pfr1_override;
 extern struct arm64_ftr_override id_aa64zfr0_override;
···

 extern struct arm64_ftr_override arm64_sw_feature_override;

+static inline
+u64 arm64_apply_feature_override(u64 val, int feat, int width,
+				 const struct arm64_ftr_override *override)
+{
+	u64 oval = override->val;
+
+	/*
+	 * When it encounters an invalid override (e.g., an override that
+	 * cannot be honoured due to a missing CPU feature), the early idreg
+	 * override code will set the mask to 0x0 and the value to non-zero for
+	 * the field in question. In order to determine whether the override is
+	 * valid or not for the field we are interested in, we first need to
+	 * disregard bits belonging to other fields.
+	 */
+	oval &= GENMASK_ULL(feat + width - 1, feat);
+
+	/*
+	 * The override is valid if all value bits are accounted for in the
+	 * mask. If so, replace the masked bits with the override value.
+	 */
+	if (oval == (oval & override->mask)) {
+		val &= ~override->mask;
+		val |= oval;
+	}
+
+	/* Extract the field from the updated value */
+	return cpuid_feature_extract_unsigned_field(val, feat);
+}
+
+static inline bool arm64_test_sw_feature_override(int feat)
+{
+	/*
+	 * Software features are pseudo CPU features that have no underlying
+	 * CPUID system register value to apply the override to.
+	 */
+	return arm64_apply_feature_override(0, feat, 4,
+					    &arm64_sw_feature_override);
+}
+
+static inline bool kaslr_disabled_cmdline(void)
+{
+	return arm64_test_sw_feature_override(ARM64_SW_FEATURE_OVERRIDE_NOKASLR);
+}
+
 u32 get_kvm_ipa_limit(void);
 void dump_cpu_features(void);
+
+static inline bool cpu_has_bti(void)
+{
+	if (!IS_ENABLED(CONFIG_ARM64_BTI))
+		return false;
+
+	return arm64_apply_feature_override(read_cpuid(ID_AA64PFR1_EL1),
+					    ID_AA64PFR1_EL1_BT_SHIFT, 4,
+					    &id_aa64pfr1_override);
+}
+
+static inline bool cpu_has_pac(void)
+{
+	u64 isar1, isar2;
+
+	if (!IS_ENABLED(CONFIG_ARM64_PTR_AUTH))
+		return false;
+
+	isar1 = read_cpuid(ID_AA64ISAR1_EL1);
+	isar2 = read_cpuid(ID_AA64ISAR2_EL1);
+
+	if (arm64_apply_feature_override(isar1, ID_AA64ISAR1_EL1_APA_SHIFT, 4,
+					 &id_aa64isar1_override))
+		return true;
+
+	if (arm64_apply_feature_override(isar1, ID_AA64ISAR1_EL1_API_SHIFT, 4,
+					 &id_aa64isar1_override))
+		return true;
+
+	return arm64_apply_feature_override(isar2, ID_AA64ISAR2_EL1_APA3_SHIFT, 4,
+					    &id_aa64isar2_override);
+}
+
+static inline bool cpu_has_lva(void)
+{
+	u64 mmfr2;
+
+	mmfr2 = read_sysreg_s(SYS_ID_AA64MMFR2_EL1);
+	mmfr2 &= ~id_aa64mmfr2_override.mask;
+	mmfr2 |= id_aa64mmfr2_override.val;
+	return cpuid_feature_extract_unsigned_field(mmfr2,
+						    ID_AA64MMFR2_EL1_VARange_SHIFT);
+}
+
+static inline bool cpu_has_lpa2(void)
+{
+#ifdef CONFIG_ARM64_LPA2
+	u64 mmfr0;
+	int feat;
+
+	mmfr0 = read_sysreg(id_aa64mmfr0_el1);
+	mmfr0 &= ~id_aa64mmfr0_override.mask;
+	mmfr0 |= id_aa64mmfr0_override.val;
+	feat = cpuid_feature_extract_signed_field(mmfr0,
+						  ID_AA64MMFR0_EL1_TGRAN_SHIFT);
+
+	return feat >= ID_AA64MMFR0_EL1_TGRAN_LPA2;
+#else
+	return false;
+#endif
+}

 #endif /* __ASSEMBLY__ */

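The validity check in `arm64_apply_feature_override()` is easier to follow with concrete numbers. The following is a userspace model, assuming a cut-down `struct ftr_override` with only `val` and `mask`, a local `genmask()` stand-in for `GENMASK_ULL`, and 4-bit unsigned field extraction in place of `cpuid_feature_extract_unsigned_field()`. All three are illustrative substitutes, not the kernel definitions.

```c
#include <stdint.h>

/* Userspace stand-in for struct arm64_ftr_override (val and mask only). */
struct ftr_override {
	uint64_t val;
	uint64_t mask;
};

/* Mask covering bits [hi:lo], standing in for GENMASK_ULL(hi, lo). */
static uint64_t genmask(int hi, int lo)
{
	return (~0ULL >> (63 - hi)) & (~0ULL << lo);
}

static uint64_t apply_feature_override(uint64_t val, int feat, int width,
				       const struct ftr_override *override)
{
	/* Disregard override bits that belong to other fields. */
	uint64_t oval = override->val & genmask(feat + width - 1, feat);

	/* Valid only if every set value bit is also covered by the mask. */
	if (oval == (oval & override->mask)) {
		val &= ~override->mask;
		val |= oval;
	}

	/* Assumed 4-bit unsigned field extraction. */
	return (val >> feat) & 0xf;
}
```

With a register value of `0x20` (field at bit 4 equal to 2), an override of `{ .val = 0x10, .mask = 0xf0 }` replaces the field with 1, while `{ .val = 0x10, .mask = 0 }` is the "invalid override" encoding described in the comment and leaves the field at 2.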
+5 -5
arch/arm64/include/asm/elf.h
···
 #define COMPAT_ELF_PLATFORM ("v8l")
 #endif

-#ifdef CONFIG_COMPAT
-
-/* PIE load location for compat arm. Must match ARM ELF_ET_DYN_BASE. */
-#define COMPAT_ELF_ET_DYN_BASE 0x000400000UL
-
 /* AArch32 registers. */
 #define COMPAT_ELF_NGREG 18
 typedef unsigned int compat_elf_greg_t;
 typedef compat_elf_greg_t compat_elf_gregset_t[COMPAT_ELF_NGREG];
+
+#ifdef CONFIG_COMPAT
+
+/* PIE load location for compat arm. Must match ARM ELF_ET_DYN_BASE. */
+#define COMPAT_ELF_ET_DYN_BASE 0x000400000UL

 /* AArch32 EABI. */
 #define EF_ARM_EABI_MASK 0xff000000
+5 -8
arch/arm64/include/asm/esr.h
···
 #define ESR_ELx_FSC_ACCESS (0x08)
 #define ESR_ELx_FSC_FAULT (0x04)
 #define ESR_ELx_FSC_PERM (0x0C)
-#define ESR_ELx_FSC_SEA_TTW0 (0x14)
-#define ESR_ELx_FSC_SEA_TTW1 (0x15)
-#define ESR_ELx_FSC_SEA_TTW2 (0x16)
-#define ESR_ELx_FSC_SEA_TTW3 (0x17)
+#define ESR_ELx_FSC_SEA_TTW(n) (0x14 + (n))
 #define ESR_ELx_FSC_SECC (0x18)
-#define ESR_ELx_FSC_SECC_TTW0 (0x1c)
-#define ESR_ELx_FSC_SECC_TTW1 (0x1d)
-#define ESR_ELx_FSC_SECC_TTW2 (0x1e)
-#define ESR_ELx_FSC_SECC_TTW3 (0x1f)
+#define ESR_ELx_FSC_SECC_TTW(n) (0x1c + (n))

 /* ISS field definitions for Data Aborts */
 #define ESR_ELx_ISV_SHIFT (24)
···

 static inline bool esr_fsc_is_translation_fault(unsigned long esr)
 {
+	/* Translation fault, level -1 */
+	if ((esr & ESR_ELx_FSC) == 0b101011)
+		return true;
 	return (esr & ESR_ELx_FSC_TYPE) == ESR_ELx_FSC_FAULT;
 }

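The parameterised macros and the extra level -1 check can be tried outside the kernel. The sketch below is a standalone approximation, not the kernel code: the mask values `ESR_ELx_FSC` (0x3F) and `ESR_ELx_FSC_TYPE` (0x3C) are assumed from the upstream header because they are not visible in this hunk, and `fsc_is_translation_fault` is a hypothetical name.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed mask values; they do not appear in the hunk above. */
#define ESR_ELx_FSC		0x3FUL
#define ESR_ELx_FSC_TYPE	0x3CUL

/* Values taken from the hunk above. */
#define ESR_ELx_FSC_FAULT	0x04UL
#define ESR_ELx_FSC_SEA_TTW(n)	(0x14 + (n))
#define ESR_ELx_FSC_SECC_TTW(n)	(0x1c + (n))

/*
 * Levels 0..3 share the 0b0001xx encoding and are caught by the type
 * mask. Level -1 uses 0b101011, which the mask would miss, so it is
 * compared against the full fault status code first.
 */
static bool fsc_is_translation_fault(uint64_t esr)
{
	if ((esr & ESR_ELx_FSC) == 0x2B)	/* 0b101011 */
		return true;
	return (esr & ESR_ELx_FSC_TYPE) == ESR_ELx_FSC_FAULT;
}
```

With `n = -1` the table-walk macros evaluate to 0x13 and 0x1b, which is what lets the KVM switch statement later in this diff use the ranges `TTW(-1) ... TTW(3)`.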
+1 -1
arch/arm64/include/asm/exception.h
···
 void do_el1_fpac(struct pt_regs *regs, unsigned long esr);
 void do_el0_mops(struct pt_regs *regs, unsigned long esr);
 void do_serror(struct pt_regs *regs, unsigned long esr);
-void do_notify_resume(struct pt_regs *regs, unsigned long thread_flags);
+void do_signal(struct pt_regs *regs);

 void __noreturn panic_bad_stack(struct pt_regs *regs, unsigned long esr, unsigned long far);
 #endif /* __ASM_EXCEPTION_H */
+1 -1
arch/arm64/include/asm/fixmap.h
···
 	FIX_PTE,
 	FIX_PMD,
 	FIX_PUD,
+	FIX_P4D,
 	FIX_PGD,

 	__end_of_fixed_addresses
···
 #define FIXMAP_PAGE_IO __pgprot(PROT_DEVICE_nGnRE)

 void __init early_fixmap_init(void);
-void __init fixmap_copy(pgd_t *pgdir);

 #define __early_set_fixmap __set_fixmap

+2 -2
arch/arm64/include/asm/fpsimd.h
···
 #include <linux/stddef.h>
 #include <linux/types.h>

-#ifdef CONFIG_COMPAT
 /* Masks for extracting the FPSR and FPCR from the FPSCR */
 #define VFP_FPSCR_STAT_MASK 0xf800009f
 #define VFP_FPSCR_CTRL_MASK 0x07f79f00
···
  * control/status register.
  */
 #define VFP_STATE_SIZE ((32 * 8) + 4)
-#endif

 static inline unsigned long cpacr_save_enable_kernel_sve(void)
 {
···
 	void *sve_state;
 	void *sme_state;
 	u64 *svcr;
+	u64 *fpmr;
 	unsigned int sve_vl;
 	unsigned int sme_vl;
 	enum fp_type *fp_type;
···
 extern void cpu_enable_sme(const struct arm64_cpu_capabilities *__unused);
 extern void cpu_enable_sme2(const struct arm64_cpu_capabilities *__unused);
 extern void cpu_enable_fa64(const struct arm64_cpu_capabilities *__unused);
+extern void cpu_enable_fpmr(const struct arm64_cpu_capabilities *__unused);

 extern u64 read_smcr_features(void);

-1
arch/arm64/include/asm/hw_breakpoint.h
···
 /* Watchpoints */
 #define ARM_BREAKPOINT_LOAD 1
 #define ARM_BREAKPOINT_STORE 2
-#define AARCH64_ESR_ACCESS_MASK (1 << 6)

 /* Lengths */
 #define ARM_BREAKPOINT_LEN_1 0x1
+15
arch/arm64/include/asm/hwcap.h
···
 #define KERNEL_HWCAP_SVE_B16B16 __khwcap2_feature(SVE_B16B16)
 #define KERNEL_HWCAP_LRCPC3 __khwcap2_feature(LRCPC3)
 #define KERNEL_HWCAP_LSE128 __khwcap2_feature(LSE128)
+#define KERNEL_HWCAP_FPMR __khwcap2_feature(FPMR)
+#define KERNEL_HWCAP_LUT __khwcap2_feature(LUT)
+#define KERNEL_HWCAP_FAMINMAX __khwcap2_feature(FAMINMAX)
+#define KERNEL_HWCAP_F8CVT __khwcap2_feature(F8CVT)
+#define KERNEL_HWCAP_F8FMA __khwcap2_feature(F8FMA)
+#define KERNEL_HWCAP_F8DP4 __khwcap2_feature(F8DP4)
+#define KERNEL_HWCAP_F8DP2 __khwcap2_feature(F8DP2)
+#define KERNEL_HWCAP_F8E4M3 __khwcap2_feature(F8E4M3)
+#define KERNEL_HWCAP_F8E5M2 __khwcap2_feature(F8E5M2)
+#define KERNEL_HWCAP_SME_LUTV2 __khwcap2_feature(SME_LUTV2)
+#define KERNEL_HWCAP_SME_F8F16 __khwcap2_feature(SME_F8F16)
+#define KERNEL_HWCAP_SME_F8F32 __khwcap2_feature(SME_F8F32)
+#define KERNEL_HWCAP_SME_SF8FMA __khwcap2_feature(SME_SF8FMA)
+#define KERNEL_HWCAP_SME_SF8DP4 __khwcap2_feature(SME_SF8DP4)
+#define KERNEL_HWCAP_SME_SF8DP2 __khwcap2_feature(SME_SF8DP2)

 /*
  * This yields a mask that user programs can use to figure out what
+8 -4
arch/arm64/include/asm/io.h
···
 #define __raw_writeb __raw_writeb
 static __always_inline void __raw_writeb(u8 val, volatile void __iomem *addr)
 {
-	asm volatile("strb %w0, [%1]" : : "rZ" (val), "r" (addr));
+	volatile u8 __iomem *ptr = addr;
+	asm volatile("strb %w0, %1" : : "rZ" (val), "Qo" (*ptr));
 }

 #define __raw_writew __raw_writew
 static __always_inline void __raw_writew(u16 val, volatile void __iomem *addr)
 {
-	asm volatile("strh %w0, [%1]" : : "rZ" (val), "r" (addr));
+	volatile u16 __iomem *ptr = addr;
+	asm volatile("strh %w0, %1" : : "rZ" (val), "Qo" (*ptr));
 }

 #define __raw_writel __raw_writel
 static __always_inline void __raw_writel(u32 val, volatile void __iomem *addr)
 {
-	asm volatile("str %w0, [%1]" : : "rZ" (val), "r" (addr));
+	volatile u32 __iomem *ptr = addr;
+	asm volatile("str %w0, %1" : : "rZ" (val), "Qo" (*ptr));
 }

 #define __raw_writeq __raw_writeq
 static __always_inline void __raw_writeq(u64 val, volatile void __iomem *addr)
 {
-	asm volatile("str %x0, [%1]" : : "rZ" (val), "r" (addr));
+	volatile u64 __iomem *ptr = addr;
+	asm volatile("str %x0, %1" : : "rZ" (val), "Qo" (*ptr));
 }

 #define __raw_readb __raw_readb
-2
arch/arm64/include/asm/kasan.h
···

 asmlinkage void kasan_early_init(void);
 void kasan_init(void);
-void kasan_copy_shadow(pgd_t *pgdir);

 #else
 static inline void kasan_init(void) { }
-static inline void kasan_copy_shadow(pgd_t *pgdir) { }
 #endif

 #endif
+42 -61
arch/arm64/include/asm/kernel-pgtable.h
··· 13 13 #include <asm/sparsemem.h> 14 14 15 15 /* 16 - * The linear mapping and the start of memory are both 2M aligned (per 17 - * the arm64 booting.txt requirements). Hence we can use section mapping 18 - * with 4K (section size = 2M) but not with 16K (section size = 32M) or 19 - * 64K (section size = 512M). 16 + * The physical and virtual addresses of the start of the kernel image are 17 + * equal modulo 2 MiB (per the arm64 booting.txt requirements). Hence we can 18 + * use section mapping with 4K (section size = 2M) but not with 16K (section 19 + * size = 32M) or 64K (section size = 512M). 20 20 */ 21 - 22 - /* 23 - * The idmap and swapper page tables need some space reserved in the kernel 24 - * image. Both require pgd, pud (4 levels only) and pmd tables to (section) 25 - * map the kernel. With the 64K page configuration, swapper and idmap need to 26 - * map to pte level. The swapper also maps the FDT (see __create_page_tables 27 - * for more information). Note that the number of ID map translation levels 28 - * could be increased on the fly if system RAM is out of reach for the default 29 - * VA range, so pages required to map highest possible PA are reserved in all 30 - * cases. 
31 - */ 32 - #ifdef CONFIG_ARM64_4K_PAGES 33 - #define SWAPPER_PGTABLE_LEVELS (CONFIG_PGTABLE_LEVELS - 1) 21 + #if defined(PMD_SIZE) && PMD_SIZE <= MIN_KIMG_ALIGN 22 + #define SWAPPER_BLOCK_SHIFT PMD_SHIFT 23 + #define SWAPPER_SKIP_LEVEL 1 34 24 #else 35 - #define SWAPPER_PGTABLE_LEVELS (CONFIG_PGTABLE_LEVELS) 25 + #define SWAPPER_BLOCK_SHIFT PAGE_SHIFT 26 + #define SWAPPER_SKIP_LEVEL 0 36 27 #endif 28 + #define SWAPPER_BLOCK_SIZE (UL(1) << SWAPPER_BLOCK_SHIFT) 29 + #define SWAPPER_TABLE_SHIFT (SWAPPER_BLOCK_SHIFT + PAGE_SHIFT - 3) 37 30 31 + #define SWAPPER_PGTABLE_LEVELS (CONFIG_PGTABLE_LEVELS - SWAPPER_SKIP_LEVEL) 32 + #define INIT_IDMAP_PGTABLE_LEVELS (IDMAP_LEVELS - SWAPPER_SKIP_LEVEL) 33 + 34 + #define IDMAP_VA_BITS 48 35 + #define IDMAP_LEVELS ARM64_HW_PGTABLE_LEVELS(IDMAP_VA_BITS) 36 + #define IDMAP_ROOT_LEVEL (4 - IDMAP_LEVELS) 38 37 39 38 /* 40 39 * A relocatable kernel may execute from an address that differs from the one at ··· 49 50 #define EARLY_ENTRIES(vstart, vend, shift, add) \ 50 51 (SPAN_NR_ENTRIES(vstart, vend, shift) + (add)) 51 52 52 - #define EARLY_PGDS(vstart, vend, add) (EARLY_ENTRIES(vstart, vend, PGDIR_SHIFT, add)) 53 + #define EARLY_LEVEL(lvl, lvls, vstart, vend, add) \ 54 + (lvls > lvl ? 
EARLY_ENTRIES(vstart, vend, SWAPPER_BLOCK_SHIFT + lvl * (PAGE_SHIFT - 3), add) : 0) 53 55 54 - #if SWAPPER_PGTABLE_LEVELS > 3 55 - #define EARLY_PUDS(vstart, vend, add) (EARLY_ENTRIES(vstart, vend, PUD_SHIFT, add)) 56 - #else 57 - #define EARLY_PUDS(vstart, vend, add) (0) 58 - #endif 56 + #define EARLY_PAGES(lvls, vstart, vend, add) (1 /* PGDIR page */ \ 57 + + EARLY_LEVEL(3, (lvls), (vstart), (vend), add) /* each entry needs a next level page table */ \ 58 + + EARLY_LEVEL(2, (lvls), (vstart), (vend), add) /* each entry needs a next level page table */ \ 59 + + EARLY_LEVEL(1, (lvls), (vstart), (vend), add))/* each entry needs a next level page table */ 60 + #define INIT_DIR_SIZE (PAGE_SIZE * (EARLY_PAGES(SWAPPER_PGTABLE_LEVELS, KIMAGE_VADDR, _end, EXTRA_PAGE) \ 61 + + EARLY_SEGMENT_EXTRA_PAGES)) 59 62 60 - #if SWAPPER_PGTABLE_LEVELS > 2 61 - #define EARLY_PMDS(vstart, vend, add) (EARLY_ENTRIES(vstart, vend, SWAPPER_TABLE_SHIFT, add)) 62 - #else 63 - #define EARLY_PMDS(vstart, vend, add) (0) 64 - #endif 63 + #define INIT_IDMAP_DIR_PAGES (EARLY_PAGES(INIT_IDMAP_PGTABLE_LEVELS, KIMAGE_VADDR, _end, 1)) 64 + #define INIT_IDMAP_DIR_SIZE ((INIT_IDMAP_DIR_PAGES + EARLY_IDMAP_EXTRA_PAGES) * PAGE_SIZE) 65 65 66 - #define EARLY_PAGES(vstart, vend, add) ( 1 /* PGDIR page */ \ 67 - + EARLY_PGDS((vstart), (vend), add) /* each PGDIR needs a next level page table */ \ 68 - + EARLY_PUDS((vstart), (vend), add) /* each PUD needs a next level page table */ \ 69 - + EARLY_PMDS((vstart), (vend), add)) /* each PMD needs a next level page table */ 70 - #define INIT_DIR_SIZE (PAGE_SIZE * EARLY_PAGES(KIMAGE_VADDR, _end, EXTRA_PAGE)) 66 + #define INIT_IDMAP_FDT_PAGES (EARLY_PAGES(INIT_IDMAP_PGTABLE_LEVELS, 0UL, UL(MAX_FDT_SIZE), 1) - 1) 67 + #define INIT_IDMAP_FDT_SIZE ((INIT_IDMAP_FDT_PAGES + EARLY_IDMAP_EXTRA_FDT_PAGES) * PAGE_SIZE) 71 68 72 - /* the initial ID map may need two extra pages if it needs to be extended */ 73 - #if VA_BITS < 48 74 - #define INIT_IDMAP_DIR_SIZE 
((INIT_IDMAP_DIR_PAGES + 2) * PAGE_SIZE) 75 - #else 76 - #define INIT_IDMAP_DIR_SIZE (INIT_IDMAP_DIR_PAGES * PAGE_SIZE) 77 - #endif 78 - #define INIT_IDMAP_DIR_PAGES EARLY_PAGES(KIMAGE_VADDR, _end + MAX_FDT_SIZE + SWAPPER_BLOCK_SIZE, 1) 69 + /* The number of segments in the kernel image (text, rodata, inittext, initdata, data+bss) */ 70 + #define KERNEL_SEGMENT_COUNT 5 79 71 80 - /* Initial memory map size */ 81 - #ifdef CONFIG_ARM64_4K_PAGES 82 - #define SWAPPER_BLOCK_SHIFT PMD_SHIFT 83 - #define SWAPPER_BLOCK_SIZE PMD_SIZE 84 - #define SWAPPER_TABLE_SHIFT PUD_SHIFT 85 - #else 86 - #define SWAPPER_BLOCK_SHIFT PAGE_SHIFT 87 - #define SWAPPER_BLOCK_SIZE PAGE_SIZE 88 - #define SWAPPER_TABLE_SHIFT PMD_SHIFT 89 - #endif 90 - 72 + #if SWAPPER_BLOCK_SIZE > SEGMENT_ALIGN 73 + #define EARLY_SEGMENT_EXTRA_PAGES (KERNEL_SEGMENT_COUNT + 1) 91 74 /* 92 - * Initial memory map attributes. 75 + * The initial ID map consists of the kernel image, mapped as two separate 76 + * segments, and may appear misaligned wrt the swapper block size. This means 77 + * we need 3 additional pages. The DT could straddle a swapper block boundary, 78 + * so it may need 2. 93 79 */ 94 - #define SWAPPER_PTE_FLAGS (PTE_TYPE_PAGE | PTE_AF | PTE_SHARED | PTE_UXN) 95 - #define SWAPPER_PMD_FLAGS (PMD_TYPE_SECT | PMD_SECT_AF | PMD_SECT_S | PTE_UXN) 96 - 97 - #ifdef CONFIG_ARM64_4K_PAGES 98 - #define SWAPPER_RW_MMUFLAGS (PMD_ATTRINDX(MT_NORMAL) | SWAPPER_PMD_FLAGS | PTE_WRITE) 99 - #define SWAPPER_RX_MMUFLAGS (SWAPPER_RW_MMUFLAGS | PMD_SECT_RDONLY) 80 + #define EARLY_IDMAP_EXTRA_PAGES 3 81 + #define EARLY_IDMAP_EXTRA_FDT_PAGES 2 100 82 #else 101 - #define SWAPPER_RW_MMUFLAGS (PTE_ATTRINDX(MT_NORMAL) | SWAPPER_PTE_FLAGS | PTE_WRITE) 102 - #define SWAPPER_RX_MMUFLAGS (SWAPPER_RW_MMUFLAGS | PTE_RDONLY) 83 + #define EARLY_SEGMENT_EXTRA_PAGES 0 84 + #define EARLY_IDMAP_EXTRA_PAGES 0 85 + #define EARLY_IDMAP_EXTRA_FDT_PAGES 0 103 86 #endif 104 87 105 88 #endif /* __ASM_KERNEL_PGTABLE_H */
+1 -1
arch/arm64/include/asm/kvm_arm.h
···
 #define HCRX_GUEST_FLAGS \
 	(HCRX_EL2_SMPME | HCRX_EL2_TCR2En | \
 	 (cpus_have_final_cap(ARM64_HAS_MOPS) ? (HCRX_EL2_MSCEn | HCRX_EL2_MCE2) : 0))
-#define HCRX_HOST_FLAGS (HCRX_EL2_MSCEn | HCRX_EL2_TCR2En)
+#define HCRX_HOST_FLAGS (HCRX_EL2_MSCEn | HCRX_EL2_TCR2En | HCRX_EL2_EnFPM)

 /* TCR_EL2 Registers bits */
 #define TCR_EL2_DS (1UL << 32)
+2 -8
arch/arm64/include/asm/kvm_emulate.h
···
 {
 	switch (kvm_vcpu_trap_get_fault(vcpu)) {
 	case ESR_ELx_FSC_EXTABT:
-	case ESR_ELx_FSC_SEA_TTW0:
-	case ESR_ELx_FSC_SEA_TTW1:
-	case ESR_ELx_FSC_SEA_TTW2:
-	case ESR_ELx_FSC_SEA_TTW3:
+	case ESR_ELx_FSC_SEA_TTW(-1) ... ESR_ELx_FSC_SEA_TTW(3):
 	case ESR_ELx_FSC_SECC:
-	case ESR_ELx_FSC_SECC_TTW0:
-	case ESR_ELx_FSC_SECC_TTW1:
-	case ESR_ELx_FSC_SECC_TTW2:
-	case ESR_ELx_FSC_SECC_TTW3:
+	case ESR_ELx_FSC_SECC_TTW(-1) ... ESR_ELx_FSC_SECC_TTW(3):
 		return true;
 	default:
 		return false;
+1
arch/arm64/include/asm/kvm_host.h
···
 	enum fp_type fp_type;
 	unsigned int sve_max_vl;
 	u64 svcr;
+	u64 fpmr;

 	/* Stage 2 paging state used by the hardware on next switch */
 	struct kvm_s2_mmu *hw_mmu;
+23 -8
arch/arm64/include/asm/memory.h
··· 30 30 * keep a constant PAGE_OFFSET and "fallback" to using the higher end 31 31 * of the VMEMMAP where 52-bit support is not available in hardware. 32 32 */ 33 - #define VMEMMAP_SHIFT (PAGE_SHIFT - STRUCT_PAGE_MAX_SHIFT) 34 - #define VMEMMAP_SIZE ((_PAGE_END(VA_BITS_MIN) - PAGE_OFFSET) >> VMEMMAP_SHIFT) 33 + #define VMEMMAP_RANGE (_PAGE_END(VA_BITS_MIN) - PAGE_OFFSET) 34 + #define VMEMMAP_SIZE ((VMEMMAP_RANGE >> PAGE_SHIFT) * sizeof(struct page)) 35 35 36 36 /* 37 37 * PAGE_OFFSET - the virtual address of the start of the linear map, at the ··· 47 47 #define MODULES_END (MODULES_VADDR + MODULES_VSIZE) 48 48 #define MODULES_VADDR (_PAGE_END(VA_BITS_MIN)) 49 49 #define MODULES_VSIZE (SZ_2G) 50 - #define VMEMMAP_START (-(UL(1) << (VA_BITS - VMEMMAP_SHIFT))) 51 - #define VMEMMAP_END (VMEMMAP_START + VMEMMAP_SIZE) 52 - #define PCI_IO_END (VMEMMAP_START - SZ_8M) 53 - #define PCI_IO_START (PCI_IO_END - PCI_IO_SIZE) 54 - #define FIXADDR_TOP (VMEMMAP_START - SZ_32M) 50 + #define VMEMMAP_START (VMEMMAP_END - VMEMMAP_SIZE) 51 + #define VMEMMAP_END (-UL(SZ_1G)) 52 + #define PCI_IO_START (VMEMMAP_END + SZ_8M) 53 + #define PCI_IO_END (PCI_IO_START + PCI_IO_SIZE) 54 + #define FIXADDR_TOP (-UL(SZ_8M)) 55 55 56 56 #if VA_BITS > 48 57 + #ifdef CONFIG_ARM64_16K_PAGES 58 + #define VA_BITS_MIN (47) 59 + #else 57 60 #define VA_BITS_MIN (48) 61 + #endif 58 62 #else 59 63 #define VA_BITS_MIN (VA_BITS) 60 64 #endif ··· 213 209 #include <asm/boot.h> 214 210 #include <asm/bug.h> 215 211 #include <asm/sections.h> 212 + #include <asm/sysreg.h> 213 + 214 + static inline u64 __pure read_tcr(void) 215 + { 216 + u64 tcr; 217 + 218 + // read_sysreg() uses asm volatile, so avoid it here 219 + asm("mrs %0, tcr_el1" : "=r"(tcr)); 220 + return tcr; 221 + } 216 222 217 223 #if VA_BITS > 48 218 - extern u64 vabits_actual; 224 + // For reasons of #include hell, we can't use TCR_T1SZ_OFFSET/TCR_T1SZ_MASK here 225 + #define vabits_actual (64 - ((read_tcr() >> 16) & 63)) 219 226 #else 220 227 #define 
vabits_actual ((u64)VA_BITS) 221 228 #endif
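The new `vabits_actual` definition derives the runtime VA width from the T1SZ field of TCR_EL1 instead of reading a stored variable. The arithmetic can be checked in isolation with the sketch below. It assumes T1SZ sits in bits [21:16], as implied by the `>> 16` and `& 63` in the hunk above; `va_bits_from_tcr` and `tcr_with_t1sz` are hypothetical helper names, and no system register is read.

```c
#include <stdint.h>

/* VA width = 64 - T1SZ, with T1SZ taken from bits [21:16] of the TCR value. */
static unsigned int va_bits_from_tcr(uint64_t tcr)
{
	return 64 - (unsigned int)((tcr >> 16) & 63);
}

/* Hypothetical helper: build a TCR value with only the T1SZ field populated. */
static uint64_t tcr_with_t1sz(unsigned int t1sz)
{
	return (uint64_t)(t1sz & 63) << 16;
}
```

Under these assumptions a T1SZ of 16 corresponds to 48 VA bits and a T1SZ of 12 to 52 VA bits. Other TCR bits, such as the `TCR_DS` bit added elsewhere in this diff, do not affect the result.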
+38 -2
arch/arm64/include/asm/mmu.h
···
 			       pgprot_t prot, bool page_mappings_only);
 extern void *fixmap_remap_fdt(phys_addr_t dt_phys, int *size, pgprot_t prot);
 extern void mark_linear_text_alias_ro(void);
-extern bool kaslr_requires_kpti(void);
+
+/*
+ * This check is triggered during the early boot before the cpufeature
+ * is initialised. Checking the status on the local CPU allows the boot
+ * CPU to detect the need for non-global mappings and thus avoiding a
+ * pagetable re-write after all the CPUs are booted. This check will be
+ * anyway run on individual CPUs, allowing us to get the consistent
+ * state once the SMP CPUs are up and thus make the switch to non-global
+ * mappings if required.
+ */
+static inline bool kaslr_requires_kpti(void)
+{
+	/*
+	 * E0PD does a similar job to KPTI so can be used instead
+	 * where available.
+	 */
+	if (IS_ENABLED(CONFIG_ARM64_E0PD)) {
+		u64 mmfr2 = read_sysreg_s(SYS_ID_AA64MMFR2_EL1);
+		if (cpuid_feature_extract_unsigned_field(mmfr2,
+						ID_AA64MMFR2_EL1_E0PD_SHIFT))
+			return false;
+	}
+
+	/*
+	 * Systems affected by Cavium erratum 24756 are incompatible
+	 * with KPTI.
+	 */
+	if (IS_ENABLED(CONFIG_CAVIUM_ERRATUM_27456)) {
+		extern const struct midr_range cavium_erratum_27456_cpus[];
+
+		if (is_midr_in_range_list(read_cpuid_id(),
+					  cavium_erratum_27456_cpus))
+			return false;
+	}
+
+	return true;
+}

 #define INIT_MM_CONTEXT(name) \
-	.pgd = init_pg_dir,
+	.pgd = swapper_pg_dir,

 #endif /* !__ASSEMBLY__ */
 #endif
+8 -45
arch/arm64/include/asm/mmu_context.h
··· 61 61 } 62 62 63 63 /* 64 - * TCR.T0SZ value to use when the ID map is active. Usually equals 65 - * TCR_T0SZ(VA_BITS), unless system RAM is positioned very high in 66 - * physical memory, in which case it will be smaller. 64 + * TCR.T0SZ value to use when the ID map is active. 67 65 */ 68 - extern int idmap_t0sz; 66 + #define idmap_t0sz TCR_T0SZ(IDMAP_VA_BITS) 69 67 70 68 /* 71 69 * Ensure TCR.T0SZ is set to the provided value. ··· 108 110 cpu_switch_mm(mm->pgd, mm); 109 111 } 110 112 111 - static inline void __cpu_install_idmap(pgd_t *idmap) 113 + static inline void cpu_install_idmap(void) 112 114 { 113 115 cpu_set_reserved_ttbr0(); 114 116 local_flush_tlb_all(); 115 117 cpu_set_idmap_tcr_t0sz(); 116 118 117 - cpu_switch_mm(lm_alias(idmap), &init_mm); 118 - } 119 - 120 - static inline void cpu_install_idmap(void) 121 - { 122 - __cpu_install_idmap(idmap_pg_dir); 119 + cpu_switch_mm(lm_alias(idmap_pg_dir), &init_mm); 123 120 } 124 121 125 122 /* ··· 141 148 isb(); 142 149 } 143 150 144 - /* 145 - * Atomically replaces the active TTBR1_EL1 PGD with a new VA-compatible PGD, 146 - * avoiding the possibility of conflicting TLB entries being allocated. 147 - */ 148 - static inline void __cpu_replace_ttbr1(pgd_t *pgdp, pgd_t *idmap, bool cnp) 149 - { 150 - typedef void (ttbr_replace_func)(phys_addr_t); 151 - extern ttbr_replace_func idmap_cpu_replace_ttbr1; 152 - ttbr_replace_func *replace_phys; 153 - unsigned long daif; 154 - 155 - /* phys_to_ttbr() zeros lower 2 bits of ttbr with 52-bit PA */ 156 - phys_addr_t ttbr1 = phys_to_ttbr(virt_to_phys(pgdp)); 157 - 158 - if (cnp) 159 - ttbr1 |= TTBR_CNP_BIT; 160 - 161 - replace_phys = (void *)__pa_symbol(idmap_cpu_replace_ttbr1); 162 - 163 - __cpu_install_idmap(idmap); 164 - 165 - /* 166 - * We really don't want to take *any* exceptions while TTBR1 is 167 - * in the process of being replaced so mask everything. 
168 - */ 169 - daif = local_daif_save(); 170 - replace_phys(ttbr1); 171 - local_daif_restore(daif); 172 - 173 - cpu_uninstall_idmap(); 174 - } 151 + void __cpu_replace_ttbr1(pgd_t *pgdp, bool cnp); 175 152 176 153 static inline void cpu_enable_swapper_cnp(void) 177 154 { 178 - __cpu_replace_ttbr1(lm_alias(swapper_pg_dir), idmap_pg_dir, true); 155 + __cpu_replace_ttbr1(lm_alias(swapper_pg_dir), true); 179 156 } 180 157 181 - static inline void cpu_replace_ttbr1(pgd_t *pgdp, pgd_t *idmap) 158 + static inline void cpu_replace_ttbr1(pgd_t *pgdp) 182 159 { 183 160 /* 184 161 * Only for early TTBR1 replacement before cpucaps are finalized and 185 162 * before we've decided whether to use CNP. 186 163 */ 187 164 WARN_ON(system_capabilities_finalized()); 188 - __cpu_replace_ttbr1(pgdp, idmap, false); 165 + __cpu_replace_ttbr1(pgdp, false); 189 166 } 190 167 191 168 /*
+51 -1
arch/arm64/include/asm/pgalloc.h
··· 14 14 #include <asm/tlbflush.h> 15 15 16 16 #define __HAVE_ARCH_PGD_FREE 17 + #define __HAVE_ARCH_PUD_FREE 17 18 #include <asm-generic/pgalloc.h> 18 19 19 20 #define PGD_SIZE (PTRS_PER_PGD * sizeof(pgd_t)) ··· 44 43 45 44 static inline void __p4d_populate(p4d_t *p4dp, phys_addr_t pudp, p4dval_t prot) 46 45 { 47 - set_p4d(p4dp, __p4d(__phys_to_p4d_val(pudp) | prot)); 46 + if (pgtable_l4_enabled()) 47 + set_p4d(p4dp, __p4d(__phys_to_p4d_val(pudp) | prot)); 48 48 } 49 49 50 50 static inline void p4d_populate(struct mm_struct *mm, p4d_t *p4dp, pud_t *pudp) ··· 55 53 p4dval |= (mm == &init_mm) ? P4D_TABLE_UXN : P4D_TABLE_PXN; 56 54 __p4d_populate(p4dp, __pa(pudp), p4dval); 57 55 } 56 + 57 + static inline void pud_free(struct mm_struct *mm, pud_t *pud) 58 + { 59 + if (!pgtable_l4_enabled()) 60 + return; 61 + __pud_free(mm, pud); 62 + } 58 63 #else 59 64 static inline void __p4d_populate(p4d_t *p4dp, phys_addr_t pudp, p4dval_t prot) 60 65 { 61 66 BUILD_BUG(); 62 67 } 63 68 #endif /* CONFIG_PGTABLE_LEVELS > 3 */ 69 + 70 + #if CONFIG_PGTABLE_LEVELS > 4 71 + 72 + static inline void __pgd_populate(pgd_t *pgdp, phys_addr_t p4dp, pgdval_t prot) 73 + { 74 + if (pgtable_l5_enabled()) 75 + set_pgd(pgdp, __pgd(__phys_to_pgd_val(p4dp) | prot)); 76 + } 77 + 78 + static inline void pgd_populate(struct mm_struct *mm, pgd_t *pgdp, p4d_t *p4dp) 79 + { 80 + pgdval_t pgdval = PGD_TYPE_TABLE; 81 + 82 + pgdval |= (mm == &init_mm) ? 
PGD_TABLE_UXN : PGD_TABLE_PXN; 83 + __pgd_populate(pgdp, __pa(p4dp), pgdval); 84 + } 85 + 86 + static inline p4d_t *p4d_alloc_one(struct mm_struct *mm, unsigned long addr) 87 + { 88 + gfp_t gfp = GFP_PGTABLE_USER; 89 + 90 + if (mm == &init_mm) 91 + gfp = GFP_PGTABLE_KERNEL; 92 + return (p4d_t *)get_zeroed_page(gfp); 93 + } 94 + 95 + static inline void p4d_free(struct mm_struct *mm, p4d_t *p4d) 96 + { 97 + if (!pgtable_l5_enabled()) 98 + return; 99 + BUG_ON((unsigned long)p4d & (PAGE_SIZE-1)); 100 + free_page((unsigned long)p4d); 101 + } 102 + 103 + #define __p4d_free_tlb(tlb, p4d, addr) p4d_free((tlb)->mm, p4d) 104 + #else 105 + static inline void __pgd_populate(pgd_t *pgdp, phys_addr_t p4dp, pgdval_t prot) 106 + { 107 + BUILD_BUG(); 108 + } 109 + #endif /* CONFIG_PGTABLE_LEVELS > 4 */ 64 110 65 111 extern pgd_t *pgd_alloc(struct mm_struct *mm); 66 112 extern void pgd_free(struct mm_struct *mm, pgd_t *pgdp);
+27 -6
arch/arm64/include/asm/pgtable-hwdef.h
··· 26 26 #define ARM64_HW_PGTABLE_LEVELS(va_bits) (((va_bits) - 4) / (PAGE_SHIFT - 3)) 27 27 28 28 /* 29 - * Size mapped by an entry at level n ( 0 <= n <= 3) 29 + * Size mapped by an entry at level n ( -1 <= n <= 3) 30 30 * We map (PAGE_SHIFT - 3) at all translation levels and PAGE_SHIFT bits 31 31 * in the final page. The maximum number of translation levels supported by 32 - * the architecture is 4. Hence, starting at level n, we have further 32 + * the architecture is 5. Hence, starting at level n, we have further 33 33 * ((4 - n) - 1) levels of translation excluding the offset within the page. 34 34 * So, the total number of bits mapped by an entry at level n is : 35 35 * ··· 62 62 #define PTRS_PER_PUD (1 << (PAGE_SHIFT - 3)) 63 63 #endif 64 64 65 + #if CONFIG_PGTABLE_LEVELS > 4 66 + #define P4D_SHIFT ARM64_HW_PGTABLE_LEVEL_SHIFT(0) 67 + #define P4D_SIZE (_AC(1, UL) << P4D_SHIFT) 68 + #define P4D_MASK (~(P4D_SIZE-1)) 69 + #define PTRS_PER_P4D (1 << (PAGE_SHIFT - 3)) 70 + #endif 71 + 65 72 /* 66 73 * PGDIR_SHIFT determines the size a top-level page table entry can map 67 - * (depending on the configuration, this level can be 0, 1 or 2). 74 + * (depending on the configuration, this level can be -1, 0, 1 or 2). 68 75 */ 69 76 #define PGDIR_SHIFT ARM64_HW_PGTABLE_LEVEL_SHIFT(4 - CONFIG_PGTABLE_LEVELS) 70 77 #define PGDIR_SIZE (_AC(1, UL) << PGDIR_SHIFT) ··· 94 87 /* 95 88 * Hardware page table definitions. 96 89 * 90 + * Level -1 descriptor (PGD). 91 + */ 92 + #define PGD_TYPE_TABLE (_AT(pgdval_t, 3) << 0) 93 + #define PGD_TABLE_BIT (_AT(pgdval_t, 1) << 1) 94 + #define PGD_TYPE_MASK (_AT(pgdval_t, 3) << 0) 95 + #define PGD_TABLE_PXN (_AT(pgdval_t, 1) << 59) 96 + #define PGD_TABLE_UXN (_AT(pgdval_t, 1) << 60) 97 + 98 + /* 97 99 * Level 0 descriptor (P4D). 
98 100 */ 99 101 #define P4D_TYPE_TABLE (_AT(p4dval_t, 3) << 0) ··· 171 155 #define PTE_PXN (_AT(pteval_t, 1) << 53) /* Privileged XN */ 172 156 #define PTE_UXN (_AT(pteval_t, 1) << 54) /* User XN */ 173 157 174 - #define PTE_ADDR_LOW (((_AT(pteval_t, 1) << (48 - PAGE_SHIFT)) - 1) << PAGE_SHIFT) 158 + #define PTE_ADDR_LOW (((_AT(pteval_t, 1) << (50 - PAGE_SHIFT)) - 1) << PAGE_SHIFT) 175 159 #ifdef CONFIG_ARM64_PA_BITS_52 160 + #ifdef CONFIG_ARM64_64K_PAGES 176 161 #define PTE_ADDR_HIGH (_AT(pteval_t, 0xf) << 12) 177 - #define PTE_ADDR_MASK (PTE_ADDR_LOW | PTE_ADDR_HIGH) 178 162 #define PTE_ADDR_HIGH_SHIFT 36 163 + #define PHYS_TO_PTE_ADDR_MASK (PTE_ADDR_LOW | PTE_ADDR_HIGH) 179 164 #else 180 - #define PTE_ADDR_MASK PTE_ADDR_LOW 165 + #define PTE_ADDR_HIGH (_AT(pteval_t, 0x3) << 8) 166 + #define PTE_ADDR_HIGH_SHIFT 42 167 + #define PHYS_TO_PTE_ADDR_MASK GENMASK_ULL(49, 8) 168 + #endif 181 169 #endif 182 170 183 171 /* ··· 304 284 #define TCR_E0PD1 (UL(1) << 56) 305 285 #define TCR_TCMA0 (UL(1) << 57) 306 286 #define TCR_TCMA1 (UL(1) << 58) 287 + #define TCR_DS (UL(1) << 59) 307 288 308 289 /* 309 290 * TTBR.
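The updated comments state that each translation level resolves `PAGE_SHIFT - 3` bits and that the architecture now allows up to five levels, numbered -1 to 3. The sketch below works through that arithmetic for a few configurations. The level-count formula is the `ARM64_HW_PGTABLE_LEVELS` macro visible in the hunk; the per-level shift is derived from the comment's description (`(PAGE_SHIFT - 3)` bits for each of the `(4 - n) - 1` further levels, plus `PAGE_SHIFT`) rather than copied from the diff. Both function names are hypothetical and take the page shift as a parameter so several page sizes can be compared.

```c
/* Number of translation levels needed for a given VA width and page size. */
static int hw_pgtable_levels(int va_bits, int page_shift)
{
	return (va_bits - 4) / (page_shift - 3);
}

/*
 * Lowest address bit resolved by an entry at level n (-1 <= n <= 3):
 * (page_shift - 3) * ((4 - n) - 1) + page_shift, simplified.
 */
static int hw_pgtable_level_shift(int n, int page_shift)
{
	return (page_shift - 3) * (4 - n) + 3;
}
```

With 4K pages (`page_shift` of 12), 48-bit VAs need four levels and 52-bit VAs need five, and a level -1 entry covers 2^48 bytes. With 16K pages, 52-bit VAs fit in four levels, which matches `VA_BITS_MIN` being 47 for that page size in the memory.h hunk.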
+14 -6
arch/arm64/include/asm/pgtable-prot.h
···
 #define _PROT_DEFAULT (PTE_TYPE_PAGE | PTE_AF | PTE_SHARED)
 #define _PROT_SECT_DEFAULT (PMD_TYPE_SECT | PMD_SECT_AF | PMD_SECT_S)

-#define PROT_DEFAULT (_PROT_DEFAULT | PTE_MAYBE_NG)
-#define PROT_SECT_DEFAULT (_PROT_SECT_DEFAULT | PMD_MAYBE_NG)
+#define PROT_DEFAULT (PTE_TYPE_PAGE | PTE_MAYBE_NG | PTE_MAYBE_SHARED | PTE_AF)
+#define PROT_SECT_DEFAULT (PMD_TYPE_SECT | PMD_MAYBE_NG | PMD_MAYBE_SHARED | PMD_SECT_AF)

 #define PROT_DEVICE_nGnRnE (PROT_DEFAULT | PTE_PXN | PTE_UXN | PTE_WRITE | PTE_ATTRINDX(MT_DEVICE_nGnRnE))
 #define PROT_DEVICE_nGnRE (PROT_DEFAULT | PTE_PXN | PTE_UXN | PTE_WRITE | PTE_ATTRINDX(MT_DEVICE_nGnRE))
···
 #define _PAGE_READONLY_EXEC (_PAGE_DEFAULT | PTE_USER | PTE_RDONLY | PTE_NG | PTE_PXN)
 #define _PAGE_EXECONLY (_PAGE_DEFAULT | PTE_RDONLY | PTE_NG | PTE_PXN)

-#ifdef __ASSEMBLY__
-#define PTE_MAYBE_NG 0
-#endif
-
 #ifndef __ASSEMBLY__

 #include <asm/cpufeature.h>
···
 #define PTE_MAYBE_NG (arm64_use_ng_mappings ? PTE_NG : 0)
 #define PMD_MAYBE_NG (arm64_use_ng_mappings ? PMD_SECT_NG : 0)

+#ifndef CONFIG_ARM64_LPA2
 #define lpa2_is_enabled() false
+#define PTE_MAYBE_SHARED PTE_SHARED
+#define PMD_MAYBE_SHARED PMD_SECT_S
+#else
+static inline bool __pure lpa2_is_enabled(void)
+{
+	return read_tcr() & TCR_DS;
+}
+
+#define PTE_MAYBE_SHARED (lpa2_is_enabled() ? 0 : PTE_SHARED)
+#define PMD_MAYBE_SHARED (lpa2_is_enabled() ? 0 : PMD_SECT_S)
+#endif

 /*
  * If we have userspace only BTI we don't want to mark kernel pages
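`PTE_MAYBE_SHARED` evaluates to 0 when LPA2 is active because descriptor bits [9:8], which otherwise hold the shareability attribute, carry physical address bits [51:50] in that mode. The sketch below illustrates the resulting address packing using the mask values from the pgtable-hwdef.h hunk in this diff. It assumes 4K pages, LPA2 enabled, and a page-aligned physical address below 2^52; `phys_to_pte_val` and `pte_to_phys` are simplified stand-ins for the kernel helpers, not copies of them.

```c
#include <stdint.h>

#define PAGE_SHIFT		12	/* assumption: 4K pages */

/* Address bits [49:12] stay in place in the descriptor. */
#define PTE_ADDR_LOW		(((UINT64_C(1) << (50 - PAGE_SHIFT)) - 1) << PAGE_SHIFT)
/* Address bits [51:50] are stored in descriptor bits [9:8]. */
#define PTE_ADDR_HIGH		(UINT64_C(0x3) << 8)
#define PTE_ADDR_HIGH_SHIFT	42
/* Equivalent of GENMASK_ULL(49, 8). */
#define PHYS_TO_PTE_ADDR_MASK	(((UINT64_C(1) << 50) - 1) & ~UINT64_C(0xff))

/* Fold PA[51:50] down into bits [9:8] and drop everything outside [49:8]. */
static uint64_t phys_to_pte_val(uint64_t phys)
{
	return (phys | (phys >> PTE_ADDR_HIGH_SHIFT)) & PHYS_TO_PTE_ADDR_MASK;
}

/* Reverse the packing: move bits [9:8] back up to [51:50]. */
static uint64_t pte_to_phys(uint64_t pte)
{
	return (pte & PTE_ADDR_LOW) |
	       ((pte & PTE_ADDR_HIGH) << PTE_ADDR_HIGH_SHIFT);
}
```

The kernel's `__pte_to_phys()` additionally clears `PTE_MAYBE_SHARED` before decoding, so that shareability bits are not misread as address bits on hardware where LPA2 turns out to be unavailable at runtime. The sketch skips that step because it assumes LPA2 is on.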
+6
arch/arm64/include/asm/pgtable-types.h
···
 #define __pud(x) ((pud_t) { (x) } )
 #endif

+#if CONFIG_PGTABLE_LEVELS > 4
+typedef struct { p4dval_t p4d; } p4d_t;
+#define p4d_val(x) ((x).p4d)
+#define __p4d(x) ((p4d_t) { (x) } )
+#endif
+
 typedef struct { pgdval_t pgd; } pgd_t;
 #define pgd_val(x) ((x).pgd)
 #define __pgd(x) ((pgd_t) { (x) } )
+211 -26
arch/arm64/include/asm/pgtable.h
··· 18 18 * VMALLOC range. 19 19 * 20 20 * VMALLOC_START: beginning of the kernel vmalloc space 21 - * VMALLOC_END: extends to the available space below vmemmap, PCI I/O space 22 - * and fixed mappings 21 + * VMALLOC_END: extends to the available space below vmemmap 23 22 */ 24 23 #define VMALLOC_START (MODULES_END) 25 - #define VMALLOC_END (VMEMMAP_START - SZ_256M) 24 + #if VA_BITS == VA_BITS_MIN 25 + #define VMALLOC_END (VMEMMAP_START - SZ_8M) 26 + #else 27 + #define VMEMMAP_UNUSED_NPAGES ((_PAGE_OFFSET(vabits_actual) - PAGE_OFFSET) >> PAGE_SHIFT) 28 + #define VMALLOC_END (VMEMMAP_START + VMEMMAP_UNUSED_NPAGES * sizeof(struct page) - SZ_8M) 29 + #endif 26 30 27 31 #define vmemmap ((struct page *)VMEMMAP_START - (memstart_addr >> PAGE_SHIFT)) 28 32 ··· 80 76 #ifdef CONFIG_ARM64_PA_BITS_52 81 77 static inline phys_addr_t __pte_to_phys(pte_t pte) 82 78 { 79 + pte_val(pte) &= ~PTE_MAYBE_SHARED; 83 80 return (pte_val(pte) & PTE_ADDR_LOW) | 84 81 ((pte_val(pte) & PTE_ADDR_HIGH) << PTE_ADDR_HIGH_SHIFT); 85 82 } 86 83 static inline pteval_t __phys_to_pte_val(phys_addr_t phys) 87 84 { 88 - return (phys | (phys >> PTE_ADDR_HIGH_SHIFT)) & PTE_ADDR_MASK; 85 + return (phys | (phys >> PTE_ADDR_HIGH_SHIFT)) & PHYS_TO_PTE_ADDR_MASK; 89 86 } 90 87 #else 91 - #define __pte_to_phys(pte) (pte_val(pte) & PTE_ADDR_MASK) 88 + #define __pte_to_phys(pte) (pte_val(pte) & PTE_ADDR_LOW) 92 89 #define __phys_to_pte_val(phys) (phys) 93 90 #endif 94 91 ··· 621 616 PUD_TYPE_TABLE) 622 617 #endif 623 618 624 - extern pgd_t init_pg_dir[PTRS_PER_PGD]; 619 + extern pgd_t init_pg_dir[]; 625 620 extern pgd_t init_pg_end[]; 626 - extern pgd_t swapper_pg_dir[PTRS_PER_PGD]; 627 - extern pgd_t idmap_pg_dir[PTRS_PER_PGD]; 628 - extern pgd_t tramp_pg_dir[PTRS_PER_PGD]; 629 - extern pgd_t reserved_pg_dir[PTRS_PER_PGD]; 621 + extern pgd_t swapper_pg_dir[]; 622 + extern pgd_t idmap_pg_dir[]; 623 + extern pgd_t tramp_pg_dir[]; 624 + extern pgd_t reserved_pg_dir[]; 630 625 631 626 extern void 
set_swapper_pgd(pgd_t *pgdp, pgd_t pgd);
 
···
 #define pud_user(pud)		pte_user(pud_pte(pud))
 #define pud_user_exec(pud)	pte_user_exec(pud_pte(pud))
 
+static inline bool pgtable_l4_enabled(void);
+
 static inline void set_pud(pud_t *pudp, pud_t pud)
 {
-#ifdef __PAGETABLE_PUD_FOLDED
-	if (in_swapper_pgdir(pudp)) {
+	if (!pgtable_l4_enabled() && in_swapper_pgdir(pudp)) {
 		set_swapper_pgd((pgd_t *)pudp, __pgd(pud_val(pud)));
 		return;
 	}
-#endif /* __PAGETABLE_PUD_FOLDED */
 
 	WRITE_ONCE(*pudp, pud);
 
···
 
 #if CONFIG_PGTABLE_LEVELS > 3
 
+static __always_inline bool pgtable_l4_enabled(void)
+{
+	if (CONFIG_PGTABLE_LEVELS > 4 || !IS_ENABLED(CONFIG_ARM64_LPA2))
+		return true;
+	if (!alternative_has_cap_likely(ARM64_ALWAYS_BOOT))
+		return vabits_actual == VA_BITS;
+	return alternative_has_cap_unlikely(ARM64_HAS_VA52);
+}
+
+static inline bool mm_pud_folded(const struct mm_struct *mm)
+{
+	return !pgtable_l4_enabled();
+}
+#define mm_pud_folded  mm_pud_folded
+
 #define pud_ERROR(e)	\
 	pr_err("%s:%d: bad pud %016llx.\n", __FILE__, __LINE__, pud_val(e))
 
-#define p4d_none(p4d)		(!p4d_val(p4d))
-#define p4d_bad(p4d)		(!(p4d_val(p4d) & 2))
-#define p4d_present(p4d)	(p4d_val(p4d))
+#define p4d_none(p4d)		(pgtable_l4_enabled() && !p4d_val(p4d))
+#define p4d_bad(p4d)		(pgtable_l4_enabled() && !(p4d_val(p4d) & 2))
+#define p4d_present(p4d)	(!p4d_none(p4d))
 
 static inline void set_p4d(p4d_t *p4dp, p4d_t p4d)
 {
···
 
 static inline void p4d_clear(p4d_t *p4dp)
 {
-	set_p4d(p4dp, __p4d(0));
+	if (pgtable_l4_enabled())
+		set_p4d(p4dp, __p4d(0));
 }
 
 static inline phys_addr_t p4d_page_paddr(p4d_t p4d)
···
 	return __p4d_to_phys(p4d);
 }
 
+#define pud_index(addr)		(((addr) >> PUD_SHIFT) & (PTRS_PER_PUD - 1))
+
+static inline pud_t *p4d_to_folded_pud(p4d_t *p4dp, unsigned long addr)
+{
+	return (pud_t *)PTR_ALIGN_DOWN(p4dp, PAGE_SIZE) + pud_index(addr);
+}
+
 static inline pud_t *p4d_pgtable(p4d_t p4d)
 {
 	return (pud_t *)__va(p4d_page_paddr(p4d));
 }
 
-/* Find an entry in the first-level page table. */
-#define pud_offset_phys(dir, addr)	(p4d_page_paddr(READ_ONCE(*(dir))) + pud_index(addr) * sizeof(pud_t))
+static inline phys_addr_t pud_offset_phys(p4d_t *p4dp, unsigned long addr)
+{
+	BUG_ON(!pgtable_l4_enabled());
 
-#define pud_set_fixmap(addr)		((pud_t *)set_fixmap_offset(FIX_PUD, addr))
-#define pud_set_fixmap_offset(p4d, addr)	pud_set_fixmap(pud_offset_phys(p4d, addr))
-#define pud_clear_fixmap()		clear_fixmap(FIX_PUD)
+	return p4d_page_paddr(READ_ONCE(*p4dp)) + pud_index(addr) * sizeof(pud_t);
+}
+
+static inline
+pud_t *pud_offset_lockless(p4d_t *p4dp, p4d_t p4d, unsigned long addr)
+{
+	if (!pgtable_l4_enabled())
+		return p4d_to_folded_pud(p4dp, addr);
+	return (pud_t *)__va(p4d_page_paddr(p4d)) + pud_index(addr);
+}
+#define pud_offset_lockless pud_offset_lockless
+
+static inline pud_t *pud_offset(p4d_t *p4dp, unsigned long addr)
+{
+	return pud_offset_lockless(p4dp, READ_ONCE(*p4dp), addr);
+}
+#define pud_offset	pud_offset
+
+static inline pud_t *pud_set_fixmap(unsigned long addr)
+{
+	if (!pgtable_l4_enabled())
+		return NULL;
+	return (pud_t *)set_fixmap_offset(FIX_PUD, addr);
+}
+
+static inline pud_t *pud_set_fixmap_offset(p4d_t *p4dp, unsigned long addr)
+{
+	if (!pgtable_l4_enabled())
+		return p4d_to_folded_pud(p4dp, addr);
+	return pud_set_fixmap(pud_offset_phys(p4dp, addr));
+}
+
+static inline void pud_clear_fixmap(void)
+{
+	if (pgtable_l4_enabled())
+		clear_fixmap(FIX_PUD);
+}
+
+/* use ONLY for statically allocated translation tables */
+static inline pud_t *pud_offset_kimg(p4d_t *p4dp, u64 addr)
+{
+	if (!pgtable_l4_enabled())
+		return p4d_to_folded_pud(p4dp, addr);
+	return (pud_t *)__phys_to_kimg(pud_offset_phys(p4dp, addr));
+}
 
 #define p4d_page(p4d)		pfn_to_page(__phys_to_pfn(__p4d_to_phys(p4d)))
 
-/* use ONLY for statically allocated translation tables */
-#define pud_offset_kimg(dir,addr)	((pud_t *)__phys_to_kimg(pud_offset_phys((dir), (addr))))
-
 #else
 
+static inline bool pgtable_l4_enabled(void) { return false; }
+
 #define p4d_page_paddr(p4d)	({ BUILD_BUG(); 0;})
-#define pgd_page_paddr(pgd)	({ BUILD_BUG(); 0;})
 
 /* Match pud_offset folding in <asm/generic/pgtable-nopud.h> */
 #define pud_set_fixmap(addr)		NULL
···
 #define pud_offset_kimg(dir,addr)	((pud_t *)dir)
 
 #endif  /* CONFIG_PGTABLE_LEVELS > 3 */
+
+#if CONFIG_PGTABLE_LEVELS > 4
+
+static __always_inline bool pgtable_l5_enabled(void)
+{
+	if (!alternative_has_cap_likely(ARM64_ALWAYS_BOOT))
+		return vabits_actual == VA_BITS;
+	return alternative_has_cap_unlikely(ARM64_HAS_VA52);
+}
+
+static inline bool mm_p4d_folded(const struct mm_struct *mm)
+{
+	return !pgtable_l5_enabled();
+}
+#define mm_p4d_folded  mm_p4d_folded
+
+#define p4d_ERROR(e)	\
+	pr_err("%s:%d: bad p4d %016llx.\n", __FILE__, __LINE__, p4d_val(e))
+
+#define pgd_none(pgd)		(pgtable_l5_enabled() && !pgd_val(pgd))
+#define pgd_bad(pgd)		(pgtable_l5_enabled() && !(pgd_val(pgd) & 2))
+#define pgd_present(pgd)	(!pgd_none(pgd))
+
+static inline void set_pgd(pgd_t *pgdp, pgd_t pgd)
+{
+	if (in_swapper_pgdir(pgdp)) {
+		set_swapper_pgd(pgdp, __pgd(pgd_val(pgd)));
+		return;
+	}
+
+	WRITE_ONCE(*pgdp, pgd);
+	dsb(ishst);
+	isb();
+}
+
+static inline void pgd_clear(pgd_t *pgdp)
+{
+	if (pgtable_l5_enabled())
+		set_pgd(pgdp, __pgd(0));
+}
+
+static inline phys_addr_t pgd_page_paddr(pgd_t pgd)
+{
+	return __pgd_to_phys(pgd);
+}
+
+#define p4d_index(addr)		(((addr) >> P4D_SHIFT) & (PTRS_PER_P4D - 1))
+
+static inline p4d_t *pgd_to_folded_p4d(pgd_t *pgdp, unsigned long addr)
+{
+	return (p4d_t *)PTR_ALIGN_DOWN(pgdp, PAGE_SIZE) + p4d_index(addr);
+}
+
+static inline phys_addr_t p4d_offset_phys(pgd_t *pgdp, unsigned long addr)
+{
+	BUG_ON(!pgtable_l5_enabled());
+
+	return pgd_page_paddr(READ_ONCE(*pgdp)) + p4d_index(addr) * sizeof(p4d_t);
+}
+
+static inline
+p4d_t *p4d_offset_lockless(pgd_t *pgdp, pgd_t pgd, unsigned long addr)
+{
+	if (!pgtable_l5_enabled())
+		return pgd_to_folded_p4d(pgdp, addr);
+	return (p4d_t *)__va(pgd_page_paddr(pgd)) + p4d_index(addr);
+}
+#define p4d_offset_lockless p4d_offset_lockless
+
+static inline p4d_t *p4d_offset(pgd_t *pgdp, unsigned long addr)
+{
+	return p4d_offset_lockless(pgdp, READ_ONCE(*pgdp), addr);
+}
+
+static inline p4d_t *p4d_set_fixmap(unsigned long addr)
+{
+	if (!pgtable_l5_enabled())
+		return NULL;
+	return (p4d_t *)set_fixmap_offset(FIX_P4D, addr);
+}
+
+static inline p4d_t *p4d_set_fixmap_offset(pgd_t *pgdp, unsigned long addr)
+{
+	if (!pgtable_l5_enabled())
+		return pgd_to_folded_p4d(pgdp, addr);
+	return p4d_set_fixmap(p4d_offset_phys(pgdp, addr));
+}
+
+static inline void p4d_clear_fixmap(void)
+{
+	if (pgtable_l5_enabled())
+		clear_fixmap(FIX_P4D);
+}
+
+/* use ONLY for statically allocated translation tables */
+static inline p4d_t *p4d_offset_kimg(pgd_t *pgdp, u64 addr)
+{
+	if (!pgtable_l5_enabled())
+		return pgd_to_folded_p4d(pgdp, addr);
+	return (p4d_t *)__phys_to_kimg(p4d_offset_phys(pgdp, addr));
+}
+
+#define pgd_page(pgd)		pfn_to_page(__phys_to_pfn(__pgd_to_phys(pgd)))
+
+#else
+
+static inline bool pgtable_l5_enabled(void) { return false; }
+
+/* Match p4d_offset folding in <asm/generic/pgtable-nop4d.h> */
+#define p4d_set_fixmap(addr)		NULL
+#define p4d_set_fixmap_offset(p4dp, addr)	((p4d_t *)p4dp)
+#define p4d_clear_fixmap()
+
+#define p4d_offset_kimg(dir,addr)	((p4d_t *)dir)
+
+#endif  /* CONFIG_PGTABLE_LEVELS > 4 */
 
 #define pgd_ERROR(e)	\
 	pr_err("%s:%d: bad pgd %016llx.\n", __FILE__, __LINE__, pgd_val(e))
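The run-time folding helpers in this hunk (p4d_to_folded_pud() and pgd_to_folded_p4d()) both round the incoming entry pointer down to the start of its table page and then index that same page with the next level's index. A minimal user-space sketch of that arithmetic follows. It assumes 4K pages, a PUD shift of 30 and 512 entries per table; every `sketch_`/`SKETCH_` name is hypothetical and not part of the kernel.

```c
#include <stdint.h>

/* Assumed geometry for illustration only: 4K pages, 512 entries per table. */
#define SKETCH_PAGE_SIZE	4096UL
#define SKETCH_PUD_SHIFT	30
#define SKETCH_PTRS_PER_PUD	512UL

typedef uint64_t sketch_pud_t;

/* Same shape as pud_index() in the diff: shift, then mask to the table size. */
static inline uint64_t sketch_pud_index(uint64_t addr)
{
	return (addr >> SKETCH_PUD_SHIFT) & (SKETCH_PTRS_PER_PUD - 1);
}

/*
 * Mirrors the shape of p4d_to_folded_pud(): align the entry pointer down to
 * its page, then treat that page as the next-level table and index into it.
 */
static inline sketch_pud_t *sketch_folded_pud(void *p4dp, uint64_t addr)
{
	uintptr_t base = (uintptr_t)p4dp & ~(uintptr_t)(SKETCH_PAGE_SIZE - 1);

	return (sketch_pud_t *)base + sketch_pud_index(addr);
}
```

With a page-aligned table, passing a pointer to any slot of that table together with an address whose PUD index is 3 yields a pointer to slot 3 of the same page, which is what lets a folded level reuse the parent's table.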
+4
arch/arm64/include/asm/processor.h
···
 	struct {
 		unsigned long	tp_value;	/* TLS register */
 		unsigned long	tp2_value;
+		u64		fpmr;
+		unsigned long	pad;
 		struct user_fpsimd_state fpsimd_state;
 	} uw;
 
···
 	BUILD_BUG_ON(sizeof_field(struct thread_struct, uw) !=
 		     sizeof_field(struct thread_struct, uw.tp_value) +
 		     sizeof_field(struct thread_struct, uw.tp2_value) +
+		     sizeof_field(struct thread_struct, uw.fpmr) +
+		     sizeof_field(struct thread_struct, uw.pad) +
 		     sizeof_field(struct thread_struct, uw.fpsimd_state));
 
 	*offset = offsetof(struct thread_struct, uw);
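The BUILD_BUG_ON() above only holds if the struct has no compiler-inserted padding, which is presumably why an explicit `pad` accompanies the new `fpmr` field. The sketch below reproduces the effect in plain C11 under the assumption that the FPSIMD state is 16-byte aligned. The `sketch_` types are stand-ins, not the kernel definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for struct user_fpsimd_state; the 16-byte alignment is the point. */
struct sketch_fpsimd_state {
	_Alignas(16) uint8_t vregs[32][16];
	uint32_t fpsr;
	uint32_t fpcr;
	uint32_t reserved[2];
};

/* Layout with the explicit filler, as in the diff. */
struct sketch_uw {
	uint64_t tp_value;
	uint64_t tp2_value;
	uint64_t fpmr;
	uint64_t pad;
	struct sketch_fpsimd_state fpsimd_state;
};

/* Same fields without the filler: the compiler must add hidden padding. */
struct sketch_uw_nopad {
	uint64_t tp_value;
	uint64_t tp2_value;
	uint64_t fpmr;
	struct sketch_fpsimd_state fpsimd_state;
};

/* Sum of the declared field sizes, the quantity BUILD_BUG_ON() compares. */
static size_t sketch_uw_field_sum(void)
{
	return 4 * sizeof(uint64_t) + sizeof(struct sketch_fpsimd_state);
}

static size_t sketch_uw_nopad_field_sum(void)
{
	return 3 * sizeof(uint64_t) + sizeof(struct sketch_fpsimd_state);
}
```

In the padded layout the field sum equals `sizeof(struct sketch_uw)`; in the unpadded one the field sum falls short of the struct size, which is the condition the kernel's build-time check would reject.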
+5 -31
arch/arm64/include/asm/scs.h
···
 #include <asm/cpufeature.h>
 
 #ifdef CONFIG_UNWIND_PATCH_PAC_INTO_SCS
-static inline bool should_patch_pac_into_scs(void)
-{
-	u64 reg;
-
-	/*
-	 * We only enable the shadow call stack dynamically if we are running
-	 * on a system that does not implement PAC or BTI. PAC and SCS provide
-	 * roughly the same level of protection, and BTI relies on the PACIASP
-	 * instructions serving as landing pads, preventing us from patching
-	 * those instructions into something else.
-	 */
-	reg = read_sysreg_s(SYS_ID_AA64ISAR1_EL1);
-	if (SYS_FIELD_GET(ID_AA64ISAR1_EL1, APA, reg) |
-	    SYS_FIELD_GET(ID_AA64ISAR1_EL1, API, reg))
-		return false;
-
-	reg = read_sysreg_s(SYS_ID_AA64ISAR2_EL1);
-	if (SYS_FIELD_GET(ID_AA64ISAR2_EL1, APA3, reg))
-		return false;
-
-	if (IS_ENABLED(CONFIG_ARM64_BTI_KERNEL)) {
-		reg = read_sysreg_s(SYS_ID_AA64PFR1_EL1);
-		if (reg & (0xf << ID_AA64PFR1_EL1_BT_SHIFT))
-			return false;
-	}
-	return true;
-}
-
 static inline void dynamic_scs_init(void)
 {
-	if (should_patch_pac_into_scs()) {
+	extern bool __pi_dynamic_scs_is_enabled;
+
+	if (__pi_dynamic_scs_is_enabled) {
 		pr_info("Enabling dynamic shadow call stack\n");
 		static_branch_enable(&dynamic_scs_enabled);
 	}
···
 static inline void dynamic_scs_init(void) {}
 #endif
 
-int scs_patch(const u8 eh_frame[], int size);
-asmlinkage void scs_patch_vmlinux(void);
+int __pi_scs_patch(const u8 eh_frame[], int size);
+asmlinkage void __pi_scs_patch_vmlinux(void);
 
 #endif /* __ASSEMBLY __ */
 
-3
arch/arm64/include/asm/setup.h
···
 
 #include <uapi/asm/setup.h>
 
-void *get_early_fdt_ptr(void);
-void early_fdt_map(u64 dt_phys);
-
 /*
  * These two variables are used in the head.S file.
  */
+3
arch/arm64/include/asm/tlb.h
···
 {
 	struct ptdesc *ptdesc = virt_to_ptdesc(pudp);
 
+	if (!pgtable_l4_enabled())
+		return;
+
 	pagetable_pud_dtor(ptdesc);
 	tlb_remove_ptdesc(tlb, ptdesc);
 }
+15
arch/arm64/include/uapi/asm/hwcap.h
···
 #define HWCAP2_SVE_B16B16	(1UL << 45)
 #define HWCAP2_LRCPC3		(1UL << 46)
 #define HWCAP2_LSE128		(1UL << 47)
+#define HWCAP2_FPMR		(1UL << 48)
+#define HWCAP2_LUT		(1UL << 49)
+#define HWCAP2_FAMINMAX		(1UL << 50)
+#define HWCAP2_F8CVT		(1UL << 51)
+#define HWCAP2_F8FMA		(1UL << 52)
+#define HWCAP2_F8DP4		(1UL << 53)
+#define HWCAP2_F8DP2		(1UL << 54)
+#define HWCAP2_F8E4M3		(1UL << 55)
+#define HWCAP2_F8E5M2		(1UL << 56)
+#define HWCAP2_SME_LUTV2	(1UL << 57)
+#define HWCAP2_SME_F8F16	(1UL << 58)
+#define HWCAP2_SME_F8F32	(1UL << 59)
+#define HWCAP2_SME_SF8FMA	(1UL << 60)
+#define HWCAP2_SME_SF8DP4	(1UL << 61)
+#define HWCAP2_SME_SF8DP2	(1UL << 62)
 
 #endif /* _UAPI__ASM_HWCAP_H */
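Userspace would normally discover these new bits through the auxiliary vector. The following is a minimal sketch, assuming a Linux C library that provides getauxval() in <sys/auxv.h>. The bit number 48 is taken from the HWCAP2_FPMR definition above; the `sketch_` helpers are hypothetical names, and the result is only meaningful on an arm64 kernel that carries this series.

```c
#include <stdbool.h>
#include <stdint.h>
#include <sys/auxv.h>

#ifndef AT_HWCAP2
#define AT_HWCAP2 26	/* fallback for C libraries whose headers omit it */
#endif

/* Bit position of HWCAP2_FPMR as defined in the hunk above. */
#define SKETCH_HWCAP2_FPMR_BIT 48

/* Pure helper: test one bit of an AT_HWCAP2 value. */
static bool sketch_hwcap2_has(uint64_t hwcap2, unsigned int bit)
{
	return (hwcap2 >> bit) & 1u;
}

/* Query the running process's auxiliary vector for the FPMR capability. */
static bool sketch_cpu_has_fpmr(void)
{
	return sketch_hwcap2_has(getauxval(AT_HWCAP2), SKETCH_HWCAP2_FPMR_BIT);
}
```

Keeping the bit test separate from the getauxval() call makes the logic checkable on any host, while sketch_cpu_has_fpmr() is the part that depends on the platform.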
+8
arch/arm64/include/uapi/asm/sigcontext.h
···
 	__u64 tpidr2;
 };
 
+/* FPMR context */
+#define FPMR_MAGIC	0x46504d52
+
+struct fpmr_context {
+	struct _aarch64_ctx head;
+	__u64 fpmr;
+};
+
 #define ZA_MAGIC	0x54366345
 
 struct za_context {
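Signal-frame records on arm64 each begin with a { magic, size } header and are terminated by a record whose magic is zero, so a handler could locate the new FPMR record by walking the reserved area. The sketch below shows that walk over a plain byte buffer. It uses its own `sketch_ctx` header type instead of the UAPI struct, and it deliberately ignores the extra-context indirection that a real walker would also need to follow.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Value copied from FPMR_MAGIC in the hunk above. */
#define SKETCH_FPMR_MAGIC 0x46504d52u

/* Stand-in for struct _aarch64_ctx: every record starts with this header. */
struct sketch_ctx {
	uint32_t magic;
	uint32_t size;
};

/*
 * Walk a buffer of back-to-back records and return a pointer to the first
 * record with the requested magic, or NULL if the terminator (magic == 0),
 * a malformed size, or the end of the buffer is reached first.
 */
static const uint8_t *sketch_find_ctx(const uint8_t *buf, size_t len,
				      uint32_t magic)
{
	size_t off = 0;

	while (off + sizeof(struct sketch_ctx) <= len) {
		struct sketch_ctx head;

		memcpy(&head, buf + off, sizeof(head));
		if (head.magic == 0)
			return NULL;
		if (head.size < sizeof(head) || head.size > len - off)
			return NULL;
		if (head.magic == magic)
			return buf + off;
		off += head.size;
	}
	return NULL;
}
```

The size checks matter as much as the magic comparison: a record that claims to be smaller than its own header or larger than the remaining buffer stops the walk instead of running off the end.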
+11
arch/arm64/include/uapi/asm/sve_context.h
···
 
 #define __SVE_VQ_BYTES		16	/* number of bytes per quadword */
 
+/*
+ * Yes, __SVE_VQ_MAX is 512 QUADWORDS.
+ *
+ * To help ensure forward portability, this is much larger than the
+ * current maximum value defined by the SVE architecture.  While arrays
+ * or static allocations can be sized based on this value, watch out!
+ * It will waste a surprisingly large amount of memory.
+ *
+ * Dynamic sizing based on the actual runtime vector length is likely to
+ * be preferable for most purposes.
+ */
 #define __SVE_VQ_MIN		1
 #define __SVE_VQ_MAX		512
 
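To put a number on the warning in the new comment, the sketch below computes the storage needed for the 32 SVE Z registers at a given vector length in quadwords. The `sketch_`/`SKETCH_` names are hypothetical; the constants mirror __SVE_VQ_BYTES above and the architectural count of Z registers.

```c
#include <stddef.h>

#define SKETCH_SVE_VQ_BYTES	16	/* bytes per quadword, as __SVE_VQ_BYTES */
#define SKETCH_SVE_NUM_ZREGS	32	/* Z0..Z31 */

/* Bytes needed to hold all Z registers at a vector length of vq quadwords. */
static size_t sketch_sve_zregs_size(unsigned int vq)
{
	return (size_t)SKETCH_SVE_NUM_ZREGS * vq * SKETCH_SVE_VQ_BYTES;
}
```

Sizing for __SVE_VQ_MAX (512) reserves 262144 bytes for the Z registers alone, whereas a 2048-bit vector length (16 quadwords) needs 8192 bytes, which is why sizing from the vector length reported at run time is usually the better choice.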
+2 -11
arch/arm64/kernel/Makefile
···
 			   return_address.o cpuinfo.o cpu_errata.o		\
 			   cpufeature.o alternative.o cacheinfo.o		\
 			   smp.o smp_spin_table.o topology.o smccc-call.o	\
-			   syscall.o proton-pack.o idreg-override.o idle.o	\
-			   patching.o
+			   syscall.o proton-pack.o idle.o patching.o pi/
 
 obj-$(CONFIG_COMPAT)			+= sys32.o signal32.o			\
 					   sys_compat.o
···
 obj-$(CONFIG_ACPI_NUMA)			+= acpi_numa.o
 obj-$(CONFIG_ARM64_ACPI_PARKING_PROTOCOL)	+= acpi_parking_protocol.o
 obj-$(CONFIG_PARAVIRT)			+= paravirt.o
-obj-$(CONFIG_RANDOMIZE_BASE)		+= kaslr.o pi/
+obj-$(CONFIG_RANDOMIZE_BASE)		+= kaslr.o
 obj-$(CONFIG_HIBERNATION)		+= hibernate.o hibernate-asm.o
 obj-$(CONFIG_ELF_CORE)			+= elfcore.o
 obj-$(CONFIG_KEXEC_CORE)		+= machine_kexec.o relocate_kernel.o	\
···
 obj-$(CONFIG_ARM64_MTE)			+= mte.o
 obj-y					+= vdso-wrap.o
 obj-$(CONFIG_COMPAT_VDSO)		+= vdso32-wrap.o
-obj-$(CONFIG_UNWIND_PATCH_PAC_INTO_SCS)	+= patch-scs.o
-
-# We need to prevent the SCS patching code from patching itself. Using
-# -mbranch-protection=none here to avoid the patchable PAC opcodes from being
-# generated triggers an issue with full LTO on Clang, which stops emitting PAC
-# instructions altogether. So disable LTO as well for the compilation unit.
-CFLAGS_patch-scs.o			+= -mbranch-protection=none
-CFLAGS_REMOVE_patch-scs.o		+= $(CC_FLAGS_LTO)
 
 # Force dependency (vdso*-wrap.S includes vdso.so through incbin)
 $(obj)/vdso-wrap.o: $(obj)/vdso/vdso.so
+1 -1
arch/arm64/kernel/asm-offsets.c
···
   DEFINE(S_FP,			offsetof(struct pt_regs, regs[29]));
   DEFINE(S_LR,			offsetof(struct pt_regs, regs[30]));
   DEFINE(S_SP,			offsetof(struct pt_regs, sp));
-  DEFINE(S_PSTATE,		offsetof(struct pt_regs, pstate));
   DEFINE(S_PC,			offsetof(struct pt_regs, pc));
+  DEFINE(S_PSTATE,		offsetof(struct pt_regs, pstate));
   DEFINE(S_SYSCALLNO,		offsetof(struct pt_regs, syscallno));
   DEFINE(S_SDEI_TTBR1,		offsetof(struct pt_regs, sdei_ttbr1));
   DEFINE(S_PMR_SAVE,		offsetof(struct pt_regs, pmr_save));
+115 -67
arch/arm64/kernel/cpufeature.c
···
 };
 
 static const struct arm64_ftr_bits ftr_id_aa64isar2[] = {
+	ARM64_FTR_BITS(FTR_VISIBLE, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64ISAR2_EL1_LUT_SHIFT, 4, 0),
 	ARM64_FTR_BITS(FTR_VISIBLE, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64ISAR2_EL1_CSSC_SHIFT, 4, 0),
 	ARM64_FTR_BITS(FTR_VISIBLE, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64ISAR2_EL1_RPRFM_SHIFT, 4, 0),
 	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR2_EL1_CLRBHB_SHIFT, 4, 0),
···
 		       FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR2_EL1_GPA3_SHIFT, 4, 0),
 	ARM64_FTR_BITS(FTR_VISIBLE, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64ISAR2_EL1_RPRES_SHIFT, 4, 0),
 	ARM64_FTR_BITS(FTR_VISIBLE, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64ISAR2_EL1_WFxT_SHIFT, 4, 0),
+	ARM64_FTR_END,
+};
+
+static const struct arm64_ftr_bits ftr_id_aa64isar3[] = {
+	ARM64_FTR_BITS(FTR_VISIBLE, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64ISAR3_EL1_FAMINMAX_SHIFT, 4, 0),
 	ARM64_FTR_END,
 };
 
···
 	ARM64_FTR_END,
 };
 
+static const struct arm64_ftr_bits ftr_id_aa64pfr2[] = {
+	ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR2_EL1_FPMR_SHIFT, 4, 0),
+	ARM64_FTR_END,
+};
+
 static const struct arm64_ftr_bits ftr_id_aa64zfr0[] = {
 	ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SVE),
 		       FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ZFR0_EL1_F64MM_SHIFT, 4, 0),
···
 	ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SME),
 		       FTR_STRICT, FTR_EXACT, ID_AA64SMFR0_EL1_FA64_SHIFT, 1, 0),
 	ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SME),
+		       FTR_STRICT, FTR_EXACT, ID_AA64SMFR0_EL1_LUTv2_SHIFT, 1, 0),
+	ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SME),
 		       FTR_STRICT, FTR_EXACT, ID_AA64SMFR0_EL1_SMEver_SHIFT, 4, 0),
 	ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SME),
 		       FTR_STRICT, FTR_EXACT, ID_AA64SMFR0_EL1_I16I64_SHIFT, 4, 0),
···
 	ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SME),
 		       FTR_STRICT, FTR_EXACT, ID_AA64SMFR0_EL1_F16F16_SHIFT, 1, 0),
 	ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SME),
+		       FTR_STRICT, FTR_EXACT, ID_AA64SMFR0_EL1_F8F16_SHIFT, 1, 0),
+	ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SME),
+		       FTR_STRICT, FTR_EXACT, ID_AA64SMFR0_EL1_F8F32_SHIFT, 1, 0),
+	ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SME),
 		       FTR_STRICT, FTR_EXACT, ID_AA64SMFR0_EL1_I8I32_SHIFT, 4, 0),
 	ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SME),
 		       FTR_STRICT, FTR_EXACT, ID_AA64SMFR0_EL1_F16F32_SHIFT, 1, 0),
···
 		       FTR_STRICT, FTR_EXACT, ID_AA64SMFR0_EL1_BI32I32_SHIFT, 1, 0),
 	ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SME),
 		       FTR_STRICT, FTR_EXACT, ID_AA64SMFR0_EL1_F32F32_SHIFT, 1, 0),
+	ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SME),
+		       FTR_STRICT, FTR_EXACT, ID_AA64SMFR0_EL1_SF8FMA_SHIFT, 1, 0),
+	ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SME),
+		       FTR_STRICT, FTR_EXACT, ID_AA64SMFR0_EL1_SF8DP4_SHIFT, 1, 0),
+	ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SME),
+		       FTR_STRICT, FTR_EXACT, ID_AA64SMFR0_EL1_SF8DP2_SHIFT, 1, 0),
+	ARM64_FTR_END,
+};
+
+static const struct arm64_ftr_bits ftr_id_aa64fpfr0[] = {
+	ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_EXACT, ID_AA64FPFR0_EL1_F8CVT_SHIFT, 1, 0),
+	ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_EXACT, ID_AA64FPFR0_EL1_F8FMA_SHIFT, 1, 0),
+	ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_EXACT, ID_AA64FPFR0_EL1_F8DP4_SHIFT, 1, 0),
+	ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_EXACT, ID_AA64FPFR0_EL1_F8DP2_SHIFT, 1, 0),
+	ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_EXACT, ID_AA64FPFR0_EL1_F8E4M3_SHIFT, 1, 0),
+	ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_EXACT, ID_AA64FPFR0_EL1_F8E5M2_SHIFT, 1, 0),
 	ARM64_FTR_END,
 };
 
···
 #define ARM64_FTR_REG(id, table)		\
 	__ARM64_FTR_REG_OVERRIDE(#id, id, table, &no_override)
 
-struct arm64_ftr_override __ro_after_init id_aa64mmfr1_override;
-struct arm64_ftr_override __ro_after_init id_aa64pfr0_override;
-struct arm64_ftr_override __ro_after_init id_aa64pfr1_override;
-struct arm64_ftr_override __ro_after_init id_aa64zfr0_override;
-struct arm64_ftr_override __ro_after_init id_aa64smfr0_override;
-struct arm64_ftr_override __ro_after_init id_aa64isar1_override;
-struct arm64_ftr_override __ro_after_init id_aa64isar2_override;
+struct arm64_ftr_override id_aa64mmfr0_override;
+struct arm64_ftr_override id_aa64mmfr1_override;
+struct arm64_ftr_override id_aa64mmfr2_override;
+struct arm64_ftr_override id_aa64pfr0_override;
+struct arm64_ftr_override id_aa64pfr1_override;
+struct arm64_ftr_override id_aa64zfr0_override;
+struct arm64_ftr_override id_aa64smfr0_override;
+struct arm64_ftr_override id_aa64isar1_override;
+struct arm64_ftr_override id_aa64isar2_override;
 
 struct arm64_ftr_override arm64_sw_feature_override;
 
···
 			       &id_aa64pfr0_override),
 	ARM64_FTR_REG_OVERRIDE(SYS_ID_AA64PFR1_EL1, ftr_id_aa64pfr1,
 			       &id_aa64pfr1_override),
+	ARM64_FTR_REG(SYS_ID_AA64PFR2_EL1, ftr_id_aa64pfr2),
 	ARM64_FTR_REG_OVERRIDE(SYS_ID_AA64ZFR0_EL1, ftr_id_aa64zfr0,
 			       &id_aa64zfr0_override),
 	ARM64_FTR_REG_OVERRIDE(SYS_ID_AA64SMFR0_EL1, ftr_id_aa64smfr0,
 			       &id_aa64smfr0_override),
+	ARM64_FTR_REG(SYS_ID_AA64FPFR0_EL1, ftr_id_aa64fpfr0),
 
 	/* Op1 = 0, CRn = 0, CRm = 5 */
 	ARM64_FTR_REG(SYS_ID_AA64DFR0_EL1, ftr_id_aa64dfr0),
···
 			       &id_aa64isar1_override),
 	ARM64_FTR_REG_OVERRIDE(SYS_ID_AA64ISAR2_EL1, ftr_id_aa64isar2,
 			       &id_aa64isar2_override),
+	ARM64_FTR_REG(SYS_ID_AA64ISAR3_EL1, ftr_id_aa64isar3),
 
 	/* Op1 = 0, CRn = 0, CRm = 7 */
-	ARM64_FTR_REG(SYS_ID_AA64MMFR0_EL1, ftr_id_aa64mmfr0),
+	ARM64_FTR_REG_OVERRIDE(SYS_ID_AA64MMFR0_EL1, ftr_id_aa64mmfr0,
+			       &id_aa64mmfr0_override),
 	ARM64_FTR_REG_OVERRIDE(SYS_ID_AA64MMFR1_EL1, ftr_id_aa64mmfr1,
 			       &id_aa64mmfr1_override),
-	ARM64_FTR_REG(SYS_ID_AA64MMFR2_EL1, ftr_id_aa64mmfr2),
+	ARM64_FTR_REG_OVERRIDE(SYS_ID_AA64MMFR2_EL1, ftr_id_aa64mmfr2,
+			       &id_aa64mmfr2_override),
 	ARM64_FTR_REG(SYS_ID_AA64MMFR3_EL1, ftr_id_aa64mmfr3),
 
 	/* Op1 = 1, CRn = 0, CRm = 0 */
···
 	init_cpu_ftr_reg(SYS_ID_AA64ISAR0_EL1, info->reg_id_aa64isar0);
 	init_cpu_ftr_reg(SYS_ID_AA64ISAR1_EL1, info->reg_id_aa64isar1);
 	init_cpu_ftr_reg(SYS_ID_AA64ISAR2_EL1, info->reg_id_aa64isar2);
+	init_cpu_ftr_reg(SYS_ID_AA64ISAR3_EL1, info->reg_id_aa64isar3);
 	init_cpu_ftr_reg(SYS_ID_AA64MMFR0_EL1, info->reg_id_aa64mmfr0);
 	init_cpu_ftr_reg(SYS_ID_AA64MMFR1_EL1, info->reg_id_aa64mmfr1);
 	init_cpu_ftr_reg(SYS_ID_AA64MMFR2_EL1, info->reg_id_aa64mmfr2);
 	init_cpu_ftr_reg(SYS_ID_AA64MMFR3_EL1, info->reg_id_aa64mmfr3);
 	init_cpu_ftr_reg(SYS_ID_AA64PFR0_EL1, info->reg_id_aa64pfr0);
 	init_cpu_ftr_reg(SYS_ID_AA64PFR1_EL1, info->reg_id_aa64pfr1);
+	init_cpu_ftr_reg(SYS_ID_AA64PFR2_EL1, info->reg_id_aa64pfr2);
 	init_cpu_ftr_reg(SYS_ID_AA64ZFR0_EL1, info->reg_id_aa64zfr0);
 	init_cpu_ftr_reg(SYS_ID_AA64SMFR0_EL1, info->reg_id_aa64smfr0);
+	init_cpu_ftr_reg(SYS_ID_AA64FPFR0_EL1, info->reg_id_aa64fpfr0);
 
 	if (id_aa64pfr0_32bit_el0(info->reg_id_aa64pfr0))
 		init_32bit_cpu_features(&info->aarch32);
···
 				      info->reg_id_aa64isar1, boot->reg_id_aa64isar1);
 	taint |= check_update_ftr_reg(SYS_ID_AA64ISAR2_EL1, cpu,
 				      info->reg_id_aa64isar2, boot->reg_id_aa64isar2);
+	taint |= check_update_ftr_reg(SYS_ID_AA64ISAR3_EL1, cpu,
+				      info->reg_id_aa64isar3, boot->reg_id_aa64isar3);
 
 	/*
 	 * Differing PARange support is fine as long as all peripherals and
···
 				      info->reg_id_aa64pfr0, boot->reg_id_aa64pfr0);
 	taint |= check_update_ftr_reg(SYS_ID_AA64PFR1_EL1, cpu,
 				      info->reg_id_aa64pfr1, boot->reg_id_aa64pfr1);
+	taint |= check_update_ftr_reg(SYS_ID_AA64PFR2_EL1, cpu,
+				      info->reg_id_aa64pfr2, boot->reg_id_aa64pfr2);
 
 	taint |= check_update_ftr_reg(SYS_ID_AA64ZFR0_EL1, cpu,
 				      info->reg_id_aa64zfr0, boot->reg_id_aa64zfr0);
 
 	taint |= check_update_ftr_reg(SYS_ID_AA64SMFR0_EL1, cpu,
 				      info->reg_id_aa64smfr0, boot->reg_id_aa64smfr0);
+
+	taint |= check_update_ftr_reg(SYS_ID_AA64FPFR0_EL1, cpu,
+				      info->reg_id_aa64fpfr0, boot->reg_id_aa64fpfr0);
 
 	/* Probe vector lengths */
 	if (IS_ENABLED(CONFIG_ARM64_SVE) &&
···
 
 	read_sysreg_case(SYS_ID_AA64PFR0_EL1);
 	read_sysreg_case(SYS_ID_AA64PFR1_EL1);
+	read_sysreg_case(SYS_ID_AA64PFR2_EL1);
 	read_sysreg_case(SYS_ID_AA64ZFR0_EL1);
 	read_sysreg_case(SYS_ID_AA64SMFR0_EL1);
+	read_sysreg_case(SYS_ID_AA64FPFR0_EL1);
 	read_sysreg_case(SYS_ID_AA64DFR0_EL1);
 	read_sysreg_case(SYS_ID_AA64DFR1_EL1);
 	read_sysreg_case(SYS_ID_AA64MMFR0_EL1);
···
 	read_sysreg_case(SYS_ID_AA64ISAR0_EL1);
 	read_sysreg_case(SYS_ID_AA64ISAR1_EL1);
 	read_sysreg_case(SYS_ID_AA64ISAR2_EL1);
+	read_sysreg_case(SYS_ID_AA64ISAR3_EL1);
 
 	read_sysreg_case(SYS_CNTFRQ_EL0);
 	read_sysreg_case(SYS_CTR_EL0);
···
 	return has_cpuid_feature(entry, scope);
 }
 
-/*
- * This check is triggered during the early boot before the cpufeature
- * is initialised. Checking the status on the local CPU allows the boot
- * CPU to detect the need for non-global mappings and thus avoiding a
- * pagetable re-write after all the CPUs are booted. This check will be
- * anyway run on individual CPUs, allowing us to get the consistent
- * state once the SMP CPUs are up and thus make the switch to non-global
- * mappings if required.
- */
-bool kaslr_requires_kpti(void)
-{
-	if (!IS_ENABLED(CONFIG_RANDOMIZE_BASE))
-		return false;
-
-	/*
-	 * E0PD does a similar job to KPTI so can be used instead
-	 * where available.
-	 */
-	if (IS_ENABLED(CONFIG_ARM64_E0PD)) {
-		u64 mmfr2 = read_sysreg_s(SYS_ID_AA64MMFR2_EL1);
-		if (cpuid_feature_extract_unsigned_field(mmfr2,
-						ID_AA64MMFR2_EL1_E0PD_SHIFT))
-			return false;
-	}
-
-	/*
-	 * Systems affected by Cavium erratum 24756 are incompatible
-	 * with KPTI.
-	 */
-	if (IS_ENABLED(CONFIG_CAVIUM_ERRATUM_27456)) {
-		extern const struct midr_range cavium_erratum_27456_cpus[];
-
-		if (is_midr_in_range_list(read_cpuid_id(),
-					  cavium_erratum_27456_cpus))
-			return false;
-	}
-
-	return kaslr_enabled();
-}
-
 static bool __meltdown_safe = true;
 static int __kpti_forced; /* 0: not forced, >0: forced on, <0: forced off */
 
···
 	}
 
 	/* Useful for KASLR robustness */
-	if (kaslr_requires_kpti()) {
+	if (kaslr_enabled() && kaslr_requires_kpti()) {
 		if (!__kpti_forced) {
 			str = "KASLR";
 			__kpti_forced = 1;
···
 	pgd_t *kpti_ng_temp_pgd;
 	u64 alloc = 0;
 
+	if (levels == 5 && !pgtable_l5_enabled())
+		levels = 4;
+	else if (levels == 4 && !pgtable_l4_enabled())
+		levels = 3;
+
 	remap_fn = (void *)__pa_symbol(idmap_kpti_install_ng_mappings);
 
 	if (!cpu) {
···
 		//
 		// The physical pages are laid out as follows:
 		//
-		// +--------+-/-------+-/------ +-\\--------+
-		// :  PTE[] : | PMD[] : | PUD[] : || PGD[] :
-		// +--------+-\-------+-\------ +-//--------+
+		// +--------+-/-------+-/------ +-/------ +-\\\--------+
+		// :  PTE[] : | PMD[] : | PUD[] : | P4D[] : ||| PGD[] :
+		// +--------+-\-------+-\------ +-\------ +-///--------+
 		//      ^
 		// The first page is mapped into this hierarchy at a PMD_SHIFT
 		// aligned virtual address, so that we can manipulate the PTE
···
 static bool hvhe_possible(const struct arm64_cpu_capabilities *entry,
 			  int __unused)
 {
-	u64 val;
-
-	val = read_sysreg(id_aa64mmfr1_el1);
-	if (!cpuid_feature_extract_unsigned_field(val, ID_AA64MMFR1_EL1_VH_SHIFT))
-		return false;
-
-	val = arm64_sw_feature_override.val & arm64_sw_feature_override.mask;
-	return cpuid_feature_extract_unsigned_field(val, ARM64_SW_FEATURE_OVERRIDE_HVHE);
+	return arm64_test_sw_feature_override(ARM64_SW_FEATURE_OVERRIDE_HVHE);
 }
 
 #ifdef CONFIG_ARM64_PAN
···
 		.type = ARM64_CPUCAP_SYSTEM_FEATURE,
 		.matches = has_lpa2,
 	},
+	{
+		.desc = "FPMR",
+		.type = ARM64_CPUCAP_SYSTEM_FEATURE,
+		.capability = ARM64_HAS_FPMR,
+		.matches = has_cpuid_feature,
+		.cpu_enable = cpu_enable_fpmr,
+		ARM64_CPUID_FIELDS(ID_AA64PFR2_EL1, FPMR, IMP)
+	},
+#ifdef CONFIG_ARM64_VA_BITS_52
+	{
+		.capability = ARM64_HAS_VA52,
+		.type = ARM64_CPUCAP_BOOT_CPU_FEATURE,
+		.matches = has_cpuid_feature,
+#ifdef CONFIG_ARM64_64K_PAGES
+		.desc = "52-bit Virtual Addressing (LVA)",
+		ARM64_CPUID_FIELDS(ID_AA64MMFR2_EL1, VARange, 52)
+#else
+		.desc = "52-bit Virtual Addressing (LPA2)",
+#ifdef CONFIG_ARM64_4K_PAGES
+		ARM64_CPUID_FIELDS(ID_AA64MMFR0_EL1, TGRAN4, 52_BIT)
+#else
+		ARM64_CPUID_FIELDS(ID_AA64MMFR0_EL1, TGRAN16, 52_BIT)
+#endif
+#endif
+	},
+#endif
 	{},
 };
 
···
 	HWCAP_CAP(ID_AA64PFR0_EL1, AdvSIMD, IMP, CAP_HWCAP, KERNEL_HWCAP_ASIMD),
 	HWCAP_CAP(ID_AA64PFR0_EL1, AdvSIMD, FP16, CAP_HWCAP, KERNEL_HWCAP_ASIMDHP),
 	HWCAP_CAP(ID_AA64PFR0_EL1, DIT, IMP, CAP_HWCAP, KERNEL_HWCAP_DIT),
+	HWCAP_CAP(ID_AA64PFR2_EL1, FPMR, IMP, CAP_HWCAP, KERNEL_HWCAP_FPMR),
 	HWCAP_CAP(ID_AA64ISAR1_EL1, DPB, IMP, CAP_HWCAP, KERNEL_HWCAP_DCPOP),
 	HWCAP_CAP(ID_AA64ISAR1_EL1, DPB, DPB2, CAP_HWCAP, KERNEL_HWCAP_DCPODP),
 	HWCAP_CAP(ID_AA64ISAR1_EL1, JSCVT, IMP, CAP_HWCAP, KERNEL_HWCAP_JSCVT),
···
 	HWCAP_CAP(ID_AA64ISAR1_EL1, BF16, EBF16, CAP_HWCAP, KERNEL_HWCAP_EBF16),
 	HWCAP_CAP(ID_AA64ISAR1_EL1, DGH, IMP, CAP_HWCAP, KERNEL_HWCAP_DGH),
 	HWCAP_CAP(ID_AA64ISAR1_EL1, I8MM, IMP, CAP_HWCAP, KERNEL_HWCAP_I8MM),
+	HWCAP_CAP(ID_AA64ISAR2_EL1, LUT, IMP, CAP_HWCAP, KERNEL_HWCAP_LUT),
+	HWCAP_CAP(ID_AA64ISAR3_EL1, FAMINMAX, IMP, CAP_HWCAP, KERNEL_HWCAP_FAMINMAX),
 	HWCAP_CAP(ID_AA64MMFR2_EL1, AT, IMP, CAP_HWCAP, KERNEL_HWCAP_USCAT),
 #ifdef CONFIG_ARM64_SVE
 	HWCAP_CAP(ID_AA64PFR0_EL1, SVE, IMP, CAP_HWCAP, KERNEL_HWCAP_SVE),
···
 #ifdef CONFIG_ARM64_SME
 	HWCAP_CAP(ID_AA64PFR1_EL1, SME, IMP, CAP_HWCAP, KERNEL_HWCAP_SME),
 	HWCAP_CAP(ID_AA64SMFR0_EL1, FA64, IMP, CAP_HWCAP, KERNEL_HWCAP_SME_FA64),
+	HWCAP_CAP(ID_AA64SMFR0_EL1, LUTv2, IMP, CAP_HWCAP, KERNEL_HWCAP_SME_LUTV2),
 	HWCAP_CAP(ID_AA64SMFR0_EL1, SMEver, SME2p1, CAP_HWCAP, KERNEL_HWCAP_SME2P1),
 	HWCAP_CAP(ID_AA64SMFR0_EL1, SMEver, SME2, CAP_HWCAP, KERNEL_HWCAP_SME2),
 	HWCAP_CAP(ID_AA64SMFR0_EL1, I16I64, IMP, CAP_HWCAP, KERNEL_HWCAP_SME_I16I64),
···
 	HWCAP_CAP(ID_AA64SMFR0_EL1, I16I32, IMP, CAP_HWCAP, KERNEL_HWCAP_SME_I16I32),
 	HWCAP_CAP(ID_AA64SMFR0_EL1, B16B16, IMP, CAP_HWCAP, KERNEL_HWCAP_SME_B16B16),
 	HWCAP_CAP(ID_AA64SMFR0_EL1, F16F16, IMP, CAP_HWCAP, KERNEL_HWCAP_SME_F16F16),
+	HWCAP_CAP(ID_AA64SMFR0_EL1, F8F16, IMP, CAP_HWCAP, KERNEL_HWCAP_SME_F8F16),
+	HWCAP_CAP(ID_AA64SMFR0_EL1, F8F32, IMP, CAP_HWCAP, KERNEL_HWCAP_SME_F8F32),
 	HWCAP_CAP(ID_AA64SMFR0_EL1, I8I32, IMP, CAP_HWCAP, KERNEL_HWCAP_SME_I8I32),
 	HWCAP_CAP(ID_AA64SMFR0_EL1, F16F32, IMP, CAP_HWCAP, KERNEL_HWCAP_SME_F16F32),
 	HWCAP_CAP(ID_AA64SMFR0_EL1, B16F32, IMP, CAP_HWCAP, KERNEL_HWCAP_SME_B16F32),
 	HWCAP_CAP(ID_AA64SMFR0_EL1, BI32I32, IMP, CAP_HWCAP, KERNEL_HWCAP_SME_BI32I32),
 	HWCAP_CAP(ID_AA64SMFR0_EL1, F32F32, IMP, CAP_HWCAP, KERNEL_HWCAP_SME_F32F32),
+	HWCAP_CAP(ID_AA64SMFR0_EL1, SF8FMA, IMP, CAP_HWCAP, KERNEL_HWCAP_SME_SF8FMA),
+	HWCAP_CAP(ID_AA64SMFR0_EL1, SF8DP4, IMP, CAP_HWCAP, KERNEL_HWCAP_SME_SF8DP4),
+	HWCAP_CAP(ID_AA64SMFR0_EL1, SF8DP2, IMP, CAP_HWCAP, KERNEL_HWCAP_SME_SF8DP2),
 #endif /* CONFIG_ARM64_SME */
+	HWCAP_CAP(ID_AA64FPFR0_EL1, F8CVT, IMP, CAP_HWCAP, KERNEL_HWCAP_F8CVT),
+	HWCAP_CAP(ID_AA64FPFR0_EL1, F8FMA, IMP, CAP_HWCAP, KERNEL_HWCAP_F8FMA),
+	HWCAP_CAP(ID_AA64FPFR0_EL1, F8DP4, IMP, CAP_HWCAP, KERNEL_HWCAP_F8DP4),
+	HWCAP_CAP(ID_AA64FPFR0_EL1, F8DP2, IMP, CAP_HWCAP, KERNEL_HWCAP_F8DP2),
+	HWCAP_CAP(ID_AA64FPFR0_EL1, F8E4M3, IMP, CAP_HWCAP, KERNEL_HWCAP_F8E4M3),
+	HWCAP_CAP(ID_AA64FPFR0_EL1, F8E5M2, IMP, CAP_HWCAP, KERNEL_HWCAP_F8E5M2),
 	{},
 };
 
···
 	boot_scope = !!(scope_mask & SCOPE_BOOT_CPU);
 
 	for (i = 0; i < ARM64_NCAPS; i++) {
-		unsigned int num;
-
 		caps = cpucap_ptrs[i];
-		if (!caps || !(caps->type & scope_mask))
-			continue;
-		num = caps->capability;
-		if (!cpus_have_cap(num))
+		if (!caps || !(caps->type & scope_mask) ||
+		    !cpus_have_cap(caps->capability))
 			continue;
 
 		if (boot_scope && caps->cpu_enable)
+18
arch/arm64/kernel/cpuinfo.c
···
 	[KERNEL_HWCAP_SVE_B16B16]	= "sveb16b16",
 	[KERNEL_HWCAP_LRCPC3]		= "lrcpc3",
 	[KERNEL_HWCAP_LSE128]		= "lse128",
+	[KERNEL_HWCAP_FPMR]		= "fpmr",
+	[KERNEL_HWCAP_LUT]		= "lut",
+	[KERNEL_HWCAP_FAMINMAX]		= "faminmax",
+	[KERNEL_HWCAP_F8CVT]		= "f8cvt",
+	[KERNEL_HWCAP_F8FMA]		= "f8fma",
+	[KERNEL_HWCAP_F8DP4]		= "f8dp4",
+	[KERNEL_HWCAP_F8DP2]		= "f8dp2",
+	[KERNEL_HWCAP_F8E4M3]		= "f8e4m3",
+	[KERNEL_HWCAP_F8E5M2]		= "f8e5m2",
+	[KERNEL_HWCAP_SME_LUTV2]	= "smelutv2",
+	[KERNEL_HWCAP_SME_F8F16]	= "smef8f16",
+	[KERNEL_HWCAP_SME_F8F32]	= "smef8f32",
+	[KERNEL_HWCAP_SME_SF8FMA]	= "smesf8fma",
+	[KERNEL_HWCAP_SME_SF8DP4]	= "smesf8dp4",
+	[KERNEL_HWCAP_SME_SF8DP2]	= "smesf8dp2",
 };
 
 #ifdef CONFIG_COMPAT
···
 	info->reg_id_aa64isar0 = read_cpuid(ID_AA64ISAR0_EL1);
 	info->reg_id_aa64isar1 = read_cpuid(ID_AA64ISAR1_EL1);
 	info->reg_id_aa64isar2 = read_cpuid(ID_AA64ISAR2_EL1);
+	info->reg_id_aa64isar3 = read_cpuid(ID_AA64ISAR3_EL1);
 	info->reg_id_aa64mmfr0 = read_cpuid(ID_AA64MMFR0_EL1);
 	info->reg_id_aa64mmfr1 = read_cpuid(ID_AA64MMFR1_EL1);
 	info->reg_id_aa64mmfr2 = read_cpuid(ID_AA64MMFR2_EL1);
 	info->reg_id_aa64mmfr3 = read_cpuid(ID_AA64MMFR3_EL1);
 	info->reg_id_aa64pfr0 = read_cpuid(ID_AA64PFR0_EL1);
 	info->reg_id_aa64pfr1 = read_cpuid(ID_AA64PFR1_EL1);
+	info->reg_id_aa64pfr2 = read_cpuid(ID_AA64PFR2_EL1);
 	info->reg_id_aa64zfr0 = read_cpuid(ID_AA64ZFR0_EL1);
 	info->reg_id_aa64smfr0 = read_cpuid(ID_AA64SMFR0_EL1);
+	info->reg_id_aa64fpfr0 = read_cpuid(ID_AA64FPFR0_EL1);
 
 	if (id_aa64pfr1_mte(info->reg_id_aa64pfr1))
 		info->reg_gmid = read_cpuid(GMID_EL1);
+35 -1
arch/arm64/kernel/entry-common.c
···
 #include <linux/linkage.h>
 #include <linux/lockdep.h>
 #include <linux/ptrace.h>
+#include <linux/resume_user_mode.h>
 #include <linux/sched.h>
 #include <linux/sched/debug.h>
 #include <linux/thread_info.h>
···
 	lockdep_hardirqs_on(CALLER_ADDR0);
 }

+static void do_notify_resume(struct pt_regs *regs, unsigned long thread_flags)
+{
+	do {
+		local_irq_enable();
+
+		if (thread_flags & _TIF_NEED_RESCHED)
+			schedule();
+
+		if (thread_flags & _TIF_UPROBE)
+			uprobe_notify_resume(regs);
+
+		if (thread_flags & _TIF_MTE_ASYNC_FAULT) {
+			clear_thread_flag(TIF_MTE_ASYNC_FAULT);
+			send_sig_fault(SIGSEGV, SEGV_MTEAERR,
+				       (void __user *)NULL, current);
+		}
+
+		if (thread_flags & (_TIF_SIGPENDING | _TIF_NOTIFY_SIGNAL))
+			do_signal(regs);
+
+		if (thread_flags & _TIF_NOTIFY_RESUME)
+			resume_user_mode_work(regs);
+
+		if (thread_flags & _TIF_FOREIGN_FPSTATE)
+			fpsimd_restore_current_state();
+
+		local_irq_disable();
+		thread_flags = read_thread_flags();
+	} while (thread_flags & _TIF_WORK_MASK);
+}
+
 static __always_inline void exit_to_user_mode_prepare(struct pt_regs *regs)
 {
 	unsigned long flags;

-	local_daif_mask();
+	local_irq_disable();

 	flags = read_thread_flags();
 	if (unlikely(flags & _TIF_WORK_MASK))
 		do_notify_resume(regs, flags);
+
+	local_daif_mask();

 	lockdep_sys_exit();
 }
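The shape of do_notify_resume() in the hunk above is a "handle, then re-read the flags" loop: work is processed from a snapshot of the flags, and the flags are read again before deciding whether to leave. A small userspace simulation of that pattern, under the assumption that handling one item may raise another, might look like the following. All names here (`task_sim`, `WORK_*`, `handle_pending_work`) are invented for the sketch and do not exist in the kernel.

```c
#include <assert.h>

/* Invented flag bits for the simulation; not the kernel's _TIF_* values. */
#define WORK_RESCHED	(1u << 0)
#define WORK_SIGNAL	(1u << 1)
#define WORK_MASK	(WORK_RESCHED | WORK_SIGNAL)

struct task_sim {
	unsigned int flags;	/* pending work bits */
	int resched_count;	/* how often "schedule" work ran */
	int signal_count;	/* how often "signal" work ran */
};

static unsigned int read_flags(const struct task_sim *t)
{
	return t->flags;
}

/*
 * Process the snapshot in `flags`, then re-read the live flags and loop
 * until no work bit remains set.
 */
static void handle_pending_work(struct task_sim *t, unsigned int flags)
{
	assert(t != 0);

	do {
		if (flags & WORK_RESCHED) {
			t->flags &= ~WORK_RESCHED;
			t->resched_count++;
			/* Simulate new work arriving while we were busy. */
			if (t->resched_count == 1)
				t->flags |= WORK_SIGNAL;
		}

		if (flags & WORK_SIGNAL) {
			t->flags &= ~WORK_SIGNAL;
			t->signal_count++;
		}

		flags = read_flags(t);
	} while (flags & WORK_MASK);
}
```

The signal raised during the first pass is not in the snapshot, so it is only picked up because the flags are re-read at the bottom of the loop.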
+18
arch/arm64/kernel/fpsimd.c
···
 	WARN_ON(preemptible());
 	WARN_ON(test_thread_flag(TIF_KERNEL_FPSTATE));

+	if (system_supports_fpmr())
+		write_sysreg_s(current->thread.uw.fpmr, SYS_FPMR);
+
 	if (system_supports_sve() || system_supports_sme()) {
 		switch (current->thread.fp_type) {
 		case FP_STATE_FPSIMD:
···

 	if (test_thread_flag(TIF_FOREIGN_FPSTATE))
 		return;
+
+	if (system_supports_fpmr())
+		*(last->fpmr) = read_sysreg_s(SYS_FPMR);

 	/*
 	 * If a task is in a syscall the ABI allows us to only
···
 		p = (__uint128_t const *)ZREG(sst, vq, i);
 		fst->vregs[i] = arm64_le128_to_cpu(*p);
 	}
+}
+
+void cpu_enable_fpmr(const struct arm64_cpu_capabilities *__always_unused p)
+{
+	write_sysreg_s(read_sysreg_s(SYS_SCTLR_EL1) | SCTLR_EL1_EnFPM_MASK,
+		       SYS_SCTLR_EL1);
 }

 #ifdef CONFIG_ARM64_SVE
···
 {
 	write_sysreg(read_sysreg(CPACR_EL1) | CPACR_EL1_ZEN_EL1EN, CPACR_EL1);
 	isb();
+
+	write_sysreg_s(0, SYS_ZCR_EL1);
 }

 void __init sve_setup(void)
···
 	/* Allow SME in kernel */
 	write_sysreg(read_sysreg(CPACR_EL1) | CPACR_EL1_SMEN_EL1EN, CPACR_EL1);
 	isb();
+
+	/* Ensure all bits in SMCR are set to known values */
+	write_sysreg_s(0, SYS_SMCR_EL1);

 	/* Allow EL0 to access TPIDR2 */
 	write_sysreg(read_sysreg(SCTLR_EL1) | SCTLR_ELx_ENTP2, SCTLR_EL1);
···
 	last->sve_vl = task_get_sve_vl(current);
 	last->sme_vl = task_get_sme_vl(current);
 	last->svcr = &current->thread.svcr;
+	last->fpmr = &current->thread.uw.fpmr;
 	last->fp_type = &current->thread.fp_type;
 	last->to_save = FP_STATE_CURRENT;
 	current->thread.fpsimd_cpu = smp_processor_id();
+42 -421
arch/arm64/kernel/head.S
··· 80 80 * x19 primary_entry() .. start_kernel() whether we entered with the MMU on 81 81 * x20 primary_entry() .. __primary_switch() CPU boot mode 82 82 * x21 primary_entry() .. start_kernel() FDT pointer passed at boot in x0 83 - * x22 create_idmap() .. start_kernel() ID map VA of the DT blob 84 - * x23 primary_entry() .. start_kernel() physical misalignment/KASLR offset 85 - * x24 __primary_switch() linear map KASLR seed 86 - * x25 primary_entry() .. start_kernel() supported VA size 87 - * x28 create_idmap() callee preserved temp register 88 83 */ 89 84 SYM_CODE_START(primary_entry) 90 85 bl record_mmu_state 91 86 bl preserve_boot_args 92 - bl create_idmap 87 + 88 + adrp x1, early_init_stack 89 + mov sp, x1 90 + mov x29, xzr 91 + adrp x0, init_idmap_pg_dir 92 + mov x1, xzr 93 + bl __pi_create_init_idmap 94 + 95 + /* 96 + * If the page tables have been populated with non-cacheable 97 + * accesses (MMU disabled), invalidate those tables again to 98 + * remove any speculatively loaded cache lines. 99 + */ 100 + cbnz x19, 0f 101 + dmb sy 102 + mov x1, x0 // end of used region 103 + adrp x0, init_idmap_pg_dir 104 + adr_l x2, dcache_inval_poc 105 + blr x2 106 + b 1f 93 107 94 108 /* 95 109 * If we entered with the MMU and caches on, clean the ID mapped part 96 110 * of the primary boot code to the PoC so we can safely execute it with 97 111 * the MMU off. 98 112 */ 99 - cbz x19, 0f 100 - adrp x0, __idmap_text_start 113 + 0: adrp x0, __idmap_text_start 101 114 adr_l x1, __idmap_text_end 102 115 adr_l x2, dcache_clean_poc 103 116 blr x2 104 - 0: mov x0, x19 117 + 118 + 1: mov x0, x19 105 119 bl init_kernel_el // w0=cpu_boot_mode 106 120 mov x20, x0 107 121 ··· 125 111 * On return, the CPU will be ready for the MMU to be turned on and 126 112 * the TCR will have been set. 
127 113 */ 128 - #if VA_BITS > 48 129 - mrs_s x0, SYS_ID_AA64MMFR2_EL1 130 - tst x0, ID_AA64MMFR2_EL1_VARange_MASK 131 - mov x0, #VA_BITS 132 - mov x25, #VA_BITS_MIN 133 - csel x25, x25, x0, eq 134 - mov x0, x25 135 - #endif 136 114 bl __cpu_setup // initialise processor 137 115 b __primary_switch 138 116 SYM_CODE_END(primary_entry) ··· 183 177 ret 184 178 SYM_CODE_END(preserve_boot_args) 185 179 186 - SYM_FUNC_START_LOCAL(clear_page_tables) 187 - /* 188 - * Clear the init page tables. 189 - */ 190 - adrp x0, init_pg_dir 191 - adrp x1, init_pg_end 192 - sub x2, x1, x0 193 - mov x1, xzr 194 - b __pi_memset // tail call 195 - SYM_FUNC_END(clear_page_tables) 196 - 197 - /* 198 - * Macro to populate page table entries, these entries can be pointers to the next level 199 - * or last level entries pointing to physical memory. 200 - * 201 - * tbl: page table address 202 - * rtbl: pointer to page table or physical memory 203 - * index: start index to write 204 - * eindex: end index to write - [index, eindex] written to 205 - * flags: flags for pagetable entry to or in 206 - * inc: increment to rtbl between each entry 207 - * tmp1: temporary variable 208 - * 209 - * Preserves: tbl, eindex, flags, inc 210 - * Corrupts: index, tmp1 211 - * Returns: rtbl 212 - */ 213 - .macro populate_entries, tbl, rtbl, index, eindex, flags, inc, tmp1 214 - .Lpe\@: phys_to_pte \tmp1, \rtbl 215 - orr \tmp1, \tmp1, \flags // tmp1 = table entry 216 - str \tmp1, [\tbl, \index, lsl #3] 217 - add \rtbl, \rtbl, \inc // rtbl = pa next level 218 - add \index, \index, #1 219 - cmp \index, \eindex 220 - b.ls .Lpe\@ 221 - .endm 222 - 223 - /* 224 - * Compute indices of table entries from virtual address range. If multiple entries 225 - * were needed in the previous page table level then the next page table level is assumed 226 - * to be composed of multiple pages. (This effectively scales the end index). 
227 - * 228 - * vstart: virtual address of start of range 229 - * vend: virtual address of end of range - we map [vstart, vend] 230 - * shift: shift used to transform virtual address into index 231 - * order: #imm 2log(number of entries in page table) 232 - * istart: index in table corresponding to vstart 233 - * iend: index in table corresponding to vend 234 - * count: On entry: how many extra entries were required in previous level, scales 235 - * our end index. 236 - * On exit: returns how many extra entries required for next page table level 237 - * 238 - * Preserves: vstart, vend 239 - * Returns: istart, iend, count 240 - */ 241 - .macro compute_indices, vstart, vend, shift, order, istart, iend, count 242 - ubfx \istart, \vstart, \shift, \order 243 - ubfx \iend, \vend, \shift, \order 244 - add \iend, \iend, \count, lsl \order 245 - sub \count, \iend, \istart 246 - .endm 247 - 248 - /* 249 - * Map memory for specified virtual address range. Each level of page table needed supports 250 - * multiple entries. If a level requires n entries the next page table level is assumed to be 251 - * formed from n pages. 252 - * 253 - * tbl: location of page table 254 - * rtbl: address to be used for first level page table entry (typically tbl + PAGE_SIZE) 255 - * vstart: virtual address of start of range 256 - * vend: virtual address of end of range - we map [vstart, vend - 1] 257 - * flags: flags to use to map last level entries 258 - * phys: physical address corresponding to vstart - physical memory is contiguous 259 - * order: #imm 2log(number of entries in PGD table) 260 - * 261 - * If extra_shift is set, an extra level will be populated if the end address does 262 - * not fit in 'extra_shift' bits. This assumes vend is in the TTBR0 range. 
263 - * 264 - * Temporaries: istart, iend, tmp, count, sv - these need to be different registers 265 - * Preserves: vstart, flags 266 - * Corrupts: tbl, rtbl, vend, istart, iend, tmp, count, sv 267 - */ 268 - .macro map_memory, tbl, rtbl, vstart, vend, flags, phys, order, istart, iend, tmp, count, sv, extra_shift 269 - sub \vend, \vend, #1 270 - add \rtbl, \tbl, #PAGE_SIZE 271 - mov \count, #0 272 - 273 - .ifnb \extra_shift 274 - tst \vend, #~((1 << (\extra_shift)) - 1) 275 - b.eq .L_\@ 276 - compute_indices \vstart, \vend, #\extra_shift, #(PAGE_SHIFT - 3), \istart, \iend, \count 277 - mov \sv, \rtbl 278 - populate_entries \tbl, \rtbl, \istart, \iend, #PMD_TYPE_TABLE, #PAGE_SIZE, \tmp 279 - mov \tbl, \sv 280 - .endif 281 - .L_\@: 282 - compute_indices \vstart, \vend, #PGDIR_SHIFT, #\order, \istart, \iend, \count 283 - mov \sv, \rtbl 284 - populate_entries \tbl, \rtbl, \istart, \iend, #PMD_TYPE_TABLE, #PAGE_SIZE, \tmp 285 - mov \tbl, \sv 286 - 287 - #if SWAPPER_PGTABLE_LEVELS > 3 288 - compute_indices \vstart, \vend, #PUD_SHIFT, #(PAGE_SHIFT - 3), \istart, \iend, \count 289 - mov \sv, \rtbl 290 - populate_entries \tbl, \rtbl, \istart, \iend, #PMD_TYPE_TABLE, #PAGE_SIZE, \tmp 291 - mov \tbl, \sv 292 - #endif 293 - 294 - #if SWAPPER_PGTABLE_LEVELS > 2 295 - compute_indices \vstart, \vend, #SWAPPER_TABLE_SHIFT, #(PAGE_SHIFT - 3), \istart, \iend, \count 296 - mov \sv, \rtbl 297 - populate_entries \tbl, \rtbl, \istart, \iend, #PMD_TYPE_TABLE, #PAGE_SIZE, \tmp 298 - mov \tbl, \sv 299 - #endif 300 - 301 - compute_indices \vstart, \vend, #SWAPPER_BLOCK_SHIFT, #(PAGE_SHIFT - 3), \istart, \iend, \count 302 - bic \rtbl, \phys, #SWAPPER_BLOCK_SIZE - 1 303 - populate_entries \tbl, \rtbl, \istart, \iend, \flags, #SWAPPER_BLOCK_SIZE, \tmp 304 - .endm 305 - 306 - /* 307 - * Remap a subregion created with the map_memory macro with modified attributes 308 - * or output address. The entire remapped region must have been covered in the 309 - * invocation of map_memory. 
310 - * 311 - * x0: last level table address (returned in first argument to map_memory) 312 - * x1: start VA of the existing mapping 313 - * x2: start VA of the region to update 314 - * x3: end VA of the region to update (exclusive) 315 - * x4: start PA associated with the region to update 316 - * x5: attributes to set on the updated region 317 - * x6: order of the last level mappings 318 - */ 319 - SYM_FUNC_START_LOCAL(remap_region) 320 - sub x3, x3, #1 // make end inclusive 321 - 322 - // Get the index offset for the start of the last level table 323 - lsr x1, x1, x6 324 - bfi x1, xzr, #0, #PAGE_SHIFT - 3 325 - 326 - // Derive the start and end indexes into the last level table 327 - // associated with the provided region 328 - lsr x2, x2, x6 329 - lsr x3, x3, x6 330 - sub x2, x2, x1 331 - sub x3, x3, x1 332 - 333 - mov x1, #1 334 - lsl x6, x1, x6 // block size at this level 335 - 336 - populate_entries x0, x4, x2, x3, x5, x6, x7 337 - ret 338 - SYM_FUNC_END(remap_region) 339 - 340 - SYM_FUNC_START_LOCAL(create_idmap) 341 - mov x28, lr 342 - /* 343 - * The ID map carries a 1:1 mapping of the physical address range 344 - * covered by the loaded image, which could be anywhere in DRAM. This 345 - * means that the required size of the VA (== PA) space is decided at 346 - * boot time, and could be more than the configured size of the VA 347 - * space for ordinary kernel and user space mappings. 348 - * 349 - * There are three cases to consider here: 350 - * - 39 <= VA_BITS < 48, and the ID map needs up to 48 VA bits to cover 351 - * the placement of the image. In this case, we configure one extra 352 - * level of translation on the fly for the ID map only. (This case 353 - * also covers 42-bit VA/52-bit PA on 64k pages). 354 - * 355 - * - VA_BITS == 48, and the ID map needs more than 48 VA bits. This can 356 - * only happen when using 64k pages, in which case we need to extend 357 - * the root level table rather than add a level. 
Note that we can 358 - * treat this case as 'always extended' as long as we take care not 359 - * to program an unsupported T0SZ value into the TCR register. 360 - * 361 - * - Combinations that would require two additional levels of 362 - * translation are not supported, e.g., VA_BITS==36 on 16k pages, or 363 - * VA_BITS==39/4k pages with 5-level paging, where the input address 364 - * requires more than 47 or 48 bits, respectively. 365 - */ 366 - #if (VA_BITS < 48) 367 - #define IDMAP_PGD_ORDER (VA_BITS - PGDIR_SHIFT) 368 - #define EXTRA_SHIFT (PGDIR_SHIFT + PAGE_SHIFT - 3) 369 - 370 - /* 371 - * If VA_BITS < 48, we have to configure an additional table level. 372 - * First, we have to verify our assumption that the current value of 373 - * VA_BITS was chosen such that all translation levels are fully 374 - * utilised, and that lowering T0SZ will always result in an additional 375 - * translation level to be configured. 376 - */ 377 - #if VA_BITS != EXTRA_SHIFT 378 - #error "Mismatch between VA_BITS and page size/number of translation levels" 379 - #endif 380 - #else 381 - #define IDMAP_PGD_ORDER (PHYS_MASK_SHIFT - PGDIR_SHIFT) 382 - #define EXTRA_SHIFT 383 - /* 384 - * If VA_BITS == 48, we don't have to configure an additional 385 - * translation level, but the top-level table has more entries. 
386 - */ 387 - #endif 388 - adrp x0, init_idmap_pg_dir 389 - adrp x3, _text 390 - adrp x6, _end + MAX_FDT_SIZE + SWAPPER_BLOCK_SIZE 391 - mov_q x7, SWAPPER_RX_MMUFLAGS 392 - 393 - map_memory x0, x1, x3, x6, x7, x3, IDMAP_PGD_ORDER, x10, x11, x12, x13, x14, EXTRA_SHIFT 394 - 395 - /* Remap the kernel page tables r/w in the ID map */ 396 - adrp x1, _text 397 - adrp x2, init_pg_dir 398 - adrp x3, init_pg_end 399 - bic x4, x2, #SWAPPER_BLOCK_SIZE - 1 400 - mov_q x5, SWAPPER_RW_MMUFLAGS 401 - mov x6, #SWAPPER_BLOCK_SHIFT 402 - bl remap_region 403 - 404 - /* Remap the FDT after the kernel image */ 405 - adrp x1, _text 406 - adrp x22, _end + SWAPPER_BLOCK_SIZE 407 - bic x2, x22, #SWAPPER_BLOCK_SIZE - 1 408 - bfi x22, x21, #0, #SWAPPER_BLOCK_SHIFT // remapped FDT address 409 - add x3, x2, #MAX_FDT_SIZE + SWAPPER_BLOCK_SIZE 410 - bic x4, x21, #SWAPPER_BLOCK_SIZE - 1 411 - mov_q x5, SWAPPER_RW_MMUFLAGS 412 - mov x6, #SWAPPER_BLOCK_SHIFT 413 - bl remap_region 414 - 415 - /* 416 - * Since the page tables have been populated with non-cacheable 417 - * accesses (MMU disabled), invalidate those tables again to 418 - * remove any speculatively loaded cache lines. 
419 - */ 420 - cbnz x19, 0f // skip cache invalidation if MMU is on 421 - dmb sy 422 - 423 - adrp x0, init_idmap_pg_dir 424 - adrp x1, init_idmap_pg_end 425 - bl dcache_inval_poc 426 - 0: ret x28 427 - SYM_FUNC_END(create_idmap) 428 - 429 - SYM_FUNC_START_LOCAL(create_kernel_mapping) 430 - adrp x0, init_pg_dir 431 - mov_q x5, KIMAGE_VADDR // compile time __va(_text) 432 - #ifdef CONFIG_RELOCATABLE 433 - add x5, x5, x23 // add KASLR displacement 434 - #endif 435 - adrp x6, _end // runtime __pa(_end) 436 - adrp x3, _text // runtime __pa(_text) 437 - sub x6, x6, x3 // _end - _text 438 - add x6, x6, x5 // runtime __va(_end) 439 - mov_q x7, SWAPPER_RW_MMUFLAGS 440 - 441 - map_memory x0, x1, x5, x6, x7, x3, (VA_BITS - PGDIR_SHIFT), x10, x11, x12, x13, x14 442 - 443 - dsb ishst // sync with page table walker 444 - ret 445 - SYM_FUNC_END(create_kernel_mapping) 446 - 447 180 /* 448 181 * Initialize CPU registers with task-specific and cpu-specific context. 449 182 * ··· 234 489 mov x0, x20 235 490 bl set_cpu_boot_mode_flag 236 491 237 - // Clear BSS 238 - adr_l x0, __bss_start 239 - mov x1, xzr 240 - adr_l x2, __bss_stop 241 - sub x2, x2, x0 242 - bl __pi_memset 243 - dsb ishst // Make zero page visible to PTW 244 - 245 - #if VA_BITS > 48 246 - adr_l x8, vabits_actual // Set this early so KASAN early init 247 - str x25, [x8] // ... 
observes the correct value 248 - dc civac, x8 // Make visible to booting secondaries 249 - #endif 250 - 251 - #ifdef CONFIG_RANDOMIZE_BASE 252 - adrp x5, memstart_offset_seed // Save KASLR linear map seed 253 - strh w24, [x5, :lo12:memstart_offset_seed] 254 - #endif 255 492 #if defined(CONFIG_KASAN_GENERIC) || defined(CONFIG_KASAN_SW_TAGS) 256 493 bl kasan_early_init 257 - #endif 258 - mov x0, x21 // pass FDT address in x0 259 - bl early_fdt_map // Try mapping the FDT early 260 - mov x0, x20 // pass the full boot status 261 - bl init_feature_override // Parse cpu feature overrides 262 - #ifdef CONFIG_UNWIND_PATCH_PAC_INTO_SCS 263 - bl scs_patch_vmlinux 264 494 #endif 265 495 mov x0, x20 266 496 bl finalise_el2 // Prefer VHE if possible ··· 363 643 * Common entry point for secondary CPUs. 364 644 */ 365 645 mov x20, x0 // preserve boot mode 646 + 647 + #ifdef CONFIG_ARM64_VA_BITS_52 648 + alternative_if ARM64_HAS_VA52 366 649 bl __cpu_secondary_check52bitva 367 - #if VA_BITS > 48 368 - ldr_l x0, vabits_actual 650 + alternative_else_nop_endif 369 651 #endif 652 + 370 653 bl __cpu_setup // initialise processor 371 654 adrp x1, swapper_pg_dir 372 655 adrp x2, idmap_pg_dir ··· 472 749 ret 473 750 SYM_FUNC_END(__enable_mmu) 474 751 752 + #ifdef CONFIG_ARM64_VA_BITS_52 475 753 SYM_FUNC_START(__cpu_secondary_check52bitva) 476 - #if VA_BITS > 48 477 - ldr_l x0, vabits_actual 478 - cmp x0, #52 479 - b.ne 2f 480 - 754 + #ifndef CONFIG_ARM64_LPA2 481 755 mrs_s x0, SYS_ID_AA64MMFR2_EL1 482 756 and x0, x0, ID_AA64MMFR2_EL1_VARange_MASK 483 757 cbnz x0, 2f 758 + #else 759 + mrs x0, id_aa64mmfr0_el1 760 + sbfx x0, x0, #ID_AA64MMFR0_EL1_TGRAN_SHIFT, 4 761 + cmp x0, #ID_AA64MMFR0_EL1_TGRAN_LPA2 762 + b.ge 2f 763 + #endif 484 764 485 765 update_early_cpu_boot_status \ 486 766 CPU_STUCK_IN_KERNEL | CPU_STUCK_REASON_52_BIT_VA, x0, x1 ··· 491 765 wfi 492 766 b 1b 493 767 494 - #endif 495 768 2: ret 496 769 SYM_FUNC_END(__cpu_secondary_check52bitva) 770 + #endif 497 771 498 772 
SYM_FUNC_START_LOCAL(__no_granule_support) 499 773 /* Indicate that this CPU can't boot and is stuck in the kernel */ ··· 505 779 b 1b 506 780 SYM_FUNC_END(__no_granule_support) 507 781 508 - #ifdef CONFIG_RELOCATABLE 509 - SYM_FUNC_START_LOCAL(__relocate_kernel) 510 - /* 511 - * Iterate over each entry in the relocation table, and apply the 512 - * relocations in place. 513 - */ 514 - adr_l x9, __rela_start 515 - adr_l x10, __rela_end 516 - mov_q x11, KIMAGE_VADDR // default virtual offset 517 - add x11, x11, x23 // actual virtual offset 518 - 519 - 0: cmp x9, x10 520 - b.hs 1f 521 - ldp x12, x13, [x9], #24 522 - ldr x14, [x9, #-8] 523 - cmp w13, #R_AARCH64_RELATIVE 524 - b.ne 0b 525 - add x14, x14, x23 // relocate 526 - str x14, [x12, x23] 527 - b 0b 528 - 529 - 1: 530 - #ifdef CONFIG_RELR 531 - /* 532 - * Apply RELR relocations. 533 - * 534 - * RELR is a compressed format for storing relative relocations. The 535 - * encoded sequence of entries looks like: 536 - * [ AAAAAAAA BBBBBBB1 BBBBBBB1 ... AAAAAAAA BBBBBB1 ... ] 537 - * 538 - * i.e. start with an address, followed by any number of bitmaps. The 539 - * address entry encodes 1 relocation. The subsequent bitmap entries 540 - * encode up to 63 relocations each, at subsequent offsets following 541 - * the last address entry. 542 - * 543 - * The bitmap entries must have 1 in the least significant bit. The 544 - * assumption here is that an address cannot have 1 in lsb. Odd 545 - * addresses are not supported. Any odd addresses are stored in the RELA 546 - * section, which is handled above. 547 - * 548 - * Excluding the least significant bit in the bitmap, each non-zero 549 - * bit in the bitmap represents a relocation to be applied to 550 - * a corresponding machine word that follows the base address 551 - * word. The second least significant bit represents the machine 552 - * word immediately following the initial address, and each bit 553 - * that follows represents the next word, in linear order. 
As such, 554 - * a single bitmap can encode up to 63 relocations in a 64-bit object. 555 - * 556 - * In this implementation we store the address of the next RELR table 557 - * entry in x9, the address being relocated by the current address or 558 - * bitmap entry in x13 and the address being relocated by the current 559 - * bit in x14. 560 - */ 561 - adr_l x9, __relr_start 562 - adr_l x10, __relr_end 563 - 564 - 2: cmp x9, x10 565 - b.hs 7f 566 - ldr x11, [x9], #8 567 - tbnz x11, #0, 3f // branch to handle bitmaps 568 - add x13, x11, x23 569 - ldr x12, [x13] // relocate address entry 570 - add x12, x12, x23 571 - str x12, [x13], #8 // adjust to start of bitmap 572 - b 2b 573 - 574 - 3: mov x14, x13 575 - 4: lsr x11, x11, #1 576 - cbz x11, 6f 577 - tbz x11, #0, 5f // skip bit if not set 578 - ldr x12, [x14] // relocate bit 579 - add x12, x12, x23 580 - str x12, [x14] 581 - 582 - 5: add x14, x14, #8 // move to next bit's address 583 - b 4b 584 - 585 - 6: /* 586 - * Move to the next bitmap's address. 8 is the word size, and 63 is the 587 - * number of significant bits in a bitmap entry. 
588 - */ 589 - add x13, x13, #(8 * 63) 590 - b 2b 591 - 592 - 7: 593 - #endif 594 - ret 595 - 596 - SYM_FUNC_END(__relocate_kernel) 597 - #endif 598 - 599 782 SYM_FUNC_START_LOCAL(__primary_switch) 600 783 adrp x1, reserved_pg_dir 601 784 adrp x2, init_idmap_pg_dir 602 785 bl __enable_mmu 603 - #ifdef CONFIG_RELOCATABLE 604 - adrp x23, KERNEL_START 605 - and x23, x23, MIN_KIMG_ALIGN - 1 606 - #ifdef CONFIG_RANDOMIZE_BASE 607 - mov x0, x22 608 - adrp x1, init_pg_end 786 + 787 + adrp x1, early_init_stack 609 788 mov sp, x1 610 789 mov x29, xzr 611 - bl __pi_kaslr_early_init 612 - and x24, x0, #SZ_2M - 1 // capture memstart offset seed 613 - bic x0, x0, #SZ_2M - 1 614 - orr x23, x23, x0 // record kernel offset 615 - #endif 616 - #endif 617 - bl clear_page_tables 618 - bl create_kernel_mapping 790 + mov x0, x20 // pass the full boot status 791 + mov x1, x21 // pass the FDT 792 + bl __pi_early_map_kernel // Map and relocate the kernel 619 793 620 - adrp x1, init_pg_dir 621 - load_ttbr1 x1, x1, x2 622 - #ifdef CONFIG_RELOCATABLE 623 - bl __relocate_kernel 624 - #endif 625 794 ldr x8, =__primary_switched 626 795 adrp x0, KERNEL_START // __pa(KERNEL_START) 627 796 br x8
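The LPA2 branch of __cpu_secondary_check52bitva in the head.S diff above extracts the granule field with sbfx, that is, as a signed 4-bit value, before comparing it against the LPA2 threshold. The sketch below shows why the signed read matters. The helper names are hypothetical, and the shift and threshold are left as caller-supplied parameters because this note does not try to restate the kernel's per-granule constants.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: extract `width` bits starting at `shift` from `reg`
 * and sign-extend them, which is the effect of the sbfx instruction.
 */
static int64_t extract_signed_field(uint64_t reg, unsigned int shift,
				    unsigned int width)
{
	assert(width > 0 && width < 64 && shift + width <= 64);

	uint64_t field = (reg >> shift) & ((UINT64_C(1) << width) - 1);
	int64_t sign = (int64_t)(UINT64_C(1) << (width - 1));

	/* Flip the sign bit, then subtract it: two's-complement sign extension. */
	return (int64_t)(field ^ (uint64_t)sign) - sign;
}

/*
 * Hypothetical check in the style of the comparison above:
 * the granule supports the feature if the signed field is >= threshold.
 */
static bool granule_field_at_least(uint64_t reg, unsigned int shift,
				   int64_t threshold)
{
	return extract_signed_field(reg, shift, 4) >= threshold;
}
```

With an unsigned read, an all-ones field (0xf, which some ID register fields use to mean "not implemented") would compare as 15 and pass any small threshold. Read as signed it is -1 and fails the check.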
+2 -1
arch/arm64/kernel/hw_breakpoint.c
···

 #include <asm/current.h>
 #include <asm/debug-monitors.h>
+#include <asm/esr.h>
 #include <asm/hw_breakpoint.h>
 #include <asm/traps.h>
 #include <asm/cputype.h>
···
 		 * Check that the access type matches.
 		 * 0 => load, otherwise => store
 		 */
-		access = (esr & AARCH64_ESR_ACCESS_MASK) ? HW_BREAKPOINT_W :
+		access = (esr & ESR_ELx_WNR) ? HW_BREAKPOINT_W :
 			 HW_BREAKPOINT_R;
 		if (!(access & hw_breakpoint_type(wp)))
 			continue;
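The hunk above replaces a driver-local mask with the shared ESR_ELx_WNR definition. The logic stays the same: one syndrome bit tells a write from a read, and the result is matched against the watchpoint's type. A self-contained sketch of that decode follows. The `SKETCH_*` constants are local stand-ins defined for this example; the assumption that WnR is bit 6 of the syndrome and that read/write types are 1 and 2 reflects my understanding of the architecture and the perf UAPI, not anything stated in the diff.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins for illustration; the kernel gets these from its headers. */
#define SKETCH_ESR_WNR	(UINT64_C(1) << 6)	/* assumed: write-not-read bit */

enum {
	SKETCH_BP_R = 1,	/* assumed value of the "read" watchpoint type */
	SKETCH_BP_W = 2,	/* assumed value of the "write" watchpoint type */
};

/* 0 => load, otherwise => store, as in the comment in the hunk above. */
static int access_type_from_esr(uint64_t esr)
{
	return (esr & SKETCH_ESR_WNR) ? SKETCH_BP_W : SKETCH_BP_R;
}

/* A watchpoint fires only if its type mask includes the access type. */
static bool watchpoint_matches(uint64_t esr, int wp_type)
{
	return (access_type_from_esr(esr) & wp_type) != 0;
}
```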
+52 -26
arch/arm64/kernel/idreg-override.c arch/arm64/kernel/pi/idreg-override.c
···
 #include <asm/cpufeature.h>
 #include <asm/setup.h>

+#include "pi.h"
+
 #define FTR_DESC_NAME_LEN 20
 #define FTR_DESC_FIELD_LEN 10
 #define FTR_ALIAS_NAME_LEN 30
 #define FTR_ALIAS_OPTION_LEN 116

 static u64 __boot_status __initdata;
-
-// temporary __prel64 related definitions
-// to be removed when this code is moved under pi/
-
-#define __prel64_initconst __initconst
-
-#define PREL64(type, name) union { type *name; }
-
-#define prel64_pointer(__d) (__d)

 typedef bool filter_t(u64 val);

···
 	.override = &id_aa64mmfr1_override,
 	.fields = {
 		FIELD("vh", ID_AA64MMFR1_EL1_VH_SHIFT, mmfr1_vh_filter),
+		{}
+	},
+};
+
+
+static bool __init mmfr2_varange_filter(u64 val)
+{
+	int __maybe_unused feat;
+
+	if (val)
+		return false;
+
+#ifdef CONFIG_ARM64_LPA2
+	feat = cpuid_feature_extract_signed_field(read_sysreg(id_aa64mmfr0_el1),
+						  ID_AA64MMFR0_EL1_TGRAN_SHIFT);
+	if (feat >= ID_AA64MMFR0_EL1_TGRAN_LPA2) {
+		id_aa64mmfr0_override.val |=
+			(ID_AA64MMFR0_EL1_TGRAN_LPA2 - 1) << ID_AA64MMFR0_EL1_TGRAN_SHIFT;
+		id_aa64mmfr0_override.mask |= 0xfU << ID_AA64MMFR0_EL1_TGRAN_SHIFT;
+	}
+#endif
+	return true;
+}
+
+static const struct ftr_set_desc mmfr2 __prel64_initconst = {
+	.name = "id_aa64mmfr2",
+	.override = &id_aa64mmfr2_override,
+	.fields = {
+		FIELD("varange", ID_AA64MMFR2_EL1_VARange_SHIFT, mmfr2_varange_filter),
 		{}
 	},
 };
···
 	.fields = {
 		FIELD("nokaslr", ARM64_SW_FEATURE_OVERRIDE_NOKASLR, NULL),
 		FIELD("hvhe", ARM64_SW_FEATURE_OVERRIDE_HVHE, hvhe_filter),
+		FIELD("rodataoff", ARM64_SW_FEATURE_OVERRIDE_RODATA_OFF, NULL),
 		{}
 	},
 };
···
 static const
 PREL64(const struct ftr_set_desc, reg) regs[] __prel64_initconst = {
 	{ &mmfr1 },
+	{ &mmfr2 },
 	{ &pfr0 },
 	{ &pfr1 },
 	{ &isar1 },
···
 	{ "arm64.nomops", "id_aa64isar2.mops=0" },
 	{ "arm64.nomte", "id_aa64pfr1.mte=0" },
 	{ "nokaslr", "arm64_sw.nokaslr=1" },
+	{ "rodata=off", "arm64_sw.rodataoff=1" },
+	{ "arm64.nolva", "id_aa64mmfr2.varange=0" },
 };

 static int __init parse_hexdigit(const char *p, u64 *v)
···
 	} while (1);
 }

-static __init const u8 *get_bootargs_cmdline(void)
+static __init const u8 *get_bootargs_cmdline(const void *fdt, int node)
 {
+	static char const bootargs[] __initconst = "bootargs";
 	const u8 *prop;
-	void *fdt;
-	int node;

-	fdt = get_early_fdt_ptr();
-	if (!fdt)
-		return NULL;
-
-	node = fdt_path_offset(fdt, "/chosen");
 	if (node < 0)
 		return NULL;

-	prop = fdt_getprop(fdt, node, "bootargs", NULL);
+	prop = fdt_getprop(fdt, node, bootargs, NULL);
 	if (!prop)
 		return NULL;

 	return strlen(prop) ? prop : NULL;
 }

-static __init void parse_cmdline(void)
+static __init void parse_cmdline(const void *fdt, int chosen)
 {
-	const u8 *prop = get_bootargs_cmdline();
+	static char const cmdline[] __initconst = CONFIG_CMDLINE;
+	const u8 *prop = get_bootargs_cmdline(fdt, chosen);

 	if (IS_ENABLED(CONFIG_CMDLINE_FORCE) || !prop)
-		__parse_cmdline(CONFIG_CMDLINE, true);
+		__parse_cmdline(cmdline, true);

 	if (!IS_ENABLED(CONFIG_CMDLINE_FORCE) && prop)
 		__parse_cmdline(prop, true);
 }

-/* Keep checkers quiet */
-void init_feature_override(u64 boot_status);
-
-asmlinkage void __init init_feature_override(u64 boot_status)
+void __init init_feature_override(u64 boot_status, const void *fdt,
+				  int chosen)
 {
 	struct arm64_ftr_override *override;
 	const struct ftr_set_desc *reg;
···

 	__boot_status = boot_status;

-	parse_cmdline();
+	parse_cmdline(fdt, chosen);

 	for (i = 0; i < ARRAY_SIZE(regs); i++) {
 		reg = prel64_pointer(regs[i].reg);
···
 		dcache_clean_inval_poc((unsigned long)override,
 				       (unsigned long)(override + 1));
 	}
+}
+
+char * __init skip_spaces(const char *str)
+{
+	while (isspace(*str))
+		++str;
+	return (char *)str;
 }
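The alias table touched in the diff above maps a user-facing command-line option onto a "register.field=value" override string. The three entries below are copied from the hunks above. The lookup itself is a deliberately simplified sketch: `expand_alias` is a hypothetical name, and plain strcmp() matching is an assumption of this example rather than a description of how the kernel compares parameters.

```c
#include <stddef.h>
#include <string.h>

struct alias_sketch {
	const char *alias;	/* what the user types on the command line */
	const char *feature;	/* the override string it expands to */
};

/* Entries taken from the alias table shown in the diff. */
static const struct alias_sketch aliases[] = {
	{ "nokaslr",		"arm64_sw.nokaslr=1" },
	{ "rodata=off",		"arm64_sw.rodataoff=1" },
	{ "arm64.nolva",	"id_aa64mmfr2.varange=0" },
};

/*
 * Hypothetical helper: return the override string for `opt`,
 * or NULL if `opt` is not a known alias.
 */
static const char *expand_alias(const char *opt)
{
	for (size_t i = 0; i < sizeof(aliases) / sizeof(aliases[0]); i++) {
		if (strcmp(opt, aliases[i].alias) == 0)
			return aliases[i].feature;
	}
	return NULL;
}
```

Under this reading, booting with arm64.nolva is a shorthand for forcing the varange field of id_aa64mmfr2 to 0, which is the value that mmfr2_varange_filter() accepts.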
+35
arch/arm64/kernel/image-vars.h
···
 PROVIDE(__pi___memmove = __pi_memmove);
 PROVIDE(__pi___memset = __pi_memset);

+PROVIDE(__pi_id_aa64isar1_override = id_aa64isar1_override);
+PROVIDE(__pi_id_aa64isar2_override = id_aa64isar2_override);
+PROVIDE(__pi_id_aa64mmfr0_override = id_aa64mmfr0_override);
+PROVIDE(__pi_id_aa64mmfr1_override = id_aa64mmfr1_override);
+PROVIDE(__pi_id_aa64mmfr2_override = id_aa64mmfr2_override);
+PROVIDE(__pi_id_aa64pfr0_override = id_aa64pfr0_override);
+PROVIDE(__pi_id_aa64pfr1_override = id_aa64pfr1_override);
+PROVIDE(__pi_id_aa64smfr0_override = id_aa64smfr0_override);
+PROVIDE(__pi_id_aa64zfr0_override = id_aa64zfr0_override);
+PROVIDE(__pi_arm64_sw_feature_override = arm64_sw_feature_override);
+PROVIDE(__pi_arm64_use_ng_mappings = arm64_use_ng_mappings);
+#ifdef CONFIG_CAVIUM_ERRATUM_27456
+PROVIDE(__pi_cavium_erratum_27456_cpus = cavium_erratum_27456_cpus);
+#endif
+PROVIDE(__pi__ctype = _ctype);
+PROVIDE(__pi_memstart_offset_seed = memstart_offset_seed);
+
+PROVIDE(__pi_init_idmap_pg_dir = init_idmap_pg_dir);
+PROVIDE(__pi_init_idmap_pg_end = init_idmap_pg_end);
+PROVIDE(__pi_init_pg_dir = init_pg_dir);
+PROVIDE(__pi_init_pg_end = init_pg_end);
+PROVIDE(__pi_swapper_pg_dir = swapper_pg_dir);
+
+PROVIDE(__pi__text = _text);
+PROVIDE(__pi__stext = _stext);
+PROVIDE(__pi__etext = _etext);
+PROVIDE(__pi___start_rodata = __start_rodata);
+PROVIDE(__pi___inittext_begin = __inittext_begin);
+PROVIDE(__pi___inittext_end = __inittext_end);
+PROVIDE(__pi___initdata_begin = __initdata_begin);
+PROVIDE(__pi___initdata_end = __initdata_end);
+PROVIDE(__pi__data = _data);
+PROVIDE(__pi___bss_start = __bss_start);
+PROVIDE(__pi__end = _end);
+
 #ifdef CONFIG_KVM

 /*
+1 -3
arch/arm64/kernel/kaslr.c
···

 void __init kaslr_init(void)
 {
-	if (cpuid_feature_extract_unsigned_field(arm64_sw_feature_override.val &
-						 arm64_sw_feature_override.mask,
-						 ARM64_SW_FEATURE_OVERRIDE_NOKASLR)) {
+	if (kaslr_disabled_cmdline()) {
 		pr_info("KASLR disabled on command line\n");
 		return;
 	}
+1 -1
arch/arm64/kernel/module.c
···
 	if (scs_is_dynamic()) {
 		s = find_section(hdr, sechdrs, ".init.eh_frame");
 		if (s)
-			scs_patch((void *)s->sh_addr, s->sh_size);
+			__pi_scs_patch((void *)s->sh_addr, s->sh_size);
 	}

 	return module_init_ftrace_plt(hdr, sechdrs, me);
+14 -22
arch/arm64/kernel/patch-scs.c arch/arm64/kernel/pi/patch-scs.c
···
  * Author: Ard Biesheuvel <ardb@google.com>
  */

-#include <linux/bug.h>
 #include <linux/errno.h>
 #include <linux/init.h>
 #include <linux/linkage.h>
-#include <linux/printk.h>
 #include <linux/types.h>

-#include <asm/cacheflush.h>
 #include <asm/scs.h>
+
+#include "pi.h"
+
+bool dynamic_scs_is_enabled;

 //
 // This minimal DWARF CFI parser is partially based on the code in
···
 #define DW_CFA_GNU_negative_offset_extended 0x2f
 #define DW_CFA_hi_user 0x3f

-extern const u8 __eh_frame_start[], __eh_frame_end[];
-
 enum {
 	PACIASP = 0xd503233f,
 	AUTIASP = 0xd50323bf,
···
 		 */
 		return;
 	}
-	dcache_clean_pou(loc, loc + sizeof(u32));
+	if (IS_ENABLED(CONFIG_ARM64_WORKAROUND_CLEAN_CACHE))
+		asm("dc civac, %0" :: "r"(loc));
+	else
+		asm(ALTERNATIVE("dc cvau, %0", "nop", ARM64_HAS_CACHE_IDC)
+		    :: "r"(loc));
 }

 /*
···
 	};
 };

-static int noinstr scs_handle_fde_frame(const struct eh_frame *frame,
-					bool fde_has_augmentation_data,
-					int code_alignment_factor,
-					bool dry_run)
+static int scs_handle_fde_frame(const struct eh_frame *frame,
+				bool fde_has_augmentation_data,
+				int code_alignment_factor,
+				bool dry_run)
 {
 	int size = frame->size - offsetof(struct eh_frame, opcodes) + 4;
 	u64 loc = (u64)offset_to_ptr(&frame->initial_loc);
···
 			break;

 		default:
-			pr_err("unhandled opcode: %02x in FDE frame %lx\n", opcode[-1], (uintptr_t)frame);
 			return -ENOEXEC;
 		}
 	}
 	return 0;
 }

-int noinstr scs_patch(const u8 eh_frame[], int size)
+int scs_patch(const u8 eh_frame[], int size)
 {
 	const u8 *p = eh_frame;

···
 		size -= sizeof(frame->size) + frame->size;
 	}
 	return 0;
-}
-
-asmlinkage void __init scs_patch_vmlinux(void)
-{
-	if (!should_patch_pac_into_scs())
-		return;
-
-	WARN_ON(scs_patch(__eh_frame_start, __eh_frame_end - __eh_frame_start));
-	icache_inval_all_pou();
-	isb();
 }
+3
arch/arm64/kernel/pi/.gitignore
··· 1 + # SPDX-License-Identifier: GPL-2.0-only 2 + 3 + relacheck
+21 -6
arch/arm64/kernel/pi/Makefile
··· 11 11 -fno-asynchronous-unwind-tables -fno-unwind-tables \ 12 12 $(call cc-option,-fno-addrsig) 13 13 14 + # this code may run with the MMU off so disable unaligned accesses 15 + CFLAGS_map_range.o += -mstrict-align 16 + 14 17 # remove SCS flags from all objects in this directory 15 18 KBUILD_CFLAGS := $(filter-out $(CC_FLAGS_SCS), $(KBUILD_CFLAGS)) 16 19 # disable LTO ··· 25 22 UBSAN_SANITIZE := n 26 23 KCOV_INSTRUMENT := n 27 24 25 + hostprogs := relacheck 26 + 27 + quiet_cmd_piobjcopy = $(quiet_cmd_objcopy) 28 + cmd_piobjcopy = $(cmd_objcopy) && $(obj)/relacheck $(@) $(<) 29 + 28 30 $(obj)/%.pi.o: OBJCOPYFLAGS := --prefix-symbols=__pi_ \ 29 - --remove-section=.note.gnu.property \ 30 - --prefix-alloc-sections=.init 31 - $(obj)/%.pi.o: $(obj)/%.o FORCE 32 - $(call if_changed,objcopy) 31 + --remove-section=.note.gnu.property 32 + $(obj)/%.pi.o: $(obj)/%.o $(obj)/relacheck FORCE 33 + $(call if_changed,piobjcopy) 34 + 35 + # ensure that all the lib- code ends up as __init code and data 36 + $(obj)/lib-%.pi.o: OBJCOPYFLAGS += --prefix-alloc-sections=.init 33 37 34 38 $(obj)/lib-%.o: $(srctree)/lib/%.c FORCE 35 39 $(call if_changed_rule,cc_o_c) 36 40 37 - obj-y := kaslr_early.pi.o lib-fdt.pi.o lib-fdt_ro.pi.o 38 - extra-y := $(patsubst %.pi.o,%.o,$(obj-y)) 41 + obj-y := idreg-override.pi.o \ 42 + map_kernel.pi.o map_range.pi.o \ 43 + lib-fdt.pi.o lib-fdt_ro.pi.o 44 + obj-$(CONFIG_RELOCATABLE) += relocate.pi.o 45 + obj-$(CONFIG_RANDOMIZE_BASE) += kaslr_early.pi.o 46 + obj-$(CONFIG_UNWIND_PATCH_PAC_INTO_SCS) += patch-scs.pi.o 47 + extra-y := $(patsubst %.pi.o,%.o,$(obj-y))
+19 -63
arch/arm64/kernel/pi/kaslr_early.c
··· 14 14 15 15 #include <asm/archrandom.h> 16 16 #include <asm/memory.h> 17 + #include <asm/pgtable.h> 17 18 18 - /* taken from lib/string.c */ 19 - static char *__strstr(const char *s1, const char *s2) 19 + #include "pi.h" 20 + 21 + extern u16 memstart_offset_seed; 22 + 23 + static u64 __init get_kaslr_seed(void *fdt, int node) 20 24 { 21 - size_t l1, l2; 22 - 23 - l2 = strlen(s2); 24 - if (!l2) 25 - return (char *)s1; 26 - l1 = strlen(s1); 27 - while (l1 >= l2) { 28 - l1--; 29 - if (!memcmp(s1, s2, l2)) 30 - return (char *)s1; 31 - s1++; 32 - } 33 - return NULL; 34 - } 35 - static bool cmdline_contains_nokaslr(const u8 *cmdline) 36 - { 37 - const u8 *str; 38 - 39 - str = __strstr(cmdline, "nokaslr"); 40 - return str == cmdline || (str > cmdline && *(str - 1) == ' '); 41 - } 42 - 43 - static bool is_kaslr_disabled_cmdline(void *fdt) 44 - { 45 - if (!IS_ENABLED(CONFIG_CMDLINE_FORCE)) { 46 - int node; 47 - const u8 *prop; 48 - 49 - node = fdt_path_offset(fdt, "/chosen"); 50 - if (node < 0) 51 - goto out; 52 - 53 - prop = fdt_getprop(fdt, node, "bootargs", NULL); 54 - if (!prop) 55 - goto out; 56 - 57 - if (cmdline_contains_nokaslr(prop)) 58 - return true; 59 - 60 - if (IS_ENABLED(CONFIG_CMDLINE_EXTEND)) 61 - goto out; 62 - 63 - return false; 64 - } 65 - out: 66 - return cmdline_contains_nokaslr(CONFIG_CMDLINE); 67 - } 68 - 69 - static u64 get_kaslr_seed(void *fdt) 70 - { 71 - int node, len; 25 + static char const seed_str[] __initconst = "kaslr-seed"; 72 26 fdt64_t *prop; 73 27 u64 ret; 28 + int len; 74 29 75 - node = fdt_path_offset(fdt, "/chosen"); 76 30 if (node < 0) 77 31 return 0; 78 32 79 - prop = fdt_getprop_w(fdt, node, "kaslr-seed", &len); 33 + prop = fdt_getprop_w(fdt, node, seed_str, &len); 80 34 if (!prop || len != sizeof(u64)) 81 35 return 0; 82 36 ··· 39 85 return ret; 40 86 } 41 87 42 - asmlinkage u64 kaslr_early_init(void *fdt) 88 + u64 __init kaslr_early_init(void *fdt, int chosen) 43 89 { 44 - u64 seed; 90 + u64 seed, range; 45 91 46 - if 
(is_kaslr_disabled_cmdline(fdt)) 92 + if (kaslr_disabled_cmdline()) 47 93 return 0; 48 94 49 - seed = get_kaslr_seed(fdt); 95 + seed = get_kaslr_seed(fdt, chosen); 50 96 if (!seed) { 51 97 if (!__early_cpu_has_rndr() || 52 98 !__arm64_rndr((unsigned long *)&seed)) 53 99 return 0; 54 100 } 55 101 102 + memstart_offset_seed = seed & U16_MAX; 103 + 56 104 /* 57 105 * OK, so we are proceeding with KASLR enabled. Calculate a suitable 58 106 * kernel image offset from the seed. Let's place the kernel in the 59 - * middle half of the VMALLOC area (VA_BITS_MIN - 2), and stay clear of 60 - * the lower and upper quarters to avoid colliding with other 61 - * allocations. 107 + * 'middle' half of the VMALLOC area, and stay clear of the lower and 108 + * upper quarters to avoid colliding with other allocations. 62 109 */ 63 - return BIT(VA_BITS_MIN - 3) + (seed & GENMASK(VA_BITS_MIN - 3, 0)); 110 + range = (VMALLOC_END - KIMAGE_VADDR) / 2; 111 + return range / 2 + (((__uint128_t)range * seed) >> 64); 64 112 }
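The new return expression in kaslr_early_init() scales the 64-bit seed onto half of the VMALLOC span and then skips the lower quarter. A minimal userspace sketch of that arithmetic follows. It assumes a compiler that provides `__uint128_t` (GCC or Clang on a 64-bit target); `kaslr_offset_sketch` and `span` are hypothetical names, with `span` standing in for `VMALLOC_END - KIMAGE_VADDR`.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Illustrative only, not kernel code: map a 64-bit seed onto the
 * middle half of a span of virtual address space.
 */
uint64_t kaslr_offset_sketch(uint64_t span, uint64_t seed)
{
	uint64_t range = span / 2;
	uint64_t scaled;

	assert(span != 0);

	/* (range * seed) >> 64 spreads the seed over [0, range) */
	scaled = (uint64_t)(((__uint128_t)range * seed) >> 64);

	/* skipping span/4 puts the result in [span/4, 3 * span/4) */
	return range / 2 + scaled;
}
```

Unlike the old mask-based expression, the multiply-and-shift form works for a span that is not a power of two.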
+253
arch/arm64/kernel/pi/map_kernel.c
··· 1 + // SPDX-License-Identifier: GPL-2.0-only 2 + // Copyright 2023 Google LLC 3 + // Author: Ard Biesheuvel <ardb@google.com> 4 + 5 + #include <linux/init.h> 6 + #include <linux/libfdt.h> 7 + #include <linux/linkage.h> 8 + #include <linux/types.h> 9 + #include <linux/sizes.h> 10 + #include <linux/string.h> 11 + 12 + #include <asm/memory.h> 13 + #include <asm/pgalloc.h> 14 + #include <asm/pgtable.h> 15 + #include <asm/tlbflush.h> 16 + 17 + #include "pi.h" 18 + 19 + extern const u8 __eh_frame_start[], __eh_frame_end[]; 20 + 21 + extern void idmap_cpu_replace_ttbr1(void *pgdir); 22 + 23 + static void __init map_segment(pgd_t *pg_dir, u64 *pgd, u64 va_offset, 24 + void *start, void *end, pgprot_t prot, 25 + bool may_use_cont, int root_level) 26 + { 27 + map_range(pgd, ((u64)start + va_offset) & ~PAGE_OFFSET, 28 + ((u64)end + va_offset) & ~PAGE_OFFSET, (u64)start, 29 + prot, root_level, (pte_t *)pg_dir, may_use_cont, 0); 30 + } 31 + 32 + static void __init unmap_segment(pgd_t *pg_dir, u64 va_offset, void *start, 33 + void *end, int root_level) 34 + { 35 + map_segment(pg_dir, NULL, va_offset, start, end, __pgprot(0), 36 + false, root_level); 37 + } 38 + 39 + static void __init map_kernel(u64 kaslr_offset, u64 va_offset, int root_level) 40 + { 41 + bool enable_scs = IS_ENABLED(CONFIG_UNWIND_PATCH_PAC_INTO_SCS); 42 + bool twopass = IS_ENABLED(CONFIG_RELOCATABLE); 43 + u64 pgdp = (u64)init_pg_dir + PAGE_SIZE; 44 + pgprot_t text_prot = PAGE_KERNEL_ROX; 45 + pgprot_t data_prot = PAGE_KERNEL; 46 + pgprot_t prot; 47 + 48 + /* 49 + * External debuggers may need to write directly to the text mapping to 50 + * install SW breakpoints. Allow this (only) when explicitly requested 51 + * with rodata=off. 52 + */ 53 + if (arm64_test_sw_feature_override(ARM64_SW_FEATURE_OVERRIDE_RODATA_OFF)) 54 + text_prot = PAGE_KERNEL_EXEC; 55 + 56 + /* 57 + * We only enable the shadow call stack dynamically if we are running 58 + * on a system that does not implement PAC or BTI. 
PAC and SCS provide 59 + * roughly the same level of protection, and BTI relies on the PACIASP 60 + * instructions serving as landing pads, preventing us from patching 61 + * those instructions into something else. 62 + */ 63 + if (IS_ENABLED(CONFIG_ARM64_PTR_AUTH_KERNEL) && cpu_has_pac()) 64 + enable_scs = false; 65 + 66 + if (IS_ENABLED(CONFIG_ARM64_BTI_KERNEL) && cpu_has_bti()) { 67 + enable_scs = false; 68 + 69 + /* 70 + * If we have a CPU that supports BTI and a kernel built for 71 + * BTI then mark the kernel executable text as guarded pages 72 + * now so we don't have to rewrite the page tables later. 73 + */ 74 + text_prot = __pgprot_modify(text_prot, PTE_GP, PTE_GP); 75 + } 76 + 77 + /* Map all code read-write on the first pass if needed */ 78 + twopass |= enable_scs; 79 + prot = twopass ? data_prot : text_prot; 80 + 81 + map_segment(init_pg_dir, &pgdp, va_offset, _stext, _etext, prot, 82 + !twopass, root_level); 83 + map_segment(init_pg_dir, &pgdp, va_offset, __start_rodata, 84 + __inittext_begin, data_prot, false, root_level); 85 + map_segment(init_pg_dir, &pgdp, va_offset, __inittext_begin, 86 + __inittext_end, prot, false, root_level); 87 + map_segment(init_pg_dir, &pgdp, va_offset, __initdata_begin, 88 + __initdata_end, data_prot, false, root_level); 89 + map_segment(init_pg_dir, &pgdp, va_offset, _data, _end, data_prot, 90 + true, root_level); 91 + dsb(ishst); 92 + 93 + idmap_cpu_replace_ttbr1(init_pg_dir); 94 + 95 + if (twopass) { 96 + if (IS_ENABLED(CONFIG_RELOCATABLE)) 97 + relocate_kernel(kaslr_offset); 98 + 99 + if (enable_scs) { 100 + scs_patch(__eh_frame_start + va_offset, 101 + __eh_frame_end - __eh_frame_start); 102 + asm("ic ialluis"); 103 + 104 + dynamic_scs_is_enabled = true; 105 + } 106 + 107 + /* 108 + * Unmap the text region before remapping it, to avoid 109 + * potential TLB conflicts when creating the contiguous 110 + * descriptors. 
111 + */ 112 + unmap_segment(init_pg_dir, va_offset, _stext, _etext, 113 + root_level); 114 + dsb(ishst); 115 + isb(); 116 + __tlbi(vmalle1); 117 + isb(); 118 + 119 + /* 120 + * Remap these segments with different permissions 121 + * No new page table allocations should be needed 122 + */ 123 + map_segment(init_pg_dir, NULL, va_offset, _stext, _etext, 124 + text_prot, true, root_level); 125 + map_segment(init_pg_dir, NULL, va_offset, __inittext_begin, 126 + __inittext_end, text_prot, false, root_level); 127 + } 128 + 129 + /* Copy the root page table to its final location */ 130 + memcpy((void *)swapper_pg_dir + va_offset, init_pg_dir, PAGE_SIZE); 131 + dsb(ishst); 132 + idmap_cpu_replace_ttbr1(swapper_pg_dir); 133 + } 134 + 135 + static void noinline __section(".idmap.text") set_ttbr0_for_lpa2(u64 ttbr) 136 + { 137 + u64 sctlr = read_sysreg(sctlr_el1); 138 + u64 tcr = read_sysreg(tcr_el1) | TCR_DS; 139 + 140 + asm(" msr sctlr_el1, %0 ;" 141 + " isb ;" 142 + " msr ttbr0_el1, %1 ;" 143 + " msr tcr_el1, %2 ;" 144 + " isb ;" 145 + " tlbi vmalle1 ;" 146 + " dsb nsh ;" 147 + " isb ;" 148 + " msr sctlr_el1, %3 ;" 149 + " isb ;" 150 + :: "r"(sctlr & ~SCTLR_ELx_M), "r"(ttbr), "r"(tcr), "r"(sctlr)); 151 + } 152 + 153 + static void __init remap_idmap_for_lpa2(void) 154 + { 155 + /* clear the bits that change meaning once LPA2 is turned on */ 156 + pteval_t mask = PTE_SHARED; 157 + 158 + /* 159 + * We have to clear bits [9:8] in all block or page descriptors in the 160 + * initial ID map, as otherwise they will be (mis)interpreted as 161 + * physical address bits once we flick the LPA2 switch (TCR.DS). Since 162 + * we cannot manipulate live descriptors in that way without creating 163 + * potential TLB conflicts, let's create another temporary ID map in a 164 + * LPA2 compatible fashion, and update the initial ID map while running 165 + * from that. 
166 + */ 167 + create_init_idmap(init_pg_dir, mask); 168 + dsb(ishst); 169 + set_ttbr0_for_lpa2((u64)init_pg_dir); 170 + 171 + /* 172 + * Recreate the initial ID map with the same granularity as before. 173 + * Don't bother with the FDT, we no longer need it after this. 174 + */ 175 + memset(init_idmap_pg_dir, 0, 176 + (u64)init_idmap_pg_end - (u64)init_idmap_pg_dir); 177 + 178 + create_init_idmap(init_idmap_pg_dir, mask); 179 + dsb(ishst); 180 + 181 + /* switch back to the updated initial ID map */ 182 + set_ttbr0_for_lpa2((u64)init_idmap_pg_dir); 183 + 184 + /* wipe the temporary ID map from memory */ 185 + memset(init_pg_dir, 0, (u64)init_pg_end - (u64)init_pg_dir); 186 + } 187 + 188 + static void __init map_fdt(u64 fdt) 189 + { 190 + static u8 ptes[INIT_IDMAP_FDT_SIZE] __initdata __aligned(PAGE_SIZE); 191 + u64 efdt = fdt + MAX_FDT_SIZE; 192 + u64 ptep = (u64)ptes; 193 + 194 + /* 195 + * Map up to MAX_FDT_SIZE bytes, but avoid overlap with 196 + * the kernel image. 197 + */ 198 + map_range(&ptep, fdt, (u64)_text > fdt ?
min((u64)_text, efdt) : efdt, 199 + fdt, PAGE_KERNEL, IDMAP_ROOT_LEVEL, 200 + (pte_t *)init_idmap_pg_dir, false, 0); 201 + dsb(ishst); 202 + } 203 + 204 + asmlinkage void __init early_map_kernel(u64 boot_status, void *fdt) 205 + { 206 + static char const chosen_str[] __initconst = "/chosen"; 207 + u64 va_base, pa_base = (u64)&_text; 208 + u64 kaslr_offset = pa_base % MIN_KIMG_ALIGN; 209 + int root_level = 4 - CONFIG_PGTABLE_LEVELS; 210 + int va_bits = VA_BITS; 211 + int chosen; 212 + 213 + map_fdt((u64)fdt); 214 + 215 + /* Clear BSS and the initial page tables */ 216 + memset(__bss_start, 0, (u64)init_pg_end - (u64)__bss_start); 217 + 218 + /* Parse the command line for CPU feature overrides */ 219 + chosen = fdt_path_offset(fdt, chosen_str); 220 + init_feature_override(boot_status, fdt, chosen); 221 + 222 + if (IS_ENABLED(CONFIG_ARM64_64K_PAGES) && !cpu_has_lva()) { 223 + va_bits = VA_BITS_MIN; 224 + } else if (IS_ENABLED(CONFIG_ARM64_LPA2) && !cpu_has_lpa2()) { 225 + va_bits = VA_BITS_MIN; 226 + root_level++; 227 + } 228 + 229 + if (va_bits > VA_BITS_MIN) 230 + sysreg_clear_set(tcr_el1, TCR_T1SZ_MASK, TCR_T1SZ(va_bits)); 231 + 232 + /* 233 + * The virtual KASLR displacement modulo 2MiB is decided by the 234 + * physical placement of the image, as otherwise, we might not be able 235 + * to create the early kernel mapping using 2 MiB block descriptors. So 236 + * take the low bits of the KASLR offset from the physical address, and 237 + * fill in the high bits from the seed. 238 + */ 239 + if (IS_ENABLED(CONFIG_RANDOMIZE_BASE)) { 240 + u64 kaslr_seed = kaslr_early_init(fdt, chosen); 241 + 242 + if (kaslr_seed && kaslr_requires_kpti()) 243 + arm64_use_ng_mappings = true; 244 + 245 + kaslr_offset |= kaslr_seed & ~(MIN_KIMG_ALIGN - 1); 246 + } 247 + 248 + if (IS_ENABLED(CONFIG_ARM64_LPA2) && va_bits > VA_BITS_MIN) 249 + remap_idmap_for_lpa2(); 250 + 251 + va_base = KIMAGE_VADDR + kaslr_offset; 252 + map_kernel(kaslr_offset, va_base - pa_base, root_level); 253 + }
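The comment in early_map_kernel() says the low bits of the KASLR displacement come from the physical placement of the image and the high bits from the seed. The sketch below shows that combination in isolation. `combine_kaslr_offset` and the `align` parameter are hypothetical; the kernel uses `MIN_KIMG_ALIGN`, which the comment describes as 2 MiB.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Illustrative only: keep the image's physical offset within an
 * alignment unit and take everything above that from the seed.
 */
uint64_t combine_kaslr_offset(uint64_t pa_base, uint64_t seed_offset,
			      uint64_t align)
{
	uint64_t low, high;

	/* align must be a non-zero power of two */
	assert(align != 0 && (align & (align - 1)) == 0);

	low = pa_base % align;			/* dictated by physical placement */
	high = seed_offset & ~(align - 1);	/* randomised part */

	return low | high;
}
```

Because virtual and physical addresses then agree modulo the alignment, a block descriptor of that size can cover the image without splitting it into pages.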
+105
arch/arm64/kernel/pi/map_range.c
··· 1 + // SPDX-License-Identifier: GPL-2.0-only 2 + // Copyright 2023 Google LLC 3 + // Author: Ard Biesheuvel <ardb@google.com> 4 + 5 + #include <linux/types.h> 6 + #include <linux/sizes.h> 7 + 8 + #include <asm/memory.h> 9 + #include <asm/pgalloc.h> 10 + #include <asm/pgtable.h> 11 + 12 + #include "pi.h" 13 + 14 + /** 15 + * map_range - Map a contiguous range of physical pages into virtual memory 16 + * 17 + * @pte: Address of physical pointer to array of pages to 18 + * allocate page tables from 19 + * @start: Virtual address of the start of the range 20 + * @end: Virtual address of the end of the range (exclusive) 21 + * @pa: Physical address of the start of the range 22 + * @prot: Access permissions of the range 23 + * @level: Translation level for the mapping 24 + * @tbl: The level @level page table to create the mappings in 25 + * @may_use_cont: Whether the use of the contiguous attribute is allowed 26 + * @va_offset: Offset between a physical page and its current mapping 27 + * in the VA space 28 + */ 29 + void __init map_range(u64 *pte, u64 start, u64 end, u64 pa, pgprot_t prot, 30 + int level, pte_t *tbl, bool may_use_cont, u64 va_offset) 31 + { 32 + u64 cmask = (level == 3) ? CONT_PTE_SIZE - 1 : U64_MAX; 33 + u64 protval = pgprot_val(prot) & ~PTE_TYPE_MASK; 34 + int lshift = (3 - level) * (PAGE_SHIFT - 3); 35 + u64 lmask = (PAGE_SIZE << lshift) - 1; 36 + 37 + start &= PAGE_MASK; 38 + pa &= PAGE_MASK; 39 + 40 + /* Advance tbl to the entry that covers start */ 41 + tbl += (start >> (lshift + PAGE_SHIFT)) % PTRS_PER_PTE; 42 + 43 + /* 44 + * Set the right block/page bits for this level unless we are 45 + * clearing the mapping 46 + */ 47 + if (protval) 48 + protval |= (level < 3) ? PMD_TYPE_SECT : PTE_TYPE_PAGE; 49 + 50 + while (start < end) { 51 + u64 next = min((start | lmask) + 1, PAGE_ALIGN(end)); 52 + 53 + if (level < 3 && (start | next | pa) & lmask) { 54 + /* 55 + * This chunk needs a finer grained mapping. 
Create a 56 + * table mapping if necessary and recurse. 57 + */ 58 + if (pte_none(*tbl)) { 59 + *tbl = __pte(__phys_to_pte_val(*pte) | 60 + PMD_TYPE_TABLE | PMD_TABLE_UXN); 61 + *pte += PTRS_PER_PTE * sizeof(pte_t); 62 + } 63 + map_range(pte, start, next, pa, prot, level + 1, 64 + (pte_t *)(__pte_to_phys(*tbl) + va_offset), 65 + may_use_cont, va_offset); 66 + } else { 67 + /* 68 + * Start a contiguous range if start and pa are 69 + * suitably aligned 70 + */ 71 + if (((start | pa) & cmask) == 0 && may_use_cont) 72 + protval |= PTE_CONT; 73 + 74 + /* 75 + * Clear the contiguous attribute if the remaining 76 + * range does not cover a contiguous block 77 + */ 78 + if ((end & ~cmask) <= start) 79 + protval &= ~PTE_CONT; 80 + 81 + /* Put down a block or page mapping */ 82 + *tbl = __pte(__phys_to_pte_val(pa) | protval); 83 + } 84 + pa += next - start; 85 + start = next; 86 + tbl++; 87 + } 88 + } 89 + 90 + asmlinkage u64 __init create_init_idmap(pgd_t *pg_dir, pteval_t clrmask) 91 + { 92 + u64 ptep = (u64)pg_dir + PAGE_SIZE; 93 + pgprot_t text_prot = PAGE_KERNEL_ROX; 94 + pgprot_t data_prot = PAGE_KERNEL; 95 + 96 + pgprot_val(text_prot) &= ~clrmask; 97 + pgprot_val(data_prot) &= ~clrmask; 98 + 99 + map_range(&ptep, (u64)_stext, (u64)__initdata_begin, (u64)_stext, 100 + text_prot, IDMAP_ROOT_LEVEL, (pte_t *)pg_dir, false, 0); 101 + map_range(&ptep, (u64)__initdata_begin, (u64)_end, (u64)__initdata_begin, 102 + data_prot, IDMAP_ROOT_LEVEL, (pte_t *)pg_dir, false, 0); 103 + 104 + return ptep; 105 + }
+36
arch/arm64/kernel/pi/pi.h
··· 1 + // SPDX-License-Identifier: GPL-2.0-only 2 + // Copyright 2023 Google LLC 3 + // Author: Ard Biesheuvel <ardb@google.com> 4 + 5 + #include <linux/types.h> 6 + 7 + #define __prel64_initconst __section(".init.rodata.prel64") 8 + 9 + #define PREL64(type, name) union { type *name; prel64_t name ## _prel; } 10 + 11 + #define prel64_pointer(__d) (typeof(__d))prel64_to_pointer(&__d##_prel) 12 + 13 + typedef volatile signed long prel64_t; 14 + 15 + static inline void *prel64_to_pointer(const prel64_t *offset) 16 + { 17 + if (!*offset) 18 + return NULL; 19 + return (void *)offset + *offset; 20 + } 21 + 22 + extern bool dynamic_scs_is_enabled; 23 + 24 + extern pgd_t init_idmap_pg_dir[], init_idmap_pg_end[]; 25 + 26 + void init_feature_override(u64 boot_status, const void *fdt, int chosen); 27 + u64 kaslr_early_init(void *fdt, int chosen); 28 + void relocate_kernel(u64 offset); 29 + int scs_patch(const u8 eh_frame[], int size); 30 + 31 + void map_range(u64 *pgd, u64 start, u64 end, u64 pa, pgprot_t prot, 32 + int level, pte_t *tbl, bool may_use_cont, u64 va_offset); 33 + 34 + asmlinkage void early_map_kernel(u64 boot_status, void *fdt); 35 + 36 + asmlinkage u64 create_init_idmap(pgd_t *pgd, pteval_t clrmask);
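prel64_to_pointer() in pi.h decodes a place-relative reference: the stored value is the distance from the slot itself to the target, and zero means NULL. A hedged userspace sketch of the encode and decode sides is shown below. `rel64_t`, `rel64_store` and `rel64_load` are hypothetical stand-ins, and in the kernel the encoding is produced by relacheck plus the linker rather than by code like `rel64_store`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int64_t rel64_t;	/* stand-in for the kernel's prel64_t */

/* Store target as an offset from the slot's own address; 0 encodes NULL */
void rel64_store(rel64_t *slot, const void *target)
{
	/* a slot cannot refer to itself, since offset 0 is reserved for NULL */
	assert(target != (const void *)slot);

	*slot = target ? (rel64_t)((intptr_t)target - (intptr_t)slot) : 0;
}

/* Decode using the same rule as prel64_to_pointer() */
void *rel64_load(const rel64_t *slot)
{
	if (!*slot)
		return NULL;
	return (void *)((intptr_t)slot + *slot);
}
```

Such a reference stays valid wherever the image is loaded, which is why it can be used before relocations have been applied.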
+130
arch/arm64/kernel/pi/relacheck.c
··· 1 + // SPDX-License-Identifier: GPL-2.0-only 2 + /* 3 + * Copyright (C) 2023 - Google LLC 4 + * Author: Ard Biesheuvel <ardb@google.com> 5 + */ 6 + 7 + #include <elf.h> 8 + #include <fcntl.h> 9 + #include <stdbool.h> 10 + #include <stdio.h> 11 + #include <stdlib.h> 12 + #include <string.h> 13 + #include <sys/mman.h> 14 + #include <sys/stat.h> 15 + #include <sys/types.h> 16 + #include <unistd.h> 17 + 18 + #if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__ 19 + #define HOST_ORDER ELFDATA2LSB 20 + #elif __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__ 21 + #define HOST_ORDER ELFDATA2MSB 22 + #endif 23 + 24 + static Elf64_Ehdr *ehdr; 25 + static Elf64_Shdr *shdr; 26 + static const char *strtab; 27 + static bool swap; 28 + 29 + static uint64_t swab_elfxword(uint64_t val) 30 + { 31 + return swap ? __builtin_bswap64(val) : val; 32 + } 33 + 34 + static uint32_t swab_elfword(uint32_t val) 35 + { 36 + return swap ? __builtin_bswap32(val) : val; 37 + } 38 + 39 + static uint16_t swab_elfhword(uint16_t val) 40 + { 41 + return swap ? 
__builtin_bswap16(val) : val; 42 + } 43 + 44 + int main(int argc, char *argv[]) 45 + { 46 + struct stat stat; 47 + int fd, ret; 48 + 49 + if (argc < 3) { 50 + fprintf(stderr, "file arguments missing\n"); 51 + exit(EXIT_FAILURE); 52 + } 53 + 54 + fd = open(argv[1], O_RDWR); 55 + if (fd < 0) { 56 + fprintf(stderr, "failed to open %s\n", argv[1]); 57 + exit(EXIT_FAILURE); 58 + } 59 + 60 + ret = fstat(fd, &stat); 61 + if (ret < 0) { 62 + fprintf(stderr, "failed to stat() %s\n", argv[1]); 63 + exit(EXIT_FAILURE); 64 + } 65 + 66 + ehdr = mmap(0, stat.st_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0); 67 + if (ehdr == MAP_FAILED) { 68 + fprintf(stderr, "failed to mmap() %s\n", argv[1]); 69 + exit(EXIT_FAILURE); 70 + } 71 + 72 + swap = ehdr->e_ident[EI_DATA] != HOST_ORDER; 73 + shdr = (void *)ehdr + swab_elfxword(ehdr->e_shoff); 74 + strtab = (void *)ehdr + 75 + swab_elfxword(shdr[swab_elfhword(ehdr->e_shstrndx)].sh_offset); 76 + 77 + for (int i = 0; i < swab_elfhword(ehdr->e_shnum); i++) { 78 + unsigned long info, flags; 79 + bool prel64 = false; 80 + Elf64_Rela *rela; 81 + int numrela; 82 + 83 + if (swab_elfword(shdr[i].sh_type) != SHT_RELA) 84 + continue; 85 + 86 + /* only consider RELA sections operating on data */ 87 + info = swab_elfword(shdr[i].sh_info); 88 + flags = swab_elfxword(shdr[info].sh_flags); 89 + if ((flags & (SHF_ALLOC | SHF_EXECINSTR)) != SHF_ALLOC) 90 + continue; 91 + 92 + /* 93 + * We generally don't permit ABS64 relocations in the code that 94 + * runs before relocation processing occurs. If statically 95 + * initialized absolute symbol references are unavoidable, they 96 + * may be emitted into a *.rodata.prel64 section and they will 97 + * be converted to place-relative 64-bit references. This 98 + * requires special handling in the referring code. 
99 + */ 100 + if (strstr(strtab + swab_elfword(shdr[info].sh_name), 101 + ".rodata.prel64")) { 102 + prel64 = true; 103 + } 104 + 105 + rela = (void *)ehdr + swab_elfxword(shdr[i].sh_offset); 106 + numrela = swab_elfxword(shdr[i].sh_size) / sizeof(*rela); 107 + 108 + for (int j = 0; j < numrela; j++) { 109 + uint64_t info = swab_elfxword(rela[j].r_info); 110 + 111 + if (ELF64_R_TYPE(info) != R_AARCH64_ABS64) 112 + continue; 113 + 114 + if (prel64) { 115 + /* convert ABS64 into PREL64 */ 116 + info ^= R_AARCH64_ABS64 ^ R_AARCH64_PREL64; 117 + rela[j].r_info = swab_elfxword(info); 118 + } else { 119 + fprintf(stderr, 120 + "Unexpected absolute relocations detected in %s\n", 121 + argv[2]); 122 + close(fd); 123 + unlink(argv[1]); 124 + exit(EXIT_FAILURE); 125 + } 126 + } 127 + } 128 + close(fd); 129 + return 0; 130 + }
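relacheck rewrites an ABS64 relocation into PREL64 by XOR-ing `r_info` with `R_AARCH64_ABS64 ^ R_AARCH64_PREL64`, which flips only the type field and leaves the symbol index alone. The sketch below isolates that step. The relocation numbers 257 and 260 are the AArch64 ELF ABI values normally supplied by `<elf.h>`; the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#ifndef R_AARCH64_ABS64
#define R_AARCH64_ABS64		257
#endif
#ifndef R_AARCH64_PREL64
#define R_AARCH64_PREL64	260
#endif

/* r_info holds the symbol index in the high 32 bits, the type in the low 32 */
uint32_t reloc_sym(uint64_t info)
{
	return (uint32_t)(info >> 32);
}

uint32_t reloc_type(uint64_t info)
{
	return (uint32_t)(info & 0xffffffff);
}

/* Flip an ABS64 relocation to PREL64 without touching the symbol index */
uint64_t abs64_to_prel64(uint64_t info)
{
	assert(reloc_type(info) == R_AARCH64_ABS64);

	return info ^ (uint64_t)(R_AARCH64_ABS64 ^ R_AARCH64_PREL64);
}
```

The XOR is safe only because the caller has already checked that the type is ABS64, as the loop in relacheck.c does.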
+64
arch/arm64/kernel/pi/relocate.c
··· 1 + // SPDX-License-Identifier: GPL-2.0-only 2 + // Copyright 2023 Google LLC 3 + // Authors: Ard Biesheuvel <ardb@google.com> 4 + // Peter Collingbourne <pcc@google.com> 5 + 6 + #include <linux/elf.h> 7 + #include <linux/init.h> 8 + #include <linux/types.h> 9 + 10 + #include "pi.h" 11 + 12 + extern const Elf64_Rela rela_start[], rela_end[]; 13 + extern const u64 relr_start[], relr_end[]; 14 + 15 + void __init relocate_kernel(u64 offset) 16 + { 17 + u64 *place = NULL; 18 + 19 + for (const Elf64_Rela *rela = rela_start; rela < rela_end; rela++) { 20 + if (ELF64_R_TYPE(rela->r_info) != R_AARCH64_RELATIVE) 21 + continue; 22 + *(u64 *)(rela->r_offset + offset) = rela->r_addend + offset; 23 + } 24 + 25 + if (!IS_ENABLED(CONFIG_RELR) || !offset) 26 + return; 27 + 28 + /* 29 + * Apply RELR relocations. 30 + * 31 + * RELR is a compressed format for storing relative relocations. The 32 + * encoded sequence of entries looks like: 33 + * [ AAAAAAAA BBBBBBB1 BBBBBBB1 ... AAAAAAAA BBBBBB1 ... ] 34 + * 35 + * i.e. start with an address, followed by any number of bitmaps. The 36 + * address entry encodes 1 relocation. The subsequent bitmap entries 37 + * encode up to 63 relocations each, at subsequent offsets following 38 + * the last address entry. 39 + * 40 + * The bitmap entries must have 1 in the least significant bit. The 41 + * assumption here is that an address cannot have 1 in lsb. Odd 42 + * addresses are not supported. Any odd addresses are stored in the 43 + * RELA section, which is handled above. 44 + * 45 + * With the exception of the least significant bit, each bit in the 46 + * bitmap corresponds with a machine word that follows the base address 47 + * word, and the bit value indicates whether or not a relocation needs 48 + * to be applied to it. The second least significant bit represents the 49 + * machine word immediately following the initial address, and each bit 50 + * that follows represents the next word, in linear order. 
As such, a 51 + * single bitmap can encode up to 63 relocations in a 64-bit object. 52 + */ 53 + for (const u64 *relr = relr_start; relr < relr_end; relr++) { 54 + if ((*relr & 1) == 0) { 55 + place = (u64 *)(*relr + offset); 56 + *place++ += offset; 57 + } else { 58 + for (u64 *p = place, r = *relr >> 1; r; p++, r >>= 1) 59 + if (r & 1) 60 + *p += offset; 61 + place += 63; 62 + } 63 + } 64 + }
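The RELR format described in the comment above can be exercised outside the kernel. The sketch below applies a RELR table to an array of 64-bit words. As a simplifying assumption, each address entry is treated as a byte offset into `image` instead of a virtual address, and a word index is tracked instead of a raw pointer. `apply_relr` and its parameters are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Add delta to every word in image that the RELR table refers to */
void apply_relr(uint64_t *image, size_t nwords,
		const uint64_t *relr, size_t nrelr, uint64_t delta)
{
	size_t place = 0;	/* index of the word covered by bitmap bit 1 */

	for (size_t i = 0; i < nrelr; i++) {
		uint64_t entry = relr[i];

		if ((entry & 1) == 0) {
			/* address entry: encodes exactly one relocation */
			assert(entry % sizeof(uint64_t) == 0);
			place = (size_t)(entry / sizeof(uint64_t));
			assert(place < nwords);
			image[place++] += delta;
		} else {
			/* bitmap entry: bits 1..63 cover the next 63 words */
			size_t w = place;

			for (uint64_t bits = entry >> 1; bits; bits >>= 1, w++) {
				if (bits & 1) {
					assert(w < nwords);
					image[w] += delta;
				}
			}
			place += 63;
		}
	}
}
```

For example, the table `{ 8, 0xB }` relocates word 1 through the address entry, then words 2 and 4 through the bitmap.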
+16 -5
arch/arm64/kernel/probes/kprobes.c
··· 371 371 .fn = kprobe_breakpoint_ss_handler, 372 372 }; 373 373 374 + static int __kprobes 375 + kretprobe_breakpoint_handler(struct pt_regs *regs, unsigned long esr) 376 + { 377 + if (regs->pc != (unsigned long)__kretprobe_trampoline) 378 + return DBG_HOOK_ERROR; 379 + 380 + regs->pc = kretprobe_trampoline_handler(regs, (void *)regs->regs[29]); 381 + return DBG_HOOK_HANDLED; 382 + } 383 + 384 + static struct break_hook kretprobes_break_hook = { 385 + .imm = KRETPROBES_BRK_IMM, 386 + .fn = kretprobe_breakpoint_handler, 387 + }; 388 + 374 389 /* 375 390 * Provide a blacklist of symbols identifying ranges which cannot be kprobed. 376 391 * This blacklist is exposed to userspace via debugfs (kprobes/blacklist). ··· 411 396 return ret; 412 397 } 413 398 414 - void __kprobes __used *trampoline_probe_handler(struct pt_regs *regs) 415 - { 416 - return (void *)kretprobe_trampoline_handler(regs, (void *)regs->regs[29]); 417 - } 418 - 419 399 void __kprobes arch_prepare_kretprobe(struct kretprobe_instance *ri, 420 400 struct pt_regs *regs) 421 401 { ··· 430 420 { 431 421 register_kernel_break_hook(&kprobes_break_hook); 432 422 register_kernel_break_hook(&kprobes_break_ss_hook); 423 + register_kernel_break_hook(&kretprobes_break_hook); 433 424 434 425 return 0; 435 426 }
+6 -72
arch/arm64/kernel/probes/kprobes_trampoline.S
··· 4 4 */ 5 5 6 6 #include <linux/linkage.h> 7 - #include <asm/asm-offsets.h> 7 + #include <asm/asm-bug.h> 8 8 #include <asm/assembler.h> 9 9 10 10 .text 11 11 12 - .macro save_all_base_regs 13 - stp x0, x1, [sp, #S_X0] 14 - stp x2, x3, [sp, #S_X2] 15 - stp x4, x5, [sp, #S_X4] 16 - stp x6, x7, [sp, #S_X6] 17 - stp x8, x9, [sp, #S_X8] 18 - stp x10, x11, [sp, #S_X10] 19 - stp x12, x13, [sp, #S_X12] 20 - stp x14, x15, [sp, #S_X14] 21 - stp x16, x17, [sp, #S_X16] 22 - stp x18, x19, [sp, #S_X18] 23 - stp x20, x21, [sp, #S_X20] 24 - stp x22, x23, [sp, #S_X22] 25 - stp x24, x25, [sp, #S_X24] 26 - stp x26, x27, [sp, #S_X26] 27 - stp x28, x29, [sp, #S_X28] 28 - add x0, sp, #PT_REGS_SIZE 29 - stp lr, x0, [sp, #S_LR] 30 - /* 31 - * Construct a useful saved PSTATE 32 - */ 33 - mrs x0, nzcv 34 - mrs x1, daif 35 - orr x0, x0, x1 36 - mrs x1, CurrentEL 37 - orr x0, x0, x1 38 - mrs x1, SPSel 39 - orr x0, x0, x1 40 - stp xzr, x0, [sp, #S_PC] 41 - .endm 42 - 43 - .macro restore_all_base_regs 44 - ldr x0, [sp, #S_PSTATE] 45 - and x0, x0, #(PSR_N_BIT | PSR_Z_BIT | PSR_C_BIT | PSR_V_BIT) 46 - msr nzcv, x0 47 - ldp x0, x1, [sp, #S_X0] 48 - ldp x2, x3, [sp, #S_X2] 49 - ldp x4, x5, [sp, #S_X4] 50 - ldp x6, x7, [sp, #S_X6] 51 - ldp x8, x9, [sp, #S_X8] 52 - ldp x10, x11, [sp, #S_X10] 53 - ldp x12, x13, [sp, #S_X12] 54 - ldp x14, x15, [sp, #S_X14] 55 - ldp x16, x17, [sp, #S_X16] 56 - ldp x18, x19, [sp, #S_X18] 57 - ldp x20, x21, [sp, #S_X20] 58 - ldp x22, x23, [sp, #S_X22] 59 - ldp x24, x25, [sp, #S_X24] 60 - ldp x26, x27, [sp, #S_X26] 61 - ldp x28, x29, [sp, #S_X28] 62 - .endm 63 - 64 12 SYM_CODE_START(__kretprobe_trampoline) 65 - sub sp, sp, #PT_REGS_SIZE 66 - 67 - save_all_base_regs 68 - 69 - /* Setup a frame pointer. */ 70 - add x29, sp, #S_FP 71 - 72 - mov x0, sp 73 - bl trampoline_probe_handler 74 13 /* 75 - * Replace trampoline address in lr with actual orig_ret_addr return 76 - * address. 14 + * Trigger a breakpoint exception. 
The PC will be adjusted by 15 + * kretprobe_breakpoint_handler(), and no subsequent instructions will 16 + * be executed from the trampoline. 77 17 */ 78 - mov lr, x0 79 - 80 - /* The frame pointer (x29) is restored with other registers. */ 81 - restore_all_base_regs 82 - 83 - add sp, sp, #PT_REGS_SIZE 84 - ret 85 - 18 + brk #KRETPROBES_BRK_IMM 19 + ASM_BUG() 86 20 SYM_CODE_END(__kretprobe_trampoline)
-3
arch/arm64/kernel/process.c
··· 290 290 fpsimd_preserve_current_state(); 291 291 *dst = *src; 292 292 293 - /* We rely on the above assignment to initialize dst's thread_flags: */ 294 - BUILD_BUG_ON(!IS_ENABLED(CONFIG_THREAD_INFO_IN_TASK)); 295 - 296 293 /* 297 294 * Detach src's sve_state (if any) from dst so that it does not 298 295 * get erroneously used or freed prematurely. dst's copies
+45 -5
arch/arm64/kernel/ptrace.c
··· 174 174 struct arch_hw_breakpoint *bkpt = counter_arch_bp(bp); 175 175 const char *desc = "Hardware breakpoint trap (ptrace)"; 176 176 177 - #ifdef CONFIG_COMPAT 178 177 if (is_compat_task()) { 179 178 int si_errno = 0; 180 179 int i; ··· 195 196 desc); 196 197 return; 197 198 } 198 - #endif 199 + 199 200 arm64_force_sig_fault(SIGTRAP, TRAP_HWBKPT, bkpt->trigger, desc); 200 201 } 201 202 ··· 695 696 target->thread.tpidr2_el0 = tls[1]; 696 697 697 698 return ret; 699 + } 700 + 701 + static int fpmr_get(struct task_struct *target, const struct user_regset *regset, 702 + struct membuf to) 703 + { 704 + if (!system_supports_fpmr()) 705 + return -EINVAL; 706 + 707 + if (target == current) 708 + fpsimd_preserve_current_state(); 709 + 710 + return membuf_store(&to, target->thread.uw.fpmr); 711 + } 712 + 713 + static int fpmr_set(struct task_struct *target, const struct user_regset *regset, 714 + unsigned int pos, unsigned int count, 715 + const void *kbuf, const void __user *ubuf) 716 + { 717 + int ret; 718 + unsigned long fpmr; 719 + 720 + if (!system_supports_fpmr()) 721 + return -EINVAL; 722 + 723 + ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, &fpmr, 0, count); 724 + if (ret) 725 + return ret; 726 + 727 + target->thread.uw.fpmr = fpmr; 728 + 729 + fpsimd_flush_task_state(target); 730 + 731 + return 0; 698 732 } 699 733 700 734 static int system_call_get(struct task_struct *target, ··· 1451 1419 REGSET_HW_BREAK, 1452 1420 REGSET_HW_WATCH, 1453 1421 #endif 1422 + REGSET_FPMR, 1454 1423 REGSET_SYSTEM_CALL, 1455 1424 #ifdef CONFIG_ARM64_SVE 1456 1425 REGSET_SVE, ··· 1529 1496 .align = sizeof(int), 1530 1497 .regset_get = system_call_get, 1531 1498 .set = system_call_set, 1499 + }, 1500 + [REGSET_FPMR] = { 1501 + .core_note_type = NT_ARM_FPMR, 1502 + .n = 1, 1503 + .size = sizeof(u64), 1504 + .align = sizeof(u64), 1505 + .regset_get = fpmr_get, 1506 + .set = fpmr_set, 1532 1507 }, 1533 1508 #ifdef CONFIG_ARM64_SVE 1534 1509 [REGSET_SVE] = { /* Scalable Vector 
Extension */ ··· 1638 1597 .regsets = aarch64_regsets, .n = ARRAY_SIZE(aarch64_regsets) 1639 1598 }; 1640 1599 1641 - #ifdef CONFIG_COMPAT 1642 1600 enum compat_regset { 1643 1601 REGSET_COMPAT_GPR, 1644 1602 REGSET_COMPAT_VFP, ··· 1894 1854 .regsets = aarch32_ptrace_regsets, .n = ARRAY_SIZE(aarch32_ptrace_regsets) 1895 1855 }; 1896 1856 1857 + #ifdef CONFIG_COMPAT 1897 1858 static int compat_ptrace_read_user(struct task_struct *tsk, compat_ulong_t off, 1898 1859 compat_ulong_t __user *ret) 1899 1860 { ··· 2156 2115 2157 2116 const struct user_regset_view *task_user_regset_view(struct task_struct *task) 2158 2117 { 2159 - #ifdef CONFIG_COMPAT 2160 2118 /* 2161 2119 * Core dumping of 32-bit tasks or compat ptrace requests must use the 2162 2120 * user_aarch32_view compatible with arm32. Native ptrace requests on ··· 2166 2126 return &user_aarch32_view; 2167 2127 else if (is_compat_thread(task_thread_info(task))) 2168 2128 return &user_aarch32_ptrace_view; 2169 - #endif 2129 + 2170 2130 return &user_aarch64_view; 2171 2131 } 2172 2132
+2 -25
arch/arm64/kernel/setup.c
··· 166 166 pr_warn("Large number of MPIDR hash buckets detected\n"); 167 167 } 168 168 169 - static void *early_fdt_ptr __initdata; 170 - 171 - void __init *get_early_fdt_ptr(void) 172 - { 173 - return early_fdt_ptr; 174 - } 175 - 176 - asmlinkage void __init early_fdt_map(u64 dt_phys) 177 - { 178 - int fdt_size; 179 - 180 - early_fixmap_init(); 181 - early_fdt_ptr = fixmap_remap_fdt(dt_phys, &fdt_size, PAGE_KERNEL); 182 - } 183 - 184 169 static void __init setup_machine_fdt(phys_addr_t dt_phys) 185 170 { 186 171 int size; ··· 283 298 284 299 kaslr_init(); 285 300 286 - /* 287 - * If know now we are going to need KPTI then use non-global 288 - * mappings from the start, avoiding the cost of rewriting 289 - * everything later. 290 - */ 291 - arm64_use_ng_mappings = kaslr_requires_kpti(); 292 - 293 301 early_fixmap_init(); 294 302 early_ioremap_init(); 295 303 ··· 298 320 dynamic_scs_init(); 299 321 300 322 /* 301 - * Unmask asynchronous aborts and fiq after bringing up possible 302 - * earlycon. (Report possible System Errors once we can report this 303 - * occurred). 323 + * Unmask SError as soon as possible after initializing earlycon so 324 + * that we can report any SErrors immediately. 304 325 */ 305 326 local_daif_restore(DAIF_PROCCTX_NOIRQ); 306 327
+61 -37
arch/arm64/kernel/signal.c
@@ -16,8 +16,8 @@
 #include <linux/uaccess.h>
 #include <linux/sizes.h>
 #include <linux/string.h>
-#include <linux/resume_user_mode.h>
 #include <linux/ratelimit.h>
+#include <linux/rseq.h>
 #include <linux/syscalls.h>
 
 #include <asm/daifflags.h>
@@ -60,6 +60,7 @@
 	unsigned long tpidr2_offset;
 	unsigned long za_offset;
 	unsigned long zt_offset;
+	unsigned long fpmr_offset;
 	unsigned long extra_offset;
 	unsigned long end_offset;
 };
@@ -182,6 +183,8 @@
 	u32 za_size;
 	struct zt_context __user *zt;
 	u32 zt_size;
+	struct fpmr_context __user *fpmr;
+	u32 fpmr_size;
 };
 
 static int preserve_fpsimd_context(struct fpsimd_context __user *ctx)
@@ -227,6 +230,33 @@
 	return err ? -EFAULT : 0;
 }
 
+static int preserve_fpmr_context(struct fpmr_context __user *ctx)
+{
+	int err = 0;
+
+	current->thread.uw.fpmr = read_sysreg_s(SYS_FPMR);
+
+	__put_user_error(FPMR_MAGIC, &ctx->head.magic, err);
+	__put_user_error(sizeof(*ctx), &ctx->head.size, err);
+	__put_user_error(current->thread.uw.fpmr, &ctx->fpmr, err);
+
+	return err;
+}
+
+static int restore_fpmr_context(struct user_ctxs *user)
+{
+	u64 fpmr;
+	int err = 0;
+
+	if (user->fpmr_size != sizeof(*user->fpmr))
+		return -EINVAL;
+
+	__get_user_error(fpmr, &user->fpmr->fpmr, err);
+	if (!err)
+		write_sysreg_s(fpmr, SYS_FPMR);
+
+	return err;
+}
 
 #ifdef CONFIG_ARM64_SVE
 
@@ -590,6 +620,7 @@
 	user->tpidr2 = NULL;
 	user->za = NULL;
 	user->zt = NULL;
+	user->fpmr = NULL;
 
 	if (!IS_ALIGNED((unsigned long)base, 16))
 		goto invalid;
@@ -682,6 +713,17 @@
 
 			user->zt = (struct zt_context __user *)head;
 			user->zt_size = size;
+			break;
+
+		case FPMR_MAGIC:
+			if (!system_supports_fpmr())
+				goto invalid;
+
+			if (user->fpmr)
+				goto invalid;
+
+			user->fpmr = (struct fpmr_context __user *)head;
+			user->fpmr_size = size;
 			break;
 
 		case EXTRA_MAGIC:
@@ -806,6 +848,9 @@
 	if (err == 0 && system_supports_tpidr2() && user.tpidr2)
 		err = restore_tpidr2_context(&user);
 
+	if (err == 0 && system_supports_fpmr() && user.fpmr)
+		err = restore_fpmr_context(&user);
+
 	if (err == 0 && system_supports_sme() && user.za)
 		err = restore_za_context(&user);
 
@@ -928,6 +973,13 @@
 		}
 	}
 
+	if (system_supports_fpmr()) {
+		err = sigframe_alloc(user, &user->fpmr_offset,
+				     sizeof(struct fpmr_context));
+		if (err)
+			return err;
+	}
+
 	return sigframe_alloc_end(user);
 }
 
@@ -981,6 +1033,13 @@
 		struct tpidr2_context __user *tpidr2_ctx =
 			apply_user_offset(user, user->tpidr2_offset);
 		err |= preserve_tpidr2_context(tpidr2_ctx);
+	}
+
+	/* FPMR if supported */
+	if (system_supports_fpmr() && err == 0) {
+		struct fpmr_context __user *fpmr_ctx =
+			apply_user_offset(user, user->fpmr_offset);
+		err |= preserve_fpmr_context(fpmr_ctx);
 	}
 
 	/* ZA state if present */
@@ -1207,7 +1266,7 @@
  * the kernel can handle, and then we build all the user-level signal handling
  * stack-frames in one go after that.
  */
-static void do_signal(struct pt_regs *regs)
+void do_signal(struct pt_regs *regs)
 {
 	unsigned long continue_addr = 0, restart_addr = 0;
 	int retval = 0;
@@ -1276,41 +1335,6 @@
 	}
 
 	restore_saved_sigmask();
-}
-
-void do_notify_resume(struct pt_regs *regs, unsigned long thread_flags)
-{
-	do {
-		if (thread_flags & _TIF_NEED_RESCHED) {
-			/* Unmask Debug and SError for the next task */
-			local_daif_restore(DAIF_PROCCTX_NOIRQ);
-
-			schedule();
-		} else {
-			local_daif_restore(DAIF_PROCCTX);
-
-			if (thread_flags & _TIF_UPROBE)
-				uprobe_notify_resume(regs);
-
-			if (thread_flags & _TIF_MTE_ASYNC_FAULT) {
-				clear_thread_flag(TIF_MTE_ASYNC_FAULT);
-				send_sig_fault(SIGSEGV, SEGV_MTEAERR,
-					       (void __user *)NULL, current);
-			}
-
-			if (thread_flags & (_TIF_SIGPENDING | _TIF_NOTIFY_SIGNAL))
-				do_signal(regs);
-
-			if (thread_flags & _TIF_NOTIFY_RESUME)
-				resume_user_mode_work(regs);
-
-			if (thread_flags & _TIF_FOREIGN_FPSTATE)
-				fpsimd_restore_current_state();
-		}
-
-		local_daif_mask();
-		thread_flags = read_thread_flags();
-	} while (thread_flags & _TIF_WORK_MASK);
 }
 
 unsigned long __ro_after_init signal_minsigstksz;
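The signal.c hunks above add an FPMR record to the signal frame and reject frames where that record is duplicated or has the wrong size. A minimal userspace sketch of the same checks follows. The struct names, `SKETCH_FPMR_MAGIC` and `find_fpmr()` are hypothetical stand-ins rather than the kernel's uapi definitions, and the magic value is assumed to be the ASCII bytes "FPMR".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local stand-ins for the record layout; these are not the uapi headers. */
struct ctx_head {
	uint32_t magic;
	uint32_t size;
};

struct fpmr_record {
	struct ctx_head head;
	uint64_t fpmr;
};

/* Assumption: "FPMR" in ASCII, taken to match the kernel's FPMR_MAGIC. */
#define SKETCH_FPMR_MAGIC 0x46504d52u

/*
 * Scan a buffer of back-to-back records for exactly one FPMR record.
 * Returns 0 and stores the value on success, or -1 if the record is
 * absent, duplicated, mis-sized, or the buffer is malformed.
 */
static int find_fpmr(const uint8_t *buf, size_t len, uint64_t *out)
{
	size_t off = 0;
	int found = 0;

	assert(buf && out);
	while (off + sizeof(struct ctx_head) <= len) {
		struct ctx_head head;

		memcpy(&head, buf + off, sizeof(head));
		if (head.magic == 0 && head.size == 0)
			break;			/* terminator record */
		if (head.size < sizeof(head) || head.size > len - off)
			return -1;		/* malformed record */
		if (head.magic == SKETCH_FPMR_MAGIC) {
			struct fpmr_record rec;

			if (found || head.size != sizeof(rec))
				return -1;	/* duplicate or wrong size */
			memcpy(&rec, buf + off, sizeof(rec));
			*out = rec.fpmr;
			found = 1;
		}
		off += head.size;
	}
	return found ? 0 : -1;
}
```

The duplicate and size checks correspond to the `if (user->fpmr) goto invalid;` and `fpmr_size != sizeof(*user->fpmr)` tests in the diff. Everything else here is illustrative only.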
-3
arch/arm64/kernel/sleep.S
@@ -102,9 +102,6 @@
 	mov	x0, xzr
 	bl	init_kernel_el
 	mov	x19, x0			// preserve boot mode
-#if VA_BITS > 48
-	ldr_l	x0, vabits_actual
-#endif
 	bl	__cpu_setup
 	/* enable the MMU early - so we can access sleep_save_stash by va */
 	adrp	x1, swapper_pg_dir
+1 -4
arch/arm64/kernel/syscall.c
@@ -20,14 +20,11 @@
 
 static long do_ni_syscall(struct pt_regs *regs, int scno)
 {
-#ifdef CONFIG_COMPAT
-	long ret;
 	if (is_compat_task()) {
-		ret = compat_arm_syscall(regs, scno);
+		long ret = compat_arm_syscall(regs, scno);
 		if (ret != -ENOSYS)
 			return ret;
 	}
-#endif
 
 	return sys_ni_syscall();
 }
+11 -6
arch/arm64/kernel/vmlinux.lds.S
@@ -126,9 +126,9 @@
 #ifdef CONFIG_UNWIND_TABLES
 #define UNWIND_DATA_SECTIONS \
 	.eh_frame : { \
-		__eh_frame_start = .; \
+		__pi___eh_frame_start = .; \
 		*(.eh_frame) \
-		__eh_frame_end = .; \
+		__pi___eh_frame_end = .; \
 	}
 #else
 #define UNWIND_DATA_SECTIONS
@@ -270,15 +270,15 @@
 	HYPERVISOR_RELOC_SECTION
 
 	.rela.dyn : ALIGN(8) {
-		__rela_start = .;
+		__pi_rela_start = .;
 		*(.rela .rela*)
-		__rela_end = .;
+		__pi_rela_end = .;
 	}
 
 	.relr.dyn : ALIGN(8) {
-		__relr_start = .;
+		__pi_relr_start = .;
 		*(.relr.dyn)
-		__relr_end = .;
+		__pi_relr_end = .;
 	}
 
 	. = ALIGN(SEGMENT_ALIGN);
@@ -311,12 +311,17 @@
 	__pecoff_data_rawsize = ABSOLUTE(. - __initdata_begin);
 	_edata = .;
 
+	/* start of zero-init region */
 	BSS_SECTION(SBSS_ALIGN, 0, 0)
 
 	. = ALIGN(PAGE_SIZE);
 	init_pg_dir = .;
 	. += INIT_DIR_SIZE;
 	init_pg_end = .;
+	/* end of zero-init region */
+
+	. += SZ_4K;		/* stack for the early C runtime */
+	early_init_stack = .;
 
 	. = ALIGN(SEGMENT_ALIGN);
 	__pecoff_data_size = ABSOLUTE(. - __initdata_begin);
+1
arch/arm64/kvm/fpsimd.c
@@ -153,6 +153,7 @@
 		fp_state.sve_vl = vcpu->arch.sve_max_vl;
 		fp_state.sme_state = NULL;
 		fp_state.svcr = &vcpu->arch.svcr;
+		fp_state.fpmr = &vcpu->arch.fpmr;
 		fp_state.fp_type = &vcpu->arch.fp_type;
 
 		if (vcpu_has_sve(vcpu))
+5 -12
arch/arm64/kvm/mmu.c
@@ -805,7 +805,7 @@
 		.pgd		= (kvm_pteref_t)kvm->mm->pgd,
 		.ia_bits	= vabits_actual,
 		.start_level	= (KVM_PGTABLE_LAST_LEVEL -
-				   CONFIG_PGTABLE_LEVELS + 1),
+				   ARM64_HW_PGTABLE_LEVELS(pgt.ia_bits) + 1),
 		.mm_ops		= &kvm_user_mm_ops,
 	};
 	unsigned long flags;
@@ -1874,16 +1874,9 @@
 	BUG_ON((hyp_idmap_start ^ (hyp_idmap_end - 1)) & PAGE_MASK);
 
 	/*
-	 * The ID map may be configured to use an extended virtual address
-	 * range. This is only the case if system RAM is out of range for the
-	 * currently configured page size and VA_BITS_MIN, in which case we will
-	 * also need the extended virtual range for the HYP ID map, or we won't
-	 * be able to enable the EL2 MMU.
-	 *
-	 * However, in some cases the ID map may be configured for fewer than
-	 * the number of VA bits used by the regular kernel stage 1. This
-	 * happens when VA_BITS=52 and the kernel image is placed in PA space
-	 * below 48 bits.
+	 * The ID map is always configured for 48 bits of translation, which
+	 * may be fewer than the number of VA bits used by the regular kernel
+	 * stage 1, when VA_BITS=52.
 	 *
 	 * At EL2, there is only one TTBR register, and we can't switch between
 	 * translation tables *and* update TCR_EL2.T0SZ at the same time. Bottom
@@ -1894,7 +1887,7 @@
 	 * 1 VA bits to assure that the hypervisor can both ID map its code page
 	 * and map any kernel memory.
 	 */
-	idmap_bits = 64 - ((idmap_t0sz & TCR_T0SZ_MASK) >> TCR_T0SZ_OFFSET);
+	idmap_bits = IDMAP_VA_BITS;
 	kernel_bits = vabits_actual;
 	*hyp_va_bits = max(idmap_bits, kernel_bits);
 
+11 -19
arch/arm64/mm/fault.c
@@ -257,16 +257,14 @@
 static inline bool is_el1_permission_fault(unsigned long addr, unsigned long esr,
 					   struct pt_regs *regs)
 {
-	unsigned long fsc_type = esr & ESR_ELx_FSC_TYPE;
-
 	if (!is_el1_data_abort(esr) && !is_el1_instruction_abort(esr))
 		return false;
 
-	if (fsc_type == ESR_ELx_FSC_PERM)
+	if (esr_fsc_is_permission_fault(esr))
 		return true;
 
 	if (is_ttbr0_addr(addr) && system_uses_ttbr0_pan())
-		return fsc_type == ESR_ELx_FSC_FAULT &&
+		return esr_fsc_is_translation_fault(esr) &&
 			(regs->pstate & PSR_PAN_BIT);
 
 	return false;
@@ -279,8 +277,7 @@
 	unsigned long flags;
 	u64 par, dfsc;
 
-	if (!is_el1_data_abort(esr) ||
-	    (esr & ESR_ELx_FSC_TYPE) != ESR_ELx_FSC_FAULT)
+	if (!is_el1_data_abort(esr) || !esr_fsc_is_translation_fault(esr))
 		return false;
 
 	local_irq_save(flags);
@@ -301,7 +298,7 @@
 	 * treat the translation fault as spurious.
 	 */
 	dfsc = FIELD_GET(SYS_PAR_EL1_FST, par);
-	return (dfsc & ESR_ELx_FSC_TYPE) != ESR_ELx_FSC_FAULT;
+	return !esr_fsc_is_translation_fault(dfsc);
 }
 
 static void die_kernel_fault(const char *msg, unsigned long addr,
@@ -368,11 +365,6 @@
 	return false;
 }
 
-static bool is_translation_fault(unsigned long esr)
-{
-	return (esr & ESR_ELx_FSC_TYPE) == ESR_ELx_FSC_FAULT;
-}
-
 static void __do_kernel_fault(unsigned long addr, unsigned long esr,
 			      struct pt_regs *regs)
 {
@@ -405,7 +397,7 @@
 	} else if (addr < PAGE_SIZE) {
 		msg = "NULL pointer dereference";
 	} else {
-		if (is_translation_fault(esr) &&
+		if (esr_fsc_is_translation_fault(esr) &&
 		    kfence_handle_page_fault(addr, esr & ESR_ELx_WNR, regs))
 			return;
 
@@ -782,18 +774,18 @@
 	{ do_translation_fault, SIGSEGV, SEGV_MAPERR, "level 1 translation fault" },
 	{ do_translation_fault, SIGSEGV, SEGV_MAPERR, "level 2 translation fault" },
 	{ do_translation_fault, SIGSEGV, SEGV_MAPERR, "level 3 translation fault" },
-	{ do_bad, SIGKILL, SI_KERNEL, "unknown 8" },
+	{ do_page_fault, SIGSEGV, SEGV_ACCERR, "level 0 access flag fault" },
 	{ do_page_fault, SIGSEGV, SEGV_ACCERR, "level 1 access flag fault" },
 	{ do_page_fault, SIGSEGV, SEGV_ACCERR, "level 2 access flag fault" },
 	{ do_page_fault, SIGSEGV, SEGV_ACCERR, "level 3 access flag fault" },
-	{ do_bad, SIGKILL, SI_KERNEL, "unknown 12" },
+	{ do_page_fault, SIGSEGV, SEGV_ACCERR, "level 0 permission fault" },
 	{ do_page_fault, SIGSEGV, SEGV_ACCERR, "level 1 permission fault" },
 	{ do_page_fault, SIGSEGV, SEGV_ACCERR, "level 2 permission fault" },
 	{ do_page_fault, SIGSEGV, SEGV_ACCERR, "level 3 permission fault" },
 	{ do_sea, SIGBUS, BUS_OBJERR, "synchronous external abort" },
 	{ do_tag_check_fault, SIGSEGV, SEGV_MTESERR, "synchronous tag check fault" },
 	{ do_bad, SIGKILL, SI_KERNEL, "unknown 18" },
-	{ do_bad, SIGKILL, SI_KERNEL, "unknown 19" },
+	{ do_sea, SIGKILL, SI_KERNEL, "level -1 (translation table walk)" },
 	{ do_sea, SIGKILL, SI_KERNEL, "level 0 (translation table walk)" },
 	{ do_sea, SIGKILL, SI_KERNEL, "level 1 (translation table walk)" },
 	{ do_sea, SIGKILL, SI_KERNEL, "level 2 (translation table walk)" },
@@ -801,7 +793,7 @@
 	{ do_sea, SIGBUS, BUS_OBJERR, "synchronous parity or ECC error" }, // Reserved when RAS is implemented
 	{ do_bad, SIGKILL, SI_KERNEL, "unknown 25" },
 	{ do_bad, SIGKILL, SI_KERNEL, "unknown 26" },
-	{ do_bad, SIGKILL, SI_KERNEL, "unknown 27" },
+	{ do_sea, SIGKILL, SI_KERNEL, "level -1 synchronous parity error (translation table walk)" }, // Reserved when RAS is implemented
 	{ do_sea, SIGKILL, SI_KERNEL, "level 0 synchronous parity error (translation table walk)" }, // Reserved when RAS is implemented
 	{ do_sea, SIGKILL, SI_KERNEL, "level 1 synchronous parity error (translation table walk)" }, // Reserved when RAS is implemented
 	{ do_sea, SIGKILL, SI_KERNEL, "level 2 synchronous parity error (translation table walk)" }, // Reserved when RAS is implemented
@@ -815,9 +807,9 @@
 	{ do_bad, SIGKILL, SI_KERNEL, "unknown 38" },
 	{ do_bad, SIGKILL, SI_KERNEL, "unknown 39" },
 	{ do_bad, SIGKILL, SI_KERNEL, "unknown 40" },
-	{ do_bad, SIGKILL, SI_KERNEL, "unknown 41" },
+	{ do_bad, SIGKILL, SI_KERNEL, "level -1 address size fault" },
 	{ do_bad, SIGKILL, SI_KERNEL, "unknown 42" },
-	{ do_bad, SIGKILL, SI_KERNEL, "unknown 43" },
+	{ do_translation_fault, SIGSEGV, SEGV_MAPERR, "level -1 translation fault" },
 	{ do_bad, SIGKILL, SI_KERNEL, "unknown 44" },
 	{ do_bad, SIGKILL, SI_KERNEL, "unknown 45" },
 	{ do_bad, SIGKILL, SI_KERNEL, "unknown 46" },
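The fault_info table above is indexed by the 6-bit fault status code, and the new level -1 entries sit at indices 41 and 43, outside the regular groups of four. The sketch below decodes those codes in plain C. The `sketch_*` helpers and `FSC_*` names are hypothetical, with values read off the table indices. This diff does not show whether the in-tree `esr_fsc_is_translation_fault()` special-cases level -1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values, chosen to line up with the fault_info[] indices above. */
#define FSC_MASK	0x3fu	/* full fault status code, bits [5:0] */
#define FSC_TYPE_MASK	0x3cu	/* code with the level bits [1:0] dropped */
#define FSC_TYPE_XLAT	0x04u	/* translation fault, levels 0-3 (4..7) */
#define FSC_TYPE_PERM	0x0cu	/* permission fault, levels 0-3 (12..15) */
#define FSC_XLAT_LM1	0x2bu	/* 43: level -1 translation fault */

static bool sketch_is_translation_fault(uint64_t esr)
{
	/* 43 & 0x3c is 40, so the masked compare alone would miss level -1 */
	if ((esr & FSC_MASK) == FSC_XLAT_LM1)
		return true;
	return (esr & FSC_TYPE_MASK) == FSC_TYPE_XLAT;
}

static bool sketch_is_permission_fault(uint64_t esr)
{
	return (esr & FSC_TYPE_MASK) == FSC_TYPE_PERM;
}

/* Level for translation/access/permission codes only; -1 for code 43. */
static int sketch_fault_level(uint64_t esr)
{
	unsigned int fsc = esr & FSC_MASK;

	assert(fsc == FSC_XLAT_LM1 || (fsc >= 4 && fsc <= 15));
	return fsc == FSC_XLAT_LM1 ? -1 : (int)(fsc & 3);
}
```

For example, code 0x05 decodes as a level 1 translation fault and 0x0d as a level 1 permission fault, matching the table rows at indices 5 and 13.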
+4 -35
arch/arm64/mm/fixmap.c
@@ -16,6 +16,9 @@
 #include <asm/pgalloc.h>
 #include <asm/tlbflush.h>
 
+/* ensure that the fixmap region does not grow down into the PCI I/O region */
+static_assert(FIXADDR_TOT_START > PCI_IO_END);
+
 #define NR_BM_PTE_TABLES \
 	SPAN_NR_ENTRIES(FIXADDR_TOT_START, FIXADDR_TOP, PMD_SHIFT)
 #define NR_BM_PMD_TABLES \
@@ -101,7 +104,7 @@
 	unsigned long end = FIXADDR_TOP;
 
 	pgd_t *pgdp = pgd_offset_k(addr);
-	p4d_t *p4dp = p4d_offset(pgdp, addr);
+	p4d_t *p4dp = p4d_offset_kimg(pgdp, addr);
 
 	early_fixmap_init_pud(p4dp, addr, end);
 }
@@ -166,38 +169,4 @@
 	}
 
 	return dt_virt;
-}
-
-/*
- * Copy the fixmap region into a new pgdir.
- */
-void __init fixmap_copy(pgd_t *pgdir)
-{
-	if (!READ_ONCE(pgd_val(*pgd_offset_pgd(pgdir, FIXADDR_TOT_START)))) {
-		/*
-		 * The fixmap falls in a separate pgd to the kernel, and doesn't
-		 * live in the carveout for the swapper_pg_dir. We can simply
-		 * re-use the existing dir for the fixmap.
-		 */
-		set_pgd(pgd_offset_pgd(pgdir, FIXADDR_TOT_START),
-			READ_ONCE(*pgd_offset_k(FIXADDR_TOT_START)));
-	} else if (CONFIG_PGTABLE_LEVELS > 3) {
-		pgd_t *bm_pgdp;
-		p4d_t *bm_p4dp;
-		pud_t *bm_pudp;
-		/*
-		 * The fixmap shares its top level pgd entry with the kernel
-		 * mapping. This can really only occur when we are running
-		 * with 16k/4 levels, so we can simply reuse the pud level
-		 * entry instead.
-		 */
-		BUG_ON(!IS_ENABLED(CONFIG_ARM64_16K_PAGES));
-		bm_pgdp = pgd_offset_pgd(pgdir, FIXADDR_TOT_START);
-		bm_p4dp = p4d_offset(bm_pgdp, FIXADDR_TOT_START);
-		bm_pudp = pud_set_fixmap_offset(bm_p4dp, FIXADDR_TOT_START);
-		pud_populate(&init_mm, bm_pudp, lm_alias(bm_pmd));
-		pud_clear_fixmap();
-	} else {
-		BUG();
-	}
 }
+1 -1
arch/arm64/mm/init.c
@@ -238,7 +238,7 @@
 	 * physical address of PAGE_OFFSET, we have to *subtract* from it.
 	 */
 	if (IS_ENABLED(CONFIG_ARM64_VA_BITS_52) && (vabits_actual != 52))
-		memstart_addr -= _PAGE_OFFSET(48) - _PAGE_OFFSET(52);
+		memstart_addr -= _PAGE_OFFSET(vabits_actual) - _PAGE_OFFSET(52);
 
 	/*
 	 * Apply the memory limit if it was set. Since the kernel may be loaded
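The init.c change replaces a hard-coded 48 with `vabits_actual` in the memstart_addr adjustment. A small worked example of that arithmetic follows, assuming `_PAGE_OFFSET(va)` has the shape `-(1 << va)`. The `sketch_*` helpers are hypothetical, and 47 bits is used only as an assumed fallback width for a 52-bit build on 16K pages.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed shape of _PAGE_OFFSET(va): lowest address of a va-bit kernel range. */
static uint64_t sketch_page_offset(unsigned int va_bits)
{
	assert(va_bits > 0 && va_bits < 64);
	return 0 - (UINT64_C(1) << va_bits);
}

/*
 * Amount subtracted from memstart_addr when a kernel built for 52-bit VAs
 * ends up running with vabits_actual bits instead.
 */
static uint64_t sketch_memstart_adjust(unsigned int vabits_actual)
{
	return sketch_page_offset(vabits_actual) - sketch_page_offset(52);
}
```

Under these assumptions the adjustment is 2^52 - 2^48 for a 48-bit fallback and 2^52 - 2^47 for a 47-bit one. The two differ by 2^47, which the hard-coded 48 would have missed in the 47-bit case.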
+130 -35
arch/arm64/mm/kasan_init.c
@@ -23,7 +23,7 @@
 
 #if defined(CONFIG_KASAN_GENERIC) || defined(CONFIG_KASAN_SW_TAGS)
 
-static pgd_t tmp_pg_dir[PTRS_PER_PGD] __initdata __aligned(PGD_SIZE);
+static pgd_t tmp_pg_dir[PTRS_PER_PTE] __initdata __aligned(PAGE_SIZE);
 
 /*
  * The p*d_populate functions call virt_to_phys implicitly so they can't be used
@@ -99,6 +99,19 @@
 	return early ? pud_offset_kimg(p4dp, addr) : pud_offset(p4dp, addr);
 }
 
+static p4d_t *__init kasan_p4d_offset(pgd_t *pgdp, unsigned long addr, int node,
+				      bool early)
+{
+	if (pgd_none(READ_ONCE(*pgdp))) {
+		phys_addr_t p4d_phys = early ?
+				__pa_symbol(kasan_early_shadow_p4d)
+					: kasan_alloc_zeroed_page(node);
+		__pgd_populate(pgdp, p4d_phys, PGD_TYPE_TABLE);
+	}
+
+	return early ? p4d_offset_kimg(pgdp, addr) : p4d_offset(pgdp, addr);
+}
+
 static void __init kasan_pte_populate(pmd_t *pmdp, unsigned long addr,
 				      unsigned long end, int node, bool early)
 {
@@ -144,12 +157,12 @@
 				      unsigned long end, int node, bool early)
 {
 	unsigned long next;
-	p4d_t *p4dp = p4d_offset(pgdp, addr);
+	p4d_t *p4dp = kasan_p4d_offset(pgdp, addr, node, early);
 
 	do {
 		next = p4d_addr_end(addr, end);
 		kasan_pud_populate(p4dp, addr, next, node, early);
-	} while (p4dp++, addr = next, addr != end);
+	} while (p4dp++, addr = next, addr != end && p4d_none(READ_ONCE(*p4dp)));
 }
 
 static void __init kasan_pgd_populate(unsigned long addr, unsigned long end,
@@ -165,19 +178,48 @@
 	} while (pgdp++, addr = next, addr != end);
 }
 
+#if defined(CONFIG_ARM64_64K_PAGES) || CONFIG_PGTABLE_LEVELS > 4
+#define SHADOW_ALIGN	P4D_SIZE
+#else
+#define SHADOW_ALIGN	PUD_SIZE
+#endif
+
+/*
+ * Return whether 'addr' is aligned to the size covered by a root level
+ * descriptor.
+ */
+static bool __init root_level_aligned(u64 addr)
+{
+	int shift = (ARM64_HW_PGTABLE_LEVELS(vabits_actual) - 1) * (PAGE_SHIFT - 3);
+
+	return (addr % (PAGE_SIZE << shift)) == 0;
+}
+
 /* The early shadow maps everything to a single page of zeroes */
 asmlinkage void __init kasan_early_init(void)
 {
 	BUILD_BUG_ON(KASAN_SHADOW_OFFSET !=
 		KASAN_SHADOW_END - (1UL << (64 - KASAN_SHADOW_SCALE_SHIFT)));
-	/*
-	 * We cannot check the actual value of KASAN_SHADOW_START during build,
-	 * as it depends on vabits_actual. As a best-effort approach, check
-	 * potential values calculated based on VA_BITS and VA_BITS_MIN.
-	 */
-	BUILD_BUG_ON(!IS_ALIGNED(_KASAN_SHADOW_START(VA_BITS), PGDIR_SIZE));
-	BUILD_BUG_ON(!IS_ALIGNED(_KASAN_SHADOW_START(VA_BITS_MIN), PGDIR_SIZE));
-	BUILD_BUG_ON(!IS_ALIGNED(KASAN_SHADOW_END, PGDIR_SIZE));
+	BUILD_BUG_ON(!IS_ALIGNED(_KASAN_SHADOW_START(VA_BITS), SHADOW_ALIGN));
+	BUILD_BUG_ON(!IS_ALIGNED(_KASAN_SHADOW_START(VA_BITS_MIN), SHADOW_ALIGN));
+	BUILD_BUG_ON(!IS_ALIGNED(KASAN_SHADOW_END, SHADOW_ALIGN));
+
+	if (!root_level_aligned(KASAN_SHADOW_START)) {
+		/*
+		 * The start address is misaligned, and so the next level table
+		 * will be shared with the linear region. This can happen with
+		 * 4 or 5 level paging, so install a generic pte_t[] as the
+		 * next level. This prevents the kasan_pgd_populate call below
+		 * from inserting an entry that refers to the shared KASAN zero
+		 * shadow pud_t[]/p4d_t[], which could end up getting corrupted
+		 * when the linear region is mapped.
+		 */
+		static pte_t tbl[PTRS_PER_PTE] __page_aligned_bss;
+		pgd_t *pgdp = pgd_offset_k(KASAN_SHADOW_START);
+
+		set_pgd(pgdp, __pgd(__pa_symbol(tbl) | PGD_TYPE_TABLE));
+	}
+
 	kasan_pgd_populate(KASAN_SHADOW_START, KASAN_SHADOW_END, NUMA_NO_NODE,
 			   true);
 }
@@ -190,34 +232,74 @@
 }
 
 /*
- * Copy the current shadow region into a new pgdir.
+ * Return the descriptor index of 'addr' in the root level table
  */
-void __init kasan_copy_shadow(pgd_t *pgdir)
-{
-	pgd_t *pgdp, *pgdp_new, *pgdp_end;
-
-	pgdp = pgd_offset_k(KASAN_SHADOW_START);
-	pgdp_end = pgd_offset_k(KASAN_SHADOW_END);
-	pgdp_new = pgd_offset_pgd(pgdir, KASAN_SHADOW_START);
-	do {
-		set_pgd(pgdp_new, READ_ONCE(*pgdp));
-	} while (pgdp++, pgdp_new++, pgdp != pgdp_end);
-}
-
-static void __init clear_pgds(unsigned long start,
-			unsigned long end)
+static int __init root_level_idx(u64 addr)
 {
 	/*
-	 * Remove references to kasan page tables from
-	 * swapper_pg_dir. pgd_clear() can't be used
-	 * here because it's nop on 2,3-level pagetable setups
+	 * On 64k pages, the TTBR1 range root tables are extended for 52-bit
+	 * virtual addressing, and TTBR1 will simply point to the pgd_t entry
+	 * that covers the start of the 48-bit addressable VA space if LVA is
+	 * not implemented. This means we need to index the table as usual,
+	 * instead of masking off bits based on vabits_actual.
 	 */
-	for (; start < end; start += PGDIR_SIZE)
-		set_pgd(pgd_offset_k(start), __pgd(0));
+	u64 vabits = IS_ENABLED(CONFIG_ARM64_64K_PAGES) ? VA_BITS
+							: vabits_actual;
+	int shift = (ARM64_HW_PGTABLE_LEVELS(vabits) - 1) * (PAGE_SHIFT - 3);
+
+	return (addr & ~_PAGE_OFFSET(vabits)) >> (shift + PAGE_SHIFT);
+}
+
+/*
+ * Clone a next level table from swapper_pg_dir into tmp_pg_dir
+ */
+static void __init clone_next_level(u64 addr, pgd_t *tmp_pg_dir, pud_t *pud)
+{
+	int idx = root_level_idx(addr);
+	pgd_t pgd = READ_ONCE(swapper_pg_dir[idx]);
+	pud_t *pudp = (pud_t *)__phys_to_kimg(__pgd_to_phys(pgd));
+
+	memcpy(pud, pudp, PAGE_SIZE);
+	tmp_pg_dir[idx] = __pgd(__phys_to_pgd_val(__pa_symbol(pud)) |
+				PUD_TYPE_TABLE);
+}
+
+/*
+ * Return the descriptor index of 'addr' in the next level table
+ */
+static int __init next_level_idx(u64 addr)
+{
+	int shift = (ARM64_HW_PGTABLE_LEVELS(vabits_actual) - 2) * (PAGE_SHIFT - 3);
+
+	return (addr >> (shift + PAGE_SHIFT)) % PTRS_PER_PTE;
+}
+
+/*
+ * Dereference the table descriptor at 'pgd_idx' and clear the entries from
+ * 'start' to 'end' (exclusive) from the table.
+ */
+static void __init clear_next_level(int pgd_idx, int start, int end)
+{
+	pgd_t pgd = READ_ONCE(swapper_pg_dir[pgd_idx]);
+	pud_t *pudp = (pud_t *)__phys_to_kimg(__pgd_to_phys(pgd));
+
+	memset(&pudp[start], 0, (end - start) * sizeof(pud_t));
+}
+
+static void __init clear_shadow(u64 start, u64 end)
+{
+	int l = root_level_idx(start), m = root_level_idx(end);
+
+	if (!root_level_aligned(start))
+		clear_next_level(l++, next_level_idx(start), PTRS_PER_PTE);
+	if (!root_level_aligned(end))
+		clear_next_level(m, 0, next_level_idx(end));
+	memset(&swapper_pg_dir[l], 0, (m - l) * sizeof(pgd_t));
 }
 
 static void __init kasan_init_shadow(void)
 {
+	static pud_t pud[2][PTRS_PER_PUD] __initdata __aligned(PAGE_SIZE);
 	u64 kimg_shadow_start, kimg_shadow_end;
 	u64 mod_shadow_start;
 	u64 vmalloc_shadow_end;
@@ -239,10 +321,23 @@
 	 * setup will be finished.
 	 */
 	memcpy(tmp_pg_dir, swapper_pg_dir, sizeof(tmp_pg_dir));
-	dsb(ishst);
-	cpu_replace_ttbr1(lm_alias(tmp_pg_dir), idmap_pg_dir);
 
-	clear_pgds(KASAN_SHADOW_START, KASAN_SHADOW_END);
+	/*
+	 * If the start or end address of the shadow region is not aligned to
+	 * the root level size, we have to allocate a temporary next-level table
+	 * in each case, clone the next level of descriptors, and install the
+	 * table into tmp_pg_dir. Note that with 5 levels of paging, the next
+	 * level will in fact be p4d_t, but that makes no difference in this
+	 * case.
+	 */
+	if (!root_level_aligned(KASAN_SHADOW_START))
+		clone_next_level(KASAN_SHADOW_START, tmp_pg_dir, pud[0]);
+	if (!root_level_aligned(KASAN_SHADOW_END))
+		clone_next_level(KASAN_SHADOW_END, tmp_pg_dir, pud[1]);
+	dsb(ishst);
+	cpu_replace_ttbr1(lm_alias(tmp_pg_dir));
+
+	clear_shadow(KASAN_SHADOW_START, KASAN_SHADOW_END);
 
 	kasan_map_populate(kimg_shadow_start, kimg_shadow_end,
 			   early_pfn_to_nid(virt_to_pfn(lm_alias(KERNEL_START))));
@@ -276,7 +371,7 @@
 			PAGE_KERNEL_RO));
 
 	memset(kasan_early_shadow_page, KASAN_SHADOW_INIT, PAGE_SIZE);
-	cpu_replace_ttbr1(lm_alias(swapper_pg_dir), idmap_pg_dir);
+	cpu_replace_ttbr1(lm_alias(swapper_pg_dir));
 }
 
 static void __init kasan_init_depth(void)
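`root_level_aligned()` in the kasan_init.c hunks derives the span of one root-level descriptor from the number of translation levels. The standalone sketch below reproduces that arithmetic with the page size and VA width passed in as parameters. The `sketch_*` names are hypothetical, and the level formula is assumed to match `ARM64_HW_PGTABLE_LEVELS()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumed to match ARM64_HW_PGTABLE_LEVELS(): with 8-byte descriptors each
 * level resolves (page_shift - 3) bits of the virtual address.
 */
static int sketch_levels(unsigned int page_shift, unsigned int va_bits)
{
	assert(page_shift > 3 && va_bits > 4);
	return (int)((va_bits - 4) / (page_shift - 3));
}

/* Bytes of VA space covered by a single root-level descriptor. */
static uint64_t sketch_root_entry_size(unsigned int page_shift,
				       unsigned int va_bits)
{
	int shift = (sketch_levels(page_shift, va_bits) - 1) *
		    (int)(page_shift - 3);

	return (UINT64_C(1) << page_shift) << shift;
}

/* Same test as root_level_aligned(), with the configuration made explicit. */
static bool sketch_root_level_aligned(uint64_t addr, unsigned int page_shift,
				      unsigned int va_bits)
{
	return (addr % sketch_root_entry_size(page_shift, va_bits)) == 0;
}
```

With 4K pages this gives 4 levels and 512 GiB (2^39 bytes) per root entry at 48 bits, and 5 levels and 2^48 bytes per root entry at 52 bits. That is why a shadow start address that was root-aligned under one configuration can be misaligned under the other.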
+4
arch/arm64/mm/mmap.c
@@ -73,6 +73,10 @@
 		protection_map[VM_EXEC | VM_SHARED] = PAGE_EXECONLY;
 	}
 
+	if (lpa2_is_enabled())
+		for (int i = 0; i < ARRAY_SIZE(protection_map); i++)
+			pgprot_val(protection_map[i]) &= ~PTE_SHARED;
+
 	return 0;
 }
 arch_initcall(adjust_protection_map);
+153 -102
arch/arm64/mm/mmu.c
··· 45 45 #define NO_CONT_MAPPINGS BIT(1) 46 46 #define NO_EXEC_MAPPINGS BIT(2) /* assumes FEAT_HPDS is not used */ 47 47 48 - int idmap_t0sz __ro_after_init; 49 - 50 - #if VA_BITS > 48 51 - u64 vabits_actual __ro_after_init = VA_BITS_MIN; 52 - EXPORT_SYMBOL(vabits_actual); 53 - #endif 54 - 55 48 u64 kimage_voffset __ro_after_init; 56 49 EXPORT_SYMBOL(kimage_voffset); 57 50 58 51 u32 __boot_cpu_mode[] = { BOOT_CPU_MODE_EL2, BOOT_CPU_MODE_EL1 }; 52 + 53 + static bool rodata_is_rw __ro_after_init = true; 59 54 60 55 /* 61 56 * The booting CPU updates the failed status @__early_cpu_boot_status, ··· 68 73 static DEFINE_SPINLOCK(swapper_pgdir_lock); 69 74 static DEFINE_MUTEX(fixmap_lock); 70 75 71 - void set_swapper_pgd(pgd_t *pgdp, pgd_t pgd) 76 + void noinstr set_swapper_pgd(pgd_t *pgdp, pgd_t pgd) 72 77 { 73 78 pgd_t *fixmap_pgdp; 79 + 80 + /* 81 + * Don't bother with the fixmap if swapper_pg_dir is still mapped 82 + * writable in the kernel mapping. 83 + */ 84 + if (rodata_is_rw) { 85 + WRITE_ONCE(*pgdp, pgd); 86 + dsb(ishst); 87 + isb(); 88 + return; 89 + } 74 90 75 91 spin_lock(&swapper_pgdir_lock); 76 92 fixmap_pgdp = pgd_set_fixmap(__pa_symbol(pgdp)); ··· 313 307 } while (addr = next, addr != end); 314 308 } 315 309 316 - static void alloc_init_pud(pgd_t *pgdp, unsigned long addr, unsigned long end, 310 + static void alloc_init_pud(p4d_t *p4dp, unsigned long addr, unsigned long end, 317 311 phys_addr_t phys, pgprot_t prot, 318 312 phys_addr_t (*pgtable_alloc)(int), 319 313 int flags) 320 314 { 321 315 unsigned long next; 322 - pud_t *pudp; 323 - p4d_t *p4dp = p4d_offset(pgdp, addr); 324 316 p4d_t p4d = READ_ONCE(*p4dp); 317 + pud_t *pudp; 325 318 326 319 if (p4d_none(p4d)) { 327 320 p4dval_t p4dval = P4D_TYPE_TABLE | P4D_TABLE_UXN; ··· 368 363 pud_clear_fixmap(); 369 364 } 370 365 366 + static void alloc_init_p4d(pgd_t *pgdp, unsigned long addr, unsigned long end, 367 + phys_addr_t phys, pgprot_t prot, 368 + phys_addr_t (*pgtable_alloc)(int), 369 + int flags) 
370 + { 371 + unsigned long next; 372 + pgd_t pgd = READ_ONCE(*pgdp); 373 + p4d_t *p4dp; 374 + 375 + if (pgd_none(pgd)) { 376 + pgdval_t pgdval = PGD_TYPE_TABLE | PGD_TABLE_UXN; 377 + phys_addr_t p4d_phys; 378 + 379 + if (flags & NO_EXEC_MAPPINGS) 380 + pgdval |= PGD_TABLE_PXN; 381 + BUG_ON(!pgtable_alloc); 382 + p4d_phys = pgtable_alloc(P4D_SHIFT); 383 + __pgd_populate(pgdp, p4d_phys, pgdval); 384 + pgd = READ_ONCE(*pgdp); 385 + } 386 + BUG_ON(pgd_bad(pgd)); 387 + 388 + p4dp = p4d_set_fixmap_offset(pgdp, addr); 389 + do { 390 + p4d_t old_p4d = READ_ONCE(*p4dp); 391 + 392 + next = p4d_addr_end(addr, end); 393 + 394 + alloc_init_pud(p4dp, addr, next, phys, prot, 395 + pgtable_alloc, flags); 396 + 397 + BUG_ON(p4d_val(old_p4d) != 0 && 398 + p4d_val(old_p4d) != READ_ONCE(p4d_val(*p4dp))); 399 + 400 + phys += next - addr; 401 + } while (p4dp++, addr = next, addr != end); 402 + 403 + p4d_clear_fixmap(); 404 + } 405 + 371 406 static void __create_pgd_mapping_locked(pgd_t *pgdir, phys_addr_t phys, 372 407 unsigned long virt, phys_addr_t size, 373 408 pgprot_t prot, ··· 430 385 431 386 do { 432 387 next = pgd_addr_end(addr, end); 433 - alloc_init_pud(pgdp, addr, next, phys, prot, pgtable_alloc, 388 + alloc_init_p4d(pgdp, addr, next, phys, prot, pgtable_alloc, 434 389 flags); 435 390 phys += next - addr; 436 391 } while (pgdp++, addr = next, addr != end); ··· 621 576 * entries at any level are being shared between the linear region and 622 577 * the vmalloc region. Check whether this is true for the PGD level, in 623 578 * which case it is guaranteed to be true for all other levels as well. 579 + * (Unless we are running with support for LPA2, in which case the 580 + * entire reduced VA space is covered by a single pgd_t which will have 581 + * been populated without the PXNTable attribute by the time we get here.) 
624 582 */ 625 - BUILD_BUG_ON(pgd_index(direct_map_end - 1) == pgd_index(direct_map_end)); 583 + BUILD_BUG_ON(pgd_index(direct_map_end - 1) == pgd_index(direct_map_end) && 584 + pgd_index(_PAGE_OFFSET(VA_BITS_MIN)) != PTRS_PER_PGD - 1); 626 585 627 586 early_kfence_pool = arm64_kfence_alloc_pool(); 628 587 ··· 679 630 * to cover NOTES and EXCEPTION_TABLE. 680 631 */ 681 632 section_size = (unsigned long)__init_begin - (unsigned long)__start_rodata; 633 + WRITE_ONCE(rodata_is_rw, false); 682 634 update_mapping_prot(__pa_symbol(__start_rodata), (unsigned long)__start_rodata, 683 635 section_size, PAGE_KERNEL_RO); 684 636 685 637 debug_checkwx(); 686 638 } 687 639 688 - static void __init map_kernel_segment(pgd_t *pgdp, void *va_start, void *va_end, 689 - pgprot_t prot, struct vm_struct *vma, 690 - int flags, unsigned long vm_flags) 640 + static void __init declare_vma(struct vm_struct *vma, 641 + void *va_start, void *va_end, 642 + unsigned long vm_flags) 691 643 { 692 644 phys_addr_t pa_start = __pa_symbol(va_start); 693 645 unsigned long size = va_end - va_start; 694 646 695 647 BUG_ON(!PAGE_ALIGNED(pa_start)); 696 648 BUG_ON(!PAGE_ALIGNED(size)); 697 - 698 - __create_pgd_mapping(pgdp, pa_start, (unsigned long)va_start, size, prot, 699 - early_pgtable_alloc, flags); 700 649 701 650 if (!(vm_flags & VM_NO_GUARD)) 702 651 size += PAGE_SIZE; ··· 708 661 vm_area_add_early(vma); 709 662 } 710 663 664 + #ifdef CONFIG_UNMAP_KERNEL_AT_EL0 711 665 static pgprot_t kernel_exec_prot(void) 712 666 { 713 667 return rodata_enabled ? PAGE_KERNEL_ROX : PAGE_KERNEL_EXEC; 714 668 } 715 669 716 - #ifdef CONFIG_UNMAP_KERNEL_AT_EL0 717 670 static int __init map_entry_trampoline(void) 718 671 { 719 672 int i; ··· 748 701 #endif 749 702 750 703 /* 751 - * Open coded check for BTI, only for use to determine configuration 752 - * for early mappings for before the cpufeature code has run. 
704 + * Declare the VMA areas for the kernel 753 705 */ 754 - static bool arm64_early_this_cpu_has_bti(void) 706 + static void __init declare_kernel_vmas(void) 755 707 { 756 - u64 pfr1; 708 + static struct vm_struct vmlinux_seg[KERNEL_SEGMENT_COUNT]; 757 709 758 - if (!IS_ENABLED(CONFIG_ARM64_BTI_KERNEL)) 759 - return false; 760 - 761 - pfr1 = __read_sysreg_by_encoding(SYS_ID_AA64PFR1_EL1); 762 - return cpuid_feature_extract_unsigned_field(pfr1, 763 - ID_AA64PFR1_EL1_BT_SHIFT); 710 + declare_vma(&vmlinux_seg[0], _stext, _etext, VM_NO_GUARD); 711 + declare_vma(&vmlinux_seg[1], __start_rodata, __inittext_begin, VM_NO_GUARD); 712 + declare_vma(&vmlinux_seg[2], __inittext_begin, __inittext_end, VM_NO_GUARD); 713 + declare_vma(&vmlinux_seg[3], __initdata_begin, __initdata_end, VM_NO_GUARD); 714 + declare_vma(&vmlinux_seg[4], _data, _end, 0); 764 715 } 765 716 766 - /* 767 - * Create fine-grained mappings for the kernel. 768 - */ 769 - static void __init map_kernel(pgd_t *pgdp) 770 - { 771 - static struct vm_struct vmlinux_text, vmlinux_rodata, vmlinux_inittext, 772 - vmlinux_initdata, vmlinux_data; 717 + void __pi_map_range(u64 *pgd, u64 start, u64 end, u64 pa, pgprot_t prot, 718 + int level, pte_t *tbl, bool may_use_cont, u64 va_offset); 773 719 774 - /* 775 - * External debuggers may need to write directly to the text 776 - * mapping to install SW breakpoints. Allow this (only) when 777 - * explicitly requested with rodata=off. 778 - */ 779 - pgprot_t text_prot = kernel_exec_prot(); 780 - 781 - /* 782 - * If we have a CPU that supports BTI and a kernel built for 783 - * BTI then mark the kernel executable text as guarded pages 784 - * now so we don't have to rewrite the page tables later. 785 - */ 786 - if (arm64_early_this_cpu_has_bti()) 787 - text_prot = __pgprot_modify(text_prot, PTE_GP, PTE_GP); 788 - 789 - /* 790 - * Only rodata will be remapped with different permissions later on, 791 - * all other segments are allowed to use contiguous mappings. 
792 - */ 793 - map_kernel_segment(pgdp, _stext, _etext, text_prot, &vmlinux_text, 0, 794 - VM_NO_GUARD); 795 - map_kernel_segment(pgdp, __start_rodata, __inittext_begin, PAGE_KERNEL, 796 - &vmlinux_rodata, NO_CONT_MAPPINGS, VM_NO_GUARD); 797 - map_kernel_segment(pgdp, __inittext_begin, __inittext_end, text_prot, 798 - &vmlinux_inittext, 0, VM_NO_GUARD); 799 - map_kernel_segment(pgdp, __initdata_begin, __initdata_end, PAGE_KERNEL, 800 - &vmlinux_initdata, 0, VM_NO_GUARD); 801 - map_kernel_segment(pgdp, _data, _end, PAGE_KERNEL, &vmlinux_data, 0, 0); 802 - 803 - fixmap_copy(pgdp); 804 - kasan_copy_shadow(pgdp); 805 - } 720 + static u8 idmap_ptes[IDMAP_LEVELS - 1][PAGE_SIZE] __aligned(PAGE_SIZE) __ro_after_init, 721 + kpti_ptes[IDMAP_LEVELS - 1][PAGE_SIZE] __aligned(PAGE_SIZE) __ro_after_init; 806 722 807 723 static void __init create_idmap(void) 808 724 { 809 725 u64 start = __pa_symbol(__idmap_text_start); 810 - u64 size = __pa_symbol(__idmap_text_end) - start; 811 - pgd_t *pgd = idmap_pg_dir; 812 - u64 pgd_phys; 726 + u64 end = __pa_symbol(__idmap_text_end); 727 + u64 ptep = __pa_symbol(idmap_ptes); 813 728 814 - /* check if we need an additional level of translation */ 815 - if (VA_BITS < 48 && idmap_t0sz < (64 - VA_BITS_MIN)) { 816 - pgd_phys = early_pgtable_alloc(PAGE_SHIFT); 817 - set_pgd(&idmap_pg_dir[start >> VA_BITS], 818 - __pgd(pgd_phys | P4D_TYPE_TABLE)); 819 - pgd = __va(pgd_phys); 820 - } 821 - __create_pgd_mapping(pgd, start, start, size, PAGE_KERNEL_ROX, 822 - early_pgtable_alloc, 0); 729 + __pi_map_range(&ptep, start, end, start, PAGE_KERNEL_ROX, 730 + IDMAP_ROOT_LEVEL, (pte_t *)idmap_pg_dir, false, 731 + __phys_to_virt(ptep) - ptep); 823 732 824 - if (IS_ENABLED(CONFIG_UNMAP_KERNEL_AT_EL0)) { 733 + if (IS_ENABLED(CONFIG_UNMAP_KERNEL_AT_EL0) && !arm64_use_ng_mappings) { 825 734 extern u32 __idmap_kpti_flag; 826 735 u64 pa = __pa_symbol(&__idmap_kpti_flag); 827 736 ··· 785 782 * The KPTI G-to-nG conversion code needs a read-write mapping 786 783 * of 
its synchronization flag in the ID map. 787 784 */ 788 - __create_pgd_mapping(pgd, pa, pa, sizeof(u32), PAGE_KERNEL, 789 - early_pgtable_alloc, 0); 785 + ptep = __pa_symbol(kpti_ptes); 786 + __pi_map_range(&ptep, pa, pa + sizeof(u32), pa, PAGE_KERNEL, 787 + IDMAP_ROOT_LEVEL, (pte_t *)idmap_pg_dir, false, 788 + __phys_to_virt(ptep) - ptep); 790 789 } 791 790 } 792 791 793 792 void __init paging_init(void) 794 793 { 795 - pgd_t *pgdp = pgd_set_fixmap(__pa_symbol(swapper_pg_dir)); 796 - extern pgd_t init_idmap_pg_dir[]; 797 - 798 - idmap_t0sz = 63UL - __fls(__pa_symbol(_end) | GENMASK(VA_BITS_MIN - 1, 0)); 799 - 800 - map_kernel(pgdp); 801 - map_mem(pgdp); 802 - 803 - pgd_clear_fixmap(); 804 - 805 - cpu_replace_ttbr1(lm_alias(swapper_pg_dir), init_idmap_pg_dir); 806 - init_mm.pgd = swapper_pg_dir; 807 - 808 - memblock_phys_free(__pa_symbol(init_pg_dir), 809 - __pa_symbol(init_pg_end) - __pa_symbol(init_pg_dir)); 794 + map_mem(swapper_pg_dir); 810 795 811 796 memblock_allow_resize(); 812 797 813 798 create_idmap(); 799 + declare_kernel_vmas(); 814 800 } 815 801 816 802 #ifdef CONFIG_MEMORY_HOTPLUG ··· 1065 1073 free_empty_pmd_table(pudp, addr, next, floor, ceiling); 1066 1074 } while (addr = next, addr < end); 1067 1075 1068 - if (CONFIG_PGTABLE_LEVELS <= 3) 1076 + if (!pgtable_l4_enabled()) 1069 1077 return; 1070 1078 1071 - if (!pgtable_range_aligned(start, end, floor, ceiling, PGDIR_MASK)) 1079 + if (!pgtable_range_aligned(start, end, floor, ceiling, P4D_MASK)) 1072 1080 return; 1073 1081 1074 1082 /* ··· 1091 1099 unsigned long end, unsigned long floor, 1092 1100 unsigned long ceiling) 1093 1101 { 1094 - unsigned long next; 1095 1102 p4d_t *p4dp, p4d; 1103 + unsigned long i, next, start = addr; 1096 1104 1097 1105 do { 1098 1106 next = p4d_addr_end(addr, end); ··· 1104 1112 WARN_ON(!p4d_present(p4d)); 1105 1113 free_empty_pud_table(p4dp, addr, next, floor, ceiling); 1106 1114 } while (addr = next, addr < end); 1115 + 1116 + if (!pgtable_l5_enabled()) 1117 + return; 
1118 + 1119 + if (!pgtable_range_aligned(start, end, floor, ceiling, PGDIR_MASK)) 1120 + return; 1121 + 1122 + /* 1123 + * Check whether we can free the p4d page if the rest of the 1124 + * entries are empty. Overlap with other regions have been 1125 + * handled by the floor/ceiling check. 1126 + */ 1127 + p4dp = p4d_offset(pgdp, 0UL); 1128 + for (i = 0; i < PTRS_PER_P4D; i++) { 1129 + if (!p4d_none(READ_ONCE(p4dp[i]))) 1130 + return; 1131 + } 1132 + 1133 + pgd_clear(pgdp); 1134 + __flush_tlb_kernel_pgtable(start); 1135 + free_hotplug_pgtable_page(virt_to_page(p4dp)); 1107 1136 } 1108 1137 1109 1138 static void free_empty_tables(unsigned long addr, unsigned long end, ··· 1208 1195 set_pmd(pmdp, new_pmd); 1209 1196 return 1; 1210 1197 } 1198 + 1199 + #ifndef __PAGETABLE_P4D_FOLDED 1200 + void p4d_clear_huge(p4d_t *p4dp) 1201 + { 1202 + } 1203 + #endif 1211 1204 1212 1205 int pud_clear_huge(pud_t *pudp) 1213 1206 { ··· 1504 1485 pte_t old_pte, pte_t pte) 1505 1486 { 1506 1487 set_pte_at(vma->vm_mm, addr, ptep, pte); 1488 + } 1489 + 1490 + /* 1491 + * Atomically replaces the active TTBR1_EL1 PGD with a new VA-compatible PGD, 1492 + * avoiding the possibility of conflicting TLB entries being allocated. 1493 + */ 1494 + void __cpu_replace_ttbr1(pgd_t *pgdp, bool cnp) 1495 + { 1496 + typedef void (ttbr_replace_func)(phys_addr_t); 1497 + extern ttbr_replace_func idmap_cpu_replace_ttbr1; 1498 + ttbr_replace_func *replace_phys; 1499 + unsigned long daif; 1500 + 1501 + /* phys_to_ttbr() zeros lower 2 bits of ttbr with 52-bit PA */ 1502 + phys_addr_t ttbr1 = phys_to_ttbr(virt_to_phys(pgdp)); 1503 + 1504 + if (cnp) 1505 + ttbr1 |= TTBR_CNP_BIT; 1506 + 1507 + replace_phys = (void *)__pa_symbol(idmap_cpu_replace_ttbr1); 1508 + 1509 + cpu_install_idmap(); 1510 + 1511 + /* 1512 + * We really don't want to take *any* exceptions while TTBR1 is 1513 + * in the process of being replaced so mask everything. 
1514 + */ 1515 + daif = local_daif_save(); 1516 + replace_phys(ttbr1); 1517 + local_daif_restore(daif); 1518 + 1519 + cpu_uninstall_idmap(); 1507 1520 }
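The new tail of free_empty_p4d_table() mirrors the lower levels: after the range has been walked, the table page is released only if every entry in it is empty. A minimal user-space sketch of that check follows, assuming a plain array of 64-bit entries in place of p4d_t; the helper name table_is_empty() and the ENTRIES_PER_TABLE value are hypothetical, not kernel API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for PTRS_PER_P4D; the real value depends on the configuration. */
#define ENTRIES_PER_TABLE 512

/* Returns true when no entry is populated, i.e. the table page may be freed. */
static bool table_is_empty(const uint64_t *table, size_t nr_entries)
{
	assert(table != NULL);

	for (size_t i = 0; i < nr_entries; i++) {
		if (table[i] != 0)	/* models !p4d_none(READ_ONCE(p4dp[i])) */
			return false;
	}
	return true;
}
```

In the diff, a positive result is what allows pgd_clear(), the TLB flush and free_hotplug_pgtable_page() to run; any populated entry makes the function return early and keep the page.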
+14 -3
arch/arm64/mm/pgd.c
···

 static struct kmem_cache *pgd_cache __ro_after_init;

+static bool pgdir_is_page_size(void)
+{
+	if (PGD_SIZE == PAGE_SIZE)
+		return true;
+	if (CONFIG_PGTABLE_LEVELS == 4)
+		return !pgtable_l4_enabled();
+	if (CONFIG_PGTABLE_LEVELS == 5)
+		return !pgtable_l5_enabled();
+	return false;
+}
+
 pgd_t *pgd_alloc(struct mm_struct *mm)
 {
 	gfp_t gfp = GFP_PGTABLE_USER;

-	if (PGD_SIZE == PAGE_SIZE)
+	if (pgdir_is_page_size())
 		return (pgd_t *)__get_free_page(gfp);
 	else
 		return kmem_cache_alloc(pgd_cache, gfp);
···

 void pgd_free(struct mm_struct *mm, pgd_t *pgd)
 {
-	if (PGD_SIZE == PAGE_SIZE)
+	if (pgdir_is_page_size())
 		free_page((unsigned long)pgd);
 	else
 		kmem_cache_free(pgd_cache, pgd);
···

 void __init pgtable_cache_init(void)
 {
-	if (PGD_SIZE == PAGE_SIZE)
+	if (pgdir_is_page_size())
 		return;

 #ifdef CONFIG_ARM64_PA_BITS_52
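The pgd.c change turns a compile-time size comparison into a runtime decision: if the top translation level is folded at runtime, the diff treats the root table as a full page even when PGD_SIZE is smaller. The sketch below models that decision in user space. The struct pgtable_cfg and its fields are hypothetical stand-ins for the Kconfig values and the pgtable_l4_enabled()/pgtable_l5_enabled() helpers; only the branch structure is taken from the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the inputs; in the kernel these come from Kconfig and CPU feature detection. */
struct pgtable_cfg {
	size_t pgd_size;	/* PGD_SIZE: bytes occupied by the top-level table */
	size_t page_size;	/* PAGE_SIZE */
	int levels;		/* CONFIG_PGTABLE_LEVELS */
	bool l4_enabled;	/* models pgtable_l4_enabled() */
	bool l5_enabled;	/* models pgtable_l5_enabled() */
};

static bool pgdir_is_page_size(const struct pgtable_cfg *cfg)
{
	assert(cfg != NULL);

	if (cfg->pgd_size == cfg->page_size)
		return true;
	/* Top level not in use at runtime: same branches as the diff. */
	if (cfg->levels == 4)
		return !cfg->l4_enabled;
	if (cfg->levels == 5)
		return !cfg->l5_enabled;
	return false;
}
```

With an assumed 128-byte PGD and 4 KiB pages on a 5-level build, the function returns true when l5_enabled is false (page allocator path) and false when it is true (kmem_cache path).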
+95 -21
arch/arm64/mm/proc.S
··· 195 195 196 196 ret 197 197 SYM_FUNC_END(idmap_cpu_replace_ttbr1) 198 + SYM_FUNC_ALIAS(__pi_idmap_cpu_replace_ttbr1, idmap_cpu_replace_ttbr1) 198 199 .popsection 199 200 200 201 #ifdef CONFIG_UNMAP_KERNEL_AT_EL0 201 202 202 - #define KPTI_NG_PTE_FLAGS (PTE_ATTRINDX(MT_NORMAL) | SWAPPER_PTE_FLAGS | PTE_WRITE) 203 + #define KPTI_NG_PTE_FLAGS (PTE_ATTRINDX(MT_NORMAL) | PTE_TYPE_PAGE | \ 204 + PTE_AF | PTE_SHARED | PTE_UXN | PTE_WRITE) 203 205 204 206 .pushsection ".idmap.text", "a" 207 + 208 + .macro pte_to_phys, phys, pte 209 + and \phys, \pte, #PTE_ADDR_LOW 210 + #ifdef CONFIG_ARM64_PA_BITS_52 211 + and \pte, \pte, #PTE_ADDR_HIGH 212 + orr \phys, \phys, \pte, lsl #PTE_ADDR_HIGH_SHIFT 213 + #endif 214 + .endm 205 215 206 216 .macro kpti_mk_tbl_ng, type, num_entries 207 217 add end_\type\()p, cur_\type\()p, #\num_entries * 8 208 218 .Ldo_\type: 209 - ldr \type, [cur_\type\()p] // Load the entry 219 + ldr \type, [cur_\type\()p], #8 // Load the entry and advance 210 220 tbz \type, #0, .Lnext_\type // Skip invalid and 211 221 tbnz \type, #11, .Lnext_\type // non-global entries 212 222 orr \type, \type, #PTE_NG // Same bit for blocks and pages 213 - str \type, [cur_\type\()p] // Update the entry 223 + str \type, [cur_\type\()p, #-8] // Update the entry 214 224 .ifnc \type, pte 215 225 tbnz \type, #1, .Lderef_\type 216 226 .endif 217 227 .Lnext_\type: 218 - add cur_\type\()p, cur_\type\()p, #8 219 228 cmp cur_\type\()p, end_\type\()p 220 229 b.ne .Ldo_\type 221 230 .endm ··· 234 225 * fixmap slot associated with the current level. 
235 226 */ 236 227 .macro kpti_map_pgtbl, type, level 237 - str xzr, [temp_pte, #8 * (\level + 1)] // break before make 228 + str xzr, [temp_pte, #8 * (\level + 2)] // break before make 238 229 dsb nshst 239 - add pte, temp_pte, #PAGE_SIZE * (\level + 1) 230 + add pte, temp_pte, #PAGE_SIZE * (\level + 2) 240 231 lsr pte, pte, #12 241 232 tlbi vaae1, pte 242 233 dsb nsh 243 234 isb 244 235 245 236 phys_to_pte pte, cur_\type\()p 246 - add cur_\type\()p, temp_pte, #PAGE_SIZE * (\level + 1) 237 + add cur_\type\()p, temp_pte, #PAGE_SIZE * (\level + 2) 247 238 orr pte, pte, pte_flags 248 - str pte, [temp_pte, #8 * (\level + 1)] 239 + str pte, [temp_pte, #8 * (\level + 2)] 249 240 dsb nshst 250 241 .endm 251 242 ··· 278 269 end_ptep .req x15 279 270 pte .req x16 280 271 valid .req x17 272 + cur_p4dp .req x19 273 + end_p4dp .req x20 281 274 282 275 mov x5, x3 // preserve temp_pte arg 283 276 mrs swapper_ttb, ttbr1_el1 284 277 adr_l flag_ptr, __idmap_kpti_flag 285 278 286 279 cbnz cpu, __idmap_kpti_secondary 280 + 281 + #if CONFIG_PGTABLE_LEVELS > 4 282 + stp x29, x30, [sp, #-32]! 283 + mov x29, sp 284 + stp x19, x20, [sp, #16] 285 + #endif 287 286 288 287 /* We're the boot CPU. Wait for the others to catch up */ 289 288 sevl ··· 310 293 mov_q pte_flags, KPTI_NG_PTE_FLAGS 311 294 312 295 /* Everybody is enjoying the idmap, so we can rewrite swapper. */ 296 + 297 + #ifdef CONFIG_ARM64_LPA2 298 + /* 299 + * If LPA2 support is configured, but 52-bit virtual addressing is not 300 + * enabled at runtime, we will fall back to one level of paging less, 301 + * and so we have to walk swapper_pg_dir as if we dereferenced its 302 + * address from a PGD level entry, and terminate the PGD level loop 303 + * right after. 
304 + */ 305 + adrp pgd, swapper_pg_dir // walk &swapper_pg_dir at the next level 306 + mov cur_pgdp, end_pgdp // must be equal to terminate the PGD loop 307 + alternative_if_not ARM64_HAS_VA52 308 + b .Lderef_pgd // skip to the next level 309 + alternative_else_nop_endif 310 + /* 311 + * LPA2 based 52-bit virtual addressing requires 52-bit physical 312 + * addressing to be enabled as well. In this case, the shareability 313 + * bits are repurposed as physical address bits, and should not be 314 + * set in pte_flags. 315 + */ 316 + bic pte_flags, pte_flags, #PTE_SHARED 317 + #endif 318 + 313 319 /* PGD */ 314 320 adrp cur_pgdp, swapper_pg_dir 315 - kpti_map_pgtbl pgd, 0 321 + kpti_map_pgtbl pgd, -1 316 322 kpti_mk_tbl_ng pgd, PTRS_PER_PGD 317 323 318 324 /* Ensure all the updated entries are visible to secondary CPUs */ ··· 348 308 349 309 /* Set the flag to zero to indicate that we're all done */ 350 310 str wzr, [flag_ptr] 311 + #if CONFIG_PGTABLE_LEVELS > 4 312 + ldp x19, x20, [sp, #16] 313 + ldp x29, x30, [sp], #32 314 + #endif 351 315 ret 352 316 353 317 .Lderef_pgd: 318 + /* P4D */ 319 + .if CONFIG_PGTABLE_LEVELS > 4 320 + p4d .req x30 321 + pte_to_phys cur_p4dp, pgd 322 + kpti_map_pgtbl p4d, 0 323 + kpti_mk_tbl_ng p4d, PTRS_PER_P4D 324 + b .Lnext_pgd 325 + .else /* CONFIG_PGTABLE_LEVELS <= 4 */ 326 + p4d .req pgd 327 + .set .Lnext_p4d, .Lnext_pgd 328 + .endif 329 + 330 + .Lderef_p4d: 354 331 /* PUD */ 355 332 .if CONFIG_PGTABLE_LEVELS > 3 356 333 pud .req x10 357 - pte_to_phys cur_pudp, pgd 334 + pte_to_phys cur_pudp, p4d 358 335 kpti_map_pgtbl pud, 1 359 336 kpti_mk_tbl_ng pud, PTRS_PER_PUD 360 - b .Lnext_pgd 337 + b .Lnext_p4d 361 338 .else /* CONFIG_PGTABLE_LEVELS <= 3 */ 362 339 pud .req pgd 363 340 .set .Lnext_pud, .Lnext_pgd ··· 418 361 .unreq end_ptep 419 362 .unreq pte 420 363 .unreq valid 364 + .unreq cur_p4dp 365 + .unreq end_p4dp 366 + .unreq p4d 421 367 422 368 /* Secondary CPUs end up here */ 423 369 __idmap_kpti_secondary: ··· 455 395 * 456 396 
* Initialise the processor for turning the MMU on. 457 397 * 458 - * Input: 459 - * x0 - actual number of VA bits (ignored unless VA_BITS > 48) 460 398 * Output: 461 399 * Return in x0 the value of the SCTLR_EL1 register. 462 400 */ ··· 478 420 mair .req x17 479 421 tcr .req x16 480 422 mov_q mair, MAIR_EL1_SET 481 - mov_q tcr, TCR_TxSZ(VA_BITS) | TCR_CACHE_FLAGS | TCR_SMP_FLAGS | \ 482 - TCR_TG_FLAGS | TCR_KASLR_FLAGS | TCR_ASID16 | \ 483 - TCR_TBI0 | TCR_A1 | TCR_KASAN_SW_FLAGS | TCR_MTE_FLAGS 423 + mov_q tcr, TCR_T0SZ(IDMAP_VA_BITS) | TCR_T1SZ(VA_BITS_MIN) | TCR_CACHE_FLAGS | \ 424 + TCR_SMP_FLAGS | TCR_TG_FLAGS | TCR_KASLR_FLAGS | TCR_ASID16 | \ 425 + TCR_TBI0 | TCR_A1 | TCR_KASAN_SW_FLAGS | TCR_MTE_FLAGS 484 426 485 427 tcr_clear_errata_bits tcr, x9, x5 486 428 487 429 #ifdef CONFIG_ARM64_VA_BITS_52 488 - sub x9, xzr, x0 489 - add x9, x9, #64 430 + mov x9, #64 - VA_BITS 431 + alternative_if ARM64_HAS_VA52 490 432 tcr_set_t1sz tcr, x9 491 - #else 492 - idmap_get_t0sz x9 433 + #ifdef CONFIG_ARM64_LPA2 434 + orr tcr, tcr, #TCR_DS 493 435 #endif 494 - tcr_set_t0sz tcr, x9 436 + alternative_else_nop_endif 437 + #endif 495 438 496 439 /* 497 440 * Set the IPS bits in TCR_EL1. ··· 517 458 ubfx x1, x1, #ID_AA64MMFR3_EL1_S1PIE_SHIFT, #4 518 459 cbz x1, .Lskip_indirection 519 460 461 + /* 462 + * The PROT_* macros describing the various memory types may resolve to 463 + * C expressions if they include the PTE_MAYBE_* macros, and so they 464 + * can only be used from C code. The PIE_E* constants below are also 465 + * defined in terms of those macros, but will mask out those 466 + * PTE_MAYBE_* constants, whether they are set or not. So #define them 467 + * as 0x0 here so we can evaluate the PIE_E* constants in asm context. 
468 + */ 469 + 470 + #define PTE_MAYBE_NG 0 471 + #define PTE_MAYBE_SHARED 0 472 + 520 473 mov_q x0, PIE_E0 521 474 msr REG_PIRE0_EL1, x0 522 475 mov_q x0, PIE_E1 523 476 msr REG_PIR_EL1, x0 477 + 478 + #undef PTE_MAYBE_NG 479 + #undef PTE_MAYBE_SHARED 524 480 525 481 mov x0, TCR2_EL1x_PIE 526 482 msr REG_TCR2_EL1, x0
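In the __cpu_setup hunk above, T0SZ is now derived from IDMAP_VA_BITS and T1SZ from VA_BITS_MIN, and T1SZ is patched to 64 - VA_BITS only when ARM64_HAS_VA52 is set. The arithmetic can be sketched in C as below. The field positions (T0SZ in bits [5:0], T1SZ in bits [21:16]) follow the architectural TCR_EL1 layout; the helper names txsz() and tcr_with_t1sz() are hypothetical and only illustrate the encoding.

```c
#include <assert.h>
#include <stdint.h>

#define T0SZ_SHIFT	0	/* TCR_EL1.T0SZ, bits [5:0] */
#define T1SZ_SHIFT	16	/* TCR_EL1.T1SZ, bits [21:16] */
#define TXSZ_MASK	0x3fULL

/* A TxSZ field holds 64 minus the number of virtual address bits. */
static uint64_t txsz(unsigned int va_bits)
{
	assert(va_bits > 0 && va_bits <= 64);
	return 64 - va_bits;
}

/* Replace the T1SZ field, leaving every other TCR bit untouched. */
static uint64_t tcr_with_t1sz(uint64_t tcr, unsigned int va_bits)
{
	tcr &= ~(TXSZ_MASK << T1SZ_SHIFT);
	return tcr | (txsz(va_bits) << T1SZ_SHIFT);
}
```

For example, a register built for 48-bit addressing carries 16 in both fields; calling tcr_with_t1sz(tcr, 52) changes T1SZ to 12 and leaves T0SZ at 16.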
+38 -39
arch/arm64/mm/ptdump.c
··· 26 26 #include <asm/ptdump.h> 27 27 28 28 29 - enum address_markers_idx { 30 - PAGE_OFFSET_NR = 0, 31 - PAGE_END_NR, 32 - #if defined(CONFIG_KASAN_GENERIC) || defined(CONFIG_KASAN_SW_TAGS) 33 - KASAN_START_NR, 34 - #endif 35 - }; 36 - 37 - static struct addr_marker address_markers[] = { 38 - { PAGE_OFFSET, "Linear Mapping start" }, 39 - { 0 /* PAGE_END */, "Linear Mapping end" }, 40 - #if defined(CONFIG_KASAN_GENERIC) || defined(CONFIG_KASAN_SW_TAGS) 41 - { 0 /* KASAN_SHADOW_START */, "Kasan shadow start" }, 42 - { KASAN_SHADOW_END, "Kasan shadow end" }, 43 - #endif 44 - { MODULES_VADDR, "Modules start" }, 45 - { MODULES_END, "Modules end" }, 46 - { VMALLOC_START, "vmalloc() area" }, 47 - { VMALLOC_END, "vmalloc() end" }, 48 - { FIXADDR_TOT_START, "Fixmap start" }, 49 - { FIXADDR_TOP, "Fixmap end" }, 50 - { PCI_IO_START, "PCI I/O start" }, 51 - { PCI_IO_END, "PCI I/O end" }, 52 - { VMEMMAP_START, "vmemmap start" }, 53 - { VMEMMAP_START + VMEMMAP_SIZE, "vmemmap end" }, 54 - { -1, NULL }, 55 - }; 56 - 57 29 #define pt_dump_seq_printf(m, fmt, args...) \ 58 30 ({ \ 59 31 if (m) \ ··· 48 76 struct ptdump_state ptdump; 49 77 struct seq_file *seq; 50 78 const struct addr_marker *marker; 79 + const struct mm_struct *mm; 51 80 unsigned long start_address; 52 81 int level; 53 82 u64 current_prot; ··· 145 172 146 173 struct pg_level { 147 174 const struct prot_bits *bits; 148 - const char *name; 149 - size_t num; 175 + char name[4]; 176 + int num; 150 177 u64 mask; 151 178 }; 152 179 153 - static struct pg_level pg_level[] = { 180 + static struct pg_level pg_level[] __ro_after_init = { 154 181 { /* pgd */ 155 182 .name = "PGD", 156 183 .bits = pte_bits, ··· 160 187 .bits = pte_bits, 161 188 .num = ARRAY_SIZE(pte_bits), 162 189 }, { /* pud */ 163 - .name = (CONFIG_PGTABLE_LEVELS > 3) ? "PUD" : "PGD", 190 + .name = "PUD", 164 191 .bits = pte_bits, 165 192 .num = ARRAY_SIZE(pte_bits), 166 193 }, { /* pmd */ 167 - .name = (CONFIG_PGTABLE_LEVELS > 2) ? 
"PMD" : "PGD", 194 + .name = "PMD", 168 195 .bits = pte_bits, 169 196 .num = ARRAY_SIZE(pte_bits), 170 197 }, { /* pte */ ··· 228 255 static const char units[] = "KMGTPE"; 229 256 u64 prot = 0; 230 257 258 + /* check if the current level has been folded dynamically */ 259 + if ((level == 1 && mm_p4d_folded(st->mm)) || 260 + (level == 2 && mm_pud_folded(st->mm))) 261 + level = 0; 262 + 231 263 if (level >= 0) 232 264 prot = val & pg_level[level].mask; 233 265 ··· 294 316 st = (struct pg_state){ 295 317 .seq = s, 296 318 .marker = info->markers, 319 + .mm = info->mm, 297 320 .level = -1, 298 321 .ptdump = { 299 322 .note_page = note_page, ··· 318 339 pg_level[i].mask |= pg_level[i].bits[j].mask; 319 340 } 320 341 321 - static struct ptdump_info kernel_ptdump_info = { 342 + static struct ptdump_info kernel_ptdump_info __ro_after_init = { 322 343 .mm = &init_mm, 323 - .markers = address_markers, 324 - .base_addr = PAGE_OFFSET, 325 344 }; 326 345 327 346 void ptdump_check_wx(void) ··· 335 358 .ptdump = { 336 359 .note_page = note_page, 337 360 .range = (struct ptdump_range[]) { 338 - {PAGE_OFFSET, ~0UL}, 361 + {_PAGE_OFFSET(vabits_actual), ~0UL}, 339 362 {0, 0} 340 363 } 341 364 } ··· 352 375 353 376 static int __init ptdump_init(void) 354 377 { 355 - address_markers[PAGE_END_NR].start_address = PAGE_END; 378 + u64 page_offset = _PAGE_OFFSET(vabits_actual); 379 + u64 vmemmap_start = (u64)virt_to_page((void *)page_offset); 380 + struct addr_marker m[] = { 381 + { PAGE_OFFSET, "Linear Mapping start" }, 382 + { PAGE_END, "Linear Mapping end" }, 356 383 #if defined(CONFIG_KASAN_GENERIC) || defined(CONFIG_KASAN_SW_TAGS) 357 - address_markers[KASAN_START_NR].start_address = KASAN_SHADOW_START; 384 + { KASAN_SHADOW_START, "Kasan shadow start" }, 385 + { KASAN_SHADOW_END, "Kasan shadow end" }, 358 386 #endif 387 + { MODULES_VADDR, "Modules start" }, 388 + { MODULES_END, "Modules end" }, 389 + { VMALLOC_START, "vmalloc() area" }, 390 + { VMALLOC_END, "vmalloc() end" }, 391 + { 
vmemmap_start, "vmemmap start" }, 392 + { VMEMMAP_END, "vmemmap end" }, 393 + { PCI_IO_START, "PCI I/O start" }, 394 + { PCI_IO_END, "PCI I/O end" }, 395 + { FIXADDR_TOT_START, "Fixmap start" }, 396 + { FIXADDR_TOP, "Fixmap end" }, 397 + { -1, NULL }, 398 + }; 399 + static struct addr_marker address_markers[ARRAY_SIZE(m)] __ro_after_init; 400 + 401 + kernel_ptdump_info.markers = memcpy(address_markers, m, sizeof(m)); 402 + kernel_ptdump_info.base_addr = page_offset; 403 + 359 404 ptdump_initialize(); 360 405 ptdump_debugfs_register(&kernel_ptdump_info, "kernel_page_tables"); 361 406 return 0;
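ptdump_init() now builds the marker table in a local array, because several bounds depend on vabits_actual and are only known at runtime, and then copies it into a static array. Since memcpy() returns its destination pointer, the result can be assigned directly to the markers field. A user-space sketch of the same pattern follows; struct addr_marker is redeclared here for illustration, init_markers() is a hypothetical name, and the kernel's __ro_after_init annotation is omitted.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

struct addr_marker {
	uint64_t start_address;
	const char *name;
};

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

static const struct addr_marker *init_markers(uint64_t page_offset, uint64_t page_end)
{
	/* Runtime values are allowed in the initialiser of an automatic array. */
	const struct addr_marker m[] = {
		{ page_offset, "Linear Mapping start" },
		{ page_end, "Linear Mapping end" },
		{ UINT64_MAX, NULL },	/* terminator, written as { -1, NULL } in the diff */
	};
	/* Static storage outlives the call; its size is taken from the local array. */
	static struct addr_marker address_markers[ARRAY_SIZE(m)];

	assert(page_offset < page_end);
	return memcpy(address_markers, m, sizeof(m));
}
```

The returned pointer stays valid after init_markers() returns, which is the property the diff relies on when it stores the result in kernel_ptdump_info.markers.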
+2
arch/arm64/tools/cpucaps
···
 HAS_ECV_CNTPOFF
 HAS_EPAN
 HAS_EVT
+HAS_FPMR
 HAS_FGT
 HAS_FPSIMD
 HAS_GENERIC_AUTH
···
 HAS_TCR2
 HAS_TIDCP1
 HAS_TLB_RANGE
+HAS_VA52
 HAS_VIRT_HOST_EXTN
 HAS_WFXT
 HW_DBM
+38 -5
arch/arm64/tools/sysreg
···
 	0b0110 PMUv3p5
 	0b0111 PMUv3p7
 	0b1000 PMUv3p8
+	0b1001 PMUv3p9
 	0b1111 IMPDEF
 EndEnum
 Enum 23:20 MProfDbg
···
 	0b1000 Debugv8p2
 	0b1001 Debugv8p4
 	0b1010 Debugv8p8
+	0b1011 Debugv8p9
 EndEnum
 EndSysreg

···
 	0b0010 V1P1
 	0b0011 V1P2
 	0b0100 V1P3
+	0b0101 V1P4
 EndEnum
 Field 31:28 CTX_CMPs
 Res0 27:24
···
 	0b1000 V8P2
 	0b1001 V8P4
 	0b1010 V8P8
+	0b1011 V8P9
 EndEnum
 EndSysreg

 Sysreg ID_AA64DFR1_EL1 3 0 0 5 1
-Res0 63:0
+Field 63:56 ABL_CMPs
+UnsignedEnum 55:52 DPFZS
+	0b0000 IGNR
+	0b0001 FRZN
+EndEnum
+UnsignedEnum 51:48 EBEP
+	0b0000 NI
+	0b0001 IMP
+EndEnum
+UnsignedEnum 47:44 ITE
+	0b0000 NI
+	0b0001 IMP
+EndEnum
+UnsignedEnum 43:40 ABLE
+	0b0000 NI
+	0b0001 IMP
+EndEnum
+UnsignedEnum 39:36 PMICNTR
+	0b0000 NI
+	0b0001 IMP
+EndEnum
+UnsignedEnum 35:32 SPMU
+	0b0000 NI
+	0b0001 IMP
+	0b0010 IMP_SPMZR
+EndEnum
+Field 31:24 CTX_CMPs
+Field 23:16 WRPs
+Field 15:8 BRPs
+Field 7:0 SYSPMUID
 EndSysreg

 Sysreg ID_AA64AFR0_EL1 3 0 0 5 4
···
 	0b0010 IMP
 	0b0011 52_BIT
 EndEnum
-Enum 31:28 TGRAN4
+SignedEnum 31:28 TGRAN4
 	0b0000 IMP
 	0b0001 52_BIT
 	0b1111 NI
 EndEnum
-Enum 27:24 TGRAN64
+SignedEnum 27:24 TGRAN64
 	0b0000 IMP
 	0b1111 NI
 EndEnum
-Enum 23:20 TGRAN16
+UnsignedEnum 23:20 TGRAN16
 	0b0000 NI
 	0b0001 IMP
 	0b0010 52_BIT
···
 	0b0000 32
 	0b0001 64
 EndEnum
-Enum 19:16 VARange
+UnsignedEnum 19:16 VARange
 	0b0000 48
 	0b0001 52
 EndEnum
+1
arch/loongarch/Makefile
···
 KBUILD_CFLAGS_MODULE += -fplt -Wa,-mla-global-with-abs,-mla-local-with-abs
 endif

+KBUILD_RUSTFLAGS += --target=$(objtree)/scripts/target.json
 KBUILD_RUSTFLAGS_MODULE += -Crelocation-model=pic

 ifeq ($(CONFIG_RELOCATABLE),y)
+1
arch/x86/Makefile
···
 # https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53383
 #
 KBUILD_CFLAGS += -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -mno-avx
+KBUILD_RUSTFLAGS += --target=$(objtree)/scripts/target.json
 KBUILD_RUSTFLAGS += -Ctarget-feature=-sse,-sse2,-sse3,-ssse3,-sse4.1,-sse4.2,-avx,-avx2

 ifeq ($(CONFIG_X86_KERNEL_IBT),y)
+9
drivers/perf/Kconfig
···
 	  full perf feature support i.e. counter overflow, privilege mode
 	  filtering, counter configuration.

+config STARFIVE_STARLINK_PMU
+	depends on ARCH_STARFIVE || (COMPILE_TEST && 64BIT)
+	bool "StarFive StarLink PMU"
+	help
+	  Provide support for StarLink Performance Monitor Unit.
+	  StarLink Performance Monitor Unit integrates one or more cores with
+	  an L3 memory system. The L3 cache events are added into perf event
+	  subsystem, allowing monitoring of various L3 cache perf events.
+
 config ARM_PMU_ACPI
 	depends on ARM_PMU && ACPI
 	def_bool y
+1
drivers/perf/Makefile
···
 obj-$(CONFIG_RISCV_PMU) += riscv_pmu.o
 obj-$(CONFIG_RISCV_PMU_LEGACY) += riscv_pmu_legacy.o
 obj-$(CONFIG_RISCV_PMU_SBI) += riscv_pmu_sbi.o
+obj-$(CONFIG_STARFIVE_STARLINK_PMU) += starfive_starlink_pmu.o
 obj-$(CONFIG_THUNDERX2_PMU) += thunderx2_pmu.o
 obj-$(CONFIG_XGENE_PMU) += xgene_pmu.o
 obj-$(CONFIG_ARM_SPE_PMU) += arm_spe_pmu.o
+2 -4
drivers/perf/alibaba_uncore_drw_pmu.c
···
 	return ret;
 }

-static int ali_drw_pmu_remove(struct platform_device *pdev)
+static void ali_drw_pmu_remove(struct platform_device *pdev)
 {
 	struct ali_drw_pmu *drw_pmu = platform_get_drvdata(pdev);

···

 	ali_drw_pmu_uninit_irq(drw_pmu);
 	perf_pmu_unregister(&drw_pmu->pmu);
-
-	return 0;
 }

 static int ali_drw_pmu_offline_cpu(unsigned int cpu, struct hlist_node *node)
···
 		.acpi_match_table = ali_drw_acpi_match,
 	},
 	.probe = ali_drw_pmu_probe,
-	.remove = ali_drw_pmu_remove,
+	.remove_new = ali_drw_pmu_remove,
 };

 static int __init ali_drw_pmu_init(void)
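The .remove_new() conversion repeated across drivers/perf/ has the same shape each time: the callback loses its int return type, the trailing "return 0" disappears, and the driver structure points at the callback through .remove_new instead of .remove. The sketch below is a user-space model of that shape only. Every type and function in it (fake_device, fake_driver, fake_unbind, demo_pmu_remove) is hypothetical, and the dispatch order in fake_unbind() is an assumption for illustration, not the driver core's actual code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct platform_device / struct platform_driver. */
struct fake_device {
	int registered;
};

struct fake_driver {
	int (*remove)(struct fake_device *dev);		/* legacy callback, returns int */
	void (*remove_new)(struct fake_device *dev);	/* converted callback, returns void */
};

/* Converted callback: tear down, with nothing to return. */
static void demo_pmu_remove(struct fake_device *dev)
{
	dev->registered = 0;
}

/* Assumed dispatch: prefer the void callback, fall back to the legacy one. */
static void fake_unbind(const struct fake_driver *drv, struct fake_device *dev)
{
	assert(drv != NULL && dev != NULL);

	if (drv->remove_new)
		drv->remove_new(dev);
	else if (drv->remove)
		(void)drv->remove(dev);
}
```

A driver converted this way would be declared as `struct fake_driver drv = { .remove_new = demo_pmu_remove };`, matching the designated-initialiser change in each hunk of this series.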
+2 -4
drivers/perf/amlogic/meson_g12_ddr_pmu.c
···
 	return meson_ddr_pmu_create(pdev);
 }

-static int g12_ddr_pmu_remove(struct platform_device *pdev)
+static void g12_ddr_pmu_remove(struct platform_device *pdev)
 {
 	meson_ddr_pmu_remove(pdev);
-
-	return 0;
 }

 static const struct of_device_id meson_ddr_pmu_dt_match[] = {
···

 static struct platform_driver g12_ddr_pmu_driver = {
 	.probe = g12_ddr_pmu_probe,
-	.remove = g12_ddr_pmu_remove,
+	.remove_new = g12_ddr_pmu_remove,

 	.driver = {
 		.name = "meson-g12-ddr-pmu",
+3 -5
drivers/perf/arm-cci.c
···
 	return ret;
 }

-static int cci_pmu_remove(struct platform_device *pdev)
+static void cci_pmu_remove(struct platform_device *pdev)
 {
 	if (!g_cci_pmu)
-		return 0;
+		return;

 	cpuhp_remove_state(CPUHP_AP_PERF_ARM_CCI_ONLINE);
 	perf_pmu_unregister(&g_cci_pmu->pmu);
 	g_cci_pmu = NULL;
-
-	return 0;
 }

 static struct platform_driver cci_pmu_driver = {
···
 		.suppress_bind_attrs = true,
 	},
 	.probe = cci_pmu_probe,
-	.remove = cci_pmu_remove,
+	.remove_new = cci_pmu_remove,
 };

 module_platform_driver(cci_pmu_driver);
+2 -4
drivers/perf/arm-ccn.c
···
 	return arm_ccn_pmu_init(ccn);
 }

-static int arm_ccn_remove(struct platform_device *pdev)
+static void arm_ccn_remove(struct platform_device *pdev)
 {
 	struct arm_ccn *ccn = platform_get_drvdata(pdev);

 	arm_ccn_pmu_cleanup(ccn);
-
-	return 0;
 }

 static const struct of_device_id arm_ccn_match[] = {
···
 		.suppress_bind_attrs = true,
 	},
 	.probe = arm_ccn_probe,
-	.remove = arm_ccn_remove,
+	.remove_new = arm_ccn_remove,
 };

 static int __init arm_ccn_init(void)
+7 -7
drivers/perf/arm-cmn.c
···

 	for (dn = cmn->dns; dn->type; dn++) {
 		struct arm_cmn_nodeid nid = arm_cmn_nid(cmn, dn->id);
+		int pad = dn->logid < 10;

 		if (dn->type == CMN_TYPE_XP)
 			continue;
···
 		if (nid.x != x || nid.y != y || nid.port != p || nid.dev != d)
 			continue;

-		seq_printf(s, " #%-2d |", dn->logid);
+		seq_printf(s, " %*c#%-*d |", pad + 1, ' ', 3 - pad, dn->logid);
 		return;
 	}
 	seq_puts(s, " |");
···

 	seq_puts(s, " X");
 	for (x = 0; x < cmn->mesh_x; x++)
-		seq_printf(s, " %d ", x);
+		seq_printf(s, " %-2d ", x);
 	seq_puts(s, "\nY P D+");
 	y = cmn->mesh_y;
 	while (y--) {
···
 		for (x = 0; x < cmn->mesh_x; x++)
 			seq_puts(s, "--------+");

-		seq_printf(s, "\n%d |", y);
+		seq_printf(s, "\n%-2d |", y);
 		for (x = 0; x < cmn->mesh_x; x++) {
 			struct arm_cmn_node *xp = cmn->xps + xp_base + x;

 			for (p = 0; p < CMN_MAX_PORTS; p++)
 				port[p][x] = arm_cmn_device_connect_info(cmn, xp, p);
-			seq_printf(s, " XP #%-2d |", xp_base + x);
+			seq_printf(s, " XP #%-3d|", xp_base + x);
 		}

 		seq_puts(s, "\n |");
···
 	return err;
 }

-static int arm_cmn_remove(struct platform_device *pdev)
+static void arm_cmn_remove(struct platform_device *pdev)
 {
 	struct arm_cmn *cmn = platform_get_drvdata(pdev);

···
 	perf_pmu_unregister(&cmn->pmu);
 	cpuhp_state_remove_instance_nocalls(arm_cmn_hp_state, &cmn->cpuhp_node);
 	debugfs_remove(cmn->debug);
-	return 0;
 }

 #ifdef CONFIG_OF
···
 		.acpi_match_table = ACPI_PTR(arm_cmn_acpi_match),
 	},
 	.probe = arm_cmn_probe,
-	.remove = arm_cmn_remove,
+	.remove_new = arm_cmn_remove,
 };

 static int __init arm_cmn_init(void)
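The arm-cmn debugfs change keeps each table cell at a fixed width by trading leading padding against field width: `%*c` prints a space padded to a runtime width, and `%-*d` left-justifies the ID in a runtime width. The sketch below reproduces that call with snprintf() in user space. The format string is copied as it appears in this listing, where runs of spaces inside string literals may have been collapsed by extraction, so the exact cell width in the kernel source may differ; format_logid_cell() is a hypothetical helper name.

```c
#include <assert.h>
#include <stdio.h>

/* Formats one node cell; returns the number of characters written (excluding the NUL). */
static int format_logid_cell(char *buf, size_t len, int logid)
{
	int pad = logid < 10;	/* 1 for single-digit IDs, 0 otherwise */

	assert(buf != NULL && logid >= 0);
	/* Narrow IDs get one more leading space and one less column of field width. */
	return snprintf(buf, len, " %*c#%-*d |", pad + 1, ' ', 3 - pad, logid);
}
```

With this format string, IDs 5, 12 and 123 all produce an 8-character cell; the single-digit case is shifted one column to the right so that short IDs sit closer to the centre of the cell.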
+88 -71
drivers/perf/arm_cspmu/arm_cspmu.c
··· 27 27 #include <linux/io-64-nonatomic-lo-hi.h> 28 28 #include <linux/module.h> 29 29 #include <linux/mutex.h> 30 + #include <linux/of.h> 30 31 #include <linux/perf_event.h> 31 32 #include <linux/platform_device.h> 32 33 ··· 101 100 #define ARM_CSPMU_ACTIVE_CPU_MASK 0x0 102 101 #define ARM_CSPMU_ASSOCIATED_CPU_MASK 0x1 103 102 104 - /* Check and use default if implementer doesn't provide attribute callback */ 105 - #define CHECK_DEFAULT_IMPL_OPS(ops, callback) \ 106 - do { \ 107 - if (!ops->callback) \ 108 - ops->callback = arm_cspmu_ ## callback; \ 109 - } while (0) 110 - 111 103 /* 112 104 * Maximum poll count for reading counter value using high-low-high sequence. 113 105 */ ··· 115 121 116 122 static struct acpi_apmt_node *arm_cspmu_apmt_node(struct device *dev) 117 123 { 118 - return *(struct acpi_apmt_node **)dev_get_platdata(dev); 124 + struct acpi_apmt_node **ptr = dev_get_platdata(dev); 125 + 126 + return ptr ? *ptr : NULL; 119 127 } 120 128 121 129 /* ··· 313 317 314 318 dev = cspmu->dev; 315 319 apmt_node = arm_cspmu_apmt_node(dev); 320 + if (!apmt_node) 321 + return devm_kasprintf(dev, GFP_KERNEL, PMUNAME "_%u", 322 + atomic_fetch_inc(&pmu_idx[0])); 323 + 316 324 pmu_type = apmt_node->type; 317 325 318 326 if (pmu_type >= ACPI_APMT_NODE_TYPE_COUNT) { ··· 408 408 return NULL; 409 409 } 410 410 411 + #define DEFAULT_IMPL_OP(name) .name = arm_cspmu_##name 412 + 411 413 static int arm_cspmu_init_impl_ops(struct arm_cspmu *cspmu) 412 414 { 413 415 int ret = 0; 414 - struct arm_cspmu_impl_ops *impl_ops = &cspmu->impl.ops; 415 416 struct acpi_apmt_node *apmt_node = arm_cspmu_apmt_node(cspmu->dev); 416 417 struct arm_cspmu_impl_match *match; 417 418 418 - /* 419 - * Get PMU implementer and product id from APMT node. 420 - * If APMT node doesn't have implementer/product id, try get it 421 - * from PMIIDR. 422 - */ 423 - cspmu->impl.pmiidr = 424 - (apmt_node->impl_id) ? 
apmt_node->impl_id :
-				       readl(cspmu->base0 + PMIIDR);
+	/* Start with a default PMU implementation */
+	cspmu->impl.module = THIS_MODULE;
+	cspmu->impl.pmiidr = readl(cspmu->base0 + PMIIDR);
+	cspmu->impl.ops = (struct arm_cspmu_impl_ops) {
+		DEFAULT_IMPL_OP(get_event_attrs),
+		DEFAULT_IMPL_OP(get_format_attrs),
+		DEFAULT_IMPL_OP(get_identifier),
+		DEFAULT_IMPL_OP(get_name),
+		DEFAULT_IMPL_OP(is_cycle_counter_event),
+		DEFAULT_IMPL_OP(event_type),
+		DEFAULT_IMPL_OP(event_filter),
+		DEFAULT_IMPL_OP(set_ev_filter),
+		DEFAULT_IMPL_OP(event_attr_is_visible),
+	};
+
+	/* Firmware may override implementer/product ID from PMIIDR */
+	if (apmt_node && apmt_node->impl_id)
+		cspmu->impl.pmiidr = apmt_node->impl_id;

 	/* Find implementer specific attribute ops. */
 	match = arm_cspmu_impl_match_get(cspmu->impl.pmiidr);
···
 		}

 		mutex_unlock(&arm_cspmu_lock);
+	}

-		if (ret)
-			return ret;
-	} else
-		cspmu->impl.module = THIS_MODULE;
-
-	/* Use default callbacks if implementer doesn't provide one. */
-	CHECK_DEFAULT_IMPL_OPS(impl_ops, get_event_attrs);
-	CHECK_DEFAULT_IMPL_OPS(impl_ops, get_format_attrs);
-	CHECK_DEFAULT_IMPL_OPS(impl_ops, get_identifier);
-	CHECK_DEFAULT_IMPL_OPS(impl_ops, get_name);
-	CHECK_DEFAULT_IMPL_OPS(impl_ops, is_cycle_counter_event);
-	CHECK_DEFAULT_IMPL_OPS(impl_ops, event_type);
-	CHECK_DEFAULT_IMPL_OPS(impl_ops, event_filter);
-	CHECK_DEFAULT_IMPL_OPS(impl_ops, event_attr_is_visible);
-	CHECK_DEFAULT_IMPL_OPS(impl_ops, set_ev_filter);
-
-	return 0;
+	return ret;
 }

 static struct attribute_group *
···
 	return format_group;
 }

-static struct attribute_group **
-arm_cspmu_alloc_attr_group(struct arm_cspmu *cspmu)
+static int arm_cspmu_alloc_attr_groups(struct arm_cspmu *cspmu)
 {
-	struct attribute_group **attr_groups = NULL;
-	struct device *dev = cspmu->dev;
+	const struct attribute_group **attr_groups = cspmu->attr_groups;
 	const struct arm_cspmu_impl_ops *impl_ops = &cspmu->impl.ops;

 	cspmu->identifier = impl_ops->get_identifier(cspmu);
 	cspmu->name = impl_ops->get_name(cspmu);

 	if (!cspmu->identifier || !cspmu->name)
-		return NULL;
-
-	attr_groups = devm_kcalloc(dev, 5, sizeof(struct attribute_group *),
-				   GFP_KERNEL);
-	if (!attr_groups)
-		return NULL;
+		return -ENOMEM;

 	attr_groups[0] = arm_cspmu_alloc_event_attr_group(cspmu);
 	attr_groups[1] = arm_cspmu_alloc_format_attr_group(cspmu);
···
 	attr_groups[3] = &arm_cspmu_cpumask_attr_group;

 	if (!attr_groups[0] || !attr_groups[1])
-		return NULL;
+		return -ENOMEM;

-	return attr_groups;
+	return 0;
 }

 static inline void arm_cspmu_reset_counters(struct arm_cspmu *cspmu)
 {
-	u32 pmcr = 0;
-
-	pmcr |= PMCR_P;
-	pmcr |= PMCR_C;
-	writel(pmcr, cspmu->base0 + PMCR);
+	writel(PMCR_C | PMCR_P, cspmu->base0 + PMCR);
 }

 static inline void arm_cspmu_start_counters(struct arm_cspmu *cspmu)
···
 	platform_set_drvdata(pdev, cspmu);

 	apmt_node = arm_cspmu_apmt_node(dev);
-	cspmu->has_atomic_dword = apmt_node->flags & ACPI_APMT_FLAGS_ATOMIC;
+	if (apmt_node) {
+		cspmu->has_atomic_dword = apmt_node->flags & ACPI_APMT_FLAGS_ATOMIC;
+	} else {
+		u32 width = 0;
+
+		device_property_read_u32(dev, "reg-io-width", &width);
+		cspmu->has_atomic_dword = (width == 8);
+	}

 	return cspmu;
 }
···
 		}
 	}

-	if (cpumask_empty(&cspmu->associated_cpus)) {
-		dev_dbg(cspmu->dev, "No cpu associated with the PMU\n");
-		return -ENODEV;
-	}
-
 	return 0;
 }
 #else
···
 }
 #endif

+static int arm_cspmu_of_get_cpus(struct arm_cspmu *cspmu)
+{
+	struct of_phandle_iterator it;
+	int ret, cpu;
+
+	of_for_each_phandle(&it, ret, dev_of_node(cspmu->dev), "cpus", NULL, 0) {
+		cpu = of_cpu_node_to_id(it.node);
+		if (cpu < 0)
+			continue;
+		cpumask_set_cpu(cpu, &cspmu->associated_cpus);
+	}
+	return ret == -ENOENT ? 0 : ret;
+}
+
 static int arm_cspmu_get_cpus(struct arm_cspmu *cspmu)
 {
-	return arm_cspmu_acpi_get_cpus(cspmu);
+	int ret = 0;
+
+	if (arm_cspmu_apmt_node(cspmu->dev))
+		ret = arm_cspmu_acpi_get_cpus(cspmu);
+	else if (device_property_present(cspmu->dev, "cpus"))
+		ret = arm_cspmu_of_get_cpus(cspmu);
+	else
+		cpumask_copy(&cspmu->associated_cpus, cpu_possible_mask);
+
+	if (!ret && cpumask_empty(&cspmu->associated_cpus)) {
+		dev_dbg(cspmu->dev, "No cpu associated with the PMU\n");
+		ret = -ENODEV;
+	}
+	return ret;
 }

 static int arm_cspmu_register_pmu(struct arm_cspmu *cspmu)
 {
 	int ret, capabilities;
-	struct attribute_group **attr_groups;

-	attr_groups = arm_cspmu_alloc_attr_group(cspmu);
-	if (!attr_groups)
-		return -ENOMEM;
+	ret = arm_cspmu_alloc_attr_groups(cspmu);
+	if (ret)
+		return ret;

 	ret = cpuhp_state_add_instance(arm_cspmu_cpuhp_state,
 				       &cspmu->cpuhp_node);
···
 		.start = arm_cspmu_start,
 		.stop = arm_cspmu_stop,
 		.read = arm_cspmu_read,
-		.attr_groups = (const struct attribute_group **)attr_groups,
+		.attr_groups = cspmu->attr_groups,
 		.capabilities = capabilities,
 	};

 	/* Hardware counter init */
-	arm_cspmu_stop_counters(cspmu);
 	arm_cspmu_reset_counters(cspmu);

 	ret = perf_pmu_register(&cspmu->pmu, cspmu->name, -1);
···
 	return ret;
 }

-static int arm_cspmu_device_remove(struct platform_device *pdev)
+static void arm_cspmu_device_remove(struct platform_device *pdev)
 {
 	struct arm_cspmu *cspmu = platform_get_drvdata(pdev);

 	perf_pmu_unregister(&cspmu->pmu);
 	cpuhp_state_remove_instance(arm_cspmu_cpuhp_state, &cspmu->cpuhp_node);
-
-	return 0;
 }

 static const struct platform_device_id arm_cspmu_id[] = {
···
 };
 MODULE_DEVICE_TABLE(platform, arm_cspmu_id);

+static const struct of_device_id arm_cspmu_of_match[] = {
+	{ .compatible = "arm,coresight-pmu" },
+	{}
+};
+MODULE_DEVICE_TABLE(of, arm_cspmu_of_match);
+
 static struct platform_driver arm_cspmu_driver = {
 	.driver = {
-			.name = DRVNAME,
-			.suppress_bind_attrs = true,
-		},
+		.name = DRVNAME,
+		.of_match_table = arm_cspmu_of_match,
+		.suppress_bind_attrs = true,
+	},
 	.probe = arm_cspmu_device_probe,
-	.remove = arm_cspmu_device_remove,
+	.remove_new = arm_cspmu_device_remove,
 	.id_table = arm_cspmu_id,
 };
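The arm_cspmu.c change above replaces "check each callback for NULL afterwards" with "populate every callback with a default first, then let the implementer overwrite what it needs". A minimal userspace sketch of that pattern follows. The names `demo_ops`, `DEMO_DEFAULT_OP`, `vendor_init_ops` and `demo_init_ops` are hypothetical stand-ins and not part of the driver; only the shape of the macro is modelled on `DEFAULT_IMPL_OP`.

```c
/*
 * Sketch only: a reduced, hypothetical ops table. The real driver uses
 * struct arm_cspmu_impl_ops with more callbacks.
 */
struct demo_ops {
	const char *(*get_name)(void);
	int (*event_type)(int config);
};

static const char *default_get_name(void) { return "default"; }
static int default_event_type(int config) { return config & 0xff; }
static const char *vendor_get_name(void) { return "vendor"; }

/* One designated initializer per callback, pointing at default_<name>. */
#define DEMO_DEFAULT_OP(name) .name = default_##name

/* A vendor hook overrides only the callbacks it cares about. */
static int vendor_init_ops(struct demo_ops *ops)
{
	ops->get_name = vendor_get_name;
	return 0;
}

/* Defaults first, overrides second: no callback is ever left NULL. */
static int demo_init_ops(struct demo_ops *ops, int have_vendor)
{
	*ops = (struct demo_ops) {
		DEMO_DEFAULT_OP(get_name),
		DEMO_DEFAULT_OP(event_type),
	};

	return have_vendor ? vendor_init_ops(ops) : 0;
}
```

With this ordering a vendor backend no longer has to assign NULL to the callbacks it does not implement, which is consistent with the lines removed from nvidia_cspmu.c in this merge.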
+1
drivers/perf/arm_cspmu/arm_cspmu.h
···
 	int cycle_counter_logical_idx;

 	struct arm_cspmu_hw_events hw_events;
+	const struct attribute_group *attr_groups[5];

 	struct arm_cspmu_impl impl;
 };
-6
drivers/perf/arm_cspmu/nvidia_cspmu.c
···
 	impl_ops->get_format_attrs = nv_cspmu_get_format_attrs;
 	impl_ops->get_name = nv_cspmu_get_name;

-	/* Set others to NULL to use default callback. */
-	impl_ops->event_type = NULL;
-	impl_ops->event_attr_is_visible = NULL;
-	impl_ops->get_identifier = NULL;
-	impl_ops->is_cycle_counter_event = NULL;
-
 	return 0;
 }
+2 -4
drivers/perf/arm_dmc620_pmu.c
···
 	return ret;
 }

-static int dmc620_pmu_device_remove(struct platform_device *pdev)
+static void dmc620_pmu_device_remove(struct platform_device *pdev)
 {
 	struct dmc620_pmu *dmc620_pmu = platform_get_drvdata(pdev);

···

 	/* perf will synchronise RCU before devres can free dmc620_pmu */
 	perf_pmu_unregister(&dmc620_pmu->pmu);
-
-	return 0;
 }

 static const struct acpi_device_id dmc620_acpi_match[] = {
···
 		.suppress_bind_attrs = true,
 	},
 	.probe = dmc620_pmu_device_probe,
-	.remove = dmc620_pmu_device_remove,
+	.remove_new = dmc620_pmu_device_remove,
 };

 static int __init dmc620_pmu_init(void)
+2 -4
drivers/perf/arm_dsu_pmu.c
···
 	return rc;
 }

-static int dsu_pmu_device_remove(struct platform_device *pdev)
+static void dsu_pmu_device_remove(struct platform_device *pdev)
 {
 	struct dsu_pmu *dsu_pmu = platform_get_drvdata(pdev);

 	perf_pmu_unregister(&dsu_pmu->pmu);
 	cpuhp_state_remove_instance(dsu_pmu_cpuhp_state, &dsu_pmu->cpuhp_node);
-
-	return 0;
 }

 static const struct of_device_id dsu_pmu_of_match[] = {
···
 		.suppress_bind_attrs = true,
 	},
 	.probe = dsu_pmu_device_probe,
-	.remove = dsu_pmu_device_remove,
+	.remove_new = dsu_pmu_device_remove,
 };

 static int dsu_pmu_cpu_online(unsigned int cpu, struct hlist_node *node)
+2 -4
drivers/perf/arm_smmuv3_pmu.c
···
 	return err;
 }

-static int smmu_pmu_remove(struct platform_device *pdev)
+static void smmu_pmu_remove(struct platform_device *pdev)
 {
 	struct smmu_pmu *smmu_pmu = platform_get_drvdata(pdev);

 	perf_pmu_unregister(&smmu_pmu->pmu);
 	cpuhp_state_remove_instance_nocalls(cpuhp_state_num, &smmu_pmu->node);
-
-	return 0;
 }

 static void smmu_pmu_shutdown(struct platform_device *pdev)
···
 		.suppress_bind_attrs = true,
 	},
 	.probe = smmu_pmu_probe,
-	.remove = smmu_pmu_remove,
+	.remove_new = smmu_pmu_remove,
 	.shutdown = smmu_pmu_shutdown,
 };
+2 -3
drivers/perf/arm_spe_pmu.c
···
 	return ret;
 }

-static int arm_spe_pmu_device_remove(struct platform_device *pdev)
+static void arm_spe_pmu_device_remove(struct platform_device *pdev)
 {
 	struct arm_spe_pmu *spe_pmu = platform_get_drvdata(pdev);

 	arm_spe_pmu_perf_destroy(spe_pmu);
 	arm_spe_pmu_dev_teardown(spe_pmu);
 	free_percpu(spe_pmu->handle);
-	return 0;
 }

 static struct platform_driver arm_spe_pmu_driver = {
···
 		.suppress_bind_attrs = true,
 	},
 	.probe = arm_spe_pmu_device_probe,
-	.remove = arm_spe_pmu_device_remove,
+	.remove_new = arm_spe_pmu_device_remove,
 };

 static int __init arm_spe_pmu_init(void)
+2 -3
drivers/perf/fsl_imx8_ddr_perf.c
···
 	return ret;
 }

-static int ddr_perf_remove(struct platform_device *pdev)
+static void ddr_perf_remove(struct platform_device *pdev)
 {
 	struct ddr_pmu *pmu = platform_get_drvdata(pdev);

···
 	perf_pmu_unregister(&pmu->pmu);

 	ida_free(&ddr_ida, pmu->id);
-	return 0;
 }

 static struct platform_driver imx_ddr_pmu_driver = {
···
 		.suppress_bind_attrs = true,
 	},
 	.probe = ddr_perf_probe,
-	.remove = ddr_perf_remove,
+	.remove_new = ddr_perf_remove,
 };

 module_platform_driver(imx_ddr_pmu_driver);
+2 -4
drivers/perf/fsl_imx9_ddr_perf.c
···
 	return ret;
 }

-static int ddr_perf_remove(struct platform_device *pdev)
+static void ddr_perf_remove(struct platform_device *pdev)
 {
 	struct ddr_pmu *pmu = platform_get_drvdata(pdev);

···
 	perf_pmu_unregister(&pmu->pmu);

 	ida_free(&ddr_ida, pmu->id);
-
-	return 0;
 }

 static struct platform_driver imx_ddr_pmu_driver = {
···
 		.suppress_bind_attrs = true,
 	},
 	.probe = ddr_perf_probe,
-	.remove = ddr_perf_remove,
+	.remove_new = ddr_perf_remove,
 };
 module_platform_driver(imx_ddr_pmu_driver);
+53 -49
drivers/perf/hisilicon/hisi_pcie_pmu.c
···
 	writeq_relaxed(val, pcie_pmu->base + offset);
 }

-static void hisi_pcie_pmu_config_filter(struct perf_event *event)
+static u64 hisi_pcie_pmu_get_event_ctrl_val(struct perf_event *event)
 {
-	struct hisi_pcie_pmu *pcie_pmu = to_pcie_pmu(event->pmu);
-	struct hw_perf_event *hwc = &event->hw;
 	u64 port, trig_len, thr_len, len_mode;
 	u64 reg = HISI_PCIE_INIT_SET;

···
 	else
 		reg |= FIELD_PREP(HISI_PCIE_LEN_M, HISI_PCIE_LEN_M_DEFAULT);

+	return reg;
+}
+
+static void hisi_pcie_pmu_config_event_ctrl(struct perf_event *event)
+{
+	struct hisi_pcie_pmu *pcie_pmu = to_pcie_pmu(event->pmu);
+	struct hw_perf_event *hwc = &event->hw;
+	u64 reg = hisi_pcie_pmu_get_event_ctrl_val(event);
+
 	hisi_pcie_pmu_writeq(pcie_pmu, HISI_PCIE_EVENT_CTRL, hwc->idx, reg);
 }

-static void hisi_pcie_pmu_clear_filter(struct perf_event *event)
+static void hisi_pcie_pmu_clear_event_ctrl(struct perf_event *event)
 {
 	struct hisi_pcie_pmu *pcie_pmu = to_pcie_pmu(event->pmu);
 	struct hw_perf_event *hwc = &event->hw;
···
 	if (hisi_pcie_get_trig_len(event) > HISI_PCIE_TRIG_MAX_VAL)
 		return false;

-	if (requester_id) {
-		if (!hisi_pcie_pmu_valid_requester_id(pcie_pmu, requester_id))
-			return false;
-	}
+	/* Need to explicitly set filter of "port" or "bdf" */
+	if (!hisi_pcie_get_port(event) &&
+	    !hisi_pcie_pmu_valid_requester_id(pcie_pmu, requester_id))
+		return false;

 	return true;
 }

+/*
+ * Check Whether two events share the same config. The same config means not
+ * only the event code, but also the filter settings of the two events are
+ * the same.
+ */
 static bool hisi_pcie_pmu_cmp_event(struct perf_event *target,
 				    struct perf_event *event)
 {
-	return hisi_pcie_get_real_event(target) == hisi_pcie_get_real_event(event);
+	return hisi_pcie_pmu_get_event_ctrl_val(target) ==
+	       hisi_pcie_pmu_get_event_ctrl_val(event);
 }

 static bool hisi_pcie_pmu_validate_event_group(struct perf_event *event)
···
 	return hisi_pcie_pmu_readq(pcie_pmu, event->hw.event_base, idx);
 }

-static int hisi_pcie_pmu_find_related_event(struct hisi_pcie_pmu *pcie_pmu,
-					    struct perf_event *event)
+/*
+ * Check all work events, if a relevant event is found then we return it
+ * first, otherwise return the first idle counter (need to reset).
+ */
+static int hisi_pcie_pmu_get_event_idx(struct hisi_pcie_pmu *pcie_pmu,
+				       struct perf_event *event)
 {
+	int first_idle = -EAGAIN;
 	struct perf_event *sibling;
 	int idx;

 	for (idx = 0; idx < HISI_PCIE_MAX_COUNTERS; idx++) {
 		sibling = pcie_pmu->hw_events[idx];
-		if (!sibling)
+		if (!sibling) {
+			if (first_idle == -EAGAIN)
+				first_idle = idx;
 			continue;
-
-		if (!hisi_pcie_pmu_cmp_event(sibling, event))
-			continue;
+		}

 		/* Related events must be used in group */
-		if (sibling->group_leader == event->group_leader)
-			return idx;
-		else
-			return -EINVAL;
-	}
-
-	return idx;
-}
-
-static int hisi_pcie_pmu_get_event_idx(struct hisi_pcie_pmu *pcie_pmu)
-{
-	int idx;
-
-	for (idx = 0; idx < HISI_PCIE_MAX_COUNTERS; idx++) {
-		if (!pcie_pmu->hw_events[idx])
+		if (hisi_pcie_pmu_cmp_event(sibling, event) &&
+		    sibling->group_leader == event->group_leader)
 			return idx;
 	}

-	return -EINVAL;
+	return first_idle;
 }

 static void hisi_pcie_pmu_event_update(struct perf_event *event)
···
 	WARN_ON_ONCE(!(hwc->state & PERF_HES_UPTODATE));
 	hwc->state = 0;

-	hisi_pcie_pmu_config_filter(event);
+	hisi_pcie_pmu_config_event_ctrl(event);
 	hisi_pcie_pmu_enable_counter(pcie_pmu, hwc);
 	hisi_pcie_pmu_enable_int(pcie_pmu, hwc);
 	hisi_pcie_pmu_set_period(event);
···
 	hisi_pcie_pmu_event_update(event);
 	hisi_pcie_pmu_disable_int(pcie_pmu, hwc);
 	hisi_pcie_pmu_disable_counter(pcie_pmu, hwc);
-	hisi_pcie_pmu_clear_filter(event);
+	hisi_pcie_pmu_clear_event_ctrl(event);
 	WARN_ON_ONCE(hwc->state & PERF_HES_STOPPED);
 	hwc->state |= PERF_HES_STOPPED;

···

 	hwc->state = PERF_HES_STOPPED | PERF_HES_UPTODATE;

-	/* Check all working events to find a related event. */
-	idx = hisi_pcie_pmu_find_related_event(pcie_pmu, event);
-	if (idx < 0)
-		return idx;
-
-	/* Current event shares an enabled counter with the related event */
-	if (idx < HISI_PCIE_MAX_COUNTERS) {
-		hwc->idx = idx;
-		goto start_count;
-	}
-
-	idx = hisi_pcie_pmu_get_event_idx(pcie_pmu);
+	idx = hisi_pcie_pmu_get_event_idx(pcie_pmu, event);
 	if (idx < 0)
 		return idx;

 	hwc->idx = idx;
-	pcie_pmu->hw_events[idx] = event;
-	/* Reset Counter to avoid previous statistic interference. */
-	hisi_pcie_pmu_reset_counter(pcie_pmu, idx);

-start_count:
+	/* No enabled counter found with related event, reset it */
+	if (!pcie_pmu->hw_events[idx]) {
+		hisi_pcie_pmu_reset_counter(pcie_pmu, idx);
+		pcie_pmu->hw_events[idx] = event;
+	}
+
 	if (flags & PERF_EF_START)
 		hisi_pcie_pmu_start(event, PERF_EF_RELOAD);

···
 	HISI_PCIE_PMU_EVENT_ATTR(rx_mrd_cnt, 0x10210),
 	HISI_PCIE_PMU_EVENT_ATTR(tx_mrd_latency, 0x0011),
 	HISI_PCIE_PMU_EVENT_ATTR(tx_mrd_cnt, 0x10011),
+	HISI_PCIE_PMU_EVENT_ATTR(rx_mwr_flux, 0x0104),
+	HISI_PCIE_PMU_EVENT_ATTR(rx_mwr_time, 0x10104),
 	HISI_PCIE_PMU_EVENT_ATTR(rx_mrd_flux, 0x0804),
 	HISI_PCIE_PMU_EVENT_ATTR(rx_mrd_time, 0x10804),
+	HISI_PCIE_PMU_EVENT_ATTR(rx_cpl_flux, 0x2004),
+	HISI_PCIE_PMU_EVENT_ATTR(rx_cpl_time, 0x12004),
+	HISI_PCIE_PMU_EVENT_ATTR(tx_mwr_flux, 0x0105),
+	HISI_PCIE_PMU_EVENT_ATTR(tx_mwr_time, 0x10105),
 	HISI_PCIE_PMU_EVENT_ATTR(tx_mrd_flux, 0x0405),
 	HISI_PCIE_PMU_EVENT_ATTR(tx_mrd_time, 0x10405),
+	HISI_PCIE_PMU_EVENT_ATTR(tx_cpl_flux, 0x1005),
+	HISI_PCIE_PMU_EVENT_ATTR(tx_cpl_time, 0x11005),
 	NULL
 };
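The hisi_pcie_pmu.c hunks fold the old "find related event" and "find free counter" passes into a single scan: return the counter of an event with identical control value in the same group if there is one, otherwise the first idle slot, otherwise `-EAGAIN`. A reduced userspace sketch of that scan is below. `demo_event`, `DEMO_MAX_COUNTERS` and `demo_get_event_idx` are hypothetical names; the single `ctrl_val` field is an assumed stand-in for what `hisi_pcie_pmu_get_event_ctrl_val()` computes from the event code and filters.

```c
#include <errno.h>
#include <stddef.h>

#define DEMO_MAX_COUNTERS 8

/* Hypothetical, reduced event: one value summarising event code + filters. */
struct demo_event {
	unsigned long long ctrl_val;
	const struct demo_event *group_leader;
};

/*
 * Single pass over the counter slots:
 *  - remember the first empty slot,
 *  - return immediately if an occupied slot holds an event with the same
 *    control value and the same group leader (the counter can be shared),
 *  - otherwise fall back to the remembered empty slot, or -EAGAIN if none.
 */
static int demo_get_event_idx(const struct demo_event *hw_events[DEMO_MAX_COUNTERS],
			      const struct demo_event *event)
{
	int first_idle = -EAGAIN;
	int idx;

	for (idx = 0; idx < DEMO_MAX_COUNTERS; idx++) {
		const struct demo_event *sibling = hw_events[idx];

		if (!sibling) {
			if (first_idle == -EAGAIN)
				first_idle = idx;
			continue;
		}

		if (sibling->ctrl_val == event->ctrl_val &&
		    sibling->group_leader == event->group_leader)
			return idx;
	}

	return first_idle;
}
```

The caller can tell the two successful outcomes apart by checking whether the returned slot is already occupied, which is what the new `if (!pcie_pmu->hw_events[idx])` test in the add path does before resetting the counter.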
+2 -3
drivers/perf/hisilicon/hisi_uncore_cpa_pmu.c
···
 	return ret;
 }

-static int hisi_cpa_pmu_remove(struct platform_device *pdev)
+static void hisi_cpa_pmu_remove(struct platform_device *pdev)
 {
 	struct hisi_pmu *cpa_pmu = platform_get_drvdata(pdev);

···
 	cpuhp_state_remove_instance_nocalls(CPUHP_AP_PERF_ARM_HISI_CPA_ONLINE,
 					    &cpa_pmu->node);
 	hisi_cpa_pmu_enable_pm(cpa_pmu);
-	return 0;
 }

 static struct platform_driver hisi_cpa_pmu_driver = {
···
 		.suppress_bind_attrs = true,
 	},
 	.probe = hisi_cpa_pmu_probe,
-	.remove = hisi_cpa_pmu_remove,
+	.remove_new = hisi_cpa_pmu_remove,
 };

 static int __init hisi_cpa_pmu_module_init(void)
+2 -3
drivers/perf/hisilicon/hisi_uncore_ddrc_pmu.c
···
 	return ret;
 }

-static int hisi_ddrc_pmu_remove(struct platform_device *pdev)
+static void hisi_ddrc_pmu_remove(struct platform_device *pdev)
 {
 	struct hisi_pmu *ddrc_pmu = platform_get_drvdata(pdev);

 	perf_pmu_unregister(&ddrc_pmu->pmu);
 	cpuhp_state_remove_instance_nocalls(CPUHP_AP_PERF_ARM_HISI_DDRC_ONLINE,
 					    &ddrc_pmu->node);
-	return 0;
 }

 static struct platform_driver hisi_ddrc_pmu_driver = {
···
 		.suppress_bind_attrs = true,
 	},
 	.probe = hisi_ddrc_pmu_probe,
-	.remove = hisi_ddrc_pmu_remove,
+	.remove_new = hisi_ddrc_pmu_remove,
 };

 static int __init hisi_ddrc_pmu_module_init(void)
+2 -3
drivers/perf/hisilicon/hisi_uncore_hha_pmu.c
···
 	return ret;
 }

-static int hisi_hha_pmu_remove(struct platform_device *pdev)
+static void hisi_hha_pmu_remove(struct platform_device *pdev)
 {
 	struct hisi_pmu *hha_pmu = platform_get_drvdata(pdev);

 	perf_pmu_unregister(&hha_pmu->pmu);
 	cpuhp_state_remove_instance_nocalls(CPUHP_AP_PERF_ARM_HISI_HHA_ONLINE,
 					    &hha_pmu->node);
-	return 0;
 }

 static struct platform_driver hisi_hha_pmu_driver = {
···
 		.suppress_bind_attrs = true,
 	},
 	.probe = hisi_hha_pmu_probe,
-	.remove = hisi_hha_pmu_remove,
+	.remove_new = hisi_hha_pmu_remove,
 };

 static int __init hisi_hha_pmu_module_init(void)
+2 -3
drivers/perf/hisilicon/hisi_uncore_l3c_pmu.c
···
 	return ret;
 }

-static int hisi_l3c_pmu_remove(struct platform_device *pdev)
+static void hisi_l3c_pmu_remove(struct platform_device *pdev)
 {
 	struct hisi_pmu *l3c_pmu = platform_get_drvdata(pdev);

 	perf_pmu_unregister(&l3c_pmu->pmu);
 	cpuhp_state_remove_instance_nocalls(CPUHP_AP_PERF_ARM_HISI_L3_ONLINE,
 					    &l3c_pmu->node);
-	return 0;
 }

 static struct platform_driver hisi_l3c_pmu_driver = {
···
 		.suppress_bind_attrs = true,
 	},
 	.probe = hisi_l3c_pmu_probe,
-	.remove = hisi_l3c_pmu_remove,
+	.remove_new = hisi_l3c_pmu_remove,
 };

 static int __init hisi_l3c_pmu_module_init(void)
+2 -3
drivers/perf/hisilicon/hisi_uncore_pa_pmu.c
···
 	return ret;
 }

-static int hisi_pa_pmu_remove(struct platform_device *pdev)
+static void hisi_pa_pmu_remove(struct platform_device *pdev)
 {
 	struct hisi_pmu *pa_pmu = platform_get_drvdata(pdev);

 	perf_pmu_unregister(&pa_pmu->pmu);
 	cpuhp_state_remove_instance_nocalls(CPUHP_AP_PERF_ARM_HISI_PA_ONLINE,
 					    &pa_pmu->node);
-	return 0;
 }

 static const struct acpi_device_id hisi_pa_pmu_acpi_match[] = {
···
 		.suppress_bind_attrs = true,
 	},
 	.probe = hisi_pa_pmu_probe,
-	.remove = hisi_pa_pmu_remove,
+	.remove_new = hisi_pa_pmu_remove,
 };

 static int __init hisi_pa_pmu_module_init(void)
+2 -3
drivers/perf/hisilicon/hisi_uncore_sllc_pmu.c
···
 	return ret;
 }

-static int hisi_sllc_pmu_remove(struct platform_device *pdev)
+static void hisi_sllc_pmu_remove(struct platform_device *pdev)
 {
 	struct hisi_pmu *sllc_pmu = platform_get_drvdata(pdev);

 	perf_pmu_unregister(&sllc_pmu->pmu);
 	cpuhp_state_remove_instance_nocalls(CPUHP_AP_PERF_ARM_HISI_SLLC_ONLINE,
 					    &sllc_pmu->node);
-	return 0;
 }

 static struct platform_driver hisi_sllc_pmu_driver = {
···
 		.suppress_bind_attrs = true,
 	},
 	.probe = hisi_sllc_pmu_probe,
-	.remove = hisi_sllc_pmu_remove,
+	.remove_new = hisi_sllc_pmu_remove,
 };

 static int __init hisi_sllc_pmu_module_init(void)
+41 -1
drivers/perf/hisilicon/hisi_uncore_uc_pmu.c
···
 	return readq(uc_pmu->base + HISI_UC_CNTR_REGn(hwc->idx));
 }

-static void hisi_uc_pmu_write_counter(struct hisi_pmu *uc_pmu,
+static bool hisi_uc_pmu_get_glb_en_state(struct hisi_pmu *uc_pmu)
+{
+	u32 val;
+
+	val = readl(uc_pmu->base + HISI_UC_EVENT_CTRL_REG);
+	return !!FIELD_GET(HISI_UC_EVENT_GLB_EN, val);
+}
+
+static void hisi_uc_pmu_write_counter_normal(struct hisi_pmu *uc_pmu,
 				      struct hw_perf_event *hwc, u64 val)
 {
 	writeq(val, uc_pmu->base + HISI_UC_CNTR_REGn(hwc->idx));
+}
+
+static void hisi_uc_pmu_write_counter_quirk_v2(struct hisi_pmu *uc_pmu,
+				      struct hw_perf_event *hwc, u64 val)
+{
+	hisi_uc_pmu_start_counters(uc_pmu);
+	hisi_uc_pmu_write_counter_normal(uc_pmu, hwc, val);
+	hisi_uc_pmu_stop_counters(uc_pmu);
+}
+
+static void hisi_uc_pmu_write_counter(struct hisi_pmu *uc_pmu,
+				      struct hw_perf_event *hwc, u64 val)
+{
+	bool enable = hisi_uc_pmu_get_glb_en_state(uc_pmu);
+	bool erratum = uc_pmu->identifier == HISI_PMU_V2;
+
+	/*
+	 * HiSilicon UC PMU v2 suffers the erratum 162700402 that the
+	 * PMU counter cannot be set due to the lack of clock under power
+	 * saving mode. This will lead to error or inaccurate counts.
+	 * The clock can be enabled by the PMU global enabling control.
+	 * The irq handler and pmu_start() will call the function to set
+	 * period. If the function under irq context, the PMU has been
+	 * enabled therefore we set counter directly. Other situations
+	 * the PMU is disabled, we need to enable it to turn on the
+	 * counter clock to set period, and then restore PMU enable
+	 * status, the counter can hold its value without a clock.
+	 */
+	if (enable || !erratum)
+		hisi_uc_pmu_write_counter_normal(uc_pmu, hwc, val);
+	else
+		hisi_uc_pmu_write_counter_quirk_v2(uc_pmu, hwc, val);
 }

 static void hisi_uc_pmu_enable_counter_int(struct hisi_pmu *uc_pmu,
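The comment in the hisi_uncore_uc_pmu.c hunk describes the workaround for erratum 162700402: on affected parts a counter write only takes effect while the global enable (and therefore the counter clock) is on, so the driver briefly enables the PMU around the write. The following userspace model is a sketch of that control flow only. `demo_pmu`, `demo_hw_write` and `demo_write_counter` are hypothetical, and the way the model drops gated writes is an assumption made for illustration, not a description of the real register behaviour.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a PMU with one counter and a global enable bit. */
struct demo_pmu {
	bool glb_en;       /* global enable, which also ungates the counter clock */
	bool has_erratum;  /* true for the affected hardware revision */
	uint64_t counter;
};

/* Modelled hardware write: affected parts ignore it while the clock is gated. */
static void demo_hw_write(struct demo_pmu *pmu, uint64_t val)
{
	if (pmu->has_erratum && !pmu->glb_en)
		return;
	pmu->counter = val;
}

/*
 * Driver-side write: go straight to the hardware when the PMU is already
 * enabled or the part is unaffected; otherwise enable, write, then restore
 * the disabled state. The counter keeps its value once the clock is off.
 */
static void demo_write_counter(struct demo_pmu *pmu, uint64_t val)
{
	if (pmu->glb_en || !pmu->has_erratum) {
		demo_hw_write(pmu, val);
		return;
	}

	pmu->glb_en = true;
	demo_hw_write(pmu, val);
	pmu->glb_en = false;
}
```

In the model, a direct `demo_hw_write()` on a disabled, affected PMU leaves the counter unchanged, while `demo_write_counter()` lands the value and leaves the PMU disabled afterwards.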
+2 -3
drivers/perf/marvell_cn10k_ddr_pmu.c
··· 697 697 return ret; 698 698 } 699 699 700 - static int cn10k_ddr_perf_remove(struct platform_device *pdev) 700 + static void cn10k_ddr_perf_remove(struct platform_device *pdev) 701 701 { 702 702 struct cn10k_ddr_pmu *ddr_pmu = platform_get_drvdata(pdev); 703 703 ··· 706 706 &ddr_pmu->node); 707 707 708 708 perf_pmu_unregister(&ddr_pmu->pmu); 709 - return 0; 710 709 } 711 710 712 711 #ifdef CONFIG_OF ··· 732 733 .suppress_bind_attrs = true, 733 734 }, 734 735 .probe = cn10k_ddr_perf_probe, 735 - .remove = cn10k_ddr_perf_remove, 736 + .remove_new = cn10k_ddr_perf_remove, 736 737 }; 737 738 738 739 static int __init cn10k_ddr_pmu_init(void)
+2 -4
drivers/perf/marvell_cn10k_tad_pmu.c
··· 351 351 return ret; 352 352 } 353 353 354 - static int tad_pmu_remove(struct platform_device *pdev) 354 + static void tad_pmu_remove(struct platform_device *pdev) 355 355 { 356 356 struct tad_pmu *pmu = platform_get_drvdata(pdev); 357 357 358 358 cpuhp_state_remove_instance_nocalls(tad_pmu_cpuhp_state, 359 359 &pmu->node); 360 360 perf_pmu_unregister(&pmu->pmu); 361 - 362 - return 0; 363 361 } 364 362 365 363 #ifdef CONFIG_OF ··· 383 385 .suppress_bind_attrs = true, 384 386 }, 385 387 .probe = tad_pmu_probe, 386 - .remove = tad_pmu_remove, 388 + .remove_new = tad_pmu_remove, 387 389 }; 388 390 389 391 static int tad_pmu_offline_cpu(unsigned int cpu, struct hlist_node *node)
+2 -3
drivers/perf/qcom_l2_pmu.c
··· 965 965 return err; 966 966 } 967 967 968 - static int l2_cache_pmu_remove(struct platform_device *pdev) 968 + static void l2_cache_pmu_remove(struct platform_device *pdev) 969 969 { 970 970 struct l2cache_pmu *l2cache_pmu = 971 971 to_l2cache_pmu(platform_get_drvdata(pdev)); ··· 973 973 perf_pmu_unregister(&l2cache_pmu->pmu); 974 974 cpuhp_state_remove_instance(CPUHP_AP_PERF_ARM_QCOM_L2_ONLINE, 975 975 &l2cache_pmu->node); 976 - return 0; 977 976 } 978 977 979 978 static struct platform_driver l2_cache_pmu_driver = { ··· 982 983 .suppress_bind_attrs = true, 983 984 }, 984 985 .probe = l2_cache_pmu_probe, 985 - .remove = l2_cache_pmu_remove, 986 + .remove_new = l2_cache_pmu_remove, 986 987 }; 987 988 988 989 static int __init register_l2_cache_pmu_driver(void)
+2 -3
drivers/perf/thunderx2_pmu.c
··· 993 993 return 0; 994 994 } 995 995 996 - static int tx2_uncore_remove(struct platform_device *pdev) 996 + static void tx2_uncore_remove(struct platform_device *pdev) 997 997 { 998 998 struct tx2_uncore_pmu *tx2_pmu, *temp; 999 999 struct device *dev = &pdev->dev; ··· 1009 1009 } 1010 1010 } 1011 1011 } 1012 - return 0; 1013 1012 } 1014 1013 1015 1014 static struct platform_driver tx2_uncore_driver = { ··· 1018 1019 .suppress_bind_attrs = true, 1019 1020 }, 1020 1021 .probe = tx2_uncore_probe, 1021 - .remove = tx2_uncore_remove, 1022 + .remove_new = tx2_uncore_remove, 1022 1023 }; 1023 1024 1024 1025 static int __init tx2_uncore_driver_init(void)
+2 -4
drivers/perf/xgene_pmu.c
··· 1937 1937 } 1938 1938 } 1939 1939 1940 - static int xgene_pmu_remove(struct platform_device *pdev) 1940 + static void xgene_pmu_remove(struct platform_device *pdev) 1941 1941 { 1942 1942 struct xgene_pmu *xgene_pmu = dev_get_drvdata(&pdev->dev); 1943 1943 ··· 1947 1947 xgene_pmu_dev_cleanup(xgene_pmu, &xgene_pmu->mcpmus); 1948 1948 cpuhp_state_remove_instance(CPUHP_AP_PERF_ARM_APM_XGENE_ONLINE, 1949 1949 &xgene_pmu->node); 1950 - 1951 - return 0; 1952 1950 } 1953 1951 1954 1952 static struct platform_driver xgene_pmu_driver = { 1955 1953 .probe = xgene_pmu_probe, 1956 - .remove = xgene_pmu_remove, 1954 + .remove_new = xgene_pmu_remove, 1957 1955 .driver = { 1958 1956 .name = "xgene-pmu", 1959 1957 .of_match_table = xgene_pmu_of_match,
+1
include/uapi/linux/elf.h
···
 #define NT_ARM_SSVE 0x40b /* ARM Streaming SVE registers */
 #define NT_ARM_ZA 0x40c /* ARM SME ZA registers */
 #define NT_ARM_ZT 0x40d /* ARM SME ZT registers */
+#define NT_ARM_FPMR 0x40e /* ARM floating point mode register */
 #define NT_ARC_V2 0x600 /* ARCv2 accumulator/extra registers */
 #define NT_VMCOREDD 0x700 /* Vmcore Device Dump Note */
 #define NT_MIPS_DSP 0x800 /* MIPS DSP ASE registers */
+5 -1
rust/Makefile
···

 # Derived from `scripts/Makefile.clang`.
 BINDGEN_TARGET_x86 := x86_64-linux-gnu
+BINDGEN_TARGET_arm64 := aarch64-linux-gnu
 BINDGEN_TARGET := $(BINDGEN_TARGET_$(SRCARCH))

 # All warnings are inhibited since GCC builds are very experimental,
···
 $(obj)/core.o: private skip_flags = -Dunreachable_pub
 $(obj)/core.o: private rustc_objcopy = $(foreach sym,$(redirect-intrinsics),--redefine-sym $(sym)=__rust$(sym))
 $(obj)/core.o: private rustc_target_flags = $(core-cfgs)
-$(obj)/core.o: $(RUST_LIB_SRC)/core/src/lib.rs scripts/target.json FORCE
+$(obj)/core.o: $(RUST_LIB_SRC)/core/src/lib.rs FORCE
 	+$(call if_changed_dep,rustc_library)
+ifneq ($(or $(CONFIG_X86_64),$(CONFIG_LOONGARCH)),)
+$(obj)/core.o: scripts/target.json
+endif

 $(obj)/compiler_builtins.o: private rustc_objcopy = -w -W '__*'
 $(obj)/compiler_builtins.o: $(src)/compiler_builtins.rs $(obj)/core.o FORCE
+3 -1
scripts/Makefile
···
 hostprogs-always-$(CONFIG_SYSTEM_EXTRA_CERTIFICATE) += insert-sys-cert
 hostprogs-always-$(CONFIG_RUST_KERNEL_DOCTESTS) += rustdoc_test_builder
 hostprogs-always-$(CONFIG_RUST_KERNEL_DOCTESTS) += rustdoc_test_gen
-always-$(CONFIG_RUST) += target.json

+ifneq ($(or $(CONFIG_X86_64),$(CONFIG_LOONGARCH)),)
+always-$(CONFIG_RUST) += target.json
 filechk_rust_target = $< < include/config/auto.conf

 $(obj)/target.json: scripts/generate_rust_target include/config/auto.conf FORCE
 	$(call filechk,rust_target)
+endif

 hostprogs += generate_rust_target
 generate_rust_target-rust := y
+3 -1
scripts/generate_rust_target.rs
···
     let mut ts = TargetSpec::new();

     // `llvm-target`s are taken from `scripts/Makefile.clang`.
-    if cfg.has("X86_64") {
+    if cfg.has("ARM64") {
+        panic!("arm64 uses the builtin rustc aarch64-unknown-none target");
+    } else if cfg.has("X86_64") {
         ts.push("arch", "x86_64");
         ts.push(
             "data-layout",
+217
tools/testing/selftests/arm64/abi/hwcap.c
···
 	asm volatile(".inst 0xdac01c00" : : : "x0");
 }

+static void f8cvt_sigill(void)
+{
+	/* FSCALE V0.4H, V0.4H, V0.4H */
+	asm volatile(".inst 0x2ec03c00");
+}
+
+static void f8dp2_sigill(void)
+{
+	/* FDOT V0.4H, V0.4H, V0.5H */
+	asm volatile(".inst 0xe40fc00");
+}
+
+static void f8dp4_sigill(void)
+{
+	/* FDOT V0.2S, V0.2S, V0.2S */
+	asm volatile(".inst 0xe00fc00");
+}
+
+static void f8fma_sigill(void)
+{
+	/* FMLALB V0.8H, V0.16B, V0.16B */
+	asm volatile(".inst 0xec0fc00");
+}
+
+static void faminmax_sigill(void)
+{
+	/* FAMIN V0.4H, V0.4H, V0.4H */
+	asm volatile(".inst 0x2ec01c00");
+}
+
 static void fp_sigill(void)
 {
 	asm volatile("fmov s0, #1");
+}
+
+static void fpmr_sigill(void)
+{
+	asm volatile("mrs x0, S3_3_C4_C4_2" : : : "x0");
 }

 static void ilrcpc_sigill(void)
···
 		     : "+r" (memp), "+r" (val0), "+r" (val1)
 		     :
 		     : "cc", "memory");
+}
+
+static void lut_sigill(void)
+{
+	/* LUTI2 V0.16B, { V0.16B }, V[0] */
+	asm volatile(".inst 0x4e801000");
 }

 static void mops_sigill(void)
···

 	/* FADD ZA.H[W0, 0], { Z0.H-Z1.H } */
 	asm volatile(".inst 0xc1a41C00" : : : );
+
+	/* SMSTOP */
+	asm volatile("msr S0_3_C4_C6_3, xzr" : : : );
+}
+
+static void smef8f16_sigill(void)
+{
+	/* SMSTART */
+	asm volatile("msr S0_3_C4_C7_3, xzr" : : : );
+
+	/* FDOT ZA.H[W0, 0], Z0.B-Z1.B, Z0.B-Z1.B */
+	asm volatile(".inst 0xc1a01020" : : : );
+
+	/* SMSTOP */
+	asm volatile("msr S0_3_C4_C6_3, xzr" : : : );
+}
+
+static void smef8f32_sigill(void)
+{
+	/* SMSTART */
+	asm volatile("msr S0_3_C4_C7_3, xzr" : : : );
+
+	/* FDOT ZA.S[W0, 0], { Z0.B-Z1.B }, Z0.B[0] */
+	asm volatile(".inst 0xc1500038" : : : );
+
+	/* SMSTOP */
+	asm volatile("msr S0_3_C4_C6_3, xzr" : : : );
+}
+
+static void smelutv2_sigill(void)
+{
+	/* SMSTART */
+	asm volatile("msr S0_3_C4_C7_3, xzr" : : : );
+
+	/* LUTI4 { Z0.B-Z3.B }, ZT0, { Z0-Z1 } */
+	asm volatile(".inst 0xc08b0000" : : : );
+
+	/* SMSTOP */
+	asm volatile("msr S0_3_C4_C6_3, xzr" : : : );
+}
+
+static void smesf8dp2_sigill(void)
+{
+	/* SMSTART */
+	asm volatile("msr S0_3_C4_C7_3, xzr" : : : );
+
+	/* FDOT Z0.H, Z0.B, Z0.B[0] */
+	asm volatile(".inst 0x64204400" : : : );
+
+	/* SMSTOP */
+	asm volatile("msr S0_3_C4_C6_3, xzr" : : : );
+}
+
+static void smesf8dp4_sigill(void)
+{
+	/* SMSTART */
+	asm volatile("msr S0_3_C4_C7_3, xzr" : : : );
+
+	/* FDOT Z0.S, Z0.B, Z0.B[0] */
+	asm volatile(".inst 0xc1a41C00" : : : );
+
+	/* SMSTOP */
+	asm volatile("msr S0_3_C4_C6_3, xzr" : : : );
+}
+
+static void smesf8fma_sigill(void)
+{
+	/* SMSTART */
+	asm volatile("msr S0_3_C4_C7_3, xzr" : : : );
+
+	/* FMLALB V0.8H, V0.16B, V0.16B */
+	asm volatile(".inst 0xec0fc00");

 	/* SMSTOP */
 	asm volatile("msr S0_3_C4_C6_3, xzr" : : : );
···
 		.sigill_fn = cssc_sigill,
 	},
 	{
+		.name = "F8CVT",
+		.at_hwcap = AT_HWCAP2,
+		.hwcap_bit = HWCAP2_F8CVT,
+		.cpuinfo = "f8cvt",
+		.sigill_fn = f8cvt_sigill,
+	},
+	{
+		.name = "F8DP4",
+		.at_hwcap = AT_HWCAP2,
+		.hwcap_bit = HWCAP2_F8DP4,
+		.cpuinfo = "f8dp4",
+		.sigill_fn = f8dp4_sigill,
+	},
+	{
+		.name = "F8DP2",
+		.at_hwcap = AT_HWCAP2,
+		.hwcap_bit = HWCAP2_F8DP2,
+		.cpuinfo = "f8dp4",
+		.sigill_fn = f8dp2_sigill,
+	},
+	{
+		.name = "F8E5M2",
+		.at_hwcap = AT_HWCAP2,
+		.hwcap_bit = HWCAP2_F8E5M2,
+		.cpuinfo = "f8e5m2",
+	},
+	{
+		.name = "F8E4M3",
+		.at_hwcap = AT_HWCAP2,
+		.hwcap_bit = HWCAP2_F8E4M3,
+		.cpuinfo = "f8e4m3",
+	},
+	{
+		.name = "F8FMA",
+		.at_hwcap = AT_HWCAP2,
+		.hwcap_bit = HWCAP2_F8FMA,
+		.cpuinfo = "f8fma",
+		.sigill_fn = f8fma_sigill,
+	},
+	{
+		.name = "FAMINMAX",
+		.at_hwcap = AT_HWCAP2,
+		.hwcap_bit = HWCAP2_FAMINMAX,
+		.cpuinfo = "faminmax",
+		.sigill_fn = faminmax_sigill,
+	},
+	{
 		.name = "FP",
 		.at_hwcap = AT_HWCAP,
 		.hwcap_bit = HWCAP_FP,
 		.cpuinfo = "fp",
 		.sigill_fn = fp_sigill,
+	},
+	{
+		.name = "FPMR",
+		.at_hwcap = AT_HWCAP2,
+		.hwcap_bit = HWCAP2_FPMR,
+		.cpuinfo = "fpmr",
+		.sigill_fn = fpmr_sigill,
+		.sigill_reliable = true,
 	},
 	{
 		.name = "JSCVT",
···
 		.hwcap_bit = HWCAP2_LSE128,
 		.cpuinfo = "lse128",
 		.sigill_fn = lse128_sigill,
+	},
+	{
+		.name = "LUT",
+		.at_hwcap = AT_HWCAP2,
+		.hwcap_bit = HWCAP2_LUT,
+		.cpuinfo = "lut",
+		.sigill_fn = lut_sigill,
 	},
 	{
 		.name = "MOPS",
···
 		.hwcap_bit = HWCAP2_SME_F16F16,
 		.cpuinfo = "smef16f16",
 		.sigill_fn = smef16f16_sigill,
+	},
+	{
+		.name = "SME F8F16",
+		.at_hwcap = AT_HWCAP2,
+		.hwcap_bit = HWCAP2_SME_F8F16,
+		.cpuinfo = "smef8f16",
+		.sigill_fn = smef8f16_sigill,
+	},
+	{
+		.name = "SME F8F32",
+		.at_hwcap = AT_HWCAP2,
+		.hwcap_bit = HWCAP2_SME_F8F32,
+		.cpuinfo = "smef8f32",
+		.sigill_fn = smef8f32_sigill,
+	},
+	{
+		.name = "SME LUTV2",
+		.at_hwcap = AT_HWCAP2,
+		.hwcap_bit = HWCAP2_SME_LUTV2,
+		.cpuinfo = "smelutv2",
+		.sigill_fn = smelutv2_sigill,
+	},
+	{
+		.name = "SME SF8FMA",
+		.at_hwcap = AT_HWCAP2,
+		.hwcap_bit = HWCAP2_SME_SF8FMA,
+		.cpuinfo = "smesf8fma",
+		.sigill_fn = smesf8fma_sigill,
+	},
+	{
+		.name = "SME SF8DP2",
+		.at_hwcap = AT_HWCAP2,
545 +
.hwcap_bit = HWCAP2_SME_SF8DP2, 546 + .cpuinfo = "smesf8dp2", 547 + .sigill_fn = smesf8dp2_sigill, 548 + }, 549 + { 550 + .name = "SME SF8DP4", 551 + .at_hwcap = AT_HWCAP2, 552 + .hwcap_bit = HWCAP2_SME_SF8DP4, 553 + .cpuinfo = "smesf8dp4", 554 + .sigill_fn = smesf8dp4_sigill, 688 555 }, 689 556 { 690 557 .name = "SVE",
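Each new table entry above pairs a hwcap bit with a `*_sigill()` function that executes one instruction from the feature. As a rough illustration of how such a probe can be driven, here is a minimal sketch that runs a function and reports whether it raised SIGILL. The names `raises_sigill()`, `insn_present()` and `insn_missing()` are hypothetical and not taken from the selftest, which has its own handler and bookkeeping. The stand-in probes use `raise(SIGILL)` so the sketch builds on any architecture.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Jump target used to escape from the SIGILL handler. */
static sigjmp_buf probe_env;

static void probe_handler(int sig)
{
	(void)sig;
	/* Unwind back to raises_sigill(); the saved signal mask is restored. */
	siglongjmp(probe_env, 1);
}

/*
 * Run fn() and report whether it raised SIGILL.
 * Hypothetical helper for illustration only.  Returns false if the
 * handler could not be installed, so callers cannot distinguish that
 * case from "no SIGILL" in this sketch.
 */
static bool raises_sigill(void (*fn)(void))
{
	struct sigaction sa, old;
	bool trapped = false;

	assert(fn != NULL);

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = probe_handler;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGILL, &sa, &old) != 0)
		return false;

	/* sigsetjmp() returns 0 on the direct call, 1 via siglongjmp(). */
	if (sigsetjmp(probe_env, 1) == 0)
		fn();
	else
		trapped = true;

	/* Put the previous SIGILL disposition back. */
	sigaction(SIGILL, &old, NULL);
	return trapped;
}

/* Stand-ins for the *_sigill() probes: one succeeds, one traps. */
static void insn_present(void)
{
}

static void insn_missing(void)
{
	raise(SIGILL);
}
```

Calling `raises_sigill(insn_missing)` returns true and `raises_sigill(insn_present)` returns false. A real probe would replace the stand-ins with a function containing the `.inst` encoding under test.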
+1
tools/testing/selftests/arm64/fp/.gitignore
···
 fp-pidbench
+fp-ptrace
 fp-stress
 fpsimd-test
 rdvl-sme
+4 -1
tools/testing/selftests/arm64/fp/Makefile
···
 
 CFLAGS += $(KHDR_INCLUDES)
 
-TEST_GEN_PROGS := fp-stress \
+TEST_GEN_PROGS := \
+	fp-ptrace \
+	fp-stress \
 	sve-ptrace sve-probe-vls \
 	vec-syscfg \
 	za-fork za-ptrace
···
 # Build with nolibc to avoid effects due to libc's clone() support
 $(OUTPUT)/fp-pidbench: fp-pidbench.S $(OUTPUT)/asm-utils.o
 	$(CC) -nostdlib $^ -o $@
+$(OUTPUT)/fp-ptrace: fp-ptrace.c fp-ptrace-asm.S
 $(OUTPUT)/fpsimd-test: fpsimd-test.S $(OUTPUT)/asm-utils.o
 	$(CC) -nostdlib $^ -o $@
 $(OUTPUT)/rdvl-sve: rdvl-sve.c $(OUTPUT)/rdvl.o
+279
tools/testing/selftests/arm64/fp/fp-ptrace-asm.S
···
+// SPDX-License-Identifier: GPL-2.0-only
+// Copyright (C) 2021-3 ARM Limited.
+//
+// Assembly portion of the FP ptrace test
+
+//
+// Load values from memory into registers, break on a breakpoint, then
+// break on a further breakpoint
+//
+
+#include "fp-ptrace.h"
+#include "sme-inst.h"
+
+.arch_extension sve
+
+// Load and save register values with pauses for ptrace
+//
+// x0 - SVE in use
+// x1 - SME in use
+// x2 - SME2 in use
+// x3 - FA64 supported
+
+.globl load_and_save
+load_and_save:
+	stp	x11, x12, [sp, #-0x10]!
+
+	// This should be redundant in the SVE case
+	ldr	x7, =v_in
+	ldp	q0, q1, [x7]
+	ldp	q2, q3, [x7, #16 * 2]
+	ldp	q4, q5, [x7, #16 * 4]
+	ldp	q6, q7, [x7, #16 * 6]
+	ldp	q8, q9, [x7, #16 * 8]
+	ldp	q10, q11, [x7, #16 * 10]
+	ldp	q12, q13, [x7, #16 * 12]
+	ldp	q14, q15, [x7, #16 * 14]
+	ldp	q16, q17, [x7, #16 * 16]
+	ldp	q18, q19, [x7, #16 * 18]
+	ldp	q20, q21, [x7, #16 * 20]
+	ldp	q22, q23, [x7, #16 * 22]
+	ldp	q24, q25, [x7, #16 * 24]
+	ldp	q26, q27, [x7, #16 * 26]
+	ldp	q28, q29, [x7, #16 * 28]
+	ldp	q30, q31, [x7, #16 * 30]
+
+	// SME?
+	cbz	x1, check_sve_in
+
+	adrp	x7, svcr_in
+	ldr	x7, [x7, :lo12:svcr_in]
+	// SVCR is 0 by default, avoid triggering SME if not in use
+	cbz	x7, check_sve_in
+	msr	S3_3_C4_C2_2, x7
+
+	// ZA?
+	tbz	x7, #SVCR_ZA_SHIFT, check_sm_in
+	rdsvl	11, 1
+	mov	w12, #0
+	ldr	x6, =za_in
+1:	_ldr_za 12, 6
+	add	x6, x6, x11
+	add	x12, x12, #1
+	cmp	x11, x12
+	bne	1b
+
+	// ZT?
+	cbz	x2, check_sm_in
+	adrp	x6, zt_in
+	add	x6, x6, :lo12:zt_in
+	_ldr_zt 6
+
+	// In streaming mode?
+check_sm_in:
+	tbz	x7, #SVCR_SM_SHIFT, check_sve_in
+	mov	x4, x3		// Load FFR if we have FA64
+	b	load_sve
+
+	// SVE?
+check_sve_in:
+	cbz	x0, wait_for_writes
+	mov	x4, #1
+
+load_sve:
+	ldr	x7, =z_in
+	ldr	z0, [x7, #0, MUL VL]
+	ldr	z1, [x7, #1, MUL VL]
+	ldr	z2, [x7, #2, MUL VL]
+	ldr	z3, [x7, #3, MUL VL]
+	ldr	z4, [x7, #4, MUL VL]
+	ldr	z5, [x7, #5, MUL VL]
+	ldr	z6, [x7, #6, MUL VL]
+	ldr	z7, [x7, #7, MUL VL]
+	ldr	z8, [x7, #8, MUL VL]
+	ldr	z9, [x7, #9, MUL VL]
+	ldr	z10, [x7, #10, MUL VL]
+	ldr	z11, [x7, #11, MUL VL]
+	ldr	z12, [x7, #12, MUL VL]
+	ldr	z13, [x7, #13, MUL VL]
+	ldr	z14, [x7, #14, MUL VL]
+	ldr	z15, [x7, #15, MUL VL]
+	ldr	z16, [x7, #16, MUL VL]
+	ldr	z17, [x7, #17, MUL VL]
+	ldr	z18, [x7, #18, MUL VL]
+	ldr	z19, [x7, #19, MUL VL]
+	ldr	z20, [x7, #20, MUL VL]
+	ldr	z21, [x7, #21, MUL VL]
+	ldr	z22, [x7, #22, MUL VL]
+	ldr	z23, [x7, #23, MUL VL]
+	ldr	z24, [x7, #24, MUL VL]
+	ldr	z25, [x7, #25, MUL VL]
+	ldr	z26, [x7, #26, MUL VL]
+	ldr	z27, [x7, #27, MUL VL]
+	ldr	z28, [x7, #28, MUL VL]
+	ldr	z29, [x7, #29, MUL VL]
+	ldr	z30, [x7, #30, MUL VL]
+	ldr	z31, [x7, #31, MUL VL]
+
+	// FFR is not present in base SME
+	cbz	x4, 1f
+	ldr	x7, =ffr_in
+	ldr	p0, [x7]
+	ldr	x7, [x7, #0]
+	cbz	x7, 1f
+	wrffr	p0.b
+1:
+
+	ldr	x7, =p_in
+	ldr	p0, [x7, #0, MUL VL]
+	ldr	p1, [x7, #1, MUL VL]
+	ldr	p2, [x7, #2, MUL VL]
+	ldr	p3, [x7, #3, MUL VL]
+	ldr	p4, [x7, #4, MUL VL]
+	ldr	p5, [x7, #5, MUL VL]
+	ldr	p6, [x7, #6, MUL VL]
+	ldr	p7, [x7, #7, MUL VL]
+	ldr	p8, [x7, #8, MUL VL]
+	ldr	p9, [x7, #9, MUL VL]
+	ldr	p10, [x7, #10, MUL VL]
+	ldr	p11, [x7, #11, MUL VL]
+	ldr	p12, [x7, #12, MUL VL]
+	ldr	p13, [x7, #13, MUL VL]
+	ldr	p14, [x7, #14, MUL VL]
+	ldr	p15, [x7, #15, MUL VL]
+
+wait_for_writes:
+	// Wait for the parent
+	brk	#0
+
+	// Save values
+	ldr	x7, =v_out
+	stp	q0, q1, [x7]
+	stp	q2, q3, [x7, #16 * 2]
+	stp	q4, q5, [x7, #16 * 4]
+	stp	q6, q7, [x7, #16 * 6]
+	stp	q8, q9, [x7, #16 * 8]
+	stp	q10, q11, [x7, #16 * 10]
+	stp	q12, q13, [x7, #16 * 12]
+	stp	q14, q15, [x7, #16 * 14]
+	stp	q16, q17, [x7, #16 * 16]
+	stp	q18, q19, [x7, #16 * 18]
+	stp	q20, q21, [x7, #16 * 20]
+	stp	q22, q23, [x7, #16 * 22]
+	stp	q24, q25, [x7, #16 * 24]
+	stp	q26, q27, [x7, #16 * 26]
+	stp	q28, q29, [x7, #16 * 28]
+	stp	q30, q31, [x7, #16 * 30]
+
+	// SME?
+	cbz	x1, check_sve_out
+
+	rdsvl	11, 1
+	adrp	x6, sme_vl_out
+	str	x11, [x6, :lo12:sme_vl_out]
+
+	mrs	x7, S3_3_C4_C2_2
+	adrp	x6, svcr_out
+	str	x7, [x6, :lo12:svcr_out]
+
+	// ZA?
+	tbz	x7, #SVCR_ZA_SHIFT, check_sm_out
+	mov	w12, #0
+	ldr	x6, =za_out
+1:	_str_za 12, 6
+	add	x6, x6, x11
+	add	x12, x12, #1
+	cmp	x11, x12
+	bne	1b
+
+	// ZT?
+	cbz	x2, check_sm_out
+	adrp	x6, zt_out
+	add	x6, x6, :lo12:zt_out
+	_str_zt 6
+
+	// In streaming mode?
+check_sm_out:
+	tbz	x7, #SVCR_SM_SHIFT, check_sve_out
+	mov	x4, x3				// FFR?
+	b	read_sve
+
+	// SVE?
+check_sve_out:
+	cbz	x0, wait_for_reads
+	mov	x4, #1
+
+	rdvl	x7, #1
+	adrp	x6, sve_vl_out
+	str	x7, [x6, :lo12:sve_vl_out]
+
+read_sve:
+	ldr	x7, =z_out
+	str	z0, [x7, #0, MUL VL]
+	str	z1, [x7, #1, MUL VL]
+	str	z2, [x7, #2, MUL VL]
+	str	z3, [x7, #3, MUL VL]
+	str	z4, [x7, #4, MUL VL]
+	str	z5, [x7, #5, MUL VL]
+	str	z6, [x7, #6, MUL VL]
+	str	z7, [x7, #7, MUL VL]
+	str	z8, [x7, #8, MUL VL]
+	str	z9, [x7, #9, MUL VL]
+	str	z10, [x7, #10, MUL VL]
+	str	z11, [x7, #11, MUL VL]
+	str	z12, [x7, #12, MUL VL]
+	str	z13, [x7, #13, MUL VL]
+	str	z14, [x7, #14, MUL VL]
+	str	z15, [x7, #15, MUL VL]
+	str	z16, [x7, #16, MUL VL]
+	str	z17, [x7, #17, MUL VL]
+	str	z18, [x7, #18, MUL VL]
+	str	z19, [x7, #19, MUL VL]
+	str	z20, [x7, #20, MUL VL]
+	str	z21, [x7, #21, MUL VL]
+	str	z22, [x7, #22, MUL VL]
+	str	z23, [x7, #23, MUL VL]
+	str	z24, [x7, #24, MUL VL]
+	str	z25, [x7, #25, MUL VL]
+	str	z26, [x7, #26, MUL VL]
+	str	z27, [x7, #27, MUL VL]
+	str	z28, [x7, #28, MUL VL]
+	str	z29, [x7, #29, MUL VL]
+	str	z30, [x7, #30, MUL VL]
+	str	z31, [x7, #31, MUL VL]
+
+	ldr	x7, =p_out
+	str	p0, [x7, #0, MUL VL]
+	str	p1, [x7, #1, MUL VL]
+	str	p2, [x7, #2, MUL VL]
+	str	p3, [x7, #3, MUL VL]
+	str	p4, [x7, #4, MUL VL]
+	str	p5, [x7, #5, MUL VL]
+	str	p6, [x7, #6, MUL VL]
+	str	p7, [x7, #7, MUL VL]
+	str	p8, [x7, #8, MUL VL]
+	str	p9, [x7, #9, MUL VL]
+	str	p10, [x7, #10, MUL VL]
+	str	p11, [x7, #11, MUL VL]
+	str	p12, [x7, #12, MUL VL]
+	str	p13, [x7, #13, MUL VL]
+	str	p14, [x7, #14, MUL VL]
+	str	p15, [x7, #15, MUL VL]
+
+	// Only save FFR if it exists
+	cbz	x4, wait_for_reads
+	ldr	x7, =ffr_out
+	rdffr	p0.b
+	str	p0, [x7]
+
+wait_for_reads:
+	// Wait for the parent
+	brk	#0
+
+	// Ensure we don't leave ourselves in streaming mode
+	cbz	x1, out
+	msr	S3_3_C4_C2_2, xzr
+
+out:
+	ldp	x11, x12, [sp, #-0x10]
+	ret
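The `[x7, #n, MUL VL]` addressing used above scales the immediate by the current vector length, so the byte layout of `z_in`/`p_in` depends on the VL in effect. The following sketch works through that arithmetic, assuming VL is expressed in bytes and that a predicate register holds one bit per vector byte (VL / 8 bytes). The helper names are hypothetical and do not appear in the test.

```c
#include <assert.h>
#include <stddef.h>

/* Architectural register counts: Z0-Z31 and P0-P15. */
enum { NUM_ZREGS = 32, NUM_PREGS = 16 };

/*
 * Byte offset addressed by "ldr zN, [base, #n, MUL VL]":
 * the immediate is multiplied by the vector length in bytes.
 */
static size_t z_offset(size_t vl, unsigned int n)
{
	assert(n < NUM_ZREGS);
	return (size_t)n * vl;
}

/*
 * Byte offset addressed by "ldr pN, [base, #n, MUL VL]":
 * for predicate loads the immediate is multiplied by the
 * predicate length, which is VL / 8 bytes.
 */
static size_t p_offset(size_t vl, unsigned int n)
{
	assert(n < NUM_PREGS);
	return (size_t)n * (vl / 8);
}

/* Total bytes needed to hold all Z registers at a given VL. */
static size_t zregs_size(size_t vl)
{
	return NUM_ZREGS * vl;
}

/* Total bytes needed to hold all P registers at a given VL. */
static size_t pregs_size(size_t vl)
{
	return NUM_PREGS * (vl / 8);
}
```

At the minimum VL of 16 bytes, `z_offset(16, 31)` is 496 and the full Z block occupies 512 bytes, which matches the 32 Q registers loaded by the `ldp` sequence at the top of `load_and_save`.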
+1503
tools/testing/selftests/arm64/fp/fp-ptrace.c
··· 1 + // SPDX-License-Identifier: GPL-2.0-only 2 + /* 3 + * Copyright (C) 2023 ARM Limited. 4 + * Original author: Mark Brown <broonie@kernel.org> 5 + */ 6 + 7 + #define _GNU_SOURCE 8 + 9 + #include <errno.h> 10 + #include <stdbool.h> 11 + #include <stddef.h> 12 + #include <stdio.h> 13 + #include <stdlib.h> 14 + #include <string.h> 15 + #include <unistd.h> 16 + 17 + #include <sys/auxv.h> 18 + #include <sys/prctl.h> 19 + #include <sys/ptrace.h> 20 + #include <sys/types.h> 21 + #include <sys/uio.h> 22 + #include <sys/wait.h> 23 + 24 + #include <linux/kernel.h> 25 + 26 + #include <asm/sigcontext.h> 27 + #include <asm/sve_context.h> 28 + #include <asm/ptrace.h> 29 + 30 + #include "../../kselftest.h" 31 + 32 + #include "fp-ptrace.h" 33 + 34 + /* <linux/elf.h> and <sys/auxv.h> don't like each other, so: */ 35 + #ifndef NT_ARM_SVE 36 + #define NT_ARM_SVE 0x405 37 + #endif 38 + 39 + #ifndef NT_ARM_SSVE 40 + #define NT_ARM_SSVE 0x40b 41 + #endif 42 + 43 + #ifndef NT_ARM_ZA 44 + #define NT_ARM_ZA 0x40c 45 + #endif 46 + 47 + #ifndef NT_ARM_ZT 48 + #define NT_ARM_ZT 0x40d 49 + #endif 50 + 51 + #define ARCH_VQ_MAX 256 52 + 53 + /* VL 128..2048 in powers of 2 */ 54 + #define MAX_NUM_VLS 5 55 + 56 + #define NUM_FPR 32 57 + __uint128_t v_in[NUM_FPR]; 58 + __uint128_t v_expected[NUM_FPR]; 59 + __uint128_t v_out[NUM_FPR]; 60 + 61 + char z_in[__SVE_ZREGS_SIZE(ARCH_VQ_MAX)]; 62 + char z_expected[__SVE_ZREGS_SIZE(ARCH_VQ_MAX)]; 63 + char z_out[__SVE_ZREGS_SIZE(ARCH_VQ_MAX)]; 64 + 65 + char p_in[__SVE_PREGS_SIZE(ARCH_VQ_MAX)]; 66 + char p_expected[__SVE_PREGS_SIZE(ARCH_VQ_MAX)]; 67 + char p_out[__SVE_PREGS_SIZE(ARCH_VQ_MAX)]; 68 + 69 + char ffr_in[__SVE_PREG_SIZE(ARCH_VQ_MAX)]; 70 + char ffr_expected[__SVE_PREG_SIZE(ARCH_VQ_MAX)]; 71 + char ffr_out[__SVE_PREG_SIZE(ARCH_VQ_MAX)]; 72 + 73 + char za_in[ZA_SIG_REGS_SIZE(ARCH_VQ_MAX)]; 74 + char za_expected[ZA_SIG_REGS_SIZE(ARCH_VQ_MAX)]; 75 + char za_out[ZA_SIG_REGS_SIZE(ARCH_VQ_MAX)]; 76 + 77 + char zt_in[ZT_SIG_REG_BYTES]; 78 + char 
zt_expected[ZT_SIG_REG_BYTES]; 79 + char zt_out[ZT_SIG_REG_BYTES]; 80 + 81 + uint64_t sve_vl_out; 82 + uint64_t sme_vl_out; 83 + uint64_t svcr_in, svcr_expected, svcr_out; 84 + 85 + void load_and_save(int sve, int sme, int sme2, int fa64); 86 + 87 + static bool got_alarm; 88 + 89 + static void handle_alarm(int sig, siginfo_t *info, void *context) 90 + { 91 + got_alarm = true; 92 + } 93 + 94 + #ifdef CONFIG_CPU_BIG_ENDIAN 95 + static __uint128_t arm64_cpu_to_le128(__uint128_t x) 96 + { 97 + u64 a = swab64(x); 98 + u64 b = swab64(x >> 64); 99 + 100 + return ((__uint128_t)a << 64) | b; 101 + } 102 + #else 103 + static __uint128_t arm64_cpu_to_le128(__uint128_t x) 104 + { 105 + return x; 106 + } 107 + #endif 108 + 109 + #define arm64_le128_to_cpu(x) arm64_cpu_to_le128(x) 110 + 111 + static bool sve_supported(void) 112 + { 113 + return getauxval(AT_HWCAP) & HWCAP_SVE; 114 + } 115 + 116 + static bool sme_supported(void) 117 + { 118 + return getauxval(AT_HWCAP2) & HWCAP2_SME; 119 + } 120 + 121 + static bool sme2_supported(void) 122 + { 123 + return getauxval(AT_HWCAP2) & HWCAP2_SME2; 124 + } 125 + 126 + static bool fa64_supported(void) 127 + { 128 + return getauxval(AT_HWCAP2) & HWCAP2_SME_FA64; 129 + } 130 + 131 + static bool compare_buffer(const char *name, void *out, 132 + void *expected, size_t size) 133 + { 134 + void *tmp; 135 + 136 + if (memcmp(out, expected, size) == 0) 137 + return true; 138 + 139 + ksft_print_msg("Mismatch in %s\n", name); 140 + 141 + /* Did we just get zeros back? 
*/ 142 + tmp = malloc(size); 143 + if (!tmp) { 144 + ksft_print_msg("OOM allocating %lu bytes for %s\n", 145 + size, name); 146 + ksft_exit_fail(); 147 + } 148 + memset(tmp, 0, size); 149 + 150 + if (memcmp(out, tmp, size) == 0) 151 + ksft_print_msg("%s is zero\n", name); 152 + 153 + free(tmp); 154 + 155 + return false; 156 + } 157 + 158 + struct test_config { 159 + int sve_vl_in; 160 + int sve_vl_expected; 161 + int sme_vl_in; 162 + int sme_vl_expected; 163 + int svcr_in; 164 + int svcr_expected; 165 + }; 166 + 167 + struct test_definition { 168 + const char *name; 169 + bool sve_vl_change; 170 + bool (*supported)(struct test_config *config); 171 + void (*set_expected_values)(struct test_config *config); 172 + void (*modify_values)(pid_t child, struct test_config *test_config); 173 + }; 174 + 175 + static int vl_in(struct test_config *config) 176 + { 177 + int vl; 178 + 179 + if (config->svcr_in & SVCR_SM) 180 + vl = config->sme_vl_in; 181 + else 182 + vl = config->sve_vl_in; 183 + 184 + return vl; 185 + } 186 + 187 + static int vl_expected(struct test_config *config) 188 + { 189 + int vl; 190 + 191 + if (config->svcr_expected & SVCR_SM) 192 + vl = config->sme_vl_expected; 193 + else 194 + vl = config->sve_vl_expected; 195 + 196 + return vl; 197 + } 198 + 199 + static void run_child(struct test_config *config) 200 + { 201 + int ret; 202 + 203 + /* Let the parent attach to us */ 204 + ret = ptrace(PTRACE_TRACEME, 0, 0, 0); 205 + if (ret < 0) 206 + ksft_exit_fail_msg("PTRACE_TRACEME failed: %s (%d)\n", 207 + strerror(errno), errno); 208 + 209 + /* VL setup */ 210 + if (sve_supported()) { 211 + ret = prctl(PR_SVE_SET_VL, config->sve_vl_in); 212 + if (ret != config->sve_vl_in) { 213 + ksft_print_msg("Failed to set SVE VL %d: %d\n", 214 + config->sve_vl_in, ret); 215 + } 216 + } 217 + 218 + if (sme_supported()) { 219 + ret = prctl(PR_SME_SET_VL, config->sme_vl_in); 220 + if (ret != config->sme_vl_in) { 221 + ksft_print_msg("Failed to set SME VL %d: %d\n", 222 + 
config->sme_vl_in, ret); 223 + } 224 + } 225 + 226 + /* Load values and wait for the parent */ 227 + load_and_save(sve_supported(), sme_supported(), 228 + sme2_supported(), fa64_supported()); 229 + 230 + exit(0); 231 + } 232 + 233 + static void read_one_child_regs(pid_t child, char *name, 234 + struct iovec *iov_parent, 235 + struct iovec *iov_child) 236 + { 237 + int len = iov_parent->iov_len; 238 + int ret; 239 + 240 + ret = process_vm_readv(child, iov_parent, 1, iov_child, 1, 0); 241 + if (ret == -1) 242 + ksft_print_msg("%s read failed: %s (%d)\n", 243 + name, strerror(errno), errno); 244 + else if (ret != len) 245 + ksft_print_msg("Short read of %s: %d\n", name, ret); 246 + } 247 + 248 + static void read_child_regs(pid_t child) 249 + { 250 + struct iovec iov_parent, iov_child; 251 + 252 + /* 253 + * Since the child fork()ed from us the buffer addresses are 254 + * the same in parent and child. 255 + */ 256 + iov_parent.iov_base = &v_out; 257 + iov_parent.iov_len = sizeof(v_out); 258 + iov_child.iov_base = &v_out; 259 + iov_child.iov_len = sizeof(v_out); 260 + read_one_child_regs(child, "FPSIMD", &iov_parent, &iov_child); 261 + 262 + if (sve_supported() || sme_supported()) { 263 + iov_parent.iov_base = &sve_vl_out; 264 + iov_parent.iov_len = sizeof(sve_vl_out); 265 + iov_child.iov_base = &sve_vl_out; 266 + iov_child.iov_len = sizeof(sve_vl_out); 267 + read_one_child_regs(child, "SVE VL", &iov_parent, &iov_child); 268 + 269 + iov_parent.iov_base = &z_out; 270 + iov_parent.iov_len = sizeof(z_out); 271 + iov_child.iov_base = &z_out; 272 + iov_child.iov_len = sizeof(z_out); 273 + read_one_child_regs(child, "Z", &iov_parent, &iov_child); 274 + 275 + iov_parent.iov_base = &p_out; 276 + iov_parent.iov_len = sizeof(p_out); 277 + iov_child.iov_base = &p_out; 278 + iov_child.iov_len = sizeof(p_out); 279 + read_one_child_regs(child, "P", &iov_parent, &iov_child); 280 + 281 + iov_parent.iov_base = &ffr_out; 282 + iov_parent.iov_len = sizeof(ffr_out); 283 + 
iov_child.iov_base = &ffr_out; 284 + iov_child.iov_len = sizeof(ffr_out); 285 + read_one_child_regs(child, "FFR", &iov_parent, &iov_child); 286 + } 287 + 288 + if (sme_supported()) { 289 + iov_parent.iov_base = &sme_vl_out; 290 + iov_parent.iov_len = sizeof(sme_vl_out); 291 + iov_child.iov_base = &sme_vl_out; 292 + iov_child.iov_len = sizeof(sme_vl_out); 293 + read_one_child_regs(child, "SME VL", &iov_parent, &iov_child); 294 + 295 + iov_parent.iov_base = &svcr_out; 296 + iov_parent.iov_len = sizeof(svcr_out); 297 + iov_child.iov_base = &svcr_out; 298 + iov_child.iov_len = sizeof(svcr_out); 299 + read_one_child_regs(child, "SVCR", &iov_parent, &iov_child); 300 + 301 + iov_parent.iov_base = &za_out; 302 + iov_parent.iov_len = sizeof(za_out); 303 + iov_child.iov_base = &za_out; 304 + iov_child.iov_len = sizeof(za_out); 305 + read_one_child_regs(child, "ZA", &iov_parent, &iov_child); 306 + } 307 + 308 + if (sme2_supported()) { 309 + iov_parent.iov_base = &zt_out; 310 + iov_parent.iov_len = sizeof(zt_out); 311 + iov_child.iov_base = &zt_out; 312 + iov_child.iov_len = sizeof(zt_out); 313 + read_one_child_regs(child, "ZT", &iov_parent, &iov_child); 314 + } 315 + } 316 + 317 + static bool continue_breakpoint(pid_t child, 318 + enum __ptrace_request restart_type) 319 + { 320 + struct user_pt_regs pt_regs; 321 + struct iovec iov; 322 + int ret; 323 + 324 + /* Get PC */ 325 + iov.iov_base = &pt_regs; 326 + iov.iov_len = sizeof(pt_regs); 327 + ret = ptrace(PTRACE_GETREGSET, child, NT_PRSTATUS, &iov); 328 + if (ret < 0) { 329 + ksft_print_msg("Failed to get PC: %s (%d)\n", 330 + strerror(errno), errno); 331 + return false; 332 + } 333 + 334 + /* Skip over the BRK */ 335 + pt_regs.pc += 4; 336 + ret = ptrace(PTRACE_SETREGSET, child, NT_PRSTATUS, &iov); 337 + if (ret < 0) { 338 + ksft_print_msg("Failed to skip BRK: %s (%d)\n", 339 + strerror(errno), errno); 340 + return false; 341 + } 342 + 343 + /* Restart */ 344 + ret = ptrace(restart_type, child, 0, 0); 345 + if (ret < 0) { 
346 + ksft_print_msg("Failed to restart child: %s (%d)\n", 347 + strerror(errno), errno); 348 + return false; 349 + } 350 + 351 + return true; 352 + } 353 + 354 + static bool check_ptrace_values_sve(pid_t child, struct test_config *config) 355 + { 356 + struct user_sve_header *sve; 357 + struct user_fpsimd_state *fpsimd; 358 + struct iovec iov; 359 + int ret, vq; 360 + bool pass = true; 361 + 362 + if (!sve_supported()) 363 + return true; 364 + 365 + vq = __sve_vq_from_vl(config->sve_vl_in); 366 + 367 + iov.iov_len = SVE_PT_SVE_OFFSET + SVE_PT_SVE_SIZE(vq, SVE_PT_REGS_SVE); 368 + iov.iov_base = malloc(iov.iov_len); 369 + if (!iov.iov_base) { 370 + ksft_print_msg("OOM allocating %lu byte SVE buffer\n", 371 + iov.iov_len); 372 + return false; 373 + } 374 + 375 + ret = ptrace(PTRACE_GETREGSET, child, NT_ARM_SVE, &iov); 376 + if (ret != 0) { 377 + ksft_print_msg("Failed to read initial SVE: %s (%d)\n", 378 + strerror(errno), errno); 379 + pass = false; 380 + goto out; 381 + } 382 + 383 + sve = iov.iov_base; 384 + 385 + if (sve->vl != config->sve_vl_in) { 386 + ksft_print_msg("Mismatch in initial SVE VL: %d != %d\n", 387 + sve->vl, config->sve_vl_in); 388 + pass = false; 389 + } 390 + 391 + /* If we are in streaming mode we should just read FPSIMD */ 392 + if ((config->svcr_in & SVCR_SM) && (sve->flags & SVE_PT_REGS_SVE)) { 393 + ksft_print_msg("NT_ARM_SVE reports SVE with PSTATE.SM\n"); 394 + pass = false; 395 + } 396 + 397 + if (sve->size != SVE_PT_SIZE(vq, sve->flags)) { 398 + ksft_print_msg("Mismatch in SVE header size: %d != %lu\n", 399 + sve->size, SVE_PT_SIZE(vq, sve->flags)); 400 + pass = false; 401 + } 402 + 403 + /* The registers might be in completely different formats! 
*/ 404 + if (sve->flags & SVE_PT_REGS_SVE) { 405 + if (!compare_buffer("initial SVE Z", 406 + iov.iov_base + SVE_PT_SVE_ZREG_OFFSET(vq, 0), 407 + z_in, SVE_PT_SVE_ZREGS_SIZE(vq))) 408 + pass = false; 409 + 410 + if (!compare_buffer("initial SVE P", 411 + iov.iov_base + SVE_PT_SVE_PREG_OFFSET(vq, 0), 412 + p_in, SVE_PT_SVE_PREGS_SIZE(vq))) 413 + pass = false; 414 + 415 + if (!compare_buffer("initial SVE FFR", 416 + iov.iov_base + SVE_PT_SVE_FFR_OFFSET(vq), 417 + ffr_in, SVE_PT_SVE_PREG_SIZE(vq))) 418 + pass = false; 419 + } else { 420 + fpsimd = iov.iov_base + SVE_PT_FPSIMD_OFFSET; 421 + if (!compare_buffer("initial V via SVE", &fpsimd->vregs[0], 422 + v_in, sizeof(v_in))) 423 + pass = false; 424 + } 425 + 426 + out: 427 + free(iov.iov_base); 428 + return pass; 429 + } 430 + 431 + static bool check_ptrace_values_ssve(pid_t child, struct test_config *config) 432 + { 433 + struct user_sve_header *sve; 434 + struct user_fpsimd_state *fpsimd; 435 + struct iovec iov; 436 + int ret, vq; 437 + bool pass = true; 438 + 439 + if (!sme_supported()) 440 + return true; 441 + 442 + vq = __sve_vq_from_vl(config->sme_vl_in); 443 + 444 + iov.iov_len = SVE_PT_SVE_OFFSET + SVE_PT_SVE_SIZE(vq, SVE_PT_REGS_SVE); 445 + iov.iov_base = malloc(iov.iov_len); 446 + if (!iov.iov_base) { 447 + ksft_print_msg("OOM allocating %lu byte SSVE buffer\n", 448 + iov.iov_len); 449 + return false; 450 + } 451 + 452 + ret = ptrace(PTRACE_GETREGSET, child, NT_ARM_SSVE, &iov); 453 + if (ret != 0) { 454 + ksft_print_msg("Failed to read initial SSVE: %s (%d)\n", 455 + strerror(errno), errno); 456 + pass = false; 457 + goto out; 458 + } 459 + 460 + sve = iov.iov_base; 461 + 462 + if (sve->vl != config->sme_vl_in) { 463 + ksft_print_msg("Mismatch in initial SSVE VL: %d != %d\n", 464 + sve->vl, config->sme_vl_in); 465 + pass = false; 466 + } 467 + 468 + if ((config->svcr_in & SVCR_SM) && !(sve->flags & SVE_PT_REGS_SVE)) { 469 + ksft_print_msg("NT_ARM_SSVE reports FPSIMD with PSTATE.SM\n"); 470 + pass = false; 
471 + } 472 + 473 + if (sve->size != SVE_PT_SIZE(vq, sve->flags)) { 474 + ksft_print_msg("Mismatch in SSVE header size: %d != %lu\n", 475 + sve->size, SVE_PT_SIZE(vq, sve->flags)); 476 + pass = false; 477 + } 478 + 479 + /* The registers might be in completely different formats! */ 480 + if (sve->flags & SVE_PT_REGS_SVE) { 481 + if (!compare_buffer("initial SSVE Z", 482 + iov.iov_base + SVE_PT_SVE_ZREG_OFFSET(vq, 0), 483 + z_in, SVE_PT_SVE_ZREGS_SIZE(vq))) 484 + pass = false; 485 + 486 + if (!compare_buffer("initial SSVE P", 487 + iov.iov_base + SVE_PT_SVE_PREG_OFFSET(vq, 0), 488 + p_in, SVE_PT_SVE_PREGS_SIZE(vq))) 489 + pass = false; 490 + 491 + if (!compare_buffer("initial SSVE FFR", 492 + iov.iov_base + SVE_PT_SVE_FFR_OFFSET(vq), 493 + ffr_in, SVE_PT_SVE_PREG_SIZE(vq))) 494 + pass = false; 495 + } else { 496 + fpsimd = iov.iov_base + SVE_PT_FPSIMD_OFFSET; 497 + if (!compare_buffer("initial V via SSVE", 498 + &fpsimd->vregs[0], v_in, sizeof(v_in))) 499 + pass = false; 500 + } 501 + 502 + out: 503 + free(iov.iov_base); 504 + return pass; 505 + } 506 + 507 + static bool check_ptrace_values_za(pid_t child, struct test_config *config) 508 + { 509 + struct user_za_header *za; 510 + struct iovec iov; 511 + int ret, vq; 512 + bool pass = true; 513 + 514 + if (!sme_supported()) 515 + return true; 516 + 517 + vq = __sve_vq_from_vl(config->sme_vl_in); 518 + 519 + iov.iov_len = ZA_SIG_CONTEXT_SIZE(vq); 520 + iov.iov_base = malloc(iov.iov_len); 521 + if (!iov.iov_base) { 522 + ksft_print_msg("OOM allocating %lu byte ZA buffer\n", 523 + iov.iov_len); 524 + return false; 525 + } 526 + 527 + ret = ptrace(PTRACE_GETREGSET, child, NT_ARM_ZA, &iov); 528 + if (ret != 0) { 529 + ksft_print_msg("Failed to read initial ZA: %s (%d)\n", 530 + strerror(errno), errno); 531 + pass = false; 532 + goto out; 533 + } 534 + 535 + za = iov.iov_base; 536 + 537 + if (za->vl != config->sme_vl_in) { 538 + ksft_print_msg("Mismatch in initial SME VL: %d != %d\n", 539 + za->vl, config->sme_vl_in); 540 
+ pass = false; 541 + } 542 + 543 + /* If PSTATE.ZA is not set we should just read the header */ 544 + if (config->svcr_in & SVCR_ZA) { 545 + if (za->size != ZA_PT_SIZE(vq)) { 546 + ksft_print_msg("Unexpected ZA ptrace read size: %d != %lu\n", 547 + za->size, ZA_PT_SIZE(vq)); 548 + pass = false; 549 + } 550 + 551 + if (!compare_buffer("initial ZA", 552 + iov.iov_base + ZA_PT_ZA_OFFSET, 553 + za_in, ZA_PT_ZA_SIZE(vq))) 554 + pass = false; 555 + } else { 556 + if (za->size != sizeof(*za)) { 557 + ksft_print_msg("Unexpected ZA ptrace read size: %d != %lu\n", 558 + za->size, sizeof(*za)); 559 + pass = false; 560 + } 561 + } 562 + 563 + out: 564 + free(iov.iov_base); 565 + return pass; 566 + } 567 + 568 + static bool check_ptrace_values_zt(pid_t child, struct test_config *config) 569 + { 570 + uint8_t buf[512]; 571 + struct iovec iov; 572 + int ret; 573 + 574 + if (!sme2_supported()) 575 + return true; 576 + 577 + iov.iov_base = &buf; 578 + iov.iov_len = ZT_SIG_REG_BYTES; 579 + ret = ptrace(PTRACE_GETREGSET, child, NT_ARM_ZT, &iov); 580 + if (ret != 0) { 581 + ksft_print_msg("Failed to read initial ZT: %s (%d)\n", 582 + strerror(errno), errno); 583 + return false; 584 + } 585 + 586 + return compare_buffer("initial ZT", buf, zt_in, ZT_SIG_REG_BYTES); 587 + } 588 + 589 + 590 + static bool check_ptrace_values(pid_t child, struct test_config *config) 591 + { 592 + bool pass = true; 593 + struct user_fpsimd_state fpsimd; 594 + struct iovec iov; 595 + int ret; 596 + 597 + iov.iov_base = &fpsimd; 598 + iov.iov_len = sizeof(fpsimd); 599 + ret = ptrace(PTRACE_GETREGSET, child, NT_PRFPREG, &iov); 600 + if (ret == 0) { 601 + if (!compare_buffer("initial V", &fpsimd.vregs, v_in, 602 + sizeof(v_in))) { 603 + pass = false; 604 + } 605 + } else { 606 + ksft_print_msg("Failed to read initial V: %s (%d)\n", 607 + strerror(errno), errno); 608 + pass = false; 609 + } 610 + 611 + if (!check_ptrace_values_sve(child, config)) 612 + pass = false; 613 + 614 + if 
(!check_ptrace_values_ssve(child, config)) 615 + pass = false; 616 + 617 + if (!check_ptrace_values_za(child, config)) 618 + pass = false; 619 + 620 + if (!check_ptrace_values_zt(child, config)) 621 + pass = false; 622 + 623 + return pass; 624 + } 625 + 626 + static bool run_parent(pid_t child, struct test_definition *test, 627 + struct test_config *config) 628 + { 629 + int wait_status, ret; 630 + pid_t pid; 631 + bool pass; 632 + 633 + /* Initial attach */ 634 + while (1) { 635 + pid = waitpid(child, &wait_status, 0); 636 + if (pid < 0) { 637 + if (errno == EINTR) 638 + continue; 639 + ksft_exit_fail_msg("waitpid() failed: %s (%d)\n", 640 + strerror(errno), errno); 641 + } 642 + 643 + if (pid == child) 644 + break; 645 + } 646 + 647 + if (WIFEXITED(wait_status)) { 648 + ksft_print_msg("Child exited loading values with status %d\n", 649 + WEXITSTATUS(wait_status)); 650 + pass = false; 651 + goto out; 652 + } 653 + 654 + if (WIFSIGNALED(wait_status)) { 655 + ksft_print_msg("Child died from signal %d loading values\n", 656 + WTERMSIG(wait_status)); 657 + pass = false; 658 + goto out; 659 + } 660 + 661 + /* Read initial values via ptrace */ 662 + pass = check_ptrace_values(child, config); 663 + 664 + /* Do whatever writes we want to do */ 665 + if (test->modify_values) 666 + test->modify_values(child, config); 667 + 668 + if (!continue_breakpoint(child, PTRACE_CONT)) 669 + goto cleanup; 670 + 671 + while (1) { 672 + pid = waitpid(child, &wait_status, 0); 673 + if (pid < 0) { 674 + if (errno == EINTR) 675 + continue; 676 + ksft_exit_fail_msg("waitpid() failed: %s (%d)\n", 677 + strerror(errno), errno); 678 + } 679 + 680 + if (pid == child) 681 + break; 682 + } 683 + 684 + if (WIFEXITED(wait_status)) { 685 + ksft_print_msg("Child exited saving values with status %d\n", 686 + WEXITSTATUS(wait_status)); 687 + pass = false; 688 + goto out; 689 + } 690 + 691 + if (WIFSIGNALED(wait_status)) { 692 + ksft_print_msg("Child died from signal %d saving values\n", 693 + 
WTERMSIG(wait_status)); 694 + pass = false; 695 + goto out; 696 + } 697 + 698 + /* See what happened as a result */ 699 + read_child_regs(child); 700 + 701 + if (!continue_breakpoint(child, PTRACE_DETACH)) 702 + goto cleanup; 703 + 704 + /* The child should exit cleanly */ 705 + got_alarm = false; 706 + alarm(1); 707 + while (1) { 708 + if (got_alarm) { 709 + ksft_print_msg("Wait for child timed out\n"); 710 + goto cleanup; 711 + } 712 + 713 + pid = waitpid(child, &wait_status, 0); 714 + if (pid < 0) { 715 + if (errno == EINTR) 716 + continue; 717 + ksft_exit_fail_msg("waitpid() failed: %s (%d)\n", 718 + strerror(errno), errno); 719 + } 720 + 721 + if (pid == child) 722 + break; 723 + } 724 + alarm(0); 725 + 726 + if (got_alarm) { 727 + ksft_print_msg("Timed out waiting for child\n"); 728 + pass = false; 729 + goto cleanup; 730 + } 731 + 732 + if (pid == child && WIFSIGNALED(wait_status)) { 733 + ksft_print_msg("Child died from signal %d cleaning up\n", 734 + WTERMSIG(wait_status)); 735 + pass = false; 736 + goto out; 737 + } 738 + 739 + if (pid == child && WIFEXITED(wait_status)) { 740 + if (WEXITSTATUS(wait_status) != 0) { 741 + ksft_print_msg("Child exited with error %d\n", 742 + WEXITSTATUS(wait_status)); 743 + pass = false; 744 + } 745 + } else { 746 + ksft_print_msg("Child did not exit cleanly\n"); 747 + pass = false; 748 + goto cleanup; 749 + } 750 + 751 + goto out; 752 + 753 + cleanup: 754 + ret = kill(child, SIGKILL); 755 + if (ret != 0) { 756 + ksft_print_msg("kill() failed: %s (%d)\n", 757 + strerror(errno), errno); 758 + return false; 759 + } 760 + 761 + while (1) { 762 + pid = waitpid(child, &wait_status, 0); 763 + if (pid < 0) { 764 + if (errno == EINTR) 765 + continue; 766 + ksft_exit_fail_msg("waitpid() failed: %s (%d)\n", 767 + strerror(errno), errno); 768 + } 769 + 770 + if (pid == child) 771 + break; 772 + } 773 + 774 + out: 775 + return pass; 776 + } 777 + 778 + static void fill_random(void *buf, size_t size) 779 + { 780 + int i; 781 + uint32_t 
*lbuf = buf; 782 + 783 + /* random() returns a 32 bit number regardless of the size of long */ 784 + for (i = 0; i < size / sizeof(uint32_t); i++) 785 + lbuf[i] = random(); 786 + } 787 + 788 + static void fill_random_ffr(void *buf, size_t vq) 789 + { 790 + uint8_t *lbuf = buf; 791 + int bits, i; 792 + 793 + /* 794 + * Only values with a continuous set of 0..n bits set are 795 + * valid for FFR, set all bits then clear a random number of 796 + * high bits. 797 + */ 798 + memset(buf, 0, __SVE_FFR_SIZE(vq)); 799 + 800 + bits = random() % (__SVE_FFR_SIZE(vq) * 8); 801 + for (i = 0; i < bits / 8; i++) 802 + lbuf[i] = 0xff; 803 + if (bits / 8 != __SVE_FFR_SIZE(vq)) 804 + lbuf[i] = (1 << (bits % 8)) - 1; 805 + } 806 + 807 + static void fpsimd_to_sve(__uint128_t *v, char *z, int vl) 808 + { 809 + int vq = __sve_vq_from_vl(vl); 810 + int i; 811 + __uint128_t *p; 812 + 813 + if (!vl) 814 + return; 815 + 816 + for (i = 0; i < __SVE_NUM_ZREGS; i++) { 817 + p = (__uint128_t *)&z[__SVE_ZREG_OFFSET(vq, i)]; 818 + *p = arm64_cpu_to_le128(v[i]); 819 + } 820 + } 821 + 822 + static void set_initial_values(struct test_config *config) 823 + { 824 + int vq = __sve_vq_from_vl(vl_in(config)); 825 + int sme_vq = __sve_vq_from_vl(config->sme_vl_in); 826 + 827 + svcr_in = config->svcr_in; 828 + svcr_expected = config->svcr_expected; 829 + svcr_out = 0; 830 + 831 + fill_random(&v_in, sizeof(v_in)); 832 + memcpy(v_expected, v_in, sizeof(v_in)); 833 + memset(v_out, 0, sizeof(v_out)); 834 + 835 + /* Changes will be handled in the test case */ 836 + if (sve_supported() || (config->svcr_in & SVCR_SM)) { 837 + /* The low 128 bits of Z are shared with the V registers */ 838 + fill_random(&z_in, __SVE_ZREGS_SIZE(vq)); 839 + fpsimd_to_sve(v_in, z_in, vl_in(config)); 840 + memcpy(z_expected, z_in, __SVE_ZREGS_SIZE(vq)); 841 + memset(z_out, 0, sizeof(z_out)); 842 + 843 + fill_random(&p_in, __SVE_PREGS_SIZE(vq)); 844 + memcpy(p_expected, p_in, __SVE_PREGS_SIZE(vq)); 845 + memset(p_out, 0, sizeof(p_out)); 
+
+		if ((config->svcr_in & SVCR_SM) && !fa64_supported())
+			memset(ffr_in, 0, __SVE_PREG_SIZE(vq));
+		else
+			fill_random_ffr(&ffr_in, vq);
+		memcpy(ffr_expected, ffr_in, __SVE_PREG_SIZE(vq));
+		memset(ffr_out, 0, __SVE_PREG_SIZE(vq));
+	}
+
+	if (config->svcr_in & SVCR_ZA)
+		fill_random(za_in, ZA_SIG_REGS_SIZE(sme_vq));
+	else
+		memset(za_in, 0, ZA_SIG_REGS_SIZE(sme_vq));
+	if (config->svcr_expected & SVCR_ZA)
+		memcpy(za_expected, za_in, ZA_SIG_REGS_SIZE(sme_vq));
+	else
+		memset(za_expected, 0, ZA_SIG_REGS_SIZE(sme_vq));
+	if (sme_supported())
+		memset(za_out, 0, sizeof(za_out));
+
+	if (sme2_supported()) {
+		if (config->svcr_in & SVCR_ZA)
+			fill_random(zt_in, ZT_SIG_REG_BYTES);
+		else
+			memset(zt_in, 0, ZT_SIG_REG_BYTES);
+		if (config->svcr_expected & SVCR_ZA)
+			memcpy(zt_expected, zt_in, ZT_SIG_REG_BYTES);
+		else
+			memset(zt_expected, 0, ZT_SIG_REG_BYTES);
+		memset(zt_out, 0, sizeof(zt_out));
+	}
+}
+
+static bool check_memory_values(struct test_config *config)
+{
+	bool pass = true;
+	int vq, sme_vq;
+
+	if (!compare_buffer("saved V", v_out, v_expected, sizeof(v_out)))
+		pass = false;
+
+	vq = __sve_vq_from_vl(vl_expected(config));
+	sme_vq = __sve_vq_from_vl(config->sme_vl_expected);
+
+	if (svcr_out != svcr_expected) {
+		ksft_print_msg("Mismatch in saved SVCR %lx != %lx\n",
+			       svcr_out, svcr_expected);
+		pass = false;
+	}
+
+	if (sve_vl_out != config->sve_vl_expected) {
+		ksft_print_msg("Mismatch in SVE VL: %ld != %d\n",
+			       sve_vl_out, config->sve_vl_expected);
+		pass = false;
+	}
+
+	if (sme_vl_out != config->sme_vl_expected) {
+		ksft_print_msg("Mismatch in SME VL: %ld != %d\n",
+			       sme_vl_out, config->sme_vl_expected);
+		pass = false;
+	}
+
+	if (!compare_buffer("saved Z", z_out, z_expected,
+			    __SVE_ZREGS_SIZE(vq)))
+		pass = false;
+
+	if (!compare_buffer("saved P", p_out, p_expected,
+			    __SVE_PREGS_SIZE(vq)))
+		pass = false;
+
+	if (!compare_buffer("saved FFR", ffr_out, ffr_expected,
+			    __SVE_PREG_SIZE(vq)))
+		pass = false;
+
+	if (!compare_buffer("saved ZA", za_out, za_expected,
+			    ZA_PT_ZA_SIZE(sme_vq)))
+		pass = false;
+
+	if (!compare_buffer("saved ZT", zt_out, zt_expected, ZT_SIG_REG_BYTES))
+		pass = false;
+
+	return pass;
+}
+
+static bool sve_sme_same(struct test_config *config)
+{
+	if (config->sve_vl_in != config->sve_vl_expected)
+		return false;
+
+	if (config->sme_vl_in != config->sme_vl_expected)
+		return false;
+
+	if (config->svcr_in != config->svcr_expected)
+		return false;
+
+	return true;
+}
+
+static bool sve_write_supported(struct test_config *config)
+{
+	if (!sve_supported() && !sme_supported())
+		return false;
+
+	if ((config->svcr_in & SVCR_ZA) != (config->svcr_expected & SVCR_ZA))
+		return false;
+
+	if (config->svcr_expected & SVCR_SM) {
+		if (config->sve_vl_in != config->sve_vl_expected) {
+			return false;
+		}
+
+		/* Changing the SME VL disables ZA */
+		if ((config->svcr_expected & SVCR_ZA) &&
+		    (config->sme_vl_in != config->sme_vl_expected)) {
+			return false;
+		}
+	} else {
+		if (config->sme_vl_in != config->sme_vl_expected) {
+			return false;
+		}
+	}
+
+	return true;
+}
+
+static void fpsimd_write_expected(struct test_config *config)
+{
+	int vl;
+
+	fill_random(&v_expected, sizeof(v_expected));
+
+	/* The SVE registers are flushed by a FPSIMD write */
+	vl = vl_expected(config);
+
+	memset(z_expected, 0, __SVE_ZREGS_SIZE(__sve_vq_from_vl(vl)));
+	memset(p_expected, 0, __SVE_PREGS_SIZE(__sve_vq_from_vl(vl)));
+	memset(ffr_expected, 0, __SVE_PREG_SIZE(__sve_vq_from_vl(vl)));
+
+	fpsimd_to_sve(v_expected, z_expected, vl);
+}
+
+static void fpsimd_write(pid_t child, struct test_config *test_config)
+{
+	struct user_fpsimd_state fpsimd;
+	struct iovec iov;
+	int ret;
+
+	memset(&fpsimd, 0, sizeof(fpsimd));
+	memcpy(&fpsimd.vregs, v_expected, sizeof(v_expected));
+
+	iov.iov_base = &fpsimd;
+	iov.iov_len = sizeof(fpsimd);
+	ret = ptrace(PTRACE_SETREGSET, child, NT_PRFPREG, &iov);
+	if (ret == -1)
+		ksft_print_msg("FPSIMD set failed: (%s) %d\n",
+			       strerror(errno), errno);
+}
+
+static void sve_write_expected(struct test_config *config)
+{
+	int vl = vl_expected(config);
+	int sme_vq = __sve_vq_from_vl(config->sme_vl_expected);
+
+	fill_random(z_expected, __SVE_ZREGS_SIZE(__sve_vq_from_vl(vl)));
+	fill_random(p_expected, __SVE_PREGS_SIZE(__sve_vq_from_vl(vl)));
+
+	if ((svcr_expected & SVCR_SM) && !fa64_supported())
+		memset(ffr_expected, 0, __SVE_PREG_SIZE(sme_vq));
+	else
+		fill_random_ffr(ffr_expected, __sve_vq_from_vl(vl));
+
+	/* Share the low bits of Z with V */
+	fill_random(&v_expected, sizeof(v_expected));
+	fpsimd_to_sve(v_expected, z_expected, vl);
+
+	if (config->sme_vl_in != config->sme_vl_expected) {
+		memset(za_expected, 0, ZA_PT_ZA_SIZE(sme_vq));
+		memset(zt_expected, 0, sizeof(zt_expected));
+	}
+}
+
+static void sve_write(pid_t child, struct test_config *config)
+{
+	struct user_sve_header *sve;
+	struct iovec iov;
+	int ret, vl, vq, regset;
+
+	vl = vl_expected(config);
+	vq = __sve_vq_from_vl(vl);
+
+	iov.iov_len = SVE_PT_SVE_OFFSET + SVE_PT_SVE_SIZE(vq, SVE_PT_REGS_SVE);
+	iov.iov_base = malloc(iov.iov_len);
+	if (!iov.iov_base) {
+		ksft_print_msg("Failed allocating %lu byte SVE write buffer\n",
+			       iov.iov_len);
+		return;
+	}
+	memset(iov.iov_base, 0, iov.iov_len);
+
+	sve = iov.iov_base;
+	sve->size = iov.iov_len;
+	sve->flags = SVE_PT_REGS_SVE;
+	sve->vl = vl;
+
+	memcpy(iov.iov_base + SVE_PT_SVE_ZREG_OFFSET(vq, 0),
+	       z_expected, SVE_PT_SVE_ZREGS_SIZE(vq));
+	memcpy(iov.iov_base + SVE_PT_SVE_PREG_OFFSET(vq, 0),
+	       p_expected, SVE_PT_SVE_PREGS_SIZE(vq));
+	memcpy(iov.iov_base + SVE_PT_SVE_FFR_OFFSET(vq),
+	       ffr_expected, SVE_PT_SVE_PREG_SIZE(vq));
+
+	if (svcr_expected & SVCR_SM)
+		regset = NT_ARM_SSVE;
+	else
+		regset = NT_ARM_SVE;
+
+	ret = ptrace(PTRACE_SETREGSET, child, regset, &iov);
+	if (ret != 0)
+		ksft_print_msg("Failed to write SVE: %s (%d)\n",
+			       strerror(errno), errno);
+
+	free(iov.iov_base);
+}
+
+static bool za_write_supported(struct test_config *config)
+{
+	if (config->svcr_expected & SVCR_SM) {
+		if (!(config->svcr_in & SVCR_SM))
+			return false;
+
+		/* Changing the SME VL exits streaming mode */
+		if (config->sme_vl_in != config->sme_vl_expected) {
+			return false;
+		}
+	}
+
+	/* Can't disable SM outside a VL change */
+	if ((config->svcr_in & SVCR_SM) &&
+	    !(config->svcr_expected & SVCR_SM))
+		return false;
+
+	return true;
+}
+
+static void za_write_expected(struct test_config *config)
+{
+	int sme_vq, sve_vq;
+
+	sme_vq = __sve_vq_from_vl(config->sme_vl_expected);
+
+	if (config->svcr_expected & SVCR_ZA) {
+		fill_random(za_expected, ZA_PT_ZA_SIZE(sme_vq));
+	} else {
+		memset(za_expected, 0, ZA_PT_ZA_SIZE(sme_vq));
+		memset(zt_expected, 0, sizeof(zt_expected));
+	}
+
+	/* Changing the SME VL flushes ZT, SVE state and exits SM */
+	if (config->sme_vl_in != config->sme_vl_expected) {
+		svcr_expected &= ~SVCR_SM;
+
+		sve_vq = __sve_vq_from_vl(vl_expected(config));
+		memset(z_expected, 0, __SVE_ZREGS_SIZE(sve_vq));
+		memset(p_expected, 0, __SVE_PREGS_SIZE(sve_vq));
+		memset(ffr_expected, 0, __SVE_PREG_SIZE(sve_vq));
+		memset(zt_expected, 0, sizeof(zt_expected));
+
+		fpsimd_to_sve(v_expected, z_expected, vl_expected(config));
+	}
+}
+
+static void za_write(pid_t child, struct test_config *config)
+{
+	struct user_za_header *za;
+	struct iovec iov;
+	int ret, vq;
+
+	vq = __sve_vq_from_vl(config->sme_vl_expected);
+
+	if (config->svcr_expected & SVCR_ZA)
+		iov.iov_len = ZA_PT_SIZE(vq);
+	else
+		iov.iov_len = sizeof(*za);
+	iov.iov_base = malloc(iov.iov_len);
+	if (!iov.iov_base) {
+		ksft_print_msg("Failed allocating %lu byte ZA write buffer\n",
+			       iov.iov_len);
+		return;
+	}
+	memset(iov.iov_base, 0, iov.iov_len);
+
+	za = iov.iov_base;
+	za->size = iov.iov_len;
+	za->vl = config->sme_vl_expected;
+	if (config->svcr_expected & SVCR_ZA)
+		memcpy(iov.iov_base + ZA_PT_ZA_OFFSET, za_expected,
+		       ZA_PT_ZA_SIZE(vq));
+
+	ret = ptrace(PTRACE_SETREGSET, child, NT_ARM_ZA, &iov);
+	if (ret != 0)
+		ksft_print_msg("Failed to write ZA: %s (%d)\n",
+			       strerror(errno), errno);
+
+	free(iov.iov_base);
+}
+
+static bool zt_write_supported(struct test_config *config)
+{
+	if (!sme2_supported())
+		return false;
+	if (config->sme_vl_in != config->sme_vl_expected)
+		return false;
+	if (!(config->svcr_expected & SVCR_ZA))
+		return false;
+	if ((config->svcr_in & SVCR_SM) != (config->svcr_expected & SVCR_SM))
+		return false;
+
+	return true;
+}
+
+static void zt_write_expected(struct test_config *config)
+{
+	int sme_vq;
+
+	sme_vq = __sve_vq_from_vl(config->sme_vl_expected);
+
+	if (config->svcr_expected & SVCR_ZA) {
+		fill_random(zt_expected, sizeof(zt_expected));
+	} else {
+		memset(za_expected, 0, ZA_PT_ZA_SIZE(sme_vq));
+		memset(zt_expected, 0, sizeof(zt_expected));
+	}
+}
+
+static void zt_write(pid_t child, struct test_config *config)
+{
+	struct iovec iov;
+	int ret;
+
+	iov.iov_len = ZT_SIG_REG_BYTES;
+	iov.iov_base = zt_expected;
+	ret = ptrace(PTRACE_SETREGSET, child, NT_ARM_ZT, &iov);
+	if (ret != 0)
+		ksft_print_msg("Failed to write ZT: %s (%d)\n",
+			       strerror(errno), errno);
+}
+
+/* Actually run a test */
+static void run_test(struct test_definition *test, struct test_config *config)
+{
+	pid_t child;
+	char name[1024];
+	bool pass;
+
+	if (sve_supported() && sme_supported())
+		snprintf(name, sizeof(name), "%s, SVE %d->%d, SME %d/%x->%d/%x",
+			 test->name,
+			 config->sve_vl_in, config->sve_vl_expected,
+			 config->sme_vl_in, config->svcr_in,
+			 config->sme_vl_expected, config->svcr_expected);
+	else if (sve_supported())
+		snprintf(name, sizeof(name), "%s, SVE %d->%d", test->name,
+			 config->sve_vl_in, config->sve_vl_expected);
+	else if (sme_supported())
+		snprintf(name, sizeof(name), "%s, SME %d/%x->%d/%x",
+			 test->name,
+			 config->sme_vl_in, config->svcr_in,
+			 config->sme_vl_expected, config->svcr_expected);
+	else
+		snprintf(name, sizeof(name), "%s", test->name);
+
+	if (test->supported && !test->supported(config)) {
+		ksft_test_result_skip("%s\n", name);
+		return;
+	}
+
+	set_initial_values(config);
+
+	if (test->set_expected_values)
+		test->set_expected_values(config);
+
+	child = fork();
+	if (child < 0)
+		ksft_exit_fail_msg("fork() failed: %s (%d)\n",
+				   strerror(errno), errno);
+	/* run_child() never returns */
+	if (child == 0)
+		run_child(config);
+
+	pass = run_parent(child, test, config);
+	if (!check_memory_values(config))
+		pass = false;
+
+	ksft_test_result(pass, "%s\n", name);
+}
+
+static void run_tests(struct test_definition defs[], int count,
+		      struct test_config *config)
+{
+	int i;
+
+	for (i = 0; i < count; i++)
+		run_test(&defs[i], config);
+}
+
+static struct test_definition base_test_defs[] = {
+	{
+		.name = "No writes",
+		.supported = sve_sme_same,
+	},
+	{
+		.name = "FPSIMD write",
+		.supported = sve_sme_same,
+		.set_expected_values = fpsimd_write_expected,
+		.modify_values = fpsimd_write,
+	},
+};
+
+static struct test_definition sve_test_defs[] = {
+	{
+		.name = "SVE write",
+		.supported = sve_write_supported,
+		.set_expected_values = sve_write_expected,
+		.modify_values = sve_write,
+	},
+};
+
+static struct test_definition za_test_defs[] = {
+	{
+		.name = "ZA write",
+		.supported = za_write_supported,
+		.set_expected_values = za_write_expected,
+		.modify_values = za_write,
+	},
+};
+
+static struct test_definition zt_test_defs[] = {
+	{
+		.name = "ZT write",
+		.supported = zt_write_supported,
+		.set_expected_values = zt_write_expected,
+		.modify_values = zt_write,
+	},
+};
+
+static int sve_vls[MAX_NUM_VLS], sme_vls[MAX_NUM_VLS];
+static int sve_vl_count, sme_vl_count;
+
+static void probe_vls(const char *name, int vls[], int *vl_count, int set_vl)
+{
+	unsigned int vq;
+	int vl;
+
+	*vl_count = 0;
+
+	for (vq = ARCH_VQ_MAX; vq > 0; vq /= 2) {
+		vl = prctl(set_vl, vq * 16);
+		if (vl == -1)
+			ksft_exit_fail_msg("SET_VL failed: %s (%d)\n",
+					   strerror(errno), errno);
+
+		vl &= PR_SVE_VL_LEN_MASK;
+
+		if (*vl_count && (vl == vls[*vl_count - 1]))
+			break;
+
+		vq = sve_vq_from_vl(vl);
+
+		vls[*vl_count] = vl;
+		*vl_count += 1;
+	}
+
+	if (*vl_count > 2) {
+		/* Just use the minimum and maximum */
+		vls[1] = vls[*vl_count - 1];
+		ksft_print_msg("%d %s VLs, using %d and %d\n",
+			       *vl_count, name, vls[0], vls[1]);
+		*vl_count = 2;
+	} else {
+		ksft_print_msg("%d %s VLs\n", *vl_count, name);
+	}
+}
+
+static struct {
+	int svcr_in, svcr_expected;
+} svcr_combinations[] = {
+	{ .svcr_in = 0, .svcr_expected = 0, },
+	{ .svcr_in = 0, .svcr_expected = SVCR_SM, },
+	{ .svcr_in = 0, .svcr_expected = SVCR_ZA, },
+	/* Can't enable both SM and ZA with a single ptrace write */
+
+	{ .svcr_in = SVCR_SM, .svcr_expected = 0, },
+	{ .svcr_in = SVCR_SM, .svcr_expected = SVCR_SM, },
+	{ .svcr_in = SVCR_SM, .svcr_expected = SVCR_ZA, },
+	{ .svcr_in = SVCR_SM, .svcr_expected = SVCR_SM | SVCR_ZA, },
+
+	{ .svcr_in = SVCR_ZA, .svcr_expected = 0, },
+	{ .svcr_in = SVCR_ZA, .svcr_expected = SVCR_SM, },
+	{ .svcr_in = SVCR_ZA, .svcr_expected = SVCR_ZA, },
+	{ .svcr_in = SVCR_ZA, .svcr_expected = SVCR_SM | SVCR_ZA, },
+
+	{ .svcr_in = SVCR_SM | SVCR_ZA, .svcr_expected = 0, },
+	{ .svcr_in = SVCR_SM | SVCR_ZA, .svcr_expected = SVCR_SM, },
+	{ .svcr_in = SVCR_SM | SVCR_ZA, .svcr_expected = SVCR_ZA, },
+	{ .svcr_in = SVCR_SM | SVCR_ZA, .svcr_expected = SVCR_SM | SVCR_ZA, },
+};
+
+static void run_sve_tests(void)
+{
+	struct test_config test_config;
+	int i, j;
+
+	if (!sve_supported())
+		return;
+
+	test_config.sme_vl_in = sme_vls[0];
+	test_config.sme_vl_expected = sme_vls[0];
+	test_config.svcr_in = 0;
+	test_config.svcr_expected = 0;
+
+	for (i = 0; i < sve_vl_count; i++) {
+		test_config.sve_vl_in = sve_vls[i];
+
+		for (j = 0; j < sve_vl_count; j++) {
+			test_config.sve_vl_expected = sve_vls[j];
+
+			run_tests(base_test_defs,
+				  ARRAY_SIZE(base_test_defs),
+				  &test_config);
+			if (sve_supported())
+				run_tests(sve_test_defs,
+					  ARRAY_SIZE(sve_test_defs),
+					  &test_config);
+		}
+	}
+
+}
+
+static void run_sme_tests(void)
+{
+	struct test_config test_config;
+	int i, j, k;
+
+	if (!sme_supported())
+		return;
+
+	test_config.sve_vl_in = sve_vls[0];
+	test_config.sve_vl_expected = sve_vls[0];
+
+	/*
+	 * Every SME VL/SVCR combination
+	 */
+	for (i = 0; i < sme_vl_count; i++) {
+		test_config.sme_vl_in = sme_vls[i];
+
+		for (j = 0; j < sme_vl_count; j++) {
+			test_config.sme_vl_expected = sme_vls[j];
+
+			for (k = 0; k < ARRAY_SIZE(svcr_combinations); k++) {
+				test_config.svcr_in = svcr_combinations[k].svcr_in;
+				test_config.svcr_expected = svcr_combinations[k].svcr_expected;
+
+				run_tests(base_test_defs,
+					  ARRAY_SIZE(base_test_defs),
+					  &test_config);
+				run_tests(sve_test_defs,
+					  ARRAY_SIZE(sve_test_defs),
+					  &test_config);
+				run_tests(za_test_defs,
+					  ARRAY_SIZE(za_test_defs),
+					  &test_config);
+
+				if (sme2_supported())
+					run_tests(zt_test_defs,
+						  ARRAY_SIZE(zt_test_defs),
+						  &test_config);
+			}
+		}
+	}
+}
+
+int main(void)
+{
+	struct test_config test_config;
+	struct sigaction sa;
+	int tests, ret, tmp;
+
+	srandom(getpid());
+
+	ksft_print_header();
+
+	if (sve_supported()) {
+		probe_vls("SVE", sve_vls, &sve_vl_count, PR_SVE_SET_VL);
+
+		tests = ARRAY_SIZE(base_test_defs) +
+			ARRAY_SIZE(sve_test_defs);
+		tests *= sve_vl_count * sve_vl_count;
+	} else {
+		/* Only run the FPSIMD tests */
+		sve_vl_count = 1;
+		tests = ARRAY_SIZE(base_test_defs);
+	}
+
+	if (sme_supported()) {
+		probe_vls("SME", sme_vls, &sme_vl_count, PR_SME_SET_VL);
+
+		tmp = ARRAY_SIZE(base_test_defs) + ARRAY_SIZE(sve_test_defs)
+			+ ARRAY_SIZE(za_test_defs);
+
+		if (sme2_supported())
+			tmp += ARRAY_SIZE(zt_test_defs);
+
+		tmp *= sme_vl_count * sme_vl_count;
+		tmp *= ARRAY_SIZE(svcr_combinations);
+		tests += tmp;
+	} else {
+		sme_vl_count = 1;
+	}
+
+	if (sme2_supported())
+		ksft_print_msg("SME2 supported\n");
+
+	if (fa64_supported())
+		ksft_print_msg("FA64 supported\n");
+
+	ksft_set_plan(tests);
+
+	/* Get signal handers ready before we start any children */
+	memset(&sa, 0, sizeof(sa));
+	sa.sa_sigaction = handle_alarm;
+	sa.sa_flags = SA_RESTART | SA_SIGINFO;
+	sigemptyset(&sa.sa_mask);
+	ret = sigaction(SIGALRM, &sa, NULL);
+	if (ret < 0)
+		ksft_print_msg("Failed to install SIGALRM handler: %s (%d)\n",
+			       strerror(errno), errno);
+
+	/*
+	 * Run the test set if there is no SVE or SME, with those we
+	 * have to pick a VL for each run.
+	 */
+	if (!sve_supported()) {
+		test_config.sve_vl_in = 0;
+		test_config.sve_vl_expected = 0;
+		test_config.sme_vl_in = 0;
+		test_config.sme_vl_expected = 0;
+		test_config.svcr_in = 0;
+		test_config.svcr_expected = 0;
+
+		run_tests(base_test_defs, ARRAY_SIZE(base_test_defs),
+			  &test_config);
+	}
+
+	run_sve_tests();
+	run_sme_tests();
+
+	ksft_finished();
+}
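Note (illustrative, not part of the commit): the comment in fill_random_ffr() states that FFR only accepts values whose set bits form a contiguous run starting at bit 0. A minimal standalone sketch of that pattern follows, assuming a plain byte buffer in place of the `__SVE_FFR_SIZE(vq)` sizing; the helper names `fill_low_bits` and `is_low_bits_prefix` are hypothetical and do not exist in the selftest.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Set the lowest `bits` bits of buf (bit 0 of byte 0 first), clear the rest. */
void fill_low_bits(uint8_t *buf, size_t size, size_t bits)
{
	size_t i;

	assert(bits <= size * 8);

	memset(buf, 0, size);
	for (i = 0; i < bits / 8; i++)
		buf[i] = 0xff;
	/* Only touch a partial byte when there is one, so we never write past buf */
	if (bits % 8)
		buf[i] = (uint8_t)((1u << (bits % 8)) - 1);
}

/* Return 1 if the set bits form a contiguous run starting at bit 0, else 0. */
int is_low_bits_prefix(const uint8_t *buf, size_t size)
{
	size_t i;
	int seen_clear = 0;

	for (i = 0; i < size * 8; i++) {
		int bit = (buf[i / 8] >> (i % 8)) & 1;

		if (!bit)
			seen_clear = 1;
		else if (seen_clear)
			return 0;	/* a set bit above a clear bit */
	}
	return 1;
}
```

A checker like `is_low_bits_prefix()` is one way to confirm that a randomly generated buffer is of the shape the comment describes before it is written to a tracee.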
+13
tools/testing/selftests/arm64/fp/fp-ptrace.h
@@ -0,0 +1,13 @@
+// SPDX-License-Identifier: GPL-2.0-only
+// Copyright (C) 2021-3 ARM Limited.
+
+#ifndef FP_PTRACE_H
+#define FP_PTRACE_H
+
+#define SVCR_SM_SHIFT 0
+#define SVCR_ZA_SHIFT 1
+
+#define SVCR_SM (1 << SVCR_SM_SHIFT)
+#define SVCR_ZA (1 << SVCR_ZA_SHIFT)
+
+#endif
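Note (illustrative, not part of the commit): the header above defines SM as bit 0 and ZA as bit 1 of SVCR, and the test names print SVCR as a raw hex value. A small sketch of decoding those two bits into text, with the macro definitions copied from the header and a hypothetical helper name `svcr_describe`:

```c
#include <assert.h>
#include <string.h>

/* Copied from fp-ptrace.h in this diff */
#define SVCR_SM_SHIFT 0
#define SVCR_ZA_SHIFT 1

#define SVCR_SM (1 << SVCR_SM_SHIFT)
#define SVCR_ZA (1 << SVCR_ZA_SHIFT)

/* Hypothetical helper: name the SM/ZA bits that are set in an SVCR value. */
const char *svcr_describe(unsigned long svcr)
{
	switch (svcr & (SVCR_SM | SVCR_ZA)) {
	case SVCR_SM | SVCR_ZA:
		return "SM|ZA";
	case SVCR_SM:
		return "SM";
	case SVCR_ZA:
		return "ZA";
	default:
		return "none";
	}
}
```

With this mapping, the `%x` values 0, 1, 2 and 3 seen in the generated test names correspond to "none", "SM", "ZA" and "SM|ZA" respectively.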
+1
tools/testing/selftests/arm64/signal/.gitignore
@@ -1,6 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0-only
 mangle_*
 fake_sigreturn_*
+fpmr_*
 sme_*
 ssve_*
 sve_*
+82
tools/testing/selftests/arm64/signal/testcases/fpmr_siginfo.c
@@ -0,0 +1,82 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2023 ARM Limited
+ *
+ * Verify that the FPMR register context in signal frames is set up as
+ * expected.
+ */
+
+#include <signal.h>
+#include <ucontext.h>
+#include <sys/auxv.h>
+#include <sys/prctl.h>
+#include <unistd.h>
+#include <asm/sigcontext.h>
+
+#include "test_signals_utils.h"
+#include "testcases.h"
+
+static union {
+	ucontext_t uc;
+	char buf[1024 * 128];
+} context;
+
+#define SYS_FPMR "S3_3_C4_C4_2"
+
+static uint64_t get_fpmr(void)
+{
+	uint64_t val;
+
+	asm volatile (
+		"mrs	%0, " SYS_FPMR "\n"
+		: "=r"(val)
+		:
+		: "cc");
+
+	return val;
+}
+
+int fpmr_present(struct tdescr *td, siginfo_t *si, ucontext_t *uc)
+{
+	struct _aarch64_ctx *head = GET_BUF_RESV_HEAD(context);
+	struct fpmr_context *fpmr_ctx;
+	size_t offset;
+	bool in_sigframe;
+	bool have_fpmr;
+	__u64 orig_fpmr;
+
+	have_fpmr = getauxval(AT_HWCAP2) & HWCAP2_FPMR;
+	if (have_fpmr)
+		orig_fpmr = get_fpmr();
+
+	if (!get_current_context(td, &context.uc, sizeof(context)))
+		return 1;
+
+	fpmr_ctx = (struct fpmr_context *)
+		get_header(head, FPMR_MAGIC, td->live_sz, &offset);
+
+	in_sigframe = fpmr_ctx != NULL;
+
+	fprintf(stderr, "FPMR sigframe %s on system %s FPMR\n",
+		in_sigframe ? "present" : "absent",
+		have_fpmr ? "with" : "without");
+
+	td->pass = (in_sigframe == have_fpmr);
+
+	if (have_fpmr && fpmr_ctx) {
+		if (fpmr_ctx->fpmr != orig_fpmr) {
+			fprintf(stderr, "FPMR in frame is %llx, was %llx\n",
+				fpmr_ctx->fpmr, orig_fpmr);
+			td->pass = false;
+		}
+	}
+
+	return 0;
+}
+
+struct tdescr tde = {
+	.name = "FPMR",
+	.descr = "Validate that FPMR is present as expected",
+	.timeout = 3,
+	.run = fpmr_present,
+};
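Note (illustrative, not part of the commit): fpmr_present() looks up a record by magic in the reserved area of the signal frame, where each record starts with a magic/size header and the list ends with a zeroed header. A rough standalone sketch of such a lookup is below. It is an assumption-laden stand-in, not the selftest's get_header(): `struct ctx_head` only mirrors the two-field layout of `struct _aarch64_ctx`, `find_record` is a hypothetical name, and `EXAMPLE_MAGIC` is a placeholder rather than the real `FPMR_MAGIC` from `<asm/sigcontext.h>`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Mirrors the layout of struct _aarch64_ctx: a magic and a total record size */
struct ctx_head {
	uint32_t magic;
	uint32_t size;
};

/* Placeholder value for illustration only, not a real kernel magic */
#define EXAMPLE_MAGIC 0x1234abcdu

/*
 * Hypothetical lookup: scan records until the terminator (magic 0, size 0)
 * or until the buffer is exhausted. Returns the byte offset of the first
 * record with a matching magic, or -1 if there is none.
 */
long find_record(const unsigned char *buf, size_t len, uint32_t magic)
{
	size_t off = 0;

	while (off + sizeof(struct ctx_head) <= len) {
		struct ctx_head head;

		memcpy(&head, buf + off, sizeof(head));
		if (head.magic == 0 && head.size == 0)
			break;
		if (head.magic == magic)
			return (long)off;
		if (head.size < sizeof(head))
			break;	/* malformed record, stop rather than loop forever */
		off += head.size;
	}
	return -1;
}
```

The test above then reduces to two checks: whether such a record is found exactly when the hwcap reports FPMR, and whether the value stored in it matches the register read before the context was captured.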
+8
tools/testing/selftests/arm64/signal/testcases/testcases.c
@@ -209,6 +209,14 @@
 			zt = (struct zt_context *)head;
 			new_flags |= ZT_CTX;
 			break;
+		case FPMR_MAGIC:
+			if (flags & FPMR_CTX)
+				*err = "Multiple FPMR_MAGIC";
+			else if (head->size !=
+				 sizeof(struct fpmr_context))
+				*err = "Bad size for fpmr_context";
+			new_flags |= FPMR_CTX;
+			break;
 		case EXTRA_MAGIC:
 			if (flags & EXTRA_CTX)
 				*err = "Multiple EXTRA_MAGIC";
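Note (illustrative, not part of the commit): the new FPMR_MAGIC case follows the same pattern as the other record types: reject a second record of the same kind, reject a record of the wrong size, then mark the kind as seen. A simplified sketch of that pattern, assuming a single flags word instead of the separate `flags`/`new_flags` used in testcases.c; the helper name `note_record` and its error strings are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copied from the testcases.h hunk in this diff */
#define FPMR_CTX (1 << 5)

/*
 * Hypothetical helper: note that a record of kind ctx_bit was seen.
 * Returns an error string, or NULL if the record is acceptable.
 */
const char *note_record(unsigned int *flags, unsigned int ctx_bit,
			size_t size, size_t expected_size)
{
	const char *err = NULL;

	if (*flags & ctx_bit)
		err = "Multiple records of the same kind";
	else if (size != expected_size)
		err = "Bad record size";

	/* The kind is marked as seen even when an error is reported */
	*flags |= ctx_bit;
	return err;
}
```

Using one bit per record kind keeps the duplicate check to a single mask test, which is why adding FPMR support only needs one new bit in testcases.h.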
+1
tools/testing/selftests/arm64/signal/testcases/testcases.h
@@ -19,6 +19,7 @@
 #define ZA_CTX		(1 << 2)
 #define EXTRA_CTX	(1 << 3)
 #define ZT_CTX		(1 << 4)
+#define FPMR_CTX	(1 << 5)
 
 #define KSFT_BAD_MAGIC	0xdeadbeef
 