Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge tag 'riscv-for-linus-6.16-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V updates from Palmer Dabbelt:

- Support for the FWFT SBI extension, which is part of SBI 3.0 and a
dependency for many new SBI and ISA extensions

- Support for getrandom() in the VDSO

- Support for mseal

- Optimized routines for raid6 syndrome and recovery calculations

- kexec_file() supports loading Image-formatted kernel binaries

- Improvements to the instruction patching framework to allow for
atomic instruction patching, along with rules as to how systems need
to behave in order to function correctly

- Support for a handful of new ISA extensions: Svinval, Zicbop, Zabha,
some SiFive vendor extensions

- Various fixes and cleanups, including: misaligned access handling,
perf symbol mangling, module loading, PUD THPs, and improved uaccess
routines

* tag 'riscv-for-linus-6.16-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (69 commits)
riscv: uaccess: Only restore the CSR_STATUS SUM bit
RISC-V: vDSO: Wire up getrandom() vDSO implementation
riscv: enable mseal sysmap for RV64
raid6: Add RISC-V SIMD syndrome and recovery calculations
riscv: mm: Add support for Svinval extension
RISC-V: Documentation: Add enough title underlines to CMODX
riscv: Improve Kconfig help for RISCV_ISA_V_PREEMPTIVE
MAINTAINERS: Update Atish's email address
riscv: uaccess: do not do misaligned accesses in get/put_user()
riscv: process: use unsigned int instead of unsigned long for put_user()
riscv: make unsafe user copy routines use existing assembly routines
riscv: hwprobe: export Zabha extension
riscv: Make regs_irqs_disabled() more clear
perf symbols: Ignore mapping symbols on riscv
RISC-V: Kconfig: Fix help text of CMDLINE_EXTEND
riscv: module: Optimize PLT/GOT entry counting
riscv: Add support for PUD THP
riscv: xchg: Prefetch the destination word for sc.w
riscv: Add ARCH_HAS_PREFETCH[W] support with Zicbop
riscv: Add support for Zicbop
...

+3793 -861
+2 -1
.mailmap
@@ -107,7 +107,8 @@
 Ashok Raj Nagarajan <quic_arnagara@quicinc.com> <arnagara@codeaurora.org>
 Ashwin Chaugule <quic_ashwinc@quicinc.com> <ashwinc@codeaurora.org>
 Asutosh Das <quic_asutoshd@quicinc.com> <asutoshd@codeaurora.org>
-Atish Patra <atishp@atishpatra.org> <atish.patra@wdc.com>
+Atish Patra <atish.patra@linux.dev> <atishp@atishpatra.org>
+Atish Patra <atish.patra@linux.dev> <atish.patra@wdc.com>
 Avaneesh Kumar Dwivedi <quic_akdwived@quicinc.com> <akdwived@codeaurora.org>
 Axel Dyks <xl@xlsigned.net>
 Axel Lin <axel.lin@gmail.com>
+39 -7
Documentation/arch/riscv/cmodx.rst
··· 10 10 program must enforce its own synchronization with the unprivileged fence.i 11 11 instruction. 12 12 13 - However, the default Linux ABI prohibits the use of fence.i in userspace 14 - applications. At any point the scheduler may migrate a task onto a new hart. If 15 - migration occurs after the userspace synchronized the icache and instruction 16 - storage with fence.i, the icache on the new hart will no longer be clean. This 17 - is due to the behavior of fence.i only affecting the hart that it is called on. 18 - Thus, the hart that the task has been migrated to may not have synchronized 19 - instruction storage and icache. 13 + CMODX in the Kernel Space 14 + ------------------------- 15 + 16 + Dynamic ftrace 17 + --------------------- 18 + 19 + Essentially, dynamic ftrace directs the control flow by inserting a function 20 + call at each patchable function entry, and patches it dynamically at runtime to 21 + enable or disable the redirection. In the case of RISC-V, 2 instructions, 22 + AUIPC + JALR, are required to compose a function call. However, it is impossible 23 + to patch 2 instructions and expect that a concurrent read-side executes them 24 + without a race condition. This series makes atmoic code patching possible in 25 + RISC-V ftrace. Kernel preemption makes things even worse as it allows the old 26 + state to persist across the patching process with stop_machine(). 27 + 28 + In order to get rid of stop_machine() and run dynamic ftrace with full kernel 29 + preemption, we partially initialize each patchable function entry at boot-time, 30 + setting the first instruction to AUIPC, and the second to NOP. Now, atmoic 31 + patching is possible because the kernel only has to update one instruction. 32 + According to Ziccif, as long as an instruction is naturally aligned, the ISA 33 + guarantee an atomic update. 
34 + 35 + By fixing down the first instruction, AUIPC, the range of the ftrace trampoline 36 + is limited to +-2K from the predetermined target, ftrace_caller, due to the lack 37 + of immediate encoding space in RISC-V. To address the issue, we introduce 38 + CALL_OPS, where an 8B naturally align metadata is added in front of each 39 + pacthable function. The metadata is resolved at the first trampoline, then the 40 + execution can be derect to another custom trampoline. 41 + 42 + CMODX in the User Space 43 + ----------------------- 44 + 45 + Though fence.i is an unprivileged instruction, the default Linux ABI prohibits 46 + the use of fence.i in userspace applications. At any point the scheduler may 47 + migrate a task onto a new hart. If migration occurs after the userspace 48 + synchronized the icache and instruction storage with fence.i, the icache on the 49 + new hart will no longer be clean. This is due to the behavior of fence.i only 50 + affecting the hart that it is called on. Thus, the hart that the task has been 51 + migrated to may not have synchronized instruction storage and icache. 20 52 21 53 There are two ways to solve this problem: use the riscv_flush_icache() syscall, 22 54 or use the ``PR_RISCV_SET_ICACHE_FLUSH_CTX`` prctl() and emit fence.i in
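The patch-site life cycle described in the new cmodx.rst text (two NOPs in the image, AUIPC fixed at boot, only the second slot rewritten at runtime) can be modelled in user space. The following is a minimal sketch, not kernel code: `struct patch_site`, `site_init()` and `site_set_enabled()` are hypothetical names, C11 atomics stand in for the aligned 4-byte store that the text says Ziccif makes atomic, and the instruction-cache synchronization a real kernel needs is left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define INSN_NOP4     0x00000013u /* same value as RISCV_INSN_NOP4 in the diff */
#define INSN_AUIPC_T0 0x00000297u /* same value as AUIPC_T0 in the diff */

/* Hypothetical model of one patchable function entry: two 4-byte slots. */
struct patch_site {
	_Atomic uint32_t insn[2];
};

/* Boot-time step (single-threaded): slot 0 becomes AUIPC, slot 1 stays NOP. */
static void site_init(struct patch_site *s, uint32_t auipc)
{
	atomic_init(&s->insn[0], auipc);
	atomic_init(&s->insn[1], INSN_NOP4);
}

/*
 * Runtime step: enabling or disabling tracing rewrites only slot 1.
 * A concurrent reader therefore sees either (AUIPC, NOP) or (AUIPC, JALR),
 * never a half-updated pair.
 */
static void site_set_enabled(struct patch_site *s, uint32_t jalr, int enable)
{
	/* The single store is only atomic if the slot is naturally aligned. */
	assert(((uintptr_t)&s->insn[1] & 3u) == 0);
	atomic_store_explicit(&s->insn[1], enable ? jalr : INSN_NOP4,
			      memory_order_release);
}
```

Because slot 0 never changes after boot, both transitions touch a single word. That is the property the documentation relies on to drop stop_machine().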
+26
Documentation/arch/riscv/hwprobe.rst
··· 271 271 * :c:macro:`RISCV_HWPROBE_EXT_ZICBOM`: The Zicbom extension is supported, as 272 272 ratified in commit 3dd606f ("Create cmobase-v1.0.pdf") of riscv-CMOs. 273 273 274 + * :c:macro:`RISCV_HWPROBE_EXT_ZABHA`: The Zabha extension is supported as 275 + ratified in commit 49f49c842ff9 ("Update to Rafified state") of 276 + riscv-zabha. 277 + 274 278 * :c:macro:`RISCV_HWPROBE_KEY_CPUPERF_0`: Deprecated. Returns similar values to 275 279 :c:macro:`RISCV_HWPROBE_KEY_MISALIGNED_SCALAR_PERF`, but the key was 276 280 mistakenly classified as a bitmask rather than a value. ··· 339 335 340 336 * :c:macro:`RISCV_HWPROBE_KEY_ZICBOM_BLOCK_SIZE`: An unsigned int which 341 337 represents the size of the Zicbom block in bytes. 338 + 339 + * :c:macro:`RISCV_HWPROBE_KEY_VENDOR_EXT_SIFIVE_0`: A bitmask containing the 340 + sifive vendor extensions that are compatible with the 341 + :c:macro:`RISCV_HWPROBE_BASE_BEHAVIOR_IMA`: base system behavior. 342 + 343 + * SIFIVE 344 + 345 + * :c:macro:`RISCV_HWPROBE_VENDOR_EXT_XSFVQMACCDOD`: The Xsfqmaccdod vendor 346 + extension is supported in version 1.1 of SiFive Int8 Matrix Multiplication 347 + Extensions Specification. 348 + 349 + * :c:macro:`RISCV_HWPROBE_VENDOR_EXT_XSFVQMACCQOQ`: The Xsfqmaccqoq vendor 350 + extension is supported in version 1.1 of SiFive Int8 Matrix Multiplication 351 + Instruction Extensions Specification. 352 + 353 + * :c:macro:`RISCV_HWPROBE_VENDOR_EXT_XSFVFNRCLIPXFQF`: The Xsfvfnrclipxfqf 354 + vendor extension is supported in version 1.0 of SiFive FP32-to-int8 Ranged 355 + Clip Instructions Extensions Specification. 356 + 357 + * :c:macro:`RISCV_HWPROBE_VENDOR_EXT_XSFVFWMACCQQQ`: The Xsfvfwmaccqqq 358 + vendor extension is supported in version 1.0 of Matrix Multiply Accumulate 359 + Instruction Extensions Specification.
+25
Documentation/devicetree/bindings/riscv/extensions.yaml
··· 662 662 Registers in the AX45MP datasheet. 663 663 https://www.andestech.com/wp-content/uploads/AX45MP-1C-Rev.-5.0.0-Datasheet.pdf 664 664 665 + # SiFive 666 + - const: xsfvqmaccdod 667 + description: 668 + SiFive Int8 Matrix Multiplication Extensions Specification. 669 + See more details in 670 + https://www.sifive.com/document-file/sifive-int8-matrix-multiplication-extensions-specification 671 + 672 + - const: xsfvqmaccqoq 673 + description: 674 + SiFive Int8 Matrix Multiplication Extensions Specification. 675 + See more details in 676 + https://www.sifive.com/document-file/sifive-int8-matrix-multiplication-extensions-specification 677 + 678 + - const: xsfvfnrclipxfqf 679 + description: 680 + SiFive FP32-to-int8 Ranged Clip Instructions Extensions Specification. 681 + See more details in 682 + https://www.sifive.com/document-file/fp32-to-int8-ranged-clip-instructions 683 + 684 + - const: xsfvfwmaccqqq 685 + description: 686 + SiFive Matrix Multiply Accumulate Instruction Extensions Specification. 687 + See more details in 688 + https://www.sifive.com/document-file/matrix-multiply-accumulate-instruction 689 + 665 690 # T-HEAD 666 691 - const: xtheadvector 667 692 description:
+2 -2
MAINTAINERS
@@ -13270,7 +13270,7 @@
 
 KERNEL VIRTUAL MACHINE FOR RISC-V (KVM/riscv)
 M: Anup Patel <anup@brainfault.org>
-R: Atish Patra <atishp@atishpatra.org>
+R: Atish Patra <atish.patra@linux.dev>
 L: kvm@vger.kernel.org
 L: kvm-riscv@lists.infradead.org
 L: linux-riscv@lists.infradead.org
@@ -21332,7 +21332,7 @@
 F: arch/riscv/boot/dts/starfive/
 
 RISC-V PMU DRIVERS
-M: Atish Patra <atishp@atishpatra.org>
+M: Atish Patra <atish.patra@linux.dev>
 R: Anup Patel <anup@brainfault.org>
 L: linux-riscv@lists.infradead.org
 S: Supported
+30 -8
arch/riscv/Kconfig
··· 70 70 # LLD >= 14: https://github.com/llvm/llvm-project/issues/50505 71 71 select ARCH_SUPPORTS_LTO_CLANG if LLD_VERSION >= 140000 72 72 select ARCH_SUPPORTS_LTO_CLANG_THIN if LLD_VERSION >= 140000 73 + select ARCH_SUPPORTS_MSEAL_SYSTEM_MAPPINGS if 64BIT && MMU 73 74 select ARCH_SUPPORTS_PAGE_TABLE_CHECK if MMU 74 75 select ARCH_SUPPORTS_PER_VMA_LOCK if MMU 75 76 select ARCH_SUPPORTS_RT ··· 100 99 select EDAC_SUPPORT 101 100 select FRAME_POINTER if PERF_EVENTS || (FUNCTION_TRACER && !DYNAMIC_FTRACE) 102 101 select FTRACE_MCOUNT_USE_PATCHABLE_FUNCTION_ENTRY if DYNAMIC_FTRACE 102 + select FUNCTION_ALIGNMENT_8B if DYNAMIC_FTRACE_WITH_CALL_OPS 103 103 select GENERIC_ARCH_TOPOLOGY 104 104 select GENERIC_ATOMIC64 if !64BIT 105 105 select GENERIC_CLOCKEVENTS_BROADCAST if SMP ··· 145 143 select HAVE_ARCH_THREAD_STRUCT_WHITELIST 146 144 select HAVE_ARCH_TRACEHOOK 147 145 select HAVE_ARCH_TRANSPARENT_HUGEPAGE if 64BIT && MMU 146 + select HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD if 64BIT && MMU 148 147 select HAVE_ARCH_USERFAULTFD_MINOR if 64BIT && USERFAULTFD 149 148 select HAVE_ARCH_VMAP_STACK if MMU && 64BIT 150 149 select HAVE_ASM_MODVERSIONS ··· 153 150 select HAVE_DEBUG_KMEMLEAK 154 151 select HAVE_DMA_CONTIGUOUS if MMU 155 152 select HAVE_DYNAMIC_FTRACE if !XIP_KERNEL && MMU && (CLANG_SUPPORTS_DYNAMIC_FTRACE || GCC_SUPPORTS_DYNAMIC_FTRACE) 156 - select HAVE_DYNAMIC_FTRACE_WITH_DIRECT_CALLS 153 + select FUNCTION_ALIGNMENT_4B if HAVE_DYNAMIC_FTRACE && RISCV_ISA_C 154 + select HAVE_DYNAMIC_FTRACE_WITH_DIRECT_CALLS if HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS 155 + select HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS if (DYNAMIC_FTRACE_WITH_ARGS && !CFI_CLANG) 157 156 select HAVE_DYNAMIC_FTRACE_WITH_ARGS if HAVE_DYNAMIC_FTRACE 158 157 select HAVE_FTRACE_GRAPH_FUNC 159 158 select HAVE_FTRACE_MCOUNT_RECORD if !XIP_KERNEL 160 159 select HAVE_FUNCTION_GRAPH_TRACER if HAVE_DYNAMIC_FTRACE_WITH_ARGS 161 160 select HAVE_FUNCTION_GRAPH_FREGS 162 - select HAVE_FUNCTION_TRACER if !XIP_KERNEL && 
!PREEMPTION 161 + select HAVE_FUNCTION_TRACER if !XIP_KERNEL 163 162 select HAVE_EBPF_JIT if MMU 164 163 select HAVE_GUP_FAST if MMU 165 164 select HAVE_FUNCTION_ARG_ACCESS_API ··· 223 218 select THREAD_INFO_IN_TASK 224 219 select TRACE_IRQFLAGS_SUPPORT 225 220 select UACCESS_MEMCPY if !MMU 221 + select VDSO_GETRANDOM if HAVE_GENERIC_VDSO 226 222 select USER_STACKTRACE_SUPPORT 227 223 select ZONE_DMA32 if 64BIT 228 224 ··· 242 236 config GCC_SUPPORTS_DYNAMIC_FTRACE 243 237 def_bool CC_IS_GCC 244 238 depends on $(cc-option,-fpatchable-function-entry=8) 239 + depends on CC_HAS_MIN_FUNCTION_ALIGNMENT || !RISCV_ISA_C 245 240 246 241 config HAVE_SHADOW_CALL_STACK 247 242 def_bool $(cc-option,-fsanitize=shadow-call-stack) ··· 671 664 default y 672 665 help 673 666 Usually, in-kernel SIMD routines are run with preemption disabled. 674 - Functions which envoke long running SIMD thus must yield core's 667 + Functions which invoke long running SIMD thus must yield the core's 675 668 vector unit to prevent blocking other tasks for too long. 676 669 677 - This config allows kernel to run SIMD without explicitly disable 678 - preemption. Enabling this config will result in higher memory 679 - consumption due to the allocation of per-task's kernel Vector context. 670 + This config allows the kernel to run SIMD without explicitly disabling 671 + preemption. Enabling this config will result in higher memory consumption 672 + due to the allocation of per-task's kernel Vector context. 680 673 681 674 config RISCV_ISA_ZAWRS 682 675 bool "Zawrs extension support for more efficient busy waiting" ··· 848 841 The Zicboz extension is used for faster zeroing of memory. 849 842 850 843 If you don't know what to do here, say Y. 
844 + 845 + config RISCV_ISA_ZICBOP 846 + bool "Zicbop extension support for cache block prefetch" 847 + depends on MMU 848 + depends on RISCV_ALTERNATIVE 849 + default y 850 + help 851 + Adds support to dynamically detect the presence of the ZICBOP 852 + extension (Cache Block Prefetch Operations) and enable its 853 + usage. 854 + 855 + The Zicbop extension can be used to prefetch cache blocks for 856 + read/write fetch. 857 + 858 + If you don't know what to do here, say Y. 851 859 852 860 config TOOLCHAIN_NEEDS_EXPLICIT_ZICSR_ZIFENCEI 853 861 def_bool y ··· 1193 1171 config CMDLINE_EXTEND 1194 1172 bool "Extend bootloader kernel arguments" 1195 1173 help 1196 - The command-line arguments provided during boot will be 1197 - appended to the built-in command line. This is useful in 1174 + The built-in command line will be appended to the command- 1175 + line arguments provided during boot. This is useful in 1198 1176 cases where the provided arguments are insufficient and 1199 1177 you don't want to or cannot modify them. 1200 1178
+13
arch/riscv/Kconfig.vendor
··· 16 16 If you don't know what to do here, say Y. 17 17 endmenu 18 18 19 + menu "SiFive" 20 + config RISCV_ISA_VENDOR_EXT_SIFIVE 21 + bool "SiFive vendor extension support" 22 + select RISCV_ISA_VENDOR_EXT 23 + default y 24 + help 25 + Say N here if you want to disable all SiFive vendor extension 26 + support. This will cause any SiFive vendor extensions that are 27 + requested by hardware probing to be ignored. 28 + 29 + If you don't know what to do here, say Y. 30 + endmenu 31 + 19 32 menu "T-Head" 20 33 config RISCV_ISA_VENDOR_EXT_THEAD 21 34 bool "T-Head vendor extension support"
+2 -2
arch/riscv/Makefile
@@ -15,9 +15,9 @@
 	LDFLAGS_vmlinux += --no-relax
 	KBUILD_CPPFLAGS += -DCC_USING_PATCHABLE_FUNCTION_ENTRY
 ifeq ($(CONFIG_RISCV_ISA_C),y)
-	CC_FLAGS_FTRACE := -fpatchable-function-entry=4
+	CC_FLAGS_FTRACE := -fpatchable-function-entry=8,4
 else
-	CC_FLAGS_FTRACE := -fpatchable-function-entry=2
+	CC_FLAGS_FTRACE := -fpatchable-function-entry=4,2
 endif
 endif
 
+2 -22
arch/riscv/configs/defconfig
··· 18 18 CONFIG_CGROUP_CPUACCT=y 19 19 CONFIG_CGROUP_PERF=y 20 20 CONFIG_CGROUP_BPF=y 21 - CONFIG_NAMESPACES=y 22 21 CONFIG_USER_NS=y 23 22 CONFIG_CHECKPOINT_RESTORE=y 24 23 CONFIG_BLK_DEV_INITRD=y 25 - CONFIG_EXPERT=y 26 - # CONFIG_SYSFS_SYSCALL is not set 27 24 CONFIG_PROFILING=y 28 25 CONFIG_ARCH_MICROCHIP=y 29 26 CONFIG_ARCH_SIFIVE=y ··· 179 182 CONFIG_REGULATOR_AXP20X=y 180 183 CONFIG_REGULATOR_GPIO=y 181 184 CONFIG_MEDIA_SUPPORT=m 185 + CONFIG_MEDIA_PLATFORM_SUPPORT=y 182 186 CONFIG_VIDEO_CADENCE_CSI2RX=m 183 187 CONFIG_DRM=m 184 188 CONFIG_DRM_RADEON=m ··· 295 297 CONFIG_CRYPTO_USER_API_HASH=y 296 298 CONFIG_CRYPTO_DEV_VIRTIO=y 297 299 CONFIG_PRINTK_TIME=y 300 + CONFIG_DEBUG_KERNEL=y 298 301 CONFIG_DEBUG_FS=y 299 - CONFIG_DEBUG_PAGEALLOC=y 300 - CONFIG_SCHED_STACK_END_CHECK=y 301 - CONFIG_DEBUG_VM=y 302 - CONFIG_DEBUG_VM_PGFLAGS=y 303 - CONFIG_DEBUG_MEMORY_INIT=y 304 - CONFIG_DEBUG_PER_CPU_MAPS=y 305 - CONFIG_SOFTLOCKUP_DETECTOR=y 306 - CONFIG_WQ_WATCHDOG=y 307 - CONFIG_DEBUG_RT_MUTEXES=y 308 - CONFIG_DEBUG_SPINLOCK=y 309 - CONFIG_DEBUG_MUTEXES=y 310 - CONFIG_DEBUG_RWSEMS=y 311 - CONFIG_DEBUG_ATOMIC_SLEEP=y 312 - CONFIG_DEBUG_LIST=y 313 - CONFIG_DEBUG_PLIST=y 314 - CONFIG_DEBUG_SG=y 315 - # CONFIG_RCU_TRACE is not set 316 - CONFIG_RCU_EQS_DEBUG=y 317 - # CONFIG_FTRACE is not set 318 302 # CONFIG_RUNTIME_TESTING_MENU is not set 319 303 CONFIG_MEMTEST=y
+1 -1
arch/riscv/include/asm/asm-prototypes.h
@@ -12,7 +12,7 @@
 #ifdef CONFIG_RISCV_ISA_V
 
 #ifdef CONFIG_MMU
-asmlinkage int enter_vector_usercopy(void *dst, void *src, size_t n);
+asmlinkage int enter_vector_usercopy(void *dst, void *src, size_t n, bool enable_sum);
 #endif /* CONFIG_MMU */
 
 void xor_regs_2_(unsigned long bytes, unsigned long *__restrict p1,
-5
arch/riscv/include/asm/barrier.h
@@ -14,11 +14,6 @@
 #include <asm/cmpxchg.h>
 #include <asm/fence.h>
 
-#define nop() __asm__ __volatile__ ("nop")
-#define __nops(n) ".rept " #n "\nnop\n.endr\n"
-#define nops(n) __asm__ __volatile__ (__nops(n))
-
-
 /* These barriers need to enforce ordering on both devices or memory. */
 #define __mb() RISCV_FENCE(iorw, iorw)
 #define __rmb() RISCV_FENCE(ir, ir)
+1
arch/riscv/include/asm/cacheflush.h
@@ -85,6 +85,7 @@
 
 extern unsigned int riscv_cbom_block_size;
 extern unsigned int riscv_cboz_block_size;
+extern unsigned int riscv_cbop_block_size;
 void riscv_init_cbo_blocksizes(void);
 
 #ifdef CONFIG_RISCV_DMA_NONCOHERENT
+3 -1
arch/riscv/include/asm/cmpxchg.h
··· 13 13 #include <asm/hwcap.h> 14 14 #include <asm/insn-def.h> 15 15 #include <asm/cpufeature-macros.h> 16 + #include <asm/processor.h> 16 17 17 18 #define __arch_xchg_masked(sc_sfx, swap_sfx, prepend, sc_append, \ 18 19 swap_append, r, p, n) \ ··· 38 37 \ 39 38 __asm__ __volatile__ ( \ 40 39 prepend \ 40 + PREFETCHW_ASM(%5) \ 41 41 "0: lr.w %0, %2\n" \ 42 42 " and %1, %0, %z4\n" \ 43 43 " or %1, %1, %z3\n" \ ··· 46 44 " bnez %1, 0b\n" \ 47 45 sc_append \ 48 46 : "=&r" (__retx), "=&r" (__rc), "+A" (*(__ptr32b)) \ 49 - : "rJ" (__newx), "rJ" (~__mask) \ 47 + : "rJ" (__newx), "rJ" (~__mask), "rJ" (__ptr32b) \ 50 48 : "memory"); \ 51 49 \ 52 50 r = (__typeof__(*(p)))((__retx & __mask) >> __s); \
+12 -2
arch/riscv/include/asm/cpufeature.h
··· 67 67 _RISCV_ISA_EXT_DATA(_name, _id, _sub_exts, ARRAY_SIZE(_sub_exts), _validate) 68 68 69 69 bool __init check_unaligned_access_emulated_all_cpus(void); 70 + void unaligned_access_init(void); 71 + int cpu_online_unaligned_access_init(unsigned int cpu); 70 72 #if defined(CONFIG_RISCV_SCALAR_MISALIGNED) 71 - void check_unaligned_access_emulated(struct work_struct *work __always_unused); 72 73 void unaligned_emulation_finish(void); 73 74 bool unaligned_ctl_available(void); 74 - DECLARE_PER_CPU(long, misaligned_access_speed); 75 75 #else 76 76 static inline bool unaligned_ctl_available(void) 77 + { 78 + return false; 79 + } 80 + #endif 81 + 82 + #if defined(CONFIG_RISCV_MISALIGNED) 83 + DECLARE_PER_CPU(long, misaligned_access_speed); 84 + bool misaligned_traps_can_delegate(void); 85 + #else 86 + static inline bool misaligned_traps_can_delegate(void) 77 87 { 78 88 return false; 79 89 }
+35 -27
arch/riscv/include/asm/ftrace.h
··· 20 20 #define ftrace_return_address(n) return_address(n) 21 21 22 22 void _mcount(void); 23 - static inline unsigned long ftrace_call_adjust(unsigned long addr) 24 - { 25 - return addr; 26 - } 23 + unsigned long ftrace_call_adjust(unsigned long addr); 24 + unsigned long arch_ftrace_get_symaddr(unsigned long fentry_ip); 25 + #define ftrace_get_symaddr(fentry_ip) arch_ftrace_get_symaddr(fentry_ip) 27 26 28 27 /* 29 28 * Let's do like x86/arm64 and ignore the compat syscalls. ··· 56 57 * 2) jalr: setting low-12 offset to ra, jump to ra, and set ra to 57 58 * return address (original pc + 4) 58 59 * 60 + * The first 2 instructions for each tracable function is compiled to 2 nop 61 + * instructions. Then, the kernel initializes the first instruction to auipc at 62 + * boot time (<ftrace disable>). The second instruction is patched to jalr to 63 + * start the trace. 64 + * 65 + *<Image>: 66 + * 0: nop 67 + * 4: nop 68 + * 59 69 *<ftrace enable>: 60 - * 0: auipc t0/ra, 0x? 61 - * 4: jalr t0/ra, ?(t0/ra) 70 + * 0: auipc t0, 0x? 71 + * 4: jalr t0, ?(t0) 62 72 * 63 73 *<ftrace disable>: 64 - * 0: nop 74 + * 0: auipc t0, 0x? 65 75 * 4: nop 66 76 * 67 77 * Dynamic ftrace generates probes to call sites, so we must deal with ··· 83 75 #define AUIPC_OFFSET_MASK (0xfffff000) 84 76 #define AUIPC_PAD (0x00001000) 85 77 #define JALR_SHIFT 20 86 - #define JALR_RA (0x000080e7) 87 - #define AUIPC_RA (0x00000097) 88 78 #define JALR_T0 (0x000282e7) 89 79 #define AUIPC_T0 (0x00000297) 80 + #define JALR_RANGE (JALR_SIGN_MASK - 1) 90 81 91 82 #define to_jalr_t0(offset) \ 92 83 (((offset & JALR_OFFSET_MASK) << JALR_SHIFT) | JALR_T0) ··· 103 96 call[1] = to_jalr_t0(offset); \ 104 97 } while (0) 105 98 106 - #define to_jalr_ra(offset) \ 107 - (((offset & JALR_OFFSET_MASK) << JALR_SHIFT) | JALR_RA) 108 - 109 - #define to_auipc_ra(offset) \ 110 - ((offset & JALR_SIGN_MASK) ? 
\ 111 - (((offset & AUIPC_OFFSET_MASK) + AUIPC_PAD) | AUIPC_RA) : \ 112 - ((offset & AUIPC_OFFSET_MASK) | AUIPC_RA)) 113 - 114 - #define make_call_ra(caller, callee, call) \ 115 - do { \ 116 - unsigned int offset = \ 117 - (unsigned long) (callee) - (unsigned long) (caller); \ 118 - call[0] = to_auipc_ra(offset); \ 119 - call[1] = to_jalr_ra(offset); \ 120 - } while (0) 121 - 122 99 /* 123 - * Let auipc+jalr be the basic *mcount unit*, so we make it 8 bytes here. 100 + * Only the jalr insn in the auipc+jalr is patched, so we make it 4 101 + * bytes here. 124 102 */ 125 - #define MCOUNT_INSN_SIZE 8 103 + #define MCOUNT_INSN_SIZE 4 104 + #define MCOUNT_AUIPC_SIZE 4 105 + #define MCOUNT_JALR_SIZE 4 106 + #define MCOUNT_NOP4_SIZE 4 126 107 127 108 #ifndef __ASSEMBLY__ 128 109 struct dyn_ftrace; ··· 130 135 unsigned long sp; 131 136 unsigned long s0; 132 137 unsigned long t1; 138 + #ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS 139 + unsigned long direct_tramp; 140 + #endif 133 141 union { 134 142 unsigned long args[8]; 135 143 struct { ··· 144 146 unsigned long a5; 145 147 unsigned long a6; 146 148 unsigned long a7; 149 + #ifdef CONFIG_CC_IS_CLANG 150 + unsigned long t2; 151 + unsigned long t3; 152 + unsigned long t4; 153 + unsigned long t5; 154 + unsigned long t6; 155 + #endif 147 156 }; 148 157 }; 149 158 }; ··· 226 221 struct ftrace_ops *op, struct ftrace_regs *fregs); 227 222 #define ftrace_graph_func ftrace_graph_func 228 223 224 + #ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS 229 225 static inline void arch_ftrace_set_direct_caller(struct ftrace_regs *fregs, unsigned long addr) 230 226 { 231 227 arch_ftrace_regs(fregs)->t1 = addr; 232 228 } 229 + #endif /* CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS */ 230 + 233 231 #endif /* CONFIG_DYNAMIC_FTRACE_WITH_ARGS */ 234 232 235 233 #endif /* __ASSEMBLY__ */
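The `to_auipc_t0()` / `to_jalr_t0()` arithmetic in this header splits a 32-bit PC-relative offset into a 20-bit upper part and a sign-extended 12-bit lower part. The sketch below restates that arithmetic as plain C functions so it can be checked on any host. The `AUIPC_*`, `JALR_SHIFT` and `*_T0` constants are copied from the diff. `JALR_SIGN_MASK` and `JALR_OFFSET_MASK` are not visible in the excerpt, so their values here are assumptions based on the 12-bit immediate format, and `decode_call_offset()` is a hypothetical helper added only to verify the round trip.

```c
#include <assert.h>
#include <stdint.h>

/* Constants visible in the diff. */
#define AUIPC_OFFSET_MASK 0xfffff000u
#define AUIPC_PAD         0x00001000u
#define JALR_SHIFT        20
#define JALR_T0           0x000282e7u
#define AUIPC_T0          0x00000297u

/* Assumed values (not shown in the excerpt): bit 11 and the low 12 bits. */
#define JALR_SIGN_MASK    0x00000800u
#define JALR_OFFSET_MASK  0x00000fffu

/*
 * auipc t0, hi20: if the low 12 bits will be sign-extended to a negative
 * value by jalr, add one page to the upper part to compensate.
 */
static uint32_t encode_auipc_t0(uint32_t offset)
{
	uint32_t hi = offset & AUIPC_OFFSET_MASK;

	if (offset & JALR_SIGN_MASK)
		hi += AUIPC_PAD;
	return hi | AUIPC_T0;
}

/* jalr t0, lo12(t0): the low 12 bits go into the I-type immediate field. */
static uint32_t encode_jalr_t0(uint32_t offset)
{
	return ((offset & JALR_OFFSET_MASK) << JALR_SHIFT) | JALR_T0;
}

/* Hypothetical checker: recombine the pair into the offset a hart would use. */
static uint32_t decode_call_offset(uint32_t auipc, uint32_t jalr)
{
	uint32_t hi = auipc & AUIPC_OFFSET_MASK;
	int32_t lo = (int32_t)((jalr >> JALR_SHIFT) & JALR_OFFSET_MASK);

	if (lo & (int32_t)JALR_SIGN_MASK)
		lo -= 0x1000; /* sign-extend the 12-bit immediate */
	return hi + (uint32_t)lo;
}
```

With AUIPC fixed at boot, only the value produced by `encode_jalr_t0()` changes when tracing is switched on. This is consistent with `MCOUNT_INSN_SIZE` shrinking from 8 to 4 in the diff.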
+1
arch/riscv/include/asm/hwcap.h
@@ -105,6 +105,7 @@
 #define RISCV_ISA_EXT_ZVFBFWMA 96
 #define RISCV_ISA_EXT_ZAAMO 97
 #define RISCV_ISA_EXT_ZALRSC 98
+#define RISCV_ISA_EXT_ZICBOP 99
 
 #define RISCV_ISA_EXT_XLINUXENVCFG 127
 
+2 -1
arch/riscv/include/asm/hwprobe.h
@@ -8,7 +8,7 @@
 
 #include <uapi/asm/hwprobe.h>
 
-#define RISCV_HWPROBE_MAX_KEY 12
+#define RISCV_HWPROBE_MAX_KEY 13
 
 static inline bool riscv_hwprobe_key_is_valid(__s64 key)
 {
@@ -22,6 +22,7 @@
 	case RISCV_HWPROBE_KEY_IMA_EXT_0:
 	case RISCV_HWPROBE_KEY_CPUPERF_0:
 	case RISCV_HWPROBE_KEY_VENDOR_EXT_THEAD_0:
+	case RISCV_HWPROBE_KEY_VENDOR_EXT_SIFIVE_0:
 		return true;
 	}
 
+2
arch/riscv/include/asm/image.h
@@ -30,6 +30,8 @@
 				RISCV_HEADER_VERSION_MINOR)
 
 #ifndef __ASSEMBLY__
+#define riscv_image_flag_field(flags, field)\
+	(((flags) >> field##_SHIFT) & field##_MASK)
 /**
  * struct riscv_image_header - riscv kernel image header
  * @code0: Executable code
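The new `riscv_image_flag_field()` macro relies on token pasting: passing a field name `X` makes it look up `X_SHIFT` and `X_MASK`. A small sketch of how such a macro is used follows. The macro body is copied from the diff. The `EXAMPLE_FLAG_*` fields and `image_is_big_endian()` are hypothetical names invented for illustration, not the flag definitions from the kernel header.

```c
#include <assert.h>
#include <stdint.h>

/* Macro body as added by the diff. */
#define riscv_image_flag_field(flags, field)\
	(((flags) >> field##_SHIFT) & field##_MASK)

/* Hypothetical flag layout, for illustration only. */
#define EXAMPLE_FLAG_BE_SHIFT   0
#define EXAMPLE_FLAG_BE_MASK    0x1
#define EXAMPLE_FLAG_MODE_SHIFT 1
#define EXAMPLE_FLAG_MODE_MASK  0x3

/* Returns non-zero when the hypothetical endianness bit is set. */
static int image_is_big_endian(uint64_t flags)
{
	return riscv_image_flag_field(flags, EXAMPLE_FLAG_BE) != 0;
}
```

The caller names a field once and the shift/mask pair is selected at preprocessing time, so there is no runtime lookup.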
+66
arch/riscv/include/asm/insn-def.h
··· 18 18 #define INSN_I_RD_SHIFT 7 19 19 #define INSN_I_OPCODE_SHIFT 0 20 20 21 + #define INSN_S_SIMM7_SHIFT 25 22 + #define INSN_S_RS2_SHIFT 20 23 + #define INSN_S_RS1_SHIFT 15 24 + #define INSN_S_FUNC3_SHIFT 12 25 + #define INSN_S_SIMM5_SHIFT 7 26 + #define INSN_S_OPCODE_SHIFT 0 27 + 21 28 #ifdef __ASSEMBLY__ 22 29 23 30 #ifdef CONFIG_AS_HAS_INSN ··· 35 28 36 29 .macro insn_i, opcode, func3, rd, rs1, simm12 37 30 .insn i \opcode, \func3, \rd, \rs1, \simm12 31 + .endm 32 + 33 + .macro insn_s, opcode, func3, rs2, simm12, rs1 34 + .insn s \opcode, \func3, \rs2, \simm12(\rs1) 38 35 .endm 39 36 40 37 #else ··· 62 51 (\simm12 << INSN_I_SIMM12_SHIFT)) 63 52 .endm 64 53 54 + .macro insn_s, opcode, func3, rs2, simm12, rs1 55 + .4byte ((\opcode << INSN_S_OPCODE_SHIFT) | \ 56 + (\func3 << INSN_S_FUNC3_SHIFT) | \ 57 + (.L__gpr_num_\rs2 << INSN_S_RS2_SHIFT) | \ 58 + (.L__gpr_num_\rs1 << INSN_S_RS1_SHIFT) | \ 59 + ((\simm12 & 0x1f) << INSN_S_SIMM5_SHIFT) | \ 60 + (((\simm12 >> 5) & 0x7f) << INSN_S_SIMM7_SHIFT)) 61 + .endm 62 + 65 63 #endif 66 64 67 65 #define __INSN_R(...) insn_r __VA_ARGS__ 68 66 #define __INSN_I(...) insn_i __VA_ARGS__ 67 + #define __INSN_S(...) insn_s __VA_ARGS__ 69 68 70 69 #else /* ! 
__ASSEMBLY__ */ 71 70 ··· 86 65 87 66 #define __INSN_I(opcode, func3, rd, rs1, simm12) \ 88 67 ".insn i " opcode ", " func3 ", " rd ", " rs1 ", " simm12 "\n" 68 + 69 + #define __INSN_S(opcode, func3, rs2, simm12, rs1) \ 70 + ".insn s " opcode ", " func3 ", " rs2 ", " simm12 "(" rs1 ")\n" 89 71 90 72 #else 91 73 ··· 116 92 " (\\simm12 << " __stringify(INSN_I_SIMM12_SHIFT) "))\n" \ 117 93 " .endm\n" 118 94 95 + #define DEFINE_INSN_S \ 96 + __DEFINE_ASM_GPR_NUMS \ 97 + " .macro insn_s, opcode, func3, rs2, simm12, rs1\n" \ 98 + " .4byte ((\\opcode << " __stringify(INSN_S_OPCODE_SHIFT) ") |" \ 99 + " (\\func3 << " __stringify(INSN_S_FUNC3_SHIFT) ") |" \ 100 + " (.L__gpr_num_\\rs2 << " __stringify(INSN_S_RS2_SHIFT) ") |" \ 101 + " (.L__gpr_num_\\rs1 << " __stringify(INSN_S_RS1_SHIFT) ") |" \ 102 + " ((\\simm12 & 0x1f) << " __stringify(INSN_S_SIMM5_SHIFT) ") |" \ 103 + " (((\\simm12 >> 5) & 0x7f) << " __stringify(INSN_S_SIMM7_SHIFT) "))\n" \ 104 + " .endm\n" 105 + 119 106 #define UNDEFINE_INSN_R \ 120 107 " .purgem insn_r\n" 121 108 122 109 #define UNDEFINE_INSN_I \ 123 110 " .purgem insn_i\n" 111 + 112 + #define UNDEFINE_INSN_S \ 113 + " .purgem insn_s\n" 124 114 125 115 #define __INSN_R(opcode, func3, func7, rd, rs1, rs2) \ 126 116 DEFINE_INSN_R \ ··· 145 107 DEFINE_INSN_I \ 146 108 "insn_i " opcode ", " func3 ", " rd ", " rs1 ", " simm12 "\n" \ 147 109 UNDEFINE_INSN_I 110 + 111 + #define __INSN_S(opcode, func3, rs2, simm12, rs1) \ 112 + DEFINE_INSN_S \ 113 + "insn_s " opcode ", " func3 ", " rs2 ", " simm12 ", " rs1 "\n" \ 114 + UNDEFINE_INSN_S 148 115 149 116 #endif 150 117 ··· 162 119 #define INSN_I(opcode, func3, rd, rs1, simm12) \ 163 120 __INSN_I(RV_##opcode, RV_##func3, RV_##rd, \ 164 121 RV_##rs1, RV_##simm12) 122 + 123 + #define INSN_S(opcode, func3, rs2, simm12, rs1) \ 124 + __INSN_S(RV_##opcode, RV_##func3, RV_##rs2, \ 125 + RV_##simm12, RV_##rs1) 165 126 166 127 #define RV_OPCODE(v) __ASM_STR(v) 167 128 #define RV_FUNC3(v) __ASM_STR(v) ··· 180 133 #define 
RV___RS2(v) __RV_REG(v) 181 134 182 135 #define RV_OPCODE_MISC_MEM RV_OPCODE(15) 136 + #define RV_OPCODE_OP_IMM RV_OPCODE(19) 183 137 #define RV_OPCODE_SYSTEM RV_OPCODE(115) 184 138 185 139 #define HFENCE_VVMA(vaddr, asid) \ ··· 244 196 INSN_I(OPCODE_MISC_MEM, FUNC3(2), __RD(0), \ 245 197 RS1(base), SIMM12(4)) 246 198 199 + #define PREFETCH_I(base, offset) \ 200 + INSN_S(OPCODE_OP_IMM, FUNC3(6), __RS2(0), \ 201 + SIMM12((offset) & 0xfe0), RS1(base)) 202 + 203 + #define PREFETCH_R(base, offset) \ 204 + INSN_S(OPCODE_OP_IMM, FUNC3(6), __RS2(1), \ 205 + SIMM12((offset) & 0xfe0), RS1(base)) 206 + 207 + #define PREFETCH_W(base, offset) \ 208 + INSN_S(OPCODE_OP_IMM, FUNC3(6), __RS2(3), \ 209 + SIMM12((offset) & 0xfe0), RS1(base)) 210 + 247 211 #define RISCV_PAUSE ".4byte 0x100000f" 248 212 #define ZAWRS_WRS_NTO ".4byte 0x00d00073" 249 213 #define ZAWRS_WRS_STO ".4byte 0x01d00073" 250 214 #define RISCV_NOP4 ".4byte 0x00000013" 251 215 252 216 #define RISCV_INSN_NOP4 _AC(0x00000013, U) 217 + 218 + #ifndef __ASSEMBLY__ 219 + #define nop() __asm__ __volatile__ ("nop") 220 + #define __nops(n) ".rept " #n "\nnop\n.endr\n" 221 + #define nops(n) __asm__ __volatile__ (__nops(n)) 222 + #endif 253 223 254 224 #endif /* __ASM_INSN_DEF_H */
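The fallback `insn_s` macro builds an S-type instruction word by hand when the assembler lacks `.insn` support, and `PREFETCH_W` uses it with opcode OP-IMM (19), funct3 6 and the value 3 in the rs2 field. The sketch below reproduces that bit packing in C so the resulting word can be inspected. The shift values and the `& 0xfe0` offset masking are taken from the diff, while `encode_insn_s()` and `encode_prefetch_w()` are hypothetical helper names.

```c
#include <assert.h>
#include <stdint.h>

/* S-type field positions, as defined in the diff. */
#define INSN_S_SIMM7_SHIFT  25
#define INSN_S_RS2_SHIFT    20
#define INSN_S_RS1_SHIFT    15
#define INSN_S_FUNC3_SHIFT  12
#define INSN_S_SIMM5_SHIFT  7
#define INSN_S_OPCODE_SHIFT 0

/* Pack an S-type word: the 12-bit immediate is split into 5 low and 7 high bits. */
static uint32_t encode_insn_s(uint32_t opcode, uint32_t func3, uint32_t rs2,
			      uint32_t simm12, uint32_t rs1)
{
	assert(rs1 < 32 && rs2 < 32);
	return (opcode << INSN_S_OPCODE_SHIFT) |
	       (func3 << INSN_S_FUNC3_SHIFT) |
	       (rs2 << INSN_S_RS2_SHIFT) |
	       (rs1 << INSN_S_RS1_SHIFT) |
	       ((simm12 & 0x1f) << INSN_S_SIMM5_SHIFT) |
	       (((simm12 >> 5) & 0x7f) << INSN_S_SIMM7_SHIFT);
}

/*
 * prefetch.w offset(base): opcode 19 (OP-IMM), funct3 6, rs2 field = 3.
 * The offset is masked with 0xfe0, so its low five bits are always zero.
 */
static uint32_t encode_prefetch_w(uint32_t base_reg, uint32_t offset)
{
	return encode_insn_s(19, 6, 3, offset & 0xfe0, base_reg);
}
```

For base register a0 (x10) and offset 0, this yields 0x00356013. Because the low immediate bits are zero, the destination-register position stays x0.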
+6
arch/riscv/include/asm/kexec.h
@@ -56,6 +56,7 @@
 
 #ifdef CONFIG_KEXEC_FILE
 extern const struct kexec_file_ops elf_kexec_ops;
+extern const struct kexec_file_ops image_kexec_ops;
 
 struct purgatory_info;
 int arch_kexec_apply_relocations_add(struct purgatory_info *pi,
@@ -67,6 +68,11 @@
 struct kimage;
 int arch_kimage_file_post_load_cleanup(struct kimage *image);
 #define arch_kimage_file_post_load_cleanup arch_kimage_file_post_load_cleanup
+
+int load_extra_segments(struct kimage *image, unsigned long kernel_start,
+			unsigned long kernel_len, char *initrd,
+			unsigned long initrd_len, char *cmdline,
+			unsigned long cmdline_len);
 #endif
 
 #endif
+3 -2
arch/riscv/include/asm/pgtable-64.h
@@ -184,7 +184,7 @@
 
 static inline int pud_bad(pud_t pud)
 {
-	return !pud_present(pud);
+	return !pud_present(pud) || (pud_val(pud) & _PAGE_LEAF);
 }
 
 #define pud_leaf pud_leaf
@@ -399,6 +399,7 @@
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 static inline int pte_devmap(pte_t pte);
 static inline pte_t pmd_pte(pmd_t pmd);
+static inline pte_t pud_pte(pud_t pud);
 
 static inline int pmd_devmap(pmd_t pmd)
 {
@@ -407,7 +408,7 @@
 
 static inline int pud_devmap(pud_t pud)
 {
-	return 0;
+	return pte_devmap(pud_pte(pud));
 }
 
 static inline int pgd_devmap(pgd_t pgd)
+97
arch/riscv/include/asm/pgtable.h
··· 900 900 #define pmdp_collapse_flush pmdp_collapse_flush 901 901 extern pmd_t pmdp_collapse_flush(struct vm_area_struct *vma, 902 902 unsigned long address, pmd_t *pmdp); 903 + 904 + static inline pud_t pud_wrprotect(pud_t pud) 905 + { 906 + return pte_pud(pte_wrprotect(pud_pte(pud))); 907 + } 908 + 909 + static inline int pud_trans_huge(pud_t pud) 910 + { 911 + return pud_leaf(pud); 912 + } 913 + 914 + static inline int pud_dirty(pud_t pud) 915 + { 916 + return pte_dirty(pud_pte(pud)); 917 + } 918 + 919 + static inline pud_t pud_mkyoung(pud_t pud) 920 + { 921 + return pte_pud(pte_mkyoung(pud_pte(pud))); 922 + } 923 + 924 + static inline pud_t pud_mkold(pud_t pud) 925 + { 926 + return pte_pud(pte_mkold(pud_pte(pud))); 927 + } 928 + 929 + static inline pud_t pud_mkdirty(pud_t pud) 930 + { 931 + return pte_pud(pte_mkdirty(pud_pte(pud))); 932 + } 933 + 934 + static inline pud_t pud_mkclean(pud_t pud) 935 + { 936 + return pte_pud(pte_mkclean(pud_pte(pud))); 937 + } 938 + 939 + static inline pud_t pud_mkwrite(pud_t pud) 940 + { 941 + return pte_pud(pte_mkwrite_novma(pud_pte(pud))); 942 + } 943 + 944 + static inline pud_t pud_mkhuge(pud_t pud) 945 + { 946 + return pud; 947 + } 948 + 949 + static inline pud_t pud_mkdevmap(pud_t pud) 950 + { 951 + return pte_pud(pte_mkdevmap(pud_pte(pud))); 952 + } 953 + 954 + static inline int pudp_set_access_flags(struct vm_area_struct *vma, 955 + unsigned long address, pud_t *pudp, 956 + pud_t entry, int dirty) 957 + { 958 + return ptep_set_access_flags(vma, address, (pte_t *)pudp, pud_pte(entry), dirty); 959 + } 960 + 961 + static inline int pudp_test_and_clear_young(struct vm_area_struct *vma, 962 + unsigned long address, pud_t *pudp) 963 + { 964 + return ptep_test_and_clear_young(vma, address, (pte_t *)pudp); 965 + } 966 + 967 + static inline int pud_young(pud_t pud) 968 + { 969 + return pte_young(pud_pte(pud)); 970 + } 971 + 972 + static inline void update_mmu_cache_pud(struct vm_area_struct *vma, 973 + unsigned long address, 
pud_t *pudp) 974 + { 975 + pte_t *ptep = (pte_t *)pudp; 976 + 977 + update_mmu_cache(vma, address, ptep); 978 + } 979 + 980 + static inline pud_t pudp_establish(struct vm_area_struct *vma, 981 + unsigned long address, pud_t *pudp, pud_t pud) 982 + { 983 + page_table_check_pud_set(vma->vm_mm, pudp, pud); 984 + return __pud(atomic_long_xchg((atomic_long_t *)pudp, pud_val(pud))); 985 + } 986 + 987 + static inline pud_t pud_mkinvalid(pud_t pud) 988 + { 989 + return __pud(pud_val(pud) & ~(_PAGE_PRESENT | _PAGE_PROT_NONE)); 990 + } 991 + 992 + extern pud_t pudp_invalidate(struct vm_area_struct *vma, unsigned long address, 993 + pud_t *pudp); 994 + 995 + static inline pud_t pud_modify(pud_t pud, pgprot_t newprot) 996 + { 997 + return pte_pud(pte_modify(pud_pte(pud), newprot)); 998 + } 999 + 903 1000 #endif /* CONFIG_TRANSPARENT_HUGEPAGE */ 904 1001 905 1002 /*
+30 -1
arch/riscv/include/asm/processor.h
··· 13 13 #include <vdso/processor.h> 14 14 15 15 #include <asm/ptrace.h> 16 + #include <asm/insn-def.h> 17 + #include <asm/alternative-macros.h> 18 + #include <asm/hwcap.h> 16 19 17 20 #define arch_get_mmap_end(addr, len, flags) \ 18 21 ({ \ ··· 55 52 #endif 56 53 57 54 #ifndef __ASSEMBLY__ 58 - #include <linux/cpumask.h> 59 55 60 56 struct task_struct; 61 57 struct pt_regs; ··· 81 79 * Thus, the task does not own preempt_v. Any use of Vector will have to 82 80 * save preempt_v, if dirty, and fallback to non-preemptible kernel-mode 83 81 * Vector. 82 + * - bit 29: The thread voluntarily calls schedule() while holding an active 83 + * preempt_v. All preempt_v context should be dropped in such case because 84 + * V-regs are caller-saved. Only sstatus.VS=ON is persisted across a 85 + * schedule() call. 84 86 * - bit 30: The in-kernel preempt_v context is saved, and requries to be 85 87 * restored when returning to the context that owns the preempt_v. 86 88 * - bit 31: The in-kernel preempt_v context is dirty, as signaled by the ··· 99 93 #define RISCV_PREEMPT_V 0x00000100 100 94 #define RISCV_PREEMPT_V_DIRTY 0x80000000 101 95 #define RISCV_PREEMPT_V_NEED_RESTORE 0x40000000 96 + #define RISCV_PREEMPT_V_IN_SCHEDULE 0x20000000 102 97 103 98 /* CPU-specific state of a task */ 104 99 struct thread_struct { ··· 110 103 struct __riscv_d_ext_state fstate; 111 104 unsigned long bad_cause; 112 105 unsigned long envcfg; 106 + unsigned long sum; 113 107 u32 riscv_v_flags; 114 108 u32 vstate_ctrl; 115 109 struct __riscv_v_ext_state vstate; ··· 144 136 #define KSTK_EIP(tsk) (task_pt_regs(tsk)->epc) 145 137 #define KSTK_ESP(tsk) (task_pt_regs(tsk)->sp) 146 138 139 + #define PREFETCH_ASM(x) \ 140 + ALTERNATIVE(__nops(1), PREFETCH_R(x, 0), 0, \ 141 + RISCV_ISA_EXT_ZICBOP, CONFIG_RISCV_ISA_ZICBOP) 142 + 143 + #define PREFETCHW_ASM(x) \ 144 + ALTERNATIVE(__nops(1), PREFETCH_W(x, 0), 0, \ 145 + RISCV_ISA_EXT_ZICBOP, CONFIG_RISCV_ISA_ZICBOP) 146 + 147 + #ifdef CONFIG_RISCV_ISA_ZICBOP 148 
+ #define ARCH_HAS_PREFETCH 149 + static inline void prefetch(const void *x) 150 + { 151 + __asm__ __volatile__(PREFETCH_ASM(%0) : : "r" (x) : "memory"); 152 + } 153 + 154 + #define ARCH_HAS_PREFETCHW 155 + static inline void prefetchw(const void *x) 156 + { 157 + __asm__ __volatile__(PREFETCHW_ASM(%0) : : "r" (x) : "memory"); 158 + } 159 + #endif /* CONFIG_RISCV_ISA_ZICBOP */ 147 160 148 161 /* Do necessary setup to start up a newly executed thread. */ 149 162 extern void start_thread(struct pt_regs *regs,
+1 -1
arch/riscv/include/asm/ptrace.h
···
 	return 0;
 }

-static inline int regs_irqs_disabled(struct pt_regs *regs)
+static __always_inline bool regs_irqs_disabled(struct pt_regs *regs)
 {
 	return !(regs->status & SR_PIE);
 }
+60
arch/riscv/include/asm/sbi.h
··· 35 35 SBI_EXT_DBCN = 0x4442434E, 36 36 SBI_EXT_STA = 0x535441, 37 37 SBI_EXT_NACL = 0x4E41434C, 38 + SBI_EXT_FWFT = 0x46574654, 38 39 39 40 /* Experimentals extensions must lie within this range */ 40 41 SBI_EXT_EXPERIMENTAL_START = 0x08000000, ··· 403 402 #define SBI_NACL_SHMEM_SRET_X(__i) ((__riscv_xlen / 8) * (__i)) 404 403 #define SBI_NACL_SHMEM_SRET_X_LAST 31 405 404 405 + /* SBI function IDs for FW feature extension */ 406 + #define SBI_EXT_FWFT_SET 0x0 407 + #define SBI_EXT_FWFT_GET 0x1 408 + 409 + enum sbi_fwft_feature_t { 410 + SBI_FWFT_MISALIGNED_EXC_DELEG = 0x0, 411 + SBI_FWFT_LANDING_PAD = 0x1, 412 + SBI_FWFT_SHADOW_STACK = 0x2, 413 + SBI_FWFT_DOUBLE_TRAP = 0x3, 414 + SBI_FWFT_PTE_AD_HW_UPDATING = 0x4, 415 + SBI_FWFT_POINTER_MASKING_PMLEN = 0x5, 416 + SBI_FWFT_LOCAL_RESERVED_START = 0x6, 417 + SBI_FWFT_LOCAL_RESERVED_END = 0x3fffffff, 418 + SBI_FWFT_LOCAL_PLATFORM_START = 0x40000000, 419 + SBI_FWFT_LOCAL_PLATFORM_END = 0x7fffffff, 420 + 421 + SBI_FWFT_GLOBAL_RESERVED_START = 0x80000000, 422 + SBI_FWFT_GLOBAL_RESERVED_END = 0xbfffffff, 423 + SBI_FWFT_GLOBAL_PLATFORM_START = 0xc0000000, 424 + SBI_FWFT_GLOBAL_PLATFORM_END = 0xffffffff, 425 + }; 426 + 427 + #define SBI_FWFT_PLATFORM_FEATURE_BIT BIT(30) 428 + #define SBI_FWFT_GLOBAL_FEATURE_BIT BIT(31) 429 + 430 + #define SBI_FWFT_SET_FLAG_LOCK BIT(0) 431 + 406 432 /* SBI spec version fields */ 407 433 #define SBI_SPEC_VERSION_DEFAULT 0x1 408 434 #define SBI_SPEC_VERSION_MAJOR_SHIFT 24 ··· 447 419 #define SBI_ERR_ALREADY_STARTED -7 448 420 #define SBI_ERR_ALREADY_STOPPED -8 449 421 #define SBI_ERR_NO_SHMEM -9 422 + #define SBI_ERR_INVALID_STATE -10 423 + #define SBI_ERR_BAD_RANGE -11 424 + #define SBI_ERR_TIMEOUT -12 425 + #define SBI_ERR_IO -13 426 + #define SBI_ERR_DENIED_LOCKED -14 450 427 451 428 extern unsigned long sbi_spec_version; 452 429 struct sbiret { ··· 503 470 unsigned long asid); 504 471 long sbi_probe_extension(int ext); 505 472 473 + int sbi_fwft_set(u32 feature, unsigned long value, 
unsigned long flags); 474 + int sbi_fwft_set_cpumask(const cpumask_t *mask, u32 feature, 475 + unsigned long value, unsigned long flags); 476 + /** 477 + * sbi_fwft_set_online_cpus() - Set a feature on all online cpus 478 + * @feature: The feature to be set 479 + * @value: The feature value to be set 480 + * @flags: FWFT feature set flags 481 + * 482 + * Return: 0 on success, appropriate linux error code otherwise. 483 + */ 484 + static inline int sbi_fwft_set_online_cpus(u32 feature, unsigned long value, 485 + unsigned long flags) 486 + { 487 + return sbi_fwft_set_cpumask(cpu_online_mask, feature, value, flags); 488 + } 489 + 506 490 /* Check if current SBI specification version is 0.1 or not */ 507 491 static inline int sbi_spec_is_0_1(void) 508 492 { ··· 553 503 case SBI_SUCCESS: 554 504 return 0; 555 505 case SBI_ERR_DENIED: 506 + case SBI_ERR_DENIED_LOCKED: 556 507 return -EPERM; 557 508 case SBI_ERR_INVALID_PARAM: 509 + case SBI_ERR_INVALID_STATE: 558 510 return -EINVAL; 511 + case SBI_ERR_BAD_RANGE: 512 + return -ERANGE; 559 513 case SBI_ERR_INVALID_ADDRESS: 560 514 return -EFAULT; 515 + case SBI_ERR_NO_SHMEM: 516 + return -ENOMEM; 517 + case SBI_ERR_TIMEOUT: 518 + return -ETIMEDOUT; 519 + case SBI_ERR_IO: 520 + return -EIO; 561 521 case SBI_ERR_NOT_SUPPORTED: 562 522 case SBI_ERR_FAILURE: 563 523 default:
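The ranges in `enum sbi_fwft_feature_t` line up with the two mask macros in the same hunk: bit 31 separates global features from local ones, and bit 30 separates platform-specific features from spec-defined ones. A minimal user-space sketch of that classification, using only the constants shown in the hunk (the `FWFT_*` macro names and `fwft_feature_is_*` helpers are hypothetical and not part of the kernel):

```c
#include <stdbool.h>
#include <stdint.h>

/* Mirror SBI_FWFT_PLATFORM_FEATURE_BIT and SBI_FWFT_GLOBAL_FEATURE_BIT. */
#define FWFT_PLATFORM_FEATURE_BIT (UINT32_C(1) << 30)
#define FWFT_GLOBAL_FEATURE_BIT   (UINT32_C(1) << 31)

/* Hypothetical helper: true for IDs in 0x80000000..0xffffffff. */
static bool fwft_feature_is_global(uint32_t feature)
{
	return (feature & FWFT_GLOBAL_FEATURE_BIT) != 0;
}

/* Hypothetical helper: true for IDs in the *_PLATFORM_START..*_PLATFORM_END ranges. */
static bool fwft_feature_is_platform(uint32_t feature)
{
	return (feature & FWFT_PLATFORM_FEATURE_BIT) != 0;
}
```

For example, `SBI_FWFT_MISALIGNED_EXC_DELEG` (0x0) is local and spec-defined, while any ID at or above `SBI_FWFT_GLOBAL_PLATFORM_START` (0xc0000000) has both bits set.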
+2
arch/riscv/include/asm/tlbflush.h
···
 #define __HAVE_ARCH_FLUSH_PMD_TLB_RANGE
 void flush_pmd_tlb_range(struct vm_area_struct *vma, unsigned long start,
 			 unsigned long end);
+void flush_pud_tlb_range(struct vm_area_struct *vma, unsigned long start,
+			 unsigned long end);
 #endif

 bool arch_tlbbatch_should_defer(struct mm_struct *mm);
+163 -55
arch/riscv/include/asm/uaccess.h
··· 62 62 __asm__ __volatile__ ("csrc sstatus, %0" : : "r" (SR_SUM) : "memory") 63 63 64 64 /* 65 + * This is the smallest unsigned integer type that can fit a value 66 + * (up to 'long long') 67 + */ 68 + #define __inttype(x) __typeof__( \ 69 + __typefits(x, char, \ 70 + __typefits(x, short, \ 71 + __typefits(x, int, \ 72 + __typefits(x, long, 0ULL))))) 73 + 74 + #define __typefits(x, type, not) \ 75 + __builtin_choose_expr(sizeof(x) <= sizeof(type), (unsigned type)0, not) 76 + 77 + /* 65 78 * The exception table consists of pairs of addresses: the first is the 66 79 * address of an instruction that is allowed to fault, and the second is 67 80 * the address at which the program should continue. No registers are ··· 96 83 * call. 97 84 */ 98 85 99 - #define __get_user_asm(insn, x, ptr, err) \ 86 + #ifdef CONFIG_CC_HAS_ASM_GOTO_OUTPUT 87 + #define __get_user_asm(insn, x, ptr, label) \ 88 + asm_goto_output( \ 89 + "1:\n" \ 90 + " " insn " %0, %1\n" \ 91 + _ASM_EXTABLE_UACCESS_ERR(1b, %l2, %0) \ 92 + : "=&r" (x) \ 93 + : "m" (*(ptr)) : : label) 94 + #else /* !CONFIG_CC_HAS_ASM_GOTO_OUTPUT */ 95 + #define __get_user_asm(insn, x, ptr, label) \ 100 96 do { \ 101 - __typeof__(x) __x; \ 97 + long __gua_err = 0; \ 102 98 __asm__ __volatile__ ( \ 103 99 "1:\n" \ 104 100 " " insn " %1, %2\n" \ 105 101 "2:\n" \ 106 102 _ASM_EXTABLE_UACCESS_ERR_ZERO(1b, 2b, %0, %1) \ 107 - : "+r" (err), "=&r" (__x) \ 103 + : "+r" (__gua_err), "=&r" (x) \ 108 104 : "m" (*(ptr))); \ 109 - (x) = __x; \ 105 + if (__gua_err) \ 106 + goto label; \ 110 107 } while (0) 108 + #endif /* CONFIG_CC_HAS_ASM_GOTO_OUTPUT */ 111 109 112 110 #ifdef CONFIG_64BIT 113 - #define __get_user_8(x, ptr, err) \ 114 - __get_user_asm("ld", x, ptr, err) 111 + #define __get_user_8(x, ptr, label) \ 112 + __get_user_asm("ld", x, ptr, label) 115 113 #else /* !CONFIG_64BIT */ 116 - #define __get_user_8(x, ptr, err) \ 114 + 115 + #ifdef CONFIG_CC_HAS_ASM_GOTO_OUTPUT 116 + #define __get_user_8(x, ptr, label) \ 117 + u32 __user 
*__ptr = (u32 __user *)(ptr); \ 118 + u32 __lo, __hi; \ 119 + asm_goto_output( \ 120 + "1:\n" \ 121 + " lw %0, %2\n" \ 122 + "2:\n" \ 123 + " lw %1, %3\n" \ 124 + _ASM_EXTABLE_UACCESS_ERR(1b, %l4, %0) \ 125 + _ASM_EXTABLE_UACCESS_ERR(2b, %l4, %0) \ 126 + : "=&r" (__lo), "=r" (__hi) \ 127 + : "m" (__ptr[__LSW]), "m" (__ptr[__MSW]) \ 128 + : : label); \ 129 + (x) = (__typeof__(x))((__typeof__((x) - (x)))( \ 130 + (((u64)__hi << 32) | __lo))); \ 131 + 132 + #else /* !CONFIG_CC_HAS_ASM_GOTO_OUTPUT */ 133 + #define __get_user_8(x, ptr, label) \ 117 134 do { \ 118 135 u32 __user *__ptr = (u32 __user *)(ptr); \ 119 136 u32 __lo, __hi; \ 137 + long __gu8_err = 0; \ 120 138 __asm__ __volatile__ ( \ 121 139 "1:\n" \ 122 140 " lw %1, %3\n" \ ··· 156 112 "3:\n" \ 157 113 _ASM_EXTABLE_UACCESS_ERR_ZERO(1b, 3b, %0, %1) \ 158 114 _ASM_EXTABLE_UACCESS_ERR_ZERO(2b, 3b, %0, %1) \ 159 - : "+r" (err), "=&r" (__lo), "=r" (__hi) \ 115 + : "+r" (__gu8_err), "=&r" (__lo), "=r" (__hi) \ 160 116 : "m" (__ptr[__LSW]), "m" (__ptr[__MSW])); \ 161 - if (err) \ 117 + if (__gu8_err) { \ 162 118 __hi = 0; \ 163 - (x) = (__typeof__(x))((__typeof__((x)-(x)))( \ 119 + goto label; \ 120 + } \ 121 + (x) = (__typeof__(x))((__typeof__((x) - (x)))( \ 164 122 (((u64)__hi << 32) | __lo))); \ 165 123 } while (0) 124 + #endif /* CONFIG_CC_HAS_ASM_GOTO_OUTPUT */ 125 + 166 126 #endif /* CONFIG_64BIT */ 167 127 168 - #define __get_user_nocheck(x, __gu_ptr, __gu_err) \ 128 + unsigned long __must_check __asm_copy_to_user_sum_enabled(void __user *to, 129 + const void *from, unsigned long n); 130 + unsigned long __must_check __asm_copy_from_user_sum_enabled(void *to, 131 + const void __user *from, unsigned long n); 132 + 133 + #define __get_user_nocheck(x, __gu_ptr, label) \ 169 134 do { \ 135 + if (!IS_ENABLED(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS) && \ 136 + !IS_ALIGNED((uintptr_t)__gu_ptr, sizeof(*__gu_ptr))) { \ 137 + if (__asm_copy_from_user_sum_enabled(&(x), __gu_ptr, sizeof(*__gu_ptr))) \ 138 + goto label; \ 
139 + break; \ 140 + } \ 170 141 switch (sizeof(*__gu_ptr)) { \ 171 142 case 1: \ 172 - __get_user_asm("lb", (x), __gu_ptr, __gu_err); \ 143 + __get_user_asm("lb", (x), __gu_ptr, label); \ 173 144 break; \ 174 145 case 2: \ 175 - __get_user_asm("lh", (x), __gu_ptr, __gu_err); \ 146 + __get_user_asm("lh", (x), __gu_ptr, label); \ 176 147 break; \ 177 148 case 4: \ 178 - __get_user_asm("lw", (x), __gu_ptr, __gu_err); \ 149 + __get_user_asm("lw", (x), __gu_ptr, label); \ 179 150 break; \ 180 151 case 8: \ 181 - __get_user_8((x), __gu_ptr, __gu_err); \ 152 + __get_user_8((x), __gu_ptr, label); \ 182 153 break; \ 183 154 default: \ 184 155 BUILD_BUG(); \ 185 156 } \ 157 + } while (0) 158 + 159 + #define __get_user_error(x, ptr, err) \ 160 + do { \ 161 + __label__ __gu_failed; \ 162 + \ 163 + __get_user_nocheck(x, ptr, __gu_failed); \ 164 + err = 0; \ 165 + break; \ 166 + __gu_failed: \ 167 + x = 0; \ 168 + err = -EFAULT; \ 186 169 } while (0) 187 170 188 171 /** ··· 236 165 ({ \ 237 166 const __typeof__(*(ptr)) __user *__gu_ptr = untagged_addr(ptr); \ 238 167 long __gu_err = 0; \ 168 + __typeof__(x) __gu_val; \ 239 169 \ 240 170 __chk_user_ptr(__gu_ptr); \ 241 171 \ 242 172 __enable_user_access(); \ 243 - __get_user_nocheck(x, __gu_ptr, __gu_err); \ 173 + __get_user_error(__gu_val, __gu_ptr, __gu_err); \ 244 174 __disable_user_access(); \ 175 + \ 176 + (x) = __gu_val; \ 245 177 \ 246 178 __gu_err; \ 247 179 }) ··· 275 201 ((x) = (__force __typeof__(x))0, -EFAULT); \ 276 202 }) 277 203 278 - #define __put_user_asm(insn, x, ptr, err) \ 204 + #define __put_user_asm(insn, x, ptr, label) \ 279 205 do { \ 280 206 __typeof__(*(ptr)) __x = x; \ 281 - __asm__ __volatile__ ( \ 207 + asm goto( \ 282 208 "1:\n" \ 283 - " " insn " %z2, %1\n" \ 284 - "2:\n" \ 285 - _ASM_EXTABLE_UACCESS_ERR(1b, 2b, %0) \ 286 - : "+r" (err), "=m" (*(ptr)) \ 287 - : "rJ" (__x)); \ 209 + " " insn " %z0, %1\n" \ 210 + _ASM_EXTABLE(1b, %l2) \ 211 + : : "rJ" (__x), "m"(*(ptr)) : : label); \ 288 212 } while 
(0) 289 213 290 214 #ifdef CONFIG_64BIT 291 - #define __put_user_8(x, ptr, err) \ 292 - __put_user_asm("sd", x, ptr, err) 215 + #define __put_user_8(x, ptr, label) \ 216 + __put_user_asm("sd", x, ptr, label) 293 217 #else /* !CONFIG_64BIT */ 294 - #define __put_user_8(x, ptr, err) \ 218 + #define __put_user_8(x, ptr, label) \ 295 219 do { \ 296 220 u32 __user *__ptr = (u32 __user *)(ptr); \ 297 221 u64 __x = (__typeof__((x)-(x)))(x); \ 298 - __asm__ __volatile__ ( \ 222 + asm goto( \ 299 223 "1:\n" \ 300 - " sw %z3, %1\n" \ 224 + " sw %z0, %2\n" \ 301 225 "2:\n" \ 302 - " sw %z4, %2\n" \ 303 - "3:\n" \ 304 - _ASM_EXTABLE_UACCESS_ERR(1b, 3b, %0) \ 305 - _ASM_EXTABLE_UACCESS_ERR(2b, 3b, %0) \ 306 - : "+r" (err), \ 307 - "=m" (__ptr[__LSW]), \ 308 - "=m" (__ptr[__MSW]) \ 309 - : "rJ" (__x), "rJ" (__x >> 32)); \ 226 + " sw %z1, %3\n" \ 227 + _ASM_EXTABLE(1b, %l4) \ 228 + _ASM_EXTABLE(2b, %l4) \ 229 + : : "rJ" (__x), "rJ" (__x >> 32), \ 230 + "m" (__ptr[__LSW]), \ 231 + "m" (__ptr[__MSW]) : : label); \ 310 232 } while (0) 311 233 #endif /* CONFIG_64BIT */ 312 234 313 - #define __put_user_nocheck(x, __gu_ptr, __pu_err) \ 235 + #define __put_user_nocheck(x, __gu_ptr, label) \ 314 236 do { \ 237 + if (!IS_ENABLED(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS) && \ 238 + !IS_ALIGNED((uintptr_t)__gu_ptr, sizeof(*__gu_ptr))) { \ 239 + __inttype(x) val = (__inttype(x))x; \ 240 + if (__asm_copy_to_user_sum_enabled(__gu_ptr, &(val), sizeof(*__gu_ptr))) \ 241 + goto label; \ 242 + break; \ 243 + } \ 315 244 switch (sizeof(*__gu_ptr)) { \ 316 245 case 1: \ 317 - __put_user_asm("sb", (x), __gu_ptr, __pu_err); \ 246 + __put_user_asm("sb", (x), __gu_ptr, label); \ 318 247 break; \ 319 248 case 2: \ 320 - __put_user_asm("sh", (x), __gu_ptr, __pu_err); \ 249 + __put_user_asm("sh", (x), __gu_ptr, label); \ 321 250 break; \ 322 251 case 4: \ 323 - __put_user_asm("sw", (x), __gu_ptr, __pu_err); \ 252 + __put_user_asm("sw", (x), __gu_ptr, label); \ 324 253 break; \ 325 254 case 8: \ 326 - 
__put_user_8((x), __gu_ptr, __pu_err); \ 255 + __put_user_8((x), __gu_ptr, label); \ 327 256 break; \ 328 257 default: \ 329 258 BUILD_BUG(); \ 330 259 } \ 260 + } while (0) 261 + 262 + #define __put_user_error(x, ptr, err) \ 263 + do { \ 264 + __label__ err_label; \ 265 + __put_user_nocheck(x, ptr, err_label); \ 266 + break; \ 267 + err_label: \ 268 + (err) = -EFAULT; \ 331 269 } while (0) 332 270 333 271 /** ··· 372 286 __chk_user_ptr(__gu_ptr); \ 373 287 \ 374 288 __enable_user_access(); \ 375 - __put_user_nocheck(__val, __gu_ptr, __pu_err); \ 289 + __put_user_error(__val, __gu_ptr, __pu_err); \ 376 290 __disable_user_access(); \ 377 291 \ 378 292 __pu_err; \ ··· 437 351 } 438 352 439 353 #define __get_kernel_nofault(dst, src, type, err_label) \ 440 - do { \ 441 - long __kr_err = 0; \ 442 - \ 443 - __get_user_nocheck(*((type *)(dst)), (type *)(src), __kr_err); \ 444 - if (unlikely(__kr_err)) \ 445 - goto err_label; \ 446 - } while (0) 354 + __get_user_nocheck(*((type *)(dst)), (type *)(src), err_label) 447 355 448 356 #define __put_kernel_nofault(dst, src, type, err_label) \ 449 - do { \ 450 - long __kr_err = 0; \ 451 - \ 452 - __put_user_nocheck(*((type *)(src)), (type *)(dst), __kr_err); \ 453 - if (unlikely(__kr_err)) \ 454 - goto err_label; \ 357 + __put_user_nocheck(*((type *)(src)), (type *)(dst), err_label) 358 + 359 + static __must_check __always_inline bool user_access_begin(const void __user *ptr, size_t len) 360 + { 361 + if (unlikely(!access_ok(ptr, len))) 362 + return 0; 363 + __enable_user_access(); 364 + return 1; 365 + } 366 + #define user_access_begin user_access_begin 367 + #define user_access_end __disable_user_access 368 + 369 + static inline unsigned long user_access_save(void) { return 0UL; } 370 + static inline void user_access_restore(unsigned long enabled) { } 371 + 372 + /* 373 + * We want the unsafe accessors to always be inlined and use 374 + * the error labels - thus the macro games. 
375 + */ 376 + #define unsafe_put_user(x, ptr, label) \ 377 + __put_user_nocheck(x, (ptr), label) 378 + 379 + #define unsafe_get_user(x, ptr, label) do { \ 380 + __inttype(*(ptr)) __gu_val; \ 381 + __get_user_nocheck(__gu_val, (ptr), label); \ 382 + (x) = (__force __typeof__(*(ptr)))__gu_val; \ 455 383 } while (0) 384 + 385 + #define unsafe_copy_to_user(_dst, _src, _len, label) \ 386 + if (__asm_copy_to_user_sum_enabled(_dst, _src, _len)) \ 387 + goto label; 388 + 389 + #define unsafe_copy_from_user(_dst, _src, _len, label) \ 390 + if (__asm_copy_from_user_sum_enabled(_dst, _src, _len)) \ 391 + goto label; 456 392 457 393 #else /* CONFIG_MMU */ 458 394 #include <asm-generic/uaccess.h>
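The `__inttype()` / `__typefits()` pair added at the top of this file picks the smallest unsigned integer type that can hold a value, which `unsafe_get_user()` and the misaligned `__put_user_nocheck()` path then use as a temporary. A minimal user-space sketch of the same trick, assuming a GCC- or Clang-compatible compiler (it relies on `__builtin_choose_expr` and `__typeof__`); the macros are renamed without the leading underscores and `zero_extend_char()` is a hypothetical helper for illustration:

```c
/* Select (unsigned type)0 if x fits in 'type', else fall through to 'not'. */
#define typefits(x, type, not) \
	__builtin_choose_expr(sizeof(x) <= sizeof(type), (unsigned type)0, not)

/* Smallest unsigned integer type that can hold x (up to 'long long'). */
#define inttype(x) __typeof__( \
	typefits(x, char, \
	  typefits(x, short, \
	    typefits(x, int, \
	      typefits(x, long, 0ULL)))))

/*
 * Hypothetical helper: because inttype(v) is 'unsigned char' here, the
 * cast zero-extends instead of sign-extending the value.
 */
static unsigned long long zero_extend_char(signed char v)
{
	return (inttype(v))v;
}
```

With this definition, `zero_extend_char(-1)` yields 255 rather than an all-ones 64-bit value, which is the property that makes the type safe to use as a raw temporary for user data.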
+30
arch/riscv/include/asm/vdso/getrandom.h
···
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2025 Xi Ruoyao <xry111@xry111.site>. All Rights Reserved.
+ */
+#ifndef __ASM_VDSO_GETRANDOM_H
+#define __ASM_VDSO_GETRANDOM_H
+
+#ifndef __ASSEMBLY__
+
+#include <asm/unistd.h>
+
+static __always_inline ssize_t getrandom_syscall(void *_buffer, size_t _len, unsigned int _flags)
+{
+	register long ret asm("a0");
+	register long nr asm("a7") = __NR_getrandom;
+	register void *buffer asm("a0") = _buffer;
+	register size_t len asm("a1") = _len;
+	register unsigned int flags asm("a2") = _flags;
+
+	asm volatile ("ecall\n"
+		      : "+r" (ret)
+		      : "r" (nr), "r" (buffer), "r" (len), "r" (flags)
+		      : "memory");
+
+	return ret;
+}
+
+#endif /* !__ASSEMBLY__ */
+
+#endif /* __ASM_VDSO_GETRANDOM_H */
+19 -3
arch/riscv/include/asm/vector.h
···
 	csr_clear(CSR_SSTATUS, SR_VS);
 }

+static __always_inline bool riscv_v_is_on(void)
+{
+	return !!(csr_read(CSR_SSTATUS) & SR_VS);
+}
+
 static __always_inline void __vstate_csr_save(struct __riscv_v_ext_state *dest)
 {
 	asm volatile (
···
 	struct pt_regs *regs;

 	if (riscv_preempt_v_started(prev)) {
+		if (riscv_v_is_on()) {
+			WARN_ON(prev->thread.riscv_v_flags & RISCV_V_CTX_DEPTH_MASK);
+			riscv_v_disable();
+			prev->thread.riscv_v_flags |= RISCV_PREEMPT_V_IN_SCHEDULE;
+		}
 		if (riscv_preempt_v_dirty(prev)) {
 			__riscv_v_vstate_save(&prev->thread.kernel_vstate,
 					      prev->thread.kernel_vstate.datap);
···
 		riscv_v_vstate_save(&prev->thread.vstate, regs);
 	}

-	if (riscv_preempt_v_started(next))
-		riscv_preempt_v_set_restore(next);
-	else
+	if (riscv_preempt_v_started(next)) {
+		if (next->thread.riscv_v_flags & RISCV_PREEMPT_V_IN_SCHEDULE) {
+			next->thread.riscv_v_flags &= ~RISCV_PREEMPT_V_IN_SCHEDULE;
+			riscv_v_enable();
+		} else {
+			riscv_preempt_v_set_restore(next);
+		}
+	} else {
 		riscv_v_vstate_set_restore(next, task_pt_regs(next));
+	}
 }

 void riscv_v_vstate_ctrl_init(struct task_struct *tsk);
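The context-switch change above can be read as a small state machine on `riscv_v_flags`: if Vector is still enabled when a preempt_v task is switched out, the flag records that fact and Vector is turned off; on switch-in the flag is consumed and Vector is turned back on without scheduling a register restore. A rough user-space model of just that branch, under stated assumptions: the `model_*` types and functions are hypothetical, the dirty-save path and the non-preempt_v path are omitted, and it assumes `riscv_preempt_v_set_restore()` sets the `RISCV_PREEMPT_V_NEED_RESTORE` bit, which this diff does not show.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values copied from the RISCV_PREEMPT_V_* defines in processor.h. */
#define MODEL_PREEMPT_V_NEED_RESTORE UINT32_C(0x40000000)
#define MODEL_PREEMPT_V_IN_SCHEDULE  UINT32_C(0x20000000)

struct model_cpu {
	bool vs_on;		/* stands in for sstatus.VS != Off */
};

struct model_thread {
	uint32_t v_flags;	/* stands in for thread.riscv_v_flags */
};

/* Switch-out of a task that has started preempt_v. */
static void model_switch_out(struct model_cpu *cpu, struct model_thread *prev)
{
	if (cpu->vs_on) {
		cpu->vs_on = false;
		prev->v_flags |= MODEL_PREEMPT_V_IN_SCHEDULE;
	}
}

/* Switch-in of a task that has started preempt_v. */
static void model_switch_in(struct model_cpu *cpu, struct model_thread *next)
{
	if (next->v_flags & MODEL_PREEMPT_V_IN_SCHEDULE) {
		/* V-regs are caller-saved: only re-enable Vector. */
		next->v_flags &= ~MODEL_PREEMPT_V_IN_SCHEDULE;
		cpu->vs_on = true;
	} else {
		/* Assumption: this is what riscv_preempt_v_set_restore() does. */
		next->v_flags |= MODEL_PREEMPT_V_NEED_RESTORE;
	}
}
```

In this model, a task that calls `schedule()` with Vector on comes back with Vector on and no restore pending, whereas a task switched out with Vector off is marked for restore as before.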
+16
arch/riscv/include/asm/vendor_extensions/sifive.h
···
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_RISCV_VENDOR_EXTENSIONS_SIFIVE_H
+#define _ASM_RISCV_VENDOR_EXTENSIONS_SIFIVE_H
+
+#include <asm/vendor_extensions.h>
+
+#include <linux/types.h>
+
+#define RISCV_ISA_VENDOR_EXT_XSFVQMACCDOD 0
+#define RISCV_ISA_VENDOR_EXT_XSFVQMACCQOQ 1
+#define RISCV_ISA_VENDOR_EXT_XSFVFNRCLIPXFQF 2
+#define RISCV_ISA_VENDOR_EXT_XSFVFWMACCQQQ 3
+
+extern struct riscv_isa_vendor_ext_data_list riscv_isa_vendor_ext_list_sifive;
+
+#endif
+19
arch/riscv/include/asm/vendor_extensions/sifive_hwprobe.h
···
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_RISCV_VENDOR_EXTENSIONS_SIFIVE_HWPROBE_H
+#define _ASM_RISCV_VENDOR_EXTENSIONS_SIFIVE_HWPROBE_H
+
+#include <linux/cpumask.h>
+
+#include <uapi/asm/hwprobe.h>
+
+#ifdef CONFIG_RISCV_ISA_VENDOR_EXT_SIFIVE
+void hwprobe_isa_vendor_ext_sifive_0(struct riscv_hwprobe *pair, const struct cpumask *cpus);
+#else
+static inline void hwprobe_isa_vendor_ext_sifive_0(struct riscv_hwprobe *pair,
+						   const struct cpumask *cpus)
+{
+	pair->value = 0;
+}
+#endif
+
+#endif
+2
arch/riscv/include/uapi/asm/hwprobe.h
···
 #define RISCV_HWPROBE_EXT_ZICBOM (1ULL << 55)
 #define RISCV_HWPROBE_EXT_ZAAMO (1ULL << 56)
 #define RISCV_HWPROBE_EXT_ZALRSC (1ULL << 57)
+#define RISCV_HWPROBE_EXT_ZABHA (1ULL << 58)
 #define RISCV_HWPROBE_KEY_CPUPERF_0 5
 #define RISCV_HWPROBE_MISALIGNED_UNKNOWN (0 << 0)
 #define RISCV_HWPROBE_MISALIGNED_EMULATED (1 << 0)
···
 #define RISCV_HWPROBE_MISALIGNED_VECTOR_UNSUPPORTED 4
 #define RISCV_HWPROBE_KEY_VENDOR_EXT_THEAD_0 11
 #define RISCV_HWPROBE_KEY_ZICBOM_BLOCK_SIZE 12
+#define RISCV_HWPROBE_KEY_VENDOR_EXT_SIFIVE_0 13
 /* Increase RISCV_HWPROBE_MAX_KEY when adding items. */

 /* Flags */
+6
arch/riscv/include/uapi/asm/vendor/sifive.h
···
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+
+#define RISCV_HWPROBE_VENDOR_EXT_XSFVQMACCDOD (1 << 0)
+#define RISCV_HWPROBE_VENDOR_EXT_XSFVQMACCQOQ (1 << 1)
+#define RISCV_HWPROBE_VENDOR_EXT_XSFVFNRCLIPXFQF (1 << 2)
+#define RISCV_HWPROBE_VENDOR_EXT_XSFVFWMACCQQQ (1 << 3)
+1 -1
arch/riscv/kernel/Makefile
···
 obj-$(CONFIG_PARAVIRT) += paravirt.o
 obj-$(CONFIG_KGDB) += kgdb.o
 obj-$(CONFIG_KEXEC_CORE) += kexec_relocate.o crash_save_regs.o machine_kexec.o
-obj-$(CONFIG_KEXEC_FILE) += elf_kexec.o machine_kexec_file.o
+obj-$(CONFIG_KEXEC_FILE) += kexec_elf.o kexec_image.o machine_kexec_file.o
 obj-$(CONFIG_CRASH_DUMP) += crash_dump.o
 obj-$(CONFIG_VMCORE_INFO) += vmcore_info.o

+18
arch/riscv/kernel/asm-offsets.c
···
 	OFFSET(TASK_THREAD_S9, task_struct, thread.s[9]);
 	OFFSET(TASK_THREAD_S10, task_struct, thread.s[10]);
 	OFFSET(TASK_THREAD_S11, task_struct, thread.s[11]);
+	OFFSET(TASK_THREAD_SUM, task_struct, thread.sum);

 	OFFSET(TASK_TI_CPU, task_struct, thread_info.cpu);
 	OFFSET(TASK_TI_PREEMPT_COUNT, task_struct, thread_info.preempt_count);
···
 		  offsetof(struct task_struct, thread.s[11])
 		- offsetof(struct task_struct, thread.ra)
 	);
+	DEFINE(TASK_THREAD_SUM_RA,
+		  offsetof(struct task_struct, thread.sum)
+		- offsetof(struct task_struct, thread.ra)
+	);

 	DEFINE(TASK_THREAD_F0_F0,
 		  offsetof(struct task_struct, thread.fstate.f[0])
···
 	DEFINE(STACKFRAME_SIZE_ON_STACK, ALIGN(sizeof(struct stackframe), STACK_ALIGN));
 	OFFSET(STACKFRAME_FP, stackframe, fp);
 	OFFSET(STACKFRAME_RA, stackframe, ra);
+#ifdef CONFIG_FUNCTION_TRACER
+	DEFINE(FTRACE_OPS_FUNC, offsetof(struct ftrace_ops, func));
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
+	DEFINE(FTRACE_OPS_DIRECT_CALL, offsetof(struct ftrace_ops, direct_call));
+#endif /* CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS */
+#endif

 #ifdef CONFIG_DYNAMIC_FTRACE_WITH_ARGS
 	DEFINE(FREGS_SIZE_ON_STACK, ALIGN(sizeof(struct __arch_ftrace_regs), STACK_ALIGN));
···
 	DEFINE(FREGS_SP, offsetof(struct __arch_ftrace_regs, sp));
 	DEFINE(FREGS_S0, offsetof(struct __arch_ftrace_regs, s0));
 	DEFINE(FREGS_T1, offsetof(struct __arch_ftrace_regs, t1));
+#ifdef CONFIG_CC_IS_CLANG
+	DEFINE(FREGS_T2, offsetof(struct __arch_ftrace_regs, t2));
+	DEFINE(FREGS_T3, offsetof(struct __arch_ftrace_regs, t3));
+	DEFINE(FREGS_T4, offsetof(struct __arch_ftrace_regs, t4));
+	DEFINE(FREGS_T5, offsetof(struct __arch_ftrace_regs, t5));
+	DEFINE(FREGS_T6, offsetof(struct __arch_ftrace_regs, t6));
+#endif
 	DEFINE(FREGS_A0, offsetof(struct __arch_ftrace_regs, a0));
 	DEFINE(FREGS_A1, offsetof(struct __arch_ftrace_regs, a1));
 	DEFINE(FREGS_A2, offsetof(struct __arch_ftrace_regs, a2));
+21
arch/riscv/kernel/cpufeature.c
···
 #define NUM_ALPHA_EXTS ('z' - 'a' + 1)

 static bool any_cpu_has_zicboz;
+static bool any_cpu_has_zicbop;
 static bool any_cpu_has_zicbom;

 unsigned long elf_hwcap __read_mostly;
···
 		return -EINVAL;
 	}
 	any_cpu_has_zicboz = true;
+	return 0;
+}
+
+static int riscv_ext_zicbop_validate(const struct riscv_isa_ext_data *data,
+				     const unsigned long *isa_bitmap)
+{
+	if (!riscv_cbop_block_size) {
+		pr_err("Zicbop detected in ISA string, disabling as no cbop-block-size found\n");
+		return -EINVAL;
+	}
+	if (!is_power_of_2(riscv_cbop_block_size)) {
+		pr_err("Zicbop disabled as cbop-block-size present, but is not a power-of-2\n");
+		return -EINVAL;
+	}
+	any_cpu_has_zicbop = true;
 	return 0;
 }

···
 	__RISCV_ISA_EXT_SUPERSET_VALIDATE(v, RISCV_ISA_EXT_v, riscv_v_exts, riscv_ext_vector_float_validate),
 	__RISCV_ISA_EXT_DATA(h, RISCV_ISA_EXT_h),
 	__RISCV_ISA_EXT_SUPERSET_VALIDATE(zicbom, RISCV_ISA_EXT_ZICBOM, riscv_xlinuxenvcfg_exts, riscv_ext_zicbom_validate),
+	__RISCV_ISA_EXT_DATA_VALIDATE(zicbop, RISCV_ISA_EXT_ZICBOP, riscv_ext_zicbop_validate),
 	__RISCV_ISA_EXT_SUPERSET_VALIDATE(zicboz, RISCV_ISA_EXT_ZICBOZ, riscv_xlinuxenvcfg_exts, riscv_ext_zicboz_validate),
 	__RISCV_ISA_EXT_DATA(ziccrse, RISCV_ISA_EXT_ZICCRSE),
 	__RISCV_ISA_EXT_DATA(zicntr, RISCV_ISA_EXT_ZICNTR),
···
 		current->thread.envcfg |= ENVCFG_CBCFE;
 	else if (any_cpu_has_zicbom)
 		pr_warn("Zicbom disabled as it is unavailable on some harts\n");
+
+	if (!riscv_has_extension_unlikely(RISCV_ISA_EXT_ZICBOP) &&
+	    any_cpu_has_zicbop)
+		pr_warn("Zicbop disabled as it is unavailable on some harts\n");
 }

 #ifdef CONFIG_RISCV_ALTERNATIVE
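The new `riscv_ext_zicbop_validate()` rejects Zicbop unless a cbop-block-size was found and that size is a power of two. A minimal user-space sketch of the same two checks; the `demo_*` names are hypothetical stand-ins, and `demo_is_power_of_2()` is written out here rather than taken from the kernel's `is_power_of_2()`:

```c
#include <errno.h>
#include <stdbool.h>

/* Stand-in for is_power_of_2(): nonzero with exactly one bit set. */
static bool demo_is_power_of_2(unsigned long n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/* Hypothetical analogue of the Zicbop block-size validation. */
static int demo_validate_cbop_block_size(unsigned long block_size)
{
	if (!block_size)
		return -EINVAL;	/* no cbop-block-size found */
	if (!demo_is_power_of_2(block_size))
		return -EINVAL;	/* present, but not a power of two */
	return 0;
}
```

A block size of 64 passes, while 0 (missing) and 48 (not a power of two) are both rejected with `-EINVAL`, matching the two `pr_err()` branches in the hunk.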
-485
arch/riscv/kernel/elf_kexec.c
··· 1 - // SPDX-License-Identifier: GPL-2.0-only 2 - /* 3 - * Load ELF vmlinux file for the kexec_file_load syscall. 4 - * 5 - * Copyright (C) 2021 Huawei Technologies Co, Ltd. 6 - * 7 - * Author: Liao Chang (liaochang1@huawei.com) 8 - * 9 - * Based on kexec-tools' kexec-elf-riscv.c, heavily modified 10 - * for kernel. 11 - */ 12 - 13 - #define pr_fmt(fmt) "kexec_image: " fmt 14 - 15 - #include <linux/elf.h> 16 - #include <linux/kexec.h> 17 - #include <linux/slab.h> 18 - #include <linux/of.h> 19 - #include <linux/libfdt.h> 20 - #include <linux/types.h> 21 - #include <linux/memblock.h> 22 - #include <linux/vmalloc.h> 23 - #include <asm/setup.h> 24 - 25 - int arch_kimage_file_post_load_cleanup(struct kimage *image) 26 - { 27 - kvfree(image->arch.fdt); 28 - image->arch.fdt = NULL; 29 - 30 - vfree(image->elf_headers); 31 - image->elf_headers = NULL; 32 - image->elf_headers_sz = 0; 33 - 34 - return kexec_image_post_load_cleanup_default(image); 35 - } 36 - 37 - static int riscv_kexec_elf_load(struct kimage *image, struct elfhdr *ehdr, 38 - struct kexec_elf_info *elf_info, unsigned long old_pbase, 39 - unsigned long new_pbase) 40 - { 41 - int i; 42 - int ret = 0; 43 - size_t size; 44 - struct kexec_buf kbuf; 45 - const struct elf_phdr *phdr; 46 - 47 - kbuf.image = image; 48 - 49 - for (i = 0; i < ehdr->e_phnum; i++) { 50 - phdr = &elf_info->proghdrs[i]; 51 - if (phdr->p_type != PT_LOAD) 52 - continue; 53 - 54 - size = phdr->p_filesz; 55 - if (size > phdr->p_memsz) 56 - size = phdr->p_memsz; 57 - 58 - kbuf.buffer = (void *) elf_info->buffer + phdr->p_offset; 59 - kbuf.bufsz = size; 60 - kbuf.buf_align = phdr->p_align; 61 - kbuf.mem = phdr->p_paddr - old_pbase + new_pbase; 62 - kbuf.memsz = phdr->p_memsz; 63 - kbuf.top_down = false; 64 - ret = kexec_add_buffer(&kbuf); 65 - if (ret) 66 - break; 67 - } 68 - 69 - return ret; 70 - } 71 - 72 - /* 73 - * Go through the available phsyical memory regions and find one that hold 74 - * an image of the specified size. 
75 - */ 76 - static int elf_find_pbase(struct kimage *image, unsigned long kernel_len, 77 - struct elfhdr *ehdr, struct kexec_elf_info *elf_info, 78 - unsigned long *old_pbase, unsigned long *new_pbase) 79 - { 80 - int i; 81 - int ret; 82 - struct kexec_buf kbuf; 83 - const struct elf_phdr *phdr; 84 - unsigned long lowest_paddr = ULONG_MAX; 85 - unsigned long lowest_vaddr = ULONG_MAX; 86 - 87 - for (i = 0; i < ehdr->e_phnum; i++) { 88 - phdr = &elf_info->proghdrs[i]; 89 - if (phdr->p_type != PT_LOAD) 90 - continue; 91 - 92 - if (lowest_paddr > phdr->p_paddr) 93 - lowest_paddr = phdr->p_paddr; 94 - 95 - if (lowest_vaddr > phdr->p_vaddr) 96 - lowest_vaddr = phdr->p_vaddr; 97 - } 98 - 99 - kbuf.image = image; 100 - kbuf.buf_min = lowest_paddr; 101 - kbuf.buf_max = ULONG_MAX; 102 - 103 - /* 104 - * Current riscv boot protocol requires 2MB alignment for 105 - * RV64 and 4MB alignment for RV32 106 - * 107 - */ 108 - kbuf.buf_align = PMD_SIZE; 109 - kbuf.mem = KEXEC_BUF_MEM_UNKNOWN; 110 - kbuf.memsz = ALIGN(kernel_len, PAGE_SIZE); 111 - kbuf.top_down = false; 112 - ret = arch_kexec_locate_mem_hole(&kbuf); 113 - if (!ret) { 114 - *old_pbase = lowest_paddr; 115 - *new_pbase = kbuf.mem; 116 - image->start = ehdr->e_entry - lowest_vaddr + kbuf.mem; 117 - } 118 - return ret; 119 - } 120 - 121 - #ifdef CONFIG_CRASH_DUMP 122 - static int get_nr_ram_ranges_callback(struct resource *res, void *arg) 123 - { 124 - unsigned int *nr_ranges = arg; 125 - 126 - (*nr_ranges)++; 127 - return 0; 128 - } 129 - 130 - static int prepare_elf64_ram_headers_callback(struct resource *res, void *arg) 131 - { 132 - struct crash_mem *cmem = arg; 133 - 134 - cmem->ranges[cmem->nr_ranges].start = res->start; 135 - cmem->ranges[cmem->nr_ranges].end = res->end; 136 - cmem->nr_ranges++; 137 - 138 - return 0; 139 - } 140 - 141 - static int prepare_elf_headers(void **addr, unsigned long *sz) 142 - { 143 - struct crash_mem *cmem; 144 - unsigned int nr_ranges; 145 - int ret; 146 - 147 - nr_ranges = 1; /* For 
exclusion of crashkernel region */ 148 - walk_system_ram_res(0, -1, &nr_ranges, get_nr_ram_ranges_callback); 149 - 150 - cmem = kmalloc(struct_size(cmem, ranges, nr_ranges), GFP_KERNEL); 151 - if (!cmem) 152 - return -ENOMEM; 153 - 154 - cmem->max_nr_ranges = nr_ranges; 155 - cmem->nr_ranges = 0; 156 - ret = walk_system_ram_res(0, -1, cmem, prepare_elf64_ram_headers_callback); 157 - if (ret) 158 - goto out; 159 - 160 - /* Exclude crashkernel region */ 161 - ret = crash_exclude_mem_range(cmem, crashk_res.start, crashk_res.end); 162 - if (!ret) 163 - ret = crash_prepare_elf64_headers(cmem, true, addr, sz); 164 - 165 - out: 166 - kfree(cmem); 167 - return ret; 168 - } 169 - 170 - static char *setup_kdump_cmdline(struct kimage *image, char *cmdline, 171 - unsigned long cmdline_len) 172 - { 173 - int elfcorehdr_strlen; 174 - char *cmdline_ptr; 175 - 176 - cmdline_ptr = kzalloc(COMMAND_LINE_SIZE, GFP_KERNEL); 177 - if (!cmdline_ptr) 178 - return NULL; 179 - 180 - elfcorehdr_strlen = sprintf(cmdline_ptr, "elfcorehdr=0x%lx ", 181 - image->elf_load_addr); 182 - 183 - if (elfcorehdr_strlen + cmdline_len > COMMAND_LINE_SIZE) { 184 - pr_err("Appending elfcorehdr=<addr> exceeds cmdline size\n"); 185 - kfree(cmdline_ptr); 186 - return NULL; 187 - } 188 - 189 - memcpy(cmdline_ptr + elfcorehdr_strlen, cmdline, cmdline_len); 190 - /* Ensure it's nul terminated */ 191 - cmdline_ptr[COMMAND_LINE_SIZE - 1] = '\0'; 192 - return cmdline_ptr; 193 - } 194 - #endif 195 - 196 - static void *elf_kexec_load(struct kimage *image, char *kernel_buf, 197 - unsigned long kernel_len, char *initrd, 198 - unsigned long initrd_len, char *cmdline, 199 - unsigned long cmdline_len) 200 - { 201 - int ret; 202 - void *fdt; 203 - unsigned long old_kernel_pbase = ULONG_MAX; 204 - unsigned long new_kernel_pbase = 0UL; 205 - unsigned long initrd_pbase = 0UL; 206 - unsigned long kernel_start; 207 - struct elfhdr ehdr; 208 - struct kexec_buf kbuf; 209 - struct kexec_elf_info elf_info; 210 - char 
*modified_cmdline = NULL; 211 - 212 - ret = kexec_build_elf_info(kernel_buf, kernel_len, &ehdr, &elf_info); 213 - if (ret) 214 - return ERR_PTR(ret); 215 - 216 - ret = elf_find_pbase(image, kernel_len, &ehdr, &elf_info, 217 - &old_kernel_pbase, &new_kernel_pbase); 218 - if (ret) 219 - goto out; 220 - kernel_start = image->start; 221 - 222 - /* Add the kernel binary to the image */ 223 - ret = riscv_kexec_elf_load(image, &ehdr, &elf_info, 224 - old_kernel_pbase, new_kernel_pbase); 225 - if (ret) 226 - goto out; 227 - 228 - kbuf.image = image; 229 - kbuf.buf_min = new_kernel_pbase + kernel_len; 230 - kbuf.buf_max = ULONG_MAX; 231 - 232 - #ifdef CONFIG_CRASH_DUMP 233 - /* Add elfcorehdr */ 234 - if (image->type == KEXEC_TYPE_CRASH) { 235 - void *headers; 236 - unsigned long headers_sz; 237 - ret = prepare_elf_headers(&headers, &headers_sz); 238 - if (ret) { 239 - pr_err("Preparing elf core header failed\n"); 240 - goto out; 241 - } 242 - 243 - kbuf.buffer = headers; 244 - kbuf.bufsz = headers_sz; 245 - kbuf.mem = KEXEC_BUF_MEM_UNKNOWN; 246 - kbuf.memsz = headers_sz; 247 - kbuf.buf_align = ELF_CORE_HEADER_ALIGN; 248 - kbuf.top_down = true; 249 - 250 - ret = kexec_add_buffer(&kbuf); 251 - if (ret) { 252 - vfree(headers); 253 - goto out; 254 - } 255 - image->elf_headers = headers; 256 - image->elf_load_addr = kbuf.mem; 257 - image->elf_headers_sz = headers_sz; 258 - 259 - kexec_dprintk("Loaded elf core header at 0x%lx bufsz=0x%lx memsz=0x%lx\n", 260 - image->elf_load_addr, kbuf.bufsz, kbuf.memsz); 261 - 262 - /* Setup cmdline for kdump kernel case */ 263 - modified_cmdline = setup_kdump_cmdline(image, cmdline, 264 - cmdline_len); 265 - if (!modified_cmdline) { 266 - pr_err("Setting up cmdline for kdump kernel failed\n"); 267 - ret = -EINVAL; 268 - goto out; 269 - } 270 - cmdline = modified_cmdline; 271 - } 272 - #endif 273 - 274 - #ifdef CONFIG_ARCH_SUPPORTS_KEXEC_PURGATORY 275 - /* Add purgatory to the image */ 276 - kbuf.top_down = true; 277 - kbuf.mem = 
KEXEC_BUF_MEM_UNKNOWN; 278 - ret = kexec_load_purgatory(image, &kbuf); 279 - if (ret) { 280 - pr_err("Error loading purgatory ret=%d\n", ret); 281 - goto out; 282 - } 283 - kexec_dprintk("Loaded purgatory at 0x%lx\n", kbuf.mem); 284 - 285 - ret = kexec_purgatory_get_set_symbol(image, "riscv_kernel_entry", 286 - &kernel_start, 287 - sizeof(kernel_start), 0); 288 - if (ret) 289 - pr_err("Error update purgatory ret=%d\n", ret); 290 - #endif /* CONFIG_ARCH_SUPPORTS_KEXEC_PURGATORY */ 291 - 292 - /* Add the initrd to the image */ 293 - if (initrd != NULL) { 294 - kbuf.buffer = initrd; 295 - kbuf.bufsz = kbuf.memsz = initrd_len; 296 - kbuf.buf_align = PAGE_SIZE; 297 - kbuf.top_down = true; 298 - kbuf.mem = KEXEC_BUF_MEM_UNKNOWN; 299 - ret = kexec_add_buffer(&kbuf); 300 - if (ret) 301 - goto out; 302 - initrd_pbase = kbuf.mem; 303 - kexec_dprintk("Loaded initrd at 0x%lx\n", initrd_pbase); 304 - } 305 - 306 - /* Add the DTB to the image */ 307 - fdt = of_kexec_alloc_and_setup_fdt(image, initrd_pbase, 308 - initrd_len, cmdline, 0); 309 - if (!fdt) { 310 - pr_err("Error setting up the new device tree.\n"); 311 - ret = -EINVAL; 312 - goto out; 313 - } 314 - 315 - fdt_pack(fdt); 316 - kbuf.buffer = fdt; 317 - kbuf.bufsz = kbuf.memsz = fdt_totalsize(fdt); 318 - kbuf.buf_align = PAGE_SIZE; 319 - kbuf.mem = KEXEC_BUF_MEM_UNKNOWN; 320 - kbuf.top_down = true; 321 - ret = kexec_add_buffer(&kbuf); 322 - if (ret) { 323 - pr_err("Error add DTB kbuf ret=%d\n", ret); 324 - goto out_free_fdt; 325 - } 326 - /* Cache the fdt buffer address for memory cleanup */ 327 - image->arch.fdt = fdt; 328 - kexec_dprintk("Loaded device tree at 0x%lx\n", kbuf.mem); 329 - goto out; 330 - 331 - out_free_fdt: 332 - kvfree(fdt); 333 - out: 334 - kfree(modified_cmdline); 335 - kexec_free_elf_info(&elf_info); 336 - return ret ? 
ERR_PTR(ret) : NULL; 337 - } 338 - 339 - #define RV_X(x, s, n) (((x) >> (s)) & ((1 << (n)) - 1)) 340 - #define RISCV_IMM_BITS 12 341 - #define RISCV_IMM_REACH (1LL << RISCV_IMM_BITS) 342 - #define RISCV_CONST_HIGH_PART(x) \ 343 - (((x) + (RISCV_IMM_REACH >> 1)) & ~(RISCV_IMM_REACH - 1)) 344 - #define RISCV_CONST_LOW_PART(x) ((x) - RISCV_CONST_HIGH_PART(x)) 345 - 346 - #define ENCODE_ITYPE_IMM(x) \ 347 - (RV_X(x, 0, 12) << 20) 348 - #define ENCODE_BTYPE_IMM(x) \ 349 - ((RV_X(x, 1, 4) << 8) | (RV_X(x, 5, 6) << 25) | \ 350 - (RV_X(x, 11, 1) << 7) | (RV_X(x, 12, 1) << 31)) 351 - #define ENCODE_UTYPE_IMM(x) \ 352 - (RV_X(x, 12, 20) << 12) 353 - #define ENCODE_JTYPE_IMM(x) \ 354 - ((RV_X(x, 1, 10) << 21) | (RV_X(x, 11, 1) << 20) | \ 355 - (RV_X(x, 12, 8) << 12) | (RV_X(x, 20, 1) << 31)) 356 - #define ENCODE_CBTYPE_IMM(x) \ 357 - ((RV_X(x, 1, 2) << 3) | (RV_X(x, 3, 2) << 10) | (RV_X(x, 5, 1) << 2) | \ 358 - (RV_X(x, 6, 2) << 5) | (RV_X(x, 8, 1) << 12)) 359 - #define ENCODE_CJTYPE_IMM(x) \ 360 - ((RV_X(x, 1, 3) << 3) | (RV_X(x, 4, 1) << 11) | (RV_X(x, 5, 1) << 2) | \ 361 - (RV_X(x, 6, 1) << 7) | (RV_X(x, 7, 1) << 6) | (RV_X(x, 8, 2) << 9) | \ 362 - (RV_X(x, 10, 1) << 8) | (RV_X(x, 11, 1) << 12)) 363 - #define ENCODE_UJTYPE_IMM(x) \ 364 - (ENCODE_UTYPE_IMM(RISCV_CONST_HIGH_PART(x)) | \ 365 - (ENCODE_ITYPE_IMM(RISCV_CONST_LOW_PART(x)) << 32)) 366 - #define ENCODE_UITYPE_IMM(x) \ 367 - (ENCODE_UTYPE_IMM(x) | (ENCODE_ITYPE_IMM(x) << 32)) 368 - 369 - #define CLEAN_IMM(type, x) \ 370 - ((~ENCODE_##type##_IMM((uint64_t)(-1))) & (x)) 371 - 372 - int arch_kexec_apply_relocations_add(struct purgatory_info *pi, 373 - Elf_Shdr *section, 374 - const Elf_Shdr *relsec, 375 - const Elf_Shdr *symtab) 376 - { 377 - const char *strtab, *name, *shstrtab; 378 - const Elf_Shdr *sechdrs; 379 - Elf64_Rela *relas; 380 - int i, r_type; 381 - 382 - /* String & section header string table */ 383 - sechdrs = (void *)pi->ehdr + pi->ehdr->e_shoff; 384 - strtab = (char *)pi->ehdr + 
sechdrs[symtab->sh_link].sh_offset; 385 - shstrtab = (char *)pi->ehdr + sechdrs[pi->ehdr->e_shstrndx].sh_offset; 386 - 387 - relas = (void *)pi->ehdr + relsec->sh_offset; 388 - 389 - for (i = 0; i < relsec->sh_size / sizeof(*relas); i++) { 390 - const Elf_Sym *sym; /* symbol to relocate */ 391 - unsigned long addr; /* final location after relocation */ 392 - unsigned long val; /* relocated symbol value */ 393 - unsigned long sec_base; /* relocated symbol value */ 394 - void *loc; /* tmp location to modify */ 395 - 396 - sym = (void *)pi->ehdr + symtab->sh_offset; 397 - sym += ELF64_R_SYM(relas[i].r_info); 398 - 399 - if (sym->st_name) 400 - name = strtab + sym->st_name; 401 - else 402 - name = shstrtab + sechdrs[sym->st_shndx].sh_name; 403 - 404 - loc = pi->purgatory_buf; 405 - loc += section->sh_offset; 406 - loc += relas[i].r_offset; 407 - 408 - if (sym->st_shndx == SHN_ABS) 409 - sec_base = 0; 410 - else if (sym->st_shndx >= pi->ehdr->e_shnum) { 411 - pr_err("Invalid section %d for symbol %s\n", 412 - sym->st_shndx, name); 413 - return -ENOEXEC; 414 - } else 415 - sec_base = pi->sechdrs[sym->st_shndx].sh_addr; 416 - 417 - val = sym->st_value; 418 - val += sec_base; 419 - val += relas[i].r_addend; 420 - 421 - addr = section->sh_addr + relas[i].r_offset; 422 - 423 - r_type = ELF64_R_TYPE(relas[i].r_info); 424 - 425 - switch (r_type) { 426 - case R_RISCV_BRANCH: 427 - *(u32 *)loc = CLEAN_IMM(BTYPE, *(u32 *)loc) | 428 - ENCODE_BTYPE_IMM(val - addr); 429 - break; 430 - case R_RISCV_JAL: 431 - *(u32 *)loc = CLEAN_IMM(JTYPE, *(u32 *)loc) | 432 - ENCODE_JTYPE_IMM(val - addr); 433 - break; 434 - /* 435 - * With no R_RISCV_PCREL_LO12_S, R_RISCV_PCREL_LO12_I 436 - * sym is expected to be next to R_RISCV_PCREL_HI20 437 - * in purgatory relsec. Handle it like R_RISCV_CALL 438 - * sym, instead of searching the whole relsec. 
439 - */ 440 - case R_RISCV_PCREL_HI20: 441 - case R_RISCV_CALL_PLT: 442 - case R_RISCV_CALL: 443 - *(u64 *)loc = CLEAN_IMM(UITYPE, *(u64 *)loc) | 444 - ENCODE_UJTYPE_IMM(val - addr); 445 - break; 446 - case R_RISCV_RVC_BRANCH: 447 - *(u32 *)loc = CLEAN_IMM(CBTYPE, *(u32 *)loc) | 448 - ENCODE_CBTYPE_IMM(val - addr); 449 - break; 450 - case R_RISCV_RVC_JUMP: 451 - *(u32 *)loc = CLEAN_IMM(CJTYPE, *(u32 *)loc) | 452 - ENCODE_CJTYPE_IMM(val - addr); 453 - break; 454 - case R_RISCV_ADD16: 455 - *(u16 *)loc += val; 456 - break; 457 - case R_RISCV_SUB16: 458 - *(u16 *)loc -= val; 459 - break; 460 - case R_RISCV_ADD32: 461 - *(u32 *)loc += val; 462 - break; 463 - case R_RISCV_SUB32: 464 - *(u32 *)loc -= val; 465 - break; 466 - /* It has been applied by R_RISCV_PCREL_HI20 sym */ 467 - case R_RISCV_PCREL_LO12_I: 468 - case R_RISCV_ALIGN: 469 - case R_RISCV_RELAX: 470 - break; 471 - case R_RISCV_64: 472 - *(u64 *)loc = val; 473 - break; 474 - default: 475 - pr_err("Unknown rela relocation: %d\n", r_type); 476 - return -ENOEXEC; 477 - } 478 - } 479 - return 0; 480 - } 481 - 482 - const struct kexec_file_ops elf_kexec_ops = { 483 - .probe = kexec_elf_probe, 484 - .load = elf_kexec_load, 485 - };
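The RISCV_CONST_HIGH_PART and RISCV_CONST_LOW_PART macros in the relocation code above split a PC-relative offset into the upper part carried by an auipc and the signed 12-bit remainder carried by the instruction that follows it. A minimal user-space sketch of that split, with the macros copied from the diff; the helper names (split_hi, split_lo, split_is_valid) are hypothetical and are not kernel functions:

```c
#include <assert.h>
#include <stdint.h>

/* Copied from the diff above. */
#define RISCV_IMM_BITS 12
#define RISCV_IMM_REACH (1LL << RISCV_IMM_BITS)
#define RISCV_CONST_HIGH_PART(x) \
	(((x) + (RISCV_IMM_REACH >> 1)) & ~(RISCV_IMM_REACH - 1))
#define RISCV_CONST_LOW_PART(x) ((x) - RISCV_CONST_HIGH_PART(x))

/* Hypothetical helper: upper part, always a multiple of 4096. */
int64_t split_hi(int64_t offset)
{
	return RISCV_CONST_HIGH_PART(offset);
}

/* Hypothetical helper: signed remainder in the range [-2048, 2047]. */
int64_t split_lo(int64_t offset)
{
	return RISCV_CONST_LOW_PART(offset);
}

/* Returns 1 when hi + lo reproduces the offset and lo fits 12 signed bits. */
int split_is_valid(int64_t offset)
{
	int64_t hi = split_hi(offset);
	int64_t lo = split_lo(offset);

	return hi + lo == offset && lo >= -2048 && lo <= 2047 &&
	       (hi & 0xfff) == 0;
}
```

Adding half of the 12-bit reach before masking rounds the upper part to the nearest multiple of 4096, which is what lets the lower part be a signed immediate.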
+9
arch/riscv/kernel/entry.S
@@ -401,9 +401,18 @@
 	REG_S s9, TASK_THREAD_S9_RA(a3)
 	REG_S s10, TASK_THREAD_S10_RA(a3)
 	REG_S s11, TASK_THREAD_S11_RA(a3)
+
+	/* save the user space access flag */
+	csrr s0, CSR_STATUS
+	REG_S s0, TASK_THREAD_SUM_RA(a3)
+
 	/* Save the kernel shadow call stack pointer */
 	scs_save_current
 	/* Restore context from next->thread */
+	REG_L s0, TASK_THREAD_SUM_RA(a4)
+	li s1, SR_SUM
+	and s0, s0, s1
+	csrs CSR_STATUS, s0
 	REG_L ra, TASK_THREAD_RA_RA(a4)
 	REG_L sp, TASK_THREAD_SP_RA(a4)
 	REG_L s0, TASK_THREAD_S0_RA(a4)
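The entry.S hunk above stores the outgoing task's whole CSR_STATUS value, but on the way back in it masks the saved value with SR_SUM and applies it with csrs, so only the SUM bit is carried over. A minimal C model of that masking, assuming SR_SUM is bit 18 of sstatus (0x00040000); restore_sum is a hypothetical name used only for illustration:

```c
#include <assert.h>

/* Assumption: SUM (permit Supervisor User Memory access) is sstatus bit 18. */
#define SR_SUM 0x00040000UL

/*
 * Hypothetical model of the restore sequence in the hunk above.
 * csrs only sets bits, so the saved SUM bit of the incoming task is
 * OR-ed into the live status and every other saved bit is ignored.
 */
unsigned long restore_sum(unsigned long live_status, unsigned long saved_status)
{
	return live_status | (saved_status & SR_SUM);
}
```

Because csrs can only set bits, this model never clears SUM; it shows the masking step only and makes no claim about the rest of the context-switch path.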
+133 -121
arch/riscv/kernel/ftrace.c
··· 8 8 #include <linux/ftrace.h> 9 9 #include <linux/uaccess.h> 10 10 #include <linux/memory.h> 11 + #include <linux/irqflags.h> 11 12 #include <linux/stop_machine.h> 12 13 #include <asm/cacheflush.h> 13 14 #include <asm/text-patching.h> 14 15 15 16 #ifdef CONFIG_DYNAMIC_FTRACE 16 - void ftrace_arch_code_modify_prepare(void) __acquires(&text_mutex) 17 + unsigned long ftrace_call_adjust(unsigned long addr) 18 + { 19 + if (IS_ENABLED(CONFIG_DYNAMIC_FTRACE_WITH_CALL_OPS)) 20 + return addr + 8 + MCOUNT_AUIPC_SIZE; 21 + 22 + return addr + MCOUNT_AUIPC_SIZE; 23 + } 24 + 25 + unsigned long arch_ftrace_get_symaddr(unsigned long fentry_ip) 26 + { 27 + return fentry_ip - MCOUNT_AUIPC_SIZE; 28 + } 29 + 30 + void arch_ftrace_update_code(int command) 17 31 { 18 32 mutex_lock(&text_mutex); 19 - 20 - /* 21 - * The code sequences we use for ftrace can't be patched while the 22 - * kernel is running, so we need to use stop_machine() to modify them 23 - * for now. This doesn't play nice with text_mutex, we use this flag 24 - * to elide the check. 
25 - */ 26 - riscv_patch_in_stop_machine = true; 27 - } 28 - 29 - void ftrace_arch_code_modify_post_process(void) __releases(&text_mutex) 30 - { 31 - riscv_patch_in_stop_machine = false; 33 + command |= FTRACE_MAY_SLEEP; 34 + ftrace_modify_all_code(command); 32 35 mutex_unlock(&text_mutex); 36 + flush_icache_all(); 33 37 } 34 38 35 - static int ftrace_check_current_call(unsigned long hook_pos, 36 - unsigned int *expected) 39 + static int __ftrace_modify_call(unsigned long source, unsigned long target, bool validate) 37 40 { 41 + unsigned int call[2], offset; 38 42 unsigned int replaced[2]; 39 - unsigned int nops[2] = {RISCV_INSN_NOP4, RISCV_INSN_NOP4}; 40 43 41 - /* we expect nops at the hook position */ 42 - if (!expected) 43 - expected = nops; 44 + offset = target - source; 45 + call[1] = to_jalr_t0(offset); 44 46 45 - /* 46 - * Read the text we want to modify; 47 - * return must be -EFAULT on read error 48 - */ 49 - if (copy_from_kernel_nofault(replaced, (void *)hook_pos, 50 - MCOUNT_INSN_SIZE)) 51 - return -EFAULT; 47 + if (validate) { 48 + call[0] = to_auipc_t0(offset); 49 + /* 50 + * Read the text we want to modify; 51 + * return must be -EFAULT on read error 52 + */ 53 + if (copy_from_kernel_nofault(replaced, (void *)source, 2 * MCOUNT_INSN_SIZE)) 54 + return -EFAULT; 52 55 53 - /* 54 - * Make sure it is what we expect it to be; 55 - * return must be -EINVAL on failed comparison 56 - */ 57 - if (memcmp(expected, replaced, sizeof(replaced))) { 58 - pr_err("%p: expected (%08x %08x) but got (%08x %08x)\n", 59 - (void *)hook_pos, expected[0], expected[1], replaced[0], 60 - replaced[1]); 61 - return -EINVAL; 56 + if (replaced[0] != call[0]) { 57 + pr_err("%p: expected (%08x) but got (%08x)\n", 58 + (void *)source, call[0], replaced[0]); 59 + return -EINVAL; 60 + } 62 61 } 63 62 64 - return 0; 65 - } 66 - 67 - static int __ftrace_modify_call(unsigned long hook_pos, unsigned long target, 68 - bool enable, bool ra) 69 - { 70 - unsigned int call[2]; 71 - unsigned int 
nops[2] = {RISCV_INSN_NOP4, RISCV_INSN_NOP4}; 72 - 73 - if (ra) 74 - make_call_ra(hook_pos, target, call); 75 - else 76 - make_call_t0(hook_pos, target, call); 77 - 78 - /* Replace the auipc-jalr pair at once. Return -EPERM on write error. */ 79 - if (patch_insn_write((void *)hook_pos, enable ? call : nops, MCOUNT_INSN_SIZE)) 63 + /* Replace the jalr at once. Return -EPERM on write error. */ 64 + if (patch_insn_write((void *)(source + MCOUNT_AUIPC_SIZE), call + 1, MCOUNT_JALR_SIZE)) 80 65 return -EPERM; 81 66 82 67 return 0; 83 68 } 69 + 70 + #ifdef CONFIG_DYNAMIC_FTRACE_WITH_CALL_OPS 71 + static const struct ftrace_ops *riscv64_rec_get_ops(struct dyn_ftrace *rec) 72 + { 73 + const struct ftrace_ops *ops = NULL; 74 + 75 + if (rec->flags & FTRACE_FL_CALL_OPS_EN) { 76 + ops = ftrace_find_unique_ops(rec); 77 + WARN_ON_ONCE(!ops); 78 + } 79 + 80 + if (!ops) 81 + ops = &ftrace_list_ops; 82 + 83 + return ops; 84 + } 85 + 86 + static int ftrace_rec_set_ops(const struct dyn_ftrace *rec, const struct ftrace_ops *ops) 87 + { 88 + unsigned long literal = ALIGN_DOWN(rec->ip - 12, 8); 89 + 90 + return patch_text_nosync((void *)literal, &ops, sizeof(ops)); 91 + } 92 + 93 + static int ftrace_rec_set_nop_ops(struct dyn_ftrace *rec) 94 + { 95 + return ftrace_rec_set_ops(rec, &ftrace_nop_ops); 96 + } 97 + 98 + static int ftrace_rec_update_ops(struct dyn_ftrace *rec) 99 + { 100 + return ftrace_rec_set_ops(rec, riscv64_rec_get_ops(rec)); 101 + } 102 + #else 103 + static int ftrace_rec_set_nop_ops(struct dyn_ftrace *rec) { return 0; } 104 + static int ftrace_rec_update_ops(struct dyn_ftrace *rec) { return 0; } 105 + #endif 84 106 85 107 int ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr) 86 108 { 87 - unsigned int call[2]; 109 + unsigned long distance, orig_addr, pc = rec->ip - MCOUNT_AUIPC_SIZE; 110 + int ret; 88 111 89 - make_call_t0(rec->ip, addr, call); 112 + ret = ftrace_rec_update_ops(rec); 113 + if (ret) 114 + return ret; 90 115 91 - if (patch_insn_write((void 
*)rec->ip, call, MCOUNT_INSN_SIZE)) 92 - return -EPERM; 116 + orig_addr = (unsigned long)&ftrace_caller; 117 + distance = addr > orig_addr ? addr - orig_addr : orig_addr - addr; 118 + if (distance > JALR_RANGE) 119 + addr = FTRACE_ADDR; 93 120 94 - return 0; 121 + return __ftrace_modify_call(pc, addr, false); 95 122 } 96 123 97 - int ftrace_make_nop(struct module *mod, struct dyn_ftrace *rec, 98 - unsigned long addr) 124 + int ftrace_make_nop(struct module *mod, struct dyn_ftrace *rec, unsigned long addr) 99 125 { 100 - unsigned int nops[2] = {RISCV_INSN_NOP4, RISCV_INSN_NOP4}; 126 + u32 nop4 = RISCV_INSN_NOP4; 127 + int ret; 101 128 102 - if (patch_insn_write((void *)rec->ip, nops, MCOUNT_INSN_SIZE)) 129 + ret = ftrace_rec_set_nop_ops(rec); 130 + if (ret) 131 + return ret; 132 + 133 + if (patch_insn_write((void *)rec->ip, &nop4, MCOUNT_NOP4_SIZE)) 103 134 return -EPERM; 104 135 105 136 return 0; ··· 145 114 */ 146 115 int ftrace_init_nop(struct module *mod, struct dyn_ftrace *rec) 147 116 { 148 - int out; 117 + unsigned long pc = rec->ip - MCOUNT_AUIPC_SIZE; 118 + unsigned int nops[2], offset; 119 + int ret; 120 + 121 + ret = ftrace_rec_set_nop_ops(rec); 122 + if (ret) 123 + return ret; 124 + 125 + offset = (unsigned long) &ftrace_caller - pc; 126 + nops[0] = to_auipc_t0(offset); 127 + nops[1] = RISCV_INSN_NOP4; 149 128 150 129 mutex_lock(&text_mutex); 151 - out = ftrace_make_nop(mod, rec, MCOUNT_ADDR); 130 + ret = patch_insn_write((void *)pc, nops, 2 * MCOUNT_INSN_SIZE); 152 131 mutex_unlock(&text_mutex); 153 - 154 - return out; 155 - } 156 - 157 - int ftrace_update_ftrace_func(ftrace_func_t func) 158 - { 159 - int ret = __ftrace_modify_call((unsigned long)&ftrace_call, 160 - (unsigned long)func, true, true); 161 132 162 133 return ret; 163 134 } 164 135 165 - struct ftrace_modify_param { 166 - int command; 167 - atomic_t cpu_count; 168 - }; 169 - 170 - static int __ftrace_modify_code(void *data) 136 + ftrace_func_t ftrace_call_dest = ftrace_stub; 137 + int 
ftrace_update_ftrace_func(ftrace_func_t func) 171 138 { 172 - struct ftrace_modify_param *param = data; 139 + /* 140 + * When using CALL_OPS, the function to call is associated with the 141 + * call site, and we don't have a global function pointer to update. 142 + */ 143 + if (IS_ENABLED(CONFIG_DYNAMIC_FTRACE_WITH_CALL_OPS)) 144 + return 0; 173 145 174 - if (atomic_inc_return(&param->cpu_count) == num_online_cpus()) { 175 - ftrace_modify_all_code(param->command); 176 - /* 177 - * Make sure the patching store is effective *before* we 178 - * increment the counter which releases all waiting CPUs 179 - * by using the release variant of atomic increment. The 180 - * release pairs with the call to local_flush_icache_all() 181 - * on the waiting CPU. 182 - */ 183 - atomic_inc_return_release(&param->cpu_count); 184 - } else { 185 - while (atomic_read(&param->cpu_count) <= num_online_cpus()) 186 - cpu_relax(); 187 - 188 - local_flush_icache_all(); 189 - } 190 - 146 + WRITE_ONCE(ftrace_call_dest, func); 147 + /* 148 + * The data fence ensure that the update to ftrace_call_dest happens 149 + * before the write to function_trace_op later in the generic ftrace. 150 + * If the sequence is not enforced, then an old ftrace_call_dest may 151 + * race loading a new function_trace_op set in ftrace_modify_all_code 152 + */ 153 + smp_wmb(); 154 + /* 155 + * Updating ftrace dpes not take stop_machine path, so irqs should not 156 + * be disabled. 
157 + */ 158 + WARN_ON(irqs_disabled()); 159 + smp_call_function(ftrace_sync_ipi, NULL, 1); 191 160 return 0; 192 161 } 193 162 194 - void arch_ftrace_update_code(int command) 163 + #else /* CONFIG_DYNAMIC_FTRACE */ 164 + unsigned long ftrace_call_adjust(unsigned long addr) 195 165 { 196 - struct ftrace_modify_param param = { command, ATOMIC_INIT(0) }; 197 - 198 - stop_machine(__ftrace_modify_code, &param, cpu_online_mask); 166 + return addr; 199 167 } 200 - #endif 168 + #endif /* CONFIG_DYNAMIC_FTRACE */ 201 169 202 170 #ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS 203 171 int ftrace_modify_call(struct dyn_ftrace *rec, unsigned long old_addr, 204 172 unsigned long addr) 205 173 { 206 - unsigned int call[2]; 207 - unsigned long caller = rec->ip; 174 + unsigned long caller = rec->ip - MCOUNT_AUIPC_SIZE; 208 175 int ret; 209 176 210 - make_call_t0(caller, old_addr, call); 211 - ret = ftrace_check_current_call(caller, call); 212 - 177 + ret = ftrace_rec_update_ops(rec); 213 178 if (ret) 214 179 return ret; 215 180 216 - return __ftrace_modify_call(caller, addr, true, false); 181 + return __ftrace_modify_call(caller, FTRACE_ADDR, true); 217 182 } 218 183 #endif 219 184 ··· 237 210 } 238 211 239 212 #ifdef CONFIG_DYNAMIC_FTRACE 240 - #ifdef CONFIG_DYNAMIC_FTRACE_WITH_ARGS 241 213 void ftrace_graph_func(unsigned long ip, unsigned long parent_ip, 242 214 struct ftrace_ops *op, struct ftrace_regs *fregs) 243 215 { ··· 257 231 if (!function_graph_enter_regs(old, ip, frame_pointer, parent, fregs)) 258 232 *parent = return_hooker; 259 233 } 260 - #else /* CONFIG_DYNAMIC_FTRACE_WITH_ARGS */ 261 - extern void ftrace_graph_call(void); 262 - int ftrace_enable_ftrace_graph_caller(void) 263 - { 264 - return __ftrace_modify_call((unsigned long)&ftrace_graph_call, 265 - (unsigned long)&prepare_ftrace_return, true, true); 266 - } 267 - 268 - int ftrace_disable_ftrace_graph_caller(void) 269 - { 270 - return __ftrace_modify_call((unsigned long)&ftrace_graph_call, 271 - (unsigned 
long)&prepare_ftrace_return, false, true); 272 - } 273 - #endif /* CONFIG_DYNAMIC_FTRACE_WITH_ARGS */ 274 234 #endif /* CONFIG_DYNAMIC_FTRACE */ 275 235 #endif /* CONFIG_FUNCTION_GRAPH_TRACER */
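The address arithmetic in the ftrace.c changes above can be checked in isolation: ftrace_call_adjust() moves the recorded address forward to the jalr slot, arch_ftrace_get_symaddr() steps back to the auipc, and ftrace_rec_set_ops() locates the 8-byte ops literal in front of the call site. A sketch of that arithmetic, assuming MCOUNT_AUIPC_SIZE is 4 bytes (one 32-bit instruction) and that the recorded address is 8-byte aligned; the function names are hypothetical:

```c
#include <assert.h>

#define MCOUNT_AUIPC_SIZE 4UL			/* assumption: one 32-bit auipc */
#define ALIGN_DOWN(x, a) ((x) & ~((a) - 1UL))	/* power-of-two alignment only */

/* Mirrors ftrace_call_adjust() when CALL_OPS is enabled. */
unsigned long call_adjust_call_ops(unsigned long addr)
{
	return addr + 8 + MCOUNT_AUIPC_SIZE;
}

/* Mirrors arch_ftrace_get_symaddr(): step back from the jalr to the auipc. */
unsigned long symaddr_from_ip(unsigned long fentry_ip)
{
	return fentry_ip - MCOUNT_AUIPC_SIZE;
}

/* Mirrors the literal address computed in ftrace_rec_set_ops(). */
unsigned long ops_literal(unsigned long ip)
{
	return ALIGN_DOWN(ip - 12, 8UL);
}
```

Under these assumptions the literal address equals the originally recorded address, which is consistent with the ops pointer occupying the 8 bytes ahead of the auipc/jalr pair.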
+144
arch/riscv/kernel/kexec_elf.c
@@ -0,0 +1,144 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Load ELF vmlinux file for the kexec_file_load syscall.
+ *
+ * Copyright (C) 2021 Huawei Technologies Co, Ltd.
+ *
+ * Author: Liao Chang (liaochang1@huawei.com)
+ *
+ * Based on kexec-tools' kexec-elf-riscv.c, heavily modified
+ * for kernel.
+ */
+
+#define pr_fmt(fmt) "kexec_image: " fmt
+
+#include <linux/elf.h>
+#include <linux/kexec.h>
+#include <linux/slab.h>
+#include <linux/of.h>
+#include <linux/libfdt.h>
+#include <linux/types.h>
+#include <linux/memblock.h>
+#include <asm/setup.h>
+
+static int riscv_kexec_elf_load(struct kimage *image, struct elfhdr *ehdr,
+				struct kexec_elf_info *elf_info, unsigned long old_pbase,
+				unsigned long new_pbase)
+{
+	int i;
+	int ret = 0;
+	size_t size;
+	struct kexec_buf kbuf;
+	const struct elf_phdr *phdr;
+
+	kbuf.image = image;
+
+	for (i = 0; i < ehdr->e_phnum; i++) {
+		phdr = &elf_info->proghdrs[i];
+		if (phdr->p_type != PT_LOAD)
+			continue;
+
+		size = phdr->p_filesz;
+		if (size > phdr->p_memsz)
+			size = phdr->p_memsz;
+
+		kbuf.buffer = (void *) elf_info->buffer + phdr->p_offset;
+		kbuf.bufsz = size;
+		kbuf.buf_align = phdr->p_align;
+		kbuf.mem = phdr->p_paddr - old_pbase + new_pbase;
+		kbuf.memsz = phdr->p_memsz;
+		kbuf.top_down = false;
+		ret = kexec_add_buffer(&kbuf);
+		if (ret)
+			break;
+	}
+
+	return ret;
+}
+
+/*
+ * Go through the available phsyical memory regions and find one that hold
+ * an image of the specified size.
+ */
+static int elf_find_pbase(struct kimage *image, unsigned long kernel_len,
+			  struct elfhdr *ehdr, struct kexec_elf_info *elf_info,
+			  unsigned long *old_pbase, unsigned long *new_pbase)
+{
+	int i;
+	int ret;
+	struct kexec_buf kbuf;
+	const struct elf_phdr *phdr;
+	unsigned long lowest_paddr = ULONG_MAX;
+	unsigned long lowest_vaddr = ULONG_MAX;
+
+	for (i = 0; i < ehdr->e_phnum; i++) {
+		phdr = &elf_info->proghdrs[i];
+		if (phdr->p_type != PT_LOAD)
+			continue;
+
+		if (lowest_paddr > phdr->p_paddr)
+			lowest_paddr = phdr->p_paddr;
+
+		if (lowest_vaddr > phdr->p_vaddr)
+			lowest_vaddr = phdr->p_vaddr;
+	}
+
+	kbuf.image = image;
+	kbuf.buf_min = lowest_paddr;
+	kbuf.buf_max = ULONG_MAX;
+
+	/*
+	 * Current riscv boot protocol requires 2MB alignment for
+	 * RV64 and 4MB alignment for RV32
+	 *
+	 */
+	kbuf.buf_align = PMD_SIZE;
+	kbuf.mem = KEXEC_BUF_MEM_UNKNOWN;
+	kbuf.memsz = ALIGN(kernel_len, PAGE_SIZE);
+	kbuf.top_down = false;
+	ret = arch_kexec_locate_mem_hole(&kbuf);
+	if (!ret) {
+		*old_pbase = lowest_paddr;
+		*new_pbase = kbuf.mem;
+		image->start = ehdr->e_entry - lowest_vaddr + kbuf.mem;
+	}
+	return ret;
+}
+
+static void *elf_kexec_load(struct kimage *image, char *kernel_buf,
+			    unsigned long kernel_len, char *initrd,
+			    unsigned long initrd_len, char *cmdline,
+			    unsigned long cmdline_len)
+{
+	int ret;
+	unsigned long old_kernel_pbase = ULONG_MAX;
+	unsigned long new_kernel_pbase = 0UL;
+	struct elfhdr ehdr;
+	struct kexec_elf_info elf_info;
+
+	ret = kexec_build_elf_info(kernel_buf, kernel_len, &ehdr, &elf_info);
+	if (ret)
+		return ERR_PTR(ret);
+
+	ret = elf_find_pbase(image, kernel_len, &ehdr, &elf_info,
+			     &old_kernel_pbase, &new_kernel_pbase);
+	if (ret)
+		goto out;
+
+	/* Add the kernel binary to the image */
+	ret = riscv_kexec_elf_load(image, &ehdr, &elf_info,
+				   old_kernel_pbase, new_kernel_pbase);
+	if (ret)
+		goto out;
+
+	ret = load_extra_segments(image, image->start, kernel_len,
+				  initrd, initrd_len, cmdline, cmdline_len);
+out:
+	kexec_free_elf_info(&elf_info);
+	return ret ? ERR_PTR(ret) : NULL;
+}
+
+const struct kexec_file_ops elf_kexec_ops = {
+	.probe = kexec_elf_probe,
+	.load = elf_kexec_load,
+};
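Two pieces of address arithmetic carry the ELF loader above: each PT_LOAD segment keeps its offset from the lowest physical address when the image is moved to the newly found base, and the entry point is translated from its virtual address by the same kind of rebasing. A minimal sketch with plain integers; segment_dest and entry_phys are hypothetical names, and the sample addresses used to exercise them are made up for illustration:

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors "kbuf.mem = phdr->p_paddr - old_pbase + new_pbase". */
uint64_t segment_dest(uint64_t p_paddr, uint64_t old_pbase, uint64_t new_pbase)
{
	return p_paddr - old_pbase + new_pbase;
}

/* Mirrors "image->start = ehdr->e_entry - lowest_vaddr + kbuf.mem". */
uint64_t entry_phys(uint64_t e_entry, uint64_t lowest_vaddr, uint64_t new_pbase)
{
	return e_entry - lowest_vaddr + new_pbase;
}
```

For example, a segment linked 2 MiB above the old base lands 2 MiB above the new base, and an entry point 0x40 bytes into the lowest virtual address ends up 0x40 bytes into the new physical base.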
+96
arch/riscv/kernel/kexec_image.c
@@ -0,0 +1,96 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * RISC-V Kexec image loader
+ *
+ */
+
+#define pr_fmt(fmt) "kexec_file(Image): " fmt
+
+#include <linux/err.h>
+#include <linux/errno.h>
+#include <linux/kernel.h>
+#include <linux/kexec.h>
+#include <linux/pe.h>
+#include <linux/string.h>
+#include <asm/byteorder.h>
+#include <asm/image.h>
+
+static int image_probe(const char *kernel_buf, unsigned long kernel_len)
+{
+	const struct riscv_image_header *h = (const struct riscv_image_header *)kernel_buf;
+
+	if (!h || kernel_len < sizeof(*h))
+		return -EINVAL;
+
+	/* According to Documentation/riscv/boot-image-header.rst,
+	 * use "magic2" field to check when version >= 0.2.
+	 */
+
+	if (h->version >= RISCV_HEADER_VERSION &&
+	    memcmp(&h->magic2, RISCV_IMAGE_MAGIC2, sizeof(h->magic2)))
+		return -EINVAL;
+
+	return 0;
+}
+
+static void *image_load(struct kimage *image,
+			char *kernel, unsigned long kernel_len,
+			char *initrd, unsigned long initrd_len,
+			char *cmdline, unsigned long cmdline_len)
+{
+	struct riscv_image_header *h;
+	u64 flags;
+	bool be_image, be_kernel;
+	struct kexec_buf kbuf;
+	int ret;
+
+	/* Check Image header */
+	h = (struct riscv_image_header *)kernel;
+	if (!h->image_size) {
+		ret = -EINVAL;
+		goto out;
+	}
+
+	/* Check endianness */
+	flags = le64_to_cpu(h->flags);
+	be_image = riscv_image_flag_field(flags, RISCV_IMAGE_FLAG_BE);
+	be_kernel = IS_ENABLED(CONFIG_CPU_BIG_ENDIAN);
+	if (be_image != be_kernel) {
+		ret = -EINVAL;
+		goto out;
+	}
+
+	/* Load the kernel image */
+	kbuf.image = image;
+	kbuf.buf_min = 0;
+	kbuf.buf_max = ULONG_MAX;
+	kbuf.top_down = false;
+
+	kbuf.buffer = kernel;
+	kbuf.bufsz = kernel_len;
+	kbuf.mem = KEXEC_BUF_MEM_UNKNOWN;
+	kbuf.memsz = le64_to_cpu(h->image_size);
+	kbuf.buf_align = le64_to_cpu(h->text_offset);
+
+	ret = kexec_add_buffer(&kbuf);
+	if (ret) {
+		pr_err("Error add kernel image ret=%d\n", ret);
+		goto out;
+	}
+
+	image->start = kbuf.mem;
+
+	pr_info("Loaded kernel at 0x%lx bufsz=0x%lx memsz=0x%lx\n",
+		kbuf.mem, kbuf.bufsz, kbuf.memsz);
+
+	ret = load_extra_segments(image, kbuf.mem, kbuf.memsz,
+				  initrd, initrd_len, cmdline, cmdline_len);
+
+out:
+	return ret ? ERR_PTR(ret) : NULL;
+}
+
+const struct kexec_file_ops image_kexec_ops = {
+	.probe = image_probe,
+	.load = image_load,
+};
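The probe logic in kexec_image.c above can be tried outside the kernel. The sketch below is not the kernel's code: the struct layout, the "RSC\x05" magic2 value and the 0.2 header version (encoded as 2) are assumptions taken from the RISC-V boot image header documentation, the names image_header_sketch and image_probe_sketch are hypothetical, and a little-endian host is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed layout of the 64-byte RISC-V Image header (little-endian). */
struct image_header_sketch {
	uint32_t code0;
	uint32_t code1;
	uint64_t text_offset;
	uint64_t image_size;
	uint64_t flags;
	uint32_t version;
	uint32_t res1;
	uint64_t res2;
	uint64_t magic;
	uint32_t magic2;
	uint32_t res3;
};

#define HEADER_VERSION_SKETCH 2u	/* assumption: version 0.2 */
#define IMAGE_MAGIC2_SKETCH "RSC\x05"	/* assumption: documented magic2 */

/* Returns 0 if the buffer looks like an Image, -1 otherwise. */
int image_probe_sketch(const void *buf, size_t len)
{
	struct image_header_sketch h;

	if (!buf || len < sizeof(h))
		return -1;

	/* Copy out to avoid unaligned access to the caller's buffer. */
	memcpy(&h, buf, sizeof(h));

	/* As in the diff, magic2 is only checked for version >= 0.2. */
	if (h.version >= HEADER_VERSION_SKETCH &&
	    memcmp(&h.magic2, IMAGE_MAGIC2_SKETCH, sizeof(h.magic2)))
		return -1;

	return 0;
}
```

Like the probe in the diff, this accepts any sufficiently large buffer whose version field is below 0.2, since magic2 is not consulted in that case.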
+361
arch/riscv/kernel/machine_kexec_file.c
··· 7 7 * Author: Liao Chang (liaochang1@huawei.com) 8 8 */ 9 9 #include <linux/kexec.h> 10 + #include <linux/elf.h> 11 + #include <linux/slab.h> 12 + #include <linux/of.h> 13 + #include <linux/libfdt.h> 14 + #include <linux/types.h> 15 + #include <linux/memblock.h> 16 + #include <linux/vmalloc.h> 17 + #include <asm/setup.h> 10 18 11 19 const struct kexec_file_ops * const kexec_file_loaders[] = { 12 20 &elf_kexec_ops, 21 + &image_kexec_ops, 13 22 NULL 14 23 }; 24 + 25 + int arch_kimage_file_post_load_cleanup(struct kimage *image) 26 + { 27 + kvfree(image->arch.fdt); 28 + image->arch.fdt = NULL; 29 + 30 + vfree(image->elf_headers); 31 + image->elf_headers = NULL; 32 + image->elf_headers_sz = 0; 33 + 34 + return kexec_image_post_load_cleanup_default(image); 35 + } 36 + 37 + #ifdef CONFIG_CRASH_DUMP 38 + static int get_nr_ram_ranges_callback(struct resource *res, void *arg) 39 + { 40 + unsigned int *nr_ranges = arg; 41 + 42 + (*nr_ranges)++; 43 + return 0; 44 + } 45 + 46 + static int prepare_elf64_ram_headers_callback(struct resource *res, void *arg) 47 + { 48 + struct crash_mem *cmem = arg; 49 + 50 + cmem->ranges[cmem->nr_ranges].start = res->start; 51 + cmem->ranges[cmem->nr_ranges].end = res->end; 52 + cmem->nr_ranges++; 53 + 54 + return 0; 55 + } 56 + 57 + static int prepare_elf_headers(void **addr, unsigned long *sz) 58 + { 59 + struct crash_mem *cmem; 60 + unsigned int nr_ranges; 61 + int ret; 62 + 63 + nr_ranges = 1; /* For exclusion of crashkernel region */ 64 + walk_system_ram_res(0, -1, &nr_ranges, get_nr_ram_ranges_callback); 65 + 66 + cmem = kmalloc(struct_size(cmem, ranges, nr_ranges), GFP_KERNEL); 67 + if (!cmem) 68 + return -ENOMEM; 69 + 70 + cmem->max_nr_ranges = nr_ranges; 71 + cmem->nr_ranges = 0; 72 + ret = walk_system_ram_res(0, -1, cmem, prepare_elf64_ram_headers_callback); 73 + if (ret) 74 + goto out; 75 + 76 + /* Exclude crashkernel region */ 77 + ret = crash_exclude_mem_range(cmem, crashk_res.start, crashk_res.end); 78 + if (!ret) 79 + ret = 
crash_prepare_elf64_headers(cmem, true, addr, sz); 80 + 81 + out: 82 + kfree(cmem); 83 + return ret; 84 + } 85 + 86 + static char *setup_kdump_cmdline(struct kimage *image, char *cmdline, 87 + unsigned long cmdline_len) 88 + { 89 + int elfcorehdr_strlen; 90 + char *cmdline_ptr; 91 + 92 + cmdline_ptr = kzalloc(COMMAND_LINE_SIZE, GFP_KERNEL); 93 + if (!cmdline_ptr) 94 + return NULL; 95 + 96 + elfcorehdr_strlen = sprintf(cmdline_ptr, "elfcorehdr=0x%lx ", 97 + image->elf_load_addr); 98 + 99 + if (elfcorehdr_strlen + cmdline_len > COMMAND_LINE_SIZE) { 100 + pr_err("Appending elfcorehdr=<addr> exceeds cmdline size\n"); 101 + kfree(cmdline_ptr); 102 + return NULL; 103 + } 104 + 105 + memcpy(cmdline_ptr + elfcorehdr_strlen, cmdline, cmdline_len); 106 + /* Ensure it's nul terminated */ 107 + cmdline_ptr[COMMAND_LINE_SIZE - 1] = '\0'; 108 + return cmdline_ptr; 109 + } 110 + #endif 111 + 112 + #define RV_X(x, s, n) (((x) >> (s)) & ((1 << (n)) - 1)) 113 + #define RISCV_IMM_BITS 12 114 + #define RISCV_IMM_REACH (1LL << RISCV_IMM_BITS) 115 + #define RISCV_CONST_HIGH_PART(x) \ 116 + (((x) + (RISCV_IMM_REACH >> 1)) & ~(RISCV_IMM_REACH - 1)) 117 + #define RISCV_CONST_LOW_PART(x) ((x) - RISCV_CONST_HIGH_PART(x)) 118 + 119 + #define ENCODE_ITYPE_IMM(x) \ 120 + (RV_X(x, 0, 12) << 20) 121 + #define ENCODE_BTYPE_IMM(x) \ 122 + ((RV_X(x, 1, 4) << 8) | (RV_X(x, 5, 6) << 25) | \ 123 + (RV_X(x, 11, 1) << 7) | (RV_X(x, 12, 1) << 31)) 124 + #define ENCODE_UTYPE_IMM(x) \ 125 + (RV_X(x, 12, 20) << 12) 126 + #define ENCODE_JTYPE_IMM(x) \ 127 + ((RV_X(x, 1, 10) << 21) | (RV_X(x, 11, 1) << 20) | \ 128 + (RV_X(x, 12, 8) << 12) | (RV_X(x, 20, 1) << 31)) 129 + #define ENCODE_CBTYPE_IMM(x) \ 130 + ((RV_X(x, 1, 2) << 3) | (RV_X(x, 3, 2) << 10) | (RV_X(x, 5, 1) << 2) | \ 131 + (RV_X(x, 6, 2) << 5) | (RV_X(x, 8, 1) << 12)) 132 + #define ENCODE_CJTYPE_IMM(x) \ 133 + ((RV_X(x, 1, 3) << 3) | (RV_X(x, 4, 1) << 11) | (RV_X(x, 5, 1) << 2) | \ 134 + (RV_X(x, 6, 1) << 7) | (RV_X(x, 7, 1) << 6) | (RV_X(x, 8, 2) 
<< 9) | \ 135 + (RV_X(x, 10, 1) << 8) | (RV_X(x, 11, 1) << 12)) 136 + #define ENCODE_UJTYPE_IMM(x) \ 137 + (ENCODE_UTYPE_IMM(RISCV_CONST_HIGH_PART(x)) | \ 138 + (ENCODE_ITYPE_IMM(RISCV_CONST_LOW_PART(x)) << 32)) 139 + #define ENCODE_UITYPE_IMM(x) \ 140 + (ENCODE_UTYPE_IMM(x) | (ENCODE_ITYPE_IMM(x) << 32)) 141 + 142 + #define CLEAN_IMM(type, x) \ 143 + ((~ENCODE_##type##_IMM((uint64_t)(-1))) & (x)) 144 + 145 + int arch_kexec_apply_relocations_add(struct purgatory_info *pi, 146 + Elf_Shdr *section, 147 + const Elf_Shdr *relsec, 148 + const Elf_Shdr *symtab) 149 + { 150 + const char *strtab, *name, *shstrtab; 151 + const Elf_Shdr *sechdrs; 152 + Elf64_Rela *relas; 153 + int i, r_type; 154 + 155 + /* String & section header string table */ 156 + sechdrs = (void *)pi->ehdr + pi->ehdr->e_shoff; 157 + strtab = (char *)pi->ehdr + sechdrs[symtab->sh_link].sh_offset; 158 + shstrtab = (char *)pi->ehdr + sechdrs[pi->ehdr->e_shstrndx].sh_offset; 159 + 160 + relas = (void *)pi->ehdr + relsec->sh_offset; 161 + 162 + for (i = 0; i < relsec->sh_size / sizeof(*relas); i++) { 163 + const Elf_Sym *sym; /* symbol to relocate */ 164 + unsigned long addr; /* final location after relocation */ 165 + unsigned long val; /* relocated symbol value */ 166 + unsigned long sec_base; /* relocated symbol value */ 167 + void *loc; /* tmp location to modify */ 168 + 169 + sym = (void *)pi->ehdr + symtab->sh_offset; 170 + sym += ELF64_R_SYM(relas[i].r_info); 171 + 172 + if (sym->st_name) 173 + name = strtab + sym->st_name; 174 + else 175 + name = shstrtab + sechdrs[sym->st_shndx].sh_name; 176 + 177 + loc = pi->purgatory_buf; 178 + loc += section->sh_offset; 179 + loc += relas[i].r_offset; 180 + 181 + if (sym->st_shndx == SHN_ABS) 182 + sec_base = 0; 183 + else if (sym->st_shndx >= pi->ehdr->e_shnum) { 184 + pr_err("Invalid section %d for symbol %s\n", 185 + sym->st_shndx, name); 186 + return -ENOEXEC; 187 + } else 188 + sec_base = pi->sechdrs[sym->st_shndx].sh_addr; 189 + 190 + val = sym->st_value; 
191 + val += sec_base; 192 + val += relas[i].r_addend; 193 + 194 + addr = section->sh_addr + relas[i].r_offset; 195 + 196 + r_type = ELF64_R_TYPE(relas[i].r_info); 197 + 198 + switch (r_type) { 199 + case R_RISCV_BRANCH: 200 + *(u32 *)loc = CLEAN_IMM(BTYPE, *(u32 *)loc) | 201 + ENCODE_BTYPE_IMM(val - addr); 202 + break; 203 + case R_RISCV_JAL: 204 + *(u32 *)loc = CLEAN_IMM(JTYPE, *(u32 *)loc) | 205 + ENCODE_JTYPE_IMM(val - addr); 206 + break; 207 + /* 208 + * With no R_RISCV_PCREL_LO12_S, R_RISCV_PCREL_LO12_I 209 + * sym is expected to be next to R_RISCV_PCREL_HI20 210 + * in purgatory relsec. Handle it like R_RISCV_CALL 211 + * sym, instead of searching the whole relsec. 212 + */ 213 + case R_RISCV_PCREL_HI20: 214 + case R_RISCV_CALL_PLT: 215 + case R_RISCV_CALL: 216 + *(u64 *)loc = CLEAN_IMM(UITYPE, *(u64 *)loc) | 217 + ENCODE_UJTYPE_IMM(val - addr); 218 + break; 219 + case R_RISCV_RVC_BRANCH: 220 + *(u32 *)loc = CLEAN_IMM(CBTYPE, *(u32 *)loc) | 221 + ENCODE_CBTYPE_IMM(val - addr); 222 + break; 223 + case R_RISCV_RVC_JUMP: 224 + *(u32 *)loc = CLEAN_IMM(CJTYPE, *(u32 *)loc) | 225 + ENCODE_CJTYPE_IMM(val - addr); 226 + break; 227 + case R_RISCV_ADD16: 228 + *(u16 *)loc += val; 229 + break; 230 + case R_RISCV_SUB16: 231 + *(u16 *)loc -= val; 232 + break; 233 + case R_RISCV_ADD32: 234 + *(u32 *)loc += val; 235 + break; 236 + case R_RISCV_SUB32: 237 + *(u32 *)loc -= val; 238 + break; 239 + /* It has been applied by R_RISCV_PCREL_HI20 sym */ 240 + case R_RISCV_PCREL_LO12_I: 241 + case R_RISCV_ALIGN: 242 + case R_RISCV_RELAX: 243 + break; 244 + case R_RISCV_64: 245 + *(u64 *)loc = val; 246 + break; 247 + default: 248 + pr_err("Unknown rela relocation: %d\n", r_type); 249 + return -ENOEXEC; 250 + } 251 + } 252 + return 0; 253 + } 254 + 255 + 256 + int load_extra_segments(struct kimage *image, unsigned long kernel_start, 257 + unsigned long kernel_len, char *initrd, 258 + unsigned long initrd_len, char *cmdline, 259 + unsigned long cmdline_len) 260 + { 261 + int ret; 262 
+ void *fdt; 263 + unsigned long initrd_pbase = 0UL; 264 + struct kexec_buf kbuf; 265 + char *modified_cmdline = NULL; 266 + 267 + kbuf.image = image; 268 + kbuf.buf_min = kernel_start + kernel_len; 269 + kbuf.buf_max = ULONG_MAX; 270 + 271 + #ifdef CONFIG_CRASH_DUMP 272 + /* Add elfcorehdr */ 273 + if (image->type == KEXEC_TYPE_CRASH) { 274 + void *headers; 275 + unsigned long headers_sz; 276 + ret = prepare_elf_headers(&headers, &headers_sz); 277 + if (ret) { 278 + pr_err("Preparing elf core header failed\n"); 279 + goto out; 280 + } 281 + 282 + kbuf.buffer = headers; 283 + kbuf.bufsz = headers_sz; 284 + kbuf.mem = KEXEC_BUF_MEM_UNKNOWN; 285 + kbuf.memsz = headers_sz; 286 + kbuf.buf_align = ELF_CORE_HEADER_ALIGN; 287 + kbuf.top_down = true; 288 + 289 + ret = kexec_add_buffer(&kbuf); 290 + if (ret) { 291 + vfree(headers); 292 + goto out; 293 + } 294 + image->elf_headers = headers; 295 + image->elf_load_addr = kbuf.mem; 296 + image->elf_headers_sz = headers_sz; 297 + 298 + kexec_dprintk("Loaded elf core header at 0x%lx bufsz=0x%lx memsz=0x%lx\n", 299 + image->elf_load_addr, kbuf.bufsz, kbuf.memsz); 300 + 301 + /* Setup cmdline for kdump kernel case */ 302 + modified_cmdline = setup_kdump_cmdline(image, cmdline, 303 + cmdline_len); 304 + if (!modified_cmdline) { 305 + pr_err("Setting up cmdline for kdump kernel failed\n"); 306 + ret = -EINVAL; 307 + goto out; 308 + } 309 + cmdline = modified_cmdline; 310 + } 311 + #endif 312 + 313 + #ifdef CONFIG_ARCH_SUPPORTS_KEXEC_PURGATORY 314 + /* Add purgatory to the image */ 315 + kbuf.top_down = true; 316 + kbuf.mem = KEXEC_BUF_MEM_UNKNOWN; 317 + ret = kexec_load_purgatory(image, &kbuf); 318 + if (ret) { 319 + pr_err("Error loading purgatory ret=%d\n", ret); 320 + goto out; 321 + } 322 + kexec_dprintk("Loaded purgatory at 0x%lx\n", kbuf.mem); 323 + 324 + ret = kexec_purgatory_get_set_symbol(image, "riscv_kernel_entry", 325 + &kernel_start, 326 + sizeof(kernel_start), 0); 327 + if (ret) 328 + pr_err("Error update purgatory 
ret=%d\n", ret); 329 + #endif /* CONFIG_ARCH_SUPPORTS_KEXEC_PURGATORY */ 330 + 331 + /* Add the initrd to the image */ 332 + if (initrd != NULL) { 333 + kbuf.buffer = initrd; 334 + kbuf.bufsz = kbuf.memsz = initrd_len; 335 + kbuf.buf_align = PAGE_SIZE; 336 + kbuf.top_down = true; 337 + kbuf.mem = KEXEC_BUF_MEM_UNKNOWN; 338 + ret = kexec_add_buffer(&kbuf); 339 + if (ret) 340 + goto out; 341 + initrd_pbase = kbuf.mem; 342 + kexec_dprintk("Loaded initrd at 0x%lx\n", initrd_pbase); 343 + } 344 + 345 + /* Add the DTB to the image */ 346 + fdt = of_kexec_alloc_and_setup_fdt(image, initrd_pbase, 347 + initrd_len, cmdline, 0); 348 + if (!fdt) { 349 + pr_err("Error setting up the new device tree.\n"); 350 + ret = -EINVAL; 351 + goto out; 352 + } 353 + 354 + fdt_pack(fdt); 355 + kbuf.buffer = fdt; 356 + kbuf.bufsz = kbuf.memsz = fdt_totalsize(fdt); 357 + kbuf.buf_align = PAGE_SIZE; 358 + kbuf.mem = KEXEC_BUF_MEM_UNKNOWN; 359 + kbuf.top_down = true; 360 + ret = kexec_add_buffer(&kbuf); 361 + if (ret) { 362 + pr_err("Error add DTB kbuf ret=%d\n", ret); 363 + goto out_free_fdt; 364 + } 365 + /* Cache the fdt buffer address for memory cleanup */ 366 + image->arch.fdt = fdt; 367 + kexec_dprintk("Loaded device tree at 0x%lx\n", kbuf.mem); 368 + goto out; 369 + 370 + out_free_fdt: 371 + kvfree(fdt); 372 + out: 373 + kfree(modified_cmdline); 374 + return ret; 375 + }
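The `RISCV_CONST_HIGH_PART` and `RISCV_CONST_LOW_PART` macros in the hunk above split a PC-relative offset into an upper part (for the U-type instruction) and a signed 12-bit remainder (for the I-type instruction that follows it). A minimal user-space sketch of that split, assuming 64-bit signed offsets; the names `const_high_part` and `const_low_part` are made up for illustration and are not kernel functions:

```c
#include <assert.h>
#include <stdint.h>

#define IMM_BITS  12
#define IMM_REACH ((int64_t)1 << IMM_BITS)	/* 4096 */

/* Upper part: the offset rounded to the nearest multiple of 4096. */
static int64_t const_high_part(int64_t x)
{
	return (x + (IMM_REACH >> 1)) & ~(IMM_REACH - 1);
}

/* Lower part: the signed remainder, always within [-2048, 2047]. */
static int64_t const_low_part(int64_t x)
{
	int64_t lo = x - const_high_part(x);

	/* The remainder must fit a sign-extended 12-bit immediate. */
	assert(lo >= -(IMM_REACH >> 1) && lo < (IMM_REACH >> 1));
	return lo;
}
```

Rounding to the nearest multiple (rather than truncating) matters because the 12-bit immediate is sign-extended: an offset such as `0x800` has to be expressed as `0x1000 + (-0x800)`.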
+68 -49
arch/riscv/kernel/mcount-dyn.S
··· 13 13 14 14 .text 15 15 16 - #define FENTRY_RA_OFFSET 8 17 16 #define ABI_SIZE_ON_STACK 80 18 17 #define ABI_A0 0 19 18 #define ABI_A1 8 ··· 55 56 addi sp, sp, ABI_SIZE_ON_STACK 56 57 .endm 57 58 58 - #ifdef CONFIG_DYNAMIC_FTRACE_WITH_ARGS 59 - 60 59 /** 61 60 * SAVE_ABI_REGS - save regs against the ftrace_regs struct 62 61 * 63 62 * After the stack is established, 64 63 * 65 64 * 0(sp) stores the PC of the traced function which can be accessed 66 - * by &(fregs)->epc in tracing function. Note that the real 67 - * function entry address should be computed with -FENTRY_RA_OFFSET. 65 + * by &(fregs)->epc in tracing function. 68 66 * 69 67 * 8(sp) stores the function return address (i.e. parent IP) that 70 68 * can be accessed by &(fregs)->ra in tracing function. ··· 82 86 * +++++++++ 83 87 **/ 84 88 .macro SAVE_ABI_REGS 85 - mv t4, sp // Save original SP in T4 86 89 addi sp, sp, -FREGS_SIZE_ON_STACK 87 - 88 90 REG_S t0, FREGS_EPC(sp) 89 91 REG_S x1, FREGS_RA(sp) 90 - REG_S t4, FREGS_SP(sp) // Put original SP on stack 91 92 #ifdef HAVE_FUNCTION_GRAPH_FP_TEST 92 93 REG_S x8, FREGS_S0(sp) 93 94 #endif 94 95 REG_S x6, FREGS_T1(sp) 95 - 96 + #ifdef CONFIG_CC_IS_CLANG 97 + REG_S x7, FREGS_T2(sp) 98 + REG_S x28, FREGS_T3(sp) 99 + REG_S x29, FREGS_T4(sp) 100 + REG_S x30, FREGS_T5(sp) 101 + REG_S x31, FREGS_T6(sp) 102 + #endif 96 103 // save the arguments 97 104 REG_S x10, FREGS_A0(sp) 98 105 REG_S x11, FREGS_A1(sp) ··· 105 106 REG_S x15, FREGS_A5(sp) 106 107 REG_S x16, FREGS_A6(sp) 107 108 REG_S x17, FREGS_A7(sp) 109 + mv a0, sp 110 + addi a0, a0, FREGS_SIZE_ON_STACK 111 + REG_S a0, FREGS_SP(sp) // Put original SP on stack 108 112 .endm 109 113 110 - .macro RESTORE_ABI_REGS, all=0 114 + .macro RESTORE_ABI_REGS 111 115 REG_L t0, FREGS_EPC(sp) 112 116 REG_L x1, FREGS_RA(sp) 113 117 #ifdef HAVE_FUNCTION_GRAPH_FP_TEST 114 118 REG_L x8, FREGS_S0(sp) 115 119 #endif 116 120 REG_L x6, FREGS_T1(sp) 117 - 121 + #ifdef CONFIG_CC_IS_CLANG 122 + REG_L x7, FREGS_T2(sp) 123 + REG_L 
x28, FREGS_T3(sp) 124 + REG_L x29, FREGS_T4(sp) 125 + REG_L x30, FREGS_T5(sp) 126 + REG_L x31, FREGS_T6(sp) 127 + #endif 118 128 // restore the arguments 119 129 REG_L x10, FREGS_A0(sp) 120 130 REG_L x11, FREGS_A1(sp) ··· 138 130 .endm 139 131 140 132 .macro PREPARE_ARGS 141 - addi a0, t0, -FENTRY_RA_OFFSET 133 + addi a0, t0, -MCOUNT_JALR_SIZE // ip (callsite's jalr insn) 134 + #ifdef CONFIG_DYNAMIC_FTRACE_WITH_CALL_OPS 135 + mv a1, ra // parent_ip 136 + REG_L a2, -16(t0) // op 137 + REG_L ra, FTRACE_OPS_FUNC(a2) // op->func 138 + #else 142 139 la a1, function_trace_op 143 - REG_L a2, 0(a1) 144 - mv a1, ra 145 - mv a3, sp 140 + REG_L a2, 0(a1) // op 141 + mv a1, ra // parent_ip 142 + #endif 143 + mv a3, sp // regs 146 144 .endm 147 145 148 - #endif /* CONFIG_DYNAMIC_FTRACE_WITH_ARGS */ 149 - 150 - #ifndef CONFIG_DYNAMIC_FTRACE_WITH_ARGS 151 146 SYM_FUNC_START(ftrace_caller) 152 - SAVE_ABI 147 + #ifdef CONFIG_DYNAMIC_FTRACE_WITH_CALL_OPS 148 + /* 149 + * When CALL_OPS is enabled (2 or 4) nops [8B] are placed before the 150 + * function entry, these are later overwritten with the pointer to the 151 + * associated struct ftrace_ops. 152 + * 153 + * -8: &ftrace_ops of the associated tracer function. 154 + *<ftrace enable>: 155 + * 0: auipc t0/ra, 0x? 156 + * 4: jalr t0/ra, ?(t0/ra) 157 + * 158 + * -8: &ftrace_nop_ops 159 + *<ftrace disable>: 160 + * 0: nop 161 + * 4: nop 162 + * 163 + * t0 is set to ip+8 after the jalr is executed at the callsite, 164 + * so we find the associated op at t0-16. 
165 + */ 166 + REG_L t1, -16(t0) // op Should be SZ_REG instead of 16 153 167 154 - addi a0, t0, -FENTRY_RA_OFFSET 155 - la a1, function_trace_op 156 - REG_L a2, 0(a1) 157 - mv a1, ra 158 - mv a3, sp 159 - 160 - SYM_INNER_LABEL(ftrace_call, SYM_L_GLOBAL) 161 - call ftrace_stub 162 - 163 - #ifdef CONFIG_FUNCTION_GRAPH_TRACER 164 - addi a0, sp, ABI_RA 165 - REG_L a1, ABI_T0(sp) 166 - addi a1, a1, -FENTRY_RA_OFFSET 167 - #ifdef HAVE_FUNCTION_GRAPH_FP_TEST 168 - mv a2, s0 168 + #ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS 169 + /* 170 + * If the op has a direct call, handle it immediately without 171 + * saving/restoring registers. 172 + */ 173 + REG_L t1, FTRACE_OPS_DIRECT_CALL(t1) 174 + bnez t1, ftrace_caller_direct 169 175 #endif 170 - SYM_INNER_LABEL(ftrace_graph_call, SYM_L_GLOBAL) 171 - call ftrace_stub 172 176 #endif 173 - RESTORE_ABI 174 - jr t0 175 - SYM_FUNC_END(ftrace_caller) 176 - 177 - #else /* CONFIG_DYNAMIC_FTRACE_WITH_ARGS */ 178 - SYM_FUNC_START(ftrace_caller) 179 - mv t1, zero 180 177 SAVE_ABI_REGS 181 178 PREPARE_ARGS 182 179 180 + #ifdef CONFIG_DYNAMIC_FTRACE_WITH_CALL_OPS 181 + jalr ra 182 + #else 183 183 SYM_INNER_LABEL(ftrace_call, SYM_L_GLOBAL) 184 - call ftrace_stub 185 - 184 + REG_L ra, ftrace_call_dest 185 + jalr ra, 0(ra) 186 + #endif 186 187 RESTORE_ABI_REGS 187 - bnez t1, .Ldirect 188 + #ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS 189 + bnez t1, ftrace_caller_direct 190 + #endif 188 191 jr t0 189 - .Ldirect: 192 + #ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS 193 + SYM_INNER_LABEL(ftrace_caller_direct, SYM_L_LOCAL) 190 194 jr t1 195 + #endif 191 196 SYM_FUNC_END(ftrace_caller) 192 - 193 - #endif /* CONFIG_DYNAMIC_FTRACE_WITH_ARGS */ 194 197 195 198 #ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS 196 199 SYM_CODE_START(ftrace_stub_direct_tramp)
+65 -16
arch/riscv/kernel/module-sections.c
··· 9 9 #include <linux/kernel.h> 10 10 #include <linux/module.h> 11 11 #include <linux/moduleloader.h> 12 + #include <linux/sort.h> 12 13 13 14 unsigned long module_emit_got_entry(struct module *mod, unsigned long val) 14 15 { ··· 56 55 return (unsigned long)&plt[i]; 57 56 } 58 57 59 - static int is_rela_equal(const Elf_Rela *x, const Elf_Rela *y) 58 + #define cmp_3way(a, b) ((a) < (b) ? -1 : (a) > (b)) 59 + 60 + static int cmp_rela(const void *a, const void *b) 60 61 { 61 - return x->r_info == y->r_info && x->r_addend == y->r_addend; 62 + const Elf_Rela *x = a, *y = b; 63 + int i; 64 + 65 + /* sort by type, symbol index and addend */ 66 + i = cmp_3way(x->r_info, y->r_info); 67 + if (i == 0) 68 + i = cmp_3way(x->r_addend, y->r_addend); 69 + return i; 62 70 } 63 71 64 72 static bool duplicate_rela(const Elf_Rela *rela, int idx) 65 73 { 66 - int i; 67 - for (i = 0; i < idx; i++) { 68 - if (is_rela_equal(&rela[i], &rela[idx])) 69 - return true; 70 - } 71 - return false; 74 + /* 75 + * Entries are sorted by type, symbol index and addend. That means 76 + * that, if a duplicate entry exists, it must be in the preceding slot. 
77 + */ 78 + return idx > 0 && cmp_rela(rela + idx, rela + idx - 1) == 0; 72 79 } 73 80 74 - static void count_max_entries(Elf_Rela *relas, int num, 81 + static void count_max_entries(const Elf_Rela *relas, size_t num, 75 82 unsigned int *plts, unsigned int *gots) 76 83 { 77 - for (int i = 0; i < num; i++) { 84 + for (size_t i = 0; i < num; i++) { 85 + if (duplicate_rela(relas, i)) 86 + continue; 87 + 78 88 switch (ELF_R_TYPE(relas[i].r_info)) { 79 89 case R_RISCV_CALL_PLT: 80 90 case R_RISCV_PLT32: 81 - if (!duplicate_rela(relas, i)) 82 - (*plts)++; 91 + (*plts)++; 83 92 break; 84 93 case R_RISCV_GOT_HI20: 85 - if (!duplicate_rela(relas, i)) 86 - (*gots)++; 94 + (*gots)++; 87 95 break; 96 + default: 97 + unreachable(); 88 98 } 99 + } 100 + } 101 + 102 + static bool rela_needs_plt_got_entry(const Elf_Rela *rela) 103 + { 104 + switch (ELF_R_TYPE(rela->r_info)) { 105 + case R_RISCV_CALL_PLT: 106 + case R_RISCV_GOT_HI20: 107 + case R_RISCV_PLT32: 108 + return true; 109 + default: 110 + return false; 89 111 } 90 112 } 91 113 92 114 int module_frob_arch_sections(Elf_Ehdr *ehdr, Elf_Shdr *sechdrs, 93 115 char *secstrings, struct module *mod) 94 116 { 117 + size_t num_scratch_relas = 0; 95 118 unsigned int num_plts = 0; 96 119 unsigned int num_gots = 0; 120 + Elf_Rela *scratch = NULL; 121 + size_t scratch_size = 0; 97 122 int i; 98 123 99 124 /* ··· 149 122 150 123 /* Calculate the maxinum number of entries */ 151 124 for (i = 0; i < ehdr->e_shnum; i++) { 125 + size_t num_relas = sechdrs[i].sh_size / sizeof(Elf_Rela); 152 126 Elf_Rela *relas = (void *)ehdr + sechdrs[i].sh_offset; 153 - int num_rela = sechdrs[i].sh_size / sizeof(Elf_Rela); 154 127 Elf_Shdr *dst_sec = sechdrs + sechdrs[i].sh_info; 128 + size_t scratch_size_needed; 155 129 156 130 if (sechdrs[i].sh_type != SHT_RELA) 157 131 continue; ··· 161 133 if (!(dst_sec->sh_flags & SHF_EXECINSTR)) 162 134 continue; 163 135 164 - count_max_entries(relas, num_rela, &num_plts, &num_gots); 136 + /* 137 + * 
apply_relocate_add() relies on HI20 and LO12 relocation pairs being 138 + * close together, so sort a copy of the section to avoid interfering. 139 + */ 140 + scratch_size_needed = (num_scratch_relas + num_relas) * sizeof(*scratch); 141 + if (scratch_size_needed > scratch_size) { 142 + scratch_size = scratch_size_needed; 143 + scratch = kvrealloc(scratch, scratch_size, GFP_KERNEL); 144 + if (!scratch) 145 + return -ENOMEM; 146 + } 147 + 148 + for (size_t j = 0; j < num_relas; j++) 149 + if (rela_needs_plt_got_entry(&relas[j])) 150 + scratch[num_scratch_relas++] = relas[j]; 151 + } 152 + 153 + if (scratch) { 154 + /* sort the accumulated PLT/GOT relocations so duplicates are adjacent */ 155 + sort(scratch, num_scratch_relas, sizeof(*scratch), cmp_rela, NULL); 156 + count_max_entries(scratch, num_scratch_relas, &num_plts, &num_gots); 157 + kvfree(scratch); 165 158 } 166 159 167 160 mod->arch.plt.shdr->sh_type = SHT_NOBITS;
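The change above replaces a quadratic "scan all earlier entries" duplicate check with "sort a copy, then compare each entry with its predecessor". A user-space sketch of the same idea, assuming a simplified `struct rela` stand-in (hypothetical, only the two compared fields) and libc `qsort()` in place of the kernel's `sort()`:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for Elf64_Rela: only the fields that are compared. */
struct rela {
	uint64_t info;
	int64_t addend;
};

#define CMP_3WAY(a, b) ((a) < (b) ? -1 : (a) > (b))

/* Order by info first, then by addend, as cmp_rela() does in the diff. */
static int cmp_rela(const void *a, const void *b)
{
	const struct rela *x = a, *y = b;
	int i = CMP_3WAY(x->info, y->info);

	if (i == 0)
		i = CMP_3WAY(x->addend, y->addend);
	return i;
}

/* Sort in place, then count entries that differ from their predecessor. */
static size_t count_unique(struct rela *r, size_t n)
{
	size_t unique = 0;

	assert(r != NULL || n == 0);
	if (n == 0)
		return 0;

	qsort(r, n, sizeof(*r), cmp_rela);
	for (size_t i = 0; i < n; i++)
		if (i == 0 || cmp_rela(&r[i], &r[i - 1]) != 0)
			unique++;
	return unique;
}
```

After sorting, equal entries are adjacent, so one comparison per entry is enough; the cost is dominated by the sort instead of growing with the square of the number of relocations.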
+1 -1
arch/riscv/kernel/process.c
···
 	if (!unaligned_ctl_available())
 		return -EINVAL;

-	return put_user(tsk->thread.align_ctl, (unsigned long __user *)adr);
+	return put_user(tsk->thread.align_ctl, (unsigned int __user *)adr);
 }

 void __show_regs(struct pt_regs *regs)
+78 -3
arch/riscv/kernel/sbi.c
··· 299 299 return 0; 300 300 } 301 301 302 + static bool sbi_fwft_supported; 303 + 304 + struct fwft_set_req { 305 + u32 feature; 306 + unsigned long value; 307 + unsigned long flags; 308 + atomic_t error; 309 + }; 310 + 311 + static void cpu_sbi_fwft_set(void *arg) 312 + { 313 + struct fwft_set_req *req = arg; 314 + int ret; 315 + 316 + ret = sbi_fwft_set(req->feature, req->value, req->flags); 317 + if (ret) 318 + atomic_set(&req->error, ret); 319 + } 320 + 321 + /** 322 + * sbi_fwft_set() - Set a feature on the local hart 323 + * @feature: The feature ID to be set 324 + * @value: The feature value to be set 325 + * @flags: FWFT feature set flags 326 + * 327 + * Return: 0 on success, appropriate linux error code otherwise. 328 + */ 329 + int sbi_fwft_set(u32 feature, unsigned long value, unsigned long flags) 330 + { 331 + struct sbiret ret; 332 + 333 + if (!sbi_fwft_supported) 334 + return -EOPNOTSUPP; 335 + 336 + ret = sbi_ecall(SBI_EXT_FWFT, SBI_EXT_FWFT_SET, 337 + feature, value, flags, 0, 0, 0); 338 + 339 + return sbi_err_map_linux_errno(ret.error); 340 + } 341 + 342 + /** 343 + * sbi_fwft_set_cpumask() - Set a feature for the specified cpumask 344 + * @mask: CPU mask of cpus that need the feature to be set 345 + * @feature: The feature ID to be set 346 + * @value: The feature value to be set 347 + * @flags: FWFT feature set flags 348 + * 349 + * Return: 0 on success, appropriate linux error code otherwise. 
350 + */ 351 + int sbi_fwft_set_cpumask(const cpumask_t *mask, u32 feature, 352 + unsigned long value, unsigned long flags) 353 + { 354 + struct fwft_set_req req = { 355 + .feature = feature, 356 + .value = value, 357 + .flags = flags, 358 + .error = ATOMIC_INIT(0), 359 + }; 360 + 361 + if (!sbi_fwft_supported) 362 + return -EOPNOTSUPP; 363 + 364 + if (feature & SBI_FWFT_GLOBAL_FEATURE_BIT) 365 + return -EINVAL; 366 + 367 + on_each_cpu_mask(mask, cpu_sbi_fwft_set, &req, 1); 368 + 369 + return atomic_read(&req.error); 370 + } 371 + 302 372 /** 303 373 * sbi_set_timer() - Program the timer for next timer event. 304 374 * @stime_value: The value after which next timer event should fire. ··· 679 609 } else { 680 610 __sbi_rfence = __sbi_rfence_v01; 681 611 } 682 - if ((sbi_spec_version >= sbi_mk_version(0, 3)) && 612 + if (sbi_spec_version >= sbi_mk_version(0, 3) && 683 613 sbi_probe_extension(SBI_EXT_SRST)) { 684 614 pr_info("SBI SRST extension detected\n"); 685 615 pm_power_off = sbi_srst_power_off; ··· 687 617 sbi_srst_reboot_nb.priority = 192; 688 618 register_restart_handler(&sbi_srst_reboot_nb); 689 619 } 690 - if ((sbi_spec_version >= sbi_mk_version(2, 0)) && 691 - (sbi_probe_extension(SBI_EXT_DBCN) > 0)) { 620 + if (sbi_spec_version >= sbi_mk_version(2, 0) && 621 + sbi_probe_extension(SBI_EXT_DBCN) > 0) { 692 622 pr_info("SBI DBCN extension detected\n"); 693 623 sbi_debug_console_available = true; 624 + } 625 + if (sbi_spec_version >= sbi_mk_version(3, 0) && 626 + sbi_probe_extension(SBI_EXT_FWFT)) { 627 + pr_info("SBI FWFT extension detected\n"); 628 + sbi_fwft_supported = true; 694 629 } 695 630 } else { 696 631 __sbi_set_timer = __sbi_set_timer_v01;
+6
arch/riscv/kernel/sys_hwprobe.c
···
 #include <asm/uaccess.h>
 #include <asm/unistd.h>
 #include <asm/vector.h>
+#include <asm/vendor_extensions/sifive_hwprobe.h>
 #include <asm/vendor_extensions/thead_hwprobe.h>
 #include <vdso/vsyscall.h>

···
 		 * presence in the hart_isa bitmap, are made.
 		 */
 		EXT_KEY(ZAAMO);
+		EXT_KEY(ZABHA);
 		EXT_KEY(ZACAS);
 		EXT_KEY(ZALRSC);
 		EXT_KEY(ZAWRS);
···

 	case RISCV_HWPROBE_KEY_TIME_CSR_FREQ:
 		pair->value = riscv_timebase;
+		break;
+
+	case RISCV_HWPROBE_KEY_VENDOR_EXT_SIFIVE_0:
+		hwprobe_isa_vendor_ext_sifive_0(pair, cpus);
 		break;

 	case RISCV_HWPROBE_KEY_VENDOR_EXT_THEAD_0:
+102 -14
arch/riscv/kernel/traps_misaligned.c
··· 16 16 #include <asm/entry-common.h> 17 17 #include <asm/hwprobe.h> 18 18 #include <asm/cpufeature.h> 19 + #include <asm/sbi.h> 19 20 #include <asm/vector.h> 20 21 21 22 #define INSN_MATCH_LB 0x3 ··· 369 368 370 369 perf_sw_event(PERF_COUNT_SW_ALIGNMENT_FAULTS, 1, regs, addr); 371 370 372 - #ifdef CONFIG_RISCV_PROBE_UNALIGNED_ACCESS 373 371 *this_cpu_ptr(&misaligned_access_speed) = RISCV_HWPROBE_MISALIGNED_SCALAR_EMULATED; 374 - #endif 375 372 376 373 if (!unaligned_enabled) 377 374 return -1; ··· 454 455 455 456 val.data_u64 = 0; 456 457 if (user_mode(regs)) { 457 - if (copy_from_user(&val, (u8 __user *)addr, len)) 458 + if (copy_from_user_nofault(&val, (u8 __user *)addr, len)) 458 459 return -1; 459 460 } else { 460 461 memcpy(&val, (u8 *)addr, len); ··· 555 556 return -EOPNOTSUPP; 556 557 557 558 if (user_mode(regs)) { 558 - if (copy_to_user((u8 __user *)addr, &val, len)) 559 + if (copy_to_user_nofault((u8 __user *)addr, &val, len)) 559 560 return -1; 560 561 } else { 561 562 memcpy((u8 *)addr, &val, len); ··· 625 626 { 626 627 int cpu; 627 628 629 + /* 630 + * While being documented as very slow, schedule_on_each_cpu() is used since 631 + * kernel_vector_begin() expects irqs to be enabled or it will panic() 632 + */ 628 633 schedule_on_each_cpu(check_vector_unaligned_access_emulated); 629 634 630 635 for_each_online_cpu(cpu) ··· 645 642 } 646 643 #endif 647 644 645 + static bool all_cpus_unaligned_scalar_access_emulated(void) 646 + { 647 + int cpu; 648 + 649 + for_each_online_cpu(cpu) 650 + if (per_cpu(misaligned_access_speed, cpu) != 651 + RISCV_HWPROBE_MISALIGNED_SCALAR_EMULATED) 652 + return false; 653 + 654 + return true; 655 + } 656 + 648 657 #ifdef CONFIG_RISCV_SCALAR_MISALIGNED 649 658 650 659 static bool unaligned_ctl __read_mostly; 651 660 652 - void check_unaligned_access_emulated(struct work_struct *work __always_unused) 661 + static void check_unaligned_access_emulated(void *arg __always_unused) 653 662 { 654 663 int cpu = smp_processor_id(); 655 
664 long *mas_ptr = per_cpu_ptr(&misaligned_access_speed, cpu); ··· 672 657 __asm__ __volatile__ ( 673 658 " "REG_L" %[tmp], 1(%[ptr])\n" 674 659 : [tmp] "=r" (tmp_val) : [ptr] "r" (&tmp_var) : "memory"); 660 + } 661 + 662 + static int cpu_online_check_unaligned_access_emulated(unsigned int cpu) 663 + { 664 + long *mas_ptr = per_cpu_ptr(&misaligned_access_speed, cpu); 665 + 666 + check_unaligned_access_emulated(NULL); 675 667 676 668 /* 677 669 * If unaligned_ctl is already set, this means that we detected that all ··· 687 665 */ 688 666 if (unlikely(unaligned_ctl && (*mas_ptr != RISCV_HWPROBE_MISALIGNED_SCALAR_EMULATED))) { 689 667 pr_crit("CPU misaligned accesses non homogeneous (expected all emulated)\n"); 690 - while (true) 691 - cpu_relax(); 668 + return -EINVAL; 692 669 } 670 + 671 + return 0; 693 672 } 694 673 695 674 bool __init check_unaligned_access_emulated_all_cpus(void) 696 675 { 697 - int cpu; 698 - 699 676 /* 700 677 * We can only support PR_UNALIGN controls if all CPUs have misaligned 701 678 * accesses emulated since tasks requesting such control can run on any 702 679 * CPU. 
703 680 */ 704 - schedule_on_each_cpu(check_unaligned_access_emulated); 681 + on_each_cpu(check_unaligned_access_emulated, NULL, 1); 705 682 706 - for_each_online_cpu(cpu) 707 - if (per_cpu(misaligned_access_speed, cpu) 708 - != RISCV_HWPROBE_MISALIGNED_SCALAR_EMULATED) 709 - return false; 683 + if (!all_cpus_unaligned_scalar_access_emulated()) 684 + return false; 710 685 711 686 unaligned_ctl = true; 712 687 return true; ··· 718 699 { 719 700 return false; 720 701 } 702 + static int cpu_online_check_unaligned_access_emulated(unsigned int cpu) 703 + { 704 + return 0; 705 + } 721 706 #endif 707 + 708 + static bool misaligned_traps_delegated; 709 + 710 + #ifdef CONFIG_RISCV_SBI 711 + 712 + static int cpu_online_sbi_unaligned_setup(unsigned int cpu) 713 + { 714 + if (sbi_fwft_set(SBI_FWFT_MISALIGNED_EXC_DELEG, 1, 0) && 715 + misaligned_traps_delegated) { 716 + pr_crit("Misaligned trap delegation non homogeneous (expected delegated)"); 717 + return -EINVAL; 718 + } 719 + 720 + return 0; 721 + } 722 + 723 + void __init unaligned_access_init(void) 724 + { 725 + int ret; 726 + 727 + ret = sbi_fwft_set_online_cpus(SBI_FWFT_MISALIGNED_EXC_DELEG, 1, 0); 728 + if (ret) 729 + return; 730 + 731 + misaligned_traps_delegated = true; 732 + pr_info("SBI misaligned access exception delegation ok\n"); 733 + /* 734 + * Note that we don't have to take any specific action here, if 735 + * the delegation is successful, then 736 + * check_unaligned_access_emulated() will verify that indeed the 737 + * platform traps on misaligned accesses. 
738 + */ 739 + } 740 + #else 741 + void __init unaligned_access_init(void) {} 742 + 743 + static int cpu_online_sbi_unaligned_setup(unsigned int cpu __always_unused) 744 + { 745 + return 0; 746 + } 747 + 748 + #endif 749 + 750 + int cpu_online_unaligned_access_init(unsigned int cpu) 751 + { 752 + int ret; 753 + 754 + ret = cpu_online_sbi_unaligned_setup(cpu); 755 + if (ret) 756 + return ret; 757 + 758 + return cpu_online_check_unaligned_access_emulated(cpu); 759 + } 760 + 761 + bool misaligned_traps_can_delegate(void) 762 + { 763 + /* 764 + * Either we successfully requested misaligned traps delegation for all 765 + * CPUs, or the SBI does not implement the FWFT extension but delegated 766 + * the exception by default. 767 + */ 768 + return misaligned_traps_delegated || 769 + all_cpus_unaligned_scalar_access_emulated(); 770 + } 771 + EXPORT_SYMBOL_GPL(misaligned_traps_can_delegate);
+7 -1
arch/riscv/kernel/unaligned_access_speed.c
···

 static int riscv_online_cpu(unsigned int cpu)
 {
+	int ret = cpu_online_unaligned_access_init(cpu);
+
+	if (ret)
+		return ret;
+
 	/* We are already set since the last check */
 	if (per_cpu(misaligned_access_speed, cpu) != RISCV_HWPROBE_MISALIGNED_SCALAR_UNKNOWN) {
 		goto exit;
···
 {
 	static struct page *buf;

-	check_unaligned_access_emulated(NULL);
 	buf = alloc_pages(GFP_KERNEL, MISALIGNED_BUFFER_ORDER);
 	if (!buf) {
 		pr_warn("Allocation failure, not measuring misaligned performance\n");
···
 static int __init check_unaligned_access_all_cpus(void)
 {
 	int cpu;
+
+	unaligned_access_init();

 	if (unaligned_scalar_speed_param != RISCV_HWPROBE_MISALIGNED_SCALAR_UNKNOWN) {
 		pr_info("scalar unaligned access speed set to '%s' (%lu) by command line\n",
+1 -1
arch/riscv/kernel/vdso.c
···

 	ret =
 	   _install_special_mapping(mm, vdso_base, vdso_text_len,
-		(VM_READ | VM_EXEC | VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC),
+		(VM_READ | VM_EXEC | VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC | VM_SEALED_SYSMAP),
 		vdso_info->cm);

 	if (IS_ERR(ret))
+14 -1
arch/riscv/kernel/vdso/Makefile
···
 vdso-syms += hwprobe
 vdso-syms += sys_hwprobe

+ifdef CONFIG_VDSO_GETRANDOM
+vdso-syms += getrandom
+endif
+
 # Files to link into the vdso
 obj-vdso = $(patsubst %, %.o, $(vdso-syms)) note.o
+
+ifdef CONFIG_VDSO_GETRANDOM
+obj-vdso += vgetrandom-chacha.o
+endif

 ccflags-y := -fno-stack-protector
 ccflags-y += -DDISABLE_BRANCH_PROFILING
···

 ifneq ($(c-gettimeofday-y),)
   CFLAGS_vgettimeofday.o += -fPIC -include $(c-gettimeofday-y)
+endif
+
+ifneq ($(c-getrandom-y),)
+  CFLAGS_getrandom.o += -fPIC -include $(c-getrandom-y)
 endif

 CFLAGS_hwprobe.o += -fPIC
···

 # Disable -pg to prevent insert call site
 CFLAGS_REMOVE_vgettimeofday.o = $(CC_FLAGS_FTRACE) $(CC_FLAGS_SCS)
+CFLAGS_REMOVE_getrandom.o = $(CC_FLAGS_FTRACE) $(CC_FLAGS_SCS)
 CFLAGS_REMOVE_hwprobe.o = $(CC_FLAGS_FTRACE) $(CC_FLAGS_SCS)

 # Force dependency
···
 $(obj)/vdso.so.dbg: $(obj)/vdso.lds $(obj-vdso) FORCE
 	$(call if_changed,vdsold_and_check)
 LDFLAGS_vdso.so.dbg = -shared -soname=linux-vdso.so.1 \
-	--build-id=sha1 --hash-style=both --eh-frame-hdr
+	--build-id=sha1 --eh-frame-hdr

 # strip rule for the .so file
 $(obj)/%.so: OBJCOPYFLAGS := -S
+10
arch/riscv/kernel/vdso/getrandom.c
···
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (C) 2025 Xi Ruoyao <xry111@xry111.site>. All Rights Reserved.
+ */
+#include <linux/types.h>
+
+ssize_t __vdso_getrandom(void *buffer, size_t len, unsigned int flags, void *opaque_state, size_t opaque_len)
+{
+	return __cvdso_getrandom(buffer, len, flags, opaque_state, opaque_len);
+}
+3
arch/riscv/kernel/vdso/vdso.lds.S
···
 #ifndef COMPAT_VDSO
 		__vdso_riscv_hwprobe;
 #endif
+#if defined(CONFIG_VDSO_GETRANDOM) && !defined(COMPAT_VDSO)
+		__vdso_getrandom;
+#endif
 	local: *;
 	};
 }
+249
arch/riscv/kernel/vdso/vgetrandom-chacha.S
··· 1 + /* SPDX-License-Identifier: GPL-2.0 */ 2 + /* 3 + * Copyright (C) 2025 Xi Ruoyao <xry111@xry111.site>. All Rights Reserved. 4 + * 5 + * Based on arch/loongarch/vdso/vgetrandom-chacha.S. 6 + */ 7 + 8 + #include <asm/asm.h> 9 + #include <linux/linkage.h> 10 + 11 + .text 12 + 13 + .macro ROTRI rd rs imm 14 + slliw t0, \rs, 32 - \imm 15 + srliw \rd, \rs, \imm 16 + or \rd, \rd, t0 17 + .endm 18 + 19 + .macro OP_4REG op d0 d1 d2 d3 s0 s1 s2 s3 20 + \op \d0, \d0, \s0 21 + \op \d1, \d1, \s1 22 + \op \d2, \d2, \s2 23 + \op \d3, \d3, \s3 24 + .endm 25 + 26 + /* 27 + * a0: output bytes 28 + * a1: 32-byte key input 29 + * a2: 8-byte counter input/output 30 + * a3: number of 64-byte blocks to write to output 31 + */ 32 + SYM_FUNC_START(__arch_chacha20_blocks_nostack) 33 + 34 + #define output a0 35 + #define key a1 36 + #define counter a2 37 + #define nblocks a3 38 + #define i a4 39 + #define state0 s0 40 + #define state1 s1 41 + #define state2 s2 42 + #define state3 s3 43 + #define state4 s4 44 + #define state5 s5 45 + #define state6 s6 46 + #define state7 s7 47 + #define state8 s8 48 + #define state9 s9 49 + #define state10 s10 50 + #define state11 s11 51 + #define state12 a5 52 + #define state13 a6 53 + #define state14 a7 54 + #define state15 t1 55 + #define cnt t2 56 + #define copy0 t3 57 + #define copy1 t4 58 + #define copy2 t5 59 + #define copy3 t6 60 + 61 + /* Packs to be used with OP_4REG */ 62 + #define line0 state0, state1, state2, state3 63 + #define line1 state4, state5, state6, state7 64 + #define line2 state8, state9, state10, state11 65 + #define line3 state12, state13, state14, state15 66 + 67 + #define line1_perm state5, state6, state7, state4 68 + #define line2_perm state10, state11, state8, state9 69 + #define line3_perm state15, state12, state13, state14 70 + 71 + #define copy copy0, copy1, copy2, copy3 72 + 73 + #define _16 16, 16, 16, 16 74 + #define _20 20, 20, 20, 20 75 + #define _24 24, 24, 24, 24 76 + #define _25 25, 25, 25, 25 77 + 78 + /* 79 + 
* The ABI requires s0-s9 saved. 80 + * This does not violate the stack-less requirement: no sensitive data 81 + * is spilled onto the stack. 82 + */ 83 + addi sp, sp, -12*SZREG 84 + REG_S s0, (sp) 85 + REG_S s1, SZREG(sp) 86 + REG_S s2, 2*SZREG(sp) 87 + REG_S s3, 3*SZREG(sp) 88 + REG_S s4, 4*SZREG(sp) 89 + REG_S s5, 5*SZREG(sp) 90 + REG_S s6, 6*SZREG(sp) 91 + REG_S s7, 7*SZREG(sp) 92 + REG_S s8, 8*SZREG(sp) 93 + REG_S s9, 9*SZREG(sp) 94 + REG_S s10, 10*SZREG(sp) 95 + REG_S s11, 11*SZREG(sp) 96 + 97 + ld cnt, (counter) 98 + 99 + li copy0, 0x61707865 100 + li copy1, 0x3320646e 101 + li copy2, 0x79622d32 102 + li copy3, 0x6b206574 103 + 104 + .Lblock: 105 + /* state[0,1,2,3] = "expand 32-byte k" */ 106 + mv state0, copy0 107 + mv state1, copy1 108 + mv state2, copy2 109 + mv state3, copy3 110 + 111 + /* state[4,5,..,11] = key */ 112 + lw state4, (key) 113 + lw state5, 4(key) 114 + lw state6, 8(key) 115 + lw state7, 12(key) 116 + lw state8, 16(key) 117 + lw state9, 20(key) 118 + lw state10, 24(key) 119 + lw state11, 28(key) 120 + 121 + /* state[12,13] = counter */ 122 + mv state12, cnt 123 + srli state13, cnt, 32 124 + 125 + /* state[14,15] = 0 */ 126 + mv state14, zero 127 + mv state15, zero 128 + 129 + li i, 10 130 + .Lpermute: 131 + /* odd round */ 132 + OP_4REG addw line0, line1 133 + OP_4REG xor line3, line0 134 + OP_4REG ROTRI line3, _16 135 + 136 + OP_4REG addw line2, line3 137 + OP_4REG xor line1, line2 138 + OP_4REG ROTRI line1, _20 139 + 140 + OP_4REG addw line0, line1 141 + OP_4REG xor line3, line0 142 + OP_4REG ROTRI line3, _24 143 + 144 + OP_4REG addw line2, line3 145 + OP_4REG xor line1, line2 146 + OP_4REG ROTRI line1, _25 147 + 148 + /* even round */ 149 + OP_4REG addw line0, line1_perm 150 + OP_4REG xor line3_perm, line0 151 + OP_4REG ROTRI line3_perm, _16 152 + 153 + OP_4REG addw line2_perm, line3_perm 154 + OP_4REG xor line1_perm, line2_perm 155 + OP_4REG ROTRI line1_perm, _20 156 + 157 + OP_4REG addw line0, line1_perm 158 + OP_4REG xor line3_perm, 
line0 159 + OP_4REG ROTRI line3_perm, _24 160 + 161 + OP_4REG addw line2_perm, line3_perm 162 + OP_4REG xor line1_perm, line2_perm 163 + OP_4REG ROTRI line1_perm, _25 164 + 165 + addi i, i, -1 166 + bnez i, .Lpermute 167 + 168 + /* output[0,1,2,3] = copy[0,1,2,3] + state[0,1,2,3] */ 169 + OP_4REG addw line0, copy 170 + sw state0, (output) 171 + sw state1, 4(output) 172 + sw state2, 8(output) 173 + sw state3, 12(output) 174 + 175 + /* from now on state[0,1,2,3] are scratch registers */ 176 + 177 + /* state[0,1,2,3] = lo(key) */ 178 + lw state0, (key) 179 + lw state1, 4(key) 180 + lw state2, 8(key) 181 + lw state3, 12(key) 182 + 183 + /* output[4,5,6,7] = state[0,1,2,3] + state[4,5,6,7] */ 184 + OP_4REG addw line1, line0 185 + sw state4, 16(output) 186 + sw state5, 20(output) 187 + sw state6, 24(output) 188 + sw state7, 28(output) 189 + 190 + /* state[0,1,2,3] = hi(key) */ 191 + lw state0, 16(key) 192 + lw state1, 20(key) 193 + lw state2, 24(key) 194 + lw state3, 28(key) 195 + 196 + /* output[8,9,10,11] = tmp[0,1,2,3] + state[8,9,10,11] */ 197 + OP_4REG addw line2, line0 198 + sw state8, 32(output) 199 + sw state9, 36(output) 200 + sw state10, 40(output) 201 + sw state11, 44(output) 202 + 203 + /* output[12,13,14,15] = state[12,13,14,15] + [cnt_lo, cnt_hi, 0, 0] */ 204 + addw state12, state12, cnt 205 + srli state0, cnt, 32 206 + addw state13, state13, state0 207 + sw state12, 48(output) 208 + sw state13, 52(output) 209 + sw state14, 56(output) 210 + sw state15, 60(output) 211 + 212 + /* ++counter */ 213 + addi cnt, cnt, 1 214 + 215 + /* output += 64 */ 216 + addi output, output, 64 217 + /* --nblocks */ 218 + addi nblocks, nblocks, -1 219 + bnez nblocks, .Lblock 220 + 221 + /* counter = [cnt_lo, cnt_hi] */ 222 + sd cnt, (counter) 223 + 224 + /* Zero out the potentially sensitive regs, in case nothing uses these 225 + * again. 
As at now copy[0,1,2,3] just contains "expand 32-byte k" and 226 + * state[0,...,11] are s0-s11 those we'll restore in the epilogue, we 227 + * only need to zero state[12,...,15]. 228 + */ 229 + mv state12, zero 230 + mv state13, zero 231 + mv state14, zero 232 + mv state15, zero 233 + 234 + REG_L s0, (sp) 235 + REG_L s1, SZREG(sp) 236 + REG_L s2, 2*SZREG(sp) 237 + REG_L s3, 3*SZREG(sp) 238 + REG_L s4, 4*SZREG(sp) 239 + REG_L s5, 5*SZREG(sp) 240 + REG_L s6, 6*SZREG(sp) 241 + REG_L s7, 7*SZREG(sp) 242 + REG_L s8, 8*SZREG(sp) 243 + REG_L s9, 9*SZREG(sp) 244 + REG_L s10, 10*SZREG(sp) 245 + REG_L s11, 11*SZREG(sp) 246 + addi sp, sp, 12*SZREG 247 + 248 + ret 249 + SYM_FUNC_END(__arch_chacha20_blocks_nostack)
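For readers who prefer C to assembly, the block function above can be modeled roughly as follows. This is a minimal sketch, not the kernel's implementation: `chacha20_blocks_ref` and its helpers are hypothetical names, the key is assumed to arrive as eight native 32-bit words, and unlike the assembly this version keeps its working state on the stack. The rotate-right amounts 16, 20, 24 and 25 passed to ROTRI in the assembly correspond to rotate-left by 16, 12, 8 and 7 here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Rotate a 32-bit word left by n bits (0 < n < 32). */
static uint32_t rotl32(uint32_t v, unsigned int n)
{
	return (v << n) | (v >> (32 - n));
}

/* One ChaCha quarter round on four words of the state. */
static void quarter_round(uint32_t *s, int a, int b, int c, int d)
{
	s[a] += s[b]; s[d] ^= s[a]; s[d] = rotl32(s[d], 16);
	s[c] += s[d]; s[b] ^= s[c]; s[b] = rotl32(s[b], 12);
	s[a] += s[b]; s[d] ^= s[a]; s[d] = rotl32(s[d], 8);
	s[c] += s[d]; s[b] ^= s[c]; s[b] = rotl32(s[b], 7);
}

/* Store a word in little-endian byte order, like `sw` on RISC-V. */
static void store_le32(uint8_t *p, uint32_t v)
{
	p[0] = (uint8_t)v;
	p[1] = (uint8_t)(v >> 8);
	p[2] = (uint8_t)(v >> 16);
	p[3] = (uint8_t)(v >> 24);
}

/* Hypothetical reference helper; writes nblocks * 64 bytes to output. */
void chacha20_blocks_ref(uint8_t *output, const uint32_t key[8],
			 uint64_t *counter, size_t nblocks)
{
	/* "expand 32-byte k" */
	static const uint32_t constants[4] = {
		0x61707865, 0x3320646e, 0x79622d32, 0x6b206574
	};
	uint64_t cnt;

	assert(output && key && counter);
	cnt = *counter;

	while (nblocks--) {
		uint32_t init[16], s[16];
		int i;

		for (i = 0; i < 4; i++)
			init[i] = constants[i];
		for (i = 0; i < 8; i++)
			init[4 + i] = key[i];
		init[12] = (uint32_t)cnt;		/* counter, low half */
		init[13] = (uint32_t)(cnt >> 32);	/* counter, high half */
		init[14] = 0;
		init[15] = 0;
		for (i = 0; i < 16; i++)
			s[i] = init[i];

		for (i = 0; i < 10; i++) {
			/* odd round: columns */
			quarter_round(s, 0, 4, 8, 12);
			quarter_round(s, 1, 5, 9, 13);
			quarter_round(s, 2, 6, 10, 14);
			quarter_round(s, 3, 7, 11, 15);
			/* even round: diagonals */
			quarter_round(s, 0, 5, 10, 15);
			quarter_round(s, 1, 6, 11, 12);
			quarter_round(s, 2, 7, 8, 13);
			quarter_round(s, 3, 4, 9, 14);
		}

		/* output = permuted state + initial state */
		for (i = 0; i < 16; i++)
			store_le32(output + 4 * i, s[i] + init[i]);

		output += 64;
		cnt++;
	}

	*counter = cnt;
}
```

With an all-zero key and a zero counter, the first output bytes should match the widely published ChaCha20 zero-key keystream, which makes the sketch easy to sanity-check against the assembly.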
+10
arch/riscv/kernel/vendor_extensions.c
···
 #include <asm/vendorid_list.h>
 #include <asm/vendor_extensions.h>
 #include <asm/vendor_extensions/andes.h>
+#include <asm/vendor_extensions/sifive.h>
 #include <asm/vendor_extensions/thead.h>
 
 #include <linux/array_size.h>
···
 struct riscv_isa_vendor_ext_data_list *riscv_isa_vendor_ext_list[] = {
 #ifdef CONFIG_RISCV_ISA_VENDOR_EXT_ANDES
 	&riscv_isa_vendor_ext_list_andes,
+#endif
+#ifdef CONFIG_RISCV_ISA_VENDOR_EXT_SIFIVE
+	&riscv_isa_vendor_ext_list_sifive,
 #endif
 #ifdef CONFIG_RISCV_ISA_VENDOR_EXT_THEAD
 	&riscv_isa_vendor_ext_list_thead,
···
 	case ANDES_VENDOR_ID:
 		bmap = &riscv_isa_vendor_ext_list_andes.all_harts_isa_bitmap;
 		cpu_bmap = riscv_isa_vendor_ext_list_andes.per_hart_isa_bitmap;
+		break;
+#endif
+#ifdef CONFIG_RISCV_ISA_VENDOR_EXT_SIFIVE
+	case SIFIVE_VENDOR_ID:
+		bmap = &riscv_isa_vendor_ext_list_sifive.all_harts_isa_bitmap;
+		cpu_bmap = riscv_isa_vendor_ext_list_sifive.per_hart_isa_bitmap;
 		break;
 #endif
 #ifdef CONFIG_RISCV_ISA_VENDOR_EXT_THEAD
+2
arch/riscv/kernel/vendor_extensions/Makefile
···
 # SPDX-License-Identifier: GPL-2.0-only
 
 obj-$(CONFIG_RISCV_ISA_VENDOR_EXT_ANDES) += andes.o
+obj-$(CONFIG_RISCV_ISA_VENDOR_EXT_SIFIVE) += sifive.o
+obj-$(CONFIG_RISCV_ISA_VENDOR_EXT_SIFIVE) += sifive_hwprobe.o
 obj-$(CONFIG_RISCV_ISA_VENDOR_EXT_THEAD) += thead.o
 obj-$(CONFIG_RISCV_ISA_VENDOR_EXT_THEAD) += thead_hwprobe.o
+21
arch/riscv/kernel/vendor_extensions/sifive.c
···
+// SPDX-License-Identifier: GPL-2.0-only
+
+#include <asm/cpufeature.h>
+#include <asm/vendor_extensions.h>
+#include <asm/vendor_extensions/sifive.h>
+
+#include <linux/array_size.h>
+#include <linux/types.h>
+
+/* All SiFive vendor extensions supported in Linux */
+const struct riscv_isa_ext_data riscv_isa_vendor_ext_sifive[] = {
+	__RISCV_ISA_EXT_DATA(xsfvfnrclipxfqf, RISCV_ISA_VENDOR_EXT_XSFVFNRCLIPXFQF),
+	__RISCV_ISA_EXT_DATA(xsfvfwmaccqqq, RISCV_ISA_VENDOR_EXT_XSFVFWMACCQQQ),
+	__RISCV_ISA_EXT_DATA(xsfvqmaccdod, RISCV_ISA_VENDOR_EXT_XSFVQMACCDOD),
+	__RISCV_ISA_EXT_DATA(xsfvqmaccqoq, RISCV_ISA_VENDOR_EXT_XSFVQMACCQOQ),
+};
+
+struct riscv_isa_vendor_ext_data_list riscv_isa_vendor_ext_list_sifive = {
+	.ext_data_count = ARRAY_SIZE(riscv_isa_vendor_ext_sifive),
+	.ext_data = riscv_isa_vendor_ext_sifive,
+};
+22
arch/riscv/kernel/vendor_extensions/sifive_hwprobe.c
···
+// SPDX-License-Identifier: GPL-2.0-only
+
+#include <asm/vendor_extensions/sifive.h>
+#include <asm/vendor_extensions/sifive_hwprobe.h>
+#include <asm/vendor_extensions/vendor_hwprobe.h>
+
+#include <linux/cpumask.h>
+#include <linux/types.h>
+
+#include <uapi/asm/hwprobe.h>
+#include <uapi/asm/vendor/sifive.h>
+
+void hwprobe_isa_vendor_ext_sifive_0(struct riscv_hwprobe *pair, const struct cpumask *cpus)
+{
+	VENDOR_EXTENSION_SUPPORTED(pair, cpus,
+				   riscv_isa_vendor_ext_list_sifive.per_hart_isa_bitmap, {
+		VENDOR_EXT_KEY(XSFVQMACCDOD);
+		VENDOR_EXT_KEY(XSFVQMACCQOQ);
+		VENDOR_EXT_KEY(XSFVFNRCLIPXFQF);
+		VENDOR_EXT_KEY(XSFVFWMACCQQQ);
+	});
+}
+8 -3
arch/riscv/lib/riscv_v_helpers.c
···
 #ifdef CONFIG_MMU
 size_t riscv_v_usercopy_threshold = CONFIG_RISCV_ISA_V_UCOPY_THRESHOLD;
 int __asm_vector_usercopy(void *dst, void *src, size_t n);
+int __asm_vector_usercopy_sum_enabled(void *dst, void *src, size_t n);
 int fallback_scalar_usercopy(void *dst, void *src, size_t n);
-asmlinkage int enter_vector_usercopy(void *dst, void *src, size_t n)
+int fallback_scalar_usercopy_sum_enabled(void *dst, void *src, size_t n);
+asmlinkage int enter_vector_usercopy(void *dst, void *src, size_t n,
+				     bool enable_sum)
 {
 	size_t remain, copied;
 
···
 		goto fallback;
 
 	kernel_vector_begin();
-	remain = __asm_vector_usercopy(dst, src, n);
+	remain = enable_sum ? __asm_vector_usercopy(dst, src, n) :
+			      __asm_vector_usercopy_sum_enabled(dst, src, n);
 	kernel_vector_end();
 
 	if (remain) {
···
 	return remain;
 
 fallback:
-	return fallback_scalar_usercopy(dst, src, n);
+	return enable_sum ? fallback_scalar_usercopy(dst, src, n) :
+			    fallback_scalar_usercopy_sum_enabled(dst, src, n);
 }
 #endif
+34 -16
arch/riscv/lib/uaccess.S
···
 	ALTERNATIVE("j fallback_scalar_usercopy", "nop", 0, RISCV_ISA_EXT_ZVE32X, CONFIG_RISCV_ISA_V)
 	REG_L t0, riscv_v_usercopy_threshold
 	bltu a2, t0, fallback_scalar_usercopy
-	tail enter_vector_usercopy
+	li a3, 1
+	tail enter_vector_usercopy
 #endif
+SYM_FUNC_END(__asm_copy_to_user)
+EXPORT_SYMBOL(__asm_copy_to_user)
+SYM_FUNC_ALIAS(__asm_copy_from_user, __asm_copy_to_user)
+EXPORT_SYMBOL(__asm_copy_from_user)
+
 SYM_FUNC_START(fallback_scalar_usercopy)
-
 	/* Enable access to user memory */
-	li t6, SR_SUM
-	csrs CSR_STATUS, t6
+	li t6, SR_SUM
+	csrs CSR_STATUS, t6
+	mv t6, ra
 
+	call fallback_scalar_usercopy_sum_enabled
+
+	/* Disable access to user memory */
+	mv ra, t6
+	li t6, SR_SUM
+	csrc CSR_STATUS, t6
+	ret
+SYM_FUNC_END(fallback_scalar_usercopy)
+
+SYM_FUNC_START(__asm_copy_to_user_sum_enabled)
+#ifdef CONFIG_RISCV_ISA_V
+	ALTERNATIVE("j fallback_scalar_usercopy_sum_enabled", "nop", 0, RISCV_ISA_EXT_ZVE32X, CONFIG_RISCV_ISA_V)
+	REG_L t0, riscv_v_usercopy_threshold
+	bltu a2, t0, fallback_scalar_usercopy_sum_enabled
+	li a3, 0
+	tail enter_vector_usercopy
+#endif
+SYM_FUNC_END(__asm_copy_to_user_sum_enabled)
+SYM_FUNC_ALIAS(__asm_copy_from_user_sum_enabled, __asm_copy_to_user_sum_enabled)
+EXPORT_SYMBOL(__asm_copy_from_user_sum_enabled)
+EXPORT_SYMBOL(__asm_copy_to_user_sum_enabled)
+
+SYM_FUNC_START(fallback_scalar_usercopy_sum_enabled)
 	/*
 	 * Save the terminal address which will be used to compute the number
 	 * of bytes copied in case of a fixup exception.
···
 	bltu a0, t0, 4b /* t0 - end of dst */
 
 .Lout_copy_user:
-	/* Disable access to user memory */
-	csrc CSR_STATUS, t6
 	li a0, 0
 	ret
-
-	/* Exception fixup code */
 10:
-	/* Disable access to user memory */
-	csrc CSR_STATUS, t6
 	sub a0, t5, a0
 	ret
-SYM_FUNC_END(__asm_copy_to_user)
-SYM_FUNC_END(fallback_scalar_usercopy)
-EXPORT_SYMBOL(__asm_copy_to_user)
-SYM_FUNC_ALIAS(__asm_copy_from_user, __asm_copy_to_user)
-EXPORT_SYMBOL(__asm_copy_from_user)
-
+SYM_FUNC_END(fallback_scalar_usercopy_sum_enabled)
 
 SYM_FUNC_START(__clear_user)
 
+12 -3
arch/riscv/lib/uaccess_vector.S
···
 	/* Enable access to user memory */
 	li t6, SR_SUM
 	csrs CSR_STATUS, t6
+	mv t6, ra
 
+	call __asm_vector_usercopy_sum_enabled
+
+	/* Disable access to user memory */
+	mv ra, t6
+	li t6, SR_SUM
+	csrc CSR_STATUS, t6
+	ret
+SYM_FUNC_END(__asm_vector_usercopy)
+
+SYM_FUNC_START(__asm_vector_usercopy_sum_enabled)
 loop:
 	vsetvli iVL, iNum, e8, ELEM_LMUL_SETTING, ta, ma
 	fixup vle8.v vData, (pSrc), 10f
···
 
 	/* Exception fixup for vector load is shared with normal exit */
 10:
-	/* Disable access to user memory */
-	csrc CSR_STATUS, t6
 	mv a0, iNum
 	ret
 
···
 	csrr t2, CSR_VSTART
 	sub iNum, iNum, t2
 	j 10b
-SYM_FUNC_END(__asm_vector_usercopy)
+SYM_FUNC_END(__asm_vector_usercopy_sum_enabled)
+25 -4
arch/riscv/mm/cacheflush.c
···
 
 	if (num_online_cpus() < 2)
 		return;
-	else if (riscv_use_sbi_for_rfence())
+
+	/*
+	 * Make sure all previous writes to the D$ are ordered before making
+	 * the IPI. The RISC-V spec states that a hart must execute a data fence
+	 * before triggering a remote fence.i in order to make the modification
+	 * visible for remote harts.
+	 *
+	 * IPIs on RISC-V are triggered by MMIO writes to either CLINT or
+	 * S-IMSIC, so the fence ensures previous data writes "happen before"
+	 * the MMIO.
+	 */
+	RISCV_FENCE(w, o);
+
+	if (riscv_use_sbi_for_rfence())
 		sbi_remote_fence_i(NULL);
 	else
 		on_each_cpu(ipi_remote_fence_i, NULL, 1);
···
 unsigned int riscv_cboz_block_size;
 EXPORT_SYMBOL_GPL(riscv_cboz_block_size);
 
+unsigned int riscv_cbop_block_size;
+EXPORT_SYMBOL_GPL(riscv_cbop_block_size);
+
 static void __init cbo_get_block_size(struct device_node *node,
 				      const char *name, u32 *block_size,
 				      unsigned long *first_hartid)
···
 
 void __init riscv_init_cbo_blocksizes(void)
 {
-	unsigned long cbom_hartid, cboz_hartid;
-	u32 cbom_block_size = 0, cboz_block_size = 0;
+	unsigned long cbom_hartid, cboz_hartid, cbop_hartid;
+	u32 cbom_block_size = 0, cboz_block_size = 0, cbop_block_size = 0;
 	struct device_node *node;
 	struct acpi_table_header *rhct;
 	acpi_status status;
···
 					   &cbom_block_size, &cbom_hartid);
 			cbo_get_block_size(node, "riscv,cboz-block-size",
 					   &cboz_block_size, &cboz_hartid);
+			cbo_get_block_size(node, "riscv,cbop-block-size",
+					   &cbop_block_size, &cbop_hartid);
 		}
 	} else {
 		status = acpi_get_table(ACPI_SIG_RHCT, 0, &rhct);
 		if (ACPI_FAILURE(status))
 			return;
 
-		acpi_get_cbo_block_size(rhct, &cbom_block_size, &cboz_block_size, NULL);
+		acpi_get_cbo_block_size(rhct, &cbom_block_size, &cboz_block_size, &cbop_block_size);
 		acpi_put_table((struct acpi_table_header *)rhct);
 	}
 
···
 
 	if (cboz_block_size)
 		riscv_cboz_block_size = cboz_block_size;
+
+	if (cbop_block_size)
+		riscv_cbop_block_size = cbop_block_size;
 }
 
 #ifdef CONFIG_SMP
+10
arch/riscv/mm/pgtable.c
···
 	flush_tlb_mm(vma->vm_mm);
 	return pmd;
 }
+
+pud_t pudp_invalidate(struct vm_area_struct *vma, unsigned long address,
+		      pud_t *pudp)
+{
+	VM_WARN_ON_ONCE(!pud_present(*pudp));
+	pud_t old = pudp_establish(vma, address, pudp, pud_mkinvalid(*pudp));
+
+	flush_pud_tlb_range(vma, address, address + HPAGE_PUD_SIZE);
+	return old;
+}
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
+38
arch/riscv/mm/tlbflush.c
···
 #include <linux/mmu_notifier.h>
 #include <asm/sbi.h>
 #include <asm/mmu_context.h>
+#include <asm/cpufeature.h>
+
+#define has_svinval() riscv_has_extension_unlikely(RISCV_ISA_EXT_SVINVAL)
+
+static inline void local_sfence_inval_ir(void)
+{
+	asm volatile(SFENCE_INVAL_IR() ::: "memory");
+}
+
+static inline void local_sfence_w_inval(void)
+{
+	asm volatile(SFENCE_W_INVAL() ::: "memory");
+}
+
+static inline void local_sinval_vma(unsigned long vma, unsigned long asid)
+{
+	if (asid != FLUSH_TLB_NO_ASID)
+		asm volatile(SINVAL_VMA(%0, %1) : : "r" (vma), "r" (asid) : "memory");
+	else
+		asm volatile(SINVAL_VMA(%0, zero) : : "r" (vma) : "memory");
+}
 
 /*
  * Flush entire TLB if number of entries to be flushed is greater
···
 
 	if (nr_ptes_in_range > tlb_flush_all_threshold) {
 		local_flush_tlb_all_asid(asid);
+		return;
+	}
+
+	if (has_svinval()) {
+		local_sfence_w_inval();
+		for (i = 0; i < nr_ptes_in_range; ++i) {
+			local_sinval_vma(start, asid);
+			start += stride;
+		}
+		local_sfence_inval_ir();
 		return;
 	}
 
···
 {
 	__flush_tlb_range(vma->vm_mm, mm_cpumask(vma->vm_mm),
 			  start, end - start, PMD_SIZE);
+}
+
+void flush_pud_tlb_range(struct vm_area_struct *vma, unsigned long start,
+			 unsigned long end)
+{
+	__flush_tlb_range(vma->vm_mm, mm_cpumask(vma->vm_mm),
+			  start, end - start, PUD_SIZE);
 }
 #endif
 
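The local flush decision in this file can be summarized with a small user-space model. This is only a sketch under stated assumptions: `plan_tlb_flush`, `struct flush_plan` and the enum are hypothetical names, the threshold is passed in as a parameter rather than read from `tlb_flush_all_threshold`, and the per-page fallback for harts without Svinval is assumed from the surrounding kernel code since it is not visible in the hunk. It counts instructions instead of executing them.

```c
#include <assert.h>
#include <stdbool.h>

enum flush_kind { FLUSH_ALL, FLUSH_SVINVAL, FLUSH_PER_PAGE };

struct flush_plan {
	enum flush_kind kind;
	unsigned long invalidations;	/* sfence.vma or sinval.vma count */
	unsigned long fences;		/* sfence.w.inval + sfence.inval.ir */
};

/* Hypothetical model of the range-flush strategy, not a kernel API. */
struct flush_plan plan_tlb_flush(unsigned long size, unsigned long stride,
				 unsigned long threshold, bool has_svinval)
{
	struct flush_plan plan = { FLUSH_ALL, 1, 0 };
	unsigned long nr_ptes;

	assert(stride != 0);
	nr_ptes = (size + stride - 1) / stride;	/* DIV_ROUND_UP(size, stride) */

	/* Too many entries: one global flush is cheaper. */
	if (nr_ptes > threshold)
		return plan;

	/* Svinval: one fence before, N invalidations, one fence after. */
	if (has_svinval) {
		plan.kind = FLUSH_SVINVAL;
		plan.invalidations = nr_ptes;
		plan.fences = 2;
		return plan;
	}

	/* Assumed fallback: one sfence.vma per page. */
	plan.kind = FLUSH_PER_PAGE;
	plan.invalidations = nr_ptes;
	return plan;
}
```

The point of the Svinval path is that the ordering cost is paid twice per range rather than once per page, which is why it sits between the global flush and the per-page loop.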
+2
include/linux/ftrace.h
···
 #define ftrace_get_symaddr(fentry_ip) (0)
 #endif
 
+void ftrace_sync_ipi(void *data);
+
 #ifdef CONFIG_DYNAMIC_FTRACE
 
 void ftrace_arch_code_modify_prepare(void);
+5
include/linux/raid/pq.h
···
 extern const struct raid6_calls raid6_vpermxor8;
 extern const struct raid6_calls raid6_lsx;
 extern const struct raid6_calls raid6_lasx;
+extern const struct raid6_calls raid6_rvvx1;
+extern const struct raid6_calls raid6_rvvx2;
+extern const struct raid6_calls raid6_rvvx4;
+extern const struct raid6_calls raid6_rvvx8;
 
 struct raid6_recov_calls {
 	void (*data2)(int, size_t, int, int, void **);
···
 extern const struct raid6_recov_calls raid6_recov_neon;
 extern const struct raid6_recov_calls raid6_recov_lsx;
 extern const struct raid6_recov_calls raid6_recov_lasx;
+extern const struct raid6_recov_calls raid6_recov_rvv;
 
 extern const struct raid6_calls raid6_neonx1;
 extern const struct raid6_calls raid6_neonx2;
+1 -1
kernel/trace/ftrace.c
···
 	op->saved_func(ip, parent_ip, op, fregs);
 }
 
-static void ftrace_sync_ipi(void *data)
+void ftrace_sync_ipi(void *data)
 {
 	/* Probably not needed, but do it anyway */
 	smp_rmb();
+1
lib/raid6/Makefile
···
 raid6_pq-$(CONFIG_KERNEL_MODE_NEON) += neon.o neon1.o neon2.o neon4.o neon8.o recov_neon.o recov_neon_inner.o
 raid6_pq-$(CONFIG_S390) += s390vx8.o recov_s390xc.o
 raid6_pq-$(CONFIG_LOONGARCH) += loongarch_simd.o recov_loongarch_simd.o
+raid6_pq-$(CONFIG_RISCV_ISA_V) += rvv.o recov_rvv.o
 
 hostprogs += mktables
 
+9
lib/raid6/algos.c
···
 	&raid6_lsx,
 #endif
 #endif
+#ifdef CONFIG_RISCV_ISA_V
+	&raid6_rvvx1,
+	&raid6_rvvx2,
+	&raid6_rvvx4,
+	&raid6_rvvx8,
+#endif
 	&raid6_intx8,
 	&raid6_intx4,
 	&raid6_intx2,
···
 #ifdef CONFIG_CPU_HAS_LSX
 	&raid6_recov_lsx,
 #endif
+#endif
+#ifdef CONFIG_RISCV_ISA_V
+	&raid6_recov_rvv,
 #endif
 	&raid6_recov_intx1,
 	NULL
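The `raid6_rvvx1` through `raid6_rvvx8` entries registered here all compute the same P/Q syndrome; judging by the code in lib/raid6/rvv.c, they differ in how many vector register groups each loop iteration covers. A byte-at-a-time scalar model of that computation might look like the sketch below. The names `gf_mul2` and `gen_syndrome_ref` are hypothetical and exist only for illustration; the mask-and-shift step mirrors the `vsra.vi` / `vsll.vi` / `vand.vx` / `vxor.vv` sequence with the constant 0x1d.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Multiply by 2 in GF(2^8) using the RAID-6 polynomial (0x11d). */
static uint8_t gf_mul2(uint8_t v)
{
	/* MASK(): all ones if the top bit is set, like vsra.vi by 7 */
	uint8_t mask = (v & 0x80) ? 0xff : 0x00;

	/* SHLBYTE() followed by the conditional reduction with 0x1d */
	return (uint8_t)((uint8_t)(v << 1) ^ (mask & 0x1d));
}

/*
 * Hypothetical scalar reference: ptrs[0..disks-3] are data pages,
 * ptrs[disks-2] receives P and ptrs[disks-1] receives Q.
 */
void gen_syndrome_ref(int disks, size_t bytes, uint8_t **ptrs)
{
	int z0 = disks - 3;		/* highest data disk */
	uint8_t *p, *q;
	size_t d;
	int z;

	assert(disks >= 3 && ptrs);
	p = ptrs[z0 + 1];		/* XOR parity */
	q = ptrs[z0 + 2];		/* RS syndrome */

	for (d = 0; d < bytes; d++) {
		uint8_t wp = ptrs[z0][d];
		uint8_t wq = wp;

		/* Horner's rule over the data disks, highest first */
		for (z = z0 - 1; z >= 0; z--) {
			uint8_t wd = ptrs[z][d];

			wq = gf_mul2(wq) ^ wd;
			wp ^= wd;
		}
		p[d] = wp;
		q[d] = wq;
	}
}
```

For two data disks holding 0x01 and 0x02 at the same offset, this yields P = 0x03 and Q = 0x01 ^ (2 * 0x02) = 0x05.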
+229
lib/raid6/recov_rvv.c
··· 1 + // SPDX-License-Identifier: GPL-2.0-only 2 + /* 3 + * Copyright 2024 Institute of Software, CAS. 4 + * Author: Chunyan Zhang <zhangchunyan@iscas.ac.cn> 5 + */ 6 + 7 + #include <asm/simd.h> 8 + #include <asm/vector.h> 9 + #include <crypto/internal/simd.h> 10 + #include <linux/raid/pq.h> 11 + 12 + static int rvv_has_vector(void) 13 + { 14 + return has_vector(); 15 + } 16 + 17 + static void __raid6_2data_recov_rvv(int bytes, u8 *p, u8 *q, u8 *dp, 18 + u8 *dq, const u8 *pbmul, 19 + const u8 *qmul) 20 + { 21 + asm volatile (".option push\n" 22 + ".option arch,+v\n" 23 + "vsetvli x0, %[avl], e8, m1, ta, ma\n" 24 + ".option pop\n" 25 + : : 26 + [avl]"r"(16) 27 + ); 28 + 29 + /* 30 + * while ( bytes-- ) { 31 + * uint8_t px, qx, db; 32 + * 33 + * px = *p ^ *dp; 34 + * qx = qmul[*q ^ *dq]; 35 + * *dq++ = db = pbmul[px] ^ qx; 36 + * *dp++ = db ^ px; 37 + * p++; q++; 38 + * } 39 + */ 40 + while (bytes) { 41 + /* 42 + * v0:px, v1:dp, 43 + * v2:qx, v3:dq, 44 + * v4:vx, v5:vy, 45 + * v6:qm0, v7:qm1, 46 + * v8:pm0, v9:pm1, 47 + * v14:p/qm[vx], v15:p/qm[vy] 48 + */ 49 + asm volatile (".option push\n" 50 + ".option arch,+v\n" 51 + "vle8.v v0, (%[px])\n" 52 + "vle8.v v1, (%[dp])\n" 53 + "vxor.vv v0, v0, v1\n" 54 + "vle8.v v2, (%[qx])\n" 55 + "vle8.v v3, (%[dq])\n" 56 + "vxor.vv v4, v2, v3\n" 57 + "vsrl.vi v5, v4, 4\n" 58 + "vand.vi v4, v4, 0xf\n" 59 + "vle8.v v6, (%[qm0])\n" 60 + "vle8.v v7, (%[qm1])\n" 61 + "vrgather.vv v14, v6, v4\n" /* v14 = qm[vx] */ 62 + "vrgather.vv v15, v7, v5\n" /* v15 = qm[vy] */ 63 + "vxor.vv v2, v14, v15\n" /* v2 = qmul[*q ^ *dq] */ 64 + 65 + "vsrl.vi v5, v0, 4\n" 66 + "vand.vi v4, v0, 0xf\n" 67 + "vle8.v v8, (%[pm0])\n" 68 + "vle8.v v9, (%[pm1])\n" 69 + "vrgather.vv v14, v8, v4\n" /* v14 = pm[vx] */ 70 + "vrgather.vv v15, v9, v5\n" /* v15 = pm[vy] */ 71 + "vxor.vv v4, v14, v15\n" /* v4 = pbmul[px] */ 72 + "vxor.vv v3, v4, v2\n" /* v3 = db = pbmul[px] ^ qx */ 73 + "vxor.vv v1, v3, v0\n" /* v1 = db ^ px; */ 74 + "vse8.v v3, (%[dq])\n" 75 + "vse8.v 
v1, (%[dp])\n" 76 + ".option pop\n" 77 + : : 78 + [px]"r"(p), 79 + [dp]"r"(dp), 80 + [qx]"r"(q), 81 + [dq]"r"(dq), 82 + [qm0]"r"(qmul), 83 + [qm1]"r"(qmul + 16), 84 + [pm0]"r"(pbmul), 85 + [pm1]"r"(pbmul + 16) 86 + :); 87 + 88 + bytes -= 16; 89 + p += 16; 90 + q += 16; 91 + dp += 16; 92 + dq += 16; 93 + } 94 + } 95 + 96 + static void __raid6_datap_recov_rvv(int bytes, u8 *p, u8 *q, 97 + u8 *dq, const u8 *qmul) 98 + { 99 + asm volatile (".option push\n" 100 + ".option arch,+v\n" 101 + "vsetvli x0, %[avl], e8, m1, ta, ma\n" 102 + ".option pop\n" 103 + : : 104 + [avl]"r"(16) 105 + ); 106 + 107 + /* 108 + * while (bytes--) { 109 + * *p++ ^= *dq = qmul[*q ^ *dq]; 110 + * q++; dq++; 111 + * } 112 + */ 113 + while (bytes) { 114 + /* 115 + * v0:vx, v1:vy, 116 + * v2:dq, v3:p, 117 + * v4:qm0, v5:qm1, 118 + * v10:m[vx], v11:m[vy] 119 + */ 120 + asm volatile (".option push\n" 121 + ".option arch,+v\n" 122 + "vle8.v v0, (%[vx])\n" 123 + "vle8.v v2, (%[dq])\n" 124 + "vxor.vv v0, v0, v2\n" 125 + "vsrl.vi v1, v0, 4\n" 126 + "vand.vi v0, v0, 0xf\n" 127 + "vle8.v v4, (%[qm0])\n" 128 + "vle8.v v5, (%[qm1])\n" 129 + "vrgather.vv v10, v4, v0\n" 130 + "vrgather.vv v11, v5, v1\n" 131 + "vxor.vv v0, v10, v11\n" 132 + "vle8.v v1, (%[vy])\n" 133 + "vxor.vv v1, v0, v1\n" 134 + "vse8.v v0, (%[dq])\n" 135 + "vse8.v v1, (%[vy])\n" 136 + ".option pop\n" 137 + : : 138 + [vx]"r"(q), 139 + [vy]"r"(p), 140 + [dq]"r"(dq), 141 + [qm0]"r"(qmul), 142 + [qm1]"r"(qmul + 16) 143 + :); 144 + 145 + bytes -= 16; 146 + p += 16; 147 + q += 16; 148 + dq += 16; 149 + } 150 + } 151 + 152 + static void raid6_2data_recov_rvv(int disks, size_t bytes, int faila, 153 + int failb, void **ptrs) 154 + { 155 + u8 *p, *q, *dp, *dq; 156 + const u8 *pbmul; /* P multiplier table for B data */ 157 + const u8 *qmul; /* Q multiplier table (for both) */ 158 + 159 + p = (u8 *)ptrs[disks - 2]; 160 + q = (u8 *)ptrs[disks - 1]; 161 + 162 + /* 163 + * Compute syndrome with zero for the missing data pages 164 + * Use the dead data 
pages as temporary storage for 165 + * delta p and delta q 166 + */ 167 + dp = (u8 *)ptrs[faila]; 168 + ptrs[faila] = (void *)raid6_empty_zero_page; 169 + ptrs[disks - 2] = dp; 170 + dq = (u8 *)ptrs[failb]; 171 + ptrs[failb] = (void *)raid6_empty_zero_page; 172 + ptrs[disks - 1] = dq; 173 + 174 + raid6_call.gen_syndrome(disks, bytes, ptrs); 175 + 176 + /* Restore pointer table */ 177 + ptrs[faila] = dp; 178 + ptrs[failb] = dq; 179 + ptrs[disks - 2] = p; 180 + ptrs[disks - 1] = q; 181 + 182 + /* Now, pick the proper data tables */ 183 + pbmul = raid6_vgfmul[raid6_gfexi[failb - faila]]; 184 + qmul = raid6_vgfmul[raid6_gfinv[raid6_gfexp[faila] ^ 185 + raid6_gfexp[failb]]]; 186 + 187 + kernel_vector_begin(); 188 + __raid6_2data_recov_rvv(bytes, p, q, dp, dq, pbmul, qmul); 189 + kernel_vector_end(); 190 + } 191 + 192 + static void raid6_datap_recov_rvv(int disks, size_t bytes, int faila, 193 + void **ptrs) 194 + { 195 + u8 *p, *q, *dq; 196 + const u8 *qmul; /* Q multiplier table */ 197 + 198 + p = (u8 *)ptrs[disks - 2]; 199 + q = (u8 *)ptrs[disks - 1]; 200 + 201 + /* 202 + * Compute syndrome with zero for the missing data page 203 + * Use the dead data page as temporary storage for delta q 204 + */ 205 + dq = (u8 *)ptrs[faila]; 206 + ptrs[faila] = (void *)raid6_empty_zero_page; 207 + ptrs[disks - 1] = dq; 208 + 209 + raid6_call.gen_syndrome(disks, bytes, ptrs); 210 + 211 + /* Restore pointer table */ 212 + ptrs[faila] = dq; 213 + ptrs[disks - 1] = q; 214 + 215 + /* Now, pick the proper data tables */ 216 + qmul = raid6_vgfmul[raid6_gfinv[raid6_gfexp[faila]]]; 217 + 218 + kernel_vector_begin(); 219 + __raid6_datap_recov_rvv(bytes, p, q, dq, qmul); 220 + kernel_vector_end(); 221 + } 222 + 223 + const struct raid6_recov_calls raid6_recov_rvv = { 224 + .data2 = raid6_2data_recov_rvv, 225 + .datap = raid6_datap_recov_rvv, 226 + .valid = rvv_has_vector, 227 + .name = "rvv", 228 + .priority = 1, 229 + };
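The recovery loops above multiply each byte by a constant in GF(2^8) using two 16-entry lookups, one indexed by the low nibble and one by the high nibble, which is what the paired `vrgather.vv` instructions on `qmul` / `qmul + 16` and `pbmul` / `pbmul + 16` appear to do. A scalar sketch of that technique follows. `gf_mul`, `build_nibble_table` and `nibble_mul` are hypothetical helpers, and the 32-byte layout (low-nibble products first, high-nibble products second) is an assumption made to match how the assembly addresses the tables.

```c
#include <assert.h>
#include <stdint.h>

/* Bitwise multiply in GF(2^8) modulo x^8 + x^4 + x^3 + x^2 + 1 (0x11d). */
static uint8_t gf_mul(uint8_t a, uint8_t b)
{
	uint8_t r = 0;

	while (b) {
		if (b & 1)
			r ^= a;
		a = (uint8_t)((a << 1) ^ ((a & 0x80) ? 0x1d : 0));
		b >>= 1;
	}
	return r;
}

/* tbl[0..15] = c * i, tbl[16..31] = c * (i << 4); hypothetical helper. */
void build_nibble_table(uint8_t c, uint8_t tbl[32])
{
	int i;

	assert(tbl);
	for (i = 0; i < 16; i++) {
		tbl[i] = gf_mul(c, (uint8_t)i);
		tbl[16 + i] = gf_mul(c, (uint8_t)(i << 4));
	}
}

/* c * x as the XOR of the two nibble lookups (multiplication is linear). */
uint8_t nibble_mul(const uint8_t tbl[32], uint8_t x)
{
	return (uint8_t)(tbl[x & 0x0f] ^ tbl[16 + (x >> 4)]);
}
```

Because multiplication by a constant distributes over XOR, splitting `x` into `(x & 0xf) ^ (x & 0xf0)` lets a 256-entry table be replaced by two 16-entry ones that fit a vector register.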
+1212
lib/raid6/rvv.c
··· 1 + // SPDX-License-Identifier: GPL-2.0-or-later 2 + /* 3 + * RAID-6 syndrome calculation using RISC-V vector instructions 4 + * 5 + * Copyright 2024 Institute of Software, CAS. 6 + * Author: Chunyan Zhang <zhangchunyan@iscas.ac.cn> 7 + * 8 + * Based on neon.uc: 9 + * Copyright 2002-2004 H. Peter Anvin 10 + */ 11 + 12 + #include <asm/simd.h> 13 + #include <asm/vector.h> 14 + #include <crypto/internal/simd.h> 15 + #include <linux/raid/pq.h> 16 + #include <linux/types.h> 17 + #include "rvv.h" 18 + 19 + #define NSIZE (riscv_v_vsize / 32) /* NSIZE = vlenb */ 20 + 21 + static int rvv_has_vector(void) 22 + { 23 + return has_vector(); 24 + } 25 + 26 + static void raid6_rvv1_gen_syndrome_real(int disks, unsigned long bytes, void **ptrs) 27 + { 28 + u8 **dptr = (u8 **)ptrs; 29 + unsigned long d; 30 + int z, z0; 31 + u8 *p, *q; 32 + 33 + z0 = disks - 3; /* Highest data disk */ 34 + p = dptr[z0 + 1]; /* XOR parity */ 35 + q = dptr[z0 + 2]; /* RS syndrome */ 36 + 37 + asm volatile (".option push\n" 38 + ".option arch,+v\n" 39 + "vsetvli t0, x0, e8, m1, ta, ma\n" 40 + ".option pop\n" 41 + ); 42 + 43 + /* v0:wp0, v1:wq0, v2:wd0/w20, v3:w10 */ 44 + for (d = 0; d < bytes; d += NSIZE * 1) { 45 + /* wq$$ = wp$$ = *(unative_t *)&dptr[z0][d+$$*NSIZE]; */ 46 + asm volatile (".option push\n" 47 + ".option arch,+v\n" 48 + "vle8.v v0, (%[wp0])\n" 49 + "vle8.v v1, (%[wp0])\n" 50 + ".option pop\n" 51 + : : 52 + [wp0]"r"(&dptr[z0][d + 0 * NSIZE]) 53 + ); 54 + 55 + for (z = z0 - 1 ; z >= 0 ; z--) { 56 + /* 57 + * w2$$ = MASK(wq$$); 58 + * w1$$ = SHLBYTE(wq$$); 59 + * w2$$ &= NBYTES(0x1d); 60 + * w1$$ ^= w2$$; 61 + * wd$$ = *(unative_t *)&dptr[z][d+$$*NSIZE]; 62 + * wq$$ = w1$$ ^ wd$$; 63 + * wp$$ ^= wd$$; 64 + */ 65 + asm volatile (".option push\n" 66 + ".option arch,+v\n" 67 + "vsra.vi v2, v1, 7\n" 68 + "vsll.vi v3, v1, 1\n" 69 + "vand.vx v2, v2, %[x1d]\n" 70 + "vxor.vv v3, v3, v2\n" 71 + "vle8.v v2, (%[wd0])\n" 72 + "vxor.vv v1, v3, v2\n" 73 + "vxor.vv v0, v0, v2\n" 74 + ".option pop\n" 
75 + : : 76 + [wd0]"r"(&dptr[z][d + 0 * NSIZE]), 77 + [x1d]"r"(0x1d) 78 + ); 79 + } 80 + 81 + /* 82 + * *(unative_t *)&p[d+NSIZE*$$] = wp$$; 83 + * *(unative_t *)&q[d+NSIZE*$$] = wq$$; 84 + */ 85 + asm volatile (".option push\n" 86 + ".option arch,+v\n" 87 + "vse8.v v0, (%[wp0])\n" 88 + "vse8.v v1, (%[wq0])\n" 89 + ".option pop\n" 90 + : : 91 + [wp0]"r"(&p[d + NSIZE * 0]), 92 + [wq0]"r"(&q[d + NSIZE * 0]) 93 + ); 94 + } 95 + } 96 + 97 + static void raid6_rvv1_xor_syndrome_real(int disks, int start, int stop, 98 + unsigned long bytes, void **ptrs) 99 + { 100 + u8 **dptr = (u8 **)ptrs; 101 + u8 *p, *q; 102 + unsigned long d; 103 + int z, z0; 104 + 105 + z0 = stop; /* P/Q right side optimization */ 106 + p = dptr[disks - 2]; /* XOR parity */ 107 + q = dptr[disks - 1]; /* RS syndrome */ 108 + 109 + asm volatile (".option push\n" 110 + ".option arch,+v\n" 111 + "vsetvli t0, x0, e8, m1, ta, ma\n" 112 + ".option pop\n" 113 + ); 114 + 115 + /* v0:wp0, v1:wq0, v2:wd0/w20, v3:w10 */ 116 + for (d = 0 ; d < bytes ; d += NSIZE * 1) { 117 + /* wq$$ = wp$$ = *(unative_t *)&dptr[z0][d+$$*NSIZE]; */ 118 + asm volatile (".option push\n" 119 + ".option arch,+v\n" 120 + "vle8.v v0, (%[wp0])\n" 121 + "vle8.v v1, (%[wp0])\n" 122 + ".option pop\n" 123 + : : 124 + [wp0]"r"(&dptr[z0][d + 0 * NSIZE]) 125 + ); 126 + 127 + /* P/Q data pages */ 128 + for (z = z0 - 1; z >= start; z--) { 129 + /* 130 + * w2$$ = MASK(wq$$); 131 + * w1$$ = SHLBYTE(wq$$); 132 + * w2$$ &= NBYTES(0x1d); 133 + * w1$$ ^= w2$$; 134 + * wd$$ = *(unative_t *)&dptr[z][d+$$*NSIZE]; 135 + * wq$$ = w1$$ ^ wd$$; 136 + * wp$$ ^= wd$$; 137 + */ 138 + asm volatile (".option push\n" 139 + ".option arch,+v\n" 140 + "vsra.vi v2, v1, 7\n" 141 + "vsll.vi v3, v1, 1\n" 142 + "vand.vx v2, v2, %[x1d]\n" 143 + "vxor.vv v3, v3, v2\n" 144 + "vle8.v v2, (%[wd0])\n" 145 + "vxor.vv v1, v3, v2\n" 146 + "vxor.vv v0, v0, v2\n" 147 + ".option pop\n" 148 + : : 149 + [wd0]"r"(&dptr[z][d + 0 * NSIZE]), 150 + [x1d]"r"(0x1d) 151 + ); 152 + } 153 + 154 + 
/* P/Q left side optimization */ 155 + for (z = start - 1; z >= 0; z--) { 156 + /* 157 + * w2$$ = MASK(wq$$); 158 + * w1$$ = SHLBYTE(wq$$); 159 + * w2$$ &= NBYTES(0x1d); 160 + * wq$$ = w1$$ ^ w2$$; 161 + */ 162 + asm volatile (".option push\n" 163 + ".option arch,+v\n" 164 + "vsra.vi v2, v1, 7\n" 165 + "vsll.vi v3, v1, 1\n" 166 + "vand.vx v2, v2, %[x1d]\n" 167 + "vxor.vv v1, v3, v2\n" 168 + ".option pop\n" 169 + : : 170 + [x1d]"r"(0x1d) 171 + ); 172 + } 173 + 174 + /* 175 + * *(unative_t *)&p[d+NSIZE*$$] ^= wp$$; 176 + * *(unative_t *)&q[d+NSIZE*$$] ^= wq$$; 177 + * v0:wp0, v1:wq0, v2:p0, v3:q0 178 + */ 179 + asm volatile (".option push\n" 180 + ".option arch,+v\n" 181 + "vle8.v v2, (%[wp0])\n" 182 + "vle8.v v3, (%[wq0])\n" 183 + "vxor.vv v2, v2, v0\n" 184 + "vxor.vv v3, v3, v1\n" 185 + "vse8.v v2, (%[wp0])\n" 186 + "vse8.v v3, (%[wq0])\n" 187 + ".option pop\n" 188 + : : 189 + [wp0]"r"(&p[d + NSIZE * 0]), 190 + [wq0]"r"(&q[d + NSIZE * 0]) 191 + ); 192 + } 193 + } 194 + 195 + static void raid6_rvv2_gen_syndrome_real(int disks, unsigned long bytes, void **ptrs) 196 + { 197 + u8 **dptr = (u8 **)ptrs; 198 + unsigned long d; 199 + int z, z0; 200 + u8 *p, *q; 201 + 202 + z0 = disks - 3; /* Highest data disk */ 203 + p = dptr[z0 + 1]; /* XOR parity */ 204 + q = dptr[z0 + 2]; /* RS syndrome */ 205 + 206 + asm volatile (".option push\n" 207 + ".option arch,+v\n" 208 + "vsetvli t0, x0, e8, m1, ta, ma\n" 209 + ".option pop\n" 210 + ); 211 + 212 + /* 213 + * v0:wp0, v1:wq0, v2:wd0/w20, v3:w10 214 + * v4:wp1, v5:wq1, v6:wd1/w21, v7:w11 215 + */ 216 + for (d = 0; d < bytes; d += NSIZE * 2) { 217 + /* wq$$ = wp$$ = *(unative_t *)&dptr[z0][d+$$*NSIZE]; */ 218 + asm volatile (".option push\n" 219 + ".option arch,+v\n" 220 + "vle8.v v0, (%[wp0])\n" 221 + "vle8.v v1, (%[wp0])\n" 222 + "vle8.v v4, (%[wp1])\n" 223 + "vle8.v v5, (%[wp1])\n" 224 + ".option pop\n" 225 + : : 226 + [wp0]"r"(&dptr[z0][d + 0 * NSIZE]), 227 + [wp1]"r"(&dptr[z0][d + 1 * NSIZE]) 228 + ); 229 + 230 + for (z = z0 
- 1; z >= 0; z--) { 231 + /* 232 + * w2$$ = MASK(wq$$); 233 + * w1$$ = SHLBYTE(wq$$); 234 + * w2$$ &= NBYTES(0x1d); 235 + * w1$$ ^= w2$$; 236 + * wd$$ = *(unative_t *)&dptr[z][d+$$*NSIZE]; 237 + * wq$$ = w1$$ ^ wd$$; 238 + * wp$$ ^= wd$$; 239 + */ 240 + asm volatile (".option push\n" 241 + ".option arch,+v\n" 242 + "vsra.vi v2, v1, 7\n" 243 + "vsll.vi v3, v1, 1\n" 244 + "vand.vx v2, v2, %[x1d]\n" 245 + "vxor.vv v3, v3, v2\n" 246 + "vle8.v v2, (%[wd0])\n" 247 + "vxor.vv v1, v3, v2\n" 248 + "vxor.vv v0, v0, v2\n" 249 + 250 + "vsra.vi v6, v5, 7\n" 251 + "vsll.vi v7, v5, 1\n" 252 + "vand.vx v6, v6, %[x1d]\n" 253 + "vxor.vv v7, v7, v6\n" 254 + "vle8.v v6, (%[wd1])\n" 255 + "vxor.vv v5, v7, v6\n" 256 + "vxor.vv v4, v4, v6\n" 257 + ".option pop\n" 258 + : : 259 + [wd0]"r"(&dptr[z][d + 0 * NSIZE]), 260 + [wd1]"r"(&dptr[z][d + 1 * NSIZE]), 261 + [x1d]"r"(0x1d) 262 + ); 263 + } 264 + 265 + /* 266 + * *(unative_t *)&p[d+NSIZE*$$] = wp$$; 267 + * *(unative_t *)&q[d+NSIZE*$$] = wq$$; 268 + */ 269 + asm volatile (".option push\n" 270 + ".option arch,+v\n" 271 + "vse8.v v0, (%[wp0])\n" 272 + "vse8.v v1, (%[wq0])\n" 273 + "vse8.v v4, (%[wp1])\n" 274 + "vse8.v v5, (%[wq1])\n" 275 + ".option pop\n" 276 + : : 277 + [wp0]"r"(&p[d + NSIZE * 0]), 278 + [wq0]"r"(&q[d + NSIZE * 0]), 279 + [wp1]"r"(&p[d + NSIZE * 1]), 280 + [wq1]"r"(&q[d + NSIZE * 1]) 281 + ); 282 + } 283 + } 284 + 285 + static void raid6_rvv2_xor_syndrome_real(int disks, int start, int stop, 286 + unsigned long bytes, void **ptrs) 287 + { 288 + u8 **dptr = (u8 **)ptrs; 289 + u8 *p, *q; 290 + unsigned long d; 291 + int z, z0; 292 + 293 + z0 = stop; /* P/Q right side optimization */ 294 + p = dptr[disks - 2]; /* XOR parity */ 295 + q = dptr[disks - 1]; /* RS syndrome */ 296 + 297 + asm volatile (".option push\n" 298 + ".option arch,+v\n" 299 + "vsetvli t0, x0, e8, m1, ta, ma\n" 300 + ".option pop\n" 301 + ); 302 + 303 + /* 304 + * v0:wp0, v1:wq0, v2:wd0/w20, v3:w10 305 + * v4:wp1, v5:wq1, v6:wd1/w21, v7:w11 306 + */ 307 + 
for (d = 0; d < bytes; d += NSIZE * 2) { 308 + /* wq$$ = wp$$ = *(unative_t *)&dptr[z0][d+$$*NSIZE]; */ 309 + asm volatile (".option push\n" 310 + ".option arch,+v\n" 311 + "vle8.v v0, (%[wp0])\n" 312 + "vle8.v v1, (%[wp0])\n" 313 + "vle8.v v4, (%[wp1])\n" 314 + "vle8.v v5, (%[wp1])\n" 315 + ".option pop\n" 316 + : : 317 + [wp0]"r"(&dptr[z0][d + 0 * NSIZE]), 318 + [wp1]"r"(&dptr[z0][d + 1 * NSIZE]) 319 + ); 320 + 321 + /* P/Q data pages */ 322 + for (z = z0 - 1; z >= start; z--) { 323 + /* 324 + * w2$$ = MASK(wq$$); 325 + * w1$$ = SHLBYTE(wq$$); 326 + * w2$$ &= NBYTES(0x1d); 327 + * w1$$ ^= w2$$; 328 + * wd$$ = *(unative_t *)&dptr[z][d+$$*NSIZE]; 329 + * wq$$ = w1$$ ^ wd$$; 330 + * wp$$ ^= wd$$; 331 + */ 332 + asm volatile (".option push\n" 333 + ".option arch,+v\n" 334 + "vsra.vi v2, v1, 7\n" 335 + "vsll.vi v3, v1, 1\n" 336 + "vand.vx v2, v2, %[x1d]\n" 337 + "vxor.vv v3, v3, v2\n" 338 + "vle8.v v2, (%[wd0])\n" 339 + "vxor.vv v1, v3, v2\n" 340 + "vxor.vv v0, v0, v2\n" 341 + 342 + "vsra.vi v6, v5, 7\n" 343 + "vsll.vi v7, v5, 1\n" 344 + "vand.vx v6, v6, %[x1d]\n" 345 + "vxor.vv v7, v7, v6\n" 346 + "vle8.v v6, (%[wd1])\n" 347 + "vxor.vv v5, v7, v6\n" 348 + "vxor.vv v4, v4, v6\n" 349 + ".option pop\n" 350 + : : 351 + [wd0]"r"(&dptr[z][d + 0 * NSIZE]), 352 + [wd1]"r"(&dptr[z][d + 1 * NSIZE]), 353 + [x1d]"r"(0x1d) 354 + ); 355 + } 356 + 357 + /* P/Q left side optimization */ 358 + for (z = start - 1; z >= 0; z--) { 359 + /* 360 + * w2$$ = MASK(wq$$); 361 + * w1$$ = SHLBYTE(wq$$); 362 + * w2$$ &= NBYTES(0x1d); 363 + * wq$$ = w1$$ ^ w2$$; 364 + */ 365 + asm volatile (".option push\n" 366 + ".option arch,+v\n" 367 + "vsra.vi v2, v1, 7\n" 368 + "vsll.vi v3, v1, 1\n" 369 + "vand.vx v2, v2, %[x1d]\n" 370 + "vxor.vv v1, v3, v2\n" 371 + 372 + "vsra.vi v6, v5, 7\n" 373 + "vsll.vi v7, v5, 1\n" 374 + "vand.vx v6, v6, %[x1d]\n" 375 + "vxor.vv v5, v7, v6\n" 376 + ".option pop\n" 377 + : : 378 + [x1d]"r"(0x1d) 379 + ); 380 + } 381 + 382 + /* 383 + * *(unative_t *)&p[d+NSIZE*$$] ^= 
wp$$; 384 + * *(unative_t *)&q[d+NSIZE*$$] ^= wq$$; 385 + * v0:wp0, v1:wq0, v2:p0, v3:q0 386 + * v4:wp1, v5:wq1, v6:p1, v7:q1 387 + */ 388 + asm volatile (".option push\n" 389 + ".option arch,+v\n" 390 + "vle8.v v2, (%[wp0])\n" 391 + "vle8.v v3, (%[wq0])\n" 392 + "vxor.vv v2, v2, v0\n" 393 + "vxor.vv v3, v3, v1\n" 394 + "vse8.v v2, (%[wp0])\n" 395 + "vse8.v v3, (%[wq0])\n" 396 + 397 + "vle8.v v6, (%[wp1])\n" 398 + "vle8.v v7, (%[wq1])\n" 399 + "vxor.vv v6, v6, v4\n" 400 + "vxor.vv v7, v7, v5\n" 401 + "vse8.v v6, (%[wp1])\n" 402 + "vse8.v v7, (%[wq1])\n" 403 + ".option pop\n" 404 + : : 405 + [wp0]"r"(&p[d + NSIZE * 0]), 406 + [wq0]"r"(&q[d + NSIZE * 0]), 407 + [wp1]"r"(&p[d + NSIZE * 1]), 408 + [wq1]"r"(&q[d + NSIZE * 1]) 409 + ); 410 + } 411 + } 412 + 413 + static void raid6_rvv4_gen_syndrome_real(int disks, unsigned long bytes, void **ptrs) 414 + { 415 + u8 **dptr = (u8 **)ptrs; 416 + unsigned long d; 417 + int z, z0; 418 + u8 *p, *q; 419 + 420 + z0 = disks - 3; /* Highest data disk */ 421 + p = dptr[z0 + 1]; /* XOR parity */ 422 + q = dptr[z0 + 2]; /* RS syndrome */ 423 + 424 + asm volatile (".option push\n" 425 + ".option arch,+v\n" 426 + "vsetvli t0, x0, e8, m1, ta, ma\n" 427 + ".option pop\n" 428 + ); 429 + 430 + /* 431 + * v0:wp0, v1:wq0, v2:wd0/w20, v3:w10 432 + * v4:wp1, v5:wq1, v6:wd1/w21, v7:w11 433 + * v8:wp2, v9:wq2, v10:wd2/w22, v11:w12 434 + * v12:wp3, v13:wq3, v14:wd3/w23, v15:w13 435 + */ 436 + for (d = 0; d < bytes; d += NSIZE * 4) { 437 + /* wq$$ = wp$$ = *(unative_t *)&dptr[z0][d+$$*NSIZE]; */ 438 + asm volatile (".option push\n" 439 + ".option arch,+v\n" 440 + "vle8.v v0, (%[wp0])\n" 441 + "vle8.v v1, (%[wp0])\n" 442 + "vle8.v v4, (%[wp1])\n" 443 + "vle8.v v5, (%[wp1])\n" 444 + "vle8.v v8, (%[wp2])\n" 445 + "vle8.v v9, (%[wp2])\n" 446 + "vle8.v v12, (%[wp3])\n" 447 + "vle8.v v13, (%[wp3])\n" 448 + ".option pop\n" 449 + : : 450 + [wp0]"r"(&dptr[z0][d + 0 * NSIZE]), 451 + [wp1]"r"(&dptr[z0][d + 1 * NSIZE]), 452 + [wp2]"r"(&dptr[z0][d + 2 * 
NSIZE]), 453 + [wp3]"r"(&dptr[z0][d + 3 * NSIZE]) 454 + ); 455 + 456 + for (z = z0 - 1; z >= 0; z--) { 457 + /* 458 + * w2$$ = MASK(wq$$); 459 + * w1$$ = SHLBYTE(wq$$); 460 + * w2$$ &= NBYTES(0x1d); 461 + * w1$$ ^= w2$$; 462 + * wd$$ = *(unative_t *)&dptr[z][d+$$*NSIZE]; 463 + * wq$$ = w1$$ ^ wd$$; 464 + * wp$$ ^= wd$$; 465 + */ 466 + asm volatile (".option push\n" 467 + ".option arch,+v\n" 468 + "vsra.vi v2, v1, 7\n" 469 + "vsll.vi v3, v1, 1\n" 470 + "vand.vx v2, v2, %[x1d]\n" 471 + "vxor.vv v3, v3, v2\n" 472 + "vle8.v v2, (%[wd0])\n" 473 + "vxor.vv v1, v3, v2\n" 474 + "vxor.vv v0, v0, v2\n" 475 + 476 + "vsra.vi v6, v5, 7\n" 477 + "vsll.vi v7, v5, 1\n" 478 + "vand.vx v6, v6, %[x1d]\n" 479 + "vxor.vv v7, v7, v6\n" 480 + "vle8.v v6, (%[wd1])\n" 481 + "vxor.vv v5, v7, v6\n" 482 + "vxor.vv v4, v4, v6\n" 483 + 484 + "vsra.vi v10, v9, 7\n" 485 + "vsll.vi v11, v9, 1\n" 486 + "vand.vx v10, v10, %[x1d]\n" 487 + "vxor.vv v11, v11, v10\n" 488 + "vle8.v v10, (%[wd2])\n" 489 + "vxor.vv v9, v11, v10\n" 490 + "vxor.vv v8, v8, v10\n" 491 + 492 + "vsra.vi v14, v13, 7\n" 493 + "vsll.vi v15, v13, 1\n" 494 + "vand.vx v14, v14, %[x1d]\n" 495 + "vxor.vv v15, v15, v14\n" 496 + "vle8.v v14, (%[wd3])\n" 497 + "vxor.vv v13, v15, v14\n" 498 + "vxor.vv v12, v12, v14\n" 499 + ".option pop\n" 500 + : : 501 + [wd0]"r"(&dptr[z][d + 0 * NSIZE]), 502 + [wd1]"r"(&dptr[z][d + 1 * NSIZE]), 503 + [wd2]"r"(&dptr[z][d + 2 * NSIZE]), 504 + [wd3]"r"(&dptr[z][d + 3 * NSIZE]), 505 + [x1d]"r"(0x1d) 506 + ); 507 + } 508 + 509 + /* 510 + * *(unative_t *)&p[d+NSIZE*$$] = wp$$; 511 + * *(unative_t *)&q[d+NSIZE*$$] = wq$$; 512 + */ 513 + asm volatile (".option push\n" 514 + ".option arch,+v\n" 515 + "vse8.v v0, (%[wp0])\n" 516 + "vse8.v v1, (%[wq0])\n" 517 + "vse8.v v4, (%[wp1])\n" 518 + "vse8.v v5, (%[wq1])\n" 519 + "vse8.v v8, (%[wp2])\n" 520 + "vse8.v v9, (%[wq2])\n" 521 + "vse8.v v12, (%[wp3])\n" 522 + "vse8.v v13, (%[wq3])\n" 523 + ".option pop\n" 524 + : : 525 + [wp0]"r"(&p[d + NSIZE * 0]), 526 + 
[wq0]"r"(&q[d + NSIZE * 0]), 527 + [wp1]"r"(&p[d + NSIZE * 1]), 528 + [wq1]"r"(&q[d + NSIZE * 1]), 529 + [wp2]"r"(&p[d + NSIZE * 2]), 530 + [wq2]"r"(&q[d + NSIZE * 2]), 531 + [wp3]"r"(&p[d + NSIZE * 3]), 532 + [wq3]"r"(&q[d + NSIZE * 3]) 533 + ); 534 + } 535 + } 536 + 537 + static void raid6_rvv4_xor_syndrome_real(int disks, int start, int stop, 538 + unsigned long bytes, void **ptrs) 539 + { 540 + u8 **dptr = (u8 **)ptrs; 541 + u8 *p, *q; 542 + unsigned long d; 543 + int z, z0; 544 + 545 + z0 = stop; /* P/Q right side optimization */ 546 + p = dptr[disks - 2]; /* XOR parity */ 547 + q = dptr[disks - 1]; /* RS syndrome */ 548 + 549 + asm volatile (".option push\n" 550 + ".option arch,+v\n" 551 + "vsetvli t0, x0, e8, m1, ta, ma\n" 552 + ".option pop\n" 553 + ); 554 + 555 + /* 556 + * v0:wp0, v1:wq0, v2:wd0/w20, v3:w10 557 + * v4:wp1, v5:wq1, v6:wd1/w21, v7:w11 558 + * v8:wp2, v9:wq2, v10:wd2/w22, v11:w12 559 + * v12:wp3, v13:wq3, v14:wd3/w23, v15:w13 560 + */ 561 + for (d = 0; d < bytes; d += NSIZE * 4) { 562 + /* wq$$ = wp$$ = *(unative_t *)&dptr[z0][d+$$*NSIZE]; */ 563 + asm volatile (".option push\n" 564 + ".option arch,+v\n" 565 + "vle8.v v0, (%[wp0])\n" 566 + "vle8.v v1, (%[wp0])\n" 567 + "vle8.v v4, (%[wp1])\n" 568 + "vle8.v v5, (%[wp1])\n" 569 + "vle8.v v8, (%[wp2])\n" 570 + "vle8.v v9, (%[wp2])\n" 571 + "vle8.v v12, (%[wp3])\n" 572 + "vle8.v v13, (%[wp3])\n" 573 + ".option pop\n" 574 + : : 575 + [wp0]"r"(&dptr[z0][d + 0 * NSIZE]), 576 + [wp1]"r"(&dptr[z0][d + 1 * NSIZE]), 577 + [wp2]"r"(&dptr[z0][d + 2 * NSIZE]), 578 + [wp3]"r"(&dptr[z0][d + 3 * NSIZE]) 579 + ); 580 + 581 + /* P/Q data pages */ 582 + for (z = z0 - 1; z >= start; z--) { 583 + /* 584 + * w2$$ = MASK(wq$$); 585 + * w1$$ = SHLBYTE(wq$$); 586 + * w2$$ &= NBYTES(0x1d); 587 + * w1$$ ^= w2$$; 588 + * wd$$ = *(unative_t *)&dptr[z][d+$$*NSIZE]; 589 + * wq$$ = w1$$ ^ wd$$; 590 + * wp$$ ^= wd$$; 591 + */ 592 + asm volatile (".option push\n" 593 + ".option arch,+v\n" 594 + "vsra.vi v2, v1, 7\n" 595 + 
"vsll.vi v3, v1, 1\n" 596 + "vand.vx v2, v2, %[x1d]\n" 597 + "vxor.vv v3, v3, v2\n" 598 + "vle8.v v2, (%[wd0])\n" 599 + "vxor.vv v1, v3, v2\n" 600 + "vxor.vv v0, v0, v2\n" 601 + 602 + "vsra.vi v6, v5, 7\n" 603 + "vsll.vi v7, v5, 1\n" 604 + "vand.vx v6, v6, %[x1d]\n" 605 + "vxor.vv v7, v7, v6\n" 606 + "vle8.v v6, (%[wd1])\n" 607 + "vxor.vv v5, v7, v6\n" 608 + "vxor.vv v4, v4, v6\n" 609 + 610 + "vsra.vi v10, v9, 7\n" 611 + "vsll.vi v11, v9, 1\n" 612 + "vand.vx v10, v10, %[x1d]\n" 613 + "vxor.vv v11, v11, v10\n" 614 + "vle8.v v10, (%[wd2])\n" 615 + "vxor.vv v9, v11, v10\n" 616 + "vxor.vv v8, v8, v10\n" 617 + 618 + "vsra.vi v14, v13, 7\n" 619 + "vsll.vi v15, v13, 1\n" 620 + "vand.vx v14, v14, %[x1d]\n" 621 + "vxor.vv v15, v15, v14\n" 622 + "vle8.v v14, (%[wd3])\n" 623 + "vxor.vv v13, v15, v14\n" 624 + "vxor.vv v12, v12, v14\n" 625 + ".option pop\n" 626 + : : 627 + [wd0]"r"(&dptr[z][d + 0 * NSIZE]), 628 + [wd1]"r"(&dptr[z][d + 1 * NSIZE]), 629 + [wd2]"r"(&dptr[z][d + 2 * NSIZE]), 630 + [wd3]"r"(&dptr[z][d + 3 * NSIZE]), 631 + [x1d]"r"(0x1d) 632 + ); 633 + } 634 + 635 + /* P/Q left side optimization */ 636 + for (z = start - 1; z >= 0; z--) { 637 + /* 638 + * w2$$ = MASK(wq$$); 639 + * w1$$ = SHLBYTE(wq$$); 640 + * w2$$ &= NBYTES(0x1d); 641 + * wq$$ = w1$$ ^ w2$$; 642 + */ 643 + asm volatile (".option push\n" 644 + ".option arch,+v\n" 645 + "vsra.vi v2, v1, 7\n" 646 + "vsll.vi v3, v1, 1\n" 647 + "vand.vx v2, v2, %[x1d]\n" 648 + "vxor.vv v1, v3, v2\n" 649 + 650 + "vsra.vi v6, v5, 7\n" 651 + "vsll.vi v7, v5, 1\n" 652 + "vand.vx v6, v6, %[x1d]\n" 653 + "vxor.vv v5, v7, v6\n" 654 + 655 + "vsra.vi v10, v9, 7\n" 656 + "vsll.vi v11, v9, 1\n" 657 + "vand.vx v10, v10, %[x1d]\n" 658 + "vxor.vv v9, v11, v10\n" 659 + 660 + "vsra.vi v14, v13, 7\n" 661 + "vsll.vi v15, v13, 1\n" 662 + "vand.vx v14, v14, %[x1d]\n" 663 + "vxor.vv v13, v15, v14\n" 664 + ".option pop\n" 665 + : : 666 + [x1d]"r"(0x1d) 667 + ); 668 + } 669 + 670 + /* 671 + * *(unative_t *)&p[d+NSIZE*$$] ^= wp$$; 672 + * 
*(unative_t *)&q[d+NSIZE*$$] ^= wq$$; 673 + * v0:wp0, v1:wq0, v2:p0, v3:q0 674 + * v4:wp1, v5:wq1, v6:p1, v7:q1 675 + * v8:wp2, v9:wq2, v10:p2, v11:q2 676 + * v12:wp3, v13:wq3, v14:p3, v15:q3 677 + */ 678 + asm volatile (".option push\n" 679 + ".option arch,+v\n" 680 + "vle8.v v2, (%[wp0])\n" 681 + "vle8.v v3, (%[wq0])\n" 682 + "vxor.vv v2, v2, v0\n" 683 + "vxor.vv v3, v3, v1\n" 684 + "vse8.v v2, (%[wp0])\n" 685 + "vse8.v v3, (%[wq0])\n" 686 + 687 + "vle8.v v6, (%[wp1])\n" 688 + "vle8.v v7, (%[wq1])\n" 689 + "vxor.vv v6, v6, v4\n" 690 + "vxor.vv v7, v7, v5\n" 691 + "vse8.v v6, (%[wp1])\n" 692 + "vse8.v v7, (%[wq1])\n" 693 + 694 + "vle8.v v10, (%[wp2])\n" 695 + "vle8.v v11, (%[wq2])\n" 696 + "vxor.vv v10, v10, v8\n" 697 + "vxor.vv v11, v11, v9\n" 698 + "vse8.v v10, (%[wp2])\n" 699 + "vse8.v v11, (%[wq2])\n" 700 + 701 + "vle8.v v14, (%[wp3])\n" 702 + "vle8.v v15, (%[wq3])\n" 703 + "vxor.vv v14, v14, v12\n" 704 + "vxor.vv v15, v15, v13\n" 705 + "vse8.v v14, (%[wp3])\n" 706 + "vse8.v v15, (%[wq3])\n" 707 + ".option pop\n" 708 + : : 709 + [wp0]"r"(&p[d + NSIZE * 0]), 710 + [wq0]"r"(&q[d + NSIZE * 0]), 711 + [wp1]"r"(&p[d + NSIZE * 1]), 712 + [wq1]"r"(&q[d + NSIZE * 1]), 713 + [wp2]"r"(&p[d + NSIZE * 2]), 714 + [wq2]"r"(&q[d + NSIZE * 2]), 715 + [wp3]"r"(&p[d + NSIZE * 3]), 716 + [wq3]"r"(&q[d + NSIZE * 3]) 717 + ); 718 + } 719 + } 720 + 721 + static void raid6_rvv8_gen_syndrome_real(int disks, unsigned long bytes, void **ptrs) 722 + { 723 + u8 **dptr = (u8 **)ptrs; 724 + unsigned long d; 725 + int z, z0; 726 + u8 *p, *q; 727 + 728 + z0 = disks - 3; /* Highest data disk */ 729 + p = dptr[z0 + 1]; /* XOR parity */ 730 + q = dptr[z0 + 2]; /* RS syndrome */ 731 + 732 + asm volatile (".option push\n" 733 + ".option arch,+v\n" 734 + "vsetvli t0, x0, e8, m1, ta, ma\n" 735 + ".option pop\n" 736 + ); 737 + 738 + /* 739 + * v0:wp0, v1:wq0, v2:wd0/w20, v3:w10 740 + * v4:wp1, v5:wq1, v6:wd1/w21, v7:w11 741 + * v8:wp2, v9:wq2, v10:wd2/w22, v11:w12 742 + * v12:wp3, v13:wq3, 
v14:wd3/w23, v15:w13 743 + * v16:wp4, v17:wq4, v18:wd4/w24, v19:w14 744 + * v20:wp5, v21:wq5, v22:wd5/w25, v23:w15 745 + * v24:wp6, v25:wq6, v26:wd6/w26, v27:w16 746 + * v28:wp7, v29:wq7, v30:wd7/w27, v31:w17 747 + */ 748 + for (d = 0; d < bytes; d += NSIZE * 8) { 749 + /* wq$$ = wp$$ = *(unative_t *)&dptr[z0][d+$$*NSIZE]; */ 750 + asm volatile (".option push\n" 751 + ".option arch,+v\n" 752 + "vle8.v v0, (%[wp0])\n" 753 + "vle8.v v1, (%[wp0])\n" 754 + "vle8.v v4, (%[wp1])\n" 755 + "vle8.v v5, (%[wp1])\n" 756 + "vle8.v v8, (%[wp2])\n" 757 + "vle8.v v9, (%[wp2])\n" 758 + "vle8.v v12, (%[wp3])\n" 759 + "vle8.v v13, (%[wp3])\n" 760 + "vle8.v v16, (%[wp4])\n" 761 + "vle8.v v17, (%[wp4])\n" 762 + "vle8.v v20, (%[wp5])\n" 763 + "vle8.v v21, (%[wp5])\n" 764 + "vle8.v v24, (%[wp6])\n" 765 + "vle8.v v25, (%[wp6])\n" 766 + "vle8.v v28, (%[wp7])\n" 767 + "vle8.v v29, (%[wp7])\n" 768 + ".option pop\n" 769 + : : 770 + [wp0]"r"(&dptr[z0][d + 0 * NSIZE]), 771 + [wp1]"r"(&dptr[z0][d + 1 * NSIZE]), 772 + [wp2]"r"(&dptr[z0][d + 2 * NSIZE]), 773 + [wp3]"r"(&dptr[z0][d + 3 * NSIZE]), 774 + [wp4]"r"(&dptr[z0][d + 4 * NSIZE]), 775 + [wp5]"r"(&dptr[z0][d + 5 * NSIZE]), 776 + [wp6]"r"(&dptr[z0][d + 6 * NSIZE]), 777 + [wp7]"r"(&dptr[z0][d + 7 * NSIZE]) 778 + ); 779 + 780 + for (z = z0 - 1; z >= 0; z--) { 781 + /* 782 + * w2$$ = MASK(wq$$); 783 + * w1$$ = SHLBYTE(wq$$); 784 + * w2$$ &= NBYTES(0x1d); 785 + * w1$$ ^= w2$$; 786 + * wd$$ = *(unative_t *)&dptr[z][d+$$*NSIZE]; 787 + * wq$$ = w1$$ ^ wd$$; 788 + * wp$$ ^= wd$$; 789 + */ 790 + asm volatile (".option push\n" 791 + ".option arch,+v\n" 792 + "vsra.vi v2, v1, 7\n" 793 + "vsll.vi v3, v1, 1\n" 794 + "vand.vx v2, v2, %[x1d]\n" 795 + "vxor.vv v3, v3, v2\n" 796 + "vle8.v v2, (%[wd0])\n" 797 + "vxor.vv v1, v3, v2\n" 798 + "vxor.vv v0, v0, v2\n" 799 + 800 + "vsra.vi v6, v5, 7\n" 801 + "vsll.vi v7, v5, 1\n" 802 + "vand.vx v6, v6, %[x1d]\n" 803 + "vxor.vv v7, v7, v6\n" 804 + "vle8.v v6, (%[wd1])\n" 805 + "vxor.vv v5, v7, v6\n" 806 + "vxor.vv v4, 
v4, v6\n" 807 + 808 + "vsra.vi v10, v9, 7\n" 809 + "vsll.vi v11, v9, 1\n" 810 + "vand.vx v10, v10, %[x1d]\n" 811 + "vxor.vv v11, v11, v10\n" 812 + "vle8.v v10, (%[wd2])\n" 813 + "vxor.vv v9, v11, v10\n" 814 + "vxor.vv v8, v8, v10\n" 815 + 816 + "vsra.vi v14, v13, 7\n" 817 + "vsll.vi v15, v13, 1\n" 818 + "vand.vx v14, v14, %[x1d]\n" 819 + "vxor.vv v15, v15, v14\n" 820 + "vle8.v v14, (%[wd3])\n" 821 + "vxor.vv v13, v15, v14\n" 822 + "vxor.vv v12, v12, v14\n" 823 + 824 + "vsra.vi v18, v17, 7\n" 825 + "vsll.vi v19, v17, 1\n" 826 + "vand.vx v18, v18, %[x1d]\n" 827 + "vxor.vv v19, v19, v18\n" 828 + "vle8.v v18, (%[wd4])\n" 829 + "vxor.vv v17, v19, v18\n" 830 + "vxor.vv v16, v16, v18\n" 831 + 832 + "vsra.vi v22, v21, 7\n" 833 + "vsll.vi v23, v21, 1\n" 834 + "vand.vx v22, v22, %[x1d]\n" 835 + "vxor.vv v23, v23, v22\n" 836 + "vle8.v v22, (%[wd5])\n" 837 + "vxor.vv v21, v23, v22\n" 838 + "vxor.vv v20, v20, v22\n" 839 + 840 + "vsra.vi v26, v25, 7\n" 841 + "vsll.vi v27, v25, 1\n" 842 + "vand.vx v26, v26, %[x1d]\n" 843 + "vxor.vv v27, v27, v26\n" 844 + "vle8.v v26, (%[wd6])\n" 845 + "vxor.vv v25, v27, v26\n" 846 + "vxor.vv v24, v24, v26\n" 847 + 848 + "vsra.vi v30, v29, 7\n" 849 + "vsll.vi v31, v29, 1\n" 850 + "vand.vx v30, v30, %[x1d]\n" 851 + "vxor.vv v31, v31, v30\n" 852 + "vle8.v v30, (%[wd7])\n" 853 + "vxor.vv v29, v31, v30\n" 854 + "vxor.vv v28, v28, v30\n" 855 + ".option pop\n" 856 + : : 857 + [wd0]"r"(&dptr[z][d + 0 * NSIZE]), 858 + [wd1]"r"(&dptr[z][d + 1 * NSIZE]), 859 + [wd2]"r"(&dptr[z][d + 2 * NSIZE]), 860 + [wd3]"r"(&dptr[z][d + 3 * NSIZE]), 861 + [wd4]"r"(&dptr[z][d + 4 * NSIZE]), 862 + [wd5]"r"(&dptr[z][d + 5 * NSIZE]), 863 + [wd6]"r"(&dptr[z][d + 6 * NSIZE]), 864 + [wd7]"r"(&dptr[z][d + 7 * NSIZE]), 865 + [x1d]"r"(0x1d) 866 + ); 867 + } 868 + 869 + /* 870 + * *(unative_t *)&p[d+NSIZE*$$] = wp$$; 871 + * *(unative_t *)&q[d+NSIZE*$$] = wq$$; 872 + */ 873 + asm volatile (".option push\n" 874 + ".option arch,+v\n" 875 + "vse8.v v0, (%[wp0])\n" 876 + "vse8.v v1, 
(%[wq0])\n" 877 + "vse8.v v4, (%[wp1])\n" 878 + "vse8.v v5, (%[wq1])\n" 879 + "vse8.v v8, (%[wp2])\n" 880 + "vse8.v v9, (%[wq2])\n" 881 + "vse8.v v12, (%[wp3])\n" 882 + "vse8.v v13, (%[wq3])\n" 883 + "vse8.v v16, (%[wp4])\n" 884 + "vse8.v v17, (%[wq4])\n" 885 + "vse8.v v20, (%[wp5])\n" 886 + "vse8.v v21, (%[wq5])\n" 887 + "vse8.v v24, (%[wp6])\n" 888 + "vse8.v v25, (%[wq6])\n" 889 + "vse8.v v28, (%[wp7])\n" 890 + "vse8.v v29, (%[wq7])\n" 891 + ".option pop\n" 892 + : : 893 + [wp0]"r"(&p[d + NSIZE * 0]), 894 + [wq0]"r"(&q[d + NSIZE * 0]), 895 + [wp1]"r"(&p[d + NSIZE * 1]), 896 + [wq1]"r"(&q[d + NSIZE * 1]), 897 + [wp2]"r"(&p[d + NSIZE * 2]), 898 + [wq2]"r"(&q[d + NSIZE * 2]), 899 + [wp3]"r"(&p[d + NSIZE * 3]), 900 + [wq3]"r"(&q[d + NSIZE * 3]), 901 + [wp4]"r"(&p[d + NSIZE * 4]), 902 + [wq4]"r"(&q[d + NSIZE * 4]), 903 + [wp5]"r"(&p[d + NSIZE * 5]), 904 + [wq5]"r"(&q[d + NSIZE * 5]), 905 + [wp6]"r"(&p[d + NSIZE * 6]), 906 + [wq6]"r"(&q[d + NSIZE * 6]), 907 + [wp7]"r"(&p[d + NSIZE * 7]), 908 + [wq7]"r"(&q[d + NSIZE * 7]) 909 + ); 910 + } 911 + } 912 + 913 + static void raid6_rvv8_xor_syndrome_real(int disks, int start, int stop, 914 + unsigned long bytes, void **ptrs) 915 + { 916 + u8 **dptr = (u8 **)ptrs; 917 + u8 *p, *q; 918 + unsigned long d; 919 + int z, z0; 920 + 921 + z0 = stop; /* P/Q right side optimization */ 922 + p = dptr[disks - 2]; /* XOR parity */ 923 + q = dptr[disks - 1]; /* RS syndrome */ 924 + 925 + asm volatile (".option push\n" 926 + ".option arch,+v\n" 927 + "vsetvli t0, x0, e8, m1, ta, ma\n" 928 + ".option pop\n" 929 + ); 930 + 931 + /* 932 + * v0:wp0, v1:wq0, v2:wd0/w20, v3:w10 933 + * v4:wp1, v5:wq1, v6:wd1/w21, v7:w11 934 + * v8:wp2, v9:wq2, v10:wd2/w22, v11:w12 935 + * v12:wp3, v13:wq3, v14:wd3/w23, v15:w13 936 + * v16:wp4, v17:wq4, v18:wd4/w24, v19:w14 937 + * v20:wp5, v21:wq5, v22:wd5/w25, v23:w15 938 + * v24:wp6, v25:wq6, v26:wd6/w26, v27:w16 939 + * v28:wp7, v29:wq7, v30:wd7/w27, v31:w17 940 + */ 941 + for (d = 0; d < bytes; d += NSIZE * 
8) { 942 + /* wq$$ = wp$$ = *(unative_t *)&dptr[z0][d+$$*NSIZE]; */ 943 + asm volatile (".option push\n" 944 + ".option arch,+v\n" 945 + "vle8.v v0, (%[wp0])\n" 946 + "vle8.v v1, (%[wp0])\n" 947 + "vle8.v v4, (%[wp1])\n" 948 + "vle8.v v5, (%[wp1])\n" 949 + "vle8.v v8, (%[wp2])\n" 950 + "vle8.v v9, (%[wp2])\n" 951 + "vle8.v v12, (%[wp3])\n" 952 + "vle8.v v13, (%[wp3])\n" 953 + "vle8.v v16, (%[wp4])\n" 954 + "vle8.v v17, (%[wp4])\n" 955 + "vle8.v v20, (%[wp5])\n" 956 + "vle8.v v21, (%[wp5])\n" 957 + "vle8.v v24, (%[wp6])\n" 958 + "vle8.v v25, (%[wp6])\n" 959 + "vle8.v v28, (%[wp7])\n" 960 + "vle8.v v29, (%[wp7])\n" 961 + ".option pop\n" 962 + : : 963 + [wp0]"r"(&dptr[z0][d + 0 * NSIZE]), 964 + [wp1]"r"(&dptr[z0][d + 1 * NSIZE]), 965 + [wp2]"r"(&dptr[z0][d + 2 * NSIZE]), 966 + [wp3]"r"(&dptr[z0][d + 3 * NSIZE]), 967 + [wp4]"r"(&dptr[z0][d + 4 * NSIZE]), 968 + [wp5]"r"(&dptr[z0][d + 5 * NSIZE]), 969 + [wp6]"r"(&dptr[z0][d + 6 * NSIZE]), 970 + [wp7]"r"(&dptr[z0][d + 7 * NSIZE]) 971 + ); 972 + 973 + /* P/Q data pages */ 974 + for (z = z0 - 1; z >= start; z--) { 975 + /* 976 + * w2$$ = MASK(wq$$); 977 + * w1$$ = SHLBYTE(wq$$); 978 + * w2$$ &= NBYTES(0x1d); 979 + * w1$$ ^= w2$$; 980 + * wd$$ = *(unative_t *)&dptr[z][d+$$*NSIZE]; 981 + * wq$$ = w1$$ ^ wd$$; 982 + * wp$$ ^= wd$$; 983 + */ 984 + asm volatile (".option push\n" 985 + ".option arch,+v\n" 986 + "vsra.vi v2, v1, 7\n" 987 + "vsll.vi v3, v1, 1\n" 988 + "vand.vx v2, v2, %[x1d]\n" 989 + "vxor.vv v3, v3, v2\n" 990 + "vle8.v v2, (%[wd0])\n" 991 + "vxor.vv v1, v3, v2\n" 992 + "vxor.vv v0, v0, v2\n" 993 + 994 + "vsra.vi v6, v5, 7\n" 995 + "vsll.vi v7, v5, 1\n" 996 + "vand.vx v6, v6, %[x1d]\n" 997 + "vxor.vv v7, v7, v6\n" 998 + "vle8.v v6, (%[wd1])\n" 999 + "vxor.vv v5, v7, v6\n" 1000 + "vxor.vv v4, v4, v6\n" 1001 + 1002 + "vsra.vi v10, v9, 7\n" 1003 + "vsll.vi v11, v9, 1\n" 1004 + "vand.vx v10, v10, %[x1d]\n" 1005 + "vxor.vv v11, v11, v10\n" 1006 + "vle8.v v10, (%[wd2])\n" 1007 + "vxor.vv v9, v11, v10\n" 1008 + "vxor.vv 
v8, v8, v10\n" 1009 + 1010 + "vsra.vi v14, v13, 7\n" 1011 + "vsll.vi v15, v13, 1\n" 1012 + "vand.vx v14, v14, %[x1d]\n" 1013 + "vxor.vv v15, v15, v14\n" 1014 + "vle8.v v14, (%[wd3])\n" 1015 + "vxor.vv v13, v15, v14\n" 1016 + "vxor.vv v12, v12, v14\n" 1017 + 1018 + "vsra.vi v18, v17, 7\n" 1019 + "vsll.vi v19, v17, 1\n" 1020 + "vand.vx v18, v18, %[x1d]\n" 1021 + "vxor.vv v19, v19, v18\n" 1022 + "vle8.v v18, (%[wd4])\n" 1023 + "vxor.vv v17, v19, v18\n" 1024 + "vxor.vv v16, v16, v18\n" 1025 + 1026 + "vsra.vi v22, v21, 7\n" 1027 + "vsll.vi v23, v21, 1\n" 1028 + "vand.vx v22, v22, %[x1d]\n" 1029 + "vxor.vv v23, v23, v22\n" 1030 + "vle8.v v22, (%[wd5])\n" 1031 + "vxor.vv v21, v23, v22\n" 1032 + "vxor.vv v20, v20, v22\n" 1033 + 1034 + "vsra.vi v26, v25, 7\n" 1035 + "vsll.vi v27, v25, 1\n" 1036 + "vand.vx v26, v26, %[x1d]\n" 1037 + "vxor.vv v27, v27, v26\n" 1038 + "vle8.v v26, (%[wd6])\n" 1039 + "vxor.vv v25, v27, v26\n" 1040 + "vxor.vv v24, v24, v26\n" 1041 + 1042 + "vsra.vi v30, v29, 7\n" 1043 + "vsll.vi v31, v29, 1\n" 1044 + "vand.vx v30, v30, %[x1d]\n" 1045 + "vxor.vv v31, v31, v30\n" 1046 + "vle8.v v30, (%[wd7])\n" 1047 + "vxor.vv v29, v31, v30\n" 1048 + "vxor.vv v28, v28, v30\n" 1049 + ".option pop\n" 1050 + : : 1051 + [wd0]"r"(&dptr[z][d + 0 * NSIZE]), 1052 + [wd1]"r"(&dptr[z][d + 1 * NSIZE]), 1053 + [wd2]"r"(&dptr[z][d + 2 * NSIZE]), 1054 + [wd3]"r"(&dptr[z][d + 3 * NSIZE]), 1055 + [wd4]"r"(&dptr[z][d + 4 * NSIZE]), 1056 + [wd5]"r"(&dptr[z][d + 5 * NSIZE]), 1057 + [wd6]"r"(&dptr[z][d + 6 * NSIZE]), 1058 + [wd7]"r"(&dptr[z][d + 7 * NSIZE]), 1059 + [x1d]"r"(0x1d) 1060 + ); 1061 + } 1062 + 1063 + /* P/Q left side optimization */ 1064 + for (z = start - 1; z >= 0; z--) { 1065 + /* 1066 + * w2$$ = MASK(wq$$); 1067 + * w1$$ = SHLBYTE(wq$$); 1068 + * w2$$ &= NBYTES(0x1d); 1069 + * wq$$ = w1$$ ^ w2$$; 1070 + */ 1071 + asm volatile (".option push\n" 1072 + ".option arch,+v\n" 1073 + "vsra.vi v2, v1, 7\n" 1074 + "vsll.vi v3, v1, 1\n" 1075 + "vand.vx v2, v2, %[x1d]\n" 1076 + 
"vxor.vv v1, v3, v2\n" 1077 + 1078 + "vsra.vi v6, v5, 7\n" 1079 + "vsll.vi v7, v5, 1\n" 1080 + "vand.vx v6, v6, %[x1d]\n" 1081 + "vxor.vv v5, v7, v6\n" 1082 + 1083 + "vsra.vi v10, v9, 7\n" 1084 + "vsll.vi v11, v9, 1\n" 1085 + "vand.vx v10, v10, %[x1d]\n" 1086 + "vxor.vv v9, v11, v10\n" 1087 + 1088 + "vsra.vi v14, v13, 7\n" 1089 + "vsll.vi v15, v13, 1\n" 1090 + "vand.vx v14, v14, %[x1d]\n" 1091 + "vxor.vv v13, v15, v14\n" 1092 + 1093 + "vsra.vi v18, v17, 7\n" 1094 + "vsll.vi v19, v17, 1\n" 1095 + "vand.vx v18, v18, %[x1d]\n" 1096 + "vxor.vv v17, v19, v18\n" 1097 + 1098 + "vsra.vi v22, v21, 7\n" 1099 + "vsll.vi v23, v21, 1\n" 1100 + "vand.vx v22, v22, %[x1d]\n" 1101 + "vxor.vv v21, v23, v22\n" 1102 + 1103 + "vsra.vi v26, v25, 7\n" 1104 + "vsll.vi v27, v25, 1\n" 1105 + "vand.vx v26, v26, %[x1d]\n" 1106 + "vxor.vv v25, v27, v26\n" 1107 + 1108 + "vsra.vi v30, v29, 7\n" 1109 + "vsll.vi v31, v29, 1\n" 1110 + "vand.vx v30, v30, %[x1d]\n" 1111 + "vxor.vv v29, v31, v30\n" 1112 + ".option pop\n" 1113 + : : 1114 + [x1d]"r"(0x1d) 1115 + ); 1116 + } 1117 + 1118 + /* 1119 + * *(unative_t *)&p[d+NSIZE*$$] ^= wp$$; 1120 + * *(unative_t *)&q[d+NSIZE*$$] ^= wq$$; 1121 + * v0:wp0, v1:wq0, v2:p0, v3:q0 1122 + * v4:wp1, v5:wq1, v6:p1, v7:q1 1123 + * v8:wp2, v9:wq2, v10:p2, v11:q2 1124 + * v12:wp3, v13:wq3, v14:p3, v15:q3 1125 + * v16:wp4, v17:wq4, v18:p4, v19:q4 1126 + * v20:wp5, v21:wq5, v22:p5, v23:q5 1127 + * v24:wp6, v25:wq6, v26:p6, v27:q6 1128 + * v28:wp7, v29:wq7, v30:p7, v31:q7 1129 + */ 1130 + asm volatile (".option push\n" 1131 + ".option arch,+v\n" 1132 + "vle8.v v2, (%[wp0])\n" 1133 + "vle8.v v3, (%[wq0])\n" 1134 + "vxor.vv v2, v2, v0\n" 1135 + "vxor.vv v3, v3, v1\n" 1136 + "vse8.v v2, (%[wp0])\n" 1137 + "vse8.v v3, (%[wq0])\n" 1138 + 1139 + "vle8.v v6, (%[wp1])\n" 1140 + "vle8.v v7, (%[wq1])\n" 1141 + "vxor.vv v6, v6, v4\n" 1142 + "vxor.vv v7, v7, v5\n" 1143 + "vse8.v v6, (%[wp1])\n" 1144 + "vse8.v v7, (%[wq1])\n" 1145 + 1146 + "vle8.v v10, (%[wp2])\n" 1147 + "vle8.v v11, 
(%[wq2])\n" 1148 + "vxor.vv v10, v10, v8\n" 1149 + "vxor.vv v11, v11, v9\n" 1150 + "vse8.v v10, (%[wp2])\n" 1151 + "vse8.v v11, (%[wq2])\n" 1152 + 1153 + "vle8.v v14, (%[wp3])\n" 1154 + "vle8.v v15, (%[wq3])\n" 1155 + "vxor.vv v14, v14, v12\n" 1156 + "vxor.vv v15, v15, v13\n" 1157 + "vse8.v v14, (%[wp3])\n" 1158 + "vse8.v v15, (%[wq3])\n" 1159 + 1160 + "vle8.v v18, (%[wp4])\n" 1161 + "vle8.v v19, (%[wq4])\n" 1162 + "vxor.vv v18, v18, v16\n" 1163 + "vxor.vv v19, v19, v17\n" 1164 + "vse8.v v18, (%[wp4])\n" 1165 + "vse8.v v19, (%[wq4])\n" 1166 + 1167 + "vle8.v v22, (%[wp5])\n" 1168 + "vle8.v v23, (%[wq5])\n" 1169 + "vxor.vv v22, v22, v20\n" 1170 + "vxor.vv v23, v23, v21\n" 1171 + "vse8.v v22, (%[wp5])\n" 1172 + "vse8.v v23, (%[wq5])\n" 1173 + 1174 + "vle8.v v26, (%[wp6])\n" 1175 + "vle8.v v27, (%[wq6])\n" 1176 + "vxor.vv v26, v26, v24\n" 1177 + "vxor.vv v27, v27, v25\n" 1178 + "vse8.v v26, (%[wp6])\n" 1179 + "vse8.v v27, (%[wq6])\n" 1180 + 1181 + "vle8.v v30, (%[wp7])\n" 1182 + "vle8.v v31, (%[wq7])\n" 1183 + "vxor.vv v30, v30, v28\n" 1184 + "vxor.vv v31, v31, v29\n" 1185 + "vse8.v v30, (%[wp7])\n" 1186 + "vse8.v v31, (%[wq7])\n" 1187 + ".option pop\n" 1188 + : : 1189 + [wp0]"r"(&p[d + NSIZE * 0]), 1190 + [wq0]"r"(&q[d + NSIZE * 0]), 1191 + [wp1]"r"(&p[d + NSIZE * 1]), 1192 + [wq1]"r"(&q[d + NSIZE * 1]), 1193 + [wp2]"r"(&p[d + NSIZE * 2]), 1194 + [wq2]"r"(&q[d + NSIZE * 2]), 1195 + [wp3]"r"(&p[d + NSIZE * 3]), 1196 + [wq3]"r"(&q[d + NSIZE * 3]), 1197 + [wp4]"r"(&p[d + NSIZE * 4]), 1198 + [wq4]"r"(&q[d + NSIZE * 4]), 1199 + [wp5]"r"(&p[d + NSIZE * 5]), 1200 + [wq5]"r"(&q[d + NSIZE * 5]), 1201 + [wp6]"r"(&p[d + NSIZE * 6]), 1202 + [wq6]"r"(&q[d + NSIZE * 6]), 1203 + [wp7]"r"(&p[d + NSIZE * 7]), 1204 + [wq7]"r"(&q[d + NSIZE * 7]) 1205 + ); 1206 + } 1207 + } 1208 + 1209 + RAID6_RVV_WRAPPER(1); 1210 + RAID6_RVV_WRAPPER(2); 1211 + RAID6_RVV_WRAPPER(4); 1212 + RAID6_RVV_WRAPPER(8);
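Every inline-assembly block in the RVV routines above repeats one per-byte step, which the comments write as MASK / SHLBYTE / `& NBYTES(0x1d)`. This is multiplication by 2 in GF(2^8), folded into the running Q syndrome while P accumulates a plain XOR. The scalar sketch below illustrates that step for a single byte column. The names `gf_mul2` and `gen_syndrome_byte` are hypothetical helpers introduced here for illustration and are not part of the patch.

```c
#include <stdint.h>

/*
 * Hypothetical scalar helper: multiply one byte by 2 in GF(2^8),
 * reducing with the constant 0x1d as the vector code does.
 */
static uint8_t gf_mul2(uint8_t v)
{
	/* MASK: 0xff if the top bit is set, else 0x00 (vsra.vi ..., 7 on e8) */
	uint8_t mask = (v & 0x80) ? 0xff : 0x00;
	/* SHLBYTE: shift left by one within the byte (vsll.vi ..., 1) */
	uint8_t shl = (uint8_t)(v << 1);

	/* vand.vx with 0x1d, then vxor.vv */
	return (uint8_t)(shl ^ (mask & 0x1d));
}

/*
 * Hypothetical scalar equivalent of the gen_syndrome loop for one byte
 * position: data[ndata - 1] plays the role of dptr[z0][d].
 */
static void gen_syndrome_byte(const uint8_t *data, int ndata,
			      uint8_t *p, uint8_t *q)
{
	uint8_t wp = data[ndata - 1];	/* wq$$ = wp$$ = highest data disk */
	uint8_t wq = wp;
	int z;

	for (z = ndata - 2; z >= 0; z--) {
		wq = (uint8_t)(gf_mul2(wq) ^ data[z]);	/* wq = 2*wq ^ wd */
		wp ^= data[z];				/* wp ^= wd */
	}

	*p = wp;	/* XOR parity */
	*q = wq;	/* RS syndrome */
}
```

The "left side optimization" loops in the xor_syndrome variants apply only `gf_mul2` for disks below `start`, since those disks contribute no data to the update.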
+39
lib/raid6/rvv.h
@@ -0,0 +1,39 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * Copyright 2024 Institute of Software, CAS.
+ *
+ * raid6/rvv.h
+ *
+ * Definitions for RISC-V RAID-6 code
+ */
+
+#define RAID6_RVV_WRAPPER(_n)						\
+	static void raid6_rvv ## _n ## _gen_syndrome(int disks,	\
+					size_t bytes, void **ptrs)	\
+	{								\
+		void raid6_rvv ## _n ## _gen_syndrome_real(int d,	\
+					unsigned long b, void **p);	\
+		kernel_vector_begin();					\
+		raid6_rvv ## _n ## _gen_syndrome_real(disks,		\
+				(unsigned long)bytes, ptrs);		\
+		kernel_vector_end();					\
+	}								\
+	static void raid6_rvv ## _n ## _xor_syndrome(int disks,	\
+					int start, int stop,		\
+					size_t bytes, void **ptrs)	\
+	{								\
+		void raid6_rvv ## _n ## _xor_syndrome_real(int d,	\
+					int s1, int s2,			\
+					unsigned long b, void **p);	\
+		kernel_vector_begin();					\
+		raid6_rvv ## _n ## _xor_syndrome_real(disks,		\
+			start, stop, (unsigned long)bytes, ptrs);	\
+		kernel_vector_end();					\
+	}								\
+	struct raid6_calls const raid6_rvvx ## _n = {			\
+		raid6_rvv ## _n ## _gen_syndrome,			\
+		raid6_rvv ## _n ## _xor_syndrome,			\
+		rvv_has_vector,						\
+		"rvvx" #_n,						\
+		0							\
+	}
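The wrapper macro in rvv.h relies on two preprocessor features: `##` to paste the unroll factor into function and table names, and `#` to stringify it for the algorithm name. The user-space sketch below shows the same pattern in miniature. Everything prefixed `demo_` or `DEMO_` is a hypothetical stand-in invented for this illustration; in particular `demo_vector_begin()` and `demo_vector_end()` only mimic the bracketing role of `kernel_vector_begin()` and `kernel_vector_end()` and do not touch any vector state.

```c
#include <stddef.h>

/* Hypothetical stand-ins for kernel_vector_begin()/kernel_vector_end(). */
static int vector_depth;
static int work_done;

static void demo_vector_begin(void) { vector_depth++; }
static void demo_vector_end(void)   { vector_depth--; }

/* Cut-down, hypothetical analogue of struct raid6_calls. */
struct demo_calls {
	void (*gen_syndrome)(int disks, size_t bytes, void **ptrs);
	const char *name;
};

#define DEMO_WRAPPER(_n)						\
	/* "real" worker: only counts work when inside the bracket */	\
	static void demo ## _n ## _gen_real(int disks,			\
			unsigned long bytes, void **ptrs)		\
	{								\
		(void)disks; (void)bytes; (void)ptrs;			\
		if (vector_depth == 1)					\
			work_done += _n;				\
	}								\
	/* public entry point: bracket the worker with begin/end */	\
	static void demo ## _n ## _gen(int disks, size_t bytes,		\
			void **ptrs)					\
	{								\
		demo_vector_begin();					\
		demo ## _n ## _gen_real(disks,				\
				(unsigned long)bytes, ptrs);		\
		demo_vector_end();					\
	}								\
	/* no trailing semicolon, so the invocation supplies it */	\
	static const struct demo_calls demo_calls ## _n = {		\
		demo ## _n ## _gen,					\
		"demox" #_n,						\
	}

DEMO_WRAPPER(1);
DEMO_WRAPPER(4);
```

As in the patch, the macro body ends with an initializer and no semicolon, which is why each invocation can be written as a statement-like `DEMO_WRAPPER(4);`.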
+6
tools/perf/util/symbol-elf.c
@@ -1668,6 +1668,12 @@
 			continue;
 		}
 
+		/* Reject RISCV ELF "mapping symbols" */
+		if (ehdr.e_machine == EM_RISCV) {
+			if (elf_name[0] == '$' && strchr("dx", elf_name[1]))
+				continue;
+		}
+
 		if (runtime_ss->opdsec && sym.st_shndx == runtime_ss->opdidx) {
 			u32 offset = sym.st_value - syms_ss->opdshdr.sh_addr;
 			u64 *opd = opddata->d_buf + offset;
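The symbol-elf.c check above can be isolated into a small predicate to see exactly which names it rejects. The sketch below is a standalone illustration. The function name `is_riscv_mapping_symbol` is hypothetical and does not exist in perf, and the `e_machine == EM_RISCV` guard from the patch is left to the caller.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper mirroring the test added to symbol-elf.c:
 * a name is treated as a mapping symbol when it starts with '$'
 * and its second character is found in the string "dx".
 */
static bool is_riscv_mapping_symbol(const char *name)
{
	/*
	 * strchr() considers the terminating NUL part of the searched
	 * string, so a bare "$" (second character '\0') also matches.
	 */
	return name[0] == '$' && strchr("dx", name[1]) != NULL;
}
```

Because only the first two characters are inspected, longer names that begin with `$x` or `$d` are rejected as well, while ordinary symbols and other `$`-prefixed names pass through.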
+2
tools/testing/selftests/vDSO/vgetrandom-chacha.S
@@ -11,6 +11,8 @@
 #include "../../../../arch/loongarch/vdso/vgetrandom-chacha.S"
 #elif defined(__powerpc__) || defined(__powerpc64__)
 #include "../../../../arch/powerpc/kernel/vdso/vgetrandom-chacha.S"
+#elif defined(__riscv) && __riscv_xlen == 64
+#include "../../../../arch/riscv/kernel/vdso/vgetrandom-chacha.S"
 #elif defined(__s390x__)
 #include "../../../../arch/s390/kernel/vdso64/vgetrandom-chacha.S"
 #elif defined(__x86_64__)