Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge tag 'riscv-for-linus-4.18-merge_window' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux

Pull RISC-V updates from Palmer Dabbelt:
"This contains some small RISC-V updates I'd like to target for 4.18.

They are all fairly small this time. Here's a short summary, there's
more info in the commits/merges:

- a fix to __clear_user to respect the passed arguments.

- enough support for the perf subsystem to work with RISC-V's ISA
defined performance counters.

- support for sparse and cleanups suggested by it.

- support for R_RISCV_32 (a relocation, not the 32-bit ISA).

- some MAINTAINERS cleanups.

- the addition of CONFIG_HVC_RISCV_SBI to our defconfig, as it's
always present.

I've given these a simple build+boot test"

* tag 'riscv-for-linus-4.18-merge_window' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux:
RISC-V: Add CONFIG_HVC_RISCV_SBI=y to defconfig
RISC-V: Handle R_RISCV_32 in modules
riscv/ftrace: Export _mcount when DYNAMIC_FTRACE isn't set
riscv: add riscv-specific predefines to CHECKFLAGS
riscv: split the declaration of __copy_user
riscv: no __user for probe_kernel_address()
riscv: use NULL instead of a plain 0
perf: riscv: Add Document for Future Porting Guide
perf: riscv: preliminary RISC-V support
MAINTAINERS: Update Albert's email, he's back at Berkeley
MAINTAINERS: Add myself as a maintainer for SiFive's drivers
riscv: Fix the bug in memory access fixup code

+884 -15
+249
Documentation/riscv/pmu.txt
···
+Supporting PMUs on RISC-V platforms
+==========================================
+Alan Kao <alankao@andestech.com>, Mar 2018
+
+Introduction
+------------
+
+As of this writing, perf_event-related features mentioned in The RISC-V ISA
+Privileged Version 1.10 are as follows:
+(please check the manual for more details)
+
+* [m|s]counteren
+* mcycle[h], cycle[h]
+* minstret[h], instret[h]
+* mhpeventx, mhpcounterx[h]
+
+With such function set only, porting perf would require a lot of work, due to
+the lack of the following general architectural performance monitoring features:
+
+* Enabling/Disabling counters
+  Counters are just free-running all the time in our case.
+* Interrupt caused by counter overflow
+  No such feature in the spec.
+* Interrupt indicator
+  It is not possible to have many interrupt ports for all counters, so an
+  interrupt indicator is required for software to tell which counter has
+  just overflowed.
+* Writing to counters
+  There will be an SBI to support this since the kernel cannot modify the
+  counters [1]. Alternatively, some vendor considers to implement
+  hardware-extension for M-S-U model machines to write counters directly.
+
+This document aims to provide developers a quick guide on supporting their
+PMUs in the kernel. The following sections briefly explain perf' mechanism
+and todos.
+
+You may check previous discussions here [1][2]. Also, it might be helpful
+to check the appendix for related kernel structures.
+
+
+1. Initialization
+-----------------
+
+*riscv_pmu* is a global pointer of type *struct riscv_pmu*, which contains
+various methods according to perf's internal convention and PMU-specific
+parameters. One should declare such instance to represent the PMU. By default,
+*riscv_pmu* points to a constant structure *riscv_base_pmu*, which has very
+basic support to a baseline QEMU model.
+
+Then he/she can either assign the instance's pointer to *riscv_pmu* so that
+the minimal and already-implemented logic can be leveraged, or invent his/her
+own *riscv_init_platform_pmu* implementation.
+
+In other words, existing sources of *riscv_base_pmu* merely provide a
+reference implementation. Developers can flexibly decide how many parts they
+can leverage, and in the most extreme case, they can customize every function
+according to their needs.
+
+
+2. Event Initialization
+-----------------------
+
+When a user launches a perf command to monitor some events, it is first
+interpreted by the userspace perf tool into multiple *perf_event_open*
+system calls, and then each of them calls to the body of *event_init*
+member function that was assigned in the previous step. In *riscv_base_pmu*'s
+case, it is *riscv_event_init*.
+
+The main purpose of this function is to translate the event provided by user
+into bitmap, so that HW-related control registers or counters can directly be
+manipulated. The translation is based on the mappings and methods provided in
+*riscv_pmu*.
+
+Note that some features can be done in this stage as well:
+
+(1) interrupt setting, which is stated in the next section;
+(2) privilege level setting (user space only, kernel space only, both);
+(3) destructor setting. Normally it is sufficient to apply *riscv_destroy_event*;
+(4) tweaks for non-sampling events, which will be utilized by functions such as
+*perf_adjust_period*, usually something like the follows:
+
+if (!is_sampling_event(event)) {
+        hwc->sample_period = x86_pmu.max_period;
+        hwc->last_period = hwc->sample_period;
+        local64_set(&hwc->period_left, hwc->sample_period);
+}
+
+In the case of *riscv_base_pmu*, only (3) is provided for now.
+
+
+3. Interrupt
+------------
+
+3.1. Interrupt Initialization
+
+This often occurs at the beginning of the *event_init* method. In common
+practice, this should be a code segment like
+
+int x86_reserve_hardware(void)
+{
+        int err = 0;
+
+        if (!atomic_inc_not_zero(&pmc_refcount)) {
+                mutex_lock(&pmc_reserve_mutex);
+                if (atomic_read(&pmc_refcount) == 0) {
+                        if (!reserve_pmc_hardware())
+                                err = -EBUSY;
+                        else
+                                reserve_ds_buffers();
+                }
+                if (!err)
+                        atomic_inc(&pmc_refcount);
+                mutex_unlock(&pmc_reserve_mutex);
+        }
+
+        return err;
+}
+
+And the magic is in *reserve_pmc_hardware*, which usually does atomic
+operations to make implemented IRQ accessible from some global function pointer.
+*release_pmc_hardware* serves the opposite purpose, and it is used in event
+destructors mentioned in previous section.
+
+(Note: From the implementations in all the architectures, the *reserve/release*
+pair are always IRQ settings, so the *pmc_hardware* seems somehow misleading.
+It does NOT deal with the binding between an event and a physical counter,
+which will be introduced in the next section.)
+
+3.2. IRQ Structure
+
+Basically, a IRQ runs the following pseudo code:
+
+for each hardware counter that triggered this overflow
+
+    get the event of this counter
+
+    // following two steps are defined as *read()*,
+    // check the section Reading/Writing Counters for details.
+    count the delta value since previous interrupt
+    update the event->count (# event occurs) by adding delta, and
+               event->hw.period_left by subtracting delta
+
+    if the event overflows
+        sample data
+        set the counter appropriately for the next overflow
+
+        if the event overflows again
+            too frequently, throttle this event
+        fi
+    fi
+
+end for
+
+However as of this writing, none of the RISC-V implementations have designed an
+interrupt for perf, so the details are to be completed in the future.
+
+4. Reading/Writing Counters
+---------------------------
+
+They seem symmetric but perf treats them quite differently. For reading, there
+is a *read* interface in *struct pmu*, but it serves more than just reading.
+According to the context, the *read* function not only reads the content of the
+counter (event->count), but also updates the left period to the next interrupt
+(event->hw.period_left).
+
+But the core of perf does not need direct write to counters. Writing counters
+is hidden behind the abstraction of 1) *pmu->start*, literally start counting so one
+has to set the counter to a good value for the next interrupt; 2) inside the IRQ
+it should set the counter to the same resonable value.
+
+Reading is not a problem in RISC-V but writing would need some effort, since
+counters are not allowed to be written by S-mode.
+
+
+5. add()/del()/start()/stop()
+-----------------------------
+
+Basic idea: add()/del() adds/deletes events to/from a PMU, and start()/stop()
+starts/stop the counter of some event in the PMU. All of them take the same
+arguments: *struct perf_event *event* and *int flag*.
+
+Consider perf as a state machine, then you will find that these functions serve
+as the state transition process between those states.
+Three states (event->hw.state) are defined:
+
+* PERF_HES_STOPPED: the counter is stopped
+* PERF_HES_UPTODATE: the event->count is up-to-date
+* PERF_HES_ARCH: arch-dependent usage ... we don't need this for now
+
+A normal flow of these state transitions are as follows:
+
+* A user launches a perf event, resulting in calling to *event_init*.
+* When being context-switched in, *add* is called by the perf core, with a flag
+  PERF_EF_START, which means that the event should be started after it is added.
+  At this stage, a general event is bound to a physical counter, if any.
+  The state changes to PERF_HES_STOPPED and PERF_HES_UPTODATE, because it is now
+  stopped, and the (software) event count does not need updating.
+** *start* is then called, and the counter is enabled.
+   With flag PERF_EF_RELOAD, it writes an appropriate value to the counter (check
+   previous section for detail).
+   Nothing is written if the flag does not contain PERF_EF_RELOAD.
+   The state now is reset to none, because it is neither stopped nor updated
+   (the counting already started)
+* When being context-switched out, *del* is called. It then checks out all the
+  events in the PMU and calls *stop* to update their counts.
+** *stop* is called by *del*
+   and the perf core with flag PERF_EF_UPDATE, and it often shares the same
+   subroutine as *read* with the same logic.
+   The state changes to PERF_HES_STOPPED and PERF_HES_UPTODATE, again.
+
+** Life cycle of these two pairs: *add* and *del* are called repeatedly as
+  tasks switch in-and-out; *start* and *stop* is also called when the perf core
+  needs a quick stop-and-start, for instance, when the interrupt period is being
+  adjusted.
+
+Current implementation is sufficient for now and can be easily extended to
+features in the future.
+
+A. Related Structures
+---------------------
+
+* struct pmu: include/linux/perf_event.h
+* struct riscv_pmu: arch/riscv/include/asm/perf_event.h
+
+  Both structures are designed to be read-only.
+
+  *struct pmu* defines some function pointer interfaces, and most of them take
+*struct perf_event* as a main argument, dealing with perf events according to
+perf's internal state machine (check kernel/events/core.c for details).
+
+  *struct riscv_pmu* defines PMU-specific parameters. The naming follows the
+convention of all other architectures.
+
+* struct perf_event: include/linux/perf_event.h
+* struct hw_perf_event
+
+  The generic structure that represents perf events, and the hardware-related
+details.
+
+* struct riscv_hw_events: arch/riscv/include/asm/perf_event.h
+
+  The structure that holds the status of events, has two fixed members:
+the number of events and the array of the events.
+
+References
+----------
+
+[1] https://github.com/riscv/riscv-linux/pull/124
+[2] https://groups.google.com/a/groups.riscv.org/forum/#!topic/sw-dev/f19TmCNP6yA
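Note on section 5 of pmu.txt: the add()/del()/start()/stop() flow can be modelled in plain userspace C. What follows is a minimal sketch under stated assumptions, not kernel code. The HES_* and EF_* constants are local stand-ins for the PERF_HES_* and PERF_EF_* flags, and struct model_event, its hw field (a free-running counter) and the model_* functions are hypothetical names invented for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins for the PERF_HES_* state bits and PERF_EF_* flags. */
enum { HES_STOPPED = 0x01, HES_UPTODATE = 0x02 };
enum { EF_START = 0x01, EF_RELOAD = 0x02, EF_UPDATE = 0x04 };

/* Hypothetical model of one event bound to a free-running counter. */
struct model_event {
	int state;       /* combination of HES_* bits */
	uint64_t hw;     /* free-running hardware counter (never written) */
	uint64_t prev;   /* counter snapshot taken at start or last read */
	uint64_t count;  /* software-maintained event count */
};

/* read: fold the ticks since the last snapshot into the software count. */
static void model_read(struct model_event *e)
{
	e->count += e->hw - e->prev;
	e->prev = e->hw;
}

/* start: only legal from the stopped state; clears all state bits. */
static void model_start(struct model_event *e, int flags)
{
	(void)flags; /* EF_RELOAD would program the counter if it were writable */
	assert(e->state & HES_STOPPED);
	e->state = 0;
	e->prev = e->hw; /* the counter cannot be reset, so snapshot it instead */
}

/* stop: mark stopped, and bring the count up to date when asked to. */
static void model_stop(struct model_event *e, int flags)
{
	e->state |= HES_STOPPED;
	if ((flags & EF_UPDATE) && !(e->state & HES_UPTODATE)) {
		model_read(e);
		e->state |= HES_UPTODATE;
	}
}

/* add: the event enters stopped and up to date, then optionally starts. */
static void model_add(struct model_event *e, int flags)
{
	e->state = HES_STOPPED | HES_UPTODATE;
	if (flags & EF_START)
		model_start(e, EF_RELOAD);
}

/* del: stop the event and update its count on the way out. */
static void model_del(struct model_event *e)
{
	model_stop(e, EF_UPDATE);
}
```

In this model, adding with EF_START leaves the state at 0 (running), and model_del() returns it to HES_STOPPED | HES_UPTODATE with the elapsed ticks folded into count.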
+9 -1
MAINTAINERS
···

 RISC-V ARCHITECTURE
 M:	Palmer Dabbelt <palmer@sifive.com>
-M:	Albert Ou <albert@sifive.com>
+M:	Albert Ou <aou@eecs.berkeley.edu>
 L:	linux-riscv@lists.infradead.org
 T:	git git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux.git
 S:	Supported
···
 F:	drivers/media/usb/siano/
 F:	drivers/media/usb/siano/
 F:	drivers/media/mmc/siano/
+
+SIFIVE DRIVERS
+M:	Palmer Dabbelt <palmer@sifive.com>
+L:	linux-riscv@lists.infradead.org
+T:	git git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux.git
+S:	Supported
+K:	sifive
+N:	sifive

 SILEAD TOUCHSCREEN DRIVER
 M:	Hans de Goede <hdegoede@redhat.com>
+14
arch/riscv/Kconfig
···
 	select HAVE_MEMBLOCK_NODE_MAP
 	select HAVE_DMA_CONTIGUOUS
 	select HAVE_GENERIC_DMA_COHERENT
+	select HAVE_PERF_EVENTS
 	select IRQ_DOMAIN
 	select NO_BOOTMEM
 	select RISCV_ISA_A if SMP
···

 config RISCV_ISA_A
 	def_bool y
+
+menu "supported PMU type"
+	depends on PERF_EVENTS
+
+config RISCV_BASE_PMU
+	bool "Base Performance Monitoring Unit"
+	def_bool y
+	help
+	  A base PMU that serves as a reference implementation and has limited
+	  feature of perf. It can run on any RISC-V machines so serves as the
+	  fallback, but this option can also be disable to reduce kernel size.
+
+endmenu

 endmenu

+3
arch/riscv/Makefile
···
 # architectures. It's faster to have GCC emit only aligned accesses.
 KBUILD_CFLAGS += $(call cc-option,-mstrict-align)

+# arch specific predefines for sparse
+CHECKFLAGS += -D__riscv -D__riscv_xlen=$(BITS)
+
 head-y := arch/riscv/kernel/head.o

 core-y += arch/riscv/kernel/ arch/riscv/mm/
+1
arch/riscv/configs/defconfig
···
 CONFIG_SERIAL_8250=y
 CONFIG_SERIAL_8250_CONSOLE=y
 CONFIG_SERIAL_OF_PLATFORM=y
+CONFIG_HVC_RISCV_SBI=y
 # CONFIG_PTP_1588_CLOCK is not set
 CONFIG_DRM=y
 CONFIG_DRM_RADEON=y
+1
arch/riscv/include/asm/Kbuild
···
 generic-y += kmap_types.h
 generic-y += kvm_para.h
 generic-y += local.h
+generic-y += local64.h
 generic-y += mm-arch-hooks.h
 generic-y += mman.h
 generic-y += module.h
+1 -1
arch/riscv/include/asm/cacheflush.h
···

 #else /* CONFIG_SMP */

-#define flush_icache_all() sbi_remote_fence_i(0)
+#define flush_icache_all() sbi_remote_fence_i(NULL)
 void flush_icache_mm(struct mm_struct *mm, bool local);

 #endif /* CONFIG_SMP */
+84
arch/riscv/include/asm/perf_event.h
···
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2018 SiFive
+ * Copyright (C) 2018 Andes Technology Corporation
+ *
+ */
+
+#ifndef _ASM_RISCV_PERF_EVENT_H
+#define _ASM_RISCV_PERF_EVENT_H
+
+#include <linux/perf_event.h>
+#include <linux/ptrace.h>
+
+#define RISCV_BASE_COUNTERS	2
+
+/*
+ * The RISCV_MAX_COUNTERS parameter should be specified.
+ */
+
+#ifdef CONFIG_RISCV_BASE_PMU
+#define RISCV_MAX_COUNTERS	2
+#endif
+
+#ifndef RISCV_MAX_COUNTERS
+#error "Please provide a valid RISCV_MAX_COUNTERS for the PMU."
+#endif
+
+/*
+ * These are the indexes of bits in counteren register *minus* 1,
+ * except for cycle. It would be coherent if it can directly mapped
+ * to counteren bit definition, but there is a *time* register at
+ * counteren[1]. Per-cpu structure is scarce resource here.
+ *
+ * According to the spec, an implementation can support counter up to
+ * mhpmcounter31, but many high-end processors has at most 6 general
+ * PMCs, we give the definition to MHPMCOUNTER8 here.
+ */
+#define RISCV_PMU_CYCLE		0
+#define RISCV_PMU_INSTRET	1
+#define RISCV_PMU_MHPMCOUNTER3	2
+#define RISCV_PMU_MHPMCOUNTER4	3
+#define RISCV_PMU_MHPMCOUNTER5	4
+#define RISCV_PMU_MHPMCOUNTER6	5
+#define RISCV_PMU_MHPMCOUNTER7	6
+#define RISCV_PMU_MHPMCOUNTER8	7
+
+#define RISCV_OP_UNSUPP		(-EOPNOTSUPP)
+
+struct cpu_hw_events {
+	/* # currently enabled events*/
+	int n_events;
+	/* currently enabled events */
+	struct perf_event *events[RISCV_MAX_COUNTERS];
+	/* vendor-defined PMU data */
+	void *platform;
+};
+
+struct riscv_pmu {
+	struct pmu *pmu;
+
+	/* generic hw/cache events table */
+	const int *hw_events;
+	const int (*cache_events)[PERF_COUNT_HW_CACHE_MAX]
+				 [PERF_COUNT_HW_CACHE_OP_MAX]
+				 [PERF_COUNT_HW_CACHE_RESULT_MAX];
+	/* method used to map hw/cache events */
+	int (*map_hw_event)(u64 config);
+	int (*map_cache_event)(u64 config);
+
+	/* max generic hw events in map */
+	int max_events;
+	/* number total counters, 2(base) + x(general) */
+	int num_counters;
+	/* the width of the counter */
+	int counter_width;
+
+	/* vendor-defined PMU features */
+	void *platform;
+
+	irqreturn_t (*handle_irq)(int irq_num, void *dev);
+	int irq;
+};
+
+#endif /* _ASM_RISCV_PERF_EVENT_H */
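Note on the index comment in perf_event.h: the RISCV_PMU_* indexes skip the *time* bit, so an index maps back to a counteren bit by adding one, except for cycle. A small sketch of that mapping, assuming the counteren layout of the Privileged spec 1.10 (bit 0 cycle, bit 1 time, bit 2 instret, bit 3 onward the hpmcounters). The helper name is hypothetical and is not part of the patch.

```c
#include <assert.h>

/*
 * Hypothetical helper: translate a RISCV_PMU_* index (0 = cycle,
 * 1 = instret, 2 = mhpmcounter3, ...) into its counteren bit position.
 */
static unsigned int pmu_index_to_counteren_bit(unsigned int idx)
{
	/* RISCV_PMU_MHPMCOUNTER8 (7) is the last index the header defines. */
	assert(idx <= 7);

	/* cycle sits at bit 0; everything else is shifted up past time. */
	return idx == 0 ? 0 : idx + 1;
}
```

So index 1 (instret) lands on counteren bit 2, and index 2 (mhpmcounter3) on bit 3.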
+1 -1
arch/riscv/include/asm/tlbflush.h
···

 #include <asm/sbi.h>

-#define flush_tlb_all() sbi_remote_sfence_vma(0, 0, -1)
+#define flush_tlb_all() sbi_remote_sfence_vma(NULL, 0, -1)
 #define flush_tlb_page(vma, addr) flush_tlb_range(vma, addr, 0)
 #define flush_tlb_range(vma, start, end) \
 	sbi_remote_sfence_vma(mm_cpumask((vma)->vm_mm)->bits, \
+5 -3
arch/riscv/include/asm/uaccess.h
···
 })


-extern unsigned long __must_check __copy_user(void __user *to,
+extern unsigned long __must_check __asm_copy_to_user(void __user *to,
+	const void *from, unsigned long n);
+extern unsigned long __must_check __asm_copy_from_user(void *to,
 	const void __user *from, unsigned long n);

 static inline unsigned long
 raw_copy_from_user(void *to, const void __user *from, unsigned long n)
 {
-	return __copy_user(to, from, n);
+	return __asm_copy_to_user(to, from, n);
 }

 static inline unsigned long
 raw_copy_to_user(void __user *to, const void *from, unsigned long n)
 {
-	return __copy_user(to, from, n);
+	return __asm_copy_from_user(to, from, n);
 }

 extern long strncpy_from_user(char *dest, const char __user *src, long count);
+2
arch/riscv/kernel/Makefile
···
 obj-$(CONFIG_FUNCTION_TRACER)	+= mcount.o ftrace.o
 obj-$(CONFIG_DYNAMIC_FTRACE)	+= mcount-dyn.o

+obj-$(CONFIG_PERF_EVENTS)	+= perf_event.o
+
 clean:
+1 -1
arch/riscv/kernel/mcount.S
···
 	RESTORE_ABI_STATE
 	ret
 ENDPROC(_mcount)
-EXPORT_SYMBOL(_mcount)
 #endif
+EXPORT_SYMBOL(_mcount)
+12
arch/riscv/kernel/module.c
···
 #include <linux/errno.h>
 #include <linux/moduleloader.h>

+static int apply_r_riscv_32_rela(struct module *me, u32 *location, Elf_Addr v)
+{
+	if (v != (u32)v) {
+		pr_err("%s: value %016llx out of range for 32-bit field\n",
+		       me->name, v);
+		return -EINVAL;
+	}
+	*location = v;
+	return 0;
+}
+
 static int apply_r_riscv_64_rela(struct module *me, u32 *location, Elf_Addr v)
 {
 	*(u64 *)location = v;
···

 static int (*reloc_handlers_rela[]) (struct module *me, u32 *location,
 				Elf_Addr v) = {
+	[R_RISCV_32]			= apply_r_riscv_32_rela,
 	[R_RISCV_64]			= apply_r_riscv_64_rela,
 	[R_RISCV_BRANCH]		= apply_r_riscv_branch_rela,
 	[R_RISCV_JAL]			= apply_r_riscv_jal_rela,
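Note on the R_RISCV_32 handler: its range check relies on a round trip through a 32-bit type, so any value that does not survive truncation is rejected before the store. A userspace sketch of the same check, assuming a 64-bit address type. The function name store_reloc_32 is hypothetical, and the kernel's pr_err() reporting is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical userspace analogue of apply_r_riscv_32_rela(): store v
 * into a 32-bit slot only if no information would be lost.
 */
static int store_reloc_32(uint32_t *location, uint64_t v)
{
	assert(location != NULL);

	/* Truncate to 32 bits and compare: a mismatch means v does not fit. */
	if (v != (uint32_t)v)
		return -EINVAL;

	*location = (uint32_t)v;
	return 0;
}
```

A value such as 0xdeadbeef is stored as-is, while 0x100000000 is refused with -EINVAL and leaves the slot untouched.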
+485
arch/riscv/kernel/perf_event.c
···
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2008 Thomas Gleixner <tglx@linutronix.de>
+ * Copyright (C) 2008-2009 Red Hat, Inc., Ingo Molnar
+ * Copyright (C) 2009 Jaswinder Singh Rajput
+ * Copyright (C) 2009 Advanced Micro Devices, Inc., Robert Richter
+ * Copyright (C) 2008-2009 Red Hat, Inc., Peter Zijlstra
+ * Copyright (C) 2009 Intel Corporation, <markus.t.metzger@intel.com>
+ * Copyright (C) 2009 Google, Inc., Stephane Eranian
+ * Copyright 2014 Tilera Corporation. All Rights Reserved.
+ * Copyright (C) 2018 Andes Technology Corporation
+ *
+ * Perf_events support for RISC-V platforms.
+ *
+ * Since the spec. (as of now, Priv-Spec 1.10) does not provide enough
+ * functionality for perf event to fully work, this file provides
+ * the very basic framework only.
+ *
+ * For platform portings, please check Documentations/riscv/pmu.txt.
+ *
+ * The Copyright line includes x86 and tile ones.
+ */
+
+#include <linux/kprobes.h>
+#include <linux/kernel.h>
+#include <linux/kdebug.h>
+#include <linux/mutex.h>
+#include <linux/bitmap.h>
+#include <linux/irq.h>
+#include <linux/interrupt.h>
+#include <linux/perf_event.h>
+#include <linux/atomic.h>
+#include <linux/of.h>
+#include <asm/perf_event.h>
+
+static const struct riscv_pmu *riscv_pmu __read_mostly;
+static DEFINE_PER_CPU(struct cpu_hw_events, cpu_hw_events);
+
+/*
+ * Hardware & cache maps and their methods
+ */
+
+static const int riscv_hw_event_map[] = {
+	[PERF_COUNT_HW_CPU_CYCLES]		= RISCV_PMU_CYCLE,
+	[PERF_COUNT_HW_INSTRUCTIONS]		= RISCV_PMU_INSTRET,
+	[PERF_COUNT_HW_CACHE_REFERENCES]	= RISCV_OP_UNSUPP,
+	[PERF_COUNT_HW_CACHE_MISSES]		= RISCV_OP_UNSUPP,
+	[PERF_COUNT_HW_BRANCH_INSTRUCTIONS]	= RISCV_OP_UNSUPP,
+	[PERF_COUNT_HW_BRANCH_MISSES]		= RISCV_OP_UNSUPP,
+	[PERF_COUNT_HW_BUS_CYCLES]		= RISCV_OP_UNSUPP,
+};
+
+#define C(x) PERF_COUNT_HW_CACHE_##x
+static const int riscv_cache_event_map[PERF_COUNT_HW_CACHE_MAX]
+[PERF_COUNT_HW_CACHE_OP_MAX]
+[PERF_COUNT_HW_CACHE_RESULT_MAX] = {
+	[C(L1D)] = {
+		[C(OP_READ)] = {
+			[C(RESULT_ACCESS)] = RISCV_OP_UNSUPP,
+			[C(RESULT_MISS)] = RISCV_OP_UNSUPP,
+		},
+		[C(OP_WRITE)] = {
+			[C(RESULT_ACCESS)] = RISCV_OP_UNSUPP,
+			[C(RESULT_MISS)] = RISCV_OP_UNSUPP,
+		},
+		[C(OP_PREFETCH)] = {
+			[C(RESULT_ACCESS)] = RISCV_OP_UNSUPP,
+			[C(RESULT_MISS)] = RISCV_OP_UNSUPP,
+		},
+	},
+	[C(L1I)] = {
+		[C(OP_READ)] = {
+			[C(RESULT_ACCESS)] = RISCV_OP_UNSUPP,
+			[C(RESULT_MISS)] = RISCV_OP_UNSUPP,
+		},
+		[C(OP_WRITE)] = {
+			[C(RESULT_ACCESS)] = RISCV_OP_UNSUPP,
+			[C(RESULT_MISS)] = RISCV_OP_UNSUPP,
+		},
+		[C(OP_PREFETCH)] = {
+			[C(RESULT_ACCESS)] = RISCV_OP_UNSUPP,
+			[C(RESULT_MISS)] = RISCV_OP_UNSUPP,
+		},
+	},
+	[C(LL)] = {
+		[C(OP_READ)] = {
+			[C(RESULT_ACCESS)] = RISCV_OP_UNSUPP,
+			[C(RESULT_MISS)] = RISCV_OP_UNSUPP,
+		},
+		[C(OP_WRITE)] = {
+			[C(RESULT_ACCESS)] = RISCV_OP_UNSUPP,
+			[C(RESULT_MISS)] = RISCV_OP_UNSUPP,
+		},
+		[C(OP_PREFETCH)] = {
+			[C(RESULT_ACCESS)] = RISCV_OP_UNSUPP,
+			[C(RESULT_MISS)] = RISCV_OP_UNSUPP,
+		},
+	},
+	[C(DTLB)] = {
+		[C(OP_READ)] = {
+			[C(RESULT_ACCESS)] = RISCV_OP_UNSUPP,
+			[C(RESULT_MISS)] = RISCV_OP_UNSUPP,
+		},
+		[C(OP_WRITE)] = {
+			[C(RESULT_ACCESS)] = RISCV_OP_UNSUPP,
+			[C(RESULT_MISS)] = RISCV_OP_UNSUPP,
+		},
+		[C(OP_PREFETCH)] = {
+			[C(RESULT_ACCESS)] = RISCV_OP_UNSUPP,
+			[C(RESULT_MISS)] = RISCV_OP_UNSUPP,
+		},
+	},
+	[C(ITLB)] = {
+		[C(OP_READ)] = {
+			[C(RESULT_ACCESS)] = RISCV_OP_UNSUPP,
+			[C(RESULT_MISS)] = RISCV_OP_UNSUPP,
+		},
+		[C(OP_WRITE)] = {
+			[C(RESULT_ACCESS)] = RISCV_OP_UNSUPP,
+			[C(RESULT_MISS)] = RISCV_OP_UNSUPP,
+		},
+		[C(OP_PREFETCH)] = {
+			[C(RESULT_ACCESS)] = RISCV_OP_UNSUPP,
+			[C(RESULT_MISS)] = RISCV_OP_UNSUPP,
+		},
+	},
+	[C(BPU)] = {
+		[C(OP_READ)] = {
+			[C(RESULT_ACCESS)] = RISCV_OP_UNSUPP,
+			[C(RESULT_MISS)] = RISCV_OP_UNSUPP,
+		},
+		[C(OP_WRITE)] = {
+			[C(RESULT_ACCESS)] = RISCV_OP_UNSUPP,
+			[C(RESULT_MISS)] = RISCV_OP_UNSUPP,
+		},
+		[C(OP_PREFETCH)] = {
+			[C(RESULT_ACCESS)] = RISCV_OP_UNSUPP,
+			[C(RESULT_MISS)] = RISCV_OP_UNSUPP,
+		},
+	},
+};
+
+static int riscv_map_hw_event(u64 config)
+{
+	if (config >= riscv_pmu->max_events)
+		return -EINVAL;
+
+	return riscv_pmu->hw_events[config];
+}
+
+int riscv_map_cache_decode(u64 config, unsigned int *type,
+			   unsigned int *op, unsigned int *result)
+{
+	return -ENOENT;
+}
+
+static int riscv_map_cache_event(u64 config)
+{
+	unsigned int type, op, result;
+	int err = -ENOENT;
+	int code;
+
+	err = riscv_map_cache_decode(config, &type, &op, &result);
+	if (!riscv_pmu->cache_events || err)
+		return err;
+
+	if (type >= PERF_COUNT_HW_CACHE_MAX ||
+	    op >= PERF_COUNT_HW_CACHE_OP_MAX ||
+	    result >= PERF_COUNT_HW_CACHE_RESULT_MAX)
+		return -EINVAL;
+
+	code = (*riscv_pmu->cache_events)[type][op][result];
+	if (code == RISCV_OP_UNSUPP)
+		return -EINVAL;
+
+	return code;
+}
+
+/*
+ * Low-level functions: reading/writing counters
+ */
+
+static inline u64 read_counter(int idx)
+{
+	u64 val = 0;
+
+	switch (idx) {
+	case RISCV_PMU_CYCLE:
+		val = csr_read(cycle);
+		break;
+	case RISCV_PMU_INSTRET:
+		val = csr_read(instret);
+		break;
+	default:
+		WARN_ON_ONCE(idx < 0 || idx > RISCV_MAX_COUNTERS);
+		return -EINVAL;
+	}
+
+	return val;
+}
+
+static inline void write_counter(int idx, u64 value)
+{
+	/* currently not supported */
+	WARN_ON_ONCE(1);
+}
+
+/*
+ * pmu->read: read and update the counter
+ *
+ * Other architectures' implementation often have a xxx_perf_event_update
+ * routine, which can return counter values when called in the IRQ, but
+ * return void when being called by the pmu->read method.
+ */
+static void riscv_pmu_read(struct perf_event *event)
+{
+	struct hw_perf_event *hwc = &event->hw;
+	u64 prev_raw_count, new_raw_count;
+	u64 oldval;
+	int idx = hwc->idx;
+	u64 delta;
+
+	do {
+		prev_raw_count = local64_read(&hwc->prev_count);
+		new_raw_count = read_counter(idx);
+
+		oldval = local64_cmpxchg(&hwc->prev_count, prev_raw_count,
+					 new_raw_count);
+	} while (oldval != prev_raw_count);
+
+	/*
+	 * delta is the value to update the counter we maintain in the kernel.
+	 */
+	delta = (new_raw_count - prev_raw_count) &
+		((1ULL << riscv_pmu->counter_width) - 1);
+	local64_add(delta, &event->count);
+	/*
+	 * Something like local64_sub(delta, &hwc->period_left) here is
+	 * needed if there is an interrupt for perf.
+	 */
+}
+
+/*
+ * State transition functions:
+ *
+ * stop()/start() & add()/del()
+ */
+
+/*
+ * pmu->stop: stop the counter
+ */
+static void riscv_pmu_stop(struct perf_event *event, int flags)
+{
+	struct hw_perf_event *hwc = &event->hw;
+
+	WARN_ON_ONCE(hwc->state & PERF_HES_STOPPED);
+	hwc->state |= PERF_HES_STOPPED;
+
+	if ((flags & PERF_EF_UPDATE) && !(hwc->state & PERF_HES_UPTODATE)) {
+		riscv_pmu->pmu->read(event);
+		hwc->state |= PERF_HES_UPTODATE;
+	}
+}
+
+/*
+ * pmu->start: start the event.
+ */
+static void riscv_pmu_start(struct perf_event *event, int flags)
+{
+	struct hw_perf_event *hwc = &event->hw;
+
+	if (WARN_ON_ONCE(!(event->hw.state & PERF_HES_STOPPED)))
+		return;
+
+	if (flags & PERF_EF_RELOAD) {
+		WARN_ON_ONCE(!(event->hw.state & PERF_HES_UPTODATE));
+
+		/*
+		 * Set the counter to the period to the next interrupt here,
+		 * if you have any.
+		 */
+	}
+
+	hwc->state = 0;
+	perf_event_update_userpage(event);
+
+	/*
+	 * Since we cannot write to counters, this serves as an initialization
+	 * to the delta-mechanism in pmu->read(); otherwise, the delta would be
+	 * wrong when pmu->read is called for the first time.
+	 */
+	local64_set(&hwc->prev_count, read_counter(hwc->idx));
+}
+
+/*
+ * pmu->add: add the event to PMU.
+ */
+static int riscv_pmu_add(struct perf_event *event, int flags)
+{
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
+	struct hw_perf_event *hwc = &event->hw;
+
+	if (cpuc->n_events == riscv_pmu->num_counters)
+		return -ENOSPC;
+
+	/*
+	 * We don't have general conunters, so no binding-event-to-counter
+	 * process here.
+	 *
+	 * Indexing using hwc->config generally not works, since config may
+	 * contain extra information, but here the only info we have in
+	 * hwc->config is the event index.
+	 */
+	hwc->idx = hwc->config;
+	cpuc->events[hwc->idx] = event;
+	cpuc->n_events++;
+
+	hwc->state = PERF_HES_UPTODATE | PERF_HES_STOPPED;
+
+	if (flags & PERF_EF_START)
+		riscv_pmu->pmu->start(event, PERF_EF_RELOAD);
+
+	return 0;
+}
+
+/*
+ * pmu->del: delete the event from PMU.
+ */
+static void riscv_pmu_del(struct perf_event *event, int flags)
+{
+	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
+	struct hw_perf_event *hwc = &event->hw;
+
+	cpuc->events[hwc->idx] = NULL;
+	cpuc->n_events--;
+	riscv_pmu->pmu->stop(event, PERF_EF_UPDATE);
+	perf_event_update_userpage(event);
+}
+
+/*
+ * Interrupt: a skeletion for reference.
+ */
+
+static DEFINE_MUTEX(pmc_reserve_mutex);
+
+irqreturn_t riscv_base_pmu_handle_irq(int irq_num, void *dev)
+{
+	return IRQ_NONE;
+}
+
+static int reserve_pmc_hardware(void)
+{
+	int err = 0;
+
+	mutex_lock(&pmc_reserve_mutex);
+	if (riscv_pmu->irq >= 0 && riscv_pmu->handle_irq) {
+		err = request_irq(riscv_pmu->irq, riscv_pmu->handle_irq,
+				  IRQF_PERCPU, "riscv-base-perf", NULL);
+	}
+	mutex_unlock(&pmc_reserve_mutex);
+
+	return err;
+}
+
+void release_pmc_hardware(void)
+{
+	mutex_lock(&pmc_reserve_mutex);
+	if (riscv_pmu->irq >= 0)
+		free_irq(riscv_pmu->irq, NULL);
+	mutex_unlock(&pmc_reserve_mutex);
+}
+
+/*
+ * Event Initialization/Finalization
+ */
+
+static atomic_t riscv_active_events = ATOMIC_INIT(0);
+
+static void riscv_event_destroy(struct perf_event *event)
+{
+	if (atomic_dec_return(&riscv_active_events) == 0)
+		release_pmc_hardware();
+}
+
+static int riscv_event_init(struct perf_event *event)
+{
+	struct perf_event_attr *attr = &event->attr;
+	struct hw_perf_event *hwc = &event->hw;
+	int err;
+	int code;
+
+	if (atomic_inc_return(&riscv_active_events) == 1) {
+		err = reserve_pmc_hardware();
+
+		if (err) {
+			pr_warn("PMC hardware not available\n");
+			atomic_dec(&riscv_active_events);
+			return -EBUSY;
+		}
+	}
+
+	switch (event->attr.type) {
+	case PERF_TYPE_HARDWARE:
+		code = riscv_pmu->map_hw_event(attr->config);
+		break;
+	case PERF_TYPE_HW_CACHE:
+		code = riscv_pmu->map_cache_event(attr->config);
+		break;
+	case PERF_TYPE_RAW:
+		return -EOPNOTSUPP;
+	default:
+		return -ENOENT;
+	}
+
+	event->destroy = riscv_event_destroy;
+	if (code < 0) {
+		event->destroy(event);
+		return code;
+	}
+
+	/*
+	 * idx is set to -1 because the index of a general event should not be
+	 * decided until binding to some counter in pmu->add().
+	 *
+	 * But since we don't have such support, later in pmu->add(), we just
+	 * use hwc->config as the index instead.
+	 */
+	hwc->config = code;
+	hwc->idx = -1;
+
+	return 0;
+}
+
+/*
+ * Initialization
+ */
+
+static struct pmu min_pmu = {
+	.name		= "riscv-base",
+	.event_init	= riscv_event_init,
+	.add		= riscv_pmu_add,
+	.del		= riscv_pmu_del,
+	.start		= riscv_pmu_start,
+	.stop		= riscv_pmu_stop,
+	.read		= riscv_pmu_read,
+};
+
+static const struct riscv_pmu riscv_base_pmu = {
+	.pmu = &min_pmu,
+	.max_events = ARRAY_SIZE(riscv_hw_event_map),
+	.map_hw_event = riscv_map_hw_event,
+	.hw_events = riscv_hw_event_map,
+	.map_cache_event = riscv_map_cache_event,
+	.cache_events = &riscv_cache_event_map,
+	.counter_width = 63,
+	.num_counters = RISCV_BASE_COUNTERS + 0,
+	.handle_irq = &riscv_base_pmu_handle_irq,
+
+	/* This means this PMU has no IRQ. */
+	.irq = -1,
+};
+
+static const struct of_device_id riscv_pmu_of_ids[] = {
+	{.compatible = "riscv,base-pmu",	.data = &riscv_base_pmu},
+	{ /* sentinel value */ }
+};
+
+int __init init_hw_perf_events(void)
+{
+	struct device_node *node = of_find_node_by_type(NULL, "pmu");
+	const struct of_device_id *of_id;
+
+	riscv_pmu = &riscv_base_pmu;
+
+	if (node) {
+		of_id = of_match_node(riscv_pmu_of_ids, node);
+
+		if (of_id)
+			riscv_pmu = of_id->data;
+	}
+
+	perf_pmu_register(riscv_pmu->pmu, "cpu", PERF_TYPE_RAW);
+	return 0;
+}
+arch_initcall(init_hw_perf_events);
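Note on riscv_pmu_read(): because the counters are free-running and cannot be written, the event count is accumulated from differences between successive raw reads, masked to the counter width so that a wrap-around still yields the right delta. A userspace sketch of just that arithmetic follows. The function name counter_delta is hypothetical, counter_width mirrors the riscv_pmu field of the same name, and the cmpxchg loop that guards prev_count in the kernel is left out.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model of the delta step: how many ticks elapsed between
 * two raw reads of a counter that is counter_width bits wide.
 */
static uint64_t counter_delta(uint64_t prev, uint64_t now,
			      unsigned int counter_width)
{
	/* A shift by 64 would be undefined, so restrict the width here. */
	assert(counter_width > 0 && counter_width < 64);

	uint64_t mask = (1ULL << counter_width) - 1;

	/*
	 * Unsigned subtraction wraps modulo 2^64; the mask reduces the
	 * result modulo 2^counter_width, which handles counter wrap-around.
	 */
	return (now - prev) & mask;
}
```

For an 8-bit counter that moved from 250 to 5, the subtraction wraps and the mask recovers the 11 ticks that actually elapsed.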
+2 -1
arch/riscv/kernel/riscv_ksyms.c
···
  * Assembly functions that may be used (directly or indirectly) by modules
  */
 EXPORT_SYMBOL(__clear_user);
-EXPORT_SYMBOL(__copy_user);
+EXPORT_SYMBOL(__asm_copy_to_user);
+EXPORT_SYMBOL(__asm_copy_from_user);
 EXPORT_SYMBOL(memset);
 EXPORT_SYMBOL(memcpy);
+1 -1
arch/riscv/kernel/traps.c
···

 	if (pc < PAGE_OFFSET)
 		return 0;
-	if (probe_kernel_address((bug_insn_t __user *)pc, insn))
+	if (probe_kernel_address((bug_insn_t *)pc, insn))
 		return 0;
 	return (insn == __BUG_INSN);
 }
+13 -6
arch/riscv/lib/uaccess.S
···
 	.previous
 	.endm

-ENTRY(__copy_user)
+ENTRY(__asm_copy_to_user)
+ENTRY(__asm_copy_from_user)

 	/* Enable access to user memory */
 	li t6, SR_SUM
···
 	addi a0, a0, 1
 	bltu a1, a3, 5b
 	j 3b
-ENDPROC(__copy_user)
+ENDPROC(__asm_copy_to_user)
+ENDPROC(__asm_copy_from_user)


 ENTRY(__clear_user)
···
 	bgeu t0, t1, 2f
 	bltu a0, t0, 4f
 1:
-	fixup REG_S, zero, (a0), 10f
+	fixup REG_S, zero, (a0), 11f
 	addi a0, a0, SZREG
 	bltu a0, t1, 1b
 2:
···
 	li a0, 0
 	ret
 4: /* Edge case: unalignment */
-	fixup sb, zero, (a0), 10f
+	fixup sb, zero, (a0), 11f
 	addi a0, a0, 1
 	bltu a0, t0, 4b
 	j 1b
 5: /* Edge case: remainder */
-	fixup sb, zero, (a0), 10f
+	fixup sb, zero, (a0), 11f
 	addi a0, a0, 1
 	bltu a0, a3, 5b
 	j 3b
···

 	.section .fixup,"ax"
 	.balign 4
+	/* Fixup code for __copy_user(10) and __clear_user(11) */
 10:
 	/* Disable access to user memory */
 	csrs sstatus, t6
-	sub a0, a3, a0
+	mv a0, a2
+	ret
+11:
+	csrs sstatus, t6
+	mv a0, a1
 	ret
 	.previous