Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

ASoC: Merge up fixes

Merge branch 'for-7.0' of
https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into
asoc-7.1 to pick up both ASoC and general bug fixes to support testing.

+5220 -2090
+2
.mailmap
··· 316 316 Hans Verkuil <hverkuil@kernel.org> <hansverk@cisco.com> 317 317 Hao Ge <hao.ge@linux.dev> <gehao@kylinos.cn> 318 318 Harry Yoo <harry.yoo@oracle.com> <42.hyeyoo@gmail.com> 319 + Harry Yoo <harry@kernel.org> <harry.yoo@oracle.com> 319 320 Heiko Carstens <hca@linux.ibm.com> <h.carstens@de.ibm.com> 320 321 Heiko Carstens <hca@linux.ibm.com> <heiko.carstens@de.ibm.com> 321 322 Heiko Stuebner <heiko@sntech.de> <heiko.stuebner@bqreaders.com> ··· 588 587 Morten Welinder <welinder@anemone.rentec.com> 589 588 Morten Welinder <welinder@darter.rentec.com> 590 589 Morten Welinder <welinder@troll.com> 590 + Muhammad Usama Anjum <usama.anjum@arm.com> <usama.anjum@collabora.com> 591 591 Mukesh Ojha <quic_mojha@quicinc.com> <mojha@codeaurora.org> 592 592 Muna Sinada <quic_msinada@quicinc.com> <msinada@codeaurora.org> 593 593 Murali Nalajala <quic_mnalajal@quicinc.com> <mnalajal@codeaurora.org>
+10
Documentation/PCI/pcieaer-howto.rst
··· 85 85 the error message to the Root Port. Please refer to PCIe specs for other 86 86 fields. 87 87 88 + The 'TLP Header' is the prefix/header of the TLP that caused the error 89 + in raw hex format. To decode the TLP Header into human-readable form 90 + one may use tlp-tool: 91 + 92 + https://github.com/mmpg-x86/tlp-tool 93 + 94 + Example usage:: 95 + 96 + curl -L https://git.kernel.org/linus/2ca1c94ce0b6 | rtlp-tool --aer 97 + 88 98 AER Ratelimits 89 99 -------------- 90 100
+29 -7
Documentation/core-api/dma-attributes.rst
··· 149 149 DMA_ATTR_MMIO will not perform any cache flushing. The address 150 150 provided must never be mapped cacheable into the CPU. 151 151 152 - DMA_ATTR_CPU_CACHE_CLEAN 153 - ------------------------ 152 + DMA_ATTR_DEBUGGING_IGNORE_CACHELINES 153 + ------------------------------------ 154 154 155 - This attribute indicates the CPU will not dirty any cacheline overlapping this 156 - DMA_FROM_DEVICE/DMA_BIDIRECTIONAL buffer while it is mapped. This allows 157 - multiple small buffers to safely share a cacheline without risk of data 158 - corruption, suppressing DMA debug warnings about overlapping mappings. 159 - All mappings sharing a cacheline should have this attribute. 155 + This attribute indicates that CPU cache lines may overlap for buffers mapped 156 + with DMA_FROM_DEVICE or DMA_BIDIRECTIONAL. 157 + 158 + Such overlap may occur when callers map multiple small buffers that reside 159 + within the same cache line. In this case, callers must guarantee that the CPU 160 + will not dirty these cache lines after the mappings are established. When this 161 + condition is met, multiple buffers can safely share a cache line without risking 162 + data corruption. 163 + 164 + All mappings that share a cache line must set this attribute to suppress DMA 165 + debug warnings about overlapping mappings. 166 + 167 + DMA_ATTR_REQUIRE_COHERENT 168 + ------------------------- 169 + 170 + DMA mapping requests with the DMA_ATTR_REQUIRE_COHERENT fail on any 171 + system where SWIOTLB or cache management is required. This should only 172 + be used to support uAPI designs that require continuous HW DMA 173 + coherence with userspace processes, for example RDMA and DRM. At a 174 + minimum the memory being mapped must be userspace memory from 175 + pin_user_pages() or similar. 176 + 177 + Drivers should consider using dma_mmap_pages() instead of this 178 + interface when building their uAPIs, when possible. 179 + 180 + It must never be used in an in-kernel driver that only works with 181 + kernel memory.
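As a rough illustration of how a driver might pass the new debugging attribute through the streaming DMA API (a sketch only: the descriptor struct, function name and buffer layout are invented here, and it is assumed the attribute is accepted by dma_map_single_attrs() like other DMA_ATTR_* flags):

    /* Two small RX descriptors that may share a CPU cache line.  Both
     * mappings set the attribute so DMA debug does not warn about the
     * overlap; the CPU must not dirty the line while the mappings live.
     * (Hypothetical example, not taken from the patch itself.)
     */
    struct rx_desc { u8 data[32]; };

    static int map_shared_line(struct device *dev, struct rx_desc *d0,
                               struct rx_desc *d1, dma_addr_t *a0, dma_addr_t *a1)
    {
            unsigned long attrs = DMA_ATTR_DEBUGGING_IGNORE_CACHELINES;

            *a0 = dma_map_single_attrs(dev, d0, sizeof(*d0), DMA_FROM_DEVICE, attrs);
            if (dma_mapping_error(dev, *a0))
                    return -ENOMEM;

            *a1 = dma_map_single_attrs(dev, d1, sizeof(*d1), DMA_FROM_DEVICE, attrs);
            if (dma_mapping_error(dev, *a1)) {
                    dma_unmap_single_attrs(dev, *a0, sizeof(*d0), DMA_FROM_DEVICE, attrs);
                    return -ENOMEM;
            }
            return 0;
    }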
+50
Documentation/filesystems/overlayfs.rst
··· 783 783 mounted with "uuid=on". 784 784 785 785 786 + Durability and copy up 787 + ---------------------- 788 + 789 + The fsync(2) system call ensures that the data and metadata of a file 790 + are safely written to the backing storage, which is expected to 791 + guarantee the existence of the information post system crash. 792 + 793 + Without an fsync(2) call, there is no guarantee that the observed 794 + data after a system crash will be either the old or the new data, but 795 + in practice, the observed data after crash is often the old or new data 796 + or a mix of both. 797 + 798 + When an overlayfs file is modified for the first time, copy up will 799 + create a copy of the lower file and its parent directories in the upper 800 + layer. Since the Linux filesystem API does not enforce any particular 801 + ordering on storing changes without explicit fsync(2) calls, in case 802 + of a system crash, the upper file could end up with no data at all 803 + (i.e. zeros), which would be an unusual outcome. To avoid this 804 + experience, overlayfs calls fsync(2) on the upper file before completing 805 + data copy up with rename(2) or link(2) to make the copy up "atomic". 806 + 807 + By default, overlayfs does not explicitly call fsync(2) on copied up 808 + directories or on metadata-only copy up, so it provides no guarantee to 809 + persist the user's modification unless the user calls fsync(2). 810 + The fsync during copy up only guarantees that if a copy up is observed 811 + after a crash, the observed data is not zeroes or intermediate values 812 + from the copy up staging area. 813 + 814 + On traditional local filesystems with a single journal (e.g. ext4, xfs), 815 + fsync on a file also persists the parent directory changes, because they 816 + are usually modified in the same transaction, so metadata durability during 817 + data copy up effectively comes for free. Overlayfs further limits risk by 818 + disallowing network filesystems as upper layer. 819 + 820 + Overlayfs can be tuned to prefer performance or durability when storing 821 + to the underlying upper layer. This is controlled by the "fsync" mount 822 + option, which supports these values: 823 + 824 + - "auto": (default) 825 + Call fsync(2) on upper file before completion of data copy up. 826 + No explicit fsync(2) on directory or metadata-only copy up. 827 + - "strict": 828 + Call fsync(2) on upper file and directories before completion of any 829 + copy up. 830 + - "volatile": [*] 831 + Prefer performance over durability (see `Volatile mount`_) 832 + 833 + [*] The mount option "volatile" is an alias to "fsync=volatile". 834 + 835 + 786 836 Volatile mount 787 837 -------------- 788 838
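To make the new mount option concrete, the stricter durability mode would be requested like any other overlayfs option string. A minimal sketch via the mount(2) syscall; the paths are placeholders and "fsync=strict" is the value described in the hunk above:

    #include <sys/mount.h>

    /* Mount an overlay that fsyncs files and directories during copy up. */
    static int mount_overlay_strict(void)
    {
            return mount("overlay", "/merged", "overlay", 0,
                         "lowerdir=/lower,upperdir=/upper,workdir=/work,fsync=strict");
    }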
+4 -4
Documentation/hwmon/adm1177.rst
··· 27 27 Sysfs entries 28 28 ------------- 29 29 30 - The following attributes are supported. Current maxim attribute 30 + The following attributes are supported. Current maximum attribute 31 31 is read-write, all other attributes are read-only. 32 32 33 - in0_input Measured voltage in microvolts. 33 + in0_input Measured voltage in millivolts. 34 34 35 - curr1_input Measured current in microamperes. 36 - curr1_max_alarm Overcurrent alarm in microamperes. 35 + curr1_input Measured current in milliamperes. 36 + curr1_max Overcurrent shutdown threshold in milliamperes.
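Since the sysfs values are plain integers in the units listed above, a userspace reader does the scaling itself. A minimal sketch; the hwmon index and path are placeholders and depend on the system:

    #include <stdio.h>

    /* Read curr1_input (milliamperes) and print the value in amperes. */
    int main(void)
    {
            FILE *f = fopen("/sys/class/hwmon/hwmon0/curr1_input", "r");
            long ma;

            if (!f)
                    return 1;
            if (fscanf(f, "%ld", &ma) != 1) {
                    fclose(f);
                    return 1;
            }
            fclose(f);
            printf("%.3f A\n", ma / 1000.0);
            return 0;
    }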
+6 -4
Documentation/hwmon/peci-cputemp.rst
··· 51 51 temp1_crit Provides shutdown temperature of the CPU package which 52 52 is also known as the maximum processor junction 53 53 temperature, Tjmax or Tprochot. 54 - temp1_crit_hyst Provides the hysteresis value from Tcontrol to Tjmax of 55 - the CPU package. 54 + temp1_crit_hyst Provides the hysteresis temperature of the CPU 55 + package. Returns Tcontrol, the temperature at which 56 + the critical condition clears. 56 57 57 58 temp2_label "DTS" 58 59 temp2_input Provides current temperature of the CPU package scaled ··· 63 62 temp2_crit Provides shutdown temperature of the CPU package which 64 63 is also known as the maximum processor junction 65 64 temperature, Tjmax or Tprochot. 66 - temp2_crit_hyst Provides the hysteresis value from Tcontrol to Tjmax of 67 - the CPU package. 65 + temp2_crit_hyst Provides the hysteresis temperature of the CPU 66 + package. Returns Tcontrol, the temperature at which 67 + the critical condition clears. 68 68 69 69 temp3_label "Tcontrol" 70 70 temp3_input Provides current Tcontrol temperature of the CPU
+19 -4
Documentation/userspace-api/landlock.rst
··· 8 8 ===================================== 9 9 10 10 :Author: Mickaël Salaün 11 - :Date: January 2026 11 + :Date: March 2026 12 12 13 13 The goal of Landlock is to enable restriction of ambient rights (e.g. global 14 14 filesystem or network access) for a set of processes. Because Landlock ··· 197 197 198 198 .. code-block:: c 199 199 200 - __u32 restrict_flags = LANDLOCK_RESTRICT_SELF_LOG_NEW_EXEC_ON; 201 - if (abi < 7) { 202 - /* Clear logging flags unsupported before ABI 7. */ 200 + __u32 restrict_flags = 201 + LANDLOCK_RESTRICT_SELF_LOG_NEW_EXEC_ON | 202 + LANDLOCK_RESTRICT_SELF_TSYNC; 203 + switch (abi) { 204 + case 1 ... 6: 205 + /* Removes logging flags for ABI < 7 */ 203 206 restrict_flags &= ~(LANDLOCK_RESTRICT_SELF_LOG_SAME_EXEC_OFF | 204 207 LANDLOCK_RESTRICT_SELF_LOG_NEW_EXEC_ON | 205 208 LANDLOCK_RESTRICT_SELF_LOG_SUBDOMAINS_OFF); 209 + __attribute__((fallthrough)); 210 + case 7: 211 + /* 212 + * Removes multithreaded enforcement flag for ABI < 8 213 + * 214 + * WARNING: Without this flag, calling landlock_restrict_self(2) is 215 + * only equivalent if the calling process is single-threaded. Below 216 + * ABI v8 (and as of ABI v8, when not using this flag), a Landlock 217 + * policy would only be enforced for the calling thread and its 218 + * children (and not for all threads, including parents and siblings). 219 + */ 220 + restrict_flags &= ~LANDLOCK_RESTRICT_SELF_TSYNC; 206 221 } 207 222 208 223 The next step is to restrict the current thread from gaining more privileges
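For context, the `abi` value tested by the switch in the hunk above typically comes from a version probe earlier in the same best-effort compatibility sequence, using the landlock_create_ruleset() syscall wrapper that the document's earlier examples define. A sketch with abbreviated error handling:

    int abi;

    abi = landlock_create_ruleset(NULL, 0, LANDLOCK_CREATE_RULESET_VERSION);
    if (abi < 0) {
            /* Landlock is unsupported or disabled: skip sandboxing. */
            return 0;
    }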
+16 -7
MAINTAINERS
··· 3986 3986 ASUS NOTEBOOKS AND EEEPC ACPI/WMI EXTRAS DRIVERS 3987 3987 M: Corentin Chary <corentin.chary@gmail.com> 3988 3988 M: Luke D. Jones <luke@ljones.dev> 3989 - M: Denis Benato <benato.denis96@gmail.com> 3989 + M: Denis Benato <denis.benato@linux.dev> 3990 3990 L: platform-driver-x86@vger.kernel.org 3991 3991 S: Maintained 3992 3992 W: https://asus-linux.org/ ··· 8628 8628 F: include/uapi/drm/lima_drm.h 8629 8629 8630 8630 DRM DRIVERS FOR LOONGSON 8631 + M: Jianmin Lv <lvjianmin@loongson.cn> 8632 + M: Qianhai Wu <wuqianhai@loongson.cn> 8633 + R: Huacai Chen <chenhuacai@kernel.org> 8634 + R: Mingcong Bai <jeffbai@aosc.io> 8635 + R: Xi Ruoyao <xry111@xry111.site> 8636 + R: Icenowy Zheng <zhengxingda@iscas.ac.cn> 8631 8637 L: dri-devel@lists.freedesktop.org 8632 - S: Orphan 8638 + S: Maintained 8633 8639 T: git https://gitlab.freedesktop.org/drm/misc/kernel.git 8634 8640 F: drivers/gpu/drm/loongson/ 8635 8641 ··· 9619 9613 9620 9614 EXT4 FILE SYSTEM 9621 9615 M: "Theodore Ts'o" <tytso@mit.edu> 9622 - M: Andreas Dilger <adilger.kernel@dilger.ca> 9616 + R: Andreas Dilger <adilger.kernel@dilger.ca> 9617 + R: Baokun Li <libaokun@linux.alibaba.com> 9618 + R: Jan Kara <jack@suse.cz> 9619 + R: Ojaswin Mujoo <ojaswin@linux.ibm.com> 9620 + R: Ritesh Harjani (IBM) <ritesh.list@gmail.com> 9621 + R: Zhang Yi <yi.zhang@huawei.com> 9623 9622 L: linux-ext4@vger.kernel.org 9624 9623 S: Maintained 9625 9624 W: http://ext4.wiki.kernel.org ··· 12020 12009 M: Wolfram Sang <wsa+renesas@sang-engineering.com> 12021 12010 L: linux-i2c@vger.kernel.org 12022 12011 S: Maintained 12023 - W: https://i2c.wiki.kernel.org/ 12024 12012 Q: https://patchwork.ozlabs.org/project/linux-i2c/list/ 12025 12013 T: git git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux.git 12026 12014 F: Documentation/i2c/ ··· 12045 12035 M: Andi Shyti <andi.shyti@kernel.org> 12046 12036 L: linux-i2c@vger.kernel.org 12047 12037 S: Maintained 12048 - W: https://i2c.wiki.kernel.org/ 12049 12038 Q: https://patchwork.ozlabs.org/project/linux-i2c/list/ 12050 12039 T: git git://git.kernel.org/pub/scm/linux/kernel/git/andi.shyti/linux.git 12051 12040 F: Documentation/devicetree/bindings/i2c/ ··· 16886 16877 R: Rik van Riel <riel@surriel.com> 16887 16878 R: Liam R. Howlett <Liam.Howlett@oracle.com> 16888 16879 R: Vlastimil Babka <vbabka@kernel.org> 16889 - R: Harry Yoo <harry.yoo@oracle.com> 16880 + R: Harry Yoo <harry@kernel.org> 16890 16881 R: Jann Horn <jannh@google.com> 16891 16882 L: linux-mm@kvack.org 16892 16883 S: Maintained ··· 24352 24343 24353 24344 SLAB ALLOCATOR 24354 24345 M: Vlastimil Babka <vbabka@kernel.org> 24355 - M: Harry Yoo <harry.yoo@oracle.com> 24346 + M: Harry Yoo <harry@kernel.org> 24356 24347 M: Andrew Morton <akpm@linux-foundation.org> 24357 24348 R: Hao Li <hao.li@linux.dev> 24358 24349 R: Christoph Lameter <cl@gentwo.org>
+2 -2
Makefile
··· 2 2 VERSION = 7 3 3 PATCHLEVEL = 0 4 4 SUBLEVEL = 0 5 - EXTRAVERSION = -rc5 5 + EXTRAVERSION = -rc6 6 6 NAME = Baby Opossum Posse 7 7 8 8 # *DOCUMENTATION* ··· 1654 1654 modules.builtin.ranges vmlinux.o.map vmlinux.unstripped \ 1655 1655 compile_commands.json rust/test \ 1656 1656 rust-project.json .vmlinux.objs .vmlinux.export.c \ 1657 - .builtin-dtbs-list .builtin-dtb.S 1657 + .builtin-dtbs-list .builtin-dtbs.S 1658 1658 1659 1659 # Directories & files removed with 'make mrproper' 1660 1660 MRPROPER_FILES += include/config include/generated \
+1 -1
arch/arm64/kvm/at.c
··· 1753 1753 if (!writable) 1754 1754 return -EPERM; 1755 1755 1756 - ptep = (u64 __user *)hva + offset; 1756 + ptep = (void __user *)hva + offset; 1757 1757 if (cpus_have_final_cap(ARM64_HAS_LSE_ATOMICS)) 1758 1758 r = __lse_swap_desc(ptep, old, new); 1759 1759 else
+14
arch/arm64/kvm/reset.c
··· 247 247 kvm_vcpu_set_be(vcpu); 248 248 249 249 *vcpu_pc(vcpu) = target_pc; 250 + 251 + /* 252 + * We may come from a state where either a PC update was 253 + * pending (SMC call resulting in PC being increpented to 254 + * skip the SMC) or a pending exception. Make sure we get 255 + * rid of all that, as this cannot be valid out of reset. 256 + * 257 + * Note that clearing the exception mask also clears PC 258 + * updates, but that's an implementation detail, and we 259 + * really want to make it explicit. 260 + */ 261 + vcpu_clear_flag(vcpu, PENDING_EXCEPTION); 262 + vcpu_clear_flag(vcpu, EXCEPT_MASK); 263 + vcpu_clear_flag(vcpu, INCREMENT_PC); 250 264 vcpu_set_reg(vcpu, 0, reset_state.r0); 251 265 } 252 266
+36
arch/loongarch/include/asm/linkage.h
··· 41 41 .cfi_endproc; \ 42 42 SYM_END(name, SYM_T_NONE) 43 43 44 + /* 45 + * This is for the signal handler trampoline, which is used as the return 46 + * address of the signal handlers in userspace instead of called normally. 47 + * The long standing libgcc bug https://gcc.gnu.org/PR124050 requires a 48 + * nop between .cfi_startproc and the actual address of the trampoline, so 49 + * we cannot simply use SYM_FUNC_START. 50 + * 51 + * This wrapper also contains all the .cfi_* directives for recovering 52 + * the content of the GPRs and the "return address" (where the rt_sigreturn 53 + * syscall will jump to), assuming there is a struct rt_sigframe (where 54 + * a struct sigcontext containing those information we need to recover) at 55 + * $sp. The "DWARF for the LoongArch(TM) Architecture" manual states 56 + * column 0 is for $zero, but it does not make too much sense to 57 + * save/restore the hardware zero register. Repurpose this column here 58 + * for the return address (here it's not the content of $ra we cannot use 59 + * the default column 3). 60 + */ 61 + #define SYM_SIGFUNC_START(name) \ 62 + .cfi_startproc; \ 63 + .cfi_signal_frame; \ 64 + .cfi_def_cfa 3, RT_SIGFRAME_SC; \ 65 + .cfi_return_column 0; \ 66 + .cfi_offset 0, SC_PC; \ 67 + \ 68 + .irp num, 1, 2, 3, 4, 5, 6, 7, 8, \ 69 + 9, 10, 11, 12, 13, 14, 15, 16, \ 70 + 17, 18, 19, 20, 21, 22, 23, 24, \ 71 + 25, 26, 27, 28, 29, 30, 31; \ 72 + .cfi_offset \num, SC_REGS + \num * SZREG; \ 73 + .endr; \ 74 + \ 75 + nop; \ 76 + SYM_START(name, SYM_L_GLOBAL, SYM_A_ALIGN) 77 + 78 + #define SYM_SIGFUNC_END(name) SYM_FUNC_END(name) 79 + 44 80 #endif
+9
arch/loongarch/include/asm/sigframe.h
··· 1 + /* SPDX-License-Identifier: GPL-2.0+ */ 2 + 3 + #include <asm/siginfo.h> 4 + #include <asm/ucontext.h> 5 + 6 + struct rt_sigframe { 7 + struct siginfo rs_info; 8 + struct ucontext rs_uctx; 9 + };
+2
arch/loongarch/kernel/asm-offsets.c
··· 16 16 #include <asm/ptrace.h> 17 17 #include <asm/processor.h> 18 18 #include <asm/ftrace.h> 19 + #include <asm/sigframe.h> 19 20 #include <vdso/datapage.h> 20 21 21 22 static void __used output_ptreg_defines(void) ··· 221 220 COMMENT("Linux sigcontext offsets."); 222 221 OFFSET(SC_REGS, sigcontext, sc_regs); 223 222 OFFSET(SC_PC, sigcontext, sc_pc); 223 + OFFSET(RT_SIGFRAME_SC, rt_sigframe, rs_uctx.uc_mcontext); 224 224 BLANK(); 225 225 } 226 226
+3 -4
arch/loongarch/kernel/env.c
··· 42 42 int cpu, ret; 43 43 char *cpuname; 44 44 const char *model; 45 - struct device_node *root; 46 45 47 46 /* Parsing cpuname from DTS model property */ 48 - root = of_find_node_by_path("/"); 49 - ret = of_property_read_string(root, "model", &model); 47 + ret = of_property_read_string(of_root, "model", &model); 50 48 if (ret == 0) { 51 49 cpuname = kstrdup(model, GFP_KERNEL); 50 + if (!cpuname) 51 + return -ENOMEM; 52 52 loongson_sysconf.cpuname = strsep(&cpuname, " "); 53 53 } 54 - of_node_put(root); 55 54 56 55 if (loongson_sysconf.cpuname && !strncmp(loongson_sysconf.cpuname, "Loongson", 8)) { 57 56 for (cpu = 0; cpu < NR_CPUS; cpu++)
+1 -5
arch/loongarch/kernel/signal.c
··· 35 35 #include <asm/cpu-features.h> 36 36 #include <asm/fpu.h> 37 37 #include <asm/lbt.h> 38 + #include <asm/sigframe.h> 38 39 #include <asm/ucontext.h> 39 40 #include <asm/vdso.h> 40 41 ··· 51 50 /* Make sure we will not lose LBT ownership */ 52 51 #define lock_lbt_owner() ({ preempt_disable(); pagefault_disable(); }) 53 52 #define unlock_lbt_owner() ({ pagefault_enable(); preempt_enable(); }) 54 - 55 - struct rt_sigframe { 56 - struct siginfo rs_info; 57 - struct ucontext rs_uctx; 58 - }; 59 53 60 54 struct _ctx_layout { 61 55 struct sctx_info *addr;
+8 -8
arch/loongarch/kvm/intc/eiointc.c
··· 83 83 84 84 if (!(s->status & BIT(EIOINTC_ENABLE_CPU_ENCODE))) { 85 85 cpuid = ffs(cpuid) - 1; 86 - cpuid = (cpuid >= 4) ? 0 : cpuid; 86 + cpuid = ((cpuid < 0) || (cpuid >= 4)) ? 0 : cpuid; 87 87 } 88 88 89 89 vcpu = kvm_get_vcpu_by_cpuid(s->kvm, cpuid); ··· 472 472 switch (addr) { 473 473 case EIOINTC_NODETYPE_START ... EIOINTC_NODETYPE_END: 474 474 offset = (addr - EIOINTC_NODETYPE_START) / 4; 475 - p = s->nodetype + offset * 4; 475 + p = (void *)s->nodetype + offset * 4; 476 476 break; 477 477 case EIOINTC_IPMAP_START ... EIOINTC_IPMAP_END: 478 478 offset = (addr - EIOINTC_IPMAP_START) / 4; 479 - p = &s->ipmap + offset * 4; 479 + p = (void *)&s->ipmap + offset * 4; 480 480 break; 481 481 case EIOINTC_ENABLE_START ... EIOINTC_ENABLE_END: 482 482 offset = (addr - EIOINTC_ENABLE_START) / 4; 483 - p = s->enable + offset * 4; 483 + p = (void *)s->enable + offset * 4; 484 484 break; 485 485 case EIOINTC_BOUNCE_START ... EIOINTC_BOUNCE_END: 486 486 offset = (addr - EIOINTC_BOUNCE_START) / 4; 487 - p = s->bounce + offset * 4; 487 + p = (void *)s->bounce + offset * 4; 488 488 break; 489 489 case EIOINTC_ISR_START ... EIOINTC_ISR_END: 490 490 offset = (addr - EIOINTC_ISR_START) / 4; 491 - p = s->isr + offset * 4; 491 + p = (void *)s->isr + offset * 4; 492 492 break; 493 493 case EIOINTC_COREISR_START ... EIOINTC_COREISR_END: 494 494 if (cpu >= s->num_cpu) 495 495 return -EINVAL; 496 496 497 497 offset = (addr - EIOINTC_COREISR_START) / 4; 498 - p = s->coreisr[cpu] + offset * 4; 498 + p = (void *)s->coreisr[cpu] + offset * 4; 499 499 break; 500 500 case EIOINTC_COREMAP_START ... EIOINTC_COREMAP_END: 501 501 offset = (addr - EIOINTC_COREMAP_START) / 4; 502 - p = s->coremap + offset * 4; 502 + p = (void *)s->coremap + offset * 4; 503 503 break; 504 504 default: 505 505 kvm_err("%s: unknown eiointc register, addr = %d\n", __func__, addr);
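The casts added above matter because `offset * 4` is a byte offset: pointer arithmetic on the original field types would be scaled by the element size and land on the wrong register. A tiny self-contained illustration of the difference, relying on the GNU void-pointer arithmetic the kernel uses (names invented):

    #include <assert.h>

    int main(void)
    {
            unsigned int regs[8];
            unsigned int *a = regs + 4;          /* element arithmetic: &regs[4], 16 bytes in */
            unsigned int *b = (void *)regs + 4;  /* byte arithmetic (GNU extension): 4 bytes in */

            assert(a != b);
            return 0;
    }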
+3
arch/loongarch/kvm/vcpu.c
··· 588 588 { 589 589 struct kvm_phyid_map *map; 590 590 591 + if (cpuid < 0) 592 + return NULL; 593 + 591 594 if (cpuid >= KVM_MAX_PHYID) 592 595 return NULL; 593 596
+80
arch/loongarch/pci/pci.c
··· 5 5 #include <linux/kernel.h> 6 6 #include <linux/init.h> 7 7 #include <linux/acpi.h> 8 + #include <linux/delay.h> 8 9 #include <linux/types.h> 9 10 #include <linux/pci.h> 10 11 #include <linux/vgaarb.h> 12 + #include <linux/io-64-nonatomic-lo-hi.h> 11 13 #include <asm/cacheflush.h> 12 14 #include <asm/loongson.h> 13 15 ··· 17 15 #define PCI_DEVICE_ID_LOONGSON_DC1 0x7a06 18 16 #define PCI_DEVICE_ID_LOONGSON_DC2 0x7a36 19 17 #define PCI_DEVICE_ID_LOONGSON_DC3 0x7a46 18 + #define PCI_DEVICE_ID_LOONGSON_GPU1 0x7a15 19 + #define PCI_DEVICE_ID_LOONGSON_GPU2 0x7a25 20 + #define PCI_DEVICE_ID_LOONGSON_GPU3 0x7a35 20 21 21 22 int raw_pci_read(unsigned int domain, unsigned int bus, unsigned int devfn, 22 23 int reg, int len, u32 *val) ··· 104 99 DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LOONGSON, PCI_DEVICE_ID_LOONGSON_DC1, pci_fixup_vgadev); 105 100 DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LOONGSON, PCI_DEVICE_ID_LOONGSON_DC2, pci_fixup_vgadev); 106 101 DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LOONGSON, PCI_DEVICE_ID_LOONGSON_DC3, pci_fixup_vgadev); 102 + 103 + #define CRTC_NUM_MAX 2 104 + #define CRTC_OUTPUT_ENABLE 0x100 105 + 106 + static void loongson_gpu_fixup_dma_hang(struct pci_dev *pdev, bool on) 107 + { 108 + u32 i, val, count, crtc_offset, device; 109 + void __iomem *crtc_reg, *base, *regbase; 110 + static u32 crtc_status[CRTC_NUM_MAX] = { 0 }; 111 + 112 + base = pdev->bus->ops->map_bus(pdev->bus, pdev->devfn + 1, 0); 113 + device = readw(base + PCI_DEVICE_ID); 114 + 115 + regbase = ioremap(readq(base + PCI_BASE_ADDRESS_0) & ~0xffull, SZ_64K); 116 + if (!regbase) { 117 + pci_err(pdev, "Failed to ioremap()\n"); 118 + return; 119 + } 120 + 121 + switch (device) { 122 + case PCI_DEVICE_ID_LOONGSON_DC2: 123 + crtc_reg = regbase + 0x1240; 124 + crtc_offset = 0x10; 125 + break; 126 + case PCI_DEVICE_ID_LOONGSON_DC3: 127 + crtc_reg = regbase; 128 + crtc_offset = 0x400; 129 + break; 130 + } 131 + 132 + for (i = 0; i < CRTC_NUM_MAX; i++, crtc_reg += crtc_offset) { 133 + val = readl(crtc_reg); 134 + 135 + if (!on) 136 + crtc_status[i] = val; 137 + 138 + /* No need to fixup if the status is off at startup. */ 139 + if (!(crtc_status[i] & CRTC_OUTPUT_ENABLE)) 140 + continue; 141 + 142 + if (on) 143 + val |= CRTC_OUTPUT_ENABLE; 144 + else 145 + val &= ~CRTC_OUTPUT_ENABLE; 146 + 147 + mb(); 148 + writel(val, crtc_reg); 149 + 150 + for (count = 0; count < 40; count++) { 151 + val = readl(crtc_reg) & CRTC_OUTPUT_ENABLE; 152 + if ((on && val) || (!on && !val)) 153 + break; 154 + udelay(1000); 155 + } 156 + 157 + pci_info(pdev, "DMA hang fixup at reg[0x%lx]: 0x%x\n", 158 + (unsigned long)crtc_reg & 0xffff, readl(crtc_reg)); 159 + } 160 + 161 + iounmap(regbase); 162 + } 163 + 164 + static void pci_fixup_dma_hang_early(struct pci_dev *pdev) 165 + { 166 + loongson_gpu_fixup_dma_hang(pdev, false); 167 + } 168 + DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_LOONGSON, PCI_DEVICE_ID_LOONGSON_GPU2, pci_fixup_dma_hang_early); 169 + DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_LOONGSON, PCI_DEVICE_ID_LOONGSON_GPU3, pci_fixup_dma_hang_early); 170 + 171 + static void pci_fixup_dma_hang_final(struct pci_dev *pdev) 172 + { 173 + loongson_gpu_fixup_dma_hang(pdev, true); 174 + } 175 + DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LOONGSON, PCI_DEVICE_ID_LOONGSON_GPU2, pci_fixup_dma_hang_final); 176 + DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LOONGSON, PCI_DEVICE_ID_LOONGSON_GPU3, pci_fixup_dma_hang_final);
+2 -2
arch/loongarch/vdso/Makefile
··· 26 26 $(filter -W%,$(filter-out -Wa$(comma)%,$(KBUILD_CFLAGS))) \ 27 27 -std=gnu11 -fms-extensions -O2 -g -fno-strict-aliasing -fno-common -fno-builtin \ 28 28 -fno-stack-protector -fno-jump-tables -DDISABLE_BRANCH_PROFILING \ 29 - $(call cc-option, -fno-asynchronous-unwind-tables) \ 29 + $(call cc-option, -fasynchronous-unwind-tables) \ 30 30 $(call cc-option, -fno-stack-protector) 31 31 aflags-vdso := $(ccflags-vdso) \ 32 32 -D__ASSEMBLY__ -Wa,-gdwarf-2 ··· 41 41 42 42 # VDSO linker flags. 43 43 ldflags-y := -Bsymbolic --no-undefined -soname=linux-vdso.so.1 \ 44 - $(filter -E%,$(KBUILD_CFLAGS)) -shared --build-id -T 44 + $(filter -E%,$(KBUILD_CFLAGS)) -shared --build-id --eh-frame-hdr -T 45 45 46 46 # 47 47 # Shared build commands.
+3 -3
arch/loongarch/vdso/sigreturn.S
··· 12 12 13 13 #include <asm/regdef.h> 14 14 #include <asm/asm.h> 15 + #include <asm/asm-offsets.h> 15 16 16 17 .section .text 17 - .cfi_sections .debug_frame 18 18 19 - SYM_FUNC_START(__vdso_rt_sigreturn) 19 + SYM_SIGFUNC_START(__vdso_rt_sigreturn) 20 20 21 21 li.w a7, __NR_rt_sigreturn 22 22 syscall 0 23 23 24 - SYM_FUNC_END(__vdso_rt_sigreturn) 24 + SYM_SIGFUNC_END(__vdso_rt_sigreturn)
+2 -2
arch/s390/include/asm/barrier.h
··· 62 62 * @size: number of elements in array 63 63 */ 64 64 #define array_index_mask_nospec array_index_mask_nospec 65 - static inline unsigned long array_index_mask_nospec(unsigned long index, 66 - unsigned long size) 65 + static __always_inline unsigned long array_index_mask_nospec(unsigned long index, 66 + unsigned long size) 67 67 { 68 68 unsigned long mask; 69 69
+3
arch/s390/include/asm/kvm_host.h
··· 710 710 void kvm_arch_crypto_set_masks(struct kvm *kvm, unsigned long *apm, 711 711 unsigned long *aqm, unsigned long *adm); 712 712 713 + #define SIE64_RETURN_NORMAL 0 714 + #define SIE64_RETURN_MCCK 1 715 + 713 716 int __sie64a(phys_addr_t sie_block_phys, struct kvm_s390_sie_block *sie_block, u64 *rsa, 714 717 unsigned long gasce); 715 718
+1 -1
arch/s390/include/asm/stacktrace.h
··· 62 62 struct { 63 63 unsigned long sie_control_block; 64 64 unsigned long sie_savearea; 65 - unsigned long sie_reason; 65 + unsigned long sie_return; 66 66 unsigned long sie_flags; 67 67 unsigned long sie_control_block_phys; 68 68 unsigned long sie_guest_asce;
+1 -1
arch/s390/kernel/asm-offsets.c
··· 63 63 OFFSET(__SF_EMPTY, stack_frame, empty[0]); 64 64 OFFSET(__SF_SIE_CONTROL, stack_frame, sie_control_block); 65 65 OFFSET(__SF_SIE_SAVEAREA, stack_frame, sie_savearea); 66 - OFFSET(__SF_SIE_REASON, stack_frame, sie_reason); 66 + OFFSET(__SF_SIE_RETURN, stack_frame, sie_return); 67 67 OFFSET(__SF_SIE_FLAGS, stack_frame, sie_flags); 68 68 OFFSET(__SF_SIE_CONTROL_PHYS, stack_frame, sie_control_block_phys); 69 69 OFFSET(__SF_SIE_GUEST_ASCE, stack_frame, sie_guest_asce);
+5 -2
arch/s390/kernel/entry.S
··· 200 200 stg %r3,__SF_SIE_CONTROL(%r15) # ...and virtual addresses 201 201 stg %r4,__SF_SIE_SAVEAREA(%r15) # save guest register save area 202 202 stg %r5,__SF_SIE_GUEST_ASCE(%r15) # save guest asce 203 - xc __SF_SIE_REASON(8,%r15),__SF_SIE_REASON(%r15) # reason code = 0 203 + xc __SF_SIE_RETURN(8,%r15),__SF_SIE_RETURN(%r15) # return code = 0 204 204 mvc __SF_SIE_FLAGS(8,%r15),__TI_flags(%r14) # copy thread flags 205 205 lmg %r0,%r13,0(%r4) # load guest gprs 0-13 206 206 mvi __TI_sie(%r14),1 ··· 237 237 xgr %r4,%r4 238 238 xgr %r5,%r5 239 239 lmg %r6,%r14,__SF_GPRS(%r15) # restore kernel registers 240 - lg %r2,__SF_SIE_REASON(%r15) # return exit reason code 240 + lg %r2,__SF_SIE_RETURN(%r15) # return sie return code 241 241 BR_EX %r14 242 242 SYM_FUNC_END(__sie64a) 243 243 EXPORT_SYMBOL(__sie64a) ··· 271 271 xgr %r9,%r9 272 272 xgr %r10,%r10 273 273 xgr %r11,%r11 274 + xgr %r12,%r12 274 275 la %r2,STACK_FRAME_OVERHEAD(%r15) # pointer to pt_regs 275 276 mvc __PT_R8(64,%r2),__LC_SAVE_AREA(%r13) 276 277 MBEAR %r2,%r13 ··· 408 407 xgr %r6,%r6 409 408 xgr %r7,%r7 410 409 xgr %r10,%r10 410 + xgr %r12,%r12 411 411 xc __PT_FLAGS(8,%r11),__PT_FLAGS(%r11) 412 412 mvc __PT_R8(64,%r11),__LC_SAVE_AREA(%r13) 413 413 MBEAR %r11,%r13 ··· 498 496 xgr %r6,%r6 499 497 xgr %r7,%r7 500 498 xgr %r10,%r10 499 + xgr %r12,%r12 501 500 stmg %r8,%r9,__PT_PSW(%r11) 502 501 xc __PT_FLAGS(8,%r11),__PT_FLAGS(%r11) 503 502 xc __SF_BACKCHAIN(8,%r15),__SF_BACKCHAIN(%r15)
+2 -2
arch/s390/kernel/nmi.c
··· 487 487 mcck_dam_code = (mci.val & MCIC_SUBCLASS_MASK); 488 488 if (test_cpu_flag(CIF_MCCK_GUEST) && 489 489 (mcck_dam_code & MCCK_CODE_NO_GUEST) != mcck_dam_code) { 490 - /* Set exit reason code for host's later handling */ 491 - *((long *)(regs->gprs[15] + __SF_SIE_REASON)) = -EINTR; 490 + /* Set sie return code for host's later handling */ 491 + ((struct stack_frame *)regs->gprs[15])->sie_return = SIE64_RETURN_MCCK; 492 492 } 493 493 clear_cpu_flag(CIF_MCCK_GUEST); 494 494
+4 -1
arch/s390/kernel/syscall.c
··· 13 13 */ 14 14 15 15 #include <linux/cpufeature.h> 16 + #include <linux/nospec.h> 16 17 #include <linux/errno.h> 17 18 #include <linux/sched.h> 18 19 #include <linux/mm.h> ··· 132 131 if (unlikely(test_and_clear_pt_regs_flag(regs, PIF_SYSCALL_RET_SET))) 133 132 goto out; 134 133 regs->gprs[2] = -ENOSYS; 135 - if (likely(nr < NR_syscalls)) 134 + if (likely(nr < NR_syscalls)) { 135 + nr = array_index_nospec(nr, NR_syscalls); 136 136 regs->gprs[2] = sys_call_table[nr](regs); 137 + } 137 138 out: 138 139 syscall_exit_to_user_mode(regs); 139 140 }
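The pattern added above is the generic Spectre-v1 mitigation from <linux/nospec.h>: bounds-check first, then clamp the index before the dependent table load so a mispredicted branch cannot speculate past the check. A self-contained sketch; the table and function are invented for illustration:

    #include <linux/errno.h>
    #include <linux/kernel.h>
    #include <linux/nospec.h>

    static const int table[16];

    static int lookup(unsigned long nr)
    {
            if (nr >= ARRAY_SIZE(table))
                    return -EINVAL;
            /* Clamp nr under speculation before using it as an index. */
            nr = array_index_nospec(nr, ARRAY_SIZE(table));
            return table[nr];
    }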
+15 -85
arch/s390/kvm/dat.c
··· 135 135 } 136 136 137 137 /** 138 - * dat_crstep_xchg() - Exchange a gmap CRSTE with another. 139 - * @crstep: Pointer to the CRST entry 140 - * @new: Replacement entry. 141 - * @gfn: The affected guest address. 142 - * @asce: The ASCE of the address space. 143 - * 144 - * Context: This function is assumed to be called with kvm->mmu_lock held. 145 - */ 146 - void dat_crstep_xchg(union crste *crstep, union crste new, gfn_t gfn, union asce asce) 147 - { 148 - if (crstep->h.i) { 149 - WRITE_ONCE(*crstep, new); 150 - return; 151 - } else if (cpu_has_edat2()) { 152 - crdte_crste(crstep, *crstep, new, gfn, asce); 153 - return; 154 - } 155 - 156 - if (machine_has_tlb_guest()) 157 - idte_crste(crstep, gfn, IDTE_GUEST_ASCE, asce, IDTE_GLOBAL); 158 - else 159 - idte_crste(crstep, gfn, 0, NULL_ASCE, IDTE_GLOBAL); 160 - WRITE_ONCE(*crstep, new); 161 - } 162 - 163 - /** 164 138 * dat_crstep_xchg_atomic() - Atomically exchange a gmap CRSTE with another. 165 139 * @crstep: Pointer to the CRST entry. 166 140 * @old: Expected old value. ··· 149 175 * 150 176 * Return: %true if the exchange was successful. 151 177 */ 152 - bool dat_crstep_xchg_atomic(union crste *crstep, union crste old, union crste new, gfn_t gfn, 153 - union asce asce) 178 + bool __must_check dat_crstep_xchg_atomic(union crste *crstep, union crste old, union crste new, 179 + gfn_t gfn, union asce asce) 154 180 { 155 181 if (old.h.i) 156 182 return arch_try_cmpxchg((long *)crstep, &old.val, new.val); ··· 266 292 pt->ptes[i].val = init.val | i * PAGE_SIZE; 267 293 /* No need to take locks as the page table is not installed yet. */ 268 294 pgste_init.prefix_notif = old.s.fc1.prefix_notif; 295 + pgste_init.vsie_notif = old.s.fc1.vsie_notif; 269 296 pgste_init.pcl = uses_skeys && init.h.i; 270 297 dat_init_pgstes(pt, pgste_init.val); 271 298 } else { ··· 868 893 869 894 /* This table entry needs to be updated. */ 870 895 if (walk->start <= gfn && walk->end >= next) { 871 - dat_crstep_xchg_atomic(crstep, crste, new_crste, gfn, walk->asce); 896 + if (!dat_crstep_xchg_atomic(crstep, crste, new_crste, gfn, walk->asce)) 897 + return -EINVAL; 872 898 /* A lower level table was present, needs to be freed. 
*/ 873 899 if (!crste.h.fc && !crste.h.i) { 874 900 if (is_pmd(crste)) ··· 997 1021 return _dat_walk_gfn_range(start, end, asce, &test_age_ops, 0, NULL) > 0; 998 1022 } 999 1023 1000 - int dat_link(struct kvm_s390_mmu_cache *mc, union asce asce, int level, 1001 - bool uses_skeys, struct guest_fault *f) 1002 - { 1003 - union crste oldval, newval; 1004 - union pte newpte, oldpte; 1005 - union pgste pgste; 1006 - int rc = 0; 1007 - 1008 - rc = dat_entry_walk(mc, f->gfn, asce, DAT_WALK_ALLOC_CONTINUE, level, &f->crstep, &f->ptep); 1009 - if (rc == -EINVAL || rc == -ENOMEM) 1010 - return rc; 1011 - if (rc) 1012 - return -EAGAIN; 1013 - 1014 - if (WARN_ON_ONCE(unlikely(get_level(f->crstep, f->ptep) > level))) 1015 - return -EINVAL; 1016 - 1017 - if (f->ptep) { 1018 - pgste = pgste_get_lock(f->ptep); 1019 - oldpte = *f->ptep; 1020 - newpte = _pte(f->pfn, f->writable, f->write_attempt | oldpte.s.d, !f->page); 1021 - newpte.s.sd = oldpte.s.sd; 1022 - oldpte.s.sd = 0; 1023 - if (oldpte.val == _PTE_EMPTY.val || oldpte.h.pfra == f->pfn) { 1024 - pgste = __dat_ptep_xchg(f->ptep, pgste, newpte, f->gfn, asce, uses_skeys); 1025 - if (f->callback) 1026 - f->callback(f); 1027 - } else { 1028 - rc = -EAGAIN; 1029 - } 1030 - pgste_set_unlock(f->ptep, pgste); 1031 - } else { 1032 - oldval = READ_ONCE(*f->crstep); 1033 - newval = _crste_fc1(f->pfn, oldval.h.tt, f->writable, 1034 - f->write_attempt | oldval.s.fc1.d); 1035 - newval.s.fc1.sd = oldval.s.fc1.sd; 1036 - if (oldval.val != _CRSTE_EMPTY(oldval.h.tt).val && 1037 - crste_origin_large(oldval) != crste_origin_large(newval)) 1038 - return -EAGAIN; 1039 - if (!dat_crstep_xchg_atomic(f->crstep, oldval, newval, f->gfn, asce)) 1040 - return -EAGAIN; 1041 - if (f->callback) 1042 - f->callback(f); 1043 - } 1044 - 1045 - return rc; 1046 - } 1047 - 1048 1024 static long dat_set_pn_crste(union crste *crstep, gfn_t gfn, gfn_t next, struct dat_walk *walk) 1049 1025 { 1050 - union crste crste = READ_ONCE(*crstep); 1026 + union crste newcrste, oldcrste; 1051 1027 int *n = walk->priv; 1052 1028 1053 - if (!crste.h.fc || crste.h.i || crste.h.p) 1054 - return 0; 1055 - 1029 + do { 1030 + oldcrste = READ_ONCE(*crstep); 1031 + if (!oldcrste.h.fc || oldcrste.h.i || oldcrste.h.p) 1032 + return 0; 1033 + if (oldcrste.s.fc1.prefix_notif) 1034 + break; 1035 + newcrste = oldcrste; 1036 + newcrste.s.fc1.prefix_notif = 1; 1037 + } while (!dat_crstep_xchg_atomic(crstep, oldcrste, newcrste, gfn, walk->asce)); 1056 1038 *n = 2; 1057 - if (crste.s.fc1.prefix_notif) 1058 - return 0; 1059 - crste.s.fc1.prefix_notif = 1; 1060 - dat_crstep_xchg(crstep, crste, gfn, walk->asce); 1061 1039 return 0; 1062 1040 } 1063 1041
+12 -11
arch/s390/kvm/dat.h
··· 160 160 unsigned long :44; /* HW */ 161 161 unsigned long : 3; /* Unused */ 162 162 unsigned long : 1; /* HW */ 163 + unsigned long s : 1; /* Special */ 163 164 unsigned long w : 1; /* Writable soft-bit */ 164 165 unsigned long r : 1; /* Readable soft-bit */ 165 166 unsigned long d : 1; /* Dirty */ 166 167 unsigned long y : 1; /* Young */ 167 - unsigned long prefix_notif : 1; /* Guest prefix invalidation notification */ 168 168 unsigned long : 3; /* HW */ 169 + unsigned long prefix_notif : 1; /* Guest prefix invalidation notification */ 169 170 unsigned long vsie_notif : 1; /* Referenced in a shadow table */ 170 - unsigned long : 1; /* Unused */ 171 171 unsigned long : 4; /* HW */ 172 172 unsigned long sd : 1; /* Soft-Dirty */ 173 173 unsigned long pr : 1; /* Present */ ··· 183 183 unsigned long :33; /* HW */ 184 184 unsigned long :14; /* Unused */ 185 185 unsigned long : 1; /* HW */ 186 + unsigned long s : 1; /* Special */ 186 187 unsigned long w : 1; /* Writable soft-bit */ 187 188 unsigned long r : 1; /* Readable soft-bit */ 188 189 unsigned long d : 1; /* Dirty */ 189 190 unsigned long y : 1; /* Young */ 190 - unsigned long prefix_notif : 1; /* Guest prefix invalidation notification */ 191 191 unsigned long : 3; /* HW */ 192 + unsigned long prefix_notif : 1; /* Guest prefix invalidation notification */ 192 193 unsigned long vsie_notif : 1; /* Referenced in a shadow table */ 193 - unsigned long : 1; /* Unused */ 194 194 unsigned long : 4; /* HW */ 195 195 unsigned long sd : 1; /* Soft-Dirty */ 196 196 unsigned long pr : 1; /* Present */ ··· 254 254 struct { 255 255 unsigned long :47; 256 256 unsigned long : 1; /* HW (should be 0) */ 257 + unsigned long s : 1; /* Special */ 257 258 unsigned long w : 1; /* Writable */ 258 259 unsigned long r : 1; /* Readable */ 259 260 unsigned long d : 1; /* Dirty */ 260 261 unsigned long y : 1; /* Young */ 261 - unsigned long prefix_notif : 1; /* Guest prefix invalidation notification */ 262 262 unsigned long : 3; /* HW */ 263 + unsigned long prefix_notif : 1; /* Guest prefix invalidation notification */ 263 264 unsigned long vsie_notif : 1; /* Referenced in a shadow table */ 264 - unsigned long : 1; 265 265 unsigned long : 4; /* HW */ 266 266 unsigned long sd : 1; /* Soft-Dirty */ 267 267 unsigned long pr : 1; /* Present */ ··· 540 540 u16 type, u16 param); 541 541 int dat_set_prefix_notif_bit(union asce asce, gfn_t gfn); 542 542 bool dat_test_age_gfn(union asce asce, gfn_t start, gfn_t end); 543 - int dat_link(struct kvm_s390_mmu_cache *mc, union asce asce, int level, 544 - bool uses_skeys, struct guest_fault *f); 545 543 546 544 int dat_perform_essa(union asce asce, gfn_t gfn, int orc, union essa_state *state, bool *dirty); 547 545 long dat_reset_cmma(union asce asce, gfn_t start_gfn); ··· 936 938 return dat_crstep_xchg_atomic(_CRSTEP(pudp), _CRSTE(old), _CRSTE(new), gfn, asce); 937 939 } 938 940 939 - static inline void dat_crstep_clear(union crste *crstep, gfn_t gfn, union asce asce) 941 + static inline union crste dat_crstep_clear_atomic(union crste *crstep, gfn_t gfn, union asce asce) 940 942 { 941 - union crste newcrste = _CRSTE_EMPTY(crstep->h.tt); 943 + union crste oldcrste, empty = _CRSTE_EMPTY(crstep->h.tt); 942 944 943 - dat_crstep_xchg(crstep, newcrste, gfn, asce); 945 + do { 946 + oldcrste = READ_ONCE(*crstep); 947 + } while (!dat_crstep_xchg_atomic(crstep, oldcrste, empty, gfn, asce)); 948 + return oldcrste; 944 949 } 945 950 946 951 static inline int get_level(union crste *crstep, union pte *ptep)
+55 -22
arch/s390/kvm/gaccess.c
··· 1434 1434 if (rc) 1435 1435 return rc; 1436 1436 1437 - pgste = pgste_get_lock(ptep_h); 1438 - newpte = _pte(f->pfn, f->writable, !p, 0); 1439 - newpte.s.d |= ptep->s.d; 1440 - newpte.s.sd |= ptep->s.sd; 1441 - newpte.h.p &= ptep->h.p; 1442 - pgste = _gmap_ptep_xchg(sg->parent, ptep_h, newpte, pgste, f->gfn, false); 1443 - pgste.vsie_notif = 1; 1437 + if (!pgste_get_trylock(ptep_h, &pgste)) 1438 + return -EAGAIN; 1439 + newpte = _pte(f->pfn, f->writable, !p, ptep_h->s.s); 1440 + newpte.s.d |= ptep_h->s.d; 1441 + newpte.s.sd |= ptep_h->s.sd; 1442 + newpte.h.p &= ptep_h->h.p; 1443 + if (!newpte.h.p && !f->writable) { 1444 + rc = -EOPNOTSUPP; 1445 + } else { 1446 + pgste = _gmap_ptep_xchg(sg->parent, ptep_h, newpte, pgste, f->gfn, false); 1447 + pgste.vsie_notif = 1; 1448 + } 1444 1449 pgste_set_unlock(ptep_h, pgste); 1450 + if (rc) 1451 + return rc; 1452 + if (!sg->parent) 1453 + return -EAGAIN; 1445 1454 1446 1455 newpte = _pte(f->pfn, 0, !p, 0); 1447 - pgste = pgste_get_lock(ptep); 1456 + if (!pgste_get_trylock(ptep, &pgste)) 1457 + return -EAGAIN; 1448 1458 pgste = __dat_ptep_xchg(ptep, pgste, newpte, gpa_to_gfn(raddr), sg->asce, uses_skeys(sg)); 1449 1459 pgste_set_unlock(ptep, pgste); 1450 1460 ··· 1464 1454 static int _do_shadow_crste(struct gmap *sg, gpa_t raddr, union crste *host, union crste *table, 1465 1455 struct guest_fault *f, bool p) 1466 1456 { 1467 - union crste newcrste; 1457 + union crste newcrste, oldcrste; 1468 1458 gfn_t gfn; 1469 1459 int rc; 1470 1460 ··· 1477 1467 if (rc) 1478 1468 return rc; 1479 1469 1480 - newcrste = _crste_fc1(f->pfn, host->h.tt, f->writable, !p); 1481 - newcrste.s.fc1.d |= host->s.fc1.d; 1482 - newcrste.s.fc1.sd |= host->s.fc1.sd; 1483 - newcrste.h.p &= host->h.p; 1484 - newcrste.s.fc1.vsie_notif = 1; 1485 - newcrste.s.fc1.prefix_notif = host->s.fc1.prefix_notif; 1486 - _gmap_crstep_xchg(sg->parent, host, newcrste, f->gfn, false); 1470 + do { 1471 + /* _gmap_crstep_xchg_atomic() could have unshadowed this shadow gmap */ 1472 + if (!sg->parent) 1473 + return -EAGAIN; 1474 + oldcrste = READ_ONCE(*host); 1475 + newcrste = _crste_fc1(f->pfn, oldcrste.h.tt, f->writable, !p); 1476 + newcrste.s.fc1.d |= oldcrste.s.fc1.d; 1477 + newcrste.s.fc1.sd |= oldcrste.s.fc1.sd; 1478 + newcrste.h.p &= oldcrste.h.p; 1479 + newcrste.s.fc1.vsie_notif = 1; 1480 + newcrste.s.fc1.prefix_notif = oldcrste.s.fc1.prefix_notif; 1481 + newcrste.s.fc1.s = oldcrste.s.fc1.s; 1482 + if (!newcrste.h.p && !f->writable) 1483 + return -EOPNOTSUPP; 1484 + } while (!_gmap_crstep_xchg_atomic(sg->parent, host, oldcrste, newcrste, f->gfn, false)); 1485 + if (!sg->parent) 1486 + return -EAGAIN; 1487 1487 1488 - newcrste = _crste_fc1(f->pfn, host->h.tt, 0, !p); 1489 - dat_crstep_xchg(table, newcrste, gpa_to_gfn(raddr), sg->asce); 1488 + newcrste = _crste_fc1(f->pfn, oldcrste.h.tt, 0, !p); 1489 + gfn = gpa_to_gfn(raddr); 1490 + while (!dat_crstep_xchg_atomic(table, READ_ONCE(*table), newcrste, gfn, sg->asce)) 1491 + ; 1490 1492 return 0; 1491 1493 } 1492 1494 ··· 1522 1500 if (rc) 1523 1501 return rc; 1524 1502 1525 - /* A race occourred. The shadow mapping is already valid, nothing to do */ 1526 - if ((ptep && !ptep->h.i) || (!ptep && crste_leaf(*table))) 1503 + /* A race occurred. 
The shadow mapping is already valid, nothing to do */ 1504 + if ((ptep && !ptep->h.i && ptep->h.p == w->p) || 1505 + (!ptep && crste_leaf(*table) && !table->h.i && table->h.p == w->p)) 1527 1506 return 0; 1528 1507 1529 1508 gl = get_level(table, ptep); 1509 + 1510 + /* In case of a real address space */ 1511 + if (w->level <= LEVEL_MEM) { 1512 + l = TABLE_TYPE_PAGE_TABLE; 1513 + hl = TABLE_TYPE_REGION1; 1514 + goto real_address_space; 1515 + } 1530 1516 1531 1517 /* 1532 1518 * Skip levels that are already protected. For each level, protect 1533 1519 * only the page containing the entry, not the whole table. 1534 1520 */ 1535 1521 for (i = gl ; i >= w->level; i--) { 1536 - rc = gmap_protect_rmap(mc, sg, entries[i - 1].gfn, gpa_to_gfn(saddr), 1537 - entries[i - 1].pfn, i, entries[i - 1].writable); 1522 + rc = gmap_protect_rmap(mc, sg, entries[i].gfn, gpa_to_gfn(saddr), 1523 + entries[i].pfn, i + 1, entries[i].writable); 1538 1524 if (rc) 1539 1525 return rc; 1526 + if (!sg->parent) 1527 + return -EAGAIN; 1540 1528 } 1541 1529 1542 1530 rc = dat_entry_walk(NULL, entries[LEVEL_MEM].gfn, sg->parent->asce, DAT_WALK_LEAF, ··· 1558 1526 /* Get the smallest granularity */ 1559 1527 l = min3(gl, hl, w->level); 1560 1528 1529 + real_address_space: 1561 1530 flags = DAT_WALK_SPLIT_ALLOC | (uses_skeys(sg->parent) ? DAT_WALK_USES_SKEYS : 0); 1562 1531 /* If necessary, create the shadow mapping */ 1563 1532 if (l < gl) {
+114 -46
arch/s390/kvm/gmap.c
··· 313 313 struct clear_young_pte_priv *priv = walk->priv; 314 314 union crste crste, new; 315 315 316 - crste = READ_ONCE(*crstep); 316 + do { 317 + crste = READ_ONCE(*crstep); 317 318 318 - if (!crste.h.fc) 319 - return 0; 320 - if (!crste.s.fc1.y && crste.h.i) 321 - return 0; 322 - if (!crste_prefix(crste) || gmap_mkold_prefix(priv->gmap, gfn, end)) { 319 + if (!crste.h.fc) 320 + return 0; 321 + if (!crste.s.fc1.y && crste.h.i) 322 + return 0; 323 + if (crste_prefix(crste) && !gmap_mkold_prefix(priv->gmap, gfn, end)) 324 + break; 325 + 323 326 new = crste; 324 327 new.h.i = 1; 325 328 new.s.fc1.y = 0; ··· 331 328 folio_set_dirty(phys_to_folio(crste_origin_large(crste))); 332 329 new.s.fc1.d = 0; 333 330 new.h.p = 1; 334 - dat_crstep_xchg(crstep, new, gfn, walk->asce); 335 - } 331 + } while (!dat_crstep_xchg_atomic(crstep, crste, new, gfn, walk->asce)); 332 + 336 333 priv->young = 1; 337 334 return 0; 338 335 } ··· 394 391 { 395 392 struct gmap_unmap_priv *priv = walk->priv; 396 393 struct folio *folio = NULL; 394 + union crste old = *crstep; 397 395 398 - if (crstep->h.fc) { 399 - if (crstep->s.fc1.pr && test_bit(GMAP_FLAG_EXPORT_ON_UNMAP, &priv->gmap->flags)) 400 - folio = phys_to_folio(crste_origin_large(*crstep)); 401 - gmap_crstep_xchg(priv->gmap, crstep, _CRSTE_EMPTY(crstep->h.tt), gfn); 402 - if (folio) 403 - uv_convert_from_secure_folio(folio); 404 - } 396 + if (!old.h.fc) 397 + return 0; 398 + 399 + if (old.s.fc1.pr && test_bit(GMAP_FLAG_EXPORT_ON_UNMAP, &priv->gmap->flags)) 400 + folio = phys_to_folio(crste_origin_large(old)); 401 + /* No races should happen because kvm->mmu_lock is held in write mode */ 402 + KVM_BUG_ON(!gmap_crstep_xchg_atomic(priv->gmap, crstep, old, _CRSTE_EMPTY(old.h.tt), gfn), 403 + priv->gmap->kvm); 404 + if (folio) 405 + uv_convert_from_secure_folio(folio); 405 406 406 407 return 0; 407 408 } ··· 481 474 482 475 if (fatal_signal_pending(current)) 483 476 return 1; 484 - crste = READ_ONCE(*table); 485 - if (!crste.h.fc) 486 - return 0; 487 - if (crste.h.p && !crste.s.fc1.sd) 488 - return 0; 477 + do { 478 + crste = READ_ONCE(*table); 479 + if (!crste.h.fc) 480 + return 0; 481 + if (crste.h.p && !crste.s.fc1.sd) 482 + return 0; 489 483 490 - /* 491 - * If this large page contains one or more prefixes of vCPUs that are 492 - * currently running, do not reset the protection, leave it marked as 493 - * dirty. 494 - */ 495 - if (!crste.s.fc1.prefix_notif || gmap_mkold_prefix(gmap, gfn, end)) { 484 + /* 485 + * If this large page contains one or more prefixes of vCPUs that are 486 + * currently running, do not reset the protection, leave it marked as 487 + * dirty. 
488 + */ 489 + if (crste.s.fc1.prefix_notif && !gmap_mkold_prefix(gmap, gfn, end)) 490 + break; 496 491 new = crste; 497 492 new.h.p = 1; 498 493 new.s.fc1.sd = 0; 499 - gmap_crstep_xchg(gmap, table, new, gfn); 500 - } 494 + } while (!gmap_crstep_xchg_atomic(gmap, table, crste, new, gfn)); 501 495 502 496 for ( ; gfn < end; gfn++) 503 497 mark_page_dirty(gmap->kvm, gfn); ··· 519 511 _dat_walk_gfn_range(start, end, gmap->asce, &walk_ops, 0, gmap); 520 512 } 521 513 522 - static int gmap_handle_minor_crste_fault(union asce asce, struct guest_fault *f) 514 + static int gmap_handle_minor_crste_fault(struct gmap *gmap, struct guest_fault *f) 523 515 { 524 516 union crste newcrste, oldcrste = READ_ONCE(*f->crstep); 525 517 ··· 544 536 newcrste.s.fc1.d = 1; 545 537 newcrste.s.fc1.sd = 1; 546 538 } 547 - if (!oldcrste.s.fc1.d && newcrste.s.fc1.d) 548 - SetPageDirty(phys_to_page(crste_origin_large(newcrste))); 549 539 /* In case of races, let the slow path deal with it. */ 550 - return !dat_crstep_xchg_atomic(f->crstep, oldcrste, newcrste, f->gfn, asce); 540 + return !gmap_crstep_xchg_atomic(gmap, f->crstep, oldcrste, newcrste, f->gfn); 551 541 } 552 542 /* Trying to write on a read-only page, let the slow path deal with it. */ 553 543 return 1; ··· 574 568 newpte.s.d = 1; 575 569 newpte.s.sd = 1; 576 570 } 577 - if (!oldpte.s.d && newpte.s.d) 578 - SetPageDirty(pfn_to_page(newpte.h.pfra)); 579 571 *pgste = gmap_ptep_xchg(gmap, f->ptep, newpte, *pgste, f->gfn); 580 572 581 573 return 0; ··· 610 606 fault->callback(fault); 611 607 pgste_set_unlock(fault->ptep, pgste); 612 608 } else { 613 - rc = gmap_handle_minor_crste_fault(gmap->asce, fault); 609 + rc = gmap_handle_minor_crste_fault(gmap, fault); 614 610 if (!rc && fault->callback) 615 611 fault->callback(fault); 616 612 } ··· 627 623 return test_bit(GMAP_FLAG_ALLOW_HPAGE_1M, &gmap->flags); 628 624 } 629 625 626 + static int _gmap_link(struct kvm_s390_mmu_cache *mc, struct gmap *gmap, int level, 627 + struct guest_fault *f) 628 + { 629 + union crste oldval, newval; 630 + union pte newpte, oldpte; 631 + union pgste pgste; 632 + int rc = 0; 633 + 634 + rc = dat_entry_walk(mc, f->gfn, gmap->asce, DAT_WALK_ALLOC_CONTINUE, level, 635 + &f->crstep, &f->ptep); 636 + if (rc == -ENOMEM) 637 + return rc; 638 + if (KVM_BUG_ON(rc == -EINVAL, gmap->kvm)) 639 + return rc; 640 + if (rc) 641 + return -EAGAIN; 642 + if (KVM_BUG_ON(get_level(f->crstep, f->ptep) > level, gmap->kvm)) 643 + return -EINVAL; 644 + 645 + if (f->ptep) { 646 + pgste = pgste_get_lock(f->ptep); 647 + oldpte = *f->ptep; 648 + newpte = _pte(f->pfn, f->writable, f->write_attempt | oldpte.s.d, !f->page); 649 + newpte.s.sd = oldpte.s.sd; 650 + oldpte.s.sd = 0; 651 + if (oldpte.val == _PTE_EMPTY.val || oldpte.h.pfra == f->pfn) { 652 + pgste = gmap_ptep_xchg(gmap, f->ptep, newpte, pgste, f->gfn); 653 + if (f->callback) 654 + f->callback(f); 655 + } else { 656 + rc = -EAGAIN; 657 + } 658 + pgste_set_unlock(f->ptep, pgste); 659 + } else { 660 + do { 661 + oldval = READ_ONCE(*f->crstep); 662 + newval = _crste_fc1(f->pfn, oldval.h.tt, f->writable, 663 + f->write_attempt | oldval.s.fc1.d); 664 + newval.s.fc1.s = !f->page; 665 + newval.s.fc1.sd = oldval.s.fc1.sd; 666 + if (oldval.val != _CRSTE_EMPTY(oldval.h.tt).val && 667 + crste_origin_large(oldval) != crste_origin_large(newval)) 668 + return -EAGAIN; 669 + } while (!gmap_crstep_xchg_atomic(gmap, f->crstep, oldval, newval, f->gfn)); 670 + if (f->callback) 671 + f->callback(f); 672 + } 673 + 674 + return rc; 675 + } 676 + 630 677 int gmap_link(struct 
kvm_s390_mmu_cache *mc, struct gmap *gmap, struct guest_fault *f) 631 678 { 632 679 unsigned int order; 633 - int rc, level; 680 + int level; 634 681 635 682 lockdep_assert_held(&gmap->kvm->mmu_lock); 636 683 ··· 693 638 else if (order >= get_order(_SEGMENT_SIZE) && gmap_1m_allowed(gmap, f->gfn)) 694 639 level = TABLE_TYPE_SEGMENT; 695 640 } 696 - rc = dat_link(mc, gmap->asce, level, uses_skeys(gmap), f); 697 - KVM_BUG_ON(rc == -EINVAL, gmap->kvm); 698 - return rc; 641 + return _gmap_link(mc, gmap, level, f); 699 642 } 700 643 701 644 static int gmap_ucas_map_one(struct kvm_s390_mmu_cache *mc, struct gmap *gmap, 702 645 gfn_t p_gfn, gfn_t c_gfn, bool force_alloc) 703 646 { 647 + union crste newcrste, oldcrste; 704 648 struct page_table *pt; 705 - union crste newcrste; 706 649 union crste *crstep; 707 650 union pte *ptep; 708 651 int rc; ··· 726 673 &crstep, &ptep); 727 674 if (rc) 728 675 return rc; 729 - dat_crstep_xchg(crstep, newcrste, c_gfn, gmap->asce); 676 + do { 677 + oldcrste = READ_ONCE(*crstep); 678 + if (oldcrste.val == newcrste.val) 679 + break; 680 + } while (!dat_crstep_xchg_atomic(crstep, oldcrste, newcrste, c_gfn, gmap->asce)); 730 681 return 0; 731 682 } 732 683 ··· 834 777 int rc; 835 778 836 779 rc = dat_entry_walk(NULL, c_gfn, gmap->asce, 0, TABLE_TYPE_SEGMENT, &crstep, &ptep); 837 - if (!rc) 838 - dat_crstep_xchg(crstep, _PMD_EMPTY, c_gfn, gmap->asce); 780 + if (rc) 781 + return; 782 + while (!dat_crstep_xchg_atomic(crstep, READ_ONCE(*crstep), _PMD_EMPTY, c_gfn, gmap->asce)) 783 + ; 839 784 } 840 785 841 786 void gmap_ucas_unmap(struct gmap *gmap, gfn_t c_gfn, unsigned long count) ··· 1076 1017 dat_ptep_xchg(ptep, _PTE_EMPTY, r_gfn, sg->asce, uses_skeys(sg)); 1077 1018 return; 1078 1019 } 1079 - crste = READ_ONCE(*crstep); 1080 - dat_crstep_clear(crstep, r_gfn, sg->asce); 1020 + 1021 + crste = dat_crstep_clear_atomic(crstep, r_gfn, sg->asce); 1081 1022 if (crste_leaf(crste) || crste.h.i) 1082 1023 return; 1083 1024 if (is_pmd(crste)) ··· 1160 1101 static inline int __gmap_protect_asce_top_level(struct kvm_s390_mmu_cache *mc, struct gmap *sg, 1161 1102 struct gmap_protect_asce_top_level *context) 1162 1103 { 1104 + struct gmap *parent; 1163 1105 int rc, i; 1164 1106 1165 1107 guard(write_lock)(&sg->kvm->mmu_lock); ··· 1168 1108 if (kvm_s390_array_needs_retry_safe(sg->kvm, context->seq, context->f)) 1169 1109 return -EAGAIN; 1170 1110 1171 - scoped_guard(spinlock, &sg->parent->children_lock) { 1111 + parent = READ_ONCE(sg->parent); 1112 + if (!parent) 1113 + return -EAGAIN; 1114 + scoped_guard(spinlock, &parent->children_lock) { 1115 + if (READ_ONCE(sg->parent) != parent) 1116 + return -EAGAIN; 1172 1117 for (i = 0; i < CRST_TABLE_PAGES; i++) { 1173 1118 if (!context->f[i].valid) 1174 1119 continue; ··· 1255 1190 { 1256 1191 struct gmap *sg, *new; 1257 1192 int rc; 1193 + 1194 + if (WARN_ON(!parent)) 1195 + return ERR_PTR(-EINVAL); 1258 1196 1259 1197 scoped_guard(spinlock, &parent->children_lock) { 1260 1198 sg = gmap_find_shadow(parent, asce, edat_level);
+21 -12
arch/s390/kvm/gmap.h
··· 185 185 else 186 186 _gmap_handle_vsie_unshadow_event(gmap, gfn); 187 187 } 188 + if (!ptep->s.d && newpte.s.d && !newpte.s.s) 189 + SetPageDirty(pfn_to_page(newpte.h.pfra)); 188 190 return __dat_ptep_xchg(ptep, pgste, newpte, gfn, gmap->asce, uses_skeys(gmap)); 189 191 } 190 192 ··· 196 194 return _gmap_ptep_xchg(gmap, ptep, newpte, pgste, gfn, true); 197 195 } 198 196 199 - static inline void _gmap_crstep_xchg(struct gmap *gmap, union crste *crstep, union crste ne, 200 - gfn_t gfn, bool needs_lock) 197 + static inline bool __must_check _gmap_crstep_xchg_atomic(struct gmap *gmap, union crste *crstep, 198 + union crste oldcrste, union crste newcrste, 199 + gfn_t gfn, bool needs_lock) 201 200 { 202 - unsigned long align = 8 + (is_pmd(*crstep) ? 0 : 11); 201 + unsigned long align = is_pmd(newcrste) ? _PAGE_ENTRIES : _PAGE_ENTRIES * _CRST_ENTRIES; 202 + 203 + if (KVM_BUG_ON(crstep->h.tt != oldcrste.h.tt || newcrste.h.tt != oldcrste.h.tt, gmap->kvm)) 204 + return true; 203 205 204 206 lockdep_assert_held(&gmap->kvm->mmu_lock); 205 207 if (!needs_lock) 206 208 lockdep_assert_held(&gmap->children_lock); 207 209 208 210 gfn = ALIGN_DOWN(gfn, align); 209 - if (crste_prefix(*crstep) && (ne.h.p || ne.h.i || !crste_prefix(ne))) { 210 - ne.s.fc1.prefix_notif = 0; 211 + if (crste_prefix(oldcrste) && (newcrste.h.p || newcrste.h.i || !crste_prefix(newcrste))) { 212 + newcrste.s.fc1.prefix_notif = 0; 211 213 gmap_unmap_prefix(gmap, gfn, gfn + align); 212 214 } 213 - if (crste_leaf(*crstep) && crstep->s.fc1.vsie_notif && 214 - (ne.h.p || ne.h.i || !ne.s.fc1.vsie_notif)) { 215 - ne.s.fc1.vsie_notif = 0; 215 + if (crste_leaf(oldcrste) && oldcrste.s.fc1.vsie_notif && 216 + (newcrste.h.p || newcrste.h.i || !newcrste.s.fc1.vsie_notif)) { 217 + newcrste.s.fc1.vsie_notif = 0; 216 218 if (needs_lock) 217 219 gmap_handle_vsie_unshadow_event(gmap, gfn); 218 220 else 219 221 _gmap_handle_vsie_unshadow_event(gmap, gfn); 220 222 } 221 - dat_crstep_xchg(crstep, ne, gfn, gmap->asce); 223 + if (!oldcrste.s.fc1.d && newcrste.s.fc1.d && !newcrste.s.fc1.s) 224 + SetPageDirty(phys_to_page(crste_origin_large(newcrste))); 225 + return dat_crstep_xchg_atomic(crstep, oldcrste, newcrste, gfn, gmap->asce); 222 226 } 223 227 224 - static inline void gmap_crstep_xchg(struct gmap *gmap, union crste *crstep, union crste ne, 225 - gfn_t gfn) 228 + static inline bool __must_check gmap_crstep_xchg_atomic(struct gmap *gmap, union crste *crstep, 229 + union crste oldcrste, union crste newcrste, 230 + gfn_t gfn) 226 231 { 227 - return _gmap_crstep_xchg(gmap, crstep, ne, gfn, true); 232 + return _gmap_crstep_xchg_atomic(gmap, crstep, oldcrste, newcrste, gfn, true); 228 233 } 229 234 230 235 /**
+18
arch/s390/kvm/interrupt.c
··· 2724 2724 2725 2725 bit = bit_nr + (addr % PAGE_SIZE) * 8; 2726 2726 2727 + /* kvm_set_routing_entry() should never allow this to happen */ 2728 + WARN_ON_ONCE(bit > (PAGE_SIZE * BITS_PER_BYTE - 1)); 2729 + 2727 2730 return swap ? (bit ^ (BITS_PER_LONG - 1)) : bit; 2728 2731 } 2729 2732 ··· 2827 2824 int rc; 2828 2825 2829 2826 mci.val = mcck_info->mcic; 2827 + 2828 + /* log machine checks being reinjected on all debugs */ 2829 + VCPU_EVENT(vcpu, 2, "guest machine check %lx", mci.val); 2830 + KVM_EVENT(2, "guest machine check %lx", mci.val); 2831 + pr_info("guest machine check pid %d: %lx", current->pid, mci.val); 2832 + 2830 2833 if (mci.sr) 2831 2834 cr14 |= CR14_RECOVERY_SUBMASK; 2832 2835 if (mci.dg) ··· 2861 2852 struct kvm_kernel_irq_routing_entry *e, 2862 2853 const struct kvm_irq_routing_entry *ue) 2863 2854 { 2855 + const struct kvm_irq_routing_s390_adapter *adapter; 2864 2856 u64 uaddr_s, uaddr_i; 2865 2857 int idx; 2866 2858 ··· 2871 2861 if (kvm_is_ucontrol(kvm)) 2872 2862 return -EINVAL; 2873 2863 e->set = set_adapter_int; 2864 + 2865 + adapter = &ue->u.adapter; 2866 + if (adapter->summary_addr + (adapter->summary_offset / 8) >= 2867 + (adapter->summary_addr & PAGE_MASK) + PAGE_SIZE) 2868 + return -EINVAL; 2869 + if (adapter->ind_addr + (adapter->ind_offset / 8) >= 2870 + (adapter->ind_addr & PAGE_MASK) + PAGE_SIZE) 2871 + return -EINVAL; 2874 2872 2875 2873 idx = srcu_read_lock(&kvm->srcu); 2876 2874 uaddr_s = gpa_to_hva(kvm, ue->u.adapter.summary_addr);
+23 -11
arch/s390/kvm/kvm-s390.c
··· 4617 4617 return 0; 4618 4618 } 4619 4619 4620 - static int vcpu_post_run(struct kvm_vcpu *vcpu, int exit_reason) 4620 + static int vcpu_post_run(struct kvm_vcpu *vcpu, int sie_return) 4621 4621 { 4622 4622 struct mcck_volatile_info *mcck_info; 4623 4623 struct sie_page *sie_page; ··· 4633 4633 vcpu->run->s.regs.gprs[14] = vcpu->arch.sie_block->gg14; 4634 4634 vcpu->run->s.regs.gprs[15] = vcpu->arch.sie_block->gg15; 4635 4635 4636 - if (exit_reason == -EINTR) { 4637 - VCPU_EVENT(vcpu, 3, "%s", "machine check"); 4636 + if (sie_return == SIE64_RETURN_MCCK) { 4638 4637 sie_page = container_of(vcpu->arch.sie_block, 4639 4638 struct sie_page, sie_block); 4640 4639 mcck_info = &sie_page->mcck_info; 4641 4640 kvm_s390_reinject_machine_check(vcpu, mcck_info); 4642 4641 return 0; 4643 4642 } 4643 + WARN_ON_ONCE(sie_return != SIE64_RETURN_NORMAL); 4644 4644 4645 4645 if (vcpu->arch.sie_block->icptcode > 0) { 4646 4646 rc = kvm_handle_sie_intercept(vcpu); ··· 4679 4679 #define PSW_INT_MASK (PSW_MASK_EXT | PSW_MASK_IO | PSW_MASK_MCHECK) 4680 4680 static int __vcpu_run(struct kvm_vcpu *vcpu) 4681 4681 { 4682 - int rc, exit_reason; 4682 + int rc, sie_return; 4683 4683 struct sie_page *sie_page = (struct sie_page *)vcpu->arch.sie_block; 4684 4684 4685 4685 /* ··· 4719 4719 guest_timing_enter_irqoff(); 4720 4720 __disable_cpu_timer_accounting(vcpu); 4721 4721 4722 - exit_reason = kvm_s390_enter_exit_sie(vcpu->arch.sie_block, 4723 - vcpu->run->s.regs.gprs, 4724 - vcpu->arch.gmap->asce.val); 4722 + sie_return = kvm_s390_enter_exit_sie(vcpu->arch.sie_block, 4723 + vcpu->run->s.regs.gprs, 4724 + vcpu->arch.gmap->asce.val); 4725 4725 4726 4726 __enable_cpu_timer_accounting(vcpu); 4727 4727 guest_timing_exit_irqoff(); ··· 4744 4744 } 4745 4745 kvm_vcpu_srcu_read_lock(vcpu); 4746 4746 4747 - rc = vcpu_post_run(vcpu, exit_reason); 4747 + rc = vcpu_post_run(vcpu, sie_return); 4748 4748 if (rc || guestdbg_exit_pending(vcpu)) { 4749 4749 kvm_vcpu_srcu_read_unlock(vcpu); 4750 4750 break; ··· 5520 5520 } 5521 5521 #endif 5522 5522 case KVM_S390_VCPU_FAULT: { 5523 - idx = srcu_read_lock(&vcpu->kvm->srcu); 5524 - r = vcpu_dat_fault_handler(vcpu, arg, 0); 5525 - srcu_read_unlock(&vcpu->kvm->srcu, idx); 5523 + gpa_t gaddr = arg; 5524 + 5525 + scoped_guard(srcu, &vcpu->kvm->srcu) { 5526 + r = vcpu_ucontrol_translate(vcpu, &gaddr); 5527 + if (r) 5528 + break; 5529 + 5530 + r = kvm_s390_faultin_gfn_simple(vcpu, NULL, gpa_to_gfn(gaddr), false); 5531 + if (r == PGM_ADDRESSING) 5532 + r = -EFAULT; 5533 + if (r <= 0) 5534 + break; 5535 + r = -EIO; 5536 + KVM_BUG_ON(r, vcpu->kvm); 5537 + } 5526 5538 break; 5527 5539 } 5528 5540 case KVM_ENABLE_CAP:
+8 -4
arch/s390/kvm/vsie.c
··· 1122 1122 { 1123 1123 struct kvm_s390_sie_block *scb_s = &vsie_page->scb_s; 1124 1124 struct kvm_s390_sie_block *scb_o = vsie_page->scb_o; 1125 + unsigned long sie_return = SIE64_RETURN_NORMAL; 1125 1126 int guest_bp_isolation; 1126 1127 int rc = 0; 1127 1128 ··· 1164 1163 goto xfer_to_guest_mode_check; 1165 1164 } 1166 1165 guest_timing_enter_irqoff(); 1167 - rc = kvm_s390_enter_exit_sie(scb_s, vcpu->run->s.regs.gprs, sg->asce.val); 1166 + sie_return = kvm_s390_enter_exit_sie(scb_s, vcpu->run->s.regs.gprs, sg->asce.val); 1168 1167 guest_timing_exit_irqoff(); 1169 1168 local_irq_enable(); 1170 1169 } ··· 1179 1178 1180 1179 kvm_vcpu_srcu_read_lock(vcpu); 1181 1180 1182 - if (rc == -EINTR) { 1183 - VCPU_EVENT(vcpu, 3, "%s", "machine check"); 1181 + if (sie_return == SIE64_RETURN_MCCK) { 1184 1182 kvm_s390_reinject_machine_check(vcpu, &vsie_page->mcck_info); 1185 1183 return 0; 1186 1184 } 1185 + 1186 + WARN_ON_ONCE(sie_return != SIE64_RETURN_NORMAL); 1187 1187 1188 1188 if (rc > 0) 1189 1189 rc = 0; /* we could still have an icpt */ ··· 1328 1326 static int vsie_run(struct kvm_vcpu *vcpu, struct vsie_page *vsie_page) 1329 1327 { 1330 1328 struct kvm_s390_sie_block *scb_s = &vsie_page->scb_s; 1331 - struct gmap *sg; 1329 + struct gmap *sg = NULL; 1332 1330 int rc = 0; 1333 1331 1334 1332 while (1) { ··· 1368 1366 sg = gmap_put(sg); 1369 1367 cond_resched(); 1370 1368 } 1369 + if (sg) 1370 + sg = gmap_put(sg); 1371 1371 1372 1372 if (rc == -EFAULT) { 1373 1373 /*
+9 -2
arch/s390/mm/fault.c
··· 441 441 folio = phys_to_folio(addr); 442 442 if (unlikely(!folio_try_get(folio))) 443 443 return; 444 - rc = arch_make_folio_accessible(folio); 444 + rc = uv_convert_from_secure(folio_to_phys(folio)); 445 + if (!rc) 446 + clear_bit(PG_arch_1, &folio->flags.f); 445 447 folio_put(folio); 448 + /* 449 + * There are some valid fixup types for kernel 450 + * accesses to donated secure memory. zeropad is one 451 + * of them. 452 + */ 446 453 if (rc) 447 - BUG(); 454 + return handle_fault_error_nolock(regs, 0); 448 455 } else { 449 456 if (faulthandler_disabled()) 450 457 return handle_fault_error_nolock(regs, 0);
+6
arch/x86/coco/sev/noinstr.c
··· 121 121 122 122 WARN_ON(!irqs_disabled()); 123 123 124 + if (!sev_cfg.ghcbs_initialized) 125 + return boot_ghcb; 126 + 124 127 data = this_cpu_read(runtime_data); 125 128 ghcb = &data->ghcb_page; 126 129 ··· 166 163 struct ghcb *ghcb; 167 164 168 165 WARN_ON(!irqs_disabled()); 166 + 167 + if (!sev_cfg.ghcbs_initialized) 168 + return; 169 169 170 170 data = this_cpu_read(runtime_data); 171 171 ghcb = &data->ghcb_page;
+14
arch/x86/entry/entry_fred.c
··· 177 177 } 178 178 } 179 179 180 + #ifdef CONFIG_AMD_MEM_ENCRYPT 181 + noinstr void exc_vmm_communication(struct pt_regs *regs, unsigned long error_code) 182 + { 183 + if (user_mode(regs)) 184 + return user_exc_vmm_communication(regs, error_code); 185 + else 186 + return kernel_exc_vmm_communication(regs, error_code); 187 + } 188 + #endif 189 + 180 190 static noinstr void fred_hwexc(struct pt_regs *regs, unsigned long error_code) 181 191 { 182 192 /* Optimize for #PF. That's the only exception which matters performance wise */ ··· 217 207 #ifdef CONFIG_X86_CET 218 208 case X86_TRAP_CP: return exc_control_protection(regs, error_code); 219 209 #endif 210 + #ifdef CONFIG_AMD_MEM_ENCRYPT 211 + case X86_TRAP_VC: return exc_vmm_communication(regs, error_code); 212 + #endif 213 + 220 214 default: return fred_bad_type(regs, error_code); 221 215 } 222 216
+26 -7
arch/x86/kernel/cpu/common.c
··· 433 433 434 434 /* These bits should not change their value after CPU init is finished. */ 435 435 static const unsigned long cr4_pinned_mask = X86_CR4_SMEP | X86_CR4_SMAP | X86_CR4_UMIP | 436 - X86_CR4_FSGSBASE | X86_CR4_CET | X86_CR4_FRED; 436 + X86_CR4_FSGSBASE | X86_CR4_CET; 437 + 438 + /* 439 + * The CR pinning protects against ROP on the 'mov %reg, %CRn' instruction(s). 440 + * Since you can ROP directly to these instructions (barring shadow stack), 441 + * any protection must follow immediately and unconditionally after that. 442 + * 443 + * Specifically, the CR[04] write functions below will have the value 444 + * validation controlled by the @cr_pinning static_branch which is 445 + * __ro_after_init, just like the cr4_pinned_bits value. 446 + * 447 + * Once set, an attacker will have to defeat page-tables to get around these 448 + * restrictions. Which is a much bigger ask than 'simple' ROP. 449 + */ 437 450 static DEFINE_STATIC_KEY_FALSE_RO(cr_pinning); 438 451 static unsigned long cr4_pinned_bits __ro_after_init; 439 452 ··· 2063 2050 setup_umip(c); 2064 2051 setup_lass(c); 2065 2052 2066 - /* Enable FSGSBASE instructions if available. */ 2067 - if (cpu_has(c, X86_FEATURE_FSGSBASE)) { 2068 - cr4_set_bits(X86_CR4_FSGSBASE); 2069 - elf_hwcap2 |= HWCAP2_FSGSBASE; 2070 - } 2071 - 2072 2053 /* 2073 2054 * The vendor-specific functions might have changed features. 2074 2055 * Now we do "generic changes." ··· 2422 2415 2423 2416 /* GHCB needs to be setup to handle #VC. */ 2424 2417 setup_ghcb(); 2418 + 2419 + /* 2420 + * On CPUs with FSGSBASE support, paranoid_entry() uses 2421 + * ALTERNATIVE-patched RDGSBASE/WRGSBASE instructions. Secondary CPUs 2422 + * boot after alternatives are patched globally, so early exceptions 2423 + * execute patched code that depends on FSGSBASE. Enable the feature 2424 + * before any exceptions occur. 2425 + */ 2426 + if (cpu_feature_enabled(X86_FEATURE_FSGSBASE)) { 2427 + cr4_set_bits(X86_CR4_FSGSBASE); 2428 + elf_hwcap2 |= HWCAP2_FSGSBASE; 2429 + } 2425 2430 2426 2431 if (cpu_feature_enabled(X86_FEATURE_FRED)) { 2427 2432 /* The boot CPU has enabled FRED during early boot */
+10 -7
arch/x86/kvm/mmu/mmu.c
··· 3044 3044 bool prefetch = !fault || fault->prefetch; 3045 3045 bool write_fault = fault && fault->write; 3046 3046 3047 - if (unlikely(is_noslot_pfn(pfn))) { 3048 - vcpu->stat.pf_mmio_spte_created++; 3049 - mark_mmio_spte(vcpu, sptep, gfn, pte_access); 3050 - return RET_PF_EMULATE; 3051 - } 3052 - 3053 3047 if (is_shadow_present_pte(*sptep)) { 3054 3048 if (prefetch && is_last_spte(*sptep, level) && 3055 3049 pfn == spte_to_pfn(*sptep)) ··· 3060 3066 child = spte_to_child_sp(pte); 3061 3067 drop_parent_pte(vcpu->kvm, child, sptep); 3062 3068 flush = true; 3063 - } else if (WARN_ON_ONCE(pfn != spte_to_pfn(*sptep))) { 3069 + } else if (pfn != spte_to_pfn(*sptep)) { 3070 + WARN_ON_ONCE(vcpu->arch.mmu->root_role.direct); 3064 3071 drop_spte(vcpu->kvm, sptep); 3065 3072 flush = true; 3066 3073 } else 3067 3074 was_rmapped = 1; 3075 + } 3076 + 3077 + if (unlikely(is_noslot_pfn(pfn))) { 3078 + vcpu->stat.pf_mmio_spte_created++; 3079 + mark_mmio_spte(vcpu, sptep, gfn, pte_access); 3080 + if (flush) 3081 + kvm_flush_remote_tlbs_gfn(vcpu->kvm, gfn, level); 3082 + return RET_PF_EMULATE; 3068 3083 } 3069 3084 3070 3085 wrprot = make_spte(vcpu, sp, slot, pte_access, gfn, pfn, *sptep, prefetch,
+1 -1
arch/x86/platform/efi/quirks.c
··· 424 424 if (efi_enabled(EFI_DBG)) 425 425 return; 426 426 427 - sz = sizeof(*ranges_to_free) * efi.memmap.nr_map + 1; 427 + sz = sizeof(*ranges_to_free) * (efi.memmap.nr_map + 1); 428 428 ranges_to_free = kzalloc(sz, GFP_KERNEL); 429 429 if (!ranges_to_free) { 430 430 pr_err("Failed to allocate storage for freeable EFI regions\n");
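The quirks.c fix above is pure operator precedence: multiplication binds tighter than addition, so the old expression sized the buffer for nr_map entries plus a single byte rather than nr_map + 1 whole entries. A minimal userspace sketch of the difference (8-byte entries assumed; not the kernel code):

#include <stdio.h>

int main(void)
{
        unsigned long long *ranges = NULL;      /* only sizeof(*ranges) is used */
        size_t nr_map = 4;

        printf("%zu\n", sizeof(*ranges) * nr_map + 1);    /* 33: 4 entries + 1 byte */
        printf("%zu\n", sizeof(*ranges) * (nr_map + 1));  /* 40: 5 whole entries    */
        return 0;
}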
+1
drivers/accel/ivpu/ivpu_drv.h
··· 35 35 #define IVPU_HW_IP_60XX 60 36 36 37 37 #define IVPU_HW_IP_REV_LNL_B0 4 38 + #define IVPU_HW_IP_REV_NVL_A0 0 38 39 39 40 #define IVPU_HW_BTRS_MTL 1 40 41 #define IVPU_HW_BTRS_LNL 2
+4 -2
drivers/accel/ivpu/ivpu_hw.c
··· 70 70 if (ivpu_hw_btrs_gen(vdev) == IVPU_HW_BTRS_MTL) 71 71 vdev->wa.interrupt_clear_with_0 = ivpu_hw_btrs_irqs_clear_with_0_mtl(vdev); 72 72 73 - if (ivpu_device_id(vdev) == PCI_DEVICE_ID_LNL && 74 - ivpu_revision(vdev) < IVPU_HW_IP_REV_LNL_B0) 73 + if ((ivpu_device_id(vdev) == PCI_DEVICE_ID_LNL && 74 + ivpu_revision(vdev) < IVPU_HW_IP_REV_LNL_B0) || 75 + (ivpu_device_id(vdev) == PCI_DEVICE_ID_NVL && 76 + ivpu_revision(vdev) == IVPU_HW_IP_REV_NVL_A0)) 75 77 vdev->wa.disable_clock_relinquish = true; 76 78 77 79 if (ivpu_test_mode & IVPU_TEST_MODE_CLK_RELINQ_ENABLE)
+2
drivers/acpi/ec.c
··· 1656 1656 1657 1657 ret = ec_install_handlers(ec, device, call_reg); 1658 1658 if (ret) { 1659 + ec_remove_handlers(ec); 1660 + 1659 1661 if (ec == first_ec) 1660 1662 first_ec = NULL; 1661 1663
+26 -4
drivers/base/regmap/regmap.c
··· 1545 1545 unsigned int val_num) 1546 1546 { 1547 1547 void *orig_work_buf; 1548 + unsigned int selector_reg; 1548 1549 unsigned int win_offset; 1549 1550 unsigned int win_page; 1550 1551 bool page_chg; ··· 1564 1563 return -EINVAL; 1565 1564 } 1566 1565 1567 - /* It is possible to have selector register inside data window. 1568 - In that case, selector register is located on every page and 1569 - it needs no page switching, when accessed alone. */ 1566 + /* 1567 + * Calculate the address of the selector register in the corresponding 1568 + * data window if it is located on every page. 1569 + */ 1570 + page_chg = in_range(range->selector_reg, range->window_start, range->window_len); 1571 + if (page_chg) 1572 + selector_reg = range->range_min + win_page * range->window_len + 1573 + range->selector_reg - range->window_start; 1574 + 1575 + /* 1576 + * It is possible to have selector register inside data window. 1577 + * In that case, selector register is located on every page and it 1578 + * needs no page switching, when accessed alone. 1579 + * 1580 + * Nevertheless we should synchronize the cache values for it. 1581 + * This can't be properly achieved if the selector register is 1582 + * the first and the only one to be read inside the data window. 1583 + * That's why we update it in that case as well. 1584 + * 1585 + * However, we specifically avoid updating it for the default page, 1586 + * when it's overlapped with the real data window, to prevent from 1587 + * infinite looping. 1588 + */ 1570 1589 if (val_num > 1 || 1590 + (page_chg && selector_reg != range->selector_reg) || 1571 1591 range->window_start + win_offset != range->selector_reg) { 1572 1592 /* Use separate work_buf during page switching */ 1573 1593 orig_work_buf = map->work_buf; ··· 1597 1575 ret = _regmap_update_bits(map, range->selector_reg, 1598 1576 range->selector_mask, 1599 1577 win_page << range->selector_shift, 1600 - &page_chg, false); 1578 + NULL, false); 1601 1579 1602 1580 map->work_buf = orig_work_buf; 1603 1581
+14 -25
drivers/block/zram/zram_drv.c
··· 917 917 918 918 static int zram_writeback_complete(struct zram *zram, struct zram_wb_req *req) 919 919 { 920 - u32 size, index = req->pps->index; 921 - int err, prio; 922 - bool huge; 920 + u32 index = req->pps->index; 921 + int err; 923 922 924 923 err = blk_status_to_errno(req->bio.bi_status); 925 924 if (err) { ··· 945 946 goto out; 946 947 } 947 948 948 - if (zram->compressed_wb) { 949 - /* 950 - * ZRAM_WB slots get freed, we need to preserve data required 951 - * for read decompression. 952 - */ 953 - size = get_slot_size(zram, index); 954 - prio = get_slot_comp_priority(zram, index); 955 - huge = test_slot_flag(zram, index, ZRAM_HUGE); 956 - } 957 - 958 - slot_free(zram, index); 959 - set_slot_flag(zram, index, ZRAM_WB); 949 + clear_slot_flag(zram, index, ZRAM_IDLE); 950 + if (test_slot_flag(zram, index, ZRAM_HUGE)) 951 + atomic64_dec(&zram->stats.huge_pages); 952 + atomic64_sub(get_slot_size(zram, index), &zram->stats.compr_data_size); 953 + zs_free(zram->mem_pool, get_slot_handle(zram, index)); 960 954 set_slot_handle(zram, index, req->blk_idx); 961 - 962 - if (zram->compressed_wb) { 963 - if (huge) 964 - set_slot_flag(zram, index, ZRAM_HUGE); 965 - set_slot_size(zram, index, size); 966 - set_slot_comp_priority(zram, index, prio); 967 - } 968 - 969 - atomic64_inc(&zram->stats.pages_stored); 955 + set_slot_flag(zram, index, ZRAM_WB); 970 956 971 957 out: 972 958 slot_unlock(zram, index); ··· 1994 2010 set_slot_comp_priority(zram, index, 0); 1995 2011 1996 2012 if (test_slot_flag(zram, index, ZRAM_HUGE)) { 2013 + /* 2014 + * Writeback completion decrements ->huge_pages but keeps 2015 + * ZRAM_HUGE flag for deferred decompression path. 2016 + */ 2017 + if (!test_slot_flag(zram, index, ZRAM_WB)) 2018 + atomic64_dec(&zram->stats.huge_pages); 1997 2019 clear_slot_flag(zram, index, ZRAM_HUGE); 1998 - atomic64_dec(&zram->stats.huge_pages); 1999 2020 } 2000 2021 2001 2022 if (test_slot_flag(zram, index, ZRAM_WB)) {
+8 -3
drivers/bluetooth/btintel.c
··· 251 251 252 252 bt_dev_err(hdev, "Hardware error 0x%2.2x", code); 253 253 254 + hci_req_sync_lock(hdev); 255 + 254 256 skb = __hci_cmd_sync(hdev, HCI_OP_RESET, 0, NULL, HCI_INIT_TIMEOUT); 255 257 if (IS_ERR(skb)) { 256 258 bt_dev_err(hdev, "Reset after hardware error failed (%ld)", 257 259 PTR_ERR(skb)); 258 - return; 260 + goto unlock; 259 261 } 260 262 kfree_skb(skb); 261 263 ··· 265 263 if (IS_ERR(skb)) { 266 264 bt_dev_err(hdev, "Retrieving Intel exception info failed (%ld)", 267 265 PTR_ERR(skb)); 268 - return; 266 + goto unlock; 269 267 } 270 268 271 269 if (skb->len != 13) { 272 270 bt_dev_err(hdev, "Exception info size mismatch"); 273 271 kfree_skb(skb); 274 - return; 272 + goto unlock; 275 273 } 276 274 277 275 bt_dev_err(hdev, "Exception info %s", (char *)(skb->data + 1)); 278 276 279 277 kfree_skb(skb); 278 + 279 + unlock: 280 + hci_req_sync_unlock(hdev); 280 281 } 281 282 EXPORT_SYMBOL_GPL(btintel_hw_error); 282 283
+4 -1
drivers/bluetooth/btusb.c
··· 2376 2376 if (data->air_mode == HCI_NOTIFY_ENABLE_SCO_CVSD) { 2377 2377 if (hdev->voice_setting & 0x0020) { 2378 2378 static const int alts[3] = { 2, 4, 5 }; 2379 + unsigned int sco_idx; 2379 2380 2380 - new_alts = alts[data->sco_num - 1]; 2381 + sco_idx = min_t(unsigned int, data->sco_num - 1, 2382 + ARRAY_SIZE(alts) - 1); 2383 + new_alts = alts[sco_idx]; 2381 2384 } else { 2382 2385 new_alts = data->sco_num; 2383 2386 }
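The btusb.c change above bounds a device-derived count before using it as an index into a three-entry table. A hypothetical userspace sketch of the same clamp (MIN, ARRAY_LEN and the table contents are illustrative):

#include <stdio.h>

#define ARRAY_LEN(a)    (sizeof(a) / sizeof((a)[0]))
#define MIN(a, b)       ((a) < (b) ? (a) : (b))

static const int alts[3] = { 2, 4, 5 };

static int pick_alt(unsigned int sco_num)
{
        /* sco_num is device-controlled; 0 underflows and is clamped as well */
        unsigned int idx = MIN(sco_num - 1, ARRAY_LEN(alts) - 1);

        return alts[idx];
}

int main(void)
{
        printf("%d %d %d\n", pick_alt(1), pick_alt(3), pick_alt(7));    /* 2 5 5 */
        return 0;
}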
+2
drivers/bluetooth/hci_ll.c
··· 541 541 if (err || !fw->data || !fw->size) { 542 542 bt_dev_err(lldev->hu.hdev, "request_firmware failed(errno %d) for %s", 543 543 err, bts_scr_name); 544 + if (!err) 545 + release_firmware(fw); 544 546 return -EINVAL; 545 547 } 546 548 ptr = (void *)fw->data;
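The hci_ll.c fix above closes a leak on the "request succeeded but the image is unusable" path: when request_firmware() returns 0 yet the blob has no data or zero size, the early -EINVAL return previously skipped release_firmware(). A self-contained sketch of the pattern with malloc/free standing in for the firmware API (all names below are illustrative):

#include <errno.h>
#include <stdlib.h>

struct blob {
        size_t size;
        char *data;
};

/* Stand-in for request_firmware(): can "succeed" with an empty payload. */
static int blob_load(struct blob **out, size_t size)
{
        struct blob *b = calloc(1, sizeof(*b));

        if (!b)
                return -ENOMEM;
        b->size = size;
        b->data = size ? calloc(1, size) : NULL;
        *out = b;
        return 0;
}

static void blob_release(struct blob *b)
{
        if (b) {
                free(b->data);
                free(b);
        }
}

static int blob_use(size_t size)
{
        struct blob *b;
        int err = blob_load(&b, size);

        if (err)
                return err;             /* nothing acquired, nothing to free */
        if (!b->data || !b->size) {
                blob_release(b);        /* acquired but unusable: still release */
                return -EINVAL;
        }
        /* ... consume b->data ... */
        blob_release(b);
        return 0;
}

int main(void)
{
        return blob_use(0) == -EINVAL ? 0 : 1;
}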
+3 -6
drivers/cpufreq/cpufreq.c
··· 1427 1427 * If there is a problem with its frequency table, take it 1428 1428 * offline and drop it. 1429 1429 */ 1430 - if (policy->freq_table_sorted != CPUFREQ_TABLE_SORTED_ASCENDING && 1431 - policy->freq_table_sorted != CPUFREQ_TABLE_SORTED_DESCENDING) { 1432 - ret = cpufreq_table_validate_and_sort(policy); 1433 - if (ret) 1434 - goto out_offline_policy; 1435 - } 1430 + ret = cpufreq_table_validate_and_sort(policy); 1431 + if (ret) 1432 + goto out_offline_policy; 1436 1433 1437 1434 /* related_cpus should at least include policy->cpus. */ 1438 1435 cpumask_copy(policy->related_cpus, policy->cpus);
+12
drivers/cpufreq/cpufreq_conservative.c
··· 313 313 dbs_info->requested_freq = policy->cur; 314 314 } 315 315 316 + static void cs_limits(struct cpufreq_policy *policy) 317 + { 318 + struct cs_policy_dbs_info *dbs_info = to_dbs_info(policy->governor_data); 319 + 320 + /* 321 + * The limits have changed, so may have the current frequency. Reset 322 + * requested_freq to avoid any unintended outcomes due to the mismatch. 323 + */ 324 + dbs_info->requested_freq = policy->cur; 325 + } 326 + 316 327 static struct dbs_governor cs_governor = { 317 328 .gov = CPUFREQ_DBS_GOVERNOR_INITIALIZER("conservative"), 318 329 .kobj_type = { .default_groups = cs_groups }, ··· 333 322 .init = cs_init, 334 323 .exit = cs_exit, 335 324 .start = cs_start, 325 + .limits = cs_limits, 336 326 }; 337 327 338 328 #define CPU_FREQ_GOV_CONSERVATIVE (cs_governor.gov)
+3
drivers/cpufreq/cpufreq_governor.c
··· 563 563 564 564 void cpufreq_dbs_governor_limits(struct cpufreq_policy *policy) 565 565 { 566 + struct dbs_governor *gov = dbs_governor_of(policy); 566 567 struct policy_dbs_info *policy_dbs; 567 568 568 569 /* Protect gov->gdbs_data against cpufreq_dbs_governor_exit() */ ··· 575 574 mutex_lock(&policy_dbs->update_mutex); 576 575 cpufreq_policy_apply_limits(policy); 577 576 gov_update_sample_delay(policy_dbs, 0); 577 + if (gov->limits) 578 + gov->limits(policy); 578 579 mutex_unlock(&policy_dbs->update_mutex); 579 580 580 581 out:
+1
drivers/cpufreq/cpufreq_governor.h
··· 138 138 int (*init)(struct dbs_data *dbs_data); 139 139 void (*exit)(struct dbs_data *dbs_data); 140 140 void (*start)(struct cpufreq_policy *policy); 141 + void (*limits)(struct cpufreq_policy *policy); 141 142 }; 142 143 143 144 static inline struct dbs_governor *dbs_governor_of(struct cpufreq_policy *policy)
+4
drivers/cpufreq/freq_table.c
··· 360 360 if (policy_has_boost_freq(policy)) 361 361 policy->boost_supported = true; 362 362 363 + if (policy->freq_table_sorted == CPUFREQ_TABLE_SORTED_ASCENDING || 364 + policy->freq_table_sorted == CPUFREQ_TABLE_SORTED_DESCENDING) 365 + return 0; 366 + 363 367 return set_freq_table_sorted(policy); 364 368 } 365 369
+1
drivers/cxl/Kconfig
··· 59 59 tristate "CXL ACPI: Platform Support" 60 60 depends on ACPI 61 61 depends on ACPI_NUMA 62 + depends on CXL_PMEM || !CXL_PMEM 62 63 default CXL_BUS 63 64 select ACPI_TABLE_LIB 64 65 select ACPI_HMAT
+9 -16
drivers/cxl/core/hdm.c
··· 94 94 struct cxl_hdm *cxlhdm; 95 95 void __iomem *hdm; 96 96 u32 ctrl; 97 - int i; 98 97 99 98 if (!info) 100 99 return false; ··· 112 113 return false; 113 114 114 115 /* 115 - * If any decoders are committed already, there should not be any 116 - * emulated DVSEC decoders. 116 + * If HDM decoders are globally enabled, do not fall back to DVSEC 117 + * range emulation. Zeroed decoder registers after region teardown 118 + * do not imply absence of HDM capability. 119 + * 120 + * Falling back to DVSEC here would treat the decoder as AUTO and 121 + * may incorrectly latch default interleave settings. 117 122 */ 118 - for (i = 0; i < cxlhdm->decoder_count; i++) { 119 - ctrl = readl(hdm + CXL_HDM_DECODER0_CTRL_OFFSET(i)); 120 - dev_dbg(&info->port->dev, 121 - "decoder%d.%d: committed: %ld base: %#x_%.8x size: %#x_%.8x\n", 122 - info->port->id, i, 123 - FIELD_GET(CXL_HDM_DECODER0_CTRL_COMMITTED, ctrl), 124 - readl(hdm + CXL_HDM_DECODER0_BASE_HIGH_OFFSET(i)), 125 - readl(hdm + CXL_HDM_DECODER0_BASE_LOW_OFFSET(i)), 126 - readl(hdm + CXL_HDM_DECODER0_SIZE_HIGH_OFFSET(i)), 127 - readl(hdm + CXL_HDM_DECODER0_SIZE_LOW_OFFSET(i))); 128 - if (FIELD_GET(CXL_HDM_DECODER0_CTRL_COMMITTED, ctrl)) 129 - return false; 130 - } 123 + ctrl = readl(hdm + CXL_HDM_DECODER_CTRL_OFFSET); 124 + if (ctrl & CXL_HDM_DECODER_ENABLE) 125 + return false; 131 126 132 127 return true; 133 128 }
+1 -1
drivers/cxl/core/mbox.c
··· 1301 1301 * Require an endpoint to be safe otherwise the driver can not 1302 1302 * be sure that the device is unmapped. 1303 1303 */ 1304 - if (endpoint && cxl_num_decoders_committed(endpoint) == 0) 1304 + if (cxlmd->dev.driver && cxl_num_decoders_committed(endpoint) == 0) 1305 1305 return __cxl_mem_sanitize(mds, cmd); 1306 1306 1307 1307 return -EBUSY;
+6 -2
drivers/cxl/core/port.c
··· 552 552 xa_destroy(&port->dports); 553 553 xa_destroy(&port->regions); 554 554 ida_free(&cxl_port_ida, port->id); 555 - if (is_cxl_root(port)) 555 + 556 + if (is_cxl_root(port)) { 556 557 kfree(to_cxl_root(port)); 557 - else 558 + } else { 559 + put_device(dev->parent); 558 560 kfree(port); 561 + } 559 562 } 560 563 561 564 static ssize_t decoders_committed_show(struct device *dev, ··· 710 707 struct cxl_port *iter; 711 708 712 709 dev->parent = &parent_port->dev; 710 + get_device(dev->parent); 713 711 port->depth = parent_port->depth + 1; 714 712 port->parent_dport = parent_dport; 715 713
+3 -1
drivers/cxl/core/region.c
··· 3854 3854 } 3855 3855 3856 3856 rc = sysfs_update_group(&cxlr->dev.kobj, &cxl_region_group); 3857 - if (rc) 3857 + if (rc) { 3858 + kfree(res); 3858 3859 return rc; 3860 + } 3859 3861 3860 3862 rc = insert_resource(cxlrd->res, res); 3861 3863 if (rc) {
+1 -1
drivers/cxl/pmem.c
··· 554 554 555 555 MODULE_DESCRIPTION("CXL PMEM: Persistent Memory Support"); 556 556 MODULE_LICENSE("GPL v2"); 557 - module_init(cxl_pmem_init); 557 + subsys_initcall(cxl_pmem_init); 558 558 module_exit(cxl_pmem_exit); 559 559 MODULE_IMPORT_NS("CXL"); 560 560 MODULE_ALIAS_CXL(CXL_DEVICE_NVDIMM_BRIDGE);
+6 -2
drivers/dma/dw-edma/dw-edma-core.c
··· 844 844 { 845 845 struct dw_edma_chip *chip = dw->chip; 846 846 struct device *dev = dw->chip->dev; 847 + struct msi_desc *msi_desc; 847 848 u32 wr_mask = 1; 848 849 u32 rd_mask = 1; 849 850 int i, err = 0; ··· 896 895 &dw->irq[i]); 897 896 if (err) 898 897 goto err_irq_free; 899 - 900 - if (irq_get_msi_desc(irq)) 898 + msi_desc = irq_get_msi_desc(irq); 899 + if (msi_desc) { 901 900 get_cached_msi_msg(irq, &dw->irq[i].msi); 901 + if (!msi_desc->pci.msi_attrib.is_msix) 902 + dw->irq[i].msi.data = dw->irq[0].msi.data + i; 903 + } 902 904 } 903 905 904 906 dw->nr_irqs = i;
+3 -3
drivers/dma/dw-edma/dw-hdma-v0-core.c
··· 252 252 lower_32_bits(chunk->ll_region.paddr)); 253 253 SET_CH_32(dw, chan->dir, chan->id, llp.msb, 254 254 upper_32_bits(chunk->ll_region.paddr)); 255 + /* Set consumer cycle */ 256 + SET_CH_32(dw, chan->dir, chan->id, cycle_sync, 257 + HDMA_V0_CONSUMER_CYCLE_STAT | HDMA_V0_CONSUMER_CYCLE_BIT); 255 258 } 256 - /* Set consumer cycle */ 257 - SET_CH_32(dw, chan->dir, chan->id, cycle_sync, 258 - HDMA_V0_CONSUMER_CYCLE_STAT | HDMA_V0_CONSUMER_CYCLE_BIT); 259 259 260 260 dw_hdma_v0_sync_ll_data(chunk); 261 261
+11 -15
drivers/dma/fsl-edma-main.c
··· 317 317 return NULL; 318 318 i = fsl_chan - fsl_edma->chans; 319 319 320 - fsl_chan->priority = dma_spec->args[1]; 321 - fsl_chan->is_rxchan = dma_spec->args[2] & FSL_EDMA_RX; 322 - fsl_chan->is_remote = dma_spec->args[2] & FSL_EDMA_REMOTE; 323 - fsl_chan->is_multi_fifo = dma_spec->args[2] & FSL_EDMA_MULTI_FIFO; 320 + if (!b_chmux && i != dma_spec->args[0]) 321 + continue; 324 322 325 323 if ((dma_spec->args[2] & FSL_EDMA_EVEN_CH) && (i & 0x1)) 326 324 continue; ··· 326 328 if ((dma_spec->args[2] & FSL_EDMA_ODD_CH) && !(i & 0x1)) 327 329 continue; 328 330 329 - if (!b_chmux && i == dma_spec->args[0]) { 330 - chan = dma_get_slave_channel(chan); 331 - chan->device->privatecnt++; 332 - return chan; 333 - } else if (b_chmux && !fsl_chan->srcid) { 334 - /* if controller support channel mux, choose a free channel */ 335 - chan = dma_get_slave_channel(chan); 336 - chan->device->privatecnt++; 337 - fsl_chan->srcid = dma_spec->args[0]; 338 - return chan; 339 - } 331 + fsl_chan->srcid = dma_spec->args[0]; 332 + fsl_chan->priority = dma_spec->args[1]; 333 + fsl_chan->is_rxchan = dma_spec->args[2] & FSL_EDMA_RX; 334 + fsl_chan->is_remote = dma_spec->args[2] & FSL_EDMA_REMOTE; 335 + fsl_chan->is_multi_fifo = dma_spec->args[2] & FSL_EDMA_MULTI_FIFO; 336 + 337 + chan = dma_get_slave_channel(chan); 338 + chan->device->privatecnt++; 339 + return chan; 340 340 } 341 341 return NULL; 342 342 }
+4 -4
drivers/dma/idxd/cdev.c
··· 158 158 static void idxd_cdev_dev_release(struct device *dev) 159 159 { 160 160 struct idxd_cdev *idxd_cdev = dev_to_cdev(dev); 161 - struct idxd_cdev_context *cdev_ctx; 162 - struct idxd_wq *wq = idxd_cdev->wq; 163 161 164 - cdev_ctx = &ictx[wq->idxd->data->type]; 165 - ida_free(&cdev_ctx->minor_ida, idxd_cdev->minor); 166 162 kfree(idxd_cdev); 167 163 } 168 164 ··· 578 582 579 583 void idxd_wq_del_cdev(struct idxd_wq *wq) 580 584 { 585 + struct idxd_cdev_context *cdev_ctx; 581 586 struct idxd_cdev *idxd_cdev; 582 587 583 588 idxd_cdev = wq->idxd_cdev; 584 589 wq->idxd_cdev = NULL; 585 590 cdev_device_del(&idxd_cdev->cdev, cdev_dev(idxd_cdev)); 591 + 592 + cdev_ctx = &ictx[wq->idxd->data->type]; 593 + ida_free(&cdev_ctx->minor_ida, idxd_cdev->minor); 586 594 put_device(cdev_dev(idxd_cdev)); 587 595 } 588 596
+31 -14
drivers/dma/idxd/device.c
··· 175 175 free_descs(wq); 176 176 dma_free_coherent(dev, wq->compls_size, wq->compls, wq->compls_addr); 177 177 sbitmap_queue_free(&wq->sbq); 178 + wq->type = IDXD_WQT_NONE; 178 179 } 179 180 EXPORT_SYMBOL_NS_GPL(idxd_wq_free_resources, "IDXD"); 180 181 ··· 383 382 lockdep_assert_held(&wq->wq_lock); 384 383 wq->state = IDXD_WQ_DISABLED; 385 384 memset(wq->wqcfg, 0, idxd->wqcfg_size); 386 - wq->type = IDXD_WQT_NONE; 387 385 wq->threshold = 0; 388 386 wq->priority = 0; 389 387 wq->enqcmds_retries = IDXD_ENQCMDS_RETRIES; ··· 831 831 struct device *dev = &idxd->pdev->dev; 832 832 struct idxd_evl *evl = idxd->evl; 833 833 834 - gencfg.bits = ioread32(idxd->reg_base + IDXD_GENCFG_OFFSET); 835 - if (!gencfg.evl_en) 834 + if (!evl) 836 835 return; 837 836 838 837 mutex_lock(&evl->lock); ··· 1124 1125 { 1125 1126 int rc; 1126 1127 1127 - lockdep_assert_held(&idxd->dev_lock); 1128 + guard(spinlock)(&idxd->dev_lock); 1129 + 1130 + if (!test_bit(IDXD_FLAG_CONFIGURABLE, &idxd->flags)) 1131 + return 0; 1132 + 1128 1133 rc = idxd_wqs_setup(idxd); 1129 1134 if (rc < 0) 1130 1135 return rc; ··· 1335 1332 1336 1333 free_irq(ie->vector, ie); 1337 1334 idxd_flush_pending_descs(ie); 1335 + 1336 + /* The interrupt might have been already released by FLR */ 1337 + if (ie->int_handle == INVALID_INT_HANDLE) 1338 + return; 1339 + 1338 1340 if (idxd->request_int_handles) 1339 1341 idxd_device_release_int_handle(idxd, ie->int_handle, IDXD_IRQ_MSIX); 1340 1342 idxd_device_clear_perm_entry(idxd, ie); 1341 1343 ie->vector = -1; 1342 1344 ie->int_handle = INVALID_INT_HANDLE; 1343 1345 ie->pasid = IOMMU_PASID_INVALID; 1346 + } 1347 + 1348 + void idxd_wq_flush_descs(struct idxd_wq *wq) 1349 + { 1350 + struct idxd_irq_entry *ie = &wq->ie; 1351 + struct idxd_device *idxd = wq->idxd; 1352 + 1353 + guard(mutex)(&wq->wq_lock); 1354 + 1355 + if (wq->state != IDXD_WQ_ENABLED || wq->type != IDXD_WQT_KERNEL) 1356 + return; 1357 + 1358 + idxd_flush_pending_descs(ie); 1359 + if (idxd->request_int_handles) 1360 + idxd_device_release_int_handle(idxd, ie->int_handle, IDXD_IRQ_MSIX); 1361 + idxd_device_clear_perm_entry(idxd, ie); 1362 + ie->int_handle = INVALID_INT_HANDLE; 1344 1363 } 1345 1364 1346 1365 int idxd_wq_request_irq(struct idxd_wq *wq) ··· 1479 1454 } 1480 1455 } 1481 1456 1482 - rc = 0; 1483 - spin_lock(&idxd->dev_lock); 1484 - if (test_bit(IDXD_FLAG_CONFIGURABLE, &idxd->flags)) 1485 - rc = idxd_device_config(idxd); 1486 - spin_unlock(&idxd->dev_lock); 1457 + rc = idxd_device_config(idxd); 1487 1458 if (rc < 0) { 1488 1459 dev_dbg(dev, "Writing wq %d config failed: %d\n", wq->id, rc); 1489 1460 goto err; ··· 1554 1533 idxd_wq_reset(wq); 1555 1534 idxd_wq_free_resources(wq); 1556 1535 percpu_ref_exit(&wq->wq_active); 1557 - wq->type = IDXD_WQT_NONE; 1558 1536 wq->client_count = 0; 1559 1537 } 1560 1538 EXPORT_SYMBOL_NS_GPL(idxd_drv_disable_wq, "IDXD"); ··· 1574 1554 } 1575 1555 1576 1556 /* Device configuration */ 1577 - spin_lock(&idxd->dev_lock); 1578 - if (test_bit(IDXD_FLAG_CONFIGURABLE, &idxd->flags)) 1579 - rc = idxd_device_config(idxd); 1580 - spin_unlock(&idxd->dev_lock); 1557 + rc = idxd_device_config(idxd); 1581 1558 if (rc < 0) 1582 1559 return -ENXIO; 1583 1560
+18
drivers/dma/idxd/dma.c
··· 194 194 kfree(idxd_dma); 195 195 } 196 196 197 + static int idxd_dma_terminate_all(struct dma_chan *c) 198 + { 199 + struct idxd_wq *wq = to_idxd_wq(c); 200 + 201 + idxd_wq_flush_descs(wq); 202 + 203 + return 0; 204 + } 205 + 206 + static void idxd_dma_synchronize(struct dma_chan *c) 207 + { 208 + struct idxd_wq *wq = to_idxd_wq(c); 209 + 210 + idxd_wq_drain(wq); 211 + } 212 + 197 213 int idxd_register_dma_device(struct idxd_device *idxd) 198 214 { 199 215 struct idxd_dma_dev *idxd_dma; ··· 240 224 dma->device_issue_pending = idxd_dma_issue_pending; 241 225 dma->device_alloc_chan_resources = idxd_dma_alloc_chan_resources; 242 226 dma->device_free_chan_resources = idxd_dma_free_chan_resources; 227 + dma->device_terminate_all = idxd_dma_terminate_all; 228 + dma->device_synchronize = idxd_dma_synchronize; 243 229 244 230 rc = dma_async_device_register(dma); 245 231 if (rc < 0) {
+1
drivers/dma/idxd/idxd.h
··· 803 803 int idxd_wq_init_percpu_ref(struct idxd_wq *wq); 804 804 void idxd_wq_free_irq(struct idxd_wq *wq); 805 805 int idxd_wq_request_irq(struct idxd_wq *wq); 806 + void idxd_wq_flush_descs(struct idxd_wq *wq); 806 807 807 808 /* submission */ 808 809 int idxd_submit_desc(struct idxd_wq *wq, struct idxd_desc *desc);
+7 -7
drivers/dma/idxd/init.c
··· 973 973 974 974 idxd->rdbuf_limit = idxd_saved->saved_idxd.rdbuf_limit; 975 975 976 - idxd->evl->size = saved_evl->size; 976 + if (idxd->evl) 977 + idxd->evl->size = saved_evl->size; 977 978 978 979 for (i = 0; i < idxd->max_groups; i++) { 979 980 struct idxd_group *saved_group, *group; ··· 1105 1104 idxd_device_config_restore(idxd, idxd->idxd_saved); 1106 1105 1107 1106 /* Re-configure IDXD device if allowed. */ 1108 - if (test_bit(IDXD_FLAG_CONFIGURABLE, &idxd->flags)) { 1109 - rc = idxd_device_config(idxd); 1110 - if (rc < 0) { 1111 - dev_err(dev, "HALT: %s config fails\n", idxd_name); 1112 - goto out; 1113 - } 1107 + rc = idxd_device_config(idxd); 1108 + if (rc < 0) { 1109 + dev_err(dev, "HALT: %s config fails\n", idxd_name); 1110 + goto out; 1114 1111 } 1115 1112 1116 1113 /* Bind IDXD device to driver. */ ··· 1146 1147 } 1147 1148 out: 1148 1149 kfree(idxd->idxd_saved); 1150 + idxd->idxd_saved = NULL; 1149 1151 } 1150 1152 1151 1153 static const struct pci_error_handlers idxd_error_handler = {
+16
drivers/dma/idxd/irq.c
··· 397 397 dev_err(&idxd->pdev->dev, "FLR failed\n"); 398 398 } 399 399 400 + static void idxd_wqs_flush_descs(struct idxd_device *idxd) 401 + { 402 + int i; 403 + 404 + for (i = 0; i < idxd->max_wqs; i++) { 405 + struct idxd_wq *wq = idxd->wqs[i]; 406 + 407 + idxd_wq_flush_descs(wq); 408 + } 409 + } 410 + 400 411 static irqreturn_t idxd_halt(struct idxd_device *idxd) 401 412 { 402 413 union gensts_reg gensts; ··· 426 415 } else if (gensts.reset_type == IDXD_DEVICE_RESET_FLR) { 427 416 idxd->state = IDXD_DEV_HALTED; 428 417 idxd_mask_error_interrupts(idxd); 418 + /* Flush all pending descriptors, and disable 419 + * interrupts, they will be re-enabled when FLR 420 + * concludes. 421 + */ 422 + idxd_wqs_flush_descs(idxd); 429 423 dev_dbg(&idxd->pdev->dev, 430 424 "idxd halted, doing FLR. After FLR, configs are restored\n"); 431 425 INIT_WORK(&idxd->work, idxd_device_flr);
+1 -1
drivers/dma/idxd/submit.c
··· 138 138 */ 139 139 list_for_each_entry_safe(d, t, &flist, list) { 140 140 list_del_init(&d->list); 141 - idxd_dma_complete_txd(found, IDXD_COMPLETE_ABORT, true, 141 + idxd_dma_complete_txd(d, IDXD_COMPLETE_ABORT, true, 142 142 NULL, NULL); 143 143 } 144 144 }
+1
drivers/dma/idxd/sysfs.c
··· 1836 1836 { 1837 1837 struct idxd_device *idxd = confdev_to_idxd(dev); 1838 1838 1839 + destroy_workqueue(idxd->wq); 1839 1840 kfree(idxd->groups); 1840 1841 bitmap_free(idxd->wq_enable_map); 1841 1842 kfree(idxd->wqs);
+37 -31
drivers/dma/sh/rz-dmac.c
··· 10 10 */ 11 11 12 12 #include <linux/bitfield.h> 13 + #include <linux/cleanup.h> 13 14 #include <linux/dma-mapping.h> 14 15 #include <linux/dmaengine.h> 15 16 #include <linux/interrupt.h> ··· 297 296 { 298 297 struct dma_chan *chan = &channel->vc.chan; 299 298 struct rz_dmac *dmac = to_rz_dmac(chan->device); 300 - unsigned long flags; 301 299 302 300 dev_dbg(dmac->dev, "%s channel %d\n", __func__, channel->index); 303 301 304 - local_irq_save(flags); 305 302 rz_dmac_ch_writel(channel, CHCTRL_DEFAULT, CHCTRL, 1); 306 - local_irq_restore(flags); 307 303 } 308 304 309 305 static void rz_dmac_set_dmars_register(struct rz_dmac *dmac, int nr, u32 dmars) ··· 445 447 if (!desc) 446 448 break; 447 449 450 + /* No need to lock. This is called only for the 1st client. */ 448 451 list_add_tail(&desc->node, &channel->ld_free); 449 452 channel->descs_allocated++; 450 453 } ··· 501 502 dev_dbg(dmac->dev, "%s channel: %d src=0x%pad dst=0x%pad len=%zu\n", 502 503 __func__, channel->index, &src, &dest, len); 503 504 504 - if (list_empty(&channel->ld_free)) 505 - return NULL; 505 + scoped_guard(spinlock_irqsave, &channel->vc.lock) { 506 + if (list_empty(&channel->ld_free)) 507 + return NULL; 506 508 507 - desc = list_first_entry(&channel->ld_free, struct rz_dmac_desc, node); 509 + desc = list_first_entry(&channel->ld_free, struct rz_dmac_desc, node); 508 510 509 - desc->type = RZ_DMAC_DESC_MEMCPY; 510 - desc->src = src; 511 - desc->dest = dest; 512 - desc->len = len; 513 - desc->direction = DMA_MEM_TO_MEM; 511 + desc->type = RZ_DMAC_DESC_MEMCPY; 512 + desc->src = src; 513 + desc->dest = dest; 514 + desc->len = len; 515 + desc->direction = DMA_MEM_TO_MEM; 514 516 515 - list_move_tail(channel->ld_free.next, &channel->ld_queue); 517 + list_move_tail(channel->ld_free.next, &channel->ld_queue); 518 + } 519 + 516 520 return vchan_tx_prep(&channel->vc, &desc->vd, flags); 517 521 } 518 522 ··· 531 529 int dma_length = 0; 532 530 int i = 0; 533 531 534 - if (list_empty(&channel->ld_free)) 535 - return NULL; 532 + scoped_guard(spinlock_irqsave, &channel->vc.lock) { 533 + if (list_empty(&channel->ld_free)) 534 + return NULL; 536 535 537 - desc = list_first_entry(&channel->ld_free, struct rz_dmac_desc, node); 536 + desc = list_first_entry(&channel->ld_free, struct rz_dmac_desc, node); 538 537 539 - for_each_sg(sgl, sg, sg_len, i) { 540 - dma_length += sg_dma_len(sg); 538 + for_each_sg(sgl, sg, sg_len, i) 539 + dma_length += sg_dma_len(sg); 540 + 541 + desc->type = RZ_DMAC_DESC_SLAVE_SG; 542 + desc->sg = sgl; 543 + desc->sgcount = sg_len; 544 + desc->len = dma_length; 545 + desc->direction = direction; 546 + 547 + if (direction == DMA_DEV_TO_MEM) 548 + desc->src = channel->src_per_address; 549 + else 550 + desc->dest = channel->dst_per_address; 551 + 552 + list_move_tail(channel->ld_free.next, &channel->ld_queue); 541 553 } 542 554 543 - desc->type = RZ_DMAC_DESC_SLAVE_SG; 544 - desc->sg = sgl; 545 - desc->sgcount = sg_len; 546 - desc->len = dma_length; 547 - desc->direction = direction; 548 - 549 - if (direction == DMA_DEV_TO_MEM) 550 - desc->src = channel->src_per_address; 551 - else 552 - desc->dest = channel->dst_per_address; 553 - 554 - list_move_tail(channel->ld_free.next, &channel->ld_queue); 555 555 return vchan_tx_prep(&channel->vc, &desc->vd, flags); 556 556 } 557 557 ··· 565 561 unsigned int i; 566 562 LIST_HEAD(head); 567 563 568 - rz_dmac_disable_hw(channel); 569 564 spin_lock_irqsave(&channel->vc.lock, flags); 565 + rz_dmac_disable_hw(channel); 570 566 for (i = 0; i < DMAC_NR_LMDESC; i++) 571 567 
lmdesc[i].header = 0; 572 568 ··· 703 699 if (chstat & CHSTAT_ER) { 704 700 dev_err(dmac->dev, "DMAC err CHSTAT_%d = %08X\n", 705 701 channel->index, chstat); 706 - rz_dmac_ch_writel(channel, CHCTRL_DEFAULT, CHCTRL, 1); 702 + 703 + scoped_guard(spinlock_irqsave, &channel->vc.lock) 704 + rz_dmac_ch_writel(channel, CHCTRL_DEFAULT, CHCTRL, 1); 707 705 goto done; 708 706 } 709 707
+2 -2
drivers/dma/xilinx/xdma.c
··· 1234 1234 1235 1235 xdev->rmap = devm_regmap_init_mmio(&pdev->dev, reg_base, 1236 1236 &xdma_regmap_config); 1237 - if (!xdev->rmap) { 1238 - xdma_err(xdev, "config regmap failed: %d", ret); 1237 + if (IS_ERR(xdev->rmap)) { 1238 + xdma_err(xdev, "config regmap failed: %pe", xdev->rmap); 1239 1239 goto failed; 1240 1240 } 1241 1241 INIT_LIST_HEAD(&xdev->dma_dev.channels);
+30 -16
drivers/dma/xilinx/xilinx_dma.c
··· 997 997 struct xilinx_cdma_tx_segment, 998 998 node); 999 999 cdma_hw = &cdma_seg->hw; 1000 - residue += (cdma_hw->control - cdma_hw->status) & 1001 - chan->xdev->max_buffer_len; 1000 + residue += (cdma_hw->control & chan->xdev->max_buffer_len) - 1001 + (cdma_hw->status & chan->xdev->max_buffer_len); 1002 1002 } else if (chan->xdev->dma_config->dmatype == 1003 1003 XDMA_TYPE_AXIDMA) { 1004 1004 axidma_seg = list_entry(entry, 1005 1005 struct xilinx_axidma_tx_segment, 1006 1006 node); 1007 1007 axidma_hw = &axidma_seg->hw; 1008 - residue += (axidma_hw->control - axidma_hw->status) & 1009 - chan->xdev->max_buffer_len; 1008 + residue += (axidma_hw->control & chan->xdev->max_buffer_len) - 1009 + (axidma_hw->status & chan->xdev->max_buffer_len); 1010 1010 } else { 1011 1011 aximcdma_seg = 1012 1012 list_entry(entry, ··· 1014 1014 node); 1015 1015 aximcdma_hw = &aximcdma_seg->hw; 1016 1016 residue += 1017 - (aximcdma_hw->control - aximcdma_hw->status) & 1018 - chan->xdev->max_buffer_len; 1017 + (aximcdma_hw->control & chan->xdev->max_buffer_len) - 1018 + (aximcdma_hw->status & chan->xdev->max_buffer_len); 1019 1019 } 1020 1020 } 1021 1021 ··· 1234 1234 } 1235 1235 1236 1236 dma_cookie_init(dchan); 1237 - 1238 - if (chan->xdev->dma_config->dmatype == XDMA_TYPE_AXIDMA) { 1239 - /* For AXI DMA resetting once channel will reset the 1240 - * other channel as well so enable the interrupts here. 1241 - */ 1242 - dma_ctrl_set(chan, XILINX_DMA_REG_DMACR, 1243 - XILINX_DMA_DMAXR_ALL_IRQ_MASK); 1244 - } 1245 1237 1246 1238 if ((chan->xdev->dma_config->dmatype == XDMA_TYPE_CDMA) && chan->has_sg) 1247 1239 dma_ctrl_set(chan, XILINX_DMA_REG_DMACR, ··· 1556 1564 if (chan->err) 1557 1565 return; 1558 1566 1559 - if (list_empty(&chan->pending_list)) 1567 + if (list_empty(&chan->pending_list)) { 1568 + if (chan->cyclic) { 1569 + struct xilinx_dma_tx_descriptor *desc; 1570 + struct list_head *entry; 1571 + 1572 + desc = list_last_entry(&chan->done_list, 1573 + struct xilinx_dma_tx_descriptor, node); 1574 + list_for_each(entry, &desc->segments) { 1575 + struct xilinx_axidma_tx_segment *axidma_seg; 1576 + struct xilinx_axidma_desc_hw *axidma_hw; 1577 + axidma_seg = list_entry(entry, 1578 + struct xilinx_axidma_tx_segment, 1579 + node); 1580 + axidma_hw = &axidma_seg->hw; 1581 + axidma_hw->status = 0; 1582 + } 1583 + 1584 + list_splice_tail_init(&chan->done_list, &chan->active_list); 1585 + chan->desc_pendingcount = 0; 1586 + chan->idle = false; 1587 + } 1560 1588 return; 1589 + } 1561 1590 1562 1591 if (!chan->idle) 1563 1592 return; ··· 1604 1591 head_desc->async_tx.phys); 1605 1592 reg &= ~XILINX_DMA_CR_DELAY_MAX; 1606 1593 reg |= chan->irq_delay << XILINX_DMA_CR_DELAY_SHIFT; 1594 + reg |= XILINX_DMA_DMAXR_ALL_IRQ_MASK; 1607 1595 dma_ctrl_write(chan, XILINX_DMA_REG_DMACR, reg); 1608 1596 1609 1597 xilinx_dma_start(chan); ··· 3038 3024 return -EINVAL; 3039 3025 } 3040 3026 3041 - xdev->common.directions |= chan->direction; 3027 + xdev->common.directions |= BIT(chan->direction); 3042 3028 3043 3029 /* Request the interrupt */ 3044 3030 chan->irq = of_irq_get(node, chan->tdest);
+2 -2
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
··· 692 692 goto err_ib_sched; 693 693 } 694 694 695 - /* Drop the initial kref_init count (see drm_sched_main as example) */ 696 - dma_fence_put(f); 697 695 ret = dma_fence_wait(f, false); 696 + /* Drop the returned fence reference after the wait completes */ 697 + dma_fence_put(f); 698 698 699 699 err_ib_sched: 700 700 amdgpu_job_free(job);
+11 -2
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
··· 4207 4207 4208 4208 static int amdgpu_device_get_job_timeout_settings(struct amdgpu_device *adev) 4209 4209 { 4210 - char *input = amdgpu_lockup_timeout; 4210 + char buf[AMDGPU_MAX_TIMEOUT_PARAM_LENGTH]; 4211 + char *input = buf; 4211 4212 char *timeout_setting = NULL; 4212 4213 int index = 0; 4213 4214 long timeout; ··· 4218 4217 adev->gfx_timeout = adev->compute_timeout = adev->sdma_timeout = 4219 4218 adev->video_timeout = msecs_to_jiffies(2000); 4220 4219 4221 - if (!strnlen(input, AMDGPU_MAX_TIMEOUT_PARAM_LENGTH)) 4220 + if (!strnlen(amdgpu_lockup_timeout, AMDGPU_MAX_TIMEOUT_PARAM_LENGTH)) 4222 4221 return 0; 4222 + 4223 + /* 4224 + * strsep() destructively modifies its input by replacing delimiters 4225 + * with '\0'. Use a stack copy so the global module parameter buffer 4226 + * remains intact for multi-GPU systems where this function is called 4227 + * once per device. 4228 + */ 4229 + strscpy(buf, amdgpu_lockup_timeout, sizeof(buf)); 4223 4230 4224 4231 while ((timeout_setting = strsep(&input, ",")) && 4225 4232 strnlen(timeout_setting, AMDGPU_MAX_TIMEOUT_PARAM_LENGTH)) {
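The amdgpu_device.c change works because strsep() overwrites each delimiter with a NUL byte in the buffer it parses, exactly as the new comment notes; parsing the module parameter in place would leave only the first token for the next device probed. A userspace sketch of parsing a copy (strncpy stands in for the kernel's strscpy, and the option string is made up):

#define _DEFAULT_SOURCE
#include <stdio.h>
#include <string.h>

static const char lockup_timeout[] = "10000,20000,30000,2000";

static void parse_timeouts(void)
{
        char buf[64];
        char *input = buf, *tok;

        strncpy(buf, lockup_timeout, sizeof(buf) - 1);
        buf[sizeof(buf) - 1] = '\0';

        while ((tok = strsep(&input, ",")) != NULL)
                printf("timeout: %s\n", tok);
        /* lockup_timeout itself is untouched, so a later call sees it all */
}

int main(void)
{
        parse_timeouts();       /* first device */
        parse_timeouts();       /* second device still gets four tokens */
        return 0;
}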
+32 -13
drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
··· 35 35 * PASIDs are global address space identifiers that can be shared 36 36 * between the GPU, an IOMMU and the driver. VMs on different devices 37 37 * may use the same PASID if they share the same address 38 - * space. Therefore PASIDs are allocated using a global IDA. VMs are 39 - * looked up from the PASID per amdgpu_device. 38 + * space. Therefore PASIDs are allocated using IDR cyclic allocator 39 + * (similar to kernel PID allocation) which naturally delays reuse. 40 + * VMs are looked up from the PASID per amdgpu_device. 40 41 */ 41 - static DEFINE_IDA(amdgpu_pasid_ida); 42 + 43 + static DEFINE_IDR(amdgpu_pasid_idr); 44 + static DEFINE_SPINLOCK(amdgpu_pasid_idr_lock); 42 45 43 46 /* Helper to free pasid from a fence callback */ 44 47 struct amdgpu_pasid_cb { ··· 53 50 * amdgpu_pasid_alloc - Allocate a PASID 54 51 * @bits: Maximum width of the PASID in bits, must be at least 1 55 52 * 56 - * Allocates a PASID of the given width while keeping smaller PASIDs 57 - * available if possible. 53 + * Uses kernel's IDR cyclic allocator (same as PID allocation). 54 + * Allocates sequentially with automatic wrap-around. 58 55 * 59 56 * Returns a positive integer on success. Returns %-EINVAL if bits==0. 60 57 * Returns %-ENOSPC if no PASID was available. Returns %-ENOMEM on ··· 62 59 */ 63 60 int amdgpu_pasid_alloc(unsigned int bits) 64 61 { 65 - int pasid = -EINVAL; 62 + int pasid; 66 63 67 - for (bits = min(bits, 31U); bits > 0; bits--) { 68 - pasid = ida_alloc_range(&amdgpu_pasid_ida, 1U << (bits - 1), 69 - (1U << bits) - 1, GFP_KERNEL); 70 - if (pasid != -ENOSPC) 71 - break; 72 - } 64 + if (bits == 0) 65 + return -EINVAL; 66 + 67 + spin_lock(&amdgpu_pasid_idr_lock); 68 + pasid = idr_alloc_cyclic(&amdgpu_pasid_idr, NULL, 1, 69 + 1U << bits, GFP_KERNEL); 70 + spin_unlock(&amdgpu_pasid_idr_lock); 73 71 74 72 if (pasid >= 0) 75 73 trace_amdgpu_pasid_allocated(pasid); ··· 85 81 void amdgpu_pasid_free(u32 pasid) 86 82 { 87 83 trace_amdgpu_pasid_freed(pasid); 88 - ida_free(&amdgpu_pasid_ida, pasid); 84 + 85 + spin_lock(&amdgpu_pasid_idr_lock); 86 + idr_remove(&amdgpu_pasid_idr, pasid); 87 + spin_unlock(&amdgpu_pasid_idr_lock); 89 88 } 90 89 91 90 static void amdgpu_pasid_free_cb(struct dma_fence *fence, ··· 622 615 dma_fence_put(id->pasid_mapping); 623 616 } 624 617 } 618 + } 619 + 620 + /** 621 + * amdgpu_pasid_mgr_cleanup - cleanup PASID manager 622 + * 623 + * Cleanup the IDR allocator. 624 + */ 625 + void amdgpu_pasid_mgr_cleanup(void) 626 + { 627 + spin_lock(&amdgpu_pasid_idr_lock); 628 + idr_destroy(&amdgpu_pasid_idr); 629 + spin_unlock(&amdgpu_pasid_idr_lock); 625 630 }
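The amdgpu_ids.c change above swaps the smallest-free IDA allocator for idr_alloc_cyclic(), which starts each search just past the most recently allocated ID and wraps at the limit, so a freed PASID is not handed out again right away (the new comment compares this to PID allocation). A minimal userspace sketch of that allocation order over a tiny fixed range (a bitmap stands in for the IDR; the range bounds are arbitrary):

#include <stdbool.h>
#include <stdio.h>

#define ID_MIN  1
#define ID_MAX  8

static bool used[ID_MAX + 1];
static int next_hint = ID_MIN;

static int alloc_id_cyclic(void)
{
        int span = ID_MAX - ID_MIN + 1;

        for (int n = 0; n < span; n++) {
                int id = ID_MIN + (next_hint - ID_MIN + n) % span;

                if (!used[id]) {
                        used[id] = true;
                        next_hint = (id == ID_MAX) ? ID_MIN : id + 1;
                        return id;
                }
        }
        return -1;      /* range exhausted */
}

int main(void)
{
        int a = alloc_id_cyclic();      /* 1 */
        int b = alloc_id_cyclic();      /* 2 */

        used[a] = false;                /* free ID 1 */
        printf("%d %d %d\n", a, b, alloc_id_cyclic());  /* 1 2 3: 1 is not reused yet */
        return 0;
}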
+1
drivers/gpu/drm/amd/amdgpu/amdgpu_ids.h
··· 74 74 void amdgpu_pasid_free(u32 pasid); 75 75 void amdgpu_pasid_free_delayed(struct dma_resv *resv, 76 76 u32 pasid); 77 + void amdgpu_pasid_mgr_cleanup(void); 77 78 78 79 bool amdgpu_vmid_had_gpu_reset(struct amdgpu_device *adev, 79 80 struct amdgpu_vmid *id);
+4 -3
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
··· 2898 2898 xa_destroy(&adev->vm_manager.pasids); 2899 2899 2900 2900 amdgpu_vmid_mgr_fini(adev); 2901 + amdgpu_pasid_mgr_cleanup(); 2901 2902 } 2902 2903 2903 2904 /** ··· 2974 2973 if (!root) 2975 2974 return false; 2976 2975 2977 - addr /= AMDGPU_GPU_PAGE_SIZE; 2978 - 2979 2976 if (is_compute_context && !svm_range_restore_pages(adev, pasid, vmid, 2980 - node_id, addr, ts, write_fault)) { 2977 + node_id, addr >> PAGE_SHIFT, ts, write_fault)) { 2981 2978 amdgpu_bo_unref(&root); 2982 2979 return true; 2983 2980 } 2981 + 2982 + addr /= AMDGPU_GPU_PAGE_SIZE; 2984 2983 2985 2984 r = amdgpu_bo_reserve(root, true); 2986 2985 if (r)
+3 -3
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
··· 3170 3170 struct kfd_process *process; 3171 3171 int ret; 3172 3172 3173 - /* Each FD owns only one kfd_process */ 3174 - if (p->context_id != KFD_CONTEXT_ID_PRIMARY) 3173 + if (!filep->private_data || !p) 3175 3174 return -EINVAL; 3176 3175 3177 - if (!filep->private_data || !p) 3176 + /* Each FD owns only one kfd_process */ 3177 + if (p->context_id != KFD_CONTEXT_ID_PRIMARY) 3178 3178 return -EINVAL; 3179 3179 3180 3180 mutex_lock(&kfd_processes_mutex);
+8 -2
drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
··· 3909 3909 3910 3910 aconnector->dc_sink = sink; 3911 3911 dc_sink_retain(aconnector->dc_sink); 3912 + drm_edid_free(aconnector->drm_edid); 3913 + aconnector->drm_edid = NULL; 3912 3914 if (sink->dc_edid.length == 0) { 3913 - aconnector->drm_edid = NULL; 3914 3915 hdmi_cec_unset_edid(aconnector); 3915 3916 if (aconnector->dc_link->aux_mode) { 3916 3917 drm_dp_cec_unset_edid(&aconnector->dm_dp_aux.aux); ··· 5423 5422 caps = &dm->backlight_caps[aconnector->bl_idx]; 5424 5423 5425 5424 /* Only offer ABM property when non-OLED and user didn't turn off by module parameter */ 5426 - if (!caps->ext_caps->bits.oled && amdgpu_dm_abm_level < 0) 5425 + if (caps->ext_caps && !caps->ext_caps->bits.oled && amdgpu_dm_abm_level < 0) 5427 5426 drm_object_attach_property(&aconnector->base.base, 5428 5427 dm->adev->mode_info.abm_level_property, 5429 5428 ABM_SYSFS_CONTROL); ··· 12524 12523 } 12525 12524 12526 12525 if (dc_resource_is_dsc_encoding_supported(dc)) { 12526 + for_each_oldnew_crtc_in_state(state, crtc, old_crtc_state, new_crtc_state, i) { 12527 + dm_new_crtc_state = to_dm_crtc_state(new_crtc_state); 12528 + dm_new_crtc_state->mode_changed_independent_from_dsc = new_crtc_state->mode_changed; 12529 + } 12530 + 12527 12531 for_each_oldnew_crtc_in_state(state, crtc, old_crtc_state, new_crtc_state, i) { 12528 12532 if (drm_atomic_crtc_needs_modeset(new_crtc_state)) { 12529 12533 ret = add_affected_mst_dsc_crtcs(state, crtc);
+1
drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h
··· 984 984 985 985 bool freesync_vrr_info_changed; 986 986 987 + bool mode_changed_independent_from_dsc; 987 988 bool dsc_force_changed; 988 989 bool vrr_supported; 989 990 struct mod_freesync_config freesync_config;
+3 -1
drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
··· 1744 1744 int ind = find_crtc_index_in_state_by_stream(state, stream); 1745 1745 1746 1746 if (ind >= 0) { 1747 + struct dm_crtc_state *dm_new_crtc_state = to_dm_crtc_state(state->crtcs[ind].new_state); 1748 + 1747 1749 DRM_INFO_ONCE("%s:%d MST_DSC no mode changed for stream 0x%p\n", 1748 1750 __func__, __LINE__, stream); 1749 - state->crtcs[ind].new_state->mode_changed = 0; 1751 + dm_new_crtc_state->base.mode_changed = dm_new_crtc_state->mode_changed_independent_from_dsc; 1750 1752 } 1751 1753 } 1752 1754 }
+2 -4
drivers/gpu/drm/amd/display/dc/resource/dce100/dce100_resource.c
··· 650 650 return &enc110->base; 651 651 } 652 652 653 - if (enc_init_data->hpd_source >= ARRAY_SIZE(link_enc_hpd_regs)) 654 - return NULL; 655 - 656 653 link_regs_id = 657 654 map_transmitter_id_to_phy_instance(enc_init_data->transmitter); 658 655 ··· 658 661 &link_enc_feature, 659 662 &link_enc_regs[link_regs_id], 660 663 &link_enc_aux_regs[enc_init_data->channel - 1], 661 - &link_enc_hpd_regs[enc_init_data->hpd_source]); 664 + enc_init_data->hpd_source >= ARRAY_SIZE(link_enc_hpd_regs) ? 665 + NULL : &link_enc_hpd_regs[enc_init_data->hpd_source]); 662 666 return &enc110->base; 663 667 } 664 668
+3 -2
drivers/gpu/drm/amd/display/dc/resource/dce110/dce110_resource.c
··· 671 671 kzalloc_obj(struct dce110_link_encoder); 672 672 int link_regs_id; 673 673 674 - if (!enc110 || enc_init_data->hpd_source >= ARRAY_SIZE(link_enc_hpd_regs)) 674 + if (!enc110) 675 675 return NULL; 676 676 677 677 link_regs_id = ··· 682 682 &link_enc_feature, 683 683 &link_enc_regs[link_regs_id], 684 684 &link_enc_aux_regs[enc_init_data->channel - 1], 685 - &link_enc_hpd_regs[enc_init_data->hpd_source]); 685 + enc_init_data->hpd_source >= ARRAY_SIZE(link_enc_hpd_regs) ? 686 + NULL : &link_enc_hpd_regs[enc_init_data->hpd_source]); 686 687 return &enc110->base; 687 688 } 688 689
+3 -2
drivers/gpu/drm/amd/display/dc/resource/dce112/dce112_resource.c
··· 632 632 kzalloc_obj(struct dce110_link_encoder); 633 633 int link_regs_id; 634 634 635 - if (!enc110 || enc_init_data->hpd_source >= ARRAY_SIZE(link_enc_hpd_regs)) 635 + if (!enc110) 636 636 return NULL; 637 637 638 638 link_regs_id = ··· 643 643 &link_enc_feature, 644 644 &link_enc_regs[link_regs_id], 645 645 &link_enc_aux_regs[enc_init_data->channel - 1], 646 - &link_enc_hpd_regs[enc_init_data->hpd_source]); 646 + enc_init_data->hpd_source >= ARRAY_SIZE(link_enc_hpd_regs) ? 647 + NULL : &link_enc_hpd_regs[enc_init_data->hpd_source]); 647 648 return &enc110->base; 648 649 } 649 650
+3 -2
drivers/gpu/drm/amd/display/dc/resource/dce120/dce120_resource.c
··· 716 716 kzalloc_obj(struct dce110_link_encoder); 717 717 int link_regs_id; 718 718 719 - if (!enc110 || enc_init_data->hpd_source >= ARRAY_SIZE(link_enc_hpd_regs)) 719 + if (!enc110) 720 720 return NULL; 721 721 722 722 link_regs_id = ··· 727 727 &link_enc_feature, 728 728 &link_enc_regs[link_regs_id], 729 729 &link_enc_aux_regs[enc_init_data->channel - 1], 730 - &link_enc_hpd_regs[enc_init_data->hpd_source]); 730 + enc_init_data->hpd_source >= ARRAY_SIZE(link_enc_hpd_regs) ? 731 + NULL : &link_enc_hpd_regs[enc_init_data->hpd_source]); 731 732 732 733 return &enc110->base; 733 734 }
+6 -8
drivers/gpu/drm/amd/display/dc/resource/dce60/dce60_resource.c
··· 746 746 return &enc110->base; 747 747 } 748 748 749 - if (enc_init_data->hpd_source >= ARRAY_SIZE(link_enc_hpd_regs)) 750 - return NULL; 751 - 752 749 link_regs_id = 753 750 map_transmitter_id_to_phy_instance(enc_init_data->transmitter); 754 751 755 752 dce60_link_encoder_construct(enc110, 756 - enc_init_data, 757 - &link_enc_feature, 758 - &link_enc_regs[link_regs_id], 759 - &link_enc_aux_regs[enc_init_data->channel - 1], 760 - &link_enc_hpd_regs[enc_init_data->hpd_source]); 753 + enc_init_data, 754 + &link_enc_feature, 755 + &link_enc_regs[link_regs_id], 756 + &link_enc_aux_regs[enc_init_data->channel - 1], 757 + enc_init_data->hpd_source >= ARRAY_SIZE(link_enc_hpd_regs) ? 758 + NULL : &link_enc_hpd_regs[enc_init_data->hpd_source]); 761 759 return &enc110->base; 762 760 } 763 761
+2 -4
drivers/gpu/drm/amd/display/dc/resource/dce80/dce80_resource.c
··· 752 752 return &enc110->base; 753 753 } 754 754 755 - if (enc_init_data->hpd_source >= ARRAY_SIZE(link_enc_hpd_regs)) 756 - return NULL; 757 - 758 755 link_regs_id = 759 756 map_transmitter_id_to_phy_instance(enc_init_data->transmitter); 760 757 ··· 760 763 &link_enc_feature, 761 764 &link_enc_regs[link_regs_id], 762 765 &link_enc_aux_regs[enc_init_data->channel - 1], 763 - &link_enc_hpd_regs[enc_init_data->hpd_source]); 766 + enc_init_data->hpd_source >= ARRAY_SIZE(link_enc_hpd_regs) ? 767 + NULL : &link_enc_hpd_regs[enc_init_data->hpd_source]); 764 768 return &enc110->base; 765 769 } 766 770
+32 -1
drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_0_ppt.c
··· 59 59 60 60 #define to_amdgpu_device(x) (container_of(x, struct amdgpu_device, pm.smu_i2c)) 61 61 62 + static void smu_v13_0_0_get_od_setting_limits(struct smu_context *smu, 63 + int od_feature_bit, 64 + int32_t *min, int32_t *max); 65 + 62 66 static const struct smu_feature_bits smu_v13_0_0_dpm_features = { 63 67 .bits = { 64 68 SMU_FEATURE_BIT_INIT(FEATURE_DPM_GFXCLK_BIT), ··· 1047 1043 PPTable_t *pptable = smu->smu_table.driver_pptable; 1048 1044 const OverDriveLimits_t * const overdrive_upperlimits = 1049 1045 &pptable->SkuTable.OverDriveLimitsBasicMax; 1046 + int32_t min_value, max_value; 1047 + bool feature_enabled; 1050 1048 1051 - return overdrive_upperlimits->FeatureCtrlMask & (1U << od_feature_bit); 1049 + switch (od_feature_bit) { 1050 + case PP_OD_FEATURE_FAN_CURVE_BIT: 1051 + feature_enabled = !!(overdrive_upperlimits->FeatureCtrlMask & (1U << od_feature_bit)); 1052 + if (feature_enabled) { 1053 + smu_v13_0_0_get_od_setting_limits(smu, PP_OD_FEATURE_FAN_CURVE_TEMP, 1054 + &min_value, &max_value); 1055 + if (!min_value && !max_value) { 1056 + feature_enabled = false; 1057 + goto out; 1058 + } 1059 + 1060 + smu_v13_0_0_get_od_setting_limits(smu, PP_OD_FEATURE_FAN_CURVE_PWM, 1061 + &min_value, &max_value); 1062 + if (!min_value && !max_value) { 1063 + feature_enabled = false; 1064 + goto out; 1065 + } 1066 + } 1067 + break; 1068 + default: 1069 + feature_enabled = !!(overdrive_upperlimits->FeatureCtrlMask & (1U << od_feature_bit)); 1070 + break; 1071 + } 1072 + 1073 + out: 1074 + return feature_enabled; 1052 1075 } 1053 1076 1054 1077 static void smu_v13_0_0_get_od_setting_limits(struct smu_context *smu,
+12 -9
drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c
··· 1391 1391 break; 1392 1392 case SMU_OD_MCLK: 1393 1393 if (!smu_v13_0_6_cap_supported(smu, SMU_CAP(SET_UCLK_MAX))) 1394 - return 0; 1394 + return -EOPNOTSUPP; 1395 1395 1396 1396 size += sysfs_emit_at(buf, size, "%s:\n", "OD_MCLK"); 1397 1397 size += sysfs_emit_at(buf, size, "0: %uMhz\n1: %uMhz\n", ··· 2122 2122 { 2123 2123 struct smu_dpm_context *smu_dpm = &(smu->smu_dpm); 2124 2124 struct smu_13_0_dpm_context *dpm_context = smu_dpm->dpm_context; 2125 + struct smu_dpm_table *uclk_table = &dpm_context->dpm_tables.uclk_table; 2125 2126 struct smu_umd_pstate_table *pstate_table = &smu->pstate_table; 2126 2127 uint32_t min_clk; 2127 2128 uint32_t max_clk; ··· 2222 2221 if (ret) 2223 2222 return ret; 2224 2223 2225 - min_clk = SMU_DPM_TABLE_MIN( 2226 - &dpm_context->dpm_tables.uclk_table); 2227 - max_clk = SMU_DPM_TABLE_MAX( 2228 - &dpm_context->dpm_tables.uclk_table); 2229 - ret = smu_v13_0_6_set_soft_freq_limited_range( 2230 - smu, SMU_UCLK, min_clk, max_clk, false); 2231 - if (ret) 2232 - return ret; 2224 + if (SMU_DPM_TABLE_MAX(uclk_table) != 2225 + pstate_table->uclk_pstate.curr.max) { 2226 + min_clk = SMU_DPM_TABLE_MIN(&dpm_context->dpm_tables.uclk_table); 2227 + max_clk = SMU_DPM_TABLE_MAX(&dpm_context->dpm_tables.uclk_table); 2228 + ret = smu_v13_0_6_set_soft_freq_limited_range(smu, 2229 + SMU_UCLK, min_clk, 2230 + max_clk, false); 2231 + if (ret) 2232 + return ret; 2233 + } 2233 2234 smu_v13_0_reset_custom_level(smu); 2234 2235 } 2235 2236 break;
+32 -1
drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_7_ppt.c
··· 59 59 60 60 #define to_amdgpu_device(x) (container_of(x, struct amdgpu_device, pm.smu_i2c)) 61 61 62 + static void smu_v13_0_7_get_od_setting_limits(struct smu_context *smu, 63 + int od_feature_bit, 64 + int32_t *min, int32_t *max); 65 + 62 66 static const struct smu_feature_bits smu_v13_0_7_dpm_features = { 63 67 .bits = { 64 68 SMU_FEATURE_BIT_INIT(FEATURE_DPM_GFXCLK_BIT), ··· 1057 1053 PPTable_t *pptable = smu->smu_table.driver_pptable; 1058 1054 const OverDriveLimits_t * const overdrive_upperlimits = 1059 1055 &pptable->SkuTable.OverDriveLimitsBasicMax; 1056 + int32_t min_value, max_value; 1057 + bool feature_enabled; 1060 1058 1061 - return overdrive_upperlimits->FeatureCtrlMask & (1U << od_feature_bit); 1059 + switch (od_feature_bit) { 1060 + case PP_OD_FEATURE_FAN_CURVE_BIT: 1061 + feature_enabled = !!(overdrive_upperlimits->FeatureCtrlMask & (1U << od_feature_bit)); 1062 + if (feature_enabled) { 1063 + smu_v13_0_7_get_od_setting_limits(smu, PP_OD_FEATURE_FAN_CURVE_TEMP, 1064 + &min_value, &max_value); 1065 + if (!min_value && !max_value) { 1066 + feature_enabled = false; 1067 + goto out; 1068 + } 1069 + 1070 + smu_v13_0_7_get_od_setting_limits(smu, PP_OD_FEATURE_FAN_CURVE_PWM, 1071 + &min_value, &max_value); 1072 + if (!min_value && !max_value) { 1073 + feature_enabled = false; 1074 + goto out; 1075 + } 1076 + } 1077 + break; 1078 + default: 1079 + feature_enabled = !!(overdrive_upperlimits->FeatureCtrlMask & (1U << od_feature_bit)); 1080 + break; 1081 + } 1082 + 1083 + out: 1084 + return feature_enabled; 1062 1085 } 1063 1086 1064 1087 static void smu_v13_0_7_get_od_setting_limits(struct smu_context *smu,
+32 -1
drivers/gpu/drm/amd/pm/swsmu/smu14/smu_v14_0_2_ppt.c
··· 56 56 57 57 #define to_amdgpu_device(x) (container_of(x, struct amdgpu_device, pm.smu_i2c)) 58 58 59 + static void smu_v14_0_2_get_od_setting_limits(struct smu_context *smu, 60 + int od_feature_bit, 61 + int32_t *min, int32_t *max); 62 + 59 63 static const struct smu_feature_bits smu_v14_0_2_dpm_features = { 60 64 .bits = { SMU_FEATURE_BIT_INIT(FEATURE_DPM_GFXCLK_BIT), 61 65 SMU_FEATURE_BIT_INIT(FEATURE_DPM_UCLK_BIT), ··· 926 922 PPTable_t *pptable = smu->smu_table.driver_pptable; 927 923 const OverDriveLimits_t * const overdrive_upperlimits = 928 924 &pptable->SkuTable.OverDriveLimitsBasicMax; 925 + int32_t min_value, max_value; 926 + bool feature_enabled; 929 927 930 - return overdrive_upperlimits->FeatureCtrlMask & (1U << od_feature_bit); 928 + switch (od_feature_bit) { 929 + case PP_OD_FEATURE_FAN_CURVE_BIT: 930 + feature_enabled = !!(overdrive_upperlimits->FeatureCtrlMask & (1U << od_feature_bit)); 931 + if (feature_enabled) { 932 + smu_v14_0_2_get_od_setting_limits(smu, PP_OD_FEATURE_FAN_CURVE_TEMP, 933 + &min_value, &max_value); 934 + if (!min_value && !max_value) { 935 + feature_enabled = false; 936 + goto out; 937 + } 938 + 939 + smu_v14_0_2_get_od_setting_limits(smu, PP_OD_FEATURE_FAN_CURVE_PWM, 940 + &min_value, &max_value); 941 + if (!min_value && !max_value) { 942 + feature_enabled = false; 943 + goto out; 944 + } 945 + } 946 + break; 947 + default: 948 + feature_enabled = !!(overdrive_upperlimits->FeatureCtrlMask & (1U << od_feature_bit)); 949 + break; 950 + } 951 + 952 + out: 953 + return feature_enabled; 931 954 } 932 955 933 956 static void smu_v14_0_2_get_od_setting_limits(struct smu_context *smu,
+27 -21
drivers/gpu/drm/drm_gem_shmem_helper.c
··· 550 550 } 551 551 EXPORT_SYMBOL_GPL(drm_gem_shmem_dumb_create); 552 552 553 - static bool drm_gem_shmem_try_map_pmd(struct vm_fault *vmf, unsigned long addr, 554 - struct page *page) 553 + static vm_fault_t try_insert_pfn(struct vm_fault *vmf, unsigned int order, 554 + unsigned long pfn) 555 555 { 556 + if (!order) { 557 + return vmf_insert_pfn(vmf->vma, vmf->address, pfn); 556 558 #ifdef CONFIG_ARCH_SUPPORTS_PMD_PFNMAP 557 - unsigned long pfn = page_to_pfn(page); 558 - unsigned long paddr = pfn << PAGE_SHIFT; 559 - bool aligned = (addr & ~PMD_MASK) == (paddr & ~PMD_MASK); 559 + } else if (order == PMD_ORDER) { 560 + unsigned long paddr = pfn << PAGE_SHIFT; 561 + bool aligned = (vmf->address & ~PMD_MASK) == (paddr & ~PMD_MASK); 560 562 561 - if (aligned && 562 - pmd_none(*vmf->pmd) && 563 - folio_test_pmd_mappable(page_folio(page))) { 564 - pfn &= PMD_MASK >> PAGE_SHIFT; 565 - if (vmf_insert_pfn_pmd(vmf, pfn, false) == VM_FAULT_NOPAGE) 566 - return true; 567 - } 563 + if (aligned && 564 + folio_test_pmd_mappable(page_folio(pfn_to_page(pfn)))) { 565 + pfn &= PMD_MASK >> PAGE_SHIFT; 566 + return vmf_insert_pfn_pmd(vmf, pfn, false); 567 + } 568 568 #endif 569 - 570 - return false; 569 + } 570 + return VM_FAULT_FALLBACK; 571 571 } 572 572 573 - static vm_fault_t drm_gem_shmem_fault(struct vm_fault *vmf) 573 + static vm_fault_t drm_gem_shmem_any_fault(struct vm_fault *vmf, unsigned int order) 574 574 { 575 575 struct vm_area_struct *vma = vmf->vma; 576 576 struct drm_gem_object *obj = vma->vm_private_data; ··· 580 580 struct page **pages = shmem->pages; 581 581 pgoff_t page_offset; 582 582 unsigned long pfn; 583 + 584 + if (order && order != PMD_ORDER) 585 + return VM_FAULT_FALLBACK; 583 586 584 587 /* Offset to faulty address in the VMA. */ 585 588 page_offset = vmf->pgoff - vma->vm_pgoff; ··· 596 593 goto out; 597 594 } 598 595 599 - if (drm_gem_shmem_try_map_pmd(vmf, vmf->address, pages[page_offset])) { 600 - ret = VM_FAULT_NOPAGE; 601 - goto out; 602 - } 603 - 604 596 pfn = page_to_pfn(pages[page_offset]); 605 - ret = vmf_insert_pfn(vma, vmf->address, pfn); 597 + ret = try_insert_pfn(vmf, order, pfn); 606 598 607 599 out: 608 600 dma_resv_unlock(shmem->base.resv); 609 601 610 602 return ret; 603 + } 604 + 605 + static vm_fault_t drm_gem_shmem_fault(struct vm_fault *vmf) 606 + { 607 + return drm_gem_shmem_any_fault(vmf, 0); 611 608 } 612 609 613 610 static void drm_gem_shmem_vm_open(struct vm_area_struct *vma) ··· 646 643 647 644 const struct vm_operations_struct drm_gem_shmem_vm_ops = { 648 645 .fault = drm_gem_shmem_fault, 646 + #ifdef CONFIG_ARCH_SUPPORTS_PMD_PFNMAP 647 + .huge_fault = drm_gem_shmem_any_fault, 648 + #endif 649 649 .open = drm_gem_shmem_vm_open, 650 650 .close = drm_gem_shmem_vm_close, 651 651 };
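The shmem fault rework above only attempts a PMD entry when the faulting virtual address and the backing physical address are congruent modulo the PMD size, i.e. (vmf->address & ~PMD_MASK) == (paddr & ~PMD_MASK). A standalone sketch of that congruence test, assuming x86-64 style 4K pages and 2M PMDs rather than the kernel's per-arch constants:

    /* Standalone sketch of the PMD-congruence test used above; PAGE_SHIFT
     * and PMD_SHIFT are assumed x86-64 style values, not taken from the
     * kernel headers. */
    #include <stdio.h>
    #include <stdint.h>

    #define PAGE_SHIFT  12
    #define PMD_SHIFT   21
    #define PMD_MASK    (~((1UL << PMD_SHIFT) - 1))

    static int pmd_mappable(uint64_t vaddr, uint64_t pfn)
    {
            uint64_t paddr = pfn << PAGE_SHIFT;

            /* both addresses must sit at the same offset inside a 2M block */
            return (vaddr & ~PMD_MASK) == (paddr & ~PMD_MASK);
    }

    int main(void)
    {
            /* aligned pair: both offsets inside the 2M block are zero */
            printf("%d\n", pmd_mappable(0x7f0000200000ULL, 0x123400));
            /* misaligned pair: virtual offset 0x1000, physical offset 0 */
            printf("%d\n", pmd_mappable(0x7f0000201000ULL, 0x123400));
            return 0;
    }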
+2 -2
drivers/gpu/drm/drm_syncobj.c
··· 602 602 drm_syncobj_get(syncobj); 603 603 604 604 ret = xa_alloc(&file_private->syncobj_xa, handle, syncobj, xa_limit_32b, 605 - GFP_NOWAIT); 605 + GFP_KERNEL); 606 606 if (ret) 607 607 drm_syncobj_put(syncobj); 608 608 ··· 716 716 drm_syncobj_get(syncobj); 717 717 718 718 ret = xa_alloc(&file_private->syncobj_xa, handle, syncobj, xa_limit_32b, 719 - GFP_NOWAIT); 719 + GFP_KERNEL); 720 720 if (ret) 721 721 drm_syncobj_put(syncobj); 722 722
+7 -1
drivers/gpu/drm/i915/display/intel_display.c
··· 4602 4602 struct intel_crtc_state *crtc_state = 4603 4603 intel_atomic_get_new_crtc_state(state, crtc); 4604 4604 struct intel_crtc_state *saved_state; 4605 + int err; 4605 4606 4606 4607 saved_state = intel_crtc_state_alloc(crtc); 4607 4608 if (!saved_state) ··· 4611 4610 /* free the old crtc_state->hw members */ 4612 4611 intel_crtc_free_hw_state(crtc_state); 4613 4612 4614 - intel_dp_tunnel_atomic_clear_stream_bw(state, crtc_state); 4613 + err = intel_dp_tunnel_atomic_clear_stream_bw(state, crtc_state); 4614 + if (err) { 4615 + kfree(saved_state); 4616 + 4617 + return err; 4618 + } 4615 4619 4616 4620 /* FIXME: before the switch to atomic started, a new pipe_config was 4617 4621 * kzalloc'd. Code that depends on any field being zero should be
+14 -6
drivers/gpu/drm/i915/display/intel_dp_tunnel.c
··· 621 621 * 622 622 * Clear any DP tunnel stream BW requirement set by 623 623 * intel_dp_tunnel_atomic_compute_stream_bw(). 624 + * 625 + * Returns 0 in case of success, a negative error code otherwise. 624 626 */ 625 - void intel_dp_tunnel_atomic_clear_stream_bw(struct intel_atomic_state *state, 626 - struct intel_crtc_state *crtc_state) 627 + int intel_dp_tunnel_atomic_clear_stream_bw(struct intel_atomic_state *state, 628 + struct intel_crtc_state *crtc_state) 627 629 { 628 630 struct intel_crtc *crtc = to_intel_crtc(crtc_state->uapi.crtc); 631 + int err; 629 632 630 633 if (!crtc_state->dp_tunnel_ref.tunnel) 631 - return; 634 + return 0; 632 635 633 - drm_dp_tunnel_atomic_set_stream_bw(&state->base, 634 - crtc_state->dp_tunnel_ref.tunnel, 635 - crtc->pipe, 0); 636 + err = drm_dp_tunnel_atomic_set_stream_bw(&state->base, 637 + crtc_state->dp_tunnel_ref.tunnel, 638 + crtc->pipe, 0); 639 + if (err) 640 + return err; 641 + 636 642 drm_dp_tunnel_ref_put(&crtc_state->dp_tunnel_ref); 643 + 644 + return 0; 637 645 } 638 646 639 647 /**
+7 -4
drivers/gpu/drm/i915/display/intel_dp_tunnel.h
··· 40 40 struct intel_dp *intel_dp, 41 41 const struct intel_connector *connector, 42 42 struct intel_crtc_state *crtc_state); 43 - void intel_dp_tunnel_atomic_clear_stream_bw(struct intel_atomic_state *state, 44 - struct intel_crtc_state *crtc_state); 43 + int intel_dp_tunnel_atomic_clear_stream_bw(struct intel_atomic_state *state, 44 + struct intel_crtc_state *crtc_state); 45 45 46 46 int intel_dp_tunnel_atomic_add_state_for_crtc(struct intel_atomic_state *state, 47 47 struct intel_crtc *crtc); ··· 88 88 return 0; 89 89 } 90 90 91 - static inline void 91 + static inline int 92 92 intel_dp_tunnel_atomic_clear_stream_bw(struct intel_atomic_state *state, 93 - struct intel_crtc_state *crtc_state) {} 93 + struct intel_crtc_state *crtc_state) 94 + { 95 + return 0; 96 + } 94 97 95 98 static inline int 96 99 intel_dp_tunnel_atomic_add_state_for_crtc(struct intel_atomic_state *state,
+3 -1
drivers/gpu/drm/i915/display/intel_gmbus.c
··· 496 496 497 497 val = intel_de_read_fw(display, GMBUS3(display)); 498 498 do { 499 - if (extra_byte_added && len == 1) 499 + if (extra_byte_added && len == 1) { 500 + len--; 500 501 break; 502 + } 501 503 502 504 *buf++ = val & 0xff; 503 505 val >>= 8;
+9 -2
drivers/gpu/drm/i915/display/intel_plane.c
··· 436 436 drm_framebuffer_get(plane_state->hw.fb); 437 437 } 438 438 439 + static void unlink_nv12_plane(struct intel_crtc_state *crtc_state, 440 + struct intel_plane_state *plane_state); 441 + 439 442 void intel_plane_set_invisible(struct intel_crtc_state *crtc_state, 440 443 struct intel_plane_state *plane_state) 441 444 { 442 445 struct intel_plane *plane = to_intel_plane(plane_state->uapi.plane); 446 + 447 + unlink_nv12_plane(crtc_state, plane_state); 443 448 444 449 crtc_state->active_planes &= ~BIT(plane->id); 445 450 crtc_state->scaled_planes &= ~BIT(plane->id); ··· 1518 1513 struct intel_display *display = to_intel_display(plane_state); 1519 1514 struct intel_plane *plane = to_intel_plane(plane_state->uapi.plane); 1520 1515 1516 + if (!plane_state->planar_linked_plane) 1517 + return; 1518 + 1521 1519 plane_state->planar_linked_plane = NULL; 1522 1520 1523 1521 if (!plane_state->is_y_plane) ··· 1558 1550 if (plane->pipe != crtc->pipe) 1559 1551 continue; 1560 1552 1561 - if (plane_state->planar_linked_plane) 1562 - unlink_nv12_plane(crtc_state, plane_state); 1553 + unlink_nv12_plane(crtc_state, plane_state); 1563 1554 } 1564 1555 1565 1556 if (!crtc_state->nv12_planes)
+1 -1
drivers/gpu/drm/i915/i915_wait_util.h
··· 25 25 might_sleep(); \ 26 26 for (;;) { \ 27 27 const bool expired__ = ktime_after(ktime_get_raw(), end__); \ 28 - OP; \ 29 28 /* Guarantee COND check prior to timeout */ \ 30 29 barrier(); \ 30 + OP; \ 31 31 if (COND) { \ 32 32 ret__ = 0; \ 33 33 break; \
+5 -4
drivers/gpu/drm/mediatek/mtk_dsi.c
··· 1236 1236 1237 1237 dsi->host.ops = &mtk_dsi_ops; 1238 1238 dsi->host.dev = dev; 1239 + 1240 + init_waitqueue_head(&dsi->irq_wait_queue); 1241 + 1242 + platform_set_drvdata(pdev, dsi); 1243 + 1239 1244 ret = mipi_dsi_host_register(&dsi->host); 1240 1245 if (ret < 0) 1241 1246 return dev_err_probe(dev, ret, "Failed to register DSI host\n"); ··· 1251 1246 mipi_dsi_host_unregister(&dsi->host); 1252 1247 return dev_err_probe(&pdev->dev, ret, "Failed to request DSI irq\n"); 1253 1248 } 1254 - 1255 - init_waitqueue_head(&dsi->irq_wait_queue); 1256 - 1257 - platform_set_drvdata(pdev, dsi); 1258 1249 1259 1250 dsi->bridge.of_node = dev->of_node; 1260 1251 dsi->bridge.type = DRM_MODE_CONNECTOR_DSI;
+1
drivers/gpu/drm/xe/regs/xe_gt_regs.h
··· 553 553 #define ENABLE_SMP_LD_RENDER_SURFACE_CONTROL REG_BIT(44 - 32) 554 554 #define FORCE_SLM_FENCE_SCOPE_TO_TILE REG_BIT(42 - 32) 555 555 #define FORCE_UGM_FENCE_SCOPE_TO_TILE REG_BIT(41 - 32) 556 + #define L3_128B_256B_WRT_DIS REG_BIT(40 - 32) 556 557 #define MAXREQS_PER_BANK REG_GENMASK(39 - 32, 37 - 32) 557 558 #define DISABLE_128B_EVICTION_COMMAND_UDW REG_BIT(36 - 32) 558 559
+6 -6
drivers/gpu/drm/xe/xe_pt.c
··· 1442 1442 err = vma_check_userptr(vm, op->map.vma, pt_update); 1443 1443 break; 1444 1444 case DRM_GPUVA_OP_REMAP: 1445 - if (op->remap.prev) 1445 + if (op->remap.prev && !op->remap.skip_prev) 1446 1446 err = vma_check_userptr(vm, op->remap.prev, pt_update); 1447 - if (!err && op->remap.next) 1447 + if (!err && op->remap.next && !op->remap.skip_next) 1448 1448 err = vma_check_userptr(vm, op->remap.next, pt_update); 1449 1449 break; 1450 1450 case DRM_GPUVA_OP_UNMAP: ··· 2198 2198 2199 2199 err = unbind_op_prepare(tile, pt_update_ops, old); 2200 2200 2201 - if (!err && op->remap.prev) { 2201 + if (!err && op->remap.prev && !op->remap.skip_prev) { 2202 2202 err = bind_op_prepare(vm, tile, pt_update_ops, 2203 2203 op->remap.prev, false); 2204 2204 pt_update_ops->wait_vm_bookkeep = true; 2205 2205 } 2206 - if (!err && op->remap.next) { 2206 + if (!err && op->remap.next && !op->remap.skip_next) { 2207 2207 err = bind_op_prepare(vm, tile, pt_update_ops, 2208 2208 op->remap.next, false); 2209 2209 pt_update_ops->wait_vm_bookkeep = true; ··· 2428 2428 2429 2429 unbind_op_commit(vm, tile, pt_update_ops, old, fence, fence2); 2430 2430 2431 - if (op->remap.prev) 2431 + if (op->remap.prev && !op->remap.skip_prev) 2432 2432 bind_op_commit(vm, tile, pt_update_ops, op->remap.prev, 2433 2433 fence, fence2, false); 2434 - if (op->remap.next) 2434 + if (op->remap.next && !op->remap.skip_next) 2435 2435 bind_op_commit(vm, tile, pt_update_ops, op->remap.next, 2436 2436 fence, fence2, false); 2437 2437 break;
+2
drivers/gpu/drm/xe/xe_sriov_packet.c
··· 341 341 ret = xe_sriov_pf_migration_restore_produce(xe, vfid, *data); 342 342 if (ret) { 343 343 xe_sriov_packet_free(*data); 344 + *data = NULL; 345 + 344 346 return ret; 345 347 } 346 348
+18 -4
drivers/gpu/drm/xe/xe_vm.c
··· 2554 2554 if (!err && op->remap.skip_prev) { 2555 2555 op->remap.prev->tile_present = 2556 2556 tile_present; 2557 - op->remap.prev = NULL; 2558 2557 } 2559 2558 } 2560 2559 if (op->remap.next) { ··· 2563 2564 if (!err && op->remap.skip_next) { 2564 2565 op->remap.next->tile_present = 2565 2566 tile_present; 2566 - op->remap.next = NULL; 2567 2567 } 2568 2568 } 2569 2569 2570 - /* Adjust for partial unbind after removing VMA from VM */ 2570 + /* 2571 + * Adjust for partial unbind after removing VMA from VM. In case 2572 + * of unwind we might need to undo this later. 2573 + */ 2571 2574 if (!err) { 2572 2575 op->base.remap.unmap->va->va.addr = op->remap.start; 2573 2576 op->base.remap.unmap->va->va.range = op->remap.range; ··· 2688 2687 2689 2688 op->remap.start = xe_vma_start(old); 2690 2689 op->remap.range = xe_vma_size(old); 2690 + op->remap.old_start = op->remap.start; 2691 + op->remap.old_range = op->remap.range; 2691 2692 2692 2693 flags |= op->base.remap.unmap->va->flags & XE_VMA_CREATE_MASK; 2693 2694 if (op->base.remap.prev) { ··· 2838 2835 xe_svm_notifier_lock(vm); 2839 2836 vma->gpuva.flags &= ~XE_VMA_DESTROYED; 2840 2837 xe_svm_notifier_unlock(vm); 2841 - if (post_commit) 2838 + if (post_commit) { 2839 + /* 2840 + * Restore the old va range, in case of the 2841 + * prev/next skip optimisation. Otherwise what 2842 + * we re-insert here could be smaller than the 2843 + * original range. 2844 + */ 2845 + op->base.remap.unmap->va->va.addr = 2846 + op->remap.old_start; 2847 + op->base.remap.unmap->va->va.range = 2848 + op->remap.old_range; 2842 2849 xe_vm_insert_vma(vm, vma); 2850 + } 2843 2851 } 2844 2852 break; 2845 2853 }
+4
drivers/gpu/drm/xe/xe_vm_types.h
··· 373 373 u64 start; 374 374 /** @range: range of the VMA unmap */ 375 375 u64 range; 376 + /** @old_start: Original start of the VMA we unmap */ 377 + u64 old_start; 378 + /** @old_range: Original range of the VMA we unmap */ 379 + u64 old_range; 376 380 /** @skip_prev: skip prev rebind */ 377 381 bool skip_prev; 378 382 /** @skip_next: skip next rebind */
+2 -1
drivers/gpu/drm/xe/xe_wa.c
··· 247 247 LSN_DIM_Z_WGT_MASK, 248 248 LSN_LNI_WGT(1) | LSN_LNE_WGT(1) | 249 249 LSN_DIM_X_WGT(1) | LSN_DIM_Y_WGT(1) | 250 - LSN_DIM_Z_WGT(1))) 250 + LSN_DIM_Z_WGT(1)), 251 + SET(LSC_CHICKEN_BIT_0_UDW, L3_128B_256B_WRT_DIS)) 251 252 }, 252 253 253 254 /* Xe2_HPM */
+31 -23
drivers/hwmon/adm1177.c
··· 10 10 #include <linux/hwmon.h> 11 11 #include <linux/i2c.h> 12 12 #include <linux/init.h> 13 + #include <linux/math64.h> 14 + #include <linux/minmax.h> 13 15 #include <linux/module.h> 14 16 #include <linux/regulator/consumer.h> 15 17 ··· 35 33 struct adm1177_state { 36 34 struct i2c_client *client; 37 35 u32 r_sense_uohm; 38 - u32 alert_threshold_ua; 36 + u64 alert_threshold_ua; 39 37 bool vrange_high; 40 38 }; 41 39 ··· 50 48 } 51 49 52 50 static int adm1177_write_alert_thr(struct adm1177_state *st, 53 - u32 alert_threshold_ua) 51 + u64 alert_threshold_ua) 54 52 { 55 53 u64 val; 56 54 int ret; ··· 93 91 *val = div_u64((105840000ull * dummy), 94 92 4096 * st->r_sense_uohm); 95 93 return 0; 96 - case hwmon_curr_max_alarm: 97 - *val = st->alert_threshold_ua; 94 + case hwmon_curr_max: 95 + *val = div_u64(st->alert_threshold_ua, 1000); 98 96 return 0; 99 97 default: 100 98 return -EOPNOTSUPP; ··· 128 126 switch (type) { 129 127 case hwmon_curr: 130 128 switch (attr) { 131 - case hwmon_curr_max_alarm: 132 - adm1177_write_alert_thr(st, val); 133 - return 0; 129 + case hwmon_curr_max: 130 + val = clamp_val(val, 0, 131 + div_u64(105840000ULL, st->r_sense_uohm)); 132 + return adm1177_write_alert_thr(st, (u64)val * 1000); 134 133 default: 135 134 return -EOPNOTSUPP; 136 135 } ··· 159 156 if (st->r_sense_uohm) 160 157 return 0444; 161 158 return 0; 162 - case hwmon_curr_max_alarm: 159 + case hwmon_curr_max: 163 160 if (st->r_sense_uohm) 164 161 return 0644; 165 162 return 0; ··· 173 170 174 171 static const struct hwmon_channel_info * const adm1177_info[] = { 175 172 HWMON_CHANNEL_INFO(curr, 176 - HWMON_C_INPUT | HWMON_C_MAX_ALARM), 173 + HWMON_C_INPUT | HWMON_C_MAX), 177 174 HWMON_CHANNEL_INFO(in, 178 175 HWMON_I_INPUT), 179 176 NULL ··· 195 192 struct device *dev = &client->dev; 196 193 struct device *hwmon_dev; 197 194 struct adm1177_state *st; 198 - u32 alert_threshold_ua; 195 + u64 alert_threshold_ua; 196 + u32 prop; 199 197 int ret; 200 198 201 199 st = devm_kzalloc(dev, sizeof(*st), GFP_KERNEL); ··· 212 208 if (device_property_read_u32(dev, "shunt-resistor-micro-ohms", 213 209 &st->r_sense_uohm)) 214 210 st->r_sense_uohm = 0; 215 - if (device_property_read_u32(dev, "adi,shutdown-threshold-microamp", 216 - &alert_threshold_ua)) { 217 - if (st->r_sense_uohm) 218 - /* 219 - * set maximum default value from datasheet based on 220 - * shunt-resistor 221 - */ 222 - alert_threshold_ua = div_u64(105840000000, 223 - st->r_sense_uohm); 224 - else 225 - alert_threshold_ua = 0; 211 + if (!device_property_read_u32(dev, "adi,shutdown-threshold-microamp", 212 + &prop)) { 213 + alert_threshold_ua = prop; 214 + } else if (st->r_sense_uohm) { 215 + /* 216 + * set maximum default value from datasheet based on 217 + * shunt-resistor 218 + */ 219 + alert_threshold_ua = div_u64(105840000000ULL, 220 + st->r_sense_uohm); 221 + } else { 222 + alert_threshold_ua = 0; 226 223 } 227 224 st->vrange_high = device_property_read_bool(dev, 228 225 "adi,vrange-high-enable"); 229 - if (alert_threshold_ua && st->r_sense_uohm) 230 - adm1177_write_alert_thr(st, alert_threshold_ua); 226 + if (alert_threshold_ua && st->r_sense_uohm) { 227 + ret = adm1177_write_alert_thr(st, alert_threshold_ua); 228 + if (ret) 229 + return ret; 230 + } 231 231 232 232 ret = adm1177_write_cmd(st, ADM1177_CMD_V_CONT | 233 233 ADM1177_CMD_I_CONT |
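The adm1177 rework above keeps the alert threshold in 64-bit microamps, clamps hwmon writes (in mA) against the datasheet full-scale value derived from the shunt, and uses that full-scale value as the default when the DT property is absent. A standalone check of the arithmetic; the constants mirror the hunk, while the 50 mOhm shunt is only an assumed example:

    /* Editorial sketch, not part of the patch: userspace arithmetic check
     * for the adm1177 threshold handling above. */
    #include <stdio.h>
    #include <stdint.h>

    int main(void)
    {
            uint32_t r_sense_uohm = 50000;            /* example 50 mOhm shunt */

            /* default threshold in uA, as computed at probe time */
            uint64_t default_ua = 105840000000ULL / r_sense_uohm;

            /* ceiling applied to hwmon writes, which arrive in mA */
            uint64_t max_ma = 105840000ULL / r_sense_uohm;
            int64_t write_ma = 5000;                  /* requested 5 A */
            if (write_ma < 0)
                    write_ma = 0;
            if ((uint64_t)write_ma > max_ma)
                    write_ma = max_ma;

            printf("default threshold: %llu uA\n", (unsigned long long)default_ua);
            printf("clamped write: %lld mA -> %llu uA\n",
                   (long long)write_ma, (unsigned long long)write_ma * 1000);
            return 0;
    }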
+2 -2
drivers/hwmon/peci/cputemp.c
··· 131 131 *val = priv->temp.target.tjmax; 132 132 break; 133 133 case crit_hyst_type: 134 - *val = priv->temp.target.tjmax - priv->temp.target.tcontrol; 134 + *val = priv->temp.target.tcontrol; 135 135 break; 136 136 default: 137 137 ret = -EOPNOTSUPP; ··· 319 319 { 320 320 const struct peci_cputemp *priv = data; 321 321 322 - if (channel > CPUTEMP_CHANNEL_NUMS) 322 + if (channel >= CPUTEMP_CHANNEL_NUMS) 323 323 return 0; 324 324 325 325 if (channel < channel_core)
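The cputemp fix above tightens the channel bound from > to >=: with CPUTEMP_CHANNEL_NUMS channels the valid indices run from 0 to CPUTEMP_CHANNEL_NUMS - 1, so the old test let an index one past the end through. A minimal standalone illustration of the off-by-one, with an arbitrary array size:

    /* Off-by-one illustration: with NUMS == 3 the valid indices are 0..2,
     * so the old "channel > NUMS" test would let channel == 3 through. */
    #include <stdio.h>

    #define NUMS 3

    int main(void)
    {
            int channel = NUMS;     /* one past the last valid index */

            printf("old check rejects: %s\n", channel > NUMS ? "yes" : "no");
            printf("new check rejects: %s\n", channel >= NUMS ? "yes" : "no");
            return 0;
    }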
+2 -1
drivers/hwmon/pmbus/ina233.c
··· 72 72 73 73 /* Adjust returned value to match VIN coefficients */ 74 74 /* VIN: 1.25 mV VSHUNT: 2.5 uV LSB */ 75 - ret = DIV_ROUND_CLOSEST(ret * 25, 12500); 75 + ret = clamp_val(DIV_ROUND_CLOSEST((s16)ret * 25, 12500), 76 + S16_MIN, S16_MAX) & 0xffff; 76 77 break; 77 78 default: 78 79 ret = -ENODATA;
+18 -3
drivers/hwmon/pmbus/isl68137.c
··· 96 96 int page, 97 97 char *buf) 98 98 { 99 - int val = pmbus_read_byte_data(client, page, PMBUS_OPERATION); 99 + int val; 100 + 101 + val = pmbus_lock_interruptible(client); 102 + if (val) 103 + return val; 104 + 105 + val = pmbus_read_byte_data(client, page, PMBUS_OPERATION); 106 + 107 + pmbus_unlock(client); 100 108 101 109 if (val < 0) 102 110 return val; ··· 126 118 127 119 op_val = result ? ISL68137_VOUT_AVS : 0; 128 120 121 + rc = pmbus_lock_interruptible(client); 122 + if (rc) 123 + return rc; 124 + 129 125 /* 130 126 * Writes to VOUT setpoint over AVSBus will persist after the VRM is 131 127 * switched to PMBus control. Switching back to AVSBus control ··· 141 129 rc = pmbus_read_word_data(client, page, 0xff, 142 130 PMBUS_VOUT_COMMAND); 143 131 if (rc < 0) 144 - return rc; 132 + goto unlock; 145 133 146 134 rc = pmbus_write_word_data(client, page, PMBUS_VOUT_COMMAND, 147 135 rc); 148 136 if (rc < 0) 149 - return rc; 137 + goto unlock; 150 138 } 151 139 152 140 rc = pmbus_update_byte_data(client, page, PMBUS_OPERATION, 153 141 ISL68137_VOUT_AVS, op_val); 142 + 143 + unlock: 144 + pmbus_unlock(client); 154 145 155 146 return (rc < 0) ? rc : count; 156 147 }
+154 -38
drivers/hwmon/pmbus/pmbus_core.c
··· 6 6 * Copyright (c) 2012 Guenter Roeck 7 7 */ 8 8 9 + #include <linux/atomic.h> 9 10 #include <linux/debugfs.h> 10 11 #include <linux/delay.h> 11 12 #include <linux/dcache.h> ··· 22 21 #include <linux/pmbus.h> 23 22 #include <linux/regulator/driver.h> 24 23 #include <linux/regulator/machine.h> 25 - #include <linux/of.h> 26 24 #include <linux/thermal.h> 25 + #include <linux/workqueue.h> 27 26 #include "pmbus.h" 28 27 29 28 /* ··· 112 111 struct pmbus_sensor *sensors; 113 112 114 113 struct mutex update_lock; 114 + 115 + #if IS_ENABLED(CONFIG_REGULATOR) 116 + atomic_t regulator_events[PMBUS_PAGES]; 117 + struct work_struct regulator_notify_work; 118 + #endif 115 119 116 120 bool has_status_word; /* device uses STATUS_WORD register */ 117 121 int (*read_status)(struct i2c_client *client, int page); ··· 1215 1209 return sysfs_emit(buf, "%d\n", val); 1216 1210 } 1217 1211 1212 + static ssize_t pmbus_show_zero(struct device *dev, 1213 + struct device_attribute *devattr, char *buf) 1214 + { 1215 + return sysfs_emit(buf, "0\n"); 1216 + } 1217 + 1218 1218 static ssize_t pmbus_show_sensor(struct device *dev, 1219 1219 struct device_attribute *devattr, char *buf) 1220 1220 { ··· 1419 1407 int reg, 1420 1408 enum pmbus_sensor_classes class, 1421 1409 bool update, bool readonly, 1422 - bool convert) 1410 + bool writeonly, bool convert) 1423 1411 { 1424 1412 struct pmbus_sensor *sensor; 1425 1413 struct device_attribute *a; ··· 1448 1436 sensor->data = -ENODATA; 1449 1437 pmbus_dev_attr_init(a, sensor->name, 1450 1438 readonly ? 0444 : 0644, 1451 - pmbus_show_sensor, pmbus_set_sensor); 1439 + writeonly ? pmbus_show_zero : pmbus_show_sensor, 1440 + pmbus_set_sensor); 1452 1441 1453 1442 if (pmbus_add_attribute(data, &a->attr)) 1454 1443 return NULL; ··· 1508 1495 struct pmbus_limit_attr { 1509 1496 u16 reg; /* Limit register */ 1510 1497 u16 sbit; /* Alarm attribute status bit */ 1511 - bool update; /* True if register needs updates */ 1512 - bool low; /* True if low limit; for limits with compare functions only */ 1498 + bool readonly:1; /* True if the attribute is read-only */ 1499 + bool writeonly:1; /* True if the attribute is write-only */ 1500 + bool update:1; /* True if register needs updates */ 1501 + bool low:1; /* True if low limit; for limits with compare functions only */ 1513 1502 const char *attr; /* Attribute name */ 1514 1503 const char *alarm; /* Alarm attribute name */ 1515 1504 }; ··· 1526 1511 u8 nlimit; /* # of limit registers */ 1527 1512 enum pmbus_sensor_classes class;/* sensor class */ 1528 1513 const char *label; /* sensor label */ 1529 - bool paged; /* true if paged sensor */ 1530 - bool update; /* true if update needed */ 1531 - bool compare; /* true if compare function needed */ 1514 + bool paged:1; /* true if paged sensor */ 1515 + bool update:1; /* true if update needed */ 1516 + bool compare:1; /* true if compare function needed */ 1532 1517 u32 func; /* sensor mask */ 1533 1518 u32 sfunc; /* sensor status mask */ 1534 1519 int sreg; /* status register */ ··· 1559 1544 curr = pmbus_add_sensor(data, name, l->attr, index, 1560 1545 page, 0xff, l->reg, attr->class, 1561 1546 attr->update || l->update, 1562 - false, true); 1547 + l->readonly, l->writeonly, true); 1563 1548 if (!curr) 1564 1549 return -ENOMEM; 1565 1550 if (l->sbit && (info->func[page] & attr->sfunc)) { ··· 1599 1584 return ret; 1600 1585 } 1601 1586 base = pmbus_add_sensor(data, name, "input", index, page, phase, 1602 - attr->reg, attr->class, true, true, true); 1587 + attr->reg, attr->class, true, true, 
false, true); 1603 1588 if (!base) 1604 1589 return -ENOMEM; 1605 1590 /* No limit and alarm attributes for phase specific sensors */ ··· 1722 1707 }, { 1723 1708 .reg = PMBUS_VIRT_READ_VIN_AVG, 1724 1709 .update = true, 1710 + .readonly = true, 1725 1711 .attr = "average", 1726 1712 }, { 1727 1713 .reg = PMBUS_VIRT_READ_VIN_MIN, 1728 1714 .update = true, 1715 + .readonly = true, 1729 1716 .attr = "lowest", 1730 1717 }, { 1731 1718 .reg = PMBUS_VIRT_READ_VIN_MAX, 1732 1719 .update = true, 1720 + .readonly = true, 1733 1721 .attr = "highest", 1734 1722 }, { 1735 1723 .reg = PMBUS_VIRT_RESET_VIN_HISTORY, 1724 + .writeonly = true, 1736 1725 .attr = "reset_history", 1737 1726 }, { 1738 1727 .reg = PMBUS_MFR_VIN_MIN, 1728 + .readonly = true, 1739 1729 .attr = "rated_min", 1740 1730 }, { 1741 1731 .reg = PMBUS_MFR_VIN_MAX, 1732 + .readonly = true, 1742 1733 .attr = "rated_max", 1743 1734 }, 1744 1735 }; ··· 1797 1776 }, { 1798 1777 .reg = PMBUS_VIRT_READ_VOUT_AVG, 1799 1778 .update = true, 1779 + .readonly = true, 1800 1780 .attr = "average", 1801 1781 }, { 1802 1782 .reg = PMBUS_VIRT_READ_VOUT_MIN, 1803 1783 .update = true, 1784 + .readonly = true, 1804 1785 .attr = "lowest", 1805 1786 }, { 1806 1787 .reg = PMBUS_VIRT_READ_VOUT_MAX, 1807 1788 .update = true, 1789 + .readonly = true, 1808 1790 .attr = "highest", 1809 1791 }, { 1810 1792 .reg = PMBUS_VIRT_RESET_VOUT_HISTORY, 1793 + .writeonly = true, 1811 1794 .attr = "reset_history", 1812 1795 }, { 1813 1796 .reg = PMBUS_MFR_VOUT_MIN, 1797 + .readonly = true, 1814 1798 .attr = "rated_min", 1815 1799 }, { 1816 1800 .reg = PMBUS_MFR_VOUT_MAX, 1801 + .readonly = true, 1817 1802 .attr = "rated_max", 1818 1803 }, 1819 1804 }; ··· 1879 1852 }, { 1880 1853 .reg = PMBUS_VIRT_READ_IIN_AVG, 1881 1854 .update = true, 1855 + .readonly = true, 1882 1856 .attr = "average", 1883 1857 }, { 1884 1858 .reg = PMBUS_VIRT_READ_IIN_MIN, 1885 1859 .update = true, 1860 + .readonly = true, 1886 1861 .attr = "lowest", 1887 1862 }, { 1888 1863 .reg = PMBUS_VIRT_READ_IIN_MAX, 1889 1864 .update = true, 1865 + .readonly = true, 1890 1866 .attr = "highest", 1891 1867 }, { 1892 1868 .reg = PMBUS_VIRT_RESET_IIN_HISTORY, 1869 + .writeonly = true, 1893 1870 .attr = "reset_history", 1894 1871 }, { 1895 1872 .reg = PMBUS_MFR_IIN_MAX, 1873 + .readonly = true, 1896 1874 .attr = "rated_max", 1897 1875 }, 1898 1876 }; ··· 1921 1889 }, { 1922 1890 .reg = PMBUS_VIRT_READ_IOUT_AVG, 1923 1891 .update = true, 1892 + .readonly = true, 1924 1893 .attr = "average", 1925 1894 }, { 1926 1895 .reg = PMBUS_VIRT_READ_IOUT_MIN, 1927 1896 .update = true, 1897 + .readonly = true, 1928 1898 .attr = "lowest", 1929 1899 }, { 1930 1900 .reg = PMBUS_VIRT_READ_IOUT_MAX, 1931 1901 .update = true, 1902 + .readonly = true, 1932 1903 .attr = "highest", 1933 1904 }, { 1934 1905 .reg = PMBUS_VIRT_RESET_IOUT_HISTORY, 1906 + .writeonly = true, 1935 1907 .attr = "reset_history", 1936 1908 }, { 1937 1909 .reg = PMBUS_MFR_IOUT_MAX, 1910 + .readonly = true, 1938 1911 .attr = "rated_max", 1939 1912 }, 1940 1913 }; ··· 1980 1943 }, { 1981 1944 .reg = PMBUS_VIRT_READ_PIN_AVG, 1982 1945 .update = true, 1946 + .readonly = true, 1983 1947 .attr = "average", 1984 1948 }, { 1985 1949 .reg = PMBUS_VIRT_READ_PIN_MIN, 1986 1950 .update = true, 1951 + .readonly = true, 1987 1952 .attr = "input_lowest", 1988 1953 }, { 1989 1954 .reg = PMBUS_VIRT_READ_PIN_MAX, 1990 1955 .update = true, 1956 + .readonly = true, 1991 1957 .attr = "input_highest", 1992 1958 }, { 1993 1959 .reg = PMBUS_VIRT_RESET_PIN_HISTORY, 1960 + .writeonly = true, 
1994 1961 .attr = "reset_history", 1995 1962 }, { 1996 1963 .reg = PMBUS_MFR_PIN_MAX, 1964 + .readonly = true, 1997 1965 .attr = "rated_max", 1998 1966 }, 1999 1967 }; ··· 2022 1980 }, { 2023 1981 .reg = PMBUS_VIRT_READ_POUT_AVG, 2024 1982 .update = true, 1983 + .readonly = true, 2025 1984 .attr = "average", 2026 1985 }, { 2027 1986 .reg = PMBUS_VIRT_READ_POUT_MIN, 2028 1987 .update = true, 1988 + .readonly = true, 2029 1989 .attr = "input_lowest", 2030 1990 }, { 2031 1991 .reg = PMBUS_VIRT_READ_POUT_MAX, 2032 1992 .update = true, 1993 + .readonly = true, 2033 1994 .attr = "input_highest", 2034 1995 }, { 2035 1996 .reg = PMBUS_VIRT_RESET_POUT_HISTORY, 1997 + .writeonly = true, 2036 1998 .attr = "reset_history", 2037 1999 }, { 2038 2000 .reg = PMBUS_MFR_POUT_MAX, 2001 + .readonly = true, 2039 2002 .attr = "rated_max", 2040 2003 }, 2041 2004 }; ··· 2096 2049 .sbit = PB_TEMP_OT_FAULT, 2097 2050 }, { 2098 2051 .reg = PMBUS_VIRT_READ_TEMP_MIN, 2052 + .readonly = true, 2099 2053 .attr = "lowest", 2100 2054 }, { 2101 2055 .reg = PMBUS_VIRT_READ_TEMP_AVG, 2056 + .readonly = true, 2102 2057 .attr = "average", 2103 2058 }, { 2104 2059 .reg = PMBUS_VIRT_READ_TEMP_MAX, 2060 + .readonly = true, 2105 2061 .attr = "highest", 2106 2062 }, { 2107 2063 .reg = PMBUS_VIRT_RESET_TEMP_HISTORY, 2064 + .writeonly = true, 2108 2065 .attr = "reset_history", 2109 2066 }, { 2110 2067 .reg = PMBUS_MFR_MAX_TEMP_1, 2068 + .readonly = true, 2111 2069 .attr = "rated_max", 2112 2070 }, 2113 2071 }; ··· 2142 2090 .sbit = PB_TEMP_OT_FAULT, 2143 2091 }, { 2144 2092 .reg = PMBUS_VIRT_READ_TEMP2_MIN, 2093 + .readonly = true, 2145 2094 .attr = "lowest", 2146 2095 }, { 2147 2096 .reg = PMBUS_VIRT_READ_TEMP2_AVG, 2097 + .readonly = true, 2148 2098 .attr = "average", 2149 2099 }, { 2150 2100 .reg = PMBUS_VIRT_READ_TEMP2_MAX, 2101 + .readonly = true, 2151 2102 .attr = "highest", 2152 2103 }, { 2153 2104 .reg = PMBUS_VIRT_RESET_TEMP2_HISTORY, 2105 + .writeonly = true, 2154 2106 .attr = "reset_history", 2155 2107 }, { 2156 2108 .reg = PMBUS_MFR_MAX_TEMP_2, 2109 + .readonly = true, 2157 2110 .attr = "rated_max", 2158 2111 }, 2159 2112 }; ··· 2188 2131 .sbit = PB_TEMP_OT_FAULT, 2189 2132 }, { 2190 2133 .reg = PMBUS_MFR_MAX_TEMP_3, 2134 + .readonly = true, 2191 2135 .attr = "rated_max", 2192 2136 }, 2193 2137 }; ··· 2272 2214 2273 2215 sensor = pmbus_add_sensor(data, "fan", "target", index, page, 2274 2216 0xff, PMBUS_VIRT_FAN_TARGET_1 + id, PSC_FAN, 2275 - false, false, true); 2217 + false, false, false, true); 2276 2218 2277 2219 if (!sensor) 2278 2220 return -ENOMEM; ··· 2283 2225 2284 2226 sensor = pmbus_add_sensor(data, "pwm", NULL, index, page, 2285 2227 0xff, PMBUS_VIRT_PWM_1 + id, PSC_PWM, 2286 - false, false, true); 2228 + false, false, false, true); 2287 2229 2288 2230 if (!sensor) 2289 2231 return -ENOMEM; 2290 2232 2291 2233 sensor = pmbus_add_sensor(data, "pwm", "enable", index, page, 2292 2234 0xff, PMBUS_VIRT_PWM_ENABLE_1 + id, PSC_PWM, 2293 - true, false, false); 2235 + true, false, false, false); 2294 2236 2295 2237 if (!sensor) 2296 2238 return -ENOMEM; ··· 2332 2274 2333 2275 if (pmbus_add_sensor(data, "fan", "input", index, 2334 2276 page, 0xff, pmbus_fan_registers[f], 2335 - PSC_FAN, true, true, true) == NULL) 2277 + PSC_FAN, true, true, false, true) == NULL) 2336 2278 return -ENOMEM; 2337 2279 2338 2280 /* Fan control */ ··· 3234 3176 .class = PSC_VOLTAGE_OUT, 3235 3177 .convert = true, 3236 3178 }; 3179 + int ret; 3237 3180 3181 + mutex_lock(&data->update_lock); 3238 3182 s.data = _pmbus_read_word_data(client, 
s.page, 0xff, PMBUS_READ_VOUT); 3239 - if (s.data < 0) 3240 - return s.data; 3183 + if (s.data < 0) { 3184 + ret = s.data; 3185 + goto unlock; 3186 + } 3241 3187 3242 - return (int)pmbus_reg2data(data, &s) * 1000; /* unit is uV */ 3188 + ret = (int)pmbus_reg2data(data, &s) * 1000; /* unit is uV */ 3189 + unlock: 3190 + mutex_unlock(&data->update_lock); 3191 + return ret; 3243 3192 } 3244 3193 3245 3194 static int pmbus_regulator_set_voltage(struct regulator_dev *rdev, int min_uv, ··· 3263 3198 }; 3264 3199 int val = DIV_ROUND_CLOSEST(min_uv, 1000); /* convert to mV */ 3265 3200 int low, high; 3201 + int ret; 3266 3202 3267 3203 *selector = 0; 3268 3204 3205 + mutex_lock(&data->update_lock); 3269 3206 low = pmbus_regulator_get_low_margin(client, s.page); 3270 - if (low < 0) 3271 - return low; 3207 + if (low < 0) { 3208 + ret = low; 3209 + goto unlock; 3210 + } 3272 3211 3273 3212 high = pmbus_regulator_get_high_margin(client, s.page); 3274 - if (high < 0) 3275 - return high; 3213 + if (high < 0) { 3214 + ret = high; 3215 + goto unlock; 3216 + } 3276 3217 3277 3218 /* Make sure we are within margins */ 3278 3219 if (low > val) ··· 3288 3217 3289 3218 val = pmbus_data2reg(data, &s, val); 3290 3219 3291 - return _pmbus_write_word_data(client, s.page, PMBUS_VOUT_COMMAND, (u16)val); 3220 + ret = _pmbus_write_word_data(client, s.page, PMBUS_VOUT_COMMAND, (u16)val); 3221 + unlock: 3222 + mutex_unlock(&data->update_lock); 3223 + return ret; 3292 3224 } 3293 3225 3294 3226 static int pmbus_regulator_list_voltage(struct regulator_dev *rdev, ··· 3301 3227 struct i2c_client *client = to_i2c_client(dev->parent); 3302 3228 struct pmbus_data *data = i2c_get_clientdata(client); 3303 3229 int val, low, high; 3230 + int ret; 3304 3231 3305 3232 if (data->flags & PMBUS_VOUT_PROTECTED) 3306 3233 return 0; ··· 3314 3239 val = DIV_ROUND_CLOSEST(rdev->desc->min_uV + 3315 3240 (rdev->desc->uV_step * selector), 1000); /* convert to mV */ 3316 3241 3242 + mutex_lock(&data->update_lock); 3243 + 3317 3244 low = pmbus_regulator_get_low_margin(client, rdev_get_id(rdev)); 3318 - if (low < 0) 3319 - return low; 3245 + if (low < 0) { 3246 + ret = low; 3247 + goto unlock; 3248 + } 3320 3249 3321 3250 high = pmbus_regulator_get_high_margin(client, rdev_get_id(rdev)); 3322 - if (high < 0) 3323 - return high; 3251 + if (high < 0) { 3252 + ret = high; 3253 + goto unlock; 3254 + } 3324 3255 3325 - if (val >= low && val <= high) 3326 - return val * 1000; /* unit is uV */ 3256 + if (val >= low && val <= high) { 3257 + ret = val * 1000; /* unit is uV */ 3258 + goto unlock; 3259 + } 3327 3260 3328 - return 0; 3261 + ret = 0; 3262 + unlock: 3263 + mutex_unlock(&data->update_lock); 3264 + return ret; 3329 3265 } 3330 3266 3331 3267 const struct regulator_ops pmbus_regulator_ops = { ··· 3367 3281 } 3368 3282 EXPORT_SYMBOL_NS_GPL(pmbus_regulator_init_cb, "PMBUS"); 3369 3283 3284 + static void pmbus_regulator_notify_work_cancel(void *data) 3285 + { 3286 + struct pmbus_data *pdata = data; 3287 + 3288 + cancel_work_sync(&pdata->regulator_notify_work); 3289 + } 3290 + 3291 + static void pmbus_regulator_notify_worker(struct work_struct *work) 3292 + { 3293 + struct pmbus_data *data = 3294 + container_of(work, struct pmbus_data, regulator_notify_work); 3295 + int i, j; 3296 + 3297 + for (i = 0; i < data->info->pages; i++) { 3298 + int event; 3299 + 3300 + event = atomic_xchg(&data->regulator_events[i], 0); 3301 + if (!event) 3302 + continue; 3303 + 3304 + for (j = 0; j < data->info->num_regulators; j++) { 3305 + if (i == 
rdev_get_id(data->rdevs[j])) { 3306 + regulator_notifier_call_chain(data->rdevs[j], 3307 + event, NULL); 3308 + break; 3309 + } 3310 + } 3311 + } 3312 + } 3313 + 3370 3314 static int pmbus_regulator_register(struct pmbus_data *data) 3371 3315 { 3372 3316 struct device *dev = data->dev; 3373 3317 const struct pmbus_driver_info *info = data->info; 3374 3318 const struct pmbus_platform_data *pdata = dev_get_platdata(dev); 3375 - int i; 3319 + int i, ret; 3376 3320 3377 3321 data->rdevs = devm_kzalloc(dev, sizeof(struct regulator_dev *) * info->num_regulators, 3378 3322 GFP_KERNEL); ··· 3426 3310 info->reg_desc[i].name); 3427 3311 } 3428 3312 3313 + INIT_WORK(&data->regulator_notify_work, pmbus_regulator_notify_worker); 3314 + 3315 + ret = devm_add_action_or_reset(dev, pmbus_regulator_notify_work_cancel, data); 3316 + if (ret) 3317 + return ret; 3318 + 3429 3319 return 0; 3430 3320 } 3431 3321 3432 3322 static void pmbus_regulator_notify(struct pmbus_data *data, int page, int event) 3433 3323 { 3434 - int j; 3435 - 3436 - for (j = 0; j < data->info->num_regulators; j++) { 3437 - if (page == rdev_get_id(data->rdevs[j])) { 3438 - regulator_notifier_call_chain(data->rdevs[j], event, NULL); 3439 - break; 3440 - } 3441 - } 3324 + atomic_or(event, &data->regulator_events[page]); 3325 + schedule_work(&data->regulator_notify_work); 3442 3326 } 3443 3327 #else 3444 3328 static int pmbus_regulator_register(struct pmbus_data *data)
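The pmbus regulator-notification rework above no longer calls regulator_notifier_call_chain() directly from the status path; each page ORs its event bits into an atomic word and a work item later drains them with atomic_xchg(). The same accumulate-and-drain shape reduced to portable C11 atomics, with a thread standing in for the workqueue and invented bit values:

    /* Editorial sketch of the accumulate-and-drain pattern used by the
     * pmbus regulator notify rework. Producers OR event bits into an
     * atomic word; a worker exchanges the word with zero and dispatches
     * whatever it collected. Compile with -pthread. */
    #include <pthread.h>
    #include <stdatomic.h>
    #include <stdio.h>

    static atomic_uint events;      /* pending event bits for one "page" */

    static void notify(unsigned int ev)     /* stand-in for the notifier chain */
    {
            printf("dispatch events 0x%x\n", ev);
    }

    static void *worker(void *arg)
    {
            /* drain once; the kernel version re-runs whenever work is queued */
            unsigned int ev = atomic_exchange(&events, 0);

            if (ev)
                    notify(ev);
            return NULL;
    }

    int main(void)
    {
            pthread_t t;

            atomic_fetch_or(&events, 0x1);  /* producer, e.g. an alert handler */
            atomic_fetch_or(&events, 0x4);

            pthread_create(&t, NULL, worker, NULL); /* "schedule_work()" */
            pthread_join(&t, NULL);
            return 0;
    }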
+5 -6
drivers/i2c/busses/i2c-designware-amdisp.c
··· 7 7 8 8 #include <linux/module.h> 9 9 #include <linux/platform_device.h> 10 + #include <linux/pm_domain.h> 10 11 #include <linux/pm_runtime.h> 11 12 #include <linux/soc/amd/isp4_misc.h> 12 13 ··· 77 76 78 77 device_enable_async_suspend(&pdev->dev); 79 78 80 - pm_runtime_enable(&pdev->dev); 81 - pm_runtime_get_sync(&pdev->dev); 82 - 79 + dev_pm_genpd_resume(&pdev->dev); 83 80 ret = i2c_dw_probe(isp_i2c_dev); 84 81 if (ret) { 85 82 dev_err_probe(&pdev->dev, ret, "i2c_dw_probe failed\n"); 86 83 goto error_release_rpm; 87 84 } 88 - 89 - pm_runtime_put_sync(&pdev->dev); 85 + dev_pm_genpd_suspend(&pdev->dev); 86 + pm_runtime_set_suspended(&pdev->dev); 87 + pm_runtime_enable(&pdev->dev); 90 88 91 89 return 0; 92 90 93 91 error_release_rpm: 94 92 amd_isp_dw_i2c_plat_pm_cleanup(isp_i2c_dev); 95 - pm_runtime_put_sync(&pdev->dev); 96 93 return ret; 97 94 } 98 95
+32 -19
drivers/i2c/busses/i2c-imx.c
··· 1018 1018 return 0; 1019 1019 } 1020 1020 1021 - static inline void i2c_imx_isr_read_continue(struct imx_i2c_struct *i2c_imx) 1021 + static inline enum imx_i2c_state i2c_imx_isr_read_continue(struct imx_i2c_struct *i2c_imx) 1022 1022 { 1023 + enum imx_i2c_state next_state = IMX_I2C_STATE_READ_CONTINUE; 1023 1024 unsigned int temp; 1024 1025 1025 1026 if ((i2c_imx->msg->len - 1) == i2c_imx->msg_buf_idx) { ··· 1034 1033 i2c_imx->stopped = 1; 1035 1034 temp &= ~(I2CR_MSTA | I2CR_MTX); 1036 1035 imx_i2c_write_reg(temp, i2c_imx, IMX_I2C_I2CR); 1037 - } else { 1038 - /* 1039 - * For i2c master receiver repeat restart operation like: 1040 - * read -> repeat MSTA -> read/write 1041 - * The controller must set MTX before read the last byte in 1042 - * the first read operation, otherwise the first read cost 1043 - * one extra clock cycle. 1044 - */ 1045 - temp = imx_i2c_read_reg(i2c_imx, IMX_I2C_I2CR); 1046 - temp |= I2CR_MTX; 1047 - imx_i2c_write_reg(temp, i2c_imx, IMX_I2C_I2CR); 1036 + 1037 + return IMX_I2C_STATE_DONE; 1048 1038 } 1039 + /* 1040 + * For i2c master receiver repeat restart operation like: 1041 + * read -> repeat MSTA -> read/write 1042 + * The controller must set MTX before read the last byte in 1043 + * the first read operation, otherwise the first read cost 1044 + * one extra clock cycle. 1045 + */ 1046 + temp = imx_i2c_read_reg(i2c_imx, IMX_I2C_I2CR); 1047 + temp |= I2CR_MTX; 1048 + imx_i2c_write_reg(temp, i2c_imx, IMX_I2C_I2CR); 1049 + next_state = IMX_I2C_STATE_DONE; 1049 1050 } else if (i2c_imx->msg_buf_idx == (i2c_imx->msg->len - 2)) { 1050 1051 temp = imx_i2c_read_reg(i2c_imx, IMX_I2C_I2CR); 1051 1052 temp |= I2CR_TXAK; ··· 1055 1052 } 1056 1053 1057 1054 i2c_imx->msg->buf[i2c_imx->msg_buf_idx++] = imx_i2c_read_reg(i2c_imx, IMX_I2C_I2DR); 1055 + return next_state; 1058 1056 } 1059 1057 1060 1058 static inline void i2c_imx_isr_read_block_data_len(struct imx_i2c_struct *i2c_imx) ··· 1092 1088 break; 1093 1089 1094 1090 case IMX_I2C_STATE_READ_CONTINUE: 1095 - i2c_imx_isr_read_continue(i2c_imx); 1096 - if (i2c_imx->msg_buf_idx == i2c_imx->msg->len) { 1097 - i2c_imx->state = IMX_I2C_STATE_DONE; 1091 + i2c_imx->state = i2c_imx_isr_read_continue(i2c_imx); 1092 + if (i2c_imx->state == IMX_I2C_STATE_DONE) 1098 1093 wake_up(&i2c_imx->queue); 1099 - } 1100 1094 break; 1101 1095 1102 1096 case IMX_I2C_STATE_READ_BLOCK_DATA: ··· 1492 1490 bool is_lastmsg) 1493 1491 { 1494 1492 int block_data = msgs->flags & I2C_M_RECV_LEN; 1493 + int ret = 0; 1495 1494 1496 1495 dev_dbg(&i2c_imx->adapter.dev, 1497 1496 "<%s> write slave address: addr=0x%x\n", ··· 1525 1522 dev_err(&i2c_imx->adapter.dev, "<%s> read timedout\n", __func__); 1526 1523 return -ETIMEDOUT; 1527 1524 } 1528 - if (!i2c_imx->stopped) 1529 - return i2c_imx_bus_busy(i2c_imx, 0, false); 1525 + if (i2c_imx->is_lastmsg) { 1526 + if (!i2c_imx->stopped) 1527 + ret = i2c_imx_bus_busy(i2c_imx, 0, false); 1528 + /* 1529 + * Only read the last byte of the last message after the bus is 1530 + * not busy. Else the controller generates another clock which 1531 + * might confuse devices. 1532 + */ 1533 + if (!ret) 1534 + i2c_imx->msg->buf[i2c_imx->msg_buf_idx++] = imx_i2c_read_reg(i2c_imx, 1535 + IMX_I2C_I2DR); 1536 + } 1530 1537 1531 - return 0; 1538 + return ret; 1532 1539 } 1533 1540 1534 1541 static int i2c_imx_xfer_common(struct i2c_adapter *adapter,
+30 -13
drivers/infiniband/core/rw.c
··· 608 608 if (rdma_rw_io_needs_mr(qp->device, port_num, dir, sg_cnt)) { 609 609 ret = rdma_rw_init_mr_wrs(ctx, qp, port_num, sg, sg_cnt, 610 610 sg_offset, remote_addr, rkey, dir); 611 - } else if (sg_cnt > 1) { 612 - ret = rdma_rw_init_map_wrs(ctx, qp, sg, sg_cnt, sg_offset, 613 - remote_addr, rkey, dir); 614 - } else { 615 - ret = rdma_rw_init_single_wr(ctx, qp, sg, sg_offset, 616 - remote_addr, rkey, dir); 611 + /* 612 + * If MR init succeeded or failed for a reason other 613 + * than pool exhaustion, that result is final. 614 + * 615 + * Pool exhaustion (-EAGAIN) from the max_sgl_rd 616 + * optimization is recoverable: fall back to 617 + * direct SGE posting. iWARP and force_mr require 618 + * MRs unconditionally, so -EAGAIN is terminal. 619 + */ 620 + if (ret != -EAGAIN || 621 + rdma_protocol_iwarp(qp->device, port_num) || 622 + unlikely(rdma_rw_force_mr)) 623 + goto out; 617 624 } 618 625 626 + if (sg_cnt > 1) 627 + ret = rdma_rw_init_map_wrs(ctx, qp, sg, sg_cnt, sg_offset, 628 + remote_addr, rkey, dir); 629 + else 630 + ret = rdma_rw_init_single_wr(ctx, qp, sg, sg_offset, 631 + remote_addr, rkey, dir); 632 + 633 + out: 619 634 if (ret < 0) 620 635 goto out_unmap_sg; 621 636 return ret; ··· 701 686 return ret; 702 687 703 688 /* 704 - * IOVA mapping not available. Check if MR registration provides 705 - * better performance than multiple SGE entries. 689 + * IOVA not available; fall back to the map_wrs path, which maps 690 + * each bvec as a direct SGE. This is always correct: the MR path 691 + * is a throughput optimization, not a correctness requirement. 692 + * (iWARP, which does require MRs, is handled by the check above.) 693 + * 694 + * The rdma_rw_io_needs_mr() gate is not used here because nr_bvec 695 + * is a raw page count that overstates DMA entry demand -- the bvec 696 + * caller has no post-DMA-coalescing segment count, and feeding the 697 + * inflated count into the MR path exhausts the pool on RDMA READs. 706 698 */ 707 - if (rdma_rw_io_needs_mr(dev, port_num, dir, nr_bvec)) 708 - return rdma_rw_init_mr_wrs_bvec(ctx, qp, port_num, bvecs, 709 - nr_bvec, &iter, remote_addr, 710 - rkey, dir); 711 - 712 699 return rdma_rw_init_map_wrs_bvec(ctx, qp, bvecs, nr_bvec, &iter, 713 700 remote_addr, rkey, dir); 714 701 }
+3 -2
drivers/infiniband/core/umem.c
··· 55 55 56 56 if (dirty) 57 57 ib_dma_unmap_sgtable_attrs(dev, &umem->sgt_append.sgt, 58 - DMA_BIDIRECTIONAL, 0); 58 + DMA_BIDIRECTIONAL, 59 + DMA_ATTR_REQUIRE_COHERENT); 59 60 60 61 for_each_sgtable_sg(&umem->sgt_append.sgt, sg, i) { 61 62 unpin_user_page_range_dirty_lock(sg_page(sg), ··· 170 169 unsigned long lock_limit; 171 170 unsigned long new_pinned; 172 171 unsigned long cur_base; 173 - unsigned long dma_attr = 0; 172 + unsigned long dma_attr = DMA_ATTR_REQUIRE_COHERENT; 174 173 struct mm_struct *mm; 175 174 unsigned long npages; 176 175 int pinned, ret;
+9 -5
drivers/infiniband/hw/bng_re/bng_dev.c
··· 210 210 return rc; 211 211 } 212 212 213 - static void bng_re_query_hwrm_version(struct bng_re_dev *rdev) 213 + static int bng_re_query_hwrm_version(struct bng_re_dev *rdev) 214 214 { 215 215 struct bnge_auxr_dev *aux_dev = rdev->aux_dev; 216 216 struct hwrm_ver_get_output ver_get_resp = {}; ··· 230 230 if (rc) { 231 231 ibdev_err(&rdev->ibdev, "Failed to query HW version, rc = 0x%x", 232 232 rc); 233 - return; 233 + return rc; 234 234 } 235 235 236 236 cctx = rdev->chip_ctx; ··· 244 244 245 245 if (!cctx->hwrm_cmd_max_timeout) 246 246 cctx->hwrm_cmd_max_timeout = BNG_ROCE_FW_MAX_TIMEOUT; 247 + 248 + return 0; 247 249 } 248 250 249 251 static void bng_re_dev_uninit(struct bng_re_dev *rdev) ··· 308 306 goto msix_ctx_fail; 309 307 } 310 308 311 - bng_re_query_hwrm_version(rdev); 309 + rc = bng_re_query_hwrm_version(rdev); 310 + if (rc) 311 + goto destroy_chip_ctx; 312 312 313 313 rc = bng_re_alloc_fw_channel(&rdev->bng_res, &rdev->rcfw); 314 314 if (rc) { 315 315 ibdev_err(&rdev->ibdev, 316 316 "Failed to allocate RCFW Channel: %#x\n", rc); 317 - goto alloc_fw_chl_fail; 317 + goto destroy_chip_ctx; 318 318 } 319 319 320 320 /* Allocate nq record memory */ ··· 395 391 kfree(rdev->nqr); 396 392 nq_alloc_fail: 397 393 bng_re_free_rcfw_channel(&rdev->rcfw); 398 - alloc_fw_chl_fail: 394 + destroy_chip_ctx: 399 395 bng_re_destroy_chip_ctx(rdev); 400 396 msix_ctx_fail: 401 397 bnge_unregister_dev(rdev->aux_dev);
+40 -48
drivers/infiniband/hw/efa/efa_com.c
··· 1 1 // SPDX-License-Identifier: GPL-2.0 OR BSD-2-Clause 2 2 /* 3 - * Copyright 2018-2025 Amazon.com, Inc. or its affiliates. All rights reserved. 3 + * Copyright 2018-2026 Amazon.com, Inc. or its affiliates. All rights reserved. 4 4 */ 5 5 6 6 #include <linux/log2.h> ··· 310 310 return &aq->comp_ctx[ctx_id]; 311 311 } 312 312 313 - static struct efa_comp_ctx *__efa_com_submit_admin_cmd(struct efa_com_admin_queue *aq, 314 - struct efa_admin_aq_entry *cmd, 315 - size_t cmd_size_in_bytes, 316 - struct efa_admin_acq_entry *comp, 317 - size_t comp_size_in_bytes) 313 + static void __efa_com_submit_admin_cmd(struct efa_com_admin_queue *aq, 314 + struct efa_comp_ctx *comp_ctx, 315 + struct efa_admin_aq_entry *cmd, 316 + size_t cmd_size_in_bytes, 317 + struct efa_admin_acq_entry *comp, 318 + size_t comp_size_in_bytes) 318 319 { 319 320 struct efa_admin_aq_entry *aqe; 320 - struct efa_comp_ctx *comp_ctx; 321 321 u16 queue_size_mask; 322 322 u16 cmd_id; 323 323 u16 ctx_id; 324 324 u16 pi; 325 - 326 - comp_ctx = efa_com_alloc_comp_ctx(aq); 327 - if (!comp_ctx) 328 - return ERR_PTR(-EINVAL); 329 325 330 326 queue_size_mask = aq->depth - 1; 331 327 pi = aq->sq.pc & queue_size_mask; ··· 356 360 357 361 /* barrier not needed in case of writel */ 358 362 writel(aq->sq.pc, aq->sq.db_addr); 359 - 360 - return comp_ctx; 361 363 } 362 364 363 365 static inline int efa_com_init_comp_ctxt(struct efa_com_admin_queue *aq) ··· 388 394 return 0; 389 395 } 390 396 391 - static struct efa_comp_ctx *efa_com_submit_admin_cmd(struct efa_com_admin_queue *aq, 392 - struct efa_admin_aq_entry *cmd, 393 - size_t cmd_size_in_bytes, 394 - struct efa_admin_acq_entry *comp, 395 - size_t comp_size_in_bytes) 397 + static int efa_com_submit_admin_cmd(struct efa_com_admin_queue *aq, 398 + struct efa_comp_ctx *comp_ctx, 399 + struct efa_admin_aq_entry *cmd, 400 + size_t cmd_size_in_bytes, 401 + struct efa_admin_acq_entry *comp, 402 + size_t comp_size_in_bytes) 396 403 { 397 - struct efa_comp_ctx *comp_ctx; 398 - 399 404 spin_lock(&aq->sq.lock); 400 405 if (!test_bit(EFA_AQ_STATE_RUNNING_BIT, &aq->state)) { 401 406 ibdev_err_ratelimited(aq->efa_dev, "Admin queue is closed\n"); 402 407 spin_unlock(&aq->sq.lock); 403 - return ERR_PTR(-ENODEV); 408 + return -ENODEV; 404 409 } 405 410 406 - comp_ctx = __efa_com_submit_admin_cmd(aq, cmd, cmd_size_in_bytes, comp, 407 - comp_size_in_bytes); 411 + __efa_com_submit_admin_cmd(aq, comp_ctx, cmd, cmd_size_in_bytes, comp, 412 + comp_size_in_bytes); 408 413 spin_unlock(&aq->sq.lock); 409 - if (IS_ERR(comp_ctx)) 410 - clear_bit(EFA_AQ_STATE_RUNNING_BIT, &aq->state); 411 414 412 - return comp_ctx; 415 + return 0; 413 416 } 414 417 415 418 static int efa_com_handle_single_admin_completion(struct efa_com_admin_queue *aq, ··· 503 512 { 504 513 unsigned long timeout; 505 514 unsigned long flags; 506 - int err; 507 515 508 516 timeout = jiffies + usecs_to_jiffies(aq->completion_timeout); 509 517 ··· 522 532 atomic64_inc(&aq->stats.no_completion); 523 533 524 534 clear_bit(EFA_AQ_STATE_RUNNING_BIT, &aq->state); 525 - err = -ETIME; 526 - goto out; 535 + return -ETIME; 527 536 } 528 537 529 538 msleep(aq->poll_interval); 530 539 } 531 540 532 - err = efa_com_comp_status_to_errno(comp_ctx->user_cqe->acq_common_descriptor.status); 533 - out: 534 - efa_com_dealloc_comp_ctx(aq, comp_ctx); 535 - return err; 541 + return efa_com_comp_status_to_errno( 542 + comp_ctx->user_cqe->acq_common_descriptor.status); 536 543 } 537 544 538 545 static int efa_com_wait_and_process_admin_cq_interrupts(struct efa_comp_ctx 
*comp_ctx, 539 546 struct efa_com_admin_queue *aq) 540 547 { 541 548 unsigned long flags; 542 - int err; 543 549 544 550 wait_for_completion_timeout(&comp_ctx->wait_event, 545 551 usecs_to_jiffies(aq->completion_timeout)); ··· 571 585 aq->cq.cc); 572 586 573 587 clear_bit(EFA_AQ_STATE_RUNNING_BIT, &aq->state); 574 - err = -ETIME; 575 - goto out; 588 + return -ETIME; 576 589 } 577 590 578 - err = efa_com_comp_status_to_errno(comp_ctx->user_cqe->acq_common_descriptor.status); 579 - out: 580 - efa_com_dealloc_comp_ctx(aq, comp_ctx); 581 - return err; 591 + return efa_com_comp_status_to_errno( 592 + comp_ctx->user_cqe->acq_common_descriptor.status); 582 593 } 583 594 584 595 /* ··· 625 642 ibdev_dbg(aq->efa_dev, "%s (opcode %d)\n", 626 643 efa_com_cmd_str(cmd->aq_common_descriptor.opcode), 627 644 cmd->aq_common_descriptor.opcode); 628 - comp_ctx = efa_com_submit_admin_cmd(aq, cmd, cmd_size, comp, comp_size); 629 - if (IS_ERR(comp_ctx)) { 645 + 646 + comp_ctx = efa_com_alloc_comp_ctx(aq); 647 + if (!comp_ctx) { 648 + clear_bit(EFA_AQ_STATE_RUNNING_BIT, &aq->state); 649 + up(&aq->avail_cmds); 650 + return -EINVAL; 651 + } 652 + 653 + err = efa_com_submit_admin_cmd(aq, comp_ctx, cmd, cmd_size, comp, comp_size); 654 + if (err) { 630 655 ibdev_err_ratelimited( 631 656 aq->efa_dev, 632 - "Failed to submit command %s (opcode %u) err %pe\n", 657 + "Failed to submit command %s (opcode %u) err %d\n", 633 658 efa_com_cmd_str(cmd->aq_common_descriptor.opcode), 634 - cmd->aq_common_descriptor.opcode, comp_ctx); 659 + cmd->aq_common_descriptor.opcode, err); 635 660 661 + efa_com_dealloc_comp_ctx(aq, comp_ctx); 636 662 up(&aq->avail_cmds); 637 663 atomic64_inc(&aq->stats.cmd_err); 638 - return PTR_ERR(comp_ctx); 664 + return err; 639 665 } 640 666 641 667 err = efa_com_wait_and_process_admin_cq(comp_ctx, aq); 642 668 if (err) { 643 669 ibdev_err_ratelimited( 644 670 aq->efa_dev, 645 - "Failed to process command %s (opcode %u) comp_status %d err %d\n", 671 + "Failed to process command %s (opcode %u) err %d\n", 646 672 efa_com_cmd_str(cmd->aq_common_descriptor.opcode), 647 - cmd->aq_common_descriptor.opcode, 648 - comp_ctx->user_cqe->acq_common_descriptor.status, err); 673 + cmd->aq_common_descriptor.opcode, err); 649 674 atomic64_inc(&aq->stats.cmd_err); 650 675 } 651 676 677 + efa_com_dealloc_comp_ctx(aq, comp_ctx); 652 678 up(&aq->avail_cmds); 653 679 654 680 return err;
+3 -1
drivers/infiniband/hw/ionic/ionic_controlpath.c
··· 508 508 { 509 509 const struct ib_global_route *grh; 510 510 enum rdma_network_type net; 511 + u8 smac[ETH_ALEN]; 511 512 u16 vlan; 512 513 int rc; 513 514 ··· 519 518 520 519 grh = rdma_ah_read_grh(attr); 521 520 522 - rc = rdma_read_gid_l2_fields(grh->sgid_attr, &vlan, &hdr->eth.smac_h[0]); 521 + rc = rdma_read_gid_l2_fields(grh->sgid_attr, &vlan, smac); 523 522 if (rc) 524 523 return rc; 525 524 ··· 537 536 if (rc) 538 537 return rc; 539 538 539 + ether_addr_copy(hdr->eth.smac_h, smac); 540 540 ether_addr_copy(hdr->eth.dmac_h, attr->roce.dmac); 541 541 542 542 if (net == RDMA_NETWORK_IPV4) {
+16 -13
drivers/infiniband/hw/irdma/cm.c
··· 2241 2241 int oldarpindex; 2242 2242 int arpindex; 2243 2243 struct net_device *netdev = iwdev->netdev; 2244 + int ret; 2244 2245 2245 2246 /* create an hte and cm_node for this instance */ 2246 2247 cm_node = kzalloc_obj(*cm_node, GFP_ATOMIC); 2247 2248 if (!cm_node) 2248 - return NULL; 2249 + return ERR_PTR(-ENOMEM); 2249 2250 2250 2251 /* set our node specific transport info */ 2251 2252 cm_node->ipv4 = cm_info->ipv4; ··· 2349 2348 arpindex = -EINVAL; 2350 2349 } 2351 2350 2352 - if (arpindex < 0) 2351 + if (arpindex < 0) { 2352 + ret = -EINVAL; 2353 2353 goto err; 2354 + } 2354 2355 2355 2356 ether_addr_copy(cm_node->rem_mac, 2356 2357 iwdev->rf->arp_table[arpindex].mac_addr); ··· 2363 2360 err: 2364 2361 kfree(cm_node); 2365 2362 2366 - return NULL; 2363 + return ERR_PTR(ret); 2367 2364 } 2368 2365 2369 2366 static void irdma_destroy_connection(struct irdma_cm_node *cm_node) ··· 3024 3021 3025 3022 /* create a CM connection node */ 3026 3023 cm_node = irdma_make_cm_node(cm_core, iwdev, cm_info, NULL); 3027 - if (!cm_node) 3028 - return -ENOMEM; 3024 + if (IS_ERR(cm_node)) 3025 + return PTR_ERR(cm_node); 3029 3026 3030 3027 /* set our node side to client (active) side */ 3031 3028 cm_node->tcp_cntxt.client = 1; ··· 3222 3219 cm_info.cm_id = listener->cm_id; 3223 3220 cm_node = irdma_make_cm_node(cm_core, iwdev, &cm_info, 3224 3221 listener); 3225 - if (!cm_node) { 3222 + if (IS_ERR(cm_node)) { 3226 3223 ibdev_dbg(&cm_core->iwdev->ibdev, 3227 - "CM: allocate node failed\n"); 3224 + "CM: allocate node failed ret=%ld\n", PTR_ERR(cm_node)); 3228 3225 refcount_dec(&listener->refcnt); 3229 3226 return; 3230 3227 } ··· 4242 4239 irdma_cm_event_reset(event); 4243 4240 break; 4244 4241 case IRDMA_CM_EVENT_CONNECTED: 4245 - if (!event->cm_node->cm_id || 4246 - event->cm_node->state != IRDMA_CM_STATE_OFFLOADED) 4242 + if (!cm_node->cm_id || 4243 + cm_node->state != IRDMA_CM_STATE_OFFLOADED) 4247 4244 break; 4248 4245 irdma_cm_event_connected(event); 4249 4246 break; 4250 4247 case IRDMA_CM_EVENT_MPA_REJECT: 4251 - if (!event->cm_node->cm_id || 4248 + if (!cm_node->cm_id || 4252 4249 cm_node->state == IRDMA_CM_STATE_OFFLOADED) 4253 4250 break; 4254 4251 irdma_send_cm_event(cm_node, cm_node->cm_id, 4255 4252 IW_CM_EVENT_CONNECT_REPLY, -ECONNREFUSED); 4256 4253 break; 4257 4254 case IRDMA_CM_EVENT_ABORTED: 4258 - if (!event->cm_node->cm_id || 4259 - event->cm_node->state == IRDMA_CM_STATE_OFFLOADED) 4255 + if (!cm_node->cm_id || 4256 + cm_node->state == IRDMA_CM_STATE_OFFLOADED) 4260 4257 break; 4261 4258 irdma_event_connect_error(event); 4262 4259 break; ··· 4266 4263 break; 4267 4264 } 4268 4265 4269 - irdma_rem_ref_cm_node(event->cm_node); 4266 + irdma_rem_ref_cm_node(cm_node); 4270 4267 kfree(event); 4271 4268 } 4272 4269
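The irdma change above switches irdma_make_cm_node() from returning NULL to returning ERR_PTR() codes, so callers can tell -ENOMEM apart from an ARP lookup failure. A self-contained userspace restatement of the ERR_PTR()/IS_ERR()/PTR_ERR() idiom; the macros below are simplified stand-ins for the kernel's include/linux/err.h versions:

    /* Editorial sketch: the ERR_PTR()/IS_ERR()/PTR_ERR() idiom the irdma
     * change switches to, reduced to a standalone userspace program. */
    #include <errno.h>
    #include <stdio.h>
    #include <stdlib.h>

    #define MAX_ERRNO       4095
    #define ERR_PTR(err)    ((void *)(long)(err))
    #define PTR_ERR(ptr)    ((long)(ptr))
    #define IS_ERR(ptr)     ((unsigned long)(ptr) >= (unsigned long)-MAX_ERRNO)

    struct node { int id; };

    /* returns a node, or an encoded errno on failure */
    static struct node *make_node(int want_arp_failure)
    {
            struct node *n;

            if (want_arp_failure)
                    return ERR_PTR(-EINVAL);        /* e.g. ARP lookup failed */

            n = malloc(sizeof(*n));
            if (!n)
                    return ERR_PTR(-ENOMEM);
            n->id = 1;
            return n;
    }

    int main(void)
    {
            struct node *n = make_node(1);

            if (IS_ERR(n)) {
                    printf("make_node failed: %ld\n", PTR_ERR(n));
                    return 1;
            }
            free(n);
            return 0;
    }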
+22 -17
drivers/infiniband/hw/irdma/uk.c
··· 1438 1438 * irdma_round_up_wq - return round up qp wq depth 1439 1439 * @wqdepth: wq depth in quanta to round up 1440 1440 */ 1441 - static int irdma_round_up_wq(u32 wqdepth) 1441 + static u64 irdma_round_up_wq(u64 wqdepth) 1442 1442 { 1443 1443 int scount = 1; 1444 1444 ··· 1491 1491 int irdma_get_sqdepth(struct irdma_uk_attrs *uk_attrs, u32 sq_size, u8 shift, 1492 1492 u32 *sqdepth) 1493 1493 { 1494 - u32 min_size = (u32)uk_attrs->min_hw_wq_size << shift; 1494 + u32 min_hw_quanta = (u32)uk_attrs->min_hw_wq_size << shift; 1495 + u64 hw_quanta = 1496 + irdma_round_up_wq(((u64)sq_size << shift) + IRDMA_SQ_RSVD); 1495 1497 1496 - *sqdepth = irdma_round_up_wq((sq_size << shift) + IRDMA_SQ_RSVD); 1497 - 1498 - if (*sqdepth < min_size) 1499 - *sqdepth = min_size; 1500 - else if (*sqdepth > uk_attrs->max_hw_wq_quanta) 1498 + if (hw_quanta < min_hw_quanta) 1499 + hw_quanta = min_hw_quanta; 1500 + else if (hw_quanta > uk_attrs->max_hw_wq_quanta) 1501 1501 return -EINVAL; 1502 1502 1503 + *sqdepth = hw_quanta; 1503 1504 return 0; 1504 1505 } 1505 1506 ··· 1514 1513 int irdma_get_rqdepth(struct irdma_uk_attrs *uk_attrs, u32 rq_size, u8 shift, 1515 1514 u32 *rqdepth) 1516 1515 { 1517 - u32 min_size = (u32)uk_attrs->min_hw_wq_size << shift; 1516 + u32 min_hw_quanta = (u32)uk_attrs->min_hw_wq_size << shift; 1517 + u64 hw_quanta = 1518 + irdma_round_up_wq(((u64)rq_size << shift) + IRDMA_RQ_RSVD); 1518 1519 1519 - *rqdepth = irdma_round_up_wq((rq_size << shift) + IRDMA_RQ_RSVD); 1520 - 1521 - if (*rqdepth < min_size) 1522 - *rqdepth = min_size; 1523 - else if (*rqdepth > uk_attrs->max_hw_rq_quanta) 1520 + if (hw_quanta < min_hw_quanta) 1521 + hw_quanta = min_hw_quanta; 1522 + else if (hw_quanta > uk_attrs->max_hw_rq_quanta) 1524 1523 return -EINVAL; 1525 1524 1525 + *rqdepth = hw_quanta; 1526 1526 return 0; 1527 1527 } 1528 1528 ··· 1537 1535 int irdma_get_srqdepth(struct irdma_uk_attrs *uk_attrs, u32 srq_size, u8 shift, 1538 1536 u32 *srqdepth) 1539 1537 { 1540 - *srqdepth = irdma_round_up_wq((srq_size << shift) + IRDMA_RQ_RSVD); 1538 + u32 min_hw_quanta = (u32)uk_attrs->min_hw_wq_size << shift; 1539 + u64 hw_quanta = 1540 + irdma_round_up_wq(((u64)srq_size << shift) + IRDMA_RQ_RSVD); 1541 1541 1542 - if (*srqdepth < ((u32)uk_attrs->min_hw_wq_size << shift)) 1543 - *srqdepth = uk_attrs->min_hw_wq_size << shift; 1544 - else if (*srqdepth > uk_attrs->max_hw_srq_quanta) 1542 + if (hw_quanta < min_hw_quanta) 1543 + hw_quanta = min_hw_quanta; 1544 + else if (hw_quanta > uk_attrs->max_hw_srq_quanta) 1545 1545 return -EINVAL; 1546 1546 1547 + *srqdepth = hw_quanta; 1547 1548 return 0; 1548 1549 } 1549 1550
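The irdma depth helpers above widen the intermediate (size << shift) + reserve calculation to u64 so a large requested size cannot wrap a 32-bit value below the range check. A small demonstration of the wrap the widening prevents, with arbitrary example numbers:

    /* Editorial sketch: why the irdma depth math is widened to 64 bits.
     * With a large requested size and shift, the 32-bit expression wraps
     * and would pass a "too big" value off as a small one. */
    #include <stdio.h>
    #include <stdint.h>

    int main(void)
    {
            uint32_t size = 0x20000000;     /* arbitrary large request */
            unsigned int shift = 5;

            uint32_t narrow = (size << shift) + 32;         /* wraps to 32 */
            uint64_t wide = ((uint64_t)size << shift) + 32; /* 0x400000020 */

            printf("32-bit result: %u\n", narrow);
            printf("64-bit result: %llu\n", (unsigned long long)wide);
            return 0;
    }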
-2
drivers/infiniband/hw/irdma/utils.c
··· 2322 2322 struct irdma_qp *qp = sc_qp->qp_uk.back_qp; 2323 2323 struct ib_qp_attr attr; 2324 2324 2325 - if (qp->iwdev->rf->reset) 2326 - return; 2327 2325 attr.qp_state = IB_QPS_ERR; 2328 2326 2329 2327 if (rdma_protocol_roce(qp->ibqp.device, 1))
+6 -4
drivers/infiniband/hw/irdma/verbs.c
··· 558 558 } 559 559 560 560 irdma_qp_rem_ref(&iwqp->ibqp); 561 - wait_for_completion(&iwqp->free_qp); 561 + if (!iwdev->rf->reset) 562 + wait_for_completion(&iwqp->free_qp); 562 563 irdma_free_lsmm_rsrc(iwqp); 563 564 irdma_cqp_qp_destroy_cmd(&iwdev->rf->sc_dev, &iwqp->sc_qp); 564 565 ··· 1106 1105 spin_lock_init(&iwqp->sc_qp.pfpdu.lock); 1107 1106 iwqp->sig_all = init_attr->sq_sig_type == IB_SIGNAL_ALL_WR; 1108 1107 rf->qp_table[qp_num] = iwqp; 1108 + init_completion(&iwqp->free_qp); 1109 1109 1110 1110 if (udata) { 1111 1111 /* GEN_1 legacy support with libi40iw does not have expanded uresp struct */ ··· 1131 1129 } 1132 1130 } 1133 1131 1134 - init_completion(&iwqp->free_qp); 1135 1132 return 0; 1136 1133 1137 1134 error: ··· 1463 1462 ctx_info->remote_atomics_en = true; 1464 1463 } 1465 1464 1466 - wait_event(iwqp->mod_qp_waitq, !atomic_read(&iwqp->hw_mod_qp_pend)); 1467 - 1468 1465 ibdev_dbg(&iwdev->ibdev, 1469 1466 "VERBS: caller: %pS qp_id=%d to_ibqpstate=%d ibqpstate=%d irdma_qpstate=%d attr_mask=0x%x\n", 1470 1467 __builtin_return_address(0), ibqp->qp_num, attr->qp_state, ··· 1539 1540 case IB_QPS_ERR: 1540 1541 case IB_QPS_RESET: 1541 1542 if (iwqp->iwarp_state == IRDMA_QP_STATE_ERROR) { 1543 + iwqp->ibqp_state = attr->qp_state; 1542 1544 spin_unlock_irqrestore(&iwqp->lock, flags); 1543 1545 if (udata && udata->inlen) { 1544 1546 if (ib_copy_from_udata(&ureq, udata, ··· 1745 1745 case IB_QPS_ERR: 1746 1746 case IB_QPS_RESET: 1747 1747 if (iwqp->iwarp_state == IRDMA_QP_STATE_ERROR) { 1748 + iwqp->ibqp_state = attr->qp_state; 1748 1749 spin_unlock_irqrestore(&iwqp->lock, flags); 1749 1750 if (udata && udata->inlen) { 1750 1751 if (ib_copy_from_udata(&ureq, udata, ··· 3724 3723 3725 3724 err: 3726 3725 ib_umem_release(region); 3726 + iwmr->region = NULL; 3727 3727 return err; 3728 3728 } 3729 3729
+17 -4
drivers/iommu/dma-iommu.c
··· 1211 1211 */ 1212 1212 if (dev_use_swiotlb(dev, size, dir) && 1213 1213 iova_unaligned(iovad, phys, size)) { 1214 - if (attrs & DMA_ATTR_MMIO) 1214 + if (attrs & (DMA_ATTR_MMIO | DMA_ATTR_REQUIRE_COHERENT)) 1215 1215 return DMA_MAPPING_ERROR; 1216 1216 1217 1217 phys = iommu_dma_map_swiotlb(dev, phys, size, dir, attrs); ··· 1223 1223 arch_sync_dma_for_device(phys, size, dir); 1224 1224 1225 1225 iova = __iommu_dma_map(dev, phys, size, prot, dma_mask); 1226 - if (iova == DMA_MAPPING_ERROR && !(attrs & DMA_ATTR_MMIO)) 1226 + if (iova == DMA_MAPPING_ERROR && 1227 + !(attrs & (DMA_ATTR_MMIO | DMA_ATTR_REQUIRE_COHERENT))) 1227 1228 swiotlb_tbl_unmap_single(dev, phys, size, dir, attrs); 1228 1229 return iova; 1229 1230 } ··· 1234 1233 { 1235 1234 phys_addr_t phys; 1236 1235 1237 - if (attrs & DMA_ATTR_MMIO) { 1236 + if (attrs & (DMA_ATTR_MMIO | DMA_ATTR_REQUIRE_COHERENT)) { 1238 1237 __iommu_dma_unmap(dev, dma_handle, size); 1239 1238 return; 1240 1239 } ··· 1946 1945 if (WARN_ON_ONCE(iova_start_pad && offset > 0)) 1947 1946 return -EIO; 1948 1947 1948 + /* 1949 + * DMA_IOVA_USE_SWIOTLB is set on state after some entry 1950 + * took SWIOTLB path, which we were supposed to prevent 1951 + * for DMA_ATTR_REQUIRE_COHERENT attribute. 1952 + */ 1953 + if (WARN_ON_ONCE((state->__size & DMA_IOVA_USE_SWIOTLB) && 1954 + (attrs & DMA_ATTR_REQUIRE_COHERENT))) 1955 + return -EOPNOTSUPP; 1956 + 1957 + if (!dev_is_dma_coherent(dev) && (attrs & DMA_ATTR_REQUIRE_COHERENT)) 1958 + return -EOPNOTSUPP; 1959 + 1949 1960 if (dev_use_swiotlb(dev, size, dir) && 1950 1961 iova_unaligned(iovad, phys, size)) { 1951 - if (attrs & DMA_ATTR_MMIO) 1962 + if (attrs & (DMA_ATTR_MMIO | DMA_ATTR_REQUIRE_COHERENT)) 1952 1963 return -EPERM; 1953 1964 1954 1965 return iommu_dma_iova_link_swiotlb(dev, state, phys, offset,
+3
drivers/irqchip/irq-qcom-mpm.c
··· 306 306 if (ret < 0) 307 307 return ret; 308 308 309 + mbox_client_txdone(priv->mbox_chan, 0); 310 + 309 311 return 0; 310 312 } 311 313 ··· 436 434 } 437 435 438 436 priv->mbox_client.dev = dev; 437 + priv->mbox_client.knows_txdone = true; 439 438 priv->mbox_chan = mbox_request_channel(&priv->mbox_client, 0); 440 439 if (IS_ERR(priv->mbox_chan)) { 441 440 ret = PTR_ERR(priv->mbox_chan);
+1 -1
drivers/irqchip/irq-renesas-rzv2h.c
··· 621 621 return 0; 622 622 623 623 pm_put: 624 - pm_runtime_put(&pdev->dev); 624 + pm_runtime_put_sync(&pdev->dev); 625 625 626 626 return ret; 627 627 }
-2
drivers/media/i2c/ccs/ccs-core.c
··· 3080 3080 struct v4l2_rect *crop = 3081 3081 v4l2_subdev_state_get_crop(sd_state, pad); 3082 3082 3083 - guard(mutex)(&sensor->mutex); 3084 - 3085 3083 ccs_get_native_size(ssd, crop); 3086 3084 3087 3085 fmt->width = crop->width;
+5
drivers/media/mc/mc-request.c
··· 192 192 struct media_device *mdev = req->mdev; 193 193 unsigned long flags; 194 194 195 + mutex_lock(&mdev->req_queue_mutex); 196 + 195 197 spin_lock_irqsave(&req->lock, flags); 196 198 if (req->state != MEDIA_REQUEST_STATE_IDLE && 197 199 req->state != MEDIA_REQUEST_STATE_COMPLETE) { ··· 201 199 "request: %s not in idle or complete state, cannot reinit\n", 202 200 req->debug_str); 203 201 spin_unlock_irqrestore(&req->lock, flags); 202 + mutex_unlock(&mdev->req_queue_mutex); 204 203 return -EBUSY; 205 204 } 206 205 if (req->access_count) { ··· 209 206 "request: %s is being accessed, cannot reinit\n", 210 207 req->debug_str); 211 208 spin_unlock_irqrestore(&req->lock, flags); 209 + mutex_unlock(&mdev->req_queue_mutex); 212 210 return -EBUSY; 213 211 } 214 212 req->state = MEDIA_REQUEST_STATE_CLEANING; ··· 220 216 spin_lock_irqsave(&req->lock, flags); 221 217 req->state = MEDIA_REQUEST_STATE_IDLE; 222 218 spin_unlock_irqrestore(&req->lock, flags); 219 + mutex_unlock(&mdev->req_queue_mutex); 223 220 224 221 return 0; 225 222 }
+4
drivers/media/platform/rockchip/rkvdec/rkvdec-hevc-common.c
··· 500 500 ctrl = v4l2_ctrl_find(&ctx->ctrl_hdl, 501 501 V4L2_CID_STATELESS_HEVC_EXT_SPS_ST_RPS); 502 502 run->ext_sps_st_rps = ctrl ? ctrl->p_cur.p : NULL; 503 + } else { 504 + run->ext_sps_st_rps = NULL; 503 505 } 504 506 if (ctx->has_sps_lt_rps) { 505 507 ctrl = v4l2_ctrl_find(&ctx->ctrl_hdl, 506 508 V4L2_CID_STATELESS_HEVC_EXT_SPS_LT_RPS); 507 509 run->ext_sps_lt_rps = ctrl ? ctrl->p_cur.p : NULL; 510 + } else { 511 + run->ext_sps_lt_rps = NULL; 508 512 } 509 513 510 514 rkvdec_run_preamble(ctx, &run->base);
+27 -23
drivers/media/platform/rockchip/rkvdec/rkvdec-vdpu383-h264.c
··· 130 130 struct vdpu383_regs_h26x regs; 131 131 }; 132 132 133 - static void set_field_order_cnt(struct rkvdec_pps *pps, const struct v4l2_h264_dpb_entry *dpb) 133 + static noinline_for_stack void set_field_order_cnt(struct rkvdec_pps *pps, const struct v4l2_h264_dpb_entry *dpb) 134 134 { 135 135 pps->top_field_order_cnt0 = dpb[0].top_field_order_cnt; 136 136 pps->bot_field_order_cnt0 = dpb[0].bottom_field_order_cnt; ··· 166 166 pps->bot_field_order_cnt15 = dpb[15].bottom_field_order_cnt; 167 167 } 168 168 169 + static noinline_for_stack void set_dec_params(struct rkvdec_pps *pps, const struct v4l2_ctrl_h264_decode_params *dec_params) 170 + { 171 + const struct v4l2_h264_dpb_entry *dpb = dec_params->dpb; 172 + 173 + for (int i = 0; i < ARRAY_SIZE(dec_params->dpb); i++) { 174 + if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM) 175 + pps->is_longterm |= (1 << i); 176 + pps->ref_field_flags |= 177 + (!!(dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_FIELD)) << i; 178 + pps->ref_colmv_use_flag |= 179 + (!!(dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE)) << i; 180 + pps->ref_topfield_used |= 181 + (!!(dpb[i].fields & V4L2_H264_TOP_FIELD_REF)) << i; 182 + pps->ref_botfield_used |= 183 + (!!(dpb[i].fields & V4L2_H264_BOTTOM_FIELD_REF)) << i; 184 + } 185 + pps->pic_field_flag = 186 + !!(dec_params->flags & V4L2_H264_DECODE_PARAM_FLAG_FIELD_PIC); 187 + pps->pic_associated_flag = 188 + !!(dec_params->flags & V4L2_H264_DECODE_PARAM_FLAG_BOTTOM_FIELD); 189 + 190 + pps->cur_top_field = dec_params->top_field_order_cnt; 191 + pps->cur_bot_field = dec_params->bottom_field_order_cnt; 192 + } 193 + 169 194 static void assemble_hw_pps(struct rkvdec_ctx *ctx, 170 195 struct rkvdec_h264_run *run) 171 196 { ··· 202 177 struct rkvdec_h264_priv_tbl *priv_tbl = h264_ctx->priv_tbl.cpu; 203 178 struct rkvdec_sps_pps *hw_ps; 204 179 u32 pic_width, pic_height; 205 - u32 i; 206 180 207 181 /* 208 182 * HW read the SPS/PPS information from PPS packet index by PPS id. ··· 285 261 !!(pps->flags & V4L2_H264_PPS_FLAG_SCALING_MATRIX_PRESENT); 286 262 287 263 set_field_order_cnt(&hw_ps->pps, dpb); 264 + set_dec_params(&hw_ps->pps, dec_params); 288 265 289 - for (i = 0; i < ARRAY_SIZE(dec_params->dpb); i++) { 290 - if (dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM) 291 - hw_ps->pps.is_longterm |= (1 << i); 292 - 293 - hw_ps->pps.ref_field_flags |= 294 - (!!(dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_FIELD)) << i; 295 - hw_ps->pps.ref_colmv_use_flag |= 296 - (!!(dpb[i].flags & V4L2_H264_DPB_ENTRY_FLAG_ACTIVE)) << i; 297 - hw_ps->pps.ref_topfield_used |= 298 - (!!(dpb[i].fields & V4L2_H264_TOP_FIELD_REF)) << i; 299 - hw_ps->pps.ref_botfield_used |= 300 - (!!(dpb[i].fields & V4L2_H264_BOTTOM_FIELD_REF)) << i; 301 - } 302 - 303 - hw_ps->pps.pic_field_flag = 304 - !!(dec_params->flags & V4L2_H264_DECODE_PARAM_FLAG_FIELD_PIC); 305 - hw_ps->pps.pic_associated_flag = 306 - !!(dec_params->flags & V4L2_H264_DECODE_PARAM_FLAG_BOTTOM_FIELD); 307 - 308 - hw_ps->pps.cur_top_field = dec_params->top_field_order_cnt; 309 - hw_ps->pps.cur_bot_field = dec_params->bottom_field_order_cnt; 310 266 } 311 267 312 268 static void rkvdec_write_regs(struct rkvdec_ctx *ctx)
+2 -1
drivers/media/platform/rockchip/rkvdec/rkvdec-vp9.c
··· 893 893 update_ctx_last_info(vp9_ctx); 894 894 } 895 895 896 - static void rkvdec_init_v4l2_vp9_count_tbl(struct rkvdec_ctx *ctx) 896 + static noinline_for_stack void 897 + rkvdec_init_v4l2_vp9_count_tbl(struct rkvdec_ctx *ctx) 897 898 { 898 899 struct rkvdec_vp9_ctx *vp9_ctx = ctx->priv; 899 900 struct rkvdec_vp9_intra_frame_symbol_counts *intra_cnts = vp9_ctx->count_tbl.cpu;
+1
drivers/media/platform/synopsys/Kconfig
··· 7 7 depends on VIDEO_DEV 8 8 depends on V4L_PLATFORM_DRIVERS 9 9 depends on PM && COMMON_CLK 10 + select GENERIC_PHY_MIPI_DPHY 10 11 select MEDIA_CONTROLLER 11 12 select V4L2_FWNODE 12 13 select VIDEO_V4L2_SUBDEV_API
+1 -1
drivers/media/platform/synopsys/dw-mipi-csi2rx.c
··· 301 301 302 302 return 0; 303 303 case DW_MIPI_CSI2RX_PAD_SINK: 304 - if (code->index > csi2->formats_num) 304 + if (code->index >= csi2->formats_num) 305 305 return -EINVAL; 306 306 307 307 code->code = csi2->formats[code->index].code;
+1 -1
drivers/media/platform/verisilicon/imx8m_vpu_hw.c
··· 343 343 .num_regs = ARRAY_SIZE(imx8mq_reg_names) 344 344 }; 345 345 346 - static const struct of_device_id imx8mq_vpu_shared_resources[] __initconst = { 346 + static const struct of_device_id imx8mq_vpu_shared_resources[] = { 347 347 { .compatible = "nxp,imx8mq-vpu-g1", }, 348 348 { .compatible = "nxp,imx8mq-vpu-g2", }, 349 349 { /* sentinel */ }
+5 -4
drivers/media/usb/uvc/uvc_video.c
··· 1751 1751 /* 1752 1752 * Free transfer buffers. 1753 1753 */ 1754 - static void uvc_free_urb_buffers(struct uvc_streaming *stream) 1754 + static void uvc_free_urb_buffers(struct uvc_streaming *stream, 1755 + unsigned int size) 1755 1756 { 1756 1757 struct usb_device *udev = stream->dev->udev; 1757 1758 struct uvc_urb *uvc_urb; ··· 1761 1760 if (!uvc_urb->buffer) 1762 1761 continue; 1763 1762 1764 - usb_free_noncoherent(udev, stream->urb_size, uvc_urb->buffer, 1763 + usb_free_noncoherent(udev, size, uvc_urb->buffer, 1765 1764 uvc_stream_dir(stream), uvc_urb->sgt); 1766 1765 uvc_urb->buffer = NULL; 1767 1766 uvc_urb->sgt = NULL; ··· 1821 1820 1822 1821 if (!uvc_alloc_urb_buffer(stream, uvc_urb, urb_size, 1823 1822 gfp_flags)) { 1824 - uvc_free_urb_buffers(stream); 1823 + uvc_free_urb_buffers(stream, urb_size); 1825 1824 break; 1826 1825 } 1827 1826 ··· 1869 1868 } 1870 1869 1871 1870 if (free_buffers) 1872 - uvc_free_urb_buffers(stream); 1871 + uvc_free_urb_buffers(stream, stream->urb_size); 1873 1872 } 1874 1873 1875 1874 /*
+3 -2
drivers/media/v4l2-core/v4l2-ioctl.c
··· 3082 3082 } 3083 3083 3084 3084 /* 3085 - * We need to serialize streamon/off with queueing new requests. 3085 + * We need to serialize streamon/off/reqbufs with queueing new requests. 3086 3086 * These ioctls may trigger the cancellation of a streaming 3087 3087 * operation, and that should not be mixed with queueing a new 3088 3088 * request at the same time. 3089 3089 */ 3090 3090 if (v4l2_device_supports_requests(vfd->v4l2_dev) && 3091 - (cmd == VIDIOC_STREAMON || cmd == VIDIOC_STREAMOFF)) { 3091 + (cmd == VIDIOC_STREAMON || cmd == VIDIOC_STREAMOFF || 3092 + cmd == VIDIOC_REQBUFS)) { 3092 3093 req_queue_lock = &vfd->v4l2_dev->mdev->req_queue_mutex; 3093 3094 3094 3095 if (mutex_lock_interruptible(req_queue_lock))
+3 -1
drivers/net/can/dev/netlink.c
··· 601 601 /* We need synchronization with dev->stop() */ 602 602 ASSERT_RTNL(); 603 603 604 - can_ctrlmode_changelink(dev, data, extack); 604 + err = can_ctrlmode_changelink(dev, data, extack); 605 + if (err) 606 + return err; 605 607 606 608 if (data[IFLA_CAN_BITTIMING]) { 607 609 struct can_bittiming bt;
+24 -5
drivers/net/can/spi/mcp251x.c
··· 1225 1225 } 1226 1226 1227 1227 mutex_lock(&priv->mcp_lock); 1228 - mcp251x_power_enable(priv->transceiver, 1); 1228 + ret = mcp251x_power_enable(priv->transceiver, 1); 1229 + if (ret) { 1230 + dev_err(&spi->dev, "failed to enable transceiver power: %pe\n", ERR_PTR(ret)); 1231 + goto out_close_candev; 1232 + } 1229 1233 1230 1234 priv->force_quit = 0; 1231 1235 priv->tx_skb = NULL; ··· 1276 1272 mcp251x_hw_sleep(spi); 1277 1273 out_close: 1278 1274 mcp251x_power_enable(priv->transceiver, 0); 1275 + out_close_candev: 1279 1276 close_candev(net); 1280 1277 mutex_unlock(&priv->mcp_lock); 1281 1278 if (release_irq) ··· 1521 1516 { 1522 1517 struct spi_device *spi = to_spi_device(dev); 1523 1518 struct mcp251x_priv *priv = spi_get_drvdata(spi); 1519 + int ret = 0; 1524 1520 1525 - if (priv->after_suspend & AFTER_SUSPEND_POWER) 1526 - mcp251x_power_enable(priv->power, 1); 1527 - if (priv->after_suspend & AFTER_SUSPEND_UP) 1528 - mcp251x_power_enable(priv->transceiver, 1); 1521 + if (priv->after_suspend & AFTER_SUSPEND_POWER) { 1522 + ret = mcp251x_power_enable(priv->power, 1); 1523 + if (ret) { 1524 + dev_err(dev, "failed to restore power: %pe\n", ERR_PTR(ret)); 1525 + return ret; 1526 + } 1527 + } 1528 + 1529 + if (priv->after_suspend & AFTER_SUSPEND_UP) { 1530 + ret = mcp251x_power_enable(priv->transceiver, 1); 1531 + if (ret) { 1532 + dev_err(dev, "failed to restore transceiver power: %pe\n", ERR_PTR(ret)); 1533 + if (priv->after_suspend & AFTER_SUSPEND_POWER) 1534 + mcp251x_power_enable(priv->power, 0); 1535 + return ret; 1536 + } 1537 + } 1529 1538 1530 1539 if (priv->after_suspend & (AFTER_SUSPEND_POWER | AFTER_SUSPEND_UP)) 1531 1540 queue_work(priv->wq, &priv->restart_work);
+2
drivers/net/ethernet/airoha/airoha_ppe.c
··· 227 227 if (!dev) 228 228 return -ENODEV; 229 229 230 + rcu_read_lock(); 230 231 err = dev_fill_forward_path(dev, addr, &stack); 232 + rcu_read_unlock(); 231 233 if (err) 232 234 return err; 233 235
+1 -1
drivers/net/ethernet/broadcom/Kconfig
··· 25 25 select SSB 26 26 select MII 27 27 select PHYLIB 28 - select FIXED_PHY if BCM47XX 28 + select FIXED_PHY 29 29 help 30 30 If you have a network (Ethernet) controller of this type, say Y 31 31 or M here.
+23 -18
drivers/net/ethernet/broadcom/asp2/bcmasp.c
··· 1152 1152 } 1153 1153 } 1154 1154 1155 - static void bcmasp_wol_irq_destroy(struct bcmasp_priv *priv) 1156 - { 1157 - if (priv->wol_irq > 0) 1158 - free_irq(priv->wol_irq, priv); 1159 - } 1160 - 1161 1155 static void bcmasp_eee_fixup(struct bcmasp_intf *intf, bool en) 1162 1156 { 1163 1157 u32 reg, phy_lpi_overwrite; ··· 1249 1255 if (priv->irq <= 0) 1250 1256 return -EINVAL; 1251 1257 1252 - priv->clk = devm_clk_get_optional_enabled(dev, "sw_asp"); 1258 + priv->clk = devm_clk_get_optional(dev, "sw_asp"); 1253 1259 if (IS_ERR(priv->clk)) 1254 1260 return dev_err_probe(dev, PTR_ERR(priv->clk), 1255 1261 "failed to request clock\n"); ··· 1277 1283 1278 1284 bcmasp_set_pdata(priv, pdata); 1279 1285 1286 + ret = clk_prepare_enable(priv->clk); 1287 + if (ret) 1288 + return dev_err_probe(dev, ret, "failed to start clock\n"); 1289 + 1280 1290 /* Enable all clocks to ensure successful probing */ 1281 1291 bcmasp_core_clock_set(priv, ASP_CTRL_CLOCK_CTRL_ASP_ALL_DISABLE, 0); 1282 1292 ··· 1292 1294 1293 1295 ret = devm_request_irq(&pdev->dev, priv->irq, bcmasp_isr, 0, 1294 1296 pdev->name, priv); 1295 - if (ret) 1296 - return dev_err_probe(dev, ret, "failed to request ASP interrupt: %d", ret); 1297 + if (ret) { 1298 + dev_err(dev, "Failed to request ASP interrupt: %d", ret); 1299 + goto err_clock_disable; 1300 + } 1297 1301 1298 1302 /* Register mdio child nodes */ 1299 1303 of_platform_populate(dev->of_node, bcmasp_mdio_of_match, NULL, dev); ··· 1307 1307 1308 1308 priv->mda_filters = devm_kcalloc(dev, priv->num_mda_filters, 1309 1309 sizeof(*priv->mda_filters), GFP_KERNEL); 1310 - if (!priv->mda_filters) 1311 - return -ENOMEM; 1310 + if (!priv->mda_filters) { 1311 + ret = -ENOMEM; 1312 + goto err_clock_disable; 1313 + } 1312 1314 1313 1315 priv->net_filters = devm_kcalloc(dev, priv->num_net_filters, 1314 1316 sizeof(*priv->net_filters), GFP_KERNEL); 1315 - if (!priv->net_filters) 1316 - return -ENOMEM; 1317 + if (!priv->net_filters) { 1318 + ret = -ENOMEM; 1319 + goto err_clock_disable; 1320 + } 1317 1321 1318 1322 bcmasp_core_init_filters(priv); 1319 1323 ··· 1326 1322 ports_node = of_find_node_by_name(dev->of_node, "ethernet-ports"); 1327 1323 if (!ports_node) { 1328 1324 dev_warn(dev, "No ports found\n"); 1329 - return -EINVAL; 1325 + ret = -EINVAL; 1326 + goto err_clock_disable; 1330 1327 } 1331 1328 1332 1329 i = 0; ··· 1349 1344 */ 1350 1345 bcmasp_core_clock_set(priv, 0, ASP_CTRL_CLOCK_CTRL_ASP_ALL_DISABLE); 1351 1346 1352 - clk_disable_unprepare(priv->clk); 1353 - 1354 1347 /* Now do the registration of the network ports which will take care 1355 1348 * of managing the clock properly. 1356 1349 */ ··· 1361 1358 count++; 1362 1359 } 1363 1360 1361 + clk_disable_unprepare(priv->clk); 1362 + 1364 1363 dev_info(dev, "Initialized %d port(s)\n", count); 1365 1364 1366 1365 return ret; 1367 1366 1368 1367 err_cleanup: 1369 - bcmasp_wol_irq_destroy(priv); 1370 1368 bcmasp_remove_intfs(priv); 1369 + err_clock_disable: 1370 + clk_disable_unprepare(priv->clk); 1371 1371 1372 1372 return ret; 1373 1373 } ··· 1382 1376 if (!priv) 1383 1377 return; 1384 1378 1385 - bcmasp_wol_irq_destroy(priv); 1386 1379 bcmasp_remove_intfs(priv); 1387 1380 } 1388 1381
+25 -16
drivers/net/ethernet/cadence/macb_main.c
··· 1071 1071 } 1072 1072 1073 1073 if (tx_skb->skb) { 1074 - napi_consume_skb(tx_skb->skb, budget); 1074 + dev_consume_skb_any(tx_skb->skb); 1075 1075 tx_skb->skb = NULL; 1076 1076 } 1077 1077 } ··· 3224 3224 spin_lock_irq(&bp->stats_lock); 3225 3225 gem_update_stats(bp); 3226 3226 memcpy(data, &bp->ethtool_stats, sizeof(u64) 3227 - * (GEM_STATS_LEN + QUEUE_STATS_LEN * MACB_MAX_QUEUES)); 3227 + * (GEM_STATS_LEN + QUEUE_STATS_LEN * bp->num_queues)); 3228 3228 spin_unlock_irq(&bp->stats_lock); 3229 3229 } 3230 3230 ··· 5776 5776 struct macb_queue *queue; 5777 5777 struct in_device *idev; 5778 5778 unsigned long flags; 5779 + u32 tmp, ifa_local; 5779 5780 unsigned int q; 5780 5781 int err; 5781 - u32 tmp; 5782 5782 5783 5783 if (!device_may_wakeup(&bp->dev->dev)) 5784 5784 phy_exit(bp->phy); ··· 5787 5787 return 0; 5788 5788 5789 5789 if (bp->wol & MACB_WOL_ENABLED) { 5790 - /* Check for IP address in WOL ARP mode */ 5791 - idev = __in_dev_get_rcu(bp->dev); 5792 - if (idev) 5793 - ifa = rcu_dereference(idev->ifa_list); 5794 - if ((bp->wolopts & WAKE_ARP) && !ifa) { 5795 - netdev_err(netdev, "IP address not assigned as required by WoL walk ARP\n"); 5796 - return -EOPNOTSUPP; 5790 + if (bp->wolopts & WAKE_ARP) { 5791 + /* Check for IP address in WOL ARP mode */ 5792 + rcu_read_lock(); 5793 + idev = __in_dev_get_rcu(bp->dev); 5794 + if (idev) 5795 + ifa = rcu_dereference(idev->ifa_list); 5796 + if (!ifa) { 5797 + rcu_read_unlock(); 5798 + netdev_err(netdev, "IP address not assigned as required by WoL walk ARP\n"); 5799 + return -EOPNOTSUPP; 5800 + } 5801 + ifa_local = be32_to_cpu(ifa->ifa_local); 5802 + rcu_read_unlock(); 5797 5803 } 5804 + 5798 5805 spin_lock_irqsave(&bp->lock, flags); 5799 5806 5800 5807 /* Disable Tx and Rx engines before disabling the queues, ··· 5840 5833 if (bp->wolopts & WAKE_ARP) { 5841 5834 tmp |= MACB_BIT(ARP); 5842 5835 /* write IP address into register */ 5843 - tmp |= MACB_BFEXT(IP, be32_to_cpu(ifa->ifa_local)); 5836 + tmp |= MACB_BFEXT(IP, ifa_local); 5844 5837 } 5838 + spin_unlock_irqrestore(&bp->lock, flags); 5845 5839 5846 5840 /* Change interrupt handler and 5847 5841 * Enable WoL IRQ on queue 0 ··· 5855 5847 dev_err(dev, 5856 5848 "Unable to request IRQ %d (error %d)\n", 5857 5849 bp->queues[0].irq, err); 5858 - spin_unlock_irqrestore(&bp->lock, flags); 5859 5850 return err; 5860 5851 } 5852 + spin_lock_irqsave(&bp->lock, flags); 5861 5853 queue_writel(bp->queues, IER, GEM_BIT(WOL)); 5862 5854 gem_writel(bp, WOL, tmp); 5855 + spin_unlock_irqrestore(&bp->lock, flags); 5863 5856 } else { 5864 5857 err = devm_request_irq(dev, bp->queues[0].irq, macb_wol_interrupt, 5865 5858 IRQF_SHARED, netdev->name, bp->queues); ··· 5868 5859 dev_err(dev, 5869 5860 "Unable to request IRQ %d (error %d)\n", 5870 5861 bp->queues[0].irq, err); 5871 - spin_unlock_irqrestore(&bp->lock, flags); 5872 5862 return err; 5873 5863 } 5864 + spin_lock_irqsave(&bp->lock, flags); 5874 5865 queue_writel(bp->queues, IER, MACB_BIT(WOL)); 5875 5866 macb_writel(bp, WOL, tmp); 5867 + spin_unlock_irqrestore(&bp->lock, flags); 5876 5868 } 5877 - spin_unlock_irqrestore(&bp->lock, flags); 5878 5869 5879 5870 enable_irq_wake(bp->queues[0].irq); 5880 5871 } ··· 5941 5932 queue_readl(bp->queues, ISR); 5942 5933 if (bp->caps & MACB_CAPS_ISR_CLEAR_ON_WRITE) 5943 5934 queue_writel(bp->queues, ISR, -1); 5935 + spin_unlock_irqrestore(&bp->lock, flags); 5936 + 5944 5937 /* Replace interrupt handler on queue 0 */ 5945 5938 devm_free_irq(dev, bp->queues[0].irq, bp->queues); 5946 5939 err = devm_request_irq(dev, 
bp->queues[0].irq, macb_interrupt, ··· 5951 5940 dev_err(dev, 5952 5941 "Unable to request IRQ %d (error %d)\n", 5953 5942 bp->queues[0].irq, err); 5954 - spin_unlock_irqrestore(&bp->lock, flags); 5955 5943 return err; 5956 5944 } 5957 - spin_unlock_irqrestore(&bp->lock, flags); 5958 5945 5959 5946 disable_irq_wake(bp->queues[0].irq); 5960 5947
+2
drivers/net/ethernet/freescale/enetc/enetc_ethtool.c
··· 813 813 { 814 814 struct enetc_ndev_priv *priv = netdev_priv(ndev); 815 815 816 + ring->rx_max_pending = priv->rx_bd_count; 817 + ring->tx_max_pending = priv->tx_bd_count; 816 818 ring->rx_pending = priv->rx_bd_count; 817 819 ring->tx_pending = priv->tx_bd_count; 818 820
+15 -16
drivers/net/ethernet/intel/iavf/iavf_ethtool.c
··· 313 313 { 314 314 /* Report the maximum number queues, even if not every queue is 315 315 * currently configured. Since allocation of queues is in pairs, 316 - * use netdev->real_num_tx_queues * 2. The real_num_tx_queues is set 317 - * at device creation and never changes. 316 + * use netdev->num_tx_queues * 2. The num_tx_queues is set at 317 + * device creation and never changes. 318 318 */ 319 319 320 320 if (sset == ETH_SS_STATS) 321 321 return IAVF_STATS_LEN + 322 - (IAVF_QUEUE_STATS_LEN * 2 * 323 - netdev->real_num_tx_queues); 322 + (IAVF_QUEUE_STATS_LEN * 2 * netdev->num_tx_queues); 324 323 else 325 324 return -EINVAL; 326 325 } ··· 344 345 iavf_add_ethtool_stats(&data, adapter, iavf_gstrings_stats); 345 346 346 347 rcu_read_lock(); 347 - /* As num_active_queues describe both tx and rx queues, we can use 348 - * it to iterate over rings' stats. 348 + /* Use num_tx_queues to report stats for the maximum number of queues. 349 + * Queues beyond num_active_queues will report zero. 349 350 */ 350 - for (i = 0; i < adapter->num_active_queues; i++) { 351 - struct iavf_ring *ring; 351 + for (i = 0; i < netdev->num_tx_queues; i++) { 352 + struct iavf_ring *tx_ring = NULL, *rx_ring = NULL; 352 353 353 - /* Tx rings stats */ 354 - ring = &adapter->tx_rings[i]; 355 - iavf_add_queue_stats(&data, ring); 354 + if (i < adapter->num_active_queues) { 355 + tx_ring = &adapter->tx_rings[i]; 356 + rx_ring = &adapter->rx_rings[i]; 357 + } 356 358 357 - /* Rx rings stats */ 358 - ring = &adapter->rx_rings[i]; 359 - iavf_add_queue_stats(&data, ring); 359 + iavf_add_queue_stats(&data, tx_ring); 360 + iavf_add_queue_stats(&data, rx_ring); 360 361 } 361 362 rcu_read_unlock(); 362 363 } ··· 375 376 iavf_add_stat_strings(&data, iavf_gstrings_stats); 376 377 377 378 /* Queues are always allocated in pairs, so we just use 378 - * real_num_tx_queues for both Tx and Rx queues. 379 + * num_tx_queues for both Tx and Rx queues. 379 380 */ 380 - for (i = 0; i < netdev->real_num_tx_queues; i++) { 381 + for (i = 0; i < netdev->num_tx_queues; i++) { 381 382 iavf_add_stat_strings(&data, iavf_gstrings_queue_stats, 382 383 "tx", i); 383 384 iavf_add_stat_strings(&data, iavf_gstrings_queue_stats,
+22
drivers/net/ethernet/intel/ice/ice.h
··· 840 840 } 841 841 842 842 /** 843 + * ice_get_max_txq - return the maximum number of Tx queues for in a PF 844 + * @pf: PF structure 845 + * 846 + * Return: maximum number of Tx queues 847 + */ 848 + static inline int ice_get_max_txq(struct ice_pf *pf) 849 + { 850 + return min(num_online_cpus(), pf->hw.func_caps.common_cap.num_txq); 851 + } 852 + 853 + /** 854 + * ice_get_max_rxq - return the maximum number of Rx queues for in a PF 855 + * @pf: PF structure 856 + * 857 + * Return: maximum number of Rx queues 858 + */ 859 + static inline int ice_get_max_rxq(struct ice_pf *pf) 860 + { 861 + return min(num_online_cpus(), pf->hw.func_caps.common_cap.num_rxq); 862 + } 863 + 864 + /** 843 865 * ice_get_main_vsi - Get the PF VSI 844 866 * @pf: PF instance 845 867 *
+11 -21
drivers/net/ethernet/intel/ice/ice_ethtool.c
··· 1930 1930 int i = 0; 1931 1931 char *p; 1932 1932 1933 + if (ice_is_port_repr_netdev(netdev)) { 1934 + ice_update_eth_stats(vsi); 1935 + 1936 + for (j = 0; j < ICE_VSI_STATS_LEN; j++) { 1937 + p = (char *)vsi + ice_gstrings_vsi_stats[j].stat_offset; 1938 + data[i++] = (ice_gstrings_vsi_stats[j].sizeof_stat == 1939 + sizeof(u64)) ? *(u64 *)p : *(u32 *)p; 1940 + } 1941 + return; 1942 + } 1943 + 1933 1944 ice_update_pf_stats(pf); 1934 1945 ice_update_vsi_stats(vsi); 1935 1946 ··· 1949 1938 data[i++] = (ice_gstrings_vsi_stats[j].sizeof_stat == 1950 1939 sizeof(u64)) ? *(u64 *)p : *(u32 *)p; 1951 1940 } 1952 - 1953 - if (ice_is_port_repr_netdev(netdev)) 1954 - return; 1955 1941 1956 1942 /* populate per queue stats */ 1957 1943 rcu_read_lock(); ··· 3779 3771 info->rx_filters = BIT(HWTSTAMP_FILTER_NONE) | BIT(HWTSTAMP_FILTER_ALL); 3780 3772 3781 3773 return 0; 3782 - } 3783 - 3784 - /** 3785 - * ice_get_max_txq - return the maximum number of Tx queues for in a PF 3786 - * @pf: PF structure 3787 - */ 3788 - static int ice_get_max_txq(struct ice_pf *pf) 3789 - { 3790 - return min(num_online_cpus(), pf->hw.func_caps.common_cap.num_txq); 3791 - } 3792 - 3793 - /** 3794 - * ice_get_max_rxq - return the maximum number of Rx queues for in a PF 3795 - * @pf: PF structure 3796 - */ 3797 - static int ice_get_max_rxq(struct ice_pf *pf) 3798 - { 3799 - return min(num_online_cpus(), pf->hw.func_caps.common_cap.num_rxq); 3800 3774 } 3801 3775 3802 3776 /**
+2 -2
drivers/net/ethernet/intel/ice/ice_main.c
··· 4699 4699 struct net_device *netdev; 4700 4700 u8 mac_addr[ETH_ALEN]; 4701 4701 4702 - netdev = alloc_etherdev_mqs(sizeof(*np), vsi->alloc_txq, 4703 - vsi->alloc_rxq); 4702 + netdev = alloc_etherdev_mqs(sizeof(*np), ice_get_max_txq(vsi->back), 4703 + ice_get_max_rxq(vsi->back)); 4704 4704 if (!netdev) 4705 4705 return -ENOMEM; 4706 4706
+3 -2
drivers/net/ethernet/intel/ice/ice_repr.c
··· 2 2 /* Copyright (C) 2019-2021, Intel Corporation. */ 3 3 4 4 #include "ice.h" 5 + #include "ice_lib.h" 5 6 #include "ice_eswitch.h" 6 7 #include "devlink/devlink.h" 7 8 #include "devlink/port.h" ··· 68 67 return; 69 68 vsi = repr->src_vsi; 70 69 71 - ice_update_vsi_stats(vsi); 70 + ice_update_eth_stats(vsi); 72 71 eth_stats = &vsi->eth_stats; 73 72 74 73 stats->tx_packets = eth_stats->tx_unicast + eth_stats->tx_broadcast + ··· 316 315 317 316 static int ice_repr_ready_vf(struct ice_repr *repr) 318 317 { 319 - return !ice_check_vf_ready_for_cfg(repr->vf); 318 + return ice_check_vf_ready_for_cfg(repr->vf); 320 319 } 321 320 322 321 static int ice_repr_ready_sf(struct ice_repr *repr)
+1 -1
drivers/net/ethernet/intel/idpf/idpf.h
··· 1066 1066 int idpf_idc_init(struct idpf_adapter *adapter); 1067 1067 int idpf_idc_init_aux_core_dev(struct idpf_adapter *adapter, 1068 1068 enum iidc_function_type ftype); 1069 - void idpf_idc_deinit_core_aux_device(struct iidc_rdma_core_dev_info *cdev_info); 1069 + void idpf_idc_deinit_core_aux_device(struct idpf_adapter *adapter); 1070 1070 void idpf_idc_deinit_vport_aux_device(struct iidc_rdma_vport_dev_info *vdev_info); 1071 1071 void idpf_idc_issue_reset_event(struct iidc_rdma_core_dev_info *cdev_info); 1072 1072 void idpf_idc_vdev_mtu_event(struct iidc_rdma_vport_dev_info *vdev_info,
+4 -2
drivers/net/ethernet/intel/idpf/idpf_idc.c
··· 470 470 471 471 /** 472 472 * idpf_idc_deinit_core_aux_device - de-initialize Auxiliary Device(s) 473 - * @cdev_info: IDC core device info pointer 473 + * @adapter: driver private data structure 474 474 */ 475 - void idpf_idc_deinit_core_aux_device(struct iidc_rdma_core_dev_info *cdev_info) 475 + void idpf_idc_deinit_core_aux_device(struct idpf_adapter *adapter) 476 476 { 477 + struct iidc_rdma_core_dev_info *cdev_info = adapter->cdev_info; 477 478 struct iidc_rdma_priv_dev_info *privd; 478 479 479 480 if (!cdev_info) ··· 486 485 kfree(privd->mapped_mem_regions); 487 486 kfree(privd); 488 487 kfree(cdev_info); 488 + adapter->cdev_info = NULL; 489 489 } 490 490 491 491 /**
+1 -1
drivers/net/ethernet/intel/idpf/idpf_txrx.c
··· 1860 1860 idpf_queue_assign(HSPLIT_EN, q, hs); 1861 1861 idpf_queue_assign(RSC_EN, q, rsc); 1862 1862 1863 - bufq_set->num_refillqs = num_rxq; 1864 1863 bufq_set->refillqs = kcalloc(num_rxq, swq_size, 1865 1864 GFP_KERNEL); 1866 1865 if (!bufq_set->refillqs) { 1867 1866 err = -ENOMEM; 1868 1867 goto err_alloc; 1869 1868 } 1869 + bufq_set->num_refillqs = num_rxq; 1870 1870 for (unsigned int k = 0; k < bufq_set->num_refillqs; k++) { 1871 1871 struct idpf_sw_queue *refillq = 1872 1872 &bufq_set->refillqs[k];
+1 -1
drivers/net/ethernet/intel/idpf/idpf_virtchnl.c
··· 3668 3668 3669 3669 idpf_ptp_release(adapter); 3670 3670 idpf_deinit_task(adapter); 3671 - idpf_idc_deinit_core_aux_device(adapter->cdev_info); 3671 + idpf_idc_deinit_core_aux_device(adapter); 3672 3672 idpf_rel_rx_pt_lkup(adapter); 3673 3673 idpf_intr_rel(adapter); 3674 3674
+5
drivers/net/ethernet/microchip/lan743x_main.c
··· 3053 3053 else if (speed == SPEED_100) 3054 3054 mac_cr |= MAC_CR_CFG_L_; 3055 3055 3056 + if (duplex == DUPLEX_FULL) 3057 + mac_cr |= MAC_CR_DPX_; 3058 + else 3059 + mac_cr &= ~MAC_CR_DPX_; 3060 + 3056 3061 lan743x_csr_write(adapter, MAC_CR, mac_cr); 3057 3062 3058 3063 lan743x_ptp_update_latency(adapter, speed);
+4 -2
drivers/net/ethernet/microsoft/mana/mana_en.c
··· 3425 3425 struct auxiliary_device *adev; 3426 3426 struct mana_adev *madev; 3427 3427 int ret; 3428 + int id; 3428 3429 3429 3430 madev = kzalloc_obj(*madev); 3430 3431 if (!madev) ··· 3435 3434 ret = mana_adev_idx_alloc(); 3436 3435 if (ret < 0) 3437 3436 goto idx_fail; 3438 - adev->id = ret; 3437 + id = ret; 3438 + adev->id = id; 3439 3439 3440 3440 adev->name = name; 3441 3441 adev->dev.parent = gd->gdma_context->dev; ··· 3462 3460 auxiliary_device_uninit(adev); 3463 3461 3464 3462 init_fail: 3465 - mana_adev_idx_free(adev->id); 3463 + mana_adev_idx_free(id); 3466 3464 3467 3465 idx_fail: 3468 3466 kfree(madev);
+11 -6
drivers/net/ethernet/pensando/ionic/ionic_lif.c
··· 1719 1719 if (ether_addr_equal(netdev->dev_addr, mac)) 1720 1720 return 0; 1721 1721 1722 - err = ionic_program_mac(lif, mac); 1723 - if (err < 0) 1724 - return err; 1722 + /* Only program macs for virtual functions to avoid losing the permanent 1723 + * Mac across warm reset/reboot. 1724 + */ 1725 + if (lif->ionic->pdev->is_virtfn) { 1726 + err = ionic_program_mac(lif, mac); 1727 + if (err < 0) 1728 + return err; 1725 1729 1726 - if (err > 0) 1727 - netdev_dbg(netdev, "%s: SET and GET ATTR Mac are not equal-due to old FW running\n", 1728 - __func__); 1730 + if (err > 0) 1731 + netdev_dbg(netdev, "%s: SET and GET ATTR Mac are not equal-due to old FW running\n", 1732 + __func__); 1733 + } 1729 1734 1730 1735 err = eth_prepare_mac_addr_change(netdev, addr); 1731 1736 if (err)
+2 -2
drivers/net/ethernet/ti/icssg/icssg_common.c
··· 962 962 pkt_len -= 4; 963 963 cppi5_desc_get_tags_ids(&desc_rx->hdr, &port_id, NULL); 964 964 psdata = cppi5_hdesc_get_psdata(desc_rx); 965 - k3_cppi_desc_pool_free(rx_chn->desc_pool, desc_rx); 966 965 count++; 967 966 xsk_buff_set_size(xdp, pkt_len); 968 967 xsk_buff_dma_sync_for_cpu(xdp); ··· 987 988 emac_dispatch_skb_zc(emac, xdp, psdata); 988 989 xsk_buff_free(xdp); 989 990 } 991 + k3_cppi_desc_pool_free(rx_chn->desc_pool, desc_rx); 990 992 } 991 993 992 994 if (xdp_status & ICSSG_XDP_REDIR) ··· 1057 1057 /* firmware adds 4 CRC bytes, strip them */ 1058 1058 pkt_len -= 4; 1059 1059 cppi5_desc_get_tags_ids(&desc_rx->hdr, &port_id, NULL); 1060 - k3_cppi_desc_pool_free(rx_chn->desc_pool, desc_rx); 1061 1060 1062 1061 /* if allocation fails we drop the packet but push the 1063 1062 * descriptor back to the ring with old page to prevent a stall ··· 1114 1115 ndev->stats.rx_packets++; 1115 1116 1116 1117 requeue: 1118 + k3_cppi_desc_pool_free(rx_chn->desc_pool, desc_rx); 1117 1119 /* queue another RX DMA */ 1118 1120 ret = prueth_dma_rx_push_mapped(emac, &emac->rx_chns, new_page, 1119 1121 PRUETH_MAX_PKT_SIZE);
+64 -1
drivers/net/team/team_core.c
··· 2058 2058 * rt netlink interface 2059 2059 ***********************/ 2060 2060 2061 + /* For tx path we need a linkup && enabled port and for parse any port 2062 + * suffices. 2063 + */ 2064 + static struct team_port *team_header_port_get_rcu(struct team *team, 2065 + bool txable) 2066 + { 2067 + struct team_port *port; 2068 + 2069 + list_for_each_entry_rcu(port, &team->port_list, list) { 2070 + if (!txable || team_port_txable(port)) 2071 + return port; 2072 + } 2073 + 2074 + return NULL; 2075 + } 2076 + 2077 + static int team_header_create(struct sk_buff *skb, struct net_device *team_dev, 2078 + unsigned short type, const void *daddr, 2079 + const void *saddr, unsigned int len) 2080 + { 2081 + struct team *team = netdev_priv(team_dev); 2082 + const struct header_ops *port_ops; 2083 + struct team_port *port; 2084 + int ret = 0; 2085 + 2086 + rcu_read_lock(); 2087 + port = team_header_port_get_rcu(team, true); 2088 + if (port) { 2089 + port_ops = READ_ONCE(port->dev->header_ops); 2090 + if (port_ops && port_ops->create) 2091 + ret = port_ops->create(skb, port->dev, 2092 + type, daddr, saddr, len); 2093 + } 2094 + rcu_read_unlock(); 2095 + return ret; 2096 + } 2097 + 2098 + static int team_header_parse(const struct sk_buff *skb, 2099 + const struct net_device *team_dev, 2100 + unsigned char *haddr) 2101 + { 2102 + struct team *team = netdev_priv(team_dev); 2103 + const struct header_ops *port_ops; 2104 + struct team_port *port; 2105 + int ret = 0; 2106 + 2107 + rcu_read_lock(); 2108 + port = team_header_port_get_rcu(team, false); 2109 + if (port) { 2110 + port_ops = READ_ONCE(port->dev->header_ops); 2111 + if (port_ops && port_ops->parse) 2112 + ret = port_ops->parse(skb, port->dev, haddr); 2113 + } 2114 + rcu_read_unlock(); 2115 + return ret; 2116 + } 2117 + 2118 + static const struct header_ops team_header_ops = { 2119 + .create = team_header_create, 2120 + .parse = team_header_parse, 2121 + }; 2122 + 2061 2123 static void team_setup_by_port(struct net_device *dev, 2062 2124 struct net_device *port_dev) 2063 2125 { ··· 2128 2066 if (port_dev->type == ARPHRD_ETHER) 2129 2067 dev->header_ops = team->header_ops_cache; 2130 2068 else 2131 - dev->header_ops = port_dev->header_ops; 2069 + dev->header_ops = port_dev->header_ops ? 2070 + &team_header_ops : NULL; 2132 2071 dev->type = port_dev->type; 2133 2072 dev->hard_header_len = port_dev->hard_header_len; 2134 2073 dev->needed_headroom = port_dev->needed_headroom;
+1 -1
drivers/net/tun_vnet.h
··· 244 244 245 245 if (virtio_net_hdr_tnl_from_skb(skb, tnl_hdr, has_tnl_offload, 246 246 tun_vnet_is_little_endian(flags), 247 - vlan_hlen, true)) { 247 + vlan_hlen, true, false)) { 248 248 struct virtio_net_hdr_v1 *hdr = &tnl_hdr->hash_hdr.hdr; 249 249 struct skb_shared_info *sinfo = skb_shinfo(skb); 250 250
+6 -1
drivers/net/virtio_net.c
··· 3267 3267 struct virtio_net_hdr_v1_hash_tunnel *hdr; 3268 3268 int num_sg; 3269 3269 unsigned hdr_len = vi->hdr_len; 3270 + bool feature_hdrlen; 3270 3271 bool can_push; 3272 + 3273 + feature_hdrlen = virtio_has_feature(vi->vdev, 3274 + VIRTIO_NET_F_GUEST_HDRLEN); 3271 3275 3272 3276 pr_debug("%s: xmit %p %pM\n", vi->dev->name, skb, dest); 3273 3277 ··· 3292 3288 3293 3289 if (virtio_net_hdr_tnl_from_skb(skb, hdr, vi->tx_tnl, 3294 3290 virtio_is_little_endian(vi->vdev), 0, 3295 - false)) 3291 + false, feature_hdrlen)) 3296 3292 return -EPROTO; 3297 3293 3298 3294 if (vi->mergeable_rx_bufs) ··· 3355 3351 /* Don't wait up for transmitted skbs to be freed. */ 3356 3352 if (!use_napi) { 3357 3353 skb_orphan(skb); 3354 + skb_dst_drop(skb); 3358 3355 nf_reset_ct(skb); 3359 3356 } 3360 3357
+3 -1
drivers/pci/pwrctrl/core.c
··· 299 299 struct device_node *remote __free(device_node) = 300 300 of_graph_get_remote_port_parent(endpoint); 301 301 if (remote) { 302 - if (of_pci_supply_present(remote)) 302 + if (of_pci_supply_present(remote)) { 303 + of_node_put(endpoint); 303 304 return true; 305 + } 304 306 } 305 307 } 306 308 }
-12
drivers/pci/pwrctrl/pci-pwrctrl-pwrseq.c
··· 68 68 return pwrseq_power_off(pwrseq->pwrseq); 69 69 } 70 70 71 - static void devm_pwrseq_pwrctrl_power_off(void *data) 72 - { 73 - struct pwrseq_pwrctrl *pwrseq = data; 74 - 75 - pwrseq_pwrctrl_power_off(&pwrseq->pwrctrl); 76 - } 77 - 78 71 static int pwrseq_pwrctrl_probe(struct platform_device *pdev) 79 72 { 80 73 const struct pwrseq_pwrctrl_pdata *pdata; ··· 93 100 if (IS_ERR(pwrseq->pwrseq)) 94 101 return dev_err_probe(dev, PTR_ERR(pwrseq->pwrseq), 95 102 "Failed to get the power sequencer\n"); 96 - 97 - ret = devm_add_action_or_reset(dev, devm_pwrseq_pwrctrl_power_off, 98 - pwrseq); 99 - if (ret) 100 - return ret; 101 103 102 104 pwrseq->pwrctrl.power_on = pwrseq_pwrctrl_power_on; 103 105 pwrseq->pwrctrl.power_off = pwrseq_pwrctrl_power_off;
-1
drivers/pci/pwrctrl/slot.c
··· 63 63 { 64 64 struct slot_pwrctrl *slot = data; 65 65 66 - slot_pwrctrl_power_off(&slot->pwrctrl); 67 66 regulator_bulk_free(slot->num_supplies, slot->supplies); 68 67 } 69 68
+2 -3
drivers/phy/Kconfig
··· 6 6 menu "PHY Subsystem" 7 7 8 8 config PHY_COMMON_PROPS 9 - bool 9 + bool "PHY common properties" if KUNIT_ALL_TESTS 10 10 help 11 11 This parses properties common between generic PHYs and Ethernet PHYs. 12 12 ··· 16 16 17 17 config PHY_COMMON_PROPS_TEST 18 18 tristate "KUnit tests for PHY common props" if !KUNIT_ALL_TESTS 19 - select PHY_COMMON_PROPS 20 - depends on KUNIT 19 + depends on KUNIT && PHY_COMMON_PROPS 21 20 default KUNIT_ALL_TESTS 22 21 help 23 22 This builds KUnit tests for the PHY common property API.
+2
drivers/phy/freescale/phy-fsl-lynx-28g.c
··· 1069 1069 1070 1070 for (i = 0; i < LYNX_28G_NUM_LANE; i++) { 1071 1071 lane = &priv->lane[i]; 1072 + if (!lane->phy) 1073 + continue; 1072 1074 1073 1075 mutex_lock(&lane->phy->mutex); 1074 1076
+1 -2
drivers/phy/qualcomm/phy-qcom-qmp-ufs.c
··· 990 990 QMP_PHY_INIT_CFG(QPHY_V6_PCS_UFS_MULTI_LANE_CTRL1, 0x02), 991 991 QMP_PHY_INIT_CFG(QPHY_V6_PCS_UFS_TX_MID_TERM_CTRL1, 0x43), 992 992 QMP_PHY_INIT_CFG(QPHY_V6_PCS_UFS_PCS_CTRL1, 0xc1), 993 + QMP_PHY_INIT_CFG(QPHY_V6_PCS_UFS_PLL_CNTL, 0x33), 993 994 QMP_PHY_INIT_CFG(QPHY_V6_PCS_UFS_TX_LARGE_AMP_DRV_LVL, 0x0f), 994 995 QMP_PHY_INIT_CFG(QPHY_V6_PCS_UFS_RX_SIGDET_CTRL2, 0x68), 995 996 QMP_PHY_INIT_CFG(QPHY_V6_PCS_UFS_TX_POST_EMP_LVL_S4, 0x0e), ··· 1000 999 }; 1001 1000 1002 1001 static const struct qmp_phy_init_tbl sm8650_ufsphy_g4_pcs[] = { 1003 - QMP_PHY_INIT_CFG(QPHY_V6_PCS_UFS_PLL_CNTL, 0x13), 1004 1002 QMP_PHY_INIT_CFG(QPHY_V6_PCS_UFS_TX_HSGEAR_CAPABILITY, 0x04), 1005 1003 QMP_PHY_INIT_CFG(QPHY_V6_PCS_UFS_RX_HSGEAR_CAPABILITY, 0x04), 1006 1004 }; 1007 1005 1008 1006 static const struct qmp_phy_init_tbl sm8650_ufsphy_g5_pcs[] = { 1009 - QMP_PHY_INIT_CFG(QPHY_V6_PCS_UFS_PLL_CNTL, 0x33), 1010 1007 QMP_PHY_INIT_CFG(QPHY_V6_PCS_UFS_TX_HSGEAR_CAPABILITY, 0x05), 1011 1008 QMP_PHY_INIT_CFG(QPHY_V6_PCS_UFS_RX_HSGEAR_CAPABILITY, 0x05), 1012 1009 QMP_PHY_INIT_CFG(QPHY_V6_PCS_UFS_RX_HS_G5_SYNC_LENGTH_CAPABILITY, 0x4d),
+14
drivers/phy/spacemit/phy-k1-usb2.c
··· 48 48 #define PHY_CLK_HSTXP_EN BIT(3) /* clock hstxp enable */ 49 49 #define PHY_HSTXP_MODE BIT(4) /* 0: force en_txp to be 1; 1: no force */ 50 50 51 + #define PHY_K1_HS_HOST_DISC 0x40 52 + #define PHY_K1_HS_HOST_DISC_CLR BIT(0) 53 + 51 54 #define PHY_PLL_DIV_CFG 0x98 52 55 #define PHY_FDIV_FRACT_8_15 GENMASK(7, 0) 53 56 #define PHY_FDIV_FRACT_16_19 GENMASK(11, 8) ··· 145 142 return 0; 146 143 } 147 144 145 + static int spacemit_usb2phy_disconnect(struct phy *phy, int port) 146 + { 147 + struct spacemit_usb2phy *sphy = phy_get_drvdata(phy); 148 + 149 + regmap_update_bits(sphy->regmap_base, PHY_K1_HS_HOST_DISC, 150 + PHY_K1_HS_HOST_DISC_CLR, PHY_K1_HS_HOST_DISC_CLR); 151 + 152 + return 0; 153 + } 154 + 148 155 static const struct phy_ops spacemit_usb2phy_ops = { 149 156 .init = spacemit_usb2phy_init, 150 157 .exit = spacemit_usb2phy_exit, 158 + .disconnect = spacemit_usb2phy_disconnect, 151 159 .owner = THIS_MODULE, 152 160 }; 153 161
+2
drivers/phy/ti/phy-j721e-wiz.c
··· 1425 1425 dev_err(dev, 1426 1426 "%s: Reading \"reg\" from \"%s\" failed: %d\n", 1427 1427 __func__, subnode->name, ret); 1428 + of_node_put(serdes); 1428 1429 return ret; 1429 1430 } 1430 1431 of_property_read_u32(subnode, "cdns,num-lanes", &num_lanes); ··· 1440 1439 } 1441 1440 } 1442 1441 1442 + of_node_put(serdes); 1443 1443 return 0; 1444 1444 } 1445 1445
+6 -3
drivers/pinctrl/mediatek/pinctrl-mtk-common.c
··· 1135 1135 goto chip_error; 1136 1136 } 1137 1137 1138 - ret = mtk_eint_init(pctl, pdev); 1139 - if (ret) 1140 - goto chip_error; 1138 + /* Only initialize EINT if we have EINT pins */ 1139 + if (data->eint_hw.ap_num > 0) { 1140 + ret = mtk_eint_init(pctl, pdev); 1141 + if (ret) 1142 + goto chip_error; 1143 + } 1141 1144 1142 1145 return 0; 1143 1146
+16
drivers/pinctrl/qcom/pinctrl-spmi-gpio.c
··· 723 723 .pin_config_group_dbg_show = pmic_gpio_config_dbg_show, 724 724 }; 725 725 726 + static int pmic_gpio_get_direction(struct gpio_chip *chip, unsigned pin) 727 + { 728 + struct pmic_gpio_state *state = gpiochip_get_data(chip); 729 + struct pmic_gpio_pad *pad; 730 + 731 + pad = state->ctrl->desc->pins[pin].drv_data; 732 + 733 + if (!pad->is_enabled || pad->analog_pass || 734 + (!pad->input_enabled && !pad->output_enabled)) 735 + return -EINVAL; 736 + 737 + /* Make sure the state is aligned on what pmic_gpio_get() returns */ 738 + return pad->input_enabled ? GPIO_LINE_DIRECTION_IN : GPIO_LINE_DIRECTION_OUT; 739 + } 740 + 726 741 static int pmic_gpio_direction_input(struct gpio_chip *chip, unsigned pin) 727 742 { 728 743 struct pmic_gpio_state *state = gpiochip_get_data(chip); ··· 816 801 } 817 802 818 803 static const struct gpio_chip pmic_gpio_gpio_template = { 804 + .get_direction = pmic_gpio_get_direction, 819 805 .direction_input = pmic_gpio_direction_input, 820 806 .direction_output = pmic_gpio_direction_output, 821 807 .get = pmic_gpio_get,
+1 -1
drivers/pinctrl/renesas/pinctrl-rza1.c
··· 589 589 { 590 590 void __iomem *mem = RZA1_ADDR(port->base, reg, port->id); 591 591 592 - return ioread16(mem) & BIT(bit); 592 + return !!(ioread16(mem) & BIT(bit)); 593 593 } 594 594 595 595 /**
+8 -7
drivers/pinctrl/renesas/pinctrl-rzt2h.c
··· 85 85 struct gpio_chip gpio_chip; 86 86 struct pinctrl_gpio_range gpio_range; 87 87 DECLARE_BITMAP(used_irqs, RZT2H_INTERRUPTS_NUM); 88 - spinlock_t lock; /* lock read/write registers */ 88 + raw_spinlock_t lock; /* lock read/write registers */ 89 89 struct mutex mutex; /* serialize adding groups and functions */ 90 90 bool safety_port_enabled; 91 91 atomic_t wakeup_path; ··· 145 145 u64 reg64; 146 146 u16 reg16; 147 147 148 - guard(spinlock_irqsave)(&pctrl->lock); 148 + guard(raw_spinlock_irqsave)(&pctrl->lock); 149 149 150 150 /* Set pin to 'Non-use (Hi-Z input protection)' */ 151 151 reg16 = rzt2h_pinctrl_readw(pctrl, port, PM(port)); ··· 474 474 if (ret) 475 475 return ret; 476 476 477 - guard(spinlock_irqsave)(&pctrl->lock); 477 + guard(raw_spinlock_irqsave)(&pctrl->lock); 478 478 479 479 /* Select GPIO mode in PMC Register */ 480 480 rzt2h_pinctrl_set_gpio_en(pctrl, port, bit, true); ··· 487 487 { 488 488 u16 reg; 489 489 490 - guard(spinlock_irqsave)(&pctrl->lock); 490 + guard(raw_spinlock_irqsave)(&pctrl->lock); 491 491 492 492 reg = rzt2h_pinctrl_readw(pctrl, port, PM(port)); 493 493 reg &= ~PM_PIN_MASK(bit); ··· 509 509 if (ret) 510 510 return ret; 511 511 512 - guard(spinlock_irqsave)(&pctrl->lock); 512 + guard(raw_spinlock_irqsave)(&pctrl->lock); 513 513 514 514 if (rzt2h_pinctrl_readb(pctrl, port, PMC(port)) & BIT(bit)) { 515 515 /* ··· 547 547 u8 bit = RZT2H_PIN_ID_TO_PIN(offset); 548 548 u8 reg; 549 549 550 - guard(spinlock_irqsave)(&pctrl->lock); 550 + guard(raw_spinlock_irqsave)(&pctrl->lock); 551 551 552 552 reg = rzt2h_pinctrl_readb(pctrl, port, P(port)); 553 553 if (value) ··· 833 833 if (ret) 834 834 return dev_err_probe(dev, ret, "Unable to parse gpio-ranges\n"); 835 835 836 + of_node_put(of_args.np); 836 837 if (of_args.args[0] != 0 || of_args.args[1] != 0 || 837 838 of_args.args[2] != pctrl->data->n_port_pins) 838 839 return dev_err_probe(dev, -EINVAL, ··· 965 964 if (ret) 966 965 return ret; 967 966 968 - spin_lock_init(&pctrl->lock); 967 + raw_spin_lock_init(&pctrl->lock); 969 968 mutex_init(&pctrl->mutex); 970 969 platform_set_drvdata(pdev, pctrl); 971 970
+1
drivers/pinctrl/stm32/Kconfig
··· 65 65 select PINMUX 66 66 select GENERIC_PINCONF 67 67 select GPIOLIB 68 + select GPIO_GENERIC 68 69 help 69 70 The Hardware Debug Port allows the observation of internal signals. 70 71 It uses configurable multiplexer to route signals in a dedicated observation register.
+32 -11
drivers/pinctrl/sunxi/pinctrl-sunxi.c
··· 157 157 const char *pin_name, 158 158 const char *func_name) 159 159 { 160 + unsigned long variant = pctl->flags & SUNXI_PINCTRL_VARIANT_MASK; 160 161 int i; 161 162 162 163 for (i = 0; i < pctl->desc->npins; i++) { ··· 169 168 while (func->name) { 170 169 if (!strcmp(func->name, func_name) && 171 170 (!func->variant || 172 - func->variant & pctl->variant)) 171 + func->variant & variant)) 173 172 return func; 174 173 175 174 func++; ··· 210 209 const u16 pin_num, 211 210 const u8 muxval) 212 211 { 212 + unsigned long variant = pctl->flags & SUNXI_PINCTRL_VARIANT_MASK; 213 + 213 214 for (unsigned int i = 0; i < pctl->desc->npins; i++) { 214 215 const struct sunxi_desc_pin *pin = pctl->desc->pins + i; 215 216 struct sunxi_desc_function *func = pin->functions; ··· 219 216 if (pin->pin.number != pin_num) 220 217 continue; 221 218 222 - if (pin->variant && !(pctl->variant & pin->variant)) 219 + if (pin->variant && !(variant & pin->variant)) 223 220 continue; 224 221 225 222 while (func->name) { ··· 1092 1089 { 1093 1090 struct sunxi_pinctrl *pctl = irq_data_get_irq_chip_data(d); 1094 1091 struct sunxi_desc_function *func; 1092 + unsigned int offset; 1093 + u32 reg, shift, mask; 1094 + u8 disabled_mux, muxval; 1095 1095 int ret; 1096 1096 1097 1097 func = sunxi_pinctrl_desc_find_function_by_pin(pctl, ··· 1102 1096 if (!func) 1103 1097 return -EINVAL; 1104 1098 1105 - ret = gpiochip_lock_as_irq(pctl->chip, 1106 - pctl->irq_array[d->hwirq] - pctl->desc->pin_base); 1099 + offset = pctl->irq_array[d->hwirq] - pctl->desc->pin_base; 1100 + sunxi_mux_reg(pctl, offset, &reg, &shift, &mask); 1101 + muxval = (readl(pctl->membase + reg) & mask) >> shift; 1102 + 1103 + /* Change muxing to GPIO INPUT mode if at reset value */ 1104 + if (pctl->flags & SUNXI_PINCTRL_NEW_REG_LAYOUT) 1105 + disabled_mux = SUN4I_FUNC_DISABLED_NEW; 1106 + else 1107 + disabled_mux = SUN4I_FUNC_DISABLED_OLD; 1108 + 1109 + if (muxval == disabled_mux) 1110 + sunxi_pmx_set(pctl->pctl_dev, pctl->irq_array[d->hwirq], 1111 + SUN4I_FUNC_INPUT); 1112 + 1113 + ret = gpiochip_lock_as_irq(pctl->chip, offset); 1107 1114 if (ret) { 1108 1115 dev_err(pctl->dev, "unable to lock HW IRQ %lu for IRQ\n", 1109 1116 irqd_to_hwirq(d)); ··· 1357 1338 static int sunxi_pinctrl_build_state(struct platform_device *pdev) 1358 1339 { 1359 1340 struct sunxi_pinctrl *pctl = platform_get_drvdata(pdev); 1341 + unsigned long variant = pctl->flags & SUNXI_PINCTRL_VARIANT_MASK; 1360 1342 void *ptr; 1361 1343 int i; 1362 1344 ··· 1382 1362 const struct sunxi_desc_pin *pin = pctl->desc->pins + i; 1383 1363 struct sunxi_pinctrl_group *group = pctl->groups + pctl->ngroups; 1384 1364 1385 - if (pin->variant && !(pctl->variant & pin->variant)) 1365 + if (pin->variant && !(variant & pin->variant)) 1386 1366 continue; 1387 1367 1388 1368 group->name = pin->pin.name; ··· 1407 1387 const struct sunxi_desc_pin *pin = pctl->desc->pins + i; 1408 1388 struct sunxi_desc_function *func; 1409 1389 1410 - if (pin->variant && !(pctl->variant & pin->variant)) 1390 + if (pin->variant && !(variant & pin->variant)) 1411 1391 continue; 1412 1392 1413 1393 for (func = pin->functions; func->name; func++) { 1414 - if (func->variant && !(pctl->variant & func->variant)) 1394 + if (func->variant && !(variant & func->variant)) 1415 1395 continue; 1416 1396 1417 1397 /* Create interrupt mapping while we're at it */ ··· 1439 1419 const struct sunxi_desc_pin *pin = pctl->desc->pins + i; 1440 1420 struct sunxi_desc_function *func; 1441 1421 1442 - if (pin->variant && !(pctl->variant & 
pin->variant)) 1422 + if (pin->variant && !(variant & pin->variant)) 1443 1423 continue; 1444 1424 1445 1425 for (func = pin->functions; func->name; func++) { 1446 1426 struct sunxi_pinctrl_function *func_item; 1447 1427 const char **func_grp; 1448 1428 1449 - if (func->variant && !(pctl->variant & func->variant)) 1429 + if (func->variant && !(variant & func->variant)) 1450 1430 continue; 1451 1431 1452 1432 func_item = sunxi_pinctrl_find_function_by_name(pctl, ··· 1588 1568 1589 1569 pctl->dev = &pdev->dev; 1590 1570 pctl->desc = desc; 1591 - pctl->variant = flags & SUNXI_PINCTRL_VARIANT_MASK; 1571 + pctl->flags = flags; 1592 1572 if (flags & SUNXI_PINCTRL_NEW_REG_LAYOUT) { 1593 1573 pctl->bank_mem_size = D1_BANK_MEM_SIZE; 1594 1574 pctl->pull_regs_offset = D1_PULL_REGS_OFFSET; ··· 1624 1604 1625 1605 for (i = 0, pin_idx = 0; i < pctl->desc->npins; i++) { 1626 1606 const struct sunxi_desc_pin *pin = pctl->desc->pins + i; 1607 + unsigned long variant = pctl->flags & SUNXI_PINCTRL_VARIANT_MASK; 1627 1608 1628 - if (pin->variant && !(pctl->variant & pin->variant)) 1609 + if (pin->variant && !(variant & pin->variant)) 1629 1610 continue; 1630 1611 1631 1612 pins[pin_idx++] = pin->pin;
+3 -1
drivers/pinctrl/sunxi/pinctrl-sunxi.h
··· 86 86 87 87 #define SUN4I_FUNC_INPUT 0 88 88 #define SUN4I_FUNC_IRQ 6 89 + #define SUN4I_FUNC_DISABLED_OLD 7 90 + #define SUN4I_FUNC_DISABLED_NEW 15 89 91 90 92 #define SUNXI_PINCTRL_VARIANT_MASK GENMASK(7, 0) 91 93 #define SUNXI_PINCTRL_NEW_REG_LAYOUT BIT(8) ··· 176 174 unsigned *irq_array; 177 175 raw_spinlock_t lock; 178 176 struct pinctrl_dev *pctl_dev; 179 - unsigned long variant; 177 + unsigned long flags; 180 178 u32 bank_mem_size; 181 179 u32 pull_regs_offset; 182 180 u32 dlevel_field_width;
+1 -1
drivers/platform/olpc/olpc-xo175-ec.c
··· 482 482 dev_dbg(dev, "CMD %x, %zd bytes expected\n", cmd, resp_len); 483 483 484 484 if (inlen > 5) { 485 - dev_err(dev, "command len %zd too big!\n", resp_len); 485 + dev_err(dev, "command len %zd too big!\n", inlen); 486 486 return -EOVERFLOW; 487 487 } 488 488
+1 -1
drivers/platform/x86/amd/hsmp/hsmp.c
··· 117 117 } 118 118 119 119 if (unlikely(mbox_status == HSMP_STATUS_NOT_READY)) { 120 - dev_err(sock->dev, "Message ID 0x%X failure : SMU tmeout (status = 0x%X)\n", 120 + dev_err(sock->dev, "Message ID 0x%X failure : SMU timeout (status = 0x%X)\n", 121 121 msg->msg_id, mbox_status); 122 122 return -ETIMEDOUT; 123 123 } else if (unlikely(mbox_status == HSMP_ERR_INVALID_MSG)) {
+77
drivers/platform/x86/asus-armoury.h
··· 1082 1082 }, 1083 1083 { 1084 1084 .matches = { 1085 + DMI_MATCH(DMI_BOARD_NAME, "GA503QM"), 1086 + }, 1087 + .driver_data = &(struct power_data) { 1088 + .ac_data = &(struct power_limits) { 1089 + .ppt_pl1_spl_min = 15, 1090 + .ppt_pl1_spl_def = 35, 1091 + .ppt_pl1_spl_max = 80, 1092 + .ppt_pl2_sppt_min = 65, 1093 + .ppt_pl2_sppt_max = 80, 1094 + }, 1095 + }, 1096 + }, 1097 + { 1098 + .matches = { 1085 1099 DMI_MATCH(DMI_BOARD_NAME, "GA503QR"), 1086 1100 }, 1087 1101 .driver_data = &(struct power_data) { ··· 1534 1520 }, 1535 1521 { 1536 1522 .matches = { 1523 + DMI_MATCH(DMI_BOARD_NAME, "GZ302EA"), 1524 + }, 1525 + .driver_data = &(struct power_data) { 1526 + .ac_data = &(struct power_limits) { 1527 + .ppt_pl1_spl_min = 28, 1528 + .ppt_pl1_spl_def = 60, 1529 + .ppt_pl1_spl_max = 80, 1530 + .ppt_pl2_sppt_min = 32, 1531 + .ppt_pl2_sppt_def = 75, 1532 + .ppt_pl2_sppt_max = 92, 1533 + .ppt_pl3_fppt_min = 45, 1534 + .ppt_pl3_fppt_def = 86, 1535 + .ppt_pl3_fppt_max = 93, 1536 + }, 1537 + .dc_data = &(struct power_limits) { 1538 + .ppt_pl1_spl_min = 28, 1539 + .ppt_pl1_spl_def = 45, 1540 + .ppt_pl1_spl_max = 80, 1541 + .ppt_pl2_sppt_min = 32, 1542 + .ppt_pl2_sppt_def = 52, 1543 + .ppt_pl2_sppt_max = 92, 1544 + .ppt_pl3_fppt_min = 45, 1545 + .ppt_pl3_fppt_def = 71, 1546 + .ppt_pl3_fppt_max = 93, 1547 + }, 1548 + }, 1549 + }, 1550 + { 1551 + .matches = { 1537 1552 DMI_MATCH(DMI_BOARD_NAME, "G513I"), 1538 1553 }, 1539 1554 .driver_data = &(struct power_data) { ··· 1633 1590 .ppt_pl2_sppt_max = 50, 1634 1591 .ppt_pl3_fppt_min = 28, 1635 1592 .ppt_pl3_fppt_max = 65, 1593 + .nv_temp_target_min = 75, 1594 + .nv_temp_target_max = 87, 1595 + }, 1596 + .requires_fan_curve = true, 1597 + }, 1598 + }, 1599 + { 1600 + .matches = { 1601 + DMI_MATCH(DMI_BOARD_NAME, "G614FP"), 1602 + }, 1603 + .driver_data = &(struct power_data) { 1604 + .ac_data = &(struct power_limits) { 1605 + .ppt_pl1_spl_min = 30, 1606 + .ppt_pl1_spl_max = 120, 1607 + .ppt_pl2_sppt_min = 65, 1608 + .ppt_pl2_sppt_def = 140, 1609 + .ppt_pl2_sppt_max = 165, 1610 + .ppt_pl3_fppt_min = 65, 1611 + .ppt_pl3_fppt_def = 140, 1612 + .ppt_pl3_fppt_max = 165, 1613 + .nv_temp_target_min = 75, 1614 + .nv_temp_target_max = 87, 1615 + .nv_dynamic_boost_min = 5, 1616 + .nv_dynamic_boost_max = 15, 1617 + .nv_tgp_min = 50, 1618 + .nv_tgp_max = 100, 1619 + }, 1620 + .dc_data = &(struct power_limits) { 1621 + .ppt_pl1_spl_min = 25, 1622 + .ppt_pl1_spl_max = 65, 1623 + .ppt_pl2_sppt_min = 25, 1624 + .ppt_pl2_sppt_max = 65, 1625 + .ppt_pl3_fppt_min = 35, 1626 + .ppt_pl3_fppt_max = 75, 1636 1627 .nv_temp_target_min = 75, 1637 1628 .nv_temp_target_max = 87, 1638 1629 },
+1 -1
drivers/platform/x86/asus-nb-wmi.c
··· 548 548 .callback = dmi_matched, 549 549 .ident = "ASUS ROG Z13", 550 550 .matches = { 551 - DMI_MATCH(DMI_SYS_VENDOR, "ASUSTeK COMPUTER INC."), 551 + DMI_MATCH(DMI_SYS_VENDOR, "ASUS"), 552 552 DMI_MATCH(DMI_PRODUCT_NAME, "ROG Flow Z13"), 553 553 }, 554 554 .driver_data = &quirk_asus_z13,
+19
drivers/platform/x86/hp/hp-wmi.c
··· 120 120 .ec_tp_offset = HP_VICTUS_S_EC_THERMAL_PROFILE_OFFSET, 121 121 }; 122 122 123 + static const struct thermal_profile_params omen_v1_legacy_thermal_params = { 124 + .performance = HP_OMEN_V1_THERMAL_PROFILE_PERFORMANCE, 125 + .balanced = HP_OMEN_V1_THERMAL_PROFILE_DEFAULT, 126 + .low_power = HP_OMEN_V1_THERMAL_PROFILE_DEFAULT, 127 + .ec_tp_offset = HP_OMEN_EC_THERMAL_PROFILE_OFFSET, 128 + }; 129 + 123 130 /* 124 131 * A generic pointer for the currently-active board's thermal profile 125 132 * parameters. ··· 183 176 /* DMI Board names of Victus 16-r and Victus 16-s laptops */ 184 177 static const struct dmi_system_id victus_s_thermal_profile_boards[] __initconst = { 185 178 { 179 + .matches = { DMI_MATCH(DMI_BOARD_NAME, "8A4D") }, 180 + .driver_data = (void *)&omen_v1_legacy_thermal_params, 181 + }, 182 + { 186 183 .matches = { DMI_MATCH(DMI_BOARD_NAME, "8BAB") }, 187 184 .driver_data = (void *)&omen_v1_thermal_params, 188 185 }, 189 186 { 190 187 .matches = { DMI_MATCH(DMI_BOARD_NAME, "8BBE") }, 191 188 .driver_data = (void *)&victus_s_thermal_params, 189 + }, 190 + { 191 + .matches = { DMI_MATCH(DMI_BOARD_NAME, "8BCA") }, 192 + .driver_data = (void *)&omen_v1_thermal_params, 192 193 }, 193 194 { 194 195 .matches = { DMI_MATCH(DMI_BOARD_NAME, "8BCD") }, ··· 209 194 { 210 195 .matches = { DMI_MATCH(DMI_BOARD_NAME, "8BD5") }, 211 196 .driver_data = (void *)&victus_s_thermal_params, 197 + }, 198 + { 199 + .matches = { DMI_MATCH(DMI_BOARD_NAME, "8C76") }, 200 + .driver_data = (void *)&omen_v1_thermal_params, 212 201 }, 213 202 { 214 203 .matches = { DMI_MATCH(DMI_BOARD_NAME, "8C78") },
+9 -1
drivers/platform/x86/intel/hid.c
··· 438 438 return 0; 439 439 } 440 440 441 + static int intel_hid_pl_freeze_handler(struct device *device) 442 + { 443 + struct intel_hid_priv *priv = dev_get_drvdata(device); 444 + 445 + priv->wakeup_mode = false; 446 + return intel_hid_pl_suspend_handler(device); 447 + } 448 + 441 449 static int intel_hid_pl_resume_handler(struct device *device) 442 450 { 443 451 intel_hid_pm_complete(device); ··· 460 452 static const struct dev_pm_ops intel_hid_pl_pm_ops = { 461 453 .prepare = intel_hid_pm_prepare, 462 454 .complete = intel_hid_pm_complete, 463 - .freeze = intel_hid_pl_suspend_handler, 455 + .freeze = intel_hid_pl_freeze_handler, 464 456 .thaw = intel_hid_pl_resume_handler, 465 457 .restore = intel_hid_pl_resume_handler, 466 458 .suspend = intel_hid_pl_suspend_handler,
+4 -1
drivers/platform/x86/intel/speed_select_if/isst_tpmi_core.c
··· 558 558 { 559 559 u64 value; 560 560 561 + if (!static_cpu_has(X86_FEATURE_HWP)) 562 + return true; 563 + 561 564 rdmsrq(MSR_PM_ENABLE, value); 562 565 return !(value & 0x1); 563 566 } ··· 872 869 _read_pp_info("current_level", perf_level.current_level, SST_PP_STATUS_OFFSET, 873 870 SST_PP_LEVEL_START, SST_PP_LEVEL_WIDTH, SST_MUL_FACTOR_NONE) 874 871 _read_pp_info("locked", perf_level.locked, SST_PP_STATUS_OFFSET, 875 - SST_PP_LOCK_START, SST_PP_LEVEL_WIDTH, SST_MUL_FACTOR_NONE) 872 + SST_PP_LOCK_START, SST_PP_LOCK_WIDTH, SST_MUL_FACTOR_NONE) 876 873 _read_pp_info("feature_state", perf_level.feature_state, SST_PP_STATUS_OFFSET, 877 874 SST_PP_FEATURE_STATE_START, SST_PP_FEATURE_STATE_WIDTH, SST_MUL_FACTOR_NONE) 878 875 perf_level.enabled = !!(power_domain_info->sst_header.cap_mask & BIT(1));
-2
drivers/platform/x86/lenovo/wmi-gamezone.c
··· 31 31 #define LWMI_GZ_METHOD_ID_SMARTFAN_SET 44 32 32 #define LWMI_GZ_METHOD_ID_SMARTFAN_GET 45 33 33 34 - static BLOCKING_NOTIFIER_HEAD(gz_chain_head); 35 - 36 34 struct lwmi_gz_priv { 37 35 enum thermal_mode current_mode; 38 36 struct notifier_block event_nb;
+2 -1
drivers/scsi/ibmvscsi/ibmvfc.c
··· 4966 4966 switch (mad_status) { 4967 4967 case IBMVFC_MAD_SUCCESS: 4968 4968 ibmvfc_dbg(vhost, "Discover Targets succeeded\n"); 4969 - vhost->num_targets = be32_to_cpu(rsp->num_written); 4969 + vhost->num_targets = min_t(u32, be32_to_cpu(rsp->num_written), 4970 + max_targets); 4970 4971 ibmvfc_set_host_action(vhost, IBMVFC_HOST_ACTION_ALLOC_TGTS); 4971 4972 break; 4972 4973 case IBMVFC_MAD_FAILED:
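Note: the ibmvfc change clamps the firmware-reported target count to the driver's configured maximum before it is used to size later target allocation. A minimal userspace sketch of the same defensive idea (MAX_TARGETS and the function name are illustrative, not taken from the driver):

    #include <stdint.h>
    #include <stdio.h>

    #define MAX_TARGETS 1024u   /* illustrative cap, not the driver's actual limit */

    /* Clamp a count reported by external firmware before trusting it to
     * size anything on the host side. */
    static uint32_t clamp_reported_count(uint32_t reported)
    {
            return reported < MAX_TARGETS ? reported : MAX_TARGETS;
    }

    int main(void)
    {
            printf("%u\n", clamp_reported_count(16));           /* 16 */
            printf("%u\n", clamp_reported_count(0xffffffffu));  /* 1024 */
            return 0;
    }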
+1 -1
drivers/scsi/scsi_transport_sas.c
··· 1734 1734 break; 1735 1735 1736 1736 default: 1737 - if (channel < shost->max_channel) { 1737 + if (channel <= shost->max_channel) { 1738 1738 res = scsi_scan_host_selected(shost, channel, id, lun, 1739 1739 SCSI_SCAN_MANUAL); 1740 1740 } else {
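Note: shost->max_channel holds the highest valid channel number, i.e. an inclusive upper bound, so the manual-scan check has to use <= rather than <, which silently rejected scans on the last channel. A trivial sketch with made-up values:

    #include <stdio.h>

    int main(void)
    {
            unsigned int max_channel = 3;   /* highest valid channel, inclusive */

            /* "ch < max_channel" would stop at 2 and never scan channel 3;
             * "ch <= max_channel" covers every valid channel. */
            for (unsigned int ch = 0; ch <= max_channel; ch++)
                    printf("scanning channel %u\n", ch);
            return 0;
    }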
+1 -1
drivers/scsi/ses.c
··· 215 215 unsigned char *type_ptr = ses_dev->page1_types; 216 216 unsigned char *desc_ptr = ses_dev->page2 + 8; 217 217 218 - if (ses_recv_diag(sdev, 2, ses_dev->page2, ses_dev->page2_len) < 0) 218 + if (ses_recv_diag(sdev, 2, ses_dev->page2, ses_dev->page2_len)) 219 219 return NULL; 220 220 221 221 for (i = 0; i < ses_dev->page1_num_types; i++, type_ptr += 4) {
+2 -1
drivers/spi/spi-fsl-lpspi.c
··· 1009 1009 enable_irq(irq); 1010 1010 } 1011 1011 1012 - ret = devm_spi_register_controller(&pdev->dev, controller); 1012 + ret = spi_register_controller(controller); 1013 1013 if (ret < 0) { 1014 1014 dev_err_probe(&pdev->dev, ret, "spi_register_controller error\n"); 1015 1015 goto free_dma; ··· 1035 1035 struct fsl_lpspi_data *fsl_lpspi = 1036 1036 spi_controller_get_devdata(controller); 1037 1037 1038 + spi_unregister_controller(controller); 1038 1039 fsl_lpspi_dma_exit(controller); 1039 1040 1040 1041 pm_runtime_dont_use_autosuspend(fsl_lpspi->dev);
-2
drivers/spi/spi-meson-spicc.c
··· 1101 1101 1102 1102 /* Disable SPI */ 1103 1103 writel(0, spicc->base + SPICC_CONREG); 1104 - 1105 - spi_controller_put(spicc->host); 1106 1104 } 1107 1105 1108 1106 static const struct meson_spicc_data meson_spicc_gx_data = {
+10 -32
drivers/spi/spi-sn-f-ospi.c
··· 612 612 u32 num_cs = OSPI_NUM_CS; 613 613 int ret; 614 614 615 - ctlr = spi_alloc_host(dev, sizeof(*ospi)); 615 + ctlr = devm_spi_alloc_host(dev, sizeof(*ospi)); 616 616 if (!ctlr) 617 617 return -ENOMEM; 618 618 ··· 635 635 platform_set_drvdata(pdev, ospi); 636 636 637 637 ospi->base = devm_platform_ioremap_resource(pdev, 0); 638 - if (IS_ERR(ospi->base)) { 639 - ret = PTR_ERR(ospi->base); 640 - goto err_put_ctlr; 641 - } 638 + if (IS_ERR(ospi->base)) 639 + return PTR_ERR(ospi->base); 642 640 643 641 ospi->clk = devm_clk_get_enabled(dev, NULL); 644 - if (IS_ERR(ospi->clk)) { 645 - ret = PTR_ERR(ospi->clk); 646 - goto err_put_ctlr; 647 - } 642 + if (IS_ERR(ospi->clk)) 643 + return PTR_ERR(ospi->clk); 648 644 649 - mutex_init(&ospi->mlock); 645 + ret = devm_mutex_init(dev, &ospi->mlock); 646 + if (ret) 647 + return ret; 650 648 651 649 ret = f_ospi_init(ospi); 652 650 if (ret) 653 - goto err_destroy_mutex; 651 + return ret; 654 652 655 - ret = devm_spi_register_controller(dev, ctlr); 656 - if (ret) 657 - goto err_destroy_mutex; 658 - 659 - return 0; 660 - 661 - err_destroy_mutex: 662 - mutex_destroy(&ospi->mlock); 663 - 664 - err_put_ctlr: 665 - spi_controller_put(ctlr); 666 - 667 - return ret; 668 - } 669 - 670 - static void f_ospi_remove(struct platform_device *pdev) 671 - { 672 - struct f_ospi *ospi = platform_get_drvdata(pdev); 673 - 674 - mutex_destroy(&ospi->mlock); 653 + return devm_spi_register_controller(dev, ctlr); 675 654 } 676 655 677 656 static const struct of_device_id f_ospi_dt_ids[] = { ··· 665 686 .of_match_table = f_ospi_dt_ids, 666 687 }, 667 688 .probe = f_ospi_probe, 668 - .remove = f_ospi_remove, 669 689 }; 670 690 module_platform_driver(f_ospi_driver); 671 691
+19 -13
drivers/spi/spi.c
··· 50 50 struct spi_device *spi = to_spi_device(dev); 51 51 52 52 spi_controller_put(spi->controller); 53 - kfree(spi->driver_override); 54 53 free_percpu(spi->pcpu_statistics); 55 54 kfree(spi); 56 55 } ··· 72 73 struct device_attribute *a, 73 74 const char *buf, size_t count) 74 75 { 75 - struct spi_device *spi = to_spi_device(dev); 76 76 int ret; 77 77 78 - ret = driver_set_override(dev, &spi->driver_override, buf, count); 78 + ret = __device_set_driver_override(dev, buf, count); 79 79 if (ret) 80 80 return ret; 81 81 ··· 84 86 static ssize_t driver_override_show(struct device *dev, 85 87 struct device_attribute *a, char *buf) 86 88 { 87 - const struct spi_device *spi = to_spi_device(dev); 88 - ssize_t len; 89 - 90 - device_lock(dev); 91 - len = sysfs_emit(buf, "%s\n", spi->driver_override ? : ""); 92 - device_unlock(dev); 93 - return len; 89 + guard(spinlock)(&dev->driver_override.lock); 90 + return sysfs_emit(buf, "%s\n", dev->driver_override.name ?: ""); 94 91 } 95 92 static DEVICE_ATTR_RW(driver_override); 96 93 ··· 369 376 { 370 377 const struct spi_device *spi = to_spi_device(dev); 371 378 const struct spi_driver *sdrv = to_spi_driver(drv); 379 + int ret; 372 380 373 381 /* Check override first, and if set, only use the named driver */ 374 - if (spi->driver_override) 375 - return strcmp(spi->driver_override, drv->name) == 0; 382 + ret = device_match_driver_override(dev, drv); 383 + if (ret >= 0) 384 + return ret; 376 385 377 386 /* Attempt an OF style match */ 378 387 if (of_driver_match_device(dev, drv)) ··· 3534 3539 if (ret) 3535 3540 return ret; 3536 3541 3537 - return devm_add_action_or_reset(dev, devm_spi_unregister_controller, ctlr); 3542 + /* 3543 + * Prevent controller from being freed by spi_unregister_controller() 3544 + * if devm_add_action_or_reset() fails for a non-devres allocated 3545 + * controller. 3546 + */ 3547 + spi_controller_get(ctlr); 3538 3548 3549 + ret = devm_add_action_or_reset(dev, devm_spi_unregister_controller, ctlr); 3550 + 3551 + if (ret == 0 || ctlr->devm_allocated) 3552 + spi_controller_put(ctlr); 3553 + 3554 + return ret; 3539 3555 } 3540 3556 EXPORT_SYMBOL_GPL(devm_spi_register_controller); 3541 3557
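Note: devm_spi_register_controller() now pins the controller with an extra reference because devm_add_action_or_reset() runs the cleanup action immediately when it fails, and for a non-devres controller that action would drop the final reference. A minimal userspace model of the pinning idea (obj_get()/obj_put() and add_action_or_reset() are stand-ins, not the kernel API):

    #include <stdio.h>
    #include <stdlib.h>

    struct obj { int refs; };   /* toy refcounted object standing in for the controller */

    static void obj_get(struct obj *o) { o->refs++; }
    static void obj_put(struct obj *o)
    {
            if (--o->refs == 0) {
                    printf("object freed\n");
                    free(o);
            }
    }

    /* Models devm_add_action_or_reset(): on failure the action runs at once. */
    static int add_action_or_reset(int fail, void (*action)(struct obj *), struct obj *o)
    {
            if (fail) {
                    action(o);
                    return -1;
            }
            return 0;
    }

    static void unregister_action(struct obj *o) { obj_put(o); }

    int main(void)
    {
            struct obj *o = calloc(1, sizeof(*o));

            if (!o)
                    return 1;
            o->refs = 1;            /* caller's reference */
            obj_get(o);             /* pin across the call that may clean up early */
            if (add_action_or_reset(1 /* simulate failure */, unregister_action, o))
                    printf("cleanup ran early, object still has %d ref(s)\n", o->refs);
            obj_put(o);             /* drop the pin; the object is freed here */
            return 0;
    }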
+46 -6
drivers/target/loopback/tcm_loop.c
··· 26 26 #include <linux/slab.h> 27 27 #include <linux/types.h> 28 28 #include <linux/configfs.h> 29 + #include <linux/blk-mq.h> 29 30 #include <scsi/scsi.h> 30 31 #include <scsi/scsi_tcq.h> 31 32 #include <scsi/scsi_host.h> ··· 270 269 return (ret == TMR_FUNCTION_COMPLETE) ? SUCCESS : FAILED; 271 270 } 272 271 272 + static bool tcm_loop_flush_work_iter(struct request *rq, void *data) 273 + { 274 + struct scsi_cmnd *sc = blk_mq_rq_to_pdu(rq); 275 + struct tcm_loop_cmd *tl_cmd = scsi_cmd_priv(sc); 276 + struct se_cmd *se_cmd = &tl_cmd->tl_se_cmd; 277 + 278 + flush_work(&se_cmd->work); 279 + return true; 280 + } 281 + 273 282 static int tcm_loop_target_reset(struct scsi_cmnd *sc) 274 283 { 275 284 struct tcm_loop_hba *tl_hba; 276 285 struct tcm_loop_tpg *tl_tpg; 286 + struct Scsi_Host *sh = sc->device->host; 287 + int ret; 277 288 278 289 /* 279 290 * Locate the tcm_loop_hba_t pointer 280 291 */ 281 - tl_hba = *(struct tcm_loop_hba **)shost_priv(sc->device->host); 292 + tl_hba = *(struct tcm_loop_hba **)shost_priv(sh); 282 293 if (!tl_hba) { 283 294 pr_err("Unable to perform device reset without active I_T Nexus\n"); 284 295 return FAILED; ··· 299 286 * Locate the tl_tpg pointer from TargetID in sc->device->id 300 287 */ 301 288 tl_tpg = &tl_hba->tl_hba_tpgs[sc->device->id]; 302 - if (tl_tpg) { 303 - tl_tpg->tl_transport_status = TCM_TRANSPORT_ONLINE; 304 - return SUCCESS; 305 - } 306 - return FAILED; 289 + if (!tl_tpg) 290 + return FAILED; 291 + 292 + /* 293 + * Issue a LUN_RESET to drain all commands that the target core 294 + * knows about. This handles commands not yet marked CMD_T_COMPLETE. 295 + */ 296 + ret = tcm_loop_issue_tmr(tl_tpg, sc->device->lun, 0, TMR_LUN_RESET); 297 + if (ret != TMR_FUNCTION_COMPLETE) 298 + return FAILED; 299 + 300 + /* 301 + * Flush any deferred target core completion work that may still be 302 + * queued. Commands that already had CMD_T_COMPLETE set before the TMR 303 + * are skipped by the TMR drain, but their async completion work 304 + * (transport_lun_remove_cmd → percpu_ref_put, release_cmd → scsi_done) 305 + * may still be pending in target_completion_wq. 306 + * 307 + * The SCSI EH will reuse in-flight scsi_cmnd structures for recovery 308 + * commands (e.g. TUR) immediately after this handler returns SUCCESS — 309 + * if deferred work is still pending, the memset in queuecommand would 310 + * zero the se_cmd while the work accesses it, leaking the LUN 311 + * percpu_ref and hanging configfs unlink forever. 312 + * 313 + * Use blk_mq_tagset_busy_iter() to find all started requests and 314 + * flush_work() on each — the same pattern used by mpi3mr, scsi_debug, 315 + * and other SCSI drivers to drain outstanding commands during reset. 316 + */ 317 + blk_mq_tagset_busy_iter(&sh->tag_set, tcm_loop_flush_work_iter, NULL); 318 + 319 + tl_tpg->tl_transport_status = TCM_TRANSPORT_ONLINE; 320 + return SUCCESS; 307 321 } 308 322 309 323 static const struct scsi_host_template tcm_loop_driver_template = {
+1 -1
drivers/target/target_core_file.c
··· 276 276 ssize_t len = 0; 277 277 int ret = 0, i; 278 278 279 - aio_cmd = kmalloc_flex(*aio_cmd, bvecs, sgl_nents); 279 + aio_cmd = kzalloc_flex(*aio_cmd, bvecs, sgl_nents); 280 280 if (!aio_cmd) 281 281 return TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE; 282 282
+7 -1
drivers/thermal/intel/int340x_thermal/processor_thermal_soc_slider.c
··· 176 176 177 177 static void set_soc_power_profile(struct proc_thermal_device *proc_priv, int slider) 178 178 { 179 + u8 offset; 179 180 u64 val; 180 181 181 182 val = read_soc_slider(proc_priv); 182 183 val &= ~SLIDER_MASK; 183 184 val |= FIELD_PREP(SLIDER_MASK, slider) | BIT(SLIDER_ENABLE_BIT); 184 185 186 + if (slider == SOC_SLIDER_VALUE_MINIMUM || slider == SOC_SLIDER_VALUE_MAXIMUM) 187 + offset = 0; 188 + else 189 + offset = slider_offset; 190 + 185 191 /* Set the slider offset from module params */ 186 192 val &= ~SLIDER_OFFSET_MASK; 187 - val |= FIELD_PREP(SLIDER_OFFSET_MASK, slider_offset); 193 + val |= FIELD_PREP(SLIDER_OFFSET_MASK, offset); 188 194 189 195 write_soc_slider(proc_priv, val); 190 196 }
+2 -3
drivers/vfio/pci/vfio_pci_dmabuf.c
··· 301 301 */ 302 302 ret = dma_buf_fd(priv->dmabuf, get_dma_buf.open_flags); 303 303 if (ret < 0) 304 - goto err_dma_buf; 304 + dma_buf_put(priv->dmabuf); 305 + 305 306 return ret; 306 307 307 - err_dma_buf: 308 - dma_buf_put(priv->dmabuf); 309 308 err_dev_put: 310 309 vfio_device_put_registration(&vdev->vdev); 311 310 err_free_phys:
+10 -2
drivers/virt/coco/tdx-guest/tdx-guest.c
··· 171 171 #define GET_QUOTE_SUCCESS 0 172 172 #define GET_QUOTE_IN_FLIGHT 0xffffffffffffffff 173 173 174 + #define TDX_QUOTE_MAX_LEN (GET_QUOTE_BUF_SIZE - sizeof(struct tdx_quote_buf)) 175 + 174 176 /* struct tdx_quote_buf: Format of Quote request buffer. 175 177 * @version: Quote format version, filled by TD. 176 178 * @status: Status code of Quote request, filled by VMM. ··· 271 269 u8 *buf; 272 270 struct tdx_quote_buf *quote_buf = quote_data; 273 271 struct tsm_report_desc *desc = &report->desc; 272 + u32 out_len; 274 273 int ret; 275 274 u64 err; 276 275 ··· 309 306 return ret; 310 307 } 311 308 312 - buf = kvmemdup(quote_buf->data, quote_buf->out_len, GFP_KERNEL); 309 + out_len = READ_ONCE(quote_buf->out_len); 310 + 311 + if (out_len > TDX_QUOTE_MAX_LEN) 312 + return -EFBIG; 313 + 314 + buf = kvmemdup(quote_buf->data, out_len, GFP_KERNEL); 313 315 if (!buf) 314 316 return -ENOMEM; 315 317 316 318 report->outblob = buf; 317 - report->outblob_len = quote_buf->out_len; 319 + report->outblob_len = out_len; 318 320 319 321 /* 320 322 * TODO: parse the PEM-formatted cert chain out of the quote buffer when
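Note: the quote buffer is shared with the VMM, so out_len is fetched once with READ_ONCE(), bounds-checked against the buffer size, and only that snapshot is used for both the allocation and the reported length; re-reading the field would reintroduce the double fetch. A standalone sketch of the pattern (structure layout and sizes are illustrative):

    #include <errno.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    #define QUOTE_BUF_SIZE 8192u            /* illustrative shared-buffer size */

    struct quote_hdr {
            uint32_t out_len;               /* written by the untrusted side */
            uint8_t  data[];
    };

    #define QUOTE_MAX_LEN (QUOTE_BUF_SIZE - sizeof(struct quote_hdr))

    /* Fetch the length exactly once, validate the snapshot, then use only
     * the snapshot for the allocation and the copy. */
    static int copy_quote(const struct quote_hdr *q, uint8_t **out, uint32_t *outlen)
    {
            uint32_t len = *(volatile const uint32_t *)&q->out_len;  /* single fetch */

            if (len > QUOTE_MAX_LEN)
                    return -EFBIG;
            *out = malloc(len);
            if (!*out)
                    return -ENOMEM;
            memcpy(*out, q->data, len);
            *outlen = len;
            return 0;
    }

    int main(void)
    {
            struct quote_hdr *q = calloc(1, QUOTE_BUF_SIZE);
            uint8_t *blob = NULL;
            uint32_t blob_len = 0;
            int ret;

            if (!q)
                    return 1;
            q->out_len = 16;                /* pretend the other side wrote 16 bytes */
            memset(q->data, 0xab, q->out_len);
            ret = copy_quote(q, &blob, &blob_len);
            printf("copy_quote: %d, len %u\n", ret, blob_len);
            free(blob);
            free(q);
            return 0;
    }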
+5 -5
drivers/virtio/virtio_ring.c
··· 2912 2912 * @data: the token identifying the buffer. 2913 2913 * @gfp: how to do memory allocations (if necessary). 2914 2914 * 2915 - * Same as virtqueue_add_inbuf but passes DMA_ATTR_CPU_CACHE_CLEAN to indicate 2916 - * that the CPU will not dirty any cacheline overlapping this buffer while it 2917 - * is available, and to suppress overlapping cacheline warnings in DMA debug 2918 - * builds. 2915 + * Same as virtqueue_add_inbuf but passes DMA_ATTR_DEBUGGING_IGNORE_CACHELINES 2916 + * to indicate that the CPU will not dirty any cacheline overlapping this buffer 2917 + * while it is available, and to suppress overlapping cacheline warnings in DMA 2918 + * debug builds. 2919 2919 * 2920 2920 * Caller must ensure we don't call this with other virtqueue operations 2921 2921 * at the same time (except where noted). ··· 2928 2928 gfp_t gfp) 2929 2929 { 2930 2930 return virtqueue_add(vq, &sg, num, 0, 1, data, NULL, false, gfp, 2931 - DMA_ATTR_CPU_CACHE_CLEAN); 2931 + DMA_ATTR_DEBUGGING_IGNORE_CACHELINES); 2932 2932 } 2933 2933 EXPORT_SYMBOL_GPL(virtqueue_add_inbuf_cache_clean); 2934 2934
+73 -3
drivers/xen/privcmd.c
··· 12 12 #include <linux/eventfd.h> 13 13 #include <linux/file.h> 14 14 #include <linux/kernel.h> 15 + #include <linux/kstrtox.h> 15 16 #include <linux/module.h> 16 17 #include <linux/mutex.h> 17 18 #include <linux/poll.h> ··· 31 30 #include <linux/seq_file.h> 32 31 #include <linux/miscdevice.h> 33 32 #include <linux/moduleparam.h> 33 + #include <linux/notifier.h> 34 + #include <linux/security.h> 34 35 #include <linux/virtio_mmio.h> 36 + #include <linux/wait.h> 35 37 36 38 #include <asm/xen/hypervisor.h> 37 39 #include <asm/xen/hypercall.h> ··· 50 46 #include <xen/page.h> 51 47 #include <xen/xen-ops.h> 52 48 #include <xen/balloon.h> 49 + #include <xen/xenbus.h> 53 50 #ifdef CONFIG_XEN_ACPI 54 51 #include <xen/acpi.h> 55 52 #endif ··· 73 68 MODULE_PARM_DESC(dm_op_buf_max_size, 74 69 "Maximum size of a dm_op hypercall buffer"); 75 70 71 + static bool unrestricted; 72 + module_param(unrestricted, bool, 0); 73 + MODULE_PARM_DESC(unrestricted, 74 + "Don't restrict hypercalls to target domain if running in a domU"); 75 + 76 76 struct privcmd_data { 77 77 domid_t domid; 78 78 }; 79 + 80 + /* DOMID_INVALID implies no restriction */ 81 + static domid_t target_domain = DOMID_INVALID; 82 + static bool restrict_wait; 83 + static DECLARE_WAIT_QUEUE_HEAD(restrict_wait_wq); 79 84 80 85 static int privcmd_vma_range_is_mapped( 81 86 struct vm_area_struct *vma, ··· 1578 1563 1579 1564 static int privcmd_open(struct inode *ino, struct file *file) 1580 1565 { 1581 - struct privcmd_data *data = kzalloc_obj(*data); 1566 + struct privcmd_data *data; 1582 1567 1568 + if (wait_event_interruptible(restrict_wait_wq, !restrict_wait) < 0) 1569 + return -EINTR; 1570 + 1571 + data = kzalloc_obj(*data); 1583 1572 if (!data) 1584 1573 return -ENOMEM; 1585 1574 1586 - /* DOMID_INVALID implies no restriction */ 1587 - data->domid = DOMID_INVALID; 1575 + data->domid = target_domain; 1588 1576 1589 1577 file->private_data = data; 1590 1578 return 0; ··· 1680 1662 .fops = &xen_privcmd_fops, 1681 1663 }; 1682 1664 1665 + static int init_restrict(struct notifier_block *notifier, 1666 + unsigned long event, 1667 + void *data) 1668 + { 1669 + char *target; 1670 + unsigned int domid; 1671 + 1672 + /* Default to an guaranteed unused domain-id. 
*/ 1673 + target_domain = DOMID_IDLE; 1674 + 1675 + target = xenbus_read(XBT_NIL, "target", "", NULL); 1676 + if (IS_ERR(target) || kstrtouint(target, 10, &domid)) { 1677 + pr_err("No target domain found, blocking all hypercalls\n"); 1678 + goto out; 1679 + } 1680 + 1681 + target_domain = domid; 1682 + 1683 + out: 1684 + if (!IS_ERR(target)) 1685 + kfree(target); 1686 + 1687 + restrict_wait = false; 1688 + wake_up_all(&restrict_wait_wq); 1689 + 1690 + return NOTIFY_DONE; 1691 + } 1692 + 1693 + static struct notifier_block xenstore_notifier = { 1694 + .notifier_call = init_restrict, 1695 + }; 1696 + 1697 + static void __init restrict_driver(void) 1698 + { 1699 + if (unrestricted) { 1700 + if (security_locked_down(LOCKDOWN_XEN_USER_ACTIONS)) 1701 + pr_warn("Kernel is locked down, parameter \"unrestricted\" ignored\n"); 1702 + else 1703 + return; 1704 + } 1705 + 1706 + restrict_wait = true; 1707 + 1708 + register_xenstore_notifier(&xenstore_notifier); 1709 + } 1710 + 1683 1711 static int __init privcmd_init(void) 1684 1712 { 1685 1713 int err; 1686 1714 1687 1715 if (!xen_domain()) 1688 1716 return -ENODEV; 1717 + 1718 + if (!xen_initial_domain()) 1719 + restrict_driver(); 1689 1720 1690 1721 err = misc_register(&privcmd_dev); 1691 1722 if (err != 0) { ··· 1765 1698 1766 1699 static void __exit privcmd_exit(void) 1767 1700 { 1701 + if (!xen_initial_domain()) 1702 + unregister_xenstore_notifier(&xenstore_notifier); 1703 + 1768 1704 privcmd_ioeventfd_exit(); 1769 1705 privcmd_irqfd_exit(); 1770 1706 misc_deregister(&privcmd_dev);
+1 -1
fs/btrfs/block-group.c
··· 4583 4583 for (int i = 0; i < BTRFS_SPACE_INFO_SUB_GROUP_MAX; i++) { 4584 4584 if (space_info->sub_group[i]) { 4585 4585 check_removing_space_info(space_info->sub_group[i]); 4586 - kfree(space_info->sub_group[i]); 4586 + btrfs_sysfs_remove_space_info(space_info->sub_group[i]); 4587 4587 space_info->sub_group[i] = NULL; 4588 4588 } 4589 4589 }
+2 -2
fs/btrfs/disk-io.c
··· 2531 2531 2532 2532 if (mirror_num >= 0 && 2533 2533 btrfs_super_bytenr(sb) != btrfs_sb_offset(mirror_num)) { 2534 - btrfs_err(fs_info, "super offset mismatch %llu != %u", 2535 - btrfs_super_bytenr(sb), BTRFS_SUPER_INFO_OFFSET); 2534 + btrfs_err(fs_info, "super offset mismatch %llu != %llu", 2535 + btrfs_super_bytenr(sb), btrfs_sb_offset(mirror_num)); 2536 2536 ret = -EINVAL; 2537 2537 } 2538 2538
+65 -33
fs/btrfs/tree-log.c
··· 4616 4616 struct inode *inode, bool log_inode_only, 4617 4617 u64 logged_isize) 4618 4618 { 4619 + u64 gen = BTRFS_I(inode)->generation; 4619 4620 u64 flags; 4620 4621 4621 4622 if (log_inode_only) { 4622 - /* set the generation to zero so the recover code 4623 - * can tell the difference between an logging 4624 - * just to say 'this inode exists' and a logging 4625 - * to say 'update this inode with these values' 4623 + /* 4624 + * Set the generation to zero so the recover code can tell the 4625 + * difference between a logging just to say 'this inode exists' 4626 + * and a logging to say 'update this inode with these values'. 4627 + * But only if the inode was not already logged before. 4628 + * We access ->logged_trans directly since it was already set 4629 + * up in the call chain by btrfs_log_inode(), and data_race() 4630 + * to avoid false alerts from KCSAN and since it was set already 4631 + * and one can set it to 0 since that only happens on eviction 4632 + * and we are holding a ref on the inode. 4626 4633 */ 4627 - btrfs_set_inode_generation(leaf, item, 0); 4634 + ASSERT(data_race(BTRFS_I(inode)->logged_trans) > 0); 4635 + if (data_race(BTRFS_I(inode)->logged_trans) < trans->transid) 4636 + gen = 0; 4637 + 4628 4638 btrfs_set_inode_size(leaf, item, logged_isize); 4629 4639 } else { 4630 - btrfs_set_inode_generation(leaf, item, BTRFS_I(inode)->generation); 4631 4640 btrfs_set_inode_size(leaf, item, inode->i_size); 4632 4641 } 4642 + 4643 + btrfs_set_inode_generation(leaf, item, gen); 4633 4644 4634 4645 btrfs_set_inode_uid(leaf, item, i_uid_read(inode)); 4635 4646 btrfs_set_inode_gid(leaf, item, i_gid_read(inode)); ··· 5459 5448 return 0; 5460 5449 } 5461 5450 5462 - static int logged_inode_size(struct btrfs_root *log, struct btrfs_inode *inode, 5463 - struct btrfs_path *path, u64 *size_ret) 5451 + static int get_inode_size_to_log(struct btrfs_trans_handle *trans, 5452 + struct btrfs_inode *inode, 5453 + struct btrfs_path *path, u64 *size_ret) 5464 5454 { 5465 5455 struct btrfs_key key; 5456 + struct btrfs_inode_item *item; 5466 5457 int ret; 5467 5458 5468 5459 key.objectid = btrfs_ino(inode); 5469 5460 key.type = BTRFS_INODE_ITEM_KEY; 5470 5461 key.offset = 0; 5471 5462 5472 - ret = btrfs_search_slot(NULL, log, &key, path, 0, 0); 5473 - if (ret < 0) { 5474 - return ret; 5475 - } else if (ret > 0) { 5476 - *size_ret = 0; 5477 - } else { 5478 - struct btrfs_inode_item *item; 5463 + /* 5464 + * Our caller called inode_logged(), so logged_trans is up to date. 5465 + * Use data_race() to silence any warning from KCSAN. Once logged_trans 5466 + * is set, it can only be reset to 0 after inode eviction. 
5467 + */ 5468 + if (data_race(inode->logged_trans) == trans->transid) { 5469 + ret = btrfs_search_slot(NULL, inode->root->log_root, &key, path, 0, 0); 5470 + } else if (inode->generation < trans->transid) { 5471 + path->search_commit_root = true; 5472 + path->skip_locking = true; 5473 + ret = btrfs_search_slot(NULL, inode->root, &key, path, 0, 0); 5474 + path->search_commit_root = false; 5475 + path->skip_locking = false; 5479 5476 5480 - item = btrfs_item_ptr(path->nodes[0], path->slots[0], 5481 - struct btrfs_inode_item); 5482 - *size_ret = btrfs_inode_size(path->nodes[0], item); 5483 - /* 5484 - * If the in-memory inode's i_size is smaller then the inode 5485 - * size stored in the btree, return the inode's i_size, so 5486 - * that we get a correct inode size after replaying the log 5487 - * when before a power failure we had a shrinking truncate 5488 - * followed by addition of a new name (rename / new hard link). 5489 - * Otherwise return the inode size from the btree, to avoid 5490 - * data loss when replaying a log due to previously doing a 5491 - * write that expands the inode's size and logging a new name 5492 - * immediately after. 5493 - */ 5494 - if (*size_ret > inode->vfs_inode.i_size) 5495 - *size_ret = inode->vfs_inode.i_size; 5477 + } else { 5478 + *size_ret = 0; 5479 + return 0; 5496 5480 } 5481 + 5482 + /* 5483 + * If the inode was logged before or is from a past transaction, then 5484 + * its inode item must exist in the log root or in the commit root. 5485 + */ 5486 + ASSERT(ret <= 0); 5487 + if (WARN_ON_ONCE(ret > 0)) 5488 + ret = -ENOENT; 5489 + 5490 + if (ret < 0) 5491 + return ret; 5492 + 5493 + item = btrfs_item_ptr(path->nodes[0], path->slots[0], 5494 + struct btrfs_inode_item); 5495 + *size_ret = btrfs_inode_size(path->nodes[0], item); 5496 + /* 5497 + * If the in-memory inode's i_size is smaller then the inode size stored 5498 + * in the btree, return the inode's i_size, so that we get a correct 5499 + * inode size after replaying the log when before a power failure we had 5500 + * a shrinking truncate followed by addition of a new name (rename / new 5501 + * hard link). Otherwise return the inode size from the btree, to avoid 5502 + * data loss when replaying a log due to previously doing a write that 5503 + * expands the inode's size and logging a new name immediately after. 5504 + */ 5505 + if (*size_ret > inode->vfs_inode.i_size) 5506 + *size_ret = inode->vfs_inode.i_size; 5497 5507 5498 5508 btrfs_release_path(path); 5499 5509 return 0; ··· 7028 6996 ret = drop_inode_items(trans, log, path, inode, 7029 6997 BTRFS_XATTR_ITEM_KEY); 7030 6998 } else { 7031 - if (inode_only == LOG_INODE_EXISTS && ctx->logged_before) { 6999 + if (inode_only == LOG_INODE_EXISTS) { 7032 7000 /* 7033 7001 * Make sure the new inode item we write to the log has 7034 7002 * the same isize as the current one (if it exists). ··· 7042 7010 * (zeroes), as if an expanding truncate happened, 7043 7011 * instead of getting a file of 4Kb only. 7044 7012 */ 7045 - ret = logged_inode_size(log, inode, path, &logged_isize); 7013 + ret = get_inode_size_to_log(trans, inode, path, &logged_isize); 7046 7014 if (ret) 7047 7015 goto out_unlock; 7048 7016 }
+3 -2
fs/btrfs/volumes.c
··· 8099 8099 smp_rmb(); 8100 8100 8101 8101 ret = update_dev_stat_item(trans, device); 8102 - if (!ret) 8103 - atomic_sub(stats_cnt, &device->dev_stats_ccnt); 8102 + if (ret) 8103 + break; 8104 + atomic_sub(stats_cnt, &device->dev_stats_ccnt); 8104 8105 } 8105 8106 mutex_unlock(&fs_devices->device_list_mutex); 8106 8107
+3 -1
fs/btrfs/zlib.c
··· 308 308 } 309 309 /* Queue the remaining part of the folio. */ 310 310 if (workspace->strm.total_out > bio->bi_iter.bi_size) { 311 - u32 cur_len = offset_in_folio(out_folio, workspace->strm.total_out); 311 + const u32 cur_len = workspace->strm.total_out - bio->bi_iter.bi_size; 312 + 313 + ASSERT(cur_len <= folio_size(out_folio)); 312 314 313 315 if (!bio_add_folio(bio, out_folio, cur_len, 0)) { 314 316 ret = -E2BIG;
+29 -14
fs/erofs/Kconfig
··· 16 16 select ZLIB_INFLATE if EROFS_FS_ZIP_DEFLATE 17 17 select ZSTD_DECOMPRESS if EROFS_FS_ZIP_ZSTD 18 18 help 19 - EROFS (Enhanced Read-Only File System) is a lightweight read-only 20 - file system with modern designs (e.g. no buffer heads, inline 21 - xattrs/data, chunk-based deduplication, multiple devices, etc.) for 22 - scenarios which need high-performance read-only solutions, e.g. 23 - smartphones with Android OS, LiveCDs and high-density hosts with 24 - numerous containers; 19 + EROFS (Enhanced Read-Only File System) is a modern, lightweight, 20 + secure read-only filesystem for various use cases, such as immutable 21 + system images, container images, application sandboxes, and datasets. 25 22 26 - It also provides transparent compression and deduplication support to 27 - improve storage density and maintain relatively high compression 28 - ratios, and it implements in-place decompression to temporarily reuse 29 - page cache for compressed data using proper strategies, which is 30 - quite useful for ensuring guaranteed end-to-end runtime decompression 23 + EROFS uses a flexible, hierarchical on-disk design so that features 24 + can be enabled on demand: the core on-disk format is block-aligned in 25 + order to perform optimally on all kinds of devices, including block 26 + and memory-backed devices; the format is easy to parse and has zero 27 + metadata redundancy, unlike generic filesystems, making it ideal for 28 + filesystem auditing and remote access; inline data, random-access 29 + friendly directory data, inline/shared extended attributes and 30 + chunk-based deduplication ensure space efficiency while maintaining 31 + high performance. 32 + 33 + Optionally, it supports multiple devices to reference external data, 34 + enabling data sharing for container images. 35 + 36 + It also has advanced encoded on-disk layouts, particularly for data 37 + compression and fine-grained deduplication. It utilizes fixed-size 38 + output compression to improve storage density while keeping relatively 39 + high compression ratios. Furthermore, it implements in-place 40 + decompression to reuse file pages to keep compressed data temporarily 41 + with proper strategies, which ensures guaranteed end-to-end runtime 31 42 performance under extreme memory pressure without extra cost. 32 43 33 - See the documentation at <file:Documentation/filesystems/erofs.rst> 34 - and the web pages at <https://erofs.docs.kernel.org> for more details. 44 + For more details, see the web pages at <https://erofs.docs.kernel.org> 45 + and the documentation at <file:Documentation/filesystems/erofs.rst>. 46 + 47 + To compile EROFS filesystem support as a module, choose M here. The 48 + module will be called erofs. 35 49 36 50 If unsure, say N. 37 51 ··· 119 105 depends on EROFS_FS 120 106 default y 121 107 help 122 - Enable transparent compression support for EROFS file systems. 108 + Enable EROFS compression layouts so that filesystems containing 109 + compressed files can be parsed by the kernel. 123 110 124 111 If you don't want to enable compression feature, say N. 125 112
+2 -4
fs/erofs/fileio.c
··· 25 25 container_of(iocb, struct erofs_fileio_rq, iocb); 26 26 struct folio_iter fi; 27 27 28 - if (ret >= 0 && ret != rq->bio.bi_iter.bi_size) { 29 - bio_advance(&rq->bio, ret); 30 - zero_fill_bio(&rq->bio); 31 - } 28 + if (ret >= 0 && ret != rq->bio.bi_iter.bi_size) 29 + ret = -EIO; 32 30 if (!rq->bio.bi_end_io) { 33 31 bio_for_each_folio_all(fi, &rq->bio) { 34 32 DBG_BUGON(folio_test_uptodate(fi.folio));
+13 -2
fs/erofs/ishare.c
··· 200 200 201 201 int __init erofs_init_ishare(void) 202 202 { 203 - erofs_ishare_mnt = kern_mount(&erofs_anon_fs_type); 204 - return PTR_ERR_OR_ZERO(erofs_ishare_mnt); 203 + struct vfsmount *mnt; 204 + int ret; 205 + 206 + mnt = kern_mount(&erofs_anon_fs_type); 207 + if (IS_ERR(mnt)) 208 + return PTR_ERR(mnt); 209 + /* generic_fadvise() doesn't work if s_bdi == &noop_backing_dev_info */ 210 + ret = super_setup_bdi(mnt->mnt_sb); 211 + if (ret) 212 + kern_unmount(mnt); 213 + else 214 + erofs_ishare_mnt = mnt; 215 + return ret; 205 216 } 206 217 207 218 void erofs_exit_ishare(void)
+3
fs/erofs/zdata.c
··· 1445 1445 int bios) 1446 1446 { 1447 1447 struct erofs_sb_info *const sbi = EROFS_SB(io->sb); 1448 + int gfp_flag; 1448 1449 1449 1450 /* wake up the caller thread for sync decompression */ 1450 1451 if (io->sync) { ··· 1478 1477 sbi->sync_decompress = EROFS_SYNC_DECOMPRESS_FORCE_ON; 1479 1478 return; 1480 1479 } 1480 + gfp_flag = memalloc_noio_save(); 1481 1481 z_erofs_decompressqueue_work(&io->u.work); 1482 + memalloc_noio_restore(gfp_flag); 1482 1483 } 1483 1484 1484 1485 static void z_erofs_fill_bio_vec(struct bio_vec *bvec,
+3 -2
fs/ext4/Makefile
··· 14 14 15 15 ext4-$(CONFIG_EXT4_FS_POSIX_ACL) += acl.o 16 16 ext4-$(CONFIG_EXT4_FS_SECURITY) += xattr_security.o 17 - ext4-inode-test-objs += inode-test.o 18 - obj-$(CONFIG_EXT4_KUNIT_TESTS) += ext4-inode-test.o 17 + ext4-test-objs += inode-test.o mballoc-test.o \ 18 + extents-test.o 19 + obj-$(CONFIG_EXT4_KUNIT_TESTS) += ext4-test.o 19 20 ext4-$(CONFIG_FS_VERITY) += verity.o 20 21 ext4-$(CONFIG_FS_ENCRYPTION) += crypto.o
+8 -1
fs/ext4/crypto.c
··· 163 163 */ 164 164 165 165 if (handle) { 166 + /* 167 + * Since the inode is new it is ok to pass the 168 + * XATTR_CREATE flag. This is necessary to match the 169 + * remaining journal credits check in the set_handle 170 + * function with the credits allocated for the new 171 + * inode. 172 + */ 166 173 res = ext4_xattr_set_handle(handle, inode, 167 174 EXT4_XATTR_INDEX_ENCRYPTION, 168 175 EXT4_XATTR_NAME_ENCRYPTION_CONTEXT, 169 - ctx, len, 0); 176 + ctx, len, XATTR_CREATE); 170 177 if (!res) { 171 178 ext4_set_inode_flag(inode, EXT4_INODE_ENCRYPT); 172 179 ext4_clear_inode_state(inode,
+6
fs/ext4/ext4.h
··· 1570 1570 struct proc_dir_entry *s_proc; 1571 1571 struct kobject s_kobj; 1572 1572 struct completion s_kobj_unregister; 1573 + struct mutex s_error_notify_mutex; /* protects sysfs_notify vs kobject_del */ 1573 1574 struct super_block *s_sb; 1574 1575 struct buffer_head *s_mmp_bh; 1575 1576 ··· 3945 3944 extern int ext4_block_write_begin(handle_t *handle, struct folio *folio, 3946 3945 loff_t pos, unsigned len, 3947 3946 get_block_t *get_block); 3947 + 3948 + #if IS_ENABLED(CONFIG_EXT4_KUNIT_TESTS) 3949 + #define EXPORT_SYMBOL_FOR_EXT4_TEST(sym) \ 3950 + EXPORT_SYMBOL_FOR_MODULES(sym, "ext4-test") 3951 + #endif 3948 3952 #endif /* __KERNEL__ */ 3949 3953 3950 3954 #endif /* _EXT4_H */
+12
fs/ext4/ext4_extents.h
··· 264 264 0xffff); 265 265 } 266 266 267 + extern int __ext4_ext_dirty(const char *where, unsigned int line, 268 + handle_t *handle, struct inode *inode, 269 + struct ext4_ext_path *path); 270 + extern int ext4_ext_zeroout(struct inode *inode, struct ext4_extent *ex); 271 + #if IS_ENABLED(CONFIG_EXT4_KUNIT_TESTS) 272 + extern int ext4_ext_space_root_idx_test(struct inode *inode, int check); 273 + extern struct ext4_ext_path *ext4_split_convert_extents_test( 274 + handle_t *handle, struct inode *inode, 275 + struct ext4_map_blocks *map, 276 + struct ext4_ext_path *path, 277 + int flags, unsigned int *allocated); 278 + #endif 267 279 #endif /* _EXT4_EXTENTS */ 268 280
+7 -5
fs/ext4/extents-test.c
··· 142 142 143 143 static void extents_kunit_exit(struct kunit *test) 144 144 { 145 - struct ext4_sb_info *sbi = k_ctx.k_ei->vfs_inode.i_sb->s_fs_info; 145 + struct super_block *sb = k_ctx.k_ei->vfs_inode.i_sb; 146 + struct ext4_sb_info *sbi = sb->s_fs_info; 146 147 148 + ext4_es_unregister_shrinker(sbi); 147 149 kfree(sbi); 148 150 kfree(k_ctx.k_ei); 149 151 kfree(k_ctx.k_data); ··· 282 280 eh->eh_depth = 0; 283 281 eh->eh_entries = cpu_to_le16(1); 284 282 eh->eh_magic = EXT4_EXT_MAGIC; 285 - eh->eh_max = 286 - cpu_to_le16(ext4_ext_space_root_idx(&k_ctx.k_ei->vfs_inode, 0)); 283 + eh->eh_max = cpu_to_le16(ext4_ext_space_root_idx_test( 284 + &k_ctx.k_ei->vfs_inode, 0)); 287 285 eh->eh_generation = 0; 288 286 289 287 /* ··· 386 384 387 385 switch (param->type) { 388 386 case TEST_SPLIT_CONVERT: 389 - path = ext4_split_convert_extents(NULL, inode, &map, path, 390 - param->split_flags, NULL); 387 + path = ext4_split_convert_extents_test(NULL, inode, &map, 388 + path, param->split_flags, NULL); 391 389 break; 392 390 case TEST_CREATE_BLOCKS: 393 391 ext4_map_create_blocks_helper(test, inode, &map, param->split_flags);
+68 -12
fs/ext4/extents.c
··· 184 184 * - ENOMEM 185 185 * - EIO 186 186 */ 187 - static int __ext4_ext_dirty(const char *where, unsigned int line, 188 - handle_t *handle, struct inode *inode, 189 - struct ext4_ext_path *path) 187 + int __ext4_ext_dirty(const char *where, unsigned int line, 188 + handle_t *handle, struct inode *inode, 189 + struct ext4_ext_path *path) 190 190 { 191 191 int err; 192 192 ··· 1736 1736 err = ext4_ext_get_access(handle, inode, path + k); 1737 1737 if (err) 1738 1738 return err; 1739 + if (unlikely(path[k].p_idx > EXT_LAST_INDEX(path[k].p_hdr))) { 1740 + EXT4_ERROR_INODE(inode, 1741 + "path[%d].p_idx %p > EXT_LAST_INDEX %p", 1742 + k, path[k].p_idx, 1743 + EXT_LAST_INDEX(path[k].p_hdr)); 1744 + return -EFSCORRUPTED; 1745 + } 1739 1746 path[k].p_idx->ei_block = border; 1740 1747 err = ext4_ext_dirty(handle, inode, path + k); 1741 1748 if (err) ··· 1755 1748 err = ext4_ext_get_access(handle, inode, path + k); 1756 1749 if (err) 1757 1750 goto clean; 1751 + if (unlikely(path[k].p_idx > EXT_LAST_INDEX(path[k].p_hdr))) { 1752 + EXT4_ERROR_INODE(inode, 1753 + "path[%d].p_idx %p > EXT_LAST_INDEX %p", 1754 + k, path[k].p_idx, 1755 + EXT_LAST_INDEX(path[k].p_hdr)); 1756 + err = -EFSCORRUPTED; 1757 + goto clean; 1758 + } 1758 1759 path[k].p_idx->ei_block = border; 1759 1760 err = ext4_ext_dirty(handle, inode, path + k); 1760 1761 if (err) ··· 3159 3144 } 3160 3145 3161 3146 /* FIXME!! we need to try to merge to left or right after zero-out */ 3162 - static int ext4_ext_zeroout(struct inode *inode, struct ext4_extent *ex) 3147 + int ext4_ext_zeroout(struct inode *inode, struct ext4_extent *ex) 3163 3148 { 3164 3149 ext4_fsblk_t ee_pblock; 3165 3150 unsigned int ee_len; ··· 3254 3239 3255 3240 insert_err = PTR_ERR(path); 3256 3241 err = 0; 3242 + if (insert_err != -ENOSPC && insert_err != -EDQUOT && 3243 + insert_err != -ENOMEM) 3244 + goto out_path; 3257 3245 3258 3246 /* 3259 3247 * Get a new path to try to zeroout or fix the extent length. ··· 3273 3255 goto out_path; 3274 3256 } 3275 3257 3258 + depth = ext_depth(inode); 3259 + ex = path[depth].p_ext; 3260 + if (!ex) { 3261 + EXT4_ERROR_INODE(inode, 3262 + "bad extent address lblock: %lu, depth: %d pblock %llu", 3263 + (unsigned long)ee_block, depth, path[depth].p_block); 3264 + err = -EFSCORRUPTED; 3265 + goto out; 3266 + } 3267 + 3276 3268 err = ext4_ext_get_access(handle, inode, path + depth); 3277 3269 if (err) 3278 3270 goto out; 3279 - 3280 - depth = ext_depth(inode); 3281 - ex = path[depth].p_ext; 3282 3271 3283 3272 fix_extent_len: 3284 3273 ex->ee_len = orig_ex.ee_len; ··· 3388 3363 3389 3364 ext4_ext_mark_initialized(ex); 3390 3365 3391 - ext4_ext_dirty(handle, inode, path + depth); 3366 + err = ext4_ext_dirty(handle, inode, path + depth); 3392 3367 if (err) 3393 3368 return err; 3394 3369 ··· 4482 4457 path = ext4_ext_insert_extent(handle, inode, path, &newex, flags); 4483 4458 if (IS_ERR(path)) { 4484 4459 err = PTR_ERR(path); 4485 - if (allocated_clusters) { 4460 + /* 4461 + * Gracefully handle out of space conditions. If the filesystem 4462 + * is inconsistent, we'll just leak allocated blocks to avoid 4463 + * causing even more damage. 4464 + */ 4465 + if (allocated_clusters && (err == -EDQUOT || err == -ENOSPC)) { 4486 4466 int fb_flags = 0; 4487 - 4488 4467 /* 4489 4468 * free data blocks we just allocated. 
4490 4469 * not a good idea to call discard here directly, ··· 6267 6238 return 0; 6268 6239 } 6269 6240 6270 - #ifdef CONFIG_EXT4_KUNIT_TESTS 6271 - #include "extents-test.c" 6241 + #if IS_ENABLED(CONFIG_EXT4_KUNIT_TESTS) 6242 + int ext4_ext_space_root_idx_test(struct inode *inode, int check) 6243 + { 6244 + return ext4_ext_space_root_idx(inode, check); 6245 + } 6246 + EXPORT_SYMBOL_FOR_EXT4_TEST(ext4_ext_space_root_idx_test); 6247 + 6248 + struct ext4_ext_path *ext4_split_convert_extents_test(handle_t *handle, 6249 + struct inode *inode, struct ext4_map_blocks *map, 6250 + struct ext4_ext_path *path, int flags, 6251 + unsigned int *allocated) 6252 + { 6253 + return ext4_split_convert_extents(handle, inode, map, path, 6254 + flags, allocated); 6255 + } 6256 + EXPORT_SYMBOL_FOR_EXT4_TEST(ext4_split_convert_extents_test); 6257 + 6258 + EXPORT_SYMBOL_FOR_EXT4_TEST(__ext4_ext_dirty); 6259 + EXPORT_SYMBOL_FOR_EXT4_TEST(ext4_ext_zeroout); 6260 + EXPORT_SYMBOL_FOR_EXT4_TEST(ext4_es_register_shrinker); 6261 + EXPORT_SYMBOL_FOR_EXT4_TEST(ext4_es_unregister_shrinker); 6262 + EXPORT_SYMBOL_FOR_EXT4_TEST(ext4_map_create_blocks); 6263 + EXPORT_SYMBOL_FOR_EXT4_TEST(ext4_es_init_tree); 6264 + EXPORT_SYMBOL_FOR_EXT4_TEST(ext4_es_lookup_extent); 6265 + EXPORT_SYMBOL_FOR_EXT4_TEST(ext4_es_insert_extent); 6266 + EXPORT_SYMBOL_FOR_EXT4_TEST(ext4_ext_insert_extent); 6267 + EXPORT_SYMBOL_FOR_EXT4_TEST(ext4_find_extent); 6268 + EXPORT_SYMBOL_FOR_EXT4_TEST(ext4_issue_zeroout); 6269 + EXPORT_SYMBOL_FOR_EXT4_TEST(ext4_map_query_blocks); 6272 6270 #endif
+10 -7
fs/ext4/fast_commit.c
··· 975 975 int ret = 0; 976 976 977 977 list_for_each_entry(ei, &sbi->s_fc_q[FC_Q_MAIN], i_fc_list) { 978 - ret = jbd2_submit_inode_data(journal, ei->jinode); 978 + ret = jbd2_submit_inode_data(journal, READ_ONCE(ei->jinode)); 979 979 if (ret) 980 980 return ret; 981 981 } 982 982 983 983 list_for_each_entry(ei, &sbi->s_fc_q[FC_Q_MAIN], i_fc_list) { 984 - ret = jbd2_wait_inode_data(journal, ei->jinode); 984 + ret = jbd2_wait_inode_data(journal, READ_ONCE(ei->jinode)); 985 985 if (ret) 986 986 return ret; 987 987 } ··· 1613 1613 /* Immediately update the inode on disk. */ 1614 1614 ret = ext4_handle_dirty_metadata(NULL, NULL, iloc.bh); 1615 1615 if (ret) 1616 - goto out; 1616 + goto out_brelse; 1617 1617 ret = sync_dirty_buffer(iloc.bh); 1618 1618 if (ret) 1619 - goto out; 1619 + goto out_brelse; 1620 1620 ret = ext4_mark_inode_used(sb, ino); 1621 1621 if (ret) 1622 - goto out; 1622 + goto out_brelse; 1623 1623 1624 1624 /* Given that we just wrote the inode on disk, this SHOULD succeed. */ 1625 1625 inode = ext4_iget(sb, ino, EXT4_IGET_NORMAL); 1626 1626 if (IS_ERR(inode)) { 1627 1627 ext4_debug("Inode not found."); 1628 - return -EFSCORRUPTED; 1628 + inode = NULL; 1629 + ret = -EFSCORRUPTED; 1630 + goto out_brelse; 1629 1631 } 1630 1632 1631 1633 /* ··· 1644 1642 ext4_inode_csum_set(inode, ext4_raw_inode(&iloc), EXT4_I(inode)); 1645 1643 ret = ext4_handle_dirty_metadata(NULL, NULL, iloc.bh); 1646 1644 sync_dirty_buffer(iloc.bh); 1645 + out_brelse: 1647 1646 brelse(iloc.bh); 1648 1647 out: 1649 1648 iput(inode); 1650 1649 if (!ret) 1651 1650 blkdev_issue_flush(sb->s_bdev); 1652 1651 1653 - return 0; 1652 + return ret; 1654 1653 } 1655 1654 1656 1655 /*
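Note: the fast-commit fix routes every failure after the inode location buffer is acquired through a single out_brelse label and returns the real error code instead of 0. A small sketch of that goto-unwind shape (the helpers are stubs, not the ext4 functions):

    #include <errno.h>
    #include <stdio.h>
    #include <stdlib.h>

    struct buf { int dummy; };                            /* stand-in for a buffer head */

    static struct buf *get_buf(void)        { return calloc(1, sizeof(struct buf)); }
    static void release_buf(struct buf *b)  { free(b); }
    static int step_one(void)               { return 0; }
    static int step_two(void)               { return -EIO; }   /* simulate a failure */

    static int replay_one_record(void)
    {
            struct buf *bh;
            int ret;

            bh = get_buf();
            if (!bh)
                    return -ENOMEM;
            ret = step_one();
            if (ret)
                    goto out_release;       /* every failure after get_buf() ... */
            ret = step_two();
            if (ret)
                    goto out_release;       /* ... funnels through one release label */
    out_release:
            release_buf(bh);
            return ret;                     /* propagate the real error, not 0 */
    }

    int main(void)
    {
            printf("replay: %d\n", replay_one_record());
            return 0;
    }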
+14 -2
fs/ext4/fsync.c
··· 83 83 int datasync, bool *needs_barrier) 84 84 { 85 85 struct inode *inode = file->f_inode; 86 + struct writeback_control wbc = { 87 + .sync_mode = WB_SYNC_ALL, 88 + .nr_to_write = 0, 89 + }; 86 90 int ret; 87 91 88 92 ret = generic_buffers_fsync_noflush(file, start, end, datasync); 89 - if (!ret) 90 - ret = ext4_sync_parent(inode); 93 + if (ret) 94 + return ret; 95 + 96 + /* Force writeout of inode table buffer to disk */ 97 + ret = ext4_write_inode(inode, &wbc); 98 + if (ret) 99 + return ret; 100 + 101 + ret = ext4_sync_parent(inode); 102 + 91 103 if (test_opt(inode->i_sb, BARRIER)) 92 104 *needs_barrier = true; 93 105
+6
fs/ext4/ialloc.c
··· 686 686 if (unlikely(!gdp)) 687 687 return 0; 688 688 689 + /* Inode was never used in this filesystem? */ 690 + if (ext4_has_group_desc_csum(sb) && 691 + (gdp->bg_flags & cpu_to_le16(EXT4_BG_INODE_UNINIT) || 692 + ino >= EXT4_INODES_PER_GROUP(sb) - ext4_itable_unused_count(sb, gdp))) 693 + return 0; 694 + 689 695 bh = sb_find_get_block(sb, ext4_inode_table(sb, gdp) + 690 696 (ino / inodes_per_block)); 691 697 if (!bh || !buffer_uptodate(bh))
+9 -1
fs/ext4/inline.c
··· 522 522 goto out; 523 523 524 524 len = min_t(size_t, ext4_get_inline_size(inode), i_size_read(inode)); 525 - BUG_ON(len > PAGE_SIZE); 525 + 526 + if (len > PAGE_SIZE) { 527 + ext4_error_inode(inode, __func__, __LINE__, 0, 528 + "inline size %zu exceeds PAGE_SIZE", len); 529 + ret = -EFSCORRUPTED; 530 + brelse(iloc.bh); 531 + goto out; 532 + } 533 + 526 534 kaddr = kmap_local_folio(folio, 0); 527 535 ret = ext4_read_inline_data(inode, kaddr, len, &iloc); 528 536 kaddr = folio_zero_tail(folio, len, kaddr + len);
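Note: dropping the BUG_ON() treats the on-disk inline size as untrusted input, so a corrupted image now yields -EFSCORRUPTED instead of crashing the kernel. A minimal userspace sketch of the same check (EUCLEAN stands in for the kernel's EFSCORRUPTED, the reporting helper is plain fprintf, and DEMO_PAGE_SIZE is illustrative):

    #include <errno.h>
    #include <stdio.h>

    #define DEMO_PAGE_SIZE 4096u

    /* Reject an implausible size read from disk with an error code instead
     * of asserting on it. */
    static int check_inline_len(size_t len)
    {
            if (len > DEMO_PAGE_SIZE) {
                    fprintf(stderr, "inline size %zu exceeds page size\n", len);
                    return -EUCLEAN;        /* kernel: -EFSCORRUPTED */
            }
            return 0;
    }

    int main(void)
    {
            printf("%d\n", check_inline_len(60));       /* 0: plausible inline size */
            printf("%d\n", check_inline_len(65536));    /* rejected, no crash */
            return 0;
    }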
+60 -15
fs/ext4/inode.c
··· 128 128 static inline int ext4_begin_ordered_truncate(struct inode *inode, 129 129 loff_t new_size) 130 130 { 131 + struct jbd2_inode *jinode = READ_ONCE(EXT4_I(inode)->jinode); 132 + 131 133 trace_ext4_begin_ordered_truncate(inode, new_size); 132 134 /* 133 135 * If jinode is zero, then we never opened the file for ··· 137 135 * jbd2_journal_begin_ordered_truncate() since there's no 138 136 * outstanding writes we need to flush. 139 137 */ 140 - if (!EXT4_I(inode)->jinode) 138 + if (!jinode) 141 139 return 0; 142 140 return jbd2_journal_begin_ordered_truncate(EXT4_JOURNAL(inode), 143 - EXT4_I(inode)->jinode, 141 + jinode, 144 142 new_size); 145 143 } 146 144 ··· 186 184 if (EXT4_I(inode)->i_flags & EXT4_EA_INODE_FL) 187 185 ext4_evict_ea_inode(inode); 188 186 if (inode->i_nlink) { 187 + /* 188 + * If there's dirty page will lead to data loss, user 189 + * could see stale data. 190 + */ 191 + if (unlikely(!ext4_emergency_state(inode->i_sb) && 192 + mapping_tagged(&inode->i_data, PAGECACHE_TAG_DIRTY))) 193 + ext4_warning_inode(inode, "data will be lost"); 194 + 189 195 truncate_inode_pages_final(&inode->i_data); 190 196 191 197 goto no_delete; ··· 4461 4451 spin_unlock(&inode->i_lock); 4462 4452 return -ENOMEM; 4463 4453 } 4464 - ei->jinode = jinode; 4465 - jbd2_journal_init_jbd_inode(ei->jinode, inode); 4454 + jbd2_journal_init_jbd_inode(jinode, inode); 4455 + /* 4456 + * Publish ->jinode only after it is fully initialized so that 4457 + * readers never observe a partially initialized jbd2_inode. 4458 + */ 4459 + smp_wmb(); 4460 + WRITE_ONCE(ei->jinode, jinode); 4466 4461 jinode = NULL; 4467 4462 } 4468 4463 spin_unlock(&inode->i_lock); ··· 5416 5401 inode->i_op = &ext4_encrypted_symlink_inode_operations; 5417 5402 } else if (ext4_inode_is_fast_symlink(inode)) { 5418 5403 inode->i_op = &ext4_fast_symlink_inode_operations; 5419 - if (inode->i_size == 0 || 5420 - inode->i_size >= sizeof(ei->i_data) || 5421 - strnlen((char *)ei->i_data, inode->i_size + 1) != 5422 - inode->i_size) { 5423 - ext4_error_inode(inode, function, line, 0, 5424 - "invalid fast symlink length %llu", 5425 - (unsigned long long)inode->i_size); 5426 - ret = -EFSCORRUPTED; 5427 - goto bad_inode; 5404 + 5405 + /* 5406 + * Orphan cleanup can see inodes with i_size == 0 5407 + * and i_data uninitialized. Skip size checks in 5408 + * that case. This is safe because the first thing 5409 + * ext4_evict_inode() does for fast symlinks is 5410 + * clearing of i_data and i_size. 
5411 + */ 5412 + if ((EXT4_SB(sb)->s_mount_state & EXT4_ORPHAN_FS)) { 5413 + if (inode->i_nlink != 0) { 5414 + ext4_error_inode(inode, function, line, 0, 5415 + "invalid orphan symlink nlink %d", 5416 + inode->i_nlink); 5417 + ret = -EFSCORRUPTED; 5418 + goto bad_inode; 5419 + } 5420 + } else { 5421 + if (inode->i_size == 0 || 5422 + inode->i_size >= sizeof(ei->i_data) || 5423 + strnlen((char *)ei->i_data, inode->i_size + 1) != 5424 + inode->i_size) { 5425 + ext4_error_inode(inode, function, line, 0, 5426 + "invalid fast symlink length %llu", 5427 + (unsigned long long)inode->i_size); 5428 + ret = -EFSCORRUPTED; 5429 + goto bad_inode; 5430 + } 5431 + inode_set_cached_link(inode, (char *)ei->i_data, 5432 + inode->i_size); 5428 5433 } 5429 - inode_set_cached_link(inode, (char *)ei->i_data, 5430 - inode->i_size); 5431 5434 } else { 5432 5435 inode->i_op = &ext4_symlink_inode_operations; 5433 5436 } ··· 5881 5848 5882 5849 if (attr->ia_size == inode->i_size) 5883 5850 inc_ivers = false; 5851 + 5852 + /* 5853 + * If file has inline data but new size exceeds inline capacity, 5854 + * convert to extent-based storage first to prevent inconsistent 5855 + * state (inline flag set but size exceeds inline capacity). 5856 + */ 5857 + if (ext4_has_inline_data(inode) && 5858 + attr->ia_size > EXT4_I(inode)->i_inline_size) { 5859 + error = ext4_convert_inline_data(inode); 5860 + if (error) 5861 + goto err_out; 5862 + } 5884 5863 5885 5864 if (shrink) { 5886 5865 if (ext4_should_order_data(inode)) {
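Note: the jinode handling follows the usual pointer-publication discipline: finish initializing the object, publish the pointer behind a write barrier, and have readers take a single snapshot of it. A minimal C11 model of that ordering (the kernel code uses smp_wmb(), WRITE_ONCE() and READ_ONCE(), not these atomics):

    #include <stdatomic.h>
    #include <stdio.h>
    #include <stdlib.h>

    struct jnode { int initialized; };

    static _Atomic(struct jnode *) shared;          /* the published pointer */

    static void publisher(void)
    {
            struct jnode *n = malloc(sizeof(*n));

            if (!n)
                    return;
            n->initialized = 1;                     /* finish all setup first */
            /* Release ordering makes the setup visible before the pointer
             * itself becomes visible (smp_wmb() + WRITE_ONCE() in the kernel). */
            atomic_store_explicit(&shared, n, memory_order_release);
    }

    static void reader(void)
    {
            /* One snapshot of the pointer (READ_ONCE() in the kernel); never
             * re-load it and assume a second load returns the same value. */
            struct jnode *n = atomic_load_explicit(&shared, memory_order_acquire);

            if (n)
                    printf("initialized = %d\n", n->initialized);
            else
                    printf("not published yet\n");
    }

    int main(void)
    {
            reader();       /* may observe NULL */
            publisher();
            reader();       /* observes a fully initialized node */
            return 0;
    }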
+41 -40
fs/ext4/mballoc-test.c
··· 8 8 #include <linux/random.h> 9 9 10 10 #include "ext4.h" 11 + #include "mballoc.h" 11 12 12 13 struct mbt_grp_ctx { 13 14 struct buffer_head bitmap_bh; ··· 337 336 if (state) 338 337 mb_set_bits(bitmap_bh->b_data, blkoff, len); 339 338 else 340 - mb_clear_bits(bitmap_bh->b_data, blkoff, len); 339 + mb_clear_bits_test(bitmap_bh->b_data, blkoff, len); 341 340 342 341 return 0; 343 342 } ··· 414 413 415 414 /* get block at goal */ 416 415 ar.goal = ext4_group_first_block_no(sb, goal_group); 417 - found = ext4_mb_new_blocks_simple(&ar, &err); 416 + found = ext4_mb_new_blocks_simple_test(&ar, &err); 418 417 KUNIT_ASSERT_EQ_MSG(test, ar.goal, found, 419 418 "failed to alloc block at goal, expected %llu found %llu", 420 419 ar.goal, found); 421 420 422 421 /* get block after goal in goal group */ 423 422 ar.goal = ext4_group_first_block_no(sb, goal_group); 424 - found = ext4_mb_new_blocks_simple(&ar, &err); 423 + found = ext4_mb_new_blocks_simple_test(&ar, &err); 425 424 KUNIT_ASSERT_EQ_MSG(test, ar.goal + EXT4_C2B(sbi, 1), found, 426 425 "failed to alloc block after goal in goal group, expected %llu found %llu", 427 426 ar.goal + 1, found); ··· 429 428 /* get block after goal group */ 430 429 mbt_ctx_mark_used(sb, goal_group, 0, EXT4_CLUSTERS_PER_GROUP(sb)); 431 430 ar.goal = ext4_group_first_block_no(sb, goal_group); 432 - found = ext4_mb_new_blocks_simple(&ar, &err); 431 + found = ext4_mb_new_blocks_simple_test(&ar, &err); 433 432 KUNIT_ASSERT_EQ_MSG(test, 434 433 ext4_group_first_block_no(sb, goal_group + 1), found, 435 434 "failed to alloc block after goal group, expected %llu found %llu", ··· 439 438 for (i = goal_group; i < ext4_get_groups_count(sb); i++) 440 439 mbt_ctx_mark_used(sb, i, 0, EXT4_CLUSTERS_PER_GROUP(sb)); 441 440 ar.goal = ext4_group_first_block_no(sb, goal_group); 442 - found = ext4_mb_new_blocks_simple(&ar, &err); 441 + found = ext4_mb_new_blocks_simple_test(&ar, &err); 443 442 KUNIT_ASSERT_EQ_MSG(test, 444 443 ext4_group_first_block_no(sb, 0) + EXT4_C2B(sbi, 1), found, 445 444 "failed to alloc block before goal group, expected %llu found %llu", ··· 449 448 for (i = 0; i < ext4_get_groups_count(sb); i++) 450 449 mbt_ctx_mark_used(sb, i, 0, EXT4_CLUSTERS_PER_GROUP(sb)); 451 450 ar.goal = ext4_group_first_block_no(sb, goal_group); 452 - found = ext4_mb_new_blocks_simple(&ar, &err); 451 + found = ext4_mb_new_blocks_simple_test(&ar, &err); 453 452 KUNIT_ASSERT_NE_MSG(test, err, 0, 454 453 "unexpectedly get block when no block is available"); 455 454 } ··· 493 492 continue; 494 493 495 494 bitmap = mbt_ctx_bitmap(sb, i); 496 - bit = mb_find_next_zero_bit(bitmap, max, 0); 495 + bit = mb_find_next_zero_bit_test(bitmap, max, 0); 497 496 KUNIT_ASSERT_EQ_MSG(test, bit, max, 498 497 "free block on unexpected group %d", i); 499 498 } 500 499 501 500 bitmap = mbt_ctx_bitmap(sb, goal_group); 502 - bit = mb_find_next_zero_bit(bitmap, max, 0); 501 + bit = mb_find_next_zero_bit_test(bitmap, max, 0); 503 502 KUNIT_ASSERT_EQ(test, bit, start); 504 503 505 - bit = mb_find_next_bit(bitmap, max, bit + 1); 504 + bit = mb_find_next_bit_test(bitmap, max, bit + 1); 506 505 KUNIT_ASSERT_EQ(test, bit, start + len); 507 506 } 508 507 ··· 525 524 526 525 block = ext4_group_first_block_no(sb, goal_group) + 527 526 EXT4_C2B(sbi, start); 528 - ext4_free_blocks_simple(inode, block, len); 527 + ext4_free_blocks_simple_test(inode, block, len); 529 528 validate_free_blocks_simple(test, sb, goal_group, start, len); 530 529 mbt_ctx_mark_used(sb, goal_group, 0, EXT4_CLUSTERS_PER_GROUP(sb)); 531 530 } ··· 
567 566 568 567 bitmap = mbt_ctx_bitmap(sb, TEST_GOAL_GROUP); 569 568 memset(bitmap, 0, sb->s_blocksize); 570 - ret = ext4_mb_mark_diskspace_used(ac, NULL); 569 + ret = ext4_mb_mark_diskspace_used_test(ac, NULL); 571 570 KUNIT_ASSERT_EQ(test, ret, 0); 572 571 573 572 max = EXT4_CLUSTERS_PER_GROUP(sb); 574 - i = mb_find_next_bit(bitmap, max, 0); 573 + i = mb_find_next_bit_test(bitmap, max, 0); 575 574 KUNIT_ASSERT_EQ(test, i, start); 576 - i = mb_find_next_zero_bit(bitmap, max, i + 1); 575 + i = mb_find_next_zero_bit_test(bitmap, max, i + 1); 577 576 KUNIT_ASSERT_EQ(test, i, start + len); 578 - i = mb_find_next_bit(bitmap, max, i + 1); 577 + i = mb_find_next_bit_test(bitmap, max, i + 1); 579 578 KUNIT_ASSERT_EQ(test, max, i); 580 579 } 581 580 ··· 618 617 max = EXT4_CLUSTERS_PER_GROUP(sb); 619 618 bb_h = buddy + sbi->s_mb_offsets[1]; 620 619 621 - off = mb_find_next_zero_bit(bb, max, 0); 620 + off = mb_find_next_zero_bit_test(bb, max, 0); 622 621 grp->bb_first_free = off; 623 622 while (off < max) { 624 623 grp->bb_counters[0]++; 625 624 grp->bb_free++; 626 625 627 - if (!(off & 1) && !mb_test_bit(off + 1, bb)) { 626 + if (!(off & 1) && !mb_test_bit_test(off + 1, bb)) { 628 627 grp->bb_free++; 629 628 grp->bb_counters[0]--; 630 - mb_clear_bit(off >> 1, bb_h); 629 + mb_clear_bit_test(off >> 1, bb_h); 631 630 grp->bb_counters[1]++; 632 631 grp->bb_largest_free_order = 1; 633 632 off++; 634 633 } 635 634 636 - off = mb_find_next_zero_bit(bb, max, off + 1); 635 + off = mb_find_next_zero_bit_test(bb, max, off + 1); 637 636 } 638 637 639 638 for (order = 1; order < MB_NUM_ORDERS(sb) - 1; order++) { 640 639 bb = buddy + sbi->s_mb_offsets[order]; 641 640 bb_h = buddy + sbi->s_mb_offsets[order + 1]; 642 641 max = max >> 1; 643 - off = mb_find_next_zero_bit(bb, max, 0); 642 + off = mb_find_next_zero_bit_test(bb, max, 0); 644 643 645 644 while (off < max) { 646 - if (!(off & 1) && !mb_test_bit(off + 1, bb)) { 645 + if (!(off & 1) && !mb_test_bit_test(off + 1, bb)) { 647 646 mb_set_bits(bb, off, 2); 648 647 grp->bb_counters[order] -= 2; 649 - mb_clear_bit(off >> 1, bb_h); 648 + mb_clear_bit_test(off >> 1, bb_h); 650 649 grp->bb_counters[order + 1]++; 651 650 grp->bb_largest_free_order = order + 1; 652 651 off++; 653 652 } 654 653 655 - off = mb_find_next_zero_bit(bb, max, off + 1); 654 + off = mb_find_next_zero_bit_test(bb, max, off + 1); 656 655 } 657 656 } 658 657 659 658 max = EXT4_CLUSTERS_PER_GROUP(sb); 660 - off = mb_find_next_zero_bit(bitmap, max, 0); 659 + off = mb_find_next_zero_bit_test(bitmap, max, 0); 661 660 while (off < max) { 662 661 grp->bb_fragments++; 663 662 664 - off = mb_find_next_bit(bitmap, max, off + 1); 663 + off = mb_find_next_bit_test(bitmap, max, off + 1); 665 664 if (off + 1 >= max) 666 665 break; 667 666 668 - off = mb_find_next_zero_bit(bitmap, max, off + 1); 667 + off = mb_find_next_zero_bit_test(bitmap, max, off + 1); 669 668 } 670 669 } 671 670 ··· 707 706 /* needed by validation in ext4_mb_generate_buddy */ 708 707 ext4_grp->bb_free = mbt_grp->bb_free; 709 708 memset(ext4_buddy, 0xff, sb->s_blocksize); 710 - ext4_mb_generate_buddy(sb, ext4_buddy, bitmap, TEST_GOAL_GROUP, 709 + ext4_mb_generate_buddy_test(sb, ext4_buddy, bitmap, TEST_GOAL_GROUP, 711 710 ext4_grp); 712 711 713 712 KUNIT_ASSERT_EQ(test, memcmp(mbt_buddy, ext4_buddy, sb->s_blocksize), ··· 761 760 ex.fe_group = TEST_GOAL_GROUP; 762 761 763 762 ext4_lock_group(sb, TEST_GOAL_GROUP); 764 - mb_mark_used(e4b, &ex); 763 + mb_mark_used_test(e4b, &ex); 765 764 ext4_unlock_group(sb, TEST_GOAL_GROUP); 766 765 767 
766 mb_set_bits(bitmap, start, len); ··· 770 769 memset(buddy, 0xff, sb->s_blocksize); 771 770 for (i = 0; i < MB_NUM_ORDERS(sb); i++) 772 771 grp->bb_counters[i] = 0; 773 - ext4_mb_generate_buddy(sb, buddy, bitmap, 0, grp); 772 + ext4_mb_generate_buddy_test(sb, buddy, bitmap, 0, grp); 774 773 775 774 KUNIT_ASSERT_EQ(test, memcmp(buddy, e4b->bd_buddy, sb->s_blocksize), 776 775 0); ··· 799 798 bb_counters[MB_NUM_ORDERS(sb)]), GFP_KERNEL); 800 799 KUNIT_ASSERT_NOT_ERR_OR_NULL(test, grp); 801 800 802 - ret = ext4_mb_load_buddy(sb, TEST_GOAL_GROUP, &e4b); 801 + ret = ext4_mb_load_buddy_test(sb, TEST_GOAL_GROUP, &e4b); 803 802 KUNIT_ASSERT_EQ(test, ret, 0); 804 803 805 804 grp->bb_free = EXT4_CLUSTERS_PER_GROUP(sb); ··· 810 809 test_mb_mark_used_range(test, &e4b, ranges[i].start, 811 810 ranges[i].len, bitmap, buddy, grp); 812 811 813 - ext4_mb_unload_buddy(&e4b); 812 + ext4_mb_unload_buddy_test(&e4b); 814 813 } 815 814 816 815 static void ··· 826 825 return; 827 826 828 827 ext4_lock_group(sb, e4b->bd_group); 829 - mb_free_blocks(NULL, e4b, start, len); 828 + mb_free_blocks_test(NULL, e4b, start, len); 830 829 ext4_unlock_group(sb, e4b->bd_group); 831 830 832 - mb_clear_bits(bitmap, start, len); 831 + mb_clear_bits_test(bitmap, start, len); 833 832 /* bypass bb_free validatoin in ext4_mb_generate_buddy */ 834 833 grp->bb_free += len; 835 834 memset(buddy, 0xff, sb->s_blocksize); 836 835 for (i = 0; i < MB_NUM_ORDERS(sb); i++) 837 836 grp->bb_counters[i] = 0; 838 - ext4_mb_generate_buddy(sb, buddy, bitmap, 0, grp); 837 + ext4_mb_generate_buddy_test(sb, buddy, bitmap, 0, grp); 839 838 840 839 KUNIT_ASSERT_EQ(test, memcmp(buddy, e4b->bd_buddy, sb->s_blocksize), 841 840 0); ··· 866 865 bb_counters[MB_NUM_ORDERS(sb)]), GFP_KERNEL); 867 866 KUNIT_ASSERT_NOT_ERR_OR_NULL(test, grp); 868 867 869 - ret = ext4_mb_load_buddy(sb, TEST_GOAL_GROUP, &e4b); 868 + ret = ext4_mb_load_buddy_test(sb, TEST_GOAL_GROUP, &e4b); 870 869 KUNIT_ASSERT_EQ(test, ret, 0); 871 870 872 871 ex.fe_start = 0; ··· 874 873 ex.fe_group = TEST_GOAL_GROUP; 875 874 876 875 ext4_lock_group(sb, TEST_GOAL_GROUP); 877 - mb_mark_used(&e4b, &ex); 876 + mb_mark_used_test(&e4b, &ex); 878 877 ext4_unlock_group(sb, TEST_GOAL_GROUP); 879 878 880 879 grp->bb_free = 0; ··· 887 886 test_mb_free_blocks_range(test, &e4b, ranges[i].start, 888 887 ranges[i].len, bitmap, buddy, grp); 889 888 890 - ext4_mb_unload_buddy(&e4b); 889 + ext4_mb_unload_buddy_test(&e4b); 891 890 } 892 891 893 892 #define COUNT_FOR_ESTIMATE 100000 ··· 905 904 if (sb->s_blocksize > PAGE_SIZE) 906 905 kunit_skip(test, "blocksize exceeds pagesize"); 907 906 908 - ret = ext4_mb_load_buddy(sb, TEST_GOAL_GROUP, &e4b); 907 + ret = ext4_mb_load_buddy_test(sb, TEST_GOAL_GROUP, &e4b); 909 908 KUNIT_ASSERT_EQ(test, ret, 0); 910 909 911 910 ex.fe_group = TEST_GOAL_GROUP; ··· 919 918 ex.fe_start = ranges[i].start; 920 919 ex.fe_len = ranges[i].len; 921 920 ext4_lock_group(sb, TEST_GOAL_GROUP); 922 - mb_mark_used(&e4b, &ex); 921 + mb_mark_used_test(&e4b, &ex); 923 922 ext4_unlock_group(sb, TEST_GOAL_GROUP); 924 923 } 925 924 end = jiffies; ··· 930 929 continue; 931 930 932 931 ext4_lock_group(sb, TEST_GOAL_GROUP); 933 - mb_free_blocks(NULL, &e4b, ranges[i].start, 932 + mb_free_blocks_test(NULL, &e4b, ranges[i].start, 934 933 ranges[i].len); 935 934 ext4_unlock_group(sb, TEST_GOAL_GROUP); 936 935 } 937 936 } 938 937 939 938 kunit_info(test, "costed jiffies %lu\n", all); 940 - ext4_mb_unload_buddy(&e4b); 939 + ext4_mb_unload_buddy_test(&e4b); 941 940 } 942 941 943 942 static const struct 
mbt_ext4_block_layout mbt_test_layouts[] = {
+114 -18
fs/ext4/mballoc.c
··· 1199 1199 1200 1200 /* searching for the right group start from the goal value specified */ 1201 1201 start = ac->ac_g_ex.fe_group; 1202 + if (start >= ngroups) 1203 + start = 0; 1202 1204 ac->ac_prefetch_grp = start; 1203 1205 ac->ac_prefetch_nr = 0; 1204 1206 ··· 2445 2443 return 0; 2446 2444 2447 2445 err = ext4_mb_load_buddy(ac->ac_sb, group, e4b); 2448 - if (err) 2446 + if (err) { 2447 + if (EXT4_MB_GRP_BBITMAP_CORRUPT(e4b->bd_info) && 2448 + !(ac->ac_flags & EXT4_MB_HINT_GOAL_ONLY)) 2449 + return 0; 2449 2450 return err; 2451 + } 2450 2452 2451 2453 ext4_lock_group(ac->ac_sb, group); 2452 2454 if (unlikely(EXT4_MB_GRP_BBITMAP_CORRUPT(e4b->bd_info))) ··· 3586 3580 rcu_read_unlock(); 3587 3581 iput(sbi->s_buddy_cache); 3588 3582 err_freesgi: 3589 - rcu_read_lock(); 3590 - kvfree(rcu_dereference(sbi->s_group_info)); 3591 - rcu_read_unlock(); 3583 + kvfree(rcu_access_pointer(sbi->s_group_info)); 3592 3584 return -ENOMEM; 3593 3585 } 3594 3586 ··· 3893 3889 struct kmem_cache *cachep = get_groupinfo_cache(sb->s_blocksize_bits); 3894 3890 int count; 3895 3891 3896 - if (test_opt(sb, DISCARD)) { 3897 - /* 3898 - * wait the discard work to drain all of ext4_free_data 3899 - */ 3900 - flush_work(&sbi->s_discard_work); 3901 - WARN_ON_ONCE(!list_empty(&sbi->s_discard_list)); 3902 - } 3892 + /* 3893 + * wait the discard work to drain all of ext4_free_data 3894 + */ 3895 + flush_work(&sbi->s_discard_work); 3896 + WARN_ON_ONCE(!list_empty(&sbi->s_discard_list)); 3903 3897 3904 - if (sbi->s_group_info) { 3898 + group_info = rcu_access_pointer(sbi->s_group_info); 3899 + if (group_info) { 3905 3900 for (i = 0; i < ngroups; i++) { 3906 3901 cond_resched(); 3907 3902 grinfo = ext4_get_group_info(sb, i); ··· 3918 3915 num_meta_group_infos = (ngroups + 3919 3916 EXT4_DESC_PER_BLOCK(sb) - 1) >> 3920 3917 EXT4_DESC_PER_BLOCK_BITS(sb); 3921 - rcu_read_lock(); 3922 - group_info = rcu_dereference(sbi->s_group_info); 3923 3918 for (i = 0; i < num_meta_group_infos; i++) 3924 3919 kfree(group_info[i]); 3925 3920 kvfree(group_info); 3926 - rcu_read_unlock(); 3927 3921 } 3928 3922 ext4_mb_avg_fragment_size_destroy(sbi); 3929 3923 ext4_mb_largest_free_orders_destroy(sbi); ··· 4084 4084 4085 4085 #define EXT4_MB_BITMAP_MARKED_CHECK 0x0001 4086 4086 #define EXT4_MB_SYNC_UPDATE 0x0002 4087 - static int 4087 + int 4088 4088 ext4_mb_mark_context(handle_t *handle, struct super_block *sb, bool state, 4089 4089 ext4_group_t group, ext4_grpblk_t blkoff, 4090 4090 ext4_grpblk_t len, int flags, ext4_grpblk_t *ret_changed) ··· 7188 7188 return error; 7189 7189 } 7190 7190 7191 - #ifdef CONFIG_EXT4_KUNIT_TESTS 7192 - #include "mballoc-test.c" 7191 + #if IS_ENABLED(CONFIG_EXT4_KUNIT_TESTS) 7192 + void mb_clear_bits_test(void *bm, int cur, int len) 7193 + { 7194 + mb_clear_bits(bm, cur, len); 7195 + } 7196 + EXPORT_SYMBOL_FOR_EXT4_TEST(mb_clear_bits_test); 7197 + 7198 + ext4_fsblk_t 7199 + ext4_mb_new_blocks_simple_test(struct ext4_allocation_request *ar, 7200 + int *errp) 7201 + { 7202 + return ext4_mb_new_blocks_simple(ar, errp); 7203 + } 7204 + EXPORT_SYMBOL_FOR_EXT4_TEST(ext4_mb_new_blocks_simple_test); 7205 + 7206 + int mb_find_next_zero_bit_test(void *addr, int max, int start) 7207 + { 7208 + return mb_find_next_zero_bit(addr, max, start); 7209 + } 7210 + EXPORT_SYMBOL_FOR_EXT4_TEST(mb_find_next_zero_bit_test); 7211 + 7212 + int mb_find_next_bit_test(void *addr, int max, int start) 7213 + { 7214 + return mb_find_next_bit(addr, max, start); 7215 + } 7216 + EXPORT_SYMBOL_FOR_EXT4_TEST(mb_find_next_bit_test); 7217 + 7218 + 
void mb_clear_bit_test(int bit, void *addr) 7219 + { 7220 + mb_clear_bit(bit, addr); 7221 + } 7222 + EXPORT_SYMBOL_FOR_EXT4_TEST(mb_clear_bit_test); 7223 + 7224 + int mb_test_bit_test(int bit, void *addr) 7225 + { 7226 + return mb_test_bit(bit, addr); 7227 + } 7228 + EXPORT_SYMBOL_FOR_EXT4_TEST(mb_test_bit_test); 7229 + 7230 + int ext4_mb_mark_diskspace_used_test(struct ext4_allocation_context *ac, 7231 + handle_t *handle) 7232 + { 7233 + return ext4_mb_mark_diskspace_used(ac, handle); 7234 + } 7235 + EXPORT_SYMBOL_FOR_EXT4_TEST(ext4_mb_mark_diskspace_used_test); 7236 + 7237 + int mb_mark_used_test(struct ext4_buddy *e4b, struct ext4_free_extent *ex) 7238 + { 7239 + return mb_mark_used(e4b, ex); 7240 + } 7241 + EXPORT_SYMBOL_FOR_EXT4_TEST(mb_mark_used_test); 7242 + 7243 + void ext4_mb_generate_buddy_test(struct super_block *sb, void *buddy, 7244 + void *bitmap, ext4_group_t group, 7245 + struct ext4_group_info *grp) 7246 + { 7247 + ext4_mb_generate_buddy(sb, buddy, bitmap, group, grp); 7248 + } 7249 + EXPORT_SYMBOL_FOR_EXT4_TEST(ext4_mb_generate_buddy_test); 7250 + 7251 + int ext4_mb_load_buddy_test(struct super_block *sb, ext4_group_t group, 7252 + struct ext4_buddy *e4b) 7253 + { 7254 + return ext4_mb_load_buddy(sb, group, e4b); 7255 + } 7256 + EXPORT_SYMBOL_FOR_EXT4_TEST(ext4_mb_load_buddy_test); 7257 + 7258 + void ext4_mb_unload_buddy_test(struct ext4_buddy *e4b) 7259 + { 7260 + ext4_mb_unload_buddy(e4b); 7261 + } 7262 + EXPORT_SYMBOL_FOR_EXT4_TEST(ext4_mb_unload_buddy_test); 7263 + 7264 + void mb_free_blocks_test(struct inode *inode, struct ext4_buddy *e4b, 7265 + int first, int count) 7266 + { 7267 + mb_free_blocks(inode, e4b, first, count); 7268 + } 7269 + EXPORT_SYMBOL_FOR_EXT4_TEST(mb_free_blocks_test); 7270 + 7271 + void ext4_free_blocks_simple_test(struct inode *inode, ext4_fsblk_t block, 7272 + unsigned long count) 7273 + { 7274 + return ext4_free_blocks_simple(inode, block, count); 7275 + } 7276 + EXPORT_SYMBOL_FOR_EXT4_TEST(ext4_free_blocks_simple_test); 7277 + 7278 + EXPORT_SYMBOL_FOR_EXT4_TEST(ext4_wait_block_bitmap); 7279 + EXPORT_SYMBOL_FOR_EXT4_TEST(ext4_mb_init); 7280 + EXPORT_SYMBOL_FOR_EXT4_TEST(ext4_get_group_desc); 7281 + EXPORT_SYMBOL_FOR_EXT4_TEST(ext4_count_free_clusters); 7282 + EXPORT_SYMBOL_FOR_EXT4_TEST(ext4_get_group_info); 7283 + EXPORT_SYMBOL_FOR_EXT4_TEST(ext4_free_group_clusters_set); 7284 + EXPORT_SYMBOL_FOR_EXT4_TEST(ext4_mb_release); 7285 + EXPORT_SYMBOL_FOR_EXT4_TEST(ext4_read_block_bitmap_nowait); 7286 + EXPORT_SYMBOL_FOR_EXT4_TEST(mb_set_bits); 7287 + EXPORT_SYMBOL_FOR_EXT4_TEST(ext4_fc_init_inode); 7288 + EXPORT_SYMBOL_FOR_EXT4_TEST(ext4_mb_mark_context); 7193 7289 #endif
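The mballoc change above stops #including mballoc-test.c into the kernel proper and instead exports thin, logic-free wrappers around the static helpers so the KUnit suite can link against them as a separate module. A minimal userspace sketch of the same pattern, using hypothetical bits_set()/bits_clear() helpers in place of the real mb_set_bits()/mb_clear_bits():

    #include <assert.h>
    #include <stdio.h>

    /* Static helpers, normally invisible outside this translation unit
     * (stand-ins for mb_set_bits()/mb_clear_bits() style bitmap code). */
    static void bits_set(unsigned char *bm, int start, int len)
    {
        for (int i = start; i < start + len; i++)
            bm[i / 8] |= 1u << (i % 8);
    }

    static void bits_clear(unsigned char *bm, int start, int len)
    {
        for (int i = start; i < start + len; i++)
            bm[i / 8] &= ~(1u << (i % 8));
    }

    /* Thin test wrappers: no logic of their own, they only make the
     * static helpers callable from an external test build. */
    void bits_set_test(unsigned char *bm, int start, int len)
    {
        bits_set(bm, start, len);
    }

    void bits_clear_test(unsigned char *bm, int start, int len)
    {
        bits_clear(bm, start, len);
    }

    int main(void)
    {
        unsigned char bm[4] = { 0 };

        bits_set_test(bm, 3, 10);      /* set bits 3..12 */
        bits_clear_test(bm, 4, 2);     /* clear bits 4..5 */

        assert(bm[0] == 0xc8);         /* bits 3, 6, 7 remain set */
        assert(bm[1] == 0x1f);         /* bits 8..12 remain set */
        printf("bitmap: %02x %02x\n", bm[0], bm[1]);
        return 0;
    }

The wrappers add no behaviour of their own; they only widen the linkage of otherwise-static code for the test build.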
+30
fs/ext4/mballoc.h
··· 270 270 ext4_mballoc_query_range_fn formatter, 271 271 void *priv); 272 272 273 + extern int ext4_mb_mark_context(handle_t *handle, 274 + struct super_block *sb, bool state, 275 + ext4_group_t group, ext4_grpblk_t blkoff, 276 + ext4_grpblk_t len, int flags, 277 + ext4_grpblk_t *ret_changed); 278 + #if IS_ENABLED(CONFIG_EXT4_KUNIT_TESTS) 279 + extern void mb_clear_bits_test(void *bm, int cur, int len); 280 + extern ext4_fsblk_t 281 + ext4_mb_new_blocks_simple_test(struct ext4_allocation_request *ar, 282 + int *errp); 283 + extern int mb_find_next_zero_bit_test(void *addr, int max, int start); 284 + extern int mb_find_next_bit_test(void *addr, int max, int start); 285 + extern void mb_clear_bit_test(int bit, void *addr); 286 + extern int mb_test_bit_test(int bit, void *addr); 287 + extern int 288 + ext4_mb_mark_diskspace_used_test(struct ext4_allocation_context *ac, 289 + handle_t *handle); 290 + extern int mb_mark_used_test(struct ext4_buddy *e4b, 291 + struct ext4_free_extent *ex); 292 + extern void ext4_mb_generate_buddy_test(struct super_block *sb, 293 + void *buddy, void *bitmap, ext4_group_t group, 294 + struct ext4_group_info *grp); 295 + extern int ext4_mb_load_buddy_test(struct super_block *sb, 296 + ext4_group_t group, struct ext4_buddy *e4b); 297 + extern void ext4_mb_unload_buddy_test(struct ext4_buddy *e4b); 298 + extern void mb_free_blocks_test(struct inode *inode, 299 + struct ext4_buddy *e4b, int first, int count); 300 + extern void ext4_free_blocks_simple_test(struct inode *inode, 301 + ext4_fsblk_t block, unsigned long count); 302 + #endif 273 303 #endif
+8 -2
fs/ext4/page-io.c
··· 524 524 nr_to_submit++; 525 525 } while ((bh = bh->b_this_page) != head); 526 526 527 - /* Nothing to submit? Just unlock the folio... */ 528 - if (!nr_to_submit) 527 + if (!nr_to_submit) { 528 + /* 529 + * We have nothing to submit. Just cycle the folio through 530 + * writeback state to properly update xarray tags. 531 + */ 532 + __folio_start_writeback(folio, keep_towrite); 533 + folio_end_writeback(folio); 529 534 return 0; 535 + } 530 536 531 537 bh = head = folio_buffers(folio); 532 538
+31 -6
fs/ext4/super.c
··· 1254 1254 struct buffer_head **group_desc; 1255 1255 int i; 1256 1256 1257 - rcu_read_lock(); 1258 - group_desc = rcu_dereference(sbi->s_group_desc); 1257 + group_desc = rcu_access_pointer(sbi->s_group_desc); 1259 1258 for (i = 0; i < sbi->s_gdb_count; i++) 1260 1259 brelse(group_desc[i]); 1261 1260 kvfree(group_desc); 1262 - rcu_read_unlock(); 1263 1261 } 1264 1262 1265 1263 static void ext4_flex_groups_free(struct ext4_sb_info *sbi) ··· 1265 1267 struct flex_groups **flex_groups; 1266 1268 int i; 1267 1269 1268 - rcu_read_lock(); 1269 - flex_groups = rcu_dereference(sbi->s_flex_groups); 1270 + flex_groups = rcu_access_pointer(sbi->s_flex_groups); 1270 1271 if (flex_groups) { 1271 1272 for (i = 0; i < sbi->s_flex_groups_allocated; i++) 1272 1273 kvfree(flex_groups[i]); 1273 1274 kvfree(flex_groups); 1274 1275 } 1275 - rcu_read_unlock(); 1276 1276 } 1277 1277 1278 1278 static void ext4_put_super(struct super_block *sb) ··· 1523 1527 invalidate_inode_buffers(inode); 1524 1528 clear_inode(inode); 1525 1529 ext4_discard_preallocations(inode); 1530 + /* 1531 + * We must remove the inode from the hash before ext4_free_inode() 1532 + * clears the bit in inode bitmap as otherwise another process reusing 1533 + * the inode will block in insert_inode_hash() waiting for inode 1534 + * eviction to complete while holding transaction handle open, but 1535 + * ext4_evict_inode() still running for that inode could block waiting 1536 + * for transaction commit if the inode is marked as IS_SYNC => deadlock. 1537 + * 1538 + * Removing the inode from the hash here is safe. There are two cases 1539 + * to consider: 1540 + * 1) The inode still has references to it (i_nlink > 0). In that case 1541 + * we are keeping the inode and once we remove the inode from the hash, 1542 + * iget() can create the new inode structure for the same inode number 1543 + * and we are fine with that as all IO on behalf of the inode is 1544 + * finished. 1545 + * 2) We are deleting the inode (i_nlink == 0). In that case inode 1546 + * number cannot be reused until ext4_free_inode() clears the bit in 1547 + * the inode bitmap, at which point all IO is done and reuse is fine 1548 + * again. 1549 + */ 1550 + remove_inode_hash(inode); 1526 1551 ext4_es_remove_extent(inode, 0, EXT_MAX_BLOCKS); 1527 1552 dquot_drop(inode); 1528 1553 if (EXT4_I(inode)->jinode) { ··· 3650 3633 "extents feature\n"); 3651 3634 return 0; 3652 3635 } 3636 + if (ext4_has_feature_bigalloc(sb) && 3637 + le32_to_cpu(EXT4_SB(sb)->s_es->s_first_data_block)) { 3638 + ext4_msg(sb, KERN_WARNING, 3639 + "bad geometry: bigalloc file system with non-zero " 3640 + "first_data_block\n"); 3641 + return 0; 3642 + } 3653 3643 3654 3644 #if !IS_ENABLED(CONFIG_QUOTA) || !IS_ENABLED(CONFIG_QFMT_V2) 3655 3645 if (!readonly && (ext4_has_feature_quota(sb) || ··· 5427 5403 5428 5404 timer_setup(&sbi->s_err_report, print_daily_error_info, 0); 5429 5405 spin_lock_init(&sbi->s_error_lock); 5406 + mutex_init(&sbi->s_error_notify_mutex); 5430 5407 INIT_WORK(&sbi->s_sb_upd_work, update_super_work); 5431 5408 5432 5409 err = ext4_group_desc_init(sb, es, logical_sb_block, &first_not_zeroed);
+9 -1
fs/ext4/sysfs.c
··· 597 597 598 598 void ext4_notify_error_sysfs(struct ext4_sb_info *sbi) 599 599 { 600 - sysfs_notify(&sbi->s_kobj, NULL, "errors_count"); 600 + mutex_lock(&sbi->s_error_notify_mutex); 601 + if (sbi->s_kobj.state_in_sysfs) 602 + sysfs_notify(&sbi->s_kobj, NULL, "errors_count"); 603 + mutex_unlock(&sbi->s_error_notify_mutex); 601 604 } 602 605 603 606 static struct kobject *ext4_root; ··· 613 610 int err; 614 611 615 612 init_completion(&sbi->s_kobj_unregister); 613 + mutex_lock(&sbi->s_error_notify_mutex); 616 614 err = kobject_init_and_add(&sbi->s_kobj, &ext4_sb_ktype, ext4_root, 617 615 "%s", sb->s_id); 616 + mutex_unlock(&sbi->s_error_notify_mutex); 618 617 if (err) { 619 618 kobject_put(&sbi->s_kobj); 620 619 wait_for_completion(&sbi->s_kobj_unregister); ··· 649 644 650 645 if (sbi->s_proc) 651 646 remove_proc_subtree(sb->s_id, ext4_proc_root); 647 + 648 + mutex_lock(&sbi->s_error_notify_mutex); 652 649 kobject_del(&sbi->s_kobj); 650 + mutex_unlock(&sbi->s_error_notify_mutex); 653 651 } 654 652 655 653 int __init ext4_init_sysfs(void)
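The sysfs change serializes error notification against kobject registration and removal with s_error_notify_mutex, so sysfs_notify() can never fire against a kobject that is mid-registration or being torn down. A small pthread sketch of that guard, with a hypothetical "registered" flag standing in for kobj.state_in_sysfs:

    #include <pthread.h>
    #include <stdbool.h>
    #include <stdio.h>

    /* Hypothetical object that can be (un)registered; notify() must
     * never act on it while it is not registered. */
    struct obj {
        pthread_mutex_t notify_lock;
        bool registered;
    };

    static void obj_register(struct obj *o)
    {
        pthread_mutex_lock(&o->notify_lock);
        o->registered = true;           /* kobject_init_and_add() */
        pthread_mutex_unlock(&o->notify_lock);
    }

    static void obj_unregister(struct obj *o)
    {
        pthread_mutex_lock(&o->notify_lock);
        o->registered = false;          /* kobject_del() */
        pthread_mutex_unlock(&o->notify_lock);
    }

    static void obj_notify(struct obj *o)
    {
        pthread_mutex_lock(&o->notify_lock);
        if (o->registered)
            printf("notify delivered\n");   /* sysfs_notify() */
        else
            printf("notify skipped, not registered\n");
        pthread_mutex_unlock(&o->notify_lock);
    }

    int main(void)
    {
        struct obj o = { PTHREAD_MUTEX_INITIALIZER, false };

        obj_notify(&o);         /* error reported before registration */
        obj_register(&o);
        obj_notify(&o);         /* normal case */
        obj_unregister(&o);
        obj_notify(&o);         /* error reported during teardown */
        return 0;
    }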
+27 -9
fs/fs-writeback.c
··· 1711 1711 } 1712 1712 } 1713 1713 1714 + static bool __sync_lazytime(struct inode *inode) 1715 + { 1716 + spin_lock(&inode->i_lock); 1717 + if (!(inode_state_read(inode) & I_DIRTY_TIME)) { 1718 + spin_unlock(&inode->i_lock); 1719 + return false; 1720 + } 1721 + inode_state_clear(inode, I_DIRTY_TIME); 1722 + spin_unlock(&inode->i_lock); 1723 + inode->i_op->sync_lazytime(inode); 1724 + return true; 1725 + } 1726 + 1714 1727 bool sync_lazytime(struct inode *inode) 1715 1728 { 1716 1729 if (!(inode_state_read_once(inode) & I_DIRTY_TIME)) ··· 1731 1718 1732 1719 trace_writeback_lazytime(inode); 1733 1720 if (inode->i_op->sync_lazytime) 1734 - inode->i_op->sync_lazytime(inode); 1735 - else 1736 - mark_inode_dirty_sync(inode); 1721 + return __sync_lazytime(inode); 1722 + mark_inode_dirty_sync(inode); 1737 1723 return true; 1738 1724 } 1739 1725 ··· 2787 2775 * The mapping can appear untagged while still on-list since we 2788 2776 * do not have the mapping lock. Skip it here, wb completion 2789 2777 * will remove it. 2790 - * 2791 - * If the mapping does not have data integrity semantics, 2792 - * there's no need to wait for the writeout to complete, as the 2793 - * mapping cannot guarantee that data is persistently stored. 2794 2778 */ 2795 - if (!mapping_tagged(mapping, PAGECACHE_TAG_WRITEBACK) || 2796 - mapping_no_data_integrity(mapping)) 2779 + if (!mapping_tagged(mapping, PAGECACHE_TAG_WRITEBACK)) 2797 2780 continue; 2798 2781 2799 2782 spin_unlock_irq(&sb->s_inode_wblist_lock); ··· 2923 2916 */ 2924 2917 if (bdi == &noop_backing_dev_info) 2925 2918 return; 2919 + 2920 + /* 2921 + * If the superblock has SB_I_NO_DATA_INTEGRITY set, there's no need to 2922 + * wait for the writeout to complete, as the filesystem cannot guarantee 2923 + * data persistence on sync. Just kick off writeback and return. 2924 + */ 2925 + if (sb->s_iflags & SB_I_NO_DATA_INTEGRITY) { 2926 + wakeup_flusher_threads_bdi(bdi, WB_REASON_SYNC); 2927 + return; 2928 + } 2929 + 2926 2930 WARN_ON(!rwsem_is_locked(&sb->s_umount)); 2927 2931 2928 2932 /* protect against inode wb switch, see inode_switch_wbs_work_fn() */
+1 -3
fs/fuse/file.c
··· 3201 3201 3202 3202 inode->i_fop = &fuse_file_operations; 3203 3203 inode->i_data.a_ops = &fuse_file_aops; 3204 - if (fc->writeback_cache) { 3204 + if (fc->writeback_cache) 3205 3205 mapping_set_writeback_may_deadlock_on_reclaim(&inode->i_data); 3206 - mapping_set_no_data_integrity(&inode->i_data); 3207 - } 3208 3206 3209 3207 INIT_LIST_HEAD(&fi->write_files); 3210 3208 INIT_LIST_HEAD(&fi->queued_writes);
+1
fs/fuse/inode.c
··· 1709 1709 sb->s_export_op = &fuse_export_operations; 1710 1710 sb->s_iflags |= SB_I_IMA_UNVERIFIABLE_SIGNATURE; 1711 1711 sb->s_iflags |= SB_I_NOIDMAP; 1712 + sb->s_iflags |= SB_I_NO_DATA_INTEGRITY; 1712 1713 if (sb->s_user_ns != &init_user_ns) 1713 1714 sb->s_iflags |= SB_I_UNTRUSTED_MOUNTER; 1714 1715 sb->s_flags &= ~(SB_NOSEC | SB_I_VERSION);
+50 -1
fs/iomap/bio.c
··· 8 8 #include "internal.h" 9 9 #include "trace.h" 10 10 11 - static void iomap_read_end_io(struct bio *bio) 11 + static DEFINE_SPINLOCK(failed_read_lock); 12 + static struct bio_list failed_read_list = BIO_EMPTY_LIST; 13 + 14 + static void __iomap_read_end_io(struct bio *bio) 12 15 { 13 16 int error = blk_status_to_errno(bio->bi_status); 14 17 struct folio_iter fi; ··· 19 16 bio_for_each_folio_all(fi, bio) 20 17 iomap_finish_folio_read(fi.folio, fi.offset, fi.length, error); 21 18 bio_put(bio); 19 + } 20 + 21 + static void 22 + iomap_fail_reads( 23 + struct work_struct *work) 24 + { 25 + struct bio *bio; 26 + struct bio_list tmp = BIO_EMPTY_LIST; 27 + unsigned long flags; 28 + 29 + spin_lock_irqsave(&failed_read_lock, flags); 30 + bio_list_merge_init(&tmp, &failed_read_list); 31 + spin_unlock_irqrestore(&failed_read_lock, flags); 32 + 33 + while ((bio = bio_list_pop(&tmp)) != NULL) { 34 + __iomap_read_end_io(bio); 35 + cond_resched(); 36 + } 37 + } 38 + 39 + static DECLARE_WORK(failed_read_work, iomap_fail_reads); 40 + 41 + static void iomap_fail_buffered_read(struct bio *bio) 42 + { 43 + unsigned long flags; 44 + 45 + /* 46 + * Bounce I/O errors to a workqueue to avoid nested i_lock acquisitions 47 + * in the fserror code. The caller no longer owns the bio reference 48 + * after the spinlock drops. 49 + */ 50 + spin_lock_irqsave(&failed_read_lock, flags); 51 + if (bio_list_empty(&failed_read_list)) 52 + WARN_ON_ONCE(!schedule_work(&failed_read_work)); 53 + bio_list_add(&failed_read_list, bio); 54 + spin_unlock_irqrestore(&failed_read_lock, flags); 55 + } 56 + 57 + static void iomap_read_end_io(struct bio *bio) 58 + { 59 + if (bio->bi_status) { 60 + iomap_fail_buffered_read(bio); 61 + return; 62 + } 63 + 64 + __iomap_read_end_io(bio); 22 65 } 23 66 24 67 static void iomap_bio_submit_read(struct iomap_read_folio_ctx *ctx)
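The iomap hunk defers failed read completions to a work item: the completion handler only queues the bio on a locked list and schedules the worker, which then finishes the folios in a context where taking i_lock is safe. Below is a hedged userspace sketch of the same hand-off, using a pthread worker and condition variable in place of schedule_work(); all names are illustrative:

    #include <pthread.h>
    #include <stdio.h>
    #include <stdlib.h>

    /* Hypothetical "failed I/O" record; stands in for a bio that
     * completed with an error and must be finished elsewhere. */
    struct failed_io {
        int id;
        struct failed_io *next;
    };

    static pthread_mutex_t failed_lock = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t failed_cond = PTHREAD_COND_INITIALIZER;
    static struct failed_io *failed_list;
    static int done;

    /* Completion path: too restricted to do the real work, so defer. */
    static void complete_with_error(struct failed_io *io)
    {
        pthread_mutex_lock(&failed_lock);
        io->next = failed_list;
        failed_list = io;
        pthread_cond_signal(&failed_cond);      /* schedule_work() */
        pthread_mutex_unlock(&failed_lock);
    }

    /* Worker: drains the list where sleeping and locking are fine. */
    static void *worker(void *arg)
    {
        (void)arg;
        for (;;) {
            pthread_mutex_lock(&failed_lock);
            while (!failed_list && !done)
                pthread_cond_wait(&failed_cond, &failed_lock);
            struct failed_io *list = failed_list;
            failed_list = NULL;
            int stop = done;
            pthread_mutex_unlock(&failed_lock);

            while (list) {
                struct failed_io *io = list;
                list = io->next;
                printf("finishing failed io %d\n", io->id);
                free(io);
            }
            if (stop)
                return NULL;
        }
    }

    int main(void)
    {
        pthread_t t;
        pthread_create(&t, NULL, worker, NULL);

        for (int i = 0; i < 3; i++) {
            struct failed_io *io = malloc(sizeof(*io));
            io->id = i;
            complete_with_error(io);
        }

        pthread_mutex_lock(&failed_lock);
        done = 1;
        pthread_cond_signal(&failed_cond);
        pthread_mutex_unlock(&failed_lock);
        pthread_join(t, NULL);
        return 0;
    }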
+10 -5
fs/iomap/buffered-io.c
··· 514 514 loff_t length = iomap_length(iter); 515 515 struct folio *folio = ctx->cur_folio; 516 516 size_t folio_len = folio_size(folio); 517 + struct iomap_folio_state *ifs; 517 518 size_t poff, plen; 518 519 loff_t pos_diff; 519 520 int ret; ··· 526 525 return iomap_iter_advance(iter, length); 527 526 } 528 527 529 - ifs_alloc(iter->inode, folio, iter->flags); 528 + ifs = ifs_alloc(iter->inode, folio, iter->flags); 530 529 531 530 length = min_t(loff_t, length, folio_len - offset_in_folio(folio, pos)); 532 531 while (length) { ··· 561 560 562 561 *bytes_submitted += plen; 563 562 /* 564 - * If the entire folio has been read in by the IO 565 - * helper, then the helper owns the folio and will end 566 - * the read on it. 563 + * Hand off folio ownership to the IO helper when: 564 + * 1) The entire folio has been submitted for IO, or 565 + * 2) There is no ifs attached to the folio 566 + * 567 + * Case (2) occurs when 1 << i_blkbits matches the folio 568 + * size but the underlying filesystem or block device 569 + * uses a smaller granularity for IO. 567 570 */ 568 - if (*bytes_submitted == folio_len) 571 + if (*bytes_submitted == folio_len || !ifs) 569 572 ctx->cur_folio = NULL; 570 573 } 571 574
+13 -2
fs/jbd2/checkpoint.c
··· 267 267 */ 268 268 BUFFER_TRACE(bh, "queue"); 269 269 get_bh(bh); 270 - J_ASSERT_BH(bh, !buffer_jwrite(bh)); 270 + if (WARN_ON_ONCE(buffer_jwrite(bh))) { 271 + put_bh(bh); /* drop the ref we just took */ 272 + spin_unlock(&journal->j_list_lock); 273 + /* Clean up any previously batched buffers */ 274 + if (batch_count) 275 + __flush_batch(journal, &batch_count); 276 + jbd2_journal_abort(journal, -EFSCORRUPTED); 277 + return -EFSCORRUPTED; 278 + } 271 279 journal->j_chkpt_bhs[batch_count++] = bh; 272 280 transaction->t_chp_stats.cs_written++; 273 281 transaction->t_checkpoint_list = jh->b_cpnext; ··· 333 325 334 326 if (!jbd2_journal_get_log_tail(journal, &first_tid, &blocknr)) 335 327 return 1; 336 - J_ASSERT(blocknr != 0); 328 + if (WARN_ON_ONCE(blocknr == 0)) { 329 + jbd2_journal_abort(journal, -EFSCORRUPTED); 330 + return -EFSCORRUPTED; 331 + } 337 332 338 333 /* 339 334 * We need to make sure that any blocks that were recently written out
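The jbd2 change replaces J_ASSERT crashes on corrupted checkpoint state with WARN_ON_ONCE(), dropping the reference just taken, flushing any batched buffers, aborting the journal and returning -EFSCORRUPTED. A minimal sketch of that validate-and-unwind shape (userspace, hypothetical names, -EUCLEAN standing in for -EFSCORRUPTED):

    #include <errno.h>
    #include <stdio.h>

    /* Hypothetical stand-ins for a buffer head and journal. */
    struct buffer { int refcount; int in_write; };
    struct journal { int aborted; };

    static void journal_abort(struct journal *j, int err)
    {
        j->aborted = err;
        fprintf(stderr, "journal aborted: %d\n", err);
    }

    /*
     * Old style: assert(!bh->in_write) and crash on corruption.
     * New style: warn, drop the reference we just took, abort the
     * journal and hand the error back to the caller.
     */
    static int checkpoint_buffer(struct journal *j, struct buffer *bh)
    {
        bh->refcount++;                         /* get_bh() */
        if (bh->in_write) {                     /* was J_ASSERT_BH() */
            fprintf(stderr, "WARN: buffer already under write\n");
            bh->refcount--;                     /* put_bh() */
            journal_abort(j, -EUCLEAN);
            return -EUCLEAN;
        }

        printf("buffer queued for checkpoint write\n");
        bh->refcount--;
        return 0;
    }

    int main(void)
    {
        struct journal j = { 0 };
        struct buffer good = { 0, 0 }, bad = { 0, 1 };

        printf("good: %d\n", checkpoint_buffer(&j, &good));
        printf("bad:  %d\n", checkpoint_buffer(&j, &bad));
        return 0;
    }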
+8 -2
fs/namei.c
··· 2437 2437 EXPORT_SYMBOL(hashlen_string); 2438 2438 2439 2439 /* 2440 - * Calculate the length and hash of the path component, and 2441 - * return the length as the result. 2440 + * hash_name - Calculate the length and hash of the path component 2441 + * @nd: the path resolution state 2442 + * @name: the pathname to read the component from 2443 + * @lastword: if the component fits in a single word, LAST_WORD_IS_DOT, 2444 + * LAST_WORD_IS_DOTDOT, or some other value depending on whether the 2445 + * component is '.', '..', or something else. Otherwise, @lastword is 0. 2446 + * 2447 + * Returns: a pointer to the terminating '/' or NUL character in @name. 2442 2448 */ 2443 2449 static inline const char *hash_name(struct nameidata *nd, 2444 2450 const char *name,
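The new hash_name() kerneldoc pins down the contract: compute the component's length and hash, classify '.' and '..', and return a pointer to the terminating '/' or NUL. A simplified userspace sketch of that contract follows; the hash is a toy djb2 loop, not the kernel's word-at-a-time implementation:

    #include <stdio.h>
    #include <string.h>

    enum { LAST_NORM, LAST_DOT, LAST_DOTDOT };

    /*
     * Scan one path component starting at @name: compute a (toy) hash
     * and length, classify ".", ".." and everything else, and return a
     * pointer to the terminating '/' or NUL.
     */
    static const char *component_scan(const char *name, unsigned int *hash,
                                      unsigned int *len, int *type)
    {
        unsigned int h = 5381, n = 0;

        while (name[n] && name[n] != '/') {
            h = h * 33 + (unsigned char)name[n];
            n++;
        }

        *hash = h;
        *len = n;
        if (n == 1 && name[0] == '.')
            *type = LAST_DOT;
        else if (n == 2 && name[0] == '.' && name[1] == '.')
            *type = LAST_DOTDOT;
        else
            *type = LAST_NORM;
        return name + n;
    }

    int main(void)
    {
        const char *path = "usr/../share/./doc";
        unsigned int hash, len;
        int type;

        while (*path) {
            path = component_scan(path, &hash, &len, &type);
            printf("len=%u hash=%08x type=%d\n", len, hash, type);
            while (*path == '/')
                path++;             /* skip separators */
        }
        return 0;
    }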
+1 -2
fs/netfs/buffered_read.c
··· 171 171 spin_lock(&rreq->lock); 172 172 list_add_tail(&subreq->rreq_link, &stream->subrequests); 173 173 if (list_is_first(&subreq->rreq_link, &stream->subrequests)) { 174 - stream->front = subreq; 175 174 if (!stream->active) { 176 - stream->collected_to = stream->front->start; 175 + stream->collected_to = subreq->start; 177 176 /* Store list pointers before active flag */ 178 177 smp_store_release(&stream->active, true); 179 178 }
+1 -2
fs/netfs/direct_read.c
··· 71 71 spin_lock(&rreq->lock); 72 72 list_add_tail(&subreq->rreq_link, &stream->subrequests); 73 73 if (list_is_first(&subreq->rreq_link, &stream->subrequests)) { 74 - stream->front = subreq; 75 74 if (!stream->active) { 76 - stream->collected_to = stream->front->start; 75 + stream->collected_to = subreq->start; 77 76 /* Store list pointers before active flag */ 78 77 smp_store_release(&stream->active, true); 79 78 }
+11 -4
fs/netfs/direct_write.c
··· 111 111 netfs_prepare_write(wreq, stream, wreq->start + wreq->transferred); 112 112 subreq = stream->construct; 113 113 stream->construct = NULL; 114 - stream->front = NULL; 115 114 } 116 115 117 116 /* Check if (re-)preparation failed. */ ··· 185 186 stream->sreq_max_segs = INT_MAX; 186 187 187 188 netfs_get_subrequest(subreq, netfs_sreq_trace_get_resubmit); 188 - stream->prepare_write(subreq); 189 189 190 - __set_bit(NETFS_SREQ_IN_PROGRESS, &subreq->flags); 191 - netfs_stat(&netfs_n_wh_retry_write_subreq); 190 + if (stream->prepare_write) { 191 + stream->prepare_write(subreq); 192 + __set_bit(NETFS_SREQ_IN_PROGRESS, &subreq->flags); 193 + netfs_stat(&netfs_n_wh_retry_write_subreq); 194 + } else { 195 + struct iov_iter source; 196 + 197 + netfs_reset_iter(subreq); 198 + source = subreq->io_iter; 199 + netfs_reissue_write(stream, subreq, &source); 200 + } 192 201 } 193 202 194 203 netfs_unbuffered_write_done(wreq);
+43
fs/netfs/iterator.c
··· 143 143 } 144 144 145 145 /* 146 + * Select the span of a kvec iterator we're going to use. Limit it by both 147 + * maximum size and maximum number of segments. Returns the size of the span 148 + * in bytes. 149 + */ 150 + static size_t netfs_limit_kvec(const struct iov_iter *iter, size_t start_offset, 151 + size_t max_size, size_t max_segs) 152 + { 153 + const struct kvec *kvecs = iter->kvec; 154 + unsigned int nkv = iter->nr_segs, ix = 0, nsegs = 0; 155 + size_t len, span = 0, n = iter->count; 156 + size_t skip = iter->iov_offset + start_offset; 157 + 158 + if (WARN_ON(!iov_iter_is_kvec(iter)) || 159 + WARN_ON(start_offset > n) || 160 + n == 0) 161 + return 0; 162 + 163 + while (n && ix < nkv && skip) { 164 + len = kvecs[ix].iov_len; 165 + if (skip < len) 166 + break; 167 + skip -= len; 168 + n -= len; 169 + ix++; 170 + } 171 + 172 + while (n && ix < nkv) { 173 + len = min3(n, kvecs[ix].iov_len - skip, max_size); 174 + span += len; 175 + nsegs++; 176 + ix++; 177 + if (span >= max_size || nsegs >= max_segs) 178 + break; 179 + skip = 0; 180 + n -= len; 181 + } 182 + 183 + return min(span, max_size); 184 + } 185 + 186 + /* 146 187 * Select the span of an xarray iterator we're going to use. Limit it by both 147 188 * maximum size and maximum number of segments. It is assumed that segments 148 189 * can be larger than a page in size, provided they're physically contiguous. ··· 286 245 return netfs_limit_bvec(iter, start_offset, max_size, max_segs); 287 246 if (iov_iter_is_xarray(iter)) 288 247 return netfs_limit_xarray(iter, start_offset, max_size, max_segs); 248 + if (iov_iter_is_kvec(iter)) 249 + return netfs_limit_kvec(iter, start_offset, max_size, max_segs); 289 250 BUG(); 290 251 } 291 252 EXPORT_SYMBOL(netfs_limit_iter);
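netfs_limit_kvec() above walks the kvec array, skips the starting offset, and accumulates segment lengths until either the byte cap or the segment cap is hit. A cleaner userspace re-derivation of that span-limiting walk over struct iovec, offered only as a sketch of the algorithm rather than the kernel function itself:

    #include <stdio.h>
    #include <sys/uio.h>

    /*
     * Given an iovec array, a starting byte offset into it, a maximum
     * byte count and a maximum number of segments, return how many
     * bytes the next I/O may cover.
     */
    static size_t limit_iovec_span(const struct iovec *iov, unsigned int nr,
                                   size_t skip, size_t max_size,
                                   unsigned int max_segs)
    {
        size_t span = 0;
        unsigned int ix = 0, nsegs = 0;

        /* Skip whole segments covered by the starting offset. */
        while (ix < nr && skip >= iov[ix].iov_len) {
            skip -= iov[ix].iov_len;
            ix++;
        }

        /* Accumulate until either limit is reached. */
        while (ix < nr && span < max_size && nsegs < max_segs) {
            size_t len = iov[ix].iov_len - skip;

            if (len > max_size - span)
                len = max_size - span;
            span += len;
            nsegs++;
            skip = 0;
            ix++;
        }
        return span;
    }

    int main(void)
    {
        char a[100], b[300], c[50];
        struct iovec iov[3] = {
            { a, sizeof(a) }, { b, sizeof(b) }, { c, sizeof(c) },
        };

        /* Start 60 bytes in, cap at 256 bytes or 2 segments. */
        printf("span = %zu\n", limit_iovec_span(iov, 3, 60, 256, 2));
        return 0;
    }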
+2 -2
fs/netfs/read_collect.c
··· 205 205 * in progress. The issuer thread may be adding stuff to the tail 206 206 * whilst we're doing this. 207 207 */ 208 - front = READ_ONCE(stream->front); 208 + front = list_first_entry_or_null(&stream->subrequests, 209 + struct netfs_io_subrequest, rreq_link); 209 210 while (front) { 210 211 size_t transferred; 211 212 ··· 302 301 list_del_init(&front->rreq_link); 303 302 front = list_first_entry_or_null(&stream->subrequests, 304 303 struct netfs_io_subrequest, rreq_link); 305 - stream->front = front; 306 304 spin_unlock(&rreq->lock); 307 305 netfs_put_subrequest(remove, 308 306 notes & ABANDON_SREQ ?
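Across the netfs hunks the cached stream->front pointer goes away; the front subrequest is now always derived from the list head via list_first_entry_or_null(), leaving one less field to keep consistent with the list. A toy queue illustrating the same "derive the front, never cache it" choice, with hypothetical names:

    #include <stddef.h>
    #include <stdio.h>

    struct node {
        int id;
        struct node *next;
    };

    /* Singly linked queue: the "front" is always derived from the head
     * pointer instead of being cached in a second field. */
    struct queue {
        struct node *head;
        struct node **tailp;
    };

    static void queue_init(struct queue *q)
    {
        q->head = NULL;
        q->tailp = &q->head;
    }

    static void queue_add_tail(struct queue *q, struct node *n)
    {
        n->next = NULL;
        *q->tailp = n;
        q->tailp = &n->next;
    }

    /* The analogue of list_first_entry_or_null(). */
    static struct node *queue_front(struct queue *q)
    {
        return q->head;
    }

    static struct node *queue_pop(struct queue *q)
    {
        struct node *n = q->head;

        if (n) {
            q->head = n->next;
            if (!q->head)
                q->tailp = &q->head;
        }
        return n;
    }

    int main(void)
    {
        struct queue q;
        struct node a = { 1 }, b = { 2 };
        struct node *n;

        queue_init(&q);
        queue_add_tail(&q, &a);
        queue_add_tail(&q, &b);

        while ((n = queue_pop(&q)) != NULL)
            printf("collected %d, next front %d\n", n->id,
                   queue_front(&q) ? queue_front(&q)->id : 0);
        return 0;
    }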
+4 -1
fs/netfs/read_retry.c
··· 93 93 from->start, from->transferred, from->len); 94 94 95 95 if (test_bit(NETFS_SREQ_FAILED, &from->flags) || 96 - !test_bit(NETFS_SREQ_NEED_RETRY, &from->flags)) 96 + !test_bit(NETFS_SREQ_NEED_RETRY, &from->flags)) { 97 + subreq = from; 97 98 goto abandon; 99 + } 98 100 99 101 list_for_each_continue(next, &stream->subrequests) { 100 102 subreq = list_entry(next, struct netfs_io_subrequest, rreq_link); ··· 180 178 if (subreq == to) 181 179 break; 182 180 } 181 + subreq = NULL; 183 182 continue; 184 183 } 185 184
-1
fs/netfs/read_single.c
··· 107 107 spin_lock(&rreq->lock); 108 108 list_add_tail(&subreq->rreq_link, &stream->subrequests); 109 109 trace_netfs_sreq(subreq, netfs_sreq_trace_added); 110 - stream->front = subreq; 111 110 /* Store list pointers before active flag */ 112 111 smp_store_release(&stream->active, true); 113 112 spin_unlock(&rreq->lock);
+2 -2
fs/netfs/write_collect.c
··· 228 228 if (!smp_load_acquire(&stream->active)) 229 229 continue; 230 230 231 - front = stream->front; 231 + front = list_first_entry_or_null(&stream->subrequests, 232 + struct netfs_io_subrequest, rreq_link); 232 233 while (front) { 233 234 trace_netfs_collect_sreq(wreq, front); 234 235 //_debug("sreq [%x] %llx %zx/%zx", ··· 280 279 list_del_init(&front->rreq_link); 281 280 front = list_first_entry_or_null(&stream->subrequests, 282 281 struct netfs_io_subrequest, rreq_link); 283 - stream->front = front; 284 282 spin_unlock(&wreq->lock); 285 283 netfs_put_subrequest(remove, 286 284 notes & SAW_FAILURE ?
+1 -2
fs/netfs/write_issue.c
··· 206 206 spin_lock(&wreq->lock); 207 207 list_add_tail(&subreq->rreq_link, &stream->subrequests); 208 208 if (list_is_first(&subreq->rreq_link, &stream->subrequests)) { 209 - stream->front = subreq; 210 209 if (!stream->active) { 211 - stream->collected_to = stream->front->start; 210 + stream->collected_to = subreq->start; 212 211 /* Write list pointers before active flag */ 213 212 smp_store_release(&stream->active, true); 214 213 }
+3 -3
fs/overlayfs/copy_up.c
··· 1146 1146 return -EOVERFLOW; 1147 1147 1148 1148 /* 1149 - * With metacopy disabled, we fsync after final metadata copyup, for 1149 + * With "fsync=strict", we fsync after final metadata copyup, for 1150 1150 * both regular files and directories to get atomic copyup semantics 1151 1151 * on filesystems that do not use strict metadata ordering (e.g. ubifs). 1152 1152 * 1153 - * With metacopy enabled we want to avoid fsync on all meta copyup 1153 + * By default, we want to avoid fsync on all meta copyup, because 1154 1154 * that will hurt performance of workloads such as chown -R, so we 1155 1155 * only fsync on data copyup as legacy behavior. 1156 1156 */ 1157 - ctx.metadata_fsync = !OVL_FS(dentry->d_sb)->config.metacopy && 1157 + ctx.metadata_fsync = ovl_should_sync_metadata(OVL_FS(dentry->d_sb)) && 1158 1158 (S_ISREG(ctx.stat.mode) || S_ISDIR(ctx.stat.mode)); 1159 1159 ctx.metacopy = ovl_need_meta_copy_up(dentry, ctx.stat.mode, flags); 1160 1160
+21
fs/overlayfs/overlayfs.h
··· 99 99 OVL_VERITY_REQUIRE, 100 100 }; 101 101 102 + enum { 103 + OVL_FSYNC_VOLATILE, 104 + OVL_FSYNC_AUTO, 105 + OVL_FSYNC_STRICT, 106 + }; 107 + 102 108 /* 103 109 * The tuple (fh,uuid) is a universal unique identifier for a copy up origin, 104 110 * where: ··· 660 654 static inline bool ovl_xino_warn(struct ovl_fs *ofs) 661 655 { 662 656 return ofs->config.xino == OVL_XINO_ON; 657 + } 658 + 659 + static inline bool ovl_should_sync(struct ovl_fs *ofs) 660 + { 661 + return ofs->config.fsync_mode != OVL_FSYNC_VOLATILE; 662 + } 663 + 664 + static inline bool ovl_should_sync_metadata(struct ovl_fs *ofs) 665 + { 666 + return ofs->config.fsync_mode == OVL_FSYNC_STRICT; 667 + } 668 + 669 + static inline bool ovl_is_volatile(struct ovl_config *config) 670 + { 671 + return config->fsync_mode == OVL_FSYNC_VOLATILE; 663 672 } 664 673 665 674 /*
+1 -6
fs/overlayfs/ovl_entry.h
··· 18 18 int xino; 19 19 bool metacopy; 20 20 bool userxattr; 21 - bool ovl_volatile; 21 + int fsync_mode; 22 22 }; 23 23 24 24 struct ovl_sb { ··· 118 118 WARN_ON_ONCE(sb->s_type != &ovl_fs_type); 119 119 120 120 return (struct ovl_fs *)sb->s_fs_info; 121 - } 122 - 123 - static inline bool ovl_should_sync(struct ovl_fs *ofs) 124 - { 125 - return !ofs->config.ovl_volatile; 126 121 } 127 122 128 123 static inline unsigned int ovl_numlower(struct ovl_entry *oe)
+28 -5
fs/overlayfs/params.c
··· 58 58 Opt_xino, 59 59 Opt_metacopy, 60 60 Opt_verity, 61 + Opt_fsync, 61 62 Opt_volatile, 62 63 Opt_override_creds, 63 64 }; ··· 141 140 return OVL_VERITY_OFF; 142 141 } 143 142 143 + static const struct constant_table ovl_parameter_fsync[] = { 144 + { "volatile", OVL_FSYNC_VOLATILE }, 145 + { "auto", OVL_FSYNC_AUTO }, 146 + { "strict", OVL_FSYNC_STRICT }, 147 + {} 148 + }; 149 + 150 + static const char *ovl_fsync_mode(struct ovl_config *config) 151 + { 152 + return ovl_parameter_fsync[config->fsync_mode].name; 153 + } 154 + 155 + static int ovl_fsync_mode_def(void) 156 + { 157 + return OVL_FSYNC_AUTO; 158 + } 159 + 144 160 const struct fs_parameter_spec ovl_parameter_spec[] = { 145 161 fsparam_string_empty("lowerdir", Opt_lowerdir), 146 162 fsparam_file_or_string("lowerdir+", Opt_lowerdir_add), ··· 173 155 fsparam_enum("xino", Opt_xino, ovl_parameter_xino), 174 156 fsparam_enum("metacopy", Opt_metacopy, ovl_parameter_bool), 175 157 fsparam_enum("verity", Opt_verity, ovl_parameter_verity), 158 + fsparam_enum("fsync", Opt_fsync, ovl_parameter_fsync), 176 159 fsparam_flag("volatile", Opt_volatile), 177 160 fsparam_flag_no("override_creds", Opt_override_creds), 178 161 {} ··· 684 665 case Opt_verity: 685 666 config->verity_mode = result.uint_32; 686 667 break; 668 + case Opt_fsync: 669 + config->fsync_mode = result.uint_32; 670 + break; 687 671 case Opt_volatile: 688 - config->ovl_volatile = true; 672 + config->fsync_mode = OVL_FSYNC_VOLATILE; 689 673 break; 690 674 case Opt_userxattr: 691 675 config->userxattr = true; ··· 822 800 ofs->config.nfs_export = ovl_nfs_export_def; 823 801 ofs->config.xino = ovl_xino_def(); 824 802 ofs->config.metacopy = ovl_metacopy_def; 803 + ofs->config.fsync_mode = ovl_fsync_mode_def(); 825 804 826 805 fc->s_fs_info = ofs; 827 806 fc->fs_private = ctx; ··· 893 870 config->index = false; 894 871 } 895 872 896 - if (!config->upperdir && config->ovl_volatile) { 873 + if (!config->upperdir && ovl_is_volatile(config)) { 897 874 pr_info("option \"volatile\" is meaningless in a non-upper mount, ignoring it.\n"); 898 - config->ovl_volatile = false; 875 + config->fsync_mode = ovl_fsync_mode_def(); 899 876 } 900 877 901 878 if (!config->upperdir && config->uuid == OVL_UUID_ON) { ··· 1093 1070 seq_printf(m, ",xino=%s", ovl_xino_mode(&ofs->config)); 1094 1071 if (ofs->config.metacopy != ovl_metacopy_def) 1095 1072 seq_printf(m, ",metacopy=%s", str_on_off(ofs->config.metacopy)); 1096 - if (ofs->config.ovl_volatile) 1097 - seq_puts(m, ",volatile"); 1073 + if (ofs->config.fsync_mode != ovl_fsync_mode_def()) 1074 + seq_printf(m, ",fsync=%s", ovl_fsync_mode(&ofs->config)); 1098 1075 if (ofs->config.userxattr) 1099 1076 seq_puts(m, ",userxattr"); 1100 1077 if (ofs->config.verity_mode != ovl_verity_mode_def())
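The overlayfs series turns the boolean "volatile" into a three-way fsync= option backed by a constant table, keeps "volatile" as an alias for fsync=volatile, and only emits fsync= in show_options when it differs from the default. A self-contained sketch of that string/enum table and default handling; the names are illustrative and this is not the real fs_parameter API:

    #include <stdio.h>
    #include <string.h>

    enum fsync_mode { FSYNC_VOLATILE, FSYNC_AUTO, FSYNC_STRICT };

    struct constant_table { const char *name; int value; };

    static const struct constant_table fsync_table[] = {
        { "volatile", FSYNC_VOLATILE },
        { "auto",     FSYNC_AUTO     },
        { "strict",   FSYNC_STRICT   },
        { NULL,       0              },
    };

    #define FSYNC_MODE_DEF FSYNC_AUTO

    static int parse_fsync(const char *arg, enum fsync_mode *mode)
    {
        for (const struct constant_table *t = fsync_table; t->name; t++) {
            if (strcmp(arg, t->name) == 0) {
                *mode = t->value;
                return 0;
            }
        }
        return -1;
    }

    static void show_options(enum fsync_mode mode)
    {
        /* Only print the option when it differs from the default. */
        if (mode != FSYNC_MODE_DEF)
            printf(",fsync=%s", fsync_table[mode].name);
        printf("\n");
    }

    int main(void)
    {
        enum fsync_mode mode = FSYNC_MODE_DEF;

        if (parse_fsync("strict", &mode) == 0)
            printf("parsed fsync=%s\n", fsync_table[mode].name);

        show_options(mode);             /* prints ,fsync=strict */
        show_options(FSYNC_MODE_DEF);   /* prints nothing extra */
        return 0;
    }

Indexing the table by the enum value works because the table order matches the enum, which is the same convention the params.c hunk relies on.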
+1 -1
fs/overlayfs/super.c
··· 776 776 * For volatile mount, create a incompat/volatile/dirty file to keep 777 777 * track of it. 778 778 */ 779 - if (ofs->config.ovl_volatile) { 779 + if (ovl_is_volatile(&ofs->config)) { 780 780 err = ovl_create_volatile_dirty(ofs); 781 781 if (err < 0) { 782 782 pr_err("Failed to create volatile/dirty file.\n");
+4 -1
fs/overlayfs/util.c
··· 85 85 if (!exportfs_can_decode_fh(sb->s_export_op)) 86 86 return 0; 87 87 88 - return sb->s_export_op->encode_fh ? -1 : FILEID_INO32_GEN; 88 + if (sb->s_export_op->encode_fh == generic_encode_ino32_fh) 89 + return FILEID_INO32_GEN; 90 + 91 + return -1; 89 92 } 90 93 91 94 struct dentry *ovl_indexdir(struct super_block *sb)
+4 -3
fs/smb/client/Makefile
··· 48 48 # Build the SMB2 error mapping table from smb2status.h 49 49 # 50 50 $(obj)/smb2_mapping_table.c: $(src)/../common/smb2status.h \ 51 - $(src)/gen_smb2_mapping 52 - $(call cmd,gen_smb2_mapping) 51 + $(src)/gen_smb2_mapping FORCE 52 + $(call if_changed,gen_smb2_mapping) 53 53 54 54 $(obj)/smb2maperror.o: $(obj)/smb2_mapping_table.c 55 55 ··· 58 58 59 59 obj-$(CONFIG_SMB_KUNIT_TESTS) += smb2maperror_test.o 60 60 61 - clean-files += smb2_mapping_table.c 61 + # Let Kbuild handle tracking and cleaning 62 + targets += smb2_mapping_table.c
+45 -27
fs/smb/server/oplock.c
··· 82 82 spin_unlock(&lb->lb_lock); 83 83 } 84 84 85 - static void lb_add(struct lease_table *lb) 85 + static struct lease_table *alloc_lease_table(struct oplock_info *opinfo) 86 86 { 87 - write_lock(&lease_list_lock); 88 - list_add(&lb->l_entry, &lease_table_list); 89 - write_unlock(&lease_list_lock); 87 + struct lease_table *lb; 88 + 89 + lb = kmalloc_obj(struct lease_table, KSMBD_DEFAULT_GFP); 90 + if (!lb) 91 + return NULL; 92 + 93 + memcpy(lb->client_guid, opinfo->conn->ClientGUID, 94 + SMB2_CLIENT_GUID_SIZE); 95 + INIT_LIST_HEAD(&lb->lease_list); 96 + spin_lock_init(&lb->lb_lock); 97 + return lb; 90 98 } 91 99 92 100 static int alloc_lease(struct oplock_info *opinfo, struct lease_ctx_info *lctx) ··· 1050 1042 lease2->version = lease1->version; 1051 1043 } 1052 1044 1053 - static int add_lease_global_list(struct oplock_info *opinfo) 1045 + static void add_lease_global_list(struct oplock_info *opinfo, 1046 + struct lease_table *new_lb) 1054 1047 { 1055 1048 struct lease_table *lb; 1056 1049 1057 - read_lock(&lease_list_lock); 1050 + write_lock(&lease_list_lock); 1058 1051 list_for_each_entry(lb, &lease_table_list, l_entry) { 1059 1052 if (!memcmp(lb->client_guid, opinfo->conn->ClientGUID, 1060 1053 SMB2_CLIENT_GUID_SIZE)) { 1061 1054 opinfo->o_lease->l_lb = lb; 1062 1055 lease_add_list(opinfo); 1063 - read_unlock(&lease_list_lock); 1064 - return 0; 1056 + write_unlock(&lease_list_lock); 1057 + kfree(new_lb); 1058 + return; 1065 1059 } 1066 1060 } 1067 - read_unlock(&lease_list_lock); 1068 1061 1069 - lb = kmalloc_obj(struct lease_table, KSMBD_DEFAULT_GFP); 1070 - if (!lb) 1071 - return -ENOMEM; 1072 - 1073 - memcpy(lb->client_guid, opinfo->conn->ClientGUID, 1074 - SMB2_CLIENT_GUID_SIZE); 1075 - INIT_LIST_HEAD(&lb->lease_list); 1076 - spin_lock_init(&lb->lb_lock); 1077 - opinfo->o_lease->l_lb = lb; 1062 + opinfo->o_lease->l_lb = new_lb; 1078 1063 lease_add_list(opinfo); 1079 - lb_add(lb); 1080 - return 0; 1064 + list_add(&new_lb->l_entry, &lease_table_list); 1065 + write_unlock(&lease_list_lock); 1081 1066 } 1082 1067 1083 1068 static void set_oplock_level(struct oplock_info *opinfo, int level, ··· 1190 1189 int err = 0; 1191 1190 struct oplock_info *opinfo = NULL, *prev_opinfo = NULL; 1192 1191 struct ksmbd_inode *ci = fp->f_ci; 1192 + struct lease_table *new_lb = NULL; 1193 1193 bool prev_op_has_lease; 1194 1194 __le32 prev_op_state = 0; 1195 1195 ··· 1293 1291 set_oplock_level(opinfo, req_op_level, lctx); 1294 1292 1295 1293 out: 1294 + /* 1295 + * Set o_fp before any publication so that concurrent readers 1296 + * (e.g. find_same_lease_key() on the lease list) that 1297 + * dereference opinfo->o_fp don't hit a NULL pointer. 1298 + * 1299 + * Keep the original publication order so concurrent opens can 1300 + * still observe the in-flight grant via ci->m_op_list, but make 1301 + * everything after opinfo_add() no-fail by preallocating any new 1302 + * lease_table first. 
1303 + */ 1304 + opinfo->o_fp = fp; 1305 + if (opinfo->is_lease) { 1306 + new_lb = alloc_lease_table(opinfo); 1307 + if (!new_lb) { 1308 + err = -ENOMEM; 1309 + goto err_out; 1310 + } 1311 + } 1312 + 1296 1313 opinfo_count_inc(fp); 1297 1314 opinfo_add(opinfo, fp); 1298 1315 1299 - if (opinfo->is_lease) { 1300 - err = add_lease_global_list(opinfo); 1301 - if (err) 1302 - goto err_out; 1303 - } 1316 + if (opinfo->is_lease) 1317 + add_lease_global_list(opinfo, new_lb); 1304 1318 1305 1319 rcu_assign_pointer(fp->f_opinfo, opinfo); 1306 - opinfo->o_fp = fp; 1307 1320 1308 1321 return 0; 1309 1322 err_out: 1310 - __free_opinfo(opinfo); 1323 + kfree(new_lb); 1324 + opinfo_put(opinfo); 1311 1325 return err; 1312 1326 } 1313 1327
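The ksmbd rework preallocates the lease_table before the oplock is published, so everything after opinfo_add() is no-fail, and frees the preallocated table when another connection's table already exists under the write lock. A userspace sketch of that allocate-outside-the-lock, insert-or-free-under-the-lock pattern (hypothetical names):

    #include <pthread.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    struct table {
        char key[16];
        struct table *next;
    };

    static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;
    static struct table *table_list;

    /*
     * Allocate the candidate before taking the lock, so the critical
     * section itself can never fail.  If someone else already inserted
     * a table for this key, reuse theirs and free ours.
     */
    static struct table *get_table(const char *key)
    {
        struct table *new_tbl = calloc(1, sizeof(*new_tbl));
        struct table *t;

        if (!new_tbl)
            return NULL;
        snprintf(new_tbl->key, sizeof(new_tbl->key), "%s", key);

        pthread_mutex_lock(&table_lock);
        for (t = table_list; t; t = t->next) {
            if (strcmp(t->key, key) == 0) {
                pthread_mutex_unlock(&table_lock);
                free(new_tbl);          /* lost the race */
                return t;
            }
        }
        new_tbl->next = table_list;
        table_list = new_tbl;
        pthread_mutex_unlock(&table_lock);
        return new_tbl;
    }

    int main(void)
    {
        struct table *a = get_table("guid-1");
        struct table *b = get_table("guid-1");  /* reuses a */
        struct table *c = get_table("guid-2");

        printf("a=%p b=%p same=%d\n", (void *)a, (void *)b, a == b);
        printf("c key=%s\n", c->key);
        return 0;
    }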
+52 -21
fs/smb/server/smb2pdu.c
··· 1939 1939 if (sess->user && sess->user->flags & KSMBD_USER_FLAG_DELAY_SESSION) 1940 1940 try_delay = true; 1941 1941 1942 - sess->last_active = jiffies; 1943 - sess->state = SMB2_SESSION_EXPIRED; 1942 + /* 1943 + * For binding requests, session belongs to another 1944 + * connection. Do not expire it. 1945 + */ 1946 + if (!(req->Flags & SMB2_SESSION_REQ_FLAG_BINDING)) { 1947 + sess->last_active = jiffies; 1948 + sess->state = SMB2_SESSION_EXPIRED; 1949 + } 1944 1950 ksmbd_user_session_put(sess); 1945 1951 work->sess = NULL; 1946 1952 if (try_delay) { ··· 4452 4446 d_info.wptr = (char *)rsp->Buffer; 4453 4447 d_info.rptr = (char *)rsp->Buffer; 4454 4448 d_info.out_buf_len = 4455 - smb2_calc_max_out_buf_len(work, 8, 4456 - le32_to_cpu(req->OutputBufferLength)); 4449 + smb2_calc_max_out_buf_len(work, 4450 + offsetof(struct smb2_query_directory_rsp, Buffer), 4451 + le32_to_cpu(req->OutputBufferLength)); 4457 4452 if (d_info.out_buf_len < 0) { 4458 4453 rc = -EINVAL; 4459 4454 goto err_out; ··· 4721 4714 } 4722 4715 4723 4716 buf_free_len = 4724 - smb2_calc_max_out_buf_len(work, 8, 4725 - le32_to_cpu(req->OutputBufferLength)); 4717 + smb2_calc_max_out_buf_len(work, 4718 + offsetof(struct smb2_query_info_rsp, Buffer), 4719 + le32_to_cpu(req->OutputBufferLength)); 4726 4720 if (buf_free_len < 0) 4727 4721 return -EINVAL; 4728 4722 ··· 4940 4932 int conv_len; 4941 4933 char *filename; 4942 4934 u64 time; 4943 - int ret; 4935 + int ret, buf_free_len, filename_len; 4936 + struct smb2_query_info_req *req = ksmbd_req_buf_next(work); 4944 4937 4945 4938 if (!(fp->daccess & FILE_READ_ATTRIBUTES_LE)) { 4946 4939 ksmbd_debug(SMB, "no right to read the attributes : 0x%x\n", ··· 4952 4943 filename = convert_to_nt_pathname(work->tcon->share_conf, &fp->filp->f_path); 4953 4944 if (IS_ERR(filename)) 4954 4945 return PTR_ERR(filename); 4946 + 4947 + filename_len = strlen(filename); 4948 + buf_free_len = smb2_calc_max_out_buf_len(work, 4949 + offsetof(struct smb2_query_info_rsp, Buffer) + 4950 + offsetof(struct smb2_file_all_info, FileName), 4951 + le32_to_cpu(req->OutputBufferLength)); 4952 + if (buf_free_len < (filename_len + 1) * 2) { 4953 + kfree(filename); 4954 + return -EINVAL; 4955 + } 4955 4956 4956 4957 ret = vfs_getattr(&fp->filp->f_path, &stat, STATX_BASIC_STATS, 4957 4958 AT_STATX_SYNC_AS_STAT); ··· 5006 4987 file_info->Mode = fp->coption; 5007 4988 file_info->AlignmentRequirement = 0; 5008 4989 conv_len = smbConvertToUTF16((__le16 *)file_info->FileName, filename, 5009 - PATH_MAX, conn->local_nls, 0); 4990 + min(filename_len, PATH_MAX), 4991 + conn->local_nls, 0); 5010 4992 conv_len *= 2; 5011 4993 file_info->FileNameLength = cpu_to_le32(conv_len); 5012 4994 rsp->OutputBufferLength = ··· 5061 5041 file_info = (struct smb2_file_stream_info *)rsp->Buffer; 5062 5042 5063 5043 buf_free_len = 5064 - smb2_calc_max_out_buf_len(work, 8, 5065 - le32_to_cpu(req->OutputBufferLength)); 5044 + smb2_calc_max_out_buf_len(work, 5045 + offsetof(struct smb2_query_info_rsp, Buffer), 5046 + le32_to_cpu(req->OutputBufferLength)); 5066 5047 if (buf_free_len < 0) 5067 5048 goto out; 5068 5049 ··· 7607 7586 rc = vfs_lock_file(filp, smb_lock->cmd, flock, NULL); 7608 7587 skip: 7609 7588 if (smb_lock->flags & SMB2_LOCKFLAG_UNLOCK) { 7589 + locks_free_lock(flock); 7590 + kfree(smb_lock); 7610 7591 if (!rc) { 7611 7592 ksmbd_debug(SMB, "File unlocked\n"); 7612 7593 } else if (rc == -ENOENT) { 7613 7594 rsp->hdr.Status = STATUS_NOT_LOCKED; 7595 + err = rc; 7614 7596 goto out; 7615 7597 } 7616 - locks_free_lock(flock); 7617 - 
kfree(smb_lock); 7618 7598 } else { 7619 7599 if (rc == FILE_LOCK_DEFERRED) { 7620 7600 void **argv; ··· 7684 7662 spin_unlock(&work->conn->llist_lock); 7685 7663 ksmbd_debug(SMB, "successful in taking lock\n"); 7686 7664 } else { 7665 + locks_free_lock(flock); 7666 + kfree(smb_lock); 7667 + err = rc; 7687 7668 goto out; 7688 7669 } 7689 7670 } ··· 7717 7692 struct file_lock *rlock = NULL; 7718 7693 7719 7694 rlock = smb_flock_init(filp); 7720 - rlock->c.flc_type = F_UNLCK; 7721 - rlock->fl_start = smb_lock->start; 7722 - rlock->fl_end = smb_lock->end; 7695 + if (rlock) { 7696 + rlock->c.flc_type = F_UNLCK; 7697 + rlock->fl_start = smb_lock->start; 7698 + rlock->fl_end = smb_lock->end; 7723 7699 7724 - rc = vfs_lock_file(filp, F_SETLK, rlock, NULL); 7725 - if (rc) 7726 - pr_err("rollback unlock fail : %d\n", rc); 7700 + rc = vfs_lock_file(filp, F_SETLK, rlock, NULL); 7701 + if (rc) 7702 + pr_err("rollback unlock fail : %d\n", rc); 7703 + } else { 7704 + pr_err("rollback unlock alloc failed\n"); 7705 + } 7727 7706 7728 7707 list_del(&smb_lock->llist); 7729 7708 spin_lock(&work->conn->llist_lock); ··· 7737 7708 spin_unlock(&work->conn->llist_lock); 7738 7709 7739 7710 locks_free_lock(smb_lock->fl); 7740 - locks_free_lock(rlock); 7711 + if (rlock) 7712 + locks_free_lock(rlock); 7741 7713 kfree(smb_lock); 7742 7714 } 7743 7715 out2: ··· 8221 8191 buffer = (char *)req + le32_to_cpu(req->InputOffset); 8222 8192 8223 8193 cnt_code = le32_to_cpu(req->CtlCode); 8224 - ret = smb2_calc_max_out_buf_len(work, 48, 8225 - le32_to_cpu(req->MaxOutputResponse)); 8194 + ret = smb2_calc_max_out_buf_len(work, 8195 + offsetof(struct smb2_ioctl_rsp, Buffer), 8196 + le32_to_cpu(req->MaxOutputResponse)); 8226 8197 if (ret < 0) { 8227 8198 rsp->hdr.Status = STATUS_INVALID_PARAMETER; 8228 8199 goto out;
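Several of the smb2pdu fixes replace hard-coded header sizes (8, 48) with offsetof(struct ..., Buffer) when working out how much payload still fits in the response. A small userspace sketch of sizing a variable-length payload against a fixed header with offsetof; the structure here is hypothetical, not the on-wire SMB2 layout:

    #include <stddef.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    /* Hypothetical wire response: fixed header, then payload. */
    struct query_rsp {
        uint16_t struct_size;
        uint16_t output_offset;
        uint32_t output_length;
        uint8_t  buffer[];              /* variable-length payload */
    };

    /*
     * How many payload bytes fit, given the client's advertised maximum
     * response size?  Derive the header size from the layout instead of
     * a magic constant so it survives structure changes.
     */
    static int max_payload(size_t client_max_response)
    {
        size_t hdr = offsetof(struct query_rsp, buffer);

        if (client_max_response < hdr)
            return -1;                  /* cannot even fit the header */
        return (int)(client_max_response - hdr);
    }

    int main(void)
    {
        const char *name = "example.txt";
        int room = max_payload(64);

        printf("header %zu bytes, payload room %d bytes\n",
               offsetof(struct query_rsp, buffer), room);

        if (room >= 0 && (size_t)room >= strlen(name) + 1)
            printf("name \"%s\" fits\n", name);
        return 0;
    }

Deriving the header length from the type is what lets the FileAllInformation path above reject a request whose output buffer cannot hold even the fixed part plus the name.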
+2 -1
fs/xfs/libxfs/xfs_attr.h
··· 55 55 struct xfs_trans *tp; 56 56 struct xfs_inode *dp; /* inode */ 57 57 struct xfs_attrlist_cursor_kern cursor; /* position in list */ 58 - void *buffer; /* output buffer */ 58 + /* output buffer */ 59 + void *buffer __counted_by_ptr(bufsize); 59 60 60 61 /* 61 62 * Abort attribute list iteration if non-zero. Can be used to pass
+22
fs/xfs/libxfs/xfs_attr_leaf.c
··· 1416 1416 } 1417 1417 1418 1418 /* 1419 + * Reinitialize an existing attr fork block as an empty leaf, and attach 1420 + * the buffer to tp. 1421 + */ 1422 + int 1423 + xfs_attr3_leaf_init( 1424 + struct xfs_trans *tp, 1425 + struct xfs_inode *dp, 1426 + xfs_dablk_t blkno) 1427 + { 1428 + struct xfs_buf *bp = NULL; 1429 + struct xfs_da_args args = { 1430 + .trans = tp, 1431 + .dp = dp, 1432 + .owner = dp->i_ino, 1433 + .geo = dp->i_mount->m_attr_geo, 1434 + }; 1435 + 1436 + ASSERT(tp != NULL); 1437 + 1438 + return xfs_attr3_leaf_create(&args, blkno, &bp); 1439 + } 1440 + /* 1419 1441 * Split the leaf node, rebalance, then add the new entry. 1420 1442 * 1421 1443 * Returns 0 if the entry was added, 1 if a further split is needed or a
+3
fs/xfs/libxfs/xfs_attr_leaf.h
··· 87 87 /* 88 88 * Routines used for shrinking the Btree. 89 89 */ 90 + 91 + int xfs_attr3_leaf_init(struct xfs_trans *tp, struct xfs_inode *dp, 92 + xfs_dablk_t blkno); 90 93 int xfs_attr3_leaf_toosmall(struct xfs_da_state *state, int *retval); 91 94 void xfs_attr3_leaf_unbalance(struct xfs_da_state *state, 92 95 struct xfs_da_state_blk *drop_blk,
+42 -11
fs/xfs/libxfs/xfs_da_btree.c
··· 1506 1506 } 1507 1507 1508 1508 /* 1509 - * Remove an entry from an intermediate node. 1509 + * Internal implementation to remove an entry from an intermediate node. 1510 1510 */ 1511 1511 STATIC void 1512 - xfs_da3_node_remove( 1513 - struct xfs_da_state *state, 1514 - struct xfs_da_state_blk *drop_blk) 1512 + __xfs_da3_node_remove( 1513 + struct xfs_trans *tp, 1514 + struct xfs_inode *dp, 1515 + struct xfs_da_geometry *geo, 1516 + struct xfs_da_state_blk *drop_blk) 1515 1517 { 1516 1518 struct xfs_da_intnode *node; 1517 1519 struct xfs_da3_icnode_hdr nodehdr; 1518 1520 struct xfs_da_node_entry *btree; 1519 1521 int index; 1520 1522 int tmp; 1521 - struct xfs_inode *dp = state->args->dp; 1522 - 1523 - trace_xfs_da_node_remove(state->args); 1524 1523 1525 1524 node = drop_blk->bp->b_addr; 1526 1525 xfs_da3_node_hdr_from_disk(dp->i_mount, &nodehdr, node); ··· 1535 1536 tmp = nodehdr.count - index - 1; 1536 1537 tmp *= (uint)sizeof(xfs_da_node_entry_t); 1537 1538 memmove(&btree[index], &btree[index + 1], tmp); 1538 - xfs_trans_log_buf(state->args->trans, drop_blk->bp, 1539 + xfs_trans_log_buf(tp, drop_blk->bp, 1539 1540 XFS_DA_LOGRANGE(node, &btree[index], tmp)); 1540 1541 index = nodehdr.count - 1; 1541 1542 } 1542 1543 memset(&btree[index], 0, sizeof(xfs_da_node_entry_t)); 1543 - xfs_trans_log_buf(state->args->trans, drop_blk->bp, 1544 + xfs_trans_log_buf(tp, drop_blk->bp, 1544 1545 XFS_DA_LOGRANGE(node, &btree[index], sizeof(btree[index]))); 1545 1546 nodehdr.count -= 1; 1546 1547 xfs_da3_node_hdr_to_disk(dp->i_mount, node, &nodehdr); 1547 - xfs_trans_log_buf(state->args->trans, drop_blk->bp, 1548 - XFS_DA_LOGRANGE(node, &node->hdr, state->args->geo->node_hdr_size)); 1548 + xfs_trans_log_buf(tp, drop_blk->bp, 1549 + XFS_DA_LOGRANGE(node, &node->hdr, geo->node_hdr_size)); 1549 1550 1550 1551 /* 1551 1552 * Copy the last hash value from the block to propagate upwards. 1552 1553 */ 1553 1554 drop_blk->hashval = be32_to_cpu(btree[index - 1].hashval); 1555 + } 1556 + 1557 + /* 1558 + * Remove an entry from an intermediate node. 1559 + */ 1560 + STATIC void 1561 + xfs_da3_node_remove( 1562 + struct xfs_da_state *state, 1563 + struct xfs_da_state_blk *drop_blk) 1564 + { 1565 + trace_xfs_da_node_remove(state->args); 1566 + 1567 + __xfs_da3_node_remove(state->args->trans, state->args->dp, 1568 + state->args->geo, drop_blk); 1569 + } 1570 + 1571 + /* 1572 + * Remove an entry from an intermediate attr node at the specified index. 1573 + */ 1574 + void 1575 + xfs_attr3_node_entry_remove( 1576 + struct xfs_trans *tp, 1577 + struct xfs_inode *dp, 1578 + struct xfs_buf *bp, 1579 + int index) 1580 + { 1581 + struct xfs_da_state_blk blk = { 1582 + .index = index, 1583 + .bp = bp, 1584 + }; 1585 + 1586 + __xfs_da3_node_remove(tp, dp, dp->i_mount->m_attr_geo, &blk); 1554 1587 } 1555 1588 1556 1589 /*
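__xfs_da3_node_remove() closes the gap left by the removed entry with a memmove over the tail of the btree array, clears the last slot and decrements the count; the new xfs_attr3_node_entry_remove() wrapper exposes that for the attr-fork teardown path. A minimal userspace sketch of the same counted-array removal:

    #include <stdio.h>
    #include <string.h>

    struct entry { unsigned int hashval; unsigned int before; };

    struct node {
        unsigned int count;
        struct entry btree[8];
    };

    /* Remove btree[index], sliding the tail of the array down over it. */
    static void node_entry_remove(struct node *node, unsigned int index)
    {
        unsigned int tail;

        if (index >= node->count)
            return;
        tail = node->count - index - 1;
        if (tail)
            memmove(&node->btree[index], &node->btree[index + 1],
                    tail * sizeof(struct entry));
        memset(&node->btree[node->count - 1], 0, sizeof(struct entry));
        node->count--;
    }

    int main(void)
    {
        struct node n = {
            .count = 4,
            .btree = { {10, 100}, {20, 200}, {30, 300}, {40, 400} },
        };

        node_entry_remove(&n, 0);       /* drop the first child */

        for (unsigned int i = 0; i < n.count; i++)
            printf("entry %u: hash %u block %u\n",
                   i, n.btree[i].hashval, n.btree[i].before);
        return 0;
    }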
+2
fs/xfs/libxfs/xfs_da_btree.h
··· 184 184 int xfs_da3_join(xfs_da_state_t *state); 185 185 void xfs_da3_fixhashpath(struct xfs_da_state *state, 186 186 struct xfs_da_state_path *path_to_to_fix); 187 + void xfs_attr3_node_entry_remove(struct xfs_trans *tp, struct xfs_inode *dp, 188 + struct xfs_buf *bp, int index); 187 189 188 190 /* 189 191 * Routines used for finding things in the Btree.
+3 -1
fs/xfs/scrub/quota.c
··· 171 171 172 172 error = xchk_quota_item_bmap(sc, dq, offset); 173 173 xchk_iunlock(sc, XFS_ILOCK_SHARED); 174 - if (!xchk_fblock_process_error(sc, XFS_DATA_FORK, offset, &error)) 174 + if (!xchk_fblock_process_error(sc, XFS_DATA_FORK, offset, &error)) { 175 + mutex_unlock(&dq->q_qlock); 175 176 return error; 177 + } 176 178 177 179 /* 178 180 * Warn if the hard limits are larger than the fs.
+2 -10
fs/xfs/scrub/trace.h
··· 972 972 TP_STRUCT__entry( 973 973 __field(dev_t, dev) 974 974 __field(unsigned long, ino) 975 - __array(char, pathname, MAXNAMELEN) 976 975 ), 977 976 TP_fast_assign( 978 - char *path; 979 - 980 977 __entry->ino = file_inode(xf->file)->i_ino; 981 - path = file_path(xf->file, __entry->pathname, MAXNAMELEN); 982 - if (IS_ERR(path)) 983 - strncpy(__entry->pathname, "(unknown)", 984 - sizeof(__entry->pathname)); 985 978 ), 986 - TP_printk("xfino 0x%lx path '%s'", 987 - __entry->ino, 988 - __entry->pathname) 979 + TP_printk("xfino 0x%lx", 980 + __entry->ino) 989 981 ); 990 982 991 983 TRACE_EVENT(xfile_destroy,
+59 -40
fs/xfs/xfs_attr_inactive.c
··· 140 140 xfs_daddr_t parent_blkno, child_blkno; 141 141 struct xfs_buf *child_bp; 142 142 struct xfs_da3_icnode_hdr ichdr; 143 - int error, i; 143 + int error; 144 144 145 145 /* 146 146 * Since this code is recursive (gasp!) we must protect ourselves. ··· 152 152 return -EFSCORRUPTED; 153 153 } 154 154 155 - xfs_da3_node_hdr_from_disk(dp->i_mount, &ichdr, bp->b_addr); 155 + xfs_da3_node_hdr_from_disk(mp, &ichdr, bp->b_addr); 156 156 parent_blkno = xfs_buf_daddr(bp); 157 157 if (!ichdr.count) { 158 158 xfs_trans_brelse(*trans, bp); ··· 167 167 * over the leaves removing all of them. If this is higher up 168 168 * in the tree, recurse downward. 169 169 */ 170 - for (i = 0; i < ichdr.count; i++) { 170 + while (ichdr.count > 0) { 171 171 /* 172 172 * Read the subsidiary block to see what we have to work with. 173 173 * Don't do this in a transaction. This is a depth-first ··· 218 218 xfs_trans_binval(*trans, child_bp); 219 219 child_bp = NULL; 220 220 221 - /* 222 - * If we're not done, re-read the parent to get the next 223 - * child block number. 224 - */ 225 - if (i + 1 < ichdr.count) { 226 - struct xfs_da3_icnode_hdr phdr; 221 + error = xfs_da3_node_read_mapped(*trans, dp, 222 + parent_blkno, &bp, XFS_ATTR_FORK); 223 + if (error) 224 + return error; 227 225 228 - error = xfs_da3_node_read_mapped(*trans, dp, 229 - parent_blkno, &bp, XFS_ATTR_FORK); 226 + /* 227 + * Remove entry from parent node, prevents being indexed to. 228 + */ 229 + xfs_attr3_node_entry_remove(*trans, dp, bp, 0); 230 + 231 + xfs_da3_node_hdr_from_disk(mp, &ichdr, bp->b_addr); 232 + bp = NULL; 233 + 234 + if (ichdr.count > 0) { 235 + /* 236 + * If we're not done, get the next child block number. 237 + */ 238 + child_fsb = be32_to_cpu(ichdr.btree[0].before); 239 + 240 + /* 241 + * Atomically commit the whole invalidate stuff. 242 + */ 243 + error = xfs_trans_roll_inode(trans, dp); 230 244 if (error) 231 245 return error; 232 - xfs_da3_node_hdr_from_disk(dp->i_mount, &phdr, 233 - bp->b_addr); 234 - child_fsb = be32_to_cpu(phdr.btree[i + 1].before); 235 - xfs_trans_brelse(*trans, bp); 236 - bp = NULL; 237 246 } 238 - /* 239 - * Atomically commit the whole invalidate stuff. 240 - */ 241 - error = xfs_trans_roll_inode(trans, dp); 242 - if (error) 243 - return error; 244 247 } 245 248 246 249 return 0; ··· 260 257 struct xfs_trans **trans, 261 258 struct xfs_inode *dp) 262 259 { 263 - struct xfs_mount *mp = dp->i_mount; 264 260 struct xfs_da_blkinfo *info; 265 261 struct xfs_buf *bp; 266 - xfs_daddr_t blkno; 267 262 int error; 268 263 269 264 /* ··· 273 272 error = xfs_da3_node_read(*trans, dp, 0, &bp, XFS_ATTR_FORK); 274 273 if (error) 275 274 return error; 276 - blkno = xfs_buf_daddr(bp); 277 275 278 276 /* 279 277 * Invalidate the tree, even if the "tree" is only a single leaf block. ··· 283 283 case cpu_to_be16(XFS_DA_NODE_MAGIC): 284 284 case cpu_to_be16(XFS_DA3_NODE_MAGIC): 285 285 error = xfs_attr3_node_inactive(trans, dp, bp, 1); 286 + /* 287 + * Empty root node block are not allowed, convert it to leaf. 288 + */ 289 + if (!error) 290 + error = xfs_attr3_leaf_init(*trans, dp, 0); 291 + if (!error) 292 + error = xfs_trans_roll_inode(trans, dp); 286 293 break; 287 294 case cpu_to_be16(XFS_ATTR_LEAF_MAGIC): 288 295 case cpu_to_be16(XFS_ATTR3_LEAF_MAGIC): 289 296 error = xfs_attr3_leaf_inactive(trans, dp, bp); 297 + /* 298 + * Reinit the leaf before truncating extents so that a crash 299 + * mid-truncation leaves an empty leaf rather than one with 300 + * entries that may reference freed remote value blocks. 
301 + */ 302 + if (!error) 303 + error = xfs_attr3_leaf_init(*trans, dp, 0); 304 + if (!error) 305 + error = xfs_trans_roll_inode(trans, dp); 290 306 break; 291 307 default: 292 308 xfs_dirattr_mark_sick(dp, XFS_ATTR_FORK); ··· 311 295 xfs_trans_brelse(*trans, bp); 312 296 break; 313 297 } 314 - if (error) 315 - return error; 316 - 317 - /* 318 - * Invalidate the incore copy of the root block. 319 - */ 320 - error = xfs_trans_get_buf(*trans, mp->m_ddev_targp, blkno, 321 - XFS_FSB_TO_BB(mp, mp->m_attr_geo->fsbcount), 0, &bp); 322 - if (error) 323 - return error; 324 - xfs_trans_binval(*trans, bp); /* remove from cache */ 325 - /* 326 - * Commit the invalidate and start the next transaction. 327 - */ 328 - error = xfs_trans_roll_inode(trans, dp); 329 298 330 299 return error; 331 300 } ··· 329 328 { 330 329 struct xfs_trans *trans; 331 330 struct xfs_mount *mp; 331 + struct xfs_buf *bp; 332 332 int lock_mode = XFS_ILOCK_SHARED; 333 333 int error = 0; 334 334 ··· 365 363 * removal below. 366 364 */ 367 365 if (dp->i_af.if_nextents > 0) { 366 + /* 367 + * Invalidate and truncate all blocks but leave the root block. 368 + */ 368 369 error = xfs_attr3_root_inactive(&trans, dp); 369 370 if (error) 370 371 goto out_cancel; 371 372 373 + error = xfs_itruncate_extents(&trans, dp, XFS_ATTR_FORK, 374 + XFS_FSB_TO_B(mp, mp->m_attr_geo->fsbcount)); 375 + if (error) 376 + goto out_cancel; 377 + 378 + /* 379 + * Invalidate and truncate the root block and ensure that the 380 + * operation is completed within a single transaction. 381 + */ 382 + error = xfs_da_get_buf(trans, dp, 0, &bp, XFS_ATTR_FORK); 383 + if (error) 384 + goto out_cancel; 385 + 386 + xfs_trans_binval(trans, bp); 372 387 error = xfs_itruncate_extents(&trans, dp, XFS_ATTR_FORK, 0); 373 388 if (error) 374 389 goto out_cancel;
+2 -49
fs/xfs/xfs_attr_item.c
··· 653 653 break; 654 654 } 655 655 if (error) { 656 - xfs_irele(ip); 657 656 XFS_CORRUPTION_ERROR(__func__, XFS_ERRLEVEL_LOW, mp, attrp, 658 657 sizeof(*attrp)); 659 658 return ERR_PTR(-EFSCORRUPTED); ··· 1046 1047 break; 1047 1048 case XFS_ATTRI_OP_FLAGS_SET: 1048 1049 case XFS_ATTRI_OP_FLAGS_REPLACE: 1049 - /* Log item, attr name, attr value */ 1050 - if (item->ri_total != 3) { 1050 + /* Log item, attr name, optional attr value */ 1051 + if (item->ri_total != 2 + !!attri_formatp->alfi_value_len) { 1051 1052 XFS_CORRUPTION_ERROR(__func__, XFS_ERRLEVEL_LOW, mp, 1052 1053 attri_formatp, len); 1053 1054 return -EFSCORRUPTED; ··· 1129 1130 XFS_CORRUPTION_ERROR(__func__, XFS_ERRLEVEL_LOW, mp, 1130 1131 attri_formatp, len); 1131 1132 return -EFSCORRUPTED; 1132 - } 1133 - 1134 - switch (op) { 1135 - case XFS_ATTRI_OP_FLAGS_REMOVE: 1136 - /* Regular remove operations operate only on names. */ 1137 - if (attr_value != NULL || value_len != 0) { 1138 - XFS_CORRUPTION_ERROR(__func__, XFS_ERRLEVEL_LOW, mp, 1139 - attri_formatp, len); 1140 - return -EFSCORRUPTED; 1141 - } 1142 - fallthrough; 1143 - case XFS_ATTRI_OP_FLAGS_PPTR_REMOVE: 1144 - case XFS_ATTRI_OP_FLAGS_PPTR_SET: 1145 - case XFS_ATTRI_OP_FLAGS_SET: 1146 - case XFS_ATTRI_OP_FLAGS_REPLACE: 1147 - /* 1148 - * Regular xattr set/remove/replace operations require a name 1149 - * and do not take a newname. Values are optional for set and 1150 - * replace. 1151 - * 1152 - * Name-value set/remove operations must have a name, do not 1153 - * take a newname, and can take a value. 1154 - */ 1155 - if (attr_name == NULL || name_len == 0) { 1156 - XFS_CORRUPTION_ERROR(__func__, XFS_ERRLEVEL_LOW, mp, 1157 - attri_formatp, len); 1158 - return -EFSCORRUPTED; 1159 - } 1160 - break; 1161 - case XFS_ATTRI_OP_FLAGS_PPTR_REPLACE: 1162 - /* 1163 - * Name-value replace operations require the caller to 1164 - * specify the old and new names and values explicitly. 1165 - * Values are optional. 1166 - */ 1167 - if (attr_name == NULL || name_len == 0) { 1168 - XFS_CORRUPTION_ERROR(__func__, XFS_ERRLEVEL_LOW, mp, 1169 - attri_formatp, len); 1170 - return -EFSCORRUPTED; 1171 - } 1172 - if (attr_new_name == NULL || new_name_len == 0) { 1173 - XFS_CORRUPTION_ERROR(__func__, XFS_ERRLEVEL_LOW, mp, 1174 - attri_formatp, len); 1175 - return -EFSCORRUPTED; 1176 - } 1177 - break; 1178 1133 } 1179 1134 1180 1135 /*
+7 -2
fs/xfs/xfs_dquot_item.c
··· 125 125 struct xfs_dq_logitem *qlip = DQUOT_ITEM(lip); 126 126 struct xfs_dquot *dqp = qlip->qli_dquot; 127 127 struct xfs_buf *bp; 128 + struct xfs_ail *ailp = lip->li_ailp; 128 129 uint rval = XFS_ITEM_SUCCESS; 129 130 int error; 130 131 ··· 154 153 goto out_unlock; 155 154 } 156 155 157 - spin_unlock(&lip->li_ailp->ail_lock); 156 + spin_unlock(&ailp->ail_lock); 158 157 159 158 error = xfs_dquot_use_attached_buf(dqp, &bp); 160 159 if (error == -EAGAIN) { ··· 173 172 rval = XFS_ITEM_FLUSHING; 174 173 } 175 174 xfs_buf_relse(bp); 175 + /* 176 + * The buffer no longer protects the log item from reclaim, so 177 + * do not reference lip after this point. 178 + */ 176 179 177 180 out_relock_ail: 178 - spin_lock(&lip->li_ailp->ail_lock); 181 + spin_lock(&ailp->ail_lock); 179 182 out_unlock: 180 183 mutex_unlock(&dqp->q_qlock); 181 184 return rval;
+1 -1
fs/xfs/xfs_handle.c
··· 443 443 context.dp = dp; 444 444 context.resynch = 1; 445 445 context.attr_filter = xfs_attr_filter(flags); 446 - context.buffer = buffer; 447 446 context.bufsize = round_down(bufsize, sizeof(uint32_t)); 447 + context.buffer = buffer; 448 448 context.firstu = context.bufsize; 449 449 context.put_listent = xfs_ioc_attr_put_listent; 450 450
+2 -1
fs/xfs/xfs_inode.c
··· 1048 1048 xfs_assert_ilocked(ip, XFS_ILOCK_EXCL); 1049 1049 if (icount_read(VFS_I(ip))) 1050 1050 xfs_assert_ilocked(ip, XFS_IOLOCK_EXCL); 1051 - ASSERT(new_size <= XFS_ISIZE(ip)); 1051 + if (whichfork == XFS_DATA_FORK) 1052 + ASSERT(new_size <= XFS_ISIZE(ip)); 1052 1053 ASSERT(tp->t_flags & XFS_TRANS_PERM_LOG_RES); 1053 1054 ASSERT(ip->i_itemp != NULL); 1054 1055 ASSERT(ip->i_itemp->ili_lock_flags == 0);
+7 -2
fs/xfs/xfs_inode_item.c
··· 746 746 struct xfs_inode_log_item *iip = INODE_ITEM(lip); 747 747 struct xfs_inode *ip = iip->ili_inode; 748 748 struct xfs_buf *bp = lip->li_buf; 749 + struct xfs_ail *ailp = lip->li_ailp; 749 750 uint rval = XFS_ITEM_SUCCESS; 750 751 int error; 751 752 ··· 772 771 if (!xfs_buf_trylock(bp)) 773 772 return XFS_ITEM_LOCKED; 774 773 775 - spin_unlock(&lip->li_ailp->ail_lock); 774 + spin_unlock(&ailp->ail_lock); 776 775 777 776 /* 778 777 * We need to hold a reference for flushing the cluster buffer as it may ··· 796 795 rval = XFS_ITEM_LOCKED; 797 796 } 798 797 799 - spin_lock(&lip->li_ailp->ail_lock); 798 + /* 799 + * The buffer no longer protects the log item from reclaim, so 800 + * do not reference lip after this point. 801 + */ 802 + spin_lock(&ailp->ail_lock); 800 803 return rval; 801 804 } 802 805
+4 -3
fs/xfs/xfs_mount.c
··· 608 608 * have been retrying in the background. This will prevent never-ending 609 609 * retries in AIL pushing from hanging the unmount. 610 610 * 611 - * Finally, we can push the AIL to clean all the remaining dirty objects, then 612 - * reclaim the remaining inodes that are still in memory at this point in time. 611 + * Stop inodegc and background reclaim before pushing the AIL so that they 612 + * are not running while the AIL is being flushed. Then push the AIL to 613 + * clean all the remaining dirty objects and reclaim the remaining inodes. 613 614 */ 614 615 static void 615 616 xfs_unmount_flush_inodes( ··· 622 621 623 622 xfs_set_unmounting(mp); 624 623 625 - xfs_ail_push_all_sync(mp->m_ail); 626 624 xfs_inodegc_stop(mp); 627 625 cancel_delayed_work_sync(&mp->m_reclaim_work); 626 + xfs_ail_push_all_sync(mp->m_ail); 628 627 xfs_reclaim_inodes(mp); 629 628 xfs_health_unmount(mp); 630 629 xfs_healthmon_unmount(mp);
+34 -13
fs/xfs/xfs_trace.h
··· 56 56 #include <linux/tracepoint.h> 57 57 58 58 struct xfs_agf; 59 + struct xfs_ail; 59 60 struct xfs_alloc_arg; 60 61 struct xfs_attr_list_context; 61 62 struct xfs_buf_log_item; ··· 1651 1650 DEFINE_EVENT(xfs_log_item_class, name, \ 1652 1651 TP_PROTO(struct xfs_log_item *lip), \ 1653 1652 TP_ARGS(lip)) 1654 - DEFINE_LOG_ITEM_EVENT(xfs_ail_push); 1655 - DEFINE_LOG_ITEM_EVENT(xfs_ail_pinned); 1656 - DEFINE_LOG_ITEM_EVENT(xfs_ail_locked); 1657 - DEFINE_LOG_ITEM_EVENT(xfs_ail_flushing); 1658 1653 DEFINE_LOG_ITEM_EVENT(xfs_cil_whiteout_mark); 1659 1654 DEFINE_LOG_ITEM_EVENT(xfs_cil_whiteout_skip); 1660 1655 DEFINE_LOG_ITEM_EVENT(xfs_cil_whiteout_unpin); 1661 1656 DEFINE_LOG_ITEM_EVENT(xlog_ail_insert_abort); 1662 1657 DEFINE_LOG_ITEM_EVENT(xfs_trans_free_abort); 1658 + 1659 + DECLARE_EVENT_CLASS(xfs_ail_push_class, 1660 + TP_PROTO(struct xfs_ail *ailp, uint type, unsigned long flags, xfs_lsn_t lsn), 1661 + TP_ARGS(ailp, type, flags, lsn), 1662 + TP_STRUCT__entry( 1663 + __field(dev_t, dev) 1664 + __field(uint, type) 1665 + __field(unsigned long, flags) 1666 + __field(xfs_lsn_t, lsn) 1667 + ), 1668 + TP_fast_assign( 1669 + __entry->dev = ailp->ail_log->l_mp->m_super->s_dev; 1670 + __entry->type = type; 1671 + __entry->flags = flags; 1672 + __entry->lsn = lsn; 1673 + ), 1674 + TP_printk("dev %d:%d lsn %d/%d type %s flags %s", 1675 + MAJOR(__entry->dev), MINOR(__entry->dev), 1676 + CYCLE_LSN(__entry->lsn), BLOCK_LSN(__entry->lsn), 1677 + __print_symbolic(__entry->type, XFS_LI_TYPE_DESC), 1678 + __print_flags(__entry->flags, "|", XFS_LI_FLAGS)) 1679 + ) 1680 + 1681 + #define DEFINE_AIL_PUSH_EVENT(name) \ 1682 + DEFINE_EVENT(xfs_ail_push_class, name, \ 1683 + TP_PROTO(struct xfs_ail *ailp, uint type, unsigned long flags, xfs_lsn_t lsn), \ 1684 + TP_ARGS(ailp, type, flags, lsn)) 1685 + DEFINE_AIL_PUSH_EVENT(xfs_ail_push); 1686 + DEFINE_AIL_PUSH_EVENT(xfs_ail_pinned); 1687 + DEFINE_AIL_PUSH_EVENT(xfs_ail_locked); 1688 + DEFINE_AIL_PUSH_EVENT(xfs_ail_flushing); 1663 1689 1664 1690 DECLARE_EVENT_CLASS(xfs_ail_class, 1665 1691 TP_PROTO(struct xfs_log_item *lip, xfs_lsn_t old_lsn, xfs_lsn_t new_lsn), ··· 5119 5091 TP_STRUCT__entry( 5120 5092 __field(dev_t, dev) 5121 5093 __field(unsigned long, ino) 5122 - __array(char, pathname, MAXNAMELEN) 5123 5094 ), 5124 5095 TP_fast_assign( 5125 - char *path; 5126 5096 struct file *file = btp->bt_file; 5127 5097 5128 5098 __entry->dev = btp->bt_mount->m_super->s_dev; 5129 5099 __entry->ino = file_inode(file)->i_ino; 5130 - path = file_path(file, __entry->pathname, MAXNAMELEN); 5131 - if (IS_ERR(path)) 5132 - strncpy(__entry->pathname, "(unknown)", 5133 - sizeof(__entry->pathname)); 5134 5100 ), 5135 - TP_printk("dev %d:%d xmino 0x%lx path '%s'", 5101 + TP_printk("dev %d:%d xmino 0x%lx", 5136 5102 MAJOR(__entry->dev), MINOR(__entry->dev), 5137 - __entry->ino, 5138 - __entry->pathname) 5103 + __entry->ino) 5139 5104 ); 5140 5105 5141 5106 TRACE_EVENT(xmbuf_free,
+75 -52
fs/xfs/xfs_trans_ail.c
··· 365 365 return XFS_ITEM_SUCCESS; 366 366 } 367 367 368 + /* 369 + * Push a single log item from the AIL. 370 + * 371 + * @lip may have been released and freed by the time this function returns, 372 + * so callers must not dereference the log item afterwards. 373 + */ 368 374 static inline uint 369 375 xfsaild_push_item( 370 376 struct xfs_ail *ailp, ··· 464 458 return target_lsn; 465 459 } 466 460 461 + static void 462 + xfsaild_process_logitem( 463 + struct xfs_ail *ailp, 464 + struct xfs_log_item *lip, 465 + int *stuck, 466 + int *flushing) 467 + { 468 + struct xfs_mount *mp = ailp->ail_log->l_mp; 469 + uint type = lip->li_type; 470 + unsigned long flags = lip->li_flags; 471 + xfs_lsn_t item_lsn = lip->li_lsn; 472 + int lock_result; 473 + 474 + /* 475 + * Note that iop_push may unlock and reacquire the AIL lock. We 476 + * rely on the AIL cursor implementation to be able to deal with 477 + * the dropped lock. 478 + * 479 + * The log item may have been freed by the push, so it must not 480 + * be accessed or dereferenced below this line. 481 + */ 482 + lock_result = xfsaild_push_item(ailp, lip); 483 + switch (lock_result) { 484 + case XFS_ITEM_SUCCESS: 485 + XFS_STATS_INC(mp, xs_push_ail_success); 486 + trace_xfs_ail_push(ailp, type, flags, item_lsn); 487 + 488 + ailp->ail_last_pushed_lsn = item_lsn; 489 + break; 490 + 491 + case XFS_ITEM_FLUSHING: 492 + /* 493 + * The item or its backing buffer is already being 494 + * flushed. The typical reason for that is that an 495 + * inode buffer is locked because we already pushed the 496 + * updates to it as part of inode clustering. 497 + * 498 + * We do not want to stop flushing just because lots 499 + * of items are already being flushed, but we need to 500 + * re-try the flushing relatively soon if most of the 501 + * AIL is being flushed. 502 + */ 503 + XFS_STATS_INC(mp, xs_push_ail_flushing); 504 + trace_xfs_ail_flushing(ailp, type, flags, item_lsn); 505 + 506 + (*flushing)++; 507 + ailp->ail_last_pushed_lsn = item_lsn; 508 + break; 509 + 510 + case XFS_ITEM_PINNED: 511 + XFS_STATS_INC(mp, xs_push_ail_pinned); 512 + trace_xfs_ail_pinned(ailp, type, flags, item_lsn); 513 + 514 + (*stuck)++; 515 + ailp->ail_log_flush++; 516 + break; 517 + case XFS_ITEM_LOCKED: 518 + XFS_STATS_INC(mp, xs_push_ail_locked); 519 + trace_xfs_ail_locked(ailp, type, flags, item_lsn); 520 + 521 + (*stuck)++; 522 + break; 523 + default: 524 + ASSERT(0); 525 + break; 526 + } 527 + } 528 + 467 529 static long 468 530 xfsaild_push( 469 531 struct xfs_ail *ailp) ··· 579 505 580 506 lsn = lip->li_lsn; 581 507 while ((XFS_LSN_CMP(lip->li_lsn, ailp->ail_target) <= 0)) { 582 - int lock_result; 583 508 584 509 if (test_bit(XFS_LI_FLUSHING, &lip->li_flags)) 585 510 goto next_item; 586 511 587 - /* 588 - * Note that iop_push may unlock and reacquire the AIL lock. We 589 - * rely on the AIL cursor implementation to be able to deal with 590 - * the dropped lock. 591 - */ 592 - lock_result = xfsaild_push_item(ailp, lip); 593 - switch (lock_result) { 594 - case XFS_ITEM_SUCCESS: 595 - XFS_STATS_INC(mp, xs_push_ail_success); 596 - trace_xfs_ail_push(lip); 597 - 598 - ailp->ail_last_pushed_lsn = lsn; 599 - break; 600 - 601 - case XFS_ITEM_FLUSHING: 602 - /* 603 - * The item or its backing buffer is already being 604 - * flushed. The typical reason for that is that an 605 - * inode buffer is locked because we already pushed the 606 - * updates to it as part of inode clustering. 
607 - * 608 - * We do not want to stop flushing just because lots 609 - * of items are already being flushed, but we need to 610 - * re-try the flushing relatively soon if most of the 611 - * AIL is being flushed. 612 - */ 613 - XFS_STATS_INC(mp, xs_push_ail_flushing); 614 - trace_xfs_ail_flushing(lip); 615 - 616 - flushing++; 617 - ailp->ail_last_pushed_lsn = lsn; 618 - break; 619 - 620 - case XFS_ITEM_PINNED: 621 - XFS_STATS_INC(mp, xs_push_ail_pinned); 622 - trace_xfs_ail_pinned(lip); 623 - 624 - stuck++; 625 - ailp->ail_log_flush++; 626 - break; 627 - case XFS_ITEM_LOCKED: 628 - XFS_STATS_INC(mp, xs_push_ail_locked); 629 - trace_xfs_ail_locked(lip); 630 - 631 - stuck++; 632 - break; 633 - default: 634 - ASSERT(0); 635 - break; 636 - } 637 - 512 + xfsaild_process_logitem(ailp, lip, &stuck, &flushing); 638 513 count++; 639 514 640 515 /*
+8 -10
fs/xfs/xfs_verify_media.c
··· 183 183 min_not_zero(SZ_1M, me->me_max_io_size); 184 184 185 185 BUILD_BUG_ON(BBSHIFT != SECTOR_SHIFT); 186 - ASSERT(BBTOB(bbcount) >= bdev_logical_block_size(btp->bt_bdev)); 186 + ASSERT(BBTOB(bbcount) >= btp->bt_logical_sectorsize); 187 187 188 - return clamp(iosize, bdev_logical_block_size(btp->bt_bdev), 189 - BBTOB(bbcount)); 188 + return clamp(iosize, btp->bt_logical_sectorsize, BBTOB(bbcount)); 190 189 } 191 190 192 191 /* Allocate as much memory as we can get for verification buffer. */ ··· 217 218 unsigned int bio_bbcount, 218 219 blk_status_t bio_status) 219 220 { 220 - trace_xfs_verify_media_error(mp, me, btp->bt_bdev->bd_dev, daddr, 221 - bio_bbcount, bio_status); 221 + trace_xfs_verify_media_error(mp, me, btp->bt_dev, daddr, bio_bbcount, 222 + bio_status); 222 223 223 224 /* 224 225 * Pass any error, I/O or otherwise, up to the caller if we didn't ··· 279 280 btp = mp->m_ddev_targp; 280 281 break; 281 282 case XFS_DEV_LOG: 282 - if (mp->m_logdev_targp->bt_bdev != mp->m_ddev_targp->bt_bdev) 283 + if (mp->m_logdev_targp != mp->m_ddev_targp) 283 284 btp = mp->m_logdev_targp; 284 285 break; 285 286 case XFS_DEV_RT: ··· 298 299 299 300 /* start and end have to be aligned to the lba size */ 300 301 if (!IS_ALIGNED(BBTOB(me->me_start_daddr | me->me_end_daddr), 301 - bdev_logical_block_size(btp->bt_bdev))) 302 + btp->bt_logical_sectorsize)) 302 303 return -EINVAL; 303 304 304 305 /* ··· 330 331 if (!folio) 331 332 return -ENOMEM; 332 333 333 - trace_xfs_verify_media(mp, me, btp->bt_bdev->bd_dev, daddr, bbcount, 334 - folio); 334 + trace_xfs_verify_media(mp, me, btp->bt_dev, daddr, bbcount, folio); 335 335 336 336 bio = bio_alloc(btp->bt_bdev, 1, REQ_OP_READ, GFP_KERNEL); 337 337 if (!bio) { ··· 398 400 * an operational error. 399 401 */ 400 402 me->me_start_daddr = daddr; 401 - trace_xfs_verify_media_end(mp, me, btp->bt_bdev->bd_dev); 403 + trace_xfs_verify_media_end(mp, me, btp->bt_dev); 402 404 return 0; 403 405 } 404 406
+1 -1
fs/xfs/xfs_xattr.c
··· 332 332 memset(&context, 0, sizeof(context)); 333 333 context.dp = XFS_I(inode); 334 334 context.resynch = 1; 335 - context.buffer = size ? data : NULL; 336 335 context.bufsize = size; 336 + context.buffer = size ? data : NULL; 337 337 context.firstu = context.bufsize; 338 338 context.put_listent = xfs_xattr_put_listent; 339 339
+6
include/linux/damon.h
··· 810 810 struct damos_walk_control *walk_control; 811 811 struct mutex walk_control_lock; 812 812 813 + /* 814 + * Indicate if this may be corrupted. Currently this is set only for 815 + * damon_commit_ctx() failure. 816 + */ 817 + bool maybe_corrupted; 818 + 813 819 /* Working thread of the given DAMON context */ 814 820 struct task_struct *kdamond; 815 821 /* Protects @kdamond field access */
+13 -6
include/linux/dma-mapping.h
··· 80 80 #define DMA_ATTR_MMIO (1UL << 10) 81 81 82 82 /* 83 - * DMA_ATTR_CPU_CACHE_CLEAN: Indicates the CPU will not dirty any cacheline 84 - * overlapping this buffer while it is mapped for DMA. All mappings sharing 85 - * a cacheline must have this attribute for this to be considered safe. 83 + * DMA_ATTR_DEBUGGING_IGNORE_CACHELINES: Indicates the CPU cache line can be 84 + * overlapped. All mappings sharing a cacheline must have this attribute for 85 + * this to be considered safe. 86 86 */ 87 - #define DMA_ATTR_CPU_CACHE_CLEAN (1UL << 11) 87 + #define DMA_ATTR_DEBUGGING_IGNORE_CACHELINES (1UL << 11) 88 + 89 + /* 90 + * DMA_ATTR_REQUIRE_COHERENT: Indicates that DMA coherency is required. 91 + * All mappings that carry this attribute can't work with SWIOTLB and cache 92 + * flushing. 93 + */ 94 + #define DMA_ATTR_REQUIRE_COHERENT (1UL << 12) 88 95 89 96 /* 90 97 * A dma_addr_t can hold any valid DMA or bus address for the platform. It can ··· 255 248 { 256 249 return NULL; 257 250 } 258 - static void dma_free_attrs(struct device *dev, size_t size, void *cpu_addr, 259 - dma_addr_t dma_handle, unsigned long attrs) 251 + static inline void dma_free_attrs(struct device *dev, size_t size, 252 + void *cpu_addr, dma_addr_t dma_handle, unsigned long attrs) 260 253 { 261 254 } 262 255 static inline void *dmam_alloc_attrs(struct device *dev, size_t size,
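For orientation, a minimal driver-side sketch of how the new DMA_ATTR_REQUIRE_COHERENT attribute would be consumed; the helper name and error handling are illustrative placeholders, not taken from this series, and only the existing dma_map_page_attrs()/dma_mapping_error() API is assumed:

    #include <linux/dma-mapping.h>
    #include <linux/errno.h>

    /* Illustrative sketch only: mydev_map_user_buf() is a hypothetical helper. */
    static int mydev_map_user_buf(struct device *dev, struct page *page,
                                  size_t len, dma_addr_t *out)
    {
            dma_addr_t addr;

            /*
             * With DMA_ATTR_REQUIRE_COHERENT the mapping fails up front on
             * configurations that would need SWIOTLB bouncing or CPU cache
             * maintenance, instead of silently losing coherence with any
             * userspace mapping of the same pages.
             */
            addr = dma_map_page_attrs(dev, page, 0, len, DMA_BIDIRECTIONAL,
                                      DMA_ATTR_REQUIRE_COHERENT);
            if (dma_mapping_error(dev, addr))
                    return -EIO;

            *out = addr;
            return 0;
    }

The failure is reported at map time (DMA_MAPPING_ERROR, or -EOPNOTSUPP for scatterlists per the kernel/dma changes below), so callers can refuse the request rather than proceed with bounce buffering.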
+1
include/linux/fs/super_types.h
··· 338 338 #define SB_I_NOUMASK 0x00001000 /* VFS does not apply umask */ 339 339 #define SB_I_NOIDMAP 0x00002000 /* No idmapped mounts on this superblock */ 340 340 #define SB_I_ALLOW_HSM 0x00004000 /* Allow HSM events on this superblock */ 341 + #define SB_I_NO_DATA_INTEGRITY 0x00008000 /* fs cannot guarantee data persistence on sync */ 341 342 342 343 #endif /* _LINUX_FS_SUPER_TYPES_H */
+21 -11
include/linux/leafops.h
··· 363 363 return swp_offset(entry) & SWP_PFN_MASK; 364 364 } 365 365 366 + static inline void softleaf_migration_sync(softleaf_t entry, 367 + struct folio *folio) 368 + { 369 + /* 370 + * Ensure we do not race with split, which might alter tail pages into new 371 + * folios and thus result in observing an unlocked folio. 372 + * This matches the write barrier in __split_folio_to_order(). 373 + */ 374 + smp_rmb(); 375 + 376 + /* 377 + * Any use of migration entries may only occur while the 378 + * corresponding page is locked 379 + */ 380 + VM_WARN_ON_ONCE(!folio_test_locked(folio)); 381 + } 382 + 366 383 /** 367 384 * softleaf_to_page() - Obtains struct page for PFN encoded within leaf entry. 368 385 * @entry: Leaf entry, softleaf_has_pfn(@entry) must return true. ··· 391 374 struct page *page = pfn_to_page(softleaf_to_pfn(entry)); 392 375 393 376 VM_WARN_ON_ONCE(!softleaf_has_pfn(entry)); 394 - /* 395 - * Any use of migration entries may only occur while the 396 - * corresponding page is locked 397 - */ 398 - VM_WARN_ON_ONCE(softleaf_is_migration(entry) && !PageLocked(page)); 377 + if (softleaf_is_migration(entry)) 378 + softleaf_migration_sync(entry, page_folio(page)); 399 379 400 380 return page; 401 381 } ··· 408 394 struct folio *folio = pfn_folio(softleaf_to_pfn(entry)); 409 395 410 396 VM_WARN_ON_ONCE(!softleaf_has_pfn(entry)); 411 - /* 412 - * Any use of migration entries may only occur while the 413 - * corresponding folio is locked. 414 - */ 415 - VM_WARN_ON_ONCE(softleaf_is_migration(entry) && 416 - !folio_test_locked(folio)); 397 + if (softleaf_is_migration(entry)) 398 + softleaf_migration_sync(entry, folio); 417 399 418 400 return folio; 419 401 }
+1
include/linux/mempolicy.h
··· 55 55 nodemask_t cpuset_mems_allowed; /* relative to these nodes */ 56 56 nodemask_t user_nodemask; /* nodemask passed by user */ 57 57 } w; 58 + struct rcu_head rcu; 58 59 }; 59 60 60 61 /*
-1
include/linux/netfs.h
··· 140 140 void (*issue_write)(struct netfs_io_subrequest *subreq); 141 141 /* Collection tracking */ 142 142 struct list_head subrequests; /* Contributory I/O operations */ 143 - struct netfs_io_subrequest *front; /* Op being collected */ 144 143 unsigned long long collected_to; /* Position we've collected results to */ 145 144 size_t transferred; /* The amount transferred from this stream */ 146 145 unsigned short error; /* Aggregate error for the stream */
-11
include/linux/pagemap.h
··· 210 210 AS_WRITEBACK_MAY_DEADLOCK_ON_RECLAIM = 9, 211 211 AS_KERNEL_FILE = 10, /* mapping for a fake kernel file that shouldn't 212 212 account usage to user cgroups */ 213 - AS_NO_DATA_INTEGRITY = 11, /* no data integrity guarantees */ 214 213 /* Bits 16-25 are used for FOLIO_ORDER */ 215 214 AS_FOLIO_ORDER_BITS = 5, 216 215 AS_FOLIO_ORDER_MIN = 16, ··· 343 344 static inline bool mapping_writeback_may_deadlock_on_reclaim(const struct address_space *mapping) 344 345 { 345 346 return test_bit(AS_WRITEBACK_MAY_DEADLOCK_ON_RECLAIM, &mapping->flags); 346 - } 347 - 348 - static inline void mapping_set_no_data_integrity(struct address_space *mapping) 349 - { 350 - set_bit(AS_NO_DATA_INTEGRITY, &mapping->flags); 351 - } 352 - 353 - static inline bool mapping_no_data_integrity(const struct address_space *mapping) 354 - { 355 - return test_bit(AS_NO_DATA_INTEGRITY, &mapping->flags); 356 347 } 357 348 358 349 static inline gfp_t mapping_gfp_mask(const struct address_space *mapping)
+1
include/linux/security.h
··· 145 145 LOCKDOWN_BPF_WRITE_USER, 146 146 LOCKDOWN_DBG_WRITE_KERNEL, 147 147 LOCKDOWN_RTAS_ERROR_INJECTION, 148 + LOCKDOWN_XEN_USER_ACTIONS, 148 149 LOCKDOWN_INTEGRITY_MAX, 149 150 LOCKDOWN_KCORE, 150 151 LOCKDOWN_KPROBES,
-5
include/linux/spi/spi.h
··· 159 159 * @modalias: Name of the driver to use with this device, or an alias 160 160 * for that name. This appears in the sysfs "modalias" attribute 161 161 * for driver coldplugging, and in uevents used for hotplugging 162 - * @driver_override: If the name of a driver is written to this attribute, then 163 - * the device will bind to the named driver and only the named driver. 164 - * Do not set directly, because core frees it; use driver_set_override() to 165 - * set or clear it. 166 162 * @pcpu_statistics: statistics for the spi_device 167 163 * @word_delay: delay to be inserted between consecutive 168 164 * words of a transfer ··· 220 224 void *controller_state; 221 225 void *controller_data; 222 226 char modalias[SPI_NAME_SIZE]; 223 - const char *driver_override; 224 227 225 228 /* The statistics */ 226 229 struct spi_statistics __percpu *pcpu_statistics;
+4
include/linux/srcutiny.h
··· 11 11 #ifndef _LINUX_SRCU_TINY_H 12 12 #define _LINUX_SRCU_TINY_H 13 13 14 + #include <linux/irq_work_types.h> 14 15 #include <linux/swait.h> 15 16 16 17 struct srcu_struct { ··· 25 24 struct rcu_head *srcu_cb_head; /* Pending callbacks: Head. */ 26 25 struct rcu_head **srcu_cb_tail; /* Pending callbacks: Tail. */ 27 26 struct work_struct srcu_work; /* For driving grace periods. */ 27 + struct irq_work srcu_irq_work; /* Defer schedule_work() to irq work. */ 28 28 #ifdef CONFIG_DEBUG_LOCK_ALLOC 29 29 struct lockdep_map dep_map; 30 30 #endif /* #ifdef CONFIG_DEBUG_LOCK_ALLOC */ 31 31 }; 32 32 33 33 void srcu_drive_gp(struct work_struct *wp); 34 + void srcu_tiny_irq_work(struct irq_work *irq_work); 34 35 35 36 #define __SRCU_STRUCT_INIT(name, __ignored, ___ignored, ____ignored) \ 36 37 { \ 37 38 .srcu_wq = __SWAIT_QUEUE_HEAD_INITIALIZER(name.srcu_wq), \ 38 39 .srcu_cb_tail = &name.srcu_cb_head, \ 39 40 .srcu_work = __WORK_INITIALIZER(name.srcu_work, srcu_drive_gp), \ 41 + .srcu_irq_work = { .func = srcu_tiny_irq_work }, \ 40 42 __SRCU_DEP_MAP_INIT(name) \ 41 43 } 42 44
+5 -4
include/linux/srcutree.h
··· 34 34 /* Values: SRCU_READ_FLAVOR_.* */ 35 35 36 36 /* Update-side state. */ 37 - spinlock_t __private lock ____cacheline_internodealigned_in_smp; 37 + raw_spinlock_t __private lock ____cacheline_internodealigned_in_smp; 38 38 struct rcu_segcblist srcu_cblist; /* List of callbacks.*/ 39 39 unsigned long srcu_gp_seq_needed; /* Furthest future GP needed. */ 40 40 unsigned long srcu_gp_seq_needed_exp; /* Furthest future exp GP. */ ··· 55 55 * Node in SRCU combining tree, similar in function to rcu_data. 56 56 */ 57 57 struct srcu_node { 58 - spinlock_t __private lock; 58 + raw_spinlock_t __private lock; 59 59 unsigned long srcu_have_cbs[4]; /* GP seq for children having CBs, but only */ 60 60 /* if greater than ->srcu_gp_seq. */ 61 61 unsigned long srcu_data_have_cbs[4]; /* Which srcu_data structs have CBs for given GP? */ ··· 74 74 /* First node at each level. */ 75 75 int srcu_size_state; /* Small-to-big transition state. */ 76 76 struct mutex srcu_cb_mutex; /* Serialize CB preparation. */ 77 - spinlock_t __private lock; /* Protect counters and size state. */ 77 + raw_spinlock_t __private lock; /* Protect counters and size state. */ 78 78 struct mutex srcu_gp_mutex; /* Serialize GP work. */ 79 79 unsigned long srcu_gp_seq; /* Grace-period seq #. */ 80 80 unsigned long srcu_gp_seq_needed; /* Latest gp_seq needed. */ ··· 95 95 unsigned long reschedule_jiffies; 96 96 unsigned long reschedule_count; 97 97 struct delayed_work work; 98 + struct irq_work irq_work; 98 99 struct srcu_struct *srcu_ssp; 99 100 }; 100 101 ··· 157 156 158 157 #define __SRCU_USAGE_INIT(name) \ 159 158 { \ 160 - .lock = __SPIN_LOCK_UNLOCKED(name.lock), \ 159 + .lock = __RAW_SPIN_LOCK_UNLOCKED(name.lock), \ 161 160 .srcu_gp_seq = SRCU_GP_SEQ_INITIAL_VAL, \ 162 161 .srcu_gp_seq_needed = SRCU_GP_SEQ_INITIAL_VAL_WITH_STATE, \ 163 162 .srcu_gp_seq_needed_exp = SRCU_GP_SEQ_INITIAL_VAL, \
+49 -4
include/linux/virtio_net.h
··· 207 207 return __virtio_net_hdr_to_skb(skb, hdr, little_endian, hdr->gso_type); 208 208 } 209 209 210 + /* This function must be called after virtio_net_hdr_from_skb(). */ 211 + static inline void __virtio_net_set_hdrlen(const struct sk_buff *skb, 212 + struct virtio_net_hdr *hdr, 213 + bool little_endian) 214 + { 215 + u16 hdr_len; 216 + 217 + hdr_len = skb_transport_offset(skb); 218 + 219 + if (hdr->gso_type == VIRTIO_NET_HDR_GSO_UDP_L4) 220 + hdr_len += sizeof(struct udphdr); 221 + else 222 + hdr_len += tcp_hdrlen(skb); 223 + 224 + hdr->hdr_len = __cpu_to_virtio16(little_endian, hdr_len); 225 + } 226 + 227 + /* This function must be called after virtio_net_hdr_from_skb(). */ 228 + static inline void __virtio_net_set_tnl_hdrlen(const struct sk_buff *skb, 229 + struct virtio_net_hdr *hdr) 230 + { 231 + u16 hdr_len; 232 + 233 + hdr_len = skb_inner_transport_offset(skb); 234 + 235 + if (hdr->gso_type == VIRTIO_NET_HDR_GSO_UDP_L4) 236 + hdr_len += sizeof(struct udphdr); 237 + else 238 + hdr_len += inner_tcp_hdrlen(skb); 239 + 240 + hdr->hdr_len = __cpu_to_virtio16(true, hdr_len); 241 + } 242 + 210 243 static inline int virtio_net_hdr_from_skb(const struct sk_buff *skb, 211 244 struct virtio_net_hdr *hdr, 212 245 bool little_endian, ··· 418 385 bool tnl_hdr_negotiated, 419 386 bool little_endian, 420 387 int vlan_hlen, 421 - bool has_data_valid) 388 + bool has_data_valid, 389 + bool feature_hdrlen) 422 390 { 423 391 struct virtio_net_hdr *hdr = (struct virtio_net_hdr *)vhdr; 424 392 unsigned int inner_nh, outer_th; ··· 428 394 429 395 tnl_gso_type = skb_shinfo(skb)->gso_type & (SKB_GSO_UDP_TUNNEL | 430 396 SKB_GSO_UDP_TUNNEL_CSUM); 431 - if (!tnl_gso_type) 432 - return virtio_net_hdr_from_skb(skb, hdr, little_endian, 433 - has_data_valid, vlan_hlen); 397 + if (!tnl_gso_type) { 398 + ret = virtio_net_hdr_from_skb(skb, hdr, little_endian, 399 + has_data_valid, vlan_hlen); 400 + if (ret) 401 + return ret; 402 + 403 + if (feature_hdrlen && hdr->hdr_len) 404 + __virtio_net_set_hdrlen(skb, hdr, little_endian); 405 + 406 + return ret; 407 + } 434 408 435 409 /* Tunnel support not negotiated but skb ask for it. */ 436 410 if (!tnl_hdr_negotiated) ··· 455 413 skb_shinfo(skb)->gso_type |= tnl_gso_type; 456 414 if (ret) 457 415 return ret; 416 + 417 + if (feature_hdrlen && hdr->hdr_len) 418 + __virtio_net_set_tnl_hdrlen(skb, hdr); 458 419 459 420 if (skb->protocol == htons(ETH_P_IPV6)) 460 421 hdr->gso_type |= VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6;
+1
include/net/bluetooth/l2cap.h
··· 658 658 struct sk_buff *rx_skb; 659 659 __u32 rx_len; 660 660 struct ida tx_ida; 661 + __u8 tx_ident; 661 662 662 663 struct sk_buff_head pending_rx; 663 664 struct work_struct pending_rx_work;
+1
include/net/codel_impl.h
··· 158 158 bool drop; 159 159 160 160 if (!skb) { 161 + vars->first_above_time = 0; 161 162 vars->dropping = false; 162 163 return skb; 163 164 }
+14
include/net/inet_hashtables.h
··· 264 264 return &hinfo->bhash2[hash & (hinfo->bhash_size - 1)]; 265 265 } 266 266 267 + static inline bool inet_use_hash2_on_bind(const struct sock *sk) 268 + { 269 + #if IS_ENABLED(CONFIG_IPV6) 270 + if (sk->sk_family == AF_INET6) { 271 + if (ipv6_addr_any(&sk->sk_v6_rcv_saddr)) 272 + return false; 273 + 274 + if (!ipv6_addr_v4mapped(&sk->sk_v6_rcv_saddr)) 275 + return true; 276 + } 277 + #endif 278 + return sk->sk_rcv_saddr != htonl(INADDR_ANY); 279 + } 280 + 267 281 struct inet_bind_hashbucket * 268 282 inet_bhash2_addr_any_hashbucket(const struct sock *sk, const struct net *net, int port); 269 283
+20 -1
include/net/ip6_fib.h
··· 507 507 void inet6_rt_notify(int event, struct fib6_info *rt, struct nl_info *info, 508 508 unsigned int flags); 509 509 510 + void fib6_age_exceptions(struct fib6_info *rt, struct fib6_gc_args *gc_args, 511 + unsigned long now); 510 512 void fib6_run_gc(unsigned long expires, struct net *net, bool force); 511 - 512 513 void fib6_gc_cleanup(void); 513 514 514 515 int fib6_init(void); 515 516 517 + #if IS_ENABLED(CONFIG_IPV6) 516 518 /* Add the route to the gc list if it is not already there 517 519 * 518 520 * The callers should hold f6i->fib6_table->tb6_lock. ··· 546 544 if (!hlist_unhashed(&f6i->gc_link)) 547 545 hlist_del_init(&f6i->gc_link); 548 546 } 547 + 548 + static inline void fib6_may_remove_gc_list(struct net *net, 549 + struct fib6_info *f6i) 550 + { 551 + struct fib6_gc_args gc_args; 552 + 553 + if (hlist_unhashed(&f6i->gc_link)) 554 + return; 555 + 556 + gc_args.timeout = READ_ONCE(net->ipv6.sysctl.ip6_rt_gc_interval); 557 + gc_args.more = 0; 558 + 559 + rcu_read_lock(); 560 + fib6_age_exceptions(f6i, &gc_args, jiffies); 561 + rcu_read_unlock(); 562 + } 563 + #endif 549 564 550 565 struct ipv6_route_iter { 551 566 struct seq_net_private p;
+5
include/net/netfilter/nf_conntrack_core.h
··· 83 83 84 84 extern spinlock_t nf_conntrack_expect_lock; 85 85 86 + static inline void lockdep_nfct_expect_lock_held(void) 87 + { 88 + lockdep_assert_held(&nf_conntrack_expect_lock); 89 + } 90 + 86 91 /* ctnetlink code shared by both ctnetlink and nf_conntrack_bpf */ 87 92 88 93 static inline void __nf_ct_set_timeout(struct nf_conn *ct, u64 timeout)
+18 -2
include/net/netfilter/nf_conntrack_expect.h
··· 22 22 /* Hash member */ 23 23 struct hlist_node hnode; 24 24 25 + /* Network namespace */ 26 + possible_net_t net; 27 + 25 28 /* We expect this tuple, with the following mask */ 26 29 struct nf_conntrack_tuple tuple; 27 30 struct nf_conntrack_tuple_mask mask; 28 31 32 + #ifdef CONFIG_NF_CONNTRACK_ZONES 33 + struct nf_conntrack_zone zone; 34 + #endif 29 35 /* Usage count. */ 30 36 refcount_t use; 31 37 ··· 46 40 struct nf_conntrack_expect *this); 47 41 48 42 /* Helper to assign to new connection */ 49 - struct nf_conntrack_helper *helper; 43 + struct nf_conntrack_helper __rcu *helper; 50 44 51 45 /* The conntrack of the master connection */ 52 46 struct nf_conn *master; ··· 68 62 69 63 static inline struct net *nf_ct_exp_net(struct nf_conntrack_expect *exp) 70 64 { 71 - return nf_ct_net(exp->master); 65 + return read_pnet(&exp->net); 66 + } 67 + 68 + static inline bool nf_ct_exp_zone_equal_any(const struct nf_conntrack_expect *a, 69 + const struct nf_conntrack_zone *b) 70 + { 71 + #ifdef CONFIG_NF_CONNTRACK_ZONES 72 + return a->zone.id == b->id; 73 + #else 74 + return true; 75 + #endif 72 76 } 73 77 74 78 #define NF_CT_EXP_POLICY_NAME_LEN 16
+1 -1
include/net/netns/xfrm.h
··· 59 59 struct list_head inexact_bins; 60 60 61 61 62 - struct sock *nlsk; 62 + struct sock __rcu *nlsk; 63 63 struct sock *nlsk_stash; 64 64 65 65 u32 sysctl_aevent_etime;
-5
include/sound/sdca_function.h
··· 27 27 #define SDCA_MAX_ENTITY_COUNT 128 28 28 29 29 /* 30 - * Sanity check on number of initialization writes, can be expanded if needed. 31 - */ 32 - #define SDCA_MAX_INIT_COUNT 2048 33 - 34 - /* 35 30 * The Cluster IDs are 16-bit, so a maximum of 65535 Clusters per 36 31 * function can be represented, however limit this to a slightly 37 32 * more reasonable value. Can be expanded if needed.
+7 -4
include/trace/events/btrfs.h
··· 769 769 ), 770 770 771 771 TP_fast_assign( 772 - const struct dentry *dentry = file->f_path.dentry; 773 - const struct inode *inode = d_inode(dentry); 772 + struct dentry *dentry = file_dentry(file); 773 + struct inode *inode = file_inode(file); 774 + struct dentry *parent = dget_parent(dentry); 775 + struct inode *parent_inode = d_inode(parent); 774 776 775 - TP_fast_assign_fsid(btrfs_sb(file->f_path.dentry->d_sb)); 777 + dput(parent); 778 + TP_fast_assign_fsid(btrfs_sb(inode->i_sb)); 776 779 __entry->ino = btrfs_ino(BTRFS_I(inode)); 777 - __entry->parent = btrfs_ino(BTRFS_I(d_inode(dentry->d_parent))); 780 + __entry->parent = btrfs_ino(BTRFS_I(parent_inode)); 778 781 __entry->datasync = datasync; 779 782 __entry->root_objectid = btrfs_root_id(BTRFS_I(inode)->root); 780 783 ),
+3 -1
include/trace/events/dma.h
··· 32 32 { DMA_ATTR_ALLOC_SINGLE_PAGES, "ALLOC_SINGLE_PAGES" }, \ 33 33 { DMA_ATTR_NO_WARN, "NO_WARN" }, \ 34 34 { DMA_ATTR_PRIVILEGED, "PRIVILEGED" }, \ 35 - { DMA_ATTR_MMIO, "MMIO" }) 35 + { DMA_ATTR_MMIO, "MMIO" }, \ 36 + { DMA_ATTR_DEBUGGING_IGNORE_CACHELINES, "CACHELINES_OVERLAP" }, \ 37 + { DMA_ATTR_REQUIRE_COHERENT, "REQUIRE_COHERENT" }) 36 38 37 39 DECLARE_EVENT_CLASS(dma_map, 38 40 TP_PROTO(struct device *dev, phys_addr_t phys_addr, dma_addr_t dma_addr,
+4 -4
include/trace/events/netfs.h
··· 740 740 __field(unsigned int, wreq) 741 741 __field(unsigned char, stream) 742 742 __field(unsigned long long, collected_to) 743 - __field(unsigned long long, front) 743 + __field(unsigned long long, issued_to) 744 744 ), 745 745 746 746 TP_fast_assign( 747 747 __entry->wreq = wreq->debug_id; 748 748 __entry->stream = stream->stream_nr; 749 749 __entry->collected_to = stream->collected_to; 750 - __entry->front = stream->front ? stream->front->start : UINT_MAX; 750 + __entry->issued_to = atomic64_read(&wreq->issued_to); 751 751 ), 752 752 753 - TP_printk("R=%08x[%x:] cto=%llx frn=%llx", 753 + TP_printk("R=%08x[%x:] cto=%llx ito=%llx", 754 754 __entry->wreq, __entry->stream, 755 - __entry->collected_to, __entry->front) 755 + __entry->collected_to, __entry->issued_to) 756 756 ); 757 757 758 758 TRACE_EVENT(netfs_folioq,
+4
include/uapi/linux/netfilter/nf_conntrack_common.h
··· 159 159 #define NF_CT_EXPECT_INACTIVE 0x2 160 160 #define NF_CT_EXPECT_USERSPACE 0x4 161 161 162 + #ifdef __KERNEL__ 163 + #define NF_CT_EXPECT_MASK (NF_CT_EXPECT_PERMANENT | NF_CT_EXPECT_INACTIVE | \ 164 + NF_CT_EXPECT_USERSPACE) 165 + #endif 162 166 163 167 #endif /* _UAPI_NF_CONNTRACK_COMMON_H */
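A hedged sketch of the kind of validation the new kernel-only mask enables; the function name and the flags-from-netlink framing are illustrative, not lifted from the ctnetlink code:

    #include <linux/errno.h>
    #include <linux/types.h>
    #include <linux/netfilter/nf_conntrack_common.h>

    /* Reject any expectation flags outside the three UAPI-defined bits. */
    static int check_expect_flags(u32 uflags)
    {
            if (uflags & ~NF_CT_EXPECT_MASK)
                    return -EOPNOTSUPP;
            return 0;
    }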
+1 -1
init/Kconfig
··· 146 146 config CC_HAS_COUNTED_BY_PTR 147 147 bool 148 148 # supported since clang 22 149 - default y if CC_IS_CLANG && CLANG_VERSION >= 220000 149 + default y if CC_IS_CLANG && CLANG_VERSION >= 220100 150 150 # supported since gcc 16.0.0 151 151 default y if CC_IS_GCC && GCC_VERSION >= 160000 152 152
+3 -1
io_uring/fdinfo.c
··· 119 119 sq_idx); 120 120 break; 121 121 } 122 - if ((++sq_head & sq_mask) == 0) { 122 + if (sq_idx == sq_mask) { 123 123 seq_printf(m, 124 124 "%5u: corrupted sqe, wrapping 128B entry\n", 125 125 sq_idx); 126 126 break; 127 127 } 128 + sq_head++; 129 + i++; 128 130 sqe128 = true; 129 131 } 130 132 seq_printf(m, "%5u: opcode:%s, fd:%d, flags:%x, off:%llu, "
+5 -4
kernel/dma/debug.c
··· 453 453 return overlap; 454 454 } 455 455 456 - static void active_cacheline_inc_overlap(phys_addr_t cln) 456 + static void active_cacheline_inc_overlap(phys_addr_t cln, bool is_cache_clean) 457 457 { 458 458 int overlap = active_cacheline_read_overlap(cln); 459 459 ··· 462 462 /* If we overflowed the overlap counter then we're potentially 463 463 * leaking dma-mappings. 464 464 */ 465 - WARN_ONCE(overlap > ACTIVE_CACHELINE_MAX_OVERLAP, 465 + WARN_ONCE(!is_cache_clean && overlap > ACTIVE_CACHELINE_MAX_OVERLAP, 466 466 pr_fmt("exceeded %d overlapping mappings of cacheline %pa\n"), 467 467 ACTIVE_CACHELINE_MAX_OVERLAP, &cln); 468 468 } ··· 495 495 if (rc == -EEXIST) { 496 496 struct dma_debug_entry *existing; 497 497 498 - active_cacheline_inc_overlap(cln); 498 + active_cacheline_inc_overlap(cln, entry->is_cache_clean); 499 499 existing = radix_tree_lookup(&dma_active_cacheline, cln); 500 500 /* A lookup failure here after we got -EEXIST is unexpected. */ 501 501 WARN_ON(!existing); ··· 601 601 unsigned long flags; 602 602 int rc; 603 603 604 - entry->is_cache_clean = !!(attrs & DMA_ATTR_CPU_CACHE_CLEAN); 604 + entry->is_cache_clean = attrs & (DMA_ATTR_DEBUGGING_IGNORE_CACHELINES | 605 + DMA_ATTR_REQUIRE_COHERENT); 605 606 606 607 bucket = get_hash_bucket(entry, &flags); 607 608 hash_bucket_add(bucket, entry);
+4 -3
kernel/dma/direct.h
··· 84 84 dma_addr_t dma_addr; 85 85 86 86 if (is_swiotlb_force_bounce(dev)) { 87 - if (attrs & DMA_ATTR_MMIO) 87 + if (attrs & (DMA_ATTR_MMIO | DMA_ATTR_REQUIRE_COHERENT)) 88 88 return DMA_MAPPING_ERROR; 89 89 90 90 return swiotlb_map(dev, phys, size, dir, attrs); ··· 98 98 dma_addr = phys_to_dma(dev, phys); 99 99 if (unlikely(!dma_capable(dev, dma_addr, size, true)) || 100 100 dma_kmalloc_needs_bounce(dev, size, dir)) { 101 - if (is_swiotlb_active(dev)) 101 + if (is_swiotlb_active(dev) && 102 + !(attrs & DMA_ATTR_REQUIRE_COHERENT)) 102 103 return swiotlb_map(dev, phys, size, dir, attrs); 103 104 104 105 goto err_overflow; ··· 124 123 { 125 124 phys_addr_t phys; 126 125 127 - if (attrs & DMA_ATTR_MMIO) 126 + if (attrs & (DMA_ATTR_MMIO | DMA_ATTR_REQUIRE_COHERENT)) 128 127 /* nothing to do: uncached and no swiotlb */ 129 128 return; 130 129
+6
kernel/dma/mapping.c
··· 164 164 if (WARN_ON_ONCE(!dev->dma_mask)) 165 165 return DMA_MAPPING_ERROR; 166 166 167 + if (!dev_is_dma_coherent(dev) && (attrs & DMA_ATTR_REQUIRE_COHERENT)) 168 + return DMA_MAPPING_ERROR; 169 + 167 170 if (dma_map_direct(dev, ops) || 168 171 (!is_mmio && arch_dma_map_phys_direct(dev, phys + size))) 169 172 addr = dma_direct_map_phys(dev, phys, size, dir, attrs); ··· 237 234 int ents; 238 235 239 236 BUG_ON(!valid_dma_direction(dir)); 237 + 238 + if (!dev_is_dma_coherent(dev) && (attrs & DMA_ATTR_REQUIRE_COHERENT)) 239 + return -EOPNOTSUPP; 240 240 241 241 if (WARN_ON_ONCE(!dev->dma_mask)) 242 242 return 0;
+19 -2
kernel/dma/swiotlb.c
··· 30 30 #include <linux/gfp.h> 31 31 #include <linux/highmem.h> 32 32 #include <linux/io.h> 33 + #include <linux/kmsan-checks.h> 33 34 #include <linux/iommu-helper.h> 34 35 #include <linux/init.h> 35 36 #include <linux/memblock.h> ··· 902 901 903 902 local_irq_save(flags); 904 903 page = pfn_to_page(pfn); 905 - if (dir == DMA_TO_DEVICE) 904 + if (dir == DMA_TO_DEVICE) { 905 + /* 906 + * Ideally, kmsan_check_highmem_page() 907 + * could be used here to detect infoleaks, 908 + * but callers may map uninitialized buffers 909 + * that will be written by the device, 910 + * causing false positives. 911 + */ 906 912 memcpy_from_page(vaddr, page, offset, sz); 907 - else 913 + } else { 914 + kmsan_unpoison_memory(vaddr, sz); 908 915 memcpy_to_page(page, offset, vaddr, sz); 916 + } 909 917 local_irq_restore(flags); 910 918 911 919 size -= sz; ··· 923 913 offset = 0; 924 914 } 925 915 } else if (dir == DMA_TO_DEVICE) { 916 + /* 917 + * Ideally, kmsan_check_memory() could be used here to detect 918 + * infoleaks (uninitialized data being sent to device), but 919 + * callers may map uninitialized buffers that will be written 920 + * by the device, causing false positives. 921 + */ 926 922 memcpy(vaddr, phys_to_virt(orig_addr), size); 927 923 } else { 924 + kmsan_unpoison_memory(vaddr, size); 928 925 memcpy(phys_to_virt(orig_addr), vaddr, size); 929 926 } 930 927 }
+1 -1
kernel/futex/core.c
··· 342 342 if (!vma) 343 343 return FUTEX_NO_NODE; 344 344 345 - mpol = vma_policy(vma); 345 + mpol = READ_ONCE(vma->vm_policy); 346 346 if (!mpol) 347 347 return FUTEX_NO_NODE; 348 348
+2 -1
kernel/futex/pi.c
··· 918 918 int futex_lock_pi(u32 __user *uaddr, unsigned int flags, ktime_t *time, int trylock) 919 919 { 920 920 struct hrtimer_sleeper timeout, *to; 921 - struct task_struct *exiting = NULL; 921 + struct task_struct *exiting; 922 922 struct rt_mutex_waiter rt_waiter; 923 923 struct futex_q q = futex_q_init; 924 924 DEFINE_WAKE_Q(wake_q); ··· 933 933 to = futex_setup_timer(time, &timeout, flags, 0); 934 934 935 935 retry: 936 + exiting = NULL; 936 937 ret = get_futex_key(uaddr, flags, &q.key, FUTEX_WRITE); 937 938 if (unlikely(ret != 0)) 938 939 goto out;
+8
kernel/futex/syscalls.c
··· 459 459 if (ret) 460 460 return ret; 461 461 462 + /* 463 + * For now mandate both flags are identical, like the sys_futex() 464 + * interface has. If/when we merge the variable sized futex support, 465 + * that patch can modify this test to allow a difference in size. 466 + */ 467 + if (futexes[0].w.flags != futexes[1].w.flags) 468 + return -EINVAL; 469 + 462 470 cmpval = futexes[0].w.val; 463 471 464 472 return futex_requeue(u64_to_user_ptr(futexes[0].w.uaddr), futexes[0].w.flags,
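From userspace, the stricter check means both futex_waitv entries handed to futex_requeue() must carry identical flags. A sketch under the assumption that the libc and UAPI headers expose SYS_futex_requeue, struct futex_waitv and the FUTEX2_* flags (the helper name is made up):

    #include <linux/futex.h>
    #include <sys/syscall.h>
    #include <unistd.h>
    #include <string.h>
    #include <stdint.h>

    /* Wake one waiter on f1 and requeue up to 8 more onto f2. */
    static long requeue_waiters(uint32_t *f1, uint32_t *f2, uint32_t expected)
    {
            struct futex_waitv fw[2];

            memset(fw, 0, sizeof(fw));
            fw[0].uaddr = (uintptr_t)f1;
            fw[0].val   = expected;                 /* *f1 must still hold this */
            fw[0].flags = FUTEX2_SIZE_U32 | FUTEX2_PRIVATE;
            fw[1].uaddr = (uintptr_t)f2;
            fw[1].flags = FUTEX2_SIZE_U32 | FUTEX2_PRIVATE; /* must equal fw[0].flags */

            /* Mismatched flags are now rejected with EINVAL before any requeue work. */
            return syscall(SYS_futex_requeue, fw, 0, 1, 8);
    }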
+1 -1
kernel/power/main.c
··· 40 40 { 41 41 WARN_ON(!mutex_is_locked(&system_transition_mutex)); 42 42 43 - if (WARN_ON(!saved_gfp_count) || --saved_gfp_count) 43 + if (!saved_gfp_count || --saved_gfp_count) 44 44 return; 45 45 46 46 gfp_allowed_mask = saved_gfp_mask;
+11
kernel/power/snapshot.c
··· 2855 2855 { 2856 2856 int error; 2857 2857 2858 + /* 2859 + * Call snapshot_write_next() to drain any trailing zero pages, 2860 + * but make sure we're in the data page region first. 2861 + * This function can return PAGE_SIZE if the kernel was expecting 2862 + * another copy page. Return -ENODATA in that situation. 2863 + */ 2864 + if (handle->cur > nr_meta_pages + 1) { 2865 + error = snapshot_write_next(handle); 2866 + if (error) 2867 + return error > 0 ? -ENODATA : error; 2868 + } 2858 2869 copy_last_highmem_page(); 2859 2870 error = hibernate_restore_protect_page(handle->buffer); 2860 2871 /* Do that only if we have loaded the image entirely */
+9
kernel/rcu/rcu.h
··· 502 502 ___locked; \ 503 503 }) 504 504 505 + #define raw_spin_trylock_irqsave_rcu_node(p, flags) \ 506 + ({ \ 507 + bool ___locked = raw_spin_trylock_irqsave(&ACCESS_PRIVATE(p, lock), flags); \ 508 + \ 509 + if (___locked) \ 510 + smp_mb__after_unlock_lock(); \ 511 + ___locked; \ 512 + }) 513 + 505 514 #define raw_lockdep_assert_held_rcu_node(p) \ 506 515 lockdep_assert_held(&ACCESS_PRIVATE(p, lock)) 507 516
+18 -1
kernel/rcu/srcutiny.c
··· 9 9 */ 10 10 11 11 #include <linux/export.h> 12 + #include <linux/irq_work.h> 12 13 #include <linux/mutex.h> 13 14 #include <linux/preempt.h> 14 15 #include <linux/rcupdate_wait.h> ··· 42 41 ssp->srcu_idx_max = 0; 43 42 INIT_WORK(&ssp->srcu_work, srcu_drive_gp); 44 43 INIT_LIST_HEAD(&ssp->srcu_work.entry); 44 + init_irq_work(&ssp->srcu_irq_work, srcu_tiny_irq_work); 45 45 return 0; 46 46 } 47 47 ··· 86 84 void cleanup_srcu_struct(struct srcu_struct *ssp) 87 85 { 88 86 WARN_ON(ssp->srcu_lock_nesting[0] || ssp->srcu_lock_nesting[1]); 87 + irq_work_sync(&ssp->srcu_irq_work); 89 88 flush_work(&ssp->srcu_work); 90 89 WARN_ON(ssp->srcu_gp_running); 91 90 WARN_ON(ssp->srcu_gp_waiting); ··· 180 177 } 181 178 EXPORT_SYMBOL_GPL(srcu_drive_gp); 182 179 180 + /* 181 + * Use an irq_work to defer schedule_work() to avoid acquiring the workqueue 182 + * pool->lock while the caller might hold scheduler locks, causing lockdep 183 + * splats due to workqueue_init() doing a wakeup. 184 + */ 185 + void srcu_tiny_irq_work(struct irq_work *irq_work) 186 + { 187 + struct srcu_struct *ssp; 188 + 189 + ssp = container_of(irq_work, struct srcu_struct, srcu_irq_work); 190 + schedule_work(&ssp->srcu_work); 191 + } 192 + EXPORT_SYMBOL_GPL(srcu_tiny_irq_work); 193 + 183 194 static void srcu_gp_start_if_needed(struct srcu_struct *ssp) 184 195 { 185 196 unsigned long cookie; ··· 206 189 WRITE_ONCE(ssp->srcu_idx_max, cookie); 207 190 if (!READ_ONCE(ssp->srcu_gp_running)) { 208 191 if (likely(srcu_init_done)) 209 - schedule_work(&ssp->srcu_work); 192 + irq_work_queue(&ssp->srcu_irq_work); 210 193 else if (list_empty(&ssp->srcu_work.entry)) 211 194 list_add(&ssp->srcu_work.entry, &srcu_boot_list); 212 195 }
+102 -109
kernel/rcu/srcutree.c
··· 19 19 #include <linux/mutex.h> 20 20 #include <linux/percpu.h> 21 21 #include <linux/preempt.h> 22 + #include <linux/irq_work.h> 22 23 #include <linux/rcupdate_wait.h> 23 24 #include <linux/sched.h> 24 25 #include <linux/smp.h> ··· 76 75 static void srcu_invoke_callbacks(struct work_struct *work); 77 76 static void srcu_reschedule(struct srcu_struct *ssp, unsigned long delay); 78 77 static void process_srcu(struct work_struct *work); 78 + static void srcu_irq_work(struct irq_work *work); 79 79 static void srcu_delay_timer(struct timer_list *t); 80 - 81 - /* Wrappers for lock acquisition and release, see raw_spin_lock_rcu_node(). */ 82 - #define spin_lock_rcu_node(p) \ 83 - do { \ 84 - spin_lock(&ACCESS_PRIVATE(p, lock)); \ 85 - smp_mb__after_unlock_lock(); \ 86 - } while (0) 87 - 88 - #define spin_unlock_rcu_node(p) spin_unlock(&ACCESS_PRIVATE(p, lock)) 89 - 90 - #define spin_lock_irq_rcu_node(p) \ 91 - do { \ 92 - spin_lock_irq(&ACCESS_PRIVATE(p, lock)); \ 93 - smp_mb__after_unlock_lock(); \ 94 - } while (0) 95 - 96 - #define spin_unlock_irq_rcu_node(p) \ 97 - spin_unlock_irq(&ACCESS_PRIVATE(p, lock)) 98 - 99 - #define spin_lock_irqsave_rcu_node(p, flags) \ 100 - do { \ 101 - spin_lock_irqsave(&ACCESS_PRIVATE(p, lock), flags); \ 102 - smp_mb__after_unlock_lock(); \ 103 - } while (0) 104 - 105 - #define spin_trylock_irqsave_rcu_node(p, flags) \ 106 - ({ \ 107 - bool ___locked = spin_trylock_irqsave(&ACCESS_PRIVATE(p, lock), flags); \ 108 - \ 109 - if (___locked) \ 110 - smp_mb__after_unlock_lock(); \ 111 - ___locked; \ 112 - }) 113 - 114 - #define spin_unlock_irqrestore_rcu_node(p, flags) \ 115 - spin_unlock_irqrestore(&ACCESS_PRIVATE(p, lock), flags) \ 116 80 117 81 /* 118 82 * Initialize SRCU per-CPU data. Note that statically allocated ··· 97 131 */ 98 132 for_each_possible_cpu(cpu) { 99 133 sdp = per_cpu_ptr(ssp->sda, cpu); 100 - spin_lock_init(&ACCESS_PRIVATE(sdp, lock)); 134 + raw_spin_lock_init(&ACCESS_PRIVATE(sdp, lock)); 101 135 rcu_segcblist_init(&sdp->srcu_cblist); 102 136 sdp->srcu_cblist_invoking = false; 103 137 sdp->srcu_gp_seq_needed = ssp->srcu_sup->srcu_gp_seq; ··· 152 186 153 187 /* Each pass through this loop initializes one srcu_node structure. */ 154 188 srcu_for_each_node_breadth_first(ssp, snp) { 155 - spin_lock_init(&ACCESS_PRIVATE(snp, lock)); 189 + raw_spin_lock_init(&ACCESS_PRIVATE(snp, lock)); 156 190 BUILD_BUG_ON(ARRAY_SIZE(snp->srcu_have_cbs) != 157 191 ARRAY_SIZE(snp->srcu_data_have_cbs)); 158 192 for (i = 0; i < ARRAY_SIZE(snp->srcu_have_cbs); i++) { ··· 208 242 if (!ssp->srcu_sup) 209 243 return -ENOMEM; 210 244 if (!is_static) 211 - spin_lock_init(&ACCESS_PRIVATE(ssp->srcu_sup, lock)); 245 + raw_spin_lock_init(&ACCESS_PRIVATE(ssp->srcu_sup, lock)); 212 246 ssp->srcu_sup->srcu_size_state = SRCU_SIZE_SMALL; 213 247 ssp->srcu_sup->node = NULL; 214 248 mutex_init(&ssp->srcu_sup->srcu_cb_mutex); ··· 218 252 mutex_init(&ssp->srcu_sup->srcu_barrier_mutex); 219 253 atomic_set(&ssp->srcu_sup->srcu_barrier_cpu_cnt, 0); 220 254 INIT_DELAYED_WORK(&ssp->srcu_sup->work, process_srcu); 255 + init_irq_work(&ssp->srcu_sup->irq_work, srcu_irq_work); 221 256 ssp->srcu_sup->sda_is_static = is_static; 222 257 if (!is_static) { 223 258 ssp->sda = alloc_percpu(struct srcu_data); ··· 230 263 ssp->srcu_sup->srcu_gp_seq_needed_exp = SRCU_GP_SEQ_INITIAL_VAL; 231 264 ssp->srcu_sup->srcu_last_gp_end = ktime_get_mono_fast_ns(); 232 265 if (READ_ONCE(ssp->srcu_sup->srcu_size_state) == SRCU_SIZE_SMALL && SRCU_SIZING_IS_INIT()) { 233 - if (!init_srcu_struct_nodes(ssp, is_static ? 
GFP_ATOMIC : GFP_KERNEL)) 266 + if (!preemptible()) 267 + WRITE_ONCE(ssp->srcu_sup->srcu_size_state, SRCU_SIZE_ALLOC); 268 + else if (init_srcu_struct_nodes(ssp, GFP_KERNEL)) 269 + WRITE_ONCE(ssp->srcu_sup->srcu_size_state, SRCU_SIZE_BIG); 270 + else 234 271 goto err_free_sda; 235 - WRITE_ONCE(ssp->srcu_sup->srcu_size_state, SRCU_SIZE_BIG); 236 272 } 237 273 ssp->srcu_sup->srcu_ssp = ssp; 238 274 smp_store_release(&ssp->srcu_sup->srcu_gp_seq_needed, ··· 364 394 /* Double-checked locking on ->srcu_size-state. */ 365 395 if (smp_load_acquire(&ssp->srcu_sup->srcu_size_state) != SRCU_SIZE_SMALL) 366 396 return; 367 - spin_lock_irqsave_rcu_node(ssp->srcu_sup, flags); 397 + raw_spin_lock_irqsave_rcu_node(ssp->srcu_sup, flags); 368 398 if (smp_load_acquire(&ssp->srcu_sup->srcu_size_state) != SRCU_SIZE_SMALL) { 369 - spin_unlock_irqrestore_rcu_node(ssp->srcu_sup, flags); 399 + raw_spin_unlock_irqrestore_rcu_node(ssp->srcu_sup, flags); 370 400 return; 371 401 } 372 402 __srcu_transition_to_big(ssp); 373 - spin_unlock_irqrestore_rcu_node(ssp->srcu_sup, flags); 403 + raw_spin_unlock_irqrestore_rcu_node(ssp->srcu_sup, flags); 374 404 } 375 405 376 406 /* 377 407 * Check to see if the just-encountered contention event justifies 378 408 * a transition to SRCU_SIZE_BIG. 379 409 */ 380 - static void spin_lock_irqsave_check_contention(struct srcu_struct *ssp) 410 + static void raw_spin_lock_irqsave_check_contention(struct srcu_struct *ssp) 381 411 { 382 412 unsigned long j; 383 413 ··· 399 429 * to SRCU_SIZE_BIG. But only if the srcutree.convert_to_big module 400 430 * parameter permits this. 401 431 */ 402 - static void spin_lock_irqsave_sdp_contention(struct srcu_data *sdp, unsigned long *flags) 432 + static void raw_spin_lock_irqsave_sdp_contention(struct srcu_data *sdp, unsigned long *flags) 403 433 { 404 434 struct srcu_struct *ssp = sdp->ssp; 405 435 406 - if (spin_trylock_irqsave_rcu_node(sdp, *flags)) 436 + if (raw_spin_trylock_irqsave_rcu_node(sdp, *flags)) 407 437 return; 408 - spin_lock_irqsave_rcu_node(ssp->srcu_sup, *flags); 409 - spin_lock_irqsave_check_contention(ssp); 410 - spin_unlock_irqrestore_rcu_node(ssp->srcu_sup, *flags); 411 - spin_lock_irqsave_rcu_node(sdp, *flags); 438 + raw_spin_lock_irqsave_rcu_node(ssp->srcu_sup, *flags); 439 + raw_spin_lock_irqsave_check_contention(ssp); 440 + raw_spin_unlock_irqrestore_rcu_node(ssp->srcu_sup, *flags); 441 + raw_spin_lock_irqsave_rcu_node(sdp, *flags); 412 442 } 413 443 414 444 /* ··· 417 447 * to SRCU_SIZE_BIG. But only if the srcutree.convert_to_big module 418 448 * parameter permits this. 419 449 */ 420 - static void spin_lock_irqsave_ssp_contention(struct srcu_struct *ssp, unsigned long *flags) 450 + static void raw_spin_lock_irqsave_ssp_contention(struct srcu_struct *ssp, unsigned long *flags) 421 451 { 422 - if (spin_trylock_irqsave_rcu_node(ssp->srcu_sup, *flags)) 452 + if (raw_spin_trylock_irqsave_rcu_node(ssp->srcu_sup, *flags)) 423 453 return; 424 - spin_lock_irqsave_rcu_node(ssp->srcu_sup, *flags); 425 - spin_lock_irqsave_check_contention(ssp); 454 + raw_spin_lock_irqsave_rcu_node(ssp->srcu_sup, *flags); 455 + raw_spin_lock_irqsave_check_contention(ssp); 426 456 } 427 457 428 458 /* ··· 440 470 /* The smp_load_acquire() pairs with the smp_store_release(). */ 441 471 if (!rcu_seq_state(smp_load_acquire(&ssp->srcu_sup->srcu_gp_seq_needed))) /*^^^*/ 442 472 return; /* Already initialized. 
*/ 443 - spin_lock_irqsave_rcu_node(ssp->srcu_sup, flags); 473 + raw_spin_lock_irqsave_rcu_node(ssp->srcu_sup, flags); 444 474 if (!rcu_seq_state(ssp->srcu_sup->srcu_gp_seq_needed)) { 445 - spin_unlock_irqrestore_rcu_node(ssp->srcu_sup, flags); 475 + raw_spin_unlock_irqrestore_rcu_node(ssp->srcu_sup, flags); 446 476 return; 447 477 } 448 478 init_srcu_struct_fields(ssp, true); 449 - spin_unlock_irqrestore_rcu_node(ssp->srcu_sup, flags); 479 + raw_spin_unlock_irqrestore_rcu_node(ssp->srcu_sup, flags); 450 480 } 451 481 452 482 /* ··· 712 742 unsigned long delay; 713 743 struct srcu_usage *sup = ssp->srcu_sup; 714 744 715 - spin_lock_irq_rcu_node(ssp->srcu_sup); 745 + raw_spin_lock_irq_rcu_node(ssp->srcu_sup); 716 746 delay = srcu_get_delay(ssp); 717 - spin_unlock_irq_rcu_node(ssp->srcu_sup); 747 + raw_spin_unlock_irq_rcu_node(ssp->srcu_sup); 718 748 if (WARN_ON(!delay)) 719 749 return; /* Just leak it! */ 720 750 if (WARN_ON(srcu_readers_active(ssp))) 721 751 return; /* Just leak it! */ 752 + /* Wait for irq_work to finish first as it may queue a new work. */ 753 + irq_work_sync(&sup->irq_work); 722 754 flush_delayed_work(&sup->work); 723 755 for_each_possible_cpu(cpu) { 724 756 struct srcu_data *sdp = per_cpu_ptr(ssp->sda, cpu); ··· 932 960 mutex_lock(&sup->srcu_cb_mutex); 933 961 934 962 /* End the current grace period. */ 935 - spin_lock_irq_rcu_node(sup); 963 + raw_spin_lock_irq_rcu_node(sup); 936 964 idx = rcu_seq_state(sup->srcu_gp_seq); 937 965 WARN_ON_ONCE(idx != SRCU_STATE_SCAN2); 938 966 if (srcu_gp_is_expedited(ssp)) ··· 943 971 gpseq = rcu_seq_current(&sup->srcu_gp_seq); 944 972 if (ULONG_CMP_LT(sup->srcu_gp_seq_needed_exp, gpseq)) 945 973 WRITE_ONCE(sup->srcu_gp_seq_needed_exp, gpseq); 946 - spin_unlock_irq_rcu_node(sup); 974 + raw_spin_unlock_irq_rcu_node(sup); 947 975 mutex_unlock(&sup->srcu_gp_mutex); 948 976 /* A new grace period can start at this point. But only one. */ 949 977 ··· 955 983 } else { 956 984 idx = rcu_seq_ctr(gpseq) % ARRAY_SIZE(snp->srcu_have_cbs); 957 985 srcu_for_each_node_breadth_first(ssp, snp) { 958 - spin_lock_irq_rcu_node(snp); 986 + raw_spin_lock_irq_rcu_node(snp); 959 987 cbs = false; 960 988 last_lvl = snp >= sup->level[rcu_num_lvls - 1]; 961 989 if (last_lvl) ··· 970 998 else 971 999 mask = snp->srcu_data_have_cbs[idx]; 972 1000 snp->srcu_data_have_cbs[idx] = 0; 973 - spin_unlock_irq_rcu_node(snp); 1001 + raw_spin_unlock_irq_rcu_node(snp); 974 1002 if (cbs) 975 1003 srcu_schedule_cbs_snp(ssp, snp, mask, cbdelay); 976 1004 } ··· 980 1008 if (!(gpseq & counter_wrap_check)) 981 1009 for_each_possible_cpu(cpu) { 982 1010 sdp = per_cpu_ptr(ssp->sda, cpu); 983 - spin_lock_irq_rcu_node(sdp); 1011 + raw_spin_lock_irq_rcu_node(sdp); 984 1012 if (ULONG_CMP_GE(gpseq, sdp->srcu_gp_seq_needed + 100)) 985 1013 sdp->srcu_gp_seq_needed = gpseq; 986 1014 if (ULONG_CMP_GE(gpseq, sdp->srcu_gp_seq_needed_exp + 100)) 987 1015 sdp->srcu_gp_seq_needed_exp = gpseq; 988 - spin_unlock_irq_rcu_node(sdp); 1016 + raw_spin_unlock_irq_rcu_node(sdp); 989 1017 } 990 1018 991 1019 /* Callback initiation done, allow grace periods after next. */ 992 1020 mutex_unlock(&sup->srcu_cb_mutex); 993 1021 994 1022 /* Start a new grace period if needed. 
*/ 995 - spin_lock_irq_rcu_node(sup); 1023 + raw_spin_lock_irq_rcu_node(sup); 996 1024 gpseq = rcu_seq_current(&sup->srcu_gp_seq); 997 1025 if (!rcu_seq_state(gpseq) && 998 1026 ULONG_CMP_LT(gpseq, sup->srcu_gp_seq_needed)) { 999 1027 srcu_gp_start(ssp); 1000 - spin_unlock_irq_rcu_node(sup); 1028 + raw_spin_unlock_irq_rcu_node(sup); 1001 1029 srcu_reschedule(ssp, 0); 1002 1030 } else { 1003 - spin_unlock_irq_rcu_node(sup); 1031 + raw_spin_unlock_irq_rcu_node(sup); 1004 1032 } 1005 1033 1006 1034 /* Transition to big if needed. */ ··· 1031 1059 if (WARN_ON_ONCE(rcu_seq_done(&ssp->srcu_sup->srcu_gp_seq, s)) || 1032 1060 (!srcu_invl_snp_seq(sgsne) && ULONG_CMP_GE(sgsne, s))) 1033 1061 return; 1034 - spin_lock_irqsave_rcu_node(snp, flags); 1062 + raw_spin_lock_irqsave_rcu_node(snp, flags); 1035 1063 sgsne = snp->srcu_gp_seq_needed_exp; 1036 1064 if (!srcu_invl_snp_seq(sgsne) && ULONG_CMP_GE(sgsne, s)) { 1037 - spin_unlock_irqrestore_rcu_node(snp, flags); 1065 + raw_spin_unlock_irqrestore_rcu_node(snp, flags); 1038 1066 return; 1039 1067 } 1040 1068 WRITE_ONCE(snp->srcu_gp_seq_needed_exp, s); 1041 - spin_unlock_irqrestore_rcu_node(snp, flags); 1069 + raw_spin_unlock_irqrestore_rcu_node(snp, flags); 1042 1070 } 1043 - spin_lock_irqsave_ssp_contention(ssp, &flags); 1071 + raw_spin_lock_irqsave_ssp_contention(ssp, &flags); 1044 1072 if (ULONG_CMP_LT(ssp->srcu_sup->srcu_gp_seq_needed_exp, s)) 1045 1073 WRITE_ONCE(ssp->srcu_sup->srcu_gp_seq_needed_exp, s); 1046 - spin_unlock_irqrestore_rcu_node(ssp->srcu_sup, flags); 1074 + raw_spin_unlock_irqrestore_rcu_node(ssp->srcu_sup, flags); 1047 1075 } 1048 1076 1049 1077 /* ··· 1081 1109 for (snp = snp_leaf; snp != NULL; snp = snp->srcu_parent) { 1082 1110 if (WARN_ON_ONCE(rcu_seq_done(&sup->srcu_gp_seq, s)) && snp != snp_leaf) 1083 1111 return; /* GP already done and CBs recorded. */ 1084 - spin_lock_irqsave_rcu_node(snp, flags); 1112 + raw_spin_lock_irqsave_rcu_node(snp, flags); 1085 1113 snp_seq = snp->srcu_have_cbs[idx]; 1086 1114 if (!srcu_invl_snp_seq(snp_seq) && ULONG_CMP_GE(snp_seq, s)) { 1087 1115 if (snp == snp_leaf && snp_seq == s) 1088 1116 snp->srcu_data_have_cbs[idx] |= sdp->grpmask; 1089 - spin_unlock_irqrestore_rcu_node(snp, flags); 1117 + raw_spin_unlock_irqrestore_rcu_node(snp, flags); 1090 1118 if (snp == snp_leaf && snp_seq != s) { 1091 1119 srcu_schedule_cbs_sdp(sdp, do_norm ? SRCU_INTERVAL : 0); 1092 1120 return; ··· 1101 1129 sgsne = snp->srcu_gp_seq_needed_exp; 1102 1130 if (!do_norm && (srcu_invl_snp_seq(sgsne) || ULONG_CMP_LT(sgsne, s))) 1103 1131 WRITE_ONCE(snp->srcu_gp_seq_needed_exp, s); 1104 - spin_unlock_irqrestore_rcu_node(snp, flags); 1132 + raw_spin_unlock_irqrestore_rcu_node(snp, flags); 1105 1133 } 1106 1134 1107 1135 /* Top of tree, must ensure the grace period will be started. */ 1108 - spin_lock_irqsave_ssp_contention(ssp, &flags); 1136 + raw_spin_lock_irqsave_ssp_contention(ssp, &flags); 1109 1137 if (ULONG_CMP_LT(sup->srcu_gp_seq_needed, s)) { 1110 1138 /* 1111 1139 * Record need for grace period s. Pair with load ··· 1126 1154 // it isn't. And it does not have to be. After all, it 1127 1155 // can only be executed during early boot when there is only 1128 1156 // the one boot CPU running with interrupts still disabled. 1157 + // 1158 + // Use an irq_work here to avoid acquiring runqueue lock with 1159 + // srcu rcu_node::lock held. BPF instrument could introduce the 1160 + // opposite dependency, hence we need to break the possible 1161 + // locking dependency here. 
1129 1162 if (likely(srcu_init_done)) 1130 - queue_delayed_work(rcu_gp_wq, &sup->work, 1131 - !!srcu_get_delay(ssp)); 1163 + irq_work_queue(&sup->irq_work); 1132 1164 else if (list_empty(&sup->work.work.entry)) 1133 1165 list_add(&sup->work.work.entry, &srcu_boot_list); 1134 1166 } 1135 - spin_unlock_irqrestore_rcu_node(sup, flags); 1167 + raw_spin_unlock_irqrestore_rcu_node(sup, flags); 1136 1168 } 1137 1169 1138 1170 /* ··· 1148 1172 { 1149 1173 unsigned long curdelay; 1150 1174 1151 - spin_lock_irq_rcu_node(ssp->srcu_sup); 1175 + raw_spin_lock_irq_rcu_node(ssp->srcu_sup); 1152 1176 curdelay = !srcu_get_delay(ssp); 1153 - spin_unlock_irq_rcu_node(ssp->srcu_sup); 1177 + raw_spin_unlock_irq_rcu_node(ssp->srcu_sup); 1154 1178 1155 1179 for (;;) { 1156 1180 if (srcu_readers_active_idx_check(ssp, idx)) ··· 1261 1285 return false; 1262 1286 /* If the local srcu_data structure has callbacks, not idle. */ 1263 1287 sdp = raw_cpu_ptr(ssp->sda); 1264 - spin_lock_irqsave_rcu_node(sdp, flags); 1288 + raw_spin_lock_irqsave_rcu_node(sdp, flags); 1265 1289 if (rcu_segcblist_pend_cbs(&sdp->srcu_cblist)) { 1266 - spin_unlock_irqrestore_rcu_node(sdp, flags); 1290 + raw_spin_unlock_irqrestore_rcu_node(sdp, flags); 1267 1291 return false; /* Callbacks already present, so not idle. */ 1268 1292 } 1269 - spin_unlock_irqrestore_rcu_node(sdp, flags); 1293 + raw_spin_unlock_irqrestore_rcu_node(sdp, flags); 1270 1294 1271 1295 /* 1272 1296 * No local callbacks, so probabilistically probe global state. ··· 1326 1350 sdp = per_cpu_ptr(ssp->sda, get_boot_cpu_id()); 1327 1351 else 1328 1352 sdp = raw_cpu_ptr(ssp->sda); 1329 - spin_lock_irqsave_sdp_contention(sdp, &flags); 1353 + raw_spin_lock_irqsave_sdp_contention(sdp, &flags); 1330 1354 if (rhp) 1331 1355 rcu_segcblist_enqueue(&sdp->srcu_cblist, rhp); 1332 1356 /* ··· 1386 1410 sdp->srcu_gp_seq_needed_exp = s; 1387 1411 needexp = true; 1388 1412 } 1389 - spin_unlock_irqrestore_rcu_node(sdp, flags); 1413 + raw_spin_unlock_irqrestore_rcu_node(sdp, flags); 1390 1414 1391 1415 /* Ensure that snp node tree is fully initialized before traversing it */ 1392 1416 if (ss_state < SRCU_SIZE_WAIT_BARRIER) ··· 1498 1522 1499 1523 /* 1500 1524 * Make sure that later code is ordered after the SRCU grace 1501 - * period. This pairs with the spin_lock_irq_rcu_node() 1525 + * period. This pairs with the raw_spin_lock_irq_rcu_node() 1502 1526 * in srcu_invoke_callbacks(). Unlike Tree RCU, this is needed 1503 1527 * because the current CPU might have been totally uninvolved with 1504 1528 * (and thus unordered against) that grace period. 
··· 1677 1701 */ 1678 1702 static void srcu_barrier_one_cpu(struct srcu_struct *ssp, struct srcu_data *sdp) 1679 1703 { 1680 - spin_lock_irq_rcu_node(sdp); 1704 + raw_spin_lock_irq_rcu_node(sdp); 1681 1705 atomic_inc(&ssp->srcu_sup->srcu_barrier_cpu_cnt); 1682 1706 sdp->srcu_barrier_head.func = srcu_barrier_cb; 1683 1707 debug_rcu_head_queue(&sdp->srcu_barrier_head); ··· 1686 1710 debug_rcu_head_unqueue(&sdp->srcu_barrier_head); 1687 1711 atomic_dec(&ssp->srcu_sup->srcu_barrier_cpu_cnt); 1688 1712 } 1689 - spin_unlock_irq_rcu_node(sdp); 1713 + raw_spin_unlock_irq_rcu_node(sdp); 1690 1714 } 1691 1715 1692 1716 /** ··· 1737 1761 bool needcb = false; 1738 1762 struct srcu_data *sdp = container_of(rhp, struct srcu_data, srcu_ec_head); 1739 1763 1740 - spin_lock_irqsave_sdp_contention(sdp, &flags); 1764 + raw_spin_lock_irqsave_sdp_contention(sdp, &flags); 1741 1765 if (sdp->srcu_ec_state == SRCU_EC_IDLE) { 1742 1766 WARN_ON_ONCE(1); 1743 1767 } else if (sdp->srcu_ec_state == SRCU_EC_PENDING) { ··· 1747 1771 sdp->srcu_ec_state = SRCU_EC_PENDING; 1748 1772 needcb = true; 1749 1773 } 1750 - spin_unlock_irqrestore_rcu_node(sdp, flags); 1774 + raw_spin_unlock_irqrestore_rcu_node(sdp, flags); 1751 1775 // If needed, requeue ourselves as an expedited SRCU callback. 1752 1776 if (needcb) 1753 1777 __call_srcu(sdp->ssp, &sdp->srcu_ec_head, srcu_expedite_current_cb, false); ··· 1771 1795 1772 1796 migrate_disable(); 1773 1797 sdp = this_cpu_ptr(ssp->sda); 1774 - spin_lock_irqsave_sdp_contention(sdp, &flags); 1798 + raw_spin_lock_irqsave_sdp_contention(sdp, &flags); 1775 1799 if (sdp->srcu_ec_state == SRCU_EC_IDLE) { 1776 1800 sdp->srcu_ec_state = SRCU_EC_PENDING; 1777 1801 needcb = true; ··· 1780 1804 } else { 1781 1805 WARN_ON_ONCE(sdp->srcu_ec_state != SRCU_EC_REPOST); 1782 1806 } 1783 - spin_unlock_irqrestore_rcu_node(sdp, flags); 1807 + raw_spin_unlock_irqrestore_rcu_node(sdp, flags); 1784 1808 // If needed, queue an expedited SRCU callback. 1785 1809 if (needcb) 1786 1810 __call_srcu(ssp, &sdp->srcu_ec_head, srcu_expedite_current_cb, false); ··· 1824 1848 */ 1825 1849 idx = rcu_seq_state(smp_load_acquire(&ssp->srcu_sup->srcu_gp_seq)); /* ^^^ */ 1826 1850 if (idx == SRCU_STATE_IDLE) { 1827 - spin_lock_irq_rcu_node(ssp->srcu_sup); 1851 + raw_spin_lock_irq_rcu_node(ssp->srcu_sup); 1828 1852 if (ULONG_CMP_GE(ssp->srcu_sup->srcu_gp_seq, ssp->srcu_sup->srcu_gp_seq_needed)) { 1829 1853 WARN_ON_ONCE(rcu_seq_state(ssp->srcu_sup->srcu_gp_seq)); 1830 - spin_unlock_irq_rcu_node(ssp->srcu_sup); 1854 + raw_spin_unlock_irq_rcu_node(ssp->srcu_sup); 1831 1855 mutex_unlock(&ssp->srcu_sup->srcu_gp_mutex); 1832 1856 return; 1833 1857 } 1834 1858 idx = rcu_seq_state(READ_ONCE(ssp->srcu_sup->srcu_gp_seq)); 1835 1859 if (idx == SRCU_STATE_IDLE) 1836 1860 srcu_gp_start(ssp); 1837 - spin_unlock_irq_rcu_node(ssp->srcu_sup); 1861 + raw_spin_unlock_irq_rcu_node(ssp->srcu_sup); 1838 1862 if (idx != SRCU_STATE_IDLE) { 1839 1863 mutex_unlock(&ssp->srcu_sup->srcu_gp_mutex); 1840 1864 return; /* Someone else started the grace period. */ ··· 1848 1872 return; /* readers present, retry later. 
*/ 1849 1873 } 1850 1874 srcu_flip(ssp); 1851 - spin_lock_irq_rcu_node(ssp->srcu_sup); 1875 + raw_spin_lock_irq_rcu_node(ssp->srcu_sup); 1852 1876 rcu_seq_set_state(&ssp->srcu_sup->srcu_gp_seq, SRCU_STATE_SCAN2); 1853 1877 ssp->srcu_sup->srcu_n_exp_nodelay = 0; 1854 - spin_unlock_irq_rcu_node(ssp->srcu_sup); 1878 + raw_spin_unlock_irq_rcu_node(ssp->srcu_sup); 1855 1879 } 1856 1880 1857 1881 if (rcu_seq_state(READ_ONCE(ssp->srcu_sup->srcu_gp_seq)) == SRCU_STATE_SCAN2) { ··· 1889 1913 1890 1914 ssp = sdp->ssp; 1891 1915 rcu_cblist_init(&ready_cbs); 1892 - spin_lock_irq_rcu_node(sdp); 1916 + raw_spin_lock_irq_rcu_node(sdp); 1893 1917 WARN_ON_ONCE(!rcu_segcblist_segempty(&sdp->srcu_cblist, RCU_NEXT_TAIL)); 1894 1918 rcu_segcblist_advance(&sdp->srcu_cblist, 1895 1919 rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq)); ··· 1900 1924 */ 1901 1925 if (sdp->srcu_cblist_invoking || 1902 1926 !rcu_segcblist_ready_cbs(&sdp->srcu_cblist)) { 1903 - spin_unlock_irq_rcu_node(sdp); 1927 + raw_spin_unlock_irq_rcu_node(sdp); 1904 1928 return; /* Someone else on the job or nothing to do. */ 1905 1929 } 1906 1930 ··· 1908 1932 sdp->srcu_cblist_invoking = true; 1909 1933 rcu_segcblist_extract_done_cbs(&sdp->srcu_cblist, &ready_cbs); 1910 1934 len = ready_cbs.len; 1911 - spin_unlock_irq_rcu_node(sdp); 1935 + raw_spin_unlock_irq_rcu_node(sdp); 1912 1936 rhp = rcu_cblist_dequeue(&ready_cbs); 1913 1937 for (; rhp != NULL; rhp = rcu_cblist_dequeue(&ready_cbs)) { 1914 1938 debug_rcu_head_unqueue(rhp); ··· 1923 1947 * Update counts, accelerate new callbacks, and if needed, 1924 1948 * schedule another round of callback invocation. 1925 1949 */ 1926 - spin_lock_irq_rcu_node(sdp); 1950 + raw_spin_lock_irq_rcu_node(sdp); 1927 1951 rcu_segcblist_add_len(&sdp->srcu_cblist, -len); 1928 1952 sdp->srcu_cblist_invoking = false; 1929 1953 more = rcu_segcblist_ready_cbs(&sdp->srcu_cblist); 1930 - spin_unlock_irq_rcu_node(sdp); 1954 + raw_spin_unlock_irq_rcu_node(sdp); 1931 1955 /* An SRCU barrier or callbacks from previous nesting work pending */ 1932 1956 if (more) 1933 1957 srcu_schedule_cbs_sdp(sdp, 0); ··· 1941 1965 { 1942 1966 bool pushgp = true; 1943 1967 1944 - spin_lock_irq_rcu_node(ssp->srcu_sup); 1968 + raw_spin_lock_irq_rcu_node(ssp->srcu_sup); 1945 1969 if (ULONG_CMP_GE(ssp->srcu_sup->srcu_gp_seq, ssp->srcu_sup->srcu_gp_seq_needed)) { 1946 1970 if (!WARN_ON_ONCE(rcu_seq_state(ssp->srcu_sup->srcu_gp_seq))) { 1947 1971 /* All requests fulfilled, time to go idle. */ ··· 1951 1975 /* Outstanding request and no GP. Start one. 
*/ 1952 1976 srcu_gp_start(ssp); 1953 1977 } 1954 - spin_unlock_irq_rcu_node(ssp->srcu_sup); 1978 + raw_spin_unlock_irq_rcu_node(ssp->srcu_sup); 1955 1979 1956 1980 if (pushgp) 1957 1981 queue_delayed_work(rcu_gp_wq, &ssp->srcu_sup->work, delay); ··· 1971 1995 ssp = sup->srcu_ssp; 1972 1996 1973 1997 srcu_advance_state(ssp); 1974 - spin_lock_irq_rcu_node(ssp->srcu_sup); 1998 + raw_spin_lock_irq_rcu_node(ssp->srcu_sup); 1975 1999 curdelay = srcu_get_delay(ssp); 1976 - spin_unlock_irq_rcu_node(ssp->srcu_sup); 2000 + raw_spin_unlock_irq_rcu_node(ssp->srcu_sup); 1977 2001 if (curdelay) { 1978 2002 WRITE_ONCE(sup->reschedule_count, 0); 1979 2003 } else { ··· 1989 2013 } 1990 2014 } 1991 2015 srcu_reschedule(ssp, curdelay); 2016 + } 2017 + 2018 + static void srcu_irq_work(struct irq_work *work) 2019 + { 2020 + struct srcu_struct *ssp; 2021 + struct srcu_usage *sup; 2022 + unsigned long delay; 2023 + unsigned long flags; 2024 + 2025 + sup = container_of(work, struct srcu_usage, irq_work); 2026 + ssp = sup->srcu_ssp; 2027 + 2028 + raw_spin_lock_irqsave_rcu_node(ssp->srcu_sup, flags); 2029 + delay = srcu_get_delay(ssp); 2030 + raw_spin_unlock_irqrestore_rcu_node(ssp->srcu_sup, flags); 2031 + 2032 + queue_delayed_work(rcu_gp_wq, &sup->work, !!delay); 1992 2033 } 1993 2034 1994 2035 void srcutorture_get_gp_data(struct srcu_struct *ssp, int *flags,
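Editor's note: the hunk above adds srcu_irq_work(), which uses the irq_work mechanism to defer queueing of delayed work. Purely as an illustration of that API (not of the SRCU design itself), a minimal sketch with hypothetical names::

    #include <linux/irq_work.h>
    #include <linux/workqueue.h>

    struct my_state {
            struct irq_work irq_work;
            struct delayed_work dwork;
    };

    static void my_dwork_fn(struct work_struct *work)
    {
            /* Runs later in process context. */
    }

    static void my_irq_work_fn(struct irq_work *work)
    {
            struct my_state *st = container_of(work, struct my_state, irq_work);

            /*
             * irq_work handlers run in hard interrupt context shortly after
             * being raised; queueing (delayed) work is safe from here.
             */
            queue_delayed_work(system_wq, &st->dwork, 0);
    }

    static void my_state_init(struct my_state *st)
    {
            init_irq_work(&st->irq_work, my_irq_work_fn);
            INIT_DELAYED_WORK(&st->dwork, my_dwork_fn);
    }

    /* Called from a context that cannot queue the work directly: */
    static void my_kick(struct my_state *st)
    {
            irq_work_queue(&st->irq_work);
    }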
+1 -1
kernel/sysctl.c
··· 1118 1118 unsigned long bitmap_len = table->maxlen; 1119 1119 unsigned long *bitmap = *(unsigned long **) table->data; 1120 1120 unsigned long *tmp_bitmap = NULL; 1121 - char tr_a[] = { '-', ',', '\n' }, tr_b[] = { ',', '\n', 0 }, c; 1121 + char tr_a[] = { '-', ',', '\n' }, tr_b[] = { ',', '\n', 0 }, c = 0; 1122 1122 1123 1123 if (!bitmap || !bitmap_len || !left || (*ppos && SYSCTL_KERN_TO_USER(dir))) { 1124 1124 *lenp = 0;
+1 -1
kernel/time/alarmtimer.c
··· 540 540 { 541 541 struct alarm *alarm = &timr->it.alarm.alarmtimer; 542 542 543 - return alarm_forward(alarm, timr->it_interval, now); 543 + return alarm_forward(alarm, now, timr->it_interval); 544 544 } 545 545 546 546 /**
+69 -16
kernel/trace/trace_events_trigger.c
··· 22 22 static struct llist_head trigger_data_free_list; 23 23 static DEFINE_MUTEX(trigger_data_kthread_mutex); 24 24 25 + static int trigger_kthread_fn(void *ignore); 26 + 27 + static void trigger_create_kthread_locked(void) 28 + { 29 + lockdep_assert_held(&trigger_data_kthread_mutex); 30 + 31 + if (!trigger_kthread) { 32 + struct task_struct *kthread; 33 + 34 + kthread = kthread_create(trigger_kthread_fn, NULL, 35 + "trigger_data_free"); 36 + if (!IS_ERR(kthread)) 37 + WRITE_ONCE(trigger_kthread, kthread); 38 + } 39 + } 40 + 41 + static void trigger_data_free_queued_locked(void) 42 + { 43 + struct event_trigger_data *data, *tmp; 44 + struct llist_node *llnodes; 45 + 46 + lockdep_assert_held(&trigger_data_kthread_mutex); 47 + 48 + llnodes = llist_del_all(&trigger_data_free_list); 49 + if (!llnodes) 50 + return; 51 + 52 + tracepoint_synchronize_unregister(); 53 + 54 + llist_for_each_entry_safe(data, tmp, llnodes, llist) 55 + kfree(data); 56 + } 57 + 25 58 /* Bulk garbage collection of event_trigger_data elements */ 26 59 static int trigger_kthread_fn(void *ignore) 27 60 { ··· 89 56 if (data->cmd_ops->set_filter) 90 57 data->cmd_ops->set_filter(NULL, data, NULL); 91 58 92 - if (unlikely(!trigger_kthread)) { 93 - guard(mutex)(&trigger_data_kthread_mutex); 94 - /* Check again after taking mutex */ 95 - if (!trigger_kthread) { 96 - struct task_struct *kthread; 97 - 98 - kthread = kthread_create(trigger_kthread_fn, NULL, 99 - "trigger_data_free"); 100 - if (!IS_ERR(kthread)) 101 - WRITE_ONCE(trigger_kthread, kthread); 102 - } 59 + /* 60 + * Boot-time trigger registration can fail before kthread creation 61 + * works. Keep the deferred-free semantics during boot and let late 62 + * init start the kthread to drain the list. 63 + */ 64 + if (system_state == SYSTEM_BOOTING && !trigger_kthread) { 65 + llist_add(&data->llist, &trigger_data_free_list); 66 + return; 103 67 } 104 68 105 - if (!trigger_kthread) { 106 - /* Do it the slow way */ 107 - tracepoint_synchronize_unregister(); 108 - kfree(data); 109 - return; 69 + if (unlikely(!trigger_kthread)) { 70 + guard(mutex)(&trigger_data_kthread_mutex); 71 + 72 + trigger_create_kthread_locked(); 73 + /* Check again after taking mutex */ 74 + if (!trigger_kthread) { 75 + llist_add(&data->llist, &trigger_data_free_list); 76 + /* Drain the queued frees synchronously if creation failed. */ 77 + trigger_data_free_queued_locked(); 78 + return; 79 + } 110 80 } 111 81 112 82 llist_add(&data->llist, &trigger_data_free_list); 113 83 wake_up_process(trigger_kthread); 114 84 } 85 + 86 + static int __init trigger_data_free_init(void) 87 + { 88 + guard(mutex)(&trigger_data_kthread_mutex); 89 + 90 + if (llist_empty(&trigger_data_free_list)) 91 + return 0; 92 + 93 + trigger_create_kthread_locked(); 94 + if (trigger_kthread) 95 + wake_up_process(trigger_kthread); 96 + else 97 + trigger_data_free_queued_locked(); 98 + 99 + return 0; 100 + } 101 + late_initcall(trigger_data_free_init); 115 102 116 103 static inline void data_ops_trigger(struct event_trigger_data *data, 117 104 struct trace_buffer *buffer, void *rec,
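Editor's note: the new helpers above defer freeing of event_trigger_data through a lockless llist that is drained either by a kthread or synchronously when the kthread cannot be created. As a generic illustration of the llist API only, a minimal sketch with hypothetical names::

    #include <linux/llist.h>
    #include <linux/slab.h>

    struct my_obj {
            struct llist_node llist;
            /* payload ... */
    };

    static LLIST_HEAD(my_free_list);

    static void my_defer_free(struct my_obj *obj)
    {
            /* Lockless; safe to call from any context. */
            llist_add(&obj->llist, &my_free_list);
    }

    static void my_drain(void)
    {
            struct llist_node *nodes = llist_del_all(&my_free_list);
            struct my_obj *obj, *tmp;

            llist_for_each_entry_safe(obj, tmp, nodes, llist)
                    kfree(obj);
    }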
+5 -5
kernel/trace/trace_osnoise.c
··· 2073 2073 if (!osnoise_has_registered_instances()) 2074 2074 return; 2075 2075 2076 - guard(mutex)(&interface_lock); 2077 2076 guard(cpus_read_lock)(); 2077 + guard(mutex)(&interface_lock); 2078 2078 2079 2079 if (!cpu_online(cpu)) 2080 2080 return; ··· 2237 2237 if (running) 2238 2238 stop_per_cpu_kthreads(); 2239 2239 2240 - mutex_lock(&interface_lock); 2241 2240 /* 2242 2241 * avoid CPU hotplug operations that might read options. 2243 2242 */ 2244 2243 cpus_read_lock(); 2244 + mutex_lock(&interface_lock); 2245 2245 2246 2246 retval = cnt; 2247 2247 ··· 2257 2257 clear_bit(option, &osnoise_options); 2258 2258 } 2259 2259 2260 - cpus_read_unlock(); 2261 2260 mutex_unlock(&interface_lock); 2261 + cpus_read_unlock(); 2262 2262 2263 2263 if (running) 2264 2264 start_per_cpu_kthreads(); ··· 2345 2345 if (running) 2346 2346 stop_per_cpu_kthreads(); 2347 2347 2348 - mutex_lock(&interface_lock); 2349 2348 /* 2350 2349 * osnoise_cpumask is read by CPU hotplug operations. 2351 2350 */ 2352 2351 cpus_read_lock(); 2352 + mutex_lock(&interface_lock); 2353 2353 2354 2354 cpumask_copy(&osnoise_cpumask, osnoise_cpumask_new); 2355 2355 2356 - cpus_read_unlock(); 2357 2356 mutex_unlock(&interface_lock); 2357 + cpus_read_unlock(); 2358 2358 2359 2359 if (running) 2360 2360 start_per_cpu_kthreads();
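Editor's note: the osnoise hunks above establish one consistent lock order, taking cpus_read_lock() before interface_lock and releasing it after. A generic sketch of the idea (hypothetical mutex name); every path must follow the same order to avoid an ABBA deadlock::

    #include <linux/cpu.h>
    #include <linux/mutex.h>

    static DEFINE_MUTEX(my_lock);

    static void my_update(void)
    {
            cpus_read_lock();       /* outer lock, same order everywhere */
            mutex_lock(&my_lock);

            /* ... modify state also read by CPU hotplug callbacks ... */

            mutex_unlock(&my_lock);
            cpus_read_unlock();
    }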
+2 -5
lib/bug.c
··· 173 173 return module_find_bug(bugaddr); 174 174 } 175 175 176 - __diag_push(); 177 - __diag_ignore(GCC, all, "-Wsuggest-attribute=format", 178 - "Not a valid __printf() conversion candidate."); 179 - static void __warn_printf(const char *fmt, struct pt_regs *regs) 176 + static __printf(1, 0) 177 + void __warn_printf(const char *fmt, struct pt_regs *regs) 180 178 { 181 179 if (!fmt) 182 180 return; ··· 193 195 194 196 printk("%s", fmt); 195 197 } 196 - __diag_pop(); 197 198 198 199 static enum bug_trap_type __report_bug(struct bug_entry *bug, unsigned long bugaddr, struct pt_regs *regs) 199 200 {
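Editor's note: the lib/bug.c hunk replaces the __diag machinery with a __printf() annotation. A rough sketch of how that attribute is commonly used (hypothetical helper): the first number names the format-string argument, the second the first variadic argument, or 0 when there is no argument list to check::

    #include <linux/printk.h>
    #include <linux/stdarg.h>

    static __printf(1, 2) void my_log(const char *fmt, ...)
    {
            va_list args;

            va_start(args, fmt);
            vprintk(fmt, args);
            va_end(args);
    }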
+8
mm/damon/core.c
··· 1252 1252 { 1253 1253 int err; 1254 1254 1255 + dst->maybe_corrupted = true; 1255 1256 if (!is_power_of_2(src->min_region_sz)) 1256 1257 return -EINVAL; 1257 1258 ··· 1278 1277 dst->addr_unit = src->addr_unit; 1279 1278 dst->min_region_sz = src->min_region_sz; 1280 1279 1280 + dst->maybe_corrupted = false; 1281 1281 return 0; 1282 1282 } 1283 1283 ··· 2680 2678 complete(&control->completion); 2681 2679 else if (control->canceled && control->dealloc_on_cancel) 2682 2680 kfree(control); 2681 + if (!cancel && ctx->maybe_corrupted) 2682 + break; 2683 2683 } 2684 2684 2685 2685 mutex_lock(&ctx->call_controls_lock); ··· 2711 2707 kdamond_usleep(min_wait_time); 2712 2708 2713 2709 kdamond_call(ctx, false); 2710 + if (ctx->maybe_corrupted) 2711 + return -EINVAL; 2714 2712 damos_walk_cancel(ctx); 2715 2713 } 2716 2714 return -EBUSY; ··· 2796 2790 * kdamond_merge_regions() if possible, to reduce overhead 2797 2791 */ 2798 2792 kdamond_call(ctx, false); 2793 + if (ctx->maybe_corrupted) 2794 + break; 2799 2795 if (!list_empty(&ctx->schemes)) 2800 2796 kdamond_apply_schemes(ctx); 2801 2797 else
+50 -3
mm/damon/stat.c
··· 145 145 return 0; 146 146 } 147 147 148 + struct damon_stat_system_ram_range_walk_arg { 149 + bool walked; 150 + struct resource res; 151 + }; 152 + 153 + static int damon_stat_system_ram_walk_fn(struct resource *res, void *arg) 154 + { 155 + struct damon_stat_system_ram_range_walk_arg *a = arg; 156 + 157 + if (!a->walked) { 158 + a->walked = true; 159 + a->res.start = res->start; 160 + } 161 + a->res.end = res->end; 162 + return 0; 163 + } 164 + 165 + static unsigned long damon_stat_res_to_core_addr(resource_size_t ra, 166 + unsigned long addr_unit) 167 + { 168 + /* 169 + * Use div_u64() for avoiding linking errors related with __udivdi3, 170 + * __aeabi_uldivmod, or similar problems. This should also improve the 171 + * performance optimization (read div_u64() comment for the detail). 172 + */ 173 + if (sizeof(ra) == 8 && sizeof(addr_unit) == 4) 174 + return div_u64(ra, addr_unit); 175 + return ra / addr_unit; 176 + } 177 + 178 + static int damon_stat_set_monitoring_region(struct damon_target *t, 179 + unsigned long addr_unit, unsigned long min_region_sz) 180 + { 181 + struct damon_addr_range addr_range; 182 + struct damon_stat_system_ram_range_walk_arg arg = {}; 183 + 184 + walk_system_ram_res(0, -1, &arg, damon_stat_system_ram_walk_fn); 185 + if (!arg.walked) 186 + return -EINVAL; 187 + addr_range.start = damon_stat_res_to_core_addr( 188 + arg.res.start, addr_unit); 189 + addr_range.end = damon_stat_res_to_core_addr( 190 + arg.res.end + 1, addr_unit); 191 + if (addr_range.end <= addr_range.start) 192 + return -EINVAL; 193 + return damon_set_regions(t, &addr_range, 1, min_region_sz); 194 + } 195 + 148 196 static struct damon_ctx *damon_stat_build_ctx(void) 149 197 { 150 198 struct damon_ctx *ctx; 151 199 struct damon_attrs attrs; 152 200 struct damon_target *target; 153 - unsigned long start = 0, end = 0; 154 201 155 202 ctx = damon_new_ctx(); 156 203 if (!ctx) ··· 227 180 if (!target) 228 181 goto free_out; 229 182 damon_add_target(ctx, target); 230 - if (damon_set_region_biggest_system_ram_default(target, &start, &end, 231 - ctx->min_region_sz)) 183 + if (damon_stat_set_monitoring_region(target, ctx->addr_unit, 184 + ctx->min_region_sz)) 232 185 goto free_out; 233 186 return ctx; 234 187 free_out:
+9 -1
mm/damon/sysfs.c
··· 1524 1524 if (IS_ERR(param_ctx)) 1525 1525 return PTR_ERR(param_ctx); 1526 1526 test_ctx = damon_sysfs_new_test_ctx(kdamond->damon_ctx); 1527 - if (!test_ctx) 1527 + if (!test_ctx) { 1528 + damon_destroy_ctx(param_ctx); 1528 1529 return -ENOMEM; 1530 + } 1529 1531 err = damon_commit_ctx(test_ctx, param_ctx); 1530 1532 if (err) 1531 1533 goto out; ··· 1620 1618 1621 1619 if (!mutex_trylock(&damon_sysfs_lock)) 1622 1620 return 0; 1621 + if (sysfs_kdamond->contexts->nr != 1) 1622 + goto out; 1623 1623 damon_sysfs_upd_tuned_intervals(sysfs_kdamond); 1624 1624 damon_sysfs_upd_schemes_stats(sysfs_kdamond); 1625 1625 damon_sysfs_upd_schemes_effective_quotas(sysfs_kdamond); 1626 + out: 1626 1627 mutex_unlock(&damon_sysfs_lock); 1627 1628 return 0; 1628 1629 } ··· 1752 1747 static int damon_sysfs_handle_cmd(enum damon_sysfs_cmd cmd, 1753 1748 struct damon_sysfs_kdamond *kdamond) 1754 1749 { 1750 + if (cmd != DAMON_SYSFS_CMD_OFF && kdamond->contexts->nr != 1) 1751 + return -EINVAL; 1752 + 1755 1753 switch (cmd) { 1756 1754 case DAMON_SYSFS_CMD_ON: 1757 1755 return damon_sysfs_turn_damon_on(kdamond);
+2 -2
mm/hmm.c
··· 778 778 struct page *page = hmm_pfn_to_page(pfns[idx]); 779 779 phys_addr_t paddr = hmm_pfn_to_phys(pfns[idx]); 780 780 size_t offset = idx * map->dma_entry_size; 781 - unsigned long attrs = 0; 781 + unsigned long attrs = DMA_ATTR_REQUIRE_COHERENT; 782 782 dma_addr_t dma_addr; 783 783 int ret; 784 784 ··· 871 871 struct dma_iova_state *state = &map->state; 872 872 dma_addr_t *dma_addrs = map->dma_list; 873 873 unsigned long *pfns = map->pfn_list; 874 - unsigned long attrs = 0; 874 + unsigned long attrs = DMA_ATTR_REQUIRE_COHERENT; 875 875 876 876 if ((pfns[idx] & valid_dma) != valid_dma) 877 877 return false;
+15 -3
mm/memory.c
··· 6815 6815 6816 6816 pudp = pud_offset(p4dp, address); 6817 6817 pud = pudp_get(pudp); 6818 - if (pud_none(pud)) 6818 + if (!pud_present(pud)) 6819 6819 goto out; 6820 6820 if (pud_leaf(pud)) { 6821 6821 lock = pud_lock(mm, pudp); 6822 - if (!unlikely(pud_leaf(pud))) { 6822 + pud = pudp_get(pudp); 6823 + 6824 + if (unlikely(!pud_present(pud))) { 6825 + spin_unlock(lock); 6826 + goto out; 6827 + } else if (unlikely(!pud_leaf(pud))) { 6823 6828 spin_unlock(lock); 6824 6829 goto retry; 6825 6830 } ··· 6836 6831 6837 6832 pmdp = pmd_offset(pudp, address); 6838 6833 pmd = pmdp_get_lockless(pmdp); 6834 + if (!pmd_present(pmd)) 6835 + goto out; 6839 6836 if (pmd_leaf(pmd)) { 6840 6837 lock = pmd_lock(mm, pmdp); 6841 - if (!unlikely(pmd_leaf(pmd))) { 6838 + pmd = pmdp_get(pmdp); 6839 + 6840 + if (unlikely(!pmd_present(pmd))) { 6841 + spin_unlock(lock); 6842 + goto out; 6843 + } else if (unlikely(!pmd_leaf(pmd))) { 6842 6844 spin_unlock(lock); 6843 6845 goto retry; 6844 6846 }
+8 -2
mm/mempolicy.c
··· 487 487 { 488 488 if (!atomic_dec_and_test(&pol->refcnt)) 489 489 return; 490 - kmem_cache_free(policy_cache, pol); 490 + /* 491 + * Required to allow mmap_lock_speculative*() access, see for example 492 + * futex_key_to_node_opt(). All accesses are serialized by mmap_lock, 493 + * however the speculative lock section unbound by the normal lock 494 + * boundaries, requiring RCU freeing. 495 + */ 496 + kfree_rcu(pol, rcu); 491 497 } 492 498 EXPORT_SYMBOL_FOR_MODULES(__mpol_put, "kvm"); 493 499 ··· 1026 1020 } 1027 1021 1028 1022 old = vma->vm_policy; 1029 - vma->vm_policy = new; /* protected by mmap_lock */ 1023 + WRITE_ONCE(vma->vm_policy, new); /* protected by mmap_lock */ 1030 1024 mpol_put(old); 1031 1025 1032 1026 return 0;
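Editor's note: the mempolicy hunk above switches to kfree_rcu() so that speculative, RCU-protected readers cannot observe freed memory. A minimal sketch of that pattern, assuming a hypothetical object that embeds a struct rcu_head named rcu::

    #include <linux/rcupdate.h>
    #include <linux/slab.h>

    struct my_policy {
            struct rcu_head rcu;
            int data;
    };

    static void my_policy_free(struct my_policy *pol)
    {
            /*
             * Readers inside rcu_read_lock() may still hold a pointer;
             * the actual kfree() is deferred until a grace period elapses.
             */
            kfree_rcu(pol, rcu);
    }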
+1 -2
mm/mseal.c
··· 56 56 unsigned long start, unsigned long end) 57 57 { 58 58 struct vm_area_struct *vma, *prev; 59 - unsigned long curr_start = start; 60 59 VMA_ITERATOR(vmi, mm, start); 61 60 62 61 /* We know there are no gaps so this will be non-NULL. */ ··· 65 66 prev = vma; 66 67 67 68 for_each_vma_range(vmi, vma, end) { 69 + const unsigned long curr_start = MAX(vma->vm_start, start); 68 70 const unsigned long curr_end = MIN(vma->vm_end, end); 69 71 70 72 if (!(vma->vm_flags & VM_SEALED)) { ··· 79 79 } 80 80 81 81 prev = vma; 82 - curr_start = curr_end; 83 82 } 84 83 85 84 return 0;
+22 -3
mm/pagewalk.c
··· 97 97 static int walk_pmd_range(pud_t *pud, unsigned long addr, unsigned long end, 98 98 struct mm_walk *walk) 99 99 { 100 + pud_t pudval = pudp_get(pud); 100 101 pmd_t *pmd; 101 102 unsigned long next; 102 103 const struct mm_walk_ops *ops = walk->ops; ··· 105 104 bool has_install = ops->install_pte; 106 105 int err = 0; 107 106 int depth = real_depth(3); 107 + 108 + /* 109 + * For PTE handling, pte_offset_map_lock() takes care of checking 110 + * whether there actually is a page table. But it also has to be 111 + * very careful about concurrent page table reclaim. 112 + * 113 + * Similarly, we have to be careful here - a PUD entry that points 114 + * to a PMD table cannot go away, so we can just walk it. But if 115 + * it's something else, we need to ensure we didn't race something, 116 + * so need to retry. 117 + * 118 + * A pertinent example of this is a PUD refault after PUD split - 119 + * we will need to split again or risk accessing invalid memory. 120 + */ 121 + if (!pud_present(pudval) || pud_leaf(pudval)) { 122 + walk->action = ACTION_AGAIN; 123 + return 0; 124 + } 108 125 109 126 pmd = pmd_offset(pud, addr); 110 127 do { ··· 237 218 else if (pud_leaf(*pud) || !pud_present(*pud)) 238 219 continue; /* Nothing to do. */ 239 220 240 - if (pud_none(*pud)) 241 - goto again; 242 - 243 221 err = walk_pmd_range(pud, addr, next, walk); 244 222 if (err) 245 223 break; 224 + 225 + if (walk->action == ACTION_AGAIN) 226 + goto again; 246 227 } while (pud++, addr = next, addr != end); 247 228 248 229 return err;
+7
mm/rmap.c
··· 457 457 list_del(&avc->same_vma); 458 458 anon_vma_chain_free(avc); 459 459 } 460 + 461 + /* 462 + * The anon_vma assigned to this VMA is no longer valid, as we were not 463 + * able to correctly clone AVC state. Avoid inconsistent anon_vma tree 464 + * state by resetting. 465 + */ 466 + vma->anon_vma = NULL; 460 467 } 461 468 462 469 /**
+4 -5
mm/swap_state.c
··· 494 494 495 495 __folio_set_locked(folio); 496 496 __folio_set_swapbacked(folio); 497 + 498 + if (!charged && mem_cgroup_swapin_charge_folio(folio, NULL, gfp, entry)) 499 + goto failed; 500 + 497 501 for (;;) { 498 502 ret = swap_cache_add_folio(folio, entry, &shadow); 499 503 if (!ret) ··· 516 512 swapcache = swap_cache_get_folio(entry); 517 513 if (swapcache) 518 514 goto failed; 519 - } 520 - 521 - if (!charged && mem_cgroup_swapin_charge_folio(folio, NULL, gfp, entry)) { 522 - swap_cache_del_folio(folio); 523 - goto failed; 524 515 } 525 516 526 517 memcg1_swapin(entry, folio_nr_pages(folio));
+7 -1
mm/zswap.c
··· 942 942 943 943 /* zswap entries of length PAGE_SIZE are not compressed. */ 944 944 if (entry->length == PAGE_SIZE) { 945 + void *dst; 946 + 945 947 WARN_ON_ONCE(input->length != PAGE_SIZE); 946 - memcpy_from_sglist(kmap_local_folio(folio, 0), input, 0, PAGE_SIZE); 948 + 949 + dst = kmap_local_folio(folio, 0); 950 + memcpy_from_sglist(dst, input, 0, PAGE_SIZE); 947 951 dlen = PAGE_SIZE; 952 + kunmap_local(dst); 953 + flush_dcache_folio(folio); 948 954 } else { 949 955 sg_init_table(&output, 1); 950 956 sg_set_folio(&output, folio, PAGE_SIZE, 0);
+1 -1
net/bluetooth/hci_conn.c
··· 3095 3095 * hci_connect_le serializes the connection attempts so only one 3096 3096 * connection can be in BT_CONNECT at time. 3097 3097 */ 3098 - if (conn->state == BT_CONNECT && hdev->req_status == HCI_REQ_PEND) { 3098 + if (conn->state == BT_CONNECT && READ_ONCE(hdev->req_status) == HCI_REQ_PEND) { 3099 3099 switch (hci_skb_event(hdev->sent_cmd)) { 3100 3100 case HCI_EV_CONN_COMPLETE: 3101 3101 case HCI_EV_LE_CONN_COMPLETE:
+1 -1
net/bluetooth/hci_core.c
··· 4126 4126 kfree_skb(skb); 4127 4127 } 4128 4128 4129 - if (hdev->req_status == HCI_REQ_PEND && 4129 + if (READ_ONCE(hdev->req_status) == HCI_REQ_PEND && 4130 4130 !hci_dev_test_and_set_flag(hdev, HCI_CMD_PENDING)) { 4131 4131 kfree_skb(hdev->req_skb); 4132 4132 hdev->req_skb = skb_clone(hdev->sent_cmd, GFP_KERNEL);
+10 -10
net/bluetooth/hci_sync.c
··· 25 25 { 26 26 bt_dev_dbg(hdev, "result 0x%2.2x", result); 27 27 28 - if (hdev->req_status != HCI_REQ_PEND) 28 + if (READ_ONCE(hdev->req_status) != HCI_REQ_PEND) 29 29 return; 30 30 31 31 hdev->req_result = result; 32 - hdev->req_status = HCI_REQ_DONE; 32 + WRITE_ONCE(hdev->req_status, HCI_REQ_DONE); 33 33 34 34 /* Free the request command so it is not used as response */ 35 35 kfree_skb(hdev->req_skb); ··· 167 167 168 168 hci_cmd_sync_add(&req, opcode, plen, param, event, sk); 169 169 170 - hdev->req_status = HCI_REQ_PEND; 170 + WRITE_ONCE(hdev->req_status, HCI_REQ_PEND); 171 171 172 172 err = hci_req_sync_run(&req); 173 173 if (err < 0) 174 174 return ERR_PTR(err); 175 175 176 176 err = wait_event_interruptible_timeout(hdev->req_wait_q, 177 - hdev->req_status != HCI_REQ_PEND, 177 + READ_ONCE(hdev->req_status) != HCI_REQ_PEND, 178 178 timeout); 179 179 180 180 if (err == -ERESTARTSYS) 181 181 return ERR_PTR(-EINTR); 182 182 183 - switch (hdev->req_status) { 183 + switch (READ_ONCE(hdev->req_status)) { 184 184 case HCI_REQ_DONE: 185 185 err = -bt_to_errno(hdev->req_result); 186 186 break; ··· 194 194 break; 195 195 } 196 196 197 - hdev->req_status = 0; 197 + WRITE_ONCE(hdev->req_status, 0); 198 198 hdev->req_result = 0; 199 199 skb = hdev->req_rsp; 200 200 hdev->req_rsp = NULL; ··· 665 665 { 666 666 bt_dev_dbg(hdev, "err 0x%2.2x", err); 667 667 668 - if (hdev->req_status == HCI_REQ_PEND) { 668 + if (READ_ONCE(hdev->req_status) == HCI_REQ_PEND) { 669 669 hdev->req_result = err; 670 - hdev->req_status = HCI_REQ_CANCELED; 670 + WRITE_ONCE(hdev->req_status, HCI_REQ_CANCELED); 671 671 672 672 queue_work(hdev->workqueue, &hdev->cmd_sync_cancel_work); 673 673 } ··· 683 683 { 684 684 bt_dev_dbg(hdev, "err 0x%2.2x", err); 685 685 686 - if (hdev->req_status == HCI_REQ_PEND) { 686 + if (READ_ONCE(hdev->req_status) == HCI_REQ_PEND) { 687 687 /* req_result is __u32 so error must be positive to be properly 688 688 * propagated. 689 689 */ 690 690 hdev->req_result = err < 0 ? -err : err; 691 - hdev->req_status = HCI_REQ_CANCELED; 691 + WRITE_ONCE(hdev->req_status, HCI_REQ_CANCELED); 692 692 693 693 wake_up_interruptible(&hdev->req_wait_q); 694 694 }
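Editor's note: the hci_sync.c changes annotate the lockless accesses to hdev->req_status with READ_ONCE()/WRITE_ONCE(). A generic sketch of the underlying pattern with hypothetical names: a flag written by one context and polled by a waiter on a waitqueue::

    #include <linux/compiler.h>
    #include <linux/wait.h>

    static DECLARE_WAIT_QUEUE_HEAD(my_wait_q);
    static int my_status;

    static void my_complete(void)
    {
            WRITE_ONCE(my_status, 1);       /* pairs with READ_ONCE() below */
            wake_up_interruptible(&my_wait_q);
    }

    static long my_wait(long timeout)
    {
            return wait_event_interruptible_timeout(my_wait_q,
                                                    READ_ONCE(my_status) != 0,
                                                    timeout);
    }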
+53 -18
net/bluetooth/l2cap_core.c
··· 926 926 927 927 static int l2cap_get_ident(struct l2cap_conn *conn) 928 928 { 929 + u8 max; 930 + int ident; 931 + 929 932 /* LE link does not support tools like l2ping so use the full range */ 930 933 if (conn->hcon->type == LE_LINK) 931 - return ida_alloc_range(&conn->tx_ida, 1, 255, GFP_ATOMIC); 932 - 934 + max = 255; 933 935 /* Get next available identificator. 934 936 * 1 - 128 are used by kernel. 935 937 * 129 - 199 are reserved. 936 938 * 200 - 254 are used by utilities like l2ping, etc. 937 939 */ 938 - return ida_alloc_range(&conn->tx_ida, 1, 128, GFP_ATOMIC); 940 + else 941 + max = 128; 942 + 943 + /* Allocate ident using min as last used + 1 (cyclic) */ 944 + ident = ida_alloc_range(&conn->tx_ida, READ_ONCE(conn->tx_ident) + 1, 945 + max, GFP_ATOMIC); 946 + /* Force min 1 to start over */ 947 + if (ident <= 0) { 948 + ident = ida_alloc_range(&conn->tx_ida, 1, max, GFP_ATOMIC); 949 + if (ident <= 0) { 950 + /* If all idents are in use, log an error, this is 951 + * extremely unlikely to happen and would indicate a bug 952 + * in the code that idents are not being freed properly. 953 + */ 954 + BT_ERR("Unable to allocate ident: %d", ident); 955 + return 0; 956 + } 957 + } 958 + 959 + WRITE_ONCE(conn->tx_ident, ident); 960 + 961 + return ident; 939 962 } 940 963 941 964 static void l2cap_send_acl(struct l2cap_conn *conn, struct sk_buff *skb, ··· 1771 1748 1772 1749 BT_DBG("hcon %p conn %p, err %d", hcon, conn, err); 1773 1750 1751 + disable_delayed_work_sync(&conn->info_timer); 1752 + disable_delayed_work_sync(&conn->id_addr_timer); 1753 + 1774 1754 mutex_lock(&conn->lock); 1775 1755 1776 1756 kfree_skb(conn->rx_skb); ··· 1788 1762 cancel_work_sync(&conn->pending_rx_work); 1789 1763 1790 1764 ida_destroy(&conn->tx_ida); 1791 - 1792 - cancel_delayed_work_sync(&conn->id_addr_timer); 1793 1765 1794 1766 l2cap_unregister_all_users(conn); 1795 1767 ··· 1806 1782 l2cap_chan_unlock(chan); 1807 1783 l2cap_chan_put(chan); 1808 1784 } 1809 - 1810 - if (conn->info_state & L2CAP_INFO_FEAT_MASK_REQ_SENT) 1811 - cancel_delayed_work_sync(&conn->info_timer); 1812 1785 1813 1786 hci_chan_del(conn->hchan); 1814 1787 conn->hchan = NULL; ··· 2397 2376 2398 2377 /* Remote device may have requested smaller PDUs */ 2399 2378 pdu_len = min_t(size_t, pdu_len, chan->remote_mps); 2379 + 2380 + if (!pdu_len) 2381 + return -EINVAL; 2400 2382 2401 2383 if (len <= pdu_len) { 2402 2384 sar = L2CAP_SAR_UNSEGMENTED; ··· 4336 4312 if (test_bit(CONF_INPUT_DONE, &chan->conf_state)) { 4337 4313 set_default_fcs(chan); 4338 4314 4339 - if (chan->mode == L2CAP_MODE_ERTM || 4340 - chan->mode == L2CAP_MODE_STREAMING) 4341 - err = l2cap_ertm_init(chan); 4315 + if (chan->state != BT_CONNECTED) { 4316 + if (chan->mode == L2CAP_MODE_ERTM || 4317 + chan->mode == L2CAP_MODE_STREAMING) 4318 + err = l2cap_ertm_init(chan); 4342 4319 4343 - if (err < 0) 4344 - l2cap_send_disconn_req(chan, -err); 4345 - else 4346 - l2cap_chan_ready(chan); 4320 + if (err < 0) 4321 + l2cap_send_disconn_req(chan, -err); 4322 + else 4323 + l2cap_chan_ready(chan); 4324 + } 4347 4325 4348 4326 goto unlock; 4349 4327 } ··· 5107 5081 cmd_len -= sizeof(*req); 5108 5082 num_scid = cmd_len / sizeof(u16); 5109 5083 5110 - /* Always respond with the same number of scids as in the request */ 5111 - rsp_len = cmd_len; 5112 - 5113 5084 if (num_scid > L2CAP_ECRED_MAX_CID) { 5114 5085 result = L2CAP_CR_LE_INVALID_PARAMS; 5115 5086 goto response; 5116 5087 } 5088 + 5089 + /* Always respond with the same number of scids as in the request */ 5090 + rsp_len = cmd_len; 
5117 5091 5118 5092 mtu = __le16_to_cpu(req->mtu); 5119 5093 mps = __le16_to_cpu(req->mps); ··· 6633 6607 struct l2cap_le_credits pkt; 6634 6608 u16 return_credits = l2cap_le_rx_credits(chan); 6635 6609 6610 + if (chan->mode != L2CAP_MODE_LE_FLOWCTL && 6611 + chan->mode != L2CAP_MODE_EXT_FLOWCTL) 6612 + return; 6613 + 6636 6614 if (chan->rx_credits >= return_credits) 6637 6615 return; 6638 6616 ··· 6719 6689 6720 6690 if (!chan->sdu) { 6721 6691 u16 sdu_len; 6692 + 6693 + if (!pskb_may_pull(skb, L2CAP_SDULEN_SIZE)) { 6694 + err = -EINVAL; 6695 + goto failed; 6696 + } 6722 6697 6723 6698 sdu_len = get_unaligned_le16(skb->data); 6724 6699 skb_pull(skb, L2CAP_SDULEN_SIZE);
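Editor's note: the l2cap_get_ident() rework above hands out signalling idents cyclically from an IDA, starting after the last used value and wrapping back to 1. A simplified sketch of that allocation scheme with hypothetical names::

    #include <linux/gfp.h>
    #include <linux/idr.h>

    static DEFINE_IDA(my_ida);
    static u8 my_last_id;

    static int my_get_id(u8 max)
    {
            int id;

            /*
             * Prefer the range after the last id so values are handed out
             * cyclically; wrap back to 1 when that range is exhausted.
             */
            id = ida_alloc_range(&my_ida, my_last_id + 1, max, GFP_ATOMIC);
            if (id < 0)
                    id = ida_alloc_range(&my_ida, 1, max, GFP_ATOMIC);
            if (id < 0)
                    return 0;       /* every id currently in use */

            my_last_id = id;
            return id;
    }

    /* Each id must later be returned with ida_free(&my_ida, id). */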
+3
net/bluetooth/l2cap_sock.c
··· 1698 1698 struct sock *sk = chan->data; 1699 1699 struct sock *parent; 1700 1700 1701 + if (!sk) 1702 + return; 1703 + 1701 1704 lock_sock(sk); 1702 1705 1703 1706 parent = bt_sk(sk)->parent;
+1 -1
net/bluetooth/mgmt.c
··· 5355 5355 * hci_adv_monitors_clear is about to be called which will take care of 5356 5356 * freeing the adv_monitor instances. 5357 5357 */ 5358 - if (status == -ECANCELED && !mgmt_pending_valid(hdev, cmd)) 5358 + if (status == -ECANCELED || !mgmt_pending_valid(hdev, cmd)) 5359 5359 return; 5360 5360 5361 5361 monitor = cmd->user_data;
+7 -3
net/bluetooth/sco.c
··· 401 401 struct sock *sk; 402 402 403 403 sco_conn_lock(conn); 404 - sk = conn->sk; 404 + sk = sco_sock_hold(conn); 405 405 sco_conn_unlock(conn); 406 406 407 407 if (!sk) ··· 410 410 BT_DBG("sk %p len %u", sk, skb->len); 411 411 412 412 if (sk->sk_state != BT_CONNECTED) 413 - goto drop; 413 + goto drop_put; 414 414 415 - if (!sock_queue_rcv_skb(sk, skb)) 415 + if (!sock_queue_rcv_skb(sk, skb)) { 416 + sock_put(sk); 416 417 return; 418 + } 417 419 420 + drop_put: 421 + sock_put(sk); 418 422 drop: 419 423 kfree_skb(skb); 420 424 }
+2 -2
net/can/af_can.c
··· 469 469 470 470 rcv->can_id = can_id; 471 471 rcv->mask = mask; 472 - rcv->matches = 0; 472 + atomic_long_set(&rcv->matches, 0); 473 473 rcv->func = func; 474 474 rcv->data = data; 475 475 rcv->ident = ident; ··· 573 573 static inline void deliver(struct sk_buff *skb, struct receiver *rcv) 574 574 { 575 575 rcv->func(skb, rcv->data); 576 - rcv->matches++; 576 + atomic_long_inc(&rcv->matches); 577 577 } 578 578 579 579 static int can_rcv_filter(struct can_dev_rcv_lists *dev_rcv_lists, struct sk_buff *skb)
+1 -1
net/can/af_can.h
··· 52 52 struct hlist_node list; 53 53 canid_t can_id; 54 54 canid_t mask; 55 - unsigned long matches; 55 + atomic_long_t matches; 56 56 void (*func)(struct sk_buff *skb, void *data); 57 57 void *data; 58 58 char *ident;
+3 -3
net/can/gw.c
··· 375 375 return; 376 376 377 377 if (from <= to) { 378 - for (i = crc8->from_idx; i <= crc8->to_idx; i++) 378 + for (i = from; i <= to; i++) 379 379 crc = crc8->crctab[crc ^ cf->data[i]]; 380 380 } else { 381 - for (i = crc8->from_idx; i >= crc8->to_idx; i--) 381 + for (i = from; i >= to; i--) 382 382 crc = crc8->crctab[crc ^ cf->data[i]]; 383 383 } 384 384 ··· 397 397 break; 398 398 } 399 399 400 - cf->data[crc8->result_idx] = crc ^ crc8->final_xor_val; 400 + cf->data[res] = crc ^ crc8->final_xor_val; 401 401 } 402 402 403 403 static void cgw_csum_crc8_pos(struct canfd_frame *cf,
+18 -6
net/can/isotp.c
··· 1248 1248 so->ifindex = 0; 1249 1249 so->bound = 0; 1250 1250 1251 - if (so->rx.buf != so->rx.sbuf) 1252 - kfree(so->rx.buf); 1253 - 1254 - if (so->tx.buf != so->tx.sbuf) 1255 - kfree(so->tx.buf); 1256 - 1257 1251 sock_orphan(sk); 1258 1252 sock->sk = NULL; 1259 1253 ··· 1616 1622 return NOTIFY_DONE; 1617 1623 } 1618 1624 1625 + static void isotp_sock_destruct(struct sock *sk) 1626 + { 1627 + struct isotp_sock *so = isotp_sk(sk); 1628 + 1629 + /* do the standard CAN sock destruct work */ 1630 + can_sock_destruct(sk); 1631 + 1632 + /* free potential extended PDU buffers */ 1633 + if (so->rx.buf != so->rx.sbuf) 1634 + kfree(so->rx.buf); 1635 + 1636 + if (so->tx.buf != so->tx.sbuf) 1637 + kfree(so->tx.buf); 1638 + } 1639 + 1619 1640 static int isotp_init(struct sock *sk) 1620 1641 { 1621 1642 struct isotp_sock *so = isotp_sk(sk); ··· 1674 1665 spin_lock(&isotp_notifier_lock); 1675 1666 list_add_tail(&so->notifier, &isotp_notifier_list); 1676 1667 spin_unlock(&isotp_notifier_lock); 1668 + 1669 + /* re-assign default can_sock_destruct() reference */ 1670 + sk->sk_destruct = isotp_sock_destruct; 1677 1671 1678 1672 return 0; 1679 1673 }
+2 -1
net/can/proc.c
··· 196 196 " %-5s %03x %08x %pK %pK %8ld %s\n"; 197 197 198 198 seq_printf(m, fmt, DNAME(dev), r->can_id, r->mask, 199 - r->func, r->data, r->matches, r->ident); 199 + r->func, r->data, atomic_long_read(&r->matches), 200 + r->ident); 200 201 } 201 202 } 202 203
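Editor's note: the af_can changes convert the per-receiver 'matches' counter from a plain unsigned long to atomic_long_t, so increments from different CPUs are not lost and the procfs reader sees a consistent value. A minimal sketch of that conversion with hypothetical names::

    #include <linux/atomic.h>

    struct my_receiver {
            atomic_long_t matches;
    };

    static void my_receiver_init(struct my_receiver *rcv)
    {
            atomic_long_set(&rcv->matches, 0);
    }

    static void my_deliver(struct my_receiver *rcv)
    {
            atomic_long_inc(&rcv->matches); /* may run on several CPUs at once */
    }

    static long my_read(struct my_receiver *rcv)
    {
            return atomic_long_read(&rcv->matches);
    }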
+17 -5
net/core/dev.c
··· 3769 3769 return vlan_features_check(skb, features); 3770 3770 } 3771 3771 3772 + static bool skb_gso_has_extension_hdr(const struct sk_buff *skb) 3773 + { 3774 + if (!skb->encapsulation) 3775 + return ((skb_shinfo(skb)->gso_type & SKB_GSO_TCPV6 || 3776 + (skb_shinfo(skb)->gso_type & SKB_GSO_UDP_L4 && 3777 + vlan_get_protocol(skb) == htons(ETH_P_IPV6))) && 3778 + skb_transport_header_was_set(skb) && 3779 + skb_network_header_len(skb) != sizeof(struct ipv6hdr)); 3780 + else 3781 + return (!skb_inner_network_header_was_set(skb) || 3782 + ((skb_shinfo(skb)->gso_type & SKB_GSO_TCPV6 || 3783 + (skb_shinfo(skb)->gso_type & SKB_GSO_UDP_L4 && 3784 + inner_ip_hdr(skb)->version == 6)) && 3785 + skb_inner_network_header_len(skb) != sizeof(struct ipv6hdr))); 3786 + } 3787 + 3772 3788 static netdev_features_t gso_features_check(const struct sk_buff *skb, 3773 3789 struct net_device *dev, 3774 3790 netdev_features_t features) ··· 3832 3816 * so neither does TSO that depends on it. 3833 3817 */ 3834 3818 if (features & NETIF_F_IPV6_CSUM && 3835 - (skb_shinfo(skb)->gso_type & SKB_GSO_TCPV6 || 3836 - (skb_shinfo(skb)->gso_type & SKB_GSO_UDP_L4 && 3837 - vlan_get_protocol(skb) == htons(ETH_P_IPV6))) && 3838 - skb_transport_header_was_set(skb) && 3839 - skb_network_header_len(skb) != sizeof(struct ipv6hdr)) 3819 + skb_gso_has_extension_hdr(skb)) 3840 3820 features &= ~(NETIF_F_IPV6_CSUM | NETIF_F_TSO6 | NETIF_F_GSO_UDP_L4); 3841 3821 3842 3822 return features;
+25 -3
net/core/rtnetlink.c
··· 629 629 unlock: 630 630 mutex_unlock(&link_ops_mutex); 631 631 632 + if (err) 633 + cleanup_srcu_struct(&ops->srcu); 634 + 632 635 return err; 633 636 } 634 637 EXPORT_SYMBOL_GPL(rtnl_link_register); ··· 710 707 goto out; 711 708 712 709 ops = master_dev->rtnl_link_ops; 713 - if (!ops || !ops->get_slave_size) 710 + if (!ops) 711 + goto out; 712 + size += nla_total_size(strlen(ops->kind) + 1); /* IFLA_INFO_SLAVE_KIND */ 713 + if (!ops->get_slave_size) 714 714 goto out; 715 715 /* IFLA_INFO_SLAVE_DATA + nested data */ 716 - size = nla_total_size(sizeof(struct nlattr)) + 717 - ops->get_slave_size(master_dev, dev); 716 + size += nla_total_size(sizeof(struct nlattr)) + 717 + ops->get_slave_size(master_dev, dev); 718 718 719 719 out: 720 720 rcu_read_unlock(); ··· 1273 1267 return size; 1274 1268 } 1275 1269 1270 + static size_t rtnl_dev_parent_size(const struct net_device *dev) 1271 + { 1272 + size_t size = 0; 1273 + 1274 + /* IFLA_PARENT_DEV_NAME */ 1275 + if (dev->dev.parent) 1276 + size += nla_total_size(strlen(dev_name(dev->dev.parent)) + 1); 1277 + 1278 + /* IFLA_PARENT_DEV_BUS_NAME */ 1279 + if (dev->dev.parent && dev->dev.parent->bus) 1280 + size += nla_total_size(strlen(dev->dev.parent->bus->name) + 1); 1281 + 1282 + return size; 1283 + } 1284 + 1276 1285 static noinline size_t if_nlmsg_size(const struct net_device *dev, 1277 1286 u32 ext_filter_mask) 1278 1287 { ··· 1349 1328 + nla_total_size(8) /* IFLA_MAX_PACING_OFFLOAD_HORIZON */ 1350 1329 + nla_total_size(2) /* IFLA_HEADROOM */ 1351 1330 + nla_total_size(2) /* IFLA_TAILROOM */ 1331 + + rtnl_dev_parent_size(dev) 1352 1332 + 0; 1353 1333 1354 1334 if (!(ext_filter_mask & RTEXT_FILTER_SKIP_STATS))
+6 -3
net/ipv4/esp4.c
··· 235 235 xfrm_dev_resume(skb); 236 236 } else { 237 237 if (!err && 238 - x->encap && x->encap->encap_type == TCP_ENCAP_ESPINTCP) 239 - esp_output_tail_tcp(x, skb); 240 - else 238 + x->encap && x->encap->encap_type == TCP_ENCAP_ESPINTCP) { 239 + err = esp_output_tail_tcp(x, skb); 240 + if (err != -EINPROGRESS) 241 + kfree_skb(skb); 242 + } else { 241 243 xfrm_output_resume(skb_to_full_sk(skb), skb, err); 244 + } 242 245 } 243 246 } 244 247
+3 -17
net/ipv4/inet_connection_sock.c
··· 154 154 } 155 155 EXPORT_SYMBOL(inet_sk_get_local_port_range); 156 156 157 - static bool inet_use_bhash2_on_bind(const struct sock *sk) 158 - { 159 - #if IS_ENABLED(CONFIG_IPV6) 160 - if (sk->sk_family == AF_INET6) { 161 - if (ipv6_addr_any(&sk->sk_v6_rcv_saddr)) 162 - return false; 163 - 164 - if (!ipv6_addr_v4mapped(&sk->sk_v6_rcv_saddr)) 165 - return true; 166 - } 167 - #endif 168 - return sk->sk_rcv_saddr != htonl(INADDR_ANY); 169 - } 170 - 171 157 static bool inet_bind_conflict(const struct sock *sk, struct sock *sk2, 172 158 kuid_t uid, bool relax, 173 159 bool reuseport_cb_ok, bool reuseport_ok) ··· 245 259 * checks separately because their spinlocks have to be acquired/released 246 260 * independently of each other, to prevent possible deadlocks 247 261 */ 248 - if (inet_use_bhash2_on_bind(sk)) 262 + if (inet_use_hash2_on_bind(sk)) 249 263 return tb2 && inet_bhash2_conflict(sk, tb2, uid, relax, 250 264 reuseport_cb_ok, reuseport_ok); 251 265 ··· 362 376 head = &hinfo->bhash[inet_bhashfn(net, port, 363 377 hinfo->bhash_size)]; 364 378 spin_lock_bh(&head->lock); 365 - if (inet_use_bhash2_on_bind(sk)) { 379 + if (inet_use_hash2_on_bind(sk)) { 366 380 if (inet_bhash2_addr_any_conflict(sk, port, l3mdev, relax, false)) 367 381 goto next_port; 368 382 } ··· 548 562 check_bind_conflict = false; 549 563 } 550 564 551 - if (check_bind_conflict && inet_use_bhash2_on_bind(sk)) { 565 + if (check_bind_conflict && inet_use_hash2_on_bind(sk)) { 552 566 if (inet_bhash2_addr_any_conflict(sk, port, l3mdev, true, true)) 553 567 goto fail_unlock; 554 568 }
+1 -1
net/ipv4/udp.c
··· 287 287 } else { 288 288 hslot = udp_hashslot(udptable, net, snum); 289 289 spin_lock_bh(&hslot->lock); 290 - if (hslot->count > 10) { 290 + if (inet_use_hash2_on_bind(sk) && hslot->count > 10) { 291 291 int exist; 292 292 unsigned int slot2 = udp_sk(sk)->udp_portaddr_hash ^ snum; 293 293
+2 -2
net/ipv6/addrconf.c
··· 2862 2862 fib6_add_gc_list(rt); 2863 2863 } else { 2864 2864 fib6_clean_expires(rt); 2865 - fib6_remove_gc_list(rt); 2865 + fib6_may_remove_gc_list(net, rt); 2866 2866 } 2867 2867 2868 2868 spin_unlock_bh(&table->tb6_lock); ··· 4840 4840 4841 4841 if (!(flags & RTF_EXPIRES)) { 4842 4842 fib6_clean_expires(f6i); 4843 - fib6_remove_gc_list(f6i); 4843 + fib6_may_remove_gc_list(net, f6i); 4844 4844 } else { 4845 4845 fib6_set_expires(f6i, expires); 4846 4846 fib6_add_gc_list(f6i);
+6 -3
net/ipv6/esp6.c
··· 271 271 xfrm_dev_resume(skb); 272 272 } else { 273 273 if (!err && 274 - x->encap && x->encap->encap_type == TCP_ENCAP_ESPINTCP) 275 - esp_output_tail_tcp(x, skb); 276 - else 274 + x->encap && x->encap->encap_type == TCP_ENCAP_ESPINTCP) { 275 + err = esp_output_tail_tcp(x, skb); 276 + if (err != -EINPROGRESS) 277 + kfree_skb(skb); 278 + } else { 277 279 xfrm_output_resume(skb_to_full_sk(skb), skb, err); 280 + } 278 281 } 279 282 } 280 283
+13 -2
net/ipv6/ip6_fib.c
··· 1133 1133 return -EEXIST; 1134 1134 if (!(rt->fib6_flags & RTF_EXPIRES)) { 1135 1135 fib6_clean_expires(iter); 1136 - fib6_remove_gc_list(iter); 1136 + fib6_may_remove_gc_list(info->nl_net, iter); 1137 1137 } else { 1138 1138 fib6_set_expires(iter, rt->expires); 1139 1139 fib6_add_gc_list(iter); ··· 2348 2348 /* 2349 2349 * Garbage collection 2350 2350 */ 2351 + void fib6_age_exceptions(struct fib6_info *rt, struct fib6_gc_args *gc_args, 2352 + unsigned long now) 2353 + { 2354 + bool may_expire = rt->fib6_flags & RTF_EXPIRES && rt->expires; 2355 + int old_more = gc_args->more; 2356 + 2357 + rt6_age_exceptions(rt, gc_args, now); 2358 + 2359 + if (!may_expire && old_more == gc_args->more) 2360 + fib6_remove_gc_list(rt); 2361 + } 2351 2362 2352 2363 static int fib6_age(struct fib6_info *rt, struct fib6_gc_args *gc_args) 2353 2364 { ··· 2381 2370 * Note, that clones are aged out 2382 2371 * only if they are not in use now. 2383 2372 */ 2384 - rt6_age_exceptions(rt, gc_args, now); 2373 + fib6_age_exceptions(rt, gc_args, now); 2385 2374 2386 2375 return 0; 2387 2376 }
+4
net/ipv6/netfilter/ip6t_rt.c
··· 157 157 pr_debug("unknown flags %X\n", rtinfo->invflags); 158 158 return -EINVAL; 159 159 } 160 + if (rtinfo->addrnr > IP6T_RT_HOPS) { 161 + pr_debug("too many addresses specified\n"); 162 + return -EINVAL; 163 + } 160 164 if ((rtinfo->flags & (IP6T_RT_RES | IP6T_RT_FST_MASK)) && 161 165 (!(rtinfo->flags & IP6T_RT_TYP) || 162 166 (rtinfo->rt_type != 0) ||
+1 -1
net/ipv6/route.c
··· 1033 1033 1034 1034 if (!addrconf_finite_timeout(lifetime)) { 1035 1035 fib6_clean_expires(rt); 1036 - fib6_remove_gc_list(rt); 1036 + fib6_may_remove_gc_list(net, rt); 1037 1037 } else { 1038 1038 fib6_set_expires(rt, jiffies + HZ * lifetime); 1039 1039 fib6_add_gc_list(rt);
+12 -7
net/key/af_key.c
··· 3518 3518 3519 3519 static int set_ipsecrequest(struct sk_buff *skb, 3520 3520 uint8_t proto, uint8_t mode, int level, 3521 - uint32_t reqid, uint8_t family, 3521 + uint32_t reqid, sa_family_t family, 3522 3522 const xfrm_address_t *src, const xfrm_address_t *dst) 3523 3523 { 3524 3524 struct sadb_x_ipsecrequest *rq; ··· 3583 3583 3584 3584 /* ipsecrequests */ 3585 3585 for (i = 0, mp = m; i < num_bundles; i++, mp++) { 3586 - /* old locator pair */ 3587 - size_pol += sizeof(struct sadb_x_ipsecrequest) + 3588 - pfkey_sockaddr_pair_size(mp->old_family); 3589 - /* new locator pair */ 3590 - size_pol += sizeof(struct sadb_x_ipsecrequest) + 3591 - pfkey_sockaddr_pair_size(mp->new_family); 3586 + int pair_size; 3587 + 3588 + pair_size = pfkey_sockaddr_pair_size(mp->old_family); 3589 + if (!pair_size) 3590 + return -EINVAL; 3591 + size_pol += sizeof(struct sadb_x_ipsecrequest) + pair_size; 3592 + 3593 + pair_size = pfkey_sockaddr_pair_size(mp->new_family); 3594 + if (!pair_size) 3595 + return -EINVAL; 3596 + size_pol += sizeof(struct sadb_x_ipsecrequest) + pair_size; 3592 3597 } 3593 3598 3594 3599 size += sizeof(struct sadb_msg) + size_pol;
+6 -2
net/netfilter/nf_conntrack_broadcast.c
··· 21 21 unsigned int timeout) 22 22 { 23 23 const struct nf_conntrack_helper *helper; 24 + struct net *net = read_pnet(&ct->ct_net); 24 25 struct nf_conntrack_expect *exp; 25 26 struct iphdr *iph = ip_hdr(skb); 26 27 struct rtable *rt = skb_rtable(skb); ··· 71 70 exp->expectfn = NULL; 72 71 exp->flags = NF_CT_EXPECT_PERMANENT; 73 72 exp->class = NF_CT_EXPECT_CLASS_DEFAULT; 74 - exp->helper = NULL; 75 - 73 + rcu_assign_pointer(exp->helper, helper); 74 + write_pnet(&exp->net, net); 75 + #ifdef CONFIG_NF_CONNTRACK_ZONES 76 + exp->zone = ct->zone; 77 + #endif 76 78 nf_ct_expect_related(exp, 0); 77 79 nf_ct_expect_put(exp); 78 80
+2
net/netfilter/nf_conntrack_ecache.c
··· 247 247 struct nf_ct_event_notifier *notify; 248 248 struct nf_conntrack_ecache *e; 249 249 250 + lockdep_nfct_expect_lock_held(); 251 + 250 252 rcu_read_lock(); 251 253 notify = rcu_dereference(net->ct.nf_conntrack_event_cb); 252 254 if (!notify)
+34 -5
net/netfilter/nf_conntrack_expect.c
··· 51 51 struct net *net = nf_ct_exp_net(exp); 52 52 struct nf_conntrack_net *cnet; 53 53 54 + lockdep_nfct_expect_lock_held(); 54 55 WARN_ON(!master_help); 55 56 WARN_ON(timer_pending(&exp->timeout)); 56 57 ··· 113 112 const struct net *net) 114 113 { 115 114 return nf_ct_tuple_mask_cmp(tuple, &i->tuple, &i->mask) && 116 - net_eq(net, nf_ct_net(i->master)) && 117 - nf_ct_zone_equal_any(i->master, zone); 115 + net_eq(net, read_pnet(&i->net)) && 116 + nf_ct_exp_zone_equal_any(i, zone); 118 117 } 119 118 120 119 bool nf_ct_remove_expect(struct nf_conntrack_expect *exp) 121 120 { 121 + lockdep_nfct_expect_lock_held(); 122 + 122 123 if (timer_delete(&exp->timeout)) { 123 124 nf_ct_unlink_expect(exp); 124 125 nf_ct_expect_put(exp); ··· 179 176 struct nf_conntrack_net *cnet = nf_ct_pernet(net); 180 177 struct nf_conntrack_expect *i, *exp = NULL; 181 178 unsigned int h; 179 + 180 + lockdep_nfct_expect_lock_held(); 182 181 183 182 if (!cnet->expect_count) 184 183 return NULL; ··· 314 309 } 315 310 EXPORT_SYMBOL_GPL(nf_ct_expect_alloc); 316 311 312 + /* This function can only be used from packet path, where accessing 313 + * master's helper is safe, because the packet holds a reference on 314 + * the conntrack object. Never use it from control plane. 315 + */ 317 316 void nf_ct_expect_init(struct nf_conntrack_expect *exp, unsigned int class, 318 317 u_int8_t family, 319 318 const union nf_inet_addr *saddr, 320 319 const union nf_inet_addr *daddr, 321 320 u_int8_t proto, const __be16 *src, const __be16 *dst) 322 321 { 322 + struct nf_conntrack_helper *helper = NULL; 323 + struct nf_conn *ct = exp->master; 324 + struct net *net = read_pnet(&ct->ct_net); 325 + struct nf_conn_help *help; 323 326 int len; 324 327 325 328 if (family == AF_INET) ··· 338 325 exp->flags = 0; 339 326 exp->class = class; 340 327 exp->expectfn = NULL; 341 - exp->helper = NULL; 328 + 329 + help = nfct_help(ct); 330 + if (help) 331 + helper = rcu_dereference(help->helper); 332 + 333 + rcu_assign_pointer(exp->helper, helper); 334 + write_pnet(&exp->net, net); 335 + #ifdef CONFIG_NF_CONNTRACK_ZONES 336 + exp->zone = ct->zone; 337 + #endif 342 338 exp->tuple.src.l3num = family; 343 339 exp->tuple.dst.protonum = proto; 344 340 ··· 464 442 unsigned int h; 465 443 int ret = 0; 466 444 445 + lockdep_nfct_expect_lock_held(); 446 + 467 447 if (!master_help) { 468 448 ret = -ESHUTDOWN; 469 449 goto out; ··· 522 498 523 499 nf_ct_expect_insert(expect); 524 500 525 - spin_unlock_bh(&nf_conntrack_expect_lock); 526 501 nf_ct_expect_event_report(IPEXP_NEW, expect, portid, report); 502 + spin_unlock_bh(&nf_conntrack_expect_lock); 503 + 527 504 return 0; 528 505 out: 529 506 spin_unlock_bh(&nf_conntrack_expect_lock); ··· 652 627 { 653 628 struct nf_conntrack_expect *expect; 654 629 struct nf_conntrack_helper *helper; 630 + struct net *net = seq_file_net(s); 655 631 struct hlist_node *n = v; 656 632 char *delim = ""; 657 633 658 634 expect = hlist_entry(n, struct nf_conntrack_expect, hnode); 635 + 636 + if (!net_eq(nf_ct_exp_net(expect), net)) 637 + return 0; 659 638 660 639 if (expect->timeout.function) 661 640 seq_printf(s, "%ld ", timer_pending(&expect->timeout) ··· 683 654 if (expect->flags & NF_CT_EXPECT_USERSPACE) 684 655 seq_printf(s, "%sUSERSPACE", delim); 685 656 686 - helper = rcu_dereference(nfct_help(expect->master)->helper); 657 + helper = rcu_dereference(expect->helper); 687 658 if (helper) { 688 659 seq_printf(s, "%s%s", expect->flags ? " " : "", helper->name); 689 660 if (helper->expect_policy[expect->class].name[0])
+6 -6
net/netfilter/nf_conntrack_h323_main.c
··· 643 643 &ct->tuplehash[!dir].tuple.src.u3, 644 644 &ct->tuplehash[!dir].tuple.dst.u3, 645 645 IPPROTO_TCP, NULL, &port); 646 - exp->helper = &nf_conntrack_helper_h245; 646 + rcu_assign_pointer(exp->helper, &nf_conntrack_helper_h245); 647 647 648 648 nathook = rcu_dereference(nfct_h323_nat_hook); 649 649 if (memcmp(&ct->tuplehash[dir].tuple.src.u3, ··· 767 767 nf_ct_expect_init(exp, NF_CT_EXPECT_CLASS_DEFAULT, nf_ct_l3num(ct), 768 768 &ct->tuplehash[!dir].tuple.src.u3, &addr, 769 769 IPPROTO_TCP, NULL, &port); 770 - exp->helper = nf_conntrack_helper_q931; 770 + rcu_assign_pointer(exp->helper, nf_conntrack_helper_q931); 771 771 772 772 nathook = rcu_dereference(nfct_h323_nat_hook); 773 773 if (memcmp(&ct->tuplehash[dir].tuple.src.u3, ··· 1234 1234 &ct->tuplehash[!dir].tuple.src.u3 : NULL, 1235 1235 &ct->tuplehash[!dir].tuple.dst.u3, 1236 1236 IPPROTO_TCP, NULL, &port); 1237 - exp->helper = nf_conntrack_helper_q931; 1237 + rcu_assign_pointer(exp->helper, nf_conntrack_helper_q931); 1238 1238 exp->flags = NF_CT_EXPECT_PERMANENT; /* Accept multiple calls */ 1239 1239 1240 1240 nathook = rcu_dereference(nfct_h323_nat_hook); ··· 1306 1306 nf_ct_expect_init(exp, NF_CT_EXPECT_CLASS_DEFAULT, nf_ct_l3num(ct), 1307 1307 &ct->tuplehash[!dir].tuple.src.u3, &addr, 1308 1308 IPPROTO_UDP, NULL, &port); 1309 - exp->helper = nf_conntrack_helper_ras; 1309 + rcu_assign_pointer(exp->helper, nf_conntrack_helper_ras); 1310 1310 1311 1311 if (nf_ct_expect_related(exp, 0) == 0) { 1312 1312 pr_debug("nf_ct_ras: expect RAS "); ··· 1523 1523 &ct->tuplehash[!dir].tuple.src.u3, &addr, 1524 1524 IPPROTO_TCP, NULL, &port); 1525 1525 exp->flags = NF_CT_EXPECT_PERMANENT; 1526 - exp->helper = nf_conntrack_helper_q931; 1526 + rcu_assign_pointer(exp->helper, nf_conntrack_helper_q931); 1527 1527 1528 1528 if (nf_ct_expect_related(exp, 0) == 0) { 1529 1529 pr_debug("nf_ct_ras: expect Q.931 "); ··· 1577 1577 &ct->tuplehash[!dir].tuple.src.u3, &addr, 1578 1578 IPPROTO_TCP, NULL, &port); 1579 1579 exp->flags = NF_CT_EXPECT_PERMANENT; 1580 - exp->helper = nf_conntrack_helper_q931; 1580 + rcu_assign_pointer(exp->helper, nf_conntrack_helper_q931); 1581 1581 1582 1582 if (nf_ct_expect_related(exp, 0) == 0) { 1583 1583 pr_debug("nf_ct_ras: expect Q.931 ");
+6 -5
net/netfilter/nf_conntrack_helper.c
··· 395 395 396 396 static bool expect_iter_me(struct nf_conntrack_expect *exp, void *data) 397 397 { 398 - struct nf_conn_help *help = nfct_help(exp->master); 399 398 const struct nf_conntrack_helper *me = data; 400 399 const struct nf_conntrack_helper *this; 401 400 402 - if (exp->helper == me) 403 - return true; 404 - 405 - this = rcu_dereference_protected(help->helper, 401 + this = rcu_dereference_protected(exp->helper, 406 402 lockdep_is_held(&nf_conntrack_expect_lock)); 407 403 return this == me; 408 404 } ··· 417 421 418 422 nf_ct_expect_iterate_destroy(expect_iter_me, NULL); 419 423 nf_ct_iterate_destroy(unhelp, me); 424 + 425 + /* nf_ct_iterate_destroy() does an unconditional synchronize_rcu() as 426 + * last step, this ensures rcu readers of exp->helper are done. 427 + * No need for another synchronize_rcu() here. 428 + */ 420 429 } 421 430 EXPORT_SYMBOL_GPL(nf_conntrack_helper_unregister); 422 431
+40 -35
net/netfilter/nf_conntrack_netlink.c
··· 910 910 }; 911 911 912 912 static const struct nla_policy cta_filter_nla_policy[CTA_FILTER_MAX + 1] = { 913 - [CTA_FILTER_ORIG_FLAGS] = { .type = NLA_U32 }, 914 - [CTA_FILTER_REPLY_FLAGS] = { .type = NLA_U32 }, 913 + [CTA_FILTER_ORIG_FLAGS] = NLA_POLICY_MASK(NLA_U32, CTA_FILTER_F_ALL), 914 + [CTA_FILTER_REPLY_FLAGS] = NLA_POLICY_MASK(NLA_U32, CTA_FILTER_F_ALL), 915 915 }; 916 916 917 917 static int ctnetlink_parse_filter(const struct nlattr *attr, ··· 925 925 if (ret) 926 926 return ret; 927 927 928 - if (tb[CTA_FILTER_ORIG_FLAGS]) { 928 + if (tb[CTA_FILTER_ORIG_FLAGS]) 929 929 filter->orig_flags = nla_get_u32(tb[CTA_FILTER_ORIG_FLAGS]); 930 - if (filter->orig_flags & ~CTA_FILTER_F_ALL) 931 - return -EOPNOTSUPP; 932 - } 933 930 934 - if (tb[CTA_FILTER_REPLY_FLAGS]) { 931 + if (tb[CTA_FILTER_REPLY_FLAGS]) 935 932 filter->reply_flags = nla_get_u32(tb[CTA_FILTER_REPLY_FLAGS]); 936 - if (filter->reply_flags & ~CTA_FILTER_F_ALL) 937 - return -EOPNOTSUPP; 938 - } 939 933 940 934 return 0; 941 935 } ··· 2628 2634 [CTA_EXPECT_HELP_NAME] = { .type = NLA_NUL_STRING, 2629 2635 .len = NF_CT_HELPER_NAME_LEN - 1 }, 2630 2636 [CTA_EXPECT_ZONE] = { .type = NLA_U16 }, 2631 - [CTA_EXPECT_FLAGS] = { .type = NLA_U32 }, 2637 + [CTA_EXPECT_FLAGS] = NLA_POLICY_MASK(NLA_BE32, NF_CT_EXPECT_MASK), 2632 2638 [CTA_EXPECT_CLASS] = { .type = NLA_U32 }, 2633 2639 [CTA_EXPECT_NAT] = { .type = NLA_NESTED }, 2634 2640 [CTA_EXPECT_FN] = { .type = NLA_NUL_STRING }, ··· 3006 3012 { 3007 3013 struct nf_conn *master = exp->master; 3008 3014 long timeout = ((long)exp->timeout.expires - (long)jiffies) / HZ; 3009 - struct nf_conn_help *help; 3015 + struct nf_conntrack_helper *helper; 3010 3016 #if IS_ENABLED(CONFIG_NF_NAT) 3011 3017 struct nlattr *nest_parms; 3012 3018 struct nf_conntrack_tuple nat_tuple = {}; ··· 3051 3057 nla_put_be32(skb, CTA_EXPECT_FLAGS, htonl(exp->flags)) || 3052 3058 nla_put_be32(skb, CTA_EXPECT_CLASS, htonl(exp->class))) 3053 3059 goto nla_put_failure; 3054 - help = nfct_help(master); 3055 - if (help) { 3056 - struct nf_conntrack_helper *helper; 3057 3060 3058 - helper = rcu_dereference(help->helper); 3059 - if (helper && 3060 - nla_put_string(skb, CTA_EXPECT_HELP_NAME, helper->name)) 3061 - goto nla_put_failure; 3062 - } 3061 + helper = rcu_dereference(exp->helper); 3062 + if (helper && 3063 + nla_put_string(skb, CTA_EXPECT_HELP_NAME, helper->name)) 3064 + goto nla_put_failure; 3065 + 3063 3066 expfn = nf_ct_helper_expectfn_find_by_symbol(exp->expectfn); 3064 3067 if (expfn != NULL && 3065 3068 nla_put_string(skb, CTA_EXPECT_FN, expfn->name)) ··· 3349 3358 if (err < 0) 3350 3359 return err; 3351 3360 3361 + skb2 = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL); 3362 + if (!skb2) 3363 + return -ENOMEM; 3364 + 3365 + spin_lock_bh(&nf_conntrack_expect_lock); 3352 3366 exp = nf_ct_expect_find_get(info->net, &zone, &tuple); 3353 - if (!exp) 3367 + if (!exp) { 3368 + spin_unlock_bh(&nf_conntrack_expect_lock); 3369 + kfree_skb(skb2); 3354 3370 return -ENOENT; 3371 + } 3355 3372 3356 3373 if (cda[CTA_EXPECT_ID]) { 3357 3374 __be32 id = nla_get_be32(cda[CTA_EXPECT_ID]); 3358 3375 3359 3376 if (id != nf_expect_get_id(exp)) { 3360 3377 nf_ct_expect_put(exp); 3378 + spin_unlock_bh(&nf_conntrack_expect_lock); 3379 + kfree_skb(skb2); 3361 3380 return -ENOENT; 3362 3381 } 3363 - } 3364 - 3365 - skb2 = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL); 3366 - if (!skb2) { 3367 - nf_ct_expect_put(exp); 3368 - return -ENOMEM; 3369 3382 } 3370 3383 3371 3384 rcu_read_lock(); ··· 3378 3383 exp); 3379 3384 rcu_read_unlock(); 3380 3385 
nf_ct_expect_put(exp); 3386 + spin_unlock_bh(&nf_conntrack_expect_lock); 3387 + 3381 3388 if (err <= 0) { 3382 3389 kfree_skb(skb2); 3383 3390 return -ENOMEM; ··· 3391 3394 static bool expect_iter_name(struct nf_conntrack_expect *exp, void *data) 3392 3395 { 3393 3396 struct nf_conntrack_helper *helper; 3394 - const struct nf_conn_help *m_help; 3395 3397 const char *name = data; 3396 3398 3397 - m_help = nfct_help(exp->master); 3398 - 3399 - helper = rcu_dereference(m_help->helper); 3399 + helper = rcu_dereference(exp->helper); 3400 3400 if (!helper) 3401 3401 return false; 3402 3402 ··· 3426 3432 if (err < 0) 3427 3433 return err; 3428 3434 3435 + spin_lock_bh(&nf_conntrack_expect_lock); 3436 + 3429 3437 /* bump usage count to 2 */ 3430 3438 exp = nf_ct_expect_find_get(info->net, &zone, &tuple); 3431 - if (!exp) 3439 + if (!exp) { 3440 + spin_unlock_bh(&nf_conntrack_expect_lock); 3432 3441 return -ENOENT; 3442 + } 3433 3443 3434 3444 if (cda[CTA_EXPECT_ID]) { 3435 3445 __be32 id = nla_get_be32(cda[CTA_EXPECT_ID]); 3436 3446 3437 3447 if (id != nf_expect_get_id(exp)) { 3438 3448 nf_ct_expect_put(exp); 3449 + spin_unlock_bh(&nf_conntrack_expect_lock); 3439 3450 return -ENOENT; 3440 3451 } 3441 3452 } 3442 3453 3443 3454 /* after list removal, usage count == 1 */ 3444 - spin_lock_bh(&nf_conntrack_expect_lock); 3445 3455 if (timer_delete(&exp->timeout)) { 3446 3456 nf_ct_unlink_expect_report(exp, NETLINK_CB(skb).portid, 3447 3457 nlmsg_report(info->nlh)); ··· 3532 3534 struct nf_conntrack_tuple *tuple, 3533 3535 struct nf_conntrack_tuple *mask) 3534 3536 { 3535 - u_int32_t class = 0; 3537 + struct net *net = read_pnet(&ct->ct_net); 3536 3538 struct nf_conntrack_expect *exp; 3537 3539 struct nf_conn_help *help; 3540 + u32 class = 0; 3538 3541 int err; 3539 3542 3540 3543 help = nfct_help(ct); ··· 3572 3573 3573 3574 exp->class = class; 3574 3575 exp->master = ct; 3575 - exp->helper = helper; 3576 + write_pnet(&exp->net, net); 3577 + #ifdef CONFIG_NF_CONNTRACK_ZONES 3578 + exp->zone = ct->zone; 3579 + #endif 3580 + if (!helper) 3581 + helper = rcu_dereference(help->helper); 3582 + rcu_assign_pointer(exp->helper, helper); 3576 3583 exp->tuple = *tuple; 3577 3584 exp->mask.src.u3 = mask->src.u3; 3578 3585 exp->mask.src.u.all = mask->src.u.all;
+3 -7
net/netfilter/nf_conntrack_proto_tcp.c
··· 1385 1385 } 1386 1386 1387 1387 static const struct nla_policy tcp_nla_policy[CTA_PROTOINFO_TCP_MAX+1] = { 1388 - [CTA_PROTOINFO_TCP_STATE] = { .type = NLA_U8 }, 1389 - [CTA_PROTOINFO_TCP_WSCALE_ORIGINAL] = { .type = NLA_U8 }, 1390 - [CTA_PROTOINFO_TCP_WSCALE_REPLY] = { .type = NLA_U8 }, 1388 + [CTA_PROTOINFO_TCP_STATE] = NLA_POLICY_MAX(NLA_U8, TCP_CONNTRACK_SYN_SENT2), 1389 + [CTA_PROTOINFO_TCP_WSCALE_ORIGINAL] = NLA_POLICY_MAX(NLA_U8, TCP_MAX_WSCALE), 1390 + [CTA_PROTOINFO_TCP_WSCALE_REPLY] = NLA_POLICY_MAX(NLA_U8, TCP_MAX_WSCALE), 1391 1391 [CTA_PROTOINFO_TCP_FLAGS_ORIGINAL] = { .len = sizeof(struct nf_ct_tcp_flags) }, 1392 1392 [CTA_PROTOINFO_TCP_FLAGS_REPLY] = { .len = sizeof(struct nf_ct_tcp_flags) }, 1393 1393 }; ··· 1413 1413 tcp_nla_policy, NULL); 1414 1414 if (err < 0) 1415 1415 return err; 1416 - 1417 - if (tb[CTA_PROTOINFO_TCP_STATE] && 1418 - nla_get_u8(tb[CTA_PROTOINFO_TCP_STATE]) >= TCP_CONNTRACK_MAX) 1419 - return -EINVAL; 1420 1416 1421 1417 spin_lock_bh(&ct->lock); 1422 1418 if (tb[CTA_PROTOINFO_TCP_STATE])
+12 -6
net/netfilter/nf_conntrack_sip.c
··· 924 924 exp = __nf_ct_expect_find(net, nf_ct_zone(ct), &tuple); 925 925 926 926 if (!exp || exp->master == ct || 927 - nfct_help(exp->master)->helper != nfct_help(ct)->helper || 927 + exp->helper != nfct_help(ct)->helper || 928 928 exp->class != class) 929 929 break; 930 930 #if IS_ENABLED(CONFIG_NF_NAT) ··· 1040 1040 unsigned int port; 1041 1041 const struct sdp_media_type *t; 1042 1042 int ret = NF_ACCEPT; 1043 + bool have_rtp_addr = false; 1043 1044 1044 1045 hooks = rcu_dereference(nf_nat_sip_hooks); 1045 1046 ··· 1057 1056 caddr_len = 0; 1058 1057 if (ct_sip_parse_sdp_addr(ct, *dptr, sdpoff, *datalen, 1059 1058 SDP_HDR_CONNECTION, SDP_HDR_MEDIA, 1060 - &matchoff, &matchlen, &caddr) > 0) 1059 + &matchoff, &matchlen, &caddr) > 0) { 1061 1060 caddr_len = matchlen; 1061 + memcpy(&rtp_addr, &caddr, sizeof(rtp_addr)); 1062 + have_rtp_addr = true; 1063 + } 1062 1064 1063 1065 mediaoff = sdpoff; 1064 1066 for (i = 0; i < ARRAY_SIZE(sdp_media_types); ) { ··· 1095 1091 &matchoff, &matchlen, &maddr) > 0) { 1096 1092 maddr_len = matchlen; 1097 1093 memcpy(&rtp_addr, &maddr, sizeof(rtp_addr)); 1098 - } else if (caddr_len) 1094 + have_rtp_addr = true; 1095 + } else if (caddr_len) { 1099 1096 memcpy(&rtp_addr, &caddr, sizeof(rtp_addr)); 1100 - else { 1097 + have_rtp_addr = true; 1098 + } else { 1101 1099 nf_ct_helper_log(skb, ct, "cannot parse SDP message"); 1102 1100 return NF_DROP; 1103 1101 } ··· 1131 1125 1132 1126 /* Update session connection and owner addresses */ 1133 1127 hooks = rcu_dereference(nf_nat_sip_hooks); 1134 - if (hooks && ct->status & IPS_NAT_MASK) 1128 + if (hooks && ct->status & IPS_NAT_MASK && have_rtp_addr) 1135 1129 ret = hooks->sdp_session(skb, protoff, dataoff, 1136 1130 dptr, datalen, sdpoff, 1137 1131 &rtp_addr); ··· 1303 1297 nf_ct_expect_init(exp, SIP_EXPECT_SIGNALLING, nf_ct_l3num(ct), 1304 1298 saddr, &daddr, proto, NULL, &port); 1305 1299 exp->timeout.expires = sip_timeout * HZ; 1306 - exp->helper = helper; 1300 + rcu_assign_pointer(exp->helper, helper); 1307 1301 exp->flags = NF_CT_EXPECT_PERMANENT | NF_CT_EXPECT_INACTIVE; 1308 1302 1309 1303 hooks = rcu_dereference(nf_nat_sip_hooks);
+10 -10
net/netfilter/nft_set_pipapo_avx2.c
··· 242 242 243 243 b = nft_pipapo_avx2_refill(i_ul, &map[i_ul], fill, f->mt, last); 244 244 if (last) 245 - return b; 245 + ret = b; 246 246 247 247 if (unlikely(ret == -1)) 248 248 ret = b / XSAVE_YMM_SIZE; ··· 319 319 320 320 b = nft_pipapo_avx2_refill(i_ul, &map[i_ul], fill, f->mt, last); 321 321 if (last) 322 - return b; 322 + ret = b; 323 323 324 324 if (unlikely(ret == -1)) 325 325 ret = b / XSAVE_YMM_SIZE; ··· 414 414 415 415 b = nft_pipapo_avx2_refill(i_ul, &map[i_ul], fill, f->mt, last); 416 416 if (last) 417 - return b; 417 + ret = b; 418 418 419 419 if (unlikely(ret == -1)) 420 420 ret = b / XSAVE_YMM_SIZE; ··· 505 505 506 506 b = nft_pipapo_avx2_refill(i_ul, &map[i_ul], fill, f->mt, last); 507 507 if (last) 508 - return b; 508 + ret = b; 509 509 510 510 if (unlikely(ret == -1)) 511 511 ret = b / XSAVE_YMM_SIZE; ··· 641 641 642 642 b = nft_pipapo_avx2_refill(i_ul, &map[i_ul], fill, f->mt, last); 643 643 if (last) 644 - return b; 644 + ret = b; 645 645 646 646 if (unlikely(ret == -1)) 647 647 ret = b / XSAVE_YMM_SIZE; ··· 699 699 700 700 b = nft_pipapo_avx2_refill(i_ul, &map[i_ul], fill, f->mt, last); 701 701 if (last) 702 - return b; 702 + ret = b; 703 703 704 704 if (unlikely(ret == -1)) 705 705 ret = b / XSAVE_YMM_SIZE; ··· 764 764 765 765 b = nft_pipapo_avx2_refill(i_ul, &map[i_ul], fill, f->mt, last); 766 766 if (last) 767 - return b; 767 + ret = b; 768 768 769 769 if (unlikely(ret == -1)) 770 770 ret = b / XSAVE_YMM_SIZE; ··· 839 839 840 840 b = nft_pipapo_avx2_refill(i_ul, &map[i_ul], fill, f->mt, last); 841 841 if (last) 842 - return b; 842 + ret = b; 843 843 844 844 if (unlikely(ret == -1)) 845 845 ret = b / XSAVE_YMM_SIZE; ··· 925 925 926 926 b = nft_pipapo_avx2_refill(i_ul, &map[i_ul], fill, f->mt, last); 927 927 if (last) 928 - return b; 928 + ret = b; 929 929 930 930 if (unlikely(ret == -1)) 931 931 ret = b / XSAVE_YMM_SIZE; ··· 1019 1019 1020 1020 b = nft_pipapo_avx2_refill(i_ul, &map[i_ul], fill, f->mt, last); 1021 1021 if (last) 1022 - return b; 1022 + ret = b; 1023 1023 1024 1024 if (unlikely(ret == -1)) 1025 1025 ret = b / XSAVE_YMM_SIZE;
+75 -17
net/netfilter/nft_set_rbtree.c
··· 572 572 return array; 573 573 } 574 574 575 - #define NFT_ARRAY_EXTRA_SIZE 10240 576 - 577 575 /* Similar to nft_rbtree_{u,k}size to hide details to userspace, but consider 578 576 * packed representation coming from userspace for anonymous sets too. 579 577 */ 580 578 static u32 nft_array_elems(const struct nft_set *set) 581 579 { 582 - u32 nelems = atomic_read(&set->nelems); 580 + u32 nelems = atomic_read(&set->nelems) - set->ndeact; 583 581 584 582 /* Adjacent intervals are represented with a single start element in 585 583 * anonymous sets, use the current element counter as is. ··· 593 595 return (nelems / 2) + 2; 594 596 } 595 597 596 - static int nft_array_may_resize(const struct nft_set *set) 598 + #define NFT_ARRAY_INITIAL_SIZE 1024 599 + #define NFT_ARRAY_INITIAL_ANON_SIZE 16 600 + #define NFT_ARRAY_INITIAL_ANON_THRESH (8192U / sizeof(struct nft_array_interval)) 601 + 602 + static int nft_array_may_resize(const struct nft_set *set, bool flush) 597 603 { 598 - u32 nelems = nft_array_elems(set), new_max_intervals; 604 + u32 initial_intervals, max_intervals, new_max_intervals, delta; 605 + u32 shrinked_max_intervals, nelems = nft_array_elems(set); 599 606 struct nft_rbtree *priv = nft_set_priv(set); 600 607 struct nft_array *array; 601 608 602 - if (!priv->array_next) { 603 - array = nft_array_alloc(nelems + NFT_ARRAY_EXTRA_SIZE); 609 + if (nft_set_is_anonymous(set)) 610 + initial_intervals = NFT_ARRAY_INITIAL_ANON_SIZE; 611 + else 612 + initial_intervals = NFT_ARRAY_INITIAL_SIZE; 613 + 614 + if (priv->array_next) { 615 + max_intervals = priv->array_next->max_intervals; 616 + new_max_intervals = priv->array_next->max_intervals; 617 + } else { 618 + if (priv->array) { 619 + max_intervals = priv->array->max_intervals; 620 + new_max_intervals = priv->array->max_intervals; 621 + } else { 622 + max_intervals = 0; 623 + new_max_intervals = initial_intervals; 624 + } 625 + } 626 + 627 + if (nft_set_is_anonymous(set)) 628 + goto maybe_grow; 629 + 630 + if (flush) { 631 + /* Set flush just started, nelems still report elements.*/ 632 + nelems = 0; 633 + new_max_intervals = NFT_ARRAY_INITIAL_SIZE; 634 + goto realloc_array; 635 + } 636 + 637 + if (check_add_overflow(new_max_intervals, new_max_intervals, 638 + &shrinked_max_intervals)) 639 + return -EOVERFLOW; 640 + 641 + shrinked_max_intervals = DIV_ROUND_UP(shrinked_max_intervals, 3); 642 + 643 + if (shrinked_max_intervals > NFT_ARRAY_INITIAL_SIZE && 644 + nelems < shrinked_max_intervals) { 645 + new_max_intervals = shrinked_max_intervals; 646 + goto realloc_array; 647 + } 648 + maybe_grow: 649 + if (nelems > new_max_intervals) { 650 + if (nft_set_is_anonymous(set) && 651 + new_max_intervals < NFT_ARRAY_INITIAL_ANON_THRESH) { 652 + new_max_intervals <<= 1; 653 + } else { 654 + delta = new_max_intervals >> 1; 655 + if (check_add_overflow(new_max_intervals, delta, 656 + &new_max_intervals)) 657 + return -EOVERFLOW; 658 + } 659 + } 660 + 661 + realloc_array: 662 + if (WARN_ON_ONCE(nelems > new_max_intervals)) 663 + return -ENOMEM; 664 + 665 + if (priv->array_next) { 666 + if (max_intervals == new_max_intervals) 667 + return 0; 668 + 669 + if (nft_array_intervals_alloc(priv->array_next, new_max_intervals) < 0) 670 + return -ENOMEM; 671 + } else { 672 + array = nft_array_alloc(new_max_intervals); 604 673 if (!array) 605 674 return -ENOMEM; 606 675 607 676 priv->array_next = array; 608 677 } 609 - 610 - if (nelems < priv->array_next->max_intervals) 611 - return 0; 612 - 613 - new_max_intervals = priv->array_next->max_intervals + 
NFT_ARRAY_EXTRA_SIZE; 614 - if (nft_array_intervals_alloc(priv->array_next, new_max_intervals) < 0) 615 - return -ENOMEM; 616 678 617 679 return 0; 618 680 } ··· 688 630 689 631 nft_rbtree_maybe_reset_start_cookie(priv, tstamp); 690 632 691 - if (nft_array_may_resize(set) < 0) 633 + if (nft_array_may_resize(set, false) < 0) 692 634 return -ENOMEM; 693 635 694 636 do { ··· 799 741 nft_rbtree_interval_null(set, this)) 800 742 priv->start_rbe_cookie = 0; 801 743 802 - if (nft_array_may_resize(set) < 0) 744 + if (nft_array_may_resize(set, false) < 0) 803 745 return NULL; 804 746 805 747 while (parent != NULL) { ··· 869 811 870 812 switch (iter->type) { 871 813 case NFT_ITER_UPDATE_CLONE: 872 - if (nft_array_may_resize(set) < 0) { 814 + if (nft_array_may_resize(set, true) < 0) { 873 815 iter->err = -ENOMEM; 874 816 break; 875 817 }
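The nft_set_rbtree.c change above replaces the fixed NFT_ARRAY_EXTRA_SIZE increment with a grow/shrink heuristic guarded by check_add_overflow(). As a rough, non-authoritative illustration of that pattern only, the standalone sketch below grows a capacity by about half when it is exceeded and halves it when usage drops well below it; it uses the GCC/Clang builtin __builtin_add_overflow() as a stand-in for the kernel helper, assumes a non-zero initial size, and every name in it is made up rather than taken from the patch.

    #include <stdbool.h>
    #include <stdio.h>

    /* Pick a capacity able to hold 'used' entries: grow by roughly 50% while
     * too small, shrink once when well under-utilised, never wrap around and
     * never drop below 'initial' (which must be non-zero). */
    static bool pick_capacity(unsigned int used, unsigned int cur,
                              unsigned int initial, unsigned int *out)
    {
        unsigned int next = cur ? cur : initial;

        while (used > next) {
            /* Grow by ~50%; the +1 guarantees forward progress. */
            if (__builtin_add_overflow(next, next / 2 + 1, &next))
                return false;       /* would overflow, refuse to grow */
        }

        if (used < next / 3 && next / 2 >= initial)
            next /= 2;              /* shrink, but stay >= initial */

        *out = next;
        return true;
    }

    int main(void)
    {
        unsigned int cap = 0, used;

        for (used = 1; used < 200000; used *= 4) {
            if (!pick_capacity(used, cap, 1024, &cap))
                return 1;
            printf("used=%u capacity=%u\n", used, cap);
        }
        return 0;
    }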
+6 -4
net/nfc/nci/core.c
··· 579 579 skb_queue_purge(&ndev->rx_q); 580 580 skb_queue_purge(&ndev->tx_q); 581 581 582 - /* Flush RX and TX wq */ 583 - flush_workqueue(ndev->rx_wq); 582 + /* Flush TX wq, RX wq flush can't be under the lock */ 584 583 flush_workqueue(ndev->tx_wq); 585 584 586 585 /* Reset device */ ··· 591 592 msecs_to_jiffies(NCI_RESET_TIMEOUT)); 592 593 593 594 /* After this point our queues are empty 594 - * and no works are scheduled. 595 + * rx work may be running but will see that NCI_UP was cleared 595 596 */ 596 597 ndev->ops->close(ndev); 597 598 598 599 clear_bit(NCI_INIT, &ndev->flags); 599 600 600 - /* Flush cmd wq */ 601 + /* Flush cmd and tx wq */ 601 602 flush_workqueue(ndev->cmd_wq); 602 603 603 604 timer_delete_sync(&ndev->cmd_timer); ··· 611 612 ndev->flags &= BIT(NCI_UNREG); 612 613 613 614 mutex_unlock(&ndev->req_lock); 615 + 616 + /* rx_work may take req_lock via nci_deactivate_target */ 617 + flush_workqueue(ndev->rx_wq); 614 618 615 619 return 0; 616 620 }
+2
net/openvswitch/flow_netlink.c
··· 2953 2953 case OVS_KEY_ATTR_MPLS: 2954 2954 if (!eth_p_mpls(eth_type)) 2955 2955 return -EINVAL; 2956 + if (key_len != sizeof(struct ovs_key_mpls)) 2957 + return -EINVAL; 2956 2958 break; 2957 2959 2958 2960 case OVS_KEY_ATTR_SCTP:
+8 -3
net/openvswitch/vport-netdev.c
··· 151 151 void ovs_netdev_detach_dev(struct vport *vport) 152 152 { 153 153 ASSERT_RTNL(); 154 - vport->dev->priv_flags &= ~IFF_OVS_DATAPATH; 155 154 netdev_rx_handler_unregister(vport->dev); 156 155 netdev_upper_dev_unlink(vport->dev, 157 156 netdev_master_upper_dev_get(vport->dev)); 158 157 dev_set_promiscuity(vport->dev, -1); 158 + 159 + /* paired with smp_mb() in netdev_destroy() */ 160 + smp_wmb(); 161 + 162 + vport->dev->priv_flags &= ~IFF_OVS_DATAPATH; 159 163 } 160 164 161 165 static void netdev_destroy(struct vport *vport) ··· 178 174 rtnl_unlock(); 179 175 } 180 176 177 + /* paired with smp_wmb() in ovs_netdev_detach_dev() */ 178 + smp_mb(); 179 + 181 180 call_rcu(&vport->rcu, vport_netdev_free); 182 181 } 183 182 ··· 196 189 */ 197 190 if (vport->dev->reg_state == NETREG_REGISTERED) 198 191 rtnl_delete_link(vport->dev, 0, NULL); 199 - netdev_put(vport->dev, &vport->dev_tracker); 200 - vport->dev = NULL; 201 192 rtnl_unlock(); 202 193 203 194 call_rcu(&vport->rcu, vport_netdev_free);
+1
net/packet/af_packet.c
··· 3135 3135 3136 3136 spin_lock(&po->bind_lock); 3137 3137 unregister_prot_hook(sk, false); 3138 + WRITE_ONCE(po->num, 0); 3138 3139 packet_cached_dev_reset(po); 3139 3140 3140 3141 if (po->prot_hook.dev) {
+8 -1
net/smc/smc_rx.c
··· 135 135 sock_put(sk); 136 136 } 137 137 138 + static bool smc_rx_pipe_buf_get(struct pipe_inode_info *pipe, 139 + struct pipe_buffer *buf) 140 + { 141 + /* smc_spd_priv in buf->private is not shareable; disallow cloning. */ 142 + return false; 143 + } 144 + 138 145 static const struct pipe_buf_operations smc_pipe_ops = { 139 146 .release = smc_rx_pipe_buf_release, 140 - .get = generic_pipe_buf_get 147 + .get = smc_rx_pipe_buf_get, 141 148 }; 142 149 143 150 static void smc_rx_spd_release(struct splice_pipe_desc *spd,
+1 -1
net/tls/tls_sw.c
··· 246 246 crypto_wait_req(-EINPROGRESS, &ctx->async_wait); 247 247 atomic_inc(&ctx->decrypt_pending); 248 248 249 + __skb_queue_purge(&ctx->async_hold); 249 250 return ctx->async_wait.err; 250 251 } 251 252 ··· 2226 2225 2227 2226 /* Wait for all previously submitted records to be decrypted */ 2228 2227 ret = tls_decrypt_async_wait(ctx); 2229 - __skb_queue_purge(&ctx->async_hold); 2230 2228 2231 2229 if (ret) { 2232 2230 if (err >= 0 || err == -EINPROGRESS)
+4 -1
net/xfrm/xfrm_input.c
··· 75 75 76 76 spin_lock_bh(&xfrm_input_afinfo_lock); 77 77 if (likely(xfrm_input_afinfo[afinfo->is_ipip][afinfo->family])) { 78 - if (unlikely(xfrm_input_afinfo[afinfo->is_ipip][afinfo->family] != afinfo)) 78 + const struct xfrm_input_afinfo *cur; 79 + 80 + cur = rcu_access_pointer(xfrm_input_afinfo[afinfo->is_ipip][afinfo->family]); 81 + if (unlikely(cur != afinfo)) 79 82 err = -EINVAL; 80 83 else 81 84 RCU_INIT_POINTER(xfrm_input_afinfo[afinfo->is_ipip][afinfo->family], NULL);
+14 -3
net/xfrm/xfrm_iptfs.c
··· 901 901 iptfs_skb_can_add_frags(newskb, fragwalk, data, copylen)) { 902 902 iptfs_skb_add_frags(newskb, fragwalk, data, copylen); 903 903 } else { 904 + if (skb_linearize(newskb)) { 905 + XFRM_INC_STATS(xs_net(xtfs->x), 906 + LINUX_MIB_XFRMINBUFFERERROR); 907 + goto abandon; 908 + } 909 + 904 910 /* copy fragment data into newskb */ 905 911 if (skb_copy_seq_read(st, data, skb_put(newskb, copylen), 906 912 copylen)) { ··· 997 991 998 992 iplen = be16_to_cpu(iph->tot_len); 999 993 iphlen = iph->ihl << 2; 994 + if (iplen < iphlen || iphlen < sizeof(*iph)) { 995 + XFRM_INC_STATS(net, 996 + LINUX_MIB_XFRMINHDRERROR); 997 + goto done; 998 + } 1000 999 protocol = cpu_to_be16(ETH_P_IP); 1001 1000 XFRM_MODE_SKB_CB(skbseq->root_skb)->tos = iph->tos; 1002 1001 } else if (iph->version == 0x6) { ··· 2664 2653 if (!xtfs) 2665 2654 return -ENOMEM; 2666 2655 2667 - x->mode_data = xtfs; 2668 - xtfs->x = x; 2669 - 2670 2656 xtfs->ra_newskb = NULL; 2671 2657 if (xtfs->cfg.reorder_win_size) { 2672 2658 xtfs->w_saved = kzalloc_objs(*xtfs->w_saved, ··· 2673 2665 return -ENOMEM; 2674 2666 } 2675 2667 } 2668 + 2669 + x->mode_data = xtfs; 2670 + xtfs->x = x; 2676 2671 2677 2672 return 0; 2678 2673 }
+1 -1
net/xfrm/xfrm_nat_keepalive.c
··· 261 261 262 262 int xfrm_nat_keepalive_net_fini(struct net *net) 263 263 { 264 - cancel_delayed_work_sync(&net->xfrm.nat_keepalive_work); 264 + disable_delayed_work_sync(&net->xfrm.nat_keepalive_work); 265 265 return 0; 266 266 } 267 267
+7 -5
net/xfrm/xfrm_policy.c
··· 4156 4156 int i; 4157 4157 4158 4158 for (i = 0; i < ARRAY_SIZE(xfrm_policy_afinfo); i++) { 4159 - if (xfrm_policy_afinfo[i] != afinfo) 4159 + if (rcu_access_pointer(xfrm_policy_afinfo[i]) != afinfo) 4160 4160 continue; 4161 4161 RCU_INIT_POINTER(xfrm_policy_afinfo[i], NULL); 4162 4162 break; ··· 4242 4242 net->xfrm.policy_count[XFRM_POLICY_MAX + dir] = 0; 4243 4243 4244 4244 htab = &net->xfrm.policy_bydst[dir]; 4245 - htab->table = xfrm_hash_alloc(sz); 4245 + rcu_assign_pointer(htab->table, xfrm_hash_alloc(sz)); 4246 4246 if (!htab->table) 4247 4247 goto out_bydst; 4248 4248 htab->hmask = hmask; ··· 4269 4269 struct xfrm_policy_hash *htab; 4270 4270 4271 4271 htab = &net->xfrm.policy_bydst[dir]; 4272 - xfrm_hash_free(htab->table, sz); 4272 + xfrm_hash_free(rcu_dereference_protected(htab->table, true), sz); 4273 4273 } 4274 4274 xfrm_hash_free(net->xfrm.policy_byidx, sz); 4275 4275 out_byidx: ··· 4281 4281 struct xfrm_pol_inexact_bin *b, *t; 4282 4282 unsigned int sz; 4283 4283 int dir; 4284 + 4285 + disable_work_sync(&net->xfrm.policy_hthresh.work); 4284 4286 4285 4287 flush_work(&net->xfrm.policy_hash_work); 4286 4288 #ifdef CONFIG_XFRM_SUB_POLICY ··· 4297 4295 4298 4296 htab = &net->xfrm.policy_bydst[dir]; 4299 4297 sz = (htab->hmask + 1) * sizeof(struct hlist_head); 4300 - WARN_ON(!hlist_empty(htab->table)); 4301 - xfrm_hash_free(htab->table, sz); 4298 + WARN_ON(!hlist_empty(rcu_dereference_protected(htab->table, true))); 4299 + xfrm_hash_free(rcu_dereference_protected(htab->table, true), sz); 4302 4300 } 4303 4301 4304 4302 sz = (net->xfrm.policy_idx_hmask + 1) * sizeof(struct hlist_head);
+64 -52
net/xfrm/xfrm_state.c
··· 53 53 static HLIST_HEAD(xfrm_state_gc_list); 54 54 static HLIST_HEAD(xfrm_state_dev_gc_list); 55 55 56 - static inline bool xfrm_state_hold_rcu(struct xfrm_state __rcu *x) 56 + static inline bool xfrm_state_hold_rcu(struct xfrm_state *x) 57 57 { 58 58 return refcount_inc_not_zero(&x->refcnt); 59 59 } ··· 870 870 for (i = 0; i <= net->xfrm.state_hmask; i++) { 871 871 struct xfrm_state *x; 872 872 873 - hlist_for_each_entry(x, net->xfrm.state_bydst+i, bydst) { 873 + hlist_for_each_entry(x, xfrm_state_deref_prot(net->xfrm.state_bydst, net) + i, bydst) { 874 874 if (xfrm_id_proto_match(x->id.proto, proto) && 875 875 (err = security_xfrm_state_delete(x)) != 0) { 876 876 xfrm_audit_state_delete(x, 0, task_valid); ··· 891 891 struct xfrm_state *x; 892 892 struct xfrm_dev_offload *xso; 893 893 894 - hlist_for_each_entry(x, net->xfrm.state_bydst+i, bydst) { 894 + hlist_for_each_entry(x, xfrm_state_deref_prot(net->xfrm.state_bydst, net) + i, bydst) { 895 895 xso = &x->xso; 896 896 897 897 if (xso->dev == dev && ··· 931 931 for (i = 0; i <= net->xfrm.state_hmask; i++) { 932 932 struct xfrm_state *x; 933 933 restart: 934 - hlist_for_each_entry(x, net->xfrm.state_bydst+i, bydst) { 934 + hlist_for_each_entry(x, xfrm_state_deref_prot(net->xfrm.state_bydst, net) + i, bydst) { 935 935 if (!xfrm_state_kern(x) && 936 936 xfrm_id_proto_match(x->id.proto, proto)) { 937 937 xfrm_state_hold(x); ··· 973 973 err = -ESRCH; 974 974 for (i = 0; i <= net->xfrm.state_hmask; i++) { 975 975 restart: 976 - hlist_for_each_entry(x, net->xfrm.state_bydst+i, bydst) { 976 + hlist_for_each_entry(x, xfrm_state_deref_prot(net->xfrm.state_bydst, net) + i, bydst) { 977 977 xso = &x->xso; 978 978 979 979 if (!xfrm_state_kern(x) && xso->dev == dev) { ··· 1563 1563 list_add(&x->km.all, &net->xfrm.state_all); 1564 1564 h = xfrm_dst_hash(net, daddr, saddr, tmpl->reqid, encap_family); 1565 1565 XFRM_STATE_INSERT(bydst, &x->bydst, 1566 - net->xfrm.state_bydst + h, 1566 + xfrm_state_deref_prot(net->xfrm.state_bydst, net) + h, 1567 1567 x->xso.type); 1568 1568 h = xfrm_src_hash(net, daddr, saddr, encap_family); 1569 1569 XFRM_STATE_INSERT(bysrc, &x->bysrc, 1570 - net->xfrm.state_bysrc + h, 1570 + xfrm_state_deref_prot(net->xfrm.state_bysrc, net) + h, 1571 1571 x->xso.type); 1572 1572 INIT_HLIST_NODE(&x->state_cache); 1573 1573 if (x->id.spi) { 1574 1574 h = xfrm_spi_hash(net, &x->id.daddr, x->id.spi, x->id.proto, encap_family); 1575 1575 XFRM_STATE_INSERT(byspi, &x->byspi, 1576 - net->xfrm.state_byspi + h, 1576 + xfrm_state_deref_prot(net->xfrm.state_byspi, net) + h, 1577 1577 x->xso.type); 1578 1578 } 1579 1579 if (x->km.seq) { 1580 1580 h = xfrm_seq_hash(net, x->km.seq); 1581 1581 XFRM_STATE_INSERT(byseq, &x->byseq, 1582 - net->xfrm.state_byseq + h, 1582 + xfrm_state_deref_prot(net->xfrm.state_byseq, net) + h, 1583 1583 x->xso.type); 1584 1584 } 1585 1585 x->lft.hard_add_expires_seconds = net->xfrm.sysctl_acq_expires; ··· 1652 1652 1653 1653 spin_lock_bh(&net->xfrm.xfrm_state_lock); 1654 1654 h = xfrm_dst_hash(net, daddr, saddr, reqid, family); 1655 - hlist_for_each_entry(x, net->xfrm.state_bydst+h, bydst) { 1655 + hlist_for_each_entry(x, xfrm_state_deref_prot(net->xfrm.state_bydst, net) + h, bydst) { 1656 1656 if (x->props.family == family && 1657 1657 x->props.reqid == reqid && 1658 1658 (mark & x->mark.m) == x->mark.v && ··· 1703 1703 struct xfrm_state *x; 1704 1704 unsigned int i; 1705 1705 1706 - rcu_read_lock(); 1707 1706 for (i = 0; i <= net->xfrm.state_hmask; i++) { 1708 - hlist_for_each_entry_rcu(x, &net->xfrm.state_byspi[i], 
byspi) { 1709 - if (x->id.spi == spi && x->id.proto == proto) { 1710 - if (!xfrm_state_hold_rcu(x)) 1711 - continue; 1712 - rcu_read_unlock(); 1707 + hlist_for_each_entry(x, xfrm_state_deref_prot(net->xfrm.state_byspi, net) + i, byspi) { 1708 + if (x->id.spi == spi && x->id.proto == proto) 1713 1709 return x; 1714 - } 1715 1710 } 1716 1711 } 1717 - rcu_read_unlock(); 1718 1712 return NULL; 1719 1713 } 1720 1714 ··· 1724 1730 1725 1731 h = xfrm_dst_hash(net, &x->id.daddr, &x->props.saddr, 1726 1732 x->props.reqid, x->props.family); 1727 - XFRM_STATE_INSERT(bydst, &x->bydst, net->xfrm.state_bydst + h, 1733 + XFRM_STATE_INSERT(bydst, &x->bydst, 1734 + xfrm_state_deref_prot(net->xfrm.state_bydst, net) + h, 1728 1735 x->xso.type); 1729 1736 1730 1737 h = xfrm_src_hash(net, &x->id.daddr, &x->props.saddr, x->props.family); 1731 - XFRM_STATE_INSERT(bysrc, &x->bysrc, net->xfrm.state_bysrc + h, 1738 + XFRM_STATE_INSERT(bysrc, &x->bysrc, 1739 + xfrm_state_deref_prot(net->xfrm.state_bysrc, net) + h, 1732 1740 x->xso.type); 1733 1741 1734 1742 if (x->id.spi) { 1735 1743 h = xfrm_spi_hash(net, &x->id.daddr, x->id.spi, x->id.proto, 1736 1744 x->props.family); 1737 1745 1738 - XFRM_STATE_INSERT(byspi, &x->byspi, net->xfrm.state_byspi + h, 1746 + XFRM_STATE_INSERT(byspi, &x->byspi, 1747 + xfrm_state_deref_prot(net->xfrm.state_byspi, net) + h, 1739 1748 x->xso.type); 1740 1749 } 1741 1750 1742 1751 if (x->km.seq) { 1743 1752 h = xfrm_seq_hash(net, x->km.seq); 1744 1753 1745 - XFRM_STATE_INSERT(byseq, &x->byseq, net->xfrm.state_byseq + h, 1754 + XFRM_STATE_INSERT(byseq, &x->byseq, 1755 + xfrm_state_deref_prot(net->xfrm.state_byseq, net) + h, 1746 1756 x->xso.type); 1747 1757 } 1748 1758 ··· 1773 1775 u32 cpu_id = xnew->pcpu_num; 1774 1776 1775 1777 h = xfrm_dst_hash(net, &xnew->id.daddr, &xnew->props.saddr, reqid, family); 1776 - hlist_for_each_entry(x, net->xfrm.state_bydst+h, bydst) { 1778 + hlist_for_each_entry(x, xfrm_state_deref_prot(net->xfrm.state_bydst, net) + h, bydst) { 1777 1779 if (x->props.family == family && 1778 1780 x->props.reqid == reqid && 1779 1781 x->if_id == if_id && ··· 1809 1811 struct xfrm_state *x; 1810 1812 u32 mark = m->v & m->m; 1811 1813 1812 - hlist_for_each_entry(x, net->xfrm.state_bydst+h, bydst) { 1814 + hlist_for_each_entry(x, xfrm_state_deref_prot(net->xfrm.state_bydst, net) + h, bydst) { 1813 1815 if (x->props.reqid != reqid || 1814 1816 x->props.mode != mode || 1815 1817 x->props.family != family || ··· 1866 1868 ktime_set(net->xfrm.sysctl_acq_expires, 0), 1867 1869 HRTIMER_MODE_REL_SOFT); 1868 1870 list_add(&x->km.all, &net->xfrm.state_all); 1869 - XFRM_STATE_INSERT(bydst, &x->bydst, net->xfrm.state_bydst + h, 1871 + XFRM_STATE_INSERT(bydst, &x->bydst, 1872 + xfrm_state_deref_prot(net->xfrm.state_bydst, net) + h, 1870 1873 x->xso.type); 1871 1874 h = xfrm_src_hash(net, daddr, saddr, family); 1872 - XFRM_STATE_INSERT(bysrc, &x->bysrc, net->xfrm.state_bysrc + h, 1875 + XFRM_STATE_INSERT(bysrc, &x->bysrc, 1876 + xfrm_state_deref_prot(net->xfrm.state_bysrc, net) + h, 1873 1877 x->xso.type); 1874 1878 1875 1879 net->xfrm.state_num++; ··· 2091 2091 if (m->reqid) { 2092 2092 h = xfrm_dst_hash(net, &m->old_daddr, &m->old_saddr, 2093 2093 m->reqid, m->old_family); 2094 - hlist_for_each_entry(x, net->xfrm.state_bydst+h, bydst) { 2094 + hlist_for_each_entry(x, xfrm_state_deref_prot(net->xfrm.state_bydst, net) + h, bydst) { 2095 2095 if (x->props.mode != m->mode || 2096 2096 x->id.proto != m->proto) 2097 2097 continue; ··· 2110 2110 } else { 2111 2111 h = xfrm_src_hash(net, 
&m->old_daddr, &m->old_saddr, 2112 2112 m->old_family); 2113 - hlist_for_each_entry(x, net->xfrm.state_bysrc+h, bysrc) { 2113 + hlist_for_each_entry(x, xfrm_state_deref_prot(net->xfrm.state_bysrc, net) + h, bysrc) { 2114 2114 if (x->props.mode != m->mode || 2115 2115 x->id.proto != m->proto) 2116 2116 continue; ··· 2264 2264 2265 2265 err = 0; 2266 2266 x->km.state = XFRM_STATE_DEAD; 2267 + xfrm_dev_state_delete(x); 2267 2268 __xfrm_state_put(x); 2268 2269 } 2269 2270 ··· 2313 2312 2314 2313 spin_lock_bh(&net->xfrm.xfrm_state_lock); 2315 2314 for (i = 0; i <= net->xfrm.state_hmask; i++) { 2316 - hlist_for_each_entry(x, net->xfrm.state_bydst + i, bydst) 2315 + hlist_for_each_entry(x, xfrm_state_deref_prot(net->xfrm.state_bydst, net) + i, bydst) 2317 2316 xfrm_dev_state_update_stats(x); 2318 2317 } 2319 2318 spin_unlock_bh(&net->xfrm.xfrm_state_lock); ··· 2504 2503 unsigned int h = xfrm_seq_hash(net, seq); 2505 2504 struct xfrm_state *x; 2506 2505 2507 - hlist_for_each_entry_rcu(x, net->xfrm.state_byseq + h, byseq) { 2506 + hlist_for_each_entry(x, xfrm_state_deref_prot(net->xfrm.state_byseq, net) + h, byseq) { 2508 2507 if (x->km.seq == seq && 2509 2508 (mark & x->mark.m) == x->mark.v && 2510 2509 x->pcpu_num == pcpu_num && ··· 2603 2602 if (!x0) { 2604 2603 x->id.spi = newspi; 2605 2604 h = xfrm_spi_hash(net, &x->id.daddr, newspi, x->id.proto, x->props.family); 2606 - XFRM_STATE_INSERT(byspi, &x->byspi, net->xfrm.state_byspi + h, x->xso.type); 2605 + XFRM_STATE_INSERT(byspi, &x->byspi, 2606 + xfrm_state_deref_prot(net->xfrm.state_byspi, net) + h, 2607 + x->xso.type); 2607 2608 spin_unlock_bh(&net->xfrm.xfrm_state_lock); 2608 2609 err = 0; 2609 2610 goto unlock; 2610 2611 } 2611 - xfrm_state_put(x0); 2612 2612 spin_unlock_bh(&net->xfrm.xfrm_state_lock); 2613 2613 2614 2614 next: ··· 3260 3258 3261 3259 int __net_init xfrm_state_init(struct net *net) 3262 3260 { 3261 + struct hlist_head *ndst, *nsrc, *nspi, *nseq; 3263 3262 unsigned int sz; 3264 3263 3265 3264 if (net_eq(net, &init_net)) ··· 3271 3268 3272 3269 sz = sizeof(struct hlist_head) * 8; 3273 3270 3274 - net->xfrm.state_bydst = xfrm_hash_alloc(sz); 3275 - if (!net->xfrm.state_bydst) 3271 + ndst = xfrm_hash_alloc(sz); 3272 + if (!ndst) 3276 3273 goto out_bydst; 3277 - net->xfrm.state_bysrc = xfrm_hash_alloc(sz); 3278 - if (!net->xfrm.state_bysrc) 3274 + rcu_assign_pointer(net->xfrm.state_bydst, ndst); 3275 + 3276 + nsrc = xfrm_hash_alloc(sz); 3277 + if (!nsrc) 3279 3278 goto out_bysrc; 3280 - net->xfrm.state_byspi = xfrm_hash_alloc(sz); 3281 - if (!net->xfrm.state_byspi) 3279 + rcu_assign_pointer(net->xfrm.state_bysrc, nsrc); 3280 + 3281 + nspi = xfrm_hash_alloc(sz); 3282 + if (!nspi) 3282 3283 goto out_byspi; 3283 - net->xfrm.state_byseq = xfrm_hash_alloc(sz); 3284 - if (!net->xfrm.state_byseq) 3284 + rcu_assign_pointer(net->xfrm.state_byspi, nspi); 3285 + 3286 + nseq = xfrm_hash_alloc(sz); 3287 + if (!nseq) 3285 3288 goto out_byseq; 3289 + rcu_assign_pointer(net->xfrm.state_byseq, nseq); 3286 3290 3287 3291 net->xfrm.state_cache_input = alloc_percpu(struct hlist_head); 3288 3292 if (!net->xfrm.state_cache_input) ··· 3305 3295 return 0; 3306 3296 3307 3297 out_state_cache_input: 3308 - xfrm_hash_free(net->xfrm.state_byseq, sz); 3298 + xfrm_hash_free(nseq, sz); 3309 3299 out_byseq: 3310 - xfrm_hash_free(net->xfrm.state_byspi, sz); 3300 + xfrm_hash_free(nspi, sz); 3311 3301 out_byspi: 3312 - xfrm_hash_free(net->xfrm.state_bysrc, sz); 3302 + xfrm_hash_free(nsrc, sz); 3313 3303 out_bysrc: 3314 - xfrm_hash_free(net->xfrm.state_bydst, sz); 
3304 + xfrm_hash_free(ndst, sz); 3315 3305 out_bydst: 3316 3306 return -ENOMEM; 3317 3307 } 3318 3308 3309 + #define xfrm_state_deref_netexit(table) \ 3310 + rcu_dereference_protected((table), true /* netns is going away */) 3319 3311 void xfrm_state_fini(struct net *net) 3320 3312 { 3321 3313 unsigned int sz; ··· 3330 3318 WARN_ON(!list_empty(&net->xfrm.state_all)); 3331 3319 3332 3320 for (i = 0; i <= net->xfrm.state_hmask; i++) { 3333 - WARN_ON(!hlist_empty(net->xfrm.state_byseq + i)); 3334 - WARN_ON(!hlist_empty(net->xfrm.state_byspi + i)); 3335 - WARN_ON(!hlist_empty(net->xfrm.state_bysrc + i)); 3336 - WARN_ON(!hlist_empty(net->xfrm.state_bydst + i)); 3321 + WARN_ON(!hlist_empty(xfrm_state_deref_netexit(net->xfrm.state_byseq) + i)); 3322 + WARN_ON(!hlist_empty(xfrm_state_deref_netexit(net->xfrm.state_byspi) + i)); 3323 + WARN_ON(!hlist_empty(xfrm_state_deref_netexit(net->xfrm.state_bysrc) + i)); 3324 + WARN_ON(!hlist_empty(xfrm_state_deref_netexit(net->xfrm.state_bydst) + i)); 3337 3325 } 3338 3326 3339 3327 sz = (net->xfrm.state_hmask + 1) * sizeof(struct hlist_head); 3340 - xfrm_hash_free(net->xfrm.state_byseq, sz); 3341 - xfrm_hash_free(net->xfrm.state_byspi, sz); 3342 - xfrm_hash_free(net->xfrm.state_bysrc, sz); 3343 - xfrm_hash_free(net->xfrm.state_bydst, sz); 3328 + xfrm_hash_free(xfrm_state_deref_netexit(net->xfrm.state_byseq), sz); 3329 + xfrm_hash_free(xfrm_state_deref_netexit(net->xfrm.state_byspi), sz); 3330 + xfrm_hash_free(xfrm_state_deref_netexit(net->xfrm.state_bysrc), sz); 3331 + xfrm_hash_free(xfrm_state_deref_netexit(net->xfrm.state_bydst), sz); 3344 3332 free_percpu(net->xfrm.state_cache_input); 3345 3333 } 3346 3334
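The xfrm_policy.c and xfrm_state.c hunks above move the hash tables behind RCU-annotated pointers, publishing them with rcu_assign_pointer() and reading them under the appropriate protection with rcu_dereference_protected()/rcu_access_pointer(). Outside the kernel, the closest standard-C analogue of that publish/read pairing is a release store matched by an acquire load. The userspace sketch below is only that analogy, not kernel code: acquire ordering is stronger than the dependency ordering rcu_dereference() relies on, and none of its names come from the patch.

    #include <stdatomic.h>
    #include <stdio.h>

    struct table {
        int nbuckets;
    };

    static _Atomic(struct table *) current_table;

    /* Writer: initialise the table fully, then publish it with release
     * semantics, roughly what rcu_assign_pointer() provides. */
    static void publish(struct table *t)
    {
        atomic_store_explicit(&current_table, t, memory_order_release);
    }

    /* Reader: acquire load so the table's contents are visible, roughly the
     * role rcu_dereference() plays on the kernel side. */
    static struct table *lookup(void)
    {
        return atomic_load_explicit(&current_table, memory_order_acquire);
    }

    int main(void)
    {
        static struct table t = { .nbuckets = 8 };

        publish(&t);
        printf("buckets: %d\n", lookup()->nbuckets);
        return 0;
    }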
+22 -10
net/xfrm/xfrm_user.c
··· 35 35 #endif 36 36 #include <linux/unaligned.h> 37 37 38 + static struct sock *xfrm_net_nlsk(const struct net *net, const struct sk_buff *skb) 39 + { 40 + /* get the source of this request, see netlink_unicast_kernel */ 41 + const struct sock *sk = NETLINK_CB(skb).sk; 42 + 43 + /* sk is refcounted, the netns stays alive and nlsk with it */ 44 + return rcu_dereference_protected(net->xfrm.nlsk, sk->sk_net_refcnt); 45 + } 46 + 38 47 static int verify_one_alg(struct nlattr **attrs, enum xfrm_attr_type_t type, 39 48 struct netlink_ext_ack *extack) 40 49 { ··· 1736 1727 err = build_spdinfo(r_skb, net, sportid, seq, *flags); 1737 1728 BUG_ON(err < 0); 1738 1729 1739 - return nlmsg_unicast(net->xfrm.nlsk, r_skb, sportid); 1730 + return nlmsg_unicast(xfrm_net_nlsk(net, skb), r_skb, sportid); 1740 1731 } 1741 1732 1742 1733 static inline unsigned int xfrm_sadinfo_msgsize(void) ··· 1796 1787 err = build_sadinfo(r_skb, net, sportid, seq, *flags); 1797 1788 BUG_ON(err < 0); 1798 1789 1799 - return nlmsg_unicast(net->xfrm.nlsk, r_skb, sportid); 1790 + return nlmsg_unicast(xfrm_net_nlsk(net, skb), r_skb, sportid); 1800 1791 } 1801 1792 1802 1793 static int xfrm_get_sa(struct sk_buff *skb, struct nlmsghdr *nlh, ··· 1816 1807 if (IS_ERR(resp_skb)) { 1817 1808 err = PTR_ERR(resp_skb); 1818 1809 } else { 1819 - err = nlmsg_unicast(net->xfrm.nlsk, resp_skb, NETLINK_CB(skb).portid); 1810 + err = nlmsg_unicast(xfrm_net_nlsk(net, skb), resp_skb, NETLINK_CB(skb).portid); 1820 1811 } 1821 1812 xfrm_state_put(x); 1822 1813 out_noput: ··· 1859 1850 pcpu_num = nla_get_u32(attrs[XFRMA_SA_PCPU]); 1860 1851 if (pcpu_num >= num_possible_cpus()) { 1861 1852 err = -EINVAL; 1853 + NL_SET_ERR_MSG(extack, "pCPU number too big"); 1862 1854 goto out_noput; 1863 1855 } 1864 1856 } ··· 1907 1897 } 1908 1898 } 1909 1899 1910 - err = nlmsg_unicast(net->xfrm.nlsk, resp_skb, NETLINK_CB(skb).portid); 1900 + err = nlmsg_unicast(xfrm_net_nlsk(net, skb), resp_skb, NETLINK_CB(skb).portid); 1911 1901 1912 1902 out: 1913 1903 xfrm_state_put(x); ··· 2552 2542 r_up->out = net->xfrm.policy_default[XFRM_POLICY_OUT]; 2553 2543 nlmsg_end(r_skb, r_nlh); 2554 2544 2555 - return nlmsg_unicast(net->xfrm.nlsk, r_skb, portid); 2545 + return nlmsg_unicast(xfrm_net_nlsk(net, skb), r_skb, portid); 2556 2546 } 2557 2547 2558 2548 static int xfrm_get_policy(struct sk_buff *skb, struct nlmsghdr *nlh, ··· 2618 2608 if (IS_ERR(resp_skb)) { 2619 2609 err = PTR_ERR(resp_skb); 2620 2610 } else { 2621 - err = nlmsg_unicast(net->xfrm.nlsk, resp_skb, 2611 + err = nlmsg_unicast(xfrm_net_nlsk(net, skb), resp_skb, 2622 2612 NETLINK_CB(skb).portid); 2623 2613 } 2624 2614 } else { ··· 2791 2781 err = build_aevent(r_skb, x, &c); 2792 2782 BUG_ON(err < 0); 2793 2783 2794 - err = nlmsg_unicast(net->xfrm.nlsk, r_skb, NETLINK_CB(skb).portid); 2784 + err = nlmsg_unicast(xfrm_net_nlsk(net, skb), r_skb, NETLINK_CB(skb).portid); 2795 2785 spin_unlock_bh(&x->lock); 2796 2786 xfrm_state_put(x); 2797 2787 return err; ··· 3011 3001 if (attrs[XFRMA_SA_PCPU]) { 3012 3002 x->pcpu_num = nla_get_u32(attrs[XFRMA_SA_PCPU]); 3013 3003 err = -EINVAL; 3014 - if (x->pcpu_num >= num_possible_cpus()) 3004 + if (x->pcpu_num >= num_possible_cpus()) { 3005 + NL_SET_ERR_MSG(extack, "pCPU number too big"); 3015 3006 goto free_state; 3007 + } 3016 3008 } 3017 3009 3018 3010 err = verify_newpolicy_info(&ua->policy, extack); ··· 3495 3483 goto err; 3496 3484 } 3497 3485 3498 - err = netlink_dump_start(net->xfrm.nlsk, skb, nlh, &c); 3486 + err = netlink_dump_start(xfrm_net_nlsk(net, skb), skb, nlh, 
&c); 3499 3487 goto err; 3500 3488 } 3501 3489 ··· 3685 3673 } 3686 3674 if (x->if_id) 3687 3675 l += nla_total_size(sizeof(x->if_id)); 3688 - if (x->pcpu_num) 3676 + if (x->pcpu_num != UINT_MAX) 3689 3677 l += nla_total_size(sizeof(x->pcpu_num)); 3690 3678 3691 3679 /* Must count x->lastused as it may become non-zero behind our back. */
+18 -15
rust/kernel/regulator.rs
··· 23 23 prelude::*, 24 24 }; 25 25 26 - use core::{marker::PhantomData, mem::ManuallyDrop, ptr::NonNull}; 26 + use core::{ 27 + marker::PhantomData, 28 + mem::ManuallyDrop, // 29 + }; 27 30 28 31 mod private { 29 32 pub trait Sealed {} ··· 232 229 /// 233 230 /// # Invariants 234 231 /// 235 - /// - `inner` is a non-null wrapper over a pointer to a `struct 236 - /// regulator` obtained from [`regulator_get()`]. 232 + /// - `inner` is a pointer obtained from a successful call to 233 + /// [`regulator_get()`]. It is treated as an opaque token that may only be 234 + /// accessed using C API methods (e.g., it may be `NULL` if the C API returns 235 + /// `NULL`). 237 236 /// 238 237 /// [`regulator_get()`]: https://docs.kernel.org/driver-api/regulator.html#c.regulator_get 239 238 pub struct Regulator<State> 240 239 where 241 240 State: RegulatorState, 242 241 { 243 - inner: NonNull<bindings::regulator>, 242 + inner: *mut bindings::regulator, 244 243 _phantom: PhantomData<State>, 245 244 } 246 245 ··· 254 249 // SAFETY: Safe as per the type invariants of `Regulator`. 255 250 to_result(unsafe { 256 251 bindings::regulator_set_voltage( 257 - self.inner.as_ptr(), 252 + self.inner, 258 253 min_voltage.as_microvolts(), 259 254 max_voltage.as_microvolts(), 260 255 ) ··· 264 259 /// Gets the current voltage of the regulator. 265 260 pub fn get_voltage(&self) -> Result<Voltage> { 266 261 // SAFETY: Safe as per the type invariants of `Regulator`. 267 - let voltage = unsafe { bindings::regulator_get_voltage(self.inner.as_ptr()) }; 262 + let voltage = unsafe { bindings::regulator_get_voltage(self.inner) }; 268 263 269 264 to_result(voltage).map(|()| Voltage::from_microvolts(voltage)) 270 265 } ··· 275 270 // received from the C code. 276 271 from_err_ptr(unsafe { bindings::regulator_get(dev.as_raw(), name.as_char_ptr()) })?; 277 272 278 - // SAFETY: We can safely trust `inner` to be a pointer to a valid 279 - // regulator if `ERR_PTR` was not returned. 280 - let inner = unsafe { NonNull::new_unchecked(inner) }; 281 - 273 + // INVARIANT: `inner` is a pointer obtained from `regulator_get()`, and 274 + // the call was successful. 282 275 Ok(Self { 283 276 inner, 284 277 _phantom: PhantomData, ··· 285 282 286 283 fn enable_internal(&self) -> Result { 287 284 // SAFETY: Safe as per the type invariants of `Regulator`. 288 - to_result(unsafe { bindings::regulator_enable(self.inner.as_ptr()) }) 285 + to_result(unsafe { bindings::regulator_enable(self.inner) }) 289 286 } 290 287 291 288 fn disable_internal(&self) -> Result { 292 289 // SAFETY: Safe as per the type invariants of `Regulator`. 293 - to_result(unsafe { bindings::regulator_disable(self.inner.as_ptr()) }) 290 + to_result(unsafe { bindings::regulator_disable(self.inner) }) 294 291 } 295 292 } 296 293 ··· 352 349 /// Checks if the regulator is enabled. 353 350 pub fn is_enabled(&self) -> bool { 354 351 // SAFETY: Safe as per the type invariants of `Regulator`. 355 - unsafe { bindings::regulator_is_enabled(self.inner.as_ptr()) != 0 } 352 + unsafe { bindings::regulator_is_enabled(self.inner) != 0 } 356 353 } 357 354 } 358 355 ··· 362 359 // SAFETY: By the type invariants, we know that `self` owns a 363 360 // reference on the enabled refcount, so it is safe to relinquish it 364 361 // now. 365 - unsafe { bindings::regulator_disable(self.inner.as_ptr()) }; 362 + unsafe { bindings::regulator_disable(self.inner) }; 366 363 } 367 364 // SAFETY: By the type invariants, we know that `self` owns a reference, 368 365 // so it is safe to relinquish it now. 
369 - unsafe { bindings::regulator_put(self.inner.as_ptr()) }; 366 + unsafe { bindings::regulator_put(self.inner) }; 370 367 } 371 368 } 372 369
+3 -2
samples/landlock/sandboxer.c
··· 299 299 300 300 /* clang-format on */ 301 301 302 - #define LANDLOCK_ABI_LAST 7 302 + #define LANDLOCK_ABI_LAST 8 303 303 304 304 #define XSTR(s) #s 305 305 #define STR(s) XSTR(s) ··· 436 436 /* Removes LANDLOCK_RESTRICT_SELF_LOG_NEW_EXEC_ON for ABI < 7 */ 437 437 supported_restrict_flags &= 438 438 ~LANDLOCK_RESTRICT_SELF_LOG_NEW_EXEC_ON; 439 - 439 + __attribute__((fallthrough)); 440 + case 7: 440 441 /* Must be printed for any ABI < LANDLOCK_ABI_LAST. */ 441 442 fprintf(stderr, 442 443 "Hint: You should update the running kernel "
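The sandboxer.c hunk above bumps LANDLOCK_ABI_LAST to 8 and adds an explicit fallthrough annotation plus a new case 7 to the compatibility switch. As a minimal sketch of that cascading-switch idiom only, with entirely made-up feature flags and version numbers (nothing below is taken from the sample), each case strips what its ABI level lacks and deliberately falls through to the next:

    #include <stdio.h>

    #define FEATURE_A (1U << 0)  /* pretend this arrived in ABI 7 */
    #define FEATURE_B (1U << 1)  /* pretend this arrived in ABI 8 */

    static unsigned int supported_features(int abi)
    {
        unsigned int features = FEATURE_A | FEATURE_B;

        switch (abi) {
        case 6:
            /* ABI 6 predates FEATURE_A: drop it, then keep falling through
             * so the later cases drop what they lack as well. */
            features &= ~FEATURE_A;
            __attribute__((fallthrough));
        case 7:
            /* ABI 7 predates FEATURE_B. */
            features &= ~FEATURE_B;
            break;
        }
        return features;
    }

    int main(void)
    {
        int abi;

        for (abi = 6; abi <= 8; abi++)
            printf("abi %d -> features 0x%x\n", abi, supported_features(abi));
        return 0;
    }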
+11
scripts/coccinelle/api/kmalloc_objs.cocci
··· 122 122 - ALLOC(struct_size_t(TYPE, FLEX, COUNT), GFP) 123 123 + ALLOC_FLEX(TYPE, FLEX, COUNT, GFP) 124 124 ) 125 + 126 + @drop_gfp_kernel depends on patch && !(file in "tools") && !(file in "samples")@ 127 + identifier ALLOC = {kmalloc_obj,kmalloc_objs,kmalloc_flex, 128 + kzalloc_obj,kzalloc_objs,kzalloc_flex, 129 + kvmalloc_obj,kvmalloc_objs,kvmalloc_flex, 130 + kvzalloc_obj,kvzalloc_objs,kvzalloc_flex}; 131 + @@ 132 + 133 + ALLOC(... 134 + - , GFP_KERNEL 135 + )
+10 -14
scripts/kconfig/merge_config.sh
··· 151 151 if ! "$AWK" -v prefix="$CONFIG_PREFIX" \ 152 152 -v warnoverride="$WARNOVERRIDE" \ 153 153 -v strict="$STRICT" \ 154 + -v outfile="$TMP_FILE.new" \ 154 155 -v builtin="$BUILTIN" \ 155 156 -v warnredun="$WARNREDUN" ' 156 157 BEGIN { ··· 196 195 197 196 # First pass: read merge file, store all lines and index 198 197 FILENAME == ARGV[1] { 199 - mergefile = FILENAME 198 + mergefile = FILENAME 200 199 merge_lines[FNR] = $0 201 200 merge_total = FNR 202 201 cfg = get_cfg($0) ··· 213 212 214 213 # Not a config or not in merge file - keep it 215 214 if (cfg == "" || !(cfg in merge_cfg)) { 216 - print $0 >> ARGV[3] 215 + print $0 >> outfile 217 216 next 218 217 } 219 218 220 - prev_val = $0 219 + prev_val = $0 221 220 new_val = merge_cfg[cfg] 222 221 223 222 # BUILTIN: do not demote y to m 224 223 if (builtin == "true" && new_val ~ /=m$/ && prev_val ~ /=y$/) { 225 224 warn_builtin(cfg, prev_val, new_val) 226 - print $0 >> ARGV[3] 225 + print $0 >> outfile 227 226 skip_merge[merge_cfg_line[cfg]] = 1 228 227 next 229 228 } ··· 236 235 237 236 # "=n" is the same as "is not set" 238 237 if (prev_val ~ /=n$/ && new_val ~ / is not set$/) { 239 - print $0 >> ARGV[3] 238 + print $0 >> outfile 240 239 next 241 240 } 242 241 ··· 247 246 } 248 247 } 249 248 250 - # output file, skip all lines 251 - FILENAME == ARGV[3] { 252 - nextfile 253 - } 254 - 255 249 END { 256 250 # Newline in case base file lacks trailing newline 257 - print "" >> ARGV[3] 251 + print "" >> outfile 258 252 # Append merge file, skipping lines marked for builtin preservation 259 253 for (i = 1; i <= merge_total; i++) { 260 254 if (!(i in skip_merge)) { 261 - print merge_lines[i] >> ARGV[3] 255 + print merge_lines[i] >> outfile 262 256 } 263 257 } 264 258 if (strict_violated) { 265 259 exit 1 266 260 } 267 261 }' \ 268 - "$ORIG_MERGE_FILE" "$TMP_FILE" "$TMP_FILE.new"; then 262 + "$ORIG_MERGE_FILE" "$TMP_FILE"; then 269 263 # awk exited non-zero, strict mode was violated 270 264 STRICT_MODE_VIOLATED=true 271 265 fi ··· 377 381 STRICT_MODE_VIOLATED=true 378 382 fi 379 383 380 - if [ "$STRICT" == "true" ] && [ "$STRICT_MODE_VIOLATED" == "true" ]; then 384 + if [ "$STRICT" = "true" ] && [ "$STRICT_MODE_VIOLATED" = "true" ]; then 381 385 echo "Requested and effective config differ" 382 386 exit 1 383 387 fi
+1 -2
security/landlock/domain.c
··· 94 94 * allocate with GFP_KERNEL_ACCOUNT because it is independent from the 95 95 * caller. 96 96 */ 97 - details = 98 - kzalloc_flex(*details, exe_path, path_size); 97 + details = kzalloc_flex(*details, exe_path, path_size); 99 98 if (!details) 100 99 return ERR_PTR(-ENOMEM); 101 100
+4 -5
security/landlock/ruleset.c
··· 32 32 { 33 33 struct landlock_ruleset *new_ruleset; 34 34 35 - new_ruleset = 36 - kzalloc_flex(*new_ruleset, access_masks, num_layers, 37 - GFP_KERNEL_ACCOUNT); 35 + new_ruleset = kzalloc_flex(*new_ruleset, access_masks, num_layers, 36 + GFP_KERNEL_ACCOUNT); 38 37 if (!new_ruleset) 39 38 return ERR_PTR(-ENOMEM); 40 39 refcount_set(&new_ruleset->usage, 1); ··· 558 559 if (IS_ERR(new_dom)) 559 560 return new_dom; 560 561 561 - new_dom->hierarchy = kzalloc_obj(*new_dom->hierarchy, 562 - GFP_KERNEL_ACCOUNT); 562 + new_dom->hierarchy = 563 + kzalloc_obj(*new_dom->hierarchy, GFP_KERNEL_ACCOUNT); 563 564 if (!new_dom->hierarchy) 564 565 return ERR_PTR(-ENOMEM); 565 566
+73 -19
security/landlock/tsync.c
··· 203 203 return ctx; 204 204 } 205 205 206 + /** 207 + * tsync_works_trim - Put the last tsync_work element 208 + * 209 + * @s: TSYNC works to trim. 210 + * 211 + * Put the last task and decrement the size of @s. 212 + * 213 + * This helper does not cancel a running task, but just reset the last element 214 + * to zero. 215 + */ 216 + static void tsync_works_trim(struct tsync_works *s) 217 + { 218 + struct tsync_work *ctx; 219 + 220 + if (WARN_ON_ONCE(s->size <= 0)) 221 + return; 222 + 223 + ctx = s->works[s->size - 1]; 224 + 225 + /* 226 + * For consistency, remove the task from ctx so that it does not look like 227 + * we handed it a task_work. 228 + */ 229 + put_task_struct(ctx->task); 230 + *ctx = (typeof(*ctx)){}; 231 + 232 + /* 233 + * Cancel the tsync_works_provide() change to recycle the reserved memory 234 + * for the next thread, if any. This also ensures that cancel_tsync_works() 235 + * and tsync_works_release() do not see any NULL task pointers. 236 + */ 237 + s->size--; 238 + } 239 + 206 240 /* 207 241 * tsync_works_grow_by - preallocates space for n more contexts in s 208 242 * ··· 290 256 * tsync_works_contains - checks for presence of task in s 291 257 */ 292 258 static bool tsync_works_contains_task(const struct tsync_works *s, 293 - struct task_struct *task) 259 + const struct task_struct *task) 294 260 { 295 261 size_t i; 296 262 297 263 for (i = 0; i < s->size; i++) 298 264 if (s->works[i]->task == task) 299 265 return true; 266 + 300 267 return false; 301 268 } 302 269 ··· 311 276 size_t i; 312 277 313 278 for (i = 0; i < s->size; i++) { 314 - if (!s->works[i]->task) 279 + if (WARN_ON_ONCE(!s->works[i]->task)) 315 280 continue; 316 281 317 282 put_task_struct(s->works[i]->task); ··· 319 284 320 285 for (i = 0; i < s->capacity; i++) 321 286 kfree(s->works[i]); 287 + 322 288 kfree(s->works); 323 289 s->works = NULL; 324 290 s->size = 0; ··· 331 295 */ 332 296 static size_t count_additional_threads(const struct tsync_works *works) 333 297 { 334 - struct task_struct *thread, *caller; 298 + const struct task_struct *caller, *thread; 335 299 size_t n = 0; 336 300 337 301 caller = current; ··· 370 334 struct tsync_shared_context *shared_ctx) 371 335 { 372 336 int err; 373 - struct task_struct *thread, *caller; 337 + const struct task_struct *caller; 338 + struct task_struct *thread; 374 339 struct tsync_work *ctx; 375 340 bool found_more_threads = false; 376 341 ··· 416 379 417 380 init_task_work(&ctx->work, restrict_one_thread_callback); 418 381 err = task_work_add(thread, &ctx->work, TWA_SIGNAL); 419 - if (err) { 382 + if (unlikely(err)) { 420 383 /* 421 384 * task_work_add() only fails if the task is about to exit. We 422 385 * checked that earlier, but it can happen as a race. Resume 423 386 * without setting an error, as the task is probably gone in the 424 - * next loop iteration. For consistency, remove the task from ctx 425 - * so that it does not look like we handed it a task_work. 387 + * next loop iteration. 426 388 */ 427 - put_task_struct(ctx->task); 428 - ctx->task = NULL; 389 + tsync_works_trim(works); 429 390 430 391 atomic_dec(&shared_ctx->num_preparing); 431 392 atomic_dec(&shared_ctx->num_unfinished); ··· 441 406 * shared_ctx->num_preparing and shared_ctx->num_unfished and mark the two 442 407 * completions if needed, as if the task was never scheduled. 
443 408 */ 444 - static void cancel_tsync_works(struct tsync_works *works, 409 + static void cancel_tsync_works(const struct tsync_works *works, 445 410 struct tsync_shared_context *shared_ctx) 446 411 { 447 - int i; 412 + size_t i; 448 413 449 414 for (i = 0; i < works->size; i++) { 415 + if (WARN_ON_ONCE(!works->works[i]->task)) 416 + continue; 417 + 450 418 if (!task_work_cancel(works->works[i]->task, 451 419 &works->works[i]->work)) 452 420 continue; ··· 484 446 shared_ctx.old_cred = old_cred; 485 447 shared_ctx.new_cred = new_cred; 486 448 shared_ctx.set_no_new_privs = task_no_new_privs(current); 449 + 450 + /* 451 + * Serialize concurrent TSYNC operations to prevent deadlocks when 452 + * multiple threads call landlock_restrict_self() simultaneously. 453 + * If the lock is already held, we gracefully yield by restarting the 454 + * syscall. This allows the current thread to process pending 455 + * task_works before retrying. 456 + */ 457 + if (!down_write_trylock(&current->signal->exec_update_lock)) 458 + return restart_syscall(); 487 459 488 460 /* 489 461 * We schedule a pseudo-signal task_work for each of the calling task's ··· 575 527 -ERESTARTNOINTR); 576 528 577 529 /* 578 - * Cancel task works for tasks that did not start running yet, 579 - * and decrement all_prepared and num_unfinished accordingly. 530 + * Opportunistic improvement: try to cancel task 531 + * works for tasks that did not start running 532 + * yet. We do not have a guarantee that it 533 + * cancels any of the enqueued task works 534 + * because task_work_run() might already have 535 + * dequeued them. 580 536 */ 581 537 cancel_tsync_works(&works, &shared_ctx); 582 538 583 539 /* 584 - * The remaining task works have started running, so waiting for 585 - * their completion will finish. 540 + * Break the loop with error. The cleanup code 541 + * after the loop unblocks the remaining 542 + * task_works. 586 543 */ 587 - wait_for_completion(&shared_ctx.all_prepared); 544 + break; 588 545 } 589 546 } 590 547 } while (found_more_threads && 591 548 !atomic_read(&shared_ctx.preparation_error)); 592 549 593 550 /* 594 - * We now have all sibling threads blocking and in "prepared" state in the 595 - * task work. Ask all threads to commit. 551 + * We now have either (a) all sibling threads blocking and in "prepared" 552 + * state in the task work, or (b) the preparation error is set. Ask all 553 + * threads to commit (or abort). 596 554 */ 597 555 complete_all(&shared_ctx.ready_to_commit); 598 556 ··· 610 556 wait_for_completion(&shared_ctx.all_finished); 611 557 612 558 tsync_works_release(&works); 613 - 559 + up_write(&current->signal->exec_update_lock); 614 560 return atomic_read(&shared_ctx.preparation_error); 615 561 }
+1
security/security.c
··· 61 61 [LOCKDOWN_BPF_WRITE_USER] = "use of bpf to write user RAM", 62 62 [LOCKDOWN_DBG_WRITE_KERNEL] = "use of kgdb/kdb to write kernel RAM", 63 63 [LOCKDOWN_RTAS_ERROR_INJECTION] = "RTAS error injection", 64 + [LOCKDOWN_XEN_USER_ACTIONS] = "Xen guest user action", 64 65 [LOCKDOWN_INTEGRITY_MAX] = "integrity", 65 66 [LOCKDOWN_KCORE] = "/proc/kcore access", 66 67 [LOCKDOWN_KPROBES] = "use of kprobes",
+1 -1
sound/firewire/amdtp-stream.c
··· 1164 1164 struct pkt_desc *desc = s->packet_descs_cursor; 1165 1165 unsigned int pkt_header_length; 1166 1166 unsigned int packets; 1167 - u32 curr_cycle_time; 1167 + u32 curr_cycle_time = 0; 1168 1168 bool need_hw_irq; 1169 1169 int i; 1170 1170
+59 -7
sound/hda/codecs/realtek/alc269.c
··· 1017 1017 return 0; 1018 1018 } 1019 1019 1020 - #define STARLABS_STARFIGHTER_SHUTUP_DELAY_MS 30 1020 + #define ALC233_STARFIGHTER_SPK_PIN 0x1b 1021 + #define ALC233_STARFIGHTER_GPIO2 0x04 1021 1022 1022 - static void starlabs_starfighter_shutup(struct hda_codec *codec) 1023 + static void alc233_starfighter_update_amp(struct hda_codec *codec, bool on) 1023 1024 { 1024 - if (snd_hda_gen_shutup_speakers(codec)) 1025 - msleep(STARLABS_STARFIGHTER_SHUTUP_DELAY_MS); 1025 + snd_hda_codec_write(codec, ALC233_STARFIGHTER_SPK_PIN, 0, 1026 + AC_VERB_SET_EAPD_BTLENABLE, 1027 + on ? AC_EAPDBTL_EAPD : 0); 1028 + alc_update_gpio_data(codec, ALC233_STARFIGHTER_GPIO2, on); 1029 + } 1030 + 1031 + static void alc233_starfighter_pcm_hook(struct hda_pcm_stream *hinfo, 1032 + struct hda_codec *codec, 1033 + struct snd_pcm_substream *substream, 1034 + int action) 1035 + { 1036 + switch (action) { 1037 + case HDA_GEN_PCM_ACT_PREPARE: 1038 + alc233_starfighter_update_amp(codec, true); 1039 + break; 1040 + case HDA_GEN_PCM_ACT_CLEANUP: 1041 + alc233_starfighter_update_amp(codec, false); 1042 + break; 1043 + } 1026 1044 } 1027 1045 1028 1046 static void alc233_fixup_starlabs_starfighter(struct hda_codec *codec, ··· 1049 1031 { 1050 1032 struct alc_spec *spec = codec->spec; 1051 1033 1052 - if (action == HDA_FIXUP_ACT_PRE_PROBE) 1053 - spec->shutup = starlabs_starfighter_shutup; 1034 + switch (action) { 1035 + case HDA_FIXUP_ACT_PRE_PROBE: 1036 + spec->gpio_mask |= ALC233_STARFIGHTER_GPIO2; 1037 + spec->gpio_dir |= ALC233_STARFIGHTER_GPIO2; 1038 + spec->gpio_data &= ~ALC233_STARFIGHTER_GPIO2; 1039 + break; 1040 + case HDA_FIXUP_ACT_PROBE: 1041 + spec->gen.pcm_playback_hook = alc233_starfighter_pcm_hook; 1042 + break; 1043 + } 1054 1044 } 1055 1045 1056 1046 static void alc269_fixup_pincfg_no_hp_to_lineout(struct hda_codec *codec, ··· 3725 3699 alc_fixup_hp_gpio_led(codec, action, 0x04, 0x0); 3726 3700 alc285_fixup_hp_coef_micmute_led(codec, fix, action); 3727 3701 } 3702 + 3703 + static void alc245_hp_spk_mute_led_update(void *private_data, int enabled) 3704 + { 3705 + struct hda_codec *codec = private_data; 3706 + unsigned int val; 3707 + 3708 + val = enabled ? 
0x08 : 0x04; /* 0x08 led on, 0x04 led off */ 3709 + alc_update_coef_idx(codec, 0x0b, 0x0c, val); 3710 + } 3711 + 3728 3712 /* JD2: mute led GPIO3: micmute led */ 3729 3713 static void alc245_tas2781_i2c_hp_fixup_muteled(struct hda_codec *codec, 3730 3714 const struct hda_fixup *fix, int action) 3731 3715 { 3732 3716 struct alc_spec *spec = codec->spec; 3717 + hda_nid_t hp_pin = alc_get_hp_pin(spec); 3733 3718 static const hda_nid_t conn[] = { 0x02 }; 3734 3719 3735 3720 switch (action) { 3736 3721 case HDA_FIXUP_ACT_PRE_PROBE: 3722 + if (!hp_pin) { 3723 + spec->gen.vmaster_mute.hook = alc245_hp_spk_mute_led_update; 3724 + spec->gen.vmaster_mute_led = 1; 3725 + } 3737 3726 spec->gen.auto_mute_via_amp = 1; 3738 3727 snd_hda_override_conn_list(codec, 0x17, ARRAY_SIZE(conn), conn); 3728 + break; 3729 + case HDA_FIXUP_ACT_INIT: 3730 + if (!hp_pin) 3731 + alc245_hp_spk_mute_led_update(codec, !spec->gen.master_mute); 3739 3732 break; 3740 3733 } 3741 3734 3742 3735 tas2781_fixup_txnw_i2c(codec, fix, action); 3743 - alc245_fixup_hp_mute_led_coefbit(codec, fix, action); 3736 + if (hp_pin) 3737 + alc245_fixup_hp_mute_led_coefbit(codec, fix, action); 3744 3738 alc285_fixup_hp_coef_micmute_led(codec, fix, action); 3745 3739 } 3746 3740 /* ··· 6900 6854 SND_PCI_QUIRK(0x103c, 0x8730, "HP ProBook 445 G7", ALC236_FIXUP_HP_MUTE_LED_MICMUTE_VREF), 6901 6855 SND_PCI_QUIRK(0x103c, 0x8735, "HP ProBook 435 G7", ALC236_FIXUP_HP_MUTE_LED_MICMUTE_VREF), 6902 6856 SND_PCI_QUIRK(0x103c, 0x8736, "HP", ALC285_FIXUP_HP_GPIO_AMP_INIT), 6857 + SND_PCI_QUIRK(0x103c, 0x8756, "HP ENVY Laptop 13-ba0xxx", ALC245_FIXUP_HP_X360_MUTE_LEDS), 6903 6858 SND_PCI_QUIRK(0x103c, 0x8760, "HP EliteBook 8{4,5}5 G7", ALC285_FIXUP_HP_BEEP_MICMUTE_LED), 6904 6859 SND_PCI_QUIRK(0x103c, 0x876e, "HP ENVY x360 Convertible 13-ay0xxx", ALC245_FIXUP_HP_X360_MUTE_LEDS), 6905 6860 SND_PCI_QUIRK(0x103c, 0x877a, "HP", ALC285_FIXUP_HP_MUTE_LED), ··· 6914 6867 SND_PCI_QUIRK(0x103c, 0x8788, "HP OMEN 15", ALC285_FIXUP_HP_MUTE_LED), 6915 6868 SND_PCI_QUIRK(0x103c, 0x87b7, "HP Laptop 14-fq0xxx", ALC236_FIXUP_HP_MUTE_LED_COEFBIT2), 6916 6869 SND_PCI_QUIRK(0x103c, 0x87c8, "HP", ALC287_FIXUP_HP_GPIO_LED), 6870 + SND_PCI_QUIRK(0x103c, 0x87cb, "HP Pavilion 15-eg0xxx", ALC287_FIXUP_HP_GPIO_LED), 6917 6871 SND_PCI_QUIRK(0x103c, 0x87cc, "HP Pavilion 15-eg0xxx", ALC287_FIXUP_HP_GPIO_LED), 6918 6872 SND_PCI_QUIRK(0x103c, 0x87d3, "HP Laptop 15-gw0xxx", ALC236_FIXUP_HP_MUTE_LED_COEFBIT2), 6919 6873 SND_PCI_QUIRK(0x103c, 0x87df, "HP ProBook 430 G8 Notebook PC", ALC236_FIXUP_HP_GPIO_LED), ··· 7146 7098 SND_PCI_QUIRK(0x103c, 0x8da7, "HP 14 Enstrom OmniBook X", ALC287_FIXUP_CS35L41_I2C_2), 7147 7099 SND_PCI_QUIRK(0x103c, 0x8da8, "HP 16 Piston OmniBook X", ALC287_FIXUP_CS35L41_I2C_2), 7148 7100 SND_PCI_QUIRK(0x103c, 0x8dd4, "HP EliteStudio 8 AIO", ALC274_FIXUP_HP_AIO_BIND_DACS), 7101 + SND_PCI_QUIRK(0x103c, 0x8dd7, "HP Laptop 15-fd0xxx", ALC236_FIXUP_HP_MUTE_LED_COEFBIT2), 7149 7102 SND_PCI_QUIRK(0x103c, 0x8de8, "HP Gemtree", ALC245_FIXUP_TAS2781_SPI_2), 7150 7103 SND_PCI_QUIRK(0x103c, 0x8de9, "HP Gemtree", ALC245_FIXUP_TAS2781_SPI_2), 7151 7104 SND_PCI_QUIRK(0x103c, 0x8dec, "HP EliteBook 640 G12", ALC236_FIXUP_HP_GPIO_LED), ··· 7226 7177 SND_PCI_QUIRK(0x1043, 0x115d, "Asus 1015E", ALC269_FIXUP_LIMIT_INT_MIC_BOOST), 7227 7178 SND_PCI_QUIRK(0x1043, 0x1194, "ASUS UM3406KA", ALC287_FIXUP_CS35L41_I2C_2), 7228 7179 SND_PCI_QUIRK(0x1043, 0x11c0, "ASUS X556UR", ALC255_FIXUP_ASUS_MIC_NO_PRESENCE), 7180 + HDA_CODEC_QUIRK(0x1043, 0x1204, "ASUS Strix G16 G615JMR", 
ALC287_FIXUP_TXNW2781_I2C_ASUS), 7229 7181 SND_PCI_QUIRK(0x1043, 0x1204, "ASUS Strix G615JHR_JMR_JPR", ALC287_FIXUP_TAS2781_I2C), 7230 7182 SND_PCI_QUIRK(0x1043, 0x1214, "ASUS Strix G615LH_LM_LP", ALC287_FIXUP_TAS2781_I2C), 7231 7183 SND_PCI_QUIRK(0x1043, 0x125e, "ASUS Q524UQK", ALC255_FIXUP_ASUS_MIC_NO_PRESENCE), ··· 7256 7206 SND_PCI_QUIRK(0x1043, 0x14e3, "ASUS G513PI/PU/PV", ALC287_FIXUP_CS35L41_I2C_2), 7257 7207 SND_PCI_QUIRK(0x1043, 0x14f2, "ASUS VivoBook X515JA", ALC256_FIXUP_ASUS_MIC_NO_PRESENCE), 7258 7208 SND_PCI_QUIRK(0x1043, 0x1503, "ASUS G733PY/PZ/PZV/PYV", ALC287_FIXUP_CS35L41_I2C_2), 7209 + SND_PCI_QUIRK(0x1043, 0x1514, "ASUS ROG Flow Z13 GZ302EAC", ALC287_FIXUP_CS35L41_I2C_2), 7259 7210 SND_PCI_QUIRK(0x1043, 0x1517, "Asus Zenbook UX31A", ALC269VB_FIXUP_ASUS_ZENBOOK_UX31A), 7260 7211 SND_PCI_QUIRK(0x1043, 0x1533, "ASUS GV302XA/XJ/XQ/XU/XV/XI", ALC287_FIXUP_CS35L41_I2C_2), 7261 7212 SND_PCI_QUIRK(0x1043, 0x1573, "ASUS GZ301VV/VQ/VU/VJ/VA/VC/VE/VVC/VQC/VUC/VJC/VEC/VCC", ALC285_FIXUP_ASUS_HEADSET_MIC), ··· 7626 7575 SND_PCI_QUIRK(0x17aa, 0x38ab, "Thinkbook 16P", ALC287_FIXUP_MG_RTKC_CSAMP_CS35L41_I2C_THINKPAD), 7627 7576 SND_PCI_QUIRK(0x17aa, 0x38b4, "Legion Slim 7 16IRH8", ALC287_FIXUP_CS35L41_I2C_2), 7628 7577 HDA_CODEC_QUIRK(0x17aa, 0x391c, "Lenovo Yoga 7 2-in-1 14AKP10", ALC287_FIXUP_YOGA9_14IAP7_BASS_SPK_PIN), 7578 + HDA_CODEC_QUIRK(0x17aa, 0x391d, "Lenovo Yoga 7 2-in-1 16AKP10", ALC287_FIXUP_YOGA9_14IAP7_BASS_SPK_PIN), 7629 7579 SND_PCI_QUIRK(0x17aa, 0x38b5, "Legion Slim 7 16IRH8", ALC287_FIXUP_CS35L41_I2C_2), 7630 7580 SND_PCI_QUIRK(0x17aa, 0x38b6, "Legion Slim 7 16APH8", ALC287_FIXUP_CS35L41_I2C_2), 7631 7581 SND_PCI_QUIRK(0x17aa, 0x38b7, "Legion Slim 7 16APH8", ALC287_FIXUP_CS35L41_I2C_2),
-1
sound/hda/controllers/intel.c
··· 2077 2077 { PCI_DEVICE_SUB(0x1022, 0x1487, 0x1043, 0x874f) }, /* ASUS ROG Zenith II / Strix */ 2078 2078 { PCI_DEVICE_SUB(0x1022, 0x1487, 0x1462, 0xcb59) }, /* MSI TRX40 Creator */ 2079 2079 { PCI_DEVICE_SUB(0x1022, 0x1487, 0x1462, 0xcb60) }, /* MSI TRX40 */ 2080 - { PCI_DEVICE_SUB(0x1022, 0x15e3, 0x1462, 0xee59) }, /* MSI X870E Tomahawk WiFi */ 2081 2080 {} 2082 2081 }; 2083 2082
+4 -2
sound/pci/asihpi/hpimsgx.c
··· 581 581 HPI_ADAPTER_OPEN); 582 582 hm.adapter_index = adapter; 583 583 hw_entry_point(&hm, &hr); 584 - memcpy(&rESP_HPI_ADAPTER_OPEN[adapter], &hr, 585 - sizeof(rESP_HPI_ADAPTER_OPEN[0])); 584 + memcpy(&rESP_HPI_ADAPTER_OPEN[adapter].h, &hr, 585 + sizeof(rESP_HPI_ADAPTER_OPEN[adapter].h)); 586 + memcpy(&rESP_HPI_ADAPTER_OPEN[adapter].a, &hr.u.ax.info, 587 + sizeof(rESP_HPI_ADAPTER_OPEN[adapter].a)); 586 588 if (hr.error) 587 589 return hr.error; 588 590
+14
sound/soc/amd/yc/acp6x-mach.c
··· 55 55 { 56 56 .driver_data = &acp6x_card, 57 57 .matches = { 58 + DMI_MATCH(DMI_BOARD_VENDOR, "HP"), 59 + DMI_MATCH(DMI_PRODUCT_NAME, "HP Laptop 15-fc0xxx"), 60 + } 61 + }, 62 + { 63 + .driver_data = &acp6x_card, 64 + .matches = { 58 65 DMI_MATCH(DMI_BOARD_VENDOR, "Dell Inc."), 59 66 DMI_MATCH(DMI_PRODUCT_NAME, "Dell G15 5525"), 60 67 } ··· 757 750 .matches = { 758 751 DMI_MATCH(DMI_BOARD_VENDOR, "Micro-Star International Co., Ltd."), 759 752 DMI_MATCH(DMI_PRODUCT_NAME, "Thin A15 B7VE"), 753 + } 754 + }, 755 + { 756 + .driver_data = &acp6x_card, 757 + .matches = { 758 + DMI_MATCH(DMI_BOARD_VENDOR, "ASUSTeK COMPUTER INC."), 759 + DMI_MATCH(DMI_PRODUCT_NAME, "M7601RM"), 760 760 } 761 761 }, 762 762 {}
+24 -10
sound/soc/codecs/adau1372.c
··· 762 762 return 0; 763 763 } 764 764 765 - static void adau1372_enable_pll(struct adau1372 *adau1372) 765 + static int adau1372_enable_pll(struct adau1372 *adau1372) 766 766 { 767 767 unsigned int val, timeout = 0; 768 768 int ret; ··· 778 778 timeout++; 779 779 } while (!(val & 1) && timeout < 3); 780 780 781 - if (ret < 0 || !(val & 1)) 781 + if (ret < 0 || !(val & 1)) { 782 782 dev_err(adau1372->dev, "Failed to lock PLL\n"); 783 + return ret < 0 ? ret : -ETIMEDOUT; 784 + } 785 + 786 + return 0; 783 787 } 784 788 785 - static void adau1372_set_power(struct adau1372 *adau1372, bool enable) 789 + static int adau1372_set_power(struct adau1372 *adau1372, bool enable) 786 790 { 787 791 if (adau1372->enabled == enable) 788 - return; 792 + return 0; 789 793 790 794 if (enable) { 791 795 unsigned int clk_ctrl = ADAU1372_CLK_CTRL_MCLK_EN; 796 + int ret; 792 797 793 - clk_prepare_enable(adau1372->mclk); 798 + ret = clk_prepare_enable(adau1372->mclk); 799 + if (ret) 800 + return ret; 794 801 if (adau1372->pd_gpio) 795 802 gpiod_set_value(adau1372->pd_gpio, 0); 796 803 ··· 811 804 * accessed. 812 805 */ 813 806 if (adau1372->use_pll) { 814 - adau1372_enable_pll(adau1372); 807 + ret = adau1372_enable_pll(adau1372); 808 + if (ret) { 809 + regcache_cache_only(adau1372->regmap, true); 810 + if (adau1372->pd_gpio) 811 + gpiod_set_value(adau1372->pd_gpio, 1); 812 + clk_disable_unprepare(adau1372->mclk); 813 + return ret; 814 + } 815 815 clk_ctrl |= ADAU1372_CLK_CTRL_CLKSRC; 816 816 } 817 817 ··· 843 829 } 844 830 845 831 adau1372->enabled = enable; 832 + 833 + return 0; 846 834 } 847 835 848 836 static int adau1372_set_bias_level(struct snd_soc_component *component, ··· 858 842 case SND_SOC_BIAS_PREPARE: 859 843 break; 860 844 case SND_SOC_BIAS_STANDBY: 861 - adau1372_set_power(adau1372, true); 862 - break; 845 + return adau1372_set_power(adau1372, true); 863 846 case SND_SOC_BIAS_OFF: 864 - adau1372_set_power(adau1372, false); 865 - break; 847 + return adau1372_set_power(adau1372, false); 866 848 } 867 849 868 850 return 0;
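The adau1372.c change above turns the power helpers into error-returning functions and undoes the clock and GPIO steps already taken when the PLL fails to lock. The self-contained sketch below shows the general acquire-then-unwind-in-reverse pattern the hunk moves toward; the functions are stand-ins invented for the example, not the driver's, and the PLL step is hard-coded to fail so the unwind path runs.

    #include <stdio.h>

    static int enable_clock(void)    { puts("clk on");       return 0; }
    static void disable_clock(void)  { puts("clk off"); }
    static int assert_power(void)    { puts("power on");     return 0; }
    static void deassert_power(void) { puts("power off"); }
    static int lock_pll(void)        { puts("pll: timeout"); return -1; }

    static int power_up(void)
    {
        int ret;

        ret = enable_clock();
        if (ret)
            return ret;

        ret = assert_power();
        if (ret)
            goto err_clock;

        ret = lock_pll();
        if (ret)
            goto err_power;

        return 0;

    err_power:
        /* Undo the steps that succeeded, in reverse order. */
        deassert_power();
    err_clock:
        disable_clock();
        return ret;
    }

    int main(void)
    {
        return power_up() ? 1 : 0;
    }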
+10 -4
sound/soc/sdca/sdca_functions.c
··· 216 216 } else if (num_init_writes % sizeof(*raw) != 0) { 217 217 dev_err(dev, "%pfwP: init table size invalid\n", function_node); 218 218 return -EINVAL; 219 - } else if ((num_init_writes / sizeof(*raw)) > SDCA_MAX_INIT_COUNT) { 220 - dev_err(dev, "%pfwP: maximum init table size exceeded\n", function_node); 221 - return -EINVAL; 222 219 } 223 220 224 221 raw = kzalloc(num_init_writes, GFP_KERNEL); ··· 1601 1604 static struct sdca_entity *find_sdca_entity_by_label(struct sdca_function_data *function, 1602 1605 const char *entity_label) 1603 1606 { 1607 + struct sdca_entity *entity = NULL; 1604 1608 int i; 1605 1609 1606 1610 for (i = 0; i < function->num_entities; i++) { 1607 - struct sdca_entity *entity = &function->entities[i]; 1611 + entity = &function->entities[i]; 1612 + 1613 + /* check whole string first*/ 1614 + if (!strcmp(entity->label, entity_label)) 1615 + return entity; 1616 + } 1617 + 1618 + for (i = 0; i < function->num_entities; i++) { 1619 + entity = &function->entities[i]; 1608 1620 1609 1621 if (!strncmp(entity->label, entity_label, strlen(entity_label))) 1610 1622 return entity;
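The sdca_functions.c hunk above makes the entity lookup try a whole-string match before falling back to the existing prefix match, so a shorter label can no longer be shadowed by a longer one sharing its prefix. A small standalone sketch of that two-pass strategy, using hypothetical labels rather than real SDCA entities:

    #include <stdio.h>
    #include <string.h>

    static const char *labels[] = { "PDE 10", "PDE 1", "IT 1" };
    #define NUM_LABELS (sizeof(labels) / sizeof(labels[0]))

    static const char *find_label(const char *wanted)
    {
        size_t i;

        /* First pass: an exact match wins outright. */
        for (i = 0; i < NUM_LABELS; i++)
            if (!strcmp(labels[i], wanted))
                return labels[i];

        /* Second pass: fall back to a prefix match. */
        for (i = 0; i < NUM_LABELS; i++)
            if (!strncmp(labels[i], wanted, strlen(wanted)))
                return labels[i];

        return NULL;
    }

    int main(void)
    {
        /* The exact pass returns "PDE 1"; prefix-only matching would have
         * returned "PDE 10" because it is listed first. */
        printf("%s\n", find_label("PDE 1"));
        return 0;
    }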
+1 -1
sound/soc/sof/ipc4-topology.c
··· 2951 2951 return -EINVAL; 2952 2952 } 2953 2953 2954 - if (scontrol->priv_size < sizeof(struct sof_abi_hdr)) { 2954 + if (scontrol->priv_size && scontrol->priv_size < sizeof(struct sof_abi_hdr)) { 2955 2955 dev_err(sdev->dev, 2956 2956 "bytes control %s initial data size %zu is insufficient.\n", 2957 2957 scontrol->name, scontrol->priv_size);
+1
sound/usb/Kconfig
··· 192 192 tristate "Qualcomm Audio Offload driver" 193 193 depends on QCOM_QMI_HELPERS && SND_USB_AUDIO && SND_SOC_USB 194 194 depends on USB_XHCI_HCD && USB_XHCI_SIDEBAND 195 + select AUXILIARY_BUS 195 196 help 196 197 Say Y here to enable the Qualcomm USB audio offloading feature. 197 198
+1 -1
sound/usb/qcom/qc_audio_offload.c
··· 1 1 // SPDX-License-Identifier: GPL-2.0 2 2 /* 3 - * Copyright (c) 2022-2025 Qualcomm Innovation Center, Inc. All rights reserved. 3 + * Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries. 4 4 */ 5 5 6 6 #include <linux/auxiliary_bus.h>
+4
sound/usb/quirks.c
··· 2148 2148 /* Device matches */ 2149 2149 DEVICE_FLG(0x001f, 0x0b21, /* AB13X USB Audio */ 2150 2150 QUIRK_FLAG_FORCE_IFACE_RESET | QUIRK_FLAG_IFACE_DELAY), 2151 + DEVICE_FLG(0x001f, 0x0b23, /* AB17X USB Audio */ 2152 + QUIRK_FLAG_FORCE_IFACE_RESET | QUIRK_FLAG_IFACE_DELAY), 2151 2153 DEVICE_FLG(0x0020, 0x0b21, /* GHW-123P */ 2152 2154 QUIRK_FLAG_FORCE_IFACE_RESET | QUIRK_FLAG_IFACE_DELAY), 2153 2155 DEVICE_FLG(0x03f0, 0x654a, /* HP 320 FHD Webcam */ ··· 2431 2429 QUIRK_FLAG_CTL_MSG_DELAY | QUIRK_FLAG_IFACE_DELAY), 2432 2430 VENDOR_FLG(0x07fd, /* MOTU */ 2433 2431 QUIRK_FLAG_VALIDATE_RATES), 2432 + DEVICE_FLG(0x1235, 0x8006, 0), /* Focusrite Scarlett 2i2 1st Gen */ 2433 + DEVICE_FLG(0x1235, 0x800a, 0), /* Focusrite Scarlett 2i4 1st Gen */ 2434 2434 VENDOR_FLG(0x1235, /* Focusrite Novation */ 2435 2435 QUIRK_FLAG_SKIP_CLOCK_SELECTOR | 2436 2436 QUIRK_FLAG_SKIP_IFACE_SETUP),
+4 -1
tools/arch/x86/include/asm/msr-index.h
··· 740 740 #define MSR_AMD64_SNP_SMT_PROT BIT_ULL(MSR_AMD64_SNP_SMT_PROT_BIT) 741 741 #define MSR_AMD64_SNP_SECURE_AVIC_BIT 18 742 742 #define MSR_AMD64_SNP_SECURE_AVIC BIT_ULL(MSR_AMD64_SNP_SECURE_AVIC_BIT) 743 - #define MSR_AMD64_SNP_RESV_BIT 19 743 + #define MSR_AMD64_SNP_RESERVED_BITS19_22 GENMASK_ULL(22, 19) 744 + #define MSR_AMD64_SNP_IBPB_ON_ENTRY_BIT 23 745 + #define MSR_AMD64_SNP_IBPB_ON_ENTRY BIT_ULL(MSR_AMD64_SNP_IBPB_ON_ENTRY_BIT) 746 + #define MSR_AMD64_SNP_RESV_BIT 24 744 747 #define MSR_AMD64_SNP_RESERVED_MASK GENMASK_ULL(63, MSR_AMD64_SNP_RESV_BIT) 745 748 #define MSR_AMD64_SAVIC_CONTROL 0xc0010138 746 749 #define MSR_AMD64_SAVIC_EN_BIT 0
+1
tools/arch/x86/include/uapi/asm/kvm.h
··· 476 476 #define KVM_X86_QUIRK_SLOT_ZAP_ALL (1 << 7) 477 477 #define KVM_X86_QUIRK_STUFF_FEATURE_MSRS (1 << 8) 478 478 #define KVM_X86_QUIRK_IGNORE_GUEST_PAT (1 << 9) 479 + #define KVM_X86_QUIRK_VMCS12_ALLOW_FREEZE_IN_SMM (1 << 10) 479 480 480 481 #define KVM_STATE_NESTED_FORMAT_VMX 0 481 482 #define KVM_STATE_NESTED_FORMAT_SVM 1
+3 -1
tools/include/linux/build_bug.h
··· 32 32 /** 33 33 * BUILD_BUG_ON_MSG - break compile if a condition is true & emit supplied 34 34 * error message. 35 - * @condition: the condition which the compiler should know is false. 35 + * @cond: the condition which the compiler should know is false. 36 + * @msg: build-time error message 36 37 * 37 38 * See BUILD_BUG_ON for description. 38 39 */ ··· 61 60 62 61 /** 63 62 * static_assert - check integer constant expression at build time 63 + * @expr: expression to be checked 64 64 * 65 65 * static_assert() is a wrapper for the C11 _Static_assert, with a 66 66 * little macro magic to make the message optional (defaulting to the
+8
tools/include/uapi/linux/kvm.h
··· 14 14 #include <linux/ioctl.h> 15 15 #include <asm/kvm.h> 16 16 17 + #ifdef __KERNEL__ 18 + #include <linux/kvm_types.h> 19 + #endif 20 + 17 21 #define KVM_API_VERSION 12 18 22 19 23 /* ··· 1605 1601 __u16 size; 1606 1602 __u32 offset; 1607 1603 __u32 bucket_size; 1604 + #ifdef __KERNEL__ 1605 + char name[KVM_STATS_NAME_SIZE]; 1606 + #else 1608 1607 char name[]; 1608 + #endif 1609 1609 }; 1610 1610 1611 1611 #define KVM_GET_STATS_FD _IO(KVMIO, 0xce)
-1
tools/perf/check-headers.sh
··· 187 187 check arch/x86/lib/memcpy_64.S '-I "^EXPORT_SYMBOL" -I "^#include <asm/export.h>" -I"^SYM_FUNC_START\(_LOCAL\)*(memcpy_\(erms\|orig\))" -I"^#include <linux/cfi_types.h>"' 188 188 check arch/x86/lib/memset_64.S '-I "^EXPORT_SYMBOL" -I "^#include <asm/export.h>" -I"^SYM_FUNC_START\(_LOCAL\)*(memset_\(erms\|orig\))"' 189 189 check arch/x86/include/asm/amd/ibs.h '-I "^#include .*/msr-index.h"' 190 - check arch/arm64/include/asm/cputype.h '-I "^#include [<\"]\(asm/\)*sysreg.h"' 191 190 check include/linux/unaligned.h '-I "^#include <linux/unaligned/packed_struct.h>" -I "^#include <asm/byteorder.h>" -I "^#pragma GCC diagnostic"' 192 191 check include/uapi/asm-generic/mman.h '-I "^#include <\(uapi/\)*asm-generic/mman-common\(-tools\)*.h>"' 193 192 check include/uapi/linux/mman.h '-I "^#include <\(uapi/\)*asm/mman.h>"'
+3 -3
tools/perf/util/kvm-stat-arch/kvm-stat-x86.c
··· 4 4 #include "../kvm-stat.h" 5 5 #include "../evsel.h" 6 6 #include "../env.h" 7 - #include "../../arch/x86/include/uapi/asm/svm.h" 8 - #include "../../arch/x86/include/uapi/asm/vmx.h" 9 - #include "../../arch/x86/include/uapi/asm/kvm.h" 7 + #include "../../../arch/x86/include/uapi/asm/svm.h" 8 + #include "../../../arch/x86/include/uapi/asm/vmx.h" 9 + #include "../../../arch/x86/include/uapi/asm/kvm.h" 10 10 #include <subcmd/parse-options.h> 11 11 12 12 define_exit_reasons_table(vmx_exit_reasons, VMX_EXIT_REASONS);
+3 -3
tools/perf/util/metricgroup.c
··· 1605 1605 .metric_or_groups = metric_or_groups, 1606 1606 }; 1607 1607 1608 - return pmu_metrics_table__for_each_metric(table, 1609 - metricgroup__has_metric_or_groups_callback, 1610 - &data) 1608 + return metricgroup__for_each_metric(table, 1609 + metricgroup__has_metric_or_groups_callback, 1610 + &data) 1611 1611 ? true : false; 1612 1612 } 1613 1613
+65 -17
tools/perf/util/parse-events.c
··· 1117 1117 1118 1118 static struct evsel_config_term *add_config_term(enum evsel_term_type type, 1119 1119 struct list_head *head_terms, 1120 - bool weak) 1120 + bool weak, char *str, u64 val) 1121 1121 { 1122 1122 struct evsel_config_term *t; 1123 1123 ··· 1128 1128 INIT_LIST_HEAD(&t->list); 1129 1129 t->type = type; 1130 1130 t->weak = weak; 1131 - list_add_tail(&t->list, head_terms); 1132 1131 1132 + switch (type) { 1133 + case EVSEL__CONFIG_TERM_PERIOD: 1134 + case EVSEL__CONFIG_TERM_FREQ: 1135 + case EVSEL__CONFIG_TERM_STACK_USER: 1136 + case EVSEL__CONFIG_TERM_USR_CHG_CONFIG: 1137 + case EVSEL__CONFIG_TERM_USR_CHG_CONFIG1: 1138 + case EVSEL__CONFIG_TERM_USR_CHG_CONFIG2: 1139 + case EVSEL__CONFIG_TERM_USR_CHG_CONFIG3: 1140 + case EVSEL__CONFIG_TERM_USR_CHG_CONFIG4: 1141 + t->val.val = val; 1142 + break; 1143 + case EVSEL__CONFIG_TERM_TIME: 1144 + t->val.time = val; 1145 + break; 1146 + case EVSEL__CONFIG_TERM_INHERIT: 1147 + t->val.inherit = val; 1148 + break; 1149 + case EVSEL__CONFIG_TERM_OVERWRITE: 1150 + t->val.overwrite = val; 1151 + break; 1152 + case EVSEL__CONFIG_TERM_MAX_STACK: 1153 + t->val.max_stack = val; 1154 + break; 1155 + case EVSEL__CONFIG_TERM_MAX_EVENTS: 1156 + t->val.max_events = val; 1157 + break; 1158 + case EVSEL__CONFIG_TERM_PERCORE: 1159 + t->val.percore = val; 1160 + break; 1161 + case EVSEL__CONFIG_TERM_AUX_OUTPUT: 1162 + t->val.aux_output = val; 1163 + break; 1164 + case EVSEL__CONFIG_TERM_AUX_SAMPLE_SIZE: 1165 + t->val.aux_sample_size = val; 1166 + break; 1167 + case EVSEL__CONFIG_TERM_CALLGRAPH: 1168 + case EVSEL__CONFIG_TERM_BRANCH: 1169 + case EVSEL__CONFIG_TERM_DRV_CFG: 1170 + case EVSEL__CONFIG_TERM_RATIO_TO_PREV: 1171 + case EVSEL__CONFIG_TERM_AUX_ACTION: 1172 + if (str) { 1173 + t->val.str = strdup(str); 1174 + if (!t->val.str) { 1175 + zfree(&t); 1176 + return NULL; 1177 + } 1178 + t->free_str = true; 1179 + } 1180 + break; 1181 + default: 1182 + t->val.val = val; 1183 + break; 1184 + } 1185 + 1186 + list_add_tail(&t->list, head_terms); 1133 1187 return t; 1134 1188 } 1135 1189 ··· 1196 1142 struct evsel_config_term *new_term; 1197 1143 enum evsel_term_type new_type; 1198 1144 bool str_type = false; 1199 - u64 val; 1145 + u64 val = 0; 1200 1146 1201 1147 switch (term->type_term) { 1202 1148 case PARSE_EVENTS__TERM_TYPE_SAMPLE_PERIOD: ··· 1288 1234 continue; 1289 1235 } 1290 1236 1291 - new_term = add_config_term(new_type, head_terms, term->weak); 1237 + /* 1238 + * Note: Members evsel_config_term::val and 1239 + * parse_events_term::val are unions and endianness needs 1240 + * to be taken into account when changing such union members. 1241 + */ 1242 + new_term = add_config_term(new_type, head_terms, term->weak, 1243 + str_type ? term->val.str : NULL, val); 1292 1244 if (!new_term) 1293 1245 return -ENOMEM; 1294 - 1295 - if (str_type) { 1296 - new_term->val.str = strdup(term->val.str); 1297 - if (!new_term->val.str) { 1298 - zfree(&new_term); 1299 - return -ENOMEM; 1300 - } 1301 - new_term->free_str = true; 1302 - } else { 1303 - new_term->val.val = val; 1304 - } 1305 1246 } 1306 1247 return 0; 1307 1248 } ··· 1326 1277 if (bits) { 1327 1278 struct evsel_config_term *new_term; 1328 1279 1329 - new_term = add_config_term(new_term_type, head_terms, false); 1280 + new_term = add_config_term(new_term_type, head_terms, false, NULL, bits); 1330 1281 if (!new_term) 1331 1282 return -ENOMEM; 1332 - new_term->val.cfg_chg = bits; 1333 1283 } 1334 1284 1335 1285 return 0;
+1
tools/testing/selftests/drivers/net/team/Makefile
··· 3 3 4 4 TEST_PROGS := \ 5 5 dev_addr_lists.sh \ 6 + non_ether_header_ops.sh \ 6 7 options.sh \ 7 8 propagation.sh \ 8 9 refleak.sh \
+2
tools/testing/selftests/drivers/net/team/config
··· 1 + CONFIG_BONDING=y 1 2 CONFIG_DUMMY=y 2 3 CONFIG_IPV6=y 3 4 CONFIG_MACVLAN=y 4 5 CONFIG_NETDEVSIM=m 6 + CONFIG_NET_IPGRE=y 5 7 CONFIG_NET_TEAM=y 6 8 CONFIG_NET_TEAM_MODE_ACTIVEBACKUP=y 7 9 CONFIG_NET_TEAM_MODE_LOADBALANCE=y
+41
tools/testing/selftests/drivers/net/team/non_ether_header_ops.sh
··· 1 + #!/bin/bash 2 + # SPDX-License-Identifier: GPL-2.0 3 + # shellcheck disable=SC2154 4 + # 5 + # Reproduce the non-Ethernet header_ops confusion scenario with: 6 + # g0 (gre) -> b0 (bond) -> t0 (team) 7 + # 8 + # Before the fix, direct header_ops inheritance in this stack could call 9 + # callbacks with the wrong net_device context and crash. 10 + 11 + lib_dir=$(dirname "$0") 12 + source "$lib_dir"/../../../net/lib.sh 13 + 14 + trap cleanup_all_ns EXIT 15 + 16 + setup_ns ns1 17 + 18 + ip -n "$ns1" link add d0 type dummy 19 + ip -n "$ns1" addr add 10.10.10.1/24 dev d0 20 + ip -n "$ns1" link set d0 up 21 + 22 + ip -n "$ns1" link add g0 type gre local 10.10.10.1 23 + ip -n "$ns1" link add b0 type bond mode active-backup 24 + ip -n "$ns1" link add t0 type team 25 + 26 + ip -n "$ns1" link set g0 master b0 27 + ip -n "$ns1" link set b0 master t0 28 + 29 + ip -n "$ns1" link set g0 up 30 + ip -n "$ns1" link set b0 up 31 + ip -n "$ns1" link set t0 up 32 + 33 + # IPv6 address assignment triggers MLD join reports that call 34 + # dev_hard_header() on t0, exercising the inherited header_ops path. 35 + ip -n "$ns1" -6 addr add 2001:db8:1::1/64 dev t0 nodad 36 + for i in $(seq 1 20); do 37 + ip netns exec "$ns1" ping -6 -I t0 ff02::1 -c1 -W1 &>/dev/null || true 38 + done 39 + 40 + echo "PASS: non-Ethernet header_ops stacking did not crash" 41 + exit "$EXIT_STATUS"
+1
tools/testing/selftests/kvm/Makefile.kvm
··· 206 206 TEST_GEN_PROGS_s390 += s390/user_operexec 207 207 TEST_GEN_PROGS_s390 += s390/keyop 208 208 TEST_GEN_PROGS_s390 += rseq_test 209 + TEST_GEN_PROGS_s390 += s390/irq_routing 209 210 210 211 TEST_GEN_PROGS_riscv = $(TEST_GEN_PROGS_COMMON) 211 212 TEST_GEN_PROGS_riscv += riscv/sbi_pmu_test
+75
tools/testing/selftests/kvm/s390/irq_routing.c
··· 1 + // SPDX-License-Identifier: GPL-2.0-only 2 + /* 3 + * IRQ routing offset tests. 4 + * 5 + * Copyright IBM Corp. 2026 6 + * 7 + * Authors: 8 + * Janosch Frank <frankja@linux.ibm.com> 9 + */ 10 + #include <stdio.h> 11 + #include <stdlib.h> 12 + #include <string.h> 13 + #include <sys/ioctl.h> 14 + 15 + #include "test_util.h" 16 + #include "kvm_util.h" 17 + #include "kselftest.h" 18 + #include "ucall_common.h" 19 + 20 + extern char guest_code[]; 21 + asm("guest_code:\n" 22 + "diag %r0,%r0,0\n" 23 + "j .\n"); 24 + 25 + static void test(void) 26 + { 27 + struct kvm_irq_routing *routing; 28 + struct kvm_vcpu *vcpu; 29 + struct kvm_vm *vm; 30 + vm_paddr_t mem; 31 + int ret; 32 + 33 + struct kvm_irq_routing_entry ue = { 34 + .type = KVM_IRQ_ROUTING_S390_ADAPTER, 35 + .gsi = 1, 36 + }; 37 + 38 + vm = vm_create_with_one_vcpu(&vcpu, guest_code); 39 + mem = vm_phy_pages_alloc(vm, 2, 4096 * 42, 0); 40 + 41 + routing = kvm_gsi_routing_create(); 42 + routing->nr = 1; 43 + routing->entries[0] = ue; 44 + routing->entries[0].u.adapter.summary_addr = (uintptr_t)mem; 45 + routing->entries[0].u.adapter.ind_addr = (uintptr_t)mem; 46 + 47 + routing->entries[0].u.adapter.summary_offset = 4096 * 8; 48 + ret = __vm_ioctl(vm, KVM_SET_GSI_ROUTING, routing); 49 + ksft_test_result(ret == -1 && errno == EINVAL, "summary offset outside of page\n"); 50 + 51 + routing->entries[0].u.adapter.summary_offset -= 4; 52 + ret = __vm_ioctl(vm, KVM_SET_GSI_ROUTING, routing); 53 + ksft_test_result(ret == 0, "summary offset inside of page\n"); 54 + 55 + routing->entries[0].u.adapter.ind_offset = 4096 * 8; 56 + ret = __vm_ioctl(vm, KVM_SET_GSI_ROUTING, routing); 57 + ksft_test_result(ret == -1 && errno == EINVAL, "ind offset outside of page\n"); 58 + 59 + routing->entries[0].u.adapter.ind_offset -= 4; 60 + ret = __vm_ioctl(vm, KVM_SET_GSI_ROUTING, routing); 61 + ksft_test_result(ret == 0, "ind offset inside of page\n"); 62 + 63 + kvm_vm_free(vm); 64 + } 65 + 66 + int main(int argc, char *argv[]) 67 + { 68 + TEST_REQUIRE(kvm_has_cap(KVM_CAP_IRQ_ROUTING)); 69 + 70 + ksft_print_header(); 71 + ksft_set_plan(4); 72 + test(); 73 + 74 + ksft_finished(); /* Print results and exit() accordingly */ 75 + }
+91 -2
tools/testing/selftests/landlock/tsync_test.c
··· 6 6 */ 7 7 8 8 #define _GNU_SOURCE 9 - #include <pthread.h> 10 - #include <sys/prctl.h> 11 9 #include <linux/landlock.h> 10 + #include <pthread.h> 11 + #include <signal.h> 12 + #include <sys/prctl.h> 12 13 13 14 #include "common.h" 14 15 ··· 155 154 /* Expect that both succeeded. */ 156 155 EXPECT_EQ(0, d[0].result); 157 156 EXPECT_EQ(0, d[1].result); 157 + 158 + EXPECT_EQ(0, close(ruleset_fd)); 159 + } 160 + 161 + static void signal_nop_handler(int sig) 162 + { 163 + } 164 + 165 + struct signaler_data { 166 + pthread_t target; 167 + volatile bool stop; 168 + }; 169 + 170 + static void *signaler_thread(void *data) 171 + { 172 + struct signaler_data *sd = data; 173 + 174 + while (!sd->stop) 175 + pthread_kill(sd->target, SIGUSR1); 176 + 177 + return NULL; 178 + } 179 + 180 + /* 181 + * Number of idle sibling threads. This must be large enough that even on 182 + * machines with many cores, the sibling threads cannot all complete their 183 + * credential preparation in a single parallel wave, otherwise the signaler 184 + * thread has no window to interrupt wait_for_completion_interruptible(). 185 + * 200 threads on a 64-core machine yields ~3 serialized waves, giving the 186 + * tight signal loop enough time to land an interruption. 187 + */ 188 + #define NUM_IDLE_THREADS 200 189 + 190 + /* 191 + * Exercises the tsync interruption and cancellation paths in tsync.c. 192 + * 193 + * When a signal interrupts the calling thread while it waits for sibling 194 + * threads to finish their credential preparation 195 + * (wait_for_completion_interruptible in landlock_restrict_sibling_threads), 196 + * the kernel sets ERESTARTNOINTR, cancels queued task works that have not 197 + * started yet (cancel_tsync_works), then waits for the remaining works to 198 + * finish. On the error return, syscalls.c aborts the prepared credentials. 199 + * The kernel automatically restarts the syscall, so userspace sees success. 200 + */ 201 + TEST(tsync_interrupt) 202 + { 203 + size_t i; 204 + pthread_t threads[NUM_IDLE_THREADS]; 205 + pthread_t signaler; 206 + struct signaler_data sd; 207 + struct sigaction sa = {}; 208 + const int ruleset_fd = create_ruleset(_metadata); 209 + 210 + disable_caps(_metadata); 211 + 212 + /* Install a no-op SIGUSR1 handler so the signal does not kill us. */ 213 + sa.sa_handler = signal_nop_handler; 214 + sigemptyset(&sa.sa_mask); 215 + ASSERT_EQ(0, sigaction(SIGUSR1, &sa, NULL)); 216 + 217 + ASSERT_EQ(0, prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0)); 218 + 219 + for (i = 0; i < NUM_IDLE_THREADS; i++) 220 + ASSERT_EQ(0, pthread_create(&threads[i], NULL, idle, NULL)); 221 + 222 + /* 223 + * Start a signaler thread that continuously sends SIGUSR1 to the 224 + * calling thread. This maximizes the chance of interrupting 225 + * wait_for_completion_interruptible() in the kernel's tsync path. 226 + */ 227 + sd.target = pthread_self(); 228 + sd.stop = false; 229 + ASSERT_EQ(0, pthread_create(&signaler, NULL, signaler_thread, &sd)); 230 + 231 + /* 232 + * The syscall may be interrupted and transparently restarted by the 233 + * kernel (ERESTARTNOINTR). From userspace, it should always succeed. 234 + */ 235 + EXPECT_EQ(0, landlock_restrict_self(ruleset_fd, 236 + LANDLOCK_RESTRICT_SELF_TSYNC)); 237 + 238 + sd.stop = true; 239 + ASSERT_EQ(0, pthread_join(signaler, NULL)); 240 + 241 + for (i = 0; i < NUM_IDLE_THREADS; i++) { 242 + ASSERT_EQ(0, pthread_cancel(threads[i])); 243 + ASSERT_EQ(0, pthread_join(threads[i], NULL)); 244 + } 158 245 159 246 EXPECT_EQ(0, close(ruleset_fd)); 160 247 }
+1 -1
tools/testing/selftests/mount_setattr/mount_setattr_test.c
··· 1020 1020 "size=100000,mode=700"), 0); 1021 1021 1022 1022 ASSERT_EQ(mount("testing", "/mnt", "tmpfs", MS_NOATIME | MS_NODEV, 1023 - "size=2m,mode=700"), 0); 1023 + "size=256m,mode=700"), 0); 1024 1024 1025 1025 ASSERT_EQ(mkdir("/mnt/A", 0777), 0); 1026 1026
+58 -3
tools/testing/selftests/net/fib_tests.sh
··· 868 868 check_rt_num 5 $($IP -6 route list |grep -v expires|grep 2001:20::|wc -l) 869 869 log_test $ret 0 "ipv6 route garbage collection (replace with permanent)" 870 870 871 + # Delete dummy_10 and remove all routes 872 + $IP link del dev dummy_10 873 + 874 + # rd6 is required for the next test. (ipv6toolkit) 875 + if [ ! -x "$(command -v rd6)" ]; then 876 + echo "SKIP: rd6 not found." 877 + set +e 878 + cleanup &> /dev/null 879 + return 880 + fi 881 + 882 + setup_ns ns2 883 + $IP link add veth1 type veth peer veth2 netns $ns2 884 + $IP link set veth1 up 885 + ip -netns $ns2 link set veth2 up 886 + $IP addr add fe80:dead::1/64 dev veth1 887 + ip -netns $ns2 addr add fe80:dead::2/64 dev veth2 888 + 889 + # Add NTF_ROUTER neighbour to prevent rt6_age_examine_exception() 890 + # from removing not-yet-expired exceptions. 891 + ip -netns $ns2 link set veth2 address 00:11:22:33:44:55 892 + $IP neigh add fe80:dead::3 lladdr 00:11:22:33:44:55 dev veth1 router 893 + 894 + $NS_EXEC sysctl -wq net.ipv6.conf.veth1.accept_redirects=1 895 + $NS_EXEC sysctl -wq net.ipv6.conf.veth1.forwarding=0 896 + 897 + # Temporary routes 898 + for i in $(seq 1 5); do 899 + # Expire route after $EXPIRE seconds 900 + $IP -6 route add 2001:10::$i \ 901 + via fe80:dead::2 dev veth1 expires $EXPIRE 902 + 903 + ip netns exec $ns2 rd6 -i veth2 \ 904 + -s fe80:dead::2 -d fe80:dead::1 \ 905 + -r 2001:10::$i -t fe80:dead::3 -p ICMP6 906 + done 907 + 908 + check_rt_num 5 $($IP -6 route list | grep expires | grep 2001:10:: | wc -l) 909 + 910 + # Promote to permanent routes by "prepend" (w/o NLM_F_EXCL and NLM_F_REPLACE) 911 + for i in $(seq 1 5); do 912 + # -EEXIST, but the temporary route becomes the permanent route. 913 + $IP -6 route append 2001:10::$i \ 914 + via fe80:dead::2 dev veth1 2>/dev/null || true 915 + done 916 + 917 + check_rt_num 5 $($IP -6 route list | grep -v expires | grep 2001:10:: | wc -l) 918 + check_rt_num 5 $($IP -6 route list cache | grep 2001:10:: | wc -l) 919 + 920 + # Trigger GC instead of waiting $GC_WAIT_TIME. 921 + # rt6_nh_dump_exceptions() just skips expired exceptions. 922 + $NS_EXEC sysctl -wq net.ipv6.route.flush=1 923 + check_rt_num 0 $($IP -6 route list cache | grep 2001:10:: | wc -l) 924 + log_test $ret 0 "ipv6 route garbage collection (promote to permanent routes)" 925 + 926 + $IP neigh del fe80:dead::3 lladdr 00:11:22:33:44:55 dev veth1 router 927 + $IP link del veth1 928 + 871 929 # ra6 is required for the next test. (ipv6toolkit) 872 930 if [ ! -x "$(command -v ra6)" ]; then 873 931 echo "SKIP: ra6 not found." ··· 933 875 cleanup &> /dev/null 934 876 return 935 877 fi 936 - 937 - # Delete dummy_10 and remove all routes 938 - $IP link del dev dummy_10 939 878 940 879 # Create a pair of veth devices to send a RA message from one 941 880 # device to another.
+69 -1
tools/testing/selftests/net/netfilter/nft_concat_range.sh
··· 29 29 net6_port_net6_port net_port_mac_proto_net" 30 30 31 31 # Reported bugs, also described by TYPE_ variables below 32 - BUGS="flush_remove_add reload net_port_proto_match avx2_mismatch doublecreate insert_overlap" 32 + BUGS="flush_remove_add reload net_port_proto_match avx2_mismatch doublecreate 33 + insert_overlap load_flush_load4 load_flush_load8" 33 34 34 35 # List of possible paths to pktgen script from kernel tree for performance tests 35 36 PKTGEN_SCRIPT_PATHS=" ··· 423 422 424 423 TYPE_insert_overlap=" 425 424 display reject overlapping range on add 425 + type_spec ipv4_addr . ipv4_addr 426 + chain_spec ip saddr . ip daddr 427 + dst addr4 428 + proto icmp 429 + 430 + race_repeat 0 431 + 432 + perf_duration 0 433 + " 434 + 435 + TYPE_load_flush_load4=" 436 + display reload with flush, 4bit groups 437 + type_spec ipv4_addr . ipv4_addr 438 + chain_spec ip saddr . ip daddr 439 + dst addr4 440 + proto icmp 441 + 442 + race_repeat 0 443 + 444 + perf_duration 0 445 + " 446 + 447 + TYPE_load_flush_load8=" 448 + display reload with flush, 8bit groups 426 449 type_spec ipv4_addr . ipv4_addr 427 450 chain_spec ip saddr . ip daddr 428 451 dst addr4 ··· 2018 1993 2019 1994 elements="1.2.3.4 . 1.2.4.1-1.2.4.2" 2020 1995 add_fail "{ $elements }" || return 1 1996 + 1997 + return 0 1998 + } 1999 + 2000 + test_bug_load_flush_load4() 2001 + { 2002 + local i 2003 + 2004 + setup veth send_"${proto}" set || return ${ksft_skip} 2005 + 2006 + for i in $(seq 0 255); do 2007 + local addelem="add element inet filter test" 2008 + local j 2009 + 2010 + for j in $(seq 0 20); do 2011 + echo "$addelem { 10.$j.0.$i . 10.$j.1.$i }" 2012 + echo "$addelem { 10.$j.0.$i . 10.$j.2.$i }" 2013 + done 2014 + done > "$tmp" 2015 + 2016 + nft -f "$tmp" || return 1 2017 + 2018 + ( echo "flush set inet filter test";cat "$tmp") | nft -f - 2019 + [ $? -eq 0 ] || return 1 2020 + 2021 + return 0 2022 + } 2023 + 2024 + test_bug_load_flush_load8() 2025 + { 2026 + local i 2027 + 2028 + setup veth send_"${proto}" set || return ${ksft_skip} 2029 + 2030 + for i in $(seq 1 100); do 2031 + echo "add element inet filter test { 10.0.0.$i . 10.0.1.$i }" 2032 + echo "add element inet filter test { 10.0.0.$i . 10.0.2.$i }" 2033 + done > "$tmp" 2034 + 2035 + nft -f "$tmp" || return 1 2036 + 2037 + ( echo "flush set inet filter test";cat "$tmp") | nft -f - 2038 + [ $? -eq 0 ] || return 1 2021 2039 2022 2040 return 0 2023 2041 }