commits

user_scan() invokes updated sas_user_scan() for channel 0, and if
successful, iteratively scans remaining channels (1 to shost->max_channel)
via scsi_scan_host_selected() in commit 37c4e72b0651 ("scsi: Fix
sas_user_scan() to handle wildcard and multi-channel scans"). However,
hisi_sas supports only one channel, and the current value of max_channel is
1. sas_user_scan() for channel 1 will trigger the following NULL pointer
exception:

[ 441.554662] Unable to handle kernel NULL pointer dereference at virtual address 00000000000008b0
[ 441.554699] Mem abort info:
[ 441.554710] ESR = 0x0000000096000004
[ 441.554718] EC = 0x25: DABT (current EL), IL = 32 bits
[ 441.554723] SET = 0, FnV = 0
[ 441.554726] EA = 0, S1PTW = 0
[ 441.554730] FSC = 0x04: level 0 translation fault
[ 441.554735] Data abort info:
[ 441.554737] ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
[ 441.554742] CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[ 441.554747] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[ 441.554752] user pgtable: 4k pages, 48-bit VAs, pgdp=00000828377a6000
[ 441.554757] [00000000000008b0] pgd=0000000000000000, p4d=0000000000000000
[ 441.554769] Internal error: Oops: 0000000096000004 [#1] SMP
[ 441.629589] Modules linked in: arm_spe_pmu arm_smmuv3_pmu tpm_tis_spi hisi_uncore_sllc_pmu hisi_uncore_pa_pmu hisi_uncore_l3c_pmu hisi_uncore_hha_pmu hisi_uncore_ddrc_pmu hisi_uncore_cpa_pmu hns3_pmu hisi_ptt hisi_pcie_pmu tpm_tis_core spidev spi_hisi_sfc_v3xx hisi_uncore_pmu spi_dw_mmio fuse hclge hclge_common hisi_sec2 hisi_hpre hisi_zip hisi_qm hns3 hisi_sas_v3_hw sm3_ce sbsa_gwdt hnae3 hisi_sas_main uacce hisi_dma i2c_hisi dm_mirror dm_region_hash dm_log dm_mod
[ 441.670819] CPU: 46 UID: 0 PID: 6994 Comm: bash Kdump: loaded Not tainted 7.0.0-rc2+ #84 PREEMPT
[ 441.691327] pstate: 81400009 (Nzcv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
[ 441.698277] pc : sas_find_dev_by_rphy+0x44/0x118
[ 441.702896] lr : sas_find_dev_by_rphy+0x3c/0x118
[ 441.707502] sp : ffff80009abbba40
[ 441.710805] x29: ffff80009abbba40 x28: ffff082819a40008 x27: ffff082810c37c08
[ 441.717930] x26: ffff082810c37c28 x25: ffff082819a40290 x24: ffff082810c37c00
[ 441.725054] x23: 0000000000000000 x22: 0000000000000001 x21: ffff082819a40000
[ 441.732179] x20: ffff082819a40290 x19: 0000000000000000 x18: 0000000000000020
[ 441.739304] x17: 0000000000000000 x16: ffffb5dad6bda690 x15: 00000000ffffffff
[ 441.746428] x14: ffff082814c3b26c x13: 00000000ffffffff x12: ffff082814c3b26a
[ 441.753553] x11: 00000000000000c0 x10: 000000000000003a x9 : ffffb5dad5ea94f4
[ 441.760678] x8 : 000000000000003a x7 : ffff80009abbbab0 x6 : 0000000000000030
[ 441.767802] x5 : 0000000000000000 x4 : 0000000000000000 x3 : 0000000000000000
[ 441.774926] x2 : ffff08280f35a300 x1 : ffffb5dad7127180 x0 : 0000000000000000
[ 441.782053] Call trace:
[ 441.784488] sas_find_dev_by_rphy+0x44/0x118 (P)
[ 441.789095] sas_target_alloc+0x24/0xb0
[ 441.792920] scsi_alloc_target+0x290/0x330
[ 441.797010] __scsi_scan_target+0x88/0x258
[ 441.801096] scsi_scan_channel+0x74/0xb8
[ 441.805008] scsi_scan_host_selected+0x170/0x188
[ 441.809615] sas_user_scan+0xfc/0x148
[ 441.813267] store_scan+0x10c/0x180
[ 441.816743] dev_attr_store+0x20/0x40
[ 441.820398] sysfs_kf_write+0x84/0xa8
[ 441.824054] kernfs_fop_write_iter+0x130/0x1c8
[ 441.828487] vfs_write+0x2c0/0x370
[ 441.831880] ksys_write+0x74/0x118
[ 441.835271] __arm64_sys_write+0x24/0x38
[ 441.839182] invoke_syscall+0x50/0x120
[ 441.842919] el0_svc_common.constprop.0+0xc8/0xf0
[ 441.847611] do_el0_svc+0x24/0x38
[ 441.850913] el0_svc+0x38/0x158
[ 441.854043] el0t_64_sync_handler+0xa0/0xe8
[ 441.858214] el0t_64_sync+0x1ac/0x1b0
[ 441.861865] Code: aa1303e0 97ff70a8 34ffff80 d10a4273 (f9445a75)
[ 441.867946] ---[ end trace 0000000000000000 ]---

Therefore, set max_channel to 0.

Fixes: e21fe3a52692 ("scsi: hisi_sas: add initialisation for v3 pci-based controller")
Signed-off-by: Xingui Yang <yangxingui@huawei.com>
Signed-off-by: Yihang Li <liyihang9@huawei.com>
Link: https://patch.msgid.link/20260305064039.4096775-1-liyihang9@huawei.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

2mo ago

Linus Torvalds

62cda74c

Merge tag 'bootconfig-fixes-v7.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

7w ago

Masami Hiramatsu (Google)

5ef268cb

kprobes: Remove unneeded warnings from __arm_kprobe_ftrace()

7w ago

Vladimir Riabchun

c0b7da13

scsi: qla2xxx: Completely fix fcport double free

2mo ago

Linus Torvalds

11e8c7e9

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
"Quite a large pull request, partly due to skipping last week and
therefore having material from ~all submaintainers in this one. About
a fourth of it is a new selftest, and a couple more changes are large
in number of files touched (fixing a -Wflex-array-member-not-at-end
compiler warning) or lines changed (reformatting of a table in the API
documentation, thanks rST).

But who am I kidding---it's a lot of commits and there are a lot of
bugs being fixed here, some of them on the nastier side like the
RISC-V ones.

ARM:

- Correctly handle deactivation of interrupts that were activated
from LRs. Since EOIcount only denotes deactivation of interrupts
that are not present in an LR, start EOIcount deactivation walk
*after* the last irq that made it into an LR

- Avoid calling into the stubs to probe for ICH_VTR_EL2.TDS when pKVM
is already enabled -- not only thhis isn't possible (pKVM will
reject the call), but it is also useless: this can only happen for
a CPU that has already booted once, and the capability will not
change

- Fix a couple of low-severity bugs in our S2 fault handling path,
affecting the recently introduced LS64 handling and the even more
esoteric handling of hwpoison in a nested context

- Address yet another syzkaller finding in the vgic initialisation,
where we would end-up destroying an uninitialised vgic with nasty
consequences

- Address an annoying case of pKVM failing to boot when some of the
memblock regions that the host is faulting in are not page-aligned

- Inject some sanity in the NV stage-2 walker by checking the limits
against the advertised PA size, and correctly report the resulting
faults

PPC:

- Fix a PPC e500 build error due to a long-standing wart that was
exposed by the recent conversion to kmalloc_obj(); rip out all the
ugliness that led to the wart

RISC-V:

- Prevent speculative out-of-bounds access using array_index_nospec()
in APLIC interrupt handling, ONE_REG regiser access, AIA CSR
access, float register access, and PMU counter access

- Fix potential use-after-free issues in kvm_riscv_gstage_get_leaf(),
kvm_riscv_aia_aplic_has_attr(), and kvm_riscv_aia_imsic_has_attr()

- Fix potential null pointer dereference in
kvm_riscv_vcpu_aia_rmw_topei()

- Fix off-by-one array access in SBI PMU

- Skip THP support check during dirty logging

- Fix error code returned for Smstateen and Ssaia ONE_REG interface

- Check host Ssaia extension when creating AIA irqchip

x86:

- Fix cases where CPUID mitigation features were incorrectly marked
as available whenever the kernel used scattered feature words for
them

- Validate _all_ GVAs, rather than just the first GVA, when
processing a range of GVAs for Hyper-V's TLB flush hypercalls

- Fix a brown paper bug in add_atomic_switch_msr()

- Use hlist_for_each_entry_srcu() when traversing mask_notifier_list,
to fix a lockdep warning; KVM doesn't hold RCU, just irq_srcu

- Ensure AVIC VMCB fields are initialized if the VM has an in-kernel
local APIC (and AVIC is enabled at the module level)

- Update CR8 write interception when AVIC is (de)activated, to fix a
bug where the guest can run in perpetuity with the CR8 intercept
enabled

- Add a quirk to skip the consistency check on FREEZE_IN_SMM, i.e. to
allow L1 hypervisors to set FREEZE_IN_SMM. This reverts (by
default) an unintentional tightening of userspace ABI in 6.17, and
provides some amount of backwards compatibility with hypervisors
who want to freeze PMCs on VM-Entry

- Validate the VMCS/VMCB on return to a nested guest from SMM,
because either userspace or the guest could stash invalid values in
memory and trigger the processor's consistency checks

Generic:

- Remove a subtle pseudo-overlay of kvm_stats_desc, which, aside from
being unnecessary and confusing, triggered compiler warnings due to
-Wflex-array-member-not-at-end

- Document that vcpu->mutex is take outside of kvm->slots_lock and
kvm->slots_arch_lock, which is intentional and desirable despite
being rather unintuitive

Selftests:

- Increase the maximum number of NUMA nodes in the guest_memfd
selftest to 64 (from 8)"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (43 commits)
KVM: selftests: Verify SEV+ guests can read and write EFER, CR0, CR4, and CR8
Documentation: kvm: fix formatting of the quirks table
KVM: x86: clarify leave_smm() return value
selftests: kvm: add a test that VMX validates controls on RSM
selftests: kvm: extract common functionality out of smm_test.c
KVM: SVM: check validity of VMCB controls when returning from SMM
KVM: VMX: check validity of VMCS controls when returning from SMM
KVM: SVM: Set/clear CR8 write interception when AVIC is (de)activated
KVM: SVM: Initialize AVIC VMCB fields if AVIC is enabled with in-kernel APIC
KVM: x86: Introduce KVM_X86_QUIRK_VMCS12_ALLOW_FREEZE_IN_SMM
KVM: x86: Fix SRCU list traversal in kvm_fire_mask_notifiers()
KVM: VMX: Fix a wrong MSR update in add_atomic_switch_msr()
KVM: x86: hyper-v: Validate all GVAs during PV TLB flush
KVM: x86: synthesize CPUID bits only if CPU capability is set
KVM: PPC: e500: Rip out "struct tlbe_ref"
KVM: PPC: e500: Fix build error due to using kmalloc_obj() with wrong type
KVM: selftests: Increase 'maxnode' for guest_memfd tests
KVM: arm64: pkvm: Don't reprobe for ICH_VTR_EL2.TDS on CPU hotplug
KVM: arm64: vgic: Pick EOIcount deactivations from AP-list tail
KVM: arm64: Remove the redundant ISB in __kvm_at_s1e2()
...

7w ago

Masami Hiramatsu (Google)

e2715ea5

bootconfig: Add bootconfig tests about braces

7w ago

Masami Hiramatsu (Google)

e113f0b4

kprobes: avoid crash when rmmod/insmod after ftrace killed

7w ago

Wang Shuaiwei

b0bd84c3

scsi: ufs: core: Fix SError in ufshcd_rtc_work() during UFS suspend

2mo ago

Linus Torvalds

4f3df2e5

Merge tag 'powerpc-7.0-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

7w ago

Sean Christopherson

d2ea4ff1

KVM: selftests: Verify SEV+ guests can read and write EFER, CR0, CR4, and CR8

7w ago

Josh Law

1120a36b

lib/bootconfig: fix snprintf truncation check in xbc_node_compose_key_after()

7w ago

Linus Torvalds

1f318b96

Linux 7.0-rc3 v7.0-rc3

1mo ago

Junxiao Bi

4ce7ada4

scsi: core: Fix error handling for scsi_alloc_sdev()

2mo ago

Linus Torvalds

13af67f5

Merge tag 'x86-urgent-2026-03-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

7w ago

Nilay Shroff

82f73ef9

powerpc/iommu: fix lockdep warning during PCI enumeration

Commit a75b2be249d6 ("iommu: Add iommu_driver_get_domain_for_dev()
helper") introduced iommu_driver_get_domain_for_dev() for driver
code paths that hold iommu_group->mutex while attaching a device
to an IOMMU domain.

The same commit also added a lockdep assertion in
iommu_get_domain_for_dev() to ensure that callers do not hold
iommu_group->mutex when invoking it.

On powerpc platforms, when PCI device ownership is switched from
BLOCKED to the PLATFORM domain, the attach callback
spapr_tce_platform_iommu_attach_dev() still calls
iommu_get_domain_for_dev(). This happens while iommu_group->mutex
is held during domain switching, which triggers the lockdep warning
below during PCI enumeration:

WARNING: drivers/iommu/iommu.c:2252 at iommu_get_domain_for_dev+0x38/0x80, CPU#2: swapper/0/1
Modules linked in:
CPU: 2 UID: 0 PID: 1 Comm: swapper/0 Not tainted 7.0.0-rc2+ #35 PREEMPT
Hardware name: IBM,9105-22A Power11 (architected) 0x820200 0xf000007 of:IBM,FW1120.00 (RB1120_115) hv:phyp pSeries
NIP: c000000000c244c4 LR: c00000000005b5a4 CTR: c00000000005b578
REGS: c00000000a7bf280 TRAP: 0700 Not tainted (7.0.0-rc2+)
MSR: 8000000002029033 <SF,VEC,EE,ME,IR,DR,RI,LE> CR: 22004422 XER: 0000000a
CFAR: c000000000c24508 IRQMASK: 0
GPR00: c00000000005b5a4 c00000000a7bf520 c000000001dc8100 0000000000000001
GPR04: c00000000f972f10 0000000000000000 0000000000000000 0000000000000001
GPR08: 0000001ffbc60000 0000000000000001 0000000000000000 0000000000000000
GPR12: c00000000005b578 c000001fffffe480 c000000000011618 0000000000000000
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20: ffffffffffffefff 0000000000000000 c000000002d30eb0 0000000000000001
GPR24: c0000000017881f8 0000000000000000 0000000000000001 c00000000f972e00
GPR28: c00000000bbba0d0 0000000000000000 c00000000bbba0d0 c00000000f972e00
NIP [c000000000c244c4] iommu_get_domain_for_dev+0x38/0x80
LR [c00000000005b5a4] spapr_tce_platform_iommu_attach_dev+0x2c/0x98
Call Trace:
iommu_get_domain_for_dev+0x68/0x80 (unreliable)
spapr_tce_platform_iommu_attach_dev+0x2c/0x98
__iommu_attach_device+0x44/0x220
__iommu_device_set_domain+0xf4/0x194
__iommu_group_set_domain_internal+0xec/0x228
iommu_setup_default_domain+0x5f4/0x6a4
__iommu_probe_device+0x674/0x724
iommu_probe_device+0x50/0xb4
iommu_add_device+0x48/0x198
pci_dma_dev_setup_pSeriesLP+0x198/0x4f0
pcibios_bus_add_device+0x80/0x464
pci_bus_add_device+0x40/0x100
pci_bus_add_devices+0x54/0xb0
pcibios_init+0xd8/0x140
do_one_initcall+0x8c/0x598
kernel_init_freeable+0x3ec/0x850
kernel_init+0x34/0x270
ret_from_kernel_user_thread+0x14/0x1c

Fix this by using iommu_driver_get_domain_for_dev() instead of
iommu_get_domain_for_dev() in spapr_tce_platform_iommu_attach_dev(),
which is the appropriate helper for callers holding the group mutex.

Cc: stable@vger.kernel.org
Fixes: a75b2be249d6 ("iommu: Add iommu_driver_get_domain_for_dev() helper")
Closes: https://patchwork.ozlabs.org/project/linuxppc-dev/patch/d5c834ff-4c95-44dd-8bef-57242d63aeee@linux.ibm.com/
Signed-off-by: Nilay Shroff <nilay@linux.ibm.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
[Maddy: Added Closes, tested and reviewed by tags]
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20260310082129.3630996-1-nilay@linux.ibm.com

7w ago

Paolo Bonzini

dca01b0a

Documentation: kvm: fix formatting of the quirks table

1mo ago

Josh Law

560f763b

lib/bootconfig: check bounds before writing in __xbc_open_brace()

7w ago

Linus Torvalds

fc9f248d

Merge tag 'efi-fixes-for-v7.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi

1mo ago

Prithvi Tambewagh

14d4ac19

scsi: target: Fix recursive locking in __configfs_open_file()

2mo ago

Linus Torvalds

164cb546

Merge tag 'timers-urgent-2026-03-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

7w ago

Shashank Balaji

8cc7dd77

x86/apic: Disable x2apic on resume if the kernel expects so

When resuming from s2ram, firmware may re-enable x2apic mode, which may have
been disabled by the kernel during boot either because it doesn't support IRQ
remapping or for other reasons. This causes the kernel to continue using the
xapic interface, while the hardware is in x2apic mode, which causes hangs.
This happens on defconfig + bare metal + s2ram.

Fix this in lapic_resume() by disabling x2apic if the kernel expects it to be
disabled, i.e. when x2apic_mode = 0.

The ACPI v6.6 spec, Section 16.3 [1] says firmware restores either the
pre-sleep configuration or initial boot configuration for each CPU, including
MSR state:

When executing from the power-on reset vector as a result of waking from an
S2 or S3 sleep state, the platform firmware performs only the hardware
initialization required to restore the system to either the state the
platform was in prior to the initial operating system boot, or to the
pre-sleep configuration state. In multiprocessor systems, non-boot
processors should be placed in the same state as prior to the initial
operating system boot.

(further ahead)

If this is an S2 or S3 wake, then the platform runtime firmware restores
minimum context of the system before jumping to the waking vector. This
includes:

CPU configuration. Platform runtime firmware restores the pre-sleep
configuration or initial boot configuration of each CPU (MSR, MTRR,
firmware update, SMBase, and so on). Interrupts must be disabled (for
IA-32 processors, disabled by CLI instruction).

(and other things)

So at least as per the spec, re-enablement of x2apic by the firmware is
allowed if "x2apic on" is a part of the initial boot configuration.

[1] https://uefi.org/specs/ACPI/6.6/16_Waking_and_Sleeping.html#initialization

[ bp: Massage. ]

Fixes: 6e1cb38a2aef ("x64, x2apic/intr-remap: add x2apic support, including enabling interrupt-remapping")
Co-developed-by: Rahul Bukte <rahul.bukte@sony.com>
Signed-off-by: Rahul Bukte <rahul.bukte@sony.com>
Signed-off-by: Shashank Balaji <shashank.mahadasyam@sony.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Thomas Gleixner <tglx@kernel.org>
Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20260306-x2apic-fix-v2-1-bee99c12efa3@sony.com

1mo ago

Sayali Patil

146c9ab3

powerpc/selftests/copyloops: extend selftest to exercise __copy_tofrom_user_power7_vmx

7w ago

Paolo Bonzini

6b1ca262

KVM: x86: clarify leave_smm() return value

1mo ago

Josh Law

39ebc8d7

lib/bootconfig: fix off-by-one in xbc_verify_tree() unclosed brace error

7w ago

Linus Torvalds

014441d1

Merge tag 'i2c-for-7.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux

1mo ago

Mike Rapoport (Microsoft)

a4b0bf6a

x86/efi: defer freeing of boot services memory

2mo ago

Florian Fuchs

80bf3b28

scsi: devinfo: Add BLIST_SKIP_IO_HINTS for Iomega ZIP

2mo ago

Linus Torvalds

63724e95

Merge tag 'sched-urgent-2026-03-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

7w ago

Steven Rostedt

755a648e

time/jiffies: Mark jiffies_64_to_clock_t() notrace

1mo ago

Sayali Patil

6bc9c0a9

powerpc: fix KUAP warning in VMX usercopy path

On powerpc with PREEMPT_FULL or PREEMPT_LAZY and function tracing enabled,
KUAP warnings can be triggered from the VMX usercopy path under memory
stress workloads.

KUAP requires that no subfunctions are called once userspace access has
been enabled. The existing VMX copy implementation violates this
requirement by invoking enter_vmx_usercopy() from the assembly path after
userspace access has already been enabled. If preemption occurs
in this window, the AMR state may not be preserved correctly,
leading to unexpected userspace access state and resulting in
KUAP warnings.

Fix this by restructuring the VMX usercopy flow so that VMX selection
and VMX state management are centralized in raw_copy_tofrom_user(),
which is invoked by the raw_copy_{to,from,in}_user() wrappers.

The new flow is:

- raw_copy_{to,from,in}_user() calls raw_copy_tofrom_user()
- raw_copy_tofrom_user() decides whether to use the VMX path
based on size and CPU capability
- Call enter_vmx_usercopy() before enabling userspace access
- Enable userspace access as per the copy direction
and perform the VMX copy
- Disable userspace access as per the copy direction
- Call exit_vmx_usercopy()
- Fall back to the base copy routine if the VMX copy faults

With this change, the VMX assembly routines no longer perform VMX state
management or call helper functions; they only implement the
copy operations.
The previous feature-section based VMX selection inside
__copy_tofrom_user_power7() is removed, and a dedicated
__copy_tofrom_user_power7_vmx() entry point is introduced.

This ensures correct KUAP ordering, avoids subfunction calls
while KUAP is unlocked, and eliminates the warnings while preserving
the VMX fast path.

Fixes: de78a9c42a79 ("powerpc: Add a framework for Kernel Userspace Access Protection")
Reported-by: Shrikanth Hegde <sshegde@linux.ibm.com>
Closes: https://lore.kernel.org/all/20260109064917.777587-2-sshegde@linux.ibm.com/
Suggested-by: Christophe Leroy (CS GROUP) <chleroy@kernel.org>
Reviewed-by: Christophe Leroy (CS GROUP) <chleroy@kernel.org>
Co-developed-by: Aboorva Devarajan <aboorvad@linux.ibm.com>
Signed-off-by: Aboorva Devarajan <aboorvad@linux.ibm.com>
Signed-off-by: Sayali Patil <sayalip@linux.ibm.com>
Tested-by: Shrikanth Hegde <sshegde@linux.ibm.com>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20260304122201.153049-1-sayalip@linux.ibm.com

7w ago

Paolo Bonzini

3e745694

selftests: kvm: add a test that VMX validates controls on RSM

1mo ago

Linus Torvalds

c23719ab

Merge tag 'x86-urgent-2026-03-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

2mo ago

Charles Haithcock

cfc69c2e

i2c: i801: Revert "i2c: i801: replace acpi_lock with I2C bus lock"

This reverts commit f707d6b9e7c18f669adfdb443906d46cfbaaa0c1.

Under rare circumstances, multiple udev threads can collect i801 device
info on boot and walk i801_acpi_io_handler somewhat concurrently. The
first will note the area is reserved by acpi to prevent further touches.
This ultimately causes the area to be deregistered. The second will
enter i801_acpi_io_handler after the area is unregistered but before a
check can be made that the area is unregistered. i2c_lock_bus relies on
the now unregistered area containing lock_ops to lock the bus. The end
result is a kernel panic on boot with the following backtrace;

[ 14.971872] ioatdma 0000:09:00.2: enabling device (0100 -> 0102)
[ 14.971873] BUG: kernel NULL pointer dereference, address: 0000000000000000
[ 14.971880] #PF: supervisor read access in kernel mode
[ 14.971884] #PF: error_code(0x0000) - not-present page
[ 14.971887] PGD 0 P4D 0
[ 14.971894] Oops: 0000 [#1] PREEMPT SMP PTI
[ 14.971900] CPU: 5 PID: 956 Comm: systemd-udevd Not tainted 5.14.0-611.5.1.el9_7.x86_64 #1
[ 14.971905] Hardware name: XXXXXXXXXXXXXXXXXXXXXXX BIOS 1.20.10.SV91 01/30/2023
[ 14.971908] RIP: 0010:i801_acpi_io_handler+0x2d/0xb0 [i2c_i801]
[ 14.971929] Code: 00 00 49 8b 40 20 41 57 41 56 4d 8b b8 30 04 00 00 49 89 ce 41 55 41 89 d5 41 54 49 89 f4 be 02 00 00 00 55 4c 89 c5 53 89 fb <48> 8b 00 4c 89 c7 e8 18 61 54 e9 80 bd 80 04 00 00 00 75 09 4c 3b
[ 14.971933] RSP: 0018:ffffbaa841483838 EFLAGS: 00010282
[ 14.971938] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff9685e01ba568
[ 14.971941] RDX: 0000000000000008 RSI: 0000000000000002 RDI: 0000000000000000
[ 14.971944] RBP: ffff9685ca22f028 R08: ffff9685ca22f028 R09: ffff9685ca22f028
[ 14.971948] R10: 000000000000000b R11: 0000000000000580 R12: 0000000000000580
[ 14.971951] R13: 0000000000000008 R14: ffff9685e01ba568 R15: ffff9685c222f000
[ 14.971954] FS: 00007f8287c0ab40(0000) GS:ffff96a47f940000(0000) knlGS:0000000000000000
[ 14.971959] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 14.971963] CR2: 0000000000000000 CR3: 0000000168090001 CR4: 00000000003706f0
[ 14.971966] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 14.971968] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 14.971972] Call Trace:
[ 14.971977] <TASK>
[ 14.971981] ? show_trace_log_lvl+0x1c4/0x2df
[ 14.971994] ? show_trace_log_lvl+0x1c4/0x2df
[ 14.972003] ? acpi_ev_address_space_dispatch+0x16e/0x3c0
[ 14.972014] ? __die_body.cold+0x8/0xd
[ 14.972021] ? page_fault_oops+0x132/0x170
[ 14.972028] ? exc_page_fault+0x61/0x150
[ 14.972036] ? asm_exc_page_fault+0x22/0x30
[ 14.972045] ? i801_acpi_io_handler+0x2d/0xb0 [i2c_i801]
[ 14.972061] acpi_ev_address_space_dispatch+0x16e/0x3c0
[ 14.972069] ? __pfx_i801_acpi_io_handler+0x10/0x10 [i2c_i801]
[ 14.972085] acpi_ex_access_region+0x5b/0xd0
[ 14.972093] acpi_ex_field_datum_io+0x73/0x2e0
[ 14.972100] acpi_ex_read_data_from_field+0x8e/0x230
[ 14.972106] acpi_ex_resolve_node_to_value+0x23d/0x310
[ 14.972114] acpi_ds_evaluate_name_path+0xad/0x110
[ 14.972121] acpi_ds_exec_end_op+0x321/0x510
[ 14.972127] acpi_ps_parse_loop+0xf7/0x680
[ 14.972136] acpi_ps_parse_aml+0x17a/0x3d0
[ 14.972143] acpi_ps_execute_method+0x137/0x270
[ 14.972150] acpi_ns_evaluate+0x1f4/0x2e0
[ 14.972158] acpi_evaluate_object+0x134/0x2f0
[ 14.972164] acpi_evaluate_integer+0x50/0xe0
[ 14.972173] ? vsnprintf+0x24b/0x570
[ 14.972181] acpi_ac_get_state.part.0+0x23/0x70
[ 14.972189] get_ac_property+0x4e/0x60
[ 14.972195] power_supply_show_property+0x90/0x1f0
[ 14.972205] add_prop_uevent+0x29/0x90
[ 14.972213] power_supply_uevent+0x109/0x1d0
[ 14.972222] dev_uevent+0x10e/0x2f0
[ 14.972228] uevent_show+0x8e/0x100
[ 14.972236] dev_attr_show+0x19/0x40
[ 14.972246] sysfs_kf_seq_show+0x9b/0x100
[ 14.972253] seq_read_iter+0x120/0x4b0
[ 14.972262] ? selinux_file_permission+0x106/0x150
[ 14.972273] vfs_read+0x24f/0x3a0
[ 14.972284] ksys_read+0x5f/0xe0
[ 14.972291] do_syscall_64+0x5f/0xe0
...

The kernel panic is mitigated by setting limiting the count of udev
children to 1. Revert to using the acpi_lock to continue protecting
marking the area as owned by firmware without relying on a lock in
a potentially unmapped region of memory.

Fixes: f707d6b9e7c1 ("i2c: i801: replace acpi_lock with I2C bus lock")
Signed-off-by: Charles Haithcock <chaithco@redhat.com>
[wsa: added Fixes-tag and updated comment stating the importance of the lock]
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>