commits

Running stress-ng --schedpolicy 0 on an RT kernel on a big machine
might lead to the following WARNINGs (edited).

sched: DL de-boosted task PID 22725: REPLENISH flag missing

WARNING: CPU: 93 PID: 0 at kernel/sched/deadline.c:239 dequeue_task_dl+0x15c/0x1f8
... (running_bw underflow)
Call trace:
dequeue_task_dl+0x15c/0x1f8 (P)
dequeue_task+0x80/0x168
deactivate_task+0x24/0x50
push_dl_task+0x264/0x2e0
dl_task_timer+0x1b0/0x228
__hrtimer_run_queues+0x188/0x378
hrtimer_interrupt+0xfc/0x260
...

The problem is that when a SCHED_DEADLINE task (lock holder) is
changed to a lower priority class via sched_setscheduler(), it may
fail to properly inherit the parameters of potential DEADLINE donors
if it didn't already inherit them in the past (shorter deadline than
donor's at that time). This might lead to bandwidth accounting
corruption, as enqueue_task_dl() won't recognize the lock holder as
boosted.

The scenario occurs when:
1. A DEADLINE task (donor) blocks on a PI mutex held by another
   DEADLINE task (holder), but the holder doesn't inherit parameters
   (e.g., it already has a shorter deadline)
2. sched_setscheduler() changes the holder from DEADLINE to a lower
   class while still holding the mutex
3. The holder should now inherit DEADLINE parameters from the donor
   and be enqueued with ENQUEUE_REPLENISH, but this doesn't happen

Fix the issue by introducing __setscheduler_dl_pi(), which detects when
a DEADLINE (proper or boosted) task gets setscheduled to a lower
priority class. In that case, the function makes the task inherit the
DEADLINE parameters of the donor (pi_se) and sets the ENQUEUE_REPLENISH
flag to ensure proper bandwidth accounting during the next enqueue
operation.

Fixes: 2279f540ea7d ("sched/deadline: Fix priority inheritance with multiple scheduling classes")
Reported-by: Bruno Goncalves <bgoncalv@redhat.com>
Signed-off-by: Juri Lelli <juri.lelli@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20260302-upstream-fix-deadline-piboost-b4-v3-1-6ba32184a9e0@redhat.com

2mo ago

Linus Torvalds

11439c46

Linux 7.0-rc2 v7.0-rc2

2mo ago

Peter Zijlstra

528d89a4

x86/topo: Fix SNC topology mess

Per 4d6dd05d07d0 ("sched/topology: Fix sched domain build error for GNR, CWF in
SNC-3 mode"), the original crazy SNC-3 SLIT table was:

node distances:
node    0   1   2   3   4   5
   0:  10  15  17  21  28  26
   1:  15  10  15  23  26  23
   2:  17  15  10  26  23  21
   3:  21  28  26  10  15  17
   4:  23  26  23  15  10  15
   5:  26  23  21  17  15  10

And per:

https://lore.kernel.org/lkml/20250825075642.GQ3245006@noisy.programming.kicks-ass.net/

The suggestion was to average the off-trace clusters to restore sanity.

However, 4d6dd05d07d0 implements this under various assumptions:

- anything GNR/CWF with numa_in_package;
- there will never be more than 2 packages;
- the off-trace cluster will have distance >20

And then HPE shows up with a machine that matches the
Vendor-Family-Model checks but looks like this:

Here's an 8 socket (2 chassis) HPE system with SNC enabled:

node   0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
   0: 10 12 16 16 16 16 18 18 40 40 40 40 40 40 40 40
   1: 12 10 16 16 16 16 18 18 40 40 40 40 40 40 40 40
   2: 16 16 10 12 18 18 16 16 40 40 40 40 40 40 40 40
   3: 16 16 12 10 18 18 16 16 40 40 40 40 40 40 40 40
   4: 16 16 18 18 10 12 16 16 40 40 40 40 40 40 40 40
   5: 16 16 18 18 12 10 16 16 40 40 40 40 40 40 40 40
   6: 18 18 16 16 16 16 10 12 40 40 40 40 40 40 40 40
   7: 18 18 16 16 16 16 12 10 40 40 40 40 40 40 40 40
   8: 40 40 40 40 40 40 40 40 10 12 16 16 16 16 18 18
   9: 40 40 40 40 40 40 40 40 12 10 16 16 16 16 18 18
  10: 40 40 40 40 40 40 40 40 16 16 10 12 18 18 16 16
  11: 40 40 40 40 40 40 40 40 16 16 12 10 18 18 16 16
  12: 40 40 40 40 40 40 40 40 16 16 18 18 10 12 16 16
  13: 40 40 40 40 40 40 40 40 16 16 18 18 12 10 16 16
  14: 40 40 40 40 40 40 40 40 18 18 16 16 16 16 10 12
  15: 40 40 40 40 40 40 40 40 18 18 16 16 16 16 12 10

10 = Same chassis and socket
12 = Same chassis and socket (SNC)
16 = Same chassis and adjacent socket
18 = Same chassis and non-adjacent socket
40 = Different chassis

Turns out, the 'max 2 packages' thing is only relevant to the SNC-3 parts, the
smaller parts do 8 sockets (like usual). The above SLIT table is sane, but
violates the previous assumptions and trips a WARN.

Now that the topology code has a sensible measure of nodes-per-package, we can
use that to divine the SNC mode at hand, and only fix up SNC-3 topologies.
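
As a rough illustration of the averaging idea, here is a minimal user-space sketch. It is not the kernel's code. It assumes the 6-node SNC-3 table quoted earlier, three contiguously numbered nodes per package, and integer averaging; the helper names node_to_package(), avg_package_distance() and fixed_distance() are hypothetical.

```c
#define NR_NODES          6   /* size of the example SLIT table */
#define NODES_PER_PACKAGE 3   /* SNC-3: three NUMA nodes per package (assumed) */

/* The SNC-3 SLIT table quoted in the commit message. */
static const int slit[NR_NODES][NR_NODES] = {
	{ 10, 15, 17, 21, 28, 26 },
	{ 15, 10, 15, 23, 26, 23 },
	{ 17, 15, 10, 26, 23, 21 },
	{ 21, 28, 26, 10, 15, 17 },
	{ 23, 26, 23, 15, 10, 15 },
	{ 26, 23, 21, 17, 15, 10 },
};

/* Hypothetical helper: assumes nodes are numbered contiguously per package. */
static int node_to_package(int node)
{
	return node / NODES_PER_PACKAGE;
}

/* Integer average of all node-to-node distances between packages a and b. */
static int avg_package_distance(int a, int b)
{
	int sum = 0;

	for (int i = 0; i < NODES_PER_PACKAGE; i++)
		for (int j = 0; j < NODES_PER_PACKAGE; j++)
			sum += slit[a * NODES_PER_PACKAGE + i][b * NODES_PER_PACKAGE + j];

	return sum / (NODES_PER_PACKAGE * NODES_PER_PACKAGE);
}

/* Keep in-package distances as-is; flatten cross-package ones to the average. */
static int fixed_distance(int from, int to)
{
	int pa = node_to_package(from);
	int pb = node_to_package(to);

	if (pa == pb)
		return slit[from][to];

	return avg_package_distance(pa, pb);
}
```

With this table, every cross-package pair ends up with the same distance, while the 10/15/17 distances inside a package are left untouched.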

There is a 'healthy' amount of paranoia code validating the assumptions on the
SLIT table, a simple pr_err(FW_BUG) print on failure and a fallback to using
the regular table. Let's see how long this lasts :-)

Fixes: 4d6dd05d07d0 ("sched/topology: Fix sched domain build error for GNR, CWF in SNC-3 mode")
Reported-by: Kyle Meyer <kyle.meyer@hpe.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Tested-by: K Prateek Nayak <kprateek.nayak@amd.com>
Tested-by: Zhang Rui <rui.zhang@intel.com>
Tested-by: Chen Yu <yu.c.chen@intel.com>
Tested-by: Kyle Meyer <kyle.meyer@hpe.com>
Link: https://patch.msgid.link/20260303110100.238361290@infradead.org

2mo ago

Linus Torvalds

3b5d535c

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

2mo ago

Linus Torvalds

949d0a46

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

2mo ago

Peter Zijlstra

717b64d5

x86/topo: Replace x86_has_numa_in_package

2mo ago

Linus Torvalds

fb07430e

Merge tag 'fbdev-for-7.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev

2mo ago

Prithvi Tambewagh

14d4ac19

scsi: target: Fix recursive locking in __configfs_open_file()

2mo ago

Linus Torvalds

e2bd1b13

Merge tag 'core-debugobjects-2026-03-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

2mo ago

Paolo Bonzini

55365ab8

Merge tag 'kvmarm-fixes-7.0-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

2mo ago

Peter Zijlstra

ae6730ff

x86/topo: Add topology_num_nodes_per_package()

2mo ago

Linus Torvalds

6deccafc

Merge tag 'parisc-for-7.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux

2mo ago

Helge Deller

e31a374a

fbdev: au1100fb: Fix build on MIPS64

2mo ago

Florian Fuchs

80bf3b28

scsi: devinfo: Add BLIST_SKIP_IO_HINTS for Iomega ZIP

2mo ago

Linus Torvalds

5920da44

Merge tag 'x86-urgent-2026-03-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

2mo ago

Thomas Gleixner

fd363431

debugobject: Make it work with deferred page initialization - again

2mo ago

Paolo Bonzini

70295a47

KVM: always define KVM_CAP_SYNC_MMU

2mo ago

Marc Zyngier

54e367cb

KVM: arm64: Deduplicate ASID retrieval code

2mo ago

Peter Zijlstra

48084cc1

x86/numa: Store extra copy of numa_nodes_parsed

2mo ago

Linus Torvalds

8b7f4cd3

Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf

2mo ago

Helge Deller

8475d8fe

parisc: Fix initial page table creation for boot

2mo ago

Ranjan Kumar

dbd53975

scsi: mpi3mr: Clear reset history on ready and recheck state after timeout

2mo ago

Linus Torvalds

f6542af9

Merge tag 'timers-urgent-2026-03-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

2mo ago

Thomas Huth

237dc6a0

x86/headers: Replace __ASSEMBLY__ stragglers with __ASSEMBLER__

2mo ago

Linus Torvalds

05f7e89a

Linux 6.19 v6.19

2mo ago

Paolo Bonzini

407fd8b8

KVM: remove CONFIG_KVM_GENERIC_MMU_NOTIFIER

2mo ago

Sascha Bischoff

29c8b85a

irqchip/gic-v5: Fix inversion of IRS_IDR0.virt flag

2mo ago

Jan Stancek

3d1973a0

x86/boot: Handle relative CONFIG_EFI_SBAT_FILE file paths

2mo ago

Linus Torvalds

03dcad79

Merge tag 'rcu-fixes.v7.0-20260307a' of git://git.kernel.org/pub/scm/linux/kernel/git/rcu/linux

2mo ago

Ihor Solodrai

b0dcdcb9

resolve_btfids: Fix linker flags detection

2mo ago

Helge Deller

17c144f1

parisc: Check kernel mapping earlier at bootup

2mo ago

Junxiao Bi

1ac22c8e

scsi: core: Fix refcount leak for tagset_refcnt

2mo ago

Linus Torvalds

61706251

Merge tag 'sched-urgent-2026-03-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

2mo ago

Eric Dumazet

b777b5e0

time/jiffies: Inline jiffies_to_msecs() and jiffies_to_usecs()

2mo ago

Peter Zijlstra

24c8147a

x86/cfi: Fix CFI rewrite for odd alignments

2mo ago

Linus Torvalds

e98f34af

Merge tag 'i2c-for-6.19-final' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux

2mo ago

Linus Torvalds

6de23f81

Linux 7.0-rc1 v7.0-rc1

2mo ago

Fuad Tabba

ec197dca

KVM: arm64: Revert accidental drop of kvm_uninit_stage2_mmu() for non-NV VMs

2mo ago

Kim Phillips

9073428b

x86/sev: Allow IBPB-on-Entry feature for SNP guests

2mo ago

Linus Torvalds

aed0af05

Merge tag 'trace-v7.0-rc2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

2mo ago

Paul E. McKenney

78c2ce0f

scftorture: Update due to x86 not supporting none/voluntary preemption

2mo ago

Alexei Starovoitov

325d1ba3

Merge branch 'bpf-fix-precision-backtracking-bug-with-linked-registers'

Eduard Zingerman says:

====================
bpf: Fix precision backtracking bug with linked registers

Emil Tsalapatis reported a verifier bug hit by the scx_lavd sched_ext
scheduler. The essential part of the verifier log looks as follows:

436: ...
// checkpoint hit for 438: (1d) if r7 == r8 goto ...
frame 3: propagating r2,r7,r8
frame 2: propagating r6
mark_precise: frame3: last_idx ...
mark_precise: frame3: regs=r2,r7,r8 stack= before 436: ...
mark_precise: frame3: regs=r2,r7 stack= before 435: ...
mark_precise: frame3: regs=r2,r7 stack= before 434: (85) call bpf_trace_vprintk#177
verifier bug: backtracking call unexpected regs 84

The log complains that registers r2 and r7 are tracked as precise
while processing the bpf_trace_vprintk() call in precision backtracking.
This can't be right, as r2 is reset by the call and there is nothing
to backtrack it to. The precision propagation is triggered when
a checkpoint is hit at instruction 438; r2 is dead at that instruction.

This happens because of the following sequence of events:
- Instruction 438 is first reached with registers r2 and r7 having
  the same id via a path that does not call bpf_trace_vprintk():
  - Checkpoint is created at 438.
  - The jump at 438 is predicted, hence r7 and registers linked to it
    (r2) are propagated as precise, marking r2 and r7 precise in the
    checkpoint.
- Instruction 438 is reached a second time with r2 undefined and via
  a path that calls bpf_trace_vprintk():
  - Checkpoint is hit.
  - propagate_precision() picks registers r2 and r7 and propagates
    precision marks for those up to the helper call.

The root cause is the fact that states_equal() and
propagate_precision() assume that the precision flag can't be set for a
dead register (as computed by compute_live_registers()).
However, this is not the case when linked registers are at play.
Fix this by accounting for live register flags in
collect_linked_regs().
---
====================

Link: https://patch.msgid.link/20260306-linked-regs-and-propagate-precision-v1-0-18e859be570d@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

2mo ago

Helge Deller

8e732934

parisc: Increase initial mapping to 64 MB with KALLSYMS

2mo ago

wangshuaiwei

2f38fd99

scsi: ufs: core: Fix shift out of bounds when MAXQ=32

2mo ago

Linus Torvalds

cb36eabc

Merge tag 'perf-urgent-2026-03-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

2mo ago

Mathieu Desnoyers

3b68df97

rseq: slice ext: Ensure rseq feature size differs from original rseq size

2mo ago

Linus Torvalds

192c0159

Merge tag 'powerpc-7.0-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates for 7.0

- Implement masked user access

- Add bpf support for internal only per-CPU instructions and inline the
bpf_get_smp_processor_id() and bpf_get_current_task() functions

- Fix pSeries MSI-X allocation failure when quota is exceeded

- Fix recursive pci_lock_rescan_remove locking in EEH event handling

- Support tailcalls with subprogs & BPF exceptions on 64bit

- Extend "trusted" keys to support the PowerVM Key Wrapping Module
(PKWM)

Thanks to Abhishek Dubey, Christophe Leroy, Gaurav Batra, Guangshuo Li,
Jarkko Sakkinen, Mahesh Salgaonkar, Mimi Zohar, Miquel Sabaté Solà, Nam
Cao, Narayana Murty N, Nayna Jain, Nilay Shroff, Puranjay Mohan, Saket
Kumar Bhaskar, Sourabh Jain, Srish Srinivasan, and Venkat Rao Bagalkote.

* tag 'powerpc-7.0-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (27 commits)
powerpc/pseries: plpks: export plpks_wrapping_is_supported
docs: trusted-encryped: add PKWM as a new trust source
keys/trusted_keys: establish PKWM as a trusted source
pseries/plpks: add HCALLs for PowerVM Key Wrapping Module
pseries/plpks: expose PowerVM wrapping features via the sysfs
powerpc/pseries: move the PLPKS config inside its own sysfs directory
pseries/plpks: fix kernel-doc comment inconsistencies
powerpc/smp: Add check for kcalloc() failure in parse_thread_groups()
powerpc: kgdb: Remove OUTBUFMAX constant
powerpc64/bpf: Additional NVR handling for bpf_throw
powerpc64/bpf: Support exceptions
powerpc64/bpf: Add arch_bpf_stack_walk() for BPF JIT
powerpc64/bpf: Avoid tailcall restore from trampoline
powerpc64/bpf: Support tailcalls with subprogs
powerpc64/bpf: Moving tail_call_cnt to bottom of frame
powerpc/eeh: fix recursive pci_lock_rescan_remove locking in EEH event handling
powerpc/pseries: Fix MSI-X allocation failure when quota is exceeded
powerpc/iommu: bypass DMA APIs for coherent allocations for pre-mapped memory
powerpc64/bpf: Inline bpf_get_smp_processor_id() and bpf_get_current_task/_btf()
powerpc64/bpf: Support internal-only MOV instruction to resolve per-CPU addrs
...

2mo ago

Hou Wenlong

a0cb371b

x86/bug: Handle __WARN_printf() trap in early_fixup_exception()

2mo ago

Linus Torvalds

e7aa5724

Merge tag 'spi-fix-v6.19-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi

2mo ago

LI Qingwu

b126097b

i2c: imx: preserve error state in block data length handler

3mo ago

Linus Torvalds

fbf33803

Merge tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fsverity/linux

2mo ago

Marc Zyngier

08f97454

KVM: arm64: Fix protected mode handling of pages larger than 4kB

2mo ago

Tom Lendacky

4ca191ce

x86/boot/sev: Move SEV decompressor variables into the .data section

2mo ago

Merge tag 'x86-urgent-2026-03-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

c23719ab

Linus Torvalds

2mo

Merge tag 'timers-urgent-2026-03-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

6ff1020c

Linus Torvalds

2mo

x86/entry/vdso32: Work around libgcc unwinder bug

b5ef09a7

H. Peter Anvin

2mo

Merge tag 'sched-urgent-2026-03-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

b1b9a9d0

Linus Torvalds

2mo

timekeeping: Fix timex status validation for auxiliary clocks

e48a8699

Miroslav Lichvar

2mo

x86/resctrl: Fix SNC detection

59674fc9

Tony Luck

2mo

eventpoll: Convert epoll_put_uevent() to scoped user access

1954c4f0

Eric Dumazet

2mo

sched/deadline: Fix missing ENQUEUE_REPLENISH during PI de-boosting

d658686a

Juri Lelli

2mo

rseq: slice ext: Ensure rseq feature size differs from original rseq size

Before rseq became extensible, its original size was 32 bytes even
though the active rseq area was only 20 bytes. This had the following
impact in terms of userspace ecosystem evolution:

* GNU libc versions 2.35 through 2.39 expose a __rseq_size symbol set
to 32, even though the size of the active rseq area is really 20.
* GNU libc 2.40 changes __rseq_size to 20, thus making it
express the active rseq area.
* Starting from glibc 2.41, __rseq_size corresponds to the
AT_RSEQ_FEATURE_SIZE from getauxval(3).

This means that users of __rseq_size can always expect it to
correspond to the active rseq area, except for the value 32, for
which the active rseq area is 20 bytes.

Exposing a 32-byte feature size would make life needlessly painful
for userspace. Therefore, add a reserved field at the end of the
rseq area to bump the feature size to 33 bytes. This reserved field
is expected to be replaced with whatever field comes next, on the
expectation that the next field will be larger than 1 byte.

The effect of this change is to increase the size from 32 to 64 bytes
before we actually have fields using that memory.

Clarify the allocation size and alignment requirements in the struct
rseq uapi comment.

Change the value returned by getauxval(AT_RSEQ_ALIGN) to return the
value of the active rseq area size rounded up to next power of 2, which
guarantees that the rseq structure will always be aligned on the nearest
power of two large enough to contain it, even as it grows. Change the
alignment check in the rseq registration accordingly.
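
The rounding rule can be sketched in a few lines of user-space C. This is only an illustration of "round the active size up to the next power of 2", not the kernel's implementation; round_up_pow2() is a hypothetical helper name.

```c
#include <stdint.h>

/*
 * Round v up to the next power of two (v itself if it already is one).
 * Assumes 1 <= v <= 2^31 so the shift cannot overflow.
 */
static uint32_t round_up_pow2(uint32_t v)
{
	uint32_t p = 1;

	while (p < v)
		p <<= 1;

	return p;
}
```

For an active area of 33 bytes this gives 64, which matches the jump from 32 to 64 bytes mentioned above.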

This will minimize the amount of ABI corner-cases we need to document
and require userspace to play games with. The rule stays simple when
__rseq_size != 32:

#define rseq_field_available(field) (__rseq_size >= offsetofend(struct rseq_abi, field))
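
A minimal, self-contained sketch of how such a check could behave is shown here. The struct is a simplified stand-in rather than the real uapi struct rseq, offsetofend() is defined locally, and rseq_active_size() and rseq_field_available_for() are hypothetical names that take the size as a parameter instead of reading the __rseq_size symbol.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in layout for illustration; not the real uapi struct. */
struct rseq_abi {
	uint32_t cpu_id_start;
	uint32_t cpu_id;
	uint64_t rseq_cs;
	uint32_t flags;		/* ends at byte 20: the original active area */
	uint32_t node_id;	/* ends at byte 24 */
	uint32_t mm_cid;	/* ends at byte 28 */
};

/* Offset of the first byte past a member. */
#define offsetofend(type, member) \
	(offsetof(type, member) + sizeof(((type *)0)->member))

/* The one special case: a reported size of 32 means 20 active bytes. */
static unsigned int rseq_active_size(unsigned int rseq_size)
{
	return rseq_size == 32 ? 20 : rseq_size;
}

/* Same comparison as the macro above, with the size passed explicitly. */
#define rseq_field_available_for(size, field) \
	(rseq_active_size(size) >= offsetofend(struct rseq_abi, field))
```

Because the feature size never takes the value 32 after this change, a caller that sees 32 can only be looking at the legacy glibc value, which is why mapping it to 20 is safe in this sketch.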

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20260220200642.1317826-3-mathieu.desnoyers@efficios.com

3b68df97

Mathieu Desnoyers

2mo

KVM: arm64: Fix protected mode handling of pages larger than 4kB

Since 3669ddd8fa8b5 ("KVM: arm64: Add a range to pkvm_mappings"),
pKVM tracks the memory that has been mapped into a guest in a
side data structure. Crucially, it uses it to find out whether
a page has already been mapped, and therefore refuses to map it
twice. So far, so good.

However, this very patch completely breaks non-4kB page support,
with guests being unable to boot. The most obvious symptom is that
we take the same fault repeatedly without making forward progress.
A quick investigation shows that this is because of the above
rejection code.

As it turns out, there are multiple issues at play:

- while the HPFAR_EL2 register gives you the faulting IPA minus
the bottom 12 bits, it will still give you the extra bits that
are part of the page offset for anything larger than 4kB,
even for a level-3 mapping

- pkvm_pgtable_stage2_map() assumes that the address passed as
a parameter is aligned to the size of the intended mapping

- the faulting address is only aligned for a non-page mapping

When the planets are suitably aligned (pun intended), the guest
faults on a page by accessing it past the bottom 4kB, and extra bits
get set in the HPFAR_EL2 register. If this results in a page mapping
(which is likely with large granule sizes), nothing aligns it further
down, and pkvm_mapping_iter_first() finds an intersection that
doesn't really exist. We assume this is a spurious fault and return
-EAGAIN. And again...

This doesn't hit outside of the protected code, as the page table
code always aligns the IPA down to a page boundary, hiding the issue
for everyone else.

Fix it by always forcing the alignment on vma_pagesize, irrespective
of the value of vma_pagesize.
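
The alignment problem can be illustrated with a small user-space sketch. It is not the KVM code: fault_ipa_from_hw() is a hypothetical stand-in for an address reported with only its bottom 12 bits cleared, and align_down() is a generic helper that assumes the size is a power of two.

```c
#include <stdint.h>

/* Align addr down to a multiple of size; size must be a power of two. */
static uint64_t align_down(uint64_t addr, uint64_t size)
{
	return addr & ~(size - 1);
}

/*
 * Hypothetical stand-in for the reported faulting IPA: only the bottom
 * 12 bits are dropped, whatever the actual page size is.
 */
static uint64_t fault_ipa_from_hw(uint64_t ipa)
{
	return ipa & ~UINT64_C(0xfff);
}
```

With a 64kB granule, an access at 0x40013abc is reported as 0x40013000, which is not page aligned; aligning it down to the mapping size yields 0x40010000, the start of the page that actually gets mapped.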

Fixes: 3669ddd8fa8b5 ("KVM: arm64: Add a range to pkvm_mappings")
Reviewed-by: Fuad Tabba <tabba@google.com>
Tested-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://patch.msgid.link/20260222141000.3084258-1-maz@kernel.org
Cc: stable@vger.kernel.org

08f97454

Marc Zyngier

2mo
