Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

powerpc: Avoid nmi_enter/nmi_exit in real mode interrupt.

nmi_enter()/nmi_exit() touch per-CPU variables, which can lead to a kernel
crash when they are invoked during real mode interrupt handling (e.g. the
early HMI/MCE interrupt handlers) if the percpu allocation comes from the
vmalloc area.

Early HMI/MCE handlers are called through the DEFINE_INTERRUPT_HANDLER_NMI()
wrapper, which invokes nmi_enter()/nmi_exit(). There is no issue when the
percpu allocation comes from the embedded first chunk. However, with
CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK enabled, the percpu allocation can come
from the vmalloc area.

With the kernel command line option "percpu_alloc=page" we can force the
percpu allocation to come from the vmalloc area and see a kernel crash in
machine_check_early:

[ 1.215714] NIP [c000000000e49eb4] rcu_nmi_enter+0x24/0x110
[ 1.215717] LR [c0000000000461a0] machine_check_early+0xf0/0x2c0
[ 1.215719] --- interrupt: 200
[ 1.215720] [c000000fffd73180] [0000000000000000] 0x0 (unreliable)
[ 1.215722] [c000000fffd731b0] [0000000000000000] 0x0
[ 1.215724] [c000000fffd73210] [c000000000008364] machine_check_early_common+0x134/0x1f8

Fix this by not calling nmi_enter()/nmi_exit() in real mode if the percpu
first chunk is not embedded.

Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Tested-by: Shirisha Ganta <shirisha@linux.ibm.com>
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20240410043006.81577-1-mahesh@linux.ibm.com

Authored by Mahesh Salgaonkar and committed by Michael Ellerman
0db880fc 676b2f99

+22
+10
arch/powerpc/include/asm/interrupt.h
···
 	if (IS_ENABLED(CONFIG_KASAN))
 		return;
 
+	/*
+	 * Likewise, do not use it in real mode if percpu first chunk is not
+	 * embedded. With CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK enabled there
+	 * are chances where percpu allocation can come from vmalloc area.
+	 */
+	if (percpu_first_chunk_is_paged)
+		return;
+
 	/* Otherwise, it should be safe to call it */
 	nmi_enter();
 }
···
 		// no nmi_exit for a pseries hash guest taking a real mode exception
 	} else if (IS_ENABLED(CONFIG_KASAN)) {
 		// no nmi_exit for KASAN in real mode
+	} else if (percpu_first_chunk_is_paged) {
+		// no nmi_exit if percpu first chunk is not embedded
 	} else {
 		nmi_exit();
 	}
+10
arch/powerpc/include/asm/percpu.h
···
 #endif /* CONFIG_SMP */
 #endif /* __powerpc64__ */
 
+#if defined(CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK) && defined(CONFIG_SMP)
+#include <linux/jump_label.h>
+DECLARE_STATIC_KEY_FALSE(__percpu_first_chunk_is_paged);
+
+#define percpu_first_chunk_is_paged	\
+		(static_key_enabled(&__percpu_first_chunk_is_paged.key))
+#else
+#define percpu_first_chunk_is_paged	false
+#endif /* CONFIG_PPC64 && CONFIG_SMP */
+
 #include <asm-generic/percpu.h>
 
 #include <asm/paca.h>
+2
arch/powerpc/kernel/setup_64.c
···
 
 unsigned long __per_cpu_offset[NR_CPUS] __read_mostly;
 EXPORT_SYMBOL(__per_cpu_offset);
+DEFINE_STATIC_KEY_FALSE(__percpu_first_chunk_is_paged);
 
 void __init setup_per_cpu_areas(void)
 {
···
 	if (rc < 0)
 		panic("cannot initialize percpu area (err=%d)", rc);
 
+	static_key_enable(&__percpu_first_chunk_is_paged.key);
 	delta = (unsigned long)pcpu_base_addr - (unsigned long)__per_cpu_start;
 	for_each_possible_cpu(cpu) {
 		__per_cpu_offset[cpu] = delta + pcpu_unit_offsets[cpu];
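
The shape of the guard added by this patch can be illustrated outside the kernel. The following is a minimal userspace sketch, not the kernel implementation: a plain bool stands in for the __percpu_first_chunk_is_paged static key, a counter stands in for the per-CPU state that nmi_enter()/nmi_exit() update, and every function name in it (fake_nmi_enter, fake_nmi_exit, setup_paged_first_chunk, real_mode_nmi_prepare, real_mode_nmi_finish, run_handler) is hypothetical. It only shows that the entry and exit paths test the same flag, so the bookkeeping is either done on both sides or skipped on both sides.

```c
/* Userspace illustration only; none of these names exist in the kernel. */
#include <assert.h>
#include <stdbool.h>

/* Assumption: a plain bool models the static key from the patch. */
static bool first_chunk_is_paged;

/* Assumption: a counter models the per-CPU NMI nesting state. */
static int nmi_depth;

static void fake_nmi_enter(void)
{
	nmi_depth++;
}

static void fake_nmi_exit(void)
{
	/* An exit without a matching enter would be a bug. */
	assert(nmi_depth > 0);
	nmi_depth--;
}

/* Models the boot-time step that enables the key. */
static void setup_paged_first_chunk(void)
{
	first_chunk_is_paged = true;
}

/* Entry side: skip the bookkeeping when the flag is set. */
static void real_mode_nmi_prepare(void)
{
	if (first_chunk_is_paged)
		return;
	fake_nmi_enter();
}

/* Exit side: mirrors the entry check so enter/exit stay balanced. */
static void real_mode_nmi_finish(void)
{
	if (first_chunk_is_paged)
		return;
	fake_nmi_exit();
}

/* Runs one simulated handler and returns the depth seen inside it. */
static int run_handler(void)
{
	int seen;

	real_mode_nmi_prepare();
	seen = nmi_depth;
	real_mode_nmi_finish();
	return seen;
}
```

Calling run_handler() before setup_paged_first_chunk() observes a depth of 1 inside the handler; after it, the handler runs with the counter untouched. In both cases the counter is back at 0 afterwards, which is the balance the matching else-if branch in the exit path preserves.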