Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git


randomize_kstack: Maintain kstack_offset per task

kstack_offset was previously maintained per-cpu, but this caused a
couple of issues. So let's instead make it per-task.

Issue 1: add_random_kstack_offset() and choose_random_kstack_offset()
were expected and required to be called with interrupts and preemption
disabled so that they could manipulate per-cpu state. But arm64, loongarch
and riscv call them with interrupts and preemption enabled. I
don't _think_ this causes any functional issues, but it's certainly
unexpected and could lead to manipulating the wrong cpu's state, which
could cause a minor performance degradation due to bouncing the cache
lines. By maintaining the state per-task, those functions can safely be
called in preemptible context.

Issue 2: add_random_kstack_offset() is called before executing the
syscall and expands the stack using a previously chosen random offset.
choose_random_kstack_offset() is called after executing the syscall and
chooses and stores a new random offset for the next syscall. With
per-cpu storage for this offset, an attacker could force cpu migration
during the execution of the syscall and prevent the offset from being
updated for the original cpu such that it is predictable for the next
syscall on that cpu. By maintaining the state per-task, this problem
goes away because the per-task random offset is updated after the
syscall regardless of which cpu it is executing on.

Fixes: 39218ff4c625 ("stack: Optionally randomize kernel stack offset each syscall")
Closes: https://lore.kernel.org/all/dd8c37bc-795f-4c7a-9086-69e584d8ab24@arm.com/
Cc: stable@vger.kernel.org
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Link: https://patch.msgid.link/20260303150840.3789438-2-ryan.roberts@arm.com
Signed-off-by: Kees Cook <kees@kernel.org>

authored by Ryan Roberts and committed by Kees Cook
37beb425 11439c46

+21 -12
+15 -11
include/linux/randomize_kstack.h
@@ -9,7 +9,6 @@
 
 DECLARE_STATIC_KEY_MAYBE(CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT,
 			 randomize_kstack_offset);
-DECLARE_PER_CPU(u32, kstack_offset);
 
 /*
  * Do not use this anywhere else in the kernel. This is used here because
@@ -49,15 +50,14 @@
  * add_random_kstack_offset - Increase stack utilization by previously
  * chosen random offset
  *
- * This should be used in the syscall entry path when interrupts and
- * preempt are disabled, and after user registers have been stored to
- * the stack. For testing the resulting entropy, please see:
- * tools/testing/selftests/lkdtm/stack-entropy.sh
+ * This should be used in the syscall entry path after user registers have been
+ * stored to the stack. Preemption may be enabled. For testing the resulting
+ * entropy, please see: tools/testing/selftests/lkdtm/stack-entropy.sh
  */
 #define add_random_kstack_offset() do {					\
 	if (static_branch_maybe(CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT,	\
 				&randomize_kstack_offset)) {		\
-		u32 offset = raw_cpu_read(kstack_offset);		\
+		u32 offset = current->kstack_offset;			\
 		u8 *ptr = __kstack_alloca(KSTACK_OFFSET_MAX(offset));	\
 		/* Keep allocation even after "ptr" loses scope. */	\
 		asm volatile("" :: "r"(ptr) : "memory");		\
@@ -67,9 +69,9 @@
  * choose_random_kstack_offset - Choose the random offset for the next
  * add_random_kstack_offset()
  *
- * This should only be used during syscall exit when interrupts and
- * preempt are disabled. This position in the syscall flow is done to
- * frustrate attacks from userspace attempting to learn the next offset:
+ * This should only be used during syscall exit. Preemption may be enabled. This
+ * position in the syscall flow is done to frustrate attacks from userspace
+ * attempting to learn the next offset:
  * - Maximize the timing uncertainty visible from userspace: if the
  *   offset is chosen at syscall entry, userspace has much more control
  *   over the timing between choosing offsets. "How long will we be in
@@ -83,14 +85,20 @@
 #define choose_random_kstack_offset(rand) do {				\
 	if (static_branch_maybe(CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT,	\
 				&randomize_kstack_offset)) {		\
-		u32 offset = raw_cpu_read(kstack_offset);		\
+		u32 offset = current->kstack_offset;			\
 		offset = ror32(offset, 5) ^ (rand);			\
-		raw_cpu_write(kstack_offset, offset);			\
+		current->kstack_offset = offset;			\
 	}								\
 } while (0)
+
+static inline void random_kstack_task_init(struct task_struct *tsk)
+{
+	tsk->kstack_offset = 0;
+}
 #else /* CONFIG_RANDOMIZE_KSTACK_OFFSET */
 #define add_random_kstack_offset() do { } while (0)
 #define choose_random_kstack_offset(rand) do { } while (0)
+#define random_kstack_task_init(tsk) do { } while (0)
 #endif /* CONFIG_RANDOMIZE_KSTACK_OFFSET */
 
 #endif
+4
include/linux/sched.h
@@ -1592,6 +1592,10 @@
 	unsigned long prev_lowest_stack;
 #endif
 
+#ifdef CONFIG_RANDOMIZE_KSTACK_OFFSET
+	u32 kstack_offset;
+#endif
+
 #ifdef CONFIG_X86_MCE
 	void __user *mce_vaddr;
 	__u64 mce_kflags;
-1
init/main.c
@@ -833,7 +833,6 @@
 #ifdef CONFIG_RANDOMIZE_KSTACK_OFFSET
 DEFINE_STATIC_KEY_MAYBE_RO(CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT,
 			   randomize_kstack_offset);
-DEFINE_PER_CPU(u32, kstack_offset);
 
 static int __init early_randomize_kstack_offset(char *buf)
 {
+2
kernel/fork.c
@@ -95,6 +95,7 @@
 #include <linux/thread_info.h>
 #include <linux/kstack_erase.h>
 #include <linux/kasan.h>
+#include <linux/randomize_kstack.h>
 #include <linux/scs.h>
 #include <linux/io_uring.h>
 #include <linux/io_uring_types.h>
@@ -2234,6 +2233,7 @@
 	if (retval)
 		goto bad_fork_cleanup_io;
 
+	random_kstack_task_init(p);
 	stackleak_task_init(p);
 
 	if (pid != &init_struct_pid) {