Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
kernel os linux
1
fork

Configure Feed

Select the types of activity you want to include in your feed.

Revert "x86/fault: BUG() when uaccess helpers fault on kernel addresses"

This reverts commit 9da3f2b74054406f87dff7101a569217ffceb29b.

It was well-intentioned, but wrong. Overriding the exception tables for
instructions for random reasons is just wrong, and that is what the new
code did.

It caused problems for tracing, and it caused problems for strncpy_from_user(),
because the new checks made perfectly valid use cases break, rather than
catch things that did bad things.

Unchecked user space accesses are a problem, but that's not a reason to
add invalid checks that then people have to work around with silly flags
(in this case, that 'kernel_uaccess_faults_ok' flag, which is just an
odd way to say "this commit was wrong" and was sprinked into random
places to hide the wrongness).

The real fix to unchecked user space accesses is to get rid of the
special "let's not check __get_user() and __put_user() at all" logic.
Make __{get|put}_user() be just aliases to the regular {get|put}_user()
functions, and make it impossible to access user space without having
the proper checks in places.

The raison d'être of the special double-underscore versions used to be
that the range check was expensive, and if you did multiple user
accesses, you'd do the range check up front (like the signal frame
handling code, for example). But SMAP (on x86) and PAN (on ARM) have
made that optimization pointless, because the _real_ expense is the "set
CPU flag to allow user space access".

Do let's not break the valid cases to catch invalid cases that shouldn't
even exist.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Kees Cook <keescook@chromium.org>
Cc: Tobin C. Harding <tobin@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Jann Horn <jannh@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

-72
-58
arch/x86/mm/extable.c
··· 117 117 } 118 118 EXPORT_SYMBOL_GPL(ex_handler_fprestore); 119 119 120 - /* Helper to check whether a uaccess fault indicates a kernel bug. */ 121 - static bool bogus_uaccess(struct pt_regs *regs, int trapnr, 122 - unsigned long fault_addr) 123 - { 124 - /* This is the normal case: #PF with a fault address in userspace. */ 125 - if (trapnr == X86_TRAP_PF && fault_addr < TASK_SIZE_MAX) 126 - return false; 127 - 128 - /* 129 - * This code can be reached for machine checks, but only if the #MC 130 - * handler has already decided that it looks like a candidate for fixup. 131 - * This e.g. happens when attempting to access userspace memory which 132 - * the CPU can't access because of uncorrectable bad memory. 133 - */ 134 - if (trapnr == X86_TRAP_MC) 135 - return false; 136 - 137 - /* 138 - * There are two remaining exception types we might encounter here: 139 - * - #PF for faulting accesses to kernel addresses 140 - * - #GP for faulting accesses to noncanonical addresses 141 - * Complain about anything else. 142 - */ 143 - if (trapnr != X86_TRAP_PF && trapnr != X86_TRAP_GP) { 144 - WARN(1, "unexpected trap %d in uaccess\n", trapnr); 145 - return false; 146 - } 147 - 148 - /* 149 - * This is a faulting memory access in kernel space, on a kernel 150 - * address, in a usercopy function. This can e.g. be caused by improper 151 - * use of helpers like __put_user and by improper attempts to access 152 - * userspace addresses in KERNEL_DS regions. 153 - * The one (semi-)legitimate exception are probe_kernel_{read,write}(), 154 - * which can be invoked from places like kgdb, /dev/mem (for reading) 155 - * and privileged BPF code (for reading). 156 - * The probe_kernel_*() functions set the kernel_uaccess_faults_ok flag 157 - * to tell us that faulting on kernel addresses, and even noncanonical 158 - * addresses, in a userspace accessor does not necessarily imply a 159 - * kernel bug, root might just be doing weird stuff. 160 - */ 161 - if (current->kernel_uaccess_faults_ok) 162 - return false; 163 - 164 - /* This is bad. Refuse the fixup so that we go into die(). */ 165 - if (trapnr == X86_TRAP_PF) { 166 - pr_emerg("BUG: pagefault on kernel address 0x%lx in non-whitelisted uaccess\n", 167 - fault_addr); 168 - } else { 169 - pr_emerg("BUG: GPF in non-whitelisted uaccess (non-canonical address?)\n"); 170 - } 171 - return true; 172 - } 173 - 174 120 __visible bool ex_handler_uaccess(const struct exception_table_entry *fixup, 175 121 struct pt_regs *regs, int trapnr, 176 122 unsigned long error_code, 177 123 unsigned long fault_addr) 178 124 { 179 - if (bogus_uaccess(regs, trapnr, fault_addr)) 180 - return false; 181 125 regs->ip = ex_fixup_addr(fixup); 182 126 return true; 183 127 } ··· 132 188 unsigned long error_code, 133 189 unsigned long fault_addr) 134 190 { 135 - if (bogus_uaccess(regs, trapnr, fault_addr)) 136 - return false; 137 191 /* Special hack for uaccess_err */ 138 192 current->thread.uaccess_err = 1; 139 193 regs->ip = ex_fixup_addr(fixup);
-2
fs/namespace.c
··· 2698 2698 if (!access_ok(from, n)) 2699 2699 return n; 2700 2700 2701 - current->kernel_uaccess_faults_ok++; 2702 2701 while (n) { 2703 2702 if (__get_user(c, f)) { 2704 2703 memset(t, 0, n); ··· 2707 2708 f++; 2708 2709 n--; 2709 2710 } 2710 - current->kernel_uaccess_faults_ok--; 2711 2711 return n; 2712 2712 } 2713 2713
-6
include/linux/sched.h
··· 739 739 unsigned use_memdelay:1; 740 740 #endif 741 741 742 - /* 743 - * May usercopy functions fault on kernel addresses? 744 - * This is not just a single bit because this can potentially nest. 745 - */ 746 - unsigned int kernel_uaccess_faults_ok; 747 - 748 742 unsigned long atomic_flags; /* Flags requiring atomic access. */ 749 743 750 744 struct restart_block restart_block;
-6
mm/maccess.c
··· 30 30 31 31 set_fs(KERNEL_DS); 32 32 pagefault_disable(); 33 - current->kernel_uaccess_faults_ok++; 34 33 ret = __copy_from_user_inatomic(dst, 35 34 (__force const void __user *)src, size); 36 - current->kernel_uaccess_faults_ok--; 37 35 pagefault_enable(); 38 36 set_fs(old_fs); 39 37 ··· 58 60 59 61 set_fs(KERNEL_DS); 60 62 pagefault_disable(); 61 - current->kernel_uaccess_faults_ok++; 62 63 ret = __copy_to_user_inatomic((__force void __user *)dst, src, size); 63 - current->kernel_uaccess_faults_ok--; 64 64 pagefault_enable(); 65 65 set_fs(old_fs); 66 66 ··· 94 98 95 99 set_fs(KERNEL_DS); 96 100 pagefault_disable(); 97 - current->kernel_uaccess_faults_ok++; 98 101 99 102 do { 100 103 ret = __get_user(*dst++, (const char __user __force *)src++); 101 104 } while (dst[-1] && ret == 0 && src - unsafe_addr < count); 102 105 103 - current->kernel_uaccess_faults_ok--; 104 106 dst[-1] = '\0'; 105 107 pagefault_enable(); 106 108 set_fs(old_fs);