Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

security: Add EXEC_RESTRICT_FILE and EXEC_DENY_INTERACTIVE securebits

The new SECBIT_EXEC_RESTRICT_FILE, SECBIT_EXEC_DENY_INTERACTIVE, and
their *_LOCKED counterparts are designed to be set by processes setting
up an execution environment, such as a user session, a container, or a
security sandbox. Unlike other securebits, these ones can be set by
unprivileged processes. Like seccomp filters or Landlock domains, the
securebits are inherited across processes.
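
As a rough illustration of the setup side, a session manager or sandboxer could opt in with prctl(2) before executing its payload. This is a minimal sketch, not code from this commit: the helper names `exec_lockdown_mask()` and `apply_exec_lockdown()` are hypothetical, and the fallback bit values are copied from the securebits.h change in this commit for toolchains whose headers predate it.

```c
#include <assert.h>
#include <linux/securebits.h>
#include <sys/prctl.h>

/* Fallback for older UAPI headers; values are the ones added by this commit. */
#ifndef SECBIT_EXEC_RESTRICT_FILE
#define SECBIT_EXEC_RESTRICT_FILE           (1UL << 8)
#define SECBIT_EXEC_RESTRICT_FILE_LOCKED    (1UL << 9)
#define SECBIT_EXEC_DENY_INTERACTIVE        (1UL << 10)
#define SECBIT_EXEC_DENY_INTERACTIVE_LOCKED (1UL << 11)
#endif

/*
 * Hypothetical helper: the bits a sandboxer would request.
 * A non-zero lock also requests the *_LOCKED counterparts.
 */
unsigned long exec_lockdown_mask(int lock)
{
	unsigned long mask = SECBIT_EXEC_RESTRICT_FILE |
			     SECBIT_EXEC_DENY_INTERACTIVE;

	if (lock)
		mask |= SECBIT_EXEC_RESTRICT_FILE_LOCKED |
			SECBIT_EXEC_DENY_INTERACTIVE_LOCKED;
	return mask;
}

/*
 * Hypothetical helper: OR the requested bits into the current securebits.
 * Returns 0 on success, -1 with errno set on failure (for instance EPERM
 * on a kernel that does not support these bits).
 */
int apply_exec_lockdown(int lock)
{
	unsigned long wanted = exec_lockdown_mask(lock);
	int current = prctl(PR_GET_SECUREBITS);

	if (current < 0)
		return -1;

	/* Only the unprivileged bits and their locks are requested. */
	assert((wanted & ~0xF00UL) == 0);
	return prctl(PR_SET_SECUREBITS, (unsigned long)current | wanted);
}
```

Because the bits are inherited, calling `apply_exec_lockdown(1)` once before fork/exec would be enough for every descendant to see them.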

When SECBIT_EXEC_RESTRICT_FILE is set, programs interpreting code should
control executable resources according to execveat(2) + AT_EXECVE_CHECK
(see previous commit).
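
On the interpreter side, the check itself could look roughly like the sketch below. The helper name `exec_check_fd()` is hypothetical, and the `AT_EXECVE_CHECK` fallback value (0x10000) is an assumption about the flag introduced by the previous commit; prefer the definition from an up-to-date `<linux/fcntl.h>` when it is available. The raw syscall is used so that the sketch does not depend on a libc wrapper.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef AT_EMPTY_PATH
#define AT_EMPTY_PATH 0x1000
#endif
#ifndef AT_EXECVE_CHECK
#define AT_EXECVE_CHECK 0x10000 /* assumed value, see the lead-in */
#endif

/*
 * Hypothetical helper: ask the kernel whether the file behind fd would be
 * allowed to execute, without executing it. Returns 0 if the check passes,
 * -1 with errno set otherwise. Kernels that do not know the flag are
 * expected to reject the call rather than execute anything.
 */
int exec_check_fd(int fd, const char *name)
{
	char *const argv[] = { (char *)name, NULL };
	char *const envp[] = { NULL };

	assert(name != NULL);
	/* Checking a file descriptor (AT_EMPTY_PATH) avoids path-based races. */
	return (int)syscall(SYS_execveat, fd, "", argv, envp,
			    AT_EMPTY_PATH | AT_EXECVE_CHECK);
}
```

An interpreter would open the script, call `exec_check_fd(fd, path)`, and refuse to interpret the file on failure only when `SECBIT_EXEC_RESTRICT_FILE` is set.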

When SECBIT_EXEC_DENY_INTERACTIVE is set, a process should deny
execution of user interactive commands (which excludes executable
regular files).

Being able to configure each of these securebits independently enables
system administrators or owners of container images to gradually validate
the related changes and to identify potential issues (e.g. with interpreter
or audit logs).

It should be noted that unlike other security bits, the
SECBIT_EXEC_RESTRICT_FILE and SECBIT_EXEC_DENY_INTERACTIVE bits are
dedicated to user space willing to restrict itself. Because of that,
they only make sense in the context of a trusted environment (e.g.
sandbox, container, user session, full system) where the process
changing its behavior (according to these bits) and all its parent
processes are trusted. Otherwise, any parent process could just execute
its own malicious code (interpreting a script or not), or even enforce a
seccomp filter to mask these bits.

Such a secure environment can be achieved with appropriate access
control (e.g. mount's noexec option, file access rights, LSM policy) and
an enlightened ld.so that checks that libraries are allowed for execution,
e.g. to protect against illegitimate use of LD_PRELOAD.

Ptrace restrictions according to these securebits would not make sense
because of the processes' trust assumption.

Scripts may need some changes to deal with untrusted data (e.g. stdin,
environment variables), but that is outside the scope of the kernel.

See ChromeOS's documentation about script execution control and the
related threat model:
https://www.chromium.org/chromium-os/developer-library/guides/security/noexec-shell-scripts/

Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Paul Moore <paul@paul-moore.com>
Reviewed-by: Serge Hallyn <serge@hallyn.com>
Reviewed-by: Jeff Xu <jeffxu@chromium.org>
Tested-by: Jeff Xu <jeffxu@chromium.org>
Signed-off-by: Mickaël Salaün <mic@digikod.net>
Link: https://lore.kernel.org/r/20241212174223.389435-3-mic@digikod.net
Signed-off-by: Kees Cook <kees@kernel.org>

Authored by Mickaël Salaün and committed by Kees Cook
a0623b2a a5874fde

+153 -7
+107
Documentation/userspace-api/check_exec.rst
···
 Executability check
 ===================

+The ``AT_EXECVE_CHECK`` :manpage:`execveat(2)` flag, and the
+``SECBIT_EXEC_RESTRICT_FILE`` and ``SECBIT_EXEC_DENY_INTERACTIVE`` securebits
+are intended for script interpreters and dynamic linkers to enforce a
+consistent execution security policy handled by the kernel. See the
+`samples/check-exec/inc.c`_ example.
+
+Whether an interpreter should check these securebits or not depends on the
+security risk of running malicious scripts with respect to the execution
+environment, and whether the kernel can check if a script is trustworthy or
+not. For instance, Python scripts running on a server can use arbitrary
+syscalls and access arbitrary files. Such interpreters should then be
+enlighten to use these securebits and let users define their security policy.
+However, a JavaScript engine running in a web browser should already be
+sandboxed and then should not be able to harm the user's environment.
+
+Script interpreters or dynamic linkers built for tailored execution environments
+(e.g. hardened Linux distributions or hermetic container images) could use
+``AT_EXECVE_CHECK`` without checking the related securebits if backward
+compatibility is handled by something else (e.g. atomic update ensuring that
+all legitimate libraries are allowed to be executed). It is then recommended
+for script interpreters and dynamic linkers to check the securebits at run time
+by default, but also to provide the ability for custom builds to behave like if
+``SECBIT_EXEC_RESTRICT_FILE`` or ``SECBIT_EXEC_DENY_INTERACTIVE`` were always
+set to 1 (i.e. always enforce restrictions).
+
 AT_EXECVE_CHECK
 ===============

···
 To avoid race conditions leading to time-of-check to time-of-use issues,
 ``AT_EXECVE_CHECK`` should be used with ``AT_EMPTY_PATH`` to check against a
 file descriptor instead of a path.
+
+SECBIT_EXEC_RESTRICT_FILE and SECBIT_EXEC_DENY_INTERACTIVE
+==========================================================
+
+When ``SECBIT_EXEC_RESTRICT_FILE`` is set, a process should only interpret or
+execute a file if a call to :manpage:`execveat(2)` with the related file
+descriptor and the ``AT_EXECVE_CHECK`` flag succeed.
+
+This secure bit may be set by user session managers, service managers,
+container runtimes, sandboxer tools... Except for test environments, the
+related ``SECBIT_EXEC_RESTRICT_FILE_LOCKED`` bit should also be set.
+
+Programs should only enforce consistent restrictions according to the
+securebits but without relying on any other user-controlled configuration.
+Indeed, the use case for these securebits is to only trust executable code
+vetted by the system configuration (through the kernel), so we should be
+careful to not let untrusted users control this configuration.
+
+However, script interpreters may still use user configuration such as
+environment variables as long as it is not a way to disable the securebits
+checks. For instance, the ``PATH`` and ``LD_PRELOAD`` variables can be set by
+a script's caller. Changing these variables may lead to unintended code
+executions, but only from vetted executable programs, which is OK. For this to
+make sense, the system should provide a consistent security policy to avoid
+arbitrary code execution e.g., by enforcing a write xor execute policy.
+
+When ``SECBIT_EXEC_DENY_INTERACTIVE`` is set, a process should never interpret
+interactive user commands (e.g. scripts). However, if such commands are passed
+through a file descriptor (e.g. stdin), its content should be interpreted if a
+call to :manpage:`execveat(2)` with the related file descriptor and the
+``AT_EXECVE_CHECK`` flag succeed.
+
+For instance, script interpreters called with a script snippet as argument
+should always deny such execution if ``SECBIT_EXEC_DENY_INTERACTIVE`` is set.
+
+This secure bit may be set by user session managers, service managers,
+container runtimes, sandboxer tools... Except for test environments, the
+related ``SECBIT_EXEC_DENY_INTERACTIVE_LOCKED`` bit should also be set.
+
+Here is the expected behavior for a script interpreter according to combination
+of any exec securebits:
+
+1. ``SECBIT_EXEC_RESTRICT_FILE=0`` and ``SECBIT_EXEC_DENY_INTERACTIVE=0``
+
+   Always interpret scripts, and allow arbitrary user commands (default).
+
+   No threat, everyone and everything is trusted, but we can get ahead of
+   potential issues thanks to the call to :manpage:`execveat(2)` with
+   ``AT_EXECVE_CHECK`` which should always be performed but ignored by the
+   script interpreter. Indeed, this check is still important to enable systems
+   administrators to verify requests (e.g. with audit) and prepare for
+   migration to a secure mode.
+
+2. ``SECBIT_EXEC_RESTRICT_FILE=1`` and ``SECBIT_EXEC_DENY_INTERACTIVE=0``
+
+   Deny script interpretation if they are not executable, but allow
+   arbitrary user commands.
+
+   The threat is (potential) malicious scripts run by trusted (and not fooled)
+   users. That can protect against unintended script executions (e.g. ``sh
+   /tmp/*.sh``). This makes sense for (semi-restricted) user sessions.
+
+3. ``SECBIT_EXEC_RESTRICT_FILE=0`` and ``SECBIT_EXEC_DENY_INTERACTIVE=1``
+
+   Always interpret scripts, but deny arbitrary user commands.
+
+   This use case may be useful for secure services (i.e. without interactive
+   user session) where scripts' integrity is verified (e.g. with IMA/EVM or
+   dm-verity/IPE) but where access rights might not be ready yet. Indeed,
+   arbitrary interactive commands would be much more difficult to check.
+
+4. ``SECBIT_EXEC_RESTRICT_FILE=1`` and ``SECBIT_EXEC_DENY_INTERACTIVE=1``
+
+   Deny script interpretation if they are not executable, and also deny
+   any arbitrary user commands.
+
+   The threat is malicious scripts run by untrusted users (but trusted code).
+   This makes sense for system services that may only execute trusted scripts.
+
+.. Links
+.. _samples/check-exec/inc.c:
+   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/samples/check-exec/inc.c
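
The four combinations documented above can be condensed into a small decision function. This is only a sketch of one reading of the documentation, not the logic of samples/check-exec/inc.c: the `code_source` classification and the `may_interpret()` name are hypothetical, and treating stdin as "allowed only if the check passes" when `SECBIT_EXEC_DENY_INTERACTIVE` is set is an interpretation of the paragraph about commands passed through a file descriptor.

```c
#include <assert.h>
#include <stdbool.h>

#ifndef SECBIT_EXEC_RESTRICT_FILE
#define SECBIT_EXEC_RESTRICT_FILE    (1UL << 8)
#endif
#ifndef SECBIT_EXEC_DENY_INTERACTIVE
#define SECBIT_EXEC_DENY_INTERACTIVE (1UL << 10)
#endif

/* Hypothetical classification of where the code to interpret comes from. */
enum code_source {
	CODE_FILE,    /* script file opened by the interpreter */
	CODE_STDIN,   /* commands read from a file descriptor such as stdin */
	CODE_SNIPPET, /* snippet passed as an argument (e.g. sh -c '...') */
};

/*
 * check_ok is the result of execveat(2) with AT_EXECVE_CHECK on the related
 * file descriptor. The check should always be performed (for auditing), but
 * its result only matters when the relevant securebit is set.
 */
bool may_interpret(unsigned long securebits, enum code_source src,
		   bool check_ok)
{
	bool restrict_file = securebits & SECBIT_EXEC_RESTRICT_FILE;
	bool deny_interactive = securebits & SECBIT_EXEC_DENY_INTERACTIVE;

	switch (src) {
	case CODE_FILE:
		return !restrict_file || check_ok;
	case CODE_STDIN:
		return !deny_interactive || check_ok;
	case CODE_SNIPPET:
		return !deny_interactive;
	}
	assert(!"unknown code source");
	return false;
}
```

With both bits clear the function always returns true, which matches case 1: the check result is recorded but ignored.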
+23 -1
include/uapi/linux/securebits.h
···
 #define SECBIT_NO_CAP_AMBIENT_RAISE_LOCKED \
                        (issecure_mask(SECURE_NO_CAP_AMBIENT_RAISE_LOCKED))

+/* See Documentation/userspace-api/check_exec.rst */
+#define SECURE_EXEC_RESTRICT_FILE               8
+#define SECURE_EXEC_RESTRICT_FILE_LOCKED        9  /* make bit-8 immutable */
+
+#define SECBIT_EXEC_RESTRICT_FILE (issecure_mask(SECURE_EXEC_RESTRICT_FILE))
+#define SECBIT_EXEC_RESTRICT_FILE_LOCKED \
+                       (issecure_mask(SECURE_EXEC_RESTRICT_FILE_LOCKED))
+
+/* See Documentation/userspace-api/check_exec.rst */
+#define SECURE_EXEC_DENY_INTERACTIVE            10
+#define SECURE_EXEC_DENY_INTERACTIVE_LOCKED     11  /* make bit-10 immutable */
+
+#define SECBIT_EXEC_DENY_INTERACTIVE \
+                       (issecure_mask(SECURE_EXEC_DENY_INTERACTIVE))
+#define SECBIT_EXEC_DENY_INTERACTIVE_LOCKED \
+                       (issecure_mask(SECURE_EXEC_DENY_INTERACTIVE_LOCKED))
+
 #define SECURE_ALL_BITS         (issecure_mask(SECURE_NOROOT) | \
                                  issecure_mask(SECURE_NO_SETUID_FIXUP) | \
                                  issecure_mask(SECURE_KEEP_CAPS) | \
-                                 issecure_mask(SECURE_NO_CAP_AMBIENT_RAISE))
+                                 issecure_mask(SECURE_NO_CAP_AMBIENT_RAISE) | \
+                                 issecure_mask(SECURE_EXEC_RESTRICT_FILE) | \
+                                 issecure_mask(SECURE_EXEC_DENY_INTERACTIVE))
 #define SECURE_ALL_LOCKS        (SECURE_ALL_BITS << 1)
+
+#define SECURE_ALL_UNPRIVILEGED (issecure_mask(SECURE_EXEC_RESTRICT_FILE) | \
+                                 issecure_mask(SECURE_EXEC_DENY_INTERACTIVE))

 #endif /* _UAPI_LINUX_SECUREBITS_H */
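
To make the bit layout concrete, the sketch below recomputes the three aggregate masks in user space. The function names are hypothetical. The positions of the four pre-existing bits (0, 2, 4, 6) are not shown in this hunk and are taken from the rest of the header, so treat them as an assumption here. Each lock bit sits one position above the bit it protects, which is why the lock mask is the bit mask shifted left by one.

```c
#include <assert.h>

/* Bit positions; 8 and 10 come from this diff, the others are assumed. */
enum {
	BIT_NOROOT = 0,
	BIT_NO_SETUID_FIXUP = 2,
	BIT_KEEP_CAPS = 4,
	BIT_NO_CAP_AMBIENT_RAISE = 6,
	BIT_EXEC_RESTRICT_FILE = 8,
	BIT_EXEC_DENY_INTERACTIVE = 10,
};

/* Same idea as the header's issecure_mask(), on unsigned long. */
unsigned long secure_mask(unsigned int bit)
{
	assert(bit < 31);
	return 1UL << bit;
}

/* Mirrors SECURE_ALL_BITS after this change. */
unsigned long all_bits(void)
{
	return secure_mask(BIT_NOROOT) |
	       secure_mask(BIT_NO_SETUID_FIXUP) |
	       secure_mask(BIT_KEEP_CAPS) |
	       secure_mask(BIT_NO_CAP_AMBIENT_RAISE) |
	       secure_mask(BIT_EXEC_RESTRICT_FILE) |
	       secure_mask(BIT_EXEC_DENY_INTERACTIVE);
}

/* Mirrors SECURE_ALL_LOCKS: every lock is the bit just above its flag. */
unsigned long all_locks(void)
{
	return all_bits() << 1;
}

/* Mirrors SECURE_ALL_UNPRIVILEGED. */
unsigned long all_unprivileged(void)
{
	return secure_mask(BIT_EXEC_RESTRICT_FILE) |
	       secure_mask(BIT_EXEC_DENY_INTERACTIVE);
}
```

Under these assumptions the masks work out to 0x555 (all bits), 0xAAA (all locks) and 0x500 (unprivileged bits).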
+23 -6
security/commoncap.c
···
                             & (old->securebits ^ arg2))                  /*[1]*/
                    || ((old->securebits & SECURE_ALL_LOCKS & ~arg2))     /*[2]*/
                    || (arg2 & ~(SECURE_ALL_LOCKS | SECURE_ALL_BITS))     /*[3]*/
-                   || (cap_capable(current_cred(),
-                                   current_cred()->user_ns,
-                                   CAP_SETPCAP,
-                                   CAP_OPT_NONE) != 0)                   /*[4]*/
                /*
                 * [1] no changing of bits that are locked
                 * [2] no unlocking of locks
                 * [3] no setting of unsupported bits
-                * [4] doing anything requires privilege (go read about
-                *     the "sendmail capabilities bug")
                 */
                    )
                        /* cannot change a locked bit */
                        return -EPERM;
+
+               /*
+                * Doing anything requires privilege (go read about the
+                * "sendmail capabilities bug"), except for unprivileged bits.
+                * Indeed, the SECURE_ALL_UNPRIVILEGED bits are not
+                * restrictions enforced by the kernel but by user space on
+                * itself.
+                */
+               if (cap_capable(current_cred(), current_cred()->user_ns,
+                               CAP_SETPCAP, CAP_OPT_NONE) != 0) {
+                       const unsigned long unpriv_and_locks =
+                               SECURE_ALL_UNPRIVILEGED |
+                               SECURE_ALL_UNPRIVILEGED << 1;
+                       const unsigned long changed = old->securebits ^ arg2;
+
+                       /* For legacy reason, denies non-change. */
+                       if (!changed)
+                               return -EPERM;
+
+                       /* Denies privileged changes. */
+                       if (changed & ~unpriv_and_locks)
+                               return -EPERM;
+               }

                new = prepare_creds();
                if (!new)
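
The resulting permission logic can be modelled in user space to see which transitions an unprivileged caller may request. This is a sketch, not kernel code: `securebits_change_allowed()` is a hypothetical name, `has_setpcap` stands in for the `cap_capable()` call, the mask constants assume the bit layout from securebits.h, and the first operand of check [1] lies outside the hunk and is reconstructed here as an assumption.

```c
#include <assert.h>
#include <stdbool.h>

#define ALL_BITS   0x555UL          /* assumed SECURE_ALL_BITS value */
#define ALL_LOCKS  (ALL_BITS << 1)  /* SECURE_ALL_LOCKS */
#define ALL_UNPRIV 0x500UL          /* SECURE_ALL_UNPRIVILEGED */

/* Returns true if changing securebits from old to new_bits would be accepted. */
bool securebits_change_allowed(unsigned long old, unsigned long new_bits,
			       bool has_setpcap)
{
	/* The unprivileged bits must be a subset of the supported bits. */
	assert((ALL_UNPRIV & ALL_BITS) == ALL_UNPRIV);

	/* [1] no changing of bits that are locked */
	if (((old & ALL_LOCKS) >> 1) & (old ^ new_bits))
		return false;
	/* [2] no unlocking of locks */
	if (old & ALL_LOCKS & ~new_bits)
		return false;
	/* [3] no setting of unsupported bits */
	if (new_bits & ~(ALL_LOCKS | ALL_BITS))
		return false;

	if (!has_setpcap) {
		const unsigned long unpriv_and_locks =
			ALL_UNPRIV | ALL_UNPRIV << 1;
		const unsigned long changed = old ^ new_bits;

		/* Without privilege, a call that changes nothing is denied. */
		if (!changed)
			return false;
		/* Without privilege, only the exec bits and their locks may change. */
		if (changed & ~unpriv_and_locks)
			return false;
	}
	return true;
}
```

In this model an unprivileged process can set bit 8 (`securebits_change_allowed(0, 0x100, false)` is true) but cannot set `SECURE_NOROOT`, and once bit 9 locks bit 8 nobody can clear it again.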