exec: Add a new AT_EXECVE_CHECK flag to execveat(2)

Add a new AT_EXECVE_CHECK flag to execveat(2) to check if a file would
be allowed for execution. The main use case is for script interpreters
and dynamic linkers to check execution permission according to the
kernel's security policy. Another use case is to add context to access
logs, e.g. which script (instead of interpreter) accessed a file. Like
any executable code, scripts could also use this check [1].

This is different from faccessat(2) + X_OK which only checks a subset of
access rights (i.e. inode permission and mount options for regular
files), but not the full context (e.g. all LSM access checks). The main
use case for access(2) is for SUID processes to (partially) check access
on behalf of their caller. The main use case for execveat(2) +
AT_EXECVE_CHECK is to check if a script execution would be allowed,
according to all the different restrictions in place. Because
AT_EXECVE_CHECK follows exactly the same kernel semantics as a real
execution, user space gets the same error codes.

An interesting point of using execveat(2) instead of openat2(2) is that
it decouples the check from the enforcement. Indeed, the security check
can be logged (e.g. with audit) without blocking an execution
environment not yet ready to enforce a strict security policy.

LSMs can control or log execution requests with
security_bprm_creds_for_exec(). However, to enforce consistent and
complete access control (e.g. on a binary's dependencies), LSMs should
restrict file executability, or measure executed files, with
security_file_open() by checking file->f_flags & __FMODE_EXEC.

Because AT_EXECVE_CHECK is dedicated to user space interpreters, it
doesn't make sense for the kernel to parse the checked files, look for
interpreters known to the kernel (e.g. ELF, shebang), and return ENOEXEC
if the format is unknown. Because of that, security_bprm_check() is
never called when AT_EXECVE_CHECK is used.

It should be noted that script interpreters cannot directly use
execveat(2) (without this new AT_EXECVE_CHECK flag) because this could
lead to unexpected behavior, e.g. `python script.sh` could lead to Bash
being executed to interpret the script. Unlike the kernel, script
interpreters may just interpret the shebang as a simple comment, which
should not change for backward compatibility reasons.

Because script or library files might not currently have the
executable permission set, or because we might want specific users to be
allowed to run arbitrary scripts, the following patch provides a dynamic
configuration mechanism with the SECBIT_EXEC_RESTRICT_FILE and
SECBIT_EXEC_DENY_INTERACTIVE securebits.

This is a redesign of the CLIP OS 4's O_MAYEXEC:
https://github.com/clipos-archive/src_platform_clip-patches/blob/f5cb330d6b684752e403b4e41b39f7004d88e561/1901_open_mayexec.patch
This patch has been used for more than a decade with customized script
interpreters. Some examples can be found here:
https://github.com/clipos-archive/clipos4_portage-overlay/search?q=O_MAYEXEC

Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Acked-by: Paul Moore <paul@paul-moore.com>
Reviewed-by: Serge Hallyn <serge@hallyn.com>
Reviewed-by: Jeff Xu <jeffxu@chromium.org>
Tested-by: Jeff Xu <jeffxu@chromium.org>
Link: https://docs.python.org/3/library/io.html#io.open_code [1]
Signed-off-by: Mickaël Salaün <mic@digikod.net>
Link: https://lore.kernel.org/r/20241212174223.389435-2-mic@digikod.net
Signed-off-by: Kees Cook <kees@kernel.org>

Authored by Mickaël Salaün and committed by Kees Cook 1 year ago (a5874fde fac04efc)

+76 -3, 6 changed files:

Documentation/userspace-api/check_exec.rst
Documentation/userspace-api/index.rst
fs/exec.c
include/linux/binfmts.h
include/uapi/linux/fcntl.h
security/security.c

+37

Documentation/userspace-api/check_exec.rst

···
+.. SPDX-License-Identifier: GPL-2.0
+.. Copyright © 2024 Microsoft Corporation
+
+===================
+Executability check
+===================
+
+AT_EXECVE_CHECK
+===============
+
+Passing the ``AT_EXECVE_CHECK`` flag to :manpage:`execveat(2)` only performs a
+check on a regular file and returns 0 if execution of this file would be
+allowed, ignoring the file format and then the related interpreter dependencies
+(e.g. ELF libraries, script's shebang).
+
+Programs should always perform this check to apply kernel-level checks against
+files that are not directly executed by the kernel but passed to a user space
+interpreter instead.  All files that contain executable code, from the point of
+view of the interpreter, should be checked.  However the result of this check
+should only be enforced according to ``SECBIT_EXEC_RESTRICT_FILE`` or
+``SECBIT_EXEC_DENY_INTERACTIVE.``.
+
+The main purpose of this flag is to improve the security and consistency of an
+execution environment to ensure that direct file execution (e.g.
+``./script.sh``) and indirect file execution (e.g. ``sh script.sh``) lead to
+the same result.  For instance, this can be used to check if a file is
+trustworthy according to the caller's environment.
+
+In a secure environment, libraries and any executable dependencies should also
+be checked.  For instance, dynamic linking should make sure that all libraries
+are allowed for execution to avoid trivial bypass (e.g. using ``LD_PRELOAD``).
+For such secure execution environment to make sense, only trusted code should
+be executable, which also requires integrity guarantees.
+
+To avoid race conditions leading to time-of-check to time-of-use issues,
+``AT_EXECVE_CHECK`` should be used with ``AT_EMPTY_PATH`` to check against a
+file descriptor instead of a path.

Documentation/userspace-api/index.rst

···
    mfd_noexec
    spec_ctrl
    tee
+   check_exec
 
 Devices and I/O
 ===============

+18 -2

fs/exec.c

···
 		.lookup_flags = LOOKUP_FOLLOW,
 	};
 
-	if ((flags & ~(AT_SYMLINK_NOFOLLOW | AT_EMPTY_PATH)) != 0)
+	if ((flags &
+	     ~(AT_SYMLINK_NOFOLLOW | AT_EMPTY_PATH | AT_EXECVE_CHECK)) != 0)
 		return ERR_PTR(-EINVAL);
 	if (flags & AT_SYMLINK_NOFOLLOW)
 		open_exec_flags.lookup_flags &= ~LOOKUP_FOLLOW;
···
 	}
 	bprm->interp = bprm->filename;
 
+	/*
+	 * At this point, security_file_open() has already been called (with
+	 * __FMODE_EXEC) and access control checks for AT_EXECVE_CHECK will
+	 * stop just after the security_bprm_creds_for_exec() call in
+	 * bprm_execve().  Indeed, the kernel should not try to parse the
+	 * content of the file with exec_binprm() nor change the calling
+	 * thread, which means that the following security functions will not
+	 * be called:
+	 * - security_bprm_check()
+	 * - security_bprm_creds_from_file()
+	 * - security_bprm_committing_creds()
+	 * - security_bprm_committed_creds()
+	 */
+	bprm->is_check = !!(flags & AT_EXECVE_CHECK);
+
 	retval = bprm_mm_init(bprm);
 	if (!retval)
 		return bprm;
···
 
 	/* Set the unchanging part of bprm->cred */
 	retval = security_bprm_creds_for_exec(bprm);
-	if (retval)
+	if (retval || bprm->is_check)
 		goto out;
 
 	retval = exec_binprm(bprm);

+6 -1

include/linux/binfmts.h

···
 		 * Set when errors can no longer be returned to the
 		 * original userspace.
 		 */
-		point_of_no_return:1;
+		point_of_no_return:1,
+		/*
+		 * Set by user space to check executability according to the
+		 * caller's environment.
+		 */
+		is_check:1;
 	struct file *executable; /* Executable to pass to the interpreter */
 	struct file *interpreter;
 	struct file *file;

include/uapi/linux/fcntl.h

···
 #define AT_HANDLE_MNT_ID_UNIQUE	0x001	/* Return the u64 unique mount ID. */
 #define AT_HANDLE_CONNECTABLE	0x002	/* Request a connectable file handle */
 
+/* Flags for execveat2(2). */
+#define AT_EXECVE_CHECK		0x10000	/* Only perform a check if execution
+					   would be allowed. */
+
 #endif /* _UAPI_LINUX_FCNTL_H */

+10

security/security.c

···
  * to 1 if AT_SECURE should be set to request libc enable secure mode.  @bprm
  * contains the linux_binprm structure.
  *
+ * If execveat(2) is called with the AT_EXECVE_CHECK flag, bprm->is_check is
+ * set.  The result must be the same as without this flag even if the execution
+ * will never really happen and @bprm will always be dropped.
+ *
+ * This hook must not change current->cred, only @bprm->cred.
+ *
  * Return: Returns 0 if the hook is successful and permission is granted.
  */
 int security_bprm_creds_for_exec(struct linux_binprm *bprm)
···
  *
  * Save open-time permission checking state for later use upon file_permission,
  * and recheck access if anything has changed since inode_permission.
+ *
+ * We can check if a file is opened for execution (e.g. execve(2) call), either
+ * directly or indirectly (e.g. ELF's ld.so) by checking file->f_flags &
+ * __FMODE_EXEC .
  *
  * Return: Returns 0 if permission is granted.
  */
