Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge tag 'vfs-7.0-rc1.namespace' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull vfs mount updates from Christian Brauner:

- statmount: accept fd as a parameter

Extend struct mnt_id_req with a file descriptor field and a new
STATMOUNT_BY_FD flag. When set, statmount() returns mount information
for the mount the fd resides on — including detached mounts
(unmounted via umount2(MNT_DETACH)).

For detached mounts the STATMOUNT_MNT_POINT and STATMOUNT_MNT_NS_ID
mask bits are cleared since neither is meaningful. The capability
check is skipped for STATMOUNT_BY_FD since holding an fd already
implies prior access to the mount and equivalent information is
available through fstatfs() and /proc/pid/mountinfo without
privilege. Includes comprehensive selftests covering both attached
and detached mount cases.

- fs: Remove internal old mount API code (1 patch)

Now that every in-tree filesystem has been converted to the new
mount API, remove all the legacy shim code in fs_context.c that
handled unconverted filesystems. This deletes ~280 lines including
legacy_init_fs_context(), the legacy_fs_context struct, and
associated wrappers. The mount(2) syscall path for userspace remains
untouched. Documentation references to the legacy callbacks are
cleaned up.

- mount: add OPEN_TREE_NAMESPACE to open_tree()

Container runtimes currently use CLONE_NEWNS to copy the caller's
entire mount namespace — only to then pivot_root() and recursively
unmount everything they just copied. With large mount tables and
thousands of parallel container launches this creates significant
contention on the namespace semaphore.

OPEN_TREE_NAMESPACE copies only the specified mount tree (like
OPEN_TREE_CLONE) but returns a mount namespace fd instead of a
detached mount fd. The new namespace contains the copied tree mounted
on top of a clone of the real rootfs.

This functions as a combined unshare(CLONE_NEWNS) + pivot_root() in a
single syscall. Works with user namespaces: an unshare(CLONE_NEWUSER)
followed by OPEN_TREE_NAMESPACE creates a mount namespace owned by
the new user namespace. Mount namespace file mounts are excluded from
the copy to prevent cycles. Includes ~1000 lines of selftests.

* tag 'vfs-7.0-rc1.namespace' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
selftests/open_tree: add OPEN_TREE_NAMESPACE tests
mount: add OPEN_TREE_NAMESPACE
fs: Remove internal old mount API code
selftests: statmount: tests for STATMOUNT_BY_FD
statmount: accept fd as a parameter
statmount: permission check should return EPERM

+1669 -365
-8
Documentation/filesystems/locking.rst
···
 int (*freeze_fs) (struct super_block *);
 int (*unfreeze_fs) (struct super_block *);
 int (*statfs) (struct dentry *, struct kstatfs *);
-int (*remount_fs) (struct super_block *, int *, char *);
 void (*umount_begin) (struct super_block *);
 int (*show_options)(struct seq_file *, struct dentry *);
 ssize_t (*quota_read)(struct super_block *, int, char *, size_t, loff_t);
···
 freeze_fs: write
 unfreeze_fs: write
 statfs: maybe(read) (see below)
-remount_fs: write
 umount_begin: no
 show_options: no (namespace_sem)
 quota_read: no (see below)
···

 prototypes::

-struct dentry *(*mount) (struct file_system_type *, int,
-const char *, void *);
 void (*kill_sb) (struct super_block *);

 locking rules:
···
 ======= =========
 ops may block
 ======= =========
-mount yes
 kill_sb yes
 ======= =========
-
-->mount() returns ERR_PTR or the root dentry; its superblock should be locked
-on return.

 ->kill_sb() takes a write-locked superblock, does all shutdown work on it,
 unlocks and drops the reference.
-2
Documentation/filesystems/mount_api.rst
···
 On success it should return 0. In the case of an error, it should return
 a negative error code.

-.. Note:: reconfigure is intended as a replacement for remount_fs.
-

 Filesystem context Security
 ===========================
+2 -5
Documentation/filesystems/porting.rst
···

 **mandatory**

-->get_sb() is gone. Switch to use of ->mount(). Typically it's just
-a matter of switching from calling ``get_sb_``... to ``mount_``... and changing
-the function type. If you were doing it manually, just switch from setting
-->mnt_root to some pointer to returning that pointer. On errors return
-ERR_PTR(...).
+->get_sb() and ->mount() are gone. Switch to using the new mount API. See
+Documentation/filesystems/mount_api.rst for more details.

 ---

+3 -55
Documentation/filesystems/vfs.rst
··· 94 94 95 95 The passed struct file_system_type describes your filesystem. When a 96 96 request is made to mount a filesystem onto a directory in your 97 - namespace, the VFS will call the appropriate mount() method for the 98 - specific filesystem. New vfsmount referring to the tree returned by 99 - ->mount() will be attached to the mountpoint, so that when pathname 100 - resolution reaches the mountpoint it will jump into the root of that 101 - vfsmount. 97 + namespace, the VFS will call the appropriate get_tree() method for the 98 + specific filesystem. See Documentation/filesystems/mount_api.rst 99 + for more details. 102 100 103 101 You can see all filesystems that are registered to the kernel in the 104 102 file /proc/filesystems. ··· 115 117 int fs_flags; 116 118 int (*init_fs_context)(struct fs_context *); 117 119 const struct fs_parameter_spec *parameters; 118 - struct dentry *(*mount) (struct file_system_type *, int, 119 - const char *, void *); 120 120 void (*kill_sb) (struct super_block *); 121 121 struct module *owner; 122 122 struct file_system_type * next; ··· 147 151 'struct fs_parameter_spec'. 148 152 More info in Documentation/filesystems/mount_api.rst. 149 153 150 - ``mount`` 151 - the method to call when a new instance of this filesystem should 152 - be mounted 153 - 154 154 ``kill_sb`` 155 155 the method to call when an instance of this filesystem should be 156 156 shut down ··· 164 172 165 173 s_lock_key, s_umount_key, s_vfs_rename_key, s_writers_key, 166 174 i_lock_key, i_mutex_key, invalidate_lock_key, i_mutex_dir_key: lockdep-specific 167 - 168 - The mount() method has the following arguments: 169 - 170 - ``struct file_system_type *fs_type`` 171 - describes the filesystem, partly initialized by the specific 172 - filesystem code 173 - 174 - ``int flags`` 175 - mount flags 176 - 177 - ``const char *dev_name`` 178 - the device name we are mounting. 
179 - 180 - ``void *data`` 181 - arbitrary mount options, usually comes as an ASCII string (see 182 - "Mount Options" section) 183 - 184 - The mount() method must return the root dentry of the tree requested by 185 - caller. An active reference to its superblock must be grabbed and the 186 - superblock must be locked. On failure it should return ERR_PTR(error). 187 - 188 - The arguments match those of mount(2) and their interpretation depends 189 - on filesystem type. E.g. for block filesystems, dev_name is interpreted 190 - as block device name, that device is opened and if it contains a 191 - suitable filesystem image the method creates and initializes struct 192 - super_block accordingly, returning its root dentry to caller. 193 - 194 - ->mount() may choose to return a subtree of existing filesystem - it 195 - doesn't have to create a new one. The main result from the caller's 196 - point of view is a reference to dentry at the root of (sub)tree to be 197 - attached; creation of new superblock is a common side effect. 198 - 199 - The most interesting member of the superblock structure that the mount() 200 - method fills in is the "s_op" field. This is a pointer to a "struct 201 - super_operations" which describes the next level of the filesystem 202 - implementation. 203 - 204 - For more information on mounting (and the new mount API), see 205 - Documentation/filesystems/mount_api.rst. 206 175 207 176 The Superblock Object 208 177 ===================== ··· 197 244 enum freeze_wholder who); 198 245 int (*unfreeze_fs) (struct super_block *); 199 246 int (*statfs) (struct dentry *, struct kstatfs *); 200 - int (*remount_fs) (struct super_block *, int *, char *); 201 247 void (*umount_begin) (struct super_block *); 202 248 203 249 int (*show_options)(struct seq_file *, struct dentry *); ··· 302 350 303 351 ``statfs`` 304 352 called when the VFS needs to get filesystem statistics. 305 - 306 - ``remount_fs`` 307 - called when the filesystem is remounted. 
This is called with 308 - the kernel lock held 309 353 310 354 ``umount_begin`` 311 355 called when the VFS is unmounting a filesystem.
+3 -205
fs/fs_context.c
··· 24 24 #include "mount.h" 25 25 #include "internal.h" 26 26 27 - enum legacy_fs_param { 28 - LEGACY_FS_UNSET_PARAMS, 29 - LEGACY_FS_MONOLITHIC_PARAMS, 30 - LEGACY_FS_INDIVIDUAL_PARAMS, 31 - }; 32 - 33 - struct legacy_fs_context { 34 - char *legacy_data; /* Data page for legacy filesystems */ 35 - size_t data_size; 36 - enum legacy_fs_param param_type; 37 - }; 38 - 39 - static int legacy_init_fs_context(struct fs_context *fc); 40 - 41 27 static const struct constant_table common_set_sb_flag[] = { 42 28 { "dirsync", SB_DIRSYNC }, 43 29 { "lazytime", SB_LAZYTIME }, ··· 261 275 unsigned int sb_flags_mask, 262 276 enum fs_context_purpose purpose) 263 277 { 264 - int (*init_fs_context)(struct fs_context *); 265 278 struct fs_context *fc; 266 279 int ret = -ENOMEM; 267 280 ··· 292 307 break; 293 308 } 294 309 295 - /* TODO: Make all filesystems support this unconditionally */ 296 - init_fs_context = fc->fs_type->init_fs_context; 297 - if (!init_fs_context) 298 - init_fs_context = legacy_init_fs_context; 299 - 300 - ret = init_fs_context(fc); 310 + ret = fc->fs_type->init_fs_context(fc); 301 311 if (ret < 0) 302 312 goto err_fc; 303 313 fc->need_free = true; ··· 355 375 fc->root = NULL; 356 376 deactivate_locked_super(sb); 357 377 } 358 - 359 - static void legacy_fs_context_free(struct fs_context *fc); 360 378 361 379 /** 362 380 * vfs_dup_fs_context - Duplicate a filesystem context. ··· 509 531 } 510 532 EXPORT_SYMBOL(put_fs_context); 511 533 512 - /* 513 - * Free the config for a filesystem that doesn't support fs_context. 514 - */ 515 - static void legacy_fs_context_free(struct fs_context *fc) 516 - { 517 - struct legacy_fs_context *ctx = fc->fs_private; 518 - 519 - if (ctx) { 520 - if (ctx->param_type == LEGACY_FS_INDIVIDUAL_PARAMS) 521 - kfree(ctx->legacy_data); 522 - kfree(ctx); 523 - } 524 - } 525 - 526 - /* 527 - * Duplicate a legacy config. 
528 - */ 529 - static int legacy_fs_context_dup(struct fs_context *fc, struct fs_context *src_fc) 530 - { 531 - struct legacy_fs_context *ctx; 532 - struct legacy_fs_context *src_ctx = src_fc->fs_private; 533 - 534 - ctx = kmemdup(src_ctx, sizeof(*src_ctx), GFP_KERNEL); 535 - if (!ctx) 536 - return -ENOMEM; 537 - 538 - if (ctx->param_type == LEGACY_FS_INDIVIDUAL_PARAMS) { 539 - ctx->legacy_data = kmemdup(src_ctx->legacy_data, 540 - src_ctx->data_size, GFP_KERNEL); 541 - if (!ctx->legacy_data) { 542 - kfree(ctx); 543 - return -ENOMEM; 544 - } 545 - } 546 - 547 - fc->fs_private = ctx; 548 - return 0; 549 - } 550 - 551 - /* 552 - * Add a parameter to a legacy config. We build up a comma-separated list of 553 - * options. 554 - */ 555 - static int legacy_parse_param(struct fs_context *fc, struct fs_parameter *param) 556 - { 557 - struct legacy_fs_context *ctx = fc->fs_private; 558 - unsigned int size = ctx->data_size; 559 - size_t len = 0; 560 - int ret; 561 - 562 - ret = vfs_parse_fs_param_source(fc, param); 563 - if (ret != -ENOPARAM) 564 - return ret; 565 - 566 - if (ctx->param_type == LEGACY_FS_MONOLITHIC_PARAMS) 567 - return invalf(fc, "VFS: Legacy: Can't mix monolithic and individual options"); 568 - 569 - switch (param->type) { 570 - case fs_value_is_string: 571 - len = 1 + param->size; 572 - fallthrough; 573 - case fs_value_is_flag: 574 - len += strlen(param->key); 575 - break; 576 - default: 577 - return invalf(fc, "VFS: Legacy: Parameter type for '%s' not supported", 578 - param->key); 579 - } 580 - 581 - if (size + len + 2 > PAGE_SIZE) 582 - return invalf(fc, "VFS: Legacy: Cumulative options too large"); 583 - if (strchr(param->key, ',') || 584 - (param->type == fs_value_is_string && 585 - memchr(param->string, ',', param->size))) 586 - return invalf(fc, "VFS: Legacy: Option '%s' contained comma", 587 - param->key); 588 - if (!ctx->legacy_data) { 589 - ctx->legacy_data = kmalloc(PAGE_SIZE, GFP_KERNEL); 590 - if (!ctx->legacy_data) 591 - return -ENOMEM; 592 - 
} 593 - 594 - if (size) 595 - ctx->legacy_data[size++] = ','; 596 - len = strlen(param->key); 597 - memcpy(ctx->legacy_data + size, param->key, len); 598 - size += len; 599 - if (param->type == fs_value_is_string) { 600 - ctx->legacy_data[size++] = '='; 601 - memcpy(ctx->legacy_data + size, param->string, param->size); 602 - size += param->size; 603 - } 604 - ctx->legacy_data[size] = '\0'; 605 - ctx->data_size = size; 606 - ctx->param_type = LEGACY_FS_INDIVIDUAL_PARAMS; 607 - return 0; 608 - } 609 - 610 - /* 611 - * Add monolithic mount data. 612 - */ 613 - static int legacy_parse_monolithic(struct fs_context *fc, void *data) 614 - { 615 - struct legacy_fs_context *ctx = fc->fs_private; 616 - 617 - if (ctx->param_type != LEGACY_FS_UNSET_PARAMS) { 618 - pr_warn("VFS: Can't mix monolithic and individual options\n"); 619 - return -EINVAL; 620 - } 621 - 622 - ctx->legacy_data = data; 623 - ctx->param_type = LEGACY_FS_MONOLITHIC_PARAMS; 624 - if (!ctx->legacy_data) 625 - return 0; 626 - 627 - if (fc->fs_type->fs_flags & FS_BINARY_MOUNTDATA) 628 - return 0; 629 - return security_sb_eat_lsm_opts(ctx->legacy_data, &fc->security); 630 - } 631 - 632 - /* 633 - * Get a mountable root with the legacy mount command. 634 - */ 635 - static int legacy_get_tree(struct fs_context *fc) 636 - { 637 - struct legacy_fs_context *ctx = fc->fs_private; 638 - struct super_block *sb; 639 - struct dentry *root; 640 - 641 - root = fc->fs_type->mount(fc->fs_type, fc->sb_flags, 642 - fc->source, ctx->legacy_data); 643 - if (IS_ERR(root)) 644 - return PTR_ERR(root); 645 - 646 - sb = root->d_sb; 647 - BUG_ON(!sb); 648 - 649 - fc->root = root; 650 - return 0; 651 - } 652 - 653 - /* 654 - * Handle remount. 
655 - */ 656 - static int legacy_reconfigure(struct fs_context *fc) 657 - { 658 - struct legacy_fs_context *ctx = fc->fs_private; 659 - struct super_block *sb = fc->root->d_sb; 660 - 661 - if (!sb->s_op->remount_fs) 662 - return 0; 663 - 664 - return sb->s_op->remount_fs(sb, &fc->sb_flags, 665 - ctx ? ctx->legacy_data : NULL); 666 - } 667 - 668 - const struct fs_context_operations legacy_fs_context_ops = { 669 - .free = legacy_fs_context_free, 670 - .dup = legacy_fs_context_dup, 671 - .parse_param = legacy_parse_param, 672 - .parse_monolithic = legacy_parse_monolithic, 673 - .get_tree = legacy_get_tree, 674 - .reconfigure = legacy_reconfigure, 675 - }; 676 - 677 - /* 678 - * Initialise a legacy context for a filesystem that doesn't support 679 - * fs_context. 680 - */ 681 - static int legacy_init_fs_context(struct fs_context *fc) 682 - { 683 - fc->fs_private = kzalloc(sizeof(struct legacy_fs_context), GFP_KERNEL_ACCOUNT); 684 - if (!fc->fs_private) 685 - return -ENOMEM; 686 - fc->ops = &legacy_fs_context_ops; 687 - return 0; 688 - } 689 - 690 534 int parse_monolithic_mount_data(struct fs_context *fc, void *data) 691 535 { 692 536 int (*monolithic_mount_data)(struct fs_context *, void *); ··· 557 757 if (fc->phase != FS_CONTEXT_AWAITING_RECONF) 558 758 return 0; 559 759 560 - if (fc->fs_type->init_fs_context) 561 - error = fc->fs_type->init_fs_context(fc); 562 - else 563 - error = legacy_init_fs_context(fc); 760 + error = fc->fs_type->init_fs_context(fc); 761 + 564 762 if (unlikely(error)) { 565 763 fc->phase = FS_CONTEXT_FAILED; 566 764 return error;
-10
fs/fsopen.c
···
 		return -EINVAL;

 	fc = fd_file(f)->private_data;
-	if (fc->ops == &legacy_fs_context_ops) {
-		switch (cmd) {
-		case FSCONFIG_SET_BINARY:
-		case FSCONFIG_SET_PATH:
-		case FSCONFIG_SET_PATH_EMPTY:
-		case FSCONFIG_SET_FD:
-		case FSCONFIG_CMD_CREATE_EXCL:
-			return -EOPNOTSUPP;
-		}
-	}

 	if (_key) {
 		param.key = strndup_user(_key, 256);
+1 -1
fs/internal.h
···
 /*
  * fs_context.c
  */
-extern const struct fs_context_operations legacy_fs_context_ops;
 extern int parse_monolithic_mount_data(struct fs_context *, void *);
 extern void vfs_clean_context(struct fs_context *fc);
 extern int finish_clean_context(struct fs_context *fc);
···
  */
 extern const struct dentry_operations ns_dentry_operations;
 int open_namespace(struct ns_common *ns);
+struct file *open_namespace_file(struct ns_common *ns);

 /*
  * fs/stat.c:
+214 -51
fs/namespace.c
··· 2796 2796 __unlock_mount(m); 2797 2797 } 2798 2798 2799 + static void lock_mount_exact(const struct path *path, 2800 + struct pinned_mountpoint *mp); 2801 + 2799 2802 #define LOCK_MOUNT_MAYBE_BENEATH(mp, path, beneath) \ 2800 2803 struct pinned_mountpoint mp __cleanup(unlock_mount) = {}; \ 2801 2804 do_lock_mount((path), &mp, (beneath)) ··· 2949 2946 return check_anonymous_mnt(mnt); 2950 2947 } 2951 2948 2952 - 2953 - static struct mount *__do_loopback(const struct path *old_path, int recurse) 2949 + static struct mount *__do_loopback(const struct path *old_path, 2950 + unsigned int flags, unsigned int copy_flags) 2954 2951 { 2955 2952 struct mount *old = real_mount(old_path->mnt); 2953 + bool recurse = flags & AT_RECURSIVE; 2956 2954 2957 2955 if (IS_MNT_UNBINDABLE(old)) 2958 2956 return ERR_PTR(-EINVAL); ··· 2964 2960 if (!recurse && __has_locked_children(old, old_path->dentry)) 2965 2961 return ERR_PTR(-EINVAL); 2966 2962 2963 + /* 2964 + * When creating a new mount namespace we don't want to copy over 2965 + * mounts of mount namespaces to avoid the risk of cycles and also to 2966 + * minimize the default complex interdependencies between mount 2967 + * namespaces. 2968 + * 2969 + * We could ofc just check whether all mount namespace files aren't 2970 + * creating cycles but really let's keep this simple. 2971 + */ 2972 + if (!(flags & OPEN_TREE_NAMESPACE)) 2973 + copy_flags |= CL_COPY_MNT_NS_FILE; 2974 + 2967 2975 if (recurse) 2968 - return copy_tree(old, old_path->dentry, CL_COPY_MNT_NS_FILE); 2969 - else 2970 - return clone_mnt(old, old_path->dentry, 0); 2976 + return copy_tree(old, old_path->dentry, copy_flags); 2977 + 2978 + return clone_mnt(old, old_path->dentry, copy_flags); 2971 2979 } 2972 2980 2973 2981 /* ··· 2990 2974 { 2991 2975 struct path old_path __free(path_put) = {}; 2992 2976 struct mount *mnt = NULL; 2977 + unsigned int flags = recurse ? 
AT_RECURSIVE : 0; 2993 2978 int err; 2979 + 2994 2980 if (!old_name || !*old_name) 2995 2981 return -EINVAL; 2996 2982 err = kern_path(old_name, LOOKUP_FOLLOW|LOOKUP_AUTOMOUNT, &old_path); ··· 3009 2991 if (!check_mnt(mp.parent)) 3010 2992 return -EINVAL; 3011 2993 3012 - mnt = __do_loopback(&old_path, recurse); 2994 + mnt = __do_loopback(&old_path, flags, 0); 3013 2995 if (IS_ERR(mnt)) 3014 2996 return PTR_ERR(mnt); 3015 2997 ··· 3022 3004 return err; 3023 3005 } 3024 3006 3025 - static struct mnt_namespace *get_detached_copy(const struct path *path, bool recursive) 3007 + static struct mnt_namespace *get_detached_copy(const struct path *path, unsigned int flags) 3026 3008 { 3027 3009 struct mnt_namespace *ns, *mnt_ns = current->nsproxy->mnt_ns, *src_mnt_ns; 3028 3010 struct user_namespace *user_ns = mnt_ns->user_ns; ··· 3047 3029 ns->seq_origin = src_mnt_ns->ns.ns_id; 3048 3030 } 3049 3031 3050 - mnt = __do_loopback(path, recursive); 3032 + mnt = __do_loopback(path, flags, 0); 3051 3033 if (IS_ERR(mnt)) { 3052 3034 emptied_ns = ns; 3053 3035 return ERR_CAST(mnt); ··· 3061 3043 return ns; 3062 3044 } 3063 3045 3064 - static struct file *open_detached_copy(struct path *path, bool recursive) 3046 + static struct file *open_detached_copy(struct path *path, unsigned int flags) 3065 3047 { 3066 - struct mnt_namespace *ns = get_detached_copy(path, recursive); 3048 + struct mnt_namespace *ns = get_detached_copy(path, flags); 3067 3049 struct file *file; 3068 3050 3069 3051 if (IS_ERR(ns)) ··· 3079 3061 return file; 3080 3062 } 3081 3063 3064 + DEFINE_FREE(put_empty_mnt_ns, struct mnt_namespace *, 3065 + if (!IS_ERR_OR_NULL(_T)) free_mnt_ns(_T)) 3066 + 3067 + static struct mnt_namespace *create_new_namespace(struct path *path, unsigned int flags) 3068 + { 3069 + struct mnt_namespace *new_ns __free(put_empty_mnt_ns) = NULL; 3070 + struct path to_path __free(path_put) = {}; 3071 + struct mnt_namespace *ns = current->nsproxy->mnt_ns; 3072 + struct user_namespace *user_ns = 
current_user_ns(); 3073 + struct mount *new_ns_root; 3074 + struct mount *mnt; 3075 + unsigned int copy_flags = 0; 3076 + bool locked = false; 3077 + 3078 + if (user_ns != ns->user_ns) 3079 + copy_flags |= CL_SLAVE; 3080 + 3081 + new_ns = alloc_mnt_ns(user_ns, false); 3082 + if (IS_ERR(new_ns)) 3083 + return ERR_CAST(new_ns); 3084 + 3085 + scoped_guard(namespace_excl) { 3086 + new_ns_root = clone_mnt(ns->root, ns->root->mnt.mnt_root, copy_flags); 3087 + if (IS_ERR(new_ns_root)) 3088 + return ERR_CAST(new_ns_root); 3089 + 3090 + /* 3091 + * If the real rootfs had a locked mount on top of it somewhere 3092 + * in the stack, lock the new mount tree as well so it can't be 3093 + * exposed. 3094 + */ 3095 + mnt = ns->root; 3096 + while (mnt->overmount) { 3097 + mnt = mnt->overmount; 3098 + if (mnt->mnt.mnt_flags & MNT_LOCKED) 3099 + locked = true; 3100 + } 3101 + } 3102 + 3103 + /* 3104 + * We dropped the namespace semaphore so we can actually lock 3105 + * the copy for mounting. The copied mount isn't attached to any 3106 + * mount namespace and it is thus excluded from any propagation. 3107 + * So realistically we're isolated and the mount can't be 3108 + * overmounted. 3109 + */ 3110 + 3111 + /* Borrow the reference from clone_mnt(). */ 3112 + to_path.mnt = &new_ns_root->mnt; 3113 + to_path.dentry = dget(new_ns_root->mnt.mnt_root); 3114 + 3115 + /* Now lock for actual mounting. */ 3116 + LOCK_MOUNT_EXACT(mp, &to_path); 3117 + if (unlikely(IS_ERR(mp.parent))) 3118 + return ERR_CAST(mp.parent); 3119 + 3120 + /* 3121 + * We don't emulate unshare()ing a mount namespace. We stick to the 3122 + * restrictions of creating detached bind-mounts. It has a lot 3123 + * saner and simpler semantics. 
3124 + */ 3125 + mnt = __do_loopback(path, flags, copy_flags); 3126 + if (IS_ERR(mnt)) 3127 + return ERR_CAST(mnt); 3128 + 3129 + scoped_guard(mount_writer) { 3130 + if (locked) 3131 + mnt->mnt.mnt_flags |= MNT_LOCKED; 3132 + /* 3133 + * Now mount the detached tree on top of the copy of the 3134 + * real rootfs we created. 3135 + */ 3136 + attach_mnt(mnt, new_ns_root, mp.mp); 3137 + if (user_ns != ns->user_ns) 3138 + lock_mnt_tree(new_ns_root); 3139 + } 3140 + 3141 + /* Add all mounts to the new namespace. */ 3142 + for (struct mount *p = new_ns_root; p; p = next_mnt(p, new_ns_root)) { 3143 + mnt_add_to_ns(new_ns, p); 3144 + new_ns->nr_mounts++; 3145 + } 3146 + 3147 + new_ns->root = real_mount(no_free_ptr(to_path.mnt)); 3148 + ns_tree_add_raw(new_ns); 3149 + return no_free_ptr(new_ns); 3150 + } 3151 + 3152 + static struct file *open_new_namespace(struct path *path, unsigned int flags) 3153 + { 3154 + struct mnt_namespace *new_ns; 3155 + 3156 + new_ns = create_new_namespace(path, flags); 3157 + if (IS_ERR(new_ns)) 3158 + return ERR_CAST(new_ns); 3159 + return open_namespace_file(to_ns_common(new_ns)); 3160 + } 3161 + 3082 3162 static struct file *vfs_open_tree(int dfd, const char __user *filename, unsigned int flags) 3083 3163 { 3084 3164 int ret; 3085 3165 struct path path __free(path_put) = {}; 3086 3166 int lookup_flags = LOOKUP_AUTOMOUNT | LOOKUP_FOLLOW; 3087 - bool detached = flags & OPEN_TREE_CLONE; 3088 3167 3089 3168 BUILD_BUG_ON(OPEN_TREE_CLOEXEC != O_CLOEXEC); 3090 3169 3091 3170 if (flags & ~(AT_EMPTY_PATH | AT_NO_AUTOMOUNT | AT_RECURSIVE | 3092 3171 AT_SYMLINK_NOFOLLOW | OPEN_TREE_CLONE | 3093 - OPEN_TREE_CLOEXEC)) 3172 + OPEN_TREE_CLOEXEC | OPEN_TREE_NAMESPACE)) 3094 3173 return ERR_PTR(-EINVAL); 3095 3174 3096 - if ((flags & (AT_RECURSIVE | OPEN_TREE_CLONE)) == AT_RECURSIVE) 3175 + if ((flags & (AT_RECURSIVE | OPEN_TREE_CLONE | OPEN_TREE_NAMESPACE)) == 3176 + AT_RECURSIVE) 3177 + return ERR_PTR(-EINVAL); 3178 + 3179 + if (hweight32(flags & 
(OPEN_TREE_CLONE | OPEN_TREE_NAMESPACE)) > 1) 3097 3180 return ERR_PTR(-EINVAL); 3098 3181 3099 3182 if (flags & AT_NO_AUTOMOUNT) ··· 3204 3085 if (flags & AT_EMPTY_PATH) 3205 3086 lookup_flags |= LOOKUP_EMPTY; 3206 3087 3207 - if (detached && !may_mount()) 3088 + /* 3089 + * If we create a new mount namespace with the cloned mount tree we 3090 + * just care about being privileged over our current user namespace. 3091 + * The new mount namespace will be owned by it. 3092 + */ 3093 + if ((flags & OPEN_TREE_NAMESPACE) && 3094 + !ns_capable(current_user_ns(), CAP_SYS_ADMIN)) 3095 + return ERR_PTR(-EPERM); 3096 + 3097 + if ((flags & OPEN_TREE_CLONE) && !may_mount()) 3208 3098 return ERR_PTR(-EPERM); 3209 3099 3210 3100 ret = user_path_at(dfd, filename, lookup_flags, &path); 3211 3101 if (unlikely(ret)) 3212 3102 return ERR_PTR(ret); 3213 3103 3214 - if (detached) 3215 - return open_detached_copy(&path, flags & AT_RECURSIVE); 3104 + if (flags & OPEN_TREE_NAMESPACE) 3105 + return open_new_namespace(&path, flags); 3106 + 3107 + if (flags & OPEN_TREE_CLONE) 3108 + return open_detached_copy(&path, flags); 3216 3109 3217 3110 return dentry_open(&path, O_PATH, current_cred()); 3218 3111 } ··· 5685 5554 5686 5555 /* locks: namespace_shared */ 5687 5556 static int do_statmount(struct kstatmount *s, u64 mnt_id, u64 mnt_ns_id, 5688 - struct mnt_namespace *ns) 5557 + struct file *mnt_file, struct mnt_namespace *ns) 5689 5558 { 5690 - struct mount *m; 5691 5559 int err; 5692 5560 5693 - /* Has the namespace already been emptied? */ 5694 - if (mnt_ns_id && mnt_ns_empty(ns)) 5695 - return -ENOENT; 5561 + if (mnt_file) { 5562 + WARN_ON_ONCE(ns != NULL); 5696 5563 5697 - s->mnt = lookup_mnt_in_ns(mnt_id, ns); 5698 - if (!s->mnt) 5699 - return -ENOENT; 5564 + s->mnt = mnt_file->f_path.mnt; 5565 + ns = real_mount(s->mnt)->mnt_ns; 5566 + if (!ns) 5567 + /* 5568 + * We can't set mount point and mnt_ns_id since we don't have a 5569 + * ns for the mount. 
This can happen if the mount is unmounted 5570 + * with MNT_DETACH. 5571 + */ 5572 + s->mask &= ~(STATMOUNT_MNT_POINT | STATMOUNT_MNT_NS_ID); 5573 + } else { 5574 + /* Has the namespace already been emptied? */ 5575 + if (mnt_ns_id && mnt_ns_empty(ns)) 5576 + return -ENOENT; 5700 5577 5701 - err = grab_requested_root(ns, &s->root); 5702 - if (err) 5703 - return err; 5578 + s->mnt = lookup_mnt_in_ns(mnt_id, ns); 5579 + if (!s->mnt) 5580 + return -ENOENT; 5581 + } 5704 5582 5705 - /* 5706 - * Don't trigger audit denials. We just want to determine what 5707 - * mounts to show users. 5708 - */ 5709 - m = real_mount(s->mnt); 5710 - if (!is_path_reachable(m, m->mnt.mnt_root, &s->root) && 5711 - !ns_capable_noaudit(ns->user_ns, CAP_SYS_ADMIN)) 5712 - return -EPERM; 5583 + if (ns) { 5584 + err = grab_requested_root(ns, &s->root); 5585 + if (err) 5586 + return err; 5587 + 5588 + if (!mnt_file) { 5589 + struct mount *m; 5590 + /* 5591 + * Don't trigger audit denials. We just want to determine what 5592 + * mounts to show users. 5593 + */ 5594 + m = real_mount(s->mnt); 5595 + if (!is_path_reachable(m, m->mnt.mnt_root, &s->root) && 5596 + !ns_capable_noaudit(ns->user_ns, CAP_SYS_ADMIN)) 5597 + return -EPERM; 5598 + } 5599 + } 5713 5600 5714 5601 err = security_sb_statfs(s->mnt->mnt_root); 5715 5602 if (err) ··· 5849 5700 } 5850 5701 5851 5702 static int copy_mnt_id_req(const struct mnt_id_req __user *req, 5852 - struct mnt_id_req *kreq) 5703 + struct mnt_id_req *kreq, unsigned int flags) 5853 5704 { 5854 5705 int ret; 5855 5706 size_t usize; ··· 5867 5718 ret = copy_struct_from_user(kreq, sizeof(*kreq), req, usize); 5868 5719 if (ret) 5869 5720 return ret; 5870 - if (kreq->mnt_ns_fd != 0 && kreq->mnt_ns_id) 5871 - return -EINVAL; 5872 - /* The first valid unique mount id is MNT_UNIQUE_ID_OFFSET + 1. 
*/ 5873 - if (kreq->mnt_id <= MNT_UNIQUE_ID_OFFSET) 5874 - return -EINVAL; 5721 + 5722 + if (flags & STATMOUNT_BY_FD) { 5723 + if (kreq->mnt_id || kreq->mnt_ns_id) 5724 + return -EINVAL; 5725 + } else { 5726 + if (kreq->mnt_ns_fd != 0 && kreq->mnt_ns_id) 5727 + return -EINVAL; 5728 + /* The first valid unique mount id is MNT_UNIQUE_ID_OFFSET + 1. */ 5729 + if (kreq->mnt_id <= MNT_UNIQUE_ID_OFFSET) 5730 + return -EINVAL; 5731 + } 5875 5732 return 0; 5876 5733 } 5877 5734 ··· 5924 5769 { 5925 5770 struct mnt_namespace *ns __free(mnt_ns_release) = NULL; 5926 5771 struct kstatmount *ks __free(kfree) = NULL; 5772 + struct file *mnt_file __free(fput) = NULL; 5927 5773 struct mnt_id_req kreq; 5928 5774 /* We currently support retrieval of 3 strings. */ 5929 5775 size_t seq_size = 3 * PATH_MAX; 5930 5776 int ret; 5931 5777 5932 - if (flags) 5778 + if (flags & ~STATMOUNT_BY_FD) 5933 5779 return -EINVAL; 5934 5780 5935 - ret = copy_mnt_id_req(req, &kreq); 5781 + ret = copy_mnt_id_req(req, &kreq, flags); 5936 5782 if (ret) 5937 5783 return ret; 5938 5784 5939 - ns = grab_requested_mnt_ns(&kreq); 5940 - if (IS_ERR(ns)) 5941 - return PTR_ERR(ns); 5785 + if (flags & STATMOUNT_BY_FD) { 5786 + mnt_file = fget_raw(kreq.mnt_fd); 5787 + if (!mnt_file) 5788 + return -EBADF; 5789 + /* do_statmount sets ns in case of STATMOUNT_BY_FD */ 5790 + } else { 5791 + ns = grab_requested_mnt_ns(&kreq); 5792 + if (IS_ERR(ns)) 5793 + return PTR_ERR(ns); 5942 5794 5943 - if (kreq.mnt_ns_id && (ns != current->nsproxy->mnt_ns) && 5944 - !ns_capable_noaudit(ns->user_ns, CAP_SYS_ADMIN)) 5945 - return -ENOENT; 5795 + if (kreq.mnt_ns_id && (ns != current->nsproxy->mnt_ns) && 5796 + !ns_capable_noaudit(ns->user_ns, CAP_SYS_ADMIN)) 5797 + return -EPERM; 5798 + } 5946 5799 5947 5800 ks = kmalloc(sizeof(*ks), GFP_KERNEL_ACCOUNT); 5948 5801 if (!ks) ··· 5962 5799 return ret; 5963 5800 5964 5801 scoped_guard(namespace_shared) 5965 - ret = do_statmount(ks, kreq.mnt_id, kreq.mnt_ns_id, ns); 5802 + ret = 
do_statmount(ks, kreq.mnt_id, kreq.mnt_ns_id, mnt_file, ns); 5966 5803 5967 5804 if (!ret) 5968 5805 ret = copy_statmount_to_user(ks); ··· 6102 5939 if (!access_ok(mnt_ids, nr_mnt_ids * sizeof(*mnt_ids))) 6103 5940 return -EFAULT; 6104 5941 6105 - ret = copy_mnt_id_req(req, &kreq); 5942 + ret = copy_mnt_id_req(req, &kreq, 0); 6106 5943 if (ret) 6107 5944 return ret; 6108 5945
+13
fs/nsfs.c
···
 	return ns_get_path_cb(path, ns_get_path_task, &args);
 }

+struct file *open_namespace_file(struct ns_common *ns)
+{
+	struct path path __free(path_put) = {};
+	int err;
+
+	/* call first to consume reference */
+	err = path_from_stashed(&ns->stashed, nsfs_mnt, ns, &path);
+	if (err < 0)
+		return ERR_PTR(err);
+
+	return dentry_open(&path, O_RDONLY, current_cred());
+}
+
 /**
  * open_namespace - open a namespace
  * @ns: the namespace to open
-2
include/linux/fs.h
···
 #define FS_RENAME_DOES_D_MOVE 32768 /* FS will handle d_move() during rename() internally. */
 	int (*init_fs_context)(struct fs_context *);
 	const struct fs_parameter_spec *parameters;
-	struct dentry *(*mount) (struct file_system_type *, int,
-		       const char *, void *);
 	void (*kill_sb) (struct super_block *);
 	struct module *owner;
 	struct file_system_type * next;
-1
include/linux/fs/super_types.h
···
 			 const void *owner);
 	int (*unfreeze_fs)(struct super_block *sb);
 	int (*statfs)(struct dentry *dentry, struct kstatfs *kstatfs);
-	int (*remount_fs) (struct super_block *, int *, char *);
 	void (*umount_begin)(struct super_block *sb);

 	int (*show_options)(struct seq_file *seq, struct dentry *dentry);
+11 -2
include/uapi/linux/mount.h
···
 /*
  * open_tree() flags.
  */
-#define OPEN_TREE_CLONE		1		/* Clone the target tree and attach the clone */
+#define OPEN_TREE_CLONE		(1 << 0)	/* Clone the target tree and attach the clone */
+#define OPEN_TREE_NAMESPACE	(1 << 1)	/* Clone the target tree into a new mount namespace */
 #define OPEN_TREE_CLOEXEC	O_CLOEXEC	/* Close the file on execve() */

 /*
···
  */
 struct mnt_id_req {
 	__u32 size;
-	__u32 mnt_ns_fd;
+	union {
+		__u32 mnt_ns_fd;
+		__u32 mnt_fd;
+	};
 	__u64 mnt_id;
 	__u64 param;
 	__u64 mnt_ns_id;
···
  */
 #define LSMT_ROOT		0xffffffffffffffff	/* root mount */
 #define LISTMOUNT_REVERSE	(1 << 0) /* List later mounts first */
+
+/*
+ * @flag bits for statmount(2)
+ */
+#define STATMOUNT_BY_FD		0x00000001U	/* want mountinfo for given fd */

 #endif /* _UAPI_LINUX_MOUNT_H */
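Wrapping mnt_ns_fd in an anonymous union with mnt_fd should leave the binary layout of struct mnt_id_req unchanged, and rewriting OPEN_TREE_CLONE as (1 << 0) keeps its value at 1. The compile-time checks below illustrate that reasoning on a local copy of the struct. The layout_ names are hypothetical, and the sizes 24 and 32 are assumed to correspond to MNT_ID_REQ_SIZE_VER0 and MNT_ID_REQ_SIZE_VER1.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical local copy of the request layout after this change. */
struct layout_req {
	uint32_t size;
	union {
		uint32_t mnt_ns_fd;	/* used by by-ID lookups */
		uint32_t mnt_fd;	/* used with STATMOUNT_BY_FD */
	};
	uint64_t mnt_id;
	uint64_t param;
	uint64_t mnt_ns_id;
};

/* Both union members alias the 32-bit slot that mnt_ns_fd occupied before. */
_Static_assert(offsetof(struct layout_req, mnt_ns_fd) == 4, "mnt_ns_fd moved");
_Static_assert(offsetof(struct layout_req, mnt_fd) == 4, "mnt_fd not aliased");
/* The fields after the union keep their offsets. */
_Static_assert(offsetof(struct layout_req, mnt_id) == 8, "mnt_id moved");
_Static_assert(offsetof(struct layout_req, mnt_ns_id) == 24, "mnt_ns_id moved");
/* Total size matches the assumed second published struct size. */
_Static_assert(sizeof(struct layout_req) == 32, "struct size changed");
/* The OPEN_TREE_CLONE rewrite is purely cosmetic. */
_Static_assert((1 << 0) == 1, "OPEN_TREE_CLONE value changed");

/* Runtime helper so callers can assert the same property; returns 1 if OK. */
static inline int layout_req_is_compatible(void)
{
	return offsetof(struct layout_req, mnt_fd) ==
	       offsetof(struct layout_req, mnt_ns_fd) &&
	       sizeof(struct layout_req) == 32;
}
```

If any of these assertions failed, existing binaries that fill in mnt_ns_fd would be passing a differently shaped request than the kernel expects.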
+1
tools/testing/selftests/filesystems/open_tree_ns/.gitignore
···
+open_tree_ns_test
+10
tools/testing/selftests/filesystems/open_tree_ns/Makefile
···
+# SPDX-License-Identifier: GPL-2.0
+TEST_GEN_PROGS := open_tree_ns_test
+
+CFLAGS := -Wall -Werror -g $(KHDR_INCLUDES)
+LDLIBS := -lcap
+
+include ../../lib.mk
+
+$(OUTPUT)/open_tree_ns_test: open_tree_ns_test.c ../utils.c
+	$(CC) $(CFLAGS) -o $@ $^ $(LDLIBS)
+1030
tools/testing/selftests/filesystems/open_tree_ns/open_tree_ns_test.c
··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + /* 3 + * Test for OPEN_TREE_NAMESPACE flag. 4 + * 5 + * Test that open_tree() with OPEN_TREE_NAMESPACE creates a new mount 6 + * namespace containing the specified mount tree. 7 + */ 8 + #define _GNU_SOURCE 9 + 10 + #include <errno.h> 11 + #include <fcntl.h> 12 + #include <limits.h> 13 + #include <linux/nsfs.h> 14 + #include <sched.h> 15 + #include <stdio.h> 16 + #include <stdlib.h> 17 + #include <string.h> 18 + #include <sys/ioctl.h> 19 + #include <sys/mount.h> 20 + #include <sys/stat.h> 21 + #include <sys/wait.h> 22 + #include <unistd.h> 23 + 24 + #include "../wrappers.h" 25 + #include "../statmount/statmount.h" 26 + #include "../utils.h" 27 + #include "../../kselftest_harness.h" 28 + 29 + #ifndef OPEN_TREE_NAMESPACE 30 + #define OPEN_TREE_NAMESPACE (1 << 1) 31 + #endif 32 + 33 + static int get_mnt_ns_id(int fd, uint64_t *mnt_ns_id) 34 + { 35 + if (ioctl(fd, NS_GET_MNTNS_ID, mnt_ns_id) < 0) 36 + return -errno; 37 + return 0; 38 + } 39 + 40 + static int get_mnt_ns_id_from_path(const char *path, uint64_t *mnt_ns_id) 41 + { 42 + int fd, ret; 43 + 44 + fd = open(path, O_RDONLY); 45 + if (fd < 0) 46 + return -errno; 47 + 48 + ret = get_mnt_ns_id(fd, mnt_ns_id); 49 + close(fd); 50 + return ret; 51 + } 52 + 53 + #define STATMOUNT_BUFSIZE (1 << 15) 54 + 55 + static struct statmount *statmount_alloc(uint64_t mnt_id, uint64_t mnt_ns_id, uint64_t mask) 56 + { 57 + struct statmount *buf; 58 + size_t bufsize = STATMOUNT_BUFSIZE; 59 + int ret; 60 + 61 + for (;;) { 62 + buf = malloc(bufsize); 63 + if (!buf) 64 + return NULL; 65 + 66 + ret = statmount(mnt_id, mnt_ns_id, mask, buf, bufsize, 0); 67 + if (ret == 0) 68 + return buf; 69 + 70 + free(buf); 71 + if (errno != EOVERFLOW) 72 + return NULL; 73 + 74 + bufsize <<= 1; 75 + } 76 + } 77 + 78 + static void log_mount(struct __test_metadata *_metadata, struct statmount *sm) 79 + { 80 + const char *fs_type = ""; 81 + const char *mnt_root = ""; 82 + const char *mnt_point = ""; 83 + 84 + 
if (sm->mask & STATMOUNT_FS_TYPE) 85 + fs_type = sm->str + sm->fs_type; 86 + if (sm->mask & STATMOUNT_MNT_ROOT) 87 + mnt_root = sm->str + sm->mnt_root; 88 + if (sm->mask & STATMOUNT_MNT_POINT) 89 + mnt_point = sm->str + sm->mnt_point; 90 + 91 + TH_LOG(" mnt_id: %llu, parent_id: %llu, fs_type: %s, root: %s, point: %s", 92 + (unsigned long long)sm->mnt_id, 93 + (unsigned long long)sm->mnt_parent_id, 94 + fs_type, mnt_root, mnt_point); 95 + } 96 + 97 + static void dump_mounts(struct __test_metadata *_metadata, uint64_t mnt_ns_id) 98 + { 99 + uint64_t list[256]; 100 + ssize_t nr_mounts; 101 + 102 + nr_mounts = listmount(LSMT_ROOT, mnt_ns_id, 0, list, 256, 0); 103 + if (nr_mounts < 0) { 104 + TH_LOG("listmount failed: %s", strerror(errno)); 105 + return; 106 + } 107 + 108 + TH_LOG("Mount namespace %llu contains %zd mount(s):", 109 + (unsigned long long)mnt_ns_id, nr_mounts); 110 + 111 + for (ssize_t i = 0; i < nr_mounts; i++) { 112 + struct statmount *sm; 113 + 114 + sm = statmount_alloc(list[i], mnt_ns_id, 115 + STATMOUNT_MNT_BASIC | 116 + STATMOUNT_FS_TYPE | 117 + STATMOUNT_MNT_ROOT | 118 + STATMOUNT_MNT_POINT); 119 + if (!sm) { 120 + TH_LOG(" [%zd] mnt_id %llu: statmount failed: %s", 121 + i, (unsigned long long)list[i], strerror(errno)); 122 + continue; 123 + } 124 + 125 + log_mount(_metadata, sm); 126 + free(sm); 127 + } 128 + } 129 + 130 + FIXTURE(open_tree_ns) 131 + { 132 + int fd; 133 + uint64_t current_ns_id; 134 + }; 135 + 136 + FIXTURE_VARIANT(open_tree_ns) 137 + { 138 + const char *path; 139 + unsigned int flags; 140 + bool expect_success; 141 + bool expect_different_ns; 142 + int min_mounts; 143 + }; 144 + 145 + FIXTURE_VARIANT_ADD(open_tree_ns, basic_root) 146 + { 147 + .path = "/", 148 + .flags = OPEN_TREE_NAMESPACE | OPEN_TREE_CLOEXEC, 149 + .expect_success = true, 150 + .expect_different_ns = true, 151 + /* 152 + * The empty rootfs is hidden from listmount()/mountinfo, 153 + * so we only see the bind mount on top of it. 
154 + */ 155 + .min_mounts = 1, 156 + }; 157 + 158 + FIXTURE_VARIANT_ADD(open_tree_ns, recursive_root) 159 + { 160 + .path = "/", 161 + .flags = OPEN_TREE_NAMESPACE | AT_RECURSIVE | OPEN_TREE_CLOEXEC, 162 + .expect_success = true, 163 + .expect_different_ns = true, 164 + .min_mounts = 1, 165 + }; 166 + 167 + FIXTURE_VARIANT_ADD(open_tree_ns, subdir_tmp) 168 + { 169 + .path = "/tmp", 170 + .flags = OPEN_TREE_NAMESPACE | OPEN_TREE_CLOEXEC, 171 + .expect_success = true, 172 + .expect_different_ns = true, 173 + .min_mounts = 1, 174 + }; 175 + 176 + FIXTURE_VARIANT_ADD(open_tree_ns, subdir_proc) 177 + { 178 + .path = "/proc", 179 + .flags = OPEN_TREE_NAMESPACE | OPEN_TREE_CLOEXEC, 180 + .expect_success = true, 181 + .expect_different_ns = true, 182 + .min_mounts = 1, 183 + }; 184 + 185 + FIXTURE_VARIANT_ADD(open_tree_ns, recursive_tmp) 186 + { 187 + .path = "/tmp", 188 + .flags = OPEN_TREE_NAMESPACE | AT_RECURSIVE | OPEN_TREE_CLOEXEC, 189 + .expect_success = true, 190 + .expect_different_ns = true, 191 + .min_mounts = 1, 192 + }; 193 + 194 + FIXTURE_VARIANT_ADD(open_tree_ns, recursive_run) 195 + { 196 + .path = "/run", 197 + .flags = OPEN_TREE_NAMESPACE | AT_RECURSIVE | OPEN_TREE_CLOEXEC, 198 + .expect_success = true, 199 + .expect_different_ns = true, 200 + .min_mounts = 1, 201 + }; 202 + 203 + FIXTURE_VARIANT_ADD(open_tree_ns, invalid_recursive_alone) 204 + { 205 + .path = "/", 206 + .flags = AT_RECURSIVE | OPEN_TREE_CLOEXEC, 207 + .expect_success = false, 208 + .expect_different_ns = false, 209 + .min_mounts = 0, 210 + }; 211 + 212 + FIXTURE_SETUP(open_tree_ns) 213 + { 214 + int ret; 215 + 216 + self->fd = -1; 217 + 218 + /* Check if open_tree syscall is supported */ 219 + ret = sys_open_tree(-1, NULL, 0); 220 + if (ret == -1 && errno == ENOSYS) 221 + SKIP(return, "open_tree() syscall not supported"); 222 + 223 + /* Check if statmount/listmount are supported */ 224 + ret = statmount(0, 0, 0, NULL, 0, 0); 225 + if (ret == -1 && errno == ENOSYS) 226 + SKIP(return, 
"statmount() syscall not supported"); 227 + 228 + /* Get current mount namespace ID for comparison */ 229 + ret = get_mnt_ns_id_from_path("/proc/self/ns/mnt", &self->current_ns_id); 230 + if (ret < 0) 231 + SKIP(return, "Failed to get current mount namespace ID"); 232 + } 233 + 234 + FIXTURE_TEARDOWN(open_tree_ns) 235 + { 236 + if (self->fd >= 0) 237 + close(self->fd); 238 + } 239 + 240 + TEST_F(open_tree_ns, create_namespace) 241 + { 242 + uint64_t new_ns_id; 243 + uint64_t list[256]; 244 + ssize_t nr_mounts; 245 + int ret; 246 + 247 + self->fd = sys_open_tree(AT_FDCWD, variant->path, variant->flags); 248 + 249 + if (!variant->expect_success) { 250 + ASSERT_LT(self->fd, 0); 251 + ASSERT_EQ(errno, EINVAL); 252 + return; 253 + } 254 + 255 + if (self->fd < 0 && errno == EINVAL) 256 + SKIP(return, "OPEN_TREE_NAMESPACE not supported"); 257 + 258 + ASSERT_GE(self->fd, 0); 259 + 260 + /* Verify we can get the namespace ID */ 261 + ret = get_mnt_ns_id(self->fd, &new_ns_id); 262 + ASSERT_EQ(ret, 0); 263 + 264 + /* Verify it's a different namespace */ 265 + if (variant->expect_different_ns) 266 + ASSERT_NE(new_ns_id, self->current_ns_id); 267 + 268 + /* List mounts in the new namespace */ 269 + nr_mounts = listmount(LSMT_ROOT, new_ns_id, 0, list, 256, 0); 270 + ASSERT_GE(nr_mounts, 0) { 271 + TH_LOG("%m - listmount failed"); 272 + } 273 + 274 + /* Verify minimum expected mounts */ 275 + ASSERT_GE(nr_mounts, variant->min_mounts); 276 + TH_LOG("Namespace contains %zd mounts", nr_mounts); 277 + } 278 + 279 + TEST_F(open_tree_ns, setns_into_namespace) 280 + { 281 + uint64_t new_ns_id; 282 + pid_t pid; 283 + int status; 284 + int ret; 285 + 286 + /* Only test with basic flags */ 287 + if (!(variant->flags & OPEN_TREE_NAMESPACE)) 288 + SKIP(return, "setns test only for basic / case"); 289 + 290 + self->fd = sys_open_tree(AT_FDCWD, variant->path, variant->flags); 291 + if (self->fd < 0 && errno == EINVAL) 292 + SKIP(return, "OPEN_TREE_NAMESPACE not supported"); 293 + 294 + 
ASSERT_GE(self->fd, 0); 295 + 296 + /* Get namespace ID and dump all mounts */ 297 + ret = get_mnt_ns_id(self->fd, &new_ns_id); 298 + ASSERT_EQ(ret, 0); 299 + 300 + dump_mounts(_metadata, new_ns_id); 301 + 302 + pid = fork(); 303 + ASSERT_GE(pid, 0); 304 + 305 + if (pid == 0) { 306 + /* Child: try to enter the namespace */ 307 + if (setns(self->fd, CLONE_NEWNS) < 0) 308 + _exit(1); 309 + _exit(0); 310 + } 311 + 312 + ASSERT_EQ(waitpid(pid, &status, 0), pid); 313 + ASSERT_TRUE(WIFEXITED(status)); 314 + ASSERT_EQ(WEXITSTATUS(status), 0); 315 + } 316 + 317 + TEST_F(open_tree_ns, verify_mount_properties) 318 + { 319 + struct statmount sm; 320 + uint64_t new_ns_id; 321 + uint64_t list[256]; 322 + ssize_t nr_mounts; 323 + int ret; 324 + 325 + /* Only test with basic flags on root */ 326 + if (variant->flags != (OPEN_TREE_NAMESPACE | OPEN_TREE_CLOEXEC) || 327 + strcmp(variant->path, "/") != 0) 328 + SKIP(return, "mount properties test only for basic / case"); 329 + 330 + self->fd = sys_open_tree(AT_FDCWD, "/", OPEN_TREE_NAMESPACE | OPEN_TREE_CLOEXEC); 331 + if (self->fd < 0 && errno == EINVAL) 332 + SKIP(return, "OPEN_TREE_NAMESPACE not supported"); 333 + 334 + ASSERT_GE(self->fd, 0); 335 + 336 + ret = get_mnt_ns_id(self->fd, &new_ns_id); 337 + ASSERT_EQ(ret, 0); 338 + 339 + nr_mounts = listmount(LSMT_ROOT, new_ns_id, 0, list, 256, 0); 340 + ASSERT_GE(nr_mounts, 1); 341 + 342 + /* Get info about the root mount (the bind mount, rootfs is hidden) */ 343 + ret = statmount(list[0], new_ns_id, STATMOUNT_MNT_BASIC, &sm, sizeof(sm), 0); 344 + ASSERT_EQ(ret, 0); 345 + 346 + ASSERT_NE(sm.mnt_id, sm.mnt_parent_id); 347 + 348 + TH_LOG("Root mount id: %llu, parent: %llu", 349 + (unsigned long long)sm.mnt_id, 350 + (unsigned long long)sm.mnt_parent_id); 351 + } 352 + 353 + FIXTURE(open_tree_ns_caps) 354 + { 355 + bool has_caps; 356 + }; 357 + 358 + FIXTURE_SETUP(open_tree_ns_caps) 359 + { 360 + int ret; 361 + 362 + /* Check if open_tree syscall is supported */ 363 + ret = 
sys_open_tree(-1, NULL, 0); 364 + if (ret == -1 && errno == ENOSYS) 365 + SKIP(return, "open_tree() syscall not supported"); 366 + 367 + self->has_caps = (geteuid() == 0); 368 + } 369 + 370 + FIXTURE_TEARDOWN(open_tree_ns_caps) 371 + { 372 + } 373 + 374 + TEST_F(open_tree_ns_caps, requires_cap_sys_admin) 375 + { 376 + pid_t pid; 377 + int status; 378 + 379 + pid = fork(); 380 + ASSERT_GE(pid, 0); 381 + 382 + if (pid == 0) { 383 + int fd; 384 + 385 + /* Child: drop privileges using utils.h helper */ 386 + if (enter_userns() != 0) 387 + _exit(2); 388 + 389 + /* Drop all caps using utils.h helper */ 390 + if (caps_down() == 0) 391 + _exit(3); 392 + 393 + fd = sys_open_tree(AT_FDCWD, "/", 394 + OPEN_TREE_NAMESPACE | OPEN_TREE_CLOEXEC); 395 + if (fd >= 0) { 396 + close(fd); 397 + /* Should have failed without caps */ 398 + _exit(1); 399 + } 400 + 401 + if (errno == EPERM) 402 + _exit(0); 403 + 404 + /* EINVAL means OPEN_TREE_NAMESPACE not supported */ 405 + if (errno == EINVAL) 406 + _exit(4); 407 + 408 + /* Unexpected error */ 409 + _exit(5); 410 + } 411 + 412 + ASSERT_EQ(waitpid(pid, &status, 0), pid); 413 + ASSERT_TRUE(WIFEXITED(status)); 414 + 415 + switch (WEXITSTATUS(status)) { 416 + case 0: 417 + /* Expected: EPERM without caps */ 418 + break; 419 + case 1: 420 + ASSERT_FALSE(true) TH_LOG("OPEN_TREE_NAMESPACE succeeded without caps"); 421 + break; 422 + case 2: 423 + SKIP(return, "setup_userns failed"); 424 + break; 425 + case 3: 426 + SKIP(return, "caps_down failed"); 427 + break; 428 + case 4: 429 + SKIP(return, "OPEN_TREE_NAMESPACE not supported"); 430 + break; 431 + default: 432 + ASSERT_FALSE(true) TH_LOG("Unexpected error in child (exit %d)", 433 + WEXITSTATUS(status)); 434 + break; 435 + } 436 + } 437 + 438 + FIXTURE(open_tree_ns_userns) 439 + { 440 + int fd; 441 + }; 442 + 443 + FIXTURE_SETUP(open_tree_ns_userns) 444 + { 445 + int ret; 446 + 447 + self->fd = -1; 448 + 449 + /* Check if open_tree syscall is supported */ 450 + ret = sys_open_tree(-1, NULL, 
0); 451 + if (ret == -1 && errno == ENOSYS) 452 + SKIP(return, "open_tree() syscall not supported"); 453 + 454 + /* Check if statmount/listmount are supported */ 455 + ret = statmount(0, 0, 0, NULL, 0, 0); 456 + if (ret == -1 && errno == ENOSYS) 457 + SKIP(return, "statmount() syscall not supported"); 458 + } 459 + 460 + FIXTURE_TEARDOWN(open_tree_ns_userns) 461 + { 462 + if (self->fd >= 0) 463 + close(self->fd); 464 + } 465 + 466 + TEST_F(open_tree_ns_userns, create_in_userns) 467 + { 468 + pid_t pid; 469 + int status; 470 + 471 + pid = fork(); 472 + ASSERT_GE(pid, 0); 473 + 474 + if (pid == 0) { 475 + uint64_t new_ns_id; 476 + uint64_t list[256]; 477 + ssize_t nr_mounts; 478 + int fd; 479 + 480 + /* Create new user namespace (also creates mount namespace) */ 481 + if (enter_userns() != 0) 482 + _exit(2); 483 + 484 + /* Now we have CAP_SYS_ADMIN in the user namespace */ 485 + fd = sys_open_tree(AT_FDCWD, "/", 486 + OPEN_TREE_NAMESPACE | OPEN_TREE_CLOEXEC); 487 + if (fd < 0) { 488 + if (errno == EINVAL) 489 + _exit(4); /* OPEN_TREE_NAMESPACE not supported */ 490 + _exit(1); 491 + } 492 + 493 + /* Verify we can get the namespace ID */ 494 + if (get_mnt_ns_id(fd, &new_ns_id) != 0) 495 + _exit(5); 496 + 497 + /* Verify we can list mounts in the new namespace */ 498 + nr_mounts = listmount(LSMT_ROOT, new_ns_id, 0, list, 256, 0); 499 + if (nr_mounts < 0) 500 + _exit(6); 501 + 502 + /* Should have at least 1 mount */ 503 + if (nr_mounts < 1) 504 + _exit(7); 505 + 506 + close(fd); 507 + _exit(0); 508 + } 509 + 510 + ASSERT_EQ(waitpid(pid, &status, 0), pid); 511 + ASSERT_TRUE(WIFEXITED(status)); 512 + 513 + switch (WEXITSTATUS(status)) { 514 + case 0: 515 + /* Success */ 516 + break; 517 + case 1: 518 + ASSERT_FALSE(true) TH_LOG("open_tree(OPEN_TREE_NAMESPACE) failed in userns"); 519 + break; 520 + case 2: 521 + SKIP(return, "setup_userns failed"); 522 + break; 523 + case 4: 524 + SKIP(return, "OPEN_TREE_NAMESPACE not supported"); 525 + break; 526 + case 5: 527 + 
ASSERT_FALSE(true) TH_LOG("Failed to get mount namespace ID"); 528 + break; 529 + case 6: 530 + ASSERT_FALSE(true) TH_LOG("listmount failed in new namespace"); 531 + break; 532 + case 7: 533 + ASSERT_FALSE(true) TH_LOG("New namespace has no mounts"); 534 + break; 535 + default: 536 + ASSERT_FALSE(true) TH_LOG("Unexpected error in child (exit %d)", 537 + WEXITSTATUS(status)); 538 + break; 539 + } 540 + } 541 + 542 + TEST_F(open_tree_ns_userns, setns_in_userns) 543 + { 544 + pid_t pid; 545 + int status; 546 + 547 + pid = fork(); 548 + ASSERT_GE(pid, 0); 549 + 550 + if (pid == 0) { 551 + uint64_t new_ns_id; 552 + int fd; 553 + pid_t inner_pid; 554 + int inner_status; 555 + 556 + /* Create new user namespace */ 557 + if (enter_userns() != 0) 558 + _exit(2); 559 + 560 + fd = sys_open_tree(AT_FDCWD, "/", 561 + OPEN_TREE_NAMESPACE | OPEN_TREE_CLOEXEC); 562 + if (fd < 0) { 563 + if (errno == EINVAL) 564 + _exit(4); 565 + _exit(1); 566 + } 567 + 568 + if (get_mnt_ns_id(fd, &new_ns_id) != 0) 569 + _exit(5); 570 + 571 + /* Fork again to test setns into the new namespace */ 572 + inner_pid = fork(); 573 + if (inner_pid < 0) 574 + _exit(8); 575 + 576 + if (inner_pid == 0) { 577 + /* Inner child: enter the new namespace */ 578 + if (setns(fd, CLONE_NEWNS) < 0) 579 + _exit(1); 580 + _exit(0); 581 + } 582 + 583 + if (waitpid(inner_pid, &inner_status, 0) != inner_pid) 584 + _exit(9); 585 + 586 + if (!WIFEXITED(inner_status) || WEXITSTATUS(inner_status) != 0) 587 + _exit(10); 588 + 589 + close(fd); 590 + _exit(0); 591 + } 592 + 593 + ASSERT_EQ(waitpid(pid, &status, 0), pid); 594 + ASSERT_TRUE(WIFEXITED(status)); 595 + 596 + switch (WEXITSTATUS(status)) { 597 + case 0: 598 + /* Success */ 599 + break; 600 + case 1: 601 + ASSERT_FALSE(true) TH_LOG("open_tree or setns failed in userns"); 602 + break; 603 + case 2: 604 + SKIP(return, "setup_userns failed"); 605 + break; 606 + case 4: 607 + SKIP(return, "OPEN_TREE_NAMESPACE not supported"); 608 + break; 609 + case 5: 610 + 
ASSERT_FALSE(true) TH_LOG("Failed to get mount namespace ID"); 611 + break; 612 + case 8: 613 + ASSERT_FALSE(true) TH_LOG("Inner fork failed"); 614 + break; 615 + case 9: 616 + ASSERT_FALSE(true) TH_LOG("Inner waitpid failed"); 617 + break; 618 + case 10: 619 + ASSERT_FALSE(true) TH_LOG("setns into new namespace failed"); 620 + break; 621 + default: 622 + ASSERT_FALSE(true) TH_LOG("Unexpected error in child (exit %d)", 623 + WEXITSTATUS(status)); 624 + break; 625 + } 626 + } 627 + 628 + TEST_F(open_tree_ns_userns, recursive_in_userns) 629 + { 630 + pid_t pid; 631 + int status; 632 + 633 + pid = fork(); 634 + ASSERT_GE(pid, 0); 635 + 636 + if (pid == 0) { 637 + uint64_t new_ns_id; 638 + uint64_t list[256]; 639 + ssize_t nr_mounts; 640 + int fd; 641 + 642 + /* Create new user namespace */ 643 + if (enter_userns() != 0) 644 + _exit(2); 645 + 646 + /* Test recursive flag in userns */ 647 + fd = sys_open_tree(AT_FDCWD, "/", 648 + OPEN_TREE_NAMESPACE | AT_RECURSIVE | OPEN_TREE_CLOEXEC); 649 + if (fd < 0) { 650 + if (errno == EINVAL) 651 + _exit(4); 652 + _exit(1); 653 + } 654 + 655 + if (get_mnt_ns_id(fd, &new_ns_id) != 0) 656 + _exit(5); 657 + 658 + nr_mounts = listmount(LSMT_ROOT, new_ns_id, 0, list, 256, 0); 659 + if (nr_mounts < 0) 660 + _exit(6); 661 + 662 + /* Recursive should copy submounts too */ 663 + if (nr_mounts < 1) 664 + _exit(7); 665 + 666 + close(fd); 667 + _exit(0); 668 + } 669 + 670 + ASSERT_EQ(waitpid(pid, &status, 0), pid); 671 + ASSERT_TRUE(WIFEXITED(status)); 672 + 673 + switch (WEXITSTATUS(status)) { 674 + case 0: 675 + /* Success */ 676 + break; 677 + case 1: 678 + ASSERT_FALSE(true) TH_LOG("open_tree(OPEN_TREE_NAMESPACE|AT_RECURSIVE) failed in userns"); 679 + break; 680 + case 2: 681 + SKIP(return, "setup_userns failed"); 682 + break; 683 + case 4: 684 + SKIP(return, "OPEN_TREE_NAMESPACE not supported"); 685 + break; 686 + case 5: 687 + ASSERT_FALSE(true) TH_LOG("Failed to get mount namespace ID"); 688 + break; 689 + case 6: 690 + 
ASSERT_FALSE(true) TH_LOG("listmount failed in new namespace"); 691 + break; 692 + case 7: 693 + ASSERT_FALSE(true) TH_LOG("New namespace has no mounts"); 694 + break; 695 + default: 696 + ASSERT_FALSE(true) TH_LOG("Unexpected error in child (exit %d)", 697 + WEXITSTATUS(status)); 698 + break; 699 + } 700 + } 701 + 702 + TEST_F(open_tree_ns_userns, umount_fails_einval) 703 + { 704 + pid_t pid; 705 + int status; 706 + 707 + pid = fork(); 708 + ASSERT_GE(pid, 0); 709 + 710 + if (pid == 0) { 711 + uint64_t new_ns_id; 712 + uint64_t list[256]; 713 + ssize_t nr_mounts; 714 + int fd; 715 + ssize_t i; 716 + 717 + /* Create new user namespace */ 718 + if (enter_userns() != 0) 719 + _exit(2); 720 + 721 + fd = sys_open_tree(AT_FDCWD, "/", 722 + OPEN_TREE_NAMESPACE | AT_RECURSIVE | OPEN_TREE_CLOEXEC); 723 + if (fd < 0) { 724 + if (errno == EINVAL) 725 + _exit(4); 726 + _exit(1); 727 + } 728 + 729 + if (get_mnt_ns_id(fd, &new_ns_id) != 0) 730 + _exit(5); 731 + 732 + /* Get all mounts in the new namespace */ 733 + nr_mounts = listmount(LSMT_ROOT, new_ns_id, 0, list, 256, LISTMOUNT_REVERSE); 734 + if (nr_mounts < 0) 735 + _exit(9); 736 + 737 + if (nr_mounts < 1) 738 + _exit(10); 739 + 740 + /* Enter the new namespace */ 741 + if (setns(fd, CLONE_NEWNS) < 0) 742 + _exit(6); 743 + 744 + for (i = 0; i < nr_mounts; i++) { 745 + struct statmount *sm; 746 + const char *mnt_point; 747 + 748 + sm = statmount_alloc(list[i], new_ns_id, 749 + STATMOUNT_MNT_POINT); 750 + if (!sm) 751 + _exit(11); 752 + 753 + mnt_point = sm->str + sm->mnt_point; 754 + 755 + TH_LOG("Trying to umount %s", mnt_point); 756 + if (umount2(mnt_point, MNT_DETACH) == 0) { 757 + free(sm); 758 + _exit(7); 759 + } 760 + 761 + if (errno != EINVAL) { 762 + /* Wrong error */ 763 + free(sm); 764 + _exit(8); 765 + } 766 + 767 + free(sm); 768 + } 769 + 770 + close(fd); 771 + _exit(0); 772 + } 773 + 774 + ASSERT_EQ(waitpid(pid, &status, 0), pid); 775 + ASSERT_TRUE(WIFEXITED(status)); 776 + 777 + switch (WEXITSTATUS(status)) { 
778 + case 0: 779 + break; 780 + case 1: 781 + ASSERT_FALSE(true) TH_LOG("open_tree(OPEN_TREE_NAMESPACE) failed"); 782 + break; 783 + case 2: 784 + SKIP(return, "setup_userns failed"); 785 + break; 786 + case 4: 787 + SKIP(return, "OPEN_TREE_NAMESPACE not supported"); 788 + break; 789 + case 5: 790 + ASSERT_FALSE(true) TH_LOG("Failed to get mount namespace ID"); 791 + break; 792 + case 6: 793 + ASSERT_FALSE(true) TH_LOG("setns into new namespace failed"); 794 + break; 795 + case 7: 796 + ASSERT_FALSE(true) TH_LOG("umount succeeded but should have failed with EINVAL"); 797 + break; 798 + case 8: 799 + ASSERT_FALSE(true) TH_LOG("umount failed with wrong error (expected EINVAL)"); 800 + break; 801 + case 9: 802 + ASSERT_FALSE(true) TH_LOG("listmount failed"); 803 + break; 804 + case 10: 805 + ASSERT_FALSE(true) TH_LOG("No mounts in new namespace"); 806 + break; 807 + case 11: 808 + ASSERT_FALSE(true) TH_LOG("statmount_alloc failed"); 809 + break; 810 + default: 811 + ASSERT_FALSE(true) TH_LOG("Unexpected error in child (exit %d)", 812 + WEXITSTATUS(status)); 813 + break; 814 + } 815 + } 816 + 817 + TEST_F(open_tree_ns_userns, umount_succeeds) 818 + { 819 + pid_t pid; 820 + int status; 821 + 822 + pid = fork(); 823 + ASSERT_GE(pid, 0); 824 + 825 + if (pid == 0) { 826 + uint64_t new_ns_id; 827 + uint64_t list[256]; 828 + ssize_t nr_mounts; 829 + int fd; 830 + ssize_t i; 831 + 832 + if (unshare(CLONE_NEWNS)) 833 + _exit(1); 834 + 835 + if (sys_mount(NULL, "/", NULL, MS_SLAVE | MS_REC, NULL) != 0) 836 + _exit(1); 837 + 838 + fd = sys_open_tree(AT_FDCWD, "/", 839 + OPEN_TREE_NAMESPACE | AT_RECURSIVE | OPEN_TREE_CLOEXEC); 840 + if (fd < 0) { 841 + if (errno == EINVAL) 842 + _exit(4); 843 + _exit(1); 844 + } 845 + 846 + if (get_mnt_ns_id(fd, &new_ns_id) != 0) 847 + _exit(5); 848 + 849 + /* Get all mounts in the new namespace */ 850 + nr_mounts = listmount(LSMT_ROOT, new_ns_id, 0, list, 256, LISTMOUNT_REVERSE); 851 + if (nr_mounts < 0) 852 + _exit(9); 853 + 854 + if 
(nr_mounts < 1) 855 + _exit(10); 856 + 857 + /* Enter the new namespace */ 858 + if (setns(fd, CLONE_NEWNS) < 0) 859 + _exit(6); 860 + 861 + for (i = 0; i < nr_mounts; i++) { 862 + struct statmount *sm; 863 + const char *mnt_point; 864 + 865 + sm = statmount_alloc(list[i], new_ns_id, 866 + STATMOUNT_MNT_POINT); 867 + if (!sm) 868 + _exit(11); 869 + 870 + mnt_point = sm->str + sm->mnt_point; 871 + 872 + TH_LOG("Trying to umount %s", mnt_point); 873 + if (umount2(mnt_point, MNT_DETACH) != 0) { 874 + free(sm); 875 + _exit(7); 876 + } 877 + 878 + free(sm); 879 + } 880 + 881 + close(fd); 882 + _exit(0); 883 + } 884 + 885 + ASSERT_EQ(waitpid(pid, &status, 0), pid); 886 + ASSERT_TRUE(WIFEXITED(status)); 887 + 888 + switch (WEXITSTATUS(status)) { 889 + case 0: 890 + break; 891 + case 1: 892 + ASSERT_FALSE(true) TH_LOG("open_tree(OPEN_TREE_NAMESPACE) failed"); 893 + break; 894 + case 2: 895 + SKIP(return, "setup_userns failed"); 896 + break; 897 + case 4: 898 + SKIP(return, "OPEN_TREE_NAMESPACE not supported"); 899 + break; 900 + case 5: 901 + ASSERT_FALSE(true) TH_LOG("Failed to get mount namespace ID"); 902 + break; 903 + case 6: 904 + ASSERT_FALSE(true) TH_LOG("setns into new namespace failed"); 905 + break; 906 + case 7: 907 + ASSERT_FALSE(true) TH_LOG("umount succeeded but should have failed with EINVAL"); 908 + break; 909 + case 9: 910 + ASSERT_FALSE(true) TH_LOG("listmount failed"); 911 + break; 912 + case 10: 913 + ASSERT_FALSE(true) TH_LOG("No mounts in new namespace"); 914 + break; 915 + case 11: 916 + ASSERT_FALSE(true) TH_LOG("statmount_alloc failed"); 917 + break; 918 + default: 919 + ASSERT_FALSE(true) TH_LOG("Unexpected error in child (exit %d)", 920 + WEXITSTATUS(status)); 921 + break; 922 + } 923 + } 924 + 925 + FIXTURE(open_tree_ns_unbindable) 926 + { 927 + char tmpdir[PATH_MAX]; 928 + bool mounted; 929 + }; 930 + 931 + FIXTURE_SETUP(open_tree_ns_unbindable) 932 + { 933 + int ret; 934 + 935 + self->mounted = false; 936 + 937 + /* Check if open_tree syscall 
is supported */ 938 + ret = sys_open_tree(-1, NULL, 0); 939 + if (ret == -1 && errno == ENOSYS) 940 + SKIP(return, "open_tree() syscall not supported"); 941 + 942 + /* Create a temporary directory for the test mount */ 943 + snprintf(self->tmpdir, sizeof(self->tmpdir), 944 + "/tmp/open_tree_ns_test.XXXXXX"); 945 + ASSERT_NE(mkdtemp(self->tmpdir), NULL); 946 + 947 + /* Mount tmpfs there */ 948 + ret = mount("tmpfs", self->tmpdir, "tmpfs", 0, NULL); 949 + if (ret < 0) { 950 + rmdir(self->tmpdir); 951 + SKIP(return, "Failed to mount tmpfs"); 952 + } 953 + self->mounted = true; 954 + 955 + ret = mount(NULL, self->tmpdir, NULL, MS_UNBINDABLE, NULL); 956 + if (ret < 0) { 957 + rmdir(self->tmpdir); 958 + SKIP(return, "Failed to make tmpfs unbindable"); 959 + } 960 + } 961 + 962 + FIXTURE_TEARDOWN(open_tree_ns_unbindable) 963 + { 964 + if (self->mounted) 965 + umount2(self->tmpdir, MNT_DETACH); 966 + rmdir(self->tmpdir); 967 + } 968 + 969 + TEST_F(open_tree_ns_unbindable, fails_on_unbindable) 970 + { 971 + int fd; 972 + 973 + fd = sys_open_tree(AT_FDCWD, self->tmpdir, 974 + OPEN_TREE_NAMESPACE | OPEN_TREE_CLOEXEC); 975 + ASSERT_LT(fd, 0); 976 + } 977 + 978 + TEST_F(open_tree_ns_unbindable, recursive_skips_on_unbindable) 979 + { 980 + uint64_t new_ns_id; 981 + uint64_t list[256]; 982 + ssize_t nr_mounts; 983 + int fd; 984 + ssize_t i; 985 + bool found_unbindable = false; 986 + 987 + fd = sys_open_tree(AT_FDCWD, "/", 988 + OPEN_TREE_NAMESPACE | AT_RECURSIVE | OPEN_TREE_CLOEXEC); 989 + ASSERT_GT(fd, 0); 990 + 991 + ASSERT_EQ(get_mnt_ns_id(fd, &new_ns_id), 0); 992 + 993 + nr_mounts = listmount(LSMT_ROOT, new_ns_id, 0, list, 256, 0); 994 + ASSERT_GE(nr_mounts, 0) { 995 + TH_LOG("listmount failed: %m"); 996 + } 997 + 998 + /* 999 + * Iterate through all mounts in the new namespace and verify 1000 + * the unbindable tmpfs mount was silently dropped. 
1001 + */ 1002 + for (i = 0; i < nr_mounts; i++) { 1003 + struct statmount *sm; 1004 + const char *mnt_point; 1005 + 1006 + sm = statmount_alloc(list[i], new_ns_id, STATMOUNT_MNT_POINT); 1007 + ASSERT_NE(sm, NULL) { 1008 + TH_LOG("statmount_alloc failed for mnt_id %llu", 1009 + (unsigned long long)list[i]); 1010 + } 1011 + 1012 + mnt_point = sm->str + sm->mnt_point; 1013 + 1014 + if (strcmp(mnt_point, self->tmpdir) == 0) { 1015 + TH_LOG("Found unbindable mount at %s (should have been dropped)", 1016 + mnt_point); 1017 + found_unbindable = true; 1018 + } 1019 + 1020 + free(sm); 1021 + } 1022 + 1023 + ASSERT_FALSE(found_unbindable) { 1024 + TH_LOG("Unbindable mount at %s was not dropped", self->tmpdir); 1025 + } 1026 + 1027 + close(fd); 1028 + } 1029 + 1030 + TEST_HARNESS_MAIN
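The flow these selftests exercise (open_tree() with OPEN_TREE_NAMESPACE, then setns() into the returned namespace fd) can be condensed into a few helpers. This is a hedged sketch rather than part of the commit: the tree_ns_ names are hypothetical, the fallback values for AT_RECURSIVE, CLONE_NEWNS and the open_tree syscall number are assumptions for toolchains whose headers lack them, and raw syscall() is used instead of libc wrappers. Going by the tests above, a kernel without the flag is expected to return EINVAL and a caller without CAP_SYS_ADMIN to get EPERM.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <fcntl.h>
#include <sched.h>
#include <stdbool.h>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef OPEN_TREE_NAMESPACE
#define OPEN_TREE_NAMESPACE	(1 << 1)	/* value from the uapi hunk in this commit */
#endif
#ifndef OPEN_TREE_CLOEXEC
#define OPEN_TREE_CLOEXEC	O_CLOEXEC
#endif
#ifndef AT_RECURSIVE
#define AT_RECURSIVE		0x8000		/* assumed fallback */
#endif
#ifndef CLONE_NEWNS
#define CLONE_NEWNS		0x00020000	/* assumed fallback */
#endif
#ifndef SYS_open_tree
#define SYS_open_tree		428		/* assumed fallback (x86-64 / asm-generic) */
#endif

/* Build the open_tree() flags used by the test variants above. */
static inline unsigned int tree_ns_flags(bool recursive)
{
	unsigned int flags = OPEN_TREE_NAMESPACE | OPEN_TREE_CLOEXEC;

	if (recursive)
		flags |= AT_RECURSIVE;	/* copy submounts as well */
	return flags;
}

/* Returns a mount namespace fd on success, or a negative errno value. */
static inline int tree_ns_open(const char *path, bool recursive)
{
	long fd = syscall(SYS_open_tree, AT_FDCWD, path,
			  tree_ns_flags(recursive));

	return fd < 0 ? -errno : (int)fd;
}

/* Switch the calling thread into the namespace behind nsfd. */
static inline int tree_ns_enter(int nsfd)
{
	if (syscall(SYS_setns, nsfd, CLONE_NEWNS) < 0)
		return -errno;
	return 0;
}
```

A caller would typically run tree_ns_open("/", true) and then call tree_ns_enter() from a forked child, as the setns tests do, so that the parent stays in its original mount namespace.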
+10 -5
tools/testing/selftests/filesystems/statmount/statmount.h
···
 #endif
 #endif

-static inline int statmount(uint64_t mnt_id, uint64_t mnt_ns_id, uint64_t mask,
-			    struct statmount *buf, size_t bufsize,
+static inline int statmount(uint64_t mnt_id, uint64_t mnt_ns_id, uint32_t fd,
+			    uint64_t mask, struct statmount *buf, size_t bufsize,
 			    unsigned int flags)
 {
 	struct mnt_id_req req = {
 		.size = MNT_ID_REQ_SIZE_VER0,
-		.mnt_id = mnt_id,
 		.param = mask,
 	};

-	if (mnt_ns_id) {
+	if (flags & STATMOUNT_BY_FD) {
 		req.size = MNT_ID_REQ_SIZE_VER1;
-		req.mnt_ns_id = mnt_ns_id;
+		req.mnt_fd = fd;
+	} else {
+		req.mnt_id = mnt_id;
+		if (mnt_ns_id) {
+			req.size = MNT_ID_REQ_SIZE_VER1;
+			req.mnt_ns_id = mnt_ns_id;
+		}
 	}

 	return syscall(__NR_statmount, &req, buf, bufsize, flags);
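To make the effect of the reworked wrapper easier to see, the sketch below separates filling in the request from issuing the syscall. It is an illustration under stated assumptions, not the selftest header itself: the wrap_ and WRAP_ names are hypothetical, the sizes 24 and 32 are assumed to match MNT_ID_REQ_SIZE_VER0 and MNT_ID_REQ_SIZE_VER1, and the fallback syscall number 457 is an assumption for headers that do not define __NR_statmount.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef __NR_statmount
#define __NR_statmount		457	/* assumed fallback (x86-64 / asm-generic) */
#endif

#define WRAP_REQ_SIZE_VER0	24	/* assumed: MNT_ID_REQ_SIZE_VER0 */
#define WRAP_REQ_SIZE_VER1	32	/* assumed: MNT_ID_REQ_SIZE_VER1 */
#define WRAP_STATMOUNT_BY_FD	0x00000001U

/* Hypothetical local copy of struct mnt_id_req after this change. */
struct wrap_req {
	uint32_t size;
	union {
		uint32_t mnt_ns_fd;
		uint32_t mnt_fd;
	};
	uint64_t mnt_id;
	uint64_t param;
	uint64_t mnt_ns_id;
};

/* Fill the request the same way the updated selftest wrapper does. */
static inline void wrap_fill_req(struct wrap_req *req, uint64_t mnt_id,
				 uint64_t mnt_ns_id, uint32_t fd,
				 uint64_t mask, unsigned int flags)
{
	memset(req, 0, sizeof(*req));
	req->size = WRAP_REQ_SIZE_VER0;
	req->param = mask;

	if (flags & WRAP_STATMOUNT_BY_FD) {
		/* By-fd: only the fd is set, mount and namespace IDs stay 0. */
		req->size = WRAP_REQ_SIZE_VER1;
		req->mnt_fd = fd;
	} else {
		req->mnt_id = mnt_id;
		if (mnt_ns_id) {
			req->size = WRAP_REQ_SIZE_VER1;
			req->mnt_ns_id = mnt_ns_id;
		}
	}
}

/* Returns 0 on success or a negative errno value; buf receives the reply. */
static inline long wrap_statmount(uint64_t mnt_id, uint64_t mnt_ns_id,
				  uint32_t fd, uint64_t mask, void *buf,
				  size_t bufsize, unsigned int flags)
{
	struct wrap_req req;

	wrap_fill_req(&req, mnt_id, mnt_ns_id, fd, mask, flags);
	if (syscall(__NR_statmount, &req, buf, bufsize, flags) < 0)
		return -errno;
	return 0;
}
```

A by-fd query would pass 0 for both IDs plus an fd such as one obtained with open(path, O_PATH), together with WRAP_STATMOUNT_BY_FD. On kernels without this change the call is expected to fail with EINVAL, because any non-zero flags value was rejected before.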
+246 -15
tools/testing/selftests/filesystems/statmount/statmount_test.c
···
 	"sysv", "tmpfs", "tracefs", "ubifs", "udf", "ufs", "v7", "vboxsf",
 	"vfat", "virtiofs", "vxfs", "xenfs", "xfs", "zonefs", NULL };
 
-static struct statmount *statmount_alloc(uint64_t mnt_id, uint64_t mask, unsigned int flags)
+static struct statmount *statmount_alloc(uint64_t mnt_id, int fd, uint64_t mask, unsigned int flags)
 {
 	size_t bufsize = 1 << 15;
-	struct statmount *buf = NULL, *tmp = alloca(bufsize);
+	struct statmount *buf = NULL, *tmp = NULL;
 	int tofree = 0;
 	int ret;
 
+	if (flags & STATMOUNT_BY_FD && fd < 0)
+		return NULL;
+
+	tmp = alloca(bufsize);
+
 	for (;;) {
-		ret = statmount(mnt_id, 0, mask, tmp, bufsize, flags);
+		if (flags & STATMOUNT_BY_FD)
+			ret = statmount(0, 0, (uint32_t) fd, mask, tmp, bufsize, flags);
+		else
+			ret = statmount(mnt_id, 0, 0, mask, tmp, bufsize, flags);
+
 		if (ret != -1)
 			break;
 		if (tofree)
···
 	struct statmount sm;
 	int ret;
 
-	ret = statmount(root_id, 0, 0, &sm, sizeof(sm), 0);
+	ret = statmount(root_id, 0, 0, 0, &sm, sizeof(sm), 0);
 	if (ret == -1) {
 		ksft_test_result_fail("statmount zero mask: %s\n",
 				      strerror(errno));
···
 	int ret;
 	uint64_t mask = STATMOUNT_MNT_BASIC;
 
-	ret = statmount(root_id, 0, mask, &sm, sizeof(sm), 0);
+	ret = statmount(root_id, 0, 0, mask, &sm, sizeof(sm), 0);
 	if (ret == -1) {
 		ksft_test_result_fail("statmount mnt basic: %s\n",
 				      strerror(errno));
···
 	struct statx sx;
 	struct statfs sf;
 
-	ret = statmount(root_id, 0, mask, &sm, sizeof(sm), 0);
+	ret = statmount(root_id, 0, 0, mask, &sm, sizeof(sm), 0);
 	if (ret == -1) {
 		ksft_test_result_fail("statmount sb basic: %s\n",
 				      strerror(errno));
···
 {
 	struct statmount *sm;
 
-	sm = statmount_alloc(root_id, STATMOUNT_MNT_POINT, 0);
+	sm = statmount_alloc(root_id, 0, STATMOUNT_MNT_POINT, 0);
 	if (!sm) {
 		ksft_test_result_fail("statmount mount point: %s\n",
 				      strerror(errno));
···
 	assert(last_dir);
 	last_dir++;
 
-	sm = statmount_alloc(root_id, STATMOUNT_MNT_ROOT, 0);
+	sm = statmount_alloc(root_id, 0, STATMOUNT_MNT_ROOT, 0);
 	if (!sm) {
 		ksft_test_result_fail("statmount mount root: %s\n",
 				      strerror(errno));
···
 	const char *fs_type;
 	const char *const *s;
 
-	sm = statmount_alloc(root_id, STATMOUNT_FS_TYPE, 0);
+	sm = statmount_alloc(root_id, 0, STATMOUNT_FS_TYPE, 0);
 	if (!sm) {
 		ksft_test_result_fail("statmount fs type: %s\n",
 				      strerror(errno));
···
 	char *line = NULL;
 	size_t len = 0;
 
-	sm = statmount_alloc(root_id, STATMOUNT_MNT_BASIC | STATMOUNT_MNT_OPTS,
+	sm = statmount_alloc(root_id, 0, STATMOUNT_MNT_BASIC | STATMOUNT_MNT_OPTS,
 			     0);
 	if (!sm) {
 		ksft_test_result_fail("statmount mnt opts: %s\n",
···
 	uint32_t start, i;
 	int ret;
 
-	sm = statmount_alloc(root_id, mask, 0);
+	sm = statmount_alloc(root_id, 0, mask, 0);
 	if (!sm) {
 		ksft_test_result_fail("statmount %s: %s\n", name,
 				      strerror(errno));
···
 	exactsize = sm->size;
 	shortsize = sizeof(*sm) + i;
 
-	ret = statmount(root_id, 0, mask, sm, exactsize, 0);
+	ret = statmount(root_id, 0, 0, mask, sm, exactsize, 0);
 	if (ret == -1) {
 		ksft_test_result_fail("statmount exact size: %s\n",
 				      strerror(errno));
 		goto out;
 	}
 	errno = 0;
-	ret = statmount(root_id, 0, mask, sm, shortsize, 0);
+	ret = statmount(root_id, 0, 0, mask, sm, shortsize, 0);
 	if (ret != -1 || errno != EOVERFLOW) {
 		ksft_test_result_fail("should have failed with EOVERFLOW: %s\n",
 				      strerror(errno));
···
 	ksft_test_result_pass("listmount tree\n");
 }
 
+static void test_statmount_by_fd(void)
+{
+	struct statmount *sm = NULL;
+	char tmpdir[] = "/statmount.fd.XXXXXX";
+	const char root[] = "/test";
+	char subdir[PATH_MAX], tmproot[PATH_MAX];
+	int fd;
+
+	if (!mkdtemp(tmpdir)) {
+		ksft_perror("mkdtemp");
+		return;
+	}
+
+	if (mount("statmount.test", tmpdir, "tmpfs", 0, NULL)) {
+		ksft_perror("mount");
+		rmdir(tmpdir);
+		return;
+	}
+
+	snprintf(subdir, PATH_MAX, "%s%s", tmpdir, root);
+	snprintf(tmproot, PATH_MAX, "%s/%s", tmpdir, "chroot");
+
+	if (mkdir(subdir, 0755)) {
+		ksft_perror("mkdir");
+		goto err_tmpdir;
+	}
+
+	if (mount(subdir, subdir, NULL, MS_BIND, 0)) {
+		ksft_perror("mount");
+		goto err_subdir;
+	}
+
+	if (mkdir(tmproot, 0755)) {
+		ksft_perror("mkdir");
+		goto err_subdir;
+	}
+
+	fd = open(subdir, O_PATH);
+	if (fd < 0) {
+		ksft_perror("open");
+		goto err_tmproot;
+	}
+
+	if (chroot(tmproot)) {
+		ksft_perror("chroot");
+		goto err_fd;
+	}
+
+	sm = statmount_alloc(0, fd, STATMOUNT_MNT_ROOT | STATMOUNT_MNT_POINT, STATMOUNT_BY_FD);
+	if (!sm) {
+		ksft_test_result_fail("statmount by fd failed: %s\n", strerror(errno));
+		goto err_chroot;
+	}
+
+	if (sm->size < sizeof(*sm)) {
+		ksft_test_result_fail("unexpected size: %u < %u\n",
+				      sm->size, (uint32_t) sizeof(*sm));
+		goto err_chroot;
+	}
+
+	if (sm->mask & STATMOUNT_MNT_POINT) {
+		ksft_test_result_fail("STATMOUNT_MNT_POINT unexpectedly set in statmount\n");
+		goto err_chroot;
+	}
+
+	if (!(sm->mask & STATMOUNT_MNT_ROOT)) {
+		ksft_test_result_fail("STATMOUNT_MNT_ROOT not set in statmount\n");
+		goto err_chroot;
+	}
+
+	if (strcmp(root, sm->str + sm->mnt_root) != 0) {
+		ksft_test_result_fail("statmount returned incorrect mnt_root,"
+			"statmount mnt_root: %s != %s\n",
+			sm->str + sm->mnt_root, root);
+		goto err_chroot;
+	}
+
+	if (chroot(".")) {
+		ksft_perror("chroot");
+		goto out;
+	}
+
+	free(sm);
+	sm = statmount_alloc(0, fd, STATMOUNT_MNT_ROOT | STATMOUNT_MNT_POINT, STATMOUNT_BY_FD);
+	if (!sm) {
+		ksft_test_result_fail("statmount by fd failed: %s\n", strerror(errno));
+		goto err_fd;
+	}
+
+	if (sm->size < sizeof(*sm)) {
+		ksft_test_result_fail("unexpected size: %u < %u\n",
+				      sm->size, (uint32_t) sizeof(*sm));
+		goto out;
+	}
+
+	if (!(sm->mask & STATMOUNT_MNT_POINT)) {
+		ksft_test_result_fail("STATMOUNT_MNT_POINT not set in statmount\n");
+		goto out;
+	}
+
+	if (!(sm->mask & STATMOUNT_MNT_ROOT)) {
+		ksft_test_result_fail("STATMOUNT_MNT_ROOT not set in statmount\n");
+		goto out;
+	}
+
+	if (strcmp(subdir, sm->str + sm->mnt_point) != 0) {
+		ksft_test_result_fail("statmount returned incorrect mnt_point,"
+			"statmount mnt_point: %s != %s\n", sm->str + sm->mnt_point, subdir);
+		goto out;
+	}
+
+	if (strcmp(root, sm->str + sm->mnt_root) != 0) {
+		ksft_test_result_fail("statmount returned incorrect mnt_root,"
+			"statmount mnt_root: %s != %s\n", sm->str + sm->mnt_root, root);
+		goto out;
+	}
+
+	ksft_test_result_pass("statmount by fd\n");
+	goto out;
+err_chroot:
+	chroot(".");
+out:
+	free(sm);
+err_fd:
+	close(fd);
+err_tmproot:
+	rmdir(tmproot);
+err_subdir:
+	umount2(subdir, MNT_DETACH);
+	rmdir(subdir);
+err_tmpdir:
+	umount2(tmpdir, MNT_DETACH);
+	rmdir(tmpdir);
+}
+
+static void test_statmount_by_fd_unmounted(void)
+{
+	const char root[] = "/test.unmounted";
+	char tmpdir[] = "/statmount.fd.XXXXXX";
+	char subdir[PATH_MAX];
+	int fd;
+	struct statmount *sm = NULL;
+
+	if (!mkdtemp(tmpdir)) {
+		ksft_perror("mkdtemp");
+		return;
+	}
+
+	if (mount("statmount.test", tmpdir, "tmpfs", 0, NULL)) {
+		ksft_perror("mount");
+		rmdir(tmpdir);
+		return;
+	}
+
+	snprintf(subdir, PATH_MAX, "%s%s", tmpdir, root);
+
+	if (mkdir(subdir, 0755)) {
+		ksft_perror("mkdir");
+		goto err_tmpdir;
+	}
+
+	if (mount(subdir, subdir, 0, MS_BIND, NULL)) {
+		ksft_perror("mount");
+		goto err_subdir;
+	}
+
+	fd = open(subdir, O_PATH);
+	if (fd < 0) {
+		ksft_perror("open");
+		goto err_subdir;
+	}
+
+	if (umount2(tmpdir, MNT_DETACH)) {
+		ksft_perror("umount2");
+		goto err_fd;
+	}
+
+	sm = statmount_alloc(0, fd, STATMOUNT_MNT_POINT | STATMOUNT_MNT_ROOT, STATMOUNT_BY_FD);
+	if (!sm) {
+		ksft_test_result_fail("statmount by fd unmounted: %s\n",
+				      strerror(errno));
+		goto err_sm;
+	}
+
+	if (sm->size < sizeof(*sm)) {
+		ksft_test_result_fail("unexpected size: %u < %u\n",
+				      sm->size, (uint32_t) sizeof(*sm));
+		goto err_sm;
+	}
+
+	if (sm->mask & STATMOUNT_MNT_POINT) {
+		ksft_test_result_fail("STATMOUNT_MNT_POINT unexpectedly set in mask\n");
+		goto err_sm;
+	}
+
+	if (!(sm->mask & STATMOUNT_MNT_ROOT)) {
+		ksft_test_result_fail("STATMOUNT_MNT_ROOT not set in mask\n");
+		goto err_sm;
+	}
+
+	if (strcmp(sm->str + sm->mnt_root, root) != 0) {
+		ksft_test_result_fail("statmount returned incorrect mnt_root,"
+			"statmount mnt_root: %s != %s\n",
+			sm->str + sm->mnt_root, root);
+		goto err_sm;
+	}
+
+	ksft_test_result_pass("statmount by fd on unmounted mount\n");
+err_sm:
+	free(sm);
+err_fd:
+	close(fd);
+err_subdir:
+	umount2(subdir, MNT_DETACH);
+	rmdir(subdir);
+err_tmpdir:
+	umount2(tmpdir, MNT_DETACH);
+	rmdir(tmpdir);
+}
+
 #define str_off(memb) (offsetof(struct statmount, memb) / sizeof(uint32_t))
 
 int main(void)
···
 
 	ksft_print_header();
 
-	ret = statmount(0, 0, 0, NULL, 0, 0);
+	ret = statmount(0, 0, 0, 0, NULL, 0, 0);
 	assert(ret == -1);
 	if (errno == ENOSYS)
 		ksft_exit_skip("statmount() syscall not supported\n");
 
 	setup_namespace();
 
-	ksft_set_plan(15);
+	ksft_set_plan(17);
 	test_listmount_empty_root();
 	test_statmount_zero_mask();
 	test_statmount_mnt_basic();
···
 	test_statmount_string(all_mask, str_off(fs_type), "fs type & all");
 
 	test_listmount_tree();
+	test_statmount_by_fd_unmounted();
+	test_statmount_by_fd();
 
 
 	if (ksft_get_fail_cnt() + ksft_get_error_cnt() > 0)
+98 -3
tools/testing/selftests/filesystems/statmount/statmount_test_ns.c
···
 	if (!root_id)
 		return NSID_ERROR;
 
-	ret = statmount(root_id, 0, STATMOUNT_MNT_NS_ID, &sm, sizeof(sm), 0);
+	ret = statmount(root_id, 0, 0, STATMOUNT_MNT_NS_ID, &sm, sizeof(sm), 0);
 	if (ret == -1) {
 		ksft_print_msg("statmount mnt ns id: %s\n", strerror(errno));
 		return NSID_ERROR;
···
 	return NSID_PASS;
 }
 
+static int _test_statmount_mnt_ns_id_by_fd(void)
+{
+	struct statmount sm;
+	uint64_t mnt_ns_id;
+	int ret, fd, mounted = 1, status = NSID_ERROR;
+	char mnt[] = "/statmount.fd.XXXXXX";
+
+	ret = get_mnt_ns_id("/proc/self/ns/mnt", &mnt_ns_id);
+	if (ret != NSID_PASS)
+		return ret;
+
+	if (!mkdtemp(mnt)) {
+		ksft_print_msg("statmount by fd mnt ns id mkdtemp: %s\n", strerror(errno));
+		return NSID_ERROR;
+	}
+
+	if (mount(mnt, mnt, NULL, MS_BIND, 0)) {
+		ksft_print_msg("statmount by fd mnt ns id mount: %s\n", strerror(errno));
+		status = NSID_ERROR;
+		goto err;
+	}
+
+	fd = open(mnt, O_PATH);
+	if (fd < 0) {
+		ksft_print_msg("statmount by fd mnt ns id open: %s\n", strerror(errno));
+		goto err;
+	}
+
+	ret = statmount(0, 0, fd, STATMOUNT_MNT_NS_ID, &sm, sizeof(sm), STATMOUNT_BY_FD);
+	if (ret == -1) {
+		ksft_print_msg("statmount mnt ns id statmount: %s\n", strerror(errno));
+		status = NSID_ERROR;
+		goto out;
+	}
+
+	if (sm.size != sizeof(sm)) {
+		ksft_print_msg("unexpected size: %u != %u\n", sm.size,
+			       (uint32_t)sizeof(sm));
+		status = NSID_FAIL;
+		goto out;
+	}
+	if (sm.mask != STATMOUNT_MNT_NS_ID) {
+		ksft_print_msg("statmount mnt ns id unavailable\n");
+		status = NSID_SKIP;
+		goto out;
+	}
+
+	if (sm.mnt_ns_id != mnt_ns_id) {
+		ksft_print_msg("unexpected mnt ns ID: 0x%llx != 0x%llx\n",
+			       (unsigned long long)sm.mnt_ns_id,
+			       (unsigned long long)mnt_ns_id);
+		status = NSID_FAIL;
+		goto out;
+	}
+
+	mounted = 0;
+	if (umount2(mnt, MNT_DETACH)) {
+		ksft_print_msg("statmount by fd mnt ns id umount2: %s\n", strerror(errno));
+		goto out;
+	}
+
+	ret = statmount(0, 0, fd, STATMOUNT_MNT_NS_ID, &sm, sizeof(sm), STATMOUNT_BY_FD);
+	if (ret == -1) {
+		ksft_print_msg("statmount mnt ns id statmount: %s\n", strerror(errno));
+		status = NSID_ERROR;
+		goto out;
+	}
+
+	if (sm.size != sizeof(sm)) {
+		ksft_print_msg("unexpected size: %u != %u\n", sm.size,
+			       (uint32_t)sizeof(sm));
+		status = NSID_FAIL;
+		goto out;
+	}
+
+	if (sm.mask == STATMOUNT_MNT_NS_ID) {
+		ksft_print_msg("unexpected STATMOUNT_MNT_NS_ID in mask\n");
+		status = NSID_FAIL;
+		goto out;
+	}
+
+	status = NSID_PASS;
+out:
+	close(fd);
+	if (mounted)
+		umount2(mnt, MNT_DETACH);
+err:
+	rmdir(mnt);
+	return status;
+}
+
+
 static void test_statmount_mnt_ns_id(void)
 {
 	pid_t pid;
···
 		if (ret != NSID_PASS)
 			exit(ret);
 		ret = _test_statmount_mnt_ns_id();
+		if (ret != NSID_PASS)
+			exit(ret);
+		ret = _test_statmount_mnt_ns_id_by_fd();
 		exit(ret);
 	}
 
···
 	for (int i = 0; i < nr_mounts; i++) {
 		struct statmount sm;
 
-		ret = statmount(list[i], mnt_ns_id, STATMOUNT_MNT_NS_ID, &sm,
+		ret = statmount(list[i], mnt_ns_id, 0, STATMOUNT_MNT_NS_ID, &sm,
 				sizeof(sm), 0);
 		if (ret < 0) {
 			ksft_print_msg("statmount mnt ns id: %s\n", strerror(errno));
···
 	int ret;
 
 	ksft_print_header();
-	ret = statmount(0, 0, 0, NULL, 0, 0);
+	ret = statmount(0, 0, 0, 0, NULL, 0, 0);
 	assert(ret == -1);
 	if (errno == ENOSYS)
 		ksft_exit_skip("statmount() syscall not supported\n");
+26
tools/testing/selftests/filesystems/utils.c
···
 	return 0;
 }
 
+int enter_userns(void)
+{
+	int ret;
+	char buf[32];
+	uid_t uid = getuid();
+	gid_t gid = getgid();
+
+	ret = unshare(CLONE_NEWUSER);
+	if (ret)
+		return ret;
+
+	sprintf(buf, "0 %d 1", uid);
+	ret = write_file("/proc/self/uid_map", buf);
+	if (ret)
+		return ret;
+	ret = write_file("/proc/self/setgroups", "deny");
+	if (ret)
+		return ret;
+	sprintf(buf, "0 %d 1", gid);
+	ret = write_file("/proc/self/gid_map", buf);
+	if (ret)
+		return ret;
+
+	return 0;
+}
+
 /* caps_down - lower all effective caps */
 int caps_down(void)
 {
+1
tools/testing/selftests/filesystems/utils.h
···
 
 extern bool switch_ids(uid_t uid, gid_t gid);
 extern int setup_userns(void);
+extern int enter_userns(void);
 
 static inline bool switch_userns(int fd, uid_t uid, gid_t gid, bool drop_caps)
 {