mount: always duplicate mount

In the OPEN_TREE_NAMESPACE path, vfs_open_tree() resolves a path via
filename_lookup() without holding namespace_lock. Between the lookup
and create_new_namespace() acquiring namespace_lock via
LOCK_MOUNT_EXACT_COPY(), another thread can unmount the mount, setting
mnt->mnt_ns to NULL.

When create_new_namespace() then checks !mnt->mnt_ns, it incorrectly
takes the swap-and-mntget path that was designed for fsmount()'s
detached mounts. This reuses a mount whose mnt_mp_list is in an
inconsistent state from the concurrent unmount, causing a general
protection fault in __umount_mnt() -> hlist_del_init(&mnt->mnt_mp_list)
during namespace teardown.
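
The ambiguity behind the race can be replayed in a small userspace
model. This is only an illustrative sketch: every model_* name and the
mp_list_stale field are hypothetical stand-ins, not kernel structures
or functions. It shows that a check on mnt_ns alone cannot tell a
mount unmounted inside the race window from a freshly created
fsmount() mount.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; not the kernel's structures. */
struct model_ns { int id; };

struct model_mount {
	struct model_ns *mnt_ns;	/* NULL: not attached to a namespace */
	bool mp_list_stale;		/* set once a concurrent umount ran */
};

/* A mount freshly created for fsmount(): never attached anywhere. */
static struct model_mount model_fsmount_mount(void)
{
	return (struct model_mount){ .mnt_ns = NULL, .mp_list_stale = false };
}

/* The lookup resolved an attached mount; no lock is held yet. */
static struct model_mount model_lookup(struct model_ns *ns)
{
	assert(ns);
	return (struct model_mount){ .mnt_ns = ns, .mp_list_stale = false };
}

/* A concurrent umount running inside the race window. */
static void model_umount(struct model_mount *mnt)
{
	assert(mnt);
	mnt->mnt_ns = NULL;
	mnt->mp_list_stale = true;
}

/* The removed check: "no namespace" was read as "detached mount". */
static bool model_old_takes_reuse_path(const struct model_mount *mnt)
{
	return !mnt->mnt_ns;
}
```

Calling model_umount() on the result of model_lookup() makes
model_old_takes_reuse_path() return true for it, exactly as it does
for model_fsmount_mount(), although only the former carries stale
list state.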

Remove the !mnt->mnt_ns special case entirely. Instead, always
duplicate the mount:

- For OPEN_TREE_NAMESPACE, use __do_loopback(), which will properly
  clone the mount or reject it via may_copy_tree() if it was
  unmounted in the race window.

- For fsmount(), use clone_mnt() directly (via the new MOUNT_COPY_NEW
  flag), since the mount is freshly created by vfs_create_mount() and
  not in any namespace, so __do_loopback()'s IS_MNT_UNBINDABLE(),
  may_copy_tree(), and __has_locked_children() checks don't apply.
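
The resulting rule (never reuse the source, always copy, and validate
only on the lookup-based path) can be sketched in userspace as
follows. All model_* names are hypothetical, and the eligibility check
is reduced to a single namespace test, which is an assumption made for
illustration; the real may_copy_tree() checks more than that.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; not the kernel's API. */
struct model_ns { int id; };

struct model_mount {
	struct model_ns *mnt_ns;	/* NULL: not attached to a namespace */
	int generation;			/* bumped on each copy, illustration */
};

enum model_source {
	MODEL_FROM_OPEN_TREE,	/* came from a lookup; may be unmounted */
	MODEL_FROM_FSMOUNT,	/* freshly created, never in a namespace */
};

/* Simplified eligibility check: an unmounted source is rejected. */
static bool model_may_copy(const struct model_mount *src)
{
	return src->mnt_ns != NULL;
}

/*
 * Always produce a private copy in *dst, or fail. Returns 0 on
 * success and -1 if the source is not eligible. The source itself
 * is never handed back to the caller.
 */
static int model_duplicate(const struct model_mount *src,
			   enum model_source from,
			   struct model_mount *dst)
{
	assert(src && dst);

	/* Only the lookup-based path needs the eligibility check. */
	if (from == MODEL_FROM_OPEN_TREE && !model_may_copy(src))
		return -1;

	dst->mnt_ns = NULL;		/* the copy starts out unattached */
	dst->generation = src->generation + 1;
	return 0;
}
```

In this model a source that lost its namespace in the race window
fails with -1 on the MODEL_FROM_OPEN_TREE path, while a fresh source
on the MODEL_FROM_FSMOUNT path is copied without any check.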

Reported-by: syzbot+e4470cc28308f2081ec8@syzkaller.appspotmail.com
Signed-off-by: Christian Brauner <brauner@kernel.org>