commits
Pull jfs updates from Dave Kleikamp:
"More robust data integrity checking and some fixes"
* tag 'jfs-7.1' of github.com:kleikamp/linux-shaggy:
jfs: avoid -Wtautological-constant-out-of-range-compare warning again
JFS: always load filesystem UUID during mount
jfs: hold LOG_LOCK on umount to avoid null-ptr-deref
jfs: Set the lbmDone flag at the end of lbmIODone
jfs: fix corrupted list in dbUpdatePMap
jfs: add dmapctl integrity check to prevent invalid operations
jfs: add dtpage integrity check to prevent index/pointer overflows
jfs: add dtroot integrity check to prevent index out-of-bounds
Pull ext2, udf, quota updates from Jan Kara:
- A fix for a race in quota code that can expose ocfs2 to
use-after-free issues
- UDF fix to avoid memory corruption in face of corrupted format
- Couple of ext2 fixes for better handling of fs corruption
- Some more various code cleanups in UDF & ext2
* tag 'fs_for_v7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
ext2: reject inodes with zero i_nlink and valid mode in ext2_iget()
ext2: use get_random_u32() where appropriate
quota: Fix race of dquot_scan_active() with quota deactivation
udf: fix partition descriptor append bookkeeping
ext2: avoid drop_nlink() during unlink of zero-nlink inode in ext2_unlink()
ext2: guard reservation window dump with EXT2FS_DEBUG
ext2: replace BUG_ON with WARN_ON_ONCE in ext2_get_blocks
ext2: remove stale TODO about kmap
fs: udf: avoid assignment in condition when selecting allocation goal
The comparison of an __s8 value against DTPAGEMAXSLOT is still trivially
true, causing a harmless (default disabled) warning with clang:
fs/jfs/jfs_dtree.c:4419:25: error: result of comparison of constant 128 with expression of type 's8' (aka 'signed char') is always false [-Werror,-Wtautological-constant-out-of-range-compare]
4419 | p->header.freelist >= DTPAGEMAXSLOT)) {
| ~~~~~~~~~~~~~~~~~~ ^ ~~~~~~~~~~~~~
I previously worked around two of these in commit 7833570dae83 ("jfs: avoid
-Wtautological-constant-out-of-range-compare warning"), but now a new one has
come up, so address the same way by dropping the redundant range check.
Fixes: 119e448bb50a ("jfs: add dtpage integrity check to prevent index/pointer overflows")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>
Pull fsnotify updates from Jan Kara:
"A couple of small fsnotify fixes and cleanups"
* tag 'fsnotify_for_v7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
fanotify: replace deprecated strcpy in fanotify_info_copy_{name,name2}
fsnotify: inotify: pass mark connector to fsnotify_recalc_mask()
fanotify: call fanotify_events_supported() before path_permission() and security_path_notify()
fanotify: avoid/silence premature LSM capability checks
inotify: fix watch count leak when fsnotify_add_inode_mark_locked() fails
ext2_iget() already rejects inodes with i_nlink == 0 when i_mode is
zero or i_dtime is set, treating them as deleted. However, the case of
i_nlink == 0 with a non-zero mode and zero dtime slips through. Since
ext2 has no orphan list, such a combination can only result from
filesystem corruption - a legitimate inode deletion always sets either
i_dtime or clears i_mode before freeing the inode.
A crafted image can exploit this gap to present such an inode to the
VFS, which then triggers WARN_ON inside drop_nlink() (fs/inode.c) via
ext2_unlink(), ext2_rename() and ext2_rmdir():
WARNING: CPU: 3 PID: 609 at fs/inode.c:336 drop_nlink+0xad/0xd0 fs/inode.c:336
CPU: 3 UID: 0 PID: 609 Comm: syz-executor Not tainted 6.12.77+ #1
Call Trace:
<TASK>
inode_dec_link_count include/linux/fs.h:2518 [inline]
ext2_unlink+0x26c/0x300 fs/ext2/namei.c:295
vfs_unlink+0x2fc/0x9b0 fs/namei.c:4477
do_unlinkat+0x53e/0x730 fs/namei.c:4541
__x64_sys_unlink+0xc6/0x110 fs/namei.c:4587
do_syscall_64+0xf5/0x220 arch/x86/entry/common.c:78
entry_SYSCALL_64_after_hwframe+0x77/0x7f
</TASK>
WARNING: CPU: 0 PID: 646 at fs/inode.c:336 drop_nlink+0xad/0xd0 fs/inode.c:336
CPU: 0 UID: 0 PID: 646 Comm: syz.0.17 Not tainted 6.12.77+ #1
Call Trace:
<TASK>
inode_dec_link_count include/linux/fs.h:2518 [inline]
ext2_rename+0x35e/0x850 fs/ext2/namei.c:374
vfs_rename+0xf2f/0x2060 fs/namei.c:5021
do_renameat2+0xbe2/0xd50 fs/namei.c:5178
__x64_sys_rename+0x7e/0xa0 fs/namei.c:5223
do_syscall_64+0xf5/0x220 arch/x86/entry/common.c:78
entry_SYSCALL_64_after_hwframe+0x77/0x7f
</TASK>
WARNING: CPU: 0 PID: 634 at fs/inode.c:336 drop_nlink+0xad/0xd0 fs/inode.c:336
CPU: 0 UID: 0 PID: 634 Comm: syz-executor Not tainted 6.12.77+ #1
Call Trace:
<TASK>
inode_dec_link_count include/linux/fs.h:2518 [inline]
ext2_rmdir+0xca/0x110 fs/ext2/namei.c:311
vfs_rmdir+0x204/0x690 fs/namei.c:4348
do_rmdir+0x372/0x3e0 fs/namei.c:4407
__x64_sys_unlinkat+0xf0/0x130 fs/namei.c:4577
do_syscall_64+0xf5/0x220 arch/x86/entry/common.c:78
entry_SYSCALL_64_after_hwframe+0x77/0x7f
</TASK>
Extend the existing i_nlink == 0 check to also catch this case,
reporting the corruption via ext2_error() and returning -EFSCORRUPTED.
This rejects the inode at load time and prevents it from reaching any
of the namei.c paths.
Found by Linux Verification Center (linuxtesting.org) with Syzkaller.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Cc: stable@vger.kernel.org
Signed-off-by: Vasiliy Kovalev <kovalev@altlinux.org>
Link: https://patch.msgid.link/20260404152011.2590197-1-kovalev@altlinux.org
Signed-off-by: Jan Kara <jack@suse.cz>
The filesystem UUID was only being loaded into super_block sb when an
external journal device was in use. When mounting without an external
journal, the UUID remained unset, which prevented the computation of
a filesystem ID (fsid), which could be confirmed via `stat -f -c "%i"`
and thus user space could not use fanotify correctly.
A missing filesystem ID causes fanotify to return ENODEV when marking
the filesystem for events like FAN_CREATE, FAN_DELETE, FAN_MOVED_TO,
and FAN_MOVED_FROM. As a result, applications relying on fanotify
could not monitor these events on JFS filesystems without an external
journal.
Moved the UUID initialization so it is always performed during mount,
ensuring the superblock UUID is consistently available.
Signed-off-by: João Paredes <joaommp@yahoo.com>
Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>
Pull smb server updates from Steve French:
- smbdirect double free fixes
- Add some smbdirect logging
- Minor cleanup in crypto, and smbdirect and in IPC handling
- Minor cleanup to move header info to common FSCC code
- Fix crypt message use after free
- Fix memory leak in session setup
- Fix for DACL parsing
- Fix EA name length validation
- Reconnect fix
- Fix use after free in close
* tag 'v7.1-rc-part1-ksmbd-srv-fixes' of git://git.samba.org/ksmbd:
smb: smbdirect: add some logging to SMBDIRECT_CHECK_STATUS_{WARN,DISCONNECT}()
smb: smbdirect: introduce smbdirect_socket.logging infrastructure
smb: smbdirect: let smbdirect.h include #include <linux/types.h>
smb: server: avoid double-free in smb_direct_free_sendmsg after smb_direct_flush_send_list()
smb: client: avoid double-free in smbd_free_send_io() after smbd_send_batch_flush()
ksmbd: fix use-after-free from async crypto on Qualcomm crypto engine
ksmbd: fix mechToken leak when SPNEGO decode fails after token alloc
ksmbd: require 3 sub-authorities before reading sub_auth[2]
ksmbd: validate EaNameLength in smb2_get_ea()
ksmbd: Remove unnecessary selection of CRYPTO_ECB
ksmbd: validate owner of durable handle on reconnect
ksmbd: fix use-after-free in __ksmbd_close_fd() via durable scavenger
ksmbd: ipc: use kzalloc_flex and __counted_by
smb: move filesystem_vol_info into common/fscc.h
smb: move file_basic_info into common/fscc.h
smb: move some definitions from common/smb2pdu.h into common/fscc.h
strcpy() has been deprecated [1] because it performs no bounds checking
on the destination buffer, which can lead to buffer overflows. Replace
it with the safer strscpy().
Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strcpy [1]
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Link: https://patch.msgid.link/20260321210544.519259-4-thorsten.blum@linux.dev
Signed-off-by: Jan Kara <jack@suse.cz>
Use the typed random integer helpers instead of
get_random_bytes() when filling a single integer variable.
The helpers return the value directly, require no pointer
or size argument, and better express intent.
Signed-off-by: David Carlier <devnexen@gmail.com>
Link: https://patch.msgid.link/20260405154717.4705-1-devnexen@gmail.com
Signed-off-by: Jan Kara <jack@suse.cz>
write_special_inodes() function iterate through the log->sb_list and
access the sbi fields, which can be set to NULL concurrently by umount.
Fix concurrency issue by holding LOG_LOCK and checking for NULL.
Reported-by: syzbot+e14b1036481911ae4d77@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=e14b1036481911ae4d77
Signed-off-by: Helen Koike <koike@igalia.com>
Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>
Pull gfs2 updates from Andreas Gruenbacher:
- Fix possible data loss during inode evict
- Fix a race during bufdata allocation
- More careful cleaning up during a withdraw
- Prevent excessive log flushing under memory pressure
- Various other minor fixes and cleanups
* tag 'gfs2-for-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2:
gfs2: prevent NULL pointer dereference during unmount
gfs2: hide error messages after withdraw
gfs2: wait for withdraw earlier during unmount
gfs2: inode directory consistency checks
gfs2: gfs2_log_flush withdraw fixes
gfs2: add some missing log locking
gfs2: fix address space truncation during withdraw
gfs2: drain ail under sd_log_flush_lock
gfs2: bufdata allocation race
gfs2: Remove trans_drain code duplication
gfs2: Move gfs2_remove_from_journal to log.c
gfs2: Get rid of gfs2_log_[un]lock helpers
gfs2: less aggressive low-memory log flushing
gfs2: Fix data loss during inode evict
gfs2: minor evict_[un]linked_inode cleanup
gfs2: Avoid unnecessary transactions in evict_linked_inode
gfs2: Remove unnecessary check in gfs2_evict_inode
gfs2: Call unlock_new_inode before d_instantiate
This should make it easier to analyze any possible problems.
Cc: Steve French <smfrench@gmail.com>
Cc: Tom Talpey <tom@talpey.com>
Cc: Long Li <longli@microsoft.com>
Cc: Namjae Jeon <linkinjeon@kernel.org>
Cc: linux-cifs@vger.kernel.org
Cc: samba-technical@lists.samba.org
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
fsnotify_recalc_mask() expects a plain struct fsnotify_mark_connector *,
but inode->i_fsnotify_marks is an __rcu pointer. Use fsn_mark->connector
instead to avoid sparse "different address spaces" warnings.
Signed-off-by: Sun Jian <sun.jian.kdev@gmail.com>
Link: https://patch.msgid.link/20260214051217.1381363-1-sun.jian.kdev@gmail.com
Signed-off-by: Jan Kara <jack@suse.cz>
dquot_scan_active() can race with quota deactivation in
quota_release_workfn() like:
CPU0 (quota_release_workfn) CPU1 (dquot_scan_active)
============================== ==============================
spin_lock(&dq_list_lock);
list_replace_init(
&releasing_dquots, &rls_head);
/* dquot X on rls_head,
dq_count == 0,
DQ_ACTIVE_B still set */
spin_unlock(&dq_list_lock);
synchronize_srcu(&dquot_srcu);
spin_lock(&dq_list_lock);
list_for_each_entry(dquot,
&inuse_list, dq_inuse) {
/* finds dquot X */
dquot_active(X) -> true
atomic_inc(&X->dq_count);
}
spin_unlock(&dq_list_lock);
spin_lock(&dq_list_lock);
dquot = list_first_entry(&rls_head);
WARN_ON_ONCE(atomic_read(&dquot->dq_count));
The problem is not only a cosmetic one as under memory pressure the
caller of dquot_scan_active() can end up working on freed dquot.
Fix the problem by making sure the dquot is removed from releasing list
when we acquire a reference to it.
Fixes: 869b6ea1609f ("quota: Fix slow quotaoff")
Reported-by: Sam Sun <samsun1006219@gmail.com>
Link: https://lore.kernel.org/all/CAEkJfYPTt3uP1vAYnQ5V2ZWn5O9PLhhGi5HbOcAzyP9vbXyjeg@mail.gmail.com
Signed-off-by: Jan Kara <jack@suse.cz>
In lbmRead(), the I/O event waited for by wait_event() finishes before
it goes to sleep, and the lbmIODone() prematurely sets the flag to
lbmDONE, thus ending the wait. This causes wait_event() to return before
lbmREAD is cleared (because lbmDONE was set first), the premature return
of wait_event() leads to the release of lbuf before lbmIODone() returns,
thus triggering the use-after-free vulnerability reported in [1].
Moving the operation of setting the lbmDONE flag to after clearing lbmREAD
in lbmIODone() avoids the use-after-free vulnerability reported in [1].
[1]
BUG: KASAN: slab-use-after-free in rt_spin_lock+0x88/0x3e0 kernel/locking/spinlock_rt.c:56
Call Trace:
blk_update_request+0x57e/0xe60 block/blk-mq.c:1007
blk_mq_end_request+0x3e/0x70 block/blk-mq.c:1169
blk_complete_reqs block/blk-mq.c:1244 [inline]
blk_done_softirq+0x10a/0x160 block/blk-mq.c:1249
Allocated by task 6101:
lbmLogInit fs/jfs/jfs_logmgr.c:1821 [inline]
lmLogInit+0x3d0/0x19e0 fs/jfs/jfs_logmgr.c:1269
open_inline_log fs/jfs/jfs_logmgr.c:1175 [inline]
lmLogOpen+0x4e1/0xfa0 fs/jfs/jfs_logmgr.c:1069
jfs_mount_rw+0xe9/0x670 fs/jfs/jfs_mount.c:257
jfs_fill_super+0x754/0xd80 fs/jfs/super.c:532
Freed by task 6101:
kfree+0x1bd/0x900 mm/slub.c:6876
lbmLogShutdown fs/jfs/jfs_logmgr.c:1864 [inline]
lmLogInit+0x1137/0x19e0 fs/jfs/jfs_logmgr.c:1415
open_inline_log fs/jfs/jfs_logmgr.c:1175 [inline]
lmLogOpen+0x4e1/0xfa0 fs/jfs/jfs_logmgr.c:1069
jfs_mount_rw+0xe9/0x670 fs/jfs/jfs_mount.c:257
jfs_fill_super+0x754/0xd80 fs/jfs/super.c:532
Reported-by: syzbot+1d38eedcb25a3b5686a7@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=1d38eedcb25a3b5686a7
Signed-off-by: Edward Adam Davis <eadavis@qq.com>
Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>
Pull fuse update from Miklos Szeredi:
- Fix possible hang in virtiofs when cleaning up a DAX inode (Sergio
Lopez)
- Fix a warning when using large folio as the source of SPLICE_F_MOVE
on the fuse device (Bernd)
- Fix uninitialized value found by KMSAN (Luis Henriques)
- Fix synchronous INIT hang (Miklos)
- Fix race between inode initialization and FUSE_NOTIFY_INVAL_INODE
(Horst)
- Allow fd to be closed after passing fuse device fd to
fsconfig(..., "fd", ...) (Miklos)
- Support FSCONFIG_SET_FD for "fd" option (Miklos)
- Misc fixes and cleanups
* tag 'fuse-update-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: (21 commits)
fuse: support FSCONFIG_SET_FD for "fd" option
fuse: clean up device cloning
fuse: don't require /dev/fuse fd to be kept open during mount
fuse: add refcount to fuse_dev
fuse: create fuse_dev on /dev/fuse open instead of mount
fuse: check connection state on notification
fuse: fuse_dev_ioctl_clone() should wait for device file to be initialized
fuse: fix inode initialization race
fuse: abort on fatal signal during sync init
fuse: fix uninit-value in fuse_dentry_revalidate()
fuse: use offset_in_page() for page offset calculations
fuse: use DIV_ROUND_UP() for page count calculations
fuse: simplify logic in fuse_notify_store() and fuse_retrieve()
fuse: validate outarg offset and size in notify store/retrieve
fuse: Check for large folio with SPLICE_F_MOVE
fuse: quiet down complaints in fuse_conn_limit_write
fuse: drop unnecessary argument from fuse_lookup_init()
fuse: fix premature writetrhough request for large folio
fuse: refactor duplicate queue teardown operation
virtiofs: add FUSE protocol validation
...
When flushing out outstanding glock work during an unmount, gfs2_log_flush()
can be called when sdp->sd_jdesc has already been deallocated and sdp->sd_jdesc
is NULL. Commit 35264909e9d1 ("gfs2: Fix NULL pointer dereference in
gfs2_log_flush") added a check for that to gfs2_log_flush() itself, but it
missed the sdp->sd_jdesc dereference in gfs2_log_release(). Fix that.
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <error27@gmail.com>
Closes: https://lore.kernel.org/r/202604071139.HNJiCaAi-lkp@intel.com/
Fixes: 35264909e9d1 ("gfs2: Fix NULL pointer dereference in gfs2_log_flush")
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
This will be used by client and server in order to keep controlling
the logging when we move to shared functions.
Cc: Steve French <smfrench@gmail.com>
Cc: Tom Talpey <tom@talpey.com>
Cc: Long Li <longli@microsoft.com>
Cc: Namjae Jeon <linkinjeon@kernel.org>
Cc: linux-cifs@vger.kernel.org
Cc: samba-technical@lists.samba.org
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
The latter trigger LSM (e.g. SELinux) checks, which will log a denial
when permission is denied, so it's better to do them after validity
checks to avoid logging a denial when the operation would fail anyway.
Fixes: 0b3b094ac9a7 ("fanotify: Disallow permission events for proc filesystem")
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Paul Moore <paul@paul-moore.com>
Link: https://patch.msgid.link/20260216150625.793013-3-omosnace@redhat.com
Signed-off-by: Jan Kara <jack@suse.cz>
Mounting a crafted UDF image with repeated partition descriptors can
trigger a heap out-of-bounds write in part_descs_loc[].
handle_partition_descriptor() deduplicates entries by partition number,
but appended slots never record partnum. As a result duplicate
Partition Descriptors are appended repeatedly and num_part_descs keeps
growing.
Once the table is full, the growth path still sizes the allocation from
partnum even though inserts are indexed by num_part_descs. If partnum is
already aligned to PART_DESC_ALLOC_STEP, ALIGN(partnum, step) can keep
the old capacity and the next append writes past the end of the table.
Store partnum in the appended slot and size growth from the next append
count so deduplication and capacity tracking follow the same model.
Fixes: ee4af50ca94f ("udf: Fix mounting of Win7 created UDF filesystems")
Cc: stable@vger.kernel.org
Signed-off-by: Seohyeon Maeng <bioloidgp@gmail.com>
Link: https://patch.msgid.link/20260310081652.21220-1-bioloidgp@gmail.com
Signed-off-by: Jan Kara <jack@suse.cz>
This patch resolves the "list_add corruption. next is NULL" Oops
reported by syzkaller in dbUpdatePMap(). The root cause is uninitialized
synclist nodes in struct metapage and struct TxBlock, plus improper list
node removal using list_del() (which leaves nodes in an invalid state).
This fixes the following Oops reported by syzkaller.
list_add corruption. next is NULL.
------------[ cut here ]------------
kernel BUG at lib/list_debug.c:28!
Oops: invalid opcode: 0000 [#1] SMP KASAN PTI
CPU: 1 UID: 0 PID: 122 Comm: jfsCommit Not tainted syzkaller #0
PREEMPT_{RT,(full)}
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 10/02/2025
RIP: 0010:__list_add_valid_or_report+0xc3/0x130 lib/list_debug.c:27
Code: 4c 89 f2 48 89 d9 e8 0c 88 a4 fc 90 0f 0b 48 c7 c7 20 de 3d 8b e8
fd 87 a4 fc 90 0f 0b 48 c7 c7 c0 de 3d 8b e8 ee 87 a4 fc 90 <0f> 0b 48
89 df e8 13 c3 7d fd 42 80 7c 2d 00 00 74 08 4c 89 e7 e8
RSP: 0018:ffffc9000395fa20 EFLAGS: 00010246
RAX: 0000000000000022 RBX: 0000000000000000 RCX: 270c5dfadb559700
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: 00000000000f0000 R08: 0000000000000000 R09: 0000000000000000
R10: dffffc0000000000 R11: fffff5200072bee9 R12: 0000000000000000
R13: dffffc0000000000 R14: 0000000000000004 R15: 1ffff92000632266
FS: 0000000000000000(0000) GS:ffff888126ef9000(0000)
knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000056341fdb86c0 CR3: 0000000040a18000 CR4: 00000000003526f0
Call Trace:
<TASK>
__list_add_valid include/linux/list.h:96 [inline]
__list_add include/linux/list.h:158 [inline]
list_add include/linux/list.h:177 [inline]
dbUpdatePMap+0x7e4/0xeb0 fs/jfs/jfs_dmap.c:577
txAllocPMap+0x57d/0x6b0 fs/jfs/jfs_txnmgr.c:2426
txUpdateMap+0x81e/0x9c0 fs/jfs/jfs_txnmgr.c:2364
txLazyCommit fs/jfs/jfs_txnmgr.c:2665 [inline]
jfs_lazycommit+0x3f1/0xa10 fs/jfs/jfs_txnmgr.c:2734
kthread+0x711/0x8a0 kernel/kthread.c:463
ret_from_fork+0x4bc/0x870 arch/x86/kernel/process.c:158
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
</TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
Reported-by: syzbot+4d0a0feb49c5138cac46@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=4d0a0feb49c5138cac46
Tested-by: syzbot+4d0a0feb49c5138cac46@syzkaller.appspotmail.com
Signed-off-by: Yun Zhou <yun.zhou@windriver.com>
Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>
Pull RTLA updates from Steven Rostedt:
- Simplify option parsing
Auto-generate getopt_long() optstring for short options from long
options array, avoiding the need to specify it manually and reducing
the surface for mistakes.
- Add unit tests
Implement unit tests (make unit-tests) using libcheck, next to
existing runtime tests (make check). Currently, three functions from
utils.c are tested.
- Add --stack-format option
In addition to stopping stack pointer decoding (with -s/--stack
option) on first unresolvable pointer, allow also skipping
unresolvable pointers and displaying everything, configurable with a
new option.
- Unify number of CPUs into one global variable
Use one global variable, nr_cpus, to store the number of CPUs instead
of retrieving it and passing it at multiple places.
- Fix behavior in various corner cases
Make RTLA behave correctly in several corner cases: memory allocation
failure, invalid value read from kernel side, thread creation
failure, malformed time value input, and read/write failure or
interruption by signal.
- Improve string handling
Simplify several places in the code that handle strings, including
parsing of action arguments. A few new helper functions and variables
are added for that purpose.
- Get rid of magic numbers
Few places handling paths use a magic number of 1024. Replace it with
MAX_PATH and ARRAY_SIZE() macro.
- Unify threshold handling
Code that handles response to latency threshold is duplicated between
tools, which has led to bugs in the past. Unify it into a new helper
as much as possible.
- Fix segfault on SIGINT during cleanup
The SIGINT handler touches dynamically allocated memory. Detach it
before freeing it during cleanup to prevent segmentation fault and
discarding of output buffers. Also, properly document SIGINT handling
while at it.
* tag 'trace-rtla-v7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (28 commits)
Documentation/rtla: Document SIGINT behavior
rtla: Fix segfault on multiple SIGINTs
rtla/utils: Fix loop condition in PID validation
rtla/utils: Fix resource leak in set_comm_sched_attr()
rtla/trace: Fix I/O handling in save_trace_to_file()
rtla/trace: Fix write loop in trace_event_save_hist()
rtla/timerlat: Simplify RTLA_NO_BPF environment variable check
rtla: Use str_has_prefix() for option prefix check
rtla: Enforce exact match for time unit suffixes
rtla: Use str_has_prefix() for prefix checks
rtla: Add str_has_prefix() helper function
rtla: Handle pthread_create() failure properly
rtla/timerlat: Add bounds check for softirq vector
rtla: Simplify code by caching string lengths
rtla: Replace magic number with MAX_PATH
rtla: Introduce common_threshold_handler() helper
rtla/actions: Simplify argument parsing
rtla: Use strdup() to simplify code
rtla: Exit on memory allocation failures during initialization
tools/rtla: Remove unneeded nr_cpus from for_each_monitored_cpu
...
This is not only cleaner to use in userspace (no need to sprintf the fd to
a string) but also allows userspace to detect that the devfd can be closed
after the fsconfig call.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
In gfs2_evict_inode(), don't issue error messages once a withdraw has already
occurred.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
This will make it easier to use.
Cc: Steve French <smfrench@gmail.com>
Cc: Tom Talpey <tom@talpey.com>
Cc: Long Li <longli@microsoft.com>
Cc: Namjae Jeon <linkinjeon@kernel.org>
Cc: linux-cifs@vger.kernel.org
Cc: samba-technical@lists.samba.org
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Make sure calling capable()/ns_capable() actually leads to access denied
when false is returned, because these functions emit an audit record
when a Linux Security Module denies the capability, which makes it
difficult to avoid allowing/silencing unnecessary permissions in
security policies (namely with SELinux).
Where the return value just used to set a flag, use the non-auditing
ns_capable_noaudit() instead.
Fixes: 7cea2a3c505e ("fanotify: support limited functionality for unprivileged users")
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Reviewed-by: Paul Moore <paul@paul-moore.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Link: https://patch.msgid.link/20260216150625.793013-2-omosnace@redhat.com
Signed-off-by: Jan Kara <jack@suse.cz>
ext2_unlink() calls inode_dec_link_count() unconditionally, which
invokes drop_nlink(). If the inode was loaded from a corrupted disk
image with i_links_count == 0, drop_nlink()
triggers WARN_ON(inode->i_nlink == 0)
Follow the ext4 pattern from __ext4_unlink(): check i_nlink before
decrementing. If already zero, skip the decrement.
Signed-off-by: Ziyi Guo <n7l8m4@u.northwestern.edu>
Link: https://patch.msgid.link/20260211022052.973114-1-n7l8m4@u.northwestern.edu
Signed-off-by: Jan Kara <jack@suse.cz>
Add check_dmapctl() to validate dmapctl structure integrity, focusing on
preventing invalid operations caused by on-disk corruption.
Key checks:
- nleafs bounded by [0, LPERCTL] (maximum leaf nodes per dmapctl).
- l2nleafs bounded by [0, L2LPERCTL] and consistent with nleafs
(nleafs must be 2^l2nleafs).
- leafidx must be exactly CTLLEAFIND (expected leaf index position).
- height bounded by [0, L2LPERCTL >> 1] (valid tree height range).
- budmin validity: NOFREE only if nleafs=0; otherwise >= BUDMIN.
- Leaf nodes fit within stree array (leafidx + nleafs <= CTLTREESIZE).
- Leaf node values are either non-negative or NOFREE.
Invoked in dbAllocAG(), dbFindCtl(), dbAdjCtl() and dbExtendFS() when
accessing dmapctl pages, catching corruption early before dmap operations
trigger invalid memory access or logic errors.
This fixes the following UBSAN warning.
[58245.668090][T14017] ------------[ cut here ]------------
[58245.668103][T14017] UBSAN: shift-out-of-bounds in fs/jfs/jfs_dmap.c:2641:11
[58245.668119][T14017] shift exponent 110 is too large for 32-bit type 'int'
[58245.668137][T14017] CPU: 0 UID: 0 PID: 14017 Comm: 4c1966e88c28fa9 Tainted: G E 6.18.0-rc4-00253-g21ce5d4ba045-dirty #124 PREEMPT_{RT,(full)}
[58245.668174][T14017] Tainted: [E]=UNSIGNED_MODULE
[58245.668176][T14017] Hardware name: QEMU Ubuntu 25.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[58245.668184][T14017] Call Trace:
[58245.668200][T14017] <TASK>
[58245.668208][T14017] dump_stack_lvl+0x189/0x250
[58245.668288][T14017] ? __pfx_dump_stack_lvl+0x10/0x10
[58245.668301][T14017] ? __pfx__printk+0x10/0x10
[58245.668315][T14017] ? lock_metapage+0x303/0x400 [jfs]
[58245.668406][T14017] ubsan_epilogue+0xa/0x40
[58245.668422][T14017] __ubsan_handle_shift_out_of_bounds+0x386/0x410
[58245.668462][T14017] dbSplit+0x1f8/0x200 [jfs]
[58245.668543][T14017] dbAdjCtl+0x34c/0xa20 [jfs]
[58245.668628][T14017] dbAllocNear+0x2ee/0x3d0 [jfs]
[58245.668710][T14017] dbAlloc+0x933/0xba0 [jfs]
[58245.668797][T14017] ea_write+0x374/0xdd0 [jfs]
[58245.668888][T14017] ? __pfx_ea_write+0x10/0x10 [jfs]
[58245.668966][T14017] ? __jfs_setxattr+0x76e/0x1120 [jfs]
[58245.669046][T14017] __jfs_setxattr+0xa01/0x1120 [jfs]
[58245.669135][T14017] ? __pfx___jfs_setxattr+0x10/0x10 [jfs]
[58245.669216][T14017] ? mutex_lock_nested+0x154/0x1d0
[58245.669252][T14017] ? __jfs_xattr_set+0xb9/0x170 [jfs]
[58245.669333][T14017] __jfs_xattr_set+0xda/0x170 [jfs]
[58245.669430][T14017] ? __pfx___jfs_xattr_set+0x10/0x10 [jfs]
[58245.669509][T14017] ? xattr_full_name+0x6f/0x90
[58245.669546][T14017] ? jfs_xattr_set+0x33/0x60 [jfs]
[58245.669636][T14017] ? __pfx_jfs_xattr_set+0x10/0x10 [jfs]
[58245.669726][T14017] __vfs_setxattr+0x43c/0x480
[58245.669743][T14017] __vfs_setxattr_noperm+0x12d/0x660
[58245.669756][T14017] vfs_setxattr+0x16b/0x2f0
[58245.669768][T14017] ? __pfx_vfs_setxattr+0x10/0x10
[58245.669782][T14017] filename_setxattr+0x274/0x600
[58245.669795][T14017] ? __pfx_filename_setxattr+0x10/0x10
[58245.669806][T14017] ? getname_flags+0x1e5/0x540
[58245.669829][T14017] path_setxattrat+0x364/0x3a0
[58245.669840][T14017] ? __pfx_path_setxattrat+0x10/0x10
[58245.669859][T14017] ? __se_sys_chdir+0x1b9/0x280
[58245.669876][T14017] __x64_sys_lsetxattr+0xbf/0xe0
[58245.669888][T14017] do_syscall_64+0xfa/0xfa0
[58245.669901][T14017] ? lockdep_hardirqs_on+0x9c/0x150
[58245.669913][T14017] ? entry_SYSCALL_64_after_hwframe+0x77/0x7f
[58245.669927][T14017] ? exc_page_fault+0xab/0x100
[58245.669937][T14017] entry_SYSCALL_64_after_hwframe+0x77/0x7f
Reported-by: syzbot+4c1966e88c28fa96e053@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=4c1966e88c28fa96e053
Signed-off-by: Yun Zhou <yun.zhou@windriver.com>
Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>
Pull runtime verification updates from Steven Rostedt:
- Refactor da_monitor header to share handlers across monitor types
No functional changes, only less code duplication.
- Add Hybrid Automata model class
Add a new model class that extends deterministic automata by adding
constraints on transitions and states. Those constraints can take
into account wall-clock time and as such allow RV monitor to make
assertions on real time. Add documentation and code generation
scripts.
- Add stall monitor as hybrid automaton example
Add a monitor that triggers a violation when a task is stalling as an
example of automaton working with real time variables.
- Convert the opid monitor to a hybrid automaton
The opid monitor can be heavily simplified if written as a hybrid
automaton: instead of tracking preempt and interrupt enable/disable
events, it can just run constraints on the preemption/interrupt
states when events like wakeup and need_resched verify.
- Add support for per-object monitors in DA/HA
Allow writing deterministic and hybrid automata monitors for generic
objects (e.g. any struct), by exploiting a hash table where objects
are saved. This allows to track more than just tasks in RV. For
instance it will be used to track deadline entities in deadline
monitors.
- Add deadline tracepoints and move some deadline utilities
Prepare the ground for deadline monitors by defining events and
exporting helpers.
- Add nomiss deadline monitor
Add first example of deadline monitor asserting all entities complete
before their deadline.
- Improve rvgen error handling
Introduce AutomataError exception class and better handle expected
exceptions while showing a backtrace for unexpected ones.
- Improve python code quality in rvgen
Refactor the rvgen generation scripts to align with python best
practices: use f-strings instead of %, use len() instead of
__len__(), remove semicolons, use context managers for file
operations, fix whitespace violations, extract magic strings into
constants, remove unused imports and methods.
- Fix small bugs in rvgen
The generator scripts presented some corner case bugs: logical error
in validating what a correct dot file looks like, fix an isinstance()
check, enforce a dot file has an initial state, fix type annotations
and typos in comments.
- rvgen refactoring
Refactor automata.py to use iterator-based parsing and handle
required arguments directly in argparse.
- Allow epoll in rtapp-sleep monitor
The epoll_wait call is now rt-friendly so it should be allowed in the
sleep monitor as a valid sleep method.
* tag 'trace-rv-v7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (32 commits)
rv: Allow epoll in rtapp-sleep monitor
rv/rvgen: fix _fill_states() return type annotation
rv/rvgen: fix unbound loop variable warning
rv/rvgen: enforce presence of initial state
rv/rvgen: extract node marker string to class constant
rv/rvgen: fix isinstance check in Variable.expand()
rv/rvgen: make monitor arguments required in rvgen
rv/rvgen: remove unused __get_main_name method
rv/rvgen: remove unused sys import from dot2c
rv/rvgen: refactor automata.py to use iterator-based parsing
rv/rvgen: use class constant for init marker
rv/rvgen: fix DOT file validation logic error
rv/rvgen: fix PEP 8 whitespace violations
rv/rvgen: fix typos in automata and generator docstring and comments
rv/rvgen: use context managers for file operations
rv/rvgen: remove unnecessary semicolons
rv/rvgen: replace __len__() calls with len()
rv/rvgen: replace % string formatting with f-strings
rv/rvgen: remove bare except clauses in generator
rv/rvgen: introduce AutomataError exception class
...
The behavior of RTLA on receiving SIGINT is currently undocumented.
Describe it in RTLA's common appendix that appears in man pages for all
RTLA tools to avoid confusion.
Suggested-by: Attila Fazekas <afazekas@redhat.com>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Link: https://lore.kernel.org/r/20260324123229.152424-1-tglozar@redhat.com
Signed-off-by: Tomas Glozar <tglozar@redhat.com>
- fuse_mutex is not needed for device cloning, because fuse_dev_install()
uses cmpxcg() to set fud->fc, which prevents races between clone/mount
or clone/clone. This makes the logic simpler
- Drop fc->dev_count. This is only used to check in release if the device
is the last clone, but checking list_empty(&fc->devices) is equivalent
after removing the released device from the list. Removing the fuse_dev
before calling fuse_abort_conn() is okay, since the processing and io
lists are now empty for this device.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
During an unmount, wait for potential withdraw to complete before calling
gfs2_make_fs_ro(). This will allow gfs2_make_fs_ro() to skip much of its work.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
smb_direct_flush_send_list() already calls smb_direct_free_sendmsg(),
so we should not call it again after post_sendmsg()
moved it to the batch list.
Reported-by: Ruikai Peng <ruikai@pwno.io>
Closes: https://lore.kernel.org/linux-cifs/CAFD3drNOSJ05y3A+jNXSDxW-2w09KHQ0DivhxQ_pcc7immVVOQ@mail.gmail.com/
Fixes: 34abd408c8ba ("smb: server: make use of smbdirect_socket.send_io.bcredits")
Cc: stable@kernel.org
Cc: Steve French <smfrench@gmail.com>
Cc: Tom Talpey <tom@talpey.com>
Cc: Ruikai Peng <ruikai@pwno.io>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: linux-cifs@vger.kernel.org
Cc: samba-technical@lists.samba.org
Cc: security@kernel.org
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Acked-by: Paulo Alcantara (Red Hat) <pc@manguebit.org>
Tested-by: Ruikai Peng <ruikai@pwno.io>
Signed-off-by: Steve French <stfrench@microsoft.com>
When fsnotify_add_inode_mark_locked() fails in inotify_new_watch(),
the error path calls inotify_remove_from_idr() but does not call
dec_inotify_watches() to undo the preceding inc_inotify_watches().
This leaks a watch count, and repeated failures can exhaust the
max_user_watches limit with -ENOSPC even when no watches are active.
Prior to commit 1cce1eea0aff ("inotify: Convert to using per-namespace
limits"), the watch count was incremented after fsnotify_add_mark_locked()
succeeded, so this path was not affected. The conversion moved
inc_inotify_watches() before the mark insertion without adding the
corresponding rollback.
Add the missing dec_inotify_watches() call in the error path.
Fixes: 1cce1eea0aff ("inotify: Convert to using per-namespace limits")
Cc: stable@vger.kernel.org
Signed-off-by: Chia-Ming Chang <chiamingc@synology.com>
Signed-off-by: robbieko <robbieko@synology.com>
Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
Link: https://patch.msgid.link/20260224093442.3076294-1-chiamingc@synology.com
Signed-off-by: Jan Kara <jack@suse.cz>
The function __rsv_window_dump() is a heavyweight debug tool that walks
the reservation red-black tree. It is currently guarded by #if 1,
forcing it to be compiled into all kernels, even production ones.
Match the rest of the file by guarding it with #ifdef EXT2FS_DEBUG,
so it is only included when explicit debugging is enabled.
This removes the unused function code from standard builds.
Signed-off-by: Milos Nikic <nikic.milos@gmail.com>
Link: https://patch.msgid.link/20260207022920.258247-1-nikic.milos@gmail.com
Signed-off-by: Jan Kara <jack@suse.cz>
Add check_dtpage() to validate dtpage_t integrity, focusing on
preventing index/pointer overflows from on-disk corruption.
Key checks:
- maxslot must be exactly DTPAGEMAXSLOT (128) as defined for dtpage
slot array.
- freecnt bounded by [0, DTPAGEMAXSLOT-1] (slot[0] reserved for header).
- freelist validity: -1 when freecnt=0; 1~DTPAGEMAXSLOT-1 when non-zero,
with linked list checks (no duplicates, proper termination via next=-1).
- stblindex bounds: must be within range that avoids overlapping with
stbl itself (stblindex < DTPAGEMAXSLOT - stblsize).
- nextindex bounded by stbl size (stblsize << L2DTSLOTSIZE). stbl entries
validity: within 1~DTPAGEMAXSLOT-1, no duplicates(excluding invalid
entries marked as -1).
Invoked when loading dtpage (in BT_GETPAGE macro context) to catch
corruption early before directory operations trigger out-of-bounds access.
Signed-off-by: Yun Zhou <yun.zhou@windriver.com>
Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>
Pull ktest updates from Steven Rostedt:
- Fix undef warning when WARNINGS_FILE is unset
The check_buildlog() references WARNINGS_FILE even when it's not set.
Perl triggers a warning in this case. Check if the WARNINGS_FILE is
defined before checking if the file it represents exists.
- Fix how LOG_FILE is resolved
LOG_FILE is expanded immediately after the config file is parsed. If
LOG_FILE depends on variables from the tests it will use stale values
instead of using the test variables. Have LOG_FILE also resolve test
variables.
- Treat a undefined self reference variable as empty
Variables can recursively include itself for appending. Currently, if
the references itself and it is not defined, it leaves the variable
in the define: "VAR = ${VAR} foo" keeps the ${VAR} around. Have it
removed instead.
- Fix clearing of variables per tests
If a variable has a defined default, a test can not clear it by
assigning the variable to empty. Fix this by clearing the variable
for a test when the test config has that variable assigned to
nothing.
- Fix run_command() to catch stderr in the shell command parsing
Switch to Perl list form open to use "sh -c" wrapper to run shell
commands to have the log file catch shell parsing errors.
- Fix console output during reboot cycle
The POWER_CYCLE callback during reboot() can miss output from the
next boot making ktest miss the boot string it was waiting for.
- Add PRE_KTEST_DIE for PRE_KTEST failures
If the command for PRE_KTEST fails, ktest does not fail (this was by
design as this command was used to add patches that may or may not
apply). Add PRE_KTEST_DIE value to force ktest to fail if PRE_KTEST
fails.
- Run POST_KTEST hooks on failure and cancellation
PRE_KTEST always runs before a ktest test, have POST_KTEST always run
after a test even if the test fails or is cancelled to do the
teardown of PRE_KTEST.
- Add a --dry-run mode
Add --dry-run to parse the config, print the results and exit without
running any of the tests.
- Store failures from the dodie() path as well
The STORE_FAILURES saves the logs on failure, but there's failure
paths that miss storing. Perform STORE_FAILURES in dodie() to capture
these failures too.
* tag 'ktest-v7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-ktest:
ktest: Store failure logs also in fatal paths
ktest: Add a --dry-run mode
ktest: Run POST_KTEST hooks on failure and cancellation
ktest: Add PRE_KTEST_DIE for PRE_KTEST failures
ktest: Stop dropping console output during power-cycle reboot
ktest: Run commands through list-form shell open
ktest: Honor empty per-test option overrides
ktest: Treat undefined self-reference as empty
ktest: Resolve LOG_FILE in test option context
ktest: Avoid undef warning when WARNINGS_FILE is unset
Since commit 0c43094f8cc9 ("eventpoll: Replace rwlock with spinlock"),
epoll_wait is real-time-safe syscall for sleeping.
Add epoll_wait to the list of rt-safe sleeping APIs.
Signed-off-by: Nam Cao <namcao@linutronix.de>
Reviewed-by: Gabriele Monaco <gmonaco@redhat.com>
Link: https://lore.kernel.org/r/20260401130828.3115428-1-namcao@linutronix.de
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
Detach stop_trace() from SIGINT/SIGALRM on tool clean-up to prevent it
from crashing RTLA by accessing freed memory.
This prevents a crash when multiple SIGINTs are received.
Fixes: d6899e560366 ("rtla/timerlat_hist: Abort event processing on second signal")
Fixes: 80967b354a76 ("rtla/timerlat_top: Abort event processing on second signal")
Reviewed-by: Wander Lairson Costa <wander@redhat.com>
Link: https://lore.kernel.org/r/20260310160725.144443-1-tglozar@redhat.com
Signed-off-by: Tomas Glozar <tglozar@redhat.com>
With the new mount API the sequence of syscalls would be:
fs_fd = fsopen("fuse", 0);
snprintf(opt, sizeof(opt), "%i", devfd);
fsconfig(fs_fd, FSCONFIG_SET_STRING, "fd", opt, 0);
/* ... */
fsconfig(fs_fd, FSCONFIG_CMD_CREATE, 0, 0, 0);
Current mount code just stores the value of devfd in the fs_context and
uses it in during FSCONFIG_CMD_CREATE, which is inelegant.
Instead grab a reference to the underlying fuse_dev, and use that during
the filesystem creation.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
In gfs2_dinode_in(), only allow directories to have the GFS2_DIF_EXHASH
flag set. This will prevent other parts of the code from treating
regular inodes as directories based on the presence of that flag.
In sweep_bh_for_rgrps() and __gfs2_free_blocks(), check if the
GFS2_DIF_EXHASH flag is set instead of checking if i_depth is non-zero.
This matches what the directory code does. (The i_depth checks were
introduced in commit 6d3117b412951 ("GFS2: Wipe directory hash table
metadata when deallocating a directory").)
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
smbd_send_batch_flush() already calls smbd_free_send_io(),
so we should not call it again after smbd_post_send()
moved it to the batch list.
Reported-by: Ruikai Peng <ruikai@pwno.io>
Closes: https://lore.kernel.org/linux-cifs/CAFD3drNOSJ05y3A+jNXSDxW-2w09KHQ0DivhxQ_pcc7immVVOQ@mail.gmail.com/
Fixes: 21538121efe6 ("smb: client: make use of smbdirect_socket.send_io.bcredits")
Cc: stable@kernel.org
Cc: Steve French <smfrench@gmail.com>
Cc: Tom Talpey <tom@talpey.com>
Cc: Long Li <longli@microsoft.com>
Cc: Ruikai Peng <ruikai@pwno.io>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: linux-cifs@vger.kernel.org
Cc: samba-technical@lists.samba.org
Cc: security@kernel.org
Acked-by: Paulo Alcantara (Red Hat) <pc@manguebit.org>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Tested-by: Ruikai Peng <ruikai@pwno.io>
Signed-off-by: Steve French <stfrench@microsoft.com>
Pull erofs fixes from Gao Xiang:
- Do not share the page cache if the real @aops differs
- Fix the incomplete condition for interlaced plain extents
- Get rid of more unnecessary #ifdefs
* tag 'erofs-for-7.0-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
erofs: fix interlaced plain identification for encoded extents
erofs: remove more unnecessary #ifdefs
erofs: allow sharing page cache with the same aops only
If ext2_get_blocks() is called with maxblocks == 0, it currently triggers
a BUG_ON(), causing a kernel panic.
While this condition implies a logic error in the caller, a filesystem
should not crash the system due to invalid arguments.
Replace the BUG_ON() with a WARN_ON_ONCE() to provide a stack trace for
debugging, and return -EINVAL to handle the error gracefully.
Signed-off-by: Milos Nikic <nikic.milos@gmail.com>
Link: https://patch.msgid.link/20260207010617.216675-1-nikic.milos@gmail.com
Signed-off-by: Jan Kara <jack@suse.cz>
Add check_dtroot() to validate dtroot_t integrity, focusing on preventing
index/pointer overflows from on-disk corruption.
Key checks:
- freecnt bounded by [0, DTROOTMAXSLOT-1] (slot[0] reserved for header).
- freelist validity: -1 when freecnt=0; 1~DTROOTMAXSLOT-1 when non-zero,
with linked list checks (no duplicates, proper termination via next=-1).
- stbl bounds: nextindex within stbl array size; entries within 0~8, no
duplicates (excluding idx=0).
Invoked in copy_from_dinode() when loading directory inodes, catching
corruption early before directory operations trigger out-of-bounds access.
This fixes the following UBSAN warning.
[ 101.832754][ T5960] ------------[ cut here ]------------
[ 101.832762][ T5960] UBSAN: array-index-out-of-bounds in fs/jfs/jfs_dtree.c:3713:8
[ 101.832792][ T5960] index -1 is out of range for type 'struct dtslot[128]'
[ 101.832807][ T5960] CPU: 2 UID: 0 PID: 5960 Comm: 5f7f0caf9979e9d Tainted: G E 6.18.0-rc4-00250-g2603eb907f03 #119 PREEMPT_{RT,(full
[ 101.832817][ T5960] Tainted: [E]=UNSIGNED_MODULE
[ 101.832819][ T5960] Hardware name: QEMU Ubuntu 25.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[ 101.832823][ T5960] Call Trace:
[ 101.832833][ T5960] <TASK>
[ 101.832838][ T5960] dump_stack_lvl+0x189/0x250
[ 101.832909][ T5960] ? __pfx_dump_stack_lvl+0x10/0x10
[ 101.832925][ T5960] ? __pfx__printk+0x10/0x10
[ 101.832934][ T5960] ? rt_mutex_slowunlock+0x493/0x8a0
[ 101.832959][ T5960] ubsan_epilogue+0xa/0x40
[ 101.832966][ T5960] __ubsan_handle_out_of_bounds+0xe9/0xf0
[ 101.833007][ T5960] dtInsertEntry+0x936/0x1430 [jfs]
[ 101.833094][ T5960] dtSplitPage+0x2c8b/0x3ed0 [jfs]
[ 101.833177][ T5960] ? __pfx_rt_mutex_slowunlock+0x10/0x10
[ 101.833193][ T5960] dtInsert+0x109b/0x6000 [jfs]
[ 101.833283][ T5960] ? rt_mutex_slowunlock+0x493/0x8a0
[ 101.833296][ T5960] ? __pfx_rt_mutex_slowunlock+0x10/0x10
[ 101.833307][ T5960] ? rt_spin_unlock+0x161/0x200
[ 101.833315][ T5960] ? __pfx_dtInsert+0x10/0x10 [jfs]
[ 101.833391][ T5960] ? txLock+0xaf9/0x1cb0 [jfs]
[ 101.833477][ T5960] ? dtInitRoot+0x22a/0x670 [jfs]
[ 101.833556][ T5960] jfs_mkdir+0x6ec/0xa70 [jfs]
[ 101.833636][ T5960] ? __pfx_jfs_mkdir+0x10/0x10 [jfs]
[ 101.833721][ T5960] ? generic_permission+0x2e5/0x690
[ 101.833760][ T5960] ? bpf_lsm_inode_mkdir+0x9/0x20
[ 101.833776][ T5960] vfs_mkdir+0x306/0x510
[ 101.833786][ T5960] do_mkdirat+0x247/0x590
[ 101.833795][ T5960] ? __pfx_do_mkdirat+0x10/0x10
[ 101.833804][ T5960] ? getname_flags+0x1e5/0x540
[ 101.833815][ T5960] __x64_sys_mkdir+0x6c/0x80
[ 101.833823][ T5960] do_syscall_64+0xfa/0xfa0
[ 101.833832][ T5960] ? lockdep_hardirqs_on+0x9c/0x150
[ 101.833840][ T5960] ? entry_SYSCALL_64_after_hwframe+0x77/0x7f
[ 101.833847][ T5960] ? exc_page_fault+0xab/0x100
[ 101.833856][ T5960] entry_SYSCALL_64_after_hwframe+0x77/0x7f
Signed-off-by: Yun Zhou <yun.zhou@windriver.com>
Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>
Pull tracefs updates from Steven Rostedt:
- Simplify error handling with guards()
Use guards() to simplify the handling of releasing locks in exit
paths.
- Use dentry name snapshots instead of allocation
Instead of allocating a temp buffer to store the dentry name to use
in mkdir() and rmdir() use take_dentry_name_snapshot().
- Fix default permissions not being applied at boot
The default permissions for tracefs was 0700 to only allow root
having access. But after a change to fix other mount options the
update to permissions ignored the defined default and used the system
default of 0755. This is a regression and is fixed.
* tag 'tracefs-v7.1-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tracefs: Removed unused 'ret' variable in eventfs_iterate()
tracefs: Fix default permissions not being applied on initial mount
tracefs: Use dentry name snapshots instead of heap allocation
eventfs: Simplify code using guard()s
STORE_FAILURES was only saved from fail(), so paths that reached dodie()
could exit without preserving failure logs.
That includes fatal hook paths such as:
POST_BUILD_DIE = 1
and ordinary failures when:
DIE_ON_FAILURE = 1
Call save_logs("fail", ...) from dodie() too so fatal failures keep the
same STORE_FAILURES artifacts as non-fatal fail() paths.
Cc: John Hawley <warthog9@eaglescrag.net>
Link: https://patch.msgid.link/20260318-ktest-fixes-v1-1-9dd94d46d84c@suse.com
Signed-off-by: Ricardo B. Marlière <rbm@suse.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
The _fill_states() method returns a list of strings, but the type
annotation incorrectly specified str. Update the annotation to
list[str] to match the actual return value.
Signed-off-by: Wander Lairson Costa <wander@redhat.com>
Reviewed-by: Gabriele Monaco <gmonaco@redhat.com>
Link: https://lore.kernel.org/r/20260223162407.147003-20-wander@redhat.com
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
The procfs_is_workload_pid() function iterates through a directory
entry name to validate if it represents a process ID. The loop
condition checks if the pointer t_name is non-NULL, but since
incrementing a pointer never makes it NULL, this condition is always
true within the loop's context. Although the inner isdigit() check
catches the NUL terminator and breaks out of the loop, the condition
is semantically misleading and not idiomatic for C string processing.
Correct the loop condition from checking the pointer (t_name) to
checking the character it points to (*t_name). This ensures the loop
terminates when the NUL terminator is reached, aligning with standard
C string iteration practices. While the original code functioned
correctly due to the existing character validation, this change
improves code clarity and maintainability.
Signed-off-by: Wander Lairson Costa <wander@redhat.com>
Link: https://lore.kernel.org/r/20260309195040.1019085-19-wander@redhat.com
Signed-off-by: Tomas Glozar <tglozar@redhat.com>
This will make it possible to grab the fuse_dev and subsequently release
the file that it came from.
In the above case, fud->fc will be set to FUSE_DEV_FC_DISCONNECTED to
indicate that this is no longer a functional device.
When trying to assign an fc to such a disconnected fuse_dev, the fc is set
to the disconnected state.
Use atomic operations xchg() and cmpxchg() to prevent races.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
When a withdraw occurs in gfs2_log_flush() and we are left with an unsubmitted
bio, fail that bio. Otherwise, the bh's in that bio will remain locked and
gfs2_evict_inode() -> truncate_inode_pages() -> gfs2_invalidate_folio() ->
gfs2_discard() will hang trying to discard the locked bh's.
In addition, when gfs2_log_flush() fails to submit a new transaction, unpin the
buffers in the failing transaction like gfs2_remove_from_journal() does. If
any of the bd's are on the ail2 list, leave them there and do_withdraw() ->
gfs2_withdraw_glocks() -> inode_go_inval() -> truncate_inode_pages() ->
gfs2_invalidate_folio() -> gfs2_discard() will remove them. They will be freed
in gfs2_release_folio().
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
ksmbd_crypt_message() sets a NULL completion callback on AEAD requests
and does not handle the -EINPROGRESS return code from async hardware
crypto engines like the Qualcomm Crypto Engine (QCE). When QCE returns
-EINPROGRESS, ksmbd treats it as an error and immediately frees the
request while the hardware DMA operation is still in flight. The DMA
completion callback then dereferences freed memory, causing a NULL
pointer crash:
pc : qce_skcipher_done+0x24/0x174
lr : vchan_complete+0x230/0x27c
...
el1h_64_irq+0x68/0x6c
ksmbd_free_work_struct+0x20/0x118 [ksmbd]
ksmbd_exit_file_cache+0x694/0xa4c [ksmbd]
Use the standard crypto_wait_req() pattern with crypto_req_done() as
the completion callback, matching the approach used by the SMB client
in fs/smb/client/smb2ops.c. This properly handles both synchronous
engines (immediate return) and async engines (-EINPROGRESS followed
by callback notification).
Fixes: e2f34481b24d ("cifsd: add server-side procedures for SMB3")
Link: https://github.com/openwrt/openwrt/issues/21822
Signed-off-by: Joshua Klinesmith <joshuaklinesmith@gmail.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Pull ata fixes from Niklas Cassel:
- The newly introduced feature that issues a deferred (non-NCQ) command
from a workqueue, forgot to consider the case where the deferred QC
times out. Fix the code to take timeouts into consideration, which
avoids a use after free (Damien)
- The newly introduced feature that issues a deferred (non-NCQ) command
from a workqueue, when unloading the module, calls cancel_work_sync(),
a function that can sleep, while holding a spin lock. Move the function
call outside the lock (Damien)
* tag 'ata-7.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux:
ata: libata-core: fix cancellation of a port deferred qc work
ata: libata-eh: correctly handle deferred qc timeouts
Only plain data whose start position and on-disk physical length are
both aligned to the block size should be classified as interlaced
plain extents. Otherwise, it must be treated as shifted plain extents.
This issue was found by syzbot using a crafted compressed image
containing plain extents with unaligned physical lengths, which can
cause OOB read in z_erofs_transform_plain().
Reported-and-tested-by: syzbot+d988dc155e740d76a331@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/r/699d5714.050a0220.cdd3c.03e7.GAE@google.com
Fixes: 1d191b4ca51d ("erofs: implement encoded extent metadata")
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
The TODO comment in the file header asking to get rid of kmap() is
outdated. The code has already been converted to use the folio API
(specifically kmap_local_folio).
Remove the stale comment to reflect the current state of the code.
Signed-off-by: Milos Nikic <nikic.milos@gmail.com>
Link: https://patch.msgid.link/20260207002908.176933-1-nikic.milos@gmail.com
Signed-off-by: Jan Kara <jack@suse.cz>
Pull ring-buffer updates from Steven Rostedt:
- Add remote buffers for pKVM
pKVM has a hypervisor component that is used to protect the guest
from the host kernel. This hypervisor is a black box to the kernel as
the kernel is to user space. The remote buffers are used to have a
memory mapping between the hypervisor and the kernel where kernel may
send commands to enable tracing within the hypervisor. Then the
kernel will read this memory mapping just like user space can read
the memory mapped ring buffer of the kernel tracing system.
Since the hypervisor only has a single context, it doesn't need to
worry about races between normal context, interrupt context and NMIs
like the kernel does. The ring buffer it uses doesn't need to be as
complex. The remote buffers are a simple version of the ring buffer
that works in a single context. They are still per-CPU and use sub
buffers. The data layout is the same as the kernel's ring buffer to
share the same parsing.
Currently, only ARM64 implements pKVM, but there's work to implement
it also in x86. The remote buffer code is separated out from the ARM
implementation so that it can be used in the future by x86.
The ARM64 updates for pKVM is in the ARM/KVM tree and it merged in
the remote buffers of this tree.
- Make the backup instance non reusable
The backup instance is a copy of the persistent ring buffer so that
the persistent ring buffer could start recording again without using
the data from the previous boot. The backup isn't for normal tracing.
It is made read-only, and after it is consumed, it is automatically
removed.
- Have backup copy persistent instance before it starts recording
To allow the persistent ring buffer to start recording from the
kernel command line commands, move the copy of the backup instance to
before the the command line options start recording.
- Report header_page overwrite field as "char" and not "int'
The rust parser of the header_page file was triggering a warning when
it defined the overwrite variable as "int" but it was only a single
byte in size.
- Fix memory barriers for the trace_buffer CPU mask
When a CPU comes online, the bit is set to allow readers to know that
the CPU buffer is allocated. The bit is set after the allocation is
done, and a smp_wmb() is performed after the allocation and before
the setting of the bit. But instead of adding a smp_rmb() to all
readers, since once a buffer is created for a CPU it is not deleted
if that CPU goes offline, so this allocation is almost always done at
boot up before any readers exist.
If for the unlikely case where a CPU comes online for the first time
after the system boot has finished, send an IPI to all CPUs to force
the smp_rmb() for each CPU.
- Show clock function being used in debugging ring buffer data
When the ring buffer checks are enabled and the ring buffer detects
an inconsistency in the times of the invents, print out the clock
being used when the error occurred. There was a very hard to hit bug
that would happen every so often and it ended up being only triggered
when the jiffies clock was being used. If the bug showed the clock
being used, it would have been much easier to find the problem (which
was an internal function was being traced which caused the clock
accounting to go off).
* tag 'trace-ringbuffer-v7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (26 commits)
ring-buffer: Prevent off-by-one array access in ring_buffer_desc_page()
ring-buffer: Report header_page overwrite as char
tracing: Allow backup to save persistent ring buffer before it starts
tracing/Documentation: Add a section about backup instance
tracing: Remove the backup instance automatically after read
tracing: Make the backup instance non-reusable
ring-buffer: Enforce read ordering of trace_buffer cpumask and buffers
ring-buffer: Show what clock function is used on timestamp errors
tracing: Check for undefined symbols in simple_ring_buffer
tracing: load/unload page callbacks for simple_ring_buffer
Documentation: tracing: Add tracing remotes
tracing: selftests: Add trace remote tests
tracing: Add a trace remote module for testing
tracing: Introduce simple_ring_buffer
ring-buffer: Export buffer_data_page and macros
tracing: Add helpers to create trace remote events
tracing: Add events/ root files to trace remotes
tracing: Add events to trace remotes
tracing: Add init callback to trace remotes
tracing: Add non-consuming read to trace remotes
...
Moving to guard() usage removed the need of using the 'ret' variable but
it wasn't removed. As it was set to zero, the compiler in use didn't warn
(although some compilers do).
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Link: https://patch.msgid.link/20260414110344.75c0663f@robin
Fixes: 4d9b262031f ("eventfs: Simplify code using guard()s")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202604100111.AAlbQKmK-lkp@intel.com/
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
When working on a ktest configuration, it is often useful to inspect the
final option values after includes, defaults, per-test overrides, and
variable expansion have been applied, without actually starting a test run.
Add a --dry-run option that reads the configuration, prints the test
preamble using resolved option values, and exits before opening LOG_FILE or
executing any test logic.
This is useful for debugging ktest configurations and for scripts that need
to validate the final resolved settings without triggering side effects.
Cc: John Hawley <warthog9@eaglescrag.net>
Cc: Andrea Righi <arighi@nvidia.com>
Cc: Marcos Paulo de Souza <mpdesouza@suse.com>
Cc: Matthieu Baerts <matttbe@kernel.org>
Cc: Fernando Fernandez Mancera <fmancera@suse.de>
Cc: Pedro Falcato <pfalcato@suse.de>
Link: https://patch.msgid.link/20260307-ktest-fixes-v1-9-565d412f4925@suse.com
Signed-off-by: Ricardo B. Marlière <rbm@suse.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Pyright static analysis reports a "possibly unbound variable" warning
for the loop variable `i` in the `abbreviate_atoms` function. The
variable is accessed after the inner loop terminates to slice the atom
string. While the loop logic currently ensures execution, the analyzer
flags the reliance on the loop variable persisting outside its scope.
Refactor the prefix length calculation into a nested `find_share_length`
helper function. This encapsulates the search logic and uses explicit
return statements, ensuring the length value is strictly defined. This
satisfies the type checker and improves code readability without
altering the runtime behavior.
Signed-off-by: Wander Lairson Costa <wander@redhat.com>
Reviewed-by: Gabriele Monaco <gmonaco@redhat.com>
Link: https://lore.kernel.org/r/20260223162407.147003-19-wander@redhat.com
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
Pull jfs updates from Dave Kleikamp:
"More robust data integrity checking and some fixes"
* tag 'jfs-7.1' of github.com:kleikamp/linux-shaggy:
jfs: avoid -Wtautological-constant-out-of-range-compare warning again
JFS: always load filesystem UUID during mount
jfs: hold LOG_LOCK on umount to avoid null-ptr-deref
jfs: Set the lbmDone flag at the end of lbmIODone
jfs: fix corrupted list in dbUpdatePMap
jfs: add dmapctl integrity check to prevent invalid operations
jfs: add dtpage integrity check to prevent index/pointer overflows
jfs: add dtroot integrity check to prevent index out-of-bounds
Pull ext2, udf, quota updates from Jan Kara:
- A fix for a race in quota code that can expose ocfs2 to
use-after-free issues
- UDF fix to avoid memory corruption in face of corrupted format
- Couple of ext2 fixes for better handling of fs corruption
- Some more various code cleanups in UDF & ext2
* tag 'fs_for_v7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
ext2: reject inodes with zero i_nlink and valid mode in ext2_iget()
ext2: use get_random_u32() where appropriate
quota: Fix race of dquot_scan_active() with quota deactivation
udf: fix partition descriptor append bookkeeping
ext2: avoid drop_nlink() during unlink of zero-nlink inode in ext2_unlink()
ext2: guard reservation window dump with EXT2FS_DEBUG
ext2: replace BUG_ON with WARN_ON_ONCE in ext2_get_blocks
ext2: remove stale TODO about kmap
fs: udf: avoid assignment in condition when selecting allocation goal
The comparison of an __s8 value against DTPAGEMAXSLOT is still trivially
true, causing a harmless (default disabled) warning with clang:
fs/jfs/jfs_dtree.c:4419:25: error: result of comparison of constant 128 with expression of type 's8' (aka 'signed char') is always false [-Werror,-Wtautological-constant-out-of-range-compare]
4419 | p->header.freelist >= DTPAGEMAXSLOT)) {
| ~~~~~~~~~~~~~~~~~~ ^ ~~~~~~~~~~~~~
I previously worked around two of these in commit 7833570dae83 ("jfs: avoid
-Wtautological-constant-out-of-range-compare warning"), but now a new one has
come up, so address the same way by dropping the redundant range check.
Fixes: 119e448bb50a ("jfs: add dtpage integrity check to prevent index/pointer overflows")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>
Pull fsnotify updates from Jan Kara:
"A couple of small fsnotify fixes and cleanups"
* tag 'fsnotify_for_v7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
fanotify: replace deprecated strcpy in fanotify_info_copy_{name,name2}
fsnotify: inotify: pass mark connector to fsnotify_recalc_mask()
fanotify: call fanotify_events_supported() before path_permission() and security_path_notify()
fanotify: avoid/silence premature LSM capability checks
inotify: fix watch count leak when fsnotify_add_inode_mark_locked() fails
ext2_iget() already rejects inodes with i_nlink == 0 when i_mode is
zero or i_dtime is set, treating them as deleted. However, the case of
i_nlink == 0 with a non-zero mode and zero dtime slips through. Since
ext2 has no orphan list, such a combination can only result from
filesystem corruption - a legitimate inode deletion always sets either
i_dtime or clears i_mode before freeing the inode.
A crafted image can exploit this gap to present such an inode to the
VFS, which then triggers WARN_ON inside drop_nlink() (fs/inode.c) via
ext2_unlink(), ext2_rename() and ext2_rmdir():
WARNING: CPU: 3 PID: 609 at fs/inode.c:336 drop_nlink+0xad/0xd0 fs/inode.c:336
CPU: 3 UID: 0 PID: 609 Comm: syz-executor Not tainted 6.12.77+ #1
Call Trace:
<TASK>
inode_dec_link_count include/linux/fs.h:2518 [inline]
ext2_unlink+0x26c/0x300 fs/ext2/namei.c:295
vfs_unlink+0x2fc/0x9b0 fs/namei.c:4477
do_unlinkat+0x53e/0x730 fs/namei.c:4541
__x64_sys_unlink+0xc6/0x110 fs/namei.c:4587
do_syscall_64+0xf5/0x220 arch/x86/entry/common.c:78
entry_SYSCALL_64_after_hwframe+0x77/0x7f
</TASK>
WARNING: CPU: 0 PID: 646 at fs/inode.c:336 drop_nlink+0xad/0xd0 fs/inode.c:336
CPU: 0 UID: 0 PID: 646 Comm: syz.0.17 Not tainted 6.12.77+ #1
Call Trace:
<TASK>
inode_dec_link_count include/linux/fs.h:2518 [inline]
ext2_rename+0x35e/0x850 fs/ext2/namei.c:374
vfs_rename+0xf2f/0x2060 fs/namei.c:5021
do_renameat2+0xbe2/0xd50 fs/namei.c:5178
__x64_sys_rename+0x7e/0xa0 fs/namei.c:5223
do_syscall_64+0xf5/0x220 arch/x86/entry/common.c:78
entry_SYSCALL_64_after_hwframe+0x77/0x7f
</TASK>
WARNING: CPU: 0 PID: 634 at fs/inode.c:336 drop_nlink+0xad/0xd0 fs/inode.c:336
CPU: 0 UID: 0 PID: 634 Comm: syz-executor Not tainted 6.12.77+ #1
Call Trace:
<TASK>
inode_dec_link_count include/linux/fs.h:2518 [inline]
ext2_rmdir+0xca/0x110 fs/ext2/namei.c:311
vfs_rmdir+0x204/0x690 fs/namei.c:4348
do_rmdir+0x372/0x3e0 fs/namei.c:4407
__x64_sys_unlinkat+0xf0/0x130 fs/namei.c:4577
do_syscall_64+0xf5/0x220 arch/x86/entry/common.c:78
entry_SYSCALL_64_after_hwframe+0x77/0x7f
</TASK>
Extend the existing i_nlink == 0 check to also catch this case,
reporting the corruption via ext2_error() and returning -EFSCORRUPTED.
This rejects the inode at load time and prevents it from reaching any
of the namei.c paths.
Found by Linux Verification Center (linuxtesting.org) with Syzkaller.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Cc: stable@vger.kernel.org
Signed-off-by: Vasiliy Kovalev <kovalev@altlinux.org>
Link: https://patch.msgid.link/20260404152011.2590197-1-kovalev@altlinux.org
Signed-off-by: Jan Kara <jack@suse.cz>
The filesystem UUID was only being loaded into super_block sb when an
external journal device was in use. When mounting without an external
journal, the UUID remained unset, which prevented the computation of
a filesystem ID (fsid), which could be confirmed via `stat -f -c "%i"`
and thus user space could not use fanotify correctly.
A missing filesystem ID causes fanotify to return ENODEV when marking
the filesystem for events like FAN_CREATE, FAN_DELETE, FAN_MOVED_TO,
and FAN_MOVED_FROM. As a result, applications relying on fanotify
could not monitor these events on JFS filesystems without an external
journal.
Moved the UUID initialization so it is always performed during mount,
ensuring the superblock UUID is consistently available.
Signed-off-by: João Paredes <joaommp@yahoo.com>
Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>
Pull smb server updates from Steve French:
- smbdirect double free fixes
- Add some smbdirect logging
- Minor cleanup in crypto, and smbdirect and in IPC handling
- Minor cleanup to move header info to common FSCC code
- Fix crypt message use after free
- Fix memory leak in session setup
- Fix for DACL parsing
- Fix EA name length validation
- Reconnect fix
- Fix use after free in close
* tag 'v7.1-rc-part1-ksmbd-srv-fixes' of git://git.samba.org/ksmbd:
smb: smbdirect: add some logging to SMBDIRECT_CHECK_STATUS_{WARN,DISCONNECT}()
smb: smbdirect: introduce smbdirect_socket.logging infrastructure
smb: smbdirect: let smbdirect.h include #include <linux/types.h>
smb: server: avoid double-free in smb_direct_free_sendmsg after smb_direct_flush_send_list()
smb: client: avoid double-free in smbd_free_send_io() after smbd_send_batch_flush()
ksmbd: fix use-after-free from async crypto on Qualcomm crypto engine
ksmbd: fix mechToken leak when SPNEGO decode fails after token alloc
ksmbd: require 3 sub-authorities before reading sub_auth[2]
ksmbd: validate EaNameLength in smb2_get_ea()
ksmbd: Remove unnecessary selection of CRYPTO_ECB
ksmbd: validate owner of durable handle on reconnect
ksmbd: fix use-after-free in __ksmbd_close_fd() via durable scavenger
ksmbd: ipc: use kzalloc_flex and __counted_by
smb: move filesystem_vol_info into common/fscc.h
smb: move file_basic_info into common/fscc.h
smb: move some definitions from common/smb2pdu.h into common/fscc.h
strcpy() has been deprecated [1] because it performs no bounds checking
on the destination buffer, which can lead to buffer overflows. Replace
it with the safer strscpy().
Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strcpy [1]
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Link: https://patch.msgid.link/20260321210544.519259-4-thorsten.blum@linux.dev
Signed-off-by: Jan Kara <jack@suse.cz>
Use the typed random integer helpers instead of
get_random_bytes() when filling a single integer variable.
The helpers return the value directly, require no pointer
or size argument, and better express intent.
Signed-off-by: David Carlier <devnexen@gmail.com>
Link: https://patch.msgid.link/20260405154717.4705-1-devnexen@gmail.com
Signed-off-by: Jan Kara <jack@suse.cz>
write_special_inodes() function iterate through the log->sb_list and
access the sbi fields, which can be set to NULL concurrently by umount.
Fix concurrency issue by holding LOG_LOCK and checking for NULL.
Reported-by: syzbot+e14b1036481911ae4d77@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=e14b1036481911ae4d77
Signed-off-by: Helen Koike <koike@igalia.com>
Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>
Pull gfs2 updates from Andreas Gruenbacher:
- Fix possible data loss during inode evict
- Fix a race during bufdata allocation
- More careful cleaning up during a withdraw
- Prevent excessive log flushing under memory pressure
- Various other minor fixes and cleanups
* tag 'gfs2-for-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2:
gfs2: prevent NULL pointer dereference during unmount
gfs2: hide error messages after withdraw
gfs2: wait for withdraw earlier during unmount
gfs2: inode directory consistency checks
gfs2: gfs2_log_flush withdraw fixes
gfs2: add some missing log locking
gfs2: fix address space truncation during withdraw
gfs2: drain ail under sd_log_flush_lock
gfs2: bufdata allocation race
gfs2: Remove trans_drain code duplication
gfs2: Move gfs2_remove_from_journal to log.c
gfs2: Get rid of gfs2_log_[un]lock helpers
gfs2: less aggressive low-memory log flushing
gfs2: Fix data loss during inode evict
gfs2: minor evict_[un]linked_inode cleanup
gfs2: Avoid unnecessary transactions in evict_linked_inode
gfs2: Remove unnecessary check in gfs2_evict_inode
gfs2: Call unlock_new_inode before d_instantiate
This should make it easier to analyze any possible problems.
Cc: Steve French <smfrench@gmail.com>
Cc: Tom Talpey <tom@talpey.com>
Cc: Long Li <longli@microsoft.com>
Cc: Namjae Jeon <linkinjeon@kernel.org>
Cc: linux-cifs@vger.kernel.org
Cc: samba-technical@lists.samba.org
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
fsnotify_recalc_mask() expects a plain struct fsnotify_mark_connector *,
but inode->i_fsnotify_marks is an __rcu pointer. Use fsn_mark->connector
instead to avoid sparse "different address spaces" warnings.
Signed-off-by: Sun Jian <sun.jian.kdev@gmail.com>
Link: https://patch.msgid.link/20260214051217.1381363-1-sun.jian.kdev@gmail.com
Signed-off-by: Jan Kara <jack@suse.cz>
dquot_scan_active() can race with quota deactivation in
quota_release_workfn() like:
CPU0 (quota_release_workfn) CPU1 (dquot_scan_active)
============================== ==============================
spin_lock(&dq_list_lock);
list_replace_init(
&releasing_dquots, &rls_head);
/* dquot X on rls_head,
dq_count == 0,
DQ_ACTIVE_B still set */
spin_unlock(&dq_list_lock);
synchronize_srcu(&dquot_srcu);
spin_lock(&dq_list_lock);
list_for_each_entry(dquot,
&inuse_list, dq_inuse) {
/* finds dquot X */
dquot_active(X) -> true
atomic_inc(&X->dq_count);
}
spin_unlock(&dq_list_lock);
spin_lock(&dq_list_lock);
dquot = list_first_entry(&rls_head);
WARN_ON_ONCE(atomic_read(&dquot->dq_count));
The problem is not only a cosmetic one as under memory pressure the
caller of dquot_scan_active() can end up working on freed dquot.
Fix the problem by making sure the dquot is removed from releasing list
when we acquire a reference to it.
Fixes: 869b6ea1609f ("quota: Fix slow quotaoff")
Reported-by: Sam Sun <samsun1006219@gmail.com>
Link: https://lore.kernel.org/all/CAEkJfYPTt3uP1vAYnQ5V2ZWn5O9PLhhGi5HbOcAzyP9vbXyjeg@mail.gmail.com
Signed-off-by: Jan Kara <jack@suse.cz>
In lbmRead(), the I/O event waited for by wait_event() finishes before
it goes to sleep, and the lbmIODone() prematurely sets the flag to
lbmDONE, thus ending the wait. This causes wait_event() to return before
lbmREAD is cleared (because lbmDONE was set first), the premature return
of wait_event() leads to the release of lbuf before lbmIODone() returns,
thus triggering the use-after-free vulnerability reported in [1].
Moving the operation of setting the lbmDONE flag to after clearing lbmREAD
in lbmIODone() avoids the use-after-free vulnerability reported in [1].
[1]
BUG: KASAN: slab-use-after-free in rt_spin_lock+0x88/0x3e0 kernel/locking/spinlock_rt.c:56
Call Trace:
blk_update_request+0x57e/0xe60 block/blk-mq.c:1007
blk_mq_end_request+0x3e/0x70 block/blk-mq.c:1169
blk_complete_reqs block/blk-mq.c:1244 [inline]
blk_done_softirq+0x10a/0x160 block/blk-mq.c:1249
Allocated by task 6101:
lbmLogInit fs/jfs/jfs_logmgr.c:1821 [inline]
lmLogInit+0x3d0/0x19e0 fs/jfs/jfs_logmgr.c:1269
open_inline_log fs/jfs/jfs_logmgr.c:1175 [inline]
lmLogOpen+0x4e1/0xfa0 fs/jfs/jfs_logmgr.c:1069
jfs_mount_rw+0xe9/0x670 fs/jfs/jfs_mount.c:257
jfs_fill_super+0x754/0xd80 fs/jfs/super.c:532
Freed by task 6101:
kfree+0x1bd/0x900 mm/slub.c:6876
lbmLogShutdown fs/jfs/jfs_logmgr.c:1864 [inline]
lmLogInit+0x1137/0x19e0 fs/jfs/jfs_logmgr.c:1415
open_inline_log fs/jfs/jfs_logmgr.c:1175 [inline]
lmLogOpen+0x4e1/0xfa0 fs/jfs/jfs_logmgr.c:1069
jfs_mount_rw+0xe9/0x670 fs/jfs/jfs_mount.c:257
jfs_fill_super+0x754/0xd80 fs/jfs/super.c:532
Reported-by: syzbot+1d38eedcb25a3b5686a7@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=1d38eedcb25a3b5686a7
Signed-off-by: Edward Adam Davis <eadavis@qq.com>
Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>
Pull fuse update from Miklos Szeredi:
- Fix possible hang in virtiofs when cleaning up a DAX inode (Sergio
Lopez)
- Fix a warning when using large folio as the source of SPLICE_F_MOVE
on the fuse device (Bernd)
- Fix uninitialized value found by KMSAN (Luis Henriques)
- Fix synchronous INIT hang (Miklos)
- Fix race between inode initialization and FUSE_NOTIFY_INVAL_INODE
(Horst)
- Allow fd to be closed after passing fuse device fd to
fsconfig(..., "fd", ...) (Miklos)
- Support FSCONFIG_SET_FD for "fd" option (Miklos)
- Misc fixes and cleanups
* tag 'fuse-update-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: (21 commits)
fuse: support FSCONFIG_SET_FD for "fd" option
fuse: clean up device cloning
fuse: don't require /dev/fuse fd to be kept open during mount
fuse: add refcount to fuse_dev
fuse: create fuse_dev on /dev/fuse open instead of mount
fuse: check connection state on notification
fuse: fuse_dev_ioctl_clone() should wait for device file to be initialized
fuse: fix inode initialization race
fuse: abort on fatal signal during sync init
fuse: fix uninit-value in fuse_dentry_revalidate()
fuse: use offset_in_page() for page offset calculations
fuse: use DIV_ROUND_UP() for page count calculations
fuse: simplify logic in fuse_notify_store() and fuse_retrieve()
fuse: validate outarg offset and size in notify store/retrieve
fuse: Check for large folio with SPLICE_F_MOVE
fuse: quiet down complaints in fuse_conn_limit_write
fuse: drop unnecessary argument from fuse_lookup_init()
fuse: fix premature writetrhough request for large folio
fuse: refactor duplicate queue teardown operation
virtiofs: add FUSE protocol validation
...
When flushing out outstanding glock work during an unmount, gfs2_log_flush()
can be called when sdp->sd_jdesc has already been deallocated and sdp->sd_jdesc
is NULL. Commit 35264909e9d1 ("gfs2: Fix NULL pointer dereference in
gfs2_log_flush") added a check for that to gfs2_log_flush() itself, but it
missed the sdp->sd_jdesc dereference in gfs2_log_release(). Fix that.
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <error27@gmail.com>
Closes: https://lore.kernel.org/r/202604071139.HNJiCaAi-lkp@intel.com/
Fixes: 35264909e9d1 ("gfs2: Fix NULL pointer dereference in gfs2_log_flush")
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
This will be used by client and server in order to keep controlling
the logging when we move to shared functions.
Cc: Steve French <smfrench@gmail.com>
Cc: Tom Talpey <tom@talpey.com>
Cc: Long Li <longli@microsoft.com>
Cc: Namjae Jeon <linkinjeon@kernel.org>
Cc: linux-cifs@vger.kernel.org
Cc: samba-technical@lists.samba.org
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
The latter trigger LSM (e.g. SELinux) checks, which will log a denial
when permission is denied, so it's better to do them after validity
checks to avoid logging a denial when the operation would fail anyway.
Fixes: 0b3b094ac9a7 ("fanotify: Disallow permission events for proc filesystem")
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Paul Moore <paul@paul-moore.com>
Link: https://patch.msgid.link/20260216150625.793013-3-omosnace@redhat.com
Signed-off-by: Jan Kara <jack@suse.cz>
Mounting a crafted UDF image with repeated partition descriptors can
trigger a heap out-of-bounds write in part_descs_loc[].
handle_partition_descriptor() deduplicates entries by partition number,
but appended slots never record partnum. As a result duplicate
Partition Descriptors are appended repeatedly and num_part_descs keeps
growing.
Once the table is full, the growth path still sizes the allocation from
partnum even though inserts are indexed by num_part_descs. If partnum is
already aligned to PART_DESC_ALLOC_STEP, ALIGN(partnum, step) can keep
the old capacity and the next append writes past the end of the table.
Store partnum in the appended slot and size growth from the next append
count so deduplication and capacity tracking follow the same model.
Fixes: ee4af50ca94f ("udf: Fix mounting of Win7 created UDF filesystems")
Cc: stable@vger.kernel.org
Signed-off-by: Seohyeon Maeng <bioloidgp@gmail.com>
Link: https://patch.msgid.link/20260310081652.21220-1-bioloidgp@gmail.com
Signed-off-by: Jan Kara <jack@suse.cz>
This patch resolves the "list_add corruption. next is NULL" Oops
reported by syzkaller in dbUpdatePMap(). The root cause is uninitialized
synclist nodes in struct metapage and struct TxBlock, plus improper list
node removal using list_del() (which leaves nodes in an invalid state).
This fixes the following Oops reported by syzkaller.
list_add corruption. next is NULL.
------------[ cut here ]------------
kernel BUG at lib/list_debug.c:28!
Oops: invalid opcode: 0000 [#1] SMP KASAN PTI
CPU: 1 UID: 0 PID: 122 Comm: jfsCommit Not tainted syzkaller #0
PREEMPT_{RT,(full)}
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 10/02/2025
RIP: 0010:__list_add_valid_or_report+0xc3/0x130 lib/list_debug.c:27
Code: 4c 89 f2 48 89 d9 e8 0c 88 a4 fc 90 0f 0b 48 c7 c7 20 de 3d 8b e8
fd 87 a4 fc 90 0f 0b 48 c7 c7 c0 de 3d 8b e8 ee 87 a4 fc 90 <0f> 0b 48
89 df e8 13 c3 7d fd 42 80 7c 2d 00 00 74 08 4c 89 e7 e8
RSP: 0018:ffffc9000395fa20 EFLAGS: 00010246
RAX: 0000000000000022 RBX: 0000000000000000 RCX: 270c5dfadb559700
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: 00000000000f0000 R08: 0000000000000000 R09: 0000000000000000
R10: dffffc0000000000 R11: fffff5200072bee9 R12: 0000000000000000
R13: dffffc0000000000 R14: 0000000000000004 R15: 1ffff92000632266
FS: 0000000000000000(0000) GS:ffff888126ef9000(0000)
knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000056341fdb86c0 CR3: 0000000040a18000 CR4: 00000000003526f0
Call Trace:
<TASK>
__list_add_valid include/linux/list.h:96 [inline]
__list_add include/linux/list.h:158 [inline]
list_add include/linux/list.h:177 [inline]
dbUpdatePMap+0x7e4/0xeb0 fs/jfs/jfs_dmap.c:577
txAllocPMap+0x57d/0x6b0 fs/jfs/jfs_txnmgr.c:2426
txUpdateMap+0x81e/0x9c0 fs/jfs/jfs_txnmgr.c:2364
txLazyCommit fs/jfs/jfs_txnmgr.c:2665 [inline]
jfs_lazycommit+0x3f1/0xa10 fs/jfs/jfs_txnmgr.c:2734
kthread+0x711/0x8a0 kernel/kthread.c:463
ret_from_fork+0x4bc/0x870 arch/x86/kernel/process.c:158
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
</TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
Reported-by: syzbot+4d0a0feb49c5138cac46@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=4d0a0feb49c5138cac46
Tested-by: syzbot+4d0a0feb49c5138cac46@syzkaller.appspotmail.com
Signed-off-by: Yun Zhou <yun.zhou@windriver.com>
Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>
Pull RTLA updates from Steven Rostedt:
- Simplify option parsing
Auto-generate getopt_long() optstring for short options from long
options array, avoiding the need to specify it manually and reducing
the surface for mistakes.
- Add unit tests
Implement unit tests (make unit-tests) using libcheck, next to
existing runtime tests (make check). Currently, three functions from
utils.c are tested.
- Add --stack-format option
In addition to stopping stack pointer decoding (with -s/--stack
option) on first unresolvable pointer, allow also skipping
unresolvable pointers and displaying everything, configurable with a
new option.
- Unify number of CPUs into one global variable
Use one global variable, nr_cpus, to store the number of CPUs instead
of retrieving it and passing it at multiple places.
- Fix behavior in various corner cases
Make RTLA behave correctly in several corner cases: memory allocation
failure, invalid value read from kernel side, thread creation
failure, malformed time value input, and read/write failure or
interruption by signal.
- Improve string handling
Simplify several places in the code that handle strings, including
parsing of action arguments. A few new helper functions and variables
are added for that purpose.
- Get rid of magic numbers
Few places handling paths use a magic number of 1024. Replace it with
MAX_PATH and ARRAY_SIZE() macro.
- Unify threshold handling
Code that handles response to latency threshold is duplicated between
tools, which has led to bugs in the past. Unify it into a new helper
as much as possible.
- Fix segfault on SIGINT during cleanup
The SIGINT handler touches dynamically allocated memory. Detach it
before freeing it during cleanup to prevent segmentation fault and
discarding of output buffers. Also, properly document SIGINT handling
while at it.
* tag 'trace-rtla-v7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (28 commits)
Documentation/rtla: Document SIGINT behavior
rtla: Fix segfault on multiple SIGINTs
rtla/utils: Fix loop condition in PID validation
rtla/utils: Fix resource leak in set_comm_sched_attr()
rtla/trace: Fix I/O handling in save_trace_to_file()
rtla/trace: Fix write loop in trace_event_save_hist()
rtla/timerlat: Simplify RTLA_NO_BPF environment variable check
rtla: Use str_has_prefix() for option prefix check
rtla: Enforce exact match for time unit suffixes
rtla: Use str_has_prefix() for prefix checks
rtla: Add str_has_prefix() helper function
rtla: Handle pthread_create() failure properly
rtla/timerlat: Add bounds check for softirq vector
rtla: Simplify code by caching string lengths
rtla: Replace magic number with MAX_PATH
rtla: Introduce common_threshold_handler() helper
rtla/actions: Simplify argument parsing
rtla: Use strdup() to simplify code
rtla: Exit on memory allocation failures during initialization
tools/rtla: Remove unneeded nr_cpus from for_each_monitored_cpu
...
This will make it easier to use.
Cc: Steve French <smfrench@gmail.com>
Cc: Tom Talpey <tom@talpey.com>
Cc: Long Li <longli@microsoft.com>
Cc: Namjae Jeon <linkinjeon@kernel.org>
Cc: linux-cifs@vger.kernel.org
Cc: samba-technical@lists.samba.org
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Make sure calling capable()/ns_capable() actually leads to access denied
when false is returned, because these functions emit an audit record
when a Linux Security Module denies the capability, which makes it
difficult to avoid allowing/silencing unnecessary permissions in
security policies (namely with SELinux).
Where the return value just used to set a flag, use the non-auditing
ns_capable_noaudit() instead.
Fixes: 7cea2a3c505e ("fanotify: support limited functionality for unprivileged users")
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Reviewed-by: Paul Moore <paul@paul-moore.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Link: https://patch.msgid.link/20260216150625.793013-2-omosnace@redhat.com
Signed-off-by: Jan Kara <jack@suse.cz>
ext2_unlink() calls inode_dec_link_count() unconditionally, which
invokes drop_nlink(). If the inode was loaded from a corrupted disk
image with i_links_count == 0, drop_nlink()
triggers WARN_ON(inode->i_nlink == 0)
Follow the ext4 pattern from __ext4_unlink(): check i_nlink before
decrementing. If already zero, skip the decrement.
Signed-off-by: Ziyi Guo <n7l8m4@u.northwestern.edu>
Link: https://patch.msgid.link/20260211022052.973114-1-n7l8m4@u.northwestern.edu
Signed-off-by: Jan Kara <jack@suse.cz>
Add check_dmapctl() to validate dmapctl structure integrity, focusing on
preventing invalid operations caused by on-disk corruption.
Key checks:
- nleafs bounded by [0, LPERCTL] (maximum leaf nodes per dmapctl).
- l2nleafs bounded by [0, L2LPERCTL] and consistent with nleafs
(nleafs must be 2^l2nleafs).
- leafidx must be exactly CTLLEAFIND (expected leaf index position).
- height bounded by [0, L2LPERCTL >> 1] (valid tree height range).
- budmin validity: NOFREE only if nleafs=0; otherwise >= BUDMIN.
- Leaf nodes fit within stree array (leafidx + nleafs <= CTLTREESIZE).
- Leaf node values are either non-negative or NOFREE.
Invoked in dbAllocAG(), dbFindCtl(), dbAdjCtl() and dbExtendFS() when
accessing dmapctl pages, catching corruption early before dmap operations
trigger invalid memory access or logic errors.
This fixes the following UBSAN warning.
[58245.668090][T14017] ------------[ cut here ]------------
[58245.668103][T14017] UBSAN: shift-out-of-bounds in fs/jfs/jfs_dmap.c:2641:11
[58245.668119][T14017] shift exponent 110 is too large for 32-bit type 'int'
[58245.668137][T14017] CPU: 0 UID: 0 PID: 14017 Comm: 4c1966e88c28fa9 Tainted: G E 6.18.0-rc4-00253-g21ce5d4ba045-dirty #124 PREEMPT_{RT,(full)}
[58245.668174][T14017] Tainted: [E]=UNSIGNED_MODULE
[58245.668176][T14017] Hardware name: QEMU Ubuntu 25.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[58245.668184][T14017] Call Trace:
[58245.668200][T14017] <TASK>
[58245.668208][T14017] dump_stack_lvl+0x189/0x250
[58245.668288][T14017] ? __pfx_dump_stack_lvl+0x10/0x10
[58245.668301][T14017] ? __pfx__printk+0x10/0x10
[58245.668315][T14017] ? lock_metapage+0x303/0x400 [jfs]
[58245.668406][T14017] ubsan_epilogue+0xa/0x40
[58245.668422][T14017] __ubsan_handle_shift_out_of_bounds+0x386/0x410
[58245.668462][T14017] dbSplit+0x1f8/0x200 [jfs]
[58245.668543][T14017] dbAdjCtl+0x34c/0xa20 [jfs]
[58245.668628][T14017] dbAllocNear+0x2ee/0x3d0 [jfs]
[58245.668710][T14017] dbAlloc+0x933/0xba0 [jfs]
[58245.668797][T14017] ea_write+0x374/0xdd0 [jfs]
[58245.668888][T14017] ? __pfx_ea_write+0x10/0x10 [jfs]
[58245.668966][T14017] ? __jfs_setxattr+0x76e/0x1120 [jfs]
[58245.669046][T14017] __jfs_setxattr+0xa01/0x1120 [jfs]
[58245.669135][T14017] ? __pfx___jfs_setxattr+0x10/0x10 [jfs]
[58245.669216][T14017] ? mutex_lock_nested+0x154/0x1d0
[58245.669252][T14017] ? __jfs_xattr_set+0xb9/0x170 [jfs]
[58245.669333][T14017] __jfs_xattr_set+0xda/0x170 [jfs]
[58245.669430][T14017] ? __pfx___jfs_xattr_set+0x10/0x10 [jfs]
[58245.669509][T14017] ? xattr_full_name+0x6f/0x90
[58245.669546][T14017] ? jfs_xattr_set+0x33/0x60 [jfs]
[58245.669636][T14017] ? __pfx_jfs_xattr_set+0x10/0x10 [jfs]
[58245.669726][T14017] __vfs_setxattr+0x43c/0x480
[58245.669743][T14017] __vfs_setxattr_noperm+0x12d/0x660
[58245.669756][T14017] vfs_setxattr+0x16b/0x2f0
[58245.669768][T14017] ? __pfx_vfs_setxattr+0x10/0x10
[58245.669782][T14017] filename_setxattr+0x274/0x600
[58245.669795][T14017] ? __pfx_filename_setxattr+0x10/0x10
[58245.669806][T14017] ? getname_flags+0x1e5/0x540
[58245.669829][T14017] path_setxattrat+0x364/0x3a0
[58245.669840][T14017] ? __pfx_path_setxattrat+0x10/0x10
[58245.669859][T14017] ? __se_sys_chdir+0x1b9/0x280
[58245.669876][T14017] __x64_sys_lsetxattr+0xbf/0xe0
[58245.669888][T14017] do_syscall_64+0xfa/0xfa0
[58245.669901][T14017] ? lockdep_hardirqs_on+0x9c/0x150
[58245.669913][T14017] ? entry_SYSCALL_64_after_hwframe+0x77/0x7f
[58245.669927][T14017] ? exc_page_fault+0xab/0x100
[58245.669937][T14017] entry_SYSCALL_64_after_hwframe+0x77/0x7f
Reported-by: syzbot+4c1966e88c28fa96e053@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=4c1966e88c28fa96e053
Signed-off-by: Yun Zhou <yun.zhou@windriver.com>
Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>
Pull runtime verification updates from Steven Rostedt:
- Refactor da_monitor header to share handlers across monitor types
No functional changes, only less code duplication.
- Add Hybrid Automata model class
Add a new model class that extends deterministic automata by adding
constraints on transitions and states. Those constraints can take
into account wall-clock time and as such allow RV monitor to make
assertions on real time. Add documentation and code generation
scripts.
- Add stall monitor as hybrid automaton example
Add a monitor that triggers a violation when a task is stalling as an
example of automaton working with real time variables.
- Convert the opid monitor to a hybrid automaton
The opid monitor can be heavily simplified if written as a hybrid
automaton: instead of tracking preempt and interrupt enable/disable
events, it can just run constraints on the preemption/interrupt
states when events like wakeup and need_resched verify.
- Add support for per-object monitors in DA/HA
Allow writing deterministic and hybrid automata monitors for generic
objects (e.g. any struct), by exploiting a hash table where objects
are saved. This allows to track more than just tasks in RV. For
instance it will be used to track deadline entities in deadline
monitors.
- Add deadline tracepoints and move some deadline utilities
Prepare the ground for deadline monitors by defining events and
exporting helpers.
- Add nomiss deadline monitor
Add first example of deadline monitor asserting all entities complete
before their deadline.
- Improve rvgen error handling
Introduce AutomataError exception class and better handle expected
exceptions while showing a backtrace for unexpected ones.
- Improve python code quality in rvgen
Refactor the rvgen generation scripts to align with python best
practices: use f-strings instead of %, use len() instead of
__len__(), remove semicolons, use context managers for file
operations, fix whitespace violations, extract magic strings into
constants, remove unused imports and methods.
- Fix small bugs in rvgen
The generator scripts presented some corner case bugs: logical error
in validating what a correct dot file looks like, fix an isinstance()
check, enforce a dot file has an initial state, fix type annotations
and typos in comments.
- rvgen refactoring
Refactor automata.py to use iterator-based parsing and handle
required arguments directly in argparse.
- Allow epoll in rtapp-sleep monitor
The epoll_wait call is now rt-friendly so it should be allowed in the
sleep monitor as a valid sleep method.
* tag 'trace-rv-v7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (32 commits)
rv: Allow epoll in rtapp-sleep monitor
rv/rvgen: fix _fill_states() return type annotation
rv/rvgen: fix unbound loop variable warning
rv/rvgen: enforce presence of initial state
rv/rvgen: extract node marker string to class constant
rv/rvgen: fix isinstance check in Variable.expand()
rv/rvgen: make monitor arguments required in rvgen
rv/rvgen: remove unused __get_main_name method
rv/rvgen: remove unused sys import from dot2c
rv/rvgen: refactor automata.py to use iterator-based parsing
rv/rvgen: use class constant for init marker
rv/rvgen: fix DOT file validation logic error
rv/rvgen: fix PEP 8 whitespace violations
rv/rvgen: fix typos in automata and generator docstring and comments
rv/rvgen: use context managers for file operations
rv/rvgen: remove unnecessary semicolons
rv/rvgen: replace __len__() calls with len()
rv/rvgen: replace % string formatting with f-strings
rv/rvgen: remove bare except clauses in generator
rv/rvgen: introduce AutomataError exception class
...
The behavior of RTLA on receiving SIGINT is currently undocumented.
Describe it in RTLA's common appendix that appears in man pages for all
RTLA tools to avoid confusion.
Suggested-by: Attila Fazekas <afazekas@redhat.com>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Link: https://lore.kernel.org/r/20260324123229.152424-1-tglozar@redhat.com
Signed-off-by: Tomas Glozar <tglozar@redhat.com>
- fuse_mutex is not needed for device cloning, because fuse_dev_install()
uses cmpxcg() to set fud->fc, which prevents races between clone/mount
or clone/clone. This makes the logic simpler
- Drop fc->dev_count. This is only used to check in release if the device
is the last clone, but checking list_empty(&fc->devices) is equivalent
after removing the released device from the list. Removing the fuse_dev
before calling fuse_abort_conn() is okay, since the processing and io
lists are now empty for this device.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
smb_direct_flush_send_list() already calls smb_direct_free_sendmsg(),
so we should not call it again after post_sendmsg()
moved it to the batch list.
Reported-by: Ruikai Peng <ruikai@pwno.io>
Closes: https://lore.kernel.org/linux-cifs/CAFD3drNOSJ05y3A+jNXSDxW-2w09KHQ0DivhxQ_pcc7immVVOQ@mail.gmail.com/
Fixes: 34abd408c8ba ("smb: server: make use of smbdirect_socket.send_io.bcredits")
Cc: stable@kernel.org
Cc: Steve French <smfrench@gmail.com>
Cc: Tom Talpey <tom@talpey.com>
Cc: Ruikai Peng <ruikai@pwno.io>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: linux-cifs@vger.kernel.org
Cc: samba-technical@lists.samba.org
Cc: security@kernel.org
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Acked-by: Paulo Alcantara (Red Hat) <pc@manguebit.org>
Tested-by: Ruikai Peng <ruikai@pwno.io>
Signed-off-by: Steve French <stfrench@microsoft.com>
When fsnotify_add_inode_mark_locked() fails in inotify_new_watch(),
the error path calls inotify_remove_from_idr() but does not call
dec_inotify_watches() to undo the preceding inc_inotify_watches().
This leaks a watch count, and repeated failures can exhaust the
max_user_watches limit with -ENOSPC even when no watches are active.
Prior to commit 1cce1eea0aff ("inotify: Convert to using per-namespace
limits"), the watch count was incremented after fsnotify_add_mark_locked()
succeeded, so this path was not affected. The conversion moved
inc_inotify_watches() before the mark insertion without adding the
corresponding rollback.
Add the missing dec_inotify_watches() call in the error path.
Fixes: 1cce1eea0aff ("inotify: Convert to using per-namespace limits")
Cc: stable@vger.kernel.org
Signed-off-by: Chia-Ming Chang <chiamingc@synology.com>
Signed-off-by: robbieko <robbieko@synology.com>
Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
Link: https://patch.msgid.link/20260224093442.3076294-1-chiamingc@synology.com
Signed-off-by: Jan Kara <jack@suse.cz>
The function __rsv_window_dump() is a heavyweight debug tool that walks
the reservation red-black tree. It is currently guarded by #if 1,
forcing it to be compiled into all kernels, even production ones.
Match the rest of the file by guarding it with #ifdef EXT2FS_DEBUG,
so it is only included when explicit debugging is enabled.
This removes the unused function code from standard builds.
Signed-off-by: Milos Nikic <nikic.milos@gmail.com>
Link: https://patch.msgid.link/20260207022920.258247-1-nikic.milos@gmail.com
Signed-off-by: Jan Kara <jack@suse.cz>
Add check_dtpage() to validate dtpage_t integrity, focusing on
preventing index/pointer overflows from on-disk corruption.
Key checks:
- maxslot must be exactly DTPAGEMAXSLOT (128) as defined for dtpage
slot array.
- freecnt bounded by [0, DTPAGEMAXSLOT-1] (slot[0] reserved for header).
- freelist validity: -1 when freecnt=0; 1~DTPAGEMAXSLOT-1 when non-zero,
with linked list checks (no duplicates, proper termination via next=-1).
- stblindex bounds: must be within range that avoids overlapping with
stbl itself (stblindex < DTPAGEMAXSLOT - stblsize).
- nextindex bounded by stbl size (stblsize << L2DTSLOTSIZE). stbl entries
validity: within 1~DTPAGEMAXSLOT-1, no duplicates(excluding invalid
entries marked as -1).
Invoked when loading dtpage (in BT_GETPAGE macro context) to catch
corruption early before directory operations trigger out-of-bounds access.
Signed-off-by: Yun Zhou <yun.zhou@windriver.com>
Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>
Pull ktest updates from Steven Rostedt:
- Fix undef warning when WARNINGS_FILE is unset
The check_buildlog() references WARNINGS_FILE even when it's not set.
Perl triggers a warning in this case. Check if the WARNINGS_FILE is
defined before checking if the file it represents exists.
- Fix how LOG_FILE is resolved
LOG_FILE is expanded immediately after the config file is parsed. If
LOG_FILE depends on variables from the tests it will use stale values
instead of using the test variables. Have LOG_FILE also resolve test
variables.
- Treat a undefined self reference variable as empty
Variables can recursively include itself for appending. Currently, if
the references itself and it is not defined, it leaves the variable
in the define: "VAR = ${VAR} foo" keeps the ${VAR} around. Have it
removed instead.
- Fix clearing of variables per tests
If a variable has a defined default, a test can not clear it by
assigning the variable to empty. Fix this by clearing the variable
for a test when the test config has that variable assigned to
nothing.
- Fix run_command() to catch stderr in the shell command parsing
Switch to Perl list form open to use "sh -c" wrapper to run shell
commands to have the log file catch shell parsing errors.
- Fix console output during reboot cycle
The POWER_CYCLE callback during reboot() can miss output from the
next boot making ktest miss the boot string it was waiting for.
- Add PRE_KTEST_DIE for PRE_KTEST failures
If the command for PRE_KTEST fails, ktest does not fail (this was by
design as this command was used to add patches that may or may not
apply). Add PRE_KTEST_DIE value to force ktest to fail if PRE_KTEST
fails.
- Run POST_KTEST hooks on failure and cancellation
PRE_KTEST always runs before a ktest test, have POST_KTEST always run
after a test even if the test fails or is cancelled to do the
teardown of PRE_KTEST.
- Add a --dry-run mode
Add --dry-run to parse the config, print the results and exit without
running any of the tests.
- Store failures from the dodie() path as well
The STORE_FAILURES saves the logs on failure, but there's failure
paths that miss storing. Perform STORE_FAILURES in dodie() to capture
these failures too.
* tag 'ktest-v7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-ktest:
ktest: Store failure logs also in fatal paths
ktest: Add a --dry-run mode
ktest: Run POST_KTEST hooks on failure and cancellation
ktest: Add PRE_KTEST_DIE for PRE_KTEST failures
ktest: Stop dropping console output during power-cycle reboot
ktest: Run commands through list-form shell open
ktest: Honor empty per-test option overrides
ktest: Treat undefined self-reference as empty
ktest: Resolve LOG_FILE in test option context
ktest: Avoid undef warning when WARNINGS_FILE is unset
Since commit 0c43094f8cc9 ("eventpoll: Replace rwlock with spinlock"),
epoll_wait is real-time-safe syscall for sleeping.
Add epoll_wait to the list of rt-safe sleeping APIs.
Signed-off-by: Nam Cao <namcao@linutronix.de>
Reviewed-by: Gabriele Monaco <gmonaco@redhat.com>
Link: https://lore.kernel.org/r/20260401130828.3115428-1-namcao@linutronix.de
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
Detach stop_trace() from SIGINT/SIGALRM on tool clean-up to prevent it
from crashing RTLA by accessing freed memory.
This prevents a crash when multiple SIGINTs are received.
Fixes: d6899e560366 ("rtla/timerlat_hist: Abort event processing on second signal")
Fixes: 80967b354a76 ("rtla/timerlat_top: Abort event processing on second signal")
Reviewed-by: Wander Lairson Costa <wander@redhat.com>
Link: https://lore.kernel.org/r/20260310160725.144443-1-tglozar@redhat.com
Signed-off-by: Tomas Glozar <tglozar@redhat.com>
With the new mount API the sequence of syscalls would be:
fs_fd = fsopen("fuse", 0);
snprintf(opt, sizeof(opt), "%i", devfd);
fsconfig(fs_fd, FSCONFIG_SET_STRING, "fd", opt, 0);
/* ... */
fsconfig(fs_fd, FSCONFIG_CMD_CREATE, 0, 0, 0);
Current mount code just stores the value of devfd in the fs_context and
uses it in during FSCONFIG_CMD_CREATE, which is inelegant.
Instead grab a reference to the underlying fuse_dev, and use that during
the filesystem creation.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
In gfs2_dinode_in(), only allow directories to have the GFS2_DIF_EXHASH
flag set. This will prevent other parts of the code from treating
regular inodes as directories based on the presence of that flag.
In sweep_bh_for_rgrps() and __gfs2_free_blocks(), check if the
GFS2_DIF_EXHASH flag is set instead of checking if i_depth is non-zero.
This matches what the directory code does. (The i_depth checks were
introduced in commit 6d3117b412951 ("GFS2: Wipe directory hash table
metadata when deallocating a directory").)
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
smbd_send_batch_flush() already calls smbd_free_send_io(),
so we should not call it again after smbd_post_send()
moved it to the batch list.
Reported-by: Ruikai Peng <ruikai@pwno.io>
Closes: https://lore.kernel.org/linux-cifs/CAFD3drNOSJ05y3A+jNXSDxW-2w09KHQ0DivhxQ_pcc7immVVOQ@mail.gmail.com/
Fixes: 21538121efe6 ("smb: client: make use of smbdirect_socket.send_io.bcredits")
Cc: stable@kernel.org
Cc: Steve French <smfrench@gmail.com>
Cc: Tom Talpey <tom@talpey.com>
Cc: Long Li <longli@microsoft.com>
Cc: Ruikai Peng <ruikai@pwno.io>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: linux-cifs@vger.kernel.org
Cc: samba-technical@lists.samba.org
Cc: security@kernel.org
Acked-by: Paulo Alcantara (Red Hat) <pc@manguebit.org>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Tested-by: Ruikai Peng <ruikai@pwno.io>
Signed-off-by: Steve French <stfrench@microsoft.com>
Pull erofs fixes from Gao Xiang:
- Do not share the page cache if the real @aops differs
- Fix the incomplete condition for interlaced plain extents
- Get rid of more unnecessary #ifdefs
* tag 'erofs-for-7.0-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
erofs: fix interlaced plain identification for encoded extents
erofs: remove more unnecessary #ifdefs
erofs: allow sharing page cache with the same aops only
If ext2_get_blocks() is called with maxblocks == 0, it currently triggers
a BUG_ON(), causing a kernel panic.
While this condition implies a logic error in the caller, a filesystem
should not crash the system due to invalid arguments.
Replace the BUG_ON() with a WARN_ON_ONCE() to provide a stack trace for
debugging, and return -EINVAL to handle the error gracefully.
Signed-off-by: Milos Nikic <nikic.milos@gmail.com>
Link: https://patch.msgid.link/20260207010617.216675-1-nikic.milos@gmail.com
Signed-off-by: Jan Kara <jack@suse.cz>
Add check_dtroot() to validate dtroot_t integrity, focusing on preventing
index/pointer overflows from on-disk corruption.
Key checks:
- freecnt bounded by [0, DTROOTMAXSLOT-1] (slot[0] reserved for header).
- freelist validity: -1 when freecnt=0; 1~DTROOTMAXSLOT-1 when non-zero,
with linked list checks (no duplicates, proper termination via next=-1).
- stbl bounds: nextindex within stbl array size; entries within 0~8, no
duplicates (excluding idx=0).
Invoked in copy_from_dinode() when loading directory inodes, catching
corruption early before directory operations trigger out-of-bounds access.
This fixes the following UBSAN warning.
[ 101.832754][ T5960] ------------[ cut here ]------------
[ 101.832762][ T5960] UBSAN: array-index-out-of-bounds in fs/jfs/jfs_dtree.c:3713:8
[ 101.832792][ T5960] index -1 is out of range for type 'struct dtslot[128]'
[ 101.832807][ T5960] CPU: 2 UID: 0 PID: 5960 Comm: 5f7f0caf9979e9d Tainted: G E 6.18.0-rc4-00250-g2603eb907f03 #119 PREEMPT_{RT,(full
[ 101.832817][ T5960] Tainted: [E]=UNSIGNED_MODULE
[ 101.832819][ T5960] Hardware name: QEMU Ubuntu 25.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[ 101.832823][ T5960] Call Trace:
[ 101.832833][ T5960] <TASK>
[ 101.832838][ T5960] dump_stack_lvl+0x189/0x250
[ 101.832909][ T5960] ? __pfx_dump_stack_lvl+0x10/0x10
[ 101.832925][ T5960] ? __pfx__printk+0x10/0x10
[ 101.832934][ T5960] ? rt_mutex_slowunlock+0x493/0x8a0
[ 101.832959][ T5960] ubsan_epilogue+0xa/0x40
[ 101.832966][ T5960] __ubsan_handle_out_of_bounds+0xe9/0xf0
[ 101.833007][ T5960] dtInsertEntry+0x936/0x1430 [jfs]
[ 101.833094][ T5960] dtSplitPage+0x2c8b/0x3ed0 [jfs]
[ 101.833177][ T5960] ? __pfx_rt_mutex_slowunlock+0x10/0x10
[ 101.833193][ T5960] dtInsert+0x109b/0x6000 [jfs]
[ 101.833283][ T5960] ? rt_mutex_slowunlock+0x493/0x8a0
[ 101.833296][ T5960] ? __pfx_rt_mutex_slowunlock+0x10/0x10
[ 101.833307][ T5960] ? rt_spin_unlock+0x161/0x200
[ 101.833315][ T5960] ? __pfx_dtInsert+0x10/0x10 [jfs]
[ 101.833391][ T5960] ? txLock+0xaf9/0x1cb0 [jfs]
[ 101.833477][ T5960] ? dtInitRoot+0x22a/0x670 [jfs]
[ 101.833556][ T5960] jfs_mkdir+0x6ec/0xa70 [jfs]
[ 101.833636][ T5960] ? __pfx_jfs_mkdir+0x10/0x10 [jfs]
[ 101.833721][ T5960] ? generic_permission+0x2e5/0x690
[ 101.833760][ T5960] ? bpf_lsm_inode_mkdir+0x9/0x20
[ 101.833776][ T5960] vfs_mkdir+0x306/0x510
[ 101.833786][ T5960] do_mkdirat+0x247/0x590
[ 101.833795][ T5960] ? __pfx_do_mkdirat+0x10/0x10
[ 101.833804][ T5960] ? getname_flags+0x1e5/0x540
[ 101.833815][ T5960] __x64_sys_mkdir+0x6c/0x80
[ 101.833823][ T5960] do_syscall_64+0xfa/0xfa0
[ 101.833832][ T5960] ? lockdep_hardirqs_on+0x9c/0x150
[ 101.833840][ T5960] ? entry_SYSCALL_64_after_hwframe+0x77/0x7f
[ 101.833847][ T5960] ? exc_page_fault+0xab/0x100
[ 101.833856][ T5960] entry_SYSCALL_64_after_hwframe+0x77/0x7f
Signed-off-by: Yun Zhou <yun.zhou@windriver.com>
Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>
Pull tracefs updates from Steven Rostedt:
- Simplify error handling with guards()
Use guards() to simplify the handling of releasing locks in exit
paths.
- Use dentry name snapshots instead of allocation
Instead of allocating a temp buffer to store the dentry name to use
in mkdir() and rmdir() use take_dentry_name_snapshot().
- Fix default permissions not being applied at boot
The default permissions for tracefs was 0700 to only allow root
having access. But after a change to fix other mount options the
update to permissions ignored the defined default and used the system
default of 0755. This is a regression and is fixed.
* tag 'tracefs-v7.1-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tracefs: Removed unused 'ret' variable in eventfs_iterate()
tracefs: Fix default permissions not being applied on initial mount
tracefs: Use dentry name snapshots instead of heap allocation
eventfs: Simplify code using guard()s
STORE_FAILURES was only saved from fail(), so paths that reached dodie()
could exit without preserving failure logs.
That includes fatal hook paths such as:
POST_BUILD_DIE = 1
and ordinary failures when:
DIE_ON_FAILURE = 1
Call save_logs("fail", ...) from dodie() too so fatal failures keep the
same STORE_FAILURES artifacts as non-fatal fail() paths.
Cc: John Hawley <warthog9@eaglescrag.net>
Link: https://patch.msgid.link/20260318-ktest-fixes-v1-1-9dd94d46d84c@suse.com
Signed-off-by: Ricardo B. Marlière <rbm@suse.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
The _fill_states() method returns a list of strings, but the type
annotation incorrectly specified str. Update the annotation to
list[str] to match the actual return value.
Signed-off-by: Wander Lairson Costa <wander@redhat.com>
Reviewed-by: Gabriele Monaco <gmonaco@redhat.com>
Link: https://lore.kernel.org/r/20260223162407.147003-20-wander@redhat.com
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
The procfs_is_workload_pid() function iterates through a directory
entry name to validate if it represents a process ID. The loop
condition checks if the pointer t_name is non-NULL, but since
incrementing a pointer never makes it NULL, this condition is always
true within the loop's context. Although the inner isdigit() check
catches the NUL terminator and breaks out of the loop, the condition
is semantically misleading and not idiomatic for C string processing.
Correct the loop condition from checking the pointer (t_name) to
checking the character it points to (*t_name). This ensures the loop
terminates when the NUL terminator is reached, aligning with standard
C string iteration practices. While the original code functioned
correctly due to the existing character validation, this change
improves code clarity and maintainability.
Signed-off-by: Wander Lairson Costa <wander@redhat.com>
Link: https://lore.kernel.org/r/20260309195040.1019085-19-wander@redhat.com
Signed-off-by: Tomas Glozar <tglozar@redhat.com>
This will make it possible to grab the fuse_dev and subsequently release
the file that it came from.
In the above case, fud->fc will be set to FUSE_DEV_FC_DISCONNECTED to
indicate that this is no longer a functional device.
When trying to assign an fc to such a disconnected fuse_dev, the fc is set
to the disconnected state.
Use atomic operations xchg() and cmpxchg() to prevent races.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
When a withdraw occurs in gfs2_log_flush() and we are left with an unsubmitted
bio, fail that bio. Otherwise, the bh's in that bio will remain locked and
gfs2_evict_inode() -> truncate_inode_pages() -> gfs2_invalidate_folio() ->
gfs2_discard() will hang trying to discard the locked bh's.
In addition, when gfs2_log_flush() fails to submit a new transaction, unpin the
buffers in the failing transaction like gfs2_remove_from_journal() does. If
any of the bd's are on the ail2 list, leave them there and do_withdraw() ->
gfs2_withdraw_glocks() -> inode_go_inval() -> truncate_inode_pages() ->
gfs2_invalidate_folio() -> gfs2_discard() will remove them. They will be freed
in gfs2_release_folio().
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
ksmbd_crypt_message() sets a NULL completion callback on AEAD requests
and does not handle the -EINPROGRESS return code from async hardware
crypto engines like the Qualcomm Crypto Engine (QCE). When QCE returns
-EINPROGRESS, ksmbd treats it as an error and immediately frees the
request while the hardware DMA operation is still in flight. The DMA
completion callback then dereferences freed memory, causing a NULL
pointer crash:
pc : qce_skcipher_done+0x24/0x174
lr : vchan_complete+0x230/0x27c
...
el1h_64_irq+0x68/0x6c
ksmbd_free_work_struct+0x20/0x118 [ksmbd]
ksmbd_exit_file_cache+0x694/0xa4c [ksmbd]
Use the standard crypto_wait_req() pattern with crypto_req_done() as
the completion callback, matching the approach used by the SMB client
in fs/smb/client/smb2ops.c. This properly handles both synchronous
engines (immediate return) and async engines (-EINPROGRESS followed
by callback notification).
Fixes: e2f34481b24d ("cifsd: add server-side procedures for SMB3")
Link: https://github.com/openwrt/openwrt/issues/21822
Signed-off-by: Joshua Klinesmith <joshuaklinesmith@gmail.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Pull ata fixes from Niklas Cassel:
- The newly introduced feature that issues a deferred (non-NCQ) command
from a workqueue, forgot to consider the case where the deferred QC
times out. Fix the code to take timeouts into consideration, which
avoids a use after free (Damien)
- The newly introduced feature that issues a deferred (non-NCQ) command
from a workqueue, when unloading the module, calls cancel_work_sync(),
a function that can sleep, while holding a spin lock. Move the function
call outside the lock (Damien)
* tag 'ata-7.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux:
ata: libata-core: fix cancellation of a port deferred qc work
ata: libata-eh: correctly handle deferred qc timeouts
Only plain data whose start position and on-disk physical length are
both aligned to the block size should be classified as interlaced
plain extents. Otherwise, it must be treated as shifted plain extents.
This issue was found by syzbot using a crafted compressed image
containing plain extents with unaligned physical lengths, which can
cause OOB read in z_erofs_transform_plain().
Reported-and-tested-by: syzbot+d988dc155e740d76a331@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/r/699d5714.050a0220.cdd3c.03e7.GAE@google.com
Fixes: 1d191b4ca51d ("erofs: implement encoded extent metadata")
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
The TODO comment in the file header asking to get rid of kmap() is
outdated. The code has already been converted to use the folio API
(specifically kmap_local_folio).
Remove the stale comment to reflect the current state of the code.
Signed-off-by: Milos Nikic <nikic.milos@gmail.com>
Link: https://patch.msgid.link/20260207002908.176933-1-nikic.milos@gmail.com
Signed-off-by: Jan Kara <jack@suse.cz>
Pull ring-buffer updates from Steven Rostedt:
- Add remote buffers for pKVM
pKVM has a hypervisor component that is used to protect the guest
from the host kernel. This hypervisor is a black box to the kernel as
the kernel is to user space. The remote buffers are used to have a
memory mapping between the hypervisor and the kernel where kernel may
send commands to enable tracing within the hypervisor. Then the
kernel will read this memory mapping just like user space can read
the memory mapped ring buffer of the kernel tracing system.
Since the hypervisor only has a single context, it doesn't need to
worry about races between normal context, interrupt context and NMIs
like the kernel does. The ring buffer it uses doesn't need to be as
complex. The remote buffers are a simple version of the ring buffer
that works in a single context. They are still per-CPU and use sub
buffers. The data layout is the same as the kernel's ring buffer to
share the same parsing.
Currently, only ARM64 implements pKVM, but there's work to implement
it also in x86. The remote buffer code is separated out from the ARM
implementation so that it can be used in the future by x86.
The ARM64 updates for pKVM is in the ARM/KVM tree and it merged in
the remote buffers of this tree.
- Make the backup instance non reusable
The backup instance is a copy of the persistent ring buffer so that
the persistent ring buffer could start recording again without using
the data from the previous boot. The backup isn't for normal tracing.
It is made read-only, and after it is consumed, it is automatically
removed.
- Have backup copy persistent instance before it starts recording
To allow the persistent ring buffer to start recording from the
kernel command line commands, move the copy of the backup instance to
before the the command line options start recording.
- Report header_page overwrite field as "char" and not "int'
The rust parser of the header_page file was triggering a warning when
it defined the overwrite variable as "int" but it was only a single
byte in size.
- Fix memory barriers for the trace_buffer CPU mask
When a CPU comes online, the bit is set to allow readers to know that
the CPU buffer is allocated. The bit is set after the allocation is
done, and a smp_wmb() is performed after the allocation and before
the setting of the bit. But instead of adding a smp_rmb() to all
readers, since once a buffer is created for a CPU it is not deleted
if that CPU goes offline, so this allocation is almost always done at
boot up before any readers exist.
If for the unlikely case where a CPU comes online for the first time
after the system boot has finished, send an IPI to all CPUs to force
the smp_rmb() for each CPU.
- Show clock function being used in debugging ring buffer data
When the ring buffer checks are enabled and the ring buffer detects
an inconsistency in the times of the invents, print out the clock
being used when the error occurred. There was a very hard to hit bug
that would happen every so often and it ended up being only triggered
when the jiffies clock was being used. If the bug showed the clock
being used, it would have been much easier to find the problem (which
was an internal function was being traced which caused the clock
accounting to go off).
* tag 'trace-ringbuffer-v7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (26 commits)
ring-buffer: Prevent off-by-one array access in ring_buffer_desc_page()
ring-buffer: Report header_page overwrite as char
tracing: Allow backup to save persistent ring buffer before it starts
tracing/Documentation: Add a section about backup instance
tracing: Remove the backup instance automatically after read
tracing: Make the backup instance non-reusable
ring-buffer: Enforce read ordering of trace_buffer cpumask and buffers
ring-buffer: Show what clock function is used on timestamp errors
tracing: Check for undefined symbols in simple_ring_buffer
tracing: load/unload page callbacks for simple_ring_buffer
Documentation: tracing: Add tracing remotes
tracing: selftests: Add trace remote tests
tracing: Add a trace remote module for testing
tracing: Introduce simple_ring_buffer
ring-buffer: Export buffer_data_page and macros
tracing: Add helpers to create trace remote events
tracing: Add events/ root files to trace remotes
tracing: Add events to trace remotes
tracing: Add init callback to trace remotes
tracing: Add non-consuming read to trace remotes
...
Moving to guard() usage removed the need of using the 'ret' variable but
it wasn't removed. As it was set to zero, the compiler in use didn't warn
(although some compilers do).
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Link: https://patch.msgid.link/20260414110344.75c0663f@robin
Fixes: 4d9b262031f ("eventfs: Simplify code using guard()s")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202604100111.AAlbQKmK-lkp@intel.com/
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
When working on a ktest configuration, it is often useful to inspect the
final option values after includes, defaults, per-test overrides, and
variable expansion have been applied, without actually starting a test run.
Add a --dry-run option that reads the configuration, prints the test
preamble using resolved option values, and exits before opening LOG_FILE or
executing any test logic.
This is useful for debugging ktest configurations and for scripts that need
to validate the final resolved settings without triggering side effects.
Cc: John Hawley <warthog9@eaglescrag.net>
Cc: Andrea Righi <arighi@nvidia.com>
Cc: Marcos Paulo de Souza <mpdesouza@suse.com>
Cc: Matthieu Baerts <matttbe@kernel.org>
Cc: Fernando Fernandez Mancera <fmancera@suse.de>
Cc: Pedro Falcato <pfalcato@suse.de>
Link: https://patch.msgid.link/20260307-ktest-fixes-v1-9-565d412f4925@suse.com
Signed-off-by: Ricardo B. Marlière <rbm@suse.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Pyright static analysis reports a "possibly unbound variable" warning
for the loop variable `i` in the `abbreviate_atoms` function. The
variable is accessed after the inner loop terminates to slice the atom
string. While the loop logic currently ensures execution, the analyzer
flags the reliance on the loop variable persisting outside its scope.
Refactor the prefix length calculation into a nested `find_share_length`
helper function. This encapsulates the search logic and uses explicit
return statements, ensuring the length value is strictly defined. This
satisfies the type checker and improves code readability without
altering the runtime behavior.
Signed-off-by: Wander Lairson Costa <wander@redhat.com>
Reviewed-by: Gabriele Monaco <gmonaco@redhat.com>
Link: https://lore.kernel.org/r/20260223162407.147003-19-wander@redhat.com
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>