Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

hung_task: replace blocker_mutex with encoded blocker

Patch series "hung_task: extend blocking task stacktrace dump to
semaphore", v5.

Inspired by mutex blocker tracking[1], this patch series extends the
feature to not only dump the blocker task holding a mutex but also to
support semaphores. Unlike mutexes, semaphores lack explicit ownership
tracking, making it challenging to identify the root cause of hangs. To
address this, we introduce a last_holder field to the semaphore structure,
which is updated when a task successfully calls down() and cleared during
up().

The assumption is that if a task is blocked on a semaphore, the holders
must not have released it. While this does not guarantee that the last
holder is one of the current blockers, it likely provides a practical hint
for diagnosing semaphore-related stalls.
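
The bookkeeping described above can be modelled in a few lines of userspace C. This is only a sketch under stated assumptions, not the kernel implementation (which arrives in later patches of the series and is not shown here): `struct demo_sem`, `demo_down_trylock()` and `demo_up()` are hypothetical names, tasks are represented by plain integer IDs, and the model is single-threaded and never sleeps.

```c
/* Single-threaded userspace model of the last_holder hint; not kernel code. */
#include <assert.h>

struct demo_sem {
	unsigned int count;		/* remaining permits */
	unsigned long last_holder;	/* ID of last successful acquirer, 0 if none */
};

/* Try to take a permit; on success, record who took it. */
static int demo_down_trylock(struct demo_sem *sem, unsigned long task_id)
{
	if (sem->count == 0)
		return -1;	/* caller would block; last_holder is the hint */

	sem->count--;
	sem->last_holder = task_id;
	return 0;
}

/* Release a permit and forget the recorded holder. */
static void demo_up(struct demo_sem *sem)
{
	sem->count++;
	sem->last_holder = 0;
}
```

With a count of 1, a second caller fails to acquire while `last_holder` still names the first one, which is the situation the report below prints as "likely last held by". With a count above 1, the field only names the most recent acquirer, which is why the hint is described as likely rather than certain.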

With this change, the hung task detector can now show the blocker task's
info, as below:

[Tue Apr 8 12:19:07 2025] INFO: task cat:945 blocked for more than 120 seconds.
[Tue Apr 8 12:19:07 2025] Tainted: G E 6.14.0-rc6+ #1
[Tue Apr 8 12:19:07 2025] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Tue Apr 8 12:19:07 2025] task:cat state:D stack:0 pid:945 tgid:945 ppid:828 task_flags:0x400000 flags:0x00000000
[Tue Apr 8 12:19:07 2025] Call Trace:
[Tue Apr 8 12:19:07 2025] <TASK>
[Tue Apr 8 12:19:07 2025] __schedule+0x491/0xbd0
[Tue Apr 8 12:19:07 2025] schedule+0x27/0xf0
[Tue Apr 8 12:19:07 2025] schedule_timeout+0xe3/0xf0
[Tue Apr 8 12:19:07 2025] ? __folio_mod_stat+0x2a/0x80
[Tue Apr 8 12:19:07 2025] ? set_ptes.constprop.0+0x27/0x90
[Tue Apr 8 12:19:07 2025] __down_common+0x155/0x280
[Tue Apr 8 12:19:07 2025] down+0x53/0x70
[Tue Apr 8 12:19:07 2025] read_dummy_semaphore+0x23/0x60
[Tue Apr 8 12:19:07 2025] full_proxy_read+0x5f/0xa0
[Tue Apr 8 12:19:07 2025] vfs_read+0xbc/0x350
[Tue Apr 8 12:19:07 2025] ? __count_memcg_events+0xa5/0x140
[Tue Apr 8 12:19:07 2025] ? count_memcg_events.constprop.0+0x1a/0x30
[Tue Apr 8 12:19:07 2025] ? handle_mm_fault+0x180/0x260
[Tue Apr 8 12:19:07 2025] ksys_read+0x66/0xe0
[Tue Apr 8 12:19:07 2025] do_syscall_64+0x51/0x120
[Tue Apr 8 12:19:07 2025] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[Tue Apr 8 12:19:07 2025] RIP: 0033:0x7f419478f46e
[Tue Apr 8 12:19:07 2025] RSP: 002b:00007fff1c4d2668 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[Tue Apr 8 12:19:07 2025] RAX: ffffffffffffffda RBX: 0000000000020000 RCX: 00007f419478f46e
[Tue Apr 8 12:19:07 2025] RDX: 0000000000020000 RSI: 00007f4194683000 RDI: 0000000000000003
[Tue Apr 8 12:19:07 2025] RBP: 00007f4194683000 R08: 00007f4194682010 R09: 0000000000000000
[Tue Apr 8 12:19:07 2025] R10: fffffffffffffbc5 R11: 0000000000000246 R12: 0000000000000000
[Tue Apr 8 12:19:07 2025] R13: 0000000000000003 R14: 0000000000020000 R15: 0000000000020000
[Tue Apr 8 12:19:07 2025] </TASK>
[Tue Apr 8 12:19:07 2025] INFO: task cat:945 blocked on a semaphore likely last held by task cat:938
[Tue Apr 8 12:19:07 2025] task:cat state:S stack:0 pid:938 tgid:938 ppid:584 task_flags:0x400000 flags:0x00000000
[Tue Apr 8 12:19:07 2025] Call Trace:
[Tue Apr 8 12:19:07 2025] <TASK>
[Tue Apr 8 12:19:07 2025] __schedule+0x491/0xbd0
[Tue Apr 8 12:19:07 2025] ? _raw_spin_unlock_irqrestore+0xe/0x40
[Tue Apr 8 12:19:07 2025] schedule+0x27/0xf0
[Tue Apr 8 12:19:07 2025] schedule_timeout+0x77/0xf0
[Tue Apr 8 12:19:07 2025] ? __pfx_process_timeout+0x10/0x10
[Tue Apr 8 12:19:07 2025] msleep_interruptible+0x49/0x60
[Tue Apr 8 12:19:07 2025] read_dummy_semaphore+0x2d/0x60
[Tue Apr 8 12:19:07 2025] full_proxy_read+0x5f/0xa0
[Tue Apr 8 12:19:07 2025] vfs_read+0xbc/0x350
[Tue Apr 8 12:19:07 2025] ? __count_memcg_events+0xa5/0x140
[Tue Apr 8 12:19:07 2025] ? count_memcg_events.constprop.0+0x1a/0x30
[Tue Apr 8 12:19:07 2025] ? handle_mm_fault+0x180/0x260
[Tue Apr 8 12:19:07 2025] ksys_read+0x66/0xe0
[Tue Apr 8 12:19:07 2025] do_syscall_64+0x51/0x120
[Tue Apr 8 12:19:07 2025] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[Tue Apr 8 12:19:07 2025] RIP: 0033:0x7f7c584a646e
[Tue Apr 8 12:19:07 2025] RSP: 002b:00007ffdba8ce158 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[Tue Apr 8 12:19:07 2025] RAX: ffffffffffffffda RBX: 0000000000020000 RCX: 00007f7c584a646e
[Tue Apr 8 12:19:07 2025] RDX: 0000000000020000 RSI: 00007f7c5839a000 RDI: 0000000000000003
[Tue Apr 8 12:19:07 2025] RBP: 00007f7c5839a000 R08: 00007f7c58399010 R09: 0000000000000000
[Tue Apr 8 12:19:07 2025] R10: fffffffffffffbc5 R11: 0000000000000246 R12: 0000000000000000
[Tue Apr 8 12:19:07 2025] R13: 0000000000000003 R14: 0000000000020000 R15: 0000000000020000
[Tue Apr 8 12:19:07 2025] </TASK>


This patch (of 3):

This patch replaces 'struct mutex *blocker_mutex' with 'unsigned long
blocker', as only one blocker is active at a time.

The blocker field can store both the lock address and the lock type, with
the LSBs used to encode the type as Masami suggested, making it easier to
extend the feature to cover other types of locks.

Also, once the lock type is determined, we can directly extract the
address and cast it to a lock pointer ;)
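
As a rough illustration of the encoding, the userspace sketch below packs an address and a two-bit type into one word and recovers both. It mirrors the constants from the patch, but `encode_blocker()`, `blocker_type()`, `blocker_to_lock()` and `struct demo_lock` are hypothetical stand-ins, and `uintptr_t` replaces the kernel's `unsigned long` so the sketch does not assume that `long` is pointer-sized.

```c
/* Userspace sketch of the low-bit type encoding; not the kernel helpers. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCKER_TYPE_MUTEX	((uintptr_t)0x00)
#define BLOCKER_TYPE_SEM	((uintptr_t)0x01)
#define BLOCKER_TYPE_MASK	((uintptr_t)0x03)

/* Hypothetical lock object; the long member keeps it at least 4-byte aligned. */
struct demo_lock {
	long count;
};

/* Pack address and type into one word; returns 0 if the input is unusable. */
static uintptr_t encode_blocker(void *lock, uintptr_t type)
{
	uintptr_t addr = (uintptr_t)lock;

	/* A NULL or misaligned address, or an out-of-range type, is rejected. */
	if (!addr || (addr & BLOCKER_TYPE_MASK) || (type & ~BLOCKER_TYPE_MASK))
		return 0;

	return addr | type;
}

/* The low two bits carry the type. */
static uintptr_t blocker_type(uintptr_t blocker)
{
	return blocker & BLOCKER_TYPE_MASK;
}

/* Masking the low two bits off yields the original address. */
static void *blocker_to_lock(uintptr_t blocker)
{
	return (void *)(blocker & ~BLOCKER_TYPE_MASK);
}
```

A reader of the encoded word checks `blocker_type()` first and only then casts the result of `blocker_to_lock()` to the matching lock structure, which is the order the commit message describes.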

Link: https://lkml.kernel.org/r/20250414145945.84916-1-ioworker0@gmail.com
Link: https://lore.kernel.org/all/174046694331.2194069.15472952050240807469.stgit@mhiramat.tok.corp.google.com [1]
Link: https://lkml.kernel.org/r/20250414145945.84916-2-ioworker0@gmail.com
Signed-off-by: Mingzhe Yang <mingzhe.yang@ly.com>
Signed-off-by: Lance Yang <ioworker0@gmail.com>
Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Suggested-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Cc: Anna Schumaker <anna.schumaker@oracle.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Joel Granados <joel.granados@kernel.org>
Cc: John Stultz <jstultz@google.com>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tomasz Figa <tfiga@chromium.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yongliang Gao <leonylgao@tencent.com>
Cc: Zi Li <amaindex@outlook.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Authored by Lance Yang and committed by Andrew Morton
e711faaa 50af973c

+115 -8
+99
include/linux/hung_task.h
···
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Detect Hung Task: detecting tasks stuck in D state
+ *
+ * Copyright (C) 2025 Tongcheng Travel (www.ly.com)
+ * Author: Lance Yang <mingzhe.yang@ly.com>
+ */
+#ifndef __LINUX_HUNG_TASK_H
+#define __LINUX_HUNG_TASK_H
+
+#include <linux/bug.h>
+#include <linux/sched.h>
+#include <linux/compiler.h>
+
+/*
+ * @blocker: Combines lock address and blocking type.
+ *
+ * Since lock pointers are at least 4-byte aligned(32-bit) or 8-byte
+ * aligned(64-bit). This leaves the 2 least bits (LSBs) of the pointer
+ * always zero. So we can use these bits to encode the specific blocking
+ * type.
+ *
+ * Type encoding:
+ * 00 - Blocked on mutex        (BLOCKER_TYPE_MUTEX)
+ * 01 - Blocked on semaphore    (BLOCKER_TYPE_SEM)
+ * 10 - Blocked on rt-mutex     (BLOCKER_TYPE_RTMUTEX)
+ * 11 - Blocked on rw-semaphore (BLOCKER_TYPE_RWSEM)
+ */
+#define BLOCKER_TYPE_MUTEX	0x00UL
+#define BLOCKER_TYPE_SEM	0x01UL
+#define BLOCKER_TYPE_RTMUTEX	0x02UL
+#define BLOCKER_TYPE_RWSEM	0x03UL
+
+#define BLOCKER_TYPE_MASK	0x03UL
+
+#ifdef CONFIG_DETECT_HUNG_TASK_BLOCKER
+static inline void hung_task_set_blocker(void *lock, unsigned long type)
+{
+	unsigned long lock_ptr = (unsigned long)lock;
+
+	WARN_ON_ONCE(!lock_ptr);
+	WARN_ON_ONCE(READ_ONCE(current->blocker));
+
+	/*
+	 * If the lock pointer matches the BLOCKER_TYPE_MASK, return
+	 * without writing anything.
+	 */
+	if (WARN_ON_ONCE(lock_ptr & BLOCKER_TYPE_MASK))
+		return;
+
+	WRITE_ONCE(current->blocker, lock_ptr | type);
+}
+
+static inline void hung_task_clear_blocker(void)
+{
+	WARN_ON_ONCE(!READ_ONCE(current->blocker));
+
+	WRITE_ONCE(current->blocker, 0UL);
+}
+
+/*
+ * hung_task_get_blocker_type - Extracts blocker type from encoded blocker
+ * address.
+ *
+ * @blocker: Blocker pointer with encoded type (via LSB bits)
+ *
+ * Returns: BLOCKER_TYPE_MUTEX, BLOCKER_TYPE_SEM, etc.
+ */
+static inline unsigned long hung_task_get_blocker_type(unsigned long blocker)
+{
+	WARN_ON_ONCE(!blocker);
+
+	return blocker & BLOCKER_TYPE_MASK;
+}
+
+static inline void *hung_task_blocker_to_lock(unsigned long blocker)
+{
+	WARN_ON_ONCE(!blocker);
+
+	return (void *)(blocker & ~BLOCKER_TYPE_MASK);
+}
+#else
+static inline void hung_task_set_blocker(void *lock, unsigned long type)
+{
+}
+static inline void hung_task_clear_blocker(void)
+{
+}
+static inline unsigned long hung_task_get_blocker_type(unsigned long blocker)
+{
+	return 0UL;
+}
+static inline void *hung_task_blocker_to_lock(unsigned long blocker)
+{
+	return NULL;
+}
+#endif
+
+#endif /* __LINUX_HUNG_TASK_H */
+5 -1
include/linux/sched.h
···
 #endif

 #ifdef CONFIG_DETECT_HUNG_TASK_BLOCKER
-	struct mutex *blocker_mutex;
+	/*
+	 * Encoded lock address causing task block (lower 2 bits = type from
+	 * <linux/hung_task.h>). Accessed via hung_task_*() helpers.
+	 */
+	unsigned long blocker;
 #endif

 #ifdef CONFIG_DEBUG_ATOMIC_SLEEP
+8 -5
kernel/hung_task.c
···
 #include <linux/sched/signal.h>
 #include <linux/sched/debug.h>
 #include <linux/sched/sysctl.h>
+#include <linux/hung_task.h>

 #include <trace/events/sched.h>

···
 static void debug_show_blocker(struct task_struct *task)
 {
 	struct task_struct *g, *t;
-	unsigned long owner;
-	struct mutex *lock;
+	unsigned long owner, blocker;

 	RCU_LOCKDEP_WARN(!rcu_read_lock_held(), "No rcu lock held");

-	lock = READ_ONCE(task->blocker_mutex);
-	if (!lock)
+	blocker = READ_ONCE(task->blocker);
+	if (!blocker ||
+	    hung_task_get_blocker_type(blocker) != BLOCKER_TYPE_MUTEX)
 		return;

-	owner = mutex_get_owner(lock);
+	owner = mutex_get_owner(
+		(struct mutex *)hung_task_blocker_to_lock(blocker));
+
 	if (unlikely(!owner)) {
 		pr_err("INFO: task %s:%d is blocked on a mutex, but the owner is not found.\n",
 			task->comm, task->pid);
+3 -2
kernel/locking/mutex.c
···
 #include <linux/interrupt.h>
 #include <linux/debug_locks.h>
 #include <linux/osq_lock.h>
+#include <linux/hung_task.h>

 #define CREATE_TRACE_POINTS
 #include <trace/events/lock.h>
···
 		   struct list_head *list)
 {
 #ifdef CONFIG_DETECT_HUNG_TASK_BLOCKER
-	WRITE_ONCE(current->blocker_mutex, lock);
+	hung_task_set_blocker(lock, BLOCKER_TYPE_MUTEX);
 #endif
 	debug_mutex_add_waiter(lock, waiter, current);

···

 	debug_mutex_remove_waiter(lock, waiter, current);
 #ifdef CONFIG_DETECT_HUNG_TASK_BLOCKER
-	WRITE_ONCE(current->blocker_mutex, NULL);
+	hung_task_clear_blocker();
 #endif
 }
