Re: [PATCH v4 1/2] hung_task: Show the blocker task if the task is hung on mutex
From: Lance Yang
Date: Wed Jul 30 2025 - 05:38:22 EST
On 2025/7/30 16:51, Masami Hiramatsu (Google) wrote:
> On Wed, 30 Jul 2025 16:59:22 +0900
> Sergey Senozhatsky <senozhatsky@xxxxxxxxxxxx> wrote:
>
>> One thing that gives me a bit of "inconvenience" is that in certain
>> cases this significantly increases the amount of stack traces to go
>> through. A distilled real life example:
>> - task T1 acquires lock L1, attempts to acquire L2
>> - task T2 acquires lock L2, attempts to acquire L3
>> - task T3 acquires lock L3, attempts to acquire L1
>>
>> So we'd now see:
>> - a backtrace of T1, followed by a backtrace of T2 (owner of L2)
>> - a backtrace of T2, followed by a backtrace of T3 (owner of L3)
>> - a backtrace of T3, followed by a backtrace of T1 (owner of L1)
>>
>> Notice how each task is backtraced twice. I wonder if it's worth it
>> to de-dup the backtraces. E.g. in
>>
>> task cat:115 is blocked on a mutex likely owned by task cat:114
>>
>> if we know that cat:114 is also blocked on a lock, then we probably
>> can just say "is blocked on a mutex likely owned by task cat:114" and
>> continue iterating through tasks. That "cat:114" will be backtraced
>> individually later, as it's also blocked on a lock, owned by another
>> task.
>>
>> Does this make any sense?
>
> Hrm, OK. So what about dump the blocker task only if that task is
> NOT blocked? (because if the task is blocked, it should be dumped
> afterwards (or already))
Hmm... I'm concerned about a potential side effect of that logic.

Consider a simple, non-circular blocking chain like T1 -> T2 -> T3.
In this scenario, T1, T2, and T3 would all be dumped as hung tasks.
However, with the proposed rule (dump only if NOT blocked), when the
detector processes T1, it would see that its blocker (T2) is also
blocked and would therefore skip printing any blocker information about
T2.

The key issue is that we would lose the crucial T1 -> T2 relationship
information from the log.

While all three tasks would still be dumped, we would no longer be able
to see the explicit dependency chain. It seems like the blocker tracking
itself would be broken in this case.
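
To make the trade-off concrete, here is a rough user-space sketch of the
three reporting rules discussed in this thread. It is not kernel code and
is not taken from the patch: struct task, enum blocker_policy,
report_hung_tasks() and make_chain() are all made-up names for
illustration, and the "backtraces" are just placeholder lines. It only
counts how many backtraces and how many "blocked on a mutex likely owned
by" lines each rule would print.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical user-space model of the hung-task report; not kernel code. */
struct task {
	const char *name;
	bool hung;			/* would be reported as a hung task */
	const struct task *blocker;	/* likely lock owner, NULL if unknown */
};

enum blocker_policy {
	DUMP_ALWAYS,		/* always backtrace the blocker */
	SKIP_IF_BLOCKED,	/* print nothing about a blocker that is itself blocked */
	NAME_IF_BLOCKED,	/* name such a blocker, but skip its backtrace */
};

struct report_stats {
	int backtraces;		/* backtraces printed in total */
	int edges;		/* "blocked on ... owned by" lines printed */
};

static struct report_stats report_hung_tasks(const struct task *tasks, size_t n,
					      enum blocker_policy policy)
{
	struct report_stats st = { 0, 0 };

	for (size_t i = 0; i < n; i++) {
		const struct task *t = &tasks[i];
		const struct task *b = t->blocker;

		if (!t->hung)
			continue;

		printf("task %s is hung\n  <backtrace of %s>\n", t->name, t->name);
		st.backtraces++;

		if (!b)
			continue;	/* no known blocker to report */

		if (b->hung && policy == SKIP_IF_BLOCKED)
			continue;	/* the t -> b relationship is lost here */

		printf("task %s is blocked on a mutex likely owned by task %s\n",
		       t->name, b->name);
		st.edges++;

		if (b->hung && policy == NAME_IF_BLOCKED)
			continue;	/* b gets its own entry in the same scan */

		printf("  <backtrace of %s>\n", b->name);
		st.backtraces++;
	}
	return st;
}

/* Build T1 -> T2 -> T3; if cyclic is set, T3 additionally waits on T1. */
static void make_chain(struct task t[3], bool cyclic)
{
	static const char *const names[3] = { "T1", "T2", "T3" };

	for (int i = 0; i < 3; i++) {
		t[i].name = names[i];
		t[i].hung = true;
		t[i].blocker = i < 2 ? &t[i + 1] : (cyclic ? &t[0] : NULL);
	}
}
```

For the non-circular chain (make_chain(t, false)), this model gives 5
backtraces and 2 relationship lines with DUMP_ALWAYS, 3 backtraces and 0
relationship lines with SKIP_IF_BLOCKED, and 3 backtraces and 2
relationship lines with NAME_IF_BLOCKED. In other words, under these
assumptions the "dump only if NOT blocked" rule is the only one that
drops the T1 -> T2 and T2 -> T3 lines, while naming the blocker without
backtracing it removes the duplicates and keeps the chain visible.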
Thanks,
Lance