Re: [PATCH v3] locking/lockdep: add debug_show_all_lock_holders()

From: Peter Zijlstra
Date: Mon Feb 13 2023 - 07:49:54 EST


On Mon, Feb 13, 2023 at 08:34:55PM +0900, Tetsuo Handa wrote:
> On 2023/02/13 20:02, Peter Zijlstra wrote:
> >> @@ -213,7 +213,7 @@ static void check_hung_uninterruptible_tasks(unsigned long timeout)
> >>  unlock:
> >>  	rcu_read_unlock();
> >>  	if (hung_task_show_lock)
> >> -		debug_show_all_locks();
> >> +		debug_show_all_lock_holders();
> >>
> >>  	if (hung_task_show_all_bt) {
> >>  		hung_task_show_all_bt = false;
> >
> > This being the hung-task detector, which is mostly about sleeping locks.
>
> Yes, the intent of this patch is to report tasks sleeping with locks held,
> because the cause of a hung task is sometimes a deadlock.
>
> >> +	rcu_read_lock();
> >> +	for_each_process_thread(g, p) {
> >> +		if (!p->lockdep_depth)
> >> +			continue;
> >> +		if (p == current && p->lockdep_depth == 1)
> >> +			continue;
> >> +		sched_show_task(p);
> >
> > And sched_show_task() being an utter piece of crap that will basically
> > print garbage for anything that's running (it doesn't have many
> > options).
> >
> > Should we try and do better? dump_cpu_task() prefers
> > trigger_single_cpu_backtrace(), which sends an interrupt in order to get
> > active registers for the CPU.
>
> What is the intent of using trigger_single_cpu_backtrace() here?
> check_hung_uninterruptible_tasks() is calling trigger_all_cpu_backtrace()
> if sysctl_hung_task_all_cpu_backtrace is set.

Then have that also print the held locks for those tasks. And skip over
them again later.

> Locks held and kernel backtrace are helpful for describing a deadlock
> situation, but register values are not.

Register state is required to start the unwind. You can't unwind a
running task out of thin air.

> What is important is that tasks which are not on CPUs are reported,
> for when a task is reported as hung, that task must be sleeping.
> Therefore, I think sched_show_task() is fine.

The backtraces generated by sched_show_task() for a running task are
absolutely worthless, might as well not print them.

And if I read your Changelog right, you explicitly wanted useful
backtraces for the running tasks -- such that you could see what they
were doing while holding the lock the other tasks were blocked on.

The only way to do that is to send an interrupt; the interrupt will have
the register state for the interrupted task -- including the stack
pointer. By virtue of running the interrupt handler we know the stack
won't shrink, so we can then safely traverse the stack starting from the
given stack pointer.
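
To illustrate the principle, here is a userspace analogy, not kernel code
and not part of the patch under discussion. It assumes glibc (backtrace()
from <execinfo.h>) and POSIX threads; sample_handler(), busy_worker() and
sample_running_thread() are hypothetical names made up for this sketch. A
signal plays the role of the interrupt: the handler runs on top of the
interrupted thread's stack, so the frames below it cannot go away while
the unwind is in progress.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <execinfo.h>
#include <pthread.h>
#include <signal.h>
#include <stdatomic.h>
#include <string.h>
#include <unistd.h>

#define MAX_FRAMES 32

static void *frames[MAX_FRAMES];	/* return addresses from the sample */
static int nr_frames;			/* number of valid entries in frames[] */
static atomic_int started, captured, stop;

/*
 * Runs on the interrupted thread's own stack. uctx points at a ucontext_t
 * holding that thread's saved registers; backtrace() unwinds from here,
 * through the signal frame, into the interrupted code.
 */
static void sample_handler(int sig, siginfo_t *info, void *uctx)
{
	(void)sig;
	(void)info;
	(void)uctx;
	nr_frames = backtrace(frames, MAX_FRAMES);
	atomic_store(&captured, 1);
}

/* A thread that is always running and never sleeps. */
static void *busy_worker(void *arg)
{
	(void)arg;
	atomic_store(&started, 1);
	while (!atomic_load(&stop))
		;
	return NULL;
}

/* Returns the number of frames sampled from the running thread, or -1. */
int sample_running_thread(void)
{
	struct sigaction sa;
	pthread_t tid;
	void *warm[1];

	/* Load the unwinder once up front, outside of signal context. */
	backtrace(warm, 1);

	memset(&sa, 0, sizeof(sa));
	sa.sa_sigaction = sample_handler;
	sa.sa_flags = SA_SIGINFO;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGUSR1, &sa, NULL))
		return -1;

	atomic_store(&started, 0);
	atomic_store(&captured, 0);
	atomic_store(&stop, 0);
	if (pthread_create(&tid, NULL, busy_worker, NULL))
		return -1;
	while (!atomic_load(&started))
		usleep(1000);

	/* The "interrupt": force the running thread into the handler. */
	if (pthread_kill(tid, SIGUSR1) == 0) {
		while (!atomic_load(&captured))
			usleep(1000);
	}

	atomic_store(&stop, 1);
	pthread_join(tid, NULL);

	if (!atomic_load(&captured))
		return -1;
	assert(nr_frames <= MAX_FRAMES);
	return nr_frames;
}
```

Calling sample_running_thread() from a main() and then passing frames and
nr_frames to backtrace_symbols_fd() would print the sampled stack. Without
the signal there is no trustworthy starting point for the unwind, because
the worker never stops long enough to leave a saved context behind.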