Re: [PATCH v3] locking/lockdep: add debug_show_all_lock_holders()

From: Tetsuo Handa
Date: Sat Mar 18 2023 - 07:15:18 EST


Peter?

On 2023/02/13 22:49, Tetsuo Handa wrote:
> On 2023/02/13 21:49, Peter Zijlstra wrote:
>>>> And sched_show_task() being an utter piece of crap that will basically
>>>> print garbage for anything that's running (it doesn't have much
>>>> options).
>>>>
>>>> Should we try and do better? dump_cpu_task() prefers
>>>> trigger_single_cpu_backtrace(), which sends an interrupt in order to get
>>>> active registers for the CPU.
>>>
>>> What is the intent of using trigger_single_cpu_backtrace() here?
>>> check_hung_uninterruptible_tasks() is calling trigger_all_cpu_backtrace()
>>> if sysctl_hung_task_all_cpu_backtrace is set.
>>
>> Then have that also print the held locks for those tasks. And skip over
>> them again later.
>>
>>> Locks held and kernel backtrace are helpful for describing deadlock
>>> situation, but registers values are not.
>>
>> Register state is required to start the unwind. You can't unwind a
>> running task out of thin-air.
>
> Excuse me. There are two types of TASK_RUNNING tasks, one is that a thread
> is actually running on some CPU, and the other is that a thread is waiting
> for CPU to become available for that thread, aren't there?
>
> lockdep_print_held_locks() does not show locks held even if a thread is
> waiting for CPU to become available for that thread, does it?
>
> But sched_show_task() can show backtrace even if a thread is waiting for
> CPU to become available for that thread, can't it?
>
> Therefore, calling sched_show_task() helps understanding what that thread
> is doing when lockdep_print_held_locks() did not show locks held.
>
>>
>>> What is important is that tasks which are not on CPUs are reported,
>>> for when a task is reported as hung, that task must be sleeping.
>>> Therefore, I think sched_show_task() is fine.
>>
>> The backtraces generated by sched_show_task() for a running task are
>> absolutely worthless, might as well not print them.
>
> "a thread actually running on some CPU" or
> "a thread waiting for CPU to become available for that thread",
> which does this "running task" mean?
>
>>
>> And if I read your Changelog right, you explicitly wanted useful
>> backtraces for the running tasks -- such that you could see what they
>> were doing while holding the lock the other tasks were blocked on.
>
> Yes, we can get useful backtraces for threads that are waiting for CPU
> to become available for that thread. That's why sched_show_task() is chosen.
>
>>
>> The only way to do that is to send an interrupt, the interrupt will have
>> the register state for the interrupted task -- including the stack
>> pointer. By virtue of running the interrupt handler we know the stack
>> won't shrink, so we can then safely traverse the stack starting from the
>> given stack pointer.
>
> But trigger_single_cpu_backtrace() is for a thread actually running on some CPU,
> isn't it? While it would be helpful to get backtrace of a thread that is actually
> running on some CPU, it would be helpless not getting backtrace of a thread
> that is waiting for CPU to become available for that thread.
>
> We can later get backtrace of threads actually running on some CPU using
> trigger_all_cpu_backtrace() via sysctl_hung_task_all_cpu_backtrace setting,
> though I seldom find useful backtraces via trigger_all_cpu_backtrace(); it is
> likely that khungtaskd thread and some random workqueue thread (which are
> irrelevant to hung task) are reported via trigger_all_cpu_backtrace()...
>