[RFC PATCH] hung_task: show sysctl_hung_task_warnings

From: Weiping Zhang
Date: Thu Jan 12 2023 - 04:28:24 EST


This patch try to add more debug info to detect lost kernel log or no
hung task was detected.

The user set 10 to the hung_task_timeout_secs, the kernel log:

[ 3942.642220] INFO: task mount:19066 blocked for more than 10 seconds.
[ 3952.876768] INFO: task kworker/u81:0:7 blocked for more than 10 seconds.
[ 3952.877088] INFO: task scsi_eh_0:506 blocked for more than 10 seconds.
[ 3952.878212] INFO: task mount:19066 blocked for more than 10 seconds.
[ 3963.116805] INFO: task kworker/u81:0:7 blocked for more than 10 seconds.
[ 3963.117137] INFO: task scsi_eh_0:506 blocked for more than 10 seconds.
[ 3963.118275] INFO: task mount:19066 blocked for more than 10 seconds.
[ 3973.356837] INFO: task kworker/u81:0:7 blocked for more than 10 seconds.
[ 3973.357148] INFO: task scsi_eh_0:506 blocked for more than 10 seconds.
[ 3973.358247] INFO: task mount:19066 blocked for more than 10 seconds.
[ 3993.836899] INFO: task kworker/u81:0:7 blocked for more than 10 seconds.
[ 3993.837238] INFO: task scsi_eh_0:506 blocked for more than 10 seconds.
[ 3993.838356] INFO: task mount:19066 blocked for more than 10 seconds.

There is no any log at about 3983, it's hard to know if kernel log was
lost or there is no hung task was detected at that moment. So this patch
print sysctl_hung_task_warnings to distinguish the above two cases.

Signed-off-by: Weiping Zhang <zhangweiping@xxxxxxxxxxxxxx>
---
kernel/hung_task.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/kernel/hung_task.c b/kernel/hung_task.c
index c71889f3f3fc..ca917931473d 100644
--- a/kernel/hung_task.c
+++ b/kernel/hung_task.c
@@ -127,8 +127,11 @@ static void check_hung_task(struct task_struct *t, unsigned long timeout)
* complain:
*/
if (sysctl_hung_task_warnings) {
- if (sysctl_hung_task_warnings > 0)
+ if (sysctl_hung_task_warnings > 0) {
sysctl_hung_task_warnings--;
+ pr_err("sysctl_hung_task_warnings: %d\n",
+ sysctl_hung_task_warnings);
+ }
pr_err("INFO: task %s:%d blocked for more than %ld seconds.\n",
t->comm, t->pid, (jiffies - t->last_switch_time) / HZ);
pr_err(" %s %s %.*s\n",
--
2.34.1