[tip:core/locking] hung_task debugging: Add tracepoint to report the hang

From: tip-bot for Oleg Nesterov
Date: Thu Oct 31 2013 - 06:54:20 EST


Commit-ID: 6a716c90a51338009c3bc1f460829afaed8f922d
Gitweb: http://git.kernel.org/tip/6a716c90a51338009c3bc1f460829afaed8f922d
Author: Oleg Nesterov <oleg@xxxxxxxxxx>
AuthorDate: Sat, 19 Oct 2013 18:18:28 +0200
Committer: Ingo Molnar <mingo@xxxxxxxxxx>
CommitDate: Thu, 31 Oct 2013 11:16:18 +0100

hung_task debugging: Add tracepoint to report the hang

Currently check_hung_task() prints a warning if it detects the
problem, but it is not convenient to watch the system logs if
user-space wants to be notified about the hang.

Add the new trace_sched_process_hang() into check_hung_task(),
this way a user-space monitor can easily wait for the hang and
potentially resolve a problem.

Signed-off-by: Oleg Nesterov <oleg@xxxxxxxxxx>
Cc: Dave Sullivan <dsulliva@xxxxxxxxxx>
Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: Steven Rostedt <rostedt@xxxxxxxxxxx>
Link: http://lkml.kernel.org/r/20131019161828.GA7439@xxxxxxxxxx
Signed-off-by: Ingo Molnar <mingo@xxxxxxxxxx>
---
include/trace/events/sched.h | 19 +++++++++++++++++++
kernel/hung_task.c | 4 ++++
2 files changed, 23 insertions(+)

diff --git a/include/trace/events/sched.h b/include/trace/events/sched.h
index 2e7d994..2a652d1 100644
--- a/include/trace/events/sched.h
+++ b/include/trace/events/sched.h
@@ -424,6 +424,25 @@ TRACE_EVENT(sched_pi_setprio,
__entry->oldprio, __entry->newprio)
);

+#ifdef CONFIG_DETECT_HUNG_TASK
+TRACE_EVENT(sched_process_hang,
+ TP_PROTO(struct task_struct *tsk),
+ TP_ARGS(tsk),
+
+ TP_STRUCT__entry(
+ __array( char, comm, TASK_COMM_LEN )
+ __field( pid_t, pid )
+ ),
+
+ TP_fast_assign(
+ memcpy(__entry->comm, tsk->comm, TASK_COMM_LEN);
+ __entry->pid = tsk->pid;
+ ),
+
+ TP_printk("comm=%s pid=%d", __entry->comm, __entry->pid)
+);
+#endif /* CONFIG_DETECT_HUNG_TASK */
+
#endif /* _TRACE_SCHED_H */

/* This part must be outside protection */
diff --git a/kernel/hung_task.c b/kernel/hung_task.c
index 0422523..8807061 100644
--- a/kernel/hung_task.c
+++ b/kernel/hung_task.c
@@ -16,6 +16,7 @@
#include <linux/export.h>
#include <linux/sysctl.h>
#include <linux/utsname.h>
+#include <trace/events/sched.h>

/*
* The number of tasks checked:
@@ -92,6 +93,9 @@ static void check_hung_task(struct task_struct *t, unsigned long timeout)
t->last_switch_count = switch_count;
return;
}
+
+ trace_sched_process_hang(t);
+
if (!sysctl_hung_task_warnings)
return;
sysctl_hung_task_warnings--;
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/