[tip:sched/core] sched/debug: Add deadline scheduler bandwidth ratio to /proc/sched_debug

From: tip-bot for Steven Rostedt (Red Hat)
Date: Mon Feb 29 2016 - 06:17:38 EST


Commit-ID: ef477183d06b0aa41c9e7c02cf5bfec41536e2c4
Gitweb: http://git.kernel.org/tip/ef477183d06b0aa41c9e7c02cf5bfec41536e2c4
Author: Steven Rostedt (Red Hat) <rostedt@xxxxxxxxxxx>
AuthorDate: Mon, 22 Feb 2016 16:26:52 -0500
Committer: Ingo Molnar <mingo@xxxxxxxxxx>
CommitDate: Mon, 29 Feb 2016 09:53:07 +0100

sched/debug: Add deadline scheduler bandwidth ratio to /proc/sched_debug

Playing with SCHED_DEADLINE and cpusets, I found that I was unable to create
new SCHED_DEADLINE tasks: admission failed with EBUSY as if the bandwidth
were already used up. I then realized there was no way to see what bandwidth
the runqueues had reserved in order to debug the issue.
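
For reference, the reservation that fails here is the one requested through
the sched_setattr() system call; below is a minimal userspace sketch of such
a request (not part of this patch; the runtime/period values are made up,
and since glibc has no wrapper the raw syscall is used):

#include <stdio.h>
#include <string.h>
#include <stdint.h>
#include <unistd.h>
#include <sys/syscall.h>

/* No glibc wrapper exists for sched_setattr(), so define the
 * attribute structure by hand and invoke the raw syscall. */
struct sched_attr {
	uint32_t size;
	uint32_t sched_policy;
	uint64_t sched_flags;
	int32_t  sched_nice;
	uint32_t sched_priority;
	uint64_t sched_runtime;
	uint64_t sched_deadline;
	uint64_t sched_period;
};

#define SCHED_DEADLINE	6

int main(void)
{
	struct sched_attr attr;

	memset(&attr, 0, sizeof(attr));
	attr.size = sizeof(attr);
	attr.sched_policy   = SCHED_DEADLINE;
	attr.sched_runtime  =  10000000;	/* 10ms of runtime ... */
	attr.sched_deadline = 100000000;	/* ... every 100ms */
	attr.sched_period   = 100000000;

	/* This is the call that fails with EBUSY when the root
	 * domain's deadline bandwidth is exhausted. */
	if (syscall(SYS_sched_setattr, 0, &attr, 0) < 0) {
		perror("sched_setattr");
		return 1;
	}

	pause();	/* stay alive so the reservation shows up */
	return 0;
}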

Adding dl_bw->bw and dl_bw->total_bw to the deadline info in
/proc/sched_debug lets us see how much bandwidth has been reserved and
where a problem may exist.

For example, before triggering the issue we see the bandwidth ratio:

# cat /proc/sys/kernel/sched_rt_runtime_us
950000
# cat /proc/sys/kernel/sched_rt_period_us
1000000

# grep dl /proc/sched_debug
dl_rq[0]:
  .dl_nr_running                 : 0
  .dl_bw->bw                     : 996147
  .dl_bw->total_bw               : 0
dl_rq[1]:
  .dl_nr_running                 : 0
  .dl_bw->bw                     : 996147
  .dl_bw->total_bw               : 0
dl_rq[2]:
  .dl_nr_running                 : 0
  .dl_bw->bw                     : 996147
  .dl_bw->total_bw               : 0
dl_rq[3]:
  .dl_nr_running                 : 0
  .dl_bw->bw                     : 996147
  .dl_bw->total_bw               : 0
dl_rq[4]:
  .dl_nr_running                 : 0
  .dl_bw->bw                     : 996147
  .dl_bw->total_bw               : 0
dl_rq[5]:
  .dl_nr_running                 : 0
  .dl_bw->bw                     : 996147
  .dl_bw->total_bw               : 0
dl_rq[6]:
  .dl_nr_running                 : 0
  .dl_bw->bw                     : 996147
  .dl_bw->total_bw               : 0
dl_rq[7]:
  .dl_nr_running                 : 0
  .dl_bw->bw                     : 996147
  .dl_bw->total_bw               : 0

Note: the bandwidth is the runtime/period ratio in 20-bit fixed point:
(950000 << 20) / 1000000 == 996147
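
That matches the conversion done by the kernel's to_ratio() helper; the
shift must happen before the divide, or integer division would truncate
the ratio to zero. A standalone sketch of the same computation:

#include <stdio.h>
#include <stdint.h>

/* Mirrors the kernel's to_ratio(): the bandwidth is kept as a
 * fraction with 20 bits after the binary point. */
static uint64_t to_ratio(uint64_t period, uint64_t runtime)
{
	if (period == 0)
		return 0;
	return (runtime << 20) / period;
}

int main(void)
{
	/* 950000us of runtime out of a 1000000us period */
	printf("%llu\n",
	       (unsigned long long)to_ratio(1000000, 950000)); /* 996147 */
	return 0;
}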

After I played with cpusets and hit the issue, the result is now:

# grep dl /proc/sched_debug
dl_rq[0]:
  .dl_nr_running                 : 0
  .dl_bw->bw                     : 996147
  .dl_bw->total_bw               : -104857
dl_rq[1]:
  .dl_nr_running                 : 0
  .dl_bw->bw                     : 996147
  .dl_bw->total_bw               : 104857
dl_rq[2]:
  .dl_nr_running                 : 0
  .dl_bw->bw                     : 996147
  .dl_bw->total_bw               : 104857
dl_rq[3]:
  .dl_nr_running                 : 0
  .dl_bw->bw                     : 996147
  .dl_bw->total_bw               : 104857
dl_rq[4]:
  .dl_nr_running                 : 0
  .dl_bw->bw                     : 996147
  .dl_bw->total_bw               : -104857
dl_rq[5]:
  .dl_nr_running                 : 0
  .dl_bw->bw                     : 996147
  .dl_bw->total_bw               : -104857
dl_rq[6]:
  .dl_nr_running                 : 0
  .dl_bw->bw                     : 996147
  .dl_bw->total_bw               : -104857
dl_rq[7]:
  .dl_nr_running                 : 0
  .dl_bw->bw                     : 996147
  .dl_bw->total_bw               : -104857

This shows that there is definitely a problem, as the total bandwidth
should never be negative.
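
In the kernel, total_bw is an unsigned 64-bit accumulator that is bumped
when a task's bandwidth is admitted and decremented when it is released;
an unbalanced release, e.g. after cpusets rebuild the root domains,
underflows it, and the %lld format in the new debug output renders that
underflow as a negative number. A simplified standalone sketch of that
accounting (locking omitted; helper names as in kernel/sched/sched.h):

#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>

typedef uint64_t u64;

/* Simplified per-root-domain bandwidth accounting. */
struct dl_bw {
	u64 bw;		/* available bandwidth, 20-bit fixed point */
	u64 total_bw;	/* bandwidth of all admitted tasks */
};

static void __dl_add(struct dl_bw *dl_b, u64 tsk_bw)
{
	dl_b->total_bw += tsk_bw;	/* task admitted */
}

static void __dl_clear(struct dl_bw *dl_b, u64 tsk_bw)
{
	dl_b->total_bw -= tsk_bw;	/* task released */
}

int main(void)
{
	struct dl_bw b = { .bw = 996147, .total_bw = 0 };

	/* A release with no matching add underflows total_bw;
	 * printed signed (%lld) it shows up as -104857. */
	__dl_clear(&b, 104857);
	printf("%" PRId64 "\n", (int64_t)b.total_bw);
	return 0;
}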

Signed-off-by: Steven Rostedt <rostedt@xxxxxxxxxxx>
Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
Cc: Clark Williams <williams@xxxxxxxxxx>
Cc: Juri Lelli <juri.lelli@xxxxxxxxx>
Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Cc: Mike Galbraith <efault@xxxxxx>
Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Link: http://lkml.kernel.org/r/20160222212825.756849091@xxxxxxxxxxx
Signed-off-by: Ingo Molnar <mingo@xxxxxxxxxx>
---
kernel/sched/debug.c | 9 +++++++++
1 file changed, 9 insertions(+)

diff --git a/kernel/sched/debug.c b/kernel/sched/debug.c
index 313c65f..4fbc3bd 100644
--- a/kernel/sched/debug.c
+++ b/kernel/sched/debug.c
@@ -566,8 +566,17 @@ void print_rt_rq(struct seq_file *m, int cpu, struct rt_rq *rt_rq)
 
 void print_dl_rq(struct seq_file *m, int cpu, struct dl_rq *dl_rq)
 {
+	struct dl_bw *dl_bw;
+
 	SEQ_printf(m, "\ndl_rq[%d]:\n", cpu);
 	SEQ_printf(m, "  .%-30s: %ld\n", "dl_nr_running", dl_rq->dl_nr_running);
+#ifdef CONFIG_SMP
+	dl_bw = &cpu_rq(cpu)->rd->dl_bw;
+#else
+	dl_bw = &dl_rq->dl_bw;
+#endif
+	SEQ_printf(m, "  .%-30s: %lld\n", "dl_bw->bw", dl_bw->bw);
+	SEQ_printf(m, "  .%-30s: %lld\n", "dl_bw->total_bw", dl_bw->total_bw);
 }
 
 extern __read_mostly int sched_clock_running;