[PATCH 3/4] nohz: Fix idle/iowait counts going backwards

From: Denys Vlasenko
Date: Thu Apr 24 2014 - 14:47:15 EST


With this change, "iowait-ness" of every idle period is decided
at the moment it starts:
if this CPU's run-queue had tasks waiting on I/O, then this idle
period's duration will be added to iowait_sleeptime.
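
As an illustration only, the decide-at-start rule can be sketched in user
space. This is a simplified model, not the kernel code: struct tick_model,
model_start_idle() and model_stop_idle() are hypothetical names, plain
integers stand in for ktime_t, the seqcount protection is omitted, and the
nr_iowait argument stands in for nr_iowait_cpu(smp_processor_id()).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical user-space model of the patched accounting (not kernel code). */
enum idle_state { NOT_IDLE = 0, IDLE_PLAIN = 1, IDLE_IOWAIT = 2 };

struct tick_model {
    enum idle_state idle_active;  /* mirrors the 0/1/2 values of ts->idle_active */
    uint64_t idle_entrytime;      /* timestamp at which the idle period began */
    uint64_t idle_sleeptime;      /* accumulated plain idle time */
    uint64_t iowait_sleeptime;    /* accumulated iowait time */
};

/* The idle period is classified once, when it starts. */
static void model_start_idle(struct tick_model *ts, uint64_t now,
                             unsigned int nr_iowait)
{
    ts->idle_entrytime = now;
    ts->idle_active = nr_iowait ? IDLE_IOWAIT : IDLE_PLAIN;
}

/* The whole period is charged to the counter chosen at start. */
static void model_stop_idle(struct tick_model *ts, uint64_t now)
{
    uint64_t delta;

    assert(ts->idle_active != NOT_IDLE);
    assert(now >= ts->idle_entrytime);

    delta = now - ts->idle_entrytime;
    if (ts->idle_active == IDLE_IOWAIT)
        ts->iowait_sleeptime += delta;
    else
        ts->idle_sleeptime += delta;
    ts->idle_active = NOT_IDLE;
}
```

In this model, a period that starts with nr_iowait != 0 is charged entirely
to iowait_sleeptime, whatever happens to the iowait count afterwards.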

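This fixes the bug where iowait and/or idle counts could go backwards.
The trade-off is that iowait accounting is no longer precise: it can
show more iowait than there really is, because an idle period that
starts as iowait is counted as iowait for its entire duration.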
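
To make the regression concrete, here is a minimal user-space sketch of the
reader side, under the assumption that the iowait count of an idle CPU
changes between two reads. struct reader_model, old_idle_time() and
new_idle_time() are hypothetical names; the two conditions are taken from
the removed and added lines in get_cpu_idle_time_us(), with plain integers
in place of ktime_t and the seqcount retry loop left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical reader-side model (not kernel code). */
struct reader_model {
    int idle_active;          /* 0 = not idle, 1 = idle, 2 = idle with iowait */
    uint64_t idle_entrytime;  /* start of the current idle period */
    uint64_t idle_sleeptime;  /* idle time accumulated in earlier periods */
};

/* Old condition: every read consults the live iowait count. */
static uint64_t old_idle_time(const struct reader_model *ts, uint64_t now,
                              unsigned int nr_iowait)
{
    uint64_t idle = ts->idle_sleeptime;

    assert(now >= ts->idle_entrytime);
    if (ts->idle_active && !nr_iowait)
        idle += now - ts->idle_entrytime;
    return idle;
}

/* New condition: the read depends only on the state recorded at idle start. */
static uint64_t new_idle_time(const struct reader_model *ts, uint64_t now)
{
    uint64_t idle = ts->idle_sleeptime;

    assert(now >= ts->idle_entrytime);
    if (ts->idle_active == 1)
        idle += now - ts->idle_entrytime;
    return idle;
}
```

With idle_sleeptime = 1000 and an idle period entered at t = 100, the old
reader returns 1050 at t = 150 when the count is 0, then 1000 at t = 160
once the count has become non-zero, so the reported value drops. The new
reader returns 1050 and then 1060 for the same two timestamps.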

Signed-off-by: Denys Vlasenko <dvlasenk@xxxxxxxxxx>
Cc: Frederic Weisbecker <fweisbec@xxxxxxxxx>
Cc: Hidetoshi Seto <seto.hidetoshi@xxxxxxxxxxxxxx>
Cc: Fernando Luis Vazquez Cao <fernando_b1@xxxxxxxxxxxxx>
Cc: Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx>
Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Cc: Ingo Molnar <mingo@xxxxxxxxxx>
Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
Cc: Arjan van de Ven <arjan@xxxxxxxxxxxxxxx>
Cc: Oleg Nesterov <oleg@xxxxxxxxxx>
---
kernel/time/tick-sched.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 8f0f2ee..47ed7cf 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -413,7 +413,7 @@ static void tick_nohz_stop_idle(struct tick_sched *ts, ktime_t now)
/* Updates the per cpu time idle statistics counters */
write_seqcount_begin(&ts->idle_sleeptime_seq);
delta = ktime_sub(now, ts->idle_entrytime);
- if (nr_iowait_cpu(smp_processor_id()) > 0)
+ if (ts->idle_active == 2)
ts->iowait_sleeptime = ktime_add(ts->iowait_sleeptime, delta);
else
ts->idle_sleeptime = ktime_add(ts->idle_sleeptime, delta);
@@ -429,7 +429,7 @@ static ktime_t tick_nohz_start_idle(struct tick_sched *ts)

write_seqcount_begin(&ts->idle_sleeptime_seq);
ts->idle_entrytime = now;
- ts->idle_active = 1;
+ ts->idle_active = nr_iowait_cpu(smp_processor_id()) ? 2 : 1;
write_seqcount_end(&ts->idle_sleeptime_seq);

sched_clock_idle_sleep_event();
@@ -469,7 +469,7 @@ u64 get_cpu_idle_time_us(int cpu, u64 *last_update_time)

seq = read_seqcount_begin(&ts->idle_sleeptime_seq);
idle = ts->idle_sleeptime;
- if (ts->idle_active && !nr_iowait_cpu(cpu)) {
+ if (ts->idle_active == 1) {
delta = ktime_sub(now, ts->idle_entrytime);
idle = ktime_add(idle, delta);
}
@@ -511,7 +511,7 @@ u64 get_cpu_iowait_time_us(int cpu, u64 *last_update_time)

seq = read_seqcount_begin(&ts->idle_sleeptime_seq);
iowait = ts->iowait_sleeptime;
- if (ts->idle_active && nr_iowait_cpu(cpu) > 0) {
+ if (ts->idle_active == 2) {
delta = ktime_sub(now, ts->idle_entrytime);
iowait = ktime_add(ts->iowait_sleeptime, delta);
}
--
1.8.1.4
