[RFC] nohz: no update idle entry time on get_cpu_idle/iowait_time_us()

From: Yunhong Jiang
Date: Mon Sep 28 2015 - 21:10:38 EST


Currently the get_cpu_idle/iowait_time_us() updates the idle_entrytime.
When it's invoked from another CPU and the target CPU has been on idle
already, it will update the idle_entrytime to now, which is incorrect.

However, the get_cpu_idle/iowait_time_us() is not guranteed to be called
on the target CPU. For example, the get_cpu_idle_time_us() seems is
invoked remotely on drivers/cpufreq/cpufreq_governor.c through
get_cpu_idle_time().

There is a check that the update happens only if a valid last_update_time
parameter passed. IMHO, this is more a hack because there is no guarantee
that it's invoked on the target CPU when last_update_time is valid.

In fact, we don't need update the idle stats from
get_cpu_idle/iowait_time_us(). Now the policy is, we record the
entrytime when tick_nohz_start_idle() and update the stats
when tick_nohz_stop_idle(). We calculate the stats on other situations.

Please notice:
1) There is a bug currently that the tick_nohz_stop_idle() calls the
update_ts_time_stats() and update the idle_entrytime, which is sure
to be wrong. Removing the idle_entrytime update resolves the bug also.

2) There is a small widows in tick_nohz_stop_idle() between
update_ts_time_stats() and clear ts->idle_active, that
get_cpu_idle/iowait_time_us(), when invoked remotely, may double
count last idle period. This window exists w/o this change and this
change does not fix it.

Signed-off-by: Yunhong Jiang <yunhong.jiang@xxxxxxxxxxxxxxx>
---
kernel/time/tick-sched.c | 43 +++++++++++++++----------------------------
1 file changed, 15 insertions(+), 28 deletions(-)

diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 7c7ec4515983..c2a78887d265 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -437,7 +437,7 @@ static void tick_nohz_update_jiffies(ktime_t now)
* Updates the per cpu time idle statistics counters
*/
static void
-update_ts_time_stats(int cpu, struct tick_sched *ts, ktime_t now, u64 *last_update_time)
+update_ts_time_stats(int cpu, struct tick_sched *ts, ktime_t now)
{
ktime_t delta;

@@ -447,17 +447,12 @@ update_ts_time_stats(int cpu, struct tick_sched *ts, ktime_t now, u64 *last_upda
ts->iowait_sleeptime = ktime_add(ts->iowait_sleeptime, delta);
else
ts->idle_sleeptime = ktime_add(ts->idle_sleeptime, delta);
- ts->idle_entrytime = now;
}
-
- if (last_update_time)
- *last_update_time = ktime_to_us(now);
-
}

static void tick_nohz_stop_idle(struct tick_sched *ts, ktime_t now)
{
- update_ts_time_stats(smp_processor_id(), ts, now, NULL);
+ update_ts_time_stats(smp_processor_id(), ts, now);
ts->idle_active = 0;

sched_clock_idle_wakeup_event(0);
@@ -496,21 +491,16 @@ u64 get_cpu_idle_time_us(int cpu, u64 *last_update_time)
return -1;

now = ktime_get();
- if (last_update_time) {
- update_ts_time_stats(cpu, ts, now, last_update_time);
- idle = ts->idle_sleeptime;
- } else {
- if (ts->idle_active && !nr_iowait_cpu(cpu)) {
- ktime_t delta = ktime_sub(now, ts->idle_entrytime);
+ idle = ts->idle_sleeptime;
+ if (ts->idle_active && !nr_iowait_cpu(cpu)) {
+ ktime_t delta = ktime_sub(now, ts->idle_entrytime);

- idle = ktime_add(ts->idle_sleeptime, delta);
- } else {
- idle = ts->idle_sleeptime;
- }
+ idle = ktime_add(ts->idle_sleeptime, delta);
}

+ if (last_update_time)
+ *last_update_time = ktime_to_us(now);
return ktime_to_us(idle);
-
}
EXPORT_SYMBOL_GPL(get_cpu_idle_time_us);

@@ -537,19 +527,16 @@ u64 get_cpu_iowait_time_us(int cpu, u64 *last_update_time)
return -1;

now = ktime_get();
- if (last_update_time) {
- update_ts_time_stats(cpu, ts, now, last_update_time);
- iowait = ts->iowait_sleeptime;
- } else {
- if (ts->idle_active && nr_iowait_cpu(cpu) > 0) {
- ktime_t delta = ktime_sub(now, ts->idle_entrytime);
+ iowait = ts->iowait_sleeptime;
+ if (ts->idle_active && nr_iowait_cpu(cpu) > 0) {
+ ktime_t delta = ktime_sub(now, ts->idle_entrytime);

- iowait = ktime_add(ts->iowait_sleeptime, delta);
- } else {
- iowait = ts->iowait_sleeptime;
- }
+ iowait = ktime_add(ts->iowait_sleeptime, delta);
}

+ if (last_update_time)
+ *last_update_time = ktime_to_us(now);
+
return ktime_to_us(iowait);
}
EXPORT_SYMBOL_GPL(get_cpu_iowait_time_us);
--
1.8.3.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/