Re: [patch 1/3] sched: init rt_avg stat whenever rq comes online

From: Suresh Siddha
Date: Mon Aug 16 2010 - 13:36:53 EST


On Mon, 2010-08-16 at 00:47 -0700, Peter Zijlstra wrote:
> On Fri, 2010-08-13 at 12:45 -0700, Suresh Siddha wrote:
> > plain text document attachment (sched_reset_rt_avg_stat_online.patch)
> > TSC's get reset after suspend/resume and this leads to a scenario of
> > rq->clock (sched_clock_cpu()) less than rq->age_stamp. This leads
> > to a big value returned by scale_rt_power() and the resulting big group
> > power set by the update_group_power() is causing improper load balancing
> > between busy and idle cpu's after suspend/resume.
>
> ARGH, so i[357] westmere mobile stops TSC on some power state?

WSM has a working TSC that runs at a constant rate across P/C/T-states.
This issue is about suspend/resume (S-states).

> Why don't we sync it back to the other CPUs instead?

All the CPUs entered the suspend state, and during resume the TSC gets
reset on all of them.

>
> Or does it simply mark TSCs unstable and leaves it at that?

TSCs are stable and in sync after resume as well. If we want to do a SW
sync, we need to keep track of the time we spent in the suspend state
and do the SW sync during resume, which can potentially disturb the HW
sync.

>
> In any case, this needs to be fixed at the clock level, not like this.

If we have more such dependencies on the TSC, then we may need to
address the issue at the clock level as well. Nevertheless, the current
scheduler code expects the TSC (sched_clock) to keep going forward
across cpu online/offline, and I am not sure why we need to carry the
rt_avg history across online/offline.
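
To make the failure mode concrete, here is a small user-space sketch of
the arithmetic. It is only a model loosely shaped after
scale_rt_power(), not the kernel code: struct model_rq, the MODEL_*
constants, model_scale_rt_power() and model_rq_online() are all
hypothetical names, and the period is passed in as a parameter instead
of coming from the scheduler.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the scheduler's load scale constants. */
#define MODEL_LOAD_SHIFT 10
#define MODEL_LOAD_SCALE (1ULL << MODEL_LOAD_SHIFT)

/* Hypothetical, stripped-down runqueue: only the fields discussed here. */
struct model_rq {
	uint64_t clock;     /* models rq->clock, fed from the TSC */
	uint64_t age_stamp; /* models rq->age_stamp */
	uint64_t rt_avg;    /* models rq->rt_avg */
};

/*
 * Model of the scaling step. All arithmetic is unsigned 64-bit, so
 * clock < age_stamp makes (clock - age_stamp) wrap around instead of
 * going negative.
 */
static uint64_t model_scale_rt_power(const struct model_rq *rq,
				     uint64_t period)
{
	uint64_t total = period + (rq->clock - rq->age_stamp);
	uint64_t available = total - rq->rt_avg;

	/* A wrapped total looks negative when viewed as signed: clamp it. */
	if ((int64_t)total < (int64_t)MODEL_LOAD_SCALE)
		total = MODEL_LOAD_SCALE;

	total >>= MODEL_LOAD_SHIFT;
	assert(total != 0); /* guaranteed by the clamp above */

	/* 'available' was computed before the clamp, so it stays huge. */
	return available / total;
}

/*
 * Assumed shape of the fix under discussion: restart the rt_avg
 * bookkeeping from the current clock when the rq comes online.
 */
static void model_rq_online(struct model_rq *rq)
{
	rq->age_stamp = rq->clock;
	rq->rt_avg = 0;
}
```

With a period of 500000000 ns, a runqueue at clock = 2000000000 and
age_stamp = 1900000000 yields the nominal scale of 1024. Dropping clock
to 1000000 (as if the TSC had been reset) while age_stamp stays put
yields a value far above 2^32, and calling model_rq_online() first
brings the result back to 1024.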

thanks,
suresh
