Re: [RFC PATCH] sched: Reduce overestimating avg_idle

From: Jason Low
Date: Fri Aug 02 2013 - 04:20:41 EST

On Wed, 2013-07-31 at 11:53 +0200, Peter Zijlstra wrote:

> No they're quite unrelated. I think you can measure the max time we've
> ever spend in newidle balance and use that to clip the values.

So I tried comparing the average against the rq's max newidle balance
cost, using sysctl_migration_cost as the initial/default max. One
thing I noticed when running this on an 8-socket machine was that the
max idle balance cost was a lot higher around boot time than after
boot. I'm not sure whether IRQ/NMI/SMI was the cause of this. A
temporary "fix" I made was to reset the max idle balance costs every 2

> Similarly, I've thought about how we updated the sd->avg_cost in the
> previous patches and wondered if we should not track max_cost.
> The 'only' down-side I could come up with is that its all ran from
> SoftIRQ context which means IRQ/NMI/SMI can all stretch/warp the time it
> takes to actually do the idle balance.

Another thing I thought of is that the max idle balance cost may also
vary with the workload that is running. A workload with shorter idle
balances that runs after a workload with longer idle balances may
sometimes end up using the higher max idle balance cost left over from
the earlier workload. But I guess that is okay if we're trying to
reduce overrunning the average.
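
To make the idea concrete, here is a rough userspace sketch in C of the
scheme discussed above: track the largest newidle balance cost seen,
clip avg_idle to it, and periodically drop the max back to its default.
This is not the patch itself. struct rq_model, the function names, the
500000 ns default (standing in for sysctl_migration_cost), the 1/8
averaging weight, and clipping at exactly the max are all assumptions
made for illustration, and the reset interval is left to the caller.

```c
#include <stdint.h>

/* Assumed default, standing in for sysctl_migration_cost (illustrative value). */
#define DEFAULT_MAX_COST_NS 500000ULL

/* Hypothetical userspace model of the per-rq state; not the kernel's struct rq. */
struct rq_model {
	uint64_t avg_idle;              /* running average of idle time, ns */
	uint64_t max_idle_balance_cost; /* largest newidle balance cost seen, ns */
	uint64_t last_reset;            /* time of the last max reset */
};

static void rq_model_init(struct rq_model *rq, uint64_t now)
{
	rq->avg_idle = DEFAULT_MAX_COST_NS;
	rq->max_idle_balance_cost = DEFAULT_MAX_COST_NS;
	rq->last_reset = now;
}

/* Remember the most expensive newidle balance observed so far. */
static void record_balance_cost(struct rq_model *rq, uint64_t cost)
{
	if (cost > rq->max_idle_balance_cost)
		rq->max_idle_balance_cost = cost;
}

/* Fold a new idle sample into the average (assumed 1/8 weight), then clip. */
static void update_avg_idle(struct rq_model *rq, uint64_t sample)
{
	int64_t diff = (int64_t)sample - (int64_t)rq->avg_idle;

	rq->avg_idle = (uint64_t)((int64_t)rq->avg_idle + diff / 8);
	if (rq->avg_idle > rq->max_idle_balance_cost)
		rq->avg_idle = rq->max_idle_balance_cost;
}

/* Drop the max back to the default once 'interval' has elapsed. */
static void maybe_reset_max_cost(struct rq_model *rq, uint64_t now,
				 uint64_t interval)
{
	if (now - rq->last_reset >= interval) {
		rq->max_idle_balance_cost = DEFAULT_MAX_COST_NS;
		rq->last_reset = now;
	}
}
```

With this model, a single long idle sample cannot push avg_idle past the
recorded max, and a max inflated by a boot-time spike or by an earlier
workload only persists until the next reset.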


