Re: [RFC PATCH] sched: Reduce overestimating avg_idle

From: Jason Low
Date: Thu Aug 01 2013 - 03:36:52 EST

> I wonder if we could get even more conservative values
> of avg_idle by clamping delta to max, before calling
> update_avg...
> Or rather, I wonder if that would matter enough to make
> a difference, and in what direction that difference would
> be.
> In other words:
> 	if (rq->idle_stamp) {
> 		u64 delta = rq->clock - rq->idle_stamp;
> 		u64 max = (sysctl_sched_migration_cost * 3) / 2;
> 		if (delta > max)
> 			delta = max;
> 		update_avg(&rq->avg_idle, delta);
> 		rq->idle_stamp = 0;
> 	}

Yes, I initially tried clamping delta to the max. That kept avg_idle
smaller and gave even larger performance improvements on the 8-socket,
HT-enabled machine. Here are some of those gains on AIM7:

alltests:  +14.5%    custom:       +15.9%    disk:          +15.9%
fserver:   +33.7%    new_fserver:  +15.7%    high_systime:  +16.7%
shared:    +14.1%

When we limited the average instead of the delta, the gains were in the
5-10% range, with the exception of fserver.

I initially thought that limiting delta to a small value might cause the
average to be underestimated too often. But come to think of it, this
might actually give a more accurate estimate of whether most idle
durations are shorter or longer than migration_cost. A long idle
duration can be far above migration_cost, while a short one can only be
so far below it, so the unclamped average is biased high. Clamping the
delta may help offset some of that bias.
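
To make the difference between the two policies concrete, here is a rough
user-space sketch (not kernel code) that replays a trace of idle durations
through both of them. The 1/8-weight running average is written in the
style of update_avg(); MIGRATION_COST_NS, replay_idle_trace(), the starting
average of 0 and the sample durations are assumptions chosen purely for
illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed value for illustration: 0.5 ms expressed in nanoseconds. */
#define MIGRATION_COST_NS 500000ULL

/*
 * Running average that moves 1/8 of the way toward each new sample.
 * Written with unsigned arithmetic only, so no negative value is shifted.
 */
static void update_avg(uint64_t *avg, uint64_t sample)
{
	if (sample >= *avg)
		*avg += (sample - *avg) >> 3;
	else
		*avg -= (*avg - sample) >> 3;
}

/* Policy A: feed the raw delta in, then clamp the resulting average. */
static void idle_update_clamp_avg(uint64_t *avg_idle, uint64_t delta,
				  uint64_t max)
{
	update_avg(avg_idle, delta);
	if (*avg_idle > max)
		*avg_idle = max;
}

/* Policy B: clamp the delta first, then feed it into the average. */
static void idle_update_clamp_delta(uint64_t *avg_idle, uint64_t delta,
				    uint64_t max)
{
	if (delta > max)
		delta = max;
	update_avg(avg_idle, delta);
}

/*
 * Hypothetical helper: replay n idle durations (in ns) starting from an
 * average of 0 and return the final avg_idle. A non-zero clamp_delta
 * selects policy B, zero selects policy A.
 */
static uint64_t replay_idle_trace(const uint64_t *deltas, size_t n,
				  int clamp_delta, uint64_t max)
{
	uint64_t avg_idle = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (clamp_delta)
			idle_update_clamp_delta(&avg_idle, deltas[i], max);
		else
			idle_update_clamp_avg(&avg_idle, deltas[i], max);
	}
	return avg_idle;
}
```

With a trace of one 100 ms idle followed by one 100 us idle and
max = 2 * MIGRATION_COST_NS, clamping the average leaves avg_idle at
887500 ns, still above the assumed migration cost, while clamping the
delta leaves it at 121875 ns. A single long idle dominates the first
policy for several wakeups but barely moves the second.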

So how acceptable is setting a limit of 2*migration cost or less on the
delta rather than on the avg?

