Re: [RFC] sched: Limit idle_balance() when it is being used too frequently

From: Rik van Riel
Date: Wed Jul 17 2013 - 13:54:24 EST


On 07/17/2013 12:18 PM, Peter Zijlstra wrote:
> On Wed, Jul 17, 2013 at 08:59:01AM -0700, Jason Low wrote:
>
>> Do you think it's worth a try to consider each newidle balance attempt as
>> the total load_balance attempts until it is able to move a task, and
>> then skip balancing within the domain if a CPU's avg idle time is less
>> than that avg time doing newidle balance?
>
> So the way I see things is that the only way newidle balance can slow down
> things is if it runs when we could have run something useful.

Due to contention on the runqueue locks of other CPUs,
newidle also has the potential to keep _others_ from
running something useful.

> So all we need to ensure is to not run longer than we expect to be idle for and
> things should be 'free', right?
>
> So the problem I have with your proposal is this: suppose we're successful
> only once every 10 newidle balances. Then sd->newidle_balance_cost gets inflated
> by a factor of 10, because we'd count 10 of them before 'success'.
>
> However, when we're idle for that amount of time (10 times longer than it takes
> to do a single newidle balance) we'd still only do a single newidle balance,
> not 10.
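
To put numbers on the inflation argument above, here is a small
illustrative sketch. The helper names and the figures (50us per
balance pass, 300us average idle time) are made up for the example;
this is not the kernel code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical rule: balance only if the expected idle time covers the cost. */
static bool worth_balancing(uint64_t avg_idle_ns, uint64_t balance_cost_ns)
{
    return avg_idle_ns >= balance_cost_ns;
}

/*
 * Cost as the proposal would count it: every attempt up to and
 * including the one that finally moves a task.
 */
static uint64_t cost_until_success(uint64_t single_cost_ns,
                                   unsigned int attempts_per_success)
{
    return single_cost_ns * attempts_per_success;
}
```

With a 50us pass that succeeds once in 10 attempts, the counted cost
becomes 500us. A CPU with 300us of expected idle time would then skip
balancing, even though the single pass it would actually run costs
only 50us.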

Could we prevent that downside by measuring both the
time spent idle, and the time spent in idle balancing,
and making sure the idle balancing time never exceeds
N% of the idle time?

Say, have the CPU never spend more than 10% of its
idle time in the idle balancing code, as averaged
over some time period?

That way we might still do idle balancing every X
idle periods, even if the idle periods themselves
are relatively short.
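
A rough userspace sketch of that budget idea, just to make it
concrete. The struct, the function names and the halving decay are
all assumptions for illustration; nothing here exists in the
scheduler under these names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-CPU accounting; not an existing kernel structure. */
struct idle_budget {
    uint64_t idle_ns;     /* decayed sum of time spent idle */
    uint64_t balance_ns;  /* decayed sum of time spent in idle balancing */
    unsigned int pct;     /* N: max share of idle time allowed for balancing */
};

static void idle_budget_account_idle(struct idle_budget *b, uint64_t ns)
{
    b->idle_ns += ns;
}

static void idle_budget_account_balance(struct idle_budget *b, uint64_t ns)
{
    b->balance_ns += ns;
}

/* Halve both sums periodically so old history fades (a crude average). */
static void idle_budget_decay(struct idle_budget *b)
{
    b->idle_ns >>= 1;
    b->balance_ns >>= 1;
}

/*
 * Allow a balance pass only if the balancing time, including the
 * expected cost of this pass, stays within pct% of the idle time.
 */
static bool idle_budget_allows(const struct idle_budget *b,
                               uint64_t expected_cost_ns)
{
    return (b->balance_ns + expected_cost_ns) * 100 <= b->idle_ns * b->pct;
}
```

With N = 10, idle periods of 100us and a 50us balance pass, this
would refuse the pass for the first four idle periods and allow it on
the fifth, which is the "every X idle periods" behaviour.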

It might also be enough to prevent excessive lock
contention triggered by the idle balancing code,
though I have to admit I did not really think that
part through :)