Re: [PATCH] Load balancing problem in 2.6.2-mm1

From: Nick Piggin
Date: Fri Feb 06 2004 - 18:27:00 EST




Rick Lindsley wrote:

>>OK, but do you agree that the rate we rebalance things like 2 vs 1 should
>>be slower than the rate we rebalance 3 vs 1? Fairness is only relevant
>>over a long-term imbalance anyway, so there should be a big damper on
>>"fairness only" rebalances.
>
>I think, given the precision we're granted via SCHED_LOAD_SCALE, in
>combination with the new "load average" (cpu_load) code, that we can
>achieve what we want.
>
>If cpu0 has 2 runnable tasks and cpu1 has 1 runnable task, won't we see
>the "load average" of cpu0 slowly approach 2, but not jump there?



Yep
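
To make "slowly approach 2" concrete: assuming cpu_load is updated each
tick as a simple halving average of the old estimate and the instantaneous
nr_running * SCHED_LOAD_SCALE, and assuming SCHED_LOAD_SHIFT = 7 (i.e.
SCHED_LOAD_SCALE = 128) purely for illustration, a quick userspace sketch
shows the convergence:

#include <stdio.h>

#define SCHED_LOAD_SHIFT	7	/* illustrative value only */
#define SCHED_LOAD_SCALE	(1UL << SCHED_LOAD_SHIFT)

int main(void)
{
	/* cpu0 had 1 runnable task; a second one has just arrived */
	unsigned long cpu_load = 1 * SCHED_LOAD_SCALE;
	unsigned long this_load = 2 * SCHED_LOAD_SCALE;
	int tick;

	for (tick = 1; tick <= 8; tick++) {
		/* halving average of the old estimate and the current load */
		cpu_load = (cpu_load + this_load) / 2;
		printf("tick %d: cpu_load = %3lu (%.3f tasks)\n",
		       tick, cpu_load, (double)cpu_load / SCHED_LOAD_SCALE);
	}
	return 0;
}

With integer halving the estimate creeps up through 1.5, 1.75, 1.875, ...
and settles just below 2.0 rather than jumping there, which is exactly the
gradual behaviour being discussed.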

>Right now, we round up on all fractions and Martin has proposed a patch
>which takes it the other way and rounds down. What if, in marginal
>cases like this where there is a small but persistent difference, we
>could bump the task to another cpu when it reaches (say) 1.8 or 1.9?
>That would keep it there longer for shorter-lived tasks, but for those
>long-runners, they'd eventually spread the pain around a little.
>
>And yes, a cpu_load of even 1.0 should *never* get migrated to a cpu
>with a load of 0.0. Instead of


This isn't the load though, but imbalance. It has already passed
through our imbalance_pct filter (or we are idle), so we can pretty
safely assume that we want to try to move at least one task.
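
Roughly, that filter is a percentage comparison in find_busiest_group(),
along these lines (a sketch, not necessarily the exact -mm code):

	/*
	 * Skip balancing when the busiest group is within
	 * imbalance_pct percent of this group's load.
	 */
	if (100 * max_load <= sd->imbalance_pct * this_load)
		goto out_balanced;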


>	*imbalance = (*imbalance + SCHED_LOAD_SCALE - 1) >> SCHED_LOAD_SHIFT;
>
>how about, for instance,
>
>	if (max_load <= SCHED_LOAD_SCALE)
>		*imbalance = 0;
>	else
>		*imbalance = (*imbalance + (SCHED_LOAD_SCALE / 6) - 1)
>				>> SCHED_LOAD_SHIFT;
>
>The intent is to never move anything if max_load is 1 or less (what
>advantage is there?) and to create a slight tendency to round up at
>loads greater than that, which would still tend to leave things where
>they were until they'd been there a while. In fact the "bonus"
>(SCHED_LOAD_SCALE / 6 - 1) could be another configurable in the
>scheduling domain, so that at some level you're not interested in
>fairness and tasks just don't bounce at all.



Hopefully just tending to round down more would damp it better:

	*imbalance = (*imbalance + SCHED_LOAD_SCALE/2) >> SCHED_LOAD_SHIFT;

Or even remove the addition altogether.
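
For comparison, here is how the three roundings discussed above (the
current round-up, round-to-nearest, and plain truncation) treat a few
fractional imbalances; a quick userspace sketch, again assuming
SCHED_LOAD_SHIFT = 7 only for illustration:

#include <stdio.h>

#define SCHED_LOAD_SHIFT	7	/* illustrative value only */
#define SCHED_LOAD_SCALE	(1UL << SCHED_LOAD_SHIFT)

/* current code: any fraction becomes a whole task to move */
static unsigned long round_up(unsigned long imb)
{
	return (imb + SCHED_LOAD_SCALE - 1) >> SCHED_LOAD_SHIFT;
}

/* suggestion above: only half a task or more moves one */
static unsigned long round_nearest(unsigned long imb)
{
	return (imb + SCHED_LOAD_SCALE / 2) >> SCHED_LOAD_SHIFT;
}

/* no addition at all: plain truncation */
static unsigned long round_down(unsigned long imb)
{
	return imb >> SCHED_LOAD_SHIFT;
}

int main(void)
{
	/* imbalances of 0.25, 0.5, 0.75 and 1.25 tasks */
	unsigned long samples[] = { 32, 64, 96, 160 };
	int i;

	for (i = 0; i < 4; i++)
		printf("imbalance %.2f -> up %lu, nearest %lu, down %lu\n",
		       (double)samples[i] / SCHED_LOAD_SCALE,
		       round_up(samples[i]),
		       round_nearest(samples[i]),
		       round_down(samples[i]));
	return 0;
}

The round-up version will always try to move a task for any non-zero
imbalance, while the other two leave small, possibly transient imbalances
alone, which is the damping effect being argued for.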
