Re: [PATCH 1/2] Customize sched domain via cpuset

From: Peter Zijlstra
Date: Tue Apr 01 2008 - 08:00:00 EST


On Tue, 2008-04-01 at 06:55 -0500, Paul Jackson wrote:
> Interesting ...
>
> So, we have two flags here. One flag "sched_wake_idle_far" that will
> cause the current task to search farther for an idle CPU when it wakes
> up another task that needs a CPU on which to run, and the other flag
> "sched_balance_newidle_far" that will cause a soon-to-idle CPU to search
> farther for a task it might pull over and run, instead of going idle.
>
> I am tempted to ask if we should not elaborate this in one dimension,
> and simplify it in another dimension.
>
> First the simplification side: do we need both flags? Yes, they are
> two distinct cases in the code, but perhaps practical uses will always
> end up setting both flags the same way. If that's the case, then we
> are just burdening the user of these flags with understanding a detail
> that didn't matter to them: did a waking task or an idle CPU provoke
> the search? Do you have or know of a situation where you actually
> desire to enable one flag while disabling the other?
>
> For the elaboration side: your proposal has just two levels of
> distance, near and far. Perhaps, as architectures become more
> elaborate and hierarchies deeper, we would want N levels of distance,
> and the ability to request such load balancing for all levels up to
> "n", for our choice of "n" <= N.
>
> If we did both the above, then we might have a single per-cpuset file
> that took an integer value ... this "n". If (n == 0), that might mean
> no such balancing at all. If (n == 1), that might mean just the
> nearest balancing, for example, to the hyperthread within the same core,
> on some current Intel architectures. If (n == 2), then that might mean,
> on the same architectures, that balancing could occur across cores
> within the same package. If (n == 3) then that might mean, again on
> that architecture, that balancing could occur across packages on the
> same node board. As architectures evolve over time, the exact details
> of what each value of "n" means would evolve, but always higher "n"
> would enable balancing across a wider portion of the system.
>
> Please understand I am just brainstorming here. I don't know whether
> the alternatives I considered above are preferable to what
> your patch presents.

FWIW I like your suggestions.
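
For illustration only, the single-integer scheme described in the quoted
text could be sketched roughly as below. This is not code from the patch:
the enum, its level names, and both helper functions are hypothetical, and
the ordering of levels simply follows the Intel-style example given above
(hyperthread, then core, then package). The one check serves both the
wake-idle and the newidle case, which is what merging the two flags would
amount to.

```c
#include <stdbool.h>

/*
 * Hypothetical distance levels, ordered nearest to farthest.
 * Names and ordering are assumptions made for this sketch only.
 */
enum balance_level {
	LEVEL_NONE = 0,		/* n == 0: no such balancing at all */
	LEVEL_SIBLING,		/* n == 1: hyperthreads within one core */
	LEVEL_CORE,		/* n == 2: cores within one package */
	LEVEL_PACKAGE,		/* n == 3: packages on one node board */
	LEVEL_MAX = LEVEL_PACKAGE
};

/* Clamp a user-supplied "n" into the range [0, LEVEL_MAX]. */
static int clamp_level(int n)
{
	if (n < 0)
		return LEVEL_NONE;
	if (n > LEVEL_MAX)
		return LEVEL_MAX;
	return n;
}

/*
 * May a search for an idle CPU (or for a task to pull) extend to
 * "level" when the per-cpuset value is "n"?  Every level up to and
 * including the clamped "n" is allowed; LEVEL_NONE never is.
 */
static bool may_balance_at(int n, enum balance_level level)
{
	return level != LEVEL_NONE && (int)level <= clamp_level(n);
}
```

With such a helper, n == 0 disables the search entirely, and any value
above the deepest known level behaves like that deepest level, so the
meaning of a given "n" could widen as hierarchies grow without changing
the interface.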
