Re: [RFC][PATCH v5 01/14] sched: add a new arch_sd_local_flags for sched_domain init

From: Dietmar Eggemann
Date: Tue Nov 12 2013 - 12:43:52 EST


On 06/11/13 14:08, Peter Zijlstra wrote:
> On Wed, Nov 06, 2013 at 02:53:44PM +0100, Martin Schwidefsky wrote:
>> On Tue, 5 Nov 2013 23:27:52 +0100
>> Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>>
>>> On Tue, Nov 05, 2013 at 03:57:23PM +0100, Vincent Guittot wrote:
>>>> Your proposal looks fine for me. It's clearly better to move in one
>>>> place the configuration of sched_domain fields. Have you already got
>>>> an idea about how to let architecture override the topology?
>>>
>>> Maybe something like the below -- completely untested (my s390 compiler
>>> is on a machine that's currently powered off).
>>
>> In principle I do not see a reason why this should not work, but there
>> are a few more things to take care of. E.g. struct sd_data is defined
>> in kernel/sched/core.c, cpu_cpu_mask as well. These need to be moved
>> to a header where arch/s390/kernel/smp.c can pick it up.
>>
>> I do have the feeling that the sched_domain_topology should be left
>> where they are, or do we really want to expose more of the scheduler
>> internals?
>
> Ah, its a trade off; in that previous patch I removed the entire
> sched_domain initializers the archs used to 'have' to fill out. That
> exposed far too much behavioural stuff the archs really shouldn't
> bother with.
>
> In return we now provide a (hopefully) simpler interface that allows
> archs to communicate their topology to the scheduler -- without getting
> mixed up in the behavioural aspects (too much).
>
> Maybe s390 wasn't the best example to pick, as the book domain really
> isn't that exciting. Arguably I should have taken Power7+ and the
> ASYM_PACKING SMT thing.

We actually don't have to expose sched_domain_topology or any internal
scheduler data structures.

We can still get rid of the SD_XXX_INIT stuff and do the sched_domain
initialization for all levels in one function, sd_init().

Moreover, we could introduce a general arch-specific function that
replaces the arch-specific functions for particular flags and levels,
such as arch_sd_sibling_asym_packing() or Vincent's arch_sd_local_flags().
This general function exposes the level and the sched_domain pointer to
the arch, which could then fine-tune the sched_domain at each individual
level.
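
To illustrate the shape of such a hook, here is a minimal userspace
sketch. It is not taken from the patch below: every name in it
(sched_domain_sketch, sd_init_sketch(), arch_sd_customize_sketch(), the
SKETCH_* flags and the level enum) is hypothetical, and the flag values
are arbitrary placeholders rather than the kernel's SD_* values.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical placeholder flags; NOT the kernel's SD_* values. */
#define SKETCH_SD_SHARE_CPUPOWER  0x1u
#define SKETCH_SD_ASYM_PACKING    0x2u

/* Hypothetical topology levels, lowest first. */
enum sd_level_sketch { SD_LVL_SMT, SD_LVL_MC, SD_LVL_CPU };

/* Stand-in for struct sched_domain, reduced to what the sketch needs. */
struct sched_domain_sketch {
	int level;
	unsigned int flags;
};

/*
 * Hypothetical general arch hook: it receives the level and the domain
 * pointer and may tune the domain for that level. This example arch
 * asks for asymmetric packing at the SMT level only.
 */
void arch_sd_customize_sketch(int level, struct sched_domain_sketch *sd)
{
	if (level == SD_LVL_SMT)
		sd->flags |= SKETCH_SD_ASYM_PACKING;
}

/*
 * One init function for all levels: apply generic per-level defaults,
 * then let the arch fine-tune the result through the single hook.
 */
void sd_init_sketch(struct sched_domain_sketch *sd, int level)
{
	assert(sd != NULL);

	sd->level = level;
	sd->flags = 0;

	/* Generic default in this sketch: SMT siblings share cpu power. */
	if (level == SD_LVL_SMT)
		sd->flags |= SKETCH_SD_SHARE_CPUPOWER;

	arch_sd_customize_sketch(level, sd);
}
```

With a single hook of this kind, an arch that has nothing to tune would
simply leave the function empty, and the scheduler side would not need a
separate arch function per flag or per level.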

Below is a patch based on your idea of transforming sd_numa_init()
into sd_init(). The main difference is that I don't try to distinguish
based on power-management-related flags inside sd_init() but rather on
the new sd level data.

Dietmar

----8<----