Re: [PATCH 1/1] sched/topology: Make sched_init_numa() use a set for the deduplicating sort

From: Valentin Schneider
Date: Mon Feb 01 2021 - 06:56:25 EST


On 01/02/21 10:53, Dietmar Eggemann wrote:
> On 22/01/2021 13:39, Valentin Schneider wrote:
>
> [...]
>
>> @@ -1705,7 +1702,7 @@ void sched_init_numa(void)
>> /* Compute default topology size */
>> for (i = 0; sched_domain_topology[i].mask; i++);
>>
>> - tl = kzalloc((i + level + 1) *
>> + tl = kzalloc((i + nr_levels) *
>> sizeof(struct sched_domain_topology_level), GFP_KERNEL);
>> if (!tl)
>> return;
>
> This hunk creates issues during startup on my Arm64 juno board on tip/sched/core.
>
> ---8<---
>
> From: Dietmar Eggemann <dietmar.eggemann@xxxxxxx>
> Date: Mon, 1 Feb 2021 09:58:04 +0100
> Subject: [PATCH] sched/topology: Fix sched_domain_topology_level alloc in
> sched_init_numa
>
> Commit "sched/topology: Make sched_init_numa() use a set for the
> deduplicating sort" allocates 'i + nr_levels (level)' instead of
> 'i + nr_levels + 1' sched_domain_topology_level.
>
> This led to an Oops (on Arm64 juno with CONFIG_SCHED_DEBUG):
>
> sched_init_domains
> build_sched_domains()
> __free_domain_allocs()
> __sdt_free() {
> ...
> for_each_sd_topology(tl)
> ...
> sd = *per_cpu_ptr(sdd->sd, j); <--
> ...
> }
>
> Signed-off-by: Dietmar Eggemann <dietmar.eggemann@xxxxxxx>

Damn, I forgot that the topology level array must be terminated by a
zeroed (NULL ->mask) sentinel entry! Vincent fixed the same thing a few
years ago:

c515db8cd311 ("sched/numa: Fix initialization of sched_domain_topology for NUMA")

Thanks for fixing up my mistake, I ought to have tested !NUMA setups.
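
For the archives, here is a minimal userspace sketch of why the '+ 1'
matters. It is not the kernel code: struct topo_level, count_levels() and
alloc_levels() are hypothetical stand-ins, and calloc() plays the role of
kzalloc(). The only part taken from the kernel is the idea that
for_each_sd_topology() keeps walking until it finds an entry whose ->mask
is NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct sched_domain_topology_level. */
struct topo_level {
	const char *mask;	/* NULL marks the end of the array */
};

/* Walk until the sentinel, the way for_each_sd_topology() tests ->mask. */
static size_t count_levels(const struct topo_level *tl)
{
	size_t n = 0;

	assert(tl != NULL);
	for (; tl->mask; tl++)
		n++;
	return n;
}

/*
 * Allocate room for the default levels, the NUMA levels and one extra
 * zeroed entry. Without the '+ 1' the walk above would read past the end
 * of the allocation while looking for a NULL ->mask.
 */
static struct topo_level *alloc_levels(size_t nr_default, size_t nr_numa)
{
	struct topo_level *tl;
	size_t i;

	tl = calloc(nr_default + nr_numa + 1, sizeof(*tl));
	if (!tl)
		return NULL;

	for (i = 0; i < nr_default + nr_numa; i++)
		tl[i].mask = "level";	/* any non-NULL placeholder */

	return tl;	/* tl[nr_default + nr_numa].mask stays NULL */
}
```

With alloc_levels(2, 3), count_levels() stops after five entries because
the sixth one was left zeroed by the allocator.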