Re: [PATCH v2] sched/topology: Introduce NUMA identity node sched domain

From: Borislav Petkov
Date: Mon Aug 14 2017 - 13:43:23 EST


Dropping stable@ from CC.

When using git send-email, make sure to exclude stable@ from the list of
recipients as this is not how we send a patch to stable.

On Fri, Aug 11, 2017 at 02:41:15AM -0500, Suravee Suthikulpanit wrote:
> On an AMD Family17h-based (EPYC) system, a NUMA node can contain
> up to 8 cores (16 threads) with the following topology.
>
>             ----------------------------
>         C0  | T0 T1 |    ||    | T0 T1 | C4
>             --------|    ||    |--------
>         C1  | T0 T1 | L3 || L3 | T0 T1 | C5
>             --------|    ||    |--------
>         C2  | T0 T1 | #0 || #1 | T0 T1 | C6
>             --------|    ||    |--------
>         C3  | T0 T1 |    ||    | T0 T1 | C7
>             ----------------------------
>
> Here, there are 2 last-level (L3) caches per NUMA node. A socket can
> contain up to 4 NUMA nodes, and a system can support up to 2 sockets.
> With a full system configuration, the current scheduler creates 4 sched
> domains:
>
> domain0 SMT (span a core)
> domain1 MC (span a last-level-cache)
> domain2 NUMA (span a socket: 4 nodes)
> domain3 NUMA (span a system: 8 nodes)
>
> Note that there is no domain to represent cpus spanning a NUMA node.

...

> diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
> index 79895ae..2dd5b11 100644
> --- a/kernel/sched/topology.c
> +++ b/kernel/sched/topology.c
> @@ -1335,6 +1335,10 @@ void sched_init_numa(void)
>  	if (!sched_domains_numa_distance)
>  		return;
>
> +	/* Includes NUMA identity node at level 0. */
> +	sched_domains_numa_distance[level++] = curr_distance;
> +	sched_domains_numa_levels = level;
> +
>  	/*
>  	 * O(nr_nodes^2) deduplicating selection sort -- in order to find the
>  	 * unique distances in the node_distance() table.
> @@ -1382,8 +1386,7 @@ void sched_init_numa(void)
>  		return;
>
>  	/*
> -	 * 'level' contains the number of unique distances, excluding the
> -	 * identity distance node_distance(i,i).

I'm still unclear as to why we were excluding this identity distance
until now and how that change would affect existing systems.
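
To make the question concrete, here is a minimal userspace sketch of the
level counting, not the kernel code. It assumes a hypothetical 2-socket,
8-node system with made-up distances (10 local, 16 within a socket, 32
across sockets); node_distance() and collect_levels() below are
illustrative stand-ins and only mimic the deduplicating scan in
sched_init_numa().

```c
#include <assert.h>
#include <stddef.h>

#define NR_NODES        8   /* assumption: 2 sockets x 4 nodes */
#define NODES_PER_SKT   4
#define DIST_LOCAL     10   /* hypothetical identity distance */
#define DIST_SOCKET    16   /* hypothetical: other node, same socket */
#define DIST_REMOTE    32   /* hypothetical: node on the other socket */

/* Stand-in for the firmware-provided distance table. */
static int node_distance(int a, int b)
{
	if (a == b)
		return DIST_LOCAL;
	if (a / NODES_PER_SKT == b / NODES_PER_SKT)
		return DIST_SOCKET;
	return DIST_REMOTE;
}

/*
 * Collect the unique distances in ascending order into out[] and return
 * how many were stored. With include_identity == 0 the identity distance
 * node_distance(i,i) is skipped; with include_identity != 0 it becomes
 * level 0.
 */
static int collect_levels(int include_identity, int *out)
{
	int level = 0;
	int curr = node_distance(0, 0);

	assert(out != NULL);

	if (include_identity)
		out[level++] = curr;

	for (;;) {
		int next = curr;

		/* Find the smallest distance strictly greater than curr. */
		for (int i = 0; i < NR_NODES; i++) {
			for (int j = 0; j < NR_NODES; j++) {
				int d = node_distance(i, j);

				if (d > curr && (next == curr || d < next))
					next = d;
			}
		}

		if (next == curr)
			break;	/* no larger distance left */

		out[level++] = next;
		curr = next;
	}

	return level;
}
```

Under these assumed distances, collect_levels(0, buf) yields two levels
(16, 32), which matches the two NUMA domains listed above, and
collect_levels(1, buf) yields three (10, 16, 32). What the extra level
turns into on systems where a node already equals the MC span is exactly
what I'd like to see explained.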

Also, you do use the term "NUMA" pretty loosely in the text - please
take care to explain precisely what kind of node you mean: physical,
logical, ... Don't be afraid to be too verbose.

Thanks.

--
Regards/Gruss,
Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)