Re: [PATCH v2] arm64: topology: Avoid checking numa mask for scheduler MC selection

From: Catalin Marinas
Date: Wed Jun 06 2018 - 13:18:24 EST


On Wed, Jun 06, 2018 at 11:38:46AM -0500, Jeremy Linton wrote:
> The NUMA mask subset check can often lead to a system hang or crash
> during CPU hotplug and system suspend operations if NUMA is disabled.
> This is mostly observed on HMP systems, where CPUs have different
> compute capacities and end up in different scheduler domains. Since
> cpumask_of_node is returned instead of core_sibling, the scheduler is
> confused by incorrect cpumasks (e.g. one CPU in two different sched
> domains at the same time) on CPU hotplug.
>
> Let's disable the NUMA sibling checks for the time being, as
> NUMA-in-socket machines have LLCs that will ensure that the scheduler
> topology isn't "borken".
>
> The NUMA check exists to ensure that if an LLC within a socket crosses
> NUMA nodes/chiplets, the scheduler domains remain consistent. This code
> will likely have to be re-enabled in the near future once the NUMA mask
> story is sorted. At the moment it's not necessary because the LLCs of
> NUMA-in-socket machines are contained within the NUMA domains.
>
> Further, as a defensive mechanism during hotplug, let's ensure that the
> LLC siblings are also masked.
>
> Reported-by: Geert Uytterhoeven <geert@xxxxxxxxxxxxxx>
> Reviewed-by: Sudeep Holla <sudeep.holla@xxxxxxx>
> Signed-off-by: Jeremy Linton <jeremy.linton@xxxxxxx>

Thanks for this. I've queued it for this merge window.

--
Catalin