Re: [PATCH v6] workqueue: Fix edge cases for calc of pool's cpumask

From: Tejun Heo
Date: Thu Jul 27 2017 - 16:46:41 EST


Hello, Michael.

On Thu, Jul 27, 2017 at 03:15:47PM -0500, Michael Bringmann wrote:
> There is an underlying assumption in many layers / modules of the Linux
> system that CPU <-> node mapping is static. This is despite the presence
> of features like NUMA and 'hotplug' that support the dynamic addition/
> removal of fundamental system resources like CPUs and memory. PowerPC
> systems, however, do provide extensive features for the dynamic change
> of resources available to a system.

The text can go as-is, but keeping cpu <-> node mapping static is a
trade-off rather than upper layers missing something. Making cpu
<-> node mapping dynamic means adding complexity and overhead to a
lot of hotter paths, including memory allocation. It's just a better
trade-off to keep the mapping static from the arch side even if that
means we have to use more complex mapping in arch and/or allocate more
possible cpus than strictly necessary.

> Currently, there is little or no synchronization protection around the
> updating of the CPU <-> node mapping, and the export/update of this
> information for other layers / modules. In systems which can change
> this mapping during 'hotplug', like PowerPC, the information is changing
> underneath all layers that might reference it.
>
> This patch attempts to ensure that a valid, usable cpumask attribute is
> used by the workqueue infrastructure when setting up new resource pools.
> It prevents a crash that has been observed when an 'empty' cpumask is
> passed along to the worker/task scheduling code. It is intended as an
> intermediate fix until a more fundamental review and correction of the
> issue can be done.
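
For anyone following along, the guard described above can be sketched in
userspace C roughly as below. This is only an illustration, not the patch
itself: sketch_cpumask and pool_cpumask() are hypothetical names, a 64-bit
integer stands in for struct cpumask, and falling back to the workqueue's
own mask is an assumption about what a valid, usable default looks like.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for struct cpumask: one bit per CPU, up to 64 CPUs. */
typedef uint64_t sketch_cpumask;

/*
 * Pick the cpumask for a per-node worker pool (illustrative only).
 *
 * wq_mask:   CPUs the workqueue is allowed to use (must be non-empty)
 * node_mask: CPUs currently mapped to the node
 * online:    CPUs currently online
 *
 * If the intersection is empty, e.g. because the cpu <-> node mapping
 * changed during hotplug, fall back to wq_mask so that an empty mask is
 * never handed on to the worker/task scheduling code.
 */
static sketch_cpumask pool_cpumask(sketch_cpumask wq_mask,
                                   sketch_cpumask node_mask,
                                   sketch_cpumask online)
{
    sketch_cpumask mask;

    assert(wq_mask != 0);

    mask = wq_mask & node_mask & online;
    return mask ? mask : wq_mask;
}
```

With wq_mask = 0xF and a node that currently reports no CPUs, the result is
0xF rather than 0, so the caller never sees an empty mask.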

Thanks a lot for your patience! Much appreciated.

--
tejun