Re: [PATCH v3] workqueue: Use active mask for new worker when pool is DISASSOCIATED

From: Lai Jiangshan
Date: Thu Jul 14 2022 - 04:34:40 EST


On Thu, Jul 14, 2022 at 11:16 AM Schspa Shi <schspa@xxxxxxxxx> wrote:
>
> During CPU hot-unplug, all workers of the CPU's pools are rebound to
> active CPUs via unbind_workers().
>
> But an unbound worker still has a chance to create a new worker, and
> create_worker() binds the newly created task to pool->attrs->cpumask
> even though that CPU has already been unplugged.
>
> Please refer to the following scenario.
>
> CPU0                                    CPU1
> ------------------------------------------------------------------
> sched_cpu_deactivate(cpu_active_mask clear)
> workqueue_offline_cpu(work pool POOL_DISASSOCIATED)
>   -- all workers will migrate to another CPU --
>                                         worker_thread
>                                         -- will create a new worker if
>                                            pool->worklist is not empty
>                                         create_worker()
>                                         -- new kworker will be bound to CPU0

How will the new kworker be bound to CPU0? Could you give more details?

Since the pool is POOL_DISASSOCIATED, kthread_is_per_cpu() will be
false for the new worker, so ttwu() will put it on a fallback CPU IIUC
(see select_task_rq()).
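
For reference, this is roughly the check involved (a simplified sketch
of is_cpu_allowed() from kernel/sched/core.c around v5.19, not the
exact code):

        /* Sketch: can a woken task be placed on @cpu? */
        static inline bool is_cpu_allowed(struct task_struct *p, int cpu)
        {
                /* Not in the task's cpumask: no point in looking further. */
                if (!cpumask_test_cpu(cpu, p->cpus_ptr))
                        return false;

                /* Per-cpu kthreads may stay on an online but !active CPU. */
                if ((p->flags & PF_KTHREAD) && kthread_is_per_cpu(p))
                        return cpu_online(cpu);

                /* Other kthreads must not stay on a dying CPU. */
                if (cpu_dying(cpu))
                        return false;

                return (p->flags & PF_KTHREAD) ? cpu_online(cpu) : cpu_active(cpu);
        }

When this returns false for the wakeup CPU, select_task_rq() falls
back to select_fallback_rq(), which picks another allowed CPU.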

>                                            (pool->attrs->cpumask will be
>                                            the mask of CPU0).
>                                         kworker/0:x will be running on the rq
>
> sched_cpu_dying
>   if (rq->nr_running != 1 || rq_has_pinned_tasks(rq))
>     WARN(true, "Dying CPU not properly vacated!");
> ---------OOPS-------------
>


> The stack trace of the offending task was dumped via the following patch:
> Link: https://lore.kernel.org/all/20220519161125.41144-1-schspa@xxxxxxxxx/
> I think this debug patch should also be added to mainline; it can help
> us debug this kind of problem.
>
> To fix it, we can use cpu_active_mask when the work pool is DISASSOCIATED.

Use wq_unbound_cpumask here: the patch below actually falls back to
wq_unbound_cpumask, not cpu_active_mask.

>
> Signed-off-by: Schspa Shi <schspa@xxxxxxxxx>

Please also CC Peter, as:

CC: Peter Zijlstra <peterz@xxxxxxxxxxxxx>

>
> --
>
> Changelog:
> v1 -> v2:
>   - Move the worker task binding into worker_attach_to_pool(); remove
>     the extra wq_pool_attach_mutex that had been added.
>   - Add a timing diagram to make the issue clearer.
> v2 -> v3:
>   - Add the missing PF_NO_SETAFFINITY; use cpumask_intersects() to
>     avoid setting a bad mask for an unbound work pool, as Lai Jiangshan
>     advised.
>   - Call kthread_set_per_cpu() correctly for unbound workers.
> ---
> kernel/workqueue.c | 16 +++++++++++-----
> 1 file changed, 11 insertions(+), 5 deletions(-)
>
> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> index 1ea50f6be843..b3e9289d9640 100644
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -1860,8 +1860,16 @@ static struct worker *alloc_worker(int node)
>  static void worker_attach_to_pool(struct worker *worker,
>                                    struct worker_pool *pool)
>  {
> +       const struct cpumask *cpu_mask;
> +
>         mutex_lock(&wq_pool_attach_mutex);
>
> +       if (cpumask_intersects(pool->attrs->cpumask, cpu_active_mask))
> +               cpu_mask = pool->attrs->cpumask;
> +       else
> +               cpu_mask = wq_unbound_cpumask;
> +
> +       set_cpus_allowed_ptr(worker->task, cpu_mask);
>         /*
>          * The wq_pool_attach_mutex ensures %POOL_DISASSOCIATED remains
>          * stable across this function. See the comments above the flag
> @@ -1870,10 +1878,8 @@ static void worker_attach_to_pool(struct worker *worker,
>         if (pool->flags & POOL_DISASSOCIATED)
>                 worker->flags |= WORKER_UNBOUND;
>         else
> -               kthread_set_per_cpu(worker->task, pool->cpu);
> -
> -       if (worker->rescue_wq)
> -               set_cpus_allowed_ptr(worker->task, pool->attrs->cpumask);
> +               kthread_set_per_cpu(worker->task,
> +                                   cpu_mask == wq_unbound_cpumask ? -1 : pool->cpu);

Only workers of per-cpu pools need kthread_set_per_cpu(), and that case
is already handled by the POOL_DISASSOCIATED check above, so this extra
branch is unneeded.
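
I.e., the original lines can simply stay as they were (an untested
sketch of what I mean):

        if (pool->flags & POOL_DISASSOCIATED)
                worker->flags |= WORKER_UNBOUND;
        else
                kthread_set_per_cpu(worker->task, pool->cpu);

Unbound pools always have POOL_DISASSOCIATED set, so they never reach
the else branch in the first place.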

>
>         list_add_tail(&worker->node, &pool->workers);
>         worker->pool = pool;
> @@ -1952,8 +1958,8 @@ static struct worker *create_worker(struct worker_pool *pool)
>                 goto fail;
>
>         set_user_nice(worker->task, pool->attrs->nice);
> -       kthread_bind_mask(worker->task, pool->attrs->cpumask);
>
> +       worker->task->flags |= PF_NO_SETAFFINITY;
>         /* successful, attach the worker to the pool */
>         worker_attach_to_pool(worker, pool);
>
> --
> 2.29.0
>