Re: WARN_ON_ONCE() in process_one_work()?

From: Tejun Heo
Date: Tue Jun 13 2017 - 16:58:43 EST


Hello, Paul.

On Fri, May 05, 2017 at 10:11:59AM -0700, Paul E. McKenney wrote:
> Just following up... I have hit this bug a couple of times over the
> past few days. Anything I can do to help?

My apologies for dropping the ball on this. I've gone over the
hotplug code in workqueue several times but can't see how this
could happen. Can you please apply the following patch and see what
it says when the problem happens?

Thanks.

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index c74bf39ef764..bd2ce3cbfb41 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1691,13 +1691,20 @@ static struct worker *alloc_worker(int node)
 static void worker_attach_to_pool(struct worker *worker,
 				   struct worker_pool *pool)
 {
+	int ret;
+
 	mutex_lock(&pool->attach_mutex);

 	/*
 	 * set_cpus_allowed_ptr() will fail if the cpumask doesn't have any
 	 * online CPUs. It'll be re-applied when any of the CPUs come up.
 	 */
-	set_cpus_allowed_ptr(worker->task, pool->attrs->cpumask);
+	ret = set_cpus_allowed_ptr(worker->task, pool->attrs->cpumask);
+
+	WARN(ret && !(pool->flags & POOL_DISASSOCIATED),
+	     "set_cpus_allowed_ptr failed, ret=%d pool->cpu/flags=%d/0x%x cpumask=%*pbl online=%*pbl active=%*pbl\n",
+	     ret, pool->cpu, pool->flags, cpumask_pr_args(pool->attrs->cpumask),
+	     cpumask_pr_args(cpu_online_mask), cpumask_pr_args(cpu_active_mask));

 	/*
 	 * The pool->attach_mutex ensures %POOL_DISASSOCIATED remains
@@ -2037,8 +2044,11 @@ __acquires(&pool->lock)
 	lockdep_copy_map(&lockdep_map, &work->lockdep_map);
 #endif
 	/* ensure we're on the correct CPU */
-	WARN_ON_ONCE(!(pool->flags & POOL_DISASSOCIATED) &&
-		     raw_smp_processor_id() != pool->cpu);
+	if (WARN_ON_ONCE(!(pool->flags & POOL_DISASSOCIATED) &&
+			 raw_smp_processor_id() != pool->cpu))
+		printk_once("XXX workfn=%pf pool->cpu/flags=%d/0x%x curcpu=%d online=%*pbl active=%*pbl\n",
+			    work->func, pool->cpu, pool->flags, raw_smp_processor_id(),
+			    cpumask_pr_args(cpu_online_mask), cpumask_pr_args(cpu_active_mask));

 	/*
 	 * A single work shouldn't be executed concurrently by