Re: [PATCH] kthread_worker: re-set CPU affinities if CPU come online

From: Petr Mladek
Date: Mon Oct 26 2020 - 12:46:27 EST


On Mon 2020-10-26 09:50:11, Tejun Heo wrote:
> On Mon, Oct 26, 2020 at 02:52:13PM +0800, qiang.zhang@xxxxxxxxxxxxx wrote:
> > @@ -737,8 +741,11 @@ __kthread_create_worker(int cpu, unsigned int flags,
> > if (IS_ERR(task))
> > goto fail_task;
> >
> > - if (cpu >= 0)
> > + if (cpu >= 0) {
> > kthread_bind(task, cpu);
> > + worker->bind_cpu = cpu;
> > + cpuhp_state_add_instance_nocalls(kworker_online, &worker->cpuhp_node);
> > + }
> >
> > worker->flags = flags;
> > worker->task = task;
> ...
> > +static int kworker_cpu_online(unsigned int cpu, struct hlist_node *node)
> > +{
> > + struct kthread_worker *worker = hlist_entry(node, struct kthread_worker, cpuhp_node);
> > + struct task_struct *task = worker->task;
> > +
> > + if (cpu == worker->bind_cpu)
> > + WARN_ON_ONCE(set_cpus_allowed_ptr(task, cpumask_of(cpu)) < 0);
> > + return 0;
> > +}
>
> I don't think this works. The kthread may have changed its binding while
> running using set_cpus_allowed_ptr() as you're doing above. Besides, when a
> cpu goes offline, the bound kthread can fall back to other cpus but its cpu
> mask isn't cleared, is it?

If I get it correctly, select_fallback_rq() calls
do_set_cpus_allowed() explicitly or in cpuset_cpus_allowed_fallback().
It seems that the original mask gets lost.
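
To illustrate what I mean, here is a userspace toy model, not kernel code.
All names in it (task_model, fallback_model, rebind_model) are made up,
and the fallback rule is a deliberate simplification of what
select_fallback_rq() really does. It only shows why the bound CPU has to
be remembered separately once the mask has been overwritten:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model: one bit per CPU, standing in for the task's allowed mask. */
struct task_model {
	uint64_t cpus_allowed;
};

/*
 * Simplified assumption: when none of the allowed CPUs is online,
 * the mask is overwritten with the online CPUs. The original
 * single-CPU mask is not saved anywhere.
 */
static void fallback_model(struct task_model *t, uint64_t online_mask)
{
	assert(t != NULL);
	if ((t->cpus_allowed & online_mask) == 0)
		t->cpus_allowed = online_mask;
}

/*
 * Models the online callback from the patch: restore the single-CPU
 * mask only when the CPU coming online is the one the worker was
 * bound to. bind_cpu has to be stored outside the mask for this.
 */
static void rebind_model(struct task_model *t, int bind_cpu, int onlined_cpu)
{
	assert(t != NULL);
	if (bind_cpu == onlined_cpu)
		t->cpus_allowed = UINT64_C(1) << bind_cpu;
}
```

In this model, a task bound to CPU 2 ends up allowed on all online CPUs
after CPU 2 goes away, and only the separately stored bind_cpu lets the
callback narrow the mask again.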

It would make sense to expect the kthread_worker API to take care of
the affinity when it was set by kthread_create_worker_on_cpu().

But is it safe to assume that the work can also safely proceed
on another CPU? We should probably add a warning to
kthread_worker_fn() when it detects that it is running on the wrong CPU.
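
A rough userspace sketch of the check I have in mind, not a patch.
worker_model, worker_on_wrong_cpu() and worker_check_cpu() are
hypothetical names; bind_cpu mirrors the field added by the patch above,
and the per-worker "warned" flag only approximates WARN_ON_ONCE(),
which is per call site in the kernel:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Toy stand-in for struct kthread_worker; bind_cpu < 0 means "not bound". */
struct worker_model {
	int bind_cpu;
	bool warned;	/* emulates the "once" behaviour, per worker here */
};

/* True when a bound worker runs on a CPU other than the one it was bound to. */
static bool worker_on_wrong_cpu(const struct worker_model *w, int running_cpu)
{
	assert(w != NULL);
	return w->bind_cpu >= 0 && running_cpu != w->bind_cpu;
}

/*
 * Would be called before processing each work item.
 * Prints a warning the first time a mismatch is seen and
 * returns true only when a warning was actually printed.
 */
static bool worker_check_cpu(struct worker_model *w, int running_cpu)
{
	if (!worker_on_wrong_cpu(w, running_cpu) || w->warned)
		return false;

	w->warned = true;
	fprintf(stderr, "worker bound to CPU %d is running on CPU %d\n",
		w->bind_cpu, running_cpu);
	return true;
}
```

In the kernel, the running CPU would presumably come from something like
smp_processor_id(), but I have not checked whether that is safe to call
at that point in kthread_worker_fn().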

BTW: kthread_create_worker_on_cpu() is currently used only by
start_power_clamp_worker(). And it has its own CPU hotplug
handling. The kthreads are stopped and started again
in powerclamp_cpu_predown() and powerclamp_cpu_online().


I haven't checked all the details yet. But in principle, the patch looks
sane to me.

Best Regards,
Petr