Re: [PATCH v2] cpu/hotplug: wait for cpuset_hotplug_work to finish on cpu onlining

From: Qais Yousef
Date: Tue Feb 16 2021 - 13:30:25 EST


On 02/12/21 00:30, Alexey Klimov wrote:
> When a CPU is offlined and onlined via device_offline() and device_online(),
> userspace gets a uevent notification. If, after receiving the "online" uevent,
> userspace executes sched_setaffinity() on some task, trying to move it
> to the recently onlined CPU, then it often fails with -EINVAL. Userspace needs
> to wait around 5..30 ms after receiving the uevent before sched_setaffinity()
> will succeed for the recently onlined CPU.
>
> If the in_mask argument for sched_setaffinity() contains only the recently
> onlined CPU, it often fails with the following flow:
>
>   sched_setaffinity()
>     cpuset_cpus_allowed()
>       guarantee_online_cpus()      <-- cs->effective_cpus mask does not
>                                        contain recently onlined cpu
>     cpumask_and()                  <-- final new_mask is empty
>     __set_cpus_allowed_ptr()
>       cpumask_any_and_distribute() <-- returns dest_cpu equal to nr_cpu_ids
>       returns -EINVAL
>
> Cpusets used in guarantee_online_cpus() are updated via a workqueue from
> cpuset_update_active_cpus(), which in turn is called from the CPU hotplug
> callback sched_cpu_activate(). Hence the update may not yet be observable by
> sched_setaffinity() if it is called immediately after the uevent.

nit: newline

> An out-of-line uevent can be avoided if we ensure that cpuset_hotplug_work
> has run to completion, using cpuset_wait_for_hotplug() after onlining the
> CPU in cpu_device_up() and in cpuhp_smt_enable().
>
> Co-analyzed-by: Joshua Baker <jobaker@xxxxxxxxxx>
> Signed-off-by: Alexey Klimov <aklimov@xxxxxxxxxx>
> ---

This looks good to me.
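
For anyone who wants to reproduce the problem from userspace, below is a rough
sketch of the kind of retry loop userspace needs today and that this patch
should make unnecessary. The helper name set_affinity_with_retry(), the 1 ms
polling interval and the max_wait_ms parameter are my own illustration and are
not part of the patch; only sched_setaffinity() and the -EINVAL behaviour come
from the description above.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* for sched_setaffinity() and CPU_* macros */
#endif
#include <assert.h>
#include <errno.h>
#include <sched.h>
#include <sys/types.h>
#include <time.h>

/*
 * Hypothetical helper (not from the patch): pin `pid` to the single CPU
 * `cpu`. While the kernel answers EINVAL, which is what userspace sees
 * when the cpuset masks have not caught up with a freshly onlined CPU,
 * sleep 1 ms and retry, for roughly `max_wait_ms` milliseconds in total.
 *
 * Returns 0 on success, -1 with errno set on failure.
 */
static int set_affinity_with_retry(pid_t pid, int cpu, int max_wait_ms)
{
	struct timespec one_ms = { .tv_sec = 0, .tv_nsec = 1000000 };
	cpu_set_t mask;
	int waited_ms = 0;

	assert(max_wait_ms >= 0);

	/* CPU_SET() must not be used with an out-of-range index. */
	if (cpu < 0 || cpu >= CPU_SETSIZE) {
		errno = EINVAL;
		return -1;
	}

	CPU_ZERO(&mask);
	CPU_SET(cpu, &mask);

	for (;;) {
		if (sched_setaffinity(pid, sizeof(mask), &mask) == 0)
			return 0;

		/* Only EINVAL is treated as "not ready yet". */
		if (errno != EINVAL || waited_ms >= max_wait_ms)
			return -1;

		nanosleep(&one_ms, NULL);
		waited_ms++;
	}
}
```

Something like set_affinity_with_retry(0, cpu, 50) called right after the
"online" uevent would cover the 5..30 ms window mentioned above. With the patch
applied I would expect the first sched_setaffinity() call to succeed.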

Reviewed-by: Qais Yousef <qais.yousef@xxxxxxx>

Thanks

--
Qais Yousef