Re: [RFC PATCH] workqueue: cut wq_rr_cpu_last

From: Tejun Heo
Date: Thu Dec 03 2020 - 10:37:27 EST


Hello,

On Thu, Dec 03, 2020 at 06:28:41PM +0800, Hillf Danton wrote:
> + new_cpu = cpumask_any_and_distribute(wq_unbound_cpumask, cpu_online_mask);
> + if (new_cpu < nr_cpu_ids)
> + return new_cpu;
> + else
> + return cpu;
> }
>
> static void __queue_work(int cpu, struct workqueue_struct *wq,
> @@ -1554,7 +1546,7 @@ static int workqueue_select_cpu_near(int
> return cpu;
>
> /* Use "random" otherwise know as "first" online CPU of node */
> - cpu = cpumask_any_and(cpumask_of_node(node), cpu_online_mask);
> + cpu = cpumask_any_and_distribute(cpumask_of_node(node), cpu_online_mask);

This looks generally okay, but I think there's a real risk of different
cpumasks interfering with each other's cpu selection. E.g. imagine a cpu
issuing work items to two unbound workqueues consecutively, one numa-bound,
the other not. Because cpumask_any_and_distribute() keeps a single per-cpu
cursor shared by all callers, the above change will basically confine the
!numa one to the numa node.
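
To make the concern concrete, here is a minimal userspace model of the
interference. It is a sketch, not kernel code: the 8-cpu layout, the two
masks, and all helper names (next_cpu, pick_shared, wide_picks_shared) are
assumptions made for illustration, and the cursor starts at -1 rather than
mirroring the kernel's initial value.

```c
#include <stdint.h>

#define NR_CPUS   8
#define NODE_MASK 0x0Fu	/* hypothetical numa node: cpus 0-3 */
#define WIDE_MASK 0xFFu	/* hypothetical unbound mask: cpus 0-7 */

/* First set bit strictly after @prev in @mask, or NR_CPUS if none. */
static int next_cpu(int prev, uint32_t mask)
{
	for (int cpu = prev + 1; cpu < NR_CPUS; cpu++)
		if (mask & (1u << cpu))
			return cpu;
	return NR_CPUS;
}

/* Models the single cursor that every caller on this cpu shares. */
static int shared_prev;

static int pick_shared(uint32_t mask)
{
	int next = next_cpu(shared_prev, mask);

	if (next >= NR_CPUS)		/* wrap around to the first cpu */
		next = next_cpu(-1, mask);
	if (next < NR_CPUS)
		shared_prev = next;
	return next;
}

/*
 * Alternate numa-bound and !numa selections for @rounds rounds and
 * return the set of cpus the !numa caller was given.
 */
static uint32_t wide_picks_shared(int rounds)
{
	uint32_t seen = 0;

	shared_prev = -1;
	for (int i = 0; i < rounds; i++) {
		(void)pick_shared(NODE_MASK);	/* numa-bound workqueue */
		seen |= 1u << pick_shared(WIDE_MASK);	/* !numa workqueue */
	}
	return seen;
}
```

In this model the numa-bound caller keeps pulling the shared cursor back
into cpus 0-3, so the !numa caller never gets a cpu outside the node even
though its mask allows cpus 4-7.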

I think the right thing to do here is to extend
cpumask_any_and_distribute() so that the caller can provide its own cursor,
similar to what we do with ratelimits.
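
A rough userspace sketch of what a caller-provided cursor could look like
follows. This is only an illustration under stated assumptions: the
function name pick_with_cursor and its signature are hypothetical and do
not exist in the kernel, and the cpu layout and masks are made up.

```c
#include <stdint.h>

#define NR_CPUS   8
#define NODE_MASK 0x0Fu	/* hypothetical numa node: cpus 0-3 */
#define WIDE_MASK 0xFFu	/* hypothetical unbound mask: cpus 0-7 */

/* First set bit strictly after @prev in @mask, or NR_CPUS if none. */
static int next_cpu(int prev, uint32_t mask)
{
	for (int cpu = prev + 1; cpu < NR_CPUS; cpu++)
		if (mask & (1u << cpu))
			return cpu;
	return NR_CPUS;
}

/* Hypothetical variant: the caller owns @cursor, nothing is shared. */
static int pick_with_cursor(uint32_t mask, int *cursor)
{
	int next = next_cpu(*cursor, mask);

	if (next >= NR_CPUS)		/* wrap around to the first cpu */
		next = next_cpu(-1, mask);
	if (next < NR_CPUS)
		*cursor = next;
	return next;
}

/*
 * Same alternating pattern as before, but each user keeps its own
 * cursor. Returns the set of cpus the !numa caller was given.
 */
static uint32_t wide_picks_private(int rounds)
{
	int numa_cursor = -1, wide_cursor = -1;
	uint32_t seen = 0;

	for (int i = 0; i < rounds; i++) {
		(void)pick_with_cursor(NODE_MASK, &numa_cursor);
		seen |= 1u << pick_with_cursor(WIDE_MASK, &wide_cursor);
	}
	return seen;
}
```

With separate cursors the numa-bound selections no longer disturb the
!numa ones, and the !numa caller cycles through every cpu in its mask.
Where the cursor would live (per workqueue, per pool, per cpu) is left open
here.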

Thanks.

--
tejun