Re: synchronize_rcu_expedited gets stuck in hotplug path

From: Tejun Heo
Date: Tue Jan 18 2022 - 15:11:40 EST


Hello,

On Tue, Jan 18, 2022 at 12:06:46PM -0800, Paul E. McKenney wrote:
> Interesting. Adding Tejun and Lai on CC for their perspective.
>
> As you say, the incoming CPU invoked synchronize_rcu_expedited() which
> in turn invoked queue_work(). By default, workqueues will of course
> queue that work on the current CPU. But in this case, the CPU's bit
> is not yet set in the cpu_active_mask. Thus, a workqueue scheduled on
> the incoming CPU won't be invoked until CPUHP_AP_ACTIVE, which won't
> be reached until after the grace period ends, which cannot happen until
> the workqueue handler is invoked.
>
> I could imagine doing something as shown in the (untested) patch below,
> but first does this help?
>
> If it does help, would this sort of check be appropriate here or
> should it instead go into workqueues?

Maybe it can be solved by rearranging the hotplug sequence, but scheduling
per-cpu work items from hotplug paths is fragile. Maybe the whole issue can
be side-stepped by making synchronize_rcu_expedited() use an unbound
workqueue instead? Does it need to be per-cpu?
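
To make the dependency cycle concrete, here is a toy userspace model. It is
not kernel code, and every name in it (struct model, model_queue_work(),
model_run_workers(), model_bring_up()) is made up for illustration. It
assumes, per the description quoted above, that a per-cpu work item only
runs once its CPU is active, while an unbound one can run on any CPU that
is already active.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4

/* Toy state; active[] is a stand-in for cpu_active_mask. */
struct model {
	bool active[NR_CPUS];
	bool work_pending;
	int  work_cpu;		/* CPU the work is bound to, -1 if unbound */
	bool gp_done;		/* set once the grace-period work has run */
};

/* Queue the grace-period work, bound to @cpu or unbound if @cpu < 0. */
static void model_queue_work(struct model *m, int cpu)
{
	m->work_pending = true;
	m->work_cpu = cpu;
}

/* One worker pass; returns true if the pending work item ran. */
static bool model_run_workers(struct model *m)
{
	bool can_run = false;

	if (!m->work_pending)
		return false;

	if (m->work_cpu >= 0) {
		/* Per-cpu work: needs its own CPU to be active. */
		can_run = m->active[m->work_cpu];
	} else {
		/* Unbound work: any active CPU will do. */
		for (int i = 0; i < NR_CPUS; i++)
			can_run = can_run || m->active[i];
	}

	if (!can_run)
		return false;

	m->work_pending = false;
	m->gp_done = true;
	return true;
}

/*
 * Incoming @cpu waits for a grace period before it is marked active.
 * Returns true if it gets there, false if it is stuck.
 */
static bool model_bring_up(int cpu, bool use_unbound)
{
	struct model m = {0};

	assert(cpu > 0 && cpu < NR_CPUS);
	m.active[0] = true;	/* boot CPU is already active */

	model_queue_work(&m, use_unbound ? -1 : cpu);

	/* Bounded wait so the model reports the hang instead of spinning. */
	for (int i = 0; i < 8 && !m.gp_done; i++)
		model_run_workers(&m);

	if (!m.gp_done)
		return false;	/* work needs @cpu active, @cpu needs the work */

	m.active[cpu] = true;
	return true;
}
```

In this model, model_bring_up(1, false) never gets out of the wait because
the work is bound to a CPU that cannot become active until the work has run,
whereas model_bring_up(1, true) completes because CPU 0 can run the work.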

Thanks.

--
tejun