Re: rcu: Add might_sleep() check to synchronize_rcu()

From: Paul E. McKenney
Date: Sun Mar 25 2018 - 14:49:46 EST


On Fri, Mar 23, 2018 at 10:12:24PM +0100, Thomas Gleixner wrote:
> Subject: rcu: Add might_sleep() check to synchronize_rcu()
> From: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Date: Fri, 23 Mar 2018 22:02:18 +0100
>
> Joel reported a debugobjects warning which is triggered by an RCU callback
> invoking synchronize_rcu(). RCU callbacks run in softirq context, so
> calling synchronize_rcu() is a bad idea as it might sleep.
>
> debugobjects triggers because __wait_rcu_gp() uses on-stack objects and
> invokes debug_object_init_on_stack(). That function checks the object
> address against current's task stack, which fails because the code runs on
> the softirq stack.
>
> synchronize_rcu() lacks a might_sleep() check which would have caught that
> issue way earlier because it would trigger with the minimal debug options
> enabled.
>
> Add a might_sleep() check to catch such cases.
>
> Reported-by: Joel Fernandes <joelaf@xxxxxxxxxx>
> Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Cc: "Paul E. McKenney" <paulmck@xxxxxxxxxxxxxxxxxx>
> Cc: Josh Triplett <josh@xxxxxxxxxxxxxxxx>
> Cc: Steven Rostedt <rostedt@xxxxxxxxxxx>
> Cc: Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxxxx>
> Cc: Lai Jiangshan <jiangshanlai@xxxxxxxxx>
> ---
> kernel/rcu/tree_plugin.h | 1 +
> 1 file changed, 1 insertion(+)
>
> --- a/kernel/rcu/tree_plugin.h
> +++ b/kernel/rcu/tree_plugin.h
> @@ -753,6 +753,7 @@ void synchronize_rcu(void)
>  			 "Illegal synchronize_rcu() in RCU read-side critical section");
>  	if (rcu_scheduler_active == RCU_SCHEDULER_INACTIVE)
>  		return;
> +	might_sleep();
>  	if (rcu_gp_is_expedited())
>  		synchronize_rcu_expedited();
>  	else

I could add this, but synchronize_rcu_expedited() will do
either a mutex_lock() or a wait_event(), both of which already
have a might_sleep(), and wait_rcu_gp() unconditionally calls
wait_for_completion(), which already has a might_sleep().

Unless there is only one CPU in the system or we are at early boot. Is
this possibility common enough to warrant a might_sleep() further up?
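
To make the trade-off concrete, here is a rough userspace sketch. It is
not kernel code: every name in it is hypothetical, and it assumes that
the single-CPU case returns before reaching any blocking primitive. It
only counts how often a might_sleep()-style check would fire, with and
without an extra check at function entry.

```c
#include <stdbool.h>
#include <stdio.h>

/* Userspace model only: every name here is hypothetical, not kernel code. */
struct model {
	bool atomic;       /* caller is in a context that must not sleep */
	bool single_cpu;   /* assumed fast path: nothing to wait for */
	bool entry_check;  /* the proposed extra check at function entry */
	int warnings;      /* number of times the debug check fired */
};

/* Stand-in for might_sleep(): complain if called from atomic context. */
static void model_might_sleep(struct model *m, const char *where)
{
	if (m->atomic) {
		m->warnings++;
		printf("warning: possible sleep in atomic context at %s\n", where);
	}
}

/* Stand-in for a blocking primitive that carries its own check. */
static void model_wait_for_completion(struct model *m)
{
	model_might_sleep(m, "model_wait_for_completion");
}

/* Stand-in for the grace-period wait, with an optional entry check. */
static void model_synchronize(struct model *m)
{
	if (m->entry_check)
		model_might_sleep(m, "model_synchronize");
	if (m->single_cpu)
		return;	/* assumed fast path: returns before any blocking call */
	model_wait_for_completion(m);
}

/* Run one scenario and report how many warnings it produced. */
static int run_case(bool atomic, bool single_cpu, bool entry_check)
{
	struct model m = { atomic, single_cpu, entry_check, 0 };

	model_synchronize(&m);
	return m.warnings;
}
```

In this model, run_case(true, false, false) reports one warning, from
the blocking primitive alone, so the entry check would be redundant on
that path. run_case(true, true, false) reports none, which is the gap
an entry check would close.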

Thanx, Paul