Re: Weird rcu lockdep warning

From: Paul E. McKenney
Date: Wed Apr 14 2010 - 12:01:21 EST


On Wed, Apr 14, 2010 at 05:51:11PM +0200, Frederic Weisbecker wrote:
> On Wed, Apr 14, 2010 at 08:43:02AM -0700, Paul E. McKenney wrote:
> > On Wed, Apr 14, 2010 at 11:34:33AM +0800, Lai Jiangshan wrote:
> > > Paul E. McKenney wrote:
> > > > On Tue, Apr 13, 2010 at 05:13:06PM -0700, David Miller wrote:
> > > >> From: Frederic Weisbecker <fweisbec@xxxxxxxxx>
> > > >> Date: Wed, 14 Apr 2010 02:02:27 +0200
> > > >>
> > > >>> I just have a guess though....
> > > >>> This seems to always happen from NMI path, and lockdep is disabled on NMI.
> > > >>> I suspect the lock_acquire() performed by rcu_read_lock() is just ignored
> > > >>> and then the rcu_read_lock_held() check has the wrong result...
> > > >> Yeah, I bet that's it too.
> > > >>
> > > >> lock_is_held() can't return anything meaningful while lockdep is
> > > >> disabled, which it is during NMIs.
> > > >
> > > > Ah! So I just need to add a "current->lockdep_recursion"
> > > > check to debug_lockdep_rcu_enabled(). And move the function to
> > > > kernel/rcutree_plugin.h to avoid #include hell.
> > > >
> > > > See below for (untested) patch.
> > > >
> > > > Thanx, Paul
> > > >
> > > > ------------------------------------------------------------------------
> > > >
> > > > include/linux/rcupdate.h | 5 +----
> > > > kernel/rcutree_plugin.h | 11 +++++++++++
> > > > 2 files changed, 12 insertions(+), 4 deletions(-)
> > > >
> > > > commit 304d8da6cd791a81ce3164f867e1b3ef4f9af1d1
> > > > Author: Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx>
> > > > Date: Tue Apr 13 18:45:51 2010 -0700
> > > >
> > > > rcu: Make RCU lockdep check the lockdep_recursion variable
> > > >
> > > > The lockdep facility temporarily disables lockdep checking by incrementing
> > > > the current->lockdep_recursion variable. Such disabling happens in NMIs
> > > > and in other situations where lockdep might expect to recurse on itself.
> > > > This patch therefore checks current->lockdep_recursion, disabling RCU
> > > > lockdep splats when this variable is non-zero.
> > > >
> > > > Reported-by: Frederic Weisbecker <fweisbec@xxxxxxxxx>
> > > > Reported-by: David Miller <davem@xxxxxxxxxxxxx>
> > > > Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx>
> > > >
> > > > diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
> > > > index 9f1ddfe..07db2fe 100644
> > > > --- a/include/linux/rcupdate.h
> > > > +++ b/include/linux/rcupdate.h
> > > > @@ -101,10 +101,7 @@ extern struct lockdep_map rcu_sched_lock_map;
> > > > # define rcu_read_release_sched() \
> > > > lock_release(&rcu_sched_lock_map, 1, _THIS_IP_)
> > > >
> > > > -static inline int debug_lockdep_rcu_enabled(void)
> > > > -{
> > > > - return likely(rcu_scheduler_active && debug_locks);
> > > > -}
> > > > +extern int debug_lockdep_rcu_enabled(void);
> > > >
> > > > /**
> > > > * rcu_read_lock_held - might we be in RCU read-side critical section?
> > > > diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
> > > > index 79b53bd..2169abe 100644
> > > > --- a/kernel/rcutree_plugin.h
> > > > +++ b/kernel/rcutree_plugin.h
> > > > @@ -1067,3 +1067,14 @@ static void rcu_needs_cpu_flush(void)
> > > > }
> > > >
> > > > #endif /* #else #if !defined(CONFIG_RCU_FAST_NO_HZ) */
> > > > +
> > > > +#ifdef CONFIG_DEBUG_LOCK_ALLOC
> > > > +
> > > > +int debug_lockdep_rcu_enabled(void)
> > > > +{
> > > > + return likely(rcu_scheduler_active &&
> > > > + debug_locks &&
> > > > + current->lockdep_recursion == 0);
> > > > +}
> > > > +
> > >
> > > Looks good to me too, but I think
> > > 'likely' is needless since the function is not inline.
> >
> > Good point. And to add injury to insult, I forgot EXPORT_SYMBOL_GPL().
> >
> > Updated patch in the works.
>
>
> Note I just tested the patch the previous one and it looks fine now.
> You can then safely consider the "general idea" fixes the problem :)

Thank you, Frederic!!!

Thanx, Paul
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/