Re: [PATCH 2/3] rcu: Equip sleepable RCU with lockdep dependency graph checks

From: Boqun Feng
Date: Sat Jan 14 2023 - 02:32:14 EST


On Sat, Jan 14, 2023 at 03:18:32PM +0800, Hillf Danton wrote:
> On Fri, 13 Jan 2023 16:17:59 -0800 Boqun Feng <boqun.feng@xxxxxxxxx>
> > On Sat, Jan 14, 2023 at 07:58:09AM +0800, Hillf Danton wrote:
> > > On 13 Jan 2023 09:58:10 -0800 Boqun Feng <boqun.feng@xxxxxxxxx>
> > > > On Fri, Jan 13, 2023 at 09:03:30PM +0800, Hillf Danton wrote:
> > > > > On 12 Jan 2023 22:59:54 -0800 Boqun Feng <boqun.feng@xxxxxxxxx>
> > > > > > --- a/kernel/rcu/srcutree.c
> > > > > > +++ b/kernel/rcu/srcutree.c
> > > > > > @@ -1267,6 +1267,8 @@ static void __synchronize_srcu(struct srcu_struct *ssp, bool do_norm)
> > > > > > {
> > > > > > struct rcu_synchronize rcu;
> > > > > >
> > > > > > + srcu_lock_sync(&ssp->dep_map);
> > > > > > +
> > > > > > RCU_LOCKDEP_WARN(lockdep_is_held(ssp) ||
> > > > > > lock_is_held(&rcu_bh_lock_map) ||
> > > > > > lock_is_held(&rcu_lock_map) ||
> > > > > > --
> > > > > > 2.38.1
> > > > >
> > > > > The following deadlock is able to escape srcu_lock_sync() because the
> > > > > __lock_release folded in sync leaves one lock on the sync side.
> > > > >
> > > > > cpu9                              cpu0
> > > > > ---                               ---
> > > > > lock A                            srcu_lock_acquire(&ssp->dep_map);
> > > > > srcu_lock_sync(&ssp->dep_map);
> > > > >                                   lock A
> > > >
> > > > But isn't it just the srcu_mutex_ABBA test case in patch #3, and my run
> > > > of lockdep selftest shows we can catch it. Anything subtle I'm missing?
> > >
> > > I am leaning to not call it ABBA deadlock, because B is unlocked.
> > >
> > > task X task Y
> > > --- ---
> > > lock A
> > > lock B
> > > lock B
> > > unlock B
> > > wait_for_completion E
> > >
> > > lock A
> > > complete E
> > >
> > > And no deadlock should be detected/caught after B goes home.
> >
> > Your example makes me more confused.. given the case:
> >
> > task X                      task Y
> > ---                         ---
> > mutex_lock(A);
> >                             srcu_read_lock(B);
> > synchronize_srcu(B);
> >                             mutex_lock(A);
> >
> > isn't it a deadlock?
>
> Yes and nope, see below.
>
> > In your example, A, B or E which one is srcu?
>
> A and B are mutex, and E is completion in my example to show the failure
> of catching deadlock in case of non-fake lock. Now see srcu after your change.
>
> task X                              task Y
> ---                                 ---
> mutex_lock(A);
>                                     srcu_read_lock(B);
>                                     srcu_lock_acquire(&B->dep_map);
>                                     a) lock_map_acquire_read(&B->dep_map);
> synchronize_srcu(B);
> __synchronize_srcu(B);
> srcu_lock_sync(&B->dep_map);
> lock_map_sync(&B->dep_map);
> lock_sync(&B->dep_map);
> __lock_acquire(&B->dep_map);

At this point, lockdep adds the dependency A -> B to the dependency graph.

>                                     b) lock_map_acquire_read(&B->dep_map);
> __lock_release(&B->dep_map);
>                                     c) lock_map_acquire_read(&B->dep_map);
>                                     mutex_lock(A);

and here, lockdep will try to add the dependency B -> A to the dependency
graph, and find that A -> B -> A forms a cycle (of strong dependencies),
therefore lockdep knows it's a deadlock.

>
> No deadlock could be detected if taskY takes mutexA after taskX releases B,

The timing of taskX releasing B doesn't matter, since lockdep detects
deadlocks from the dependency graph rather than after the fact.
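
To illustrate the point about the graph, here is a rough userspace sketch
(plain C, not kernel code and not how lockdep is actually implemented). It
assumes a tiny adjacency matrix of lock classes; all toy_* names are made up
for this example, and it ignores the read/write and strong/weak dependency
distinctions that real lockdep tracks. Edges are recorded at acquire time
and are never removed on release, so a cycle is found even when the two
tasks never overlap in time.

```c
#include <stdbool.h>
#include <string.h>

#define TOY_MAX_CLASSES 8
#define TOY_MAX_HELD    8

/* edge[a][b] means "b was acquired while a was held" (a -> b). */
static bool edge[TOY_MAX_CLASSES][TOY_MAX_CLASSES];

/* Per-task list of currently held lock classes (hypothetical type). */
struct toy_task {
	int held[TOY_MAX_HELD];
	int nr_held;
};

static void toy_reset(void)
{
	memset(edge, 0, sizeof(edge));
}

/* Depth-first search: is there a recorded path from @from to @to? */
static bool toy_reachable(int from, int to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < TOY_MAX_CLASSES; next++)
		if (edge[from][next] && !seen[next] &&
		    toy_reachable(next, to, seen))
			return true;
	return false;
}

/*
 * Record prev -> class for every held lock. Returns true if one of the
 * new edges would close a cycle, i.e. class already reaches prev.
 */
static bool toy_acquire(struct toy_task *t, int class)
{
	bool deadlock = false;

	for (int i = 0; i < t->nr_held; i++) {
		bool seen[TOY_MAX_CLASSES] = { false };
		int prev = t->held[i];

		if (toy_reachable(class, prev, seen))
			deadlock = true;
		edge[prev][class] = true;
	}
	t->held[t->nr_held++] = class;
	return deadlock;
}

/* Release only updates the held list; graph edges stay in place. */
static void toy_release(struct toy_task *t, int class)
{
	for (int i = 0; i < t->nr_held; i++) {
		if (t->held[i] == class) {
			t->held[i] = t->held[--t->nr_held];
			return;
		}
	}
}

/* The mutex/SRCU scenario, with task X completely done before Y starts. */
static bool toy_srcu_mutex_scenario(void)
{
	enum { A, B };
	struct toy_task x = { .nr_held = 0 }, y = { .nr_held = 0 };

	toy_reset();
	toy_acquire(&x, A);		/* mutex_lock(A) */
	toy_acquire(&x, B);		/* sync modelled as acquire ... */
	toy_release(&x, B);		/* ... followed by release: A -> B */
	toy_release(&x, A);

	toy_acquire(&y, B);		/* srcu_read_lock(B) */
	return toy_acquire(&y, A);	/* B -> A closes A -> B -> A */
}
```

In this sketch toy_srcu_mutex_scenario() returns true even though task X
has released both A and B before task Y takes anything, because the A -> B
edge is still in the graph when Y tries to add B -> A.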

> and how taskY acquires B does not matter as per the a), b) and c) modes in
> the above chart, again because releasing lock can break deadlock in general.

I have test cases showing that the above deadlock can be detected, so if
you think there is a deadlock that can slip past my change, feel free to
add a test case in lib/locking-selftest.c ;-)

Regards,
Boqun