Re: deadlock in synchronize_srcu() in debugfs?

From: Johannes Berg
Date: Fri Mar 24 2017 - 14:52:23 EST



> Yes.  CPU2 has a pre-existing reader that CPU1's synchronize_srcu()
> must wait for.  But CPU2's reader cannot end until CPU1 releases
> its lock, which it cannot do until after CPU2's reader ends.  Thus,
> as you say, deadlock.
>
> The rule is that if you are within any kind of RCU read-side critical
> section, you cannot directly or indirectly wait for a grace period
> from that same RCU flavor.

Right. This is indirect then, in a way.
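
To make the circular wait concrete, here is a minimal userspace sketch
of the wait-for chain described above. It is only a toy model: the
names (NR_MODEL_CPUS, NOT_WAITING, wait_chain_deadlocks) are made up
for illustration, it assumes each CPU blocks on at most one other CPU,
and it has nothing to do with how the kernel or lockdep tracks this.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_MODEL_CPUS 4     /* hypothetical size of the toy model */
#define NOT_WAITING   (-1)

/*
 * waits_for[c] is the CPU that CPU c is currently blocked on, or
 * NOT_WAITING.  Returns true if following the chain from `start`
 * leads back to `start`, i.e. `start` is part of a circular wait.
 */
bool wait_chain_deadlocks(const int waits_for[NR_MODEL_CPUS], int start)
{
	int cpu = start;

	assert(start >= 0 && start < NR_MODEL_CPUS);

	/* A chain through N CPUs needs at most N hops to return to start. */
	for (int hops = 0; hops < NR_MODEL_CPUS; hops++) {
		cpu = waits_for[cpu];
		if (cpu == NOT_WAITING)
			return false;   /* chain ends: someone can make progress */
		assert(cpu >= 0 && cpu < NR_MODEL_CPUS);
		if (cpu == start)
			return true;    /* came back around: circular wait */
	}
	return false;               /* a loop exists, but start is not on it */
}
```

In the case above, CPU1's synchronize_srcu() waits for CPU2's reader
(waits_for[1] = 2) and CPU2's mutex_lock() waits for CPU1
(waits_for[2] = 1), so the chain from either CPU returns to itself.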

> There are some challenges, though.  This is OK:
>
>     CPU1                              CPU2
>     i = srcu_read_lock(&mysrcu);      mutex_lock(&my_lock);
>     mutex_lock(&my_lock);             i = srcu_read_lock(&mysrcu);
>     srcu_read_unlock(&mysrcu, i);     mutex_unlock(&my_lock);
>     mutex_unlock(&my_lock);           srcu_read_unlock(&mysrcu, i);
>
>     CPU3
>     synchronize_srcu(&mysrcu);
>
> This could be a deadlock for reader-writer locking, but not for SRCU.

Hmm, yes, that's a good point. If srcu_read_lock() was read_lock, and
synchronize_srcu() was write_lock(), then the write_lock() could stop
CPU2's read_lock() from acquiring the lock, and thus cause a deadlock.

However, I'm not convinced that lockdep handles reader/writer locks
correctly to start with, right now, since it *didn't* actually trigger
any warnings when I annotated SRCU as a reader/writer lock.

> This is also OK:
>     CPU1                              CPU2
>     i = srcu_read_lock(&mysrcu);      mutex_lock(&my_lock);
>     mutex_lock(&my_lock);             synchronize_srcu(&yoursrcu);
>     srcu_read_unlock(&mysrcu, i);     mutex_unlock(&my_lock);
>     mutex_unlock(&my_lock);
>
> Here CPU1's read-side critical sections are for mysrcu, which is
> independent of CPU2's grace period for yoursrcu.

Right, but that's already covered by having a separate lockdep_map for
each SRCU subsystem (mysrcu, yoursrcu).

> So you could flag any lockdep cycle that contained a reader and a
> synchronous grace period for the same flavor of RCU, where for SRCU
> the identity of the srcu_struct structure is part of the flavor.

Right. Basically, I think SRCU should be like a reader/writer lock
(perhaps fixed to work right). The only difference seems to be the
scenario you outlined above (first of the two)?

Actually, given the scenario above, for lockdep purposes the
reader/writer lock is actually the same as a recursive lock, I guess?

You outlined a scenario in which the reader gets blocked due to a
writer (CPU3 doing a write_lock()) so the reader can still participate
in a deadlock cycle since it can - without any other locks being held
by CPU3 that participate - cause a deadlock between CPU1 and CPU2 here.
For lockdep then, even seeing the CPU1 and CPU2 scenarios should be
sufficient to flag a deadlock (*).

This part then isn't true for SRCU, because there forward progress will
still be made. So for SRCU, the "reader" side really needs to be
connected with a "writer" side to form a deadlock cycle, unlike for a
reader/writer lock.
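
A rough userspace sketch of that distinction, to show what I mean. This
is not lockdep code: the lock classes, the acq_kind enum and all the
function names are invented for the example, and it assumes (as
discussed above) that a reader/writer read_lock() may have to wait
behind a writer while srcu_read_lock() never waits. Dependencies are
recorded as "acquired B while holding A", and an acquisition that can
never wait is simply not recorded, so it cannot close a cycle.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical lock classes for this sketch (one per srcu_struct). */
enum lock_class { MYSRCU, YOURSRCU, MY_LOCK, NR_CLASSES };

/* How the second lock of a dependency was acquired. */
enum acq_kind {
	ACQ_MUTEX,      /* mutex_lock() */
	ACQ_RW_READ,    /* read_lock(): assumed able to wait behind a writer */
	ACQ_RW_WRITE,   /* write_lock() */
	ACQ_SRCU_READ,  /* srcu_read_lock(): never waits */
	ACQ_SRCU_SYNC,  /* synchronize_srcu(): waits for readers */
};

struct dep_graph {
	bool edge[NR_CLASSES][NR_CLASSES];  /* edge[held][acquired] */
};

void graph_init(struct dep_graph *g)
{
	memset(g, 0, sizeof(*g));
}

/* Record "acquired class `acquired` via `kind` while holding `held`". */
void record_dep(struct dep_graph *g, int held, int acquired,
		enum acq_kind kind)
{
	assert(held >= 0 && held < NR_CLASSES);
	assert(acquired >= 0 && acquired < NR_CLASSES);

	/* An acquisition that never waits cannot be the blocked end of a cycle. */
	if (kind == ACQ_SRCU_READ)
		return;
	g->edge[held][acquired] = true;
}

/* Depth-first search; state: 0 = unvisited, 1 = on the stack, 2 = done. */
static bool dfs_finds_cycle(const struct dep_graph *g, int node, int *state)
{
	state[node] = 1;
	for (int next = 0; next < NR_CLASSES; next++) {
		if (!g->edge[node][next])
			continue;
		if (state[next] == 1)
			return true;    /* back edge: cycle of blocking deps */
		if (state[next] == 0 && dfs_finds_cycle(g, next, state))
			return true;
	}
	state[node] = 2;
	return false;
}

bool has_deadlock_cycle(const struct dep_graph *g)
{
	int state[NR_CLASSES] = { 0 };

	for (int i = 0; i < NR_CLASSES; i++)
		if (state[i] == 0 && dfs_finds_cycle(g, i, state))
			return true;
	return false;
}
```

Feeding in the first scenario (CPU1: MYSRCU then MY_LOCK; CPU2: MY_LOCK
then MYSRCU as ACQ_SRCU_READ) yields no cycle, while the same ordering
with ACQ_RW_READ does. Replacing CPU2's read with ACQ_SRCU_SYNC on
MYSRCU also yields a cycle, and a sync on YOURSRCU does not, since the
two srcu_structs are separate classes.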

johannes

(*) technically only after checking that write_lock() is ever used, but
... seems reasonable enough to assume that it will be used, since why
would anyone ever use a reader/writer lock if there are only readers?
That's a no-op.