Re: [PATCH] lockdep: Do no validate wait context for novalidate class

From: Sebastian Andrzej Siewior
Date: Thu Aug 20 2020 - 10:06:26 EST


On 2020-08-20 14:38:59 [+0200], peterz@xxxxxxxxxxxxx wrote:
> On Thu, Aug 20, 2020 at 01:43:48PM +0200, Sebastian Andrzej Siewior wrote:
> > On 2020-08-20 13:40:36 [+0200], peterz@xxxxxxxxxxxxx wrote:
> > > Anyway, all 3 users should have the same wait context, so where is the
> > > actual problem?
> >
> > I have one in RT which is a per-CPU spinlock within local_bh_disable()
> > to act as a per-CPU BKL like mainline.
>
> Then can we get to see that code and an explanation for what the problem
> is and why it is still correct?

An actual backtrace looks like this:
| WARNING: possible circular locking dependency detected

| Possible unsafe locking scenario:
|
|        CPU0                    CPU1
|        ----                    ----
|   lock(k-sk_lock-AF_NETLINK);
|                                lock((l).lock#2);
|                                lock(k-sk_lock-AF_NETLINK);
|   lock((l).lock#2);
|
| *** DEADLOCK ***

The "k-sk_lock-AF_NETLINK" is global but "(l).lock#2" is per-CPU. The
circular dependency cannot occur because CPU0 and CPU1 each acquire
their own instance of the lock and can hold it simultaneously.
The softirq code is at
https://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-rt-devel.git/tree/patches/softirq-Add-preemptible-softirq.patch?h=linux-5.6.y-rt-patches&id=4ce1fda10dae882d494c6430cc438ff645a35603#n146
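
To illustrate the per-CPU argument, here is a minimal userspace sketch. It
is not lockdep code. It only models the two-CPU scenario from the report as
a dependency graph, once with both per-CPU locks folded into a single lock
class (which is how lockdep tracks locks) and once with one node per
instance. All names (dep_graph, add_dep(), class_view_has_cycle(),
instance_view_has_cycle()) are made up for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_NODES 4

/* dep[a][b] is set when b was acquired while a was held. */
struct dep_graph {
	bool dep[MAX_NODES][MAX_NODES];
};

static void add_dep(struct dep_graph *g, int held, int acquired)
{
	assert(held >= 0 && held < MAX_NODES);
	assert(acquired >= 0 && acquired < MAX_NODES);
	g->dep[held][acquired] = true;
}

/* Depth-first search: is 'to' reachable from 'from'? */
static bool reachable(const struct dep_graph *g, int from, int to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < MAX_NODES; next++) {
		if (g->dep[from][next] && !seen[next] &&
		    reachable(g, next, to, seen))
			return true;
	}
	return false;
}

/* A cycle exists if some recorded edge a -> b also has a path b -> a. */
static bool has_cycle(const struct dep_graph *g)
{
	for (int a = 0; a < MAX_NODES; a++) {
		for (int b = 0; b < MAX_NODES; b++) {
			bool seen[MAX_NODES] = { false };

			if (g->dep[a][b] && reachable(g, b, a, seen))
				return true;
		}
	}
	return false;
}

/* Class view: the per-CPU locks of both CPUs collapse into one node. */
bool class_view_has_cycle(void)
{
	enum { SK_LOCK, L_LOCK };
	struct dep_graph g;

	memset(&g, 0, sizeof(g));
	add_dep(&g, SK_LOCK, L_LOCK);	/* CPU0: sk_lock, then (l).lock */
	add_dep(&g, L_LOCK, SK_LOCK);	/* CPU1: (l).lock, then sk_lock */
	return has_cycle(&g);
}

/* Instance view: each CPU's (l).lock is a separate node. */
bool instance_view_has_cycle(void)
{
	enum { SK_LOCK, L_LOCK_CPU0, L_LOCK_CPU1 };
	struct dep_graph g;

	memset(&g, 0, sizeof(g));
	add_dep(&g, SK_LOCK, L_LOCK_CPU0);	/* CPU0 takes its own (l).lock */
	add_dep(&g, L_LOCK_CPU1, SK_LOCK);	/* CPU1 holds its own (l).lock */
	return has_cycle(&g);
}
```

In this model class_view_has_cycle() returns true while
instance_view_has_cycle() returns false, which matches the report: the
cycle only exists once the two per-CPU instances are treated as the same
lock. The sketch covers only the cross-CPU case shown above.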

I'm not sure why sk_lock on CPU0 comes before (l).lock. That doesn't
change even if the lock is acquired after trace_softirqs_off(). If
sk_lock were acquired with BH enabled then lockdep would complain.

The lovely in_atomic() check is due to irq_enter(), preempt_disable() +
local_bh_disable() and others.

> Because as is, this patch isn't needed.
I can hold on to this and maybe it is not needed if the final version of
the softirq code ends up being different :)

Thanks.

Sebastian