Re: possible deadlock in xfrm_policy_delete

From: Dmitry Vyukov
Date: Thu Sep 24 2020 - 00:44:26 EST


On Thu, Sep 24, 2020 at 6:36 AM Herbert Xu <herbert@xxxxxxxxxxxxxxxxxxx> wrote:
>
> On Sun, Sep 20, 2020 at 01:22:14PM -0700, syzbot wrote:
> > Hello,
> >
> > syzbot found the following issue on:
> >
> > HEAD commit: 5fa35f24 Add linux-next specific files for 20200916
> > git tree: linux-next
> > console output: https://syzkaller.appspot.com/x/log.txt?x=1122e2d9900000
> > kernel config: https://syzkaller.appspot.com/x/.config?x=6bdb7e39caf48f53
> > dashboard link: https://syzkaller.appspot.com/bug?extid=c32502fd255cb3a44048
> > compiler: gcc (GCC) 10.1.0-syz 20200507
> >
> > Unfortunately, I don't have any reproducer for this issue yet.
> >
> > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > Reported-by: syzbot+c32502fd255cb3a44048@xxxxxxxxxxxxxxxxxxxxxxxxx
> >
> > =====================================================
> > WARNING: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected
> > 5.9.0-rc5-next-20200916-syzkaller #0 Not tainted
> > -----------------------------------------------------
> > syz-executor.1/13775 [HC0[0]:SC0[4]:HE1:SE0] is trying to acquire:
> > ffff88805ee15a58 (&net->xfrm.xfrm_policy_lock){+...}-{2:2}, at: spin_lock_bh include/linux/spinlock.h:359 [inline]
> > ffff88805ee15a58 (&net->xfrm.xfrm_policy_lock){+...}-{2:2}, at: xfrm_policy_delete+0x3a/0x90 net/xfrm/xfrm_policy.c:2236
> >
> > and this task is already holding:
> > ffff8880a811b1e0 (k-slock-AF_INET6){+.-.}-{2:2}, at: spin_lock include/linux/spinlock.h:354 [inline]
> > ffff8880a811b1e0 (k-slock-AF_INET6){+.-.}-{2:2}, at: tcp_close+0x6e3/0x1200 net/ipv4/tcp.c:2503
> > which would create a new lock dependency:
> > (k-slock-AF_INET6){+.-.}-{2:2} -> (&net->xfrm.xfrm_policy_lock){+...}-{2:2}
> >
> > but this new dependency connects a SOFTIRQ-irq-safe lock:
> > (k-slock-AF_INET6){+.-.}-{2:2}
> >
> > ... which became SOFTIRQ-irq-safe at:
> >   lock_acquire+0x1f2/0xaa0 kernel/locking/lockdep.c:5398
> >   __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
> >   _raw_spin_lock+0x2a/0x40 kernel/locking/spinlock.c:151
> >   spin_lock include/linux/spinlock.h:354 [inline]
> >   sctp_rcv+0xd96/0x2d50 net/sctp/input.c:231
>
> What's going on with all these bogus lockdep reports?
>
> These are two completely different locks, one is for TCP and the
> other is for SCTP. Why is lockdep suddenly becoming confused about
> this?
>
> FWIW this flood of bogus reports started on 16/Sep.


FWIW one of the dups of this issue was bisected to:

commit 1909760f5fc3f123e47b4e24e0ccdc0fc8f3f106
Author: Ahmed S. Darwish <a.darwish@xxxxxxxxxxxxx>
Date: Fri Sep 4 15:32:31 2020 +0000

    seqlock: PREEMPT_RT: Do not starve seqlock_t writers

Can it be related?


A number of other new lockdep reports were bisected to the following
commit, which was the true, intentional root cause of those, but it
looks a bit too old to have caused the xfrm reports:

commit f08e3888574d490b31481eef6d84c61bedba7a47
Author: Boqun Feng <boqun.feng@xxxxxxxxx>
Date: Fri Aug 7 07:42:30 2020 +0000

    lockdep: Fix recursive read lock related safe->unsafe detection