Re: possible deadlock in send_sigio

From: Dmitry Vyukov
Date: Thu Jun 11 2020 - 03:44:16 EST


On Thu, Jun 11, 2020 at 4:33 AM Waiman Long <longman@xxxxxxxxxx> wrote:
>
> On 4/4/20 1:55 AM, syzbot wrote:
> > Hello,
> >
> > syzbot found the following crash on:
> >
> > HEAD commit: bef7b2a7 Merge tag 'devicetree-for-5.7' of git://git.kerne..
> > git tree: upstream
> > console output: https://syzkaller.appspot.com/x/log.txt?x=15f39c5de00000
> > kernel config: https://syzkaller.appspot.com/x/.config?x=91b674b8f0368e69
> > dashboard link: https://syzkaller.appspot.com/bug?extid=a9fb1457d720a55d6dc5
> > compiler: gcc (GCC) 9.0.0 20181231 (experimental)
> > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1454c3b7e00000
> > C reproducer: https://syzkaller.appspot.com/x/repro.c?x=12a22ac7e00000
> >
> > The bug was bisected to:
> >
> > commit 7bc3e6e55acf065500a24621f3b313e7e5998acf
> > Author: Eric W. Biederman <ebiederm@xxxxxxxxxxxx>
> > Date: Thu Feb 20 00:22:26 2020 +0000
> >
> > proc: Use a list of inodes to flush from proc
> >
> > bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=165c4acde00000
> > final crash: https://syzkaller.appspot.com/x/report.txt?x=155c4acde00000
> > console output: https://syzkaller.appspot.com/x/log.txt?x=115c4acde00000
> >
> > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > Reported-by: syzbot+a9fb1457d720a55d6dc5@xxxxxxxxxxxxxxxxxxxxxxxxx
> > Fixes: 7bc3e6e55acf ("proc: Use a list of inodes to flush from proc")
> >
> > ========================================================
> > WARNING: possible irq lock inversion dependency detected
> > 5.6.0-syzkaller #0 Not tainted
> > --------------------------------------------------------
> > ksoftirqd/0/9 just changed the state of lock:
> > ffffffff898090d8 (tasklist_lock){.+.?}-{2:2}, at: send_sigio+0xa9/0x340 fs/fcntl.c:800
> > but this lock took another, SOFTIRQ-unsafe lock in the past:
> > (&pid->wait_pidfd){+.+.}-{2:2}
> >
> >
> > and interrupts could create inverse lock ordering between them.
> >
> >
> > other info that might help us debug this:
> > Possible interrupt unsafe locking scenario:
> >
> >        CPU0                    CPU1
> >        ----                    ----
> >   lock(&pid->wait_pidfd);
> >                                local_irq_disable();
> >                                lock(tasklist_lock);
> >                                lock(&pid->wait_pidfd);
> >   <Interrupt>
> >     lock(tasklist_lock);
> >
> > *** DEADLOCK ***
>
> That is a false positive. The qrwlock has the special property that it
> becomes unfair (for read locks) in interrupt context. So unless a write
> lock is taken in interrupt context, it won't deadlock. The current
> lockdep code does not capture the full semantics of qrwlock, leading to
> this false positive.

Hi Longman

Thanks for looking into this.
Now the question is: how should we change the lockdep annotations so that
this false positive is no longer reported?
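
For reference, here is how I understand the qrwlock property described in
the quoted explanation, written as a small userspace toy model. This is
only a sketch under my own assumptions: the struct, its fields and
toy_reader_may_enter() are hypothetical names, not the kernel's qrwlock
implementation, and it models only the admission decision for readers,
not the queueing or the atomics.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of a queued rwlock's state; all names are hypothetical. */
struct toy_qrwlock {
	unsigned int readers;	/* readers currently holding the lock */
	bool writer_locked;	/* a writer holds the lock */
	bool writer_waiting;	/* a writer is waiting for readers to drain */
};

/*
 * Can a new reader take the lock right now?
 *
 * Process context: fair, so a new reader waits behind a waiting writer.
 * Interrupt context: unfair, so a new reader waits only for a writer
 * that actually holds the lock and ignores writers that merely wait.
 */
bool toy_reader_may_enter(const struct toy_qrwlock *lock, bool in_irq)
{
	assert(lock != NULL);

	if (lock->writer_locked)
		return false;	/* nobody can read past an active writer */
	if (in_irq)
		return true;	/* interrupt readers skip waiting writers */
	return !lock->writer_waiting;
}
```

In this model, a read lock taken in interrupt context cannot get stuck
behind a writer that is itself waiting for an existing reader on the same
CPU, which is the case the report flags. A write lock taken in interrupt
context has no such escape.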