Re: WARNING in blk_mq_init_sched

From: Eric Biggers
Date: Fri Sep 27 2019 - 22:40:42 EST


On Wed, Sep 25, 2019 at 10:13:30PM +0000, Damien Le Moal wrote:
> On 2019/09/25 10:56, Damien Le Moal wrote:
> > On 2019/09/25 9:56, syzbot wrote:
> >> Hello,
> >>
> >> syzbot found the following crash on:
> >>
> >> HEAD commit: f7c3bf8f Merge tag 'gfs2-for-5.4' of git://git.kernel.org/..
> >> git tree: upstream
> >> console output: https://syzkaller.appspot.com/x/log.txt?x=15f5baf9600000
> >> kernel config: https://syzkaller.appspot.com/x/.config?x=50d4af03d68a470c
> >> dashboard link: https://syzkaller.appspot.com/bug?extid=b2c197f98f86543b69c8
> >> compiler: clang version 9.0.0 (/home/glider/llvm/clang
> >> 80fee25776c2fb61e74c1ecb1a523375c2500b69)
> >>
> >> Unfortunately, I don't have any reproducer for this crash yet.
> >>
> >> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> >> Reported-by: syzbot+b2c197f98f86543b69c8@xxxxxxxxxxxxxxxxxxxxxxxxx
> >
> > Oh... When the queue is initialized, the elevator initialization done by
> > elevator_init_mq() is executed without the queue sysfs lock held. In that step,
> > if the elevator initialization fails, blk_mq_sched_free_requests() is called and
> > will trip on the lockdep_assert_held(&q->sysfs_lock) check on entry. I guess
> > that is what is causing the crash? But I thought lockdep_assert_held() only
> > spits out warnings...
> >
> > Ming,
> >
> > Your patch c48dac137a62 ("block: don't hold q->sysfs_lock in elevator_init_mq")
> > removed the sysfs_lock use in elevator_init_mq(). With that, should we move the
> > lockdep_assert_held(&q->sysfs_lock) call out of blk_mq_sched_free_requests() and
> > make it directly in the callers before calling that function (that's ugly), or do
> > you see a nice trick for handling the special case that is the first initialization?
>
> Please ignore. It looks like the gfs2 tree tested does not have commit
> 954b4a5ce4a8 ("block: Change elevator_init_mq() to always succeed") which
> removes the possibility of blk_mq_sched_free_requests() being called
> during the first elevator initialization without the sysfs lock being held.
>
> So if the crash is indeed triggered by the lockdep_assert_held() call, then this
> problem will be fixed after a rebase on 5.4-rc1.
>

No, as the report says, this occurred on commit f7c3bf8f. Commit 954b4a5ce4a8
("block: Change elevator_init_mq() to always succeed") was already merged then.

- Eric