Re: INFO: task hung in wdm_flush

From: Dmitry Vyukov
Date: Tue Feb 11 2020 - 09:11:24 EST


On Tue, Feb 11, 2020 at 2:55 PM Tetsuo Handa
<penguin-kernel@xxxxxxxxxxxxxxxxxxx> wrote:
>
> On 2020/02/11 0:21, Tetsuo Handa wrote:
> > On 2020/02/11 0:06, Dmitry Vyukov wrote:
> >>> On Mon, Feb 10, 2020 at 4:03 PM Tetsuo Handa
> >>> <penguin-kernel@xxxxxxxxxxxxxxxxxxx> wrote:
> >>>>
> >>>> On 2020/02/10 21:46, Tetsuo Handa wrote:
> >>>>> On 2020/02/10 19:09, Dmitry Vyukov wrote:
> >>>>>> You may also try on the exact commit the bug was reported, because
> >>>>>> usb-fuzzer is tracking branch, things may change there.
> >>>>>
> >>>>> OK. I explicitly tried
> >>>>>
> >>>>> #syz test: https://github.com/google/kasan.git e5cd56e94edde38ca4dafae5a450c5a16b8a5f23
> >>>>>
> >>>>> but syzbot still cannot reproduce this bug using the reproducer...
> >>>>
> >>>> It seems that there is non-trivial difference between kernel config in dashboard
> >>>> and kernel config in "syz test:" mails. Maybe that's the cause...
> >>
> >>
> >> syzkaller runs oldconfig when building any kernels:
> >> https://github.com/google/syzkaller/blob/master/pkg/build/linux.go#L56
> >> Is that difference what oldconfig produces?
> >>
> >
> > Here is the diff (with "#" lines excluded) between dashboard and "syz test:" mails.
> > I feel this difference is bigger than what simple oldconfig would cause.
> >
>
> I explicitly tried a commit as of the first report (instead of the latest report)
>
> #syz test: https://github.com/google/kasan.git e96407b497622d03f088bcf17d2c8c5a1ab066c8
>
> and syzbot reproduced this bug using the reproducer. Therefore, it seems that differences
> in the kernel config used for "syz test:" was inappropriate but "syz test:" failed to detect
> it. Since there might be changes which fixed different bugs (and in order to confirm that
> proposed patch cleanly applies to the current kernel without causing other problems), I guess
> that people tend to test using the latest commit (instead of a commit as of the first report).
>
> I suggest "syz test:" to retest without proposed patch when proposed patch did not reproduce
> the bug. If retesting without proposed patch did not reproduce the bug, we can figure out that
> something is wrong (maybe the bug is difficult to reproduce, maybe the bug was already fixed,
> maybe kernel config was inappropriate, maybe something else).

This is already possible, right? One can request any single test as
they see fit.
Chaining tests into complex workflows won't necessarily make things
simpler. It will be hard to explain what exactly happened and why.
Also, consider a flaky reproducer: it did not crash with the patch,
but crashed without the patch (just because it's flaky).
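
To put rough numbers on the flaky case, here is a minimal sketch. It
assumes a hypothetical per-run crash probability p_crash, independent
runs, and a patch that changes nothing; none of these values come from
syzbot. It computes how often "no crash with patch, crash without
patch" would show up purely by chance.

```python
def misleading_result_probability(p_crash: float, runs: int = 1) -> float:
    """Chance that a patch with no effect still looks like a fix.

    p_crash is a hypothetical per-run crash probability of the reproducer,
    and runs is how many times each kernel (patched and unpatched) is tested.
    Runs are assumed independent, which real test runs may not be.
    """
    if not 0.0 <= p_crash <= 1.0:
        raise ValueError("p_crash must be within [0, 1]")
    if runs < 1:
        raise ValueError("runs must be at least 1")

    # The patched kernel never crashes in any of its runs.
    no_crash_with_patch = (1.0 - p_crash) ** runs
    # The unpatched kernel crashes in at least one of its runs.
    crash_without_patch = 1.0 - (1.0 - p_crash) ** runs
    return no_crash_with_patch * crash_without_patch
```

With p_crash = 0.5 and a single run on each side this gives 0.25, so
under these assumptions a quarter of such paired tests would point at
the patch for no reason. More runs per side shrink that number.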


> Regarding the bug for this report, debug printk() reported that WDM_IN_USE was not cleared
> for some reason. While we need to investigate why WDM_IN_USE was not cleared, I guess that
> wdm_write() should clear WDM_IN_USE upon error
> ( https://syzkaller.appspot.com/x/patch.diff?x=17ec7ee9e00000 ) so that we will surely
> wake up somebody potentially waiting on WDM_IN_USE.
>
> [ 38.587596][ T2807] wdm_flush: file=ffff8881d488bb80 flags=2
> [ 40.214039][ T2807] wdm_flush: file=ffff8881d63fb400 flags=2
> [ 40.304390][ T2842] wdm_flush: file=ffff8881d5e22500 flags=0
> [ 40.371742][ T2869] wdm_flush: file=ffff8881d4964c80 flags=0
> [ 40.429954][ T2844] wdm_flush: file=ffff8881d5937b80 flags=0
> [ 40.461538][ T2858] wdm_flush: file=ffff8881d488b400 flags=0
> [ 40.464909][ T2863] wdm_flush: file=ffff8881d488ea00 flags=0
> [ 41.576761][ T2896] wdm_flush: file=ffff8881d43dea00 flags=2
> [ 41.949941][ T2909] wdm_flush: file=ffff8881d63c3b80 flags=2
> [ 43.760828][ T2899] wdm_flush: file=ffff8881d3d7a000 flags=2
> [ 43.857364][ T2911] wdm_flush: file=ffff8881d63c2000 flags=2
> [ 43.857501][ T2904] wdm_flush: file=ffff8881d3d7a280 flags=2
> [ 43.866560][ T2906] wdm_flush: file=ffff8881d5ce4780 flags=2
> [ 43.876210][ T2897] wdm_flush: file=ffff8881d385db80 flags=2
> [ 72.308895][ T2909] INFO: task syz-executor.0:2909 blocked for more than 30 seconds.
> [ 72.316860][ T2909] wdm_flush: file=ffff8881d63c3b80 flags=2
> [ 74.228916][ T2906] INFO: task syz-executor.1:2906 blocked for more than 30 seconds.
> [ 74.228921][ T2911] INFO: task syz-executor.3:2911 blocked for more than 30 seconds.
> [ 74.228935][ T2911] wdm_flush: file=ffff8881d63c2000 flags=2
> [ 74.236949][ T2906] wdm_flush: file=ffff8881d5ce4780 flags=2
> [ 74.236991][ T2904] INFO: task syz-executor.4:2904 blocked for more than 30 seconds.
> [ 74.245459][ T2897] INFO: task syz-executor.2:2897 blocked for more than 30 seconds.
> [ 74.251305][ T2904] wdm_flush: file=ffff8881d3d7a280 flags=2
> [ 74.257129][ T2897] wdm_flush: file=ffff8881d385db80 flags=2
> [ 74.257951][ T2899] INFO: task syz-executor.5:2899 blocked for more than 30 seconds.
> [ 74.294465][ T2899] wdm_flush: file=ffff8881d3d7a000 flags=2
>
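
For illustration only, the reasoning about clearing WDM_IN_USE on the
error path can be modelled in userspace. This is a toy sketch, not the
cdc-wdm driver code: ToyWdm, its methods, and the clear_on_error switch
are hypothetical names, and a Python condition variable stands in for
the kernel wait queue.

```python
import threading


class ToyWdm:
    """Toy model of a writer that sets an in-use flag and a flusher that waits on it."""

    def __init__(self, clear_on_error: bool):
        # clear_on_error models the proposed behaviour: drop the flag when a write fails.
        self.clear_on_error = clear_on_error
        self.in_use = False
        self.cond = threading.Condition()

    def write(self, submit_ok: bool) -> bool:
        """Mark the device busy; submit_ok=False models a failed submission."""
        with self.cond:
            self.in_use = True
            if submit_ok:
                return True  # complete() is expected to clear the flag later
            if self.clear_on_error:
                self.in_use = False
                self.cond.notify_all()  # wake anybody waiting on the flag
            return False

    def complete(self) -> None:
        """Model the completion handler: clear the flag and wake waiters."""
        with self.cond:
            self.in_use = False
            self.cond.notify_all()

    def flush(self, timeout: float) -> bool:
        """Wait until the flag is clear; False means the wait timed out (a 'hang')."""
        with self.cond:
            return self.cond.wait_for(lambda: not self.in_use, timeout=timeout)
```

In this model, ToyWdm(clear_on_error=False) followed by write(False)
leaves the flag set, so flush() times out, which is the analogue of the
hung task. With clear_on_error=True the same sequence lets flush()
return immediately. It says nothing about why the flag stays set in the
real driver.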