Re: INFO: task hung in wdm_flush

From: Dmitry Vyukov
Date: Mon Feb 10 2020 - 05:06:25 EST


On Sat, Nov 23, 2019 at 7:52 AM Dmitry Vyukov <dvyukov@xxxxxxxxxx> wrote:
>
> On Tue, Nov 19, 2019 at 12:34 PM Bjørn Mork <bjorn@xxxxxxx> wrote:
> >
> > Oliver Neukum <oneukum@xxxxxxx> writes:
> > > On Tuesday, 19.11.2019 at 10:14 +0100, Bjørn Mork wrote:
> > >
> > >> Anyway, I believe this is not a bug.
> > >>
> > >> wdm_flush will wait forever for the IN_USE flag to be cleared or the
> > >
> > > Damn. Too obvious. So you think we simply have pending output that
> > > just does not complete?
> >
> > I do miss a lot of stuff so I might be wrong, but I can't see any other
> > way this can happen. The out_callback will unconditionally clear the
> > IN_USE flag and wake up the wait_queue.
> >
> > >> DISCONNECTING flag to be set. The only way you can avoid this is by
> > >> creating a device that works normally up to a point and then completely
> > >> ignores all messages,
> > >
> > > Devices may crash. I don't think we can ignore that case.
> >
> > Sure, but I've never seen that happen without the device falling off the
> > bus. Which is a disconnect.
> >
> > But I am all for handling this *if* someone reproduces it with a real
> > device. I just don't think it's worth the effort if it's only a
> > theoretical problem.
> >
> > >> but without resetting or disconnecting. It is
> > >> obviously possible to create such a device. But I think the current
> > >> error handling is more than sufficient, unless you show me some way to
> > >> abuse this or reproduce the issue with a real device.
> > >
> > > Malicious devices are real. Potentially at least.
> > > But you are right, we need not bend over to handle them well, but we
> > > ought to be able to handle them.
> >
> > Sure, we need to handle malicious devices. But only if they can be used
> > for real harm.
> >
> > This warning requires physical access and is only slightly annoying.
> > Like a USB device making loud farting sounds. You'd just disconnect the
> > device. No need for Linux to detect the sound and handle it
> > automatically, I think.
>
> Hi Bjørn,
>
> Besides the production use you are referring to, there are 2 cases we
> should take into account as well:
> 1. Testing.
> Any kernel testing system needs a binary criterion for detecting kernel
> bugs. It seems right to detect unkillable hung tasks as kernel bugs.
> Which means that we need to resolve this in some way regardless of the
> production scenario.
> 2. Reliable killing of processes.
> It's a very important property that an admin or script can reliably
> kill whatever process/container they need to kill for whatever reason.
> This case results in an unkillable process, which means scripts will
> fail, automated systems will misbehave, admins will waste time (if
> they are qualified to resolve this at all).

On Mon, Feb 10, 2020 at 11:00 AM Tetsuo Handa
<penguin-kernel@xxxxxxxxxxxxxxxxxxx> wrote:
>
> Hello.
>
> Will you check whether patch testing is working? I tried
>
> #syz test: https://github.com/google/kasan.git usb-fuzzer
>
> but the reproducer did not trigger a crash either "with a patch"
> or "without a patch", even though the dashboard is still adding crashes.
> I suspect something is wrong. Is it possible that the reproducer is
> testing a bug that was already fixed, while a different new bug is
> still being reported as the same bug?