Re: [PATCH] usb_storage: make usb-stor-scan task non-freezable

From: Seth Forshee
Date: Sat Jul 23 2011 - 17:08:54 EST

On Tue, Jul 19, 2011 at 10:26:39AM -0400, Alan Stern wrote:
> On Mon, 18 Jul 2011, Seth Forshee wrote:
> > On Mon, Jul 18, 2011 at 05:12:35PM -0400, Alan Stern wrote:
> > > On Mon, 18 Jul 2011, Seth Forshee wrote:
> > >
> > > > The following patch is in response to a consistently reproducible
> > > > failure to freeze tasks prior to restoring a hibernation image on a
> > > > Toshiba NB505 netbook. This machine has a built-in USB card reader.
> > > > Since the usb-stor-scan task is freezable but the code in
> > > > quiesce_and_remove_host() that waits for scanning to complete is not,
> > > > khubd can fail to freeze when processing the disconnect for the card
> > > > reader.
> > >
> > > What card-reader disconnect?
> >
> > The call trace (below) shows that the code is processing a device
> > disconnection when this happens. I don't know what triggers it. I take
> > it from your response that this isn't expected (sorry, I'm not really
> > all that familiar with USB)?
>
> But why is there a disconnect at this time? Maybe you could find out
> if you collect the kernel log from the boot kernel (serial console or
> network console).

After experimenting with this device more I came to the conclusion that
the normal behavior with this machine is for the card reader to be
disconnected from the USB bus unless there's a card in the slot. During
a normal boot with an empty card slot the card reader never shows up on
the bus. After S4, however, it does show up for some reason, and then
some transaction errors happen because no card is present (I can trigger
the same errors by quickly inserting and removing a card) and the device
is disconnected. This happens only after S4 though -- if I set the
hibernation mode to reboot or shutdown the errors don't happen. And I do
not see the errors if I hibernate and then boot up with noresume, so
it's not something that happens as a result of the restore process.

So at this point I'm pretty convinced that this is some kind of firmware
issue and that there isn't much hope of fixing it. It's not even
something that can be fixed by patching the AML, as no GPEs or AML
methods are coming into play when all this happens.

> Maybe it would be better to come up with a freezable version of
> wait_for_completion(). You should ask for advice on the linux-pm
> mailing list.

I tried this, and it doesn't work. Well, it works in the sense of
freezing khubd, but then khubd is frozen while holding the mutex for the
device, and the restore hangs later while trying to suspend devices.

The only solution I've come up with is to leave usb-stor-scan freezable
without allowing it to actually freeze. We can request a fake signal be
sent when freezing and use interruptible sleep to abort the wait early
and finish up the thread's processing. This is implemented in the patch
below. Does this approach look reasonable? It's rather subtle, but it
does seem to work. I've done numerous S4 cycles with and without a card
inserted and didn't get any failures.