Re: [PATCH] USB: hcd-pci: Fully suspend across freeze/thaw cycle

From: Evan Green
Date: Thu Apr 14 2022 - 12:56:35 EST


Hi Alan and Mathias,

On Thu, Apr 14, 2022 at 7:21 AM Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> wrote:
>
> On Thu, Apr 14, 2022 at 05:00:12PM +0300, Mathias Nyman wrote:
> > On 12.4.2022 18.40, Alan Stern wrote:
> > > On Tue, Apr 12, 2022 at 05:56:42PM +0300, Mathias Nyman wrote:
> > >> On 11.4.2022 17.50, Alan Stern wrote:
> > >>> For example, what would happen if the user unplugs a device right in the
> > >>> middle of the freeze transition, after the root hub has been frozen but
> > >>> before the controller is frozen? We don't want such an unplug event to
> > >>> prevent the system from going into hibernation -- especially if the root
> > >>> hub was not enabled for wakeup.
> > >>
> > >> We should be able to let system go to hibernate even if we get a disconnect
> > >> interrupt between roothub and host controller freeze.
> > >> Host is not yet suspended so no PME# wake is generated, only an interrupt.
> > >>
> > >> From Linux PM point of view it should be ok as well as the actual xhci
> > >> device that is generating the interrupt is hasnt completer freeze()
> > >>
> > >> The xhci interrupt handler just needs to make sure that the disconnect
> > >> isn't propagated if roothub is suspended and wake on disconnect
> > >> is not set. And definitely make sure xhci doesn't start roothub polling.
> > >>
> > >> When freeze() is called for the host we should prevent the host from
> > >> generating interrupts.
> > >
> > > I guess that means adding a new callback. Or we could just suspend the
> > > controller, like Evan proposed originally
> >
> > Suspending the host in freeze should work.
> > It will do an extra xhci controller state save stage, but that should be harmless.
> >
> > But is there really a need for the suggested noirq part?
> >
> > + .freeze_noirq = hcd_pci_suspend_noirq,
> >
> > That will try to set the host to PCI D3 state.
> > It seems a bit unnecessary for freeze.
>
> Agreed.
>
> > >>> (If the root hub _is_ enabled for wakeup then it's questionable.
> > >>> Unplugging a device would be a wakeup event, so you could easily argue
> > >>> that it _should_ prevent the system from going into hibernation. After
> > >>> all, if the unplug happened a few milliseconds later, after the system
> > >>> had fully gone into hibernation, then it would cause the system to wake
> > >>> up.)
> > >>>
> > >>>> Would it make sense prevent xHCI interrupt generation in the host
> > >>>> freeze() stage, clearing the xHCI EINT bit in addition to calling
> > >>>> check_roothub_suspend()?
> > >>>> Then enable it back in thaw()
> > >>>
> > >>> That won't fully eliminate the problem mentioned in the preceding
> > >>> paragraphs, although I guess it would help somewhat.
> > >>
> > >> Would the following steps solve this?
> > >>
> > >> 1. Disable device initiated resume for connected usb devices in freeze()
> > >>
> > >> 2. Don't propagate connect or OC changes if roothub is suspended and port wake
> > >> flags are disabled. I.E don't kick roothub polling in xhci interrupt
> > >> handler here.
> > >
> > > I guess you can't just halt the entire host controller when only one of
> > > the root hubs is suspended with wakeup disabled. That does complicate
> > > things. But you could halt it as soon as both of the root hubs are
> > > frozen. Wouldn't that prevent interrupt generation?
> >
> > True, but probably easier to just suspend host in freeze() as you stated above.
>
> Okay.
>
> Evan, this discussion suggests that you rewrite your patch as a series
> of three:
>
> 1. Change choose_wakeup() so that for PM_EVENT_FREEZE, wakeup is
> always disabled.

If I understand this correctly, this means potentially runtime
resuming the device so its wakeup setting can be consistently set to
wakeups disabled across a freeze transition. Got it I think in terms
of the "how".

>
> 2. Change the xhci-hcd interrupt handler so that port-status
> changes are ignored if the port's root hub is suspended with
> wakeup disabled.

This part confuses me. This would be way deep under
xhci_handle_event(), probably in handle_port_status(), just throwing
away certain events that come in the ring. How would we know to go
back and process those events later? I think we don't need to do this
if we suspend the controller as in #3 below. The suspended (halted)
controller wouldn't generate event interrupts (since the spec mentions
port status change generation is gated on HCHalted). So we're already
covered against receiving interrupts in this zone by halting the
controller, and the events stay nicely pending for when we restart it
in thaw.

Is the goal of #1 purely a setup change for #2, or does it stand on
its own even if we nixed #2? Said differently, is #1 trying to ensure
that wake signaling doesn't occur at all between freeze and thaw, even
when the controller is suspended and guaranteed not to generate
interrupts via its "normal" mechanism? I don't have a crisp mental
picture of how the wake signaling works, but if the controller wake
mechanism sidesteps the original problem of sending an MSI to a dead
CPU (as in, it does not use MSIs), then it might be ok as-is.

>
> 3. As in the original patch, make the .freeze and .thaw callbacks
> in hcd-pci.c call the appropriate suspend and resume routines,
> but don't do anything for .freeze_noirq and .thaw_noirq.

Sure. I had made the _noirq paths match suspend for consistency, I
wasn't sure if those could mix n match without issues. I'll try it out
leaving the _noirq callbacks alone.
-Evan

>
> How does that sound?
>
> Alan Stern