Re: [PATCH] USB:Fix ehci infinite suspend-resume loop issue in zhaoxin

From: Alan Stern
Date: Wed Apr 06 2022 - 13:54:29 EST


On Wed, Apr 06, 2022 at 10:38:28AM +0800, WeitaoWang-oc@xxxxxxxxxxx wrote:
> On 2022/4/6 00:02, Alan Stern wrote:
> > In fact, the resume kernel doesn't call ehci_resume at all. Here's what
> > it does:
> >
> > The resume kernel boots;
> >
> > If your patch causes STS_PCD to be set at this point, the flag
> > should get cleared shortly afterward by ehci_irq;
> >
> > ehci-hcd goes into runtime suspend;
> >
> > The kernel reads the system image that was stored earlier when
> > hibernation began;
> >
> > After the image is loaded, the system goes into the freeze
> > state (this does not call any routines in ehci-hcd);
> In this phase, pci_pm_freeze will be called for the PCI device. In
> this function, pm_runtime_resume will be called to resume devices that
> are already runtime-suspended, which will cause ehci_resume to be
> called. Thus the STS_PCD flag will be set in the ehci_resume function.

Aha! I was missing that piece of information, thanks.

But this still doesn't explain why check_root_hub_suspended is failing.
That routine checks the HCD_RH_RUNNING bit, which gets set in
hcd_bus_resume. hcd_bus_resume gets called as part of resuming the root
hub, and in ehci-hcd this happens when ehci_irq sees that STS_PCD is set
and calls usb_hcd_resume_root_hub. usb_hcd_resume_root_hub in turn
queues a wakeup request on the pm_wq work queue, which is then supposed
to run hcd_resume_work to actually restart the root hub.

But pm_wq is a freezable work queue! While the system is in the freeze
state, the work queue isn't running. This means that the root hub
should remain suspended until the end of the freeze phase, and so the
call to check_root_hub_suspended should succeed.
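
To make the reasoning above concrete, here is a rough userspace sketch
of the sequence as I understand it. It is a toy model, not kernel code:
every model_* name is hypothetical and merely stands in for the routine
named in its comment, and the work queue is reduced to a single pending
slot that refuses to run while frozen.

```c
/* Toy userspace model of the analysis above; NOT kernel code. */
#include <stdbool.h>
#include <stddef.h>
#include <errno.h>

struct model_hcd {
	bool sts_pcd;     /* stands in for the STS_PCD status bit */
	bool rh_running;  /* stands in for the HCD_RH_RUNNING flag */
};

struct model_wq {
	bool frozen;                /* true while the system is frozen */
	struct model_hcd *pending;  /* at most one queued wakeup request */
};

/* Stands in for usb_hcd_resume_root_hub(): it only queues a request. */
static void model_resume_root_hub(struct model_wq *wq, struct model_hcd *hcd)
{
	wq->pending = hcd;
}

/* Stands in for ehci_irq(): clears STS_PCD and asks for a wakeup. */
static void model_irq(struct model_wq *wq, struct model_hcd *hcd)
{
	if (hcd->sts_pcd) {
		hcd->sts_pcd = false;
		model_resume_root_hub(wq, hcd);
	}
}

/*
 * Stands in for pm_wq running hcd_resume_work(), which leads to
 * hcd_bus_resume() marking the root hub as running. A frozen queue
 * leaves the request pending and changes nothing.
 */
static void model_run_wq(struct model_wq *wq)
{
	if (wq->frozen || wq->pending == NULL)
		return;
	wq->pending->rh_running = true;
	wq->pending = NULL;
}

/* Stands in for check_root_hub_suspended(). */
static int model_check_root_hub_suspended(const struct model_hcd *hcd)
{
	return hcd->rh_running ? -EBUSY : 0;
}

/* Replays the freeze phase, starting with STS_PCD already set. */
int model_freeze_phase(bool wq_frozen)
{
	struct model_hcd hcd = { .sts_pcd = true, .rh_running = false };
	struct model_wq wq = { .frozen = wq_frozen, .pending = NULL };

	model_irq(&wq, &hcd);
	model_run_wq(&wq);
	return model_check_root_hub_suspended(&hcd);
}
```

In this model, model_freeze_phase(true) returns 0 and
model_freeze_phase(false) returns -EBUSY. So if the check really is
failing on your system, then under these assumptions something other
than the frozen pm_wq must be setting the running flag.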

Can you check to see what's really happening on your system? Something
must be wrong with my analysis, but I can't tell what it is. I'm still
puzzled.

Alan Stern