Re: [PATCH v2 1/2] USB: reset-resume the device when PORT_SUSPEND is set but timeout

From: Chris Chiu
Date: Thu May 13 2021 - 00:22:12 EST


On Wed, May 12, 2021 at 11:04 PM Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> wrote:
>
> On Wed, May 12, 2021 at 12:17:12PM +0800, Chris Chiu wrote:
> > On Wed, May 12, 2021 at 12:30 AM Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> wrote:
> > >
> > > On Tue, May 11, 2021 at 01:04:36PM +0800, Chris Chiu wrote:
> > > > On Mon, May 10, 2021 at 11:02 PM Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> wrote:
> > > > >
> > > > > On Mon, May 10, 2021 at 10:50:29PM +0800, chris.chiu@xxxxxxxxxxxxx wrote:
> > > > > > From: Chris Chiu <chris.chiu@xxxxxxxxxxxxx>
> > > > > >
> > > > > > On the Realtek high-speed Hub(0bda:5487), the port which has wakeup
> > > > > > enabled_descendants will sometimes timeout when setting PORT_SUSPEND
> > > > > > feature. After checking the PORT_SUSPEND bit in wPortStatus, it is
> > > > > > already set. However, the hub will fail to activate because the
> > > > > > PORT_SUSPEND feature of that port is not cleared during resume. All
> > > > > > connected devices are lost after resume.
> > > > > >
> > > > > > This commit forces a reset-resume of the device connected to the
> > > > > > timed-out but suspended port so that the hub has a chance to clear
> > > > > > the PORT_SUSPEND feature during resume.
> > > > >
> > > > > Are you certain that the reset-resume is needed? What happens if you
> > > > > leave out the line that sets udev->reset_resume? The rest of the patch
> > > > > will cause the kernel to realize that the port really is suspended, so
> > > > > maybe the suspend feature will get cleared properly during resume.
> > > > >
> > > > > It's worthwhile to try the experiment and see what happens.
> > > > >
> > > > > Alan Stern
> > > > >
> > > >
> > > > If I leave out the udev->reset_resume set, the resume will fail. Please refer
> > > > to the following kernel log. The usb 1-1 is the hub which has wakeup enabled
> > > > descendants.
> > > >
> > > > [ 57.210472] usb 1-1: kworker/u32:7 timed out on ep0out len=0/0
> > > > [ 57.211022] usb 1-1-port3: suspend timeout, status 0507
> > > > [ 57.211130] hub 1-1:1.0: hub_suspend
> > > > [ 57.230500] usb 1-1: usb suspend, wakeup 0
> > > >
> > > > The timeout happens at 57.210472 and you can see the PORT_SUSPEND
> > > > bit is actually set in the "status 0507". The following shows the resume log.
> > > >
> > > > [ 58.046556] usb 1-1: usb resume
> > > > [ 58.114515] usb 1-1: Waited 0ms for CONNECT
> > > > [ 58.114524] usb 1-1: finish resume
> > > > [ 58.114928] hub 1-1:1.0: hub_resume
> > > > [ 58.116035] usb 1-1-port3: status 0507 change 0000
> > > > [ 58.116720] usb 1-1-port5: status 0503 change 0000
> > > > [ 58.116778] hub 1-1.3:1.0: hub_resume
> > > > [ 58.116908] hub 1-1.3:1.0: hub_ext_port_status failed (err = -71)
> > > > [ 58.116952] usb 1-1.5: Waited 0ms for CONNECT
> > > > [ 58.116955] usb 1-1.5: finish resume
> > > > [ 58.117157] hub 1-1.3:1.0: hub_ext_port_status failed (err = -71)
> > > > [ 58.117397] usb 1-1.3-port5: can't resume, status -71
> > > > [ 58.117782] hub 1-1.3:1.0: hub_ext_port_status failed (err = -71)
> > > > [ 58.118147] usb 1-1.3-port2: can't resume, status -71
> > > > [ 58.118149] usb 1-1.3.2: Waited 0ms for CONNECT
> > > > [ 58.118151] usb 1-1.3-port2: status 07eb.906e after resume, -19
> > > > [ 58.118153] usb 1-1.3.2: can't resume, status -19
> > > > [ 58.118154] usb 1-1.3-port2: logical disconnect
> > > > [ 58.118526] usb 1-1.3-port2: cannot disable (err = -71)
> > > >
> > > > As you see in the 58.116035, the hub_resume and activate is OK for the
> > > > usb 1-1. The "usb 1-1.3: finish resume" is not in the log because it's not
> > > > considered suspended and no chance to ClearPortFeature.
> > >
> > > Wait -- why isn't it considered suspended? We saw at 57.211022 that
> > > 1-1-port3's Suspend feature really was set, and thanks to your patch,
> > > the kernel should now believe that the port is suspended.
> > >
> > But it's still in the `if (status)` branch so it will not get
> > usb_set_device_state to USB_STATE_SUSPENDED, then usb_resume_both
> > will not do the resume process for it.
>
> Ah, yes. I was mis-reading the patch.
>
> > My original thought is, we still take this as an abnormal status
> > because we don't really know the reason for the timeout. Set
> > reset_resume for the udev will make the kernel to reset_resume it.
> > Or I have to create a new `goto` name in the `else` branch to force
> > it back to the successful suspended process. And should I clean the
> > status to zero for pm_runtime_put_sync()? What's your suggestion?
>
> For testing purposes, set status to 0 and jump to a new goto label in
> the "else" branch. In other words, treat it as if the suspend really
> had worked and go back to the successful pathway. Try this out and see
> if it fixes the problem.
>
> If it does then the reset-resume isn't needed. If it doesn't, post your
> patch again, and mention in the patch description that testing shows the
> reset-resume really is necessary.
>
> Alan Stern

Thanks for the suggestion. I revised the patch to add a new goto
label in the "else" branch and to set status to zero if the port is
really suspended. That fixed the problem across 100 suspend/resume
cycles in my test, so I will send a v3 patch with this modification.

Chris
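
For reference, the check discussed above can be sketched in standalone
C. This is only an illustration of the idea, not the v3 patch and not
code from drivers/usb/core/hub.c. The USB_PORT_STAT_* values match the
kernel's include/uapi/linux/usb/ch11.h, but port_is_suspended() and
suspend_result() are hypothetical helpers invented for this example,
and the sketch assumes the caller has already re-read wPortStatus
after the failed SetPortFeature(PORT_SUSPEND) request.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* wPortStatus bits for a USB 2.0 hub port; the values match the
 * kernel's include/uapi/linux/usb/ch11.h. */
#define USB_PORT_STAT_CONNECTION 0x0001
#define USB_PORT_STAT_ENABLE     0x0002
#define USB_PORT_STAT_SUSPEND    0x0004
#define USB_PORT_STAT_POWER      0x0100
#define USB_PORT_STAT_HIGH_SPEED 0x0400

/* "status 0507" in the log is connection + enable + suspend + power +
 * high speed; "status 0503" is the same without the suspend bit. */
static_assert(0x0507 == (USB_PORT_STAT_CONNECTION | USB_PORT_STAT_ENABLE |
                         USB_PORT_STAT_SUSPEND | USB_PORT_STAT_POWER |
                         USB_PORT_STAT_HIGH_SPEED),
              "0507 decodes to a suspended high-speed port");
static_assert((0x0503 & USB_PORT_STAT_SUSPEND) == 0,
              "0503 does not have the suspend bit set");

/* Hypothetical helper: true if the port reports itself as suspended. */
static bool port_is_suspended(uint16_t portstatus)
{
	return (portstatus & USB_PORT_STAT_SUSPEND) != 0;
}

/*
 * Hypothetical helper modelling the decision only: given the result of
 * the SetPortFeature(PORT_SUSPEND) request and the port status read
 * back afterwards, return the status the suspend path should continue
 * with. A failed request on a port that is nevertheless suspended is
 * treated as success (status cleared to 0); any other failure is
 * passed through unchanged.
 */
static int suspend_result(int set_feature_status, uint16_t portstatus)
{
	if (set_feature_status == 0)
		return 0;
	if (port_is_suspended(portstatus))
		return 0;
	return set_feature_status;
}
```

With the values from the log, suspend_result(-ETIMEDOUT, 0x0507)
yields 0, so the device would follow the normal suspended path, while
the same timeout with status 0x0503 would still be reported as an
error.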