Re: [PATCH v4] usb: core: hub: Add quirks for reducing device address timeout

From: Alan Stern
Date: Tue Oct 24 2023 - 11:44:25 EST


On Mon, Oct 23, 2023 at 06:13:48PM +0200, Hardik Gajjar wrote:
> On Sat, Oct 21, 2023 at 12:15:35PM +0200, Greg KH wrote:
> > On Tue, Oct 17, 2023 at 02:59:54PM -0400, Alan Stern wrote:
> > > On Tue, Oct 17, 2023 at 06:53:44PM +0200, Greg KH wrote:
> > > > On Tue, Oct 17, 2023 at 06:10:21PM +0200, Hardik Gajjar wrote:
> > > > > More logs and detailed in patch V1:
> > > > > https://urldefense.proofpoint.com/v2/url?u=https-3A__lore.kernel.org_linux-2Dusb_20230818092353.124658-2D1-2Dhgajjar-40de.adit-2Djv.com_T_-23m452ec9dad94e8181fdb050cd29483dd89437f7c1&d=DwICAg&c=euGZstcaTDllvimEN8b7jXrwqOf-v5A_CdpgnVfiiMM&r=SAhjP5GOmrADp1v_EE5jWoSuMlYCIt9gKduw-DCBPLs&m=P0HXZTx6ta7v5M4y2Y7WZkPrY-dpKkxBq8tAzuX8cI9aj9tE2NuVvJjLl3Uvojpw&s=N_HwnQeZb_gHMmgz53uTGDUZVi28EXb1l9Pg6PdbvVI&e=
> > > > > >
> > > > > > > Achieving this is impossible in scenarios where the set_address is
> > > > > > > not successful and waits for a timeout.
> > > > > >
> > > > > > Agreed, broken hardware is a pain, but if your device is allowed to take
> > > > > > longer, it can, and will, so you have to support that.
> > > > > >
> > > > > The problem is not caused by the device taking an extended amount of time to
> > > > > process the 'set_address' request. Instead, the issue lies in the absence of
> > > > > any activity on the upstream bus until a timeout occurs.
> > > >
> > > > So, a broken device. Why are you then adding the hub to the quirk list
> > > > and not the broken device? We are used to adding broken devices to
> > > > qurik lists all the time, this shouldn't be new.
> > >
> > > Adding a quirk for the device isn't feasible, because the problem occurs
> > > before the device has been initialized and enumerated. The kernel
> > > doesn't know anything about the device at this point; only that it has
> > > just connected.
> >
> > Ah, ick, you are right, but we do know the "broken hub" id, so that
> > makes a bit more sense. Should this be a hub-only type quirk?
>
> In addition to the earlier comment, it appears that the issue is most likely
> related to the hub. While we have identified one specific phone that triggers
> this problem, we cannot determine how many other devices might encounter a
> similar issue, where they enumerate as full speed initially and then switch
> to high speed. To address this, we are proposing to use a 500 ms timeout for
> all devices connected via the hub. This change aims to prevent potential
> timeout-related problems with various devices

So it sounds like the best approach is to make this a hub-specific
quirk.

> It does appear that the issue is related to the hub, and the ideal solution would involve
> modifying the hub's firmware. However, implementing such a firmware fix is currently not
> a straightforward task. As a result, we have implemented this quirk-based solution to
> mitigate the issue to some extent
>
> Following is the LeCroy analyzer logs:
>
> 1. logs between Hub and phone with broken hub.
>
> In packet 58, there is a Full-speed J (suspend) event that lasted for 5.347 seconds.
> It's suspected that the hub was suspended due to incorrect chirp parsing.
> This anomaly in chirp parsing may be a contributing factor to the issue we're facing.

Yes, that's probably true. It's another indication that the hub is
somehow at fault.

Alan Stern