Re: USB disk speed regression WD Elements - with bisect result 22547c4cc4fe20698a6a85a55b8788859134b8e4

From: Zdenek Kabelac
Date: Thu Jul 27 2017 - 05:12:01 EST


Dne 27.7.2017 v 03:03 Alan Stern napsal(a):
[Added linux-usb mailing list to CC:]

Short description of bug: In 4.12 or later, when Zdenek's Western
Digital disk is attached to an EHCI controller, it ends up connecting
at full speed to the companion UHCI controller instead. But when
commit 22547c4cc4fe ("usb: hub: Wait for connection to be reestablished
after port reset") is reverted, the disk connects correctly at high
speed.

On Wed, 26 Jul 2017, Zdenek Kabelac wrote:

Dne 25.7.2017 v 21:50 Alan Stern napsal(a):
On Tue, 25 Jul 2017, Zdenek Kabelac wrote:


08:18 usb 2-2: USB disconnect, device number 2
08:25 usb 2-2: new high-speed USB device number 3 using ehci-pci
08:26 usb 2-2: new high-speed USB device number 4 using ehci-pci

If the drive were working entirely correctly, it wouldn't do that.

We could continue futzing around with hardware and driver tests for a
long time. But there may be a shortcut: If you have a USB hub, you
could try attaching the drive through it. It's entirely possible that
this will fix the problem.

If not, you'll have to start doing some very detailed tests. As a
start, you can enable debugging for the usbcore and ehci_hcd drivers
immediately before the test:

echo 'module usbcore =p' >/sys/kernel/debug/dynamic_debug/control
echo 'module ehci_hcd =p' >/sys/kernel/debug/dynamic_debug/control
dmesg -C

Then after the test, see what shows up in the dmesg output. And again,
we'll want to do a comparison. In fact, 4.12 with and without the
commit you identified would make a better comparison than 4.12 vs. 4.8.



Hi


So here we go with traces - made with freshly compiled recent 4.13-rc2.
OK trace is with revert patch applied.
BAD trace is the one with it (essentially vaniala master).

Trace also has KOBJECT debugging enabled - I think difference is
nicely visible between them - but I've no explanation for it.

Both traces start with cable detach followed with attachment.

Okay, I figured out the cause.

The USB stack assumes that if a high-speed device initialization fails
and the device is still connected, it means that the device can't run
properly at high speed and the computer needs to try again at full
speed. See commit 90da096ee46b ("USB: force handover port to companion
when hub_port_connect_change fails").

In Zdenek's case, the device really does disconnect itself before the
second port reset. If 22547c4cc4fe is present, it causes an extra
delay -- some 200 ms -- long enough for the device to reconnect again.
So we appear to be in the situation described above, and the connection
is switched over to the companion controller.

If 22547c4cc4fe is not present, the kernel sees that the device is not
connected at the end of the second reset and gives up trying to
initialize the device. When the device reconnects about 140 ms later,
the kernel treats it as a new connection and successfully negotiates a
high-speed link.

Zdenek, you check this explanation by commenting out these last two
lines at the end of hub_port_connect() in drivers/usb/core/hub.c:

if (hcd->driver->relinquish_port && !hub->hdev->parent)
hcd->driver->relinquish_port(hcd, port1);

That should prevent the connection from being handed over to the UHCI
companion, allowing the device to operate at high speed.


Hi

Yep - seems this helped - I've dropped revert and commented those 2 lines
and I've used the very same kernel - and speed was all good:

usb 2-2: new high-speed USB device number 2 using ehci-pci
usb 2-2: new high-speed USB device number 3 using ehci-pci
usb 2-2: New USB device found, idVendor=1058, idProduct=10a8
usb 2-2: New USB device strings: Mfr=1, Product=2, SerialNumber=3


So while I'm not sure if this is final fix for USB - this solution surely solved my WD Element disk attachment issue.

Regards

Zdenek