[PATCH 3.14 11/88] USB: Fix persist resume of some SS USB devices

From: Greg Kroah-Hartman
Date: Wed Sep 03 2014 - 19:34:32 EST


3.14-stable review patch. If anyone has any objections, please let me know.

------------------

From: Pratyush Anand <pratyush.anand@xxxxxx>

commit a40178b2fa6ad87670fb1e5fa4024db00c149629 upstream.

Problem Summary: Problem has been observed generally with PM states
where VBUS goes off during suspend. There are some SS USB devices which
take longer time for link training compared to many others. Such
devices fail to reconnect with same old address which was associated
with it before suspend.

When system resumes, at some point of time (dpm_run_callback->
usb_dev_resume->usb_resume->usb_resume_both->usb_resume_device->
usb_port_resume) SW reads hub status. If device is present,
then it finishes port resume and re-enumerates device with same
address. If device is not present then, SW thinks that device was
removed during suspend and therefore does logical disconnection
and removes all the resource allocated for this device.

Now, if I put sufficient delay just before root hub status read in
usb_resume_device then, SW sees always that device is present. In normal
course(without any delay) SW sees that no device is present and then SW
removes all resource associated with the device at this port. In the
latter case, after sometime, device says that hey I am here, now host
enumerates it, but with new address.

Problem had been reproduced when I connect verbatim USB3.0 hard disc
with my STiH407 XHCI host running with 3.10 kernel.

I see that similar problem has been reported here.
https://bugzilla.kernel.org/show_bug.cgi?id=53211
Reading above it seems that bug was not in 3.6.6 and was present in 3.8
and again it was not present for some in 3.12.6, while it was present
for few others. I tested with 3.13-FC19 running at i686 desktop, problem
was still there. However, I was failed to reproduce it with 3.16-RC4
running at same i686 machine. I would say it is just a random
observation. Problem for few devices is always there, as I am unable to
find a proper fix for the issue.

So, now question is what should be the amount of delay so that host is
always able to recognize suspended device after resume.

XHCI specs 4.19.4 says that when Link training is successful, port sets
CSC bit to 1. So if SW reads port status before successful link
training, then it will not find device to be present. USB Analyzer log
with such buggy devices show that in some cases device switch on the
RX termination after long delay of host enabling the VBUS. In few other
cases it has been seen that device fails to negotiate link training in
first attempt. It has been reported till now that few devices take as
long as 2000 ms to train the link after host enabling its VBUS and
RX termination. This patch implements a 2000 ms timeout for CSC bit to set
ie for link training. If in a case link trains before timeout, loop will
exit earlier.

This patch implements above delay, but only for SS device and when
persist is enabled.

So, for the good device overhead is almost none. While for the bad
devices penalty could be the time which it take for link training.
But, If a device was connected before suspend, and was removed
while system was asleep, then the penalty would be the timeout ie
2000 ms.

Results:

Verbatim USB SS hard disk connected with STiH407 USB host running 3.10
Kernel resumes in 461 msecs without this patch, but hard disk is
assigned a new device address. Same system resumes in 790 msecs with
this patch, but with old device address.

Signed-off-by: Pratyush Anand <pratyush.anand@xxxxxx>
Acked-by: Alan Stern <stern@xxxxxxxxxxxxxxxxxxx>
Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>

---
drivers/usb/core/hub.c | 41 +++++++++++++++++++++++++++++++++++++++++
1 file changed, 41 insertions(+)

--- a/drivers/usb/core/hub.c
+++ b/drivers/usb/core/hub.c
@@ -3174,6 +3174,43 @@ static int finish_port_resume(struct usb
}

/*
+ * There are some SS USB devices which take longer time for link training.
+ * XHCI specs 4.19.4 says that when Link training is successful, port
+ * sets CSC bit to 1. So if SW reads port status before successful link
+ * training, then it will not find device to be present.
+ * USB Analyzer log with such buggy devices show that in some cases
+ * device switch on the RX termination after long delay of host enabling
+ * the VBUS. In few other cases it has been seen that device fails to
+ * negotiate link training in first attempt. It has been
+ * reported till now that few devices take as long as 2000 ms to train
+ * the link after host enabling its VBUS and termination. Following
+ * routine implements a 2000 ms timeout for link training. If in a case
+ * link trains before timeout, loop will exit earlier.
+ *
+ * FIXME: If a device was connected before suspend, but was removed
+ * while system was asleep, then the loop in the following routine will
+ * only exit at timeout.
+ *
+ * This routine should only be called when persist is enabled for a SS
+ * device.
+ */
+static int wait_for_ss_port_enable(struct usb_device *udev,
+ struct usb_hub *hub, int *port1,
+ u16 *portchange, u16 *portstatus)
+{
+ int status = 0, delay_ms = 0;
+
+ while (delay_ms < 2000) {
+ if (status || *portstatus & USB_PORT_STAT_CONNECTION)
+ break;
+ msleep(20);
+ delay_ms += 20;
+ status = hub_port_status(hub, *port1, portstatus, portchange);
+ }
+ return status;
+}
+
+/*
* usb_port_resume - re-activate a suspended usb device's upstream port
* @udev: device to re-activate, not a root hub
* Context: must be able to sleep; device not locked; pm locks held
@@ -3275,6 +3312,10 @@ int usb_port_resume(struct usb_device *u

clear_bit(port1, hub->busy_bits);

+ if (udev->persist_enabled && hub_is_superspeed(hub->hdev))
+ status = wait_for_ss_port_enable(udev, hub, &port1, &portchange,
+ &portstatus);
+
status = check_port_resume_type(udev,
hub, port1, status, portchange, portstatus);
if (status == 0)


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/