Re: USB devices on Dell TB16 dock stop working after resuming

From: Paul Menzel
Date: Fri Jan 17 2020 - 04:57:05 EST


Dear Mathias, dear Mario,


On 2019-12-23 10:39, Mathias Nyman wrote:
> On 20.12.2019 16.25, Paul Menzel wrote:

>> On 2019-11-26 13:44, Mathias Nyman wrote:
>>> On 26.11.2019 13.33, Paul Menzel wrote:
>>
>>>> On 2019-11-25 10:20, Mathias Nyman wrote:
>>>>> On 22.11.2019 13.41, Mika Westerberg wrote:
>>>>>> On Fri, Nov 22, 2019 at 12:33:44PM +0100, Paul Menzel wrote:
>>>>
>>>>>>> On 2019-11-22 12:29, Mika Westerberg wrote:
>>>>>>>> On Fri, Nov 22, 2019 at 12:05:13PM +0100, Paul Menzel wrote:
>>>>>>>
>>>>>>>>> On 2019-11-22 11:50, Mika Westerberg wrote:
>>>>>>>>>> On Wed, Nov 20, 2019 at 12:50:53PM +0200, Mika Westerberg wrote:
>>>>>>>>>>> On Tue, Nov 19, 2019 at 05:55:43PM +0100, Paul Menzel wrote:
>>>>>>>>>
>>>>>>>>>>>> On 2019-11-04 17:21, Mika Westerberg wrote:
>>>>>>>>>>>>> On Mon, Nov 04, 2019 at 05:11:10PM +0100, Paul Menzel wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>>> On 2019-11-04 16:49, Mario.Limonciello@xxxxxxxx wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> From: Mika Westerberg <mika.westerberg@xxxxxxxxxxxxxxx>
>>>>>>>>>>>>>>>> Sent: Monday, November 4, 2019 9:45 AM
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Mon, Nov 04, 2019 at 04:44:40PM +0200, Mika Westerberg wrote:
>>>>>>>>>>>>>>>>> On Mon, Nov 04, 2019 at 04:25:03PM +0200, Mika Westerberg wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Mon, Nov 04, 2019 at 02:13:13PM +0100, Paul Menzel wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On the Dell XPS 13 9380 with Debian Sid/unstable with Linux 5.3.7
>>>>>>>>>>>>>>>>>>> suspending the system, and resuming with Dell’s Thunderbolt TB16
>>>>>>>>>>>>>>>>>>> dock connected, the USB input devices, keyboard and mouse,
>>>>>>>>>>>>>>>>>>> connected to the TB16 stop working. They work for a few seconds
>>>>>>>>>>>>>>>>>>> (mouse cursor can be moved), but then stop working. The laptop
>>>>>>>>>>>>>>>>>>> keyboard and touchpad still work fine. All firmware is up-to-date
>>>>>>>>>>>>>>>>>>> according to `fwupdmgr`.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> What are the exact steps to reproduce? Just "echo mem >
>>>>>>>>>>>>>>>>>> /sys/power/state" and then resume by pressing power button?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> GNOME Shell 3.34.1+git20191024-1 is used, and the user just closes the
>>>>>>>>>>>>>> display. So more than `echo mem > /sys/power/state` is done. What
>>>>>>>>>>>>>> distribution do you use?
>>>>>>>>>>>>>
>>>>>>>>>>>>> I have buildroot based "distro" so there is no UI running.
>>>>>>>>>>>>
>>>>>>>>>>>> Hmm, this is quite different from the “normal” use-case of these devices.
>>>>>>>>>>>> That way you won’t hit the bugs of the normal users. ;-)
>>>>>>>>>>>
>>>>>>>>>>> Well, I can install some distro to that thing also :) I suppose Debian
>>>>>>>>>>> 10.2 does have this issue, no?
>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I tried v5.4-rc6 on my 9380 with TB16 dock connected and did a couple of
>>>>>>>>>>>>>>>>> suspend/resume cycles (to s2idle) but I don't see any issues.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I may have older/different firmware than you, though.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Upgraded BIOS to 1.8.0 and TBT NVM to v44 but still can't reproduce this
>>>>>>>>>>>>>>>> on my system :/
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The user reported the issue with the previous firmwares 1.x and TBT NVM v40.
>>>>>>>>>>>>>> Updating to the recent version (I got the logs with) did not fix the issue.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I also tried v40 (that was originally on that system) but I was not able
>>>>>>>>>>>>> to reproduce it.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Do you know if the user changed any BIOS settings?
>>>>>>>>>>>>
>>>>>>>>>>>> We had to disable the Thunderbolt security settings as otherwise the USB
>>>>>>>>>>>> devices wouldn’t work at cold boot either.
>>>>>>>>>>>
>>>>>>>>>>> That does not sound right at all. There is the preboot ACL that allows
>>>>>>>>>>> you to use TBT dock already on boot. Bolt takes care of this.
>>>>>>>>>>>
>>>>>>>>>>> Are you talking about USB devices connected to the TB16 dock?
>>>>>>>>>>>
>>>>>>>>>>> Also are you connecting the TB16 dock to the Thunderbolt ports (left
>>>>>>>>>>> side of the system marked with small lightning logo) or to the normal
>>>>>>>>>>> Type-C ports (right side)?
>>>>>>>>>>>
>>>>>>>>>>>> So, I built Linux 5.4-rc8 (`make bindeb-pkg -j8`), but unfortunately the
>>>>>>>>>>>> error is still there. Sometimes, re-plugging the dock helped, and sometimes
>>>>>>>>>>>> it did not.
>>>>>>>>>>>>
>>>>>>>>>>>> Please find the logs attached. The strange thing is, the Linux kernel detects
>>>>>>>>>>>> the devices and I do not see any disconnect events. But, `lsusb` does not list
>>>>>>>>>>>> the keyboard and the mouse. Is that expected?
>>>>>>>>>>>
>>>>>>>>>>> I'm a bit confused. Can you describe the exact steps you take (so I can
>>>>>>>>>>> replicate them)?
>>>>>>>>>>
>>>>>>>>>> I managed to reproduce following scenario.
>>>>>>>>>>
>>>>>>>>>> 1. Boot the system up to UI
>>>>>>>>>> 2. Connect TB16 dock (and see that it gets authorized by bolt)
>>>>>>>>>> 3. Connect keyboard and mouse to the TB16 dock
>>>>>>>>>> 4. Both mouse and keyboard are functional
>>>>>>>>>> 5. Enter s2idle by closing laptop lid
>>>>>>>>>> 6. Exit s2idle by opening the laptop lid
>>>>>>>>>> 7. After ~10 seconds or so the mouse or keyboard or both do not work
>>>>>>>>>>    anymore. They do not respond but they are still "present".
>>>>>>>>>>
>>>>>>>>>> The above does not happen always but from time to time.
>>>>>>>>>>
>>>>>>>>>> Is this the scenario you see as well?
>>>>>>>>>
>>>>>>>>> Yes, it is. Though I’d say it’s only five seconds or so.
>>>>>>>>>
>>>>>>>>>> This is on Ubuntu 19.10 with the 5.3 stock kernel.
>>>>>>>>>
>>>>>>>>> “stock” in upstream’s or Ubuntu’s?
>>>>>>>>
>>>>>>>> It is Ubuntu's.
>>>>>>>>
>>>>>>>>>> I can get them work again by unplugging them and plugging back (leaving
>>>>>>>>>> the TBT16 dock connected). Also if you run lspci when the problem
>>>>>>>>>> occurs it still shows the dock so PCIe link stays up.
>>>>>>>>>
>>>>>>>>> Re-connecting the USB devices does not help here, but I still suspect it’s
>>>>>>>>> the same issue.
>>>>>>>>
>>>>>>>> Yeah, sounds like so. Did you try to connect the device (mouse,
>>>>>>>> keyboard) to another USB port?
>>>>>>>
>>>>>>> I do not think I did, but I can’t remember. Next week would be the next chance
>>>>>>> to test this.
>>>>>>>
>>>>>>>>> Yesterday, I had my hand on a Dell XPS 13 7390 (10th Intel generation) and
>>>>>>>>> tried it with the shipped Ubuntu 18.04 LTS. There, the problem was not
>>>>>>>>> always reproducible, but it still happened. Sometimes, only one of the USB
>>>>>>>>> device (either keyboard or mouse) stopped working.
>>>>>>>>
>>>>>>>> I suppose this is also with the TB16 dock connected, correct?
>>>>>>>
>>>>>>> Correct.
>>>>>>>
>>>>>>> Can I ask again, how the USB devices connected to the dock can be listed on
>>>>>>> the command line? lsusb needs to be adapted for that or is a different
>>>>>>> mechanism needed?
>>>>>>
>>>>>> The TB16 dock has ASMEDIA xHCI controller, which is PCIe device so you
>>>>>> can see it by running lsusb and looking at the devices under that
>>>>>> controller. I think maybe 'lsusb -t' is helpful.
>>>>>>
>>>>>> The xHCI controller itself you can see by running lspci.
>>>>>
>>>>> I got traces from the ASMedia xHC controller in the TB16 dock.
>>>>
>>>> Nice. Thank you for looking into that. How can these traces be captured?
>>>
>>> The Linux tracepoints added to the xhci driver can be enabled by:
>>>
>>> mount -t debugfs none /sys/kernel/debug
>>> echo 81920 > /sys/kernel/debug/tracing/buffer_size_kb
>>> echo 1 > /sys/kernel/debug/tracing/events/xhci-hcd/enable
>>> < Trigger the issue >
>>>
>>> Copy traces found in /sys/kernel/debug/tracing/trace
>>>
>>> Trace file grows fast.
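For repeated test runs, the capture steps quoted above could be wrapped in a small helper. This is only a sketch under assumptions: debugfs is already mounted at /sys/kernel/debug, the script runs as root, and the function names `enable_xhci_tracing` and `save_trace` are made up for illustration. The paths and the buffer size are the ones from the commands above.

```python
from pathlib import Path

# Location of the tracing directory once debugfs is mounted
# (taken from the commands quoted above).
DEFAULT_TRACING_DIR = Path("/sys/kernel/debug/tracing")


def enable_xhci_tracing(tracing_dir=DEFAULT_TRACING_DIR, buffer_kb=81920):
    """Write the same values as the two echo commands quoted above."""
    tracing_dir = Path(tracing_dir)
    # Enlarge the ring buffer first, since the trace grows fast.
    (tracing_dir / "buffer_size_kb").write_text(f"{buffer_kb}\n")
    # Enable all tracepoints of the xhci-hcd event group.
    (tracing_dir / "events" / "xhci-hcd" / "enable").write_text("1\n")


def save_trace(destination, tracing_dir=DEFAULT_TRACING_DIR):
    """Copy the trace buffer to a regular file; return the number of lines."""
    tracing_dir = Path(tracing_dir)
    lines = 0
    with open(tracing_dir / "trace", "r", errors="replace") as src, \
            open(destination, "w") as dst:
        for line in src:
            dst.write(line)
            lines += 1
    return lines
```

Usage would be: call `enable_xhci_tracing()`, trigger the issue (suspend, resume, wait for the devices to stop responding), then call `save_trace("xhci-trace.txt")`. The output file name is arbitrary.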
>>
>>>>> There are issues with split transactions between the ASMedia host and the 7 port
>>>>> High speed hub built in to the dock.
>>>>>
>>>>> host reports a split transaction error for mouse or keyboard full-speed/low-speed
>>>>> interrupt transactions. Endpoint doesn't recover after resetting it.
>>>>>
>>>>> Split transaction allows full- and low-speed devices to be attached to high-speed
>>>>> hubs, and are used only between the host and the HS hub. A transaction translator (TT)
>>>>> in the HS hub will translate the high-speed split transactions on its upstream port to
>>>>> low/full speed transactions on the downstream port.
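To see which attached devices depend on a transaction translator, one could compare the `speed` attribute of each USB device in sysfs with that of its upstream hub. The following is a rough sketch, not a tested diagnostic: the helper names are hypothetical, and it assumes the usual sysfs naming (`usb3` for a root hub, `3-1` for a device on root port 1, `3-1.2` for a device on port 2 of the hub at `3-1`) and speed values in Mbit/s ("1.5", "12", "480"). Devices directly on a root hub are skipped, on the assumption that only external high-speed hubs use split transactions.

```python
from pathlib import Path


def parent_of(name):
    """Return the sysfs name of the upstream hub, or None for a root hub."""
    if name.startswith("usb"):
        return None
    bus, _, path = name.partition("-")
    if "." in path:
        # Strip the last port number to get the upstream hub.
        return f"{bus}-{path.rsplit('.', 1)[0]}"
    # Device sits directly on a root hub port.
    return f"usb{bus}"


def devices_behind_tt(speeds):
    """List low/full-speed devices attached to an external high-speed hub.

    speeds maps sysfs device names to the string read from 'speed'.
    """
    result = []
    for name, speed in sorted(speeds.items()):
        parent = parent_of(name)
        if parent is None or parent.startswith("usb"):
            continue  # root hubs and devices on root ports are skipped
        if speed in ("1.5", "12") and speeds.get(parent) == "480":
            result.append(name)
    return result


def read_speeds(root="/sys/bus/usb/devices"):
    """Collect the 'speed' attribute of every USB device (not interface)."""
    speeds = {}
    for entry in Path(root).iterdir():
        speed_file = entry / "speed"
        if speed_file.is_file():
            speeds[entry.name] = speed_file.read_text().strip()
    return speeds
```

Calling `devices_behind_tt(read_speeds())` on the affected system should then name the keyboard and mouse entries if they are reached through the dock's high-speed hub.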
>>>>>
>>>>> I'll see if there are any xHC parameters the driver is setting that make these
>>>>> split transaction errors trigger more easily.
>>>>
>>>> I always wonder how the Microsoft Windows driver does it.
>>>>
>>>> Mario, should I contact the Dell support regarding this issue?
>> Sorry for bothering, but were you able to find some workaround for
>> this issue?
>
> Unfortunately no, I couldn't find any workaround.
> xhci slot and endpoint context values for both the HS hub, and the
> full/low speed device seem correct.
>
> I was able to reproduce the issue with an external HS hub as well, so this issue
> appears to be more related to ASMedia host than the built in HS hub in TB16

I contacted the (German) Dell support, and they asked me to update the laptop
firmware to 1.9.1 claiming that these issues might be fixed there (despite the
change-log not containing that). Anyway, after the update, the user is still
able to reproduce the issue.

Mario, what can I do, so the issue is escalated to your team, so you can work
with ASMedia to solve this?


Kind regards,

Paul
