Re: I'd like to donate a MacBook Pro

From: Mathias Nyman
Date: Thu May 11 2017 - 07:14:25 EST


On 04.05.2017 05:34, Alex Henrie wrote:
> 2017-05-03 8:58 GMT-06:00 Joerg Roedel <jroedel@xxxxxxx>:
>> On Wed, May 03, 2017 at 08:35:47AM -0600, Alex Henrie wrote:
>>> 2017-05-03 5:58 GMT-06:00 Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx>:
>>>> On Tue, May 02, 2017 at 10:55:09PM -0600, Alex Henrie wrote:
>>>>> Today I ran a regression test to determine which commit made the
>>>>> keyboard stop working entirely. The last commit that worked for me was
>>>>> c09e22d5370739e16463c113525df51b5980b1d5. After that, there is a long
>>>>> series of commits where the screen stays black, and after that, I
>>>>> start getting errors like the one above.
>>>>
>>>> So git bisect said that commit was a good change, what one was the "bad"
>>>> commit that git bisect pointed at?
>>>
>>> The commit right after that, but it's not clear whether the screen
>>> blanking problem was part of the same bug or just another bug that was
>>> introduced at about the same time.
>>
>> That would be 39ab9555c24110671f8dc671311a26e5c985b592:
>>
>> iommu: Add sysfs bindings for struct iommu_device
>>
>> And it introduced a regression when iommu-sysfs entries are accessed.
>> This is fixed in a7fdb6e648fb.
>>
>> Does that commit fix the screen blanking problem?
>
> At a7fdb6e648fb the screen works but the keyboard is broken. I think
> that the screen was actually fixed by an earlier commit, but when I
> tried to run a regression test to find when the keyboard problem
> appeared after the screen was fixed, I gave up because testing each
> revision takes about half an hour and the keyboard problem appears to
> be somewhat intermittent.
>
> -Alex

Looks like there are a few people suffering from MacBook xhci-related issues.

From the bugs linked it looks like there are both some UEFI issues and a non-responsive
USB device at boot. At least one USB device does not respond to an Address Device command,
and the driver times out and aborts the command after 5 seconds.

I don't think I can do anything about the UEFI issues or the first 5 second command timeout
from the xhci driver, but after that some improvements could be made.

This timeout code path in the xhci driver is not exercised that often.
Some minor races were fixed in 4.11 in this area, and better tracing was added, but unfortunately
I just discovered there is also a regression in 4.11.

If you can help me debug this by compiling some new kernels and taking logs and traces on
your MacBook, we could move this forward.

I set up a branch for this issue; it has the race fixes, tracing and regression fixes:

git://git.kernel.org/pub/scm/linux/kernel/git/mnyman/xhci.git xhci-macbook

Remember to check out the xhci-macbook branch.

Can you compile and run that kernel on your MacBook, and take logs?
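
For reference, fetching and building the branch might look roughly like the sketch
below. The repository URL and branch name are the ones given above; the helper names,
the default directory "xhci", and the location of the running kernel's config under
/boot are assumptions and will differ between distributions.

```shell
#!/bin/sh
# Repository URL and branch name as given in this message.
XHCI_REPO="git://git.kernel.org/pub/scm/linux/kernel/git/mnyman/xhci.git"
XHCI_BRANCH="xhci-macbook"

# Hypothetical helper: clone only the xhci-macbook branch into a directory.
fetch_xhci_branch() {
    dest="${1:-xhci}"
    git clone --branch "$XHCI_BRANCH" --single-branch "$XHCI_REPO" "$dest"
}

# Hypothetical helper: build and install the kernel from the cloned tree.
# Runs in a subshell so the caller's working directory is not changed.
# Assumption: the running kernel's config lives at /boot/config-$(uname -r).
build_xhci_kernel() (
    src="${1:-xhci}"
    cd "$src" || exit 1
    cp "/boot/config-$(uname -r)" .config || exit 1
    make olddefconfig || exit 1          # accept defaults for new options
    make -j"$(nproc)" || exit 1          # build kernel and modules
    sudo make modules_install install    # install; needs root
)
```

Calling fetch_xhci_branch followed by build_xhci_kernel, then rebooting into the new
kernel, would cover the steps above. Installing the kernel and updating the boot loader
varies by distribution, so treat the last step as a starting point only.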

Traces can be enabled by adding "trace_event=xhci-hcd" to your kernel command line.
Then send me the output of both dmesg and /sys/kernel/debug/tracing/trace after the issue is seen.
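
Collecting the two outputs could be done with something like the sketch below. The
helper names, the output directory, and the file names are made up for illustration;
only the trace path and the "trace_event=xhci-hcd" parameter come from this message.
It assumes debugfs is mounted at /sys/kernel/debug and that you run it as root.

```shell
#!/bin/sh
# Paths can be overridden from the environment; defaults match this message.
TRACE_FILE="${TRACE_FILE:-/sys/kernel/debug/tracing/trace}"
CMDLINE_FILE="${CMDLINE_FILE:-/proc/cmdline}"

# Hypothetical helper: succeed only if the kernel was booted with the
# xhci trace events enabled.
check_trace_param() {
    grep -q 'trace_event=xhci-hcd' "$CMDLINE_FILE"
}

# Hypothetical helper: save dmesg and the trace buffer into a directory.
collect_xhci_logs() {
    out_dir="${1:-xhci-logs}"
    mkdir -p "$out_dir" || return 1
    # dmesg may be restricted to root on some systems.
    dmesg > "$out_dir/dmesg.txt" 2>&1 || echo "dmesg failed; try as root" >&2
    cat "$TRACE_FILE" > "$out_dir/trace.txt" || return 1
}
```

Running check_trace_param first confirms the boot parameter took effect; after the
issue is seen, collect_xhci_logs leaves dmesg.txt and trace.txt in the chosen directory,
ready to attach to a reply.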

Thanks
-Mathias