RE: [PATCH 00/24] Thunderbolt security levels and NVM firmware upgrade

From: Mario.Limonciello
Date: Wed May 24 2017 - 15:06:43 EST


> -----Original Message-----
> From: Mika Westerberg [mailto:mika.westerberg@xxxxxxxxxxxxxxx]
> Sent: Wednesday, May 24, 2017 6:11 AM
> To: Limonciello, Mario <Mario_Limonciello@xxxxxxxx>
> Cc: gregkh@xxxxxxxxxxxxxxxxxxx; andreas.noever@xxxxxxxxx;
> michael.jamet@xxxxxxxxx; yehezkel.bernat@xxxxxxxxx; lukas@xxxxxxxxx;
> amir.jer.levy@xxxxxxxxx; luto@xxxxxxxxxx; Dominguez, Jared
> <Jared_Dominguez@xxxxxxxx>; andriy.shevchenko@xxxxxxxxxxxxxxx; linux-
> kernel@xxxxxxxxxxxxxxx
> Subject: Re: [PATCH 00/24] Thunderbolt security levels and NVM firmware upgrade
>
> On Tue, May 23, 2017 at 05:30:43PM +0000, Mario.Limonciello@xxxxxxxx wrote:
> > (Sorry my email client is not going to wrap these at 80 columns)
>
> That's fine. It is more readable this way :)
>
> > [ 0.467319] pci 0000:00:1c.0: [8086:9d10] type 01 class 0x060400
> > [ 0.467389] pci 0000:00:1c.0: PME# supported from D0 D3hot D3cold
> > [ 0.467513] pci 0000:00:1c.0: System wakeup disabled by ACPI
>
> [...]
>
> > [ 0.469363] pci 0000:01:00.0: [8086:1576] type 01 class 0x060400
> > [ 0.469483] pci 0000:01:00.0: supports D1 D2
> > [ 0.469484] pci 0000:01:00.0: PME# supported from D0 D1 D2 D3hot D3cold
> > [ 0.469570] pci 0000:01:00.0: System wakeup disabled by ACPI
> > [ 0.469609] pci 0000:00:1c.0: PCI bridge to [bus 01-39]
> > [ 0.469614] pci 0000:00:1c.0: bridge window [mem 0xc4000000-0xda0fffff]
> > [ 0.469618] pci 0000:00:1c.0: bridge window [mem 0xa0000000-0xc1ffffff 64bit pref]
> > [ 0.469621] pci 0000:01:00.0: bridge configuration invalid ([bus 00-00]), reconfiguring
>
> This is the problem. Here the PCIe upstream port (0000:01:00.0) is
> visible to Linux but it is not fully configured by the BIOS ->
> (primary/secondary/subordinate) is set to 0.

So at least for me, the other difference in a successful run (where you plug
in after boot instead) is that the bridge shows up as:
PCI bridge to [bus 02-39]

Same bridge window though.
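
For anyone following along: the "[bus 00-00]" in the "bridge configuration
invalid" message refers to the primary/secondary/subordinate bus number
registers of the bridge, which sit at config space offsets 0x18 to 0x1a in a
type 1 header. Below is a minimal Python sketch of decoding that dword and
applying a sanity check roughly along the lines of what the kernel seems to
do. The function names are made up for illustration and the check is a
simplification, not the actual kernel code.

```python
def decode_bus_numbers(reg):
    """Split the dword read at config offset 0x18 into bus numbers.

    Byte 0 = primary, byte 1 = secondary, byte 2 = subordinate
    (byte 3 is the secondary latency timer and is ignored here).
    """
    primary = reg & 0xFF
    secondary = (reg >> 8) & 0xFF
    subordinate = (reg >> 16) & 0xFF
    return primary, secondary, subordinate


def bridge_config_looks_valid(reg, parent_bus):
    """Simplified sanity check (an assumption, not the kernel's exact test).

    The bridge must name the bus it sits on as primary, and its
    secondary..subordinate range must lie strictly below that bus.
    """
    primary, secondary, subordinate = decode_bus_numbers(reg)
    return (primary == parent_bus
            and secondary > parent_bus
            and secondary <= subordinate)


# 0000:01:00.0 as left by the BIOS in the failing boot: all zeros.
print(bridge_config_looks_valid(0x00000000, parent_bus=0x01))
# The same bridge configured as primary 01, secondary 02, subordinate 39.
print(bridge_config_looks_valid(0x00390201, parent_bus=0x01))
```

With all three fields at 0 the check fails, which would match the
"reconfiguring" path seen in the log.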

>
> At this point Linux decides to configure the port itself and goes wrong
> since our allocation strategy tries to keep resource windows, including
> reserved buses as small as possible so that everything we currently find
> barely fits there.
>
> This continues a few lines below:
>
> > [ 0.469670] pci_bus 0000:02: busn_res: can not insert [bus 02-ff] under [bus 01-39] (conflicts with (null) [bus 01-39])
> > [ 0.469688] pci 0000:02:00.0: [8086:1576] type 01 class 0x060400
> > [ 0.469809] pci 0000:02:00.0: supports D1 D2
> > [ 0.469810] pci 0000:02:00.0: PME# supported from D0 D1 D2 D3hot D3cold
> > [ 0.469877] pci 0000:02:01.0: [8086:1576] type 01 class 0x060400
> > [ 0.470000] pci 0000:02:01.0: supports D1 D2
> > [ 0.470001] pci 0000:02:01.0: PME# supported from D0 D1 D2 D3hot D3cold
> > [ 0.470067] pci 0000:02:02.0: [8086:1576] type 01 class 0x060400
> > [ 0.470188] pci 0000:02:02.0: supports D1 D2
> > [ 0.470189] pci 0000:02:02.0: PME# supported from D0 D1 D2 D3hot D3cold
> > [ 0.470277] pci 0000:01:00.0: PCI bridge to [bus 02-ff]
> > [ 0.470283] pci 0000:01:00.0: bridge window [io 0x0000-0x0fff]
> > [ 0.470287] pci 0000:01:00.0: bridge window [mem 0x00000000-0x000fffff]
> > [ 0.470294] pci 0000:01:00.0: bridge window [mem 0x00000000-0x000fffff 64bit pref]
> > [ 0.470296] pci 0000:02:00.0: bridge configuration invalid ([bus 00-00]), reconfiguring
> > [ 0.470304] pci 0000:02:01.0: bridge configuration invalid ([bus 00-00]), reconfiguring
> > [ 0.470312] pci 0000:02:02.0: bridge configuration invalid ([bus 00-00]), reconfiguring
>
> Here.
>
> And ends up in failure when we create PCIe tunnels later on.

For what it's worth, the XPS 9365, which has a different BIOS core, shows
exactly the same behavior on Linux when booted with the TBT dock plugged in.
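
On the busn_res message quoted above: my reading is that the requested child
range [bus 02-ff] simply does not fit inside the parent range [bus 01-39].
Here is a tiny Python sketch of that containment test, as an illustration
only. The helper name is hypothetical and the kernel's resource code does
more than this.

```python
def fits_under(parent, child):
    """Return True if the child bus range lies entirely within the parent.

    Both ranges are (start, end) tuples with inclusive bounds.
    """
    parent_start, parent_end = parent
    child_start, child_end = child
    return parent_start <= child_start and child_end <= parent_end


# Range requested for bus 02 in the failing boot vs. the parent [bus 01-39].
print(fits_under((0x01, 0x39), (0x02, 0xFF)))
# A request clamped to the parent's subordinate bus number would fit.
print(fits_under((0x01, 0x39), (0x02, 0x39)))
```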

>
> Now, this is probably where Windows does something else, like it may
> skip the re-configuring phase, which could explain why it works. However,
> to me this looks pretty much like a bug in the BIOS/firmware, as we are
> expecting the BIOS to configure the PCIe devices properly before the OS
> is sent the ACPI hotplug event.
>

I'll reach out to the BIOS guys to see if they can give some more comments
from their perspective.

I came across something interesting while browsing MSDN about this topic.
It hasn't been updated in a long time, but I think it is still a relevant
indication of the approach Windows was taking, and of why the firmware
is this way and expects the OS to reconfigure.

"The BIOS cannot preconfigure PCI-to-PCI (P2P) bridges on adapters during
hot plug. Consequently, the operating system assigns resource windows of
a default size to a bridge.

* I/O window. The default size for the I/O window is 4 KB in Windows 2000,
  Windows XP, and Windows Server 2003.
* Memory window. The configuration for the memory window differs for
  Windows 2000, Windows XP, and Windows Server 2003:
  - For Windows 2000, the default size for the memory window is 2 MB.
  - For Windows XP and Windows Server 2003, the operating system
    first attempts to find a memory window of 32 MB. If it cannot find a
    window of that size, the operating system attempts to find a memory
    window of progressively smaller sizes (16, 8, 4, 2, and finally 1 MB)
    until it finds a size that works."
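
To make the quoted fallback concrete, here is a minimal Python sketch of
"try 32 MB, then progressively smaller sizes". It only illustrates the
strategy the MSDN text describes. The function names, the free-range list,
and the 1 MB alignment are my assumptions, not anything taken from Windows
or from the Linux PCI code.

```python
MB = 1 << 20

# Sizes from the MSDN text for Windows XP / Server 2003, largest first.
FALLBACK_SIZES_MB = (32, 16, 8, 4, 2, 1)


def find_aligned_slot(free_ranges, size, align=MB):
    """Return the first aligned base where `size` bytes fit, else None.

    free_ranges is a list of (start, end) tuples with inclusive bounds.
    The 1 MB alignment is an assumption for this sketch.
    """
    for start, end in free_ranges:
        base = (start + align - 1) // align * align  # round up to alignment
        if base + size - 1 <= end:
            return base
    return None


def pick_memory_window(free_ranges):
    """Try each fallback size in turn; return (base, size) or None."""
    for size_mb in FALLBACK_SIZES_MB:
        size = size_mb * MB
        base = find_aligned_slot(free_ranges, size)
        if base is not None:
            return base, size
    return None


# Using the root port's non-prefetchable window from the log as free space.
base, size = pick_memory_window([(0xC4000000, 0xDA0FFFFF)])
print(hex(base), size // MB)
```

If only a 5 MB hole were free, the same loop would fall through 32, 16 and
8 MB and settle on a 4 MB window.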

> We need to handle this in Linux in the same way Windows does but
> currently I have no idea. It is however, more related to our PCI
> enumeration code than the patches in question, I think.
>

Come to think of it, I have seen the dock have trouble when plugged in at
boot on Linux, even with SL0, before this patch series.

> I also have a Dell 9350 here so I can reproduce the problem and I'm
> going to investigate this further probably involving Linux PCI people.

To clarify, are you reproducing it with a TB16 or some other TBT device?

>
> My testing on the machine shows this behaviour only when the cable is
> connected during boot.

Yep same.

>
> If I connect the cable after OS is booted I don't see the problem, even
> if I do unplug / plug cycle.
>
> Can you try that also (again)? And if you see the problem, send me the
> dmesg? I have the latest BIOS (1.4.17) and NVM 16 so this machine
> configuration should match yours if I'm not mistaken.

It does work properly if I boot with no cable plugged in and then plug one in.