Re: [PATCH 00/24] Thunderbolt security levels and NVM firmware upgrade

From: mika.westerberg@xxxxxxxxxxxxxxx
Date: Thu May 25 2017 - 08:03:19 EST


On Thu, May 25, 2017 at 11:04:08AM +0300, mika.westerberg@xxxxxxxxxxxxxxx wrote:
> On Thu, May 25, 2017 at 10:20:10AM +0300, mika.westerberg@xxxxxxxxxxxxxxx wrote:
> > On Wed, May 24, 2017 at 07:32:45PM +0000, Jamet, Michael wrote:
> > > I talked to our BIOS expert today. Here is his advice to debugging further:
> > >
> > > It looks like something may have been wrong from system (BIOS, FW, others...) perspective.
> > > On reboot need to enter EFI shell and check resources of
> > > pci 0000:01:00.0: bridge.
> > > At the EFI shell, this bridge MUST be either configured or absent.
> > >
> > > I would start this way, once we have this info, we may circle back to
> > > him and look into next debugging step.
> >
> > Thanks, I'll try this today.
>
>
> This is the contents dumped directly from EFI shell when a device is
> connected. It seems that the vendor_id/device_id is 0xffff but the rest
> of the config seems to be present (although not fully configured):
>
> PCI Segment 00 Bus 01 Device 00 Func 00 [EFI 0001000000]
> 00000000: FF FF FF FF 00 00 10 00-00 00 04 06 00 00 01 00 *................*
> 00000010: 00 00 00 00 00 00 00 00-00 00 00 00 01 01 00 00 *................*
> 00000020: 00 00 00 00 01 00 01 00-00 00 00 00 00 00 00 00 *................*
> 00000030: 00 00 00 00 80 00 00 00-00 00 00 00 FF 01 00 00 *................*
> 00000040: 00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00 *................*
> 00000050: 00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00 *................*
> 00000060: 00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00 *................*
> 00000070: 00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00 *................*
> 00000080: 01 88 C3 FF 08 00 00 00-05 AC 80 00 00 00 00 00 *................*
> 00000090: 00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00 *................*
> 000000A0: 00 00 00 00 00 00 00 00-00 00 00 00 0D C0 00 00 *................*
> 000000B0: 22 22 11 11 00 00 00 00-00 00 00 00 00 00 00 00 *""..............*
> 000000C0: 10 00 52 00 20 80 E8 07-10 28 10 00 43 5C 45 00 *..R. ....(..C\E.*
> 000000D0: 00 00 23 10 00 00 00 00-00 00 00 00 00 00 00 00 *..#.............*
> 000000E0: 00 00 00 00 00 08 00 00-00 00 00 00 0E 00 00 00 *................*
> 000000F0: 03 00 1E 00 00 00 00 00-00 00 00 00 00 00 00 00 *................*
>
> I wonder how Linux manages to find the device if vendor_id/device_id
> reads 0xffff?

OK, here's the explanation.

When Linux initializes ACPI (this happens before the initial PCI scan), it
calls acpi_initialize_objects(). This in turn causes the _INI methods of
devices to be executed. Now, _SB.PCI0._INI() ends up calling
\_GPE.TINI(), which executes the Thunderbolt-specific OSUP() method. The
purpose of this method is to overwrite vendor_id/device_id with the
correct values, on the assumption that the OS has already done the
initial PCI scan.

In the case of Linux this is not true, which is why the upstream port
is found half-initialized, leading to the failure.
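
For illustration, here is a minimal Python sketch that decodes the first
64 bytes of the EFI shell dump quoted above, to show what
"half-initialized" looks like in practice. The field offsets are the
standard PCI type 1 (bridge) header layout; the names HEADER_HEX and
decode_header() are made up for this example and are not part of any
kernel or EFI tool.

```python
import struct

# First 64 bytes of config space for 0000:01:00.0, copied from the
# EFI shell dump above (offsets 0x00-0x3F).
HEADER_HEX = """
FF FF FF FF 00 00 10 00 00 00 04 06 00 00 01 00
00 00 00 00 00 00 00 00 00 00 00 00 01 01 00 00
00 00 00 00 01 00 01 00 00 00 00 00 00 00 00 00
00 00 00 00 80 00 00 00 00 00 00 00 FF 01 00 00
"""


def decode_header(hex_text):
    """Decode a few fields of a PCI type 1 header (hypothetical helper)."""
    # Drop all whitespace before converting the hex digits to bytes.
    cfg = bytes.fromhex("".join(hex_text.split()))
    vendor, device, command, status = struct.unpack_from("<HHHH", cfg, 0x00)
    revision, prog_if, subclass, base_class = struct.unpack_from("<BBBB", cfg, 0x08)
    primary, secondary, subordinate = struct.unpack_from("<BBB", cfg, 0x18)
    return {
        "vendor_id": vendor,
        "device_id": device,
        "command": command,
        "status": status,
        "base_class": base_class,
        "subclass": subclass,
        "header_type": cfg[0x0E] & 0x7F,  # bit 7 is the multi-function flag
        "primary_bus": primary,
        "secondary_bus": secondary,
        "subordinate_bus": subordinate,
        "cap_pointer": cfg[0x34],
    }


if __name__ == "__main__":
    for name, value in decode_header(HEADER_HEX).items():
        print(f"{name:16s} 0x{value:x}")
```

With the dump above this reports vendor_id and device_id as 0xffff,
while the class code (0x06/0x04, PCI-to-PCI bridge), header type 1 and
the capabilities pointer (0x80) are all present. The bus number
registers read as zero, which is consistent with the bridge not being
fully configured at that point.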