Re: PROBLEM: 4.15.0-rc3 APIC causes lockups on Core 2 Duo laptop

From: Alexandru Chirvasitu
Date: Thu Dec 28 2017 - 19:38:13 EST


On Thu, Dec 28, 2017 at 06:15:19PM -0600, Bjorn Helgaas wrote:
> On Thu, Dec 28, 2017 at 06:30:58PM -0500, Alexandru Chirvasitu wrote:
> > Attached, but heads up on this: when redirecting the output of lspci
> > -vvv to a text file as root I get
> >
> > pcilib: sysfs_read_vpd: read failed: Input/output error
> >
> > I can find bugs filed for various distros to this same effect, but
> > haven't tracked down any explanations.
>
> This is a tangent, but I think you should *always* see "Input/output
> error" on this system when running "lspci -vvv" as root, regardless of
> whether you redirect the output (the error probably goes to stderr,
> not stdout, so it's probably easy to miss when not redirecting the
> output).
>
> I think this is the -EIO return from pci_vpd_read(), which probably
> means pci_vpd_size() returned 0 for one of your devices, which means
> the VPD data provided by the device wasn't formatted correctly. If
> this happens, you should see a warning in dmesg about it ("invalid VPD
> tag" or similar) -- could you verify that?
>

This is in dmesg:

pci 0000:06:00.0: [Firmware Bug]: disabling VPD access (can't determine size of non-standard VPD format)

So yes, it looks like you pinned it down. There are no other VPD
messages in dmesg.

And yes, the error does seem to always be present. I see it with

lspci -vvv 2>&1 | grep pcilib

so it was there in stderr all along.
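
For anyone following along, here is a minimal sketch of why the
message is easy to miss. It uses a stand-in function (fake_lspci, a
hypothetical name, not the real tool) that prints a listing on stdout
and the pcilib warning on stderr, so it runs on any machine:

```shell
#!/bin/sh
# Stand-in for "lspci -vvv" (hypothetical helper, not part of pciutils):
# the device listing goes to stdout, the pcilib warning goes to stderr.
fake_lspci() {
    echo "06:00.0 Ethernet controller: Qualcomm Atheros Attansic L2 Fast Ethernet"
    echo "pcilib: sysfs_read_vpd: read failed: Input/output error" >&2
}

# Capture stdout only; stderr is discarded, so the warning is absent.
listing=$(fake_lspci 2>/dev/null)

# Capture stderr only: first point stderr at the capture pipe,
# then send stdout to /dev/null.
warnings=$(fake_lspci 2>&1 >/dev/null)

printf 'stdout: %s\n' "$listing"
printf 'stderr: %s\n' "$warnings"
```

With the real tool, "lspci -vvv > file" would redirect only stdout, so
the warning still lands on the terminal, where it stands out more than
it does when mixed into the full listing.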

> It's possible we should return something other than -EIO, or maybe
> pcilib should do something other than emitting the warning. In
> pcilib, sysfs_read_vpd() emits the warning [1], and it would seem sort
> of ugly to special-case EIO, so maybe we should change this in the
> kernel.
>
> It looks like your Qualcomm Atheros Attansic NIC at 06:00.0 is the
> only device with VPD, so that's probably the one:
>
>   06:00.0 Ethernet controller: Qualcomm Atheros Attansic L2 Fast Ethernet
>       Capabilities: [6c] Vital Product Data
>           Not readable
>
> I think lspci would still print "Not readable" if we just made the
> kernel return 0 instead of -EIO [2].
>
> Bjorn
>
> [1] https://git.kernel.org/pub/scm/utils/pciutils/pciutils.git/tree/lib/sysfs.c#n410
> [2] https://git.kernel.org/pub/scm/utils/pciutils/pciutils.git/tree/ls-vpd.c#n87