Re: [PATCH v1 1/2] PCI/AER: Decode Error Source Requester ID

From: Bjorn Helgaas
Date: Thu May 31 2018 - 00:42:24 EST


On Wed, May 30, 2018 at 11:41:23AM -0700, Rajat Jain wrote:
> On Wed, May 30, 2018 at 10:54 AM Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote:
>
> > From: Bjorn Helgaas <bhelgaas@xxxxxxxxxx>
>
> > Decode the Requester ID from the AER Error Source Register into domain/
> > bus/device/function format to match other logging. In cases where the ID
> > matches the device used for pci_err(), drop the extra ID completely so we
> > don't print it twice.
>
> > Signed-off-by: Bjorn Helgaas <bhelgaas@xxxxxxxxxx>
> > ---
> > drivers/pci/pcie/aer/aerdrv_errprint.c | 18 +++++++++++-------
> > 1 file changed, 11 insertions(+), 7 deletions(-)
>
> > diff --git a/drivers/pci/pcie/aer/aerdrv_errprint.c b/drivers/pci/pcie/aer/aerdrv_errprint.c
> > index 21ca5e1b0ded..d7fde8368d81 100644
> > --- a/drivers/pci/pcie/aer/aerdrv_errprint.c
> > +++ b/drivers/pci/pcie/aer/aerdrv_errprint.c
> > @@ -163,17 +163,17 @@ void aer_print_error(struct pci_dev *dev, struct aer_err_info *info)
> > int id = ((dev->bus->number << 8) | dev->devfn);
>
> > if (!info->status) {
> > - pci_err(dev, "PCIe Bus Error: severity=%s, type=Unaccessible, id=%04x(Unregistered Agent ID)\n",
> > - aer_error_severity_string[info->severity], id);
> > + pci_err(dev, "PCIe Bus Error: severity=%s, type=Inaccessible, (Unregistered Agent ID)\n",
> > + aer_error_severity_string[info->severity]);
>
> Does this code path indicate that a requester id was decoded to a device
> that is not registered with the kernel? If so, shouldn't we log the bad
> requester ID for better debugging, specifically since there is not going to
> be any subsequent print about this ID (since we return from this function
> in this case)?

Previously we printed "id", which was constructed above from "dev":

  id = ((dev->bus->number << 8) | dev->devfn);

so even if we print "id=%04x", it contains exactly the same
information as the bus/device/function printed using "dev".
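
To make the redundancy concrete, here is a minimal userspace sketch (not
kernel code). The sketch_* names are made up for this example; the two
macros are local copies that mirror the kernel's PCI_SLOT() and
PCI_FUNC(). It packs a bus number and devfn the same way "id" is built
and then unpacks it again into bus:device.function form:

```c
#include <stdint.h>
#include <stdio.h>

/* Local copies mirroring the kernel's PCI_SLOT()/PCI_FUNC() helpers. */
#define SKETCH_PCI_SLOT(devfn)	(((devfn) >> 3) & 0x1f)
#define SKETCH_PCI_FUNC(devfn)	((devfn) & 0x07)

/* Same packing as "id" in aer_print_error(): bus in bits 15:8, devfn in 7:0. */
static uint16_t sketch_make_id(uint8_t bus, uint8_t devfn)
{
	return (uint16_t)((bus << 8) | devfn);
}

/* Hypothetical helper: unpack an id into "bb:dd.f" (no domain). */
static int sketch_format_id(uint16_t id, char *buf, size_t len)
{
	uint8_t bus = id >> 8;
	uint8_t devfn = id & 0xff;

	return snprintf(buf, len, "%02x:%02x.%d", bus,
			SKETCH_PCI_SLOT(devfn), SKETCH_PCI_FUNC(devfn));
}
```

For a device at bus 0x00, devfn 0xe0, the packed id is 0x00e0 and it
unpacks to "00:1c.0", which is what the pci_err() prefix already shows
(apart from the domain, which the sketch leaves out).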

So no, I don't think "Unregistered Agent ID" means a device not registered
with the kernel. At any rate, we do have a pci_dev for it.

I *think* "info->status == 0" means PCI_ERR_COR_STATUS (or
PCI_ERR_UNCOR_STATUS) was zero, i.e., we didn't find any error status
bits set for this device. I don't think "Unregistered Agent ID" is a
very good description of this situation.
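
If that reading is right (and it is only a guess), the check amounts to
"no bits set in the status word". A small userspace sketch of that
idea, assuming "status" holds the raw register value; the function name
is hypothetical and not part of the driver:

```c
#include <stdint.h>

/*
 * Hypothetical helper: record the positions of all set bits in a 32-bit
 * status word and return how many were found.  A return value of 0
 * corresponds to the "!info->status" case discussed above.
 */
static int sketch_collect_status_bits(uint32_t status, int bits[32])
{
	int n = 0;

	for (int i = 0; i < 32; i++) {
		if (status & (1u << i))
			bits[n++] = i;
	}
	return n;
}
```

With status == 0 the helper finds nothing to report, which is a
different situation from having an ID that cannot be matched to a
device.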

Bjorn