Re: [PATCH] PCI/AER: Do not clear AER bits if we don't own AER

From: Bjorn Helgaas
Date: Thu Aug 09 2018 - 10:15:55 EST


On Tue, Jul 17, 2018 at 10:31:23AM -0500, Alexandru Gagniuc wrote:
> When we don't own AER, we shouldn't touch the AER error bits. This
> happens unconditionally on device probe(). Clearing AER bits
> willy-nilly might cause firmware to miss errors. Instead
> these bits should get cleared by FFS, or via ACPI _HPX method.
>
> This race is mostly of theoretical significance, as it is not easy to
> reasonably demonstrate it in testing.
>
> Signed-off-by: Alexandru Gagniuc <mr.nuke.me@xxxxxxxxx>
> ---
> drivers/pci/pcie/aer.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> index a2e88386af28..18037a2a8231 100644
> --- a/drivers/pci/pcie/aer.c
> +++ b/drivers/pci/pcie/aer.c
> @@ -383,6 +383,9 @@ int pci_cleanup_aer_error_status_regs(struct pci_dev *dev)
>  	if (!pci_is_pcie(dev))
>  		return -ENODEV;
>
> +	if (pcie_aer_get_firmware_first(dev))
> +		return -EIO;

I like this patch.

Do we need the same thing in the following places that also clear AER
status bits or write AER control bits?

  enable_ecrc_checking()
  disable_ecrc_checking()
  pci_cleanup_aer_uncorrect_error_status()
  pci_aer_clear_fatal_status()
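
To make the question concrete, here is a rough userspace sketch of what a
shared guard might look like if it were applied to more than one of these
helpers. It is not kernel code: struct mock_dev, mock_aer_check_owned() and
the mock_* functions are hypothetical stand-ins, the firmware_first field
merely stands in for pcie_aer_get_firmware_first(), and the check order
simply mirrors the hunk above.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace mock for illustration only; NOT the kernel's struct pci_dev. */
struct mock_dev {
	bool is_pcie;
	bool firmware_first;	/* stands in for pcie_aer_get_firmware_first() */
	int aer_cap;		/* 0 means no AER capability was found */
	uint32_t uncor_status;	/* mock uncorrectable error status */
	uint32_t cor_status;	/* mock correctable error status */
};

/*
 * Hypothetical shared precondition check, in the same order as the patch:
 * not PCIe -> -ENODEV, firmware owns AER -> -EIO, no AER capability -> -EIO.
 */
static int mock_aer_check_owned(const struct mock_dev *dev)
{
	if (!dev->is_pcie)
		return -ENODEV;
	if (dev->firmware_first)
		return -EIO;
	if (!dev->aer_cap)
		return -EIO;
	return 0;
}

/* Mock of a "clear all AER status" helper using the shared guard. */
static int mock_cleanup_aer_error_status_regs(struct mock_dev *dev)
{
	int rc = mock_aer_check_owned(dev);

	if (rc)
		return rc;

	/* The mock just zeroes the fields to model clearing the status bits. */
	dev->cor_status = 0;
	dev->uncor_status = 0;
	return 0;
}

/* Mock of an "uncorrectable status only" helper with the same guard. */
static int mock_cleanup_aer_uncorrect_error_status(struct mock_dev *dev)
{
	int rc = mock_aer_check_owned(dev);

	if (rc)
		return rc;

	dev->uncor_status = 0;
	return 0;
}
```

With firmware_first set, both mock helpers return -EIO and leave the status
fields untouched. Whether the real helpers listed above should behave this
way is exactly the open question.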

>  	pos = dev->aer_cap;
>  	if (!pos)
>  		return -EIO;
> --
> 2.14.3
>