Re: [PATCH v2] PCI/AER: Do not clear AER bits if we don't own AER

From: Alex G.
Date: Tue Jul 24 2018 - 11:59:37 EST




On 07/23/2018 11:52 AM, Alexandru Gagniuc wrote:
When we don't own AER, we shouldn't touch the AER error bits. Clearing
error bits willy-nilly might cause firmware to miss some errors. In
theory, these bits get cleared by FFS, or via ACPI _HPX method. These
mechanisms are not subject to the problem.

This race is mostly of theoretical significance, since I can't
reasonably demonstrate this race in the lab.

On a side-note, pcie_aer_is_kernel_first() is created to alleviate the
need for two checks: aer_cap and get_firmware_first().

Signed-off-by: Alexandru Gagniuc <mr.nuke.me@xxxxxxxxx>
---
drivers/pci/pcie/aer.c | 17 ++++++++++-------
1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
index a2e88386af28..85c3e173c025 100644
--- a/drivers/pci/pcie/aer.c
+++ b/drivers/pci/pcie/aer.c
@@ -307,6 +307,12 @@ int pcie_aer_get_firmware_first(struct pci_dev *dev)
aer_set_firmware_first(dev);
return dev->__aer_firmware_first;
}
+
+static bool pcie_aer_is_kernel_first(struct pci_dev *dev)
+{
+ return !!dev->aer_cap && !pcie_aer_get_firmware_first(dev);
+}
+
#define PCI_EXP_AER_FLAGS (PCI_EXP_DEVCTL_CERE | PCI_EXP_DEVCTL_NFERE | \
PCI_EXP_DEVCTL_FERE | PCI_EXP_DEVCTL_URRE)
@@ -337,10 +343,7 @@ bool aer_acpi_firmware_first(void)
int pci_enable_pcie_error_reporting(struct pci_dev *dev)
{
- if (pcie_aer_get_firmware_first(dev))
- return -EIO;
-
- if (!dev->aer_cap)
+ if (!pcie_aer_is_kernel_first(dev))
return -EIO;
return pcie_capability_set_word(dev, PCI_EXP_DEVCTL, PCI_EXP_AER_FLAGS);
@@ -349,7 +352,7 @@ EXPORT_SYMBOL_GPL(pci_enable_pcie_error_reporting);
int pci_disable_pcie_error_reporting(struct pci_dev *dev)
{
- if (pcie_aer_get_firmware_first(dev))
+ if (!pcie_aer_is_kernel_first(dev))
return -EIO;
return pcie_capability_clear_word(dev, PCI_EXP_DEVCTL,
@@ -383,10 +386,10 @@ int pci_cleanup_aer_error_status_regs(struct pci_dev *dev)
if (!pci_is_pcie(dev))
return -ENODEV;
- pos = dev->aer_cap;
- if (!pos)
+ if (pcie_aer_is_kernel_first(dev))

This here is missing a '!'. It's in my local branch, so I must have exported the patch before I fixed that. I'll get that fixed next rev.

return -EIO;
+ pos = dev->aer_cap;
port_type = pci_pcie_type(dev);
if (port_type == PCI_EXP_TYPE_ROOT_PORT) {
pci_read_config_dword(dev, pos + PCI_ERR_ROOT_STATUS, &status);