Re: [patch 7/7] PCI/AER: Fix the broken interrupt injection
From: Kuppuswamy Sathyanarayanan
Date: Fri Mar 06 2020 - 14:32:04 EST
On 3/6/20 10:32 AM, Kuppuswamy Sathyanarayanan wrote:
On 3/6/20 5:03 AM, Thomas Gleixner wrote:
The AER error injection mechanism just blindly abuses
generic_handle_irq()
which is really not meant for consumption by random drivers. The
include of
linux/irq.h should have been a red flag in the first place. Driver code,
unless implementing interrupt chips or low level hypervisor
functionality
has absolutely no business with that.
Invoking generic_handle_irq() from non interrupt handling context can
have
nasty side effects at least on x86 due to the hardware trainwreck which
makes interrupt affinity changes a fragile beast. Sathyanarayanan
triggered
a NULL pointer dereference in the low level APIC code that way. While
the
particular pointer could be checked this would only paper over the issue
because there are other ways to trigger warnings or silently corrupt
state.
Invoke the new irq_inject_interrupt() mechanism, which has the necessary
sanity checks in place and injects the interrupt via the irq_retrigger()
mechanism, which is at least halfways safe vs. the fragile x86 affinity
change mechanics.
It's safe on x86 as it does not corrupt state, but it still can cause a
premature completion of an interrupt affinity change causing the
interrupt
line to become stale. Very unlikely, but possible.
For regular operations this is a non issue as AER error injection is
meant
for debugging and testing and not for usage on production systems.
People
using this should better know what they are doing.
It looks good to me.
Reviewed-by: Kuppuswamy Sathyanarayanan
<sathyanarayanan.kuppuswamy@xxxxxxxxxxxxxxx>
Tested-by: Kuppuswamy Sathyanarayanan
<sathyanarayanan.kuppuswamy@xxxxxxxxxxxxxxx>
Fixes: 390e2db82480 ("PCI/AER: Abstract AER interrupt handling")
This patch is merged in v4.20 kernel. So this fix could be a candidate
for stable fix.
Reported-by: sathyanarayanan.kuppuswamy@xxxxxxxxxxxxxxx
Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
---
 drivers/pci/pcie/Kconfig | 1 +
 drivers/pci/pcie/aer_inject.c | 6 ++----
 2 files changed, 3 insertions(+), 4 deletions(-)
--- a/drivers/pci/pcie/Kconfig
+++ b/drivers/pci/pcie/Kconfig
@@ -34,6 +34,7 @@ config PCIEAER
 config PCIEAER_INJECT
ÂÂÂÂÂ tristate "PCI Express error injection support"
ÂÂÂÂÂ depends on PCIEAER
+ÂÂÂ select GENERIC_IRQ_INJECTION
ÂÂÂÂÂ help
ÂÂÂÂÂÂÂ This enables PCI Express Root Port Advanced Error Reporting
ÂÂÂÂÂÂÂ (AER) software error injector.
--- a/drivers/pci/pcie/aer_inject.c
+++ b/drivers/pci/pcie/aer_inject.c
@@ -16,7 +16,7 @@
  #include <linux/module.h>
 #include <linux/init.h>
-#include <linux/irq.h>
+#include <linux/interrupt.h>
 #include <linux/miscdevice.h>
 #include <linux/pci.h>
 #include <linux/slab.h>
@@ -468,9 +468,7 @@ static int aer_inject(struct aer_error_i
ÂÂÂÂÂÂÂÂÂ }
ÂÂÂÂÂÂÂÂÂ pci_info(edev->port, "Injecting errors %08x/%08x into
device %s\n",
ÂÂÂÂÂÂÂÂÂÂÂÂÂÂ einj->cor_status, einj->uncor_status, pci_name(dev));
-ÂÂÂÂÂÂÂ local_irq_disable();
-ÂÂÂÂÂÂÂ generic_handle_irq(edev->irq);
-ÂÂÂÂÂÂÂ local_irq_enable();
+ÂÂÂÂÂÂÂ ret = irq_inject_interrupt(edev->irq);
ÂÂÂÂÂ } else {
ÂÂÂÂÂÂÂÂÂ pci_err(rpdev, "AER device not found\n");
ÂÂÂÂÂÂÂÂÂ ret = -ENODEV;
--
Sathyanarayanan Kuppuswamy
Linux kernel developer