[PATCH v6 2/3] genirq: introduce irq_update_devid()

From: Ben Luo
Date: Mon Sep 02 2019 - 00:13:41 EST


Sometimes, only the dev_id field of irqaction needs to be changed.

For example, a KVM guest with a device passed through via VFIO may
switch the interrupt injection path between the KVM irqfd and a
userspace eventfd. Both paths share the same interrupt number and
handler for a given MSI vector of the device; they differ only in the
'dev_id', which references the context of a different fd. Setting the
interrupt affinity inside the guest is one way to trigger the switch.

Currently, VFIO switches paths by calling free_irq() and then
request_irq(). Between the two calls there is a time window in which
the target IRTE is invalid. In-flight interrupts (buffered in hardware,
where software cannot synchronize them) can therefore cause DMAR
faults and, even worse, the VM may hang waiting for I/O completion.

irq_update_devid() avoids this issue because the IRTE is never
invalidated during the update.

Signed-off-by: Ben Luo <luoben@xxxxxxxxxxxxxxxxx>
---
include/linux/interrupt.h | 3 ++
kernel/irq/manage.c | 75 +++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 78 insertions(+)
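
For readers who want to poke at the lookup-and-swap step outside the
kernel, here is a minimal userspace sketch. It is not kernel code:
struct model_action, model_update_devid() and model_selftest() are
hypothetical names invented for this illustration, and the desc->lock
and disable_irq()/enable_irq() serialization done by the real function
is deliberately left out because the model is single-threaded.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct irqaction: only the fields the walk needs. */
struct model_action {
	void *dev_id;
	struct model_action *next;
};

/*
 * Walk the action list and replace the first matching cookie.
 * Returns 0 on success, -1 if no action carries dev_id.
 * (The patch itself returns -ENXIO in the not-found case.)
 */
static int model_update_devid(struct model_action *head,
			      void *dev_id, void *new_dev_id)
{
	struct model_action *action;

	for (action = head; action; action = action->next) {
		if (action->dev_id == dev_id) {
			action->dev_id = new_dev_id;
			return 0;
		}
	}
	return -1;
}

/* Two actions share one line; only the first one's cookie is swapped. */
static int model_selftest(void)
{
	int ctx_irqfd, ctx_eventfd, ctx_other;
	struct model_action second = { &ctx_other, NULL };
	struct model_action first = { &ctx_irqfd, &second };

	/* Swap the irqfd context for the eventfd context. */
	assert(model_update_devid(&first, &ctx_irqfd, &ctx_eventfd) == 0);
	assert(first.dev_id == &ctx_eventfd);
	/* The other action on the shared line is untouched. */
	assert(second.dev_id == &ctx_other);
	/* The old cookie no longer matches any action. */
	assert(model_update_devid(&first, &ctx_irqfd, &ctx_eventfd) == -1);
	return 0;
}
```

The model only shows that the action stays registered while its cookie
changes. Under that assumption, it mirrors why the real function never
needs to tear down and re-create the interrupt.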

diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
index 5b8328a..09b6a0f 100644
--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -172,6 +172,9 @@ struct irqaction {
request_percpu_nmi(unsigned int irq, irq_handler_t handler,
const char *devname, void __percpu *dev);

+extern int __must_check
+irq_update_devid(unsigned int irq, void *dev_id, void *new_dev_id);
+
extern const void *free_irq(unsigned int, void *);
extern void free_percpu_irq(unsigned int, void __percpu *);

diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
index 10ec3e9..adb1980 100644
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -2063,6 +2063,81 @@ int request_threaded_irq(unsigned int irq, irq_handler_t handler,
EXPORT_SYMBOL(request_threaded_irq);

/**
+ * irq_update_devid - update irq dev_id to a new one
+ *
+ * @irq: Interrupt line to update
+ * @dev_id: A cookie to find the irqaction to update
+ * @new_dev_id: New cookie passed to the handler function
+ *
+ * Sometimes only the cookie needs to be changed. Updating the dev_id
+ * of the irqaction in place, instead of freeing and re-requesting
+ * the interrupt, is cheaper and also reduces the risk of losing
+ * interrupts.
+ *
+ * This function won't update dev_id until any executing interrupts
+ * for this IRQ have completed. This function must not be called
+ * from interrupt context.
+ *
+ * On failure, it returns a negative error code. On success,
+ * it returns 0.
+ */
+int irq_update_devid(unsigned int irq, void *dev_id, void *new_dev_id)
+{
+ struct irq_desc *desc = irq_to_desc(irq);
+ struct irqaction *action, **action_ptr;
+ unsigned long flags;
+
+ if (WARN(in_interrupt(),
+ "Trying to update IRQ %d (dev_id %p to %p) from IRQ context!\n",
+ irq, dev_id, new_dev_id))
+ return -EPERM;
+
+ if (!desc)
+ return -EINVAL;
+
+ /*
+ * Ensure that an interrupt in flight on another CPU which uses the
+ * old 'dev_id' has completed because the caller can free the memory
+ * to which it points after this function returns. Also avoid
+ * updating 'dev_id' in the middle of threaded interrupt handling,
+ * which could leave the primary handler using the old 'dev_id' while
+ * the threaded handler uses the new one.
+ */
+ disable_irq(irq);
+ raw_spin_lock_irqsave(&desc->lock, flags);
+
+ /*
+ * There can be multiple actions per IRQ descriptor, find the right
+ * one based on the dev_id:
+ */
+ action_ptr = &desc->action;
+ for (;;) {
+ action = *action_ptr;
+
+ if (!action) {
+ raw_spin_unlock_irqrestore(&desc->lock, flags);
+ enable_irq(irq);
+ WARN(1,
+ "Trying to update already-free IRQ %d (dev_id %p to %p)\n",
+ irq, dev_id, new_dev_id);
+ return -ENXIO;
+ }
+
+ if (action->dev_id == dev_id) {
+ action->dev_id = new_dev_id;
+ break;
+ }
+ action_ptr = &action->next;
+ }
+
+ raw_spin_unlock_irqrestore(&desc->lock, flags);
+ enable_irq(irq);
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(irq_update_devid);
+
+/**
* request_any_context_irq - allocate an interrupt line
* @irq: Interrupt line to allocate
* @handler: Function to be called when the IRQ occurs.
--
1.8.3.1