[RFC/PATCH] x86/irq: round-robin distribution of irqs to cpus w/in node

From: Arthur Kepner
Date: Mon Sep 27 2010 - 00:09:15 EST



SGI has encountered situations where particular CPUs run out of
interrupt vectors on systems with many (several hundred or more)
CPUs. This happens because some drivers (particularly the mlx4_core
driver) select the number of interrupts they allocate based on the
number of CPUs, and because of how the default irq affinity is used.

Do pseudo round-robin distribution of irqs to CPUs within a node
to avoid (or at least delay) running out of vectors on any particular
CPU.

Signed-off-by: Arthur Kepner <akepner@xxxxxxx>
---

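For readers who want the selection step in isolation, here is a minimal
user-space sketch of a round-robin cursor over the CPUs of one node. It is
an illustration only, not the kernel code: the 64-bit mask and the names
next_cpu_in_mask() and round_robin_pick() are hypothetical stand-ins for
cpumask_of_node() and __next_cpu_nr(). Unlike the patch, which resets the
cursor to CPU 0 on overflow, this sketch wraps to the first CPU of the node.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 64	/* toy limit: one bit per CPU in a 64-bit mask */

/*
 * Return the lowest CPU set in mask that is strictly above prev,
 * or NR_CPUS if there is none. Pass prev = -1 to start at CPU 0.
 */
static int next_cpu_in_mask(int prev, uint64_t mask)
{
	assert(prev >= -1 && prev < NR_CPUS);

	for (int cpu = prev + 1; cpu < NR_CPUS; cpu++)
		if (mask & (UINT64_C(1) << cpu))
			return cpu;
	return NR_CPUS;
}

/*
 * Advance *cursor to the next CPU in node_mask, wrapping around to the
 * first CPU of the node. Returns the chosen CPU, or -1 if the node has
 * no CPUs (the cursor is left unchanged in that case).
 */
static int round_robin_pick(int *cursor, uint64_t node_mask)
{
	int cpu = next_cpu_in_mask(*cursor, node_mask);

	if (cpu >= NR_CPUS)			/* ran off the end: wrap */
		cpu = next_cpu_in_mask(-1, node_mask);
	if (cpu >= NR_CPUS)			/* empty node mask */
		return -1;

	*cursor = cpu;
	return cpu;
}
```

With a node mask covering CPUs 4-6 and a cursor starting at -1, successive
calls walk 4, 5, 6 and then wrap back to 4.
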
arch/x86/kernel/apic/io_apic.c | 28 ++++++++++++++++++++++++++--
1 file changed, 26 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index f1efeba..ad540a9 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -3254,6 +3254,8 @@ unsigned int create_irq_nr(unsigned int irq_want, int node)

raw_spin_lock_irqsave(&vector_lock, flags);
for (new = irq_want; new < nr_irqs; new++) {
+ cpumask_var_t tmp_mask;
+
desc_new = irq_to_desc_alloc_node(new, node);
if (!desc_new) {
printk(KERN_INFO "can not get irq_desc for %d\n", new);
@@ -3267,8 +3269,30 @@ unsigned int create_irq_nr(unsigned int irq_want, int node)
desc_new = move_irq_desc(desc_new, node);
cfg_new = desc_new->chip_data;

- if (__assign_irq_vector(new, cfg_new, apic->target_cpus()) == 0)
- irq = new;
+ if ((node != -1) && zalloc_cpumask_var(&tmp_mask, GFP_ATOMIC)) {
+
+ static int cpu;
+
+ /* try to place irq on a cpu in the node in pseudo-
+ * round robin order */
+
+ cpu = __next_cpu_nr(cpu, cpumask_of_node(node));
+ if (cpu >= nr_cpu_ids)
+ cpu = 0;
+
+ cpumask_set_cpu(cpu, tmp_mask);
+
+ if (cpumask_test_cpu(cpu, apic->target_cpus()) &&
+ __assign_irq_vector(new, cfg_new, tmp_mask) == 0)
+ irq = new;
+
+ free_cpumask_var(tmp_mask);
+ }
+
+ if (irq == 0)
+ if (__assign_irq_vector(new, cfg_new,
+ apic->target_cpus()) == 0)
+ irq = new;
break;
}
raw_spin_unlock_irqrestore(&vector_lock, flags);
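
The hunk above tries a single preferred CPU first and falls back to the
full target mask if that fails. A toy user-space model of that control flow
is sketched below. Everything in it is an assumption made for illustration:
the budget of 2 vectors per CPU, the 4-CPU system, and the first-fit
behaviour of assign_vector(), which is only a stand-in for
__assign_irq_vector() and not a description of the real allocator.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS		4
#define VECTORS_PER_CPU	2	/* toy budget, not the real x86 limit */

static int used[NR_CPUS];	/* vectors consumed on each CPU */

/*
 * Toy stand-in for __assign_irq_vector(): take a vector on the first
 * CPU in mask that still has one free. Returns that CPU, or -1 if
 * every CPU in mask is exhausted.
 */
static int assign_vector(uint32_t mask)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if ((mask & (1u << cpu)) && used[cpu] < VECTORS_PER_CPU) {
			used[cpu]++;
			return cpu;
		}
	}
	return -1;
}

/*
 * Try the preferred CPU alone first; if it is not allowed by
 * target_mask or has no free vector, fall back to target_mask.
 * Pass preferred_cpu = -1 to skip the first attempt.
 */
static int create_irq(int preferred_cpu, uint32_t target_mask)
{
	int cpu = -1;

	if (preferred_cpu >= 0 && preferred_cpu < NR_CPUS &&
	    (target_mask & (1u << preferred_cpu)))
		cpu = assign_vector(1u << preferred_cpu);

	if (cpu < 0)
		cpu = assign_vector(target_mask);

	return cpu;
}
```

Without a preferred CPU, this model fills CPU 0 before touching CPU 1,
which is the kind of pile-up the patch tries to delay. With a rotating
preferred CPU, requests spread out and the fallback is only used once the
preferred CPU is full.
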