[patch, -rc5-mm3] fix irqpoll some more

From: Ingo Molnar
Date: Mon Jun 05 2006 - 04:49:14 EST



* Alan Cox <alan@xxxxxxxxxxxxxxxxxxx> wrote:

> Ar Sul, 2006-06-04 am 23:00 -0700, ysgrifennodd akpm@xxxxxxxx:
> > irqpoll/irqfixup ignored IRQ_DISABLED but that could cause lockups. So
> > listen to desc->depth to correctly honor disable_irq(). Also, when an
> > interrupt it screaming, set IRQ_DISABLED but do not touch ->depth.
> >
> > Signed-off-by: Ingo Molnar <mingo@xxxxxxx>
> > Cc: Alan Cox <alan@xxxxxxxxxxxxxxxxxxx>
>
> NAK
>
> The moment you do this you as good as lose irqpoll support because the
> IRQ being disabled is unrelated to the IRQ delivery line when dealing
> with faults of this nature.

hm, agreed. I really sent it more as an RFC. The real fix is to do the
disable_irq_handler()/enable_irq_handler() thing. [which is also a much
nicer interface for more advanced MSI concepts]

> There is a saner fix - walk all the IRQs that are not disabled and
> only if that produces no claim (ie the box is about to die otherwise)
> then walk the disabled ones. Its a slow path at that point in error
> recovery so the performance isn't critical.

yeah. How about the patch below? [builds and boots on x86, but no real
irqpoll testing was done as i dont have such a problem-system.]

Ingo

---------------
Subject: fix irqpoll some more
From: Ingo Molnar <mingo@xxxxxxx>

implement Alan's suggestion of irqpoll doing a second pass over
disabled irq lines if we didnt manage to handle the screaming
interrupt.

Signed-off-by: Ingo Molnar <mingo@xxxxxxx>
---
kernel/irq/spurious.c | 24 ++++++++++++++++++++++--
1 file changed, 22 insertions(+), 2 deletions(-)

Index: linux/kernel/irq/spurious.c
===================================================================
--- linux.orig/kernel/irq/spurious.c
+++ linux/kernel/irq/spurious.c
@@ -18,8 +18,9 @@ static int irqfixup __read_mostly;
*/
static int misrouted_irq(int irq, struct pt_regs *regs)
{
- int i, ok = 0, work = 0;
+ int i, ok = 0, work = 0, first_pass = 1;

+repeat:
for (i = 1; i < NR_IRQS; i++) {
struct irq_desc *desc = irq_desc + i;
struct irqaction *action;
@@ -61,7 +62,7 @@ static int misrouted_irq(int irq, struct
* IRQ_DISABLED - which might have been set due to
* a screaming interrupt.
*/
- if (desc->depth) {
+ if (first_pass && desc->depth) {
spin_unlock(&desc->lock);
continue;
}
@@ -90,6 +91,25 @@ static int misrouted_irq(int irq, struct
desc->chip->end(i);
spin_unlock(&desc->lock);
}
+ /*
+ * HACK:
+ *
+ * In the first pass we dont touch handlers that are behind
+ * a disabled IRQ line. In the second pass (having no other
+ * choice) we ignore the disabled state of IRQ lines. We've
+ * got a screaming interrupt, so we have the choice between
+ * a real lockup happening due to that screaming interrupt,
+ * against a theoretical locking that becomes possible if we
+ * ignore a disabled IRQ line.
+ *
+ * FIXME: proper disable_irq_handler() API would remove the
+ * need for this hack.
+ */
+ if (!ok && first_pass) {
+ first_pass = 0;
+ goto repeat;
+ }
+
/* So the caller can adjust the irq error counts */
return ok;
}
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/