Re: [patch 3/3] genirq: mark io_apic level interrupts to avoidresend

From: Thomas Gleixner
Date: Mon Aug 13 2007 - 10:48:58 EST


On Mon, 2007-08-13 at 13:28 +0200, Jarek Poplawski wrote:
> Maybe I miss something, but:
> - why it's not done in other places with handle_level_irq or

I removed the PENDING bit magic in handle_level_irq, which was put there
by a mismerge.

> handle_fasteoi_irq. Are we sure they can never get IRQ_PENDING flag
> or we take the risk?

handle_fasteoi_irq is special. It is used by ioapic and weird Powerpc
hardware. We marked the ioapic level interrupts IRQ_LEVEL, so we wont
get a resend on them (see kernel/irq/resend.c)

> - what about e.g. handle_simple_irq: do we think it needs resending
> or simply going to wait for good bug reports?

Oops. I missed that one. I'll have a look at that as well.

> - is this .status cleared enough on free_irq or during request_irq?

AFAICT, there is no danger.

> BTW, of course something like this set of patches was needed here,
> but I still think this shouldn't be done this way: the bug wasn't
> diagnosed nor tested enough, some chips are suspected, and
> nevertheless the change is done for everybody, whithout any
> possibility to set this back for people who had no problems with
> 2.6.21 and 2.6.22 (at least for some transition time).

It is quite clear what happened:

1. The retrigger/resend mechanism was never meant to be used for level
type interrupts and it makes absolutely no sense. When a level type
interrupt is masked while active it is resent by hardware on unmask when
it is still active. The resend / retrigger mechanism is only useful for
edge type interrupts, because we would lose an interrupt when we mask it
delayed without triggering a resend on unmask.

2. I found a box similar to Marcins which gets confused when level type
interrupts are resent.

> Maybe some
> other chips get confused now, and we get some notice of this maybe
> only in 2.6.25, if we'are lucky enough to have somebody like Marcin
> around to do the next git bisection?

There is nothing to confuse anymore. The resend for level type
interrupts is not happening, which is the same behavior as we had before
the delayed disable. The edge type interrupts resend is there since we
did the big genirq changes (2.6.18) and it was tested on various boxen
(including the above one) quite extensive.

The delayed disable was added later and unfortunately introduced the
problems we've seen.

tglx


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/