Re: 2.6.23-rc2: WARNING: at kernel/irq/resend.c:70 check_irq_resend()

From: Jarek Poplawski
Date: Fri Aug 10 2007 - 04:48:53 EST


On Fri, Aug 10, 2007 at 10:30:50AM +0200, Ingo Molnar wrote:
>
> * Jarek Poplawski <jarkao2@xxxxx> wrote:
>
> > > Hmm. This solution is still just papering over the real problem.
> > > The delayed disable just re-sends level interrupts unnecessarily. I
> > > have a fix (needs some testing) for this, which I'll send out tomorrow,
> > > when I'm really back from vacation.
> > >
> > > But suppressing the resend is not fixing the driver problem. The
> > > problem can show up with spurious interrupts and with interrupts on
> > > a shared PCI interrupt line at any time. It just might take weeks
> > > instead of minutes.
> >
> > Doesn't it look like a little change of mind? [...]
>
> what change of mind do you mean exactly?
>
> > [...] Well, there are probably (but need more testing) two other
> > solutions: _SW_RESEND and disabling without delay for levels only...
>
> IIRC Marcin tested software-resend and it didn't fix the hang. That
> strongly points in the direction of a driver bug (or a genirq bug) being
> made more prominent by the genirq change - not any hardware detail such
> as the APIC vector-retrigger sequence.
>
> While we'd like to see the suspected driver bug (or any higher level
> genirq bug) fixed, we'll undo the effect of the genirq change (because
> it is causing a regression). We'll also add a separate, optional
> irq-debugging feature that generates high-rate interrupts on any shared
> irq line. (and thus artificially stresses the robustness of the driver
> and the genirq layer against spurious interrupts.)

Not exactly so... I've sent a modified version of your software-resend
patch, and it seems to work OK.

Jarek P.