Re: [PATCH] x86_64 irq: check remote IRR bit before migrating level triggered irq (v3)

From: Siddha, Suresh B
Date: Thu May 31 2007 - 16:03:44 EST


On Thu, May 31, 2007 at 07:50:58AM -0600, Eric W. Biederman wrote:
>
> On x86_64 kernels, level-triggered irq migration is initiated in the context
> of that interrupt (after executing the irq handler), and the following steps
> are performed to migrate the irq:
>
> 1. mask IOAPIC RTE entry; // write to IOAPIC RTE
> 2. EOI; // processor EOI write
> 3. reprogram IOAPIC RTE entry; // write to IOAPIC RTE with new destination
>                                // and interrupt vector due to per cpu vector
>                                // allocation.
> 4. unmask IOAPIC RTE entry; // write to IOAPIC RTE
>
> Because of the per cpu vector allocation in x86_64 kernels, when the irq
> migrates to a different cpu, a new vector (corresponding to the new cpu) gets
> allocated.
>
> An EOI write to the local APIC has the side effect of generating an EOI write
> for level-triggered interrupts (normally this is a broadcast to all IOAPICs).
> The EOI broadcast generated as a side effect of the EOI write to the processor
> may be delayed while the other IOAPIC writes (steps 3 and 4) go through.
>
> Normally, the EOI generated by the local APIC for a level-triggered interrupt
> contains the vector number. The IOAPIC takes this vector number, searches
> the IOAPIC RTE entries for an entry with a matching vector number, and
> clears its remote IRR bit (indicating EOI). However, if the vector number is
> changed (as in step 3), the IOAPIC will not find the RTE entry when the EOI
> is received later. The remote IRR bit then stays set and the interrupt
> hangs (no more interrupts from this RTE).
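
The race described above can be sketched with a toy user-space model. This
is not kernel code: struct toy_rte, toy_ioapic_eoi() and toy_migrate() are
hypothetical names made up for illustration, and the entry layout is
simplified. It only shows why a late EOI carrying the old vector no longer
matches the reprogrammed entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of one IOAPIC redirection table entry (not the real layout). */
struct toy_rte {
    unsigned char vector;  /* interrupt vector programmed into the entry */
    unsigned char dest;    /* destination cpu */
    bool masked;
    bool remote_irr;       /* set on level irq delivery, cleared by EOI */
};

/*
 * Model of the IOAPIC receiving an EOI broadcast: clear remote IRR on
 * every entry whose vector matches the vector carried by the EOI.
 * Returns the number of entries cleared.
 */
static int toy_ioapic_eoi(struct toy_rte *rte, size_t n, unsigned char vector)
{
    int cleared = 0;

    assert(rte != NULL);
    for (size_t i = 0; i < n; i++) {
        if (rte[i].remote_irr && rte[i].vector == vector) {
            rte[i].remote_irr = false;
            cleared++;
        }
    }
    return cleared;
}

/*
 * Steps 1, 3 and 4 of the migration, with the EOI broadcast of step 2
 * reaching the IOAPIC either in time or only after the entry has been
 * reprogrammed. Returns true if remote IRR is left stuck.
 */
static bool toy_migrate(struct toy_rte *rte, unsigned char new_vector,
                        unsigned char new_dest, bool eoi_delayed)
{
    unsigned char old_vector;

    assert(rte != NULL);
    old_vector = rte->vector;

    rte->masked = true;                      /* step 1: mask */
    if (!eoi_delayed)
        toy_ioapic_eoi(rte, 1, old_vector);  /* step 2 arrives in time */
    rte->vector = new_vector;                /* step 3: reprogram */
    rte->dest = new_dest;
    rte->masked = false;                     /* step 4: unmask */
    if (eoi_delayed)
        toy_ioapic_eoi(rte, 1, old_vector);  /* late EOI: nothing matches */

    return rte->remote_irr;
}
```

With eoi_delayed set, toy_migrate() returns true: the entry already holds
the new vector when the EOI for the old vector arrives, so remote IRR is
never cleared.
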
>
> The current x86_64 kernel assumes that the remote IRR bit is cleared by the
> time the IOAPIC RTE is reprogrammed. Fix this assumption by checking the
> remote IRR bit and, if it is still set, delaying the irq migration to the
> next interrupt arrival event (hopefully, next time the remote IRR bit will be
> cleared before the IOAPIC RTE is reprogrammed).
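
The check described above could look roughly like the following toy model.
It is a sketch under assumptions, not the code from the patch: struct
toy_irq, its move_pending/pending_* fields and toy_try_migrate() are
hypothetical names, and reading remote IRR is reduced to reading a struct
field instead of an IOAPIC register.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of one IOAPIC redirection table entry (not the real layout). */
struct toy_rte {
    unsigned char vector;
    unsigned char dest;
    bool masked;
    bool remote_irr;
};

/* Hypothetical per-irq state: the entry plus a requested migration. */
struct toy_irq {
    struct toy_rte rte;
    bool move_pending;             /* a migration has been requested */
    unsigned char pending_vector;  /* vector allocated on the new cpu */
    unsigned char pending_dest;    /* the new cpu */
};

/*
 * Called in irq context, after the handler ran and the EOI was written.
 * Returns true if the entry was reprogrammed, false if nothing was
 * pending or the migration was deferred to the next interrupt.
 */
static bool toy_try_migrate(struct toy_irq *irq)
{
    assert(irq != NULL);

    if (!irq->move_pending)
        return false;

    /*
     * Remote IRR still set: the EOI has not reached the IOAPIC yet.
     * Changing the vector now would make that EOI miss the entry, so
     * leave move_pending set and retry on the next interrupt.
     */
    if (irq->rte.remote_irr)
        return false;

    irq->rte.masked = true;                 /* mask */
    irq->rte.vector = irq->pending_vector;  /* reprogram */
    irq->rte.dest = irq->pending_dest;
    irq->rte.masked = false;                /* unmask */
    irq->move_pending = false;
    return true;
}
```

The first call with remote IRR set leaves the entry untouched and the
request pending; a later call, once remote IRR has been cleared, performs
the move.
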
>
> Initial analysis and patch from Nanhai.
>
> Clean up patch from Suresh.
>
> Rewritten to be less intrusive, and to contain a big fat comment by Eric.

Acked-by: Suresh Siddha <suresh.b.siddha@xxxxxxxxx>

Thanks Eric.