Re: [PATCH 2.6.30-rc4] r8169: avoid losing MSI interrupts

From: David Dillow
Date: Tue Aug 25 2009 - 16:41:00 EST

On Tue, 2009-08-25 at 13:22 -0700, Eric W. Biederman wrote:
> David Dillow <dave@xxxxxxxxxxxxxx> writes:
> > I'm not real happy with the interrupt handling in the driver; it makes a
> > certain amount of sense to split the MSI vs non-MSI interrupt cases out.
> > It also means another pass through re-auditing things against the vendor
> > driver. That's more work than I'm able to commit to at the moment.
> >
> > I've not been able to reproduce it locally on my r8169d, running for ~30
> > minutes straight at full speed. I've not tried running it in UP, though.
> > Perhaps I can do that tomorrow.
> >
> > Here's a possible patch to mask the NAPI events while we're running in
> > NAPI mode. I'm not sure it is going to help, since the intr_mask was
> > 0xffff when you hit the loop guard, so I left it in for now.
> Interesting.
> If I understand this correctly, the situation is that the chip has
> correct logic for a level-triggered interrupt, and the MSI logic sits
> on top of it and sends an event when the interrupt signal goes high,
> but when we acknowledge some bits but not all, it does not send
> another interrupt.

Correct, we have to acknowledge all current outstanding event sources
before we get another MSI interrupt. It looks like the MSI interrupt is
triggered on the edge transition of a logical OR of all irq sources.
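
To make that concrete, here is a minimal model of the behaviour as I understand it. It is a sketch, not driver code: struct chip_model, chip_raise() and chip_ack() are made-up names for illustration, and the assumption that the MSI fires only on a 0 -> non-zero transition of the OR of all sources is my reading of the hardware, not something taken from a datasheet.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the chip; names are illustrative only. */
struct chip_model {
	uint16_t status;	/* pending event bits */
	unsigned msi_count;	/* number of MSI messages sent so far */
};

/*
 * Raise one or more event bits. Assumption: an MSI is sent only when
 * the logical OR of all sources goes from low to high.
 */
static void chip_raise(struct chip_model *c, uint16_t events)
{
	bool was_asserted = c->status != 0;

	c->status |= events;
	if (!was_asserted && c->status != 0)
		c->msi_count++;
}

/* Acknowledge (clear) the given event bits. */
static void chip_ack(struct chip_model *c, uint16_t events)
{
	c->status &= (uint16_t)~events;
}
```

In this model, acking only some of the pending bits leaves the OR high, so a later event never produces a new MSI. Only once every bit has been cleared does the next event cause another edge.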

> Barring playing games with which version of the card has working logic
> and which does not, we seem to have two simple choices (if we don't want
> to loop possibly forever).
> - Don't use the msi logic on this card.
> - Move all of the logic into rtl8169_poll and only come out of NAPI
> mode when we have caught up with all of the interrupt work.
> Is that how you understand the hardware issue you are trying to work
> around?

That's how I understood the issue I was working around with the
problematic patch, but I thought I had covered both issues fairly well
without having to split the handling any further -- we ACK all existing
sources each pass through the loop, so we'll get a new interrupt on the
unmasked events, but not on ones we've masked out for NAPI until NAPI
completes and unmasks them.
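
As a rough illustration of what I expect from masking, here is a small model. It assumes the interrupt mask gates the OR that feeds the MSI edge detector, which is my working assumption rather than verified hardware behaviour. EV_RX, EV_LINK, struct masked_chip and the mc_*() helpers are hypothetical names, not the driver's.

```c
#include <stdbool.h>
#include <stdint.h>

#define EV_RX	0x0001u	/* hypothetical event handled by NAPI */
#define EV_LINK	0x0020u	/* hypothetical event handled in the IRQ handler */

struct masked_chip {
	uint16_t status;	/* pending event bits */
	uint16_t mask;		/* events allowed to raise the interrupt */
	unsigned msi_count;	/* number of MSI messages sent so far */
};

/* Assumed: the interrupt signal is the OR of the unmasked pending events. */
static bool line_high(const struct masked_chip *c)
{
	return (c->status & c->mask) != 0;
}

/* Apply a new status/mask pair and count an MSI on a low -> high edge. */
static void mc_update(struct masked_chip *c, uint16_t status, uint16_t mask)
{
	bool before = line_high(c);

	c->status = status;
	c->mask = mask;
	if (!before && line_high(c))
		c->msi_count++;
}

static void mc_raise(struct masked_chip *c, uint16_t ev)
{
	mc_update(c, (uint16_t)(c->status | ev), c->mask);
}

static void mc_ack(struct masked_chip *c, uint16_t ev)
{
	mc_update(c, (uint16_t)(c->status & ~ev), c->mask);
}

static void mc_set_mask(struct masked_chip *c, uint16_t mask)
{
	mc_update(c, c->status, mask);
}
```

With EV_RX masked for NAPI and everything acked, a new EV_RX does not produce an edge in this model, while a new EV_LINK does.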

I'm curious how you managed to receive a packet between us clearing all
current sources and reading the current source list, continuously for
60+ seconds -- the loop is basically

	status = get IRQ events from chip
	while (status) {
		/* process events, start NAPI if needed */
		clear current events from chip
		status = get IRQ events from chip
	}

That seems like a very small race window to consistently hit --
especially for long enough to trigger soft lockups.
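
For reference, the loop above can be modelled in plain C along these lines. This is only a sketch under stated assumptions: struct fake_chip, read_status(), ack_status() and irq_loop() are hypothetical stand-ins, not the driver's functions; race_hits models a packet landing between the clear and the re-read; and the 0x0001 RX bit is assumed for illustration.

```c
#include <stdint.h>

/* Hypothetical stand-in for the chip; not the real register layout. */
struct fake_chip {
	uint16_t status;	/* pending event bits */
	unsigned race_hits;	/* ACKs that are immediately followed by a new event */
};

static uint16_t read_status(const struct fake_chip *c)
{
	return c->status;
}

/* Clear the acked bits, then optionally model an event landing in the race window. */
static void ack_status(struct fake_chip *c, uint16_t events)
{
	c->status &= (uint16_t)~events;
	if (c->race_hits > 0) {
		c->race_hits--;
		c->status |= 0x0001;	/* assumed RX event bit */
	}
}

/* Returns the number of passes through the loop, or -1 if the guard tripped. */
static int irq_loop(struct fake_chip *c, int max_passes)
{
	int passes = 0;
	uint16_t status = read_status(c);

	while (status) {
		if (passes >= max_passes)
			return -1;	/* loop guard */
		/* process events, start NAPI if needed (omitted) */
		ack_status(c, status);
		status = read_status(c);
		passes++;
	}
	return passes;
}
```

In this model the guard only trips if a new event shows up in the window after every single ACK, for max_passes passes in a row.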