Re: [PATCH 2.6.30-rc4] r8169: avoid losing MSI interrupts

From: David Dillow
Date: Mon Aug 24 2009 - 22:59:27 EST


On Mon, 2009-08-24 at 17:51 -0700, Eric W. Biederman wrote:
> When I decode the bits in status they are TxOK, RxOK and TxDescUnavail so it looks
> there is some bidirectional communication going on.
>
> Do we really want to loop when those bits are set?

Maybe not when only those bits are set, but I worry that we would trade
one race for another where we stop getting interrupts from the card.

> Perhaps we want to remove them from rtl_cfg_infos for the part?

Then you'd never get an interrupt for them in the first place, I think.

I'm not real happy with the interrupt handling in the driver; it makes a
certain amount of sense to split the MSI vs non-MSI interrupt cases out.
It also means another pass through re-auditing things against the vendor
driver. That's more work than I'm able to commit to at the moment.

I've not been able to reproduce it locally on my r8169d, running for ~30
minutes straight at full speed. I've not tried running it in UP, though.
Perhaps I can do that tomorrow.

Here's a possible patch to mask the NAPI events while we're running in
NAPI mode. I'm not sure it is going to help, since the intr_mask was
0xffff when you hit the loop guard, so I left it in for now.

diff --git a/drivers/net/r8169.c b/drivers/net/r8169.c
index b82780d..12755b7 100644
--- a/drivers/net/r8169.c
+++ b/drivers/net/r8169.c
@@ -3556,6 +3556,7 @@ static irqreturn_t rtl8169_interrupt(int irq, void *dev_instance)
void __iomem *ioaddr = tp->mmio_addr;
int handled = 0;
int status;
+ int count = 0;

/* loop handling interrupts until we have no new ones or
* we hit a invalid/hotplug case.
@@ -3564,6 +3565,15 @@ static irqreturn_t rtl8169_interrupt(int irq, void *dev_instance)
while (status && status != 0xffff) {
handled = 1;

+ if (count++ > 100) {
+ printk_once("r8169 screaming irq status %08x "
+ "mask %08x event %08x napi %08x\n",
+ status, tp->intr_mask, tp->intr_event,
+ tp->napi_event);
+ break;
+ }
+
+
/* Handle all of the error cases first. These will reset
* the chip, so just exit the loop.
*/
@@ -3613,6 +3623,7 @@ static irqreturn_t rtl8169_interrupt(int irq, void *dev_instance)
RTL_W16(IntrStatus,
(status & RxFIFOOver) ? (status | RxOverflow) : status);
status = RTL_R16(IntrStatus);
+ status &= tp->intr_mask;
}

return IRQ_RETVAL(handled);


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/