Re: Sun GEM PPC32 Bug?

From: David Miller
Date: Fri Feb 04 2011 - 17:54:37 EST


From: Benjamin Herrenschmidt <benh@xxxxxxxxxxxxxxxxxxx>
Date: Sat, 05 Feb 2011 07:51:07 +1100

> The FIFO overflow could be a driver bug or a HW issue, there are some
> known issues with the small FIFOs in that chip, but it's also possible
> that we don't configure them quite right. Anybody want to dig in and
> see what's going on there? May want to look at the Darwin sungem driver
> for reference on how it configures them... However, it should generally
> recover when that happens. If not, then we have a bug there.

I think we're simply not resetting enough when the RX FIFO overflow
happens.

Just for fun I checked the OpenBSD GEM driver to see what they do.
When an overflow occurs, they bump the statistic, record the current
RX FIFO read and write pointer registers, and schedule a watchdog
timer for 400ms into the future.

If the watchdog timer sees that the RX FIFO overflow bit is still set
in the RX status register, and the RX FIFO read and write pointers
have not changed, it resets the entire chip.
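
Roughly, the watchdog check amounts to something like the sketch
below. This is not the OpenBSD code and not our sungem code; the
struct and function names are made up for illustration, and the
register reads are assumed to have been done by the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical snapshot of the RX FIFO pointer registers. */
struct rx_fifo_snap {
	uint32_t rd_ptr;	/* RX FIFO read pointer */
	uint32_t wr_ptr;	/* RX FIFO write pointer */
};

/*
 * Overflow path: remember where the FIFO pointers were when the
 * overflow was reported.  The caller would then arm a watchdog
 * for roughly 400ms later.
 */
static void rx_fifo_record(struct rx_fifo_snap *saved,
			   uint32_t rd_ptr, uint32_t wr_ptr)
{
	assert(saved != NULL);
	saved->rd_ptr = rd_ptr;
	saved->wr_ptr = wr_ptr;
}

/*
 * Watchdog path: the RX side is considered wedged only if the
 * overflow bit is still set AND neither pointer has moved since
 * the snapshot.  A true return means "reset the entire chip".
 */
static bool rx_fifo_wedged(const struct rx_fifo_snap *saved,
			   const struct rx_fifo_snap *now,
			   bool overflow_still_set)
{
	assert(saved != NULL && now != NULL);
	return overflow_still_set &&
	       saved->rd_ptr == now->rd_ptr &&
	       saved->wr_ptr == now->wr_ptr;
}
```

The point of comparing the pointers is that a FIFO which is still
draining recovers on its own, so the full reset is reserved for the
case where nothing has moved.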

We unconditionally reset the RX MAC when an overflow occurs; that may
simply not be enough to unwedge this thing.