Re: [E1000-devel] 2.6.36 abrupt total e1000e carrier loss (cured by reboot)

From: Nix
Date: Mon Nov 08 2010 - 03:01:53 EST


On 4 Nov 2010, Jesse Brandeburg spake thusly:
> The above could be responsible for your issue. If you don't want to
> disable ASPM system wide, then you could just make sure to run a recent
> kernel with the ASPM patches, or get our e1000.sf.net e1000e driver and
> try it, as it will work around the issue whether or not aspm is enabled.

For the record, cherry-picking ff10e13cd06f3dbe90e9fffc3c2dd2057a116e4b
(the periodic phy-crash-and-reset check) atop 2.6.36 seems to have fixed
it: at least, the machine has been up for a day now without trouble.
This commit doesn't seem to be in Greg's stable-queue yet, but seems
like a good candidate.
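
For anyone who wants to watch for the same carrier loss without staring at dmesg, here is a rough sketch that polls the kernel's sysfs carrier flag. It is not part of the commit above. The function name and the `sysfs_root` parameter are made up for illustration, and it assumes the standard /sys/class/net/<iface>/carrier attribute is available:

```python
import errno
import os


def carrier_state(iface, sysfs_root="/sys/class/net"):
    """Return True if the link has carrier, False if it is lost, None if unknown.

    Hypothetical helper: the name and the sysfs_root parameter are
    illustrative and not taken from any driver or tool.
    """
    path = os.path.join(sysfs_root, iface, "carrier")
    try:
        with open(path) as f:
            value = f.read().strip()
    except OSError as e:
        # Reading 'carrier' fails with EINVAL while the interface is
        # administratively down; ENOENT means there is no such interface.
        if e.errno in (errno.EINVAL, errno.ENOENT):
            return None
        raise
    if value == "1":
        return True
    if value == "0":
        return False
    return None
```

Calling `carrier_state("eth0")` from cron or a loop and logging whenever it returns False should be enough to timestamp the next failure; "eth0" is only an example interface name.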

It's still rather gruesome that anything like this is even needed, but
as long as it only fires every hour or so I guess I can live with it.
Are there firmware updates or something that might fix this properly in
the end?