Re: sky2 (was Re: 2.6.18-mm2)

From: Stephen Hemminger
Date: Thu Sep 28 2006 - 19:20:49 EST


On Thu, 28 Sep 2006 19:07:05 -0400
Jeff Garzik <jeff@xxxxxxxxxx> wrote:

> Andrew Morton wrote:
> > Another customer..
> >
> > Begin forwarded message:
> >
> > Date: Fri, 29 Sep 2006 00:44:01 +0200
> > From: Matthias Hentges <oe@xxxxxxxxxxx>
> > To: Andrew Morton <akpm@xxxxxxxx>
> > Cc: linux-kernel@xxxxxxxxxxxxxxx
> > Subject: Re: 2.6.18-mm2
> >
> >
> > Hello all,
> >
> > I've just tested -mm2 on my C2D system and I'm getting a lot of these
> > messages:
> >
> > "[ 139.143807] printk: 131 messages suppressed.
> > [ 139.148235] sky2 0000:03:00.0: pci express error (0x500547)"
> >
> > Please note that the "sky2" driver has always been the black sheep on
> > that system due to regular full lock-ups of the driver, requiring a
> > rmmod sky2 + modprobe sky2 cycle.
> >
> > This happens often enough to warrant writing a cronjob checking the
> > network and auto-rmmod'ing the module.....
> >
> > While the above is bloody annoying at times (heh), the driver never
> > caused any messages like the ones I now get with -mm2.
>
> sky2 just turned on PCI Express error reporting, so it makes sense that
> messages would appear. The better question is whether this is a driver
> problem, or a hardware problem. With your "black sheep" comment, I
> wonder if it isn't a hardware problem that's been hidden.

Here is the debug patch I sent to the first reporter of the problem.
I know what the offset of the error status register is supposed to be,
so if the PCI subsystem returns the wrong capability offset, this will
show it.

--- sky2.orig/drivers/net/sky2.c 2006-09-28 08:45:27.000000000 -0700
+++ sky2/drivers/net/sky2.c 2006-09-28 08:51:24.000000000 -0700
@@ -2463,6 +2463,7 @@

 	sky2_write8(hw, B0_CTST, CS_MRST_CLR);

+#define PEX_UNC_ERR_STAT 0x104 /* PCI extended error capability */
 	/* clear any PEX errors */
 	if (pci_find_capability(hw->pdev, PCI_CAP_ID_EXP)) {
 		hw->err_cap = pci_find_ext_capability(hw->pdev, PCI_EXT_CAP_ID_ERR);
@@ -2470,6 +2471,15 @@
 			sky2_pci_write32(hw,
 					 hw->err_cap + PCI_ERR_UNCOR_STATUS,
 					 0xffffffffUL);
+		else
+			printk(KERN_ERR PFX "pci express found but not extended error support?\n");
+
+		if (hw->err_cap + PCI_ERR_UNCOR_STATUS != PEX_UNC_ERR_STAT) {
+
+			printk(KERN_ERR PFX "pci express error status register fixed from %#x to %#x\n",
+			       hw->err_cap, PEX_UNC_ERR_STAT - PCI_ERR_UNCOR_STATUS);
+			hw->err_cap = PEX_UNC_ERR_STAT - PCI_ERR_UNCOR_STATUS;
+		}
 	}

 	hw->pmd_type = sky2_read8(hw, B2_PMD_TYP);
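
For anyone who wants to follow the arithmetic of that check without
building a kernel, here is a rough user-space sketch of the same
comparison. It is only an illustration, not driver code: the helper name
sky2_expected_err_cap is made up, PEX_UNC_ERR_STAT is taken from the
patch, and PCI_ERR_UNCOR_STATUS is copied by hand from the kernel's
pci_regs.h (offset 4 within the error reporting capability).

```c
#include <assert.h>
#include <stdio.h>

/* Absolute config-space offset the patch expects for the status register. */
#define PEX_UNC_ERR_STAT	0x104
/* Offset of the uncorrectable error status register inside the
 * capability, copied here from pci_regs.h for illustration. */
#define PCI_ERR_UNCOR_STATUS	4

/*
 * Hypothetical helper, not part of sky2: takes the capability offset the
 * PCI core reported and returns the offset the debug patch would end up
 * using. A reported offset of 0 means the capability was not found.
 */
unsigned int sky2_expected_err_cap(unsigned int found)
{
	/* capability offsets in config space are dword aligned */
	assert((found & 3) == 0);

	if (found + PCI_ERR_UNCOR_STATUS != PEX_UNC_ERR_STAT) {
		fprintf(stderr, "err_cap fixed from %#x to %#x\n",
			found, PEX_UNC_ERR_STAT - PCI_ERR_UNCOR_STATUS);
		return PEX_UNC_ERR_STAT - PCI_ERR_UNCOR_STATUS;
	}

	/* reported offset already matches the expected one */
	return found;
}
```

With these constants the only reported offset that passes unchanged is
0x100. Anything else, including 0 for "not found", is overridden to
0x100 and logged, which is exactly the message to look for in dmesg.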