Re: [E1000-devel] e1000: "eeprom checksum is not valid" after kexec
From: Thadeu Lima de Souza Cascardo
Date: Thu Apr 23 2009 - 17:18:20 EST
On Thu, Apr 23, 2009 at 10:40:14PM +0200, Jiri Slaby wrote:
> On 04/23/2009 04:41 PM, Thadeu Lima de Souza Cascardo wrote:
> > On Thu, Apr 23, 2009 at 04:30:01PM +0200, Jiri Slaby wrote:
> >> On 04/23/2009 04:10 PM, Thadeu Lima de Souza Cascardo wrote:
> >>> Have you tried b43fcd7dc7b, found in v2.6.30-rc3?
> >> I've tried 2.6.30-rc3-next-20090423 without success.
> >
> > You mean next-20090423. The patch is really found there.
> >
> > But, then, I realize you mean reverting these patches for the kernel
> > that is running or the kernel that is being kexec'd?
>
> The latter.
>
> > If b43fcd7dc7b is applied to the running kernel, it fixes the shutdown
> > issue, and the next loaded kernel probes e1000 fine.
>
> Makes sense.
>
> > If you are reverting 4a865905f in the kexec'd kernel and the running
> > kernel does not have b43fcd7dc7b, then I'd like to test the revert for
> > my case here, which is e100.
>
> To make things clear: on that machine, there was stock opensuse 11.1
> distro kernel which is 2.6.27-based (no b43fcd7dc7b). I needed to debug
> a wireless bug, so I kexec'ed wireless-testing (contains 4a865905f already).
>
> So in fact, 4a865905f from the testing kernel triggered a bug fixed in
> near past by b43fcd7dc7b.
>
> Did the other two e100* drivers suffer from the same and were fixed
> recently? It would render kexec pretty unusable from the older kernels
> if this is not going to be fixed anyhow :(.
Yes, as well as some other network drivers, it seems. My fix for e100
should be in Jeffrey Kirsher's tree by now and go into netdev and rc4
soon, I expect.
But, since I also thought that it would be good to fix that and allow
people to kexec from earlier kernels, I sent a followup to e1000-devel,
linux-pci, netdev and Rafael Wysocki. I forgot to include linux-kernel
(oops!), which I have just fixed by bouncing the message there. I can
bounce it to you too, if you want.
Your findings shed some light on that problem. But I could reproduce it
in much earlier kernels too for some configurations, so reverting these
commits may only fix the issue for the most common configurations out
there. That is, these patches made the shutdown bug very easy to
trigger. But I think there are other bugs out there that will trigger it
as well, and they do not seem that easy to bisect, since only some very
particular configurations trigger them.
I will do some tests with the commits you mention, try to reproduce the
problem using kernels as early as I can, and send the config.
Regards,
Cascardo.