Re: kernel + gcc 4.1 = several problems

From: D. Hazelton
Date: Tue Jan 02 2007 - 17:06:43 EST


On Tuesday 02 January 2007 16:56, Alistair John Strachan wrote:
> On Tuesday 02 January 2007 21:10, Adrian Bunk wrote:
> [snip]
>
> > > > Comparing your report and [1], it seems that if these are the same
> > > > problem, it's not a hardware bug but a gcc or kernel bug.
> > >
> > > This bug specifically indicates some kind of miscompilation in a
> > > driver, causing boot time hangs. My problem is quite different, and
> > > more subtle. The crash happens in the same place every time, which does
> > > suggest determinism (even with various options toggled on and off, and
> > > a 300K smaller kernel image), but it takes 8-12 hours to manifest and
> > > only happens with GCC 4.1.1. ...
> >
> > Sorry if my point goes a bit away from your problem:
> >
> > My point is that we have several reported problems only visible
> > with gcc 4.1.
> >
> > Other bug reports are e.g. [2] and [3], but they are only present with
> > using gcc 4.1 _and_ using -Os.
>
> I find [2] most compelling, and I can confirm that I do have the same
> problem with or without optimisation for size. I don't use selinux nor has
> it ever been enabled.
>
> At any rate, I have absolute confirmation that it is GCC 4.1.1, because
> with GCC 3.4.6 the same kernel I reported booting three days ago is still
> cheerfully working. I regularly get uptimes of 60+ days on that machine,
> rebooting only for kernel upgrades. 2.6.19 seems to be no worse in this
> regard.
>
> Perhaps fortunately, the configs I've tried have consistently failed to
> shake the crash, so I have a semi-reproducible test case here on C3-2
> hardware if somebody wants to investigate the problem (though it still
> takes 6-12 hours).

The GCC code generator appears to have been rewritten between 3.4.6 and
4.1.1....

I took a look at the dump he posted, and there are both minor and massive
differences between the two versions of the code. In one case some of the
code is swapped; in another there is code in the 3.4.6 version that isn't in
the 4.1.1 version. Finally, the 4.1.1 version of the function contains what
appear to be function calls, and these don't appear in the code generated by
3.4.6.
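
A comparison like the one described above can be roughly automated. The
following is only a minimal sketch in Python, not the method used for the
dump in this thread. It assumes both listings are in the usual
"address: opcode bytes  instruction" layout that objdump -d prints; the names
mnemonics, compare and added_calls are made up for illustration.

```python
import difflib
import re

# Assumed line layout (objdump -d style):
#   "   3:\te8 00 00 00 00 \tcall   memset"
# i.e. hex address, colon, opcode bytes, then the instruction text.
_LINE = re.compile(r"^\s*[0-9a-f]+:\s+(?:[0-9a-f]{2}\s)+\s*(.*)$")


def mnemonics(listing):
    """Return only the instruction text, dropping addresses and opcode bytes."""
    out = []
    for line in listing.splitlines():
        m = _LINE.match(line)
        if m and m.group(1).strip():
            # Collapse runs of whitespace so padding differences are ignored.
            out.append(" ".join(m.group(1).split()))
    return out


def compare(old, new):
    """Unified diff of two listings, comparing instruction text only."""
    return list(
        difflib.unified_diff(
            mnemonics(old), mnemonics(new),
            "gcc-3.4.6", "gcc-4.1.1", lineterm="",
        )
    )


def added_calls(old, new):
    """Call instructions present in the new listing but not in the old one."""
    old_calls = {i for i in mnemonics(old) if i.startswith("call")}
    return [
        i for i in mnemonics(new)
        if i.startswith("call") and i not in old_calls
    ]
```

Lines prefixed with "+" or "-" in the output of compare() show code that
exists in only one build, and added_calls() lists calls that only the newer
compiler emitted. Absolute addresses inside operands will still differ between
builds, so some noise in the diff is to be expected.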

In other words, the code generation in 4.1.1 appears to be broken when it
comes to generating system code.

DRH