Re: lots of 2.2.4 oopses

Valient Gough (vgough@pobox.com)
Thu, 25 Mar 1999 21:43:52 -0700


I've been running Linux 2.2.4 compiled with gcc 2.7.2.3 for nearly two days
now without problems.

When I ran it with egcs-1.1.2, using exactly the same options, etc,
I couldn't run for more then an hour before getting problems.

So, it's starting to look more and more likely that the problem is due to
the combination of egcs-1.1.2 and Linux 2.2.4...

regards,
Val

Linus Torvalds wrote:

> In article <36F92EBB.B33E673E@pobox.com>,
> Valient Gough <vgough@pobox.com> wrote:
> >
> >I upgraded from 2.2.3 to 2.2.4 last night. Since then I've had to
> >lockups. The second one managed to log "Unable to handle kernel paging
> >request at virtual address 00de3f9c" before it locked up, but I didn't
> >have the correct System.map file in place, so klogd didn't have symbols.
> >
> >But, this morning, I had 4 such oops in a row, with a correct
> >System.map.
>
> Ok, they are all basically the same oops, the file pointer list has
> gotten corrupted somehow, and that corruption will result in an oops
> whenever the kernel tries to touch one of the file pointers.
>
> The oops happens in remove_filp() (which is an inline function,
> explaining why the symbol table says it is in get_empty_filp()).
>
> The code looks like
>
> if(file->f_next)
> file->f_next->f_pprev = file->f_pprev;
> *file->f_pprev = file->f_next;
>
> and the thing oopses apparently because file->f_next is just crap (it
> appears to have the value 0x2c302c35, to be exact - seems to be the
> string "5,0," in fact).
>
> I would _much_ prefer it if you could try to compile 2.2.4 with
> gcc-2.7.2. We now have two reports of an oops like this, and in both
> cases egcs was used as the compiler. And egcs is doing alias analysis
> these days - and following the ANSI rules about aliases is something
> that we haven't validated that the kernel is doing correctly (and that's
> assuming that egcs is bug-free, which is obviously also a potential
> issue).
>
> Basically, right now I don't know whether this is (a) a real kernel bug
> (b) the kernel having some place it is not alias analysis aware and
> doing something ugly that breaks with egcs or (c) an egcs bug.
>
> Compiling with gcc-2.7.2 would at least resolve whether it is the
> compiler that induces the effect..
>
> Linus

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/