Interesting.
The oops looks fine. The symbolic information also looks fine: the code
in question does in fact look like it is the second instruction in
"inet_sendmsg()". Everything basically seems to say that the oops is
correctly decoded and caught.
The thing that does NOT make sense is the cause of the oops itself,
though.
The oops happens on
	c017b651 pushl %ebx
and %esp = c3941e80.
And quite frankly, there's not a way in h*ll that that instruction could
raise the exception in question. But it does.
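For what it's worth, the arithmetic behind that claim can be sketched as
below. This is only an illustration: the 4096-byte page size and the
0xC0000000 kernel base are assumptions about a stock i386 configuration, and
the helper names are made up. It shows that the store done by the push is
aligned, stays within a single page, and lands in the kernel mapping; it says
nothing about whether that page was actually present at the time.

```python
# Assumptions (not taken from the oops itself): stock i386 layout.
PAGE_SIZE = 4096           # assumed i386 page size
PAGE_OFFSET = 0xC0000000   # assumed start of the kernel mapping

def push_target(esp, size=4):
    """Return (first, last) byte address written by a 32-bit push.

    A push decrements %esp by `size` and stores at the new %esp.
    """
    first = (esp - size) & 0xFFFFFFFF
    last = (first + size - 1) & 0xFFFFFFFF
    return first, last

def same_page(a, b):
    """True if both addresses fall in the same page."""
    return a // PAGE_SIZE == b // PAGE_SIZE

esp = 0xC3941E80                      # %esp from the oops
first, last = push_target(esp)

print("store covers", hex(first), "to", hex(last))
print("aligned to 4 bytes:  ", first % 4 == 0)
print("within one page:     ", same_page(first, last))
print("same page as old esp:", same_page(first, esp))
print("in kernel mapping:   ", first >= PAGE_OFFSET)
```

Every check comes out true for this %esp, which is why the push itself looks
like such an unlikely culprit.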
I would _strongly_ suspect one of two things:
- bad CPU.
- bad cache or RAM timings.
Basically, the instruction cannot raise that exception with those
inputs. So either the CPU is just doing something randomly wrong due to
internal corruption, OR the CPU gets fed the wrong data at some earlier
point, and when the exception happens and we re-fetch that data, now it
is magically ok again because the timings were better this time.
Or something.
Note that the "bad CPU" thing may have been brought about by the MTRR
changes: maybe Linux sets up some Cyrix CPU state (it was a Cyrix CPU,
right?) incorrectly.
Oh, and do you get the message
	Cyrix processor with "coma bug" found, workaround enabled
at bootup? Maybe that workaround does something else bad.
So I would strongly suggest turning off MTRR support and seeing whether the
behaviour becomes more reliable.
I would also suggest making sure that everything is properly cooled:
overheating can easily result in random problems - corrupting internal
CPU state resulting in basically random behaviour.
Linus