Re: Another strange OOPS in kernel 2.0.25

Ion Badulescu (ionut@moisil.wal.rhno.columbia.edu)
Fri, 6 Dec 1996 22:03:47 -0500 (EST)


On Fri, 6 Dec 1996, Flavio Spada wrote:

> I got these OOPS in kernel 2.0.25 (uptime about 3 weeks):

[snip]

My educated guess is that you have some really weird hardware problems. I
looked at your previously posted oopses, and they all fault either at
addresses that make no sense (though not the usual off-by-one-bit
addresses that point to bad memory) or at address 0.
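
To make the "off-by-one-bit" remark concrete, here is a minimal sketch of
the check I have in mind. It assumes you can guess what the pointer should
have been (for instance from a neighbouring register); the function name
and the idea of an "expected" value are my own illustration, not anything
the kernel provides.

```c
#include <stdint.h>

/* Returns 1 if `seen` differs from `expected` in exactly one bit position,
 * which is the classic signature of a flipped bit in bad RAM or cache.
 * `expected` is a guess supplied by the person reading the oops. */
int differs_by_one_bit(uint32_t seen, uint32_t expected)
{
    uint32_t diff = seen ^ expected;   /* bits that changed */

    /* A non-zero value with a single bit set is a power of two,
     * and for those (diff & (diff - 1)) is zero. */
    return diff != 0 && (diff & (diff - 1)) == 0;
}
```

A faulting address that passes this test against a plausible pointer
suggests a memory problem; the addresses in these oopses do not look like
that.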

> eax: 00203a00 ebx: 0020ffe0 ecx: 00000006 edx: 00000000
                                            ^^^^^^^^^^^^^

> Code: 11b024 <shrink_mmap+74/1e0> testb $0x10,0x14(%edx)

This is a NULL pointer in the page->buffers list, which is supposed to be
a circular list. Granted, it could be a bug, but...
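
For those who have not looked at that loop, here is a rough sketch of the
kind of ring walk involved. The struct is a simplified, hypothetical
stand-in (the names `buf`, `this_page`, `BUF_TOUCHED` and `count_touched`
are mine, and the layout is not the 2.0.25 one); the flag value is simply
assumed from the 0x10 mask in the testb above. The NULL check is only
there so the sketch can report the broken ring instead of faulting.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for a buffer descriptor. */
struct buf {
    unsigned long state;      /* flag bits, tested on every step */
    struct buf *this_page;    /* next buffer on the same page; circular */
};

#define BUF_TOUCHED 0x10UL    /* assumed flag value, from the testb mask */

/* Walk the ring starting at `head`.
 * Returns the number of buffers with BUF_TOUCHED set, or -1 if a NULL
 * link is found, which cannot happen in a properly closed ring. A loop
 * without that check would dereference the NULL on the next step. */
int count_touched(const struct buf *head)
{
    const struct buf *tmp = head;
    int n = 0;

    if (head == NULL)
        return 0;             /* no buffers attached at all */

    do {
        if (tmp->state & BUF_TOUCHED)
            n++;
        tmp = tmp->this_page;
        if (tmp == NULL)
            return -1;        /* ring is broken */
    } while (tmp != head);

    return n;
}
```

The point is that the only legitimate way out of such a loop is coming
back to the starting buffer, so a zero in the link field means something
overwrote it.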

> eax: 00000000 ebx: 0020ffe0 ecx: 00000006 edx: f0009ab7
                                            ^^^^^^^^^^^^^

> Code: 11b024 <shrink_mmap+74/1e0> testb $0x10,0x14(%edx)

... the next oops, occurring at exactly the same point, gives an address
that makes no sense whatsoever. HOWEVER, the pattern looks very similar to
that in one of the older oopses:

> eax: 08008b80 ebx: 00010302 ecx: 00000302 edx: 00000444
  ^^^^^^^^^^^^^

> Code: 123c60 <get_hash_table+30/d0> cmpl %ebp,(%eax)

This is tmp->b_blocknr in find_buffer. Again, a linked list having to do
with the buffer allocation - coincidence or real bug?
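
As an illustration of why a bad link shows up exactly there, this is
roughly what a hash-bucket lookup of that kind looks like. Again the
struct and the names `hbuf` and `find_in_chain` are hypothetical and
simplified, not the real kernel definitions; the only detail taken from
the oops is that the block number is read at offset 0 of the pointer.

```c
#include <stddef.h>

/* Hypothetical, simplified buffer descriptor for a hash chain. */
struct hbuf {
    unsigned long blocknr;    /* first field, so it is read at offset 0 */
    int dev;
    struct hbuf *next;        /* next entry in the bucket, NULL at the end */
};

/* Linear search of one hash bucket. Only NULL terminates the loop, so any
 * other bogus value in a `next` field gets dereferenced by the blocknr
 * comparison on the following iteration. */
struct hbuf *find_in_chain(struct hbuf *bucket, int dev, unsigned long block)
{
    struct hbuf *tmp;

    for (tmp = bucket; tmp != NULL; tmp = tmp->next)
        if (tmp->blocknr == block && tmp->dev == dev)
            return tmp;

    return NULL;
}
```

So the very first thing touched through a corrupted link is the block
number, which matches the faulting compare.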

Does anybody have a better explanation for these oopses?

Ionut

--
  It is better to keep your mouth shut and be thought a fool,
            than to open it and remove all doubt.