Re: 2.4.22-pre lockups (now decoded oops for pre10)

From: Stephan von Krawczynski (skraw@ithnet.com)
Date: Wed Aug 06 2003 - 04:36:58 EST


On Wed, 6 Aug 2003 11:09:20 +0200
Willy Tarreau <willy@w.ods.org> wrote:

> On Wed, Aug 06, 2003 at 09:41:50AM +0200, Stephan von Krawczynski wrote:
>
> > Code; c0144b14 <__remove_from_queues+14/30>
> > 00000000 <_EIP>:
> > Code; c0144b14 <__remove_from_queues+14/30> <=====
> > 0: 89 02 mov %eax,(%edx) <=====
> > Code; c0144b16 <__remove_from_queues+16/30>
> > 2: c7 41 30 00 00 00 00 movl $0x0,0x30(%ecx)
> > Code; c0144b1d <__remove_from_queues+1d/30>
> > 9: 89 4c 24 04 mov %ecx,0x4(%esp,1)
> > Code; c0144b21 <__remove_from_queues+21/30>
> > d: e9 7a ff ff ff jmp ffffff8c <_EIP+0xffffff8c>
> > Code; c0144b26 <__remove_from_queues+26/30>
> > 12: 8d 76 00 lea 0x0(%esi),%esi
>
> once again, it's *pprev=next which is is causing trouble, with pprev=6 this
> time (fs/buffer.c:523). There really seems to be something playing badly with
> this...
>
> I find amazing that such widely used portions of code only trigger panics on
> your system ! either it's a rare combinations of several components/drivers,
> or a strange hardware problem, although I can't imagine which (cpu? bus
> locking?).

Hm, the hardware may not be that widespread. I guess not many people are really
using SMP, 64 bit PCI network, 3 GB RAM, 3ware RAID5 and serverworks board
altogether in one box. I can't fight the impression it has something to do with
locking issues. It doesn't look exactly like a hardware problem, you would not
expect crashes on the same type of code then.
The question is: what additional information is needed to find the underlying
problem?

Regards,
Stephan
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Thu Aug 07 2003 - 22:00:32 EST