debugging hardware problems

David van der Spoel (spoel@xray.bmc.uu.se)
Sat, 6 Mar 1999 22:26:30 +0100 (MET)


Hi,

I have two identical SMP machines running Linux 2.0.36 with two 3Com 3c905B
Cyclone network cards. I bought the machines for parallel floating-point
work using MPI or PVM. One of the two often crashes when doing anything on
the network, whether my parallel application or something as simple as lynx.
I have tried hard to crash the other machine, but without success so far.
I swapped the two network cards (or rather, swapped eth0 and eth1), but that
made no difference. I also swapped the memory chips, again with no difference.
Usually the machine crashes with:

Mar 4 13:37:34 matejko kernel: Kernel panic: skput:over: 10824af8:1514
Mar 4 13:37:34 matejko kernel: In swapper task - not syncing
Mar 4 13:37:34 matejko kernel: eth0: Warning -- the skbuff addresses do not match in boomerang_rx: 03dbc02a vs. f0006ae0.
Mar 4 13:37:34 matejko kernel: general protection: 0000

Since only one of my machines crashes, I suspect hardware. Is there any
hardware testing tool out there? Other hints?

I also tried the 2.2.2 kernel, but it made no difference.

Groeten, David.
________________________________________________________________________
Dr. David van der Spoel Biomedical center, Dept. of Biochemistry
s-mail: Husargatan 3, Box 576, 75123 Uppsala, Sweden
e-mail: spoel@xray.bmc.uu.se www: http://zorn.bmc.uu.se/~spoel
phone: 46 18 471 4205 fax: 46 18 511 755
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
