Showstoppers (SCSI ?)

Peter Waltenberg (peterw@dascom.com)
Wed, 09 Dec 1998 08:25:19 +1000 (EST)


I think I may have found another. (2.1.131 AND 2.0.32)

While trying to untar from a scsi tape to a scsi disk I'm getting reliable
system hangs.
Listing the tape with tar tvf works just fine, but extracting reliably hangs
the system when writing one particular file.

Interestingly enough 2.0.32 behaves similarly and hangs writing the same
file, but not quite as lethally.

On 2.1.131.
More than just the tar blocks (D state): all filesystem accesses to the
destination filesystem wedge, the system load keeps growing, and eventually
the kernel dies.

Manually translating the last page of the crash trace:

Mostly the system was ping-ponging between do_page_fault and error_code
with an occasional vgacon_cursor or set_cursor.

Last words: Cannot handle kernel paging request.
Aieee Killing interrupt handler.

The scsi disk is only about 3% full when this happens and has plenty of
inodes left.

The 2.0.32 system we had problems with originally has an AIC7880 with the disk
on it and a 2940 for the tape.

The 2.1.131 system has an AHA1542 with both devices on it.

The scsi disk is 9G, split into four partitions of just over 2G each.

Things I've tried so far:

No bad blocks on the disk. I rebuilt the e2fs file system on it with tools that
built a usable 2G+ partition on IDE on the same machine. cp -ar and tar from
another disk to the scsi disk both work error free.

I can believe the tar image on tape may have some sort of weird problem, however
it looks like this (whatever it is) would make a great DoS attack ;).

Extracting to /dev/null also hangs. But since that hasn't taken
the machine with it, here is what ps shows:

ps al
FLAGS UID PID PPID PRI NI SIZE RSS WCHAN       STA TTY TIME COMMAND
   40   0 409    1   0  0    0   0 down_failed SW  p0  0:00 (scsi_eh_0)
    0   0 482  390   0  0  872 464 down_failed D   p0  0:10 tar xvfO /dev/tape
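
For anyone trying to reproduce this, here is a rough sketch of how the
processes stuck in D state could be listed straight from /proc instead of
through ps. It assumes a Linux /proc where each /proc/<pid>/stat starts with
"pid (comm) state ..."; the function names parse_stat_line and
blocked_processes are made up for illustration and are not part of any
existing tool.

```python
import os


def parse_stat_line(line):
    """Return (pid, comm, state) from the text of a /proc/<pid>/stat file."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    open_paren = line.index("(")
    close_paren = line.rindex(")")
    pid = int(line[:open_paren].strip())
    comm = line[open_paren + 1:close_paren]
    state = line[close_paren + 1:].split()[0]
    return pid, comm, state


def blocked_processes(proc_root="/proc"):
    """List (pid, comm) for processes in uninterruptible sleep (state D)."""
    blocked = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a per-process directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as handle:
                pid, comm, state = parse_stat_line(handle.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited while we were looking, or unreadable
        if state == "D":
            blocked.append((pid, comm))
    return blocked


if __name__ == "__main__":
    for pid, comm in blocked_processes():
        print(pid, comm)
```

Run in a loop while the extract is going, something like this would show
whether other processes pile up in D state behind the tar before the machine
goes down.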

If anyone has any ideas on finding the real bug here, mail me. I can do some
testing on the machine after hours.

Peter
----------------------------------
E-Mail: Peter Waltenberg <peterw@dascom.com>
Date: 09-Dec-98
Time: 07:27:33

This message was sent by XFMail
----------------------------------
