Problems with scsi adapter timeouts using stripes

Tony_Ten_Broeck@themoneystore.com
Fri, 22 Jan 1999 15:18:18 -0800


Hi all,
I have been having problems with the developmental SCSI support (since
at least before 1.128, through 220p8). The problem recurs on both Adaptec
7880s and ncr53c875s. Everything is fine until I put a heavy load on the
adapter, i.e. a stripe across 5 or more drives. Then I get a series of
timeout-related errors (see below). I have reduced queue tags to zero, but
I still get the problem. I have tried multiple systems (dual P6 Vectra XUs,
dual P2 Compaq 1600s, HP NetServer LH IIs, all with their own adapters),
and they all show similar timeouts that hang the system for 10 seconds or
more before resetting. The only kernel I haven't had problems with is the
out-of-the-box Red Hat 5.2 kernel (v2.0.36), which is very slow for SCSI,
and if I recompile that one, I get a system hang on the SCSI timeout. I was
wondering if anyone has had a similar problem and could tell me what
changes to make in the SCSI code, so that either I don't get this problem,
or the timeout and the reloading of the scripts into the controller's
memory don't take so long (it may be a bottleneck anyway).
Thanks

logged errors:
Jan 20 11:38:30 newsmail kernel: scsi : aborting command due to timeout : pid 73540, scsi1, channel 0, id 5, lun 0 Write (6) 01 7e b6 f4 00
Jan 20 11:38:30 newsmail kernel: ncr53c8xx_abort: pid=73540 serial_number=73649 serial_number_at_timeout=73649
Jan 20 11:38:30 newsmail kernel: scsi : aborting command due to timeout : pid 73543, scsi1, channel 0, id 4, lun 0 Write (6) 01 7e de f4 00
Jan 20 11:38:30 newsmail kernel: ncr53c8xx_abort: pid=73543 serial_number=73652 serial_number_at_timeout=73652
Jan 20 11:38:30 newsmail kernel: scsi : aborting command due to timeout : pid 73544, scsi1, channel 0, id 3, lun 0 Write (6) 01 7f 4e b8 00
Jan 20 11:38:30 newsmail kernel: ncr53c8xx_abort: pid=73544 serial_number=73653 serial_number_at_timeout=73653
Jan 20 11:38:30 newsmail kernel: scsi : aborting command due to timeout : pid 73545, scsi1, channel 0, id 8, lun 0 Write (6) 01 7f 56 a8 00
Jan 20 11:38:30 newsmail kernel: ncr53c8xx_abort: pid=73545 serial_number=73654 serial_number_at_timeout=73654
Jan 20 11:38:30 newsmail kernel: scsi : aborting command due to timeout : pid 73546, scsi1, channel 0, id 5, lun 0 Write (6) 01 7f aa 54 00
Jan 20 11:38:30 newsmail kernel: ncr53c8xx_abort: pid=73546 serial_number=73655 serial_number_at_timeout=73655
Jan 20 11:38:30 newsmail kernel: ncr53c875-1: abort ccb=c7968820 (cancel)
Jan 20 11:38:30 newsmail kernel: scsi : aborting command due to timeout : pid 73547, scsi1, channel 0, id 9, lun 0 Write (6) 01 7f fe 02 00
Jan 20 11:38:30 newsmail kernel: ncr53c8xx_abort: pid=73547 serial_number=73656 serial_number_at_timeout=73656
Jan 20 11:38:30 newsmail kernel: scsi : aborting command due to timeout : pid 73548, scsi1, channel 0, id 10, lun 0 Write (6) 01 7f fe 40 00
Jan 20 11:38:30 newsmail kernel: ncr53c8xx_abort: pid=73548 serial_number=73657 serial_number_at_timeout=73657
Jan 20 11:38:30 newsmail kernel: scsi : aborting command due to timeout : pid 73549, scsi1, channel 0, id 4, lun 0 Write (6) 01 7f d2 6c 00
Jan 20 11:38:30 newsmail kernel: ncr53c8xx_abort: pid=73549 serial_number=73658 serial_number_at_timeout=73658
Jan 20 11:38:30 newsmail kernel: ncr53c875-1: abort ccb=c0fe8020 (cancel)
Jan 20 11:38:30 newsmail kernel: scsi : aborting command due to timeout : pid 73550, scsi1, channel 0, id 8, lun 0 Write (6) 01 7f fe 40 00
Jan 20 11:38:30 newsmail kernel: ncr53c8xx_abort: pid=73550 serial_number=73659 serial_number_at_timeout=73659
Jan 20 11:38:30 newsmail kernel: ncr53c875-1: abort ccb=c0fe7020 (cancel)
Jan 20 11:38:30 newsmail kernel: scsi : aborting command due to timeout : pid 73551, scsi1, channel 0, id 3, lun 0 Write (6) 01 80 06 3a 00
Jan 20 11:38:30 newsmail kernel: ncr53c8xx_abort: pid=73551 serial_number=73660 serial_number_at_timeout=73660
Jan 20 11:38:30 newsmail kernel: ncr53c875-1: abort ccb=c0fe8820 (cancel)
Jan 20 11:38:30 newsmail kernel: scsi : aborting command due to timeout : pid 73552, scsi1, channel 0, id 9, lun 0 Write (6) 01 80 00 3e 00
Jan 20 11:38:30 newsmail kernel: ncr53c8xx_abort: pid=73552 serial_number=73661 serial_number_at_timeout=73661
Jan 20 11:38:30 newsmail kernel: ncr53c875-1: abort ccb=c7bb6820 (cancel)
Jan 20 11:38:30 newsmail kernel: scsi : aborting command due to timeout : pid 73553, scsi1, channel 0, id 10, lun 0 Write (6) 01 80 ae 6a 00
Jan 20 11:38:30 newsmail kernel: ncr53c8xx_abort: pid=73553 serial_number=73662 serial_number_at_timeout=73662
Jan 20 11:38:30 newsmail kernel: ncr53c875-1: abort ccb=c0fe6020 (cancel)
Jan 20 11:38:33 newsmail kernel: SCSI host 1 abort (pid 73545) timed out - resetting
Jan 20 11:38:33 newsmail kernel: SCSI bus is being reset for host 1 channel 0.
Jan 20 11:38:33 newsmail kernel: ncr53c8xx_reset: pid=73545 reset_flags=2 serial_number=73654 serial_number_at_timeout=73654
Jan 20 11:38:33 newsmail kernel: ncr53c875-1: restart (scsi reset).
Jan 20 11:38:33 newsmail kernel: ncr53c875-1: Downloading SCSI SCRIPTS.
Jan 20 11:38:34 newsmail kernel: ncr53c875-1-<5,*>: FAST-20 WIDE SCSI 40.0 MB/s (50 ns, offset 15)
Jan 20 11:38:34 newsmail kernel: ncr53c875-1-<3,*>: FAST-20 WIDE SCSI 40.0 MB/s (50 ns, offset 15)
Jan 20 11:38:34 newsmail kernel: ncr53c875-1-<4,*>: FAST-20 WIDE SCSI 40.0 MB/s (50 ns, offset 15)
Jan 20 11:38:34 newsmail kernel: ncr53c875-1-<10,*>: FAST-20 WIDE SCSI 40.0 MB/s (50 ns, offset 16)
Jan 20 11:38:34 newsmail kernel: ncr53c875-1-<9,*>: FAST-20 WIDE SCSI 40.0 MB/s (50 ns, offset 16)
Jan 20 11:38:34 newsmail kernel: ncr53c875-1-<8,*>: FAST-20 WIDE SCSI 40.0 MB/s (50 ns, offset 16)
Jan 20 11:38:49 newsmail kernel: scsi : aborting command due to timeout : pid 73765, scsi1, channel 0, id 3, lun 0 Write (6) 01 89 66 f4 00
Jan 20 11:38:49 newsmail kernel: ncr53c8xx_abort: pid=73765 serial_number=73892 serial_number_at_timeout=73892
Jan 20 11:38:49 newsmail kernel: scsi : aborting command due to timeout : pid 73766, scsi1, channel 0, id 9, lun 0 Write (6) 01 8a 54 92 00
Jan 20 11:38:49 newsmail kernel: ncr53c8xx_abort: pid=73766 serial_number=73893 serial_number_at_timeout=73893
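
To check whether the timeouts cluster on particular targets or are spread
across the whole stripe, the abort messages above can be tallied per device.
What follows is only a rough Python sketch and not part of the original
report: the names ABORT_RE, count_timeouts_by_target and SAMPLE are made up
for illustration, and the pattern assumes the "aborting command due to
timeout" message format exactly as it appears in the log above. The pattern
uses \s+ between the two halves of the message so that it matches whether or
not a mail client has wrapped the line.

```python
import re
from collections import Counter

# Matches the "aborting command due to timeout" message from the log above.
# \s+ allows the message to be split over two lines by mail wrapping.
ABORT_RE = re.compile(
    r"aborting command due to timeout :\s+"
    r"pid (\d+), scsi(\d+), channel (\d+), id (\d+), lun (\d+)"
)


def count_timeouts_by_target(log_text):
    """Return a Counter mapping (host, channel, id, lun) to abort count."""
    counts = Counter()
    for match in ABORT_RE.finditer(log_text):
        _pid, host, channel, target, lun = map(int, match.groups())
        counts[(host, channel, target, lun)] += 1
    return counts


# A short excerpt copied from the log above (illustrative input only).
SAMPLE = """\
Jan 20 11:38:30 newsmail kernel: scsi : aborting command due to timeout :
pid 73540, scsi1, channel 0, id 5, lun 0 Write (6) 01 7e b6 f4 00
Jan 20 11:38:30 newsmail kernel: scsi : aborting command due to timeout :
pid 73543, scsi1, channel 0, id 4, lun 0 Write (6) 01 7e de f4 00
Jan 20 11:38:30 newsmail kernel: scsi : aborting command due to timeout :
pid 73544, scsi1, channel 0, id 3, lun 0 Write (6) 01 7f 4e b8 00
Jan 20 11:38:30 newsmail kernel: scsi : aborting command due to timeout :
pid 73546, scsi1, channel 0, id 5, lun 0 Write (6) 01 7f aa 54 00
"""

if __name__ == "__main__":
    # Print one line per target, most frequently timed-out target first.
    for (host, channel, target, lun), n in count_timeouts_by_target(SAMPLE).most_common():
        print(f"scsi{host} channel {channel} id {target} lun {lun}: {n} abort(s)")
```

Run over a full syslog, a tally like this would show whether a single drive
dominates the aborts (pointing at a device or cabling problem) or whether
every member of the stripe times out together (pointing at the adapter or
driver under load).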
