Re: sg SCSI bus timeout problem (2.3.99pre...)

From: Michael H. Warfield (mhw@wittsend.com)
Date: Sun May 21 2000 - 14:48:30 EST


Follow up to my own post...

On Sun, May 21, 2000 at 03:23:38PM -0400, Michael H. Warfield wrote:
> Hello all,

> Where I am having problems is closing out the session. The final
> close out command is giving me an error 30 seconds after initiating
> with this error being returned from cdrecord:

> ] cdrecord: Input/output error. close track/session: scsi sendcmd: retryable error
> ] CDB: 5B 00 02 00 00 00 00 00 00 00
> ] status: 0x2 (CHECK CONDITION)
> ] Sense Bytes: 70 00 06 00 00 00 00 0A 00 00 00 00 29 00 00 00
> ] Sense Key: 0x6 Unit Attention, Segment 0
> ] Sense Code: 0x29 Qual 0x00 (power on, reset, or bus device reset occurred) Fru 0x0
> ] Sense flags: Blk 0 (not valid)
> ] cmd finished after 30.393s timeout 480s

> Now... That last line is saying it died 30 seconds after issuing
> the command and that it had set the timeout to 480 seconds.

> Here's what I'm getting down in syslog, however:

> ] May 21 14:35:36 alcove kernel: scsi : aborting command due to timeout : pid 0, scsi0, channel 0, id 0, lun 0 Read (10) 00 00 04 3c 51 00 00 20 00
> ] May 21 14:35:36 alcove kernel: scsi : aborting command due to timeout : pid 0, scsi0, channel 0, id 0, lun 0 Read (10) 00 00 04 3c 79 00 00 58 00
> ] May 21 14:35:37 alcove kernel: SCSI host 0 abort (pid 0) timed out - resetting
> ] May 21 14:35:37 alcove kernel: SCSI bus is being reset for host 0 channel 0.
> ] May 21 14:35:40 alcove kernel: (scsi0:0:0:0) Synchronous at 5.0 Mbyte/sec, offset 15.
> ] May 21 14:35:40 alcove kernel: (scsi0:0:6:0) Using asynchronous transfers.
> ] May 21 14:35:41 alcove kernel: (scsi0:0:4:0) Synchronous at 10.0 Mbyte/sec, offset 15.
> ] May 21 14:35:45 alcove kernel: (scsi0:0:3:0) Synchronous at 10.0 Mbyte/sec, offset 15.
> ] May 21 14:35:45 alcove kernel: (scsi0:0:1:0) Synchronous at 5.0 Mbyte/sec, offset 15.

> Oh oh!!! The kernel timed out the command and reset the SCSI bus.
> That basically agrees with the error message returned by cdrecord. But
> the timeout was supposed to have been 480 seconds, not 30 seconds! That's
> off by a factor of 16! WTF???
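
The sense bytes quoted above can be checked by hand against what cdrecord printed. A minimal sketch, assuming the standard fixed-format SCSI sense layout (sense key in the low nibble of byte 2, additional sense code in byte 12, qualifier in byte 13, FRU in byte 14); the function name is made up for illustration and is not part of cdrecord or the kernel:

```python
# Sense bytes copied verbatim from the cdrecord output quoted above.
SENSE_HEX = "70 00 06 00 00 00 00 0A 00 00 00 00 29 00 00 00"


def decode_fixed_sense(hex_string):
    """Pick the main fields out of fixed-format SCSI sense data.

    Hypothetical helper for illustration only.
    """
    data = bytes.fromhex(hex_string)
    response_code = data[0] & 0x7F
    if response_code not in (0x70, 0x71):
        raise ValueError("not fixed-format sense data")
    return {
        "response_code": response_code,
        "sense_key": data[2] & 0x0F,   # 0x6 = Unit Attention
        "additional_length": data[7],
        "asc": data[12],               # additional sense code
        "ascq": data[13],              # additional sense code qualifier
        "fru": data[14],               # field replaceable unit code
    }


if __name__ == "__main__":
    for name, value in decode_fixed_sense(SENSE_HEX).items():
        print(f"{name}: 0x{value:02x}")
```

Sense key 0x6 with code 0x29 and qualifier 0x00 is the "power on, reset, or bus device reset occurred" condition, which matches the bus reset logged by the kernel.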

        Further note... Setting the session close timeout value in
cdrecord to some ridiculous value gets things to work. The CD fixates
in 64 seconds at 4X speed, which would have fit easily within the 480
seconds it should have had but is well beyond the 30 seconds it
actually got. Looks like the timeout value is being honored but is not
being set to the value it should be.

        Is this a kernel bug or a cdrecord bug? I would assume that since
cdrecord is setting the timeout value to seconds*HZ, the problem is down
in the kernel somewhere.

        I hate to just arbitrarily multiply all the timeouts by 16, but
that looks like the kludge to work around the problem at the moment...
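
For reference, the arithmetic behind the numbers above, as a rough sketch. HZ = 100 is an assumption (the tick rate is not stated in this thread), the helper names are made up for illustration, and nothing here says where the factor of 16 actually comes from:

```python
HZ = 100  # assumption: kernel tick rate, not stated in this thread


def seconds_to_jiffies(seconds, hz=HZ):
    """Convert a timeout in seconds to ticks (the seconds*HZ conversion)."""
    return seconds * hz


requested_s = 480   # timeout cdrecord reported setting
observed_s = 30     # roughly when the command was aborted (30.393s)
fixate_s = 64       # measured fixation time at 4X

# How far the effective timeout fell short of the requested one.
shrink_factor = requested_s // observed_s


def padded_timeout(needed_s, factor=shrink_factor):
    """Kludge: request `factor` times the time actually needed."""
    return needed_s * factor


if __name__ == "__main__":
    print("requested ticks:", seconds_to_jiffies(requested_s))
    print("shrink factor:", shrink_factor)
    print("request needed to cover fixation:", padded_timeout(fixate_s), "s")
```

With these figures, covering a 64 second fixation would take a requested timeout of at least 1024 seconds, assuming the factor of 16 holds for other values too.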

        [...]

        Mike

-- 
 Michael H. Warfield    |  (770) 985-6132   |  mhw@WittsEnd.com
  (The Mad Wizard)      |  (770) 331-2437   |  http://www.wittsend.com/mhw/
  NIC whois:  MHW9      |  An optimist believes we live in the best of all
 PGP Key: 0xDF1DD471    |  possible worlds.  A pessimist is sure of it!
