Re: Crashed Drive, libata wedges when trying to recover data

From: Brad Campbell
Date: Sat Sep 04 2004 - 23:04:12 EST


Greg Stark wrote:
Alan Cox <alan@xxxxxxxxxxxxxxxxxxx> writes:


Jeff, do we really have to wait 30 seconds for a timeout? If the drive hits an unreadable spot I would have thought it would come back to us with a read error rather than timing out the command.

The drive will retry for a few seconds then fail. The failure now
generates a SCSI medium error to the core scsi layer and it does like to
issue a few retries. The default retry count for scsi is probably too
high for SATA given the drive retries.


Certainly over an hour seems a little excessive:

$ time dd bs=512 count=1 if=/dev/sda4 of=/dev/null
dd: reading `/dev/sda4': Input/output error
0+0 records in
0+0 records out

real 67m59.382s
user 0m0.001s
sys 0m0.002s

Yes. I noted that even when reading a single block, the block layer issues a large read-ahead request, and the entire request then times out block by block. I have been meaning to look into what is required to make it fail the way SCSI/USB devices appear to, which is to fail the entire request on error.
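
As a rough sanity check on the figures above, the sketch below divides the wall-clock time dd reported by the 30-second timeout mentioned earlier in the thread. It assumes every failed command costs exactly one full timeout, which is a simplification; the helper names are made up for illustration and nothing here is measured from the driver.

```python
# Rough arithmetic only: how many 30-second command timeouts would fit
# into the wall-clock time dd reported for the single-block read?

# Assumption: each failed command costs one full 30 s timeout, the figure
# quoted earlier in the thread. Drive-internal retry time is ignored.
TIMEOUT_S = 30


def elapsed_seconds(minutes, seconds):
    """Convert a 'real XmY.Zs' figure from time(1) into seconds."""
    return minutes * 60 + seconds


def implied_timeouts(elapsed_s, timeout_s=TIMEOUT_S):
    """Number of back-to-back timeouts that would add up to elapsed_s."""
    return elapsed_s / timeout_s


elapsed = elapsed_seconds(67, 59.382)  # "real 67m59.382s"
print(f"elapsed: {elapsed:.3f} s")
print(f"implied timeouts: {implied_timeouts(elapsed):.1f}")
```

Under that assumption the hour-plus run works out to roughly 136 consecutive timeouts for a one-sector read. Since the drive's own retries and any SCSI-layer retries add time on top of each timeout, the number of distinct failing commands is probably lower than that.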

I'm also not convinced there isn't another issue lurking in there, but when it takes an hour to recover from a bad block read, it does slow down testing somewhat ;p)

Regards,
Brad