Re: [BUG] usb-storage: Error in queuecommand: us->srb = ffff88006a338480

From: Alan Stern
Date: Mon Nov 10 2008 - 15:18:33 EST


On Mon, 10 Nov 2008, Brian Kysela wrote:

> I tested with 2.6.27.5 and found that, although the process would hang as often
> as usual, it always recovered instead of needing to reboot. No kernel bug or
> system freeze, no climbing load avg, etc. Here is the usbmon output on a
> failed copy:
>
> http://www.kysela.org/pub/4.mon.out
>
> The syslog:
>
> [ 1003.736201] sd 6:0:0:0: [sdb] Assuming drive cache: write through
> [ 1003.738949] sd 6:0:0:0: [sdb] Assuming drive cache: write through
> [ 1003.741886] sdb1
> [ 1112.311917] end_request: I/O error, dev sdb, sector 667600
> [ 1112.311956] sd 6:0:0:0: rejecting I/O to offline device
> [ 1112.312038] sd 6:0:0:0: rejecting I/O to offline device
> [ 1112.312074] end_request: I/O error, dev sdb, sector 667840
> [ 1112.312121] sd 6:0:0:0: rejecting I/O to offline device
> [ 1112.312131] Buffer I/O error on device sdb1, logical block 512
> [ 1112.312137] lost page write due to I/O error on sdb1
> [ 1112.312159] sd 6:0:0:0: rejecting I/O to offline device
> [ 1112.312168] Buffer I/O error on device sdb1, logical block 576
> [ 1112.312172] lost page write due to I/O error on sdb1
> [ 1112.312181] Buffer I/O error on device sdb1, logical block 577
> [ 1112.312185] lost page write due to I/O error on sdb1
> [ 1112.312193] Buffer I/O error on device sdb1, logical block 578
> [ 1112.312198] lost page write due to I/O error on sdb1
> [ 1112.312211] sd 6:0:0:0: rejecting I/O to offline device
> [ 1112.312234] sd 6:0:0:0: rejecting I/O to offline device
> [ 1112.312247] sd 6:0:0:0: rejecting I/O to offline device
> [ 1112.379235] sd 6:0:0:0: rejecting I/O to offline device
> [ 1112.379640] FAT: unable to read inode block for updating (i_pos 9253)

This is essentially the same failure mechanism as before, but without
the timeout-related kernel bug.

There is a communications error during one of the reads. It takes the
same form in both logs: a transfer receives only 4051 bytes when it
should get 4096, leaving it 45 bytes short. Don't ask me why that
happens; it's some sort of hardware or firmware failure, either in the
drive or in your USB host controller.
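
For anyone who wants to look for this pattern in their own trace, here
is a rough Python sketch that pairs each bulk-IN submission with its
completion and reports the ones that came back short. It assumes the
usbmon text format, where the whitespace-separated fields are URB tag,
timestamp, event type (S or C), address word, status, and data length.
The function name and the sample lines are made up for illustration and
are not taken from the trace linked above. A short bulk-IN transfer is
not always an error, so treat any hits as candidates to inspect.

```python
def find_short_reads(lines):
    """Return (urb_tag, requested, actual) for bulk-IN URBs that came back short.

    Assumes usbmon text lines of the form:
        <urb tag> <timestamp> <S|C> Bi:<bus>:<dev>:<ep> <status> <length> ...
    """
    requested = {}  # URB tag -> length asked for at submission time
    short = []
    for line in lines:
        fields = line.split()
        if len(fields) < 6:
            continue
        tag, _timestamp, event, address, _status, length = fields[:6]
        # Only bulk-IN traffic is of interest; skip anything without a numeric length.
        if not address.startswith("Bi:") or not length.isdigit():
            continue
        if event == "S":
            requested[tag] = int(length)
        elif event == "C" and tag in requested:
            want = requested.pop(tag)
            got = int(length)
            if got < want:
                short.append((tag, want, got))
    return short


# Hypothetical sample lines, NOT copied from the real trace.
sample = [
    "ffff880000000001 1000000 S Bi:1:002:1 -115 4096 <",
    "ffff880000000001 1000500 C Bi:1:002:1 0 4096 = 00000000",
    "ffff880000000002 1001000 S Bi:1:002:1 -115 4096 <",
    "ffff880000000002 1001500 C Bi:1:002:1 0 4051 = 00000000",
]

for tag, want, got in find_short_reads(sample):
    print(f"URB {tag}: asked for {want} bytes, got {got} ({want - got} short)")
```

To run it on a saved trace, pass the open file object instead of the
sample list, since iterating over a file yields its lines.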

The kernel tries to recover, but it looks as though the drive is stuck
trying to send the remaining bytes. Resets don't help, so the drive
is taken offline (hence the "rejecting I/O to offline device" messages).

You _might_ be able to prevent these problems by reducing the drive's
max_sectors value, say to 128. See

http://www.linux-usb.org/FAQ.html#i5

No guarantees, though.
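
As a sketch of what that change looks like in practice: usb-storage
exposes the setting as a sysfs attribute at
/sys/block/<disk>/device/max_sectors, counted in 512-byte sectors, so
128 corresponds to 64 KB per transfer. The helper below is only an
illustration. The function name and the sysfs_root parameter are my own
additions (the latter so the code can be tried against a scratch
directory), and "sdb" is assumed from the log above. It has to run as
root, and the value goes back to the default when the drive is
replugged.

```python
from pathlib import Path


def set_max_sectors(disk, value, sysfs_root="/sys"):
    """Write a new max_sectors value for a usb-storage disk; return the old one.

    disk is the block device name (for example "sdb"); sysfs_root is a
    hypothetical override used only for testing against a scratch directory.
    """
    attr = Path(sysfs_root) / "block" / disk / "device" / "max_sectors"
    old = int(attr.read_text().strip())
    attr.write_text(f"{value}\n")
    return old


# Example use (as root, with the drive plugged in and showing up as sdb):
#     previous = set_max_sectors("sdb", 128)
#     print(f"max_sectors was {previous}, now 128")
```

The same thing can be done by hand by writing the number into that file
from a root shell.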

Alan Stern
