Re: How to online remove an error scsi disk from the system?

From: Bart Van Assche
Date: Fri Feb 01 2013 - 02:54:41 EST


On 02/01/13 07:13, Tao Ma wrote:
In our product system, we have several sata disks attached to one
machine. So when one of the disk fails, the jbd2(yes, we use ext4) will
hang forever and we will get something in /var/log/messages like below.
It seems to me that the io sent to the scsi layer is never returned back
with -EIO which is a little bit surprised for me(It should be a timeout
somewhere, right?). We have tried echo "offline" >
/sys/block/sdl/device/state, but it doesn't work. So is there any way
for us to let the scsi device returns all the io requests back with EIO
so that all the end_io can be called accordingly? Am I missing something
here?

Please note that I'm not familiar with SAS. But I found this in drivers/scsi/scsi_proc.c:

* proc_scsi_write - handle writes to /proc/scsi/scsi
* @file: not used
* @buf: buffer to write
* @length: length of buf, at most PAGE_SIZE
* @ppos: not used
*
* Description: this provides a legacy mechanism to add or remove
* devices by Host, Channel, ID, and Lun. To use,
* "echo 'scsi add-single-device 0 1 2 3' > /proc/scsi/scsi" or
* "echo 'scsi remove-single-device 0 1 2 3' > /proc/scsi/scsi" with
* "0 1 2 3" replaced by the Host, Channel, Id, and Lun.

Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/