scsi remove-single-device broken and RAID rebuild bug

From: Malcolm Beattie (mbeattie@sable.ox.ac.uk)
Date: Mon Apr 03 2000 - 08:44:50 EST


With kernel 2.0, I use
    echo "scsi remove-single-device 0 0 x 0" > /proc/scsi/scsi
to remove a (hardware hot-swappable) disk from the SCSI bus in order
to simulate RAID5 disk failure (and, when the disk really fails, to
do the disk replacement itself).

With kernel 2.2.14, that no longer works. strace shows that the write
returns -EBUSY (corresponding to a non-zero scd->attached in scsi.c).
Help? What am I missing? Surely someone else must have noticed this?
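
[Editor's sketch] The failing write can also be reproduced from a small C helper that reports the errno directly, without needing strace. This is a minimal sketch, not kernel code: the function names format_scsi_cmd and send_scsi_cmd are hypothetical, and the SCSI ID (the "x" in the echo above) is passed in as a parameter.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Build the command line that the echo above sends, trailing newline
 * included. Returns the string length, or -1 if buf is too small.
 * (Hypothetical helper, named for illustration only.) */
static int format_scsi_cmd(char *buf, size_t len, const char *op,
                           int host, int channel, int id, int lun)
{
    int n = snprintf(buf, len, "scsi %s %d %d %d %d\n",
                     op, host, channel, id, lun);
    return (n < 0 || (size_t)n >= len) ? -1 : n;
}

/* Write cmd to path in a single write(). Returns 0 on success or a
 * negative errno value such as -EBUSY. (Hypothetical helper.) */
static int send_scsi_cmd(const char *path, const char *cmd)
{
    size_t want = strlen(cmd);
    int fd = open(path, O_WRONLY);
    if (fd < 0)
        return -errno;

    ssize_t n = write(fd, cmd, want);
    int err = (n < 0) ? errno : 0;   /* save errno before close() */
    close(fd);

    if (err)
        return -err;
    return ((size_t)n == want) ? 0 : -EIO;   /* treat short write as error */
}
```

Calling send_scsi_cmd("/proc/scsi/scsi", buf) as root, with buf filled in by format_scsi_cmd for "remove-single-device", would be expected to return -EBUSY in the situation described above.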

As a RAID-related follow-on: I tried instead to use the ioctl
SCSI_IOCTL_STOP_UNIT to spin down a disk in order to simulate a disk
failure and initiate a RAID5 rebuild (with kernel 2.2.14 plus patch
raid-2.2.14-B1 as shipped in Red Hat's kernels for 6.1 and 6.2). This
led to the error

 md: bug in file raid5.c, line 659
  
       **********************************
       * <COMPLETE RAID STATE PRINTOUT> *
       **********************************

followed by a dump of RAID superblock info in the log. All processes
with files open on the (ext2) filesystem on the RAID5 device then
hung in uninterruptible sleep, as did any process that even tried a
"cat /proc/mdstat".
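
[Editor's sketch] For anyone wanting to reproduce this, the spin-down step amounts to one ioctl on the disk's device node. This is a minimal sketch under assumptions: the function name scsi_stop_unit is hypothetical, the device path is whatever node the member disk has on the system in question, and the caller is assumed to have sufficient privilege. SCSI_IOCTL_STOP_UNIT is taken from the userspace header <scsi/scsi_ioctl.h>.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <scsi/scsi_ioctl.h>   /* SCSI_IOCTL_STOP_UNIT */

/* Ask the SCSI layer to spin down the disk behind dev.
 * Returns 0 on success or a negative errno value.
 * (Hypothetical helper, named for illustration only.) */
static int scsi_stop_unit(const char *dev)
{
    int fd = open(dev, O_RDONLY);
    if (fd < 0)
        return -errno;

    /* The ioctl takes no meaningful argument here, so pass NULL. */
    int err = (ioctl(fd, SCSI_IOCTL_STOP_UNIT, NULL) < 0) ? errno : 0;
    close(fd);
    return -err;
}
```

Given the hang described above, this should only be run against a member of a scratch array on a test machine.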

I sent a version of this message to the linux-raid mailing list 7 days
ago and had no reply at all, so I'm trying here: the
remove-single-device feature, at least, is a general kernel matter
rather than RAID-specific.

--Malcolm

-- 
Malcolm Beattie <mbeattie@sable.ox.ac.uk>
Unix Systems Programmer
Oxford University Computing Services
