Bug with background reconstruction

David Harris (dharris@drh.net)
Mon, 7 Dec 1998 17:17:37 -0500


Hi,

I'm running the raid kernel patch 0.90 and the raidtools version 0.90. More
specifically raid0145-19981110-2.0.35 and raidtools-19981105-0.90.tar.

In testing my raid setup I was playing around with disabling the disks and
running the device in degraded mode and then reconstructing it.

I found that if I cut power to a disk while the system was running and the
md device was active, the failure was detected just fine and the member
drive was marked as bad. On reboot, with the disk enabled again, the md
device was still running in degraded mode: the superblock on the good disk
remembered that the other disk was bad. So I used "raidhotadd" to re-insert
the previously bad disk into the array. After this, background
reconstruction started and worked beautifully.

I encountered a problem when I did a "raidhotadd" on an md device which was
running in readonly mode. (In my testing I had taken a reconstructing md
device and set it readonly, which just stopped the reconstruction, so I
thought this would be okay.)

Doing the "raidhotadd" on the readonly md device crashed the system. I've
attached the copy-and-paste of the crash as it appeared on my system
console. I'll avoid causing this problem in the future, but the bug should
be fixed.
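
Until that happens, a workaround is to check /proc/mdstat before calling
"raidhotadd". The following is only a minimal sketch in Python: the function
names md_states and safe_to_hot_add are hypothetical, and the parsing assumes
the /proc/mdstat layout shown in the console log in this message, which other
kernel versions may not share.

```python
import re

# Matches lines such as:
#   md2 : active (read-only) raid1 sda9[0] 0 blocks [2/1] [U_]
# The layout is assumed from the /proc/mdstat output quoted in this message.
MD_LINE = re.compile(r"^(md\d+)\s*:\s*(\S+)(\s+\(read-only\))?")


def md_states(mdstat_text):
    """Return {device: (state, is_read_only)} parsed from /proc/mdstat text."""
    states = {}
    for line in mdstat_text.splitlines():
        m = MD_LINE.match(line)
        if m:
            states[m.group(1)] = (m.group(2), m.group(3) is not None)
    return states


def safe_to_hot_add(mdstat_text, md_name):
    """True only if md_name is listed, active, and not marked read-only."""
    state = md_states(mdstat_text).get(md_name)
    return state is not None and state[0] == "active" and not state[1]


# Sample text copied from the first "cat /proc/mdstat" in the log.
SAMPLE = """\
Personalities : [raid1] [raid5]
read_ahead 128 sectors
md2 : active (read-only) raid1 sda9[0] 0 blocks [2/1] [U_]
md1 : active raid1 sdb7[1] sda7[0] 0 blocks [2/2] [UU]
md0 : active raid1 sda6[0] 0 blocks [2/1] [U_]
unused devices: <none>
"""

if __name__ == "__main__":
    for name in ("md0", "md1", "md2"):
        print(name, safe_to_hot_add(SAMPLE, name))
```

With the sample text, the check passes for md0 and md1 and fails for md2,
the readonly device. On a live system the text would come from reading
/proc/mdstat instead of SAMPLE.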

- David Harris
Principal Engineer, DRH Internet Services

-----
[root@w1 /root]#
[root@w1 /root]# cat /proc/mdstat
Personalities : [raid1] [raid5]
read_ahead 128 sectors
md2 : active (read-only) raid1 sda9[0] 0 blocks [2/1] [U_]
md1 : active raid1 sdb7[1] sda7[0] 0 blocks [2/2] [UU]
md0 : active raid1 sda6[0] 0 blocks [2/1] [U_]
unused devices: <none>
[root@w1 /root]# raidhotadd /dev/md2 /dev/sdb9
trying to hot-add sdb9 to md2 ...
bind<sdb9,2>
RAID1 conf printout:
--- wd:1 rd:2 nd:1
disk 0, s:0, o:1, n:0 rd:0 us:1 dev:sda9
disk 1, s:0, o:0, n:1 rd:1 us:1 dev:[dev 00:00]
disk 2, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
disk 3, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
disk 4, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
disk 5, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
disk 6, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
disk 7, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
disk 8, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
disk 9, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
disk 10, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
disk 11, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
RAID1 conf printout:
--- wd:1 rd:2 nd:2
disk 0, s:0, o:1, n:0 rd:0 us:1 dev:sda9
disk 1, s:0, o:0, n:1 rd:1 us:1 dev:[dev 00:00]
disk 2, s:1, o:0, n:2 rd:2 us:1 dev:sdb9
disk 3, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
disk 4, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
disk 5, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
disk 6, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
disk 7, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
disk 8, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
disk 9, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
disk 10, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
disk 11, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
sdb9(write) sdb9's sb offset: 1534080
sda9(write) sda9's sb offset: 1534080
.
RAID1 conf printout:
--- wd:1 rd:2 nd:2
disk 0, s:0, o:1, n:0 rd:0 us:1 dev:sda9
disk 1, s:0, o:0, n:1 rd:1 us:1 dev:[dev 00:00]
disk 2, s:1, o:0, n:2 rd:2 us:1 dev:sdb9
disk 3, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
disk 4, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
disk 5, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
disk 6, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
disk 7, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
disk 8, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
disk 9, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
disk 10, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
disk 11, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
RAID1 conf printout:
--- wd:1 rd:2 nd:2
disk 0, s:0, o:1, n:0 rd:0 us:1 dev:sda9
disk 1, s:0, o:0, n:1 rd:1 us:1 dev:[dev 00:00]
disk 2, s:1, o:1, n:2 rd:2 us:1 dev:sdb9
disk 3, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
disk 4, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
disk 5, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
disk 6, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
disk 7, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
disk 8, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
disk 9, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
disk 10, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
disk 11, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
[root@w1 /root]# cat /proc/mdstat
Personalities : [raid1] [raid5]
read_ahead 128 sectors
md2 : active (read-only) raid1 sdb9[2] sda9[0] 0 blocks [2/1] [U_] resync=1% finish=4.8min
md1 : active raid1 sdb7[1] sda7[0] 0 blocks [2/2] [UU]
md0 : active raid1 sda6[0] 0 blocks [2/1] [U_]
unused devices: <none>
[root@w1 /root]# Can't write to read-only device 09:02
Can't write to read-only device 09:02
Can't write to read-only device 09:02
Can't write to read-only device 09:02
Can't write to read-only device 09:02
Can't write to read-only device 09:02
-----

The error message just kept repeating until I shut down the machine. I
tried to telnet in and put the device into readwrite mode, but I could not.
My guess is that the system was CPU bound printing the error message, but
I'm not sure.
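
The four "RAID1 conf printout" blocks in the log differ in only a few fields,
which is hard to see by eye. The sketch below, in Python, parses such
printouts and reports which fields changed between two of them. The names
parse_printouts and changed_fields are hypothetical, the field letters
(s, o, n, rd, us) are carried through as printed without interpreting them,
and the sample is abbreviated to disks 0 to 2 of three of the printouts.

```python
import re

# One "disk N, ..." line of a conf printout, in the format shown in the log.
DISK_LINE = re.compile(
    r"^disk (\d+), s:(\d+), o:(\d+), n:(\d+) rd:(\d+) us:(\d+) dev:(.+)$"
)


def parse_printouts(log_text):
    """Split a console log into a list of printouts, each {disk_index: fields}."""
    printouts = []
    for line in log_text.splitlines():
        line = line.strip()
        if line == "RAID1 conf printout:":
            printouts.append({})  # start a new printout
            continue
        m = DISK_LINE.match(line)
        if m and printouts:
            s, o, n, rd, us = (int(x) for x in m.group(2, 3, 4, 5, 6))
            printouts[-1][int(m.group(1))] = {
                "s": s, "o": o, "n": n, "rd": rd, "us": us, "dev": m.group(7),
            }
    return printouts


def changed_fields(before, after):
    """Return {disk_index: {field: (old, new)}} for fields that differ."""
    changes = {}
    for idx in sorted(set(before) | set(after)):
        old, new = before.get(idx, {}), after.get(idx, {})
        diff = {k: (old.get(k), new.get(k))
                for k in sorted(set(old) | set(new))
                if old.get(k) != new.get(k)}
        if diff:
            changes[idx] = diff
    return changes


# Abbreviated from the log: disks 0-2 of the first, second and last printout.
SAMPLE_LOG = """\
RAID1 conf printout:
--- wd:1 rd:2 nd:1
disk 0, s:0, o:1, n:0 rd:0 us:1 dev:sda9
disk 1, s:0, o:0, n:1 rd:1 us:1 dev:[dev 00:00]
disk 2, s:0, o:0, n:0 rd:0 us:0 dev:[dev 00:00]
RAID1 conf printout:
--- wd:1 rd:2 nd:2
disk 0, s:0, o:1, n:0 rd:0 us:1 dev:sda9
disk 1, s:0, o:0, n:1 rd:1 us:1 dev:[dev 00:00]
disk 2, s:1, o:0, n:2 rd:2 us:1 dev:sdb9
RAID1 conf printout:
--- wd:1 rd:2 nd:2
disk 0, s:0, o:1, n:0 rd:0 us:1 dev:sda9
disk 1, s:0, o:0, n:1 rd:1 us:1 dev:[dev 00:00]
disk 2, s:1, o:1, n:2 rd:2 us:1 dev:sdb9
"""

if __name__ == "__main__":
    snaps = parse_printouts(SAMPLE_LOG)
    for before, after in zip(snaps, snaps[1:]):
        print(changed_fields(before, after))
```

On the abbreviated sample, only the disk 2 slot changes: first sdb9 appears
in it, and in the last printout its "o" field goes from 0 to 1.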

