Non-fatal "bug in file" error

David Harris (dharris@drh.net)
Wed, 9 Dec 1998 03:16:18 -0500


Hi,

I've been torture testing my raid setup and encountered one of those "bug in
file md.c" errors. The md device still came up and runs properly, so this is
not a fatal error.

I've got two drives, sda and sdb, which support multiple raid1 md devices. I
got the system running (without any filesystems mounted) and raidstarted
md0, then killed power to the sda drive. This caused the raid driver to mark
the sda drive as bad in the raid superblock.

I then rebooted with the sda drive powered off, which I can do because each
drive has a working MBR and kernel and initrd images. At this point, sda is
now what sdb was when I created the raid array and there is no sdb. I wanted
to see if the raid driver would detect this properly. On running
"raidstart /dev/md0" I was notified that "device name has changed from sdb6
to sda6 since last import!". Cool, this is what needed to happen! Then came
a "bug in file md.c, line 1321" error.
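
The rename also shows up in the device numbers in the console output: the
superblock records the second mirror as sdb6(8,22), while the same partition
now appears as sda6(8,6). Here is a rough Python sketch of that mapping,
assuming the conventional Linux numbering for the first sixteen SCSI disks
(major 8, sixteen minors per disk). sd_devnum is a hypothetical helper for
illustration, not part of the raid tools.

```python
import re

SCSI_DISK_MAJOR = 8    # assumed: major number of the first 16 SCSI disks
MINORS_PER_DISK = 16   # assumed: whole disk plus partitions 1..15


def sd_devnum(name):
    """Return (major, minor) for a name such as 'sda6' (sda..sdp only)."""
    m = re.fullmatch(r"sd([a-p])(\d*)", name)
    if m is None:
        raise ValueError(f"not an sda..sdp device name: {name!r}")
    disk_index = ord(m.group(1)) - ord("a")
    partition = int(m.group(2)) if m.group(2) else 0  # 0 = whole disk
    if partition >= MINORS_PER_DISK:
        raise ValueError(f"partition number out of range: {name!r}")
    return SCSI_DISK_MAJOR, disk_index * MINORS_PER_DISK + partition


if __name__ == "__main__":
    for dev in ("sda6", "sdb6"):
        print(dev, sd_devnum(dev))
```

With these assumptions sda6 maps to (8, 6) and sdb6 to (8, 22), matching the
numbers in the state printout.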

Despite the "bug in file" warning the driver kept going and
reported that the md0 device was up and running in degraded mode. I then
mounted the filesystem read-only and looked at it to verify that the md0
device was really up.

Included are (1) the /proc/mdstat status showing md0 in degraded mode
with both drives attached under their original names (sda and sdb), and
(2) a cut-and-paste of the system console from running "raidstart /dev/md0"
when the second disk appeared as sda and there was no sdb. I'm
running raid199811110-2.0.35.

I don't know what this "bug in file" warning means, but I thought I should
pass it along.

- David Harris
Principal Engineer, DRH Internet Services

First attachment:

bash# cat /proc/mdstat
Personalities : [raid1] [raid5]
read_ahead 128 sectors
md2 : active raid1 sdb9[1] sda9[0] 0 blocks [2/2] [UU]
md1 : active raid1 sdb7[1] sda7[0] 0 blocks [2/2] [UU]
md0 : active raid1 sdb6[1] sda6[0](F) 0 blocks [2/1] [_U]
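
For anyone scripting a check for this state, here is a minimal Python sketch
that reads one status line in the format shown above. The function name
parse_mdstat_line and the dictionary keys are made up for illustration. It
assumes the "[total/working] [U_]" layout seen in this output and has not
been checked against other md versions.

```python
import re

# Matches lines such as:
#   md0 : active raid1 sdb6[1] sda6[0](F) 0 blocks [2/1] [_U]
LINE_RE = re.compile(
    r"(md\d+) : (\w+) (\w+) (.*) \[(\d+)/(\d+)\] \[([U_]+)\]\s*$"
)
# Matches one member entry such as "sda6[0](F)" or "sdb6[1]".
MEMBER_RE = re.compile(r"(\w+)\[(\d+)\](\(F\))?")


def parse_mdstat_line(line):
    """Parse one array line; return a dict, or None if it is not one."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    name, state, level, middle, total, working, flags = m.groups()
    members = MEMBER_RE.findall(middle)
    return {
        "name": name,
        "state": state,
        "level": level,
        "members": [dev for dev, _, _ in members],
        "faulty": [dev for dev, _, failed in members if failed],
        "degraded": int(working) < int(total),
        "flags": flags,
    }


if __name__ == "__main__":
    sample = "md0 : active raid1 sdb6[1] sda6[0](F) 0 blocks [2/1] [_U]"
    print(parse_mdstat_line(sample))
```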

Second attachment:

bash# raidstart /dev/md0
(read) sda6's sb offset: 208704
bind<sda6,1>
md: sdb6 has zero size, marking faulty!
bind<sdb6,2>
md: auto-running md0.
md: device name has changed from sdb6 to sda6 since last import!
md: bug in file md.c, line 1321

**********************************
* <COMPLETE RAID STATE PRINTOUT> *
**********************************
md0: <sdb6><sda6> array superblock:
SB: (V:0.90.0) ID:<3ee3a861.d204f956.d7e1be0f.fc18e0f8> CT:36673d63
L1 S00208640 ND:2 RD:2 md0 LO:0 CS:131072
UT:366de58e ST:1 AD:1 WD:1 FD:1 SD:0 CSUM:fae8e279
D 0: DISK<N:0,sda6(8,6),R:0,S:1>
D 1: DISK<N:1,sda6(8,6),R:1,S:6>
D 2: DISK<N:2,[dev 00:00](0,0),R:2,S:9>
D 3: DISK<N:0,[dev 00:00](0,0),R:0,S:0>
D 4: DISK<N:0,[dev 00:00](0,0),R:0,S:0>
D 5: DISK<N:0,[dev 00:00](0,0),R:0,S:0>
D 6: DISK<N:0,[dev 00:00](0,0),R:0,S:0>
D 7: DISK<N:0,[dev 00:00](0,0),R:0,S:0>
D 8: DISK<N:0,[dev 00:00](0,0),R:0,S:0>
D 9: DISK<N:0,[dev 00:00](0,0),R:0,S:0>
D 10: DISK<N:0,[dev 00:00](0,0),R:0,S:0>
D 11: DISK<N:0,[dev 00:00](0,0),R:0,S:0>
THIS: DISK<N:1,sdb6(8,22),R:1,S:6>
rdev sdb6: O:[dev 00:00], SZ:00000000 F:1 DN:-1 no rdev superblock!
rdev sda6: O:sdb6, SZ:00000000 F:0 DN:1 rdev superblock:
SB: (V:0.90.0) ID:<3ee3a861.d204f956.d7e1be0f.fc18e0f8> CT:36673d63
L1 S00208640 ND:2 RD:2 md0 LO:0 CS:131072
UT:366de58e ST:1 AD:1 WD:1 FD:1 SD:0 CSUM:fae8e279
D 0: DISK<N:0,sda6(8,6),R:0,S:1>
D 1: DISK<N:1,sdb6(8,22),R:1,S:6>
D 2: DISK<N:2,[dev 00:00](0,0),R:2,S:9>
D 3: DISK<N:0,[dev 00:00](0,0),R:0,S:0>
D 4: DISK<N:0,[dev 00:00](0,0),R:0,S:0>
D 5: DISK<N:0,[dev 00:00](0,0),R:0,S:0>
D 6: DISK<N:0,[dev 00:00](0,0),R:0,S:0>
D 7: DISK<N:0,[dev 00:00](0,0),R:0,S:0>
D 8: DISK<N:0,[dev 00:00](0,0),R:0,S:0>
D 9: DISK<N:0,[dev 00:00](0,0),R:0,S:0>
D 10: DISK<N:0,[dev 00:00](0,0),R:0,S:0>
D 11: DISK<N:0,[dev 00:00](0,0),R:0,S:0>
THIS: DISK<N:1,sda6(8,6),R:1,S:6>
**********************************

md: dropping descriptor-less faulty sdb6
unbind<sdb6,1>
export_rdev(sdb6)
raid1: device sda6 operational as mirror 1
raid1: md0, not all disks are operational -- trying to recover array
raid1: raid set md0 active with 1 out of 2 mirrors
md: updating md0 RAID superblock on device
sda6(write) sda6's sb offset: 208704
md: recovery thread got woken up ...
md0: no spare disk to reconstruct array! -- continuing in degraded mode
md: recovery thread finished ...
.
bash# cat /proc/mdstat
Personalities : [raid1] [raid5]
read_ahead 128 sectors
md0 : active raid1 sda6[0] 0 blocks [2/1] [_U]
unused devices: <none>
bash# mkdir /mnt/root
bash# mount /dev/md0 /mnt/root -o ro
bash# ls /mnt/root
bin dev lib mnt1 sbin
boot1 etc lost+found mnt2 tmp
boot2 home misc proc usr
bru initrd mnt root var
bash#
bash# cat /proc/mdstat
Personalities : [raid1] [raid5]
read_ahead 128 sectors
md0 : active raid1 sda6[0] 0 blocks [2/1] [_U]
unused devices: <none>
bash#
bash#
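
The S: field in each DISK<...> descriptor of the state printout looks like a
bitmask of disk state. The Python sketch below decodes it, assuming the bit
positions defined for the 0.90 superblock in the Linux md headers (faulty =
bit 0, active = bit 1, sync = bit 2, removed = bit 3). Whether this
particular raid patch uses the same values has not been verified, and
decode_disk is a hypothetical helper for illustration.

```python
import re

# Assumed bit positions for the descriptor state field (0.90 superblock).
MD_DISK_BITS = {0: "faulty", 1: "active", 2: "sync", 3: "removed"}

DISK_RE = re.compile(
    r"DISK<N:(\d+),(.+?)\((\d+),(\d+)\),R:(\d+),S:(\d+)>"
)


def decode_disk(text):
    """Decode one 'DISK<N:..,dev(maj,min),R:..,S:..>' descriptor."""
    m = DISK_RE.search(text)
    if m is None:
        raise ValueError(f"no DISK<...> descriptor in: {text!r}")
    number, dev, major, minor, raid_disk, state = m.groups()
    state = int(state)
    return {
        "number": int(number),
        "device": dev,
        "devnum": (int(major), int(minor)),
        "raid_disk": int(raid_disk),
        "state": [name for bit, name in sorted(MD_DISK_BITS.items())
                  if (state >> bit) & 1],
    }


if __name__ == "__main__":
    for line in ("D 0: DISK<N:0,sda6(8,6),R:0,S:1>",
                 "D 1: DISK<N:1,sdb6(8,22),R:1,S:6>",
                 "D 2: DISK<N:2,[dev 00:00](0,0),R:2,S:9>"):
        print(decode_disk(line))
```

Under that assumption S:1 would read as faulty, S:6 as active plus sync, and
S:9 as faulty plus removed, which is consistent with the first mirror having
been marked bad and the second still being in service.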

