Re: unusual ext2 FS corruption -- for statistics

Frank van Maarseveen (fvm@tasking.nl)
Fri, 23 Jul 1999 21:42:18 +0200


On Fri, Jul 23, 1999 at 01:44:41PM -0400, Theodore Y. Ts'o wrote:
> Frank, can you send some other examples of the bogus inodes you found?
> The one that you sent seemed extremely unusual, and it would be
> worthwhile to track down the rest to see if we can establish any pattern
> as to what's going on here. Thanks!!
There is only one other corrupt inode left on the system. All the others
have been moved to /lost+found and have been cleared with debugfs. From
what I can recall they were all block special with a similar-looking
mode and absurd owner, group and size:

turku:/usr/share/zoneinfo/US # ls -il
total 968527940
208388 -rw-r--r-- 2 root root 826 Jun 7 1998 Alaska
208389 -rw-r--r-- 3 root root 823 Jun 7 1998 Aleutian
208390 -rw-r--r-- 3 root root 130 Jun 7 1998 Arizona
208391 -rw-r--r-- 3 root root 1262 Jun 7 1998 Central
147100 -rw-r--r-- 5 root root 269 Jun 7 1998 East-Indiana
208392 -rw-r--r-- 3 root root 1250 Jun 7 1998 Eastern
202216 -rw-r--r-- 3 root root 113 Jun 7 1998 Hawaii
147101 -rw-r--r-- 3 root root 532 Jun 7 1998 Indiana-Starke
208393 -rw-r--r-- 2 root root 794 Jun 7 1998 Michigan
208394 -rw-r--r-- 5 root root 860 Jun 7 1998 Mountain
208395 -rw-r--r-- 3 root root 1000 Jun 7 1998 Pacific
202217 b---r-srwx 3 30309 29545 115, 97 Sep 6 1996 Samoa

Compare this with the old one:
turku:/usr/share/zoneinfo/US # ls -il /usr/share/locale/et_EE
total 270080582
202150 -rw-r--r-- 1 root root 29970 Jun 7 1998 LC_COLLATE
202151 -rw-r--r-- 1 root root 10428 Jun 7 1998 LC_CTYPE
183808 drwxr-xr-x 2 root root 1024 Jul 30 1998 LC_MESSAGES
202152 -rw-r--r-- 1 root root 94 Jun 7 1998 LC_MONETARY
202153 b---r-srwx 1 30309 12832 100, 32 Nov 21 2025 LC_NUMERIC
202154 -rw-r--r-- 1 root root 516 Jun 7 1998 LC_TIME

debugfs: show_inode_info Samoa
Inode: 202217 Type: block special Mode: 0057 Flags: 0x62202c65 Version: 1
User: 30309 Group: 29545 Size: 1633973039
File ACL: 0 Directory ACL: 0
Links: 3 Blockcount: 1937055854
Fragment: Address: 0 Number: 0 Size: 0
ctime: 0x65646f6e -- Mon Nov 27 11:29:02 2023
atime: 0x49203a32 -- Sun Nov 16 16:20:18 2008
mtime: 0x32303220 -- Fri Sep 6 16:16:00 1996
BLOCKS:
1679848289 1701669236 1952805664 1176510510 1145395273 2606
TOTAL: 6
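
As a side check, the numbers debugfs prints for this inode can be packed
back into bytes to see whether they are really text. This is a minimal
sketch, assuming the on-disk fields are little-endian (as on x86); the
field widths and the helper name as_text are only for illustration:

```python
import struct

# (field name, value as printed by debugfs above, assumed on-disk width)
fields = [
    ("uid", 30309, "<H"),
    ("size", 1633973039, "<I"),
    ("atime", 0x49203a32, "<I"),
    ("ctime", 0x65646f6e, "<I"),
    ("mtime", 0x32303220, "<I"),
    ("gid", 29545, "<H"),
    ("blockcount", 1937055854, "<I"),
    ("flags", 0x62202c65, "<I"),
]
blocks = [1679848289, 1701669236, 1952805664, 1176510510, 1145395273, 2606]


def as_text(value, fmt):
    """Pack one number little-endian and read the bytes back as ASCII."""
    return struct.pack(fmt, value).decode("ascii")


decoded = {name: as_text(value, fmt) for name, value, fmt in fields}
block_text = b"".join(struct.pack("<I", b) for b in blocks).decode("ascii")

for name, text in decoded.items():
    print(f"{name:10s} {text!r}")
print("blocks    ", repr(block_text))
```

With the values above, ctime comes out as "node", size as "/sda", and the
block list reads "as dtime set.  FIXED." followed by a newline, which
suggests the whole inode was overwritten by a piece of text.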

Using the inode numbers:
(202217 - 202153) * 128 == 8k. Indeed, 8k beyond the first corruption
at 0x400 there is a second one at 0x2400 (again add 811039k to get
real /dev/sda2 offsets):

(possibly unrelated funny pattern in inode -- there are other similar)
680: a4 81 00 00 82 77 00 00 3b 2f 98 37 f2 f2 ff 35 .....w..;/.7...5
690: df d8 7a 35 00 00 00 00 00 00 01 00 3e 00 00 00 ..z5........>...
6a0: 00 00 00 00 00 00 00 00 08 71 0c 00 09 71 0c 00 .........q...q..
6b0: 0a 71 0c 00 0b 71 0c 00 0c 71 0c 00 0d 71 0c 00 .q...q...q...q..
6c0: 0e 71 0c 00 0f 71 0c 00 10 71 0c 00 11 71 0c 00 .q...q...q...q..
6d0: 12 71 0c 00 13 71 0c 00 14 71 0c 00 00 00 00 00 .q...q...q......
6e0: 00 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 ................
6f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
.
.
2380: a4 81 00 00 71 00 00 00 46 2f 98 37 07 f3 ff 35 ....q...F/.7...5
2390: 4d d8 7a 35 00 00 00 00 00 00 03 00 02 00 00 00 M.z5............
23a0: 00 00 00 00 00 00 00 00 92 72 0c 00 00 00 00 00 .........r......
23b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
23c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
23d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
23e0: 00 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 ................
23f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
2400: 2f 64 65 76 2f 73 64 61 32 3a 20 49 6e 6f 64 65 /dev/sda2: Inode
2410: 20 32 30 32 00 00 00 00 69 73 03 00 6e 20 75 73  202....is..n us
2420: 65 2c 20 62 75 74 20 68 61 73 20 64 74 69 6d 65 e, but has dtime
2430: 20 73 65 74 2e 20 20 46 49 58 45 44 2e 0a 00 00  set.  FIXED....
2440: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
2450: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
2460: 00 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 ................
2470: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
2480: ed 41 00 00 00 04 00 00 ff 2f 98 37 27 f4 ff 35 .A......./.7'..5
2490: 55 dd eb 35 00 00 00 00 00 00 02 00 02 00 00 00 U..5............
24a0: 00 00 00 00 00 00 00 00 94 72 0c 00 00 00 00 00 .........r......
24b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
24c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
24d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
24e0: 00 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 ................
24f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
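
The offset arithmetic quoted before the dump can be checked with a few
lines. This is a sketch, assuming 128-byte inodes laid out contiguously
in one inode table and that "k" means 1024 bytes; the function names
are made up for illustration:

```python
INODE_SIZE = 128  # bytes per on-disk inode, as used in the message


def inode_distance(first, second, inode_size=INODE_SIZE):
    """Byte distance between two inodes in the same inode table."""
    return (second - first) * inode_size


def device_offset(dump_offset, base_k=811039):
    """Offset on /dev/sda2, assuming the dump starts base_k * 1024 bytes in."""
    return base_k * 1024 + dump_offset


delta = inode_distance(202153, 202217)
first_corruption = 0x400
second_corruption = first_corruption + delta

print(hex(delta), hex(second_corruption))
print(device_offset(second_corruption))
```

The distance is 0x2000 bytes, so a first hit at 0x400 puts the second
one at 0x2400, which is where the text appears in the dump.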

Notice that the first three digits of the inode number in the message
match the numbers of the two corrupted inodes. The other
message referred to inode number "2..142", which doesn't match either of
them but could have been one of the other inodes which have been cleared.
I guess they contained an fsck text as well.
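
To illustrate how such text could turn into a block special file, the
first bytes at 0x2400 can be unpacked as the start of an ext2 inode.
This is only a sketch: the field order follows the classic 128-byte ext2
inode layout, and the major/minor split assumes the old 16-bit device
number kept in the first block pointer, which is an assumption here and
not something checked against this kernel.

```python
import stat
import struct

# Bytes 0x2400..0x242f, copied from the hex dump above.
raw = bytes.fromhex(
    "2f 64 65 76 2f 73 64 61 32 3a 20 49 6e 6f 64 65"
    "20 32 30 32 00 00 00 00 69 73 03 00 6e 20 75 73"
    "65 2c 20 62 75 74 20 68 61 73 20 64 74 69 6d 65"
)

# mode, uid, size, atime, ctime, mtime, dtime, gid, links,
# blocks, flags, osd1, first block pointer (44 bytes in total)
(mode, uid, size, atime, ctime, mtime, dtime,
 gid, links, nblocks, flags, osd1, block0) = struct.unpack_from("<2H5I2H4I", raw)

# Assumed old-style device encoding: major in bits 8..15, minor in bits 0..7.
major = (block0 >> 8) & 0xFF
minor = block0 & 0xFF

print(stat.filemode(mode), links, uid, gid, f"{major}, {minor}")
print("size", size, "dtime", dtime, "flags", hex(flags))
```

Under these assumptions the leading "/d" of "/dev/sda2" becomes mode
0x642f, which prints as b---r-srwx, and the result matches the ls line
for Samoa (owner 30309, group 29545, device 115, 97). Any inode
overwritten by a message starting with "/dev" would get the same mode
and owner.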

The story is that the system in question had no keyboard, no video
card and no screen: it was just operating in a corner of our computer room.
We installed a new kernel (2.2.10-ac11 with NFS/RPC patches posted earlier
by me), updated /etc/lilo.conf, ran lilo and rebooted, but it seemed to hang
during boot-up. Regarding this new kernel: it is running right now on the system
(without re-installing the kernel or rerunning lilo) and was already running
on some other systems, so I don't think this one is to blame.
Next, the system was switched off and on -- no success.
Keyboard, video and monitor were plugged in and after reboot init asked for
a runlevel because /etc/inittab was gone. After a couple of unclean reboots we
succeeded in getting a shell and discovered that /etc was completely hosed:
the directory itself contained garbage and accessing it in some way caused
an enormous amount of kernel messages referring to bitmaps or something
(sorry that I have no more details -- this is all from what I can recall).
I unlinked /etc with debugfs and made a new directory with an empty fstab
because fsck refused to run without it. fsck filled /lost+found
with things from /etc. Some "etc" files were missing but in addition we found
the funny looking block special files. Their enormous size and unclean
appearance convinced me to clear them with debugfs (rm -f refused as root --
should I have reset the immutable flag?). Finally we re-installed /etc using
another system and all troubles were gone except for those two corrupted
inodes just reported.

About the new kernel: I strongly feel that I have to mention this fact
in order to give a complete story but at the same time I am quite sure
that my patch wasn't responsible.

We just found an area on /dev/sda2 which resembles an old /etc directory
but has an fsck message patched into it. Here it is (add 811040k to get
real /dev/sda2 offsets):

39b00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
39b10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
39b20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
39b30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
39b40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
39b50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
39b60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
39b70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
39b80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
39b90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
39ba0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
39bb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
39bc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
39bd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
39be0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
39bf0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
39c00: e9 14 03 00 0c 00 01 00 2e 00 20 49 0b 00 00 00 .......... I....
39c10: f4 03 02 00 2e 2e 00 20 69 73 20 69 6e 20 75 73 ....... is in us
39c20: 65 2c 20 62 75 74 20 68 61 73 20 64 74 69 6d 65 e, but has dtime
39c30: 20 73 65 74 2e 20 20 46 49 58 45 44 2e 0a 66 00  set.  FIXED..f.
39c40: ed 14 03 00 10 00 05 00 68 6f 73 74 73 00 00 00 ........hosts...
39c50: ee 14 03 00 14 00 09 00 63 73 68 2e 63 73 68 72 ........csh.cshr
39c60: 63 00 00 00 ef 14 03 00 10 00 07 00 65 78 70 6f c...........expo
39c70: 72 74 73 00 2c 16 03 00 10 00 05 00 67 72 6f 75 rts.,.......grou
39c80: 70 00 00 00 f1 14 03 00 14 00 09 00 68 6f 73 74 p...........host
39c90: 2e 63 6f 6e 66 00 00 00 f2 14 03 00 14 00 0b 00 .conf...........
39ca0: 68 6f 73 74 73 2e 61 6c 6c 6f 77 00 f3 14 03 00 hosts.allow.....
39cb0: 14 00 0a 00 68 6f 73 74 73 2e 64 65 6e 79 00 00 ....hosts.deny..
39cc0: f4 14 03 00 0c 00 04 00 6d 6f 74 64 2b 16 03 00 ........motd+...
39cd0: 10 00 06 00 70 61 73 73 77 64 00 00 f6 14 03 00 ....passwd......
39ce0: 10 00 08 00 70 72 69 6e 74 63 61 70 f7 14 03 00 ....printcap....
39cf0: 10 00 07 00 70 72 6f 66 69 6c 65 00 f4 0f 00 00 ....profile.....
39d00: 14 00 09 00 70 72 6f 66 69 6c 65 2e 64 00 00 00 ....profile.d...
39d10: f8 14 03 00 14 00 09 00 70 72 6f 74 6f 63 6f 6c ........protocol
39d20: 73 00 00 00 f9 14 03 00 14 00 09 00 73 65 63 75 s...........secu
39d30: 72 65 74 74 79 00 00 00 12 15 03 00 10 00 08 00 retty...........
39d40: 73 65 72 76 69 63 65 73 dc 27 00 00 0c 00 03 00 services.'......
39d50: 58 31 31 00 fb 14 03 00 18 00 0d 00 6e 73 73 77 X11.........nssw
39d60: 69 74 63 68 2e 63 6f 6e 66 00 00 00 fc 14 03 00 itch.conf.......
39d70: 0c 00 03 00 72 70 63 00 fd 14 03 00 14 00 0a 00 ....rpc.........
39d80: 44 49 52 5f 43 4f 4c 4f 52 53 00 00 8a 15 03 00 DIR_COLORS......
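
The intact part of this dump can be read as ext2 directory entries.
This is a rough sketch, assuming the usual layout of a 32-bit inode
number, a 16-bit record length and a 16-bit name length followed by the
name (on filesystems with the filetype feature the upper byte of the
name length is a type code; it is zero everywhere in this excerpt). The
function name walk_dirents is made up for illustration:

```python
import struct

# Bytes 0x39c40..0x39c83, copied from the hex dump above.
raw = bytes.fromhex(
    "ed 14 03 00 10 00 05 00 68 6f 73 74 73 00 00 00"
    "ee 14 03 00 14 00 09 00 63 73 68 2e 63 73 68 72"
    "63 00 00 00 ef 14 03 00 10 00 07 00 65 78 70 6f"
    "72 74 73 00 2c 16 03 00 10 00 05 00 67 72 6f 75"
    "70 00 00 00"
)


def walk_dirents(buf):
    """Return (inode, rec_len, name) for each directory entry in buf."""
    entries = []
    pos = 0
    while pos + 8 <= len(buf):
        inode, rec_len, name_len = struct.unpack_from("<IHH", buf, pos)
        if rec_len < 8:
            break  # corrupt or empty record; stop instead of looping forever
        name = buf[pos + 8 : pos + 8 + name_len].decode("ascii", "replace")
        entries.append((inode, rec_len, name))
        pos += rec_len
    return entries


for inode, rec_len, name in walk_dirents(raw):
    print(f"{inode:8d} {rec_len:4d} {name}")
```

Read this way the excerpt yields hosts, csh.cshrc, exports and group.
Applying the same reading by hand to the bytes at 39c00 gives a "."
entry followed by a ".." entry with a record length of 1012, which would
cover the rest of a 1024-byte block, so the old names after it would no
longer be reachable through the directory.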

-- 
Frank
