Re: EXT3 File System Corruption 2.6.34

From: Eric Sandeen
Date: Mon Jun 07 2010 - 21:34:08 EST



On Jun 7, 2010, at 6:55 PM, Jeffrey Merkey <jeffmerkey@xxxxxxxxx> wrote:

---------- Forwarded message ----------
From: Jeffrey Merkey <jeffmerkey@xxxxxxxxx>
Date: Mon, Jun 7, 2010 at 5:54 PM
Subject: Re: EXT3 File System Corruption 2.6.34
To: Eric Sandeen <sandeen@xxxxxxxxxxx>


REPLY TO ALL

CONFIG_EXT3_DEFAULTS_TO_ORDERED is not set

Whether it is set this way or not, I should not see corruption.

Here you are mistaken. Mount with data=ordered and see. Writeback can expose stale data.

-Eric

I am seeing data corruption including the following:

/boot/grub/grub.conf getting filled with binary chars
/root/.viminfo filled with strange text chars (not binary)
.o files filled with the same garbage.

Looks like EXT3 metadata -- maybe some blocks getting transposed somewhere?

I will recreate the data patterns I see during corruption and post
them here. They are consistent with some sort of fill pattern -- at
least what I see in viminfo is.

In the case of corrupted .o files, the endian headers are missing and
trashed in the OBJ section headers -- chances are the same kind of
garbage.
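
A quick way to make the .o observation concrete is to look at the first 16 bytes of each suspect object. The sketch below assumes the objects are ELF files (the usual case for a Linux toolchain, but not stated in this thread) and only checks the magic number and the byte-order field; `describe_elf_ident` is a hypothetical helper name, not an existing tool.

```python
# Minimal sketch: classify the 16-byte ELF identification block (e_ident).
# Assumption: the .o files are ELF objects.

ELF_MAGIC = b"\x7fELF"
# e_ident[5] (EI_DATA) encodes the byte order of the file.
EI_DATA_NAMES = {1: "little-endian", 2: "big-endian"}


def describe_elf_ident(header):
    """Return a one-line description of the first 16 bytes of a file."""
    if len(header) < 16:
        return "too short to hold an ELF identification block"
    if header[:4] != ELF_MAGIC:
        # Show the bytes that sit where the magic number should be.
        return "bad magic: " + header[:4].hex()
    data_byte = header[5]
    return "ELF, " + EI_DATA_NAMES.get(
        data_byte, "invalid data encoding byte %d" % data_byte
    )
```

Calling it on `open(path, "rb").read(16)` for each rebuilt object gives a one-line summary that can be pasted into a reply alongside a hex dump.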

Jeff


Still seeing file system corruption after journal recovery in EXT3.
It's easy to reproduce, though the symptoms vary. One way is to
rebuild a program and while the program is being compiled just shut
off power to the system by pulling the plug. I am seeing the
/root/.viminfo file trashed after recovery if Vim was active during
poweroff. I am also seeing object modules getting built which the LD
linker claims are "invalid" following a recovery event. I suspect a
bug in the buffer cache since deleting the file still causes the old
data to be returned from buffer cache even when the sectors are
overwritten, but both are interrelated. Seems in some way related to
EXT3 recovery which results in the buffer cache returning old sectors
and junk.

It is not hard to reproduce. The symptoms are always a little
different, but the /root/.viminfo file getting nuked seems to be a
common effect of this bug.

"file system corruption" usually means corrupted metadata, but I guess
here you mean file corruption, i.e. corrupted data.

If you have buffered data in the cache, it will be lost when you pull
the plug. If your userspace doesn't sync it, this is expected. But it's
not clear to me what you're seeing.
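
As an illustration of what "syncing from userspace" can look like, here is a minimal sketch of a write that asks the kernel to push the data to disk before the new file replaces the old one. It is not what Vim or the compiler actually does; `durable_write` is a hypothetical helper, and it assumes a POSIX system where a directory can be opened and fsync'ed.

```python
import os
import tempfile


def durable_write(path, data):
    """Replace `path` with `data` (bytes), syncing before and after the rename.

    Sketch only: a volatile drive write cache can still lose data that
    fsync() has reported as written.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file in the same directory so the rename
    # below stays within one file system.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()             # userspace buffer -> page cache
            os.fsync(tmp.fileno())  # page cache -> storage device
        os.replace(tmp_path, path)  # atomically swap in the new contents
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    # Sync the directory so the rename itself is on disk.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

A file written with plain buffered writes and no fsync() has no such guarantee, which is why losing recently written data on power loss is expected behaviour rather than a file system bug.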

I'm also not clear on what you mean about deleting the file and having old
data returned. Maybe a little cut and paste from the screen would help
explain what you see.

I'd also check CONFIG_EXT3_DEFAULTS_TO_ORDERED and be sure you're
using data=ordered mode by default.
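
One way to confirm the journaling mode actually in effect is to read the mount options back from /proc/mounts. The sketch below parses text in that format and reports the data= option for each ext3 mount; `ext3_data_modes` is a hypothetical helper, and it assumes the kernel lists the data mode among the options (if it does not, the mode is reported as None).

```python
def ext3_data_modes(mounts_text):
    """Map each ext3 mount point to its data= journaling mode.

    `mounts_text` is text in /proc/mounts format:
    device mount_point fs_type options dump pass
    """
    modes = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[2] != "ext3":
            continue
        mount_point = fields[1]
        mode = None  # stays None when no data= option is listed
        for option in fields[3].split(","):
            if option.startswith("data="):
                mode = option[len("data="):]
        modes[mount_point] = mode
    return modes
```

Passing it the contents of /proc/mounts, for example `ext3_data_modes(open("/proc/mounts").read())`, shows whether the root file system is running with data=ordered or data=writeback.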

-Eric

Jeff


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
