Re: Mild filesystem corruption on ext4 (no journal)

From: Alan Jenkins
Date: Fri Jun 05 2009 - 12:42:01 EST


Eric Sandeen wrote:
> Alan Jenkins wrote:
>> Aioanei Rares wrote:
>>> I suspect, although I might be wrong, that this is not a kernel-related
>>> problem.
>>
>> "To try and rule out a faulty userspace program, I marked the file as read-only (chmod a-w) and immutable (chattr +i). After a reboot, the file was still read-only and immutable, yet it still became corrupted."
>>
>> Since the immutable bit is not respected, I tend to think it is a kernel problem. Unless the filesystem isn't getting unmounted/flushed properly for some reason... but I thought the modern kernel had that covered.
>>
>> I agree it is very suspicious this happens only after upgrading libc. I'll see if I can find an individual change in libc locale-handling that might trigger this.
>
> Maybe you could try some things in your shutdown script, such as
> explicitly fsyncing the file, or bmapping it with filefrag, or dropping
> caches and rereading it... see what the state is just before the
> shutdown compared to after the reboot.
>
> -Eric

Dropping caches (and running sync first) had no effect on the result of md5sum. Hopefully that narrows it down a bit.
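
For anyone who wants to repeat this check, here is a minimal sketch in Python of the "flush, drop caches, reread" step. It is not the script I actually ran; the function names are made up for illustration, and writing to /proc/sys/vm/drop_caches needs root, so the sketch just reports whether that part worked.

```python
import hashlib
import os


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def drop_page_cache():
    """Ask the kernel to drop clean caches; return False if not permitted."""
    try:
        with open("/proc/sys/vm/drop_caches", "w") as f:
            f.write("3\n")
        return True
    except OSError:
        # Not root, or no such file on this system.
        return False


def checksum_after_flush(path):
    """fsync the file, sync everything, try to drop caches, then hash it.

    Returns (md5_hex, caches_dropped). Hypothetical helper for illustration.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        os.fsync(fd)  # flush this file's dirty data to disk
    finally:
        os.close(fd)
    os.sync()  # flush all other dirty data as well
    dropped = drop_page_cache()
    return md5_of_file(path), dropped
```

Calling checksum_after_flush() on the file from the shutdown script and again after the reboot gives two digests to compare. If the second element of the result is False, the read may have been served from the page cache rather than from disk.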

Thanks to your prodding though, I have another interesting finding:

If I remove the corrupted file and copy a "known good" copy into its place, then the corruption doesn't happen. I've verified this a couple of times. The corruption only occurs if the file was created by "locale-gen".
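
To see where a corrupted file differs from the known good copy, rather than only that the md5sums differ, something like the following could help. It is a rough sketch: the function name is hypothetical, and the 4096-byte block size is an assumption that should be changed to match the filesystem's actual block size.

```python
def differing_blocks(path_a, path_b, block_size=4096):
    """Return the indices of block_size-sized blocks that differ.

    If one file is longer than the other, the extra blocks count as
    differing. block_size=4096 is an assumed filesystem block size.
    """
    differing = []
    with open(path_a, "rb") as a, open(path_b, "rb") as b:
        index = 0
        while True:
            block_a = a.read(block_size)
            block_b = b.read(block_size)
            if not block_a and not block_b:
                break  # both files exhausted
            if block_a != block_b:
                differing.append(index)
            index += 1
    return differing
```

An empty list means the two files are identical. A short run of consecutive indices would suggest that whole blocks were lost or replaced, which is worth comparing against the extent layout that filefrag reports.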

I'll continue to try to work out why :-).

Thanks
Alan