Re: Apparent serious progressive ext4 data corruption bug in 3.6.3 (and other stable branches?)

From: Nix
Date: Tue Oct 23 2012 - 19:34:45 EST


On 24 Oct 2012, Theodore Ts'o told this:

> hurt, but we do want to make 100% sure that it really fixes the
> problem.

Well, yes, that would be nice. I can certainly try to verify that it
stops my filesystems getting corrupted. (And if so, I owe you a
$BEVERAGE. Though I suspect I owe you about three million of those
already for other code written in the past.)

>> The bug did really quite a lot of damage to my /home fs in only a few
>> minutes of uptime, given how few files I wrote to it. What it could have
>> done to a more conventional distro install with everything including
>> /home on one filesystem, I shudder to think.
>
> Well, the problem won't show up if the journal has wrapped. So it
> will only show up if the system has been rebooted twice in fairly
> quick succession. A full conventional distro install probably
> wouldn't have triggered a bug...

A full *install* from scratch, no. I was more worried about the
possibility of someone running -stable kernels on an existing distro
installation, and shutting down every night (given what's been happening
to UK electricity prices in the last few years, I suspect quite a lot of
people in the UK do that to save power). If they happen not
to do much on one particular day other than a bit of light distro
updating, they could perfectly well end up roasting things touched
during the distro update. Things like glibc :(

> although someone who habitually
> reboots their laptop instead of using suspend/resume or hibernate, or
> someone who is trying to bisect the kernel looking for some other bug
> could easily trip over this --- which I guess is how you got hit by
> it.

I was first hit by it in /var before I was even trying to bisect: I was
just rebooting to unwedge NFS lockd. It's true that in less than a week
probably not all that many people have rebooted often enough to trip
over this.

I hope.

--
NULL && (void)