Re: [patch] ext2/3: document conditions when reliable operation is possible

From: Pavel Machek
Date: Mon Aug 24 2009 - 17:33:30 EST


On Mon 2009-08-24 16:11:08, Rob Landley wrote:
> On Monday 24 August 2009 04:31:43 Pavel Machek wrote:
> > Running a journaling filesystem such as ext3 over a flashdisk or degraded
> > RAID array is a bad idea: journaling guarantees no longer apply and
> > you will get data corruption on powerfail.
> >
> > We can't solve it easily, but we should certainly warn the users. I
> > actually lost data because I did not understand these limitations...
> >
> > Signed-off-by: Pavel Machek <pavel@xxxxxx>
>
> Acked-by: Rob Landley <rob@xxxxxxxxxxx>
>
> With a couple comments:
>
> > +* write caching is disabled. ext2 does not know how to issue barriers
> > + as of 2.6.28. hdparm -W0 disables it on SATA disks.
>
> It's coming up on 2.6.31, has it learned anything since or should that version
> number be bumped?

Jan, did those "barrier for ext2" patches get merged?

> > + (Trash may get written into sectors during powerfail. And
> > + ext3 handles this surprisingly well at least in the
> > + catastrophic case of garbage getting written into the inode
> > + table, since the journal replay often will "repair" the
> > + garbage that was written into the filesystem metadata blocks.
> > + It won't do a bit of good for the data blocks, of course
> > + (unless you are using data=journal mode). But this means that
> > + in fact, ext3 is more resistant to surviving failures to the
> > + first problem (powerfail while writing can damage old data on
> > + a failed write) but fortunately, hard drives generally don't
> > + cause collateral damage on a failed write.
>
> Possible rewording of this paragraph:
>
> Ext3 handles trash getting written into sectors during powerfail
> surprisingly well. It's not foolproof, but it is resilient. Incomplete
> journal entries are ignored, and journal replay of complete entries will
> often "repair" garbage written into the inode table. The data=journal
> option extends this behavior to file and directory data blocks as well
> (without which your dentries can still be badly corrupted by a power fail
> during a write).
>
> (I'm not entirely sure about that last bit, but clarifying it one way or the
> other would be nice because I can't tell from reading it which it is. My
> _guess_ is that directories are just treated as files with an attitude and an
> extra caching layer...?)

Thanks, applied, it looks better than what I wrote. I removed the ()
part, as I'm not sure about it...
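
To illustrate the reworded paragraph, here is a toy sketch in Python of
what journal replay does after a powerfail. It is not ext3 or jbd code:
the Transaction class, the dict standing in for the disk and the block
numbers are all made up for illustration, and stopping at the first
incomplete transaction is a simplifying assumption.

```python
# Toy model of journal replay; NOT ext3's on-disk format or real jbd code.
from dataclasses import dataclass


@dataclass
class Transaction:
    writes: dict             # block number -> intended metadata contents
    committed: bool = False  # True only if the commit record reached the disk


def replay(disk, journal):
    """Re-apply complete transactions in order; ignore incomplete ones."""
    for txn in journal:
        if not txn.committed:
            break  # incomplete entry is ignored (assumption: replay stops here)
        disk.update(txn.writes)  # overwrites whatever is in those blocks
    return disk


# State after a hypothetical powerfail: garbage landed in metadata block 10
# and in data block 500. Only metadata goes through this toy journal.
disk = {10: "GARBAGE", 11: "inode table B", 500: "GARBAGE"}
journal = [
    Transaction({10: "inode table A v2"}, committed=True),
    Transaction({11: "inode table B v2"}, committed=False),
]
replay(disk, journal)
```

Block 10 gets "repaired" because a complete transaction covers it, block
11 keeps its old contents because its transaction never committed, and
block 500 stays garbage because data blocks never went through the
journal in this model.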
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html