Re: [patch] ext2/3: document conditions when reliable operation is possible

From: Pavel Machek
Date: Sat Aug 29 2009 - 06:43:20 EST


On Thu 2009-08-27 08:24:23, Theodore Tso wrote:
> On Thu, Aug 27, 2009 at 12:19:02AM -0500, Rob Landley wrote:
> > > To me, this isn't a particularly interesting or newsworthy point,
> > > since a competent system administrator
> >
> > I'm a bit concerned by the argument that we don't need to document
> > serious pitfalls because every Linux system has a sufficiently
> > competent administrator that they already know stuff that didn't even
> > come up until the second or third day it was discussed on lkml.
>
> I'm not convinced that information which needs to be known by System
> Administrators is best documented in the kernel Documentation
> directory. Should there be a HOWTO document on stuff like that?

It is not only for system administrators; I was trying to find out if
the kernel is buggy, and that information should be in the kernel tree.


> > If "degraded array" just means "don't have a replacement disk yet",
> > then it sounds like what Pavel wants to document is "don't write to
> > a degraded array at all, because power failures can cost you data
> > due to write granularity being larger than filesystem block size".
> > (Which still comes as news to some of us, and you need a way to
> > remount the degraded array read only until the sysadmin can
> > fix it.)
>
> If you want to document that as a property of RAID arrays, sure. But
> it's not something that should live in Documentation/filesystems/ext2.txt
> and Documentation/filesystems/ext3.txt. The MD RAID howto might be a

The ext3 documentation states that the journal protects fs integrity
on power failure. If you don't want to talk about storage stacks,
perhaps that should be removed?

Now... You mocked me for 'ext3 expects disks to behave like disks
(alarmist)'. I actually believe that should be written somewhere. ext3
depends on fairly subtle characteristics of the storage stack, and many
common configs just do not meet those expectations (missing barriers
are the most common problem, followed by collateral damage).
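
To make "collateral damage" concrete for the degraded-array case
discussed above, here is a minimal sketch of a three-disk RAID5 stripe
with XOR parity. It is a toy model, not md code: the function names and
the 4-byte "blocks" are made up for illustration, and it assumes the
data block reaches its disk before the parity block does.

```python
def xor(a, b):
    """Byte-wise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def torn_write_on_degraded_stripe(old_d0, d1, new_d0):
    """Return what a degraded array reads back for d1 after a torn write.

    Hypothetical model: the stripe holds d0, d1 and parity = d0 XOR d1.
    The disk holding d1 has failed, so d1 only exists as d0 XOR parity.
    """
    parity = xor(old_d0, d1)

    # Sanity check: before the write, d1 is still recoverable.
    assert xor(old_d0, parity) == d1

    # The filesystem rewrites d0 only. The array has to update both d0
    # and parity, but power fails after d0 lands and before parity does.
    d0_on_disk = new_d0
    parity_on_disk = parity  # stale

    # After reboot, d1 is rebuilt from the new d0 and the stale parity.
    return xor(d0_on_disk, parity_on_disk)


original_d1 = b"BBBB"
recovered_d1 = torn_write_on_degraded_stripe(b"AAAA", original_d1, b"CCCC")
print("d1 intact after torn write:", recovered_d1 == original_d1)
```

The filesystem never wrote to d1, yet d1 comes back changed. That is
the sense in which the write granularity of the array is larger than
the filesystem block size.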

Maybe not documenting that was okay 10 years ago, but with all the USB
sticks and RAID arrays around, it's just sloppy. Because those
characteristics are not documented, storage stack authors do not know
what they have to guarantee, and the result is bad. See for example
nbd -- it does not propagate barriers and is therefore unsafe.
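
As a rough illustration of why a missing barrier matters, the sketch
below enumerates the on-disk states that can survive a power failure
when writes sit in a volatile cache and may land in any order. It is a
simplified model, not the jbd implementation; the labels "journal_data"
and "commit" and both function names are assumptions made for the
example.

```python
from itertools import combinations


def crash_states(ops):
    """Return every on-disk state reachable if power fails at any point.

    ops is a list of ("write", name) and ("barrier",) entries. Cached
    writes may reach the disk in any order; a barrier forces all cached
    writes to disk before any later write is accepted.
    """
    states = set()
    durable = frozenset()  # writes known to be on disk
    cached = []            # writes still in the volatile cache

    def record():
        # Any subset of the cached writes may have landed before the crash.
        for k in range(len(cached) + 1):
            for subset in combinations(cached, k):
                states.add(durable | frozenset(subset))

    record()
    for op in ops:
        if op[0] == "write":
            cached.append(op[1])
        else:  # barrier: everything cached so far becomes durable
            durable = durable | frozenset(cached)
            cached = []
        record()
    return states


def commit_without_data(states):
    """True if some crash state has the commit record but not its data."""
    return any("commit" in s and "journal_data" not in s for s in states)


with_barrier = [("write", "journal_data"), ("barrier",), ("write", "commit")]
without_barrier = [("write", "journal_data"), ("write", "commit")]

print("unsafe with barrier:   ", commit_without_data(crash_states(with_barrier)))
print("unsafe without barrier:", commit_without_data(crash_states(without_barrier)))
```

In this model a layer that silently drops the barrier turns the first
sequence into the second, so a commit record can exist on disk without
the journal blocks it is supposed to cover.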

Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html