Re: [BK][PATCH] Reiser4, will double Linux FS performance, please apply

From: Andreas Dilger (adilger@clusterfs.com)
Date: Mon Nov 04 2002 - 14:56:29 EST


On Nov 04, 2002 14:00 +0300, Nikita Danilov wrote:
> > Jup, this fixes the leak, but free space still isn't reported accurately
> > until after sync gets called, which I believe is a bug too.
>
> In reiser4 allocation of disk space is delayed to transaction commit. It
> is not possible to estimate precisely amount of disk space that will be
> allocated during commit, and hence statfs(2) results are not updated
> until one does sync(2) (forcing commit) or transaction is committed due
> to age (10 minutes by default).

I find this more than a bit frightening, and it could obviously be a
huge source of reiser4's dramatic performance improvements - nothing is
written to disk until long after a benchmark is complete (provided
you have enough RAM) unless the benchmark explicitly syncs before
finishing (benchmarks like dbench and iozone don't necessarily sync).
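
To illustrate the benchmarking concern, here is a minimal sketch in Python
(not taken from dbench, iozone, or reiser4) that times the same write with
and without an explicit fsync. The helper name timed_write and the 16 MiB
payload are assumptions made up for this example.

```python
import os
import tempfile
import time


def timed_write(path, data, sync):
    """Write data to path and return the elapsed time in seconds.

    With sync=True the timing includes os.fsync(), so the kernel has been
    asked to push the file to the storage device before the clock stops.
    With sync=False the data may still sit only in the page cache.
    """
    start = time.perf_counter()
    with open(path, "wb") as f:
        f.write(data)
        if sync:
            f.flush()              # move Python's userspace buffer to the kernel
            os.fsync(f.fileno())   # ask the kernel to write the file out
    return time.perf_counter() - start


if __name__ == "__main__":
    payload = b"x" * (16 * 1024 * 1024)  # 16 MiB, an arbitrary size
    with tempfile.TemporaryDirectory() as d:
        cached = timed_write(os.path.join(d, "cached.dat"), payload, sync=False)
        synced = timed_write(os.path.join(d, "synced.dat"), payload, sync=True)
        print(f"without fsync: {cached:.4f}s  with fsync: {synced:.4f}s")
```

A benchmark that reports only the first number is measuring how fast the
page cache absorbs writes, not how fast the filesystem gets them to disk.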

Even more importantly, people losing 10 minutes of work is pretty
unacceptable, IMHO. The default flush interval is 30 seconds for a
reason, and in realistic scenarios files don't grow over a 10 minute
period, and even if they do you would want to start flushing that to
disk long before you have a few GB of outstanding changes. Also, this
would be a real source of problems with filesystem-full conditions (as
was hinted at in another reiser4 email I read previously).

At the very least, you need to reserve blocks in the filesystem for writes
that are under delayed allocation. Overestimating space requirements
(i.e. reserving a full block for each file, regardless of whether it will be
packed in the future or not) is far preferable to underestimating and
having a write which already "completed" suddenly find itself out of
space. If you get close to filling the filesystem,
then you can always flush the transaction to disk to "solidify your
estimates" before returning a needless ENOSPC. This will also make your
"statfs" space reporting fairly consistent, because you will return the
"reserved" stats even if they are only slightly off.

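The reservation scheme described above can be sketched as a toy model in
Python. This is only an illustration under stated assumptions, not reiser4
code: the class SpaceAccounting, its methods, and the actual_blocks
parameter (standing in for what the allocator would really use at commit)
are all hypothetical names.

```python
import errno


class SpaceAccounting:
    """Toy model of free-space accounting with delayed allocation.

    All names are hypothetical; sizes are counted in blocks.
    """

    def __init__(self, total_blocks):
        self.total = total_blocks
        self.allocated = 0   # blocks actually assigned on disk
        self.reserved = 0    # pessimistic estimate for uncommitted writes
        self.pending = []    # (reserved_blocks, actual_blocks) per write

    def free_blocks(self):
        """What statfs would report: reservations count as used space."""
        return self.total - self.allocated - self.reserved

    def commit(self):
        """Flush the transaction: estimates become real allocations."""
        for _reserved, actual in self.pending:
            self.allocated += actual
        self.pending.clear()
        self.reserved = 0

    def write(self, worst_case_blocks, actual_blocks):
        """Reserve worst_case_blocks for a write under delayed allocation.

        actual_blocks models the post-packing size; a real filesystem
        would only learn it at commit time.
        """
        if actual_blocks > worst_case_blocks:
            raise ValueError("reservation must be an overestimate")
        if worst_case_blocks > self.free_blocks():
            # Close to full: flush first to "solidify" the estimates.
            self.commit()
            if worst_case_blocks > self.free_blocks():
                raise OSError(errno.ENOSPC, "No space left on device")
        self.reserved += worst_case_blocks
        self.pending.append((worst_case_blocks, actual_blocks))
```

Because every accepted write holds a reservation at least as large as its
eventual allocation, a write that has been accepted cannot run out of space
at commit time, and ENOSPC is returned only after a flush has replaced the
estimates with real numbers.
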
Cheers, Andreas

--
Andreas Dilger  \ "If a man ate a pound of pasta and a pound of antipasto,
                 \  would they cancel out, leaving him still hungry?"
http://www-mddsp.enel.ucalgary.ca/People/adilger/               -- Dogbert