Re: Solaris ZFS on Linux [Was: Re: the " 'official' point of view" expressed by kernelnewbies.org regarding reiser4 inclusion]

From: Theodore Tso
Date: Mon Jul 31 2006 - 22:58:36 EST


On Mon, Jul 31, 2006 at 08:31:32PM -0500, David Masover wrote:
> So you use a repacker. Nice thing about a repacker is, everyone has
> downtime. Better to plan to be a little sluggish when you'll have
> 1/10th or 1/50th of the users than be MUCH slower all the time.

Actually, that's a problem with log-structured filesystems in general.
There are quite a few real-life workloads where you *don't* have
downtime. The thing is, in a global economy, you move from the
London/European stock exchanges, to the New York/US exchanges, to the
Asian exchanges, with little to no downtime available. In addition,
people have been getting more sophisticated with workload
consolidation tricks so that you use your "downtime" for other
applications (either to service other parts of the world, or to do
daily summaries, 3-d frame rendering at animation companies, etc.) So
the assumption that there will always be time to run the repacker is a
dangerous one.

The problem is that many benchmarks (such as tarring and untarring the
kernel sources in reiser4 sort order) are overly simplistic, in that
they don't really reflect how people use the filesystem in real life.
(How many times can you guarantee that files will be written in the
precise hash/tree order so that the filesystem gets the best possible
time?) A more subtle version of this problem happens for filesystems
whose performance degrades dramatically over time without a
repacker. If the benchmark doesn't take into account the need for a
repacker, or if the repacker is disabled or fails to run during the
benchmark, the filesystem is in effect "cheating" on the benchmark
because there is critical work which is necessary for the long-term
health of the filesystem which is getting deferred until after the
benchmark has finished measuring the performance of the system under
test.
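
To put rough numbers on that, here is a minimal sketch of charging the
deferred repacker time back to the benchmark run. The function name and
all of the figures are hypothetical, chosen only for illustration; they
are not measurements of reiser4 or any other filesystem.

```python
def effective_throughput(bytes_written, benchmark_seconds, repack_seconds):
    """Bytes/second once deferred repacker time is charged to the run."""
    return bytes_written / (benchmark_seconds + repack_seconds)

# Hypothetical numbers, for illustration only.
written = 1_000_000_000   # bytes written during the benchmark
bench = 20.0              # seconds the benchmark itself measured
repack = 30.0             # seconds of repacking deferred until after the run

reported = effective_throughput(written, bench, 0.0)    # repacker ignored
honest = effective_throughput(written, bench, repack)   # repacker included

print(f"reported: {reported / 1e6:.0f} MB/s, "
      f"with repacker: {honest / 1e6:.0f} MB/s")
# prints "reported: 50 MB/s, with repacker: 20 MB/s"
```

With these made-up figures the headline number overstates sustained
throughput by a factor of 2.5, which is the kind of gap a benchmark
hides when the repacker never runs inside the measured window.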

This sort of marketing benchmark ("lies, d*mn lies, and benchmarks")
may be useful for trying to scam mainline acceptance of the filesystem
code, or to make pretty graphs that make log-structured filesystems
look good in Usenix papers, but despite the fact that huge numbers of
papers were written about the lfs filesystem two decades ago, it never
was used in real life by any of the commercial Unix systems. This
wasn't an accident, and it wasn't due to a secret conspiracy of BSD
fast filesystem hackers keeping people from using lfs. No, the BSD
lfs died on its own merits....

- Ted