Re: multiple files in one (was GNU/Linux stance by Richard Stallman)

Larry McVoy (lm@bitmover.com)
Mon, 29 Mar 1999 16:27:02 -0800


: > If you have 200 small files and you have to go get 200 inodes and
: > 200 blocks to read them in, and the inodes and the blocks aren't
: > next to each other, that's 400 disk transactions instead of 1.
:
: OK, so please explain the scale of what you have in mind. How big is
: the compressed tarball, and how big is it uncompressed?
:
: Do you plan on having a large FS (say /usr) stored in this way, or are
: you thinking of a selected subset?

Do it on a per-directory basis, and only for files where links == 1.
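
To make the rule concrete, here's a rough userspace sketch of the
selection test (the function name is mine, and a real version would
live inside the FS rather than stat()ing from a directory walker):

    #include <dirent.h>
    #include <stdio.h>
    #include <sys/stat.h>

    /*
     * Walk one directory and flag the files that would be safe to
     * pack: regular files with a link count of 1.  A file with more
     * links has a second name that still expects to reach the data
     * through the shared inode, so it stays out of the tarball.
     */
    void pick_packable(const char *dir)
    {
            DIR *d = opendir(dir);
            struct dirent *e;
            struct stat sb;
            char path[1024];

            if (d == NULL)
                    return;
            while ((e = readdir(d)) != NULL) {
                    snprintf(path, sizeof(path), "%s/%s",
                             dir, e->d_name);
                    if (lstat(path, &sb) < 0)
                            continue;
                    if (S_ISREG(sb.st_mode) && sb.st_nlink == 1)
                            printf("packable: %s\n", path);
            }
            closedir(d);
    }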

I wouldn't bother to compress - just putting all the files into a ``tarball''
packs them quite nicely anyway, because there is no per-file block
fragmentation.  Compression isn't the point; the cost of disk accesses per
file read is the point.  For a large group of small files I want 2 disk
accesses (inode + tarball) for all the files combined, instead of 2 per file.
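
To put rough numbers on it (figure ~10ms of seek + rotational latency
per random access on a current disk): 200 small files read the normal
way is 400 random accesses, call it 4 seconds of pure positioning time.
The packed layout is 2 accesses, about 20ms, plus one sequential read
of the tarball, which is nearly free by comparison.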

: Note that I'm thinking about reading-ahead the inodes *and* the
: blocks. If you have RAM to burn, then this looks like it should
: work. I'm not thinking in terms of reading ahead a few piddling
: blocks. I'm talking megabytes or dozens of megabytes.
:
: So with two seek operations, you can read-ahead lots of files and lots
: of data.

More brain cells, more, more, more, I want more. :-)

It's a transaction cost, not a platter speed cost.  Read-ahead doesn't
help you on small files and inodes: read-ahead streams blocks within a
file you are already reading, it can't guess which scattered inode or
first block you'll need next.

Look, if things worked the way I'm describing, you should see a
quantum leap in small-file performance.  Think 10x faster, and you'll
see that the number of I/O's has to go down by at least a factor of 10.
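
The arithmetic behind that: when every access costs a fixed ~10ms of
positioning no matter how small the file is, elapsed time is basically
(number of I/O's) x (cost per I/O).  The per-I/O cost is fixed by the
hardware, so the only way to go 10x faster is to do 10x fewer I/O's.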
