Re: multiply files in one (was GNU/Linux stance by Richard Stallman)

Albert D. Cahalan (acahalan@cs.uml.edu)
Mon, 29 Mar 1999 22:43:58 -0500 (EST)


Michael Talbot-Wilson writes:
> On Mon, 29 Mar 1999, Larry McVoy wrote:
>
>>>> Please show me the design of a file system which can bring in
>>>> multiple small files with a single seek and read. When you
>>>> get done, you'll be looking at what I described.

That looks like Reiserfs to me. You can get lots of files with
just one seek, since small files (with their inode-like data) get
stored inside directories.

>>> Seek and read which files? If you want anything but all files, you're
>>> probably out of luck. Again, you will have to seek if any of the
>>> files are fragmented (which is the case with most fs).

That is a layout and cleaner issue.

>> You're still missing the performance implications. In many, many typical
>> cases (and this will become more and more true as memory gets cheaper),
>> it would be far faster to eat the whole tarball and explode it into a
>> bunch of in memory files even to get just a few of the files. Work
>> through the math - if each file costs you a seek, you only need to get
>> to 2-3 files before it would be faster to get 'em all.
>
> If that would be faster, how about storing the bunch of small
> files (or files of any size) in a gdbm file and extracting one by
> its hashed id? Faster, at a guess, and simpler to do... ar, anyone?

That adds an extra layer. One of the big goals of Reiserfs is the
elimination of the need for userspace namespaces. That includes any
file format designed to group tiny bits of data for performance on
older filesystem designs.
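
For anyone who wants to "work through the math" quoted above, here is
a rough sketch of the break-even point. The figures are illustrative
assumptions, not measurements from this thread: a 10 ms seek, 10 MB/s
of sequential transfer, 2 KB files, and a 200 KB archive holding 100
of them. The function names are made up for this example.

```python
import math

def time_individual(n_files, file_bytes, seek_s, bytes_per_s):
    """Seconds to fetch n_files separately: one seek plus one transfer each."""
    return n_files * (seek_s + file_bytes / bytes_per_s)

def time_whole_archive(archive_bytes, seek_s, bytes_per_s):
    """Seconds to read the whole archive: one seek, then one sequential read."""
    return seek_s + archive_bytes / bytes_per_s

def break_even_files(archive_bytes, file_bytes, seek_s, bytes_per_s):
    """Smallest number of per-file fetches costing at least as much as
    reading the entire archive in one go."""
    per_file = seek_s + file_bytes / bytes_per_s
    whole = time_whole_archive(archive_bytes, seek_s, bytes_per_s)
    return math.ceil(whole / per_file)

# Assumed numbers (hypothetical, pick your own disk):
SEEK_S = 0.010               # 10 ms per seek
BYTES_PER_S = 10_000_000     # 10 MB/s sequential transfer
FILE_BYTES = 2_000           # 2 KB per small file
ARCHIVE_BYTES = 100 * FILE_BYTES  # 200 KB archive of 100 such files

print(break_even_files(ARCHIVE_BYTES, FILE_BYTES, SEEK_S, BYTES_PER_S))
```

Under those assumptions one seek costs about as much as transferring
100 KB, so the whole archive is worth roughly three seeks. A larger
archive or a faster seek moves the break-even point up.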
