Re: [ANNOUNCE] Btrfs: a copy on write, snapshotting FS

From: Albert Cahalan
Date: Wed Jun 13 2007 - 12:14:51 EST


On 6/13/07, Chris Mason <chris.mason@xxxxxxxxxx> wrote:
On Wed, Jun 13, 2007 at 01:45:28AM -0400, Albert Cahalan wrote:

> The usual wishlist:
>
> * inode-to-pathnames mapping

This one I'll code, it will help with inode link count verification. I
want to be able to detect at run time that an inode with a link count of
zero is still actually in a directory. So there will be back pointers
from the inode to the directory.

Great, but fsck improvement wasn't on my mind. This is
a desirable feature for the NFS server, and for regular users.
Think about a backup program trying to maintain hard links.

Also, the incremental backup code will be able to walk the btree to find
inodes that have changed, and the backpointers will help make a list of
file names that need to be rsync'd or whatever.

> * a subvolume that is a single file (disk image, database, etc.)

subvolumes can be made that have a single file in them, but they have to
be directories right now. Doing otherwise would complicate mounts and
other management tools (inside the btree, it doesn't really matter).

Bummer. As I understand it, ZFS provides this. :-)

> * directory indexes to better support Wine and Samba
> * secure delete via destruction of per-file or per-block random crypto keys

I'd rather keep secure delete as a userland problem (or a layered FS
problem). When you take backups and other copies of the file into
account, it's a bigger problem than btrfs wants to tackle right now.

It can't be a userland problem if you allow disk blocks to move.
Volume resizing, logging/journalling, etc. -- they combine to make
the userland solution essentially impossible. (one could wipe the
whole partition, or maybe fill ALL space on the volume)

I think it needs to be per-extent.

At each level in the btree, you place a randomly generated key
for the more leafward nodes. This means that secure deletion is
merely the act of wiping the key... which can itself occur by
wiping the key of the more rootward node.

> * atomic creation of copy-on-write directory trees

Do you mean something more fine grained than the current snapshotting
system?

I believe so. Example: I have a linux-2.6 directory. It's not
a mount point or anything special like that. I want to copy
it to a new directory called wip, without actually copying
all the blocks. To all the normal POSIX API stuff, this copy
should look like the result of "cp -a", not hard links.

> * insert/delete ability (add/remove a chunk in the middle of a file)

The disk format makes this O(extent records past the chunk). It's
possible to code but it would not be optimized.

That's understandable, but note that Reiserfs can support this.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/