Re: Recursive directory accounting for size, ctime, etc.

From: J. Bruce Fields
Date: Tue Jul 15 2008 - 15:53:47 EST


On Tue, Jul 15, 2008 at 11:28:22AM -0700, Sage Weil wrote:
> Fields prefixed with 'r' are recursively defined, while
> entries/files/subdirs is just for the one directory. 'rctime' is the most
> recent ctime within the hierarchy, which should be useful for backup
> software or anything else scanning the hierarchy for recent changes.
>
> Naturally, there are a few caveats:
>
> - There is some built-in delay before statistics fully propagate up
> toward the root of the hierarchy. Changes are propagated
> opportunistically when lock/lease state allows, with an upper bound of (by
> default) ~30 seconds for each level of directory nesting.

That makes it less useful, e.g., for somebody with cached data trying to
validate their cache, or for something like git trying to check a
directory tree for changes.
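
To put a rough number on that delay, here is a minimal sketch. It assumes
the ~30 second default bound applies independently at each level of nesting
between the changed file and the directory being examined, so the bounds
simply add up. The function names are hypothetical and are not part of
Ceph.

```python
# Per-level upper bound from the quoted text (the "~30 seconds" default).
# Treated as exact here purely for illustration.
PROPAGATION_BOUND_S = 30.0


def worst_case_staleness(depth, per_level_s=PROPAGATION_BOUND_S):
    """Upper bound, in seconds, before a change `depth` levels below a
    directory is guaranteed to show up in that directory's recursive stats.

    Assumption: the per-level bounds add linearly.
    """
    if depth < 0:
        raise ValueError("depth must be non-negative")
    return depth * per_level_s


def covered_until(observed_at, depth, per_level_s=PROPAGATION_BOUND_S):
    """Latest timestamp for which an rctime read at `observed_at` can be
    trusted to reflect every change in a subtree at most `depth` levels deep.

    Changes newer than the returned time may not have propagated yet.
    """
    return observed_at - worst_case_staleness(depth, per_level_s)


if __name__ == "__main__":
    # A subtree nested four levels deep.
    print(worst_case_staleness(4))
    print(covered_until(1000.0, 4))
```

Under these assumptions, a scanner that reads rctime at time T for a
subtree of depth d can only conclude "nothing changed" for the period up
to T - 30*d seconds. Anything newer still needs a real walk, which is why
the delay hurts cache validation.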

> - Ceph internally distinguishes between multiple links to the same file
> (there is a single 'primary' link, and then zero or more 'remote' links).
> Only the primary link contributes toward the 'rbytes' total.

Is that only true for 'rbytes'?

--b.

>
> - The 'rbytes' summation is over i_size, not blocks used. That means
> sparse files "appear" larger than the storage space they actually consume.
>
> - Directories don't yet contribute anything to the 'rbytes' total. They
> should probably include an estimate of the storage consumed by directory
> metadata. For this reason, and because the size isn't rounded up to the
> block size, the 'rbytes' total will usually be slightly smaller than what
> you get from 'du'.
>
> - Currently there are no stats for the root directory itself.
>
>
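
The difference between an i_size sum and what 'du' reports can be seen on
any local POSIX file system. The sketch below is not Ceph's
implementation. It walks a tree, counts each inode once (a rough stand-in
for "only the primary link contributes"), ignores directories, and returns
both the summed st_size and the summed allocated blocks. The function name
is hypothetical.

```python
import os
import stat


def tree_totals(root):
    """Return (apparent_bytes, allocated_bytes) for regular files under root.

    apparent_bytes  sums st_size, like the 'rbytes' description above.
    allocated_bytes sums st_blocks * 512, which is closer to what 'du'
    counts, except that directories are skipped here as well.
    """
    seen = set()          # (st_dev, st_ino) pairs already counted
    apparent = 0
    allocated = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            st = os.lstat(os.path.join(dirpath, name))
            if not stat.S_ISREG(st.st_mode):
                continue  # skip symlinks, devices, sockets, ...
            key = (st.st_dev, st.st_ino)
            if key in seen:
                continue  # additional hard link to an inode already counted
            seen.add(key)
            apparent += st.st_size
            # POSIX defines st_blocks in 512-byte units.
            allocated += st.st_blocks * 512
    return apparent, allocated


if __name__ == "__main__":
    apparent, allocated = tree_totals(".")
    print("sum of i_size:   ", apparent)
    print("allocated bytes: ", allocated)
```

For a tree of small dense files the second number is usually the larger
one, because sizes are rounded up to whole blocks. For sparse files it is
the other way around.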
> I'm extremely interested in what people think of overloading the file
> system interface in this way. Handy? Crufty? Dangerous? Does anybody
> know of any applications that rely on or expect meaningful values for a
> directory's i_size? Or read() a directory?
>
>
> More information on the recursive accounting at
>
> http://ceph.newdream.net/wiki/Recursive_accounting
>
> and Ceph itself at
>
> http://ceph.newdream.net/
>
> Cheers-
> sage
> --
> To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html