Re: getdents - ext4 vs btrfs performance

From: Ted Ts'o
Date: Tue Mar 13 2012 - 17:33:08 EST


On Tue, Mar 13, 2012 at 04:22:52PM -0400, Phillip Susi wrote:
>
> I think a format change would be preferable to runtime sorting.

Are you volunteering to spearhead the design and coding of such a
thing? Run-time sorting is backwards compatible, and a heck of a lot
easier to code and test...

The reality is we'd probably want to implement run-time sorting
*anyway*, for the vast majority of people who don't want to convert to
a new incompatible file system format. (Even if you can do the
conversion using e2fsck --- which is possible, but it would be even
more code to write --- system administrators tend to be very
conservative about such things, since they might need to boot an older
kernel, or use a rescue CD that doesn't have an uptodate kernel or
file system utilities, etc.)

> So the index nodes contain the hash ranges for the leaf block, but
> the leaf block only contains the regular directory entries, not a
> hash for each name? That would mean that adding or removing names
> would require moving around the regular directory entries wouldn't
> it?

They aren't sorted in the leaf block, so we only need to move around
regular directory entries when we do a node split (and at the moment
we don't support shrinking directories), so we don't have to worry the
reverse case.

> I would think that hash collisions are rare enough that reading a
> directory block you end up not needing once in a blue moon would be
> chalked up under "who cares". So just stick with hash, offset pairs
> to map the hash to the normal directory entry.

With a 64-bit hash, and if we were actually going to implement this as
a new incompatible feature, you're probably right in terms of
accepting the extra directory block search.

We would still have to implement the case where hash collisions *do*
exist, though, and make sure the right thing happens in that case.
Even if the chance of that happening is 1 in 2**32, with enough
deployed systems (i.e., every Android handset, etc.) it's going to
happen in real life.

- Ted




--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/