Re: [PATCH 01/23] radix-tree: implement preload for multiple contiguous elements

From: Matthew Wilcox
Date: Tue Aug 06 2013 - 12:34:32 EST


On Mon, Aug 05, 2013 at 01:17:39PM +0200, Jan Kara wrote:
> On Sun 04-08-13 05:17:03, Kirill A. Shutemov wrote:
> > From: "Kirill A. Shutemov" <kirill.shutemov@xxxxxxxxxxxxxxx>
> > The radix tree is variable-height, so an insert operation not only has
> > to build the branch to its corresponding item, it also has to build the
> > branch to existing items if the size has to be increased (by
> > radix_tree_extend).
> > @@ -82,16 +82,24 @@ static struct kmem_cache *radix_tree_node_cachep;
> > * The worst case is a zero height tree with just a single item at index 0,
> > * and then inserting an item at index ULONG_MAX. This requires 2 new branches
> > * of RADIX_TREE_MAX_PATH size to be created, with only the root node shared.
> > + *
> > + * Worst case for adding N contiguous items is adding entries at indexes
> > + * (ULONG_MAX - N) to ULONG_MAX. It requires nodes to insert single worst-case
> > + * item plus extra nodes if you cross the boundary from one node to the next.
> > + *
> > * Hence:
> > */
> > -#define RADIX_TREE_PRELOAD_SIZE (RADIX_TREE_MAX_PATH * 2 - 1)
> > +#define RADIX_TREE_PRELOAD_MIN (RADIX_TREE_MAX_PATH * 2 - 1)
> > +#define RADIX_TREE_PRELOAD_MAX \
> > + (RADIX_TREE_PRELOAD_MIN + \
> > + DIV_ROUND_UP(RADIX_TREE_PRELOAD_NR - 1, RADIX_TREE_MAP_SIZE))
>
> Umm, is this really correct? I see two problems:
> 1) You may need internal tree nodes at various levels but you seem to
> account only for the level 1.
> 2) The rounding doesn't seem right because RADIX_TREE_MAP_SIZE+2 nodes may
> require 3 nodes at level 1 if the indexes are like:
> i_0 | i_1 .. i_{RADIX_TREE_MAP_SIZE} | i_{RADIX_TREE_MAP_SIZE+1}
> ^ ^
> node boundary node boundary
>
> Otherwise the patch looks good.

You are correct that in the fully general case both of these are needed,
and the patch undercounts the number of nodes required. However, in the
specific case of THP pagecache, insertions are naturally aligned, and
we end up needing very few internal nodes (so few that we've never hit
the end of the per-CPU preload array in some fairly heavy testing).
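
To make the alignment argument concrete, here is a minimal userspace
sketch (not kernel code) that counts how many distinct nodes at a given
level a run of contiguous indexes touches. The helper name
nodes_touched() and the MAP_SHIFT value of 6 are assumptions made for
illustration; the sketch counts only nodes spanned by the run and says
nothing about the branches needed to extend the tree height.

```c
#include <assert.h>

/* Assumption: 64 slots per node, as with RADIX_TREE_MAP_SHIFT == 6. */
#define MAP_SHIFT 6u

/*
 * Hypothetical helper: number of distinct nodes at 'level' (1 = the
 * level holding the items' slots) touched by the contiguous indexes
 * [start, start + n - 1]. Assumes n >= 1 and no index overflow.
 */
unsigned long long nodes_touched(unsigned long long start,
                                 unsigned long long n,
                                 unsigned int level)
{
    unsigned int shift = MAP_SHIFT * level;
    unsigned long long end = start + n - 1;

    /* Keep the shift well-defined for a 64-bit index. */
    assert(level >= 1 && shift < 64);

    return (end >> shift) - (start >> shift) + 1;
}
```

With these assumptions, a naturally aligned run of 512 entries, such as
nodes_touched(0, 512, 1), touches 8 level-1 nodes and a single level-2
node, while the same run started at index 1 touches 9 level-1 nodes, and
a run started at index 4000 straddles two level-2 nodes.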

There are two penalties for getting the general case correct. One is
that the calculation becomes harder to understand, and the other is
that we consume more per-CPU memory. I think we should document that
the current code requires "natural alignment", and include a note about
what things will need to change in order to support arbitrary alignment
in case anybody needs to do it in the future, but not include support
for arbitrary alignment in this patchset.
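
For a rough sense of the per-CPU cost, the macros from the patch can be
evaluated in userspace. This is a sketch under stated assumptions: a
64-bit index, a map shift of 6, and a RADIX_TREE_PRELOAD_NR of 512 (the
value of PRELOAD_NR here is an assumption, as are the shortened macro
names and the two accessor functions).

```c
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

#define INDEX_BITS   64     /* assumption: 64-bit unsigned long */
#define MAP_SHIFT    6      /* assumption: 64 slots per node */
#define MAP_SIZE     (1UL << MAP_SHIFT)
#define MAX_PATH     DIV_ROUND_UP(INDEX_BITS, MAP_SHIFT)
#define PRELOAD_NR   512UL  /* assumption: entries per contiguous insert */

/* Same shape as the definitions quoted from the patch. */
#define PRELOAD_MIN  (MAX_PATH * 2 - 1)
#define PRELOAD_MAX \
    (PRELOAD_MIN + DIV_ROUND_UP(PRELOAD_NR - 1, MAP_SIZE))

/* Hypothetical accessors so the values can be inspected or tested. */
unsigned long preload_min(void) { return PRELOAD_MIN; }
unsigned long preload_max(void) { return PRELOAD_MAX; }
```

Under these assumptions the per-CPU array grows from 21 to 29 node
pointers. Accounting for extra nodes at every level, as arbitrary
alignment would require, would raise that figure further.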

What do you think?