Re: [PATCH 1/8] rbtree: Cache leftmost node internally

From: Jan Kara
Date: Fri Jun 09 2017 - 03:43:53 EST


On Thu 08-06-17 11:03:22, Davidlohr Bueso wrote:
> Red-black tree semantics imply that nodes with smaller or
> greater (or equal for duplicates) keys always be to the
> left and right, respectively. For the kernel this is
> extremely evident when considering our rb_first() semantics.
> Enabling lookups for the smallest node in the tree in O(1)
> can save a good chunk of cycles in not having to walk down the
> tree each time. To this end there are a few core users that
> explicitly do this, such as the scheduler and rtmutexes.
> There is also the desire for interval trees to have this
> optimization allowing faster overlap checking.
>
> This patch introduces a new 'struct rb_root_cached' which
> is just the root with a cached pointer to the leftmost node.
> The reason why the regular rb_root was not extended instead
> of adding a new structure was that this allows the user to
> have the choice between memory footprint and actual tree
> performance. The new wrappers on top of the regular rb_root
> calls are:
>
> - rb_first_cached(cached_root) -- which is a fast replacement
> for rb_first.
>
> - rb_insert_color_cached(node, cached_root, new)
>
> - rb_erase_cached(node, cached_root)
>
> In addition, augmented cached interfaces are also added for
> basic insertion and deletion operations; which becomes
> important for the interval tree changes.
>
> With the exception of the inserts, which adds a bool for
> updating the new leftmost, the interfaces are kept the same.
> To this end, porting rb users to the cached version becomes really
> trivial, and keeping current rbtree semantics for users that
> don't care about the optimization requires zero overhead.
>
> Signed-off-by: Davidlohr Bueso <dbueso@xxxxxxx>

Looks good to me. Just one nit below:

> @@ -150,6 +161,7 @@ extern void __rb_erase_color(struct rb_node *parent, struct rb_root *root,
>
> static __always_inline struct rb_node *
> __rb_erase_augmented(struct rb_node *node, struct rb_root *root,
> + struct rb_node **leftmost,
> const struct rb_augment_callbacks *augment)
> {
> struct rb_node *child = node->rb_right;
> @@ -157,6 +169,9 @@ __rb_erase_augmented(struct rb_node *node, struct rb_root *root,
> struct rb_node *parent, *rebalance;
> unsigned long pc;
>
> + if (leftmost && node == *leftmost)
> + *leftmost = rb_next(node);
> +
> if (!tmp) {
> /*
> * Case 1: node to erase has no more than 1 child (easy!)

Why do you propagate e.g. 'leftmost' down to __rb_erase_augmented() when
you could just handle everything within rb_erase_augmented_cached?
Similarly for other functions like __rb_insert()... It would seem like less
churn and I don't see downside to it...

Honza
--
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR