Re: [RFC] B+Tree library V2

From: JÃrn Engel
Date: Thu Jan 08 2009 - 16:09:28 EST


On Thu, 8 January 2009 21:18:55 +0100, Johannes Berg wrote:
>
> > Looks correct otherwise. Probably needs a comment that without "tmp" we
> > would skip a 0 key. Or am I the only one who wants to simplify the code
> > before spotting this little subtlety?
>
> I, uh, I didn't even realise that. I think the code for
> btree_last/btree_get_prev_key isn't correct as is since the 0 key is
> valid, but you can't tell whether it returned 0 because it didn't find
> anything, or because there was no more entry. Or am I missing something?

Correct or not is a matter of opinion, so let's not go there. It
certainly is unexpected and also inefficient. The alternative would be
to return two values, they key and a flag to indicate the end.

> > > (and possibly some type-checking variants that hardcode the geo)
> > >
> > > Does that seem correct? And would it be possible to provide btree_last()
> > > that takes an void ** and fills it with the last entry, and the same for
> > > lookup_less(), so we can write btree_for_each_entry() too?
> >
> > Not sure what you mean. Something with the same effect as this?
^^^^^^^^^^^ ;)
> >
> > #define btree_for_each_val(head, geo, key, val) \
> > for (key = btree_last(head, geo), \
> > val = btree_lookup(head, geo, key); \
> > val; \
> > key = btree_get_prev_key(head, geo, key), \
> > val = btree_lookup(head, geo, key))
>
> Well, that does lots of lookups that don't seem necessary, since a
> function like btree_last should be able to return the value right away.
> Also, if it was
>
> #define btree_for_each_val(head, geo, key, val)
> for (val = btree_last(head, geo, &key);
> val;
> val = btree_get_prev(head, geo, &key))
>
> it would be more correct, I think?

More efficient, certainly. Half the tree walks are gone. Let's do it.

Note, btw, that this changes effort from O(2n) to O(n), while the old
visitor is O(1) *). That was the reason why I wrote it in the first
place. If the code wasn't as horrible and hard to use, it would be a
clear winner. Guess we'll have to keep both variants.

*) Or rather O(2n*log(n)), O(n*log(n)) and O(log(n)) respectively.

JÃrn

--
Joern's library part 6:
http://www.gzip.org/zlib/feldspar.html
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/