Questions on prio_tree

From: Kevin O'Connor
Date: Sun Apr 18 2004 - 11:15:59 EST

Hi Rajesh,

I've been reading through your recently posted prio tree code. There are a
couple of things in the code that I do not understand, and I am concerned
that they may be bugs or omissions. Hopefully, you'll be just able to
clarify the code for me. :-)

1 - Why does prio_tree_expand use the max_heap_index to determine the
height of the radix tree? The bits of the radix_index is only ever used
(never the bits in the heap_index), so shouldn't max_radix_index be used?
I understand that using the max_heap_index isn't necessary broken (it is
guaranteed to be greater than or equal to the max_radix_index), but it
seems inefficient. As an example, consider the case where 100 vmas are
created all starting at address 0 and ending at addresses 0 through 100.
The current code will force the address part of the tree to a height of 7,
but since there is only one start address (0) the tree really could have a
height of just 1.

2 - When traversing the tree, after the first root->index_bits levels of
the tree have been walked, the code starts inspecting the "size" portion of
the radix key. Why does the code set the start bit for the size portion to
root->index_bits? Shouldn't the code set the start bit to (BITS_PER_LONG -

Consider the case where 100 vmas are created all starting at address 0 and
ending at addresses 0 through 100, followed by the creation of a vma from
address 100,000 - 100,001. After the first 100 vmas, the tree will have a
height of 7 (reasons sited above) for the start address and an additional
height of 7 for the size portion of the index. This will cause the tree to
have 7 records in the start_address portion of the tree, and the remaining
93 will be distributed pretty evenly in the size portion of the radix tree
that hangs off the zero address. When the 101st vma is added, the start
address portion of the tree will be expanded to a height of 17. The
current code will expand the tree to contain 17 records in the size portion
of the radix tree and leave the remaining 84 records in the size portion of
the tree hanging off the zero address.

The problem, as I see it, is that new inserts into the tree that start at
address zero will not be inspecting the same bits that were used when the
first 100 were inserted. (When the first 100 were inserted, bits 1-7 were
used to branch in the size portion of the tree, but inserts 102 and on will
use bits 1-17.) I don't think this can cause a "crash", but I think it
skews the tree ordering and could disturb the O(log n + m) behavior of the

Did I miss something?

| Kevin O'Connor "BTW, IMHO we need a FAQ for |
| kevin@xxxxxxxxxxxx 'IMHO', 'FAQ', 'BTW', etc. !" |
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at
Please read the FAQ at