Re: [PATCH 01/11] staging: lustre: simplify use of interval-tree.

From: NeilBrown
Date: Sat Jun 16 2018 - 19:00:09 EST


On Sat, Jun 16 2018, James Simmons wrote:

>> Lustre has a private interval-tree implementation. This
>> implementation (inexplicably) refuses to insert an interval if an
>> identical interval already exists. It is OK with all sorts of
>> overlapping intervals, but identical intervals are rejected.
>
> I talked to Oleg about this since this changes the behavior. He is worried
> about having identical items that would end up being merged. If we can
> guarantee by some other means there are no identical nodes, we are
> probably fine with the interval tree code allowing this. Oleg can explain
> better than me in this case.

I don't think this is a change in behaviour.
In the driver/staging client code, interval tree is being used in two
places and both of them have clumsy work-arounds for the fact that they
cannot insert duplicates in the interval tree.
The patch just cleans his up.

However if I have missed something, please provide details.
What "identical items" might get merged?

>
>> Both users of interval-tree in lustre would be simpler if this was not
>> the case. They need to store all intervals, even if some are
>> identical.
>>
>> llite/range_lock.c add a rl_next_lock list_head to each lock.
>> If it cannot insert a new lock because the range is in use, it
>> attached the new lock to the existing lock using rl_next_lock.
>> This requires extra code to iterate over the rl_next_lock lists when
>> iterating over locks, and to update the list when deleting a lock from
>> the tree.
>>
>> ldlm_extend allocates a separate ldlm_interval which as a list of
>> ldlm_locks which share the same interval. This is linked together
>> by over-loading the l_sl_policy which, for non-extent locks, is used
>> for linking together locks with the same policy.
>> This doesn't only require extra code, but also an extra memory
>> allocation.
>>
>> This patch removes all that complexity.
>> - interval_insert() now never fails.
>
> Its not really a failure. What it does is if it finds a already existing
> node with the range requested it returns the already existing node
> pointer. If not it just creates a new node and returns NULL. Sometimes
> identical request can happen. A good example of this is with HSM request
> on the MDS server. In that case sometimes we get identical progress
> reports which we want to filter out so not add the same data.

This example is server-side code which is not a focus at present.
Having a quick look, it looks like it would be easy enough to do a
lookup first and then only insert if the lookup failed.
I think this is a much nicer approach than never allowing duplicates in
the interval tree.

Thanks,
NeilBrown

Attachment: signature.asc
Description: PGP signature