Re: [PATCH - revised] rhashtable: detect when object movement might have invalidated a lookup

From: NeilBrown
Date: Sun Jul 15 2018 - 21:24:18 EST


On Mon, Jul 16 2018, Herbert Xu wrote:

> On Mon, Jul 16, 2018 at 09:57:11AM +1000, NeilBrown wrote:
>>
>> Some users of rhashtable might need to change the key
>> of an object and move it to a different location in the table.
>> Other users might want to allocate objects using
>> SLAB_TYPESAFE_BY_RCU which can result in the same memory allocation
>> being used for a different (type-compatible) purpose and similarly
>> end up in a different hash-chain.
>>
>> To support these, we store a unique NULLS_MARKER at the end of
>> each chain, and when a search fails to find a match, we check
>> if the NULLS marker found was the expected one. If not,
>> the search is repeated.
>>
>> The unique NULLS_MARKER is derived from the address of the
>> head of the chain.
>>
>> If an object is removed and re-added to the same hash chain, we won't
>> notice by looking at the NULLS marker. In this case we must be sure
>> that it was not re-added *after* its original location, or a lookup may
>> incorrectly fail. The easiest solution is to ensure it is inserted at
>> the start of the chain. insert_slow() already does that,
>> insert_fast() does not. So this patch changes insert_fast to always
>> insert at the head of the chain.
>>
>> Note that such a user must do their own double-checking of
>> the object found by rhashtable_lookup_fast() after ensuring
>> mutual exclusion with anything that might change the key, such as
>> successfully taking a new reference.
>>
>> Signed-off-by: NeilBrown <neilb@xxxxxxxx>
>
> I still don't understand why we need this feature. The only
> existing user of this (which currently doesn't use rhashtable)
> does not readd the reused entry to the same table. IOW the flow
> is always from table A to table B. After which the entry will
> be properly freed rather than reused.
>
> So who is going to use this?

I want it so I can use SLAB_TYPESAFE_BY_RCU slab caches in lustre.
lustre isn't in mainline any more so I cannot point at the code but the
concept is simple.
Without this, you need to use an RCU-deferred free for any object added
to an rhashtable.
When kmalloc is used, kfree_rcu() can be used, which is fairly painless.
When a kmem_cache is used, you need to provide your own rcu free
function, which is clumsy.
With SLAB_TYPESAFE_BY_RCU, that isn't needed. The object can be freed
immediately, providing you can cope with it being re-used (as the same
type) immediately. Part of handling that is coping with the possibility
that it might be inserted into the same hash table, and possibly the
same chain, immediately it is freed.
lustre has 6 different resizable hashtables which I want to convert
to use rhashtable.
I currently need call_rcu handlers for 3 of these. I want those
3 to use SLAB_TYPESAFE_BY_RCU instead so they can use
kmem_cache_free() directly. For this, I need rhashtable to be safe if
an object is deleted and immediately re-inserted into the same hash
chain.
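
To make the failure mode concrete, here is a rough userspace sketch of
the nulls-marker check. It is not the rhashtable code: it is
single-threaded, has no RCU, and derives the marker from the bucket
index rather than from the address of the chain head. All names in it
(struct node, walk(), lookup(), stale_walk_demo() and so on) are made
up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NBUCKETS 4u

struct node {
	int key;
	struct node *next;	/* real node, or an odd "nulls" value at chain end */
};

struct table {
	struct node *head[NBUCKETS];
};

/* Encode a bucket number as an odd pointer value that is never a real node. */
static struct node *make_nulls(unsigned int bucket)
{
	return (struct node *)(((uintptr_t)bucket << 1) | 1u);
}

static int is_nulls(const struct node *p)
{
	return (int)((uintptr_t)p & 1u);
}

static unsigned int nulls_value(const struct node *p)
{
	return (unsigned int)((uintptr_t)p >> 1);
}

static unsigned int hash(int key)
{
	return (unsigned int)key % NBUCKETS;
}

static void table_init(struct table *t)
{
	for (unsigned int i = 0; i < NBUCKETS; i++)
		t->head[i] = make_nulls(i);
}

/* Always insert at the head, so a re-added object is never placed
 * after its old position in the same chain. */
static void insert_head(struct table *t, struct node *n)
{
	unsigned int b = hash(n->key);

	n->next = t->head[b];
	t->head[b] = n;
}

static void remove_node(struct table *t, struct node *n)
{
	struct node **pp = &t->head[hash(n->key)];

	while (!is_nulls(*pp)) {
		if (*pp == n) {
			*pp = n->next;
			return;
		}
		pp = &(*pp)->next;
	}
}

/* Walk from pos; on a miss, report which bucket's marker ended the walk. */
static struct node *walk(struct node *pos, int key, unsigned int *end)
{
	for (; !is_nulls(pos); pos = pos->next)
		if (pos->key == key)
			return pos;
	*end = nulls_value(pos);
	return NULL;
}

/* Retry whenever a miss ended on some other bucket's marker. */
static struct node *lookup(struct table *t, int key)
{
	unsigned int b = hash(key), end;
	struct node *n;

	do {
		end = b;
		n = walk(t->head[b], key, &end);
	} while (!n && end != b);
	return n;
}

/* Returns 1 if a walk that was led astray is detected and the retry succeeds. */
static int stale_walk_demo(void)
{
	struct table t;
	struct node a = { 0, NULL }, b = { 4, NULL }, c = { 8, NULL };
	unsigned int end = 0;

	table_init(&t);
	insert_head(&t, &a);
	insert_head(&t, &b);
	insert_head(&t, &c);		/* bucket 0: c -> b -> a -> nulls(0) */

	struct node *paused = &b;	/* reader looking for key 0 stopped here */

	remove_node(&t, &b);		/* object is "freed" ... */
	b.key = 5;			/* ... and reused under a key in bucket 1 */
	insert_head(&t, &b);

	/* Reader resumes from b, misses key 0, and ends on bucket 1's marker. */
	if (walk(paused, 0, &end) != NULL || end != 1)
		return 0;
	/* The mismatch (1 != 0) is the cue to restart from the head. */
	return lookup(&t, 0) == &a;
}
```

stale_walk_demo() plays out the case by hand: the reader is parked on an
object that is then removed and reused in another chain, so its walk
ends on the wrong marker and it has to start over. The same-chain case
cannot be caught this way, which is why the insert has to go at the
head.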

Thanks,
NeilBrown
