Re: [PATCH v3] rhashtable: detect when object movement between tables might have invalidated a lookup

From: NeilBrown
Date: Sun Dec 02 2018 - 17:20:53 EST


On Sat, Dec 01 2018, Herbert Xu wrote:

> On Fri, Nov 30, 2018 at 10:26:50AM +1100, NeilBrown wrote:
>>
>> diff --git a/lib/rhashtable.c b/lib/rhashtable.c
>> index 30526afa8343..852ffa5160f1 100644
>> --- a/lib/rhashtable.c
>> +++ b/lib/rhashtable.c
>> @@ -1179,8 +1179,7 @@ struct rhash_head __rcu **rht_bucket_nested(const struct bucket_table *tbl,
>> unsigned int hash)
>> {
>> const unsigned int shift = PAGE_SHIFT - ilog2(sizeof(void *));
>> - static struct rhash_head __rcu *rhnull =
>> - (struct rhash_head __rcu *)NULLS_MARKER(0);
>> + static struct rhash_head __rcu *rhnull;
>> unsigned int index = hash & ((1 << tbl->nest) - 1);
>> unsigned int size = tbl->size >> tbl->nest;
>> unsigned int subhash = hash;
>> @@ -1198,8 +1197,11 @@ struct rhash_head __rcu **rht_bucket_nested(const struct bucket_table *tbl,
>> subhash >>= shift;
>> }
>>
>> - if (!ntbl)
>> + if (!ntbl) {
>> + if (!rhnull)
>> + INIT_RHT_NULLS_HEAD(rhnull);
>> return &rhnull;
>> + }
>
> I think you missed my earlier reply because of bouncing emails.

Yeah, sorry about that. I should have looked through an lkml archive
once I realized that was happening - I have now.

>
> I think this is unnecessary because
>
> RHT_NULLS_MARKER(RHT_NULLS_MARKER(0)) = RHT_NULLS_MARKER(0)
>

I don't understand how this is relevant.

I think you are saying that when rht_bucket_nested() finds that the
target page hasn't been allocated, it should return a pointer to a
static variable which contains RHT_NULLS_MARKER(0):

static struct rhash_head *rhnull = RHT_NULLS_MARKER(0);

Then in __rhashtable_lookup(),
head = rht_bucket(tbl, hash);

would result in 'head' having the value '&rhnull'.

Then
rht_for_each_rcu_continue(he, *head, tbl, hash) {

would result in 'he' having the value RHT_NULLS_MARKER(0)

Then
} while (he != RHT_NULLS_MARKER(head));

will compare RHT_NULLS_MARKER(0) with RHT_NULLS_MARKER(&rhnull),
and they will be different, so it will loop indefinitely.
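
To make that comparison concrete, here is a small userspace sketch. The
names are made up for illustration, and sketch_nulls_marker() is only an
assumed model of RHT_NULLS_MARKER() (clear bit 0 of the address, then set
it as the nulls tag), chosen because it satisfies the identity quoted
above; it is not copied from the kernel headers.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumed model of RHT_NULLS_MARKER(): drop bit 0 of the value, then
 * set bit 0 as the "nulls" tag.  Hypothetical helper, not kernel code.
 */
static uintptr_t sketch_nulls_marker(uintptr_t p)
{
	return ((p >> 1) << 1) | 1u;
}

/*
 * Models the "he != RHT_NULLS_MARKER(head)" test: the lookup only stops
 * when the end-of-chain value equals the marker derived from the
 * address of the bucket head.
 */
static bool sketch_lookup_terminates(uintptr_t end_of_chain, const void *head)
{
	return end_of_chain == sketch_nulls_marker((uintptr_t)head);
}

/* Plays the role of the static 'rhnull' in rht_bucket_nested(). */
static uintptr_t sketch_rhnull;
```

If sketch_rhnull holds sketch_nulls_marker(0), the test never succeeds,
because the address of sketch_rhnull is not 0. If it holds the marker of
its own address, which is what the INIT_RHT_NULLS_HEAD(rhnull) call in
the patch is for, the test succeeds on the first pass.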

With respect to the shifting, you wrote:

> The top-bit is most likely to be fixed and offer no real value.

While this might be likely, it is not certain, so not relevant.
On a 32-bit machine with more than 2GB of physical memory, some memory
addresses will have 0 in the msb, some will have 1.
It is possible (however unlikely) that two hash buckets in different
tables will have the same address except for the msb. If we ignore the
msb, we might incorrectly determine that we have reached the end of the
chain from the first bucket, whereas we actually reached the end of the
chain from the second bucket.
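
A userspace sketch of that collision, using uint32_t to stand in for a
32-bit address. Both helpers are hypothetical: one assumes the marker is
built by shifting the address left by one (so bit 31 is discarded), the
other assumes bit 0 is cleared and reused as the tag (so bit 31 is kept).
The two addresses are made-up values that differ only in the msb.

```c
#include <stdint.h>

/* Hypothetical marker that shifts left by one: bit 31 falls off. */
static uint32_t sketch_marker_drop_msb(uint32_t addr)
{
	return (uint32_t)(addr << 1) | 1u;
}

/* Hypothetical marker that keeps bits 31..1 and uses bit 0 as the tag. */
static uint32_t sketch_marker_keep_msb(uint32_t addr)
{
	return ((addr >> 1) << 1) | 1u;
}

/* Two made-up bucket addresses that differ only in the msb. */
static const uint32_t sketch_bucket_a = 0x40001000u;
static const uint32_t sketch_bucket_b = 0xC0001000u;
```

With the first helper both buckets get the same marker, so reaching the
end of one chain is indistinguishable from reaching the end of the other.
With the second helper the markers differ.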

Thanks,
NeilBrown
