Re: [PATCH 3/3] kref: Remove the memory barriers

From: Ming Lei
Date: Sat Dec 10 2011 - 21:22:57 EST


On Sun, Dec 11, 2011 at 3:49 AM, Peter Zijlstra <a.p.zijlstra@xxxxxxxxx> wrote:
> On Sat, 2011-12-10 at 23:57 +0800, Ming Lei wrote:
>
>> CPU0                  CPU1
>>
>> atomic_set(v)
>> smp_mb()
>>                       smp_mb()
>>                       atomic_dec_and_test(v)
>>
>> Without the barrier after atomic_set, CPU1 may see a stale
>> value of v first, then decrement it, and so may miss a release operation.
>
> Your example is doubly broken. If there's concurrency possible with
> atomic_set() you've lost.

kref_init() is guaranteed to run exactly once, __before__ any
kref_get()/kref_put() is executed.
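
To make the lifecycle concrete, here is a minimal userspace sketch using
C11 atomics. It is not the kernel implementation: the model_kref_* names
are hypothetical, and the assert() in model_kref_get() is an assumption of
this model that a get on a zero count is a caller bug. The default
(sequentially consistent) C11 operations stand in for the kernel atomics.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace model of a kref; not the kernel's struct kref. */
struct model_kref {
	atomic_int refcount;
};

/* Runs once, before the object is visible to any other thread. */
static void model_kref_init(struct model_kref *k)
{
	atomic_init(&k->refcount, 1);
}

/* Takes a reference; the caller must already hold a valid one. */
static void model_kref_get(struct model_kref *k)
{
	int old = atomic_fetch_add(&k->refcount, 1);

	/* Assumption of this model: a get on a zero count is a bug. */
	assert(old > 0);
}

/* Drops a reference; returns true if it was the last one. */
static bool model_kref_put(struct model_kref *k)
{
	return atomic_fetch_sub(&k->refcount, 1) == 1;
}
```

With one init, one get and two puts, only the second put reports that the
last reference was dropped.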

>
> Let's change it to kref_get() aka atomic_inc():
>
>        CPU0            CPU1
>
>        atomic_inc()
>                        atomic_dec_and_test()
>
> and
>
>                        atomic_dec_and_test()
>        atomic_inc()
>
> For if the first is possible, then so is the second.

Yes, both are reasonable.
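
For reference, the two orderings can be replayed one after the other in a
single thread. This is only a sketch under stated assumptions: the count
starts at 1, replay() and struct outcome are hypothetical names, and C11
atomic_fetch_add()/atomic_fetch_sub() stand in for atomic_inc() and
atomic_dec_and_test(). It shows what each order does to the count, not
how real CPUs would interleave.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct outcome {
	bool released;        /* the dec saw the count reach zero */
	bool get_after_zero;  /* the inc ran on a count that was already zero */
};

/* Replay one of the two orderings, starting from a single reference. */
static struct outcome replay(bool put_first)
{
	struct outcome o = { false, false };
	atomic_int refs;

	atomic_init(&refs, 1);

	if (put_first) {
		/* dec first: 1 -> 0, so the release path would run here */
		o.released = (atomic_fetch_sub(&refs, 1) == 1);
		/* inc second: 0 -> 1, on an object that is already gone */
		o.get_after_zero = (atomic_fetch_add(&refs, 1) == 0);
	} else {
		/* inc first: 1 -> 2 */
		atomic_fetch_add(&refs, 1);
		/* dec second: 2 -> 1, nothing is released */
		o.released = (atomic_fetch_sub(&refs, 1) == 1);
	}
	return o;
}

/* Usage: check both orderings. */
static void replay_selfcheck(void)
{
	struct outcome inc_first = replay(false);
	struct outcome dec_first = replay(true);

	assert(!inc_first.released && !inc_first.get_after_zero);
	assert(dec_first.released && dec_first.get_after_zero);
}
```

In the first order nothing is released; in the second the count hits zero
before the increment runs.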

>
> This illustrates that no matter how many barriers you put in, you're
> still up shit creek without no paddle because the kref_put() can come in
> before you do the kref_get(), making the kref_get() the invalid
> operation.

So an smp_mb__before_atomic_inc() should be added before atomic_inc()
to make sure that CPU0 sees the up-to-date refcount, right?

>> The pair of smp_mb() calls can enforce ordering between atomic_set
>> and atomic_dec_and_test, can't they?
>
> No. Because there's nothing stopping the dec from happening before the
> set/inc.

As stated above, an smp_mb__before_atomic_inc() before atomic_inc()
may avoid the race.

>> > If there's a destruction race with kref_put() the barrier won't
>>
>> Sorry, could you say what the destruction race is?
>
> The race to 0-refs, iow. the case above when you assume both cases start
> out with 1 ref.

But the initial value of the kref is 1, so it seems we don't need to
consider the 0-refs case.


thanks,
--
Ming Lei