Re: [PATCH] locking/osq_lock: fix a data race in osq_wait_next

From: Qian Cai
Date: Mon Jan 27 2020 - 22:11:34 EST

> On Jan 23, 2020, at 4:39 AM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
> On Wed, Jan 22, 2020 at 06:54:43PM -0500, Qian Cai wrote:
>> diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c
>> index 1f7734949ac8..832e87966dcf 100644
>> --- a/kernel/locking/osq_lock.c
>> +++ b/kernel/locking/osq_lock.c
>> @@ -75,7 +75,7 @@ osq_wait_next(struct optimistic_spin_queue *lock,
>>  		 * wait for either @lock to point to us, through its Step-B, or
>>  		 * wait for a new @node->next from its Step-C.
>>  		 */
>> -		if (node->next) {
>> +		if (READ_ONCE(node->next)) {
>>  			next = xchg(&node->next, NULL);
>>  			if (next)
>>  				break;
>
> This could possibly trigger the warning, but is a false positive. The
> above doesn't fix anything in that even if that load is shattered the
> code will function correctly -- it checks for any !0 value, any byte
> composite that is !0 is sufficient.
>
> This is in fact something KCSAN compiler infrastructure could deduce.


Marco, any thoughts on improving KCSAN to recognize this pattern and
reduce the false positives?
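
For reference, here is a rough user-space sketch of Peter's argument. It
is not kernel code. It assumes C11 atomic_exchange() as a stand-in for the
kernel's xchg(), and the helpers torn_view() and count_torn_results() are
hypothetical names made up for illustration. It models only the
NULL -> pointer transition: the plain load may see any byte mix of the old
and new value, but it is used only as a hint, and the value actually acted
upon comes from the atomic exchange.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: model a torn ("shattered") plain load by taking
 * byte i from new_val when bit i of mask is set, else from old_val.
 */
static uintptr_t torn_view(uintptr_t old_val, uintptr_t new_val, unsigned mask)
{
	uintptr_t out = 0;

	for (size_t i = 0; i < sizeof(uintptr_t); i++) {
		uintptr_t byte = (uintptr_t)0xff << (8 * i);

		out |= (((mask >> i) & 1) ? new_val : old_val) & byte;
	}
	return out;
}

/*
 * Count cases where the value acted upon is neither 0 nor next_ptr,
 * i.e. where tearing in the hint leaked into the result.
 */
static unsigned count_torn_results(uintptr_t next_ptr)
{
	unsigned bad = 0;

	assert(next_ptr != 0);
	for (unsigned mask = 0; mask < (1u << sizeof(uintptr_t)); mask++) {
		uintptr_t hint = torn_view(0, next_ptr, mask);

		/* The slot itself only ever holds a whole value: 0 or next_ptr. */
		for (int stored = 0; stored < 2; stored++) {
			_Atomic uintptr_t slot;
			uintptr_t got = 0;

			atomic_init(&slot, stored ? next_ptr : 0);
			if (hint)	/* same shape as: if (node->next) */
				got = atomic_exchange(&slot, 0);
			if (got != 0 && got != next_ptr)
				bad++;
		}
	}
	return bad;
}
```

Calling count_torn_results() with any non-zero value should return 0 in
this model: a torn hint that happens to read as zero only causes another
trip around the loop, and a torn non-zero hint only triggers the exchange,
which returns a whole value.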