Re: [PATCH] locking/osq_lock: fix a data race in osq_wait_next

From: Peter Zijlstra
Date: Thu Jan 23 2020 - 04:39:13 EST


On Wed, Jan 22, 2020 at 06:54:43PM -0500, Qian Cai wrote:
> diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c
> index 1f7734949ac8..832e87966dcf 100644
> --- a/kernel/locking/osq_lock.c
> +++ b/kernel/locking/osq_lock.c
> @@ -75,7 +75,7 @@ osq_wait_next(struct optimistic_spin_queue *lock,
> * wait for either @lock to point to us, through its Step-B, or
> * wait for a new @node->next from its Step-C.
> */
> - if (node->next) {
> + if (READ_ONCE(node->next)) {
> next = xchg(&node->next, NULL);
> if (next)
> break;

This could possibly trigger the warning, but it is a false positive. The
change above doesn't fix anything: even if that load is torn, the code
still functions correctly. It only checks for a !0 value, so any byte
composite that is !0 is sufficient.

This is in fact something the KCSAN compiler infrastructure could deduce.
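
To illustrate the argument about the quoted hunk, here is a minimal userspace
sketch of the check-then-xchg pattern. It is not the kernel code: struct node,
take_next() and demo() are hypothetical names, and C11 atomics stand in for the
kernel's plain load and xchg(). C11 atomics cannot model a torn load, so this
only shows that the first read acts as a hint and that the pointer actually
used comes from the atomic exchange.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct optimistic_spin_node. */
struct node {
	struct node *_Atomic next;
};

/*
 * Sketch of the check-then-xchg pattern from osq_wait_next().
 * The first load is only tested against zero and its value is then
 * discarded. The pointer that gets returned comes from the exchange.
 */
static struct node *take_next(struct node *n)
{
	/* Relaxed load stands in for the kernel's plain node->next read. */
	if (atomic_load_explicit(&n->next, memory_order_relaxed)) {
		/* Userspace analogue of xchg(&node->next, NULL). */
		return atomic_exchange(&n->next, NULL);
	}
	return NULL;
}

/* Small single-threaded walk-through of the three interesting states. */
static void demo(void)
{
	struct node a = { NULL }, b = { NULL };

	assert(take_next(&a) == NULL);	/* nothing queued behind a yet */
	atomic_store(&a.next, &b);	/* a successor links itself in */
	assert(take_next(&a) == &b);	/* exchange hands back the successor */
	assert(take_next(&a) == NULL);	/* and leaves NULL behind */
}
```

In the real function the check sits inside a retry loop, so a read that sees
zero just means another iteration, and a non-zero read whose xchg() then
returns NULL does not break out of the loop either.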