Re: [PATCH 1/2] qspinlock: Ensure writes are pushed out of core write buffer

From: Peter Zijlstra
Date: Thu Jan 28 2021 - 06:26:37 EST


On Thu, Jan 28, 2021 at 08:36:24AM +0100, Alexander Sverdlin wrote:

> >> diff --git a/kernel/locking/mcs_spinlock.h b/kernel/locking/mcs_spinlock.h
> >> index 5e10153..10e497a 100644
> >> --- a/kernel/locking/mcs_spinlock.h
> >> +++ b/kernel/locking/mcs_spinlock.h
> >> @@ -89,6 +89,11 @@ void mcs_spin_lock(struct mcs_spinlock **lock, struct mcs_spinlock *node)
> >> return;
> >> }
> >> WRITE_ONCE(prev->next, node);
> >> + /*
> >> + * This is necessary to make sure that the corresponding "while" in the
> >> + * mcs_spin_unlock() doesn't loop forever
> >> + */
> >> + smp_wmb();
> > If it loops forever, that's broken hardware design; store buffers need to
> > drain. I don't think we should add unconditional barriers to bodge this.
>
> The comment is a bit exaggerating the situation, but it's undeterministic and you see the
> measurements above. Something is wrong in the qspinlocks code, please consider this patch
> "RFC", but something has to be done here.

The qspinlock code has been TLA+ modelled and has had extensive memory
ordering analysis. It has had lots of runtime on extremely large x86,
arm64, and Power machines. I'm fairly confident there is nothing wrong.

What I do think is more likely is that your platform is broken, it
wouldn't be the first MIPS that's got store-buffers completely wrong,
see commit:

a30718868915 ("MIPS: Change definition of cpu_relax() for Loongson-3")

Working around micro arch store-buffer issues is not something the
generic code is for.