Re: [PATCH v9 09/17] x86/split_lock: Handle #AC exception for split lock

From: Thomas Gleixner
Date: Wed Jun 26 2019 - 17:47:59 EST


On Wed, 26 Jun 2019, Fenghua Yu wrote:

> On Wed, Jun 26, 2019 at 10:20:05PM +0200, Thomas Gleixner wrote:
> > On Tue, 18 Jun 2019, Fenghua Yu wrote:
> > > +
> > > +static atomic_t split_lock_debug;
> > > +
> > > +void split_lock_disable(void)
> > > +{
> > > + /* Disable split lock detection on this CPU */
> > > + this_cpu_and(msr_test_ctl_cached, ~MSR_TEST_CTL_SPLIT_LOCK_DETECT);
> > > + wrmsrl(MSR_TEST_CTL, this_cpu_read(msr_test_ctl_cached));
> > > +
> > > + /*
> > > + * Use the atomic variable split_lock_debug to ensure only the
> > > + * first CPU hitting split lock issue prints one single complete
> > > + * warning. This also solves the race if the split-lock #AC fault
> > > + * is re-triggered by NMI of perf context interrupting one
> > > + * split-lock warning execution while the original WARN_ONCE() is
> > > + * executing.
> > > + */
> > > + if (atomic_cmpxchg(&split_lock_debug, 0, 1) == 0) {
> > > + WARN_ONCE(1, "split lock operation detected\n");
> > > + atomic_set(&split_lock_debug, 0);
> >
> > What's the purpose of this atomic_set()?
>
> atomic_set() releases the split_lock_debug flag after WARN_ONCE() is done.
> The same split_lock_debug flag will be used in sysfs write for atomic
> operation as well, as proposed by Ingo in https://lkml.org/lkml/2019/4/25/48

Your comment above lacks any useful information about that whole thing.

> So that's why the flag needs to be cleared, right?

Errm. No.

        CPU 0                                   CPU 1

        hits AC                                 hits AC
          if (atomic_cmpxchg() == success)        if (atomic_cmpxchg() == success)
                warn()                                  warn()

So only one of the CPUs will win the cmpxchg race, set the variable to 1 and
warn. The other CPU, and any subsequent AC on any CPU, will not warn.
So you don't need WARN_ONCE() at all. It's redundant and confusing
along with the atomic_set().
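
To illustrate, here is a minimal user-space sketch of that pattern. It uses
C11 <stdatomic.h> instead of the kernel's atomic_t API, and the names
warn_once_global() and warn_count are hypothetical, made up for this example
only. The flag is never cleared, so the cmpxchg alone provides the "once"
semantics:

```c
#include <stdatomic.h>
#include <stdio.h>

/* 0 = nobody has warned yet, 1 = the warning has been claimed. */
static atomic_int split_lock_debug;

/* Hypothetical counter, only here to make the effect observable. */
static atomic_int warn_count;

/*
 * Returns 1 if this caller won the cmpxchg race and printed the warning,
 * 0 otherwise. The flag stays at 1 afterwards, so no later caller warns.
 */
static int warn_once_global(void)
{
	int expected = 0;

	if (atomic_compare_exchange_strong(&split_lock_debug, &expected, 1)) {
		fprintf(stderr, "split lock operation detected\n");
		atomic_fetch_add(&warn_count, 1);
		return 1;
	}
	return 0;
}
```

Calling warn_once_global() repeatedly, from one thread or from several,
should print the message a single time: the first successful cmpxchg flips
the flag and every later attempt fails the comparison.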

Without reading that link [1], what Ingo proposed was surely not the
trainwreck which you decided to put into that debugfs thing.

Thanks,

tglx

[1] lkml.org sucks. We have https://lkml.kernel.org/r/$MESSAGEID for
that. That actually works.