Re: Question about qspinlock nest

From: Peter Zijlstra
Date: Mon Jan 14 2019 - 08:16:21 EST


On Fri, Jan 11, 2019 at 06:32:58PM +0000, James Morse wrote:
> Hi Peter,
>
> On 10/01/2019 20:12, Peter Zijlstra wrote:
> > On Thu, Jan 10, 2019 at 06:25:57PM +0000, James Morse wrote:
> >
> >> On arm64 if all the RAS and pseudo-NMI patches land, our worst-case interleaving
> >> jumps to at least 7. The culprit is APEI using spinlocks to protect fixmap slots.
> >>
> >> I have an RFC to bump the number of node bits from 2 to 3, but as this is APEI
> >> four times, it may be preferable to make it use something other than spinlocks.
>
> >> The worst-case order is below. Each one masks those before it:
> >> 1. process context
> >> 2. soft-irq
> >> 3. hard-irq
> >> 4. pseudo-nmi [0]
> >> - using the irqchip priorities to configure some IRQs as NMI.
> >> 5. SError [1]
> >> - a bit like an asynchronous MCE. ACPI allows this to convey CPER records,
> >> requiring an APEI call.
> >> 6&7. SDEI [2]
> >> - a firmware triggered software interrupt, only it's two of them, either of
> >> which could convey CPER records.
> >> 8. Synchronous external abort
> >> - again, similar to MCE. There are systems using this with APEI.
>
> > The thing is, everything non-maskable (NMI like) really should not be
> > using spinlocks at all.
> >
> > I otherwise have no clue about wth APEI is, but it sounds like horrible
> > crap ;-)
>
> I think you've called it that before!: it's that GHES thing in drivers/acpi/apei.
>
> What is the alternative? bit_spin_lock()?
> These things can happen independently on multiple CPUs. On arm64 these NMI-like
> things don't affect all CPUs like they seem to on x86.

It has nothing to do with how many CPUs are affected. It has everything
to do with not being maskable.

What avoids the trivial self-recursion:

	spin_lock(&x)
	<NMI>
		spin_lock(&x)
		... wait forever more ...
	</NMI>
	spin_unlock(&x)

?
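
To make the recursion concrete, here is a rough userspace sketch using C11
atomics. It is not kernel code: the flag x, try_lock(), unlock(), fake_nmi()
and interrupted_holder() are all made-up names, and the "NMI" is just a
function call made while the lock is held. A trylock stands in for
spin_lock() so the example terminates instead of hanging.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical test-and-set lock standing in for a spinlock. */
static atomic_flag x = ATOMIC_FLAG_INIT;

static bool try_lock(void)
{
	/* test_and_set returns the previous value: false means we got it. */
	return !atomic_flag_test_and_set_explicit(&x, memory_order_acquire);
}

static void unlock(void)
{
	atomic_flag_clear_explicit(&x, memory_order_release);
}

/* Stand-in for an NMI handler that wants the same lock. */
static bool fake_nmi(void)
{
	if (!try_lock())
		return false;	/* a real spin_lock() would spin here forever */
	unlock();
	return true;
}

/* The interrupted context: holds x while the "NMI" arrives. */
static bool interrupted_holder(void)
{
	bool nmi_got_lock;

	try_lock();			/* uncontended, so this succeeds */
	nmi_got_lock = fake_nmi();	/* "NMI" lands inside the critical section */
	unlock();
	return nmi_got_lock;
}
```

The nested attempt can never succeed, because the only context that could
release x is the one the handler is sitting on top of.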

Normally for actual maskable interrupts, we use:

	spin_lock_irq(&x)
	// our IRQ cannot happen here because: masked
	spin_unlock_irq(&x)

But non-maskable has, by definition, a wee issue there.

Non-maskable MUST NOT _EVAH_ use any form of spinlocks, they're
fundamentally incompatible. Non-maskable interrupts must employ
wait-free atomic constructs.
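
As an illustration of what a wait-free construct can look like, below is a
minimal userspace sketch in C11 atomics, not kernel code and not what APEI
does. The names (NSLOTS, slots, next_slot, ready, nmi_post(), drain()) are
all hypothetical. The NMI-like side reserves a slot with a single fetch-add
and drops the record when the buffer is full, so it finishes in a bounded
number of steps no matter what it interrupted.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define NSLOTS 8	/* hypothetical fixed capacity */

struct record {
	int code;
};

static struct record slots[NSLOTS];
static atomic_uint next_slot;		/* number of reservation attempts so far */
static atomic_bool ready[NSLOTS];	/* set once a slot is fully written */

/*
 * NMI-like side: one fetch-add, one store, no loops, no waiting.
 * Sketch only: counter wrap-around after 2^32 attempts is ignored.
 */
static bool nmi_post(int code)
{
	unsigned int idx = atomic_fetch_add_explicit(&next_slot, 1,
						     memory_order_relaxed);

	if (idx >= NSLOTS)
		return false;	/* full: drop the record rather than wait */

	slots[idx].code = code;
	/* Publish the slot only after its contents are written. */
	atomic_store_explicit(&ready[idx], true, memory_order_release);
	return true;
}

/* Maskable side: collect whatever has been published so far. */
static size_t drain(int *out, size_t max)
{
	size_t n = 0;

	for (unsigned int i = 0; i < NSLOTS && n < max; i++) {
		if (atomic_load_explicit(&ready[i], memory_order_acquire))
			out[n++] = slots[i].code;
	}
	return n;
}
```

A nested nmi_post() that lands between the fetch-add and the release store
of an outer one simply gets the next index; neither has to wait for the
other, which is the property a spinlock cannot give you.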