Re: Question about qspinlock nest

From: James Morse
Date: Mon Jan 14 2019 - 08:54:55 EST


Hi Peter,

On 14/01/2019 13:16, Peter Zijlstra wrote:
> On Fri, Jan 11, 2019 at 06:32:58PM +0000, James Morse wrote:
>> On 10/01/2019 20:12, Peter Zijlstra wrote:
>>> On Thu, Jan 10, 2019 at 06:25:57PM +0000, James Morse wrote:
>>> The thing is, everything non-maskable (NMI like) really should not be
>>> using spinlocks at all.
>>>
>>> I otherwise have no clue about wth APEI is, but it sounds like horrible
>>> crap ;-)
>>
>> I think you've called it that before! It's that GHES thing in drivers/acpi/apei.
>>
>> What is the alternative? bit_spin_lock()?

>> These things can happen independently on multiple CPUs. On arm64 these NMIlike
>> things don't affect all CPUs like they seem to on x86.
>
> It has nothing to do with how many CPUs are affected. It has everything
> to do with not being maskable.

(sorry, I didn't include any of the context, let me back up a bit here:)

> What avoids the trivial self-recursion:
>
> spin_lock(&x)
> <NMI>
> spin_lock(&x)
> ... wait forever more ...
> </NMI>
> spin_unlock(&x)
>
> ?

If it's trying to take the same lock, I agree it's deadlocked.
If the sequence above started with <NMI>, I agree it's deadlocked.

APEI/GHES is doing neither of these things. It takes a lock that is only ever
taken in_nmi(). nmi_enter()'s BUG_ON(in_nmi()) means these never become re-entrant.

What is the lock doing? Protecting the 'NMI' fixmap slot in the unlikely case
that two CPUs end up in here at the same time.
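
A minimal userspace sketch of the argument above, not kernel code. It assumes
a lock that is only ever taken from the NMI-like handler, and a per-thread
flag standing in for in_nmi(). All model_* names, nmi_fixmap_lock and
nmi_slot_uses are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model: the lock guarding the single 'NMI' fixmap slot. */
static atomic_flag nmi_fixmap_lock = ATOMIC_FLAG_INIT;

/* Stands in for in_nmi() on the current CPU (one thread == one CPU). */
static _Thread_local bool model_in_nmi;

/* Counts how often the slot was used; only touched under the lock. */
static int nmi_slot_uses;

static void model_nmi_enter(void)
{
	/* Mirrors the BUG_ON(in_nmi()) check: no re-entry on this CPU. */
	assert(!model_in_nmi);
	model_in_nmi = true;
}

static void model_nmi_exit(void)
{
	model_in_nmi = false;
}

/* The only place the lock is ever taken, so it cannot interrupt a holder
 * on the same CPU; it can only contend with a handler on another CPU. */
static void model_ghes_nmi_handler(void)
{
	model_nmi_enter();

	while (atomic_flag_test_and_set_explicit(&nmi_fixmap_lock,
						 memory_order_acquire))
		;	/* spin: another CPU is using the slot */

	nmi_slot_uses++;	/* map, read and unmap would go here */

	atomic_flag_clear_explicit(&nmi_fixmap_lock, memory_order_release);

	model_nmi_exit();
}
```

The assert in model_nmi_enter() is what rules out the self-recursion case: if
the handler could interrupt itself on the same CPU, the model would abort
there instead of spinning forever on the lock.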

(I thought x86's NMI masked NMI until the next iret?)


This is murkier on arm64 as we have multiple things that behave like this, but
there is an order to them, and none of them can interrupt themselves.
e.g. We can't take an SError during the SError handler.

But we can take this SError/NMI on another CPU while the first one is still
running the handler.

These multiple NMI-like notifications mean having multiple locks/fixmap-slots,
one per notification. This is where the qspinlock node limit comes in, as we
could have more than 4 contexts.
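
To illustrate the node limit, here is a simplified userspace model of per-CPU
queue-node accounting, assuming one node is claimed per nested context that
enters the lock slowpath. It is a sketch, not the qspinlock implementation.
MODEL_MAX_NODES, struct model_cpu_nodes and the model_* helpers are
hypothetical names, and returning -1 is just this model's way of flagging the
overflow.

```c
#include <stddef.h>

/* The four-context limit discussed above. */
#define MODEL_MAX_NODES 4

/* Hypothetical per-CPU state: how many nodes are currently in use. */
struct model_cpu_nodes {
	int count;
};

/* Claim the next node for a newly nested context on this CPU.
 * Returns the node index, or -1 when a fifth context would need a node. */
static int model_claim_node(struct model_cpu_nodes *cpu)
{
	int idx = cpu->count++;

	if (idx >= MODEL_MAX_NODES) {
		cpu->count--;	/* undo: no node available at this depth */
		return -1;
	}
	return idx;
}

/* Release the most recently claimed node when the context unwinds. */
static void model_release_node(struct model_cpu_nodes *cpu)
{
	cpu->count--;
}
```

With one lock per NMI-like notification, each notification that nests on top
of task, softirq and hardirq context would claim a further index, which is
how the count can exceed four.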


Thanks,

James

> Normally for actual maskable interrupts, we use:
>
> spin_lock_irq(&x)
> // our IRQ cannot happen here because: masked
> spin_unlock_irq(&x)
>
> But non-maskable, has, per definition, a wee issue there.

> Non-maskable MUST NOT _EVAH_ use any form of spinlocks, they're
> fundamentally incompatible. Non-maskable interrupts must employ
> wait-free atomic constructs.