Re: Question about qspinlock nest

From: James Morse
Date: Fri Jan 11 2019 - 13:33:04 EST


Hi Peter,

On 10/01/2019 20:12, Peter Zijlstra wrote:
> On Thu, Jan 10, 2019 at 06:25:57PM +0000, James Morse wrote:
>
>> On arm64, if all the RAS and pseudo-NMI patches land, our worst-case interleaving
>> jumps to at least 7. The culprit is APEI using spinlocks to protect fixmap slots.
>>
>> I have an RFC to bump the number of node bits from 2 to 3, but as four of
>> these contexts are APEI, it may be preferable to make it use something other
>> than spinlocks.
>>
>> The worst-case order is below. Each one masks those before it:
>> 1. process context
>> 2. soft-irq
>> 3. hard-irq
>> 4. pseudo-NMI [0]
>> - using the irqchip priorities to configure some IRQs as NMI.
>> 5. SError [1]
>> - a bit like an asynchronous MCE. ACPI allows this to convey CPER records,
>> requiring an APEI call.
>> 6&7. SDEI [2]
>> - a firmware-triggered software interrupt, only there are two of them, either
>> of which could convey CPER records.
>> 8. Synchronous external abort
>> - again, similar to MCE. There are systems using this with APEI.

> The thing is, everything non-maskable (NMI like) really should not be
> using spinlocks at all.
>
> I otherwise have no clue about wth APEI is, but it sounds like horrible
> crap ;-)

I think you've called it that before! It's that GHES thing in drivers/acpi/apei.

What is the alternative? bit_spin_lock()?
These things can happen independently on multiple CPUs. On arm64 these NMI-like
things don't affect all CPUs like they seem to on x86.
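
For reference, the node-bits arithmetic behind the RFC can be sketched as
below. This is a standalone user-space illustration, not kernel code: the
helper names node_slots() and depth_fits() are made up for the example, and it
assumes only what is stated above, namely that the per-CPU node index is 2 bits
wide today and that the worst-case nesting is at least 7 deep.

```c
#include <assert.h>

/*
 * Hypothetical helper: how many per-CPU nesting levels a node index
 * of 'bits' bits can distinguish.
 */
static unsigned int node_slots(unsigned int bits)
{
	assert(bits < 32);	/* keep the shift well defined */
	return 1u << bits;
}

/*
 * Hypothetical helper: returns 1 if 'depth' nested lock acquisitions
 * on one CPU can each get their own node, 0 otherwise.
 */
static int depth_fits(unsigned int depth, unsigned int bits)
{
	return depth <= node_slots(bits);
}
```

With these, depth_fits(7, 2) is 0 (only 4 slots) and depth_fits(7, 3) is 1
(8 slots), which is why one extra bit would be enough for the list above.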


Thanks,

James