Re: [PATCH v9 3/7] acpi: apei: Add SEI notification type support for ARMv8

From: gengdongjiu
Date: Tue Jan 23 2018 - 04:24:39 EST


Hi James,

On 2018/1/23 3:39, James Morse wrote:
> Hi Dongjiu Geng,
>
> (versions of patches 1,2 and 4 have been queued by Catalin)
>
> (Nit 'ACPI / APEI:' is the normal subject prefix for ghes.c, this helps the
> maintainers know which patches they need to pay attention to when you are
> touching multiple trees)
>
> On 06/01/18 16:02, Dongjiu Geng wrote:
>> ARMv8.2 requires implementation of the RAS extension.
>
>> In
>> this extension, it adds SEI(SError Interrupt) notification
>> type, this patch adds new GHES error source SEI handling
>> functions.
>
> This reads as if this patch is handling SError RAS notifications generated by a
> CPU with the RAS extensions. These are about CPU->Software notifications. APEI
> and GHES are a firmware first mechanism which is Software->Software.
> Reading the v8.2 documents won't help anyone with the APEI/GHES code.
>
> Please describe this from the ACPI view, "ACPI 6.x adds support for NOTIFY_SEI
> as a GHES notification mechanism... ", its up to the arch code to spot a v8.2
> RAS Error based on the cpu caps.Ok, I will modify it.

>
>
>> This error source parsing and handling method
>> is similar with the SEA.
>
> There are problems with doing this:
>
> Oct. 18, 2017, 10:26 a.m. James Morse wrote:
> | How do SEA and SEI interact?
> |
> | As far as I can see they can both interrupt each other, which isn't something
> | the single in_nmi() path in APEI can handle. I thinks we should fix this
> | first.
>
> [..]
>
> | SEA gets away with a lot of things because its synchronous. SEI isn't. Xie
> | XiuQi pointed to the memory_failure_queue() code. We can use this directly
> | from SEA, but not SEI. (what happens if an SError arrives while we are
> | queueing memory_failure work from an IRQ).
> |
> | The one that scares me is the trace-point reporting stuff. What happens if an
> | SError arrives while we are enabling a trace point? (these are static-keys
> | right?)
> |
> | I don't think we can just plumb SEI in like this and be done with it.
> | (I'm looking at teasing out the estatus cache code from being x86:NMI only.
> | This way we solve the same 'cant do this from NMI context' with the same
> | code'.)
>
>
> I will post what I've got for this estatus-cache thing as an RFC, its not ready
> to be considered yet.Yes, I know you are dong that. Your serial's patch will consider all above things, right?
If your patch can be consider that, this patch can based on your patchset. thanks.

>
>
>> Expose API ghes_notify_sei() to external users. External
>> modules can call this exposed API to parse APEI table and
>> handle the SEI notification.
>
> external modules? You mean called by the arch code when it gets this NOTIFY_SEI?
yes, called by kernel ARCH code, such as below, I remember I have discussed with you.

asmlinkage void do_serror(struct pt_regs *regs, unsigned int esr)
{
nmi_enter();


if (!ghes_notify_sei())
return;



/* non-RAS errors are not containable */
if (!arm64_is_ras_serror(esr) || arm64_is_fatal_ras_serror(regs, esr))
arm64_serror_panic(regs, esr);

nmi_exit();
}

>
>
> Thanks,
>
> James
>
> .
>