Re: [PATCH] kvm: pass the virtual SEI syndrome to guest OS

From: gengdongjiu
Date: Wed Mar 29 2017 - 21:23:55 EST


Hi Christoffer/Laszlo,

On 2017/3/30 1:44, Christoffer Dall wrote:
> On Wed, Mar 29, 2017 at 05:37:49PM +0200, Laszlo Ersek wrote:
>> On 03/29/17 16:48, Christoffer Dall wrote:
>>> On Wed, Mar 29, 2017 at 10:36:51PM +0800, gengdongjiu wrote:
>>>> 2017-03-29 18:36 GMT+08:00, Achin Gupta <achin.gupta@xxxxxxx>:
>>
>>>>> Qemu is essentially fulfilling the role of secure firmware at the
>>>>> EL2/EL1 interface (as discussed with Christoffer below). So it
>>>>> should generate the CPER before injecting the error.
>>>>>
>>>>> This is corresponds to (1) above apart from notifying UEFI (I am
>>>>> assuming you mean guest UEFI). At this time, the guest OS already
>>>>> knows where to pick up the CPER from through the HEST. Qemu has
>>>>> to create the CPER and populate its address at the address
>>>>> exported in the HEST. Guest UEFI should not be involved in this
>>>>> flow. Its job was to create the HEST at boot and that has been
>>>>> done by this stage.
>>>>
>>>> Sorry, As I understand it, after Qemu generate the CPER table, it
>>>> should pass the CPER table to the guest UEFI, then Guest UEFI place
>>>> this CPER table to the guest OS memory. In this flow, the Guest UEFI
>>>> should be involved, else the Guest OS can not see the CPER table.
>>>>
>>>
>>> I think you need to explain the "pass the CPER table to the guest UEFI"
>>> concept in terms of what really happens, step by step, and when you say
>>> "then Guest UEFI place the CPER table to the guest OS memory", I'm
>>> curious who is running what code on the hardware when doing that.
>>
>> I strongly suggest to keep the guest firmware's runtime involvement to
>> zero. Two reasons:
>>
>> (1) As you explained above (... which I conveniently snipped), when you
>> inject an interrupt to the guest, the handler registered for that
>> interrupt will come from the guest kernel.
>>
>> The only exception to this is when the platform provides a type of
>> interrupt whose handler can be registered and then locked down by the
>> firmware. On x86, this is the SMI.
>>
>> In practice though,
>> - in OVMF (x86), we only do synchronous (software-initiated) SMIs (for
>> privileged UEFI varstore access),
>> - and in ArmVirtQemu (ARM / aarch64), none of the management mode stuff
>> exists at all.
>>
>> I understand that the Platform Init 1.5 (or 1.6?) spec abstracted away
>> the MM (management mode) protocols from Intel SMM, but at this point
>> there is zero code in ArmVirtQemu for that. (And I'm unsure how much of
>> any eligible underlying hw emulation exists in QEMU.)
>>
>> So you can't get the guest firmware to react to the injected interrupt
>> without the guest OS coming between first.
>>
>> (2) Achin's description matches really-really closely what is possible,
>> and what should be done with QEMU, ArmVirtQemu, and the guest kernel.
>>
>> In any solution for this feature, the firmware has to reserve some
>> memory from the OS at boot. The current facilities we have enable this.
>> As I described previously, the ACPI linker/loader actions can be mapped
>> more or less 1:1 to Achin's design. From a practical perspective, you
>> really want to keep the guest firmware as dumb as possible (meaning: as
>> generic as possible), and keep the ACPI specifics to the QEMU and the
>> guest kernel sides.
>>
>> The error serialization actions -- the co-operation between guest kernel
>> and QEMU on the special memory areas -- that were mentioned earlier by
>> Michael and Punit look like a complication. But, IMO, they don't differ
>> from any other device emulation -- DMA actions in particular -- that
>> QEMU already does. Device models are what QEMU *does*. Read the command
>> block that the guest driver placed in guest memory, parse it, sanity
>> check it, verify it, execute it, write back the status code, inject an
>> interrupt (and/or let any polling guest driver notice it "soon after" --
>> use barriers as necessary).
>>
>> Thus, I suggest to rely on the generic ACPI linker/loader interface
>> (between QEMU and guest firmware) *only* to make the firmware lay out
>> stuff (= reserve buffers, set up pointers, install QEMU's ACPI tables)
>> *at boot*. Then, at runtime, let the guest kernel and QEMU (the "device
>> model") talk to each other directly. Keep runtime firmware involvement
>> to zero.
>>
>> You *really* don't want to debug three components at runtime, when you
>> can solve the thing with two. (Two components whose build systems won't
>> drive you mad, I should add.)
>>
>> IMO, Achin's design nailed it. We can do that.
>>
> I completely agree.
>
> My questions were intended for gengdongjiu to clarify his/her
> and clear up any misunderstandings between what Achin suggested and what
> he/she wrote.

Achin and Laszlo's understanding are right. thanks for your suggestion.

>
> Thanks,
> -Christoffer
>
> .
>