Re: [PATCH v4.1] KVM, SEV: Add KVM_EXIT_SHUTDOWN metadata for SEV-ES

From: Marc Zyngier
Date: Mon Apr 11 2022 - 05:46:08 EST


On Fri, 08 Apr 2022 17:56:42 +0100,
Paolo Bonzini <pbonzini@xxxxxxxxxx> wrote:
>
> Queued, thanks. But documentation was missing:
>
> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> index e7a0dfdc0178..72183ae628f7 100644
> --- a/Documentation/virt/kvm/api.rst
> +++ b/Documentation/virt/kvm/api.rst
> @@ -6088,8 +6088,12 @@ should put the acknowledged interrupt vector into the 'epr' field.
> #define KVM_SYSTEM_EVENT_SHUTDOWN 1
> #define KVM_SYSTEM_EVENT_RESET 2
> #define KVM_SYSTEM_EVENT_CRASH 3
> + #define KVM_SYSTEM_EVENT_SEV_TERM 4
> + #define KVM_SYSTEM_EVENT_NDATA_VALID (1u << 31)
> __u32 type;
> + __u32 ndata;
> __u64 flags;
> + __u64 data[16];
> } system_event;
>
> If exit_reason is KVM_EXIT_SYSTEM_EVENT then the vcpu has triggered
> @@ -6099,7 +6103,7 @@ HVC instruction based PSCI call from the vcpu. The 'type' field describes
> the system-level event type. The 'flags' field describes architecture
> specific flags for the system-level event.
>
> -Valid values for 'type' are:
> +Valid values for bits 30:0 of 'type' are:
>
> - KVM_SYSTEM_EVENT_SHUTDOWN -- the guest has requested a shutdown of the
> VM. Userspace is not obliged to honour this, and if it does honour
> @@ -6112,12 +6116,18 @@ Valid values for 'type' are:
> has requested a crash condition maintenance. Userspace can choose
> to ignore the request, or to gather VM memory core dump and/or
> reset/shutdown of the VM.
> + - KVM_SYSTEM_EVENT_SEV_TERM -- an AMD SEV guest requested termination.
> + The guest physical address of the guest's GHCB is stored in `data[0]`.
>
> Valid flags are:
>
> - KVM_SYSTEM_EVENT_RESET_FLAG_PSCI_RESET2 (arm64 only) -- the guest issued
> a SYSTEM_RESET2 call according to v1.1 of the PSCI specification.
>
> +Extra data for this event is stored in the `data[]` array, up to index
> +`ndata-1` included, if bit 31 is set in `type`. The data depends on the
> +`type` field. There is no extra data if bit 31 is clear or `ndata` is zero.
> +

This has the potential to break userspace, which today expects a strict
match on the whole of 'type' and does not treat it as a bitfield.

Case in point, QEMU:

accel/kvm/kvm-all.c::kvm_cpu_exec()

        case KVM_EXIT_SYSTEM_EVENT:
            switch (run->system_event.type) {

CrosVM and kvmtool have similar constructs, and will break as soon as
KVM_SYSTEM_EVENT_NDATA_VALID is or'ed into 'type'.
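
To make the failure mode concrete, here is a minimal sketch of what such
a switch does once bit 31 is set. It is not taken from any of the VMMs
above: the enum and both helper names are hypothetical, and the constants
are copied from the quoted patch so that it builds without <linux/kvm.h>.

```c
#include <stdint.h>

/* Values copied from the quoted patch, defined locally for illustration. */
#define KVM_SYSTEM_EVENT_SHUTDOWN     1
#define KVM_SYSTEM_EVENT_RESET        2
#define KVM_SYSTEM_EVENT_CRASH        3
#define KVM_SYSTEM_EVENT_SEV_TERM     4
#define KVM_SYSTEM_EVENT_NDATA_VALID  (1u << 31)

/* Hypothetical outcome of handling the exit in a VMM. */
enum action { ACT_SHUTDOWN, ACT_RESET, ACT_CRASH, ACT_UNKNOWN };

/* Strict match on the whole of 'type', as existing userspace does. */
static enum action handle_strict(uint32_t type)
{
    switch (type) {
    case KVM_SYSTEM_EVENT_SHUTDOWN: return ACT_SHUTDOWN;
    case KVM_SYSTEM_EVENT_RESET:    return ACT_RESET;
    case KVM_SYSTEM_EVENT_CRASH:    return ACT_CRASH;
    default:                        return ACT_UNKNOWN;
    }
}

/* Same switch, but with bit 31 masked off first (bits 30:0 only). */
static enum action handle_masked(uint32_t type)
{
    return handle_strict(type & ~KVM_SYSTEM_EVENT_NDATA_VALID);
}
```

With this, handle_strict(KVM_SYSTEM_EVENT_RESET | KVM_SYSTEM_EVENT_NDATA_VALID)
falls through to the default case, whereas handle_masked() still
recognises the reset. Existing binaries only have the former.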

M.

--
Without deviation from the norm, progress is not possible.