Re: [RFC PATCH 0/3] KVM: Introduce "VM bugged" concept

From: Cornelia Huck
Date: Tue Sep 29 2020 - 05:28:06 EST


On Wed, 23 Sep 2020 15:45:27 -0700
Sean Christopherson <sean.j.christopherson@xxxxxxxxx> wrote:

> This series introduces a concept we've discussed a few times in x86 land.
> The crux of the problem is that x86 has a few cases where KVM could
> theoretically encounter a software or hardware bug deep in a call stack
> without any sane way to propagate the error out to userspace.
>
> Another use case would be for scenarios where letting the VM live will
> do more harm than good, e.g. we've been using KVM_BUG_ON for early TDX
> enabling as botching anything related to secure paging all but guarantees
> there will be a flood of WARNs and error messages because lower level PTE
> operations will fail if an upper level operation failed.
>
> The basic idea is to WARN_ONCE if a bug is encountered, kick all vCPUs out
> to userspace, and mark the VM as bugged so that no ioctls() can be issued
> on the VM or its devices/vCPUs.

I think this makes a lot of sense.
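
For reference, the flow described above (warn once, kick all vCPUs, mark
the VM, reject further ioctls) can be modelled in user space roughly as
follows. This is only a sketch under assumptions: the model_* names are
hypothetical and do not come from the series, the warning is emitted once
per VM rather than once per call site as WARN_ONCE would, and -EIO as the
error returned for a bugged VM is an assumption as well.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

#define MODEL_MAX_VCPUS 4

/* Hypothetical stand-in for a vCPU; only tracks a pending exit request. */
struct model_vcpu {
	bool exit_requested;
};

/* Hypothetical stand-in for a VM carrying the "bugged" state. */
struct model_vm {
	bool bugged;
	int warnings;
	int nr_vcpus;
	struct model_vcpu vcpus[MODEL_MAX_VCPUS];
};

/* Mark the VM as bugged, warn on the first report, kick every vCPU. */
static void model_vm_bug(struct model_vm *vm, const char *what)
{
	int i;

	if (!vm->bugged) {
		/* Simplification: once per VM, not once per call site. */
		fprintf(stderr, "WARN: VM bugged: %s\n", what);
		vm->warnings++;
	}
	vm->bugged = true;

	/* Ask every vCPU to leave the guest and return to userspace. */
	for (i = 0; i < vm->nr_vcpus; i++)
		vm->vcpus[i].exit_requested = true;
}

/* Common ioctl entry check: refuse everything once the VM is bugged. */
static int model_vm_ioctl(const struct model_vm *vm)
{
	if (vm->bugged)
		return -EIO;	/* assumed error code */
	return 0;
}
```

In this model, userspace only learns about the condition from the error
code of the next ioctl, which is what prompts the questions below.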

Are there other user space interactions where we want to generate an
error for a bugged VM, e.g. via eventfd?

And can we make the 'bugged' information available to user space in a
structured way?

>
> RFC as I've done nowhere near enough testing to verify that rejecting the
> ioctls(), evicting running vCPUs, etc... works as intended.
>
> Sean Christopherson (3):
>   KVM: Export kvm_make_all_cpus_request() for use in marking VMs as bugged
>   KVM: Add infrastructure and macro to mark VM as bugged
>   KVM: x86: Use KVM_BUG/KVM_BUG_ON to handle bugs that are fatal to the VM
>
>  arch/x86/kvm/svm/svm.c   |  2 +-
>  arch/x86/kvm/vmx/vmx.c   | 23 ++++++++++++--------
>  arch/x86/kvm/x86.c       |  4 ++++
>  include/linux/kvm_host.h | 45 ++++++++++++++++++++++++++++++++--------
>  virt/kvm/kvm_main.c      | 11 +++++-----
>  5 files changed, 61 insertions(+), 24 deletions(-)