Re: [PATCH v2] KVM: VMX: Enable Notify VM exit

From: Sean Christopherson
Date: Thu Sep 09 2021 - 14:59:14 EST


On Tue, Sep 07, 2021, Xiaoyao Li wrote:
> On 9/3/2021 12:36 AM, Sean Christopherson wrote:
> > On Thu, Sep 02, 2021, Sean Christopherson wrote:
> > > On Tue, Aug 03, 2021, Xiaoyao Li wrote:
> > > > On 8/2/2021 11:46 PM, Sean Christopherson wrote:
> > > > > > > > @@ -5642,6 +5653,31 @@ static int handle_bus_lock_vmexit(struct kvm_vcpu *vcpu)
> > > > > > > > return 0;
> > > > > > > > }
> > > > > > > > +static int handle_notify(struct kvm_vcpu *vcpu)
> > > > > > > > +{
> > > > > > > > + unsigned long exit_qual = vmx_get_exit_qual(vcpu);
> > > > > > > > +
> > > > > > > > + if (!(exit_qual & NOTIFY_VM_CONTEXT_INVALID)) {
> > > > > > >
> > > > > > > What does CONTEXT_INVALID mean? The ISE doesn't provide any information whatsoever.
> > > > > >
> > > > > > It means whether the VM context is corrupted and not valid in the VMCS.
> > > > >
> > > > > Well that's a bit terrifying. Under what conditions can the VM context become
> > > > > corrupted? E.g. if the context can be corrupted by an inopportune NOTIFY exit,
> > > > > then KVM needs to be ultra conservative as a false positive could be fatal to a
> > > > > guest.
> > > > >
> > > >
> > > > Short answer is no case will set the VM_CONTEXT_INVALID bit.
> > >
> > > But something must set it, otherwise it wouldn't exist.
>
> For existing Intel silicon, no case will set it. Maybe in the future new
> case will set it.
>
> > The condition(s) under
> > > which it can be set matters because it affects how KVM should respond. E.g. if
> > > the guest can trigger VM_CONTEXT_INVALID at will, then we should probably treat
> > > it as a shutdown and reset the VMCS.
> >
> > Oh, and "shutdown" would be relative to the VMCS, i.e. if L2 triggers a NOTIFY
> > exit with VM_CONTEXT_INVALID then KVM shouldn't kill the entire VM. The least
> > awful option would probably be to synthesize a shutdown VM-Exit to L1. That
> > won't communicate to L1 that vmcs12 state is stale/bogus, but I don't see any way
> > to handle that via an existing VM-Exit reason :-/
> >
> > > But if VM_CONTEXT_INVALID can occur if and only if there's a hardware/ucode
> > > issue, then we can do:
> > >
> > > if (KVM_BUG_ON(exit_qual & NOTIFY_VM_CONTEXT_INVALID, vcpu->kvm))
> > > return -EIO;
> > >
> > > Either way, to enable this by default we need some form of documentation that
> > > describes what conditions lead to VM_CONTEXT_INVALID.
>
> I still don't know why the conditions lead to it matters. I think the
> consensus is that once VM_CONTEXT_INVALID happens, the vcpu can no longer
> run.

Yes, and no longer being able to run the vCPU is precisely the problem. The
condition(s) matters because if there's a possibility, however small, that enabling
NOTIFY_WINDOW can kill a well-behaved guest then it absolutely cannot be enabled by
default.

> Either KVM_BUG_ON() or a specific EXIT to userspace should be OK?

Not if the VM_CONTEXT_INVALID happens while L2 is running. If software can trigger
VM_CONTEXT_INVALID at will, then killing the VM would open up the door to a
malicious L2 killing L1 (which would be rather ironic since this is an anti-DoS
feature). IIUC, VM_CONTEXT_INVALID only means the current VMCS is garbage, thus
an occurence while L2 is active means that vmcs02 is junk, but L1's state in vmcs01,
vmcs12, etc... is still valid.