Re: [PATCH v5 09/26] KVM: VMX: nVMX: Support TSC scaling and PERF_GLOBAL_CTRL with enlightened VMCS

From: Sean Christopherson
Date: Fri Aug 19 2022 - 13:43:40 EST


On Fri, Aug 19, 2022, Vitaly Kuznetsov wrote:
> Sean Christopherson <seanjc@xxxxxxxxxx> writes:
>
> > On Tue, Aug 02, 2022, Vitaly Kuznetsov wrote:
> >> +static u32 evmcs_get_unsupported_ctls(struct kvm_vcpu *vcpu,
> >> + enum evmcs_unsupported_ctrl_type ctrl_type)
> >> +{
> >> + struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);
> >> + enum evmcs_revision evmcs_rev = EVMCSv1_2016;
> >> +
> >> + if (!hv_vcpu)
> >
> > This is a functiontal change, and I don't think it's correct either. Previously,
> > KVM would apply the EVMCSv1_2016 filter irrespective of whether or not
> > vcpu->arch.hyperv is non-NULL. nested_enable_evmcs() doesn't require a Hyper-V
> > vCPU, and AFAICT nothing requires a Hyper-V vCPU to use eVMCS.
>
> Indeed, this *is* correct after PATCH11 when we get rid of VMX feature
> MSR filtering for KVM-on-Hyper-V as the remaining use for
> evmcs_get_unsupported_ctls() is Hyper-V on KVM and hv_vcpu is not NULL
> there.

Hmm, nested_vmx_handle_enlightened_vmptrld() will fail without a Hyper-V vCPU, so
filtering eVMCS control iff there's a Hyper-V vCPU makes sense. But that's a guest
visible change and should be a separate patch.

But that also raises the question of whether or not KVM should honor hyperv_enabled
when filtering MSRs. Same question for nested VM-Enter. nested_enlightened_vmentry()
will "fail" without an assist page, and the guest can't set the assist page without
hyperv_enabled==true, but nothing prevents the host from stuffing the assist page.

And on a very related topic, the handling of kvm_hv_vcpu_init() in kvm_hv_set_cpuid()
is buggy. KVM will not report an error to userspace for KVM_SET_CPUID2 if allocation
fails. If a later operation successfully create a Hyper-V vCPU, KVM will chug along
with Hyper-V enabled but without having cached the relevant Hyper-V CPUID info.

Less important is that kvm_hv_set_cpuid() should also zero out the CPUID cache if
Hyper-V is disabled. I'm pretty sure sure all paths check hyperv_enabled before
consuming cpuid_cache, but it's unnecessarily risky.

If we fix the kvm_hv_set_cpuid() allocation failure, then we can also guarantee
that vcpu->arch.hyperv is non-NULL if vcpu->arch.hyperv_enabled==true. And then
we can add gate guest eVMCS flow on hyperv_enabled, and evmcs_get_unsupported_ctls()
can then WARN if hv_vcpu is NULL.

Assuming I'm not overlooking something, I'll fold in yet more patches.