Re: [PATCH v6 3/3] KVM: VMX: Enable Notify VM exit

From: Sean Christopherson
Date: Wed May 18 2022 - 18:30:37 EST


On Thu, Apr 21, 2022, Chenyi Qiang wrote:
> @@ -1504,6 +1511,8 @@ struct kvm_x86_ops {
> * Returns vCPU specific APICv inhibit reasons
> */
> unsigned long (*vcpu_get_apicv_inhibit_reasons)(struct kvm_vcpu *vcpu);
> +
> + bool has_notify_vmexit;

I'm pretty sure I suggested this, but seeing it in code, it kinda sorta makes things
worse if we don't first consolidate the existing flags. kvm_x86_ops works, but we'd
definitely be taking liberties with the "ops" part.

What about adding struct kvm_caps to collect these flags/settings that don't fit
into kvm_cpu_caps because they're not a CPUID feature flag? kvm_x86_ops has the
advantage of kinda being read-only after init since VMX modifies vmx_x86_ops,
but IMO that's not enough reason to shove this into kvm_x86_ops. And long term,
we might be able to find a way to mark kvm_caps as fully __ro_after_init.

If no one objects, the attached patch can slide in before this patch, then
has_notify_vmexit can land in kvm_caps.

struct kvm_caps {
	/* control of guest tsc rate supported? */
	bool has_tsc_control;
	/* maximum supported tsc_khz for guests */
	u32  max_guest_tsc_khz;
	/* number of bits of the fractional part of the TSC scaling ratio */
	u8   tsc_scaling_ratio_frac_bits;
	/* maximum allowed value of TSC scaling ratio */
	u64  max_tsc_scaling_ratio;
	/* 1ull << kvm_caps.tsc_scaling_ratio_frac_bits */
	u64  default_tsc_scaling_ratio;
	/* bus lock detection supported? */
	bool has_bus_lock_exit;

	u64 supported_mce_cap;
	u64 supported_xcr0;
	u64 supported_xss;
};
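
To make the idea concrete, here is a rough user-space sketch of how such a struct
might be filled in once during setup and only read afterwards. It is not the
attached patch: struct demo_caps, the demo_caps instance, and demo_hardware_setup()
are hypothetical names, and only a few of the fields above are modeled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical miniature of the proposed struct; not the kernel definition. */
struct demo_caps {
	bool     has_tsc_control;
	uint8_t  tsc_scaling_ratio_frac_bits;
	uint64_t default_tsc_scaling_ratio;
	bool     has_bus_lock_exit;
};

/* One global instance, written only during setup and read everywhere else. */
static struct demo_caps demo_caps;

/* Hypothetical setup hook: vendor code reports what the hardware supports. */
static inline void demo_hardware_setup(bool tsc_scaling, uint8_t frac_bits,
				       bool bus_lock)
{
	demo_caps.has_tsc_control = tsc_scaling;
	demo_caps.tsc_scaling_ratio_frac_bits = frac_bits;
	/* Matches the comment in the struct: 1ull << frac_bits. */
	demo_caps.default_tsc_scaling_ratio = 1ull << frac_bits;
	demo_caps.has_bus_lock_exit = bus_lock;
}
```

The point is that these are plain data set once at init, which is why they sit
awkwardly in a struct of function pointers.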

> @@ -6090,6 +6094,18 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
> }
> mutex_unlock(&kvm->lock);
> break;
> + case KVM_CAP_X86_NOTIFY_VMEXIT:
> + r = -EINVAL;
> + if ((u32)cap->args[0] & ~KVM_X86_NOTIFY_VMEXIT_VALID_BITS)
> + break;
> + if (!kvm_x86_ops.has_notify_vmexit)
> + break;
> + if (!(u32)cap->args[0] & KVM_X86_NOTIFY_VMEXIT_ENABLED)
> + break;
> + kvm->arch.notify_window = cap->args[0] >> 32;

Setting notify_window and notify_vmexit_flags needs to be done under kvm->lock,
and changing notify_window if kvm->created_vcpus > 0 needs to be disallowed, otherwise
init_vmcs() will use the wrong value.

notify_vmexit_flags could be changed on the fly, but I doubt that's worth
supporting as even the smallest amount of complexity will go unused.

So I think this?

	case KVM_CAP_X86_NOTIFY_VMEXIT:
		r = -EINVAL;
		if ((u32)cap->args[0] & ~KVM_X86_NOTIFY_VMEXIT_VALID_BITS)
			break;
		if (!kvm_x86_ops.has_notify_vmexit)
			break;
		if (!((u32)cap->args[0] & KVM_X86_NOTIFY_VMEXIT_ENABLED))
			break;
		mutex_lock(&kvm->lock);
		if (!kvm->created_vcpus) {
			kvm->arch.notify_window = cap->args[0] >> 32;
			kvm->arch.notify_vmexit_flags = (u32)cap->args[0];
			r = 0;
		}
		mutex_unlock(&kvm->lock);
		break;
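
For anyone who wants to poke at the argument handling outside the kernel, here is
a stand-alone user-space sketch of the same checks. It is only an illustration:
the DEMO_* flag values, struct demo_vm, demo_pack_arg() and
demo_enable_notify_vmexit() are made up, and kvm->lock is left out because the
sketch is single-threaded. It assumes args[0] carries the flags in the low 32 bits
and the notify window in the high 32 bits. Note that '!' binds tighter than '&',
so the ENABLED test needs its own parentheses.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical flag values, for illustration only. */
#define DEMO_NOTIFY_VMEXIT_ENABLED	(1u << 0)
#define DEMO_NOTIFY_VMEXIT_USER		(1u << 1)
#define DEMO_NOTIFY_VMEXIT_VALID_BITS	(DEMO_NOTIFY_VMEXIT_ENABLED | \
					 DEMO_NOTIFY_VMEXIT_USER)

/* Hypothetical stand-in for the relevant bits of struct kvm. */
struct demo_vm {
	int      created_vcpus;
	uint32_t notify_window;
	uint32_t notify_vmexit_flags;
};

/* Pack window (high 32 bits) and flags (low 32 bits) as a caller would. */
static inline uint64_t demo_pack_arg(uint32_t window, uint32_t flags)
{
	return ((uint64_t)window << 32) | flags;
}

/* Mirrors the checks above; locking omitted in this single-threaded sketch. */
static inline int demo_enable_notify_vmexit(struct demo_vm *vm, uint64_t arg)
{
	uint32_t flags = (uint32_t)arg;

	if (flags & ~DEMO_NOTIFY_VMEXIT_VALID_BITS)
		return -EINVAL;
	/* '!' binds tighter than '&', so the parentheses are required. */
	if (!(flags & DEMO_NOTIFY_VMEXIT_ENABLED))
		return -EINVAL;
	/* The window is consumed at vCPU creation, so reject late changes. */
	if (vm->created_vcpus)
		return -EINVAL;

	vm->notify_window = (uint32_t)(arg >> 32);
	vm->notify_vmexit_flags = flags;
	return 0;
}
```

Passing only the USER flag is the interesting case: without the extra parentheses
the ENABLED check evaluates to 0 and the request would be accepted.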