Re: [PATCH v6 3/3] KVM: VMX: Enable Notify VM exit

From: Chenyi Qiang
Date: Thu May 19 2022 - 06:38:30 EST




On 5/19/2022 6:30 AM, Sean Christopherson wrote:
On Thu, Apr 21, 2022, Chenyi Qiang wrote:
@@ -1504,6 +1511,8 @@ struct kvm_x86_ops {
* Returns vCPU specific APICv inhibit reasons
*/
unsigned long (*vcpu_get_apicv_inhibit_reasons)(struct kvm_vcpu *vcpu);
+
+ bool has_notify_vmexit;

I'm pretty sure I suggested this, but seeing it in code, it kinda sorta makes things
worse if we don't first consolidate the existing flags. kvm_x86_ops works, but we'd
definitely be taking liberties with the "ops" part.

What about adding struct kvm_caps to collect these flags/settings that don't fit
into kvm_cpu_caps because they're not a CPUID feature flag? kvm_x86_ops has the
advantage of kinda being read-only after init since VMX modifies vmx_x86_ops,
but IMO that's not enough reason to shove this into kvm_x86_ops. And long term,
we might be able to find a way to mark kvm_caps as fully __ro_after_init.

If no one objects, the attached patch can slide in before this patch, then
has_notify_vmexit can land in kvm_caps.

struct kvm_caps {
	/* control of guest tsc rate supported? */
	bool has_tsc_control;
	/* maximum supported tsc_khz for guests */
	u32 max_guest_tsc_khz;
	/* number of bits of the fractional part of the TSC scaling ratio */
	u8 tsc_scaling_ratio_frac_bits;
	/* maximum allowed value of TSC scaling ratio */
	u64 max_tsc_scaling_ratio;
	/* 1ull << kvm_caps.tsc_scaling_ratio_frac_bits */
	u64 default_tsc_scaling_ratio;
	/* bus lock detection supported? */
	bool has_bus_lock_exit;

	u64 supported_mce_cap;
	u64 supported_xcr0;
	u64 supported_xss;
};
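
For illustration only, here is a minimal userspace sketch of how has_notify_vmexit might sit in such a struct next to has_bus_lock_exit, with a single writer at setup time and read-only use afterwards. The `_sketch` names and both helpers are hypothetical and are not taken from the attached patch; the real struct uses kernel types and has more fields.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-in for the proposed struct; only two flags are shown. */
struct kvm_caps_sketch {
	/* bus lock detection supported? */
	bool has_bus_lock_exit;
	/* notify VM exit supported? (assumed placement) */
	bool has_notify_vmexit;
};

static struct kvm_caps_sketch kvm_caps_sketch;

/*
 * Hypothetical helper: models vendor hardware setup writing the flag once.
 * After this point the struct is only read, which is what would make a
 * read-only-after-init annotation plausible in the long term.
 */
static void sketch_hardware_setup(bool cpu_has_notify_vmexit)
{
	kvm_caps_sketch.has_notify_vmexit = cpu_has_notify_vmexit;
}

/* Hypothetical helper: what a capability check would read. */
static bool sketch_notify_vmexit_supported(void)
{
	return kvm_caps_sketch.has_notify_vmexit;
}
```

With this layout the check in the enable_cap path would read the flag from the caps struct rather than from kvm_x86_ops, assuming the consolidation patch goes in first.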


Thanks for the patch, Sean. I think an unintentional change got mixed into it:

@@ -4739,7 +4725,8 @@ static int kvm_vcpu_ready_for_interrupt_injection(struct kvm_vcpu *vcpu)
return (kvm_arch_interrupt_allowed(vcpu) &&
kvm_cpu_accept_dm_intr(vcpu) &&
!kvm_event_needs_reinjection(vcpu) &&
- !vcpu->arch.exception.pending);
+ !vcpu->arch.exception.pending &&
+ !kvm_test_request(KVM_REQ_TRIPLE_FAULT, vcpu));
}

Maybe this belongs in patch 1?

@@ -6090,6 +6094,18 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
}
mutex_unlock(&kvm->lock);
break;
+ case KVM_CAP_X86_NOTIFY_VMEXIT:
+ r = -EINVAL;
+ if ((u32)cap->args[0] & ~KVM_X86_NOTIFY_VMEXIT_VALID_BITS)
+ break;
+ if (!kvm_x86_ops.has_notify_vmexit)
+ break;
+ if (!((u32)cap->args[0] & KVM_X86_NOTIFY_VMEXIT_ENABLED))
+ break;
+ kvm->arch.notify_window = cap->args[0] >> 32;

Setting notify_vmexit and notify_vmexit_flags needs to be done under kvm->lock,
and changing notify_window if kvm->created_vcpus > 0 needs to be disallowed, otherwise
init_vmcs() will use the wrong value.

notify_vmexit_flags could be changed on the fly, but I doubt that's worth
supporting as even the smallest amount of complexity will go unused.

So I think this?


Makes sense.

	case KVM_CAP_X86_NOTIFY_VMEXIT:
		r = -EINVAL;
		if ((u32)cap->args[0] & ~KVM_X86_NOTIFY_VMEXIT_VALID_BITS)
			break;
		if (!kvm_x86_ops.has_notify_vmexit)
			break;
		/* '!' binds tighter than '&', so the flag test needs its own parentheses */
		if (!((u32)cap->args[0] & KVM_X86_NOTIFY_VMEXIT_ENABLED))
			break;
		mutex_lock(&kvm->lock);
		if (!kvm->created_vcpus) {
			kvm->arch.notify_window = cap->args[0] >> 32;
			kvm->arch.notify_vmexit_flags = (u32)cap->args[0];
			r = 0;
		}
		mutex_unlock(&kvm->lock);
		break;
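
As a rough userspace model of the argument handling above (not kernel code): the low 32 bits of args[0] carry the flags, the high 32 bits carry the notify window, and the values are only accepted before any vCPU exists. The SKETCH_* bit values, struct sketch_vm, and all function names are assumptions made for this example; locking and the hardware-support check are omitted.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Assumed bit layout for this sketch only. */
#define SKETCH_NOTIFY_VMEXIT_ENABLED    (1u << 0)
#define SKETCH_NOTIFY_VMEXIT_USER       (1u << 1)
#define SKETCH_NOTIFY_VMEXIT_VALID_BITS \
	(SKETCH_NOTIFY_VMEXIT_ENABLED | SKETCH_NOTIFY_VMEXIT_USER)

/* Hypothetical stand-in for the per-VM state touched by the ioctl. */
struct sketch_vm {
	unsigned int created_vcpus;
	uint32_t notify_window;
	uint32_t notify_vmexit_flags;
};

/* Pack the window into the high 32 bits and the flags into the low 32. */
static uint64_t sketch_pack_arg(uint32_t window, uint32_t flags)
{
	return ((uint64_t)window << 32) | flags;
}

/* Models the validation order; the kernel would hold kvm->lock for the update. */
static int sketch_enable_notify_vmexit(struct sketch_vm *vm, uint64_t arg)
{
	uint32_t flags = (uint32_t)arg;

	if (flags & ~SKETCH_NOTIFY_VMEXIT_VALID_BITS)
		return -EINVAL;
	if (!(flags & SKETCH_NOTIFY_VMEXIT_ENABLED))
		return -EINVAL;
	/* Reject once vCPUs exist, since they were set up with the old window. */
	if (vm->created_vcpus)
		return -EINVAL;

	vm->notify_window = (uint32_t)(arg >> 32);
	vm->notify_vmexit_flags = flags;
	return 0;
}

/*
 * How "!flags & ENABLED" groups without the extra parentheses: the '!' is
 * applied first, so any nonzero flags value slips past the check.
 * Returns nonzero when the mis-grouped check would reject.
 */
static int sketch_misgrouped_check_rejects(uint32_t flags)
{
	return ((!flags) & SKETCH_NOTIFY_VMEXIT_ENABLED) != 0;
}
```

In this model, passing only the USER bit is rejected by sketch_enable_notify_vmexit(), while the mis-grouped form of the check would let it through.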