Re: [PATCH 1/3] kvm: vmx: Add IA32_FLUSH_CMD guest support

From: Sean Christopherson
Date: Mon Mar 20 2023 - 10:56:33 EST


On Fri, Mar 17, 2023, Pawan Gupta wrote:
> On Fri, Mar 17, 2023 at 04:14:01PM -0700, Nathan Chancellor wrote:
> > On Fri, Mar 17, 2023 at 03:53:45PM -0700, Pawan Gupta wrote:
> > > On Fri, Mar 17, 2023 at 12:04:32PM -0700, Nathan Chancellor wrote:
> > > > > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> > > > > index c788aa382611..9a78ea96a6d7 100644
> > > > > --- a/arch/x86/kvm/vmx/vmx.c
> > > > > +++ b/arch/x86/kvm/vmx/vmx.c
> > > > > @@ -2133,6 +2133,39 @@ static u64 vmx_get_supported_debugctl(struct kvm_vcpu *vcpu, bool host_initiated
> > > > > return debugctl;
> > > > > }
> > > > >
> > > > > +static int vmx_set_msr_ia32_cmd(struct kvm_vcpu *vcpu,
> > > > > + struct msr_data *msr_info,
> > > > > + bool guest_has_feat, u64 cmd,
> > > > > + int x86_feature_bit)
> > > > > +{
> > > > > + if (!msr_info->host_initiated && !guest_has_feat)
> > > > > + return 1;
> > > > > +
> > > > > + if (!(msr_info->data & ~cmd))
> > >
> > > Looks like this is doing a reverse check. Shouldn't this be as below:
> >
> > That diff on top of next-20230317 appears to resolve the issue for me
> > and my L1 guest can spawn an L2 guest without any issues (which is the
> > extent of my KVM testing).
>
> Great!
>
> > Is this a problem for the SVM version? It has the same check it seems,
> > although I did not have any issues on my AMD test platform (but I guess
> > that means that the system has the support?).
>
> IIUC, SVM version also needs to be fixed.

Yeah, looks that way. If we do go this route, can you also rename "cmd" to something
like "allowed_mask"? It took me far too long to understand what "cmd" represents.

> > I assume this will just be squashed into the original change but if not:
>
> Thats what I think, and if its too late to be squashed I will send a
> formal patch. Maintainers?

Honestly, I'd rather revert the whole mess and try again. The patches obviously
weren't tested, and the entire approach (that was borrowed from the existing
MSR_IA32_PRED_CMD code) makes no sense.

The MSRs are write-only command registers, i.e. don't have a persistent value.
So unlike MSR_IA32_SPEC_CTRL (which I suspect was the source for the copy pasta),
where KVM needs to track the guest value, there are no downsides to disabling
interception of the MSRs.

Manually checking the value written by the guest or host userspace is similarly
ridiculous. The MSR is being exposed directly to the guest, i.e. after the first
access, the guest can throw any value at bare metal anyways, so just do wrmsrl_safe()
and call it good.

In other words, the common __kvm_set_msr() switch should have something like so,

case MSR_IA32_PRED_CMD:
if (!cpu_feature_enabled(X86_FEATURE_IBPB))
return 1;

if (!msr_info->host_initiated &&
!guest_cpuid_has(vcpu, guest_has_pred_cmd_msr(vcpu)))
return 1;

ret = !!wrmsrl_safe(MSR_IA32_PRED_CMD, msr_info->data);
break;
case MSR_IA32_FLUSH_CMD:
if (!cpu_feature_enabled(X86_FEATURE_FLUSH_L1D))
return 1;

if (!msr_info->host_initiated &&
!guest_cpuid_has(vcpu, X86_FEATURE_FLUSH_L1D))
return 1;

ret = !!wrmsrl_safe(MSR_IA32_FLUSH_CMD, msr_info->data);
break;

with the MSR interception handled in e.g. vmx_vcpu_after_set_cpuid().

Paolo?