Re: [PATCH v2 1/2] kvm/vmx: avoid CPUID faulting leaking to guest

From: Sean Christopherson
Date: Mon Mar 18 2019 - 12:23:40 EST


On Mon, Mar 18, 2019 at 07:43:23PM +0800, Xiaoyao Li wrote:
> cpuid faulting is a feature about CPUID instruction. When cpuif faulting
^^^^^
cpuid
> is enabled, all execution of the CPUID instruction outside system-management
> mode (SMM) cause a general-protection (#GP) if the CPL > 0.
>
> About this feature, detailed information can be found at
> https://www.intel.com/content/dam/www/public/us/en/documents/application-notes/virtualization-technology-flexmigration-application-note.pdf
>
> Current KVM provides software emulation of this feature for guest.
> However, because cpuid faulting takes higher priority over CPUID vm exit (Intel
> SDM vol3.25.1.1), there is a risk of leaking cpuid faulting to guest when host
> enables it. If host enables cpuid faulting by setting the bit 0 of
> MSR_MISC_FEATURES_ENABLES, it will pass to guest since there is no handling of
> MSR_MISC_FEATURES_ENABLES yet. As a result, when guest calls CPUID instruction
> in CPL > 0, it will generate a #GP instead of CPUID vm eixt.
>
> This issue will cause guest boot failure when guest uses *modprobe*
> to load modules. *modprobe* calls CPUID instruction, thus causing #GP in
> guest. Since there is no handling of cpuid faulting in #GP handler, guest
> fails boot.
>
> To fix this issue, we should switch cpuid faulting bit between host and guest.
> Since MSR_MISC_FEATURES_ENABLES is intel-specific, this patch implement the
> switching only in vmx. It clears the cpuid faulting bit and save host's
> value before switching to guest, and restores the cpuid faulting settings of
> host before switching to host.
>
> Because kvm provides the software emulation of cpuid faulting, we can
> just clear the cpuid faulting bit in hardware MSR when switching to
> guest.
>
> Signed-off-by: Xiaoyao Li <xiaoyao.li@xxxxxxxxxxxxxxx>
> ---
> Changes in v2:
> - move the save/restore of cpuid faulting bit to
> vmx_prepare_swich_to_guest/vmx_prepare_swich_to_host to avoid every
> vmentry RDMSR, based on Paolo's comment.
>
> ---
> arch/x86/kvm/vmx/vmx.c | 34 ++++++++++++++++++++++++++++++++++
> arch/x86/kvm/vmx/vmx.h | 2 ++
> 2 files changed, 36 insertions(+)
>
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 541d442edd4e..2c59e0209e36 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -1035,6 +1035,23 @@ static void pt_guest_exit(struct vcpu_vmx *vmx)
> wrmsrl(MSR_IA32_RTIT_CTL, vmx->pt_desc.host.ctl);
> }
>
> +static void vmx_save_host_cpuid_fault(struct vcpu_vmx *vmx)

Maybe vmx_load_guest_misc_features_enables()? See below for reasoning.
Alternatively you can drop the helpers altogether.

> +{
> + u64 host_val;
> +
> + if (!boot_cpu_has(X86_FEATURE_CPUID_FAULT))
> + return;
> +
> + rdmsrl(MSR_MISC_FEATURES_ENABLES, host_val);
> + vmx->host_msr_misc_features_enables = host_val;

There's no need to cache the host value in struct vcpu_vmx, just use the
kernel's shadow value.

> +
> + /* clear cpuid fault bit to avoid it leak to guest */

Personally I find the comment unnecessary and somewhat misleading.

> + if (host_val & MSR_MISC_FEATURES_ENABLES_CPUID_FAULT) {
> + wrmsrl(MSR_MISC_FEATURES_ENABLES,
> + host_val & ~MSR_MISC_FEATURES_ENABLES_CPUID_FAULT);

I think we can also skip WRMSR if CPUID faulting is also enabled in the
guest. It probably doesn't make sense to install the guest's value if
CPUID faulting is enabled in the guest and not host since intercepting
CPUID likely provides better overall performance than switching the MSR
on entry and exit. And last but not least, since there are other bits
in the MSR that we don't want to expose to the guest, e.g. RING3MWAIT,
checking for a non-zero host value is probably better than checking for
individual feature bits. Same reasoning for writing the guest's value
instead of clearing just the CPUID faulting bit.

So something like:

u64 msrval = this_cpu_read(msr_misc_features_shadow);

if (!msrval || msrval == vcpu->arch.msr_misc_features_enables)
return;

wrmsrl(MSR_MISC_FEATURES_ENABLES, vcpu->arch.msr_misc_features_enables);


or if you drop the helpers:

msrval = this_cpu_read(msr_misc_features_shadow);
if (msrval && msrval != vcpu->arch.msr_misc_features_enables)
wrmsrl(MSR_MISC_FEATURES_ENABLES,
vcpu->arch.msr_misc_features_enables);

> + }
> +}
> +
> void vmx_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
> {
> struct vcpu_vmx *vmx = to_vmx(vcpu);
> @@ -1068,6 +1085,8 @@ void vmx_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
> vmx->loaded_cpu_state = vmx->loaded_vmcs;
> host_state = &vmx->loaded_cpu_state->host_state;
>
> + vmx_save_host_cpuid_fault(vmx);
> +
> /*
> * Set host fs and gs selectors. Unfortunately, 22.2.3 does not
> * allow segment selectors with cpl > 0 or ti == 1.
> @@ -1124,6 +1143,19 @@ void vmx_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
> }
> }
>
> +static void vmx_restore_host_cpuid_fault(struct vcpu_vmx *vmx)

If you keep the helpers, maybe vmx_load_host_misc_features_enables()
to pair with the new function name for loading guest state?

> +{
> + u64 msrval;
> +
> + if (!boot_cpu_has(X86_FEATURE_CPUID_FAULT))
> + return;
> +
> + rdmsrl(MSR_MISC_FEATURES_ENABLES, msrval);

> + msrval |= vmx->host_msr_misc_features_enables &
> + MSR_MISC_FEATURES_ENABLES_CPUID_FAULT;

Again, there's no need for RDMSR, the host's value can be pulled from
msr_misc_features_shadow, and the WRMSR can be skipped if the host and
guest have the same value, i.e. we didn't install the guest's value.

u64 msrval = this_cpu_read(msr_misc_features_shadow);

if (!msrval || msrval == vcpu->arch.msr_misc_features_enables)
return;

wrmsrl(MSR_MISC_FEATURES_ENABLES, msrval);

> + wrmsrl(MSR_MISC_FEATURES_ENABLES, msrval);
> +}
> +
> static void vmx_prepare_switch_to_host(struct vcpu_vmx *vmx)
> {
> struct vmcs_host_state *host_state;
> @@ -1137,6 +1169,8 @@ static void vmx_prepare_switch_to_host(struct vcpu_vmx *vmx)
> ++vmx->vcpu.stat.host_state_reload;
> vmx->loaded_cpu_state = NULL;
>
> + vmx_restore_host_cpuid_fault(vmx);
> +
> #ifdef CONFIG_X86_64
> rdmsrl(MSR_KERNEL_GS_BASE, vmx->msr_guest_kernel_gs_base);
> #endif
> diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
> index 5df73b36fa49..ba867bbc5676 100644
> --- a/arch/x86/kvm/vmx/vmx.h
> +++ b/arch/x86/kvm/vmx/vmx.h
> @@ -268,6 +268,8 @@ struct vcpu_vmx {
> u64 msr_ia32_feature_control_valid_bits;
> u64 ept_pointer;
>
> + u64 host_msr_misc_features_enables;

As mentioned above, this isn't needed.

> +
> struct pt_desc pt_desc;
> };
>
> --
> 2.19.1
>