Re: [PATCH 06/15] KVM: SVM: Probe and load MSR_TSC_AUX regardless of RDTSCP support in host

From: Maxim Levitsky
Date: Mon May 10 2021 - 04:20:38 EST


On Tue, 2021-05-04 at 10:17 -0700, Sean Christopherson wrote:
> Probe MSR_TSC_AUX whether or not RDTSCP is supported in the host, and
> if probing succeeds, load the guest's MSR_TSC_AUX into hardware prior to
> VMRUN. Because SVM doesn't support interception of RDPID, RDPID cannot
> be disallowed in the guest (without resorting to binary translation).
> Leaving the host's MSR_TSC_AUX in hardware would leak the host's value to
> the guest if RDTSCP is not supported.
>
> Note, there is also a kernel bug that prevents leaking the host's value.
> The host kernel initializes MSR_TSC_AUX if and only if RDTSCP is
> supported, even though the vDSO usage consumes MSR_TSC_AUX via RDPID.
> I.e. if RDTSCP is not supported, there is no host value to leak. But,
> if/when the host kernel bug is fixed, KVM would start leaking MSR_TSC_AUX
> in the case where hardware supports RDPID but RDTSCP is unavailable for
> whatever reason.
>
> Probing MSR_TSC_AUX will also allow consolidating the probe and define
> logic in common x86, and will make it simpler to condition the existence
> of MSR_TSC_AUX (from the guest's perspective) on RDTSCP *or* RDPID.
>
> Fixes: AMD CPUs
> Cc: stable@xxxxxxxxxxxxxxx
> Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx>
> ---
> arch/x86/kvm/svm/svm.c | 18 ++++++++++--------
> 1 file changed, 10 insertions(+), 8 deletions(-)
>
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index 8f2b184270c0..b3153d40cc4d 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -212,7 +212,7 @@ DEFINE_PER_CPU(struct svm_cpu_data *, svm_data);
> * RDTSCP and RDPID are not used in the kernel, specifically to allow KVM to
> * defer the restoration of TSC_AUX until the CPU returns to userspace.
> */
> -#define TSC_AUX_URET_SLOT 0
> +static int tsc_aux_uret_slot __read_mostly = -1;
>
> static const u32 msrpm_ranges[] = {0, 0xc0000000, 0xc0010000};
>
> @@ -959,8 +959,10 @@ static __init int svm_hardware_setup(void)
> kvm_tsc_scaling_ratio_frac_bits = 32;
> }
>
> - if (boot_cpu_has(X86_FEATURE_RDTSCP))
> - kvm_define_user_return_msr(TSC_AUX_URET_SLOT, MSR_TSC_AUX);
> + if (!kvm_probe_user_return_msr(MSR_TSC_AUX)) {
> + tsc_aux_uret_slot = 0;
> + kvm_define_user_return_msr(tsc_aux_uret_slot, MSR_TSC_AUX);
> + }
>
> /* Check for pause filtering support */
> if (!boot_cpu_has(X86_FEATURE_PAUSEFILTER)) {
> @@ -1454,8 +1456,8 @@ static void svm_prepare_guest_switch(struct kvm_vcpu *vcpu)
> }
> }
>
> - if (static_cpu_has(X86_FEATURE_RDTSCP))
> - kvm_set_user_return_msr(TSC_AUX_URET_SLOT, svm->tsc_aux, -1ull);
> + if (likely(tsc_aux_uret_slot >= 0))
> + kvm_set_user_return_msr(tsc_aux_uret_slot, svm->tsc_aux, -1ull);
>
> svm->guest_state_loaded = true;
> }
> @@ -2664,7 +2666,7 @@ static int svm_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> msr_info->data |= (u64)svm->sysenter_esp_hi << 32;
> break;
> case MSR_TSC_AUX:
> - if (!boot_cpu_has(X86_FEATURE_RDTSCP))
> + if (tsc_aux_uret_slot < 0)
> return 1;
> if (!msr_info->host_initiated &&
> !guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP))
> @@ -2885,7 +2887,7 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)
> svm->sysenter_esp_hi = guest_cpuid_is_intel(vcpu) ? (data >> 32) : 0;
> break;
> case MSR_TSC_AUX:
> - if (!boot_cpu_has(X86_FEATURE_RDTSCP))
> + if (tsc_aux_uret_slot < 0)
> return 1;
>
> if (!msr->host_initiated &&
> @@ -2908,7 +2910,7 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)
> * guest via direct_access_msrs, and switch it via user return.
> */
> preempt_disable();
> - r = kvm_set_user_return_msr(TSC_AUX_URET_SLOT, data, -1ull);
> + r = kvm_set_user_return_msr(tsc_aux_uret_slot, data, -1ull);
> preempt_enable();
> if (r)
> return 1;
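
For anyone following along, the probe-then-sentinel pattern in the hunks above can be modeled outside the kernel roughly as below. This is only a user-space sketch under stated assumptions: every model_* name and the struct are hypothetical stand-ins, and none of it is the KVM user-return MSR API. It shows just the control flow: probe once at setup, record a slot (or leave -1), and have every later path check the slot instead of the RDTSCP CPUID bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for one CPU's view of MSR_TSC_AUX. */
struct model_cpu {
	bool     tsc_aux_accessible; /* would RDMSR/WRMSR succeed without faulting? */
	uint64_t tsc_aux_hw;         /* value currently loaded in "hardware" */
};

/* -1 means "MSR_TSC_AUX is not usable", mirroring the sentinel in the patch. */
static int model_tsc_aux_slot = -1;

/* Hypothetical probe: returns 0 on success, non-zero if the access would fault. */
static int model_probe_msr(const struct model_cpu *cpu)
{
	return cpu->tsc_aux_accessible ? 0 : 1;
}

/* Setup-time decision: a slot is assigned only when the probe succeeds. */
static void model_hardware_setup(const struct model_cpu *cpu)
{
	model_tsc_aux_slot = -1;
	if (!model_probe_msr(cpu))
		model_tsc_aux_slot = 0;
}

/* Before entering the guest: load the guest value only if a slot exists. */
static void model_prepare_guest_switch(struct model_cpu *cpu, uint64_t guest_tsc_aux)
{
	if (model_tsc_aux_slot >= 0) {
		assert(cpu->tsc_aux_accessible);
		cpu->tsc_aux_hw = guest_tsc_aux;
	}
}

/* MSR read path: returns 1 (reject) when the probe failed, 0 otherwise. */
static int model_get_msr(uint64_t guest_tsc_aux, uint64_t *out)
{
	if (model_tsc_aux_slot < 0)
		return 1;
	*out = guest_tsc_aux;
	return 0;
}
```

In this model, a CPU where the probe succeeds always gets the guest's value written before guest entry, whether or not RDTSCP is advertised, which is the property the changelog is after.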

If L1 has ignore_msrs=1, then we will end up writing IA32_TSC_AUX for nothing,
but this shouldn't be that big of a deal, so:

Reviewed-by: Maxim Levitsky <mlevitsk@xxxxxxxxxx>

Best regards,
Maxim Levitsky