Re: [PATCH] KVM: x86: work around QEMU issue with synthetic CPUID leaves

From: Maxim Levitsky
Date: Sun May 01 2022 - 07:16:55 EST


On Fri, 2022-04-29 at 15:25 -0400, Paolo Bonzini wrote:
> Synthesizing AMD leaves up to 0x80000021 caused problems with QEMU,
> which assumes the *host* CPUID[0x80000000].EAX is higher or equal
> to what KVM_GET_SUPPORTED_CPUID reports.
>
> This causes QEMU to issue bogus host CPUIDs when preparing the input
> to KVM_SET_CPUID2. It can even get into an infinite loop, which is
> only terminated by an abort():
>
> cpuid_data is full, no space for cpuid(eax:0x8000001d,ecx:0x3e)
>
> To work around this, only synthesize those leaves if 0x8000001d exists
> on the host. The synthetic 0x80000021 leaf is mostly useful on Zen2,
> which satisfies the condition.
>
> Fixes: f144c49e8c39 ("KVM: x86: synthesize CPUID leaf 0x80000021h if useful")
> Reported-by: Maxim Levitsky <mlevitsk@xxxxxxxxxx>
> Signed-off-by: Paolo Bonzini <pbonzini@xxxxxxxxxx>
> ---
> arch/x86/kvm/cpuid.c | 19 ++++++++++++++-----
> 1 file changed, 14 insertions(+), 5 deletions(-)
>
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index b24ca7f4ed7c..598334ed5fbc 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -1085,12 +1085,21 @@ static inline int __do_cpuid_func(struct kvm_cpuid_array *array, u32 function)
> case 0x80000000:
> entry->eax = min(entry->eax, 0x80000021);
> /*
> - * Serializing LFENCE is reported in a multitude of ways,
> - * and NullSegClearsBase is not reported in CPUID on Zen2;
> - * help userspace by providing the CPUID leaf ourselves.
> + * Serializing LFENCE is reported in a multitude of ways, and
> + * NullSegClearsBase is not reported in CPUID on Zen2; help
> + * userspace by providing the CPUID leaf ourselves.
> + *
> + * However, only do it if the host has CPUID leaf 0x8000001d.
> + * QEMU thinks that it can query the host blindly for that
> + * CPUID leaf if KVM reports that it supports 0x8000001d or
> + * above. The processor merrily returns values from the
> + * highest Intel leaf which QEMU tries to use as the guest's
> + * 0x8000001d. Even worse, this can result in an infinite
> + * loop if said highest leaf has no subleaves indexed by ECX.

Very small nitpick: it might be useful to note in the comment that QEMU
does this only for leaf 0x8000001d.

> */
> - if (static_cpu_has(X86_FEATURE_LFENCE_RDTSC)
> - || !static_cpu_has_bug(X86_BUG_NULL_SEG))
> + if (entry->eax >= 0x8000001d &&
> + (static_cpu_has(X86_FEATURE_LFENCE_RDTSC)
> + || !static_cpu_has_bug(X86_BUG_NULL_SEG)))
> entry->eax = max(entry->eax, 0x80000021);
> break;
> case 0x80000001:
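
For anyone following along, here is a minimal standalone sketch of how the
patched "case 0x80000000" logic behaves, as I read the hunk above. It is
only a model, not the kernel code: reported_max_ext_leaf(), min_u32() and
max_u32() are hypothetical names, and the two booleans stand in for
static_cpu_has(X86_FEATURE_LFENCE_RDTSC) and
static_cpu_has_bug(X86_BUG_NULL_SEG).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helpers standing in for the kernel's min()/max() macros. */
static uint32_t min_u32(uint32_t a, uint32_t b) { return a < b ? a : b; }
static uint32_t max_u32(uint32_t a, uint32_t b) { return a > b ? a : b; }

/*
 * Model of the patched logic (illustrative only).
 *
 * host_max     - host CPUID[0x80000000].EAX
 * lfence_rdtsc - stand-in for static_cpu_has(X86_FEATURE_LFENCE_RDTSC)
 * null_seg_bug - stand-in for static_cpu_has_bug(X86_BUG_NULL_SEG)
 *
 * Returns the maximum extended leaf that would be reported to userspace.
 */
static uint32_t reported_max_ext_leaf(uint32_t host_max, bool lfence_rdtsc,
                                      bool null_seg_bug)
{
    /* Never report more than the highest leaf KVM knows about. */
    uint32_t eax = min_u32(host_max, 0x80000021u);

    /*
     * Raise the limit to the synthetic leaf only if the host really has
     * 0x8000001d, so userspace that queries the host for that leaf gets
     * a real answer rather than out-of-range data.
     */
    if (eax >= 0x8000001du && (lfence_rdtsc || !null_seg_bug))
        eax = max_u32(eax, 0x80000021u);

    return eax;
}
```

With this model, a host whose maximum extended leaf is 0x80000008 keeps
reporting 0x80000008 regardless of the feature bits, while a host at
0x8000001e with serializing LFENCE gets bumped to 0x80000021.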

Reviewed-by: Maxim Levitsky <mlevitsk@xxxxxxxxxx>

Best regards,
Maxim Levitsky