Re: [PATCH v2 0/4] KVM: x86: Partially allow KVM_SET_CPUID{,2} after KVM_RUN for CPU hotplug

From: Sean Christopherson
Date: Tue Jan 18 2022 - 11:53:57 EST


On Tue, Jan 18, 2022, Paolo Bonzini wrote:
> On 1/18/22 15:35, Igor Mammedov wrote:
> > Can you check following scenario:
> > * on host that has IA32_TSX_CTRL and TSX enabled (RTM/HLE cpuid bits present)
> > * boot 2 vcpus VM with TSX enabled on VMM side but with tsx=off on kernel CLI
> >
> > that should cause kernel to set MSR_IA32_TSX_CTRL to 3H from initial 0H
> > and clear RTM+HLE bits in CPUID, check that RTM/HLE cpuid is cleared
> >
> > * hotunplug a VCPU and then replug it again
> > if IA32_TSX_CTRL is reset to initial state, that should re-enable
> > RTM/HLE cpuid bits and KVM_SET_CPUID2 might fail due to difference
> >
> > and as Sean pointed out there might be other non constant leafs,
> > where exact match check could leave userspace broken.
>
>
> MSR_IA32_TSX_CTRL is handled differently straight during the CPUID call:
>
> 	if (function == 7 && index == 0) {
> 		u64 data;
> 		if (!__kvm_get_msr(vcpu, MSR_IA32_TSX_CTRL, &data, true) &&
> 		    (data & TSX_CTRL_CPUID_CLEAR))
> 			*ebx &= ~(F(RTM) | F(HLE));
> 	}
>
>
> and I think we should redo all or most of kvm_update_cpuid_runtime
> the same way.

Please no.  xstate_required_size() requires multiple host CPUID calls, and glibc
does CPUID.0xD.0x0 and CPUID.0xD.0x1 as part of its initialization, i.e. launching
a new userspace process in the guest will incur additional performance overhead due
to KVM dynamically computing the XSAVE size instead of caching it based on vCPU
state.  Nested virtualization would be especially painful as every one of those
"host" CPUID invocations will trigger an exit from L1=>L0.
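
To make the cost difference concrete, here is a rough userspace sketch of the
two approaches.  It is not KVM code: fake_host_cpuid_0xd(), struct fake_vcpu and
the size/offset table are hypothetical stand-ins, and the table values are only
illustrative of a typical AVX-512 layout.  The counter stands in for "host"
CPUID executions, each of which would be an L1=>L0 exit in the nested case.

```c
#include <stdint.h>

/* Counts emulated "host" CPUID executions (an exit each when nested). */
static unsigned int host_cpuid_calls;

struct xsave_component {
	uint32_t size;
	uint32_t offset;
};

/* Hypothetical stand-in for CPUID.0xD.<index>; values are illustrative. */
static struct xsave_component fake_host_cpuid_0xd(unsigned int index)
{
	static const struct xsave_component table[8] = {
		[2] = { .size = 256,  .offset = 576 },	/* AVX */
		[5] = { .size = 64,   .offset = 1088 },	/* opmask */
		[6] = { .size = 512,  .offset = 1152 },	/* ZMM_Hi256 */
		[7] = { .size = 1024, .offset = 1664 },	/* Hi16_ZMM */
	};

	host_cpuid_calls++;
	return table[index];
}

/* One fake host CPUID query per enabled extended feature bit. */
static uint32_t compute_xsave_size(uint64_t xcr0)
{
	uint32_t size = 512 + 64;	/* legacy region + XSAVE header */
	unsigned int i;

	for (i = 2; i < 8; i++) {
		struct xsave_component c;

		if (!(xcr0 & (1ULL << i)))
			continue;

		c = fake_host_cpuid_0xd(i);
		if (c.offset + c.size > size)
			size = c.offset + c.size;
	}
	return size;
}

struct fake_vcpu {
	uint64_t xcr0;
	uint32_t cached_xsave_size;
};

/* Cached approach: recompute only when the relevant vCPU state changes. */
static void fake_vcpu_set_xcr0(struct fake_vcpu *vcpu, uint64_t xcr0)
{
	vcpu->xcr0 = xcr0;
	vcpu->cached_xsave_size = compute_xsave_size(xcr0);
}

/* Guest CPUID.0xD.0x0 served from the cache: no host CPUID at all. */
static uint32_t guest_cpuid_0xd_cached(const struct fake_vcpu *vcpu)
{
	return vcpu->cached_xsave_size;
}

/* Guest CPUID.0xD.0x0 computed on demand: pays the full cost every time. */
static uint32_t guest_cpuid_0xd_dynamic(const struct fake_vcpu *vcpu)
{
	return compute_xsave_size(vcpu->xcr0);
}
```

With xcr0 = 0xe7 in this sketch, the cached variant pays for four fake host
queries once, when XCR0 is written, whereas the dynamic variant pays for four
more on every guest CPUID.0xD query.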