Re: [PATCH v4 03/32] KVM: SVM: Flush the "current" TLB when activating AVIC
From: Maxim Levitsky
Date: Thu Dec 08 2022 - 16:54:05 EST
On Wed, 2022-12-07 at 18:02 +0200, Maxim Levitsky wrote:
On Sat, 2022-10-01 at 00:58 +0000, Sean Christopherson wrote:
> Flush the TLB when activating AVIC as the CPU can insert into the TLB
> while AVIC is "locally" disabled. KVM doesn't treat "APIC hardware
> disabled" as VM-wide AVIC inhibition, and so when a vCPU has its APIC
> hardware disabled, AVIC is not guaranteed to be inhibited. As a result,
> KVM may create a valid NPT mapping for the APIC base, which the CPU can
> cache as a non-AVIC translation.
>
> Note, Intel handles this in vmx_set_virtual_apic_mode().
>
> Reviewed-by: Paolo Bonzini <pbonzini@xxxxxxxxxx>
> Cc: stable@xxxxxxxxxxxxxxx
> Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx>
> ---
> arch/x86/kvm/svm/avic.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
> index 6919dee69f18..712330b80891 100644
> --- a/arch/x86/kvm/svm/avic.c
> +++ b/arch/x86/kvm/svm/avic.c
> @@ -86,6 +86,12 @@ static void avic_activate_vmcb(struct vcpu_svm *svm)
> /* Disabling MSR intercept for x2APIC registers */
> svm_set_x2apic_msr_interception(svm, false);
> } else {
> + /*
> + * Flush the TLB, the guest may have inserted a non-APIC
> + * mapping into the TLB while AVIC was disabled.
> + */
> + kvm_make_request(KVM_REQ_TLB_FLUSH_CURRENT, &svm->vcpu);
> +
> /* For xAVIC and hybrid-xAVIC modes */
> vmcb->control.avic_physical_id |= AVIC_MAX_PHYSICAL_ID;
> /* Enabling MSR intercept for x2APIC registers */
I agree that if the guest disables the APIC on a vCPU, this will lead to a call to kvm_apic_update_apicv(), which will
disable AVIC on that vCPU. But if the other vCPUs don't disable theirs, AVIC's private memslot will still be mapped and
the guest can read/write it from this vCPU, so its TLB mapping needs to be invalidated if/when the APIC is re-enabled.
However, I think that this adds an unnecessary (at least in the future) performance penalty to AVIC/nesting coexistence:
L1's AVIC is inhibited on each nested VM entry and uninhibited on each nested VM exit, but while nested, the guest
can't really access it, as the nested guest has its own NPT.
With this patch, KVM will invalidate L1's TLB on each nested VM exit. Sadly, KVM already does this, but that can be fixed
(it's another thing on my TODO list).
Note that APICv doesn't have this issue: it is not inhibited on nested VM entry/exit, so this code is not performance
sensitive for APICv.
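To make the cost concrete, here is a rough toy model (not KVM code) of the flush count with this patch applied. All names (toy_vcpu, toy_avic_activate, toy_nested_round_trip, toy_flushes_after) are made up for illustration, and the model ignores the x2AVIC branch, which the patch does not touch. It only assumes what is described above: each nested VM entry deactivates AVIC, each nested VM exit reactivates it, and every activation requests a flush.

```c
#include <stdbool.h>

/* Hypothetical toy model of a vCPU; these are not the real KVM structures. */
struct toy_vcpu {
	bool avic_active;
	unsigned long tlb_flush_requests;
};

/* Models the patched xAVIC activation path: every activation requests a flush. */
static void toy_avic_activate(struct toy_vcpu *v)
{
	v->avic_active = true;
	v->tlb_flush_requests++;
}

static void toy_avic_deactivate(struct toy_vcpu *v)
{
	v->avic_active = false;
}

/* One nested round trip: inhibit on nested VM entry, uninhibit on nested VM exit. */
static void toy_nested_round_trip(struct toy_vcpu *v)
{
	toy_avic_deactivate(v);
	toy_avic_activate(v);
}

/* Returns how many flushes were requested after the given number of round trips. */
static unsigned long toy_flushes_after(unsigned long round_trips)
{
	struct toy_vcpu v = { .avic_active = true, .tlb_flush_requests = 0 };

	for (unsigned long i = 0; i < round_trips; i++)
		toy_nested_round_trip(&v);

	return v.tlb_flush_requests;
}
```

In this model the flush count grows linearly with the number of nested VM exits, even though the APIC was never hardware disabled in between.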
As I said before, I would still vote for disabling the APICv/AVIC memslot if any vCPU has its APIC hardware disabled,
because that is also more correct from an x86 perspective. I do wonder how common it is to have "extra" CPUs that are not used and whose APICs are therefore left in the disabled state.
KVM does support adding new vCPUs on the fly, so this shouldn't be needed, and the APICv inhibit in this case is just a perf regression.
Or at least do this flush only when the APIC goes back from the hardware-disabled state to enabled.
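Something like the following toy sketch is what I have in mind for the last option. It is not a patch against KVM: toy_apic_vcpu, toy_set_apic_hw_enabled and toy_nested_exit_reactivate are hypothetical names, and the real code would have to hook into wherever the APIC base MSR enable bit is handled. The only point is that the flush is tied to the disabled -> enabled transition instead of to every AVIC activation.

```c
#include <stdbool.h>

/* Hypothetical per-vCPU state; these are not the real KVM structures. */
struct toy_apic_vcpu {
	bool apic_hw_enabled;
	bool flush_pending;
};

/*
 * Models a change of the APIC hardware-enable state.  A flush is requested
 * only on the disabled -> enabled transition, since that is the only window
 * in which a non-AVIC translation for the APIC page could have been cached.
 */
static void toy_set_apic_hw_enabled(struct toy_apic_vcpu *v, bool enable)
{
	if (enable && !v->apic_hw_enabled)
		v->flush_pending = true;

	v->apic_hw_enabled = enable;
}

/*
 * Models AVIC reactivation on nested VM exit.  The APIC was never hardware
 * disabled here, so no flush is requested.
 */
static void toy_nested_exit_reactivate(struct toy_apic_vcpu *v)
{
	(void)v;
}
```

With this shape, nested VM exits stay flush-free in the model, and only a vCPU that actually toggled its APIC pays for the invalidation.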
Best regards,
Maxim Levitsky