Re: [PATCH v4 03/30] KVM: x86/mmu: Formalize TDP MMU's (unintended?) deferred TLB flush logic

From: Mingwei Zhang
Date: Thu Mar 03 2022 - 18:39:40 EST


On Thu, Mar 03, 2022, Paolo Bonzini wrote:
> From: Sean Christopherson <seanjc@xxxxxxxxxx>
>
> Explicitly ignore the result of zap_gfn_range() when putting the last
> reference to a TDP MMU root, and add a pile of comments to formalize the
> TDP MMU's behavior of deferring TLB flushes to alloc/reuse. Note, this
> only affects the !shared case, as zap_gfn_range() subtly never returns
> true for "flush" as the flush is handled by tdp_mmu_zap_spte_atomic().
>
> Putting the root without a flush is ok because even if there are stale
> references to the root in the TLB, they are unreachable because KVM will
> not run the guest with the same ASID without first flushing (where ASID
> in this context refers to both SVM's explicit ASID and Intel's implicit
> ASID that is constructed from VPID+PCID+EPT4A+etc...).
>
> Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx>
> Message-Id: <20220226001546.360188-5-seanjc@xxxxxxxxxx>
> Reviewed-by: Mingwei Zhang <mizhang@xxxxxxxxxx>
> Signed-off-by: Paolo Bonzini <pbonzini@xxxxxxxxxx>
> ---
> arch/x86/kvm/mmu/mmu.c | 8 ++++++++
> arch/x86/kvm/mmu/tdp_mmu.c | 10 +++++++++-
> 2 files changed, 17 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 32c041ed65cb..9a6df2d02777 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -5083,6 +5083,14 @@ int kvm_mmu_load(struct kvm_vcpu *vcpu)
> kvm_mmu_sync_roots(vcpu);
>
> kvm_mmu_load_pgd(vcpu);
> +
> + /*
> + * Flush any TLB entries for the new root, the provenance of the root
> + * is unknown. Even if KVM ensures there are no stale TLB entries
> + * for a freed root, in theory another hypervisor could have left
> + * stale entries. Flushing on alloc also allows KVM to skip the TLB
> + * flush when freeing a root (see kvm_tdp_mmu_put_root()).
> + */
> static_call(kvm_x86_flush_tlb_current)(vcpu);
> out:
> return r;
> diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
> index b97a4125feac..921fa386df99 100644
> --- a/arch/x86/kvm/mmu/tdp_mmu.c
> +++ b/arch/x86/kvm/mmu/tdp_mmu.c
> @@ -93,7 +93,15 @@ void kvm_tdp_mmu_put_root(struct kvm *kvm, struct kvm_mmu_page *root,
> list_del_rcu(&root->link);
> spin_unlock(&kvm->arch.tdp_mmu_pages_lock);
>
> - zap_gfn_range(kvm, root, 0, -1ull, false, false, shared);
> + /*
> + * A TLB flush is not necessary as KVM performs a local TLB flush when
> + * allocating a new root (see kvm_mmu_load()), and when migrating vCPU
> + * to a different pCPU. Note, the local TLB flush on reuse also
> + * invalidates any paging-structure-cache entries, i.e. TLB entries for
> + * intermediate paging structures, that may be zapped, as such entries
> + * are associated with the ASID on both VMX and SVM.
> + */
> + (void)zap_gfn_range(kvm, root, 0, -1ull, false, false, shared);

Discussed offline with Sean. Now I get myself comfortable with the style
of mmu with multiple 'roots' and leaving TLB unflushed for invalidated
roots.

I guess one minor improvement on the comment could be:

"A TLB flush is not necessary as each vCPU performs a local TLB flush
when allocating or assigning a new root (see kvm_mmu_load()), and when
migrating to a different pCPU."

The above could be better since "KVM performs a local TLB flush" makes
readers think why we miss the 'remote' TLB flushes?
>
> call_rcu(&root->rcu_head, tdp_mmu_free_sp_rcu_callback);
> }
> --
> 2.31.1
>
>