Re: [PATCH v2 09/13] KVM: x86/mmu: Allow zap gfn range to operate under the mmu read lock

From: Ben Gardon
Date: Mon Apr 12 2021 - 14:22:14 EST


On Fri, Apr 2, 2021 at 12:53 AM Paolo Bonzini <pbonzini@xxxxxxxxxx> wrote:
>
> On 02/04/21 01:37, Ben Gardon wrote:
> > +void kvm_tdp_mmu_put_root(struct kvm *kvm, struct kvm_mmu_page *root,
> > +			  bool shared)
> >  {
> >  	gfn_t max_gfn = 1ULL << (shadow_phys_bits - PAGE_SHIFT);
> >
> > -	lockdep_assert_held_write(&kvm->mmu_lock);
> > +	kvm_lockdep_assert_mmu_lock_held(kvm, shared);
> >
> >  	if (!refcount_dec_and_test(&root->tdp_mmu_root_count))
> >  		return;
> > @@ -81,7 +92,7 @@ void kvm_tdp_mmu_put_root(struct kvm *kvm, struct kvm_mmu_page *root)
> >  	list_del_rcu(&root->link);
> >  	spin_unlock(&kvm->arch.tdp_mmu_pages_lock);
> >
> > -	zap_gfn_range(kvm, root, 0, max_gfn, false, false);
> > +	zap_gfn_range(kvm, root, 0, max_gfn, false, false, shared);
> >
> >  	call_rcu(&root->rcu_head, tdp_mmu_free_sp_rcu_callback);
>
> Instead of patch 13, would it make sense to delay the zap_gfn_range and
> call_rcu to a work item (either unconditionally, or only if
> shared==false)? Then the zap_gfn_range would be able to yield and take
> the mmu_lock for read, similar to kvm_tdp_mmu_zap_invalidated_roots.
>
> If done unconditionally, this would also allow removing the "shared"
> argument to kvm_tdp_mmu_put_root, tdp_mmu_next_root and
> for_each_tdp_mmu_root_yield_safe, so I would place that change before
> this patch.
>
> Paolo
>

I tried that and it created problems. I believe the issue was that on
VM teardown the memslots would be freed and their memory reallocated
before the root was torn down, resulting in a use-after-free from
mark_pfn_dirty. Perhaps this could be resolved by forcing memslot
changes to wait until the work item has been processed before
returning. I can look into it, but I suspect there will be a lot of
"gotchas" involved.
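
For illustration only, here is a minimal userspace sketch of the
ordering suggested above: a memslot is not freed until any deferred
root teardown that still references it has run. All names (toy_memslot,
toy_root_work, toy_free_memslot, ...) are hypothetical and are not KVM
code. The "work item" is modelled as a pending flag rather than a real
workqueue, so this only shows the ordering, not the locking. In the
kernel the wait would presumably be something along the lines of
flush_work(), but that is an assumption, not something tested here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a memslot: one dirty byte per page. */
struct toy_memslot {
	unsigned char *dirty;
	size_t npages;
};

/* Hypothetical stand-in for the deferred root teardown work item. */
struct toy_root_work {
	struct toy_memslot *slot;	/* memslot the deferred zap will touch */
	bool pending;			/* true until the work has run */
};

/*
 * Deferred teardown: zapping the root marks pages dirty, so it writes
 * into the memslot's dirty tracking. If the memslot were already
 * freed, this write would be the use-after-free.
 */
static void toy_zap_root_work(struct toy_root_work *w)
{
	for (size_t i = 0; i < w->slot->npages; i++)
		w->slot->dirty[i] = 1;
	w->pending = false;
}

/* Run the work item now if it has not run yet. */
static void toy_flush_root_work(struct toy_root_work *w)
{
	if (w->pending)
		toy_zap_root_work(w);
}

/*
 * Memslot change that waits for the work item before freeing.
 * Returns 1 if a pending work item had to be flushed first, else 0.
 */
static int toy_free_memslot(struct toy_memslot *slot, struct toy_root_work *w)
{
	int flushed = 0;

	if (w->pending && w->slot == slot) {
		toy_flush_root_work(w);
		flushed = 1;
	}

	free(slot->dirty);
	slot->dirty = NULL;
	return flushed;
}
```

Without the flush in toy_free_memslot(), a later toy_zap_root_work()
would write through a dangling pointer. The sketch says nothing about
the other "gotchas" a real implementation would likely run into, such
as lock ordering between the memslot update path and the worker.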