Re: [PATCH v3 22/28] KVM: x86/mmu: Zap defunct roots via asynchronous worker

From: Paolo Bonzini
Date: Wed Mar 02 2022 - 16:22:22 EST


On 3/2/22 21:47, Sean Christopherson wrote:
On Wed, Mar 02, 2022, Paolo Bonzini wrote:
For now let's do it the simple but ugly way. Keeping
next_invalidated_root() does not make things worse than the status quo, and
further work will be easier to review if it's kept separate from this
already-complex work.

Oof, that's not gonna work. My approach here in v3 doesn't work either. I finally
remembered why I had the dedicated tdp_mmu_defunct_root flag and thus the smp_mb_*()
dance.

kvm_tdp_mmu_zap_invalidated_roots() assumes that it was gifted a reference to
_all_ invalid roots by kvm_tdp_mmu_invalidate_all_roots(). This works in the
current code base only because kvm->slots_lock is held for the entire duration,
i.e. roots can't become invalid between the end of kvm_tdp_mmu_invalidate_all_roots()
and the end of kvm_tdp_mmu_zap_invalidated_roots().

Yeah, of course that doesn't work if kvm_tdp_mmu_zap_invalidated_roots() calls kvm_tdp_mmu_put_root() and the worker also does the same kvm_tdp_mmu_put_root().

But, it seems so me that we were so close to something that works and is elegant with the worker idea. It does avoid the possibility of two "puts", because the work item is created on the valid->invalid transition. What do you think of having a separate workqueue for each struct kvm, so that kvm_tdp_mmu_zap_invalidated_roots() can be replaced with a flush? I can probably do it next Friday.

Paolo


Marking a root invalid in kvm_tdp_mmu_put_root() breaks that assumption, e.g. if a
new root is created and then dropped, it will be marked invalid but the "fast zap"
will not have a reference. The "defunct" flag prevents this scenario by allowing
the "fast zap" path to identify invalid roots for which it did not take a reference.
By virtue of holding a reference, "fast zap" also guarantees that the roots it needs
to invalidate and put can't become defunct.

My preference would be to either go back to a variant of v2, or to implement my
"second list" idea.

I also need to figure out why I didn't encounter errors in v3, because I distinctly
remember underflowing the refcount before adding the defunct flag...