Re: [Patch] KVM: x86/mmu: Make optimized __handle_changed_spte() for clear dirty log

From: David Matlack
Date: Wed Jan 25 2023 - 17:00:53 EST


On Wed, Jan 25, 2023 at 01:38:57PM -0800, Vipin Sharma wrote:
> Use a tone down version of __handle_changed_spte() when clearing dirty
> log. Remove checks which will not be needed when dirty logs are cleared.
>
> This change shows ~13% improvement in clear dirty log calls in
> dirty_log_perf_test
>
> Before tone down version:
> Clear dirty log over 3 iterations took 10.006764203s. (Avg 3.335588067s/iteration)
>
> After tone down version:
> Clear dirty log over 3 iterations took 8.686433554s. (Avg 2.895477851s/iteration)
>
> Signed-off-by: Vipin Sharma <vipinsh@xxxxxxxxxx>
> ---
> arch/x86/kvm/mmu/tdp_mmu.c | 20 +++++++++++++++++++-
> 1 file changed, 19 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
> index bba33aea0fb0..ca21b33c4386 100644
> --- a/arch/x86/kvm/mmu/tdp_mmu.c
> +++ b/arch/x86/kvm/mmu/tdp_mmu.c
> @@ -504,6 +504,19 @@ static void handle_removed_pt(struct kvm *kvm, tdp_ptep_t pt, bool shared)
> call_rcu(&sp->rcu_head, tdp_mmu_free_sp_rcu_callback);
> }
>
> +static void handle_changed_spte_clear_dirty_log(int as_id, gfn_t gfn,
> + u64 old_spte, u64 new_spte,
> + int level)
> +{
> + if (old_spte == new_spte)
> + return;
> +
> + trace_kvm_tdp_mmu_spte_changed(as_id, gfn, level, old_spte, new_spte);
> +
> + if (is_dirty_spte(old_spte) && !is_dirty_spte(new_spte))
> + kvm_set_pfn_dirty(spte_to_pfn(old_spte));
> +}
> +
> /**
> * __handle_changed_spte - handle bookkeeping associated with an SPTE change
> * @kvm: kvm instance
> @@ -736,7 +749,12 @@ static u64 __tdp_mmu_set_spte(struct kvm *kvm, int as_id, tdp_ptep_t sptep,
>
> old_spte = kvm_tdp_mmu_write_spte(sptep, old_spte, new_spte, level);
>
> - __handle_changed_spte(kvm, as_id, gfn, old_spte, new_spte, level, false);
> + if (record_dirty_log)
> + __handle_changed_spte(kvm, as_id, gfn, old_spte, new_spte,
> + level, false);
> + else
> + handle_changed_spte_clear_dirty_log(as_id, gfn, old_spte,
> + new_spte, level);

I find it very non-intuitive to tie this behavior to !record_dirty_log,
even though it happens to work. It's also risky if any future calls are
added that pass record_dirty_log=false but change other bits in the
SPTE.

I wonder if we could get the same effect by optimizing
__handle_changed_spte() to check for a cleared D-bit *first* and if
that's the only diff between the old and new SPTE, bail immediately
rather than doing all the other checks.