Re: [PATCH 1/2] KVM: arm64: Split kvm_pgtable_stage2_destroy()

From: Oliver Upton
Date: Tue Jul 29 2025 - 11:57:58 EST


On Thu, Jul 24, 2025 at 11:51:43PM +0000, Raghavendra Rao Ananta wrote:
> Split kvm_pgtable_stage2_destroy() into two:
> - kvm_pgtable_stage2_destroy_range(), which performs the
> page-table walk and frees the entries over a range of addresses.
> - kvm_pgtable_stage2_destroy_pgd(), that frees the PGD.
>
> This refactoring enables subsequent patches to free large page-tables
> in chunks, calling cond_resched() between each chunk, to yield the CPU
> as necessary.
>
> Direct callers of kvm_pgtable_stage2_destroy() will continue to walk
> the entire range of the VM as before, ensuring no functional changes.
>
> Also, add equivalent pkvm_pgtable_stage2_*() stubs to maintain a 1:1
> mapping of the page-table functions.
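
(For context, I read the chunked teardown described above as eventually
looking something like the below. This is my own untested sketch; the
chunk size and the kvm_stage2_destroy_chunked() name are made up, not
taken from the series:

static void kvm_stage2_destroy_chunked(struct kvm_pgtable *pgt)
{
	u64 addr, chunk = SZ_1G, end = BIT(pgt->ia_bits);

	/* Tear the table down in chunks, yielding between each one. */
	for (addr = 0; addr < end; addr += chunk) {
		kvm_pgtable_stage2_destroy_range(pgt, addr,
						 min(chunk, end - addr));
		cond_resched();
	}

	kvm_pgtable_stage2_destroy_pgd(pgt);
}
)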

Uhh... We can't stub these functions out for protected mode; we already
have a load-bearing implementation of pkvm_pgtable_stage2_destroy().
Just reuse what's already there and provide a NOP for
pkvm_pgtable_stage2_destroy_pgd().
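
Something like this should do (untested sketch, assuming the existing
pkvm_pgtable_stage2_destroy() logic becomes the _range() helper):

void pkvm_pgtable_stage2_destroy_pgd(struct kvm_pgtable *pgt)
{
	/*
	 * Nothing to do here: teardown for protected VMs is entirely
	 * handled by the existing pkvm_pgtable_stage2_destroy() logic,
	 * so the PGD hook is a no-op.
	 */
}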

> +void kvm_pgtable_stage2_destroy_pgd(struct kvm_pgtable *pgt)
> +{
> +	/*
> +	 * We aren't doing a pgtable walk here, but the walker struct is needed
> +	 * for kvm_dereference_pteref(), which only looks at the ->flags.
> +	 */
> +	struct kvm_pgtable_walker walker = {0};

This feels subtle and prone to error. I'd rather we have something that
boils down to rcu_dereference_raw() (with the appropriate n/hVHE
awareness) and add a comment explaining why it is safe.
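
Roughly this shape, perhaps (untested; kvm_dereference_pteref_raw() is
a made-up name, and the split mirrors how kvm_dereference_pteref() is
already gated for the nVHE hyp build, where there is no RCU):

#ifdef __KVM_NVHE_HYPERVISOR__
static inline kvm_pte_t *kvm_dereference_pteref_raw(kvm_pteref_t pteref)
{
	/* pteref is a plain pointer at EL2, no RCU involved. */
	return pteref;
}
#else
static inline kvm_pte_t *kvm_dereference_pteref_raw(kvm_pteref_t pteref)
{
	/*
	 * Safe without an RCU read lock: the table is being torn down
	 * and no other walker can observe it concurrently.
	 */
	return rcu_dereference_raw(pteref);
}
#endif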

> +void kvm_pgtable_stage2_destroy(struct kvm_pgtable *pgt)
> +{
> +	kvm_pgtable_stage2_destroy_range(pgt, 0, BIT(pgt->ia_bits));
> +	kvm_pgtable_stage2_destroy_pgd(pgt);
> +}
> +

Move this to mmu.c as a static function and use KVM_PGT_FN()
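
i.e. something along these lines (untested; kvm_stage2_destroy() is a
name I'm assuming):

static void kvm_stage2_destroy(struct kvm_pgtable *pgt)
{
	unsigned long ia_bits = pgt->ia_bits;

	/* KVM_PGT_FN() picks the kvm_ or pkvm_ variant as appropriate. */
	KVM_PGT_FN(kvm_pgtable_stage2_destroy_range)(pgt, 0, BIT(ia_bits));
	KVM_PGT_FN(kvm_pgtable_stage2_destroy_pgd)(pgt);
}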

Thanks,
Oliver