Re: [PATCH v4 3/8] KVM: arm64: Add guard pages for KVM nVHE hypervisor stack

From: Marc Zyngier
Date: Wed Mar 02 2022 - 02:53:58 EST


On Fri, 25 Feb 2022 03:34:48 +0000,
Kalesh Singh <kaleshsingh@xxxxxxxxxx> wrote:
>
> Maps the stack pages in the flexible private VA range and allocates
> guard pages below the stack as unbacked VA space. The stack is aligned
> to twice its size to aid overflow detection (implemented in a subsequent
> patch in the series).
>
> Signed-off-by: Kalesh Singh <kaleshsingh@xxxxxxxxxx>
> ---
>
> Changes in v4:
>   - Replace IS_ERR_OR_NULL check with IS_ERR check now that
>     hyp_alloc_private_va_range() returns an error for null
>     pointer, per Fuad
>   - Format comments to < 80 cols, per Fuad
>
> Changes in v3:
>   - Handle null ptr in IS_ERR_OR_NULL checks, per Mark
>
> arch/arm64/include/asm/kvm_asm.h | 1 +
> arch/arm64/kvm/arm.c | 32 +++++++++++++++++++++++++++++---
> 2 files changed, 30 insertions(+), 3 deletions(-)
>
> diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
> index d5b0386ef765..2e277f2ed671 100644
> --- a/arch/arm64/include/asm/kvm_asm.h
> +++ b/arch/arm64/include/asm/kvm_asm.h
> @@ -169,6 +169,7 @@ struct kvm_nvhe_init_params {
>  	unsigned long tcr_el2;
>  	unsigned long tpidr_el2;
>  	unsigned long stack_hyp_va;
> +	unsigned long stack_pa;
>  	phys_addr_t pgd_pa;
>  	unsigned long hcr_el2;
>  	unsigned long vttbr;
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index ecc5958e27fe..0a83c0e7f838 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -1541,7 +1541,6 @@ static void cpu_prepare_hyp_mode(int cpu)
>  	tcr |= (idmap_t0sz & GENMASK(TCR_TxSZ_WIDTH - 1, 0)) << TCR_T0SZ_OFFSET;
>  	params->tcr_el2 = tcr;
>
> -	params->stack_hyp_va = kern_hyp_va(per_cpu(kvm_arm_hyp_stack_page, cpu) + PAGE_SIZE);
>  	params->pgd_pa = kvm_mmu_get_httbr();
>  	if (is_protected_kvm_enabled())
>  		params->hcr_el2 = HCR_HOST_NVHE_PROTECTED_FLAGS;
> @@ -1990,14 +1989,41 @@ static int init_hyp_mode(void)
>  	 * Map the Hyp stack pages
>  	 */
>  	for_each_possible_cpu(cpu) {
> +		struct kvm_nvhe_init_params *params = per_cpu_ptr_nvhe_sym(kvm_init_params, cpu);
>  		char *stack_page = (char *)per_cpu(kvm_arm_hyp_stack_page, cpu);
> -		err = create_hyp_mappings(stack_page, stack_page + PAGE_SIZE,
> -					  PAGE_HYP);
> +		unsigned long stack_hyp_va, guard_hyp_va;
>
> +		/*
> +		 * Private mappings are allocated downwards from io_map_base
> +		 * so allocate the stack first then the guard page.
> +		 *
> +		 * The stack is aligned to twice its size to facilitate overflow
> +		 * detection.
> +		 */
> +		err = __create_hyp_private_mapping(__pa(stack_page), PAGE_SIZE,
> +				PAGE_SIZE * 2, &stack_hyp_va, PAGE_HYP);

Right, I guess that's where my earlier ask breaks, as you want an
alignment that is *larger* than the allocation.
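
To spell out the arithmetic with a toy downward allocator (a sketch
only; demo_alloc_down() and DEMO_PAGE_SIZE are made-up names, and this
is not the hyp allocator):

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_PAGE_SIZE	4096ULL

/*
 * Hypothetical downward bump allocator, for illustration only.
 * *top is the lowest address handed out so far.
 */
static uint64_t demo_alloc_down(uint64_t *top, uint64_t size, uint64_t align)
{
	uint64_t base;

	/* The mask trick below only works for power-of-two alignments. */
	assert(align && (align & (align - 1)) == 0);

	/* Step down by the size, then round down to the alignment. */
	base = (*top - size) & ~(align - 1);
	*top = base;

	return base;
}
```

With the top at 0x8000, a DEMO_PAGE_SIZE request aligned to
2 * DEMO_PAGE_SIZE lands at 0x6000, and the page at 0x7000 is simply
skipped.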

>  		if (err) {
>  			kvm_err("Cannot map hyp stack\n");
>  			goto out_err;
>  		}
> +
> +		/* Allocate unbacked private VA range for stack guard page */
> +		guard_hyp_va = hyp_alloc_private_va_range(PAGE_SIZE, PAGE_SIZE);

Huh. You are implicitly relying on the VA allocator handing you an
address contiguous with the previous mapping. That's... brave. I'd
rather you allocate the VA space upfront with the correct alignment
and then map the single page where it should be in the VA region.

That'd be a lot less fragile.
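
Something along these lines, as a standalone toy model rather than real
hyp code (toy_alloc_down(), toy_reserve_stack() and struct
toy_stack_layout are names invented for this sketch, and the actual
mapping of the stack page is left out):

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT	12
#define TOY_PAGE_SIZE	(1ULL << TOY_PAGE_SHIFT)

struct toy_stack_layout {
	uint64_t guard_va;	/* lower page, left unbacked */
	uint64_t stack_va;	/* upper page, where the stack page is mapped */
	uint64_t stack_top;	/* initial SP, one past the stack page */
};

/* Hypothetical downward bump allocator standing in for the real one. */
static uint64_t toy_alloc_down(uint64_t *top, uint64_t size, uint64_t align)
{
	uint64_t base;

	assert(align && (align & (align - 1)) == 0);

	base = (*top - size) & ~(align - 1);
	*top = base;

	return base;
}

/*
 * Reserve guard + stack as a single range, aligned to its own size,
 * and carve the two pages out of it. The layout no longer depends on
 * what the allocator hands out for the next request.
 */
static struct toy_stack_layout toy_reserve_stack(uint64_t *top)
{
	struct toy_stack_layout l;
	uint64_t base;

	base = toy_alloc_down(top, 2 * TOY_PAGE_SIZE, 2 * TOY_PAGE_SIZE);

	l.guard_va = base;
	l.stack_va = base + TOY_PAGE_SIZE;
	l.stack_top = l.stack_va + TOY_PAGE_SIZE;

	return l;
}
```

Starting with the top at 0x9000, this gives a guard page at 0x6000, the
stack page at 0x7000 and an initial SP of 0x8000. In this layout every
address in the stack page has bit TOY_PAGE_SHIFT set and every address
in the guard page has it clear.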

> +		if (IS_ERR((void *)guard_hyp_va)) {
> +			err = PTR_ERR((void *)guard_hyp_va);
> +			kvm_err("Cannot allocate hyp stack guard page\n");
> +			goto out_err;
> +		}
> +
> +		/*
> +		 * Save the stack PA in nvhe_init_params. This will be needed
> +		 * to recreate the stack mapping in protected nVHE mode.
> +		 * __hyp_pa() won't do the right thing there, since the stack
> +		 * has been mapped in the flexible private VA space.
> +		 */
> +		params->stack_pa = __pa(stack_page) + PAGE_SIZE;
> +
> +		params->stack_hyp_va = stack_hyp_va + PAGE_SIZE;
>  	}
>
>  	for_each_possible_cpu(cpu) {

Thanks,

M.

--
Without deviation from the norm, progress is not possible.