Re: [PATCH 1/1] KVM: selftests: Adjust VM's initial stack address to align with SysV ABI spec

From: Sean Christopherson
Date: Thu Feb 23 2023 - 15:06:13 EST


On Fri, Feb 17, 2023, Ackerley Tng wrote:
> Align stack to match calling sequence requirements in section "The
> Stack Frame" of the System V ABI AMD64 Architecture Processor
> Supplement, which requires the value (%rsp + 8) to be a multiple of 16
> when control is transferred to the function entry point.

To make it slightly more clear what is wrong:

Align the guest stack to match calling sequence requirements in section
"The Stack Frame" of the System V ABI AMD64 Architecture Processor
Supplement, which requires the value (%rsp + 8), NOT %rsp, to be a
multiple of 16 when control is transferred to the function entry point.
I.e. in a normal function call, %rsp needs to be 16-byte aligned
_before_ CALL, not after.
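
FWIW, the arithmetic can be sketched in standalone userspace C. This is
only an illustration, not the selftests code: is_aligned() and
initial_guest_rsp() are hypothetical names, and the 4096-byte page size
is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not part of the selftests library. @a must be a power of two. */
static inline bool is_aligned(uint64_t x, uint64_t a)
{
	return (x & (a - 1)) == 0;
}

/*
 * Hypothetical helper: compute the initial %rsp for a 64-bit entry point
 * from the top of the stack allocation.  Subtracting 8 mimics the state
 * right after CALL has pushed an 8-byte return address onto a 16-byte
 * aligned stack, so that (%rsp + 8) is a multiple of 16 at function entry.
 */
static inline uint64_t initial_guest_rsp(uint64_t stack_top)
{
	/* Assumption: 4096-byte pages, so page-aligned implies 16-byte aligned. */
	assert(is_aligned(stack_top, 4096));
	return stack_top - 8;
}
```

With a stack top of 0x10000 this yields 0xfff8: the value itself is not
16-byte aligned, but the value plus 8 is, which is what the ABI asks for.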

> This is required because GCC is already aligned with the SysV ABI
> spec, and compiles code resulting in (%rsp + 8) being a multiple of 16
> when control is transferred to the function entry point.

I'd leave out this paragraph; any sane compiler, not just gcc, will adhere to the
SysV ABI.

> This fixes guest crashes when compiled guest code contains certain SSE

Nit, explicitly call out the #GP behavior, e.g. if/when KVM installs exception
handlers by default, there will be no crash.

E.g.

This fixes unexpected #GPs in guest code when the compiler uses SSE
instructions, e.g. to initialize memory, as many SSE instructions require
memory operands (including those on the stack) to be 16-byte aligned.

> instructions, because thes SSE instructions expect memory
> references (including those on the stack) to be 16-byte-aligned.
>
> Signed-off-by: Ackerley Tng <ackerleytng@xxxxxxxxxx>
> ---
>
> This patch is a follow-up from discussions at
> https://lore.kernel.org/lkml/20230121001542.2472357-9-ackerleytng@xxxxxxxxxx/
>
> ---
> .../selftests/kvm/include/linux/align.h | 15 +++++++++++++++
> .../selftests/kvm/lib/x86_64/processor.c | 18 +++++++++++++++++-
> 2 files changed, 32 insertions(+), 1 deletion(-)
> create mode 100644 tools/testing/selftests/kvm/include/linux/align.h
>
> diff --git a/tools/testing/selftests/kvm/include/linux/align.h b/tools/testing/selftests/kvm/include/linux/align.h
> new file mode 100644
> index 000000000000..2b4acec7b95a
> --- /dev/null
> +++ b/tools/testing/selftests/kvm/include/linux/align.h
> @@ -0,0 +1,15 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef _LINUX_ALIGN_H
> +#define _LINUX_ALIGN_H
> +
> +#include <linux/const.h>
> +
> +/* @a is a power of 2 value */
> +#define ALIGN(x, a) __ALIGN_KERNEL((x), (a))
> +#define ALIGN_DOWN(x, a) __ALIGN_KERNEL((x) - ((a) - 1), (a))
> +#define __ALIGN_MASK(x, mask) __ALIGN_KERNEL_MASK((x), (mask))
> +#define PTR_ALIGN(p, a) ((typeof(p))ALIGN((unsigned long)(p), (a)))
> +#define PTR_ALIGN_DOWN(p, a) ((typeof(p))ALIGN_DOWN((unsigned long)(p), (a)))
> +#define IS_ALIGNED(x, a) (((x) & ((typeof(x))(a) - 1)) == 0)

I agree it's high time align.h is pulled into tools/ but it belongs in
tools/include/linux/, not in KVM selftests.

For this fix specifically, tools/include/linux/bitmap.h already #defines IS_ALIGNED(),
so just use that, and pull in align.h (and remove the definition in bitmap.h) in
a separate patch (and let us maintainers deal with the conflicts).
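
For anyone following along, here is a rough function-style sketch of what
the alignment macros compute. The names align_up(), align_down() and
is_pow2_aligned() are hypothetical stand-ins for ALIGN(), ALIGN_DOWN() and
IS_ALIGNED(), not existing helpers, and all of them assume the alignment
is a power of two.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for ALIGN(): round @x up to a multiple of @a. */
static inline uint64_t align_up(uint64_t x, uint64_t a)
{
	return (x + (a - 1)) & ~(a - 1);
}

/* Hypothetical stand-in for ALIGN_DOWN(): round @x down to a multiple of @a. */
static inline uint64_t align_down(uint64_t x, uint64_t a)
{
	return x & ~(a - 1);
}

/* Hypothetical stand-in for IS_ALIGNED(): true if @x is a multiple of @a. */
static inline bool is_pow2_aligned(uint64_t x, uint64_t a)
{
	return (x & (a - 1)) == 0;
}
```

The mask trick only works because @a is a power of two, in which case
(a - 1) has exactly the low bits set that must be zero in an aligned value.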

> +
> +#endif /* _LINUX_ALIGN_H */
> diff --git a/tools/testing/selftests/kvm/lib/x86_64/processor.c b/tools/testing/selftests/kvm/lib/x86_64/processor.c
> index acfa1d01e7df..09b48ae96fdd 100644
> --- a/tools/testing/selftests/kvm/lib/x86_64/processor.c
> +++ b/tools/testing/selftests/kvm/lib/x86_64/processor.c
> @@ -5,6 +5,7 @@
> * Copyright (C) 2018, Google LLC.
> */
>
> +#include "linux/align.h"
> #include "test_util.h"
> #include "kvm_util.h"
> #include "processor.h"
> @@ -569,6 +570,21 @@ struct kvm_vcpu *vm_arch_vcpu_add(struct kvm_vm *vm, uint32_t vcpu_id,
> DEFAULT_GUEST_STACK_VADDR_MIN,
> MEM_REGION_DATA);
>
> + stack_vaddr += DEFAULT_STACK_PGS * getpagesize();
> +
> + /*
> + * Align stack to match calling sequence requirements in section "The
> + * Stack Frame" of the System V ABI AMD64 Architecture Processor
> + * Supplement, which requires the value (%rsp + 8) to be a multiple of
> + * 16 when control is transferred to the function entry point.
> + *
> + * If this code is ever used to launch a vCPU with 32-bit entry point it
> + * may need to subtract 4 bytes instead of 8 bytes.
> + */
> + TEST_ASSERT(IS_ALIGNED(stack_vaddr, PAGE_SIZE),
> + "stack_vaddr must be page aligned for stack adjustment of -8 to work");

Nit, for the message, tie it to the allocation, not to the usage, e.g.

	TEST_ASSERT(IS_ALIGNED(stack_vaddr, PAGE_SIZE),
		    "__vm_vaddr_alloc() did not provide a page-aligned address");

The assert exists to verify an assumption (that the allocator always provides
page-aligned addresses), and the error message should capture that. Explaining
what will break isn't as helpful because it doesn't help the reader understand
what went wrong.

> + stack_vaddr -= 8;
> +
> vcpu = __vm_vcpu_add(vm, vcpu_id);
> vcpu_init_cpuid(vcpu, kvm_get_supported_cpuid());
> vcpu_setup(vm, vcpu);
> @@ -576,7 +592,7 @@ struct kvm_vcpu *vm_arch_vcpu_add(struct kvm_vm *vm, uint32_t vcpu_id,
> /* Setup guest general purpose registers */
> vcpu_regs_get(vcpu, &regs);
> regs.rflags = regs.rflags | 0x2;
> - regs.rsp = stack_vaddr + (DEFAULT_STACK_PGS * getpagesize());
> + regs.rsp = stack_vaddr;
> regs.rip = (unsigned long) guest_code;
> vcpu_regs_set(vcpu, &regs);
>
> --
> 2.39.2.637.g21b0678d19-goog
>