Re: [PATCH] x86/sev-es: Do not use copy_from_kernel_nofault in early #VC handler

From: Dave Hansen
Date: Thu Sep 07 2023 - 13:15:24 EST


On 9/6/23 16:25, Adam Dunlap wrote:
>> Usually, we'll add some gunk in arch/x86/boot/compressed/misc.h to
>> override the troublesome implementation. In this case, it would make a
>> lot of sense to somehow avoid touching boot_cpu_data.x86_virt_bits in
>> the first place.
> Thanks for the comment. I realize this patch is doing something a bit misleading
> here. In this case, "early" does not refer to the compressed kernel, but
> actually the regular kernel but in the stage with this early #VC handler
> vc_boot_ghcb (instead of the usual vc_raw_handle_exception). This #VC handler
> triggers for the first time on a cpuid instruction in secondary_startup_64, but
> boot_cpu_data.x86_virt_bits is not initialized until setup_arch inside of
> start_kernel, which is at the end of secondary_startup_64.

How about something like the attached patch?

It avoids passing around 'is_early' everywhere, which I'm sure we'll get
wrong at some point. If we get it wrong, we lose *ALL* the checking
that copy_from_kernel*() does in addition to the canonical checks.

The attached patch at least preserves the userspace address checks.

This also makes me wonder how much other code called via the early
exception handlers is subtly broken. I scanned a function or two deep,
and the instruction decoding was the most guilty-looking thing, but a
closer look would be appreciated.

Also, what's the root cause here? What's causing the early exception?
Is it some silly CPUID leaf? Should we be more careful to just avoid
these exceptions?

diff --git a/arch/x86/mm/maccess.c b/arch/x86/mm/maccess.c
index 5a53c2cc169c..4f76c34d70a2 100644
--- a/arch/x86/mm/maccess.c
+++ b/arch/x86/mm/maccess.c
@@ -7,14 +7,24 @@
 bool copy_from_kernel_nofault_allowed(const void *unsafe_src, size_t size)
 {
 	unsigned long vaddr = (unsigned long)unsafe_src;
+	bool ret;
 
 	/*
-	 * Range covering the highest possible canonical userspace address
-	 * as well as non-canonical address range. For the canonical range
-	 * we also need to include the userspace guard page.
+	 * Do not allow userspace addresses. This disallows
+	 * normal userspace and the userspace guard page:
 	 */
-	return vaddr >= TASK_SIZE_MAX + PAGE_SIZE &&
-	       __is_canonical_address(vaddr, boot_cpu_data.x86_virt_bits);
+	if (vaddr < TASK_SIZE_MAX + PAGE_SIZE)
+		return false;
+
+	/*
+	 * Allow everything during early boot before 'x86_virt_bits'
+	 * is initialized. Needed for instruction decoding in early
+	 * exception handlers.
+	 */
+	if (!boot_cpu_data.x86_virt_bits)
+		return true;
+
+	return __is_canonical_address(vaddr, boot_cpu_data.x86_virt_bits);
 }
 #else
 bool copy_from_kernel_nofault_allowed(const void *unsafe_src, size_t size)