Re: [PATCH v3 2/3] Fix undefined operation fault that can hang a cpu on crash or panic

From: Sean Christopherson
Date: Tue Jul 07 2020 - 01:09:53 EST


On Sat, Jul 04, 2020 at 04:38:08PM -0400, David P. Reed wrote:
> Fix: Mask undefined operation fault during emergency VMXOFF that must be
> attempted to force cpu exit from VMX root operation.
> Explanation: When a cpu may be in VMX root operation (only possible when
> CR4.VMXE is set), crash or panic reboot tries to exit VMX root operation
> using VMXOFF. This is necessary, because any INIT will be masked while cpu
> is in VMX root operation, but that state cannot be reliably
> discerned by the state of the cpu.
> VMXOFF faults if the cpu is not actually in VMX root operation, signalling
> undefined operation.
> Discovered while debugging an out-of-tree x-visor with a race. Can happen
> due to certain kinds of bugs in KVM.
>
> Fixes: 208067 <https://bugzilla.kernel.org/show_bug.cgi?id=208067>
> Reported-by: David P. Reed <dpreed@xxxxxxxxxxxx>
> Suggested-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Suggested-by: Sean Christopherson <sean.j.christopherson@xxxxxxxxx>
> Suggested-by: Andy Lutomirski <luto@xxxxxxxxxx>
> Signed-off-by: David P. Reed <dpreed@xxxxxxxxxxxx>
> ---
> arch/x86/include/asm/virtext.h | 20 ++++++++++++++------
> 1 file changed, 14 insertions(+), 6 deletions(-)
>
> diff --git a/arch/x86/include/asm/virtext.h b/arch/x86/include/asm/virtext.h
> index 0ede8d04535a..0e0900eacb9c 100644
> --- a/arch/x86/include/asm/virtext.h
> +++ b/arch/x86/include/asm/virtext.h
> @@ -30,11 +30,11 @@ static inline int cpu_has_vmx(void)
> }
>
>
> -/* Disable VMX on the current CPU
> +/* Exit VMX root mode and isable VMX on the current CPU.
> *
> * vmxoff causes a undefined-opcode exception if vmxon was not run
> - * on the CPU previously. Only call this function if you know VMX
> - * is enabled.
> + * on the CPU previously. Only call this function if you know cpu
> + * is in VMX root mode.
> */
> static inline void cpu_vmxoff(void)
> {
> @@ -47,14 +47,22 @@ static inline int cpu_vmx_enabled(void)
> return __read_cr4() & X86_CR4_VMXE;
> }
>
> -/* Disable VMX if it is enabled on the current CPU
> +/* Safely exit VMX root mode and disable VMX if VMX enabled
> + * on the current CPU. Handle undefined-opcode fault
> + * that can occur if cpu is not in VMX root mode, due
> + * to a race.
> *
> * You shouldn't call this if cpu_has_vmx() returns 0.
> */
> static inline void __cpu_emergency_vmxoff(void)
> {
> - if (cpu_vmx_enabled())
> - cpu_vmxoff();
> + if (!cpu_vmx_enabled())
> + return;
> + asm volatile ("1:vmxoff\n\t"
> + "2:\n\t"
> + _ASM_EXTABLE(1b, 2b)
> + ::: "cc", "memory");
> + cr4_clear_bits(X86_CR4_VMXE);

Open coding vmxoff doesn't make sense, and IMO is flat out wrong as it fixes
flows that use __cpu_emergency_vmxoff() but leaves the same bug hanging
around in emergency_vmx_disable_all() until the next patch.

The reason I say it doesn't make sense is that there is no sane scenario
where the generic vmxoff helper should _not_ eat the fault. All other VMXOFF
faults are mode related, i.e. any fault is guaranteed to be due to the
!post-VMXON check unless we're magically in RM, VM86, compat mode, or at
CPL>0. Given that the whole point of this series is that it's impossible to
determine whether or not the CPU if post-VMXON if CR4.VMXE=1 without taking a
fault of some form, there's simply no way that anything except the hypervisor
(in normal operation) can know the state of VMX. And given that the only
in-tree hypervisor (KVM) has its own version of vmxoff, that means there is
no scenario in which cpu_vmxoff() can safely be used. Case in point, after
the next patch there are no users of cpu_vmxoff().

TL;DR: Just do fixup on cpu_vmxoff().

> }
>
> /* Disable VMX if it is supported and enabled on the current CPU
> --
> 2.26.2
>