Re: [PATCH] KVM: Enable hardware before doing arch VM initialization

From: Huacai Chen
Date: Thu Sep 24 2020 - 02:32:03 EST


Hi, Sean,

On Thu, Sep 24, 2020 at 3:00 AM Sean Christopherson
<sean.j.christopherson@xxxxxxxxx> wrote:
>
> Swap the order of hardware_enable_all() and kvm_arch_init_vm() to
> accommodate Intel's Trust Domain Extension (TDX), which needs VMX to be
> fully enabled during VM init in order to make SEAMCALLs.
>
> This also provides consistent ordering between kvm_create_vm() and
> kvm_destroy_vm() with respect to calling kvm_arch_destroy_vm() and
> hardware_disable_all().
Do you means that hardware_enable_all() enable VMX, kvm_arch_init_vm()
enable TDX, and TDX depends on VMX enabled at first? If so, can TDX be
also enabled at hardware_enable_all()?

The swapping seems not affect MIPS, but I observed a fact:
kvm_arch_hardware_enable() not only be called at
hardware_enable_all(), but also be called at kvm_starting_cpu(). Even
if you swap the order, new starting CPUs are not enabled VMX before
kvm_arch_init_vm(). (Maybe I am wrong because I'm not familiar with
VMX/TDX).

Huacai
>
> Cc: Marc Zyngier <maz@xxxxxxxxxx>
> Cc: James Morse <james.morse@xxxxxxx>
> Cc: Julien Thierry <julien.thierry.kdev@xxxxxxxxx>
> Cc: Suzuki K Poulose <suzuki.poulose@xxxxxxx>
> Cc: linux-arm-kernel@xxxxxxxxxxxxxxxxxxx
> Cc: Huacai Chen <chenhc@xxxxxxxxxx>
> Cc: Aleksandar Markovic <aleksandar.qemu.devel@xxxxxxxxx>
> Cc: linux-mips@xxxxxxxxxxxxxxx
> Cc: Paul Mackerras <paulus@xxxxxxxxxx>
> Cc: kvm-ppc@xxxxxxxxxxxxxxx
> Cc: Christian Borntraeger <borntraeger@xxxxxxxxxx>
> Cc: Janosch Frank <frankja@xxxxxxxxxxxxx>
> Cc: David Hildenbrand <david@xxxxxxxxxx>
> Cc: Cornelia Huck <cohuck@xxxxxxxxxx>
> Cc: Claudio Imbrenda <imbrenda@xxxxxxxxxxxxx>
> Cc: Vitaly Kuznetsov <vkuznets@xxxxxxxxxx>
> Cc: Wanpeng Li <wanpengli@xxxxxxxxxxx>
> Cc: Jim Mattson <jmattson@xxxxxxxxxx>
> Cc: Joerg Roedel <joro@xxxxxxxxxx>
> Signed-off-by: Sean Christopherson <sean.j.christopherson@xxxxxxxxx>
> ---
>
> Obviously not required until the TDX series comes along, but IMO KVM
> should be consistent with respect to enabling and disabling virt support
> in hardware.
>
> Tested only on Intel hardware. Unless I missed something, this only
> affects x86, Arm and MIPS as hardware enabling is a nop for s390 and PPC.
> Arm looks safe (based on my mostly clueless reading of the code), but I
> have no idea if this will cause problem for MIPS, which is doing all kinds
> of things in hardware_enable() that I don't pretend to fully understand.
>
> virt/kvm/kvm_main.c | 16 ++++++++--------
> 1 file changed, 8 insertions(+), 8 deletions(-)
>
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index cf88233b819a..58fa19bcfc90 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -766,7 +766,7 @@ static struct kvm *kvm_create_vm(unsigned long type)
> struct kvm_memslots *slots = kvm_alloc_memslots();
>
> if (!slots)
> - goto out_err_no_arch_destroy_vm;
> + goto out_err_no_disable;
> /* Generations must be different for each address space. */
> slots->generation = i;
> rcu_assign_pointer(kvm->memslots[i], slots);
> @@ -776,19 +776,19 @@ static struct kvm *kvm_create_vm(unsigned long type)
> rcu_assign_pointer(kvm->buses[i],
> kzalloc(sizeof(struct kvm_io_bus), GFP_KERNEL_ACCOUNT));
> if (!kvm->buses[i])
> - goto out_err_no_arch_destroy_vm;
> + goto out_err_no_disable;
> }
>
> kvm->max_halt_poll_ns = halt_poll_ns;
>
> - r = kvm_arch_init_vm(kvm, type);
> - if (r)
> - goto out_err_no_arch_destroy_vm;
> -
> r = hardware_enable_all();
> if (r)
> goto out_err_no_disable;
>
> + r = kvm_arch_init_vm(kvm, type);
> + if (r)
> + goto out_err_no_arch_destroy_vm;
> +
> #ifdef CONFIG_HAVE_KVM_IRQFD
> INIT_HLIST_HEAD(&kvm->irq_ack_notifier_list);
> #endif
> @@ -815,10 +815,10 @@ static struct kvm *kvm_create_vm(unsigned long type)
> mmu_notifier_unregister(&kvm->mmu_notifier, current->mm);
> #endif
> out_err_no_mmu_notifier:
> - hardware_disable_all();
> -out_err_no_disable:
> kvm_arch_destroy_vm(kvm);
> out_err_no_arch_destroy_vm:
> + hardware_disable_all();
> +out_err_no_disable:
> WARN_ON_ONCE(!refcount_dec_and_test(&kvm->users_count));
> for (i = 0; i < KVM_NR_BUSES; i++)
> kfree(kvm_get_bus(kvm, i));
> --
> 2.28.0
>