Re: [PATCH] KVM: Enable hardware before doing arch VM initialization

From: Huacai Chen
Date: Thu Sep 24 2020 - 03:45:26 EST


Hi, Paolo,

On Thu, Sep 24, 2020 at 2:50 PM Paolo Bonzini <pbonzini@xxxxxxxxxx> wrote:
>
> On 24/09/20 08:31, Huacai Chen wrote:
> > Hi, Sean,
> >
> > On Thu, Sep 24, 2020 at 3:00 AM Sean Christopherson
> > <sean.j.christopherson@xxxxxxxxx> wrote:
> >>
> >> Swap the order of hardware_enable_all() and kvm_arch_init_vm() to
> >> accommodate Intel's Trust Domain Extension (TDX), which needs VMX to be
> >> fully enabled during VM init in order to make SEAMCALLs.
> >>
> >> This also provides consistent ordering between kvm_create_vm() and
> >> kvm_destroy_vm() with respect to calling kvm_arch_destroy_vm() and
> >> hardware_disable_all().
> > Do you means that hardware_enable_all() enable VMX, kvm_arch_init_vm()
> > enable TDX, and TDX depends on VMX enabled at first? If so, can TDX be
> > also enabled at hardware_enable_all()?
>
> kvm_arch_init_vm() enables TDX *for the VM*, and to do that it needs VMX
> instructions (specifically SEAMCALL, which is a hypervisor->"ultravisor"
> call). Because that action is VM-specific it cannot be done in
> hardware_enable_all().
>
> Paolo
OK, I know.

Reviewed-by: Huacai Chen <chenhc@xxxxxxxxxx>

>
> > The swapping seems not affect MIPS, but I observed a fact:
> > kvm_arch_hardware_enable() not only be called at
> > hardware_enable_all(), but also be called at kvm_starting_cpu(). Even
> > if you swap the order, new starting CPUs are not enabled VMX before
> > kvm_arch_init_vm(). (Maybe I am wrong because I'm not familiar with
> > VMX/TDX).
> >
> > Huacai
> >>
> >> Cc: Marc Zyngier <maz@xxxxxxxxxx>
> >> Cc: James Morse <james.morse@xxxxxxx>
> >> Cc: Julien Thierry <julien.thierry.kdev@xxxxxxxxx>
> >> Cc: Suzuki K Poulose <suzuki.poulose@xxxxxxx>
> >> Cc: linux-arm-kernel@xxxxxxxxxxxxxxxxxxx
> >> Cc: Huacai Chen <chenhc@xxxxxxxxxx>
> >> Cc: Aleksandar Markovic <aleksandar.qemu.devel@xxxxxxxxx>
> >> Cc: linux-mips@xxxxxxxxxxxxxxx
> >> Cc: Paul Mackerras <paulus@xxxxxxxxxx>
> >> Cc: kvm-ppc@xxxxxxxxxxxxxxx
> >> Cc: Christian Borntraeger <borntraeger@xxxxxxxxxx>
> >> Cc: Janosch Frank <frankja@xxxxxxxxxxxxx>
> >> Cc: David Hildenbrand <david@xxxxxxxxxx>
> >> Cc: Cornelia Huck <cohuck@xxxxxxxxxx>
> >> Cc: Claudio Imbrenda <imbrenda@xxxxxxxxxxxxx>
> >> Cc: Vitaly Kuznetsov <vkuznets@xxxxxxxxxx>
> >> Cc: Wanpeng Li <wanpengli@xxxxxxxxxxx>
> >> Cc: Jim Mattson <jmattson@xxxxxxxxxx>
> >> Cc: Joerg Roedel <joro@xxxxxxxxxx>
> >> Signed-off-by: Sean Christopherson <sean.j.christopherson@xxxxxxxxx>
> >> ---
> >>
> >> Obviously not required until the TDX series comes along, but IMO KVM
> >> should be consistent with respect to enabling and disabling virt support
> >> in hardware.
> >>
> >> Tested only on Intel hardware. Unless I missed something, this only
> >> affects x86, Arm and MIPS as hardware enabling is a nop for s390 and PPC.
> >> Arm looks safe (based on my mostly clueless reading of the code), but I
> >> have no idea if this will cause problem for MIPS, which is doing all kinds
> >> of things in hardware_enable() that I don't pretend to fully understand.
> >>
> >> virt/kvm/kvm_main.c | 16 ++++++++--------
> >> 1 file changed, 8 insertions(+), 8 deletions(-)
> >>
> >> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> >> index cf88233b819a..58fa19bcfc90 100644
> >> --- a/virt/kvm/kvm_main.c
> >> +++ b/virt/kvm/kvm_main.c
> >> @@ -766,7 +766,7 @@ static struct kvm *kvm_create_vm(unsigned long type)
> >> struct kvm_memslots *slots = kvm_alloc_memslots();
> >>
> >> if (!slots)
> >> - goto out_err_no_arch_destroy_vm;
> >> + goto out_err_no_disable;
> >> /* Generations must be different for each address space. */
> >> slots->generation = i;
> >> rcu_assign_pointer(kvm->memslots[i], slots);
> >> @@ -776,19 +776,19 @@ static struct kvm *kvm_create_vm(unsigned long type)
> >> rcu_assign_pointer(kvm->buses[i],
> >> kzalloc(sizeof(struct kvm_io_bus), GFP_KERNEL_ACCOUNT));
> >> if (!kvm->buses[i])
> >> - goto out_err_no_arch_destroy_vm;
> >> + goto out_err_no_disable;
> >> }
> >>
> >> kvm->max_halt_poll_ns = halt_poll_ns;
> >>
> >> - r = kvm_arch_init_vm(kvm, type);
> >> - if (r)
> >> - goto out_err_no_arch_destroy_vm;
> >> -
> >> r = hardware_enable_all();
> >> if (r)
> >> goto out_err_no_disable;
> >>
> >> + r = kvm_arch_init_vm(kvm, type);
> >> + if (r)
> >> + goto out_err_no_arch_destroy_vm;
> >> +
> >> #ifdef CONFIG_HAVE_KVM_IRQFD
> >> INIT_HLIST_HEAD(&kvm->irq_ack_notifier_list);
> >> #endif
> >> @@ -815,10 +815,10 @@ static struct kvm *kvm_create_vm(unsigned long type)
> >> mmu_notifier_unregister(&kvm->mmu_notifier, current->mm);
> >> #endif
> >> out_err_no_mmu_notifier:
> >> - hardware_disable_all();
> >> -out_err_no_disable:
> >> kvm_arch_destroy_vm(kvm);
> >> out_err_no_arch_destroy_vm:
> >> + hardware_disable_all();
> >> +out_err_no_disable:
> >> WARN_ON_ONCE(!refcount_dec_and_test(&kvm->users_count));
> >> for (i = 0; i < KVM_NR_BUSES; i++)
> >> kfree(kvm_get_bus(kvm, i));
> >> --
> >> 2.28.0
> >>
> >
>