Re: [PATCH] KVM: Enable hardware before doing arch VM initialization

From: Paolo Bonzini
Date: Thu Sep 24 2020 - 02:50:46 EST


On 24/09/20 08:31, Huacai Chen wrote:
> Hi, Sean,
>
> On Thu, Sep 24, 2020 at 3:00 AM Sean Christopherson
> <sean.j.christopherson@xxxxxxxxx> wrote:
>>
>> Swap the order of hardware_enable_all() and kvm_arch_init_vm() to
>> accommodate Intel's Trust Domain Extension (TDX), which needs VMX to be
>> fully enabled during VM init in order to make SEAMCALLs.
>>
>> This also provides consistent ordering between kvm_create_vm() and
>> kvm_destroy_vm() with respect to calling kvm_arch_destroy_vm() and
>> hardware_disable_all().
> Do you means that hardware_enable_all() enable VMX, kvm_arch_init_vm()
> enable TDX, and TDX depends on VMX enabled at first? If so, can TDX be
> also enabled at hardware_enable_all()?

kvm_arch_init_vm() enables TDX *for the VM*, and to do that it needs VMX
instructions (specifically SEAMCALL, which is a hypervisor->"ultravisor"
call). Because that action is VM-specific it cannot be done in
hardware_enable_all().

Paolo

> The swapping seems not affect MIPS, but I observed a fact:
> kvm_arch_hardware_enable() not only be called at
> hardware_enable_all(), but also be called at kvm_starting_cpu(). Even
> if you swap the order, new starting CPUs are not enabled VMX before
> kvm_arch_init_vm(). (Maybe I am wrong because I'm not familiar with
> VMX/TDX).
>
> Huacai
>>
>> Cc: Marc Zyngier <maz@xxxxxxxxxx>
>> Cc: James Morse <james.morse@xxxxxxx>
>> Cc: Julien Thierry <julien.thierry.kdev@xxxxxxxxx>
>> Cc: Suzuki K Poulose <suzuki.poulose@xxxxxxx>
>> Cc: linux-arm-kernel@xxxxxxxxxxxxxxxxxxx
>> Cc: Huacai Chen <chenhc@xxxxxxxxxx>
>> Cc: Aleksandar Markovic <aleksandar.qemu.devel@xxxxxxxxx>
>> Cc: linux-mips@xxxxxxxxxxxxxxx
>> Cc: Paul Mackerras <paulus@xxxxxxxxxx>
>> Cc: kvm-ppc@xxxxxxxxxxxxxxx
>> Cc: Christian Borntraeger <borntraeger@xxxxxxxxxx>
>> Cc: Janosch Frank <frankja@xxxxxxxxxxxxx>
>> Cc: David Hildenbrand <david@xxxxxxxxxx>
>> Cc: Cornelia Huck <cohuck@xxxxxxxxxx>
>> Cc: Claudio Imbrenda <imbrenda@xxxxxxxxxxxxx>
>> Cc: Vitaly Kuznetsov <vkuznets@xxxxxxxxxx>
>> Cc: Wanpeng Li <wanpengli@xxxxxxxxxxx>
>> Cc: Jim Mattson <jmattson@xxxxxxxxxx>
>> Cc: Joerg Roedel <joro@xxxxxxxxxx>
>> Signed-off-by: Sean Christopherson <sean.j.christopherson@xxxxxxxxx>
>> ---
>>
>> Obviously not required until the TDX series comes along, but IMO KVM
>> should be consistent with respect to enabling and disabling virt support
>> in hardware.
>>
>> Tested only on Intel hardware. Unless I missed something, this only
>> affects x86, Arm and MIPS as hardware enabling is a nop for s390 and PPC.
>> Arm looks safe (based on my mostly clueless reading of the code), but I
>> have no idea if this will cause problem for MIPS, which is doing all kinds
>> of things in hardware_enable() that I don't pretend to fully understand.
>>
>> virt/kvm/kvm_main.c | 16 ++++++++--------
>> 1 file changed, 8 insertions(+), 8 deletions(-)
>>
>> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
>> index cf88233b819a..58fa19bcfc90 100644
>> --- a/virt/kvm/kvm_main.c
>> +++ b/virt/kvm/kvm_main.c
>> @@ -766,7 +766,7 @@ static struct kvm *kvm_create_vm(unsigned long type)
>> struct kvm_memslots *slots = kvm_alloc_memslots();
>>
>> if (!slots)
>> - goto out_err_no_arch_destroy_vm;
>> + goto out_err_no_disable;
>> /* Generations must be different for each address space. */
>> slots->generation = i;
>> rcu_assign_pointer(kvm->memslots[i], slots);
>> @@ -776,19 +776,19 @@ static struct kvm *kvm_create_vm(unsigned long type)
>> rcu_assign_pointer(kvm->buses[i],
>> kzalloc(sizeof(struct kvm_io_bus), GFP_KERNEL_ACCOUNT));
>> if (!kvm->buses[i])
>> - goto out_err_no_arch_destroy_vm;
>> + goto out_err_no_disable;
>> }
>>
>> kvm->max_halt_poll_ns = halt_poll_ns;
>>
>> - r = kvm_arch_init_vm(kvm, type);
>> - if (r)
>> - goto out_err_no_arch_destroy_vm;
>> -
>> r = hardware_enable_all();
>> if (r)
>> goto out_err_no_disable;
>>
>> + r = kvm_arch_init_vm(kvm, type);
>> + if (r)
>> + goto out_err_no_arch_destroy_vm;
>> +
>> #ifdef CONFIG_HAVE_KVM_IRQFD
>> INIT_HLIST_HEAD(&kvm->irq_ack_notifier_list);
>> #endif
>> @@ -815,10 +815,10 @@ static struct kvm *kvm_create_vm(unsigned long type)
>> mmu_notifier_unregister(&kvm->mmu_notifier, current->mm);
>> #endif
>> out_err_no_mmu_notifier:
>> - hardware_disable_all();
>> -out_err_no_disable:
>> kvm_arch_destroy_vm(kvm);
>> out_err_no_arch_destroy_vm:
>> + hardware_disable_all();
>> +out_err_no_disable:
>> WARN_ON_ONCE(!refcount_dec_and_test(&kvm->users_count));
>> for (i = 0; i < KVM_NR_BUSES; i++)
>> kfree(kvm_get_bus(kvm, i));
>> --
>> 2.28.0
>>
>