Re: [PATCH v11 023/113] KVM: TDX: allocate/free TDX vcpu structure

From: Isaku Yamahata
Date: Tue Feb 28 2023 - 15:19:07 EST


On Tue, Feb 28, 2023 at 11:52:59AM +0000,
"Huang, Kai" <kai.huang@xxxxxxxxx> wrote:

> On Tue, 2023-02-28 at 03:06 -0800, Isaku Yamahata wrote:
> > > > + if (!e)
> > > > + return -ENOMEM;
> > > > + *e  = (struct kvm_cpuid_entry2) {
> > > > + .function = 1, /* Features for X2APIC */
> > > > + .index = 0,
> > > > + .eax = 0,
> > > > + .ebx = 0,
> > > > + .ecx = 1ULL << 21, /* X2APIC */
> > > > + .edx = 0,
> > > > + };
> > > > + vcpu->arch.cpuid_entries = e;
> > > > + vcpu->arch.cpuid_nent = 1;
> > >
> > > As mentioned above, why doing it here? Won't be this be overwritten later in
> > > KVM_SET_CPUID2?
> >
> > Yes, user space VMM can overwrite cpuid[0x1] and APIC base MSR.  But it
> > doesn't
> > matter because it's a bug of user space VMM. user space VMM has to keep the
> > consistency of cpuid and MSRs.
> > Because TDX module virtualizes cpuid[0x1].x2apic to fixed 1, KVM value doesn't
> > matter after vcpu creation.
> > Because KVM virtualizes APIC base as read only to guest, cpuid[0x1].x2apic
> > doesn't matter after vcpu creation as long as user space VMM keeps KVM APIC
> > BASE
> > value.
> >
>
> Contrary, can we depend on userspace VMM to set x2APIC in CPUID, but not do this
> in KVM? If userspace doesn't do it, we treat it as userspace's bug.
>
> Plus, userspace anyway needs to set x2APIC in CPUID regardless whether you have
> done above here, correct?
>
> I don't see the point of doing above in KVM because you are neither enforcing
> anything in KVM, nor you are reducing effort of userspace.

Good idea. I can drop cpuid part from tdx_vcpu_create() and apic base part from
tdx_vcpu_reset(). It needs to modify tdx_has_emulated_msr() to allow user space
VMM to update APIC BASE MSR.

diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index e045a8132639..46c82ce3ef46 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -2624,7 +2624,14 @@ int tdx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)

int tdx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)
{
- if (tdx_has_emulated_msr(msr->index, true))
+ if (tdx_has_emulated_msr(msr->index, true) ||
+ /*
+ * user space VMM should explicitly set to X2APIC mode as initial
+ * value that is deviated from the conventional case.
+ */
+ (msr->host_initiated && msr->index == MSR_IA32_APICBASE &&
+ (msr->data & ~MSR_IA32_APICBASE_BSP) ==
+ (APIC_DEFAULT_PHYS_BASE | LAPIC_MODE_X2APIC)))
return kvm_set_msr_common(vcpu, msr);
return 1;
}


Just FYI, qemu needs the following change.

--- a/target/i386/kvm/tdx.c
+++ b/target/i386/kvm/tdx.c
@@ -274,6 +274,11 @@ bool is_tdx_vm(void)
return !!tdx_guest;
}

+bool is_tdx_vm_finalized(void)
+{
+ return !!tdx_guest && tdx_guest->parent_obj.ready;
+}
+
static inline uint32_t host_cpuid_reg(uint32_t function,
uint32_t index, int reg)
{
@@ -875,10 +880,20 @@ static void tdx_post_init_vcpus(void)
TdxFirmwareEntry *hob;
CPUState *cpu;
int r;
+ uint64_t apic_base;

hob = tdx_get_hob_entry(tdx_guest);
CPU_FOREACH(cpu) {
- apic_force_x2apic(X86_CPU(cpu)->apic_state);
+ X86CPU *x86_cpu = X86_CPU(cpu);
+
+ apic_force_x2apic(x86_cpu->apic_state);
+
+ apic_base = APIC_DEFAULT_ADDRESS | MSR_IA32_APICBASE_ENABLE |
+ MSR_IA32_APICBASE_EXTD;
+ if (cpu_is_bsp(x86_cpu))
+ apic_base |= MSR_IA32_APICBASE_BSP;
+ cpu_set_apic_base(x86_cpu->apic_state, apic_base);
+ kvm_put_apicbase(x86_cpu, apic_base);

r = tdx_vcpu_ioctl(cpu, KVM_TDX_INIT_VCPU, 0, (void *)hob->address);
if (r < 0) {
diff --git a/target/i386/kvm/tdx.h b/target/i386/kvm/tdx.h
index 5cc0f730afa6..f77689464738 100644
--- a/target/i386/kvm/tdx.h
+++ b/target/i386/kvm/tdx.h
@@ -59,8 +59,10 @@ typedef struct TdxGuest {

#ifdef CONFIG_TDX
bool is_tdx_vm(void);
+bool is_tdx_vm_finalized(void);
#else
#define is_tdx_vm() 0
+#define is_tdx_vm_finalized() false;
#endif /* CONFIG_TDX */

int tdx_kvm_init(MachineState *ms, Error **errp);

--
Isaku Yamahata <isaku.yamahata@xxxxxxxxx>