Re: [RFC PATCH 0/4] KVM: x86/tdx: Have TDX handle VMXON during bringup

From: dan.j.williams

Date: Mon Oct 13 2025 - 18:23:04 EST


[ Add Alexey for question below about SEV-TIO needing to enable SNP from
the PSP driver? ]

Sean Christopherson wrote:
> This is a sort of middle ground between fully yanking core virtualization
> support out of KVM, and unconditionally doing VMXON during boot[0].

Thanks for this, Sean!

> I got quite far along on rebasing some internal patches we have to extract the
> core virtualization bits out of KVM x86, but as I paged back in all of the
> things we had punted on (because they were waaay out of scope for our needs),
> I realized more and more that providing truly generic virtualization
> infrastructure is vastly different than providing infrastructure that can be
> shared by multiple instances of KVM (or things very similar to KVM)[1].
>
> So while I still don't want to blindly do VMXON, I also think that trying to
> actually support another in-tree hypervisor, without an imminent user to drive
> the development, is a waste of resources, and would saddle KVM with a pile of
> pointless complexity.
>
> The idea here is to extract _only_ VMXON+VMXOFF and EFER.SVME toggling. AFAIK
> there's no second user of SVM, i.e. no equivalent to TDX, but I wanted to keep
> things as symmetrical as possible.

Alexey did mention in the TEE I/O call that the PSP driver does need to
turn on SVM. Added him to the Cc to clarify if SEV-TIO needs at least
SVM enabled outside of KVM in some cases.

> Emphasis on "only", because leaving VMCS tracking and clearing in KVM is
> another key difference from Xin's series. The "light bulb" moment on that
> front is that TDX isn't a hypervisor, and isn't trying to be a hypervisor.
> Specifically, TDX should _never_ have its own VMCSes (that are visible to the
> host; the TDX-Module has its own VMCSes to do SEAMCALL/SEAMRET), and so there
> is simply no reason to move that functionality out of KVM.
>
> With that out of the way, dealing with VMXON/VMXOFF and EFER.SVME is a fairly
> simple refcounting game.
>
> Oh, and I didn't bother looking to see if it would work, but if TDX only needs
> VMXON during boot, then the TDX use of VMXON could be transient.

With the work-in-progress "Host Services", the expectation is that VMX
would remain on, especially because there is currently no way to de-init
TDX.
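
As a rough illustration of the "refcounting game" quoted above, here is a
userspace model. It is a sketch, not code from the series: every name in
it is hypothetical, the VMXON/VMXOFF steps are stubbed out with a flag, and
a real implementation would need locking plus the cpuhp and syscore
handling Sean mentions. It only shows why a TDX reference that is never
dropped keeps VMX on across KVM load/unload.

```c
/* Userspace model only; all names are hypothetical, not the patch's API. */
#include <assert.h>
#include <stdbool.h>

static int vmxon_users;   /* subsystems (e.g. KVM, TDX) that need VMX on */
static bool vmx_enabled;  /* stands in for the real post-VMXON CPU state */

/* Stand-ins for running VMXON / VMXOFF on every online CPU. */
static void hw_vmxon(void)  { vmx_enabled = true; }
static void hw_vmxoff(void) { vmx_enabled = false; }

/* The first user turns VMX on; later users only bump the count. */
static void virt_get(void)
{
	if (vmxon_users++ == 0)
		hw_vmxon();
}

/* The last user to leave turns VMX off. */
static void virt_put(void)
{
	assert(vmxon_users > 0);	/* unbalanced put is a caller bug */
	if (--vmxon_users == 0)
		hw_vmxoff();
}
```

In this model, TDX calling virt_get() once at bringup and never calling
virt_put() means KVM's own get/put pairs never reach zero, so VMXOFF is
never executed.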

Now, the "TDX always-on even outside of Host Services" behavior this
series is proposing gives me slight pause. I.e., any resources that TDX
gobbles, or features that TDX is incompatible with (ACPI S3), need a trip
through a BIOS menu to turn off. However, if that becomes a problem in
practice we can circle back later to fix that up.

> could simply blast on_each_cpu() and forego the cpuhp and syscore hooks (a
> non-emergency reboot during init isn't possible). I don't particularly care
> what TDX does, as it's a fairly minor detail all things concerned. I went with
> the "harder" approach, e.g. to validate keeping the VMXON users count elevated
> would do the right thing with respect to CPU offlining, etc.
>
> Lightly tested (see the hacks below to verify the TDX side appears to do what
> it's supposed to do), but it seems to work? Heavily RFC, e.g. the third patch
> in particular needs to be chunked up, I'm sure there's polishing to be done,
> etc.

Sounds good and I read this as "hey, this is the form I would like to
see, when someone else cleans this up and sends it back to me as a
non-RFC".

Thanks again!