Re: [ANNOUNCE] PUCK Notes - 2024.04.03 - TDX Upstreaming Strategy

From: Edgecombe, Rick P
Date: Fri Apr 12 2024 - 16:09:59 EST


On Fri, 2024-04-12 at 10:39 -0700, Isaku Yamahata wrote:
> On Fri, Apr 12, 2024 at 04:40:37PM +0800,
> Xiaoyao Li <xiaoyao.li@xxxxxxxxx> wrote:
>
> > > The second issue is that userspace can’t know what CPUID values are
> > > configured
> > > in the TD. In the existing API for normal guests, it knows because it
> > > tells the
> > > guest what CPUID values to have. But for the TDX module that model is
> > > complicated to fit into in its API where you tell it some things and it
> > > gives
> > > you the resulting leaves. How to handle KVM_SET_CPUID kind of follows from
> > > this
> > > issue.
> > >
> > > One option is to demand the TDX module change to be able to efficiently
> > > wedge
> > > into KVM’s exiting “tell” model. This looks like the metadata API to query
> > > the
> > > fixed bits. Then userspace can know what bits it has to set, and call
> > > KVM_SET_CPUID with them. I think it is still kind of awkward. "Tell me
> > > what you
> > > want to hear?", "Ok here it is".
> > >
> > > Another option would be to add TDX specific KVM APIs that work for the TDX
> > > module's “ask” model, and meet the enumerated two goals. It could look
> > > something
> > > like:
> > > 1. KVM_TDX_GET_CONFIG_CPUID provides a list of directly configurable bits
> > > by
> > > KVM. This is based on static data on what KVM supports, with sanity check
> > > of
> > > TD_SYSINFO.CPUID_CONFIG[]. Bits that KVM doesn’t know about, but are
> > > returned as
> > > configurable by TD_SYSINFO.CPUID_CONFIG[] are not exposed as configurable.
> > > (they
> > > will be set to 1 by KVM, per the recommendation)
> >
> > This is not how KVM works. KVM will never enable unknown features blindly.
> > If the feature is unknown to KVM, it cannot be enable for guest. That's why
> > every new feature needs enabling patch in KVM, even the simplest case that
> > needs one patch to enumerate the CPUID of new instruction in
> > KVM_GET_SUPPORTED_CPUID.

I *think* the part in the docs that says VMMs need to do this is concerned with
fixed 1 bits becoming configurable. So not newly defined bits, but just ones
that were previously fixed as 1 and now need to be configurable to 1 to result
in no change. We need to clarify this.

>
> We can use device attributes as discussed at
> https://lore.kernel.org/kvm/CABgObfZzkNiP3q8p=KpvvFnh8m6qcHX4=tATaJc7cvVv2QWpJQ@xxxxxxxxxxxxxx/
> https://lore.kernel.org/kvm/20240404121327.3107131-6-pbonzini@xxxxxxxxxx/
>
> Something like
>
> #define KVM_X86_GRP_TDX         2
> ioctl(fd, KVM_GET_DEVICE_ATTR, (KVM_X86_GRP_TDX, metadata_field_id))

This would be instead of ATTRIBUTES and XFAM?

>
>
> > > 2. KVM_TDX_INIT_VM is passed userspaces choice of configurable bits, along
> > > with
> > > XFAM and ATTRIBUTES as dedicated fields. They go into TDH.MNG.INIT.
> > > 3. KVM_TDX_INIT_VCPU_CPUID takes a list of CPUID leafs. It pulls the CPUID
> > > bits
> > > actually configured in the TD for these leafs. They go into the struct
> > > kvm_vcpu,
> > > and are also passed up to userspace so everyone knows what actually got
> > > configured.
>
> Any reason to introduce KVM_TDX_INIT_VCPU_CPUID in addition to
> KVM_TDX_INIT_VCPU?  We can make single vCPU KVM TDX ioctl do all.

No, that is a good idea to use KVM_TDX_INIT_VCPU.

>
>
> > > KVM_SET_CPUID is not used for TDX.
>
> What cpuid does KVM_TDX_INIT_VCPU_CPUID accept?  The one that TDX module
> accepts with TDH.MNG.INIT()?  Or any cpuids that KVM_SET_CPUID2 accepts?
> I'm asking it because TDX module virtualizes only subset of CPUIDs.
> TDG.VP.VMCALL<CPUID> would need info from KVM_SET_CPUID.

It should only take bits that are exposed by KVM as configurable for TDX which
are those that KVM knows about AND the TDX module supports. The other features
come in as ATTRIBUTES or XFAM bits (or KVM versions of the same concept).

>
>
> > > Then we get TDX module folks to commit to never breaking KVM/userspace
> > > that
> > > follows this logic. One thing still missing is how to handle unknown
> > > future
> > > leafs with fixed bits. If a future leaf is defined and gets fixed 1, QEMU
> > > wouldn't know to query it.
> >
> > We can make KVM_TDX_INIT_VCPU_CPUID provide a large enough CPUID leafs and
> > KVM reports every leafs to userpsace. Instead of something that userspace
> > cares leafs X,Y,Z and KVM only reports back leafs X,Y,Z via
> > KVM_TDX_INIT_VCPU_CPUID.
>
> If new CPUID index is introduced, the userspace will get default values of
> CPUIDs and don't touch unknown CPUIDs?  Or KVM_TDX_GET_CONFIG_CPUID will mask
> out CPUID unknown to KVM?

The API to get at this data only allows for 256 possible leafs. But it also
allows for 256 bits of either sub leafs or specifying no subleaf. So total
possibly values is 65536. There is another field CPUID_VALID, which says "Non-
architectural" and doesn't seem to have anything on the subleafs either.

We could ask to make all future bits configurable and default 0. For the simple
instruction support type CPUID bits, I don't know if there would always be a
functional problem for KVM if they became the native value. But this requirement
would make things more consistent. Even if we have a new metadata API to get the
specifically fixed bits, it doesn't help with the problem of new bits that start
as host value or fixed 1. So I think TDX module guarantees are the only way to
deal with new bits like KVM does today.

We could also ask for something more flexible, like: The TDX module cannot set
new bits to fixed 1 or native value, unless it won't break any existing kernels
or userspace.

This way for some simple new instruction bits they can leave them native/fixed
and it could simplify the TDX module. But then the knowledge of what bits are
set becomes inconsistent.

Hmm...