Re: [PATCH v8 13/16] x86/virt/tdx: Configure global KeyID on all packages

From: Huang, Kai
Date: Tue Jan 10 2023 - 05:16:17 EST


On Fri, 2023-01-06 at 14:49 -0800, Dave Hansen wrote:
> On 12/8/22 22:52, Kai Huang wrote:
> > After the list of TDMRs and the global KeyID are configured to the TDX
> > module, the kernel needs to configure the key of the global KeyID on all
> > packages using TDH.SYS.KEY.CONFIG.
> >
> > TDH.SYS.KEY.CONFIG needs to be done on one (any) cpu for each package.
> > Also, it cannot run concurrently on different cpus, so just use
> > smp_call_function_single() to do it one by one.
> >
> > Note to keep things simple, neither the function to configure the global
> > KeyID on all packages nor the tdx_enable() checks whether there's at
> > least one online cpu for each package. Also, neither of them explicitly
> > prevents any cpu from going offline. It is caller's responsibility to
> > guarantee this.
>
> OK, but does someone *actually* do this?

Please see my reply below, next to the code.

>
> > Intel hardware doesn't guarantee cache coherency across different
> > KeyIDs. The kernel needs to flush PAMT's dirty cachelines (associated
> > with KeyID 0) before the TDX module uses the global KeyID to access the
> > PAMT. Otherwise, those dirty cachelines can silently corrupt the TDX
> > module's metadata. Note this breaks TDX from functionality point of
> > view but TDX's security remains intact.
>
> Intel hardware doesn't guarantee cache coherency across
> different KeyIDs. The PAMTs are transitioning from being used
> by the kernel mapping (KeyId 0) to the TDX module's "global
> KeyID" mapping.
>
> This means that the kernel must flush any dirty KeyID-0 PAMT
> cachelines before the TDX module uses the global KeyID to access
> the PAMT. Otherwise, if those dirty cachelines were written
> back, they would corrupt the TDX module's metadata. Aside: This
> corruption would be detected by the memory integrity hardware on
> the next read of the memory with the global KeyID. The result
> would likely be fatal to the system but would not impact TDX
> security.

Thanks!

>
> > Following the TDX module specification, flush cache before configuring
> > the global KeyID on all packages. Given the PAMT size can be large
> > (~1/256th of system RAM), just use WBINVD on all CPUs to flush.
> >
> > Note if any TDH.SYS.KEY.CONFIG fails, the TDX module may already have
> > used the global KeyID to write any PAMT. Therefore, need to use WBINVD
> > to flush cache before freeing the PAMTs back to the kernel.
>
> s/need to// ^

Will do. Thanks.

>
>
> > diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
> > index ab961443fed5..4c779e8412f1 100644
> > --- a/arch/x86/virt/vmx/tdx/tdx.c
> > +++ b/arch/x86/virt/vmx/tdx/tdx.c
> > @@ -946,6 +946,66 @@ static int config_tdx_module(struct tdmr_info_list *tdmr_list, u64 global_keyid)
> > return ret;
> > }
> >
> > +static void do_global_key_config(void *data)
> > +{
> > + int ret;
> > +
> > + /*
> > + * TDH.SYS.KEY.CONFIG may fail with entropy error (which is a
> > + * recoverable error). Assume this is exceedingly rare and
> > + * just return error if encountered instead of retrying.
> > + */
> > + ret = seamcall(TDH_SYS_KEY_CONFIG, 0, 0, 0, 0, NULL, NULL);
> > +
> > + *(int *)data = ret;
> > +}
> > +
> > +/*
> > + * Configure the global KeyID on all packages by doing TDH.SYS.KEY.CONFIG
> > + * on one online cpu for each package. If any package doesn't have any
> > + * online
>
> This looks like it stopped mid-sentence.

Oops, I forgot to delete the broken sentence.

>
> > + * Note:
> > + *
> > + * This function neither checks whether there's at least one online cpu
> > + * for each package, nor explicitly prevents any cpu from going offline.
> > + * If any package doesn't have any online cpu then the SEAMCALL won't be
> > + * done on that package and the later step of TDX module initialization
> > + * will fail. The caller needs to guarantee this.
> > + */
>
> *Does* the caller guarantee it?
>
> You're basically saying, "this code needs $FOO to work", but you're not
> saying who *provides* $FOO.

In short, KVM can do something to help guarantee this, but won't guarantee it
100%.

Specifically, KVM won't actively try to bring up a cpu if some package has no
online cpu at all (see the first lore link below). But KVM can _check_ whether
this condition is met before calling tdx_init() and complain if it is not.
Meanwhile, if the condition is met, KVM can refuse to offline the last cpu of
each package (or any cpu) during module initialization.
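
To make the "check" part concrete, here is a minimal userspace sketch of the
condition, assuming a simple per-cpu table. It is not kernel code: struct
cpu_model, MAX_PACKAGES and all_packages_have_online_cpu() are hypothetical
names used only for illustration, and a real implementation would presumably
use the kernel's cpumask and topology helpers instead.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_PACKAGES 64	/* arbitrary bound for this sketch */

/* Hypothetical model of one logical cpu; not a kernel structure. */
struct cpu_model {
	int package_id;	/* physical package the cpu belongs to */
	bool online;
};

/*
 * Return true if every package in [0, nr_packages) has at least one
 * online cpu, i.e. the per-package SEAMCALL could be issued everywhere.
 */
static bool all_packages_have_online_cpu(const struct cpu_model *cpus,
					 size_t nr_cpus, int nr_packages)
{
	bool seen[MAX_PACKAGES] = { false };
	size_t i;
	int pkg;

	if (nr_packages <= 0 || nr_packages > MAX_PACKAGES)
		return false;

	/* Mark each package that has at least one online cpu. */
	for (i = 0; i < nr_cpus; i++) {
		pkg = cpus[i].package_id;
		if (cpus[i].online && pkg >= 0 && pkg < nr_packages)
			seen[pkg] = true;
	}

	/* Any unmarked package means the condition is not met. */
	for (pkg = 0; pkg < nr_packages; pkg++) {
		if (!seen[pkg])
			return false;
	}

	return true;
}
```

In this model the caller would run the check first and bail out with an error
if it returns false, rather than trying to bring up a cpu.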

And KVM needs similar handling anyway. Configuring the global KeyID is not the
only operation with this requirement: creating/destroying a TD (which involves
programming/reclaiming one TDX KeyID) also requires at least one online cpu in
each package.

There were discussions on the KVM side about how to handle this. IIUC the
solution is that KVM will:
1) fail to create a TD if any package has no online cpu.
2) refuse to offline the last cpu of each package while any _active_ TDX
guest is running.

https://lore.kernel.org/lkml/20221102231911.3107438-1-seanjc@xxxxxxxxxx/T/#m1ff338686cfcb7ba691cd969acc17b32ff194073
https://lore.kernel.org/lkml/de6b69781a6ba1fe65535f48db2677eef3ec6a83.1667110240.git.isaku.yamahata@xxxxxxxxx/

Thus TDX module initialization in KVM can be handled in a similar way.
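
As a rough illustration of rule 2) above, the offline decision could be
modelled as below. Again this is a userspace sketch under my own assumptions,
not the KVM implementation: struct cpu_model, must_refuse_offline() and the
nr_active_tds counter are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one logical cpu; not a kernel structure. */
struct cpu_model {
	int package_id;	/* physical package the cpu belongs to */
	bool online;
};

/* Return true if 'victim' is online and no other cpu in its package is. */
static bool is_last_online_cpu_in_package(const struct cpu_model *cpus,
					  size_t nr_cpus, size_t victim)
{
	size_t i;

	if (victim >= nr_cpus || !cpus[victim].online)
		return false;

	for (i = 0; i < nr_cpus; i++) {
		if (i != victim && cpus[i].online &&
		    cpus[i].package_id == cpus[victim].package_id)
			return false;
	}

	return true;
}

/*
 * Return true if offlining 'victim' should be refused: at least one TD is
 * active and 'victim' is the last online cpu of its package.
 */
static bool must_refuse_offline(const struct cpu_model *cpus, size_t nr_cpus,
				size_t victim, unsigned int nr_active_tds)
{
	if (nr_active_tds == 0)
		return false;

	return is_last_online_cpu_in_package(cpus, nr_cpus, victim);
}
```

The same predicate, with "module initialization in progress" in place of the
active-TD count, would model the refusal during TDX module initialization.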

Btw, in v7 (which had the per-lp init requirement on all cpus), tdx_init() did
an early check on whether all boot-time present cpus were online and simply
returned an error if that condition was not met. The difference here is that we
don't have any check but depend on the SEAMCALL to fail. To me there's no
fundamental difference.

[snip]

>
> > static int init_tdx_module(void)
> > {
> > /*
> > @@ -998,19 +1058,46 @@ static int init_tdx_module(void)
> > if (ret)
> > goto out_free_pamts;
> >
> > + /*
> > + * Hardware doesn't guarantee cache coherency across different
> > + * KeyIDs. The kernel needs to flush PAMT's dirty cachelines
> > + * (associated with KeyID 0) before the TDX module can use the
> > + * global KeyID to access the PAMT. Given PAMTs are potentially
> > + * large (~1/256th of system RAM), just use WBINVD on all cpus
> > + * to flush the cache.
> > + *
> > + * Follow the TDX spec to flush cache before configuring the
> > + * global KeyID on all packages.
> > + */
>
> I don't think this second paragraph adds very much clarity.
>
>

OK, will remove.