RE: [PATCH v9 07/18] x86/virt/tdx: Do TDX module per-cpu initialization

From: Huang, Kai
Date: Mon Feb 13 2023 - 16:14:04 EST


> On 2/13/23 03:59, Kai Huang wrote:
> > @@ -247,8 +395,17 @@ int tdx_enable(void)
> > ret = __tdx_enable();
> > break;
> > case TDX_MODULE_INITIALIZED:
> > - /* Already initialized, great, tell the caller. */
> > - ret = 0;
> > + /*
> > + * The previous call of __tdx_enable() may only have
> > + * initialized part of present cpus during module
> > + * initialization, and new cpus may have become online
> > + * since then.
> > + *
> > + * To make sure all online cpus are TDX-runnable, always
> > + * do per-cpu initialization for all online cpus here
> > + * even the module has been initialized.
> > + */
> > + ret = __tdx_enable_online_cpus();
>
> I'm missing something here. CPUs get initialized through either:
>
> 1. __tdx_enable(), for the CPUs around at the time
> 2. tdx_cpu_online(), for hotplugged CPUs after __tdx_enable()
>
> But, this is a third class. CPUs that came online after #1, but which got missed
> by #2. How can that happen?

(Replying via Microsoft Outlook because my Evolution suddenly stopped working after a Fedora update.)

Currently we depend on KVM's CPU hotplug callback to call tdx_cpu_online(). The problem is that KVM's callback goes away when the KVM module is unloaded.

For example:

1) The KVM module is loaded while CPUs 0, 1, 2 are online and CPUs 3, 4, 5 are offline.
2) __tdx_enable() gets called. LP.INIT is done on CPUs 0, 1, 2.
3) KVM gets unloaded. Its CPU hotplug callbacks are removed too.
4) CPU 3 comes online. tdx_cpu_online() is not called for it because KVM's CPU hotplug callback is gone.

So later if KVM gets loaded again, we need to go through __tdx_enable_online_cpus() to do LP.INIT for CPU 3 as it's already online.
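
To make the sequence above concrete, here is a small userspace model of it. This is only a sketch and not kernel code: every sim_* name is made up for illustration, the "module init" step is reduced to a flag, and LP.INIT is reduced to setting a per-cpu boolean. It assumes the only path that does per-cpu init on hotplug is the KVM-owned callback, as described above.

```c
#include <assert.h>
#include <stdbool.h>

#define SIM_NR_CPUS 6	/* hypothetical: CPUs 0..5 as in the example */

/* Hypothetical model state, not the kernel's data structures. */
struct sim_state {
	bool module_initialized;		/* stands in for TDX_MODULE_INITIALIZED */
	bool kvm_loaded;			/* KVM's hotplug callback is registered */
	bool online[SIM_NR_CPUS];
	bool lp_init_done[SIM_NR_CPUS];	/* per-cpu init (LP.INIT) was done */
};

/* Models __tdx_enable_online_cpus(): per-cpu init for every online cpu. */
static void sim_enable_online_cpus(struct sim_state *s)
{
	for (unsigned int cpu = 0; cpu < SIM_NR_CPUS; cpu++)
		if (s->online[cpu])
			s->lp_init_done[cpu] = true;
}

/* Models tdx_enable(): the first call initializes the module; every call
 * (re)does per-cpu init for all cpus that are online right now. */
static void sim_tdx_enable(struct sim_state *s)
{
	if (!s->module_initialized)
		s->module_initialized = true;
	sim_enable_online_cpus(s);
}

/* Loading KVM registers the hotplug callback and enables TDX. */
static void sim_kvm_load(struct sim_state *s)
{
	s->kvm_loaded = true;
	sim_tdx_enable(s);
}

/* Unloading KVM removes the hotplug callback. */
static void sim_kvm_unload(struct sim_state *s)
{
	s->kvm_loaded = false;
}

/* A cpu comes online; per-cpu init only runs if KVM's callback exists. */
static void sim_cpu_online(struct sim_state *s, unsigned int cpu)
{
	assert(cpu < SIM_NR_CPUS);
	s->online[cpu] = true;
	if (s->kvm_loaded)
		s->lp_init_done[cpu] = true;
}

/* True if every online cpu has had per-cpu init done. */
static bool sim_all_online_cpus_ready(const struct sim_state *s)
{
	for (unsigned int cpu = 0; cpu < SIM_NR_CPUS; cpu++)
		if (s->online[cpu] && !s->lp_init_done[cpu])
			return false;
	return true;
}
```

Walking steps 1) to 4) through this model: after sim_kvm_load() with CPUs 0, 1, 2 online, then sim_kvm_unload(), then sim_cpu_online() for CPU 3, CPU 3 is online but has not had per-cpu init. A second sim_kvm_load() closes that gap only because sim_tdx_enable() redoes per-cpu init in the already-initialized case.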

Perhaps I didn't explain clearly in the comment. Below is the updated one:

/*
 * The previous call of __tdx_enable() may have initialized
 * only some of the present cpus during module
 * initialization, and new cpus may have come online
 * since then without per-cpu initialization.
 *
 * For example, a new CPU can come online while KVM is
 * unloaded, in which case tdx_cpu_online() is not called
 * since KVM's CPU online callback has been removed.
 *
 * To make sure all online cpus are TDX-runnable, always
 * do per-cpu initialization for all online cpus here,
 * even if the module has already been initialized.
 */