Re: [PATCH v19 023/130] KVM: TDX: Initialize the TDX module when loading the KVM intel kernel module

From: Huang, Kai
Date: Mon Apr 08 2024 - 20:38:39 EST




On 26/02/2024 9:25 pm, Yamahata, Isaku wrote:
+struct tdx_enabled {
+ cpumask_var_t enabled;
+ atomic_t err;
+};
+
+static void __init tdx_on(void *_enable)
+{
+ struct tdx_enabled *enable = _enable;
+ int r;
+
+ r = vmx_hardware_enable();
+ if (!r) {
+ cpumask_set_cpu(smp_processor_id(), enable->enabled);
+ r = tdx_cpu_enable();
+ }
+ if (r)
+ atomic_set(&enable->err, r);
+}
+
+static void __init vmx_off(void *_enabled)
+{
+ cpumask_var_t *enabled = (cpumask_var_t *)_enabled;
+
+ if (cpumask_test_cpu(smp_processor_id(), *enabled))
+ vmx_hardware_disable();
+}
+
+int __init tdx_hardware_setup(struct kvm_x86_ops *x86_ops)
+{
+ struct tdx_enabled enable = {
+ .err = ATOMIC_INIT(0),
+ };
+ int r = 0;
+
+ if (!enable_ept) {
+ pr_warn("Cannot enable TDX with EPT disabled\n");
+ return -EINVAL;
+ }
+
+ if (!zalloc_cpumask_var(&enable.enabled, GFP_KERNEL)) {
+ r = -ENOMEM;
+ goto out;
+ }
+
+ /* tdx_enable() in tdx_module_setup() requires cpus lock. */
+ cpus_read_lock();
+ on_each_cpu(tdx_on, &enable, true); /* TDX requires vmxon. */
+ r = atomic_read(&enable.err);
+ if (!r)
+ r = tdx_module_setup();
+ else
+ r = -EIO;

I was thinking why do we need to convert to -EIO.

Convert error code to -EIO unconditionally would cause the original error code being lost. Although given tdx_on() is called on all online cpus in parallel, the @enable.err could be imprecise anyway, the explicit conversion seems not quite reasonable to be done _here_.

I think it would be more reasonable to explicitly set the error code to -EIO in tdx_on(), where we _exactly_ know what went wrong and can still possibly do something before losing the error code.

E.g., we can dump the error code to the user, but looks both vmx_hardware_enable() and tdx_cpu_enable() will do so already so we can safely lose the error code there.

We can perhaps add a comment to point this out before losing the error code if that's better:

/*
* Both vmx_hardware_enable() and tdx_cpu_enable() print error
* message when they fail. Just convert the error code to -EIO
* when multiple cpu fault the @err cannot be used to precisely
* record the error code for them anyway.
*/

+ on_each_cpu(vmx_off, &enable.enabled, true);
+ cpus_read_unlock();
+ free_cpumask_var(enable.enabled);
+
+out:
+ return r;
+}