Re: [PATCH v19 103/130] KVM: TDX: Handle EXIT_REASON_OTHER_SMI with MSMI

From: Isaku Yamahata
Date: Tue May 07 2024 - 16:55:56 EST


On Tue, May 07, 2024 at 03:06:57PM +0800,
Binbin Wu <binbin.wu@xxxxxxxxxxxxxxx> wrote:

> > > > + /*
> > > > + * The only reason it gets EXIT_REASON_OTHER_SMI is there is an
> > > > + * #MSMI(Machine Check System Management Interrupt) with
> > > > + * exit_qualification bit 0 set in TD guest.
> > > > + * The #MSMI is delivered right after SEAMCALL returns,
> > > > + * and an #MC is delivered to host kernel after SMI handler
> > > > + * returns.
> > > > + *
> > > > + * The #MC right after SEAMCALL is fixed up and skipped in #MC
> > > Looks fixing up and skipping #MC on the first instruction after TD-exit is
> > > missing in v19?
> > Right. We removed it as MSMI will provides if #MC happened in SEAM or not.
>
> According to the patch of host #MC handler
> https://lore.kernel.org/lkml/171265126376.10875.16864387954272613660.tip-bot2@tip-bot2/,
> the #MC triggered by MSMI can be handled by kernel #MC handler.
> There is no need to call kvm_machine_check().
>
> Does the following fixup make sense to you?

Yes. Now this patch becomes mostly none. So it makes more sense to squash
this patch into [PATCH v19 100/130] KVM: TDX: handle EXIT_REASON_OTHER_SMI.

> --------
>
> KVM: TDX: Handle EXIT_REASON_OTHER_SMI
>
> Handle "Other SMI" VM exit for TDX.
>
> Unlike VMX, an SMI occurs in SEAM non-root mode cause VM exit to TDX
> module, then SEAMRET to KVM. Once it exits to KVM, SMI is delivered and
> handled by kernel handler right away.
>
> Specifically, when BIOS eMCA MCE-SMI morphing is enabled, the #MC occurs
> in TDX guest is delivered as an Machine Check System Management Interrupt
> (MSMI) with the exit reason of EXIT_REASON_OTHER_SMI with MSMI (bit 0) set
> in the exit qualification.  On VM exit, TDX module checks whether the "Other
> SMI" is caused by a MSMI or not.  If so, TDX module makes TD as fatal,
> preventing further TD entries, and then completes the TD exit flow to KVM
> with the TDH.VP.ENTER outputs indicating TDX_NON_RECOVERABLE_TD.
> After TD exit, the MSMI is delivered and eventually handled by the kernel
> #MC handler[1].
>
> So, to handle "Other SMI" VM exit:
> - For non-MSMI case, KVM doesn't need to do anything, just continue TDX vCPU
>   execution.
> - For MSMI case, since the TDX guest is dead, follow other non-recoverable
>   cases, exit to userspace.
>
> [1] The patch supports handling MSMI signaled during SEAM operation.
>     It's already in tip tree.
> https://lore.kernel.org/lkml/171265126376.10875.16864387954272613660.tip-bot2@tip-bot2/
>
> diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
> index 4ee94bfb17e2..fd756d231204 100644
> --- a/arch/x86/kvm/vmx/tdx.c
> +++ b/arch/x86/kvm/vmx/tdx.c
> @@ -975,30 +975,6 @@ void tdx_handle_exit_irqoff(struct kvm_vcpu *vcpu)
> tdexit_intr_info(vcpu));
>         else if (exit_reason == EXIT_REASON_EXCEPTION_NMI)
>                 vmx_handle_exception_irqoff(vcpu, tdexit_intr_info(vcpu));
> -       else if (unlikely(tdx->exit_reason.non_recoverable ||
> -                tdx->exit_reason.error)) {
> -               /*
> -                * The only reason it gets EXIT_REASON_OTHER_SMI is there is
> an
> -                * #MSMI(Machine Check System Management Interrupt) with
> -                * exit_qualification bit 0 set in TD guest.
> -                * The #MSMI is delivered right after SEAMCALL returns,
> -                * and an #MC is delivered to host kernel after SMI handler
> -                * returns.
> -                *
> -                * The #MC right after SEAMCALL is fixed up and skipped in
> #MC
> -                * handler because it's an #MC happens in TD guest we cannot
> -                * handle it with host's context.
> -                *
> -                * Call KVM's machine check handler explicitly here.
> -                */
> -               if (tdx->exit_reason.basic == EXIT_REASON_OTHER_SMI) {
> -                       unsigned long exit_qual;
> -
> -                       exit_qual = tdexit_exit_qual(vcpu);
> -                       if (exit_qual & TD_EXIT_OTHER_SMI_IS_MSMI)
> -                               kvm_machine_check();
> -               }
> -       }
>  }
>
>  static int tdx_handle_exception(struct kvm_vcpu *vcpu)
> @@ -1923,10 +1899,6 @@ int tdx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t
> fastpath)
>                               to_kvm_tdx(vcpu->kvm)->hkid,
>                               set_hkid_to_hpa(0,
> to_kvm_tdx(vcpu->kvm)->hkid));
>
> -               /*
> -                * tdx_handle_exit_irqoff() handled EXIT_REASON_OTHER_SMI. 
> It
> -                * must be handled before enabling preemption because it's
> #MC.
> -                */
>                 goto unhandled_exit;
>         }
>
> @@ -1970,14 +1942,14 @@ int tdx_handle_exit(struct kvm_vcpu *vcpu,
> fastpath_t fastpath)
>                 return tdx_handle_ept_misconfig(vcpu);
>         case EXIT_REASON_OTHER_SMI:
>                 /*
> -                * Unlike VMX, all the SMI in SEAM non-root mode (i.e. when
> -                * TD guest vcpu is running) will cause TD exit to TDX
> module,
> -                * then SEAMRET to KVM. Once it exits to KVM, SMI is
> delivered
> -                * and handled right away.
> +                * Unlike VMX, SMI occurs in SEAM non-root mode (i.e. when
> +                * TD guest vCPU is running) will cause VM exit to TDX
> module,
> +                * then SEAMRET to KVM.  Once it exits to KVM, SMI is
> delivered
> +                * and handled by kernel handler right away.
>                  *
> -                * - If it's an Machine Check System Management Interrupt
> -                *   (MSMI), it's handled above due to non_recoverable bit
> set.
> -                * - If it's not an MSMI, don't need to do anything here.
> +                * - A MSMI will not reach here, it's handled as
> non_recoverable
> +                *   case above.
> +                * - If it's not a MSMI, no need to do anything here.
>                  */
>                 return 1;
>         default:
> diff --git a/arch/x86/kvm/vmx/tdx_arch.h b/arch/x86/kvm/vmx/tdx_arch.h
> index 2aecffe9f276..aa2fea7b2652 100644
> --- a/arch/x86/kvm/vmx/tdx_arch.h
> +++ b/arch/x86/kvm/vmx/tdx_arch.h
> @@ -41,8 +41,6 @@
>  #define TDH_PHYMEM_PAGE_WBINVD         41
>  #define TDH_VP_WR                      43
>
> -#define TD_EXIT_OTHER_SMI_IS_MSMI      BIT_ULL(1)
> -
>  /* TDX control structure (TDR/TDCS/TDVPS) field access codes */
>  #define TDX_NON_ARCH                   BIT_ULL(63)
>  #define TDX_CLASS_SHIFT                        56
>
>
>

--
Isaku Yamahata <isaku.yamahata@xxxxxxxxx>