Re: [PATCH v5 34/34] KVM: x86/vmx: execute "int $2" to handle NMI in NMI caused VM exits when FRED is enabled

From: Sean Christopherson
Date: Wed Mar 22 2023 - 19:43:49 EST


On Wed, Mar 22, 2023, andrew.cooper3@xxxxxxxxxx wrote:
> On 22/03/2023 5:49 pm, Sean Christopherson wrote:
> > On Mon, Mar 06, 2023, Xin Li wrote:
> >> Execute "int $2" to handle NMI in NMI caused VM exits when FRED is enabled.
> >>
> >> Like IRET for IDT, ERETS/ERETU are required to end the NMI handler for FRED
> >> to unblock NMI ASAP (w/ bit 28 of CS set).
> > That's "CS" on the stack, correct? Is bit 28 set manually by software, or is it
> > set automatically by hardware? If it's set by hardware, does "int $2" actually
> > set the bit since it's not a real NMI?
>
> int $2 had better not set it... This is the piece of state that is
> intended to cause everything which isn't a real NMI to nest properly
> inside a real NMI.
>
> It is supposed to be set on delivery of an NMI, and act as the trigger
> for ERET{U,S} to drop the latch.
>
> Software can set it manually in a FRED-frame in order to explicitly
> unblock NMIs.

Ah, found this in patch 19. That hunk really belongs in this patch, because this
patch is full of magic without that information.

+ /*
+ * VM exits induced by NMIs keep NMI blocked, and we do
+ * "int $2" to reinject the NMI w/ NMI kept being blocked.
+ * However "int $2" doesn't set the nmi bit in the FRED
+ * stack frame, so we explicitly set it to make sure a
+ * later ERETS will unblock NMI immediately.
+ */
+ regs->nmi = 1;

Organization aside, this seems to defeat the purpose of _not_ unconditionally
unmasking NMIs on ERET since the kernel assumes any random "int $2" is coming from
KVM after an NMI VM-Exit.
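
To make that concern concrete, here is a toy user-space model, not kernel code.
Every name in it (cpu_model, frame_model, deliver_nmi, deliver_int2, erets,
blocked_after_nested_int2) is made up for illustration, and it assumes only the
behavior described in this thread: real NMI delivery sets the frame's NMI bit,
"int $2" does not, and ERETS drops the NMI-blocking latch only when the bit is
set. The nested stray "int $2" is a hypothetical scenario.

```c
#include <stdbool.h>

/* Toy user-space model; every name below is hypothetical, not kernel API. */
struct cpu_model   { bool nmi_blocked; };  /* the hardware NMI-blocking latch */
struct frame_model { bool nmi; };          /* stand-in for the frame's NMI bit */

/* Real NMI delivery: NMIs become blocked and the frame bit is set. */
static struct frame_model deliver_nmi(struct cpu_model *cpu)
{
	cpu->nmi_blocked = true;
	return (struct frame_model){ .nmi = true };
}

/* "int $2": the latch is untouched and the frame bit is left clear. */
static struct frame_model deliver_int2(void)
{
	return (struct frame_model){ .nmi = false };
}

/* ERETS: drop the latch only when the frame bit is set. */
static void erets(struct cpu_model *cpu, const struct frame_model *frame)
{
	if (frame->nmi)
		cpu->nmi_blocked = false;
}

/*
 * A real NMI is being handled and a stray "int $2" nests inside it.
 * Returns whether NMIs are still blocked once the nested frame returns.
 */
static bool blocked_after_nested_int2(bool handler_forces_bit)
{
	struct cpu_model cpu = { .nmi_blocked = false };
	struct frame_model outer = deliver_nmi(&cpu);
	struct frame_model inner = deliver_int2();

	(void)outer;                 /* the outer frame has not returned yet */
	if (handler_forces_bit)
		inner.nmi = true;    /* models the quoted "regs->nmi = 1" */
	erets(&cpu, &inner);
	return cpu.nmi_blocked;
}
```

In this model blocked_after_nested_int2(false) leaves NMIs blocked until the
outer frame returns, while blocked_after_nested_int2(true) unblocks them as soon
as the nested frame returns, which is the unconditional unmasking the bit was
meant to avoid.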

Eww, and "int $2" doesn't even go directly to fred_exc_nmi(), it trampolines
through fred_sw_interrupt_kernel() first. Looks like "int $2" from userspace gets
routed to a #GP, so at least that bit is handled.
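
As a rough picture of that routing, again a toy sketch rather than the actual
FRED entry code: the enum and function names are hypothetical, and only the two
vector-2 cases stated above are modelled.

```c
/* Toy routing model; names are hypothetical and only vector 2 is covered. */
enum origin  { FROM_USER, FROM_KERNEL };
enum outcome { OUT_GP, OUT_NMI_HANDLER, OUT_NOT_MODELLED };

static enum outcome route_int(enum origin from, unsigned int vector)
{
	if (vector != 2)
		return OUT_NOT_MODELLED;   /* other vectors are out of scope here */
	if (from == FROM_USER)
		return OUT_GP;             /* "int $2" from userspace ends up as #GP */
	/* Kernel case: fred_sw_interrupt_kernel() first, then fred_exc_nmi(). */
	return OUT_NMI_HANDLER;
}
```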

I'm not dead set against the proposed approach, but IMO it's not obviously better
than a bit of assembly to have a more direct call into the NMI handler.