Re: [RFC][PATCH 3/3] x86: Add workaround to NMI iret woes

From: Steven Rostedt
Date: Fri Dec 09 2011 - 12:19:33 EST


[ added Boris as he's my AMD guy ]

On Fri, 2011-12-09 at 11:34 -0500, Steven Rostedt wrote:
> On Thu, 2011-12-08 at 21:43 -0500, Steven Rostedt wrote:
>
> > diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
> > index a8e3eb8..906a02a 100644
> > --- a/arch/x86/kernel/traps.c
> > +++ b/arch/x86/kernel/traps.c
> > @@ -723,4 +723,9 @@ void __init trap_init(void)
> > cpu_init();
> >
> > x86_init.irqs.trap_init();
> > +
> > +#ifdef CONFIG_X86_64
> > + memcpy(&nmi_idt_table, &idt_table, IDT_ENTRIES * 16);
> > + set_nmi_gate(1, &debug);
>
> Frederic Weisbecker told me on IRC that int3 is vector 3 (#BP), not
> vector 1 (#DB). I need to also add that:
>
> set_nmi_gate(3, &int3);
>
> Mathieu says we need to worry about MCEs, so maybe we can add that stack
> as well.
>
> set_nmi_gate(18, &machine_check);
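
For readers following along, here is a rough user-space model of what the
hunk above is doing: copy the normal IDT into a second table, then
override the #DB and #BP entries in the copy. This is only a sketch. The
struct layout, the names ending in _model, and the assumption that the
NMI-variant gates simply use IST index 0 (no stack switch) are
illustrative and not taken from the patch; vector 18 is left out because
it is still an open question.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define IDT_ENTRIES 256

/* Simplified stand-in for a 16-byte x86-64 gate (hypothetical layout). */
struct gate_model {
	uint64_t handler;	/* handler address */
	uint64_t ist;		/* IST index; 0 = stay on the current stack */
};

static struct gate_model idt_table[IDT_ENTRIES];
static struct gate_model nmi_idt_table[IDT_ENTRIES];

/* Hypothetical helper: install a gate that does not switch stacks. */
static void set_nmi_gate_model(unsigned int vector, uint64_t handler)
{
	assert(vector < IDT_ENTRIES);
	nmi_idt_table[vector].handler = handler;
	nmi_idt_table[vector].ist = 0;	/* assumption: no IST stack switch */
}

/* Copy the normal table, then override #DB (1) and #BP (3) in the copy. */
static void build_nmi_idt_model(void)
{
	memcpy(nmi_idt_table, idt_table, sizeof(idt_table));
	set_nmi_gate_model(1, idt_table[1].handler);
	set_nmi_gate_model(3, idt_table[3].handler);
}
```

All other vectors in the copy keep whatever the normal table had, so only
the overridden entries behave differently while the NMI table is loaded.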

Looking at the documentation, I'm not sure NMIs can interrupt an MCE.
The MCE is higher up on the exception priority chart (thanks Peter for
pointing that out). But the documentation is vague at best.

Boris, H. Peter,

Could you shed some light on this? Can an NMI interrupt an MCE in
progress?

Of course, if it can, then NMI->MCE->NMI nesting could happen too.
And this problem exists today. Actually, just having an MCE happen
inside an NMI can cause the NMI->NMI problem as well.

-- Steve


>
>
> If we make NMIs not modify any stack, then we can remove the "NMI
> executing variable" on the stack, as any nested NMI will see that it
> preempted an NMI just by checking the stack. We have to check it anyway,
> so removing the other check may be good to do.
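
A minimal sketch of the "just check the stack" test mentioned in the
quoted paragraph, written as plain C rather than entry-code assembly. The
function name, the stack size constant, and the idea of passing the stack
top as a parameter are all assumptions made for illustration; the actual
check would live in the NMI entry path and compare the interrupted stack
pointer saved in the exception frame.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical size of the dedicated NMI stack, in bytes. */
#define NMI_STACK_SIZE_MODEL 4096u

/*
 * Return true if the interrupted stack pointer lies on the NMI stack,
 * i.e. in the range (stack_top - size, stack_top]. The stack grows
 * down, so stack_top is the highest address.
 */
static bool interrupted_on_nmi_stack(uintptr_t sp, uintptr_t stack_top)
{
	assert(stack_top >= NMI_STACK_SIZE_MODEL);	/* avoid underflow */
	return sp <= stack_top && sp > stack_top - NMI_STACK_SIZE_MODEL;
}
```

This only works as a nesting test if nothing that runs inside the NMI
moves to a different stack, which is the condition the quoted paragraph
starts from.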


