Re: [PATCH] x86, UV: Fix NMI handler for UV platforms

From: Don Zickus
Date: Tue Mar 22 2011 - 14:45:23 EST


On Tue, Mar 22, 2011 at 12:11:18PM -0500, Jack Steiner wrote:
> How certain are you that multiple NMIs triggered at about the same time will
> deliver discrete NMI events? I updated the patch so that I'm running with:

I think as long as there aren't more than two (1 active, 1 latched), you
would be ok. A third one looks like it would get dropped.
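
[Illustration: the "1 active, 1 latched" behaviour described above can be
sketched as a toy userspace model. This is not kernel code. The struct and
function names (nmi_model, nmi_raise, nmi_return) are hypothetical, and the
model assumes exactly one latch slot, as the reply suggests.]

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: one NMI in service, at most one latched behind it. */
struct nmi_model {
    bool active;        /* a handler is currently running */
    bool latched;       /* one further NMI is waiting to be delivered */
    unsigned delivered; /* NMIs that reached (or will reach) a handler */
    unsigned dropped;   /* NMIs lost because the latch was already full */
};

/* Some source asserts NMI. */
static void nmi_raise(struct nmi_model *m)
{
    if (!m->active) {
        m->active = true;       /* delivered immediately */
        m->delivered++;
    } else if (!m->latched) {
        m->latched = true;      /* held until the current handler returns */
    } else {
        m->dropped++;           /* third concurrent NMI: nowhere to put it */
    }
}

/* The running handler returns; a latched NMI, if any, is delivered next. */
static void nmi_return(struct nmi_model *m)
{
    assert(m->active);
    m->active = false;
    if (m->latched) {
        m->latched = false;
        m->active = true;
        m->delivered++;
    }
}
```

In this model, three NMIs raised before the first handler returns result in
two deliveries and one drop, so a handler that claims only one source per
invocation can leave another source unserviced.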

>
> - no special code in traps.c (I removed the traps.c code that was
> in the patch I posted)
> - used die_notifier for calling the UV nmi handler
> - UV priority is higher than the hw_perf priority
>
> Both hw_perf (perf top) & UV NMIs work correctly under light loads. However, if I
> run for 10 - 15 minutes injecting UV NMIs at a rate of about 30/min, "perf top"
> stops generating output. Strace shows that it continues to poll() but no data
> is received.

That's a low frequency and it still gets stuck?

>
> While "perf top" is hung, if I inject an NMI into the system in a way that will NOT
> be consumed by the UV nmi handler, "perf top" resumes output but will stop again after
> a few minutes.

So that means the PMU set its interrupt bit but the cpu failed to get the
NMI.

>
>
> AFAICT, the UV nmi handler is not consuming extra NMI interrupts. I can't
> rule out that I'm missing something but I don't see it.

What happens if you put the UV nmi handler below the hw_perf handler in
priority? I assume the DIE_NMIUNKNOWN snippet in the hw_perf handler will
swallow some of the UV NMIs, but more importantly does it still generate
the hang you see?
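
[Illustration: a toy userspace sketch of why handler priority matters here.
It is not the kernel's notifier API. NOTIFY_DONE/NOTIFY_STOP are borrowed
names with local values, and nmi_state, perf_handler, uv_handler and
dispatch_nmi are hypothetical. It also collapses the separate DIE_NMI and
DIE_NMIUNKNOWN events into a single pass and reduces perf's "skip the extra
NMI" logic to one flag.]

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

enum { NOTIFY_DONE = 0, NOTIFY_STOP = 1 }; /* local values, sketch only */

struct nmi_state {
    int perf_overflow;     /* PMU has a counter overflow pending */
    int perf_skip_unknown; /* perf expects one spurious NMI and will eat it */
    int uv_pending;        /* a UV NMI was injected */
    int perf_handled, uv_handled, swallowed;
};

struct handler {
    int priority;                      /* higher runs first */
    int (*fn)(struct nmi_state *s);
};

static int perf_handler(struct nmi_state *s)
{
    if (s->perf_overflow) {
        s->perf_overflow = 0;
        s->perf_handled++;
        return NOTIFY_STOP;
    }
    if (s->perf_skip_unknown) {        /* claims an NMI it did not cause */
        s->perf_skip_unknown = 0;
        s->swallowed++;
        return NOTIFY_STOP;
    }
    return NOTIFY_DONE;
}

static int uv_handler(struct nmi_state *s)
{
    if (!s->uv_pending)
        return NOTIFY_DONE;
    s->uv_pending = 0;
    s->uv_handled++;
    return NOTIFY_STOP;
}

static int by_priority_desc(const void *a, const void *b)
{
    const struct handler *ha = a, *hb = b;
    return (hb->priority > ha->priority) - (hb->priority < ha->priority);
}

/* Run handlers from highest to lowest priority; first claimant wins. */
static const struct handler *dispatch_nmi(struct handler *chain, size_t n,
                                          struct nmi_state *s)
{
    assert(chain != NULL);
    qsort(chain, n, sizeof(*chain), by_priority_desc);
    for (size_t i = 0; i < n; i++)
        if (chain[i].fn(s) == NOTIFY_STOP)
            return &chain[i];
    return NULL;                       /* nobody claimed it: unknown NMI */
}
```

With the UV handler at the higher priority, a pending UV NMI is claimed by
uv_handler. With the priorities swapped and perf_skip_unknown set, the perf
handler claims the same NMI first and uv_pending is left set, which is the
swallowing effect the question above anticipates.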

>
>
> Do you have any ideas or clues???

Part of the problem is that most of the NMI testing is done with perf and maybe
kgdb. So high frequency NMI sharing is probably exposing more bugs.

Also, is it a problem to move your testing onto the latest upstream code
instead of RHEL-6? Not all the latest NMI work is there. I want to make
sure we are all starting at the same code. :-)

Cheers,
Don

>
>
> >
> > >
> > > The root cause of the problem is that architecturally, x86 does not
> > > have a way to identify the source(s) that cause an NMI. If multiple
> > > events occur at about the same time, there is no way that I can see that the
> > > OS can detect it.
> >
> > There are registers we can check to see who triggered the NMI (at least
> > for the perf code, the SGI code maybe not, which is why I set it to a
> > lower priority to be a catch-all).
> >
> > I'm not aware of the x86 architecture dropping NMIs, so they should all
> > get processed. It is just a matter of which subsystems get to determine if
> > they are the source of the NMI or not.
> >
> > >
> > > >
> > > > My first impression is the skip nmi logic in the perf handler is probably
> > > > accidentally thinking the SGI external nmi is the perf's 'extra' nmi it is
> > > > supposed to skip and thus swallows it. At least that is the impression I
> > >
> > > Agree
> > >
> > >
> > > > get from the RedHat bugzilla which says SGI is running 'perf top', getting
> > > > a hang, then pressing their nmi button to see the stack traces.
> > > >
> > > > Jack,
> > > >
> > > > I worked through a number of these issues upstream and I already talked to
> > > > George and Russ over here at RedHat about working through the issue over
> > > > here with them. They can help me get access to your box to help debug.
> > >
> > > Russ is right down the hall.
> >
> > Great!
> >
> > Cheers,
> > Don