Re: [PATCH] perf: fix interrupt handler timing harness

From: Dave Hansen
Date: Mon Jul 08 2013 - 16:35:06 EST


On 07/08/2013 01:20 PM, Stephane Eranian wrote:
> On Mon, Jul 8, 2013 at 10:05 PM, Dave Hansen <dave.hansen@xxxxxxxxx> wrote:
>> If the interrupts _consistently_ take too long individually they can
>> starve out all the other CPU users. I saw no way to make them finish
>> faster, so the only recourse is to also drop the rate.
>>
> I think we need to investigate why some interrupts take so much time.
> Could be HW, could be SW. Not talking about old hardware here.
> Once we understand this, then we know maybe adjust the timing on
> our patch.

I spent quite a while looking at it on my hardware. It's difficult to
profile in NMIs, but I'm fairly satisfied that, at least on my system,
it is a NUMA issue which gets worse as I add cores.

I did quite a bit of ftracing to look for spots inside the handler
which were taking large amounts of time. There were none. The
execution time was spread very evenly over the entire NMI handler.
There didn't appear to be any individual hot cachelines, and the
handler wasn't doing something silly like sitting in a loop handling
lots of PMU events.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/