Re: Yet more softlockups.

From: Markus Trippelsdorf
Date: Wed Jul 10 2013 - 11:20:27 EST


On 2013.07.10 at 11:13 -0400, Dave Jones wrote:
> On Sat, Jul 06, 2013 at 09:24:08AM +0200, Ingo Molnar wrote:
> >
> > * Dave Jones <davej@xxxxxxxxxx> wrote:
> >
> > > On Fri, Jul 05, 2013 at 05:15:07PM +0200, Thomas Gleixner wrote:
> > > > On Fri, 5 Jul 2013, Dave Jones wrote:
> > > >
> > > > > BUG: soft lockup - CPU#3 stuck for 23s! [trinity-child1:14565]
> > > > > perf samples too long (2519 > 2500), lowering kernel.perf_event_max_sample_rate to 50000
> > > > > INFO: NMI handler (perf_event_nmi_handler) took too long to run: 238147.002 msecs
> > > >
> > > > So we see a softlockup of 23 seconds and the perf_event_nmi_handler
> > > > claims it did run 23.8 seconds.
> > > >
> > > > Are there more instances of NMI handler messages ?
> > >
> > > [ 2552.006181] perf samples too long (2511 > 2500), lowering kernel.perf_event_max_sample_rate to 50000
> > > [ 2552.008680] INFO: NMI handler (perf_event_nmi_handler) took too long to run: 500392.002 msecs
> >
> > Dave, could you pull in the latest perf fixes at:
> >
> > git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git perf/urgent
> >
> > In particular this:
> >
> > e5302920da9e perf: Fix interrupt handler timing harness
> >
> > could make a difference - if your tests somehow end up activating perf.
>
> Something is really fucked up in the kernel side of perf.
> I get this right after booting..
>
> [ 114.516619] perf samples too long (4262 > 2500), lowering kernel.perf_event_max_sample_rate to 50000

You can silence this warning by turning off the perf CPU-time limit:

echo 0 > /proc/sys/kernel/perf_cpu_time_max_percent
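
For reference, here is a rough sketch of where the 2500 in the message seems to come from. It assumes the per-sample budget is the sample period (1 second divided by perf_event_max_sample_rate) scaled by perf_cpu_time_max_percent, with defaults of 100000 and 25; that is my reading of the code, not something confirmed in this thread. The helper name perf_allowed_ns is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper, not part of the kernel or the perf tools.
# Assumption: budget_ns = (1e9 / max_sample_rate) * max_percent / 100
perf_allowed_ns() {
    rate=$1       # value of kernel.perf_event_max_sample_rate
    percent=$2    # value of kernel.perf_cpu_time_max_percent
    echo $(( 1000000000 / rate * percent / 100 ))
}

# Assumed defaults: this matches the "> 2500" limit in the log above.
perf_allowed_ns 100000 25

# After the kernel lowers the rate to 50000 the budget would double.
perf_allowed_ns 50000 25

# Same calculation with the live values, if the sysctl files are readable.
r=/proc/sys/kernel/perf_event_max_sample_rate
p=/proc/sys/kernel/perf_cpu_time_max_percent
if [ -r "$r" ] && [ -r "$p" ]; then
    perf_allowed_ns "$(cat "$r")" "$(cat "$p")"
fi
```

If that reading is right, a sample that takes 4262 ns is well over the 2500 ns budget, hence the message.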

--
Markus