RE: Yet more softlockups.

From: Seiji Aguchi
Date: Fri Jul 05 2013 - 14:21:36 EST




> -----Original Message-----
> From: H. Peter Anvin [mailto:hpa@xxxxxxxxx]
> Sent: Friday, July 05, 2013 12:41 PM
> To: Thomas Gleixner
> Cc: Dave Jones; Linus Torvalds; Linux Kernel; Ingo Molnar; Peter Zijlstra; Seiji Aguchi
> Subject: Re: Yet more softlockups.
>
> On 07/05/2013 09:02 AM, Thomas Gleixner wrote:
> > On Fri, 5 Jul 2013, Dave Jones wrote:
> >> On Fri, Jul 05, 2013 at 05:15:07PM +0200, Thomas Gleixner wrote:
> >> > On Fri, 5 Jul 2013, Dave Jones wrote:
> >> >
> >> > > BUG: soft lockup - CPU#3 stuck for 23s! [trinity-child1:14565]
> >> > > perf samples too long (2519 > 2500), lowering kernel.perf_event_max_sample_rate to 50000
> >> > > INFO: NMI handler (perf_event_nmi_handler) took too long to run: 238147.002 msecs
> >> >
> >> > So we see a softlockup of 23 seconds and the perf_event_nmi_handler
> >> > claims it did run 23.8 seconds.
> >> >
> >> > Are there more instances of NMI handler messages ?
> >>
> >> [ 2552.006181] perf samples too long (2511 > 2500), lowering kernel.perf_event_max_sample_rate to 50000
> >> [ 2552.008680] INFO: NMI handler (perf_event_nmi_handler) took too long to run: 500392.002 msecs
> >
> > Yuck. Spending 50 seconds in NMI context surely explains a softlockup :)
> >
>
> Hmmm... this makes me wonder if the interrupt tracepoint stuff is at
> fault here, as it changes the IDT handling for NMI context.

This softlockup happened while the interrupt tracepoints were disabled.
If they had been enabled, "smp_trace_apic_timer_interrupt" would be displayed
instead of "smp_apic_timer_interrupt" in the call trace below.

But I can't say anything yet about how this issue is related to the
tracepoint stuff. I need to reproduce it on my machine first.

Call Trace:
<IRQ>
[<ffffffff8105424f>] __do_softirq+0xff/0x440
[<ffffffff8105474d>] irq_exit+0xcd/0xe0
[<ffffffff816f5fcb>] smp_apic_timer_interrupt+0x6b/0x9b
[<ffffffff816f512f>] apic_timer_interrupt+0x6f/0x80
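
For anyone who wants to run the same check over a saved log, here is a rough
sketch in Python. It only looks for the two handler names mentioned above in
the symbol column of a call trace. The function names (symbols,
tracepoints_enabled) and the regular expression are my own assumptions for
illustration, not part of any kernel tooling, and the regex assumes the
"[<address>] symbol+0xoff/0xlen" line format shown in this trace.

```python
import re

# Handler names as they appear in the call trace, depending on whether the
# interrupt tracepoints are enabled.
TRACE_HANDLER = "smp_trace_apic_timer_interrupt"
PLAIN_HANDLER = "smp_apic_timer_interrupt"

# Assumed line format: "[<ffffffff8105424f>] __do_softirq+0xff/0x440"
SYMBOL_RE = re.compile(r"\[<[0-9a-f]+>\]\s+([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def symbols(trace):
    """Return the symbol names found in a call trace, in order."""
    return SYMBOL_RE.findall(trace)


def tracepoints_enabled(trace):
    """Guess the tracepoint state from the APIC timer handler in the trace.

    Returns True if the tracing variant of the handler is present, False if
    the plain handler is present, and None if neither one appears.
    """
    found = symbols(trace)
    if TRACE_HANDLER in found:
        return True
    if PLAIN_HANDLER in found:
        return False
    return None


# The call trace from this report.
TRACE = """\
<IRQ>
[<ffffffff8105424f>] __do_softirq+0xff/0x440
[<ffffffff8105474d>] irq_exit+0xcd/0xe0
[<ffffffff816f5fcb>] smp_apic_timer_interrupt+0x6b/0x9b
[<ffffffff816f512f>] apic_timer_interrupt+0x6f/0x80
"""

if __name__ == "__main__":
    print(tracepoints_enabled(TRACE))
```

Exact symbol matching is used on purpose, so that "apic_timer_interrupt" is
not mistaken for either handler by a substring match.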

Seiji