Re: [PATCH][GIT PULL] trace,x86: Move creation of irq tracepoints from apic.c to irq.c

From: Steven Rostedt
Date: Sat Jun 22 2013 - 10:17:29 EST


On Fri, 2013-06-21 at 17:26 -0400, Steven Rostedt wrote:
> On Fri, 2013-06-21 at 17:09 -0400, Steven Rostedt wrote:
> > On Fri, 2013-06-21 at 13:31 -0400, Steven Rostedt wrote:
> >
> > > My testing also triggered another bug, but I'm not sure it's related to
> > > these patches or something that already existed. I'm currently
> > > investigating it now.
> >
> >
> > I haven't been able to reproduce it.
>
> Correction, I am able to reproduce it. The trick is I need to kick off
> the test as soon as I get a login prompt, as the gui is still coming up
> (this time the crash happened on pulseaudio). The test suite runs the
> test when it detects the login. I was trying to reproduce it manually,
> which was well after login was set up.
>
> I'll try to make sure I can reproduce it consistently, or at least in a
> short number of tries. And then remove all the patches and see if I can
> still trigger it.

OK, I can easily reproduce it when applying this commit:

629f4f9d59a27d8e58aa612e886e6a9a63ea7aeb
"x86: Rename variables for debugging"

Which doesn't just rename a variable but changes the way we update the
IDT for debugging.

After removing this commit, I cannot reproduce the dump. I'm thinking
that the IDT switch did something that caused this to happen. That seems
plausible, as lockdep (which is what is having issues here) is where
things are going wrong.

Oh, I think the issue is with this...

+
+/*
+ * the load_current_idt() is called with interrupt disabled by local_irq_save()
+ * to avoid races. That way the IDT will always be set back to the expected
+ * descriptor.
+ */
+static inline void load_current_idt(void)
+{
+	unsigned long flags;
+
+	local_irq_save(flags);
+	if (is_debug_idt_enabled())
+		load_debug_idt();
+	else
+		load_idt((const struct desc_ptr *)&idt_descr);
+	local_irq_restore(flags);
+}

It's not safe to call local_irq_save() here. From entry_64.S:

.macro TRACE_IRQS_OFF_DEBUG
	call debug_stack_set_zero
	TRACE_IRQS_OFF
	call debug_stack_reset
.endm

We must change the idt before we can trace irqs being disabled. The
local_irq_save() here is going to be traced by lockdep. Why do we need
to disable interrupts? It's pretty pointless since this same code can be
called by NMIs.
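
For illustration only, here is a minimal userspace sketch of the same
helper with the local_irq_save()/local_irq_restore() pair dropped. It is
not kernel code: struct desc_ptr, the descriptors, loaded_idt and
debug_idt_ctr are stand-ins, and the bodies of debug_stack_set_zero()
and debug_stack_reset() are a simplified assumption (a nesting counter)
used only to show the call ordering.

```c
#include <stdbool.h>

/* Stand-in for the kernel's descriptor pointer type (assumption). */
struct desc_ptr {
	unsigned short size;
	unsigned long address;
};

static struct desc_ptr idt_descr;        /* models the normal IDT descriptor */
static struct desc_ptr debug_idt_descr;  /* models the debug IDT descriptor */
static const struct desc_ptr *loaded_idt; /* models what the CPU has loaded */
static int debug_idt_ctr;                 /* hypothetical nesting counter */

static bool is_debug_idt_enabled(void)
{
	return debug_idt_ctr > 0;
}

static void load_idt(const struct desc_ptr *dtr)
{
	loaded_idt = dtr;
}

static void load_debug_idt(void)
{
	load_idt(&debug_idt_descr);
}

/*
 * No irq save/restore here, so nothing in this helper can call into
 * irq tracing before the IDT has been switched.
 */
static inline void load_current_idt(void)
{
	if (is_debug_idt_enabled())
		load_debug_idt();
	else
		load_idt(&idt_descr);
}

/* Simplified model: switch to the debug IDT before irqs-off is traced. */
static void debug_stack_set_zero(void)
{
	debug_idt_ctr++;
	load_current_idt();
}

/* Simplified model: switch back only when the outermost user is done. */
static void debug_stack_reset(void)
{
	if (debug_idt_ctr == 0)
		return;
	if (--debug_idt_ctr == 0)
		load_current_idt();
}
```

In this model, calling debug_stack_set_zero() leaves the debug
descriptor loaded until the matching outermost debug_stack_reset(), and
load_current_idt() itself never touches the interrupt flags.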

-- Steve
