Re: bisect results of MSI-X related panic (help!)

From: Tejun Heo
Date: Mon Oct 12 2009 - 03:54:43 EST


Jesse Brandeburg wrote:
> Kernel stack is corrupted in: ffffffff810b5b31
>
> I've built with a full debug kernel before this crash, so I did:
>
> (gdb) l *0xffffffff810b5b31
> 0xffffffff810b5b31 is in move_native_irq (kernel/irq/migration.c:67).
> 62 return;
> 63
> 64 desc->chip->mask(irq);
> 65 move_masked_irq(irq);
> 66 desc->chip->unmask(irq);
>>>> 67 }
> 68
> (gdb) l move_native_irq
> 54 void move_native_irq(int irq)
> 55 {
> 56 struct irq_desc *desc = irq_to_desc(irq);
> 57
> 58 if (likely(!(desc->status & IRQ_MOVE_PENDING)))
> 59 return;
> 60
> 61 if (unlikely(desc->status & IRQ_DISABLED))
> 62 return;
> 63
> 64 desc->chip->mask(irq);
> 65 move_masked_irq(irq);
> 66 desc->chip->unmask(irq);
> 67 }
>
> So, this seems very related to my panic: irqbalance or something
> else might be trying to move my interrupt from one core to another,
> and both the original issue and this one reproduce with LOTS of
> MSI-X vectors active.
>
> - I tried connecting after the panic with kgdboc, no connection
> - I tried kdump, but the same kernel I am using panics/hangs during
> boot right after udev during the kexec() kernel boot (should I try
> harder to get this working given it got so far?)
> - I have ftrace function tracer running but no way to get at the log
> post panic (wouldn't it be great if the kernel just dumped the ftrace
> log on __stack_chk_fail?)
>
> any other debugging tricks/ideas?

Hmm... stackprotector adds a considerable amount of stack usage, so
you could be seeing a stack overflow, which would also explain the
random crashes you've been seeing. Do you have DEBUG_STACKOVERFLOW
turned on? This is on x86_64, right?
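
For reference, a rough sketch of the .config lines this would involve.
It assumes a 2.6.31-era x86_64 tree, so the option names should be
checked against the Kconfig of the tree actually being built. Turning
the stack protector off is only a suggested experiment to see whether
the panic follows the extra stack usage, not a fix:

```
# Assumed option names for a 2.6.31-era x86_64 build; verify in menuconfig.
CONFIG_DEBUG_KERNEL=y
# Warn when the remaining kernel stack gets low.
CONFIG_DEBUG_STACKOVERFLOW=y
# Track how much of the kernel stack has been used.
CONFIG_DEBUG_STACK_USAGE=y
# Experiment only: build without the stack protector to compare behaviour.
# CONFIG_CC_STACKPROTECTOR is not set
```

If the crashes disappear with the stack protector off, or the overflow
check fires first, that would point at stack usage rather than at the
IRQ migration code itself.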

--
tejun