Re: [Fastboot] [KDUMP] pending interrupts problem

From: OBATA Noboru
Date: Tue Nov 01 2005 - 04:13:48 EST


On Thu, 27 Oct 2005, Eric W. Biederman wrote:
>
> OBATA Noboru <noboru.obata.ar@xxxxxxxxxxx> writes:
>
> > What happens in the second kernel is that it receives the
> > pending IPI at the first local_irq_enable() in init/main.c, and
> > the IPI handler calls the uninitialized function pointer
> > call_data->func in arch/i386/kernel/smp.c.
> >
> > One way to solve this problem is to implement some blocking
> > mechanism in the IPI handler so that it prevents itself from
> > calling the uninitialized pointer.
>
> Or simply initializing the pointer to NULL and testing to
> see if the IPI handler has been initialized. Weird but
> doable.

As I replied in the other thread, your patch has solved the
problem. Thank you.

> > But pending interrupts on other vectors may cause another
> > problem. They would cause misrouted IRQs, which are now
> > addressed by "irqpoll" kernel parameter. But I'm not sure this
> > solves all the problems.
>
> The irqs should not be misrouted. They simply come at an unexpected
> time.

Okay, I guess the irqs are not misrouted because kdump does not
touch the IO-APIC routing table, and the second kernel will
build the same one.

> > So another way to solve this problem is to clear all such
> > pending interrupts before booting the second kernel.
>
> No. The second kernel gets to cope, because we cannot
> do anything reliable in the crashed kernel.

Well, I first thought that restoring the hardware status back to
normal before booting the second kernel is better because it may
require fewer changes on the kernel core code.

But now I'm getting the idea what kdump is trying to do. Kdump
wants to do less in the crashed kernel, and solve problems in
the second kernel.

> > /* pending-IPI.c - Panic a kernel with IPI pending */
> >
> > #include <linux/config.h>
> > #include <linux/module.h>
> > #include <linux/smp.h>
> >
> > int init_module(void)
> > {
> > local_irq_disable();
> >
> > apic_wait_icr_idle();
> > apic_write_around(APIC_ICR2, SET_APIC_DEST_FIELD(1));
> > apic_write_around(APIC_ICR, (APIC_DM_FIXED
> > | APIC_DEST_LOGICAL | 0xfb));
> >
> > panic("pending-IPI: Bye-bye.");
> >
> > return 0;
> > }
> >
> > MODULE_LICENSE("GPL");
> > MODULE_AUTHOR("OBATA Noboru <noboru.obata.ar@xxxxxxxxxxx>");
> > MODULE_DESCRIPTION("Panic a kernel with IPI pending");
>
>
> This looks like a good test case.

I think we need more test cases, especially the cases that focus
on "status" of hardware, to make kdump more reliable. Kdump
should recover from all possible status of supported hardware.

Is anyone working on developing such test cases for kdump?

Regards,

--
OBATA Noboru (noboru.obata.ar@xxxxxxxxxxx)

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/