Re: [PATCH] NMI watch dog notify patch

From: Andrew Morton
Date: Fri Jul 29 2005 - 00:17:26 EST


Keith Owens <kaos@xxxxxxx> wrote:
>
> >I had though that too, but it does not allow recovery (i.e. lets reset
> >the watchdog and try again).
>
> die_nmi() returns to nmi_watchdog_tick(), nmi_watchdog_tick does the
> reset and continues. Patch below.
>
> >Hmm.. just looked at traps.c. Seems die_nmi is NOT called from the nmi
> >trap, only from the watchdog. Also, there is a notify in the path to
> >the other nmi stuff.
>
> I was looking at unknown_nmi_panic_callback(), which also calls
> die_nmi().
>
> traps.c already has several notify_die() calls, nmi.c has none. It is
> cleaner to keep all the notification in traps.c, with this small change
> to nmi.c to cope with die_nmi() returning.
>
> Index: linux/arch/i386/kernel/nmi.c
> ===================================================================
> --- linux.orig/arch/i386/kernel/nmi.c 2005-07-28 17:22:06.735038510 +1000
> +++ linux/arch/i386/kernel/nmi.c 2005-07-29 15:19:00.371196596 +1000
> @@ -494,8 +494,10 @@ void nmi_watchdog_tick (struct pt_regs *
> * wait a few IRQs (5 seconds) before doing the oops ...
> */
> alert_counter[cpu]++;
> - if (alert_counter[cpu] == 5*nmi_hz)
> + if (alert_counter[cpu] == 5*nmi_hz) {
> die_nmi(regs, "NMI Watchdog detected LOCKUP");
> + alert_counter[cpu] = 0;
> + }
> } else {
> last_irq_sums[cpu] = sum;
> alert_counter[cpu] = 0;

That all makes sense - let's go that way?
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/