Re: [PATCH] fix i386 condition to call nmi_watchdog_tick

From: Zwane Mwaikambo
Date: Thu Sep 08 2005 - 19:40:08 EST


On Thu, 8 Sep 2005, Jan Beulich wrote:

> diff -Npru 2.6.13/arch/i386/kernel/traps.c 2.6.13-i386-watchdog-active/arch/i386/kernel/traps.c
> --- 2.6.13/arch/i386/kernel/traps.c 2005-08-29 01:41:01.000000000 +0200
> +++ 2.6.13-i386-watchdog-active/arch/i386/kernel/traps.c 2005-09-01 14:04:35.000000000 +0200
> @@ -611,7 +611,7 @@ static void default_do_nmi(struct pt_reg
>  	 * Ok, so this is none of the documented NMI sources,
>  	 * so it must be the NMI watchdog.
>  	 */
> -	if (nmi_watchdog) {
> +	if (nmi_watchdog && nmi_active > 0) {
>  		nmi_watchdog_tick(regs);
>  		return;
>  	}

I dislike this patch, and it's not your fault. The reason is that there
are a few systems (I have one) which always report "CPU stuck" during
watchdog setup, but on which the watchdog eventually starts ticking at
runtime. With this check in place, those ticks would no longer be treated
as watchdog NMIs, so if this gets in you'll get lots of the following:

Uhhuh. NMI received for unknown reason 00 on CPU 1.
Dazed and confused, but trying to continue
Do you have a strange power saving mode enabled?
Uhhuh. NMI received for unknown reason 21 on CPU 0.

So, before the patch can go in, the "CPU stuck" systems probably need
looking at. Since I have one, I'll have a look.

Thanks,
Zwane

PS: Why is the NMI watchdog perpetually broken?