RFC: generic support for two-stage watchdogs?

From: Chris Friesen
Date: Thu Jul 28 2011 - 13:09:06 EST



When using kdump to save crash recovery information, it is frustrating when the watchdog timer fires and reboots the system out from under us before the dump completes.

Many hardware watchdogs support a two-stage operation where the initial stage expires and sends an NMI or other interrupt to the CPU. Only once the second stage fires does it actually reboot the hardware.

Has anyone considered adding support for this sort of hardware to the /dev/watchdog API? When the first stage fires, it seems like it would make sense to reset the watchdog timeout to some suitable period and trigger kdump. This would let us preserve the crash information.

If the system is really fubared then the second stage will fire and reboot the machine.
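
To make the intended behaviour concrete, here is a rough sketch of the two-stage flow as a small userspace model in C. It is not an existing kernel or /dev/watchdog interface: every name in it (two_stage_wd, wd_init, wd_ping, wd_on_stage1, wd_tick) is hypothetical, time is counted in whole one-second ticks, and the actual kdump trigger is left as a comment.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a two-stage watchdog. All names are illustrative;
 * none of this is an existing kernel or /dev/watchdog interface. */
enum wd_event { WD_NONE, WD_STAGE1, WD_REBOOT };

struct two_stage_wd {
	unsigned int stage1_secs; /* keepalive deadline; expiry raises NMI/IRQ */
	unsigned int stage2_secs; /* extra time before the hardware reboot */
	unsigned int elapsed;     /* seconds since the current stage started */
	bool stage1_fired;
};

static void wd_init(struct two_stage_wd *wd, unsigned int s1, unsigned int s2)
{
	assert(s1 > 0 && s2 > 0); /* a zero-length stage would fire immediately */
	wd->stage1_secs = s1;
	wd->stage2_secs = s2;
	wd->elapsed = 0;
	wd->stage1_fired = false;
}

/* Normal keepalive from a healthy system: restart stage one. */
static void wd_ping(struct two_stage_wd *wd)
{
	wd->elapsed = 0;
	wd->stage1_fired = false;
}

/* Stage-one handler as proposed: give kdump a fresh, suitable period
 * before the second stage reboots the machine. */
static void wd_on_stage1(struct two_stage_wd *wd, unsigned int kdump_secs)
{
	wd->stage2_secs = kdump_secs;
	wd->elapsed = 0;
	/* a real handler would trigger kdump here */
}

/* Advance the model by one second and report what the hardware would do. */
static enum wd_event wd_tick(struct two_stage_wd *wd)
{
	wd->elapsed++;
	if (!wd->stage1_fired) {
		if (wd->elapsed < wd->stage1_secs)
			return WD_NONE;
		wd->stage1_fired = true;
		wd->elapsed = 0;
		return WD_STAGE1;
	}
	return wd->elapsed >= wd->stage2_secs ? WD_REBOOT : WD_NONE;
}
```

A caller would invoke wd_tick() once per second, call wd_on_stage1() when it returns WD_STAGE1, and treat WD_REBOOT as the hardware reset. If kdump wedges, nothing pings the watchdog again, so the second stage still fires after the extended period.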

Chris

--
Chris Friesen
Software Developer
GENBAND
chris.friesen@xxxxxxxxxxx
www.genband.com