Re: ipmi_watchdog can not reset the kernel panic machine

From: Corey Minyard
Date: Sun Nov 25 2007 - 19:26:47 EST


The watchdog is "off" by default, meaning that you have to have something actually start resetting the watchdog before it will start running. That's why you are seeing this behavior.

There is a start_now option that will start the watchdog when it is loaded, but then it will reset the system unless something resets the watchdog periodically, and you have a limited time to start this operation.

On a panic, the IPMI driver attempts to preserve the state of the watchdog and (if running) increase the timeout time to allow a kdump or something like that to occur. That's the purpose of the code you reference. It is not to start a reset operation on any panic. It used to start a reset on every panic, but that cause problems for many users.

-corey

Andrew Morton wrote:
(cc's added)

On Fri, 23 Nov 2007 20:28:41 -0800 (PST) youquan_song@xxxxxxxxxxxxxxx wrote:

Build kernel-2.6.24-rc3. pmi_watchdog can not reset the kernel panic
machine. The watchdog can never to record panic information to IPMI SEL.

1. I disable auto reset when kernel panic by echo "0" >
/proc/sys/kernel/panic

2. modprobe ipmi_watchdog timeout=120 action=reset

3. Load a driver, the driver will call panic() when ioctl to call into
the driver.

4. By ioctl call into the driver, panic the system.

in wdog_panic_handler, I printk "ipmi_watchdog_state=WDOG_TIMEOUT_NONE"
so, the watchdog can never to record panic information to IPMI SEL.


static int wdog_panic_handler(struct notifier_block *this,
unsigned long event,
void *unused)
{
static int panic_event_handled = 0;

/* On a panic, if we have a panic timeout, make sure to extend
the watchdog timer to a reasonable value to complete the
panic, if the watchdog timer is running. Plus the
pretimeout is meaningless at panic time. */
if (watchdog_user && !panic_event_handled &&
ipmi_watchdog_state != WDOG_TIMEOUT_NONE) {
/* Make sure we do this only once. */
panic_event_handled = 1;

timeout = 255;
pretimeout = 0;
panic_halt_ipmi_set_timeout();
}

return NOTIFY_OK;
}

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/