Re: [PATCHv2] watchdog: Add stop_on_reboot parameter to control reboot policy

From: Guenter Roeck
Date: Sat Feb 22 2020 - 11:06:43 EST


On Fri, Feb 14, 2020 at 04:22:09PM +0000, Dmitry Safonov wrote:
> Many watchdog drivers use the watchdog_stop_on_reboot() helper in order
> to stop the watchdog on system reboot. Unfortunately, this logic is
> coded in the driver's probe function and doesn't allow the user to decide
> what to do during shutdown/reboot.
>
> On the other hand, Xen and Qemu watchdog drivers (xen_wdt and i6300esb)
> may be configured to either send an NMI or turn off/reboot the VM as
> the watchdog action. As the kernel may get stuck in any state, sending
> NMIs can't reliably reboot the VM.
>
> At Arista, we benefited from the following set-up: the emulated watchdogs
> trigger a VM reset and softdog is set to catch less severe conditions and
> generate a vmcore. Just before reboot, the watchdog's timeout is increased
> to some good-enough value (3 mins). That keeps the watchdog always running
> and guarantees that the VM doesn't get stuck.
>
> Provide a new stop_on_reboot module parameter to let the user control
> the watchdog's reboot policy.
>
> Cc: Guenter Roeck <linux@xxxxxxxxxxxx>
> Cc: Wim Van Sebroeck <wim@xxxxxxxxxxxxxxxxxx>
> Cc: linux-watchdog@xxxxxxxxxxxxxxx
> Signed-off-by: Dmitry Safonov <dima@xxxxxxxxxx>
> ---
> Changes v1 => v2: Add module parameter instead of ioctl()
>
> drivers/watchdog/watchdog_core.c | 12 ++++++++++++
> 1 file changed, 12 insertions(+)
>
> diff --git a/drivers/watchdog/watchdog_core.c b/drivers/watchdog/watchdog_core.c
> index 861daf4f37b2..5ead96199a0b 100644
> --- a/drivers/watchdog/watchdog_core.c
> +++ b/drivers/watchdog/watchdog_core.c
> @@ -39,6 +39,10 @@
>
> static DEFINE_IDA(watchdog_ida);
>
> +static int stop_on_reboot = -1;
> +module_param(stop_on_reboot, int, 0644);
> +MODULE_PARM_DESC(stop_on_reboot, "Stop watchdogs on reboot (0=keep watching, 1=stop)");
> +

My major concern is that this is writeable at runtime.
Changing the value won't change the behavior of already loaded
drivers. A driver's behavior will change only if it is unloaded and
reloaded after the value was changed. This would be confusing, and it
is hard to imagine anyone expecting such behavior. Does this have to be
writeable?

Guenter

> /*
> * Deferred Registration infrastructure.
> *
> @@ -254,6 +258,14 @@ static int __watchdog_register_device(struct watchdog_device *wdd)
>  		}
>  	}
>  
> +	/* Module parameter to force watchdog policy on reboot. */
> +	if (stop_on_reboot != -1) {
> +		if (stop_on_reboot)
> +			set_bit(WDOG_STOP_ON_REBOOT, &wdd->status);
> +		else
> +			clear_bit(WDOG_STOP_ON_REBOOT, &wdd->status);
> +	}
> +
>  	if (test_bit(WDOG_STOP_ON_REBOOT, &wdd->status)) {
>  		wdd->reboot_nb.notifier_call = watchdog_reboot_notifier;
>