Re: [PATCH v2] watchdog: wdat_wdt: Set the min and max timeout values properly

From: Jean Delvare
Date: Tue Aug 23 2022 - 12:31:50 EST


Hi all,

On Sat, 6 Aug 2022 08:15:24 +0200, Jean Delvare wrote:
> The wdat_wdt driver is misusing the min_hw_heartbeat_ms field. This
> field should only be used when the hardware watchdog device should not
> be pinged more frequently than a specific period. The ACPI WDAT
> "Minimum Count" field, on the other hand, specifies the minimum
> timeout value that can be set. This corresponds to the min_timeout
> field in Linux's watchdog infrastructure.
>
> Setting min_hw_heartbeat_ms instead can cause pings to the hardware
> to be delayed when there is no reason for that, eventually leading to
> unexpected firing of the watchdog timer (and thus unexpected reboot).
> (...)

This patch no longer applies as it conflicts with:

commit 6d72c7ac9fbe26a77800676507da980436b40b2f
Author: Liu Xinpeng
Date: Tue Apr 26 22:53:28 2022 +0800

which made it into kernel v5.19.

Having reviewed the commit mentioned above, I must say I'm skeptical. I
don't see why arbitrarily setting min_timeout to 1 was considered a
good thing. It allows setting timeout values lower than the ACPI WDAT
"Minimum Count" field, while the hardware presumably does not support
such short timeouts.

Furthermore, calling watchdog_timeout_invalid() to validate the timeout
value is a good idea in principle. However, given that min_timeout is
now 1 and max_hw_heartbeat_ms is set, the function no longer checks
much.
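
To illustrate, here is a small userspace sketch of the check as I read
it in the watchdog core. The struct, the function names and the numbers
are stand-ins made up for this example, not the kernel's definitions,
and the logic is my recollection of watchdog_timeout_invalid(), so
please verify against the actual header.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/*
 * Only the fields relevant to timeout validation. This is a hypothetical
 * stand-in for struct watchdog_device, not the real definition.
 */
struct wdd_sketch {
	unsigned int min_timeout;         /* seconds */
	unsigned int max_timeout;         /* seconds, 0 if unset */
	unsigned int max_hw_heartbeat_ms; /* milliseconds, 0 if unset */
};

/* Assumed to mirror the logic of the core's watchdog_timeout_invalid(). */
static bool timeout_invalid_sketch(const struct wdd_sketch *wdd,
				   unsigned int t)
{
	return t > UINT_MAX / 1000 || t < wdd->min_timeout ||
	       (!wdd->max_hw_heartbeat_ms && wdd->max_timeout &&
		t > wdd->max_timeout);
}

/* Made-up hardware: supports timeouts from 5 s to 600 s. */
static void demo_validation(void)
{
	struct wdd_sketch wdd = {
		.min_timeout = 1,
		.max_timeout = 0,
		.max_hw_heartbeat_ms = 600000,
	};

	/* Below the hardware minimum, yet accepted. */
	assert(!timeout_invalid_sketch(&wdd, 2));
	/* The upper bound is skipped because max_hw_heartbeat_ms is set. */
	assert(!timeout_invalid_sketch(&wdd, 3600));
	/* Only 0 and values above UINT_MAX / 1000 are rejected. */
	assert(timeout_invalid_sketch(&wdd, 0));
}
```

With these settings, the only values rejected are 0 and anything above
UINT_MAX / 1000.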

My understanding is that the original code was checking the right
limits (from the WDAT table's perspective) using the wrong fields (from
the watchdog core's perspective). The fix from Liu does not really fix
that problem (min_hw_heartbeat_ms and max_hw_heartbeat_ms are still set,
which enables watchdog core facilities that the driver doesn't need
IMHO) and adds a new one (the timeout limits defined in the ACPI WDAT
table are no longer checked).
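
As a rough illustration of what "the right limits in the right fields"
could mean, the sketch below converts WDAT count limits to whole
seconds, rounding the minimum up and the maximum down so that both stay
within what the hardware supports. This is not the actual patch: the
struct and helper names are hypothetical, and it assumes the WDAT timer
period is expressed in milliseconds per count.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical stand-in for the relevant WDAT table fields. */
struct wdat_limits_sketch {
	uint32_t timer_period_ms; /* assumed: duration of one count, in ms */
	uint32_t min_count;
	uint32_t max_count;
};

/* Clamp a 64-bit value so that it fits in an unsigned int. */
static unsigned int clamp_uint(uint64_t v)
{
	return v > UINT_MAX ? UINT_MAX : (unsigned int)v;
}

/* Smallest whole number of seconds not below the hardware minimum. */
static unsigned int min_timeout_secs(const struct wdat_limits_sketch *t)
{
	uint64_t ms = (uint64_t)t->timer_period_ms * t->min_count;

	return clamp_uint((ms + 999) / 1000);
}

/* Largest whole number of seconds not above the hardware maximum. */
static unsigned int max_timeout_secs(const struct wdat_limits_sketch *t)
{
	uint64_t ms = (uint64_t)t->timer_period_ms * t->max_count;

	return clamp_uint(ms / 1000);
}

/* Made-up table values, for illustration only. */
static void demo_limits(void)
{
	struct wdat_limits_sketch t = {
		.timer_period_ms = 600,
		.min_count = 3,
		.max_count = 1023,
	};

	assert(min_timeout_secs(&t) == 2);   /* 1800 ms rounds up to 2 s */
	assert(max_timeout_secs(&t) == 613); /* 613800 ms rounds down */
}
```

Values computed this way would go in min_timeout and max_timeout, so
that the core's timeout validation enforces the table's limits again.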

I will rebase my patch on top and address both problems.

--
Jean Delvare
SUSE L3 Support