Re: [PATCH] clocksource: timer-tegra186: Enable WDT at probe

From: Jon Hunter
Date: Thu Jul 03 2025 - 06:29:28 EST




On 03/07/2025 11:12, Thierry Reding wrote:
> On Thu, Jul 03, 2025 at 08:55:04AM +0100, Jon Hunter wrote:
>>
>> On 03/07/2025 07:55, Thierry Reding wrote:
>>> On Mon, Jun 30, 2025 at 04:31:35PM +0530, Kartik Rajput wrote:
>>>> Currently, if the system crashes or hangs during kernel boot before
>>>> userspace initializes and configures the watchdog timer, then the
>>>> watchdog won’t be able to recover the system as it’s not running. This
>>>> becomes crucial during an over-the-air update, where if the newly
>>>> updated kernel crashes on boot, the watchdog is needed to reset the
>>>> device and boot into an alternative system partition. If the watchdog
>>>> is disabled in such scenarios, it can lead to the system getting
>>>> bricked.
>>>>
>>>> Enable the WDT during driver probe to allow recovery from any crash/hang
>>>> seen during early kernel boot. Also, disable interrupts once userspace
>>>> starts pinging the watchdog.
>>>>
>>>> Signed-off-by: Kartik Rajput <kkartik@xxxxxxxxxx>
>>>> ---
>>>> drivers/clocksource/timer-tegra186.c | 42 ++++++++++++++++++++++++++++
>>>> 1 file changed, 42 insertions(+)
>>>
>>> This seems dangerous to me. It means that if the operating system
>>> doesn't start some sort of watchdog service in userspace that pings the
>>> watchdog, the system will reboot 120 seconds after the watchdog probe.
>>
>> I don't believe that will happen with this change. The kernel will continue
>> to pet the watchdog until userspace takes over with this change. At least
>> that is my understanding.
>
> Ah yes... I skipped over that IRQ handling bit. However, I think this
> still violates the assumptions because the driver will keep petting the
> watchdog no matter what, which means that we now have no way of forcing
> a reset of the system when userspace hangs. As long as just a tiny part
> of the kernel keeps running, the watchdog would keep getting petted and
> prevent it from resetting the system.
>
> Using a second watchdog still seems like a more robust alternative. Or
> maybe we can find a way to remove the kernel petting once userspace
> starts the watchdog.

Once userspace calls the '->ping' callback, 'enable_irq' is set to false,
and when 'tegra186_wdt_enable()' is called it disables the IRQ so that
the kernel no longer pets the watchdog. So this should disable kernel
petting once userspace is up and running.
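
To make the handoff concrete, here is a rough user-space model of the
behaviour described above. It is only a sketch and not the code from the
patch: the struct and all wdt_model_*() names are hypothetical, and it
assumes that the ping path re-runs the enable path, which is where the
'enable_irq' flag takes effect.

```c
#include <stdbool.h>

/* Hypothetical model of the watchdog state; not the driver's real struct. */
struct wdt_model {
	bool running;             /* watchdog counter is armed */
	bool irq_enabled;         /* timeout IRQ is delivered to the kernel */
	bool enable_irq;          /* policy flag applied on the next enable */
	unsigned int kernel_pets; /* pets performed by the IRQ handler */
	unsigned int user_pets;   /* pets performed via the ping path */
};

/* Models the enable path: arm the watchdog and apply the IRQ policy. */
static void wdt_model_enable(struct wdt_model *w)
{
	w->running = true;
	w->irq_enabled = w->enable_irq;
}

/* Models probe: start the watchdog with kernel petting switched on. */
static void wdt_model_probe(struct wdt_model *w)
{
	*w = (struct wdt_model){ .enable_irq = true };
	wdt_model_enable(w);
}

/* Models the IRQ handler: returns true if the kernel petted the watchdog. */
static bool wdt_model_irq(struct wdt_model *w)
{
	if (!w->running || !w->irq_enabled)
		return false;
	w->kernel_pets++;
	return true;
}

/*
 * Models the ping callback: userspace takes over, so clear the policy
 * flag and re-enable (assumption: ping goes through the enable path).
 */
static void wdt_model_ping(struct wdt_model *w)
{
	w->enable_irq = false;
	wdt_model_enable(w);
	w->user_pets++;
}
```

In this model, calling wdt_model_irq() after the first wdt_model_ping()
returns false, so from then on only userspace can keep the watchdog from
expiring.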

Cheers!
Jon

--
nvpublic