RE: [RFE]: watchdog: atmel: atmel-sama5d4-wdt

From: Ken Sloat
Date: Mon Jun 10 2019 - 16:57:06 EST


> -----Original Message-----
> From: Guenter Roeck <groeck7@xxxxxxxxx> On Behalf Of Guenter Roeck
> Sent: Monday, June 10, 2019 4:29 PM
> To: Alexandre Belloni <alexandre.belloni@xxxxxxxxxxxxxxxxxx>
> Cc: Ken Sloat <KSloat@xxxxxxxxxxxxxx>; Nicolas.Ferre@xxxxxxxxxxxxx;
> wim@xxxxxxxxx; linux-arm-kernel@xxxxxxxxxxxxxxxxxxx; linux-
> watchdog@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx
> Subject: Re: [RFE]: watchdog: atmel: atmel-sama5d4-wdt
>
> [This is an EXTERNAL EMAIL]
> ________________________________
>
> On Mon, Jun 10, 2019 at 10:13:01PM +0200, Alexandre Belloni wrote:
> > Hello,
> >
> > On 10/06/2019 09:28:11-0700, Guenter Roeck wrote:
> > > On Mon, Jun 10, 2019 at 03:51:52PM +0000, Ken Sloat wrote:
> > > > Hello Nicolas,
> > > >
> > > > I wanted to open a discussion proposing new functionality to allow
> > > > disabling of the watchdog timer upon entering suspend in the
> SAMA5D2/4.
> > > >
> > > > Typical use case of a hardware watchdog timer in the kernel is a
> > > > userspace application opens the watchdog timer and periodically
> > > > "kicks" it. If the application hits a deadlock somewhere and is no
> > > > longer able to kick it, then the watchdog intervenes and often
> > > > resets the processor. Such is the case for the Atmel driver (which
> > > > also allows a watchdog interrupt to be asserted in lieu of a system
> reset). In most use cases, upon entering a low power/suspend state, the
> application will no longer be able to "kick" the watchdog. If the watchdog is
> not disabled or kicked via another method, then it will reset the system. This
> is the current behavior of the Atmel driver as of today.
> > > >
> > > > The watchdog peripheral itself does have a "WDIDLEHLT" bit
> > > > however, and this is enabled via the "atmel,idle-halt" dt
> > > > property. However, this is not very useful, as it literally only
> > > > makes the watchdog count when the CPU is active. This results in
> > > > non-deterministic triggering of the WDT and means that if a
> > > > critical application were to crash, it may be quite a long time
> > > > before the WDT would ever trigger. Below is a similar statement
> > > > made in the device-tree doc for this
> > > > peripheral:
> > > >
> > > > - atmel,idle-halt: present if you want to stop the watchdog when the
> CPU is
> > > > in idle state.
> > > > CAUTION: This property should be used with care, it actually makes
> the
> > > > watchdog not counting when the CPU is in idle state, therefore the
> > > > watchdog reset time depends on mean CPU usage and will not reset at
> all
> > > > if the CPU stop working while it is in idle state, which is probably
> > > > not what you want.
> > > >
> > > > It seems to me, that it would be logical and useful to introduce a
> > > > new property that would cause the Atmel WDT to disable on suspend
> > > > and re-enable on resume. It also appears that the WDT is re-initialized
> anyways upon resume, so the only piece missing here would really be a dt
> flag and a call to disable.
> > > >
> > > Wondering - why would this need a dt property ? That would be quite
> > > unusual. Is there a condition where one would _not_ want the watchdog
> to stop on suspend ?
> > >
> >
> > There are customers that protects suspend/resume using the watchdog.
> > They wake up their platform every 15s to ping the watchdog.
> >
>
> Interesting use case.
>
> > Also, I don't see why the application deciding to go to suspend
> > wouldn't be able to disable the watchdog before do so if this is the wanted
> policy.
> >
>
> Many watchdog drivers already implement suspend/resume support. Such a
> platform specific functionality seems to be quite undesirable to me.
>
> Besides (and pretty much all watchdog drivers implementing
> suspend/resume do that wrong), you'd likely want to disable the watchog
> late during suspend and early during resume to reduce the risk of a hang. I
> don't think you can do that from userspace.
>

Agreed, there's also the risk if using something like wake_count where the
suspend process can be interrupted by the kernel. This just makes for more
cases that the application has to try and handle, and what happens if the
application hangs/crashes some time between disabling the wdt and suspend?

Right now, what probably is a very common case of devices that enter low power
mode for longer than 16 seconds are precluded from using this WDT.

> Thanks,
> Guenter

Thanks,
Ken