watchdog: pcf2127: systemd fails on 5.11

From: Bruno Thomsen
Date: Thu Feb 18 2021 - 09:23:34 EST


Hi,

After updating the kernel from 5.8.17 to 5.11 systemd (246.6) is
unable to init watchdog in pcf2127 during boot. Kernel option
CONFIG_WATCHDOG_OPEN_TIMEOUT=300 is working as expected.
It's possible to get watchdog from userspace working in
the following 2 ways.
1) Disable watchdog in systemd and use busybox watchdog.
2) Restart systemd after boot with "kill 1".

During boot setting the system clock from RTC is working.
RTC read/write from userland with hwclock is also working.

DTS: imx7d-flex-concentrator-mfg.dts
SOC: NXP i.MX7D
Drivers: rtc-pcf2127, spi-imx
Communication: SPI

There are no patches applied to the kernel.

When systemd changes watchdog timeout it receives an
error that to our best knowledge comes from spi-imx[1].

We suspect it's a race condition between drivers or
incompatible error handling.

Any help in investigating the issue is appreciated.

/Bruno

[1] https://elixir.bootlin.com/linux/v5.11/source/drivers/spi/spi-imx.c#L1424

Relevant boot trace of pcf2127 and systemd:

Feb 18 09:26:46 tqma7 kernel: rtc-pcf2127-spi spi1.0: pcf2127_probe
Feb 18 09:26:46 tqma7 kernel: rtc-pcf2127-spi spi1.0:
pcf2127_rtc_read_time: raw data is cr3=08, sec=37, min=26, hr=09,
mday=18, wday=04, mon=02, year=21
Feb 18 09:26:46 tqma7 kernel: rtc-pcf2127-spi spi1.0:
pcf2127_rtc_read_time: tm is secs=37, mins=26, hours=9, mday=18,
mon=1, year=121, wday=4
Feb 18 09:26:46 tqma7 kernel: rtc-pcf2127-spi spi1.0: char device (252:0)
Feb 18 09:26:46 tqma7 kernel: rtc-pcf2127-spi spi1.0: registered as rtc0
Feb 18 09:26:46 tqma7 kernel: rtc-pcf2127-spi spi1.0:
pcf2127_rtc_read_time: raw data is cr3=08, sec=37, min=26, hr=09,
mday=18, wday=04, mon=02, year=21
Feb 18 09:26:46 tqma7 kernel: rtc-pcf2127-spi spi1.0:
pcf2127_rtc_read_time: tm is secs=37, mins=26, hours=9, mday=18,
mon=1, year=121, wday=4
Feb 18 09:26:46 tqma7 kernel: rtc-pcf2127-spi spi1.0: setting system
clock to 2021-02-18T09:26:37 UTC (1613640397)
Feb 18 09:26:46 tqma7 kernel: rtc-pcf2127-spi spi1.0: I/O Error in PIO
Feb 18 09:26:46 tqma7 kernel: rtc-pcf2127-spi spi1.0: SPI transfer failed: -110
Feb 18 09:26:46 tqma7 kernel: spi_master spi1: failed to transfer one
message from queue
Feb 18 09:26:46 tqma7 kernel: rtc-pcf2127-spi spi1.0: I/O Error in PIO
Feb 18 09:26:46 tqma7 kernel: rtc-pcf2127-spi spi1.0: SPI transfer failed: -110
Feb 18 09:26:46 tqma7 kernel: spi_master spi1: failed to transfer one
message from queue
Feb 18 09:26:46 tqma7 systemd[1]: Hardware watchdog 'NXP
PCF2127/PCF2129 Watchdog', version 0
Feb 18 09:26:46 tqma7 kernel: rtc-pcf2127-spi spi1.0: new watchdog
timeout: 120s (old: 60s)
Feb 18 09:26:46 tqma7 kernel: rtc-pcf2127-spi spi1.0: I/O Error in PIO
Feb 18 09:26:46 tqma7 kernel: rtc-pcf2127-spi spi1.0: SPI transfer failed: -110
Feb 18 09:26:46 tqma7 kernel: spi_master spi1: failed to transfer one
message from queue
Feb 18 09:26:46 tqma7 kernel: rtc-pcf2127-spi spi1.0:
pcf2127_wdt_active_ping: watchdog restart failed, ret=-110
Feb 18 09:26:46 tqma7 systemd[1]: Failed to set timeout to 120s:
Connection timed out


Relevant trace after systemd restart (kill 1):

Feb 02 10:42:39 tqma7 kernel: watchdog: watchdog0: nowayout prevents
watchdog being stopped!
Feb 02 10:42:39 tqma7 kernel: watchdog: watchdog0: nowayout prevents
watchdog being stopped!
Feb 02 10:42:39 tqma7 kernel: watchdog: watchdog0: watchdog did not stop!
Feb 02 10:42:39 tqma7 systemd[1]: systemd 246.6 running in system
mode. (-PAM -AUDIT -SELINUX -IMA -APPARMOR -SMACK -SYSVINIT -UTMP
-LIBCRYPTSETUP -GCRYPT -GNUTLS -ACL +XZ -LZ4 +ZSTD +SECCOMP +BLKID
+ELFUTILS +KMOD -IDN2 -IDN -PCRE2 default-hierarchy=unified)
Feb 02 10:42:39 tqma7 systemd[1]: Detected architecture arm.
Feb 02 10:42:40 tqma7 systemd[1]: Hardware watchdog 'NXP
PCF2127/PCF2129 Watchdog', version 0
Feb 02 10:42:40 tqma7 kernel: rtc-pcf2127-spi spi1.0: new watchdog
timeout: 30s (old: 30s)
Feb 02 10:42:40 tqma7 systemd[1]: Set hardware watchdog to 30s.