Re: [PATCH] timekeeping: Add a lockdep override in tick_freeze().

From: Mateusz Jończyk
Date: Mon Jun 02 2025 - 15:59:50 EST


On 31.05.2025 at 21:16, Chris Bainbridge wrote:
On Sat, May 31, 2025 at 07:27:03PM +0100, Chris Bainbridge wrote:
Hi,

I'm getting "WARNING: inconsistent lock state" on resume with this
commit (92e250c624ea37fde64bfd624fd2556f0d846f18):

Further testing shows there are some required conditions for this
warning to be shown. The suspend must be of a short enough duration that
it does "not reach hardware sleep state" (according to amd_s2idle.py).

Also, the warning is only shown once. I don't know if this is because the
conditions for the warning only occur once, or if there is a log limit
somewhere that prevents it from being logged more than once.

I can reliably reproduce the warning by running amd_s2idle.py and
waiting for the automatic resume:

# ./amd_s2idle.py --log log --duration 5 --wait 4 --count 1
Debugging script for s2idle on AMD systems
💻 HP HP Pavilion Aero Laptop 13-be0xxx (103C_5335KV HP Pavilion) running BIOS 15.17 (F.17) released 12/18/2024 and EC 79.31
🐧 Debian GNU/Linux trixie/sid
🐧 Kernel 6.15.0-rc1-00002-g92e250c624ea
🔋 Battery BAT0 (313-27-3C-A PC03043XL) is operating at 100.00% of design
Checking prerequisites for s2idle
✅ Logs are provided via systemd
✅ AMD Ryzen 7 5800U with Radeon Graphics (family 19 model 50)
✅ SMT enabled
✅ LPS0 _DSM enabled
✅ ACPI FADT supports Low-power S0 idle
✅ HSMP driver `amd_hsmp` not detected (blocked: False)
✅ PMC driver `amd_pmc` loaded (Program 0 Firmware 64.73.0)
✅ GPU driver `amdgpu` bound to 0000:03:00.0
✅ System is configured for s2idle
✅ NVME Intel Corporation SSD 670p Series [Keystone Harbor] is configured for s2idle in BIOS
✅ GPIO driver `pinctrl_amd` available
🚦 Device firmware checks unavailable without fwupd gobject introspection
Started at 2025-05-31 19:46:33.911590 (cycle finish expected @ 2025-05-31 19:46:42.911616)
Results from last s2idle cycle
○ Suspend count: 1
○ Hardware sleep cycle count: 1
○ Wakeup triggered from IRQ 9: ACPI SCI
○ Woke up from IRQ 9: ACPI SCI
○ gpe03 increased from 140 to 148
✅ Userspace suspended for 0:00:08.256333
❌ Did not reach hardware sleep state

If the duration arg is 6 or higher, then amd_s2idle.py reports that the
hardware sleep state was entered, and the "inconsistent lock state"
warning does not appear. If the duration is too low (e.g. 1 second),
then the laptop does not wake up automatically, and upon pressing a
keyboard key, the amdgpu driver will report an error resuming the GPU,
and the GPU will not be working. (I don't think the amdgpu problem is
related to the lock state warning, I'm just mentioning it for
completeness). It is the state between these two cases, where the laptop
does suspend and resume correctly, but the suspend is too short to enter
a hardware sleep state, where the problem occurs.

Hello,

Thank you for this bug report.

amd_s2idle apparently uses an RTC alarm to wake the system up
(which on newer systems is handled by the ACPI SCI instead).
When the delay before the alarm is very short (like 1 second),
the alarm fires before the system is fully suspended and the
system does not wake up afterwards - you have to wake it up
manually. The ACPI SCI is queued, however, and fires shortly
after that.

It appears, however, that both the RTC interrupt and the ACPI SCI
fired (one after the other or at the same time).

I have noticed that cmos_interrupt() in drivers/rtc/rtc-cmos.c
uses spin_lock(), not spin_lock_irqsave() etc., even though it
can be called from a non-interrupt context - indirectly by
cmos_resume() during system resume and also by rtc_handler().

This can lead to a deadlock and is likely why lockdep is
complaining - see "Single-lock state rules:" in
Documentation/locking/lockdep-design.rst.
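
To illustrate the rule, here is a small userspace sketch. It is a toy
model written for this explanation, not kernel code and not how lockdep
is implemented. The names toy_lock and toy_lock_acquire() are
hypothetical. It only records the two usage states that lockdep requires
to be mutually exclusive for a lock class: "ever taken inside a hardirq
handler" and "ever taken with hardirqs enabled".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of lockdep's single-lock state rule (hypothetical names). */
struct toy_lock {
	bool used_in_hardirq;      /* ever acquired inside a hardirq handler */
	bool used_hardirq_enabled; /* ever acquired with hardirqs enabled */
};

/*
 * Record one acquisition of the lock.
 * Returns true once both usage states have been seen, which is the
 * situation lockdep reports as "inconsistent lock state".
 */
static bool toy_lock_acquire(struct toy_lock *lock, bool in_hardirq,
			     bool irqs_enabled)
{
	assert(lock != NULL);

	if (in_hardirq)
		lock->used_in_hardirq = true;
	else if (irqs_enabled)
		lock->used_hardirq_enabled = true;

	return lock->used_in_hardirq && lock->used_hardirq_enabled;
}
```

Feeding the model one acquisition from a hardirq handler and then one
from process context with interrupts enabled (the plain spin_lock()
case) makes it return true. If the second acquisition happens with
interrupts disabled (what spin_lock_irqsave() would ensure), it stays
false.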

It is possible that
commit 92e250c624ea ("timekeeping: Add a lockdep override in tick_freeze()")
is masking the current problem because only the first issue is shown.

I'll send you a debug patch shortly.

Greetings,
Mateusz