Re: [PATCH v2] i2c: designware: Fix unbalanced suspended flag
From: Hans de Goede
Date: Thu Dec 15 2022 - 14:27:01 EST
Hi,
On 12/15/22 16:09, Jarkko Nikula wrote:
> Hi
>
> On 12/14/22 13:28, Hans de Goede wrote:
>> Hi Richard,
>>
>> On 12/13/22 15:45, Richard Fitzgerald wrote:
>>> Ensure that i2c_mark_adapter_suspended() is always balanced by a call to
>>> i2c_mark_adapter_resumed().
>>>
>>> dw_i2c_plat_resume() must always be called, so that
>>> i2c_mark_adapter_resumed() is called. This is not compatible with
>>> DPM_FLAG_MAY_SKIP_RESUME.
>>>
>>> The pairing of pm_runtime_force_suspend() and pm_runtime_force_resume()
>>> can replace this. If nothing is using the driver, and it is not currently
>>> suspended, it will be put into runtime-suspend and will be left in
>>> runtime-suspend during the system resume.
>>>
>>> pm_runtime_force_suspend() is not compatible with DPM_FLAG_SMART_SUSPEND
>>> so this must also be removed. DPM_FLAG_SMART_SUSPEND will set the device
>>> back to pm_runtime_active() during resume_noirq if it cannot skip resume.
>>> This would lead to the inconsistent state where the driver runtime_suspend
>>> has been called (by force_suspend()) but it is marked active (by PM core).
>>>
>>> The unbalanced suspended flag was introduced by
>>> commit c57813b8b288 ("i2c: designware: Lock the adapter while setting the
>>> suspended flag")
>>>
>>> Before that commit, the system and runtime PM used the same functions. The
>>> DPM_FLAG_MAY_SKIP_RESUME was used to skip the system resume if the driver
>>> had been in runtime-suspend. If system resume was skipped, the suspended
>>> flag would be cleared by the next runtime resume. The check of the
>>> suspended flag was _after_ the call to pm_runtime_get_sync() in
>>> i2c_dw_xfer(). So either a system resume or a runtime resume would clear
>>> the flag before it was checked.
>>>
>>> Having introduced the unbalanced suspended flag with that commit, a further
>>> commit 80704a84a9f8 ("i2c: designware: Use the
>>> i2c_mark_adapter_suspended/resumed() helpers")
>>>
>>> changed from using a local suspended flag to using the
>>> i2c_mark_adapter_suspended/resumed() functions. These use a flag that is
>>> checked by I2C core code before issuing the transfer to the bus driver, so
>>> there was no opportunity for the bus driver to runtime resume itself before
>>> the flag check.
>>>
>>> Signed-off-by: Richard Fitzgerald <rf@xxxxxxxxxxxxxxxxxxxxx>
>>> Fixes: c57813b8b288 ("i2c: designware: Lock the adapter while setting the suspended flag")
>>
>> Thank you. I like the new approach in this version.
>>
>> Reviewed-by: Hans de Goede <hdegoede@xxxxxxxxxx>
>>
> I noticed a resume regression with this patch but not all machines and only when resuming from s2idle. After resume all devices on those machines are in D0 even their runtime state show they are suspended. Works with v6.1.
>
> - Baytrail based MRD7 with one I2C bus shared with PUNIT
>
> After boot. All ok (5th bus is shared with PUNIT and not power managed)
>
> cat /sys/bus/platform/devices/80860F41\:0*/power/runtime_status
> suspended
> suspended
> suspended
> suspended
> active
>
> cat /sys/bus/platform/devices/80860F41\:0*/firmware_node/power_state
> D3hot
> D3hot
> D3hot
> D3hot
> D0
>
> After suspend-to-idle (s2idle) resume cycle. Devices stay in D0
>
> cat /sys/bus/platform/devices/80860F41\:0*/power/runtime_status
> suspended
> suspended
> suspended
> suspended
> active
>
> cat /sys/bus/platform/devices/80860F41\:0*/firmware_node/power_state
> D0
> D0
> D0
> D0
> D0
Ah yes this makes sense. We leave the device runtime-suspended,
but since the system-level resume handler does run, the ACPI
pm domain resume handler also has ran before calling the
driver level callback and the acpi pm domain handler will have
put the device in D0 before calling the driver level
resume handler.
So now the device actually is in D0 while the pm-core
thinks it is left in runtime-suspended state, so it
will not runtime suspend it later, as would happen when
not using force_runtime-suspend + force-runtime-resume (*).
So yeah this is not going to work. I think instead we need
to go back to v1 together with the changes which I proposed
for v1. As I mentioned when reviewing v1, v1 is not prefect,
but it really is an ok solution to this.
Regards,
Hans
*) which does not actually always runtime resume despite
its name