Re: [PATCH] driver core: Ensure proper suspend/resume ordering

From: Grygorii Strashko
Date: Thu Sep 17 2015 - 11:48:46 EST


Hi,

On 09/17/2015 03:07 AM, Rafael J. Wysocki wrote:
> On Wednesday, September 16, 2015 03:27:55 PM Alan Stern wrote:
>> On Wed, 16 Sep 2015, Grygorii Strashko wrote:
>>
>>> I think, It should prohibited to probe devices during suspend/hibernation.
>>> And solution introduced in this patch might help to fix it -
>>> in general, we could do :
>>> - add sync point on suspend enter: wait_for_device_probe() and
>>> - prohibit probing: move all devices which will request probing into
>>> deferred_probe list
>>> - one suspend exit: allow probing and do driver_deferred_probe_trigger
>>
>> That could work; it's a good idea.
>>
>>> I'd like to mention here that this patch will work only
>>> if dmp_list will be filled according device creation order ("parent<-child" dependencies)
>>> *AND* according device's probing order ("supplier<-consumer").
>>> So, if there is the case when Parent device can be probed AFTER its children
>>> - it will not work, because "parent<-child" dependencies will not be tracked
>>> any more :( Sry, I could not even imagine that such crazy case exist :'(
>>
>> If we avoid moving devices to the end of the dpm_list when they already
>> have children, then we should be okay, right?
>>
>>> Are there any other subsystems with the same behavior like PCI?
>>
>> I don't know.
>>
>>> If not - probably, it could be fixed in PCI subsystem using device_pm_move_after() or
>>> device_move() in PCIe ports probe.
>>> if yes - ... maybe we can scan/re-check and reorder dpm_list on suspend enter and
>>> restore ("parent<-child" dependencies).
>>
>>> Truth is that smth. need to be done 100%. Personally, I was hit by this issue also,
>>> and it cost me 3 hours of debugging and I came up with the same patch as
>>> Bill Huang, then spent some time trying to understand what is wrong with PCI
>>> - finally, I've just changed the order of my devices in DT :)
>>>
>>> Also, I think, it will be good to have this patch in -next to collect more feedbacks.
>>
>> I like the idea of forcing all probes during a sleep transition to be
>> deferred. We could carry them out just before unfreezing the user
>> threads. That combined with the change mentioned above ought to be
>> worth testing.
>
> Agreed.
>

I've prepared code change which should prohibit devices probing during suspend/hibernation
(below). It also expected to fix wait_for_device_probe() to take into account the case
when the deferred probe workqueue could be still active.

NOTE: It's only compile time tested!

I'm very sorry that I'm replying here instead of sending a proper patch -
I'm on business trip right now and I will be traveling next week also and will not
be able to work on it intensively.

If proposed approach is correct I can send RFC/RFT patch/es (or anyone else could
pick up it if interested to move forward faster).

--
regards,
-grygorii