Re: [PATCH v1 1/5] drivercore: Revert "deferral race condition fix"

From: Grant Likely
Date: Wed Nov 14 2018 - 18:25:20 EST




On 11/11/2018 19:26, Rob Herring wrote:
> On Sun, Nov 11, 2018 at 7:04 AM Andy Shevchenko
> <andy.shevchenko@xxxxxxxxx> wrote:
>>
>> I seems Grant's mail delivery bounces messages. I delibirately reduced
>> the Cc list for sake of ping Grant in case it would pass.
>
> That would be because he is not at Linaro anymore. Added his ARM account.
>
>> On Sat, Nov 10, 2018 at 8:12 PM Andy Shevchenko
>> <andriy.shevchenko@xxxxxxxxxxxxxxx> wrote:
>>>
>>> Consider the following scenario.
>>>
>>> There are two independent devices coupled together by functional dependencies:
>>> - USB OTG (dwc3-pci)
>>> - extcon (tested with extcon-intel-mrfld, not yet in upstream)
>>>
>>> Each of the driver services a corresponding device is built as a module. In the
>>> Buildroot environment the modules are probed by alphabetical ordering of their
>>> modaliases. The latter comes to the case when USB OTG driver will be probed
>>> first followed by extcon one.
>>>
>>> So, if the platform anticipates extcon device to be appeared, in the above case
>>> we will get deferred probe of USB OTG, because of ordering.
>>>
>>> Now, a cherry on top of the cake, the deferred probing list contains
>>> the only two modules, i.e. USB OTG and extcon. Due to above circumstances,
>>> values in the local_trigger_count and deferred_trigger_count are not the same,
>>> and thus provokes deferred probe triggering again and again.
>>>
>>> ...
>>> [ 20.678332] platform dwc3.0.auto: Retrying from deferred list
>>> [ 20.694743] platform dwc3.0.auto: Driver dwc3 requests probe deferral
>>> [ 20.701254] platform dwc3.0.auto: Added to deferred list
>>> [ 20.706620] platform dwc3.0.auto: driver_deferred_probe_add_trigger 1 2
>>> [ 20.713732] platform dwc3.0.auto: Retrying from deferred list
>>> [ 20.730035] platform dwc3.0.auto: Driver dwc3 requests probe deferral
>>> [ 20.736540] platform dwc3.0.auto: Added to deferred list
>>> [ 20.741889] platform dwc3.0.auto: driver_deferred_probe_add_trigger 3 4
>>> [ 20.748991] platform dwc3.0.auto: Retrying from deferred list
>>> [ 20.765416] platform dwc3.0.auto: Driver dwc3 requests probe deferral
>>> [ 20.771914] platform dwc3.0.auto: Added to deferred list
>>> [ 20.777279] platform dwc3.0.auto: driver_deferred_probe_add_trigger 5 6
>>> ...
>>>
>>> Deeper investigation shows the culprit commit 58b116bce136
>>> ("drivercore: deferral race condition fix") which was dedicated to fix some
>>> other issue while bringing a regression.
>>>
>>> This reverts commit 58b116bce13612e5aa6fcd49ecbd4cf8bb59e835 for good until
>>> we will have better solution.
>
> How is behavior that's been there for 4.5 years a regression? Aren't
> we going to break Davinci that this commit was supposed to fix? What
> else could be relying on current behavior?

It has been a long time since I looked at this code, but so this may be
totally wrong. Something is causing driver_deferred_probe_trigger to get
called. I suspect all of this is happening within the single probe
event. This driver creates nested devices, which are bound to there own
driver. If one of those child devices probes successfully, and then the
parent probe fails with -EPROBE_DEFER, does it cause the whole structure
to be torn down again? If so, then that's the cause. The successful
probe will cause the probe loop.

You should investigate what driver probe/remove events happen in the
failure case. That will tell you exactly what is happening.

g.


IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.