Re: [PATCH v2] usb: dwc3: Prevent indefinite sleep in _dwc3_set_mode during suspend/resume

From: Minas Harutyunyan
Date: Sat Mar 17 2018 - 02:33:24 EST


Hi,

On 3/16/2018 4:25 PM, Felipe Balbi wrote:
>
> Hi,
>
> Minas Harutyunyan <Minas.Harutyunyan@xxxxxxxxxxxx> writes:
>>>>> On 09/03/18 14:47, Roger Quadros wrote:
>>>>>> In the following test we get stuck by sleeping forever in _dwc3_set_mode()
>>>>>> after which dual-role switching doesn't work.
>>>>>>
>>>>>> On dra7-evm's dual-role port,
>>>>>> - Load g_zero gadget driver and enumerate to host
>>>>>> - suspend to mem
>>>>>> - disconnect USB cable to host and connect otg cable with Pen drive in it.
>>>>>> - resume system
>>>>>> - we sleep indefinitely in _dwc3_set_mode due to:
>>>>>> dwc3_gadget_exit()->usb_del_gadget_udc()->udc_stop()->
>>>>>> dwc3_gadget_stop()->wait_event_lock_irq()
>>>>>>
>>>>>> To fix this instead of waiting indefinitely with wait_event_lock_irq()
>>>>>> we use wait_event_interruptible_lock_irq_timeout() and print
>>>>>> an error message if there was a timeout.
>>>>>>
>>>>>> Signed-off-by: Roger Quadros <rogerq@xxxxxx>
>>>>>
>>>>> Thanks for picking this for -next.
>>>>> Is it better to have this in v4.16-rc fixes?
>>>>> and also stable? v4.12+
>>>>
>>>> Well, there was no "Fixes: foobar" or "Cc: stable" lines in the commit
>>>> log ;-)
>>>>
>>>> The best we can do now, is wait for -rc1 and manually send the commit to
>>>> stable.
>>>>
>>>
>>> That's fine. Thanks.
>>>
>>
>> The same issue is seen in dwc3_gadget_ep_dequeue(), which also uses
>> wait_event_lock_irq(); the result is an infinite loop.
>
> how did this happen? During rmmod dwc3? Or, perhaps, after you unloaded
> a gadget driver?
>
No, not during rmmod.
We are using our internal USB testing tool. Test case: ISOC OUT, transfer
size N frames. When the host starts ISOC OUT traffic, dwc3 gets the
"Transfer not ready" event in frame F and starts transfers from
frame F+4 (for bInterval=1). As a result, 4 requests that were already
queued on the device side remain incomplete. After some timeout the
function driver tries to dequeue these 4 requests (without disabling the
EP) to complete the test. For ISOC IN such requests are completed on the
MISSED ISOC event, but for ISOC OUT a dequeue call after some timeout is
required.

>> Actually to fix this issue I updated condition of wait function
>> from:
>> !(dep->flags & DWC3_EP_END_TRANSFER_PENDING)
>> to:
>> !(dep->flags & DWC3_EP_END_TRANSFER_PENDING & DWC3_EP_ENABLED)
>
> you're not fixing anything. You're, essentially, removing the entire
> end transfer pending logic.
Yes, you are right, but how can we overcome this infinite loop? Should we
replace wait_event_lock_irq() with
wait_event_interruptible_lock_irq_timeout()?
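
To make the question concrete, here is a minimal userspace sketch (plain C,
not kernel code) of what a bounded wait would change. struct fake_ep,
fake_tick() and bounded_wait() are hypothetical names, the bit position is
an assumption, and a poll budget stands in for jiffies. The return
convention is modeled on the kernel's timeout waits: 0 on timeout,
otherwise the remaining budget (at least 1).

```c
/* Hypothetical stand-in for an endpoint; NOT the kernel's struct dwc3_ep. */
struct fake_ep {
	unsigned int flags;
	int polls_until_irq;	/* <= 0 means the "interrupt" never arrives */
};

#define FAKE_END_TRANSFER_PENDING (1u << 7)	/* assumed bit position */

/* Simulate one tick: the "End Transfer IRQ" clears the flag when due. */
static void fake_tick(struct fake_ep *ep)
{
	if (ep->polls_until_irq > 0 && --ep->polls_until_irq == 0)
		ep->flags &= ~FAKE_END_TRANSFER_PENDING;
}

/*
 * Bounded wait: returns 0 if the budget ran out while the flag was still
 * set, otherwise the remaining budget (at least 1).
 */
static long bounded_wait(struct fake_ep *ep, long budget)
{
	while (ep->flags & FAKE_END_TRANSFER_PENDING) {
		if (budget == 0)
			return 0;	/* timed out: caller can log and move on */
		budget--;
		fake_tick(ep);
	}
	return budget > 0 ? budget : 1;
}
```

With polls_until_irq = 3 and a budget of 10 this returns 7; if the
"interrupt" never arrives it returns 0 and the caller can print an error
instead of sleeping forever. That would only bound the wait, it would not
explain why End Transfer never completes.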

> The whole idea of this is that we can
> disable the endpoint and wait for the End Transfer interrupt. When you
> add a check for the endpoint being enabled, then that code will never
> run and, thus, never wait for the End Transfer IRQ.
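
For reference, a small standalone C sketch of why the modified condition
never waits. The macro names and bit positions below are assumptions for
illustration; the only property that matters is that the two flags are
different single bits, so ANDing the two masks together always yields 0.

```c
/* Assumed, illustrative bit positions; only their distinctness matters. */
#define EP_ENABLED		(1u << 0)
#define EP_END_TRANSFER_PENDING	(1u << 7)

/* Original wait condition: true only once the pending flag is cleared. */
static int wait_done_original(unsigned int flags)
{
	return !(flags & EP_END_TRANSFER_PENDING);
}

/*
 * Modified condition from this thread. The two masks share no bits, so
 * (EP_END_TRANSFER_PENDING & EP_ENABLED) == 0 and the whole expression
 * is !(0) == 1 for every possible value of flags.
 */
static int wait_done_modified(unsigned int flags)
{
	return !(flags & EP_END_TRANSFER_PENDING & EP_ENABLED);
}
```

With flags = EP_ENABLED | EP_END_TRANSFER_PENDING the original condition
is 0 (keep waiting) while the modified one is 1 (return immediately), so
the wait never blocks regardless of the endpoint state.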
>
> If you manage to find a more reliable way of reproducing this, then make
> sure to capture dwc3 tracepoints (see the documentation for details) and
> let's start trying to figure out what's going on.
>
> cheers
>