Re: [PATCH][RESEND] usb: dwc3: gadget: Handle dequeuing of non queued URB gracefully

From: Ardelean, Alexandru
Date: Tue Mar 10 2020 - 09:23:21 EST


On Thu, 2020-01-30 at 14:02 +0200, Felipe Balbi wrote:
> Hi,
>
> Alexandru Ardelean <alexandru.ardelean@xxxxxxxxxx> writes:
>
> > From: Lars-Peter Clausen <lars@xxxxxxxxxx>
> >
> > Trying to dequeue an URB that is currently not queued should be a no-op
> > and be handled gracefully.
> >
> > Use the list field of the URB to indicate whether it is queued or not by
> > setting it to the empty list when it is not queued.
> >
> > Handling this gracefully allows for race condition free synchronization
> > between the complete callback being called due to a completed transfer and
> > trying to call usb_ep_dequeue() at the same time.
>
> We need a little more information here. Can you further explain what
> happens and how you caught this?

Apologies for the delayed reply.
It's been a while since this patch was created, and it was on a 4.14 kernel.
Lars was trying to fix various crashes with USB DWC3 OTG + some Xilinx patches.
I did not track the status of the OTG stuff upstream. I think it's a lot of
patches in the Xilinx tree.

The context has obviously changed since 4.14, and many things could have
influenced the behavior.
I've been trying to send some of these patches out as RFCs now.
[ yeah, I know: I probably should have added an RFC tag as well :) ]
Some of the patches, including this one, seemed to make sense even outside
the context of the crashes that were happening on 4.14.
At the moment we're on 4.19 and we don't see issues, but we still carry this
patch. We may drop it and see what happens.
¯\_(ツ)_/¯

But in any case, it does require a bit more re-investigation.
Apologies for the noise that this patch created :)
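
For anyone following along, the idiom the commit message describes can be
sketched in userspace roughly as below. This is only an illustration under
stated assumptions: the list helpers are simplified stand-ins for the kernel's
<linux/list.h> ones, and struct sketch_request, sketch_alloc_init() and
sketch_dequeue() are hypothetical names, not dwc3 code.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

/* An initialized head points at itself, which list_empty() reports as empty. */
static void INIT_LIST_HEAD(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

static bool list_empty(const struct list_head *head)
{
	return head->next == head;
}

static void list_add_tail(struct list_head *entry, struct list_head *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

/* Unlink the entry, then re-initialize it so list_empty(entry) is true again. */
static void list_del_init(struct list_head *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
	INIT_LIST_HEAD(entry);
}

/* Hypothetical request type; only the list member mirrors the patch. */
struct sketch_request {
	struct list_head list;
};

/* Mirrors the INIT_LIST_HEAD() the patch adds at allocation time. */
static void sketch_alloc_init(struct sketch_request *req)
{
	INIT_LIST_HEAD(&req->list);
}

/* Returns true if the request was queued and got removed, false for a no-op. */
static bool sketch_dequeue(struct sketch_request *req)
{
	if (list_empty(&req->list))
		return false;	/* not queued, nothing to do */
	list_del_init(&req->list);
	return true;
}

/* Small usage example: dequeue before queuing, after queuing, and twice. */
static void sketch_selfcheck(void)
{
	struct list_head queue;
	struct sketch_request req;

	INIT_LIST_HEAD(&queue);
	sketch_alloc_init(&req);
	assert(!sketch_dequeue(&req));	/* never queued: no-op */
	list_add_tail(&req.list, &queue);
	assert(sketch_dequeue(&req));	/* queued: removed */
	assert(!sketch_dequeue(&req));	/* already removed: no-op again */
	assert(list_empty(&queue));
}
```

Note that list_empty() on the request only tells whether it is linked on some
list, not whether it is linked on this particular endpoint's lists.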

>
> > Tested-by: Michael Olbrich <m.olbrich@xxxxxxxxxxxxxx>
> > Signed-off-by: Lars-Peter Clausen <lars@xxxxxxxxxx>
> > Signed-off-by: Alexandru Ardelean <alexandru.ardelean@xxxxxxxxxx>
> > ---
> >
> > * Added Michael Olbrich's Tested-by tag
> > https://lore.kernel.org/linux-usb/20191112144108.GA1859@xxxxxxxxxxxxxx/
> >
> > drivers/usb/dwc3/gadget.c | 7 ++++++-
> > 1 file changed, 6 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
> > index 1b8014ab0b25..22a78eb41a5b 100644
> > --- a/drivers/usb/dwc3/gadget.c
> > +++ b/drivers/usb/dwc3/gadget.c
> > @@ -177,7 +177,7 @@ static void dwc3_gadget_del_and_unmap_request(struct dwc3_ep *dep,
> > {
> > struct dwc3 *dwc = dep->dwc;
> >
> > - list_del(&req->list);
> > + list_del_init(&req->list);
>
> this should *not* be necessary. Neither the INIT_LIST_HEAD() below.
>
> > req->remaining = 0;
> > req->needs_extra_trb = false;
> >
> > @@ -847,6 +847,7 @@ static struct usb_request *dwc3_gadget_ep_alloc_request(struct usb_ep *ep,
> > req->epnum = dep->number;
> > req->dep = dep;
> > req->status = DWC3_REQUEST_STATUS_UNKNOWN;
> > + INIT_LIST_HEAD(&req->list);
> >
> > trace_dwc3_alloc_request(req);
> >
> > @@ -1549,6 +1550,10 @@ static int dwc3_gadget_ep_dequeue(struct usb_ep *ep,
> >
> > spin_lock_irqsave(&dwc->lock, flags);
> >
> > + /* Not queued, nothing to do */
> > + if (list_empty(&req->list))
> > + goto out0;
>
> The loop below is actually looking for the request in our lists. You
> just made the entire loop below unnecessary, but you didn't change it
> accordingly. Moreover, I think that a user dequeueing a request that
> wasn't queued for the current endpoint indicates a possible bug in the
> gadget driver which needs to be fixed.
>

Yeah, that could be.
Will see about reverting the patch on our end, and trying to track this again.

Thanks
Alex

> If you really disagree, suffice to change "ret = -EINVAL;" to "ret =
> 0;" and you would get what you want, without any of the extra cruft.
>
> cheers
>
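
The alternative suggested in the review (keep the lookup loop and only change
what is returned when the request is not found) could look roughly like the
userspace sketch below. It is an assumption-laden illustration, not the dwc3
implementation: struct node, dequeue_by_search() and its not_found_ret
parameter are hypothetical and exist only to show both return policies side
by side.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Minimal circular doubly linked list node (hypothetical stand-in). */
struct node {
	struct node *next, *prev;
};

static void node_init(struct node *n)
{
	n->next = n;
	n->prev = n;
}

static void node_add_tail(struct node *n, struct node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Plain unlink, without re-initializing the removed node. */
static void node_del(struct node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->next = NULL;
	n->prev = NULL;
}

/*
 * Walk the endpoint's queue looking for the target. If it is found, unlink
 * it and return 0. If it is not found, return not_found_ret: -EINVAL treats
 * the call as a caller bug, 0 treats it as a harmless no-op.
 */
static int dequeue_by_search(struct node *head, struct node *target,
			     int not_found_ret)
{
	struct node *pos;

	for (pos = head->next; pos != head; pos = pos->next) {
		if (pos == target) {
			node_del(pos);
			return 0;
		}
	}
	return not_found_ret;
}

/* Small usage example covering both policies. */
static void search_selfcheck(void)
{
	struct node head, queued, stray;

	node_init(&head);
	node_init(&queued);
	node_init(&stray);
	node_add_tail(&queued, &head);

	assert(dequeue_by_search(&head, &stray, -EINVAL) == -EINVAL);
	assert(dequeue_by_search(&head, &stray, 0) == 0);
	assert(dequeue_by_search(&head, &queued, -EINVAL) == 0);
	assert(head.next == &head);	/* queue is empty again */
}
```

Because the search is scoped to one queue, this variant also covers a request
that is linked on a different endpoint's list, which a list_empty() check on
the request alone would not distinguish.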