Re: More dwc3 gadget issues with adb

From: John Stultz
Date: Wed Apr 22 2020 - 01:09:46 EST


On Tue, Apr 21, 2020 at 9:38 PM John Stultz <john.stultz@xxxxxxxxxx> wrote:
>
> On Thu, Apr 16, 2020 at 1:19 AM Felipe Balbi <balbi@xxxxxxxxxx> wrote:
> > One thing I noticed is that we're missing a giveback on ep1out. Here's a
> > working case:
> >
>
> Hey Felipe,
> So I found some time to dig around on this today and I started
> trying to understand this issue that you've pointed out about missing
> the giveback.
>
> It seems part of the issue is we get to a point where we have some req
> where pending_sgs is more than one.
>
> We call dwc3_prepare_one_trb_sg() on it:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/usb/dwc3/gadget.c?h=v5.7-rc2#n1068
>
> And we process the sg list incrementing req->num_queued_sgs for each one.
>
> then later, dwc3_gadget_ep_cleanup_completed_request() is called on the request:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/usb/dwc3/gadget.c?h=v5.7-rc2#n2522
>
> We call dwc3_gadget_ep_reclaim_trb_sg()
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/usb/dwc3/gadget.c?h=v5.7-rc2#n2470
>
> Where we iterate over the req->sg, ideally decrementing
> num_pending_sgs each time and return.
>
> But back in dwc3_gadget_ep_cleanup_completed_request() and there
> we're hitting the:
> if (!dwc3_gadget_ep_request_completed(req) ||
> req->num_pending_sgs) {
> case which causes us to skip the call to dwc3_gadget_giveback().
>
> Looking as to why the num_pending_sgs is non zero, that's because in
> dwc3_gadget_ep_reclaim_trb_sg we're hitting the case where the trb has
> the DWC3_TRB_CTRL_HWO ctrl flag set, which breaks us out of the loop
> early before we decrement num_pending_sgs.
>
> For that trb, we're setting the HWO flag in __dwc3_prepare_one_trb()
> (called from dwc3_prepare_one_trb_sg() back at the beginning):
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/usb/dwc3/gadget.c?h=v5.7-rc2#n921
>
> I added logic showing every time we set or clear that flag, and it
> seems like we're always setting it but never clearing it. And often
> that's not an issue as we only have one sg entry. But if its set on a
> trb in a request with multiple sgs, that's where it seems to be
> causing the issue.
>
> I'll continue to dig around to try to understand where it might be
> going awry (why we never clear the HWO flag). But figured I'd try to
> explain this much in case it rings any bells to you.

I was looking some more at this and it seems a little odd...

In dwc3_gadget_ep_reclaim_trb_sg():
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/usb/dwc3/gadget.c?h=v5.7-rc2#n2470

The check for (trb->ctrl & DWC3_TRB_CTRL_HWO) which breaks us out of
the loop happens before we call
dwc3_gadget_ep_reclaim_completed_trb():
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/usb/dwc3/gadget.c?h=v5.7-rc2#n2406

Which is what clears the DWC3_TRB_CTRL_HWO flag (outside of
dwc3_gadget_ep_skip_trbs()).

So on a whim I dropped that check, and things go back to working on
HiKey960, no more adb stalls!

Does something like this make sense? It's not causing trouble on
db845c either so far in my testing.

thanks
-john

(sorry gmail will whitespace corrupt this code paste - just want to
communicate what I did clearly, not to apply)
diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
index 4d3c79d90a6e..2a26d33520ce 100644
--- a/drivers/usb/dwc3/gadget.c
+++ b/drivers/usb/dwc3/gadget.c
@@ -2457,9 +2457,6 @@ static int dwc3_gadget_ep_reclaim_trb_sg(struct
dwc3_ep *dep,
for_each_sg(sg, s, pending, i) {
trb = &dep->trb_pool[dep->trb_dequeue];

- if (trb->ctrl & DWC3_TRB_CTRL_HWO)
- break;
-
req->sg = sg_next(s);
req->num_pending_sgs--;