RE: [PATCH] usb: cdns3: remove fetched trb from cache before dequeuing

From: Pawel Laszczak
Date: Mon Jan 16 2023 - 03:28:45 EST


>
>
>On 23-01-09 05:38:31, Pawel Laszczak wrote:
>> >
>> >On Thu, Nov 17, 2022 at 8:27 PM Pawel Laszczak <pawell@xxxxxxxxxxx>
>> >wrote:
>> >>
>> >> >
>> >> >On Tue, Nov 15, 2022 at 6:01 PM Pawel Laszczak
>> >> ><pawell@xxxxxxxxxxx>
>> >> >wrote:
>> >> >>
>> >> >> After doorbell DMA fetches the TRB. If during dequeuing request
>> >> >> driver changes NORMAL TRB to LINK TRB but doesn't delete it from
>> >> >> controller cache then controller will handle cached TRB and
>> >> >> packet can be
>> >lost.
>> >> >>
>> >> >> The example scenario for this issue looks like:
>> >> >> 1. queue request - set doorbell
>> >> >> 2. dequeue request
>> >> >> 3. send OUT data packet from host 4. Device will accept this
>> >> >> packet which is unexpected 5. queue new request - set doorbell
>> >> >> 6. Device lost the expected packet.
>> >> >>
>> >> >> By setting DFLUSH controller clears DRDY bit and stop DMA transfer.
>> >> >>
>> >> >> Fixes: 7733f6c32e36 ("usb: cdns3: Add Cadence USB3 DRD Driver")
>> >> >> cc: <stable@xxxxxxxxxxxxxxx>
>> >> >> Signed-off-by: Pawel Laszczak <pawell@xxxxxxxxxxx>
>> >> >> ---
>> >> >> drivers/usb/cdns3/cdns3-gadget.c | 12 ++++++++++++
>> >> >> 1 file changed, 12 insertions(+)
>> >> >>
>> >> >> diff --git a/drivers/usb/cdns3/cdns3-gadget.c
>> >> >> b/drivers/usb/cdns3/cdns3-gadget.c
>> >> >> index 5adcb349718c..ccfaebca6faa 100644
>> >> >> --- a/drivers/usb/cdns3/cdns3-gadget.c
>> >> >> +++ b/drivers/usb/cdns3/cdns3-gadget.c
>> >> >> @@ -2614,6 +2614,7 @@ int cdns3_gadget_ep_dequeue(struct usb_ep
>> >*ep,
>> >> >> u8 req_on_hw_ring = 0;
>> >> >> unsigned long flags;
>> >> >> int ret = 0;
>> >> >> + int val;
>> >> >>
>> >> >> if (!ep || !request || !ep->desc)
>> >> >> return -EINVAL;
>> >> >> @@ -2649,6 +2650,13 @@ int cdns3_gadget_ep_dequeue(struct
>usb_ep
>> >> >*ep,
>> >> >>
>> >> >> /* Update ring only if removed request is on pending_req_list list
>*/
>> >> >> if (req_on_hw_ring && link_trb) {
>> >> >> + /* Stop DMA */
>> >> >> + writel(EP_CMD_DFLUSH, &priv_dev->regs->ep_cmd);
>> >> >> +
>> >> >> + /* wait for DFLUSH cleared */
>> >> >> + readl_poll_timeout_atomic(&priv_dev->regs->ep_cmd, val,
>> >> >> + !(val &
>> >> >> + EP_CMD_DFLUSH), 1, 1000);
>> >> >> +
>> >> >> link_trb->buffer =
>> >> >>cpu_to_le32(TRB_BUFFER(priv_ep- trb_pool_dma +
>> >> >> ((priv_req->end_trb + 1) * TRB_SIZE)));
>> >> >> link_trb->control =
>> >> >> cpu_to_le32((le32_to_cpu(link_trb->control) & TRB_CYCLE) | @@
>> >> >>-2660,6
>> >> >> +2668,10 @@ int cdns3_gadget_ep_dequeue(struct usb_ep *ep,
>> >> >>
>> >> >> cdns3_gadget_giveback(priv_ep, priv_req, -ECONNRESET);
>> >> >>
>> >> >> + req = cdns3_next_request(&priv_ep->pending_req_list);
>> >> >> + if (req)
>> >> >> + cdns3_rearm_transfer(priv_ep, 1);
>> >> >> +
>> >> >
>> >> >Why the above changes are needed?
>> >> >
>> >>
>> >> Do you mean the last line or this patch?
>> >>
>> >> Last line:
>> >> DMA is stopped, so driver arm the queued transfers
>> >>
>> >
>> >Sorry, I have been very busy recently, so the response may not be in time.
>> >I mean why it needs to re-arm the transfers after DMA is stopped?
>>
>> Because driver can have queued more transfers. Only one of them are
>> dequeued. In the vast majority of the rest request will be removed in
>> the next steps, but there can be case in which we have queued e.g. 10
>> usb requests and only one of them will be removed. In such case the driver
>can stuck.
>> To avoid this driver, rearm the endpoint if there are other transfer
>> in transfer ring.
>>
>
>Since this logic (re-arm the pending request) is different with current one,
>please test it well to avoid other use cases. After you have fully tested, feel
>free to add my ack:
>
>Acked-by: Peter Chen <peter.chen@xxxxxxxxxx>

I confirm that it works correct with my tests.
Also our customer who raised this issue confirmed that it work on his tests.

Pawel

>
>Peter
>
>> Regards,
>> Pawel
>>
>> >
>> >
>> >> If you means this patch:
>> >> Issue was detected by customer test. I don’t know whether it was
>> >> only test or the real application.
>> >>
>> >> The problem happens because user application queued the transfer
>> >> (endpoint has been armed), so controller fetch the TRB.
>> >> When user application removed this request the TRB was still
>> >> processed by controller. If at that time the host will send data
>> >> packet then controller will accept it, but it shouldn't because the
>> >> usb_request associated with TRB cached by controller was removed.
>> >> To force the controller to drop this TRB DFLUSH is required.
>> >>
>> >> Pawel
>> >>
>> >> >
>> >> >> not_found:
>> >> >> spin_unlock_irqrestore(&priv_dev->lock, flags);
>> >> >> return ret;
>> >> >> --
>> >> >> 2.25.1
>> >> >>
>
>--
>
>Thanks,
>Peter Chen