Re: [PATCH v8 0/5] Re-introduce TX FIFO resize for larger EP bursting

From: Wesley Cheng
Date: Fri May 28 2021 - 20:40:21 EST


Hi Felipe,

Sorry for the ping, but was just wondering if you had any feedback on
the latest txfifo resize patch series? I think I addressed the concerns
you had about making sure we had enough FIFO size for a composition
before allowing the configuration to bind with the check_config() API.
It would ensure at least enough room for 1 max packet size for each EP
in a configuration before allowing the bind to complete.

That way we'd avoid being enumerated w/ the host, and having
non-functioning endpoints. We've been testing these changes internally,
and they are providing a pretty significant boost to our USB throughput
numbers.

Thanks
Wesley Cheng

On 5/19/2021 12:43 AM, Wesley Cheng wrote:
> Changes in V8:
> - Rebased to usb-testing
> - Using devm_kzalloc for adding txfifo property in dwc3-qcom
> - Removed DWC3 QCOM ACPI property for enabling the txfifo resize
>
> Changes in V7:
> - Added a new property tx-fifo-max-num for limiting how much fifo space the
> resizing logic can allocate for endpoints with large burst values. This
> can differ across platforms, and tie in closely with overall system latency.
> - Added recommended checks for DWC32.
> - Added changes to set the tx-fifo-resize property from dwc3-qcom by default
> instead of modifying the current DTSI files.
> - Added comments on all APIs/variables introduced.
> - Updated the DWC3 YAML to include a better description of the tx-fifo-resize
> property and added an entry for tx-fifo-max-num.
>
> Changes in V6:
> - Rebased patches to usb-testing.
> - Renamed to PATCH series instead of RFC.
> - Checking for fs_descriptors instead of ss_descriptors for determining the
> endpoint count for a particular configuration.
> - Re-ordered patch series to fix patch dependencies.
>
> Changes in V5:
> - Added check_config() logic, which is used to communicate the number of EPs
> used in a particular configuration. Based on this, the DWC3 gadget driver
> has the ability to know the maximum number of eps utilized in all configs.
> This helps reduce unnecessary allocation to unused eps, and will catch fifo
> allocation issues at bind() time.
> - Fixed variable declaration to single line per variable, and reverse xmas.
> - Created a helper for fifo clearing, which is used by ep0.c
>
> Changes in V4:
> - Removed struct dwc3* as an argument for dwc3_gadget_resize_tx_fifos()
> - Removed WARN_ON(1) in case we run out of fifo space
>
> Changes in V3:
> - Removed "Reviewed-by" tags
> - Renamed series back to RFC
> - Modified logic to ensure that fifo_size is reset if we pass the minimum
> threshold. Tested with binding multiple FDs requesting 6 FIFOs.
>
> Changes in V2:
> - Modified TXFIFO resizing logic to ensure that each EP is reserved a
> FIFO.
> - Removed dev_dbg() prints and fixed typos from patches
> - Added some more description on the dt-bindings commit message
>
> Currently, there is no functionality to allow for resizing the TXFIFOs, and
> relying on the HW default setting for the TXFIFO depth. In most cases, the
> HW default is probably sufficient, but for USB compositions that contain
> multiple functions that require EP bursting, the default settings
> might not be enough. Also to note, the current SW will assign an EP to a
> function driver w/o checking to see if the TXFIFO size for that particular
> EP is large enough. (this is a problem if there are multiple HW defined
> values for the TXFIFO size)
>
> It is mentioned in the SNPS databook that a minimum of TX FIFO depth = 3
> is required for an EP that supports bursting. Otherwise, there may be
> frequent occurences of bursts ending. For high bandwidth functions,
> such as data tethering (protocols that support data aggregation), mass
> storage, and media transfer protocol (over FFS), the bMaxBurst value can be
> large, and a bigger TXFIFO depth may prove to be beneficial in terms of USB
> throughput. (which can be associated to system access latency, etc...) It
> allows for a more consistent burst of traffic, w/o any interruptions, as
> data is readily available in the FIFO.
>
> With testing done using the mass storage function driver, the results show
> that with a larger TXFIFO depth, the bandwidth increased significantly.
>
> Test Parameters:
> - Platform: Qualcomm SM8150
> - bMaxBurst = 6
> - USB req size = 256kB
> - Num of USB reqs = 16
> - USB Speed = Super-Speed
> - Function Driver: Mass Storage (w/ ramdisk)
> - Test Application: CrystalDiskMark
>
> Results:
>
> TXFIFO Depth = 3 max packets
>
> Test Case | Data Size | AVG tput (in MB/s)
> -------------------------------------------
> Sequential|1 GB x |
> Read |9 loops | 193.60
> | | 195.86
> | | 184.77
> | | 193.60
> -------------------------------------------
>
> TXFIFO Depth = 6 max packets
>
> Test Case | Data Size | AVG tput (in MB/s)
> -------------------------------------------
> Sequential|1 GB x |
> Read |9 loops | 287.35
> | | 304.94
> | | 289.64
> | | 293.61
> -------------------------------------------
>
> Wesley Cheng (5):
> usb: gadget: udc: core: Introduce check_config to verify USB
> configuration
> usb: gadget: configfs: Check USB configuration before adding
> usb: dwc3: Resize TX FIFOs to meet EP bursting requirements
> usb: dwc3: dwc3-qcom: Enable tx-fifo-resize property by default
> arm64: boot: dts: qcom: sm8150: Enable dynamic TX FIFO resize logic
>
> arch/arm64/boot/dts/qcom/sm8150.dtsi | 1 +
> drivers/usb/dwc3/core.c | 9 ++
> drivers/usb/dwc3/core.h | 15 +++
> drivers/usb/dwc3/dwc3-qcom.c | 9 ++
> drivers/usb/dwc3/ep0.c | 2 +
> drivers/usb/dwc3/gadget.c | 212 +++++++++++++++++++++++++++++++++++
> drivers/usb/gadget/configfs.c | 22 ++++
> drivers/usb/gadget/udc/core.c | 25 +++++
> include/linux/usb/gadget.h | 5 +
> 9 files changed, 300 insertions(+)
>

--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project