Re: [PATCH] usb: xhci: inherit dma_mask from bus if set correctly

From: Robin Murphy
Date: Fri Apr 26 2019 - 10:03:31 EST


On 26/04/2019 05:46, Pankaj Dubey wrote:

On 4/24/19 4:28 PM, Robin Murphy wrote:
On 24/04/2019 10:05, Pankaj Dubey wrote:

On 4/10/19 4:32 AM, Robin Murphy wrote:
On 2019-04-09 3:56 am, Sriram Dash wrote:
On Tue, Apr 2, 2019 at 9:53 PM Pankaj Dubey <pankaj.dubey@xxxxxxxxxxx> wrote:

On Tue, 2 Apr 2019 at 15:34, Robin Murphy
<robin.murphy@xxxxxxx> wrote:

On 02/04/2019 10:40, Pankaj Dubey wrote:
From: Sriram Dash <sriram.dash@xxxxxxxxxxx>

The xhci forcefully converts the dma_mask to either 64
or 32 and the dma-mask set by the bus is somewhat
ignored. If the platform sets the correct dma_mask,
then respect that.

It's expected for dma_mask to be larger than bus_dma_mask
if the latter is set - conceptually, the device mask
represents what the device is inherently capable of,
while the bus mask represents external interconnect
restrictions which individual drivers should not have to care about. The DMA API backend should take care of
combining the two to find the intersection.

Thanks for the review.

We are dealing here with the xhci platform which inherits
the dma mask of the parent, which is from a controller
device.

When the controller dma mask is set by the platform in DT,
what we observe is, its not getting inherited properly and
the xhci bus is forcing the dma address to be either 32 bit
or 64 bit.

In "drivers/usb/host/xhci-plat.c" we have dma_mask setting
as below:

/* Try to set 64-bit DMA first */ if
(WARN_ON(!sysdev->dma_mask)) /* Platform did not initialize
dma_mask */ ret = dma_coerce_mask_and_coherent(sysdev,
DMA_BIT_MASK(64)); else ret =
dma_set_mask_and_coherent(sysdev, DMA_BIT_MASK(64));

So even if the controller device has set the dma_mask as
per it's configuration in DT, xhci-plat.c will override it
here in else part.

Next it goes to "drivers/usb/host/xhci.c" file, here we
have code as:

/* Set dma_mask and coherent_dma_mask to 64-bits, * if xHC
supports 64-bit addressing */ if
(HCC_64BIT_ADDR(xhci->hcc_params) && !dma_set_mask(dev,
DMA_BIT_MASK(64))) { xhci_dbg(xhci, "Enabling 64-bit DMA
addresses.\n"); dma_set_coherent_mask(dev,
DMA_BIT_MASK(64)); } else { /* * This is to avoid error in
cases where a 32-bit USB * controller is used on a 64-bit
capable system. */ retval = dma_set_mask(dev,
DMA_BIT_MASK(32)); if (retval) return retval; xhci_dbg(xhci, "Enabling 32-bit DMA addresses.\n"); dma_set_coherent_mask(dev, DMA_BIT_MASK(32)); }

So xhci will force the dma_mask to either DMA_BIT_MASK(32)
or DMA_BIT_MASK(64), what if my device needs other than 32
bit or 64 bit dma_mask.

The bus_dma_mask was introduced for a case when the bus
from a device's dma interface may carry fewer address bits.
But apparently, it is the only mask which retains the
original dma addressing from the parent. So basically what
we observe is currently there is no way we can pass
dma_mask greater than 32-bit, from DT via dev->dma_mask or dev->coherent_dma_mask due to below logic in

from "drivers/of/platform.c" we have static struct
platform_device *of_platform_device_create_pdata( struct
device_node *np, const char *bus_id, void *platform_data, struct device *parent) { struct platform_device *dev; ... dev->dev.coherent_dma_mask = DMA_BIT_MASK(32); if
(!dev->dev.dma_mask) dev->dev.dma_mask =
&dev->dev.coherent_dma_mask; ... }

and then in of_dma_configure function in
"drivers/of/device.c" we have..

mask = DMA_BIT_MASK(ilog2(dma_addr + size - 1) + 1); //This
mask computation is going fine and gets mask greater than
32-bit if defined in DT dev->coherent_dma_mask &= mask; //
Here the higher bit [63:32] will get truncated as
coherent_dma_mask is initialized to DMA_BIT_MASK(32) in
platform.c

*dev->dma_mask &= mask; //Same here higher bits will get
truncated /* ...but only set bus mask if we found valid
dma-ranges earlier */ if (!ret) dev->bus_dma_mask = mask;
//Only bus_dma_mask can carry the original mask as
specified in platform DT.

To minimise the impact on existing code, we reused the
bus_dma_mask for finding the dma addressing bits.

Or other way we may need to initialise
dma_mask/coherent_dma_mask as DMA_BIT_MASK(64) in
"drivers/of/platform.c" and let all devices set dma_mask
via DT using "dma-ranges" property or respective platform driver.

Are you seeing an actual problem here, and if so on which
platform? (If the bus mask is set at all then it
wouldn't seem to be the DT PCI issue that I'm still
trying to fix).



We are facing this issue in one of the Samsung's upcoming
SoC where we need dma_mask greater than 32-bit.

Thanks, Pankaj
Robin.

Signed-off-by: Pankaj Dubey <pankaj.dubey@xxxxxxxxxxx> Signed-off-by: Sriram Dash <sriram.dash@xxxxxxxxxxx> --- drivers/usb/host/xhci.c | 10 ++++++++++ 1 file
changed, 10 insertions(+)

diff --git a/drivers/usb/host/xhci.c
b/drivers/usb/host/xhci.c index 005e659..55cf89e
100644 --- a/drivers/usb/host/xhci.c +++
b/drivers/usb/host/xhci.c @@ -5119,6 +5119,16 @@ int
xhci_gen_setup(struct usb_hcd *hcd, xhci_get_quirks_t
get_quirks) dma_set_coherent_mask(dev,
DMA_BIT_MASK(32)); }

+ /* + * A platform may require coherent masks
other than 64/32 bit, and we + * should respect
that. If the firmware has already requested for a +
* dma-range, we inherit the dma_mask presuming the
platform knows + * what it is doing. + */ + +
if (dev->bus_dma_mask) +
dma_set_mask_and_coherent(dev, dev->bus_dma_mask); + xhci_dbg(xhci, "Calling HCD init\n"); /* Initialize HCD
and host controller data structures. */ retval =
xhci_init(hcd);


Hello Robin,

Hope you found the crux of the matter. Any comments on the
same?

Sorry, I never received either of these replies - I've just
happened to notice this thread again by pure chance while
looking at the linux-usb patchwork for something else entirely,
and managed to dredge an mbox off lore.kernel.org to reply to.
Mail is not my area of expertise, but looking at the headers of
the initial patch in my inbox it seems that outlook.com is
doing SPF negotiation with samsung.com, so sending via gmail
(as those replies appear to be) may be failing that and getting
silently discarded (they're not even in my spam quarantine).

From the snippets of code quoted above I don't see anything
obviously wrong, but I'll take a closer look tomorrow. AFAICS
though, if dev->bus_dma_mask is set then dev is probably the
appropriate device for DMA, so I wouldn't expect a problem -
XHCI is inherently a 64-bit device, so its driver *should* be
setting a 64-bit mask in this case. To reiterate, what's the
nature of the DMA issue? Do the mapping operations fail, or do
you actually see transfers going wrong due to address
truncation? Also, which arch is involved here - is it arm64 (as
I seem to have assumed), or something else?

Robin.


Regarding issue in above code snippet, current code sets "dev->dev.coherent_dma_mask" as DMA_BIT_MASK(32) in platform.c (irrespective of architecture) and when we parse "dma-ranges"
property and try to set coherent_dma_mask or dma_mask in
of_dma_configure function in "drivers/of/device.c", even if
"dma-ranges" specified in DT is more than 32-bit, 32-63 bits will
be cleared to zero due to masking done in platform.c.

So effectively we are not able to set dma_mask or
coherent_dma_mask larger than 32-bit mask.

For better or worse, that is the expected and intended behaviour
for the default device masks. Drivers these days are expected to explicitly set their supported mask to replace the default, but
there are still some remaining legacy assumptions that the default
masks are 32-bit, so making them bigger risks subtle breakage, and
that's why of_dma_configure() does the weird things it does.

In that case, for systems supporting masks greater than 32-bit, IMO, they should be able to handle the mask properly via DT. Without disturbing legacy code, this is one solution that can be considered. Requesting you to give your opinion on this, if it is acceptable we
will submit formal patch for this.

diff --git a/drivers/of/device.c b/drivers/of/device.c index
3717f2a..9cc7a28 100644 --- a/drivers/of/device.c +++
b/drivers/of/device.c @@ -151,10 +151,19 @@ int
of_dma_configure(struct device *dev, struct device_node *np, bool
force_dma) mask = DMA_BIT_MASK(ilog2(dma_addr + size - 1) + 1); dev->coherent_dma_mask &= mask; *dev->dma_mask &= mask; - /*
...but only set bus mask if we found valid dma-ranges earlier */ -
if (!ret) - dev->bus_dma_mask = mask;

+ /* + * ...but only set bus mask if we found valid
dma-ranges earlier + * and also, set the coherent_dma_mask and
dma_mask properly + * for busses with size more than 32-bit +
*/ + if (!ret) { + dev->bus_dma_mask = mask; +
if (mask >= (1ULL << 32)) { +
dev->coherent_dma_mask = mask; + *dev->dma_mask
= mask; + } + } coherent =
of_dma_is_coherent(np); dev_dbg(dev, "device is%sdma coherent\n", coherent ? " " : " not ");


For the SoC concerned here is based on arm64, the USB IP (64 bit capable) is connected to a 36-bit Data bus and a 32-bit Control
bus. The 36-bit Data bus is connected to an IOMMU which
translates the address before they are passed on to other blocks.
Here we have IOMMU is capable of 40-bit addressing. But as data
bus is only capable of 36-bit, we need to ensure that IOMMU
translates to address which does not exceed 36-bit.

Without this fix we are observing context fault from IOMMU.

Thanks for the clarification.

Now, to get a workaround to this problem, we are inheriting the bus_dma_mask which is apparently the only one which contains the
36-bit bus mask.

If the bus mask is correctly set to 36 bits, but the DMA API implementation is failing to take that into account and giving you 40-bit DMA addresses, that is a bug in the DMA API implementation,
and it needs to be fixed in that DMA API code, not worked around
in individual drivers.

There are 2 issues here. First being, the 32-bit limitation for the device dma_mask during device registration. You have already
suggested one approach for this to set from driver itself. IMO, this
would require another DT node property addition for any individual
IP. As same driver is being used in another SoC where, it may not
need more than 32-bit/64-bit dma_mask.

For the SoC concerned here, there are multiple IPs which are sharing
the 36-bit DATA bus. If of_dma_configure" takes care of assigning
the dma_mask properly from "dma-ranges" DT property, it would solve
this issue and driver's do not need to set this explicitly either
via hard-coding or through another DT property. For this we suggest
code snippet as given above.

The second problem is the XHCI overrides dma_mask to 32-bit or
64-bit. During the device registration, the DWC3 device get the
default 32-bit dma_mask. This DWC3 serves as the parent device for
XHCI device, which also gets 32-bit dma_mask from inception. There
are 2 places for the xhci-device, where the dma_mask of the
xhci-device is explicitly modified to either 32-bit or 64-bit.

1) In "drivers/usb/host/xhci-plat.c" we have dma_mask setting as
below:

ret = dma_coerce_mask_and_coherent(sysdev, DMA_BIT_MASK(64));

2) In "drivers/usb/host/xhci.c" file, here we have code as:

dma_set_coherent_mask(dev, DMA_BIT_MASK(64));

So, even if the platform sets the dma_mask properly to 36-bit, the
xhci driver overrides it to 32-bit or 64-bit. This can be fixed in
the xhci driver by checking if the dma_mask is already getting set in
the parent driver. For this we have not yet submitted the change, as
in this case of_dma_configure needs to be fixed first.

None of that is a real problem, and none of it needs to be fixed. As I tried to explain before, the bus mask and device masks are *expected* to be different sizes, because they represent different things, and are owned by different levels of the driver model abstraction.

However, the proposed solution (current patch) is leveraging the
dma_mask from bus_dma_mask, which is set from the dma-ranges properly, and this case we don't need any changes in
of_dma_configure.

Is this a 32-bit Arm system, by any chance?

For the SoC concerned here, it is a 64-bit ARM system, which has
many IPs connected via the 36-bit DATA bus.

OK, if it's an arm64 system with an IOMMU, then all your DMA addresses come from here:


static dma_addr_t iommu_dma_alloc_iova(struct iommu_domain *domain,
size_t size, dma_addr_t dma_limit, struct device *dev)
{
...
if (dev->bus_dma_mask)
dma_limit &= dev->bus_dma_mask;
...
}


Most likely something somewhere is passing the wrong device into DMA API calls - *that* is what needs to be fixed.

Of course I can't 100% rule out the possibility that something else is going screwy as well or instead, but either way, start debugging from iommu_dma_alloc_iova() and work backwards until you can see why dma_limit is not getting clamped to the appropriate bus mask.

Robin.