Re: [PATCH 00/11] of: Fix DMA configuration for non-DT masters

From: Rob Herring
Date: Wed Sep 25 2019 - 17:33:39 EST


On Wed, Sep 25, 2019 at 11:52 AM Robin Murphy <robin.murphy@xxxxxxx> wrote:
>
> On 25/09/2019 17:16, Rob Herring wrote:
> > On Wed, Sep 25, 2019 at 10:30 AM Nicolas Saenz Julienne
> > <nsaenzjulienne@xxxxxxx> wrote:
> >>
> >> On Wed, 2019-09-25 at 16:09 +0100, Robin Murphy wrote:
> >>> On 25/09/2019 15:52, Nicolas Saenz Julienne wrote:
> >>>> On Tue, 2019-09-24 at 16:59 -0500, Rob Herring wrote:
> >>>>> On Tue, Sep 24, 2019 at 1:12 PM Nicolas Saenz Julienne
> >>>>> <nsaenzjulienne@xxxxxxx> wrote:
> >>>>>> Hi All,
> >>>>>> this series tries to address one of the issues blocking us from
> >>>>>> upstreaming Broadcom's STB PCIe controller[1]. Namely, the fact that
> >>>>>> devices not represented in DT which sit behind a PCI bus fail to get the
> >>>>>> bus' DMA addressing constraints.
> >>>>>>
> >>>>>> This is due to the fact that of_dma_configure() assumes it's receiving a
> >>>>>> DT node representing the device being configured, as opposed to the PCIe
> >>>>>> bridge node we currently pass. This causes the code to directly jump
> >>>>>> into PCI's parent node when checking for 'dma-ranges' and misses
> >>>>>> whatever was set there.
> >>>>>>
> >>>>>> To address this I create a new API in OF - inspired from Robin Murphys
> >>>>>> original proposal[2] - which accepts a bus DT node as it's input in
> >>>>>> order to configure a device's DMA constraints. The changes go deep into
> >>>>>> of/address.c's implementation, as a device being having a DT node
> >>>>>> assumption was pretty strong.
> >>>>>>
> >>>>>> On top of this work, I also cleaned up of_dma_configure() removing its
> >>>>>> redundant arguments and creating an alternative function for the special
> >>>>>> cases
> >>>>>> not applicable to either the above case or the default usage.
> >>>>>>
> >>>>>> IMO the resulting functions are more explicit. They will probably
> >>>>>> surface some hacky usages that can be properly fixed as I show with the
> >>>>>> DT fixes on the Layerscape platform.
> >>>>>>
> >>>>>> This was also tested on a Raspberry Pi 4 with a custom PCIe driver and
> >>>>>> on a Seattle AMD board.
> >>>>>
> >>>>> Humm, I've been working on this issue too. Looks similar though yours
> >>>>> has a lot more churn and there's some other bugs I've found.
> >>>>
> >>>> That's good news, and yes now that I see it, some stuff on my series is
> >>>> overly
> >>>> complicated. Specially around of_translate_*().
> >>>>
> >>>> On top of that, you removed in of_dma_get_range():
> >>>>
> >>>> - /*
> >>>> - * At least empty ranges has to be defined for parent node if
> >>>> - * DMA is supported
> >>>> - */
> >>>> - if (!ranges)
> >>>> - break;
> >>>>
> >>>> Which I assumed was bound to the standard and makes things easier.
> >>>>
> >>>>> Can you test out this branch[1]. I don't have any h/w needing this,
> >>>>> but wrote a unittest and tested with modified QEMU.
> >>>>
> >>>> I reviewed everything, I did find a minor issue, see the patch attached.
> >>>
> >>> WRT that patch, the original intent of "force_dma" was purely to
> >>> consider a device DMA-capable regardless of the presence of
> >>> "dma-ranges". Expecting of_dma_configure() to do anything for a non-OF
> >>> device has always been bogus - magic paravirt devices which appear out
> >>> of nowhere and expect to be treated as genuine DMA masters are a
> >>> separate problem that we haven't really approached yet.
> >>
> >> I agree it's clearly abusing the function. I have no problem with the behaviour
> >> change if it's OK with you.
>
> Thinking about it, you could probably just remove that call from the Xen
> DRM driver now anyway - since the dma-direct rework, we lost the ability
> to set dma_dummy_ops by default, and NULL ops now represent what it
> (presumably) wants.

Not xen_dma_ops? In any case, I'll send out a patch for the the Xen
folks to comment on.

> >> Robin, have you looked into supporting multiple dma-ranges? It's the next thing
> >> we need for BCM STB's PCIe. I'll have a go at it myself if nothing is in the
> >> works already.
> >
> > Multiple dma-ranges as far as configuring inbound windows should work
> > already other than the bug when there's any parent translation. But if
> > you mean supporting multiple DMA offsets and masks per device in the
> > DMA API, there's nothing in the works yet.
>
> There's also the in-between step of making of_dma_get_range() return a
> size based on all the dma-ranges entries rather than only the first one
> - otherwise, something like [1] can lead to pretty unworkable default
> masks. We implemented that when doing acpi_dma_get_range(), it's just
> that the OF counterpart never caught up.

Right. I suppose we assume any holes in the ranges are addressable by
the device but won't get used for other reasons (such as no memory
there). However, to be correct, the range of the dma offset plus mask
would need to be within the min start and max end addresses. IOW,
while we need to round up (0xa_8000_0000 - 0x2c1c_0000) to the next
power of 2, the 'correct' thing to do is round down.

Rob

> [1]
> http://linux-arm.org/git?p=linux-rm.git;a=commitdiff;h=a2814af56b3486c2985a95540a88d8f9fa3a699f