Re: [PATCH] DMA-API: Change dma_declare_coherent_memory() CPU address to phys_addr_t

From: Liviu Dudau
Date: Tue May 06 2014 - 05:00:12 EST


On Mon, May 05, 2014 at 11:42:40PM +0100, Bjorn Helgaas wrote:
> On Fri, May 02, 2014 at 12:18:07PM +0100, Liviu Dudau wrote:
> > On Thu, May 01, 2014 at 08:05:04PM +0100, Bjorn Helgaas wrote:
> > > On Thu, May 1, 2014 at 8:08 AM, James Bottomley
> > > <jbottomley@xxxxxxxxxxxxx> wrote:
> > > > On Wed, 2014-04-30 at 14:33 -0600, Bjorn Helgaas wrote:
> > > >> dma_declare_coherent_memory() takes two addresses for a region of memory: a
> > > >> "bus_addr" and a "device_addr". I think the intent is that "bus_addr" is
> > > >> the physical address a *CPU* would use to access the region, and
> > > >> "device_addr" is the bus address the *device* would use to address the
> > > >> region.
> > > >>
> > > >> Rename "bus_addr" to "phys_addr" and change its type to phys_addr_t.
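
For readers joining here, my reading of the prototype change in the patch
is (interjecting for reference):

	-int dma_declare_coherent_memory(struct device *dev, dma_addr_t bus_addr,
	-				dma_addr_t device_addr, size_t size, int flags);
	+int dma_declare_coherent_memory(struct device *dev, phys_addr_t phys_addr,
	+				dma_addr_t device_addr, size_t size, int flags);
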
> > > >
> > > > Remind me what the difference between phys_addr_t and dma_addr_t is.
> > > >
> > > > I thought phys_addr_t was the maximum address the CPU could reach after
> > > > address translation and dma_addr_t was the maximum physical address any
> > > > bus attached memory mapped devices could appear at. (of course, mostly
> > > > they're the same).
> > >
> > > I assumed phys_addr_t was for physical addresses generated by a CPU
> > > and dma_addr_t was for addresses generated by a device, e.g.,
> > > addresses you would see with a PCI bus analyzer or in a PCI BAR. But
> > > ARCH_DMA_ADDR_T_64BIT does seem more connected to processor-specific
> > > things, e.g., ARM_LPAE, than to device bus properties like "support
> > > 64-bit PCI, not just 32-bit PCI."
> > >
> > > DMA-API-HOWTO.txt contains this:
> > >
> > > ... the definition of the dma_addr_t (which can hold any valid DMA
> > > address for the platform) type which should be used everywhere you
> > > hold a DMA (bus) address returned from the DMA mapping functions.
> > >
> > > and clearly you use a dma_addr_t from dma_alloc_coherent(), etc., to
> > > program the device. It seems silly to have phys_addr_t and dma_addr_t
> > > for two (possibly slightly different) flavors of CPU physical
> > > addresses, and then sort of overload dma_addr_t by also using it to
> > > hold the addresses devices use for DMA.
> >
> > dma_addr_t is a phys_addr_t masked according to the capabilities of the
> > *DMA engine* (not device).

Sorry, I should've said restricted rather than masked.
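
To illustrate what I mean by "restricted": the driver declares what its
engine can address, and the DMA API refuses (or bounces) anything outside
that. A minimal sketch:

	/* This engine can only generate 32-bit addresses, even if the
	 * CPU's phys_addr_t is wider; mappings that cannot satisfy the
	 * mask will fail or go through a bounce buffer. */
	if (dma_set_mask_and_coherent(dev, DMA_BIT_MASK(32)))
		return -EIO;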

>
> I don't quite agree with this. phys_addr_t is used for physical
> addresses generated by a CPU, e.g., ioremap() takes a phys_addr_t and
> returns a void * (a virtual address). A dma_addr_t is an address you
> get from dma_map_page() or a similar interface, and you program into a
> device (or DMA engine if you prefer). You can't generate a dma_addr_t
> simply by masking a phys_addr_t because an IOMMU can make arbitrary
> mappings of bus addresses to physical memory addresses.

Agree, sorry for the confusion I've generated.
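
To make the distinction concrete, a sketch (not from the patch):

	struct page *page = alloc_page(GFP_KERNEL);
	phys_addr_t phys = page_to_phys(page);	/* the CPU's view */
	dma_addr_t handle = dma_map_page(dev, page, 0, PAGE_SIZE,
					 DMA_TO_DEVICE);
	/* With an IOMMU in the path, 'handle' need not bear any fixed
	 * relationship to 'phys', so no masking can derive one from
	 * the other. */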

>
> It is meaningless for a CPU to generate a reference to a dma_addr_t.
> It will work in some simple cases where there's no IOMMU, but it won't
> work in general.
>
> > Quite a lot of PCI devices (all?) now contain
> > a DMA engine, so the distinction is somewhat blurred nowadays.
>
> I'll take your word for the DMA engine/device distinction, but I don't
> see how it's relevant here. Linux doesn't have a generic idea of a
> DMA engine; we just treat it as part of the device capabilities.

Well, ARM has a separate DMA engine that can be used by multiple devices.
Hence for (my) sanity I prefer to keep the distinction clear.
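
For what it's worth, drivers/dma already models this: a client driver
borrows a channel from a shared engine rather than owning the engine.
Roughly (error handling omitted, dma_handle being a dma_addr_t from the
mapping API):

	struct dma_chan *chan = dma_request_slave_channel(dev, "tx");
	struct dma_async_tx_descriptor *desc;

	desc = dmaengine_prep_slave_single(chan, dma_handle, len,
					   DMA_MEM_TO_DEV,
					   DMA_PREP_INTERRUPT);
	dmaengine_submit(desc);
	dma_async_issue_pending(chan);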

>
> > > I'm not a CPU
> > > person, so I don't understand the usefulness of dma_addr_t as a
> > > separate type to hold CPU addresses at which memory-mapped devices can
> > > appear. If that's all it is, why not just use phys_addr_t everywhere
> > > and get rid of dma_addr_t altogether?
> >
> > Because dma_addr_t imposes further restrictions based on what the DMA engine
> > can access. You could have a DMA engine that can only access 32-bit addresses
> > in a system with 40+ bit CPU addresses.
>
> Yes; this is an argument for having separate types for bus addresses
> and CPU addresses. I think dma_addr_t *is* for bus addresses. I
> thought James was suggesting that it was a subset of CPU physical
> address space, and that any valid dma_addr_t is also a valid
> phys_addr_t. That doesn't seem right to me.

I'm actually more inclined towards James' view. To me the DMA engine is a very
specialised CPU that does memcpy(). You can pair it with an IOMMU and get
address translation into a virtual space of any size you want, or you can have
the engine sitting behind a bus, in which case it will use bus addresses. But
its use of addresses is similar to a CPU's (whether you want to call them
phys_addr_t or not is a different discussion).

ARM has a PL330 DMA engine. It's a SoC component and usually isn't attached to
any PCI bus. It uses a system bus called AXI in the same way a CPU would, so
for that device having a *bus* address does not fit the way the hardware is
connected.

My mental picture for this discussion is a hypothetical DMA engine that sits on
a PCI bus but has access to the same system memory the CPU can see (sort of like
a DMA engine inside a host bridge). Would you use a phys_addr_t or a bus address
when programming that device? I would say the former, because the host bridge
knows how to translate it into bus addresses and back.

>
> > > dma_declare_coherent_memory() calls ioremap() on bus_addr, which seems
> > > a clear indication that bus_addr is really a physical address usable
> > > by the CPU. But I do see your point that bus_addr is a CPU address
> > > for a memory-mapped device, and would thus fit into your idea of a
> > > dma_addr_t.
> > >
> > > I think this is the only part of the DMA API that deals with CPU
> > > physical addresses. I suppose we could side-step this question by
> > > having the caller do the ioremap; then dma_declare_coherent_memory()
> > > could just take a CPU virtual address (a void *) and a device address
> > > (a dma_addr_t).
> >
> > And how would that change the meaning of the function?
>
> I think the interesting thing here is that we ioremap() a bus address.
> That doesn't work in general because ioremap() is expecting a CPU
> physical address, not a bus address. We don't have a generic way for
> dma_declare_coherent_memory() to convert a bus address to a CPU
> physical address. The caller knows a little more and probably *could*
> do that, e.g., a PCI driver could use pcibios_bus_to_resource(). If
> we made a generic way to do the conversion, it would definitely be
> nicer to leave the ioremap() inside dma_declare_coherent_memory().

Agree, ioremap()-ing a bus address sounds to me like a bug.
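
For the PCI case, the caller-side conversion you describe could look roughly
like this (a sketch, assuming pcibios_bus_to_resource() behaves as it does
today):

	struct pci_bus_region region = {
		.start = bus_addr,
		.end   = bus_addr + size - 1,
	};
	struct resource res = { .flags = IORESOURCE_MEM };
	void __iomem *mem;

	pcibios_bus_to_resource(pdev->bus, &res, &region);
	mem = ioremap(res.start, size);	/* res.start is now a CPU physical address */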

Best regards,
Liviu

>
> Bjorn

--
====================
| I would like to |
| fix the world, |
| but they're not |
| giving me the |
\ source code! /
---------------
¯\_(ツ)_/¯
