Re: [PATCH 0/4] allow drivers to flush in-flight DMA

From: Grant Grundler
Date: Wed Sep 26 2007 - 02:50:27 EST

[+jejb to cc]

On Tue, Sep 25, 2007 at 04:58:43PM -0700, akepner@xxxxxxx wrote:
> This is a followup to
> Despite Grant's desire for a more elegant solution, there's
> not much new here. I moved the API change from pci.h to
> dma-mapping.h and removed the pci_ prefix from the name.

Thanks - but I don't have a better idea either.
I think you are right to just move forward with this until
someone provides a better API.

> Problem Description
> -------------------
> On Altix, DMA may be reordered within the NUMA interconnect.
> This can be a problem with Infiniband, where DMA to Completion Queues
> allocated in user-space can race with data DMA. This patchset allows
> a driver to associate a user-space memory region with a "dmaflush"
> attribute, so that writes to the memory region flush in-flight DMA,
> preventing the CQ/data race.

Can we define this API to provide the same semantics as the memory
that dma_alloc_coherent() returns?
Did I summarize this correctly?

Defining it in terms of completion queues won't mean much to most folks.
Better to add a description of completion queues to the DMA-API.txt if
necessary. dma_alloc_coherent() API is pretty well understood.

> There are four patches in this set:
> [1/4] dma: add dma_flags_set_dmaflush() to dma interface

Sorry - this feels like a "color of the shed" argument, but isn't
this about a DMA ordering attribute?
"dmaflush" is an action and not an attribute to me.
Is dma_flags_set_coherent() better since it's doing the same thing
as dma_alloc_coherent()?

> [2/4] dma: redefine dma_flags_set_dmaflush() for sn-ia64
> [3/4] dma: document dma_flags_set_dmaflush()

This patch updates Documentation/DMA-mapping.txt. But it's a change to
the generic (not PCI specific) API described in DMA-API.txt.
Can you update that as well please?

Upon reading the "2) Platforms that permit DMA reordering", I think I
have been confusing coherency with ordering. I think I have because DMA
is leaving the "PCI domain", crossing an "unordered domain" (NUMA
interconnect), and then finally hitting the cache coherency "domain"
when it reaches a "far away" memory controller. That's why I've
been thinking of this as a coherency problem.

The description and API use the word "flush" (which is ok I guess) instead
of describing this in terms of enforcing DMA ordering. Any DMA write to the
"strongly ordered" region will cause _all_ inflight DMA to be visible
to cache coherency, thus preserving the illusion of strong DMA ordering.

Does that sound right/better to you too?
I don't have chipset docs and some of this is just trying to rephrase
what I've heard before from former SGI employees.
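
To make the ordering semantics above concrete, here is a toy user-space
model in C. It is only a sketch of my reading of the behavior, not kernel
code and not how the Altix hardware is implemented: every name in it
(toy_fabric, toy_dma_write, and so on) is made up for illustration, and
"delivering the newest write first" is just an assumed stand-in for
reordering in the interconnect.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MEM_WORDS   16
#define MAX_PENDING 8

/* One posted DMA write that has left the device but is not yet visible. */
struct toy_write {
    size_t addr;
    int value;
};

/* Toy "fabric": memory plus writes still in flight (hypothetical model). */
struct toy_fabric {
    int mem[MEM_WORDS];
    bool ordered[MEM_WORDS];  /* true: a write here flushes in-flight DMA */
    struct toy_write pending[MAX_PENDING];
    size_t npending;
};

static void toy_init(struct toy_fabric *f)
{
    for (size_t i = 0; i < MEM_WORDS; i++) {
        f->mem[i] = 0;
        f->ordered[i] = false;
    }
    f->npending = 0;
}

/* Make every in-flight write visible, oldest first. */
static void toy_drain(struct toy_fabric *f)
{
    for (size_t i = 0; i < f->npending; i++)
        f->mem[f->pending[i].addr] = f->pending[i].value;
    f->npending = 0;
}

/*
 * Device issues a DMA write. A write to a "strongly ordered" word first
 * drains everything in flight and then lands; any other write is merely
 * queued and may become visible later, in any order.
 */
static void toy_dma_write(struct toy_fabric *f, size_t addr, int value)
{
    assert(addr < MEM_WORDS);
    if (f->ordered[addr]) {
        toy_drain(f);
        f->mem[addr] = value;
        return;
    }
    assert(f->npending < MAX_PENDING);
    f->pending[f->npending++] = (struct toy_write){ addr, value };
}

/* Assumed reordering: the most recently posted write lands first. */
static void toy_deliver_newest(struct toy_fabric *f)
{
    if (f->npending == 0)
        return;
    struct toy_write w = f->pending[--f->npending];
    f->mem[w.addr] = w.value;
}

/*
 * Data goes to word 0, the completion entry to word 1. Returns whether
 * the data is visible at the moment the completion entry is visible.
 */
static bool toy_data_visible_at_completion(bool cq_ordered)
{
    struct toy_fabric f;

    toy_init(&f);
    f.ordered[1] = cq_ordered;
    toy_dma_write(&f, 0, 42);   /* data DMA */
    toy_dma_write(&f, 1, 1);    /* completion queue entry */
    toy_deliver_newest(&f);     /* fabric reorders whatever is in flight */
    assert(f.mem[1] == 1);      /* CPU now sees the completion */
    return f.mem[0] == 42;
}
```

With the completion word marked ordered, the data is already visible when
the completion shows up; without the mark, the completion can be seen
while the data is still in flight, which is the CQ/data race from the
problem description.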

> [4/4] mthca: allow setting "dmaflush" attribute on user-allocated memory

Besides calling the parameter "dmaflush", it looks fine to me.
(It's either a DMA ordering or coherency attribute depending on how
you want to look at it.)

