Re: MMIO and gcc re-ordering issue

From: Trent Piepho
Date: Tue Jun 03 2008 - 17:45:27 EST


On Tue, 3 Jun 2008, Matthew Wilcox wrote:
On Tue, Jun 03, 2008 at 12:43:21PM -0700, Trent Piepho wrote:
IOW, there are four ways one can define endianness/swapping:
1) Little-endian
2) Big-endian
3) Native-endian aka non-byte-swapping
4) Foreign-endian aka byte-swapping

1 and 2 are by far the most used. Some code wants 3. No one wants 4. Yet
our API is providing 3 & 4, the two which are the least useful.

You've fundamentally misunderstood.

readX/writeX and __readX/__writeX provide little-endian access.
__raw_readX provide native-endian.

If you want 2 or 4, define your own accessors. Some architectures define
other accessors (eg gsc_readX on parisc is native (big) endian, and

How about providing 1 and 2, and if you want 3 or 4 define your own accessors?
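
For reference, the four conventions numbered above can be sketched in plain
userspace C. This is only an illustration of the definitions, not the kernel's
accessors: read_le32(), read_be32(), read_native32(), read_foreign32() and
swab32_sketch() are names made up for this example, and an ordinary byte
buffer stands in for a device register.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers for illustration only; none of these are kernel API. */

/* Reverse the byte order of a 32-bit value. */
static uint32_t swab32_sketch(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* 1) Little-endian: byte 0 is least significant, whatever the CPU is. */
static uint32_t read_le32(const unsigned char *p)
{
    assert(p != NULL);
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* 2) Big-endian: byte 0 is most significant, whatever the CPU is. */
static uint32_t read_be32(const unsigned char *p)
{
    assert(p != NULL);
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* 3) Native-endian: copy the bytes as they are, no swapping. */
static uint32_t read_native32(const unsigned char *p)
{
    uint32_t v;

    assert(p != NULL);
    memcpy(&v, p, sizeof v);
    return v;
}

/* 4) Foreign-endian: always swap relative to the CPU's own order. */
static uint32_t read_foreign32(const unsigned char *p)
{
    return swab32_sketch(read_native32(p));
}
```

On a little-endian CPU, 3 gives the same result as 1 and 4 the same as 2; on
a big-endian CPU it is the other way around. Only 1 and 2 mean the same thing
on every machine.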

Is it enough to provide only "all or none" for ordering strictness? For
instance on powerpc, one can get a speedup by dropping strict ordering for
IO vs cacheable memory, but still keeping ordering for IO vs IO and IO vs
locks. This is much easier to program for than no ordering at all. In
fact, if one doesn't use coherent DMA, it's basically the same as fully
strict ordering.

I don't understand why you keep talking about DMA. Are you talking
about ordering between readX() and DMA? PCI provides those guarantees.

I guess you haven't been reading the whole thread. The reason it started was
because gcc can re-order powerpc (and everyone else's too) IO accesses vs
accesses to cacheable memory (but not spin-locks), which ends up only being a
problem with coherent DMA.
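
A minimal sketch of that pattern, assuming GCC or Clang inline-asm syntax: a
descriptor is filled in with ordinary stores, then a device is told about it
through a volatile store. The struct, the function and the "doorbell" register
are hypothetical and only stand in for a real driver; the macro has the same
body as the kernel's barrier(), a compiler-only barrier that emits no
instruction.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Compiler-only barrier: stops the compiler from moving memory accesses
 * across this point. It does nothing about reordering done by the CPU. */
#define compiler_barrier() __asm__ __volatile__("" ::: "memory")

/* Hypothetical DMA descriptor living in ordinary cacheable memory. */
struct desc_sketch {
    uint32_t addr;
    uint32_t len;
};

/* Hypothetical helper: publish one descriptor, then ring the doorbell. */
static void post_descriptor(struct desc_sketch *ring,
                            volatile uint32_t *doorbell,
                            uint32_t addr, uint32_t len)
{
    assert(ring != NULL && doorbell != NULL);

    /* Plain stores: without a barrier the compiler is free to sink
     * these below the volatile store that follows. */
    ring->addr = addr;
    ring->len = len;

    compiler_barrier();

    /* The volatile store stands in for an MMIO write here. */
    *doorbell = 1;
}
```

Without compiler_barrier(), the volatile qualifier only orders the doorbell
store against other volatile accesses, so the descriptor stores may legally
end up after it. On real hardware a CPU-level barrier such as wmb() may be
needed as well, depending on the architecture; this sketch covers only the
compiler side.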