Re: MMIO and gcc re-ordering issue

From: Benjamin Herrenschmidt
Date: Tue Jun 03 2008 - 00:34:20 EST



> This whole thread also ties in with my posts about mmiowb (which IMO
> should go away).
>
> readl/writel: strongly ordered wrt one another and other stores
> to cacheable RAM, byteswapping
> __readl/__writel: not ordered (needs mb/rmb/wmb to order with
> other readl/writel and cacheable operations, or
> io_*mb to order with one another)
> raw_readl/raw_writel: strongly ordered, no byteswapping
> __raw_readl/__raw_writel: not ordered, no byteswapping
>
> then get rid of *relaxed* variants.

In addition, some archs such as powerpc also provide readl_be/writel_be,
which are defined as big endian (i.e. byteswap on LE archs, no byteswap
on BE archs).
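
To make the byte-order part concrete, here is a minimal user-space
sketch. It is not kernel code: the sketch_* names are hypothetical, the
"register" is just a byte buffer, and only the endianness is modelled,
not the ordering. It assumes readl/writel are little endian on every
host, which is what "byteswapping" above amounts to on a BE arch.

```c
#include <stdint.h>

/* Hypothetical stand-ins for readl()/writel() and the _be variants.
 * They access a plain byte buffer instead of an MMIO mapping and model
 * only the byte order, not any ordering guarantee. */

/* Little-endian 32-bit read: same result on LE and BE hosts. */
static uint32_t sketch_readl(const uint8_t *reg)
{
	return (uint32_t)reg[0] | (uint32_t)reg[1] << 8 |
	       (uint32_t)reg[2] << 16 | (uint32_t)reg[3] << 24;
}

/* Big-endian 32-bit read: same result on LE and BE hosts. */
static uint32_t sketch_readl_be(const uint8_t *reg)
{
	return (uint32_t)reg[3] | (uint32_t)reg[2] << 8 |
	       (uint32_t)reg[1] << 16 | (uint32_t)reg[0] << 24;
}

/* Little-endian 32-bit write: least significant byte goes first. */
static void sketch_writel(uint32_t val, uint8_t *reg)
{
	reg[0] = (uint8_t)val;
	reg[1] = (uint8_t)(val >> 8);
	reg[2] = (uint8_t)(val >> 16);
	reg[3] = (uint8_t)(val >> 24);
}

/* Big-endian 32-bit write: most significant byte goes first. */
static void sketch_writel_be(uint32_t val, uint8_t *reg)
{
	reg[0] = (uint8_t)(val >> 24);
	reg[1] = (uint8_t)(val >> 16);
	reg[2] = (uint8_t)(val >> 8);
	reg[3] = (uint8_t)val;
}
```

Because the accessors work byte by byte, the value seen by the caller
depends only on the accessor chosen, never on the host's endianness.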

As of today, powerpc lacks the raw_readl/raw_writel and __readl/__writel
variants (ie, we only provide fully ordered + byteswap and no ordering +
no byteswap variants).

If we agree on the above semantics, I'll do a patch providing the
missing ones.

> Linus: on x86, memory operations to wc and wc+ memory are not ordered
> with one another, or operations to other memory types (ie. load/load
> and store/store reordering is allowed). Also, as you know, store/load
> reordering is explicitly allowed as well, which covers all memory
> types. So perhaps it is not quite true to say readl/writel is strongly
> ordered by default even on x86. You would have to put in some
> mfence instructions in them to make it so.
>
> So, what *exact* definition are you going to mandate for readl/writel?
> Anything less than strict ordering then we also need to ensure drivers
> use the correct barriers (to implement strict ordering, we could either
> put mfence instructions in, or explicitly disallow readl/writel to be
> used on wc/wc+ memory).

The ordering guarantees that I provide on powerpc for "ordered" variants
are:

- cacheable store + writel stays ordered (e.g. write a DMA descriptor
to memory, then write a register to trigger the DMA).

- readl + cacheable read stays ordered (ie. read some status
register, for example, after an interrupt, and then read the
resulting data in memory).

- all of the above are ordered vs. spin_lock and spin_unlock (with the
exception that stores done before the spin_lock
could potentially leak into the locked region).

- readl is synchronous (i.e. it makes the CPU think the
data was actually used before executing subsequent
instructions, so it waits for the data to come back;
this ensures, for example, that a read used to flush
posted writes, followed by a delay, really does happen
with the right delay).

We don't provide meaningless ones such as writel + cacheable store
(PCI posting would defeat it anyway).
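
The first two guarantees above can be sketched in user space with C11
fences. This is only an illustration under stated assumptions: the
sketch_* names and the fake device are hypothetical, a release fence
stands in for whatever barrier the arch puts before the MMIO store, and
an acquire fence for the one after the MMIO load. It does not model the
spin_lock interaction or the synchronous behaviour of readl.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical device: two "registers" plus ordinary cacheable RAM. */
struct sketch_dev {
	uint32_t desc;              /* DMA descriptor in cacheable RAM */
	uint32_t result;            /* data the device DMAs back to RAM */
	_Atomic uint32_t doorbell;  /* stands in for an MMIO register */
	_Atomic uint32_t status;    /* stands in for an MMIO register */
};

/* "Ordered" write: earlier cacheable stores are visible before it. */
static void sketch_ordered_writel(uint32_t val, _Atomic uint32_t *reg)
{
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(reg, val, memory_order_relaxed);
}

/* "Ordered" read: later cacheable loads cannot be satisfied before it. */
static uint32_t sketch_ordered_readl(_Atomic uint32_t *reg)
{
	uint32_t val = atomic_load_explicit(reg, memory_order_relaxed);

	atomic_thread_fence(memory_order_acquire);
	return val;
}

/* Driver: cacheable store, then writel to trigger the DMA. */
static void sketch_start_dma(struct sketch_dev *dev, uint32_t desc)
{
	dev->desc = desc;
	sketch_ordered_writel(1, &dev->doorbell);
}

/* Driver: readl of the status register, then read the data in memory.
 * Returns 1 and fills *out if the device reported completion. */
static int sketch_fetch_result(struct sketch_dev *dev, uint32_t *out)
{
	if (!(sketch_ordered_readl(&dev->status) & 1))
		return 0;
	*out = dev->result;
	return 1;
}

/* Fake device: on doorbell, "DMA" desc + 1 back and flag completion. */
static void sketch_device_step(struct sketch_dev *dev)
{
	if (sketch_ordered_readl(&dev->doorbell)) {
		dev->result = dev->desc + 1;
		sketch_ordered_writel(1, &dev->status);
	}
}
```

With these accessors the driver code contains no explicit barrier,
which is the point of the "ordered" variants; a driver using the
unordered ones would have to add the barriers itself.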

> The other way we can go is just say that they have x86 semantics,
> although that would be a bit sad IMO: we should have strong ops, in
> which case driver writers never need to use a single barrier provided
> they have locking right, and weak ops, in which case they should match
> up with the weak Linux memory ordering model for system RAM.

Ben.

