Re: MMIO and gcc re-ordering issue

From: James Bottomley
Date: Tue May 27 2008 - 12:38:18 EST


On Tue, 2008-05-27 at 08:50 -0700, Roland Dreier wrote:
> > Though it's my understanding that at least ia64 does require the
> > explicit barriers anyway, so we are still in a dodgy situation here
> > where it's not clear what drivers should do and we end up with
> > possibly excessive barriers on powerpc where I end up with both
> > the wmb/rmb/mb that were added for ia64 -and- the ones I have in
> > readl/writel to make them look synchronous... Not nice.
>
> ia64 is a disaster with a slightly different ordering problem -- the
> mmiowb() issue. I know Ben knows far too much about this, but for big
> SGI boxes, you sometimes need mmiowb() to avoid problems with driver
> code that does totally sane stuff like
>
> spin_lock(&mmio_lock);
> writel(val1, reg1);
> writel(val2, reg2);
> spin_unlock(&mmio_lock);
>
> If that snippet is called on two CPUs at the same time, then the device
> might see a sequence like
>
> CPU1 -- write reg1
> CPU2 -- write reg1
> CPU1 -- write reg2
> CPU2 -- write reg2
>
> in spite of the fact that everything is totally ordered on the CPUs by
> the spin lock.
>
> The reason this is such a disaster is because the code looks right,
> makes sense, and works fine on 99.99% of all systems out there. So I
> would bet that 99% of our drivers have missing mmiowb() "bugs" -- no one
> has plugged the hardware into an Altix box and cared enough to stress
> test it.
>
> However for the issue at hand, my expectation as a driver writer is that
> readl()/writel() are ordered with respect to MMIO operations, but not
> necessarily with respect to normal writes to coherent CPU memory. And
> I've included explicit wmb()s in code I've written like
> drivers/infiniband/hw/mthca.
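
(For concreteness, the mthca-style pattern being described is roughly the
sketch below -- write the descriptor into coherent DMA memory, order it
with wmb(), then ring the doorbell.  The names are illustrative, not
taken from the actual driver.)

	/* illustrative only -- not the actual mthca code */
	desc->addr = cpu_to_be64(buf_dma);	/* coherent DMA memory */
	desc->len  = cpu_to_be32(buf_len);

	wmb();		/* descriptor must be visible in memory before
			 * the device is told to go and fetch it */

	writel(doorbell_val, dev->regs + DOORBELL);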

Actually, this specifically should not be the case. The need for mmiowb()
on altix arises because altix explicitly violates some of the PCI ordering
rules that would otherwise impede performance. The compromise is that
readX on altix contains the needed DMA flush, but there's a relaxed
variant, readX_relaxed, that doesn't (for drivers that know what they're
doing). The performance-critical altix drivers have all been converted to
use the relaxed form, and the unconverted ones should all operate just
fine (albeit potentially more slowly).
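
For the unconverted drivers, the portable way to close the window in the
snippet you quoted is simply to order the posted writes with mmiowb()
before dropping the lock; a converted driver would additionally use the
relaxed reads on paths where the DMA flush isn't wanted. Rough sketch
only -- the lock and register offsets are made up for illustration:

	spin_lock(&mmio_lock);
	writel(val1, regs + REG1);
	writel(val2, regs + REG2);
	mmiowb();	/* push both posted writes out to the device
			 * before another CPU can take the lock and
			 * issue its own writes */
	spin_unlock(&mmio_lock);

	/* hot path with no DMA to flush: the relaxed read is enough */
	status = readl_relaxed(regs + STATUS_REG);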

You can see all of this in include/asm-ia64/sn/io.h

It is confusing to me that sn_dma_flush() and sn_mmiowb() have different
implementations, but I think both address the spin-lock problem you allude
to by ensuring the outstanding operation has completed before the CPU is
allowed to continue past it.

James

