Re: I/O memory barriers vs SMP memory barriers

From: Paul E. McKenney
Date: Wed Mar 28 2007 - 10:28:10 EST


On Mon, Mar 26, 2007 at 10:46:39AM +0200, Lennert Buytenhek wrote:
> On Sun, Mar 25, 2007 at 08:24:18PM -0700, Paul E. McKenney wrote:
>
> > > > > > [ background: On ARM, SMP synchronisation does need barriers but device
> > > > > > synchronisation does not. The question is that given this, whether
> > > > > > mb() and friends can be NOPs on ARM or not (i.e. whether mb() is
> > > > > > supposed to sync against other CPUs or not, or whether only smp_mb()
> > > > > > can be used for this.) ]
> > > > >
> > > > > Hmmmm...
> > > > >
> > > > > [snip]
> > > >
> > > > 3. Orders memory accesses and device accesses, but not necessarily
> > > > the union of the two -- mb(), rmb(), wmb().
> > >
> > > If mb/rmb/wmb are required to order normal memory accesses, that means
> > > that the change made in commit 9623b3732d11b0a18d9af3419f680d27ea24b014
> > > to always define mb/rmb/wmb as barrier() on ARM systems was wrong.
> >
> > This was on UP ARM systems, right?
>
> No.
>
> If you look at commit 9623b3732d11b0a18d9af3419f680d27ea24b014, you can
> see that it defines mb/rmb/wmb as barrier() on both ARM UP and SMP systems.
> The UP part is obviously fine, the SMP part is what is under debate here.

Yep, looks wrong to me.

> > Assuming that ARM CPUs respect the usual CPU-self-consistency
> > semantics, and given the background that device accesses are ordered,
> > then it might well be OK to have mb/rmb/wmb be barrier() on UP ARM
> > systems.
> >
> > Most likely not on SMP ARM systems, however.
>
> Given the semantics above, mb/rmb/wmb can obviously be just barrier()s
> on ARM UP systems.. I don't think anyone ever disagreed about that.

Good.

> > > Does everybody agree on these semantics, though? At least David
> > > seems to think that mb/rmb/wmb aren't required to order normal
> > > memory accesses against each other..
> >
> > Not on UP. On SMP, ordering is (almost certainly) required.
>
> 'almost certainly'? That sounds like there is a possibility that it
> wouldn't have to? What does this depend on?

The underlying memory model of the CPU. For sequentially consistent
systems, only compiler barriers are required. There are very few such
systems (MIPS and PA-RISC, if I remember correctly), because performance
considerations push most CPU designs toward weaker ordering.

I believe that MIPS is -not- sequentially consistent, but have not yet
purchased an architecture reference manual.
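
To make the dependence on the memory model concrete, here is a toy,
single-threaded simulation of the classic "store buffering" case. It is
only a sketch: the one-entry store buffer, the struct cpu type and the
cpu_store()/cpu_load()/cpu_drain() helpers are all hypothetical and do
not model any real CPU. A full barrier is modelled as draining the
local store buffer before the following load.

```c
#include <stdbool.h>

enum { X = 0, Y = 1, NVARS = 2 };

/* Hypothetical CPU model: one pending store may sit in a buffer. */
struct cpu {
    bool pending;   /* true while a store has not yet reached memory */
    int addr;
    int val;
};

static int memory[NVARS];   /* shared memory seen by both CPUs */

/* Make the buffered store (if any) visible in shared memory. */
static void cpu_drain(struct cpu *c)
{
    if (c->pending) {
        memory[c->addr] = c->val;
        c->pending = false;
    }
}

/* A store goes into the buffer, not directly to memory. */
static void cpu_store(struct cpu *c, int addr, int val)
{
    cpu_drain(c);           /* one-entry buffer: retire the older store */
    c->pending = true;
    c->addr = addr;
    c->val = val;
}

/* A load sees this CPU's own buffered store, otherwise shared memory. */
static int cpu_load(const struct cpu *c, int addr)
{
    if (c->pending && c->addr == addr)
        return c->val;
    return memory[addr];
}

/*
 * CPU0: x = 1; [mb]; r0 = y      CPU1: y = 1; [mb]; r1 = x
 * Returns true if both loads returned 0, an outcome that sequential
 * consistency forbids.
 */
static bool store_buffering_both_zero(bool use_mb)
{
    struct cpu cpu0 = {0}, cpu1 = {0};
    int r0, r1;

    memory[X] = memory[Y] = 0;
    cpu_store(&cpu0, X, 1);
    cpu_store(&cpu1, Y, 1);
    if (use_mb)
        cpu_drain(&cpu0);   /* stand-in for a full barrier on CPU0 */
    r0 = cpu_load(&cpu0, Y);
    if (use_mb)
        cpu_drain(&cpu1);   /* stand-in for a full barrier on CPU1 */
    r1 = cpu_load(&cpu1, X);
    return r0 == 0 && r1 == 0;
}
```

In this toy model the forbidden outcome shows up without the barrier
and goes away with it. A sequentially consistent machine would behave
as if there were no store buffer at all, which is why a compiler
barrier alone would be enough there.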

> At least David and Catalin seem to disagree with the statement
> that mb/rmb/wmb should order accesses from different CPUs. And
> memory-barriers.txt is pretty vague about this..

mb() needs to do everything that smp_mb() does, ditto for rmb() and
wmb(). There really are cases where both I/O and memory accesses
need to be ordered, so just providing separate memory ordering and
I/O ordering is not enough.
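
A typical case is a driver that fills in a descriptor in normal memory
and then pokes a device register to tell the device to go fetch it.
The following is a userspace sketch of where the barrier sits, not
kernel code: the macros are stand-ins built on C11 fences, and struct
desc, ring, doorbell, post_buffer() and fake_device_poll() are made-up
names for illustration. A real driver would write the doorbell through
an MMIO accessor rather than a volatile variable.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Userspace stand-ins (assumptions), not the kernel definitions. */
#define barrier() atomic_signal_fence(memory_order_seq_cst)  /* compiler only */
#define mb()      atomic_thread_fence(memory_order_seq_cst)  /* full fence */
#define wmb()     mb()                                       /* conservative */

struct desc {
    uint32_t addr;
    uint32_t len;
};

static struct desc ring[4];         /* normal memory, fetched by the device */
static volatile uint32_t doorbell;  /* stand-in for an MMIO register */

/* Publish a descriptor, then tell the device about it. */
static void post_buffer(unsigned int idx, uint32_t addr, uint32_t len)
{
    ring[idx].addr = addr;
    ring[idx].len = len;

    /*
     * The memory writes above must be visible before the I/O write
     * below. If wmb() were only barrier(), the compiler would keep the
     * order but the hardware would be free to reorder the stores.
     */
    wmb();

    doorbell = idx + 1;             /* 0 means "nothing posted" */
}

/* Pretend device: read the doorbell, then fetch that descriptor's length. */
static uint32_t fake_device_poll(void)
{
    uint32_t db = doorbell;

    if (db == 0)
        return 0;
    return ring[db - 1].len;
}
```

Running single-threaded, this cannot demonstrate a reordering. It only
shows that the barrier in question separates a memory access from an
I/O access, which neither a memory-only nor an I/O-only primitive
would cover.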

Given that ARM device drivers are accessing MMIO locations, which are
often slow anyway, how much is ARM really gaining by dropping memory
barriers when only I/O accesses need be ordered? Is it measurable?
If not, there is no point in adding yet another set of combinatorial
choices to the memory-barrier API.

Thanx, Paul