Re: [PATCH 0/2] jump label: 2.6.38 updates

From: Mathieu Desnoyers
Date: Mon Feb 14 2011 - 18:29:59 EST


[ added Segher Boessenkool and Paul Mackerras to CC list ]

* Paul E. McKenney (paulmck@xxxxxxxxxxxxxxxxxx) wrote:
> On Mon, Feb 14, 2011 at 06:03:01PM -0500, Mathieu Desnoyers wrote:
> > * Matt Fleming (matt@xxxxxxxxxxxxxxxxx) wrote:
> > > On Mon, 14 Feb 2011 13:46:00 -0800 (PST)
> > > David Miller <davem@xxxxxxxxxxxxx> wrote:
> > >
> > > > From: Steven Rostedt <rostedt@xxxxxxxxxxx>
> > > > Date: Mon, 14 Feb 2011 16:39:36 -0500
> > > >
> > > > > Thus it is not about global, as global is updated by normal means
> > > > > and will update the caches. atomic_t is updated via the ll/sc that
> > > > > ignores the cache and causes all this to break down. IOW... broken
> > > > > hardware ;)
> > > >
> > > > I don't see how cache coherency can possibly work if the hardware
> > > > behaves this way.
> > >
> > > Cache coherency is still maintained provided writes/reads both go
> > > through the cache ;-)
> > >
> > > The problem is that for read-modify-write operations the arbitration
> > > logic that decides who "wins" and is allowed to actually perform the
> > > write, assuming two or more CPUs are competing for a single memory
> > > address, is not implemented in the cache controller, I think. I'm not a
> > > hardware engineer and I never understood how the arbitration logic
> > > worked but I'm guessing that's the reason that the ll/sc instructions
> > > bypass the cache.
> > >
> > > Which is why the atomic_t functions worked out really well for that
> > > arch, such that any accesses to an atomic_t * had to go through the
> > > wrapper functions.
>
> ???
>
> What CPU family are we talking about here? For cache coherent CPUs,
> cache coherence really is supposed to work, even for mixed atomic and
> non-atomic instructions to the same variable.
>

I'm also really curious to know which CPU families we are talking
about. I used git blame to see where these lwz/stw instructions were
added to powerpc, and it points to:

commit 9f0cbea0d8cc47801b853d3c61d0e17475b0cc89
Author: Segher Boessenkool <segher@xxxxxxxxxxxxxxxxxxx>
Date: Sat Aug 11 10:15:30 2007 +1000

    [POWERPC] Implement atomic{, 64}_{read, write}() without volatile

    Instead, use asm() like all other atomic operations already do.

    Also use inline functions instead of macros; this actually
    improves code generation (some code becomes a little smaller,
    probably because of improved alias information -- just a few
    hundred bytes total on a default kernel build, nothing shocking).

    Signed-off-by: Segher Boessenkool <segher@xxxxxxxxxxxxxxxxxxx>
    Signed-off-by: Paul Mackerras <paulus@xxxxxxxxx>

So let's ping the relevant people to see if there was any reason for
making these atomic read/set operations different from other
architectures in the first place.
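
For anyone following along, here is a rough sketch of the two styles the
commit message contrasts: a macro built on a volatile access versus an
inline function that issues the load/store through asm(). It is an
approximation written from memory, not the exact kernel source. The
*_volatile and *_sketch names, the simplified atomic_t, the non-powerpc
fallback, and the demo function are all assumptions made so that the
example builds on any target.

```c
#include <assert.h>

/* Simplified stand-in for the kernel's atomic_t (assumption). */
typedef struct { int counter; } atomic_t;

/* Macro style: relies on a volatile-qualified access. */
#define atomic_read_volatile(v)   (*(volatile int *)&(v)->counter)
#define atomic_set_volatile(v, i) (*(volatile int *)&(v)->counter = (i))

/*
 * Inline-function style: on powerpc the access is a single lwz/stw
 * emitted through asm(). The asm operands are recalled from memory and
 * may differ from the real commit.
 */
static inline int atomic_read_sketch(const atomic_t *v)
{
	int t;
#if defined(__powerpc__) && defined(__GNUC__)
	__asm__ __volatile__("lwz%U1%X1 %0,%1" : "=r"(t) : "m"(v->counter));
#else
	/* Fallback (assumption) so the sketch compiles on other targets. */
	t = *(const volatile int *)&v->counter;
#endif
	return t;
}

static inline void atomic_set_sketch(atomic_t *v, int i)
{
#if defined(__powerpc__) && defined(__GNUC__)
	__asm__ __volatile__("stw%U0%X0 %1,%0" : "=m"(v->counter) : "r"(i));
#else
	/* Fallback (assumption) so the sketch compiles on other targets. */
	*(volatile int *)&v->counter = i;
#endif
}

/* Hypothetical demo: both styles observe each other's stores. */
static int atomic_sketch_demo(void)
{
	atomic_t a = { 0 };

	atomic_set_sketch(&a, 42);
	assert(atomic_read_volatile(&a) == 42);

	atomic_set_volatile(&a, 7);
	assert(atomic_read_sketch(&a) == 7);

	return atomic_read_sketch(&a);
}
```

Note that lwz and stw are ordinary load and store instructions, not the
lwarx/stwcx. reservation pair used by the read-modify-write atomics, so
on a cache coherent system they should behave like any other access.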

Thanks,

Mathieu

--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com