Re: [patch] x86: improved memory barrier implementation

From: Linus Torvalds
Date: Sat Sep 29 2007 - 12:07:50 EST

Next message: Linus Torvalds: "Re: [rfc][patch] i386: remove comment about barriers"
Previous message: Scott Preece: "Re: [PATCH 1/3] CodingStyle updates"
In reply to: Nick Piggin: "Re: [patch] x86: improved memory barrier implementation"
Next in thread: Nick Piggin: "Re: [patch] x86: improved memory barrier implementation"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Sat, 29 Sep 2007, Nick Piggin wrote:
> >
> > The non-temporal stores should be basically considered to be "IO", not any
> > normal memory operation.
>
> Maybe you're thinking of uncached / WC? Non-temporal stores to cacheable
> RAM apparently can go out of order too, and they are being used in the kernel
> for some things.

I'm really saying that people to a first approximation should think "NT is
an IO (DMA) thing". Whether cached or not. Exactly because they do not
honor the normal memory ordering.

It may be worth noting that "clflush" falls under that heading too - even
if all the actual *writes* were done with totally normal writes, if
anybody does a clflush instruction, that breaks the ordering, and that
turns it to "DMA ordering" again - ie we're not talking about the normal
SMP ordering rules at all.

So all the spinlocks and all the smp_*mb() barriers have never really done
*anything* for those things (in particular, "smp_wmb()" has *always*
ignored them on i386!)

> Likewise for rep stos, apparently.

No. As far as I can tell, the fast string operations are unordered
*within*themselves*, but not wrt the operations around it.

In other words, you cannot depend on the ordering of stores *in* the
memcpy() or memset() when it is implemented by "rep movs/stos" - but that
is 100% equivalent to the fact that you cannot depend on the ordering even
when it isn't - since the "memcpy()" library routine might be copying
memory backwards for all you know!

The Intel memory ordering paper doesn't talk about the fast string
instructions (except to say that the rules it *does* speak about do not
hold), but the regular IA manuals do say (for example):

"Code dependent upon sequential store ordering should not use the
string operations for the entire data structure to be stored. Data and
semaphores should be separated. Order dependent code should use a
discrete semaphore uniquely stored to after any string operations to
allow correctly ordered data to be seen by all processors."

and note how it says you should just store to the semaphore. If you think
about it, that semahore will be involving all the memory ordering
requirements that we *already* depend on, so if a semaphore is sufficient
to order the fast string instruction, then by definition using a spinlock
around them must be the same thing!

In other words, by Intels architecture manual, fast string instructions
cannot escape a "semaphore" - but that means that they cannot escape a
spinlock either (since the two are exactly the same wrt memory ordering
rules! In other words, whenever the Intel docs say "semaphore", think
"mutual exclusion lock", not necessarily the kernel kind of "sleeping
semaphore").

But it might be good to have that explicitly mentioned in the IA memory
ordering thing, so I'll ask the Intel people about that. However, I'd say
that given our *current* documentation, string instructions may be
*internally* out-of-order, but they would not escape a lock.

> But this means they are already at odds with spin_unlock, unless they
> are enclosed with mfences everywhere they are used (of which I think
> most are not). So this is an existing bug in the kernel.

See above. I do not believe that it's an existing bug, but the basic point
that the change to "smp_rmb()" doesn't change our existing rules is true.

Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Next message: Linus Torvalds: "Re: [rfc][patch] i386: remove comment about barriers"
Previous message: Scott Preece: "Re: [PATCH 1/3] CodingStyle updates"
In reply to: Nick Piggin: "Re: [patch] x86: improved memory barrier implementation"
Next in thread: Nick Piggin: "Re: [patch] x86: improved memory barrier implementation"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]