Re: [locks] 6d390e4b5d: will-it-scale.per_process_ops -96.6% regression

From: Linus Torvalds
Date: Tue Mar 10 2020 - 17:47:58 EST


On Tue, Mar 10, 2020 at 2:22 PM NeilBrown <neilb@xxxxxxx> wrote:
>
> A compiler barrier() is probably justified. Memory barriers delay reads
> and expedite writes so they cannot be needed.

That's not at all guaranteed. Weakly ordered memory models can
actually have odd orderings, and not just "writes delayed, reads done
early". Reads may be delayed too, by cache misses, and memory barriers
can thus expedite reads as well (by forcing the missing read to happen
before later non-missing ones).

So don't assume that a memory barrier would only delay reads and
expedite writes. Quite the reverse: assume that there is no ordering
at all unless you impose one with a memory barrier (*).

Linus

(*) It's a bit more complex than that, in that we do assume that
control dependencies end up gating writes, for example, but that
kind of implicit ordering should *not* be what you depend on in
the code unless you're doing some seriously subtle memory ordering
work and comment on it extensively.
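
The advice above, to assume no ordering at all unless a barrier imposes
one, can be illustrated with a minimal userspace sketch. It is not code
from this thread or from the locks patch. It uses C11 fences, which are
only loosely analogous to the kernel's smp_wmb()/smp_rmb(), and the
names payload, ready, writer, reader and run_message_passing are
hypothetical, chosen for the example. The writer publishes data and then
sets a flag; the reader waits for the flag and then reads the data. Both
sides need a barrier: one orders the two stores, the other orders the
two loads.

```c
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical names for illustration; not taken from the kernel. */
static int payload;        /* plain data published by the writer */
static atomic_int ready;   /* flag telling the reader the data is there */

static void *writer(void *arg)
{
    (void)arg;
    payload = 42;
    /* Order the payload store before the flag store.
     * Without this fence nothing orders the two stores. */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&ready, 1, memory_order_relaxed);
    return NULL;
}

static void *reader(void *arg)
{
    int *out = arg;

    /* Busy-wait until the flag is observed. */
    while (!atomic_load_explicit(&ready, memory_order_relaxed))
        ;
    /* Order the flag load before the payload load.
     * Without this fence the payload read may be satisfied early. */
    atomic_thread_fence(memory_order_acquire);
    *out = payload;
    return NULL;
}

/* Runs one writer and one reader; returns the value the reader saw,
 * or -1 if a thread could not be created. */
int run_message_passing(void)
{
    pthread_t w, r;
    int seen = -1;

    payload = 0;
    atomic_store(&ready, 0);

    if (pthread_create(&r, NULL, reader, &seen) != 0)
        return -1;
    if (pthread_create(&w, NULL, writer, NULL) != 0) {
        /* Release the spinning reader before giving up. */
        atomic_store(&ready, 1);
        pthread_join(r, NULL);
        return -1;
    }
    pthread_join(w, NULL);
    pthread_join(r, NULL);
    return seen;
}
```

With both fences in place, the reader is guaranteed to see the payload
written before the flag once it has seen the flag. Dropping either fence
removes that guarantee on a weakly ordered machine, even if the code
happens to keep working on a strongly ordered one. Depending on the
toolchain, building this may require -pthread.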