Re: [PATCH 1/1] Fix: trace sched switch start/stop racy updates

From: Linus Torvalds
Date: Fri Aug 16 2019 - 18:59:28 EST


On Fri, Aug 16, 2019 at 3:27 PM Valentin Schneider
<valentin.schneider@xxxxxxx> wrote:
>
> How would you differentiate optimizations you want from those you don't with
> just a flag? There's a reason we use volatile casts instead of declaring
> everything volatile: we actually *want* those optimizations. It just so
> happens that we don't want them *in some places*, and we have tools to tag
> them as such.

We actually disable lots of "valid" (read: the standard allows them,
but they are completely wrong for the kernel) optimizations because
they are wrong.

The whole type-based alias thing is just wrong. The C standards body
was incompetent to allow that garbage. So we disable it.

If you can *prove* that no aliasing exists, go ahead and re-order
accesses. But no guesses based on random types.
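
As a rough illustration of the aliasing point (a userspace sketch with hypothetical function names, not kernel code): the first function is the kind of code where a type-based assumption changes what the compiler may emit, and the second reinterprets bits through memcpy, which does not depend on aliasing rules at all.

```c
#include <stdint.h>
#include <string.h>

/*
 * With -fstrict-aliasing the compiler may assume that an int * and a
 * float * never point at the same object, purely because the types
 * differ, and fold the final load into "return 1".
 * With -fno-strict-aliasing it has to reload *i after the store to *f.
 */
int store_then_reload(int *i, float *f)
{
    *i = 1;
    *f = 0.0f;
    return *i;
}

/*
 * Reinterpreting bits through memcpy is well-defined with or without
 * the flag. The static assert documents the size assumption.
 */
uint32_t float_bits(float f)
{
    uint32_t u;

    _Static_assert(sizeof(u) == sizeof(f), "float must be 32 bits");
    memcpy(&u, &f, sizeof(u));
    return u;
}
```

The bit pattern returned by float_bits() assumes IEEE-754 single precision, which is an assumption about the platform, not something the C standard guarantees.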

Similarly, if some compiler decides that it's ok to make speculative
writes (knowing it will over-write it with the right data later) to
data that is possibly visible to other threads, then such an
"optimization" needs to just be disabled. It might help some
benchmark, and if you read the standard just the right way it might be
allowed - but that doesn't make it valid.

We already had situations like that, where compiler people thought it
would be ok (for example) to turn a narrow write into a wider
read-modify-write because it had already done the wider read for other
reasons.

Again, the original C standard "allows" that in theory, because the
original C standard doesn't take threading into account. In fact, the
alpha architecture made actively bad design decisions based on that
(incorrect) assumption.

It turns out that in that case, even non-kernel people rebelled, and
it's apparently thankfully not allowed in newer versions of the
standard, exactly because threading has become a thing. You can't
magically write back unrelated variables just because they might be
next-door neighbors and share a word.
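
A minimal sketch of why the widened write is a problem. The struct and function names are hypothetical, and the "other thread" is emulated by an ordinary store placed between the wide read and the wide write, so the result is deterministic instead of depending on a real race.

```c
#include <stdint.h>

/* Two unrelated one-byte fields that happen to share a word. */
struct pair {
    uint8_t a;
    uint8_t b;
};

/* What the source code says: a one-byte store that leaves b alone. */
void set_a_narrow(struct pair *p, uint8_t v)
{
    p->a = v;
}

/*
 * Hand-written emulation of the widened version: read both bytes,
 * patch a, write both bytes back. concurrent_b stands in for another
 * thread updating b inside that window.
 */
void set_a_wide(struct pair *p, uint8_t v, uint8_t concurrent_b)
{
    struct pair snapshot = *p;  /* wide read */

    p->b = concurrent_b;        /* the other thread's update to b */
    snapshot.a = v;
    *p = snapshot;              /* wide write: silently undoes it */
}
```

Starting from { 0, 0 }, set_a_wide(&p, 1, 7) leaves b at 0: the other thread's store of 7 is lost even though the code never mentions b.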

So no, we do *not* in general just say that we want any random
optimizations. A compiler that turns a single write into something
else is almost certainly something that shouldn't be allowed near the
kernel.

We add READ_ONCE and WRITE_ONCE annotations when they make sense. Not
because of some theoretical "compiler is free to do garbage"
arguments. If such garbage happens, we need to fix the compiler, the
same way we already do with

    -fno-strict-aliasing
    -fno-delete-null-pointer-checks
    -fno-strict-overflow

because all those "optimizations" are just fundamentally unsafe and wrong.

I really wish the compiler would never take advantage of "I can prove
this is undefined behavior" kind of things when it comes to the kernel
(or any other projects I am involved with, for that matter). If you
can prove that, then you shouldn't decide to generate random code
without a big warning. But that's what those optimizations that we
disable effectively all do.
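
A generic illustration of the pattern that -fno-delete-null-pointer-checks guards against. The struct and function names are hypothetical; this is a sketch of the class of problem, not code from the kernel tree.

```c
#include <stddef.h>

struct dev {
    int flags;
};

/*
 * The dereference comes before the check. A compiler that reasons
 * "d was dereferenced, so d cannot be NULL" may delete the check
 * entirely instead of warning about it.
 */
int get_flags_buggy(struct dev *d)
{
    int flags = d->flags;

    if (!d)
        return -1;
    return flags;
}

/* Check first, dereference second: nothing for the compiler to "prove". */
int get_flags_fixed(struct dev *d)
{
    if (!d)
        return -1;
    return d->flags;
}
```

Both functions agree for any valid pointer. They only differ in what happens when d is NULL, and that is exactly the case where the buggy version's check may have been optimized away.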

I'd love to have a flag that says "all undefined behavior is treated
as implementation-defined". There's a somewhat subtle - but very
important - difference there.

And that's what some hypothetical speculative write optimizations do
too. I do not believe they are valid for the kernel. If the code says

    if (a)
        global_var = 1
    else
        global_var = 0

then the compiler had better not turn that into

    global_var = 0
    if (a)
        global_var = 1

even if there isn't a volatile there. But yes, we've had compiler
writers that say "if you read the specs, that's ok".

No, it's not ok. Because reality trumps any weasel-spec-reading.

And happily, I don't think we've ever really seen a compiler that we
use that actually does the above kind of speculative write thing (but
doing it for your own local variables that can't be seen by other
threads of execution - go wild).
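
To make the visibility problem concrete, here is a deterministic sketch. Everything except global_var is a hypothetical name, and other_thread_peek() is only a stand-in for a second thread that happens to sample the variable at that instant.

```c
int global_var = 1;
int saw_zero;

/* Stand-in for another thread sampling global_var right now. */
static void other_thread_peek(void)
{
    if (global_var == 0)
        saw_zero = 1;
}

/* The code as written: exactly one store, of the final value. */
void set_as_written(int a)
{
    if (a)
        global_var = 1;
    else
        global_var = 0;
    other_thread_peek();
}

/* Hand-written emulation of the speculative-write transformation. */
void set_speculative(int a)
{
    global_var = 0;
    other_thread_peek();    /* window where the bogus 0 is visible */
    if (a)
        global_var = 1;
    other_thread_peek();
}
```

With global_var already 1 and a nonzero, the version as written never exposes a 0. The transformed version does, even though both end with the same final value.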

So in general, we very much expect the compiler to do sane code
generation, and not (for example) do store tearing on normal
word-sized things or add writes that weren't there originally etc.

And yes, reads are different from writes. Reads don't have the same
kind of "other threads of execution can see them" effects, so a
compiler turning a single read into multiple reads is much more
realistic and not the same kind of "we need to expect a certain kind
of sanity from the compiler" issue.
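
A rough sketch of the read side. The MY_READ_ONCE and MY_WRITE_ONCE macros below are simplified stand-ins built on a volatile cast, not the kernel's actual definitions, and they rely on the GCC/Clang __typeof__ extension. shared_limit and the clamp functions are hypothetical.

```c
/* Simplified stand-ins, NOT the kernel's READ_ONCE/WRITE_ONCE. */
#define MY_READ_ONCE(x)     (*(const volatile __typeof__(x) *)&(x))
#define MY_WRITE_ONCE(x, v) (*(volatile __typeof__(x) *)&(x) = (v))

int shared_limit = 10;

/*
 * Plain accesses: the compiler may load shared_limit once for the
 * comparison and again for the return, so the two can disagree if
 * another thread changes it in between.
 */
int clamp_plain(int v)
{
    if (v > shared_limit)
        return shared_limit;
    return v;
}

/* One volatile load into a local; every later use sees that snapshot. */
int clamp_once(int v)
{
    int limit = MY_READ_ONCE(shared_limit);

    if (v > limit)
        return limit;
    return v;
}
```

Single-threaded, both functions behave the same. The annotation only matters when something else can change shared_limit between the check and the use.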

Linus