Re: gcc inlining heuristics was Re: [PATCH -v7][RFC]: mutex: implementadaptive spinning

From: Linus Torvalds
Date: Mon Jan 12 2009 - 14:44:25 EST




On Mon, 12 Jan 2009, Andi Kleen wrote:
>
> What I find nonsensical is that -fno-strict-aliasing generates
> better code here. Normally one would expect the compiler seeing
> more aliases with that option and then be more conservative
> regarding any sharing. But it seems to be the other way round
> here.

No, that's not the surprising part. And in fact, now that you mention it,
I can even tell you why gcc does what it does.

But you'll need some background to it:

Type-based aliasing is _stupid_. It's so incredibly stupid that it's not
even funny. It's broken. And gcc took the broken notion, and made it more
so by making it a "by-the-letter-of-the-law" thing that makes no sense.

What happens (well, maybe it's fixed, but this was _literally_ what gcc
used to do) is that the type-based aliasing overrode everything else, so
if two accesses were to different types (and not in a union, and none of
the types were "char"), then gcc "knew" that they clearly could not alias,
and could thus wildly re-order accesses.

That's INSANE. It's so incredibly insane that people who do that should
just be put out of their misery before they can reproduce. But real gcc
developers really thought that it makes sense, because the standard allows
it, and it gives the compiler the maximal freedom - because it can now do
things that are CLEARLY NONSENSICAL.

And to compiler people, being able to do things that are clearly
nonsensical seems to often be seen as a really good thing, because it
means that they no longer have to worry about whether the end result works
or not - they just got permission to do stupid things in the name of
optimization.

So gcc did. I know for a _fact_ that gcc would re-order write accesses
that were clearly to (statically) the same address. Gcc would suddenly
think that

unsigned long a;

a = 5;
*(unsigned short *)&a = 4;

could be re-ordered to set it to 4 first (because clearly they don't alias
- by reading the standard), and then because now the assignment of 'a=5'
was later, the assignment of 4 could be elided entirely! And if somebody
complains that the compiler is insane, the compiler people would say
"nyaah, nyaah, the standards people said we can do this", with absolutely
no introspection to ask whether it made any SENSE.

Anyway, once you start doing stupid things like that, and once you start
thinking that the standard makes more sense than a human being using his
brain for 5 seconds, suddenly you end up in a situation where you can move
stores around wildly, and it's all 'correct'.

Now, take my stupid example, and make "fn1()" do "a.a = 1" and make
"fn2()" do "b.b = 2", and think about what a compiler that thinks it can
re-order the two writes willy-nilly will do?

Right. It will say "ok, a.a and b.b can not alias EVEN IF THEY HAVE
STATICALLY THE SAME ADDFRESS ON THE STACK", because they are in two
different structres. So we can then re-order the accesses, and move the
stores around.

Guess what happens if you have that kind of insane mentality, and you then
try to make sure that they really don't alias, so you allocate extra stack
space.

The fact is, Linux uses -fno-strict-aliasing for a damn good reason:
because the gcc notion of "strict aliasing" is one huge stinking pile of
sh*t. Linux doesn't use that flag because Linux is playing fast and loose,
it uses that flag because _not_ using that flag is insane.

Type-based aliasing is unacceptably stupid to begin with, and gcc took
that stupidity to totally new heights by making it actually more important
than even statically visible aliasing.

Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/