Re: [PATCHv14 08/17] x86/mm: Reduce untagged_addr() overhead until the first LAM user

From: Linus Torvalds
Date: Tue Jan 17 2023 - 12:46:19 EST


On Tue, Jan 17, 2023 at 9:18 AM Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> The reason clang seems to generate saner code is that clang seems to
> largely ignore the whole "__builtin_expect()", at least not to the
> point where it tries to make the unlikely case be out-of-line.

Side note: that's not something new or unusual. It's been the case
since I started testing clang - we have several code-paths where we
use "unlikely()" to try to get very unlikely cases to be out-of-line,
and clang just mostly ignores it, or treats it as a very weak hint. I
think the only way to get clang to treat it as a *strong* hint is to
use PGO.

And in this case it actually made code generation look better,
probably because this particular use of static_branch_likely() is a
bit confused about which side should be the preferred one. It's using
the static branch to make the old case not have the masked load, but
then it's saying that the new case is the likely one.

So clang ignoring the likely() hint is probably the right thing here,
and then the wrong thing in some other places.

Linus