Re: [PATCH v8 00/16] Add support for Clang LTO

From: Nick Desaulniers
Date: Thu Dec 03 2020 - 17:33:10 EST


On Thu, Dec 3, 2020 at 10:23 AM Will Deacon <will@xxxxxxxxxx> wrote:
>
> On Thu, Dec 03, 2020 at 09:07:30AM -0800, Sami Tolvanen wrote:
> > On Thu, Dec 3, 2020 at 3:26 AM Will Deacon <will@xxxxxxxxxx> wrote:
> > > I took this series for a spin, with my for-next/lto branch merged in but
> > > I see a failure during the LTO stage with clang 11.0.5 because it doesn't
> > > understand the '.arch_extension rcpc' directive we throw out in READ_ONCE().
> >
> > I just tested this with Clang 11.0.0, which I believe is the latest
> > 11.x version, and the current Clang 12 development branch, and both
> > work for me. Godbolt confirms that '.arch_extension rcpc' is supported
> > by the integrated assembler starting with Clang 11 (the example fails
> > with 10.0.1):
> >
> > https://godbolt.org/z/1csGcT
> >
> > What does running clang --version and ld.lld --version tell you?
>
> I'm using some Android prebuilts I had kicking around:
>
> Android (6875598, based on r399163b) clang version 11.0.5 (https://android.googlesource.com/toolchain/llvm-project 87f1315dfbea7c137aa2e6d362dbb457e388158d)
> Target: x86_64-unknown-linux-gnu
> Thread model: posix
> InstalledDir: /usr/local/google/home/willdeacon/work/android/repo/android-kernel/prebuilts-master/clang/host/linux-x86/clang-r399163b/bin
>
> and:
>
> LLD 11.0.5 (/buildbot/tmp/tmpx1DlI_ 87f1315dfbea7c137aa2e6d362dbb457e388158d) (compatible with GNU linkers)

On Thu, Dec 3, 2020 at 10:22 AM Nathan Chancellor
<natechancellor@xxxxxxxxx> wrote:
>
> 11.0.5 is AOSP's clang, which is behind the upstream 11.0.0 release so
> it is most likely the case that it is missing the patch that added rcpc.
> I think that a version based on the development branch (12.0.0) is in
> the works but I am not sure.

Yep, I have a lot of thoughts on the AOSP LLVM versioning scheme, but
they're not for LKML. That's yet another reason to prefer feature
detection as opposed to brittle version checks. Of course, as Will
points out, if your feature detection is broken, that helps no
one...more thoughts below.

> > > We actually check that this extension is available before using it in
> > > the arm64 Kconfig:
> > >
> > > config AS_HAS_LDAPR
> > > def_bool $(as-instr,.arch_extension rcpc)
> > >
> > > so this shouldn't happen. I then realised, I wasn't passing LLVM_IAS=1
> > > on my Make command line; with that, then the detection works correctly
> > > and the LTO step succeeds.
> > >
> > > Why is it necessary to pass LLVM_IAS=1 if LTO is enabled? I think it
> > > would be _much_ better if this was implicit (or if LTO depended on it).
> >
> > Without LLVM_IAS=1, Clang uses two different assemblers when LTO is
> > enabled: the external GNU assembler for stand-alone assembly, and
> > LLVM's integrated assembler for inline assembly. as-instr tests the
> > external assembler and makes an admittedly reasonable assumption that
> > the test is also valid for inline assembly.
> >
> > I agree that it would reduce confusion in future if we just always
> > enabled IAS with LTO. Nick, Nathan, any thoughts about this?
>
> That works for me, although I'm happy with anything which means that the
> assembler checks via as-instr apply to the assembler which will ultimately
> be used.

I agree with Will.

I think interoperability of tools is important. We should be able to
mix tools from GNU and LLVM and produce working images. Specifically,
combinations like gcc+llvm-nm+as+llvm-objcopy, or clang+nm+as+objcopy
as two examples. There's a combinatorial explosion of combinations to
test/validate, which we're not doing today, but if for some reason
someone wants to use some varied combination and it doesn't work, it's
worthwhile to understand the differences and issues and try to fix
them. That is a win for optionality and loose coupling.

That's not what's going on here though.

While I think it's ok to select a compiler and assembler and linker
etc from ecosystem or another, I think trying to support a build that
mixes or uses different assemblers (or linkers, compilers, etc) from
both for the same build is something we should draw a line in the sand
and explicitly not support (except for the compat vdso's*...). ie. if
I say `make LD=ld.bfd` and ld.lld gets invoked somehow (or vice
versa); I consider that a bug in KBUILD.

That is what's happening here, it's why as-instr feature detection is
broken; because two different assemblers were used in the same build.
One for inline asm, a different one for out of line asm. At the very
least, it violates the Principle of Least Surprise (or is it the Law
of Equivalent Exchange, I forget).

In fact, lots of the work we've been doing to enable LLVM tools to
build the kernel have been identifying places throughout KBUILD where
tools were hardcoded rather than using what make was told to use, and
we've been making progress fixing those. The ultimate test of Linux
kernel build hermiticity IMO is that I should be able to build a
kernel in an environment that only has one version of either
GCC/binutils or LLVM, and the kernel should build without failure.
That's not the case today for all arch's; cross compiling compat vdsos
again are a major pain point*, but we're making progress. In that
sense, the mixing of an individual GNU and LLVM utility is what I
would consider a bug in KBUILD. I want to emphasize that's distinct
from mixing and matching tools when invoking make, which I consider
OK, if under-tested.

Ok (mixes GNU and LLVM tools; gcc is the only compiler invoked, ld.lld
is the only linker invoked):
$ make CC=gcc LD=ld.lld

Not ok (if ld.bfd or both are invoked)
$ make LD=ld.lld

Not ok (if ld.lld or both are invoked)
$ make LD=ld.bfd

Not ok (if clang's integrated assembler and GAS are invoked)
$ ./scripts/config -e LTO_CLANG
$ make LLVM=1 LLVM_IAS=1

The mixing of GAS and clang's integrated assembler for kernel LTO
builds is a relic of a time when this series was first written when
Clang's integrated assembler was in no form ready to assemble the
entire Linux kernel, but could handle the inline asm for aarch64.
Fortunately, ARM's LLVM team has done great work to ensure the latest
extensions like RCpc are supported and compatible, and Jian has done
the hard work ironing out the last mile issues in clang's assembler to
get the ball in the end zone. Removing mixing GAS and clang's IA here
ups the ante and removes a fallback/pressure relief valve, but I'm
fine with that. Requiring clang's integrated assembler here aligns
incentives to keep this working and to continue investing here.

Just because it's possible to mix the use of clang's integrated
assembler with GNU assembler for LTO (for some combination of versions
of these tools) doesn't mean we should support it, or encourage it,
for all of the reasons above. We should make this config depend on
clang's integrated assembler, and not support the mixing of assemblers
in one build.

Thou shalt not support invoking of different tools than what's
specified*. Do not pass go, do not collect $200. Full stop.

* The compat vdso's are again a special case; when cross compiling
using GNU tools, a separate binary with a different target triple
prefix will typically get invoked than what's used to build the rest
of the kernel image. This still doesn't cross the GNU/LLVM boundary
though, and most importantly doesn't involve linking together object
files that were built with distinct assemblers (for example).

So I'd recommend to Sami to simply make the Kconfig also depend on
clang's integrated assembler (not just llvm-nm and llvm-ar). If
someone cares about LTO with Clang as the compiler but GAS as the
assembler, then we can revisit supporting that combination (and the
changes to KCONFIG), but it shouldn't be something we consider Tier 1
supported or a combination that need be supported in a minimum viable
product. And at that point we should make it avoid clang's integrated
assembler entirely (I suspect LTO won't work at all in that case, so
maybe even considering it is a waste of time).

One question I have to Will; if for aarch64 LTO will depend on RCpc,
but RCpc is an ARMv8.3 extension, what are the implications for LTO on
pre-ARMv8.3 aarch64 processors?
--
Thanks,
~Nick Desaulniers