Re: [PATCH v6] pgo: add clang's Profile Guided Optimization infrastructure

From: Nick Desaulniers
Date: Thu Jan 21 2021 - 20:31:02 EST


On Thu, Jan 21, 2021 at 12:24 AM Bill Wendling <morbo@xxxxxxxxxx> wrote:
>
> From: Sami Tolvanen <samitolvanen@xxxxxxxxxx>
>
> Enable the use of clang's Profile-Guided Optimization[1]. To generate a
> profile, the kernel is instrumented with PGO counters, a representative
> workload is run, and the raw profile data is collected from
> /sys/kernel/debug/pgo/profraw.
>
> The raw profile data must be processed by clang's "llvm-profdata" tool
> before it can be used during recompilation:
>
> $ cp /sys/kernel/debug/pgo/profraw vmlinux.profraw
> $ llvm-profdata merge --output=vmlinux.profdata vmlinux.profraw
>
> Multiple raw profiles may be merged during this step.
>
> The data can now be used by the compiler:
>
> $ make LLVM=1 KCFLAGS=-fprofile-use=vmlinux.profdata ...
>
> This initial submission is restricted to x86, as that's the platform we
> know works. This restriction can be lifted once other platforms have
> been verified to work with PGO.
>
> Note that this method of profiling the kernel is clang-native, unlike
> the clang support in kernel/gcov.
>
> [1] https://clang.llvm.org/docs/UsersManual.html#profile-guided-optimization
>
> Signed-off-by: Sami Tolvanen <samitolvanen@xxxxxxxxxx>
> Co-developed-by: Bill Wendling <morbo@xxxxxxxxxx>
> Signed-off-by: Bill Wendling <morbo@xxxxxxxxxx>
> Tested-by: Nick Desaulniers <ndesaulniers@xxxxxxxxxx>
> ---
> v2: - Added "__llvm_profile_instrument_memop" based on Nathan Chancellor's
> testing.
> - Corrected documentation, re PGO flags when using LTO, based on Fangrui
> Song's comments.
> v3: - Added change log section based on Sedat Dilek's comments.
> v4: - Remove non-x86 Makefile changes and use "hweight64" instead of using our
> own popcount implementation, based on Nick Desaulniers's comment.
> v5: - Correct padding calculation, discovered by Nathan Chancellor.
> v6: - Add better documentation about the locking scheme and other things.
> - Rename macros to better match the same macros in LLVM's source code.

This is a major win for readability and for comparing it against LLVM's
compiler-rt implementation! Thank you for doing that. It looks like
it addresses most of my concerns. I'm not against following up on
little details in subsequent patches on top. However, Sedat is right
about the small issue that v6 doesn't compile. If you were to roll
his fixup into a v7, I'd be happy to sign off on it at this point.
--
Thanks,
~Nick Desaulniers