Re: [PATCH v2] bpf/scripts: Generate GCC compatible helpers

From: James Hilliard
Date: Wed Jul 13 2022 - 01:25:53 EST


On Tue, Jul 12, 2022 at 10:25 PM Alexei Starovoitov
<alexei.starovoitov@xxxxxxxxx> wrote:
>
> On Tue, Jul 12, 2022 at 08:56:35PM -0600, James Hilliard wrote:
> > On Tue, Jul 12, 2022 at 7:45 PM Alexei Starovoitov
> > <alexei.starovoitov@xxxxxxxxx> wrote:
> > >
> > > On Tue, Jul 12, 2022 at 6:29 PM James Hilliard
> > > <james.hilliard1@xxxxxxxxx> wrote:
> > > >
> > > > On Tue, Jul 12, 2022 at 7:18 PM Alexei Starovoitov
> > > > <alexei.starovoitov@xxxxxxxxx> wrote:
> > > > >
> > > > > On Tue, Jul 12, 2022 at 07:10:27PM -0600, James Hilliard wrote:
> > > > > > On Tue, Jul 12, 2022 at 10:48 AM Alexei Starovoitov
> > > > > > <alexei.starovoitov@xxxxxxxxx> wrote:
> > > > > > >
> > > > > > > On Tue, Jul 12, 2022 at 4:20 AM Jose E. Marchesi
> > > > > > > <jose.marchesi@xxxxxxxxxx> wrote:
> > > > > > > >
> > > > > > > >
> > > > > > > > > CC Quentin as well
> > > > > > > > >
> > > > > > > > > On Mon, Jul 11, 2022 at 5:11 PM James Hilliard
> > > > > > > > > <james.hilliard1@xxxxxxxxx> wrote:
> > > > > > > > >>
> > > > > > > > >> On Mon, Jul 11, 2022 at 5:36 PM Yonghong Song <yhs@xxxxxx> wrote:
> > > > > > > > >> >
> > > > > > > > >> >
> > > > > > > > >> >
> > > > > > > > >> > On 7/6/22 10:28 AM, James Hilliard wrote:
> > > > > > > > >> > > The current bpf_helper_defs.h helpers are llvm specific and don't work
> > > > > > > > >> > > correctly with gcc.
> > > > > > > > >> > >
> > > > > > > > > >> > > GCC appears to require kernel helper funcs to have the following
> > > > > > > > > >> > > attribute set: __attribute__((kernel_helper(NUM)))
> > > > > > > > >> > >
> > > > > > > > >> > > Generate gcc compatible headers based on the format in bpf-helpers.h.
> > > > > > > > >> > >
> > > > > > > > >> > > This adds conditional blocks for GCC while leaving clang codepaths
> > > > > > > > >> > > unchanged, for example:
> > > > > > > > >> > > #if __GNUC__ && !__clang__
> > > > > > > > >> > > void *bpf_map_lookup_elem(void *map, const void *key)
> > > > > > > > >> > > __attribute__((kernel_helper(1)));
> > > > > > > > >> > > #else
> > > > > > > > >> > > static void *(*bpf_map_lookup_elem)(void *map, const void *key) = (void *) 1;
> > > > > > > > >> > > #endif
> > > > > > > > >> >
> > > > > > > > > >> > It does look like the gcc kernel_helper attribute is better than the
> > > > > > > > > >> > '(void *) 1' style. The original clang '(void *) 1' style was
> > > > > > > > > >> > chosen just for simplicity.
> > > > > > > > >>
> > > > > > > > >> Isn't the original style going to be needed for backwards compatibility with
> > > > > > > > >> older clang versions for a while?
> > > > > > > > >
> > > > > > > > > I'm curious, is there any added benefit to having this special
> > > > > > > > > kernel_helper attribute vs what we did in Clang for a long time?
> > > > > > > > > Did GCC do it just to be different and require workarounds like this
> > > > > > > > > or there was some technical benefit to this?
> > > > > > > >
> > > > > > > > We did it that way so we could make trouble and piss you off.
> > > > > > > >
> > > > > > > > Nah :)
> > > > > > > >
> > > > > > > > We did it that way because technically speaking the clang construction
> > > > > > > > works relying on particular optimizations to happen to get correct
> > > > > > > > compiled programs, which is not guaranteed to happen and _may_ break in
> > > > > > > > the future.
> > > > > > > >
> > > > > > > > In fact, if you compile a call to such a function prototype with clang
> > > > > > > > with -O0 the compiler will try to load the function's address in a
> > > > > > > > register and then emit an invalid BPF instruction:
> > > > > > > >
> > > > > > > > 28: 8d 00 00 00 03 00 00 00 *unknown*
> > > > > > > >
> > > > > > > > On the other hand the kernel_helper attribute is bullet-proof: will work
> > > > > > > > with any optimization level, with any version of the compiler, and in
> > > > > > > > our opinion it is also more readable, more tidy and more correct.
> > > > > > > >
> > > > > > > > Note I'm not saying what you do in clang is not reasonable; it may be,
> > > > > > > > obviously it works well enough for you in practice. Only that we have
> > > > > > > > good reasons for doing it differently in GCC.
> > > > > > >
> > > > > > > Not questioning the validity of the reasons, but they created
> > > > > > > the unnecessary difference between compilers.
> > > > > >
> > > > > > Sounds to me like clang is relying on an unreliable hack that may
> > > > > > be difficult to implement in GCC, so let's see what's the best option
> > > > > > moving forwards in terms of a migration path for both GCC and clang.
> > > > >
> > > > > The following is a valid C code:
> > > > > static long (*foo) (void) = (void *) 1234;
> > > > > foo();
> > > > >
> > > > > and GCC has to generate correct assembly assuming it runs at -O1 or higher.
> > > >
> > > > Providing -O1 or higher with gcc-bpf does not seem to work at the moment.
> > >
> > > Let's fix gcc first.
> >
> > If the intention is to migrate to kernel_helper for clang as well, it seems
> > kind of redundant. Is there a real world use case for supporting the
> > '(void *) 1' style in GCC rather than just adding feature detection and
> > kernel_helper support to libbpf?
> >
> > My assumption is that kernel helpers are in practice always used via libbpf
> > which appears to be sufficient in terms of being able to provide a compatibility
> > layer via feature detection. Or is there some use case I'm missing here?
>
> static long (*foo) (void) = (void *) 1234;
> is not about calling into "kernel helpers".
> There is no concept of "kernel" in BPF ISA.

I thought GCC at least had a somewhat kernel specific BPF ISA target; I
presume clang's bpf target is more generalized.

> 'call 1234' insn means call a function with that absolute address.
> The gcc named that attribute incorrectly.
> It should be renamed to something like __attribute__((fixed_address(1234))).
>
> It's a linux kernel abi choice to interpret 'call abs_addr' as a call to a kernel
> provided function at that address. 1,2,3,... are addresses of functions.

The impression I got, at least going off of the gcc-bpf docs, was that GCC's
BPF support was designed to effectively target the kernel's flavor of the BPF
ISA, although I might be wrong about that.

>
> > >
> > > > > There is no indirect call insn defined in BPF ISA yet,
> > > > > so the -O0 behavior is undefined.
> > > >
> > > > Well GCC at least seems to be able to compile BPF programs with -O0 using
> > > > kernel_helper. I assume -O0 is probably just targeting the minimum BPF ISA
> > > > optimization level or something like that which avoids indirect calls.
> > >
> > > There are other reasons why -O0 compiled progs will
> > > fail in the verifier.
> >
> > Why would -O0 generate code that isn't compatible with the selected
> > target BPF ISA?
>
> llvm has no issue producing valid BPF code with -O0.
> It's the kernel verifier that doesn't understand such code.
> For the following code:
> static long (*foo) (void) = (void *) 1234;
> long bar(void)
> {
> return foo();
> }
>
> With -O[12] llvm will generate
> call 1234
> exit
> With -O0
> r1 = foo ll
> r1 = *(u64 *)(r1 + 0)
> callx r1
> exit
>
> Both codes are valid and equivalent.
> 'callx' here is a reserved insn. The kernel verifier doesn't know about it yet,
> but llvm was generating such code for 8+ years.

Hmm, I thought GCC gates non-kernel compatible BPF behind -mxbpf (mostly for
use with GCC's internal test suite, AFAIU):
https://gcc.gnu.org/onlinedocs/gcc/eBPF-Options.html

>
> > > Assuming that kernel_helper attr is actually necessary
> > > we have to add its support to clang as well.
> >
> > I mean, I'd argue there's a difference between something being arguably a better
> > alternative(optional) and actually being necessary(non-optional).
>
> gcc's attribute is not better.
> It's just a different way to tell compiler about fixed function address.

I presume it's a lot simpler implementation-wise than the clang version, but I
could be wrong about that. I mostly work on compiler integration testing and
build fixes; compiler internals are a bit out of my area of expertise.

>
> > > gcc-bpf is a niche. If gcc devs want it to become a real
> > > alternative to clang they have to always aim for feature parity
> > > instead of inventing their own ways of doing things.
> >
> > What's ultimately going to help gcc-bpf the most in reaching feature parity
> > with clang is getting it minimally usable in the real world, because that's
> > how you're going to get more people testing and fixing bugs so that all
> > these differences/incompatibilities can be worked through and fixed.
>
> Can gcc-bpf compile all of selftests/bpf ?

Don't pretty much all of those use
#include <bpf/bpf_helpers.h>
which doesn't really work when building with gcc-bpf at the moment, at least
not without adding kernel_helper support to libbpf?
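
For reference, a rough sketch of what feature detection in the generated
header could look like. This is only an illustration, not what the patch
generates: BPF_HELPER_DECL is a made-up macro name that does not exist in
libbpf, and it assumes __has_attribute(kernel_helper) evaluates to true on
gcc-bpf, which I have not verified. The helper numbers (1 and 2) are the ones
bpf_helper_defs.h already uses for these two helpers, with __u64 spelled as
unsigned long long to keep the snippet standalone.

```c
/* Sketch only: BPF_HELPER_DECL is a hypothetical name, not a libbpf macro. */
#ifndef __has_attribute
#define __has_attribute(x) 0	/* compilers without __has_attribute */
#endif

#if __has_attribute(kernel_helper)
/* gcc-bpf style: a plain prototype tagged with the helper number. */
#define BPF_HELPER_DECL(ret, name, id, ...) \
	ret name(__VA_ARGS__) __attribute__((kernel_helper(id)))
#else
/* clang style: a function pointer whose value is the helper number. */
#define BPF_HELPER_DECL(ret, name, id, ...) \
	static ret (*name)(__VA_ARGS__) = (void *) id
#endif

BPF_HELPER_DECL(void *, bpf_map_lookup_elem, 1,
		void *map, const void *key);
BPF_HELPER_DECL(long, bpf_map_update_elem, 2,
		void *map, const void *key, const void *value,
		unsigned long long flags);
```

With something along these lines, compilers that don't know the attribute
(including older clang versions) would keep getting the existing
'(void *) 1' style declarations, and only a compiler that advertises
kernel_helper would see the attribute form.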

> How many of compiled programs will pass the verifier ?

Not really sure, I'm still working through toolchain/build issues. It's kinda
tricky to do proper testing when those are all using clang specific headers.

It would be handy to get integration testing running against them with gcc-bpf
so that we can at least get a baseline of what's working and catch regressions
when fixing compiler/toolchain issues. Right now I think gcc-bpf is mostly
only using an internal test suite.

>
> > If nobody can compile a real world BPF program with gcc-bpf it's likely going to
> > lag further behind.
>
> selftest/bpf is a first milestone that gcc-bpf has to pass before talking about
> 'real world' bpf progs.

A test suite designed to exercise lots of edge cases isn't exactly a great first
milestone for something like this. The 3 small systemd BPF programs, on the
other hand, would be a good start IMO, since they are widely used real world
programs and relatively simple. They aren't going to exercise all potential
edge cases, but they are a good starting point for testing real world
toolchain integration (which is in really rough shape at the moment).

Even getting some normal-ish progs buildable without downstream library
patches would be a big improvement, as one can then iterate a lot more easily.

We're dealing with multiple issues here, some of which are toolchain/integration
issues and others of which are compiler issues. If we can get a little more
integration testing going, it's going to be easier to flush out the remaining
compiler issues. It's kinda tricky to fix one without fixing the other.