Re: [PATCH bpf-next v2 3/4] bpf: Add support for writing to nf_conn:mark

From: Alexei Starovoitov
Date: Fri Aug 19 2022 - 13:21:21 EST


On Fri, Aug 19, 2022 at 6:05 AM Toke Høiland-Jørgensen <toke@xxxxxxxxxx> wrote:
>
> Daniel Xu <dxu@xxxxxxxxx> writes:
>
> > Hi Toke,
> >
> > On Thu, Aug 18, 2022 at 09:52:08PM +0200, Toke Høiland-Jørgensen wrote:
> >> Daniel Xu <dxu@xxxxxxxxx> writes:
> >>
> >> > Support direct writes to nf_conn:mark from TC and XDP prog types. This
> >> > is useful when applications want to store per-connection metadata. This
> >> > is also particularly useful for applications that run both bpf and
> >> > iptables/nftables because the latter can trivially access this
> >> > metadata.
> >>
> >> Looking closer at the nf_conn definition, the mark field (and possibly
> >> secmark) seems to be the only field that is likely to be feasible to
> >> support direct writes to, as everything else either requires special
> >> handling (like status and timeout), or they are composite field that
> >> will require helpers anyway to use correctly.
> >>
> >> Which means we're in the process of creating an API where users have to
> >> call helpers to fill in all fields *except* this one field that happens
> >> to be directly writable. That seems like a really confusing and
> >> inconsistent API, so IMO it strengthens the case for just making a
> >> helper for this field as well, even though it adds a bit of overhead
> >> (and then solving the overhead issue in a more generic way such as by
> >> supporting clever inlining).
> >>
> >> -Toke
> >
> > I don't particularly have a strong opinion here. But to play devil's
> > advocate:
> >
> > * It may be confusing now, but over time I expect to see more direct
> > write support via BTF, especially b/c there is support for unstable
> > helpers now. So perhaps in the future it will seem more sensible.
>
> Right, sure, for other structs. My point was that it doesn't look like
> this particular one (nf_conn) is likely to grow any other members we can
> access directly, so it'll be a weird one-off for that single field...
>
> > * The unstable helpers do not have external documentation. Nor should
> > they in my opinion as their unstableness + stale docs may lead to
> > undesirable outcomes. So users of the unstable API already have to
> > splunk through kernel code and/or selftests to figure out how to wield
> > the APIs. All this to say there may not be an argument for
> > discoverability.
>
> This I don't buy at all. Just because it's (supposedly) "unstable" is no

They're unstable. Please don't start this 'but can we actually remove
them' doubts. You're only confusing yourself and others.
We tweaked kfuncs already. We removed tracepoints too after they
were in a full kernel release.

> excuse to design a bad API, or make it actively user-hostile by hiding

'bad API'? what? It's a direct field write.
We do allow it in other data structures.

> things so users have to go browse kernel code to know how to use it. So
> in any case, we should definitely document everything.
>
> > * Direct writes are slightly more ergnomic than using a helper.
>
> This is true, and that's the main argument for doing it this way. The
> point of my previous email was that since it's only a single field,
> consistency weighs heavier than ergonomics :)

I don't think the 'consistency' argument applies here.
We already allow direct read of all fields.
Also the field access is easier to handle with CO-RE.