Re: [PATCH v2] cgroup/bpf: fast path for not loaded skb BPF filtering

From: Pavel Begunkov
Date: Tue Dec 14 2021 - 06:40:35 EST


On 12/14/21 07:27, Martin KaFai Lau wrote:
On Sat, Dec 11, 2021 at 07:17:49PM +0000, Pavel Begunkov wrote:
cgroup_bpf_enabled_key static key guards from overhead in cases where
no cgroup bpf program of a specific type is loaded in any cgroup. Turn
out that's not always good enough, e.g. when there are many cgroups but
ones that we're interesting in are without bpf. It's seen in server
environments, but the problem seems to be even wider as apparently
systemd loads some BPF affecting my laptop.

Profiles for small packet or zerocopy transmissions over fast network
show __cgroup_bpf_run_filter_skb() taking 2-3%, 1% of which is from
migrate_disable/enable(), and similarly on the receiving side. Also
got +4-5% of t-put for local testing.
What is t-put? throughput?

yes

Local testing means sending to lo/dummy?

yes, it was dummy specifically


[ ... ]

diff --git a/include/linux/bpf-cgroup.h b/include/linux/bpf-cgroup.h
index 11820a430d6c..793e4f65ccb5 100644
--- a/include/linux/bpf-cgroup.h
+++ b/include/linux/bpf-cgroup.h
@@ -219,11 +219,28 @@ int bpf_percpu_cgroup_storage_copy(struct bpf_map *map, void *key, void *value);
int bpf_percpu_cgroup_storage_update(struct bpf_map *map, void *key,
void *value, u64 flags);
+static inline bool
+__cgroup_bpf_prog_array_is_empty(struct cgroup_bpf *cgrp_bpf,
+ enum cgroup_bpf_attach_type type)
Lets remove this.

+{
+ struct bpf_prog_array *array = rcu_access_pointer(cgrp_bpf->effective[type]);
+
+ return array == &empty_prog_array.hdr;
+}
+
+#define CGROUP_BPF_TYPE_ENABLED(sk, atype) \
and change cgroup.c to directly use this instead, so
everywhere holding a fullsock sk will use this instead
of having two helpers for empty check.

Why? CGROUP_BPF_TYPE_ENABLED can't be a function atm because of header
dependency hell, and so it'd kill some of typization, which doesn't add
clarity. And also it imposes some extra overhead to *sockopt using
the first helper directly.

I think it's better with two of them. I could inline the second
one, but it wouldn't have been pretty.


[ ... ]

diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 2405e39d800f..fedc7b44a1a9 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -1967,18 +1967,10 @@ static struct bpf_prog_dummy {
},
};
-/* to avoid allocating empty bpf_prog_array for cgroups that
- * don't have bpf program attached use one global 'empty_prog_array'
- * It will not be modified the caller of bpf_prog_array_alloc()
- * (since caller requested prog_cnt == 0)
- * that pointer should be 'freed' by bpf_prog_array_free()
- */
-static struct {
- struct bpf_prog_array hdr;
- struct bpf_prog *null_prog;
-} empty_prog_array = {
+struct bpf_empty_prog_array empty_prog_array = {
.null_prog = NULL,
};
+EXPORT_SYMBOL(empty_prog_array);
nit. Since it is exported, may be prefix it with 'bpf_'.

yeah, sure


--
Pavel Begunkov