Re: [PATCH bpf-next v1 3/5] bpf: Introduce cgroup iter

From: Yonghong Song
Date: Fri May 20 2022 - 20:53:02 EST




On 5/20/22 9:45 AM, Tejun Heo wrote:
Hello, Yonghong.

On Fri, May 20, 2022 at 09:29:43AM -0700, Yonghong Song wrote:
Maybe you can have a bpf program signature like below:

int BPF_PROG(dump_vmscan, struct bpf_iter_meta *meta, struct cgroup *cgrp,
struct cgroup *parent_cgrp)

parent_cgrp is NULL when cgrp is the root cgroup.

I would like the bpf program to send the following information to
user space:
<parent cgroup dir name> <current cgroup dir name>

I don't think parent cgroup dir name would be sufficient to reconstruct the
path given that multiple cgroups in different subtrees can have the same
name. For live cgroups, userspace can find the path from id (or ino) without
traversing anything by constructing the fhandle, opening it with
open_by_handle_at() and then reading the /proc/self/fd/$FD symlink -
https://lkml.org/lkml/2020/12/2/1126. This isn't available for dead cgroups
but I'm not sure how much that'd matter given that they aren't visible from
userspace anyway.

Passing the id/ino to user space and then getting the directory name
in userspace should work just fine.
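
For reference, the id-to-path lookup described above could look roughly like the sketch below. This is only an illustration under a few assumptions: the helper names cgid_to_fhandle() and cgid_to_path() are made up here, the handle type value 0xfe is taken from FILEID_KERNFS in the kernel's include/linux/exportfs.h (it is not in a uapi header), and the caller is assumed to hold an fd on the cgroup2 mount point and to have CAP_DAC_READ_SEARCH.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Assumption: mirrors FILEID_KERNFS from include/linux/exportfs.h. */
#define CG_FILEID_KERNFS 0xfe

/* Build a file handle that encodes a 64-bit cgroup id. Caller frees it. */
struct file_handle *cgid_to_fhandle(uint64_t cgid)
{
	struct file_handle *fh = calloc(1, sizeof(*fh) + sizeof(cgid));

	if (!fh)
		return NULL;
	fh->handle_bytes = sizeof(cgid);
	fh->handle_type = CG_FILEID_KERNFS;
	memcpy(fh->f_handle, &cgid, sizeof(cgid));
	return fh;
}

/*
 * Resolve a live cgroup id to its path. cgroot_fd is an open fd on the
 * cgroup2 mount point, e.g. open("/sys/fs/cgroup", O_RDONLY | O_DIRECTORY).
 * Returns 0 on success, -1 on failure (including dead cgroups).
 */
int cgid_to_path(int cgroot_fd, uint64_t cgid, char *buf, size_t len)
{
	struct file_handle *fh;
	char link[64];
	ssize_t n;
	int fd;

	assert(buf && len > 0);

	fh = cgid_to_fhandle(cgid);
	if (!fh)
		return -1;

	/* Needs CAP_DAC_READ_SEARCH. */
	fd = open_by_handle_at(cgroot_fd, fh, O_RDONLY | O_DIRECTORY);
	free(fh);
	if (fd < 0)
		return -1;

	/* The symlink target is the cgroup directory's full path. */
	snprintf(link, sizeof(link), "/proc/self/fd/%d", fd);
	n = readlink(link, buf, len - 1);
	close(fd);
	if (n < 0)
		return -1;
	buf[n] = '\0';
	return 0;
}
```

With something like this, the bpf program only has to emit the cgroup id, and userspace calls cgid_to_path() once per record to get the full path.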


<various stats the user is interested in>

This way, user space can easily construct the cgroup hierarchy stat like
                     cpu    mem    cpu pressure    mem pressure ...
cgroup1              ...
  child1             ...
    grandchild1      ...
  child2             ...
cgroup 2             ...
  child 3            ...
    ...              ...
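
To make the userspace side concrete, here is a small sketch of how the indentation could be derived once each record carries the cgroup id and its parent's id (ids rather than directory names, since names can repeat across subtrees). The struct cg_rec layout and the helpers cg_depth() and cg_print_tree() are hypothetical, and the printing assumes records arrive parent-first, as they would from a pre-order walk.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical record: one entry per cgroup as emitted by the iterator. */
struct cg_rec {
	uint64_t id;        /* cgroup id */
	uint64_t parent_id; /* 0 for the top of the dumped hierarchy */
	const char *name;   /* directory name, resolved in userspace */
};

/*
 * Depth of recs[i], found by following parent_id links.
 * Returns -1 if a parent is missing or the links form a cycle.
 */
int cg_depth(const struct cg_rec *recs, int n, int i)
{
	uint64_t pid;
	int depth = 0;

	assert(i >= 0 && i < n);

	pid = recs[i].parent_id;
	while (pid != 0) {
		int j, found = 0;

		for (j = 0; j < n; j++) {
			if (recs[j].id == pid) {
				pid = recs[j].parent_id;
				found = 1;
				break;
			}
		}
		if (!found || ++depth > n)
			return -1;
	}
	return depth;
}

/* Print one name per line, indented two spaces per level. */
void cg_print_tree(FILE *out, const struct cg_rec *recs, int n)
{
	int i;

	for (i = 0; i < n; i++) {
		int d = cg_depth(recs, n, i);

		fprintf(out, "%*s%s\n", d > 0 ? 2 * d : 0, "", recs[i].name);
	}
}
```

The linear parent search is quadratic overall, which should be fine for a sketch; a real tool would likely index records by id.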

The bpf iterator can have an additional parameter like
cgroup_id = ... so that the bpf program is called only once, for that
cgroup_id, if specified.

The kernel part of cgroup_iter can call cgroup_rstat_flush()
before calling cgroup_iter bpf program.

WDYT?

Would it work to just pass in @cgrp and provide a group of helpers so that
the program can do whatever it wants to do, including looking up the full path
and passing that to userspace?

I am not super familiar with cgroup internals. I guess it could be either
cgroup + helpers to retrieve stats, or directly exposing the stats data
structure to the bpf program. Either one is okay to me as long as we can
get the desired results.


Thanks.