[PATCH] perf tools: Ignore deleted cgroups

From: Namhyung Kim
Date: Thu May 09 2024 - 14:22:44 EST


On a large system, cgroups can be created and deleted often. That means
there's a race between perf tools and cgroup removal: a cgroup can go
away after perf gets its name but before perf opens it. I got a report
that perf stat with many cgroups failed quite often due to missing
cgroups on such a large machine.

I think we can ignore such cgroups when expanding events and use id 0 if
reading the cgroup id fails. IIUC 0 is not a valid cgroup id, so it
won't update event counts for the failed cgroups.

Signed-off-by: Namhyung Kim <namhyung@xxxxxxxxxx>
---
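Note (not part of the patch): a minimal standalone sketch of the
skip-on-failure pattern that the evlist__expand_cgroup() hunk adopts,
assuming a plain open() on a directory path stands in for cgroup__new().
open_existing() and its parameters are hypothetical names used only for
illustration; they do not exist in perf.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper, not perf code: try to open every path and keep
 * only the ones that still exist.  Returns the number of descriptors
 * stored in fds[], which must have room for nr entries.
 */
int open_existing(const char *const *paths, int nr, int *fds)
{
	int i, nr_open = 0;

	for (i = 0; i < nr; i++) {
		int fd = open(paths[i], O_RDONLY);

		if (fd < 0) {
			/* the directory can go away between listing and open() */
			fprintf(stderr, "skipping %s\n", paths[i]);
			continue;
		}
		fds[nr_open++] = fd;
	}
	return nr_open;
}
```

The caller only sees the entries that could be opened, so one entry
that vanished does not fail the whole run.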
tools/perf/util/bpf_counter_cgroup.c | 5 ++---
tools/perf/util/cgroup.c | 4 +++-
2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/bpf_counter_cgroup.c b/tools/perf/util/bpf_counter_cgroup.c
index 1c82377ed78b..ea29c372f339 100644
--- a/tools/perf/util/bpf_counter_cgroup.c
+++ b/tools/perf/util/bpf_counter_cgroup.c
@@ -136,9 +136,8 @@ static int bperf_load_program(struct evlist *evlist)
cgrp = evsel->cgrp;

if (read_cgroup_id(cgrp) < 0) {
- pr_err("Failed to get cgroup id\n");
- err = -1;
- goto out;
+ pr_debug("Failed to get cgroup id for %s\n", cgrp->name);
+ cgrp->id = 0;
}

map_fd = bpf_map__fd(skel->maps.cgrp_idx);
diff --git a/tools/perf/util/cgroup.c b/tools/perf/util/cgroup.c
index fcb509058499..0f759dd96db7 100644
--- a/tools/perf/util/cgroup.c
+++ b/tools/perf/util/cgroup.c
@@ -465,9 +465,11 @@ int evlist__expand_cgroup(struct evlist *evlist, const char *str,
name = cn->name + prefix_len;
if (name[0] == '/' && name[1])
name++;
+
+ /* the cgroup can go away in the meantime */
cgrp = cgroup__new(name, open_cgroup);
if (cgrp == NULL)
- goto out_err;
+ continue;

leader = NULL;
evlist__for_each_entry(orig_list, pos) {
--
2.45.0.118.g7fe29c98d7-goog