Re: [PATCH v1 2/2] perf evlist: Don't run perf in non-root PID namespace when launch workload

From: James Clark
Date: Mon Dec 13 2021 - 08:54:42 EST




On 12/12/2021 13:47, Leo Yan wrote:
> In function evlist__prepare_workload(), after perf forks a child process
> and launches a workload in the created process, it needs to retrieve
> process and namespace related info via '/proc/$PID/' node.
>
> The process folders under the 'proc' file system use the PID number from
> the root PID namespace. When the perf tool runs in a non-root PID
> namespace and creates a new process for the profiled program, it gathers
> the wrong process info, because it uses the PID from the non-root
> namespace to access nodes under '/proc'.
>
> Let's see an example:
>
> unshare --fork --pid perf record -e cs_etm//u -a -- test_program
>
> This command runs perf tool and the profiled program 'test_program' in
> the non-root PID namespace. When perf tool launches 'test_program',
> e.g. the forked PID number is 2, perf tool retrieves process info for
> 'test_program' from the folder '/proc/2'. But '/proc/2' is actually for
> a kernel thread, so the perf tool gathers the wrong info for 'test_program'.

Hi Leo,

Exactly which features aren't working when you run in a non-root PID namespace?

I did "perf record -- ls" and it seemed to be working for me. At least kernel
sampling would be working in a namespace, even if there was something wrong
with userspace.

I think causing a failure might be too restrictive and would prevent people
from using perf in a container. Maybe we could show a warning instead, but
I'm not sure exactly what's not working, because I thought perf looked things
up based on the path of the process, not the PID.
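
For illustration, one way to see whether the PID that perf gets back from
fork() can differ from the name of the matching '/proc' directory is to look
at the "NSpid:" line in /proc/self/status (available since Linux 4.1). It
lists the PID in the namespace of whoever mounted that procfs first, followed
by the PID in each nested namespace. The sketch below is not the helper from
patch 1/2; nspid_count() and self_nspid_count() are made-up names, and it
assumes a kernel that exposes NSpid.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Count the PIDs on an "NSpid:" line taken from /proc/<pid>/status.
 * Returns -1 if the line is not an NSpid line.
 * (Hypothetical helper, not part of perf.)
 */
static int nspid_count(const char *line)
{
	static const char key[] = "NSpid:";
	const char *p;
	char *end;
	int n = 0;

	if (strncmp(line, key, sizeof(key) - 1) != 0)
		return -1;

	p = line + sizeof(key) - 1;
	for (;;) {
		/* strtol() skips leading blanks; end == p means no more numbers */
		(void)strtol(p, &end, 10);
		if (end == p)
			break;
		n++;
		p = end;
	}
	return n;
}

/*
 * Return the NSpid count for the calling process, or -1 if
 * /proc/self/status cannot be read or has no NSpid line.
 * (Hypothetical helper, not part of perf.)
 */
static int self_nspid_count(void)
{
	char line[256];
	int n = -1;
	FILE *f = fopen("/proc/self/status", "r");

	if (!f)
		return -1;
	while (n < 0 && fgets(line, sizeof(line), f))
		n = nspid_count(line);
	fclose(f);
	return n;
}
```

A count greater than 1 means the process sees a different PID for itself
than the one used for its directory in that '/proc' mount. With the unshare
command from the commit message, which does not remount '/proc', I'd expect
self_nspid_count() to return 2 inside the namespace, and 1 in the root
namespace or when '/proc' is remounted inside the new namespace.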

James

>
> To fix this issue, don't allow the perf tool to run in a non-root PID
> namespace when it launches a workload, and report an error in this
> case. This tells users to run the perf tool in the root PID namespace
> to gather correct info for the profiled program.
>
> Signed-off-by: Leo Yan <leo.yan@xxxxxxxxxx>
> ---
> tools/perf/util/evlist.c | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
> index 5f92319ce258..bdf79a97db66 100644
> --- a/tools/perf/util/evlist.c
> +++ b/tools/perf/util/evlist.c
> @@ -11,6 +11,7 @@
> #include <poll.h>
> #include "cpumap.h"
> #include "util/mmap.h"
> +#include "util/namespaces.h"
> #include "thread_map.h"
> #include "target.h"
> #include "evlist.h"
> @@ -1364,6 +1365,12 @@ int evlist__prepare_workload(struct evlist *evlist, struct target *target, const
>  	int child_ready_pipe[2], go_pipe[2];
>  	char bf;
>
> +	if (!nsinfo__is_in_root_namespace()) {
> +		pr_err("Perf runs in non-root PID namespace; please run perf tool ");
> +		pr_err("in the root PID namespace for gathering process info.\n");
> +		return -EPERM;
> +	}
> +
>  	if (pipe(child_ready_pipe) < 0) {
>  		perror("failed to create 'ready' pipe");
>  		return -1;
>