Re: [PATCH] perf/record: make perf_event__synthesize_mmap_events() scale

From: Jiri Olsa
Date: Wed Mar 15 2017 - 07:34:03 EST


On Tue, Mar 14, 2017 at 11:57:21PM -0700, Stephane Eranian wrote:
> This patch significantly improves the execution time of
> perf_event__synthesize_mmap_events() when running perf record
> on systems where processes have lots of threads. It just happens
> that the /proc/pid/maps support uses an O(N^2) algorithm to generate
> the map lines in the maps file. If you have 1000 threads, then you
> necessarily have 1000 stacks. For each vma, you need to check if it
> corresponds to a thread's stack. With a large number of threads, this
> can take a very long time. I have seen latencies >> 10 minutes.
>
> As of today, perf does not use the fact that a mapping is a stack,
> therefore we can work around the issue by using /proc/pid/task/pid/maps.
> This entry does not try to map a vma to a stack and is thus much
> faster with no loss of functionality.
>
> The proc-map-timeout logic is kept in case users still want some upper limit.
>
> Signed-off-by: Stephane Eranian <eranian@xxxxxxxxxx>
> ---
> tools/perf/util/event.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
> index 4ea7ce7..b137566 100644
> --- a/tools/perf/util/event.c
> +++ b/tools/perf/util/event.c
> @@ -255,8 +255,8 @@ int perf_event__synthesize_mmap_events(struct perf_tool *tool,
> if (machine__is_default_guest(machine))
> return 0;
>
> - snprintf(filename, sizeof(filename), "%s/proc/%d/maps",
> - machine->root_dir, pid);
> + snprintf(filename, sizeof(filename), "%s/proc/%d/task/%d/maps",
> + machine->root_dir, pid, pid);
>
> fp = fopen(filename, "r");
> if (fp == NULL) {
> --
> 2.5.0
>
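
For anyone who wants to try the per-task maps file outside of perf, here is
a minimal standalone sketch of the same idea. It is not code from the patch:
the helper names build_task_maps_path() and count_map_lines() are
hypothetical, and it assumes the procfs layout /proc/<pid>/task/<tid>/maps
with tid equal to pid for the thread group leader.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/*
 * Hypothetical helper (not part of perf): build the path of the
 * per-task maps file for the thread group leader of 'pid'.
 * Returns 0 on success, -1 if the path was truncated or formatting failed.
 */
int build_task_maps_path(char *buf, size_t len, const char *root_dir, pid_t pid)
{
	int n = snprintf(buf, len, "%s/proc/%d/task/%d/maps",
			 root_dir, (int)pid, (int)pid);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Hypothetical helper: count the newline-terminated lines in a maps file.
 * Returns the line count, or -1 if the file cannot be opened.
 */
long count_map_lines(const char *filename)
{
	char line[4096];
	long nr = 0;
	FILE *fp = fopen(filename, "r");

	if (fp == NULL)
		return -1;

	while (fgets(line, sizeof(line), fp) != NULL) {
		/* A very long line may span several fgets() calls;
		 * only count the chunk that ends the line. */
		if (strchr(line, '\n') != NULL)
			nr++;
	}

	fclose(fp);
	return nr;
}
```

Calling build_task_maps_path(buf, sizeof(buf), "", getpid()) and passing the
result to count_map_lines() should report the number of mappings of the
calling process. Timing that against /proc/<pid>/maps on a heavily threaded
process is one way to check the difference on a given kernel.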

nice..

Acked-by: Jiri Olsa <jolsa@xxxxxxxxxx>

thanks,
jirka