Re: [PATCHv2] perf tools: Force uncore events to system wide monitoring

From: Arnaldo Carvalho de Melo
Date: Wed Mar 01 2017 - 16:48:13 EST


Em Mon, Feb 27, 2017 at 10:48:18AM +0100, Jiri Olsa escreveu:
> On Mon, Feb 27, 2017 at 10:44:13AM +0100, Borislav Petkov wrote:
> > On Mon, Feb 27, 2017 at 10:28:59AM +0100, Jiri Olsa wrote:
> > > would you mind testing new version?
> >
> > Looks ok to me.
> >
> > Thanks.
>
> thanks a lot, full patch attached
>
> jirka

With it I can't build perf anymore, as I use perf with all make
invocations, so that I can catch when the tool stops working for users
more quickly, like in this case:

[acme@jouet linux]$ perf stat -e cycles usleep 1

Performance counter stats for 'usleep 1':

653,232 cycles:u

0.005507594 seconds time elapsed

[acme@jouet linux]$ perf stat usleep 1
Error:
You may not have permission to collect system-wide stats.

Consider tweaking /proc/sys/kernel/perf_event_paranoid,
which controls use of the performance events system by
unprivileged users (without CAP_SYS_ADMIN).

The current value is 2:

-1: Allow use of (almost) all events by all users
>= 0: Disallow raw tracepoint access by users without CAP_IOC_LOCK
>= 1: Disallow CPU event access by users without CAP_SYS_ADMIN
>= 2: Disallow kernel profiling by users without CAP_SYS_ADMIN

To make this setting permanent, edit /etc/sysctl.conf too, e.g.:

kernel.perf_event_paranoid = -1

[acme@jouet linux]$

------------------------------------------------

If I revert your patch thins are back working:

[acme@jouet linux]$ perf stat -e cycles usleep 1

Performance counter stats for 'usleep 1':

459,396 cycles:u

0.001159874 seconds time elapsed

[acme@jouet linux]$ perf stat usleep 1

Performance counter stats for 'usleep 1':

2.022049 task-clock:u (msec) # 0.181 CPUs utilized
0 context-switches:u # 0.000 K/sec
0 cpu-migrations:u # 0.000 K/sec
48 page-faults:u # 0.024 M/sec
2,177,301 cycles:u # 1.077 GHz
245,106 instructions:u # 0.11 insn per cycle
48,000 branches:u # 23.738 M/sec
3,936 branch-misses:u # 8.20% of all branches

0.011181563 seconds time elapsed

[acme@jouet linux]$

---------------------------------------------

Seems related to the auto :u done when the user can't do kernel profiling, etc.

Re-applying your patch, to triple check:

[acme@jouet linux]$ perf stat usleep 1 2>&1 | head -2
Error:
You may not have permission to collect system-wide stats.
[acme@jouet linux]$

------------------------------------------------

Waiting for v3 :-)

- Arnaldo



>
> ---
> Making system wide (-a) the default option if no target
> was specified and one of following conditions is met:
>
> - there's no workload specified (current behaviour)
> - there is workload specified but all requested
> events are system wide events
>
> Mixed events core/uncore with workload:
> $ perf stat -e 'uncore_cbox_0/clockticks/,cycles' sleep 1
>
> Performance counter stats for 'sleep 1':
>
> <not supported> uncore_cbox_0/clockticks/
> 980,489 cycles
>
> 1.000897406 seconds time elapsed
>
> Uncore event with workload:
> $ perf stat -e 'uncore_cbox_0/clockticks/' sleep 1
>
> Performance counter stats for 'system wide':
>
> 281,473,897,192,670 uncore_cbox_0/clockticks/
>
> 1.000833784 seconds time elapsed
>
> Suggested-by: Borislav Petkov <bp@xxxxxxxxx>
> Signed-off-by: Jiri Olsa <jolsa@xxxxxxxxxx>
> Cc: Namhyung Kim <namhyung@xxxxxxxxxx>
> Cc: Peter Zijlstra <a.p.zijlstra@xxxxxxxxx>
> Cc: Adrian Hunter <adrian.hunter@xxxxxxxxx>
> Cc: Borislav Petkov <bp@xxxxxxxxx>
> Link: http://lkml.kernel.org/n/tip-rh8kybg6xdadfrw6x6uj37i2@xxxxxxxxxxxxxx
> ---
> tools/perf/builtin-stat.c | 32 +++++++++++++++++++++++++++++---
> tools/perf/util/parse-events.c | 5 +++--
> 2 files changed, 32 insertions(+), 5 deletions(-)
>
> diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
> index f4f555a67e9b..1320352cda04 100644
> --- a/tools/perf/builtin-stat.c
> +++ b/tools/perf/builtin-stat.c
> @@ -2350,6 +2350,34 @@ static int __cmd_report(int argc, const char **argv)
> return 0;
> }
>
> +static void setup_system_wide(int forks)
> +{
> + /*
> + * Make system wide (-a) the default target if
> + * no target was specified and one of following
> + * conditions is met:
> + *
> + * - there's no workload specified
> + * - there is workload specified but all requested
> + * events are system wide events
> + */
> + if (!target__none(&target))
> + return;
> +
> + if (!forks)
> + target.system_wide = true;
> + else {
> + struct perf_evsel *counter;
> +
> + evlist__for_each_entry(evsel_list, counter) {
> + if (!counter->system_wide)
> + return;
> + }
> +
> + target.system_wide = true;
> + }
> +}
> +
> int cmd_stat(int argc, const char **argv, const char *prefix __maybe_unused)
> {
> const char * const stat_usage[] = {
> @@ -2456,9 +2484,7 @@ int cmd_stat(int argc, const char **argv, const char *prefix __maybe_unused)
> } else if (big_num_opt == 0) /* User passed --no-big-num */
> big_num = false;
>
> - /* Make system wide (-a) the default target. */
> - if (!argc && target__none(&target))
> - target.system_wide = true;
> + setup_system_wide(argc);
>
> if (run_count < 0) {
> pr_err("Run count must be a positive number\n");
> diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
> index 67a8aebc67ab..54355d3caf09 100644
> --- a/tools/perf/util/parse-events.c
> +++ b/tools/perf/util/parse-events.c
> @@ -316,8 +316,9 @@ __add_event(struct list_head *list, int *idx,
> return NULL;
>
> (*idx)++;
> - evsel->cpus = cpu_map__get(cpus);
> - evsel->own_cpus = cpu_map__get(cpus);
> + evsel->cpus = cpu_map__get(cpus);
> + evsel->own_cpus = cpu_map__get(cpus);
> + evsel->system_wide = !!cpus;
>
> if (name)
> evsel->name = strdup(name);
> --
> 2.7.4