Re: [PATCH 18/21] perf top: Support callchain accumulation

From: Jiri Olsa
Date: Sun Jan 05 2014 - 13:02:22 EST


On Tue, Dec 24, 2013 at 05:22:24PM +0900, Namhyung Kim wrote:
> From: Namhyung Kim <namhyung.kim@xxxxxxx>
>
> Enable cumulation of callchain of children in perf top.
>
> Cc: Arun Sharma <asharma@xxxxxx>
> Cc: Frederic Weisbecker <fweisbec@xxxxxxxxx>
> Signed-off-by: Namhyung Kim <namhyung@xxxxxxxxxx>
> ---
> tools/perf/builtin-top.c | 106 +++++++++++++++++++++++++++++++++++++++++++++--
> 1 file changed, 103 insertions(+), 3 deletions(-)
>
> diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
> index 48c527a0f4c8..6a7a76496c94 100644
> --- a/tools/perf/builtin-top.c
> +++ b/tools/perf/builtin-top.c
> @@ -657,6 +657,99 @@ static int symbol_filter(struct map *map __maybe_unused, struct symbol *sym)
> return 0;
> }
>
> +static int process_cumulative_entry(struct perf_top *top,
> + struct hist_entry *he,
> + struct perf_evsel *evsel,
> + struct addr_location *al,
> + struct perf_sample *sample,
> + struct symbol *parent)
> +{

hum, I wonder how hard it would be to factor the iterator
stuff out of the report command and export it to be usable
from here as well.. too much code duplicated below :-\

jirka

> + struct hist_entry **he_cache;
> + struct callchain_cursor_node *node;
> + int idx = 0, err;
> +
> + he_cache = malloc(sizeof(*he_cache) * (PERF_MAX_STACK_DEPTH + 1));
> + if (he_cache == NULL)
> + return -ENOMEM;
> +
> + pthread_mutex_lock(&evsel->hists.lock);
> +
> + he_cache[idx++] = he;
> +
> + /*
> + * This is for putting parents upward during output resort iff
> + * only a child gets sampled. See hist_entry__sort_on_period().
> + */
> + he->callchain->max_depth = PERF_MAX_STACK_DEPTH + 1;
> +
> + callchain_cursor_commit(&callchain_cursor);
> +
> + node = callchain_cursor_current(&callchain_cursor);
> + while (node) {
> + int i;
> + struct hist_entry he_tmp = {
> + .cpu = al->cpu,
> + .thread = al->thread,
> + .comm = thread__comm(al->thread),
> + .parent = parent,
> + };
> +
> + fill_callchain_info(al, node, false);
> +
> + he_tmp.ip = al->addr;
> + he_tmp.ms.map = al->map;
> + he_tmp.ms.sym = al->sym;
> +
> + if (al->sym && al->sym->ignore)
> + goto next;
> +
> + /*
> + * Check if there's duplicate entries in the callchain.
> + * It's possible that it has cycles or recursive calls.
> + */
> + for (i = 0; i < idx; i++) {
> + if (hist_entry__cmp(he_cache[i], &he_tmp) == 0)
> + goto next;
> + }
> +
> + he = __hists__add_entry(&evsel->hists, al, parent, NULL, NULL,
> + sample->period, sample->weight,
> + sample->transaction, false);
> + if (he == NULL) {
> + err = -ENOMEM;
> + break;;
> + }
> +
> + he_cache[idx++] = he;
> +
> + /*
> + * This is for putting parents upward during output resort iff
> + * only a child gets sampled. See hist_entry__sort_on_period().
> + */
> + he->callchain->max_depth = callchain_cursor.nr - callchain_cursor.pos;
> +
> + if (sort__has_sym) {
> + u64 ip;
> +
> + if (al->map)
> + ip = al->map->unmap_ip(al->map, al->addr);
> + else
> + ip = al->addr;
> +
> + perf_top__record_precise_ip(top, he, evsel->idx, ip);
> + }
> +
> +next:
> + callchain_cursor_advance(&callchain_cursor);
> + node = callchain_cursor_current(&callchain_cursor);
> + }
> +
> + pthread_mutex_unlock(&evsel->hists.lock);
> +
> + free(he_cache);
> + return err;
> +}
> +
> static void perf_event__process_sample(struct perf_tool *tool,
> const union perf_event *event,
> struct perf_evsel *evsel,
> @@ -754,9 +847,16 @@ static void perf_event__process_sample(struct perf_tool *tool,
> return;
> }
>
> - err = hist_entry__append_callchain(he, sample);
> - if (err)
> - return;
> + if (symbol_conf.cumulate_callchain) {
> + err = process_cumulative_entry(top, he, evsel, &al,
> + sample, parent);
> + if (err)
> + return;
> + } else {
> + err = hist_entry__append_callchain(he, sample);
> + if (err)
> + return;
> + }
>
> if (sort__has_sym)
> perf_top__record_precise_ip(top, he, evsel->idx, ip);
> --
> 1.7.11.7
>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/