Re: [PATCH 4/4] perf tools: determine if LR is the return address

From: Jiri Olsa
Date: Sat Jan 23 2021 - 19:08:06 EST


On Fri, Jan 22, 2021 at 04:18:54PM +0000, Alexandre Truong wrote:
> On arm64 in frame pointer mode (e.g. perf record --call-graph fp),
> use dwarf unwind info to check if the link register is the return
> address in order to inject it to the frame pointer stack.
>
> Write the following application:
>
> int a = 10;
>
> void f2(void)
> {
> 	for (int i = 0; i < 1000000; i++)
> 		a *= a;
> }
>
> void f1()
> {
> 	f2();
> }
>
> int main (void)
> {
> 	f1();
> 	return 0;
> }
>
> with the following compilation flags:
> gcc -g -fno-omit-frame-pointer -fno-inline -O1
>
> The compiler omits the frame pointer for f2 on arm. This is a problem
> with any leaf call, for example an application with many different
> calls to malloc() would always omit the calling frame, even if it
> can be determined.
>
> ./perf record --call-graph fp ./a.out
> ./perf report
>
> currently gives the following stack:
>
> 0xffffea52f361
> _start
> __libc_start_main
> main
> f2

reproduced on x86 as well

> +static bool get_leaf_frame_caller_enabled(struct perf_sample *sample)
> +{
> +	return callchain_param.record_mode != CALLCHAIN_FP || !sample->user_regs.regs
> +		|| sample->user_regs.mask != PERF_REGS_MASK;
> +}
> +
> +static int add_entry(struct unwind_entry *entry, void *arg)
> +{
> +	struct entries *entries = arg;
> +
> +	entries->stack[entries->i++] = entry->ip;
> +	return 0;
> +}
> +
> +u64 get_leaf_frame_caller_aarch64(struct perf_sample *sample, struct thread *thread)
> +{
> +	u64 leaf_frame;
> +	struct entries entries = {{0, 0}, 0};
> +
> +	if (get_leaf_frame_caller_enabled(sample))

the name suggests you'd want to continue if it's true
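
To make the naming point concrete, here is a rough sketch of the inverted
predicate, so that "enabled" returning true means "go ahead". The types,
the mask value and the lr_index parameter below are hypothetical stand-ins,
not the perf definitions; only the shape of the condition is taken from the
patch (negated via De Morgan).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the perf types; not the real definitions. */
enum record_mode { MODE_NONE, MODE_FP, MODE_DWARF };

struct sample_regs {
	const uint64_t *regs;	/* NULL when no user registers were sampled */
	uint64_t mask;		/* bitmask of the sampled registers */
};

/* Placeholder value; the patch compares against PERF_REGS_MASK. */
#define ALL_REGS_MASK 0xffffffffULL

/* True when the leaf-frame lookup should run (negation of the patch's test). */
static bool leaf_frame_caller_enabled(enum record_mode mode,
				      const struct sample_regs *r)
{
	return mode == MODE_FP && r->regs != NULL && r->mask == ALL_REGS_MASK;
}

/* Caller side: bail out early when the feature is NOT enabled. */
static uint64_t leaf_frame_caller(enum record_mode mode,
				  const struct sample_regs *r, size_t lr_index)
{
	if (!leaf_frame_caller_enabled(mode, r))
		return 0;

	/* The unwind and LR comparison from the patch would go here. */
	return r->regs[lr_index];
}
```

With this shape the early return reads "if not enabled, return 0", which
matches what the function name promises.
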

> +		return 0;
> +
> +	unwind__get_entries(add_entry, &entries, thread, sample, 2);

I'm scratching my head over how this unwinds anything: you enabled just
the registers, not the stack, right? So the unwind code would just do
the IP -> LR + 1 shift?

thanks,
jirka

> +	leaf_frame = callchain_param.order == ORDER_CALLER ?
> +		entries.stack[0] : entries.stack[1];
> +
> +	if (leaf_frame + 1 == sample->user_regs.regs[PERF_REG_ARM64_LR])
> +		return sample->user_regs.regs[PERF_REG_ARM64_LR];
> +	return 0;
> +}

SNIP