Re: Getting empty callchain from perf_callchain_kernel()

From: Peter Zijlstra
Date: Wed Jun 12 2019 - 04:59:27 EST


On Tue, Jun 11, 2019 at 10:05:01PM -0500, Josh Poimboeuf wrote:
> On Fri, May 24, 2019 at 10:53:19AM +0200, Peter Zijlstra wrote:
> > > For ORC, I'm thinking we may be able to just require that all generated
> > > code (BPF and others) always use frame pointers. Then when ORC doesn't
> > > recognize a code address, it could try using the frame pointer as a
> > > fallback.
> >
> > Yes, this seems like a sensible approach. We'd also have to audit the
> > ftrace and kprobe trampolines, IIRC they only do framepointer setup for
> > CONFIG_FRAME_POINTER currently, which should be easy to fix (after the
> > patches I have to fix the FP generation in the first place:
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git/log/?h=x86/wip
>
> Right now, ftrace has a special hook in the ORC unwinder
> (orc_ftrace_find). It would be great if we could get rid of that in
> favor of the "always use frame pointers" approach. I'll hold off on
> doing the kpatch/kprobe trampoline conversions in my patches since it
> would conflict with yours.
>
> Though, hm, because of pt_regs I guess ORC would need to be able to
> decode an encoded frame pointer? I was hoping we could leave those
> encoded frame pointers behind in CONFIG_FRAME_POINTER-land forever...

Ah, I see.. could a similar approach work for the kprobe trampolines
perhaps?

> Here are my latest BPF unwinder patches in case anybody wants a sneak
> peek:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/jpoimboe/linux.git/log/?h=bpf-orc-fix

On a quick read-through, that looks good to me. A minor nit:

/* mov dst_reg, %r11 */
EMIT_mov(dst_reg, AUX_REG);

The disparity between %r11 and AUX_REG is jarring. I understand the
whole bpf register mapping thing, but it is just weird when reading
this.

Other than that, the same note as before, the 32bit JIT still seems
buggered, but I'm not sure you (or anybody else) cares enough about that
to fix it though. It seems to use ebp as its own frame pointer, which
completely defeats an unwinder.