Re: [PATCH v2] arm64: fix unwind_frame() for filtered out fn for function graph tracing

From: Will Deacon
Date: Tue Sep 12 2017 - 22:43:15 EST


On Tue, Sep 12, 2017 at 10:54:28AM +0100, James Morse wrote:
> Hi Pratyush,
>
> On 01/09/17 06:48, Pratyush Anand wrote:
> > do_task_stat() calls get_wchan(), which further does unwind_frame().
> > unwind_frame() restores frame->pc to original value in case function
> > graph tracer has modified a return address (LR) in a stack frame to hook
> > a function return. However, if function graph tracer has hit a filtered
> > function, then we can't unwind it as ftrace_push_return_trace() has
> > biased the index(frame->graph) with a 'huge negative'
> > offset(-FTRACE_NOTRACE_DEPTH).
> >
> > Moreover, the arm64 stack walker defines index(frame->graph) as unsigned
> > int, which cannot hold a -ve number, so a test for a -ve index never fires.
> >
> > We can have a similar problem when walk_stackframe() is called from
> > save_stack_trace_tsk() or dump_backtrace().
> >
> > This patch fixes unwind_frame() to test the index for -ve value and
> > restore index accordingly before we can restore frame->pc.
>
> I've just spotted arm64's profile_pc, which does this:
> From arch/arm64/kernel/time.c:profile_pc():
> > #ifdef CONFIG_FUNCTION_GRAPH_TRACER
> > frame.graph = -1; /* no task info */
> > #endif
>
> Is this another elaborate way of hitting this problem?
>
> I guess the options are: skip any return-address restore in the unwinder if
> frame.graph is -1 (and profile_pc may have a bug here), or put
> current->curr_ret_stack in there.
>
> profile_pc() always passes tsk=NULL, so the unwinder assumes it's current...
> kernel/profile.c pulls the pt_regs from a per-cpu irq_regs variable, that is
> updated by handle_IPI ... so it looks like this should always be current...

Hmmm... is profile_pc the *only* case where frame->graph isn't equal to
tsk->curr_ret_stack in unwind_frame? If so, maybe unwind_frame should just
use that, and we could kill the graph member of struct stackframe completely?
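
For reference, the signedness problem described in the quoted patch text can
be sketched in plain user-space C. This is only an illustration, not the
kernel code: NOTRACE_DEPTH, is_biased(), restore_index() and
in_range_unsigned() are hypothetical names, the value 65536 is an assumed
stand-in for FTRACE_NOTRACE_DEPTH, and treating -1 as "no task info" follows
the profile_pc() snippet quoted above.

```c
#include <assert.h>

/* Assumed stand-in for the kernel's FTRACE_NOTRACE_DEPTH bias. */
#define NOTRACE_DEPTH 65536

/* A signed index below -1 is taken to be a biased (filtered) index.
 * -1 itself is treated as "no task info", as in the profile_pc() snippet. */
static int is_biased(int graph)
{
	return graph < -1;
}

/* Undo the bias on a signed index; leave -1 and valid indexes untouched. */
static int restore_index(int graph)
{
	/* The bias is applied at most once, so the index cannot be lower. */
	assert(graph >= -NOTRACE_DEPTH);

	if (is_biased(graph))
		graph += NOTRACE_DEPTH;
	return graph;
}

/* With an unsigned index, a biased value wraps to a huge positive number,
 * so a "negative" check is impossible and the index is simply out of range. */
static int in_range_unsigned(unsigned int graph, unsigned int depth)
{
	return graph < depth;
}
```

With a signed index, restore_index(3 - NOTRACE_DEPTH) gives back 3, whereas
the same biased value stored in an unsigned int is just an out-of-range
number with no indication that a bias was applied.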

Will