Re: [RFC PATCH v2 3/4] arm64: Detect FTRACE cases that make the stack trace unreliable

From: Mark Rutland
Date: Fri Apr 09 2021 - 08:27:10 EST


On Mon, Apr 05, 2021 at 03:43:12PM -0500, madvenka@xxxxxxxxxxxxxxxxxxx wrote:
> From: "Madhavan T. Venkataraman" <madvenka@xxxxxxxxxxxxxxxxxxx>
>
> When CONFIG_DYNAMIC_FTRACE_WITH_REGS is enabled and tracing is activated
> for a function, the ftrace infrastructure is called for the function at
> the very beginning. Ftrace creates two frames:
>
> - One for the traced function
>
> - One for the caller of the traced function
>
> That gives a reliable stack trace while executing in the ftrace code. When
> ftrace returns to the traced function, the frames are popped and everything
> is back to normal.
>
> However, in cases like live patch, a tracer function may redirect execution
> to a different function when it returns. A stack trace taken while still in
> the tracer function will not show the target function. The target function
> is the real function that we want to track.
>
> So, if an FTRACE frame is detected on the stack, just mark the stack trace
> as unreliable. The detection is done by checking the return PC against
> FTRACE trampolines.
>
> Also, the Function Graph Tracer modifies the return address of a traced
> function to a return trampoline to gather tracing data on function return.
> Stack traces taken from that trampoline and functions it calls are
> unreliable as the original return address may not be available in
> that context. Mark the stack trace unreliable accordingly.
>
> Signed-off-by: Madhavan T. Venkataraman <madvenka@xxxxxxxxxxxxxxxxxxx>
> ---
> arch/arm64/kernel/entry-ftrace.S | 12 +++++++
> arch/arm64/kernel/stacktrace.c | 61 ++++++++++++++++++++++++++++++++
> 2 files changed, 73 insertions(+)
>
> diff --git a/arch/arm64/kernel/entry-ftrace.S b/arch/arm64/kernel/entry-ftrace.S
> index b3e4f9a088b1..1f0714a50c71 100644
> --- a/arch/arm64/kernel/entry-ftrace.S
> +++ b/arch/arm64/kernel/entry-ftrace.S
> @@ -86,6 +86,18 @@ SYM_CODE_START(ftrace_caller)
> b ftrace_common
> SYM_CODE_END(ftrace_caller)
>
> +/*
> + * A stack trace taken from anywhere in the FTRACE trampoline code should be
> + * considered unreliable as a tracer function (patched at ftrace_call) could
> + * potentially set pt_regs->pc and redirect execution to a function different
> + * than the traced function. E.g., livepatch.

IIUC the issue here that we have two copies of the pc: one in the regs,
and one in a frame record, and so after the update to the regs, the
frame record is stale.

This is something that we could fix by having
ftrace_instruction_pointer_set() set both.

However, as noted elsewhere there are other issues which mean we'd still
need special unwinding code for this.

Thanks,
Mark.

> + *
> + * No stack traces are taken in this FTRACE trampoline assembly code. But
> + * they can be taken from C functions that get called from here. The unwinder
> + * checks if a return address falls in this FTRACE trampoline code. See
> + * stacktrace.c. If function calls in this code are changed, please keep the
> + * special_functions[] array in stacktrace.c in sync.
> + */
> SYM_CODE_START(ftrace_common)
> sub x0, x30, #AARCH64_INSN_SIZE // ip (callsite's BL insn)
> mov x1, x9 // parent_ip (callsite's LR)
> diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
> index fb11e4372891..7a3c638d4aeb 100644
> --- a/arch/arm64/kernel/stacktrace.c
> +++ b/arch/arm64/kernel/stacktrace.c
> @@ -51,6 +51,52 @@ struct function_range {
> * unreliable. Breakpoints are used for executing probe code. Stack traces
> * taken while in the probe code will show an EL1 frame and will be considered
> * unreliable. This is correct behavior.
> + *
> + * FTRACE
> + * ======
> + *
> + * When CONFIG_DYNAMIC_FTRACE_WITH_REGS is enabled, the FTRACE trampoline code
> + * is called from a traced function even before the frame pointer prolog.
> + * FTRACE sets up two stack frames (one for the traced function and one for
> + * its caller) so that the unwinder can provide a sensible stack trace for
> + * any tracer function called from the FTRACE trampoline code.
> + *
> + * There are two cases where the stack trace is not reliable.
> + *
> + * (1) The task gets preempted before the two frames are set up. Preemption
> + * involves an interrupt which is an EL1 exception. The unwinder already
> + * handles EL1 exceptions.
> + *
> + * (2) The tracer function that gets called by the FTRACE trampoline code
> + * changes the return PC (e.g., livepatch).
> + *
> + * Not all tracer functions do that. But to err on the side of safety,
> + * consider the stack trace as unreliable in all cases.
> + *
> + * When Function Graph Tracer is used, FTRACE modifies the return address of
> + * the traced function in its stack frame to an FTRACE return trampoline
> + * (return_to_handler). When the traced function returns, control goes to
> + * return_to_handler. return_to_handler calls FTRACE to gather tracing data
> + * and to obtain the original return address. Then, return_to_handler returns
> + * to the original return address.
> + *
> + * There are two cases to consider from a stack trace reliability point of
> + * view:
> + *
> + * (1) Stack traces taken within the traced function (and functions that get
> + * called from there) will show return_to_handler instead of the original
> + * return address. The original return address can be obtained from FTRACE.
> + * The unwinder already obtains it and modifies the return PC in its copy
> + * of the stack frame to the original return address. So, this is handled.
> + *
> + * (2) return_to_handler calls FTRACE as mentioned before. FTRACE discards
> + * the record of the original return address along the way as it does not
> + * need to maintain it anymore. This means that the unwinder cannot get
> + * the original return address beyond that point while the task is still
> + * executing in return_to_handler. So, consider the stack trace unreliable
> + * if return_to_handler is detected on the stack.
> + *
> + * NOTE: The unwinder must do (1) before (2).
> */
> static struct function_range special_functions[] = {
> /*
> @@ -64,6 +110,21 @@ static struct function_range special_functions[] = {
> { (unsigned long) el1_fiq_invalid, 0 },
> { (unsigned long) el1_error_invalid, 0 },
>
> + /*
> + * FTRACE trampolines.
> + *
> + * Tracer function gets patched at the label ftrace_call. Its return
> + * address is the next instruction address.
> + */
> +#ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS
> + { (unsigned long) ftrace_call + 4, 0 },
> +#endif
> +
> +#ifdef CONFIG_FUNCTION_GRAPH_TRACER
> + { (unsigned long) ftrace_graph_caller, 0 },
> + { (unsigned long) return_to_handler, 0 },
> +#endif
> +
> { /* sentinel */ }
> };
>
> --
> 2.25.1
>