Re: [RFC PATCH v3 19/22] arm64: unwinder: Add a reliability check in the unwinder based on ORC

From: Suraj Jitindar Singh
Date: Wed Feb 22 2023 - 23:07:34 EST


On Thu, 2023-02-02 at 01:40 -0600, madvenka@xxxxxxxxxxxxxxxxxxx wrote:
> From: "Madhavan T. Venkataraman" <madvenka@xxxxxxxxxxxxxxxxxxx>
>
> Introduce a reliability flag in struct unwind_state. This will be set
> to
> false if the PC does not have a valid ORC or if the frame pointer
> computed
> from the ORC does not match the actual frame pointer.
>
> Now that the unwinder can validate the frame pointer, introduce
> arch_stack_walk_reliable().
>
> Signed-off-by: Madhavan T. Venkataraman <madvenka@xxxxxxxxxxxxxxxxxxx
> >
> ---
> arch/arm64/include/asm/stacktrace/common.h | 15 ++
> arch/arm64/kernel/stacktrace.c | 167
> ++++++++++++++++++++-
> 2 files changed, 175 insertions(+), 7 deletions(-)
>

[snip]

> -static void notrace unwind(struct unwind_state *state,
> +static int notrace unwind(struct unwind_state *state, bool
> need_reliable,
> stack_trace_consume_fn consume_entry, void
> *cookie)
> {
> - while (1) {
> - int ret;
> + int ret = 0;
>
> + while (1) {
> + if (need_reliable && !state->reliable)
> + return -EINVAL;
> if (!consume_entry(cookie, state->pc))
> break;
> ret = unwind_next(state);
> + if (need_reliable && !ret)
> + unwind_check_reliable(state);
> if (ret < 0)
> break;
> }
> + return ret;

nit:

I think you're looking more for comments on the approach and the
correctness of these patches, but from an initial read I'm still
putting it all together in my head. So this comment is on the coding
style.

The above loop seems to check the current reliability state, then
unwind a frame then check the reliability, and then break based of
something which couldn't have been updated by the line immediately
above. I propose something like:

unwind(...) {
ret = 0;

while (!ret) {
if (need_reliable) {
unwind_check_reliable(state);
if (!state->reliable)
return -EINVAL;
}
if (!consume_entry(cookie, state->pc))
return -EINVAL;
ret = unwind_next(state);
}

return ret;
}

This also removes the need for the call to unwind_check_reliable()
before the first unwind() below in arch_stack_walk_reliable().

- Suraj

> }
> NOKPROBE_SYMBOL(unwind);
>
> @@ -216,5 +337,37 @@ noinline notrace void
> arch_stack_walk(stack_trace_consume_fn consume_entry,
> unwind_init_from_task(&state, task);
> }
>
> - unwind(&state, consume_entry, cookie);
> + unwind(&state, false, consume_entry, cookie);
> +}
> +
> +noinline notrace int arch_stack_walk_reliable(
> + stack_trace_consume_fn consume_entry,
> + void *cookie, struct task_struct *task)
> +{
> + struct stack_info stacks[] = {
> + stackinfo_get_task(task),
> + STACKINFO_CPU(irq),
> +#if defined(CONFIG_VMAP_STACK)
> + STACKINFO_CPU(overflow),
> +#endif
> +#if defined(CONFIG_VMAP_STACK) && defined(CONFIG_ARM_SDE_INTERFACE)
> + STACKINFO_SDEI(normal),
> + STACKINFO_SDEI(critical),
> +#endif
> + };
> + struct unwind_state state = {
> + .stacks = stacks,
> + .nr_stacks = ARRAY_SIZE(stacks),
> + };
> + int ret;
> +
> + if (task == current)
> + unwind_init_from_caller(&state);
> + else
> + unwind_init_from_task(&state, task);
> + unwind_check_reliable(&state);
> +
> + ret = unwind(&state, true, consume_entry, cookie);
> +
> + return ret == -ENOENT ? 0 : -EINVAL;
> }