Re: [PATCH 1/2] x86/stacktrace: do not fail when regs on stack for ORC

From: Josh Poimboeuf
Date: Thu Nov 30 2017 - 14:59:45 EST


On Thu, Nov 30, 2017 at 01:57:10PM -0600, Josh Poimboeuf wrote:
> On Thu, Nov 30, 2017 at 09:03:24AM +0100, Jiri Slaby wrote:
> > save_stack_trace_reliable now returns "non reliable" when there are
> > kernel pt_regs on stack. This means an interrupt or exception happened
> > somewhere down the route. It is a problem for the frame pointer unwinder,
> > because the frame might not have been set up yet when the irq happened,
> > so it might fail to unwind from the interrupted function.
> >
> > With ORC, this is not a problem, as ORC has out-of-band data. We can
> > find ORC data even for the IP in the interrupted function and always unwind
> > one level up.
> >
> > So introduce `unwind_regs_reliable' which decides if this is an issue
> > for the currently selected unwinder at all and change the code
> > accordingly.
>
> Thanks. I'm thinking there are a few ways we can simplify things. (Most of
> these are actually issues with the existing code.)
>
> - Currently we check to make sure that there's no frame *after* the user
> space regs. I think there's no way that could ever happen and the
> check is overkill.
>
> - We should probably remove the STACKTRACE_DUMP_ONCE() warnings. There
> are some known places where a stack trace will fail, particularly with
> generated code. I wish we had a DEBUG_WARN_ON() macro which used
> pr_debug(), but oh well. At least the livepatch code has some helpful
> pr_warn()s, those are probably good enough.
  ^^^^^^^^^^
  meant to say pr_debug()s.

Also adding the live patching mailing list as an FYI.

>
> - The unwind->error checks are superfluous. The only errors we need to
> check for are (a) whether the FP unwinder encountered a kernel irq and
> (b) whether we reached the final user regs frame. So I think
> unwind->error can be removed altogether.
>
> So with those changes in mind, how about something like this (plus
> comments)?
>
> 	for (unwind_start(&state, task, NULL, NULL); !unwind_done(&state);
> 	     unwind_next_frame(&state)) {
>
> 		regs = unwind_get_entry_regs(&state);
> 		if (regs) {
> 			if (user_mode(regs))
> 				goto success;
>
> 			if (IS_ENABLED(CONFIG_FRAME_POINTER))
> 				return -EINVAL;
> 		}
>
> 		addr = unwind_get_return_address(&state);
> 		if (!addr)
> 			return -EINVAL;
>
> 		if (save_stack_address(trace, addr, false))
> 			return -EINVAL;
> 	}
>
> 	return -EINVAL;
>
> success:
> 	if (trace->nr_entries < trace->max_entries)
> 		trace->entries[trace->nr_entries++] = ULONG_MAX;
>
> 	return 0;
>
> After these changes I believe we can enable
> CONFIG_HAVE_RELIABLE_STACKTRACE for ORC.
>
> Also, when you post the next version, please cc the live patching
> mailing list, since this is directly relevant to livepatch.
>
> Thanks!
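
For anyone who wants to poke at the control flow of the loop quoted above
outside the kernel, here is a minimal userspace sketch. It is only a model:
mock_frame, mock_trace, mock_save_address() and mock_save_reliable() are
hypothetical stand-ins invented for illustration, not kernel APIs, and the
frame_pointer flag stands in for IS_ENABLED(CONFIG_FRAME_POINTER). The array
of frames replaces the real unwinder state.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one unwound frame (not a kernel type). */
struct mock_frame {
	unsigned long ret_addr;	/* 0: no return address could be found */
	bool has_regs;		/* pt_regs were found at this frame */
	bool user_regs;		/* those regs are user-mode regs */
};

/* Hypothetical stand-in for the trace buffer (not a kernel type). */
struct mock_trace {
	unsigned long *entries;
	unsigned int nr_entries;
	unsigned int max_entries;
};

/* Store one address; nonzero return means the buffer is full. */
static int mock_save_address(struct mock_trace *trace, unsigned long addr)
{
	if (trace->nr_entries >= trace->max_entries)
		return -1;

	trace->entries[trace->nr_entries++] = addr;
	return 0;
}

/*
 * Model of the proposed loop. The trace is reliable only if the walk
 * ends at user-mode regs. Kernel regs (irq/exception) are fatal only
 * when the frame pointer unwinder is modeled (frame_pointer == true).
 */
static int mock_save_reliable(struct mock_trace *trace,
			      const struct mock_frame *frames,
			      size_t nr_frames, bool frame_pointer)
{
	size_t i;

	assert(trace && frames);

	for (i = 0; i < nr_frames; i++) {
		const struct mock_frame *f = &frames[i];

		if (f->has_regs) {
			if (f->user_regs)
				goto success;

			if (frame_pointer)
				return -EINVAL;
		}

		if (!f->ret_addr)
			return -EINVAL;

		if (mock_save_address(trace, f->ret_addr))
			return -EINVAL;
	}

	/* Ran out of frames without reaching the user regs. */
	return -EINVAL;

success:
	/* Terminate the trace if there is room, as in the quoted code. */
	if (trace->nr_entries < trace->max_entries)
		trace->entries[trace->nr_entries++] = ULONG_MAX;

	return 0;
}
```

Feeding it a frame list with kernel regs in the middle should fail with
frame_pointer set and succeed without it, which is the behavioral
difference between the two unwinders that the proposal relies on.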

--
Josh