Re: [PATCH v6 17/23] scripts: decode_stacktrace: demangle Rust symbols

From: Kees Cook
Date: Sat May 07 2022 - 04:32:24 EST


On Sat, May 07, 2022 at 07:24:15AM +0200, Miguel Ojeda wrote:
> Recent versions of both Binutils (`c++filt`) and LLVM (`llvm-cxxfilt`)
> provide Rust v0 mangling support.
>
> Co-developed-by: Alex Gaynor <alex.gaynor@xxxxxxxxx>
> Signed-off-by: Alex Gaynor <alex.gaynor@xxxxxxxxx>
> Co-developed-by: Wedson Almeida Filho <wedsonaf@xxxxxxxxxx>
> Signed-off-by: Wedson Almeida Filho <wedsonaf@xxxxxxxxxx>
> Signed-off-by: Miguel Ojeda <ojeda@xxxxxxxxxx>
> ---
> I would like to use this patch for discussing the demangling topic.
>
> The following discusses the different approaches we could take.
>
>
> # Leave demangling to userspace
>
> This is the easiest and least invasive approach, the one implemented
> by this patch.
>
> The `decode_stacktrace.sh` script is already needed to map
> the offsets to the source code. Therefore, we could also take
> the chance to demangle the symbols here.
>
> With this approach, we do not need to introduce any change in the
> `vsprintf` machinery and we minimize the risk of breaking user tools.
>
> Note that, if we take this approach, it is likely we want to ask
> for a minimum version of either of the tools (since there may be
> users of the script that do not have recent enough toolchains).

For the first in-tree Rust support, I think this is entirely the right
approach.
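
For anyone following along, the userspace step boils down to roughly the
following. This is only a sketch in Python, not the shell code from the
patch: the regex, the `demangle_line` helper and the lookup table are
all hypothetical, and the table stands in for a call to `c++filt` or
`llvm-cxxfilt`.

```python
import re

# Rust v0 mangled names start with "_R" and use only [A-Za-z0-9_].
RUST_V0_RE = re.compile(r"\b_R[0-9A-Za-z_]+")


def demangle_line(line, demangle):
    """Replace every Rust v0 symbol in `line` using the `demangle` callable."""
    return RUST_V0_RE.sub(lambda m: demangle(m.group(0)), line)


# Stand-in demangler backed by a made-up table (assumption for illustration).
# A real script would hand the symbol to an external demangling tool instead.
FAKE_TABLE = {"_RNvCs1234_7example3foo": "example::foo"}


def fake_demangle(symbol):
    # Unknown symbols are returned unchanged, so C symbols pass through.
    return FAKE_TABLE.get(symbol, symbol)
```

Since only tokens that look like `_R...` are touched, C symbols and the
offsets after `+` are left alone, and nothing in the kernel has to
change.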

> # Demangling in kernelspace on-the-fly

Please no. :) I don't see a benefit compared to doing it at
compile-time.

> Furthermore, this approach (and the ones below) likely require adding
> a new `%p` specifier (or a new modifier to existing ones) if we do
> not want to affect non-backtrace uses of the `B`/`S` ones. Also,
> it is unclear whether we should write the demangled versions in an
> extra, different line or replace the real symbol -- we could be
> breaking user tools relying on parsing backtraces (e.g. our own
> `decode_stacktrace.sh`). For instance, they could be relying on
> having real symbols there, or may break due to e.g. spaces.

I may need some examples of what you think will cause problems. Why a
new specifier? Won't demangling just give us text? Is the concern about
breaking backtrace parsers that only understand C symbols?
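
If it is the whitespace case, I can imagine something like the
following. This is a rough sketch in Python: the backtrace lines, the
symbol names and the parser are all made up for illustration and are
not taken from any real tool.

```python
import re

# Hypothetical backtrace lines; both symbol names are invented.
MANGLED = "[   12.345678]  ? _RNvCs1234_7example3foo+0x1a/0x40"
DEMANGLED = "[   12.345678]  ? <example::Foo as core::fmt::Debug>::fmt+0x1a/0x40"

# A naive parser that assumes a symbol never contains whitespace.
SYMBOL_RE = re.compile(r"(\S+)\+0x([0-9a-f]+)/0x([0-9a-f]+)$")


def parse_symbol(line):
    """Return the symbol name this naive parser extracts, or None."""
    match = SYMBOL_RE.search(line)
    return match.group(1) if match else None


# Mangled: the whole symbol is recovered.
mangled_result = parse_symbol(MANGLED)
# Demangled: only the part after the last space is recovered.
demangled_result = parse_symbol(DEMANGLED)
```

Here `mangled_result` is the full `_RNvCs1234_7example3foo`, while
`demangled_result` is only `core::fmt::Debug>::fmt`, because the
demangled name contains spaces. Is that the sort of breakage you have
in mind?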

> # Demangling at compile-time
>
> This implies having kallsyms demangle all the Rust symbols.
>
> The size of this data is around the same order of magnitude as the
> non-demangled ones. However, this is notably more than the demangling
> code (see previous point), e.g. 120 KiB (uncompressed) in a
> small kernel.

It seems all of that would be in the build-time helper, not the kernel
image, though, so that seems better than run-time demangling.

> # Demangling at compile-time and substituting symbols by hashes

Nah; this is even less readable than the mangled symbols. :) I don't
think the symbol length should be a concern. (Though maybe there are
some crash parsers that we can buffer overflow!)

> scripts/decode_stacktrace.sh | 14 ++++++++++++++
> 1 file changed, 14 insertions(+)

Reviewed-by: Kees Cook <keescook@xxxxxxxxxxxx>

--
Kees Cook