Re: [PATCH 7/7] DWARF: add the config option

From: Josh Poimboeuf
Date: Tue May 09 2017 - 15:23:00 EST


On Tue, May 09, 2017 at 08:47:50PM +0200, Jiri Kosina wrote:
> On Sun, 7 May 2017, Josh Poimboeuf wrote:
>
> > DWARF is great for debuggers. It helps you find all the registers on
> > the stack, so you can see function arguments and local variables. All
> > expressed in a nice compact format.
> >
> > But that's overkill for unwinders. We don't need all those registers,
> > and the state machine is too complicated.
>
> OTOH if we make the failures in processing of that "auxiliary"
> information non-fatal (in a sense that it neither causes a kernel bug nor
> does it actually corrupt the unwinding process, but the only effect is
> losing "optional" information), having this data available doesn't hurt.

But it does hurt, in the sense that the complicated format of DWARF CFI
means the unwinder has to jump through a lot more hoops to read it.

> It's there anyway for builds containing debuginfo, and the information is
> all there so that it can be used by things like gdb or crash, so it seems
> natural to re-use as much as possible of it.

There's a valid argument to be made that we should start with the DWARF
data instead of creating the new data from scratch. That might be fine.
Right now I don't have a strong feeling about it either way.

But if we do that, we should still convert the DWARF data to a simple
streamlined format for the in-kernel unwinder, so it can easily be read
by the kernel without having to fire up a DWARF state machine in the
middle of an oops.
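
To make that concrete, here's a rough sketch of what such a streamlined
format could look like: a table sorted by instruction address, where each
entry says which register to start from and what offset to add to find the
caller's frame. All the names and the layout below are made up for
illustration. This isn't an existing kernel interface, just one possible
shape for the data.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical base-register selectors; not an existing kernel API. */
enum unwind_base { UNWIND_BASE_SP, UNWIND_BASE_BP };

/*
 * Hypothetical table entry.  Each entry covers every instruction address
 * from 'ip' up to (but not including) the next entry's 'ip'.
 */
struct unwind_entry {
	uintptr_t ip;		/* first instruction address covered */
	int16_t cfa_offset;	/* caller's frame = base register + offset */
	uint8_t base;		/* enum unwind_base */
};

/* Binary-search a table sorted by ip for the last entry with ip <= addr. */
static const struct unwind_entry *
unwind_lookup(const struct unwind_entry *table, size_t n, uintptr_t addr)
{
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (table[mid].ip <= addr)
			lo = mid + 1;
		else
			hi = mid;
	}
	return lo ? &table[lo - 1] : NULL;
}

/*
 * Given an instruction address plus the current stack and frame pointers,
 * compute the caller's frame address.  Returns 0 on success, or -1 if the
 * address lies below the first table entry.
 */
static int unwind_cfa(const struct unwind_entry *table, size_t n,
		      uintptr_t addr, uintptr_t sp, uintptr_t bp,
		      uintptr_t *cfa)
{
	const struct unwind_entry *e = unwind_lookup(table, n, addr);

	if (!e)
		return -1;
	*cfa = (e->base == UNWIND_BASE_SP ? sp : bp) + e->cfa_offset;
	return 0;
}
```

The point is that the lookup is a plain binary search plus one addition,
with no state machine to run at oops time.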

And if we wanted it to be reasonably reliable, we'd also need to fix up
the DWARF data somehow before converting it, presumably with objtool.

> > Unwinders basically only need to know one thing: given an instruction
> > address and a stack pointer, where is the caller's stack frame?
>
> Again, DWARF should be able to give us all of this (including the
> FP-fallback etc). It feels a bit silly to purposely ignore it and
> reinvent parts of it again, instead of fixing (read: "asking toolchain
> guys to fix") the cases where we actually are not getting the proper data
> in DWARF. That's a win-win at the end of the day.

Most of the kernel DWARF issues I've seen aren't caused by toolchain
bugs. They're caused by the kernel's quirks: asm, inline asm, special
sections.

And anyway, fixing the correctness of the DWARF data is only half the
problem IMO. The other half of the problem is unwinder complexity.

--
Josh