Re: BUG-lockdep and freeze (was: Arrr! Linux 2.6.18)

From: Linus Torvalds
Date: Sat Sep 30 2006 - 17:12:23 EST

On Sat, 30 Sep 2006, Ingo Molnar wrote:
> (i'd have been happy with an %rbp based unwinder for x86_64, in fact i
> implemented it for lockdep and used it for some time on x86_64, but Andi
> wanted a dwarf-based, lower-overhead one. Andi also nicely integrated it
> into stacktrace.c.)

I wouldn't mind the dwarf-based one so much, if it wasn't so obviously

It could - and _should_ dammit! - do some basic sanity tests like "is the
thing even in the same stack page"? But nooo... It seems _designed_ to be
fragile and broken.

Here's a simple test: if the next stack-slot isn't on the same page, the
unwind information is bogus unless you had the IRQ stack-switch signature
there. Does the code do that? No. It just assumes that unwind information
is complete and perfect.
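
A minimal sketch of that test in C, for illustration only. It assumes 4 KiB
pages, and the names SKETCH_PAGE_SIZE, same_stack_page(),
next_frame_plausible() and the irq_stack_switch flag are all hypothetical;
none of them come from the kernel's unwinder, and how a real unwinder would
detect the IRQ stack-switch signature is not shown here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 4 KiB pages. Not taken from any kernel header. */
#define SKETCH_PAGE_SIZE ((uintptr_t)4096)

/* True when both addresses fall inside the same page. */
bool same_stack_page(uintptr_t cur_sp, uintptr_t next_sp)
{
	uintptr_t mask = ~(SKETCH_PAGE_SIZE - 1);

	return (cur_sp & mask) == (next_sp & mask);
}

/*
 * Hypothetical policy check: accept the next stack slot only if it is
 * on the same page as the current one, or if the caller has already
 * recognised an IRQ stack switch at this point. Anything else is
 * treated as bogus unwind information.
 */
bool next_frame_plausible(uintptr_t cur_sp, uintptr_t next_sp,
			  bool irq_stack_switch)
{
	if (same_stack_page(cur_sp, next_sp))
		return true;

	return irq_stack_switch;
}
```

An unwinder built this way would stop (or fall back to a dumber scan of the
stack) as soon as next_frame_plausible() returns false, rather than
following a pointer it has no reason to trust.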

That's not the kind of code we write in the kernel. In the kernel, we
write code that _works_, regardless of the kind of horrible stuff people
feed it. That's _doubly_ true for something like a stack frame debugger,
which is invoked when there is trouble, and for all we know the stack

In short, I think the stack unwinder is just _broken_. It has made all the
wrong policy decisions - it only works when everything is perfect, yet
it's actually meant to be _used_ when something bad happened. Doesn't that
strike anybody else as a totally flawed design?

It damn well should.

