Re: BUG-lockdep and freeze (was: Arrr! Linux 2.6.18)

From: Linus Torvalds
Date: Sat Sep 30 2006 - 17:57:25 EST




On Sat, 30 Sep 2006, Andi Kleen wrote:
>
> Anyways, I guess we need even more validation in the fallback code,
> but just terminating the kernel thread stacks should fix that particular case.

Why not just add the simple validation?

A kernel stack is one page in size. If you move to another page, you
terminate. It's that simple.

What if the kernel stack is corrupt? Buffer overruns do that.

This patch seems to just paper over the _real_ problem, namely the fact
that the stack tracer code doesn't actually validate any of its arguments.

> When show_trace faults it is actually not the unwinder itself
> (which would be unwind()/processCFI() etc.), but fallback code
> which is actually just the old unwinder. So most of your accusations
> are hitting the wrong piece of code, Linus.

Ehh, that's not even _true_, Andi.

The old unwinder (well, at least for x86, and I assume x86-64 used that as
the beginning point) didn't have this problem at all, exactly because it
couldn't get on the wrong stack-page in the first place.

The old code literally had:

static inline int valid_stack_ptr(struct thread_info *tinfo, void *p)
{
return p > (void *)tinfo &&
p < (void *)tinfo + THREAD_SIZE - 3;
}

and would refuse to touch anything that wasn't in the stack page. It was
simple, AND WE NEVER _EVER_ HAD A BUG RELATED TO IT, AFAIK.

In contrast, the new code doesn't do any sanity checking at all, and this
is not the first or even the second time it causes problems.

So be honest here, don't try to shift the blame.

Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/