Re: Bisections with different bug manifestations

From: Dmitry Vyukov
Date: Fri Apr 16 2021 - 09:26:37 EST


On Fri, Apr 16, 2021 at 3:13 PM Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
>
> On Fri, 16 Apr 2021 09:51:45 +0200
> Dmitry Vyukov <dvyukov@xxxxxxxxxx> wrote:
> >
> > If you look at substantial base of bisection logs, you will find lots
> > of cases where bug types, functions don't match. Kernel crashes
> > differently even on the same revision. And obviously things change if
> > you change revisions. Also if you see presumably a different bug, what
> > does it say regarding the original bug.
>
> Yes, but there are also several types of cases where the issue will be the
> same. Namely lockdep. I agree that use after free warnings can have a side
> effect, and may be more difficult.

But how do we know it's lockdep, rather than a use-after-free
manifested as lockdep?
A Significant portion of kernel bugs are caused by concurrency and can
manifest in different ways, e.g. these are not lockdep, or WARN, or
use-after-free, but rather a race in nature.

> But there's many other bugs that remain
> consistent across kernels. And if you stumble on one of them, look for it
> only.

For example? Does not look to be true for WARN, BUG, KASAN,
"inconsistent lock state".


> And if you hit another bug, and if it doesn't crash, then ignore it (of
> course this could be an issue if you have panic on warning set). But
> otherwise, just skip it.

It's not possible to skip, say, BUG.
And if we skip, say, a use-after-free, how do we know we are not
making things worse? Because now we are running on corrupted memory,
so anything can happen. Definitely a stray lockdep report can happen,
or other way around not happen when it should...

> > I would very much like to improve automatic bisection quality, but it
> > does not look trivial at all.
> >
> > Some random examples where, say, your hypothesis of WARN-to-WARN,
> > BUG-to-BUG does not hold even on the same kernel revision (add to this
>
> At least lockdep to lockdep, as when I do manual bisects, that's exactly
> what I look for, and ignore all other warnings. And that has found the
> problem commit pretty much every time.

What lockdep bug types do you mean? All?
In the examples above you can see at least "inconsistent lock state"
mixed with 2 other completely different bug types.

> > different revisions and the fact that a different bug does not give
> > info regarding the original bug):
> >
>
> Can you tell me that all these examples bisected to the commit that caused
> the bug? Because if it did not, then you may have just proved my point ;-)

I don't know now what was the result, but for a single run these were
manifestations of the same root bug.
E.g. see below, that's UAF in fuse_dev_do_read vs WARNING in
request_end. request_end is also fuse. And you can see that a memory
corruption causing a random bug type, in this case WARNING, but can as
well be LOCKDEP.


> > run #0: crashed: KASAN: use-after-free Read in fuse_dev_do_read
> > run #1: crashed: WARNING in request_end
> > run #2: crashed: KASAN: use-after-free Read in fuse_dev_do_read
> > run #3: OK
> > run #4: OK