Re: BUG_ON() in workingset_node_shadows_dec() triggers

From: Kees Cook
Date: Fri Oct 07 2016 - 13:17:00 EST


On Thu, Oct 6, 2016 at 10:48 PM, Willy Tarreau <w@xxxxxx> wrote:
> On Thu, Oct 06, 2016 at 04:59:20PM -0700, Linus Torvalds wrote:
>> We should just switch BUG() over and be done with it. The whole point
>> it that since it should never trigger in the first place, the
>> semantics on BUG() should never matter.
>>
>> And if you have some code that depends on the semantics of BUG(), that
>> code is buggy crap *by*definition*.
>
> I totally agree with this. If a developer writes BUG() somewhere, it
> means he doesn't see how it is possible to end up in this situation.
> Thus we cannot hope that the BUG() call is doing anything right to
> fix what the code author didn't expect to happen. It just means
> "try to limit the risks but I don't really know which ones".
>
> Also we won't make things worse. Where people currently have an oops,
> they'll get one or more warnings. The side effects (lockups, panic,
> etc) will more or less be the same, but many of us already don't want
> to continue after an oops and despite this our systems work fine, so
> I don't see why anyone would suffer from such a change. However some
> developers may get more details about issues than what they could get
> in the past.

Fair enough. I'll put something together for at least my use-cases and
see how ugly it gets in testing. :) I actually started on something
like this for the CONFIG_DEBUG_LIST, which had to deal with the logic
of "continue after WARN or abort after BUG" etc...

Regardless, I still think that we can't let BUG continue kernel
execution though, since it may lead to entirely unexpected behavior
(possibly security-sensitive) by still running. Upgrading BUG to
panic(), though, I'd be fine with, as a way to get people to convert
to WARN.

-Kees

--
Kees Cook
Nexus Security