Re: [PATCH] Introduce the pkill_on_warn boot parameter

From: Kees Cook
Date: Thu Sep 30 2021 - 14:28:28 EST

Next message: Will McVicker: "Re: [PATCH v2 02/12] timekeeping: add API for getting timekeeping_suspended"
Previous message: Alexander Popov: "Re: [PATCH] Introduce the pkill_on_warn boot parameter"
In reply to: Steven Rostedt: "Re: [PATCH] Introduce the pkill_on_warn boot parameter"
Next in thread: Andrew Morton: "Re: [PATCH] Introduce the pkill_on_warn boot parameter"

On Thu, Sep 30, 2021 at 11:15:41AM +0200, Petr Mladek wrote:
> On Wed 2021-09-29 12:49:24, Paul E. McKenney wrote:
> > On Wed, Sep 29, 2021 at 10:01:33PM +0300, Alexander Popov wrote:
> > > On 29.09.2021 21:58, Alexander Popov wrote:
> > > > Currently, the Linux kernel provides two types of reaction to kernel
> > > > warnings:
> > > > 1. Do nothing (by default),
> > > > 2. Call panic() if panic_on_warn is set. That's a very strong reaction,
> > > > so panic_on_warn is usually disabled on production systems.
>
> Honestly, I am not sure if panic_on_warn() or the new pkill_on_warn()
> work as expected. I wonder who uses it in practice and what is
> the experience.

panic_on_warn gets used by folks with paranoid security concerns.

> The problem is that many developers do not know about this behavior.
> They use WARN() when they are too lazy to write a more useful message or when
> they want to see all the provided details: task, registers, backtrace.

The documentation[1] on this hopefully clarifies the situation:

    Note that the WARN()-family should only be used for “expected to be
    unreachable” situations. If you want to warn about “reachable but
    undesirable” situations, please use the pr_warn()-family of functions.
    System owners may have set the panic_on_warn sysctl, to make sure their
    systems do not continue running in the face of “unreachable” conditions.

[1] https://www.kernel.org/doc/html/latest/process/deprecated.html#bug-and-bug-on

> Also it is inconsistent with pr_warn() behavior. Why would a single-line
> warning be innocent while a full-info WARN() causes panic/pkill?

Because pr_warn() is intended for system admins. WARN() is for
developers and should not be reachable through any known path.

> What about pr_err(), pr_crit(), pr_alert(), pr_emerg()? They inform
> about even more serious problems. Why a warning should cause panic/pkill
> while an alert message is just printed?

Additionally, pr_*() don't include stack traces, etc. WARN() is for
situations that should never happen. pr_warn() is about undesirable but
reachable states.

For example:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d4689846881d160a4d12a514e991a740bcb5d65a
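
To make the distinction above concrete, here is a minimal userspace sketch.
Everything in it is a hypothetical stand-in made up for illustration
(model_pr_warn, MODEL_WARN_ON, model_panic_on_warn, model_panic_count,
set_length, enter_state); none of it is the kernel's actual implementation,
and a real panic would halt the system rather than bump a counter.

```c
/* Userspace model only: these are NOT the kernel's definitions. */
#include <stdbool.h>
#include <stdio.h>

static bool model_panic_on_warn; /* stands in for the panic_on_warn sysctl */
static int model_panic_count;    /* a real kernel would halt, not count */

/* Models pr_warn(): a one-line message, no backtrace, never escalates. */
#define model_pr_warn(...) fprintf(stderr, "warning: " __VA_ARGS__)

/*
 * Models WARN_ON(): reports where it fired and escalates when the flag
 * is set. Returns the condition so callers can use it inside an if ().
 */
static bool model_warn_on(bool cond, const char *file, int line)
{
    if (!cond)
        return false;
    fprintf(stderr, "WARNING: at %s:%d\n", file, line);
    if (model_panic_on_warn)
        model_panic_count++;
    return true;
}
#define MODEL_WARN_ON(cond) model_warn_on((cond), __FILE__, __LINE__)

enum state { STATE_IDLE, STATE_BUSY, STATE_COUNT };

/* Reachable but undesirable: a caller can simply pass a bad length. */
static int set_length(int len)
{
    if (len < 0) {
        model_pr_warn("negative length %d rejected\n", len);
        return -1;
    }
    return 0;
}

/* Expected to be unreachable: every caller should pass a valid state. */
static int enter_state(enum state s)
{
    if (MODEL_WARN_ON(s >= STATE_COUNT))
        return -1;
    return 0;
}
```

In this model, set_length(-1) only logs, no matter how model_panic_on_warn
is set, while enter_state() with an out-of-range value escalates once the
flag is enabled. That is the split the documentation asks for: bad input
gets a pr_warn()-style message, a broken invariant gets a WARN().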

> It somehow reminds me of the saga with %pK. We were not able to teach
> developers to use it correctly for years and ended with hashed
> pointers.

And this was pointed out when %pK was introduced, but Linus couldn't be
convinced. He changed his mind, thankfully.

--
Kees Cook
