Re: [PATCH] exit: Put an upper limit on how often we can oops

From: Jann Horn
Date: Mon Nov 07 2022 - 16:49:03 EST


On Mon, Nov 7, 2022 at 10:15 PM Solar Designer <solar@xxxxxxxxxxxx> wrote:
> On Mon, Nov 07, 2022 at 09:13:17PM +0100, Jann Horn wrote:
> > +oops_limit
> > +==========
> > +
> > +Number of kernel oopses after which the kernel should panic when
> > +``panic_on_oops`` is not set.
>
> Rather than introduce this separate oops_limit, how about making
> panic_on_oops (and maybe all panic_on_*) take the limit value(s) instead
> of being Boolean? I think this would preserve the current behavior at
> panic_on_oops = 0 and panic_on_oops = 1, but would introduce your
> desired behavior at panic_on_oops = 10000. We can make 10000 the new
> default. If a distro overrides panic_on_oops, it probably sets it to 1
> like RHEL does.
>
> Are there distros explicitly setting panic_on_oops to 0? If so, that
> could be a reason to introduce the separate oops_limit.
>
> I'm not advocating one way or the other - I just felt this should be
> explicitly mentioned and decided on.

I think at least internally in the kernel, it probably works better to
keep those two concepts separate? For example, sparc has a function
die_nmi() that uses panic_on_oops to determine whether the system
should panic when a watchdog detects a lockup.
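
To illustrate the concern, here is a rough userspace model (not the patch's code, and not actual kernel source). Only panic_on_oops and oops_limit come from this thread; oops_count, oops_should_panic() and lockup_should_panic() are hypothetical names, and the lockup check is assumed to be a plain truthiness test on panic_on_oops in the style of die_nmi():

```c
#include <stdbool.h>

/* Userspace model only: these globals stand in for the sysctls. */
static int panic_on_oops;                /* existing Boolean knob */
static unsigned int oops_limit = 10000;  /* proposed separate limit */
static unsigned int oops_count;          /* hypothetical oops counter */

/*
 * Hypothetical helper: decide whether the current oops should
 * escalate to a panic. panic_on_oops keeps its Boolean meaning;
 * the limit only matters when panic_on_oops is not set.
 */
static bool oops_should_panic(void)
{
	if (panic_on_oops)
		return true;

	oops_count++;
	/* Assumption in this sketch: a limit of 0 disables the check. */
	return oops_limit != 0 && oops_count >= oops_limit;
}

/*
 * Hypothetical stand-in for a die_nmi()-style caller that treats
 * panic_on_oops as a Boolean when a watchdog detects a lockup.
 */
static bool lockup_should_panic(void)
{
	return panic_on_oops != 0;
}
```

With separate knobs, panic_on_oops = 0 and oops_limit = 10000 leaves lockup_should_panic() false. If panic_on_oops itself became the limit and defaulted to 10000, a truthiness check like the one modeled here would start returning true by default, so every such caller would need auditing.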