Re: [PATCH] panic: Add sysctl/cmdline to dump all CPUs backtraces on oops event

From: Randy Dunlap
Date: Tue Mar 10 2020 - 16:59:23 EST


Hi-

On 3/10/20 9:37 AM, Guilherme G. Piccoli wrote:
>
> Signed-off-by: Guilherme G. Piccoli <gpiccoli@xxxxxxxxxxxxx>
> ---
>
> As a P.S. note, my choice to put the backtrace dump in the end of
> oops_enter() was from previous experience (in which I used this
> approach in a kprobes to collect more data on oops), but I'd
> gladly accept suggestion in case there's a better place to dump
> this. Thanks in advance for the reviews!
> Cheers,
>
> Guilherme
>
>
> .../admin-guide/kernel-parameters.txt | 8 +++++++
> Documentation/admin-guide/sysctl/kernel.rst | 15 +++++++++++++
> include/linux/kernel.h | 6 ++++++
> kernel/panic.c | 21 +++++++++++++++++++
> kernel/sysctl.c | 11 ++++++++++
> 5 files changed, 61 insertions(+)
>
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index 4c6595b5f6c8..888b1fab3f6e 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -3333,6 +3333,14 @@
> This will also cause panics on machine check exceptions.
> Useful together with panic=30 to trigger a reboot.
>
> + oops_all_cpu_backtrace=
> + [KNL] Should kernel generates backtraces on all cpus

generate backtraces on all CPUs

> + when oops occurs - this should be a last measure resort
> + in case a kdump cannot be collected, for example.
> + Defaults to 0 and can be controlled by the sysctl
> + kernel.oops_all_cpu_backtrace.
> + Format: <integer>
> +
> page_alloc.shuffle=
> [KNL] Boolean flag to control whether the page allocator
> should randomize its free lists. The randomization may
> diff --git a/Documentation/admin-guide/sysctl/kernel.rst b/Documentation/admin-guide/sysctl/kernel.rst
> index 218c717c1354..460112c3f656 100644
> --- a/Documentation/admin-guide/sysctl/kernel.rst
> +++ b/Documentation/admin-guide/sysctl/kernel.rst
> @@ -573,6 +574,20 @@ numa_balancing_scan_size_mb is how many megabytes worth of pages are
> scanned for a given scan.
>
>
> +oops_all_cpu_backtrace:
> +================
> +
> +Determines if kernel should NMI all CPUs to dump their backtraces when

I would much prefer that to be written without using NMI as a verb.

> +an oops event occurs. It should be used as a last resort in case a panic
> +cannot be triggered (to protect VMs running, for example) or kdump can't
> +be collected. This file shows up if CONFIG_SMP is enabled.
> +
> +0: Won't show all CPUs backtraces when an oops is detected.
> +This is the default behavior.
> +
> +1: Will NMI all CPUs and dump their backtraces when an oops is detected.

Same here.

> +
> +
> osrelease, ostype & version:
> ============================
>
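
For anyone who wants to check the knob from userspace once this is applied, here is a minimal sketch. It assumes the file appears at /proc/sys/kernel/oops_all_cpu_backtrace (inferred from the sysctl name kernel.oops_all_cpu_backtrace above) and that it holds only the two documented values, 0 and 1. The function name is hypothetical and not part of the patch.

```python
from pathlib import Path
from typing import Optional

# Assumed location, derived from the sysctl name kernel.oops_all_cpu_backtrace.
SYSCTL_PATH = Path("/proc/sys/kernel/oops_all_cpu_backtrace")


def oops_all_cpu_backtrace_enabled(path: Path = SYSCTL_PATH) -> Optional[bool]:
    """Return True for 1, False for 0, or None when the file does not exist.

    The file can be absent on kernels without this patch or, per the
    documentation text above, when CONFIG_SMP is not enabled.
    """
    try:
        raw = path.read_text().strip()
    except FileNotFoundError:
        return None
    if raw not in ("0", "1"):
        # Only 0 and 1 are documented; anything else is treated as an error.
        raise ValueError(f"unexpected value in {path}: {raw!r}")
    return raw == "1"


if __name__ == "__main__":
    print(oops_all_cpu_backtrace_enabled())
```

Returning None rather than False for a missing file keeps "feature not available" distinct from "feature disabled", which matters when deciding whether to fall back to kdump or another collection method.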



Thanks.
--
~Randy