Re: [PATCH] kdump: add default crashkernel reserve kernel config options

From: Dave Young
Date: Mon May 21 2018 - 20:49:37 EST


On 05/21/18 at 12:02pm, Andrew Morton wrote:
> On Mon, 21 May 2018 10:53:37 +0800 Dave Young <dyoung@xxxxxxxxxx> wrote:
>
> > This is a rework of the crashkernel=auto patches dating back to 2009,
> > although I'm not sure if the one below is the last version of the old effort:
> > https://lkml.org/lkml/2009/8/12/61
> > https://lwn.net/Articles/345344/
> >
> > I changed the original design: instead of adding the auto reserve logic
> > in code, this patch just introduces two kernel config options, one for
> > the default crashkernel value in MB and one for the threshold of system
> > memory in MB, so that the default is only reserved when system memory is
> > equal to or above the threshold.
> >
> > With these kernel configs, distributions can easily change the default
> > values so that people do not need to manually set the kernel cmdline
> > for common use cases. One can still override the default value
> > with a manual setup, or disable it by using crashkernel=0.
> >
> > Signed-off-by: Dave Young <dyoung@xxxxxxxxxx>
> > ---
> > Another difference: in the original design the crashkernel size scales
> > with system memory. According to testing, a large machine may need more
> > memory in the kdump kernel because of several factors:
> > 1. The number of CPUs, because of the percpu memory allocated for them.
> > (kdump can use nr_cpus=1 to work around this, but some
> > arches do not support nr_cpus=X, for example powerpc.)
> > 2. IO devices. A large system can have a lot of IO devices. Although we
> > can try to add only the device drivers we need, it is still a
> > problem because of some built-in drivers and some stacked logical
> > devices, e.g. device mapper devices, ACPI etc. Even considering only the
> > metadata for the driver model, e.g. sysfs files, it is still a big
> > number.
> > 3. The minimum memory requirement of some device drivers is big, even
> > if some of them have implemented a low memory profile. It is usual to
> > see 10M of memory used by a storage driver.
> > 4. The user space initramfs keeps growing. Busybox is not usable if we
> > need to add udev support and some complicated storage support. Using
> > dracut with systemd, especially the networking stuff, needs more memory.
> >
> > So it is probably also good to have another kernel config option to
> > scale the memory size, e.g. CRASHKERNEL_DEFAULT_SCALE_RATIO. In RHEL we
> > use base_value + system_mem >> (2^14) for x86. I'm still hesitating over
> > how to describe and add this option. Any suggestions will be appreciated.
> >
> > ...
> >
> > --- linux-x86.orig/arch/Kconfig
> > +++ linux-x86/arch/Kconfig
> > @@ -10,6 +10,22 @@ config KEXEC_CORE
> > select CRASH_CORE
> > bool
> >
> > +config CRASHKERNEL_DEFAULT_THRESHOLD_MB
> > + int "System memory size threshold for kdump memory default reserving"
> > + depends on CRASH_CORE
> > + default 0
> > + help
> > + CRASHKERNEL_DEFAULT_MB is used as default crashkernel value if
> > + the system memory size is equal or bigger than the threshold.
>
> "the threshold" is rather vague. Can it be clarified?
>
> In fact I'm really struggling to understand the logic here....
>
>
> > +config CRASHKERNEL_DEFAULT_MB
> > + int "Default crashkernel memory size reserved for kdump"
> > + depends on CRASH_CORE
> > + default 0
> > + help
> > + This is used as the default kdump reserved memory size in MB.
> > + crashkernel=X kernel cmdline can overwrite this value.
> > +
> > config HAVE_IMA_KEXEC
> > bool
> >
> > @@ -143,6 +144,24 @@ static int __init parse_crashkernel_simp
> > return 0;
> > }
> >
> > +static int __init get_crashkernel_default(unsigned long long system_ram,
> > + unsigned long long *size)
> > +{
> > + unsigned long long sz = CONFIG_CRASHKERNEL_DEFAULT_MB;
> > + unsigned long long thres = CONFIG_CRASHKERNEL_DEFAULT_THRESHOLD_MB;
> > +
> > + thres *= SZ_1M;
> > + sz *= SZ_1M;
> > +
> > + if (sz >= system_ram || system_ram < thres) {
> > + pr_debug("crashkernel default size can not be used.\n");
> > + return -EINVAL;
>
> In other words,
>
> if (system_ram <= CONFIG_CRASHKERNEL_DEFAULT_MB ||
> system_ram < CONFIG_CRASHKERNEL_DEFAULT_THRESHOLD_MB)
> fail;
>
> yes?

Almost. Both config values are in MB and are converted to bytes before
being compared with system_ram.

The first comparison is a sanity check on the default reserved size. If
it is equal to or bigger than the system RAM size, it cannot be used:
if ( CONFIG_CRASHKERNEL_DEFAULT_MB >= system_ram )
    fail;

The second comparison implements the threshold, and it is by design:
if ( system_ram >= CONFIG_CRASHKERNEL_DEFAULT_THRESHOLD_MB ) then
    go ahead and use the default value of CONFIG_CRASHKERNEL_DEFAULT_MB
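
To make the two checks concrete, here is a minimal userspace sketch, not the
patch itself. The function name crashkernel_default_model and its parameters
are made up for illustration and stand in for the two Kconfig values. The
success path (store the size, return 0) is assumed here, since the hunk
quoted above stops before it.

```c
#include <errno.h>

#define SZ_1M (1024ULL * 1024ULL)

/*
 * Hypothetical userspace model of the checks in get_crashkernel_default().
 * system_ram is in bytes; default_mb and threshold_mb are in MB and play
 * the role of CONFIG_CRASHKERNEL_DEFAULT_MB and
 * CONFIG_CRASHKERNEL_DEFAULT_THRESHOLD_MB.
 */
static int crashkernel_default_model(unsigned long long system_ram,
                                     unsigned long long default_mb,
                                     unsigned long long threshold_mb,
                                     unsigned long long *size)
{
    /* Convert both config values from MB to bytes before comparing. */
    unsigned long long sz = default_mb * SZ_1M;
    unsigned long long thres = threshold_mb * SZ_1M;

    /* Sanity check: the reservation must be smaller than system RAM. */
    if (sz >= system_ram)
        return -EINVAL;

    /* Threshold check: only reserve on machines with enough memory. */
    if (system_ram < thres)
        return -EINVAL;

    /* Assumed success path: hand back the default size in bytes. */
    *size = sz;
    return 0;
}
```

For example, with a 256 MB default and a 2048 MB threshold, a 4 GB machine
gets the 256 MB default, while a 1 GB machine falls below the threshold and
gets no default reservation.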

>
> How come? What's happening here? Perhaps a (good) explanatory comment
> is needed. And clearer Kconfig text.
>
> All confused :(

Hmm, *scratches head*. I will think about how to describe it better. If you
have any suggestions, just let me know :)

Thanks
Dave