Re: [RFC v2 0/2] Early use of boot service memory

From: HATAYAMA Daisuke
Date: Thu Nov 21 2013 - 22:53:35 EST


(2013/11/22 11:29), Vivek Goyal wrote:
[..]
makedumpfile going to cyclic buffer has helped out greatly, but on
our new systems we're still looking at 512 MB crash kernels.

I tried a 6 TiB system with 16 PCIe cards; kdump on RHEL 6.5 beta still does
not work. I still get OOM.

What crashkernel= option are you using?

Interesting. So something is consuming a lot of memory. How about setting
"debug_mem_level 1" in /etc/kdump.conf, then regenerate the initrd and retry.
This time it should output some memory usage info at various points
during boot and that can give us some idea who is consuming how much
memory.

If some modules are consuming a lot of memory, then you can try the
"blacklist" option in /etc/kdump.conf to disable those.
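
As a rough sketch, the two suggestions above could be combined in
/etc/kdump.conf as follows. The module names are hypothetical placeholders,
not modules known to be at fault here; substitute whatever the memory usage
output points at.

```
# /etc/kdump.conf fragment (sketch)
# Print memory usage information at several points during 2nd kernel boot.
debug_mem_level 1
# Keep these modules out of the kdump initrd.
# mod_a and mod_b are hypothetical names used for illustration only.
blacklist mod_a mod_b
```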

If it is not modules, then that will concern me, because then either the
kernel is consuming too much memory (which it should not) or for
some reason makedumpfile cyclic mode did not work properly for you.

While you are re-testing, how about also increasing the debug message
level of makedumpfile? The makedumpfile developers should be able to
have a look at that. In /etc/kdump.conf, specify:

core_collector makedumpfile -c --message-level 31 -d 31

If message level 31 turns out to be too verbose, reduce it as per
makedumpfile man page.
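
A small sketch of how a lower level could be derived, assuming the bit
values listed in the makedumpfile man page (1 progress indicator, 2 common
message, 4 error message, 8 debug message, 16 report message). Please check
the man page of the installed version before relying on these values.

```shell
# Message level is a bitmask; 31 enables every message type.
# Here we keep everything except the progress indicator (bit 1).
level=$((2 | 4 | 8 | 16))
# Print the resulting kdump.conf line for copy and paste.
echo "core_collector makedumpfile -c --message-level $level -d 31"
```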


The following configuration is more flexible:

core_collector false
default shell

Then crash dump collection fails and an emergency shell shows up,
where you can type a variety of commands.
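
For example, once in the emergency shell, something like the following could
be typed to see how much memory is left. This is only a sketch: it assumes
the shell is POSIX-compatible, and it sticks to builtins because tools such
as awk or grep may be missing from the kdump initrd. The function name
mem_summary is made up for illustration.

```shell
# Print selected /proc/meminfo fields in MiB using only shell builtins.
# Usage: mem_summary [file]   (defaults to /proc/meminfo)
mem_summary() {
    while read -r key value unit; do
        case "$key" in
            MemTotal:|MemFree:|Slab:)
                # Values in /proc/meminfo are reported in kB.
                echo "${key%:} $((value / 1024)) MiB"
                ;;
        esac
    done < "${1:-/proc/meminfo}"
}
```

Running mem_summary with no argument reads /proc/meminfo of the 2nd kernel;
cat /proc/cmdline and cat /proc/modules are also worth a look there.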

If the 2nd kernel keeps failing even with this configuration, it's likely
that the kernel side already causes the OOM you're facing, before the
command specified by the core_collector directive is ever invoked.

--
Thanks.
HATAYAMA, Daisuke
