Re: [RFC Patch 5/6] slimdump: Capture slimdump for fatal MCE generated crashes

From: Vivek Goyal
Date: Thu May 26 2011 - 14:27:15 EST


On Thu, May 26, 2011 at 08:09:31PM +0200, Andi Kleen wrote:
> On Thu, May 26, 2011 at 01:44:47PM -0400, Vivek Goyal wrote:
> > On Thu, May 26, 2011 at 10:53:05PM +0530, K.Prasad wrote:
> > >
> > > slimdump: Capture slimdump for fatal MCE generated crashes
> > >
> > > System crashes resulting from fatal hardware errors (such as MCE) don't need
> > > all the contents from crashing-kernel's memory. Generate a new 'slimdump' that
> > > retains only essential information while discarding the old memory.
> > >
> >
> > Why enforce zeroing out the rest of the vmcore data in the kernel? Why not
> > leave it to user space?
>
> I think it's a good default to not do a full dump on MCE.
> It's very unlikely to be useful for anything, and will just waste
> reboot time (aka nines).

If we are just extracting and saving the MCE registers from vmcore, then
reboot time does not increase. It increases only if the user decides to
extract and save extra data from vmcore.

>
> That said including the dmesg too may be a good idea.

dmesg is already part of vmcore and user space tools can easily find
it.

I can easily imagine a default distro policy in user space where, in case
of an MCE crash, we just extract dmesg and the MCE registers (from the vmcore
notes section) and reboot. This will be fast and will reduce the amount of
code in the kernel.

IMHO, we should not introduce any additional notion of slimdump as such in
the kernel. A better approach would be to just read the MCE registers, export
them to user space through ELF notes, and then let user space automate the
rest of it.
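
To make the user-space side concrete, here is a rough sketch of walking the
records of an ELF note segment and picking out one note by name. It is only
an illustration under assumptions: the note name "MCE", its type value and
its payload are hypothetical (no such note is defined anywhere yet), the
data is assumed little-endian with 4-byte padding, and the sample buffer is
built by hand rather than read from a real /proc/vmcore.

```python
import struct

# ELF note header: n_namesz, n_descsz, n_type (three 32-bit words).
# Little-endian is assumed here.
NHDR = struct.Struct("<III")


def align4(n):
    """Round n up to the next multiple of 4 (note padding)."""
    return (n + 3) & ~3


def iter_notes(buf):
    """Yield (name, type, desc) for each note record in buf."""
    off = 0
    while off + NHDR.size <= len(buf):
        namesz, descsz, ntype = NHDR.unpack_from(buf, off)
        off += NHDR.size
        name = buf[off:off + namesz].rstrip(b"\0").decode("ascii", "replace")
        off += align4(namesz)
        desc = buf[off:off + descsz]
        off += align4(descsz)
        yield name, ntype, desc


def find_notes(buf, wanted_name):
    """Return [(type, desc), ...] for notes whose name matches."""
    return [(t, d) for n, t, d in iter_notes(buf) if n == wanted_name]


def make_note(name, ntype, desc):
    """Build one note record; used here only to create sample data."""
    raw = name.encode("ascii") + b"\0"
    return (NHDR.pack(len(raw), len(desc), ntype)
            + raw.ljust(align4(len(raw)), b"\0")
            + desc.ljust(align4(len(desc)), b"\0"))


# Hand-made sample segment: a "CORE" note followed by a hypothetical
# "MCE" note carrying one arbitrary 64-bit value.
sample = (make_note("CORE", 1, b"\0" * 8)
          + make_note("MCE", 1, struct.pack("<Q", 0x1234)))

for ntype, desc in find_notes(sample, "MCE"):
    print("MCE note type", ntype, "payload", desc.hex())
```

A real tool would first locate the PT_NOTE program header in the vmcore and
read the segment from there; that part is left out here. The point is only
that once the kernel exports the registers as a note, the filtering policy
fits in a few lines of user-space code.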

Thanks
Vivek