Re: [PATCH v1 0/2] mm/kdump: exclude reserved pages in dumps

From: Michal Hocko
Date: Tue Jul 24 2018 - 04:54:10 EST


On Tue 24-07-18 10:46:20, David Hildenbrand wrote:
> On 24.07.2018 09:25, Michal Hocko wrote:
> > On Mon 23-07-18 19:20:43, David Hildenbrand wrote:
> >> On 23.07.2018 14:30, Michal Hocko wrote:
> >>> On Mon 23-07-18 13:45:18, Vlastimil Babka wrote:
> >>>> On 07/20/2018 02:34 PM, David Hildenbrand wrote:
> >>>>> Dumping tools (like makedumpfile) right now don't exclude reserved pages.
> >>>>> So reserved pages might be accessed by dump tools although nobody except
> >>>>> the owner should touch them.
> >>>>
> >>>> Are you sure about that? Or maybe I understand wrong. Maybe it changed
> >>>> recently, but IIRC pages that are backing memmap (struct pages) are also
> >>>> PG_reserved. And you definitely do want those in the dump.
> >>>
> >>> You are right. reserve_bootmem_region will make all early bootmem
> >>> allocations (including those backing memmaps) PageReserved. I have asked
> >>> several times but I haven't seen a satisfactory answer yet. Why do we
> >>> even care for kdump about those? If they are reserved then nobody should
> >>> really look at those specific struct pages and manipulate them. Kdump
> >>> tools are using a kernel interface to read the content. If the specific
> >>> content is backed by a non-existing memory then they should simply not
> >>> return anything.
> >>>
> >>
> >> "new kernel" provides an interface to read memory from "old kernel".
> >>
> >> The new kernel has no idea about
> >> - which memory was added/online in the old kernel
> >> - where struct pages of the old kernel are and what their content is
> >> - which memory is safe to touch and which not
> >>
> >> Dump tools figure all that out by interpreting the VMCORE. They e.g.
> >> identify "struct pages" and see if they should be dumped. The "new
> >> kernel" only allows reading that memory. It cannot prevent the system
> >> from crashing (e.g. if a dump tool tries to read a hwpoison page).
> >>
> >> So how should the "new kernel" know if a page can be touched or not?
> >
> > I am sorry I am not familiar with kdump much. But from what I remember
> > it reads from /proc/vmcore and implementation of this interface should
> > simply return EINVAL or the like when you try to dump inaccessible memory
> > range.
>
> I assume the main problem with this approach is that we would always
> have to fall back to reading old memory from vmcore page by page; e.g.
> makedumpfile will always try to read bigger chunks. I also assume this is
> the reason HWPOISON is handled in dump tools instead of in the kernel using
> the mechanism you describe.

Is falling back to page-by-page for some ranges a real problem? I mean
most pages will simply be there so you can go in larger chunks. Once
you get EINVAL, you just fall back to page-by-page for that particular
range.
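
Something along these lines, as a rough userspace sketch only. The names
read_fn, read_full, copy_range, fake_read and demo are made up for
illustration, as are the page and chunk sizes. A real tool would wrap
pread() on /proc/vmcore in the reader callback, and whether vmcore
actually returns EINVAL for inaccessible ranges is exactly the
assumption under discussion here. Skipped pages are zero-filled, which
is just one possible policy.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

#define PAGE_SZ  4096UL            /* assumed page size */
#define CHUNK_SZ (16 * PAGE_SZ)    /* assumed bulk read size */

/* Hypothetical reader; for /proc/vmcore this would wrap pread(2). */
typedef ssize_t (*read_fn)(void *ctx, void *buf, size_t len, off_t off);

/* Read exactly len bytes. Returns 0, or -1 with errno set. */
static int read_full(read_fn rd, void *ctx, unsigned char *buf,
                     size_t len, off_t off)
{
    while (len > 0) {
        ssize_t n = rd(ctx, buf, len, off);
        if (n < 0)
            return -1;
        if (n == 0) {              /* unexpected end of file */
            errno = EIO;
            return -1;
        }
        buf += n;
        off += n;
        len -= (size_t)n;
    }
    return 0;
}

/*
 * Copy [start, start + len) into dst, len being a multiple of PAGE_SZ.
 * Large chunks are tried first; only a chunk that fails with EINVAL is
 * retried page by page. Returns the number of pages skipped
 * (zero-filled), or -1 on any error other than EINVAL.
 */
static long copy_range(read_fn rd, void *ctx, unsigned char *dst,
                       off_t start, size_t len)
{
    long skipped = 0;

    for (size_t done = 0; done < len; ) {
        size_t chunk = len - done < CHUNK_SZ ? len - done : CHUNK_SZ;

        if (read_full(rd, ctx, dst + done, chunk, start + (off_t)done) == 0) {
            done += chunk;
            continue;
        }
        if (errno != EINVAL)
            return -1;

        /* Fall back to page-by-page for this chunk only. */
        for (size_t p = 0; p < chunk; p += PAGE_SZ) {
            unsigned char *page = dst + done + p;

            if (read_full(rd, ctx, page, PAGE_SZ,
                          start + (off_t)(done + p)) == 0)
                continue;
            if (errno != EINVAL)
                return -1;
            memset(page, 0, PAGE_SZ);
            skipped++;
        }
        done += chunk;
    }
    return skipped;
}

/* Mock reader: any read touching the page at bad_off fails with EINVAL. */
struct fake { off_t bad_off; };

static ssize_t fake_read(void *ctx, void *buf, size_t len, off_t off)
{
    const struct fake *f = ctx;

    if (off < f->bad_off + (off_t)PAGE_SZ && off + (off_t)len > f->bad_off) {
        errno = EINVAL;
        return -1;
    }
    memset(buf, 0xAB, len);
    return (ssize_t)len;
}

/* Read 40 pages with page 5 inaccessible; returns the skipped count. */
static long demo(void)
{
    static unsigned char buf[40 * PAGE_SZ];
    struct fake f = { .bad_off = (off_t)(5 * PAGE_SZ) };
    long skipped = copy_range(fake_read, &f, buf, 0, sizeof(buf));

    assert(buf[5 * PAGE_SZ] == 0);      /* bad page was zero-filled */
    assert(buf[6 * PAGE_SZ] == 0xAB);   /* neighbour was read normally */
    return skipped;
}
```

With the mock above only the first 16-page chunk hits the slow path; the
other two chunks go through in one read each.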
--
Michal Hocko
SUSE Labs