Re: [PATCH] mm/spase: never partially remove memmap for early section

From: Dan Williams
Date: Wed Jun 24 2020 - 18:21:14 EST


On Wed, Jun 24, 2020 at 3:06 PM Wei Yang
<richard.weiyang@xxxxxxxxxxxxxxxxx> wrote:
>
> On Wed, Jun 24, 2020 at 09:10:09AM -0700, Dan Williams wrote:
> >On Tue, Jun 23, 2020 at 11:14 PM Wei Yang
> ><richard.weiyang@xxxxxxxxxxxxxxxxx> wrote:
> >>
> >> On Tue, Jun 23, 2020 at 05:18:28PM +0200, Michal Hocko wrote:
> >> >On Tue 23-06-20 17:42:58, Wei Yang wrote:
> >> >> For early sections, we assumes its memmap will never be partially
> >> >> removed. But current behavior breaks this.
> >> >>
> >> >> Let's correct it.
> >> >>
> >> >> Fixes: ba72b4c8cf60 ("mm/sparsemem: support sub-section hotplug")
> >> >> Signed-off-by: Wei Yang <richard.weiyang@xxxxxxxxxxxxxxxxx>
> >> >
> >> >Can a user trigger this or is this a theoretical bug?
> >>
> >> Let me rewrite the changelog a little. Look forward any comments.
> >>
> >> For early sections, its memmap is handled specially even sub-section is
> >> enabled. The memmap could only be populated as a whole.
> >>
> >> Quoted from the comment of section_activate():
> >>
> >> * The early init code does not consider partially populated
> >> * initial sections, it simply assumes that memory will never be
> >> * referenced. If we hot-add memory into such a section then we
> >> * do not need to populate the memmap and can simply reuse what
> >> * is already there.
> >>
> >> While current section_deactivate() breaks this rule. When hot-remove a
> >> sub-section, section_deactivate() would depopulate its memmap. The
> >> consequence is if we hot-add this subsection again, its memmap never get
> >> proper populated.
> >
> >Ok, forgive the latency as re-fetched this logic into my mental cache.
> >So what I was remembering was the initial state of the code that
> >special cased early sections, and that still seems to be the case in
> >pfn_valid(). IIRC early_sections / bootmem are blocked from being
> >removed entirely. Partial / subsection removals are ok.
>
> Would you mind giving more words? Partial subsection removal is ok, so no need
> to fix this?

Early sections establish a memmap for the full section. There's
conceptually nothing wrong with unplugging the non-system-RAM portion
of the memmap, but it would need to be careful, at least on x86, to
map the partial section with PTEs instead of PMDs.

So, you are right that there is a mismatch here, but I think the
comprehensive fix is to allow early sections to be partially
depopulated/repopulated rather than have section_activate() and
section_deacticate() special case early sections. The special casing
is problematic in retrospect as section_deactivate() can't be
maintained without understand special rules in section_activate().