Re: [PATCH] mm/hotplug: Only respect mem= parameter during boot stage

From: Michal Hocko
Date: Tue Dec 10 2019 - 05:28:38 EST


On Tue 10-12-19 15:24:53, Baoquan He wrote:
> On 12/09/19 at 11:07am, Michal Hocko wrote:
> > On Fri 06-12-19 23:05:24, Baoquan He wrote:
> > > In commit 357b4da50a62 ("x86: respect memory size limiting via mem=
> > > parameter") a global varialbe global max_mem_size is added to store
> > > the value which is parsed from 'mem= '. This truly stops those
> > > DIMM from being added into system memory during boot.
> > >
> > > However, it also limits the later memory hotplug functionality. Any
> > > memory board can't be hot added any more if its region is beyond the
> > > max_mem_size. System will print error like below:
> > >
> > > [ 216.387164] acpi PNP0C80:02: add_memory failed
> > > [ 216.389301] acpi PNP0C80:02: acpi_memory_enable_device() error
> > > [ 216.392187] acpi PNP0C80:02: Enumeration failure
> > >
> > > >From document of 'mem =' parameter, it should be a restriction during
> > > boot, but not impact the system memory adding/removing after booting.
> > >
> > > mem=nn[KMG] [KNL,BOOT] Force usage of a specific amount of memory
> > >
> > > So fix it by also checking if it's during SYSTEM_BOOTING stage when
> > > restrict memory adding. Otherwise, skip the restriction.
> >
> > Could you be more specific about why the boot vs. later hotplug makes
> > any difference? The documentation is explicit about the boot time but
> > considering this seems to be like that since ever I strongly suspect
> > that this is just an omission.
>
> I think the 'mem=' updating in commit 357b4da50a62 will only affect
> those hotplugable memory regions. When I tested it, there are three
> memmory boards, one is the normal memory region with 4G memory, and the
> other two are hotpluggable memory boards and recorded in ACPI tables,
> each is 1GB. When put all them three onsite before boot, they will be
> recognized by firmware and written into e820 table and/or EFI table, then
> kernel can read them from e820 and them as system memory, we get 6G
> memory.
>
> However, if add 'mem=', like 'mme=3G', w/o commit 357b4da50a62, in e820,
> we will only get 3G memory. Later in acpi_init(), acpi scanning will
> search those two memory regions, and try to add them into system call
> because the two hotpluggable memory boards are power on and ready.
> Then we will get 3G + 1G + 1G, 5G memory. the 1st 3G is from the normal
> memory board, its last 1G is trimmed. Jurgen's patch is trying to fix this
> because the adding happens during boot time, and conflicts with 'mem='.

Unless I misunderstand what you are saying this all is just expected.
You have restricted the memory explicitly and the result is that not all
the memory is visible.

> But after system bootup, we should be able to hot add/remove any memory
> board. This should not be restricted by a boot-time kernel parameter
> 'mme='. This is what I am trying to fix.

This is a simple statement without any actual explanation on why. Why is
hotplug memory special? What is the usecase? Who would want to use mem
parameter and later expect a memory above the restrected area to be
hotplugable?

David has provided an actual usecase [1] but this needs to be documented
somewhere so that we do not break that accidentally in the future.
Ideally both in code which adds the boot restriction and the kernel
command line documentation to be explicit about BOOT restriction.

[1] http://lkml.kernel.org/r/429622cf-f0f4-5d80-d39d-b0d8a6c6605f@xxxxxxxxxx
--
Michal Hocko
SUSE Labs