Re: [PATCH] arm64: configurable sparsemem section size

From: Pavel Tatashin
Date: Wed Apr 24 2019 - 15:54:41 EST


<resending> from original email

On Wed, Apr 24, 2019 at 3:48 PM Pavel Tatashin
<patatash@xxxxxxxxxxxxxxxxxxx> wrote:
>
> On Wed, Apr 24, 2019 at 5:07 AM Anshuman Khandual
> <anshuman.khandual@xxxxxxx> wrote:
> >
> > On 04/24/2019 02:08 AM, Pavel Tatashin wrote:
> > > > sparsemem section size determines the maximum size and alignment of a
> > > > memory block that can be offlined/onlined. The bigger the size, the less
> > > > clutter in /sys/devices/system/memory/*. On the other hand, there is
> > > > less flexibility in what granules of memory can be added and
> > > > removed.
> >
> > > Is there any scenario where less than 1GB needs to be added on arm64?
>
> Yes, DAX hotplug loses 1G of memory without allowing smaller sections.
> Machines on which we are going to be using this functionality have 8G
> of System RAM, therefore losing 1G is a big problem.
>
> For details about the usage scenario, see this cover letter:
> https://lore.kernel.org/lkml/20190421014429.31206-1-pasha.tatashin@xxxxxxxxxx/
>
> >
> > >
> > > > Recently, Linux gained the ability to hot-add persistent memory, which
> > > > can be either a real NV device or a region reserved from regular
> > > > System RAM, and which has the identity of devdax.
> >
> > devdax (even ZONE_DEVICE) support has not been enabled on arm64 yet.
>
> Correct, I use your patches to enable ZONE_DEVICE, and thus devdax on ARM64:
> https://lore.kernel.org/lkml/1554265806-11501-1-git-send-email-anshuman.khandual@xxxxxxx/
>
> >
> > >
> > > > The problem is that because ARM64's section size is 1G, and devdax must
> > > > have a 2M label section, the first 1G is always lost when the device is
> > > > attached, because its start is not 1G aligned.
> >
> > devdax has to be 2M aligned ? Does Linux enforce that right now ?
>
> Unfortunately, there is no way around this. Part of the memory can be
> reserved as persistent memory via device tree.
> memory@40000000 {
>         device_type = "memory";
>         reg = < 0x00000000 0x40000000
>                 0x00000002 0x00000000 >;
> };
>
> pmem@1c0000000 {
>         compatible = "pmem-region";
>         reg = <0x00000001 0xc0000000
>                0x00000000 0x80000000>;
>         volatile;
>         numa-node-id = <0>;
> };
>
> So, while pmem is section aligned, as it should be, the dax device
> starts at the pmem start address + label size, which is 2M. The actual
> DAX device starts at:
> 0x1c0000000 + 2M.
>
> Because the section size is 1G, hotplug will be able to add only memory
> starting from
> 0x1c0000000 + 1G
>
> > 27 and 28 do not even compile for ARM64_64K_PAGES because of MAX_ORDER and
> > SECTION_SIZE mismatch.
>
> Can you please elaborate on which configs you are using? I have no
> problems compiling with 27 and 28 bits.
>
> Thank you,
> Pasha
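
The alignment arithmetic discussed above can be checked with a small sketch. It is only an illustration, not kernel code: the helper names (align_up, hotplug_loss) are hypothetical, and it assumes that hotplug can only start at the first section boundary at or above the DAX start address. The pmem base and 2M label size come from the device tree example in this thread; 30, 28 and 27 are the section size bits mentioned in the discussion (30 bits being the current 1G).

```python
# Illustrative sketch only: how much memory is skipped at the start of a
# DAX device for a given sparsemem section size. Helper names are made up.

SZ_1M = 1 << 20
SZ_2M = 2 << 20


def align_up(addr, size):
    """Round addr up to the next multiple of size (size is a power of two)."""
    return (addr + size - 1) & ~(size - 1)


def hotplug_loss(dax_start, section_bits):
    """Bytes between dax_start and the first section boundary at or above it."""
    section_size = 1 << section_bits
    return align_up(dax_start, section_size) - dax_start


PMEM_BASE = 0x1C0000000   # from the pmem@1c0000000 node in the example
LABEL_SIZE = SZ_2M        # devdax label area, per the discussion
dax_start = PMEM_BASE + LABEL_SIZE

for bits in (30, 28, 27):
    first = align_up(dax_start, 1 << bits)
    lost = hotplug_loss(dax_start, bits)
    print(f"section bits {bits}: first hotpluggable address {first:#x}, "
          f"skipped {lost // SZ_1M} MiB")
```

With these inputs the skipped range is 1022 MiB for 30-bit sections (1G including the 2M label), 254 MiB for 28 bits and 126 MiB for 27 bits. The sketch looks only at the start of the device and ignores any other constraints the kernel may impose.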