RE: [PATCH] arm64: mm: decrease the section size to reduce the memory reserved for the page map

From: Song Bao Hua (Barry Song)
Date: Mon Dec 07 2020 - 02:48:23 EST




> -----Original Message-----
> From: Anshuman Khandual [mailto:anshuman.khandual@xxxxxxx]
> Sent: Monday, December 7, 2020 8:31 PM
> To: Song Bao Hua (Barry Song) <song.bao.hua@xxxxxxxxxxxxx>; Mike Rapoport
> <rppt@xxxxxxxxxxxxx>; Will Deacon <will@xxxxxxxxxx>
> Cc: steve.capper@xxxxxxx; catalin.marinas@xxxxxxx;
> linux-kernel@xxxxxxxxxxxxxxx; nsaenzjulienne@xxxxxxx; liwei (CM)
> <liwei213@xxxxxxxxxx>; butao <butao@xxxxxxxxxxxxx>;
> linux-arm-kernel@xxxxxxxxxxxxxxxxxxx; fengbaopeng
> <fengbaopeng2@xxxxxxxxxxxxx>
> Subject: Re: [PATCH] arm64: mm: decrease the section size to reduce the memory
> reserved for the page map
>
>
>
> On 12/7/20 7:10 AM, Song Bao Hua (Barry Song) wrote:
> >
> >
> >> -----Original Message-----
> >> From: Mike Rapoport [mailto:rppt@xxxxxxxxxxxxx]
> >> Sent: Saturday, December 5, 2020 12:44 AM
> >> To: Will Deacon <will@xxxxxxxxxx>
> >> Cc: liwei (CM) <liwei213@xxxxxxxxxx>; catalin.marinas@xxxxxxx; fengbaopeng
> >> <fengbaopeng2@xxxxxxxxxxxxx>; nsaenzjulienne@xxxxxxx; steve.capper@xxxxxxx;
> >> Song Bao Hua (Barry Song) <song.bao.hua@xxxxxxxxxxxxx>;
> >> linux-arm-kernel@xxxxxxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; butao
> >> <butao@xxxxxxxxxxxxx>
> >> Subject: Re: [PATCH] arm64: mm: decrease the section size to reduce the memory
> >> reserved for the page map
> >>
> >> On Fri, Dec 04, 2020 at 11:13:47AM +0000, Will Deacon wrote:
> >>> On Fri, Dec 04, 2020 at 09:44:43AM +0800, Wei Li wrote:
> >>>> With memory holes, the sparse memory model with SPARSEMEM_VMEMMAP does
> >>>> not free the memory reserved for the page map. Decreasing the section
> >>>> size reduces this waste of reserved memory.
> >>>>
> >>>> Signed-off-by: Wei Li <liwei213@xxxxxxxxxx>
> >>>> Signed-off-by: Baopeng Feng <fengbaopeng2@xxxxxxxxxxxxx>
> >>>> Signed-off-by: Xia Qing <saberlily.xia@xxxxxxxxxxxxx>
> >>>> ---
> >>>> arch/arm64/include/asm/sparsemem.h | 2 +-
> >>>> 1 file changed, 1 insertion(+), 1 deletion(-)
> >>>>
> >>>> diff --git a/arch/arm64/include/asm/sparsemem.h b/arch/arm64/include/asm/sparsemem.h
> >>>> index 1f43fcc79738..8963bd3def28 100644
> >>>> --- a/arch/arm64/include/asm/sparsemem.h
> >>>> +++ b/arch/arm64/include/asm/sparsemem.h
> >>>> @@ -7,7 +7,7 @@
> >>>>
> >>>> #ifdef CONFIG_SPARSEMEM
> >>>> #define MAX_PHYSMEM_BITS CONFIG_ARM64_PA_BITS
> >>>> -#define SECTION_SIZE_BITS 30
> >>>> +#define SECTION_SIZE_BITS 27
> >>>
> >>> We chose '30' to avoid running out of bits in the page flags. What changed?
> >>
> >> I think that for 64-bit there are still plenty of free bits. I didn't
> >> check now, but when I played with SPARSEMEM on m68k there were 8 bits
> >> for section out of 32.
> >>
> >>> With this patch, I can trigger:
> >>>
> >>> ./include/linux/mmzone.h:1170:2: error: Allocator MAX_ORDER exceeds SECTION_SIZE
> >>> #error Allocator MAX_ORDER exceeds SECTION_SIZE
> >>>
> >>> if I bump up NR_CPUS and NODES_SHIFT.
> >>
> >> I don't think it's related to NR_CPUS and NODES_SHIFT.
> >> It rather seems to be 64K pages that cause this.
> >>
> >> Not that it shouldn't be addressed.
> >
> > Right now, only 4K PAGES will define ARM64_SWAPPER_USES_SECTION_MAPS.
> > Other cases will use vmemmap_populate_basepages().
> > The original patch should be only addressing the issue in 4K pages:
> > https://lore.kernel.org/lkml/20200812010655.96339-1-liwei213@xxxxxxxxxx/
> >
> > Could we do something like the below?
> > #ifdef CONFIG_ARM64_4K_PAGES
> > #define SECTION_SIZE_BITS 27
> > #else
> > #define SECTION_SIZE_BITS 30
> > #endif
>
> This is a bit arbitrary. Probably 27 can be further reduced for 4K page size.
> Instead, we should make SECTION_SIZE_BITS explicitly depend upon MAX_ORDER.
> IOW section size should be the same as the highest order page in the buddy.
> CONFIG_FORCE_MAX_ZONEORDER is always defined on arm64. A quick test shows
> SECTION_SIZE_BITS would be 22 on 4K pages and 29 for 64K pages. As a fall
> back SECTION_SIZE_BITS can still be 30 in case CONFIG_FORCE_MAX_ZONEORDER
> is not defined.

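This will break the PMD mapping for vmemmap. With 4K pages, the struct pages
for each 128MB (2^27) section fill exactly one 2MB PMD mapping (assuming a
64-byte struct page). A 4MB (2^22) section would need far less than one PMD.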
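
To make the arithmetic concrete, here is a rough sketch in plain userspace C
(not kernel code). The helper names are made up for illustration, the 64-byte
struct page and the 2MB PMD block size for 4K pages are assumptions, and the
CONFIG_FORCE_MAX_ZONEORDER values used in the note below (11 for 4K pages, 14
for 64K pages) are assumed defaults chosen because they reproduce the 22 and 29
quoted above:

```c
#include <assert.h>
#include <stdbool.h>

/* Assumptions for illustration only, not taken from any kernel header. */
#define ASSUMED_STRUCT_PAGE_SIZE 64UL        /* bytes per struct page */
#define ASSUMED_PMD_SIZE_4K      (2UL << 20) /* 2MB PMD block with 4K pages */

/* SECTION_SIZE_BITS as proposed: CONFIG_FORCE_MAX_ZONEORDER - 1 + PAGE_SHIFT */
static unsigned int section_bits_from_max_order(unsigned int force_max_zoneorder,
                                                unsigned int page_shift)
{
    return force_max_zoneorder - 1 + page_shift;
}

/* Bytes of vmemmap (struct page array) needed to describe one section. */
static unsigned long vmemmap_bytes_per_section(unsigned int section_size_bits,
                                               unsigned int page_shift)
{
    unsigned long pages_per_section = 1UL << (section_size_bits - page_shift);

    return pages_per_section * ASSUMED_STRUCT_PAGE_SIZE;
}

/* True if one section's vmemmap is a whole number (>= 1) of 2MB PMD blocks. */
static bool vmemmap_fills_whole_pmds(unsigned long vmemmap_bytes)
{
    return vmemmap_bytes >= ASSUMED_PMD_SIZE_4K &&
           vmemmap_bytes % ASSUMED_PMD_SIZE_4K == 0;
}
```

Under these assumptions, section_bits_from_max_order(11, 12) gives 22, and
vmemmap_bytes_per_section(22, 12) is only 64KB, so a section no longer fills a
PMD. With SECTION_SIZE_BITS 27 the result is exactly 2MB, and with 30 it is
16MB, i.e. eight PMD blocks.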

>
> --- a/arch/arm64/include/asm/sparsemem.h
> +++ b/arch/arm64/include/asm/sparsemem.h
> @@ -7,7 +7,7 @@
>
> #ifdef CONFIG_SPARSEMEM
> #define MAX_PHYSMEM_BITS CONFIG_ARM64_PA_BITS
> -#define SECTION_SIZE_BITS 30
> +#define SECTION_SIZE_BITS (CONFIG_FORCE_MAX_ZONEORDER - 1 + PAGE_SHIFT)
> #endif
>
> #endif
>
> A similar approach exists on ia64 platform as well.

Thanks
Barry