Re: [PATCH v3 5/6] x86/mm/KASLR: Calculate the actual size of vmemmap region

From: Baoquan He
Date: Mon Feb 18 2019 - 05:09:38 EST


On 02/18/19 at 05:50pm, Baoquan He wrote:
> On 02/17/19 at 09:25am, Kees Cook wrote:
> > On Sat, Feb 16, 2019 at 6:04 AM Baoquan He <bhe@xxxxxxxxxx> wrote:
> > >
> > > The vmemmap region has a different maximum size depending on the
> > > paging mode. Its size is currently hardcoded as 1TB in memory KASLR,
> > > which is not right for 5-level paging mode. This causes an overflow
> > > if the vmemmap region is randomized to be adjacent to the
> > > cpu_entry_area region and its actual size is bigger than 1TB.
> > >
> > > So calculate the actual size of the vmemmap region in TB, aligned
> > > up to a 1TB boundary.
> > >
> > > Signed-off-by: Baoquan He <bhe@xxxxxxxxxx>
> > > ---
> > > arch/x86/mm/kaslr.c | 11 ++++++++++-
> > > 1 file changed, 10 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/arch/x86/mm/kaslr.c b/arch/x86/mm/kaslr.c
> > > index 97768df923e3..ca12ed4e5239 100644
> > > --- a/arch/x86/mm/kaslr.c
> > > +++ b/arch/x86/mm/kaslr.c
> > > @@ -101,7 +101,7 @@ static __initdata struct kaslr_memory_region {
> > > } kaslr_regions[] = {
> > > { &page_offset_base, 0 },
> > > { &vmalloc_base, 0 },
> > > - { &vmemmap_base, 1 },
> > > + { &vmemmap_base, 0 },
> > > };
> > >
> > > /*
> > > @@ -121,6 +121,7 @@ void __init kernel_randomize_memory(void)
> > > unsigned long rand, memory_tb;
> > > struct rnd_state rand_state;
> > > unsigned long remain_entropy;
> > > + unsigned long vmemmap_size;
> > >
> > > vaddr_start = pgtable_l5_enabled() ? __PAGE_OFFSET_BASE_L5 : __PAGE_OFFSET_BASE_L4;
> > > vaddr = vaddr_start;
> > > @@ -152,6 +153,14 @@ void __init kernel_randomize_memory(void)
> > > if (memory_tb < kaslr_regions[0].size_tb)
> > > kaslr_regions[0].size_tb = memory_tb;
> > >
> > > + /*
> > > + * Calculate how many TB vmemmap region needs, and align to
> > > + * 1TB boundary.
> >
> > Can you describe why this is the right calculation? (This will help
> > explain why 4-level is different from 5-level here.)
>
> In the old code, the size of vmemmap is hardcoded as 1 TB. This is
> correct in 4-level paging mode: at most 64 TB of RAM is supported, and
> sizeof(struct page) is usually 64 bytes, so vmemmap needs
> (64 TB)/4 KB * 64 bytes == 1 TB.
>
> However, in 5-level paging mode, 4 PB is the biggest RAM size we can
> support. That is (4 PB)/4 KB == 1<<40 pages, so a (1<<40) * 64 == 1<<46
> byte area, namely 64 TB, is needed for vmemmap, assuming
> sizeof(struct page) is 64 bytes here.
>
> So, the hardcoding of 1 TB is not correct for 5-level paging mode.
>
> Thanks
> Baoquan
>
> >
> > > + */
> > > + vmemmap_size = (kaslr_regions[0].size_tb << (TB_SHIFT - PAGE_SHIFT)) *
> > > + sizeof(struct page);
> > > + kaslr_regions[2].size_tb = DIV_ROUND_UP(vmemmap_size, 1UL << TB_SHIFT);

I forgot to mention what this patch is trying to do. As you can see, we
take the adjusted size of the direct mapping section, which includes any
memory region that may be hotplugged later, and use it to calculate the
size vmemmap actually needs, then align that up to 1 TB. For 4-level
paging mode, the result is still 1 TB. For 5-level paging mode, it is at
most 64 TB, the size needed when 4 PB of RAM is installed, and smaller
otherwise.
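
To make the arithmetic concrete, here is a minimal user-space sketch of
the same calculation. It is only an illustration: vmemmap_size_tb() and
STRUCT_PAGE_SIZE are hypothetical names that do not exist in kaslr.c, and
the sketch assumes 4 KB pages and sizeof(struct page) == 64 bytes.

```c
#include <stdint.h>

/* Assumed values: 4 KB pages, and 1 TB == 1 << 40 bytes. */
#define PAGE_SHIFT 12
#define TB_SHIFT   40

/* Assumption: sizeof(struct page) is 64 bytes, as in the discussion above. */
#define STRUCT_PAGE_SIZE UINT64_C(64)

/* User-space stand-in for the kernel's DIV_ROUND_UP() macro. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/*
 * Hypothetical helper: given the size of the direct mapping section in TB,
 * return the size of the vmemmap region in TB, aligned up to a 1 TB boundary.
 * uint64_t is used here instead of the kernel's unsigned long so the sketch
 * behaves the same on any host.
 */
uint64_t vmemmap_size_tb(uint64_t direct_map_tb)
{
	/* Number of pages in the direct mapping, times one struct page each. */
	uint64_t vmemmap_size =
		(direct_map_tb << (TB_SHIFT - PAGE_SHIFT)) * STRUCT_PAGE_SIZE;

	return DIV_ROUND_UP(vmemmap_size, UINT64_C(1) << TB_SHIFT);
}
```

Under these assumptions, a 64 TB direct mapping (the 4-level maximum)
needs 1 TB of vmemmap, and a 4096 TB (4 PB) direct mapping needs 64 TB.
Anything smaller than 64 TB still rounds up to 1 TB.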

Thanks
Baoquan