Re: [RFC PATCH 2/5] mm, arch: unify vmemmap_populate altmap handling

From: Gerald Schaefer
Date: Mon Jul 31 2017 - 10:28:04 EST


On Mon, 31 Jul 2017 14:55:56 +0200
Michal Hocko <mhocko@xxxxxxxxxx> wrote:

> On Mon 31-07-17 14:40:53, Gerald Schaefer wrote:
> [...]
> > > @@ -247,12 +248,12 @@ int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node)
> > > * use large frames even if they are only partially
> > > * used.
> > > * Otherwise we would have also page tables since
> > > - * vmemmap_populate gets called for each section
> > > + * __vmemmap_populate gets called for each section
> > > * separately. */
> > > if (MACHINE_HAS_EDAT1) {
> > > void *new_page;
> > >
> > > - new_page = vmemmap_alloc_block(PMD_SIZE, node);
> > > + new_page = __vmemmap_alloc_block_buf(PMD_SIZE, node, altmap);
> > > if (!new_page)
> > > goto out;
> > > pmd_val(*pm_dir) = __pa(new_page) | sgt_prot;
> >
> > There is another call to vmemmap_alloc_block() in this function, a couple
> > of lines below, this should also be replaced by __vmemmap_alloc_block_buf().
>
> I've noticed that one but in general I have only transformed PMD
> mappings because we shouldn't even get to pte level if the former works
> AFAICS. Memory sections should be always 2MB aligned unless I am missing
> something. Or is this not true?

vmemmap_populate() on s390 will only stop at pmd level if we have HW
support for large pages (MACHINE_HAS_EDAT1). In that case we will allocate
a PMD_SIZE block with vmemmap_alloc_block() and map it on pmd level as
a large page.

Without HW large page support, we instead allocate a pte page, populate
the pmd entry with it, and fall through to the pte_none() check below,
which does a PAGE_SIZE allocation with vmemmap_alloc_block(). That call
should also be converted to __vmemmap_alloc_block_buf(), so that the
altmap is used in this case too.
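
To illustrate the two paths, here is a rough userspace model, not the
actual s390 code. All names in it (populate_pmd_range, alloc_stats, the
SKETCH_* sizes) are made up for this sketch, and it assumes every pte in
the range is still empty. It only counts which kind of allocator would
back the vmemmap for one pmd-sized range:

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustrative sizes only; assumptions, not taken from the s390 headers. */
#define SKETCH_PAGE_SIZE 4096UL
#define SKETCH_PMD_SIZE  (256UL * SKETCH_PAGE_SIZE)

/* Hypothetical bookkeeping: how many backing allocations of each kind. */
struct alloc_stats {
	size_t altmap_aware;	/* stands in for __vmemmap_alloc_block_buf() */
	size_t plain;		/* stands in for vmemmap_alloc_block() */
};

/*
 * Model the vmemmap backing allocations for one pmd-sized range.
 * has_edat1 stands in for MACHINE_HAS_EDAT1; pte_path_converted says
 * whether the PAGE_SIZE allocation has been switched to the
 * altmap-aware allocator as well.
 */
static void populate_pmd_range(bool has_edat1, bool pte_path_converted,
			       struct alloc_stats *stats)
{
	unsigned long addr;

	if (has_edat1) {
		/* One PMD_SIZE block, mapped as a large page at pmd level. */
		stats->altmap_aware++;
		return;
	}

	/*
	 * No large pages: a pte page is set up (not counted here) and
	 * every empty pte gets its own PAGE_SIZE backing page.
	 */
	for (addr = 0; addr < SKETCH_PMD_SIZE; addr += SKETCH_PAGE_SIZE) {
		if (pte_path_converted)
			stats->altmap_aware++;
		else
			stats->plain++;
	}
}
```

With has_edat1 false and pte_path_converted false, every backing page in
this model comes from the plain allocator, which is the case the patch
currently misses.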

Regards,
Gerald