Re: [PATCH v3 0/5] hugetlb: add support gigantic page allocation at runtime

From: Andrew Morton
Date: Tue Apr 22 2014 - 17:55:54 EST


On Tue, 22 Apr 2014 17:37:26 -0400 Luiz Capitulino <lcapitulino@xxxxxxxxxx> wrote:

> On Thu, 17 Apr 2014 16:01:10 -0700
> Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> > On Thu, 10 Apr 2014 13:58:40 -0400 Luiz Capitulino <lcapitulino@xxxxxxxxxx> wrote:
> >
> > > The HugeTLB subsystem uses the buddy allocator to allocate hugepages during
> > > runtime. This means that hugepages allocation during runtime is limited to
> > > MAX_ORDER order. For archs supporting gigantic pages (that is, page sizes
> > > greater than MAX_ORDER), this in turn means that those pages can't be
> > > allocated at runtime.
> >
> > Dumb question: what's wrong with just increasing MAX_ORDER?
>
> To be honest I'm not a buddy allocator expert and I'm not familiar with
> what is involved in increasing MAX_ORDER. What I do know though is that it's
> not just a matter of increasing a macro's value. For example, for sparsemem
> support we have this check (include/linux/mmzone.h:1084):
>
> #if (MAX_ORDER - 1 + PAGE_SHIFT) > SECTION_SIZE_BITS
> #error Allocator MAX_ORDER exceeds SECTION_SIZE
> #endif
>
> I _guess_ it's because we can't allocate more pages than what's within a
> section on sparsemem. Can sparsemem and the other stuff be changed to
> accommodate a bigger MAX_ORDER? I don't know. Is it worth it to increase
> MAX_ORDER and do all the required changes, given that a bigger MAX_ORDER is
> only useful for HugeTLB and the archs supporting gigantic pages? I'd guess not.

afaict we'd need to increase SECTION_SIZE_BITS to 29 or more to
accommodate 1G MAX_ORDER. I assume this means that some machines with
sparse physical memory layout may not be able to use all (or as much)
of the physical memory. Perhaps Yinghai can advise?
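
For reference, the arithmetic behind the quoted mmzone.h check can be sketched
as below. This is only an illustration, not kernel code: the helper names are
made up, and it assumes 4 KiB base pages (PAGE_SHIFT = 12), 128 MiB sections
(SECTION_SIZE_BITS = 27, the x86_64 value as far as I know), and that the
largest buddy order is MAX_ORDER - 1, as the check itself implies.

```c
#include <assert.h>

/* Assumed example values (x86_64-like); other architectures differ. */
#define EX_PAGE_SHIFT        12	/* 4 KiB base pages */
#define EX_SECTION_SIZE_BITS 27	/* 128 MiB sparsemem sections */

/*
 * Hypothetical helper: smallest MAX_ORDER whose largest block
 * (order MAX_ORDER - 1) spans 2^size_bits bytes.
 */
unsigned int max_order_for(unsigned int size_bits, unsigned int page_shift)
{
	assert(size_bits >= page_shift);
	return size_bits - page_shift + 1;
}

/*
 * Hypothetical helper mirroring the preprocessor check quoted above:
 * nonzero means the build would stop at the #error.
 */
int exceeds_section(unsigned int max_order, unsigned int page_shift,
		    unsigned int section_size_bits)
{
	return (max_order - 1 + page_shift) > section_size_bits;
}

/* Hypothetical helper: smallest SECTION_SIZE_BITS that passes the check. */
unsigned int min_section_size_bits(unsigned int max_order,
				   unsigned int page_shift)
{
	return max_order - 1 + page_shift;
}
```

Under those assumptions a 1G block needs MAX_ORDER = 19, and the check then
passes only once SECTION_SIZE_BITS reaches 30, which is within "29 or more".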

I do think we should fully explore this option before giving up and
adding new special-case code.