Re: [PATCH v7] mm/page_alloc.c: memory_hotplug: free pages as higher order

From: Michal Hocko
Date: Tue Jan 08 2019 - 15:04:43 EST


On Tue 08-01-19 10:40:18, Alexander Duyck wrote:
> On Fri, 2019-01-04 at 10:31 +0530, Arun KS wrote:
> > When freeing pages are done with higher order, time spent on coalescing
> > pages by buddy allocator can be reduced. With section size of 256MB, hot
> > add latency of a single section shows improvement from 50-60 ms to less
> > than 1 ms, hence improving the hot add latency by 60 times. Modify
> > external providers of online callback to align with the change.
> >
> > Signed-off-by: Arun KS <arunks@xxxxxxxxxxxxxx>
> > Acked-by: Michal Hocko <mhocko@xxxxxxxx>
> > Reviewed-by: Oscar Salvador <osalvador@xxxxxxx>
>
> After running into my initial issue I actually had a few more questions
> about this patch.
>
> > [...]
> > +static int online_pages_blocks(unsigned long start, unsigned long nr_pages)
> > +{
> > +	unsigned long end = start + nr_pages;
> > +	int order, ret, onlined_pages = 0;
> > +
> > +	while (start < end) {
> > +		order = min(MAX_ORDER - 1,
> > +			get_order(PFN_PHYS(end) - PFN_PHYS(start)));
> > +
> > +		ret = (*online_page_callback)(pfn_to_page(start), order);
> > +		if (!ret)
> > +			onlined_pages += (1UL << order);
> > +		else if (ret > 0)
> > +			onlined_pages += ret;
> > +
> > +		start += (1UL << order);
> > +	}
> > +	return onlined_pages;
> > }
> >
>
> Should the limit for this really be MAX_ORDER - 1 or should it be
> pageblock_order? In some cases this will be the same value, but I seem
> to recall that for x86 MAX_ORDER can be several times larger than
> pageblock_order.

Does it make any difference when we are in fact trying to online nr_pages
and we clamp to it properly?
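
To make the clamping concrete, here is a minimal userspace sketch of the
loop quoted above. It is an illustration only, not the kernel code: the
sim_* names are hypothetical, and the 4 KiB page size and MAX_ORDER of 11
are assumed values that depend on the configuration.

```c
#include <assert.h>

/* Assumed values for illustration only; both are configuration dependent. */
#define SIM_PAGE_SHIFT 12	/* 4 KiB pages */
#define SIM_MAX_ORDER  11	/* largest block order is SIM_MAX_ORDER - 1 */

/* Smallest order such that (1 << order) pages cover `size` bytes. */
static int sim_get_order(unsigned long long size)
{
	unsigned long long pages;
	int order = 0;

	assert(size > 0);	/* an order for a zero size is not meaningful */
	pages = (size + (1ULL << SIM_PAGE_SHIFT) - 1) >> SIM_PAGE_SHIFT;
	while ((1ULL << order) < pages)
		order++;
	return order;
}

/*
 * Walk [start, start + nr_pages) the way the quoted loop does and return
 * the number of callback invocations; *onlined gets the pages covered.
 */
static unsigned long sim_online_blocks(unsigned long start,
				       unsigned long nr_pages,
				       unsigned long *onlined)
{
	unsigned long end = start + nr_pages;
	unsigned long calls = 0;

	*onlined = 0;
	while (start < end) {
		unsigned long long bytes =
			(unsigned long long)(end - start) << SIM_PAGE_SHIFT;
		int order = sim_get_order(bytes);

		/* Clamp to the largest order the allocator handles. */
		if (order > SIM_MAX_ORDER - 1)
			order = SIM_MAX_ORDER - 1;

		*onlined += 1UL << order;
		start += 1UL << order;
		calls++;
	}
	return calls;
}
```

Under these assumptions a 256MB section is 65536 pages, so
sim_online_blocks(0, 65536, &onlined) makes 64 calls of order 10 and
covers exactly 65536 pages. Because the order is rounded up, the sketch
only stays within the range when the remaining size is a power of two or
a multiple of the largest block, which is the case for section sized
ranges.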

> >  static int online_pages_range(unsigned long start_pfn, unsigned long nr_pages,
> >  			void *arg)
> >  {
> > -	unsigned long i;
> >  	unsigned long onlined_pages = *(unsigned long *)arg;
> > -	struct page *page;
> >
> >  	if (PageReserved(pfn_to_page(start_pfn)))
>
> I'm not sure we even really need this check. Getting back to the
> discussion I have been having with Michal in regards to the need for
> the DAX pages to not have the reserved bit cleared I was originally
> wondering if we could replace this check with a call to
> online_section_nr since the section shouldn't be online until we set
> the bit below in online_mem_sections.
>
> However after doing some further digging it looks like this could
> probably be dropped entirely since we only call this function from
> online_pages and that function is only called by memory_block_action if
> pages_correctly_probed returns true. However pages_correctly_probed
> should return false if any of the sections contained in the page range
> is already online.

Yes, you are right, but I guess it would be better to address that in a
separate patch that deals with PageReserved manipulation in general.
I do not think we want to remove the check silently. People who might be
interested in backporting this for whatever reason might scratch their
heads over why the test is not needed anymore.
--
Michal Hocko
SUSE Labs