Re: [PATCH 3/9] mm, page_alloc: do not set_cpu_numa_mem on empty nodes initialization

From: Michal Hocko
Date: Fri Jul 14 2017 - 08:39:51 EST


On Fri 14-07-17 13:32:42, Mel Gorman wrote:
> On Fri, Jul 14, 2017 at 12:50:04PM +0200, Michal Hocko wrote:
> > On Fri 14-07-17 10:48:10, Mel Gorman wrote:
> > > On Fri, Jul 14, 2017 at 10:00:00AM +0200, Michal Hocko wrote:
> > > > From: Michal Hocko <mhocko@xxxxxxxx>
> > > >
> > > > __build_all_zonelists reinitializes each online cpu local node for
> > > > CONFIG_HAVE_MEMORYLESS_NODES. This makes sense because previously memory
> > > > less nodes could gain some memory during memory hotplug and so the local
> > > > node should be changed for CPUs close to such a node. It makes less
> > > > sense to do that unconditionally for a newly created NUMA node which is
> > > > still offline and without any memory.
> > > >
> > > > Let's also simplify the cpu loop and use for_each_online_cpu instead of
> > > > an explicit cpu_online check for all possible cpus.
> > > >
> > > > Signed-off-by: Michal Hocko <mhocko@xxxxxxxx>
> > > > ---
> > > > mm/page_alloc.c | 6 ++----
> > > > 1 file changed, 2 insertions(+), 4 deletions(-)
> > > >
> > > > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > > > index 7746824a425d..ebc3311555b1 100644
> > > > --- a/mm/page_alloc.c
> > > > +++ b/mm/page_alloc.c
> > > > @@ -5096,10 +5096,8 @@ static int __build_all_zonelists(void *data)
> > > >
> > > > build_zonelists(pgdat);
> > > > }
> > > > - }
> > > >
> > > > #ifdef CONFIG_HAVE_MEMORYLESS_NODES
> > > > - for_each_possible_cpu(cpu) {
> > > > /*
> > > > * We now know the "local memory node" for each node--
> > > > * i.e., the node of the first zone in the generic zonelist.
> > > > @@ -5108,10 +5106,10 @@ static int __build_all_zonelists(void *data)
> > > > * secondary cpus' numa_mem as they come on-line. During
> > > > * node/memory hotplug, we'll fixup all on-line cpus.
> > > > */
> > > > - if (cpu_online(cpu))
> > > > + for_each_online_cpu(cpu)
> > > > set_cpu_numa_mem(cpu, local_memory_node(cpu_to_node(cpu)));
> > > > - }
> > > > #endif
> > > > + }
> > > >
> > >
> > > This is not as clear a benefit. For each online node, we now go through
> > > all online CPUs once per node. There would be some rationale for using
> > > for_each_online_cpu.
> >
> > I am not sure I understand. I am using for_each_online_cpu...
>
> Yes, but within a loop that looks like
>
> for_each_online_node(nid)
> ...
> for_each_online_cpu(cpu)
>
> Or maybe you aren't because we are looking at different baselines. I had
> minor fuzz and conflicts applying the series.

The current mmotm after this patch looks like this
	if (self && !node_online(self->node_id)) {
		build_zonelists(self);
	} else {
		for_each_online_node(nid) {
			pg_data_t *pgdat = NODE_DATA(nid);

			build_zonelists(pgdat);
		}

#ifdef CONFIG_HAVE_MEMORYLESS_NODES
		/*
		 * We now know the "local memory node" for each node--
		 * i.e., the node of the first zone in the generic zonelist.
		 * Set up numa_mem percpu variable for on-line cpus. During
		 * boot, only the boot cpu should be on-line; we'll init the
		 * secondary cpus' numa_mem as they come on-line. During
		 * node/memory hotplug, we'll fixup all on-line cpus.
		 */
		for_each_online_cpu(cpu)
			set_cpu_numa_mem(cpu, local_memory_node(cpu_to_node(cpu)));
#endif
	}

So for_each_online_cpu is called outside of the for_each_online_node.
Have a look at
git://git.kernel.org/pub/scm/linux/kernel/git/mhocko/mm.git
branch attempts/zonlists-build-simplification
--
Michal Hocko
SUSE Labs