RE: [PATCH] online CPU before memory failed in pcpu_alloc_pages()

From: Guo, Chaohong
Date: Sun May 23 2010 - 21:04:43 EST

>> >>
>> >>
>> >> @@ -2968,9 +2991,23 @@ static int __build_all_zonelists(void *d
>> >> ...
>> >>
>> >> - for_each_possible_cpu(cpu)
>> >> + for_each_possible_cpu(cpu) {
>> >> setup_pageset(&per_cpu(boot_pageset, cpu), 0);
>> >> ...
>> >>
>> >> + if (cpu_online(cpu))
>> >> + cpu_to_mem(cpu) = local_memory_node(cpu_to_node(cpu));
>> >> +#endif
>> Look at the code above: in __build_all_zonelists(), cpu_to_mem(cpu)
>> is set only when the cpu is online. Suppose a node has local memory,
>> all of its memory segments are onlined first, and then the cpus within
>> that node are onlined one by one. In this case, where does cpu_to_mem(cpu)
>> for the last cpu get its value?
>As I mentioned to Kame-san, x86 does not define
>CONFIG_HAVE_MEMORYLESS_NODES, so this code is not compiled for that
>arch. If x86 did support memoryless nodes--i.e., did not hide them and
>reassign the cpus to other nodes, as is the case for ia64--then we could
>have on-line cpus associated with memoryless nodes. The code above is
>in __build_all_zonelists() so that in the case where we add memory to a
>previously memoryless node, we re-evaluate the "local memory node" for
>all online cpus.
>For cpu hotplug--again, if x86 supports memoryless nodes--we'll need to
>add a similar chunk to the path where we set up the cpu_to_node map for
>a hotplugged cpu. See, for example, the call to set_numa_mem() in
>smp_callin() in arch/ia64/kernel/smpboot.c.

Yeah, that's what I am looking for.

>But currently, I don't
>think you can use the numa_mem_id()/cpu_to_mem() interfaces for your
>purpose. I suppose you could change page_alloc.c to compile
>local_memory_node() #if defined(CONFIG_HAVE_MEMORYLESS_NODES) ||
>defined(CONFIG_HOTPLUG_CPU) and use that function to find the nearest memory. It
>should return a valid node after zonelists have been rebuilt.
>Does that make sense?
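
To make the "find the nearest memory" idea concrete, here is a small userspace sketch. It is not the kernel's local_memory_node(), which walks the node's zonelist; it only models the behaviour described above with a hypothetical distance table. All names (nearest_memory_node, node_has_memory, node_distance_tbl, MAX_NODES, NO_NODE) are made up for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define MAX_NODES 4
#define NO_NODE   (-1)

/* Hypothetical toy state: true once memory on that node has been onlined. */
static bool node_has_memory[MAX_NODES];

/* Hypothetical SLIT-like distance table; a smaller value means closer. */
static const int node_distance_tbl[MAX_NODES][MAX_NODES] = {
	{10, 20, 30, 40},
	{20, 10, 20, 30},
	{30, 20, 10, 20},
	{40, 30, 20, 10},
};

/*
 * Return the closest node that currently has memory, or NO_NODE if
 * no node has memory yet. A node with its own memory returns itself,
 * because the diagonal of the table holds the smallest distance.
 */
static int nearest_memory_node(int node)
{
	int best = NO_NODE;
	int best_dist = INT_MAX;

	assert(node >= 0 && node < MAX_NODES);

	for (int n = 0; n < MAX_NODES; n++) {
		if (!node_has_memory[n])
			continue;
		if (node_distance_tbl[node][n] < best_dist) {
			best = n;
			best_dist = node_distance_tbl[node][n];
		}
	}
	return best;
}
```

In this model the answer for a given node changes as memory is onlined elsewhere, which is why the result is only meaningful after the (modelled) topology has been refreshed.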

Yes. Besides that, I need to find a place in the hotplug path to call
set_numa_mem(), just as you mentioned for the ia64 platform. Is my
understanding right?
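
A userspace sketch of the ordering problem discussed here, under stated assumptions: the rebuild loop only touches cpus that are already online, so a cpu onlined later keeps a stale mapping unless the hotplug path sets it as well. This is a toy model, not kernel code; every toy_* name is hypothetical, and toy_local_memory_node() is a simplified stand-in for local_memory_node().

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_CPUS  4
#define TOY_NR_NODES 2
#define TOY_NO_NODE  (-1)

static bool toy_cpu_online[TOY_NR_CPUS];
static bool toy_node_has_mem[TOY_NR_NODES];

/* Fixed toy topology: cpus 0-1 sit on node 0, cpus 2-3 on node 1. */
static const int toy_cpu_to_node[TOY_NR_CPUS] = {0, 0, 1, 1};

/* Per-cpu "nearest node with memory"; starts out unset. */
static int toy_cpu_to_mem[TOY_NR_CPUS] = {
	TOY_NO_NODE, TOY_NO_NODE, TOY_NO_NODE, TOY_NO_NODE
};

/* Simplified stand-in: own node if it has memory, else the first node that does. */
static int toy_local_memory_node(int node)
{
	assert(node >= 0 && node < TOY_NR_NODES);
	if (toy_node_has_mem[node])
		return node;
	for (int n = 0; n < TOY_NR_NODES; n++)
		if (toy_node_has_mem[n])
			return n;
	return TOY_NO_NODE;
}

/* Models the quoted loop: only cpus that are online get re-evaluated. */
static void toy_rebuild_zonelists(void)
{
	for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++)
		if (toy_cpu_online[cpu])
			toy_cpu_to_mem[cpu] =
				toy_local_memory_node(toy_cpu_to_node[cpu]);
}

/*
 * Models bringing a cpu online. With set_mem == false the mapping is
 * left untouched, which is the gap raised in this thread; with
 * set_mem == true the hotplug path fills it in, analogous to the
 * set_numa_mem() call mentioned for ia64.
 */
static void toy_cpu_up(int cpu, bool set_mem)
{
	assert(cpu >= 0 && cpu < TOY_NR_CPUS);
	toy_cpu_online[cpu] = true;
	if (set_mem)
		toy_cpu_to_mem[cpu] =
			toy_local_memory_node(toy_cpu_to_node[cpu]);
}
```

In this model, a cpu brought up with set_mem == false after the last rebuild stays at TOY_NO_NODE until something triggers another rebuild.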


>> >
>> > So, cpu_to_node(cpu) for possible cpus will have NUMA_NO_NODE(-1)
>> > or the number of the nearest node.
>> >
>> > IIUC, if SRAT is not broken, each pxm has its own node_id.
>> Thank you very much for the info. I have been wondering why node_id
>> is (-1) in some cases.
>> -minskey
