Re: sched: CPU #1's llc-sibling CPU #0 is not on the same node!

From: Tang Chen
Date: Wed Feb 27 2013 - 02:11:59 EST


On 02/27/2013 02:54 PM, Yinghai Lu wrote:
On Tue, Feb 26, 2013 at 9:49 PM, Yasuaki Ishimatsu
<isimatu.yasuaki@xxxxxxxxxxxxxx> wrote:
2013/02/27 14:11, Yinghai Lu wrote:

On Tue, Feb 26, 2013 at 8:43 PM, Yasuaki Ishimatsu
<isimatu.yasuaki@xxxxxxxxxxxxxx> wrote:

2013/02/27 13:04, Yinghai Lu wrote:


On Tue, Feb 26, 2013 at 7:38 PM, Yasuaki Ishimatsu
<isimatu.yasuaki@xxxxxxxxxxxxxx> wrote:


2013/02/27 11:30, Yinghai Lu wrote:


Do you mean you can not boot one socket system with 1G ram ?
Assume socket 0 does not support hotplug, other 31 sockets support hot
plug.

So we could boot system only with socket0, and later one by one hot
add other cpus.




In this case, the system can boot. But hot plugging the other cpus with
a bunch of ram may fail, since the system does not have enough memory to
cover the hot-added memory. When a memory device is hot-added, the kernel
objects for that memory are allocated from the 1G of ram, since the
hot-added memory has not been enabled yet.


yes, it may fail, if the page table and vmemmap needed for one node's
memory is more than 1G ...
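
A rough way to see when that 1G limit is hit is sketched below. The figures are assumptions, not taken from this thread: 4 KiB base pages, a 64-byte struct page (this varies with kernel configuration), and a direct mapping built only from 2 MiB pages, with upper-level page tables ignored. The function name is made up for illustration.

```python
PAGE_SIZE = 4096         # assumed x86_64 base page size
STRUCT_PAGE_SIZE = 64    # assumed sizeof(struct page); depends on kernel config
PMD_SIZE = 2 << 20       # one 2 MiB mapping per PMD entry
ENTRY_SIZE = 8           # bytes per page table entry


def hotadd_overhead(node_bytes):
    """Estimate (vmemmap_bytes, page_table_bytes) needed to hot-add a node.

    Only the PMD level of the direct mapping is counted; the upper
    levels are small by comparison and are ignored here.
    """
    pages = node_bytes // PAGE_SIZE
    vmemmap = pages * STRUCT_PAGE_SIZE
    pmd_entries = -(-node_bytes // PMD_SIZE)                  # ceiling division
    pmd_pages = -(-(pmd_entries * ENTRY_SIZE) // PAGE_SIZE)   # ceiling division
    return vmemmap, pmd_pages * PAGE_SIZE


if __name__ == "__main__":
    vmemmap, tables = hotadd_overhead(64 << 30)
    print(vmemmap, tables)
```

Under these assumptions vmemmap costs 1/64 of the node size, so a 64 GiB node already needs a full 1 GiB for vmemmap alone, while the 2 MiB page tables are negligible next to it.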



for hot-added memory we need to
1. add another wrapper for init_memory_mapping, just like
init_mem_mapping() for the booting path.
2. make memblock more generic, so we can use it with hot-added
memory during runtime.
3. with that we can initialize the page table for a hot-added node with ram.
a. the initial page table for the 2M near the node top comes from node0
(which does not support hot plug).
b. then use that 2M for the page tables of memory below the node top...
c. with that we make sure the page table stays on the local node.
alloc_low_pages needs to be updated to support that.
4. make sure vmemmap is on the local node too.
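
The ordering in step 3 can be illustrated with a small simulation. This is only a sketch of the idea as described above, not kernel code: the function name, the fixed 2M step, and the "node0"/"local" labels are all made up for illustration.

```python
STEP = 2 << 20  # 2 MiB chunk, as in step 3a


def map_node_top_down(node_start, node_end, step=STEP):
    """Walk a hot-added node from its top downwards in `step` chunks.

    Returns a list of (chunk_start, chunk_end, source), where source says
    which node the page tables for that chunk are taken from. Only the
    first (topmost) chunk has to borrow from node0; once it is mapped,
    it can hold the page tables for everything below it.
    """
    plan = []
    cur_end = node_end
    while cur_end > node_start:
        cur_start = max(node_start, cur_end - step)
        source = "node0" if not plan else "local"
        plan.append((cur_start, cur_end, source))
        cur_end = cur_start
    return plan


if __name__ == "__main__":
    base = 1 << 32
    for start, end, source in map_node_top_down(base, base + (5 << 20)):
        print(hex(start), hex(end), source)
```

The point of the ordering is that node0 is charged for at most one chunk's worth of page tables per hot-added node, which is what keeps the rest on the local node.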



I think so too. By this, memory hot plug becomes more useful.


I agree with your idea. But I think the above ideas are future work.
So at first we should use movable memory for memory hot plug.
After that, we will implement the above ideas.




so hot-remove node will work too later.

In the long run, we should make the booting path and hot adding more
similar and share as much code as possible.
That will give the code more test coverage.


Tang, Yasuaki, Andrew,

Please check if you are ok with attached reverting patch.


We will fix this problem; we have no objection to that. So please wait a while.

And the problem is caused by "movablemem_map=srat", not
"movablemem_map=nn[KMG]@ss[KMG]".
At least, if you want to revert it, you should revert only the
"movablemem_map=srat" part.

Those patches are tangled together.

No, they are not.

The following commits support "movablemem_map=nn[KMG]@ss[KMG]".

commit fb06bc8e5f42f38c011de0e59481f464a82380f6
page_alloc: bootmem limit with movablecore_map
commit 42f47e27e761fee07da69e04612ec7dd0d490edd
page_alloc: make movablemem_map have higher priority
commit 6981ec31146cf19454c55c130625f6cee89aab95
page_alloc: introduce zone_movable_limit[] to keep movable limit for nodes
commit 34b71f1e04fcba578e719e675b4882eeeb2a1f6f
page_alloc: add movable_memmap kernel parameter
commit 4d59a75125d5a4717e57e9fc62c64b3d346e603e
x86: get pg_data_t's memory from other node

And the following support "movablemem_map=srat".

commit f7210e6c4ac795694106c1c5307134d3fc233e88
mm/memblock.c: use CONFIG_HAVE_MEMBLOCK_NODE_MAP to protect movablecore_map in memblock_overlaps_region().
commit 01a178a94e8eaec351b29ee49fbb3d1c124cb7fb
acpi, memory-hotplug: support getting hotplug info from SRAT
commit 27168d38fa209073219abedbe6a9de7ba9acbfad
acpi, memory-hotplug: extend movablemem_map ranges to the end of node
commit e8d1955258091e4c92d5a975ebd7fd8a98f5d30f
acpi, memory-hotplug: parse SRAT before memblock is ready
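
To make the difference between the two forms concrete, here is a small sketch of how the option value could be interpreted. It is not the kernel's parser. It assumes that, as with memmap=, nn is the size and ss is the start address, and it accepts only decimal numbers with an optional uppercase K, M or G suffix. The function name is hypothetical.

```python
import re

_UNITS = {"": 1, "K": 1 << 10, "M": 1 << 20, "G": 1 << 30}
_RANGE = re.compile(r"^(\d+)([KMG]?)@(\d+)([KMG]?)$")


def parse_movablemem_map(value):
    """Interpret the value of a movablemem_map= option.

    Returns the string "srat" for movablemem_map=srat, or a
    (start, end) byte range for the nn[KMG]@ss[KMG] form, where
    nn is taken as the size and ss as the start (an assumption).
    """
    if value == "srat":
        return "srat"
    match = _RANGE.match(value)
    if match is None:
        raise ValueError("unrecognised movablemem_map value: %r" % value)
    size = int(match.group(1)) * _UNITS[match.group(2)]
    start = int(match.group(3)) * _UNITS[match.group(4)]
    return (start, start + size)


if __name__ == "__main__":
    print(parse_movablemem_map("4G@8G"))
    print(parse_movablemem_map("srat"))
```

With the explicit form the user chooses the movable range directly, while with "srat" the ranges come from the firmware's SRAT table, which is the part the second list of commits implements.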


Also it looks funny to ask user to specify mem range in boot command
line to enable mem hotplug.

Well, I think sometimes users don't like the SRAT memory layout, and want to
increase or reduce hot-pluggable memory by themselves. And also, it is useful
for debugging firmware bugs.

I agree that the "movablemem_map=srat" functionality needs more work.
Can we not revert it, and improve it during 3.9-rc? I think during rc time,
at least we can fix the problems brought by early_parse_srat().

Thanks. :)