Re: [Part1 PATCH v5 00/22] x86, ACPI, numa: Parse numa info earlier

From: Tang Chen
Date: Sun Jun 23 2013 - 23:49:25 EST


On 06/22/2013 02:25 AM, Tejun Heo wrote:
> Hey,
>
> On Fri, Jun 21, 2013 at 05:19:48PM +0800, Tang Chen wrote:
> > > * As memblock allocator can relocate itself. There's no point in
> > > avoiding setting NUMA node while parsing and registering NUMA
> > > topology. Just parse and register NUMA info and later tell it to
> > > relocate itself out of hot-pluggable node. A number of patches in
> > > the series is doing this dancing - carefully reordering NUMA
> > > probing. No need to do that. It's really fragile thing to do.
> > >
> > > * Once you get the above out of the way, I don't think there are a lot
> > > of permanent allocations in the way before NUMA is initialized.
> > > Re-order the remaining ones if that's cleaner to do. If that gets
> > > overly messy / fragile, copying them around or freeing and reloading
> > > afterwards could be an option too.
> >
> > memblock allocator can relocate itself, but it cannot relocate the memory
>
> Hmmm... maybe I wasn't clear but that's the first bullet point above.
>
> > it allocated for users. There could be some pointers pointing to these
> > memory ranges. If we do the relocation, how to update these pointers ?
>
> And the second. Can you please list what persistent areas are
> allocated before numa info is configured into memblock? There

Hi tj,

My box is x86_64, and the memory layout is:
[ 0.000000] SRAT: Node 0 PXM 0 [mem 0x00000000-0x7fffffff]
[ 0.000000] SRAT: Node 0 PXM 0 [mem 0x100000000-0x307ffffff]
[ 0.000000] SRAT: Node 1 PXM 2 [mem 0x308000000-0x587ffffff] Hot Pluggable
[ 0.000000] SRAT: Node 2 PXM 3 [mem 0x588000000-0x7ffffffff] Hot Pluggable
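
For anyone who wants to pull the ranges out of such a log mechanically, here is a rough sketch (not part of the patch set). It assumes the SRAT lines have exactly the format shown above; the names `parse_srat` and `hotplug_ranges` are made up for this illustration.

```python
import re

# SRAT lines copied from the dmesg output above.
DMESG = """\
[ 0.000000] SRAT: Node 0 PXM 0 [mem 0x00000000-0x7fffffff]
[ 0.000000] SRAT: Node 0 PXM 0 [mem 0x100000000-0x307ffffff]
[ 0.000000] SRAT: Node 1 PXM 2 [mem 0x308000000-0x587ffffff] Hot Pluggable
[ 0.000000] SRAT: Node 2 PXM 3 [mem 0x588000000-0x7ffffffff] Hot Pluggable
"""

# Assumption: every SRAT line follows this exact layout.
SRAT_RE = re.compile(
    r"SRAT: Node (\d+) PXM (\d+) "
    r"\[mem (0x[0-9a-f]+)-(0x[0-9a-f]+)\]( Hot Pluggable)?"
)


def parse_srat(text):
    """Return (node, start, end, hotpluggable) tuples; end is inclusive."""
    entries = []
    for m in SRAT_RE.finditer(text):
        node = int(m.group(1))
        start = int(m.group(3), 16)
        end = int(m.group(4), 16)
        entries.append((node, start, end, m.group(5) is not None))
    return entries


def hotplug_ranges(entries):
    """Keep only the (start, end) pairs marked Hot Pluggable."""
    return [(start, end) for _, start, end, hot in entries if hot]
```

Calling `hotplug_ranges(parse_srat(DMESG))` yields the two hot-pluggable ranges for nodes 1 and 2.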


I marked the ranges reserved by memblock before we parse SRAT with flag 0x4.
There are about 14 ranges which are persistent after boot.

[ 0.000000] reserved[0x0] [0x00000000000000-0x0000000000ffff], 0x10000 bytes flags: 0x4
[ 0.000000] reserved[0x1] [0x00000000093000-0x000000000fffff], 0x6d000 bytes flags: 0x4
[ 0.000000] reserved[0x2] [0x00000001000000-0x00000002a9afff], 0x1a9b000 bytes flags: 0x4
[ 0.000000] reserved[0x3] [0x00000030000000-0x00000037ffffff], 0x8000000 bytes flags: 0x4
...
[ 0.000000] reserved[0x5] [0x0000006da81000-0x0000006e46afff], 0x9ea000 bytes flags: 0x4
[ 0.000000] reserved[0x6] [0x0000006ed6a000-0x0000006f246fff], 0x4dd000 bytes flags: 0x4
[ 0.000000] reserved[0x7] [0x0000006f28a000-0x0000006f299fff], 0x10000 bytes flags: 0x4
[ 0.000000] reserved[0x8] [0x0000006f29c000-0x0000006fe91fff], 0xbf6000 bytes flags: 0x4
[ 0.000000] reserved[0x9] [0x00000070e92000-0x00000071d54fff], 0xec3000 bytes flags: 0x4
[ 0.000000] reserved[0xa] [0x00000071d5e000-0x00000072204fff], 0x4a7000 bytes flags: 0x4
[ 0.000000] reserved[0xb] [0x00000072220000-0x0000007222074f], 0x750 bytes flags: 0x4
...
[ 0.000000] reserved[0xd] [0x000000722bc000-0x000000722bc1cf], 0x1d0 bytes flags: 0x4
[ 0.000000] reserved[0xe] [0x00000072bd3000-0x00000076c8ffff], 0x40bd000 bytes flags: 0x4
......
[ 0.000000] reserved[0x134] [0x000007fffdf000-0x000007ffffffff], 0x21000 bytes flags: 0x4


Just for readability, the hot-pluggable ranges again:
[0x00000308000000-0x00000587ffffff] Hot Pluggable
[0x00000588000000-0x000007ffffffff] Hot Pluggable

Judging from the dmesg output, only the last one is in a hot-pluggable area. I need to go
through the code to find out what it is, and find a way to relocate it.
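
The same check can be done mechanically. Below is a minimal sketch, not taken from the patch set: it hard-codes the two hot-pluggable ranges and a handful of the reserved ranges from the log above, and reports which reserved entries overlap a hot-pluggable range. The helper names `overlaps` and `reserved_in_hotplug` are hypothetical.

```python
# Hot-pluggable ranges from the SRAT output (end addresses are inclusive).
HOTPLUG = [
    (0x308000000, 0x587ffffff),
    (0x588000000, 0x7ffffffff),
]

# A subset of the reserved[] entries listed above, keyed by index
# (end addresses are inclusive).
RESERVED = {
    0x0: (0x0, 0xffff),
    0x3: (0x30000000, 0x37ffffff),
    0xe: (0x72bd3000, 0x76c8ffff),
    0x134: (0x7fffdf000, 0x7ffffffff),
}


def overlaps(a, b):
    """True if inclusive ranges a and b share at least one address."""
    return a[0] <= b[1] and b[0] <= a[1]


def reserved_in_hotplug(reserved, hotplug):
    """Return the sorted indices of reserved ranges touching any hot-pluggable range."""
    return sorted(
        idx
        for idx, rng in reserved.items()
        if any(overlaps(rng, hp) for hp in hotplug)
    )


print([hex(i) for i in reserved_in_hotplug(RESERVED, HOTPLUG)])
# prints ['0x134']
```

With the values from this box, only reserved[0x134] falls inside a hot-pluggable range.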

But I'm not sure whether a box with a different SRAT will give a different result.

I will send more info later.

Thanks. :)


> shouldn't be whole lot. And, again, this type of information should
> have been available in the head message so that high-level discussion
> could take place right away.
>
> Thanks.

