Re: [patch] x86, mm: Fix size of numa_distance array

From: Tejun Heo
Date: Fri Feb 25 2011 - 04:03:20 EST


Hello,

On Thu, Feb 24, 2011 at 02:46:38PM -0800, David Rientjes wrote:
> That's this:
>
> 430		numa_distance_cnt = cnt;
> 431
> 432		/* fill with the default distances */
> 433		for (i = 0; i < cnt; i++)
> 434			for (j = 0; j < cnt; j++)
> 435 ===>			numa_distance[i * cnt + j] = i == j ?
> 436					LOCAL_DISTANCE : REMOTE_DISTANCE;
> 437		printk(KERN_DEBUG "NUMA: Initialized distance table, cnt=%d\n", cnt);
> 438
> 439		return 0;
>
> We're overflowing the array and it's easy to see why:
>
> 	for_each_node_mask(i, nodes_parsed)
> 		cnt = i;
> 	size = ++cnt * sizeof(numa_distance[0]);
>
> cnt is the highest node id parsed, so numa_distance[] must be cnt * cnt.
> The following patch fixes the issue on top of x86/mm.

Oops, that was stupid.
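
Just to make the overflow concrete, here's a tiny userspace demo of
the arithmetic (illustration only, nothing kernel-specific about it):

	#include <stdio.h>

	int main(void)
	{
		int cnt = 3;	/* highest node id in nodes_parsed */
		int size, i, j, max_idx = 0;

		size = ++cnt;	/* old sizing: allocates cnt entries, i.e. 4 */

		/* ...but the fill loop treats the buffer as cnt x cnt */
		for (i = 0; i < cnt; i++)
			for (j = 0; j < cnt; j++)
				if (i * cnt + j > max_idx)
					max_idx = i * cnt + j;

		printf("allocated %d entries, highest index written %d\n",
		       size, max_idx);	/* prints 4 and 15 */
		return 0;
	}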

> I'm running on a 64GB machine with CONFIG_NODES_SHIFT == 10, so
> numa=fake=128M would result in 512 nodes. That's going to require 2MB for
> numa_distance (and that's not __initdata). Before these changes, we
> calculated numa_distance() using PXMs without this additional mapping; is
> there any way to reduce this? (Admittedly, real NUMA machines with 512
> nodes wouldn't mind sacrificing 2MB, but we didn't need this before.)

We can leave the physical distance table unmodified and map through
emu_nid_to_phys[] while dereferencing. It just seemed simpler this
way. Does it actually matter? Anyway, I'll give it a shot. Do you
guys actually use 512 nodes?
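
Roughly, the lookup side would become something like the following
untested sketch (assuming emu_nid_to_phys[] stays valid for every
possible nid):

	int __node_distance(int from, int to)
	{
		/* map emulated nids back to the physical nids backing them */
		from = emu_nid_to_phys[from];
		to = emu_nid_to_phys[to];

		if (from < 0 || to < 0 ||
		    from >= numa_distance_cnt || to >= numa_distance_cnt)
			return from == to ? LOCAL_DISTANCE : REMOTE_DISTANCE;
		return numa_distance[from * numa_distance_cnt + to];
	}

That would keep the table sized to the physical node count at the
cost of one extra indirection per lookup.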

> x86, mm: Fix size of numa_distance array
>
> numa_distance should be sized like the SLIT, an NxN matrix where N is the
> highest node id. This patch fixes the calculation to avoid overflowing
> the array on the subsequent iteration.
>
> Signed-off-by: David Rientjes <rientjes@xxxxxxxxxx>
> ---
> arch/x86/mm/numa_64.c | 2 +-
> 1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/arch/x86/mm/numa_64.c b/arch/x86/mm/numa_64.c
> index cccc01d..abf0131 100644
> --- a/arch/x86/mm/numa_64.c
> +++ b/arch/x86/mm/numa_64.c
> @@ -414,7 +414,7 @@ static int __init numa_alloc_distance(void)
>
>  	for_each_node_mask(i, nodes_parsed)
>  		cnt = i;
> -	size = ++cnt * sizeof(numa_distance[0]);
> +	size = cnt * cnt * sizeof(numa_distance[0]);

It should be cnt++ first and then cnt * cnt, as Yinghai wrote: cnt
still holds the highest node id at that point, so the table dimension
is cnt + 1.

>  	phys = memblock_find_in_range(0, (u64)max_pfn_mapped << PAGE_SHIFT,
>  				      size, PAGE_SIZE);
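
IOW, an untested sketch of the sizing Yinghai suggested:

	cnt = 0;
	for_each_node_mask(i, nodes_parsed)
		cnt = i;
	cnt++;				/* highest nid + 1 == table dimension */
	size = cnt * cnt * sizeof(numa_distance[0]);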

Thanks.

--
tejun