Re: [PATCH 0/3] NUMA boot hash allocation interleaving

From: Brent Casavant
Date: Wed Dec 15 2004 - 13:29:43 EST


On Wed, 15 Dec 2004, Andi Kleen wrote:

> On Tue, Dec 14, 2004 at 11:14:46PM -0800, Martin J. Bligh wrote:
> > Well hold on a sec. We don't need to use the hugepages pool for this,
> > do we? This is the same as using huge page mappings for the whole of
> > kernel space on ia32. As long as it's a kernel mapping, and 16MB aligned
> > and contig, we get it for free, surely?
>
> The whole point of the patch is to not use the direct mapping, but
> use a different interleaved mapping on NUMA machines to spread
> the memory out over multiple nodes.

There is a middle ground, in theory. At least on a NUMA machine you
can divide up the allocation roughly as requested_size/number_nodes.
Round the result up to the next available page size, and allocate
interleaved on the nodes until you've satisfied the requested size.
This minimizes the number of TLB entries required to interleave the
allocation.

However, as noted, the kernel barely handles two page sizes, much
less multiple page sizes. If more flexible page-size handling
comes along someday this and many other sections of code could
stand to benefit from some rewriting.

> > > Using other page sizes would be probably tricky because the
> > > linux VM can currently barely deal with two page sizes.
> > > I suspect handling more would need some VM infrastructure effort
> > > at least in the changed port.
> >
> > For the general case I'd agree. But this is a setup-time only tweak
> > of the static kernel mapping, isn't it?
>
> It's probably not impossible, just lots of ugly special cases.
> e.g. how about supporting it for /proc/kcore etc?

Just to bring a bit of closure regarding the patches I posted yesterday,
I'm reading the overall discussion as "The patches look good enough for
current kernels, and this would benefit from multiple page size support,
if we ever get it." Fair read?

Brent

--
Brent Casavant If you had nothing to fear,
bcasavan@xxxxxxx how then could you be brave?
Silicon Graphics, Inc. -- Queen Dama, Source Wars
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/