Re: hash table sizes

From: Jack Steiner
Date: Fri Nov 28 2003 - 14:38:53 EST



On Fri, Nov 28, 2003 at 11:22:47AM -0500, Jes Sorensen wrote:
> >>>>> "Jack" == Jack Steiner <steiner@xxxxxxx> writes:
>
> Jack> On Fri, Nov 28, 2003 at 09:15:02AM -0500, Jes Sorensen wrote:
> >> What about something like this? I believe node_present_pages
> >> should be the same as num_physpages on a non-NUMA machine. If not
> >> we can make it min(num_physpages,
> >> NODE_DATA(0)->node_present_pages).
>
> Jack> The system has a large number of nodes. Physically, each node
> Jack> has the same amount of memory. After boot, we observe that
> Jack> several nodes have substantially less memory than other
> Jack> nodes. Some of the imbalance is due to the kernel data/text
> Jack> being on node 0, but by far the major source of the
> Jack> imbalance is the 3 (in 2.4.x) large hash tables that are being
> Jack> allocated.
>
> Jack> I suspect the size of the hash tables is a lot bigger than is
> Jack> needed. That is certainly the first problem to be fixed, but
> Jack> unless the required size is a very small percentage (5-10%) of
> Jack> the amount of memory on a node (2GB to 32GB per node & 256
> Jack> nodes), we still have a problem.
>
> Jack,
>
> I agree with you, however as you point out, there are two problems to
> deal with, the excessive size of the hash tables on large systems and
> the imbalance caused by everything going on node zero. My patch only
> solves the first problem, or rather works around it.
>
> The problem of allocating structures across multiple nodes has yet
> to be solved.

Jes

Then I still don't understand your proposal. (I probably missed some piece
of the discussion.)

You proposed above to limit the allocation to the amount of memory on a node.
I don't see how that helps on SN systems - the allocation is already limited
to that amount because memory between nodes is discontiguous. We need to limit
the allocation to a small percentage of the memory on a node (rough sketch
after the list). I don't see how we can do that without:

- using vmalloc (on systems that don't have vmalloc issues)
OR
- changing algorithms so that a large hash table is not
  needed - either lots of smaller hash tables (second sketch below)
  or ???. I suspect there are performance issues with this.
OR
- ????
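
To make the sizing arithmetic concrete, here is a rough sketch (not a
patch) of the cap we are circling around - Jes' min() against node 0
plus a percentage knob. "max_pct" is made up for illustration;
NODE_DATA(0)->node_present_pages is taken from the mail above:

#include <linux/kernel.h>
#include <linux/mmzone.h>

/* Cap a boot-time hash table at max_pct percent of node 0's memory. */
static unsigned long __init cap_hash_bytes(unsigned long requested_bytes,
					   unsigned int max_pct)
{
	unsigned long node_bytes =
		NODE_DATA(0)->node_present_pages << PAGE_SHIFT;
	unsigned long limit = node_bytes / 100 * max_pct;
	unsigned long bytes = min(requested_bytes, limit);

	/* Round down to a power of two so lookups can mask, not divide. */
	while (bytes & (bytes - 1))
		bytes &= bytes - 1;

	return bytes;
}

Even with that cap, the whole table still lands on a single node, which
is why the options above are needed at all.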

I suppose I need to wait to see the proposal for allocating memory across nodes....
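
For what it's worth, the "lots of smaller hash tables" option might look
roughly like the sketch below - one small table per node, with the spare
hash bits picking the node. The names (pernode_hash, PERNODE_HASH_BITS)
are invented, "numnodes" stands in for however your tree counts nodes,
and bootmem memory comes back zeroed so the chains start out empty:

#include <linux/bootmem.h>
#include <linux/list.h>
#include <linux/mmzone.h>

#define PERNODE_HASH_BITS	16	/* example: 64K buckets per node */

static struct hlist_head *pernode_hash[MAX_NUMNODES];

/* Allocate one small table on each node instead of one huge table. */
static void __init pernode_hash_init(void)
{
	int nid;	/* "numnodes": boot-time node count, spelling varies */

	for (nid = 0; nid < numnodes; nid++)
		pernode_hash[nid] = alloc_bootmem_node(NODE_DATA(nid),
			(1UL << PERNODE_HASH_BITS) * sizeof(struct hlist_head));
}

/* Low bits select the bucket, the remaining bits select the node. */
static inline struct hlist_head *pernode_hash_bucket(unsigned long hash)
{
	unsigned long bucket = hash & ((1UL << PERNODE_HASH_BITS) - 1);

	return &pernode_hash[(hash >> PERNODE_HASH_BITS) % numnodes][bucket];
}

Most lookups would then chase a chain on a remote node, which is probably
where the performance issues I mentioned would show up.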


--
Thanks

Jack Steiner (steiner@xxxxxxx) 651-683-5302
Principal Engineer SGI - Silicon Graphics, Inc.

