Re: [PATCH 2.6.9-rc2-mm1 0/2] mm: memory policy for page cache allocation

From: William Lee Irwin III
Date: Mon Sep 20 2004 - 18:26:34 EST


On Tue, Sep 21, 2004 at 12:37:42AM +0200, Andi Kleen wrote:
> Your counter can have the same worst case behaviour, just
> different. You only have to add freeing into the picture.
> Or when you consider getting more memory bandwidth from the interleaving
> (I know this is not your primary goal with this) then a sufficient
> access pattern could lead to rather uninterleaved allocation
> in the file.
> Any allocation algorithm will have such a worst case, so I'm not
> too worried. Given a hash function that is not too bad it should
> be bearable.
> The nice advantage of the static offset is that it makes benchmarks
> actually repeatable and is completely lockless.

As far as I can tell, the hash function picks the nth node whose bit is
set in node_online_map, where n is the page's linear index (what
linear_page_index(vma, address) would return; why isn't it using
linear_page_index()?) mod num_online_nodes(). That appears weak compared
to the various hash functions I've seen in use for e.g. page coloring,
which are not tremendously more expensive than mod and are generally
meant to be computationally cheap as opposed to strong.
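
For concreteness, a userspace sketch of that selection might look like
the following. It is not the mempolicy code itself: nodemask_t here is
a plain 64-bit stand-in for node_online_map, and count_online(),
nth_online_node() and interleave_direct() are made-up names used only
for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for node_online_map: bit i set => node i online. */
typedef uint64_t nodemask_t;

/* Number of set bits, i.e. what num_online_nodes() would report. */
static unsigned count_online(nodemask_t mask)
{
	unsigned n = 0;

	for (; mask; mask &= mask - 1)	/* clear lowest set bit each pass */
		n++;
	return n;
}

/* Id of the nth (0-based) set bit in mask. */
static unsigned nth_online_node(nodemask_t mask, unsigned n)
{
	unsigned node;

	for (node = 0; node < 64; node++) {
		if (mask & ((nodemask_t)1 << node)) {
			if (n == 0)
				return node;
			n--;
		}
	}
	assert(!"n must be less than the number of online nodes");
	return 0;
}

/* "Direct mapped": the page's linear index mod the online node count. */
static unsigned interleave_direct(nodemask_t mask, unsigned long pgoff)
{
	unsigned nr = count_online(mask);

	assert(nr > 0);
	return nth_online_node(mask, (unsigned)(pgoff % nr));
}
```

With nodes 0, 2, 3 and 5 online (mask 0x2D), consecutive page indices
cycle through 0, 2, 3, 5, 0, ... regardless of allocation order, which
is what makes it repeatable and lockless.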

The kind of scheme you've employed for MPOL_INTERLEAVE is what would be
called "direct mapped" in the context of page coloring, and Ray Bryant's
would be called "bin hopping" there. A nontrivial (though not
necessarily complex or expensive) hash function mod num_online_nodes()
would be considered hashed, and the last category I see in use
elsewhere is a "best bin" algorithm, which tracks utilization of bins
(for page coloring, colors; here nodes) and chooses one of the least
utilized bins thus far.
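
To make the other three categories concrete, here is a rough userspace
sketch over densely numbered nodes 0..nr_nodes-1. Everything in it is
an assumption for illustration: the names, the MAX_NODES limit, and the
multiplicative constant in the hashed variant are mine, and the
bin-hopping variant is only a guess at the general shape of a
counter-based scheme, not Ray Bryant's actual patch.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_NODES 8	/* arbitrary limit for this sketch */

/* "Hashed": a cheap nontrivial hash of the page index, then mod. */
static unsigned pick_hashed(unsigned long pgoff, unsigned nr_nodes)
{
	/* multiplicative hash; the constant is roughly 2^32 / golden ratio */
	uint32_t h = (uint32_t)pgoff * 2654435761u;

	assert(nr_nodes > 0);
	return (h >> 16) % nr_nodes;
}

/* "Bin hopping": a rotating counter, independent of the page index. */
struct hop_state {
	unsigned next;
};

static unsigned pick_bin_hopping(struct hop_state *s, unsigned nr_nodes)
{
	unsigned node;

	assert(nr_nodes > 0);
	node = s->next % nr_nodes;
	s->next = (node + 1) % nr_nodes;	/* hop to the next bin */
	return node;
}

/* "Best bin": track per-node utilization and take a least-used node. */
struct bin_state {
	unsigned long used[MAX_NODES];
};

static unsigned pick_best_bin(struct bin_state *s, unsigned nr_nodes)
{
	unsigned best = 0, i;

	assert(nr_nodes > 0 && nr_nodes <= MAX_NODES);
	for (i = 1; i < nr_nodes; i++)
		if (s->used[i] < s->used[best])
			best = i;	/* ties go to the lowest node id */
	s->used[best]++;
	return best;
}
```

Note that only the first of these is stateless. The counter and the
utilization table would both need some form of synchronization (or
per-cpu/per-object placement) in a real implementation, and a real
best-bin scheme would also have to decrement on free.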

I'd expect all 4 alternatives (and maybe even a variety of hash
functions for address hashing) to be useful in various contexts,
though I'm unaware of which kinds of apps want which algorithms most.

I don't have any idea what kind of difference the variations on the
locality domain for Bryant's bin hopping algorithm make; I'd tend to
try to make it similar to the others' precedents, but there may be
other interactions.


-- wli