Re: [PATCH] allocate page caches pages in round robin fasion

From: Nick Piggin
Date: Fri Aug 13 2004 - 12:37:08 EST


Martin J. Bligh wrote:
--Jesse Barnes <jbarnes@xxxxxxxxxxxx> wrote (on Friday, August 13, 2004 09:34:20 -0700):


On Friday, August 13, 2004 9:20 am, Martin J. Bligh wrote:

I really don't think this is a good idea - you're assuming there's
really no locality of reference, which I don't think is at all true in
most cases.

No, not at all, just that locality of reference matters more for stack
and anonymous pages than it does for page cache pages. I.e. we don't
want a node to be filled up with page cache pages causing all other
memory references from the process to be off node.

Does that actually happen though? Looking at the current code makes me
think it'll keep some pages free on all nodes at all times, and if kswapd
does it's job, we'll never fall back across nodes. Now ... I think that's
broken, but I think that's what currently happens - that was what we
discussed at KS ... I might be misreading it though, I should test it.

Not nearly enough pages for any sizeable app though. Maybe the behavior could be configurable?


Well, either we're:

1. Falling back and putting all our most recent accesses off-node.

or.

2. Not falling back and only able to use one node's memory for any one (single threaded) app.

Either situation is crap, though I'm not sure which turd we picked right
now ... I'd have to look at the code again ;-) I thought it was 2, but
I might be wrong.

I'm looking at this now. We are doing 1 currently.

There are a couple of issues. The first is that you need to minimise
regressions for when working set size is bigger than the local node.
The second is a fundamental tradeoff where purely FIFO touch patterns
will never be allowed to expand past local node memory for any sane
and simple (ie 2.6 worthy) scanning algorithms that I can think of.

I have a patch going now that just reclaims use-once file cache before
going off node. Seems to help a bit for basic things that just push
pagecache through the system. It definitely reduces remote allocations
by several orders of magnitude for those cases.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/