Re: [PATCH] allocate page cache pages in round-robin fashion

From: Martin J. Bligh
Date: Fri Aug 13 2004 - 09:52:34 EST


> On a NUMA machine, page cache pages should be spread out across the system,
> since they're generally global in nature and can otherwise eat up whole
> nodes' worth of memory. This can end up hurting performance, since jobs will
> have to make off-node references for much or all of their non-file data.
>
> The patch works by adding an alloc_page_round_robin routine that simply
> allocates on successive nodes each time it's called, based on the value of a
> per-cpu variable modulo the number of nodes. The variable is per-cpu to
> avoid cacheline contention when many CPUs try to do page cache allocations
> at once.
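
For reference, the helper described above presumably looks something like the
sketch below. This is reconstructed from the description, not taken from the
patch itself; the per-cpu counter name is invented, and only the stock
allocator interfaces are assumed.

#include <linux/gfp.h>
#include <linux/mm.h>
#include <linux/nodemask.h>
#include <linux/percpu.h>

/* Invented name: per-cpu counter of which node to try next. */
static DEFINE_PER_CPU(unsigned int, pagecache_next_node);

struct page *alloc_page_round_robin(gfp_t gfp_mask)
{
        int nid;

        /* Per-cpu counter, so concurrent allocators don't bounce one cacheline. */
        nid = get_cpu_var(pagecache_next_node)++ % num_online_nodes();
        put_cpu_var(pagecache_next_node);

        /*
         * Assumes node ids are contiguous from 0; alloc_pages_node() will
         * still fall back through the zonelist if the chosen node is full.
         */
        return alloc_pages_node(nid, gfp_mask, 0);
}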

I really don't think this is a good idea - you're assuming there's no
locality of reference, which I don't think is true at all in most cases.

If we round-robin it ... surely 7/8 of your data (on your 8-node machine)
will ALWAYS be off-node? I thought we discussed this at KS/OLS - what is
needed is to punt old pages off onto another node, rather than swapping
them out. That way all of your pages stay local.
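
To make that concrete: picking a target node for such a punt is the easy
part (a sketch below, using only the stock nodemask helpers); the hard part
- actually moving the page and fixing up the mapping under the right locks -
is deliberately left out, and is what the PS below is about.

#include <linux/mm.h>
#include <linux/nodemask.h>
#include <linux/numa.h>

/*
 * Sketch only: choose a neighbouring node to receive a cold page cache
 * page instead of pushing it out to swap.  The migration itself (copy,
 * remap, locking) is not shown.
 */
static int pick_punt_target(struct page *page)
{
        int src = page_to_nid(page);
        int dst = next_online_node(src);

        if (dst >= MAX_NUMNODES)
                dst = first_online_node;        /* wrap around the node map */

        return dst == src ? NUMA_NO_NODE : dst; /* NUMA_NO_NODE: single-node box */
}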

> After dd if=/dev/zero of=/tmp/bigfile bs=1G count=2 on a stock kernel:
> Node 7 MemUsed: 49248 kB
> Node 6 MemUsed: 42176 kB
> Node 5 MemUsed: 316880 kB
> Node 4 MemUsed: 36160 kB
> Node 3 MemUsed: 45152 kB
> Node 2 MemUsed: 50000 kB
> Node 1 MemUsed: 68704 kB
> Node 0 MemUsed: 2426256 kB
>
> and after the patch:
> Node 7 MemUsed: 328608 kB
> Node 6 MemUsed: 319424 kB
> Node 5 MemUsed: 318608 kB
> Node 4 MemUsed: 321600 kB
> Node 3 MemUsed: 319648 kB
> Node 2 MemUsed: 327504 kB
> Node 1 MemUsed: 389504 kB
> Node 0 MemUsed: 744752 kB

OK, so it obviously does something ... but is the dd actually faster?
I'd think it's slower ...

M.

PS. I think it probably makes sense to have the *receiving* node's kswapd
do the transfer if possible, for work-distribution reasons ... however,
I'm pretty sure there are some locking assumptions that mean this won't
be easy / possible.
