Re: [Lhms-devel] [PATCH 0/7] Fragmentation Avoidance V19

From: Nick Piggin
Date: Tue Nov 01 2005 - 20:04:41 EST


Joel Schopp wrote:

The patches do ad a reasonable amount of complexity to the page allocator. In my opinion that is the only downside of these patches, even though it is a big one. What we need to decide as a community is if there is a less complex way to do this, and if there isn't a less complex way then is the benefit worth the increased complexity.

As to the non-zero performance cost, I think hard numbers should carry more weight than they have been given in this area. Mel has posted hard numbers that say the patches are a wash with respect to performance. I don't see any evidence to contradict those results.


The numbers I have seen show that performance is decreased. People
like Ken Chen spend months trying to find a 0.05% improvement in
performance. Not long ago I just spent days getting our cached
kbuild performance back to where 2.4 is on my build system.

I can simply see they will cost more icache, more dcache, more branches,
etc. in what is the hottest part of the kernel in some workloads (kernel
compiles, for one).

I'm sorry if I sound like a wet blanket. I just don't look at a patch
and think "wow all those 3 guys with Linux on IBM mainframes and using
lpars are going to be so much happier now, this is something we need".

The will need high order allocations if we want to provide HugeTLB pages
to userspace on-demand rather than reserving at boot-time. This is a
future problem, but it's one that is not worth tackling until the
fragmentation problem is fixed first.


Sure. In what form, we haven't agreed. I vote zones! :)


I'd like to hear more details of how zones would be less complex while still solving the problem. I just don't get it.


You have an extra zone. You size that zone at boot according to the
amount of memory you need to be able to free. Only easy-reclaim stuff
goes in that zone.

It is less complex because zones are a complexity we already have to
live with. 99% of the infrastructure is already there to do this.

If you want to hot unplug memory or guarantee hugepage allocation,
this is the way to do it. Nobody has told me why this *doesn't* work.

--
SUSE Labs, Novell Inc.

Send instant messages to your online friends http://au.messenger.yahoo.com -
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/