Re: Avoiding external fragmentation with a placement policy Version 12

From: Martin J. Bligh
Date: Thu Jun 02 2005 - 09:03:31 EST


>> >> Other than the very minor whitespace changes above I have nothing bad to
>> >> say about this patch. I think it is about time to pick it up in -mm for
>> >> wider testing.
>> >>
>> >
>> > It adds a lot of complexity to the page allocator and while
>> > it might be very good, the only improvement we've been shown
>> > yet is allocating lots of MAX_ORDER allocations I think? (ie.
>> > not very useful)
>>
>> I agree that MAX_ORDER allocs aren't interesting, but we can hit
>> frag problems easily at way less than max order. CIFS does it, NFS
>> does it, jumbo frame gigabit ethernet does it, to name a few. The
>> most common failure I see is order 3.
>>
>
> I focused on the MAX_ORDER allocations for two reasons. The first is
> because they are very difficult to satisfy. If we can service MAX_ORDER
> allocations, we can certainly service order 3. The second is that my very
> long-term (and currently vapour-ware) aim is to transparently support
> large pages which will require 4MiB blocks on the x86 at least.

Oh, I wasn't arguing with your approach ... it's always better to go a bit
further. Was just illustrating that there are real world problems right
now that hit this stuff, ergo we need it. Yes, I'd like to be able to do
large pages, memory hotplug, etc too ... but if people aren't excited about
those, there are plenty of other reasons to fix the frag problem.
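
As a rough illustration of the sizes under discussion, here is a small
sketch of how an allocation order maps to a contiguous block size. It
assumes 4 KiB base pages (as on 32-bit x86); the helper names
order_to_bytes and bytes_to_order are made up for this example and are
not kernel functions.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB base pages, as on 32-bit x86


def order_to_bytes(order, page_size=PAGE_SIZE):
    """Size in bytes of a buddy-allocator block of the given order."""
    return page_size << order


def bytes_to_order(size, page_size=PAGE_SIZE):
    """Smallest order whose block can hold `size` bytes."""
    pages = -(-size // page_size)  # ceiling division
    order = 0
    while (1 << order) < pages:
        order += 1
    return order


if __name__ == "__main__":
    for order in (0, 3, 5, 10):
        kib = order_to_bytes(order) // 1024
        print(f"order {order}: {1 << order} pages, {kib} KiB")
```

Under that assumption an order-3 request needs 8 physically contiguous
pages (32 KiB), while a 4 MiB large page corresponds to order 10.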

> With this allocator, we are still using a blunderbuss approach but the
> chances of big enough chunks being available are a lot better. I released a
> proof-of-concept patch that freed pages by linearly scanning that worked
> very well, but it needs a lot of work. Linearly scanning would help
> guarantee high-order allocations but the penalty is that LRU-ordering
> would be violated.

Yes, would be nice ... but we need to gather things into freeable and
non-freeable either way, it seems, so doesn't invalidate what you're
doing at all.

It seems apparent statistically that the larger the machine, the worse the
frag problem is, as we'll blow away more memory before getting contig
blocks. If it wasn't pre-7am, I'd try to calculate the statistics, but
frankly, I can't be bothered ;-) I'm sure there are others whose math
degree is less rusty than mine, and I'd hate to deprive them of the
opportunity to play ;-)

> To test lower-order allocations, I ran a slightly different test where I
> tried to allocate 6000 order-5 pages under heavy pressure. The standard
> allocator repeatedly went OOM and allocated 5190 pages. The modified one
> did not OOM and allocated 5961. The test is not very fair though because
> it pins memory and the allocations are type GFP_KERNEL. For the gigabit
> ethernet and network filesystem tests, I imagine we are dealing with
> GFP_ATOMIC or GFP_NFS?

cifsd: page allocation failure. order:3, mode:0xd0
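
The mode value in that log line can be decoded into its flag bits. A
minimal sketch, assuming the 2.6-era bit values from include/linux/gfp.h
(these values have changed across kernel versions, so treat the table as
an assumption); parse_failure and decode_gfp are hypothetical helper
names, not existing tools.

```python
import re

# Assumption: bit values as in 2.6-era include/linux/gfp.h.
# Other kernel versions use different values.
GFP_BITS = {
    0x01: "__GFP_DMA",
    0x02: "__GFP_HIGHMEM",
    0x10: "__GFP_WAIT",
    0x20: "__GFP_HIGH",
    0x40: "__GFP_IO",
    0x80: "__GFP_FS",
}

FAILURE_RE = re.compile(
    r"^(\S+): page allocation failure\. order:(\d+), mode:(0x[0-9a-fA-F]+)"
)


def parse_failure(line):
    """Return (task, order, mode) from a page allocation failure line."""
    m = FAILURE_RE.match(line)
    if m is None:
        raise ValueError("not a page allocation failure line")
    return m.group(1), int(m.group(2)), int(m.group(3), 16)


def decode_gfp(mode):
    """Names of the known flag bits set in `mode`, lowest bit first."""
    return [name for bit, name in sorted(GFP_BITS.items()) if mode & bit]


if __name__ == "__main__":
    task, order, mode = parse_failure(
        "cifsd: page allocation failure. order:3, mode:0xd0"
    )
    print(task, order, " | ".join(decode_gfp(mode)))
```

Under those assumed values, 0xd0 decodes to __GFP_WAIT | __GFP_IO |
__GFP_FS, which is what GFP_KERNEL expanded to at the time, so this
particular failure would be a sleeping allocation rather than GFP_ATOMIC.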

M.