Re: [Lhms-devel] [PATCH 0/7] Fragmentation Avoidance V19

From: Mel Gorman
Date: Thu Nov 03 2005 - 11:27:28 EST


On Thu, 3 Nov 2005, Linus Torvalds wrote:

>
>
> On Thu, 3 Nov 2005, Arjan van de Ven wrote:
>
> > On Thu, 2005-11-03 at 07:36 -0800, Martin J. Bligh wrote:
> > > >> Can we quit coming up with specialist hacks for hotplug, and try to solve
> > > >> the generic problem please? hotplug is NOT the only issue here. Fragmentation
> > > >> in general is.
> > > >>
> > > >
> > > > Not really it isn't. There have been a few cases (e1000 being the main
> > > > one, and is fixed upstream) where fragmentation in general is a problem.
> > > > But mostly it is not.
> > >
> > > Sigh. OK, tell me how you're going to fix kernel stacks > 4K please.
> >
> > with CONFIG_4KSTACKS :)
>
> 2-page allocations are _not_ a problem.
>
> Especially not for fork()/clone(). If you don't even have 2-page
> contiguous areas, you are doing something _wrong_, or you're so low on
> memory that there's no point in forking any more.
>
> Don't confuse "fragmentation" with "perfectly spread out page
> allocations".
>
> Fragmentation means that it gets _exponentially_ more unlikely that you
> can allocate big contiguous areas. But contiguous areas of order 1 are
> very very likely indeed. It's only the _big_ areas that aren't going to
> happen.
>

For me, it's the big areas that I am interested in, especially if we want
to give HugeTLB pages to a user when they are asking for them. The obvious
one here is database and HPC loads, particularly the HPC loads which may
not have had a chance to reserve what they needed at boot time. These
loads need 1024 contiguous pages on the x86 at least, not 2. We can free
all we want on todays kernels and you're not going to get more than 1 or
two blocks this large unless you are very lucky.

Hotplug is, for me, an additional benefit. For others, it is the main
benefit. For others of course, they don't care, but others don't are about
scalability to 64 processors either but we provide it anyway at a low cost
to smaller machines.

> This is why fragmentation avoidance has always been totally useless. It is
> - only useful for big areas
> - very hard for big areas
>
> (Corollary: when it's easy and possible, it's not useful).
>

Unless you are a user that wants a large area when it suddenly is useful.

> Don't do it. We've never done it, and we've been fine. Claiming that
> fork() is a reason to do fragmentation avoidance is invalid.
>

We've never done it but, but we've only supported HugeTLB pages being
reserved at boot time and nothing else as well.

I'm going to setup a kbuild environment, hopefully this evening, and see
are these patches adversely impacting a load that kernel developers care
about. If I am impacting it, oops I'm in some trouble. If I'm not, then
why not try and help out the people who care about the big areas.

--
Mel Gorman
Part-time Phd Student Java Applications Developer
University of Limerick IBM Dublin Software Lab
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/