Re: [00/41] Large Blocksize Support V7 (adds memmap support)

From: Nick Piggin
Date: Tue Sep 18 2007 - 13:30:52 EST


On Tuesday 18 September 2007 08:21, Christoph Lameter wrote:
> On Sun, 16 Sep 2007, Nick Piggin wrote:
> > > > So if you argue that vmap is a downside, then please tell me how you
> > > > consider the -ENOMEM of your approach to be better?
> > >
> > > That is again pretty undifferentiated. Are we talking about low page
> >
> > In general.
>
> There is no -ENOMEM approach. Lower order page allocation (<
> PAGE_ALLOC_COSTLY_ORDER) will reclaim and in the worst case the OOM killer
> will be activated.

ROFL! Yeah of course, how could I have forgotten about our trusty OOM killer
as the solution to the fragmentation problem? It would only have been funnier
if you had said to reboot every so often when memory gets fragmented :)


> That is the nature of the failures that we saw early in
> the year when this was first merged into mm.
>
> > > With the ZONE_MOVABLE you can remove the unmovable objects into a
> > > defined pool then higher order success rates become reasonable.
> >
> > OK, if you rely on reserve pools, then it is not 1st class support and
> > hence it is a non-solution to VM and IO scalability problems.
>
> ZONE_MOVABLE creates two memory pools in a machine. One of them is for
> movable and one for unmovable. That is in 2.6.23. So 2.6.23 has no first
> call support for order 0 pages?

What?


> > > > If, by special software layer, you mean the vmap/vunmap support in
> > > > fsblock, let's see... that's probably all of a hundred or two lines.
> > > > Contrast that with anti-fragmentation, lumpy reclaim, higher order
> > > > pagecache and its new special mmap layer... Hmm, seems like a no
> > > > brainer to me. You really still want to persue the "extra layer"
> > > > argument as a point against fsblock here?
> > >
> > > Yes sure. You code could not live without these approaches. Without the
> >
> > Actually: your code is the one that relies on higher order allocations.
> > Now you're trying to turn that into an argument against fsblock?
>
> fsblock also needs contiguous pages in order to have a beneficial
> effect that we seem to be looking for.

Keyword: relies.


> > > antifragmentation measures your fsblock code would not be very
> > > successful in getting the larger contiguous segments you need to
> > > improve performance.
> >
> > Complely wrong. *I* don't need to do any of that to improve performance.
> > Actually the VM is well tuned for order-0 pages, and so seeing as I have
> > sane hardware, 4K pagecache works beautifully for me.
>
> Sure the system works fine as is. Not sure why we would need fsblock then.

Large block filesystem.


> > > (There is no new mmap layer, the higher order pagecache is simply the
> > > old API with set_blocksize expanded).
> >
> > Yes you add another layer in the userspace mapping code to handle higher
> > order pagecache.
>
> That would imply a new API or something? I do not see it.

I was not implying a new API.


> > > Why: It is the same approach that you use.
> >
> > Again, rubbish.
>
> Ok the logical conclusion from the above is that you think your approach
> is rubbish....

The logical conclusion is that _they are not the same approach_!


> Is there some way you could cool down a bit?

I'm not upset, but what you were saying was rubbish, plain and simple. The
amount of times we've gone in circles, I most likely have already explained
this, serveral times, in a more polite manner.

And I know you're more than capable to understand at least the concept
behind fsblock, even without time to work through the exact details. What
are you expecting me to say, after all this back and forth, when you come
up with things like "[fsblock] is not a generic change but special to the
block layer", and then claim that fsblock is the same as allocating "virtual
compound pages" with vmalloc as a fallback for higher order allocs.

What I will say is that fsblock has still a relatively longer way to go, so
maybe that's your reason for not looking at it. And yes, when fsblock is
in a better state to actually perform useful comparisons with it, will be a
much better time to have these debates. But in that case, just say so :)
then I can go away and do more constructive work on it instead of filling
people's inboxes.

I believe the fsblock approach is the best one, but it's not without problems
and complexities, so I'm quite ready for it to be proven incorrect, not
performant, or otherwise rejected.

I'm going on holiday for 2 weeks. I'll try to stay away from email, and
particularly this thread.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/