Re: [00/17] Large Blocksize Support V3

From: Christoph Lameter
Date: Fri Apr 27 2007 - 01:50:45 EST


On Thu, 26 Apr 2007, Andrew Morton wrote:

> > Or make sure that truncate
> > doesn't race on a partial *block* truncate?
>
> lock four pages

You would only lock a single higher-order block. Truncate works at that
level.

If you have 4 separate pages then you need to take 4 separate locks, and
the memory may not be contiguous, which makes the filesystem jump through
all sorts of hoops.
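
To make the locking difference concrete, here is a rough userspace model
(not kernel code, and not taken from the patchset). Everything in it is
made up for illustration: struct model_page, model_page_size(),
lock_block() and unlock_block() are hypothetical names, and a 4k base
page is assumed. It only counts how many lock operations it takes to
cover one 16k block.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define BASE_PAGE_SHIFT 12                       /* assumption: 4k base pages */
#define BASE_PAGE_SIZE  (1UL << BASE_PAGE_SHIFT)

/* Hypothetical stand-in for struct page: one lock flag plus an order. */
struct model_page {
	bool locked;
	unsigned int order;	/* page covers BASE_PAGE_SIZE << order bytes */
};

static unsigned long model_page_size(const struct model_page *p)
{
	return BASE_PAGE_SIZE << p->order;
}

static void model_lock(struct model_page *p)
{
	assert(!p->locked);	/* single-threaded model, so never contended */
	p->locked = true;
}

static void model_unlock(struct model_page *p)
{
	assert(p->locked);
	p->locked = false;
}

/*
 * Lock every page backing one filesystem block.
 * Returns the number of lock operations taken, or 0 (nothing locked)
 * if the pages do not add up to exactly block_size bytes.
 */
static size_t lock_block(struct model_page *pages, size_t npages,
			 unsigned long block_size)
{
	unsigned long covered = 0;
	size_t i;

	for (i = 0; i < npages; i++)
		covered += model_page_size(&pages[i]);
	if (covered != block_size)
		return 0;

	for (i = 0; i < npages; i++)
		model_lock(&pages[i]);
	return npages;
}

static void unlock_block(struct model_page *pages, size_t npages)
{
	size_t i;

	for (i = 0; i < npages; i++)
		model_unlock(&pages[i]);
}
```

With one order-2 page, lock_block() takes a single lock for the whole 16k
block. With four order-0 pages it takes four, and the model does not even
try to express that those four pages may not be contiguous in memory.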

> I'm not saying it's especially simple, nor fast. But it has the advantage
> that we're not forced to use larger pages with _it's_ attendant performance
> problems.

The patch is not about forcing the use of large pages but about providing
the option to use larger pages. It's a new flexibility.

> And it doesn't introduce a rather nasty hack of pretending (in some places)
> that pages are larger than they really are.

They are really larger. One page struct controls it all.

> And it has the very significant advantage that it doesn't introduce brand
> new concepts and some complexity into core MM.

The patchset would reduce complexity and make it easier to handle the page
cache. It gets rid of the hacks needed to support larger blocks right now.
It's straightforward, no new locking, very much a cleanup patch.

> And make no mistake: the latter disadvantage is huge. Because if we do the
> PAGE_CACHE_SIZE hack (sorry, but it _is_), we have to do it *for ever*.
> Maintaining and enhancing core MM and VFS becomes harder and more costly
> and slower and more buggy *for ever*. The ramp for people to become
> competent on core MM becomes longer. Our developer pool becomes smaller, and
> proportionally less skilled.

No, it becomes easier. Look at the patchset. It cleans up a huge mess.
What is hacky about it? It is consistently using larger pages for the page
cache and it integrates nicely into the VM.

> And hardware gets better. If Intel & AMD come out with a 16k pagesize
> option in a couple of years we'll look pretty dumb. If the problems which
> you're presently having with that controller get sorted out in the next
> generation of the hardware, we'll also look pretty dumb.

We are currently looking dumb and are unable to deal with the hardware.
Yes, we can pressure the hardware vendors to produce hardware conforming
to our specifications, but I always thought that was how another company
operates.

> As always, there are tradeoffs. We can see the cons, and they are very
> significant. We don't yet know the pros. Perhaps they will be similarly
> significant. But I don't believe that the larger PAGE_CACHE_SIZE hack
> (sorry) is the only way in which they can be realised.

It is the most consistent solution, one that avoids the proliferation of
further hacks to address the large blocksize.