Re: [00/17] Large Blocksize Support V3

From: Andrew Morton
Date: Sat Apr 28 2007 - 04:24:00 EST


On Sat, 28 Apr 2007 10:04:08 +0200 Peter Zijlstra <a.p.zijlstra@xxxxxxxxx> wrote:

> >
> > The other thing is that we can batch up pagecache page insertions for bulk
> > writes as well (that is, write(2) with buffer size > page size). I should
> > have a patch somewhere for that as well if anyone is interested.
>
> Together with the optimistic locking from my concurrent pagecache that
> should bring most of the gains:
>
> sequential insert of 8388608 items:
>
> CONFIG_RADIX_TREE_CONCURRENT=n
>
> [ffff81007d7f60c0] insert 0 done in 15286 ms
>
> CONFIG_RADIX_TREE_OPTIMISTIC=y
>
> [ffff81006b36e040] insert 0 done in 3443 ms
>
> only 4.4 times faster, and more scalable, since we don't bounce the
> upper level locks around.

I'm not sure what we're looking at here. radix-tree changes? Locking
changes? Both?

If we have a whole pile of pages to insert then there are obvious gains
from not taking the lock once per page (gang insert). But I expect there
will also be gains from not walking down the radix tree once per page too:
walk all the way down and populate all the way to the end of the node.
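
To make the second point concrete, here is a rough userspace sketch, not the
kernel's radix tree. Every name in it (toy_tree, toy_insert, toy_gang_insert)
is made up for illustration, the fan-out of 64 slots and the fixed depth of 3
are assumptions, and locking is left out entirely. It only counts root-to-leaf
walks, comparing one walk per item against one walk per leaf node.

```c
#include <assert.h>
#include <stdlib.h>

#define TOY_MAP_SHIFT 6				/* assumed: 64 slots per node */
#define TOY_MAP_SIZE  (1UL << TOY_MAP_SHIFT)
#define TOY_MAP_MASK  (TOY_MAP_SIZE - 1)
#define TOY_HEIGHT    3				/* fixed depth: 64^3 indices */

struct toy_node {
	void *slots[TOY_MAP_SIZE];
};

struct toy_tree {
	struct toy_node *root;
	unsigned long descents;		/* root-to-leaf walks performed */
};

/* Walk from the root to the leaf covering index, allocating as needed. */
static struct toy_node *toy_descend(struct toy_tree *t, unsigned long index)
{
	struct toy_node *node;
	int level;

	assert(index < (1UL << (TOY_HEIGHT * TOY_MAP_SHIFT)));
	t->descents++;
	if (!t->root)
		t->root = calloc(1, sizeof(*t->root));
	node = t->root;
	for (level = TOY_HEIGHT - 1; node && level > 0; level--) {
		unsigned long slot = (index >> (level * TOY_MAP_SHIFT)) & TOY_MAP_MASK;

		if (!node->slots[slot])
			node->slots[slot] = calloc(1, sizeof(*node));
		node = node->slots[slot];	/* NULL if allocation failed */
	}
	return node;
}

/* Per-item pattern: one full descent for every insert. */
static int toy_insert(struct toy_tree *t, unsigned long index, void *item)
{
	struct toy_node *leaf = toy_descend(t, index);

	if (!leaf)
		return -1;
	leaf->slots[index & TOY_MAP_MASK] = item;
	return 0;
}

/* Gang pattern: descend once per leaf, then fill to the end of that node. */
static int toy_gang_insert(struct toy_tree *t, unsigned long start,
			   void **items, unsigned long nr)
{
	unsigned long i = 0;

	while (i < nr) {
		struct toy_node *leaf = toy_descend(t, start + i);
		unsigned long slot = (start + i) & TOY_MAP_MASK;

		if (!leaf)
			return -1;
		for (; slot < TOY_MAP_SIZE && i < nr; slot++, i++)
			leaf->slots[slot] = items[i];
	}
	return 0;
}

/* Read-only walk used to check the result; does not count as a descent. */
static void *toy_lookup(struct toy_tree *t, unsigned long index)
{
	struct toy_node *node = t->root;
	int level;

	for (level = TOY_HEIGHT - 1; node && level > 0; level--)
		node = node->slots[(index >> (level * TOY_MAP_SHIFT)) & TOY_MAP_MASK];
	return node ? node->slots[index & TOY_MAP_MASK] : NULL;
}
```

Inserting 200 consecutive items starting at index 10 costs 200 descents with
toy_insert() and 4 with toy_gang_insert(), since indices 10..209 span four
64-slot leaves. In a real implementation the lock would also be taken once
per batch rather than once per page.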

The implementation could get a bit tricky, handling pages which a racer
instantiated when we dropped the lock, and suitably adjusting ->index. Not
rocket science though.
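
One possible reading of the ->index adjustment, sketched in userspace C under
assumptions: toy_page and toy_fill_leaf are hypothetical names, a leaf is
modelled as a plain array of slots, and no locking is shown. Slots that a
racer has already filled are skipped, and each of our pages takes the index
of whatever slot it actually lands in.

```c
#include <assert.h>
#include <stddef.h>

struct toy_page {
	unsigned long index;	/* stand-in for page->index */
};

/*
 * Fill the free slots of one leaf from a pool of preallocated pages.
 * base is the index covered by slots[0].  Occupied slots are left alone,
 * so the racer's page wins.  Returns how many pool pages were consumed.
 */
static size_t toy_fill_leaf(struct toy_page **slots, size_t nslots,
			    unsigned long base,
			    struct toy_page **pool, size_t npool)
{
	size_t used = 0;
	size_t s;

	for (s = 0; s < nslots && used < npool; s++) {
		if (slots[s])
			continue;	/* instantiated by someone else */
		pool[used]->index = base + s;	/* adjust to the slot we got */
		slots[s] = pool[used++];
	}
	return used;
}
```

With 8 slots based at index 100, slot 2 already occupied, and a pool of 4
pages, the pool lands in slots 0, 1, 3 and 4, so the third pool page ends up
with index 103 rather than 102.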

The depth of the radix tree matters (i.e., the file size). 'twould be useful
to always describe the tree's size when publishing microbenchmark results
like this.
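
As a back-of-the-envelope aid, the depth can be estimated from the largest
index. This is a sketch under assumptions: 64 slots per node and 4 KiB pages
are assumed values, the function names are invented, and the toy model counts
a tree holding only index 0 as one level.

```c
#include <assert.h>

#define TOY_PAGE_SHIFT 12	/* assumed: 4 KiB pages */
#define TOY_MAP_SHIFT  6	/* assumed: 64 slots per node */

/* Levels needed so that max_index is addressable (minimum of one). */
static unsigned int toy_height_for_index(unsigned long long max_index)
{
	unsigned int height = 1;

	while (max_index >> TOY_MAP_SHIFT) {
		max_index >>= TOY_MAP_SHIFT;
		height++;
	}
	return height;
}

/* Levels needed for the pagecache of a file of the given size in bytes. */
static unsigned int toy_height_for_file(unsigned long long bytes)
{
	unsigned long long pages;

	assert(bytes > 0);
	pages = (bytes + (1ULL << TOY_PAGE_SHIFT) - 1) >> TOY_PAGE_SHIFT;
	return toy_height_for_index(pages - 1);
}
```

Under those assumptions, the 8388608 sequential items in the quoted benchmark
need 23 index bits, which works out to a four-level tree, the same depth as
a 32 GiB file with 4 KiB pages.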
