Re: [PATCH 17/18] fs: icache remove inode_lock
From: Nick Piggin
Date: Thu Oct 14 2010 - 05:13:14 EST
On Thu, Oct 14, 2010 at 08:06:09PM +1100, Nick Piggin wrote:
> On Thu, Oct 14, 2010 at 10:23:19AM +1100, Dave Chinner wrote:
> Shrinker and zone reclaim is definitely needed. It is needed for NUMA
> scalability and locality of reclaim, and also for container and directed
> dentry/inode reclaim. Google have a very similar patch and they've said
> this is needed (and I already know it is needed for scalability on
> large NUMA -- SGI were complaining about this nearly 5 years ago IIRC).
> So that is _definitely_ going to be needed.
>
> Store-free path walking is definitely needed, so we need to do RCU inodes.
> With RCU inodes, the optimal locking protocols change quite a bit --
[...]
> It's much past a prototype. While the patches need some more cleanup
> and review still, the final end result gives a tree with almost no
> global cachelines in the entire vfs, including path walking. Things
> like path walks are nearly 50% faster single threaded, and perfectly
> scalable. Linus actually wants the store-free path walk stuff
> _before_ any of the other things, if that gives you an idea of where
> other people are putting the priority of the patches.
With this said, I think you're probably not quite aware of the bigger
picture with the vfs-scale series. Yes, it will be important for relieving
your XFS inode contention, but many other people are having other
problems, and the series contains other big improvements that will
benefit desktops and workloads more common than big-IO ones.
So yes I'll definitely keep the vfs-scale series together. Most of the
inode scaling work is at the bottom of it and should be able to go in
soon. But for example, the inode RCU work is going to go in -- Linus
has acked my strategy for it (and plan for mitigating/avoiding possible
regressions if needed). So with that, it makes more sense to design
the locking with the RCU available.
If we _know_ it will be needed in future anyway, it doesn't make sense
to take a different non-RCU approach and then rework that again down the
line IMO. That just gives a larger burden of locking models that need
to be supported/debugged.
Ditto for other things like per-zone locking.