Re: page fault scalability (ext3, ext4, xfs)

From: Andy Lutomirski
Date: Thu Aug 15 2013 - 02:29:30 EST

On Wed, Aug 14, 2013 at 11:18 PM, David Lang <david@xxxxxxx> wrote:
> On Wed, 14 Aug 2013, Andy Lutomirski wrote:
>>> The big problem with this approach is that not doing the
>>> timestamp update on page faults is going to break the inode change
>>> version counting because for ext4, btrfs and XFS it takes a
>>> transaction to bump that counter. NFS needs to know the moment a
>>> file is changed in memory, not when it is written to disk. Also, NFS
>>> requires the change to the counter to be persistent over server
>>> failures, so it needs to be changed as part of a transaction....
>> NFS can do whatever it wants, although I suspect that even NFS can get
>> away with deferring cmtime updates.
> NFS already has to do syncs to make sure the data is safe on disk. Have a
> flag that NFS can use to make the ctime safe; everyone else can get the
> performance improvement and NFS can have its slow-but-safe approach.

I don't see the current code that updates times for NFS. I'm not
planning on making any changes that'll affect NFS at all (i.e. I don't
think any flag will be needed), but I'd be more confident if I
understood why it worked in the first place.

(For filesystems that provide page_mkwrite, there hasn't been a
file_update_time call in the core code for several kernel versions.)
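
As a rough userspace illustration of the behaviour under discussion (a sketch only, not kernel code): POSIX requires st_mtime and st_ctime of a MAP_SHARED, PROT_WRITE mapping to be marked for update at some point between the write reference and the next msync(). The helper below stores through a mapping, calls msync(MS_SYNC), and checks that mtime has not gone backwards. The function name and the /tmp path template are hypothetical, and Linux with glibc is assumed (for st_mtim and mkstemp).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper: store one byte through a shared mapping, msync it,
 * and compare mtime before and after.
 * Returns 1 if mtime did not go backwards, 0 if it did, -1 on error.
 */
int mmap_store_keeps_mtime_monotonic(void)
{
    char path[] = "/tmp/mmap_mtime_XXXXXX";  /* assumed scratch location */
    const size_t len = 4096;
    struct stat before, after;
    char *map;
    int result = -1;

    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);  /* the file stays alive until fd is closed */

    if (ftruncate(fd, (off_t)len) != 0 || fstat(fd, &before) != 0)
        goto out;
    assert((size_t)before.st_size == len);

    map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED)
        goto out;

    map[0] = 'x';  /* first store to the page takes a write fault */

    /* After msync(), POSIX says the timestamp update must have been marked. */
    if (msync(map, len, MS_SYNC) == 0 && fstat(fd, &after) == 0) {
        result = after.st_mtim.tv_sec > before.st_mtim.tv_sec ||
                 (after.st_mtim.tv_sec == before.st_mtim.tv_sec &&
                  after.st_mtim.tv_nsec >= before.st_mtim.tv_nsec);
    }

    munmap(map, len);
out:
    close(fd);
    return result;
}
```

This only observes the timestamp from userspace. It says nothing about whether the kernel bumped it at fault time or at msync time, which is exactly the latitude a deferred cmtime update would rely on.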
