Re: [PATCH] mm for fs: add truncate_pagecache_range

From: Hugh Dickins
Date: Sun Mar 25 2012 - 16:26:32 EST


On Fri, 23 Mar 2012, Andrew Morton wrote:
> On Fri, 23 Mar 2012 14:14:54 -0700 (PDT)
> Hugh Dickins <hughd@xxxxxxxxxx> wrote:
> > On Fri, 23 Mar 2012, Andrew Morton wrote:
> > >
> > > --- a/mm/truncate.c~mm-for-fs-add-truncate_pagecache_range-fix
> > > +++ a/mm/truncate.c
> > > @@ -639,6 +639,9 @@ int vmtruncate_range(struct inode *inode
> > > * with on-disk format, and the filesystem would not have to deal with
> > > * situations such as writepage being called for a page that has already
> > > * had its underlying blocks deallocated.
> > > + *
> > > + * Must be called with inode->i_mapping->i_mutex held.
> >
> > You catch me offguard: I forget whether that's an absolute requirement or
> > just commonly the case. What do the other interfaces in truncate.c say ?-)
>
> i_mutex is generally required, to stabilise i_size.

Sorry for being quarrelsome, but I do want to Nak your followup "fix".

Building a test kernel quickly told me that inode->i_mapping->i_mutex
doesn't exist, of course it's inode->i_mutex.

Then running the test kernel quickly told me that neither ext4 nor xfs
(I didn't try ocfs2) holds inode->i_mutex where holepunching calls
truncate_inode_pages_range().

Now, there might or might not be reasons why ext4 or xfs ought to hold
i_mutex there for its own consistency, but it's beyond me to determine
that: let's assume they're correct without evidence to the contrary.

Stabilizing i_size is not a reason: holepunching does not affect i_size
and is not affected by i_size (okay, ext4 still has the bug I reported
a couple of months ago, whereby its holepunching stops at i_size,
forgetting blocks fallocated beyond; but no doubt that will get fixed).

And nothing that truncate_pagecache_range() does needs i_mutex:
neither the unmap_mapping_range() nor the truncate_inode_pages_range()
needs i_mutex. A year ago, yes, Miklos showed how unmap_mapping_range()
was relying on mutex serialization, and added an additional mutex for
that, which Peter was able to remove once he mutified i_mmap_lock.

truncate_pagecache_range() is just a drop-in replacement for
truncate_inode_pages_range(), and has no different locking needs.

Hugh
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/