Re: [PATCH 2/9] vfs: export do_splice_direct() to modules

From: Jan Kara
Date: Wed Mar 20 2013 - 18:19:11 EST


On Wed 20-03-13 21:48:13, Al Viro wrote:
> On Wed, Mar 20, 2013 at 08:52:22PM +0100, Jan Kara wrote:
> > > do_bio_filebacked(), with some ugliness between that and callsite. Note,
> > > BTW, that we have a pair of possible vfs_fsync() calls in there; how do those
> > > interact with freeze?
> > Freezing code takes care that all dirty data is synced before fs is
> > frozen and no new dirty data can be created before fs is thawed. So
> > vfs_fsync() should just return without doing anything on frozen filesystem.
>
> Um... How does it interact with vfs_fsync() already in progress when you
> ask to freeze it?
So the exact sequence of freezing is:
    sb->s_writers.frozen = SB_FREEZE_WRITE;
    smp_wmb();
    sb_wait_write(sb, SB_FREEZE_WRITE);
Now there are no processes in sb_start_write() - sb_end_write() section.
Then we do the same for SB_FREEZE_PAGEFAULT. After this no one should be
able to dirty a page or inode. Writeback or vfs_fsync() may still be
running (so fs can be creating new transactions in the journal for
writeback etc.).
    sync_filesystem(sb);
After this there should be no dirty data, so although we can still be
somewhere inside vfs_fsync(), it should have nothing to do.

Now we freeze to state SB_FREEZE_FS (nop for ext4, but for XFS it may
interact e.g. with inode reclaim trimming preallocated blocks) and we are
done.
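
To make the ordering above easier to follow, here is a rough single-threaded
userspace model of it. This is only a sketch, not the kernel code: struct
model_sb and all model_*() helpers are hypothetical names invented for
illustration, the numeric values of the freeze levels are picked for the
model, and where the kernel would sleep in sb_wait_write() the model simply
reports failure and unfreezes again.

```c
#include <assert.h>
#include <stdbool.h>

/* Freeze levels named after the ones in the mail; the numeric values
 * are chosen for this model only. */
enum model_freeze_level {
    SB_UNFROZEN = 0,
    SB_FREEZE_WRITE = 1,
    SB_FREEZE_PAGEFAULT = 2,
    SB_FREEZE_FS = 3,
};

#define MODEL_LEVELS 3

/* Hypothetical stand-in for struct super_block. */
struct model_sb {
    enum model_freeze_level frozen;
    int writers[MODEL_LEVELS];  /* open protected sections per level */
    int dirty_pages;
};

/* Models sb_start_write() and friends: refuse (the kernel would sleep)
 * once the fs is frozen to this level or beyond. */
static bool model_start(struct model_sb *sb, enum model_freeze_level level)
{
    assert(level != SB_UNFROZEN);
    if (sb->frozen >= level)
        return false;
    sb->writers[level - 1]++;
    return true;
}

/* Models sb_end_write(): leave a protected section. */
static void model_end(struct model_sb *sb, enum model_freeze_level level)
{
    assert(sb->writers[level - 1] > 0);
    sb->writers[level - 1]--;
}

/* A write path: dirty one page under SB_FREEZE_WRITE protection. */
static bool model_dirty_page(struct model_sb *sb)
{
    if (!model_start(sb, SB_FREEZE_WRITE))
        return false;
    sb->dirty_pages++;
    model_end(sb, SB_FREEZE_WRITE);
    return true;
}

/* Models writeback / vfs_fsync(): takes no freeze protection and
 * returns how many pages it had to write. */
static int model_fsync(struct model_sb *sb)
{
    int written = sb->dirty_pages;

    sb->dirty_pages = 0;
    return written;
}

/* One step of the sequence: raise the level first, then check for
 * sections that are still open. The kernel waits; the model gives up. */
static bool model_freeze_to(struct model_sb *sb, enum model_freeze_level level)
{
    sb->frozen = level;
    return sb->writers[level - 1] == 0;
}

/* The whole sequence: WRITE, PAGEFAULT, sync, then FS. */
static bool model_freeze(struct model_sb *sb)
{
    if (!model_freeze_to(sb, SB_FREEZE_WRITE) ||
        !model_freeze_to(sb, SB_FREEZE_PAGEFAULT))
        goto fail;
    model_fsync(sb);            /* sync_filesystem(sb) in the mail */
    if (!model_freeze_to(sb, SB_FREEZE_FS))
        goto fail;
    return true;
fail:
    sb->frozen = SB_UNFROZEN;
    return false;
}
```

The point of the model is the ordering: new writers are shut out before the
sync, so once model_freeze() succeeds a later model_fsync() finds nothing
dirty, which is the "should have nothing to do" case above.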

> Anyway, I've pulled the fscker out of ->aio_write, ->write and ->splice_write;
> on that pathway it's in the do_splice_from() (see vfs.git#experimental).
>
> ... and now, for something *really* nasty: where do mandatory file locks
> belong in the locking hierarchy? Relative to fsfreeze one, for starters,
> but both for unionmount and overlayfs we need to decide where they live
> relative to ->i_mutex on directories.
Hum, interesting question :). Relative to fsfreeze, it doesn't seem to
matter much, does it? We don't actually lock / unlock these from write
paths needing freeze protection, we only wait for them. But maybe I'm
missing some ugly case you have in mind.

Honza
--
Jan Kara <jack@xxxxxxx>
SUSE Labs, CR