Re: [PATCH 2/9] vfs: export do_splice_direct() to modules

From: Jan Kara
Date: Tue Mar 19 2013 - 16:54:18 EST


On Mon 18-03-13 23:01:03, Al Viro wrote:
> On Mon, Mar 18, 2013 at 09:53:34PM +0000, Al Viro wrote:
> > On Mon, Mar 18, 2013 at 04:39:36PM +0100, Jan Kara wrote:
> > > IMO the deadlock is real. In freeze_super() we wait for all writers to
> > > the filesystem to finish while blocking beginning of any further writes. So
> > > we have a deadlock scenario like:
> > >
> > > THREAD1                 THREAD2                       THREAD3
> > > mnt_want_write()        mutex_lock(&inode->i_mutex);
> > > ...                                                   freeze_super()
> > > block on mutex_lock(&inode->i_mutex)
> > >                                                         sb_wait_write(sb, SB_FREEZE_WRITE);
> > >                         block in sb_start_write()
> >
> > The bug is on fsfreeze side and this is not the only problem related to it.
> > I've missed the implications when I applied "fs: Add freezing handling
> > to mnt_want_write() / mnt_drop_write()" last June ;-/
> >
> > The thing is, until then mnt_want_write() used to be a counter; it could be
> > nested. Now any such nesting is a deadlock you've just described. This
> > is seriously wrong, IMO.
> >
> > BTW, having sb_start_write() buried in individual ->splice_write() is
> > asking for trouble; could you describe the rules for that? E.g. where
> > does it nest wrt filesystem-private locks? XFS iolock, for example...
>
> I'm looking at the existing callers and I really wonder if we ought to
> push sb_start_write() from ->splice_write()/->aio_write()/etc. into the
> callers.
Yeah, that should be OK.

> Something like file_start_write()/file_end_write(), with check for file
> being regular one might be a good starting point. As it is, copyup is
> really fucked both in unionmount and overlayfs...
Makes sense. I can do the patch. BTW, for months I've been trying to push
to you a patch which creates a function like file_start_write() that returns
EAGAIN if the file is open with O_NONBLOCK and the fs is frozen (this allows
me to solve a deadlock with BSD process accounting writing to a frozen fs).
After this change the patch will become trivial, so I'll add it to the series
and hopefully it won't get ignored.
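
To make the proposal concrete, here is a minimal userspace sketch of the
caller-side pattern discussed above. It is a single-threaded model, not
kernel code: model_sb, model_file, model_file_start_write(),
model_file_end_write(), model_write() and MODEL_WOULD_SLEEP are all
hypothetical names invented for illustration, and the real helpers may
well look different. It only shows the three behaviours mentioned in this
thread: the regular-file check, EAGAIN for O_NONBLOCK on a frozen fs, and
taking freeze protection in the caller rather than inside the write method.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins; none of these names are kernel API. */
struct model_sb {
	bool frozen;		/* a freeze is in progress or complete */
	int writers;		/* writers currently holding freeze protection */
};

struct model_file {
	struct model_sb *sb;
	bool is_regular;	/* stands in for S_ISREG(inode->i_mode) */
	bool nonblock;		/* stands in for f_flags & O_NONBLOCK */
};

/* Returned where real code would sleep until the fs is thawed. */
#define MODEL_WOULD_SLEEP 1

/* Take freeze protection for a write; only regular files are counted. */
static int model_file_start_write(struct model_file *f)
{
	if (!f->is_regular)
		return 0;
	if (f->sb->frozen)
		return f->nonblock ? -EAGAIN : MODEL_WOULD_SLEEP;
	f->sb->writers++;
	return 0;
}

/* Drop the protection taken by model_file_start_write(). */
static void model_file_end_write(struct model_file *f)
{
	if (!f->is_regular)
		return;
	assert(f->sb->writers > 0);
	f->sb->writers--;
}

/* Trivial write method used only to exercise the model. */
static int model_noop_write(struct model_file *f)
{
	(void)f;
	return 0;
}

/*
 * Caller-side pattern: protection is taken before the write method runs,
 * so it is always acquired outside any lock the method takes internally.
 */
static int model_write(struct model_file *f,
		       int (*do_write)(struct model_file *))
{
	int ret = model_file_start_write(f);

	if (ret)
		return ret;
	ret = do_write(f);
	model_file_end_write(f);
	return ret;
}
```

With the protection taken in one place by the caller, the ordering against
filesystem-private locks is fixed by construction instead of depending on
each individual write method.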

Honza
--
Jan Kara <jack@xxxxxxx>
SUSE Labs, CR