Re: [RFC PATCH v5 06/10] ovl: implement overlayfs' ->write_inode operation

From: Chengguang Xu
Date: Sun Dec 05 2021 - 09:08:15 EST


---- On Thursday, 2021-12-02 06:47:25 Amir Goldstein <amir73il@xxxxxxxxx> wrote ----
> On Wed, Dec 1, 2021 at 6:24 PM Chengguang Xu <cgxu519@xxxxxxxxxxxx> wrote:
> >
> > ---- On Wednesday, 2021-12-01 21:46:10 Jan Kara <jack@xxxxxxx> wrote ----
> > > On Wed 01-12-21 09:19:17, Amir Goldstein wrote:
> > > > On Wed, Dec 1, 2021 at 8:31 AM Chengguang Xu <cgxu519@xxxxxxxxxxxx> wrote:
> > > > > So the final solution to handle all the concerns looks like accurately
> > > > > marking the overlay inode dirty on modification and re-marking it dirty
> > > > > only for mmapped files in ->write_inode().
> > > > >
> > > > > Hi Miklos, Jan
> > > > >
> > > > > Do you agree with the new proposal above?
> > > > >
> > > >
> > > > Maybe you can still pull off a simpler version by remarking dirty only
> > > > writably mmapped upper AND inode_is_open_for_write(upper)?
> > >
> > > Well, if the inode is writably mapped, it must also be open for write, mustn't
> > > it? The VMA of the mapping will hold the file open. So remarking overlay inode
> > > dirty during writeback while inode_is_open_for_write(upper) looks like
> > > reasonably easy and presumably there won't be that many inodes open for
> > > writing for this to become big overhead?
>
> I think it should be ok and a good tradeoff of complexity vs. performance.

IMO, marking the inode dirty on write is relatively simple, so I think we can mark
the overlayfs inode dirty on every real write and, in ->write_inode(), re-mark it
dirty unconditionally only when the file is writably mmapped.
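
To illustrate the idea, here is a rough userspace model of that policy. It is
only a sketch, not kernel code: the struct and the model_* helpers are
hypothetical names made up for this example, and the real implementation would
have to go through the VFS dirty-inode machinery instead of a plain flag.

```c
/* Userspace model only: none of these names are kernel APIs. */
#include <stdbool.h>

struct model_ovl_inode {
	bool dirty;            /* overlay inode is considered dirty */
	bool upper_mmap_write; /* upper file currently has a writable mapping */
};

/* Called on every real modification (write, truncate, timestamp update...). */
static void model_mark_dirty(struct model_ovl_inode *oi)
{
	oi->dirty = true;
}

/*
 * Models ->write_inode(): after the upper inode has been synced, the overlay
 * inode stays dirty only if pages can still be dirtied through a writable
 * mapping, because those stores are not seen by the write path.
 */
static void model_write_inode(struct model_ovl_inode *oi)
{
	oi->dirty = oi->upper_mmap_write;
}
```

In this model an inode that was only written through write(2) becomes clean
after one writeback pass, while a writably mmapped inode stays dirty and is
visited again on the next sync.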


>
> > >
> > > > If I am not mistaken, if you always mark overlay inode dirty on ovl_flush()
> > > > of FMODE_WRITE file, there is nothing that can make upper inode dirty
> > > > after last close (if upper is not mmapped), so one more inode sync should
> > > > be enough. No?
> > >
> > > But we still need to catch other dirtying events like timestamp updates,
> > > truncate(2) etc. to mark overlay inode dirty. Not sure how reliably that
> > > can be done...
> > >
>
> Oh yeah, we have those as well :)
> All those cases should be covered by ovl_copyattr() that updates the
> ovl inode ctime/mtime, so always dirty in ovl_copyattr() should be good.

Currently ovl_copyattr() does not cover all the cases, so I think we still need to
carefully check every call site of mnt_want_write().


Thanks,
Chengguang



> I *think* the only case of ovl_copyattr() that should not dirty is in
> ovl_inode_init(), so need some special helper there.
>
> >
> > To be honest, I don't even fully understand the ->flush() logic in overlayfs.
> > Why should we open a new underlying file when calling ->flush()?
> > Is it still correct when the file was first opened on the lower layer and then copied up?
> >
>
> The semantics of flush() are far from uniform across filesystems.
> Most local filesystems do nothing on close.
> Most network filesystems flush dirty data only when a writer closes a file,
> not when a reader closes it.
> It is hard to imagine that applications rely on flush-on-close of
> rdonly fd behavior and I agree that flushing only if original fd was upper
> makes more sense, so I am not sure if it is really essential for
> overlayfs to open an upper rdonly fd just to do whatever the upper fs
> would have done on close of rdonly fd, but maybe there is no good
> reason to change this behavior either.
>
> Thanks,
> Amir.
>