Re: [RFC PATCH V6 0/7] implement containerized syncfs for overlayfs

From: Chengguang Xu
Date: Sat Nov 27 2021 - 04:29:06 EST


---- On Monday, 2021-11-22 11:00:31, Chengguang Xu <cgxu519@xxxxxxxxxxxx> wrote ----
> From: Chengguang Xu <charliecgxu@xxxxxxxxxxx>
>
> The current syncfs(2) syscall on overlayfs just calls sync_filesystem()
> on upper_sb, which synchronizes all dirty inodes in the upper filesystem
> regardless of which overlay instance owns each inode. In the container
> use case, where multiple containers share the same underlying upper
> filesystem, this has the following shortcomings.
>
> (1) Performance
> Synchronization is likely to be heavy because it also syncs inodes that
> do not belong to the target overlayfs.
>
> (2) Interference
> Unplanned synchronization will probably impact the IO performance of
> unrelated container processes on other overlayfs instances.
>
> This series tries to implement containerized syncfs for overlayfs so that
> only the dirty upper inodes that belong to the specific overlayfs instance
> are synced. This reduces the cost of synchronization and avoids seriously
> impacting the IO performance of unrelated processes.
>
> v1->v2:
> - Mark overlayfs' inode dirty itself instead of adding notification
> mechanism to vfs inode.
>
> v2->v3:
> - Introduce overlayfs' extra syncfs wait list to wait for target upper
> inodes in ->sync_fs.
>
> v3->v4:
> - Use wait_sb_inodes() to wait for syncing upper inodes.
> - Mark overlay inode dirty only when having upper inode and VM_SHARED
> flag in ovl_mmap().
> - Check upper i_state after checking upper mmap state
> in ovl_write_inode.
>
> v4->v5:
> - Add underlying inode dirtiness check after mnt_drop_write().
> - Handle both wait/no-wait mode of syncfs(2) in overlayfs' ->sync_fs().
>
> v5->v6:
> - Rebase to latest overlayfs-next tree.
> - Mark overlay inode dirty when it has upper instead of marking dirty on
> modification.
> - Trigger dirty page writeback in overlayfs' ->write_inode().
> - Mark overlay inode 'DONTCACHE' flag.
> - Delete overlayfs' ->writepages() and ->evict_inode() operations.


Hi Miklos,

Have you had time to look at this V6 series? I think this version fixes the issues
raised in your earlier feedback, and it passes fstests (generic/overlay cases).

I ran some long-running stress tests (tar & syncfs & diff, with and without copy-up) and
found no obvious problems. For syncfs with 1M clean upper inodes, about 1.3s of extra time
was spent waiting on scheduling. I guess this 1.3s will not have a significant impact on
a container instance in most cases. I also agree with Jack that we can start with this
approach and make improvements afterwards if any real users complain.
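
For anyone who wants to reproduce a syncfs timing like the one above, a minimal
userspace sketch follows. It is not the harness used for the numbers in this mail;
the helper name syncfs_elapsed_ms() is made up for illustration, and it only
measures the wall-clock time of a single syncfs(2) call on the filesystem that
contains the given path.

```c
#define _GNU_SOURCE             /* needed for the syncfs() declaration in glibc */
#include <assert.h>
#include <fcntl.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the patch series): time one syncfs(2)
 * call on the filesystem containing 'path'.
 * Returns elapsed wall-clock time in milliseconds, or -1.0 on error.
 */
static double syncfs_elapsed_ms(const char *path)
{
    struct timespec start, end;
    int fd, ret;

    assert(path != NULL);

    /* Any open fd on the target filesystem is enough for syncfs(2). */
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1.0;

    clock_gettime(CLOCK_MONOTONIC, &start);
    ret = syncfs(fd);
    clock_gettime(CLOCK_MONOTONIC, &end);
    close(fd);

    if (ret != 0)
        return -1.0;

    return (double)(end.tv_sec - start.tv_sec) * 1e3 +
           (double)(end.tv_nsec - start.tv_nsec) / 1e6;
}
```

Calling syncfs_elapsed_ms() on the merged directory of an overlay mount (whatever
path that is on the test machine) before and after applying the series would give
a rough comparison; the result depends heavily on how many dirty inodes the upper
filesystem holds at that moment.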



Thanks,
Chengguang


>
> Chengguang Xu (7):
> ovl: setup overlayfs' private bdi
> ovl: mark overlayfs inode dirty when it has upper
> ovl: implement overlayfs' own ->write_inode operation
> ovl: set 'DONTCACHE' flag for overlayfs inode
> fs: export wait_sb_inodes()
> ovl: introduce ovl_sync_upper_blockdev()
> ovl: implement containerized syncfs for overlayfs
>
> fs/fs-writeback.c | 3 ++-
> fs/overlayfs/inode.c | 5 +++-
> fs/overlayfs/super.c | 49 ++++++++++++++++++++++++++++++++-------
> fs/overlayfs/util.c | 1 +
> include/linux/writeback.h | 1 +
> 5 files changed, 48 insertions(+), 11 deletions(-)
>
> --
> 2.27.0
>
>