Re: [PATCH RESEND] fs: avoid iterate_supers to trigger call for sync on filesystem during mount

From: Jan Kara
Date: Thu Mar 09 2017 - 07:51:23 EST


On Wed 08-02-17 17:27:58, Vivek Trivedi wrote:
> It has been observed that apps may block in sys_sync for long time if there
> is parallel mount request for large size storage block device in a loaded
> environment.
>
> For example, sys_sync is reported to be hunged when a large size disk
> (e.g. 4TB/8TB disks) is connected. sys_mount may take time for reading large
> amount of metedata - free space accounting by reading bitmap during mount.
> The larger the volume, the larger the size of the bitmaps read during mount.
>
> During mount operation s_umount lock is taken by sys_mount. The lock is not
> released till the mount is completed. The sync() process keeps on waiting till
> s_umount is relased by sys_mount.
>
> Scenario
> Process1 Process2
> sys_sync() sys_mount()
> iterate_supers do_mount()
> ... vfs_kern_mount()
> mount_fs() (Filesystem_mount)
> ... mount_bdev()
> sget()
> alloc_super()
> down_write_nested(&s->s_umount, SINGLE_DEPTH_NESTING); => LOCK HELD by MOUNT
> ...
> down_read(&sb->s_umount); => WAITING FOR LOCK ...
> . ...
> . fill_super() -> time TAKING (as per filesystem)
> . up_write(&sb->s_umount); => LOCK RELEASE by mount process
> .
> . STUCKED TILL MOUNT is completed
>
> Since, the superblock gets added to the list during the mount path alloc_super,
> so the 'sb' is visible in the s_list. But the behaviour of waiting to sync() a
> filesystem which is not active yet, seems ambigous here.
>
> To avoid this issue, may be we should consider about having to check only
> the ACTIVE filesystem for doing operations from the superblock list.
> 'sb' is valid and referencable as long as part of the s_list and MS_ACTIVE is
> set after successful mount and cleared during the umount path from
> generic_shutdown_super.
>
> ---
> Fixed a typo and updated the condition in iterate_supers.

Changelog of a patch belongs under the diffstat so that it does not get
into the final commit log.

> Signed-off-by: Vivek Trivedi <t.vivek@xxxxxxxxxxx>
> Reviewed-by: Amit Sahrawat <a.sahrawat@xxxxxxxxxxx>
> ---
> fs/super.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/fs/super.c b/fs/super.c
> index b8b6a08..01711a4 100644
> --- a/fs/super.c
> +++ b/fs/super.c
> @@ -587,6 +587,10 @@ void iterate_supers(void (*f)(struct super_block *, void *), void *arg)
> list_for_each_entry(sb, &super_blocks, s_list) {
> if (hlist_unhashed(&sb->s_instances))
> continue;
> +
> + if (!(sb->s_flags & MS_ACTIVE))
> + continue;
> +

So I have two comments to this:

1) Using MS_BORN would be more appropriate here - and consistent with the
fact that we do check it in iterate_supers(), just under s_umount.

2) iterate_supers_type() should be updated as well to keep them consistent.

Otherwise the patch looks OK to me.

Honza
--
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR