Re: [RFC PATCH 19/35] ovl: readd reflink/copyfile/dedup support

From: Amir Goldstein
Date: Tue Apr 17 2018 - 16:31:51 EST


On Thu, Apr 12, 2018 at 6:08 PM, Miklos Szeredi <mszeredi@xxxxxxxxxx> wrote:
> Since set of arguments are so similar, handle in a common helper.
>
> Signed-off-by: Miklos Szeredi <mszeredi@xxxxxxxxxx>
> ---
> fs/overlayfs/file.c | 79 +++++++++++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 79 insertions(+)
>
> diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
> index 9670e160967e..39b1b73334ad 100644
> --- a/fs/overlayfs/file.c
> +++ b/fs/overlayfs/file.c
> @@ -352,6 +352,81 @@ long ovl_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
> return ret;
> }
>
> +enum ovl_copyop {
> + OVL_COPY,
> + OVL_CLONE,
> + OVL_DEDUPE,
> +};
> +
> +static ssize_t ovl_copyfile(struct file *file_in, loff_t pos_in,
> + struct file *file_out, loff_t pos_out,
> + u64 len, unsigned int flags, enum ovl_copyop op)
> +{
> + struct inode *inode_out = file_inode(file_out);
> + struct fd real_in, real_out;
> + const struct cred *old_cred;
> + int ret;
> +
> + ret = ovl_real_file(file_out, &real_out);
> + if (ret)
> + return ret;
> +
> + ret = ovl_real_file(file_in, &real_in);
> + if (ret) {
> + fdput(real_out);
> + return ret;
> + }
> +
> + old_cred = ovl_override_creds(file_inode(file_out)->i_sb);
> + switch (op) {
> + case OVL_COPY:
> + ret = vfs_copy_file_range(real_in.file, pos_in,
> + real_out.file, pos_out, len, flags);

Problem:
vfs_copy_file_range(ovl_lower_file, ovl_upper_file) on non samefs
will get -EXDEV from ovl_copy_file_range(), so will not fall back
to do_splice_direct().
We may be better off checking in_sb != out_sb and returning
-EOPNOTSUPP? not sure.


> + break;
> +
> + case OVL_CLONE:
> + ret = vfs_clone_file_range(real_in.file, pos_in,
> + real_out.file, pos_out, len);
> + break;
> +
> + case OVL_DEDUPE:
> + ret = vfs_dedupe_file_range_one(real_in.file, pos_in, len,
> + real_out.file, pos_out);

Problem:
real_out can be a readonly fd (for is_admin), so we will be deduping
the lower file.
I guess this problem is mitigated in current code by may_write_real().

How can we deal with that sort of "write leak" without patching
mnt_want_write_file()?

Thanks,
Amir.