Re: [PATCH v3 0/6] Composefs: an opportunistically sharing verified image filesystem

From: Giuseppe Scrivano
Date: Wed Jan 25 2023 - 08:11:37 EST


Amir Goldstein <amir73il@xxxxxxxxx> writes:

>> >
>> > Based on Alexander's explanation about the differences between overlayfs
>> > lookup vs. composefs lookup of a regular "metacopy" file, I just need to
>> > point out that the same optimization (lazy lookup of the lower data
>> > file on open)
>> > can be done in overlayfs as well.
>> > (*) currently, overlayfs needs to lookup the lower file also for st_blocks.
>> >
>> > I am not saying that it should be done or that Miklos will agree to make
>> > this change in overlayfs, but that seems to be the major difference.
>> > getxattr may have some extra cost depending on in-inode xattr format
>> > of erofs, but specifically, the metacopy getxattr can be avoided if this
>> > is a special overlayfs RO mount that is marked as EVERYTHING IS
>> > METACOPY.
>> >
>> > I don't expect you guys to now try to hack overlayfs and explore
>> > this path to completion.
>> > My expectation is that this information will be clearly visible to anyone
>> > reviewing future submission, e.g.:
>> >
>> > - This is the comparison we ran...
>> > - This is the reason that composefs gives better results...
>> > - It MAY be possible to optimize erofs/overlayfs to get to similar results,
>> > but we did not try to do that
>> >
>> > It is especially important IMO to get the ACK of both Gao and Miklos
>> > on your analysis, because remember than when this thread started,
>> > you did not know about the metacopy option and your main argument
>> > was saving the time it takes to create the overlayfs layer files in the
>> > filesystem, because you were missing some technical background on overlayfs.
>>
>> we knew about metacopy, which we already use in our tools to create
>> mapped image copies when idmapped mounts are not available, and also
>> knew about the other new features in overlayfs. For example, the
>> "volatile" feature which was mentioned in your
>> Overlayfs-containers-lpc-2020 talk, was only submitted upstream after
>> begging Miklos and Vivek for months. I had a PoC that I used and tested
>> locally and asked for their help to get it integrated at the file
>> system layer, using seccomp for the same purpose would have been more
>> complex and prone to errors when dealing with external bind mounts
>> containing persistent data.
>>
>> The only missing bit, at least from my side, was to consider an image
>> that contains only overlay metadata as something we could distribute.
>>
>
> I'm glad that I was able to point this out to you, because now the comparison
> between the overlayfs and composefs options is more fair.
>
>> I previously mentioned my wish of using it from a user namespace, the
>> goal seems more challenging with EROFS or any other block devices. I
>> don't know about the difficulty of getting overlay metacopy working in a
>> user namespace, even though it would be helpful for other use cases as
>> well.
>>
>
> There is no restriction of metacopy in user namespace.
> overlayfs needs to be mounted with -o userxattr and the overlay
> xattrs needs to use user.overlay. prefix.

if I specify both userxattr and metacopy=on then the mount ends up in
the following check:

if (config->userxattr) {
[...]
if (config->metacopy && metacopy_opt) {
pr_err("conflicting options: userxattr,metacopy=on\n");
return -EINVAL;
}
}

to me it looks like it was done on purpose to prevent metacopy from a
user namespace, but I don't know the reason for sure.

> w.r.t. the implied claim that composefs on-disk format is simple enough
> so it could be made robust enough to avoid exploits, I will remain
> silent and let others speak up, but I advise you to take cover,
> because this is an explosive topic ;)
>
> Thanks,
> Amir.