Re: [PATCH 0/3] ceph: minor fixes and encrypted snapshot names

From: Xiubo Li
Date: Sun Mar 06 2022 - 19:50:07 EST



On 3/5/22 10:56 PM, Luís Henriques wrote:
Jeff Layton <jlayton@xxxxxxxxxx> writes:

On Fri, 2022-03-04 at 16:26 +0000, Luís Henriques wrote:
Luís Henriques <lhenriques@xxxxxxx> writes:

Hi!

I'm sending another iteration of the encrypted snapshot names patch. This
patch assumes PR#45224 [1] to be merged as it adds support for the
alternate names.

Two notes:

1. Patch 0001 is just a small fix from another fscrypt patch. It's
probably better to simply squash it.

2. I'm not sure how easy it is to hit the UAF fixed by patch 0002. I can
reproduce it easily by commenting the code that adds the
DCACHE_NOKEY_NAME flag in patch 0003.
Obviously, immediately after sending this patchset I realized I failed to
mention a very (*VERY*) important note:

Snapshot names can not start with a '_'. I think the reason is related
with the 'long snapshot names', but I can't really remember the details
anymore. The point is that an encrypted snapshot name base64-encoded
*may* end-up starting with an '_' as we're using the base64-url variant.

I really don't know if it's possible to fix that. I guess that in that
case the user will get an error and fail to create the snapshot but he'll
be clueless because the reason. Probably a warning can be added to the
kernel logs, but maybe there are other ideas.


Ouch. Is that imposed by the MDS? It'd be best if we could remove that
limitation from it altogether if we can.
I do remember hitting this limitation in the past, but a quick grep didn't
show anything in the documentation about it. This seems to have been
added to the MDS a *long* time ago, with commit 068553473c82 ("mds: adjust
trace encoding, clean up snap naming") but (as usual) there aren't a lot
of details.

When making a snapshot and in MDS code:

10458   if (snapname.length() == 0 ||
10459       snapname[0] == '_') {
10460     respond_to_request(mdr, -CEPHFS_EINVAL);
10461     return;
10462   }


If we can't, then we might be able to get away with prepending all the
encrypted names with some legal characte. Then when we go to decrypt it
we just strip that off.
This is probably the best way to fix it, but it's worth trying to find
out the origins of this limitation. I do seem to remember some obscure
reasons, related with the long snap names (for which Xiubo has a patch),
which will start with '_'. But yeah I'll have to go dig deeper.

It will recognize the encrypted "_XYZ_${DIGIT}" snapshot name as the long snapshot name inherited from its parent snap realm, and will parse the "${DIGIT}" as an ino in other places.

Maybe in MDS we should fail the request only when snapshot name is in type of "_XYZ_${DIGIT}" instead of only "_XYZ", and in client side should also print one error or warn log about this ?

This why added the ceph PR[1] to tell the kclient current snapshot name is a long snap name in lssnap. So if we can forbid the snap shot name begin with '_' it will simple in kclient code to handle the long snap name, or it will be complex in both MDS and kclient.

[1] https://github.com/ceph/ceph/pull/45208

-- Xiubo


We could also consider changing the base64 routine to use something else
in lieu of '_' but that's more of a hassle.
Cheers,