Re: [PATCH v3 2/3] userfaultfd: UFFDIO_MOVE uABI

From: David Hildenbrand
Date: Tue Oct 24 2023 - 10:28:35 EST


On 23.10.23 20:56, Suren Baghdasaryan wrote:
On Mon, Oct 23, 2023 at 5:29 AM David Hildenbrand <david@xxxxxxxxxx> wrote:

Focusing on validate_remap_areas():

+
+static int validate_remap_areas(struct vm_area_struct *src_vma,
+ struct vm_area_struct *dst_vma)
+{
+ /* Only allow remapping if both have the same access and protection */
+ if ((src_vma->vm_flags & VM_ACCESS_FLAGS) != (dst_vma->vm_flags & VM_ACCESS_FLAGS) ||
+ pgprot_val(src_vma->vm_page_prot) != pgprot_val(dst_vma->vm_page_prot))
+ return -EINVAL;

Makes sense. I do wonder about pkey and friends and if we even have to
so anything special.

I don't see anything special done for mremap. Do you have something in mind?

Nothing concrete, not a pkey expert. But as there is indeed nothing pkey-special in the VMA, there is nothing we can really check for and/or adjust.

So let's assume this is fine.


+
+ /* Only allow remapping if both are mlocked or both aren't */
+ if ((src_vma->vm_flags & VM_LOCKED) != (dst_vma->vm_flags & VM_LOCKED))
+ return -EINVAL;
+
+ if (!(src_vma->vm_flags & VM_WRITE) || !(dst_vma->vm_flags & VM_WRITE))
+ return -EINVAL;

Why does one of both need VM_WRITE? If one really needs it, then the
destination (where we're moving stuff to).

As you noticed later, both should have VM_WRITE.

Can you comment why? Just a simplification for now? Would be good to add that comment in the code as well.

/* For now, we keep it simple and only move between writable VMAs. */

+ */
+ if (!dst_vma->vm_userfaultfd_ctx.ctx &&
+ !src_vma->vm_userfaultfd_ctx.ctx)
+ return -EINVAL;



+
+ /*
+ * FIXME: only allow remapping across anonymous vmas,
+ * tmpfs should be added.
+ */
+ if (!vma_is_anonymous(src_vma) || !vma_is_anonymous(dst_vma))
+ return -EINVAL;

Why a FIXME here? Just drop the comment completely or replace it with
"We only allow to remap anonymous folios accross anonymous VMAs".

Will do. I guess Andrea had plans to cover tmpfs as well.


That is rather future work (or what's to fix here?) and better documented in the cover letter.

Having thought about VMA checks, I do wonder if we want to just block some VM_ flags right at the beginning (VM_IO,VM_PFNMAP,VM_HUGETLB,...). That might be covered by some other checks here implicitly, but I'm not 100% sure if that's always the case. An explicit list as in vma_ksm_compatible() might be clearer.

Further, I wonder if we have to block VM_SHADOW_STACK; we certainly don't want to let users modify the shadow stack by moving modified target pages into place. But this might already be covered by earlier checks (vm_page_prot? but I didn't look up with which setting we ended up in the upstream version).

Cc'ing Rick: see "validate_remap_areas()" in [1]

[1] https://lkml.kernel.org/r/20231009064230.2952396-3-surenb@xxxxxxxxxx


--
Cheers,

David / dhildenb