Re: [RFC PATCH] userfaultfd: support control over mm of remote PIDs

From: David Hildenbrand
Date: Mon Sep 27 2021 - 16:11:36 EST


On 27.09.21 22:08, Nadav Amit wrote:


On Sep 27, 2021, at 10:06 AM, David Hildenbrand <david@xxxxxxxxxx> wrote:

On 27.09.21 12:19, Nadav Amit wrote:
On Sep 27, 2021, at 2:29 AM, David Hildenbrand <david@xxxxxxxxxx> wrote:

On 26.09.21 19:06, Nadav Amit wrote:
From: Nadav Amit <namit@xxxxxxxxxx>
Non-cooperative mode is useful but only for forked processes.
Userfaultfd can be useful to monitor, debug and manage memory of remote
processes.
To support this mode, add a new flag, UFFD_REMOTE_PID, and an optional
second argument to the userfaultfd syscall. When the flag is set, the
second argument is assumed to be the PID of the process that is to be
monitored. Otherwise the flag is ignored.
The syscall enforces that the caller has CAP_SYS_PTRACE to prevent
misuse of this feature.

What is supposed to happen if the target process intends to use uffd itself?

Thanks for the quick response.
First, sorry that I mistakenly dropped the changes to userfaultfd.h
that define UFFD_REMOTE_PID.

Didn't even notice it :)

As for your question: there are standard ways to deal with such cases,
similarly to when a debugged program wants to use PTRACE. One way is
to block the userfaultfd syscall, using seccomp. Another way is to do
chaining using ptrace (although using ptrace for anything is
challenging).
It is also possible to tailor something specific to userfaultfd,
but I think seccomp is a good enough solution. I am open to suggestions.

If we have something already in place to handle PTRACE, we'd better reuse what's already there. Thanks!

Just to ensure we are on the same page: I meant that this is usually
left for the user application to handle. The 2 basic solutions are to
not expose userfaultfd to the monitored process (easy using seccomp)
or to chain the two monitors (hard using ptrace).
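
For illustration, the seccomp option could look roughly like the sketch below.
It is not part of the patch. The helper names deny_userfaultfd() and
userfaultfd_refused() are made up for this example, and EPERM is an arbitrary
choice of error code. The filter assumes a single syscall ABI; a real filter
should also check seccomp_data.arch before trusting the syscall number.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper: after this returns 0, userfaultfd(2) fails with
 * EPERM in the calling thread and in anything it later forks or execs.
 */
static int deny_userfaultfd(void)
{
	struct sock_filter filter[] = {
		/* Load the syscall number from the seccomp data. */
		BPF_STMT(BPF_LD | BPF_W | BPF_ABS,
			 offsetof(struct seccomp_data, nr)),
		/* userfaultfd: fall through to the ERRNO return; else skip it. */
		BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_userfaultfd, 0, 1),
		BPF_STMT(BPF_RET | BPF_K,
			 SECCOMP_RET_ERRNO | (EPERM & SECCOMP_RET_DATA)),
		/* Every other syscall is allowed. */
		BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
	};
	struct sock_fprog prog = {
		.len = sizeof(filter) / sizeof(filter[0]),
		.filter = filter,
	};

	/* Needed to install a filter without CAP_SYS_ADMIN. */
	if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0))
		return -1;
	return prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog);
}

/* Hypothetical probe: returns 1 if userfaultfd(2) is refused with EPERM. */
static int userfaultfd_refused(void)
{
	long fd = syscall(__NR_userfaultfd, 0);

	if (fd >= 0) {
		close((int)fd);
		return 0;
	}
	return errno == EPERM;
}
```

A launcher for the monitored program would call deny_userfaultfd() in the
child between fork() and execve(), since seccomp filters are inherited
across both.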

Yes, and I agree that the first approach then makes sense. Chaining might be way too complicated to support.

As long as the kernel continues working when a second one tries to register (which I think is the case), that should be good enough.


Since ptrace is hard, in theory we can have facilities to “hijack”
a context and “inject” a uffd event to another monitor. I just think
it is total overkill at this stage.

Agreed


--
Thanks,

David / dhildenb