Re: [patch 1/4] add basic task isolation prctl interface

From: Marcelo Tosatti
Date: Tue Jul 27 2021 - 09:06:59 EST


On Tue, Jul 27, 2021 at 02:38:15PM +0200, nsaenzju@xxxxxxxxxx wrote:
> Hi Marcelo,
>
> On Tue, 2021-07-27 at 08:00 -0300, Marcelo Tosatti wrote:
> > On Tue, Jul 27, 2021 at 12:48:33PM +0200, nsaenzju@xxxxxxxxxx wrote:
> > > On Tue, 2021-07-27 at 07:38 -0300, Marcelo Tosatti wrote:
> > > > +Isolation mode (PR_ISOL_MODE):
> > > > +------------------------------
> > > > +
> > > > +- PR_ISOL_MODE_NONE (arg4): no per-task isolation (default mode).
> > > > + PR_ISOL_EXIT sets mode to PR_ISOL_MODE_NONE.
> > > > +
> > > > +- PR_ISOL_MODE_NORMAL (arg4): applications can perform system calls normally,
> > > > + and in case of interruption events, the notifications can be collected
> > > > + by BPF programs.
> > > > + In this mode, if system calls are performed, deferred actions initiated
> > > > + by the system call will be executed before return to userspace.
> > > > +
> > > > +Other modes, which for example send signals upon interruptions events,
> > > > +can be implemented.
> > >
> > > Shouldn't this be a set of flags that enable specific isolation features?
> > > Something the likes of 'PR_ISOL_QUIESCE_ON_EXIT'. Modes seem more restrictive
> > > and too much of a commitment. If we merge MODE_NORMAL as is, we won't be able
> > > to tweak/extend its behaviour in the future.
> >
> > Hi Nicolas,
> >
> > Well, its assuming PR_ISOL_MODE_NORMAL means "enable all isolation
> > features on return to userspace".
> >
> > Later on, if desired, can add extend interface as follows (using
> > Christoph's idea to not perform automatic quiesce on return to
> > userspace, but expose which parts need quiescing
> > so userspace can do it on its own, as an example):
> >
> > #define PR_ISOL_QUIESCE_ON_EXIT (1<<0)
> > #define PR_ISOL_VSYSCALL_PAGE (1<<1)
> > ...
> >
> > unsigned long bitmap = PR_ISOL_VSYSCALL_PAGE;
> >
> > /* allow system calls */
> > prctl(PR_ISOL_SET, PR_ISOL_MODE, PR_ISOL_MODE_NORMAL, 0, 0, 0);
> >
> > /*
> > * disable quiescing on exit, enable reporting through
> > * vsyscall page
> > */
> > prctl(PR_ISOL_SET, PR_ISOL_FEATURES, &bitmap, 0, 0);
> > /*
> > * configure vsyscall page
> > */
> > prctl(PR_ISOL_VSYSCALLS, params, ...);
> >
> > So unless i am missing something, it is possible to tweak/extend the
> > interface. No?
>
> OK, sorry if I'm being thick, but what is the benefit of having a distincnt
> PR_ISOL_MODE instead expressing everything as PR_ISOL_FEATURES.
>
> PR_ISOL_MODE_NONE == Empty PR_ISOL_FEATURES bitmap
>
> PR_ISOL_MODE_NORMAL == Bitmap of commonly used PR_ISOL_FEATURES
> (we could introduce a define)
>
> PR_ISOL_MODE_NORMAL+PR_ISOL_VSYSCALLS == Custom bitmap
>
> Other than that, my rationale is that if you extend PR_ISOL_MODE_NORMAL's
> behaviour as new features are merged, wouldn't you be potentially breaking
> userspace (i.e. older applications might not like the new default)?
>
> --
> Nicolás Sáenz

The main reason is that PR_ISOL_MODE would allow for distinct
modes to be implemented (matching each use case). For example:

https://lwn.net/Articles/816298/

"When a task has finished its initialization, it can activate isolation
by using the PR_TASK_ISOLATION operation provided by the prctl()
system call. This operation may fail for either permanent or temporary
reasons. An example of a permanent error is when the task is set up
on a CPU without isolation; in this case, entering isolation mode
is not possible. Temporary errors are indicated by the EAGAIN error
code; examples include a time when the delayed workqueues could not be
stopped. In such cases, the task may retry the operation if it wants to
enter isolation, as it may succeed the next time.

In the prctl() call, the developer may also configure the signal to be
sent to the task when it loses isolation. The additional macro to use is
PR_TASK_ISOLATION_SET_SIG(), passing it the signal to send. The command
then becomes similar to the one in the example code:"