Re: [PATCH v6 6/9] kernel: entry: Support Syscall User Dispatch for common syscall entry

From: Christian Brauner
Date: Mon Sep 07 2020 - 06:15:32 EST


On Fri, Sep 04, 2020 at 04:31:44PM -0400, Gabriel Krisman Bertazi wrote:
> Syscall User Dispatch (SUD) must take precedence over seccomp, since the
> use case is emulation (it can be invoked with a different ABI) such that
> seccomp filtering by syscall number doesn't make sense in the first
> place. In addition, either the syscall is dispatched back to userspace,
> in which case there is no resource for seccomp to protect, or the

Tbh, I'm torn here. I'm not a super clever attacker but it feels to me
that this is still at least a clever way to circumvent a seccomp
sandbox.
If I were confined by a seccomp profile that would cause me to be
SIGKILLed when I try to open(), I could prctl() myself into user
dispatch to prevent that from happening, no?

> syscall will be executed, and seccomp will execute next.
>
> Regarding ptrace, I experimented with before and after, and while the
> same ABI argument applies, I felt it was easier to debug if I let ptrace
> happen for syscalls that are dispatched back to userspace. In addition,
> doing it after ptrace makes the code in syscall_exit_work slightly
> simpler, since it doesn't require special handling for this feature.
>
> Signed-off-by: Gabriel Krisman Bertazi <krisman@xxxxxxxxxxxxx>
> ---
> kernel/entry/common.c | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> diff --git a/kernel/entry/common.c b/kernel/entry/common.c
> index 44fd089d59da..fdb0c543539d 100644
> --- a/kernel/entry/common.c
> +++ b/kernel/entry/common.c
> @@ -6,6 +6,8 @@
> #include <linux/audit.h>
> #include <linux/syscall_intercept.h>
>
> +#include "common.h"
> +
> #define CREATE_TRACE_POINTS
> #include <trace/events/syscalls.h>
>
> @@ -47,6 +49,12 @@ static inline long do_syscall_intercept(struct pt_regs *regs)
> 	int sysint_work = READ_ONCE(current->syscall_intercept);
> 	int ret;
>
> +	if (sysint_work & SYSINT_USER_DISPATCH) {
> +		ret = do_syscall_user_dispatch(regs);
> +		if (ret == -1L)
> +			return ret;
> +	}
> +
> 	if (sysint_work & SYSINT_SECCOMP) {
> 		ret = __secure_computing(NULL);
> 		if (ret == -1L)
> --
> 2.28.0
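
To make the ordering question concrete, here is a minimal userspace model
of the hunk above. It is a sketch under stated assumptions, not kernel
code: the flag values, struct model_task, and every model_* function are
hypothetical names invented for illustration. Only the control flow (user
dispatch checked first, early return on -1, seccomp checked second) is
taken from the quoted patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical values; only the flag names come from the quoted patch. */
#define SYSINT_SECCOMP       0x1
#define SYSINT_USER_DISPATCH 0x2

/* Toy stand-in for the per-task state consulted on syscall entry. */
struct model_task {
	int sysint_work;       /* which interception steps are enabled */
	bool dispatch_to_user; /* would user dispatch claim this syscall? */
	bool seccomp_kills;    /* would the seccomp filter reject it? */
	bool seccomp_ran;      /* set once the seccomp step is reached */
};

/* Stand-in for do_syscall_user_dispatch(): -1 means "handled, stop". */
static long model_user_dispatch(const struct model_task *t)
{
	return t->dispatch_to_user ? -1L : 0L;
}

/* Stand-in for __secure_computing(): records that it was consulted. */
static long model_seccomp(struct model_task *t)
{
	t->seccomp_ran = true;
	return t->seccomp_kills ? -1L : 0L;
}

/* Same ordering as the patched do_syscall_intercept(). */
static long model_syscall_intercept(struct model_task *t)
{
	long ret;

	t->seccomp_ran = false;

	if (t->sysint_work & SYSINT_USER_DISPATCH) {
		ret = model_user_dispatch(t);
		if (ret == -1L)
			return ret; /* seccomp is never consulted */
	}

	if (t->sysint_work & SYSINT_SECCOMP) {
		ret = model_seccomp(t);
		if (ret == -1L)
			return ret;
	}

	return 0;
}

/* Usage: compare a task with and without user dispatch enabled. */
static void model_check_ordering(void)
{
	struct model_task t = {
		.sysint_work = SYSINT_SECCOMP | SYSINT_USER_DISPATCH,
		.dispatch_to_user = true,
		.seccomp_kills = true,
	};

	/* Dispatch claims the syscall, so the filter is skipped. */
	assert(model_syscall_intercept(&t) == -1L);
	assert(!t.seccomp_ran);

	/* With dispatch disabled, the same syscall reaches the filter. */
	t.sysint_work = SYSINT_SECCOMP;
	assert(model_syscall_intercept(&t) == -1L);
	assert(t.seccomp_ran);
}
```

In this model both paths return -1, but only the second one ever runs the
filter. Whether that amounts to a bypass depends on the point made in the
commit message: a dispatched syscall goes back to userspace instead of
being executed, and a syscall that is not dispatched still reaches seccomp.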