Re: [PATCH] kernel: Expose SYS_kcmp by default

From: Kees Cook
Date: Fri Feb 05 2021 - 13:39:56 EST


On Fri, Feb 05, 2021 at 04:37:52PM +0000, Chris Wilson wrote:
> Userspace has discovered the functionality offered by SYS_kcmp and has
> started to depend upon it. In particular, Mesa uses SYS_kcmp for
> os_same_file_description() in order to identify when two fd (e.g. device
> or dmabuf) point to the same struct file. Since they depend on it for
> core functionality, lift SYS_kcmp out of the non-default
> CONFIG_CHECKPOINT_RESTORE into the selectable syscall category.
>
> Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> Cc: Kees Cook <keescook@xxxxxxxxxxxx>
> Cc: Andy Lutomirski <luto@xxxxxxxxxxxxxx>
> Cc: Will Drewry <wad@xxxxxxxxxxxx>
> Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> Cc: Dave Airlie <airlied@xxxxxxxxx>
> Cc: Daniel Vetter <daniel@xxxxxxxx>
> Cc: Lucas Stach <l.stach@xxxxxxxxxxxxxx>
> ---
> init/Kconfig | 11 +++++++++++
> kernel/Makefile | 2 +-
> tools/testing/selftests/seccomp/seccomp_bpf.c | 2 +-
> 3 files changed, 13 insertions(+), 2 deletions(-)
>
> diff --git a/init/Kconfig b/init/Kconfig
> index b77c60f8b963..f62fca13ac5b 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -1194,6 +1194,7 @@ endif # NAMESPACES
> config CHECKPOINT_RESTORE
> bool "Checkpoint/restore support"
> select PROC_CHILDREN
> + select KCMP
> default n
> help
> Enables additional kernel features in a sake of checkpoint/restore.
> @@ -1737,6 +1738,16 @@ config ARCH_HAS_MEMBARRIER_CALLBACKS
> config ARCH_HAS_MEMBARRIER_SYNC_CORE
> bool
>
> +config KCMP
> + bool "Enable kcmp() system call" if EXPERT
> + default y

I would expect this not to be default-y, especially since
CHECKPOINT_RESTORE now does a "select" on it.

This is a really powerful syscall, but it is bounded by ptrace access
controls and uses pointer address obfuscation, so it may be okay to
expose it. As it is, at least Ubuntu already enables
CONFIG_CHECKPOINT_RESTORE, so in practice there's probably not much
difference in exposure.
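
For reference, here is a minimal sketch of the kind of userspace check
the commit message describes. The helper name same_file_description()
is hypothetical and this is not Mesa's actual code; it only assumes the
documented kcmp(2) interface with KCMP_FILE, where a return of 0 means
both fds refer to the same open file description.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/kcmp.h>   /* KCMP_FILE */
#include <sys/syscall.h>  /* SYS_kcmp */
#include <sys/types.h>    /* pid_t */
#include <unistd.h>       /* syscall(), getpid() */

/*
 * Hypothetical helper, for illustration only.
 *
 * Returns  1 if fd1 and fd2 refer to the same open file description,
 *          0 if they refer to different ones,
 *         -1 with errno set if kcmp() failed (for example ENOSYS when
 *            the syscall is not built in, EPERM when a seccomp or
 *            ptrace policy refuses it, or EBADF for an invalid fd).
 */
int same_file_description(int fd1, int fd2)
{
	pid_t pid = getpid();
	long ret = syscall(SYS_kcmp, pid, pid, KCMP_FILE,
			   (unsigned long)fd1, (unsigned long)fd2);

	if (ret < 0)
		return -1;

	/*
	 * kcmp() returns 0 for "equal". The values 1 and 2 give an
	 * ordering of the obfuscated pointers and 3 means "not equal,
	 * no ordering available"; all three mean "different" here.
	 */
	return ret == 0;
}
```

A caller has to treat -1 as "unknown" and fall back to some other
heuristic, which is exactly the case that arises on kernels built
without the syscall.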

So, if you drop the "default y", I'm fine with this.

-Kees

> + help
> + Enable the file descriptor comparison system call. It provides
> + user-space with the ability to compare two fd to see if they
> + point to the same file, and check other attributes.
> +
> + If unsure, say Y.
> +
> config RSEQ
> bool "Enable rseq() system call" if EXPERT
> default y
> diff --git a/kernel/Makefile b/kernel/Makefile
> index aa7368c7eabf..320f1f3941b7 100644
> --- a/kernel/Makefile
> +++ b/kernel/Makefile
> @@ -51,7 +51,7 @@ obj-y += livepatch/
> obj-y += dma/
> obj-y += entry/
>
> -obj-$(CONFIG_CHECKPOINT_RESTORE) += kcmp.o
> +obj-$(CONFIG_KCMP) += kcmp.o
> obj-$(CONFIG_FREEZER) += freezer.o
> obj-$(CONFIG_PROFILING) += profile.o
> obj-$(CONFIG_STACKTRACE) += stacktrace.o
> diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c b/tools/testing/selftests/seccomp/seccomp_bpf.c
> index 26c72f2b61b1..1b6c7d33c4ff 100644
> --- a/tools/testing/selftests/seccomp/seccomp_bpf.c
> +++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
> @@ -315,7 +315,7 @@ TEST(kcmp)
> ret = __filecmp(getpid(), getpid(), 1, 1);
> EXPECT_EQ(ret, 0);
> if (ret != 0 && errno == ENOSYS)
> - SKIP(return, "Kernel does not support kcmp() (missing CONFIG_CHECKPOINT_RESTORE?)");
> + SKIP(return, "Kernel does not support kcmp() (missing CONFIG_KCMP?)");
> }
>
> TEST(mode_strict_support)
> --
> 2.20.1
>

--
Kees Cook