Re: [PATCH v2 2/8] x86/vsyscall: Add a new vsyscall=xonly mode

From: Kees Cook
Date: Thu Jun 27 2019 - 13:26:58 EST


On Wed, Jun 26, 2019 at 09:45:03PM -0700, Andy Lutomirski wrote:
> With vsyscall emulation on, we still expose a readable vsyscall page
> that contains syscall instructions that validly implement the
> vsyscalls. We need this because certain dynamic binary
> instrumentation tools attempt to read the call targets of call
> instructions in the instrumented code. If the instrumented code
> uses vsyscalls, then the vsyscall page needs to contain readable
> code.
>
> Unfortunately, leaving readable memory at a deterministic address
> can be used to help various ASLR bypasses, so we gain some hardening
> value if we disallow vsyscall reads.
>
> Given how rarely the vsyscall page needs to be readable, add a
> mechanism to make the vsyscall page be execute only.
>
> Cc: Kees Cook <keescook@xxxxxxxxxxxx>
> Cc: Borislav Petkov <bp@xxxxxxxxx>
> Cc: Kernel Hardening <kernel-hardening@xxxxxxxxxxxxxxxxxx>
> Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Signed-off-by: Andy Lutomirski <luto@xxxxxxxxxx>

Reviewed-by: Kees Cook <keescook@xxxxxxxxxxxx>

-Kees

> ---
> .../admin-guide/kernel-parameters.txt | 7 +++-
> arch/x86/Kconfig | 33 ++++++++++++++-----
> arch/x86/entry/vsyscall/vsyscall_64.c | 16 +++++++--
> 3 files changed, 44 insertions(+), 12 deletions(-)
>
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index 0082d1e56999..be8c3a680afa 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -5100,7 +5100,12 @@
> targets for exploits that can control RIP.
>
> emulate [default] Vsyscalls turn into traps and are
> - emulated reasonably safely.
> + emulated reasonably safely. The vsyscall
> + page is readable.
> +
> + xonly Vsyscalls turn into traps and are
> + emulated reasonably safely. The vsyscall
> + page is not readable.
>
> none Vsyscalls don't work at all. This makes
> them quite hard to use for exploits but
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index 2bbbd4d1ba31..0182d2c67590 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -2293,23 +2293,38 @@ choice
> it can be used to assist security vulnerability exploitation.
>
> This setting can be changed at boot time via the kernel command
> - line parameter vsyscall=[emulate|none].
> + line parameter vsyscall=[emulate|xonly|none].
>
> On a system with recent enough glibc (2.14 or newer) and no
> static binaries, you can say None without a performance penalty
> to improve security.
>
> - If unsure, select "Emulate".
> + If unsure, select "Emulate execution only".
>
> config LEGACY_VSYSCALL_EMULATE
> - bool "Emulate"
> + bool "Full emulation"
> help
> - The kernel traps and emulates calls into the fixed
> - vsyscall address mapping. This makes the mapping
> - non-executable, but it still contains known contents,
> - which could be used in certain rare security vulnerability
> - exploits. This configuration is recommended when userspace
> - still uses the vsyscall area.
> + The kernel traps and emulates calls into the fixed vsyscall
> + address mapping. This makes the mapping non-executable, but
> + it still contains readable known contents, which could be
> + used in certain rare security vulnerability exploits. This
> + configuration is recommended when using legacy userspace
> + that still uses vsyscalls along with legacy binary
> + instrumentation tools that require code to be readable.
> +
> + An example of this type of legacy userspace is running
> + Pin on an old binary that still uses vsyscalls.
> +
> + config LEGACY_VSYSCALL_XONLY
> + bool "Emulate execution only"
> + help
> + The kernel traps and emulates calls into the fixed vsyscall
> + address mapping and does not allow reads. This
> + configuration is recommended when userspace might use the
> + legacy vsyscall area but support for legacy binary
> + instrumentation of legacy code is not needed. It mitigates
> + certain uses of the vsyscall area as an ASLR-bypassing
> + buffer.
>
> config LEGACY_VSYSCALL_NONE
> bool "None"
> diff --git a/arch/x86/entry/vsyscall/vsyscall_64.c b/arch/x86/entry/vsyscall/vsyscall_64.c
> index d9d81ad7a400..fedd7628f3a6 100644
> --- a/arch/x86/entry/vsyscall/vsyscall_64.c
> +++ b/arch/x86/entry/vsyscall/vsyscall_64.c
> @@ -42,9 +42,11 @@
> #define CREATE_TRACE_POINTS
> #include "vsyscall_trace.h"
>
> -static enum { EMULATE, NONE } vsyscall_mode =
> +static enum { EMULATE, XONLY, NONE } vsyscall_mode =
> #ifdef CONFIG_LEGACY_VSYSCALL_NONE
> NONE;
> +#elif defined(CONFIG_LEGACY_VSYSCALL_XONLY)
> + XONLY;
> #else
> EMULATE;
> #endif
> @@ -54,6 +56,8 @@ static int __init vsyscall_setup(char *str)
> if (str) {
> if (!strcmp("emulate", str))
> vsyscall_mode = EMULATE;
> + else if (!strcmp("xonly", str))
> + vsyscall_mode = XONLY;
> else if (!strcmp("none", str))
> vsyscall_mode = NONE;
> else
> @@ -357,12 +361,20 @@ void __init map_vsyscall(void)
> extern char __vsyscall_page;
> unsigned long physaddr_vsyscall = __pa_symbol(&__vsyscall_page);
>
> - if (vsyscall_mode != NONE) {
> + /*
> + * For full emulation, the page needs to exist for real. In
> + * execute-only mode, there is no PTE at all backing the vsyscall
> + * page.
> + */
> + if (vsyscall_mode == EMULATE) {
> __set_fixmap(VSYSCALL_PAGE, physaddr_vsyscall,
> PAGE_KERNEL_VVAR);
> set_vsyscall_pgtable_user_bits(swapper_pg_dir);
> }
>
> + if (vsyscall_mode == XONLY)
> + gate_vma.vm_flags = VM_EXEC;
> +
> BUILD_BUG_ON((unsigned long)__fix_to_virt(VSYSCALL_PAGE) !=
> (unsigned long)VSYSCALL_ADDR);
> }
> --
> 2.21.0
>

--
Kees Cook