Re: [RFC PATCH v2 1/1] x86/pkeys: update PKRU to enable pkey 0 before XSAVE

From: Ingo Molnar
Date: Fri Mar 22 2024 - 05:46:22 EST



* Aruna Ramakrishna <aruna.ramakrishna@xxxxxxxxxx> wrote:

> This patch is a workaround for a bug where the PKRU value is not
> restored to the init value before the signal handler is invoked.
>
> Problem description:
> Let's assume there's a multithreaded application that runs untrusted
> user code. Each thread has its stack/code protected by a non-zero pkey,
> and the PKRU register is set up such that only that particular non-zero
> pkey is enabled. Each thread also sets up an alternate signal stack to
> handle signals, which is protected by pkey zero. The pkeys man page
> documents that the PKRU will be reset to init_pkru when the signal
> handler is invoked, which means that pkey zero access will be enabled.
> But this reset happens after the kernel attempts to push fpu state
> to the alternate stack, which is not (yet) accessible by the kernel,
> which leads to a new SIGSEGV being sent to the application, terminating
> it.
>
> Enabling both the non-zero pkey (for the thread) and pkey zero (in
> userspace) will not work for us. We cannot have the alt stack writeable
> by all - the rationale here is that the code running in that thread
> (using a non-zero pkey) is untrusted and should not have access to the
> alternate signal stack (that uses pkey zero), to prevent the return
> address of a function from being changed. The expectation is that kernel
> should be able to set up the alternate signal stack and deliver the
> signal to the application even if pkey zero is explicitly disabled by
> the application. The signal handler accessibility should not be dictated
> by the PKRU value that the thread sets up.
>
> Solution:
> The PKRU register is managed by XSAVE, which means the sigframe contents
> must match the register contents - which is not the case here. We want
> the sigframe to contain the user-defined PKRU value (so that it is
> restored correctly from sigcontext) but the actual register must be
> reset to init_pkru so that the alt stack is accessible and the signal
> can be delivered to the application. It seems that the proper fix here
> would be to remove PKRU from the XSAVE framework and manage it
> separately, which is quite complicated. As a workaround, this patch does
> something like this:
>
> orig_pkru = rdpkru();
> wrpkru(init_pkru & orig_pkru);
> xsave_to_user_sigframe();
> put_user(pkru_sigframe_addr, orig_pkru)
>
> Signed-off-by: Aruna Ramakrishna <aruna.ramakrishna@xxxxxxxxxx>
> Helped-by: Dave Kleikamp <dave.kleikamp@xxxxxxxxxx>
> Helped-by: Prakash Sangappa <prakash.sangappa@xxxxxxxxxx>
> ---
> arch/x86/include/asm/fpu/signal.h | 3 +-
> arch/x86/include/asm/sighandling.h | 10 +++----
> arch/x86/kernel/fpu/signal.c | 44 ++++++++++++++++++++++++++----
> arch/x86/kernel/fpu/xstate.c | 13 +++++++++
> arch/x86/kernel/fpu/xstate.h | 1 +
> arch/x86/kernel/signal.c | 40 +++++++++++++++++++++------
> arch/x86/kernel/signal_32.c | 8 +++---
> arch/x86/kernel/signal_64.c | 9 +++---
> 8 files changed, 101 insertions(+), 27 deletions(-)

Ok, this looks a lot saner than the first patch.

A couple of requests:

1)

Please split out all the PKRU parameter passing interface changes into a
separate patch. Ie. split out patches that don't change any behavior, and
try to make the final feature-enabling (bug-fixing) patch as small and easy
to read as possible. Maybe even have 3 patches:

- function interface changes
- helper function additions
- behavioral changes to signal handler pkru context

2)

I do agree that isolation of sandboxed execution into a non-zero pkey might
make sense. But this really needs an actual testcase.

3)

The semantics you've implemented for sigaltstacks are not the only possible
ones. In principle, a signal handler with its own stack might want to have
its own key(s) enabled. In a way a Linux signal handler is a mini-thread
created on the fly, with its own stack and its own attributes. Some thought
& analysis should go into which way to go here, and the best path should be
chosen. Fixing the SIGSEGV you observed should be a happy side effect of
other worthwile improvements.

Thanks,

Ingo