Re: [PATCH v7 3/4] rseq: Make rseq work with protection keys

From: Mathieu Desnoyers
Date: Mon Jul 21 2025 - 09:25:45 EST


On 2025-07-18 05:01, Dmitry Vyukov wrote:
On Tue, 24 Jun 2025 at 11:17, Dmitry Vyukov <dvyukov@xxxxxxxxxx> wrote:
If an application registers rseq and later switches to a different pkey
protection such that struct rseq becomes inaccessible, then any
context switch will cause a failure in __rseq_handle_notify_resume()
when it attempts to read/write struct rseq and/or rseq_cs. Since context
switches are asynchronous and outside of the application's control
(not part of the restricted code scope), temporarily switch to a
pkey value that allows access to the default (0) pkey.

Signed-off-by: Dmitry Vyukov <dvyukov@xxxxxxxxxx>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxxxx>
Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: "Paul E. McKenney" <paulmck@xxxxxxxxxx>
Cc: Boqun Feng <boqun.feng@xxxxxxxxx>
Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Cc: Ingo Molnar <mingo@xxxxxxxxxx>
Cc: Borislav Petkov <bp@xxxxxxxxx>
Cc: Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>
Cc: "H. Peter Anvin" <hpa@xxxxxxxxx>
Cc: Aruna Ramakrishna <aruna.ramakrishna@xxxxxxxxxx>
Cc: x86@xxxxxxxxxx
Cc: linux-kernel@xxxxxxxxxxxxxxx
Fixes: d7822b1e24f2 ("rseq: Introduce restartable sequences system call")

Dave, can you please ack this patch? Ingo said he was waiting for your
review before taking this to -tip.

Are there any remaining concerns with this series? If not, Thomas,
Ingo, can you please take this to -tip tree?

Gentle ping. What needs to happen for this series to be merged?

This series looks OK from my perspective. I think the last piece that
was missing was to get a review from Dave Hansen.

Dave?

Thanks,

Mathieu



---
Changes in v7:
- Added Mathieu's Reviewed-by

Changes in v6:
- Added a comment to struct rseq with MPK rules

Changes in v4:
- Added Fixes tag

Changes in v3:
- Simplified control flow to always enable access to pkey 0

Changes in v2:
- Fixed typos and reworded the comment
---
include/uapi/linux/rseq.h | 4 ++++
kernel/rseq.c | 11 +++++++++++
2 files changed, 15 insertions(+)

diff --git a/include/uapi/linux/rseq.h b/include/uapi/linux/rseq.h
index c233aae5eac90..019fd248cf749 100644
--- a/include/uapi/linux/rseq.h
+++ b/include/uapi/linux/rseq.h
@@ -58,6 +58,10 @@ struct rseq_cs {
  * contained within a single cache-line.
  *
  * A single struct rseq per thread is allowed.
+ *
+ * If struct rseq or struct rseq_cs is used with Memory Protection Keys,
+ * then the assigned pkey should either be accessible whenever these structs
+ * are registered/installed, or they should be protected with pkey 0.
  */
 struct rseq {
 	/*
diff --git a/kernel/rseq.c b/kernel/rseq.c
index b7a1ec327e811..88fc8cb789b3b 100644
--- a/kernel/rseq.c
+++ b/kernel/rseq.c
@@ -10,6 +10,7 @@

 #include <linux/sched.h>
 #include <linux/uaccess.h>
+#include <linux/pkeys.h>
 #include <linux/syscalls.h>
 #include <linux/rseq.h>
 #include <linux/types.h>
@@ -424,11 +425,19 @@ static int rseq_ip_fixup(struct pt_regs *regs)
 void __rseq_handle_notify_resume(struct ksignal *ksig, struct pt_regs *regs)
 {
 	struct task_struct *t = current;
+	pkey_reg_t saved_pkey;
 	int ret, sig;

 	if (unlikely(t->flags & PF_EXITING))
 		return;

+	/*
+	 * Enable access to the default (0) pkey in case the thread has
+	 * currently disabled access to it and struct rseq/rseq_cs has
+	 * 0 pkey assigned (the only supported value for now).
+	 */
+	saved_pkey = enable_zero_pkey_val();
+
 	/*
 	 * regs is NULL if and only if the caller is in a syscall path. Skip
 	 * fixup and leave rseq_cs as is so that rseq_sycall() will detect and
@@ -441,9 +450,11 @@ void __rseq_handle_notify_resume(struct ksignal *ksig, struct pt_regs *regs)
 	}
 	if (unlikely(rseq_update_cpu_node_id(t)))
 		goto error;
+	write_pkey_val(saved_pkey);
 	return;

 error:
+	write_pkey_val(saved_pkey);
 	sig = ksig ? ksig->sig : 0;
 	force_sigsegv(sig);
 }
--
2.49.0.1143.g0be31eac6b-goog
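
For readers following the thread, the save/enable/restore pattern used in
the hunk above can be modelled in plain user-space C. This is only a sketch
under stated assumptions: the register is a simulated variable laid out like
x86 PKRU (two bits per key, access-disable and write-disable), and every
model_* name is hypothetical. The real enable_zero_pkey_val() and
write_pkey_val() helpers come from elsewhere in the series and are not shown
in this message, so their actual implementation may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a PKRU-style register: two bits per protection key. */
typedef uint32_t model_pkey_reg_t;

#define MODEL_PKEY_AD   0x1u	/* access-disable bit for a key */
#define MODEL_PKEY_WD   0x2u	/* write-disable bit for a key */
#define MODEL_PKEY_BITS 2

/* Stands in for the per-thread hardware register (assumption). */
static model_pkey_reg_t model_reg;

/* Both disable bits for @key, shifted into that key's position. */
static model_pkey_reg_t model_key_bits(int key)
{
	return (MODEL_PKEY_AD | MODEL_PKEY_WD) << (key * MODEL_PKEY_BITS);
}

/* Clear the disable bits of pkey 0 only; return the previous value. */
static model_pkey_reg_t model_enable_zero_pkey(void)
{
	model_pkey_reg_t saved = model_reg;

	model_reg = saved & ~model_key_bits(0);
	return saved;
}

/* Restore a previously saved register value. */
static void model_write_pkey(model_pkey_reg_t val)
{
	model_reg = val;
}

static int model_key_accessible(int key)
{
	return !((model_reg >> (key * MODEL_PKEY_BITS)) & MODEL_PKEY_AD);
}

/* Models touching struct rseq, assumed to live in pkey 0 memory. */
static int model_touch_rseq(void)
{
	return model_key_accessible(0) ? 0 : -1;
}

/* Same shape as the patched function: save, enable, work, restore. */
static int model_notify_resume(void)
{
	model_pkey_reg_t saved = model_enable_zero_pkey();
	int ret = model_touch_rseq();

	/* Restored on both the success and the error path. */
	model_write_pkey(saved);
	return ret;
}

static int model_demo(void)
{
	/* Hypothetical setup: the thread has disabled pkey 0 and pkey 1. */
	model_reg = model_key_bits(0) | model_key_bits(1);
	const model_pkey_reg_t before = model_reg;

	/* Without the temporary switch, the access is refused. */
	assert(model_touch_rseq() == -1);

	/* With the save/enable/restore pattern, it succeeds. */
	assert(model_notify_resume() == 0);

	/* The thread's own register value is back afterwards. */
	assert(model_reg == before);
	return 0;
}
```

Calling model_demo() from a small main() exercises the model. Note that only
the pkey 0 bits are cleared during the window, so keys the thread has
disabled for other reasons stay disabled while the rseq area is touched.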



--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com