Re: [RFC PATCH for 5.8 3/4] rseq: Introduce RSEQ_FLAG_RELIABLE_CPU_ID

From: Florian Weimer
Date: Tue Jul 07 2020 - 03:29:27 EST


* Mathieu Desnoyers:

> commit 93b585c08d16 ("Fix: sched: unreliable rseq cpu_id for new tasks")
> addresses an issue with cpu_id field of newly created processes. Expose
> a flag which can be used by user-space to query whether the kernel
> implements this fix.
>
> Considering that this issue can cause corruption of user-space per-cpu
> data updated with rseq, it is recommended that user-space detects
> availability of this fix by using the RSEQ_FLAG_RELIABLE_CPU_ID flag
> either combined with registration or on its own before using rseq.

Presumably, the intent is that glibc uses RSEQ_FLAG_RELIABLE_CPU_ID to
register the rseq area. That will surely prevent glibc itself from
activating rseq on broken kernels. But if another rseq library
performs registration and has not been updated to use
RSEQ_FLAG_RELIABLE_CPU_ID, we still end up with an active rseq area
(and incorrect CPU IDs from sched_getcpu in glibc). So further glibc
changes will be needed. I suppose we could block third-party rseq
registration by registering a hidden rseq area (not __rseq_abi)
first. But then the question is whether any of the third-party rseq
users expect the EINVAL error code from their failed registration.
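
To make the error-code concern concrete, here is a rough sketch of what
a registration attempt with the proposed flag might look like, and how
the caller would have to interpret the result. This is not glibc code.
The hyp_* names, the flag value, and the signature value are
placeholders I made up for illustration; the real flag value would come
from the kernel's uapi header if the patch were merged. The errno
interpretation assumes the current sys_rseq behavior: EINVAL for unknown
flags and for an attempt to register a second, different area; EBUSY
for re-registering the same area.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* ASSUMPTION: placeholder value for the flag proposed in this RFC. */
#define HYP_RSEQ_FLAG_RELIABLE_CPU_ID (1U << 1)
/* ASSUMPTION: illustrative signature value, not a required constant. */
#define HYP_RSEQ_SIG 0x53053053U

/* Local stand-in for the original 32-byte, 32-byte-aligned rseq area. */
struct hyp_rseq {
    uint32_t cpu_id_start;
    uint32_t cpu_id;
    uint64_t rseq_cs;
    uint32_t flags;
} __attribute__((aligned(32)));

enum hyp_rseq_outcome {
    HYP_RSEQ_OK,          /* registered; kernel knows the flag */
    HYP_RSEQ_SAME_AREA,   /* EBUSY: this area is already registered */
    HYP_RSEQ_REJECTED,    /* EINVAL: ambiguous, see comment below */
    HYP_RSEQ_UNSUPPORTED, /* ENOSYS: no rseq system call at all */
    HYP_RSEQ_OTHER
};

/* Map a raw syscall result to an outcome.  Note that EINVAL cannot
   distinguish "kernel does not know the flag" from "some other area
   is already registered for this thread".  */
enum hyp_rseq_outcome hyp_classify(long ret, int err)
{
    if (ret == 0)
        return HYP_RSEQ_OK;
    switch (err) {
    case EBUSY:  return HYP_RSEQ_SAME_AREA;
    case EINVAL: return HYP_RSEQ_REJECTED;
    case ENOSYS: return HYP_RSEQ_UNSUPPORTED;
    default:     return HYP_RSEQ_OTHER;
    }
}

/* Attempt registration of AREA for the calling thread with the
   proposed flag.  AREA must stay valid for the thread's lifetime.  */
enum hyp_rseq_outcome hyp_try_register(struct hyp_rseq *area)
{
#ifdef SYS_rseq
    long ret = syscall(SYS_rseq, area, (long) sizeof(*area),
                       (long) HYP_RSEQ_FLAG_RELIABLE_CPU_ID,
                       (long) HYP_RSEQ_SIG);
    return hyp_classify(ret, errno);
#else
    (void) area;
    return HYP_RSEQ_UNSUPPORTED;
#endif
}
```

A third-party library that registers without the flag never goes
through the flag check at all, which is why the glibc-side check alone
does not help, and why a blocking registration would surface to that
library as the same EINVAL it might already treat specially.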

The rseq registration state machine is quite tricky already, and the
need to use RSEQ_FLAG_RELIABLE_CPU_ID would make it even more
complicated. Even if we implemented all the changes, it's all going
to be essentially dead, untestable code in a few months, when the
broken kernels are out of circulation. It does not appear to be a
good investment to me.