Re: rseq + membarrier programming model

From: Mathieu Desnoyers
Date: Mon Dec 13 2021 - 14:19:46 EST


----- On Dec 13, 2021, at 1:47 PM, Florian Weimer fweimer@xxxxxxxxxx wrote:

> I've been studying Jann Horn's biased locking example:
>
> Re: [PATCH 0/4 POC] Allow executing code and syscalls in another address space
> <https://lore.kernel.org/linux-api/CAG48ez02UDn_yeLuLF4c=kX0=h2Qq8Fdb0cer1yN8atbXSNjkQ@xxxxxxxxxxxxxx/>
>
> It uses MEMBARRIER_CMD_PRIVATE_EXPEDITED_RSEQ as part of the biased lock
> revocation.
>
> How does this code know that the process has called
> MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED_RSEQ?

I won't speak for this code snippet in particular, but in general
issuing MEMBARRIER_CMD_PRIVATE_EXPEDITED_RSEQ from a thread which
belongs to a process which has not performed
MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED_RSEQ will result in
membarrier returning -EPERM. If the kernel is built without CONFIG_RSEQ
support, it will return -EINVAL:

membarrier_private_expedited():

        } else if (flags == MEMBARRIER_FLAG_RSEQ) {
                if (!IS_ENABLED(CONFIG_RSEQ))
                        return -EINVAL;
                if (!(atomic_read(&mm->membarrier_state) &
                      MEMBARRIER_STATE_PRIVATE_EXPEDITED_RSEQ_READY))
                        return -EPERM;
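
Seen from user space, these kernel return values show up as a -1 return
from the system call with errno set to EPERM or EINVAL. A minimal sketch
of how a caller might tell the cases apart follows. The enum and function
names are hypothetical, the raw syscall() invocation is just one way to
reach membarrier, and the UAPI headers are assumed to be from Linux 5.10
or later so that the RSEQ commands are defined:

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/membarrier.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical classification, for illustration only. */
enum rseq_fence_status {
	RSEQ_FENCE_OK,
	RSEQ_FENCE_NOT_REGISTERED,	/* EPERM: registration was not performed */
	RSEQ_FENCE_UNSUPPORTED,		/* EINVAL or ENOSYS: no kernel support */
	RSEQ_FENCE_OTHER_ERROR,
};

/* Map an errno value (0 meaning success) to a status. */
enum rseq_fence_status rseq_fence_classify(int err)
{
	switch (err) {
	case 0:
		return RSEQ_FENCE_OK;
	case EPERM:
		/* Note: a seccomp filter can also produce EPERM. */
		return RSEQ_FENCE_NOT_REGISTERED;
	case EINVAL:
	case ENOSYS:
		return RSEQ_FENCE_UNSUPPORTED;
	default:
		return RSEQ_FENCE_OTHER_ERROR;
	}
}

/* Issue the rseq fence once and classify the outcome. */
enum rseq_fence_status rseq_fence_try(void)
{
	long ret = syscall(__NR_membarrier,
			   MEMBARRIER_CMD_PRIVATE_EXPEDITED_RSEQ, 0, 0);

	return rseq_fence_classify(ret == 0 ? 0 : errno);
}
```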

If you want to create code which optionally depends on availability
of EXPEDITED_RSEQ membarrier, I suspect you will want to perform
registration from a library constructor, and keep track of registration
success/failure in a static variable within the library.
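
A minimal sketch of that pattern, assuming GCC or Clang (for the
constructor attribute) and UAPI headers from Linux 5.10 or later. The
mylib_ names are hypothetical and this is not taken from any existing
library:

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/membarrier.h>
#include <stdbool.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical library state: result of the one-time registration. */
static bool mylib_rseq_fence_registered;
static int mylib_rseq_fence_register_errno;

static int mylib_membarrier(int cmd, unsigned int flags, int cpu_id)
{
	return syscall(__NR_membarrier, cmd, flags, cpu_id);
}

/* Runs when the library is loaded, before the library is first used. */
__attribute__((constructor))
static void mylib_rseq_fence_init(void)
{
	if (mylib_membarrier(MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED_RSEQ,
			     0, 0) == 0)
		mylib_rseq_fence_registered = true;
	else
		mylib_rseq_fence_register_errno = errno;
}

/* Lets callers choose a code path that does not need the rseq fence. */
bool mylib_rseq_fence_available(void)
{
	return mylib_rseq_fence_registered;
}

/*
 * Issue the rseq fence. Returns 0 on success, or -1 with errno set.
 * If registration failed, report that failure without a system call.
 */
int mylib_rseq_fence(void)
{
	if (!mylib_rseq_fence_registered) {
		errno = mylib_rseq_fence_register_errno;
		return -1;
	}
	return mylib_membarrier(MEMBARRIER_CMD_PRIVATE_EXPEDITED_RSEQ, 0, 0);
}
```

Callers would check mylib_rseq_fence_available() once and select an
algorithm that does not rely on the rseq fence when it returns false.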

> Could it fall back to
> MEMBARRIER_CMD_GLOBAL instead?

No. CMD_GLOBAL does not issue the required rseq fence used by the
algorithm discussed. Also, CMD_GLOBAL has quite a few other shortcomings:
it takes a while to execute, and is incompatible with nohz_full kernels.
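
Since CMD_GLOBAL is not a substitute, code that wants to use the rseq
fence only when it is present could ask the kernel which commands it
supports with MEMBARRIER_CMD_QUERY, and avoid the fence-dependent
algorithm altogether otherwise. A rough sketch, with hypothetical
function names and assuming UAPI headers from Linux 5.10 or later:

```c
#define _GNU_SOURCE
#include <linux/membarrier.h>
#include <stdbool.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * True if a MEMBARRIER_CMD_QUERY result advertises both the rseq fence
 * command and its registration command. A negative value stands for a
 * failed query and is treated as "not supported".
 */
bool membarrier_mask_has_rseq_fence(long mask)
{
	const long needed = MEMBARRIER_CMD_PRIVATE_EXPEDITED_RSEQ |
			    MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED_RSEQ;

	return mask >= 0 && (mask & needed) == needed;
}

/* Query the running kernel. No registration is needed for the query. */
bool kernel_has_rseq_fence(void)
{
	long mask = syscall(__NR_membarrier, MEMBARRIER_CMD_QUERY, 0, 0);

	return membarrier_mask_has_rseq_fence(mask);
}
```

A successful query only says the commands exist; registration still has
to be performed before the fence can be issued.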

> Why is it that MEMBARRIER_CMD_GLOBAL
> does not require registration (the broader/more expensive barrier), but
> the more restricted versions do?

The more restricted versions (which require explicit registration) are
closely integrated with the Linux scheduler, and in some cases require
additional code to be executed when scheduling between threads which
belong to different processes, for instance for the "SYNC_CORE" membarrier,
which is useful for JITs:

static inline void membarrier_mm_sync_core_before_usermode(struct mm_struct *mm)
{
        if (current->mm != mm)
                return;
        if (likely(!(atomic_read(&mm->membarrier_state) &
                     MEMBARRIER_STATE_PRIVATE_EXPEDITED_SYNC_CORE)))
                return;
        sync_core_before_usermode();
}

Also, the "global-expedited" commands can generate IPIs which interrupt
the flow of threads running on behalf of a registered process. Therefore,
in order to make sure we do not add delays to real-time sensitive
applications, we made this registration "opt-in".

In order to make sure the programming model is the same for expedited
private/global plain/sync-core/rseq membarrier commands, we require that
each process perform a registration beforehand.

>
> Or put differently, why wouldn't we request
> MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED_RSEQ unconditionally at
> process start in glibc, once we start biased locking in a few places?

The registration of membarrier expedited can be either performed immediately
when the process starts, or later, possibly when there are other threads
running concurrently. Note however that the registration scheme has been
optimized for the scenario where it is called when a single thread is
running in the process (see sync_runqueues_membarrier_state()). Otherwise
we need to use the more heavyweight synchronize_rcu(). So my advice would
be to perform the membarrier expedited registration while the process
is still single-threaded if possible, rather than postpone this and
do it entirely lazily on first use, which may happen while other
threads are already running.

Thanks,

Mathieu

>
> Thanks,
> Florian

--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com