Re: [RFC PATCH v4 1/5] glibc: Perform rseq(2) registration at nptl init and thread creation

From: Mathieu Desnoyers
Date: Thu Nov 22 2018 - 10:33:24 EST


----- On Nov 22, 2018, at 10:21 AM, Florian Weimer fweimer@xxxxxxxxxx wrote:

> * Rich Felker:
>
>> On Thu, Nov 22, 2018 at 04:11:45PM +0100, Florian Weimer wrote:
>>> * Mathieu Desnoyers:
>>>
>>> > Thoughts ?
>>> >
>>> > /* Unregister rseq TLS from kernel. */
>>> > if (has_rseq && __rseq_unregister_current_thread ())
>>> >   abort();
>>> >
>>> > advise_stack_range (pd->stackblock, pd->stackblock_size, (uintptr_t) pd,
>>> >                     pd->guardsize);
>>> >
>>> > /* If the thread is detached free the TCB. */
>>> > if (IS_DETACHED (pd))
>>> >   /* Free the TCB. */
>>> >   __free_tcb (pd);
>>>
>>> Considering that we proceed to free the TCB, I really hope that all
>>> signals are blocked at this point. (I have not checked this, though.)
>>>
>>> Wouldn't this address your concern about access to the rseq area?
>>
>> I'm not familiar with glibc's logic here, but for other reasons, I
>> don't think freeing it is safe until the kernel task exit futex (set
>> via clone or set_tid_address) has fired. I would guess __free_tcb just
>> sets up for it to be reclaimable when this happens rather than
>> immediately freeing it for reuse.
>
> Right, but in case of user-supplied stacks, we actually free TLS memory
> at this point, so signals need to be blocked because the TCB is
> (partially) gone after that.

Unfortunately, disabling signals is not enough.

With rseq registered, the kernel accesses the rseq TLS area when returning to
user-space after _preemption_ of user-space, which can be triggered at any
point by an interrupt or a fault, even if signals are blocked.

So if there are cases where the TLS memory is freed while the thread is still
running, we _need_ to explicitly unregister rseq beforehand.
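
To illustrate the ordering, here is a minimal standalone sketch. It is not the
glibc patch itself. All sketch_* names, the signature value, and the locally
declared struct are assumptions made for illustration. Real code would use
struct rseq and RSEQ_FLAG_UNREGISTER from <linux/rseq.h>. Registration is
allowed to fail here (kernel without rseq, or a libc that already registered
its own area), in which case there is nothing to unregister.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical signature; the same value must be passed on register
   and unregister. */
#define SKETCH_RSEQ_SIG 0x53053053
/* Mirrors RSEQ_FLAG_UNREGISTER from <linux/rseq.h>. */
#define SKETCH_RSEQ_FLAG_UNREGISTER (1 << 0)

/* Local copy of the original 32-byte rseq ABI layout (assumption:
   declared here only to keep the sketch self-contained). */
struct sketch_rseq {
  uint32_t cpu_id_start;
  uint32_t cpu_id;
  uint64_t rseq_cs;
  uint32_t flags;
} __attribute__ ((aligned (32)));

/* Allocate a zeroed, suitably aligned rseq area standing in for TLS. */
static struct sketch_rseq *
sketch_rseq_alloc (void)
{
  void *p = NULL;
  if (posix_memalign (&p, 32, sizeof (struct sketch_rseq)) != 0)
    return NULL;
  memset (p, 0, sizeof (struct sketch_rseq));
  return p;
}

/* Invoke rseq(2) for the current thread.  Returns 0 on success,
   -1 with errno set on failure. */
static int
sketch_rseq_call (struct sketch_rseq *area, int flags)
{
#ifdef __NR_rseq
  return (int) syscall (__NR_rseq, area, (uint32_t) sizeof (*area),
                        flags, SKETCH_RSEQ_SIG);
#else
  (void) area;
  (void) flags;
  errno = ENOSYS;
  return -1;
#endif
}

static int
sketch_rseq_register (struct sketch_rseq *area)
{
  area->cpu_id = (uint32_t) -1;   /* RSEQ_CPU_ID_UNINITIALIZED */
  return sketch_rseq_call (area, 0);
}

/* Teardown: unregister first, and only then release the memory.
   After a successful unregister the kernel no longer touches the area
   on return to user-space, so freeing it is safe even if the thread
   gets preempted afterwards. */
static int
sketch_thread_teardown (struct sketch_rseq *area, int has_rseq)
{
  if (has_rseq
      && sketch_rseq_call (area, SKETCH_RSEQ_FLAG_UNREGISTER) != 0)
    return -1;                    /* The proposed patch calls abort () here. */
  free (area);
  return 0;
}
```

Blocking signals around sketch_thread_teardown () would not change anything
in this picture: the only thing that makes the free () safe is that the
unregister call has completed before it.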

Thanks,

Mathieu

--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com