Re: [PATCH v2 1/4] sched/task: Add the put_task_struct_atomic_safe function

From: Valentin Schneider
Date: Fri Jan 27 2023 - 10:56:48 EST


On 23/01/23 14:24, Wander Lairson Costa wrote:
> On Mon, Jan 23, 2023 at 1:30 PM Oleg Nesterov <oleg@xxxxxxxxxx> wrote:
>>
>> On 01/20, Wander Lairson Costa wrote:
>> >
>> > +static inline void put_task_struct_atomic_safe(struct task_struct *task)
>> > +{
>> > + if (IS_ENABLED(CONFIG_PREEMPT_RT)) {
>> > + /*
>> > + * Decrement the refcount explicitly to avoid unnecessarily
>> > + * calling call_rcu.
>> > + */
>> > + if (refcount_dec_and_test(&task->usage))
>> > + /*
>> > + * under PREEMPT_RT, we can't call put_task_struct
>> > + * in atomic context because it will indirectly
>> > + * acquire sleeping locks.
>> > + */
>> > + call_rcu(&task->rcu, __delayed_put_task_struct);
>> ^^^^^^^^^
>> I am not sure the usage of task->rcu is safe...
>>
>> Suppose that, before __delayed_put_task_struct() is called by RCU, this task
>> does the last schedule and calls put_task_struct_rcu_user().
>>
>> And, can't we simply turn put_task_struct() into something like
>>
>> put_task_struct(struct task_struct *t)
>> {
>> if (refcount_dec_and_test(&t->usage)) {
>> if (IS_ENABLED(CONFIG_PREEMPT_RT)
>> && (in_atomic() || irqs_disabled()))
>> call_rcu(...);
>> else
>> __put_task_struct(t);
>> }
>> }
>>
>> ?
>
> Yeah, that was one approach I thought about. I chose to use an
> explicit function because I assumed calling __put_task_struct() from a
> non-preemptable context should be the exception, not the rule.

I'd tend to agree.

> Therefore (if I am correct in my assumption), it would make sense for
> only some call sites to pay the overhead price for it. But this is
> just a guess, and I have no evidence to support my claim.

My worry here is that it's easy to miss problematic callgraphs, and it's
potentially easy for new ones to creep in. Having a solution within
put_task_struct() itself would prevent that.

Another thing, if you look at release_task_stack(), it either caches the
outgoing stack for later use, or frees it via RCU (regardless of
PREEMPT_RT). Perhaps we could follow that and just always punt the freeing
of the task struct to RCU?