Re: [RFC][PATCH] exec: Conceal the other threads from wakeups during exec

From: Eric W. Biederman
Date: Fri Jul 31 2020 - 12:53:23 EST


Oleg Nesterov <oleg@xxxxxxxxxx> writes:

> Eric, I won't comment the intent, but I too do not understand this idea.
>
> On 07/30, Eric W. Biederman wrote:
>>
>> [This change requires more work to handle TASK_STOPPED and TASK_TRACED]
>
> Yes. And it is not clear to me how can you solve this.

I was imagining something putting TASK_STOPPED and TASK_TRACED in a loop
that verified they should be in that state before exiting so they could
handle spurious wake ups.

There are a many subtlties in that code, especially in the conversion
fo TASK_STOPPED to TASK_TRACED. So I suspect something more would be
required but I have not looked yet to see how tricky that would be.

>> [This adds a new lock ordering dependency siglock -> pi_lock -> rq_lock ]
>
> Not really, ttwu() can be safely called with siglock held and it takes
> pi_lock + rq_lock. Say, signal_wake_up().

Good point.

>> +int make_task_wakekill(struct task_struct *p)
>> +{
>> + unsigned long flags;
>> + int cpu, success = 0;
>> + struct rq_flags rf;
>> + struct rq *rq;
>> + long state;
>> +
>> + /* Assumes p != current */
>> + preempt_disable();
>> + /*
>> + * If we are going to change a thread waiting for CONDITION we
>> + * need to ensure that CONDITION=1 done by the caller can not be
>> + * reordered with p->state check below. This pairs with mb() in
>> + * set_current_state() the waiting thread does.
>> + */
>> + raw_spin_lock_irqsave(&p->pi_lock, flags);
>> + smp_mb__after_spinlock();
>> + state = p->state;
>> +
>> + /* FIXME handle TASK_STOPPED and TASK_TRACED */
>> + if ((state == TASK_KILLABLE) ||
>> + (state == TASK_INTERRUPTIBLE)) {
>> + success = 1;
>> + cpu = task_cpu(p);
>> + rq = cpu_rq(cpu);
>> + rq_lock(rq, &rf);
>> + p->state = TASK_WAKEKILL;
>
> You can only do this if the task was already deactivated. Just suppose it
> is preempted or does something like
>
> set_current_sate(TASK_INTERRUPTIBLE);
>
> if (CONDITION) {
> // make_task_wakekill() sets state = TASK_WAKEKILL
> __set_current_state(TASK_RUNNING);
> return;
> }
>
> schedule();

You are quite right.

So that bit of code would need to be:
if (!task->on_rq)
goto out;
if ((state == TASK_KILLABLE) ||
(state == TASK_INTERRUPTIBLE)) {
...

Eric