Re: [BUG] TASK_DEAD task is able to be woken up in special condition

From: Oleg Nesterov
Date: Fri Jan 06 2012 - 09:19:17 EST


On 01/06, Peter Zijlstra wrote:
>
> On Fri, 2012-01-06 at 21:01 +0900, Yasunori Goto wrote:
>
> > Do you mean the following patch?
>
> Yes, something like that. At that point ->state should be TASK_RUNNING
> (since we are after all running). The unlock_wait() will synchronize
> against any in-progress ttwu() while its fast path is a non-atomic
> compare. Any ttwu after this will bail since it will either observe
> TASK_RUNNING or TASK_DEAD, neither are a state it will act upon.
>
> Now the only question that remains is if we need the full memory barrier
> or if we can get away with less.
>
> I guess the mb separates the write to ->state (setting TASK_RUNNING)
> from the read of ->pi_lock. The remote CPU must see the TASK_RUNNING,
> and we must see ->pi_lock taken if it is.

Yes, I think we need the full mb, STORE vs LOAD.
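
To make the STORE vs LOAD point concrete, here is a toy store-buffer
model (a sketch, not kernel code). mem[0] stands for the exiting CPU's
->state store, mem[1] for the waker's ->pi_lock store, and each CPU
then loads the other's variable. The one-entry store buffer, the
helper names, and the assumption that the waker's lock acquisition
acts as a full barrier are all mine, for illustration only.

```c
#include <assert.h>
#include <stdbool.h>

/* One CPU: program counter, one buffered store, and the value it loaded. */
struct sb_cpu { int pc; bool buffered; int loaded; };
struct sb_machine { int mem[2]; struct sb_cpu cpu[2]; };

/* Drain CPU i's store buffer to memory; false if there was nothing to drain. */
static bool sb_flush(struct sb_machine *m, int i)
{
    if (!m->cpu[i].buffered)
        return false;
    m->mem[i] = 1;
    m->cpu[i].buffered = false;
    return true;
}

/* Program of CPU i: STORE own variable; optional full barrier; LOAD the other. */
static bool sb_step(struct sb_machine *m, int i, bool use_mb)
{
    struct sb_cpu *c = &m->cpu[i];

    switch (c->pc) {
    case 0:
        assert(!c->buffered);       /* each CPU issues exactly one store */
        c->buffered = true;         /* store sits in the buffer, not in memory */
        break;
    case 1:
        if (use_mb)
            sb_flush(m, i);         /* full barrier: store visible before the load */
        break;
    case 2:
        c->loaded = m->mem[1 - i];  /* load the other CPU's variable from memory */
        break;
    default:
        return false;               /* program finished */
    }
    c->pc++;
    return true;
}

/* Walk every interleaving; count runs in which both CPUs loaded the stale 0. */
static int sb_explore(struct sb_machine m, bool use_mb)
{
    int bad = 0;

    if (m.cpu[0].pc == 3 && m.cpu[1].pc == 3)
        return m.cpu[0].loaded == 0 && m.cpu[1].loaded == 0;

    for (int i = 0; i < 2; i++) {
        struct sb_machine n = m;
        if (sb_step(&n, i, use_mb))
            bad += sb_explore(n, use_mb);
        n = m;
        if (sb_flush(&n, i))        /* the hardware may also drain at any time */
            bad += sb_explore(n, use_mb);
    }
    return bad;
}

int count_both_stale(bool use_mb)
{
    struct sb_machine init = { {0, 0}, { {0, false, 0}, {0, false, 0} } };
    return sb_explore(init, use_mb);
}
```

In this model count_both_stale(false) finds interleavings where the
exiting CPU sees ->pi_lock free while the waker still sees the old
->state, and count_both_stale(true) finds none. A barrier that orders
only stores, or only loads, would not drain the buffer before the
load, which is why the full mb is the one that matters here.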

> > --- linux-3.2-rc7.orig/kernel/exit.c
> > +++ linux-3.2-rc7/kernel/exit.c
> > @@ -1038,6 +1038,10 @@ NORET_TYPE void do_exit(long code)
> >
> >  	preempt_disable();
> >  	exit_rcu();
> > +
> > +	smp_mb();
> > +	raw_spin_unlock_wait(&tsk->pi_lock);
> > +
> >  	/* causes final put_task_struct in finish_task_switch(). */
> >  	tsk->state = TASK_DEAD;
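
A small interleaving model may help show what the unlock_wait() buys
us. This is only a sketch under assumptions of mine: memory is
sequentially consistent (i.e. the mb is already in place), the exiting
task earlier set TASK_UNINTERRUPTIBLE and went back to TASK_RUNNING
without actually sleeping, and ttwu() is reduced to "lock, check
->state, set TASK_RUNNING, unlock". The struct and function names are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified task states; the names mirror the kernel's, the values do not. */
enum model_state { RUNNING, UNINTERRUPTIBLE, DEAD };

struct exit_model {
    enum model_state state; /* models tsk->state */
    bool pi_lock;           /* true while the waker holds tsk->pi_lock */
    int e, w;               /* next step of the exiting task / the waker */
    bool saw_sleeping;      /* waker's private result of its ->state check */
};

#define STEPS 4

/* Exiting task: state=UNINTERRUPTIBLE; state=RUNNING; unlock_wait; state=DEAD. */
static bool step_exiter(struct exit_model *s, bool use_unlock_wait)
{
    switch (s->e) {
    case 0: s->state = UNINTERRUPTIBLE; break;
    case 1: s->state = RUNNING; break;
    case 2:
        if (use_unlock_wait && s->pi_lock)
            return false;   /* spin until the waker drops ->pi_lock */
        break;
    case 3: s->state = DEAD; break;
    default: return false;  /* finished */
    }
    s->e++;
    return true;
}

/* Remote waker: lock; check ->state; wake if it looked asleep; unlock. */
static bool step_waker(struct exit_model *s)
{
    switch (s->w) {
    case 0: s->pi_lock = true; break;
    case 1: s->saw_sleeping = (s->state == UNINTERRUPTIBLE); break;
    case 2: if (s->saw_sleeping) s->state = RUNNING; break;
    case 3: s->pi_lock = false; break;
    default: return false;  /* finished */
    }
    s->w++;
    return true;
}

/* Walk every interleaving; count runs that do not end in DEAD. */
static int explore_exit(struct exit_model s, bool use_unlock_wait)
{
    int bad = 0;
    bool progressed = false;
    struct exit_model n;

    if (s.e == STEPS && s.w == STEPS)
        return s.state != DEAD;

    n = s;
    if (step_exiter(&n, use_unlock_wait)) {
        progressed = true;
        bad += explore_exit(n, use_unlock_wait);
    }
    n = s;
    if (step_waker(&n)) {
        progressed = true;
        bad += explore_exit(n, use_unlock_wait);
    }
    assert(progressed);     /* the model cannot deadlock */
    return bad;
}

int count_lost_dead(bool use_unlock_wait)
{
    struct exit_model init = { RUNNING, false, 0, 0, false };
    return explore_exit(init, use_unlock_wait);
}
```

Without the wait, count_lost_dead(false) is non-zero: the waker can
sample TASK_UNINTERRUPTIBLE, lose the CPU, and later overwrite
TASK_DEAD with TASK_RUNNING. With the wait, count_lost_dead(true) is
zero, because a waker that already holds ->pi_lock finishes before
TASK_DEAD is written, and any later one sees TASK_RUNNING or TASK_DEAD
and bails.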

Interesting. Initially I thought this is wrong and we should do

	raw_spin_unlock_wait(pi_lock);
	mb();
	tsk->state = TASK_DEAD;

This "obviously" serializes LOAD(pi_lock) and STORE(state).

But when I re-read your explanation above I think you are right:
mb() before unlock_wait() should work too, it just orders the load of
pi_lock against the earlier state = TASK_RUNNING store.

But this makes me worry. We are doing a lot of things after
exit_mm(). In particular we take tasklist_lock in exit_notify()
and then do_exit() takes task_lock(). But every unlock + lock
implies mb(). So how was it possible to hit this bug?

Oleg.
