[RFC BUG] There is a potential bug in "yield_to"

From: Michael Wang
Date: Thu Jul 05 2012 - 01:33:20 EST


Hi, All

I think I found a potential bug in "yield_to":

	local_irq_save(flags);
	rq = this_rq();

again:
	/* task's rq may already have been changed by "sched_move_task" */
	p_rq = task_rq(p);
	double_rq_lock(rq, p_rq);
	while (task_rq(p) != p_rq) {
		double_rq_unlock(rq, p_rq);
		goto again;
	}

I think it may happen in this scenario:

cpu 0                               cpu 1 (task a)

                                    yield_to {
                                        disable_irq
sched_move_task {                       rq = this_rq();
    task_rq_lock(task a)                double_rq_lock

    hold lock of rq 1
    set_task_rq //task rq changed
    release lock of rq 1

                                        hold lock of rq 1
                                        but task b no longer
                                        there

                                        set rq 1's current to
                                        skip which is not task a

This means we hold an rq's lock, but that rq's current task is not the
one that should do the yield.

Only "sched_move_task" can cause this issue, because it can move a task
that is still running.

The bug makes the task that wants to yield fail to set itself as the
skip buddy; an innocent task is marked instead. This is not very harmful
and is almost impossible to hit under normal conditions, but should we
fix it by adding another check, "rq == this_rq()"?
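
To make the scenario easier to discuss, here is a small single-threaded
userspace model of it. This is only a sketch, not kernel code: "struct rq"
and "struct task" are simplified stand-ins, and the names "model_yield_to",
"move_caller", "race_target", "race_bystander" and "extra_check" are all
hypothetical. It also assumes that the extra check would re-sample rq and
retry, which is just one possible reading of "rq == this_rq()".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's struct rq and task_struct. */
struct task;
struct rq {
	struct task *curr;	/* task currently running on this runqueue */
	struct task *skip;	/* task marked as the skip buddy */
};
struct task {
	struct rq *rq;		/* runqueue the task currently belongs to */
};

/* Hook that models a concurrent sched_move_task() in the race window. */
typedef void (*race_hook)(struct task *self);

static struct rq *race_target;		/* runqueue the caller is moved to */
static struct task *race_bystander;	/* innocent task left on the old rq */

/* Move the yielding task away; the bystander becomes the old rq's current. */
static void move_caller(struct task *self)
{
	assert(race_target != NULL && race_bystander != NULL);
	self->rq->curr = race_bystander;
	self->rq = race_target;
	race_target->curr = self;
}

/*
 * Model of the retry loop. rq is sampled once up front, like
 * rq = this_rq() in the snippet, and the hook fires once before the
 * locks would be taken. With extra_check set, the loop also retries
 * when the caller is no longer on rq (assumed form of the new check).
 * Returns the runqueue whose skip buddy was set.
 */
static struct rq *model_yield_to(struct task *self, struct task *p,
				 race_hook hook, bool extra_check)
{
	struct rq *rq = self->rq;	/* models rq = this_rq() */
	struct rq *p_rq;

	for (;;) {
		p_rq = p->rq;		/* models p_rq = task_rq(p) */
		if (hook) {
			hook(self);	/* the other CPU moves the caller */
			hook = NULL;	/* the race happens only once */
		}
		/* double_rq_lock(rq, p_rq) would be taken here */
		if (p->rq != p_rq)
			continue;	/* existing check: retry */
		if (extra_check && self->rq != rq) {
			rq = self->rq;	/* assumed extra check: re-sample */
			continue;
		}
		break;
	}
	rq->skip = rq->curr;	/* the skip buddy is always rq's current task */
	return rq;
}
```

Calling model_yield_to() with extra_check == false and the move_caller
hook marks the bystander on the old runqueue as the skip buddy, which is
the misbehaviour described above. With extra_check == true the model
retries and marks the caller on its new runqueue instead.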

Regards,
Michael Wang
