Re: [RFC BUG] There is a potential bug in "yield_to"

From: Peter Zijlstra
Date: Thu Jul 05 2012 - 04:35:48 EST


On Thu, 2012-07-05 at 13:31 +0800, Michael Wang wrote:
> Hi, All
>
> I found there may be a potential bug in "yield_to":
>
>         local_irq_save(flags);
>         rq = this_rq();
>
> again:
>         /* task's rq may have already changed in "sched_move_task" */
>         p_rq = task_rq(p);
>         double_rq_lock(rq, p_rq);
>         while (task_rq(p) != p_rq) {
>                 double_rq_unlock(rq, p_rq);
>                 goto again;
>         }
>
> I think it may happen in this scenario:
>
>     cpu 0                               cpu 1 (task a)
>
>                                         yield_to {
>                                              disable_irq
>     sched_move_task {                        rq = this_rq();
>          task_rq_lock(task a)                double_rq_lock
>
>          hold lock of rq 1
>          set_task_rq  //task rq changed
>          release lock of rq 1
>
>                                              hold lock of rq 1
>                                              but task b no longer
>                                              there
>
>                                              set rq 1's current to
>                                              skip which is not task a
>
> which means we hold an rq's lock, but its current task is not the one
> that should do the yield.
>
> Only "sched_move_task" can cause this issue, as it moves a task that
> is still running.
>
> The bug makes the task that wants to yield fail to set the skip buddy
> to itself and set it to an innocent task instead. It is not very
> harmful and almost impossible to hit in normal use, but should we fix
> it with another check, "rq == this_rq()"?

Uhm, what?!

We've got interrupts disabled, this_rq() cannot ever possibly change, so
rq is always correct.

Only p_rq can change, and we have an again loop on that, so what's the
problem again?
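
For illustration, the "again" loop can be sketched in userspace C. This is
only a rough model under stated assumptions, not the kernel code: struct rq,
struct task and lock_rq_pair() are hypothetical stand-ins, pthread mutexes
replace the rq spinlocks, and locking in address order is just this sketch's
way of avoiding deadlock. It also assumes that whoever changes p->rq does so
while holding the old runqueue's lock, which is what makes the recheck
meaningful.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical userspace stand-ins for the kernel's runqueue and task. */
struct rq {
        pthread_mutex_t lock;
};

struct task {
        _Atomic(struct rq *) rq;        /* may be changed by a concurrent mover */
};

/* Take both locks in address order so two callers cannot deadlock. */
static void double_rq_lock(struct rq *a, struct rq *b)
{
        if (a == b) {
                pthread_mutex_lock(&a->lock);
        } else if ((uintptr_t)a < (uintptr_t)b) {
                pthread_mutex_lock(&a->lock);
                pthread_mutex_lock(&b->lock);
        } else {
                pthread_mutex_lock(&b->lock);
                pthread_mutex_lock(&a->lock);
        }
}

static void double_rq_unlock(struct rq *a, struct rq *b)
{
        pthread_mutex_unlock(&a->lock);
        if (a != b)
                pthread_mutex_unlock(&b->lock);
}

/*
 * Lock this_rq and p's runqueue, retrying until p's runqueue is stable.
 * this_rq is fixed for the caller; only p's runqueue is re-read.
 * Returns the runqueue p was on when both locks were finally held.
 */
static struct rq *lock_rq_pair(struct rq *this_rq, struct task *p)
{
        struct rq *p_rq;

        assert(this_rq && p);
        for (;;) {
                p_rq = atomic_load(&p->rq);             /* unlocked snapshot */
                double_rq_lock(this_rq, p_rq);
                if (atomic_load(&p->rq) == p_rq)        /* recheck under lock */
                        return p_rq;
                double_rq_unlock(this_rq, p_rq);        /* p moved, retry */
        }
}
```

The caller passes its own runqueue once and never re-reads it; only the
snapshot of p's runqueue is revalidated, and the caller releases both locks
with double_rq_unlock(this_rq, p_rq) when done.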