Re: Bug in scheduler when using rt_mutex

From: Yong Zhang
Date: Tue Jan 18 2011 - 21:39:06 EST


On Tue, Jan 18, 2011 at 9:35 PM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> Subject: sched: Fix switch_to_fair()
> From: Peter Zijlstra <a.p.zijlstra@xxxxxxxxx>
> Date: Mon Jan 17 17:03:27 CET 2011
>
> When a task is placed back into fair_sched_class, we must update its
> placement, since we don't know how long it's been gone, hence its
> vruntime is stale and cannot be trusted.
>
> There is also a case where it was moved from fair_sched_class when it
> was in a blocked state and moved back while it is running, this causes
> an imbalance between DEQUEUE_SLEEP/DEQUEUE_WAKEUP for the fair class
> and leaves vruntime way out there (due to the min_vruntime
> adjustment).
>
> Also update sysrq-n to call the ->switch_{to,from} methods.
>
> Reported-by: Onkalo Samu <samu.p.onkalo@xxxxxxxxx>
> Signed-off-by: Peter Zijlstra <a.p.zijlstra@xxxxxxxxx>
> ---
>  kernel/sched.c      |    4 ++++
>  kernel/sched_fair.c |   16 ++++++++++++++++
>  2 files changed, 20 insertions(+)
>
> Index: linux-2.6/kernel/sched_fair.c
> ===================================================================
> --- linux-2.6.orig/kernel/sched_fair.c
> +++ linux-2.6/kernel/sched_fair.c
> @@ -4075,6 +4075,22 @@ static void prio_changed_fair(struct rq
>  static void switched_to_fair(struct rq *rq, struct task_struct *p,
>                               int running)
>  {
> +       struct sched_entity *se = &p->se;
> +       struct cfs_rq *cfs_rq = cfs_rq_of(se);
> +
> +       if (se->on_rq && cfs_rq->curr != se)

(cfs_rq->curr != se) is equivalent to (!running), no?

> +               __dequeue_entity(cfs_rq, se);
> +
> +       /*
> +        * se->vruntime can be completely out there, there is no telling
> +        * how long this task was !fair and on what CPU if any it became
> +        * !fair. Therefore, reset it to a known, reasonable value.
> +        */
> +       se->vruntime = cfs_rq->min_vruntime;

But this is not fair to a !SLEEP task.
As you know, se->vruntime -= cfs_rq->min_vruntime is done for a !SLEEP
task; then, after it goes through sched_fair --> sched_rt --> sched_fair
by some means, the current cfs_rq->min_vruntime is added back.

But here se is put before where it should be. Is this what we want?
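
To make the concern concrete, here is a toy userspace sketch of the
arithmetic only. It is not kernel code: the toy_* names and the numbers
are made up for illustration, and it assumes the task's offset from
min_vruntime is the only thing that matters.

```c
#include <stdint.h>

/*
 * Toy model, NOT the kernel implementation. All toy_* names are
 * hypothetical and exist only to illustrate the arithmetic.
 */

/*
 * Normalized round trip for a !SLEEP task: on the way out the old
 * min_vruntime is subtracted, on the way back in the current
 * min_vruntime is added, so the task keeps its offset from the queue.
 */
static uint64_t toy_requeue_normalized(uint64_t min_before,
                                       uint64_t vruntime,
                                       uint64_t min_after)
{
        uint64_t rel = vruntime - min_before;   /* relative vruntime */

        return min_after + rel;                 /* absolute again */
}

/*
 * What the patch does in switched_to_fair(): discard the old value
 * and restart from the current min_vruntime.
 */
static uint64_t toy_requeue_reset(uint64_t min_after)
{
        return min_after;
}
```

With min_vruntime == 1000 when the task left, se->vruntime == 1500, and
min_vruntime == 4000 when it comes back, the normalized path gives 4500
while the reset gives 4000, i.e. the task loses the 500 it was ahead by
and is queued earlier than it otherwise would be.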

Thanks,
Yong

--
Only stand for myself