Re: [PATCH 2/2] sched: use the old min_vruntime when normalizing on dequeue

From: Mike Galbraith
Date: Fri Oct 08 2010 - 02:57:24 EST


On Thu, 2010-10-07 at 14:00 -0700, Dima Zavin wrote:
> Mike,
>
> Thanks for the Ack for patch 1/2, could you take a look at this one too?

Ok, did that. I'd do it like below instead.

> Should I re-upload the series as v2 or you can pick the latest from
> patch 1 and take this one?

Peter's the merge point, I just help break stuff ;-)

I tested the below with pinned/unpinned mixes of sleepers and hogs, and
saw no ill effects. My thought on the logic is embedded in the comment.
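
To make the ordering issue concrete, here is a toy model of the two
variants. It is a sketch, not kernel code: struct toy_rq and all three
function names are hypothetical, vruntime is held in a signed 64-bit
integer for readability, and the model assumes that the dequeued entity
was leftmost with no ->curr, so that min_vruntime simply advances to the
next task's vruntime. It also assumes the enqueue side adds the
destination queue's min_vruntime back onto the relative value.

```c
#include <stdint.h>

/* Hypothetical, heavily simplified stand-in for the kernel's cfs_rq. */
struct toy_rq {
	int64_t min_vruntime;
};

/* Assumed behaviour: min_vruntime only moves forward, here to the next task. */
void toy_update_min_vruntime(struct toy_rq *rq, int64_t next_vruntime)
{
	if (next_vruntime > rq->min_vruntime)
		rq->min_vruntime = next_vruntime;
}

/* Old order: update min_vruntime first, then normalize against the new value. */
int64_t normalize_after_update(struct toy_rq *rq, int64_t se_vruntime,
			       int64_t next_vruntime)
{
	toy_update_min_vruntime(rq, next_vruntime);
	return se_vruntime - rq->min_vruntime;
}

/* Patched order: normalize against the old value, then update min_vruntime. */
int64_t normalize_before_update(struct toy_rq *rq, int64_t se_vruntime,
				int64_t next_vruntime)
{
	int64_t rel = se_vruntime - rq->min_vruntime;

	toy_update_min_vruntime(rq, next_vruntime);
	return rel;
}

/* Enqueue side: make the relative vruntime absolute on the destination queue. */
int64_t renormalize(const struct toy_rq *rq, int64_t rel)
{
	return rel + rq->min_vruntime;
}
```

Take an old queue with min_vruntime 1000, a departing entity at 1010, a
next task at 5000, and a destination queue with min_vruntime 2000. The
old order yields a relative value of -3990 and places the entity at
-1990, far behind the destination queue's minimum. The patched order
yields 10 and places it at 2010, so the lag it had at dequeue time is
what survives the move.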

From: Dima Zavin <dima@xxxxxxxxxxx>
Subject: [PATCH 2/2] sched: use the old min_vruntime when normalizing on dequeue
Date: Tue, 28 Sep 2010 23:46:14 -0700

After pulling the thread off the run-queue during a cgroup change,
the cfs_rq.min_vruntime gets recalculated. The dequeued thread's vruntime
then gets normalized to this new value. This can then lead to the thread
getting an unfair boost in the new group if the vruntime of the next
task in the old run-queue was way further ahead.

Cc: Arve Hjønnevåg <arve@xxxxxxxxxxx>
Signed-off-by: Dima Zavin <dima@xxxxxxxxxxx>
---
kernel/sched_fair.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)

Index: linux-2.6/kernel/sched_fair.c
===================================================================
--- linux-2.6.orig/kernel/sched_fair.c
+++ linux-2.6/kernel/sched_fair.c
@@ -826,15 +826,17 @@ dequeue_entity(struct cfs_rq *cfs_rq, st
 	if (se != cfs_rq->curr)
 		__dequeue_entity(cfs_rq, se);
 	account_entity_dequeue(cfs_rq, se);
-	update_min_vruntime(cfs_rq);
 
 	/*
-	 * Normalize the entity after updating the min_vruntime because the
-	 * update can refer to the ->curr item and we need to reflect this
-	 * movement in our normalized position.
+	 * Normalize vruntime prior to updating min_vruntime. Any motion
+	 * referring to ->curr will have been captured by update_curr() above.
+	 * We don't want to preserve what lag might become as a result of
+	 * this dequeue, we want to preserve what lag is at dequeue time.
 	 */
 	if (!(flags & DEQUEUE_SLEEP))
 		se->vruntime -= cfs_rq->min_vruntime;
+
+	update_min_vruntime(cfs_rq);
 }
 
 /*

