Re: CFS review

From: Ingo Molnar
Date: Sun Aug 12 2007 - 00:18:23 EST



* Al Boldi <a1426z@xxxxxxxxx> wrote:

> That's because granularity increases when decreasing nice, and results
> in larger timeslices, which affects smoothness negatively. chew.c
> easily shows this problem with 2 background cpu-hogs at the same
> nice-level.
>
> pid 908, prio 0, out for 8 ms, ran for 4 ms, load 37%
> pid 908, prio 0, out for 8 ms, ran for 4 ms, load 37%
> pid 908, prio 0, out for 8 ms, ran for 2 ms, load 26%
> pid 908, prio 0, out for 8 ms, ran for 4 ms, load 38%
> pid 908, prio 0, out for 2 ms, ran for 1 ms, load 47%
>
> pid 908, prio -5, out for 23 ms, ran for 3 ms, load 14%
> pid 908, prio -5, out for 17 ms, ran for 9 ms, load 35%

yeah. Incidentally, i refined this last week and those nice-level
granularity changes went into the upstream scheduler code a few days
ago:

commit 7cff8cf61cac15fa29a1ca802826d2bcbca66152
Author: Ingo Molnar <mingo@xxxxxxx>
Date: Thu Aug 9 11:16:52 2007 +0200

sched: refine negative nice level granularity

refine the granularity of negative nice level tasks: let them
reschedule more often to offset the effect of them consuming
their wait_runtime proportionately slower. (This makes nice-0
task scheduling smoother in the presence of negatively
reniced tasks.)

Signed-off-by: Ingo Molnar <mingo@xxxxxxx>

so could you please re-check chew jitter behavior with the latest
kernel? (i've attached the standalone patch below, it will apply cleanly
to rc2 too.)

when testing this, you might also want to try chew-max:

http://redhat.com/~mingo/cfs-scheduler/tools/chew-max.c

i added a few trivial enhancements to chew.c: it tracks the maximum
latency, latency fluctuations (noise of scheduling) and allows it to be
run for a fixed amount of time.

NOTE: if you run chew from any indirect terminal (xterm, ssh, etc.) it's
best to capture/report chew numbers like this:

./chew-max 60 > chew.log

otherwise the indirect scheduling activities of the chew printout will
disturb the numbers.

> It looks like the larger the granularity, the more unpredictable it
> gets, which probably means that this unpredictability exists even at
> smaller granularity but is only exposed with larger ones.

this should only affect non-default nice levels. Note that 99.9%+ of all
userspace Linux CPU time is spent on default nice level 0, and that is
what controls the design. So the approach was always to first get nice-0
right, and then to adjust the non-default nice level behavior too,
carefully, without hurting nice-0 - to refine all the workloads where
people (have to) use positive or negative nice levels. In any case,
please keep re-testing this so that we can adjust it.

Ingo

--------------------->
commit 7cff8cf61cac15fa29a1ca802826d2bcbca66152
Author: Ingo Molnar <mingo@xxxxxxx>
Date: Thu Aug 9 11:16:52 2007 +0200

sched: refine negative nice level granularity

refine the granularity of negative nice level tasks: let them
reschedule more often to offset the effect of them consuming
their wait_runtime proportionately slower. (This makes nice-0
task scheduling smoother in the presence of negatively
reniced tasks.)

Signed-off-by: Ingo Molnar <mingo@xxxxxxx>

diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c
index 7a632c5..e91db32 100644
--- a/kernel/sched_fair.c
+++ b/kernel/sched_fair.c
@@ -222,21 +222,25 @@ niced_granularity(struct sched_entity *curr, unsigned long granularity)
{
u64 tmp;

+ if (likely(curr->load.weight == NICE_0_LOAD))
+ return granularity;
/*
- * Negative nice levels get the same granularity as nice-0:
+ * Positive nice levels get the same granularity as nice-0:
*/
- if (likely(curr->load.weight >= NICE_0_LOAD))
- return granularity;
+ if (likely(curr->load.weight < NICE_0_LOAD)) {
+ tmp = curr->load.weight * (u64)granularity;
+ return (long) (tmp >> NICE_0_SHIFT);
+ }
/*
- * Positive nice level tasks get linearly finer
+ * Negative nice level tasks get linearly finer
* granularity:
*/
- tmp = curr->load.weight * (u64)granularity;
+ tmp = curr->load.inv_weight * (u64)granularity;

/*
* It will always fit into 'long':
*/
- return (long) (tmp >> NICE_0_SHIFT);
+ return (long) (tmp >> WMULT_SHIFT);
}

static inline void
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/