Re: [PATCH 5/6] sched/fair: Get rid of scaling utilization by capacity_orig

From: Yuyang Du
Date: Mon Sep 21 2015 - 05:06:14 EST


On Thu, Sep 17, 2015 at 12:38:25PM +0200, Peter Zijlstra wrote:
> On Fri, Sep 11, 2015 at 06:22:47PM +0100, Morten Rasmussen wrote:
>
> > While at it, should I include Yuyang's patch redefining the SCALE/SHIFT
> > mess?
>
> I suspect his patch will fail to compile on ARM which uses
> SCHED_CAPACITY_* outside of kernel/sched/*.
>
> But if you all (Ben, Yuyang, you) can agree on a patch simplifying these
> things I'm not opposed to it.

Yes, indeed. So SCHED_RESOLUTION_SHIFT has to be defined in include/linux/sched.h.

With this, I think the codes still need some cleanup, and importantly
documentation.

But first, I think as load_sum and load_avg can afford NICE_0_LOAD with either high
or low resolution. So we have no reason to have low resolution (10bits) load_avg
when NICE_0_LOAD has high resolution (20bits), because load_avg = runnable% * load,
as opposed to now we have load_avg = runnable% * scale_load_down(load).

We get rid of all scale_load_down() for runnable load average?

--

Subject: [PATCH] sched/fair: Generalize the load/util averages resolution
definition

The metric needs certain resolution to determine how much detail we
can look into (or not losing detail by integer rounding), which also
determines the range of the metrics.

For instance, to increase the resolution of [0, 1] (two levels), one
can multiply 1024 and get [0, 1024] (1025 levels).

In sched/fair, a few metrics depend on the resolution: load/load_avg,
util_avg, and capacity (frequency adjustment). In order to reduce the
risks to make mistakes relating to resolution/range, we therefore
generalize the resolution by defining a basic resolution constant
number, and then formalize all metrics by depending on the basic
resolution. The basic resolution is 1024 or (1 << 10). Further, one
can recursively apply the basic resolution to increase the final
resolution.

Pointed out by Ben Segall, NICE_0's weight (visible to user) and load
have independent resolution, but they must be well calibrated.

Signed-off-by: Yuyang Du <yuyang.du@xxxxxxxxx>
---
include/linux/sched.h | 9 ++++++---
kernel/sched/fair.c | 4 ----
kernel/sched/sched.h | 15 ++++++++++-----
3 files changed, 16 insertions(+), 12 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index bd38b3e..9b86f79 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -909,10 +909,13 @@ enum cpu_idle_type {
CPU_MAX_IDLE_TYPES
};

+# define SCHED_RESOLUTION_SHIFT 10
+# define SCHED_RESOLUTION_SCALE (1L << SCHED_RESOLUTION_SHIFT)
+
/*
* Increase resolution of cpu_capacity calculations
*/
-#define SCHED_CAPACITY_SHIFT 10
+#define SCHED_CAPACITY_SHIFT SCHED_RESOLUTION_SHIFT
#define SCHED_CAPACITY_SCALE (1L << SCHED_CAPACITY_SHIFT)

/*
@@ -1180,8 +1183,8 @@ struct load_weight {
* 1) load_avg factors frequency scaling into the amount of time that a
* sched_entity is runnable on a rq into its weight. For cfs_rq, it is the
* aggregated such weights of all runnable and blocked sched_entities.
- * 2) util_avg factors frequency and cpu scaling into the amount of time
- * that a sched_entity is running on a CPU, in the range [0..SCHED_LOAD_SCALE].
+ * 2) util_avg factors frequency and cpu capacity scaling into the amount of time
+ * that a sched_entity is running on a CPU, in the range [0..SCHED_CAPACITY_SCALE].
* For cfs_rq, it is the aggregated such times of all runnable and
* blocked sched_entities.
* The 64 bit load_sum can:
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 4df37a4..c61fd8e 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -2522,10 +2522,6 @@ static u32 __compute_runnable_contrib(u64 n)
return contrib + runnable_avg_yN_sum[n];
}

-#if (SCHED_LOAD_SHIFT - SCHED_LOAD_RESOLUTION) != 10 || SCHED_CAPACITY_SHIFT != 10
-#error "load tracking assumes 2^10 as unit"
-#endif
-
#define cap_scale(v, s) ((v)*(s) >> SCHED_CAPACITY_SHIFT)

/*
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 3845a71..31b4022 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -53,18 +53,23 @@ static inline void update_cpu_load_active(struct rq *this_rq) { }
* increased costs.
*/
#if 0 /* BITS_PER_LONG > 32 -- currently broken: it increases power usage under light load */
-# define SCHED_LOAD_RESOLUTION 10
-# define scale_load(w) ((w) << SCHED_LOAD_RESOLUTION)
-# define scale_load_down(w) ((w) >> SCHED_LOAD_RESOLUTION)
+# define SCHED_LOAD_SHIFT (SCHED_RESOLUTION_SHIFT + SCHED_RESOLUTION_SHIFT)
+# define scale_load(w) ((w) << SCHED_RESOLUTION_SHIFT)
+# define scale_load_down(w) ((w) >> SCHED_RESOLUTION_SHIFT)
#else
-# define SCHED_LOAD_RESOLUTION 0
+# define SCHED_LOAD_SHIFT (SCHED_RESOLUTION_SHIFT)
# define scale_load(w) (w)
# define scale_load_down(w) (w)
#endif

-#define SCHED_LOAD_SHIFT (10 + SCHED_LOAD_RESOLUTION)
#define SCHED_LOAD_SCALE (1L << SCHED_LOAD_SHIFT)

+/*
+ * NICE_0's weight (visible to user) and its load (invisible to user) have
+ * independent resolution, but they should be well calibrated. We use scale_load()
+ * and scale_load_down(w) to convert between them, the following must be true:
+ * scale_load(prio_to_weight[20]) == NICE_0_LOAD
+ */
#define NICE_0_LOAD SCHED_LOAD_SCALE
#define NICE_0_SHIFT SCHED_LOAD_SHIFT

--
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/