Re: [PATCH 2/4] sched/fair: Remove SCHED_LOAD_SHIFT and SCHED_LOAD_SCALE

From: Vincent Guittot
Date: Tue Oct 06 2015 - 05:29:54 EST


On 4 October 2015 at 19:56, Yuyang Du <yuyang.du@xxxxxxxxx> wrote:
> After cleaning up the sched metrics, these two definitions that cause
> ambiguity are not needed any more. Use NICE_0_LOAD_SHIFT and NICE_0_LOAD
> instead (the names suggest clearly who they are).
>
> Suggested-by: Ben Segall <bsegall@xxxxxxxxxx>
> Signed-off-by: Yuyang Du <yuyang.du@xxxxxxxxx>
> ---
> kernel/sched/fair.c | 4 ++--
> kernel/sched/sched.h | 22 +++++++++++-----------
> 2 files changed, 13 insertions(+), 13 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index c61fd8e..fdb7937 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -682,7 +682,7 @@ void init_entity_runnable_average(struct sched_entity *se)
> sa->period_contrib = 1023;
> sa->load_avg = scale_load_down(se->load.weight);
> sa->load_sum = sa->load_avg * LOAD_AVG_MAX;
> - sa->util_avg = scale_load_down(SCHED_LOAD_SCALE);
> + sa->util_avg = SCHED_CAPACITY_SCALE;
> sa->util_sum = sa->util_avg * LOAD_AVG_MAX;
> /* when this task enqueue'ed, it will contribute to its cfs_rq's load_avg */
> }
> @@ -6651,7 +6651,7 @@ static inline void calculate_imbalance(struct lb_env *env, struct sd_lb_stats *s
> if (busiest->group_type == group_overloaded &&
> local->group_type == group_overloaded) {
> load_above_capacity = busiest->sum_nr_running *
> - SCHED_LOAD_SCALE;
> + SCHED_CAPACITY_SCALE;

load_above_capacity is then compared against load_avg. In patch 3, you
use the weight directly instead of scale_load_down(weight) to compute
load_avg. That implies load_above_capacity must also move to the same
range, so you will have to replace SCHED_CAPACITY_SCALE with
NICE_0_LOAD.
This comment applies to patch 3, but it was easier to describe the
issue here, next to the code, than in patch 3, which has no reference
to this code.

So you had better use scale_load_down(NICE_0_LOAD) in this patch and
remove the scale_load_down() in patch 3, keeping only NICE_0_LOAD, so
that each patch is consistent on its own.
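
A minimal sketch of the range mismatch, for illustration only. It
assumes SCHED_RESOLUTION_SHIFT is 10, that SCHED_CAPACITY_SCALE is
1 << 10, and that the high-resolution branch (the one under "#if 0" in
the quoted sched.h) is enabled; with the currently enabled branch,
scale_load() is the identity and both ranges coincide. The helper names
are hypothetical, and load_avg is simplified to the sum of the weights
of always-running nice-0 tasks.

```c
#include <assert.h>

/* Assumed values; the macros mirror the high-resolution branch quoted above. */
#define SCHED_RESOLUTION_SHIFT	10
#define NICE_0_LOAD_SHIFT	(SCHED_RESOLUTION_SHIFT + SCHED_RESOLUTION_SHIFT)
#define NICE_0_LOAD		(1L << NICE_0_LOAD_SHIFT)
#define SCHED_CAPACITY_SCALE	(1L << SCHED_RESOLUTION_SHIFT)
#define scale_load_down(w)	((w) >> SCHED_RESOLUTION_SHIFT)

/* Hypothetical helper: threshold as computed in this patch (capacity range). */
static long threshold_capacity_range(long nr_running)
{
	assert(nr_running >= 0);
	return nr_running * SCHED_CAPACITY_SCALE;
}

/* Hypothetical helper: threshold in the load range (what patch 3 needs). */
static long threshold_load_range(long nr_running)
{
	assert(nr_running >= 0);
	return nr_running * NICE_0_LOAD;
}

/*
 * Hypothetical helper: simplified load_avg of nr_running always-running
 * nice-0 tasks when the weight is used without scale_load_down().
 */
static long load_avg_full_weight(long nr_running)
{
	assert(nr_running >= 0);
	return nr_running * NICE_0_LOAD;
}
```

Under these assumptions, two nice-0 tasks give a load_avg of 2 << 20
but a capacity-range threshold of only 2 << 10, so the comparison is
meaningful only once both sides use NICE_0_LOAD.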


> if (load_above_capacity > busiest->group_capacity)
> load_above_capacity -= busiest->group_capacity;

Here you will also have to move the capacity into the same range as
the load. So in patch 3 you will have to use
scale_load(busiest->group_capacity).
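
As a rough sketch of that subtraction with both operands in the load
range: the function name is hypothetical, SCHED_RESOLUTION_SHIFT is
assumed to be 10 with the high-resolution branch enabled, and the
"else" branch of the quoted code is not reproduced (the sketch simply
returns 0 there).

```c
#include <assert.h>

/* Assumed values; the macros mirror the high-resolution branch quoted above. */
#define SCHED_RESOLUTION_SHIFT	10
#define NICE_0_LOAD		(1L << (SCHED_RESOLUTION_SHIFT + SCHED_RESOLUTION_SHIFT))
#define SCHED_CAPACITY_SCALE	(1L << SCHED_RESOLUTION_SHIFT)
#define scale_load(w)		((w) << SCHED_RESOLUTION_SHIFT)

/*
 * Hypothetical helper: excess load of a group, computed entirely in the
 * load range. group_capacity is given in capacity units and is scaled
 * up with scale_load() before the comparison and the subtraction.
 */
static long excess_over_capacity(long sum_nr_running, long group_capacity)
{
	long load = sum_nr_running * NICE_0_LOAD;
	long capacity = scale_load(group_capacity);

	assert(sum_nr_running >= 0 && group_capacity >= 0);

	if (load > capacity)
		return load - capacity;

	return 0;	/* placeholder, not the kernel's else branch */
}
```

With three nice-0 tasks on a group of two full-capacity CPUs, the
excess under these assumptions is exactly one NICE_0_LOAD.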

Regards,
Vincent

> else
> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> index 31b4022..3d03956 100644
> --- a/kernel/sched/sched.h
> +++ b/kernel/sched/sched.h
> @@ -53,25 +53,25 @@ static inline void update_cpu_load_active(struct rq *this_rq) { }
> * increased costs.
> */
> #if 0 /* BITS_PER_LONG > 32 -- currently broken: it increases power usage under light load */
> -# define SCHED_LOAD_SHIFT (SCHED_RESOLUTION_SHIFT + SCHED_RESOLUTION_SHIFT)
> +# define NICE_0_LOAD_SHIFT (SCHED_RESOLUTION_SHIFT + SCHED_RESOLUTION_SHIFT)
> # define scale_load(w) ((w) << SCHED_RESOLUTION_SHIFT)
> # define scale_load_down(w) ((w) >> SCHED_RESOLUTION_SHIFT)
> #else
> -# define SCHED_LOAD_SHIFT (SCHED_RESOLUTION_SHIFT)
> +# define NICE_0_LOAD_SHIFT (SCHED_RESOLUTION_SHIFT)
> # define scale_load(w) (w)
> # define scale_load_down(w) (w)
> #endif
>
> -#define SCHED_LOAD_SCALE (1L << SCHED_LOAD_SHIFT)
> -
> /*
> - * NICE_0's weight (visible to user) and its load (invisible to user) have
> - * independent resolution, but they should be well calibrated. We use scale_load()
> - * and scale_load_down(w) to convert between them, the following must be true:
> - * scale_load(prio_to_weight[20]) == NICE_0_LOAD
> + * Task weight (visible to user) and its load (invisible to user) have
> + * independent resolution, but they should be well calibrated. We use
> + * scale_load() and scale_load_down(w) to convert between them. The
> + * following must be true:
> + *
> + * scale_load(prio_to_weight[USER_PRIO(NICE_TO_PRIO(0))]) == NICE_0_LOAD
> + *
> */
> -#define NICE_0_LOAD SCHED_LOAD_SCALE
> -#define NICE_0_SHIFT SCHED_LOAD_SHIFT
> +#define NICE_0_LOAD (1L << NICE_0_LOAD_SHIFT)
>
> /*
> * Single value that decides SCHED_DEADLINE internal math precision.
> @@ -850,7 +850,7 @@ DECLARE_PER_CPU(struct sched_domain *, sd_asym);
> struct sched_group_capacity {
> atomic_t ref;
> /*
> - * CPU capacity of this group, SCHED_LOAD_SCALE being max capacity
> + * CPU capacity of this group, SCHED_CAPACITY_SCALE being max capacity
> * for a single CPU.
> */
> unsigned int capacity;
> --
> 2.1.4
>