[tip:sched/core] sched: Make scale_rt invariant with frequency

From: tip-bot for Vincent Guittot
Date: Fri Mar 27 2015 - 07:41:51 EST


Commit-ID: b5b4860d1d61ddc5308c7d492cbeaa3a6e508d7f
Gitweb: http://git.kernel.org/tip/b5b4860d1d61ddc5308c7d492cbeaa3a6e508d7f
Author: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
AuthorDate: Fri, 27 Feb 2015 16:54:08 +0100
Committer: Ingo Molnar <mingo@xxxxxxxxxx>
CommitDate: Fri, 27 Mar 2015 09:36:01 +0100

sched: Make scale_rt invariant with frequency

The average running time of RT tasks is used to estimate the remaining compute
capacity for CFS tasks. This remaining capacity is the original capacity scaled
down by a factor (aka scale_rt_capacity). This estimation of available capacity
must also be invariant with frequency scaling.

A frequency scaling factor is applied on the running time of the RT tasks for
computing scale_rt_capacity.

In sched_rt_avg_update(), we now scale the RT execution time like below:

rq->rt_avg += rt_delta * arch_scale_freq_capacity() >> SCHED_CAPACITY_SHIFT

Then, scale_rt_capacity can be summarized by:

scale_rt_capacity = SCHED_CAPACITY_SCALE * available / total

with available = total - rq->rt_avg

This has been been optimized in current code by:

scale_rt_capacity = available / (total >> SCHED_CAPACITY_SHIFT)

But we can also developed the equation like below:

scale_rt_capacity = SCHED_CAPACITY_SCALE - ((rq->rt_avg << SCHED_CAPACITY_SHIFT) / total)

and we can optimize the equation by removing SCHED_CAPACITY_SHIFT shift in
the computation of rq->rt_avg and scale_rt_capacity().

so rq->rt_avg += rt_delta * arch_scale_freq_capacity()
and
scale_rt_capacity = SCHED_CAPACITY_SCALE - (rq->rt_avg / total)

arch_scale_frequency_capacity() will be called in the hot path of the scheduler
which implies to have a short and efficient function.

As an example, arch_scale_frequency_capacity() should return a cached value that
is updated periodically outside of the hot path.

Signed-off-by: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
Acked-by: Morten Rasmussen <morten.rasmussen@xxxxxxx>
Cc: Morten.Rasmussen@xxxxxxx
Cc: dietmar.eggemann@xxxxxxx
Cc: efault@xxxxxx
Cc: kamalesh@xxxxxxxxxxxxxxxxxx
Cc: linaro-kernel@xxxxxxxxxxxxxxxx
Cc: nicolas.pitre@xxxxxxxxxx
Cc: preeti@xxxxxxxxxxxxxxxxxx
Cc: riel@xxxxxxxxxx
Link: http://lkml.kernel.org/r/1425052454-25797-6-git-send-email-vincent.guittot@xxxxxxxxxx
Signed-off-by: Ingo Molnar <mingo@xxxxxxxxxx>
---
kernel/sched/fair.c | 17 +++++------------
kernel/sched/sched.h | 4 +++-
2 files changed, 8 insertions(+), 13 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 7f031e4..dc7c693 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6004,7 +6004,7 @@ unsigned long __weak arch_scale_cpu_capacity(struct sched_domain *sd, int cpu)
static unsigned long scale_rt_capacity(int cpu)
{
struct rq *rq = cpu_rq(cpu);
- u64 total, available, age_stamp, avg;
+ u64 total, used, age_stamp, avg;
s64 delta;

/*
@@ -6020,19 +6020,12 @@ static unsigned long scale_rt_capacity(int cpu)

total = sched_avg_period() + delta;

- if (unlikely(total < avg)) {
- /* Ensures that capacity won't end up being negative */
- available = 0;
- } else {
- available = total - avg;
- }
+ used = div_u64(avg, total);

- if (unlikely((s64)total < SCHED_CAPACITY_SCALE))
- total = SCHED_CAPACITY_SCALE;
+ if (likely(used < SCHED_CAPACITY_SCALE))
+ return SCHED_CAPACITY_SCALE - used;

- total >>= SCHED_CAPACITY_SHIFT;
-
- return div_u64(available, total);
+ return 1;
}

static void update_cpu_capacity(struct sched_domain *sd, int cpu)
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 4c95cc2..3600002 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1386,9 +1386,11 @@ static inline int hrtick_enabled(struct rq *rq)

#ifdef CONFIG_SMP
extern void sched_avg_update(struct rq *rq);
+extern unsigned long arch_scale_freq_capacity(struct sched_domain *sd, int cpu);
+
static inline void sched_rt_avg_update(struct rq *rq, u64 rt_delta)
{
- rq->rt_avg += rt_delta;
+ rq->rt_avg += rt_delta * arch_scale_freq_capacity(NULL, cpu_of(rq));
sched_avg_update(rq);
}
#else
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/