Re: [PATCH 3 2/6] sched: Record the current active power savings level

From: Vaidyanathan Srinivasan
Date: Thu Mar 19 2009 - 12:31:06 EST


* Gautham R Shenoy <ego@xxxxxxxxxx> [2009-03-18 14:52:28]:

> The current active power savings level of a system is defined as the
> maximum of the sched_mc_power_savings and the sched_smt_power_savings.
>
> The decisions during power-aware load balancing depend on this value.
>
> Record this value in a read-mostly global variable instead of having to
> compute it every time.
>
> Signed-off-by: Gautham R Shenoy <ego@xxxxxxxxxx>
> Cc: Peter Zijlstra <a.p.zijlstra@xxxxxxxxx>
> ---
>
> include/linux/sched.h | 1 +
> kernel/sched.c | 8 ++++++--
> kernel/sched_fair.c | 2 +-
> 3 files changed, 8 insertions(+), 3 deletions(-)
>
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 37fecf7..7dc8aea 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -793,6 +793,7 @@ enum powersavings_balance_level {
> };
>
> extern int sched_mc_power_savings, sched_smt_power_savings;

With this change, sched_mc_power_savings and sched_smt_power_savings
are needed only until the sched domains are rebuilt. Could they become
static variables, so that we can drop the extern declaration? Better
still, we could capture this information elsewhere until the sched
domains are built and the SD_POWERSAVINGS_BALANCE flags are set, so
that these global variables are not needed at all.
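
As a rough, untested illustration of the suggestion above, here is a
userspace sketch (not kernel code) in which the two per-knob values are
file-local and only the cached maximum is visible to other files. The
setter set_power_savings() and the helper wakeup_bias_enabled() are
hypothetical names invented for this sketch, and the enumerators other
than POWERSAVINGS_BALANCE_WAKEUP are assumed rather than taken from the
patch.

```c
/* Enumerators other than POWERSAVINGS_BALANCE_WAKEUP are assumptions. */
enum powersavings_balance_level {
	POWERSAVINGS_BALANCE_NONE = 0,
	POWERSAVINGS_BALANCE_BASIC,
	POWERSAVINGS_BALANCE_WAKEUP,
	MAX_POWERSAVINGS_BALANCE_LEVELS
};

/* File-local: no extern declaration is needed for these two. */
static int sched_mc_power_savings;
static int sched_smt_power_savings;

/* The only value other files read; recomputed on every store. */
enum powersavings_balance_level active_power_savings_level;

/*
 * Hypothetical setter standing in for sched_power_savings_store().
 * Returns 0 on success, -1 if the level is out of range.
 */
static int set_power_savings(int level, int smt)
{
	int mc_level, smt_level;

	if (level < 0 || level >= MAX_POWERSAVINGS_BALANCE_LEVELS)
		return -1;

	if (smt)
		sched_smt_power_savings = level;
	else
		sched_mc_power_savings = level;

	/* Cache the maximum of the two so readers need not recompute it. */
	mc_level = sched_mc_power_savings;
	smt_level = sched_smt_power_savings;
	active_power_savings_level = (enum powersavings_balance_level)
		(smt_level > mc_level ? smt_level : mc_level);
	return 0;
}

/* Hypothetical reader, mirroring the checks changed in the patch. */
static int wakeup_bias_enabled(void)
{
	return active_power_savings_level >= POWERSAVINGS_BALANCE_WAKEUP;
}
```

In this sketch, setting either knob to the wakeup level enables the
wakeup bias regardless of the other knob, which is the behaviour the
max() in the patch gives.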

> +extern enum powersavings_balance_level active_power_savings_level;
> enum sched_domain_level {
> SD_LV_NONE = 0,
> diff --git a/kernel/sched.c b/kernel/sched.c
> index 8e2558c..407ee03 100644
> --- a/kernel/sched.c
> +++ b/kernel/sched.c
> @@ -3398,7 +3398,7 @@ out_balanced:
>
> if (this == group_leader && group_leader != group_min) {
> *imbalance = min_load_per_task;
> - if (sched_mc_power_savings >= POWERSAVINGS_BALANCE_WAKEUP) {
> + if (active_power_savings_level >= POWERSAVINGS_BALANCE_WAKEUP) {
> cpu_rq(this_cpu)->rd->sched_mc_preferred_wakeup_cpu =
> cpumask_first(sched_group_cpus(group_leader));
> }
> @@ -3683,7 +3683,7 @@ redo:
> !test_sd_parent(sd, SD_POWERSAVINGS_BALANCE))
> return -1;
>
> - if (sched_mc_power_savings < POWERSAVINGS_BALANCE_WAKEUP)
> + if (active_power_savings_level < POWERSAVINGS_BALANCE_WAKEUP)
> return -1;
>
> if (sd->nr_balance_failed++ < 2)
> @@ -7206,6 +7206,8 @@ static void sched_domain_node_span(int node, struct cpumask *span)
> #endif /* CONFIG_NUMA */
>
> int sched_smt_power_savings = 0, sched_mc_power_savings = 0;
> +/* Records the currently active power savings level */
> +enum powersavings_balance_level __read_mostly active_power_savings_level;
>
> /*
> * The cpus mask in sched_group and sched_domain hangs off the end.
> @@ -8040,6 +8042,8 @@ static ssize_t sched_power_savings_store(const char *buf, size_t count, int smt)
> sched_smt_power_savings = level;
> else
> sched_mc_power_savings = level;
> + active_power_savings_level = max(sched_smt_power_savings,
> + sched_mc_power_savings);
>
> arch_reinit_sched_domains();
>
> diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c
> index 0566f2a..a3583c6 100644
> --- a/kernel/sched_fair.c
> +++ b/kernel/sched_fair.c
> @@ -1054,7 +1054,7 @@ static int wake_idle(int cpu, struct task_struct *p)
> chosen_wakeup_cpu =
> cpu_rq(this_cpu)->rd->sched_mc_preferred_wakeup_cpu;
>
> - if (sched_mc_power_savings >= POWERSAVINGS_BALANCE_WAKEUP &&
> + if (active_power_savings_level >= POWERSAVINGS_BALANCE_WAKEUP &&
> idle_cpu(cpu) && idle_cpu(this_cpu) &&
> p->mm && !(p->flags & PF_KTHREAD) &&
> cpu_isset(chosen_wakeup_cpu, p->cpus_allowed))
>


Acked-by: Vaidyanathan Srinivasan <svaidy@xxxxxxxxxxxxxxxxxx>
