[tip:sched/core] sched/fair: Fix the group_capacity computation

From: tip-bot for Peter Zijlstra
Date: Thu Sep 12 2013 - 14:07:00 EST


Commit-ID: c61037e905a5cb74c7d786c35ee2cdbab9ed63af
Gitweb: http://git.kernel.org/tip/c61037e905a5cb74c7d786c35ee2cdbab9ed63af
Author: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
AuthorDate: Wed, 28 Aug 2013 12:40:38 +0200
Committer: Ingo Molnar <mingo@xxxxxxxxxx>
CommitDate: Thu, 12 Sep 2013 19:14:45 +0200

sched/fair: Fix the group_capacity computation

Do away with 'phantom' cores due to N*frac(smt_power) >= 1 by limiting
the capacity to the actual number of cores.

The assumption of 1 < smt_power < 2 is an actual requirement because
of what SMT is, so this should work regardless of the SMT
implementation.

It can still be defeated by creative use of cpu hotplug, but if you're
one of those freaks, you get to live with it.
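
The phantom-core arithmetic can be checked outside the kernel. The
following is only a userspace sketch: the function names old_capacity()
and new_capacity() are hypothetical, the macros are simplified
re-definitions for unsigned operands, the fix_small_capacity() fallback
is left out, and the per-thread power of 589 (half of an assumed
smt_power of 1178) is an illustrative figure, not something this patch
specifies.

```c
/* Userspace sketch only; not the kernel code itself. */
#define SCHED_POWER_SCALE 1024u

/* Simplified unsigned-only stand-ins for the kernel macros. */
#define DIV_ROUND_UP(n, d)      (((n) + (d) - 1) / (d))
#define DIV_ROUND_CLOSEST(x, d) (((x) + (d) / 2) / (d))

/* Old behaviour: capacity is just the number of power units. */
unsigned int old_capacity(unsigned int power)
{
	return DIV_ROUND_CLOSEST(power, SCHED_POWER_SCALE);
}

/*
 * New behaviour: derive the SMT factor from power_orig, turn the CPU
 * count into a core count, and cap the power-unit capacity with it.
 * The zero-capacity fallback (fix_small_capacity) is omitted here.
 */
unsigned int new_capacity(unsigned int cpus, unsigned int power,
			  unsigned int power_orig)
{
	/* smt := ceil(cpus / power_orig), assumes 1 < smt_power < 2 */
	unsigned int smt = DIV_ROUND_UP(SCHED_POWER_SCALE * cpus, power_orig);
	unsigned int cores = cpus / smt;
	unsigned int units = DIV_ROUND_CLOSEST(power, SCHED_POWER_SCALE);

	return cores < units ? cores : units;
}
```

Under those assumptions, a group of 8 cores with 2 threads each has
cpus = 16 and power_orig = 16 * 589 = 9424. old_capacity(9424) yields 9,
one core more than exists, while new_capacity(16, 9424, 9424) yields 8.
When power drops below power_orig, the power-unit term is the smaller one
and still limits the result.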

Signed-off-by: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Acked-by: Vincent Guittot <vincent.guitto@xxxxxxxxxx>
Link: http://lkml.kernel.org/n/tip-dczmbi8tfgixacg1ji2av1un@xxxxxxxxxxxxxx
Signed-off-by: Ingo Molnar <mingo@xxxxxxxxxx>
---
kernel/sched/fair.c | 20 +++++++++++++-------
1 file changed, 13 insertions(+), 7 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 218f9c5..51c5c3e 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4556,18 +4556,24 @@ static inline int sg_imbalanced(struct sched_group *group)
/*
* Compute the group capacity.
*
- * For now the capacity is simply the number of power units in the group_power.
- * A power unit represents a full core.
- *
- * This has an issue where N*frac(smt_power) >= 1, in that case we'll see extra
- * 'cores' that aren't actually there.
+ * Avoid the issue where N*frac(smt_power) >= 1 creates 'phantom' cores by
+ * first dividing out the smt factor and computing the actual number of cores
+ * and limit power unit capacity with that.
*/
static inline int sg_capacity(struct lb_env *env, struct sched_group *group)
{
+ unsigned int capacity, smt, cpus;
+ unsigned int power, power_orig;
+
+ power = group->sgp->power;
+ power_orig = group->sgp->power_orig;
+ cpus = group->group_weight;

- unsigned int power = group->sgp->power;
- unsigned int capacity = DIV_ROUND_CLOSEST(power, SCHED_POWER_SCALE);
+ /* smt := ceil(cpus / power), assumes: 1 < smt_power < 2 */
+ smt = DIV_ROUND_UP(SCHED_POWER_SCALE * cpus, power_orig);
+ capacity = cpus / smt; /* cores */

+ capacity = min_t(unsigned, capacity, DIV_ROUND_CLOSEST(power, SCHED_POWER_SCALE));
if (!capacity)
capacity = fix_small_capacity(env->sd, group);
