[tip:sched/core] sched/balancing: Fix 'local->avg_load > busiest->avg_load' case in fix_small_imbalance()

From: tip-bot for Vladimir Davydov
Date: Fri Sep 20 2013 - 09:47:25 EST


Commit-ID: 3029ede39373c368f402a76896600d85a4f7121b
Gitweb: http://git.kernel.org/tip/3029ede39373c368f402a76896600d85a4f7121b
Author: Vladimir Davydov <vdavydov@xxxxxxxxxxxxx>
AuthorDate: Sun, 15 Sep 2013 17:49:14 +0400
Committer: Ingo Molnar <mingo@xxxxxxxxxx>
CommitDate: Fri, 20 Sep 2013 11:59:38 +0200

sched/balancing: Fix 'local->avg_load > busiest->avg_load' case in fix_small_imbalance()

In busiest->group_imb case we can come to fix_small_imbalance() with
local->avg_load > busiest->avg_load. This can result in wrong imbalance
fix-up, because there is the following check there where all the
members are unsigned:

if (busiest->avg_load - local->avg_load + scaled_busy_load_per_task >=
(scaled_busy_load_per_task * imbn)) {
env->imbalance = busiest->load_per_task;
return;
}

As a result we can end up constantly bouncing tasks from one cpu to
another if there are pinned tasks.

Fix it by substituting the subtraction with an equivalent addition in
the check.

[ The bug can be caught by running 2*N cpuhogs pinned to two logical cpus
belonging to different cores on an HT-enabled machine with N logical
cpus: just look at se.nr_migrations growth. ]

Signed-off-by: Vladimir Davydov <vdavydov@xxxxxxxxxxxxx>
Signed-off-by: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Link: http://lkml.kernel.org/r/ef167822e5c5b2d96cf5b0e3e4f4bdff3f0414a2.1379252740.git.vdavydov@xxxxxxxxxxxxx
Signed-off-by: Ingo Molnar <mingo@xxxxxxxxxx>
---
kernel/sched/fair.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 0b99aae..2aedacc 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4823,8 +4823,8 @@ void fix_small_imbalance(struct lb_env *env, struct sd_lb_stats *sds)
(busiest->load_per_task * SCHED_POWER_SCALE) /
busiest->group_power;

- if (busiest->avg_load - local->avg_load + scaled_busy_load_per_task >=
- (scaled_busy_load_per_task * imbn)) {
+ if (busiest->avg_load + scaled_busy_load_per_task >=
+ local->avg_load + (scaled_busy_load_per_task * imbn)) {
env->imbalance = busiest->load_per_task;
return;
}
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/