[PATCH, 2.6.9] improved load_balance() tolerance for pinned tasks

From: John Hawkes
Date: Wed Oct 20 2004 - 14:45:19 EST


When a large number of processes are pinned to a single CPU, every other
CPU's load_balance() sees that overloaded CPU as "busiest", yet
move_tasks() never finds a task to pull-migrate. This condition occurs
during module unload, but can also be triggered as a denial-of-service
using sys_sched_setaffinity(). Several hundred CPUs performing this
fruitless load_balance() will livelock on the busiest CPU's runqueue
lock. A smaller number of CPUs will livelock if the pinned task count
gets high. This simple patch remedies the more common first problem:
after move_tasks() fails to migrate anything, balance_interval is
incremented, up to max_interval. Using a simple increment, rather than
the more dramatic doubling of balance_interval, is conservative and yet
also effective.
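
For illustration only (this is not part of the patch), the backoff policy
can be modelled in user space roughly as below. The struct sd_model type
and the helper names are hypothetical stand-ins for the three sched_domain
fields the diff touches, and the sketch assumes the failure branch is the
one taken when move_tasks() moved nothing (nr_moved == 0). The doubling
variant is included only to compare how quickly each policy reaches
max_interval.

```c
#include <assert.h>

/* Hypothetical stand-in for the sched_domain fields used by the patch. */
struct sd_model {
	unsigned long min_interval;
	unsigned long max_interval;
	unsigned long balance_interval;
};

/*
 * Policy described in the patch: after a fruitless move_tasks(), creep
 * the interval up by one until max_interval; on success, reset it.
 */
static void update_interval_increment(struct sd_model *sd, int nr_moved)
{
	assert(sd->min_interval <= sd->max_interval);
	if (!nr_moved) {
		if (sd->balance_interval < sd->max_interval)
			sd->balance_interval++;
	} else {
		sd->balance_interval = sd->min_interval;
	}
}

/* The "more dramatic" alternative: double on failure, clamped to max. */
static void update_interval_double(struct sd_model *sd, int nr_moved)
{
	assert(sd->min_interval <= sd->max_interval);
	if (!nr_moved) {
		sd->balance_interval *= 2;
		if (sd->balance_interval > sd->max_interval)
			sd->balance_interval = sd->max_interval;
	} else {
		sd->balance_interval = sd->min_interval;
	}
}

/*
 * Count how many consecutive failed balance attempts it takes for the
 * interval to reach max_interval. Works on a copy of the model.
 */
static unsigned int failures_until_max(struct sd_model sd,
		void (*update)(struct sd_model *, int))
{
	unsigned int n = 0;

	assert(sd.balance_interval > 0);	/* doubling needs a nonzero start */
	while (sd.balance_interval < sd.max_interval) {
		update(&sd, 0);
		n++;
	}
	return n;
}
```

With min_interval = 1 and max_interval = 8, the increment policy needs
seven consecutive failures to reach the cap while doubling needs three,
which is the sense in which the increment is the more conservative choice.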

John Hawkes


Signed-off-by: John Hawkes <hawkes@xxxxxxx>




Index: linux/kernel/sched.c
===================================================================
--- linux.orig/kernel/sched.c 2004-10-19 15:04:11.000000000 -0700
+++ linux/kernel/sched.c 2004-10-19 15:09:50.000000000 -0700
@@ -2123,11 +2123,19 @@
 			 */
 			sd->nr_balance_failed = sd->cache_nice_tries;
 		}
-	} else
-		sd->nr_balance_failed = 0;

-	/* We were unbalanced, so reset the balancing interval */
-	sd->balance_interval = sd->min_interval;
+		/*
+		 * We were unbalanced, but unsuccessful in move_tasks(),
+		 * so bump the balance_interval to lessen the lock contention.
+		 */
+		if (sd->balance_interval < sd->max_interval)
+			sd->balance_interval++;
+	} else {
+		sd->nr_balance_failed = 0;
+
+		/* We were unbalanced, so reset the balancing interval */
+		sd->balance_interval = sd->min_interval;
+	}

 	return nr_moved;
