Re: [PATCH 3/4] sched: drop group_capacity to 1 only if local group has extra capacity

From: Suresh Siddha
Date: Thu Oct 14 2010 - 19:42:59 EST


On Wed, 2010-10-13 at 22:48 -0700, Nikhil Rao wrote:
> Resending this patch since the original patch was munged. Thanks to Mike
> Galbraith for pointing this out.
>
> When SD_PREFER_SIBLING is set on a sched domain, drop group_capacity to 1
> only if the local group has extra capacity. For niced task balancing, we pull
> low weight tasks away from a sched group as long as there is capacity in other
> groups. When all other groups are saturated, we do not drop capacity of the
> niced group down to 1. This prevents active balance from kicking out the
> low-weight threads, which hurts system utilization.
>
> Signed-off-by: Nikhil Rao <ncrao@xxxxxxxxxx>
> ---
> kernel/sched_fair.c | 8 ++++++--
> 1 files changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/sched_fair.c b/kernel/sched_fair.c
> index 0dd1021..da0c688 100644
> --- a/kernel/sched_fair.c
> +++ b/kernel/sched_fair.c
> @@ -2030,6 +2030,7 @@ struct sd_lb_stats {
> unsigned long this_load;
> unsigned long this_load_per_task;
> unsigned long this_nr_running;
> + unsigned long this_group_capacity;
>
> /* Statistics of the busiest group */
> unsigned long max_load;
> @@ -2546,15 +2547,18 @@ static inline void update_sd_lb_stats(struct sched_domain *sd, int this_cpu,
> /*
> * In case the child domain prefers tasks go to siblings
> * first, lower the sg capacity to one so that we'll try
> - * and move all the excess tasks away.
> + * and move all the excess tasks away. We lower capacity only
> + * if the local group can handle the extra capacity.
> */
> - if (prefer_sibling)
> + if (prefer_sibling && !local_group &&
> + sds->this_nr_running < sds->this_group_capacity)
> sgs.group_capacity = min(sgs.group_capacity, 1UL);

Yes, Nikhil. This should address my earlier concern about the SMT/MC idle
balancing case.

Acked-by: Suresh Siddha <suresh.b.siddha@xxxxxxxxx>

>
> if (local_group) {
> sds->this_load = sgs.avg_load;
> sds->this = sg;
> sds->this_nr_running = sgs.sum_nr_running;
> + sds->this_group_capacity = sgs.group_capacity;
> sds->this_load_per_task = sgs.sum_weighted_load;
> } else if (update_sd_pick_busiest(sd, sds, sg, &sgs, this_cpu)) {
> sds->max_load = sgs.avg_load;
