RE: [Linuxarm] Re: [PATCH v2] sched/topology: fix the issue groups don't span domain->span for NUMA diameter > 2

From: Valentin Schneider
Date: Thu Feb 18 2021 - 09:37:03 EST



Hi Barry,

On 18/02/21 09:17, Song Bao Hua (Barry Song) wrote:
> Hi Valentin,
>
> I understand Peter's concern is that the local group has a different
> size from the remote groups. Is this patch resolving Peter's concern?
> To me, it seems not :-)
>

If you remove the '&& i != cpu' in build_overlap_sched_groups() you get
that, but then you also get some extra warnings :-)

Now yes, should_we_balance() only matters for the local group. However I'm
somewhat wary of messing with the local groups; for one it means you would
have more than one tl now accessing the same sgc->next_update,
sgc->{min,max}_capacity, sgc->group_imbalance (as Vincent had pointed out).

By ensuring only remote (i.e. !local) groups are modified (which is what
your patch does), we absolve ourselves of this issue, which is why I prefer
this approach ATM.

> Though I don’t understand why different group sizes will be harmful
> since all groups are calculating avg_load and group_type based on
> their own capacities. Thus, for a smaller group, its capacity would
> be smaller.
>
> Is it because a bigger group has relatively less chance to pull, so
> load balancing will be completed more slowly while small groups have
> high load?
>

Peter's point is that, if at a given tl you have groups that look like

g0: 0-3, g1: 4-5, g2: 6-7

Then g0 is half as likely to pull tasks with load_balance() as g1 or g2
(due to the group size vs should_we_balance()).
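
One way to read that arithmetic, as a toy model rather than kernel code:
assume should_we_balance() lets exactly one CPU per local group go ahead on
each balance pass, so every group gets one pull attempt regardless of its
size. The per-CPU share of that attempt is then 1/size. The function name
and the group sizes below are hypothetical, picked to show a group twice as
large as its peers.

```python
from fractions import Fraction


def per_cpu_pull_share(group_sizes):
    """Toy model, not the scheduler's logic.

    Assumption: each group gets exactly one pull attempt per balance pass,
    so a CPU's share of that attempt is 1 / (number of CPUs in its group).
    """
    return {name: Fraction(1, size) for name, size in group_sizes.items()}


# Hypothetical sizes: one 4-CPU group next to two 2-CPU groups.
sizes = {"g0": 4, "g1": 2, "g2": 2}
share = per_cpu_pull_share(sizes)

# Per-CPU pull share of the big group relative to a small one.
ratio = share["g0"] / share["g1"]
print(ratio)  # 1/2
```

Under that assumption the larger group pulls half as much per CPU, which
is the asymmetry being discussed.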


However, I suppose one "trick" to be aware of here is that since your patch
*doesn't* change the local group, we do have e.g. on CPU0:

[ 0.374840] domain-2: span=0-5 level=NUMA
[ 0.375054] groups: 0:{ span=0-3 cap=4003 }, 4:{ span=4-5 cap=1988 }

*but* on CPU4 we get:

[ 0.387019] domain-2: span=0-1,4-7 level=NUMA
[ 0.387211] groups: 4:{ span=4-7 cap=3984 }, 0:{ span=0-1 cap=2013 }

IOW, at a given tl, all *local* groups have /roughly/ the same size and thus
similar pull probability (it took me writing this mail to see it that
way). So perhaps this is all fine already?
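
A quick sanity check of that observation, as a sketch only: the helper
below (a hypothetical name, not a kernel function) counts the CPUs in a
cpulist string, and the spans are copied by hand from the two dumps above,
assuming the first group listed in each dump is the local one.

```python
def cpulist_weight(cpulist):
    """Count the CPUs in a kernel-style cpulist string such as "0-1,4-7"."""
    total = 0
    for chunk in cpulist.split(","):
        lo, _, hi = chunk.partition("-")
        # A bare number like "5" has no "-", so hi is empty: treat as lo-lo.
        total += int(hi or lo) - int(lo) + 1
    return total


# Group spans copied from the CPU0 and CPU4 domain-2 dumps quoted above.
cpu0 = {"local": "0-3", "remote": "4-5"}
cpu4 = {"local": "4-7", "remote": "0-1"}

local_sizes = [cpulist_weight(d["local"]) for d in (cpu0, cpu4)]
remote_sizes = [cpulist_weight(d["remote"]) for d in (cpu0, cpu4)]
print(local_sizes, remote_sizes)  # [4, 4] [2, 2]
```

So in both dumps the local group spans 4 CPUs and the remote one 2; the
size mismatch is only ever between a local group and the remote groups it
sees, not between the local groups of different CPUs.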