Re: [PATCH 4/4] sched/topology: the group balance cpu must be a cpu where the group is installed

From: Peter Zijlstra
Date: Tue Apr 25 2017 - 11:39:54 EST


On Tue, Apr 25, 2017 at 05:27:03PM +0200, Peter Zijlstra wrote:
> On Tue, Apr 25, 2017 at 05:22:36PM +0200, Peter Zijlstra wrote:
> > On Tue, Apr 25, 2017 at 05:12:00PM +0200, Peter Zijlstra wrote:
> > > But I'll first try and figure out why I'm not having empty masks.
> >
> > Ah, so this is before all the degenerate stuff, so there's a bunch of
> > redundant domains below that make it work -- and there always will be,
> > unless FORCE_SD_OVERLAP.
> >
> > Now I wonder what triggered it.. let me put it back.
>
> Ah! the asymmetric setup, where @sibling is entirely uninitialized for
> the top domain.
>
And it still works correctly too:


[ 0.078756] XXX 1 NUMA
[ 0.079005] XXX 2 NUMA
[ 0.080003] XXY 0-2:0
[ 0.081007] XXX 1 NUMA
[ 0.082005] XXX 2 NUMA
[ 0.083003] XXY 1-3:3
[ 0.084032] XXX 1 NUMA
[ 0.085003] XXX 2 NUMA
[ 0.086003] XXY 1-3:3
[ 0.087015] XXX 1 NUMA
[ 0.088003] XXX 2 NUMA
[ 0.089002] XXY 0-2:0


[ 0.090007] CPU0 attaching sched-domain:
[ 0.091002] domain 0: span 0-2 level NUMA
[ 0.092002] groups: 0 (mask: 0), 1, 2
[ 0.093002] domain 1: span 0-3 level NUMA
[ 0.094002] groups: 0-2 (mask: 0) (cpu_capacity: 3072), 1-3 (cpu_capacity: 3072)
[ 0.095005] CPU1 attaching sched-domain:
[ 0.096003] domain 0: span 0-3 level NUMA
[ 0.097002] groups: 1 (mask: 1), 2, 3, 0
[ 0.098004] CPU2 attaching sched-domain:
[ 0.099002] domain 0: span 0-3 level NUMA
[ 0.100002] groups: 2 (mask: 2), 3, 0, 1
[ 0.101004] CPU3 attaching sched-domain:
[ 0.102002] domain 0: span 1-3 level NUMA
[ 0.103002] groups: 3 (mask: 3), 1, 2
[ 0.104002] domain 1: span 0-3 level NUMA
[ 0.105002] groups: 1-3 (mask: 3) (cpu_capacity: 3072), 0-2 (cpu_capacity: 3072)


static void
build_group_mask(struct sched_domain *sd, struct sched_group *sg, struct cpumask *mask)
{
	const struct cpumask *sg_span = sched_group_cpus(sg);
	struct sd_data *sdd = sd->private;
	struct sched_domain *sibling;
	int i, funny = 0;

	cpumask_clear(mask);

	for_each_cpu(i, sg_span) {
		sibling = *per_cpu_ptr(sdd->sd, i);

		if (!sibling->child) {
			funny = 1;
			printk("XXX %d %s %*pbl\n", i, sd->name, cpumask_pr_args(sched_domain_span(sibling)));
			continue;
		}

		/* If we would not end up here, we can't continue from here */
		if (!cpumask_equal(sg_span, sched_domain_span(sibling->child)))
			continue;

		cpumask_set_cpu(i, mask);
	}

	if (funny) {
		printk("XXY %*pbl:%*pbl\n",
		       cpumask_pr_args(sg_span),
		       cpumask_pr_args(mask));
	}
}


So that will still pick the right balance cpu, and thus the right sgc.
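
For illustration only, here is a minimal user-space model of the loop
above, replayed against the 4-CPU topology from the log. Everything
prefixed model_ is hypothetical and not kernel API: a uint32_t stands in
for struct cpumask, and child_span[i] == 0 stands in for the
!sibling->child case that prints "XXX". The table of child spans is read
off the log (CPU0's lower domain spans 0-2, CPU3's spans 1-3, CPUs 1 and
2 hit the "XXX" path). Taking the balance cpu to be the first cpu in
span & mask is an assumption about group_balance_cpu().

```c
#include <stdint.h>

#define MODEL_NR_CPUS 4	/* hypothetical: the 4-CPU machine in the log */

/* Bit i set means CPU i is in the mask; a stand-in for struct cpumask. */
typedef uint32_t model_mask_t;

/*
 * Child-domain span seen for each CPU at the top NUMA level, read off
 * the log: CPU0 -> 0-2, CPU3 -> 1-3, CPUs 1 and 2 have no child there.
 */
static const model_mask_t model_log_child_span[MODEL_NR_CPUS] = {
	0x7, 0x0, 0x0, 0xe,
};

/* Model of the for_each_cpu() loop in build_group_mask(). */
static model_mask_t
model_group_mask(model_mask_t sg_span, const model_mask_t *child_span)
{
	model_mask_t mask = 0;
	int i;

	for (i = 0; i < MODEL_NR_CPUS; i++) {
		if (!(sg_span & (1u << i)))
			continue;	/* CPU i is not part of the group */

		if (!child_span[i])
			continue;	/* no child domain: the "XXX" case */

		/* Only CPUs whose child domain spans exactly the group. */
		if (child_span[i] != sg_span)
			continue;

		mask |= 1u << i;
	}

	return mask;
}

/* Assumption: balance cpu == first cpu in span & mask, -1 if none. */
static int model_balance_cpu(model_mask_t sg_span, model_mask_t mask)
{
	model_mask_t both = sg_span & mask;
	int i;

	for (i = 0; i < MODEL_NR_CPUS; i++) {
		if (both & (1u << i))
			return i;
	}

	return -1;
}
```

With that table, model_group_mask(0x7, model_log_child_span) gives 0x1
and model_group_mask(0xe, model_log_child_span) gives 0x8, which matches
the "XXY 0-2:0" and "XXY 1-3:3" lines.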

Another thing I've been thinking about; I think we can do away with the
kzalloc() in build_group_from_child_sched_domain() and use the sdd->sg
storage.

I just didn't want to move too much code around again, and ideally put
more assertions in place to catch bad stuff; I just haven't had a good
time thinking of good assertions :/
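
As a rough sketch of what such a check could look like (a user-space
model, not a proposal from this thread): the two invariants below are
assumptions drawn from the discussion above, namely that a group mask
should never be empty and should only contain cpus from the group span.
The function name and the uint32_t masks are hypothetical; in the kernel
this would presumably be a WARN on the cpumask equivalents.

```c
#include <stdint.h>

/*
 * Hypothetical invariant check for one (span, mask) pair, with bit i
 * standing for CPU i. Returns 1 when both invariants hold, 0 otherwise.
 */
static int model_group_mask_ok(uint32_t sg_span, uint32_t mask)
{
	/* An empty mask would leave the group without a balance cpu. */
	if (!mask)
		return 0;

	/* Every cpu in the mask must also be in the group span. */
	if (mask & ~sg_span)
		return 0;

	return 1;
}
```

For the groups in the log, (0x7, 0x1) and (0xe, 0x8) both pass; an empty
mask, or a mask naming a cpu outside the span, fails.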