Re: [PATCH v2 1/7] sched/fair: Generalize asym_packing logic for SMT local sched group

From: Ricardo Neri
Date: Fri Dec 23 2022 - 08:03:19 EST


On Thu, Dec 22, 2022 at 12:12:00PM +0100, Dietmar Eggemann wrote:
> On 22/12/2022 05:32, Ricardo Neri wrote:
> > On Wed, Dec 21, 2022 at 02:03:15PM +0100, Dietmar Eggemann wrote:
> >> On 12/12/2022 18:53, Ricardo Neri wrote:
> >>> On Tue, Dec 06, 2022 at 06:22:41PM +0100, Dietmar Eggemann wrote:
> >>>> On 22/11/2022 21:35, Ricardo Neri wrote:
> >>
> >> [...]
> >>
> >>>> I'm not sure why you change asym_smt_can_pull_tasks() together with
> >>>> removing SD_ASYM_PACKING from SMT level (patch 5/7)?
> >>>
> >>> In x86 we have SD_ASYM_PACKING at the MC, CLS* and, before my patches, SMT
> >>> sched domains.
> >>>
> >>>>
> >>>> update_sg_lb_stats()
> >>>>
> >>>> ... && env->sd->flags & SD_ASYM_PACKING && .. && sched_asym()
> >>>> ^^^^^^^^^^^^
> >>>> sched_asym()
> >>>>
> >>>> if ((sds->local->flags & SD_SHARE_CPUCAPACITY) ||
> >>>> (group->flags & SD_SHARE_CPUCAPACITY))
> >>>> return asym_smt_can_pull_tasks()
> >>>> ^^^^^^^^^^^^^^^^^^^^^^^^^
> >>>>
> >>>> So x86 won't have a sched domain with SD_SHARE_CPUCAPACITY and
> >>>> SD_ASYM_PACKING anymore. So sched_asym() would call sched_asym_prefer()
> >>>> directly on MC. What do I miss here?
> >>>
> >>> asym_smt_can_pull_tasks() is used above the SMT level *and* when either the
> >>> local or sg sched groups are composed of CPUs that are SMT siblings.
> >>
> >> OK.
> >>
> >>> In fact, asym_smt_can_pull_tasks() can only be called above the SMT level.
> >>> This is because the flags of a sched_group in a sched_domain are equal to
> >>> the flags of the child sched_domain. Since SMT is the lowest sched_domain,
> >>> its groups' flags are 0.
> >>
> >> I see. I forgot about `[PATCH v5 0/6] sched/fair: Fix load balancing of
> >> SMT siblings with ASYM_PACKING` from Sept 21 (specifically [PATCH v5
> >> 2/6] sched/topology: Introduce sched_group::flags).
> >>
> >>> sched_asym() calls sched_asym_prefer() directly if balancing at the
> >>> SMT level and, at higher domains, if the child domain is not SMT.
> >>
> >> OK.
> >>
> >>> This meets the requirement of Power7, where SMT siblings have different
> >>> priorities; and of x86, where physical cores have different priorities.
> >>>
> >>> Thanks and BR,
> >>> Ricardo
> >>>
> >>> * The target of these patches is Intel hybrid processors, on which cluster
> >>> scheduling is disabled - cabdc3a8475b ("sched,x86: Don't use cluster
> >>> topology for x86 hybrid CPUs"). Also, I have not observed topologies in
> >>> which CPUs of the same cluster have different priorities.
> >>
> >> OK.
> >>
> >> IMHO, the function header of asym_smt_can_pull_tasks() (3rd and 4th
> >> paragraph ... `If both @dst_cpu and @sg have SMT siblings` and
> >
> > Agreed. I changed the behavior of the function. I will update the
> > description.
> >
> >> `If @sg does not have SMT siblings` still reflect the old code layout.
> >
> > But this behavior did not change. The check covers both SMT and non-SMT
> > cases:
>
> The condition to call sched_asym_prefer() seems to have changed slightly
> though (including the explanation that busy_cpus_delta >= 2 handling
> should be done by fbg()):
>
> sds->local_stat.sum_nr_running (A)
> busy_cpus_delta = sg_busy_cpus - local_busy_cpus (B)
> sg_busy_cpus = sgs->group_weight - sgs->idle_cpus (C)
>
> From ((smt && B == 1) || (!smt && !A)) to (C == 1 && !A)

I agree that (smt && B == 1) did change and I will update the comment.

My point is that (!smt && !A) is equivalent to (C == 1 && !A) when @sg has
only one CPU and that CPU is busy. The fourth paragraph still stands.
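
To make the equivalence concrete, here is a minimal user-space sketch. It is
not kernel code: the helper names are made up for illustration, and A and C
are modeled as plain integers following the definitions quoted above
(A = sds->local_stat.sum_nr_running, C = sgs->group_weight - sgs->idle_cpus).

```c
#include <assert.h>
#include <stdbool.h>

/* C in the notation above: busy CPUs = group_weight - idle_cpus. */
unsigned int calc_sg_busy_cpus(unsigned int group_weight,
			       unsigned int idle_cpus)
{
	return group_weight - idle_cpus;
}

/* Old non-SMT term: (!smt && !A). Hypothetical helper, illustration only. */
bool old_non_smt_cond(bool sg_is_smt, unsigned int local_nr_running)
{
	return !sg_is_smt && !local_nr_running;
}

/* New condition: (C == 1 && !A). Hypothetical helper, illustration only. */
bool new_cond(unsigned int sg_busy_cpus, unsigned int local_nr_running)
{
	return sg_busy_cpus == 1 && !local_nr_running;
}
```

For a non-SMT @sg with one busy CPU (group_weight == 1, idle_cpus == 0),
C is 1, so both helpers reduce to !A. The new condition also accepts an SMT
@sg with exactly one busy sibling, which the old non-SMT term did not cover.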

>
> >
> > /*
> > * non-SMT @sg can only have 1 busy CPU. We only care that an SMT @sg
> > * has exactly one busy sibling
> > */
> > if (sg_busy_cpus == 1 &&
> > /* local group is fully idle, SMT and non-SMT. */
> > !sds->local_stat.sum_nr_running)
> >
> > Perhaps I can collapse the two paragraphs into one.
>
> Sounds good to me.

Will do.

Thanks and BR,
Ricardo