Re: [PATCH v9 13/15] sched/fair: Introduce an energy estimation helper function

From: Peter Zijlstra
Date: Thu Nov 22 2018 - 08:56:24 EST


On Wed, Nov 21, 2018 at 04:05:27PM +0000, Quentin Perret wrote:
> On Wednesday 21 Nov 2018 at 15:28:03 (+0100), Peter Zijlstra wrote:
> > On Mon, Nov 19, 2018 at 02:18:55PM +0000, Quentin Perret wrote:
> > > +static long
> > > +compute_energy(struct task_struct *p, int dst_cpu, struct perf_domain *pd)
> > > +{
> > > + long util, max_util, sum_util, energy = 0;
> > > + int cpu;
> > > +
> > > + for (; pd; pd = pd->next) {
> > > + max_util = sum_util = 0;
> > > + /*
> > > + * The capacity state of CPUs of the current rd can be driven by
> > > + * CPUs of another rd if they belong to the same performance
> > > + * domain. So, account for the utilization of these CPUs too
> > > + * by masking pd with cpu_online_mask instead of the rd span.
> > > + *
> > > + * If an entire performance domain is outside of the current rd,
> > > + * it will not appear in its pd list and will not be accounted
> > > + * by compute_energy().
> > > + */
> > > + for_each_cpu_and(cpu, perf_domain_span(pd), cpu_online_mask) {
> >
> > Should that not be cpu_active_mask ?
>
> Hmm, I must admit I'm sometimes a bit confused by the exact difference
> between these masks, so maybe yeah ...
>
> IIUC, cpu_active_mask is basically the set of CPUs on which the
> scheduler is actually allowed to migrate tasks. Is that correct ?

Yep. Which is a strict subset of online. The difference only matters
during hotplug. We take a CPU out of active before we take it offline
and we add it to active only after the CPU is fully online and
scheduling.
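
To make the ordering concrete, here is a minimal user-space sketch of it.
This is a toy model, not kernel code: struct cpu_masks and the toy_*()
helpers are hypothetical names invented for illustration, and each mask
is reduced to one bit per CPU. It only shows that with this ordering,
active never contains a CPU that is not online.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model (assumption, not kernel code): one bit per CPU. */
struct cpu_masks {
	uint64_t online;
	uint64_t active;
};

/* Bring-up, step 1: the CPU comes online. */
static void toy_cpu_up_online(struct cpu_masks *m, int cpu)
{
	m->online |= 1ULL << cpu;
}

/* Bring-up, step 2: only once it is online and scheduling, mark it active. */
static void toy_cpu_up_active(struct cpu_masks *m, int cpu)
{
	m->active |= 1ULL << cpu;
}

/* Tear-down, step 1: remove the CPU from active first. */
static void toy_cpu_down_inactive(struct cpu_masks *m, int cpu)
{
	m->active &= ~(1ULL << cpu);
}

/* Tear-down, step 2: only then take it offline. */
static void toy_cpu_down_offline(struct cpu_masks *m, int cpu)
{
	m->online &= ~(1ULL << cpu);
}

/* True if every active CPU is also online. */
static bool toy_active_subset_of_online(const struct cpu_masks *m)
{
	return (m->active & ~m->online) == 0;
}
```

Between the two steps of either sequence the CPU is online && !active,
which is the only window where the two masks differ in this model.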

> I have always seen cpu_online_mask as a superset of cpu_active_mask
> which can also include CPUs which are still running 'special' tasks
> (kthreads and things like that I assume) although not allowed for
> migration any more (or not yet) because we're in the process of
> hotplugging that CPU.

Right.

> So, the thing is, I'm not trying to select a CPU candidate for my task
> here, I'm trying to understand what's the energy impact of a migration.
> That involves all CPUs that are running _something_ in a perf domain
> no matter if they're allowed to run more tasks or not. I mean, raising
> the OPP will make running online && !active CPUs more expensive as well.
> That's why I thought cpu_online_mask was a good match here.

Ah, fair enough. Thanks!
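
For illustration, a toy user-space sketch of why the choice of mask
changes the estimate. Nothing here is the patch's code: TOY_NR_CPUS,
struct toy_pd_util and toy_pd_accumulate() are hypothetical, the
utilization numbers are made up, and masks are plain bitmaps. It only
mimics the inner loop's max_util/sum_util accumulation over
(pd span & mask).

```c
#include <stdint.h>

#define TOY_NR_CPUS 4	/* assumption: a tiny 4-CPU perf domain */

struct toy_pd_util {
	long sum_util;	/* total utilization of the CPUs considered */
	long max_util;	/* highest utilization among those CPUs */
};

/* Accumulate utilization over the CPUs present in both pd_span and mask. */
static struct toy_pd_util
toy_pd_accumulate(const long util[TOY_NR_CPUS], uint64_t pd_span, uint64_t mask)
{
	struct toy_pd_util r = { 0, 0 };
	uint64_t cpus = pd_span & mask;

	for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++) {
		if (!((cpus >> cpu) & 1))
			continue;	/* CPU is filtered out by the mask */
		r.sum_util += util[cpu];
		if (util[cpu] > r.max_util)
			r.max_util = util[cpu];
	}
	return r;
}
```

With util = {100, 300, 50, 700}, a span of all four CPUs, and CPU3
online but no longer active, masking with online gives sum_util = 1150
and max_util = 700, while masking with active gives 450 and 300. The
second variant misses a CPU that is still running something and still
sharing the domain's OPP, which is the case described above.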