Re: [RFC PATCH v3 03/10] PM: Introduce an Energy Model management framework

From: Juri Lelli
Date: Wed Jun 06 2018 - 11:20:12 EST


On 06/06/18 15:37, Quentin Perret wrote:
> Hi Dietmar,
>
> On Wednesday 06 Jun 2018 at 15:12:15 (+0200), Dietmar Eggemann wrote:
> > > +static void fd_update_cs_table(struct em_cs_table *cs_table, int cpu)
> > > +{
> > > + unsigned long cmax = arch_scale_cpu_capacity(NULL, cpu);
> > > + int max_cap_state = cs_table->nr_cap_states - 1;
> > > + unsigned long fmax = cs_table->state[max_cap_state].frequency;
> > > + int i;
> > > +
> > > + for (i = 0; i < cs_table->nr_cap_states; i++)
> > > + cs_table->state[i].capacity = cmax *
> > > + cs_table->state[i].frequency / fmax;
> > > +}
> >
> > This has issues on a 32bit system. cs_table->state[i].capacity (unsigned
> > long) overflows with the frequency values stored in Hz.
>
> Ah, thank you very much for pointing this out ! I haven't tried on a 32bit
> machine yet, my bad. I'll fix that for v4.
>
> >
> > Maybe something like this to cure it:
> >
> > diff --git a/kernel/power/energy_model.c b/kernel/power/energy_model.c
> > index 6ad53f1cf7e6..c13b3eb8bf35 100644
> > --- a/kernel/power/energy_model.c
> > +++ b/kernel/power/energy_model.c
> > @@ -144,9 +144,11 @@ static void fd_update_cs_table(struct em_cs_table *cs_table, int cpu)
> > unsigned long fmax = cs_table->state[max_cap_state].frequency;
> > int i;
> >
> > - for (i = 0; i < cs_table->nr_cap_states; i++)
> > - cs_table->state[i].capacity = cmax *
> > - cs_table->state[i].frequency / fmax;
> > + for (i = 0; i < cs_table->nr_cap_states; i++) {
> > + u64 val = (u64)cmax * cs_table->state[i].frequency;
> > + do_div(val, fmax);
> > + cs_table->state[i].capacity = (unsigned long)val;
> > + }
> > }
>
> Hmmm yes, that should work.
>
> >
> > This brings me to another question. Let's say there are multiple users of
> > the Energy Model in the system. Shouldn't the units of frequency and power
> > not standardized, maybe Mhz and mW?
> > The task scheduler doesn't care since it is only interested in power diffs
> > but other user might do.
>
> So the good thing about specifying units is that we can probably assume
> ranges on the values. If the power is in mW, assuming that we're talking
> about a single CPU, it'll probably fit in 16 bits. 65W/core should be
> a reasonable upper-bound ?
> But there are also vendors who might not be happy with disclosing absolute
> values ... These are sometimes considered sensitive and only relative
> numbers are discussed publicly. Now, you can also argue that we already
> have units specified in IPA for ex, and that it doesn't really matter if
> a driver "lies" about the real value, as long as the ratios are correct.
> And I guess that anyone can do measurement on the hardware and get those
> values anyway. So specifying a unit (mW) for the power is probably a
> good idea.

Mmm, I remember we fought quite a bit while getting capacity-dmpis-mhz
binding accepted, and one of the musts was that the values were going to
be normalized. So, normalized power values again maybe?

Best,

- Juri