Re: [PATCH v3 1/3] cpufreq: ondemand: Change the calculation of target frequency

From: Rafael J. Wysocki
Date: Sat Jun 08 2013 - 07:09:40 EST


On Saturday, June 08, 2013 12:56:00 PM Stratos Karafotis wrote:
> On 06/07/2013 11:57 PM, Rafael J. Wysocki wrote:
> > On Friday, June 07, 2013 10:14:34 PM Stratos Karafotis wrote:
> >> On 06/05/2013 11:35 PM, Rafael J. Wysocki wrote:
> >>> On Wednesday, June 05, 2013 08:13:26 PM Stratos Karafotis wrote:
> >>>> Hi Borislav,
> >>>>
> >>>> On 06/05/2013 07:17 PM, Borislav Petkov wrote:
> >>>>> On Wed, Jun 05, 2013 at 07:01:25PM +0300, Stratos Karafotis wrote:
> >>>>>> Ondemand calculates load in terms of frequency and increases it only
> >>>>>> if the load_freq is greater than up_threshold multiplied by current
> >>>>>> or average frequency. This seems to produce oscillations of frequency
> >>>>>> between min and max because, for example, a relatively small load can
> >>>>>> easily saturate minimum frequency and lead the CPU to max. Then, the
> >>>>>> CPU will decrease back to min due to a small load_freq.
> >>>>>
> >>>>> Right, and I think this is how we want it, no?
> >>>>>
> >>>>> The thing is, the faster you finish your work, the faster you can become
> >>>>> idle and save power.
> >>>>
> >>>> This is exactly the goal of this patch: to use the middle frequencies
> >>>> more efficiently so that the work finishes faster.
> >>>>
> >>>>> If you switch frequencies in a staircase-like manner, you're going to
> >>>>> take longer to finish, in certain cases, and burn more power while doing
> >>>>> so.
> >>>>
> >>>> This is not true with this patch. It switches to middle frequencies
> >>>> when the load < up_threshold.
> >>>> Currently, ondemand does not increase the frequency in that case: the
> >>>> CPU runs at the lowest frequency until the load exceeds up_threshold.
> >>>>
> >>>>> Btw, racing to idle is also a good example for why you want boosting:
> >>>>> you want to go max out the core but stay within power limits so that you
> >>>>> can finish sooner.
> >>>>>
> >>>>>> This patch changes the calculation method of load and target frequency
> >>>>>> considering 2 points:
> >>>>>> - Load computation should be independent from current or average
> >>>>>> measured frequency. For example an absolute load 80% at 100MHz is not
> >>>>>> necessarily equivalent to 8% at 1000MHz in the next sampling interval.
> >>>>>> - Target frequency should be increased to any value of frequency table
> >>>>>> proportional to absolute load, instead of only to the max. Thus:
> >>>>>>
> >>>>>> Target frequency = C * load
> >>>>>>
> >>>>>> where C = policy->cpuinfo.max_freq / 100
> >>>>>>
> >>>>>> Tested on Intel i7-3770 CPU @ 3.40GHz and on Quad core 1500MHz Krait.
> >>>>>> Phoronix benchmark of Linux Kernel Compilation 3.1 test shows an
> >>>>>> increase ~1.5% in performance. cpufreq_stats (time_in_state) shows
> >>>>>> that middle frequencies are used more, with this patch. Highest
> >>>>>> and lowest frequencies were used less by ~9%
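
For reference, the proportional rule quoted above (Target frequency = C * load,
with C = policy->cpuinfo.max_freq / 100) can be sketched in plain C as below.
This is only an illustration under stated assumptions, not the code from the
patch: target_freq_khz() is a hypothetical helper name, frequencies are assumed
to be in kHz, and the load is assumed to be an absolute percentage clamped to
0..100. The sketch multiplies before dividing to avoid losing precision in
integer arithmetic, which may round slightly differently from computing C
first.

```c
/*
 * Hypothetical helper (not taken from the patch): map an absolute load,
 * given in percent, to a target frequency in kHz using
 *     target = load * max_freq / 100
 */
static unsigned int target_freq_khz(unsigned int max_freq_khz,
				    unsigned int load_pct)
{
	/* Assumption: loads above 100% are treated as 100%. */
	if (load_pct > 100)
		load_pct = 100;

	/* Widen before multiplying so the product cannot overflow. */
	return (unsigned int)((unsigned long long)max_freq_khz * load_pct / 100);
}
```

With max_freq_khz = 3400000 (the 3.40GHz part mentioned above), a 50% load
maps to 1700000 kHz. The result would still have to be matched to an entry of
the frequency table, which this sketch does not attempt.
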
> >>>
> >>> Can you also use powertop to measure the percentage of time spent in idle
> >>> states for the same workload with and without your patchset? Also, it would
> >>> be good to measure the total energy consumption somehow ...
> >>>
> >>> Thanks,
> >>> Rafael
> >>
> >> Hi Rafael,
> >>
> >> I repeated the tests, this time also collecting powertop results.
> >> Measurement steps with and without this patch:
> >> 1) Reboot the system
> >> 2) Run the Phoronix benchmark of the Linux Kernel Compilation 3.1 test
> >> twice without taking measurements
> >> 3) Wait a few minutes
> >> 4) Run Phoronix and powertop for 100 seconds and take measurements.
> >
> > Well, while this is not conclusive, it definitely looks very promising. :-)
> >
> > We're seeing measurable performance improvement with the patchset applied *and*
> > more time spent in idle states both at the same time. I'd be very surprised if
> > the energy consumption measurements did not confirm that the patchset allowed
> > us to reduce it.
> >
> > If my computations are correct (somebody please check), the cores spent about
> > 20% more time in idle on the average with the patchset applied and in addition
> > to that the cc6 residency was greater by about 2% on the average with respect
> > to the kernel without the patchset.
> >
> > We need to verify if there are gains (or at least no regressions) with other
> > workloads, but since this *also* reduces code complexity quite a bit, I'm
> > seriously considering taking it.
> >
> >> I will try to repeat the test and take measurements with turbostat as
> >> Borislav suggested.
> >
> > Please do!
> >
> > Thanks,
> > Rafael
> >
>
> Hi,
>
> I repeated the tests, this time collecting results from turbostat.
> Measurement steps with and without this patch:
> 1) Reboot the system
> 2) Run the Phoronix benchmark of the Linux Kernel Compilation 3.1 test
> twice without taking measurements
> 3) Wait a few minutes
> 4) Run Phoronix and turbostat (-i 100) and take measurements

You need to do something like

# ./turbostat <command invoking the phoronix suite>

Did you do that?

Rafael


--
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.