Re: [PATCH 3/3] cpufreq: schedutil: remove redundant code from sugov_next_freq_shared()

From: Rafael J. Wysocki
Date: Wed Mar 08 2017 - 05:50:16 EST


On Wed, Mar 8, 2017 at 5:18 AM, Viresh Kumar <viresh.kumar@xxxxxxxxxx> wrote:
> On 07-03-17, 14:19, Rafael J. Wysocki wrote:
>> On Tue, Mar 7, 2017 at 11:31 AM, Viresh Kumar <viresh.kumar@xxxxxxxxxx> wrote:
>> > Why do you think so? I thought all CPU in the policy can have the RT/DL flag set
>> > and the probability of all of them is just the same.
>>
>> Well, yes, but if the current CPU has that flag set already, we surely
>> don't need to check the other ones in the policy?
>
> That's true for every other CPU in policy too..

Not exactly.

The flags value for the current CPU is in a hot cache line already (if
not in a register) and it is not necessary to chase a pointer (and
possibly fetch a new cache line) to get to it.

That also applies to util and max for the current CPU, but the benefit
here is debatable.

>> >> So to the point, the code was written this way on purpose and not just
>> >> by accident as your changelog suggests and
>> >
>> > I didn't wanted to convey that really and I knew that it was written on purpose.
>> >
>> >> if you want to change it, you need numbers.
>> >
>> > What kind of numbers can we get for such a change ? I tried to take the running
>> > average of the time it takes to execute this routine over 10000 samples, but it
>> > varies a lot even with the same build. Any tests like hackbench, etc wouldn't be
>> > of any help as well.
>>
>> So why do you think it needs to be changed, but really?
>>
>> Is that because it is particularly hard to follow or similar?
>
> Just that I didn't like keeping the same code at two places (outside
> and inside the loop) and the benefit it has.

So there are two things here, the flags check and the invocation of
sugov_iowait_boost() for the current CPU.

I claim that the flags check is a clear benefit due to what I said above.

The other thing is a way to initialize util and max to sensible
values. It also can be done the way you did it and that change should
not affect the execution time.

So overall, maybe you can move the flags check to
sugov_update_shared(), so that you don't need to pass flags to
sugov_next_freq_shared(), and then do what you did to util and max.

But that would be a 4.12 change anyway.

> Anyway, its not straight forward to get any numbers supporting my
> argument. I can claim improvement only theoretically by comparing the
> number of comparisons that we may end up doing for quad or octa core
> policies. Lets abandon this patch as I failed to convince you :)
>
> Thanks for applying the other two patches though.

No problem.

Thanks,
Rafael