Re: [PATCH 2/5] cpufreq: governor: Create separate sysfs-ops

From: Saravana Kannan
Date: Tue Feb 02 2016 - 23:03:29 EST


On 02/02/2016 05:52 PM, Rafael J. Wysocki wrote:
On Wed, Feb 3, 2016 at 2:32 AM, Saravana Kannan <skannan@xxxxxxxxxxxxxx> wrote:
On 02/02/2016 05:07 PM, Rafael J. Wysocki wrote:

On Wed, Feb 3, 2016 at 12:42 AM, Rafael J. Wysocki <rafael@xxxxxxxxxx>
wrote:

On Tue, Feb 2, 2016 at 11:21 PM, Saravana Kannan <skannan@xxxxxxxxxxxxxx>
wrote:

On 02/02/2016 11:40 AM, Rafael J. Wysocki wrote:


On Tue, Feb 2, 2016 at 6:01 PM, Juri Lelli <juri.lelli@xxxxxxx> wrote:



[cut]


I also don't like this patch because it forces governors to either
implement
their own macros and management of their attributes or force them to use
the
governor structs that come with cpufreq_governor.h. cpufreq_governor.h
IMHO
is very ondemand and conservative governor specific and is very
irrelevant
for sched-dvfs or any other governors (hint hint).

The only time this ABBA locking is an issue is when governor are
changing
and trying to add/remove attributes. That can easily be checked in
store_governor and dealt with without holding the policy rwsem if the
governors can provide their per sys and per policy attribute arrays as
part
of registering themselves.

I'm sorry that I just keep talking about the idea and not sending out
the
patches.


I think you have a point, though.

The deadlock really is specific to the governors using the code in
cpufreq_governor.c.


That said no other governors in the tree use any sysfs attributes for
tunables AFAICS and the out-of-the tree ones are out of interest here.


But if we are expecting sched dvfs to come in, why make it worse for it. It
would be completely pointless to try and shoehorn sched dvfs to use
cpufreq_governor.c

Well, do you honestly think that using the existing stuff in it would
be a good idea?

If not, then why it matters at all?

Also the deadlock happens if one of the tunable attributes is accessed
while we're trying to remove it which very well may happen on read
access too.

Isn't this THE deadlock we are talking about? The removal of the attributes
only happen when governors are changes and we send a POLICY_EXIT and or all
the cores are hotplugged out.

It generally happens when the "old" governor is going away, whatever the reason.

And my suggestion would work just as well there.

Why are you prefixing your sentence with "Also"? Is there some other case
I'm not considering?

Say someone is reading sampling_rate for a policy with 1 CPU in it and
someone else is taking the CPU offline. The governor EXIT code path
(that will trigger as a result) will try to remove the sampling_rate
attribute and (if it does that under policy->rwsem) it'll wait for the
read access to finish. Where exactly would you put the deadlock
prevention in this case?

This is the hotplug case I mentioned. The sysfs file removals will happen only for the last CPU in that policy (we thankfully optimized that part last year). We also know that multiple CPUs can't be hotplugged at the same time. So, in the start of cpufreq_offline_prepare, we just need to check if this is the last CPU in the policy and if that's the case, do the gov sysfs remove and then grab the policy lock and do all our crap. If a read is going on, that's going to finish before the sysfs attr remove can go ahead and it can grab the policy lock if it needs to and that still won't cause a deadlock because we haven't yet grabbed the policy lock in cpufreq_offline_prepare(). If the read comes after the sysfs remove, then the read is obviously going to fail (we can depend on the sysfs framework on doing its job there).

Will that still leave any race conditions in?

-Saravana
--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project