Re: [PATCH] cpufreq, store_scaling_governor requires policy->rwsem to be held for duration of changing governors [v2]

From: Saravana Kannan
Date: Thu Jul 31 2014 - 20:55:52 EST


On 07/31/2014 03:58 PM, Prarit Bhargava wrote:


On 07/31/2014 06:13 PM, Saravana Kannan wrote:
On 07/31/2014 02:08 PM, Prarit Bhargava wrote:


On 07/31/2014 04:38 PM, Saravana Kannan wrote:
On 07/31/2014 01:30 PM, Prarit Bhargava wrote:


On 07/31/2014 04:24 PM, Saravana Kannan wrote:

Prarit,

I'm not an expert on sysfs locking, but I would think the specific sysfs lock
would depend on the file/attribute group. So, can you please try to hotplug a
core in/out (to trigger the POLICY_EXIT) and then read a sysfs file
exported by
the governor? scaling_governor doesn't cut it since that file is not
removed on
policy exit event to governor. If it's ondemand, try reading/write it's
sampling
rate file.

Thanks Saravana -- will do. I will get back to you shortly on this.


Thanks. Btw, in case you weren't already aware of it. You'll have to hoplug out
all the CPUs in a cluster to trigger a POLICY_EXIT for that cluster/policy.

Yep -- the affected_cpus file should show all the cpus in the policy IIRC. One
of the systems I have has 1 cpu/policy and has 48 threads so the POLICY_EXIT is
called.

I'll put something like

while [1];
do
echo ondemand > /sys/devices/system/cpu/cpu1/cpufreq/scaling_governor
cat /sys/devices/system/cpu/cpufreq/ondemand/sampling_rate
echo 20000 > /sys/devices/system/cpu/cpufreq/ondemand/sampling_rate
cat /sys/devices/system/cpu/cpufreq/ondemand/sampling_rate
echo 0 > /sys/devices/system/cpu/cpu1/online
sleep 1
echo 1 > /sys/devices/system/cpu/cpu1/online
sleep 1
done


The actual race can only happen with 2 threads. I'm just trying to trigger a
lockdep warning here.

I ran the above in two separate terminals with cpuset -c 0 and cpuset -c 1 to
multi-thread it all. No deadlock or LOCKDEP trace after about 1/2 hour, so I
think we're in the clear on that concern.


I wasn't convinced. So, I took some help from Stephen to test it.

It's been a while, so I didn't remember the original issue clearly when I gave you some test suggestions. Now that I looked at the code more closely, I have a proper way to reproduce the original issue.

Nack for this patch for 2 reasons:
1. You seem to have accidentally removed a GOV_STOP in your patch. We definitely can't do that. This broke changing governors and that's why your patch didn't cause any issues. Because all your governor echos were failing.
2. When we fixed that and actually tried a proper test (not the one I gave you), we reproduced the original issue.

To reproduce original issue:
Preconditions:
* lockdep is enabled
* governor per policy is enabled

Steps:
1. Set governor to ondemand.
2. Cat one of the ondemand sysfs files.
3. Change governor to conservative.

When you do that, there's an AB, BA dead lock issue with one thread trying to cat a governor sysfs file and another thread trying to change governors.

-Saravana

--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/