[PATCH] remove rwsem lock from CPUFREQ_GOV_STOP call (second callsite)

From: Mathieu Desnoyers
Date: Mon Jun 08 2009 - 11:23:33 EST


* Dave Jones (davej@xxxxxxxxxx) wrote:
> On Mon, Jun 08, 2009 at 08:48:45AM -0400, Mathieu Desnoyers wrote:
>
> > > > >> Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=13475
> > > > >> Subject : suspend/hibernate lockdep warning
> > > > >> References : http://marc.info/?l=linux-kernel&m=124393723321241&w=4
> > > >
> > > > I suspect the following commit, after revert this patch I test 5 times
> > > > without lockdep warnings.
> > > >
> > > > commit b14893a62c73af0eca414cfed505b8c09efc613c
> > > > Author: Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxx>
> > > > Date: Sun May 17 10:30:45 2009 -0400
> > > >
> > > > [CPUFREQ] fix timer teardown in ondemand governor
> > >
> > > The patch is probably not at fault here. I suspect it's some latent bug
> > > that simply got exposed by the change to cancel_delayed_work_sync(). In
> > > any case, Mathieu, can you take a look at this please?
> >
> > Yes, it's been looked at and discussed on the cpufreq ML. The short
> > answer is that they plan to re-engineer cpufreq and remove the policy
> > rwlock taken around almost every operations at the cpufreq level.
> >
> > The short-term solution, which is recognised as ugly, would be do to the
> > following before doing the cancel_delayed_work_sync() :
> >
> > unlock policy rwlock write lock
> >
> > lock policy rwlock write lock
> >
> > It basically works because this rwlock is unneeded for teardown, hence
> > the future re-work planned.
> >
> > I'm sorry I cannot prepare a patch current... I've got quite a few pages
> > of Ph.D. thesis due for the beginning of July.
>
> I'm kinda scared to touch this code at all for .30 due to the number of
> unexpected gotchas we seem to run into every time we touch something
> locking related. So I'm inclined to just live with the lockdep warning
> for .30, and see how the real fixes look for .31, and push them back
> as -stable updates if they work out.
>
>
> Venki, what are your thoughts?
>

Hi Dave,

I've looked through the cpufreq code, and the following patch should
address the call site I've missed in commit
42a06f2166f2f6f7bf04f32b4e823eacdceafdc9. I've followed all
__cpufreq_set_policy call sites within cpufreq.c to make sure they all
hold the rwsem write lock. An extra round of review would be good
though.

Can someone try the following patch and see if it fixes the regression ?
My test machine is currently busy running long formal verifications, and
therefore unavailable for kernel patch testing. It compiles fine on a
2.6.30-rc5 kernel with my (now mainlined) cpufreq patches applied.

Mathieu


remove rwsem lock from CPUFREQ_GOV_STOP call (second call site)

commit 42a06f2166f2f6f7bf04f32b4e823eacdceafdc9

Missed a call site for CPUFREQ_GOV_STOP to remove the rwlock taken around the
teardown. To make a long story short, the rwlock write-lock causes a circular
dependency with cancel_delayed_work_sync(), because the timer handler takes the
read lock.

Note that all callers to __cpufreq_set_policy are taking the rwsem. All sysfs
callers (writers) hold the write rwsem at the earliest sysfs calling stage.

However, the rwlock write-lock is not needed upon governor stop.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxx>
CC: rjw@xxxxxxx
CC: mingo@xxxxxxx
CC: Shaohua Li <shaohua.li@xxxxxxxxx>
CC: Pekka Enberg <penberg@xxxxxxxxxxxxxx>
CC: Dave Young <hidave.darkstar@xxxxxxxxx>
CC: "Rafael J. Wysocki" <rjw@xxxxxxx>
CC: Rusty Russell <rusty@xxxxxxxxxxxxxxx>
CC: trenn@xxxxxxx
CC: sven.wegener@xxxxxxxxxxx
CC: Venkatesh Pallipadi <venkatesh.pallipadi@xxxxxxxxx>
CC: cpufreq@xxxxxxxxxxxxxxx
---
drivers/cpufreq/cpufreq.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)

Index: linux-2.6-lttng/drivers/cpufreq/cpufreq.c
===================================================================
--- linux-2.6-lttng.orig/drivers/cpufreq/cpufreq.c 2009-06-08 10:20:48.000000000 -0400
+++ linux-2.6-lttng/drivers/cpufreq/cpufreq.c 2009-06-08 10:48:52.000000000 -0400
@@ -1697,8 +1697,17 @@ static int __cpufreq_set_policy(struct c
dprintk("governor switch\n");

/* end old governor */
- if (data->governor)
+ if (data->governor) {
+ /*
+ * Need to release the rwsem around governor
+ * stop due to lock dependency between
+ * cancel_delayed_work_sync and the read lock
+ * taken in the delayed work handler.
+ */
+ unlock_policy_rwsem_write(data->cpu);
__cpufreq_governor(data, CPUFREQ_GOV_STOP);
+ lock_policy_rwsem_write(data->cpu);
+ }

/* start new governor */
data->governor = policy->governor;


--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/