Re: Regression on cpufreq in v3.12-rc1

From: Srivatsa S. Bhat
Date: Thu Sep 19 2013 - 08:50:50 EST


On 09/19/2013 04:11 AM, Rafael J. Wysocki wrote:
> On Wednesday, September 18, 2013 11:21:45 PM Linus Walleij wrote:
>> Hi Rafael, Viresh,
>>
>> I'm seeing this problem and maybe you can help me out fixing it
>> properly:
>>
>> On some machines like the StrongARM SA1100 it seems that
>> cpufreq_get() can be called before the cpufreq driver and thus the
>> policy is set, resulting in a crash like this:
>
> Did you try the current linux-next branch from my tree?
>
> Rafael
>
>
>> ------------[ cut here ]------------
>> kernel BUG at /home/linus/linux/drivers/cpufreq/cpufreq.c:80!
>> Internal error: Oops - BUG: 0 [#1] ARM
>> Modules linked in:
>> CPU: 0 PID: 1 Comm: swapper Not tainted 3.12.0-rc1-00001-g1266dae-dirty #17
>> task: c1830000 ti: c1832000 task.ti: c1832000
>> (...)
>> Backtrace:
>> [<c01ea1a4>] (lock_policy_rwsem_read+0x0/0x48) from [<c01eb5c8>]
>> (cpufreq_get+0x34/0x68)
>> [<c01eb594>] (cpufreq_get+0x0/0x68) from [<c0185908>]
>> (sa1100fb_activate_var+0xdc/0x3ac)
>> r5:00000003 r4:0000000a
>> [<c018582c>] (sa1100fb_activate_var+0x0/0x3ac) from [<c0185c78>]
>> (sa1100fb_set_par+0xa0/0xa8)
>> [<c0185bd8>] (sa1100fb_set_par+0x0/0xa8) from [<c0180418>]
>> (fbcon_init+0x444/0x4a8)
>> r4:c1803200
>> [<c017ffd4>] (fbcon_init+0x0/0x4a8) from [<c019a8b4>] (visual_init+0x78/0xc8)
>> [<c019a83c>] (visual_init+0x0/0xc8) from [<c01a0010>]
>> (do_bind_con_driver+0x160/0x310)
>>
>> This bug comes from the framebuffer but I first encountered it
>> in the PCMCIA driver, and both seem to cause the bug.
>>
>> In the past I think things worked smoothly: the consumers
>> calling cpufreq_get() too early would just get 0 back.
>> (Or so it seems to me.)
>>
>> The BUG() statement causing it is in the lock_policy_rwsem_##mode(int cpu)
>> macro.
>>
>> Applying a patch like this seems to fix the issue:
>>
>> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
>> index 43c24aa..4977b4b 100644
>> --- a/drivers/cpufreq/cpufreq.c
>> +++ b/drivers/cpufreq/cpufreq.c
>> @@ -70,7 +70,8 @@ static DEFINE_PER_CPU(struct rw_semaphore, cpu_policy_rwsem);
>> static int lock_policy_rwsem_##mode(int cpu) \
>> { \
>> struct cpufreq_policy *policy = per_cpu(cpufreq_cpu_data, cpu); \
>> - BUG_ON(!policy); \
>> + if(!policy) \
>> + return 0; \
>> down_##mode(&per_cpu(cpu_policy_rwsem, policy->cpu)); \
>> \
>> return 0; \
>> @@ -83,7 +84,8 @@ lock_policy_rwsem(write, cpu);
>> static void unlock_policy_rwsem_##mode(int cpu)
>> \
>> { \
>> struct cpufreq_policy *policy = per_cpu(cpufreq_cpu_data, cpu); \
>> - BUG_ON(!policy); \
>> + if(!policy) \
>> + return; \
>> up_##mode(&per_cpu(cpu_policy_rwsem, policy->cpu)); \
>> }
>>
>> @@ -1423,6 +1425,9 @@ static unsigned int __cpufreq_get(unsigned int cpu)
>> struct cpufreq_policy *policy = per_cpu(cpufreq_cpu_data, cpu);
>> unsigned int ret_freq = 0;
>>
>> + if (!policy)
>> + return ret_freq;
>> +
>> if (!cpufreq_driver->get)
>> return ret_freq;
>>
>> I don't really know if this is the right solution at all, so please
>> help me out here... if you want that patch I can send it once
>> I understand this properly.
>>

IIRC, recent kernels didn't return 0 or any error code when the !policy
condition was matched. So can you check whether this problem occurs with
3.11 or 3.10 as well?

In case the problem is unique to 3.12-rc1, I guess some recent commit changed
the ordering somewhere, which lead to premature invocations of cpufreq_get().
So I think we should first identify (bisect?) and understand what caused that
particular change and then we will be in a position to evaluate whether the
patch you proposed would be the right fix or not.

Regards,
Srivatsa S. Bhat

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/