Re: Regression on cpufreq in v3.12-rc1

From: Srivatsa S. Bhat
Date: Thu Sep 19 2013 - 14:21:31 EST


On 09/19/2013 11:41 PM, Srivatsa S. Bhat wrote:
> On 09/19/2013 06:28 PM, Srivatsa S. Bhat wrote:
>> On 09/19/2013 06:25 PM, Linus Walleij wrote:
>>> On Thu, Sep 19, 2013 at 2:46 PM, Srivatsa S. Bhat
>>> <srivatsa.bhat@xxxxxxxxxxxxxxxxxx> wrote:
>>>
>>>>>> I don't really know if this is the right solution at all, so please
>>>>>> help me out here... if you want that patch I can send it once
>>>>>> I understand this properly.
>>>>
>>>> IIRC, recent kernels didn't return 0 or any error code when the !policy
>>>> condition was matched. So can you check whether this problem occurs with
>>>> 3.11 or 3.10 as well?
>>>
>>> v3.11 works fine.
>>>
>
> Ok, now that I got a chance to look at the code and the git logs, I think
> I have a clearer idea of what is happening.
>
> Basically commit 474deff74 (cpufreq: remove cpufreq_policy_cpu per-cpu
> variable) exposed a hidden issue. Before that commit, the code looked like
> this:
>
> static DEFINE_PER_CPU(int, cpufreq_policy_cpu);
>
> [...]
>
> static int lock_policy_rwsem_##mode(int cpu) \
> { \
> int policy_cpu = per_cpu(cpufreq_policy_cpu, cpu); \
> BUG_ON(policy_cpu == -1); \
> down_##mode(&per_cpu(cpu_policy_rwsem, policy_cpu)); \
> \
> return 0; \
> }
>
> But there was no code to set the per-cpu values to -1 to begin with. Since
> the per-cpu variable was defined as static, it would have been initialized
> to zero. Thus, we would never actually hit the BUG_ON() condition, since
> policy_cpu didn't turn out to be -1.
>
> (The per-cpu variable was set to -1 only during error in __cpufreq_add_dev
> and during cpufreq_remove_dev, both of which are irrelevant to the scenario
> we are dealing with here).
>
> So, code ended up accessing and locking CPU 0's rwsemaphore. But how was
> its initialization taken care of? The answer is the initcall sequence.
> Cpufreq core code initialized itself at the core_initcall stage, and the
> sa1100 cpufreq driver initialized itself at the arch_initcall stage, like
> Viresh mentioned. And core-initcall comes before arch-initcall. So the
> cpufreq core would have finished initializing the per-cpu rwsemaphores,
> so that locking/unlocking them wouldn't crash the system or get complaints
> from lockdep.
>
> To summarize, I think before commit 474deff74, we were accessing CPU0's
> rwsems (perhaps incorrectly) and due to the lacking initialization of the
> per-cpu vars, the BUG_ON() would never fire.
>
> To confirm this, you can perhaps try out one commit before 474deff74 and see
> if it works for you. Or equivalently, you can apply the following patch
> (revert of 474deff74) over mainline and see if it works.

Just before sending, I rebased that patch on top of Rafael's linux-next
branch, but forgot to update the above sentence. However I think the same
patch might apply cleanly over mainline as well.

Regards,
Srivatsa S. Bhat

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/