Re: [PATCH -tip] Re: [PATCH -tip 0/2] Temporary RCU fixes for notraceand hotplug CPU

From: Lai Jiangshan
Date: Tue Aug 25 2009 - 20:53:20 EST


Paul E. McKenney wrote:
> On Tue, Aug 25, 2009 at 02:02:33PM -0400, Mathieu Desnoyers wrote:
>> * Paul E. McKenney (paulmck@xxxxxxxxxxxxxxxxxx) wrote:
>>> On Tue, Aug 25, 2009 at 12:25:49PM -0400, Mathieu Desnoyers wrote:
>>>> * Paul E. McKenney (paulmck@xxxxxxxxxxxxxxxxxx) wrote:
>>>>> On Tue, Aug 25, 2009 at 10:00:47AM +0200, Ingo Molnar wrote:
>>>>>> btw., i'm still seeing crashes with the latest RCU bits:
>>>>>>
>>>>>> [ 20.621740] Testing event sys_enter_futex: OK
>>>>>> [ 20.629738] Testing event sys_exit_futex: OK
>>>>>> [ 20.637737] Testing event lock_acquire: [reboot]
>>>>>>
>>>>>> Possibly due to infinite recursion as well. Config attached.
>>>>> Color me confused...
>>>>>
>>>>> Unless someone has a better idea, I will send in a patch that adds
>>>>> "notrace" to every RCU API member used by any file in the kernel
>>>>> that has "trace" in its name (excluding ptrace.c and rcutree_trace.c,
>>>>> of course). This list is as follows:
>>>>>
>>>>> call_rcu()
>>>>> call_rcu_sched()
>>>>> rcu_read_lock()
>>>>> rcu_read_unlock()
>>>>>
>>>>> So, any better ideas?
>>>> Tracers using RCU should use the _notrace() version of read_lock/unlock.
>>>> I think the callers should be fixed rather than RCU.
>>>>
>>>> Tracepoints have been designed to use the _notrace variant on the
>>>> instrumentation site. The core of tracepoint management use
>>>> call_rcu_sched(), which can be traced without any problem.
>>>>
>>>> I have not followed the late tracing development as closely though, so
>>>> errors might have crept in.
>>> Or I might have inadvertently broken something in a non-obvious (to me,
>>> anyway) manner.
>>>
>>> So, would you be willing to look at commit bc33f24bd in the -tip
>>> tree and see if there is anything I broke other than the now-fixed
>>> rcu_read_lock_sched_notrace() and rcu_read_unlock_sched_notrace()?
>>> And for that matter, whether my alleged fix for these two API members
>>> really does fix the problem (-tip commit 7c614d6461)?
>>>
>>> The -tip tree is at:
>>>
>>> git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip.git
>>>
>>> And these patches are on the tip/core/rcu branch.
>>>
>> sure, here we go:
>>
>> static inline void rcu_read_lock_sched_notrace(void)
>> {
>> preempt_disable_notrace();
>> + __acquire(RCU_SCHED);
>> + rcu_read_acquire();
>> }
>>
>> and
>>
>>
>> static inline void rcu_read_unlock_sched_notrace(void)
>> {
>> + rcu_read_release();
>> + __release(RCU_SCHED);
>> preempt_enable_notrace();
>> }
>>
>> will make those _notrace primitives call into lockdep. I don't think
>> this is correct, and this might be causing your problem.
>>
>> rcu_read_acquire/release are calling lock_acquire/release, those should
>> be removed.
>>
>> __acquire() simply seems to be defined to a gcc "context" attribute,
>> probably for the sparse checker. I think it should be safe to leave them
>> there.
>
> Thank you for looking this over, Mathieu!!! Please see below for patch.
>
> Thanx, Paul
>
> ------------------------------------------------------------------------
>
> Remove lockdep annotations from RCU's _notrace() API members.
>
> The lockdep annotations rcu_read_acquire() and rcu_read_release()
> might lead to infinite looping if called from lockdep. So this patch
> removes them.
>
> Suggested-by: Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxx>
> Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx>
>
> rcupdate.h | 2 --
> 1 file changed, 2 deletions(-)
>
> diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
> index 8b4422c..95e0615 100644
> --- a/include/linux/rcupdate.h
> +++ b/include/linux/rcupdate.h
> @@ -195,7 +195,6 @@ static inline notrace void rcu_read_lock_sched_notrace(void)
> {
> preempt_disable_notrace();
> __acquire(RCU_SCHED);
> - rcu_read_acquire();
> }
>
> /*
> @@ -211,7 +210,6 @@ static inline void rcu_read_unlock_sched(void)
> }
> static inline notrace void rcu_read_unlock_sched_notrace(void)
> {
> - rcu_read_release();
> __release(RCU_SCHED);
> preempt_enable_notrace();
> }
>
>


I remembered I have suggested this ...

Reviewed-by: Lai Jiangshan <laijs@xxxxxxxxxxxxxx>

Lai Jiangshan wrote:
>
>> {
>> preempt_disable_notrace();
>> + __acquire(RCU_SCHED);
>> + rcu_read_acquire();
>> }
>>
>
> It may cause infinity recursion.
> rcu_read_acquire() calls rcu_read_lock_sched_notrace()
> before current->lockdep_recursion is set to 1 when tracing in on,
> thus infinity recursion occurs.
>
> Lai
>


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/