Re: [PATCH RFC] rcu: Make __rcu_read_lock() inlinable

From: Paul E. McKenney
Date: Tue Mar 27 2012 - 12:57:28 EST


On Tue, Mar 27, 2012 at 04:06:08PM +0800, Lai Jiangshan wrote:
> On 03/26/2012 04:52 AM, Paul E. McKenney wrote:
>
> > +void rcu_switch_from(void)
> > {
> > - current->rcu_read_lock_nesting++;
> > - barrier(); /* needed if we ever invoke rcu_read_lock in rcutree.c */
> > + current->rcu_read_lock_nesting_save =
> > + __this_cpu_read(rcu_read_lock_nesting);
> > + barrier();
> > + __this_cpu_write(rcu_read_lock_nesting, 0);
>
> - __this_cpu_write(rcu_read_lock_nesting, 0);
> + __this_cpu_write(rcu_read_lock_nesting, 1);
>
> if prev or next task has non-zero rcu_read_unlock_special,
> "__this_cpu_write(rcu_read_lock_nesting, 1)" will prevent wrong qs reporting
> when rcu_read_unlock() is called in any interrupt/tracing while doing switch_to().

This is one approach that I have been considering. I am concerned about
interactions with ->rcu_read_unlock_special, however. The approach that I
am favoring at the moment is to save and restore ->rcu_read_unlock_special
from another per-CPU variable, which would allow that per-CPU variable to
be zeroed at this point. Then because there can be no preemption at this
point in the code, execution would stay out of rcu_read_unlock_special()
for the duration of the context switch.
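
To make that concrete, here is a minimal userspace sketch of the
save/restore idea. It is a model under stated assumptions, not the patch:
plain globals stand in for the per-CPU variables, and struct
rcu_task_model, its rcu_read_unlock_special_save field, and every model_*
function are hypothetical names invented for illustration.

```c
#include <assert.h>

/* Hypothetical stand-in for the relevant task_struct fields. */
struct rcu_task_model {
	int rcu_read_lock_nesting_save;
	int rcu_read_unlock_special_save;	/* assumed field, not in the patch */
};

/* Plain globals model the per-CPU variables of a single CPU. */
static int cpu_rcu_read_lock_nesting;
static int cpu_rcu_read_unlock_special;
static int special_calls;	/* entries into the modeled slow path */

static void model_rcu_read_lock(void)
{
	cpu_rcu_read_lock_nesting++;
}

static void model_rcu_read_unlock(void)
{
	/* Slow path is taken only on the outermost unlock with special set. */
	if (--cpu_rcu_read_lock_nesting == 0 && cpu_rcu_read_unlock_special)
		special_calls++;
}

/* Outgoing task: stash both values, then zero the per-CPU copies. */
static void model_switch_from(struct rcu_task_model *prev)
{
	prev->rcu_read_lock_nesting_save = cpu_rcu_read_lock_nesting;
	prev->rcu_read_unlock_special_save = cpu_rcu_read_unlock_special;
	cpu_rcu_read_lock_nesting = 0;
	cpu_rcu_read_unlock_special = 0;
}

/* Incoming task: restore both values from the task's saved copies. */
static void model_switch_to(struct rcu_task_model *next)
{
	/* Readers that ran during the switch must have been balanced. */
	assert(cpu_rcu_read_lock_nesting == 0);
	cpu_rcu_read_lock_nesting = next->rcu_read_lock_nesting_save;
	cpu_rcu_read_unlock_special = next->rcu_read_unlock_special_save;
}

/* A reader in an interrupt handler; returns slow-path entries it caused. */
static int model_interrupt_reader(void)
{
	int before = special_calls;

	model_rcu_read_lock();
	model_rcu_read_unlock();
	return special_calls - before;
}
```

In this model, calling model_interrupt_reader() between
model_switch_from() and model_switch_to() returns 0 even if the outgoing
task had a non-zero special value, because the per-CPU copy was zeroed.
The real code would of course also need the barrier() calls and per-CPU
accessors shown in the patch.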

> > +}
> > +
> > +/*
> > + * Restore the incoming task's value for rcu_read_lock_nesting at the
> > + * end of a context switch.
> > + */
> > +void rcu_switch_to(void)
> > +{
> > + __this_cpu_write(rcu_read_lock_nesting,
> > + current->rcu_read_lock_nesting_save);
> > + barrier();
> > + current->rcu_read_lock_nesting_save = 0;
> > }
>
> - barrier();
> - current->rcu_read_lock_nesting_save = 0;
>
> rcu_read_lock_nesting_save is set but not used before next set here, just remove it.

Yep, as noted earlier.

> I don't like it hooks too much into scheduler.
>
> Approaches:
> 0) stay using function call
> 1) hook into kbuild(https://lkml.org/lkml/2011/3/27/170,https://lkml.org/lkml/2011/3/27/171)
> 2) hook into scheduler(still need more works for rcu_read_unlock())
> 3) Add rcu_read_lock_nesting to thread_info like preempt_count
> 4) resolve header-file dependence
>
> For me
> 3=4>1>2>0

The advantage of the current per-CPU-variable approach is that it
permits x86 to reduce rcu_read_lock() to a single instruction, so it
seems worth pursuing. In addition, having RCU-preempt hook
at switch_to() eliminates needless task queuing in the case where the
scheduler is entered, but no context switch actually takes place.
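
The queuing point can be illustrated with a small userspace model. This
is only a sketch under assumptions: struct sched_task_model, queue_ops,
and both model_schedule_* functions are hypothetical names, and "queuing"
is reduced to a counter rather than RCU-preempt's real blocked-task lists.

```c
#include <assert.h>

/* Hypothetical task: read-side nesting depth and a "queued" flag. */
struct sched_task_model {
	int nesting;
	int queued;
};

static int queue_ops;	/* number of times a task was queued */

/* Queue a task that is leaving the CPU inside a read-side section. */
static void model_queue_task(struct sched_task_model *t)
{
	if (t->nesting > 0 && !t->queued) {
		t->queued = 1;
		queue_ops++;
	}
}

/* Hook at scheduler entry: runs even if the same task is picked again. */
static struct sched_task_model *
model_schedule_entry_hook(struct sched_task_model *prev,
			  struct sched_task_model *next)
{
	model_queue_task(prev);
	return next;
}

/* Hook at switch_to(): runs only when the task actually changes. */
static struct sched_task_model *
model_schedule_switch_hook(struct sched_task_model *prev,
			   struct sched_task_model *next)
{
	if (prev != next)
		model_queue_task(prev);
	return next;
}
```

When prev == next, the entry-hook variant queues the task anyway, while
the switch_to() variant does no queuing work at all.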

Thanx, Paul
