Consolidating RCU-bh, RCU-preempt, and RCU-sched

From: Paul E. McKenney
Date: Thu Jul 12 2018 - 20:00:40 EST


Hello!

I now have a semi-reasonable prototype of changes consolidating the
RCU-bh, RCU-preempt, and RCU-sched update-side APIs in my -rcu tree.
There are likely still bugs to be fixed and probably other issues as well,
but a prototype does exist.

Assuming continued good rcutorture results and no objections, I am
thinking in terms of this timeline:

o Preparatory work and cleanups are slated for the v4.19 merge window.

o The actual consolidation and post-consolidation cleanups are slated
for the merge window after v4.19 (v5.0?). These cleanups include
the replacements called out below within the RCU implementation
itself (but excluding kernel/rcu/sync.c, see question below).

o Replacement of now-obsolete update APIs is slated for the second
merge window after v4.19 (v5.1?). The replacements are currently
expected to be as follows:

synchronize_rcu_bh()           -> synchronize_rcu()
synchronize_rcu_bh_expedited() -> synchronize_rcu_expedited()
call_rcu_bh()                  -> call_rcu()
rcu_barrier_bh()               -> rcu_barrier()
synchronize_sched()            -> synchronize_rcu()
synchronize_sched_expedited()  -> synchronize_rcu_expedited()
call_rcu_sched()               -> call_rcu()
rcu_barrier_sched()            -> rcu_barrier()
get_state_synchronize_sched()  -> get_state_synchronize_rcu()
cond_synchronize_sched()       -> cond_synchronize_rcu()
synchronize_rcu_mult()         -> synchronize_rcu()

I have done light testing of these replacements with good results.
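
As a rough illustration only, the wait-style replacements listed above
could be pictured as thin shims during a transition period. This is a
userspace mock, not kernel code: the stubs merely count calls, and the
shim macros, the counters, and call_obsolete_apis_once() are hypothetical
names made up for this sketch. The callback-taking and polling APIs
(call_rcu_bh(), get_state_synchronize_sched(), and so on) are left out
to keep it short.

```c
#include <assert.h>

/*
 * Userspace mock, NOT kernel code: each stub counts calls instead of
 * waiting for a grace period or for callbacks.
 */
static int nr_sync, nr_sync_exp, nr_barrier;

static void synchronize_rcu(void)           { nr_sync++; }
static void synchronize_rcu_expedited(void) { nr_sync_exp++; }
static void rcu_barrier(void)               { nr_barrier++; }

/* Hypothetical transition shims, one per obsolete wait-style API. */
#define synchronize_rcu_bh()           synchronize_rcu()
#define synchronize_sched()            synchronize_rcu()
#define synchronize_rcu_bh_expedited() synchronize_rcu_expedited()
#define synchronize_sched_expedited()  synchronize_rcu_expedited()
#define rcu_barrier_bh()               rcu_barrier()
#define rcu_barrier_sched()            rcu_barrier()

/*
 * Invoke each obsolete name once and return how many consolidated
 * calls resulted.
 */
static int call_obsolete_apis_once(void)
{
	int before = nr_sync + nr_sync_exp + nr_barrier;

	synchronize_rcu_bh();
	synchronize_sched();
	synchronize_rcu_bh_expedited();
	synchronize_sched_expedited();
	rcu_barrier_bh();
	rcu_barrier_sched();

	/* Every obsolete call must have landed on some consolidated stub. */
	assert(nr_sync + nr_sync_exp + nr_barrier >= before);
	return nr_sync + nr_sync_exp + nr_barrier - before;
}
```

In this mock, each pair of per-flavor names collapses onto a single
consolidated function, which is the whole point of the table above.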

Any objections to this timeline?

I also have some questions on the ultimate end point. I have default
choices, which I will likely take if there is no discussion.

o Currently, I am thinking in terms of keeping the per-flavor
read-side functions. For example, rcu_read_lock_bh() would
continue to disable softirq, and would also continue to tell
lockdep about the RCU-bh read-side critical section. However,
synchronize_rcu() will wait for all flavors of read-side critical
sections, including those introduced by (say) preempt_disable(),
so there will no longer be any possibility of mismatching (say)
RCU-bh readers with RCU-sched updaters.

I could imagine other ways of handling this, including:

a. Eliminate rcu_read_lock_bh() in favor of
local_bh_disable() and so on. Rely on lockdep
instrumentation of these other functions to identify RCU
readers, introducing such instrumentation as needed. I am
not a fan of this approach because of the large number of
places in the Linux kernel where interrupts, preemption,
and softirqs are enabled or disabled "behind the scenes".

b. Eliminate rcu_read_lock_bh() in favor of rcu_read_lock(),
and require callers to also disable softirqs, preemption,
or whatever as needed. I am not a fan of this approach
because it seems a lot less convenient to users of RCU-bh
and RCU-sched.

At the moment, I therefore favor keeping the RCU-bh and RCU-sched
read-side APIs. But are there better approaches?
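
To make the default choice above a bit more concrete, here is a toy
single-CPU model of it, written as a userspace mock under heavy
simplifying assumptions. The per-flavor read-side entry points are kept,
but one consolidated predicate looks at all of them. The depth counters,
consolidated_gp_may_end(), old_bh_gp_may_end(), and reader_demo() are
hypothetical names for this sketch only, and the "old" predicate is a
deliberate oversimplification of the pre-consolidation RCU-bh behavior.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy single-CPU model, NOT kernel code. */
static int rcu_nesting;      /* rcu_read_lock() depth */
static int bh_disable_depth; /* softirq-disable depth */
static int preempt_depth;    /* preempt_disable() depth */

static void rcu_read_lock(void)      { rcu_nesting++; }
static void rcu_read_unlock(void)    { assert(rcu_nesting > 0); rcu_nesting--; }
static void rcu_read_lock_bh(void)   { bh_disable_depth++; }
static void rcu_read_unlock_bh(void) { assert(bh_disable_depth > 0); bh_disable_depth--; }
static void preempt_disable(void)    { preempt_depth++; }
static void preempt_enable(void)     { assert(preempt_depth > 0); preempt_depth--; }

/* Consolidated view: any kind of reader holds off the grace period. */
static bool consolidated_gp_may_end(void)
{
	return rcu_nesting == 0 && bh_disable_depth == 0 && preempt_depth == 0;
}

/* Simplified old RCU-bh view: only softirq-disabled readers count. */
static bool old_bh_gp_may_end(void)
{
	return bh_disable_depth == 0;
}

/*
 * Walk through one reader of each kind and return the number of
 * reader kinds that held off the consolidated grace period.
 */
static int reader_demo(void)
{
	int blocked = 0;

	rcu_read_lock();
	blocked += !consolidated_gp_may_end();
	rcu_read_unlock();

	rcu_read_lock_bh();
	blocked += !consolidated_gp_may_end();
	rcu_read_unlock_bh();

	preempt_disable();
	blocked += !consolidated_gp_may_end();
	preempt_enable();

	return blocked;
}
```

In this model, a preempt_disable() region is invisible to the simplified
RCU-bh predicate but not to the consolidated one, which is the kind of
reader/updater mismatch that consolidation rules out.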

o How should kernel/rcu/sync.c be handled? Here are some
possibilities:

a. Leave the full gp_ops[] array and simply translate
the obsolete update-side functions to their RCU
equivalents.

b. Leave the current gp_ops[] array, but only have
the RCU_SYNC entry. The __INIT_HELD field would
be set to a function that was OK with being in an
RCU read-side critical section, an interrupt-disabled
section, etc.

This allows for possible addition of SRCU functionality.
It is also a trivial change. Note that the sole user
of sync.c uses RCU_SCHED_SYNC, and this would need to
be changed to RCU_SYNC.

But is it likely that we will ever add SRCU?

c. Eliminate that gp_ops[] array, hard-coding the function
pointers into their call sites.

I don't really have a preference. Left to myself, I will be lazy
and take option #a. Are there better approaches?
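
For reference, option a above might look roughly like the following
userspace mock. It is a sketch under stated assumptions rather than the
actual kernel/rcu/sync.c: the struct layout is simplified (the callback
and lockdep "held" fields are omitted), the stubs only count calls, and
RCU_BH_SYNC, struct gp_ops_entry, and all_flavors_share_ops() are names
assumed for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace mock, NOT kernel code: stubs count calls. */
static int nr_sync, nr_barrier;

static void synchronize_rcu(void) { nr_sync++; }
static void rcu_barrier(void)     { nr_barrier++; }

/* RCU_BH_SYNC is assumed here; the other two names appear above. */
enum rcu_sync_type { RCU_SYNC, RCU_SCHED_SYNC, RCU_BH_SYNC };

/* Simplified entry: callback and lockdep "held" fields omitted. */
struct gp_ops_entry {
	void (*sync)(void); /* wait for a grace period */
	void (*wait)(void); /* wait for outstanding callbacks */
};

/* Option a: keep every slot, but point them all at the RCU functions. */
static const struct gp_ops_entry gp_ops[] = {
	[RCU_SYNC]       = { .sync = synchronize_rcu, .wait = rcu_barrier },
	[RCU_SCHED_SYNC] = { .sync = synchronize_rcu, .wait = rcu_barrier },
	[RCU_BH_SYNC]    = { .sync = synchronize_rcu, .wait = rcu_barrier },
};

/* Return 1 if every flavor slot uses the consolidated functions. */
static int all_flavors_share_ops(void)
{
	size_t i;

	for (i = 0; i < sizeof(gp_ops) / sizeof(gp_ops[0]); i++) {
		if (gp_ops[i].sync != synchronize_rcu ||
		    gp_ops[i].wait != rcu_barrier)
			return 0;
	}
	return 1;
}
```

Under option b, only the RCU_SYNC slot would remain, and under option c
the table would disappear in favor of direct calls.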

o Currently, if a lock related to the scheduler's rq or pi locks is
held across rcu_read_unlock(), that lock must be held across the
entire read-side critical section in order to avoid deadlock.
Now that the end of the RCU read-side critical section is
deferred until sometime after interrupts are re-enabled, this
requirement could be lifted. However, because the end of the RCU
read-side critical section is detected sometime after interrupts
are re-enabled, this means that a low-priority RCU reader might
remain priority-boosted longer than need be, which could be a
problem when running real-time workloads.

My current thought is therefore to leave this constraint in
place. Thoughts?

Anything else that I should be worried about? ;-)

Thanx, Paul