[PATCH v2 tip/core/rcu 0/22] CPU hotplug updates for v4.1

From: Paul E. McKenney
Date: Mon Mar 16 2015 - 14:38:04 EST


Hello!

This series updates RCU's handling of CPU hotplug offline operations,
allowing RCU to have a precise notification of when it should start
ignoring a CPU. This precision allowed detection of some illegal uses of
RCU by offline CPUs, and this series contains fixes for them. A similar
problem exists for CPU onlining, but it will be addressed later. One
CPU-hotplug dragon at a time.

1. Add common code for notification from dying CPU. This is
part of the fix for issues uncovered by improved detection,
but is placed first to avoid messing up bisection.

2-4. Use #1 for x86, blackfin, and metag. (ARM also has this problem,
but ARM's maintainers are working on their own fix.)

5. Remove duplicate offline-CPU callback-list initialization.
This simplifies later changes to RCU's handling of offlining
operations.

6. Improve code readability in rcu_cleanup_dead_cpu(). Simple
code motion, no semantic change.

7. Eliminate a boolean variable and "if" statement by rearranging
sync_rcu_preempt_exp_init()'s checks for CPUs not having blocked
tasks.

8. Eliminate empty CONFIG_HOTPLUG_CPU #ifdef.

9. Add diagnostics to detect when RCU CPU stall warnings have been
caused by failure to propagate quiescent states up the rcu_node
combining tree.

10. Provide CONFIG_RCU_TORTURE_TEST_SLOW_INIT Kconfig option to
artificially slow down grace-period initialization, thus increasing
the probability of detecting races with this initialization process.

11. Update data files to enable CONFIG_RCU_TORTURE_TEST_SLOW_INIT
by default during rcutorture testing, but leave the default
time at zero. This default may be overridden by passing
"--bootargs rcutree.gp_init_delay=1" to kvm.sh.

12. Remove event tracing from rcu_cpu_notify(), which is invoked
by offline CPUs. (Event tracing uses RCU.)

13. Change meaning of ->expmask bitmasks to track blocked tasks
rather than online CPUs.

14. Move rcu_report_unblock_qs_rnp() to common code. This will
make it easier to provide proper locking protection.

15. Avoid races between CPU-hotplug operations and RCU grace-period
initialization by processing CPU-hotplug changes only at the
start of each grace period. This works because RCU need not
wait on a CPU that came online after the start of the current
grace period.

16. Eliminate the no-longer-needed ->onoff_mutex from the rcu_node
structure. This was the only sleeplock acquired during RCU's
CPU-hotplug processing, so its removal allows rcu_cpu_notify()
to be invoked from the preemption-disabled idle loop.

17. Use a per-CPU variable to make the CPU-offline idle-loop
transition point precise. (RCU's magic one-jiffy grace-period
wait for offline CPUs must remain until the analogous online
issue is addressed.)

18. Invoke rcu_cpu_notify() with a new CPU_DYING_IDLE op just before
the idle-loop invocation of arch_cpu_idle_dead().

19. Now that CPU-hotplug events are applied only during grace-period
initialization, it is safe to unconditionally enable slow
grace-period initialization for rcutorture testing. Note
that this delay is applied randomly in order to get a good
mix of fast and slow grace-period initialization.

20. Add checks that all quiescent states were received at grace-period
cleanup time.

21. Add a check for the last task on a given RCU-node structure
leaving its RCU read-side critical section between the time
that hotplug information is propagated up the tree and the
time that the grace period starts.

22. Add checks for grace-period number to all propagations of
quiescent states up the rcu_node combining tree. These are
required because a new grace period could start during this
propagation due to the resolution of #21 above. (Thanks
to Sasha Levin for exposing this bug during the course of
his testing.)
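
For those who prefer code to prose, here is a toy userspace model of
the ideas in #15 and #22. It is only a sketch: struct toy_node and
the toy_*() functions are made up for illustration and are not the
kernel's rcu_node structure or API. Hotplug operations touch only a
"next" mask, the start of a grace period snapshots that mask, and a
quiescent-state report carrying a stale grace-period number is ignored.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_CPUS 8	/* Arbitrary limit for this illustration. */

/* Hypothetical toy model, not the kernel's rcu_node structure. */
struct toy_node {
	unsigned long online_next;	/* Updated by hotplug at any time. */
	unsigned long online_gp;	/* Snapshot taken at grace-period start. */
	unsigned long qsmask;		/* CPUs the current GP still waits on. */
	unsigned long gpnum;		/* Number of the current grace period. */
};

/* Hotplug operations record the change but leave the current GP alone. */
static void toy_cpu_online(struct toy_node *np, int cpu)
{
	assert(cpu >= 0 && cpu < TOY_MAX_CPUS);
	np->online_next |= 1UL << cpu;
}

static void toy_cpu_offline(struct toy_node *np, int cpu)
{
	assert(cpu >= 0 && cpu < TOY_MAX_CPUS);
	np->online_next &= ~(1UL << cpu);
}

/* Buffered hotplug changes take effect only here. */
static void toy_gp_start(struct toy_node *np)
{
	np->online_gp = np->online_next;
	np->qsmask = np->online_gp;
	np->gpnum++;
}

/*
 * Report a quiescent state observed during grace period "gpnum".
 * Stale or duplicate reports are ignored.  Returns true if this
 * report was the last one the current grace period was waiting for.
 */
static bool toy_report_qs(struct toy_node *np, int cpu, unsigned long gpnum)
{
	unsigned long bit = 1UL << cpu;

	if (gpnum != np->gpnum || !(np->qsmask & bit))
		return false;
	np->qsmask &= ~bit;
	return np->qsmask == 0;
}
```

In this model, a CPU that comes online after toy_gp_start() is simply
not in ->qsmask, so the current grace period never waits on it. A CPU
that goes offline in mid-grace-period stays in ->qsmask until a
quiescent state is reported on its behalf.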

Changes since v1:

o Fixed per-CPU state machine to work correctly for architectures
that online CPUs without needing to check whether or not previous
offline operations completed correctly and on time, thanks to
James Hogan.

o Fixed Xen's interfacing to the common-code notifications, thanks
to Boris Ostrovsky.

o Added two fixes for handling of quiescent states and grace periods
given the updated handling of CPU hotplug.
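
To make the per-CPU state machine a bit more concrete, here is a rough
userspace sketch of a death-notification handshake between a dying CPU
and a surviving CPU. The state names and the toy_*() functions are
assumptions invented for this example and do not claim to match the
code in the patches. The point is only that both sides update one
state word atomically, so a death report that arrives after the
survivor has given up is detected rather than silently lost.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical states, for illustration only. */
enum toy_state { TOY_ONLINE, TOY_DEAD, TOY_TIMED_OUT, TOY_BROKEN };

/* One such word per CPU would be needed; this sketch uses just one. */
typedef _Atomic int toy_cpu_state;

/* Dying CPU: returns true if death was reported before the timeout. */
static bool toy_report_death(toy_cpu_state *st)
{
	int old = TOY_ONLINE;

	if (atomic_compare_exchange_strong(st, &old, TOY_DEAD))
		return true;
	/* The survivor already gave up, so mark the CPU as broken. */
	old = TOY_TIMED_OUT;
	atomic_compare_exchange_strong(st, &old, TOY_BROKEN);
	return false;
}

/*
 * Surviving CPU: poll for death.  Pass timed_out == true once the
 * caller's wait has expired.  Returns true if the CPU is known dead.
 */
static bool toy_wait_death(toy_cpu_state *st, bool timed_out)
{
	int old = TOY_ONLINE;

	if (atomic_load(st) == TOY_DEAD)
		return true;
	if (timed_out &&
	    !atomic_compare_exchange_strong(st, &old, TOY_TIMED_OUT))
		return old == TOY_DEAD;	/* Death raced with the timeout. */
	return false;
}

/* Only a CPU that reported its death on time may be onlined again. */
static bool toy_can_online(toy_cpu_state *st)
{
	return atomic_load(st) == TOY_DEAD;
}
```

An architecture that does not check whether the previous offline
completed would simply skip the toy_can_online() test in this sketch.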

Thanx, Paul

------------------------------------------------------------------------

 b/Documentation/kernel-parameters.txt                     |   6
 b/arch/blackfin/mach-common/smp.c                         |   6
 b/arch/metag/kernel/smp.c                                 |   5
 b/arch/x86/include/asm/cpu.h                              |   2
 b/arch/x86/include/asm/smp.h                              |   2
 b/arch/x86/kernel/smpboot.c                               |  39 -
 b/arch/x86/xen/smp.c                                      |  46 -
 b/include/linux/cpu.h                                     |  14
 b/include/linux/rcupdate.h                                |   2
 b/kernel/cpu.c                                            |   4
 b/kernel/rcu/tree.c                                       | 356 ++++++++++----
 b/kernel/rcu/tree.h                                       |  11
 b/kernel/rcu/tree_plugin.h                                | 169 +++---
 b/kernel/rcu/tree_trace.c                                 |   4
 b/kernel/sched/idle.c                                     |   9
 b/kernel/smpboot.c                                        | 156 ++++++
 b/lib/Kconfig.debug                                       |  26 -
 b/tools/testing/selftests/rcutorture/configs/rcu/CFcommon |   1
 18 files changed, 617 insertions(+), 241 deletions(-)
