[PATCH 6/5 v2] rcu/tracing: Add rcu_disabled to denote when rcu_irq_enter() will not work

From: Steven Rostedt
Date: Fri Apr 07 2017 - 13:03:31 EST


From: "Steven Rostedt (VMware)" <rostedt@xxxxxxxxxxx>

Tracing uses rcu_irq_enter() as a way to make sure that RCU is watching when
it needs to use rcu_read_lock() and friends. This is because tracing can
happen as RCU is about to enter user space, or about to go idle, and RCU
does not watch for RCU read side critical sections as it makes the
transition.

There is a small location within the RCU infrastructure that rcu_irq_enter()
itself will not work. If tracing were to occur in that section it will break
if it tries to use rcu_irq_enter().

Originally, this happens with the stack_tracer, because it will call
save_stack_trace when it encounters stack usage that is greater than any
stack usage it had encountered previously. There was a case where that
happened in the RCU section where rcu_irq_enter() did not work, and lockdep
complained loudly about it. To fix it, stack tracing added a call to be
disabled and RCU would disable stack tracing during the critical section
that rcu_irq_enter() was inoperable. This solution worked, but there are
other cases that use rcu_irq_enter() and it would be a good idea to let RCU
give a way to let others know that rcu_irq_enter() will not work. For
example, in trace events.

Another helpful aspect of this change is that it also moves the per cpu
variable called in the RCU critical section into a cache locale along with
other RCU per cpu variables used in that same location.

I'm keeping the stack_trace_disable() code, as that still could be used in
the future by places that really need to disable it. And since it's only a
static inline, it wont take up any kernel text if it is not used.

Signed-off-by: Steven Rostedt (VMware) <rostedt@xxxxxxxxxxx>
---
include/linux/rcupdate.h | 5 +++++
kernel/rcu/tree.c | 18 ++++++++++++++++--
kernel/trace/trace_stack.c | 8 ++++++++
3 files changed, 29 insertions(+), 2 deletions(-)

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index de88b33..b096685 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -97,6 +97,7 @@ void do_trace_rcu_torture_read(const char *rcutorturename,
unsigned long secs,
unsigned long c_old,
unsigned long c);
+bool rcu_irq_enter_disabled(void);
#else
static inline void rcutorture_get_gp_data(enum rcutorture_type test_type,
int *flags,
@@ -113,6 +114,10 @@ static inline void rcutorture_record_test_transition(void)
static inline void rcutorture_record_progress(unsigned long vernum)
{
}
+bool rcu_irq_enter_disabled(void)
+{
+ return false;
+}
#ifdef CONFIG_RCU_TRACE
void do_trace_rcu_torture_read(const char *rcutorturename,
struct rcu_head *rhp,
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 8b4d273..a6dcf3b 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -285,6 +285,20 @@ static DEFINE_PER_CPU(struct rcu_dynticks, rcu_dynticks) = {
};

/*
+ * There's a few places, currently just in the tracing infrastructure,
+ * that uses rcu_irq_enter() to make sure RCU is watching. But there's
+ * a small location where that will not even work. In those cases
+ * rcu_irq_enter_disabled() needs to be checked to make sure rcu_irq_enter()
+ * can be called.
+ */
+static DEFINE_PER_CPU(bool, disable_rcu_irq_enter);
+
+bool rcu_irq_enter_disabled(void)
+{
+ return this_cpu_read(disable_rcu_irq_enter);
+}
+
+/*
* Record entry into an extended quiescent state. This is only to be
* called when not already in an extended quiescent state.
*/
@@ -800,10 +814,10 @@ static void rcu_eqs_enter_common(bool user)
do_nocb_deferred_wakeup(rdp);
}
rcu_prepare_for_idle();
- stack_tracer_disable();
+ __this_cpu_inc(disable_rcu_irq_enter);
rdtp->dynticks_nesting = 0; /* Breaks tracing momentarily. */
rcu_dynticks_eqs_enter(); /* After this, tracing works again. */
- stack_tracer_enable();
+ __this_cpu_dec(disable_rcu_irq_enter);
rcu_dynticks_task_enter();

/*
diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c
index f2f02ff..76aa04d 100644
--- a/kernel/trace/trace_stack.c
+++ b/kernel/trace/trace_stack.c
@@ -96,6 +96,14 @@ check_stack(unsigned long ip, unsigned long *stack)
if (in_nmi())
return;

+ /*
+ * There's a slight chance that we are tracing inside the
+ * RCU infrastructure, and rcu_irq_enter() will not work
+ * as expected.
+ */
+ if (unlikely(rcu_irq_enter_disabled()))
+ return;
+
local_irq_save(flags);
arch_spin_lock(&stack_trace_max_lock);

--
2.9.3