Re: [GIT PULL rcu/next] RCU commits for 3.3

From: Frederic Weisbecker
Date: Wed Dec 14 2011 - 12:47:46 EST


On Wed, Dec 14, 2011 at 05:30:11PM +0100, Frederic Weisbecker wrote:
> On Wed, Dec 14, 2011 at 04:47:36PM +0100, Ingo Molnar wrote:
> >
> > * Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx> wrote:
> >
> > > Hello, Ingo,
> > >
> > > The major features of this series are RCU infrastructure in
> > > support of Frederic Weisbecker's tickless userspace work and a
> > > reworked RCU_FAST_NO_HZ that improves energy efficiency on
> > > large lightly loaded SMP systems by allowing the
> > > scheduler-clock tick to be turned off more quickly upon entry
> > > to idle, even when RCU callbacks are queued on the newly idle
> > > CPU. In addition, RCU_FAST_NO_HZ may now be used on systems
> > > running TREE_PREEMPT_RCU, where earlier it was restricted to
> > > TREE_RCU.
> > >
> > > In addition, this series provides additional event tracing,
> > > rcutorture changes that improve automated KVM-based testing of
> > > RCU, introduces an srcu_read_lock_raw() needed by uprobes, and
> > > ports a couple of patches from -rt to mainline. Finally, this
> > > series updates documentation, improves diagnostics, and fixes
> > > a number of bugs, including a nasty use of RCU from the idle
> > > task spotted and fixed by Frederic Weisbecker.
> > >
> > > These commits have been posted to LKML and updated based on
> > > feedback:
> > >
> > > https://lkml.org/lkml/2011/11/2/363
> > > https://lkml.org/lkml/2011/11/15/302
> > > https://lkml.org/lkml/2011/11/28/588
> > > https://lkml.org/lkml/2011/12/3/77
> > > https://lkml.org/lkml/2011/12/12/625
> > >
> > > They have also been exposed to -next testing.
> > >
> > > These changes are based off of 3.2-rc5 and are available at
> > > the git repository at:
> > >
> > > git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git rcu/next
> > >
> > > Could you please pull them into -tip for additional testing?
> > >
> > > Thanx, Paul
> > >
> > > ------------------>
> > > Frederic Weisbecker (11):
> > > rcu: Detect illegal rcu dereference in extended quiescent state
> > > rcu: Inform the user about extended quiescent state on PROVE_RCU warning
> > > rcu: Warn when rcu_read_lock() is used in extended quiescent state
> > > nohz: Separate out irq exit and idle loop dyntick logic
> > > nohz: Allow rcu extended quiescent state handling seperately from tick stop
> > > x86: Enter rcu extended qs after idle notifier call
> > > x86: Call idle notifier after irq_enter()
> > > rcu: Fix early call to rcu_idle_enter()
> > > nohz: Remove tick_nohz_idle_enter_norcu() / tick_nohz_idle_exit_norcu()
> > > rcu: Don't check irq nesting from rcu idle entry/exit
> > > rcu: Irq nesting is always 0 on rcu_enter_idle_common
> > >
> > > Josh Triplett (1):
> > > driver-core/cpu: Expose hotpluggability to the rest of the kernel
> > >
> > > Kees Cook (1):
> > > docs: Additional LWN links to RCU API
> > >
> > > Paul E. McKenney (49):
> > > rcu: ->signaled better named ->fqs_state
> > > rcu: Avoid RCU-preempt expedited grace-period botch
> > > rcu: Make synchronize_sched_expedited() better at work sharing
> > > lockdep: Update documentation for lock-class leak detection
> > > rcu: Track idleness independent of idle tasks
> > > trace: Allow ftrace_dump() to be called from modules
> > > rcu: Add failure tracing to rcutorture
> > > rcu: Document failing tick as cause of RCU CPU stall warning
> > > rcu: Disable preemption in rcu_is_cpu_idle()
> > > rcu: Remove one layer of abstraction from PROVE_RCU checking
> > > rcu: Warn when srcu_read_lock() is used in an extended quiescent state
> > > rcu: Make srcu_read_lock_held() call common lockdep-enabled function
> > > powerpc: Tell RCU about idle after hcall tracing
> > > rcu: Introduce raw SRCU read-side primitives
> > > rcu: Add documentation for raw SRCU read-side primitives
> > > rcu: Deconfuse dynticks entry-exit tracing
> > > rcu: Add more information to the wrong-idle-task complaint
> > > rcu: Allow dyntick-idle mode for CPUs with callbacks
> > > rcu: Fix idle-task checks
> > > rcu: Permit RCU_FAST_NO_HZ to be used by TREE_PREEMPT_RCU
> > > rcu: Add rcutorture system-shutdown capability
> > > rcu: Control rcutorture startup from kernel boot parameters
> > > sched: Add is_idle_task() to handle invalidated uses of idle_cpu()
> > > rcu: Make RCU use the new is_idle_task() API
> > > sparc: Make SPARC use the new is_idle_task() API
> > > kdb: Make KDB use the new is_idle_task() API
> > > events: Make events use the new is_idle_task() API
> > > tile: Make tile use the new is_idle_task() API
> > > rcu: Add rcutorture CPU-hotplug capability
> > > doc: Add load/store guarantees to Documentation/atomic-ops.txt
> > > rcu: Update trace_rcu_dyntick() header comment
> > > rcu: Add tracing for RCU_FAST_NO_HZ
> > > rcu: Go dyntick-idle more quickly if CPU has serviced current grace period
> > > rcu: Avoid needlessly IPIing CPUs at GP end
> > > rcu: Eliminate RCU_FAST_NO_HZ grace-period hang
> > > rcu: Reduce latency of rcu_prepare_for_idle()
> > > rcu: Remove dynticks false positives and RCU failures
> > > rcu: Identify dyntick-idle CPUs on first force_quiescent_state() pass
> > > rcu: Document same-context read-side constraints
> > > rcu: Permit dyntick-idle with callbacks pending
> > > rcu: Keep invoking callbacks if CPU otherwise idle
> > > rcu: Adaptive dyntick-idle preparation
> > > rcu: Remove redundant rcu_cpu_stall_suppress declaration
> > > rcu: Make rcutorture test for hotpluggability before offlining CPUs
> > > rcu: Add rcutorture tests for srcu_read_lock_raw()
> > > rcu: Augment rcu_batch_end tracing for idle and callback state
> > > Revert "rcu: Permit rt_mutex_unlock() with irqs disabled"
> > > rcu: Apply ACCESS_ONCE() to rcu_boost() return value
> > > cpu: Export cpu_up()
> > >
> > > Thomas Gleixner (2):
> > > rcu: Omit self-awaken when setting up expedited grace period
> > > rcu: Remove redundant return from rcu_report_exp_rnp()
> > >
> > > Documentation/RCU/checklist.txt | 6 +
> > > Documentation/RCU/rcu.txt | 10 +-
> > > Documentation/RCU/stallwarn.txt | 16 +-
> > > Documentation/RCU/torture.txt | 13 ++
> > > Documentation/RCU/trace.txt | 4 -
> > > Documentation/RCU/whatisRCU.txt | 19 ++-
> > > Documentation/atomic_ops.txt | 87 +++++++++
> > > Documentation/lockdep-design.txt | 63 +++++++
> > > arch/arm/kernel/process.c | 6 +-
> > > arch/avr32/kernel/process.c | 6 +-
> > > arch/blackfin/kernel/process.c | 6 +-
> > > arch/microblaze/kernel/process.c | 6 +-
> > > arch/mips/kernel/process.c | 6 +-
> > > arch/openrisc/kernel/idle.c | 6 +-
> > > arch/powerpc/kernel/idle.c | 15 ++-
> > > arch/powerpc/platforms/iseries/setup.c | 12 +-
> > > arch/powerpc/platforms/pseries/lpar.c | 4 +
> > > arch/s390/kernel/process.c | 6 +-
> > > arch/sh/kernel/idle.c | 6 +-
> > > arch/sparc/kernel/process_64.c | 6 +-
> > > arch/sparc/kernel/setup_32.c | 2 +-
> > > arch/tile/kernel/process.c | 6 +-
> > > arch/tile/mm/fault.c | 4 +-
> > > arch/um/kernel/process.c | 6 +-
> > > arch/unicore32/kernel/process.c | 6 +-
> > > arch/x86/kernel/apic/apic.c | 6 +-
> > > arch/x86/kernel/apic/io_apic.c | 2 +-
> > > arch/x86/kernel/cpu/mcheck/therm_throt.c | 2 +-
> > > arch/x86/kernel/cpu/mcheck/threshold.c | 2 +-
> > > arch/x86/kernel/irq.c | 6 +-
> > > arch/x86/kernel/process_32.c | 6 +-
> > > arch/x86/kernel/process_64.c | 10 +-
> > > drivers/base/cpu.c | 7 +
> > > include/linux/cpu.h | 1 +
> > > include/linux/hardirq.h | 21 ---
> > > include/linux/rcupdate.h | 115 ++++++++-----
> > > include/linux/sched.h | 8 +
> > > include/linux/srcu.h | 87 ++++++++--
> > > include/linux/tick.h | 11 +-
> > > include/trace/events/rcu.h | 122 +++++++++++--
> > > init/Kconfig | 10 +-
> > > kernel/cpu.c | 1 +
> > > kernel/debug/kdb/kdb_support.c | 2 +-
> > > kernel/events/core.c | 2 +-
> > > kernel/lockdep.c | 22 +++
> > > kernel/rcu.h | 7 +
> > > kernel/rcupdate.c | 12 ++
> > > kernel/rcutiny.c | 149 +++++++++++++--
> > > kernel/rcutiny_plugin.h | 29 +++-
> > > kernel/rcutorture.c | 225 ++++++++++++++++++++++-
> > > kernel/rcutree.c | 290 +++++++++++++++++++++---------
> > > kernel/rcutree.h | 26 ++--
> > > kernel/rcutree_plugin.h | 289 ++++++++++++++++++++++++------
> > > kernel/rcutree_trace.c | 12 +-
> > > kernel/rtmutex.c | 8 -
> > > kernel/softirq.c | 4 +-
> > > kernel/time/tick-sched.c | 97 ++++++----
> > > kernel/trace/trace.c | 1 +
> > > 58 files changed, 1512 insertions(+), 407 deletions(-)
> >
> > Pulled into tip:core/rcu, thanks a lot Paul!
> >
> > Note that this commit from Frederic:
> >
> > 280f06774afe: nohz: Separate out irq exit and idle loop dyntick logic
> >
> > conflicted with this commit from Suresh in sched/core:
> >
> > 69e1e811dcc4: sched, nohz: Track nr_busy_cpus in the sched_group_power
> >
> > I resolved it by making the set_cpu_sd_state_idle() call
> > unconditional within the newly decoupled
> > tick_nohz_stop_sched_tick() function - please double check that
> > it's the right resolution.
>
> After a quick look, I believe this should rather go in tick_nohz_idle_enter()
> (the equivalent of the old tick_nohz_stop_sched_tick(1)).
> This wants to be set only once, when we enter idle, not every time we have an
> idle interrupt.
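
To illustrate the difference, here is a rough userspace model of the two
call paths. This is not kernel code: every model_* name is made up for the
example, and the only thing it tracks is how often the sched domain idle
state update would run. It assumes, as in the decoupled code, that both idle
loop entry and idle interrupt exit end up in the tick stop path.

```c
#include <stdio.h>

/* Hypothetical model of one CPU; none of these names exist in the kernel. */
struct model_cpu {
	int inidle;		/* set once the idle loop has been entered */
	int sd_idle_updates;	/* stands in for set_cpu_sd_state_idle() calls */
};

/* Stand-in for set_cpu_sd_state_idle(): only counts invocations. */
static void model_set_cpu_sd_state_idle(struct model_cpu *c)
{
	c->sd_idle_updates++;
}

/* Stand-in for the tick stop path, shared by idle entry and irq exit. */
static void model_stop_sched_tick(struct model_cpu *c, int update_in_stop)
{
	if (update_in_stop)
		model_set_cpu_sd_state_idle(c);
}

/* Stand-in for idle loop entry. */
static void model_idle_enter(struct model_cpu *c, int update_in_stop)
{
	c->inidle = 1;
	if (!update_in_stop)
		model_set_cpu_sd_state_idle(c);
	model_stop_sched_tick(c, update_in_stop);
}

/* Stand-in for the exit of an interrupt that hit an idle CPU. */
static void model_irq_exit(struct model_cpu *c, int update_in_stop)
{
	if (c->inidle)
		model_stop_sched_tick(c, update_in_stop);
}

/*
 * Enter idle once, take nr_idle_irqs interrupts while idle, and return
 * how many times the sched domain idle state update ran.
 */
static int model_run(int update_in_stop, int nr_idle_irqs)
{
	struct model_cpu c = { 0, 0 };
	int i;

	model_idle_enter(&c, update_in_stop);
	for (i = 0; i < nr_idle_irqs; i++)
		model_irq_exit(&c, update_in_stop);

	return c.sd_idle_updates;
}
```

With three idle interrupts, model_run(1, 3) returns 4 (one update per pass
through the tick stop path) while model_run(0, 3) returns 1 (a single update
at idle entry), which is the behaviour we want here.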

I don't know whether you plan to fix the conflict by redoing the merge or by
applying a patch on top of tip/master.

In any case, here is a patch you can use. Feel free to apply it as is
or to just refer to its diff to redo the merge:

(Outrageously only compile tested)

---
From: Frederic Weisbecker <fweisbec@xxxxxxxxx>
Date: Wed, 14 Dec 2011 18:36:00 +0100
Subject: [PATCH] sched: Only update the CPU idleness in the domain
hierarchy from idle loop entry

We don't need to inform the sched domain hierarchy about the
CPU idleness every time we call tick_nohz_stop_sched_tick(), as this
covers both idle loop entry and idle interrupt exit.

Doing it once from the idle loop entry is enough. Call
set_cpu_sd_state_idle() only from tick_nohz_idle_enter() instead
to fix this.

Signed-off-by: Frederic Weisbecker <fweisbec@xxxxxxxxx>
Cc: Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx>
Cc: Suresh Siddha <suresh.b.siddha@xxxxxxxxx>
Cc: Peter Zijlstra <a.p.zijlstra@xxxxxxxxx>
---
kernel/time/tick-sched.c | 16 ++++++++--------
1 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 1f6dc515..696c997 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -289,14 +289,6 @@ static void tick_nohz_stop_sched_tick(struct tick_sched *ts)
 	now = tick_nohz_start_idle(cpu, ts);

 	/*
-	 * Update the idle state in the scheduler domain hierarchy
-	 * when tick_nohz_stop_sched_tick() is called from the idle loop.
-	 * State will be updated to busy during the first busy tick after
-	 * exiting idle.
-	 */
-	set_cpu_sd_state_idle();
-
-	/*
 	 * If this cpu is offline and it is the one which updates
 	 * jiffies, then give up the assignment and let it be taken by
 	 * the cpu which runs the tick timer next. If we don't drop
@@ -483,6 +475,14 @@ void tick_nohz_idle_enter(void)
 	 * update of the idle time accounting in tick_nohz_start_idle().
 	 */
 	ts->inidle = 1;
+
+	/*
+	 * Update the idle state in the scheduler domain hierarchy
+	 * when tick_nohz_idle_enter() is called from the idle loop.
+	 * State will be updated to busy during the first busy tick after
+	 * exiting idle.
+	 */
+	set_cpu_sd_state_idle();
 	tick_nohz_stop_sched_tick(ts);

 	local_irq_enable();
--
1.7.5.4

