Re: frequent lockups in 3.18rc4

From: Paul E. McKenney
Date: Fri Dec 12 2014 - 13:11:05 EST


On Thu, Dec 11, 2014 at 11:45:09PM -0500, Dave Jones wrote:
> On Thu, Dec 11, 2014 at 10:03:43PM -0500, Dave Jones wrote:
> > On Thu, Dec 11, 2014 at 01:49:17PM -0800, Linus Torvalds wrote:
> >
> > > Anyway, you might as well stop bisecting. Regardless of where it lands
> > > in the remaining pile, it's not going to give us any useful
> > > information, methinks.
> > >
> > > I'm stumped.
> >
> > yeah, likewise. I don't recall any bug that's given me this much headache.
> > I don't think it's helped that the symptoms are vague enough that a
> > number of people have thought they've seen the same thing, which have
> > turned out to be unrelated incidents. At least some of those have
> > gotten closure though it seems.
> >
> > > Maybe it's worth it to concentrate on just testing current kernels,
> > > and instead try to limit the triggering some other way. In particular,
> > > you had a trinity run that was *only* testing lsetxattr(). Is that
> > > really *all* that was going on? Obviously trinity will be using
> > > timers, fork, and other things? Can you recreate that lsetxattr thing,
> > > and just try to get as many problem reports as possible from one
> > > particular kernel (say, 3.18, since that should be a reasonable modern
> > > base with hopefully not a lot of other random issues)?
> >
> > I'll let it run overnight, but so far after 4hrs, on .18 it's not done
> > anything.
>
> Two hours later, it had spewed this, but survived. (Trinity had quit after that
> point because /proc/sys/kernel/tainted changed).

[ . . . ]

> Few seconds later rcu craps itself..
>
> [18801.941908] INFO: rcu_preempt detected stalls on CPUs/tasks:
> [18801.942920] 3: (3 GPs behind) idle=bf4/0/0 softirq=1597256/1597257
> [18801.943890] (detected by 0, t=6002 jiffies, g=763359, c=763358, q=0)
> [18801.944843] Task dump for CPU 3:
> [18801.945770] swapper/3 R running task 14576 0 1 0x00200000
> [18801.946706] 0000000342b6fe28 def23185c07e1b3d ffffe8ffff403518 0000000000000001
> [18801.947629] ffffffff81cb2000 0000000000000003 ffff880242b6fe78 ffffffff8166cb95
> [18801.948557] 0000111242adb59f ffffffff81cb2070 ffff880242b6c000 ffffffff81d21ab0
> [18801.949478] Call Trace:
> [18801.950384] [<ffffffff8166cb95>] ? cpuidle_enter_state+0x55/0x1c0
> [18801.951303] [<ffffffff8166cdb7>] ? cpuidle_enter+0x17/0x20
> [18801.952211] [<ffffffff810bf303>] ? cpu_startup_entry+0x423/0x4d0
> [18801.953125] [<ffffffff810314c3>] ? start_secondary+0x1a3/0x220

Very strange. Both cpuidle_enter() and cpuidle_enter_state() should be
within the idle loop, so RCU should be ignoring this CPU. And the
"idle=bf4/0/0" means that it really has marked itself as being idle from
an RCU perspective. So I am guessing that the RCU grace-period kthread
has not gotten a chance to run.
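
For anyone who wants to check the "idle=" field from a stall warning by
hand, here is a minimal userspace sketch. It assumes the field is printed
as hex/hex/decimal (low-order bits of the dynticks counter, then the two
nesting counts) and that an even counter value means the CPU is in
dyntick-idle mode, as in kernels of this vintage. The struct and function
names are made up for illustration and are not kernel APIs.

```c
#include <stdio.h>

/* Fields of the "idle=" portion of an RCU CPU stall warning (assumed layout). */
struct rcu_idle_state {
	unsigned int dynticks;       /* low-order bits of the dynticks counter (hex) */
	unsigned long long nesting;  /* process-level nesting count (hex) */
	int nmi_nesting;             /* NMI nesting count (decimal) */
};

/* Parse a string such as "idle=bf4/0/0"; returns 0 on success, -1 if malformed. */
static int parse_rcu_idle(const char *field, struct rcu_idle_state *out)
{
	if (sscanf(field, "idle=%x/%llx/%d",
		   &out->dynticks, &out->nesting, &out->nmi_nesting) != 3)
		return -1;
	return 0;
}

/* An even dynticks value means the CPU has told RCU that it is idle. */
static int rcu_sees_cpu_idle(const struct rcu_idle_state *s)
{
	return (s->dynticks & 1) == 0;
}
```

Feeding it "idle=bf4/0/0" from the report above gives a dynticks value of
0xbf4, which is even, consistent with CPU 3 having marked itself idle.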

If you are willing to live a bit dangerously, could you please see if
the (not for mainline) patch below clears this up?

Thanx, Paul

------------------------------------------------------------------------

rcu: Run grace-period kthreads at real-time priority

This is an experimental commit that attempts to better handle high-load
situations.

Not-yet-signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx>

diff --git a/init/Kconfig b/init/Kconfig
index cecce1b13825..6db1f304157c 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -677,7 +677,6 @@ config RCU_BOOST
config RCU_KTHREAD_PRIO
int "Real-time priority to use for RCU worker threads"
range 1 99
- depends on RCU_BOOST
default 1
help
This option specifies the SCHED_FIFO priority value that will be
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 93bca38925a9..57fd8f5bd1ad 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -156,6 +156,10 @@ static void rcu_boost_kthread_setaffinity(struct rcu_node *rnp, int outgoingcpu)
static void invoke_rcu_core(void);
static void invoke_rcu_callbacks(struct rcu_state *rsp, struct rcu_data *rdp);

+/* rcuc/rcub kthread realtime priority */
+static int kthread_prio = CONFIG_RCU_KTHREAD_PRIO;
+module_param(kthread_prio, int, 0644);
+
/*
* Track the rcutorture test sequence number and the update version
* number within a given test. The rcutorture_testseq is incremented
@@ -3631,15 +3635,19 @@ static int __init rcu_spawn_gp_kthread(void)
unsigned long flags;
struct rcu_node *rnp;
struct rcu_state *rsp;
+ struct sched_param sp;
struct task_struct *t;

rcu_scheduler_fully_active = 1;
for_each_rcu_flavor(rsp) {
- t = kthread_run(rcu_gp_kthread, rsp, "%s", rsp->name);
+ t = kthread_create(rcu_gp_kthread, rsp, "%s", rsp->name);
BUG_ON(IS_ERR(t));
rnp = rcu_get_root(rsp);
raw_spin_lock_irqsave(&rnp->lock, flags);
rsp->gp_kthread = t;
+ sp.sched_priority = kthread_prio;
+ sched_setscheduler_nocheck(t, SCHED_FIFO, &sp);
+ wake_up_process(t);
raw_spin_unlock_irqrestore(&rnp->lock, flags);
}
rcu_spawn_nocb_kthreads();
diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index cf3b4d532379..564944964f14 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -34,10 +34,6 @@

#include "../locking/rtmutex_common.h"

-/* rcuc/rcub kthread realtime priority */
-static int kthread_prio = CONFIG_RCU_KTHREAD_PRIO;
-module_param(kthread_prio, int, 0644);
-
/*
* Control variables for per-CPU and per-rcu_node kthreads. These
* handle all flavors of RCU.
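
The patch creates the grace-period kthread stopped (kthread_create()),
sets its scheduling policy (sched_setscheduler_nocheck()), and only then
wakes it (wake_up_process()), so the thread never runs at normal priority.
A rough userspace analogy, not kernel code, is to put the policy into the
thread attributes before the thread is created. The helper name
fifo_attr_init() below is hypothetical.

```c
#include <pthread.h>
#include <sched.h>

/*
 * Userspace analogy only: prepare attributes so that a thread created
 * with them starts life as SCHED_FIFO at the given priority.
 * Returns 0 on success or an errno value on failure.
 */
static int fifo_attr_init(pthread_attr_t *attr, int prio)
{
	struct sched_param sp = { .sched_priority = prio };
	int ret;

	ret = pthread_attr_init(attr);
	if (ret)
		return ret;
	/* Use the attributes below instead of inheriting the creator's policy. */
	ret = pthread_attr_setinheritsched(attr, PTHREAD_EXPLICIT_SCHED);
	if (!ret)
		ret = pthread_attr_setschedpolicy(attr, SCHED_FIFO);
	if (!ret)
		ret = pthread_attr_setschedparam(attr, &sp);
	if (ret)
		pthread_attr_destroy(attr);
	return ret;
}
```

Passing the resulting attributes to pthread_create() normally requires
root or CAP_SYS_NICE and otherwise fails with EPERM, so try it only on a
test box.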
