Re: frequent lockups in 3.18rc4

From: Dave Jones
Date: Fri Dec 12 2014 - 13:43:38 EST


On Fri, Dec 12, 2014 at 10:10:44AM -0800, Paul E. McKenney wrote:

> > [18801.941908] INFO: rcu_preempt detected stalls on CPUs/tasks:
> > [18801.942920] 3: (3 GPs behind) idle=bf4/0/0 softirq=1597256/1597257
> > [18801.943890] (detected by 0, t=6002 jiffies, g=763359, c=763358, q=0)
> > [18801.944843] Task dump for CPU 3:
> > [18801.945770] swapper/3 R running task 14576 0 1 0x00200000
> > [18801.946706] 0000000342b6fe28 def23185c07e1b3d ffffe8ffff403518 0000000000000001
> > [18801.947629] ffffffff81cb2000 0000000000000003 ffff880242b6fe78 ffffffff8166cb95
> > [18801.948557] 0000111242adb59f ffffffff81cb2070 ffff880242b6c000 ffffffff81d21ab0
> > [18801.949478] Call Trace:
> > [18801.950384] [<ffffffff8166cb95>] ? cpuidle_enter_state+0x55/0x1c0
> > [18801.951303] [<ffffffff8166cdb7>] ? cpuidle_enter+0x17/0x20
> > [18801.952211] [<ffffffff810bf303>] ? cpu_startup_entry+0x423/0x4d0
> > [18801.953125] [<ffffffff810314c3>] ? start_secondary+0x1a3/0x220
>
> Very strange. Both cpuidle_enter() and cpuidle_enter_state() should be
> within the idle loop, so that RCU should be ignoring this CPU. And the
> "idle=bf4/0/0" means that it really has marked itself as being idle from
> an RCU perspective. So I am guessing that the RCU grace-period kthread
> has not gotten a chance to run.
>
> If you are willing to live a bit dangerously, could you please see if
> the (not for mainline) patch below clears this up?

I'll try anything at this point, regardless of danger level :)

Dave
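
For anyone reading along in the archive, here is a rough sketch of how the
per-CPU stall line quoted above can be picked apart. It is not part of the
kernel or of Paul's patch. The function name parse_stall_line and the
dictionary keys are made up for illustration. The even/odd rule for the first
"idle=" field is an assumption, based on how Documentation/RCU/stallwarn.txt
describes the dynticks counter for kernels of this era (an even value means the
CPU is in dyntick-idle mode).

```python
import re

# Matches the per-CPU line of an RCU stall warning in the format quoted above,
# e.g. "3: (3 GPs behind) idle=bf4/0/0 softirq=1597256/1597257".
STALL_RE = re.compile(
    r"(?P<cpu>\d+): \((?P<gps>\d+) GPs behind\) "
    r"idle=(?P<dynticks>[0-9a-f]+)/(?P<nesting>[0-9a-f]+)/(?P<nmi>-?\d+) "
    r"softirq=(?P<sirq_a>\d+)/(?P<sirq_b>\d+)"
)


def parse_stall_line(line):
    """Return the fields of a stall line as a dict, or None if no match."""
    m = STALL_RE.search(line)
    if m is None:
        return None
    dynticks = int(m.group("dynticks"), 16)  # printed in hex
    return {
        "cpu": int(m.group("cpu")),
        "gps_behind": int(m.group("gps")),
        "dynticks": dynticks,
        "nesting": int(m.group("nesting"), 16),
        "nmi_nesting": int(m.group("nmi")),
        # Assumption: an even dynticks value means RCU sees the CPU as idle.
        "rcu_idle": dynticks % 2 == 0,
        # The two softirq counters, kept as printed without interpretation.
        "softirq": (int(m.group("sirq_a")), int(m.group("sirq_b"))),
    }


if __name__ == "__main__":
    sample = ("[18801.942920] 3: (3 GPs behind) idle=bf4/0/0 "
              "softirq=1597256/1597257")
    print(parse_stall_line(sample))
```

Under that assumption, "idle=bf4/0/0" parses to an even counter (0xbf4) with
both nesting fields at zero, which matches Paul's reading that CPU 3 has marked
itself idle from RCU's point of view.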
