Re: frequent softlockups with 3.10rc6.

From: Dave Jones
Date: Thu Jun 20 2013 - 12:29:58 EST


On Thu, Jun 20, 2013 at 09:16:52AM -0700, Paul E. McKenney wrote:
> On Wed, Jun 19, 2013 at 08:12:12PM -0400, Dave Jones wrote:
> > On Wed, Jun 19, 2013 at 11:13:02AM -0700, Paul E. McKenney wrote:
> > > On Wed, Jun 19, 2013 at 01:53:56PM -0400, Dave Jones wrote:
> > > > On Wed, Jun 19, 2013 at 12:45:40PM -0400, Dave Jones wrote:
> > > > > I've been hitting this a lot the last few days.
> > > > > This is the same machine on which I was also seeing lockups during sync().
> > > >
> > > > On a whim, I reverted 971394f389992f8462c4e5ae0e3b49a10a9534a3
> > > > (As I started seeing these just after that rcu merge).
> > > >
> > > > It's only been 30 minutes, but it seems stable again. Normally I would
> > > > hit these within 5 minutes.
> > > >
> > > > I think this may be the same root cause for http://www.spinics.net/lists/kernel/msg1551503.html too.
> > > >
> > > > Paul ?
> > >
> > > ???
> > >
> > > In both cases, I am guessing that you built with CONFIG_PROVE_RCU_DELAY=y.
> > > Even then, this is very strange. I am at a loss as to why udelay(200)
> > > would result in a hang. Or does your system turn udelay() into something
> > > other than a pure spin?
> >
> > Dammit. Paul, you're off the hook (for now).
> > It just took longer to hit.
>
> Well, this commit could significantly increase CPU overhead, which might
> make the bug more likely to occur. (Hey, I can rationalize -anything-!!!)

Bisecting it now. Hopefully by end of day I'll have it figured out.

Dave
