Re: [PATCH 0/5] [GIT PULL] updates for tip/tracing/ftrace

From: Paul E. McKenney
Date: Sat Mar 21 2009 - 17:02:16 EST


On Sat, Mar 21, 2009 at 09:09:19PM +0100, Ingo Molnar wrote:
> * Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx> wrote:
> > On Sat, Mar 21, 2009 at 01:25:23PM -0400, Steven Rostedt wrote:
> > > On Sat, 21 Mar 2009, Ingo Molnar wrote:
> > > > * Ingo Molnar <mingo@xxxxxxx> wrote:

[ . . . ]

> > > > CONFIG_CLASSIC_RCU=y
> > >
> > > All the crashes you reported only happen with classic RCU.
> > >
> > > Paul,
> > >
> > > Did anything change recently that could cause this lockup?
> >
> > Arjan van de Ven is seeing a problem where a single
> > synchronize_rcu() during bootup is taking a full second, which is
> > currently thought to be due to some drivers spinning in the kernel
> > (Arjan is working on a bootgraph that will hopefully pinpoint the
> > problem: http://lkml.org/lkml/2009/3/21/7). If the drivers were
> > also instrumented with ftrace, they might (or might not) slow down
> > even further, depending on exactly why they are spinning.
>
> for one of the hung boxes in the past i waited 24 hours but it never
> unwedged itself. The box that hung today is still hanging and the
> RCU stall detector is still busy printing out those backtraces.

And on the last trace you emailed, the first and the last stall warning
are identical according to "diff". In fact, they are all identical.
That is a bit unusual: one would normally expect to see slight differences
in the stack, because the scheduling-clock interrupt would hit the "longer
than average loop" in a different place each time.

That would indicate either a very tight loop or a loop that has
interrupts enabled only in one spot.
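
For anyone wanting to repeat the "diff" check on a captured console log,
something like the following rough sketch might do. It is only an
illustration: the marker regex, the timestamp format, and the helper names
(split_stall_warnings, all_identical) are assumptions made for this
example, not taken from the kernel source or from the trace in question,
so they would need adjusting to the actual log text.

```python
import re

# ASSUMPTION: each stall warning starts with a line matching this pattern.
STALL_MARKER = re.compile(r"RCU detected CPU \d+ stall")
# ASSUMPTION: lines may carry a printk-style timestamp such as "[  123.456789] ".
TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]\s*")


def split_stall_warnings(log_text):
    """Return one list of backtrace lines per stall warning, timestamps stripped."""
    warnings = []
    current = None
    for raw in log_text.splitlines():
        line = TIMESTAMP.sub("", raw).rstrip()
        if STALL_MARKER.search(line):
            # Start a new warning; the header line itself is skipped because
            # any elapsed-time field in it would differ between warnings.
            current = []
            warnings.append(current)
            continue
        if current is not None and line:
            current.append(line)
    return warnings


def all_identical(warnings):
    """True if there are at least two warnings and every backtrace equals the first."""
    return len(warnings) > 1 and all(w == warnings[0] for w in warnings[1:])
```

If all_identical(split_stall_warnings(text)) comes back True over many
warnings, the interrupt is landing on the same stack every time, which is
the situation described above.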

Thanx, Paul