Re: PROBLEM: 3.0-rc kernels unbootable since -rc3 - under Xen,32-bit guest only.

From: Paul E. McKenney
Date: Tue Jul 12 2011 - 12:46:31 EST


On Tue, Jul 12, 2011 at 12:32:10PM -0400, Konrad Rzeszutek Wilk wrote:
> > > > > http://darnok.org/xen/cpu1.log
> > > >
> > > > OK, a fair amount of variety, then lots and lots of task_waking_fair(),
> > > > so I still feel good about asking you for the following.
> > >
> > > But... But... But...
> > >
> > > Just how accurate are these stack traces? For example, do you have
> > > frame pointers enabled? If not, could you please enable them?
>
> Frame pointers are enabled.
> > >
> > > The reason that I ask is that the wakeme_after_rcu() looks like it is
> > > being invoked from softirq, which would be grossly illegal and could
> > > cause any manner of misbehavior. Did someone put a synchronize_rcu()
> > > into an RCU callback or something? Or did I do something really really
>
> This is a 3.0-rc6 based kernel with the debug patch, the initial
> RCU inhibit patch (where you disable the RCU checking during bootup) and
> that is it.
>
> What is bizarre is that the softirq shows up but there is no corresponding
> Xen eventchannel stack trace - there should also have been xen_evtchn_upcall
> (which is the general code that calls the main IRQ handler, which in turn would
> make the softirq call). This assumes that the IRQ (the timer one) is regularly
> dispatching (which it looks to be doing). Somehow getting just the softirq by
> itself is bizarre.
>
> Perhaps an IPI has been sent that does this. Let me see what a stack
> trace for an IPI looks like.

Thank you for the info!

> > > braindead inside the RCU implementation?
> > >
> > > (I am looking into this last question, but would appreciate any and all
> > > help with the other questions!)
> >
> > OK, I was confusing Julie's, Ravi's, and Konrad's situations.
>
> Do you want me to create a new email thread to keep this one separate?

Let's please keep everyone on copy. I bet that these problems are
related. Plus once we get something that works, it would be good if
everyone could test it.

> > The wakeme_after_rcu() is in fact OK to call from softirq -- if and
> > only if the scheduler is actually running. This is what happens if
> > you do a synchronize_rcu() given your CONFIG_TREE_RCU setup -- an RCU
> > callback is posted that, when invoked, awakens the task that invoked
> > synchronize_rcu().
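
For anyone following along, here is a rough userspace analogy of that
pattern, not the kernel code itself. It uses pthreads in place of the
scheduler and a helper thread in place of callback invocation after a
grace period. Every sketch_* name is hypothetical and exists only for
illustration; only wakeme_after_rcu() and synchronize_rcu() come from
the discussion above.

```c
/* Userspace analogy only, NOT the kernel implementation.
 * All sketch_* names are hypothetical. Build with: cc -pthread */
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct sketch_cb {                      /* stands in for an RCU callback */
    void (*func)(struct sketch_cb *cb);
};

struct sketch_waiter {
    struct sketch_cb cb;                /* first member, so the cast below is valid */
    pthread_mutex_t lock;
    pthread_cond_t cond;
    int done;
};

/* Analogue of wakeme_after_rcu(): the callback does nothing but wake
 * the waiter. The wakeup is only useful if something can actually
 * schedule the waiter again (here, the pthreads scheduler). */
static void sketch_wakeme(struct sketch_cb *cb)
{
    struct sketch_waiter *w = (struct sketch_waiter *)cb;

    pthread_mutex_lock(&w->lock);
    w->done = 1;
    pthread_cond_signal(&w->cond);
    pthread_mutex_unlock(&w->lock);
}

/* Stands in for the context that invokes posted callbacks once the
 * grace period has elapsed (no real grace period is modeled here). */
static void *sketch_invoke_callbacks(void *arg)
{
    struct sketch_cb *cb = arg;

    assert(cb->func != NULL);
    cb->func(cb);
    return NULL;
}

/* Analogue of synchronize_rcu(): post a callback, then sleep until it
 * has been invoked. Returns 1 once awakened, 0 if posting failed. */
static int sketch_synchronize(void)
{
    struct sketch_waiter w;
    pthread_t invoker;

    w.cb.func = sketch_wakeme;
    w.done = 0;
    pthread_mutex_init(&w.lock, NULL);
    pthread_cond_init(&w.cond, NULL);

    if (pthread_create(&invoker, NULL, sketch_invoke_callbacks, &w.cb) != 0) {
        pthread_cond_destroy(&w.cond);
        pthread_mutex_destroy(&w.lock);
        return 0;
    }

    pthread_mutex_lock(&w.lock);
    while (!w.done)                     /* loop guards against spurious wakeups */
        pthread_cond_wait(&w.cond, &w.lock);
    pthread_mutex_unlock(&w.lock);

    pthread_join(invoker, NULL);
    pthread_cond_destroy(&w.cond);
    pthread_mutex_destroy(&w.lock);
    return 1;
}
```

Calling sketch_synchronize() from main() blocks until the helper thread
has run the callback. The point of the analogy is that the callback
itself is trivial; all it needs is a working scheduler to put the
waiter back on a CPU.
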
> >
> > And, based on http://darnok.org/xen/log-rcu-stall, Konrad's system
> > appears to be well past the point where the scheduler is initialized.
> >
> > So I am coming back around to the loop in task_waking_fair().
> >
> > Though the patch I sent out earlier might help, for example, if early
> > invocation of RCU callbacks is somehow messing up the scheduler's
> > initialization.
>
> Ok, let me try it out.

Thank you again!

Thanx, Paul