Re: PROBLEM: 3.0-rc kernels unbootable since -rc3 - under Xen,32-bit guest only.

From: Konrad Rzeszutek Wilk
Date: Tue Jul 12 2011 - 12:32:37 EST


> > > > http://darnok.org/xen/cpu1.log
> > >
> > > OK, a fair amount of variety, then lots and lots of task_waking_fair(),
> > > so I still feel good about asking you for the following.
> >
> > But... But... But...
> >
> > Just how accurate are these stack traces? For example, do you have
> > frame pointers enabled? If not, could you please enable them?

Frame pointers are enabled.
> >
> > The reason that I ask is that the wakeme_after_rcu() looks like it is
> > being invoked from softirq, which would be grossly illegal and could
> > cause any manner of misbehavior. Did someone put a synchronize_rcu()
> > into an RCU callback or something? Or did I do something really really

This is a 3.0-rc6 based kernel with the debug patch and the initial
RCU inhibit patch (where you disable the RCU checking during bootup),
and that is it.

What is bizarre is that the softirq shows up but there is no corresponding
Xen event channel stack trace - there should also have been xen_evtchn_upcall
(which is the general code that calls the main IRQ handler, which in turn would
make the softirq call). This assumes that the IRQ (the timer one) is being
dispatched regularly (which it looks to be). Getting just the softirq by itself
is bizarre.

Perhaps an IPI has been sent that does this. Let me see what a stack
trace for an IPI looks like.

> > braindead inside the RCU implementation?
> >
> > (I am looking into this last question, but would appreciate any and all
> > help with the other questions!)
>
> OK, I was confusing Julie's, Ravi's, and Konrad's situations.

Do you want me to create a new email thread to keep this one separate?

> The wakeme_after_rcu() is in fact OK to call from softirq -- if and
> only if the scheduler is actually running. This is what happens if
> you do a synchronize_rcu() given your CONFIG_TREE_RCU setup -- an RCU
> callback is posted that, when invoked, awakens the task that invoked
> synchronize_rcu().
>
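
For anyone following along, here is a minimal user-space sketch of the
mechanism described above: synchronize_rcu() posts a callback, and that
callback, when invoked later, marks the waiter as done. Every name
prefixed with sketch_ is hypothetical and made up for illustration. The
"done" flag stands in for a real completion, and the callback list stands
in for RCU's own queues. This is not the kernel's code, only the shape of
the idea.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct rcu_head. */
struct sketch_rcu_head {
	struct sketch_rcu_head *next;
	void (*func)(struct sketch_rcu_head *head);
};

/* The waiter: a callback head plus a flag standing in for a completion. */
struct sketch_rcu_synchronize {
	struct sketch_rcu_head head;
	bool done;
};

/* Pending callbacks (assumption: one global list, newest first). */
static struct sketch_rcu_head *sketch_cb_list;

/* Post a callback to be invoked later, as call_rcu() would. */
static void sketch_call_rcu(struct sketch_rcu_head *head,
			    void (*func)(struct sketch_rcu_head *head))
{
	head->func = func;
	head->next = sketch_cb_list;
	sketch_cb_list = head;
}

/* The callback: recover the waiter from its embedded head and wake it. */
static void sketch_wakeme_after_rcu(struct sketch_rcu_head *head)
{
	struct sketch_rcu_synchronize *rcu =
		(struct sketch_rcu_synchronize *)
		((char *)head - offsetof(struct sketch_rcu_synchronize, head));

	rcu->done = true;
}

/*
 * Stand-in for the softirq-time callback processing: invoke and drop
 * every pending callback. Returns how many were invoked.
 */
static int sketch_invoke_callbacks(void)
{
	int invoked = 0;

	while (sketch_cb_list) {
		struct sketch_rcu_head *head = sketch_cb_list;

		sketch_cb_list = head->next;
		head->func(head);
		invoked++;
	}
	return invoked;
}
```

In this sketch the waiter would call sketch_call_rcu() with
sketch_wakeme_after_rcu and then sleep until "done" is set. The wakeup
itself happens inside sketch_invoke_callbacks(), which is why the real
callback showing up under softirq is only a problem if the scheduler is
not yet able to act on that wakeup.
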
> And, based on http://darnok.org/xen/log-rcu-stall, Konrad's system
> appears to be well past the point where the scheduler is initialized.
>
> So I am coming back around to the loop in task_waking_fair().
>
> Though the patch I sent out earlier might help, for example, if early
> invocation of RCU callbacks is somehow messing up the scheduler's
> initialization.

Ok, let me try it out.