Re: Regression in v3.4-rc0 " BUG: soft lockup - CPU#0 stuck for 29s! [migration/0:6]..[<ffffffff810d3b8b>] stop_machine_cpu_stop+0x7b/0xf

From: Konrad Rzeszutek Wilk
Date: Wed Mar 21 2012 - 23:09:12 EST


On Wed, Mar 21, 2012 at 05:32:21PM +0100, Peter Zijlstra wrote:
> On Wed, 2012-03-21 at 17:30 +0100, Peter Zijlstra wrote:
> > On Wed, 2012-03-21 at 16:57 +0100, Peter Zijlstra wrote:
> > > On Wed, 2012-03-21 at 11:26 -0400, Konrad Rzeszutek Wilk wrote:
> > > > On Tue, Mar 20, 2012 at 07:53:22PM -0400, Konrad Rzeszutek Wilk wrote:
> > > > > Seeing this in v3.4-rc0 tree and didn't see that with v3.3:
> > > >
> > > > Hey Peter,
> > > >
> > > > Git bisection points this to the fault of
> > > > 5fbd036b552f633abb394a319f7c62a5c86a9cd7 "sched: Cleanup cpu_active madness"
> > > >
> > > > thoughts? (also attaching the .config)
> > >
> > > Argh.. so when is this? boot? Now that's somewhat unexpected. I have one
> > > report of funnies during a hotplug bash that I'm looking into, but I
> > > haven't actually been able to reproduce that report myself either.
> >
> > is arch/x86/xen/smp.c:cpu_bringup() missing a call to
> > notify_cpu_starting() before doing set_cpu_online()?
> >
> > Also, shouldn't that also take the ipi_call_lock() around setting the
> > cpu online?
>
>
> And before you ask, yes all that should live in generic code... somehow.
> This per-arch replication of the cpu hotplug logic is driving me insane.

Thanks to Peter, here is the patch that fixes the regression.
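
For illustration only, and not the patch itself: the ordering Peter asks
about above (notify_cpu_starting() before set_cpu_online(), with the
ipi_call_lock held around the online transition) can be sketched in
userspace C. Only the function names come from the thread; the stub
bodies, order_log, record() and cpu_bringup_sketch() are hypothetical
stand-ins that just record the call order, so this says nothing about
what the real arch/x86/xen/smp.c change looks like.

```c
#include <string.h>

/* Log of calls, in order. Purely a mock; not kernel code. */
static char order_log[128];

/* Append a function name to the log so the call order can be inspected. */
static void record(const char *name)
{
    strcat(order_log, name);
    strcat(order_log, ";");
}

/* Hypothetical stubs standing in for the kernel functions named in the thread. */
static void notify_cpu_starting(unsigned int cpu) { (void)cpu; record("notify_cpu_starting"); }
static void ipi_call_lock(void)                   { record("ipi_call_lock"); }
static void ipi_call_unlock(void)                 { record("ipi_call_unlock"); }
static void set_cpu_online(unsigned int cpu, int online)
{
    (void)cpu;
    (void)online;
    record("set_cpu_online");
}

/* Sketch of the suggested ordering: notify first, then mark the CPU
 * online while holding the IPI call lock. */
void cpu_bringup_sketch(unsigned int cpu)
{
    notify_cpu_starting(cpu);
    ipi_call_lock();
    set_cpu_online(cpu, 1);
    ipi_call_unlock();
}
```

Calling cpu_bringup_sketch(1) from a small main() and printing order_log
shows the sequence in which the stubs ran.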