Re: WARNING: kernel/smp.c:292 smp_call_function_single [Was: mmotm2009-11-24-16-47 uploaded]

From: Thomas Gleixner
Date: Fri Nov 27 2009 - 11:46:35 EST


On Fri, 27 Nov 2009, Thomas Gleixner wrote:
> On Fri, 27 Nov 2009, Peter Zijlstra wrote:
>
> > On Fri, 2009-11-27 at 16:03 +0100, Jiri Slaby wrote:
> > > On 11/25/2009 01:47 AM, akpm@xxxxxxxxxxxxxxxxxxxx wrote:
> > > > The mm-of-the-moment snapshot 2009-11-24-16-47 has been uploaded to
> > >
> > > Hi, when executing qemu-kvm I often get the following warning and a hard lockup.
> > >
> > > WARNING: at kernel/smp.c:292 smp_call_function_single+0xbd/0x140()
> > > Hardware name: To Be Filled By O.E.M.
> > > Modules linked in: kvm_intel kvm fuse ath5k ath
> > > Pid: 3265, comm: qemu-kvm Not tainted 2.6.32-rc8-mm1_64 #912
> > > Call Trace:
> > > [<ffffffff81039678>] warn_slowpath_common+0x78/0xb0
> > > [<ffffffffa007fd50>] ? __vcpu_clear+0x0/0xd0 [kvm_intel]
> > > [<ffffffff810396bf>] warn_slowpath_null+0xf/0x20
> > > [<ffffffff8106410d>] smp_call_function_single+0xbd/0x140
> > > [<ffffffffa0080af6>] vmx_vcpu_load+0x46/0x170 [kvm_intel]
> > > [<ffffffffa004dd94>] kvm_arch_vcpu_load+0x24/0x60 [kvm]
> > > [<ffffffffa0047a8d>] kvm_sched_in+0xd/0x10 [kvm]
> > > [<ffffffff8102de37>] finish_task_switch+0x67/0xc0
> > > [<ffffffff814699f8>] schedule+0x2f8/0x9c0
> >
> > >
> > > It is a regression against 2009-11-13-19-59.
> > >
> > > Any ideas?
> >
> > Looks like kvm is trying to send an IPI from the preempt notifiers,
> > which are called with IRQs disabled, not a sane thing to do.
> >
> > If they really want that, they'll have to use a pre-allocated struct
> > call_single_data and use __smp_call_function_single.
>
> Hmm, commit 498657a moved the fire_sched_in_preempt_notifiers() call
> into the irqs disabled section recently.
>
> sched, kvm: Fix race condition involving sched_in_preempt_notifers
>
> In finish_task_switch(), fire_sched_in_preempt_notifiers() is
> called after finish_lock_switch().
>
> However, depending on architecture, preemption can be enabled after
> finish_lock_switch() which breaks the semantics of preempt
> notifiers.

This is patently wrong btw.

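schedule()
{

need_resched:
	preempt_disable();
	....
	/* preemption is still disabled across the whole switch */
	task_switch();
	....

	/* only here is the preempt count dropped again */
	preempt_enable_no_resched();
	if (need_resched())
		goto need_resched;
}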
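
To make the point concrete, here is a user-space toy model of the loop above. It is not kernel code: the counter, the pending-resched flag and model_schedule() are stand-ins made up for this sketch, and only the call order mirrors schedule(). It shows that the preempt count never reaches zero while the switch runs.

```c
#include <assert.h>

/* Toy stand-ins, not the kernel's real per-task state. */
static int preempt_count;
static int resched_pending = 1;		/* pretend one more resched is requested */
static int min_count_in_switch = 1 << 30;	/* lowest count seen inside the switch */

static void preempt_disable(void) { preempt_count++; }
static void preempt_enable_no_resched(void) { preempt_count--; }

/* Report a pending reschedule once, then clear it. */
static int need_resched(void)
{
	int pending = resched_pending;

	resched_pending = 0;
	return pending;
}

/* Stand-in for the context switch, including finish_task_switch(). */
static void task_switch(void)
{
	/* Everything called from here runs with preemption disabled. */
	assert(preempt_count > 0);
	if (preempt_count < min_count_in_switch)
		min_count_in_switch = preempt_count;
}

/* Same shape as the schedule() sketch; returns the number of passes. */
static int model_schedule(void)
{
	int passes = 0;

need_resched:
	preempt_disable();
	task_switch();
	passes++;
	preempt_enable_no_resched();
	if (need_resched())
		goto need_resched;
	return passes;
}
```

Calling model_schedule() once makes two passes through the loop (because of the one pending resched), and min_count_in_switch ends up at 1, i.e. the count was never zero between preempt_disable() and preempt_enable_no_resched().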

>
> So move it before finish_arch_switch(). This also makes the in-
> notifiers symmetric to out- notifiers in terms of locking - now
> both are called under rq lock.
>
> It's not a surprise that this breaks the existing code which does the
> smp function call.

Thanks,

tglx