Re: [nohz] 2a16fc93d2c: kernel lockup on idle injection

From: Thomas Gleixner
Date: Tue Dec 16 2014 - 17:55:18 EST


On Tue, 16 Dec 2014, Peter Zijlstra wrote:
> On Tue, Dec 16, 2014 at 10:21:27PM +0100, Thomas Gleixner wrote:
> > /* rq->lock is held for evaluating rq->nr_running */
> > static void sched_ttwu_remote_nohz(struct rq *rq)
> > {
> >         if (nohz_full_disabled())
> >                 return;
> >
> >         if (rq->nr_running != 2)
> >                 return;
> >         /*
> >          * Force smp_send_reschedule(). irq_exit() on the
> >          * remote cpu will handle the rest.
> >          */
>
> smp_send_reschedule() is magic and does not guarantee irq_{enter,exit}()
> being called, although we could audit and fix that.

I know. I did not want to do all the work for the nohz
folks. Otherwise I would have simply sent a patch. :)

> >         if (!task_uses_posix_timers(task))
> >                 clear_bit(NOHZ_POSIXTIMER_NEEDS_TICK,
> >                           this_cpu_ptr(nohz_full_must_tick));
> >         else
> >                 set_bit(NOHZ_POSIXTIMER_NEEDS_TICK,
> >                         this_cpu_ptr(nohz_full_must_tick));
> >
>
> /me hands you a few spare {} :-)

Without the proper instruction manual they are pretty useless.

> Arguably test state before doing a possibly pointless update?
>
> >         local_irq_disable();
> >         tick_full_nohz_update_state();
> >         local_irq_enable();
> > }
>
> But yes, that should work just fine..

So I'm not the only one who thinks that this needs a proper
reimplementation :)
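
For reference, the "test state before doing a possibly pointless update"
suggestion could look roughly like the user-space C11 sketch below. This
is only an illustration under stated assumptions: demo_must_tick,
DEMO_POSIXTIMER_NEEDS_TICK and demo_update_bit() are hypothetical
stand-ins, not kernel interfaces, and C11 atomics take the place of the
kernel's set_bit()/clear_bit() on a per-cpu word.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for NOHZ_POSIXTIMER_NEEDS_TICK. */
enum { DEMO_POSIXTIMER_NEEDS_TICK = 0 };

/* Hypothetical stand-in for the per-cpu nohz_full_must_tick word. */
static atomic_ulong demo_must_tick;

/*
 * Set or clear one bit, but only issue the atomic read-modify-write
 * when the bit actually has to change. Returns true if the word was
 * written, false if the update would have been pointless.
 */
static bool demo_update_bit(atomic_ulong *word, unsigned int bit, bool want)
{
        unsigned long mask;

        assert(bit < 8 * sizeof(unsigned long));
        mask = 1UL << bit;

        /* Cheap relaxed load first: the common case is "nothing to do". */
        if (!!(atomic_load_explicit(word, memory_order_relaxed) & mask) == want)
                return false;

        if (want)
                atomic_fetch_or(word, mask);
        else
                atomic_fetch_and(word, ~mask);
        return true;
}
```

The early return skips the locked read-modify-write when the bit is
already in the wanted state, which is the case such a check is meant to
make cheap. Whether that pays off in the real code depends on how often
the state actually changes.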

Thanks,

tglx
