Re: [PATCH v15 04/13] task_isolation: add initial support

From: Peter Zijlstra
Date: Tue Aug 30 2016 - 03:59:15 EST


On Mon, Aug 29, 2016 at 12:40:32PM -0400, Chris Metcalf wrote:
> On 8/29/2016 12:33 PM, Peter Zijlstra wrote:
> >On Tue, Aug 16, 2016 at 05:19:27PM -0400, Chris Metcalf wrote:
> >>+	/*
> >>+	 * Request rescheduling unless we are in full dynticks mode.
> >>+	 * We would eventually get pre-empted without this, and if
> >>+	 * there's another task waiting, it would run; but by
> >>+	 * explicitly requesting the reschedule, we may reduce the
> >>+	 * latency. We could directly call schedule() here as well,
> >>+	 * but since our caller is the standard place where schedule()
> >>+	 * is called, we defer to the caller.
> >>+	 *
> >>+	 * A more substantive approach here would be to use a struct
> >>+	 * completion here explicitly, and complete it when we shut
> >>+	 * down dynticks, but since we presumably have nothing better
> >>+	 * to do on this core anyway, just spinning seems plausible.
> >>+	 */
> >>+	if (!tick_nohz_tick_stopped())
> >>+		set_tsk_need_resched(current);
> >This is broken.. and it would be really good if you don't actually need
> >to do this.
>
> Can you elaborate? We clearly do want to wait until we are in full
> dynticks mode before we return to userspace.
>
> We could do it just in the prctl() syscall only, but then we lose the
> ability to implement the NOSIG mode, which can be a convenience.

So this isn't spelled out anywhere. Why does this need to be in the
return to user path?

> Even without that consideration, we really can't be sure we stay in
> dynticks mode if we disable the dynamic tick, but then enable interrupts,
> and end up taking an interrupt on the way back to userspace, and
> it turns the tick back on. That's why we do it here, where we know
> interrupts will stay disabled until we get to userspace.

But but but.. task_isolation_enter() is explicitly run with IRQs
_enabled_!! It even WARNs if they're disabled.

> So if we are doing it here, what else can/should we do? There really
> shouldn't be any other tasks waiting to run at this point, so there's
> not a heck of a lot else to do on this core. We could just spin and
> check need_resched and signal status manually instead, but that
> seems kind of duplicative of code already done in our caller here.

What!? I really don't get this. What are you waiting for? Why is
rescheduling making things better?
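
To make the question concrete, here is a rough userspace model of what the
quoted hunk appears to do when it is reached from the return-to-user work
loop. This is not kernel code. Every name below is a hypothetical stand-in,
and the assumption that the tick stops after a fixed number of passes is
purely illustrative (in reality it depends on timers, RCU and so on).

```c
#include <stdbool.h>

/* Hypothetical stand-ins for kernel state; none of this is kernel code. */
static bool need_resched_flag;      /* models TIF_NEED_RESCHED */
static int passes_until_tick_stops = 3; /* assumed: tick stops after 3 passes */
static int schedule_calls;          /* how often the model "scheduled" */

/* Models tick_nohz_tick_stopped(). */
static bool tick_stopped_model(void)
{
	return passes_until_tick_stops == 0;
}

/* Models the quoted hunk: ask for a reschedule while the tick still runs. */
static void isolation_enter_model(void)
{
	if (!tick_stopped_model())
		need_resched_flag = true;
}

/* Models schedule() with no other runnable task: it just comes back. */
static void schedule_model(void)
{
	need_resched_flag = false;
	schedule_calls++;
}

/* Stand-in for whatever eventually lets the tick stop. */
static void time_passes_model(void)
{
	if (passes_until_tick_stops > 0)
		passes_until_tick_stops--;
}

/*
 * Models the return-to-user work loop: keep going while work is pending.
 * Returns the number of passes taken before "returning to userspace".
 */
static int return_to_user_model(void)
{
	int passes = 0;

	do {
		if (need_resched_flag)
			schedule_model();
		isolation_enter_model();
		time_passes_model();
		passes++;
	} while (need_resched_flag);

	return passes;
}
```

With the defaults above, return_to_user_model() takes 4 passes and calls
schedule_model() 3 times. In this model the reschedule does no useful work;
it only serves to re-run the loop until the tick happens to be stopped.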