Re: [PATCH v15 04/13] task_isolation: add initial support

From: Chris Metcalf
Date: Mon Aug 29 2016 - 13:13:59 EST


On 8/29/2016 12:33 PM, Peter Zijlstra wrote:
On Tue, Aug 16, 2016 at 05:19:27PM -0400, Chris Metcalf wrote:
+ /*
+ * Request rescheduling unless we are in full dynticks mode.
+ * We would eventually get pre-empted without this, and if
+ * there's another task waiting, it would run; but by
+ * explicitly requesting the reschedule, we may reduce the
+ * latency. We could directly call schedule() here as well,
+ * but since our caller is the standard place where schedule()
+ * is called, we defer to the caller.
+ *
+ * A more substantive approach here would be to use a struct
+ * completion here explicitly, and complete it when we shut
+ * down dynticks, but since we presumably have nothing better
+ * to do on this core anyway, just spinning seems plausible.
+ */
+ if (!tick_nohz_tick_stopped())
+ set_tsk_need_resched(current);
This is broken.. and it would be really good if you don't actually need
to do this.

Can you elaborate? We clearly do want to wait until we are in full
dynticks mode before we return to userspace.

We could do it just in the prctl() syscall only, but then we lose the
ability to implement the NOSIG mode, which can be a convenience.

Even without that consideration, we really can't be sure we stay in
dynticks mode if we disable the dynamic tick, but then enable interrupts,
and end up taking an interrupt on the way back to userspace, and
it turns the tick back on. That's why we do it here, where we know
interrupts will stay disabled until we get to userspace.

So if we are doing it here, what else can/should we do? There really
shouldn't be any other tasks waiting to run at this point, so there's
not a heck of a lot else to do on this core. We could just spin and
check need_resched and signal status manually instead, but that
seems kind of duplicative of code already done in our caller here.

So... thoughts?

--
Chris Metcalf, Mellanox Technologies
http://www.mellanox.com