Re: [RFC][PATCH v5 00/14] sched: packing tasks

From: Catalin Marinas
Date: Mon Nov 11 2013 - 13:20:01 EST


On Mon, Nov 11, 2013 at 04:39:45PM +0000, Arjan van de Ven wrote:
> > I think the scheduler simply wants to say: we expect to go idle for X
> > ns, we want a guaranteed wakeup latency of Y ns -- go do your thing.
>
> as long as Y normally is "large" or "infinity" that is ok ;-)
> (a smaller Y will increase power consumption and decrease system performance)

Cpuidle already takes a latency into account via pm_qos. The scheduler
could pass this information down to the hardware driver or the cpuidle
driver could use pm_qos directly (as it's currently done in governors).

The scheduler may have its own requirements in terms of latency (e.g.
some real-time thread) and we could extend the pm_qos API with
per-thread information. But so far we don't have a way to pass such
per-thread requirements from user space (unless we assume that any
real-time thread has some fixed latency requirements). I suggest we
ignore this per-thread part until we find an actual need.

> > I think you also raised the point in that we do want some feedback as to
> > the cost of waking up particular cores to better make decisions on which
> > to wake. That is indeed so.
>
> having a hardware driver give a prefered CPU ordering for wakes can indeed be useful.
> (I'm doubtful that changing the recommendation for each idle is going to pay off,
> but proof is in the pudding; there are certainly long term effects where this can help)

The ordering is based on the actual C-state, so a simple way is to wake
up the CPU in the shallowest C-state. With asymmetric configurations
(big.LITTLE) we have different costs for the same C-state, so this would
come in handy.

Even for symmetric configuration, the cost of moving a task to a CPU
includes wake-up cost plus the run-time cost which depends on the
P-state after wake-up (that's much trickier since we can't easily
estimate the cost of a P-state and it may change once you place a task
on it).

--
Catalin
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/