Re: [RFC PATCHC 3/3] sched/fair: use the idle state info to choose the idlest cpu

From: Nicolas Pitre
Date: Thu Apr 17 2014 - 11:53:50 EST


On Thu, 17 Apr 2014, Daniel Lezcano wrote:

> Ok, refreshed the patchset, but before sending it out I would like to discuss
> the rationale of the changes and the policy, and change the patchset
> accordingly.
>
> What order to choose if the cpu is idle ?
>
> Let's assume all cpus are idle on a dual socket quad core.
>
> Also, we can reasonably assume that if the cluster is in low power mode,
> the cpus belonging to the same cluster are in the same idle state (setting
> aside auto-promotion, which we don't have control over).
>
> If the policy you mention above is 'aggressive power saving', we can follow
> these rules, in decreasing priority:
>
> 1. We want to avoid waking up the entire cluster
> => as the cpus are in the same idle state, by choosing a cpu in shallow
> state, we should have the guarantee we won't wakeup a cluster (except if no
> shallowest idle cpu are found).

This is unclear to me. Obviously, if an entire cluster is down, that
means all the CPUs it contains have been idle for a long time. And
therefore they shouldn't be subject to selection unless there are no
other CPUs available. Is that what you mean?

> 2. We want to avoid waking up a cpu which has not reached its target residency
> time (will need some work to unify cpuidle idle time and idle task run time)
> => with the target residency and, as a first step, with the idle stamp,
> we can determine if the cpu slept enough

Agreed. However, right now, the scheduler does not have any
consideration for that. So this should be done as a separate patch.
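
For the sake of discussion, here is a minimal userspace sketch of what such
a check could look like. The struct and its fields are hypothetical and do
not correspond to anything in the scheduler or cpuidle today; it only
assumes we can get an idle-entry timestamp and the target residency of the
state the CPU is in.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-CPU idle snapshot; these names are made up. */
struct idle_info {
	uint64_t idle_stamp_ns;        /* when the CPU entered idle */
	uint64_t target_residency_ns;  /* residency needed to break even */
};

/* Return true if the CPU has been idle for at least its target residency. */
static bool slept_enough(const struct idle_info *ci, uint64_t now_ns)
{
	/* A stamp in the future means we cannot tell; be conservative. */
	if (now_ns < ci->idle_stamp_ns)
		return false;

	return now_ns - ci->idle_stamp_ns >= ci->target_residency_ns;
}
```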

> 3. We want to avoid waking up a cpu in deep idle state
> => by looking for the cpu in shallowest idle state

Obvious.
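
Something along these lines, as a rough userspace sketch only. The struct
and the state numbering (-1 for a busy CPU, 0 for the shallowest state,
larger meaning deeper) are assumptions made for illustration, not the
actual cpuidle representation.

```c
#include <stddef.h>

/* Hypothetical view of one CPU: state < 0 means not idle, 0 is shallowest. */
struct cpu_idle {
	int cpu;
	int state;
};

/*
 * Return the id of the idle CPU in the shallowest state, or -1 if no CPU
 * is idle. On a tie, the first CPU found wins.
 */
static int shallowest_idle_cpu(const struct cpu_idle *cpus, size_t n)
{
	int best_cpu = -1;
	int best_state = 0;

	for (size_t i = 0; i < n; i++) {
		if (cpus[i].state < 0)
			continue;	/* busy CPU, not a candidate */

		if (best_cpu < 0 || cpus[i].state < best_state) {
			best_cpu = cpus[i].cpu;
			best_state = cpus[i].state;
		}
	}

	return best_cpu;
}
```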

> 4. We want to avoid waking up a cpu whose exit latency is longer than
> the expected run time of the task (and the time to migrate the task ?)

Sure. That would be a case for using task packing even when the policy
is set to performance, whereas task packing is normally a powersave
technique.
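
The comparison in rule 4 could be sketched as below. This is illustrative
userspace code, not a proposal for the actual patch: the function name and
parameters are hypothetical, and it assumes we somehow have an estimate of
the task's expected run time and of the migration cost.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true if waking the idle CPU is worthwhile, i.e. its exit latency
 * plus the (estimated) migration cost does not exceed the task's expected
 * run time. All three inputs are assumed to be in nanoseconds.
 */
static bool worth_waking(uint64_t exit_latency_ns,
			 uint64_t migration_cost_ns,
			 uint64_t expected_run_ns)
{
	/* Guard against overflow in the sum below. */
	if (exit_latency_ns > UINT64_MAX - migration_cost_ns)
		return false;

	return exit_latency_ns + migration_cost_ns <= expected_run_ns;
}
```

When this returns false for every idle CPU, packing the task onto an
already running CPU is the remaining option.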


Nicolas