[RFC] A new CPU load metric for power-efficient scheduler: CPU ConCurrency

From: Yuyang Du
Date: Thu Apr 24 2014 - 23:35:33 EST


Hi Ingo, PeterZ, and others,

The current scheduler's load balancing is completely work-conserving. In some
workloads, generally those with low CPU utilization punctuated by CPU bursts
of transient tasks, migrating tasks to engage all available CPUs for the sake
of work conservation can lead to significant overhead: cache locality loss,
idle/active HW state transition latency and power, shallower idle states,
etc. These costs hurt both power and performance, especially on today's
low-power mobile processors.

This RFC introduces a sense of idleness-conserving into work-conserving (by
all means, we really don't want to be overwhelming in only one way). But to
what extent should idleness be conserved, bearing in mind that we don't
want to sacrifice performance? We first need a load/idleness indicator to that
end.

Thanks to CFS's "model an ideal, precise multi-tasking CPU", tasks can be seen
as concurrently running (the tasks in the runqueue). So it is natural to use
task concurrency as the load indicator. Having said that, we do two things:

1) Divide continuous time into periods, and average the task concurrency
within each period, to tolerate transient bursts:

    a = sum(concurrency * time) / period

2) Exponentially decay past periods and sum them all, for hysteresis
to load drops or resilience to load rises (let f be the decay factor, and a_x
the average of the xth period since period 0):

    s = a_n + f^1 * a_(n-1) + f^2 * a_(n-2) + ... + f^(n-1) * a_1 + f^n * a_0
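
To make the two steps concrete, here is a minimal userspace sketch in C. It
is not the patch itself: the struct and function names (cc_state, cc_init,
cc_account, cc_end_period) are made up for illustration, and it uses doubles
where kernel code would presumably use fixed-point arithmetic. It relies on
the fact that the decayed sum above can be computed incrementally as
s_n = a_n + f * s_(n-1).

```c
#include <assert.h>

/* Hypothetical tracking state; all names here are illustrative only. */
struct cc_state {
    double period;   /* length of one averaging period (any time unit) */
    double decay;    /* decay factor f, with 0 <= f < 1 */
    double weighted; /* sum(concurrency * time) in the current period */
    double sum;      /* decayed sum s over all completed periods */
};

static void cc_init(struct cc_state *cc, double period, double decay)
{
    assert(period > 0.0);
    assert(decay >= 0.0 && decay < 1.0);
    cc->period = period;
    cc->decay = decay;
    cc->weighted = 0.0;
    cc->sum = 0.0;
}

/* Account dt time units spent with nr_running concurrent tasks. */
static void cc_account(struct cc_state *cc, unsigned int nr_running, double dt)
{
    cc->weighted += (double)nr_running * dt;
}

/*
 * Close the current period:
 *   a = sum(concurrency * time) / period
 *   s = a + f * s_prev   (equivalent to the expanded sum over all periods)
 */
static double cc_end_period(struct cc_state *cc)
{
    double a = cc->weighted / cc->period;

    cc->sum = a + cc->decay * cc->sum;
    cc->weighted = 0.0;
    return cc->sum;
}
```

For example, with period = 10 and f = 0.5, a period with two tasks running
for 5 units and none for the other 5 gives a = 1 and s = 1; a following
period with one task running throughout gives s = 1 + 0.5 * 1 = 1.5; a
fully idle period after that decays s to 0.75.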

We name this load indicator CPU ConCurrency (CC): task concurrency
determines how many CPUs need to be running concurrently.

To track CC, we intercept the scheduler in 1) enqueue, 2) dequeue, 3)
scheduler tick, and 4) enter/exit idle.
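
As a rough illustration of how such interception could work, the C sketch
below brings the accounting up to date at every event and only then changes
the concurrency. Everything in it is an assumption for illustration: the
names (cc_rq, cc_update, cc_enqueue, cc_dequeue, cc_tick), the period length
CC_PERIOD and the decay factor CC_DECAY are hypothetical and not taken from
the patch. Entering or exiting idle would call cc_update() in the same way
as the tick does.

```c
#include <assert.h>
#include <stdint.h>

#define CC_PERIOD 1000u /* hypothetical period length, arbitrary time units */
#define CC_DECAY  0.5   /* hypothetical decay factor f */

/* Hypothetical per-runqueue state; zero-initialize before first use. */
struct cc_rq {
    unsigned int nr_running; /* current task concurrency */
    uint64_t last_update;    /* time of the last accounting event */
    uint64_t period_start;   /* start time of the current period */
    uint64_t weighted;       /* sum(concurrency * time) in this period */
    double sum;              /* decayed sum over completed periods */
};

/*
 * Account the time since the last event at the concurrency that held
 * during it, closing every period boundary crossed on the way.
 */
static void cc_update(struct cc_rq *rq, uint64_t now)
{
    assert(now >= rq->last_update);

    while (now - rq->period_start >= CC_PERIOD) {
        uint64_t period_end = rq->period_start + CC_PERIOD;

        rq->weighted += (uint64_t)rq->nr_running *
                        (period_end - rq->last_update);
        rq->sum = (double)rq->weighted / CC_PERIOD + CC_DECAY * rq->sum;
        rq->weighted = 0;
        rq->period_start = period_end;
        rq->last_update = period_end;
    }

    rq->weighted += (uint64_t)rq->nr_running * (now - rq->last_update);
    rq->last_update = now;
}

/* A task becomes runnable: account first, then raise the concurrency. */
static void cc_enqueue(struct cc_rq *rq, uint64_t now)
{
    cc_update(rq, now);
    rq->nr_running++;
}

/* A task leaves the runqueue: account first, then lower the concurrency. */
static void cc_dequeue(struct cc_rq *rq, uint64_t now)
{
    cc_update(rq, now);
    assert(rq->nr_running > 0);
    rq->nr_running--;
}

/* Scheduler tick: only bring the accounting up to date. */
static void cc_tick(struct cc_rq *rq, uint64_t now)
{
    cc_update(rq, now);
}
```

Updating before changing nr_running matters: the elapsed interval has to be
charged at the concurrency that was actually in effect during it.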

Using CC, we implemented a Workload Consolidation patch on two Intel mobile
platforms (a quad-core composed of two dual-core modules): load and load
balancing are contained in the first dual-core module when the aggregated CC
is low, and otherwise spread across the full quad-core. Results show that we
got power savings and no substantial performance regression (even gains for
some).
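
The consolidation decision could look roughly like the C sketch below. This
is only a guess at the shape of the logic: the function name cc_pick_domain,
the enum values, and in particular the threshold parameter are hypothetical,
since this mail does not say how "low" is defined or how the per-CPU CC
values are aggregated (a plain sum is assumed here).

```c
#include <assert.h>
#include <stddef.h>

/* Number of CPUs to use for load and load balancing (illustrative). */
enum {
    CC_SMALL_DOMAIN = 2, /* first dual-core module only */
    CC_FULL_DOMAIN  = 4  /* full quad-core */
};

/*
 * Sum the per-CPU CC values and compare the aggregate against a
 * caller-supplied threshold. The threshold is an assumption; the
 * original mail does not specify one.
 */
static int cc_pick_domain(const double *cpu_cc, size_t nr_cpus,
                          double threshold)
{
    double total = 0.0;

    assert(cpu_cc != NULL);
    for (size_t i = 0; i < nr_cpus; i++)
        total += cpu_cc[i];

    return total < threshold ? CC_SMALL_DOMAIN : CC_FULL_DOMAIN;
}
```

With a threshold of 2.0, per-CPU values of {0.5, 0.25, 0, 0} would select
the dual-core module, while {1.5, 1.0, 0.5, 0.25} would select the full
quad-core.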

Thanks,
Yuyang