[PATCH] sched/fair: Reset ::runnable_load_avg when dequeueing last entity

From: Matt Fleming
Date: Thu Jun 09 2016 - 14:48:14 EST


The task and runqueue load averages, maintained in p->se.avg.load_avg
and cfs_rq->runnable_load_avg respectively, can decay at different
wall clock rates. As a result, enqueueing and then dequeueing a task
on an otherwise empty runqueue doesn't always leave
::runnable_load_avg with its initial value.

This can leave cfs_rq->runnable_load_avg with a non-zero value even
though there are no runnable entities on the runqueue. Assuming no
entity is enqueued on this runqueue for some time, this residual load
average will decay gradually as the load averages are updated.

But we can optimise the special case of dequeueing the last entity and
reset ::runnable_load_avg early. This gives the load balancer a more
up-to-date view of how busy the CPU is, which improves performance for
workloads that trigger it, such as fork-heavy applications when
SD_BALANCE_FORK is set.

Signed-off-by: Matt Fleming <matt@xxxxxxxxxxxxxxxxxxx>
---
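For illustration only, the effect described in the commit message can be
sketched as a small userspace model. This is not kernel code: the toy_*
types and functions and sub_clamped() are hypothetical names, the model
folds the nr_running decrement into the dequeue helper (the kernel
updates it elsewhere), and the load values used with it are made-up
numbers chosen to show a residual.

```c
#include <assert.h>

/* Toy stand-ins for the kernel structures; only the fields used here. */
struct toy_cfs_rq {
	unsigned long runnable_load_avg;
	unsigned int nr_running;
};

struct toy_se {
	unsigned long load_avg;
};

/* Subtract without going below zero, mirroring the max_t(long, ..., 0) clamp. */
static unsigned long sub_clamped(unsigned long a, unsigned long b)
{
	return a > b ? a - b : 0;
}

/* Behaviour before the patch: always subtract the entity's contribution. */
static void toy_dequeue_old(struct toy_cfs_rq *rq, const struct toy_se *se)
{
	assert(rq->nr_running > 0);	/* dequeue from an empty queue is a caller bug */
	rq->runnable_load_avg = sub_clamped(rq->runnable_load_avg, se->load_avg);
	rq->nr_running--;		/* assumption: done elsewhere in the kernel */
}

/* Behaviour with the patch: reset to zero when the last entity leaves. */
static void toy_dequeue_new(struct toy_cfs_rq *rq, const struct toy_se *se)
{
	unsigned long load_avg = 0;

	assert(rq->nr_running > 0);
	if (rq->nr_running > 1)
		load_avg = sub_clamped(rq->runnable_load_avg, se->load_avg);
	rq->runnable_load_avg = load_avg;
	rq->nr_running--;
}
```

With a runqueue average of 1000 and a task average that has drifted to
900, dequeueing the only task leaves a residual of 100 in the old model
and 0 in the new one. With two entities queued, both models subtract
and clamp in the same way.
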
 kernel/sched/fair.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index c6dd8bab010c..408ee90c7ea8 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3007,10 +3007,20 @@ enqueue_entity_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *se)
 static inline void
 dequeue_entity_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *se)
 {
+	unsigned long load_avg = 0;
+
 	update_load_avg(se, 1);
 
-	cfs_rq->runnable_load_avg =
-		max_t(long, cfs_rq->runnable_load_avg - se->avg.load_avg, 0);
+	/*
+	 * If we're about to dequeue the last runnable entity we can
+	 * reset the runnable load average to zero instead of waiting
+	 * for it to decay naturally. This gives the load balancer a
+	 * more timely and accurate view of how busy this cpu is.
+	 */
+	if (cfs_rq->nr_running > 1)
+		load_avg = max_t(long, cfs_rq->runnable_load_avg - se->avg.load_avg, 0);
+
+	cfs_rq->runnable_load_avg = load_avg;
 	cfs_rq->runnable_load_sum =
 		max_t(s64, cfs_rq->runnable_load_sum - se->avg.load_sum, 0);
 }
--
2.7.3