Re: [rfc patch] sched/fair: Use instantaneous load for fork/exec balancing

From: Matt Fleming
Date: Mon Jul 04 2016 - 11:05:03 EST


On Wed, 15 Jun, at 04:32:58PM, Dietmar Eggemann wrote:
> On 14/06/16 17:40, Mike Galbraith wrote:
> > On Tue, 2016-06-14 at 15:14 +0100, Dietmar Eggemann wrote:
> >
> >> IMHO, the hackbench performance "boost" w/o 0905f04eb21f is due to the
> >> fact that a new task gets all it's load decayed (making it a small task)
> >> in the __update_load_avg() call in remove_entity_load_avg() because its
> >> se->avg.last_update_time value is 0 which creates a huge time difference
> >> comparing it to cfs_rq->avg.last_update_time. The patch 0905f04eb21f
> >> avoids this and thus the task stays big se->avg.load_avg = 1024.
> >
> > I don't care much at all about the hackbench "regression" in its own
> > right, and what causes it, for me, bottom line is that there are cases
> > where we need to be able to resolve, and can't, simply because we're
> > looking at a fuzzy (rippling) reflection.
>
> Understood. I just thought it would be nice to know why 0905f04eb21f
> makes this problem even more visible. But so far I wasn't able to figure
> out why this diff in se->avg.load_avg [1024 versus 0] has this effect on
> cfs_rq->runnable_load_avg making it even less suitable in find idlest*.
> enqueue_entity_load_avg()'s cfs_rq->runnable_load_* += sa->load_* looks
> suspicious though.

In my testing without 0905f04eb21f I saw that se->avg.load_avg
actually managed to skip being decayed at all before the task was
dequeued, which meant that cfs_rq->runnable_load_avg was more likely
to be zero after dequeue for workloads like hackbench, which are
essentially just a fork bomb.

se->avg.load_avg evaded decay because se->avg.period_contrib was being
zeroed in __update_load_avg().
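
To make the period_contrib effect concrete, here is a toy model of the
period-boundary check. It is a sketch, not the kernel code: struct
toy_avg, toy_update() and the 1002/1024 per-period decay factor (an
approximation of y, where y^32 = 1/2) are made up for illustration, and
the starting value of 1023 in the second case is an assumption about
how a freshly initialised entity looks. The point is only that load_avg
is left alone unless delta plus period_contrib reaches a full 1024us
period.

```c
#include <stdint.h>

#define TOY_PERIOD_US 1024u	/* one PELT period, roughly 1ms */

/* Hypothetical stand-in for the two sched_avg fields that matter here. */
struct toy_avg {
	uint32_t period_contrib;	/* us accumulated in the current period */
	uint64_t load_avg;
};

/*
 * Toy version of the boundary test: returns 1 if at least one period
 * boundary was crossed (and load_avg was decayed), 0 otherwise.
 */
static int toy_update(struct toy_avg *sa, uint64_t delta_us)
{
	uint64_t total = delta_us + sa->period_contrib;
	uint64_t periods = total / TOY_PERIOD_US;

	/* Carry the remainder into the next, partially filled period. */
	sa->period_contrib = (uint32_t)(total % TOY_PERIOD_US);

	if (!periods)
		return 0;	/* no boundary crossed: load_avg untouched */

	if (periods > 2048) {	/* long idle: treat as fully decayed */
		sa->load_avg = 0;
		return 1;
	}

	/* Approximate one period of decay as a factor of 1002/1024. */
	while (periods--)
		sa->load_avg = sa->load_avg * 1002 / 1024;

	return 1;
}
```

With period_contrib == 0, a task that runs for 500us before being
dequeued never crosses a boundary and keeps load_avg at 1024. With
period_contrib == 1023, the same 500us crosses one boundary and the
toy decays load_avg to 1002.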

With 0905f04eb21f applied, it's less likely (though not impossible)
that ->period_contrib will be zeroed and so we usually end up with
some residual load in cfs_rq->runnable_load_avg on dequeue, and hence,

cfs_rq->runnable_load_avg > se->avg.load_avg

even if 'se' is the only task on the runqueue.
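
A minimal sketch of how that inequality turns into residual load,
assuming the dequeue path subtracts the entity's load_avg from the
runqueue's runnable_load_avg and clamps the result at zero.
toy_dequeue() is a hypothetical helper and the values in the note
below are illustrative, not measurements.

```c
#include <stdint.h>

/*
 * Toy model of the dequeue bookkeeping: remove the entity's
 * contribution from the runqueue total, never going below zero.
 */
static int64_t toy_dequeue(int64_t rq_runnable_load_avg, int64_t se_load_avg)
{
	int64_t left = rq_runnable_load_avg - se_load_avg;

	return left > 0 ? left : 0;
}
```

If both values are still 1024 at dequeue, nothing is left behind. If
the entity has been decayed to, say, 1002 while the runqueue value is
still 1024, toy_dequeue() leaves 22 behind even though the runqueue is
now empty.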

FYI, below is my quick and dirty hack that restored hackbench
performance for the few machines I checked. I didn't try schbench with
it.

---