Re: [RFC 3/4] sched/fair: replace runnable load average by runnable average

From: Mel Gorman
Date: Wed Feb 12 2020 - 09:30:31 EST


On Tue, Feb 11, 2020 at 06:46:50PM +0100, Vincent Guittot wrote:
> Now that runnable_load_avg is not more used, we can replace it by a new
> signal that will highlight the runnable pressure on a cfs_rq. This signal
> track the waiting time of tasks on rq and can help to better define the
> state of rqs.
>
> At now, only util_avg is used to define the state of a rq:
> A rq with more that around 80% of utilization and more than 1 tasks is
> considered as overloaded.
>
> But the util_avg signal of a rq can become temporaly low after that a task
> migrated onto another rq which can bias the classification of the rq.
>
> When tasks compete for the same rq, their runnable average signal will be
> higher than util_avg as it will include the waiting time and we can use
> this signal to better classify cfs_rqs.
>
> The new runnable_avg will track the runnable time of a task which simply
> adds the waiting time to the running time. The runnbale _avg of cfs_rq
> will be the /Sum of se's runnable_avg and the runnable_avg of group entity
> will follow the one of the rq similarly to util_avg.
>

s/runnbale/runnable/

Otherwise, all I can do is give a heads-up that I will not be able to
review this patch and the next patch properly in the short-term. While the
new metric appears to have a sensible definition, I've not spent enough
time comparing/contrasting the pro's and con's of PELT implementation
details or their consequences. I am not confident I can accurately
predict whether this is better or if there are corner cases that make
poor placement decisions based on fast changes of runnable_avg. At least
not within a reasonable amount of time.

This caught my attention though

> @@ -4065,8 +4018,8 @@ enqueue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
> * - Add its new weight to cfs_rq->load.weight
> */
> update_load_avg(cfs_rq, se, UPDATE_TG | DO_ATTACH);
> + se_update_runnable(se);
> update_cfs_group(se);
> - enqueue_runnable_load_avg(cfs_rq, se);
> account_entity_enqueue(cfs_rq, se);
>
> if (flags & ENQUEUE_WAKEUP)

I don't think the ordering matters any more because of what was removed
from update_cfs_group. Unfortunately, I'm not 100% confident so am
bringing it to your attention in case it does.

--
Mel Gorman
SUSE Labs