Re: [PATCH v10 8/9] sched/fair: Add latency list

From: Peter Zijlstra
Date: Wed Feb 22 2023 - 04:52:42 EST


On Fri, Jan 13, 2023 at 03:12:33PM +0100, Vincent Guittot wrote:

> +static void __enqueue_latency(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
> +{
> +
> + /* Only latency sensitive entity can be added to the list */
> + if (se->latency_offset >= 0)
> + return;
> +
> + if (!RB_EMPTY_NODE(&se->latency_node))
> + return;
> +
> + /*
> + * An execution time less than sysctl_sched_min_granularity means that
> + * the entity has been preempted by a higher sched class or an entity
> + * with higher latency constraint.
> + * Put it back in the list so it gets a chance to run 1st during the
> + * next slice.
> + */
> + if (!(flags & ENQUEUE_WAKEUP)) {
> + u64 delta_exec = se->sum_exec_runtime - se->prev_sum_exec_runtime;
> +
> + if (delta_exec >= sysctl_sched_min_granularity)
> + return;
> + }

I'm not a big fan of this dynamic enqueueing condition; it makes it
rather hard to interpret the below addition to pick_next_entity().

Let me think about this more... at the very least the comment with
__pick_first_latency() use below needs to be expanded upon if we keep it
like so.

> +
> + rb_add_cached(&se->latency_node, &cfs_rq->latency_timeline, __latency_less);
> +}

> @@ -4966,7 +5040,7 @@ static struct sched_entity *
> pick_next_entity(struct cfs_rq *cfs_rq, struct sched_entity *curr)
> {
> struct sched_entity *left = __pick_first_entity(cfs_rq);
> - struct sched_entity *se;
> + struct sched_entity *latency, *se;
>
> /*
> * If curr is set we have to see if its left of the leftmost entity
> @@ -5008,6 +5082,12 @@ pick_next_entity(struct cfs_rq *cfs_rq, struct sched_entity *curr)
> se = cfs_rq->last;
> }
>
> + /* Check for latency sensitive entity waiting for running */
> + latency = __pick_first_latency(cfs_rq);
> + if (latency && (latency != se) &&
> + wakeup_preempt_entity(latency, se) < 1)
> + se = latency;

I'm not quite sure why this condition isn't sufficient on it's own.
After all, if a task does a 'spurious' nanosleep it can get around the
'restriction' in __enqueue_latency() without any great penalty to it's
actual bandwidth consumption.