Re: [External] Re: [PATCH] sched/fair: favor non-idle group in tick preemption

From: Josh Don
Date: Fri Nov 04 2022 - 17:25:36 EST


On Thu, Nov 3, 2022 at 8:49 PM Hao Jia <jiahao.os@xxxxxxxxxxxxx> wrote:
>
>
>
> On 2022/11/2 Josh Don wrote:
> >>> Some weirdness about this change though, is that if there is a
> >>> non-idle current entity, and the two next entities on the cfs_rq are
> >>> idle and non-idle respectively, we'll now take longer to preempt the
> >>> on-cpu non-idle entity, because the non-idle entity on the cfs_rq is
> >>> 'hidden' by the idle 'first' entity. Wakeup preemption is different
> >>> because we're always directly comparing the current entity with the
> >>> newly woken entity.
> >>>
> >> You are right, this can happen with high probability.
> >> This patch just compares curr with the first entity in
> >> the tick, and it seems hard to consider all the other entities
> >> in the cfs_rq.
> >>
> >> So, what specific negative effects would this situation cause?
> >> For example, would the "hidden" non-idle entity's latency be worse
> >> than before?
> >
> > As Abel points out in his email, it can push out the time it'll take
> > to switch to the other non-idle entity. The change might boost some
> > benchmark numbers, but I don't think it is conclusive enough to say
> > it is a generically beneficial improvement that should be integrated.
> >
> > By the way, I'm curious if you modified any of the sched_idle_cpu()
> > and related load balancing around idle entities given that you've made
> > it so that idle entities can have arbitrary weight (since, as I
> > described in my prior email, this can otherwise cause issues there).
>
> If we want to make it easier for non-idle tasks to preempt idle tasks in
> the tick, maybe we can consider lowering sysctl_sched_idle_min_granularity.
> Of course this may not ensure that non-idle tasks successfully preempt
> idle tasks every time, but it seems to be more beneficial for non-idle
> tasks.
>
> IMHO, even if it is allowed to increase the weight of non-idle, it seems
> that we can make it easier for non-idle tasks to preempt idle tasks by
> lowering sysctl_sched_idle_min_granularity.

Yep, although the effectiveness is partially limited by whatever HZ is
set to, since tick preemption is only evaluated once per scheduling tick.

>
> Thanks,
> Hao