Re: [PATCH 5/9] sched/fair: Take into account latency priority at wakeup

From: Vincent Guittot
Date: Wed Nov 30 2022 - 08:42:57 EST


On Wed, 30 Nov 2022 at 04:10, Joel Fernandes <joel@xxxxxxxxxxxxxxxxx> wrote:
>
> Hi Vincent,
>
> On Tue, Nov 29, 2022 at 5:21 PM Vincent Guittot
> <vincent.guittot@xxxxxxxxxx> wrote:
> [...]
> > > >>> }
> > > >>>
> > > >>> /*
> > > >>> @@ -7544,7 +7558,7 @@ static int __sched_setscheduler(struct task_struct *p,
> > > >>> if (attr->sched_flags & SCHED_FLAG_UTIL_CLAMP)
> > > >>> goto change;
> > > >>> if (attr->sched_flags & SCHED_FLAG_LATENCY_NICE &&
> > > >>> - attr->sched_latency_nice != p->latency_nice)
> > > >>> + attr->sched_latency_nice != LATENCY_TO_NICE(p->latency_prio))
> > > >>> goto change;
> > > >>>
> > > >>> p->sched_reset_on_fork = reset_on_fork;
> > > >>> @@ -8085,7 +8099,7 @@ SYSCALL_DEFINE4(sched_getattr, pid_t, pid, struct sched_attr __user *, uattr,
> > > >>> get_params(p, &kattr);
> > > >>> kattr.sched_flags &= SCHED_FLAG_ALL;
> > > >>>
> > > >>> - kattr.sched_latency_nice = p->latency_nice;
> > > >>> + kattr.sched_latency_nice = LATENCY_TO_NICE(p->latency_prio);
> > > >>>
> > > >>> #ifdef CONFIG_UCLAMP_TASK
> > > >>> /*
> > > >>> @@ -11294,6 +11308,20 @@ const u32 sched_prio_to_wmult[40] = {
> > > >>> /* 15 */ 119304647, 148102320, 186737708, 238609294, 286331153,
> > > >>> };
> > > >>>
> > > >>> +/*
> > > >>> + * latency weight for wakeup preemption
> > > >>> + */
> > > >>> +const int sched_latency_to_weight[40] = {
> > > >>> + /* -20 */ -1024, -973, -922, -870, -819,
> > > >>> + /* -15 */ -768, -717, -666, -614, -563,
> > > >>> + /* -10 */ -512, -461, -410, -358, -307,
> > > >>> + /* -5 */ -256, -205, -154, -102, -51,
> > > >>> + /* 0 */ 0, 51, 102, 154, 205,
> > > >>> + /* 5 */ 256, 307, 358, 410, 461,
> > > >>> + /* 10 */ 512, 563, 614, 666, 717,
> > > >>> + /* 15 */ 768, 819, 870, 922, 973,
> > > >>> +};
> > > >>> +
> > > >>
> > > >> The table is linear. You could approximate this as: weight = nice * 51
> > > >> since it is a linear scale and do the conversion in place.
> > > >>
> > > >> Or, since the only place you are using the latency_to_weight is in
> > > >> set_latency_offset(), can we drop the sched_latency_to_weight array
> > > >> and simplify as follows?
> > > >
> > > > It's also used in cgroup patch and keeps a coherency between
> > > > nice/weight an latency_nice/offset so I prefer
> > >
> > > I dont think it’s a valid comparison as nice/weight conversion are non linear and over there a table makes sense: weight = 1024 / 1.25 ^ nice
> > >
> > > > keeping current
> > > > implementation
> > >
> > > I could be missing something, but, since its a linear scale, why does cgroup need weight at all? Just store nice directly. Why would that not work?
> > >
> > > In the end the TG and SE has the latency offset in the struct, that is all you care about. All the conversion back and forth is unnecessary, as it is a linear scale and just increases LOC and takes more memory to store linear arrays.
> > >
> > > Again I could be missing something and I will try to play with your series and see if I can show you what I mean (or convince myself it’s needed).
> >
> > I get what you mean but I think that having an array gives latitude to
> > adjust this internal offset mapping at a minimum cost of a const array
>
> Ok that makes sense. If you feel like there might be updates in the
> future to this mapping array (like changing the constants as you
> mentioned), then I am Ok with us keeping it.
>
> Reviewed-by: Joel Fernandes (Google) <joel@xxxxxxxxxxxxxxxxx>
>
> I am excited about your series, the CFS latency issues have been
> thorny. This feels like a step forward in the right direction. Cheers,

Thanks
Vincent

>
> - Joel