Re: [PATCH v2] sched: rt: Make RT capacity aware

From: Qais Yousef
Date: Wed Feb 05 2020 - 09:48:07 EST


On 02/04/20 18:23, Dietmar Eggemann wrote:
> On 03/02/2020 20:03, Qais Yousef wrote:
> > On 02/03/20 13:12, Steven Rostedt wrote:
> >> On Mon, 3 Feb 2020 17:17:46 +0000
> >> Qais Yousef <qais.yousef@xxxxxxx> wrote:
>
> [...]
>
> > In the light of strictly adhering to priority based scheduling; yes this makes
> > sense. Though I still think the migration will produce worse performance, but
> > I can appreciate even if that was true it breaks the strict priority rule.
> >
> >>
> >> You can add to the logic that you do not take over an RT task that is
> >> pinned and can't move itself. Perhaps that may be the only change to
> >
> > I get this.
> >
> >> cpu_find(), is that it will only pick a big CPU if little CPUs are
> >> available if the big CPU doesn't have a pinned RT task on it.
> >
> > But not that. Do you mind rephrasing it?
> >
> > Or let me try first:
> >
> > 1. Search all priority levels for a fitting CPU
>
> Just so I get this right: All _lower_ prio levels than p->prio, right?

Correct, that's what I meant :)

>
> > 2. If failed, return the first lowest mask found
> > 3. If it succeeds, remove any CPU that has a pinned task in it
> > 4. If the lowest_mask is empty, return (2).
> > 5. Else return the lowest_mask with the fitting CPU(s)
>
> Mapping this to the 5 FIFO tasks rt-tasks of Pavan's example (all
> p->prio=89 (dflt rt-app prio), dflt min_cap=1024 max_cap=1024) on a 4
> big (Cpu Capacity=1024) 4 little (Cpu capacity < 1024) system:
>
> You search from idx 1 to 11 [p->prio=89 eq. idx (task_pri)=12] and since
> there are no lower prior RT tasks the lowest mask of idx=1 (CFS or Idle)
> for the 5th RT task is returned.

We should basically fall back to whatever would have been returned without
this patch applied.

	if (lower_mask) {
		// record the value of the first valid lower_mask

		if lower_mask doesn't contain a fitting CPU:
			continue searching in the next priority level
	}

	if no fitting cpu was found at any lower level:
		return the recorded first valid lower_mask

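To make the pseudocode above a bit more concrete, here is a minimal,
self-contained sketch in plain userspace C. It is not the patch and not the
actual cpupri code: find_lowest_mask(), level_mask[], NR_LEVELS, 'allowed' and
'fits' are all hypothetical names, and CPUs are modelled as bits in a uint32_t
rather than a struct cpumask. It assumes a lower index means a lower priority
level.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical number of priority levels in this sketch. */
#define NR_LEVELS 102

/*
 * level_mask[i] has bit c set when CPU c currently runs at priority level i.
 * 'allowed' is the task's affinity, 'fits' the CPUs whose capacity satisfies
 * the task. Returns the fitting CPUs of the lowest level that has any,
 * otherwise the first non-empty lower mask, otherwise 0.
 */
static uint32_t find_lowest_mask(const uint32_t *level_mask, int task_idx,
				 uint32_t allowed, uint32_t fits)
{
	uint32_t first_valid = 0;

	assert(task_idx >= 0 && task_idx <= NR_LEVELS);

	/* Only levels strictly lower than the task's own level are searched. */
	for (int idx = 0; idx < task_idx; idx++) {
		uint32_t lower_mask = level_mask[idx] & allowed;

		if (!lower_mask)
			continue;

		/* Record the first valid lower_mask as the fallback. */
		if (!first_valid)
			first_valid = lower_mask;

		/* Stop at the first level that contains a fitting CPU. */
		if (lower_mask & fits)
			return lower_mask & fits;

		/* No fitting CPU here, keep searching the next level. */
	}

	/* Nothing fits at any lower level: behave as without the patch. */
	return first_valid;
}
```

Applied to the example above, with bits 0-3 as the big CPUs busy at idx 12,
bits 4-7 as the little CPUs at idx 1, and fits = 0x0f, a call with task_idx 12
returns 0xf0 in this sketch, i.e. the little CPUs, same as without the patch.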
>
> But that means that CPU capacity trumps priority?

I'm not sure how to translate 'trumps' here.

So priority has precedence over capacity. I think this is not the best option,
but it keeps the rules consistent: if a higher priority task is runnable, it
gets pushed to another CPU running a lower priority task, if we can find one.
We'll try to make sure this CPU fits the capacity requirement of the task, but
if there isn't one we'll fall back to the next best thing.

I think this makes sense and will keep this fitness logic generic.

Maybe it's easier to discuss over a patch. I'll hopefully post one soon.

Thanks

--
Qais Yousef