Re: [PATCH 3/3] sched/rt: fix pushing unfit tasks to a better CPU

From: Qais Yousef
Date: Wed Feb 19 2020 - 05:46:49 EST


On 02/19/20 08:16, Pavan Kondeti wrote:
> On Tue, Feb 18, 2020 at 05:47:19PM +0000, Qais Yousef wrote:
> > On 02/18/20 09:46, Pavan Kondeti wrote:
> > > The original RT task placement, i.e. without capacity awareness, places the task
> > > on the previous CPU if the task can preempt the running task. I interpreted this
> > > to mean that a "higher prio RT" task should get better treatment even if it results
> > > in stopping the lower prio RT execution and migrating it to another CPU.
> > >
> > > Now coming to your patch (merged), we force find_lowest_rq() if the previous
> > > CPU can't fit the task, even though the task could run there right away. When
> > > the lowest mask returns an unfit CPU (with your new patch), we have two choices:
> > > either place it on this unfit CPU (may involve migration) or place it on
> > > the previous CPU to avoid the migration. We are selecting the first approach.
> > >
> > > The task_cpu(p) check in find_lowest_rq() only works when the previous CPU
> > > does not have an RT task. If it is running a lower prio RT task than the
> > > waking task, the lowest_mask may not contain the previous CPU.
> > >
> > > I don't know if any workload is hurt by this change in behavior, so I'm not
> > > sure if we have to restore the original behavior. Something like below will do.
> >
> > Is this patch equivalent to yours? If yes, then I got you. If not, then I need
> > to re-read this again..
> >
> > diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
> > index ace9acf9d63c..854a0c9a7be6 100644
> > --- a/kernel/sched/rt.c
> > +++ b/kernel/sched/rt.c
> > @@ -1476,6 +1476,13 @@ select_task_rq_rt(struct task_struct *p, int cpu, int sd_flag, int flags)
> >  	if (test || !rt_task_fits_capacity(p, cpu)) {
> >  		int target = find_lowest_rq(p);
> >
> > +		/*
> > +		 * Bail out if we were forcing a migration to find a better
> > +		 * fitting CPU but our search failed.
> > +		 */
> > +		if (!test && !rt_task_fits_capacity(p, target))
> > +			goto out_unlock;
> > +
>
> Yes. This is what I was referring to.

Cool. I can't see how this could be a problem either, and since, as you say,
it'd preserve the older behavior, I'll add it to the lot with a proper changelog.
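
For anyone following along, here is a rough userspace model of the decision
being discussed. It is only a sketch, not kernel code: struct wake_model,
select_cpu() and the bail_out flag are hypothetical names, the fit results
are passed in as plain booleans instead of calling rt_task_fits_capacity(),
and the existing priority check on the target CPU is left out. The early
return when the search finds no CPU at all (target < 0) is an assumption of
the model and is not part of the hunk above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one RT wakeup; none of these are kernel types. */
struct wake_model {
	int prev_cpu;		/* CPU the task last ran on */
	bool test;		/* the pre-existing "look for another CPU" condition */
	bool prev_fits;		/* does prev_cpu have enough capacity for the task? */
	int target;		/* CPU returned by the lowest-rq search, -1 if none */
	bool target_fits;	/* does that CPU have enough capacity? */
};

/*
 * Return the CPU the model would pick.
 * bail_out = false: behavior without the extra check (take whatever the
 *                   search returned, even if it does not fit).
 * bail_out = true:  behavior with the extra check (stay on prev_cpu when
 *                   the search was forced only by capacity and it failed
 *                   to find a fitting CPU).
 */
int select_cpu(const struct wake_model *m, bool bail_out)
{
	assert(m != NULL);

	/* Nothing forces a search: keep the previous CPU. */
	if (!m->test && m->prev_fits)
		return m->prev_cpu;

	/* Assumption of this model: no candidate means no move. */
	if (m->target < 0)
		return m->prev_cpu;

	/* The search was forced only to find a better fit, and it failed. */
	if (bail_out && !m->test && !m->target_fits)
		return m->prev_cpu;

	return m->target;
}
```

With prev_cpu = 0, test = false, prev_fits = false, target = 2 and
target_fits = false, the model picks CPU 2 without the check and stays on
CPU 0 with it, which is the migration Pavan was pointing at.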

Thanks!

--
Qais Yousef