Re: [PATCH 3/3] sched/rt: fix pushing unfit tasks to a better CPU

From: Qais Yousef
Date: Mon Feb 17 2020 - 08:53:13 EST


On 02/17/20 14:53, Pavan Kondeti wrote:
> Hi Qais,
>
> On Fri, Feb 14, 2020 at 04:39:49PM +0000, Qais Yousef wrote:
>
> [...]
>
> > diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
> > index 0c8bac134d3a..5ea235f2cfe8 100644
> > --- a/kernel/sched/rt.c
> > +++ b/kernel/sched/rt.c
> > @@ -1430,7 +1430,7 @@ select_task_rq_rt(struct task_struct *p, int cpu, int sd_flag, int flags)
> >  {
> >  	struct task_struct *curr;
> >  	struct rq *rq;
> > -	bool test;
> > +	bool test, fit;
> >
> >  	/* For anything but wake ups, just return the task_cpu */
> >  	if (sd_flag != SD_BALANCE_WAKE && sd_flag != SD_BALANCE_FORK)
> > @@ -1471,16 +1471,32 @@ select_task_rq_rt(struct task_struct *p, int cpu, int sd_flag, int flags)
> >  	       unlikely(rt_task(curr)) &&
> >  	       (curr->nr_cpus_allowed < 2 || curr->prio <= p->prio);
> >
> > -	if (test || !rt_task_fits_capacity(p, cpu)) {
> > +	fit = rt_task_fits_capacity(p, cpu);
> > +
> > +	if (test || !fit) {
> >  		int target = find_lowest_rq(p);
> >
> > -		/*
> > -		 * Don't bother moving it if the destination CPU is
> > -		 * not running a lower priority task.
> > -		 */
> > -		if (target != -1 &&
> > -		    p->prio < cpu_rq(target)->rt.highest_prio.curr)
> > -			cpu = target;
> > +		if (target != -1) {
> > +			/*
> > +			 * Don't bother moving it if the destination CPU is
> > +			 * not running a lower priority task.
> > +			 */
> > +			if (p->prio < cpu_rq(target)->rt.highest_prio.curr) {
> > +
> > +				cpu = target;
> > +
> > +			} else if (p->prio == cpu_rq(target)->rt.highest_prio.curr) {
> > +
> > +				/*
> > +				 * If the priority is the same and the new CPU
> > +				 * is a better fit, then move, otherwise don't
> > +				 * bother here either.
> > +				 */
> > +				fit = rt_task_fits_capacity(p, target);
> > +				if (fit)
> > +					cpu = target;
> > +			}
> > +		}
>
> I understand that we are opting for the migration when priorities are tied but
> the task can fit on the new CPU. But there is no guarantee that this task will
> stay there, because any CPU that drops RT prio can pull the task. Then why
> not leave it to the balancer?

This patch does help in the 2 RT task test case. Without it I see a big delay
before the task migrates from a little CPU to a big one, even though the big
CPU is free.

Maybe my test is too short (1 second). The delay I've seen is 0.5-0.7s.

https://imgur.com/a/qKJk4w4

Maybe I missed the real root cause. Let me dig more.
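
For anyone following along, the placement decision in the hunk above can be
modelled outside the kernel. This is only a sketch under stated assumptions:
pick_cpu() and all of its parameters are hypothetical stand-ins (target_prio
for cpu_rq(target)->rt.highest_prio.curr, fit_target for
rt_task_fits_capacity(p, target)). As in the kernel, a lower prio value means
a higher priority.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Userspace model of the patched branch in select_task_rq_rt().
 * All names here are hypothetical; this is not the kernel code.
 *
 *   p_prio      - priority of the waking task (lower value = higher prio)
 *   cpu         - CPU the task would wake on by default
 *   test        - result of the existing "curr is a pinned/higher RT task" test
 *   fit_here    - whether the task fits the capacity of 'cpu'
 *   target      - CPU returned by the lowest-rq search, or -1 if none
 *   target_prio - highest RT priority currently queued on 'target'
 *   fit_target  - whether the task fits the capacity of 'target'
 */
int pick_cpu(int p_prio, int cpu, bool test, bool fit_here,
             int target, int target_prio, bool fit_target)
{
	assert(target >= -1);

	/* Only look for another CPU if the current one is unsuitable. */
	if (!test && fit_here)
		return cpu;

	if (target == -1)
		return cpu;

	/* Target runs lower priority work: move. */
	if (p_prio < target_prio)
		return target;

	/* Priorities tie: move only if the target is a fitting CPU. */
	if (p_prio == target_prio && fit_target)
		return target;

	return cpu;
}
```

With this model, a task of prio 10 that does not fit on CPU 0 moves to a
target CPU 4 of equal priority only when fit_target is true, which is the
tie-break the patch adds.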

>
> I notice a case where tasks would migrate for no reason (this happens without
> this patch also). Assume the BIG cores are busy with other RT tasks. Now this
> RT task can go to *any* little CPU. There is no bias towards its previous CPU.
> I don't know if it makes any difference, but I see RT task placement is too
> keen on reducing migrations unless they are absolutely needed.

In find_lowest_rq() there's a check for whether task_cpu(p) is in the
lowest_mask, and that CPU is preferred if it is.
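
A simplified userspace sketch of that preference, assuming at most 64 CPUs
held in a plain bitmask. pick_from_lowest_mask() is a hypothetical name, and
the "first set bit" fallback is an assumption standing in for the
sched-domain walk the real function does next.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Model of the previous-CPU bias: if the CPU the task last ran on is among
 * the lowest-priority candidates, keep it there (it is most likely
 * cache-hot). Otherwise fall back to some other candidate.
 * Returns -1 when the mask is empty.
 */
int pick_from_lowest_mask(uint64_t lowest_mask, int prev_cpu)
{
	assert(prev_cpu >= 0 && prev_cpu < 64);

	if (lowest_mask == 0)
		return -1;

	/* Prefer the previous CPU when it is a candidate. */
	if (lowest_mask & (UINT64_C(1) << prev_cpu))
		return prev_cpu;

	/* Assumed fallback: lowest-numbered candidate CPU. */
	for (int cpu = 0; cpu < 64; cpu++) {
		if (lowest_mask & (UINT64_C(1) << cpu))
			return cpu;
	}

	return -1;
}
```

So with candidates {0, 3} a task that last ran on CPU 3 stays on CPU 3, and
it only lands elsewhere when its previous CPU is not in the mask.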

But yeah, I see it happening too:

https://imgur.com/a/FYqLIko

Tasks on CPU 0 and 3 swap. Note that my tasks are periodic but the plots don't
show that.

I shouldn't have changed anything that affects this bias. Do you think it's
something I introduced?

It's maybe worth digging into, though. I'll try to have a look.

Thanks

--
Qais Yousef