Re: [PATCH v3 4/6] sched/rt: Allow pulling unfitting task

From: Qais Yousef
Date: Wed Mar 04 2020 - 10:28:28 EST


On 03/04/20 22:52, Tao Zhou wrote:
> Hi Qais,
>
> On Mon, Mar 02, 2020 at 01:27:19PM +0000, Qais Yousef wrote:
> > When RT Capacity Awareness was implemented, the logic was written such
> > that if a task was running on a fitting CPU, it was sticky and we would
> > try our best to keep it there.
> >
> > But as Steve suggested, to adhere to the strict priority rules of the RT
> > class, allow pulling an RT task to an unfitting CPU to ensure it gets a
> > chance to run ASAP.
> >
> > Suggested-by: Steven Rostedt <rostedt@xxxxxxxxxxx>
> > Fixes: 804d402fb6f6 ("sched/rt: Make RT capacity-aware")
> > Link: https://lore.kernel.org/lkml/20200203111451.0d1da58f@xxxxxxxxxxxxxxxx/
> > Signed-off-by: Qais Yousef <qais.yousef@xxxxxxx>
> > ---
> >  kernel/sched/rt.c | 3 +--
> >  1 file changed, 1 insertion(+), 2 deletions(-)
> >
> > diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
> > index 3071c8612c03..e79a23ad4a93 100644
> > --- a/kernel/sched/rt.c
> > +++ b/kernel/sched/rt.c
> > @@ -1656,8 +1656,7 @@ static void put_prev_task_rt(struct rq *rq, struct task_struct *p)
> >  static int pick_rt_task(struct rq *rq, struct task_struct *p, int cpu)
> >  {
> >  	if (!task_running(rq, p) &&
> > -	    cpumask_test_cpu(cpu, p->cpus_ptr) &&
> > -	    rt_task_fits_capacity(p, cpu))
> > +	    cpumask_test_cpu(cpu, p->cpus_ptr))
> >  		return 1;
> >
> >  	return 0;
> > --
> > 2.17.1
> >
>
> How about using an rt_cap_overloaded (like rt_overloaded) to indicate that
> the CPU is overloaded because an RT task is on an unfit CPU, and using
> stop_one_cpu to handle that case?

We explored a variation of this (without using stop_one_cpu) in v2:

https://lore.kernel.org/lkml/20200223184001.14248-6-qais.yousef@xxxxxxx/

I might still consider this in the future. But I think I need to do a better
cost-benefit analysis before pushing further on that.

I'm also not keen on stopping a running task, at least not yet.
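
For context, the effect of the pick_rt_task() change in this patch can be
modelled in userspace roughly as below. This is only a sketch: struct
model_task, model_cpu_capacity[], model_fits() and the model_pick_*() helpers
are hypothetical stand-ins and not kernel code, the capacity values are made
up, and the fitness check is reduced to a single comparison.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the few task properties the check looks at. */
struct model_task {
	bool running;		/* models task_running(rq, p) */
	unsigned long allowed;	/* bitmask standing in for p->cpus_ptr */
	unsigned long min_cap;	/* capacity the task asks for (assumed) */
};

/* Made-up capacities for a 4-CPU system: two little CPUs, two big CPUs. */
static const unsigned long model_cpu_capacity[4] = { 446, 446, 1024, 1024 };

/* Simplified stand-in for rt_task_fits_capacity(); cpu must be 0..3. */
static bool model_fits(const struct model_task *p, int cpu)
{
	return model_cpu_capacity[cpu] >= p->min_cap;
}

/* Before the patch: the task is only pullable to a CPU it fits on. */
static int model_pick_old(const struct model_task *p, int cpu)
{
	if (!p->running &&
	    (p->allowed & (1UL << cpu)) &&
	    model_fits(p, cpu))
		return 1;

	return 0;
}

/* After the patch: affinity alone decides, fitness is no longer checked. */
static int model_pick_new(const struct model_task *p, int cpu)
{
	if (!p->running &&
	    (p->allowed & (1UL << cpu)))
		return 1;

	return 0;
}
```

With a queued task that asks for the full capacity (min_cap = 1024) and is
allowed on all four CPUs, model_pick_old() rejects the little CPU 0 while
model_pick_new() accepts it, which is the "pull to an unfitting CPU" behaviour
the commit message describes.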

>
> IIRC, HAVE_RT_PUSH_IPI does not select a specific CPU to do the
> push because of the complexity there. With RT capacity awareness joining
> in, I don't know whether a specific CPU needs to be selected, or in what
> order to choose between the unfit CPU and the rt overloaded CPU.

I'm not sure I understood you completely here.

I think the patch above dealt with the complexity you're talking about.

Thanks

--
Qais Yousef