Re: [External] Re: [PATCH] sched/core: Minor optimize pick_next_task() when core-sched enable

From: Vineeth Pillai
Date: Thu Mar 23 2023 - 13:29:59 EST


On Thu, Mar 23, 2023 at 3:03 AM Hao Jia <jiahao.os@xxxxxxxxxxxxx> wrote:

> > The other issue was - we don't update core rbtree when vruntime changes and
> > this can cause starvation of cookied task if there are more than one task with
> > the same cookie on an rq.
> >
>
> If I understand correctly, when a cookied task is enqueued, the
> difference delta1 between its vruntime and min_vruntime is very large.
>
> Another task with the same cookie is very actively dequeuing and
> enqueuing, and the difference delta2 between its vruntime and
> min_vruntime is always smaller than delta1?
> I'm not sure if this is the case?

The case I was mentioning is about tasks that are continuously running
and hence always on the runqueue. sched_core_enqueue()/sched_core_dequeue()
are not called for them, so their position in the core rbtree is static
while their cfs rbtree positions change as vruntime progresses.
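
To make the concern concrete, here is a minimal userspace sketch, not
kernel code. It assumes a toy "core tree" (a sorted array) whose order is
fixed at enqueue time; every name in it (toy_task, toy_core_tree,
toy_core_enqueue, toy_core_find, toy_run) is hypothetical and only
illustrates how a stale position can keep one of two same-cookie tasks
from ever being picked.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: hypothetical types, not the kernel's data structures. */
struct toy_task {
	unsigned long cookie;
	unsigned long long vruntime;
};

#define TOY_MAX 8

struct toy_core_tree {
	struct toy_task *slot[TOY_MAX];	/* ordered once, at enqueue time */
	size_t nr;
};

/* Insert in (cookie, vruntime) order, using the vruntime seen at enqueue. */
static void toy_core_enqueue(struct toy_core_tree *t, struct toy_task *p)
{
	size_t i = t->nr;

	assert(t->nr < TOY_MAX);
	while (i > 0 &&
	       (t->slot[i - 1]->cookie > p->cookie ||
		(t->slot[i - 1]->cookie == p->cookie &&
		 t->slot[i - 1]->vruntime > p->vruntime))) {
		t->slot[i] = t->slot[i - 1];
		i--;
	}
	t->slot[i] = p;
	t->nr++;
}

/* Return the first task with a matching cookie in the static order. */
static struct toy_task *toy_core_find(struct toy_core_tree *t,
				      unsigned long cookie)
{
	for (size_t i = 0; i < t->nr; i++)
		if (t->slot[i]->cookie == cookie)
			return t->slot[i];
	return NULL;
}

/*
 * Pick `rounds` times for `cookie`, charging `slice` of vruntime to
 * whichever task is picked. The picked task is never re-sorted, which
 * models a task that keeps running without a dequeue/enqueue cycle.
 * Returns how many times `who` was picked.
 */
static unsigned int toy_run(struct toy_core_tree *t, unsigned long cookie,
			    unsigned int rounds, unsigned long long slice,
			    const struct toy_task *who)
{
	unsigned int picked = 0;

	for (unsigned int i = 0; i < rounds; i++) {
		struct toy_task *p = toy_core_find(t, cookie);

		if (!p)
			break;
		p->vruntime += slice;	/* position in slot[] stays stale */
		if (p == who)
			picked++;
	}
	return picked;
}
```

In this toy model, the task that was leftmost at enqueue time keeps being
returned by toy_core_find() no matter how far its vruntime advances,
while a cfs-style tree keyed on the current vruntime would eventually
order the second task first.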

BTW, this is a separate issue from the one you are targeting with this
fix. I just thought I'd mention it here as well.

> >> Yeah, this is an absolute no-no, it makes the overhead of the second rb
> >> tree unconditional.
> >
> > I agree. Could we keep it conditional by enqueuing 0-cookied tasks only when
> > coresched is enabled, just like what we do for cookied tasks? This is still an
> > overhead where we have two trees storing all the runnable tasks but in
> > different order. We would also need to populate core rbtree from cfs rbtree
> > on coresched enable and empty the tree on coresched disable.
> >
>
> I'm not sure if the other way is reasonable, I'm trying to provide a
> function for each scheduling class to find a highest priority non-cookie
> task.
>
> For example fair_sched_class, we can use rq->cfs_tasks to traverse the
> search. But this search may take a long time, maybe we need to limit the
> number of searches.

Yes, it can be time-consuming, depending on the number of cgroups and
tasks that are runnable. You could collect some performance numbers to
see how much worse it is.

We could also add an optimization such as marking a runqueue that has
non-cookied tasks and doing the search only if it is marked. I haven't
thought much about it, but hopefully the search can be optimized.
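
A rough userspace sketch of the marking idea, under stated assumptions:
the "mark" is modelled as a per-runqueue count of non-cookied tasks
maintained on enqueue/dequeue, and the search is a plain linear walk. All
names here (demo_task, demo_rq, nr_uncookied, demo_pick_uncookied) are
hypothetical and do not exist in the kernel; this only shows the shape of
the fast path, not how it would hook into the scheduling classes.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: hypothetical types, not the kernel's struct rq. */
struct demo_task {
	unsigned long cookie;		/* 0 means "no cookie" */
	unsigned long long vruntime;
};

#define DEMO_MAX 16

struct demo_rq {
	struct demo_task *tasks[DEMO_MAX];
	size_t nr;
	unsigned int nr_uncookied;	/* the "mark": non-zero if any exist */
	unsigned long searches;		/* how many full walks were done */
};

static void demo_enqueue(struct demo_rq *rq, struct demo_task *p)
{
	assert(rq->nr < DEMO_MAX);
	rq->tasks[rq->nr++] = p;
	if (!p->cookie)
		rq->nr_uncookied++;
}

static void demo_dequeue(struct demo_rq *rq, struct demo_task *p)
{
	for (size_t i = 0; i < rq->nr; i++) {
		if (rq->tasks[i] != p)
			continue;
		rq->tasks[i] = rq->tasks[--rq->nr];	/* swap-remove */
		if (!p->cookie) {
			assert(rq->nr_uncookied > 0);
			rq->nr_uncookied--;
		}
		return;
	}
}

/* Return the non-cookied task with the smallest vruntime, or NULL. */
static struct demo_task *demo_pick_uncookied(struct demo_rq *rq)
{
	struct demo_task *best = NULL;

	if (!rq->nr_uncookied)
		return NULL;		/* unmarked rq: skip the walk entirely */

	rq->searches++;
	for (size_t i = 0; i < rq->nr; i++) {
		struct demo_task *p = rq->tasks[i];

		if (p->cookie)
			continue;
		if (!best || p->vruntime < best->vruntime)
			best = p;
	}
	return best;
}
```

The counter only removes the cost for runqueues with no non-cookied
tasks; when such tasks exist the walk is still linear in this sketch, so
performance numbers would still be needed for that case.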

Thanks,
Vineeth