Re: [RFC PATCH 1/9] sched,cgroup: Add interface for latency-nice

From: Tim Chen
Date: Wed Sep 04 2019 - 13:32:09 EST


On 8/30/19 10:49 AM, subhra mazumdar wrote:
> Add Cgroup interface for latency-nice. Each CPU Cgroup adds a new file
> "latency-nice" which is shared by all the threads in that Cgroup.


Subhra,

Thanks for posting the patchset. Having a latency nice hint
is useful beyond idle load balancing. I can think of other
application scenarios, such as scheduling batch machine learning AVX512
processes alongside latency sensitive processes. AVX512 limits the
frequency of the CPU, so it is best to avoid running latency sensitive
tasks on the same core as AVX512 tasks. A latency nice hint gives the
scheduler a criterion for determining the latency sensitivity of a task,
so it can place latency sensitive tasks away from AVX512 tasks.

You configure the latency hint on a cgroup basis.
But I think not all tasks in a cgroup necessarily have the same
latency sensitivity.

For example, a cgroup can be applied on a per user basis,
and the user could run different tasks that have different latency
sensitivity. We may also need a way to configure latency sensitivity on
a per task basis instead of on a per cgroup basis.

Tim


> @@ -631,6 +631,7 @@ struct task_struct {
> int static_prio;
> int normal_prio;
> unsigned int rt_priority;
> + u64 latency_nice;

Does it need to be 64 bit? Max latency nice is only 100.
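
To illustrate the size question, here is a rough user-space sketch, not
a proposed change. The MIN/MAX values are the ones from the patch below;
latency_nice_clamp() is a hypothetical helper that does not exist in the
patchset, and uint8_t stands in for the kernel's u8.

```c
#include <assert.h>
#include <stdint.h>

/* Range taken from the quoted patch. */
#define LATENCY_NICE_MIN 1
#define LATENCY_NICE_MAX 100

/* Compile-time check: the whole range fits in 8 bits. */
static_assert(LATENCY_NICE_MAX <= UINT8_MAX, "latency_nice fits in 8 bits");

/*
 * Hypothetical helper (assumption, not part of the patch):
 * clamp a user-supplied value into the valid range.
 */
static inline uint8_t latency_nice_clamp(long val)
{
	if (val < LATENCY_NICE_MIN)
		return LATENCY_NICE_MIN;
	if (val > LATENCY_NICE_MAX)
		return LATENCY_NICE_MAX;
	return (uint8_t)val;
}
```

Whether a narrower field actually saves space in task_struct depends on
the padding around it, so this only shows that the range itself does not
require 64 bits.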

>
> const struct sched_class *sched_class;
> struct sched_entity se;
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 874c427..47969bc 100644

> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> index b52ed1a..365c928 100644
> --- a/kernel/sched/sched.h
> +++ b/kernel/sched/sched.h
> @@ -143,6 +143,13 @@ static inline void cpu_load_update_active(struct rq *this_rq) { }
> #define NICE_0_LOAD (1L << NICE_0_LOAD_SHIFT)
>
> /*
> + * Latency-nice default value
> + */

It would be useful to add a comment to let the reader know
that a higher latency nice number means a task is more
latency tolerant.

Is there a reason for setting the default to be a low
value of 5?

It seems we will default to searching only the same core
for an idle CPU on a smaller system, as we only search 5%
of the CPU span of the target sched domain.
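
To make the arithmetic concrete, here is a small sketch. It assumes the
search limit is computed as span weight * latency_nice / LATENCY_NICE_MAX
with integer division; that formula and the helper name
idle_search_limit() are my assumptions for illustration, not something
taken from the patch.

```c
#include <assert.h>

/* Upper bound taken from the quoted patch. */
#define LATENCY_NICE_MAX 100

/*
 * ASSUMPTION: the number of CPUs to search is the given percentage
 * of the sched domain span, rounded down. Hypothetical helper.
 */
static inline unsigned int idle_search_limit(unsigned int span_weight,
					     unsigned int latency_nice)
{
	assert(latency_nice >= 1 && latency_nice <= LATENCY_NICE_MAX);
	return span_weight * latency_nice / LATENCY_NICE_MAX;
}
```

Under that assumption, a 16 CPU domain with the default of 5 gives
16 * 5 / 100 = 0, and any domain with fewer than 20 CPUs rounds down
to 0. A 40 CPU domain gives 2 and a 128 CPU domain gives 6.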

> +#define LATENCY_NICE_DEFAULT 5
> +#define LATENCY_NICE_MIN 1
> +#define LATENCY_NICE_MAX 100
> +