[PATCH v5 0/4] Introduce per-task latency_nice for scheduler hints

From: Parth Shah
Date: Fri Feb 28 2020 - 04:08:10 EST


This is the 5th revision of the patch set to introduce latency_nice as a
per task attribute.

The previous version can be found at:
v1: https://lkml.org/lkml/2019/11/25/151
v2: https://lkml.org/lkml/2019/12/8/10
v3: https://lkml.org/lkml/2020/1/16/319
v4: https://lkml.org/lkml/2020/2/24/216

Changes in this revision are:
v4->v5:
- Added debugging prints in /proc/<pid>/sched for latency_nice ( based on
suggestion from Pavan Kondeti )
- Initialized init_task with latency_nice = 0
- Collected review tag and added few minor fixes.
v3->v4:
- Based on Chris's comment, added security check to refrain non-root user
set lower value than the current latency_nice values.
v2 -> v3:
- This series changes the longer attribute name to "latency_nice" as per
the comment from Dietmar Eggemann https://lkml.org/lkml/2019/12/5/394
v1 -> v2:
- Addressed comments from Qais Yousef
- As per suggestion from Dietmar, moved content from newly created
include/linux/sched/latency_tolerance.h to kernel/sched/sched.h
- Extend sched_setattr() to support latency_tolerance in tools headers UAPI


Introduction:
==============
This patch series introduces a new per-task attribute latency_nice to
provide the scheduler hints about the latency requirements of the task [1].

Latency_nice is a ranged attribute of a task with the value ranging
from [-20, 19] both inclusive which makes it align with the task nice
value.

The value should provide scheduler hints about the relative latency
requirements of tasks, meaning the task with "latency_nice = -20"
should have lower latency requirements than compared to those tasks with
higher values. Similarly a task with "latency_nice = 19" can have higher
latency and hence such tasks may not care much about latency.

The default value is set to 0. The usecases discussed below can use this
range of [-20, 19] for latency_nice for the specific purpose. This
patch does not implement any use cases for such attribute so that any
change in naming or range does not affect much to the other (future)
patches using this. The actual use of latency_nice during task wakeup
and load-balancing is yet to be coded for each of those usecases.

As per my view, this defined attribute can be used in following ways for a
some of the usecases:
1 Reduce search scan time for select_idle_cpu():
- Reduce search scans for finding idle CPU for a waking task with lower
latency_nice values.

2 TurboSched:
- Classify the tasks with higher latency_nice values as a small
background task given that its historic utilization is very low, for
which the scheduler can search for more number of cores to do task
packing. A task with a latency_nice >= some_threshold (e.g, == 19)
and util <= 12.5% can be background tasks.

3 Optimize AVX512 based workload:
- Bias scheduler to not put a task having (latency_nice == -20) on a
core occupying AVX512 based workload.


Series Organization:
====================
- Patch 1-3: Add support for latency_nice attr in the task struct using
sched_{set,get}attr syscall
- Patch 4 : Add permission checks for setting the value.


The patch series can be applied on tip/sched/core at the
commit a0f03b617c3b ("sched/numa: Stop an exhastive search if a reasonable swap candidate or idle CPU is found")


References:
============
[1]. Usecases for the per-task latency-nice attribute,
https://lkml.org/lkml/2019/9/30/215
[2]. Task Latency-nice, "Subhra Mazumdar",
https://lkml.org/lkml/2019/8/30/829
[3]. Introduce per-task latency_tolerance for scheduler hints,
https://lkml.org/lkml/2019/12/8/10



Parth Shah (4):
sched: Introduce latency-nice as a per-task attribute
sched/core: Propagate parent task's latency requirements to the child
task
sched: Allow sched_{get,set}attr to change latency_nice of the task
sched/core: Add permission checks for setting the latency_nice value

include/linux/sched.h | 1 +
include/uapi/linux/sched.h | 4 +++-
include/uapi/linux/sched/types.h | 19 +++++++++++++++++++
init/init_task.c | 1 +
kernel/sched/core.c | 26 ++++++++++++++++++++++++++
kernel/sched/debug.c | 1 +
kernel/sched/sched.h | 18 ++++++++++++++++++
tools/include/uapi/linux/sched.h | 4 +++-
8 files changed, 72 insertions(+), 2 deletions(-)

--
2.17.2