[PATCH] kernel/sys: do not use tasklist_lock to set/get scheduling priorities

From: Davidlohr Bueso
Date: Fri May 01 2020 - 23:10:05 EST


Neither setpriority(2) nor getpriority(2) needs the tasklist_lock,
yet both currently hold it for the entirety of the syscall. The
tasklist_lock does not protect reads or writes of p->static_prio,
and task lookups are already RCU-safe, so the task pointer remains
stable under rcu_read_lock().

The following raw microbenchmark results were seen on a 40-core box
running the stressng-get workload, which pathologically pounds on
various syscalls that get information from the kernel. Higher thread
counts of course show larger wins, albeit probably not something
that would be seen in a real workload.

                          5.7.0-rc3              5.7.0-rc3
                                            getpriority-v1
Hmean     get-1       3443.65 (   0.00%)     3314.08 *  -3.76%*
Hmean     get-2       7809.99 (   0.00%)     8547.60 *   9.44%*
Hmean     get-4      15498.01 (   0.00%)    17396.85 *  12.25%*
Hmean     get-8      28001.37 (   0.00%)    31137.53 *  11.20%*
Hmean     get-16     31460.88 (   0.00%)    40284.35 *  28.05%*
Hmean     get-32     30036.64 (   0.00%)    40657.88 *  35.36%*
Hmean     get-64     31429.86 (   0.00%)    41021.73 *  30.52%*
Hmean     get-80     31804.13 (   0.00%)    39188.55 *  23.22%*

Signed-off-by: Davidlohr Bueso <dbueso@xxxxxxx>
---
kernel/sys.c | 4 ----
1 file changed, 4 deletions(-)
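
For anyone wanting to poke at the affected path from userspace, here
is a minimal sketch of the kind of loop that exercises getpriority(2).
It is not part of the patch and is not how stress-ng implements its
stressor; the helper names read_own_nice() and hammer_getpriority()
are hypothetical and exist only for this illustration.

```c
#include <errno.h>
#include <sys/resource.h>
#include <sys/time.h>

/*
 * Hypothetical helper: read the caller's nice value.
 * getpriority() can legitimately return -1, so errno has to be
 * cleared before the call and checked afterwards.
 */
static int read_own_nice(int *nice_out)
{
	int prio;

	errno = 0;
	prio = getpriority(PRIO_PROCESS, 0);
	if (prio == -1 && errno != 0)
		return -1;

	*nice_out = prio;
	return 0;
}

/*
 * Hypothetical helper: issue `iters` back-to-back getpriority()
 * calls and return how many of them succeeded.
 */
static long hammer_getpriority(long iters)
{
	long ok = 0;
	long i;

	for (i = 0; i < iters; i++) {
		int nice_val;

		if (read_own_nice(&nice_val) == 0)
			ok++;
	}
	return ok;
}
```

Running hammer_getpriority() from several threads at once should
approximate the contended case measured above, although this sketch
makes no claim to reproduce those numbers.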

diff --git a/kernel/sys.c b/kernel/sys.c
index d325f3ab624a..12ade1a00a18 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -214,7 +214,6 @@ SYSCALL_DEFINE3(setpriority, int, which, int, who, int, niceval)
niceval = MAX_NICE;

rcu_read_lock();
- read_lock(&tasklist_lock);
switch (which) {
case PRIO_PROCESS:
if (who)
@@ -252,7 +251,6 @@ SYSCALL_DEFINE3(setpriority, int, which, int, who, int, niceval)
break;
}
out_unlock:
- read_unlock(&tasklist_lock);
rcu_read_unlock();
out:
return error;
@@ -277,7 +275,6 @@ SYSCALL_DEFINE2(getpriority, int, which, int, who)
return -EINVAL;

rcu_read_lock();
- read_lock(&tasklist_lock);
switch (which) {
case PRIO_PROCESS:
if (who)
@@ -323,7 +320,6 @@ SYSCALL_DEFINE2(getpriority, int, which, int, who)
break;
}
out_unlock:
- read_unlock(&tasklist_lock);
rcu_read_unlock();

return retval;
--
2.16.4