[tip: sched/core] sched: Bring the PF_IO_WORKER and PF_WQ_WORKER bits closer together

From: tip-bot2 for Sebastian Andrzej Siewior
Date: Thu Aug 27 2020 - 03:55:45 EST


The following commit has been merged into the sched/core branch of tip:

Commit-ID: 01ccf592362a984534371b3596d4c953da6a7bb2
Gitweb: https://git.kernel.org/tip/01ccf592362a984534371b3596d4c953da6a7bb2
Author: Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx>
AuthorDate: Wed, 19 Aug 2020 21:55:05 +02:00
Committer: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
CommitterDate: Wed, 26 Aug 2020 12:41:58 +02:00

sched: Bring the PF_IO_WORKER and PF_WQ_WORKER bits closer together

The bits PF_IO_WORKER and PF_WQ_WORKER are tested together in
sched_submit_work() which is considered to be a hot path.
If the two bits cross the 8 or 16 bit boundary then most architecture
require multiple load instructions in order to create the constant
value. Also, such a value can not be encoded within the compare opcode.

By moving the bit definition within the same block, the compiler can
create/use one immediate value.

For some reason gcc-10 on ARM64 requires both bits to be next to each
other in order to issue "tst reg, val; bne label". Otherwise the result
is "mov reg1, val; tst reg, reg1; bne label".

Move PF_VCPU out of the way so that PF_IO_WORKER can be next to
PF_WQ_WORKER.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx>
Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
Link: https://lkml.kernel.org/r/20200819195505.y3fxk72sotnrkczi@xxxxxxxxxxxxx
---
include/linux/sched.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 93ecd93..2bf0af1 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1489,9 +1489,10 @@ extern struct pid *cad_pid;
/*
* Per process flags
*/
+#define PF_VCPU 0x00000001 /* I'm a virtual CPU */
#define PF_IDLE 0x00000002 /* I am an IDLE thread */
#define PF_EXITING 0x00000004 /* Getting shut down */
-#define PF_VCPU 0x00000010 /* I'm a virtual CPU */
+#define PF_IO_WORKER 0x00000010 /* Task is an IO worker */
#define PF_WQ_WORKER 0x00000020 /* I'm a workqueue worker */
#define PF_FORKNOEXEC 0x00000040 /* Forked but didn't exec */
#define PF_MCE_PROCESS 0x00000080 /* Process policy on mce errors */
@@ -1515,7 +1516,6 @@ extern struct pid *cad_pid;
#define PF_NO_SETAFFINITY 0x04000000 /* Userland is not allowed to meddle with cpus_mask */
#define PF_MCE_EARLY 0x08000000 /* Early kill for mce process policy */
#define PF_MEMALLOC_NOCMA 0x10000000 /* All allocation request will have _GFP_MOVABLE cleared */
-#define PF_IO_WORKER 0x20000000 /* Task is an IO worker */
#define PF_FREEZER_SKIP 0x40000000 /* Freezer should not count it as freezable */
#define PF_SUSPEND_TASK 0x80000000 /* This thread called freeze_processes() and should not be frozen */