Re: [PATCH 1/4] sched: make nr_running() return 32-bit

From: Alexey Dobriyan
Date: Thu May 13 2021 - 03:23:59 EST


On Thu, May 13, 2021 at 01:58:16AM +0200, Thomas Gleixner wrote:
> Alexey,
>
> On Thu, Apr 22 2021 at 23:02, Alexey Dobriyan wrote:
> > Creating 2**32 tasks is impossible due to futex pid limits and wasteful
> > anyway. Nobody has done it.
> >
>
> this whole pile lacks useful numbers. What's the actual benefit of that
> churn?

The long-term goal is to use 32-bit data more. People will see it in
the core kernel and copy it everywhere else.

> Just with the default config for one of my reference machines:
>
>     text     data      bss       dec      hex  filename
> 16679864  6627950  1671296  24979110  17d26a6  ../build/vmlinux-before
> 16679894  6627950  1671296  24979140  17d26c4  ../build/vmlinux-after
> ------------------------------------------------------------------------
> +30
>
> I'm truly impressed by the massive savings of this change and I'm even
> more impressed by the justification:
>
> > Bring nr_running() into 32-bit world to save on REX prefixes.

I collected numbers initially but then stopped because no one cared and
they can be config- and arch-dependent.

> Aside of the obvious useless churn,

oh... Sometimes I think churn is the whole point.

> REX prefixes are universally true for
> all architectures, right? There is a world outside x86 ...

In general, 32-bitness is preferred for code generation.

32-bit RISCs naturally prefer 32-bit.

64-bit RISCs don't care because they remember their 32-bit roots and
have the necessary 32-bit fixed-width(!) instructions.

x86_64 is the only arch where going 64-bit generally adds more bytes
to the instruction stream.

Effects can be smudged by compilers of course; in this case, by the
percpu stuff. That "unsigned int i" was a mistake. The proper diff looks
like this:

-ffffffff811115fa:  8b 44 18 04     mov  eax,DWORD PTR [rax+rbx*1+0x4]
-ffffffff811115fe:  49 01 c4        add  r12,rax
+ffffffff811115fa:  44 03 64 18 04  add  r12d,DWORD PTR [rax+rbx*1+0x4]

--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -4348,9 +4348,10 @@ context_switch(struct rq *rq, struct task_struct *prev,
  * externally visible scheduler statistics: current number of runnable
  * threads, total number of context switches performed since bootup.
  */
-unsigned long nr_running(void)
+unsigned int nr_running(void)
 {
-	unsigned long i, sum = 0;
+	unsigned int sum = 0;
+	unsigned long i;

 	for_each_online_cpu(i)
 		sum += cpu_rq(i)->nr_running;