better wake-balancing: respin

From: Chen, Kenneth W
Date: Wed Oct 26 2005 - 20:24:47 EST


Once upon a time, this patch was in the -mm tree (2.6.13-mm1):
http://marc.theaimsgroup.com/?l=linux-kernel&m=112265450426975&w=2

It is neither in Linus's official tree nor in -mm anymore.

I guess I missed the objection that led to the patch being dropped, so
I'm bringing up this discussion again. The wake-up path is a lot hotter
on a NUMA system running a database benchmark. Even on a moderate 8P
NUMA box, __wake_up and try_to_wake_up show up as the #1 and #4 hottest
kernel functions. On a comparable 4P SMP box, these two functions are
#5 and #9 respectively.

I think the situation in the wake-up path will be worse on a 32P NUMA
box. I don't have any measurements on a 32P setup yet, because 8P NUMA
performance sucks at the moment and that is a blocker for us before we
proceed to any bigger setup.


Execution profile for 8P NUMA box [1]:

Rank  Symbol              Clockticks  Inst. Retired  L3 Misses
#1    __wake_up                8.08%          1.88%      4.67%
#2    finish_task_switch       7.53%         18.11%      5.82%
#3    __make_request           6.87%          2.09%      4.35%
#4    try_to_wake_up           5.57%          0.64%      3.10%



Execution profile for 4P SMP box [2]:

Rank  Symbol              Clockticks
#5    __wake_up                3.57%
#9    try_to_wake_up           2.38%

My question is: what was the reason this patch was dropped, and what
can we do to improve wake-up performance? In my opinion, we should
simply put the task on the CPU it previously ran on and have
rebalance_tick and load_balance_newidle balance out the load.
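
To make that suggestion concrete, here is a rough user-space sketch of
the policy, not kernel code. Every name in it (toy_task, toy_rq,
toy_select_wake_cpu, toy_wake_up, toy_balance) is made up for
illustration, and toy_balance is only a crude stand-in for what
rebalance_tick and load_balance_newidle would do. The point is that the
wake-up side never looks at any other CPU's run queue; correcting an
imbalance is left entirely to the balancer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified model; these are NOT the kernel's structures. */
struct toy_task {
    int prev_cpu;              /* CPU the task last ran on */
};

struct toy_rq {
    unsigned long nr_running;  /* runnable tasks queued on this CPU */
};

/*
 * Wake-up placement as suggested above: always pick the CPU the task
 * previously ran on, without inspecting any remote run queue.
 */
static int toy_select_wake_cpu(const struct toy_task *t)
{
    return t->prev_cpu;
}

/* Queue the woken task on the chosen CPU and report which CPU that was. */
static int toy_wake_up(const struct toy_task *t, struct toy_rq *rqs)
{
    int cpu = toy_select_wake_cpu(t);

    rqs[cpu].nr_running++;
    return cpu;
}

/*
 * Crude stand-in for periodic/new-idle balancing (an assumption, far
 * simpler than the real balancer): pull one task from the busiest run
 * queue to this_cpu if the busiest queue is ahead by more than one task.
 * Returns 1 if a task was moved, 0 otherwise.
 */
static int toy_balance(struct toy_rq *rqs, size_t ncpus, size_t this_cpu)
{
    size_t busiest = this_cpu;
    size_t i;

    assert(this_cpu < ncpus);

    for (i = 0; i < ncpus; i++)
        if (rqs[i].nr_running > rqs[busiest].nr_running)
            busiest = i;

    if (rqs[busiest].nr_running > rqs[this_cpu].nr_running + 1) {
        rqs[busiest].nr_running--;
        rqs[this_cpu].nr_running++;
        return 1;
    }
    return 0;
}
```

With two CPUs and two tasks that both last ran on CPU 1, both wake-ups
land on CPU 1, and a later toy_balance() call made on behalf of CPU 0
pulls one of them over. The wake-up itself stays cheap because it only
touches the task's previous run queue.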

- Ken


[1] 8 processors: 1.6 GHz Itanium2, 9M L3, 256 GB memory
[2] 4 processors: 1.6 GHz Itanium2, 9M L3, 128 GB memory
