[PATCH 0/3] XICS emulation optimizations in KVM for PPC

From: Gautam Menghani
Date: Mon May 06 2024 - 12:19:17 EST


Optimize the XICS emulation code in KVM as per the 'performance todos'
in the comments of book3s_xics.c.

Performance numbers:
1. Test case: pgbench run in a KVM on PowerVM guest for 120 seconds


2. Time taken by arch_send_call_function_single_ipi() without this
series, measured with funclatency [1].

$ ./funclatency.py -u arch_send_call_function_single_ipi

     usecs               : count     distribution
         0 -> 1          : 7        |                                        |
         2 -> 3          : 16       |                                        |
         4 -> 7          : 141      |                                        |
         8 -> 15         : 4455631  |****************************************|
        16 -> 31         : 437981   |***                                     |
        32 -> 63         : 5036     |                                        |
        64 -> 127        : 92       |                                        |

avg = 12 usecs, total: 60,532,481 usecs, count: 4,898,904


3. Time taken by arch_send_call_function_single_ipi() with changes in
this series.

$ ./funclatency.py -u arch_send_call_function_single_ipi

     usecs               : count     distribution
         0 -> 1          : 15       |                                        |
         2 -> 3          : 7        |                                        |
         4 -> 7          : 3798     |                                        |
         8 -> 15         : 4569610  |****************************************|
        16 -> 31         : 339284   |**                                      |
        32 -> 63         : 4542     |                                        |
        64 -> 127        : 68       |                                        |
       128 -> 255        : 0        |                                        |
       256 -> 511        : 1        |                                        |

avg = 11 usecs, total: 57,720,612 usecs, count: 4,917,325

4. This patch series has also been tested on KVM on a Power8 CPU.
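
For reference, the rounded averages in (2) and (3) can be recomputed from
the totals and counts reported above. The snippet below is only a small
sketch of that arithmetic; the variable names are illustrative and are not
part of funclatency or of this series. With these inputs the per-call
average comes out at roughly 12.36 usecs before and 11.74 usecs after,
i.e. a reduction of about 5%.

```python
# Totals and counts copied from the funclatency summaries above.
before_total_usecs = 60_532_481
before_count = 4_898_904
after_total_usecs = 57_720_612
after_count = 4_917_325

# Per-call average latency in microseconds (unrounded).
before_avg = before_total_usecs / before_count
after_avg = after_total_usecs / after_count

# Relative reduction in the per-call average.
reduction = (before_avg - after_avg) / before_avg

print(f"before: {before_avg:.2f} usecs/call")
print(f"after:  {after_avg:.2f} usecs/call")
print(f"reduction: {reduction:.1%}")
```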

[1]: https://github.com/iovisor/bcc/blob/master/tools/funclatency.py

Gautam Menghani (3):
  arch/powerpc/kvm: Use bitmap to speed up resend of irqs in ICS
  arch/powerpc/kvm: Optimize the server number -> ICP lookup
  arch/powerpc/kvm: Reduce lock contention by moving spinlock from ics
    to irq_state

 arch/powerpc/kvm/book3s_hv_rm_xics.c |  8 ++--
 arch/powerpc/kvm/book3s_xics.c       | 70 ++++++++++++----------------
 arch/powerpc/kvm/book3s_xics.h       | 13 ++----
 3 files changed, 39 insertions(+), 52 deletions(-)

--
2.44.0