[PATCH RFC V4 0/3] kvm: Improving directed yield in PLE handler

From: Raghavendra K T
Date: Mon Jul 16 2012 - 04:26:56 EST

Currently the Pause Loop Exit (PLE) handler does a directed yield to a
random vcpu on a pause-loop exit. We already filter candidates when
choosing the vcpu to yield_to. This series adds more checks to that
candidate selection.

On guests with a large number of vcpus, there is a high probability of
yielding to a vcpu that has itself recently done a pause-loop exit.
Such a yield can lead to that vcpu spinning again.

The patchset keeps track of pause loop exits and gives a chance to a
vcpu which has:

(a) Not done a pause loop exit at all (it is probably a preempted
    lock-holder)

(b) Been skipped in the last iteration because it did a pause loop exit,
    and has probably become eligible now (next eligible lock-holder)

This concept also helps in cpu relax interception cases, which use the
same handler.
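
The eligibility rule in (a) and (b) can be sketched in plain userspace C
as below. This is only an illustration under stated assumptions, not the
code from the patches: the field names cpu_relax_intercepted and
dy_eligible come from the changelog in this mail, while struct
vcpu_sketch, eligible_for_directed_yield() and pick_yield_target() are
hypothetical names, and the simple round-robin scan stands in for
whatever candidate filtering the real handler does.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the per-vcpu tracking state. */
struct vcpu_sketch {
	bool cpu_relax_intercepted; /* vcpu did a pause loop exit / cpu relax intercept */
	bool dy_eligible;           /* starts false; flips each time a spinning vcpu is considered */
};

/*
 * A vcpu is a candidate for directed yield if
 *   (a) it has not done a pause loop exit, or
 *   (b) it did, but was skipped the last time it was considered.
 */
static bool eligible_for_directed_yield(struct vcpu_sketch *v)
{
	bool eligible = !v->cpu_relax_intercepted || v->dy_eligible;

	/* A spinning vcpu alternates: skipped now, eligible on the next scan. */
	if (v->cpu_relax_intercepted)
		v->dy_eligible = !v->dy_eligible;

	return eligible;
}

/*
 * Assumed scan order: walk the other vcpus round-robin starting after
 * 'self' and return the index of the first eligible one, or -1 if every
 * other vcpu was skipped on this pass.
 */
static int pick_yield_target(struct vcpu_sketch *vcpus, size_t n, size_t self)
{
	for (size_t i = 1; i < n; i++) {
		size_t idx = (self + i) % n;

		if (eligible_for_directed_yield(&vcpus[idx]))
			return (int)idx;
	}
	return -1;
}
```

With dy_eligible starting out false, a vcpu that is itself spinning is
skipped the first time it is considered and only becomes a candidate on
a later pass, while a vcpu that never did a pause loop exit is always a
candidate.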

Changes since V3:
- arch specific fix/changes (Christian)

Changes since v2:
- Move ple structure to common code (Avi)
- rename pause_loop_exited to cpu_relax_intercepted (Avi)
- Drop superfluous curly braces (Ingo)

Changes since v1:
- Add more documentation for structure and algorithm, and rename
  plo ==> ple (Rik)
- Change dy_eligible initial value to false; otherwise the very first
  directed yield will not be skipped (Nikunj)
- Fix up signoff/from issue

Future enhancements:
(1) Currently we have a boolean to decide on the eligibility of a vcpu.
    It would be nice to get feedback from large guests (>32 vcpus) on
    whether an integer counter would improve things further (with
    counter = say f(log n)).

(2) We have not considered system load during the iteration over vcpus.
    With that information we can limit the scan and also decide whether
    schedule() is better. [I am able to use #kicked vcpus to decide on
    this, but maybe there are better ideas, like information from the
    global loadavg.]

(3) We can exploit this further with the PV patches, since they also
    know about the next eligible lock-holder.

Summary: There is a very good improvement in the moderate / no
over-commit scenarios for a kvm based guest on a PLE machine.

Results: kernbench improves by around 30% and 6% for 1x and 2x
respectively; ebizzy improves by around 87% and 23% for 1x and 2x
respectively.

Note: The patches are tested on x86.

V1: https://lkml.org/lkml/2012/7/9/32
V2: https://lkml.org/lkml/2012/7/10/392
V3: https://lkml.org/lkml/2012/7/12/437

Raghavendra K T (3):
  config: Add config to support ple or cpu relax optimzation
  kvm : Note down when cpu relax intercepted or pause loop exited
  kvm : Choose a better candidate for directed yield

 arch/s390/kvm/Kconfig    |    1 +
 arch/x86/kvm/Kconfig     |    1 +
 include/linux/kvm_host.h |   42 ++++++++++++++++++++++++++++++++++++++++++
 virt/kvm/Kconfig         |    3 +++
 virt/kvm/kvm_main.c      |   40 ++++++++++++++++++++++++++++++++++++++++
 5 files changed, 87 insertions(+), 0 deletions(-)
