Re: [PATCH v2] kvm: Replace vcpu->swait with rcuwait

From: Marc Zyngier
Date: Mon Apr 20 2020 - 13:12:24 EST


On 2020-04-20 17:41, Davidlohr Bueso wrote:
The use of any sort of waitqueue (simple or regular) for
wait/waking vcpus has always been an overkill and semantically
wrong. Because this is per-vcpu (which is blocked) there is
only ever a single waiting vcpu, thus no need for any sort of
queue.

As such, make use of the rcuwait primitive, with the following
considerations:

- rcuwait already provides the proper barriers that serialize
concurrent waiter and waker.

- Task wakeup is done in rcu read critical region, with a
stable task pointer.

- Because there is no concurrency among waiters, we need
not worry about rcuwait_wait_event() calls corrupting
the wait->task. As a consequence, this saves the locking
done in swait when modifying the queue. This also applies
to per-vcore wait for powerpc kvm-hv.

The x86-tscdeadline_latency test mentioned in 8577370fb0cb
("KVM: Use simple waitqueue for vcpu->wq") shows that, on avg,
latency is reduced by around 15-20% with this change.

This patch also changes TASK_INTERRUPTIBLE for TASK_IDLE, as
kvm is (ab)using the former such that idle vcpus do no contribute
to the loadavg. Let use the correct semantics for this.

Cc: Paul Mackerras <paulus@xxxxxxxxxx>
Cc: kvmarm@xxxxxxxxxxxxxxxxxxxxx
Cc: linux-mips@xxxxxxxxxxxxxxx
Signed-off-by: Davidlohr Bueso <dbueso@xxxxxxx>
---
v2: Added missing semicolon in mips change.

The rest of the patches in this series continues to apply on tip,
as such I am only sending a v2 for this particular patch.

arch/mips/kvm/mips.c | 6 ++----
arch/powerpc/include/asm/kvm_book3s.h | 2 +-
arch/powerpc/include/asm/kvm_host.h | 2 +-
arch/powerpc/kvm/book3s_hv.c | 22 ++++++++--------------
arch/powerpc/kvm/powerpc.c | 2 +-
arch/x86/kvm/lapic.c | 2 +-
include/linux/kvm_host.h | 10 +++++-----
virt/kvm/arm/arch_timer.c | 2 +-
virt/kvm/arm/arm.c | 9 +++++----
virt/kvm/async_pf.c | 3 +--
virt/kvm/kvm_main.c | 31 +++++++++++--------------------
11 files changed, 37 insertions(+), 54 deletions(-)

[...]

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 70f03ce0e5c1..887efb39fb1a 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -343,7 +343,7 @@ static void kvm_vcpu_init(struct kvm_vcpu *vcpu,
struct kvm *kvm, unsigned id)
vcpu->kvm = kvm;
vcpu->vcpu_id = id;
vcpu->pid = NULL;
- init_swait_queue_head(&vcpu->wq);
+ rcuwait_init(&vcpu->wait);
kvm_async_pf_vcpu_init(vcpu);

vcpu->pre_pcpu = -1;
@@ -2465,9 +2465,8 @@ static int kvm_vcpu_check_block(struct kvm_vcpu *vcpu)
void kvm_vcpu_block(struct kvm_vcpu *vcpu)
{
ktime_t start, cur;
- DECLARE_SWAITQUEUE(wait);
- bool waited = false;
u64 block_ns;
+ int block_check = -EINTR;

kvm_arch_vcpu_blocking(vcpu);

@@ -2491,17 +2490,9 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
} while (single_task_running() && ktime_before(cur, stop));
}

- for (;;) {
- prepare_to_swait_exclusive(&vcpu->wq, &wait, TASK_INTERRUPTIBLE);
-
- if (kvm_vcpu_check_block(vcpu) < 0)
- break;
-
- waited = true;
- schedule();
- }
-
- finish_swait(&vcpu->wq, &wait);
+ rcuwait_wait_event(&vcpu->wait,
+ (block_check = kvm_vcpu_check_block(vcpu)) < 0,
+ TASK_IDLE);
cur = ktime_get();
out:
kvm_arch_vcpu_unblocking(vcpu);
@@ -2525,18 +2516,17 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
}
}

- trace_kvm_vcpu_wakeup(block_ns, waited, vcpu_valid_wakeup(vcpu));
+ trace_kvm_vcpu_wakeup(block_ns, !block_check, vcpu_valid_wakeup(vcpu));

This looks like a change in the semantics of the tracepoint. Before this
change, 'waited' would have been true if the vcpu waited at all. Here, you'd
have false if it has been interrupted by a signal, even if the vcpu has waited
for a period of time.

Thanks,

M.
--
Jazz is not dead. It just smells funny...