Re: [PATCH][RFC] kvm-scheduler integration

From: Ingo Molnar
Date: Sun Jul 08 2007 - 09:09:48 EST



* Avi Kivity <avi@xxxxxxxxxxxx> wrote:

> Intel VT essentially introduces a new set of registers into the
> processor; this means we cannot preempt kvm in kernel mode lest a new
> VM run with an old VM's registers. In addition, kvm lazy switches
> some host registers as well. (AMD does not introduce new registers,
> but we still want lazy msr switching, and we want to know when we move
> to a different cpu in order to be able to guarantee a monotonically
> increasing tsc).
>
> Current kvm code simply disables preemption when guest context is in
> use. This, however, has many drawbacks:
>
> - some kvm mmu code is O(n), causing possibly unbounded latencies and
>   causing -rt great unhappiness.
> - the mmu code wants to sleep (especially with guest paging), but can't.
> - some optimizations are not possible; for example, if we switch from one
>   VM to another, we need not restore some host registers (as they will
>   simply be overwritten with the new guest registers immediately).
>
> This patch adds hooks to the scheduler that allow kvm to be notified
> about scheduling decisions involving virtual machines. When we
> schedule out a VM, kvm is told to swap guest registers out; when we
> schedule the VM in, we swap the registers back in.
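
[Editorial sketch, not part of the quoted patch.] The hook scheme described
in the quoted text above can be pictured with a small user-space toy model.
Every name here (toy_vcpu, toy_sched_hooks, toy_context_switch) is
hypothetical and invented for illustration; none of it is the interface
from the RFC patch, which is not shown in this message. The sketch only
assumes that the scheduler calls one callback when a VM task is switched
out and another when it is switched in.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for per-vcpu state. */
struct toy_vcpu {
	int cpu;               /* CPU the guest state was last loaded on */
	int guest_regs_loaded; /* 1 while guest registers are live */
	int cpu_changes;       /* how often the vcpu was loaded on a new CPU */
};

/* Hypothetical hook table: what the scheduler would call into. */
struct toy_sched_hooks {
	void (*sched_out)(struct toy_vcpu *v);
	void (*sched_in)(struct toy_vcpu *v, int cpu);
};

/* Scheduled out: swap guest registers out. */
static void toy_swap_out(struct toy_vcpu *v)
{
	v->guest_regs_loaded = 0;
}

/* Scheduled in: note a CPU change, then swap guest registers back in. */
static void toy_swap_in(struct toy_vcpu *v, int cpu)
{
	if (v->cpu != cpu) {
		v->cpu_changes++;
		v->cpu = cpu;
	}
	v->guest_regs_loaded = 1;
}

static const struct toy_sched_hooks toy_hooks = {
	.sched_out = toy_swap_out,
	.sched_in  = toy_swap_in,
};

/* Stand-in for the scheduler's context switch on a given CPU. */
static void toy_context_switch(struct toy_vcpu *prev,
			       struct toy_vcpu *next, int cpu)
{
	if (prev)
		toy_hooks.sched_out(prev);
	if (next)
		toy_hooks.sched_in(next, cpu);
}
```

Calling toy_context_switch(&a, &b, 0) leaves a with its guest registers
swapped out and b with its registers loaded on CPU 0. The sched_in hook is
also the natural place to notice a move to a different CPU.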

hm, why not do what i have in -rt? See the patch below. Seems to work
fine for me, although i might be missing something.

Ingo

------------------------->
Subject: [patch] kvm: make vcpu_load/put preemptible
From: Ingo Molnar <mingo@xxxxxxx>

make vcpu_load/put preemptible.

Signed-off-by: Ingo Molnar <mingo@xxxxxxx>
---
 drivers/kvm/svm.c |   13 ++++++++++---
 drivers/kvm/vmx.c |   15 ++++++++++++---
 2 files changed, 22 insertions(+), 6 deletions(-)

Index: linux-rt-rebase.q/drivers/kvm/svm.c
===================================================================
--- linux-rt-rebase.q.orig/drivers/kvm/svm.c
+++ linux-rt-rebase.q/drivers/kvm/svm.c
@@ -610,9 +610,17 @@ static void svm_free_vcpu(struct kvm_vcp

 static void svm_vcpu_load(struct kvm_vcpu *vcpu)
 {
-	int cpu, i;
+	int cpu = raw_smp_processor_id(), i;
+	cpumask_t this_mask = cpumask_of_cpu(cpu);
+
+	/*
+	 * Keep the context preemptible, but do not migrate
+	 * away to another CPU. TODO: make sure this persists.
+	 * Save/restore original mask.
+	 */
+	if (unlikely(!cpus_equal(current->cpus_allowed, this_mask)))
+		set_cpus_allowed(current, cpumask_of_cpu(cpu));

-	cpu = get_cpu();
 	if (unlikely(cpu != vcpu->cpu)) {
 		u64 tsc_this, delta;

@@ -638,7 +646,6 @@ static void svm_vcpu_put(struct kvm_vcpu
 		wrmsrl(host_save_user_msrs[i], vcpu->svm->host_user_msrs[i]);

 	rdtscll(vcpu->host_tsc);
-	put_cpu();
 }

 static void svm_vcpu_decache(struct kvm_vcpu *vcpu)
Index: linux-rt-rebase.q/drivers/kvm/vmx.c
===================================================================
--- linux-rt-rebase.q.orig/drivers/kvm/vmx.c
+++ linux-rt-rebase.q/drivers/kvm/vmx.c
@@ -241,9 +241,16 @@ static void vmcs_set_bits(unsigned long
 static void vmx_vcpu_load(struct kvm_vcpu *vcpu)
 {
 	u64 phys_addr = __pa(vcpu->vmcs);
-	int cpu;
+	int cpu = raw_smp_processor_id();
+	cpumask_t this_mask = cpumask_of_cpu(cpu);

-	cpu = get_cpu();
+	/*
+	 * Keep the context preemptible, but do not migrate
+	 * away to another CPU. TODO: make sure this persists.
+	 * Save/restore original mask.
+	 */
+	if (unlikely(!cpus_equal(current->cpus_allowed, this_mask)))
+		set_cpus_allowed(current, cpumask_of_cpu(cpu));

 	if (vcpu->cpu != cpu)
 		vcpu_clear(vcpu);
@@ -281,7 +288,6 @@ static void vmx_vcpu_load(struct kvm_vcp
 static void vmx_vcpu_put(struct kvm_vcpu *vcpu)
 {
 	kvm_put_guest_fpu(vcpu);
-	put_cpu();
 }

 static void vmx_vcpu_decache(struct kvm_vcpu *vcpu)
@@ -1862,6 +1868,7 @@ again:
 	}
 #endif

+	preempt_disable();
 	asm (
 		/* Store host registers */
 		"pushf \n\t"
@@ -2002,6 +2009,8 @@ again:

 		reload_tss();
 	}
+	preempt_enable();
+
 	++vcpu->stat.exits;

 #ifdef CONFIG_X86_64
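
[Editorial sketch, not part of the patch.] The idea in the patch comment
("keep the context preemptible, but do not migrate away to another CPU",
with the TODO to save and restore the original mask) has a rough user-space
analogue that may help when reading the hunks above. This is a minimal
sketch that assumes Linux with glibc. The helper names pin_to_current_cpu()
and unpin() are hypothetical; sched_getcpu(), sched_getaffinity() and
sched_setaffinity() are the real glibc calls. It does not use the
in-kernel set_cpus_allowed() interface from the patch, and it assumes the
machine's CPU count fits in a fixed-size cpu_set_t.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>

/*
 * Hypothetical helper: pin the calling thread to the CPU it is
 * currently running on, saving the previous affinity mask in *saved.
 * The thread stays preemptible; it just cannot migrate.
 * Returns the CPU number, or -1 on error.
 */
static int pin_to_current_cpu(cpu_set_t *saved)
{
	cpu_set_t one;
	int cpu;

	/* Save the original mask so it can be restored later. */
	if (sched_getaffinity(0, sizeof(*saved), saved) != 0)
		return -1;

	cpu = sched_getcpu();
	if (cpu < 0)
		return -1;

	/*
	 * If the thread migrates between sched_getcpu() and the call
	 * below, sched_setaffinity() moves it back onto 'cpu'.
	 */
	CPU_ZERO(&one);
	CPU_SET(cpu, &one);
	if (sched_setaffinity(0, sizeof(one), &one) != 0)
		return -1;

	return cpu;
}

/* Hypothetical helper: restore the mask saved by pin_to_current_cpu(). */
static int unpin(const cpu_set_t *saved)
{
	return sched_setaffinity(0, sizeof(*saved), saved);
}
```

A caller would pair the two, in the spirit of vcpu_load/vcpu_put: call
pin_to_current_cpu(&saved) before touching per-CPU state, and
unpin(&saved) afterwards. Restoring the saved mask is the step the TODO in
the patch comment leaves open.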