Re: [RFC PATCH 0/3] generic hypercall support

From: Avi Kivity
Date: Fri May 08 2009 - 04:00:45 EST


Marcelo Tosatti wrote:
I think the comparison is not entirely fair. You're using
KVM_HC_VAPIC_POLL_IRQ ("null" hypercall) and the compiler optimizes that
(on Intel) to only one register read:

nr = kvm_register_read(vcpu, VCPU_REGS_RAX);

Whereas in a real hypercall for (say) PIO you would need the address,
size, direction and data.

Well, that's probably one of the reasons pio is slower, as the cpu has to set these up, and the kernel has to read them.

Also for PIO/MMIO you're adding this unoptimized lookup to the measurement:

pio_dev = vcpu_find_pio_dev(vcpu, port, size, !in);
if (pio_dev) {
	kernel_pio(pio_dev, vcpu, vcpu->arch.pio_data);
	complete_pio(vcpu);
	return 1;
}

Since there are only one or two elements in the list, I don't see how it could be optimized.
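
To illustrate the point about list length, here is a minimal sketch of a linear port lookup. It is not the actual vcpu_find_pio_dev() code: the struct pio_dev layout and the find_pio_dev() helper are hypothetical stand-ins, assumed only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an in-kernel PIO device; not the KVM structure. */
struct pio_dev {
	uint16_t base;	/* first port handled */
	uint16_t len;	/* number of consecutive ports handled */
};

/*
 * Return the first device whose port range covers [port, port + size),
 * or NULL if no device claims the access.
 */
static const struct pio_dev *find_pio_dev(const struct pio_dev *devs,
					  size_t ndevs,
					  uint16_t port, uint16_t size)
{
	assert(devs != NULL || ndevs == 0);

	for (size_t i = 0; i < ndevs; i++) {
		uint32_t start = devs[i].base;
		/* Exclusive end, kept in 32 bits so it cannot wrap around. */
		uint32_t end = start + devs[i].len;

		if (port >= start && (uint32_t)port + size <= end)
			return &devs[i];
	}
	return NULL;
}
```

With one or two entries the loop body runs at most twice, so a smarter structure (hash, tree) would likely cost more in setup than it saves.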

Whereas for the hypercall measurement you don't. I believe a fair comparison
would be to have a shared guest/host memory area where you store guest/host
TSC values and then do, on the guest:

rdtscll(shared_area->guest_tsc);
pio/mmio/hypercall
... back to host
rdtscll(shared_area->host_tsc);

And then calculate the difference (minus the guest's TSC_OFFSET, of course)?

I don't understand why you want the host tsc. We're interested in round-trip latency, so you want the guest tsc all the time.
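
A rough sketch of what a guest-only measurement could look like, assuming both timestamps are taken inside the guest so that any constant TSC offset cancels in the subtraction. All names here (read_tsc_fn, exit_op_fn, measure_round_trip, and the sim_* helpers) are hypothetical; in a real guest, read_tsc would wrap rdtsc and op would issue the pio, mmio or hypercall being measured. The simulated counter is included only so the sketch can be exercised outside a guest.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t (*read_tsc_fn)(void);	/* hypothetical: reads the guest tsc */
typedef void (*exit_op_fn)(void);	/* hypothetical: pio/mmio/hypercall */

/*
 * Average guest-visible cycles per round trip over `iterations` runs.
 * Both reads happen on the guest side, so no host timestamp is needed.
 */
static uint64_t measure_round_trip(read_tsc_fn read_tsc, exit_op_fn op,
				   uint64_t iterations)
{
	assert(iterations > 0);

	uint64_t start = read_tsc();
	for (uint64_t i = 0; i < iterations; i++)
		op();
	uint64_t end = read_tsc();

	return (end - start) / iterations;
}

/* Simulated counter: pretend every exit costs 1500 cycles. */
static uint64_t sim_tsc;

static uint64_t sim_read_tsc(void)
{
	return sim_tsc;
}

static void sim_exit_op(void)
{
	sim_tsc += 1500;
}
```

Averaging over many iterations also amortizes the cost of the two tsc reads themselves.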

--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
