Re: [External] Re: [RESEND RFC: timer passthrough 0/9] Support timer passthrough for VM

From: Zhimin Feng
Date: Tue Feb 23 2021 - 08:32:51 EST


Hi

The host timer is saved when the CPU enters non-root mode and restored when the CPU returns to root mode, so the guest does not affect the host timer.

In non-root mode, the host timer deadline is written to the VMX preemption timer. When the host timer expires (the preemption timer value reaches 0), the preemption timer triggers an immediate VM exit, and the host timer is then handled in the preemption timer handler.
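
To make the offload step concrete, here is a minimal user-space sketch in C. It is not taken from the patches: the function name, its parameters, and the saturation policy are assumptions for illustration only. It relies on two architectural facts: the VMX preemption timer counts down once every 2^N TSC cycles (N is reported in bits 4:0 of IA32_VMX_MISC), and a value of 0 at VM entry causes a VM exit before the guest runs.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the patch series): convert an absolute
 * host TSC deadline into a value for the 32-bit VMX preemption timer.
 *
 * rate_shift is assumed to come from bits 4:0 of IA32_VMX_MISC; the
 * preemption timer decrements once every 2^rate_shift TSC cycles.
 */
static uint32_t host_deadline_to_preemption_timer(uint64_t host_tscdeadline,
                                                  uint64_t tsc_now,
                                                  unsigned int rate_shift)
{
    uint64_t ticks;

    assert(rate_shift < 32); /* the rate field is only 5 bits wide */

    /* Deadline already reached: 0 forces an immediate VM exit. */
    if (host_tscdeadline <= tsc_now)
        return 0;

    /* Convert the remaining TSC cycles into preemption timer ticks. */
    ticks = (host_tscdeadline - tsc_now) >> rate_shift;

    /*
     * Assumption of this sketch: saturate when the delta does not fit in
     * the 32-bit field, and let the exit handler re-arm the timer.
     */
    if (ticks > UINT32_MAX)
        return UINT32_MAX;

    return (uint32_t)ticks;
}
```

For example, with rate_shift = 5 and a deadline 3200 TSC cycles in the future, the sketch returns 100 ticks; with a deadline that is already in the past it returns 0, which matches the "immediate VMExit" case described above.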

Thanks!

Zhimin

On 2021/2/9 at 2:13 AM, Konrad Rzeszutek Wilk wrote:
On Fri, Feb 05, 2021 at 06:03:08PM +0800, Zhimin Feng wrote:
The main motivation for this patch series is to improve VM performance.
This patch series introduces a way to enable timer passthrough in
non-root mode.
Nice! Those are impressive numbers!

The main idea is to offload the host timer to the preemption timer in
non-root mode. By doing this, the guest can write the tscdeadline MSR directly
in non-root mode and the host timer isn't lost. If the CPU is in root mode,
the guest timer is switched to a software timer.
I am sorry - but I am having a hard time understanding the sentence
above so let me ask some specific questions.

- How do you protect against the guest DoS-ing the host and mucking with
the host timer?

- As in, can you explain how the host can still continue scheduling its
own quanta?

And one more - what happens with Live Migration? I would assume that
becomes a no-go unless you swap the guest timer back in? So
we end up emulating the MSR again?

Thanks!

Tested on an Intel(R) Xeon(R) Platinum 8260 server.

The guest OS is Debian (kernel: 4.19.28). The guest configuration is
as follows: 8 CPUs, 16GB memory, guest idle=poll,
memcached in guest (memcached -d -t 8 -u root)

I use the memtier_benchmark tool to test performance
(memtier_benchmark -P memcache_text -s guest_ip -c 16 -t 32
--key-maximum=10000000000 --random-data --data-size-range=64-128 -p 11211
--generate-keys --ratio 5:1 --test-time=500)

Total Ops/sec improves by about 25% and Avg.Latency drops by about 20% when
timer-passthrough is enabled.

=============================================================
               | Enable timer-passth | Disable timer-passth |
=============================================================
Totals Ops/sec | 514869.67           | 411766.67            |
-------------------------------------------------------------
Avg.Latency    | 0.99483             | 1.24294              |
=============================================================
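
The percentages quoted above can be re-derived from the table. The helper below is a small illustrative sketch (the function name is hypothetical and not part of the series); it computes the relative change of a metric against the "Disable timer-passth" column as the baseline.

```c
#include <assert.h>

/*
 * Hypothetical helper for checking the figures in the table:
 * relative change from base to value, in percent
 * (positive = increase, negative = decrease).
 */
static double percent_change(double base, double value)
{
    assert(base != 0.0); /* a zero baseline has no relative change */

    return (value - base) / base * 100.0;
}
```

Calling percent_change(411766.67, 514869.67) gives roughly +25.0 for Totals Ops/sec, and percent_change(1.24294, 0.99483) gives roughly -20.0 for Avg.Latency, in line with the 25% and 20% figures.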


Zhimin Feng (9):
  KVM: vmx: hook set_next_event for getting the host tscd
  KVM: vmx: enable host lapic timer offload preemtion timer
  KVM: vmx: enable passthrough timer to guest
  KVM: vmx: enable passth timer switch to sw timer
  KVM: vmx: use tsc_adjust to enable tsc_offset timer passthrough
  KVM: vmx: check enable_timer_passth strictly
  KVM: vmx: save the initial value of host tscd
  KVM: vmx: Dynamically open or close the timer-passthrough for pre-vm
  KVM: vmx: query the state of timer-passth for vm

 arch/x86/include/asm/kvm_host.h |  27 ++++
 arch/x86/kvm/lapic.c            |   1 +
 arch/x86/kvm/vmx/vmx.c          | 331 +++++++++++++++++++++++++++++++++++++++-
 arch/x86/kvm/x86.c              |  26 +++-
 include/linux/kvm_host.h        |   1 +
 include/uapi/linux/kvm.h        |   3 +
 kernel/time/tick-common.c       |   1 +
 tools/include/uapi/linux/kvm.h  |   3 +
 virt/kvm/kvm_main.c             |   1 +
 9 files changed, 389 insertions(+), 5 deletions(-)

--
2.11.0