Re: [RFC 2/2] x86/kvmclock: Use host timekeeping.

From: Vitaly Kuznetsov
Date: Tue Sep 24 2019 - 07:14:40 EST


Suleiman Souhlal <suleiman@xxxxxxxxxx> writes:

> On Fri, Sep 20, 2019 at 10:33 PM Vitaly Kuznetsov <vkuznets@xxxxxxxxxx> wrote:
>>
>> Suleiman Souhlal <suleiman@xxxxxxxxxx> writes:
>>
>> > When CONFIG_KVMCLOCK_HOST_TIMEKEEPING is enabled, and the host
>> > supports it, update our timekeeping parameters to be the same as
>> > the host. This lets us have our time synchronized with the host's,
>> > even in the presence of host NTP or suspend.
>> >
>> > When enabled, kvmclock uses raw tsc instead of pvclock.
>> >
>> > When enabled, syscalls that can change time, such as settimeofday(2)
>> > or adjtimex(2) are disabled in the guest.
>> >
>> > Signed-off-by: Suleiman Souhlal <suleiman@xxxxxxxxxx>
>> > ---
>> > arch/x86/Kconfig | 9 +++
>> > arch/x86/include/asm/kvmclock.h | 2 +
>> > arch/x86/kernel/kvmclock.c | 127 +++++++++++++++++++++++++++++++-
>> > kernel/time/timekeeping.c | 21 ++++++
>> > 4 files changed, 155 insertions(+), 4 deletions(-)
>> >
>> > diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
>> > index 4195f44c6a09..37299377d9d7 100644
>> > --- a/arch/x86/Kconfig
>> > +++ b/arch/x86/Kconfig
>> > @@ -837,6 +837,15 @@ config PARAVIRT_TIME_ACCOUNTING
>> > config PARAVIRT_CLOCK
>> > bool
>> >
>> > +config KVMCLOCK_HOST_TIMEKEEPING
>> > + bool "kvmclock uses host timekeeping"
>> > + depends on KVM_GUEST
>> > + ---help---
>> > + Select this option to make the guest use the same timekeeping
>> > + parameters as the host. This means that time will be almost
>> > + exactly the same between the two. Only works if the host uses "tsc"
>> > + clocksource.
>> > +
>>
>> I'd also like to speak up against this config, it is confusing. In case
>> the goal is to come up with a TSC-based clock for guests which will
>> return the same as clock_gettime() on the host (or, is the goal to just
>> have the same reading for all guests on the host?) I'd suggest we create
>> a separate (from KVMCLOCK) clocksource (mirroring host timekeeper) and
>> guests will be free to pick the one they like.
>
> Fair enough. I'll do that in the next version of the patch.
>
> The goal is to have a guest clock that gives the same
> clock_gettime(CLOCK_MONOTONIC) as the host.
>

KVMCLOCK carries a lot of legacy from the times when TSC synchronization
was not a given (I hear it is still sometimes problematic on
multi-socket systems, but oh well). If I were to design a new clock, I'd
probably mirror Hyper-V's TSC page clocksource: invalidate the page
whenever the host timekeeper values are updated, and make the guest spin
until it is valid again.
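
To make the idea concrete, here is a minimal userspace sketch of such a
page, written as a sequence-counter protocol. It is not the Hyper-V
layout and not kvmclock code: struct clock_page, its fields and both
function names are made up for illustration, and the mult/shift
conversion is just the usual clocksource-style scaling.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical shared page; the layout is an assumption, not a real ABI. */
struct clock_page {
	_Atomic uint32_t seq;	/* odd: host is updating, page is invalid */
	uint64_t base_ns;	/* host clock reading taken at base_tsc */
	uint64_t base_tsc;	/* TSC value the parameters were sampled at */
	uint32_t mult;		/* ns = (cycles * mult) >> shift */
	uint32_t shift;
};

/* Host side: invalidate the page, publish new parameters, revalidate. */
static void clock_page_update(struct clock_page *p, uint64_t base_ns,
			      uint64_t base_tsc, uint32_t mult, uint32_t shift)
{
	uint32_t s = atomic_load_explicit(&p->seq, memory_order_relaxed);

	/* Make seq odd so that readers start spinning. */
	atomic_store_explicit(&p->seq, s + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);

	/* Real code would use READ_ONCE()/WRITE_ONCE()-style accesses. */
	p->base_ns = base_ns;
	p->base_tsc = base_tsc;
	p->mult = mult;
	p->shift = shift;

	/* Make seq even again; the release store publishes the fields. */
	atomic_store_explicit(&p->seq, s + 2, memory_order_release);
}

/* Guest side: spin while the page is invalid, retry if it changed. */
static uint64_t clock_page_read(struct clock_page *p, uint64_t tsc)
{
	for (;;) {
		uint32_t s = atomic_load_explicit(&p->seq,
						  memory_order_acquire);
		uint64_t delta, ns;

		if (s & 1)
			continue;	/* update in progress */

		/* May overflow for large deltas; a 128-bit multiply avoids it. */
		delta = tsc - p->base_tsc;
		ns = p->base_ns + ((delta * p->mult) >> p->shift);

		atomic_thread_fence(memory_order_acquire);
		if (atomic_load_explicit(&p->seq, memory_order_relaxed) == s)
			return ns;	/* parameters were stable */
	}
}
```

The guest would pass in a fresh rdtsc reading; it is a plain argument
here only so that the sketch can be exercised without one.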

The tricky part with this approach is probably TSC scaling/TSC
offsetting, which is still going to be controlled by kvmclock (so the
guest has no option to read 'pure TSC'). As an alternative, you can make
kvmclock unpluggable, so that when it's not enabled the TSC
frequency/offset remain intact. You can, of course, try to update the
timekeeper values to match the new frequency/offset every time they
change, but rounding errors may bite.
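
One place where such rounding can come from, as a small userspace
sketch: a mult/shift pair derived from a frequency is in general only an
approximation of it. The helper names and the fixed shift of 24 are
arbitrary choices for illustration; the rounding in calc_mult() follows
the usual round-to-nearest approach.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Derive mult for a given frequency and shift, rounded to nearest. */
static uint32_t calc_mult(uint64_t freq_hz, uint32_t shift)
{
	uint64_t tmp = NSEC_PER_SEC << shift;

	tmp += freq_hz / 2;
	return (uint32_t)(tmp / freq_hz);
}

/* Convert cycles to ns; the caller keeps cycles * mult within 64 bits. */
static uint64_t cycles_to_ns(uint64_t cycles, uint32_t mult, uint32_t shift)
{
	return (cycles * mult) >> shift;
}

/* Nanoseconds lost when converting exactly one second worth of cycles. */
static uint64_t error_per_second(uint64_t freq_hz, uint32_t shift)
{
	uint32_t mult = calc_mult(freq_hz, shift);

	return NSEC_PER_SEC - cycles_to_ns(freq_hz, mult, shift);
}
```

With shift 24, a 1 GHz clock converts exactly, while a 3 GHz clock comes
out a few tens of nanoseconds short per second. If the parameters are
re-derived on every frequency/offset change, errors of this kind
can accumulate.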

--
Vitaly