Re: [KVM timekeeping 10/35] Fix deep C-state TSC desynchronization

From: Jan Kiszka
Date: Wed Sep 15 2010 - 04:09:47 EST


On 15.09.2010 06:07, Zachary Amsden wrote:
> On 09/13/2010 11:10 PM, Jan Kiszka wrote:
>> On 20.08.2010 10:07, Zachary Amsden wrote:
>>
>>> When CPUs with unstable TSCs enter deep C-state, TSC may stop
>>> running. This causes us to require resynchronization. Since
>>> we can't tell when this may potentially happen, we assume the
>>> worst by forcing re-compensation for it at every point the VCPU
>>> task is descheduled.
>>>
>>> Signed-off-by: Zachary Amsden <zamsden@xxxxxxxxxx>
>>> ---
>>> arch/x86/kvm/x86.c | 2 +-
>>> 1 files changed, 1 insertions(+), 1 deletions(-)
>>>
>>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>>> index 7fc4a55..52b6c21 100644
>>> --- a/arch/x86/kvm/x86.c
>>> +++ b/arch/x86/kvm/x86.c
>>> @@ -1866,7 +1866,7 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
>>> }
>>>
>>> kvm_x86_ops->vcpu_load(vcpu, cpu);
>>> - if (unlikely(vcpu->cpu != cpu)) {
>>> + if (unlikely(vcpu->cpu != cpu) || check_tsc_unstable()) {
>>> /* Make sure TSC doesn't go backwards */
>>> s64 tsc_delta = !vcpu->arch.last_host_tsc ? 0 :
>>> native_read_tsc() - vcpu->arch.last_host_tsc;
>>>
>> For an as yet unknown reason, this commit breaks Linux guests here if they are
>> started with only a single VCPU. They hang during boot, obviously no
>> longer receiving interrupts.
>>
>> I'm using kvm-kmod against a 2.6.34 host kernel, so this may be a side
>> effect of the wrapping, though I cannot imagine how.
>>
>> Does anyone have any ideas?
>>
>
> Question: how did you come to the knowledge that this is the commit
> which breaks things? I'm assuming you bisected, in which case a
> transition from stable -> unstable would have only happened once.

Right.

> This
> also means the PM suspend event which you observed only happened once,
> so obviously if you bisected successfully, there is a bug which doesn't
> involve the PM transition or the stable -> unstable transition.

Right, see my other posting.

>
> Your host TSC must have desynchronized during the PM transition, and
> this change compensates the TSC on an unstable host to effectively show
> run time, not real time. Perhaps the lack of catchup code (to catch
> back up to real time) is triggering the bug.

I'm still unsure if KVM is right in declaring the TSC unstable. It looks
like Linux is less picky here - are the requirements different?

>
> In any case, I'll proceed with the forcing of unstable TSC and HPET
> clocksource and see what happens.

I tried that before, but it did not trigger the issue that kvm-clock
guests no longer boot properly. This only happens if the TSC is marked
unstable.

Jan
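
A minimal user-space sketch of the compensation discussed in this thread,
assuming the guest TSC is modeled as host TSC plus a per-VCPU offset. The
struct and function names (vcpu_model, vcpu_put_model, vcpu_load_model,
guest_tsc) are hypothetical and are not the kernel's. The quoted hunk only
shows how tsc_delta is computed, so subtracting that delta from the offset
is an assumption, based on the description that the guest then sees run
time rather than real time.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for per-VCPU state (not struct kvm_vcpu). */
struct vcpu_model {
    int      cpu;            /* host CPU the VCPU last ran on */
    uint64_t last_host_tsc;  /* host TSC sampled at deschedule, 0 if never set */
    int64_t  tsc_offset;     /* modeled as: guest TSC = host TSC + tsc_offset */
};

/* Guest-visible TSC for a given host TSC reading (wraps modulo 2^64). */
static uint64_t guest_tsc(const struct vcpu_model *v, uint64_t host_tsc)
{
    return host_tsc + (uint64_t)v->tsc_offset;
}

/* Called when the VCPU task is descheduled: remember the host TSC. */
static void vcpu_put_model(struct vcpu_model *v, uint64_t host_tsc)
{
    v->last_host_tsc = host_tsc;
}

/*
 * Called when the VCPU task is scheduled in again. Mirrors the patched
 * condition: compensate on a CPU change OR whenever the TSC is unstable.
 * Returns the host TSC delta that elapsed while descheduled.
 */
static int64_t vcpu_load_model(struct vcpu_model *v, int cpu,
                               uint64_t host_tsc, bool tsc_unstable)
{
    int64_t delta = 0;

    if (v->cpu != cpu || tsc_unstable) {
        delta = v->last_host_tsc
                    ? (int64_t)(host_tsc - v->last_host_tsc)
                    : 0;
        /* Assumption: on an unstable TSC, hide the descheduled time. */
        if (tsc_unstable)
            v->tsc_offset -= delta;
    }
    v->cpu = cpu;
    return delta;
}
```

In this model, a VCPU descheduled at host TSC 1000 and reloaded at 5000 on
an unstable host has its offset reduced by 4000, so the guest reads the
same TSC value it had when it was descheduled. Without separate catch-up
logic the guest TSC then lags real time by however long the task was off
the CPU.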
