Re: [PATCH v8] KVM: LAPIC: Apply change to TDCR right away to the timer

From: Radim Krčmář
Date: Fri Oct 06 2017 - 12:37:59 EST


2017-10-06 07:38-0700, Wanpeng Li:
> From: Wanpeng Li <wanpeng.li@xxxxxxxxxxx>
>
> The description in the Intel SDM of how the divide configuration
> register is used: "The APIC timer frequency will be the processor's bus
> clock or core crystal clock frequency divided by the value specified in
> the divide configuration register."
>
> Observation on bare metal shows that when the TDCR is changed, the TMCCT
> does not change or make a big jump in value, but the rate at which it
> counts down does change.
>
> This patch updates the emulation of the APIC timer so that a change to the
> divide configuration is reflected in the value of the counter and in
> when the next interrupt is triggered.
>
> Cc: Paolo Bonzini <pbonzini@xxxxxxxxxx>
> Cc: Radim Krčmář <rkrcmar@xxxxxxxxxx>
> Signed-off-by: Wanpeng Li <wanpeng.li@xxxxxxxxxxx>
> ---
> arch/x86/kvm/lapic.c | 31 +++++++++++++++++++++++++++++--
> 1 file changed, 29 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index 14f63b3..21629dd 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -1458,6 +1458,26 @@ static void start_sw_period(struct kvm_lapic *apic)
> HRTIMER_MODE_ABS_PINNED);
> }
>
> +static void update_target_expiration(struct kvm_lapic *apic, uint32_t old_divisor)
> +{
> + ktime_t now, remaining;
> + u64 ns_remaining_old, ns_remaining_new;
> +
> + apic->lapic_timer.period = (u64)kvm_lapic_get_reg(apic, APIC_TMICT)
> + * APIC_BUS_CYCLE_NS * apic->divide_count;
> + limit_periodic_timer_frequency(apic);
> +
> + now = ktime_get();
> + remaining = ktime_sub(apic->lapic_timer.target_expiration, now);
> + ns_remaining_old = ktime_to_ns(remaining);
> + ns_remaining_new = mul_u64_u32_div(ns_remaining_old,
> + apic->divide_count, old_divisor);
> +
> + apic->lapic_timer.tscdeadline += nsec_to_cycles(apic->vcpu, ns_remaining_new) -
> + nsec_to_cycles(apic->vcpu, ns_remaining_old);
> + apic->lapic_timer.target_expiration = ktime_add_ns(now, ns_remaining_new);
> +}
> +
> static bool set_target_expiration(struct kvm_lapic *apic)
> {
> ktime_t now;
> @@ -1750,13 +1770,20 @@ int kvm_lapic_reg_write(struct kvm_lapic *apic, u32 reg, u32 val)
> start_apic_timer(apic);
> break;
>
> - case APIC_TDCR:
> + case APIC_TDCR: {
> + uint32_t old_divisor = apic->divide_count;
> +
> if (val & 4)
> apic_debug("KVM_WRITE:TDCR %x\n", val);
> kvm_lapic_set_reg(apic, APIC_TDCR, val);
> update_divide_count(apic);

This revealed a preemption_timer bug in restart_apic_timer() that
enables the oneshot timer even if TMICT = 0.

> + if (apic->divide_count != old_divisor) {

I added an '&& apic->lapic_timer.period' check, as we don't need to do
anything if the timer is not running.

> + hrtimer_cancel(&apic->lapic_timer.timer);
> + update_target_expiration(apic, old_divisor);
> + restart_apic_timer(apic);

Will send a patch for the real bug later.

> + }
> break;
> -
> + }
> case APIC_ESR:
> if (apic_x2apic_mode(apic) && val != 0) {
> apic_debug("KVM_WRITE:ESR not zero %x\n", val);
> --
> 2.7.4
>
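
For readers following along, the rescaling done in update_target_expiration()
can be sketched in userspace roughly as below. This is only an illustration
under stated assumptions: rescale_remaining_ns() is a hypothetical helper, not
a kernel function, and it stands in for the mul_u64_u32_div() call in the
patch by computing ns_remaining_old * new_divisor / old_divisor without
overflowing 64 bits, provided the result itself fits in 64 bits.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical userspace helper (not kernel code): scale the time left
 * until the timer expires by new_divisor / old_divisor, rounding down.
 *
 * Splitting into quotient and remainder keeps every intermediate within
 * 64 bits as long as the final result fits in 64 bits:
 *   ns * new / old == (ns / old) * new + ((ns % old) * new) / old
 */
static uint64_t rescale_remaining_ns(uint64_t ns_remaining_old,
                                     uint32_t new_divisor,
                                     uint32_t old_divisor)
{
    uint64_t quot, rem;

    assert(old_divisor != 0);

    quot = ns_remaining_old / old_divisor;
    rem = ns_remaining_old % old_divisor;

    /* rem < old_divisor <= 2^32, so rem * new_divisor cannot overflow. */
    return quot * new_divisor + (rem * new_divisor) / old_divisor;
}
```

Doubling the divisor halves the count-down rate, so the remaining time doubles
(1000 ns left at divide-by-1 becomes 2000 ns at divide-by-2), while the counter
value itself is unchanged at the moment of the write.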