Re: [PATCH] KVM: x86: ioapic: Clear IRR for rtc bit when rtc EOI gotten

From: Wanpeng Li
Date: Mon Dec 25 2017 - 02:29:26 EST


2017-12-14 20:23 GMT+08:00 Gonglei <arei.gonglei@xxxxxxxxxx>:
> We hit a bug in our test while run PCMark 10 in a windows 7 VM,
> The VM got stuck and the wallclock was hang after several minutes running
> PCMark 10 in it.
> It is quite easily to reproduce the bug with the upstream KVM and Qemu.
>
> We found that KVM can not inject any RTC irq to VM after it was hang, it fails to
> Deliver irq in ioapic_set_irq() because RTC irq is still pending in ioapic->irr.
>
> static int ioapic_set_irq(struct kvm_ioapic *ioapic, unsigned int irq,
> int irq_level, bool line_status)
> {
> ...
> if (!irq_level) {
> ioapic->irr &= ~mask;
> ret = 1;
> goto out;
> }
> ...
> if ((edge && old_irr == ioapic->irr) ||
> (!edge && entry.fields.remote_irr)) {
> ret = 0;
> goto out;
> }
>
> According to RTC spec, after RTC injects a High level irq, OS will read CMOS's

I think it is falling edge and active low.

Regards,
Wanpeng Li

> register C to to clear the irq flag, and pull down the irq electric pin.
>
> For Qemu, we will emulate the reading operation in cmos_ioport_read(),
> but Guest OS will fire a write operation before to tell which register will be read
> after this write, where we use s->cmos_index to record the following register to read.
>
> But in our test, we found that there is a possible situation that Vcpu fails to read
> RTC_REG_C to clear irq, This could happens while two VCpus are writing/reading
> registers at the same time, for example, vcpu 0 is trying to read RTC_REG_C,
> so it write RTC_REG_C first, where the s->cmos_index will be RTC_REG_C,
> but before it tries to read register C, another vcpu1 is going to read RTC_YEAR,
> it changes s->cmos_index to RTC_YEAR by a writing action.
> The next operation of vcpu0 will be lead to read RTC_YEAR, In this case, we will miss
> calling qemu_irq_lower(s->irq) to clear the irq. After this, kvm will never inject RTC irq,
> and Windows VM will hang.
>
> Let's clear IRR of rtc when corresponding EOI is gotten to avoid the issue.
>
> Suggested-by: Paolo Bonzini <pbonzini@xxxxxxxxxx>
> Signed-off-by: Gonglei <arei.gonglei@xxxxxxxxxx>
> ---
> Thanks to Paolo provides a good solution. :)
>
> arch/x86/kvm/ioapic.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/arch/x86/kvm/ioapic.c b/arch/x86/kvm/ioapic.c
> index 4e822ad..5022d63 100644
> --- a/arch/x86/kvm/ioapic.c
> +++ b/arch/x86/kvm/ioapic.c
> @@ -160,6 +160,7 @@ static void rtc_irq_eoi(struct kvm_ioapic *ioapic, struct kvm_vcpu *vcpu)
> {
> if (test_and_clear_bit(vcpu->vcpu_id,
> ioapic->rtc_status.dest_map.map)) {
> + ioapic->irr &= ~(1 << RTC_GSI);
> --ioapic->rtc_status.pending_eoi;
> rtc_status_pending_eoi_check_valid(ioapic);
> }
> --
> 1.8.3.1
>
>