Re: [PATCH] KVM: x86: inject exceptions produced by x86_decode_insn
From: Yanan Fu
Date:  Mon Nov 13 2017 - 05:09:11 EST
----- Original Message -----
> From: "Paolo Bonzini" <pbonzini@xxxxxxxxxx>
> To: "Wanpeng Li" <kernellwp@xxxxxxxxx>
> Cc: linux-kernel@xxxxxxxxxxxxxxx, "kvm" <kvm@xxxxxxxxxxxxxxx>, yfu@xxxxxxxxxx, "Eduardo Habkost"
> <ehabkost@xxxxxxxxxx>
> Sent: Monday, November 13, 2017 4:32:09 PM
> Subject: Re: [PATCH] KVM: x86: inject exceptions produced by x86_decode_insn
> 
> On 13/11/2017 08:15, Wanpeng Li wrote:
> > 2017-11-10 17:49 GMT+08:00 Paolo Bonzini <pbonzini@xxxxxxxxxx>:
> >> Sometimes, a processor might execute an instruction while another
> >> processor is updating the page tables for that instruction's code page,
> >> but before the TLB shootdown completes.  The interesting case happens
> >> if the page is in the TLB.
> >>
> >> In general, the processor will succeed in executing the instruction and
> >> nothing bad happens.  However, what if the instruction is an MMIO access?
> >> If *that* happens, KVM invokes the emulator, and the emulator gets the
> >> updated page tables.  If the update side had marked the code page as non
> >> present, the page table walk then will fail and so will x86_decode_insn.
> >>
> >> Unfortunately, even though kvm_fetch_guest_virt is correctly returning
> >> X86EMUL_PROPAGATE_FAULT, x86_decode_insn's caller treats the failure as
> >> a fatal error if the instruction cannot simply be reexecuted (as is the
> >> case for MMIO).  And this in fact happened sometimes when rebooting
> >> Windows 2012r2 guests.  Just checking ctxt->have_exception and injecting
> >> the exception if true is enough to fix the case.
> > 
> > I found the only place which can set ctxt->have_exception is in the
> > function x86_emulate_insn(), and x86_decode_insn() will not set
> > ctxt->have_exception even if kvm_fetch_guest_virt() returns
> > X86EMUL_PROPAGATE_FAULT.
> 
> Hmm, you're right.  Looks like Yanan has been (un)lucky when trying out
> this patch! :(
> 
> Yanan, can you double check that you can reproduce the issue with an
> unpatched kernel?  I will work on a kvm-unit-tests testcase.
Hi Paolo,
Yes, I can still reproduce it. In the latest acceptance testing, which I
finished this afternoon, 7 cases failed because of this problem (all with
win2012.r2 guests). And with the scratch build provided in bz 1493501, I
repeated the test 30 times and it was OK. Thanks!
Best Wishes
Yanan Fu
> 
> Paolo
> 
> > Regards,
> > Wanpeng Li
> > 
> >>
> >> Thanks to Eduardo Habkost for helping in the debugging of this issue.
> >>
> >> Reported-by: Yanan Fu <yfu@xxxxxxxxxx>
> >> Cc: Eduardo Habkost <ehabkost@xxxxxxxxxx>
> >> Cc: stable@xxxxxxxxxxxxxxx
> >> Signed-off-by: Paolo Bonzini <pbonzini@xxxxxxxxxx>
> >> ---
> >>  arch/x86/kvm/x86.c | 2 ++
> >>  1 file changed, 2 insertions(+)
> >>
> >> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> >> index 34c85aa2e2d1..6dbed9022797 100644
> >> --- a/arch/x86/kvm/x86.c
> >> +++ b/arch/x86/kvm/x86.c
> >> @@ -5722,6 +5722,8 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu,
> >>                         if (reexecute_instruction(vcpu, cr2, write_fault_to_spt,
> >>                                                 emulation_type))
> >>                                 return EMULATE_DONE;
> >> +                       if (ctxt->have_exception && inject_emulated_exception(vcpu))
> >> +                               return EMULATE_DONE;
> >>                         if (emulation_type & EMULTYPE_SKIP)
> >>                                 return EMULATE_FAIL;
> >>                         return handle_emulation_failure(vcpu);
> >> --
> >> 1.8.3.1
> >>
> 
>
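
For readers following the thread, the decode-failure path that the patch touches can be modelled roughly as below. This is only a simplified sketch, not kernel code: every name prefixed with sketch_ or SKETCH_ is hypothetical, the re-execution decision is reduced to a boolean, and exception injection is assumed to always succeed. It illustrates Wanpeng's point that the added check has an effect only when the decode step has actually set have_exception.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the emulator's outcomes; not the kernel's definitions. */
enum sketch_result {
    SKETCH_DONE,   /* guest resumes: instruction re-executed or exception injected */
    SKETCH_FAIL,   /* failure while the caller only wanted to skip the instruction */
    SKETCH_FATAL   /* reported as an emulation failure */
};

/* Hypothetical, reduced emulation context. */
struct sketch_ctxt {
    bool have_exception;   /* true only if the decoder recorded a pending fault */
};

/*
 * Models the order of checks after instruction decode has failed.
 * Assumption: injecting the recorded exception always succeeds.
 */
enum sketch_result sketch_decode_failed(const struct sketch_ctxt *ctxt,
                                        bool can_reexecute, bool skip)
{
    if (can_reexecute)
        return SKETCH_DONE;
    if (ctxt->have_exception)      /* the check added by the patch */
        return SKETCH_DONE;
    if (skip)
        return SKETCH_FAIL;
    return SKETCH_FATAL;
}
```

In this model, an MMIO instruction (can_reexecute == false) whose fetch faulted still ends in SKETCH_FATAL unless have_exception was set beforehand, which is the gap pointed out in the review above.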