Re: [PATCH v2 2/2] KVM: X86: Fix loss of exception which has not yet injected

From: Paolo Bonzini
Date: Wed Aug 23 2017 - 08:27:42 EST


On 23/08/2017 12:23, Wanpeng Li wrote:
> @@ -6341,7 +6345,8 @@ static int inject_pending_event(struct kvm_vcpu *vcpu, bool req_int_win)
> int r;
>
> /* try to reinject previous events if any */
> - if (vcpu->arch.exception.pending) {
> + if (vcpu->arch.exception.pending ||
> + vcpu->arch.exception.injected) {
> trace_kvm_inj_exception(vcpu->arch.exception.nr,
> vcpu->arch.exception.has_error_code,
> vcpu->arch.exception.error_code);
> @@ -6357,6 +6362,11 @@ static int inject_pending_event(struct kvm_vcpu *vcpu, bool req_int_win)
> }
>
> r = kvm_x86_ops->queue_exception(vcpu);
> + if (r == 0 && vcpu->arch.exception.pending &&
> + !vcpu->arch.exception.injected) {
> + vcpu->arch.exception.pending = false;
> + vcpu->arch.exception.injected = true;
> + }
> return r;
> }
>

There are some changes needed here:

- this "if" should check only .injected and call
kvm_x86_ops->queue_exception. The "if" for .pending, which handles
rflags/dr7, assigns false to .pending and true to .injected, and also
calls kvm_x86_ops->queue_exception, should be after the call to
check_nested_events.

- in the call to inject_pending_event, the "else" should have a

WARN_ON(vcpu->arch.exception.pending);

just for completeness.


Also, nested_vmx_check_exception has to be moved from
vmx_queue_exception to vmx_check_nested_events, so that
nested_run_pending is checked correctly. Something like this:

if (vcpu->arch.exception.pending &&
nested_vmx_check_exception(vcpu, &exit_qual)) {
if (vmx->nested.nested_run_pending)
return -EBUSY;

nested_vmx_inject_exception_vmexit(vcpu, exit_qual);
return 0;
}

Maybe you can have:

- patch 1 which I'll apply now

- patch 2 is the same as this one, with only the above changes to
inject_pending_event and the WARN_ON.

- patch 3 moves the nested_vmx_inject_exception_vmexit call from
nested_vmx_check_exception to vmx_queue_exception.

- patch 4 moves the code to vmx_check_nested_events and add the
nested_run_pending check, fixing the original bug.

As to GET/SET_VCPU_EVENTS, for now I would not do any change. Instead,
we can do

/*
* FIXME: pass injected and pending separately. This is only
* needed for nested virtualization, whose state cannot be
* migrated yet. For now we combine them just in case.
*/
events->exception.injected =
(vcpu->arch.exception.pending ||
vcpu->arch.exception.injected) &&
!kvm_exception_is_soft(vcpu->arch.exception.nr);
}

and set_vcpu_events can just set .injected = false. Separating the two
will need a separate capability and KVM_ENABLE_CAP.

Paolo

>
> diff --git a/tools/arch/x86/include/uapi/asm/kvm.h b/tools/arch/x86/include/uapi/asm/kvm.h
> index c2824d0..7c55916 100644
> --- a/tools/arch/x86/include/uapi/asm/kvm.h
> +++ b/tools/arch/x86/include/uapi/asm/kvm.h
> @@ -296,9 +296,9 @@ struct kvm_reinject_control {
> struct kvm_vcpu_events {
> struct {
> __u8 injected;
> + __u8 pending;
> __u8 nr;
> __u8 has_error_code;
> - __u8 pad;
> __u32 error_code;
> } exception;
> struct {

I think you don't need t