Re: [PATCH 5/8] KVM: SVM: Re-inject INT3/INTO instead of retrying the instruction

From: Maciej S. Szmigiero
Date: Wed Apr 06 2022 - 17:28:07 EST


On 6.04.2022 21:48, Sean Christopherson wrote:
On Wed, Apr 06, 2022, Maciej S. Szmigiero wrote:
On 6.04.2022 19:10, Sean Christopherson wrote:
On Wed, Apr 06, 2022, Maciej S. Szmigiero wrote:
And what if it's L0 that is trying to inject a NMI into L2?
In this case is_guest_mode() is true, but the full NMI injection machinery
should be used.

Gah, you're right, I got misled by a benign bug in nested_vmx_l1_wants_exit() and
was thinking that NMIs always exit. The "L1 wants" part should be conditioned on
NMI exiting being enabled. It's benign because KVM always wants "real" NMIs, and
so the path is never encountered.

@@ -5980,7 +6005,7 @@ static bool nested_vmx_l1_wants_exit(struct kvm_vcpu *vcpu,
switch ((u16)exit_reason.basic) {
case EXIT_REASON_EXCEPTION_NMI:
intr_info = vmx_get_intr_info(vcpu);
- if (is_nmi(intr_info))
+ if (is_nmi(intr_info) && nested_cpu_has_nmi_exiting(vmcs12))
return true;
else if (is_page_fault(intr_info))
return true;


I guess you mean "benign" when having KVM as L1, since other hypervisors may
let their L2s handle NMIs themselves.

It is also incorrect to block L1 -> L2 NMI injection because either L1
or L2 is currently under NMI blocking: the first case is obvious,
the second because it's L1 that is supposed to take care of proper NMI
blocking for L2 when injecting an NMI there.

Yep, but I don't think there's a bug here. At least not for nVMX.

I agree this scenario should currently work (including on nSVM) - mentioned
it just as a constraint on solution space.

With the code in my previous patch set I planned to use
exit_during_event_injection() to detect such case, but if we implement
VMCB12 EVENTINJ parsing we can simply add a flag that the relevant event
comes from L1, so its normal injection side-effects should be skipped.

Do we still need a flag based on the above? Honest question... I've been staring
at all this for the better part of an hour and may have lost track of things.

If checking just is_guest_mode() is not enough due to reasons I described
above then we need to somehow determine in the NMI / IRQ injection handler
whether the event to be injected into L2 comes from L0 or L1.
For this (assuming we do VMCB12 EVENTINJ parsing) I think we need an extra flag.

Yes :-( And I believe the extra flag would need to be handled by KVM_{G,S}ET_VCPU_EVENTS.


Another option for saving and restoring a VM would be to add it to
KVM_{GET,SET}_NESTED_STATE somewhere (maybe as a part of the saved VMCB12
control area?).

Thanks,
Maciej