[PATCHv4 03/11] KVM: nSVM: Raise event on nested VM exit if L1 doesn't intercept IRQs

From: Santosh Shukla
Date: Mon Feb 27 2023 - 03:51:08 EST


From: Maxim Levitsky <mlevitsk@xxxxxxxxxx>

If L1 doesn't intercept interrupts, then KVM will use vmcb02's
V_IRQ for L1 (to detect an interrupt window).

In this case on nested VM exit KVM might need to copy the V_IRQ bit
from the vmcb02 to the vmcb01, to continue waiting for the
interrupt window.

To keep it simple, just raise the KVM_REQ_EVENT request; processing
that request will re-enable the interrupt window if needed.

Note that this is a theoretical bug because KVM already raises the
KVM_REQ_EVENT request on each nested VM exit: the nested VM exit
resets RFLAGS, and kvm_set_rflags() raises the KVM_REQ_EVENT request
in response.

However, raising this request explicitly, together with
documenting why this is needed, is still preferred.

Signed-off-by: Maxim Levitsky <mlevitsk@xxxxxxxxxx>
[reworded description as per Sean's v2 comment]
Signed-off-by: Santosh Shukla <Santosh.Shukla@xxxxxxx>
---
v3:
Reworded commit description per Sean's v2 comment:
https://lore.kernel.org/all/Y9RypRsfpLteK51v@xxxxxxxxxx/
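
For reviewers, the intended flow can be pictured with a small userspace
model. This is only a sketch, not kernel code: struct toy_vcpu and the
toy_* helpers are hypothetical stand-ins for the vCPU state,
nested_exit_on_intr(), and KVM_REQ_EVENT processing, and it assumes the
pending L1 interrupt still cannot be injected when the request is
processed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the relevant state; not the kernel's structures. */
struct toy_vcpu {
	bool l1_intercepts_intr;  /* models nested_exit_on_intr() */
	bool vmcb01_v_irq;        /* interrupt window armed in vmcb01 */
	bool vmcb02_v_irq;        /* interrupt window armed in vmcb02 */
	bool irq_pending_for_l1;  /* an L1 interrupt is waiting for a window */
	bool req_event;           /* models a pending KVM_REQ_EVENT */
};

/* Models the nested VM exit: vmcb02 is left behind, vmcb01 becomes active. */
static void toy_nested_vmexit(struct toy_vcpu *v)
{
	assert(v != NULL);

	/* The window armed in vmcb02 is not carried over by the switch. */
	v->vmcb02_v_irq = false;

	/* Without the intercept, vmcb02's V_IRQ was serving L1, so re-evaluate. */
	if (!v->l1_intercepts_intr)
		v->req_event = true;
}

/* Models request processing before the next VM entry. */
static void toy_process_events(struct toy_vcpu *v)
{
	assert(v != NULL);

	if (!v->req_event)
		return;
	v->req_event = false;

	/* Assumption: the interrupt is still blocked, so re-arm the window. */
	if (v->irq_pending_for_l1)
		v->vmcb01_v_irq = true;
}
```

In this model, calling toy_nested_vmexit() and then toy_process_events()
on a vCPU with a pending L1 interrupt and no L1 interrupt intercept
leaves the window armed in vmcb01, which is the "implicit copy" the
new comment describes.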

arch/x86/kvm/svm/nested.c | 25 +++++++++++++++++++++++++
1 file changed, 25 insertions(+)

diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index 107258ed46ee..74e9e9e76d77 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -1025,6 +1025,31 @@ int nested_svm_vmexit(struct vcpu_svm *svm)

svm_switch_vmcb(svm, &svm->vmcb01);

+ /* Note about synchronizing some of the int_ctl bits from vmcb02 to vmcb01:
+ *
+ * V_IRQ, V_IRQ_VECTOR, V_INTR_PRIO_MASK, V_IGN_TPR:
+ * If L1 doesn't intercept interrupts, then
+ * (even if L1 does use virtual interrupt masking),
+ * KVM will use vmcb02's V_IRQ to detect the interrupt window.
+ *
+ * In this case, KVM raises KVM_REQ_EVENT to ensure that the interrupt
+ * window is not lost; this implicitly copies V_IRQ from vmcb02 to vmcb01.
+ *
+ * V_TPR:
+ * If L1 doesn't use virtual interrupt masking, then L1's vTPR
+ * is stored in vmcb02, but its value doesn't need to be copied
+ * from/to vmcb01 because it is copied from/to the APIC's TPR register
+ * on each VM entry/exit.
+ *
+ * V_GIF:
+ * If nested vGIF is not used, KVM uses vmcb02's V_GIF for L1's
+ * V_GIF. However, L1's vGIF is reset to false on each VM exit, thus
+ * there is no need to copy it from vmcb02 to vmcb01.
+ */
+
+ if (!nested_exit_on_intr(svm))
+ kvm_make_request(KVM_REQ_EVENT, &svm->vcpu);
+
if (unlikely(svm->lbrv_enabled && (svm->nested.ctl.virt_ext & LBR_CTL_ENABLE_MASK))) {
svm_copy_lbrs(vmcb12, vmcb02);
svm_update_lbrv(vcpu);
--
2.25.1