Re: [PATCH v3 18/62] KVM: SVM: Disable (x2)AVIC IPI virtualization if CPU has erratum #1235
From: Naveen N Rao
Date: Mon Jun 23 2025 - 10:16:05 EST
On Wed, Jun 11, 2025 at 03:45:21PM -0700, Sean Christopherson wrote:
> From: Maxim Levitsky <mlevitsk@xxxxxxxxxx>
>
> Disable IPI virtualization on AMD Family 17h CPUs (Zen2 and Zen1), as
> hardware doesn't reliably detect changes to the 'IsRunning' bit during ICR
> write emulation, and might fail to VM-Exit on the sending vCPU, if
> IsRunning was recently cleared.
>
> The absence of the VM-Exit leads to KVM not waking (or triggering nested
> VM-Exit) of the target vCPU(s) of the IPI, which can lead to hung vCPUs,
  ^^^^^^^^^^^
  VM-Exit of)
> unbounded delays in L2 execution, etc.
>
> To workaround the erratum, simply disable IPI virtualization, which
> prevents KVM from setting IsRunning and thus eliminates the race where
> hardware sees a stale IsRunning=1. As a result, all ICR writes (except
> when "Self" shorthand is used) will VM-Exit and therefore be correctly
> emulated by KVM.
>
> Disabling IPI virtualization does carry a performance penalty, but
> benchmarking shows that enabling AVIC without IPI virtualization is still
> much better than not using AVIC at all, because AVIC still accelerates
> posted interrupts and the receiving end of the IPIs.
>
> Note, when virtualizaing Self-IPIs, the CPU skips reading the physical ID
             ^^^^^^^^^^^^^
             virtualizing
> table and updates the vIRR directly (because the vCPU is by definition
> actively running), i.e. Self-IPI isn't susceptible to the erratum *and*
> is still accelerated by hardware.
>
> Signed-off-by: Maxim Levitsky <mlevitsk@xxxxxxxxxx>
> [sean: rebase, massage changelog, disallow user override]
> Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx>
> ---
> arch/x86/kvm/svm/avic.c | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
> index 48c737e1200a..bf8b59556373 100644
> --- a/arch/x86/kvm/svm/avic.c
> +++ b/arch/x86/kvm/svm/avic.c
> @@ -1187,6 +1187,14 @@ bool avic_hardware_setup(void)
>  	if (x2avic_enabled)
>  		pr_info("x2AVIC enabled\n");
>
> +	/*
> +	 * Disable IPI virtualization for AMD Family 17h CPUs (Zen1 and Zen2)
> +	 * due to erratum 1235, which results in missed GA log events and thus
                                                        ^^^^^^^^^^^^^
Not sure I understand the reference to GA log events here -- those are
only for device interrupts and not IPIs.
> +	 * missed wake events for blocking vCPUs due to the CPU failing to see
> +	 * a software update to clear IsRunning.
> +	 */
> +	enable_ipiv = enable_ipiv && boot_cpu_data.x86 != 0x17;
> +
Apart from the above, this LGTM.
Acked-by: Naveen N Rao (AMD) <naveen@xxxxxxxxxx>
- Naveen
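
Illustrative note on the family check in the quoted hunk: the sketch below is a user-space approximation, not kernel code. It assumes the family is decoded from CPUID Fn0000_0001 EAX in the usual way (base family plus extended family when the base is 0xF). The helper names cpu_family() and ipiv_allowed() are hypothetical, and the EAX values in the usage note are example signatures chosen for illustration only.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Decode the CPU family from CPUID Fn0000_0001 EAX.
 * Base family is bits 11:8, extended family is bits 27:20; the
 * extended field is only added when the base family is 0xF.
 */
static inline unsigned int cpu_family(uint32_t eax)
{
	unsigned int base = (eax >> 8) & 0xf;
	unsigned int ext = (eax >> 20) & 0xff;

	return base == 0xf ? base + ext : base;
}

/*
 * Hypothetical stand-in for the quoted one-liner: IPI virtualization
 * stays enabled only if it was requested and the CPU is not Family 17h.
 */
static inline bool ipiv_allowed(bool enable_ipiv, unsigned int family)
{
	return enable_ipiv && family != 0x17;
}
```

With an example Family 17h signature such as 0x00870f10, cpu_family() yields 0x17 and ipiv_allowed(true, 0x17) is false; with an example Family 19h signature such as 0x00a20f10, the request is left untouched.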