Re: [PATCH 10/19] KVM: SVM: Document that vCPU ID == APIC ID in AVIC kick fastpatch

From: Sean Christopherson
Date: Wed Aug 31 2022 - 12:16:47 EST


On Wed, Aug 31, 2022, Maxim Levitsky wrote:
> On Wed, 2022-08-31 at 00:34 +0000, Sean Christopherson wrote:
> > Document that AVIC is inhibited if any vCPU's APIC ID diverges from its
> > vCPU ID, i.e. that there's no need to check for a destination match in
> > the AVIC kick fast path.
> >
> > Opportunistically tweak comments to remove "guest bug", as that suggests
> > KVM is punting on error handling, which is not the case. Targeting a
> > non-existent vCPU or no vCPUs _may_ be a guest software bug, but whether
> > or not it's a guest bug is irrelevant. Such behavior is architecturally
> > legal and thus needs to faithfully emulated by KVM (and it is).
>
> I don't want to pick a fight,

Please don't hesitate to push back, I would much rather discuss points of contention
than have an ongoing, silent battle through patches.

> but personally these things *are* guest bugs / improper usage of APIC, and I
> don't think it is wrong to document them as such.

If the guest is intentionally exercising an edge case, e.g. KUT or selftests, then
from the guest's perspective its code/behavior isn't buggy.

I completely agree that abusing/aliasing logical IDs is improper usage and arguably
out of spec, but the scenarios here are very much in spec, e.g. a bitmap of '0'
isn't expressly forbidden and both Intel and AMD specs very clearly state that
APICs discard interrupt messages if they are not the destination.

But that's somewhat beside the point, as I have no objection to documenting scenarios
that are out-of-spec or undefined. What I object to is documenting such scenarios as
"guest bugs". If the CPU/APIC/whatever behavior is undefined, then document it
as undefined. Saying "guest bug" doesn't help future readers understand what is
architecturally supposed to happen, whereas a comment like

/*
* Logical IDs are architecturally "required" to be unique, i.e. this is
* technically undefined behavior. Disable the optimized logical map so
* that KVM is consistent with itself, as the non-optimized matching
* logic with accept interrupts on all CPUs with the logical ID.
*/

anchors KVM's behavior to the specs and explains why KVM does XYZ in response to
undefined behavior.

I feel very strongly about "guest bug" because KVM has a history of disregarding
architectural correctness and using a "good enough" approach. Simply stating
"guest bug" makes it all the more difficult to differentiate between KVM handling
architecturally undefined behavior, versus KVM deviating from the spec because
someone decided that KVM's partial emulation/virtualziation was "good enough".