Re: [RFC PATCH v3 02/19] KVM: x86: inhibit APICv/AVIC when the guest and/or host changes apic id/base from the defaults.

From: Sean Christopherson
Date: Mon May 23 2022 - 14:05:41 EST


On Mon, May 23, 2022, Maxim Levitsky wrote:
> On Sun, 2022-05-22 at 07:47 -0700, Jim Mattson wrote:
> > On Sun, May 22, 2022 at 2:03 AM Maxim Levitsky <mlevitsk@xxxxxxxxxx> wrote:
> > > On Thu, 2022-05-19 at 16:06 +0000, Sean Christopherson wrote:
> > > > On Wed, Apr 27, 2022, Maxim Levitsky wrote:
> > > > > Neither of these settings should be changed by the guest and it is
> > > > > a burden to support it in the acceleration code, so just inhibit
> > > > > it instead.
> > > > >
> > > > > Also add a boolean 'apic_id_changed' to indicate if apic id ever changed.
> > > > >
> > > > > Signed-off-by: Maxim Levitsky <mlevitsk@xxxxxxxxxx>
> > > > > ---
> > > > > + return;
> > > > > +
> > > > > + pr_warn_once("APIC ID change is unsupported by KVM");
> > > >
> > > > It's supported (modulo x2APIC shenanigans), otherwise KVM wouldn't need to disable
> > > > APICv.
> > >
> > > Here, as I said, it would be nice to see that warning if someone complains.
> > > Fact is that AVIC code was totally broken in this regard, and there are probably more,
> > > so it would be nice to see if anybody complains.
> > >
> > > If you insist, I'll remove this warning.
> >
> > This may be fine for a hobbyist, but it's a terrible API in an
> > enterprise environment. To be honest, I have no way of propagating
> > this warning from /var/log/messages on a particular host to a
> > potentially impacted customer. Worse, if they're not the first
> > impacted customer since the last host reboot, there's no warning to
> > propagate. I suppose I could just tell every later customer, "Your VM
> > was scheduled to run on a host that previously reported, 'APIC ID
> > change is unsupported by KVM.' If you notice any unusual behavior,
> > that might be the reason for it," but that isn't going to inspire
> > confidence. I could schedule a drain and reboot of the host, but that
> > defeats the whole point of the "_once" suffix.
>
> Mostly agree, and I read alrady few discussions about exactly this,
> those warnings are mostly useless, but they are used in the
> cases where we don't have the courage to just exit with KVM_EXIT_INTERNAL_ERROR.
>
> I do not thing though that the warning is completely useless,
> as we often have the kernel log of the target machine when things go wrong,
> so *we* can notice it.
> In other words a kernel warning is mostly useless but better that nothing.

IMO, it's worse than doing nothing. Us developers become desensitized to the
kernel message due to running tests, the existence of these message propagates
the notion that they are a good thing (and we keep rehashing these discussions...),
users may not realize it's a _once() printk and so think they _aren't_ affected
when re-running a workload, etc...

And in this case, "APIC ID change is unsupported by KVM" is partly wrong. KVM
fully models Intel's behavior where the ID change isn't carried across x2APIC
enabling, the only unsupported behavior is that the guest will lose APICv
acceleration.

> About KVM_EXIT_WARNING, this is IMHO a very good idea, probably combined
> with some form of taint flag, which could be read by qemu and then shown
> over hmp/qmp interfaces.