Re: [PATCH v3] x86/speculation, KVM: only IBPB for switch_mm_always_ibpb on vCPU load

From: Sean Christopherson
Date: Fri Apr 29 2022 - 18:00:10 EST


On Fri, Apr 29, 2022, Borislav Petkov wrote:
> On Fri, Apr 29, 2022 at 08:29:30PM +0000, Sean Christopherson wrote:
> > That's why there's a bunch of hand-waving.
>
> Well, I'm still not sure what this patch is trying to fix but both your
> latest replies do sound clearer...
>
> > Can you clarify what "this" is? Does "this" mean "this patch", or does it mean
>
> This patch.
>
> > "this IBPB when switching vCPUs"? Because if it means the latter, then I think
> > you're in violent agreement; the IBPB when switching vCPUs is pointless and
> > unnecessary.
>
> Ok, let's concentrate on the bug first - whether a second IBPB - so to
> speak - is needed. Doing some git archeology points to:
>
> 15d45071523d ("KVM/x86: Add IBPB support")
>
> which - and I'm surprised - goes to great lengths to explain what
> those IBPB calls in KVM protect against. From that commit message, for
> example:
>
> " * Mitigate attacks from guest/ring3->host/ring3.
> These would require a IBPB during context switch in host, or after
> VMEXIT."

Except that snippet changelog doesn't actually state what KVM does, it states what
a hypervsior _could_ do to protect the host from the guest via IBPB.

> so with my very limited virt understanding, when you vmexit, you don't
> do switch_mm(), right?

Correct, but KVM also doesn't do IBPB on VM-Exit (or VM-Entry), nor does KVM do
IBPB before exiting to userspace. The IBPB we want to whack is issued only when
KVM is switching vCPUs.

> If so, you need to do a barrier. Regardless of conditional IBPB or not
> as you want to protect the host from a malicious guest.
>
> In general, the whole mitigation strategies are enumerated in
>
> Documentation/admin-guide/hw-vuln/spectre.rst
>
> There's also a "3. VM mitigation" section.
>
> And so on...
>
> Bottomline is this: at the time, we went to great lengths to document
> what the attacks are and how we are protecting against them.

Except that _none_ of that documentation explains why the hell KVM does IBPB when
switching betwen vCPUs. The only item is this snippet from the changelog:

* Mitigate guests from being attacked by other guests.
- This is addressed by issing IBPB when we do a guest switch.

And that's the one that I pointed out in v1 as being flawed/wrong, and how Jon
ended up with this patch.

: But stepping back, why does KVM do its own IBPB in the first place?  The goal is
: to prevent one vCPU from attacking the next vCPU run on the same pCPU.  But unless
: userspace is running multiple VMs in the same process/mm_struct, switching vCPUs,
: i.e. switching tasks, will also switch mm_structs and thus do IPBP via cond_mitigation.
:
: If userspace runs multiple VMs in the same process, enables cond_ipbp, _and_ sets
: TIF_SPEC_IB, then it's being stupid and isn't getting full protection in any case,
: e.g. if userspace is handling an exit-to-userspace condition for two vCPUs from
: different VMs, then the kernel could switch between those two vCPUs' tasks without
: bouncing through KVM and thus without doing KVM's IBPB.
:
: I can kinda see doing this for always_ibpb, e.g. if userspace is unaware of spectre
: and is naively running multiple VMs in the same process.
:
: What am I missing?