Re: Virt Call depth tracking mitigation

From: Thomas Gleixner
Date: Tue Jul 19 2022 - 10:27:29 EST


On Tue, Jul 19 2022 at 10:24, Andrew Cooper wrote:
> On 17/07/2022 00:17, Thomas Gleixner wrote:
>> As IBRS is a performance horror show, Peter Zijlstra and I revisited the
>> call depth tracking approach and implemented it in a way which is hopefully
>> more palatable and avoids the downsides of the original attempt.
>>
>> We both unsurprisingly hate the result with a passion...
>
> And I hate to add more problems, but here we go.
>
> Under virt, it's not just SMIs which might run behind your back.
> Regular interrupts/etc can probably be hand-waved away in the same way
> that SMIs are.

You mean host side interrupts, right?

> Hypercalls however are a different matter.
>
> Xen and HyperV both have hypercall pages, where the hypervisor provides
> some executable code for the guest kernel to use.
>
> Under the current scheme, the calls into the hypercall pages get
> accounted, as objtool can see them, but the rets don't.  This imbalance
> is exacerbated because some hypercalls are called in loops.

Bah.
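
To make the imbalance concrete, here is a toy model, for illustration
only. It assumes a plain signed counter that goes up on every accounted
call and down on every accounted ret. The struct and function names are
made up, and the real accounting in the series is not a simple counter
like this.

```c
#include <assert.h>

/* Hypothetical model, not kernel code. */
struct depth_model {
	int tracked;	/* depth as the accounting sees it */
	int actual;	/* real nesting depth */
};

/* A call which objtool can see: both views advance. */
static void accounted_call(struct depth_model *m)
{
	m->tracked++;
	m->actual++;
}

/* A ret which goes through the return thunk: both views unwind. */
static void accounted_ret(struct depth_model *m)
{
	m->tracked--;
	m->actual--;
}

/* A ret inside the hypercall page: it executes, but is never accounted. */
static void unaccounted_ret(struct depth_model *m)
{
	m->actual--;
}

/*
 * One kernel function issues n hypercalls in a loop and then returns.
 * Returns how far the tracked depth is off from the actual depth.
 */
static int hypercall_loop_drift(int n)
{
	struct depth_model m = { 0, 0 };
	int i;

	assert(n >= 0);

	accounted_call(&m);		/* enter the function with the loop */
	for (i = 0; i < n; i++) {
		accounted_call(&m);	/* call into the hypercall page */
		unaccounted_ret(&m);	/* ret from the hypercall page */
	}
	accounted_ret(&m);		/* leave the function */

	return m.tracked - m.actual;
}
```

In this model the drift after the loop is exactly n, one per hypercall,
which is why loops make it worse.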

> Worse however, it opens a hole where branch history is calculable and
> the ret can reliably underflow.  This occurs when there's a minimal call
> depth in Linux to get to the hypercall, and then a call depth of >16 in
> the hypervisor.
>
> The only variable in these cases is how much user control there is of
> the registers, and I for one am not feeling lucky in the face of the current
> research.
>
> The only solution I see here is for Linux to ret-thunk the hypercall
> page too.  Under Xen, the hypercall page is mutable by the guest and
> there is room to turn every ret into a jmp, but obviously none of this
> is covered by any formal ABI, and this probably needs more careful
> consideration than the short time I've put towards it.

Well, that makes the guest side "safe", but isn't a hypercall with a call
depth > 16 already underflowing in the hypervisor code before it returns
to the guest?
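
A rough sketch of what I mean, as a toy model and nothing more. It
assumes a 16 entry RSB where a call adds an entry (overwriting the
oldest one when full) and a ret consumes one, and it ignores whatever
the hypervisor does to the RSB on VM entry and exit. All names are
hypothetical.

```c
#include <assert.h>

#define RSB_ENTRIES	16	/* assumed capacity, matching the ">16" above */

/* Hypothetical model, not kernel or hypervisor code. */
struct rsb_model {
	int valid;	/* entries currently usable for ret prediction */
	int underflows;	/* rets executed with no valid entry left */
};

struct underflow_count {
	int hypervisor;	/* underflowing rets inside the hypervisor */
	int guest;	/* underflowing rets after return to the guest */
};

static void rsb_call(struct rsb_model *r)
{
	/* When the RSB is full the oldest entry is lost, the count stays. */
	if (r->valid < RSB_ENTRIES)
		r->valid++;
}

static void rsb_ret(struct rsb_model *r)
{
	if (r->valid > 0)
		r->valid--;
	else
		r->underflows++;
}

/*
 * The guest nests guest_depth calls to reach the hypercall, the
 * hypervisor nests hv_depth calls and unwinds them, then the guest
 * unwinds its own calls.
 */
static struct underflow_count hypercall_underflows(int guest_depth, int hv_depth)
{
	struct rsb_model r = { 0, 0 };
	struct underflow_count out;
	int i;

	assert(guest_depth >= 0 && hv_depth >= 0);

	for (i = 0; i < guest_depth; i++)
		rsb_call(&r);
	for (i = 0; i < hv_depth; i++)
		rsb_call(&r);
	for (i = 0; i < hv_depth; i++)
		rsb_ret(&r);
	out.hypervisor = r.underflows;

	for (i = 0; i < guest_depth; i++)
		rsb_ret(&r);
	out.guest = r.underflows - out.hypervisor;

	return out;
}
```

With guest_depth = 2 and hv_depth = 20 this model reports 4 underflowing
rets in the hypervisor and 2 more in the guest. With hv_depth = 10 it
reports none on either side.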

> That said, after a return from the hypervisor, Linux has no idea what
> state the RSB is in, so the only safe course of action is to re-stuff.

Indeed.

Further proof of my claim that virt creates more problems than it
solves.

Thanks,

tglx