Re: [RFC][PATCH 9/9] arch/idle: Change arch_cpu_idle() IRQ behaviour

From: Isaku Yamahata
Date: Tue May 24 2022 - 10:55:46 EST


On Fri, May 20, 2022 at 02:58:19PM +0200,
Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:

> On Fri, May 20, 2022 at 01:13:22PM +0300, Kirill A. Shutemov wrote:
>
> > So you want to call the HLT hypercall with .irq_disabled=false and
> > .do_sti=false, but actual RFLAGS.IF in the guest is 0 and avoid CLI on
> > wake up expecting it to be cleared already, right?
>
> Yep, just like MWAIT can, avoids pointless IF flipping.
>
> > My reading of the spec is "don't do that". But actual behaviour is up to
> > VMM and TDX module implementation. VMM doesn't have access to the guest
> > register file, so it *may* work, I donno.
>
> Yeah, it totally *can* work, but I've no idea if they done the right
> thing.

There are two cases, depending on when the interrupt arrives.

- If the interrupt arrives after the CPU starts executing the VMM (or the TDX
module), the VMM can tell that an interrupt for the vCPU has arrived. The VMM
unblocks vCPU scheduling, and the HLT hypercall returns to the guest.

- If the interrupt arrives and the vCPU recognizes it before the CPU starts
executing the VMM (or the TDX module), the interrupt request is recorded in
vRVI (VMCS.RVI) rather than delivered, because vRFLAGS.IF=0. After that, the
CPU exits from the guest to the VMM because of the HLT hypercall.
Before KVM blocks vCPU scheduling, because irq_disabled=false, TDX KVM checks
whether a deliverable interrupt is pending by issuing a TDX SEAMCALL (the CPU
state is protected, so the VMM can't peek at vRVI and vPPR directly; note that
vRFLAGS.IF is ignored in this check). If the vCPU has a deliverable pending
interrupt, the HLT hypercall returns.
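
To make the two cases concrete, here is a rough C model of the decision
described above. It is only a sketch: struct vcpu_model and every function name
in it are made up for illustration and are not the real KVM or TDX module
interfaces. It also assumes that irq_disabled=true simply skips the pending
check, which is not stated above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model state; not a real KVM or TDX structure. */
struct vcpu_model {
	bool pending_deliverable_irq;	/* what the TDX module would report (vRVI vs. vPPR) */
	bool blocked;			/* true while the VMM keeps the vCPU descheduled */
};

/*
 * Stand-in for the SEAMCALL query. The VMM cannot read vRVI/vPPR itself,
 * so it has to ask. vRFLAGS.IF is deliberately not part of the answer.
 */
bool tdx_query_pending_irq(const struct vcpu_model *v)
{
	assert(v != NULL);
	return v->pending_deliverable_irq;
}

/*
 * Model of the VMM handling the HLT hypercall.
 * Returns true if the hypercall returns to the guest right away,
 * false if the vCPU gets blocked.
 */
bool vmm_handle_hlt(struct vcpu_model *v, bool irq_disabled)
{
	assert(v != NULL);

	/* Second case: the interrupt was already recorded before the exit. */
	if (!irq_disabled && tdx_query_pending_irq(v)) {
		v->blocked = false;
		return true;
	}

	/* Assumption: otherwise the vCPU is blocked until an interrupt shows up. */
	v->blocked = true;
	return false;
}

/* First case: the interrupt arrives while the VMM is already running. */
void vmm_irq_arrived(struct vcpu_model *v)
{
	assert(v != NULL);
	v->pending_deliverable_irq = true;
	v->blocked = false;	/* VMM unblocks vCPU scheduling */
}
```

In this model the guest's IF never appears as an input, which is the point of
the second case: the wakeup decision depends only on irq_disabled and on what
the module reports as pending.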

Anyway, this scenario hasn't been tested yet; I need to test it.
--
Isaku Yamahata <isaku.yamahata@xxxxxxxxx>