Re: [PATCH] KVM: VMX: Read BNDCFGS if not from_vmentry

From: Sean Christopherson
Date: Thu May 19 2022 - 13:59:59 EST


On Thu, Apr 21, 2022, Lei Wang wrote:
> In the migration case, if nested state is set after MSR state, the value
> needs to come from the current MSR value.
>
> Signed-off-by: Lei Wang <lei4.wang@xxxxxxxxx>
> Reported-by: Sean Christopherson <seanjc@xxxxxxxxxx>
> ---
> arch/x86/kvm/vmx/nested.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index f18744f7ff82..58a1fa7defc9 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -3381,7 +3381,8 @@ enum nvmx_vmentry_status nested_vmx_enter_non_root_mode(struct kvm_vcpu *vcpu,
> if (!(vmcs12->vm_entry_controls & VM_ENTRY_LOAD_DEBUG_CONTROLS))
> vmx->nested.vmcs01_debugctl = vmcs_read64(GUEST_IA32_DEBUGCTL);
> if (kvm_mpx_supported() &&
> - !(vmcs12->vm_entry_controls & VM_ENTRY_LOAD_BNDCFGS))
> + (!from_vmentry ||

Gah, my bad, this isn't correct either. The minor issue is that it should check
vmx->nested.nested_run_pending, not just from_vmentry. If nested state is restored
and a VM-Entry is pending, then the MSRs that were saved+restored were L1's MSRs,
not L2's MSRs.

That won't cause problems because the consumption correctly checks nested_run_pending;
it's just confusing and results in an unnecessary VMREAD.

But that's a moot point because vmcs01 will not hold the correct value in the SMM
case. Luckily, BNDCFGS is easy to handle because it's unconditionally saved on
VM-Exit, which means that vmcs12 is guaranteed to hold the correct value for both
SMM and state restore (without pending entry) because the pseudo-VM-Exit for both
will always save vmcs02's value into vmcs12.
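
To make the BNDCFGS reasoning above concrete, here is a rough toy model of which
value L2 should end up with. This is only a sketch, not KVM code: the enum, the
function name, and both parameters are made up for illustration, and the real
logic lives in the vmcs02 preparation path.

```c
#include <stdbool.h>

/* Hypothetical names, for illustration only; this is not KVM's actual code. */
enum bndcfgs_source {
	BNDCFGS_FROM_VMCS12,	/* L1's requested value, or the value saved on (pseudo-)VM-Exit */
	BNDCFGS_FROM_VMCS01,	/* L1's current value, inherited by L2 */
};

static enum bndcfgs_source pick_bndcfgs_source(bool nested_run_pending,
					       bool entry_loads_bndcfgs)
{
	/*
	 * No VM-Entry pending: L2 was already running, and the pseudo-VM-Exit
	 * for SMM or state save/restore wrote vmcs02's value into vmcs12.
	 */
	if (!nested_run_pending)
		return BNDCFGS_FROM_VMCS12;

	/* Real VM-Entry with the "load BNDCFGS" control set: use vmcs12's value. */
	if (entry_loads_bndcfgs)
		return BNDCFGS_FROM_VMCS12;

	/* Real VM-Entry without the control: L2 inherits L1's value. */
	return BNDCFGS_FROM_VMCS01;
}
```

The only case where vmcs01's value matters in this model is a genuinely pending
VM-Entry that doesn't load BNDCFGS, which is why reading it for !from_vmentry
buys nothing.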

GUEST_IA32_DEBUGCTL is a much bigger pain because it's conditionally saved on
exit. I think the least awful approach would be to save L2's value into
vmcs01_debugctl prior to the forced exit in vmx_enter_smm(), but that will require
more changes to the state restore flow. Grr.
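
A toy model of why the conditional save hurts, and of the "stash before the
forced exit" idea. Again just a sketch under assumptions: the struct, its
fields, and the toy_* helpers are invented here, stashed_debugctl merely stands
in for vmcs01_debugctl, and the state restore side is not modeled at all.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy state; none of these names exist in KVM. */
struct toy_nested {
	uint64_t vmcs02_debugctl;	/* L2's live value */
	uint64_t vmcs12_debugctl;	/* what L1 sees in vmcs12 */
	bool exit_saves_debugctl;	/* models the "save debug controls" VM-Exit control */
	uint64_t stashed_debugctl;	/* stands in for vmcs01_debugctl */
};

/* Forced exit: DEBUGCTL reaches vmcs12 only if the exit control asks for it. */
static void toy_forced_exit(struct toy_nested *n)
{
	if (n->exit_saves_debugctl)
		n->vmcs12_debugctl = n->vmcs02_debugctl;
	n->vmcs02_debugctl = 0;	/* model: L2's live value is gone after the exit */
}

/* Stash L2's value before forcing the exit so it survives either way. */
static void toy_enter_smm(struct toy_nested *n)
{
	n->stashed_debugctl = n->vmcs02_debugctl;
	toy_forced_exit(n);
}

/* On the way back into L2, the stashed value is the one to restore. */
static uint64_t toy_leave_smm(const struct toy_nested *n)
{
	return n->stashed_debugctl;
}
```

With exit_saves_debugctl clear, vmcs12 keeps its stale value after the forced
exit, so without the stash there would be nowhere to recover L2's DEBUGCTL from.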

I'll send patches for both BNDCFGS and IA32_DEBUGCTL, and will take a careful look
at the PKS stuff too. I'm guessing it should follow the BNDCFGS logic.

Sorry for the runaround.