Re: [PATCH v9 7/7] KVM: X86: Add user-space access interface for CET MSRs

From: Sean Christopherson
Date: Wed Mar 04 2020 - 10:45:53 EST


On Wed, Mar 04, 2020 at 11:18:15PM +0800, Yang Weijiang wrote:
> On Tue, Mar 03, 2020 at 02:28:27PM -0800, Sean Christopherson wrote:
> > > @@ -1886,6 +1976,26 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> > > else
> > > msr_info->data = vmx->pt_desc.guest.addr_a[index / 2];
> > > break;
> > > + case MSR_IA32_S_CET:
> > > + if (!cet_ctl_access_allowed(vcpu, msr_info))
> > > + return 1;
> > > + msr_info->data = vmcs_readl(GUEST_S_CET);
> > > + break;
> > > + case MSR_IA32_INT_SSP_TAB:
> > > + if (!cet_ssp_access_allowed(vcpu, msr_info))
> > > + return 1;
> > > + msr_info->data = vmcs_readl(GUEST_INTR_SSP_TABLE);
> > > + break;
> > > + case MSR_IA32_U_CET:
> > > + if (!cet_ctl_access_allowed(vcpu, msr_info))
> > > + return 1;
> > > + rdmsrl(MSR_IA32_U_CET, msr_info->data);
> > > + break;
> > > + case MSR_IA32_PL0_SSP ... MSR_IA32_PL3_SSP:
> > > + if (!cet_ssp_access_allowed(vcpu, msr_info))
> > > + return 1;
> > > + rdmsrl(msr_info->index, msr_info->data);
> >
> > Ugh, thought of another problem. If a SoftIRQ runs after an IRQ it can
> > load the kernel FPU state. So for all the XSAVES MSRs we'll need a helper
> > similar to vmx_write_guest_kernel_gs_base(), except XSAVES has to be even
> > more restrictive and disable IRQs entirely. E.g.
> >
> > static void vmx_get_xsave_msr(struct msr_data *msr_info)
> > {
> > local_irq_disable();
> > if (test_thread_flag(TIF_NEED_FPU_LOAD))
> > switch_fpu_return();
> > rdmsrl(msr_info->index, msr_info->data);
> > local_irq_enable();
> In this case, would SoftIRQ destroy vcpu->arch.guest.fpu states which
> had been restored to XSAVES MSRs that we were accessing?

Doing kernel_fpu_begin() from a softirq would swap guest.fpu out of the
CPUs registers. It sets TIF_NEED_FPU_LOAD to mark the tasks has needing to
reload its FPU state prior to returning to userspace. So it doesn't
destroy it per se. The result is that KVM would read/write the CET MSRs
after they're loaded from the kernel's FPU state instead of reading the
MSRs loaded from the guest's FPU state.

> So should we restore
> guest.fpu or? In previous patch, we have restored guest.fpu before
> access the XSAVES MSRs.

There are three different FPU states:

- kernel
- userspace
- guest

RDMSR/WRMSR for CET MSRs need to run while the guest.fpu state is loaded
into the CPU registers[1]. At the beginning of the syscall from userspace,
i.e. the vCPU ioctl(), the task's FPU state[2] holds userspace FPU state.
Patch 6/7 swaps out the userspace state and loads the guest state.

But, if a softirq runs between kvm_load_guest_fpu() and now, and executes
kernel_fpu_begin(), it will swap the guest state (out of CPU registers)
and load the kernel state (into PCU registers). The actual RDMSR/WRMSR
needs to ensure the guest state is still loaded by checking and handling
TIF_NEED_FPU_LOAD.

[1] An alternative to doing switch_fpu_return() on TIF_NEED_FPU_LOAD would
be to calculate the offset into the xsave and read/write directly
to/from memory. But IMO that's unnecessary complexity as the guest's
fpu state still needs to be reloaded before re-entering the guest, e.g.
if vmx_{g,s}et_msr() is invoked on {RD,WR}MSR intercept, while loading
or saving MSR state from userspace isn't a hot path.

[2] I worded this to say "task's FPU state" because it's also possible the
CPU registers hold kernel state at the beginning of the vCPU ioctl(),
e.g. because of softirq.