Re: [RFC 2/7] KVM: VMX: Expose IA32_PKRS MSR

From: Sean Christopherson
Date: Fri Aug 21 2020 - 23:28:24 EST


On Tue, Aug 18, 2020 at 11:23:47AM -0700, Jim Mattson wrote:
> On Tue, Aug 18, 2020 at 12:28 AM Chenyi Qiang <chenyi.qiang@xxxxxxxxx> wrote:
> >
> >
> >
> > On 8/14/2020 1:31 AM, Jim Mattson wrote:
> > > On Wed, Aug 12, 2020 at 10:42 PM Chenyi Qiang <chenyi.qiang@xxxxxxxxx> wrote:
> > >>
> > >>
> > >>
> > >> On 8/13/2020 5:21 AM, Jim Mattson wrote:
> > >>> On Fri, Aug 7, 2020 at 1:46 AM Chenyi Qiang <chenyi.qiang@xxxxxxxxx> wrote:
> > >>>>
> > >>>> Protection Keys for Supervisor Pages (PKS) uses IA32_PKRS MSR (PKRS) at
> > >>>> index 0x6E1 to allow software to manage supervisor protection key
> > >>>> rights. For performance consideration, PKRS intercept will be disabled
> > >>>> so that the guest can access the PKRS without VM exits.
> > >>>> PKS introduces dedicated control fields in VMCS to switch PKRS, which
> > >>>> only does the retore part. In addition, every VM exit saves PKRS into
> > >>>> the guest-state area in VMCS, while VM enter won't save the host value
> > >>>> due to the expectation that the host won't change the MSR often. Update
> > >>>> the host's value in VMCS manually if the MSR has been changed by the
> > >>>> kernel since the last time the VMCS was run.
> > >>>> The function get_current_pkrs() in arch/x86/mm/pkeys.c exports the
> > >>>> per-cpu variable pkrs_cache to avoid frequent rdmsr of PKRS.
> > >>>>
> > >>>> Signed-off-by: Chenyi Qiang <chenyi.qiang@xxxxxxxxx>
> > >>>> ---
> > >>>
> > >>>> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> > >>>> index 11e4df560018..df2c2e733549 100644
> > >>>> --- a/arch/x86/kvm/vmx/nested.c
> > >>>> +++ b/arch/x86/kvm/vmx/nested.c
> > >>>> @@ -289,6 +289,7 @@ static void vmx_sync_vmcs_host_state(struct vcpu_vmx *vmx,
> > >>>> dest->ds_sel = src->ds_sel;
> > >>>> dest->es_sel = src->es_sel;
> > >>>> #endif
> > >>>> + dest->pkrs = src->pkrs;
> > >>>
> > >>> Why isn't this (and other PKRS code) inside the #ifdef CONFIG_X86_64?
> > >>> PKRS isn't usable outside of long mode, is it?
> > >>>
> > >>
> > >> Yes, I'm also thinking about whether to put all pks code into
> > >> CONFIG_X86_64. The kernel implementation also wrap its pks code inside
> > >> CONFIG_ARCH_HAS_SUPERVISOR_PKEYS which has dependency with CONFIG_X86_64.
> > >> However, maybe this can help when host kernel disable PKS but the guest
> > >> enable it. What do you think about this?
> > >
> > > I see no problem in exposing PKRS to the guest even if the host
> > > doesn't have CONFIG_ARCH_HAS_SUPERVISOR_PKEYS.
> > >
> >
> > Yes, but I would prefer to keep it outside CONFIG_X86_64. PKS code has
> > several code blocks and putting them under x86_64 may end up being a
> > mess. In addition, PKU KVM related code isn't under CONFIG_X86_64 as
> > well. So, is it really necessary to put inside?
>
> I'll let someone who actually cares about the i386 build answer that question.

Ha, I care about the i386 build to the extent that it doesn't break. I
don't care at all shaving cycles/memory for things like this. Given how
long some KVM i386 bugs have gone unnoticed I'm not sure there's anyone that
cares about KVM i386 these days :-)