Re: [PATCH] KVM: x86: VMX: Make smaller physical guest address space support user-configurable

From: Jim Mattson
Date: Thu Sep 03 2020 - 14:33:05 EST


On Thu, Sep 3, 2020 at 11:03 AM Paolo Bonzini <pbonzini@xxxxxxxxxx> wrote:
>
> On 03/09/20 19:57, Jim Mattson wrote:
> > On Thu, Sep 3, 2020 at 7:12 AM Mohammed Gamal <mgamal@xxxxxxxxxx> wrote:
> >> This patch exposes allow_smaller_maxphyaddr to the user as a module parameter.
> >>
> >> Since smaller physical address spaces are only supported on VMX, the parameter
> >> is only exposed in the kvm_intel module.
> >> Modifications to VMX page fault and EPT violation handling will depend on whether
> >> that parameter is enabled.
> >>
> >> Also disable support by default, and let the user decide if they want to enable
> >> it.
> >
> > I think a smaller guest physical address width *should* be allowed.
> > However, perhaps the pedantic adherence to the architectural
> > specification could be turned on or off per-VM? And, if we're going to
> > be pedantic, I think we should go all the way and get MOV-to-CR3
> > correct.
>
> That would be way too slow. Even the current trapping of present #PF
> can introduce some slowdown depending on the workload.

Yes, I was concerned about that...which is why I would not want to
enable pedantic mode. But if you're going to be pedantic, why go
halfway?

> > Does the typical guest care about whether or not setting any of the
> > bits 51:46 in a PFN results in a fault?
>
> At least KVM with shadow pages does, which is a bit niche but it shows
> that you cannot really rely on no one doing it. As you guessed, the
> main usage of the feature is for machines with 5-level page tables where
> there are no reserved bits; emulating smaller MAXPHYADDR allows
> migrating VMs from 4-level page-table hosts.
>
> Enabling per-VM would not be particularly useful IMO because if you want
> to disable this code you can just set host MAXPHYADDR = guest
> MAXPHYADDR, which should be the common case unless you want to do that
> kind of Skylake to Icelake (or similar) migration.

I expect that it will be quite common to run 46-bit wide legacy VMs on
Ice Lake hardware, as Ice Lake machines start showing up in
heterogeneous data centers.