Re: [PATCH] KVM: x86: VMX: Make smaller physical guest address space support user-configurable

From: Jim Mattson
Date: Thu Sep 03 2020 - 17:27:13 EST


On Thu, Sep 3, 2020 at 1:02 PM Paolo Bonzini <pbonzini@xxxxxxxxxx> wrote:
>
> On 03/09/20 20:32, Jim Mattson wrote:
> >> [Checking writes to CR3] would be way too slow. Even the current
> >> trapping of present #PF can introduce some slowdown depending on the
> >> workload.
> >
> > Yes, I was concerned about that...which is why I would not want to
> > enable pedantic mode. But if you're going to be pedantic, why go
> > halfway?
>
> Because I am not sure about any guest, even KVM, caring about setting
> bits 51:46 in CR3.
>
> >>> Does the typical guest care about whether or not setting any of the
> >>> bits 51:46 in a PFN results in a fault?
> >>
> >> At least KVM with shadow pages does, which is a bit niche but it shows
> >> that you cannot really rely on no one doing it. As you guessed, the
> >> main usage of the feature is for machines with 5-level page tables where
> >> there are no reserved bits; emulating smaller MAXPHYADDR allows
> >> migrating VMs from 4-level page-table hosts.
> >>
> >> Enabling per-VM would not be particularly useful IMO because if you want
> >> to disable this code you can just set host MAXPHYADDR = guest
> >> MAXPHYADDR, which should be the common case unless you want to do that
> >> kind of Skylake to Icelake (or similar) migration.
> >
> > I expect that it will be quite common to run 46-bit wide legacy VMs on
> > Ice Lake hardware, as Ice Lake machines start showing up in
> > heterogeneous data centers.
>
> If you'll be okay with running _all_ 46-bit wide legacy VMs without
> MAXPHYADDR emulation, that's what this patch is for. If you'll be okay
> with running _only_ 46-bit wide VMs without emulation, you still don't
> need special enabling per-VM beyond the automatic one based on
> CPUID[0x8000_0008]. Do you think you'll need to enable it for some
> special 46-bit VMs?

Yes. From what you've said above, we would only want to enable this
for the niche case of 46-bit KVM guests using shadow paging. I would
expect that to be a very small number of VMs. :-)
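
To make the bit arithmetic above concrete, here is a minimal sketch of which physical-address bits become reserved for a 46-bit guest, and when emulation would be needed at all. The helper names (bit_range_mask, guest_rsvd_pa_bits, needs_maxphyaddr_emulation, pa_sets_rsvd_bits) are hypothetical and for illustration only; this is not KVM's actual code. It assumes the architectural limit of 52 physical-address bits.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: mask with bits [hi:lo] set, inclusive (hi <= 63). */
static inline uint64_t bit_range_mask(unsigned lo, unsigned hi)
{
    uint64_t upper = (hi >= 63) ? ~0ULL : ((1ULL << (hi + 1)) - 1);
    return upper & ~((1ULL << lo) - 1);
}

/*
 * Physical-address bits the guest should see as reserved:
 * everything from guest MAXPHYADDR up to bit 51.
 * For a 46-bit guest this is bits 51:46.
 */
static inline uint64_t guest_rsvd_pa_bits(unsigned guest_maxphyaddr)
{
    if (guest_maxphyaddr >= 52)
        return 0;
    return bit_range_mask(guest_maxphyaddr, 51);
}

/*
 * Assumption: emulation only matters when the guest's MAXPHYADDR is
 * smaller than the host's. With equal widths the hardware already
 * faults on the same bits.
 */
static inline bool needs_maxphyaddr_emulation(unsigned guest_maxphyaddr,
                                              unsigned host_maxphyaddr)
{
    return guest_maxphyaddr < host_maxphyaddr;
}

/* True if a physical address (e.g. from a PTE or CR3) sets a guest-reserved bit. */
static inline bool pa_sets_rsvd_bits(uint64_t pa, unsigned guest_maxphyaddr)
{
    return (pa & guest_rsvd_pa_bits(guest_maxphyaddr)) != 0;
}
```

In this sketch a 46-bit guest on a 52-bit host needs the extra checks, while a 46-bit guest on a 46-bit host does not, which is the "host MAXPHYADDR = guest MAXPHYADDR" case mentioned in the quoted text.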