Re: [PATCH] kvm/x86: reserve bit KVM_HINTS_PHYS_ADDRESS_SIZE_DATA_VALID

From: Gerd Hoffmann
Date: Fri Sep 09 2022 - 01:02:37 EST


On Thu, Sep 08, 2022 at 02:52:36PM +0000, Sean Christopherson wrote:
> On Thu, Sep 08, 2022, Gerd Hoffmann wrote:
> > The KVM_HINTS_PHYS_ADDRESS_SIZE_DATA_VALID bit hints to the guest
> > that the size of the physical address space as advertised by CPUID
> > leaf 0x80000008 is actually valid and can be used.
> >
> > Unfortunately this is not the case today with qemu. Default behavior is
> > to advertise 40 address bits (which I think comes from the very first x64
> > opteron processors). There are lots of intel desktop processors around
> > which support less than that (36 or 39 depending on age), and when trying
> > to use the full 40 bit address space on those, things go south quickly.
> >
> > This renders the physical address size information effectively useless
> > for guests. This patch paves the way to fix that by adding a hint for
> > the guest so it knows whether the physical address size is usable or
> > not.
> >
> > The plan for qemu is to set the bit when the physical address size is
> > valid. That is the case when qemu is started with the host-phys-bits=on
> > option set for the cpu. Eventually qemu can also flip the default for
> > that option from off to on, but unfortunately that isn't easy for backward
> > compatibility reasons.
> >
> > The plan for the firmware is to check that bit and when it is set just
> > query and use the available physical address space. When the bit is not
> > set, be conservative and try not to exceed 36 bits (aka 64G) address space.
> > The latter is what the firmware does today unconditionally.
> >
> > Signed-off-by: Gerd Hoffmann <kraxel@xxxxxxxxxx>
> > ---
> > arch/x86/include/uapi/asm/kvm_para.h | 3 ++-
> > 1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/x86/include/uapi/asm/kvm_para.h b/arch/x86/include/uapi/asm/kvm_para.h
> > index 6e64b27b2c1e..115bb34413cf 100644
> > --- a/arch/x86/include/uapi/asm/kvm_para.h
> > +++ b/arch/x86/include/uapi/asm/kvm_para.h
> > @@ -37,7 +37,8 @@
> > #define KVM_FEATURE_HC_MAP_GPA_RANGE 16
> > #define KVM_FEATURE_MIGRATION_CONTROL 17
> >
> > -#define KVM_HINTS_REALTIME 0
> > +#define KVM_HINTS_REALTIME 0
> > +#define KVM_HINTS_PHYS_ADDRESS_SIZE_DATA_VALID 1
>
> Why does KVM need to get involved? This is purely a userspace problem.

It doesn't. I only need to reserve a hints bit, and the canonical source
for that happens to live in the kernel. That's why this patch doesn't
touch any actual code ;)

> E.g. why not use QEMU's fw_cfg to communicate this information to the
> guest?

That is indeed the other obvious way to implement this. Given that this
information will be needed in code paths which already do CPUID queries,
using CPUID to transport it looked like the better option to me.
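
For illustration, the firmware-side check could look roughly like the
sketch below. It is untested and makes a few assumptions: the helper
names and FW_FALLBACK_PHYS_BITS are made up for this mail, the caller is
assumed to have already read EDX of the KVM features leaf (0x40000001,
where the hint bits live) and EAX of leaf 0x80000008, and clamping to
the smaller of the advertised size and 36 bits is just one way to read
"be conservative".

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit number proposed in this patch; hint bits are reported in EDX
 * of the KVM features leaf (0x40000001). */
#define KVM_HINTS_PHYS_ADDRESS_SIZE_DATA_VALID 1

/* Hypothetical name: 36 bits (64G), the limit firmware assumes today. */
#define FW_FALLBACK_PHYS_BITS 36

/* Hypothetical helper: true if the host marked the size as valid. */
static bool fw_phys_bits_valid(uint32_t kvm_hints_edx)
{
    return (kvm_hints_edx >> KVM_HINTS_PHYS_ADDRESS_SIZE_DATA_VALID) & 1u;
}

/* Hypothetical helper: number of physical address bits to use.
 * leaf_80000008_eax is EAX of CPUID leaf 0x80000008; bits 7:0 hold
 * the advertised physical address size. */
static unsigned int fw_usable_phys_bits(uint32_t kvm_hints_edx,
                                        uint32_t leaf_80000008_eax)
{
    unsigned int advertised = leaf_80000008_eax & 0xff;

    /* Hint set: trust whatever CPUID advertises. */
    if (fw_phys_bits_valid(kvm_hints_edx))
        return advertised;

    /* Hint clear: stay conservative, never go beyond the fallback. */
    return advertised < FW_FALLBACK_PHYS_BITS ? advertised
                                              : FW_FALLBACK_PHYS_BITS;
}
```

With that, a guest seeing 40 advertised bits and no hint would stick to
36, while the same guest with the hint set would use whatever the host
reports.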

> Defining this flag arguably breaks backwards compatibility for VMMs
> that already accurately advertise MAXPHYADDR. The absence of the flag
> would imply that MAXPHYADDR is invalid, which is not the case.

That is true no matter how we try to transport that information from the
host to the guest (even with fw_cfg, because other hypervisors are
starting to use that interface too).

In practice it is not much of a problem though. The firmware needs to
know the exact platform it runs on anyway to initialize everything
properly, so the logic can easily be restricted to qemu.

take care,
Gerd