Re: [PATCH] KVM: x86/mmu: Do not create SPTEs for GFNs that exceed host.MAXPHYADDR

From: Sean Christopherson
Date: Wed May 04 2022 - 10:47:23 EST


On Wed, May 04, 2022, Maxim Levitsky wrote:
> On Tue, 2022-05-03 at 20:30 +0000, Sean Christopherson wrote:
> > Well, I officially give up; I'm out of ideas for how to repro this on my end. To
> > try and narrow the search, maybe try processing "all" possible gfns and see if that
> > makes the leak go away?
> >
> > diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
> > index 7e258cc94152..a354490939ec 100644
> > --- a/arch/x86/kvm/mmu.h
> > +++ b/arch/x86/kvm/mmu.h
> > @@ -84,9 +84,7 @@ static inline gfn_t kvm_mmu_max_gfn(void)
> > * than hardware's real MAXPHYADDR. Using the host MAXPHYADDR
> > * disallows such SPTEs entirely and simplifies the TDP MMU.
> > */
> > - int max_gpa_bits = likely(tdp_enabled) ? shadow_phys_bits : 52;
> > -
> > - return (1ULL << (max_gpa_bits - PAGE_SHIFT)) - 1;
> > + return (1ULL << (52 - PAGE_SHIFT)) - 1;
> > }
> >
> > static inline u8 kvm_get_shadow_phys_bits(void)
> >
>
> Nope, still reproduces.
>
> I'll think about how to trace this; maybe that will give me some ideas.
> Is there anything useful to dump from the mmu pages that still aren't freed at that point?

Dumping the role and gfn is most likely to be useful. Assuming you aren't seeing
this WARN too:

WARN_ON(!list_empty(&kvm->arch.tdp_mmu_roots));

then it's not a VM refcounting problem. The bugs thus far have been tied to the
gfn in some way, e.g. skipping back-to-back entries, the MAXPHYADDR thing. But I
don't have any ideas for why such a simple test would generate unique behavior.
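
For the dumping itself, something like the below (untested, and assuming the
leaked pages are the ones sitting on kvm->arch.tdp_mmu_pages when the teardown
check in kvm_mmu_uninit_tdp_mmu() fires) should print the gfn and role of
every page that's still alive:

	struct kvm_mmu_page *sp;

	/* The VM is being torn down, so no locking should be needed here. */
	list_for_each_entry(sp, &kvm->arch.tdp_mmu_pages, link)
		pr_warn("leaked SP: gfn=0x%llx, role=0x%x\n",
			sp->gfn, sp->role.word);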

> Also, do you test on AMD? I test on my 3970X.

Yep, I've tried Rome and Milan on the AMD side, and CLX (or maybe SKX?) and HSW on the Intel side.