Re: [RFC PATCH v3 31/59] KVM: x86: Add infrastructure for stolen GPA bits

From: Thomas Gleixner
Date: Thu Nov 25 2021 - 15:03:38 EST


On Wed, Nov 24 2021 at 16:20, isaku yamahata wrote:
> Add support in KVM's MMU for aliasing multiple GPAs (from a hardware
> perspective) to a single GPA (from a memslot perspective). GPA aliasing
> will be used to repurpose GPA bits as attribute bits, e.g. to expose an
> execute-only permission bit to the guest. To keep the implementation
> simple (relatively speaking), GPA aliasing is only supported via TDP.
>
> Today KVM assumes two things that are broken by GPA aliasing.
> 1. GPAs coming from hardware can be simply shifted to get the GFNs.
> 2. GPA bits 51:MAXPHYADDR are reserved to zero.
>
> With GPA aliasing, translating a GPA to GFN requires masking off the
> repurposed bit, and a repurposed bit may reside in 51:MAXPHYADDR.
>
> To support GPA aliasing, introduce the concept of per-VM GPA stolen bits,
> that is, bits stolen from the GPA to act as new virtualized attribute
> bits. A bit in the mask will cause the MMU code to create aliases of the
> GPA. It can also be used to derive the GFN from a GPA coming from a TDP
> fault.
>
> To handle case (1) from above, retain any stolen bits when passing a GPA
> in KVM's MMU code, but strip them when converting to a GFN so that the
> GFN contains only the "real" GFN, i.e. never has repurposed bits set.
>
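The handling of case (1) can be illustrated with a minimal sketch. This is not the code from the patch: `struct vm_sketch`, `gpa_stolen_mask`, `gpa_to_gfn_unaliased()` and `gpa_stolen_bits()` are hypothetical names chosen for illustration, and a 4 KiB page size is assumed.

```c
#include <stdint.h>

#define PAGE_SHIFT 12	/* assumption: 4 KiB base pages */

typedef uint64_t gpa_t;
typedef uint64_t gfn_t;

/* Hypothetical per-VM state: mask of GPA bits repurposed as attribute bits. */
struct vm_sketch {
	gpa_t gpa_stolen_mask;
};

/* Strip the stolen bits first, then shift, so the GFN is always the "real" GFN. */
static inline gfn_t gpa_to_gfn_unaliased(const struct vm_sketch *vm, gpa_t gpa)
{
	return (gpa & ~vm->gpa_stolen_mask) >> PAGE_SHIFT;
}

/* Return only the stolen (attribute) bits carried by a GPA. */
static inline gpa_t gpa_stolen_bits(const struct vm_sketch *vm, gpa_t gpa)
{
	return gpa & vm->gpa_stolen_mask;
}
```

With a single stolen bit, a GPA with the bit set and the same GPA with the bit clear both yield the same GFN, which is why memslot and rmap lookups keyed by GFN are unaffected by the alias.
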
> GFNs (without stolen bits) continue to be used to:
> -Specify physical memory by userspace via memslots
> -Map GPAs to TDP PTEs via RMAP
> -Specify dirty tracking and write protection
> -Look up MTRR types
> -Inject async page faults
>
> Since there are now multiple aliases for the same aliased GPA, when
> userspace memory backing the memslots is paged out, both aliases need to be
> modified. Fortunately this happens automatically. Since rmap supports
> multiple mappings for the same GFN for PTE shadowing based paging, by
> adding/removing each alias PTE with its GFN, kvm_handle_hva() based
> operations will be applied to both aliases.
>
> In the case of the rmap being removed in the future, the needed
> information could be recovered by iterating over the stolen bits and
> walking the TDP page tables.
>
> For TLB flushes that are address based, make sure to flush both aliases
> in the stolen bits case.
>
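For the address-based flush case, one way to find every alias of a GFN is to walk all subsets of the stolen-bit mask. The sketch below is only an illustration under assumptions: `collect_gpa_aliases()` is a hypothetical helper, not a KVM function, and it generalizes the "both aliases" wording to a mask with any number of stolen bits.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12	/* assumption: 4 KiB base pages */

typedef uint64_t gpa_t;
typedef uint64_t gfn_t;

/*
 * Hypothetical helper: store every GPA alias of @gfn into @out (at most
 * @max entries) and return the total number of aliases. One alias exists
 * per subset of @stolen_mask, including the empty subset.
 */
static size_t collect_gpa_aliases(gfn_t gfn, gpa_t stolen_mask,
				  gpa_t *out, size_t max)
{
	/* The base GPA never carries stolen bits. */
	gpa_t base = ((gpa_t)gfn << PAGE_SHIFT) & ~stolen_mask;
	gpa_t bits = 0;
	size_t count = 0;

	do {
		if (count < max)
			out[count] = base | bits;
		count++;
		/* Advance to the next subset of stolen_mask; wraps to 0 when done. */
		bits = (bits - stolen_mask) & stolen_mask;
	} while (bits != 0);

	return count;
}
```

A caller would then issue one address-based flush per returned GPA. With a single stolen bit this yields exactly two addresses, matching the "both aliases" case described above.
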
> Only support stolen bits in 64-bit guest paging modes (long, PAE).
> Features that use this infrastructure should restrict the stolen bits to
> exclude the other paging modes. Don't support stolen bits for shadow EPT.

This is a really reasonable and informative changelog. Thanks to Rick for
writing this up!

Thanks,

tglx