Re: [PATCH v3 26/32] KVM: arm64: Introduce PROT_NONE mappings for stage 2

From: Will Deacon
Date: Fri Mar 05 2021 - 14:04:43 EST


On Fri, Mar 05, 2021 at 09:52:12AM +0000, Quentin Perret wrote:
> On Thursday 04 Mar 2021 at 20:00:45 (+0000), Will Deacon wrote:
> > On Tue, Mar 02, 2021 at 02:59:56PM +0000, Quentin Perret wrote:
> > > Once we start unmapping portions of memory from the host stage 2 (such
> > > as e.g. the hypervisor memory sections, or pages that belong to
> > > protected guests), we will need a way to track page ownership. And
> > > given that all mappings in the host stage 2 will be identity-mapped, we
> > > can use the host stage 2 page-table itself as a simplistic rmap.
> > >
> > > As a first step towards this, introduce a new protection attribute
> > > in the stage 2 page table code, called KVM_PGTABLE_PROT_NONE, which
> > > allows to annotate portions of the IPA space as inaccessible. For
> > > simplicity, PROT_NONE mappings are created as invalid mappings with a
> > > software bit set.
> >
> > Just as an observation, but given that they're invalid we can use any bit
> > from [63:2] to indicate that it's a PROT_NONE mapping, and that way we
> > can keep the real "software bits" for live mappings.
> >
> > But we can of course change that later when we need the bit for something
> > else.
>
> Right, so I used this approach for consistency with the kernel's
> PROT_NONE mappings:
>
> #define PTE_PROT_NONE (_AT(pteval_t, 1) << 58) /* only when !PTE_VALID */
>
> And in fact now that I think about it, it might be worth re-using the
> same bit in stage 2.
>
> But yes it would be pretty neat to use the other bits of invalid
> mappings to add metadata about the pages. I could even drop the
> PROT_NONE stuff straight away in favor of a more extensive mechanism for
> tracking page ownership...
>
> Thinking about it, it should be relatively straightforward to construct
> the host stage 2 with the following invariants:
>
> 1. everything is identity-mapped in the host stage 2;
>
> 2. all valid mappings imply the underlying PA range belongs to the
> host;
>
> 3. bits [63:32] (say) of all invalid mappings are used to store a
> unique identifier for the owner of the underlying PA range;
>
> 4. the host identifier is 0, such that it owns all of memory by
> default at boot as its pgd is zeroed;
>
> And then I could replace my PROT_NONE permission stuff by an ownership
> change. E.g. the hypervisor would have its own identifier, and I could
> use it to mark the .hyp memory sections as owned by the hyp (which
> implies inaccessible by the host). And that should scale quite easily
> when we start running protected guests as we'll assign them their own
> identifiers. Sharing pages between guests (or even worse, between host
> and guests) is a bit trickier, but maybe that is for later.
>
> Thoughts?

I think this sounds like a worthwhile generalisation to me, although virtio
brings an immediate need for shared pages and so we'll still need a software
bit for those so that we e.g. prevent the host from donating such a shared
page to the hypervisor.

Will