Re: [RFC] KVM: mm: fd-based approach for supporting KVM guest private memory

From: Kirill A. Shutemov
Date: Wed Sep 15 2021 - 10:29:34 EST


On Wed, Sep 15, 2021 at 03:51:25PM +0200, David Hildenbrand wrote:
> > > diff --git a/mm/memfd.c b/mm/memfd.c
> > > index 081dd33e6a61..ae43454789f4 100644
> > > --- a/mm/memfd.c
> > > +++ b/mm/memfd.c
> > > @@ -130,11 +130,24 @@ static unsigned int *memfd_file_seals_ptr(struct file *file)
> > >  	return NULL;
> > >  }
> > > +int memfd_register_guest(struct inode *inode, void *owner,
> > > +			 const struct guest_ops *guest_ops,
> > > +			 const struct guest_mem_ops **guest_mem_ops)
> > > +{
> > > +	if (shmem_mapping(inode->i_mapping)) {
> > > +		return shmem_register_guest(inode, owner,
> > > +					    guest_ops, guest_mem_ops);
> > > +	}
> > > +
> > > +	return -EINVAL;
> > > +}
> >
> > Are we tying our design to the memfd interface (e.g. other memory backing
> > stores like tmpfs and hugetlbfs will all rely on this memfd interface to
> > interact with KVM), or is this just the initial implementation for a PoC?
>
> I don't think we are, it still feels like we are in the early prototype
> phase (even way before a PoC). I'd be happy to see something "cleaner" so to
> say -- it still feels kind of hacky to me, especially there seem to be many
> pieces of the big puzzle missing so far. Unfortunately, this series hasn't
> caught the attention of many -MM people so far, maybe because other people
> miss the big picture as well and are waiting for a complete design proposal.
>
> For example, what's unclear to me: we'll be allocating pages with
> GFP_HIGHUSER_MOVABLE, making them land on MIGRATE_CMA or ZONE_MOVABLE; then
> we silently turn them unmovable, which breaks these concepts. Who'd migrate
> these pages away just like when doing long-term pinning, or how is that
> supposed to work?

That's a fair point. We can fix it by changing mapping->gfp_mask.
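
A rough userspace sketch of the idea, not the actual patch: clear the
movable bit in the mapping's allocation mask so that later allocations no
longer land on MIGRATE_CMA or ZONE_MOVABLE. The flag values and the
one-field struct address_space below are simplified stand-ins, and
guest_mapping_make_unmovable() is a hypothetical name. In the kernel the
mask would be read and written with mapping_gfp_mask() and
mapping_set_gfp_mask().

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stand-ins; the real definitions live in the kernel's gfp headers. */
typedef unsigned int gfp_t;
#define __GFP_MOVABLE        0x08u
#define GFP_HIGHUSER         0x100u	/* placeholder for the non-movable bits */
#define GFP_HIGHUSER_MOVABLE (GFP_HIGHUSER | __GFP_MOVABLE)

/* Simplified model of struct address_space: only the allocation mask. */
struct address_space {
	gfp_t gfp_mask;
};

/* Hypothetical helper: future allocations for this mapping become unmovable. */
static void guest_mapping_make_unmovable(struct address_space *mapping)
{
	assert(mapping);
	mapping->gfp_mask &= ~__GFP_MOVABLE;
}

/* True if new pages for this mapping would still be allocated as movable. */
static bool mapping_allocates_movable(const struct address_space *mapping)
{
	return (mapping->gfp_mask & __GFP_MOVABLE) != 0;
}
```

The mask is only consulted at allocation time, so pages that are already
in the page cache would need separate handling.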

> Also unclear to me is how refcount and mapcount will be handled to prevent
> swapping,

refcount and mapcount are unchanged. Pages are not pinned per se. Swapping
is prevented by the change in shmem_writepage().
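
To illustrate the shape of that check, here is a small userspace model.
It is a sketch under assumptions, not the code from the series: the
guest_owner field, the sketch_* types and sketch_writepage() are all
hypothetical stand-ins. The point is only that the writeout path refuses
to push guest-owned pages to swap and leaves them dirty in the page cache.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for kernel objects; not the real definitions. */
struct sketch_inode {
	void *guest_owner;	/* hypothetical: non-NULL once registered as guest memory */
};

struct sketch_page {
	bool dirty;
	bool in_swap;
};

/*
 * Model of a writepage path that reclaim calls to push a page out to swap.
 * Guest-owned pages are re-dirtied and stay where they are.
 * Returns true if the page was moved to swap.
 */
static bool sketch_writepage(struct sketch_inode *inode, struct sketch_page *page)
{
	assert(inode && page);

	if (inode->guest_owner) {
		page->dirty = true;	/* redirty so reclaim skips this page */
		return false;
	}

	page->dirty = false;
	page->in_swap = true;
	return true;
}
```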

> who will actually do some kind of gfn-epfn etc. mapping, how we'll
> forbid access to this memory e.g., via /proc/kcore or when dumping memory

It's not aimed at preventing root from shooting itself in the foot. Root
does what root does.

> ... and how it would ever work with migration/swapping/rmap (it's clearly
> future work, but it's been raised that this would be the way to make it
> work, I don't quite see how it would all come together).

Given that the hardware supports it, migration and swapping can be
implemented by providing new callbacks in guest_ops. For example,
->migrate_page would transfer encrypted data between pages, and ->swapout
would provide an encrypted blob that can be put on disk or handed back to
->swapin to bring it back into memory.
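
Roughly, the callbacks could look like the userspace sketch below. The
signatures are assumptions made for illustration; nothing like this exists
in the series yet. The toy backend uses plain memcpy() where real hardware
would re-encrypt or export the page contents, and the sketch_* and toy_*
names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 64	/* small on purpose; real pages are 4 KiB or larger */

struct sketch_page {
	unsigned char data[SKETCH_PAGE_SIZE];	/* ciphertext as the host sees it */
};

/* Hypothetical additions to guest_ops; signatures are assumptions. */
struct sketch_guest_ops {
	int (*migrate_page)(struct sketch_page *dst, struct sketch_page *src);
	int (*swapout)(const struct sketch_page *page, unsigned char *blob, size_t len);
	int (*swapin)(struct sketch_page *page, const unsigned char *blob, size_t len);
};

/* Toy backend: plain copies stand in for the hardware operations. */
static int toy_migrate_page(struct sketch_page *dst, struct sketch_page *src)
{
	assert(dst && src);
	memcpy(dst->data, src->data, SKETCH_PAGE_SIZE);
	memset(src->data, 0, SKETCH_PAGE_SIZE);	/* source page is given up */
	return 0;
}

static int toy_swapout(const struct sketch_page *page, unsigned char *blob, size_t len)
{
	if (len < SKETCH_PAGE_SIZE)
		return -1;	/* blob buffer too small */
	memcpy(blob, page->data, SKETCH_PAGE_SIZE);
	return 0;
}

static int toy_swapin(struct sketch_page *page, const unsigned char *blob, size_t len)
{
	if (len < SKETCH_PAGE_SIZE)
		return -1;	/* truncated blob */
	memcpy(page->data, blob, SKETCH_PAGE_SIZE);
	return 0;
}

static const struct sketch_guest_ops toy_guest_ops = {
	.migrate_page = toy_migrate_page,
	.swapout = toy_swapout,
	.swapin = toy_swapin,
};
```

In this model the memory backend only moves opaque bytes around; it never
needs to interpret the page contents.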

--
Kirill A. Shutemov