[PATCH RFC v1 0/9] KVM: SVM: Defer page pinning for SEV guests

From: Nikunj A Dadhania
Date: Mon Mar 07 2022 - 23:40:21 EST


This is a follow-up to the RFC implementation [1] that incorporates
review feedback and bug fixes. See the "RFC v1" section below for a
list of changes.

SEV guests require their pages to be pinned in host physical memory
because migration of encrypted pages is not supported. The memory
encryption scheme uses the physical address of the memory being
encrypted. If the host moves guest pages, the content decrypted in the
guest will be incorrect, corrupting the guest's memory.
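
As a rough illustration of the address dependence described above, here
is a toy userspace sketch. It is not the SEV cipher (the hardware scheme
is AES-based); a simple XOR keystream derived from a key and a physical
address stands in for it. The names toy_keystream() and toy_crypt() are
made up for this example and do not appear in the series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Toy model only: mixes (key, physical address) into one keystream byte.
 * The mixer is a splitmix64-style finalizer, chosen for brevity; it is
 * an assumption of this sketch, not what the hardware does.
 */
static uint8_t toy_keystream(uint64_t key, uint64_t phys_addr)
{
	uint64_t x = key ^ phys_addr;

	x ^= x >> 30;
	x *= 0xbf58476d1ce4e5b9ULL;
	x ^= x >> 27;
	x *= 0x94d049bb133111ebULL;
	x ^= x >> 31;
	return (uint8_t)x;
}

/*
 * XOR with the keystream. XOR is its own inverse, so the same call
 * encrypts and decrypts, provided key and phys_addr are unchanged.
 */
static void toy_crypt(uint8_t *buf, size_t len, uint64_t key,
		      uint64_t phys_addr)
{
	size_t i;

	assert(buf != NULL);
	for (i = 0; i < len; i++)
		buf[i] ^= toy_keystream(key, phys_addr + i);
}
```

Encrypting a buffer "at" one address and decrypting it "at" another
yields garbage in this model, which is the failure mode that pinning
avoids for real guests.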

For SEV/SEV-ES guests, the hypervisor doesn't know which pages are
encrypted or when the guest is done using them. The hypervisor must
therefore treat all guest pages as encrypted until they are
deallocated or the guest is destroyed.

While provisioning a pfn, make KVM aware that guest pages need to be
pinned long-term, and use the appropriate pin_user_pages API for these
special encrypted memory regions. KVM takes the first reference and
holds it until the mapping is done; an extra reference is taken before
KVM releases the pfn.

Actual pinning management is handled by vendor code via new
kvm_x86_ops hooks. The MMU calls into vendor code to pin pages on
demand. Pinning metadata is stored in the architecture-specific
memslot area. Guest pages are unpinned in the memslot freeing path
and the deallocation path.
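
The flow above can be sketched in userspace roughly as follows. This is
a simplified model under stated assumptions, not the code from the
series: struct toy_memslot and the toy_*() helpers are hypothetical, the
per-slot array merely records a pfn where the real code would take a
long-term pin, and a refault at a different pfn is not handled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a memslot carrying pinning metadata. */
struct toy_memslot {
	uint64_t base_gfn;	/* first guest frame number in the slot */
	uint64_t npages;	/* number of guest pages in the slot */
	uint64_t *pinned_pfn;	/* per page: 0 = not pinned, else pfn */
	uint64_t nr_pinned;	/* pages currently pinned */
};

static int toy_slot_init(struct toy_memslot *slot, uint64_t base_gfn,
			 uint64_t npages)
{
	slot->pinned_pfn = calloc(npages, sizeof(*slot->pinned_pfn));
	if (!slot->pinned_pfn)
		return -1;
	slot->base_gfn = base_gfn;
	slot->npages = npages;
	slot->nr_pinned = 0;
	return 0;
}

/*
 * Simulated fault path: pin a page the first time it is touched.
 * Returns true if the page was newly pinned, false if already pinned.
 */
static bool toy_pin_on_fault(struct toy_memslot *slot, uint64_t gfn,
			     uint64_t pfn)
{
	uint64_t idx;

	assert(gfn >= slot->base_gfn && gfn - slot->base_gfn < slot->npages);
	assert(pfn != 0);

	idx = gfn - slot->base_gfn;
	if (slot->pinned_pfn[idx])
		return false;

	/* Real code would take a long-term pin on the page here. */
	slot->pinned_pfn[idx] = pfn;
	slot->nr_pinned++;
	return true;
}

/*
 * Simulated memslot free path: drop every recorded pin and release the
 * metadata. Returns the number of pages that were unpinned.
 */
static uint64_t toy_slot_free(struct toy_memslot *slot)
{
	uint64_t i, unpinned = 0;

	for (i = 0; i < slot->npages; i++) {
		if (slot->pinned_pfn[i]) {
			slot->pinned_pfn[i] = 0;
			unpinned++;
		}
	}
	free(slot->pinned_pfn);
	slot->pinned_pfn = NULL;
	slot->nr_pinned = 0;
	return unpinned;
}
```

In this model only pages that actually fault are ever pinned, so an
untouched slot costs nothing beyond its metadata array.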

Guest boot time comparison:
+---------------+------------------+------------------+
| Guest Memory  | Baseline         | Demand Pinning + |
| Size (GB)     | v5.17-rc6 (secs) | v5.17-rc6 (secs) |
+---------------+------------------+------------------+
| 4             | 6.16             | 5.71             |
+---------------+------------------+------------------+
| 16            | 7.38             | 5.91             |
+---------------+------------------+------------------+
| 64            | 12.17            | 6.16             |
+---------------+------------------+------------------+
| 128           | 18.20            | 6.50             |
+---------------+------------------+------------------+
| 192           | 24.56            | 6.80             |
+---------------+------------------+------------------+


Changelog:
RFC v1:
* Use pin_user_pages API with FOLL_LONGTERM flag for pinning the
  encrypted guest pages. [David Hildenbrand]
* Use new API kvm_for_each_memslot_in_hva_range to walk the memslots.
  [Maciej S. Szmigiero]
* Maintain the non-MMU pinned memory and free it on destruction.
  [Peter Gonda]
* Handle non-MMU pinned memory for intra-host migration. [Peter Gonda]
* Add the missing RLIMIT_MEMLOCK check. [David Hildenbrand]
* Use pin_user_pages API for long-term pinning of pages.
  [David Hildenbrand]
* Flush the page before releasing it to the host system.
  [Mingwei Zhang]
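
For readers unfamiliar with the RLIMIT_MEMLOCK item above, the
accounting idea can be sketched in userspace as below. This is an
illustration only: the toy_*() names are hypothetical, the in-kernel
check in the series is not reproduced here, and kernel-side checks of
this kind typically also exempt tasks with CAP_IPC_LOCK, which this
sketch ignores.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <sys/resource.h>
#include <unistd.h>

/*
 * Pure accounting check: would pinning npages more pages, on top of
 * locked_pages already pinned, exceed a limit of limit_bytes?
 */
static bool toy_memlock_exceeded(uint64_t locked_pages, uint64_t npages,
				 uint64_t limit_bytes, uint64_t page_size)
{
	uint64_t limit_pages;

	assert(page_size != 0);
	limit_pages = limit_bytes / page_size;

	/* Compare without computing locked_pages + npages (no overflow). */
	if (locked_pages > limit_pages)
		return true;
	return npages > limit_pages - locked_pages;
}

/* Same check against the calling process's current soft limit. */
static bool toy_memlock_exceeded_now(uint64_t locked_pages, uint64_t npages)
{
	struct rlimit rl;
	long page_size = sysconf(_SC_PAGESIZE);

	/* Be conservative if the limit or page size cannot be read. */
	if (page_size <= 0 || getrlimit(RLIMIT_MEMLOCK, &rl) != 0)
		return true;
	if (rl.rlim_cur == RLIM_INFINITY)
		return false;
	return toy_memlock_exceeded(locked_pages, npages,
				    (uint64_t)rl.rlim_cur,
				    (uint64_t)page_size);
}
```

With demand pinning, a check like this would run as pages are pinned
rather than once for the whole guest at launch.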

[1] https://lore.kernel.org/kvm/20220118110621.62462-1-nikunj@xxxxxxx/

Nikunj A Dadhania (7):
  KVM: Introduce pinning flag to hva_to_pfn*
  KVM: x86/mmu: Move hugepage adjust to direct_page_fault
  KVM: x86/mmu: Add hook to pin PFNs on demand in MMU
  KVM: SVM: Add pinning metadata in the arch memslot
  KVM: SVM: Implement demand page pinning
  KVM: SEV: Carve out routine for allocation of pages
  KVM: Move kvm_for_each_memslot_in_hva_range() to be used in SVM

Sean Christopherson (2):
  KVM: x86/mmu: Introduce kvm_mmu_map_tdp_page() for use by SEV/TDX
  KVM: SVM: Pin SEV pages in MMU during sev_launch_update_data()

arch/x86/include/asm/kvm-x86-ops.h | 3 +
arch/x86/include/asm/kvm_host.h | 10 +
arch/x86/kvm/mmu.h | 3 +
arch/x86/kvm/mmu/mmu.c | 57 +++-
arch/x86/kvm/mmu/tdp_mmu.c | 2 -
arch/x86/kvm/svm/sev.c | 531 ++++++++++++++++++++++-------
arch/x86/kvm/svm/svm.c | 4 +
arch/x86/kvm/svm/svm.h | 12 +-
arch/x86/kvm/x86.c | 11 +-
include/linux/kvm_host.h | 12 +
virt/kvm/kvm_main.c | 69 ++--
virt/kvm/kvm_mm.h | 2 +-
virt/kvm/pfncache.c | 2 +-
13 files changed, 556 insertions(+), 162 deletions(-)

--
2.32.0