Re: [RFC] KVM: x86: Support KVM VMs sharing SEV context

From: Ashish Kalra
Date: Fri Mar 05 2021 - 09:05:04 EST


On Wed, Feb 24, 2021 at 08:59:15AM +0000, Nathan Tempelman wrote:
> Add a capability for userspace to mirror SEV encryption context from
> one vm to another. On our side, this is intended to support a
> Migration Helper vCPU, but it can also be used generically to support
> other in-guest workloads scheduled by the host. The intention is for
> the primary guest and the mirror to have nearly identical memslots.
>
> The primary benefits of this are that:
> 1) The VMs do not share KVM contexts (think APIC/MSRs/etc), so they
> can't accidentally clobber each other.
> 2) The VMs can have different memory-views, which is necessary for post-copy
> migration (the migration vCPUs on the target need to read and write to
> pages, when the primary guest would VMEXIT).
>
> This does not change the threat model for AMD SEV. Any memory involved
> is still owned by the primary guest and its initial state is still
> attested to through the normal SEV_LAUNCH_* flows. If userspace wanted
> to circumvent SEV, they could achieve the same effect by simply attaching
> a vCPU to the primary VM.
> This patch deliberately leaves userspace in charge of the memslots for the
> mirror, as it already has the power to mess with them in the primary guest.
>
> This patch does not support SEV-ES (much less SNP), as it does not
> handle handing off attested VMSAs to the mirror.
>
> For additional context, we need a Migration Helper because SEV PSP migration
> is far too slow for our live migration on its own. Using an in-guest
> migrator lets us speed this up significantly.
>
> Signed-off-by: Nathan Tempelman <natet@xxxxxxxxxx>
> ---
> Documentation/virt/kvm/api.rst | 17 +++++++
> arch/x86/include/asm/kvm_host.h | 1 +
> arch/x86/kvm/svm/sev.c | 82 +++++++++++++++++++++++++++++++++
> arch/x86/kvm/svm/svm.c | 2 +
> arch/x86/kvm/svm/svm.h | 2 +
> arch/x86/kvm/x86.c | 7 ++-
> include/linux/kvm_host.h | 1 +
> include/uapi/linux/kvm.h | 1 +
> virt/kvm/kvm_main.c | 8 ++++
> 9 files changed, 120 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> index 482508ec7cc4..438b647663c9 100644
> --- a/Documentation/virt/kvm/api.rst
> +++ b/Documentation/virt/kvm/api.rst
> @@ -6213,6 +6213,23 @@ the bus lock vm exit can be preempted by a higher priority VM exit, the exit
> notifications to userspace can be KVM_EXIT_BUS_LOCK or other reasons.
> KVM_RUN_BUS_LOCK flag is used to distinguish between them.
>
> +7.23 KVM_CAP_VM_COPY_ENC_CONTEXT_TO
> +-----------------------------------
> +
> +Architectures: x86 SEV enabled
> +Type: system
> +Parameters: args[0] is the fd of the kvm to mirror encryption context to
> +Returns: 0 on success; ENOTTY on error
> +
> +This capability enables userspace to copy encryption context from a primary
> +vm to the vm indicated by the fd.
> +
> +This is intended to support in-guest workloads scheduled by the host. This
> +allows the in-guest workload to maintain its own NPTs and keeps the two vms
> +from accidentally clobbering each other with interrupts and the like (separate
> +APIC/MSRs/etc).
> +
> +
> 8. Other capabilities.
> ======================
>
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 84499aad01a4..b7636c009647 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1334,6 +1334,7 @@ struct kvm_x86_ops {
> int (*mem_enc_op)(struct kvm *kvm, void __user *argp);
> int (*mem_enc_reg_region)(struct kvm *kvm, struct kvm_enc_region *argp);
> int (*mem_enc_unreg_region)(struct kvm *kvm, struct kvm_enc_region *argp);
> + int (*vm_copy_enc_context_to)(struct kvm *kvm, unsigned int child_fd);
>
> int (*get_msr_feature)(struct kvm_msr_entry *entry);
>
> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index 874ea309279f..2bad6cd2cb4c 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -66,6 +66,11 @@ static int sev_flush_asids(void)
> return ret;
> }
>
> +static inline bool is_mirroring_enc_context(struct kvm *kvm)
> +{
> + return &to_kvm_svm(kvm)->sev_info.enc_context_owner;
> +}
> +
> /* Must be called with the sev_bitmap_lock held */
> static bool __sev_recycle_asids(int min_asid, int max_asid)
> {
> @@ -1124,6 +1129,10 @@ int svm_mem_enc_op(struct kvm *kvm, void __user *argp)
> if (copy_from_user(&sev_cmd, argp, sizeof(struct kvm_sev_cmd)))
> return -EFAULT;
>
> + /* enc_context_owner handles all memory enc operations */
> + if (is_mirroring_enc_context(kvm))
> + return -ENOTTY;
> +
> mutex_lock(&kvm->lock);
>
> switch (sev_cmd.id) {
> @@ -1186,6 +1195,10 @@ int svm_register_enc_region(struct kvm *kvm,
> if (!sev_guest(kvm))
> return -ENOTTY;
>
> + /* If kvm is mirroring encryption context it isn't responsible for it */
> + if (is_mirroring_enc_context(kvm))
> + return -ENOTTY;
> +
> if (range->addr > ULONG_MAX || range->size > ULONG_MAX)
> return -EINVAL;
>
> @@ -1252,6 +1265,10 @@ int svm_unregister_enc_region(struct kvm *kvm,
> struct enc_region *region;
> int ret;
>
> + /* If kvm is mirroring encryption context it isn't responsible for it */
> + if (is_mirroring_enc_context(kvm))
> + return -ENOTTY;
> +
> mutex_lock(&kvm->lock);
>
> if (!sev_guest(kvm)) {
> @@ -1282,6 +1299,65 @@ int svm_unregister_enc_region(struct kvm *kvm,
> return ret;
> }
>
> +int svm_vm_copy_asid_to(struct kvm *kvm, unsigned int mirror_kvm_fd)
> +{
> + struct file *mirror_kvm_file;
> + struct kvm *mirror_kvm;
> + struct kvm_sev_info *mirror_kvm_sev;
> + unsigned int asid;
> + int ret;
> +
> + if (!sev_guest(kvm))
> + return -ENOTTY;
> +
> + mutex_lock(&kvm->lock);
> +
> + /* Mirrors of mirrors should work, but let's not get silly */
> + if (is_mirroring_enc_context(kvm)) {
> + ret = -ENOTTY;
> + goto failed;
> + }

How will A->B->C->... type of live migration work if mirrors of
mirrors are not supported ?

Thanks,
Ashish