Re: [RFC PATCH 6/6] KVM: SVM: Pin SEV pages in MMU during sev_launch_update_data()

From: Nikunj A. Dadhania
Date: Wed Jan 19 2022 - 01:34:05 EST


Hi Maciej,

On 1/18/2022 8:30 PM, Maciej S. Szmigiero wrote:
> Hi Nikunj,
>
> On 18.01.2022 12:06, Nikunj A Dadhania wrote:
>> From: Sean Christopherson <sean.j.christopherson@xxxxxxxxx>
>>
>> Pin the memory for the data being passed to launch_update_data()
>> because it gets encrypted before the guest is first run and must
>> not be moved which would corrupt it.
>>
>> Signed-off-by: Sean Christopherson <sean.j.christopherson@xxxxxxxxx>
>> [ * Changed hva_to_gva() to take an extra argument and return gpa_t.
>>    * Updated sev_pin_memory_in_mmu() error handling.
>>    * As pinning/unpining pages is handled within MMU, removed
>>      {get,put}_user(). ]
>> Signed-off-by: Nikunj A Dadhania <nikunj@xxxxxxx>
>> ---
>>   arch/x86/kvm/svm/sev.c | 122 ++++++++++++++++++++++++++++++++++++++++-
>>   1 file changed, 119 insertions(+), 3 deletions(-)
>>
>> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
>> index 14aeccfc500b..1ae714e83a3c 100644
>> --- a/arch/x86/kvm/svm/sev.c
>> +++ b/arch/x86/kvm/svm/sev.c
>> @@ -22,6 +22,7 @@
>>   #include <asm/trapnr.h>
>>   #include <asm/fpu/xcr.h>
>>   +#include "mmu.h"
>>   #include "x86.h"
>>   #include "svm.h"
>>   #include "svm_ops.h"
>> @@ -490,6 +491,110 @@ static unsigned long get_num_contig_pages(unsigned long idx,
>>       return pages;
>>   }
>>   +#define SEV_PFERR_RO (PFERR_USER_MASK)
>> +#define SEV_PFERR_RW (PFERR_WRITE_MASK | PFERR_USER_MASK)
>> +
>> +static struct kvm_memory_slot *hva_to_memslot(struct kvm *kvm,
>> +                          unsigned long hva)
>> +{
>> +    struct kvm_memslots *slots = kvm_memslots(kvm);
>> +    struct kvm_memory_slot *memslot;
>> +    int bkt;
>> +
>> +    kvm_for_each_memslot(memslot, bkt, slots) {
>> +        if (hva >= memslot->userspace_addr &&
>> +            hva < memslot->userspace_addr +
>> +            (memslot->npages << PAGE_SHIFT))
>> +            return memslot;
>> +    }
>> +
>> +    return NULL;
>> +}
>
> We have kvm_for_each_memslot_in_hva_range() now, please don't do a linear
> search through memslots.
> You might need to move the aforementioned macro from kvm_main.c to some
> header file, though.

Sure, let me try optimizing with this newly added macro.

>
>> +static gpa_t hva_to_gpa(struct kvm *kvm, unsigned long hva, bool *ro)
>> +{
>> +    struct kvm_memory_slot *memslot;
>> +    gpa_t gpa_offset;
>> +
>> +    memslot = hva_to_memslot(kvm, hva);
>> +    if (!memslot)
>> +        return UNMAPPED_GVA;
>> +
>> +    *ro = !!(memslot->flags & KVM_MEM_READONLY);
>> +    gpa_offset = hva - memslot->userspace_addr;
>> +    return ((memslot->base_gfn << PAGE_SHIFT) + gpa_offset);
>> +}
>> +
>> +static struct page **sev_pin_memory_in_mmu(struct kvm *kvm, unsigned long addr,
>> +                       unsigned long size,
>> +                       unsigned long *npages)
>> +{
>> +    struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
>> +    struct kvm_vcpu *vcpu;
>> +    struct page **pages;
>> +    unsigned long i;
>> +    u32 error_code;
>> +    kvm_pfn_t pfn;
>> +    int idx, ret = 0;
>> +    gpa_t gpa;
>> +    bool ro;
>> +
>> +    pages = sev_alloc_pages(sev, addr, size, npages);
>> +    if (IS_ERR(pages))
>> +        return pages;
>> +
>> +    vcpu = kvm_get_vcpu(kvm, 0);
>> +    if (mutex_lock_killable(&vcpu->mutex)) {
>> +        kvfree(pages);
>> +        return ERR_PTR(-EINTR);
>> +    }
>> +
>> +    vcpu_load(vcpu);
>> +    idx = srcu_read_lock(&kvm->srcu);
>> +
>> +    kvm_mmu_load(vcpu);
>> +
>> +    for (i = 0; i < *npages; i++, addr += PAGE_SIZE) {
>> +        if (signal_pending(current)) {
>> +            ret = -ERESTARTSYS;
>> +            break;
>> +        }
>> +
>> +        if (need_resched())
>> +            cond_resched();
>> +
>> +        gpa = hva_to_gpa(kvm, addr, &ro);
>> +        if (gpa == UNMAPPED_GVA) {
>> +            ret = -EFAULT;
>> +            break;
>> +        }
>
> This function is going to have worst case O(n²) complexity if called with
> the whole VM memory (or O(n * log(n)) when hva_to_memslot() is modified
> to use kvm_for_each_memslot_in_hva_range()).

I understand your concern and will address it. BTW, this is called for a small
fragment of VM memory( <10MB), that needs to be pinned before the guest execution
starts.

> That's really bad for something that can be done in O(n) time - look how
> kvm_for_each_memslot_in_gfn_range() does it over gfns.
>

I saw one use of kvm_for_each_memslot_in_gfn_range() in __kvm_zap_rmaps(), and
that too calls slot_handle_level_range() which has a for_each_slot_rmap_range().
How would that be O(n) ?

kvm_for_each_memslot_in_gfn_range() {
...
slot_handle_level_range()
...
}

slot_handle_level_range() {
...
for_each_slot_rmap_range() {
...
}
...
}

Regards,
Nikunj