Re: [PATCH v5 04/13] KVM: Introduce a new guest mapping API

From: David Hildenbrand
Date: Thu Jan 10 2019 - 08:07:09 EST


On 09.01.19 10:42, KarimAllah Ahmed wrote:
> In KVM, specially for nested guests, there is a dominant pattern of:
>
> => map guest memory -> do_something -> unmap guest memory
>
> In addition to all this unnecessarily noise in the code due to boiler plate
> code, most of the time the mapping function does not properly handle memory
> that is not backed by "struct page". This new guest mapping API encapsulate
> most of this boiler plate code and also handles guest memory that is not
> backed by "struct page".
>
> The current implementation of this API is using memremap for memory that is
> not backed by a "struct page" which would lead to a huge slow-down if it
> was used for high-frequency mapping operations. The API does not have any
> effect on current setups where guest memory is backed by a "struct page".
> Further patches are going to also introduce a pfn-cache which would
> significantly improve the performance of the memremap case.
>
> Signed-off-by: KarimAllah Ahmed <karahmed@xxxxxxxxx>
> ---
> v3 -> v4:
> - Update the commit message.
> v1 -> v2:
> - Drop the caching optimization (pbonzini)
> - Use 'hva' instead of 'kaddr' (pbonzini)
> - Return 0/-EINVAL/-EFAULT instead of true/false. -EFAULT will be used for
> AMD patch (pbonzini)
> - Introduce __kvm_map_gfn which accepts a memory slot and use it (pbonzini)
> - Only clear map->hva instead of memsetting the whole structure.
> - Drop kvm_vcpu_map_valid since it is no longer used.
> - Fix EXPORT_MODULE naming.
> ---
> include/linux/kvm_host.h | 9 ++++++++
> virt/kvm/kvm_main.c | 53 ++++++++++++++++++++++++++++++++++++++++++++++++
> 2 files changed, 62 insertions(+)
>
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index c38cc5e..8a2f5fa 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -205,6 +205,13 @@ enum {
> READING_SHADOW_PAGE_TABLES,
> };
>
> +struct kvm_host_map {
> + struct page *page;

Can you add somme comments to what it means when there is a page vs.
when there is none?

> + void *hva;
> + kvm_pfn_t pfn;
> + kvm_pfn_t gfn;
> +};
> +
> /*
> * Sometimes a large or cross-page mmio needs to be broken up into separate
> * exits for userspace servicing.
> @@ -710,7 +717,9 @@ struct kvm_memslots *kvm_vcpu_memslots(struct kvm_vcpu *vcpu);
> struct kvm_memory_slot *kvm_vcpu_gfn_to_memslot(struct kvm_vcpu *vcpu, gfn_t gfn);
> kvm_pfn_t kvm_vcpu_gfn_to_pfn_atomic(struct kvm_vcpu *vcpu, gfn_t gfn);
> kvm_pfn_t kvm_vcpu_gfn_to_pfn(struct kvm_vcpu *vcpu, gfn_t gfn);
> +int kvm_vcpu_map(struct kvm_vcpu *vcpu, gpa_t gpa, struct kvm_host_map *map);
> struct page *kvm_vcpu_gfn_to_page(struct kvm_vcpu *vcpu, gfn_t gfn);
> +void kvm_vcpu_unmap(struct kvm_host_map *map, bool dirty);
> unsigned long kvm_vcpu_gfn_to_hva(struct kvm_vcpu *vcpu, gfn_t gfn);
> unsigned long kvm_vcpu_gfn_to_hva_prot(struct kvm_vcpu *vcpu, gfn_t gfn, bool *writable);
> int kvm_vcpu_read_guest_page(struct kvm_vcpu *vcpu, gfn_t gfn, void *data, int offset,
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 1f888a1..4d8f2e3 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -1733,6 +1733,59 @@ struct page *gfn_to_page(struct kvm *kvm, gfn_t gfn)
> }
> EXPORT_SYMBOL_GPL(gfn_to_page);
>
> +static int __kvm_map_gfn(struct kvm_memory_slot *slot, gfn_t gfn,
> + struct kvm_host_map *map)
> +{
> + kvm_pfn_t pfn;
> + void *hva = NULL;
> + struct page *page = NULL;

nit: I prefer these in a growing line-length fashion.

> +
> + pfn = gfn_to_pfn_memslot(slot, gfn);
> + if (is_error_noslot_pfn(pfn))
> + return -EINVAL;
> +
> + if (pfn_valid(pfn)) {
> + page = pfn_to_page(pfn);
> + hva = kmap(page);
> + } else {
> + hva = memremap(pfn_to_hpa(pfn), PAGE_SIZE, MEMREMAP_WB);
> + }
> +
> + if (!hva)
> + return -EFAULT;
> +
> + map->page = page;
> + map->hva = hva;
> + map->pfn = pfn;
> + map->gfn = gfn;
> +
> + return 0;
> +}
> +
> +int kvm_vcpu_map(struct kvm_vcpu *vcpu, gfn_t gfn, struct kvm_host_map *map)
> +{
> + return __kvm_map_gfn(kvm_vcpu_gfn_to_memslot(vcpu, gfn), gfn, map);
> +}
> +EXPORT_SYMBOL_GPL(kvm_vcpu_map);
> +
> +void kvm_vcpu_unmap(struct kvm_host_map *map, bool dirty)
> +{
> + if (!map->hva)
> + return;
> +
> + if (map->page)
> + kunmap(map->page);
> + else
> + memunmap(map->hva);
> +
> + if (dirty)


I am wondering if this would also be the right place for

kvm_vcpu_mark_page_dirty() to mark the page dirty for migration.


> + kvm_release_pfn_dirty(map->pfn);
> + else
> + kvm_release_pfn_clean(map->pfn);
> + map->hva = NULL;

> +}
> +EXPORT_SYMBOL_GPL(kvm_vcpu_unmap);
> +
> struct page *kvm_vcpu_gfn_to_page(struct kvm_vcpu *vcpu, gfn_t gfn)
> {
> kvm_pfn_t pfn;
>


--

Thanks,

David / dhildenb