Re: [PATCH 1/2] KVM: MMU: Do not treat ZONE_DEVICE pages as being reserved

From: Sean Christopherson
Date: Wed Nov 06 2019 - 18:39:15 EST


On Wed, Nov 06, 2019 at 03:20:11PM -0800, Dan Williams wrote:
> After some more thought I'd feel more comfortable just collapsing the
> ZONE_DEVICE case into the VM_IO/VM_PFNMAP case. I.e. with something
> like this (untested) that just drops the reference immediately and lets
> kvm_is_reserved_pfn() do the right thing going forward.

This will break the page fault flow, as it will allow the page to be
whacked before KVM can ensure it will get proper notification from the
mmu_notifier. E.g. KVM could install the PFN in its secondary MMU after
it has already received the invalidate notification for that PFN.
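
To make the ordering concrete, here is a rough userspace model of the
window in question. It is not kernel code: every model_* name is
hypothetical, the "host reclaim" step stands in for whatever frees the
page and sends the invalidate, and KVM's own retry logic is deliberately
left out. It only shows that a reference held across the install keeps
the page alive, while an early put does not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
struct model_page {
	int refcount;	/* references currently held on the page */
	bool freed;	/* set once the host has released the page */
};

struct model_mmu {
	const struct model_page *mapped; /* page installed in the secondary MMU */
};

static void model_get_page(struct model_page *p)
{
	p->refcount++;
}

static void model_put_page(struct model_page *p)
{
	assert(p->refcount > 0);
	p->refcount--;
}

/*
 * Host side: the page can only go away once nobody holds a reference.
 * Freeing it also invalidates (zaps) any existing secondary MMU mapping.
 */
static bool model_host_reclaim(struct model_page *p, struct model_mmu *mmu)
{
	if (p->refcount > 0)
		return false;
	p->freed = true;
	mmu->mapped = NULL;
	return true;
}

/*
 * One fault: look up the page, let the host try to reclaim it in the
 * window before the install, then install it. early_put selects the
 * proposed behaviour (drop the reference right after the lookup).
 * Returns true if the secondary MMU ends up mapping a freed page.
 */
static bool model_fault_maps_freed_page(bool early_put)
{
	struct model_page page = { .refcount = 0, .freed = false };
	struct model_mmu mmu = { .mapped = NULL };

	model_get_page(&page);			/* lookup pins the page */
	if (early_put)
		model_put_page(&page);

	model_host_reclaim(&page, &mmu);	/* racing host activity */

	mmu.mapped = &page;			/* install after the invalidate */

	if (!early_put)
		model_put_page(&page);

	return mmu.mapped->freed;
}
```

In this model, model_fault_maps_freed_page(true) reports a stale mapping
and model_fault_maps_freed_page(false) does not, since the reclaim step
fails while the reference is still held.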

> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index d6f0696d98ef..d21689e2b4eb 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -1464,6 +1464,14 @@ static bool hva_to_pfn_fast(unsigned long addr, bool write_fault,
> npages = __get_user_pages_fast(addr, 1, 1, page);
> if (npages == 1) {
> *pfn = page_to_pfn(page[0]);
> + /*
> + * ZONE_DEVICE pages are effectively VM_IO/VM_PFNMAP as
> + * far as KVM is concerned; kvm_is_reserved_pfn() will
> + * prevent further unnecessary page management on this
> + * page.
> + */
> + if (is_zone_device_page(page[0]))
> + put_page(page[0]);
>
> if (writable)
> *writable = true;
> @@ -1509,6 +1517,11 @@ static int hva_to_pfn_slow(unsigned long addr, bool *async, bool write_fault,
> }
> }
> *pfn = page_to_pfn(page);
> +
> + /* See comment in hva_to_pfn_fast. */
> + if (is_zone_device_page(page))
> + put_page(page);
> +
> return npages;
> }