Re: [PATCH v6 1/5] KVM: arm64: Block cacheable PFNMAP mapping

From: Jason Gunthorpe
Date: Mon Jun 09 2025 - 08:27:19 EST


On Fri, Jun 06, 2025 at 11:11:56AM -0700, Sean Christopherson wrote:
> > @@ -1612,6 +1624,10 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> >
> > vfio_allow_any_uc = vma->vm_flags & VM_ALLOW_ANY_UNCACHED;
> >
> > + if ((vma->vm_flags & VM_PFNMAP) &&
> > + !mapping_type_noncacheable(vma->vm_page_prot))
>
> I don't think this is correct, and there's a very real chance this will break
> existing setups. PFNMAP memory isn't strictly device memory, and IIUC, KVM
> force DEVICE/NORMAL_NC based on kvm_is_device_pfn(), not based on VM_PFNMAP.

kvm_is_device_pfn() effectively means KVM can't use CMOs on that
PFN. It doesn't really mean anything more.

PFNMAP says the same thing, or at least from a mm perspective we don't
want drivers taking PFNMAP memory and then trying to guess if there
are struct pages/KVAs for it. PFNMAP memory is supposed to be fully
opaque.

Though that confusion seems to be a separate issue from this patch.

> if (kvm_is_device_pfn(pfn)) {
> /*
> * If the page was identified as device early by looking at
> * the VMA flags, vma_pagesize is already representing the
> * largest quantity we can map. If instead it was mapped
> * via __kvm_faultin_pfn(), vma_pagesize is set to PAGE_SIZE
> * and must not be upgraded.
> *
> * In both cases, we don't let transparent_hugepage_adjust()
> * change things at the last minute.
> */
> device = true;

"device" here is sort of a misnomer; it is really just trying to
set up the S2 so that no CMOs are going to be done.

Calling it 'disable_cmo' would sure make this code clearer.

> @@ -1639,6 +1653,9 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> return -EFAULT;
>
> if (kvm_is_device_pfn(pfn)) {
> + if (is_vma_cacheable)
> + return -EINVAL;
> +

e.g.:

	if (!kvm_can_use_cmo_pfn(pfn)) {
		if (is_vma_cacheable)
			return -EINVAL;

> * If the page was identified as device early by looking at
> * the VMA flags, vma_pagesize is already representing the
> @@ -1722,6 +1739,11 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> prot |= KVM_PGTABLE_PROT_X;
>
> if (device) {
> + if (is_vma_cacheable) {
> + ret = -EINVAL;
> + goto out;
> + }

	if (disable_cmo) {
		if (is_vma_cacheable)
			return -EINVAL;

Makes a lot more sense, right? If KVM can't do CMOs then it should not
attempt to use memory mapped into the VMA as cacheable.
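
To make the rule concrete, here is a rough userspace model of the
decision. It is only a sketch, not the actual user_mem_abort() logic:
pick_s2_memtype() and enum s2_memtype are hypothetical names made up
for illustration, and the DEVICE fallback for the non-VFIO case is an
assumption taken from the DEVICE/NORMAL_NC behaviour described earlier
in the thread.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the stage-2 memory attributes. */
enum s2_memtype { S2_NORMAL_CACHEABLE, S2_NORMAL_NC, S2_DEVICE };

/*
 * Illustrative model only. disable_cmo plays the role of the existing
 * 'device' flag: KVM cannot do cache maintenance on this PFN.
 */
static int pick_s2_memtype(bool disable_cmo, bool is_vma_cacheable,
			   bool vfio_allow_any_uc, enum s2_memtype *out)
{
	if (disable_cmo) {
		/* No CMOs possible, so a cacheable VMA must be refused. */
		if (is_vma_cacheable)
			return -EINVAL;

		/* Assumed fallback: Normal-NC if VFIO allows it, else Device. */
		*out = vfio_allow_any_uc ? S2_NORMAL_NC : S2_DEVICE;
		return 0;
	}

	/* CMOs are available, so a cacheable stage-2 mapping is fine. */
	*out = S2_NORMAL_CACHEABLE;
	return 0;
}
```

The only combination that fails is "no CMOs" plus "cacheable VMA",
which is exactly the case the patch rejects.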

> if (vfio_allow_any_uc)
> prot |= KVM_PGTABLE_PROT_NORMAL_NC;
> else
>

Regardless, this seems good for this patch at least.

Jason