Re: [PATCH v3 19/20] PCI/P2PDMA: introduce pci_mmap_p2pmem()

From: Logan Gunthorpe
Date: Wed Sep 29 2021 - 17:42:26 EST




On 2021-09-28 1:55 p.m., Jason Gunthorpe wrote:
> On Thu, Sep 16, 2021 at 05:40:59PM -0600, Logan Gunthorpe wrote:
>> +int pci_mmap_p2pmem(struct pci_dev *pdev, struct vm_area_struct *vma)
>> +{
>> + struct pci_p2pdma_map *pmap;
>> + struct pci_p2pdma *p2pdma;
>> + int ret;
>> +
>> + /* prevent private mappings from being established */
>> + if ((vma->vm_flags & VM_MAYSHARE) != VM_MAYSHARE) {
>> + pci_info_ratelimited(pdev,
>> + "%s: fail, attempted private mapping\n",
>> + current->comm);
>> + return -EINVAL;
>> + }
>> +
>> + pmap = pci_p2pdma_map_alloc(pdev, vma->vm_end - vma->vm_start);
>> + if (!pmap)
>> + return -ENOMEM;
>> +
>> + rcu_read_lock();
>> + p2pdma = rcu_dereference(pdev->p2pdma);
>> + if (!p2pdma) {
>> + ret = -ENODEV;
>> + goto out;
>> + }
>> +
>> + ret = simple_pin_fs(&pci_p2pdma_fs_type, &pci_p2pdma_fs_mnt,
>> + &pci_p2pdma_fs_cnt);
>> + if (ret)
>> + goto out;
>> +
>> + ihold(p2pdma->inode);
>> + pmap->inode = p2pdma->inode;
>> + rcu_read_unlock();
>> +
>> + vma->vm_flags |= VM_MIXEDMAP;
>
> Why is this a VM_MIXEDMAP? Everything fault sticks in here has a
> struct page, right?

Yes. This decision is not so simple, I tried a few variations before
settling on this.

The main reason is probably this: if we don't use VM_MIXEDMAP, then we
can't set pte_devmap(). If we don't set pte_devmap(), then every single
page that GUP processes needs to check if it's a ZONE_DEVICE page and
also if it's a P2PDMA page (thus dereferencing pgmap) in order to
satisfy the requirements of FOLL_PCI_P2PDMA.

I didn't think other developers would go for that kind of performance
hit. With VM_MIXEDMAP we hide the performance penalty behind the
existing check. And with the current pgmap code as is, we only need to
do that check once for every new pgmap pointer.

Logan