Re: [PATCH 5/9] mm, drm/ttm, drm/vmwgfx: Support huge TTM pagefaults

From: Thomas Hellström (VMware)
Date: Thu Jan 30 2020 - 08:29:54 EST


On 1/29/20 3:55 PM, Christian König wrote:
On 24.01.20 at 10:09, Thomas Hellström (VMware) wrote:
From: Thomas Hellstrom <thellstrom@xxxxxxxxxx>

Support huge (PMD-size and PUD-size) page-table entries by providing a
huge_fault() callback.
We still support private mappings and write-notify by splitting the huge
page-table entries on write-access.

Note that for huge page-faults to occur, either the kernel needs to be
compiled with trans-huge-pages always enabled, or the kernel needs to be
compiled with trans-huge-pages enabled using madvise, and the user-space
app needs to call madvise() to enable trans-huge pages on a per-mapping
basis.

Furthermore huge page-faults will not succeed unless buffer objects and
user-space addresses are aligned on huge page size boundaries.
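
For illustration only (not part of the patch): a minimal user-space sketch of
the two requirements above, assuming 2 MiB PMD-sized huge pages. It uses
anonymous memory rather than a real buffer-object mapping, and the names
HUGE_SIZE, align_up_huge() and map_huge_aligned() are hypothetical.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Assumption: PMD-sized huge pages are 2 MiB (x86-64, 4 KiB base pages). */
#define HUGE_SIZE ((uintptr_t)2 << 20)

/* Round addr up to the next HUGE_SIZE boundary. */
static uintptr_t align_up_huge(uintptr_t addr)
{
	return (addr + HUGE_SIZE - 1) & ~(HUGE_SIZE - 1);
}

/*
 * Map len bytes of anonymous memory at a HUGE_SIZE-aligned address and ask
 * for transparent huge pages on that range. Returns NULL if mmap() fails.
 * *advised is set to 1 if madvise() succeeded, 0 otherwise.
 * The unaligned slack around the range is not unmapped in this sketch.
 */
static void *map_huge_aligned(size_t len, int *advised)
{
	void *raw;
	void *aligned;

	/* The range should cover whole huge pages. */
	assert(len % HUGE_SIZE == 0);

	/* Over-allocate by one huge page so an aligned start always fits. */
	raw = mmap(NULL, len + HUGE_SIZE, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (raw == MAP_FAILED)
		return NULL;

	aligned = (void *)align_up_huge((uintptr_t)raw);

	/* madvise() can fail, e.g. with EINVAL on a kernel built without THP. */
	*advised = madvise(aligned, len, MADV_HUGEPAGE) == 0;
	return aligned;
}
```

Whether a huge entry is actually installed still depends on the kernel
configuration and on the backing memory being aligned and contiguous.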

Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
Cc: Michal Hocko <mhocko@xxxxxxxx>
Cc: "Matthew Wilcox (Oracle)" <willy@xxxxxxxxxxxxx>
Cc: "Kirill A. Shutemov" <kirill.shutemov@xxxxxxxxxxxxxxx>
Cc: Ralph Campbell <rcampbell@xxxxxxxxxx>
Cc: "Jérôme Glisse" <jglisse@xxxxxxxxxx>
Cc: "Christian König" <christian.koenig@xxxxxxx>
Cc: Dan Williams <dan.j.williams@xxxxxxxxx>
Signed-off-by: Thomas Hellstrom <thellstrom@xxxxxxxxxx>
Reviewed-by: Roland Scheidegger <sroland@xxxxxxxxxx>
---
 drivers/gpu/drm/ttm/ttm_bo_vm.c | 145 ++++++++++++++++++++-
 drivers/gpu/drm/vmwgfx/vmwgfx_page_dirty.c | 2 +-
 include/drm/ttm/ttm_bo_api.h | 3 +-
 3 files changed, 145 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c b/drivers/gpu/drm/ttm/ttm_bo_vm.c
index 389128b8c4dd..49704261a00d 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
@@ -156,6 +156,89 @@ vm_fault_t ttm_bo_vm_reserve(struct ttm_buffer_object *bo,
 }
 EXPORT_SYMBOL(ttm_bo_vm_reserve);

+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+/**
+ * ttm_bo_vm_insert_huge - Insert a pfn for PUD or PMD faults
+ * @vmf: Fault data
+ * @bo: The buffer object
+ * @page_offset: Page offset from bo start
+ * @fault_page_size: The size of the fault in pages.
+ * @pgprot: The page protections.
+ * Does additional checking whether it's possible to insert a PUD or PMD
+ * pfn and performs the insertion.
+ *
+ * Return: VM_FAULT_NOPAGE on successful insertion, VM_FAULT_FALLBACK if
+ * a huge fault was not possible, and a VM_FAULT_ERROR code otherwise.
+ */
+static vm_fault_t ttm_bo_vm_insert_huge(struct vm_fault *vmf,
+					struct ttm_buffer_object *bo,
+					pgoff_t page_offset,
+					pgoff_t fault_page_size,
+					pgprot_t pgprot)
+{
+	pgoff_t i;
+	vm_fault_t ret;
+	unsigned long pfn;
+	pfn_t pfnt;
+	struct ttm_tt *ttm = bo->ttm;
+	bool write = vmf->flags & FAULT_FLAG_WRITE;
+
+	/* Fault should not cross bo boundary. */
+	page_offset &= ~(fault_page_size - 1);
+	if (page_offset + fault_page_size > bo->num_pages)
+		goto out_fallback;
+
+	if (bo->mem.bus.is_iomem)
+		pfn = ttm_bo_io_mem_pfn(bo, page_offset);
+	else
+		pfn = page_to_pfn(ttm->pages[page_offset]);
+
+	/* pfn must be fault_page_size aligned. */
+	if ((pfn & (fault_page_size - 1)) != 0)
+		goto out_fallback;
+
+	/* Check that memory is contiguous. */
+	if (!bo->mem.bus.is_iomem)
+		for (i = 1; i < fault_page_size; ++i) {
+			if (page_to_pfn(ttm->pages[page_offset + i]) != pfn + i)
+				goto out_fallback;
+		}
+	/* IO mem without the io_mem_pfn callback is always contiguous. */
+	else if (bo->bdev->driver->io_mem_pfn)
+		for (i = 1; i < fault_page_size; ++i) {
+			if (ttm_bo_io_mem_pfn(bo, page_offset + i) != pfn + i)
+				goto out_fallback;
+		}

Maybe add {} to the if to make clear where things start/end.
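
As an illustration only (not code from this thread), the braced layout could
look roughly like the user-space model below. struct bo_model and
huge_range_is_contiguous() are hypothetical stand-ins for the kernel
structures and lookups used in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the buffer-object state the check consults. */
struct bo_model {
	bool is_iomem;			/* models bo->mem.bus.is_iomem */
	bool has_io_mem_pfn;		/* models a non-NULL driver io_mem_pfn */
	const unsigned long *page_pfns;	/* models page_to_pfn(ttm->pages[i]) */
	const unsigned long *io_pfns;	/* models ttm_bo_io_mem_pfn(bo, i) */
};

/* Return true if the pfns backing the fault range are contiguous from pfn. */
static bool huge_range_is_contiguous(const struct bo_model *bo,
				     size_t page_offset,
				     size_t fault_page_size,
				     unsigned long pfn)
{
	size_t i;

	/* Huge fault sizes are a power-of-two number of pages. */
	assert(fault_page_size != 0 &&
	       (fault_page_size & (fault_page_size - 1)) == 0);

	if (!bo->is_iomem) {
		/* System memory: every page must follow its predecessor. */
		for (i = 1; i < fault_page_size; ++i) {
			if (bo->page_pfns[page_offset + i] != pfn + i)
				return false;
		}
	} else if (bo->has_io_mem_pfn) {
		/* IO memory with a driver callback: check each pfn as well. */
		for (i = 1; i < fault_page_size; ++i) {
			if (bo->io_pfns[page_offset + i] != pfn + i)
				return false;
		}
	}

	/* IO mem without the io_mem_pfn callback is always contiguous. */
	return true;
}
```

With the outer braces in place, the comment about the callback-less case no
longer sits between an if body and its else.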

+
+	pfnt = __pfn_to_pfn_t(pfn, PFN_DEV);
+	if (fault_page_size == (HPAGE_PMD_SIZE >> PAGE_SHIFT))
+		ret = vmf_insert_pfn_pmd_prot(vmf, pfnt, pgprot, write);
+#ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
+	else if (fault_page_size == (HPAGE_PUD_SIZE >> PAGE_SHIFT))
+		ret = vmf_insert_pfn_pud_prot(vmf, pfnt, pgprot, write);
+#endif
+	else
+		WARN_ON_ONCE(ret = VM_FAULT_FALLBACK);
+
+	if (ret != VM_FAULT_NOPAGE)
+		goto out_fallback;
+
+	return VM_FAULT_NOPAGE;
+out_fallback:
+	count_vm_event(THP_FAULT_FALLBACK);
+	return VM_FAULT_FALLBACK;

This doesn't seem to match the function documentation since we never return ret here as far as I can see.
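
As an illustration only (not code from this thread), one way to make the
return path agree with the kernel-doc would be to pass error codes through
instead of folding them into a fallback. The other option is to keep the code
and adjust the documentation. The sketch below is a user-space model; the
enum values and finish_huge_insert() are hypothetical stand-ins.

```c
#include <assert.h>

/* Local stand-ins for the kernel's fault codes; the values are arbitrary. */
enum model_fault {
	MODEL_NOPAGE,	/* models VM_FAULT_NOPAGE: huge entry inserted */
	MODEL_FALLBACK,	/* models VM_FAULT_FALLBACK: retry with small pages */
	MODEL_OOM,	/* models one of the VM_FAULT_ERROR codes */
};

/* Models the effect of count_vm_event(THP_FAULT_FALLBACK). */
static unsigned long model_fallback_events;

/* Map the result of the insert step to the value the function returns. */
static enum model_fault finish_huge_insert(enum model_fault ret)
{
	/* Only the three modelled codes are valid inputs. */
	assert(ret == MODEL_NOPAGE || ret == MODEL_FALLBACK ||
	       ret == MODEL_OOM);

	switch (ret) {
	case MODEL_NOPAGE:
		return MODEL_NOPAGE;
	case MODEL_FALLBACK:
		/* Count the fallback, then let the caller retry. */
		++model_fallback_events;
		return MODEL_FALLBACK;
	default:
		/* Error codes reach the caller unchanged. */
		return ret;
	}
}
```

In this variant only a genuine fallback bumps the counter, so an
out-of-memory result is not reported as a failed huge-page attempt.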

Apart from those comments it looks like that should work,
Christian.


Thanks for reviewing, Christian. I'll update the next version with your feedback.

/Thomas