RE: [PATCH 23/26] ib: Convert qib_get_user_pages() toget_user_pages_unlocked()

From: Marciniszyn, Mike
Date: Wed Oct 02 2013 - 10:55:08 EST


Thanks!!

I would like to test these two patches and also do the stable work for the deadlock as well. Do you have these patches in a repo somewhere to save me a bit of work?

We had been working on an internal version of the deadlock portion of this patch that uses get_user_pages_fast() vs. the new get_user_pages_unlocked().

The risk of GUP fast is the loss of the "force" arg on GUP fast, which I don't see as significant give our use case.

Some nits on the subject and commit message:
1. use IB/qib, IB/ipath vs. ib
2. use the correct ipath vs. qib in the commit message text

Mike

> -----Original Message-----
> From: Jan Kara [mailto:jack@xxxxxxx]
> Sent: Wednesday, October 02, 2013 10:28 AM
> To: LKML
> Cc: linux-mm@xxxxxxxxx; Jan Kara; infinipath; Roland Dreier; linux-
> rdma@xxxxxxxxxxxxxxx
> Subject: [PATCH 23/26] ib: Convert qib_get_user_pages() to
> get_user_pages_unlocked()
>
> Convert qib_get_user_pages() to use get_user_pages_unlocked(). This
> shortens the section where we hold mmap_sem for writing and also removes
> the knowledge about get_user_pages() locking from ipath driver. We also fix
> a bug in testing pinned number of pages when changing the code.
>
> CC: Mike Marciniszyn <infinipath@xxxxxxxxx>
> CC: Roland Dreier <roland@xxxxxxxxxx>
> CC: linux-rdma@xxxxxxxxxxxxxxx
> Signed-off-by: Jan Kara <jack@xxxxxxx>
> ---
> drivers/infiniband/hw/qib/qib_user_pages.c | 62 +++++++++++++----------------
> -
> 1 file changed, 26 insertions(+), 36 deletions(-)
>
> diff --git a/drivers/infiniband/hw/qib/qib_user_pages.c
> b/drivers/infiniband/hw/qib/qib_user_pages.c
> index 2bc1d2b96298..57ce83c2d1d9 100644
> --- a/drivers/infiniband/hw/qib/qib_user_pages.c
> +++ b/drivers/infiniband/hw/qib/qib_user_pages.c
> @@ -48,39 +48,55 @@ static void __qib_release_user_pages(struct page
> **p, size_t num_pages,
> }
> }
>
> -/*
> - * Call with current->mm->mmap_sem held.
> +/**
> + * qib_get_user_pages - lock user pages into memory
> + * @start_page: the start page
> + * @num_pages: the number of pages
> + * @p: the output page structures
> + *
> + * This function takes a given start page (page aligned user virtual
> + * address) and pins it and the following specified number of pages.
> +For
> + * now, num_pages is always 1, but that will probably change at some
> +point
> + * (because caller is doing expected sends on a single virtually
> +contiguous
> + * buffer, so we can do all pages at once).
> */
> -static int __qib_get_user_pages(unsigned long start_page, size_t
> num_pages,
> - struct page **p, struct vm_area_struct
> **vma)
> +int qib_get_user_pages(unsigned long start_page, size_t num_pages,
> + struct page **p)
> {
> unsigned long lock_limit;
> size_t got;
> int ret;
> + struct mm_struct *mm = current->mm;
>
> + down_write(&mm->mmap_sem);
> lock_limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT;
>
> - if (num_pages > lock_limit && !capable(CAP_IPC_LOCK)) {
> + if (mm->pinned_vm + num_pages > lock_limit &&
> !capable(CAP_IPC_LOCK)) {
> + up_write(&mm->mmap_sem);
> ret = -ENOMEM;
> goto bail;
> }
> + mm->pinned_vm += num_pages;
> + up_write(&mm->mmap_sem);
>
> for (got = 0; got < num_pages; got += ret) {
> - ret = get_user_pages(current, current->mm,
> - start_page + got * PAGE_SIZE,
> - num_pages - got, 1, 1,
> - p + got, vma);
> + ret = get_user_pages_unlocked(current, mm,
> + start_page + got * PAGE_SIZE,
> + num_pages - got, 1, 1,
> + p + got);
> if (ret < 0)
> goto bail_release;
> }
>
> - current->mm->pinned_vm += num_pages;
>
> ret = 0;
> goto bail;
>
> bail_release:
> __qib_release_user_pages(p, got, 0);
> + down_write(&mm->mmap_sem);
> + mm->pinned_vm -= num_pages;
> + up_write(&mm->mmap_sem);
> bail:
> return ret;
> }
> @@ -117,32 +133,6 @@ dma_addr_t qib_map_page(struct pci_dev *hwdev,
> struct page *page,
> return phys;
> }
>
> -/**
> - * qib_get_user_pages - lock user pages into memory
> - * @start_page: the start page
> - * @num_pages: the number of pages
> - * @p: the output page structures
> - *
> - * This function takes a given start page (page aligned user virtual
> - * address) and pins it and the following specified number of pages. For
> - * now, num_pages is always 1, but that will probably change at some point
> - * (because caller is doing expected sends on a single virtually contiguous
> - * buffer, so we can do all pages at once).
> - */
> -int qib_get_user_pages(unsigned long start_page, size_t num_pages,
> - struct page **p)
> -{
> - int ret;
> -
> - down_write(&current->mm->mmap_sem);
> -
> - ret = __qib_get_user_pages(start_page, num_pages, p, NULL);
> -
> - up_write(&current->mm->mmap_sem);
> -
> - return ret;
> -}
> -
> void qib_release_user_pages(struct page **p, size_t num_pages) {
> if (current->mm) /* during close after signal, mm can be NULL */
> --
> 1.8.1.4

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/